Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang | Yong Lin | Wei Xiong | Rui Yang | Shizhe Diao | Shuang Qiu | Han Zhao | Tong Zhang
Paper Details:
Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL
Citations: No Citations Yet
URL:
https://github.com/RLHFlow/
https://github.com/tatsu-lab/alpaca_eval
https://crfm
https://github
https://hf.co/weqweasdas/RM-Mistral-7B
https://hf.co/wandb/gemma-2b-zephyr-sft
https://hf.co/wandb/gemma-2b-zephyr-dpo
Field Of Study