Mitigating the Alignment Tax of RLHF
Yong Lin | Hangyu Lin | Wei Xiong | Shizhe Diao | Jianmeng Liu | Jipeng Zhang | Rui Pan | Haoxiang Wang | Wenbin Hu | Hanning Zhang | Hanze Dong | Renjie Pi | Han Zhao | Nan Jiang | Heng Ji | Yuan Yao | Tong Zhang
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP
Citations: No Citations Yet
URL:
https://github.com/avalonstrel/
https://tatsu-lab.github.io/alpaca_eval/
https://huggingface.co/datasets/anon8231489123/ShareGPT_
https://huggingface.co/HuggingFaceH4/zephyr-7b-beta
https://huggingface.co/HuggingFaceH4/mistral-7b-sft-beta
https://huggingface.co/HuggingFaceH4/zephyr-7b-gemma-v0.1
https://huggingface.co/google/gemma-7b
https://github.com/
https://github.com/OptimalScale/LMFlow
https://github.com/huggingface/trl
Field Of Study