Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning
Lu Chen | Rui Zheng | Binghai Wang | Senjie Jin | Caishuang Huang | Junjie Ye | Zhihao Zhang | Yuhao Zhou | Zhiheng Xi | Tao Gui | Qi Zhang | Xuanjing Huang
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP
Citations: No Citations Yet
URL:
https://huggingface.co/datasets/anon8231489123/ShareGPT-
https://huggingface.co/datasets/Anthropic/hh-rlhf
https://huggingface.co/datasets/Anthropic/hh-
Field Of Study