Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
Yannis Flet-Berliac | Nathan Grinsztajn | Florian Strub | Eugene Choi | Bill Wu | Chris Cremer | Arash Ahmadian | Yash Chandak | Mohammad Gheshlaghi Azar | Olivier Pietquin | Matthieu Geist
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP
Citations: No Citations Yet
URL:
https://github.com/openai/
https://huggingface.co/meta-llama/
Field Of Study