Don’t Forget Your Reward Values: Language Model Alignment via Value-based Calibration

Xin Mao | Feng-Lin Li | Huimin Xu | Wei Zhang | Wang Chen | Anh Tuan Luu

Paper Details:

Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP