Self-Generated Critiques Boost Reward Modeling for Language Models

Yue Yu | Zhengxing Chen | Aston Zhang | Liang Tan | Chenguang Zhu | Richard Yuanzhe Pang | Yundi Qian | Xuewei Wang | Suchin Gururangan | Chao Zhang | Melanie Kambadur | Dhruv Mahajan | Rui Hou

Paper Details:

Month: April
Year: 2025
Location: Albuquerque, New Mexico
Venue: NAACL

Field Of Study