Mutual-Taught for Co-adapting Policy and Reward Models
Tianyuan Shi | Canbin Huang | Fanqi Wan | Longguang Zhong | Ziyi Yang | Weizhou Shen | Xiaojun Quan | Ming Yan
Paper Details:
Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL
Citations: No Citations Yet
URLs:
https://github.com/tatsu-lab/alpaca_eval
https://github.com/