Fixing Distribution Shifts of LLM Self-Critique via On-Policy Self-Play Training
Rong Bao | Donglei Yu | Kai Fan | Minpeng Liao
Paper Details:
Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL
Citations: No Citations Yet
URL:
https://github.com/rbao2018/SCOP
https://www.python.org/
https://pytorch.org/
https://huggingface.co/
Field Of Study