S2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning

Ruotian Ma | Peisong Wang | Cheng Liu | Xingyan Liu | Jiaqi Chen | Bang Zhang | Xin Zhou | Nan Du | Jia Li

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL