EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

Xiaoqian Liu | Ke Wang | Yongbin Li | Yuchuan Wu | Wentao Ma | Aobo Kong | Fei Huang | Jianbin Jiao | Junge Zhang

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL
