EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning
Xiaoqian Liu | Ke Wang | Yongbin Li | Yuchuan Wu | Wentao Ma | Aobo Kong | Fei Huang | Jianbin Jiao | Junge Zhang
Paper Details:
Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL
Citations: No Citations Yet
URL: https://github.com/lxqpku/EPO
Field Of Study