Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
Hao Peng | Yunjia Qi | Xiaozhi Wang | Zijun Yao | Bin Xu | Lei Hou | Juanzi Li
Paper Details:
Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL
Citations: No Citations Yet
URL:
https://github.com/THU-KEG/
https://opensource.org/license/mit
https://huggingface.co/Skywork
https://serper.dev/
https://github.com/lm-sys/FastChat/tree/main/
https://github.com/EleutherAI/
https://en.wikipedia.org/wiki/Raymond_III,_Count_of_Tripol
Field Of Study