APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport
Zhuo Li | Yuege Feng | Dandan Guo | Jinpeng Hu | Anningzhe Gao | Xiang Wan
Paper Details:
Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP
Citations: No Citations Yet
URLs:
https://github.com/BIRlz/APLOT
https://huggingface.co/datasets/llm-blender/
http://arxiv
https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
https://huggingface.co/OpenRLHF/Llama-3-8b-sft-mixture
https://github.com/modelscope/ms-swift
https://github.com/modelscope/evalscope