P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs

Yidan Zhang | Yu Wan | Boyi Deng | Baosong Yang | Hao-Ran Wei | Fei Huang | Bowen Yu | Dayiheng Liu | Junyang Lin | Fei Huang | Jingren Zhou

Paper Details:

Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP