NLPExplorer
GEM - 2025
Total Papers: 68
Total Papers across all years: 164
Total Citations: 0
Knockout LLM Assessment: Using Large Language Models for Evaluations through Iterative Pairwise Comparisons
Isik Baran Sandan |
Tu Anh Dinh |
Jan Niehues |
Learning and Evaluating Factual Clarification Question Generation Without Examples
Matthew Toles |
Yukun Huang |
Zhou Yu |
sPhinX: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting
Sanchit Ahuja |
Kumar Tanmay |
Hardik Hansrajbhai Chauhan |
Barun Patra |
Kriti Aggarwal |
Luciano Del Corro |
Arindam Mitra |
Tejas Indulal Dhamecha |
Ahmed Hassan Awadallah |
Monojit Choudhury |
Vishrav Chaudhary |
Sunayana Sitaram |
Coreference as an indicator of context scope in multimodal narrative
Nikolai Ilinykh |
Shalom Lappin |
Asad B. Sayeed |
Sharid Loáiciga |
Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans
Javier Conde |
Miguel González Saiz |
María Grandury |
Pedro Reviriego |
Gonzalo Martínez |
Marc Brysbaert |
Towards Comprehensive Evaluation of Open-Source Language Models: A Multi-Dimensional, User-Driven Approach
Qingchen Yu |
IRSum: One Model to Rule Summarization and Retrieval
Sotaro Takeshita |
Simone Paolo Ponzetto |
Kai Eckert |
Shallow Preference Signals: Large Language Model Aligns Even Better with Truncated Data?
Xuan Qi |
Jiahao Qiu |
Xinzhe Juan |
Yue Wu |
Mengdi Wang |
Bridging the LLM Accessibility Divide? Performance, Fairness, and Cost of Closed versus Open LLMs for Automated Essay Scoring
Kezia Oketch |
John P. Lalor |
Yi Yang |
Ahmed Abbasi |
PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory
Junho Myung |
Yeon Su Park |
Sunwoo Kim |
Shin Yoo |
Alice Oh |
PATCH! Psychometrics-AssisTed BenCHmarking of Large Language Models against Human Populations: A Case Study of Proficiency in 8th Grade Mathematics
Qixiang Fang |
Daniel Oberski |
Dong Nguyen |
Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in QA Agents
Ashley Lewis |
Event-based evaluation of abstractive news summarization
Huiling You |
Samia Touileb |
Lilja Øvrelid |
Erik Velldal |
CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization
Brihi Joshi |
Sriram Venkatapathy |
Mohit Bansal |
Nanyun Peng |
Haw-Shiuan Chang |
ELAB: Extensive LLM Alignment Benchmark in Persian Language
Zahra Pourbahman |
Fatemeh Rajabi |
Mohammadhossein Sadeghi |
Omid Ghahroodi |
Somayeh Bakhshaei |
Arash Amini |
Reza Kazemi |
Mahdieh Soleymani Baghshah |
Conference Topic Distribution (chart; topic categories: Linguistic, Task, Approach, Language, Dataset)
Conference Citation Distribution: conference papers have no citations yet.