NLPExplorer
HumEval - 2024
Total Papers: 27
Total Papers across all years: 71
Total Citations: 0
Towards Holistic Human Evaluation of Automatic Text Simplification
Luisa Carrer | Andreas Säuberli | Martin Kappus | Sarah Ebling
Adding Argumentation into Human Evaluation of Long Document Abstractive Summarization: A Case Study on Legal Opinions
Mohamed Elaraby | Huihui Xu | Morgan Gray | Kevin Ashley | Diane Litman
Exploring Reproducibility of Human-Labelled Data for Code-Mixed Sentiment Analysis
Sachin Sasidharan Nair | Tanvi Dinkar | Gavin Abercrombie
Once Upon a Replication: It is Humans’ Turn to Evaluate AI’s Understanding of Children’s Stories for QA Generation
Andra-Maria Florescu | Marius Micluta-Campeanu | Liviu P. Dinu
A Gold Standard with Silver Linings: Scaling Up Annotation for Distinguishing Bosnian, Croatian, Montenegrin and Serbian
Aleksandra Miletić | Filip Miletić
The 2024 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results
Anya Belz | Craig Thomson
Insights of a Usability Study for KBQA Interactive Semantic Parsing: Generation Yields Benefits over Templates but External Validity Remains Challenging
Ashley Lewis | Lingbo Mo | Marie-Catherine de Marneffe | Huan Sun | Michael White
ReproHum #0033-3: Comparable Relative Results with Lower Absolute Values in a Reproduction Study
Yiru Li | Huiyuan Lai | Antonio Toral | Malvina Nissim
ReproHum #0712-01: Reproducing Human Evaluation of Meaning Preservation in Paraphrase Generation
Lewis N. Watson | Dimitra Gkatzia
ReproHum #0043-4: Evaluating Summarization Models: investigating the impact of education and language proficiency on reproducibility
Mateusz Lango | Patricia Schmidtova | Simone Balloccu | Ondrej Dusek
ReproHum #0866-04: Another Evaluation of Readers’ Reactions to News Headlines
Zola Mahlaza | Toky Hajatiana Raboanary | Kyle Seakgwa | C. Maria Keet
Exploratory Study on the Impact of English Bias of Generative Large Language Models in Dutch and French
Ayla Rigouts Terryn | Miryam de Lhoneux
ReproHum #0927-3: Reproducing The Human Evaluation Of The DExperts Controlled Text Generation Method
Javier González Corbelle | Ainhoa Vivel Couso | Jose Maria Alonso-Moral | Alberto Bugarín-Diz
ReproHum #0892-01: The painful route to consistent results: A reproduction study of human evaluation in NLG
Irene Mondella | Huiyuan Lai | Malvina Nissim
ReproHum #0087-01: Human Evaluation Reproduction Report for Generating Fact Checking Explanations
Tyler Loakman | Chenghua Lin
Conference Topic Distribution (chart categories: Linguistic, Task, Approach, Language, Dataset)
Conference Citation Distribution
Conference Papers have no Citations yet