NLPExplorer

HumEval - 2023

Total Papers: 17

Total Papers across all years: 71

Total Citations: 0


Unveiling NLG Human-Evaluation Reproducibility: Lessons Learned and Key Insights from Participating in the ReproNLP Challenge

Lewis Watson | Dimitra Gkatzia

Same Trends, Different Answers: Insights from a Replication Study of Human Plausibility Judgments on Narrative Continuations

Yiru Li | Huiyuan Lai | Antonio Toral | Malvina Nissim

Human Evaluation Reproduction Report for Data-to-text Generation with Macro Planning

Mohammad Arvan | Natalie Parde

Reproducing a Comparative Evaluation of German Text-to-Speech Systems

Manuela Hürlimann | Mark Cieliebak

Some lessons learned reproducing human evaluation of a data-to-text system

Javier González Corbelle | Jose Alonso | Alberto Bugarín-Diz

Reproduction of Human Evaluations in: “It’s not Rocket Science: Interpreting Figurative Language in Narratives”

Saad Mahamood

A Reproduction Study of the Human Evaluation of Role-Oriented Dialogue Summarization Models

Mingqi Gao | Jie Ruan | Xiaojun Wan

Hierarchical Evaluation Framework: Best Practices for Human Evaluation

h_da@ReproHumn – Reproduction of Human Evaluation and Technical Pipeline

Margot Mieskes | Jacob Georg Benz

A Manual Evaluation Method of Neural MT for Indigenous Languages

Linda Wiechetek | Flammie Pirinen | Per Kummervold

With a Little Help from the Authors: Reproducing Human Evaluation of an MT Error Detector

Ondrej Platek | Mateusz Lango | Ondrej Dusek

Designing a Metalanguage of Differences Between Translations: A Case Study for English-to-Japanese Translation

Tomono Honda | Atsushi Fujita | Mayuka Yamamoto | Kyo Kageura

The 2023 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results

Anya Belz | Craig Thomson

How reproducible is best-worst scaling for human evaluation? A reproduction of ‘Data-to-text Generation with Macro Planning’