Using LLM Judgements for Sanity Checking Results and Reproducibility of Human Evaluations in NLP

Rudali Huidrom | Anya Belz

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria and virtual meeting
Venue: GEM | WS