MMLU-SR: A Benchmark for Stress-Testing Reasoning Capability of Large Language Models
Wentian Wang | Sarthak Jain | Paul Kantor | Jacob Feldman | Lazaros Gallos | Hao Wang
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: GenBench | WS