Rethinking KenLM: Good and Bad Model Ensembles for Efficient Text Quality Filtering in Large Web Corpora

Yungi Kim | Hyunsoo Ha | Sukyung Lee | Jihoo Kim | Seonghoon Yang | Chanjun Park

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL
