Speculating LLMs’ Chinese Training Data Pollution from Their Tokens

Qingjie Zhang | Di Wang | Haoting Qian | Liu Yan | Tianwei Zhang | Ke Xu | Qi Li | Minlie Huang | Hewu Li | Han Qiu

Paper Details:

Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP