Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Tong Zhu | Daize Dong | Xiaoye Qu | Jiacheng Ruan | Wenliang Chen | Yu Cheng

Paper Details:

Month: April
Year: 2025
Location: Albuquerque, New Mexico
Venue: NAACL