DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Damai Dai | Chengqi Deng | Chenggang Zhao | R.X. Xu | Huazuo Gao | Deli Chen | Jiashi Li | Wangding Zeng | Xingkai Yu | Y. Wu | Zhenda Xie | Y.K. Li | Panpan Huang | Fuli Luo | Chong Ruan | Zhifang Sui | Wenfeng Liang

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL