Summary: Contents: Overview, MoE, Training. Shazeer N., Mirhoseini A., Maziarz K., Davis A., Le Q., Hinton G. and Dean J. Outrageously large neural networks: The sparsely-gated mixture-
posted @ 2024-05-10 10:21 馒头and花卷