Summary: © Original author | LJ. GLaM: Efficient Scaling of Language Models with Mixture-of-Experts https://arxiv.org/pdf/2112.06905.pdf 01 Abstract: This is a paper that Google released on arXiv just last month, demonstrating a way to sc…
posted @ 2022-03-08 21:23 NLP论文解读