Abstract: BERT-Large, Uncased (Whole Word Masking): 24-layer, 1024-hidden, 16-heads, 340M parameters. BERT-Large, Cased (Whole Word Masking): 24-layer, 1024-hidden, 16-heads, 340M parameters.
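The "340M parameters" figure for BERT-Large can be roughly reproduced from the listed hyperparameters (24 layers, 1024 hidden, 16 heads). The sketch below is a back-of-envelope count assuming the standard BERT-Large configuration (vocabulary of 30,522 WordPiece tokens, 512 position embeddings, 2 token-type embeddings, 4096-dim feed-forward layers); these values are not stated in the excerpt.

```python
def bert_param_count(vocab=30522, hidden=1024, layers=24,
                     ffn=4096, max_pos=512, type_vocab=2):
    """Approximate parameter count for a BERT encoder (assumed hyperparameters)."""
    # word + position + token-type embeddings, plus embedding LayerNorm (gamma, beta)
    embeddings = (vocab + max_pos + type_vocab) * hidden + 2 * hidden
    # per-layer self-attention: Q, K, V and output projections, each with bias
    attention = 4 * (hidden * hidden + hidden)
    # per-layer feed-forward: up- and down-projection, each with bias
    feed_forward = hidden * ffn + ffn + ffn * hidden + hidden
    # two LayerNorms per layer (after attention and after feed-forward)
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feed_forward + layer_norms
    # final pooler dense layer on [CLS]
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler

print(bert_param_count())  # ~335M, commonly rounded to "340M"
```

Note the head count (16) does not change the total: the 1024-dim projection is simply split across heads, so only `hidden`, `layers`, and `ffn` drive the parameter count.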
posted @ 2019-06-14 00:46 叶建成