Proj KaggleNLU Paper Reading: Unsupervised Cross-lingual Representation Learning at Scale

Github

https://github.com/facebookresearch/fairseq
https://github.com/facebookresearch/pytext
https://github.com/facebookresearch/XLM.git

Abstract

Task: a pretraining method for large-scale cross-lingual representations targeting a wide range of natural-language transfer tasks
Method: train a Transformer-based masked language model on more than 2TB of CommonCrawl data (see the sketch after the list below)
Results:

  1. Outperforms mBERT on a variety of cross-lingual benchmarks
  2. +14.6% average accuracy on XNLI
  3. +13% average F1 score on MLQA
  4. +2.4% F1 score on NER
  5. Performs particularly well on languages with fewer (training) resources
  6. Improves XNLI accuracy by 15.7% for Swahili over previous XLM models
  7. Improves XNLI accuracy by 11.4% for Urdu over previous XLM models
  8. Presents a detailed analysis of the key factors behind these gains
  9. The trade-off between positive transfer and capacity dilution
  10. The performance of high- vs. low-resource languages at scale
  11. XLM-R models many languages without sacrificing per-language performance and remains competitive with strong monolingual models
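
The released XLM-R checkpoints can be loaded through fairseq's torch.hub interface. Below is a minimal sketch following the usage documented in the fairseq XLM-R example; the model name `xlmr.large`, the feature shape, and the example sentence are assumptions based on that README, not part of this paper's notes.

```python
import torch

# Load the pretrained XLM-R (large) checkpoint via fairseq's torch.hub interface.
xlmr = torch.hub.load('pytorch/fairseq', 'xlmr.large')
xlmr.eval()  # disable dropout for deterministic feature extraction

# SentencePiece-encode a sentence into a tensor of subword ids.
tokens = xlmr.encode('Hello world!')

# Extract last-layer contextual representations; for xlmr.large the
# hidden size is 1024, so the shape is (1, seq_len, 1024).
features = xlmr.extract_features(tokens)
print(features.shape)
```

The same interface works for any of the 100 pretraining languages, since XLM-R uses a single shared SentencePiece vocabulary across languages.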