Proj KaggleNLU Paper Reading: Unsupervised Cross-lingual Representation Learning at Scale

Github

https://github.com/facebookresearch/fairseq
https://github.com/facebookresearch/pytext
https://github.com/facebookresearch/XLM.git

Abstract

Task: a pretraining method for large-scale cross-lingual representations targeting a wide range of natural-language transfer tasks
Method: train a Transformer-based masked language model on more than 2TB of CommonCrawl data (see the sketch after the list below)
Results:

  1. Outperforms mBERT on a variety of cross-lingual benchmarks
  2. +14.6% average accuracy on XNLI
  3. +13% average F1 score on MLQA
  4. +2.4% F1 score on NER
  5. Performs particularly well on languages with fewer (training) resources
  6. Improves XNLI accuracy by 15.7% for Swahili over previous XLM models
  7. Improves XNLI accuracy by 11.4% for Urdu over previous XLM models
  8. Presents a detailed analysis of the key factors behind these gains
  9. The trade-off between positive transfer and capacity dilution
  10. The performance of high- vs. low-resource languages at scale
  11. XLM-R models many languages without sacrificing per-language performance and remains competitive with strong monolingual models
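
The released XLM-R checkpoints can be loaded through fairseq's torch.hub interface. Below is a minimal sketch following the usage documented in the fairseq XLM-R example; the model name `xlmr.large`, the feature shape, and the example sentence are assumptions based on that README, not part of this paper's notes.

```python
import torch

# Load the pretrained XLM-R (large) checkpoint via fairseq's torch.hub interface.
xlmr = torch.hub.load('pytorch/fairseq', 'xlmr.large')
xlmr.eval()  # disable dropout for deterministic feature extraction

# SentencePiece-encode a sentence into a tensor of subword ids.
tokens = xlmr.encode('Hello world!')

# Extract last-layer contextual representations; for xlmr.large the
# hidden size is 1024, so the shape is (1, seq_len, 1024).
features = xlmr.extract_features(tokens)
print(features.shape)
```

The same interface works for any of the 100 pretraining languages, since XLM-R uses a single shared SentencePiece vocabulary across languages.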