Transformer Notes
This post explains the Transformer well; if some of the figures are hard to read, refer to Jay Alammar's original article.
Author: 龙心尘
Date: January 2019
Reviewers: Baidu NLP, 龙心尘
Translators: 张驰, 毅航, Conrad
Original author: Jay Alammar
Original link: https://jalammar.github.io/illustrated-transformer/
The following related papers are also well worth reading:
Depthwise Separable Convolutions for Neural Machine Translation
Discrete Autoencoders for Sequence Models
Generating Wikipedia by Summarizing Long Sequences
Training Tips for the Transformer Model
Self-Attention with Relative Position Representations
Fast Decoding in Sequence Models using Discrete Latent Variables
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost