Transformer Notes
This post explains the Transformer well; if some of the figures are hard to read, refer to Jay Alammar's original article.
Author: 龙心尘
Date: January 2019
Reviewers: Baidu NLP, 龙心尘
Translators: 张驰, 毅航, Conrad
Original author: Jay Alammar
Original link: https://jalammar.github.io/illustrated-transformer/
The following related papers are also well worth reading:
Depthwise Separable Convolutions for Neural Machine Translation
Discrete Autoencoders for Sequence Models
Generating Wikipedia by Summarizing Long Sequences
Training Tips for the Transformer Model
Self-Attention with Relative Position Representations
Fast Decoding in Sequence Models using Discrete Latent Variables
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost