Attention
0. Introduction
The attention mechanism in neural networks draws on the principle of human visual attention: when the eye focuses on a small region of the visual field, it devotes more processing to that region, perceiving it at "high resolution" while taking in the surrounding image at "low resolution", and it shifts the focal point over time.
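The idea above can be sketched as soft (dot-product) attention: a query scores each memory slot, the scores are normalized with a softmax, and the output is the attention-weighted sum of the values. This is a minimal NumPy illustration; the function and variable names are illustrative, not taken from any particular paper's implementation.

```python
import numpy as np

def soft_attention(query, keys, values):
    """query: (d,), keys: (n, d), values: (n, d_v) -> (d_v,) context vector."""
    scores = keys @ query                            # (n,) alignment scores
    scores = scores - scores.max()                   # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the n slots
    return weights @ values                          # weighted sum of values

rng = np.random.default_rng(0)
q = rng.normal(size=4)        # query vector
K = rng.normal(size=(3, 4))   # 3 keys
V = rng.normal(size=(3, 2))   # 3 values
context = soft_attention(q, K, V)
print(context.shape)  # (2,)
```

Because the softmax weights sum to 1, the context vector is always a convex combination of the value rows; this is what makes soft attention differentiable end to end, in contrast to hard attention, which samples a single slot.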
References:
- [CV] - Mnih V, Heess N, Graves A. Recurrent models of visual attention[J]. arXiv preprint arXiv:1406.6247, 2014.
- [Bahdanau] - Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.
- [CV] - Ba J, Mnih V, Kavukcuoglu K. Multiple object recognition with visual attention[J]. arXiv preprint arXiv:1412.7755, 2014.
- [CV] - Xu K, Ba J, Kiros R, et al. Show, attend and tell: Neural image caption generation with visual attention[J]. arXiv preprint arXiv:1502.03044, 2015.
- [Speech] - Chorowski J K, Bahdanau D, Serdyuk D, et al. Attention-based models for speech recognition[J]. arXiv preprint arXiv:1506.07503, 2015.
- [Luong] - Luong M T, Pham H, Manning C D. Effective approaches to attention-based neural machine translation[J]. arXiv preprint arXiv:1508.04025, 2015.
- [Speech] - Bahdanau D, Chorowski J, Serdyuk D, et al. End-to-end attention-based large vocabulary speech recognition[J]. arXiv preprint arXiv:1508.04395, 2015.
- [QA] - Yang Z, He X, Gao J, et al. Stacked attention networks for image question answering[J]. arXiv preprint arXiv:1511.02274, 2015.
- [Weight normalization] - Salimans T, Kingma D P. Weight normalization: A simple reparameterization to accelerate training of deep neural networks[J]. arXiv preprint arXiv:1602.07868, 2016.
- [Text] - Nallapati R, Xiang B, Zhou B. Sequence-to-Sequence RNNs for Text Summarization[J]. 2016.
- [Survey] - Wang F, Tax D M J. Survey on the attention based RNN model and its applications in computer vision[J]. arXiv preprint arXiv:1601.06823, 2016.
- [Translation] - Wu Y, Schuster M, Chen Z, et al. Google's neural machine translation system: Bridging the gap between human and machine translation[J]. arXiv preprint arXiv:1609.08144, 2016.
- [Translation] - Neubig G. Neural Machine Translation and Sequence-to-sequence Models: A Tutorial[J]. arXiv preprint arXiv:1703.01619, 2017.
- [BahdanauMonotonic] - Raffel C, Luong T, Liu P J, et al. Online and linear-time attention by enforcing monotonic alignments[J]. arXiv preprint arXiv:1704.00784, 2017.
- [Transformer] - Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[J]. arXiv preprint arXiv:1706.03762, 2017.
- [Blog] - Attention and Augmented Recurrent Neural Networks
- [Quora] - How does an attention mechanism work in deep learning?
- [Quora] - Can you recommend to me an exhaustive reading list for attention models in deep learning?
- [Quora] - What is attention in the context of deep learning?
- [Quora] - What is an intuitive explanation for how attention works in deep learning?
- [Quora] - What is exactly the attention mechanism introduced to RNN (recurrent neural network)?
- [Quora] - How is a saliency map generated when training recurrent neural networks with soft attention?
- [Quora] - What is the difference between soft attention and hard attention in neural networks?
- [Quora] - What is Attention Mechanism in Neural Networks?
- [Quora] - How is the attention component of attentional neural networks trained?