Coming from computer vision and new to transformers? Here are some resources that greatly accelerated my learning:
http://jalammar.github.io/illustrated-transformer/
http://peterbloem.nl/blog/transformers
https://nlp.seas.harvard.edu/2018/04/03/attention.html