Large-model study notes: the attention mechanism

This self-attention process is at the core of what makes transformers so powerful. It lets every word (or token) dynamically adjust how much weight it gives to its surrounding context, and as the representation passes through multiple layers of the network, this produces an increasingly accurate and nuanced understanding of the input.

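To make the idea concrete, here is a minimal sketch of single-head scaled dot-product self-attention in NumPy. It is not the post's own code; the function name, the projection matrices `w_q`, `w_k`, `w_v`, and the toy dimensions are all illustrative assumptions. The key point it shows is that each token's output is a weighted mix of every token's value vector, with the weights computed from the token's own query against all keys.

```python
import numpy as np

def scaled_dot_product_self_attention(x, w_q, w_k, w_v):
    """Minimal single-head self-attention over a sequence of token embeddings.

    x:             (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) projection matrices (illustrative, randomly initialized below)
    """
    q = x @ w_q                      # queries: what each token is looking for
    k = x @ w_k                      # keys: what each token offers to others
    v = x @ w_v                      # values: the content that gets mixed
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # pairwise relevance between tokens, scaled
    # softmax over the key dimension -> each row of weights sums to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v               # one context-weighted vector per token

# Toy example (hypothetical sizes): 4 tokens, 8-dim embeddings, 8-dim head
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8): each token now carries context from the whole sequence
```

Because the attention weights are recomputed from the input itself, the same token can attend to different neighbors in different sentences, which is the "dynamic adjustment" described above; a full transformer stacks many such layers (with multiple heads) on top of each other.
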
posted @ 2024-11-24 11:54  dudu