Sequence Model - Sequence Models & Attention Mechanism
Various Sequence To Sequence Architectures
Basic Models
Sequence to sequence model

Image captioning
Use a CNN (e.g. AlexNet) to encode the image into a 4096-dimensional feature vector, then feed that vector to an RNN that generates the caption.
Picking the Most Likely Sentence
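Machine translation acts as a conditional language model: given a French sentence $x$, output the most likely English sentence.
The goal is to find
- $\arg\max_{y^{<1>}, \dots, y^{<T_y>}} P(y^{<1>}, \dots, y^{<T_y>} \mid x)$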
Why not a greedy search?
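Greedy search picks the most likely word one step at a time. That does not maximize the probability of the whole sentence: a word that is locally very likely can lead to a more verbose, less likely translation overall.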
Beam Search
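- Set the beam width $B$ (e.g. $B = 3$) and find the $B$ most likely first English words according to $P(y^{<1>} \mid x)$.
- For each of those $B$ first words, consider every candidate second word, then keep the $B$ most likely pairs according to $P(y^{<1>}, y^{<2>} \mid x) = P(y^{<1>} \mid x) \, P(y^{<2>} \mid x, y^{<1>})$.
- Repeat for each following word until <EOS> is produced.
If $B = 1$, it's just greedy search.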
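A minimal sketch of the procedure, assuming a tiny hand-written probability table in place of a real decoder RNN. `TOY_MODEL`, `next_token_probs` and the probabilities are hypothetical and chosen only so that a beam width of 1 (greedy search) and a beam width of 2 give different answers.

```python
import math

EOS = "<EOS>"

# Hypothetical toy model: P(next token | prefix). A real decoder RNN would
# also condition on the encoded source sentence x.
TOY_MODEL = {
    (): {"jane": 0.6, "in": 0.4},
    ("jane",): {"is": 0.5, "visits": 0.5},
    ("in",): {"september": 0.9, "africa": 0.1},
}

def next_token_probs(prefix):
    # Any prefix missing from the table ends the sentence with probability 1.
    return TOY_MODEL.get(tuple(prefix), {EOS: 1.0})

def beam_search(beam_width, max_len=10):
    # Each hypothesis is a pair (log probability, list of tokens).
    beams = [(0.0, [])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for log_p, tokens in beams:
            for token, p in next_token_probs(tokens).items():
                candidates.append((log_p + math.log(p), tokens + [token]))
        # Keep only the beam_width most likely partial sentences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for log_p, tokens in candidates[:beam_width]:
            if tokens[-1] == EOS:
                finished.append((log_p, tokens))
            else:
                beams.append((log_p, tokens))
        if not beams:
            break
    return max(finished, key=lambda c: c[0]) if finished else None

greedy = beam_search(beam_width=1)  # commits to "jane" and never recovers
wider = beam_search(beam_width=2)   # keeps "in" alive and finds a likelier sentence
```

With this table the greedy run ends with a sentence of probability 0.3, while the wider beam finds one with probability 0.36, which illustrates why taking the best word at each step can miss the best sentence.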
Refinements to beam search
Length normalization
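$P(y^{<1>}, \dots, y^{<T_y>} \mid x) = \prod_{t=1}^{T_y} P(y^{<t>} \mid x, y^{<1>}, \dots, y^{<t-1>})$: each factor is much less than $1$, so the product is close to $0$ and can underflow. Take the $\log$ and maximize $\sum_{t=1}^{T_y} \log P(y^{<t>} \mid x, y^{<1>}, \dots, y^{<t-1>})$ instead.
Every extra word adds another negative term, so this objective tends to favor short sentences.
So you can normalize it by length ($\alpha$ is a hyperparameter; $\alpha = 0$ means no normalization, $\alpha = 1$ means full normalization):
$\frac{1}{T_y^{\alpha}} \sum_{t=1}^{T_y} \log P(y^{<t>} \mid x, y^{<1>}, \dots, y^{<t-1>})$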
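The effect of length normalization can be checked on two made-up hypotheses. This is a sketch only: the per-token probabilities and the default `alpha=0.7` are assumptions for illustration, and `normalized_log_prob` is a hypothetical helper name.

```python
import math

def normalized_log_prob(token_probs, alpha=0.7):
    """Sum of per-token log probabilities divided by (length ** alpha)."""
    total = sum(math.log(p) for p in token_probs)
    return total / (len(token_probs) ** alpha)

# Hypothetical per-token probabilities P(y_t | x, y_1..y_{t-1}).
short_sentence = [0.5, 0.5]            # 2 tokens
long_sentence = [0.6, 0.6, 0.6, 0.6]   # 4 tokens, each individually more likely

raw_short = sum(math.log(p) for p in short_sentence)
raw_long = sum(math.log(p) for p in long_sentence)

# The unnormalized sum prefers the short sentence ...
prefers_short_without_norm = raw_short > raw_long
# ... while the normalized score prefers the longer one.
prefers_long_with_norm = normalized_log_prob(long_sentence) > normalized_log_prob(short_sentence)
```

Setting `alpha=0` recovers the plain log-likelihood, so the same function covers both the normalized and unnormalized objective.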
Beam search discussion
- large $B$: better result, slower
- small $B$: worse result, faster
Error Analysis in Beam Search
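Let $y^*$ be a high-quality human translation and $\hat{y}$ be the algorithm's output. Use the RNN to compute $P(y^* \mid x)$ and $P(\hat{y} \mid x)$, then compare them:
- $P(y^* \mid x) > P(\hat{y} \mid x)$: beam search is at fault (it failed to find the more likely $y^*$)
- $P(y^* \mid x) \le P(\hat{y} \mid x)$: the RNN model is at fault (it gives the worse translation at least as high a probability)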
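The comparison can be tallied over a set of mistranslated dev examples to see which component deserves more attention. A small sketch, assuming the two probabilities have already been computed by the model; `attribute_error` and the numbers in `dev_examples` are hypothetical.

```python
from collections import Counter

def attribute_error(p_reference, p_output):
    """Blame beam search if the model itself prefers the human translation."""
    if p_reference > p_output:
        return "beam search"
    return "RNN model"

# Hypothetical pairs (P(y* | x), P(y_hat | x)) for three mistranslated examples.
dev_examples = [(2e-10, 1e-10), (1e-12, 3e-11), (5e-9, 5e-9)]

blame = Counter(attribute_error(p_ref, p_out) for p_ref, p_out in dev_examples)
```

If most of the counts land on "beam search", increasing the beam width is a reasonable next step; if they land on "RNN model", the model itself (data, regularization, architecture) is the better place to look.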
Bleu(bilingual evaluation understudy) Score
Given one or more good reference translations, the Bleu score measures how close a machine translation is to them.
Bleu details
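Calculate the modified $n$-gram precision with
$p_n = \frac{\sum_{\text{n-gram} \in \hat{y}} \text{Count}_{\text{clip}}(\text{n-gram})}{\sum_{\text{n-gram} \in \hat{y}} \text{Count}(\text{n-gram})}$
and combine the precisions as $BP \cdot \exp\left(\frac{1}{4} \sum_{n=1}^{4} \log p_n\right)$.
BP = brevity penalty: $BP = 1$ if the output is longer than the reference, otherwise $BP = \exp(1 - \text{reference length} / \text{output length})$.
It penalizes translations that are too short, which could otherwise reach a high precision too easily.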
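A compact sketch of the computation using only the standard library. It follows the common definition of Bleu (clipped n-gram counts, geometric mean of the precisions, brevity penalty against the closest reference length); the function names are hypothetical and the example sentences are the usual toy ones, so treat it as an illustration rather than a reference implementation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    clipped = 0
    for gram, count in cand_counts.items():
        # Clip each count by the most times the n-gram appears in any one reference.
        max_ref = max(Counter(ngrams(ref, n))[gram] for ref in references)
        clipped += min(count, max_ref)
    return clipped / sum(cand_counts.values())

def brevity_penalty(candidate, references):
    c = len(candidate)
    if c == 0:
        return 0.0
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    return 1.0 if c > r else math.exp(1 - r / c)

def bleu(candidate, references, max_n=4):
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)

references = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
degenerate = "the the the the the the the".split()
p1 = modified_precision(degenerate, references, 1)  # 2 clipped matches out of 7 words
```

Clipping is what keeps the degenerate output from scoring a unigram precision of 7/7: "the" appears at most twice in any single reference, so only two of the seven occurrences count.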
Attention Model Intuition
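It's hard for an encoder-decoder network to memorize a whole long sentence in a single fixed-length vector, so translation quality drops as sentences get longer.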

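Instead, compute attention weights $\alpha^{<t,t'>}$ that say how much the output word $y^{<t>}$ should attend to input position $t'$, and predict each word from the resulting context.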

Attention Model
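Use a BiRNN or BiLSTM as the encoder to compute features $a^{<t'>}$ for every input position. The context for output step $t$ is $c^{<t>} = \sum_{t'} \alpha^{<t,t'>} a^{<t'>}$, with $\sum_{t'} \alpha^{<t,t'>} = 1$.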

Computing attention
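$\alpha^{<t,t'>} = \frac{\exp(e^{<t,t'>})}{\sum_{t'=1}^{T_x} \exp(e^{<t,t'>})}$, where $e^{<t,t'>}$ is a function of the previous decoder state $s^{<t-1>}$ and the encoder feature $a^{<t'>}$.
Train a very small network to learn what that function is.
The complexity is $O(T_x T_y)$, which is quadratic in the sequence lengths and can be expensive for long inputs.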
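One decoder step of this computation can be sketched in plain Python. The small scoring network here is a single tanh layer with made-up weights `W` and `v`; those weights, the vector sizes and the function names are assumptions for illustration, not trained values.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def energy(s_prev, a_t, W, v):
    """Tiny one-hidden-layer network: e = v . tanh(W [s_prev; a_t])."""
    z = s_prev + a_t  # list concatenation of the two vectors
    hidden = [math.tanh(sum(w * x for w, x in zip(row, z))) for row in W]
    return sum(vi * hi for vi, hi in zip(v, hidden))

def attention_context(s_prev, encoder_states, W, v):
    # One energy per input position, so T_x evaluations per output step
    # and T_x * T_y evaluations for the whole output sequence.
    energies = [energy(s_prev, a_t, W, v) for a_t in encoder_states]
    alphas = softmax(energies)
    dim = len(encoder_states[0])
    context = [sum(alpha * a_t[i] for alpha, a_t in zip(alphas, encoder_states))
               for i in range(dim)]
    return alphas, context

# Hypothetical values: 2-d decoder state, three 2-d encoder features, 3 hidden units.
s_prev = [0.1, -0.2]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -0.3, 0.8, 0.1],
     [-0.2, 0.4, 0.3, -0.7],
     [0.6, 0.1, -0.5, 0.9]]
v = [1.0, -1.0, 0.5]

alphas, context = attention_context(s_prev, encoder_states, W, v)
```

The weights sum to 1, so the context is a weighted average of the encoder features for this output step.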

Speech Recognition - Audio Data
Speech recognition
Attention model for speech recognition
generate character by character
CTC cost for speech recognition
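CTC (Connectionist Temporal Classification)
The output "ttt_h_eee___ ____qqq" collapses to "the q", the start of "the quick brown fox" ("_" is the blank symbol, " " is a real space).
Basic rule: collapse repeated characters not separated by "blank", then remove the blanks.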
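The collapsing rule is short enough to write out directly. A minimal sketch, assuming "_" is used as the blank symbol as in the example above; `ctc_collapse` is a hypothetical helper name.

```python
from itertools import groupby

BLANK = "_"

def ctc_collapse(output, blank=BLANK):
    # Step 1: merge each run of identical consecutive characters into one.
    merged = [ch for ch, _ in groupby(output)]
    # Step 2: drop the blank symbols.
    return "".join(ch for ch in merged if ch != blank)

decoded = ctc_collapse("ttt_h_eee___ ____qqq")
```

The blank is what makes double letters possible: "hhel_llo" collapses to "hello", whereas "hhello" without a blank between the two "l" runs would collapse to "helo".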
Trigger Word Detection
Label the training audio per time step: the target output is 0 everywhere except for a short run of 1s right after each trigger word ends.
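A sketch of how such per-time-step labels could be built. The function name, the `ones_after` parameter and the step indices are hypothetical; the length of the run of 1s is a design choice (a longer run gives a less imbalanced training set).

```python
def make_labels(num_steps, trigger_end_steps, ones_after=50):
    """Return a 0/1 label per output time step.

    trigger_end_steps: time steps at which a trigger word finishes.
    ones_after: how many following steps are labeled 1 (assumed value).
    """
    labels = [0] * num_steps
    for end in trigger_end_steps:
        # Label the steps right after the trigger word, clipped to the clip length.
        for t in range(end + 1, min(end + 1 + ones_after, num_steps)):
            labels[t] = 1
    return labels

labels = make_labels(num_steps=10, trigger_end_steps=[3], ones_after=2)
```

Labeling several consecutive steps instead of a single one gives the network more positive examples to learn from, since trigger words are rare compared with the rest of the audio.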
Article link: https://www.cnblogs.com/zjp-shadow/p/15178221.html
Copyright: unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting.