Sequence Models - Recurrent Neural Networks
Examples of sequence data:
- Speech recognition
- Music generation
- Sentiment classification
- DNA sequence analysis
- Machine translation
- Video activity recognition
- Named entity recognition
Recurrent Neural Network Model
Why not a standard network?
- Inputs, outputs can be different lengths in different examples.
- Doesn't share features learned across different positions of text.
Weakness of RNN
- A prediction at time $t$ only uses information from earlier in the sequence, not from later words.
- (Use a Bidirectional RNN instead.)
Forward Propagation
$$a^{\langle t \rangle} = g_1(W_{aa} a^{\langle t-1 \rangle} + W_{ax} x^{\langle t \rangle} + b_a), \qquad \hat{y}^{\langle t \rangle} = g_2(W_{ya} a^{\langle t \rangle} + b_y)$$
- the activation $g_1$ will often be $\tanh$ in this choice of RNN (sometimes ReLU)
- $g_2$ will often be:
- binary classification problem: sigmoid
- $k$-way classification problem: softmax
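The forward-propagation step above can be sketched in NumPy. This is a minimal illustration; the function name `rnn_cell_forward` and the toy sizes are mine, not from the course:

```python
import numpy as np

def rnn_cell_forward(x_t, a_prev, W_ax, W_aa, W_ya, b_a, b_y):
    """One forward step of a basic RNN cell:
    a<t>    = tanh(W_aa a<t-1> + W_ax x<t> + b_a)
    yhat<t> = softmax(W_ya a<t> + b_y)
    """
    a_t = np.tanh(W_aa @ a_prev + W_ax @ x_t + b_a)
    z = W_ya @ a_t + b_y
    e = np.exp(z - z.max(axis=0, keepdims=True))   # stable softmax
    y_hat = e / e.sum(axis=0, keepdims=True)
    return a_t, y_hat

# toy sizes: 3-dim input, 5-dim hidden state, 2 output classes
rng = np.random.default_rng(0)
x_t = rng.standard_normal((3, 1))
a_prev = np.zeros((5, 1))
a_t, y_hat = rnn_cell_forward(
    x_t, a_prev,
    W_ax=rng.standard_normal((5, 3)),
    W_aa=rng.standard_normal((5, 5)),
    W_ya=rng.standard_normal((2, 5)),
    b_a=np.zeros((5, 1)), b_y=np.zeros((2, 1)),
)
```

Running the same cell once per time step, feeding each `a_t` back in as `a_prev`, gives forward propagation over the whole sequence with shared weights at every position.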
Backpropagation Through Time
For binary classification, the per-step loss is cross-entropy, and the overall loss sums over time:
$$\mathcal{L}^{\langle t \rangle}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}) = -y^{\langle t \rangle} \log \hat{y}^{\langle t \rangle} - (1 - y^{\langle t \rangle}) \log(1 - \hat{y}^{\langle t \rangle})$$
$$\mathcal{L}(\hat{y}, y) = \sum_{t=1}^{T_y} \mathcal{L}^{\langle t \rangle}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle})$$
Different Types of RNN
- many-to-one architecture: Sentiment Classification
- one-to-many architecture: Music Generation
- many-to-many architecture: Machine Translation, where input and output can be different lengths (encoder, decoder)
Language Model and Sequence Generation
- Language modelling
- gives the probability of a sentence: $P(y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, \dots, y^{\langle T_y \rangle})$
- basic job: estimate the probability of sequences
- Training set: a large corpus of English text.
- add `<EOS>` at the end of each sentence.
- replace the unknown words with `<UNK>`.
- Training with an RNN model
- set $x^{\langle t \rangle} = y^{\langle t-1 \rangle}$, i.e. feed the previous target word in as the current input (with $x^{\langle 1 \rangle} = \vec{0}$).
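As a minimal sketch of how the training pairs line up (the sentence, the token indices, and the `<s>` start token standing in for $x^{\langle 1 \rangle} = \vec{0}$ are all made up for illustration):

```python
# Hypothetical tiny example of language-model training pairs:
# the input at step t is the previous target word; x<1> is a start/zero token.
tokens = ["cats", "average", "15", "hours", "<EOS>"]
vocab = {w: i for i, w in enumerate(["<s>"] + tokens)}  # "<s>" plays the role of x<1>
y = [vocab[w] for w in tokens]   # targets y<1>..y<Ty>
x = [vocab["<s>"]] + y[:-1]      # inputs:  x<t> = y<t-1>
```

At every step the network is asked to predict `y[t]` given the true earlier words, which is exactly the conditional probability the language model is estimating.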
Sampling novel sequences
- Sampling a sequence from a trained RNN
- Generate the sentence word by word: at each step, sample $\hat{y}^{\langle t \rangle}$ from the softmax distribution, then feed the sampled word in as the next input $x^{\langle t+1 \rangle}$; stop at `<EOS>` or a length cap.
- Character-level language model
- the vocabulary consists of characters, so `<UNK>` is never needed, but sequences are much longer and more expensive to train.
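The word-by-word sampling loop might look like the following sketch; `next_token_probs` is a stand-in for the trained RNN's softmax output at each step, not real course code:

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB_SIZE, EOS = 10, 0   # toy vocabulary; token 0 plays the role of <EOS>

def next_token_probs(prev_token):
    # Stand-in for the trained network: in reality this would run one RNN
    # step conditioned on prev_token and the hidden state.
    logits = rng.standard_normal(VOCAB_SIZE)
    p = np.exp(logits - logits.max())
    return p / p.sum()

tokens, prev = [], None
for _ in range(20):                        # cap the sequence length
    tok = int(rng.choice(VOCAB_SIZE, p=next_token_probs(prev)))
    tokens.append(tok)
    prev = tok                             # sampled word becomes the next input
    if tok == EOS:
        break
```

Sampling from the distribution (rather than taking the argmax) is what makes each generated sequence novel.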
Vanishing gradients with RNNs
- Basic RNNs are not very good at capturing long-range dependencies.
- Exploding gradients in backpropagation (addressed by using gradient clipping).
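Gradient clipping is simple to implement; here is a minimal NumPy sketch (the helper name and threshold are mine):

```python
import numpy as np

def clip_gradients(grads, max_value=5.0):
    """Clip every gradient element into [-max_value, max_value], in place."""
    for g in grads.values():
        np.clip(g, -max_value, max_value, out=g)
    return grads

# gradients with some exploding entries
grads = {"dW": np.array([[10.0, -0.5], [-12.0, 3.0]]),
         "db": np.array([7.0])}
clip_gradients(grads)
```

Element-wise clipping keeps a single exploding step from destroying the parameters; it does not help with vanishing gradients, which is what the GRU and LSTM below address.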
Gated Recurrent Unit (GRU)
GRU (simplified)
$$\tilde{c}^{\langle t \rangle} = \tanh(W_c [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$$
$$\Gamma_u = \sigma(W_u [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$$
$$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}$$
- $c^{\langle t \rangle} = a^{\langle t \rangle}$ in the GRU
- $\tilde{c}^{\langle t \rangle}$ is a candidate for replacing $c^{\langle t \rangle}$
- think of $\Gamma_u$ as being either $0$ or $1$ most of the time
- if $\Gamma_u \approx 0$, the value of $c^{\langle t \rangle}$ is maintained pretty much exactly even across many time steps
- addresses the vanishing gradient problem
- can learn even very long-range dependencies
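One step of the simplified GRU, as a NumPy sketch of the three equations above (the function name and toy sizes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step_simplified(x_t, c_prev, W_c, W_u, b_c, b_u):
    """Simplified GRU: the update gate decides how much of the candidate to keep."""
    concat = np.vstack([c_prev, x_t])            # [c<t-1>, x<t>]
    c_tilde = np.tanh(W_c @ concat + b_c)        # candidate memory
    gamma_u = sigmoid(W_u @ concat + b_u)        # update gate, in (0, 1)
    c_t = gamma_u * c_tilde + (1 - gamma_u) * c_prev
    return c_t                                   # also serves as a<t>

rng = np.random.default_rng(2)
n_c, n_x = 4, 3
c_t = gru_step_simplified(
    rng.standard_normal((n_x, 1)), np.zeros((n_c, 1)),
    W_c=rng.standard_normal((n_c, n_c + n_x)),
    W_u=rng.standard_normal((n_c, n_c + n_x)),
    b_c=np.zeros((n_c, 1)), b_u=np.zeros((n_c, 1)),
)
```

When `gamma_u` saturates near 0, `c_t` is almost exactly `c_prev`, which is how the memory cell survives across many time steps.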
Full GRU
$$\tilde{c}^{\langle t \rangle} = \tanh(W_c [\Gamma_r * c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$$
$$\Gamma_r = \sigma(W_r [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_r)$$
- $\Gamma_u$ and $c^{\langle t \rangle}$ are computed as in the simplified GRU
- $\Gamma_r$ stands for relevance: how relevant $c^{\langle t-1 \rangle}$ is to computing the candidate $\tilde{c}^{\langle t \rangle}$
Long Short-Term Memory (LSTM)
$$\tilde{c}^{\langle t \rangle} = \tanh(W_c [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$$
$$\Gamma_u = \sigma(W_u [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$$
$$\Gamma_f = \sigma(W_f [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f)$$
$$\Gamma_o = \sigma(W_o [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o)$$
$$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + \Gamma_f * c^{\langle t-1 \rangle}$$
$$a^{\langle t \rangle} = \Gamma_o * \tanh(c^{\langle t \rangle})$$
- peephole connection (element-wise): the gates also take $c^{\langle t-1 \rangle}$ as an input, and the $i$-th element of $c^{\langle t-1 \rangle}$ only affects the $i$-th element of each gate (e.g. the fifth element affects the fifth element).
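A NumPy sketch of one LSTM step without peephole connections, following the six equations above (the names and toy sizes are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, a_prev, c_prev, params):
    """One LSTM step with separate update, forget, and output gates."""
    concat = np.vstack([a_prev, x_t])                       # [a<t-1>, x<t>]
    c_tilde = np.tanh(params["W_c"] @ concat + params["b_c"])
    g_u = sigmoid(params["W_u"] @ concat + params["b_u"])   # update gate
    g_f = sigmoid(params["W_f"] @ concat + params["b_f"])   # forget gate
    g_o = sigmoid(params["W_o"] @ concat + params["b_o"])   # output gate
    c_t = g_u * c_tilde + g_f * c_prev                      # new memory cell
    a_t = g_o * np.tanh(c_t)                                # new activation
    return a_t, c_t

rng = np.random.default_rng(3)
n_a, n_x = 4, 3
params = {k: rng.standard_normal((n_a, n_a + n_x)) for k in ("W_c", "W_u", "W_f", "W_o")}
params.update({k: np.zeros((n_a, 1)) for k in ("b_c", "b_u", "b_f", "b_o")})
a_t, c_t = lstm_step(rng.standard_normal((n_x, 1)),
                     np.zeros((n_a, 1)), np.zeros((n_a, 1)), params)
```

Unlike the GRU, the LSTM keeps $a^{\langle t \rangle}$ and $c^{\langle t \rangle}$ separate, and uses an independent forget gate instead of $1 - \Gamma_u$.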
Bidirectional RNN
- forward prop runs twice: a left-to-right recurrence computes $\overrightarrow{a}^{\langle t \rangle}$ and a right-to-left recurrence computes $\overleftarrow{a}^{\langle t \rangle}$
$$\hat{y}^{\langle t \rangle} = g(W_y [\overrightarrow{a}^{\langle t \rangle}, \overleftarrow{a}^{\langle t \rangle}] + b_y)$$
- the computation graph is acyclic
- a BRNN with LSTM blocks would be a pretty reasonable first thing to try
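A rough NumPy sketch of the bidirectional idea: run a basic recurrence left-to-right and right-to-left over the sequence, then concatenate the two activations at each step. (A real BRNN uses separate parameters per direction; one shared set is used here purely for brevity.)

```python
import numpy as np

def rnn_pass(xs, W_ax, W_aa, b_a):
    """Run a basic RNN over a list of inputs, returning all activations."""
    a, out = np.zeros((W_aa.shape[0], 1)), []
    for x in xs:
        a = np.tanh(W_aa @ a + W_ax @ x + b_a)
        out.append(a)
    return out

rng = np.random.default_rng(4)
n_a, n_x, T = 4, 3, 5
xs = [rng.standard_normal((n_x, 1)) for _ in range(T)]
W_ax = rng.standard_normal((n_a, n_x))
W_aa = rng.standard_normal((n_a, n_a))
b_a = np.zeros((n_a, 1))

fwd = rnn_pass(xs, W_ax, W_aa, b_a)               # left-to-right activations
bwd = rnn_pass(xs[::-1], W_ax, W_aa, b_a)[::-1]   # right-to-left, re-aligned
# the prediction at each step sees context from both directions:
a_concat = [np.vstack([f, b]) for f, b in zip(fwd, bwd)]
```

Note that both full passes must finish before any $\hat{y}^{\langle t \rangle}$ can be computed, which is why BRNNs need the entire sequence up front.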
Deep RNNs
- stack several recurrent layers; layer $l$ at time $t$ takes the previous time step of the same layer and the same time step of the layer below:
$$a^{[l] \langle t \rangle} = g(W_a^{[l]} [a^{[l] \langle t-1 \rangle}, a^{[l-1] \langle t \rangle}] + b_a^{[l]})$$
Homework: Improvise a Jazz Solo with an LSTM Network
You would like to create a jazz music piece specially for a friend's birthday. However, you don't know how to play any instruments, or how to compose music. Fortunately, you know deep learning and will solve this problem using an LSTM network!
You will train a network to generate novel jazz solos in a style representative of a body of performed work. 😎🎷
Something came over me when I heard the result... Aye...
Exercise 1 - djmodel

We will use:
- Optimizer: Adam
- Loss function: categorical cross-entropy (for multi-class classification)
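Categorical cross-entropy is easy to check by hand; here is a minimal NumPy version for intuition (an illustration, not Keras's actual implementation):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean over examples of -sum_k y_k * log(yhat_k)."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return float(-(y_true * np.log(y_pred)).sum(axis=-1).mean())

y_true = np.array([[0.0, 1.0, 0.0, 0.0]])      # one-hot target, 4 classes
loss_perfect = categorical_crossentropy(y_true, y_true)
# a uniform guess over 4 classes costs -log(0.25) = log 4 ≈ 1.386
loss_uniform = categorical_crossentropy(y_true, np.full((1, 4), 0.25))
```

The loss only looks at the probability assigned to the correct class, which is why it pairs naturally with a softmax output layer.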
Exercise 2 - music_inference_model

Exercise 3 - predict_and_sample
Permalink: https://www.cnblogs.com/zjp-shadow/p/15139528.html
Copyright: unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!