[Dive into Deep Learning, PyTorch] Study Notes 8.4: Recurrent Neural Networks

8.4. Recurrent Neural Networks - Dive into Deep Learning 2.0.0-beta0 documentation (d2l.ai)

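Main points:

A recurrent neural network (RNN) is a neural network that computes its hidden state recurrently.
The hidden state can capture the history of the sequence up to the current time step.
The number of parameters does not grow as the number of time steps increases.
An RNN can be used to build a character-level language model.
Perplexity can be used to evaluate the quality of a language model.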
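
The first three points can be made concrete with a minimal sketch. It assumes the update H_t = tanh(X_t W_xh + H_{t-1} W_hh + b_h); the choice of tanh, the zero bias, the sizes, and the helper name `rnn_forward` are illustrative assumptions rather than the book's code. The same three parameter tensors are reused at every time step, so the parameter count is the same for a sequence of 5 steps and one of 50.

```python
import torch

def rnn_forward(inputs, H, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence, reusing the same parameters at every step.

    inputs: (num_steps, batch_size, num_inputs); H: (batch_size, num_hiddens).
    """
    states = []
    for X in inputs:  # iterate over time steps
        # The new hidden state depends on the current input and the previous state.
        H = torch.tanh(X @ W_xh + H @ W_hh + b_h)
        states.append(H)
    return torch.stack(states), H

torch.manual_seed(0)
batch_size, num_inputs, num_hiddens = 3, 1, 4  # illustrative sizes
W_xh = torch.normal(0, 1, (num_inputs, num_hiddens))
W_hh = torch.normal(0, 1, (num_hiddens, num_hiddens))
b_h = torch.zeros(num_hiddens)
num_params = W_xh.numel() + W_hh.numel() + b_h.numel()

for num_steps in (5, 50):
    inputs = torch.normal(0, 1, (num_steps, batch_size, num_inputs))
    H0 = torch.zeros(batch_size, num_hiddens)
    states, H_last = rnn_forward(inputs, H0, W_xh, W_hh, b_h)
    # Longer sequences yield more hidden states but use the same parameters.
    print(num_steps, tuple(states.shape), num_params)
```

Only the number of stored hidden states changes with the sequence length; `num_params` stays at 4 + 16 + 4 = 24 in both runs.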

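The authors explain the material clearly and accessibly; after reading it, everything clicked.

Since I am not very familiar with information theory, I found the concept of perplexity somewhat perplexing...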
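
A small numeric example may make perplexity less mysterious. Perplexity is the exponential of the average negative log-probability that the model assigns to the tokens that actually occur, so it can loosely be read as the number of equally likely choices the model is hesitating between at each step. The sketch below uses only the standard library; the function name `perplexity`, the probability lists, and the vocabulary size of 28 are made-up illustrations, not values from the book.

```python
import math

def perplexity(target_probs):
    """exp of the average negative log-probability assigned to the true tokens."""
    avg_nll = -sum(math.log(p) for p in target_probs) / len(target_probs)
    return math.exp(avg_nll)

vocab_size = 28  # hypothetical character vocabulary size

# A perfect model puts probability 1 on every true token.
print(perplexity([1.0, 1.0, 1.0]))
# A model that guesses uniformly over the vocabulary.
print(perplexity([1 / vocab_size] * 3))
# A model somewhere in between.
print(perplexity([0.5, 0.25, 0.5, 0.25]))
```

The perfect model scores 1, the uniform guesser scores approximately the vocabulary size, and the in-between model scores about 2.83, so lower is better.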

The code in this section is fairly simple; it verifies a single computation:

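In the hidden state, the term $X_t W_{xh} + H_{t-1} W_{hh}$ is equivalent to the matrix product of the concatenation of $X_t$ and $H_{t-1}$ with the concatenation of $W_{xh}$ and $W_{hh}$.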
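
That this sum equals a single product of concatenated matrices follows from block matrix multiplication. Assume shapes $X_t \in \mathbb{R}^{n \times d}$, $H_{t-1} \in \mathbb{R}^{n \times h}$, $W_{xh} \in \mathbb{R}^{d \times h}$ and $W_{hh} \in \mathbb{R}^{h \times h}$ (the symbols $n$, $d$, $h$ are introduced here for illustration). Concatenating $X_t$ and $H_{t-1}$ along columns and $W_{xh}$ and $W_{hh}$ along rows gives

$$
\begin{bmatrix} X_t & H_{t-1} \end{bmatrix}
\begin{bmatrix} W_{xh} \\ W_{hh} \end{bmatrix}
= X_t W_{xh} + H_{t-1} W_{hh}
$$

because each entry of the product splits into a sum over the first $d$ columns and a sum over the remaining $h$ columns:

$$
\left( \begin{bmatrix} X_t & H_{t-1} \end{bmatrix}
\begin{bmatrix} W_{xh} \\ W_{hh} \end{bmatrix} \right)_{ij}
= \sum_{k=1}^{d} (X_t)_{ik} (W_{xh})_{kj} + \sum_{k=1}^{h} (H_{t-1})_{ik} (W_{hh})_{kj}
$$

With the sizes used in this section's code ($n = 3$, $d = 1$, $h = 4$), this is a $3 \times 5$ matrix times a $5 \times 4$ matrix.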

import torch

# torch.normal(mean, std, size, *, generator=None, out=None)
# Returns a tensor of random numbers drawn from a normal distribution
# with the given mean and standard deviation.
X, W_xh = torch.normal(0, 1, (3, 1)), torch.normal(0, 1, (1, 4))
H, W_hh = torch.normal(0, 1, (3, 4)), torch.normal(0, 1, (4, 4))

print('X = ', X)
print('W_xh = ', W_xh)
print('H = ', H)
print('W_hh = ', W_hh)
print('𝐗𝑡𝐖𝑥ℎ + 𝐇𝑡−1𝐖ℎℎ\n ', torch.matmul(X, W_xh) + torch.matmul(H, W_hh))
print('Concatenation of 𝐗𝑡 and 𝐇𝑡−1 times concatenation of 𝐖𝑥ℎ and 𝐖ℎℎ\n', torch.matmul(torch.cat((X, H), 1), torch.cat((W_xh, W_hh), 0)))

# Repeat the check with small integer tensors so the result can be verified by hand.
x_t = torch.reshape(torch.arange(start=0, end=3, step=1), (3, 1))
print(x_t)
w_t = torch.reshape(torch.arange(start=0, end=4, step=1), (1, 4))
print(w_t)
h_t = torch.reshape(torch.arange(start=0, end=12, step=1), (3, 4))
print(h_t)
w_ht = torch.reshape(torch.arange(start=0, end=16, step=1), (4, 4))
print(w_ht)
print('𝐗𝑡𝐖𝑥ℎ + 𝐇𝑡−1𝐖ℎℎ\n ', torch.matmul(x_t, w_t) + torch.matmul(h_t, w_ht))
print('Concatenation of 𝐗𝑡 and 𝐇𝑡−1 times concatenation of 𝐖𝑥ℎ and 𝐖ℎℎ\n', torch.matmul(torch.cat((x_t, h_t), 1), torch.cat((w_t, w_ht), 0)))

Output:

X = tensor([[0.0750],
        [0.9167],
        [1.6353]])
W_xh = tensor([[-0.2704, -1.1983, -0.1330, -0.9958]])
H = tensor([[ 2.2878, -0.2639, -0.9919,  0.1534],
        [-0.2766,  1.0092,  2.2080,  0.6165],
        [-0.5322,  0.0030,  0.1436,  0.6592]])
W_hh = tensor([[-9.6877e-01, -1.4911e+00,  5.7608e-01,  3.3117e+00],
        [ 9.5269e-01, -1.1773e+00,  2.5746e-01, -1.5353e+00],
        [-9.5935e-01, -2.1764e+00,  7.7573e-01, -1.0381e+00],
        [-1.3758e-01, -7.3054e-01, -4.6560e-04,  9.2867e-01]])
𝐗𝑡𝐖𝑥ℎ + 𝐇𝑡−1𝐖ℎℎ
tensor([[-1.5575, -1.1435,  0.4705,  9.0790],
        [-1.2215, -7.1301,  1.6911, -5.0979],
        [-0.1523, -1.9637, -0.4122, -2.9326]])
Concatenation of 𝐗𝑡 and 𝐇𝑡−1 times concatenation of 𝐖𝑥ℎ and 𝐖ℎℎ
tensor([[-1.5575, -1.1435,  0.4705,  9.0790],
        [-1.2215, -7.1301,  1.6911, -5.0979],
        [-0.1523, -1.9637, -0.4122, -2.9326]])
tensor([[0],
        [1],
        [2]])
tensor([[0, 1, 2, 3]])
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]])
𝐗𝑡𝐖𝑥ℎ + 𝐇𝑡−1𝐖ℎℎ
tensor([[ 56,  62,  68,  74],
        [152, 175, 198, 221],
        [248, 288, 328, 368]])
Concatenation of 𝐗𝑡 and 𝐇𝑡−1 times concatenation of 𝐖𝑥ℎ and 𝐖ℎℎ
tensor([[ 56,  62,  68,  74],
        [152, 175, 198, 221],
        [248, 288, 328, 368]])
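
The two results above agree by eye; the comparison can also be made programmatically. The following is a minimal sketch with arbitrary illustrative sizes (`n`, `d`, `h` and the names `separate` and `fused` are not from the original). It uses `torch.allclose` rather than exact equality because floating-point sums can differ in the last digits depending on the order of operations.

```python
import torch

torch.manual_seed(0)
n, d, h = 3, 2, 4  # batch size, input size, hidden size (illustrative)
X, W_xh = torch.normal(0, 1, (n, d)), torch.normal(0, 1, (d, h))
H, W_hh = torch.normal(0, 1, (n, h)), torch.normal(0, 1, (h, h))

# Two matrix products followed by an addition.
separate = X @ W_xh + H @ W_hh
# Concatenate X and H along columns (dim 1), W_xh and W_hh along rows (dim 0),
# then do a single matrix product.
fused = torch.cat((X, H), dim=1) @ torch.cat((W_xh, W_hh), dim=0)

print(fused.shape)
print(torch.allclose(separate, fused, atol=1e-6))
```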

posted on 2022-06-10 12:00 HBU_DAVID
