LSTM and GRU
LSTM
Ignoring the bias terms:
\[\begin{align}
i_t&=\sigma(x_t\cdot W_i+h_{t-1}\cdot U_i)\\
f_t&=\sigma(x_t\cdot W_f+h_{t-1}\cdot U_f)\\
o_t&=\sigma(x_t\cdot W_o+h_{t-1}\cdot U_o)\\
\widetilde{C}_t&=\tanh(x_t\cdot W_c+h_{t-1}\cdot U_c)\\
C_t&=f_t\odot C_{t-1}+i_t\odot \widetilde{C}_t\\
h_t&=o_t\odot \tanh(C_t)
\end{align}
\]
where \(\odot\) is the elementwise product and:
\(i_t\): input gate
\(f_t\): forget gate
\(o_t\): output gate
\(\widetilde{C}_t\): candidate cell state (the new information)
GRU: a variant of LSTM
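As a baseline for the comparison with the GRU, one step of the LSTM update above can be sketched in NumPy. This is a minimal illustration rather than a reference implementation: the function names `lstm_step` and `init_lstm_params`, the dictionary layout of the weights, and the small random initialization are assumptions made for the example. It uses the standard output form \(h_t=o_t\odot\tanh(C_t)\) and, like the equations, omits the bias terms.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm_params(d_in, d_hid, seed=0):
    """Hypothetical helper: small random weights for the four LSTM blocks."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in "ifoc":  # input, forget, output, candidate
        params[f"W_{gate}"] = rng.normal(scale=0.1, size=(d_in, d_hid))
        params[f"U_{gate}"] = rng.normal(scale=0.1, size=(d_hid, d_hid))
    return params

def lstm_step(x_t, h_prev, c_prev, params):
    """One bias-free LSTM step. x_t: (d_in,), h_prev and c_prev: (d_hid,)."""
    i_t = sigmoid(x_t @ params["W_i"] + h_prev @ params["U_i"])  # input gate
    f_t = sigmoid(x_t @ params["W_f"] + h_prev @ params["U_f"])  # forget gate
    o_t = sigmoid(x_t @ params["W_o"] + h_prev @ params["U_o"])  # output gate
    c_tilde = np.tanh(x_t @ params["W_c"] + h_prev @ params["U_c"])  # new information
    c_t = f_t * c_prev + i_t * c_tilde  # elementwise mix of old and new cell state
    h_t = o_t * np.tanh(c_t)            # gated, squashed cell state
    return h_t, c_t

# Run a short sequence of 5 inputs through the cell.
params = init_lstm_params(d_in=3, d_hid=4)
h, c = np.zeros(4), np.zeros(4)
for x_t in np.ones((5, 3)):
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)  # (4,) (4,)
```

Note that the LSTM carries two state vectors, \(h_t\) and \(C_t\), from one step to the next, and uses four pairs of weight matrices.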
The two cells are compared in the figure:
GRU node update rule:
\[\begin{align}
z_t&=\sigma(x_t\cdot W_z+h_{t-1}\cdot U_z)\\
r_t&=\sigma(x_t\cdot W_r+h_{t-1}\cdot U_r)\\
\widetilde{h}_t&=\tanh(x_t\cdot W+(r_t\odot h_{t-1})\cdot U)\\
h_t&=(1-z_t)\odot h_{t-1}+z_t\odot \widetilde{h}_t
\end{align}
\]
where:
\(z_t\): update gate
\(r_t\): reset gate
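The GRU update above can be sketched in NumPy in the same way. Again this is only an illustrative sketch: `gru_step`, `init_gru_params`, and the weight dictionary keys are names assumed for the example, and biases are omitted to match the equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru_params(d_in, d_hid, seed=0):
    """Hypothetical helper: small random weights for the three GRU blocks."""
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("_z", "_r", ""):  # update gate, reset gate, candidate (W, U)
        params[f"W{name}"] = rng.normal(scale=0.1, size=(d_in, d_hid))
        params[f"U{name}"] = rng.normal(scale=0.1, size=(d_hid, d_hid))
    return params

def gru_step(x_t, h_prev, params):
    """One bias-free GRU step. x_t: (d_in,), h_prev: (d_hid,)."""
    z_t = sigmoid(x_t @ params["W_z"] + h_prev @ params["U_z"])  # update gate
    r_t = sigmoid(x_t @ params["W_r"] + h_prev @ params["U_r"])  # reset gate
    # The reset gate scales the previous state before it enters the candidate.
    h_tilde = np.tanh(x_t @ params["W"] + (r_t * h_prev) @ params["U"])
    # The update gate interpolates between the old state and the candidate.
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde
    return h_t

# Run a short sequence of 5 inputs through the cell.
gru_params = init_gru_params(d_in=3, d_hid=4)
h_gru = np.zeros(4)
for x_t in np.ones((5, 3)):
    h_gru = gru_step(x_t, h_gru, gru_params)
print(h_gru.shape)  # (4,)
```

Compared with the LSTM equations, the GRU keeps a single state vector \(h_t\) and needs three pairs of weight matrices instead of four, because one update gate \(z_t\) plays the role of both the input and forget gates.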