x:
SENTENCE_START what are n't you understanding about this ? !
[0, 51, 27, 16, 10, 856, 53, 25, 34, 69]
y:
what are n't you understanding about this ? ! SENTENCE_END
[51, 27, 16, 10, 856, 53, 25, 34, 69, 1]
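The pair above can be reproduced with a one-token shift: `y` is `x` moved one position to the left, so each input word is trained to predict the next one. A minimal sketch follows; the `word_to_index` dictionary is a hypothetical stand-in that contains only the ids visible in this example, not the full vocabulary.

```python
# Hypothetical mini-vocabulary: only the ids shown in the example above.
word_to_index = {
    "SENTENCE_START": 0, "SENTENCE_END": 1,
    "what": 51, "are": 27, "n't": 16, "you": 10, "understanding": 856,
    "about": 53, "this": 25, "?": 34, "!": 69,
}

tokens = ["SENTENCE_START", "what", "are", "n't", "you", "understanding",
          "about", "this", "?", "!", "SENTENCE_END"]

# Map every token to its integer id.
ids = [word_to_index[w] for w in tokens]

# Input drops the last token, target drops the first one,
# so y[k] is the word that follows x[k].
x = ids[:-1]
y = ids[1:]

print(x)
print(y)
```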
\[i = \sigma(U^ix_t + W^is_{t-1} + b^i) \\
i \text{: input gate, defines how much of the newly computed state for the current input you want to let through.}
\]
\[f = \sigma(U^fx_t + W^fs_{t-1} + b^f) \\
f \text{: forget gate, defines how much of the previous state you want to let through.}
\]
\[o = \sigma(U^ox_t + W^os_{t-1} + b^o) \\
o \text{: output gate, defines how much of the internal state you want to expose to the external network.}
\]
\[g = \tanh(U^gx_t + W^gs_{t-1} + b^g) \\
g \text{: a candidate hidden state.} \\
c_t = c_{t-1} \circ f + g \circ i \\
c_t \text{: the internal memory of the unit.} \\
s_t = \tanh(c_t) \circ o
\]
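The LSTM equations above can be sketched as a single forward step in NumPy. This is an illustration only, not the implementation used in this post: the function name `lstm_step`, the dictionary layout of `U`, `W`, `b` (keyed by gate letter), the dimensions, and the random initialization are all assumptions made for the example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, s_prev, c_prev, U, W, b):
    """One LSTM step. U, W, b are dicts keyed by 'i', 'f', 'o', 'g' (assumed layout)."""
    i = sigmoid(U["i"] @ x_t + W["i"] @ s_prev + b["i"])  # input gate
    f = sigmoid(U["f"] @ x_t + W["f"] @ s_prev + b["f"])  # forget gate
    o = sigmoid(U["o"] @ x_t + W["o"] @ s_prev + b["o"])  # output gate
    g = np.tanh(U["g"] @ x_t + W["g"] @ s_prev + b["g"])  # candidate state
    c_t = c_prev * f + g * i          # internal memory (element-wise products)
    s_t = np.tanh(c_t) * o            # exposed hidden state
    return s_t, c_t

# Toy sizes and small random weights, chosen only for the demo.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3
U = {k: rng.normal(scale=0.1, size=(hidden_dim, input_dim)) for k in "ifog"}
W = {k: rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)) for k in "ifog"}
b = {k: np.zeros(hidden_dim) for k in "ifog"}

x_t = rng.normal(size=input_dim)
s_t, c_t = lstm_step(x_t, np.zeros(hidden_dim), np.zeros(hidden_dim), U, W, b)
print(s_t, c_t)
```

Because the output gate lies in (0, 1) and `tanh` lies in (-1, 1), every entry of `s_t` stays strictly inside (-1, 1), while `c_t` itself is unbounded and can accumulate over many steps.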
Gated Recurrent Units (GRUs)
Let's start with the equations:
\[x_e = Ex_t \\
z = \sigma(U^zx_e + W^zs_{t-1} + b^z) \\
r = \sigma(U^rx_e + W^rs_{t-1} + b^r) \\
h = \tanh(U^hx_e + W^h(s_{t-1} \circ r) + b^h) \\
s_t = (1 - z) \circ h + z \circ s_{t-1} \\
o_t = Vs_t + c
\]
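As a rough illustration of the GRU equations above, here is one forward step in NumPy. It assumes that the input is a one-hot word vector, so that multiplying by the embedding matrix E reduces to selecting one column of `E`. The names `init_gru_params` and `gru_step`, the parameter dictionary, and the toy sizes are hypothetical and chosen only for this sketch.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_gru_params(vocab_size, hidden_dim, seed=0):
    """Small random parameters; names mirror the symbols in the equations."""
    rng = np.random.default_rng(seed)
    def mat(rows, cols):
        return rng.normal(scale=0.1, size=(rows, cols))
    p = {"E": mat(hidden_dim, vocab_size), "V": mat(vocab_size, hidden_dim),
         "c": np.zeros(vocab_size)}
    for gate in "zrh":
        p["U" + gate] = mat(hidden_dim, hidden_dim)
        p["W" + gate] = mat(hidden_dim, hidden_dim)
        p["b" + gate] = np.zeros(hidden_dim)
    return p

def gru_step(word_id, s_prev, p):
    x_e = p["E"][:, word_id]                                     # embedding lookup
    z = sigmoid(p["Uz"] @ x_e + p["Wz"] @ s_prev + p["bz"])      # update gate
    r = sigmoid(p["Ur"] @ x_e + p["Wr"] @ s_prev + p["br"])      # reset gate
    h = np.tanh(p["Uh"] @ x_e + p["Wh"] @ (s_prev * r) + p["bh"])  # candidate state
    s_t = (1 - z) * h + z * s_prev                               # new hidden state
    o_t = p["V"] @ s_t + p["c"]                                  # unnormalized word scores
    return o_t, s_t

params = init_gru_params(vocab_size=8, hidden_dim=3)
state = np.zeros(3)
for word_id in [0, 5, 2]:          # a toy sequence of word ids
    scores, state = gru_step(word_id, state, params)
print(scores.shape, state)
```

The scores in `o_t` are not probabilities; a softmax over them would typically follow when predicting the next word.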