Long Short-Term Memory (LSTM): A Brief Introduction to the Formulas
Long short-term memory: make the short-term memory last for a long time.
Paper Reference:
A Critical Review of Recurrent Neural Networks for Sequence Learning
Three Types of Gate
Input Gate:
Controls how much of the current input \(x_t\) and the previous output \(h_{t-1}\) will enter into the new cell.
Forget Gate:
Decides whether to erase (set to zero) or keep individual components of the memory.
Cell Update:
Transforms the current input \(x_t\) and the previous output \(h_{t-1}\) into a candidate value to be taken into account in the current state.
Output Gate:
Scales the output from the cell.
Internal State update:
Computes the current timestep's state using the gated previous state and the gated input.
Hidden Layer:
The output of the LSTM: the output gate's activation scaled by a \(\tanh\) (squashing) transformation of the current state.
Here \(\cdot\) denotes element-wise multiplication, \(\phi(x)=\tanh(x)\), and \(\sigma(x)=\mathrm{sigmoid}(x)\).
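The six steps described above correspond to the standard LSTM update without peephole connections. As a sketch, they can be written as follows. The weight names \(W^{\ast x}\), \(W^{\ast h}\), the biases \(b_\ast\), and the symbol \(s_t\) for the internal state are notation assumed here for illustration and may differ from the notation the original formulas used.

```latex
\begin{aligned}
i_t &= \sigma(W^{ix} x_t + W^{ih} h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W^{fx} x_t + W^{fh} h_{t-1} + b_f) && \text{(forget gate)} \\
g_t &= \phi(W^{gx} x_t + W^{gh} h_{t-1} + b_g)   && \text{(cell update)} \\
o_t &= \sigma(W^{ox} x_t + W^{oh} h_{t-1} + b_o) && \text{(output gate)} \\
s_t &= g_t \cdot i_t + s_{t-1} \cdot f_t         && \text{(internal state update)} \\
h_t &= \phi(s_t) \cdot o_t                       && \text{(hidden layer)}
\end{aligned}
```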
Parallel Computing
The input gate, forget gate, cell update, and output gate can be computed in parallel, because each of them depends only on \(x_t\) and \(h_{t-1}\).
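One common way to exploit this is to stack the four weight matrices so that all four pre-activations come out of a single matrix product. The following is a minimal NumPy sketch, not a reference implementation. The function name `lstm_step`, the stacked parameters `W`, `U`, `b`, and the row order (input gate, forget gate, cell update, output gate) are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, s_prev, W, U, b):
    """One LSTM step with the four pre-activations computed together.

    W: (4*n_hidden, n_input), U: (4*n_hidden, n_hidden), b: (4*n_hidden,)
    Assumed row order: input gate, forget gate, cell update, output gate.
    """
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b       # all four pre-activations in one product
    i = sigmoid(z[0 * n:1 * n])        # input gate
    f = sigmoid(z[1 * n:2 * n])        # forget gate
    g = np.tanh(z[2 * n:3 * n])        # cell update (candidate value)
    o = sigmoid(z[3 * n:4 * n])        # output gate
    s_t = g * i + s_prev * f           # internal state update
    h_t = np.tanh(s_t) * o             # hidden layer output
    return h_t, s_t

# Illustrative sizes and random parameters (hypothetical values).
rng = np.random.default_rng(0)
n_input, n_hidden = 3, 4
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_input))
U = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)
x = rng.normal(size=n_input)
h, s = lstm_step(x, np.zeros(n_hidden), np.zeros(n_hidden), W, U, b)
print(h.shape, s.shape)
```

The internal state update and the hidden layer output still have to wait for the gates, so only the four pre-activations are batched here.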
LSTM Network for Sentiment Analysis
Model Architecture
Model: LSTM layer --> Average Pooling --> Logistic Regression
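Input sequence: \(x_0, x_1, \ldots, x_n\)
Representation sequence: \(h_0, h_1, \ldots, h_n\)
This representation sequence is then averaged over all timesteps, resulting in the representation \(h = \frac{1}{n+1}\sum_{t=0}^{n} h_t\), which is fed to the logistic regression layer.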
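The pooling and classification stages can be sketched in a few lines of NumPy. The array `H` is a hypothetical stand-in for the per-timestep outputs of the LSTM layer. The binary logistic regression, with illustrative weights `w` and bias `b`, is an assumption made to keep the example small, and a multi-class model would use a softmax instead.

```python
import numpy as np

def mean_pool(H):
    """Average the per-timestep representations; H has shape (n_steps, n_hidden)."""
    return H.mean(axis=0)

def logistic_regression(h, w, b):
    """Binary logistic regression on the pooled representation h."""
    return 1.0 / (1.0 + np.exp(-(w @ h + b)))

# Hypothetical LSTM outputs, one row per timestep.
H = np.array([[0.2, -0.4],
              [0.6,  0.0],
              [0.1,  0.4]])
h = mean_pool(H)                      # pooled representation of the whole sequence
w = np.array([1.0, -1.0])             # illustrative classifier weights
p = logistic_regression(h, w, b=0.0)  # probability of the positive class
print(h, p)
```

Because the average runs over the time axis, the pooled vector has the same size regardless of the sequence length, so one classifier can serve sequences of any length.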
Bidirectional LSTM
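It seems that this can only be used for fixed-length sequences. Another point is that in traditional machine learning settings we do not actually have access to future information.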
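The dependence on future inputs is easy to see in code: a bidirectional layer runs one recurrence from left to right and a second one from right to left, then combines the two states at each timestep. The sketch below makes this concrete under simplifying assumptions. A plain \(\tanh\) recurrence stands in for the LSTM cell for brevity, and the names `run_rnn` and `bidirectional` are hypothetical.

```python
import numpy as np

def run_rnn(X, W, U, b):
    """Run a simple tanh recurrence over X (n_steps, n_input); returns (n_steps, n_hidden)."""
    h = np.zeros(U.shape[0])
    out = []
    for x_t in X:
        h = np.tanh(W @ x_t + U @ h + b)
        out.append(h)
    return np.stack(out)

def bidirectional(X, fwd_params, bwd_params):
    """Concatenate forward states with backward states re-aligned to the original time order."""
    H_fwd = run_rnn(X, *fwd_params)
    H_bwd = run_rnn(X[::-1], *bwd_params)[::-1]   # process reversed input, then flip back
    return np.concatenate([H_fwd, H_bwd], axis=1)

def make_params(rng, n_input, n_hidden):
    """Random illustrative parameters (W, U, b) for one direction."""
    return (rng.normal(scale=0.5, size=(n_hidden, n_input)),
            rng.normal(scale=0.5, size=(n_hidden, n_hidden)),
            np.zeros(n_hidden))

rng = np.random.default_rng(0)
n_steps, n_input, n_hidden = 5, 3, 4
fwd = make_params(rng, n_input, n_hidden)
bwd = make_params(rng, n_input, n_hidden)
X = rng.normal(size=(n_steps, n_input))
H = bidirectional(X, fwd, bwd)
print(H.shape)
```

The backward half of the state at timestep \(t\) is computed from inputs after \(t\), so the whole sequence has to be available before any output is produced.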