The difference between an LSTM block and an LSTM cell
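In the LSTM architecture, the hidden layer at each time step contains multiple memory blocks (usually we use a single block). Each block contains a Cell (made up of several memory cells) and three gates. A basic example of the structure is shown in the figure below: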
A single memory cell can only produce one scalar value, whereas a block can produce a vector.
The nomenclature is a bit confusing, but goes back to the original LSTM paper: an LSTM "cell" is strictly speaking mostly the CEC, i.e., the "inner" unit that stores a value through time. A "block" is CEC + gates. In theory, you could have more than one cell in a block. In that case, all cells within the block would use the same gates. So they would store/output information together. This way you could e.g. store two "bits" of information with the same gates. However this idea never really caught on -- all modern LSTMs have one cell per block, and the distinction between cell & blocks eroded over time (which luckily simplifies notation, which was often a bit cumbersome in old Hochreiter, Gers and Graves papers that used the idea of blocks).
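To make the "several cells sharing the same gates" idea concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the formulation from the original papers: there are no biases, no peephole connections, and the recurrent input is assumed to be already folded into the input vector x. The names MemoryBlock, w_in, w_forget, w_out and w_cells are hypothetical and chosen only for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


class MemoryBlock:
    """One memory block: three scalar gates shared by all of its cells."""

    def __init__(self, w_in, w_forget, w_out, w_cells):
        # One weight vector per gate; each gate is computed once per block.
        self.w_in = w_in
        self.w_forget = w_forget
        self.w_out = w_out
        # One weight vector per cell; len(w_cells) is the number of cells.
        self.w_cells = w_cells
        # Each cell stores exactly one scalar (its CEC value).
        self.states = [0.0] * len(w_cells)

    def step(self, x):
        # Gate activations are scalars and are reused for every cell.
        i = sigmoid(dot(self.w_in, x))
        f = sigmoid(dot(self.w_forget, x))
        o = sigmoid(dot(self.w_out, x))
        outputs = []
        for k, w in enumerate(self.w_cells):
            # Update the cell's scalar state, then emit its scalar output.
            self.states[k] = f * self.states[k] + i * math.tanh(dot(w, x))
            outputs.append(o * math.tanh(self.states[k]))
        # The block's output is a vector with one entry per cell.
        return outputs


# A block with two cells: both cells are written and read by the same gates.
block = MemoryBlock(
    w_in=[0.0, 0.0],
    w_forget=[0.0, 0.0],
    w_out=[0.0, 0.0],
    w_cells=[[1.0, 0.0], [0.0, 1.0]],
)
out = block.step([1.0, -1.0])
```

Passing a single weight vector in w_cells gives the one-cell-per-block case that the quote describes as the modern convention; the block then returns a list with a single scalar.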
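So "cell" should refer to the neurons in the four neural network layers (fully connected layers) inside the LSTM, while "block" is evidently the whole box-shaped unit that contains these layers together with the gates.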