cntk-notes

CNTK Embedding layer

“Embedding” refers to representing words or other discrete items by dense continuous vectors. This layer assumes that the input is in one-hot form. E.g., for a vocabulary size of 10,000, each input vector is expected to have dimension 10,000 and consist of zeroes except for one position that contains a 1. The index of that location is the index of the word or item it represents.
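As the passage above says, the job of the Embedding layer is to convert a one-hot vector into a dense continuous vector. The conversion is implemented as a matrix multiplication. When the vocabulary size is large, however, the one-hot vector should be represented in sparse form before the multiplication so that the product stays efficient. This is done by declaring the input with the parameter is_sparse=True: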
import cntk as C
input = C.input_variable(shape=(784,), is_sparse=True)  # shape is the one-hot (vocabulary) dimension
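
To see why a sparse representation helps, here is a minimal pure-Python sketch (it does not use CNTK). It assumes a toy vocabulary of 5 items and an embedding dimension of 3. The matrix `E` and the helpers `one_hot`, `dense_embed`, and `sparse_embed` are hypothetical names made up for illustration. The sketch shows that multiplying a one-hot vector by the embedding matrix gives the same result as reading out a single row, which is the shortcut a sparse input makes possible.

```python
# Pure-Python sketch: multiplying a one-hot vector by an embedding matrix
# selects one row of that matrix. All names here are illustrative only.

VOCAB_SIZE = 5   # toy vocabulary size (the text above uses 10,000)
EMBED_DIM = 3    # toy embedding dimension

# Hypothetical embedding matrix E with shape (VOCAB_SIZE, EMBED_DIM).
E = [[float(10 * row + col) for col in range(EMBED_DIM)]
     for row in range(VOCAB_SIZE)]


def one_hot(index, size):
    """Return a dense one-hot vector with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dense_embed(x, matrix):
    """Dense path: compute the product x @ matrix, touching every row."""
    dim = len(matrix[0])
    return [sum(x[r] * matrix[r][c] for r in range(len(matrix)))
            for c in range(dim)]


def sparse_embed(index, matrix):
    """Sparse path: only the position of the non-zero entry is stored,
    so the product reduces to copying one row."""
    return list(matrix[index])


word_index = 2
dense_result = dense_embed(one_hot(word_index, VOCAB_SIZE), E)
sparse_result = sparse_embed(word_index, E)
print(dense_result == sparse_result)
```

The dense path does work proportional to the vocabulary size times the embedding dimension for every word, while the sparse path only touches one row. That difference is what matters once the vocabulary reaches tens of thousands of entries.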

posted @ 2016-11-08 21:55  huizhu