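Deep learning generally requires two steps: unsupervised pre-training and supervised NN training. Pre-training uses an unsupervised method to train the neural network and yields word representations. In the second step, supervised NN training, the word representations are also updated to fit the new task, such as NER or POS tagging. This differs from other methods: an SVM, for example, does supervised learning using only the fixed word representations from the first step.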
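The difference between fixed and updated word representations can be sketched in a few lines. This is only a toy illustration, not the method of any particular paper or library. The function `train`, the flag `fine_tune`, the two-dimensional vectors, and the tiny "is this word a location?" dataset are all made up for the example, and a single logistic unit stands in for the supervised network.

```python
import math
import random


def train(embeddings, data, fine_tune, epochs=50, lr=0.5, seed=0):
    """Train a logistic classifier on top of word vectors (toy sketch).

    embeddings: dict word -> list of floats, standing in for the
                word representations produced by pre-training.
    data:       list of (word, label) pairs, label in {0, 1}.
    fine_tune:  if True, the word vectors are updated too (the deep
                learning setup); if False, they stay fixed (the
                SVM-style setup).
    Returns (weights, bias, word vectors after training).
    """
    rng = random.Random(seed)
    emb = {w: list(v) for w, v in embeddings.items()}  # work on a copy
    dim = len(next(iter(emb.values())))
    weights = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    bias = 0.0
    for _ in range(epochs):
        for word, label in data:
            x = emb[word]
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - label  # gradient of the log loss w.r.t. z
            # Gradient w.r.t. the word vector, taken before weights change.
            grad_x = [err * w for w in weights]
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
            if fine_tune:
                emb[word] = [xi - lr * g for xi, g in zip(x, grad_x)]
    return weights, bias, emb


# Hypothetical "pre-trained" vectors and a tiny labeled set (1 = location).
pretrained = {"Paris": [0.2, -0.1], "runs": [-0.3, 0.4], "London": [0.1, 0.0]}
data = [("Paris", 1), ("runs", 0), ("London", 1)]

_, _, frozen = train(pretrained, data, fine_tune=False)
_, _, tuned = train(pretrained, data, fine_tune=True)

print(frozen == pretrained)  # vectors untouched when they are kept fixed
print(tuned == pretrained)   # vectors moved when they are fine-tuned
```

With `fine_tune=False` only the classifier parameters are learned, which matches the SVM-style setup. With `fine_tune=True` the gradient also flows into the word vectors, so they adapt to the supervised task.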
Another advantage of deep learning is statistical: it has the potential to learn from fewer labeled examples because of the sharing of statistical strength.