Documents to Study
Common Algorithms
I. HMM and CRF
1. Overview of the differences: https://www.zhihu.com/question/35866596
2. [Paper] https://www.zhihu.com/question/20078729
II. Feature Engineering
1. An overview article: http://weibo.com/p/1001593872942714153228
2. A very comprehensive article: http://blog.csdn.net/jasonding1354/article/details/47171115
3. Feature extraction: http://appliedpredictivemodeling.com/blog/2015/7/28/feature-engineering-versus-feature-extraction
4. Related discussion on Zhihu: https://www.zhihu.com/question/29316149
5. [Someone else's bookmarks] http://prml.me/stds/topic/index/id/41
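III. Imbalanced Data Distribution
1. [Eight tactics for handling classification on imbalanced datasets] http://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
2. [Machine learning with unevenly distributed positive and negative samples] http://ml.memect.com/remix/3777228405757024.html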
3.https://www.quora.com/In-classification-how-do-you-handle-an-unbalanced-training-set
4. "Practical Guide to deal with Imbalanced Classification Problems in R": http://t.cn/RqzAtHU?u=1402400261&m=3977454739175021&cu=1980029427&ru=1402400261&rm=3958006066937367
IV. Word Embedding
1.http://mp.weixin.qq.com/s?__biz=MzAxMzU5MTQ5MA==&mid=208173897&idx=1&sn=2fa4d667846eff2f0782a4d5236eb7ee#rd
Note to self: what is the difference between pLSI and LDA?
A further discussion: http://www.zhizhihu.com/html/y2012/3976.html
"LDA漫游指南" (a guide to LDA): http://yuedu.baidu.com/ebook/d0b441a8ccbff121dd36839a
V. Named Entities
1. Overview of methods, fairly abstract: http://blog.csdn.net/cuixianpeng/article/details/18084807
2.http://hanlp.linrunsoft.com/doc/_build/html/ner.html
VI. New Word Discovery
1. [Matrix67 blog] http://www.csdn.net/article/2013-05-08/2815186
VII. Neural Networks
1.http://pan.baidu.com/s/1hs11XQW
VIII. Deep Learning
1. Stanford open course on deep learning for natural language processing: http://cs224d.stanford.edu/syllabus.html
2. Applying it to text: http://blog.dato.com/practical-text-analysis-using-deep-learning
3. RBM: http://vdisk.weibo.com/s/drxvP-I4Y6ChC?from=page_100505_profile&wvr=6
4.http://t.cn/RqpdeJE?u=3390189710&m=3967733445864858&cu=1980029427&ru=1402400261&rm=3967704509053002
IX. PRML
1.http://www.52nlp.cn/category/pattern-recognition-and-machine-learning-2
2. [Priority] http://www.52nlp.cn/prml%E8%AF%BB%E4%B9%A6%E4%BC%9A%E7%AC%AC%E4%B9%9D%E7%AB%A0-mixture-models-and-em
X. Miscellany
1. [2015 Machine Learning Awards] http://dataunion.org/21807.html