Machine Learning Notes

I have recently been watching Andrew Ng's machine learning videos, and these notes record my own understanding.

Part 1: Introduction

Definition

The earliest definition of machine learning was proposed by Arthur Samuel (1959):

Field of study that gives computers the ability to learn without being explicitly programmed.

A more precise definition, given by Tom Mitchell, is:

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

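For example, in a machine learning algorithm that predicts spam, E is the experience of studying the subject lines of emails already known to be spam or not spam, T is the task of classifying emails as spam or not spam, and P is the fraction of emails classified correctly.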
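To make E, T, and P concrete, here is a minimal sketch in Python. It is not the method from the course: the word-counting classifier, the function names (`train`, `classify`, `accuracy`), and the tiny made-up dataset are all assumptions chosen only to show how each letter maps onto code.

```python
from collections import Counter

def train(examples):
    """Experience E: count words seen in spam vs. non-spam subject lines."""
    spam_words, ham_words = Counter(), Counter()
    for subject, is_spam in examples:
        target = spam_words if is_spam else ham_words
        target.update(subject.lower().split())
    return spam_words, ham_words

def classify(model, subject):
    """Task T: label a subject line as spam (True) or not spam (False)."""
    spam_words, ham_words = model
    words = subject.lower().split()
    spam_score = sum(spam_words[w] for w in words)
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score

def accuracy(model, labeled):
    """Performance P: fraction of subject lines classified correctly."""
    correct = sum(classify(model, subject) == label for subject, label in labeled)
    return correct / len(labeled)

# Hypothetical labeled subject lines (invented for illustration).
training = [
    ("win a free prize now", True),
    ("meeting agenda for monday", False),
    ("free money claim now", True),
    ("lunch on monday", False),
]
held_out = [
    ("claim your free prize", True),
    ("agenda for lunch meeting", False),
    ("free money", True),
    ("monday meeting", False),
]

# Measure P on the same held-out set as the amount of experience E grows.
for n in (0, 2, 4):
    model = train(training[:n])
    print(n, "training examples -> accuracy", accuracy(model, held_out))
```

With no training examples the classifier labels everything as not spam, so it gets only the two non-spam subjects right. Once it has seen a few labeled examples, its accuracy on the same held-out set goes up, which is what "P improves with experience E" means in the definition.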

Uses

The course lists four broad categories:

- Database mining. Large datasets from growth of automation/web. E.g., web click data, medical records, biology, engineering.

- Applications can’t program by hand. E.g., Autonomous helicopter, handwriting recognition, most of Natural Language Processing (NLP), Computer Vision.

- Self-customizing programs. E.g., Amazon, Netflix product recommendations.

- Understanding human learning (brain, real AI).

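In short: data mining, applications that cannot be programmed by hand, product recommendation systems, and understanding human learning.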

Algorithms

The course then divides algorithms into two broad classes: supervised learning and unsupervised learning.

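Supervised learning means learning from a dataset in which the "right answers" are given, in order to predict the output that corresponds to new input data. One example is predicting house prices: the output is treated as a continuous value (even though prices are, strictly speaking, discrete), and this kind of problem is called regression. Another example is predicting whether a patient has breast cancer: the output is discrete, with only a few possible answers, and this kind of problem is called classification.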
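The difference between the two kinds of output can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numbers are invented, and the two techniques used here (an ordinary least-squares line for regression and a 1-nearest-neighbor rule for classification) are my own picks for brevity, not necessarily what the course uses at this point.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def nearest_neighbor_label(xs, labels, query):
    """Return the label of the training example closest to the query."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - query))
    return labels[nearest]

# Regression: hypothetical house sizes (square meters) and prices (thousands).
sizes = [50, 80, 100, 120]
prices = [110, 170, 210, 250]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept  # a continuous value
print("predicted price for 90 m^2:", predicted_price)

# Classification: hypothetical tumor sizes (cm), 0 = benign, 1 = malignant.
tumor_sizes = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
malignant = [0, 0, 0, 1, 1, 1]
predicted_label = nearest_neighbor_label(tumor_sizes, malignant, 4.2)  # one of {0, 1}
print("predicted label for a 4.2 cm tumor:", predicted_label)
```

In both cases the training data carries the right answer (a price or a label). What changes is the type of prediction: a number on a continuous scale for regression, one of a fixed set of categories for classification.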

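Unsupervised learning aims to find structure in a dataset that comes without labels, typically by grouping similar items together. Examples include news aggregation, grouping genes, dividing target customers into market segments, and social network analysis. One interesting example from the course is the "cocktail party problem": given recordings in which sounds overlap, such as background music or two people speaking different languages, the algorithm separates the sources so that each person's voice can be heard on its own.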
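The grouping idea can be sketched with a small clustering routine. The code below is a minimal k-means implementation in plain Python. The choice of k-means, the function name `kmeans`, the starting centroids, and the sample points are all assumptions for illustration; the cocktail party example relies on a different kind of algorithm that is not shown here.

```python
def kmeans(points, centroids, iterations=10):
    """Group points around centroids; note that no labels are supplied."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroid
            for members, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D data with two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 9)])

for centroid, members in zip(centroids, clusters):
    print("centroid", centroid, "members", members)
```

Unlike the supervised case, the input here is only a list of points. The two groups are discovered from the distances between points, not from any "right answers" supplied in advance.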

posted @ 2018-05-15 17:33 andrew-chen
