风过蔷薇

导航

 

一  机器学习的概念

Machine Learning
- Grew out of work in AI
- New capability for computers
Examples:例子
- Database mining 数据挖掘
Large datasets from growth of automation/web.
E.g., Web click data, medical records, biology, engineering
- Applications can’t program by hand.
E.g., Autonomous helicopter, handwriting recognition, most of
Natural Language Processing (NLP), Computer Vision.
- Self-customizing programs 自我定制程序
E.g., Amazon, Netflix product recommendations
- Understanding human learning (brain, real AI).

下面这个是机器学习的完整定义:

“A computer program is said to learn from experience E with respect to
some task T and some performance measure P, if its performance on T,
as measured by P, improves with experience E.”

 

二  机器学习的分类

机器学习算法分为两类:监督学习和非监督学习

Machine learning algorithms:
- Supervised learning 监督学习
- Unsupervised learning 非监督学习
Others: Reinforcement learning, recommender systems.还有增强学习,推荐系统
Also talk about: Practical advice for applying learning algorithms.

 

 

三  监督学习(Supervised Learning)

有监督学习就是,给定一个数据集并且给定正确的输出是什么,让机器去搞明白输入和输出的关系。

有监督学习分为两类:回归和分类,如果输出是连续的就是回归问题,如果是特定结果如猫或者狗则是分类

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

 

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

 

Example 1: 

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem. 

We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.

 

Example 2: 

(a) Regression(回归) - Given a picture of a person, we have to predict their age on the basis of the given picture 

(b) Classification(线性) - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.

 

四  无监督学习(Unsupervised Learning)

 无监督学习就是给定数据集,但是没有给定结果应该是什么样子,机器只能自己推导

无监督学习分为两类:聚类和非聚类

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

 

We can derive this structure by clustering the data based on relationships among the variables in the data.

 

With unsupervised learning there is no feedback based on the prediction results.

 

Example:

 

Clustering(聚类): Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

 

Non-clustering(非聚类): The "Cocktail Party Algorithm", allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).

 

posted on 2016-12-05 10:50  风过蔷薇  阅读(753)  评论(0编辑  收藏  举报