Logistic Regression (Part 1)
http://www.2cto.com/kf/201307/226576.html
\sigma(z) = \frac{1}{1 + e^{-z}}
This is the sigmoid function, which is central to this regression procedure; the main idea of the algorithm is built around it. You can work out its properties on your own, so we won't go into detail here.
Now for the overall flow. We multiply each feature by a regression coefficient, sum the products, and feed that sum into the function above, which yields a value between 0 and 1: values above 0.5 go to class 1, values below 0.5 to class 0. This gives us a binary classification model that decides whether an item belongs to a given class. The question we care about most is how to choose the coefficients across all these feature dimensions; a concrete sketch of the decision rule itself follows below.
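To make the decision rule concrete, here is a minimal sketch, assuming the weights have already been learned; the function name classify, the weights, and the sample point are made up for illustration, not taken from the original post:

from numpy import array, exp

def sigmoid(z):
    return 1.0 / (1 + exp(-z))

def classify(weights, features):
    # weighted sum of the features; features[0] is a constant 1.0 bias term
    score = float(array(weights).dot(array(features)))
    # squash the score into (0, 1) and threshold at 0.5
    return 1 if sigmoid(score) > 0.5 else 0

# made-up weights and one sample point (bias term prepended)
print(classify([4.12, 0.48, -0.62], [1.0, -0.58, 7.0]))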
The update formula used in the iteration is the gradient ascent step

w := w + \alpha \, \nabla_w \ell(w)

where \alpha is the step size and \ell is the objective being maximized.
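To connect this update to the code below (this is the standard log-likelihood derivation for logistic regression, stated here as an assumption rather than quoted from the original): with X the m \times n data matrix, y the 0/1 labels, and h = \sigma(Xw),

\ell(w) = \sum_{i=1}^{m} \left[ y_i \log h_i + (1 - y_i) \log(1 - h_i) \right],
\qquad
\nabla_w \ell(w) = X^{T}(y - h)

so each iteration moves the weights by \alpha X^{T}(y - h), which is exactly the error = labelMat - h line and the weights update line in the code further down.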
We iterate along this gradient to find the optimal weight parameters, and the fitted parameters fall out at the end. For the two-dimensional case, you can picture the usual schematic of gradient ascent climbing toward the optimum.
Of course, the step size must not be set too large, or the iteration may step right past the optimum. O(∩_∩)O~ And of course this is only a toy to show how the underlying mathematical process unfolds; a small demonstration follows.
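As a toy demonstration of overshooting (the one-dimensional objective and the step sizes here are made up for illustration), run gradient ascent on f(x) = -(x - 2)^2, whose maximum is at x = 2:

def grad(x):
    return -2 * (x - 2)  # derivative of f(x) = -(x - 2)^2

def ascend(alpha, steps=20, x=0.0):
    for _ in range(steps):
        x = x + alpha * grad(x)  # gradient ascent update
    return x

print(ascend(0.1))  # modest step size: ends up near the optimum x = 2
print(ascend(1.1))  # step too large: every update overshoots and the iterate diverges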
Finally, the Python code:
from numpy import *

def loadDataSet():
    # each line of testSet.txt holds: x1 x2 label
    dataMat = []; labelMat = []
    fr = open('testSet.txt')
    for line in fr.readlines():
        lineArr = line.strip().split()
        # prepend a constant 1.0 so weights[0] acts as the bias term
        dataMat.append([1.0, float(lineArr[0]), float(lineArr[1])])
        labelMat.append(int(lineArr[2]))
    return dataMat, labelMat

def sigmoid(inX):
    return 1.0 / (1 + exp(-inX))

def gradAscent(dataMatIn, classLabels):
    dataMatrix = mat(dataMatIn)              # convert to NumPy matrix (m x n)
    labelMat = mat(classLabels).transpose()  # labels as an m x 1 column vector
    m, n = shape(dataMatrix)
    alpha = 0.001    # step size
    maxCycles = 500  # number of gradient ascent iterations
    weights = ones((n, 1))
    for k in range(maxCycles):               # heavy on matrix operations
        h = sigmoid(dataMatrix * weights)    # predicted probabilities, m x 1
        error = labelMat - h                 # y - h, the gradient direction
        weights = weights + alpha * dataMatrix.transpose() * error  # update step
    return weights

dataArr, labelMat = loadDataSet()
print(gradAscent(dataArr, labelMat))
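Since the learned weights define a straight decision boundary in the two raw feature dimensions (the sigmoid crosses 0.5 exactly where w0 + w1*x1 + w2*x2 = 0), you might visualize the result like this. This continues from the code above; the matplotlib plotting and the x-range are additions for illustration, not part of the original code:

import matplotlib.pyplot as plt

w = array(gradAscent(dataArr, labelMat)).flatten()
x1 = arange(-3.0, 3.0, 0.1)              # a guess at the data's span
x2 = (-w[0] - w[1] * x1) / w[2]          # solve w0 + w1*x1 + w2*x2 = 0 for x2
data = array(dataArr)
plt.scatter(data[:, 1], data[:, 2], c=labelMat)  # raw features, colored by label
plt.plot(x1, x2)                                 # fitted decision boundary
plt.show()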