Logistic Regression and Gradient Ascent: Formula Derivation
I. Summary
In one sentence:
Watching good videos is still the quicker way to learn this; if you cannot find a good video, then working through the derivation yourself is the quicker route.
1. Logistic regression: how does gradient ascent iteratively update θ?
$$\theta _ { j } = \theta _ { j } + \alpha \frac { \partial \log L ( \theta ) } { \partial \theta _ { j } }$$
$$\frac { \partial \log L ( \theta ) } { \partial \theta _ { j } } = \sum _ { i = 1 } ^ { m } ( y _ { i } - h _ { \theta } ( x ^ { ( i ) } ) ) x _ { j } ^ { ( i ) }$$
So the iterative weight-update rule is: $$\theta _ { j } = \theta _ { j } + \alpha \sum _ { i = 1 } ^ { m } ( y _ { i } - h _ { \theta } ( x ^ { ( i ) } ) ) x _ { j } ^ { ( i ) }$$
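For completeness, here is a sketch of how that gradient is obtained (assuming the standard sigmoid hypothesis $$h_{\theta}(x) = \frac{1}{1 + e^{-\theta^{T} x}}$$ and the Bernoulli log-likelihood, which the original note does not spell out):
$$\log L(\theta) = \sum_{i=1}^{m} \left[ y_i \log h_{\theta}(x^{(i)}) + (1 - y_i) \log\left(1 - h_{\theta}(x^{(i)})\right) \right]$$
Since the sigmoid satisfies $$\frac{\partial h_{\theta}(x^{(i)})}{\partial \theta_j} = h_{\theta}(x^{(i)})\left(1 - h_{\theta}(x^{(i)})\right) x_j^{(i)}$$ the chain rule gives
$$\frac{\partial \log L(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \left[ \frac{y_i}{h_{\theta}(x^{(i)})} - \frac{1 - y_i}{1 - h_{\theta}(x^{(i)})} \right] h_{\theta}(x^{(i)})\left(1 - h_{\theta}(x^{(i)})\right) x_j^{(i)} = \sum_{i=1}^{m} \left( y_i - h_{\theta}(x^{(i)}) \right) x_j^{(i)}$$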
2. Logistic regression: batch gradient ascent?
Batch gradient ascent uses all samples for every single iteration update, so the resulting model's accuracy is relatively high, but the computational cost is also high and the algorithm is slow. The computation proceeds as follows:
1. First compute the estimates from the current weights and the training samples; 2. compute the errors; 3. update the weights. (The matrix form of this update is sketched just below.)
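In matrix form (a sketch, not in the original note: X is the m×n matrix of training samples, y the column vector of labels, and the sigmoid σ is applied element-wise), one full batch update is
$$\theta = \theta + \alpha X^{T} \left( y - \sigma(X\theta) \right)$$
which is exactly what the line weights = weights + alpha*np.dot(dataMat.T,error) computes in the code under question 4 below.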
3. Stochastic gradient ascent?
Iterate over the m samples, performing one update after each individual sample is processed. The process is as follows (the three steps below are repeated once per sample, m times in total; a toy numerical example follows after them):
1. Compute the estimate for sample x^(i) (in the code under question 4 below, this linear score is then passed through the sigmoid): $$h = \left( \begin{array} { l l l } { x _ { 1 } ^ { ( i ) } } & { x _ { 2 } ^ { ( i ) } } & { x _ { 3 } ^ { ( i ) } } \end{array} \right) \left( \begin{array} { l } { \theta _ { 1 } } \\ { \theta _ { 2 } } \\ { \theta _ { 3 } } \end{array} \right)$$
2. Compute the error. Note that the error here is a single number, no longer a vector: $$error = y _ { i } - h$$
3. Update the weights: $$w = w + \alpha \left( \begin{array} { c } { x _ { 1 } ^ { ( i ) } } \\ { x _ { 2 } ^ { ( i ) } } \\ { x _ { 3 } ^ { ( i ) } } \end{array} \right) error$$
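As a toy numerical check (the numbers are made up purely for illustration): take x^(i) = (1, 2, 3)ᵀ, current weights w = (0, 0, 0)ᵀ, label y_i = 1, and α = 0.01. The linear score is 0, so the sigmoid estimate is h = 0.5, the scalar error is 1 − 0.5 = 0.5, and the update gives w = (0.005, 0.01, 0.015)ᵀ.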
4. Stochastic gradient ascent, batch gradient ascent, and improved stochastic gradient ascent in code?
import time
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stocGradAscent(dataMat, labelMat, alpha=0.01):               # stochastic gradient ascent
    start_time = time.time()                                     # record the start time
    m, n = dataMat.shape
    weights = np.ones((n, 1))                                    # initialize all weights to 1
    for i in range(m):                                           # one update per sample
        h = sigmoid(np.dot(dataMat[i], weights).astype('float64'))  # the dot product of object-dtype arrays comes back as object, so cast to a numeric dtype
        error = labelMat[i] - h                                  # scalar error for this sample
        weights = weights + alpha * dataMat[i].reshape((n, 1)) * error  # update the weights
    duration = time.time() - start_time
    print('time:', duration)
    return weights

def gradAscent(dataMat, labelMat, alpha=0.01, maxstep=1000):     # batch gradient ascent
    start_time = time.time()
    m, n = dataMat.shape
    weights = np.ones((n, 1))
    labelMat = labelMat.reshape((m, 1))                          # labels are 1-D, turn them into a column vector
    for i in range(maxstep):
        h = sigmoid(np.dot(dataMat, weights).astype('float64'))  # matrix form: estimates for all samples at once
        error = labelMat - h                                     # error vector for the whole batch
        weights = weights + alpha * np.dot(dataMat.T, error)     # update the weights
    duration = time.time() - start_time
    print('time:', duration)
    return weights

def betterStoGradAscent(dataMat, labelMat, alpha=0.01, maxstep=150):  # improved stochastic gradient ascent
    start_time = time.time()
    m, n = dataMat.shape
    weights = np.ones((n, 1))
    for j in range(maxstep):
        for i in range(m):
            alpha = 4 / (1 + i + j) + 0.01                       # learning rate decays as the iterations proceed
            h = sigmoid(np.dot(dataMat[i], weights).astype('float64'))
            error = labelMat[i] - h
            weights = weights + alpha * dataMat[i].reshape((n, 1)) * error
    duration = time.time() - start_time
    print('time:', duration)
    return weights
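A minimal usage sketch of the three functions above (the loader, file name, and column layout here are hypothetical and not from the original post; it assumes two feature columns plus a 0/1 label per line and prepends a bias term so that n = 3):

import numpy as np

def loadDataSet(filename='testSet.txt'):                 # hypothetical loader, for illustration only
    data, labels = [], []
    with open(filename) as f:
        for line in f:
            x1, x2, y = line.strip().split()
            data.append([1.0, float(x1), float(x2)])      # prepend a constant bias feature
            labels.append(int(y))
    return np.array(data), np.array(labels)

dataMat, labelMat = loadDataSet()
w_batch = gradAscent(dataMat, labelMat)                   # batch gradient ascent
w_sgd = stocGradAscent(dataMat, labelMat)                 # one pass of stochastic updates
w_improved = betterStoGradAscent(dataMat, labelMat)       # decaying-learning-rate variant
print(w_batch.ravel(), w_sgd.ravel(), w_improved.ravel())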
II. Logistic Regression and the Gradient Ascent Algorithm
Adapted from or based on: Logistic回归及梯度上升算法 (Logistic Regression and the Gradient Ascent Algorithm)
https://blog.csdn.net/u011197534/article/details/53492915