Gradient Descent

An explanation of gradients

The update formula given earlier had an error; the corrected update rule is:

theta_j := theta_j + a * sum_i (y_i - h(x_i)) * x_ij

where h(x_i) is the model's prediction for sample i, and a is the learning rate, also called the step size.
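The update rule above can be sketched as a single gradient step in Python (a minimal illustration for the linear case y = theta0 + theta1*x; the name `grad_step` is introduced here, not part of the original post):

```python
def grad_step(theta0, theta1, xs, ys, alpha):
    # Accumulate the gradient of the squared-error objective
    # with respect to theta0 and theta1 over all samples.
    g0 = sum(y - (theta0 + theta1 * x) for x, y in zip(xs, ys))
    g1 = sum((y - (theta0 + theta1 * x)) * x for x, y in zip(xs, ys))
    # Move each parameter along its gradient, scaled by the
    # learning rate (step size) alpha.
    return theta0 + alpha * g0, theta1 + alpha * g1
```

Calling this repeatedly, with a small enough alpha, is exactly what the full loop further below does.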

Worked example

Suppose we have the sample points (4, 20), (8, 50), (5, 30), (10, 70), (12, 60).

Find the regression function.

Solution:

Split the sample points into

x = [4, 8, 5, 10, 12]

y = [20, 50, 30, 70, 60]

Assume the regression function is linear: y = theta0 + theta1*x

With x and y known, we solve for theta0 and theta1. This gives the objective function below; we want the values of theta0 and theta1 that minimize it:

J(theta0, theta1) = (1/m) * sum_i (y_i - (theta0 + theta1*x_i))^2

i.e. the mean squared error over the m samples.
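To make the objective concrete, it can be evaluated for any candidate (theta0, theta1) with a small helper (the name `objective` is introduced here for illustration):

```python
def objective(theta0, theta1, xs, ys):
    # Mean squared error between predictions and observations.
    m = len(xs)
    return sum((y - (theta0 + theta1 * x)) ** 2 for x, y in zip(xs, ys)) / m
```

For example, at theta0 = theta1 = 0 every prediction is 0, so the objective is just the mean of the squared y values.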

Using gradient descent:

Each iteration produces new values of theta0 and theta1.

Plug the new theta0, theta1 into the objective function to get a new objective value j1; compare it with j0, the value obtained from the previous theta0, theta1. When the difference between j1 and j0 falls below a threshold (a very small number), the current theta0, theta1 can be taken as the solution.

The code (Python):

# y = theta0 + theta1*x
X = [4, 8, 5, 10, 12]
y = [20, 50, 30, 70, 60]
theta0 = theta1 = 0
# learning rate (step size)
alpha = 0.00001
# iteration counter
cnt = 0
# previous objective value
error0 = 0
# threshold on the change in error, used to stop iterating
threshold = 0.0000001
while True:
    # dif[0] is the gradient for theta0, dif[1] for theta1
    dif = [0, 0]
    m = len(X)
    for i in range(m):
        dif[0] += y[i] - (theta0 + theta1*X[i])
        dif[1] += (y[i] - (theta0 + theta1*X[i])) * X[i]
    theta0 = theta0 + alpha*dif[0]
    theta1 = theta1 + alpha*dif[1]
    # compute the error (reset each iteration so it does not accumulate)
    error1 = 0
    for i in range(m):
        error1 += (y[i] - (theta0 + theta1*X[i]))**2
    error1 /= m
    if abs(error1 - error0) <= threshold:
        break
    error0 = error1
    cnt += 1
print(theta0, theta1, cnt)

def predicty(theta0, theta1, x_test):
    return theta0 + theta1*x_test

print(predicty(theta0, theta1, 15))
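The same loop can be written more compactly with NumPy, replacing the inner Python loops with vectorized array operations (a sketch; the name `fit_line` and the `max_iter` guard are additions, not part of the original post):

```python
import numpy as np

def fit_line(xs, ys, alpha=1e-5, threshold=1e-7, max_iter=1_000_000):
    x = np.asarray(xs, dtype=float)
    y = np.asarray(ys, dtype=float)
    theta0 = theta1 = 0.0
    prev_err = 0.0
    for _ in range(max_iter):
        resid = y - (theta0 + theta1 * x)    # per-sample residuals
        theta0 += alpha * resid.sum()        # gradient step for the intercept
        theta1 += alpha * (resid * x).sum()  # gradient step for the slope
        err = np.mean((y - (theta0 + theta1 * x)) ** 2)
        if abs(err - prev_err) <= threshold:
            break
        prev_err = err
    return theta0, theta1
```

The `max_iter` cap is a safeguard the original loop lacks: a `while True` that only exits on the threshold test can spin indefinitely if alpha is too large and the error oscillates.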

Result:

Exercise: implement it in code

# y = theta0*x0 + theta1*x1 + theta2*x2
# X = [[1,0,3],[1,1,3],[1,2,3],[1,3,2],[1,4,4]]
X0 = [1, 1, 1, 1, 1]
X1 = [0, 1, 2, 3, 4]
X2 = [3, 3, 3, 2, 4]
y = [95.364, 97.217205, 75.195834, 60.105519, 49.342380]
theta0 = theta1 = theta2 = 0
# learning rate (step size)
alpha = 0.00001
# iteration counter
cnt = 0
# previous objective value
error0 = 0
# threshold on the change in error, used to stop iterating
threshold = 0.000000001
while True:
    # dif[j] is the gradient for theta j
    dif = [0, 0, 0]
    m = len(X0)
    for i in range(m):
        residual = y[i] - (theta0*X0[i] + theta1*X1[i] + theta2*X2[i])
        dif[0] += residual * X0[i]
        dif[1] += residual * X1[i]
        dif[2] += residual * X2[i]
    theta0 = theta0 + alpha*dif[0]
    theta1 = theta1 + alpha*dif[1]
    theta2 = theta2 + alpha*dif[2]
    # compute the error (reset each iteration so it does not accumulate)
    error1 = 0
    for i in range(m):
        error1 += (y[i] - (theta0*X0[i] + theta1*X1[i] + theta2*X2[i]))**2
    error1 /= m
    if abs(error1 - error0) <= threshold:
        break
    error0 = error1
    cnt += 1
print(theta0, theta1, theta2, cnt)

def predicty(theta0, theta1, theta2, x1_test, x2_test):
    return theta0 + theta1*x1_test + theta2*x2_test

print(predicty(theta0, theta1, theta2, 0, 3))
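As a sanity check on the exercise, the same least-squares problem can be solved in closed form with NumPy's `lstsq`, which minimizes the identical sum-of-squares objective directly (this check is an addition, not part of the original post):

```python
import numpy as np

# Design matrix: columns are x0 (all ones), x1, x2 from the exercise.
A = np.array([[1, 0, 3],
              [1, 1, 3],
              [1, 2, 3],
              [1, 3, 2],
              [1, 4, 4]], dtype=float)
b = np.array([95.364, 97.217205, 75.195834, 60.105519, 49.342380])

# Solve min ||A @ theta - b||^2 directly.
theta, *_ = np.linalg.lstsq(A, b, rcond=None)
print(theta)  # [theta0, theta1, theta2]
```

Gradient descent, run long enough with a small enough step size, should approach this closed-form solution.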

Result:

posted @ 2021-04-12 17:38 北极星!