机器学习笔记(七)——岭回归(sklearn)

本博客仅用于个人学习,不用于传播教学,主要是记自己能够看得懂的笔记(

学习知识、资源和数据来自:机器学习算法基础-覃秉丰_哔哩哔哩_bilibili

岭回归就是正则化衍生出来的回归方式。正则化的目的是防止过拟合,使曲线尽量平滑,所以在Loss函数后面加了个,其中的λ就是sklearn中岭回归库ridge里的alpha。

根据矩阵变换,可得到:

在sklearn的库中,有RidgeCV函数,可以进行交叉验证法,来选出Loss最小的alpha。其中store_cv_value可以判断是否储存每次交叉验证的结果。

最后画一个alpha与Loss函数的关系图。

数据和相关博客在底下。

Python代码如下:

import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model

data=np.genfromtxt('C:/Users/Lenovo/Desktop/学习/机器学习资料/线性回归以及非线性回归/longley.csv',delimiter=',')
x_data=data[1:,2:]
y_data=data[1:,1]

als=np.linspace(0.001,1) #创建50个0.001到1的数据
model=linear_model.RidgeCV(alphas=als,store_cv_values=True) #交叉验证法,选出平均Loss最小的那一个alpha
model.fit(x_data,y_data)

print(model.alpha_)
print(model.cv_values_.shape) #50个alpha,每个alpha都要交叉验证16次,得到16*50个Loss

plt.plot(als,model.cv_values_.mean(axis=0)) #每个alpha的16个Loss求平均
plt.plot(model.alpha_,min(model.cv_values_.mean(axis=0)),'ro') #最低点
print(y_data[2])
print(model.predict(x_data[2,np.newaxis])) #预测值与真实值比较
plt.show()

得到结果:

0.40875510204081633
(16, 50)
88.2
[88.11216213]

所用数据:

"","GNP.deflator","GNP","Unemployed","Armed.Forces","Population","Year","Employed"
"1947",83,234.289,235.6,159,107.608,1947,60.323
"1948",88.5,259.426,232.5,145.6,108.632,1948,61.122
"1949",88.2,258.054,368.2,161.6,109.773,1949,60.171
"1950",89.5,284.599,335.1,165,110.929,1950,61.187
"1951",96.2,328.975,209.9,309.9,112.075,1951,63.221
"1952",98.1,346.999,193.2,359.4,113.27,1952,63.639
"1953",99,365.385,187,354.7,115.094,1953,64.989
"1954",100,363.112,357.8,335,116.219,1954,63.761
"1955",101.2,397.469,290.4,304.8,117.388,1955,66.019
"1956",104.6,419.18,282.2,285.7,118.734,1956,67.857
"1957",108.4,442.769,293.6,279.8,120.445,1957,68.169
"1958",110.8,444.546,468.1,263.7,121.95,1958,66.513
"1959",112.6,482.704,381.3,255.2,123.366,1959,68.655
"1960",114.2,502.601,393.1,251.4,125.368,1960,69.564
"1961",115.7,518.173,480.6,257.2,127.852,1961,69.331
"1962",116.9,554.894,400.7,282.7,130.081,1962,70.551

参考博客:

使用sklearn库学习线性回归(二)_理科男同学-CSDN博客

numpy.linspace使用详解 - 蚂蚁flow - 博客园 (cnblogs.com)

posted @ 2021-07-26 11:16  Lcy的瞎bb  阅读(640)  评论(0编辑  收藏  举报