Logistic Regression (LR)

Binomial LR

The binomial LR model

\[\begin{array}{l}
P(Y = 1 \mid x) = \dfrac{e^{w \cdot x}}{1 + e^{w \cdot x}} = \sigma(w \cdot x)\\[2ex]
P(Y = 0 \mid x) = \dfrac{1}{1 + e^{w \cdot x}} = 1 - \sigma(w \cdot x)
\end{array}\]
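As a quick numeric sanity check of the two formulas above (a minimal sketch; the values of w and x are made up):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w = np.array([0.5, -0.25])  # hypothetical weights
x = np.array([1.0, 2.0])    # hypothetical input
p1 = sigmoid(w @ x)         # P(Y=1|x) = e^{w·x} / (1 + e^{w·x})
p0 = 1 - p1                 # P(Y=0|x) = 1 / (1 + e^{w·x})
print(p1, p0, p1 + p0)      # the two probabilities sum to 1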

Loss function

The LR loss function can be derived by maximum likelihood estimation, and it is equivalent to the cross-entropy loss (t denotes the 0/1 label):

\[L(w) = -t \ln\bigl(\sigma(w \cdot x)\bigr) - (1 - t) \ln\bigl(1 - \sigma(w \cdot x)\bigr)\]
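A minimal sketch of this loss for a single example (the clipping epsilon is an implementation detail added here to guard the log, not part of the formula):

import numpy as np

def cross_entropy(t, p, eps=1e-12):
    # L = -t*ln(p) - (1-t)*ln(1-p), with p = sigma(w·x)
    p = np.clip(p, eps, 1 - eps)
    return -t * np.log(p) - (1 - t) * np.log(1 - p)

print(cross_entropy(1, 0.9))  # confident and correct -> small loss
print(cross_entropy(1, 0.1))  # confident and wrong -> large loss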

Deriving the weight-update rule

Gradient computation, applying the chain rule together with the sigmoid derivative $\sigma'(z) = \sigma(z)(1 - \sigma(z))$:

\[\begin{array}{l}
\dfrac{\partial L(w)}{\partial \sigma(w \cdot x)} = -\dfrac{t}{\sigma(w \cdot x)} + \dfrac{1 - t}{1 - \sigma(w \cdot x)}\\[2ex]
\dfrac{\partial L(w)}{\partial (w \cdot x)} = \dfrac{\partial L(w)}{\partial \sigma(w \cdot x)} \cdot \dfrac{\partial \sigma(w \cdot x)}{\partial (w \cdot x)} = \left(-\dfrac{t}{\sigma(w \cdot x)} + \dfrac{1 - t}{1 - \sigma(w \cdot x)}\right)\sigma(w \cdot x)\bigl(1 - \sigma(w \cdot x)\bigr) = \sigma(w \cdot x) - t\\[2ex]
\dfrac{\partial L(w)}{\partial w} = \dfrac{\partial L(w)}{\partial \sigma(w \cdot x)} \cdot \dfrac{\partial \sigma(w \cdot x)}{\partial (w \cdot x)} \cdot \dfrac{\partial (w \cdot x)}{\partial w} = \bigl(\sigma(w \cdot x) - t\bigr) \cdot x
\end{array}\]
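The closed form $\frac{\partial L(w)}{\partial w} = (\sigma(w \cdot x) - t) \cdot x$ is easy to verify numerically; the sketch below compares it against central finite differences (all values are made up):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(w, x, t):
    p = sigmoid(w @ x)
    return -t * np.log(p) - (1 - t) * np.log(1 - p)

w, x, t, h = np.array([0.3, -0.7]), np.array([1.5, 2.0]), 1.0, 1e-6
analytic = (sigmoid(w @ x) - t) * x
numeric = np.array([(loss(w + h * e, x, t) - loss(w - h * e, x, t)) / (2 * h)
                    for e in np.eye(len(w))])
print(analytic, numeric)  # the two gradients agree to high precision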

Weight update (gradient descent with step size learning_rate):

\[w \leftarrow w + \text{learning\_rate} \cdot \bigl(t - \sigma(w \cdot x)\bigr) \cdot x\]


The connection between LR and linear regression

The odds of an event: $\frac{p}{1 - p}$

The log-odds in LR: $\log \frac{P(Y = 1 \mid x)}{1 - P(Y = 1 \mid x)} = w \cdot x$

LR is a classification model. It can be viewed as linear regression whose output is passed through a sigmoid function, which makes it a member of the generalized linear model family.
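The sketch below checks this identity numerically: pushing w·x through the sigmoid and then taking the log-odds recovers w·x exactly (w and x are made-up values):

import numpy as np

w, x = np.array([0.5, -1.0]), np.array([2.0, 0.5])
p = 1 / (1 + np.exp(-(w @ x)))     # P(Y=1|x)
print(np.log(p / (1 - p)), w @ x)  # log-odds equals the linear score w·x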


Multinomial LR

\[\begin{array}{l}
P(Y = k \mid x) = \dfrac{\exp(w_k \cdot x)}{1 + \sum\limits_{i = 1}^{K - 1} \exp(w_i \cdot x)}, \quad k = 1, 2, \cdots, K - 1\\[2ex]
P(Y = K \mid x) = \dfrac{1}{1 + \sum\limits_{i = 1}^{K - 1} \exp(w_i \cdot x)}
\end{array}\]
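A minimal sketch of these K class probabilities, with the K−1 weight vectors stacked into a matrix W (shapes and values are made up):

import numpy as np

K, d = 4, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(K - 1, d))   # rows are w_1 ... w_{K-1}
x = rng.normal(size=d)

scores = np.exp(W @ x)            # exp(w_k · x) for k = 1..K-1
denom = 1 + scores.sum()
p = np.append(scores, 1) / denom  # class K contributes a numerator of 1
print(p, p.sum())                 # a valid distribution: sums to 1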


LR in practice

Applications, strengths, and weaknesses of LR

Strengths:

(1) Small memory footprint and easy to parallelize; efficient in both time and space.

(2) Strong interpretability: the influence of each feature on the prediction can be read directly from its weight.

Weaknesses:

(1) Prone to underfitting, so classification accuracy is often limited.

(2) Heavily dependent on feature engineering: the model cannot extract high-order feature interactions on its own, a gap that FM (Factorization Machines) was designed to fill.

How does LR handle multicollinearity among features?

(1) Use PCA (principal component analysis) to remove collinearity among features (see the sketch after this list).

(2) Apply L2 regularization.
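A sketch of option (1) as a scikit-learn pipeline (the dataset is an arbitrary built-in example; standardizing before PCA is a common but optional choice):

from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
# PCA components are orthogonal, so the transformed features are uncorrelated
clf = make_pipeline(StandardScaler(), PCA(n_components=10),
                    LogisticRegression(max_iter=200))
clf.fit(X, y)
print(clf.score(X, y))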

Advantages of LR with discretized features

1.     Logistic regression is a generalized linear model with limited expressive power. Discretizing a single variable into N buckets gives each bucket its own weight, which introduces nonlinearity and improves the model's expressive power.

2.     Discretized features are highly robust to abnormal data: for example, a feature "age > 30" takes the value 1, otherwise 0. Without discretization, an outlier such as "age = 300" would heavily distort the model (see the sketch after this list).

3.     Inner products with sparse vectors are fast to compute, and the results are compact to store.
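A sketch of the age example from point 2, turning raw ages into one-hot bucket features (the bucket boundaries are made up):

import numpy as np

ages = np.array([18, 25, 34, 47, 300])  # 300 is an obvious outlier
bins = [20, 30, 40, 50]                 # hypothetical bucket boundaries
idx = np.digitize(ages, bins)           # bucket index for each sample
onehot = np.eye(len(bins) + 1)[idx]     # one column (and weight) per bucket
print(onehot)  # the outlier simply lands in the last bucket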


Python implementation of LR with L1 + L2 regularization

import numpy as np


class LogisticRegression:
    def __init__(self, learning_rate=0.01, num_iterations=1000, l1_reg=0.0, l2_reg=0.0):
        self.learning_rate = learning_rate
        self.num_iterations = num_iterations
        self.l1_reg = l1_reg
        self.l2_reg = l2_reg
        self.weights = None
        self.bias = None

    def sigmoid(self, z):
        # logistic function: maps any real score into (0, 1)
        return 1 / (1 + np.exp(-z))

    def forward_propagation(self, X):
        # predicted probability P(Y=1|x) = sigmoid(X·w + b)
        z = np.dot(X, self.weights) + self.bias
        y_pred = self.sigmoid(z)
        return y_pred

    def backward_propagation(self, X, y, y_pred):
        m = X.shape[0]
        # cross-entropy gradient plus the L1 (subgradient) and L2 penalty terms
        dw = (1 / m) * np.dot(X.T, (y_pred - y)) + self.l1_reg * np.sign(self.weights) + 2 * self.l2_reg * self.weights
        db = (1 / m) * np.sum(y_pred - y)
        return dw, db

    def fit(self, X, y):
        # X: (m, n) feature matrix; y: (m, 1) column of 0/1 labels
        n_features = X.shape[1]
        self.weights = np.zeros((n_features, 1))
        self.bias = 0

        for _ in range(self.num_iterations):
            y_pred = self.forward_propagation(X)
            dw, db = self.backward_propagation(X, y, y_pred)

            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db

    def predict(self, X):
        y_pred = self.forward_propagation(X)
        # threshold the predicted probabilities at 0.5
        return (y_pred > 0.5).astype(int).ravel()


if __name__ == '__main__':
    # Toy data
    X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
    y = np.array([[0], [0], [1], [1]])
    
    # Instantiate the model
    model = LogisticRegression(learning_rate=0.01, num_iterations=1000, l1_reg=0.1, l2_reg=0.1)
    
    # Train the model
    model.fit(X, y)
    
    # Predict on the training data
    predictions = model.predict(X)
    print(predictions)


LR in sklearn

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
X, y = load_breast_cancer(return_X_y=True)  # a built-in binary dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=200)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # mean accuracy on the test set
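Connecting back to the L1/L2 section above: sklearn controls regularization via the penalty and C parameters (C is the inverse of the regularization strength, so smaller C means stronger regularization):

from sklearn.linear_model import LogisticRegression

l2_clf = LogisticRegression(penalty='l2', C=1.0, max_iter=200)        # L2 is the default
l1_clf = LogisticRegression(penalty='l1', C=1.0, solver='liblinear')  # L1 needs liblinear or saga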

