Linear Classification: Logistic Regression


Idea

A linear regression model fits the data to obtain a linear equation: for a continuous input \(X\) it predicts a target \(Y\) whose range is \((-\infty, +\infty)\). For a linear classification problem, however, we want an output in \(\{0, 1\}\), or a probability in \([0, 1]\). So how do we get from linear regression to a linear classification function? By passing the output of linear regression through an activation function that maps it onto the class labels. The approach is to first fit a decision boundary (not necessarily linear; it can also be polynomial), and then link this boundary to the probability of each class, which yields the class probabilities for the binary case.

Assume a dataset \(\{(x_i, y_i)\}_{i = 1}^{N}\), with \(x_i \in \mathbb{R}^p\) and \(y_i \in \{0, 1\}\).

For the sigmoid function:

\[\sigma(z) = \frac{1}{1 + e^{-z}} \quad \quad \begin{cases} z \rightarrow +\infty, & \sigma(z) \rightarrow 1 \\ z \rightarrow -\infty, & \sigma(z) \rightarrow 0 \end{cases} \]
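
A quick numeric check of these limits (a minimal sketch; the full code further down defines the same kind of sigmoid helper):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# sigmoid squashes the whole real line into the open interval (0, 1)
print(sigmoid(np.array([-10.0, 0.0, 10.0])))
# approximately [4.5e-05  0.5  0.99995]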

The probability of each label given the data is then:

\[p_1 = p(y = 1|x) = \sigma(w^Tx) = \frac{1}{1 + e^{-w^Tx}} = \pi(x) \\ p_0 = p(y = 0|x) = 1 - p(y = 1|x) = \frac{e^{-w^Tx}}{1 + e^{-w^Tx}} = 1 - \pi(x) \]
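
As a sanity check, the two probabilities are complementary for any weight vector; a small sketch with made-up values for w and x:

import numpy as np

w = np.array([0.5, -0.25])               # hypothetical weights, illustration only
x = np.array([2.0, 1.0])                 # a made-up feature vector

z = w @ x                                # w^T x
p1 = 1 / (1 + np.exp(-z))                # p(y = 1 | x)
p0 = np.exp(-z) / (1 + np.exp(-z))       # p(y = 0 | x)
print(p1, p0, p1 + p0)                   # p1 + p0 is exactly 1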

The likelihood function:

\[L(w) = \prod_{i = 1}^{N}[\pi(x_i)]^{y_i}[1 - \pi(x_i)]^{1 - y_i} \]

Taking the logarithm of both sides gives the log-likelihood:

\[\ell(w) = \log\prod_{i = 1}^{N}[\pi(x_i)]^{y_i}[1 - \pi(x_i)]^{1 - y_i} \\ = \sum_{i = 1}^{N}[y_i\log \pi(x_i) + (1 - y_i)\log (1 - \pi(x_i))] \quad \text{(minus the cross-entropy)} \\ = \sum_{i = 1}^{N}\left[y_i\log \frac{\pi(x_i)}{1 - \pi(x_i)} + \log (1 - \pi(x_i))\right] \\ = \sum_{i = 1}^{N}\left[y_i(w^Tx_i) - \log (1 + e^{w^Tx_i})\right] \]
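
The last step is easy to get wrong (note the plus sign in \(1 + e^{w^Tx_i}\)); a minimal sketch with random toy data verifies that the two forms of the log-likelihood agree:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)           # toy data, for illustration only
X = rng.normal(size=(5, 3))
y = np.array([0, 1, 1, 0, 1])
w = rng.normal(size=3)

z = X @ w
p = sigmoid(z)
# Form 1: sum of per-sample log-probabilities (minus the cross-entropy)
ll1 = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
# Form 2: the simplified expression y_i (w^T x_i) - log(1 + e^{w^T x_i})
ll2 = np.sum(y * z - np.log(1 + np.exp(z)))
print(np.isclose(ll1, ll2))              # True: the two forms agree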

Our goal is to maximize this likelihood:

\[\hat{w} = \arg\max_{w} \ \ell(w) \]

In machine learning it is customary to minimize a loss function instead, so we transform the objective:

\[J(w) = -\frac{1}{N}\ell(w) = -\frac{1}{N} \sum_{i = 1}^{N}[y_i\log \pi(x_i) + (1 - y_i)\log (1 - \pi(x_i))] \]
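
A small sketch with toy numbers (assumed only for illustration) showing that \(J(w)\) is the negative log-likelihood averaged over the samples, i.e. the mean binary cross-entropy:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0.2, 1.0], [1.5, -0.3], [-0.7, 0.8]])    # toy inputs
y = np.array([1, 0, 1])
w = np.array([0.4, -0.1])                               # hypothetical weights

p = sigmoid(X @ w)
ll = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))    # log-likelihood l(w)
J = -ll / len(y)                                        # J(w) = -l(w)/N
print(J)                                                # mean cross-entropy loss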

Solve for \(w\) with stochastic gradient descent, updating on one sample \((x_i, y_i)\) at a time (write \(J_i\) for the loss on sample \(i\)):

\[g_i = \frac{\partial J_i(w)}{\partial w} = (\pi(x_i) - y_i)x_i \\ w^{k + 1} = w^{k} - \alpha g_i \]
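
A sketch of one stochastic gradient step on a single sample, plus a finite-difference check of the gradient formula (all values here are made up for illustration):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

alpha = 0.1                              # learning rate
w = np.array([0.0, 0.0])
x_i, y_i = np.array([1.0, 2.0]), 1       # a single training sample

g = (sigmoid(w @ x_i) - y_i) * x_i       # per-sample gradient g_i
w_new = w - alpha * g                    # w^{k+1} = w^k - alpha * g_i

# Finite-difference check of the gradient formula
def loss(w):
    p = sigmoid(w @ x_i)
    return -(y_i * np.log(p) + (1 - y_i) * np.log(1 - p))

eps = 1e-6
num_g = np.array([(loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
                  for e in np.eye(2)])
print(np.allclose(g, num_g))             # True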

Code

import numpy as np
from sklearn import datasets

def sigmoid(x):
    """Map real-valued inputs to the interval (0, 1)."""
    return 1 / (1 + np.exp(-x))

def shuffle_data(X, y, seed=None):
    """ Random shuffle of the samples in X and y """
    if seed is not None:
        np.random.seed(seed)
    idx = np.arange(X.shape[0])
    np.random.shuffle(idx)
    return X[idx], y[idx]

def normalize(X, axis=-1, order=2):
    """ Normalize the dataset X """
    l2 = np.atleast_1d(np.linalg.norm(X, order, axis))
    l2[l2 == 0] = 1
    return X / np.expand_dims(l2, axis) 


def accuracy_score(y_true, y_pred):
    """ Compare y_true to y_pred and return the accuracy """
    accuracy = np.sum(y_true == y_pred, axis=0) / len(y_true)
    return accuracy


def train_test_split(X, y, test_size=0.5, shuffle=True, seed=None):
    """ Split the data into train and test sets """
    if shuffle:
        X, y = shuffle_data(X, y, seed)
    # Split the training data from test data in the ratio specified in
    # test_size
    split_i = len(y) - int(len(y) // (1 / test_size))
    X_train, X_test = X[:split_i], X[split_i:]
    y_train, y_test = y[:split_i], y[split_i:]

    return X_train, X_test, y_train, y_test


class LogisticRegression():
    """
        Parameters:
        -----------
        n_iterations: int
            Number of gradient descent iterations
        learning_rate: float
            Step size for gradient descent

    """
    def __init__(self, learning_rate=.1, n_iterations=4000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations

    def initialize_weights(self, n_features):
        # Initialize the parameters uniformly
        # in the range [-1/sqrt(n), 1/sqrt(n)]
        limit = np.sqrt(1 / n_features)
        w = np.random.uniform(-limit, limit, (n_features, 1))
        b = 0
        self.w = np.insert(w, 0, b, axis=0)

    def fit(self, X, y):
        m_samples, n_features = X.shape
        self.initialize_weights(n_features)
        # Prepend a bias column of ones to X (the intercept term)
        X = np.insert(X, 0, 1, axis=1)
        y = np.reshape(y, (m_samples, 1))

        # Run full-batch gradient descent for n_iterations rounds
        for i in range(self.n_iterations):
            h_x = X.dot(self.w)
            y_pred = sigmoid(h_x)
            # Gradient of the summed loss: X^T (sigmoid(Xw) - y)
            w_grad = X.T.dot(y_pred - y)
            self.w = self.w - self.learning_rate * w_grad

    def predict(self, X):
        X = np.insert(X, 0, 1, axis=1)
        h_x = X.dot(self.w)
        # Threshold the predicted probabilities at 0.5
        y_pred = np.round(sigmoid(h_x))
        return y_pred.astype(int)



def main():
    # Load the iris dataset, keep only classes 1 and 2, and relabel them 0/1
    data = datasets.load_iris()
    X = normalize(data.data[data.target != 0])
    y = data.target[data.target != 0]
    y[y == 1] = 0
    y[y == 2] = 1

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, seed=1)

    clf = LogisticRegression()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    y_pred = np.reshape(y_pred, y_test.shape)

    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)
    



if __name__ == "__main__":
    main()