Linear Regression
Hypothesis
Linear regression is one of the most commonly used supervised learning methods in machine learning.
\[h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n
\]
The vectorized version, with \(x_0 = 1\) by convention, is:
\[h_\theta(x) = \left[
\begin{matrix}
\theta_{0}&\theta_{1}&\cdots&\theta_{n}
\end{matrix}
\right]
\left[
\begin{matrix}
x_0 \\
x_1 \\
\vdots \\
x_n
\end{matrix}
\right]
= \theta^T x
\]
\(\text{Training examples are stored in X row-wise, like so:}\)
\[\boldsymbol{X} = \left[
\begin{matrix}
x_{0}^{(1)}&x_{1}^{(1)} \\
x_{0}^{(2)}&x_{1}^{(2)} \\
x_{0}^{(3)}&x_{1}^{(3)}
\end{matrix}
\right],
\boldsymbol{\theta} = \left[
\begin{matrix}
\theta_{0} \\
\theta_{1}
\end{matrix}
\right]
\]
\(\text{and the hypothesis for all examples is then computed like this:}\)
\[h_\theta(\boldsymbol{X}) = \boldsymbol{X}\boldsymbol{\theta}
\]
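As a minimal NumPy sketch (the variable names and values here are mine, not from the original notes), the batch hypothesis is a single matrix-vector product:

```python
import numpy as np

# Design matrix: one row per training example; the first column is x_0 = 1.
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
theta = np.array([0.5, 1.5])

h = X @ theta  # h_theta(X) = X @ theta, one prediction per row
print(h)       # [3.5 5.  6.5]
```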
Cost Function
\[J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
\]
The vectorized version is:
\[J(\theta) = \frac{1}{2m} (\boldsymbol{X}\boldsymbol{\theta} - \vec{\boldsymbol{y}} )^T(\boldsymbol{X}\boldsymbol{\theta} - \vec{\boldsymbol{y}})
\]
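A sketch of the vectorized cost, reusing `X` and `theta` from above and assuming a target vector `y` (the function name `cost` is my own):

```python
def cost(theta, X, y):
    """J(theta) = (1/2m) * (X theta - y)^T (X theta - y)."""
    m = len(y)
    residuals = X @ theta - y
    return (residuals @ residuals) / (2 * m)
```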
Gradient Descent
\[\theta_0 = \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_0^{(i)}
\]
\[\theta_1 = \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_1^{(i)}
\]
\[\theta_2 = \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_2^{(i)}
\]
\[\cdots
\]
In other words, for every \(j\) (with all \(\theta_j\) updated simultaneously):
\[\theta_j = \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}
\]
The vectorized version is:
\[\boldsymbol{\theta} = \boldsymbol{\theta} - \frac{\alpha}{m} \boldsymbol{X}^T(\boldsymbol{X}\boldsymbol{\theta} - \vec{\boldsymbol{y}})
\]
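A sketch of batch gradient descent under the same assumptions; `alpha` and `num_iters` are illustrative defaults, not prescribed values:

```python
def gradient_descent(X, y, theta, alpha=0.01, num_iters=1000):
    """Repeat theta := theta - (alpha/m) * X^T (X theta - y)."""
    m = len(y)
    for _ in range(num_iters):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
    return theta
```

Because each update is a single vector operation, every component of `theta` changes simultaneously, matching the per-\(j\) rule above.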
Stochastic Gradient Descent
- Randomly shuffle the dataset
- For i = 1…m, update every \(\theta_j\) using only the \(i\)-th example (a code sketch follows this list):
\[\theta_j = \theta_j - \alpha \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}
\]
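A sketch of that loop (the shuffle is redrawn each epoch; `num_epochs` and the seed are my own choices):

```python
def stochastic_gradient_descent(X, y, theta, alpha=0.01, num_epochs=10):
    """Update theta from one training example at a time."""
    m = len(y)
    rng = np.random.default_rng(0)
    for _ in range(num_epochs):
        for i in rng.permutation(m):  # randomly shuffle the dataset
            theta = theta - alpha * (X[i] @ theta - y[i]) * X[i]
    return theta
```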
Regularized Linear Regression
Cost Function
\[J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^m \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^{n}\theta_{j}^2 \right]
\]
Gradient Descent
\[\theta_0 = \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_0^{(i)}
\]
\[\theta_j = \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)} + \frac{\lambda}{m} \theta_j \right], \quad j \in \{1, 2, \dots, n\}
\]
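A sketch of a single regularized step; as in the two rules above, the intercept \(\theta_0\) is not penalized (`lam` stands in for lambda, a reserved word in Python):

```python
def regularized_gradient_step(X, y, theta, alpha=0.01, lam=1.0):
    """One gradient step with L2 regularization; theta_0 is not penalized."""
    m = len(y)
    gradient = (X.T @ (X @ theta - y)) / m
    gradient[1:] += (lam / m) * theta[1:]  # penalize j = 1..n only
    return theta - alpha * gradient
```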
Code
A minimal end-to-end run on synthetic data (all values here are illustrative, not from the original notes):
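```python
import numpy as np

# Synthetic data for y ~ 4 + 3x, with Gaussian noise.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 2.0, size=100)
y = 4.0 + 3.0 * x + rng.normal(scale=0.5, size=100)

# Design matrix with a leading column of ones for the intercept term x_0.
X = np.column_stack([np.ones_like(x), x])
theta = np.zeros(2)

alpha, num_iters, m = 0.1, 2000, len(y)
for _ in range(num_iters):
    theta -= (alpha / m) * (X.T @ (X @ theta - y))  # vectorized update

residuals = X @ theta - y
print("theta:", theta)                              # should land near [4, 3]
print("final cost:", residuals @ residuals / (2 * m))
```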