Using the subgradient method to solve the lasso problem

For a single standardized predictor \(z\), the lasso problem is:

\[\underset{\beta}{\operatorname{minimize}}\left\{\frac{1}{2 N} \sum_{i=1}^{N}\left(y_{i}-z_{i} \beta\right)^{2}+\lambda|\beta|\right\} \]

Subgradient Optimality:

\[0 \in \partial\left\{\frac{1}{2 N} \sum_{i=1}^{N}\left(y_{i}-z_{i} \beta\right)^{2}+\lambda|\beta|\right\} \]

\[\Longleftrightarrow 0 \in-\frac{1}{N}\sum_{i=1}^{N}z_i(y_i-z_i\beta)+\lambda \partial|\beta| \]

Let \(v \in \partial|\beta|\). By the definition of the subgradient, we have

\[v \in\left\{\begin{array}{ll} \{1\} & \text { if } \beta>0 \\ \{-1\} & \text { if } \beta<0 \\ {[-1,1]} & \text { if } \beta=0 \end{array}\right. \]
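The three cases above can be written down directly in code. This is a minimal sketch; the function names `abs_subdifferential` and `is_subgradient` are made up for illustration, and the set \(\partial|\beta|\) is represented as a closed interval.

```python
def abs_subdifferential(beta):
    """Return the subdifferential of |beta| as a closed interval (low, high)."""
    if beta > 0:
        return (1.0, 1.0)    # the single point {1}
    if beta < 0:
        return (-1.0, -1.0)  # the single point {-1}
    return (-1.0, 1.0)       # the whole interval [-1, 1] at beta = 0


def is_subgradient(v, beta):
    """Check whether v is a subgradient of |.| at beta."""
    low, high = abs_subdifferential(beta)
    return low <= v <= high
```

For example, `is_subgradient(0.3, 0.0)` holds because any value in \([-1,1]\) is a subgradient at zero, while `is_subgradient(0.3, 2.0)` does not.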

The subgradient optimality condition is

\[\frac{1}{N}\sum_{i=1}^{N}z_i(y_i-z_i\beta)=\lambda v \]

  • if \(\beta>0, v=1\)

    \[\frac{1}{N}\sum_{i=1}^{N}z_i(y_i-z_i\beta)=\lambda \]

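    Solving for \(\beta\) gives \(\beta=\frac{\sum z_iy_i-\lambda N}{\sum z_i^2}\)

    Since \(z_i\) is standardized, \(\sum z_i^2=N\), so

    \[\beta=\frac{\sum z_iy_i-\lambda N}{N}=\frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle-\lambda \]

    This is consistent with \(\beta>0\) only when \(\frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle>\lambda\).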

  • if \(\beta<0\), \(v=-1\)

    \[\frac{1}{N}\sum_{i=1}^{N}z_i(y_i-z_i\beta)=-\lambda \]

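    Solving for \(\beta\) gives \(\beta=\frac{\sum z_iy_i+\lambda N}{\sum z_i^2}\)

    Since \(z_i\) is standardized, \(\sum z_i^2=N\), so

    \[\beta=\frac{\sum z_iy_i+\lambda N}{N}=\frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle+\lambda \]

    This is consistent with \(\beta<0\) only when \(\frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle<-\lambda\).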

  • if \(\beta=0,|v|\le1\)

    \[|\frac{1}{N}\sum_{i=1}^{N}z_i(y_i-z_i\beta)|\le\lambda \]

    Since \(\beta=0,\) we have \(\frac{1}{N}|\langle\mathbf{z}, \mathbf{y}\rangle| \leq \lambda\)
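The case analysis above can be checked numerically. The sketch below evaluates \(\frac{1}{N}\sum z_i(y_i-z_i\beta)\) and tests whether it equals \(\lambda v\) for an admissible \(v\). The helper names, the tolerance `tol`, and the toy data are assumptions made for illustration; the data is chosen so that \(\sum z_i^2=N\).

```python
def gradient_term(beta, z, y):
    """Compute (1/N) * sum_i z_i * (y_i - z_i * beta)."""
    n = len(z)
    return sum(zi * (yi - zi * beta) for zi, yi in zip(z, y)) / n


def satisfies_optimality(beta, z, y, lam, tol=1e-12):
    """Check the subgradient optimality condition at a candidate beta."""
    g = gradient_term(beta, z, y)
    if beta > 0:
        return abs(g - lam) <= tol   # v = 1
    if beta < 0:
        return abs(g + lam) <= tol   # v = -1
    return abs(g) <= lam + tol       # |v| <= 1


# Toy data (made up): mean zero, sum of squares equal to N = 4.
z = [1.0, -1.0, 1.0, -1.0]
y = [2.0, -1.0, 1.5, -0.5]

# Here (1/N) <z, y> = 1.25.
print(satisfies_optimality(0.75, z, y, lam=0.5))  # True:  1.25 - 0.5 = 0.75
print(satisfies_optimality(0.0, z, y, lam=2.0))   # True:  |1.25| <= 2
print(satisfies_optimality(0.0, z, y, lam=0.5))   # False: |1.25| > 0.5
```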

In conclusion, we have:

\[\widehat{\beta}=\left\{\begin{array}{ll} \frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle-\lambda & \text { if } \frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle>\lambda \\ 0 & \text { if } \frac{1}{N}|\langle\mathbf{z}, \mathbf{y}\rangle| \leq \lambda \\ \frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle+\lambda & \text { if } \frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle<-\lambda \end{array}\right.\]

i.e.

\[\widehat{\beta}=\mathcal{S}_{\lambda}\left(\frac{1}{N}\langle\mathbf{z}, \mathbf{y}\rangle\right) \]

where \(\mathcal{S}_{\lambda}(x)=\operatorname{sign}(x)(|x|-\lambda)_{+}\) is the soft-thresholding operator and \((t)_{+}=\max(t, 0)\) denotes the positive part.
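As a sanity check on the piecewise solution, here is a minimal pure-Python sketch that computes \(\widehat{\beta}\) in closed form and compares it with a brute-force grid search over the objective. The function names, the toy data, and the grid resolution are assumptions made for illustration; the closed form relies on \(\sum z_i^2=N\).

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam, returning 0 when |x| <= lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_1d(z, y, lam):
    """Closed-form single-predictor lasso, assuming sum(z_i^2) == N."""
    n = len(z)
    inner = sum(zi * yi for zi, yi in zip(z, y)) / n  # (1/N) <z, y>
    return soft_threshold(inner, lam)


def objective(beta, z, y, lam):
    """(1/2N) * sum_i (y_i - z_i * beta)^2 + lam * |beta|."""
    n = len(z)
    rss = sum((yi - zi * beta) ** 2 for zi, yi in zip(z, y))
    return rss / (2 * n) + lam * abs(beta)


# Toy data (made up): mean zero, sum of squares equal to N = 4.
z = [1.0, -1.0, 1.0, -1.0]
y = [2.0, -1.0, 1.5, -0.5]

beta_hat = lasso_1d(z, y, lam=0.5)
print(beta_hat)  # 0.75, since (1/N) <z, y> = 1.25

# Brute-force check on a grid of step 0.001 over [-3, 3].
grid = [k / 1000 for k in range(-3000, 3001)]
beta_grid = min(grid, key=lambda b: objective(b, z, y, 0.5))
print(abs(beta_grid - beta_hat) < 1e-9)  # True
```

With a larger penalty such as `lam=2.0`, `lasso_1d(z, y, 2.0)` returns exactly `0.0`, which is the middle case of the piecewise formula.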

posted @ 2020-05-10 17:30  跑得飞快的凤凰花