CS229 Machine Learning Study Notes: Note 6 (The Perceptron and Large Margin Classifiers, Online Learning)

Consider the following online learning problem:

  • 1. The learning model is \(h_\theta(x)=g(\theta^Tx)\), where \(g(z)=1\) when \(z\geq 0\) and \(g(z)=-1\) when \(z<0\).

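  • 2. Initially \(\theta=0\). Then the m training examples \((x^{(1)},y^{(1)}),\cdots,(x^{(m)},y^{(m)})\) are presented one at a time. Each time, the model predicts the label of the \(i\)-th example \((x^{(i)},y^{(i)})\); if the prediction is correct, \(\theta\) is left unchanged, otherwise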

\[\theta:=\theta+y^{(i)}x^{(i)} \]

(The learning rate is omitted here, or equivalently set to 1, because its magnitude does not affect the discussion below.)
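The procedure described above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the names `predict` and `online_perceptron` and the toy data in `stream` are assumptions made for this example and do not come from the original notes.

```python
def predict(theta, x):
    """Return g(theta^T x): +1 if theta^T x >= 0, otherwise -1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else -1


def online_perceptron(samples):
    """Process (x, y) pairs in order, updating theta only on mistakes.

    Returns the final theta and the number of mistakes made.
    """
    samples = list(samples)
    theta = [0.0] * len(samples[0][0])  # theta starts at 0
    mistakes = 0
    for x, y in samples:
        if predict(theta, x) != y:
            # Misclassified: theta := theta + y * x (learning rate 1)
            theta = [t + y * xi for t, xi in zip(theta, x)]
            mistakes += 1
    return theta, mistakes


# Hypothetical toy stream of (x, y) pairs with labels in {+1, -1}.
stream = [
    ([1.0, 2.0], 1),
    ([-1.0, -1.5], -1),
    ([2.0, 0.5], 1),
    ([-2.0, -1.0], -1),
]
theta, mistakes = online_perceptron(stream)
print(theta, mistakes)
```

On this toy stream the only mistake happens on the second example: with \(\theta=0\) the model always predicts \(+1\), so the first negative example triggers the single update.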

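Theorem: Let \((x^{(1)},y^{(1)}),\cdots,(x^{(m)},y^{(m)})\) be a sequence of training examples. Suppose \(\|x^{(i)}\|\leq D\) for all \(i\), and suppose there exists a unit column vector \(u\), with \(u^Tx=0\) as its decision boundary, such that \(y^{(i)}(u^Tx^{(i)})\geq \gamma\) for all \(i\), where \(\gamma>0\). (That is, the margin between this decision boundary and every training example is at least \(\gamma\); since \(\|u\|=1\), the functional and geometric margins coincide.) Then the learning model makes at most \((\frac D \gamma)^2\) mistakes on these m training examples.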
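To make the quantities in the theorem concrete, the following sketch computes \(D\), \(\gamma\) and the bound \((\frac D \gamma)^2\) for a given data set and a given separating direction. The function name `mistake_bound` and the two-point data set are hypothetical; the direction `u` is supplied by the caller and normalized to unit length, and nothing here searches for a good `u`.

```python
import math


def mistake_bound(samples, u):
    """Return (D, gamma, (D / gamma) ** 2) for labelled samples and direction u."""
    # Normalize u so that it is a unit vector, as the theorem requires.
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm_u for ui in u]
    # D: the largest Euclidean norm among the inputs.
    D = max(math.sqrt(sum(xi * xi for xi in x)) for x, _ in samples)
    # gamma: the smallest value of y * (u^T x) over all samples.
    gamma = min(y * sum(ui * xi for ui, xi in zip(u, x)) for x, y in samples)
    if gamma <= 0:
        raise ValueError("u does not separate the samples with a positive margin")
    return D, gamma, (D / gamma) ** 2


# Hypothetical data: two points separated by the vertical axis.
samples = [([3.0, 4.0], 1), ([-3.0, 0.0], -1)]
D, gamma, bound = mistake_bound(samples, [1.0, 0.0])
print(D, gamma, bound)
```

Here \(D=5\) and \(\gamma=3\), so the theorem guarantees at most \(25/9\approx 2.78\) mistakes, that is, at most 2.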

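Proof: The model updates \(\theta\) only when it misclassifies the current training example. Let \(\theta^{(k)}\) denote the parameters at the moment it makes its k-th mistake; clearly \(\theta^{(1)}=0\) (at the first mistake the parameters are still the initial 0 and have never been updated). If the k-th mistake occurs on the i-th training example \((x^{(i)},y^{(i)})\), that is, \(g((x^{(i)})^T\theta^{(k)})\neq y^{(i)}\), then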

\[(x^{(i)})^T\theta^{(k)}y^{(i)}\leq 0 \quad\cdots(1) \]

(Because the example is misclassified, either \(y^{(i)}=-1\) and \((x^{(i)})^T\theta^{(k)}\geq 0\), or \(y^{(i)}=1\) and \((x^{(i)})^T\theta^{(k)}< 0\).)

By the update rule, \(\theta^{(k+1)}=\theta^{(k)}+y^{(i)}x^{(i)}\). Transposing both sides and right-multiplying by the unit column vector \(u\) gives

\[(\theta^{(k+1)})^Tu=(\theta^{(k)})^Tu+y^{(i)}(x^{(i)})^Tu\geq (\theta^{(k)})^Tu+\gamma \]

\[(\theta^{(k+1)})^Tu\geq (\theta^{(k)})^Tu+\gamma\geq (\theta^{(k-1)})^Tu+2\gamma\geq \cdots\geq k\gamma \quad\cdots(2) \]

Similarly,

\[\|\theta^{(k+1)}\|^2=\|\theta^{(k)}+y^{(i)}x^{(i)}\|^2 \]

\[=\|\theta^{(k)}\|^2+\|y^{(i)}x^{(i)}\|^2+2(y^{(i)}x^{(i)})\cdot \theta^{(k)} \]

\[=\|\theta^{(k)}\|^2+\|y^{(i)}x^{(i)}\|^2+(2y^{(i)})(x^{(i)}\cdot \theta^{(k)}) \]

\[=\|\theta^{(k)}\|^2+\|y^{(i)}x^{(i)}\|^2+2y^{(i)}(x^{(i)})^T \theta^{(k)} \]

\[\leq \|\theta^{(k)}\|^2+\|y^{(i)}x^{(i)}\|^2 \]

(by inequality (1))

\[= \|\theta^{(k)}\|^2+\|x^{(i)}\|^2 \]

\[\leq \|\theta^{(k)}\|^2+D^2 \quad\cdots(3) \]

(by the initial assumption that \(\|x^{(i)}\|\leq D\) for all \(i\))

\[\|\theta^{(k+1)}\|^2\leq \|\theta^{(k)}\|^2+D^2\leq \|\theta^{(k-1)}\|^2+2D^2\leq \cdots \leq kD^2 \quad\cdots(4) \]

Combining inequalities (2) and (4):

\[\sqrt k D\geq \|\theta^{(k+1)}\| \]

(by inequality (4))

\[\geq (\theta^{(k+1)})^Tu \]

(Since \(\|u\|=1\), we have \((\theta^{(k+1)})^Tu=\|\theta^{(k+1)}\|\cdot\|u\|\cos\phi\leq\|\theta^{(k+1)}\|\cdot\|u\|=\|\theta^{(k+1)}\|\), where \(\phi\) is the angle between \(\theta^{(k+1)}\) and \(u\).)

\[\geq k\gamma \]

(by inequality (2))

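Therefore, for any k-th mistake, \(\sqrt k D\geq k\gamma\); dividing both sides by \(\sqrt k\,\gamma\) gives \(\sqrt k\leq \frac D \gamma\), that is, \(k\leq (\frac D \gamma)^2\).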
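As a sanity check on the argument, the sketch below runs the perceptron on a small data set and asserts inequalities (2) and (4) after every mistake, then compares the mistake count with \((\frac D \gamma)^2\). The function `run_and_check`, the tolerance `EPS`, the data set `data` and the unit vector `u` are all assumptions chosen for illustration; a passing run on one toy data set illustrates the proof but does not replace it.

```python
import math

EPS = 1e-9  # tolerance for floating-point comparisons (an assumption)


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def run_and_check(samples, u):
    """Run one pass of the perceptron, checking (2) and (4) after each mistake.

    u must be a unit vector with y * (u^T x) > 0 for every sample.
    Returns (number of mistakes k, bound (D / gamma) ** 2).
    """
    D = max(math.sqrt(dot(x, x)) for x, _ in samples)
    gamma = min(y * dot(u, x) for x, y in samples)
    theta = [0.0] * len(u)
    k = 0
    for x, y in samples:
        prediction = 1 if dot(theta, x) >= 0 else -1
        if prediction != y:
            theta = [t + y * xi for t, xi in zip(theta, x)]
            k += 1
            # theta now plays the role of theta^(k+1) in the proof.
            assert dot(theta, u) >= k * gamma - EPS        # inequality (2)
            assert dot(theta, theta) <= k * D ** 2 + EPS   # inequality (4)
    return k, (D / gamma) ** 2


# Hypothetical data, separable by the unit vector u = (0.6, 0.8).
u = [0.6, 0.8]
data = [
    ([1.0, 1.0], 1),
    ([-1.0, -2.0], -1),
    ([2.0, -0.5], 1),
    ([-2.0, 0.5], -1),
    ([0.0, 1.0], 1),
    ([1.0, -2.0], -1),
]
k, bound = run_and_check(data, u)
print(k, bound)
```

For this data \(D=\sqrt 5\) and \(\gamma=0.8\), so the bound is about 7.81, while the run itself makes a single mistake. The bound is a worst-case guarantee and can be far from tight on a particular sequence.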

posted @ 2018-07-17 14:58  YongkangZhang