xinyu04

Optimization for Machine Learning, Chapter 2: Gradient Descent (1)

1. The gradient descent step

$$x_{t+1} = x_t - \gamma \nabla f(x_t) \tag{1}$$
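The update rule above can be sketched in a few lines of plain Python. This is only an illustration: the function name `gradient_descent` and the test function $f(x) = \frac{1}{2}\|x\|^2$ (whose gradient is $x$) are assumptions, not part of the original notes.

```python
def gradient_descent(grad, x0, gamma, T):
    """Run T steps of x_{t+1} = x_t - gamma * grad(x_t) and return all iterates."""
    xs = [list(x0)]
    for _ in range(T):
        x = xs[-1]
        g = grad(x)
        xs.append([xi - gamma * gi for xi, gi in zip(x, g)])
    return xs

# Assumed example: f(x) = ||x||^2 / 2, so the gradient of f at x is x itself.
iterates = gradient_descent(lambda x: x, [4.0, -2.0], gamma=0.5, T=3)
print(iterates[-1])  # prints [0.5, -0.25]
```

With $\gamma = 0.5$ each step halves the iterate, so three steps shrink the starting point by a factor of eight.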

2. Vanilla Analysis

Let $g_t = \nabla f(x_t)$. From the update step (1) we get:

$$g_t = \frac{x_t - x_{t+1}}{\gamma} \tag{2}$$

Hence, for a minimizer $x^*$ of $f$, we get:

$$g_t^\top (x_t - x^*) = \frac{1}{\gamma}(x_t - x_{t+1})^\top (x_t - x^*) \tag{3}$$

Basic vector identity: $2v^\top w = \|v\|^2 + \|w\|^2 - \|v - w\|^2$

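Applying it with $v = x_t - x_{t+1}$ and $w = x_t - x^*$, we obtain:

$$
\begin{aligned}
g_t^\top (x_t - x^*) &= \frac{1}{\gamma}(x_t - x_{t+1})^\top (x_t - x^*) && (4)\\
&= \frac{1}{2\gamma}\left[\|x_t - x_{t+1}\|^2 + \|x_t - x^*\|^2 - \|x_{t+1} - x^*\|^2\right] && (5)\\
&= \frac{1}{2\gamma}\left[\gamma^2\|g_t\|^2 + \|x_t - x^*\|^2 - \|x_{t+1} - x^*\|^2\right] && (6)\\
&= \frac{\gamma}{2}\|g_t\|^2 + \frac{1}{2\gamma}\left[\|x_t - x^*\|^2 - \|x_{t+1} - x^*\|^2\right] && (7)
\end{aligned}
$$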

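Summing over $t = 0, \dots, T-1$, the distance terms telescope:

$$
\begin{aligned}
\sum_{t=0}^{T-1} g_t^\top (x_t - x^*) &= \frac{\gamma}{2}\sum_{t=0}^{T-1}\|g_t\|^2 + \frac{1}{2\gamma}\left[\|x_0 - x^*\|^2 - \|x_T - x^*\|^2\right] && (8)\\
&\le \frac{\gamma}{2}\sum_{t=0}^{T-1}\|g_t\|^2 + \frac{1}{2\gamma}\|x_0 - x^*\|^2 && (9)
\end{aligned}
$$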

Next we use convexity: $f(y) \ge f(x) + \nabla f(x)^\top (y - x)$ for all $x, y$. Taking $x = x_t$ and $y = x^*$ gives:

$$f(x_t) - f(x^*) \le g_t^\top (x_t - x^*) \tag{10}$$

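Combining this with inequality (9):

$$\sum_{t=0}^{T-1}\left(f(x_t) - f(x^*)\right) \le \frac{\gamma}{2}\sum_{t=0}^{T-1}\|g_t\|^2 + \frac{1}{2\gamma}\|x_0 - x^*\|^2 \tag{11}$$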

Dividing by $T$, this gives an upper bound on the average error $\frac{1}{T}\sum_{t=0}^{T-1}\left(f(x_t) - f(x^*)\right)$.
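As a sanity check, the chain from (8) to (11) can be verified numerically. The sketch below assumes a hypothetical convex function $f(x) = \frac{1}{2}x_1^2 + 2x_2^2$ with minimizer $x^* = 0$; the stepsize, starting point, and variable names are arbitrary choices for illustration.

```python
def f(x):
    # Assumed convex test function with minimizer x* = (0, 0).
    return 0.5 * x[0] ** 2 + 2.0 * x[1] ** 2

def grad(x):
    return [x[0], 4.0 * x[1]]

def sq_norm(v):
    return sum(vi * vi for vi in v)

gamma, T = 0.1, 50
x_star = [0.0, 0.0]
x0 = [3.0, -1.0]
x = list(x0)

error_sum = 0.0    # sum of f(x_t) - f(x*), left side of (11)
inner_sum = 0.0    # sum of g_t^T (x_t - x*), left side of (8)
grad_sq_sum = 0.0  # sum of ||g_t||^2
for _ in range(T):
    g = grad(x)
    error_sum += f(x) - f(x_star)
    inner_sum += sum(gi * (xi - si) for gi, xi, si in zip(g, x, x_star))
    grad_sq_sum += sq_norm(g)
    x = [xi - gamma * gi for xi, gi in zip(x, g)]

dist0 = sq_norm([a - b for a, b in zip(x0, x_star)])
distT = sq_norm([a - b for a, b in zip(x, x_star)])
identity_rhs = gamma / 2 * grad_sq_sum + (dist0 - distT) / (2 * gamma)  # (8)
bound_rhs = gamma / 2 * grad_sq_sum + dist0 / (2 * gamma)               # (9), (11)

print(abs(inner_sum - identity_rhs) < 1e-9)  # (8) holds up to rounding
print(error_sum <= inner_sum <= bound_rhs)   # (10) summed, then (9)
```

Note that the analysis never assumes a particular stepsize, so the same check should pass for other positive values of `gamma`.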

3. Lipschitz convex functions: $O(1/\epsilon^2)$ steps

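Theorem 2.1:
Let $f: \mathbb{R}^d \to \mathbb{R}$ be convex and differentiable with a global minimum $x^*$. Suppose that $\|x_0 - x^*\| \le R$ and $\|\nabla f(x)\| \le B$ for all $x$. Choosing the stepsize $\gamma = \frac{R}{B\sqrt{T}}$, gradient descent yields:

$$\frac{1}{T}\sum_{t=0}^{T-1}\left(f(x_t) - f(x^*)\right) \le \frac{RB}{\sqrt{T}} \tag{12}$$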

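Proof:
Substituting $\|g_t\| \le B$ and $\|x_0 - x^*\| \le R$ into inequality (11) gives $\sum_{t=0}^{T-1}\left(f(x_t) - f(x^*)\right) \le \frac{\gamma}{2}B^2T + \frac{R^2}{2\gamma}$. With $\gamma = \frac{R}{B\sqrt{T}}$, each of the two terms equals $\frac{RB\sqrt{T}}{2}$, so the right-hand side is $RB\sqrt{T}$; dividing by $T$ gives (12). Consequently, $T \ge R^2B^2/\epsilon^2$ steps suffice for an average error of at most $\epsilon$, which is the $O(1/\epsilon^2)$ rate.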
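The bound of Theorem 2.1 can be illustrated on a small example. The sketch below assumes the hypothetical function $f(x) = \sqrt{1 + \|x\|^2}$, which is convex, differentiable, minimized at $x^* = 0$, and has gradient norm below $1$ everywhere (so $B = 1$ is a valid choice); the starting point and $T$ are arbitrary.

```python
import math

def f(x):
    # Assumed test function: convex, differentiable, gradient norm < 1.
    return math.sqrt(1.0 + sum(xi * xi for xi in x))

def grad(x):
    s = f(x)
    return [xi / s for xi in x]

x_star = [0.0, 0.0]
x0 = [3.0, 4.0]
R = 5.0   # ||x0 - x*|| = 5
B = 1.0   # upper bound on the gradient norm
T = 100
gamma = R / (B * math.sqrt(T))  # stepsize prescribed by the theorem

x = list(x0)
total_error = 0.0
for _ in range(T):
    total_error += f(x) - f(x_star)
    g = grad(x)
    x = [xi - gamma * gi for xi, gi in zip(x, g)]

avg_error = total_error / T
bound = R * B / math.sqrt(T)
print(avg_error <= bound)
```

The stepsize depends on $T$, so the number of steps has to be fixed before running the method.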

4. Smooth convex functions: $O(1/\epsilon)$ steps

Definition 2.2: $f$ is smooth with parameter $L$ if, for all $x, y$:

$$f(y) \le f(x) + \nabla f(x)^\top (y - x) + \frac{L}{2}\|x - y\|^2 \tag{13}$$

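For example, all quadratic functions of the form $f(x) = x^\top Q x + b^\top x + c$ are smooth (with parameter $2\|Q\|$ when $Q$ is symmetric, where $\|Q\|$ is the spectral norm).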

Lemma 2.4:
Let $f: \mathbb{R}^d \to \mathbb{R}$ be convex and differentiable. The following statements are equivalent:

(i) $f$ is smooth with parameter $L$. (14)
(ii) $\|\nabla f(y) - \nabla f(x)\| \le L\|x - y\|$ for all $x, y$. (15)
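Both characterizations in Lemma 2.4 can be spot-checked on random point pairs. The sketch below assumes the hypothetical quadratic $f(x) = \frac{1}{2}x_1^2 + 2x_2^2$, whose Hessian has largest eigenvalue $4$, so $L = 4$ is taken as the smoothness parameter; the sampling range and tolerance are arbitrary.

```python
import math
import random

L = 4.0  # assumed smoothness parameter for the test function below

def f(x):
    return 0.5 * x[0] ** 2 + 2.0 * x[1] ** 2

def grad(x):
    return [x[0], 4.0 * x[1]]

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

random.seed(0)
tol = 1e-9
smooth_ok = True     # statement (i): the quadratic upper bound (13)
lipschitz_ok = True  # statement (ii): the gradient is L-Lipschitz
for _trial in range(1000):
    x = [random.uniform(-5, 5) for _ in range(2)]
    y = [random.uniform(-5, 5) for _ in range(2)]
    d = [yi - xi for xi, yi in zip(x, y)]
    gx, gy = grad(x), grad(y)
    upper = f(x) + sum(gi * di for gi, di in zip(gx, d)) + L / 2 * norm(d) ** 2
    if f(y) > upper + tol:
        smooth_ok = False
    if norm([a - b for a, b in zip(gy, gx)]) > L * norm(d) + tol:
        lipschitz_ok = False

print(smooth_ok, lipschitz_ok)
```

A random check like this cannot prove the lemma; it only shows that the two statements agree on the sampled points.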

Lemma 2.6:
Let $f: \mathbb{R}^d \to \mathbb{R}$ be differentiable and smooth with parameter $L$. Choosing $\gamma = \frac{1}{L}$, gradient descent yields

$$f(x_{t+1}) \le f(x_t) - \frac{1}{2L}\|\nabla f(x_t)\|^2 \tag{16}$$

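Proof:
With $\gamma = \frac{1}{L}$, the update step (1) becomes

$$x_{t+1} = x_t - \frac{1}{L}\nabla f(x_t) \tag{17}$$

so that $\nabla f(x_t) = L(x_t - x_{t+1})$. By the definition of smoothness with $x = x_t$ and $y = x_{t+1}$:

$$
\begin{aligned}
f(x_{t+1}) &\le f(x_t) + \nabla f(x_t)^\top (x_{t+1} - x_t) + \frac{L}{2}\|x_t - x_{t+1}\|^2 && (18)\\
&= f(x_t) + L(x_t - x_{t+1})^\top (x_{t+1} - x_t) + \frac{L}{2}\|x_t - x_{t+1}\|^2 && (19)\\
&= f(x_t) - \frac{L}{2}\cdot\frac{1}{L^2}\|\nabla f(x_t)\|^2 && (20)\\
&= f(x_t) - \frac{1}{2L}\|\nabla f(x_t)\|^2 && (21)
\end{aligned}
$$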
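The sufficient decrease property (16) can be checked step by step on a small example. The sketch below again assumes the hypothetical quadratic $f(x) = \frac{1}{2}x_1^2 + 2x_2^2$ with $L = 4$; the starting point and the number of steps are arbitrary.

```python
L = 4.0          # assumed smoothness parameter of the test function
gamma = 1.0 / L  # stepsize from Lemma 2.6

def f(x):
    return 0.5 * x[0] ** 2 + 2.0 * x[1] ** 2

def grad(x):
    return [x[0], 4.0 * x[1]]

x = [3.0, -1.0]
decrease_holds = []
for _ in range(20):
    g = grad(x)
    x_next = [xi - gamma * gi for xi, gi in zip(x, g)]
    # Right-hand side of (16): f(x_t) - ||grad f(x_t)||^2 / (2L).
    guaranteed = f(x) - sum(gi * gi for gi in g) / (2 * L)
    decrease_holds.append(f(x_next) <= guaranteed + 1e-12)
    x = x_next

print(all(decrease_holds))
```

The lemma does not require convexity, only smoothness, so the same check should also apply to smooth non-convex test functions.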

Posted by Blackzxy