xinyu04

Optimization for Machine Learning, Chapter 2: Gradient Descent (2)

Theorem 2.7:
Let $f:\mathbb{R}^d\to\mathbb{R}$ be convex and differentiable with a global minimum $x^*$. Suppose $f$ is smooth with parameter $L$. Choosing the stepsize $\gamma=\frac{1}{L}$, gradient descent yields:

$$f(x_T)-f(x^*)\le\frac{L}{2T}\|x_0-x^*\|^2 \tag{1}$$

Proof:
Since $f$ is differentiable and smooth, Lemma 2.6 (sufficient decrease), with $g_t=\nabla f(x_t)$, gives:

$$f(x_{t+1})\le f(x_t)-\frac{1}{2L}\|g_t\|^2 \tag{2}$$

Therefore:

$$\frac{1}{2L}\|g_t\|^2\le f(x_t)-f(x_{t+1}) \tag{3}$$

Now we sum up:

\begin{align}
\frac{1}{2L}\sum_{t=0}^{T-1}\|g_t\|^2 &\le \sum_{t=0}^{T-1}\left[f(x_t)-f(x_{t+1})\right] \tag{4}\\
&= f(x_0)-f(x_T) \tag{5}
\end{align}
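As a quick numerical sanity check of the sufficient-decrease step and the telescoped sum, here is a minimal sketch. The quadratic test function, its coefficients `A`, the starting point, and the helper names `f`, `grad`, `sq_norm` are illustrative assumptions, not part of the lecture notes.

```python
# Sketch: check sufficient decrease (2) and the telescoped sum (4)-(5)
# on an assumed test function f(x) = 0.5 * sum(a_i * x_i^2).
A = [0.5, 1.0, 4.0]          # assumed curvatures; smoothness parameter L = max(A)
L = max(A)

def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))

def grad(x):
    return [a * xi for a, xi in zip(A, x)]

def sq_norm(v):
    return sum(vi * vi for vi in v)

x = [3.0, -2.0, 1.0]         # arbitrary starting point x_0
f0 = f(x)
T = 20
grad_sq_sum = 0.0
decrease_ok = True
for _ in range(T):
    g = grad(x)
    x_next = [xi - gi / L for xi, gi in zip(x, g)]   # stepsize gamma = 1/L
    # sufficient decrease: f(x_{t+1}) <= f(x_t) - ||g_t||^2 / (2L)
    if f(x_next) > f(x) - sq_norm(g) / (2 * L) + 1e-12:
        decrease_ok = False
    grad_sq_sum += sq_norm(g)
    x = x_next

# telescoped sum: (1/(2L)) * sum_t ||g_t||^2 <= f(x_0) - f(x_T)
telescoped_ok = grad_sq_sum / (2 * L) <= f0 - f(x) + 1e-12
print(decrease_ok, telescoped_ok)
```

The small tolerance `1e-12` only absorbs floating-point rounding; the inequalities themselves are exact.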

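With $\gamma=1/L$, the vanilla analysis from the previous part gives:

$$\sum_{t=0}^{T-1}\left[f(x_t)-f(x^*)\right]\le\frac{\gamma}{2}\sum_{t=0}^{T-1}\|g_t\|^2+\frac{1}{2\gamma}\|x_0-x^*\|^2 \tag{6}$$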

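Combining (5) and (6), and using $\frac{\gamma}{2}=\frac{1}{2L}$:

\begin{align}
\sum_{t=0}^{T-1}\left[f(x_t)-f(x^*)\right] &\le \frac{\gamma}{2}\sum_{t=0}^{T-1}\|g_t\|^2+\frac{1}{2\gamma}\|x_0-x^*\|^2 \tag{7}\\
&\le f(x_0)-f(x_T)+\frac{1}{2\gamma}\|x_0-x^*\|^2 \tag{8}
\end{align}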

Hence:

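\begin{align}
\sum_{t=1}^{T}\left[f(x_t)-f(x^*)\right] &\le \frac{1}{2\gamma}\|x_0-x^*\|^2 \tag{9}\\
&= \frac{L}{2}\|x_0-x^*\|^2 \tag{10}
\end{align}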

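As a result, since (2) implies $f(x_{t+1})\le f(x_t)$, the last iterate has the smallest function value among $x_1,\dots,x_T$:

\begin{align}
T\left(f(x_T)-f(x^*)\right) &\le \sum_{t=1}^{T}\left[f(x_t)-f(x^*)\right] \tag{11}\\
&\le \frac{L}{2}\|x_0-x^*\|^2 \tag{12}
\end{align}

Dividing by $T$:

$$f(x_T)-f(x^*)\le\frac{L}{2T}\|x_0-x^*\|^2 \tag{13}$$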
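To see the $O(1/T)$ guarantee in action, the sketch below runs gradient descent with $\gamma=1/L$ and compares the gap $f(x_T)-f(x^*)$ with the bound $\frac{L}{2T}\|x_0-x^*\|^2$. The test function $f(x)=\log\cosh(x)$ is my own choice for illustration: it is convex, has its minimum at $x^*=0$, and is smooth with $L=1$ because $f''(x)=1/\cosh^2(x)\le 1$. The starting point and the helper name `gd` are likewise assumptions.

```python
import math

def f(x):
    return math.log(math.cosh(x))   # convex, minimum f(0) = 0

def grad(x):
    return math.tanh(x)             # f''(x) = 1/cosh(x)^2 <= 1, so L = 1

L = 1.0
x_star = 0.0
x0 = 5.0                            # assumed starting point

def gd(x, steps, gamma):
    """Run `steps` iterations of gradient descent from x with stepsize gamma."""
    for _ in range(steps):
        x = x - gamma * grad(x)
    return x

rows = []
for T in (1, 10, 100, 1000):
    xT = gd(x0, T, 1.0 / L)
    gap = f(xT) - f(x_star)
    bound = L / (2 * T) * (x0 - x_star) ** 2
    rows.append((T, gap, bound))
    print(f"T={T:5d}  gap={gap:.3e}  bound={bound:.3e}")
```

The bound is a worst-case guarantee, so the observed gap may be far smaller than the right-hand side on a particular function.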

1. Smooth and strongly convex functions: $O(\log(1/\epsilon))$ steps

First-order methods use only gradient information to minimize $f$.

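Definition 2.9:
A differentiable function $f$ is strongly convex with parameter $\mu>0$ if, for all $x,y$:

$$f(y)\ge f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\|x-y\|^2 \tag{14}$$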
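The strong convexity inequality, $f(y)\ge f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\|x-y\|^2$, can be tested on random pairs of points. The sketch below does this for an assumed quadratic $f(x)=\frac12\sum_i a_i x_i^2$, for which $\mu=\min_i a_i$; the coefficients `A` and the helper `lower_bound` are hypothetical names chosen for this example. It also shows that the inequality fails once the parameter is chosen too large.

```python
import random

A = [0.5, 1.0, 4.0]     # assumed curvatures
mu = min(A)             # strong convexity parameter of this quadratic

def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))

def grad(x):
    return [a * xi for a, xi in zip(A, x)]

def lower_bound(x, y, m):
    """Right-hand side of the strong convexity inequality with parameter m."""
    g = grad(x)
    linear = sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))
    quad = 0.5 * m * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return f(x) + linear + quad

random.seed(0)
pairs = [
    ([random.uniform(-5, 5) for _ in A], [random.uniform(-5, 5) for _ in A])
    for _ in range(1000)
]

# holds with the true parameter mu (tolerance only for rounding)
holds_mu = all(f(y) >= lower_bound(x, y, mu) - 1e-9 for x, y in pairs)
# fails with a parameter larger than every curvature
holds_too_big = all(f(y) >= lower_bound(x, y, 2 * max(A)) - 1e-9 for x, y in pairs)
print(holds_mu, holds_too_big)
```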

Lemma 2.10:
If $f$ is strongly convex with parameter $\mu>0$, then $f$ is strictly convex and has a unique global minimum.

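Assume that $f$ is strongly convex with parameter $\mu$. Applying (14) with $x=x_t$ and $y=x^*$ strengthens the lower bound used in the vanilla analysis:

\begin{align}
g_t^T(x_t-x^*) &= \nabla f(x_t)^T(x_t-x^*) \tag{15}\\
&\ge f(x_t)-f(x^*)+\frac{\mu}{2}\|x_t-x^*\|^2 \tag{16}
\end{align}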

Hence:

$$f(x_t)-f(x^*)\le\frac{1}{2\gamma}\left[\gamma^2\|g_t\|^2+\|x_t-x^*\|^2-\|x_{t+1}-x^*\|^2\right]-\frac{\mu}{2}\|x_t-x^*\|^2 \tag{17}$$

Rewrite it as:

$$\|x_{t+1}-x^*\|^2\le 2\gamma\left[f(x^*)-f(x_t)\right]+\gamma^2\|g_t\|^2+(1-\mu\gamma)\|x_t-x^*\|^2 \tag{18}$$

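Theorem 2.12:
Let $f:\mathbb{R}^d\to\mathbb{R}$ be convex and differentiable. Suppose $f$ is smooth with parameter $L$ and strongly convex with parameter $\mu$. Choosing the stepsize:

$$\gamma=1/L \tag{19}$$

gradient descent with arbitrary $x_0$ satisfies the following two properties:
(i)

$$\|x_{t+1}-x^*\|^2\le\left(1-\frac{\mu}{L}\right)\|x_t-x^*\|^2 \tag{20}$$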

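Proof:
Since $x^*$ is a global minimum, $f(x^*)\le f(x_{t+1})$; together with sufficient decrease from smoothness, this gives:

$$f(x^*)-f(x_t)\le f(x_{t+1})-f(x_t)\le-\frac{1}{2L}\|g_t\|^2 \tag{21}$$

Plugging this into (18) with $\gamma=1/L$, so that $2\gamma\cdot\frac{1}{2L}=\gamma^2$, we get:

\begin{align}
\|x_{t+1}-x^*\|^2 &\le -\gamma^2\|g_t\|^2+\gamma^2\|g_t\|^2+(1-\mu\gamma)\|x_t-x^*\|^2 \tag{22}\\
&= \left(1-\frac{\mu}{L}\right)\|x_t-x^*\|^2 \tag{23}
\end{align}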
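Property (i) says the squared distance to the minimizer shrinks by at least the factor $1-\mu/L$ in every step. A minimal sketch that measures the per-step ratio on an assumed quadratic $f(x)=\frac12\sum_i a_i x_i^2$ (minimizer $x^*=0$, $L=\max_i a_i$, $\mu=\min_i a_i$); the coefficients and starting point are illustrative assumptions.

```python
A = [0.5, 1.0, 4.0]              # assumed curvatures
L, mu = max(A), min(A)
rate = 1 - mu / L                # contraction factor from property (i)

x = [3.0, -2.0, 1.0]             # arbitrary x_0; the minimizer is x* = 0
ratios = []
for _ in range(30):
    dist_before = sum(xi * xi for xi in x)               # ||x_t - x*||^2
    x = [xi - (a * xi) / L for a, xi in zip(A, x)]        # GD step, gamma = 1/L
    dist_after = sum(xi * xi for xi in x)                 # ||x_{t+1} - x*||^2
    ratios.append(dist_after / dist_before)

print(f"largest observed ratio = {max(ratios):.4f}, guaranteed factor = {rate:.4f}")
```

On this particular quadratic each coordinate contracts by $(1-a_i/L)^2$, so the observed ratios sit below the guaranteed factor; the theorem only promises an upper bound.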

(ii)

$$f(x_T)-f(x^*)\le\frac{L}{2}\left(1-\frac{\mu}{L}\right)^T\|x_0-x^*\|^2 \tag{24}$$

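Proof:
By smoothness, and because $\nabla f(x^*)=0$ at the global minimum:

$$f(x_t)\le f(x^*)+\frac{L}{2}\|x_t-x^*\|^2 \tag{25}$$

Taking $t=T$ and applying (i) repeatedly:

\begin{align}
f(x_T)-f(x^*) &\le \frac{L}{2}\|x_T-x^*\|^2 \tag{26}\\
&\le \dots \le \frac{L}{2}\left(1-\frac{\mu}{L}\right)^T\|x_0-x^*\|^2 \tag{27}
\end{align}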
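Property (ii) is what yields the $O(\log(1/\epsilon))$ step count: since $(1-\mu/L)^T\le e^{-\mu T/L}$, the error is at most $\epsilon$ once $T\ge\frac{L}{\mu}\ln\!\left(\frac{L\|x_0-x^*\|^2}{2\epsilon}\right)$. The sketch below computes this step count and checks it on an assumed quadratic $f(x)=\frac12\sum_i a_i x_i^2$ with $x^*=0$ and $f(x^*)=0$; the helper name `steps_needed`, the coefficients, and the starting point are hypothetical choices for illustration.

```python
import math

A = [0.5, 1.0, 4.0]              # assumed curvatures
L, mu = max(A), min(A)

def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))

def steps_needed(eps, r_sq):
    """Smallest integer T with T >= (L/mu) * ln(L * r_sq / (2 * eps)), at least 0."""
    return max(0, math.ceil((L / mu) * math.log(L * r_sq / (2 * eps))))

x0 = [3.0, -2.0, 1.0]
r_sq = sum(xi * xi for xi in x0)     # ||x_0 - x*||^2 with x* = 0

results = []
for eps in (1e-2, 1e-4, 1e-8):
    T = steps_needed(eps, r_sq)
    x = list(x0)
    for _ in range(T):
        x = [xi - (a * xi) / L for a, xi in zip(A, x)]   # GD step, gamma = 1/L
    results.append((eps, T, f(x)))                       # f(x) is the gap, as f(x*) = 0
    print(f"eps={eps:.0e}  T={T}  gap={f(x):.3e}")
```

Squaring the target accuracy (from `1e-4` to `1e-8`) roughly doubles the logarithm, so the step count grows linearly in $\log(1/\epsilon)$ rather than in $1/\epsilon$.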

Posted by Blackzxy
