Exploring Recursion in Convex Optimization
Recursion in optimization
In this blog post, I aim to give an overview of the various recursive methods I have encountered in convex optimization. Optimization methods typically produce a sequence of iterates, and the analysis asks how this sequence behaves.
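As a concrete illustration (my own example, not a construction from the post itself), gradient descent on a function $f$ generates iterates $x_{k+1} = x_k - \gamma_k \nabla f(x_k)$, and its analysis tracks a nonnegative error sequence such as

$$a_k = \lVert x_k - x^* \rVert^2 \qquad \text{or} \qquad a_k = f(x_k) - f^*,$$

where $x^*$ is a minimizer and $f^* = f(x^*)$; convergence of the method then amounts to $a_k \to 0$.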
Convergent sequence
The notion of a convergent (or divergent) sequence is fundamental in analysis; the Cauchy sequence is perhaps the best-known example. In the following discussion, I will narrow the focus to specific recursions and shed light on their convergence properties, starting from one particular recursion.
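A recursion of the kind studied here, written in notation that I adopt as an assumption (in the spirit of Polyak, Section 2.2), is

$$a_{k+1} \le q_k\, a_k + \epsilon_k, \qquad a_k \ge 0,\ \ q_k \ge 0,\ \ \epsilon_k \ge 0,$$

where $q_k$ is a (near-)contraction factor and $\epsilon_k$ is a perturbation; for stochastic gradient methods under strong convexity, $\epsilon_k$ typically scales with the squared stepsize times a noise variance.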
A recursion of this form appears throughout optimization (see Polyak, Section 2.2.3).
Let us now look at two fundamental lemmas that play a crucial role in determining whether a sequence governed by such a recursion converges.
Lemma 1 imposes one set of conditions on the coefficients of the recursion under which convergence is guaranteed.
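For instance, one classical lemma of this kind, which I restate here in the notation above (the exact conditions in Lemma 1 may differ; see Polyak, Lemma 2.2.3, or Franci & Grammatico for rigorous versions), covers the contraction case: if

$$a_{k+1} \le (1 - \alpha_k)\, a_k + \epsilon_k, \qquad \alpha_k \in (0, 1],\ \ \epsilon_k \ge 0,$$

with $\sum_k \alpha_k = \infty$ and $\epsilon_k / \alpha_k \to 0$, then $a_k \to 0$.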
Lemma 2 gives a second, complementary set of conditions under which the sequence still converges.
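A second classical lemma (again my restatement, with assumed conditions) allows the coefficient to exceed one slightly: if

$$a_{k+1} \le (1 + \delta_k)\, a_k + \epsilon_k, \qquad \delta_k, \epsilon_k \ge 0,$$

with $\sum_k \delta_k < \infty$ and $\sum_k \epsilon_k < \infty$, then $a_k$ converges to a finite (not necessarily zero) limit; this is a deterministic relative of the Robbins-Siegmund lemma surveyed by Franci & Grammatico.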
Building on these two lemmas, a variety of convergent recursions can be derived, as illustrated in the survey by Franci & Grammatico.
Convergence rate
When dealing with a convergent sequence, a critical question arises: how many iterations are needed to obtain an $\epsilon$-accurate solution?
To begin the analysis, let us consider the scenario where both the contraction factor and the perturbation are held constant. In that case the error contracts geometrically, but only down to a neighborhood of the optimum whose size is governed by the perturbation.
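A quick sketch of both claims, using the notation introduced above (an assumption on my part) with $q_k \equiv q \in (0,1)$ and $\epsilon_k \equiv \epsilon$: unrolling the recursion yields

$$a_k \le q^k a_0 + \epsilon \sum_{i=0}^{k-1} q^i \le q^k a_0 + \frac{\epsilon}{1 - q}.$$

The first term decays geometrically and drops below any target $\delta > 0$ after $k \ge \log(a_0/\delta) / \log(1/q)$ iterations, while the second term is a residual neighborhood of radius $\epsilon/(1-q)$ that no number of iterations removes.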
As highlighted in Bottou et al., Theorem 4.6, a constant learning rate yields linear convergence of the expected objective values to a neighborhood of the optimal value. A smaller stepsize degrades the contraction constant, and hence slows the linear rate, but it shrinks the neighborhood and lets the iterates approach the optimal value more closely.
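To make the tradeoff concrete, here is a small numerical sketch (my own illustration, not code from the cited papers). It iterates the scalar recursion $a_{k+1} = (1 - \gamma\mu) a_k + \gamma^2 \sigma^2$, which mimics the expected-suboptimality bound of constant-stepsize SGD under strong convexity; the fixed point $\gamma \sigma^2 / \mu$ shrinks with the stepsize $\gamma$, while the contraction factor $1 - \gamma\mu$ gets worse.

# Scalar model of constant-stepsize SGD: a_{k+1} = (1 - gamma*mu)*a_k + gamma^2*sigma2.
# mu, sigma2, a0 are arbitrary illustrative values, not taken from the references.
mu, sigma2, a0 = 1.0, 1.0, 10.0

def run(gamma, steps=200):
    a = a0
    for _ in range(steps):
        a = (1.0 - gamma * mu) * a + gamma ** 2 * sigma2
    return a

for gamma in (0.5, 0.1, 0.02):
    # Larger stepsizes contract faster but stall at the larger neighborhood gamma*sigma2/mu;
    # smaller stepsizes contract slowly but can get much closer to zero.
    print(f"gamma={gamma:<5} a_200={run(gamma):.4f} neighborhood={gamma * sigma2 / mu:.4f}")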
To ensure convergence to the optimum itself, we instead opt for a diminishing stepsize strategy, which trades the linear rate for a sublinear convergence rate on the order of $\mathcal{O}(1/k)$.
In general, we can reformulate the recursion so that the diminishing stepsize appears explicitly, which leads to the following result.
Theorem 1 states an explicit bound on the sequence generated by the recursion under a suitably chosen diminishing stepsize, with the bound decaying at the rate $\mathcal{O}(1/k)$.
We prove it by induction: the base case is immediate, and the inductive step applies the recursion once to carry the bound from step $k$ to step $k+1$.
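A representative instance of such a theorem, stated and proved in my own notation (so the specific stepsize choice and constants are assumptions, in the spirit of Stich and of Bottou et al., Theorem 4.7), is the following. Let $a_k, C \ge 0$ satisfy

$$a_{k+1} \le \Bigl(1 - \frac{2}{k+2}\Bigr) a_k + \frac{C}{(k+2)^2} \qquad \text{for all } k \ge 0;$$

then $a_k \le \dfrac{\max\{a_0, C\}}{k+1}$ for all $k \ge 0$. Indeed, for $k = 0$ the bound reads $a_0 \le \max\{a_0, C\}$, which holds trivially. Assuming $a_k \le A/(k+1)$ with $A = \max\{a_0, C\}$, the recursion gives

$$a_{k+1} \le \frac{k}{k+2}\cdot\frac{A}{k+1} + \frac{C}{(k+2)^2} \le \frac{A}{k+2}\left(\frac{k}{k+1} + \frac{1}{k+2}\right) \le \frac{A}{k+2},$$

since $\frac{1}{k+2} \le \frac{1}{k+1}$. This recursion arises, for example, from $a_{k+1} \le (1 - c\gamma_k) a_k + \gamma_k^2 B$ with the stepsize $\gamma_k = \frac{2}{c(k+2)}$, in which case $C = 4B/c^2$.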
Polyak, Lemma 2.2.4, gives the convergence rate for a more general recursion of this kind.
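A classical result of this flavor (my paraphrase; see Polyak, Lemma 2.2.4, for the precise statement and constants) says that if

$$a_{k+1} \le \Bigl(1 - \frac{c}{k}\Bigr) a_k + \frac{d}{k^{p+1}}, \qquad c > p > 0,\ d > 0,$$

for all sufficiently large $k$, then $a_k \le \dfrac{d}{(c - p)\, k^{p}} + o\!\left(k^{-p}\right)$; in particular, the choice $p = 1$ recovers the familiar $\mathcal{O}(1/k)$ rate.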
As shown in Bottou et al., Theorem 5.1, when
References
- Polyak, B. T. (1987). Introduction to optimization.
- Franci, B., & Grammatico, S. (2022). Convergence of sequences: A survey. Annual Reviews in Control, 53, 161-186.
- Bottou, L., Curtis, F. E., & Nocedal, J. (2018). Optimization methods for large-scale machine learning. SIAM review, 60(2), 223-311.
- Stich, S. U. (2019). Unified optimal analysis of the (stochastic) gradient method. arXiv preprint arXiv:1907.04232.