Optimization in Machine Learning. Chapter 1: Mathematics Background
1. Notation
\(\|x\|\): Euclidean norm (\(\ell_2\) norm) of \(x\).
$ \mathbb{R}_+ = \{ x\in \mathbb{R}: x\geq 0 \} $
2. Cauchy-Schwarz inequality
Let \(u,v\in \mathbb{R}^d\). Then
$$ |u^\top v| \leq \|u\|\cdot\|v\|. $$
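The Cauchy-Schwarz inequality can be spot-checked numerically on random vectors (an illustrative sketch, not a proof):

```python
import numpy as np

# Spot-check Cauchy-Schwarz: |u^T v| <= ||u|| * ||v|| on random vectors.
rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.standard_normal(5)
    v = rng.standard_normal(5)
    assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
```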
3. Spectral Norm
Let \(A\in \mathbb{R}^{m\times d}\) be a matrix. Its spectral norm is
$$ \|A\| = \max_{v\neq 0} \frac{\|Av\|}{\|v\|}, $$
where \(v\in \mathbb{R}^d\). Consequently,
$$ \|Av\| \leq \|A\|\cdot\|v\|. $$
Furthermore, we have the triangle inequality:
$$ \|A+B\| \leq \|A\| + \|B\|. $$
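In numpy the spectral norm is `np.linalg.norm(A, 2)`, which equals the largest singular value of \(A\); the properties above can be spot-checked on random matrices:

```python
import numpy as np

# The spectral norm ||A|| equals the largest singular value of A.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))
v = rng.standard_normal(4)

spec = np.linalg.norm(A, 2)
assert np.isclose(spec, np.linalg.svd(A, compute_uv=False)[0])
# ||Av|| <= ||A|| * ||v||
assert np.linalg.norm(A @ v) <= spec * np.linalg.norm(v) + 1e-12
# Triangle inequality: ||A + B|| <= ||A|| + ||B||
assert np.linalg.norm(A + B, 2) <= spec + np.linalg.norm(B, 2) + 1e-12
```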
4. Mean Value Theorem
\(\large{\text{Theorem}\ 1.3}\): Let \(a<b\) be real numbers and \(h:[a,b]\rightarrow \mathbb{R}\) be a continuous function that is differentiable on \((a,b)\). Then there exists \(c\in (a,b)\) such that
$$ h'(c) = \frac{h(b)-h(a)}{b-a}. $$
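For a concrete instance, take \(h(x)=x^2\) on \([a,b]\): the mean slope is \(a+b\), so the theorem's point is \(c=(a+b)/2\). A quick numeric check:

```python
# Mean value theorem for h(x) = x^2 on [a, b]:
# (h(b) - h(a)) / (b - a) = a + b = h'(c) with c = (a + b) / 2.
a, b = 1.0, 3.0

def h(x):
    return x ** 2

def dh(x):  # derivative h'(x) = 2x
    return 2 * x

c = (a + b) / 2
assert abs((h(b) - h(a)) / (b - a) - dh(c)) < 1e-12
```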
5. Differentiability
\(\large{\text{Definition}\ 1.5}\): Let \(f: dom(f)\rightarrow \mathbb{R}^m\) with \(dom(f)\subseteq \mathbb{R}^d\). \(f\) is called differentiable at \(x\) in the interior of \(dom(f)\) if there exist a matrix \(A\in \mathbb{R}^{m\times d}\) and an error function \(r:\mathbb{R}^d \rightarrow \mathbb{R}^m\) defined in some neighborhood of \(0\in \mathbb{R}^d\) such that for all \(y\) in some neighborhood of \(x\),
$$ f(y) = f(x) + A(y-x) + r(y-x), $$
where
$$ \lim_{v\rightarrow 0} \frac{\|r(v)\|}{\|v\|} = 0. $$
The matrix \(A\) is unique; it is the differential (Jacobian) \(Df(x)\), with entries \(Df(x)_{ij} = \frac{\partial f_i}{\partial x_j}(x)\).
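The entry formula \(Df(x)_{ij} = \frac{\partial f_i}{\partial x_j}(x)\) can be verified against central finite differences for a small example map (chosen here purely for illustration):

```python
import numpy as np

# Check the analytic Jacobian of f(x) = (x0 * x1, x0 + x1)
# against central finite differences.
def f(x):
    return np.array([x[0] * x[1], x[0] + x[1]])

def jac(x):  # analytic Jacobian: Df(x)_{ij} = d f_i / d x_j
    return np.array([[x[1], x[0]], [1.0, 1.0]])

x = np.array([2.0, -1.5])
eps = 1e-6
num = np.zeros((2, 2))
for j in range(2):
    e = np.zeros(2)
    e[j] = eps
    num[:, j] = (f(x + e) - f(x - e)) / (2 * eps)  # central difference
assert np.allclose(num, jac(x), atol=1e-6)
```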
6. Convex Set
\(\large{\text{Theorem}\ 1.9}\): Let \(f:dom(f)\rightarrow \mathbb{R}^m\) be differentiable, \(X\subseteq dom(f)\) a convex set, and \(B\in \mathbb{R}_+\). If \(X\) is non-empty and open, the following statements are equivalent:
(i) \(\|f(x)-f(y)\| \leq B\|x-y\|\) for all \(x,y\in X\) (\(f\) is \(B\)-Lipschitz over \(X\));
(ii) \(\|Df(x)\| \leq B\) for all \(x\in X\).
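A sketch of the equivalence for the simplest case, a linear map \(f(x)=Ax\): there \(Df(x)=A\) everywhere, so \(B\) can be taken to be the spectral norm \(\|A\|\), and the Lipschitz bound follows:

```python
import numpy as np

# For f(x) = A x, Df(x) = A everywhere, so B = ||A|| (spectral norm)
# bounds the differential, and ||f(x) - f(y)|| <= B ||x - y|| for all x, y.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
B = np.linalg.norm(A, 2)
x, y = rng.standard_normal(3), rng.standard_normal(3)
assert np.linalg.norm(A @ x - A @ y) <= B * np.linalg.norm(x - y) + 1e-12
```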
7. Epigraph
\(\large{\text{Lemma}\ 1.11}\): \(f\) is a convex function if and only if \({\bf epi}(f) = \{(x,\alpha)\in \mathbb{R}^{d+1}: x\in dom(f),\ f(x)\leq \alpha\}\) is a convex set.
\(\large{\text{Lemma}\ 1.15}\): Suppose that \(dom(f)\) is open and \(f\) is differentiable. Then \(f\) is convex if and only if \(dom(f)\) is convex and
$$ f(y) \geq f(x) + \nabla f(x)^\top (y-x) $$
holds for all \(x,y\in dom(f)\).
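The first-order characterization (Lemma 1.15) says a convex function lies above all of its tangent planes. A spot-check for \(f(x)=\|x\|^2\), whose gradient is \(2x\) (here the gap \(f(y)-f(x)-\nabla f(x)^\top(y-x)\) equals \(\|y-x\|^2\geq 0\) exactly):

```python
import numpy as np

# First-order convexity check for f(x) = ||x||^2, grad f(x) = 2x:
# f(y) >= f(x) + grad f(x)^T (y - x), with gap ||y - x||^2.
rng = np.random.default_rng(6)
for _ in range(200):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    assert y @ y >= x @ x + (2 * x) @ (y - x) - 1e-12
```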
\(\large{\bf{\text{Lemma}}\ 1.16}\): Suppose that \(dom(f)\) is open and \(f\) is differentiable. Then \(f\) is convex if and only if \(dom(f)\) is convex and
$$ (\nabla f(y) - \nabla f(x))^\top (y-x) \geq 0 $$
holds for all \(x,y\in dom(f)\).
\(\bf{Proof}\): If \(f\) is convex, then by the first-order convexity property (Lemma 1.15) we have
$$ f(y) \geq f(x) + \nabla f(x)^\top (y-x), \qquad f(x) \geq f(y) + \nabla f(y)^\top (x-y) $$
for all \(x,y\in dom(f)\). Adding up these two inequalities and rearranging, we get
$$ (\nabla f(y) - \nabla f(x))^\top (y-x) \geq 0. $$
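Gradient monotonicity (Lemma 1.16) can likewise be spot-checked for \(f(x)=\|x\|^2\), where \((\nabla f(y)-\nabla f(x))^\top(y-x) = 2\|y-x\|^2 \geq 0\):

```python
import numpy as np

# Gradient monotonicity for the convex f(x) = ||x||^2 (grad f = 2x):
# (grad f(y) - grad f(x))^T (y - x) = 2 ||y - x||^2 >= 0.
rng = np.random.default_rng(7)
x, y = rng.standard_normal(4), rng.standard_normal(4)
assert (2 * y - 2 * x) @ (y - x) >= 0.0
```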
8. Second-Order Characterization of Convexity
\(\large{\text{Lemma}\ 1.17}\): Suppose that \(dom(f)\) is open and \(f\) is twice differentiable. Then \(f\) is convex if and only if \(dom(f)\) is convex and
$$ \nabla^2 f(x) \succeq 0 $$
holds for all \(x\in dom(f)\). (Positive semidefinite: \(M\succeq 0\) means \(x^\top Mx\geq 0\) for all \(x\).)
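For example, a quadratic \(f(x)=\frac{1}{2}x^\top Qx\) with \(Q=M^\top M\) has constant Hessian \(Q\), which is positive semidefinite because \(x^\top Qx = \|Mx\|^2 \geq 0\); hence \(f\) is convex. Checking PSD via eigenvalues:

```python
import numpy as np

# The Hessian of f(x) = (1/2) x^T Q x is Q. With Q = M^T M we have
# x^T Q x = ||Mx||^2 >= 0, so Q is PSD and f is convex.
rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
Q = M.T @ M
eigs = np.linalg.eigvalsh(Q)  # symmetric eigenvalues, ascending
assert eigs.min() >= -1e-10   # all eigenvalues nonnegative => PSD
```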
\(\large{\text{Lemma}\ 1.25}\): Let \(f:dom(f)\rightarrow \mathbb{R}\) be strictly convex. Then \(f\) has at most one global minimum.
\(\large{\text{Lemma}\ 1.27}\): Suppose \(f:dom(f)\rightarrow \mathbb{R}\) is convex and differentiable, and let \(X\subseteq dom(f)\) be a convex set. A point \(x^*\in X\) is a \({\bf minimizer}\) of \(f\) over \(X\) if and only if
$$ \nabla f(x^*)^\top (x - x^*) \geq 0 \quad \text{for all } x\in X. $$
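A small worked example of the constrained optimality condition (the problem here is hypothetical, chosen for illustration): minimizing \(f(x)=\|x-c\|^2\) over the convex set \(X=\{x\geq 0\}\) gives the projection \(x^*=\max(c,0)\), and the gradient at \(x^*\) makes a nonnegative inner product with every feasible direction:

```python
import numpy as np

# Minimize f(x) = ||x - c||^2 over X = {x >= 0}; the minimizer is the
# projection x* = max(c, 0). Check grad f(x*)^T (x - x*) >= 0 on X.
c = np.array([1.0, -2.0, 0.5])
x_star = np.maximum(c, 0.0)
grad = 2 * (x_star - c)  # grad f(x*) = 2 (x* - c)
rng = np.random.default_rng(4)
for _ in range(100):
    x = np.abs(rng.standard_normal(3))  # random feasible point in X
    assert grad @ (x - x_star) >= -1e-12
```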
9. log-sum-exp function
\({\bf \large Statement}: \text{log-sum-exp function is a convex function}\)
\(\bf Proof\): Let \(f(x) = \log\left(\sum_i e^{x_i}\right)\) and fix \(\lambda\in(0,1)\). By Hölder's inequality,
$$ \sum_i |a_i b_i| \leq \left(\sum_i |a_i|^p\right)^{1/p}\left(\sum_i |b_i|^q\right)^{1/q}, $$
where \(1/p+1/q=1\). Applying it with \(a_i = e^{\lambda x_i}\), \(b_i = e^{(1-\lambda)y_i}\), \(p = 1/\lambda\), \(q = 1/(1-\lambda)\), we get
$$ \sum_i e^{\lambda x_i + (1-\lambda)y_i} \leq \left(\sum_i e^{x_i}\right)^{\lambda}\left(\sum_i e^{y_i}\right)^{1-\lambda}. $$
Taking logarithms on both sides yields
$$ f(\lambda x + (1-\lambda)y) \leq \lambda f(x) + (1-\lambda)f(y), $$
so \(f\) is convex. \(\square\)
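The convexity of log-sum-exp can also be spot-checked numerically (using the standard max-shift trick for numerical stability):

```python
import numpy as np

# Spot-check convexity of log-sum-exp:
# f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y) for random x, y, l.
def lse(x):
    m = x.max()  # subtract the max before exponentiating, for stability
    return m + np.log(np.exp(x - m).sum())

rng = np.random.default_rng(5)
for _ in range(200):
    x, y = rng.standard_normal(6), rng.standard_normal(6)
    lam = rng.uniform()
    assert lse(lam * x + (1 - lam) * y) <= lam * lse(x) + (1 - lam) * lse(y) + 1e-10
```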