非线性规划——凸优化、凸函数、凸规划（二）

凸规划是指若最优化问题的目标函数为凸函数，不等式约束函数也为凸函数，等式约束函数是线性的。凸规划的可行域为凸集，因而凸规划的局部最优解就是它的全局最优解。当凸规划的目标函数为严格凸函数时，若存在最优解，则这个最优解一定是唯一的最优解。

一、凸集

凸集：设$C$为$n$维欧式空间的一个集合，若$C$内任意两点间的线段也均在$C$内，则称集合$C$为凸集。即

\[\forall x_1, x_2 \in C, \forall \theta \in [0,1]，则 x= \theta * x_1 + (1-\theta)*x_2 \in C \]

凸组合（convex combination）：
$$λ_1x_1+⋯+λ_kx_k$$

其中，$λ_1,⋯,λ_k≥0,∑λ_i=1$，二元 $ k_1 * x_1 + (1-k_1)*x_2$就是一个凸组合。

凸集有良好的运算性质：(1) 交集为凸集。(2) 和集为凸集。(3) 直和为凸集。

超平面hyperplane：$H=\{x|a^Tx=b\}(a≠0)$;

半平面halfspace：$ H^+={ x|a^Tx \geq b}(a\neq 0) \\ H^-=\{ x|a^Tx \leq b\}(a\neq 0)$;

多面体polyhedra：多个线性不等式所刻画的集合：$$\{ x|a^T_ix\leq b_i,i=1,\cdots,m\}$$
注：线性等式刻画的集合也是多面体！（可以将等式转换为两个不等式，线性规划里称为单纯形）

椭球（Ellipsoid）：

\[\{x| (x-x_c)^T P^ {-1} (x-x_c)\leq 1 \}; 其中 P为正定矩阵; 椭球的轴长为\sqrtλ_i \]

二、凸函数

设$f$为定义在区间$I$上的函数，若对$I$上的任意两点$x_1, x_2$和任意$\lambda \in (0,1) $有

\[f(\lambda x_i + (1-\lambda)x_2)\leq \lambda f(x_i)+(1-\lambda)f(x_2) \]

将"$\leq$"换成"$<$"也成立则就是严格凸函数。

性质1：设 $f:R^n–>R^1$，$C$是凸集，若$f$是凸函数，则对于任意的$β$，水平集$D_\beta = \{x|f(x) \leq \beta, x \in C\}$是凸集。

性质2 : 凸优化问题的局部极小值是全局极小值。

性质3：若$f$一阶可微，则函数$f$为凸函数当且仅当$f$的定义域$domf$为凸集，且$ \forall x,y \in domf, f(y)\geq f(x) + \triangledown f(x)^T(y-x)$。

性质4：若$f$二阶可微，则函数$f$为凸函数当且仅当$f$的定义域$domf$为凸集，且$\triangledown ^2f(x)\succeq 0$。

正定：$f(x_1,x_2,...,x_n)=x^TAx$，(所有的二次齐次式都可以写成这个形式）如果对任意的$x \neq 0均有f(x)>0$，则称$(f(x)$为正定二次型，同时称$A$为正定矩阵。

[定理] 设$f(x):{R^n} \to R$是在凸集$X \subseteq {R^n}$上二阶可微。$f(x)$是凸函数的充分必要条件是$f(x)$的二阶海塞矩阵${\nabla ^2}f(x)$是半正定的；$f(x)$是凹函数的充分必要条件是$f(x)$的二阶海塞矩阵${\nabla ^2}f(x)$是半负定的。

例1：判断下述函数的凸凹性:
(1)$f(x) = {e^x}$
(2)$f(x) = x^{\frac{1}{2}}$
(3)$f({x_1},{x_2}) = 2 + {x_1} + {x_2} - x_1^2 - x_2^2$
解：
(1) $f^{\prime \prime}(x)=e^x \geq 0$, 所以 $f(x)$ 函数是凸集 $R$ 上的的凸函数.
(2) $f^{\prime \prime}(x)=-\frac{1}{4} x^{-3 / 2} \leq 0$, 所以 $f(x)$ 函数是凸集 $R$ 上的的凹函数.
(3) $f\left(x_1, x_2\right)$ 的 Hessian 矩阵为:

\[\nabla^2 f(x)=\left[\begin{array}{cc} -2 & 0 \\ 0 & -2 \end{array}\right] \]

为了判断 $\nabla^2 f(x)$ 的半正 (负) 定性, 计算:

\[\begin{aligned} \operatorname{def}\left(\nabla^2 f(x)-\lambda I\right) & =\operatorname{def}\left[\begin{array}{cc} -2-\lambda & 0 \\ 0 & -2-\lambda \end{array}\right] \\ & =(-2-\lambda)^2=0 \end{aligned} \]

那么, $\lambda_1=\lambda_2=-2$, 所以矩阵 $\nabla^2 f(x)$ 为负定矩阵, $f(x)$ 为凸集 $R$ 上的凹函数.

三、凸规划

基本形式：

\[\begin{array}{l} \min f(x)\\ \begin{array}{*{20}{c}} {} \end{array}s.t\begin{array}{*{20}{c}} {} \end{array}{g_i}(x) \le 0\begin{array}{*{20}{c}} {}&{}&{i = 1, \cdots ,m} \end{array}\\ \begin{array}{*{20}{c}} {}&{}&{{h_j}(x) = 0} \end{array}\begin{array}{*{20}{c}} {}&{j = 1, \cdots ,l} \end{array} \end{array}\]

[定理] 上面规划是凸规划的充分条件为:(1)$f(x):C \to R$是凸函数；(2)${h_j}:{R^n} \to R,j=1,2,...,l$, 都是线性函数;（3）$g_i:{R^n} \to R,i=1,2,...,m$, 都是凸函数（$-g_i$是凹函数）。

例2：判断下述规划是否为凸规划

\[\begin{array}{l} \min f(X) = x_1^2 + x_2^2 - 4{x_1} + 4\\ \left\{ {\begin{array}{*{20}{c}} {{g_1}(X) = - {x_1} + {x_2} - 2 \le 0}\\ {{g_2}(X) = x_1^2 - {x_2} + 1 \le 0}\\ {{x_1},{x_2} \ge 0} \end{array}} \right. \end{array}\]

解：

\[{\nabla ^2}f(X) = \left[ \begin{array}{l} {\rm{ }}\frac{{{\partial ^2}f}}{{\partial x_1^2}}{\rm{ }}\frac{{{\partial ^2}f}}{{\partial {x_1}\partial {x_2}}}\\ \frac{{{\partial ^2}f}}{{\partial {x_2}\partial {x_1}}}{\rm{ }}\frac{{{\partial ^2}f}}{{\partial x_2^2}} \end{array} \right]{\rm{ }} = \left[ \begin{array}{l} {\rm{2 \quad 0}}\\ {\rm{0 \quad 2}} \end{array} \right]\]

\[\begin{array}{l} {\nabla ^2}{g_1}(X) = \left[ \begin{array}{l} {\rm{ }}\frac{{{\partial ^2}{g_1}}}{{\partial x_1^2}}{\rm{ }}\frac{{{\partial ^2}{g_1}}}{{\partial {x_1}\partial {x_2}}}\\ \frac{{{\partial ^2}{g_1}}}{{\partial {x_2}\partial {x_1}}}{\rm{ }}\frac{{{\partial ^2}{g_1}}}{{\partial x_2^2}} \end{array} \right]{\rm{ }} = \left[ \begin{array}{l} {\rm{0 \quad 0}}\\ {\rm{0 \quad 0}} \end{array} \right]{\rm{ , }}\\ {\nabla ^2}{g_2}(X) = \left[ \begin{array}{l} {\rm{ }}\frac{{{\partial ^2}{g_2}}}{{\partial x_1^2}}{\rm{ }}\frac{{{\partial ^2}{g_2}}}{{\partial {x_1}\partial {x_2}}}\\ \frac{{{\partial ^2}{g_2}}}{{\partial {x_2}\partial {x_1}}}{\rm{ }}\frac{{{\partial ^2}{g_2}}}{{\partial x_2^2}} \end{array} \right]{\rm{ }} = \left[ \begin{array}{l} {\rm{ 2 \quad 0}}\\ {\rm{ 0 \quad 0}} \end{array} \right]{\rm{ }} \end{array} \]

该问题是凸规划，有最小值$x^*=(x_1,x_2)=(1.62,3.62)^T$，$f(x^*)=13.25$，参看下图。

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

# 定义目标函数
def objective(X):
    x1, x2 = X
    return x1**2 + x2**2 - 4*x1 + 4

# 定义约束函数
def constraint1(X):
    return -X[0] + X[1] - 2

def constraint2(X):
    return X[0]**2 - X[1] + 1

# 初始猜测值
initial_guess = [0.5, 0.5]

# 定义约束
constraints = [
    {'type': 'ineq', 'fun': constraint1},
    {'type': 'ineq', 'fun': constraint2},
    {'type': 'ineq', 'fun': lambda X: X[0]},  # x1 >= 0
    {'type': 'ineq', 'fun': lambda X: X[1]}   # x2 >= 0
]

# 记录迭代点
history = []
def callback(X):
    history.append(np.copy(X))

# 进行优化
result = minimize(objective, initial_guess, method='SLSQP', constraints=constraints, callback=callback)

# 打印优化结果
print("Optimization Result:")
print(f"Success: {result.success}")
print(f"Minimum Value of the Function: {result.fun:.2f}")
print(f"Point at which Minimum Occurs: x1 = {result.x[0]:.2f}, x2 = {result.x[1]:.2f}")
print(f"Number of Iterations: {result.nit}")

# 生成等高线图
x1 = np.linspace(-1, 3, 400)
x2 = np.linspace(-1, 5, 600)
X1, X2 = np.meshgrid(x1, x2)
Z = X1**2 + X2**2 - 4*X1 + 4

plt.figure(figsize=(15, 9))
cp = plt.contour(X1, X2, Z, levels=np.logspace(-1, 3, 20), cmap='jet', linewidths=1.5)
plt.colorbar(cp)

# 绘制约束边界
x1_vals = np.linspace(0, 3, 400)
x2_g1 = x1_vals + 2
x2_g2 = x1_vals**2 + 1

plt.plot(x1_vals, x2_g1, 'r--', label='$g_1(x_1, x_2) = 0$', linewidth=1.5)
plt.plot(x1_vals, x2_g2, 'g--', label='$g_2(x_1, x_2) = 0$', linewidth=1.5)

# 绘制可行域
plt.fill_between(x1_vals, np.maximum(x2_g1, 0), 5, color='gray', alpha=0.3)

# 绘制优化路径
history = np.array(history)
plt.plot(history[:, 0], history[:, 1], 'ro-', label='Optimization Path', markersize=5, linewidth=1.5)
plt.plot(result.x[0], result.x[1], 'wo', label='Minimum', markersize=14, markeredgewidth=3, markeredgecolor='black')

# 设置图形参数
plt.title("Contour Plot with Optimization Path")
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend()
plt.xlim(-1, 3)
plt.ylim(-1, 5)
plt.grid()
plt.show()

Optimization Result:
Success: True
Minimum Value of the Function: 13.24
Point at which Minimum Occurs: x1 = 1.62, x2 = 3.62
Number of Iterations: 7

例3：判断下述规划是否为凸规划

\[min\quad ( f(x_1, x_2, x_3) = 2x_1^2 + x_2^2 +2x_3^2 +x_1 x_3 - x_1 x_2+2x_2\\ g_1(x) = x_1^2 +x_2^2-x_3 \le 0\\ g_2(x) = x_1 +x_2+2x_3 \le 16\\ g_3(x) = -x_1 -x_2+x_3 \le 0 \]

目标函数的凸性

一个函数是凸的，当且仅当其 Hessian 矩阵（即二阶偏导数矩阵）是半正定的。计算目标函数的 Hessian 矩阵：

\[ H = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \frac{\partial^2 f}{\partial x_1 \partial x_3} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \frac{\partial^2 f}{\partial x_2 \partial x_3} \\ \frac{\partial^2 f}{\partial x_3 \partial x_1} & \frac{\partial^2 f}{\partial x_3 \partial x_2} & \frac{\partial^2 f}{\partial x_3^2} \end{bmatrix} \]

计算各二阶偏导数：

\[ \frac{\partial^2 f}{\partial x_1^2} = 4, \quad \frac{\partial^2 f}{\partial x_2^2} = 2, \quad \frac{\partial^2 f}{\partial x_3^2} = 4 \\ \frac{\partial^2 f}{\partial x_1 \partial x_2} = -1, \quad \frac{\partial^2 f}{\partial x_1 \partial x_3} = 1, \quad \frac{\partial^2 f}{\partial x_2 \partial x_3} = 0 \]

因此，Hessian 矩阵为：

\[ H = \begin{bmatrix} 4 & -1 & 1 \\ -1 & 2 & 0 \\ 1 & 0 & 4 \end{bmatrix} \]

计算 Hessian 矩阵的特征值来判断其是否为半正定。使用 NumPy 进行计算：

import numpy as np
H = np.array([
    [4, -1, 1],
    [-1, 2, 0],
    [1, 0, 4]
])

# 计算特征值
eigenvalues = np.linalg.eigvals(H)
print("Hessian Matrix Eigenvalues:", eigenvalues)

如果所有特征值均为非负，则 Hessian 矩阵为半正定，即目标函数为凸函数。

约束条件的凸性
约束条件为等式约束，需要检查其是否为仿射函数（即线性函数和常数项的和）。一个仿射函数既是凸函数也是凹函数。
$ ( g_1(x) = x_1^2 + x_2^2 - x_3 )$ —— 非线性函数，不是仿射函数，因此可能不是凸函数。
$ ( g_2(x) = x_1 + x_2 + 2x_3 - 16 )$ —— 线性函数，是仿射函数，因此是凸函数。
$ ( g_3(x) = -x_1 - x_2 + x_3 ) $—— 线性函数，是仿射函数，因此是凸函数。

由于 $ g_1(x) $ 是非线性函数，我们需要检查其 Hessian 矩阵是否为半正定。计算 $ g_1(x)$的 Hessian 矩阵：

\[ H_{g1} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

这个矩阵的特征值是 $ [2, 2, 0] $，均为非负，因此 $ g_1(x)$ 也是凸函数。

结论

目标函数的 Hessian 矩阵是否为半正定需要通过特征值检查。
约束条件 $ g_1(x) $为凸函数（Hessian 矩阵半正定）。
约束条件$ g_2(x)$ 和 $ g_3(x) $ 为仿射函数，因此是凸函数。

通过上述分析，目标函数和所有约束条件均为凸函数，这意味着该非线性规划问题是一个凸规划问题。

#求解x1 = 2.67；x2 = 2.67；x3 = 5.33，最小化目标函数值: 93.33
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
from scipy.optimize import minimize

# 定义目标函数
def objective(x):
    x1, x2, x3 = x
    return 2*x1**2 + x2**2 + 2*x3**2 + x1*x3 - x1*x2 + x1 + 2*x2

# 定义约束条件
def constraint1(x):
    x1, x2, x3 = x
    return x1**2 + x2**2 - x3

def constraint2(x):
    x1, x2, x3 = x
    return x1 + x2 + 2*x3 - 16

def constraint3(x):
    x1, x2, x3 = x
    return -x1 - x2 + x3

# 初始猜测
x0 = [0, 0, 0]

# 约束条件字典
cons = [{'type': 'eq', 'fun': constraint1},
        {'type': 'eq', 'fun': constraint2},
        {'type': 'eq', 'fun': constraint3}]

# 运行优化
solution = minimize(objective, x0, constraints=cons)

# 提取优化结果
x1_opt, x2_opt, x3_opt = solution.x

# 打印结果
print("最优解:")
print("x1 =", x1_opt)
print("x2 =", x2_opt)
print("x3 =", x3_opt)
print("最小化目标函数值:", objective(solution.x))

# 准备数据以进行绘制
x1 = np.linspace(-100, 100, 200)
x2 = np.linspace(-100, 100, 200)
x1, x2 = np.meshgrid(x1, x2)
x3 = (x1**2 + x2**2)  # 从constraint1求解x3

# 计算目标函数值
h = 2*x1**2 + x2**2 + 2*x3**2 + x1*x3 - x1*x2 + x1 + 2*x2
h1 = x1**2 + x2**2 - x3
h2 = x1 + x2 + 2*x3 - 16
h3 = -x1 - x2 + x3

# 绘制图像
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# 绘制目标函数值
ax.plot_surface(x1, x2, h, color='r', alpha=0.3)

# 绘制约束条件
ax.plot_surface(x1, x2, h1, color='b', alpha=0.3)
ax.plot_surface(x1, x2, h2, color='g', alpha=0.3)
ax.plot_surface(x1, x2, h3, color='y', alpha=0.3)

# 标记最优解
ax.scatter(x1_opt, x2_opt, objective([x1_opt, x2_opt, x3_opt]), color='k', marker='o', s=100)

ax.set_xlabel('x1')
ax.set_ylabel('x2')
ax.set_zlabel('Objective function value')
plt.show()

例4：求Rosenbrock 香蕉函数$f(x, y) = (2 - x)^2 + 100(y - x^2)^2$的最小值

一阶偏导数
函数的梯度为：

\[\nabla f(x, y) = \begin{pmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{pmatrix} \]

其中：

\[\frac{\partial f}{\partial x} = 2(200x^3 - 200xy + x - 2) \quad \frac{\partial f}{\partial y} = 200(y - x^2) \]

将这些导数设为零得到临界点：

\[2(200x^3 - 200xy + x - 2) = 0 \quad 200(y - x^2) = 0 \]

解这些方程得到：

\[x = 2；\quad y = 4 \]

海森矩阵
函数的海森矩阵为：

\[\nabla^2 f(x, y) = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial xy} \\ \frac{\partial^2 f}{\partial yx} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} \]

其中：

\[ \frac{\partial^2 f}{\partial x^2} = 1200x^2 - 400y + 2 \quad \frac{\partial^2 f}{\partial xy} = -400x \\ \frac{\partial^2 f}{\partial yx} = -400x \quad \frac{\partial^2 f}{\partial y^2} = 200 \]

代入$(x, y) = (2, 4)$我们得到：

\[\nabla^2 f(x, y) = \begin{pmatrix} 3202 & -800 \ -800 & 200 \end{pmatrix} \]

判断其是否为正定矩阵
首先，我们检查海森矩阵是否对称，它是。然后我们检查它是否满足正定矩阵的任意一个定理。对于2x2矩阵，我们可以通过行列式来判断：

\[ a_{11} = 3202 \\ \left| \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array} \right| = 3202 \times 200 - (-800) \times (-800) = 400 \]

求最小值
上面求了半天只求了局部极小值。
局部极小值并不一定是最小值，极大值也不一定大于极小值，可以通过求出所有的极值，选择其中最小的就是最小值。

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

# 定义Rosenbrock's banana function
def rosenbrock_banana(x):
    return (2 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

# 定义函数以记录每次迭代的点
history = []
def callback(xk):
    history.append(np.copy(xk))

# 初始猜测值
initial_guess = [0, 0]

# 使用BFGS算法进行优化，并记录迭代过程
result = minimize(rosenbrock_banana, initial_guess, method='BFGS', callback=callback)

# 打印优化结果
print("Optimization Result:")
print(f"Success: {result.success}")
print(f"Minimum Value of the Function: {result.fun:.2f}")
print(f"Point at which Minimum Occurs: x = {result.x[0]:.2f}, y = {result.x[1]:.2f}")
print(f"Number of Iterations: {result.nit}")

# 生成函数的等高线图
x = np.linspace(-1, 3, 400)
y = np.linspace(-1, 5, 600)
X, Y = np.meshgrid(x, y)
Z = rosenbrock_banana([X, Y])

plt.figure(figsize=(10, 6))
cp = plt.contour(X, Y, Z, levels=np.logspace(-1, 3, 20), cmap='jet')
plt.colorbar(cp)
plt.plot([x[0] for x in history], [x[1] for x in history], 'ro-', label='Optimization Path', markersize=5)
plt.plot(result.x[0], result.x[1], 'wo', label='Minimum', markersize=12, markeredgewidth=2, markeredgecolor='black')
plt.title("Contour Plot of Rosenbrock's Banana Function with Optimization Path")
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.xlim(-1, 3)
plt.ylim(-1, 5)
plt.show()

Optimization Result:
Success: True
Minimum Value of the Function: 0.00
Point at which Minimum Occurs: x = 2.00, y = 4.00
Number of Iterations: 30

凸规划的可行域为凸集，其局部最优解即为全局最优解，而且其最优解的集合形成一个凸集。当凸规划的目标函数$f(x)$为严格凸函数时，其最优解必定唯一（假定最优解存在）。由此可见，凸规划是一类比较简单而又具有重要理论意义的非线性规划。

参考文献

posted @ 2023-06-08 20:01 郝hai 阅读(3181) 评论(0) 编辑收藏举报

刷新页面返回顶部

haohai9309

格物致知，知行合一！

非线性规划——凸优化、凸函数、凸规划（二）

一、凸集

二、凸函数

三、凸规划

参考文献

公告