离散的LQR优化问题一般具有如下形式:
\[\min{J=\frac{1}{2}x_N^THx_N+\frac{1}{2}\sum_{k=0}^{N-1}{[x_k^TQ_kx_k+u_k^TR_ku_k]}}\\ \begin{aligned} \text{subject to}\quad&x_{k+1}=A_kx_k+B_ku_k,\\ &x_0\quad\text{given},\\ &H=H^T\geq0,\\ &Q=Q^T\geq0,\\ &R=R^T>0, \end{aligned} \tag{1}
\]
优化的目标是已知\(x\)的情况下,通过选择控制输入\(u_0\cdots u_{N-1}\)使得代价\(J\)最小。我们将代价函数\(J\)分开来看,定义从状态\(x_i\)到\(x_N\)的部分代价函数为\(J_i\),除此之外部分的代价函数为\(J_i^{'}\),于是有:
\[\begin{aligned} J_i&=\frac{1}{2}x_N^THx_N+\frac{1}{2}\sum_{k=i}^{N-1}{[x_k^TQ_kx_k+u_k^TR_ku_k]} \quad\text{when}\quad 0 \leq i \leq N-1\\ J_N&=\frac{1}{2}x_N^THx_N \end{aligned} \tag{2}
\]
\[\begin{aligned} J_i^{'}&=\frac{1}{2}\sum_{k=0}^{i-1}{[x_k^TQ_kx_k+u_k^TR_ku_k]} \quad\text{when}\quad 1 \leq i \leq N-1\\ J_0^{'}&=0 \end{aligned} \tag{3}
\]
显然有:
\[J=J_i+J_i^{'} \tag{4}
\]
我们逐步对控制输入\(u\)进行最小化\(J\)的操作。考虑控制输入的最后一项是\(u_N\),有:
\[\underset{u_{N-1}}\min{J}=\underset{u_{N-1}}\min{\{J_{N-1}+J_{N-1}^{'}\}} \tag{5}
\]
根据定义可知,\(J_{N-1}^{'}\)与\(u_{N-1}\)无关,因为它表示的是\(u_{N-1}\)作用于系统之前的代价,于是有:
\[\begin{aligned} \underset{u_{N-1}}\min{J}&=J_{N-1}^{'}+\underset{u_{N-1}}\min{J_{N-1}}\\ &=J_{N-1}^{'}+\underset{u_{N-1}}\min{\{\frac{1}{2}[x_{N-1}^TQ_{N-1}x_{N-1}+u_{N-1}^TR_{N-1}u_{N-1}]+J_{N}\}} \end{aligned} \tag{6}
\]
接下来在看\(u_{N-2}\):
\[\begin{aligned} \underset{u_{N-2},u_{N-1}}\min{J}&=\underset{u_{N-2}}\min{\{J_{N-1}^{'}+\underset{u_{N-1}}\min{J_{N-1}}\}}\\ &=J_{N-2}^{'}+\underset{u_{N-2}}\min{\{\frac{1}{2}[x_{N-2}^TQ_{N-2}x_{N-2}+u_{N-2}^TR_{N-2}u_{N-2}]+\underset{u_{N-1}}\min{J_{N-1}}\}} \end{aligned} \tag{7}
\]
根据上述规律,我们定义:
\[\begin{aligned} \hat{J}_i=\underset{u_i\cdots u_{N-1}}\min{J_i}&=\underset{u_i}\min{\{\frac{1}{2}[x_i^TQ_ix_i+u_i^TR_iu_i]+\underset{u_{i+1}\cdots u_{N-1}}\min{J_{i+1}}\}}\\ &=\underset{u_i}\min{\{\frac{1}{2}[x_i^TQ_ix_i+u_i^TR_iu_i]+\hat{J}_{i+1}\}} \end{aligned} \tag{8}
\]
现在我们考察\(\hat{J}_{N-1}\)的具体形式:
\[\begin{aligned} \hat{J}_{N-1}=\underset{u_{N-1}}\min{J_{N-1}}&=\underset{u_{N-1}}\min{\{\frac{1}{2}[x_{N-1}^TQ_{N-1}x_{N-1}+u_{N-1}^TR_{N-1}u_{N-1}]+J_{N}\}}\\ &=\underset{u_{N-1}}\min{\{\frac{1}{2}[x_{N-1}^TQ_{N-1}x_{N-1}+u_{N-1}^TR_{N-1}u_{N-1}]+\frac{1}{2}x_N^THx_N\}} \end{aligned} \tag{9}
\]
代入\(x_N=A_{N-1}x_{N-1}+B_{N-1}u_{N-1}\):
\[\begin{aligned} J_{N-1}=\frac{1}{2}[x_{N-1}^TQ_{N-1}x_{N-1}+u_{N-1}^TR_{N-1}u_{N-1}]+\frac{1}{2}(A_{N-1}x_{N-1}+B_{N-1}u_{N-1})^TH(A_{N-1}x_{N-1}+B_{N-1}u_{N-1}) \end{aligned} \tag{10}
\]
对\(J_{N-1}\)对\(u_{N-1}\)求偏导数,并令其为0,有:
\[\frac{\partial J_{N-1}}{\partial u_{N-1}}=R_{N-1}u_{N-1}+B_{N-1}^THA_{N-1}x_{N-1}+B_{N-1}^THB_{N-1}u_{N-1}=0 \tag{11}
\]
因此可以使\(J_{N-1}\)取最小的\(u_{N-1}\)满足如下必要条件:
\[\begin{aligned} \hat{u}_{N-1}&=-(R_{N-1}+B_{N-1}^THB_{N-1})^{-1}B_{N-1}^THA_{N-1}x_{N-1}\\ &=-F_{N-1}x_{N-1} \end{aligned} \tag{12}
\]
若要使其充分,还需满足:
\[\frac{\partial^2 J_{N-1}}{\partial u_{N-1}^2}=R_{N-1}+B_{N-1}^THB_{N-1}>0 \tag{13}
\]
显然\(\hat{u}_{N-1}\)确实能使\(J_{N-1}\)取全局最小值。
将公式(12)式代入到公式(10),可以得到\(\hat{J}_{N-1}\)是一个关于\(x_{N-1}\)的二次型:
\[\begin{aligned} \hat{J}_{N-1}&=\frac{1}{2}x_{N-1}^T\{Q_{N-1}+F_{N-1}^TR_{N-1}F_{N-1}+(A_{N-1}-B_{N-1}F_{N-1})^TH(A_{N-1}-B_{N-1}F_{N-1})\}x_{N-1}\\ &=\frac{1}{2}x_{N-1}^TP_{N-1}x_{N-1} \end{aligned} \tag{14}
\]
如果我们定义初始的\(P_N=H\),则上述关系变为:
\[\begin{aligned} \hat{J}_N&=\frac{1}{2}x_N^TP_Nx_N\\ \hat{J}_{N-1}&=\frac{1}{2}x_{N-1}^TP_{N-1}x_{N-1}\\ P_{N-1}&=Q_{N-1}+F_{N-1}^TR_{N-1}F_{N-1}+(A_{N-1}-B_{N-1}F_{N-1})^TP_N(A_{N-1}-B_{N-1}F_{N-1}) \end{aligned} \tag{15}
\]
将上述关系推广到更一般的情况,则有:
\[\begin{aligned} \hat{J}_i&=\frac{1}{2}x_i^TP_ix_i\\ F_{i}&=(R_i+B_i^TP_{i+1}B_i)^{-1}B_iP_{i+1}A_i\\ P_i&=Q_i+F_i^TR_iF_i+(A_i-B_iF_i)^TP_{i+1}(A_i-B_iF_i)\\ &=Q_i+A_i^T\{P_{i+1}-P_{i+1}B_i[R_i+B_i^TP_{i+1}B_i]^{-1}B_i^TP_{i+1}\}A_i\\ \hat{u}_i&=-F_ix_i \end{aligned} \tag{16}
\]
上述关系即是一般线性离散系统的LQR控制率。
现在考虑离散线性时不变系统,系统参数不再随角标变化,又考虑到\(P\)最终趋于稳态,则有如下关系:
\[\begin{aligned}
P&=Q+A^T\{P-PB[R+B^TPB]^{-1}B^TP\}A\\
F&=(R+B^TPB)^{-1}BPA\\
\hat{u}_i&=-Fx_i
\end{aligned}
\tag{17}
\]
上式中的第一项是离散时间代数黎卡提方程 (Discrete time Algebraic Riccati Equation - DARE),第二项是状态反馈增益,第三项是最优控制率。
推导完毕。
未经授权,禁止转载