最优化算法-递推最小二乘法
Recursive Least Square(RLS)
最小二乘算法(Least Square)解决的问题是一个多元线性拟合问题:
\(\{a_1,a_2,a_3,...,a_n,b\}\), 其中\(a_i\)为自变量, \(b\)为响应值. 在线系统会不断获得新的观测值\(\{a_1^i,a_2^i,a_3^i,...,a_n^i,b^i\}\). 在线观测保存所有的观测值并使用经典最小二乘法进行求解, 需要耗费大量的计算资源与内存, 所以需要有一种递推的类似动态规划的方式来在线更新线性模型的参数.
对于线性拟合问题\(Ax=b\):
\[\min\limits_x||Ax-b||^2
\]
对应的经典最小二乘解可以写成:
\[\begin{array}{rcl}
A_0 x_0 = b_0 & & x_0=(A_0^TA_0)^{-1}A_0^Tb_0\\
\begin{bmatrix}A_0 \\ A_1\end{bmatrix}x_1 = \begin{bmatrix} b_0\\b_1 \end{bmatrix}& &
x_1=\begin{pmatrix}\begin{bmatrix}A_0 \\ A_1\end{bmatrix}^T\begin{bmatrix}A_0 \\ A_1\end{bmatrix}\end{pmatrix}^{-1}\begin{bmatrix}A_0 \\ A_1\end{bmatrix}^T\begin{bmatrix}b_0 \\ b_1\end{bmatrix}\\
\end{array}
\]
接下来需要消去\(A_0,b_0\)两个变量
\[\begin{array}{rcl} G_0=A_0^TA_0 &&& x_0=G_0^{-1}A_0^Tb_0\end{array} \\
\begin{split}
G_1&=\begin{bmatrix}A_0 \\ A_1\end{bmatrix}^T
\begin{bmatrix}A_0 \\ A_1\end{bmatrix}\\ &=G_0+A_1^TA_1 \\
\end{split}
\\
\begin{split}
\begin{bmatrix}A_0 \\ A_1\end{bmatrix}^T\begin{bmatrix}b_0 \\ b_1\end{bmatrix}
&=A_0^T(A_0x_0)+A_1^Tb_1 \\
&=G_0x_0+A_1^Tb_1 \\
&=(G_1-A_1^TA_1)x_0 +A_1^Tb_1 \\
&= G_1x_0 + A_1^T(b_1-A_1x_0)
\end{split} \\
\begin{split}
x_1 &= G_1^{-1}G_1x_0 + G_1^{-1}A_1^T(b_1-A_1x_0) \\
&= x_0 + G_1^{-1}A_1^T(b_1-A_1x_0)
\end{split} \\
x_{k+1} = x_k + G_{k+1}^{-1}A_{k+1}^T(b_{k+1}-A_{k+1}x_k)
\]
得到线性模型的参数的迭代更新关系\(x_{k+1} = x_k + G_{k+1}^{-1}A_{k+1}^T(b_{k+1}-A_{k+1}x_k)\)
但此时需要对\(G_{k+1}=\begin{bmatrix}A_k \\ A_{k+1}\end{bmatrix}^T \begin{bmatrix}A_k \\ A_{k+1}\end{bmatrix}=G_k+A_{k+1}^TA_{k+1}\\\)求逆, 可以根据Sherman-Morrison-Woodbury formula进一步优化:
\[(A+UV)^{-1}= A^{-1}-(A^{-1}U)(I+VA^{-1}U)^{-1}(VA^{-1}) \\
\begin{split}
G_{k+1}^{-1} &= (G_k+A_{k+1}^TA_{k+1})^{-1} \\
&= G_k^{-1}-G_k^{-1}A_{k+1}(I+A_{k+1}^TG_k^{-1}A_{k+1})^{-1}A_{k+1}^TG_k^{-1}\\
G_{k+1}^{-1} \equiv P_{k+1}&= P_k - P_kA_{k+1}(I+A_{k+1}^TP_kA_{k+1})^{-1}A_{k+1}^TP_k
\end{split}
\]
进一步计算\((I+A_{k+1}^TG_k^{-1}A_{k+1})^{-1}\), 考虑特殊情况\(A_{k+1}\equiv a_{k+1}^T\),即新观测量为单行的向量, \(I+A_{k+1}^TG_k^{-1}A_{k+1}\)退化成标量:
\[\begin{split}
P_{k+1} &= P_k - P_kA_{k+1}(I+A_{k+1}^TP_kA_{k+1})^{-1}A_{k+1}^TP_k \\
&= P_k - \frac{P_ka_{k+1}a_{k+1}^TP_k }{I+a_{k+1}^TP_ka_{k+1}}
\end{split}\\
\begin{split}
x_{k+1} &= x_k + G_{k+1}^{-1}A_{k+1}^T(b_{k+1}-A_{k+1}x_k) \\
&= x_k + P_{k+1}a_{k+1}(b_{k+1}-a_{k+1}^Tx_k)
\end{split}
\]
回顾以上推导, 可得如下的初始解和迭代过程
\[\begin{split}
P_0 &=(A_0^TA_0)^{-1}\\
x_0 &= P_0A_0^Tb_0 \\
P_{k+1} &= P_k - \frac{P_ka_{k+1}a_{k+1}^TP_k }{I+a_{k+1}^TP_ka_{k+1}} \\
x_{k+1} &= x_k + P_{k+1}a_{k+1}(b_{k+1}-a_{k+1}^Tx_k)
\end{split}
\]
Implementation
def RLS(A0: np.ndarray, b0: np.ndarray):
P0 = np.linalg.inv( np.dot(A0.T, A0) )
return P0, np.dot(P0, np.dot(A0.T, b0))
def RLS(Pk: np.ndarray, xk: np.ndarray, a:np.ndarray, b:np.ndarray):
P = Pk - np.dot(np.dot(Pk, a), np.dot(a.T, Pk))/(1+np.dot(np.dot(a.T, P_k), a))
x = xk + np.dot(np.dot(P, a), b - np.dot(a.T, x))
return P, x