3. The Gauss-Newton Method and the Levenberg-Marquardt Method
Below is the function to be minimized over the discrete sample set; the Gauss-Newton algorithm finds its minimum iteratively:

S(\beta) = \sum_{i=1}^{m} r_i^2 = \sum_{i=1}^{m} \left( y_i - f(x_i, \beta) \right)^2
Following the Gauss-Newton algorithm, for the line function f(x_i, \beta) = a x_i + b, \beta is the parameter vector [a, b], and each iteration updates it by:

\beta^{(s+1)} = \beta^{(s)} + \delta, \qquad \delta = \left( J_f^\top J_f \right)^{-1} J_f^\top r
Here \delta is the increment applied to the parameters a and b, and J_f is the Jacobian of f(x_i, \beta), i.e. the matrix of partial derivatives of f with respect to the parameters; for the line model its i-th row is [x_i, 1].
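Expanding the normal equations for the line model makes the correspondence with the code below explicit; the arrays tj and beta in the code accumulate exactly these two quantities:

J_f^\top J_f = \begin{bmatrix} \sum_i x_i^2 & \sum_i x_i \\ \sum_i x_i & m \end{bmatrix}, \qquad J_f^\top r = \begin{bmatrix} \sum_i x_i (y_i - a x_i - b) \\ \sum_i (y_i - a x_i - b) \end{bmatrix}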
The Levenberg-Marquardt method adds a damping factor \lambda on the diagonal of J_f^\top J_f:

\left( J_f^\top J_f + \lambda \operatorname{diag}(J_f^\top J_f) \right) \delta = J_f^\top r

(the code below uses Marquardt's scaling, which multiplies each diagonal entry by (1 + \lambda)).
In some cases the inverse need not be formed explicitly: since the matrix is symmetric positive definite, the system can be solved with a Cholesky decomposition.
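As a minimal sketch of that idea (not part of the original post): for the 2x2 normal equations A δ = b with A = J_f^\top J_f symmetric positive definite, the Cholesky factor A = L Lᵀ can be written in closed form and the system solved by forward and back substitution, with no explicit inverse:

// Sketch: solve the 2x2 SPD system A * delta = b via Cholesky (A = L * L^T).
public static double[] SolveCholesky2x2(double[,] A, double[] b)
{
    // Factor A = L * L^T, L lower triangular.
    double l00 = Math.Sqrt(A[0, 0]);
    double l10 = A[1, 0] / l00;
    double l11 = Math.Sqrt(A[1, 1] - l10 * l10);

    // Forward substitution: L * z = b.
    double z0 = b[0] / l00;
    double z1 = (b[1] - l10 * z0) / l11;

    // Back substitution: L^T * delta = z.
    double d1 = z1 / l11;
    double d0 = (z0 - l10 * d1) / l00;
    return new double[] { d0, d1 };
}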
// Fit y = a*x + b; Point is assumed to be System.Drawing.Point (integer
// coordinates), hence the casts to double.
public static double[] GaussNewton(List<Point> points)
{
    double[] a = new double[] { 100, -100 };   // initial guess for [a, b]

    for (int iter = 0; iter < 300; iter++)
    {
        // Accumulate Jf^T * Jf (tj) and Jf^T * r (beta); both must be
        // rebuilt from zero on every iteration.
        double[,] tj = new double[2, 2];
        double[] beta = new double[2];
        double chi2 = 0;
        for (int i = 0; i < points.Count; i++)
        {
            tj[0, 0] += (double)(points[i].X * points[i].X);
            tj[0, 1] += (double)(points[i].X);
            tj[1, 0] += (double)(points[i].X);
            tj[1, 1] += 1.0;
            double dy = points[i].Y - points[i].X * a[0] - a[1];   // residual
            chi2 += dy * dy;
            beta[0] += dy * points[i].X;
            beta[1] += dy;
        }

        // Invert the 2x2 matrix directly via its determinant.
        double jm = tj[0, 0] * tj[1, 1] - tj[0, 1] * tj[1, 0];
        double[,] itj = new double[2, 2];
        itj[0, 0] = tj[1, 1] / jm;
        itj[0, 1] = -tj[1, 0] / jm;
        itj[1, 0] = -tj[0, 1] / jm;
        itj[1, 1] = tj[0, 0] / jm;

        // Gauss-Newton step: delta = (Jf^T Jf)^-1 * Jf^T r.
        double[] da = new double[2];
        da[0] = itj[0, 0] * beta[0] + itj[0, 1] * beta[1];
        da[1] = itj[1, 0] * beta[0] + itj[1, 1] * beta[1];

        // Candidate parameters; a fresh array each iteration so accepting
        // the step does not alias the working estimate.
        double[] ia = new double[2];
        ia[0] = a[0] + da[0];
        ia[1] = a[1] + da[1];

        // Accept the step only if it reduces the sum of squared residuals.
        double incrementedChi2 = 0;
        for (int i = 0; i < points.Count; i++)
        {
            double dy = points[i].Y - points[i].X * ia[0] - ia[1];
            incrementedChi2 += dy * dy;
        }
        if (incrementedChi2 < chi2)
        {
            a = ia;
        }
    }
    return a;
}
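A minimal usage sketch, with made-up sample points lying on y = 2x + 1; since this model is linear in a and b, Gauss-Newton converges in a single step here:

// Hypothetical sample data for illustration.
List<Point> points = new List<Point>
{
    new Point(0, 1), new Point(1, 3), new Point(2, 5),
    new Point(3, 7), new Point(4, 9)
};
double[] fit = GaussNewton(points);
Console.WriteLine($"a = {fit[0]}, b = {fit[1]}");   // expect a = 2, b = 1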
public static double[] LevenbergMarquardt(List<Point> points, double lambda)
{
    double[] a = new double[] { 100, -100 };   // initial guess for [a, b]

    for (int iter = 0; iter < 300; iter++)
    {
        // Rebuild Jf^T * Jf (tj) and Jf^T * r (beta) from zero each iteration.
        double[,] tj = new double[2, 2];
        double[] beta = new double[2];
        double chi2 = 0;
        for (int i = 0; i < points.Count; i++)
        {
            tj[0, 0] += (double)(points[i].X * points[i].X);
            tj[0, 1] += (double)(points[i].X);
            tj[1, 0] += (double)(points[i].X);
            tj[1, 1] += 1.0;
            double dy = points[i].Y - points[i].X * a[0] - a[1];   // residual
            chi2 += dy * dy;
            beta[0] += dy * points[i].X;
            beta[1] += dy;
        }

        // Marquardt damping: scale the diagonal by (1 + lambda).
        tj[0, 0] *= (1.0 + lambda);
        tj[1, 1] *= (1.0 + lambda);

        // Invert the damped 2x2 matrix via its determinant.
        double jm = tj[0, 0] * tj[1, 1] - tj[0, 1] * tj[1, 0];
        double[,] itj = new double[2, 2];
        itj[0, 0] = tj[1, 1] / jm;
        itj[0, 1] = -tj[1, 0] / jm;
        itj[1, 0] = -tj[0, 1] / jm;
        itj[1, 1] = tj[0, 0] / jm;

        // Damped step: delta = (Jf^T Jf + lambda * diag)^-1 * Jf^T r.
        double[] da = new double[2];
        da[0] = itj[0, 0] * beta[0] + itj[0, 1] * beta[1];
        da[1] = itj[1, 0] * beta[0] + itj[1, 1] * beta[1];

        double[] ia = new double[2];   // candidate parameters, fresh array
        ia[0] = a[0] + da[0];
        ia[1] = a[1] + da[1];

        double incrementedChi2 = 0;
        for (int i = 0; i < points.Count; i++)
        {
            double dy = points[i].Y - points[i].X * ia[0] - ia[1];
            incrementedChi2 += dy * dy;
        }

        // If the step failed, increase damping (closer to gradient descent);
        // if it succeeded, accept it and decrease damping (closer to Gauss-Newton).
        if (incrementedChi2 >= chi2)
        {
            lambda *= 10;
        }
        else
        {
            lambda /= 10;
            a = ia;
        }
    }
    return a;
}
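The same sample points can be fed to the damped version; the starting λ below is an illustrative value, not one from the original post. Each failed step multiplies λ by 10, pushing the update toward small gradient-descent-like steps; each successful step divides it by 10, letting the method approach pure Gauss-Newton near the minimum.

// Hypothetical usage with an illustrative initial damping value.
double[] fit = LevenbergMarquardt(points, 0.001);
Console.WriteLine($"a = {fit[0]}, b = {fit[1]}");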
Summary: for fitting a straight line, the simple method is ordinary least squares; gradient descent, Gauss-Newton, and Levenberg-Marquardt are mainly intended for nonlinear systems and are applied to a linear problem here only to make their mechanics easy to follow. These algorithms are foundational for the neural networks, machine learning, and machine vision topics to come.
See the Wikipedia entries for more detail: gradient descent, the Gauss-Newton algorithm, and the Levenberg-Marquardt algorithm.