Cholesky分解
1、为什么要进行矩阵分解
个人认为,首先,当数据量很大时,将一个矩阵分解为若干个矩阵的乘积可以大大降低存储空间;其次,可以减少真正进行问题处理时的计算量,毕竟算法扫描的元素越少完成任务的速度越快,这个时候矩阵的分解是对数据的一个预处理;再次,矩阵分解可以高效和有效的解决某些问题;最后,矩阵分解可以提高算法数值稳定性,关于这一点可以有进一步的说明,
借用一个上学时老师给的例子:
有方程组:
令 , ,
解方程组可得:
现在对b进行微小扰动:
,扰动项为:
此时相应的解为:
。
这个例子说明,当方程组常数项发生微小变动的时候会导致求出的结果差别相当大,而导致这种差别的并不是求解方法,而是方程组系数矩阵本身的问题,这会给我们解决问题带来很大危害,例如,我们在用计算机求解这类问题时难以避免在计算当中出现舍入误差,如果矩阵本身性质不好会直接导致所答非所问。
对常数向量b和矩阵A进行一个简单的扰动分析:
1)、扰动b,原方程组为:
(式子1),(,A非奇异)
扰动后为:
(式子2)
把式子1带入式子2得:,用2-范式来衡量这种变化得:,由于,于是得到:
而利用式子1同理可得,整理后得:
,可见b的扰动对解的影响由决定。
2)、扰动A,扰动后为:
(式子3),(,A非奇异)
稍微做一下变换:
把式子1带入后得到:
对两边同时取2-范式有:
于是有:
,整理一下就是:
,A的扰动对解的影响依然是由决定。
3)、对于同时扰动A和b的情况偶就不推了,最后的结果依然是,扰动对解的影响依然由决定。
定义矩阵的条件数来描述矩阵的病态程度,一般认为条件数小于100为良态,条件数在100到1000之间为中等程度的病态,条件数超过1000存在严重病态。以上面的矩阵A为例,采用2-范数表示的条件数为:,看来矩阵处于中等病态程度。
矩阵其实就是一个给定的线性变换,特征向量描述了这个线性变换的主要方向,而特征值描述了一个特征向量的长度在该线性变换下缩放的比例,有关特征值和特征向量的相关概念可查看http://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors,对开篇的例子进一步观察发现,A是个对称正定矩阵,A的特征值分别为:14.93303437 和:0.06696563,两个特征值在数量级上相差很大,这意味着b发生扰动时,向量x在这两个特征向量方向上的移动距离是相差很大的——对于对应的特征向量只需要微小的移动就可到达b的新值,而对于,由于它比起太小了,因此需要x做大幅度移动才能到达b的新值,于是悲剧就发生了……………..。
关于矩阵可以有以下各种分解方式,①矩阵的三角分解(Cholesky分解、LU分解等),②矩阵的正交三角分解(QR分解等),③矩阵的满秩分解,④矩阵的奇异值分解(SVD)(关于SVD可以查看高人LeftNotEasy的http://www.cnblogs.com/LeftNotEasy/archive/2011/01/19/svd-and-applications.html#2038925一文以及他提供的参考资料)。
再看矩阵A,它是个对称正定矩阵,对这种矩阵都可以进行Cholesky分解,也就是将矩阵A分解为:,其中为一个下三角矩阵,具体操作随后讨论,回头看方程组,它就变成了:
,
将它看成两个方程组:
和,其中:,特征值为:2.2360680和:0.4472136
此时采用2-范数表示的条件数为,显然上面这两个方程组也都是良态的且只需要存储矩阵的下三角部分即可,矩阵分解的优点可见一斑。
2、实正定Hermit矩阵的完全Cholesky分解
设矩阵A有如下形式:
借用http://en.wikipedia.org/wiki/Cholesky_decomposition中的推导:
令,在第i次迭代时有:
,其中为i-1维单位矩阵
定义矩阵:
于是要满足,就有:
这样一直迭代下去,直到,也就是有:,最后得到分解后的下三角矩阵的元素:
当然也可以是上三角矩阵,此时
所以的元素是:
一种比较直白的分解算法可以是这样的:
设有实对称矩阵:
以及向量和保存分解结果的矩阵,算法伪码描述如下:
1: for i:=1 to n do //逐行计算L的值,w向量初始状态为实hermit矩阵A的上三角部分的一行
2: w(i .. n) := a(i,i .. n);
3: for k:=1 to i-1 do
4: temp := L(k,i);
5: if(temp != 0) then
6: w(i .. n) := w(i .. n) - temp*L(k,i .. n); //计算L(i,j)公式的分子部分
7: endif;
8: endfor;
9: w(i) := sqrt(w(i)); //L矩阵对角线元素计算
10: w(i+1 .. n) := w(i+1 .. n) / w(i); //L矩阵非对角线元素计算
11:
12: L(i,i .. n) := w(i .. n); //更新L矩阵
13: w(i .. n) :=0; //清空w向量
14: endfor;
一个简单的实现如下:
1: //串行完全Chlolesky分解,时间复杂度(O(n^3))
2: double **complete_cholesky_decompose(double **A)
3: {
4: if(NULL==A)
5: return NULL;
6:
7: double **L=malloc_matrix();
8: clear_matrix(L);
9:
10: int i,j,k,m;
11:
12: double *w=malloc_vector();
13:
14: clear_vector(w);//清除向量
15:
16: for(i=0;i<LEN;i++){
17: for(m=i;m<LEN;m++){
18: w[m]=A[i][m];
19: }
20:
21: for(k=0;k<i;k++){
22: double temp=L[k][i];
23: if(temp!=0){
24: for(m=i;m<LEN;m++){
25: w[m] -= temp*L[k][m];
26: }
27: }
28: }
29:
30: w[i]=sqrt(w[i]);
31: for(m=i+1;m<LEN;m++){
32: w[m] /=w[i];
33: }
34:
35: for(m=i;m<LEN;m++){
36: L[i][m]=w[m];
37: }
38:
39: clear_vector(w);
40: }
41:
42: return L;
43: }
代码可以在这里下载到。
在实践中,完全Cholesky分解有时并不是必须的,为了提高效率或者有利于编写并行算法等原因,不完全Cholesky(ICF)有时更加实用,关于ICF有很多算法,还有待学习。
3、实用工具和网址
(1)、R
我认为R是进行关于矩阵相关操作的较好工具,它是免费的和开源的(我记的是),功能不亚于matlab,当然它能做的事情远远不止矩阵计算,还可以做分类、聚类、回归、统计分析等等等等,R可以从http://www.r-project.org/获取到,同时有linux版本和windows版本。
关于矩阵的一些简单命令我稍微列一下:
1)、创建向量
数据无规律:
1: > x=c(0,1,3,4,6,8,9)
2: > x
3: [1] 0 1 3 4 6 8 9
数据有简单规律(从1~3以0.1的间隔输出):
1: > x=seq(1,3,by=0.1)
2: > x
3: [1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0
数据有复杂规律(从0~5,每个数字重复2次):
1: > x=rep(0:5,rep(2,6))
2: > x
3: [1] 0 0 1 1 2 2 3 3 4 4 5 5
2)、创建矩阵
函数原型:
1: > matrix
2: function (data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
3: {
4: data <- as.vector(data)
5: if (missing(nrow))
6: nrow <- ceiling(length(data)/ncol)
7: else if (missing(ncol))
8: ncol <- ceiling(length(data)/nrow)
9: .Internal(matrix(data, nrow, ncol, byrow, dimnames))
10: }
11: <environment: namespace:base>
data存储矩阵数据,nrow和ncol分别为行数和列数,byrows默认为FALSE,表示矩阵数据生成时以列为主序,例如:
1: > data=c(
2: + 0.0100,0.1710,0.0967,0.1661,0.1254,0.0728,0.0928,0.1272,0.1041,0.0644,
3: + 0.1710,3.3743,2.3896,3.5556,2.6312,1.6968,1.8502,2.6637,2.5208,1.2792,
4: + 0.0967,2.3896,2.5098,3.0234,2.1745,1.6661,1.5959,2.5601,2.6507,1.0512,
5: + 0.1661,3.5556,3.0234,4.3530,3.4144,2.5287,2.3244,3.6691,3.5036,1.6727,
6: + 0.1254,2.6312,2.1745,3.4144,2.9375,2.4152,1.9473,3.0911,2.9016,1.6776,
7: + 0.0728,1.6968,1.6661,2.5287,2.4152,2.5097,1.7161,2.8719,2.6663,2.0713,
8: + 0.0928,1.8502,1.5959,2.3244,1.9473,1.7161,2.0160,3.2206,2.3529,1.8967,
9: + 0.1272,2.6637,2.5601,3.6691,3.0911,2.8719,3.2206,5.8295,4.0840,3.4258,
10: + 0.1041,2.5208,2.6507,3.5036,2.9016,2.6663,2.3529,4.0840,3.8915,2.7323,
11: + 0.0644,1.2792,1.0512,1.6727,1.6776,2.0713,1.8967,3.4258,2.7323,4.3582
12: + )
13: > a=matrix(data,nrow=10,ncol=10,byrow=TRUE)
14: > a
15: [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
16: [1,] 0.0100 0.1710 0.0967 0.1661 0.1254 0.0728 0.0928 0.1272 0.1041 0.0644
17: [2,] 0.1710 3.3743 2.3896 3.5556 2.6312 1.6968 1.8502 2.6637 2.5208 1.2792
18: [3,] 0.0967 2.3896 2.5098 3.0234 2.1745 1.6661 1.5959 2.5601 2.6507 1.0512
19: [4,] 0.1661 3.5556 3.0234 4.3530 3.4144 2.5287 2.3244 3.6691 3.5036 1.6727
20: [5,] 0.1254 2.6312 2.1745 3.4144 2.9375 2.4152 1.9473 3.0911 2.9016 1.6776
21: [6,] 0.0728 1.6968 1.6661 2.5287 2.4152 2.5097 1.7161 2.8719 2.6663 2.0713
22: [7,] 0.0928 1.8502 1.5959 2.3244 1.9473 1.7161 2.0160 3.2206 2.3529 1.8967
23: [8,] 0.1272 2.6637 2.5601 3.6691 3.0911 2.8719 3.2206 5.8295 4.0840 3.4258
24: [9,] 0.1041 2.5208 2.6507 3.5036 2.9016 2.6663 2.3529 4.0840 3.8915 2.7323
25: [10,] 0.0644 1.2792 1.0512 1.6727 1.6776 2.0713 1.8967 3.4258 2.7323 4.3582
3)、矩阵转置
1: > A=matrix(0:11,nrow=3,ncol=4,byrow=TRUE)
2: > A
3: [,1] [,2] [,3] [,4]
4: [1,] 0 1 2 3
5: [2,] 4 5 6 7
6: [3,] 8 9 10 11
7: > t(A)
8: [,1] [,2] [,3]
9: [1,] 0 4 8
10: [2,] 1 5 9
11: [3,] 2 6 10
12: [4,] 3 7 11
4)、矩阵加减
1: > A=matrix(0:11,nrow=3,ncol=4,byrow=TRUE)
2: > B=matrix(1:12,nrow=3,ncol=4,byrow=TRUE)
3: > B-A
4: [,1] [,2] [,3] [,4]
5: [1,] 1 1 1 1
6: [2,] 1 1 1 1
7: [3,] 1 1 1 1
8: > B+A
9: [,1] [,2] [,3] [,4]
10: [1,] 1 3 5 7
11: [2,] 9 11 13 15
12: [3,] 17 19 21 23
5)、矩阵相乘
1: > A=matrix(0:11,nrow=3,ncol=4,byrow=TRUE)
2: > B=matrix(1:12,nrow=4,ncol=3,byrow=TRUE)
3: > A%*%B
4: [,1] [,2] [,3]
5: [1,] 48 54 60
6: [2,] 136 158 180
7: [3,] 224 262 300
6)、取方阵对角元素
1: > A=matrix(0:8,nrow=3,ncol=3,byrow=TRUE)
2: > A
3: [,1] [,2] [,3]
4: [1,] 0 1 2
5: [2,] 3 4 5
6: [3,] 6 7 8
7: > diag(A)
8: [1] 0 4 8
7)、矩阵求逆
1: > A=matrix(c(1,2,3,0,4,5,0,0,6),nrow=3,ncol=3,byrow=TRUE)
2: > solve(A)
3: [,1] [,2] [,3]
4: [1,] 1 -0.50 -0.08333333
5: [2,] 0 0.25 -0.20833333
6: [3,] 0 0.00 0.16666667
8)、特征值与特征向量
1: > A=matrix(c(1,2,3,0,4,5,0,0,6),nrow=3,ncol=3,byrow=TRUE)
2: > eigen(A)
3: $values
4: [1] 6 4 1
5:
6: $vectors
7: [,1] [,2] [,3]
8: [1,] 0.5108407 0.5547002 1
9: [2,] 0.7981886 0.8320503 0
10: [3,] 0.3192754 0.0000000 0
11:
9)、正定hermit矩阵的完全Cholesky分解
1: > data=c(
2: + 0.0100,0.1710,0.0967,0.1661,0.1254,0.0728,0.0928,0.1272,0.1041,0.0644,
3: + 0.1710,3.3743,2.3896,3.5556,2.6312,1.6968,1.8502,2.6637,2.5208,1.2792,
4: + 0.0967,2.3896,2.5098,3.0234,2.1745,1.6661,1.5959,2.5601,2.6507,1.0512,
5: + 0.1661,3.5556,3.0234,4.3530,3.4144,2.5287,2.3244,3.6691,3.5036,1.6727,
6: + 0.1254,2.6312,2.1745,3.4144,2.9375,2.4152,1.9473,3.0911,2.9016,1.6776,
7: + 0.0728,1.6968,1.6661,2.5287,2.4152,2.5097,1.7161,2.8719,2.6663,2.0713,
8: + 0.0928,1.8502,1.5959,2.3244,1.9473,1.7161,2.0160,3.2206,2.3529,1.8967,
9: + 0.1272,2.6637,2.5601,3.6691,3.0911,2.8719,3.2206,5.8295,4.0840,3.4258,
10: + 0.1041,2.5208,2.6507,3.5036,2.9016,2.6663,2.3529,4.0840,3.8915,2.7323,
11: + 0.0644,1.2792,1.0512,1.6727,1.6776,2.0713,1.8967,3.4258,2.7323,4.3582
12: + )
13: > a=matrix(data,nrow=10,ncol=10,byrow=TRUE)
14: > chol(a)
15: [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
16: [1,] 0.1 1.7100000 0.9670000 1.6610000 1.2540000 0.7280000 0.9280000 1.2720000 1.0410000 0.6440000
17: [2,] 0.0 0.6709694 1.0969650 1.0660545 0.7256068 0.6735329 0.3924471 0.7281703 1.1039102 0.2652282
18: [3,] 0.0 0.0000000 0.6094086 0.4066049 0.2722586 0.3663913 0.4398089 0.8718268 0.7106926 0.2256384
19: [4,] 0.0 0.0000000 0.0000000 0.5406286 0.8273109 0.8369753 0.3436621 0.7871389 0.5710010 0.4226980
20: [5,] 0.0 0.0000000 0.0000000 0.0000000 0.2826847 0.7831198 0.3352446 0.2797313 0.4573777 0.9425268
21: [6,] 0.0 0.0000000 0.0000000 0.0000000 0.0000000 0.2793255 0.2322540 0.9241146 0.2450380 0.8923532
22: [7,] 0.0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.7231424 1.0953043 0.3243910 0.5908204
23: [8,] 0.0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.4119291 0.4303230 0.3608438
24: [9,] 0.0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.4454546 0.8321836
25: [10,] 0.0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.8871720
10)、矩阵奇异值分解
1: > data=c(
2: + 0.0100,0.1710,0.0967,0.1661,0.1254,0.0728,0.0928,0.1272,0.1041,0.0644,
3: + 0.1710,3.3743,2.3896,3.5556,2.6312,1.6968,1.8502,2.6637,2.5208,1.2792,
4: + 0.0967,2.3896,2.5098,3.0234,2.1745,1.6661,1.5959,2.5601,2.6507,1.0512,
5: + 0.1661,3.5556,3.0234,4.3530,3.4144,2.5287,2.3244,3.6691,3.5036,1.6727,
6: + 0.1254,2.6312,2.1745,3.4144,2.9375,2.4152,1.9473,3.0911,2.9016,1.6776,
7: + 0.0728,1.6968,1.6661,2.5287,2.4152,2.5097,1.7161,2.8719,2.6663,2.0713,
8: + 0.0928,1.8502,1.5959,2.3244,1.9473,1.7161,2.0160,3.2206,2.3529,1.8967,
9: + 0.1272,2.6637,2.5601,3.6691,3.0911,2.8719,3.2206,5.8295,4.0840,3.4258,
10: + 0.1041,2.5208,2.6507,3.5036,2.9016,2.6663,2.3529,4.0840,3.8915,2.7323,
11: + 0.0644,1.2792,1.0512,1.6727,1.6776,2.0713,1.8967,3.4258,2.7323,4.3582
12: + )
13: > a=matrix(data,nrow=10,ncol=10,byrow=TRUE)
14: > svd(a)
15: $d
16: [1] 2.419118e+01 4.155429e+00 1.305297e+00 1.042439e+00 7.271958e-01 1.724779e-01 1.188910e-01 7.550273e-02 6.811257e-04 4.025360e-04
17:
18: $u
19: [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
20: [1,] -0.01423809 -0.01770852 0.01046421 0.0440060554 -0.03708011 0.003989202 0.02212364 -0.02965763 0.609656748 -0.789301225
21: [2,] -0.30655677 -0.38844316 0.21540302 0.5860837223 -0.20831346 -0.071321462 0.09280202 -0.53705509 0.058902791 0.127479773
22: [3,] -0.27500938 -0.28653235 -0.04046986 -0.0006133085 0.62767935 -0.424262650 -0.38187184 0.16164557 0.260582126 0.163683441
23: [4,] -0.39289768 -0.37181967 0.10218229 0.0430729100 -0.04830660 0.258172491 -0.19145885 0.37788201 -0.530454225 -0.406528598
24: [5,] -0.32434239 -0.18416219 0.18854340 -0.2694300561 -0.38073924 0.273128771 0.08003428 0.37039768 0.491516620 0.384701079
25: [6,] -0.28065826 0.08302171 0.24277466 -0.6198550162 -0.27053693 -0.402153630 -0.20235690 -0.39991126 -0.144858619 -0.119997665
26: [7,] -0.26573534 0.09522360 -0.29346909 0.1374724777 -0.21528535 -0.600459122 0.52653396 0.34671820 -0.089612967 -0.053976475
27: [8,] -0.44372190 0.30810327 -0.69462776 0.0748652314 -0.12439310 0.238340862 -0.32629727 -0.18852619 0.059652991 0.047118889
28: [9,] -0.38271998 0.06649679 0.03894330 -0.2531095648 0.52173933 0.307376912 0.59639765 -0.24228533 -0.036358521 -0.033403328
29: [10,] -0.27906174 0.69225888 0.52609547 0.3276170825 0.08203487 -0.008974789 -0.14718647 0.17374370 0.008992724 0.007135959
30:
31: $v
32: [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
33: [1,] -0.01423809 -0.01770852 0.01046421 0.0440060554 -0.03708011 0.003989202 0.02212364 -0.02965763 0.609656748 -0.789301225
34: [2,] -0.30655677 -0.38844316 0.21540302 0.5860837223 -0.20831346 -0.071321462 0.09280202 -0.53705509 0.058902791 0.127479773
35: [3,] -0.27500938 -0.28653235 -0.04046986 -0.0006133085 0.62767935 -0.424262650 -0.38187184 0.16164557 0.260582126 0.163683441
36: [4,] -0.39289768 -0.37181967 0.10218229 0.0430729100 -0.04830660 0.258172491 -0.19145885 0.37788201 -0.530454225 -0.406528598
37: [5,] -0.32434239 -0.18416219 0.18854340 -0.2694300561 -0.38073924 0.273128771 0.08003428 0.37039768 0.491516620 0.384701079
38: [6,] -0.28065826 0.08302171 0.24277466 -0.6198550162 -0.27053693 -0.402153630 -0.20235690 -0.39991126 -0.144858619 -0.119997665
39: [7,] -0.26573534 0.09522360 -0.29346909 0.1374724777 -0.21528535 -0.600459122 0.52653396 0.34671820 -0.089612967 -0.053976475
40: [8,] -0.44372190 0.30810327 -0.69462776 0.0748652314 -0.12439310 0.238340862 -0.32629727 -0.18852619 0.059652991 0.047118889
41: [9,] -0.38271998 0.06649679 0.03894330 -0.2531095648 0.52173933 0.307376912 0.59639765 -0.24228533 -0.036358521 -0.033403328
42: [10,] -0.27906174 0.69225888 0.52609547 0.3276170825 0.08203487 -0.008974789 -0.14718647 0.17374370 0.008992724 0.007135959
11)、矩阵QR分解
1: > A=matrix(1:12,3,4)
2: > qr(A)
3: $qr
4: [,1] [,2] [,3] [,4]
5: [1,] -3.7416574 -8.552360 -1.336306e+01 -1.817376e+01
6: [2,] 0.5345225 1.963961 3.927922e+00 5.891883e+00
7: [3,] 0.8017837 0.988693 3.443426e-16 -3.439089e-16
8:
9: $rank
10: [1] 2
11:
12: $qraux
13: [1] 1.267261e+00 1.149954e+00 3.443426e-16 3.439089e-16
14:
15: $pivot
16: [1] 1 2 3 4
17:
18: attr(,"class")
19: [1] "qr"
(2)、找论文的好去处
1)、http://www.pdfgratis.com/一个非常强大的网站,能下载到各种论文;
2)、http://jeffhuang.com/best_paper_awards.html
(3)、编写latex公式的工具
最初是在小桥流水的博客上发现的,地址如下:http://www.cnblogs.com/youwang/archive/2010/04/04/1704099.html,对他的插件做了一点修改,增加的功能是:可以还原公式,例如:用该工具写公式, ,选中该公式,然后单击插件,可以看到公式的latex代码,源码可以在这里下载到。
posted on 2011-03-22 15:28 Leo Zhang 阅读(21630) 评论(21) 编辑 收藏 举报