(Simulation) Iterative Methods and Multigrid
First, the usual note:
Reposting without permission is prohibited (to keep copies from drifting onto aggregator sites such as Manew).
Bilibili: Heskey0
Part 1: Iterative Methods
Reference: "Iterative Methods for Sparse Linear Systems" (Yousef Saad)
Pre. The Essence of Linear Algebra
Determinant: the factor by which the transformation scales areas (volumes).
Eigenvector: a vector whose direction is unchanged by the matrix transformation (it is only scaled; the scale factor is the eigenvalue).
Gaussian Elimination: can be understood as row operations on the matrix.
Matrix Polynomial: $p(A) = a_0 I + a_1 A + a_2 A^2 + \dots + a_n A^n$
Frobenius norm (F-norm): a matrix norm.
linear span: the span of a vector set S can be defined as the set of all finite linear combinations of elements of S.
M-matrix: a real square matrix $A = [a_{ij}]$ whose off-diagonal entries are all nonpositive ($a_{ij} \le 0$ for $i \ne j$) and whose principal minors are all positive is called a Minkowski matrix, or M-matrix for short.
Cholesky factorization: if $A$ is an SPD $n \times n$ matrix, it can be factored as $A = LL^T$,
where $L$ is a lower triangular matrix.
Chapter 1. Background in Linear Algebra
1.2 Square Matrices and Eigenvalues
spectral radius:The maximum modulus of the eigenvalues
- The iteration converges when the spectral radius of B is less than 1, but convergence becomes very slow as the spectral radius approaches 1.
- The spectral radius is considered to be an asymptotic measure of convergence because it predicts the worst-case error reduction over many iterations.
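A quick NumPy check of this behavior (a sketch; the 1-D Poisson matrix used here is a hypothetical example): the Jacobi iteration matrix $B = I - D^{-1}A$ has spectral radius just below 1, which is why plain Jacobi converges so slowly on such problems.

```python
import numpy as np

# Hypothetical example: 1-D Poisson matrix; Jacobi iteration matrix B = I - D^{-1} A.
n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
B = np.eye(n) - np.diag(1.0 / np.diag(A)) @ A

rho = max(abs(np.linalg.eigvals(B)))
print(f"spectral radius = {rho:.4f}")  # just below 1 => slow asymptotic convergence
```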
1.3 Types of Matrices
nonsingular matrix: a matrix whose determinant is nonzero, i.e., an invertible matrix.
1.5 Matrix Norm
[Definition]
A matrix norm can be induced from vector norms: $\|A\|_{p,q} = \max_{x \ne 0} \|Ax\|_q / \|x\|_p$,
where $p, q = 1, 2, \infty$.
When $q = p$, this is called the p-norm.
[Properties]
- $\|AB\|_p \le \|A\|_p \|B\|_p$
- The spectral radius never exceeds a matrix norm: $\rho(A) \le \|A\|$
1.6 Subspaces, Range, and Kernel
vector subspace: a nonempty subset W of a vector space V that is closed under the addition and scalar multiplication of V.
Two important vector subspaces: range and kernel.
range: the image space $\text{Ran}(A) = \{Ax \mid x \in \mathbb{C}^m\}$,
where $A$ is a matrix in $\mathbb{C}^{n \times m}$.
kernel/null space: $\text{Null}(A) = \{x \in \mathbb{C}^m \mid Ax = 0\}$.
1.7 Orthogonal Vectors and Subspaces
Gram-Schmidt: the Gram-Schmidt process turns an ordinary basis into a mutually orthogonal one.
QR decomposition: factors a matrix as $A = QR$, with $Q$ orthogonal and $R$ upper triangular.
Householder: the Householder transformation is an orthogonal transformation that maps a given n-dimensional vector x to any n-dimensional vector y of the same norm.
1.8 Canonical Forms of Matrices
Two matrices $A$ and $B$ are said to be similar if there is a nonsingular matrix $X$ such that $A = XBX^{-1}$.
The mapping $B \to A$ is called a similarity transformation.
1.9 Normal and Hermitian Matrices
Normal Matrix: satisfies $A^H A = A A^H$.
Hermitian Matrix: complex conjugate-symmetric matrix (the entry in row i, column j is the conjugate of the entry in row j, column i):
$a_{i,j} = \bar{a}_{j,i}$, written $A = A^H$.
1.11 Positive-Definite Matrices
positive definite matrix: if $x^T M x > 0$ for every nonzero $x$, then $M$ is called positive definite and $f(x) = x^T M x$ a positive definite quadratic form.
Theorem:
(1) A positive definite matrix is invertible.
Equivalent characterizations (where $'$ denotes transpose):
(1) A is positive definite;
(2) every leading principal minor of A is positive;
(3) every principal minor of A is positive;
(4) every eigenvalue of A is positive;
(5) there exists a real invertible matrix C such that $A = C'C$;
(6) there exists a real $m \times n$ matrix B of rank n such that $A = B'B$;
(7) there exists a real triangular matrix R with all-positive diagonal entries such that $A = R'R$.
1.12 Projection Operators
projection: a projector $P$ is any linear mapping from $\mathbb{C}^n$ to itself which is idempotent.
Orthogonal projection: a projection whose range U and null space W are mutually orthogonal subspaces.
projection operator: an idempotent bounded linear operator on a normed linear space $X$; if a bounded linear operator $P$ on $X$ satisfies $P^2 = P$, then $P$ is called a projection operator.
define: $\text{Ran}(P) = M$ and $\text{Null}(P) = S$.
In fact, an arbitrary projector $P$ can be entirely determined by two subspaces:
- the range $M$ of $P$;
- its null space $S$, which is also the range of $I - P$, i.e., $S = \text{Null}(P) = \text{Ran}(I - P)$.
For any $x$, the vector $Px$ satisfies the conditions $Px \in M$ and $x - Px \in S$.
The linear mapping $P$ is said to project $x$ onto $M$ and along (or parallel to) the subspace $S$.
define: $L = S^{\perp}$ and $u = Px$; then $u \in M$ and $x - u \perp L$.
These equations define a projector $P$ onto $M$ and orthogonal to the subspace $L$.
Orthogonal Projector: the case where L equals M.
Theorem:
- Let $P$ be the orthogonal projector onto a subspace $M$. Then for any vector $x$ in $\mathbb{C}^n$: $\min_{y \in M} \|x - y\|_2 = \|x - Px\|_2$
1.13 Basic Concepts in Linear Systems
1.13.1 Existence of a Solution
Possible cases for the solution:
- Case 1: $A$ is nonsingular (invertible)
- unique solution, given by $x = A^{-1}b$
- Case 2: $A$ is singular and $b \in \text{Ran}(A)$
- infinitely many solutions
- Case 3: $A$ is singular and $b \notin \text{Ran}(A)$
- no solution
1.13.2 Perturbation Analysis
condition number: $\kappa(A) = \|A\|\,\|A^{-1}\|$.
The condition number of an SPD (Symmetric Positive Definite) matrix:
the largest eigenvalue divided by the smallest eigenvalue.
In addition, small eigenvalues do not always give a good indication of poor conditioning. Indeed, a matrix can have all its eigenvalues equal to one yet be poorly conditioned.
- The condition number measures how sensitive the solution of the linear system Ax=b is to errors or uncertainty in b; the smaller the condition number, the faster the convergence.
Small eigenvalues by themselves do not imply that the linear system is poorly conditioned.
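A small NumPy sketch of both points (my own example, not from the book): a unit upper triangular matrix has all eigenvalues equal to one yet a huge condition number, and for an SPD matrix the 2-norm condition number equals $\lambda_{\max}/\lambda_{\min}$.

```python
import numpy as np

n = 20
# All eigenvalues are 1 (unit upper triangular), yet cond(T) is huge:
T = np.triu(np.ones((n, n)))
print(np.linalg.eigvals(T).real.round(3))     # all ones
print(f"cond(T) = {np.linalg.cond(T):.3e}")   # very large

# For an SPD matrix, the 2-norm condition number is lambda_max / lambda_min:
M = T @ T.T                                   # SPD by construction
w = np.linalg.eigvalsh(M)                     # eigenvalues in ascending order
print(np.isclose(np.linalg.cond(M), w[-1] / w[0]))  # True
```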
Chapter 2. Discretization of PDEs
PDE is the biggest source of sparse matrix problems. The typical way to solve such equations is to discretize them, i.e. to approximate them by equations that involve a finite number of unknowns.
This chapter introduces three ways to discretize a PDE:
- The simplest method uses finite difference approximations for the partial differential operators.
- The Finite Element Method replaces the original function by a function which has some degree of smoothness over the global domain, but which is piecewise polynomial on simple cells, such as small triangles or rectangles.
- Between these two methods, there are a few conservative schemes called Finite Volume Methods, which attempt to emulate the continuous conservation laws of physics.
2.1 Partial Differential Equations
2.1.1 Elliptic Operators
Poisson's equation: the PDE most commonly encountered in engineering,
for $x = (x_1, x_2)$ a vector.
Domain $\Omega$: in Poisson's equation, the bounded open domain to which $x$ belongs.
$\vec{n}$: unit vector, normal to the boundary $\Gamma$ and directed outwards.
Boundary conditions:
- Dirichlet condition: $u(x) = \phi(x)$
- Neumann condition: $\frac{\partial u}{\partial \vec{n}} = 0$, a special case of the Cauchy condition ($\gamma = \alpha = 0$)
- Cauchy condition: $\frac{\partial u}{\partial \vec{n}} + \alpha(x)u(x) = \gamma(x)$
Laplace equation:
its solutions are called harmonic functions.
Laplacean operator
2.1.2 The Convection Diffusion Equation
Many physical problems involve a combination of “diffusion” and “convection” phenomena. Such phenomena are modeled by the convection-diffusion equation:
2.2 Finite Difference Methods
The finite difference method is based on local approximations of the partial derivatives in a Partial Differential Equation, which are derived by low order Taylor series expansions.
2.2.1 Basic Approximation
- Forward difference (first-order Taylor expansion):
$\frac{\partial q}{\partial x} \approx \frac{q_{i+1} - q_i}{\Delta x} = \frac{u(x+h) - u(x)}{h}$
- Backward difference
- Central difference (unbiased, higher accuracy; obtained from two second-order Taylor expansions):
$\frac{\partial q}{\partial x} \approx \frac{q_{i+1} - q_{i-1}}{2\Delta x}$
Improvement (staggered grid):
$\frac{\partial q}{\partial x} \approx \frac{q_{i+1/2} - q_{i-1/2}}{\Delta x}$
2.2.2 Difference Schemes for the Laplacean Operator
When $h_1 = h_2 = h$, the 5-point scheme reads $\Delta u \approx \frac{1}{h^2}\big(u(x_1+h,x_2) + u(x_1-h,x_2) + u(x_1,x_2+h) + u(x_1,x_2-h) - 4u(x_1,x_2)\big)$.
Mesh size: the grid spacings $h, h_1, h_2$.
2.2.3 Finite Differences for 1-D Problems
Consider the 1-D equation:
The interval $[0,1]$ can be discretized uniformly by taking the $n+2$ points $x_i = i \times h$, $i = 0, \dots, n+1$,
where $h = 1/(n+1)$.
Because of the Dirichlet boundary conditions, the values $u(x_0)$ and $u(x_{n+1})$ are known. At every other point, an approximation $u_i$ is sought for the exact solution $u(x_i)$.
If the centered difference approximation is used, then:
in which $f_i = f(x_i)$.
For $i = 1$ and $i = n$, $u_0 = u_{n+1} = 0$ in this case. Thus, for $n = 6$, the linear system obtained is of the form
where
The larger $n$ is, the finer the grid; $A$ grows accordingly, and the solution vector $x$ has higher dimension.
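A sketch of this construction with SciPy (assuming a placeholder source term $f \equiv 1$, for which the exact solution of $-u'' = 1$, $u(0) = u(1) = 0$ is $u(x) = x(1-x)/2$):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 6
h = 1.0 / (n + 1)
# Tridiagonal matrix from the centered difference for -u'' = f with Dirichlet BCs.
A = sp.diags([-1, 2, -1], offsets=[-1, 0, 1], shape=(n, n), format="csr") / h**2

x = np.linspace(h, 1 - h, n)   # interior points x_1 .. x_n
f = np.ones_like(x)            # placeholder source term f(x) = 1
u = spla.spsolve(A, f)
print(u)                       # matches u(x) = x(1-x)/2 at the nodes
```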
2.2.5 Finite Differences for 2-D Problems
Consider the 2-D equation:
where $\Omega$ is now the rectangle $(0, l_1) \times (0, l_2)$ and $\Gamma$ its boundary.
As in Section 2.2.3, we obtain a two-dimensional grid (whose resolution is determined by the number of discretization points). Because of the Dirichlet boundary condition (the boundary values are known to be 0), only the interior points are considered (the points on the grid boundary are ignored).
the linear system obtained is of the form ($7 \times 5$ finite difference):
with
grid mesh and matrix:
My own understanding: the matrix splits into blocks ($i \times j = 3 \times 3$)
- within block-row $i$, the $j$-th block corresponds to row $j$ of the grid mesh (the mesh has 3 rows, so each block-row of the matrix has 3 blocks)
- row $i$ of the matrix corresponds to the grid point labeled $i$
- some $-1$ entries are missing from the matrix because the corresponding points sit on the boundary of the grid mesh
Irregular grids:
With the regular grid understood, consider the irregular grid in the figure above and walk through the following (specific to this example):
- the vector $u_1, u_2, \dots, u_{48}$ is ordered by the labels in the grid mesh
- the product of the first matrix row with the vector: $4u_1 - u_2 - u_4$
- follow the vertex-based domain decomposition of the grid mesh
- $u_1, \dots, u_{16}$ belong to one domain, corresponding to rows 1-16 of the matrix $A$
- domain decomposition thus amounts to a row-wise partitioning of the matrix $A$
- if the boundary conditions (i.e., the interfaces) between subdomains are ignored, $A$ can be split along the diagonal into several matrices (figure below)
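One common way to assemble such a 2-D matrix programmatically (a sketch using Kronecker products of the 1-D operator; this construction is standard but not given in the book):

```python
import scipy.sparse as sp

def laplacian_2d(nx, ny):
    """5-point 2-D Laplacian on an nx-by-ny interior grid (unit mesh size)."""
    def lap1d(n):
        return sp.diags([-1, 2, -1], offsets=[-1, 0, 1], shape=(n, n))
    Ix, Iy = sp.identity(nx), sp.identity(ny)
    # Row-by-row (natural) ordering of grid points gives the block structure above.
    return sp.kron(Iy, lap1d(nx)) + sp.kron(lap1d(ny), Ix)

A = laplacian_2d(7, 5)   # the 7x5 grid mentioned above
print(A.shape)           # (35, 35)
```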
2.3 Finite Element Method
FEM belongs to the family of Galerkin methods.
Typical steps:
- Convert strong-form PDEs to weak forms, using a test function $w$.
- Integrate by parts to redistribute gradient operators.
- Use the divergence theorem to simplify equations and enforce Neumann boundary conditions (BCs).
- Discretization (build the stiffness matrix and right-hand side).
- Solve the (discrete) linear system.
2.3.1 Notation
Consider the 2-D equation:
where $\Omega$ is now the rectangle $(0, l_1) \times (0, l_2)$ and $\Gamma$ its boundary
The Green's formula:
where $\vec{v} = (v_1, v_2)$ is a vector-valued function.
To solve this problem approximately, it is necessary to:
- take approximations to the unknown function $u$
- translate the equations into a system which can be solved numerically
Let us define:
2.3.2 FEM Derivation
The original domain is approximated by the union $\Omega_h$ of $m$ triangles $K_i$.
Then the finite dimensional space $V_h$ is defined as the space of all functions which are piecewise linear and continuous on the polygonal region $\Omega_h$, and which vanish on the boundary $\Gamma$:
Here, $\phi_{|X}$ represents the restriction of the function $\phi$ to the subset X. If $x_j, j = 1, \dots, n$ are the nodes of the triangulation, then a function $\phi_j$ in $V_h$ can be associated with each node $x_j$, so that the family of functions $\phi_j$ satisfies the following conditions:
In addition, the $\phi_i$'s form a basis of the space $V_h$. Each function of $V_h$ can be expressed as:
The finite element approximation consists of writing the Galerkin condition for functions in $V_h$. This defines the problem:
Writing the desired solution $u$ in the basis $\{\phi_i\}$ as:
and substituting into $a(u,v) = (f,v)$ above gives:
where:
$\alpha_{ij} = a(\phi_i, \phi_j), \quad \beta_i = (f, \phi_i)$
The above equations form a linear system $Ax = b$, in which the coefficients of $A$ are the $\alpha_{ij}$'s and those of $b$ are the $\beta_i$'s. In addition, $A$ is an SPD matrix ($\int_{\Omega} \nabla\phi_i \cdot \nabla\phi_j\,dx = \int_{\Omega} \nabla\phi_j \cdot \nabla\phi_i\,dx$).
In practice, the matrix is built by summing up the contributions of all triangles by applying the formula:
Thus, a triangle contributes nonzero values to its three vertices from the above formula. The $3 \times 3$ matrix
associated with the triangle $K(i,j,k)$ with vertices $i, j, k$ is called an element stiffness matrix. In order to form the matrix $A$, it is necessary to sum up all the contributions $a_K(\phi_k, \phi_m)$ to position $(k, m)$ of the matrix. This process is called the assembly process. In the assembly, the matrix is computed as:
2.3.3 Intuition for FEM
The four elements are numbered from bottom to top as indicated by the labels located at their centers. There are six nodes in this mesh and their labeling is indicated in the circled numbers. The four matrices $A^{[e]}$ associated with these elements are shown in Figure 2.9 (below). Thus, the first element will contribute to nodes 1, 2, 3, the second to nodes 2, 3, 5, the third to nodes 2, 4, 5, and the fourth to nodes 4, 5, 6:
2.4 Mesh Generation and Refinement
2.5 Finite Volume Method
Chapter 3. Sparse Matrix
Standard discretizations of PDEs typically lead to large, sparse matrices. Some sparse matrix techniques begin with the idea that the zero elements need not be stored. This chapter gives an overview of sparse matrices, their properties, their representations, and the data structures used to store them.
3.1 Introduction
two broad types of matrices: structured and unstructured
3.2 Sparse Matrix Storage Formats
coo:
Advantages of coo_matrix:
- fast conversion between sparse formats (tobsr(), tocsr(), tocsc(), todia(), todok(), tolil())
- duplicate entries are allowed (they are summed automatically during format conversion)
- fast conversion to/from the CSR/CSC formats
Disadvantages of coo_matrix:
- does not directly support arithmetic operations
csr (compressed sparse row):
Advantages of csr_matrix:
- efficient arithmetic operations (CSR + CSR, CSR * CSR, etc.)
- efficient row slicing
- fast matrix-vector products
Disadvantages of csr_matrix:
- column slicing is slow (consider csc_matrix)
- changing the sparsity structure is slow (consider lil_matrix or dok_matrix)
csc (compressed sparse column):
- same advantages and disadvantages as csr_matrix, with rows and columns exchanged
LiL (List of Lists format):
Advantages of the LIL format:
- flexible slicing (row slicing is efficient; column slicing is slow)
- efficient conversion between sparse formats (tobsr(), tocsr(), tocsc(), todia(), todok(), tolil())
Disadvantages of the LIL format:
- slow addition (consider CSR or CSC)
- slow column slicing (consider CSC)
- slow matrix multiplication (consider CSR or CSC)
dok (Dictionary Of Keys based sparse matrix):
a dictionary-based sparse storage similar to a COO matrix: each key is a tuple (row, column) giving the coordinates of a nonzero element, and the value is the element itself. A dok_matrix is very well suited to incremental construction of a sparse matrix, and once built it converts quickly to coo_matrix.
dia (Sparse matrix with DIAgonal storage)
bsr (Block Sparse Row matrix):
bsr_matrix supports arithmetic operations: addition, subtraction, multiplication, division, and matrix powers. For many sparse arithmetic operations, BSR is more efficient than CSR and CSC.
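A short SciPy sketch of the typical workflow these formats suggest: assemble in COO from triplets, then convert to CSR for arithmetic:

```python
import numpy as np
import scipy.sparse as sp

# Assemble in COO from (row, col, value) triplets; duplicates are summed on conversion.
rows = np.array([0, 0, 1, 2, 2, 2])
cols = np.array([0, 2, 1, 0, 1, 2])
vals = np.array([4.0, -1.0, 4.0, -1.0, -1.0, 4.0])
A_coo = sp.coo_matrix((vals, (rows, cols)), shape=(3, 3))

A_csr = A_coo.tocsr()           # fast conversion; efficient products and row slicing
y = A_csr @ np.ones(3)          # matrix-vector product
print(y, A_csr.getrow(1).toarray())
```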
Chapter 4. Basic Iterative Methods
4.1 Point Relaxation Scheme
solving $Ax = b$.
We split the matrix $A$ in the form $A = D - L - U$,
where
- $D$ is the diagonal of $A$
- $L$ and $U$ are the (strictly) lower and upper triangular parts of $A$
Jacobi method
Damped Jacobi (weighted Jacobi): $\omega = \frac{2}{3}$ is the recommended choice.
Gauss-Seidel
computes the components of $x^{(k+1)}$ one after another, using each new value as soon as it is available.
Successive Over-Relaxation (SOR)
When $\omega = 1$:
Damped Jacobi == Jacobi
SOR == Gauss-Seidel
- The iteration converges when the spectral radius of B is less than 1, but convergence becomes very slow as the spectral radius approaches 1.
- SOR is a small algebraic trick applied to reduce the spectral radius.
for Gauss–Seidel, the order of updating is significant. Instead of sweeping through the components in ascending order, we could sweep through the components in descending order.
symmetric Gauss-Seidel:alternate between ascending and descending orders
red-black Gauss-Seidel:ordering that results by coloring the gridpoints red and black in a checkerboard pattern. All red(even) points are updated before the black(odd) points.
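A minimal NumPy sketch of these point relaxation schemes (dense matrices for clarity; the red-black ordering is not shown; `reverse=True` gives the descending sweep used in symmetric Gauss-Seidel):

```python
import numpy as np

def weighted_jacobi(A, b, x, omega=2/3, sweeps=1):
    D = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / D      # omega = 1 gives plain Jacobi
    return x

def gauss_seidel(A, b, x, sweeps=1, reverse=False):
    n = len(b)
    order = range(n - 1, -1, -1) if reverse else range(n)
    for _ in range(sweeps):
        for i in order:                      # components updated in place
            x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
    return x

n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = weighted_jacobi(A, b, np.zeros(n), sweeps=5)
x = gauss_seidel(A, b, x, sweeps=5)
print(np.linalg.norm(b - A @ x))
```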
4.2 Block Relaxation Schemes
Block Relaxation Schemes update a whole set of components at each time, typically a subvector of the solution vector, instead of only one component.
in which the partitionings of $b$ and $x$ into subvectors $\beta_i$ and $\xi_i$ are identical and compatible with the partitioning of $A$.
Consider:
- a set-decomposition of $S = \{1, 2, \dots, n\}$
- $p$ subsets $S_1, \dots, S_p$ with $\cup S_i = S$
- Denote by $n_i$ the size of $S_i$ and define $S_i = \{m_i(1), m_i(2), \dots, m_i(n_i)\}$
- the $n \times n_i$ matrix $V_i = [e_{m_i(1)}, e_{m_i(2)}, \dots, e_{m_i(n_i)}]$
- where each $e_j$ is the $j$-th column of the $n \times n$ identity matrix.
Then let:
- $V_i$ be the $n \times n_i$ matrix
$V_i = [e_{m_i(1)}, e_{m_i(2)}, \dots, e_{m_i(n_i)}]$
- $W_i = [\eta_{m_i(1)} e_{m_i(1)}, \eta_{m_i(2)} e_{m_i(2)}, \dots, \eta_{m_i(n_i)} e_{m_i(n_i)}]$
where each $e_j$ is a column of the $n \times n$ identity matrix, and $\eta_{m_i(j)}$ represents a weight factor chosen so that
$W_i^T V_i = I$.
- When there is no overlap, define $\eta_{m_i(j)} = 1$.
Then ($W_i^T$ and $V_j$ simply extract the corresponding part of $A$):
- $A_{ij} = W_i^T A V_j$
- $\xi_i = W_i^T x, \quad \beta_i = W_i^T b$
- $x = \sum V_i \xi_i$
4.2.1 Introduction
With finite differences, it is standard to block the variables and the matrix by partitioning along whole lines of the mesh.
grid mesh:
matrix:
block techniques can be defined in more general terms:
- by using blocks that allow us to update arbitrary groups of components
- by allowing the blocks to overlap.
our partition has been based on an actual set-partition of the variable set $S = \{1, 2, \dots, n\}$ into subsets $S_1, S_2, \dots, S_p$, with the condition that two distinct subsets are disjoint. More generally, a set-decomposition of $S$ removes the constraint of disjointness.
4.2.2 Algorithm
Block Jacobi method:
Block Jacobi Iteration:
- For $k = 0, 1, \dots,$ until convergence Do:
- For $i = 1, 2, \dots, p$ Do:
- Solve $A_{ii}\delta_i = W_i^T(b - Ax_k)$
- Set $x_{k+1} = x_k + V_i\delta_i$
- EndDo
- EndDo
Block Gauss-Seidel Iteration:
- Until convergence Do:
- For $i = 1, 2, \dots, p$ Do:
- Solve $A_{ii}\delta_i = W_i^T(b - Ax)$
- Set $x = x + V_i\delta_i$
- EndDo
- EndDo
The main loop (lines 2 to 5) of Block Jacobi/GS represents an orthogonal projection process over $K_i = \text{span}\{V_i\}$.
Chapter 5. Projection Methods
A projection method works as follows:
- Choose an $x$ in the search subspace $K$ and project it into $\text{Ran}(A)$ (the range of the matrix $A$), obtaining $Ax$.
- The target solution $u$ projects into $\text{Ran}(A)$ as $Au = b$.
- Now $Ax$ and $Au = b$ both lie in $\text{Ran}(A)$, and we impose $b - Ax \perp L$.
- Return to step 1 and keep minimizing the residual.
$K$ embodies the degrees of freedom; $L$ embodies the constraints.
A projection process represents a canonical way for extracting an approximation to the solution of a linear system from a subspace.
A projection technique onto the subspace K (the search subspace) and orthogonal to L (the subspace of constraints) is a process which finds an approximate solution $\tilde{x}$ to $Ax = b$ by imposing the conditions that $\tilde{x}$ belong to K and that the new residual vector be orthogonal to L:
That is:
affine subspace: $x_0 + K$
Petrov-Galerkin conditions: $b - A\tilde{x} \perp L$
when $L = K$, the Petrov-Galerkin conditions are called the Galerkin conditions
$x_0$: initial guess
orthogonal projection: $L = K$
oblique projection: $L \ne K$
5.1 Prototype Projection Method
5.1.1 General Projection Methods
In fact, virtually all of the basic iterative techniques seen in the previous chapter can be considered projection techniques. Whenever an approximation is defined via m degrees of freedom (subspace K) and m constraints (subspace L), a projection process results.
5.1.2 Matrix Representation
Let $V = [v_1, \dots, v_m]$ be an $n \times m$ matrix whose column-vectors form a basis of K and $W = [w_1, \dots, w_m]$ an $n \times m$ matrix whose column-vectors form a basis of L.
Prototype Projection Method:
- Until convergence, Do:
- Select a pair of subspaces K and L
- Choose bases $V = [v_1, \dots, v_m]$ and $W = [w_1, \dots, w_m]$ for K and L
- $r = b - Ax$
- $y = (W^T A V)^{-1} W^T r$
- $x = x + Vy$
- EndDo
5.2 General Theory
This section gives some general theoretical results without being specific about the subspaces K and L which are used. The goal is to learn about the quality of the approximation obtained from a general projection process. Two main tools are used for this.
- The first is to exploit optimality properties of projection methods. These properties are induced from those properties of projectors seen in Section 1.12.4 of Chapter 1.
- The second tool consists of interpreting the projected problem with the help of projection operators in an attempt to extract residual bounds.
5.3 One-Dimensional Projection Processes
Here $K = \text{span}\{v\}$ and $L = \text{span}\{w\}$, where v and w are two vectors. In this case, the new approximation takes the form $x \leftarrow x + \alpha v$, and the Petrov-Galerkin condition $r - A\delta \perp w$ yields $\alpha = (r, w)/(Av, w)$.
Steepest Descent Algorithm:
- Until convergence, Do:
- $r \leftarrow b - Ax$
- $\alpha \leftarrow (r, r)/(Ar, r)$
- $x \leftarrow x + \alpha r$
- EndDo
Each step minimizes $f(x) = \|x - x^\star\|_A^2 = (A(x - x^\star), x - x^\star)$ over all vectors of the form $x + \alpha d$, where $d = -\nabla f$.
Minimal Residual Iteration:
- Until convergence, Do:
- $r \leftarrow b - Ax$
- $\alpha \leftarrow (Ar, r)/(Ar, Ar)$
- $x \leftarrow x + \alpha r$
- EndDo
Each step minimizes $f(x) = \|b - Ax\|_2^2$ in the direction $r$.
Residual Norm Steepest Descent:
- Compute $r = b - Ax$
- Until convergence, Do:
- $v = A^T r$
- Compute $Av$ and $\alpha = \|v\|_2^2/\|Av\|_2^2$
- $x = x + \alpha v$
- $r = r - \alpha Av$
- EndDo
Each step minimizes $f(x) = \|b - Ax\|_2^2$ in the direction $-\nabla f$.
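A NumPy sketch of the three one-dimensional processes above (a fixed iteration count stands in for a convergence test; the SPD test matrix is a hypothetical 1-D Laplacian):

```python
import numpy as np

def steepest_descent(A, b, x, iters=200):     # requires SPD A
    for _ in range(iters):
        r = b - A @ x
        x = x + ((r @ r) / (r @ (A @ r))) * r
    return x

def minimal_residual(A, b, x, iters=200):     # requires positive definite symmetric part
    for _ in range(iters):
        r = b - A @ x
        Ar = A @ r
        x = x + ((Ar @ r) / (Ar @ Ar)) * r
    return x

def residual_norm_sd(A, b, x, iters=200):     # works for general nonsingular A
    r = b - A @ x
    for _ in range(iters):
        v = A.T @ r
        Av = A @ v
        alpha = (v @ v) / (Av @ Av)
        x, r = x + alpha * v, r - alpha * Av
    return x

n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
print(np.linalg.norm(b - A @ steepest_descent(A, b, np.zeros(n))))
```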
5.4 Additive and Multiplicative Processes
Consider:
- a set-decomposition of $S = \{1, 2, \dots, n\}$
- $p$ subsets $S_1, \dots, S_p$ with $\cup S_i = S$
- Denote by $n_i$ the size of $S_i$ and define $S_i = \{m_i(1), m_i(2), \dots, m_i(n_i)\}$
- the $n \times n_i$ matrix $V_i = [e_{m_i(1)}, e_{m_i(2)}, \dots, e_{m_i(n_i)}]$
- where each $e_j$ is the $j$-th column of the $n \times n$ identity matrix.
Additive Projection Procedure (contributions are accumulated):
- For $k = 0, 1, \dots$ until convergence, Do:
- For $i = 1, 2, \dots, p$ Do:
- Solve $A_i y_i = V_i^T(b - Ax_k)$
- EndDo
- Set $x_{k+1} = x_k + \sum_{i=1}^{p} V_i y_i$
- EndDo
The general block Jacobi iteration combines these modifications, implicitly adding them together, to obtain the next iterate $x_{k+1}$.
Multiplicative Projection Procedure:
- Until convergence, Do:
- For $i = 1, 2, \dots, p$ Do:
- Solve $A_i y = V_i^T(b - Ax)$
- Set $x = x + V_i y$
- EndDo
- EndDo
Similar to the Jacobi and Gauss-Seidel iterations, what distinguishes the additive and multiplicative iterations is that the latter updates the component to be corrected at step i immediately.
Chapter 6. Krylov Subspace Methods Part 1
The basic requirement of the conjugate gradient method is that the matrix A be symmetric positive definite; otherwise convergence is slow or fails altogether. Conveniently, the matrices arising from the pressure Poisson equation and from stiffness matrices are both symmetric positive definite.
The next two chapters explore a few methods which are considered currently to be among the most important iterative techniques available for solving large linear systems.
These techniques are based on projection processes, both orthogonal and oblique, onto Krylov subspaces, which are subspaces spanned by vectors of the form $p(A)v$ where $p$ is a polynomial. In short, these techniques approximate $A^{-1}b$ by $p(A)b$, where $p$ is a "good" polynomial. (See: Matrix polynomial - Wikipedia)
This chapter covers methods derived from, or related to, the Arnoldi orthogonalization. The next chapter covers methods based on Lanczos bi-orthogonalization.
6.1 Introduction
Consider
- the affine subspace $x_0 + K_m$ of dimension m
- the Petrov-Galerkin condition $b - Ax_m \perp L_m$
A Krylov subspace method is a method for which the subspace $K_m$ is the Krylov subspace $K_m(A, r_0) = \text{span}\{r_0, Ar_0, A^2 r_0, \dots, A^{m-1}r_0\}$,
where $r_0 = b - Ax_0$.
Krylov subspace methods are simply projection methods whose K is a Krylov subspace.
The different versions of Krylov subspace methods arise from different choices of the subspace $L_m$ and from the ways in which the system is preconditioned.
6.2 Krylov subspace
Properties of Krylov subspaces:
- $K_m$ is the subspace of all vectors in $\mathbb{R}^n$ which can be written as $x = p(A)v$, where $p$ is a polynomial of degree not exceeding $m-1$.
- the grade of $v$: the degree of the minimal polynomial of $v$ with respect to $A$.
6.3 Arnoldi's method
(Reference: Lanczos 算法 & Arnoldi 算法 简述 - 知乎 (zhihu.com))
Given a Krylov system of vectors (of dimension k), we want an equivalent orthonormal system; the Gram-Schmidt process provides one. While studying eigenvalue problems of nonsymmetric matrices, Arnoldi exploited the special structure of the Krylov vectors to derive a variant of Gram-Schmidt now known as the Arnoldi algorithm.
Arnoldi's procedure is an algorithm for building an orthogonal basis of the Krylov subspace $K_m$. The Arnoldi process also produces a Hessenberg matrix as a by-product.
$H_m = V_m^T A V_m$, and $AV_m = V_m H_m + w_m e_m^T = V_{m+1}\bar{H}_m$.
$V_m$: the $n \times m$ matrix with columns $v_1, \dots, v_m$.
$\bar{H}_m$: the $(m+1) \times m$ Hessenberg matrix.
$H_m$: $\bar{H}_m$ with its last row deleted.
Arnoldi's method based on Gram-Schmidt:
- Choose a vector $v_1$ of norm 1
- For $j = 1, 2, \dots, m$ Do:
- Compute $h_{ij} = (Av_j, v_i)$ for $i = 1, 2, \dots, j$
- Compute $w_j = Av_j - \sum_{i=1}^{j} h_{ij}v_i$
- $h_{j+1,j} = \|w_j\|_2$
- If $h_{j+1,j} = 0$ then Stop
- $v_{j+1} = w_j/h_{j+1,j}$
- EndDo
The vectors $v_1, v_2, \dots, v_k$ form an orthonormal basis of $K(A, v_1)$.
At each step, the algorithm multiplies the previous Arnoldi vector vj by A and then orthonormalizes the resulting vector wj against all previous vi ’s by a standard Gram-Schmidt procedure. It will stop if the vector wj vanishes.
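A sketch of the Arnoldi process with modified Gram-Schmidt in NumPy, returning $V_{m+1}$ and the $(m+1) \times m$ Hessenberg matrix $\bar{H}_m$:

```python
import numpy as np

def arnoldi(A, v1, m):
    """Build an orthonormal basis V of K_m(A, v1) and the (m+1) x m Hessenberg H."""
    n = len(v1)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):               # modified Gram-Schmidt orthogonalization
            H[i, j] = w @ V[:, i]
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:              # "happy breakdown": w_j vanishes
            return V[:, : j + 1], H[: j + 2, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```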
6.4 Arnoldi's method for linear system (FOM)
Full Orthogonalization Method (FOM):
- Compute $r_0 = b - Ax_0$, $\beta = \|r_0\|_2$, and $v_1 = r_0/\beta$
- Define the $m \times m$ matrix $H_m = \{h_{ij}\}$; set $H_m = 0$
- For $j = 1, 2, \dots, m$ Do:
- Compute $w_j = Av_j$
- For $i = 1, \dots, j$ Do:
- $h_{ij} = (w_j, v_i)$
- $w_j = w_j - h_{ij}v_i$
- EndDo
- Compute $h_{j+1,j} = \|w_j\|_2$. If $h_{j+1,j} = 0$ set $m = j$ and Goto the final step
- Compute $v_{j+1} = w_j/h_{j+1,j}$
- EndDo
- Compute $y_m = H_m^{-1}(\beta e_1)$ and $x_m = x_0 + V_m y_m$
6.5 GMRES
GMRES places no special requirements on the matrix. It was developed by Yousef Saad and collaborators.
The Generalized Minimum Residual Method (GMRES) is a projection method based on taking $K = K_m$ and $L = AK_m$, in which $K_m$ is the m-th Krylov subspace with $v_1 = r_0/\|r_0\|_2$. As seen in Chapter 5 (Proposition 5.3: assume that $L = AK$; then $\tilde{x}$ is the result of the projection method onto K with starting vector $x_0$ if and only if it minimizes the 2-norm of the residual vector), such a technique minimizes the residual norm over all vectors in $x_0 + K_m$. The implementation of an algorithm based on this approach is similar to that of the FOM algorithm.
The GMRES approximation is the unique vector of $x_0 + K_m$ which minimizes $J(y) = \|b - Ax\|_2 = \|b - A(x_0 + V_m y)\|_2 = \|\beta e_1 - \bar{H}_m y\|_2$; obtaining $\bar{H}_m$ is why the Arnoldi process is needed. Derivation:
GMRES:
- Compute $r_0 = b - Ax_0$, $\beta = \|r_0\|_2$, and $v_1 = r_0/\beta$
- Define the $(m+1) \times m$ matrix $\bar{H}_m = \{h_{ij}\}_{1 \le i \le m+1, 1 \le j \le m}$. Set $\bar{H}_m = 0$
- For $j = 1, 2, \dots, m$ Do:
- Compute $w_j = Av_j$
- For $i = 1, \dots, j$ Do:
- $h_{ij} = (w_j, v_i)$
- $w_j = w_j - h_{ij}v_i$
- EndDo
- $h_{j+1,j} = \|w_j\|_2$. If $h_{j+1,j} = 0$ set $m = j$ and Goto the final step
- $v_{j+1} = w_j/h_{j+1,j}$
- EndDo
- Compute $y_m$, the minimizer of $\|\beta e_1 - \bar{H}_m y\|_2$, and $x_m = x_0 + V_m y_m$
GMRES is thus an optimization of $\|r\|$ over the expanding subspace under the (Petrov-)Galerkin condition. In the process, the coefficients are found by repeatedly computing projections through orthogonalization, much as in the Gram-Schmidt process.
Restarted GMRES:
- Compute $r_0 = b - Ax_0$, $\beta = \|r_0\|_2$, and $v_1 = r_0/\beta$
- Generate the Arnoldi basis and the matrix $\bar{H}_m$ with the Arnoldi algorithm,
starting with $v_1$
- Compute $y_m$, which minimizes $\|\beta e_1 - \bar{H}_m y\|_2$, and $x_m = x_0 + V_m y_m$
- If satisfied then Stop, else set $x_0 = x_m$ and Goto step 1
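Reusing the `arnoldi` sketch from Section 6.3, a minimal restarted GMRES; here the small least-squares problem is solved with `numpy.linalg.lstsq` rather than the Givens rotations used in practical implementations:

```python
import numpy as np

def gmres_restarted(A, b, x0, m=20, restarts=50, tol=1e-8):
    x = x0.copy()
    for _ in range(restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta < tol:
            break
        V, H = arnoldi(A, r / beta, m)               # arnoldi() from the sketch above
        e1 = np.zeros(H.shape[0])
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H, e1, rcond=None)   # minimize ||beta*e1 - H y||_2
        x = x + V[:, : H.shape[1]] @ y
    return x
```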
6.6 The Symmetric Lanczos Algorithm
Both compute an orthogonal basis of a matrix's Krylov subspace, but Gram-Schmidt orthogonalization costs $O(n^3)$ while the Lanczos approach costs $O(n^2)$. The Lanczos method, however, requires the matrix to be Hermitian. For a Hermitian matrix, MGS (Modified Gram-Schmidt) on the Krylov subspace reduces to the Lanczos process.
The symmetric Lanczos algorithm can be viewed as a simplification of Arnoldi’s method for the particular case when the matrix is symmetric. When A is symmetric, then the Hessenberg matrix Hm becomes symmetric tridiagonal. This leads to a three-term recurrence in the Arnoldi process and short-term recurrences for solution algorithms such as FOM and GMRES. On the theoretical side, there is also much more to be said on the resulting approximation in the symmetric case.
The Lanczos Algorithm:
- Choose an initial vector $v_1$ of unit norm. Set $\beta_1 = 0$, $v_0 = 0$
- For $j = 1, 2, \dots, m$ Do:
- $w_j = Av_j - \beta_j v_{j-1}$
- $\alpha_j = (w_j, v_j)$
- $w_j = w_j - \alpha_j v_j$
- $\beta_{j+1} = \|w_j\|_2$. If $\beta_{j+1} = 0$ then Stop
- $v_{j+1} = w_j/\beta_{j+1}$
- EndDo
Lanczos Method for Linear Systems:
- Compute $r_0 = b - Ax_0$, $\beta = \|r_0\|_2$, and $v_1 = r_0/\beta$
- For $j = 1, 2, \dots, m$ Do:
- $w_j = Av_j - \beta_j v_{j-1}$ (if $j = 1$ set $\beta_j v_0 = 0$)
- $\alpha_j = (w_j, v_j)$
- $w_j = w_j - \alpha_j v_j$
- $\beta_{j+1} = \|w_j\|_2$. If $\beta_{j+1} = 0$ set $m = j$ and exit the loop
- $v_{j+1} = w_j/\beta_{j+1}$
- EndDo
- Set $T_m = \text{tridiag}(\beta_i, \alpha_i, \beta_{i+1})$ and $V_m = [v_1, \dots, v_m]$
- Compute $y_m = T_m^{-1}(\beta e_1)$ and $x_m = x_0 + V_m y_m$
Direct version of the Lanczos algorithm (D-Lanczos):
- Compute $r_0 = b - Ax_0$, $\zeta_1 = \beta = \|r_0\|_2$, and $v_1 = r_0/\beta$
- Set $\lambda_1 = \beta_1 = 0$, $p_0 = 0$
- For $m = 1, 2, \dots,$ until convergence Do:
- Compute $w = Av_m - \beta_m v_{m-1}$ and $\alpha_m = (w, v_m)$
- If $m > 1$ then compute $\lambda_m = \beta_m/\eta_{m-1}$ and $\zeta_m = -\lambda_m\zeta_{m-1}$
- $\eta_m = \alpha_m - \lambda_m\beta_m$
- $p_m = \eta_m^{-1}(v_m - \beta_m p_{m-1})$
- $x_m = x_{m-1} + \zeta_m p_m$
- If $x_m$ has converged then Stop
- $w = w - \alpha_m v_m$
- $\beta_{m+1} = \|w\|_2$, $v_{m+1} = w/\beta_{m+1}$
- EndDo
The result of the Lanczos algorithm is a relation of the form
$AV_m = V_{m+1}\bar{T}_m$, in which $\bar{T}_m$ is the $(m+1) \times m$ tridiagonal matrix
$\bar{T}_m = \begin{pmatrix} T_m \\ \delta_{m+1}e_m^T \end{pmatrix}$
6.7 The Conjugate Gradient Algorithm
One of the best known iterative techniques for solving sparse Symmetric Positive Definite linear systems. It is mathematically equivalent to FOM. However, because A is symmetric, some simplifications resulting from the three-term Lanczos recurrence lead to more elegant algorithms. CG can be viewed as FOM specialized to Symmetric Positive Definite matrices.
The Conjugate Gradient algorithm can be derived from the Lanczos algorithm in the same way DIOM was derived from IOM. In fact, the Conjugate Gradient algorithm can be viewed as a variation of DIOM for the case when A is symmetric.
Conjugate Gradient:
- Compute $r_0 = b - Ax_0$, $p_0 = r_0$
- For $j = 0, 1, \dots,$ until convergence Do:
- $\alpha_j = (r_j, r_j)/(Ap_j, p_j)$ (step length)
- $x_{j+1} = x_j + \alpha_j p_j$
- $r_{j+1} = r_j - \alpha_j Ap_j$
- $\beta_j = (r_{j+1}, r_{j+1})/(r_j, r_j)$
- $p_{j+1} = r_{j+1} + \beta_j p_j$ (search direction)
- EndDo
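A direct NumPy transcription of the algorithm (a sketch; the SPD test matrix is a hypothetical 1-D Laplacian):

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, maxiter=1000):
    x = x0.copy()
    r = b - A @ x
    p = r.copy()                     # p_0 = r_0
    rs = r @ r
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rs / (p @ Ap)        # step length alpha_j
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p    # new A-conjugate search direction
        rs = rs_new
    return x

n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD
b = np.ones(n)
x = conjugate_gradient(A, b, np.zeros(n))
print(np.linalg.norm(b - A @ x))
```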
CG is derived from the Lanczos Algorithm; the relevant result:
Proposition:
- Let
- $r_m = b - Ax_m$, $m = 0, 1, \dots$ be the residual vectors produced by the Lanczos and the D-Lanczos algorithms, and
- $p_m$, $m = 0, 1, \dots$ the auxiliary vectors produced by the D-Lanczos algorithm.
- Then:
- Each residual vector $r_m$ satisfies $r_m = \sigma_m v_{m+1}$, where $\sigma_m$ is a certain scalar. As a result, the residual vectors are orthogonal to each other.
- The auxiliary vectors $p_i$ form an A-conjugate set, i.e., $(Ap_i, p_j) = 0$ for $i \ne j$.
Summary:
- Arnoldi uses a Gram-Schmidt-style orthogonalization, exploiting the structure of the Krylov vectors, to build an orthogonal basis of the Krylov subspace.
- Symmetric Lanczos is a simplified Arnoldi for symmetric matrices.
- Conjugate Gradient takes the A-conjugate vector set produced by the symmetric Lanczos process and optimizes the residual along those conjugate directions.
6.8 The Conjugate Residual Method
Conjugate Residual Algorithm:
- Compute $r_0 = b - Ax_0$, $p_0 = r_0$
- For $j = 0, 1, \dots,$ until convergence Do:
- $\alpha_j = (r_j, Ar_j)/(Ap_j, Ap_j)$
- $x_{j+1} = x_j + \alpha_j p_j$
- $r_{j+1} = r_j - \alpha_j Ap_j$
- $\beta_j = (r_{j+1}, Ar_{j+1})/(r_j, Ar_j)$
- $p_{j+1} = r_{j+1} + \beta_j p_j$
- Compute $Ap_{j+1} = Ar_{j+1} + \beta_j Ap_j$
- EndDo
6.11 Convergence Analysis
One of the main tools used in the analysis of these methods is Chebyshev polynomials. These polynomials are useful both in theory, when studying convergence, and in practice, as a means of accelerating single-vector iterations or projection processes. In the following, real and complex Chebyshev polynomials are discussed separately.
optimal algorithms:
- CG
- GMRES
non-optimal algorithms:
- FOM
- IOM
- QGMRES
6.12 Block Krylov Methods
Chapter 7. Krylov Subspace Methods Part 2
This chapter will describe a class of Krylov subspace methods which are instead based on a biorthogonalization algorithm due to Lanczos. These are projection methods that are intrinsically non-orthogonal.
7.1 Lanczos biorthogonalization
The Lanczos biorthogonalization algorithm is an extension to nonsymmetric matrices of the symmetric Lanczos algorithm seen in the previous chapter. One such extension, the Arnoldi procedure, has already been seen. However, the nonsymmetric Lanczos algorithm is quite different in concept from Arnoldi’s method because it relies on biorthogonal sequences instead of orthogonal sequences.
Biorthogonality:
Given two systems of vectors $u_1, u_2, \dots, u_n$ and $v_1, v_2, \dots, v_n$,
they are biorthogonal when
$(u_i, v_j) = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$
The algorithm proposed by Lanczos for nonsymmetric matrices builds a pair of biorthogonal bases for the two subspaces $K_m(A, v_1)$ and $K_m(A^T, w_1)$:
The Lanczos Biorthogonalization Procedure:
- Choose two vectors $v_1, w_1$ such that $(v_1, w_1) = 1$
- Set $\beta_1 = \delta_1 = 0$, $w_0 = v_0 = 0$
- For $j = 1, 2, \dots, m$ Do:
- $\alpha_j = (Av_j, w_j)$
- $\hat{v}_{j+1} = Av_j - \alpha_j v_j - \beta_j v_{j-1}$
- $\hat{w}_{j+1} = A^T w_j - \alpha_j w_j - \delta_j w_{j-1}$
- $\delta_{j+1} = |(\hat{v}_{j+1}, \hat{w}_{j+1})|^{1/2}$. If $\delta_{j+1} = 0$ Stop
- $\beta_{j+1} = (\hat{v}_{j+1}, \hat{w}_{j+1})/\delta_{j+1}$
- $w_{j+1} = \hat{w}_{j+1}/\beta_{j+1}$
- $v_{j+1} = \hat{v}_{j+1}/\delta_{j+1}$
- EndDo
Denote:
7.2 Lanczos Algorithm for linear systems
Two-sided Lanczos Algorithm for Linear Systems:
- Compute $r_0 = b - Ax_0$ and $\beta = \|r_0\|_2$
- Run m steps of the nonsymmetric Lanczos algorithm, i.e.,
- start with $v_1 = r_0/\beta$ and any $w_1$ such that $(v_1, w_1) = 1$
- generate the Lanczos vectors $v_1, \dots, v_m$, $w_1, \dots, w_m$
- and the tridiagonal matrix $T_m$ from the Lanczos Biorthogonalization Procedure
- Compute $y_m = T_m^{-1}(\beta e_1)$ and $x_m = x_0 + V_m y_m$
7.3 The BCG and QMR algorithms
The Biconjugate Gradient (BCG) algorithm can be derived from the Lanczos Biorthogonalization Procedure in exactly the same way the Conjugate Gradient method was derived from the Lanczos Algorithm.
Implicitly, the algorithm solves not only the original system $Ax = b$ but also a dual linear system $A^T x^\star = b^\star$ with $A^T$. This dual system is often ignored in formulations of the algorithm.
Biconjugate Gradient (BCG):
- Compute $r_0 = b - Ax_0$. Choose $r_0^\star$ such that $(r_0, r_0^\star) \ne 0$
- Set $p_0 = r_0$, $p_0^\star = r_0^\star$
- For $j = 0, 1, \dots$ until convergence Do:
- $\alpha_j = (r_j, r_j^\star)/(Ap_j, p_j^\star)$
- $x_{j+1} = x_j + \alpha_j p_j$
- $r_{j+1} = r_j - \alpha_j Ap_j$
- $r_{j+1}^\star = r_j^\star - \alpha_j A^T p_j^\star$
- $\beta_j = (r_{j+1}, r_{j+1}^\star)/(r_j, r_j^\star)$
- $p_{j+1} = r_{j+1} + \beta_j p_j$
- $p_{j+1}^\star = r_{j+1}^\star + \beta_j p_j^\star$
- EndDo
As in the Conjugate Gradient algorithm, the vectors $r_j$ and $r_j^\star$ are in the same direction as $v_{j+1}$ and $w_{j+1}$, respectively.
Quasi-Minimal Residual (QMR):
- Compute $r_0 = b - Ax_0$ and $\gamma_1 = \|r_0\|_2$, $w_1 = v_1 = r_0/\gamma_1$
- For $m = 1, 2, \dots,$ until convergence Do:
- Compute $\alpha_m, \delta_{m+1}$ and $v_{m+1}, w_{m+1}$ as in the Lanczos Biorthogonalization Procedure
- Update the QR factorization of $\bar{T}_m$, i.e.:
- apply $\Omega_i$, $i = m-2, m-1$, to the m-th column of $\bar{T}_m$
- compute the rotation coefficients $c_m, s_m$ by
$s_i = \frac{h_{i+1,i}}{\sqrt{(h_{ii}^{(i-1)})^2 + h_{i+1,i}^2}}, \qquad c_i = \frac{h_{ii}^{(i-1)}}{\sqrt{(h_{ii}^{(i-1)})^2 + h_{i+1,i}^2}}$
- Apply the rotation $\Omega_m$ to $\bar{T}_m$ and $\bar{g}_m$, i.e., compute:
- $\gamma_{m+1} = -s_m\gamma_m$
- $\gamma_m = c_m\gamma_m$, and
- $\alpha_m = c_m\alpha_m + s_m\delta_{m+1}\ \big(= \sqrt{\delta_{m+1}^2 + \alpha_m^2}\,\big)$
- $p_m = \big(v_m - \sum_{i=m-2}^{m-1} t_{im}p_i\big)/t_{mm}$
- $x_m = x_{m-1} + \gamma_m p_m$
- If $|\gamma_{m+1}|$ is small enough then Stop
- EndDo
7.4 Transpose-free Variants
Conjugate Gradient Squared:
- Compute $r_0 = b - Ax_0$; $r_0^\star$ arbitrary
- Set $p_0 = u_0 = r_0$
- For $j = 0, 1, 2, \dots$ until convergence Do:
- $\alpha_j = (r_j, r_0^\star)/(Ap_j, r_0^\star)$
- $q_j = u_j - \alpha_j Ap_j$
- $x_{j+1} = x_j + \alpha_j(u_j + q_j)$
- $r_{j+1} = r_j - \alpha_j A(u_j + q_j)$
- $\beta_j = (r_{j+1}, r_0^\star)/(r_j, r_0^\star)$
- $u_{j+1} = r_{j+1} + \beta_j q_j$
- $p_{j+1} = u_{j+1} + \beta_j(q_j + \beta_j p_j)$
- EndDo
Biconjugate Gradient Stabilized Algorithm (BICGSTAB):
- Compute $r_0 = b - Ax_0$; $r_0^\star$ arbitrary
- $p_0 = r_0$
- For $j = 0, 1, \dots,$ until convergence Do:
- $\alpha_j = (r_j, r_0^\star)/(Ap_j, r_0^\star)$
- $s_j = r_j - \alpha_j Ap_j$
- $\omega_j = (As_j, s_j)/(As_j, As_j)$
- $x_{j+1} = x_j + \alpha_j p_j + \omega_j s_j$
- $r_{j+1} = s_j - \omega_j As_j$
- $\beta_j = \frac{(r_{j+1}, r_0^\star)}{(r_j, r_0^\star)} \times \frac{\alpha_j}{\omega_j}$
- $p_{j+1} = r_{j+1} + \beta_j(p_j - \omega_j Ap_j)$
- EndDo
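In practice these transpose-free variants are usually called from a library; a SciPy usage sketch (scipy.sparse.linalg provides `cgs` and `bicgstab`; the test matrix here is a hypothetical example):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 100
# A nonsymmetric, diagonally dominant test matrix (hypothetical example).
A = sp.diags([-1.2, 2.5, -0.8], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = spla.bicgstab(A, b)   # info == 0 signals successful convergence
print(info, np.linalg.norm(b - A @ x))

x, info = spla.cgs(A, b)        # Conjugate Gradient Squared
print(info, np.linalg.norm(b - A @ x))
```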
Chapter 8. Methods Related to the Normal Equations
There are a number of techniques for converting a non-symmetric linear system into a symmetric one. One such technique solves the equivalent linear system $A^T Ax = A^T b$, called the normal equations. This chapter covers iterative methods which are either directly or implicitly related to the normal equations.
8.3 Conjugate Gradient and Normal Equations
CGNR (NR: Normal Residual):
- Compute $r_0 = b - Ax_0$, $z_0 = A^T r_0$, $p_0 = z_0$
- For $i = 0, \dots,$ until convergence Do:
- $w_i = Ap_i$
- $\alpha_i = \|z_i\|_2^2/\|w_i\|_2^2$
- $x_{i+1} = x_i + \alpha_i p_i$
- $r_{i+1} = r_i - \alpha_i w_i$
- $z_{i+1} = A^T r_{i+1}$
- $\beta_i = \|z_{i+1}\|_2^2/\|z_i\|_2^2$
- $p_{i+1} = z_{i+1} + \beta_i p_i$
- EndDo
CGNE (Craig's Method) (NE: Normal Equations):
- Compute $r_0 = b - Ax_0$, $p_0 = A^T r_0$
- For $i = 0, \dots,$ until convergence Do:
- $\alpha_i = (r_i, r_i)/(p_i, p_i)$
- $x_{i+1} = x_i + \alpha_i p_i$
- $r_{i+1} = r_i - \alpha_i Ap_i$
- $\beta_i = (r_{i+1}, r_{i+1})/(r_i, r_i)$
- $p_{i+1} = A^T r_{i+1} + \beta_i p_i$
- EndDo
Chapter 9. Preconditioned Iterations
Preconditioning is a key ingredient for the success of Krylov subspace methods. This chapter discusses the preconditioned versions of the Krylov subspace algorithm already seen, using a generic preconditioner. The standard preconditioning techniques will be covered in the next chapter.
$M^{-1}A$ has a small condition number, so the iteration converges quickly.
9.1 Introduction
Both the efficiency and robustness of iterative techniques can be improved by using preconditioning.
preconditioning:preconditioning is simply a means of transforming the original linear system into one which has the same solution, but which is likely to be easier to solve with an iterative solver.
9.2 Preconditioned Conjugate Gradient
preconditioner:is a matrix M.
From a practical point of view, the only requirement for M is that it is inexpensive to solve linear systems Mx=b. This is because the preconditioned algorithms will all require a linear system solution with the matrix M at each step. For example:
- left preconditioning: $M^{-1}Ax = M^{-1}b$
- right preconditioning: $AM^{-1}u = b$, with $x = M^{-1}u$
(1) Preserving symmetry:
When M is available in the form of an incomplete Cholesky factorization, i.e., when $M = LL^T$,
then a simple way to preserve symmetry is to "split" the preconditioner between left and right, i.e., to solve $L^{-1}AL^{-T}u = L^{-1}b$ with $x = L^{-T}u$,
which involves a Symmetric Positive Definite matrix.
(2) PCG:
M-inner product: $(x, y)_M \equiv (Mx, y)$.
With the M-inner product, the CG process can be written as:
- initialize:
- original residual: $r_j = b - Ax_j$
- residual for the preconditioned system: $z_j = M^{-1}r_j$
- Loop:
- $\alpha_j = (z_j, z_j)_M/(M^{-1}Ap_j, p_j)_M$
- $x_{j+1} = x_j + \alpha_j p_j$
- $r_{j+1} = r_j - \alpha_j Ap_j$ and $z_{j+1} = M^{-1}r_{j+1}$
- $\beta_j = (z_{j+1}, z_{j+1})_M/(z_j, z_j)_M$
- compute the conjugate direction: $p_{j+1} = z_{j+1} + \beta_j p_j$
Note that written this way, the M-inner product never has to be computed explicitly.
PCG is then:
- initialize:
- original residual: $r_0 = b - Ax_0$
- residual for the preconditioned system: $z_0 = M^{-1}r_0$
- $p_0 = z_0$
- Loop:
- $\alpha_j = (r_j, z_j)/(Ap_j, p_j)$
- $x_{j+1} = x_j + \alpha_j p_j$
- $r_{j+1} = r_j - \alpha_j Ap_j$ and $z_{j+1} = M^{-1}r_{j+1}$
- $\beta_j = (r_{j+1}, z_{j+1})/(r_j, z_j)$
- compute the conjugate direction: $p_{j+1} = z_{j+1} + \beta_j p_j$
When $M = I$, PCG = CG.
Split Preconditioner Conjugate Gradient:
- Compute $r_0 = b - Ax_0$; $\hat{r}_0 = L^{-1}r_0$; and $p_0 = L^{-T}\hat{r}_0$
- For $j = 0, 1, \dots,$ until convergence Do:
- $\alpha_j = (\hat{r}_j, \hat{r}_j)/(Ap_j, p_j)$
- $x_{j+1} = x_j + \alpha_j p_j$
- $\hat{r}_{j+1} = \hat{r}_j - \alpha_j L^{-1}Ap_j$
- $\beta_j = (\hat{r}_{j+1}, \hat{r}_{j+1})/(\hat{r}_j, \hat{r}_j)$
- $p_{j+1} = L^{-T}\hat{r}_{j+1} + \beta_j p_j$
- EndDo
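A sketch of PCG where the preconditioner is passed as a function applying $M^{-1}$; the Jacobi choice $M = \text{diag}(A)$ below is just an illustrative assumption, and any SPD M with cheap solves can be substituted:

```python
import numpy as np

def pcg(A, b, x0, M_solve, tol=1e-10, maxiter=1000):
    """Preconditioned CG; M_solve(r) applies M^{-1} to a vector."""
    x = x0.copy()
    r = b - A @ x
    z = M_solve(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_solve(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b, np.zeros(n), M_solve=lambda r: r / np.diag(A))  # Jacobi preconditioner
print(np.linalg.norm(b - A @ x))
```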
9.3 Preconditioned GMRES
9.3.1 Left-preconditioned GMRES
We define the left-preconditioned GMRES algorithm as the GMRES algorithm applied to the system $M^{-1}Ax = M^{-1}b$:
GMRES with Left Preconditioning:
- Compute $r_0 = M^{-1}(b - Ax_0)$, $\beta = \|r_0\|_2$ and $v_1 = r_0/\beta$
- For $j = 1, \dots, m$ Do:
- Compute $w = M^{-1}Av_j$
- For $i = 1, \dots, j$ Do:
- $h_{i,j} = (w, v_i)$
- $w = w - h_{i,j}v_i$
- EndDo
- Compute $h_{j+1,j} = \|w\|_2$ and $v_{j+1} = w/h_{j+1,j}$
- EndDo
- Define $V_m = [v_1, \dots, v_m]$, $\bar{H}_m = \{h_{i,j}\}_{1 \le i \le j+1;\, 1 \le j \le m}$
- Compute $y_m = \arg\min_y \|\beta e_1 - \bar{H}_m y\|_2$ and $x_m = x_0 + V_m y_m$
- If satisfied Stop, else set $x_0 = x_m$ and Goto step 1
9.3.2 Right-preconditioned GMRES
The right-preconditioned GMRES algorithm is based on solving $AM^{-1}u = b$, with $u = Mx$:
GMRES with Right Preconditioning:
- Compute $r_0 = b - Ax_0$, $\beta = \|r_0\|_2$, and $v_1 = r_0/\beta$
- For $j = 1, \dots, m$ Do:
- Compute $w = AM^{-1}v_j$
- For $i = 1, \dots, j$ Do:
- $h_{i,j} = (w, v_i)$
- $w = w - h_{i,j}v_i$
- EndDo
- Compute $h_{j+1,j} = \|w\|_2$ and $v_{j+1} = w/h_{j+1,j}$
- EndDo
- Define $V_m = [v_1, \dots, v_m]$, $\bar{H}_m = \{h_{i,j}\}_{1 \le i \le j+1;\, 1 \le j \le m}$
- Compute $y_m = \arg\min_y \|\beta e_1 - \bar{H}_m y\|_2$, and $x_m = x_0 + M^{-1}V_m y_m$
- If satisfied Stop, else set $x_0 = x_m$ and Goto step 1
9.4 Flexible Variants
In the discussion of preconditioning techniques so far, it is implicitly assumed that the preconditioning matrix M is fixed, i.e., it does not change from step to step. However, in some cases, no matrix M is available. Instead, the operation M−1x is the result of some unspecified computation, possibly another iterative process. In such cases, it may well happen that M−1 is not a constant operator. The previous preconditioned iterative procedures will not converge if M is not constant. There are a number of variants of iterative procedures developed in the literature that can accommodate variations in the preconditioner, i.e., that allow the preconditioner to vary from step to step. Such iterative procedures are called “flexible” iterations. One of these iterations, a flexible variant of the GMRES algorithm, is described next.
Flexible GMRES (FGMRES):
- Compute $r_0 = b - Ax_0$, $\beta = \|r_0\|_2$, and $v_1 = r_0/\beta$
- For $j = 1, \dots, m$ Do:
- Compute $z_j = M_j^{-1}v_j$
- Compute $w = Az_j$
- For $i = 1, \dots, j$ Do:
- $h_{i,j} = (w, v_i)$
- $w = w - h_{i,j}v_i$
- EndDo
- Compute $h_{j+1,j} = \|w\|_2$ and $v_{j+1} = w/h_{j+1,j}$
- EndDo
- Define $Z_m = [z_1, \dots, z_m]$, $\bar{H}_m = \{h_{i,j}\}_{1 \le i \le j+1;\, 1 \le j \le m}$
- Compute $y_m = \arg\min_y \|\beta e_1 - \bar{H}_m y\|_2$, and $x_m = x_0 + Z_m y_m$
- If satisfied Stop, else set $x_0 \leftarrow x_m$ and Goto step 1
It is clear that when $M_j = M$ for $j = 1, \dots, m$, this method is mathematically equivalent to the GMRES with Right Preconditioning algorithm.
9.5 Preconditioned CG For the Normal Equations
Left-Preconditioned CGNR (NR: Normal Residual):
- Compute $r_0 = b - Ax_0$, $\tilde{r}_0 = A^T r_0$, $z_0 = M^{-1}\tilde{r}_0$, $p_0 = z_0$
- For $j = 0, \dots,$ until convergence Do:
- $w_j = Ap_j$
- $\alpha_j = (z_j, \tilde{r}_j)/\|w_j\|_2^2$
- $x_{j+1} = x_j + \alpha_j p_j$
- $r_{j+1} = r_j - \alpha_j w_j$
- $\tilde{r}_{j+1} = A^T r_{j+1}$
- $z_{j+1} = M^{-1}\tilde{r}_{j+1}$
- $\beta_j = (z_{j+1}, \tilde{r}_{j+1})/(z_j, \tilde{r}_j)$
- $p_{j+1} = z_{j+1} + \beta_j p_j$
- EndDo
Left-Preconditioned CGNE:
- Compute $r_0 = b - Ax_0$, $z_0 = M^{-1}r_0$, $p_0 = A^T z_0$
- For $j = 0, \dots,$ until convergence Do:
- $w_j = Ap_j$
- $\alpha_j = (z_j, r_j)/(p_j, p_j)$
- $x_{j+1} = x_j + \alpha_j p_j$
- $r_{j+1} = r_j - \alpha_j w_j$
- $z_{j+1} = M^{-1}r_{j+1}$
- $\beta_j = (z_{j+1}, r_{j+1})/(z_j, r_j)$
- $p_{j+1} = A^T z_{j+1} + \beta_j p_j$
- EndDo
Chapter 10. Preconditioning Techniques
A preconditioner can be defined as any subsidiary approximate solver which is combined with an outer iteration technique, typically one of the Krylov subspace iterations.
The choice of preconditioner is unrestricted. For example, preconditioners can be derived from knowledge of the original physical problem from which the linear system arises.
A common feature of the preconditioners discussed in this chapter is that they are built from the original coefficient matrix.
some common preconditioners:
- Jacobi (diagonal) preconditioner M=diag(A)
- Poisson preconditioner
- (Incomplete) Cholesky decomposition
- Multigrid: M= very complex linear operator that almost inverts A.
- Geometric multigrid
- Algebraic multigrid
- Fast multipole method (FMM)
10.1 Introduction
This chapter considers the most common preconditioners used for solving large sparse linear systems and compares their performance. It begins with the simplest preconditioners (SOR and SSOR) and then discusses the more accurate variants such as ILUT.
10.2 Jacobi, SOR, and SSOR preconditioners
Consider:
- D: diagonal of A
- E: lower part of A. (not including the diagonal)
- F: upper part of A. (not including the diagonal)
The Jacobi and Gauss-Seidel iterations are of the form:
in which:
given the matrix splitting $A = M - N$,
where A is associated with the linear system $Ax = b$, a linear fixed-point iteration can be defined by the recurrence $x_{k+1} = M^{-1}Nx_k + M^{-1}b$,
which has the form $x_{k+1} = Gx_k + f$ with $G = M^{-1}N = I - M^{-1}A$ and $f = M^{-1}b$.
The iteration $x_{k+1} = Gx_k + f$ can be viewed as a technique for solving the system $(I - G)x = f$.
Since $G = I - M^{-1}A$, this system can be rewritten as $M^{-1}Ax = M^{-1}b$.
for the Jacobi, Gauss-Seidel, SOR, and SSOR iterations, these preconditioning matrices are, respectively:
10.3 ILU factorization preconditioners
A general Incomplete LU (ILU) factorization process computes a sparse lower triangular matrix L and a sparse upper triangular matrix U so the residual matrix R=LU−A satisfies certain constraints, such as having zero entries in some locations.
10.3.1 Incomplete LU factorizations
Theorem:
- Let A be an M-matrix and let A1 be the matrix obtained from the first step of Gaussian elimination. Then A1 is an M-matrix.
Chapter 11. Parallel Implementations
There have been two traditional approaches for developing parallel iterative techniques thus far.
- The first extracts parallelism whenever possible from standard algorithms. The advantage of this viewpoint is that it is easier to understand in general since the underlying method has not changed from its sequential equivalent.
- The second approach is to develop alternative algorithms which have enhanced parallelism (for example, domain decomposition).
This chapter will give an overview of implementations and will emphasize methods in the first category. The later chapters will consider alternative algorithms that have been developed specifically for parallel computing environments.
11.1 Introduction
The remaining chapters of this book will examine the impact of high performance computing on the design of iterative methods for solving large linear systems of equations.
Chapter 12. Parallel Preconditioners
The main question to ask is whether or not it is possible to find preconditioning techniques that have a high degree of parallelism, as well as good intrinsic qualities.
12.1 Introduction
A number of alternative techniques can be developed that are specifically targeted at parallel environments. These are preconditioning techniques that would normally not be used on a standard machine, but only for parallel computers.
Three such types of techniques discussed in this chapter:
- The simplest approach is to use a Jacobi or, even better, a block Jacobi approach. In the simplest case, a Jacobi preconditioner may consist of the diagonal or a block-diagonal of A. To enhance performance, these preconditioners can themselves be accelerated by polynomial iterations, i.e., a second level of preconditioning called polynomial preconditioning.
- Using graph theory algorithms, such as graph-coloring techniques. These consist of coloring nodes such that two adjacent nodes have different colors. The gist of this approach is that all unknowns associated with the same color can be determined simultaneously in the forward and backward sweeps of the ILU preconditioning operation.
- Using generalizations of "partitioning" techniques, which can be put in the general framework of "domain decomposition" approaches.
There are essentially two types of algorithms, namely, those which can be termed coarse-grain and those which can be termed fine-grain:
- coarse-grain: the parallel tasks are relatively big and may, for example, involve the solution of small linear systems
- fine-grain: the subtasks can be elementary floating-point operations or consist of a few such operations.
12.2 Block-Jacobi preconditioners
block Jacobi preconditioner (cnrs.fr)
the block Jacobi preconditioner:
where $R_i \in \mathbb{R}^{n \times N}$ is the restriction matrix (it extracts the relevant part of $A$) from the global numbering of unknowns to the local numbering of process $i$, and $R_i A R_i^T$ is the diagonal block of $A$ corresponding to the unknowns held by process $i$.
12.3 Polynomial Preconditioners
In polynomial preconditioning, the matrix M is defined by $M^{-1} = s(A)$, where $s$ is a polynomial:
12.3.2 Chebyshev Polynomials
Chebyshev Acceleration:
- $r_0 = b - Ax_0$; $\sigma_1 = \theta/\delta$
- $\rho_0 = 1/\sigma_1$; $d_0 = \frac{1}{\theta}r_0$
- For $k = 0, \dots,$ until convergence Do:
- $x_{k+1} = x_k + d_k$
- $r_{k+1} = r_k - Ad_k$
- $\rho_{k+1} = (2\sigma_1 - \rho_k)^{-1}$
- $d_{k+1} = \rho_{k+1}\rho_k d_k + \frac{2\rho_{k+1}}{\delta}r_{k+1}$
- EndDo
Chapter 13. Domain Decomposition Methods
new classes of numerical methods that can take better advantage of parallelism are emerging. Among these techniques, domain decomposition methods are undoubtedly the best known and perhaps the most promising for certain types of problems.
13.1 Introduction
Irregular grid partitionings were introduced in Section 2.2.5.
13.1.1 Notation
Consider the following problem:
Illustration:
After discretization (the interface nodes are labeled last), we obtain the matrix below,
where:
- $x_i$ represents the subvector of unknowns that are interior to subdomain $\Omega_i$
- $y$ represents the vector of all interface unknowns
Thus,
- E represents the subdomain to interface coupling seen from the subdomains,
- F represents the interface to subdomain coupling seen from the interface nodes.
Domain decomposition or substructuring methods attempt to solve the problem on the entire domain $\Omega = \cup\,\Omega_i$ from problem solutions on the subdomains $\Omega_i$.
advantages of domain-decomposition:
- In the case of the above picture, one obvious reason is that the subproblems are much simpler because of their rectangular geometry. For example, fast solvers can be used on each subdomain in this case.
- A second reason is that the physical problem can sometimes be split naturally into a small number of subregions where the modeling equations are different (e.g., Euler’s equations on one region and Navier-Stokes in another)
13.1.2 Types of Partitionings
vertex-based partitioning:do not allow vertex to be split between two subdomains
edge-based partitioning: do not allow edges to be split between two subdomains
element-based partitioning: do not allow elements to be split between two subdomains
13.1.3 Types of Techniques
Domain decomposition techniques are distinguished by four features:
- Type of partitioning
- Overlap
- Processing of interface values
- Subdomain solution: should the subdomain problems be solved exactly, or approximately by an iterative method?
The methods in this chapter fall into the following groups:
- direct methods and the substructuring approach, which are useful for introducing some definitions and for providing practical insight
- among the simplest and oldest techniques, the Schwarz Alternating Procedures
- methods based on preconditioning the Schur complement system
- methods based on solving the linear system with the matrix A, using a preconditioner derived from Domain Decomposition concepts
13.2 Direct solution and the schur complement
The matrix $S = C - FB^{-1}E$ is called the Schur complement matrix associated with the $y$ variable. (13.4) is called the reduced system.
a solution method:
- Obtain the right-hand side of the reduced system (13.4)
- Form the Schur complement matrix
- Solve the reduced system (13.4)
- Back-substitute using (13.3) to obtain the other unknowns.
Block-Gaussian Elimination:
- Solve $BE' = E$ and $Bf' = f$ for $E'$ and $f'$, respectively
- Compute $g' = g - Ff'$
- Compute $S = C - FE'$
- Solve $Sy = g'$
- Compute $x = f' - E'y$
where:
- $E' = B^{-1}E$ and $f' = B^{-1}f$.
13.3 Schwarz alternating procedures
The original alternating procedure described by Schwarz in 1870 consisted of three parts:
- alternating between two overlapping domains
- solving the Dirichlet problem on one domain at each iteration
- taking boundary conditions based on the most recent solution obtained from other domain.
This procedure is called the Multiplicative Schwarz procedure. In matrix term, this is very reminiscent of the block Gauss-Seidel iteration with overlap defined with the help of projectors. The analogue of the block-Jacobi procedure is known as the Additive Schwarz procedure.
13.3.1 Multiplicative Schwarz Procedure
In the following, assume that each pair of neighboring subdomains has a nonvoid overlapping region. The boundary of subdomain $\Omega_i$ that is included in subdomain $j$ is denoted by $\Gamma_{i,j}$.
Each subdomain extends beyond its initial boundary into neighboring subdomains. Call $\Gamma_i$ the boundary of $\Omega_i$, consisting of its original boundary (denoted $\Gamma_{i,0}$) and the $\Gamma_{i,j}$'s, and denote by $u_{j,i}$ the restriction of the solution $u$ to the boundary $\Gamma_{j,i}$. Then:
Schwarz Alternating Procedure (SAP):
- Choose an initial guess $u$ for the solution
- Until convergence Do:
- For $i = 1, \dots, s$ Do:
- Solve $\Delta u = f$ in $\Omega_i$ with $u = u_{i,j}$ on $\Gamma_{i,j}$
- Update the $u$ values on $\Gamma_{j,i}$, $\forall j$
- EndDo
- EndDo
The algorithm sweeps through the s subdomains and solves the original equation in each of them by using boundary conditions that are updated from the most recent values of u.
We follow the model of the overlapping block-Jacobi technique seen in the previous chapter. Let $S_i$ be an index set of the subdomain,
$S_i = \{j_1, j_2, \dots, j_{n_i}\}$,
where the indices $j_k$ are those associated with the $n_i$ mesh points of the interior of the discrete subdomain $\Omega_i$.
$n_i$ is the number of grid points contained in subdomain $i$.
Let:
- $R_i$ be a restriction operator from $\Omega$ to $\Omega_i$. (By definition, $x_i = R_i x$ belongs to $\Omega_i$; in other words, $R_i$ extracts from $x$ the subvector of components belonging to $\Omega_i$.)
- $R_i^T$, the transpose, be a prolongation operator which takes a variable from $\Omega_i$ and extends it to the equivalent variable in $\Omega$.
Multiplicative Schwarz procedure:
- For $i = 1, \dots, s$ Do:
- $x = x + R_i^T A_i^{-1} R_i(b - Ax)$
- EndDo
with the matrix $A_i = R_i A R_i^T$.
My own understanding:
- $R_i(b - Ax)$: extract the residual of the subdomain
- $A_i^{-1}R_i(b - Ax)$: solve for the error within the subdomain
- $R_i^T A_i^{-1}R_i(b - Ax)$: extend the subdomain error to the whole domain
Let the error be $d = x^\star - x$. Then during the multiplicative Schwarz procedure:
equivalently:
in which $P_i = R_i^T A_i^{-1} R_i A$.
Then:
Let:
Note: the restriction matrix has the form:
13.3.2 Multiplicative Schwarz preconditioning
Proposition:
The multiplicative Schwarz procedure is equivalent to a fixed-point iteration for the "preconditioned" problem $M^{-1}Ax = M^{-1}b$.
Define the matrices:
- $T_i = P_i A^{-1} = R_i^T A_i^{-1} R_i$
Denote:
- zi=Miv
Multiplicative Schwarz Preconditioner:
- Input: $v$; Output: $z = M^{-1}v$
- $z = T_1 v$
- For $i = 2, \dots, s$ Do:
- $z = z + T_i(v - Az)$
- EndDo
Then $z_s = M^{-1}v$.
Multiplicative Schwarz Preconditioned Operator:
- Input: $v$; Output: $z = M^{-1}Av$
- $z = P_1 v$
- For $i = 2, \dots, s$ Do:
- $z = z + P_i(v - z)$
- EndDo
13.3.3 Additive Schwarz Procedure
The additive Schwarz procedure is similar to a block-Jacobi iteration and consists of updating all the new (block) components from the same residual. Thus, it differs from the multiplicative procedure only because the components in each subdomain are not updated until a whole cycle of updates through all domains is completed.
Basic Additive Schwarz iteration:
- For $i = 1, \dots, s$ Do:
- Compute $\delta_i = R_i^T A_i^{-1} R_i(b - Ax)$
- EndDo
- $x_{new} = x + \sum_{i=1}^{s}\delta_i$
Additive Schwarz Preconditioner:
- Input: $v$; Output: $z = M^{-1}v$
- For $i = 1, \dots, s$ Do:
- Compute $z_i = T_i v$
- EndDo
- Compute $z = z_1 + z_2 + \dots + z_s$
Part 2: Multigrid
Reference: "A Multigrid Tutorial", 2nd edition
Multigrid can solve a linear system with N unknowns with only O(N) work.
Alternating between coarse and fine grids fixes the weakness of the basic iterative methods. First, iterate on the fine grid so that the high-frequency error components decay, leaving mostly low-frequency error. Then move this smooth remaining error to the right-hand side of the system and iterate on a coarse grid, where the low-frequency error also decays quickly. Finally, to preserve accuracy, add the coarse-grid result back to the fine-grid result and use the sum as the initial guess for further fine-grid iterations, repeating the cycle. Because the grid transfers go fine-to-coarse and then coarse-to-fine, the path traces out a V shape, so the method is called V-Cycle Multigrid. There are also F-cycle and W-cycle multigrid, built on the same idea.
Notation used in this book:
- We always use u to denote the exact solution of the system and v to denote an approximation to the exact solution, perhaps generated by some iterative method.
- $\vec{u}$ and $\vec{v}$ represent vectors, while the jth components of these vectors are denoted by $u_j$ and $v_j$. In later chapters, we need to associate u and v with a particular grid $\Omega^h$; in this case, the notation $u^h$ and $v^h$ is used.
- error: $e = u - v$
- residual: $r = f - Av$; hence the residual equation $Ae = r$
- initial guess: the starting value of the iteration
Chapter 2. Basic Iterative Methods
See Chapter 4 of the iterative-methods notes above for details.
Chapter 3. Elements of Multigrid
3.1 Moving toward multigrid
Take initial guesses consisting of the vectors (Fourier modes) $v_j = \sin\left(\frac{jk\pi}{n}\right)$; by Fourier analysis, an arbitrary function (and hence the target vector $v$) can be composed of sine modes.
$j$: component index of the vector $v$
$n$: the number of grid nodes
$k$: wavenumber (also called frequency)
with $0 \le j \le n$ and $1 \le k \le n - 1$.
(1) Different frequencies:
(2) Results of iterating on different frequencies:
many standard iterative methods possess the smoothing property. This property makes these methods very effective at eliminating the high-frequency or oscillatory components of the error, while leaving the low-frequency or smooth components relatively unchanged.
Why use a coarse grid? Relaxation on a coarse grid is less expensive because there are fewer unknowns to be updated.
(3) Grids of different resolution:
Through the phenomenon of aliasing, the k-th mode on $\Omega^h$ becomes the $(n-k)$-th mode on $\Omega^{2h}$ when $k > n/2$.
In other words, high/low frequency on the fine grid corresponds to low/high frequency on the coarse grid (smooth modes on a fine grid look less smooth on a coarse grid).
3.2 Transfer between fine and coarse grids
(1) Fine grid to coarse grid: the restriction operator $I_h^{2h}$
a weighted sum over neighboring points,
for example, for $n = 8$:
that is,
(2) Coarse grid to fine grid: the linear interpolation operator $I_{2h}^h$
this is a procedure called interpolation or prolongation.
$I_{2h}^h$ and $I_h^{2h}$ are transposes of each other (up to a constant factor).
the linear interpolation,
for example, for $n = 8$:
interpolation
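A sketch of the 1-D operators as explicit matrices (for the $n = 8$ example: 7 fine-grid and 3 coarse-grid interior points; the full-weighting restriction is half the transpose of linear interpolation):

```python
import numpy as np

def interpolation_1d(n_coarse):
    """Linear interpolation I_{2h}^h: n_coarse points -> 2*n_coarse + 1 fine points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j, j] = 0.5        # fine point to the left of coarse point j
        P[2 * j + 1, j] = 1.0    # fine point coinciding with coarse point j
        P[2 * j + 2, j] = 0.5    # fine point to the right
    return P

P = interpolation_1d(3)          # n = 8: 7 fine points, 3 coarse points
R = 0.5 * P.T                    # full weighting: I_h^{2h} = (1/2) (I_{2h}^h)^T
print(R @ np.arange(1, 8.0))     # each entry is (v_{2j-1} + 2 v_{2j} + v_{2j+1}) / 4
```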
3.3 Two-Grid Correction Scheme
The two main steps of Multigrid:
- Relax (smooth): apply an iterative solver to the linear system
- Correct: update the approximation $v^h$
Relaxation on the original equation $Au = f$ with an arbitrary initial guess $v$ is equivalent to relaxing on the residual equation $Ae = r$ with the specific initial guess $e = 0$.
Steps of the two-grid correction scheme (MG):
- Relax $\nu_1$ times on $A^h u^h = f^h$ on $\Omega^h$ with initial guess $v^h$ (relaxation is also called smoothing) // pre-smoothing
- Compute the fine-grid residual $r^h = f^h - A^h v^h$ and restrict it to the coarse grid by $r^{2h} = I_h^{2h}r^h$.
- Solve $A^{2h}e^{2h} = r^{2h}$ on $\Omega^{2h}$.
- Interpolate the coarse-grid error to the fine grid by $e^h = I_{2h}^h e^{2h}$ and correct the fine-grid approximation by $v^h \leftarrow v^h + e^h$.
- Relax $\nu_2$ times on $A^h u^h = f^h$ on $\Omega^h$ with initial guess $v^h$.
The steps above can be cycled several times:
- build the grid hierarchy and move to the coarse grid
- solve the residual equation
- transfer the solution $e$ back to the fine grid and use $e$ to update $v$
3.4 V-Cycle Scheme
Writing the V-cycle as $v^h \leftarrow V^h(v^h, f^h)$, its steps are:
- Relax $\nu_1$ times on $A^h u^h = f^h$ with initial guess $v^h$.
- Compute $f^{2h} = I_h^{2h}r^h = I_h^{2h}(f^h - A^h v^h)$.
- Relax $\nu_1$ times on $A^{2h}u^{2h} = f^{2h}$ with initial guess $v^{2h} = 0$.
- Compute $f^{4h} = I_{2h}^{4h}r^{2h}$.
- Relax $\nu_1$ times on $A^{4h}u^{4h} = f^{4h}$ with initial guess $v^{4h} = 0$.
- Compute $f^{8h} = I_{4h}^{8h}r^{4h}$.
- · · ·
- Solve $A^{Lh}u^{Lh} = f^{Lh}$.
- · · ·
- Correct $v^{4h} \leftarrow v^{4h} + I_{8h}^{4h}v^{8h}$.
- Relax $\nu_2$ times on $A^{4h}u^{4h} = f^{4h}$ with initial guess $v^{4h}$.
- Correct $v^{2h} \leftarrow v^{2h} + I_{4h}^{2h}v^{4h}$.
- Relax $\nu_2$ times on $A^{2h}u^{2h} = f^{2h}$ with initial guess $v^{2h}$.
- Correct $v^h \leftarrow v^h + I_{2h}^h v^{2h}$.
- Relax $\nu_2$ times on $A^h u^h = f^h$ with initial guess $v^h$.
Recursive form of the V-cycle:
- Relax $\nu_1$ times on $A^h u^h = f^h$ with a given initial guess $v^h$.
- If $\Omega^h$ is the coarsest grid, go to step 4. Else:
- $f^{2h} \leftarrow I_h^{2h}(f^h - A^h v^h)$
- $v^{2h} \leftarrow 0$
- $v^{2h} \leftarrow V^{2h}(v^{2h}, f^{2h})$
- Correct $v^h \leftarrow v^h + I_{2h}^h v^{2h}$.
- Relax $\nu_2$ times on $A^h u^h = f^h$ with initial guess $v^h$.
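A compact recursive V-cycle sketch following the steps above (weighted Jacobi smoother; the coarse operator is formed as the Galerkin product $A^{2h} = I_h^{2h} A^h I_{2h}^h$, one common choice):

```python
import numpy as np

def interp(nc):                                    # linear interpolation I_{2h}^h
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2*j, j], P[2*j + 1, j], P[2*j + 2, j] = 0.5, 1.0, 0.5
    return P

def weighted_jacobi(A, f, v, sweeps, omega=2/3):
    D = np.diag(A)
    for _ in range(sweeps):
        v = v + omega * (f - A @ v) / D
    return v

def v_cycle(A, f, v, nu1=2, nu2=2):
    v = weighted_jacobi(A, f, v, nu1)              # relax nu1 times (pre-smoothing)
    n = A.shape[0]
    if n >= 7:                                     # a coarser grid still exists
        P = interp((n - 1) // 2)
        R = 0.5 * P.T                              # full-weighting restriction
        r2h = R @ (f - A @ v)                      # restrict the residual
        A2h = R @ A @ P                            # Galerkin coarse-grid operator
        e2h = v_cycle(A2h, r2h, np.zeros(len(r2h)), nu1, nu2)
        v = v + P @ e2h                            # interpolate the error and correct
    else:
        v = np.linalg.solve(A, f)                  # solve exactly on the coarsest grid
    return weighted_jacobi(A, f, v, nu2)           # relax nu2 times (post-smoothing)

n = 63                                             # 2^6 - 1 interior points
A = (n + 1) ** 2 * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
f = np.ones(n)
v = np.zeros(n)
for _ in range(10):
    v = v_cycle(A, f, v)
print(np.linalg.norm(f - A @ v))
```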
3.5 μ-cycle multigrid
When $\mu = 1$, this is the V-cycle.
Writing μ-cycle multigrid as $v^h \leftarrow M\mu^h(v^h, f^h)$, its steps are:
- Relax $\nu_1$ times on $A^h u^h = f^h$ with a given initial guess $v^h$.
- If $\Omega^h$ is the coarsest grid, go to step 4. Else:
- $f^{2h} \leftarrow I_h^{2h}(f^h - A^h v^h)$
- $v^{2h} \leftarrow 0$
- $v^{2h} \leftarrow M\mu^{2h}(v^{2h}, f^{2h})$, $\mu$ times.
- Correct $v^h \leftarrow v^h + I_{2h}^h v^{2h}$.
- Relax $\nu_2$ times on $A^h u^h = f^h$ with initial guess $v^h$.
3.6 Full Multigrid V-cycle
combines V-cycles at different levels.
Writing the Full Multigrid V-cycle (FMG) as $v^h \leftarrow FMG^h(f^h)$, its steps are:
- Initialize $f^{2h} \leftarrow I_h^{2h}f^h$, $f^{4h} \leftarrow I_{2h}^{4h}f^{2h}$, ...
- Solve or relax on the coarsest grid.
- · · ·
- $v^{2h} \leftarrow I_{4h}^{2h}v^{4h}$.
- $v^{2h} \leftarrow V^{2h}(v^{2h}, f^{2h})$, $\nu_0$ times.
- $v^h \leftarrow I_{2h}^h v^{2h}$.
- $v^h \leftarrow V^h(v^h, f^h)$, $\nu_0$ times.
Recursive form of FMG:
- If $\Omega^h$ is the coarsest grid, set $v^h \leftarrow 0$ and go to step 3. Else:
- $f^{2h} \leftarrow I_h^{2h}(f^h)$
- $v^{2h} \leftarrow FMG^{2h}(f^{2h})$
- Correct $v^h \leftarrow I_{2h}^h v^{2h}$.
- $v^h \leftarrow V^h(v^h, f^h)$, $\nu_0$ times.
Illustrations:
(a) V-cycle multigrid
(b) W-cycle multigrid
(c) FMG
Chapter 4. Implementation
4.1 Data structure
There seems to be general agreement that the solutions $\vec{v}$ and right-side vectors $\vec{f}$ on the various grids should be stored contiguously in single arrays.
Consider:
- 4-level V-cycle
- 1-D problem
- n=16 points
Illustration of the data structure:
4.2 Computational cost
work unit (WU): the cost of performing one relaxation sweep on the finest grid.
Consider:
- a d-dimensional V-cycle with one relaxation sweep on each level ($\nu_1 = \nu_2 = 1$).
- Each level is visited twice, and grid $\Omega^{ph}$ requires $p^{-d}$ work units.
The V-cycle computation cost is therefore $2(1 + 2^{-d} + 2^{-2d} + \cdots) < \frac{2}{1 - 2^{-d}}$ WUs:
- 4 WUs for a one-dimensional (d = 1) problem
- 8/3 WUs for d = 2
- 16/7 WUs for d = 3
Consider:
- a d-dimensional FMG cycle with one relaxation sweep on each level ($\nu_0 = \nu_1 = \nu_2 = 1$).
The FMG computation cost is then roughly $\frac{2}{(1 - 2^{-d})^2}$ WUs, giving approximately:
- 8 WUs for d = 1
- 7/2 WUs for d = 2
- 5/2 WUs for d = 3
4.3 Predictive Tools: Local Mode Analysis
The remainder of this chapter presents some practical tools for determining whether an algorithm is working properly
Because the spectral radius is hard to compute (eigenvalues are expensive), it is not a convenient tool for analyzing the convergence factor. Instead, one can use local mode analysis (also called normal mode analysis or Fourier analysis) to approximate the convergence and smoothing factors.
Local mode analysis begins with the assumption that relaxation is a local process: each unknown is updated using information from nearby neighbors.
4.4 Diagnostic Tools
As with any numerical code, debugging can be the most difficult part of creating a successful program. For multigrid, this situation is exacerbated in two ways:
- the interactions between the various multigrid components are very subtle. It can be difficult to determine which part of a code is defective
- an incorrectly implemented multigrid code can perform quite well (sometimes better than other solution methods)
Chapter 6. Nonlinear Problems
6.1 Formal difference between linear and nonlinear
The book notes that nonlinear Gauss-Seidel is often used to solve nonlinear systems.
(1) Introduction:
Consider
- a system of nonlinear algebraic equations $A(u) = f$, where $u, f \in \mathbb{R}^n$ (the notation $A(u)$, rather than $Au$, signifies that the operator is nonlinear)
Analogous to the linear system:
- $e = u - v$
- $r = A(u) - A(v)$
Because $A$ is nonlinear, we cannot conclude $r = A(u) - A(v) = A(e)$, so there is no linear residual equation; we now use $r = A(u) - A(v)$ as the residual equation.
(2) Newton-multigrid:
The residual equation is equivalent to $A(v + e) - A(v) = r$.
Expanding $A(v + e)$ in a Taylor series about $v$ and truncating the series after two terms, we obtain the linear system of equations $J(v)e = r$,
where $J = (\partial A_i/\partial u_j)$ is the $n \times n$ Jacobian matrix.
This system can be solved for $e$, and the current approximation $v$ can be updated by $v \leftarrow v + e$. Iterating this step is Newton's method. If, in addition, multigrid is used at each step to solve the linear system, the combination is called Newton-multigrid.
6.2 Full Approximation Scheme(FAS)
Steps:
- Restrict the current approximation and its fine-grid residual to the coarse grid: $r^{2h} = I_h^{2h}(f^h - A^h(v^h))$ and $v^{2h} = I_h^{2h}v^h$.
- Solve the coarse-grid problem $A^{2h}(u^{2h}) = A^{2h}(v^{2h}) + r^{2h}$.
- Compute the coarse-grid approximation to the error: $e^{2h} = u^{2h} - v^{2h}$.
- Interpolate the error approximation up to the fine grid and correct the current fine-grid approximation: $v^h \leftarrow v^h + I_{2h}^h e^{2h}$.
FAS can be viewed as a generalization of the two-grid correction scheme to nonlinear problems
Example:
Consider the two-point boundary value problem:
where the source term f and a constant $\gamma > 0$ are given.
Let
- $\Omega^h$ consist of the grid points $x_j = j/n$ for some positive even integer $n$,
- $u_j = u(x_j)$ and $f_j = f(x_j)$, for $j = 0, 1, \dots, n$.
Using finite differences:
this scalar equation is linear in $u_j^h$ even though the vector equation $A^h(u^h) = f^h$ is nonlinear in $u^h$; nonlinear Gauss-Seidel requires no Newton step.
The residual equation is given by:
replacing $A_j^{2h}$ in the residual equation with the finite-difference $A_j^h(u)$ above.
Chapter 7. Multigrid preconditioning
Multigrid can be used as a preconditioner.
If the matrix of the original equation or an eigenvalue problem is symmetric positive definite (SPD), the preconditioner is commonly constructed to be SPD as well, so that conjugate gradient can still be used.
When using multigrid as a preconditioner, the action of this preconditioner is obtained by applying a fixed number of multigrid cycles.
As a standalone solver, multigrid simply stops wherever it stops, and whatever residual remains is left alone; used as a preconditioner, the residual keeps shrinking according to the properties of the Krylov solver it is wrapped in.
7.1 MGPCG
MST10.pdf (ucla.edu) : A parallel multigrid Poisson solver for fluids simulation on large grids
MGPCG:
V-Cycle Algorithm:
Smoother:
Chapter 8. Algebraic Multigrid(AMG)
论文《An Introduction to Algebraic Multigrid》
8.1 Introduction
Overview:
- Algebraic multigrid methods (AMG) construct their hierarchy of operators directly from the system matrix. (The AMG method determines coarse "grids", inter-grid transfer operators, and coarse-grid equations based solely on the matrix entries.)
- In classical AMG, the levels of the hierarchy are simply subsets of unknowns without any geometric interpretation. Thus, AMG methods become black-box solvers for certain classes of sparse matrices.
- AMG requires no explicit knowledge of the problem geometry. (Since AMG knows nothing about the geometry of the problem, we need to characterize smoothness in some algebraic way.)
The interpolation matrix $I_{2h}^h$ from earlier chapters is now written $P$, and $I_h^{2h}$ is written $P^T$.
Because AMG is based only on the matrix $A$, there are few options for defining the coarse system $A_c$. The most common approach is to use the Galerkin operator $A_c = P^T A P$.
8.1.1 Adjacency graph
fine grid: solving $Au = f$, where $u = [u_1 \dots u_n]$. The fine-grid points are then just the indices $1, 2, \dots, n$.
the connections within the grid are determined by the undirected adjacency graph of the matrix:
The connections in the grid are the edges in the graph
The grid in AMG is simply the set of vertices in the graph(grid point i is just vertex i)
8.1.2 Algebraic smoothness
If the problem provides no geometric information, then we cannot simply examine the Fourier modes of the error.
we define smooth error loosely to be any error that is not reduced effectively by relaxation. (例如:粗网格上迭代高频,细网格上迭代低频,会导致error无法下降)
Weighted Jacobi relaxation, as we know, has the property that after making great progress toward convergence, it stalls, and little improvement is made with successive iterations. At this point, we define the error to be algebraically smooth. (By this definition, algebraic smoothness means that the size of $e^{i+1}$ is not significantly less than that of $e^i$; the error after iteration $i+1$ is not necessarily smaller than after iteration $i$.)
8.1.3 Influence and Dependence
(1)Dependence:
Given a threshold value $0 < \theta \le 1$, the variable (point) $u_i$ strongly depends on the variable (point) $u_j$ if $-a_{ij} \ge \theta \max_{k \ne i}(-a_{ik})$.
This says that grid point i strongly depends on grid point j if the coefficient $a_{ij}$ is comparable in magnitude to the largest off-diagonal coefficient in the ith equation.
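A sketch of this test as a function (θ is the threshold; the usual classical-AMG sign convention with negative off-diagonal entries is assumed):

```python
import numpy as np

def strong_dependencies(A, theta=0.25):
    """For each i, the set of j such that u_i strongly depends on u_j:
    -a_ij >= theta * max_{k != i} (-a_ik)."""
    n = A.shape[0]
    S = []
    for i in range(n):
        off = -A[i]                  # unary minus makes a copy of the row
        off[i] = -np.inf             # exclude the diagonal entry
        thresh = theta * off.max()
        S.append({j for j in range(n) if j != i and A[i, j] != 0 and off[j] >= thresh})
    return S

n = 5
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
print(strong_dependencies(A))        # each point depends on its 1-D neighbors
```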
(2)Influence:
If the variable ui strongly depends on the variable uj , then the variable uj strongly influences the variable ui.
8.1.4 Defining the Interpolation Operator
The coarsening procedure partitions the grid into C-points (points on the coarse grid) and F-points (points not on the coarse grid (only on the fine-grid) )
For each fine-grid point i, we define $N_i$, the neighborhood of i, to be the set of all points $j \ne i$ such that $a_{ij} \ne 0$. These points can be divided into three categories:
- the neighboring coarse-grid points that strongly influence i; this is the coarse interpolatory set for i, denoted by $C_i$;
- the neighboring fine-grid points that strongly influence i, denoted by $D_i^s$;
- the points that do not strongly influence i, denoted by $D_i^w$; this set may contain both coarse- and fine-grid points; it is called the set of weakly connected neighbors.
smooth error that must be interpolated to the fine grid:
8.1.5 Selecting the Coarse Grid
8.2 Classical AMG
(1)Choosing the coarse grid:
- Define a strength matrix As by deleting weak connections in A.
- First pass:Choose an independent set of fine-grid points based on the graph of As.
- Second pass:Choose additional points if needed to satisfy interpolation requirements.
8.3 AMG Two-Grid correction cycle
Steps:
- Relax $\nu_1$ times on $A^h u^h = f^h$ with initial guess $v^h$.
- Compute the fine-grid residual $r^h = f^h - A^h v^h$ and restrict it to the coarse grid by $r^{2h} = I_h^{2h}r^h$.
- Solve $A^{2h}e^{2h} = r^{2h}$ on $\Omega^{2h}$.
- Interpolate the coarse-grid error to the fine grid by $e^h = I_{2h}^h e^{2h}$ and correct the fine-grid approximation by $v^h \leftarrow v^h + e^h$.
- Relax $\nu_2$ times on $A^h u^h = f^h$ with initial guess $v^h$.
Chapter 9. Multilevel Adaptive Methods
The goal of this chapter is to understand how to treat local demands in a multilevel context without overburdening the overall computation (solving a local demand in a targeted way, without computing everywhere).
9.1 Introduction
To keep matters simple, assume that a local fine grid with mesh size $h = \frac{1}{8}$ is needed on the interval $(\frac{1}{2}, 1)$ to resolve some special feature near $x = \frac{3}{4}$, but that a mesh size of $2h = \frac{1}{4}$ is deemed adequate for the rest of the domain. (To resolve the feature near $x = \frac{3}{4}$, the grid on $(\frac{1}{2}, 1)$ must be as fine as $h = \frac{1}{8}$; on the remaining interval $(0, \frac{1}{2})$, a resolution of $2h = \frac{1}{4}$ suffices.)
The idea:
- Outside the local fine grid, do not relax; directly restrict the residual to the coarse grid.
- Once the residual has been restricted to the coarse grid, later cycles only need to update the coarse-grid residual there.
- On the local fine grid, follow the usual procedure.
- On the coarse grid, relax on the residual equation to obtain the error, then interpolate the error back to the fine grid and correct.
Since no relaxation is done on the fine grid outside the local region, a parameter $w$ can be used to accumulate the coarse-grid solution there.
9.2 Fast Adaptive Composite Grid Method(FAC)
- Initialize $v^h = 0$, $w_1^{2h} = 0$, and $f_1^{2h} \equiv (I_h^{2h}f^h)_1 = \frac{f_1^h + 2f_2^h + f_3^h}{4}$.
- Relax on $v^h$ on the local fine grid $\{x_5^h, x_6^h, x_7^h\} = \{\frac{5}{8}, \frac{3}{4}, \frac{7}{8}\}$.
- Compute the right sides for the local coarse grid: $f_2^{2h} \leftarrow g_2^{2h} - \frac{-w_1^{2h} + 2v_4^h - v_6^h}{(2h)^2}$, $f_3^{2h} \leftarrow \frac{r_5^h + 2r_6^h + r_7^h}{4}$.
- Compute an approximation $v^{2h}$ to the coarse-grid residual equation $A^{2h}u^{2h} = f^{2h}$.
- Update the residual at $x = \frac{1}{4}$ for use in later cycles: $f_1^{2h} \leftarrow f_1^{2h} - \frac{-v_0^{2h} + 2v_1^{2h} - v_2^{2h}}{(2h)^2}$.
- Update the coarse-grid approximation at $x = \frac{1}{4}$: $w_1^{2h} \leftarrow w_1^{2h} + v_1^{2h}$.
- Interpolate the correction and update the approximation: $v^h \leftarrow v^h + I_{2h}^h v^{2h}$ at the local fine grid and the interface points $\{\frac{1}{2}, \frac{5}{8}, \frac{3}{4}, \frac{7}{8}\}$.
Additional notes:
Fluids
Warm starting: use the pressure $p$ from the last frame as the initial guess for the current frame ($p$ is the fluid pressure field). (This helps little for MGPCG, but works well for other methods.)
Why use multigrid? During discretization, each interior cell is coupled only to its 4 neighboring cells (e.g., by finite differences), so the linear operator "sees" just 5 cells at a time, which is quite high frequency; the divergence of the velocity field, however, may be a large-scale, low-frequency signal that such a local operator cannot capture. (In CG-based pipelines, multigrid is rarely used as the solver itself, but rather as a preconditioner.) (Typically, multigrid reduces the residual by about one order of magnitude per iteration.)
Papers:
- (sig2018) An Advection-Reflection Solver for Detail-Preserving Fluid Simulation