Linear Algebra Final Review
Linear System
consistent/inconsistent, coefficient matrix, augmented matrix
Elementary row operations (ERO)
- replacement: replace one row by the sum of itself and a multiple of another row ($A_i \to A_i + rA_j$).
- interchange: interchange two rows ($A_i \leftrightarrow A_j$).
- scaling: multiply a row by a nonzero scalar ($A_i \to rA_i$, $r \neq 0$).
echelon form, reduced echelon form, pivot position
Def. A matrix is in an echelon form if the following conditions hold.
- all nonzero rows are above all zero rows;
- each leading entry (the leftmost nonzero entry) of a row is in a column to the right of the leading entry of the row above it.
- (a consequence) all entries in a column below a leading entry are zero.
Def. A matrix is in reduced row echelon form if the following additional conditions are satisfied.
- the leading entry in each row is $1$.
- each leading $1$ is the only nonzero entry in its column.
Def. A pivot position in a matrix $A$ is a location in $A$ that corresponds to a leading $1$ in the reduced echelon form of $A$ (i.e. a leading entry of a row in an echelon form of $A$). A pivot column is a column of $A$ containing a pivot position.
The row reduction algorithm to solve a linear system
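The algorithm can be sketched in NumPy; this is a minimal floating-point version using only the three EROs (the $3 \times 3$ test matrix is an arbitrary invertible example):

```python
import numpy as np

def rref(A, tol=1e-12):
    """Row-reduce A using the three EROs; returns (RREF, pivot column indices)."""
    A = A.astype(float).copy()
    m, n = A.shape
    pivot_cols, pr = [], 0                          # pr = current pivot row
    for col in range(n):
        if pr == m:
            break
        r = pr + np.argmax(np.abs(A[pr:, col]))     # partial pivoting for stability
        if abs(A[r, col]) < tol:
            continue                                # no pivot in this column
        A[[pr, r]] = A[[r, pr]]                     # interchange
        A[pr] /= A[pr, col]                         # scaling: leading entry -> 1
        for i in range(m):                          # replacement: clear the column
            if i != pr:
                A[i] -= A[i, col] * A[pr]
        pivot_cols.append(col)
        pr += 1
    return A, pivot_cols

R, pivot_columns = rref(np.array([[1., 2., 3.],
                                  [2., 4., 7.],
                                  [1., 1., 1.]]))
# This matrix is invertible, so its RREF is the identity.
```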
Vector Spaces
field, vector space, subspace
$F$ is called a field if it is a set on which addition $+$ and multiplication $\cdot$ are defined and satisfy the field axioms (associativity, commutativity, distributivity, identities, and inverses).
$V$ is called a vector space over $F$ if it is a set on which addition and scalar multiplication are defined and satisfy the vector space axioms.
If $W \subseteq V$ and $W$ is itself a vector space under the addition and scalar multiplication inherited from $V$, then $W$ is called a subspace of $V$.
Thm. $W$ is a subspace of $V$ if and only if $0 \in W$ and $W$ is closed under addition and scalar multiplication.
linearly dependent/linearly independent
Thm. Suppose $S$ is linearly independent. Then $S \cup \{v\}$ is linearly dependent if and only if $v \in \operatorname{span}(S)$.
basis, dimension
Thm. Suppose $W$ is a subspace of a finite-dimensional vector space $V$. Then any linearly independent set in $W$ can be expanded, if necessary, to a basis for $W$.
Thm. (The basis theorem) Let $V$ be a $p$-dimensional vector space. Then any linearly independent set of exactly $p$ elements in $V$ is automatically a basis for $V$.
Likewise, any set of exactly $p$ elements that spans $V$ is automatically a basis for $V$.
Linear Transformation
linear transformation, range/image, kernel/null space, rank, nullity
Thm. (Dimension Theorem) Suppose $T: V \to W$ is linear and $V$ is finite-dimensional. Then $\operatorname{nullity}(T) + \operatorname{rank}(T) = \dim(V)$.
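For $T = L_A : F^n \to F^m$ this can be sanity-checked numerically: the rank is the number of nonzero singular values and the nullity the number of zero ones ($A$ here is an arbitrary rank-2 example):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])         # row 2 = 2 * row 1, so rank(A) = 2
m, n = A.shape

rank = np.linalg.matrix_rank(A)
sing = np.linalg.svd(A, compute_uv=False)
nullity = n - np.sum(sing > 1e-10)   # kernel dimension = number of zero singular values

assert rank + nullity == n            # nullity(T) + rank(T) = dim(V)
```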
one-to-one, onto
Thm. Suppose $\{v_1, v_2, \cdots, v_n\}$ is a basis for $V$. For any $w_1, w_2, \cdots, w_n \in W$, there exists exactly one linear transformation $T: V \to W$ such that $T(v_i) = w_i$ for $i = 1, 2, \cdots, n$.
coordinate vector of $x$ relative to $\beta$
Let $\beta = \{u_1, u_2, \cdots, u_n\}$ be an ordered basis for a finite-dimensional vector space $V$. For $x \in V$, let $a_1, a_2, \cdots, a_n$ be the unique scalars such that $x = \sum_{i=1}^n a_i u_i$. We define the coordinate vector of $x$ relative to $\beta$, denoted $[x]_\beta$, by $[x]_\beta = (a_1\ a_2\ \cdots\ a_n)^T$.
the vector space of all linear transformations from $V$ to $W$
We denote the vector space of all linear transformations from $V$ into $W$ by $\mathcal{L}(V, W)$. In the case that $V = W$, we write $\mathcal{L}(V)$ instead of $\mathcal{L}(V, W)$.
matrix representation
Def. Suppose that $V$ and $W$ are finite-dimensional vector spaces with ordered bases $\beta = \{v_1, v_2, \cdots, v_n\}$ and $\gamma = \{w_1, w_2, \cdots, w_m\}$, respectively. Let $T: V \to W$ be linear. Then for each $j$, $1 \le j \le n$, there exist unique scalars $a_{ij} \in F$, $1 \le i \le m$, such that $T(v_j) = \sum_{i=1}^m a_{ij} w_i$. We call the $m \times n$ matrix $A$ defined by $A_{ij} = a_{ij}$ the matrix representation of $T$ in the ordered bases $\beta$ and $\gamma$ and write $A = [T]_\beta^\gamma$.
Thm. Let $T: V \to W$ be linear, and let $\beta, \gamma$ be ordered bases for $V, W$, respectively. Then $[T(x)]_\gamma = [T]_\beta^\gamma [x]_\beta$.
Thm. Let $T: V \to W$ and $U: W \to Z$ be linear transformations, and let $\alpha, \beta, \gamma$ be ordered bases for $V, W, Z$, respectively. Then $[UT]_\alpha^\gamma = [U]_\beta^\gamma [T]_\alpha^\beta$.
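A concrete instance of the composition rule, using differentiation on polynomial spaces with the monomial bases (an assumed example, not from the text):

```python
import numpy as np

# [D]: differentiation P3 -> P2 in bases {1, x, x^2, x^3} and {1, x, x^2}
D1 = np.array([[0., 1., 0., 0.],
               [0., 0., 2., 0.],
               [0., 0., 0., 3.]])
# [D]: differentiation P2 -> P1 in bases {1, x, x^2} and {1, x}
D2 = np.array([[0., 1., 0.],
               [0., 0., 2.]])
# [D^2]: second derivative P3 -> P1, computed directly: (x^2)'' = 2, (x^3)'' = 6x
D_squared = np.array([[0., 0., 2., 0.],
                      [0., 0., 0., 6.]])

assert np.array_equal(D2 @ D1, D_squared)   # [UT] = [U][T]
```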
left-multiplication transformation
Def. Let $A$ be an $m \times n$ matrix with entries from a field $F$. We denote by $L_A$ the mapping $L_A: F^n \to F^m$ defined by $L_A(x) = Ax$ (the matrix product of $A$ and $x$) for each column vector $x \in F^n$. We call $L_A$ a left-multiplication transformation.
inverse, invertible
Thm. A linear transformation is invertible if and only if it’s both one-to-one and onto.
Thm. Let $T: V \to W$ be a linear transformation. If $\dim(V) = \dim(W)$, then $T$ is invertible if and only if $\operatorname{rank}(T) = \dim(V)$.
isomorphic, isomorphism
Def. Let $V$ and $W$ be vector spaces. We say $V$ is isomorphic to $W$ if there exists an invertible linear transformation $T: V \to W$. Such a linear transformation is called an isomorphism from $V$ onto $W$.
Thm. Let $V$ and $W$ be finite-dimensional vector spaces over the same field. Then $V$ is isomorphic to $W$ if and only if $\dim(V) = \dim(W)$.
standard representation of $V$ with respect to $\beta$
Def. Let $\beta$ be an ordered basis for an $n$-dimensional vector space $V$ over a field $F$. The standard representation of $V$ with respect to $\beta$ is the function $\phi_\beta: V \to F^n$ defined by $\phi_\beta(x) = [x]_\beta$ for each $x \in V$.
change of coordinate matrix
Def. Let $\beta$ and $\beta'$ be two ordered bases for a finite-dimensional vector space $V$. The matrix $Q = [I_V]_{\beta'}^{\beta}$ is called a change of coordinate matrix, and we say that $Q$ changes $\beta'$-coordinates into $\beta$-coordinates.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $\beta$ and $\beta'$ be ordered bases of $V$. Suppose that $Q$ is the change of coordinate matrix that changes $\beta'$-coordinates into $\beta$-coordinates. Then $[T]_{\beta'} = Q^{-1} [T]_\beta Q$.
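A quick numeric check of $[T]_{\beta'} = Q^{-1}[T]_\beta Q$, with the coordinate-swap operator and an assumed example basis $\beta' = \{(1,1), (1,-1)\}$:

```python
import numpy as np

A = np.array([[0., 1.],
              [1., 0.]])                  # [T]_beta: T swaps the two coordinates
Q = np.array([[1., 1.],
              [1., -1.]])                 # columns: beta'-vectors in beta-coordinates

B = np.linalg.inv(Q) @ A @ Q              # [T]_{beta'} = Q^{-1} [T]_beta Q
assert np.allclose(B, np.diag([1., -1.])) # T fixes (1,1) and negates (1,-1)
```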
similar
linear operator, linear functional
coordinate function, dual space, dual basis
Def. Let $V$ be a finite-dimensional vector space and let $\beta = \{x_1, \cdots, x_n\}$ be an ordered basis for $V$. For each $i = 1, 2, \cdots, n$, define $f_i(x) = a_i$, where $[x]_\beta = (a_1\ a_2\ \cdots\ a_n)^T$ is the coordinate vector of $x$ relative to $\beta$. Then $f_i$ is a linear functional on $V$ called the $i$-th coordinate function with respect to the basis $\beta$.
Def. For a vector space $V$ over $F$, we define the dual space of $V$ to be the vector space $\mathcal{L}(V, F)$, denoted by $V^*$. We also define the double dual space $V^{**}$ of $V$ to be the dual space of $V^*$.
Thm. Suppose that $V$ is a finite-dimensional vector space with the ordered basis $\beta = \{x_1, \cdots, x_n\}$. Let $f_i$ ($1 \le i \le n$) be the $i$-th coordinate function with respect to $\beta$, and let $\beta^* = \{f_1, \cdots, f_n\}$. Then $\beta^*$ is an ordered basis for $V^*$, and for any $f \in V^*$, we have $f = \sum_{i=1}^n f(x_i) f_i$. The ordered basis $\beta^*$ is called the dual basis of $\beta$.
Thm. Let $V$ be a finite-dimensional vector space. For a vector $x \in V$, we define $\hat{x}: V^* \to F$ by $\hat{x}(f) = f(x)$ for every $f \in V^*$. Define $\psi: V \to V^{**}$ by $\psi(x) = \hat{x}$. Then $\psi$ is an isomorphism.
Matrix
rank
Def. If $A \in M_{m \times n}(F)$, we define the rank of $A$, denoted $\operatorname{rank}(A)$, to be the rank of the linear transformation $L_A: F^n \to F^m$.
Thm. Elementary row and column operations on a matrix are rank-preserving.
Every ERO can be described as left-multiplication by an elementary matrix; all of these matrices are invertible.
- Interchanging rows $i$ and $j$: starting from the identity matrix, set entries $(i,i)$ and $(j,j)$ to $0$, and set entries $(i,j)$ and $(j,i)$ to $1$.
- Adding $k$ times row $i$ to row $j$: starting from the identity matrix, set entry $(j,i)$ to $k$.
- Scaling row $i$ by $k$ ($k \neq 0$): starting from the identity matrix, set entry $(i,i)$ to $k$.
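The three constructions above can be sketched directly; left-multiplying by each elementary matrix reproduces the corresponding ERO (the test matrix `A` is arbitrary):

```python
import numpy as np

def interchange(m, i, j):
    E = np.eye(m); E[[i, j]] = E[[j, i]]; return E   # swap rows i and j of I

def replacement(m, i, j, k):
    E = np.eye(m); E[j, i] = k; return E             # E @ A adds k * row i to row j

def scaling(m, i, k):
    E = np.eye(m); E[i, i] = k; return E             # E @ A scales row i by k

A = np.arange(6.).reshape(3, 2)
assert np.array_equal(interchange(3, 0, 1) @ A, A[[1, 0, 2]])
assert np.array_equal((replacement(3, 0, 2, 5.) @ A)[2], A[2] + 5 * A[0])
assert np.array_equal((scaling(3, 1, 3.) @ A)[1], 3 * A[1])
```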
Thm. The rank of any matrix equals the maximum number of its linearly independent columns; that is, the rank of a matrix is the dimension of the subspace generated by its columns.
Corollary. The rank of a matrix equals the number of pivot columns.
Thm. Let $A$ be an $m \times n$ matrix of rank $r$. Then $r \le m$, $r \le n$, and by means of a finite number of elementary row and column operations, $A$ can be transformed into the matrix $D = \begin{pmatrix} I_r & O_1 \\ O_2 & O_3 \end{pmatrix}$, where $O_1, O_2, O_3$ are zero matrices. Thus $D_{ii} = 1$ for $i \le r$ and $D_{ij} = 0$ otherwise.
Corollary. Let $A$ be an $m \times n$ matrix of rank $r$. Then there exist invertible matrices $B$ and $C$ of sizes $m \times m$ and $n \times n$, respectively, such that $D = BAC$ and $D$ has the form above.
This "diagonalization" is carried out simply by performing row and column operations.
partitioned matrices
LU decomposition, inverse
For a linear system $Ax = b$, if we can decompose $A$ as $LU$, we can split the system into two triangular systems $Lc = b$ and $Ux = c$. Here, for an $m \times n$ matrix $A$, $L$ is an $m \times m$ lower triangular matrix, and $U$ is in echelon form.
Performing EROs on $A$ to bring it to echelon form yields $U$; $L_{ij}$ is the negative of the coefficient by which row $j$ was added to row $i$ during the process.
If $A$ is invertible, performing EROs on $(A \mid I)$ to turn $A$ into the identity matrix yields $(I \mid A^{-1})$.
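A minimal sketch of this recipe for a square matrix, assuming no row interchanges are needed (the example matrix is arbitrary):

```python
import numpy as np

def lu_nopivot(A):
    """Doolittle LU without pivoting (assumes the pivots stay nonzero)."""
    U = A.astype(float).copy()
    m = U.shape[0]
    L = np.eye(m)
    for j in range(m - 1):
        for i in range(j + 1, m):
            k = U[i, j] / U[j, j]   # row i gets (-k) * row j added ...
            U[i] -= k * U[j]        # ... the replacement ERO toward echelon form
            L[i, j] = k             # L records the negated coefficient
    return L, U

A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
L, U = lu_nopivot(A)
b = np.array([1., 2., 3.])
c = np.linalg.solve(L, b)           # forward solve  L c = b
x = np.linalg.solve(U, c)           # backward solve U x = c
assert np.allclose(L @ U, A) and np.allclose(A @ x, b)
```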
homogeneous system
Def. A system $Ax = b$ of $m$ linear equations in $n$ unknowns is said to be homogeneous if $b = 0$. Otherwise the system is said to be nonhomogeneous.
Thm. Let $K$ be the solution set of a system of linear equations $Ax = b$, and let $K_H$ be the solution set of the corresponding homogeneous system $Ax = 0$. Then for any solution $s$ to $Ax = b$, $K = \{s\} + K_H = \{s + k \mid k \in K_H\}$.
Thm. Let $Ax = b$ be a system of $n$ linear equations in $n$ unknowns. If $A$ is invertible, then the system has exactly one solution, namely $A^{-1}b$. Conversely, if the system has exactly one solution, then $A$ is invertible.
Determinant
Def. Let $A \in M_{n \times n}(F)$. If $n = 1$, so that $A = (A_{11})$, we define $\det(A) = A_{11}$. For $n \ge 2$, we define $\det(A)$ recursively as $\det(A) = \sum_{j=1}^n (-1)^{1+j} A_{1j} \det(\tilde{A}_{1j})$. Here, $\tilde{A}_{ij}$ denotes the $(n-1) \times (n-1)$ matrix obtained from $A$ by deleting row $i$ and column $j$.
The scalar $\det(A)$ is called the determinant of $A$ and is also denoted by $|A|$. The scalar $c_{ij} = (-1)^{i+j} \det(\tilde{A}_{ij})$ is called the cofactor of the entry of $A$ in row $i$, column $j$. We can express the formula for the determinant of $A$ as $\det(A) = \sum_{i=1}^n A_{1i} c_{1i}$, and this formula is called cofactor expansion along the first row of $A$.
Thm. The determinant of a square matrix can be evaluated by cofactor expansion along any row. That is, if $A \in M_{n \times n}(F)$, then for any integer $i$ ($1 \le i \le n$), $\det(A) = \sum_{j=1}^n A_{ij} c_{ij}$.
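The recursive definition translates directly into code; this $O(n!)$ sketch (fine for small $n$, checked against `np.linalg.det`) expands along the first row:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)  # A-tilde_{1j}
        total += (-1) ** j * A[0, j] * det_cofactor(minor)     # (-1)^{1+j} A_{1j} det(...)
    return total

A = np.array([[1., 2., 3.],
              [0., 4., 5.],
              [1., 0., 6.]])
assert np.isclose(det_cofactor(A), np.linalg.det(A))
```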
The following rules summarize the effect of an elementary row operation on the determinant of a matrix A∈Mn×n(F)A∈Mn×n(F).
- (interchange) If $B$ is a matrix obtained by interchanging any two rows of $A$, then $\det(B) = -\det(A)$.
- (scaling) If $B$ is a matrix obtained by multiplying a row of $A$ by a nonzero scalar $k$, then $\det(B) = k\det(A)$.
- (replacement) If $B$ is a matrix obtained by adding a multiple of one row of $A$ to another row of $A$, then $\det(B) = \det(A)$.
These facts can be used to simplify the evaluation of a determinant. By using elementary row operations of types 1 and 3 only (interchange and replacement), we can transform any square matrix into an upper triangular matrix, and so we can easily evaluate the determinant of any square matrix, since the determinant of an upper triangular matrix is the product of its diagonal entries.
Thm. For any $A, B \in M_{n \times n}(F)$, $\det(AB) = \det(A) \cdot \det(B)$.
Corollary. A matrix $A \in M_{n \times n}(F)$ is invertible iff $\det(A) \neq 0$. Furthermore, if $A$ is invertible, then $\det(A^{-1}) = \frac{1}{\det(A)}$.
Thm. (Cramer's Rule) Let $Ax = b$ be the matrix form of a system of $n$ linear equations in $n$ unknowns, where $x = (x_1, \cdots, x_n)^t$. If $\det(A) \neq 0$, then this system has a unique solution, and for each $k = 1, 2, \cdots, n$, $x_k = \frac{\det(M_k)}{\det(A)}$, where $M_k$ is the $n \times n$ matrix obtained from $A$ by replacing column $k$ of $A$ by $b$.
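Cramer's rule is straightforward to sketch and compare against a direct solver (the $2 \times 2$ system is an arbitrary example):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule (requires det(A) != 0)."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for k in range(len(b)):
        M = A.astype(float).copy()
        M[:, k] = b                      # replace column k of A by b
        x[k] = np.linalg.det(M) / d
    return x

A = np.array([[2., 1.],
              [1., 3.]])
b = np.array([3., 5.])
assert np.allclose(cramer(A, b), np.linalg.solve(A, b))
```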
Diagonalization
diagonalizable
eigenvector, eigenvalue, eigenspace
Thm. A linear operator $T$ on a finite-dimensional vector space $V$ is diagonalizable iff there exists an ordered basis $\beta$ for $V$ consisting of eigenvectors of $T$. Furthermore, if $T$ is diagonalizable, $\beta = \{v_1, \cdots, v_n\}$ is an ordered basis of eigenvectors of $T$, and $D = [T]_\beta$, then $D$ is a diagonal matrix and $D_{jj}$ is the eigenvalue corresponding to $v_j$ for $1 \le j \le n$.
characteristic polynomial, split, (algebraic) multiplicity
To compute the eigenvalues of a matrix $A$, find the roots of the polynomial $\det(A - tI)$ in $t$. To compute the eigenvectors corresponding to $\lambda$, solve the equation $(A - \lambda I)x = 0$.
To compute the eigenvalues and eigenvectors of a linear operator $T$, first choose a basis $\beta$ and find the eigenvalues and eigenvectors of $A = [T]_\beta$. The eigenvalues of $A$ are exactly the eigenvalues of $T$, and the eigenvectors of $A$ are the coordinate vectors, relative to $\beta$, of the eigenvectors of $T$.
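Numerically the same computation is one call; each returned pair should satisfy $Av = \lambda v$ and make $\det(A - \lambda I)$ vanish (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])
eigvals, eigvecs = np.linalg.eig(A)      # columns of eigvecs are eigenvectors

for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)                         # A v = lambda v
    assert np.isclose(np.linalg.det(A - lam * np.eye(2)), 0.)  # root of det(A - tI)
```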
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$ such that the characteristic polynomial of $T$ splits. Let $\lambda_1, \cdots, \lambda_k$ be the distinct eigenvalues of $T$. Then
- $T$ is diagonalizable if and only if the multiplicity of $\lambda_i$ is equal to $\dim(E_{\lambda_i})$ for all $i$.
- If $T$ is diagonalizable and $\beta_i$ is an ordered basis for $E_{\lambda_i}$ for each $i$, then $\beta = \beta_1 \cup \cdots \cup \beta_k$ is an ordered basis for $V$ consisting of eigenvectors of $T$.
This theorem gives a method for diagonalization.
Systems of differential equations $\frac{d}{dt}x = Ax$
Consider the linear system of ordinary differential equations $x_i' = \sum_{j=1}^n a_{ij} x_j$, $i = 1, 2, \cdots, n$, where each $x_i = x_i(t)$ is a function of $t$. Diagonalize $A$ as $Q^{-1}AQ = D$ and set $y = Q^{-1}x$; then $y' = Dy$. Since $D$ is diagonal, solve for $y$ (each equation reads $y_i' = D_{ii}y_i$), and then recover $x = Qy$.
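A minimal numeric sketch of this recipe, for the assumed example $x_1' = x_2$, $x_2' = x_1$ with $x(0) = (1, 0)$, whose exact solution is $(\cosh t, \sinh t)$:

```python
import numpy as np

A = np.array([[0., 1.],
              [1., 0.]])          # x1' = x2, x2' = x1
x0 = np.array([1., 0.])

# Diagonalize: columns of Q are eigenvectors, eigvals hold the diagonal of D.
eigvals, Q = np.linalg.eig(A)

def x(t):
    y0 = np.linalg.solve(Q, x0)          # y(0) = Q^{-1} x(0)
    y = y0 * np.exp(eigvals * t)         # y_i(t) = y_i(0) e^{lambda_i t}
    return Q @ y                         # recover x = Q y

t = 0.7
assert np.allclose(x(t), [np.cosh(t), np.sinh(t)])
```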
sum, direct sum
$T$-invariant, $T$-cyclic subspace of $V$ generated by $x$
Let $T$ be a linear operator on a vector space $V$, and let $x$ be a nonzero vector in $V$. The subspace $W = \operatorname{span}(\{x, T(x), T^2(x), \cdots\})$ is called the $T$-cyclic subspace of $V$ generated by $x$.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $W$ be a $T$-invariant subspace of $V$. Then the characteristic polynomial of $T_W$ divides the characteristic polynomial of $T$.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $W$ denote the $T$-cyclic subspace of $V$ generated by a nonzero vector $v \in V$. Let $k = \dim(W)$. Then
- $\{v, T(v), T^2(v), \cdots, T^{k-1}(v)\}$ is a basis for $W$.
- If $a_0 v + a_1 T(v) + \cdots + a_{k-1} T^{k-1}(v) + T^k(v) = 0$, then the characteristic polynomial of $T_W$ is $f(t) = (-1)^k (a_0 + a_1 t + \cdots + a_{k-1} t^{k-1} + t^k)$.
Thm. (Cayley-Hamilton) Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $f(t)$ be the characteristic polynomial of $T$. Then $f(T) = T_0$, the zero transformation. That is, $T$ "satisfies" its characteristic equation.
Corollary. (Cayley-Hamilton Theorem for Matrices) Let $A$ be an $n \times n$ matrix and let $f(t)$ be the characteristic polynomial of $A$. Then $f(A) = O$, the $n \times n$ zero matrix.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and suppose that $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$, where $W_i$ is a $T$-invariant subspace of $V$ for each $i$ ($1 \le i \le k$). Suppose that $f_i(t)$ is the characteristic polynomial of $T_{W_i}$ ($1 \le i \le k$). Then $f_1(t) \cdot f_2(t) \cdot \cdots \cdot f_k(t)$ is the characteristic polynomial of $T$.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $W_1, W_2, \cdots, W_k$ be $T$-invariant subspaces of $V$ such that $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$. For each $i$, let $\beta_i$ be an ordered basis for $W_i$, and let $\beta = \beta_1 \cup \beta_2 \cup \cdots \cup \beta_k$. Let $A = [T]_\beta$ and $B_i = [T_{W_i}]_{\beta_i}$ for $i = 1, 2, \cdots, k$. Then $A = B_1 \oplus B_2 \oplus \cdots \oplus B_k$.
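The matrix corollary can be checked numerically; for a $2 \times 2$ matrix the characteristic polynomial is $f(t) = t^2 - \operatorname{tr}(A)\,t + \det(A)$ (the matrix below is an arbitrary example):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
tr, det = np.trace(A), np.linalg.det(A)

# Evaluate f(A) = A^2 - tr(A) A + det(A) I; Cayley-Hamilton says it vanishes.
f_A = A @ A - tr * A + det * np.eye(2)
assert np.allclose(f_A, 0)
```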
Inner Product Spaces
inner product
standard inner product, Frobenius inner product
complex/real inner product space
norm/length, unit vector
orthogonal/perpendicular, orthonormal
Def. Let $V$ be an inner product space. Vectors $x$ and $y$ in $V$ are orthogonal (perpendicular) if $\langle x, y \rangle = 0$. A subset $S$ of $V$ is orthogonal if any two distinct vectors in $S$ are orthogonal. A subset $S$ of $V$ is orthonormal if $S$ is orthogonal and consists entirely of unit vectors.
Here "orthogonal" should be understood as orthogonality rather than geometric perpendicularity; the distinction matters in a few places later.
Def. Let $V$ be an inner product space. A subset of $V$ is an orthonormal basis for $V$ if it is an ordered basis that is orthonormal.
Thm. Let $V$ be an inner product space and $S = \{v_1, v_2, \cdots, v_k\}$ be an orthogonal subset of $V$ consisting of nonzero vectors. If $y \in \operatorname{span}(S)$, then $y = \sum_{i=1}^k \frac{\langle y, v_i \rangle}{\|v_i\|^2} v_i$. If, in addition to the hypotheses of this theorem, $S$ is orthonormal and $y \in \operatorname{span}(S)$, then $y = \sum_{i=1}^k \langle y, v_i \rangle v_i$.
Gram-Schmidt process
Thm. Let $V$ be an inner product space and $S = \{w_1, \cdots, w_n\}$ be a linearly independent subset of $V$. Define $S' = \{v_1, \cdots, v_n\}$, where $v_1 = w_1$ and $v_k = w_k - \sum_{j=1}^{k-1} \frac{\langle w_k, v_j \rangle}{\|v_j\|^2} v_j$ for $2 \le k \le n$. Then $S'$ is an orthogonal set of nonzero vectors such that $\operatorname{span}(S') = \operatorname{span}(S)$. The construction of $\{v_1, \cdots, v_n\}$ is called the Gram-Schmidt process.
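The formula above translates line-for-line into code; a minimal sketch over $\mathbb{R}^3$ with the standard dot product (the rows of `W` are an arbitrary independent set):

```python
import numpy as np

def gram_schmidt(W):
    """Orthogonalize the rows of W (assumed linearly independent)."""
    V = []
    for w in W:
        # v_k = w_k - sum_j <w_k, v_j> / ||v_j||^2 * v_j
        v = w - sum((w @ vj) / (vj @ vj) * vj for vj in V)
        V.append(v)
    return np.array(V)

W = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [0., 1., 1.]])
V = gram_schmidt(W)
G = V @ V.T                                   # Gram matrix of the output
assert np.allclose(G, np.diag(np.diag(G)))    # distinct rows are orthogonal
```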
Fourier coefficients
Def. Let $\beta$ be an orthonormal subset (possibly infinite) of an inner product space $V$, and let $x \in V$. We define the Fourier coefficients of $x$ relative to $\beta$ to be the scalars $\langle x, y \rangle$, where $y \in \beta$.
orthogonal complement, orthogonal projection
The orthogonal complement of a set $S$ is the set of vectors orthogonal to every vector in $S$. It is denoted $S^\perp$; formally, $S^\perp = \{x \in V \mid \langle x, y \rangle = 0 \text{ for all } y \in S\}$.
Thm. Let $W$ be a finite-dimensional subspace of an inner product space $V$, and let $y \in V$. Then there exist unique vectors $u \in W$ and $z \in W^\perp$ such that $y = u + z$. Furthermore, if $\{v_1, v_2, \cdots, v_k\}$ is an orthonormal basis for $W$, then $u = \sum_{i=1}^k \langle y, v_i \rangle v_i$.
Corollary. In the notation of this theorem, the vector $u$ is the unique vector in $W$ that is "closest" to $y$; that is, for any $x \in W$, $\|y - x\| \ge \|y - u\|$, and this inequality is an equality if and only if $x = u$. The vector $u$ is called the orthogonal projection of $y$ on $W$.
adjoint
The adjoint of an operator can be defined via matrix representations: if the matrix representations of $T$ and $U$ with respect to some orthonormal basis are conjugate transposes of each other, then $T$ and $U$ are adjoints of each other. The following theorem gives an equivalent characterization.
Thm. Let $V$ be a finite-dimensional inner product space, and let $T$ be a linear operator on $V$. Then there exists a unique function $T^*: V \to V$ such that $\langle T(x), y \rangle = \langle x, T^*(y) \rangle$ for all $x, y \in V$.
least squares approximation
Least squares considers the problem of fitting a line to a collection of points. Formally, given $m$ points $(t_1, y_1), (t_2, y_2), \cdots, (t_m, y_m)$, we look for a line $y = ct + d$ minimizing $E = \sum_{i=1}^m (y_i - ct_i - d)^2$. The problem can be abstracted further: let
$$A = \begin{pmatrix} t_1 & 1 \\ t_2 & 1 \\ \vdots & \vdots \\ t_m & 1 \end{pmatrix}, \quad x = \begin{pmatrix} c \\ d \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix},$$
so that $E = \|y - Ax\|^2$. We now drop this specific form: given arbitrary $A$ and $y$, find an $x$ minimizing $\|y - Ax\|$.
Thm. Let $A \in M_{m \times n}(F)$ and $y \in F^m$. Then there exists $x_0 \in F^n$ such that $(A^*A)x_0 = A^*y$ and $\|Ax_0 - y\| \le \|Ax - y\|$ for all $x \in F^n$. Furthermore, if $\operatorname{rank}(A) = n$, then $x_0 = (A^*A)^{-1}A^*y$.
$Ax_0$ is the vector in $R(L_A)$ closest to $y$; $x_0$ satisfies this condition iff $Ax_0 - y \in R(L_A)^\perp$, i.e. $\langle Ax, Ax_0 - y \rangle = \langle x, A^*(Ax_0 - y) \rangle = 0$ for all $x$. Noting further that $\operatorname{rank}(A) = n$ implies $\operatorname{rank}(A^*A) = n$, all the conclusions follow directly.
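The line-fitting setup can be solved by the normal equations $(A^*A)x_0 = A^*y$; a sketch with assumed sample points lying exactly on $y = 2t + 1$:

```python
import numpy as np

t = np.array([0., 1., 2., 3.])
y = np.array([1., 3., 5., 7.])            # points on the line y = 2t + 1
A = np.column_stack([t, np.ones_like(t)]) # columns (t_i, 1) as in the derivation

x0 = np.linalg.solve(A.T @ A, A.T @ y)    # rank(A) = 2, so A*A is invertible
assert np.allclose(x0, [2., 1.])          # recovers c = 2, d = 1
assert np.allclose(x0, np.linalg.lstsq(A, y, rcond=None)[0])
```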
minimal solution
The following theorem gives a method for finding the minimal solution (the solution of smallest norm) of a linear system.
Thm. Let $A \in M_{m \times n}(F)$ and $b \in F^m$. Suppose that $Ax = b$ is consistent. Then the following statements are true.
- There exists exactly one minimal solution $s$ of $Ax = b$, and $s \in R(L_{A^*})$.
- The vector $s$ is the only solution to $Ax = b$ that lies in $R(L_{A^*})$. That is, if $u$ satisfies $(AA^*)u = b$, then $s = A^*u$.
Thm. (Schur) Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Suppose that the characteristic polynomial of $T$ splits. Then there exists an orthonormal basis $\beta$ for $V$ such that the matrix $[T]_\beta$ is upper triangular.
normal
Def. Let $V$ be an inner product space, and let $T$ be a linear operator on $V$. We say that $T$ is normal if $T^*T = TT^*$. An $n \times n$ real or complex matrix $A$ is normal if $A^*A = AA^*$.
Thm. Let $T$ be a linear operator on a finite-dimensional complex inner product space $V$. Then $T$ is normal if and only if there exists an orthonormal basis for $V$ consisting of eigenvectors of $T$.
self-adjoint (Hermitian)
Def. Let $T$ be a linear operator on an inner product space $V$. We say that $T$ is self-adjoint (Hermitian) if $T = T^*$. An $n \times n$ real or complex matrix $A$ is self-adjoint (Hermitian) if $A = A^*$.
Thm. Let $T$ be a linear operator on a finite-dimensional real inner product space $V$. Then $T$ is self-adjoint if and only if there exists an orthonormal basis $\beta$ for $V$ consisting of eigenvectors of $T$.
unitary/orthogonal operator
Def. Let $T$ be a linear operator on a finite-dimensional inner product space $V$ (over $F$). If $\|T(x)\| = \|x\|$ for all $x \in V$, we call $T$ a unitary operator if $F = \mathbb{C}$ and an orthogonal operator if $F = \mathbb{R}$.
Thm. Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Then the following statements are equivalent.
- $TT^* = T^*T = I$.
- $\langle T(x), T(y) \rangle = \langle x, y \rangle$ for all $x, y \in V$.
- If $\beta$ is an orthonormal basis for $V$, then $T(\beta)$ is an orthonormal basis for $V$.
- There exists an orthonormal basis $\beta$ for $V$ such that $T(\beta)$ is an orthonormal basis for $V$.
- $\|T(x)\| = \|x\|$ for all $x \in V$.
Def. A square matrix $A$ is called an orthogonal matrix if $A^tA = AA^t = I$ and unitary if $A^*A = AA^* = I$.
Def. Two matrices $A$ and $B$ are unitarily equivalent [orthogonally equivalent] if there exists a unitary [orthogonal] matrix $P$ such that $B = P^*AP$.
Thm. Let $A$ be a complex $n \times n$ matrix. Then $A$ is normal if and only if $A$ is unitarily equivalent to a diagonal matrix.
Thm. Let $A$ be a real $n \times n$ matrix. Then $A$ is symmetric if and only if $A$ is orthogonally equivalent to a diagonal matrix.
Some concepts and results differ between the real and complex cases; specifically:
First, if an operator admits an orthonormal basis consisting of its eigenvectors, then over $\mathbb{C}$ one can conclude the operator is normal, while over $\mathbb{R}$ one can conclude it is self-adjoint.
Second, a difference in terminology: an operator that preserves norms is called unitary over $\mathbb{C}$ and orthogonal over $\mathbb{R}$, though the latter name is less common.
Third, if a matrix over $\mathbb{C}$ can be unitarily diagonalized, it follows that the matrix is normal; if a matrix over $\mathbb{R}$ can be orthogonally diagonalized, it follows that the matrix is symmetric.
orthogonal projection
Recall that if $V = W_1 \oplus W_2$, then a linear operator $T$ on $V$ is the projection on $W_1$ along $W_2$ if, whenever $x = x_1 + x_2$ with $x_1 \in W_1$ and $x_2 \in W_2$, we have $T(x) = x_1$.
Def. Let $V$ be an inner product space, and let $T: V \to V$ be a projection. We say that $T$ is an orthogonal projection if $R(T)^\perp = N(T)$ and $N(T)^\perp = R(T)$.
Thm. Let $V$ be an inner product space, and let $T$ be a linear operator on $V$. Then $T$ is an orthogonal projection if and only if $T$ has an adjoint $T^*$ and $T^2 = T = T^*$.
Spectral Theorem, spectrum, resolution of the identity operator, spectral decomposition
Thm. (The Spectral Theorem) Suppose that $T$ is a linear operator on a finite-dimensional inner product space $V$ over $F$ with the distinct eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_k$. Assume that $T$ is normal if $F = \mathbb{C}$ and that $T$ is self-adjoint if $F = \mathbb{R}$. For each $i$ ($1 \le i \le k$), let $W_i$ be the eigenspace of $T$ corresponding to the eigenvalue $\lambda_i$, and let $T_i$ be the orthogonal projection of $V$ on $W_i$. Then the following statements are true.
- $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$.
- If $W_i'$ denotes the direct sum of the subspaces $W_j$ for $j \neq i$, then $W_i^\perp = W_i'$.
- $T_iT_j = \delta_{ij}T_i$ for $1 \le i, j \le k$.
- $I = T_1 + T_2 + \cdots + T_k$.
- $T = \lambda_1T_1 + \lambda_2T_2 + \cdots + \lambda_kT_k$.
The set $\{\lambda_1, \lambda_2, \cdots, \lambda_k\}$ of eigenvalues of $T$ is called the spectrum of $T$. The sum $I = T_1 + T_2 + \cdots + T_k$ in (d) is called the resolution of the identity operator, and the sum $T = \lambda_1T_1 + \lambda_2T_2 + \cdots + \lambda_kT_k$ in (e) is called the spectral decomposition of $T$.
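Statements (c)-(e) can be verified numerically for a real symmetric matrix (an assumed example with distinct eigenvalues, so each projection is rank one):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])                   # real symmetric: spectral theorem applies
eigvals, Q = np.linalg.eigh(A)             # orthonormal eigenvectors in columns of Q

# Orthogonal projections T_i = v_i v_i^T onto each (one-dimensional) eigenspace.
projs = [np.outer(Q[:, i], Q[:, i]) for i in range(2)]

assert np.allclose(sum(projs), np.eye(2))                           # resolution of identity
assert np.allclose(sum(l * P for l, P in zip(eigvals, projs)), A)   # spectral decomposition
assert np.allclose(projs[0] @ projs[1], 0)                          # T_i T_j = 0 for i != j
```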
Corollary 1. If $F = \mathbb{C}$, then $T$ is normal if and only if $T^* = g(T)$ for some polynomial $g$.
Corollary 2. If $F = \mathbb{C}$, then $T$ is unitary if and only if $T$ is normal and $|\lambda| = 1$ for every eigenvalue $\lambda$ of $T$.
Corollary 3. If $F = \mathbb{C}$ and $T$ is normal, then $T$ is self-adjoint if and only if every eigenvalue of $T$ is real.
Corollary 4. Let $T$ be as in the spectral theorem with spectral decomposition $T = \lambda_1T_1 + \lambda_2T_2 + \cdots + \lambda_kT_k$. Then each $T_j$ is a polynomial in $T$.
singular value decomposition, singular value
The singular value decomposition (SVD) factors an $m \times n$ matrix $A$ into the form $U\Sigma V^*$, where $U$ and $V$ are $m \times m$ and $n \times n$ matrices respectively, and $\Sigma$ is an $m \times n$ matrix for which there exists an $r$ such that only the $r$ entries $\Sigma_{11} \ge \Sigma_{22} \ge \cdots \ge \Sigma_{rr}$ are nonzero. We will start from linear transformations, even though the exam will probably only test the computation.
Thm. (Singular Value Theorem for Linear Transformations) Let $V$ and $W$ be finite-dimensional inner product spaces, and let $T: V \to W$ be a linear transformation of rank $r$. Then there exist orthonormal bases $\{v_1, v_2, \cdots, v_n\}$ for $V$ and $\{u_1, u_2, \cdots, u_m\}$ for $W$ and positive scalars $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$ such that $T(v_i) = \sigma_iu_i$ if $i \le r$ and $T(v_i) = 0$ if $i > r$.
Furthermore, suppose the preceding conditions are satisfied. Then for $1 \le i \le n$, $v_i$ is an eigenvector of $T^*T$ with corresponding eigenvalue $\sigma_i^2$ if $1 \le i \le r$ and $0$ if $i > r$. Therefore the scalars $\sigma_1, \sigma_2, \cdots, \sigma_r$ are uniquely determined by $T$.
Def. The unique scalars $\sigma_1, \sigma_2, \cdots, \sigma_r$ are called the singular values of $T$. If $r < m$ and $r < n$, then the term singular value is extended to include $\sigma_{r+1} = \cdots = \sigma_k = 0$, where $k = \min(m, n)$.
Def. Let $A$ be an $m \times n$ matrix. We define the singular values of $A$ to be the singular values of the linear transformation $L_A$.
Thm. (Singular Value Theorem for Matrices) Let $A$ be an $m \times n$ matrix of rank $r$ with singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$, and let $\Sigma$ be the $m \times n$ matrix defined by $\Sigma_{ij} = \sigma_i$ if $i = j \le r$ and $\Sigma_{ij} = 0$ otherwise. Then there exist an $m \times m$ unitary matrix $U$ and an $n \times n$ unitary matrix $V$ such that $A = U\Sigma V^*$.
Def. Let $A$ be an $m \times n$ matrix of rank $r$ with positive singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$. A factorization $A = U\Sigma V^*$, where $U$ and $V$ are unitary matrices and $\Sigma$ is the $m \times n$ matrix above, is called a singular value decomposition of $A$.
Here is the method for computing an SVD: unitarily diagonalizing $A^*A$ gives exactly $A^*A = V\Sigma^t\Sigma V^*$; correspondingly, unitarily diagonalizing $AA^*$ gives $AA^* = U\Sigma\Sigma^tU^*$.
Indeed, suppose $A = U\Sigma V^*$; then $AV = U\Sigma$. Since $U^*U = V^*V = I$ and $A^* = V\Sigma^tU^*$ (note that the conjugate transpose of $\Sigma$ is just $\Sigma^t$), we get $A^*A = V\Sigma^t\Sigma V^*$ and $AA^* = U\Sigma\Sigma^tU^*$.
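Numerically the factorization and the $A^*A$ relation look like this (the matrix is an arbitrary example with singular values $3\sqrt{5}$ and $\sqrt{5}$):

```python
import numpy as np

A = np.array([[3., 0.],
              [4., 5.]])
U, s, Vt = np.linalg.svd(A)              # A = U diag(s) V*

assert np.allclose(U @ np.diag(s) @ Vt, A)
# Squared singular values are the eigenvalues of A*A (and of AA*).
assert np.allclose(np.sort(s**2), np.sort(np.linalg.eigvalsh(A.T @ A)))
```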
bilinear form, matrix representation
A bilinear form is a map $V \times V \to F$ that is linear in each of its two arguments. We denote the set of all such maps by $\mathcal{B}(V)$. With addition and scalar multiplication defined naturally, $\mathcal{B}(V)$ is a vector space.
The matrix representation $A = \psi_\beta(H)$ of a bilinear form $H$ relative to a basis $\beta = \{v_1, v_2, \cdots, v_n\}$ is defined by $A_{ij} = H(v_i, v_j)$. It is easy to see that $H(x, y) = [x]_\beta^T A [y]_\beta$ (here $T$ denotes the transpose). In particular, if $V = F^n$, then there exists a matrix $A$ such that $H(x, y) = x^TAy$.
symmetric, diagonalizable
Symmetry and diagonalizability of a bilinear form are defined directly in terms of its matrix representation.
Thm. Let $V$ be a finite-dimensional vector space over a field $F$ not of characteristic two. Then every symmetric bilinear form on $V$ is diagonalizable. Here, $F$ is of characteristic two if $1 + 1 = 0$ in $F$.
This yields a method for diagonalizing a symmetric matrix: perform the same elementary operations in both directions simultaneously, e.g. after adding row 1 to row 2, immediately add column 1 to column 2.
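The mirrored row/column procedure can be sketched as follows; it accumulates $Q$ with $Q^TSQ$ diagonal (a simplification that assumes the pivots stay nonzero, so no interchanges are needed; `S` is an arbitrary example):

```python
import numpy as np

def congruent_diagonalize(A):
    """Diagonalize a symmetric matrix by congruence: returns (Q, D) with Q^T A Q = D."""
    A = A.astype(float).copy()
    n = A.shape[0]
    Q = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            k = A[j, i] / A[i, i]
            A[j] -= k * A[i]        # row operation ...
            A[:, j] -= k * A[:, i]  # ... mirrored immediately on the columns
            Q[:, j] -= k * Q[:, i]  # accumulate the column operations into Q
    return Q, A

S = np.array([[2., 1.],
              [1., 3.]])
Q, D = congruent_diagonalize(S)
assert np.allclose(Q.T @ S @ Q, D)
assert np.allclose(D, np.diag(np.diag(D)))   # D is diagonal
```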
quadratic form
Def. Let $V$ be a vector space over $F$. A function $K: V \to F$ is called a quadratic form if and only if there exists a symmetric bilinear form $H \in \mathcal{B}(V)$ such that $K(x) = H(x, x)$ for all $x \in V$.
If the field $F$ is not of characteristic two, there is a one-to-one correspondence between symmetric bilinear forms and quadratic forms.
Thm. Let $V$ be a finite-dimensional real inner product space, and let $H$ be a symmetric bilinear form on $V$. Then there exists an orthonormal basis $\beta$ for $V$ such that $\psi_\beta(H)$ is a diagonal matrix.
Corollary. Let $K$ be a quadratic form on a finite-dimensional real inner product space $V$. There exists an orthonormal basis $\beta = \{v_1, \cdots, v_n\}$ for $V$ and scalars $\lambda_1, \cdots, \lambda_n$ (not necessarily distinct) such that if $x \in V$ and $x = \sum_{i=1}^n s_iv_i$ with $s_i \in \mathbb{R}$, then $K(x) = \sum_{i=1}^n \lambda_is_i^2$. In fact, if $H$ is the symmetric bilinear form determined by $K$, then $\beta$ can be chosen to be any orthonormal basis for $V$ such that $\psi_\beta(H)$ is a diagonal matrix.
Jordan Canonical Form
A note up front: for the Jordan Canonical Form it suffices to know how to compute it. Still, the core of the derivation is given here.
generalized eigenvector, generalized eigenspace
Def. Let $T$ be a linear operator on a vector space $V$, and let $\lambda$ be a scalar. A nonzero vector $x$ in $V$ is called a generalized eigenvector of $T$ corresponding to $\lambda$ if $(T - \lambda I)^p(x) = 0$ for some positive integer $p$.
Def. Let $T$ be a linear operator on a vector space $V$ and let $\lambda$ be an eigenvalue of $T$. The generalized eigenspace of $T$ corresponding to $\lambda$, denoted $K_\lambda$, is the subset of $V$ defined by $K_\lambda = \{x \in V \mid (T - \lambda I)^p(x) = 0 \text{ for some positive integer } p\}$.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$ such that the characteristic polynomial of $T$ splits, and let $\lambda_1, \lambda_2, \cdots, \lambda_k$ be the distinct eigenvalues of $T$ with corresponding multiplicities $m_1, m_2, \cdots, m_k$. For $1 \le i \le k$, let $\beta_i$ be an ordered basis for $K_{\lambda_i}$. Then the following statements are true.
- $\beta_i \cap \beta_j = \varnothing$ for $i \neq j$.
- $\beta = \beta_1 \cup \beta_2 \cup \cdots \cup \beta_k$ is an ordered basis for $V$.
- $\dim(K_{\lambda_i}) = m_i$ for all $i$.
This conclusion can be stated directly as $V = K_{\lambda_1} \oplus K_{\lambda_2} \oplus \cdots \oplus K_{\lambda_k}$. Its proof is quite involved.
cycle of generalized eigenvectors
Def. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $x$ be a generalized eigenvector of $T$ corresponding to the eigenvalue $\lambda$. Suppose that $p$ is the smallest positive integer for which $(T - \lambda I)^p(x) = 0$. Then the ordered set $\{(T - \lambda I)^{p-1}(x), (T - \lambda I)^{p-2}(x), \cdots, (T - \lambda I)(x), x\}$ is called a cycle of generalized eigenvectors of $T$ corresponding to $\lambda$. The vectors $(T - \lambda I)^{p-1}(x)$ and $x$ are called the initial vector and the end vector of the cycle, respectively. We say the length of the cycle is $p$.
Thm. Let $T$ be a linear operator on a vector space $V$, and let $\lambda$ be an eigenvalue of $T$. Suppose that $\gamma_1, \gamma_2, \cdots, \gamma_q$ are cycles of generalized eigenvectors of $T$ corresponding to $\lambda$ such that the initial vectors of the $\gamma_i$'s are distinct and form a linearly independent set. Then the $\gamma_i$'s are disjoint, and their union $\gamma = \bigcup_{i=1}^q \gamma_i$ is linearly independent.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $\lambda$ be an eigenvalue of $T$. Then $K_\lambda$ has an ordered basis consisting of a union of disjoint cycles of generalized eigenvectors corresponding to $\lambda$.
That is, for each generalized eigenspace there exists a basis formed by the union of several cycles.
Thm. Let $T$ be a linear operator on a finite-dimensional vector space $V$ whose characteristic polynomial splits, and suppose that $\beta$ is a basis for $V$ such that $\beta$ is a disjoint union of cycles of generalized eigenvectors of $T$. Then for each cycle $\gamma$ of generalized eigenvectors contained in $\beta$, $W = \operatorname{span}(\gamma)$ is $T$-invariant, and $[T_W]_\gamma$ is a Jordan block. Furthermore, $\beta$ is a Jordan canonical basis for $V$.
Each cycle gives one Jordan block; if the union of all the cycles forms a basis, that basis is a Jordan canonical basis.
At this point we can conclude: if the characteristic polynomial of an operator splits, then it has a Jordan canonical form. The Jordan canonical form of a matrix can be defined via its left-multiplication transformation.
dot diagram
Consider a fixed eigenvalue $\lambda$ and the corresponding generalized eigenspace $K_\lambda$. By the theorems above, we can find a union of cycles forming a basis of $K_\lambda$. Concretely, write it as $\{v_1, (T - \lambda I)(v_1), \cdots, (T - \lambda I)^{p_1 - 1}(v_1)\} \cup \{v_2, (T - \lambda I)(v_2), \cdots, (T - \lambda I)^{p_2 - 1}(v_2)\} \cup \cdots \cup \{v_k, (T - \lambda I)(v_k), \cdots, (T - \lambda I)^{p_k - 1}(v_k)\}$. Assuming $p_1 \ge p_2 \ge \cdots \ge p_k$, we can represent this with a dot diagram: column $i$ has $p_i$ dots, which from top to bottom stand for $(T - \lambda I)^{p_i - 1}(v_i), \cdots, (T - \lambda I)(v_i), v_i$.
The dot diagrams directly give the Jordan canonical form. Suppose the eigenvalues are $\lambda_1, \lambda_2, \cdots, \lambda_k$; then the Jordan canonical form consists of $k$ large blocks. If the basis corresponding to the $i$-th eigenvalue $\lambda_i$ is the union of the cycles of $v_1, v_2, \cdots, v_{n_i}$, where the cycle of $v_j$ has length $p_j$, then the $i$-th large block itself consists of $n_i$ small blocks, the $j$-th of which is a $p_j \times p_j$ Jordan block with $\lambda_i$ on its diagonal.
(Can we put this in plainer terms?)
The next theorem gives the shape of the dot diagram directly. Suppose the characteristic polynomial is $(t - \lambda_1)^{r_1}(t - \lambda_2)^{r_2} \cdots (t - \lambda_k)^{r_k}$; then the number of dots in the first $l$ rows of the $i$-th dot diagram is $\operatorname{nullity}((T - \lambda_iI)^l)$. This is not hard to understand: applying $T - \lambda_iI$ once to the dot diagram kills the first row and shifts the remaining rows up; applying it again kills what was originally the second row, and so on.
With the shape of the dot diagrams we have the Jordan canonical form as a matrix; the next step is to find a Jordan canonical basis, working one eigenvalue at a time. First compute the last row of the dot diagram: these vectors should be part of a basis of $N((T - \lambda I)^r)$ while lying outside $N((T - \lambda I)^{r-1})$. Then apply $T - \lambda I$ to them to obtain vectors in the second-to-last row. That row may still have unfilled dots, so complete it with additional vectors such that the last two rows together form the part of a basis of $N((T - \lambda I)^r)$ lying outside $N((T - \lambda I)^{r-2})$. Continuing in this way produces the required basis -- though this is hardly something one computes by hand.
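The row counts $\operatorname{nullity}((A - \lambda I)^l)$ determine the diagram, and its column heights are the Jordan block sizes; a sketch of that computation (the test matrix `J`, one $2 \times 2$ and one $1 \times 1$ block for $\lambda = 3$, is an assumed example):

```python
import numpy as np

def jordan_block_sizes(A, lam, tol=1e-8):
    """Jordan block sizes for eigenvalue lam, read off the dot diagram:
    dots in the first l rows = nullity((A - lam I)^l)."""
    n = A.shape[0]
    M = A - lam * np.eye(n)
    nullity = lambda B: B.shape[1] - np.linalg.matrix_rank(B, tol=tol)
    rows, prev = [], 0
    P = np.eye(n)
    while True:
        P = P @ M                       # P = (A - lam I)^l
        cur = nullity(P)
        if cur == prev:                 # nullities have stabilized
            break
        rows.append(cur - prev)         # dots in row l of the diagram
        prev = cur
    # Column heights of the diagram = Jordan block sizes.
    return [sum(1 for r in rows if r > j) for j in range(max(rows))]

J = np.array([[3., 1., 0.],
              [0., 3., 0.],
              [0., 0., 3.]])
assert sorted(jordan_block_sizes(J, 3.)) == [1, 2]
```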