Generalized Schur's Theorem
Generalized Schur's Theorem: Statement
THEOREM. Let \(V\) be a nonzero finite-dimensional real inner product space and \(T\) a linear operator on \(V\).
I) There exists an orthonormal basis \(\beta\) for \(V\) such that
\[[T]_{\beta}=\begin{pmatrix}A_1 & \cdots & \ast & \ast & \cdots & \ast\\ & \ddots & \vdots & \vdots & & \vdots\\ & & A_q & \ast & \cdots & \ast\\ & & & c_1 & \cdots & \ast\\ & & & & \ddots & \vdots\\ & & & & & c_p\end{pmatrix}\]
is a block upper triangular matrix, where \(A_j\in M_{2\times 2}(\mathbb{R})\ (j=1,\cdots,q)\) and \(c_j\in \mathbb{R}\ (j=1,\cdots,p)\).
II) Moreover, if \(T\) is normal, then there exists an orthonormal basis \(\beta\) such that
\[[T]_{\beta}=\begin{pmatrix}A_1 & \cdots & \ast & \ast & \cdots & \ast\\ & \ddots & \vdots & \vdots & & \vdots\\ & & A_q & \ast & \cdots & \ast\\ & & & c_1 & \cdots & \ast\\ & & & & \ddots & \vdots\\ & & & & & c_p\end{pmatrix},\qquad A_j=\begin{pmatrix}a_j & -b_j\\ b_j & a_j\end{pmatrix},\]
is a block upper triangular matrix, where the multiset \(\{c_1,\cdots,c_p,a_1\pm ib_1,\cdots,a_q\pm ib_q\}\) with \(b_j\neq 0\ (j=1,\cdots,q)\) is the spectrum of the complexification of \(T\). (For the concept of complexification, see this blog: [https://www.cnblogs.com/chaliceseven/p/17094294.html].)
III) In particular, if \(T\) is orthogonal, then there exists an orthonormal basis \(\beta\) for \(V\) such that
\[[T]_{\beta}=\begin{pmatrix}R_{\theta_1} & & & & &\\ & \ddots & & & &\\ & & R_{\theta_q} & & &\\ & & & \varepsilon_1 & &\\ & & & & \ddots &\\ & & & & & \varepsilon_p\end{pmatrix},\qquad R_{\theta_j}:=\begin{pmatrix}\cos\theta_j & -\sin\theta_j\\ \sin\theta_j & \cos\theta_j\end{pmatrix},\]
is a block diagonal matrix, where \(\theta_j\in \mathbb{R}\setminus\{k\pi:k\in \mathbb{Z}\}\ (j=1,\cdots,q)\) and \(\varepsilon_j=\pm 1\ (j=1,\cdots,p)\). That is to say, any orthogonal operator on a nonzero finite-dimensional real inner product space is a composite of rotations and reflections (the order does not matter), and the space decomposes as a direct sum of pairwise orthogonal \(1\)- or \(2\)-dimensional subspaces that are invariant under the operator.
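As a quick sanity check of III) in coordinates (a minimal numerical sketch, assuming NumPy and SciPy are available; the matrix `M` below is an arbitrary illustration, not taken from the text), a block diagonal matrix assembled from \(2\times 2\) rotation blocks and \(\pm 1\) entries is indeed orthogonal:

```python
# A composite of rotations and reflections on pairwise orthogonal
# subspaces, written as a block diagonal matrix; it is orthogonal.
import numpy as np
from scipy.linalg import block_diag

def rotation(theta):
    """2x2 rotation block R_theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Two rotation planes, one reflected axis, one fixed axis (n = 6).
M = block_diag(rotation(0.7), rotation(2.1), [[-1.0]], [[1.0]])

assert np.allclose(M.T @ M, np.eye(6))         # M is orthogonal
assert np.isclose(abs(np.linalg.det(M)), 1.0)  # |det M| = 1
```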
In each case above, the \(q\)-tuple of second-order diagonal blocks can be rearranged into any permutation of itself, e.g., \((A_1,\cdots,A_q)\) can be rearranged into \((A_{\sigma(1)},\cdots,A_{\sigma(q)})\) for any permutation \(\sigma\). A similar statement holds for the \(p\)-tuple of first-order diagonal blocks. However, if we reorder the basis \(\beta\) in I) so that the \(q\)-tuple and the \(p\)-tuple are interleaved, e.g., \(\text{diag}([T]_{\beta})=(A_1,c_1,A_2,c_2,\cdots)\), then \([T]_{\beta}\) is no longer guaranteed to be a block upper triangular matrix. The same remark applies to II).
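In matrix terms, part I) is the real Schur decomposition, which SciPy computes directly (a numerical sketch, assuming NumPy and SciPy; the test matrix `A` is an arbitrary random example):

```python
# scipy.linalg.schur with output='real' returns an orthogonal Q and a
# quasi-upper-triangular T = Q^T A Q whose diagonal carries 1x1 blocks
# (real eigenvalues) and 2x2 blocks (complex-conjugate pairs).
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

T, Q = schur(A, output='real')

assert np.allclose(Q @ Q.T, np.eye(5))  # Q is orthogonal
assert np.allclose(Q @ T @ Q.T, A)      # A = Q T Q^T
assert np.allclose(np.tril(T, -2), 0)   # T is quasi-upper-triangular
# The 2x2 blocks do not overlap: no two consecutive nonzero
# subdiagonal entries.
sub = np.abs(np.diag(T, -1)) > 1e-12
assert not np.any(sub[:-1] & sub[1:])
```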
Generalized Schur's Theorem: Proof
We focus on proving I), adding along the way the key observations needed for II); III) then follows immediately.
Proof of I). Induction on \(n:=\dim V\). The theorem is trivial when \(n=1\), so we may assume that \(n\ge 2\). Assume that the theorem is true for any real inner product space of dimension less than \(n\).
Case 1: \(T\) has an eigenvalue \(\lambda\).
Since \(\ker(T^*-\lambda I)=\text{im}(T-\lambda I)^{\perp}\), we have
\[\dim\ker(T^*-\lambda I)=n-\dim\text{im}(T-\lambda I)=\dim\ker(T-\lambda I)>0,\]
and thus there exists a unit vector \(z\) such that \(T^*z=\lambda z\). Define \(W:=\text{span}(\{z\})\). Then \(W\) is \(T^*\)-invariant, and so \(W^{\perp}\) is \(T\)-invariant, with \(\dim W^{\perp}=n-1\). By the induction hypothesis, there exists an orthonormal basis \(\gamma\) for \(W^{\perp}\) such that \([T_{W^{\perp}}]_{\gamma}\) is of the stated form. Define \(\beta:=\gamma\cup \{z\}\) (with \(z\) last). Then \(\beta\) is an orthonormal basis for \(V\) such that
\[[T]_{\beta}=\begin{pmatrix}[T_{W^{\perp}}]_{\gamma} & \ast\\ O & \lambda\end{pmatrix}\]
is of the stated form, with \([T]_{\beta}(n,n)=\langle T(z),z\rangle=\langle z,T^*(z)\rangle=\langle z,\lambda z\rangle=\lambda\).
Case 2: \(T\) has no eigenvalue.
In this case, \(T^*\) has no eigenvalue either. Let \(x=x_1\otimes 1+x_2\otimes i\ (x_1,x_2\in V)\) be an eigenvector of \((T^*)_{\mathbb{C}}\) with corresponding eigenvalue \(\overline{\lambda}=\lambda_1+i\lambda_2\ (\lambda_1,\lambda_2\in \mathbb{R})\). Clearly, \(\lambda_2\neq 0\) and \(x_1,x_2\) are linearly independent in \(V\). Also, we have
\[T^*(x_1)=\lambda_1x_1-\lambda_2x_2,\qquad T^*(x_2)=\lambda_2x_1+\lambda_1x_2.\]
Define \(W:=\text{span}(\{x_1,x_2\})\). Then \(W\) is a \(2\)-dimensional \(T^*\)-invariant subspace, and so \(W^{\perp}\) is an \((n-2)\)-dimensional \(T\)-invariant subspace. By induction hypothesis, there exists an orthonormal basis \(\gamma\) for \(W^{\perp}\) such that \([T_{W^{\perp}}]_{\gamma}\) is of the stated form. Let \(\{x'_1,x'_2\}\) be an orthonormal basis for \(W\), and define \(\beta':=\gamma\cup\{x'_1,x'_2\}\). Then \(\beta'\) is an orthonormal basis for \(V\) and
\[[T]_{\beta'}=\begin{pmatrix}[T_{W^{\perp}}]_{\gamma} & \ast\\ O & B\end{pmatrix},\qquad B\in M_{2\times 2}(\mathbb{R}),\]
is of the stated form. Therefore I) is proved.
Proof of II). Now we assume that \(T\) is normal. Thanks to the identity \((T_{\mathbb{C}})^*=(T^*)_{\mathbb{C}}\), \(T_{\mathbb{C}}\) is normal as well. Therefore, we have
\[(T_{\mathbb{C}})^*(x)=\overline{\lambda}x\implies T_{\mathbb{C}}(x)=\lambda x\implies T(x_1)=\lambda_1x_1+\lambda_2x_2,\quad T(x_2)=-\lambda_2x_1+\lambda_1x_2.\]
(For the leftmost "\(\implies\)", see this blog: [https://www.cnblogs.com/chaliceseven/p/17094294.html].)
Consequently, \(W\) is \(T\)-invariant as well and \([T_{W}]_{\{x_1,x_2\}}=\begin{pmatrix}\lambda_1 & -\lambda_2\\ \lambda_2 & \lambda_1 \end{pmatrix}\). Note that \(x\) and \(\overline{x}:=x_1\otimes 1-x_2\otimes i\) are eigenvectors of the normal operator \(T_{\mathbb{C}}\) corresponding to the distinct eigenvalues \(\lambda\) and \(\overline{\lambda}\), hence orthogonal:
\[0=\langle x,\overline{x}\rangle=\left(\|x_1\|^2-\|x_2\|^2\right)+2i\langle x_1,x_2\rangle,\]
which gives \(\langle x_1,x_2\rangle=0\) and \(\|x_1\|=\|x_2\|\). Without loss of generality, we may assume that \(\|x_1\|=\|x_2\|=1\). Define \(\beta:=\gamma\cup \{x_1,x_2\}\); then \(\beta\) is an orthonormal basis for \(V\) such that
\[[T]_{\beta}=\begin{pmatrix}[T_{W^{\perp}}]_{\gamma} & O\\ O & \begin{pmatrix}\lambda_1 & -\lambda_2\\ \lambda_2 & \lambda_1\end{pmatrix}\end{pmatrix}\]
is of the stated form, as desired.
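The fact that \(x_1\perp x_2\) with \(\|x_1\|=\|x_2\|\) for a normal operator can also be checked numerically (a sketch assuming NumPy; the matrix below is an arbitrary normal example, not from the text):

```python
# For a normal real matrix, the real and imaginary parts of an
# eigenvector for a non-real eigenvalue are orthogonal and of equal norm.
import numpy as np

S = np.array([[ 0., -2.,  1.],
              [ 2.,  0.,  3.],
              [-1., -3.,  0.]])
A = S + 0.5 * np.eye(3)             # skew-symmetric + (1/2)I is normal
assert np.allclose(A @ A.T, A.T @ A)

w, V = np.linalg.eig(A)
k = int(np.argmax(np.abs(w.imag)))  # pick a non-real eigenvalue
x1, x2 = V[:, k].real, V[:, k].imag

assert abs(np.dot(x1, x2)) < 1e-8                           # x1 ⊥ x2
assert abs(np.linalg.norm(x1) - np.linalg.norm(x2)) < 1e-8  # equal norms
```

(The eigenvector's overall complex phase is arbitrary, but a phase change only rotates the pair \((x_1,x_2)\) inside its plane, so both properties are preserved.)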
Proof of III). Note that if a block upper triangular matrix \(\begin{pmatrix}A & C\\O & B\end{pmatrix}\) is unitary, then \(A^*A=I,B^*B+C^*C=I\) and \(BB^*=I\), implying that \(A,B\) are unitary and \(C=O\). Now III) follows from II) and the fact above, recalling that the eigenvalues of a unitary matrix are all of modulus one. \(\blacksquare\)
Remark. In fact, if the block upper triangular matrix \(\begin{pmatrix}A & C\\O & B\end{pmatrix}\) is normal, then we have \(A^*A=AA^*+CC^*\), and so \(\text{tr}(CC^*)=0\), implying that \(C=O\), and consequently both \(A\) and \(B\) are normal.
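The remark can be observed numerically (a sketch assuming NumPy and SciPy; the rotation matrix `R` is an arbitrary orthogonal, hence normal, example): the real Schur form of a normal matrix is block diagonal up to rounding.

```python
# For a normal matrix, the quasi-upper-triangular real Schur form has
# vanishing entries outside its diagonal blocks.
import numpy as np
from scipy.linalg import schur

theta = 1.2
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s, 0.],
              [s,  c, 0.],
              [0., 0., 1.]])   # a rotation of R^3: orthogonal, hence normal

T, Q = schur(R, output='real')

# Mask out the diagonal blocks (2x2 where the subdiagonal entry is
# nonzero, 1x1 otherwise); everything left must vanish.
mask = np.ones_like(T, dtype=bool)
i = 0
while i < 3:
    if i + 1 < 3 and abs(T[i + 1, i]) > 1e-10:
        mask[i:i + 2, i:i + 2] = False
        i += 2
    else:
        mask[i, i] = False
        i += 1
assert np.allclose(T[mask], 0)  # off-block entries are (numerically) zero
```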