Generalized Schur's Theorem

Generalized Schur's Theorem: Statement

THEOREM. Let \(V\) be a nonzero finite-dimensional real inner product space and \(T\) a linear operator on \(V\).

I) There exists an orthonormal basis \(\beta\) for \(V\) such that

\[[T]_{\beta}=\small \begin{pmatrix} A_1 &  &  &  &  &\\ &  \ddots&  &  & \Large{*} &\\ &  & A_q &  &  &\\ &  &  & c_1 &  &\\ &  &  &  & \ddots &\\ &  &  &  &  & c_p \end{pmatrix} \]

is a block upper triangular matrix, where \(A_j\in M_{2\times 2}(\mathbb{R})\ (j=1,\cdots,q)\) and \(c_j\in \mathbb{R}\ (j=1,\cdots,p)\).
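The numerical counterpart of I) is the real Schur decomposition. As a sanity check (a sketch assuming NumPy and SciPy are available; `scipy.linalg.schur` with `output='real'` computes exactly this quasi-triangular form), one can verify the shape of \([T]_{\beta}\) for a generic operator:

```python
import numpy as np
from scipy.linalg import schur  # computes the real Schur decomposition

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))     # a generic real operator

T, Z = schur(A, output='real')      # A = Z @ T @ Z.T with Z orthogonal

# Z encodes the change to an orthonormal basis beta
assert np.allclose(Z.T @ Z, np.eye(5))
# T is quasi upper triangular: zero below the first subdiagonal ...
assert np.allclose(np.tril(T, -2), 0)
# ... and nonzero subdiagonal entries are isolated (the 2x2 blocks A_j)
sub = np.diag(T, -1)
assert all(abs(sub[i]) < 1e-12 or abs(sub[i + 1]) < 1e-12
           for i in range(len(sub) - 1))
```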

II) Moreover, if \(T\) is normal, then there exists an orthonormal basis \(\beta\) such that

\[[T]_{\beta}= \begin{pmatrix} \left.\begin{matrix} \begin{matrix}a_1 & -b_1 \\ b_1 & a_1\end{matrix} &  & \\ & \ddots & \\ &  & \begin{matrix}a_q & -b_q \\ b_q & a_q\end{matrix}\\ \hline \end{matrix}\hspace{-0.2em}\right| & \begin{matrix} *&\cdots&* \\ *&\cdots&* \\ \vdots&&\vdots \\ *&\cdots&* \\ *&\cdots&* \end{matrix} \\ & \begin{matrix} c_1 & \cdots & * \\ & \ddots & \vdots \\ &  & c_p \end{matrix} \end{pmatrix} \]

is a block upper triangular matrix, where the multiset \(\{c_1,\cdots,c_p,a_1\pm ib_1,\cdots,a_q\pm ib_q\}\) with \(b_j\neq 0\ (j=1,\cdots,q)\) is the spectrum of the complexification of \(T\). (For the concept of complexification, see this blog: [https://www.cnblogs.com/chaliceseven/p/17094294.html].)
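A quick numerical illustration of the spectrum description (a sketch assuming NumPy; the values of \(a,b\) are arbitrary choices): the block \(\begin{pmatrix}a & -b\\ b & a\end{pmatrix}\) is normal and contributes the conjugate pair \(a\pm ib\).

```python
import numpy as np

a, b = 2.0, 3.0
B = np.array([[a, -b],
              [b,  a]])   # one 2x2 diagonal block from II)

# the block commutes with its adjoint, hence it is normal
assert np.allclose(B @ B.T, B.T @ B)
# its eigenvalues are the conjugate pair a -+ ib from the spectrum description
assert np.allclose(np.sort_complex(np.linalg.eigvals(B)), [a - 1j*b, a + 1j*b])
```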

III) In particular, if \(T\) is orthogonal, then there exists an orthonormal basis \(\beta\) for \(V\) such that

\[[T]_{\beta}=\begin{pmatrix} \small\begin{matrix}\cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1\end{matrix} &  &  &  &  &\\ &  \ddots&  &  &  &\\ &  & \small\begin{matrix}\cos\theta_q & -\sin\theta_q \\ \sin\theta_q & \cos\theta_q\end{matrix} &  &  &\\ &  &  & \varepsilon_1 &  &\\ &  &  &  & \ddots &\\ &  &  &  &  & \varepsilon_p \end{pmatrix} \]

is a block diagonal matrix, where \(\theta_j\in \mathbb{R}\setminus\{k\pi:k\in \mathbb{Z}\}\ (j=1,\cdots,q)\) and \(\varepsilon_j=\pm 1\ (j=1,\cdots,p)\). That is to say, any orthogonal operator on a nonzero finite-dimensional real inner product space is the composite of rotations and reflections (the order does not matter), and the space can be decomposed as a direct sum of pairwise orthogonal \(1\)- or \(2\)-dimensional spaces that are invariant under the operator.
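The block diagonal form in III) can be assembled and checked directly (a sketch assuming NumPy; the angle \(\theta\) and signs \(\varepsilon_j\) are arbitrary choices):

```python
import numpy as np

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # a rotation block
Q = np.block([
    [R,                np.zeros((2, 2))],
    [np.zeros((2, 2)), np.diag([1.0, -1.0])],      # eps_1 = 1, eps_2 = -1
])

# a block diagonal matrix of rotations and reflections is orthogonal ...
assert np.allclose(Q.T @ Q, np.eye(4))
# ... and every eigenvalue has modulus one
assert np.allclose(np.abs(np.linalg.eigvals(Q)), 1.0)
```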

In each case above, the \(q\)-tuple of second-order diagonal blocks can be rearranged into any permutation of itself, e.g., \((A_1,\cdots,A_q)\) can be rearranged into \((A_{\sigma(1)},\cdots,A_{\sigma(q)})\) for any permutation \(\sigma\); the same holds for the \(p\)-tuple of first-order diagonal blocks. However, if we reorder the basis \(\beta\) in I) so that the \(q\)-tuple and the \(p\)-tuple are interleaved, e.g., \(\text{diag}([T]_{\beta})=(A_1,c_1,A_2,c_2,\cdots)\), then \([T]_{\beta}\) is no longer guaranteed to be block upper triangular. The same caveat applies to II).

Generalized Schur's Theorem: Proof

We prove I) in full, inserting along the way the key observations needed for II); III) then follows immediately.

Proof of I). Induction on \(n:=\dim V\). The theorem is trivial when \(n=1\), so we may assume that \(n\ge 2\). Assume that the theorem is true for any real inner product space of dimension less than \(n\).

Case 1: \(T\) has an eigenvalue \(\lambda\).

Since \(\ker(T^*-\lambda I)=\text{im}(T-\lambda I)^{\perp}\), we have

\[\dim \ker(T^*-\lambda I)=n-\dim \text{im}(T-\lambda I)=\dim \ker(T-\lambda I)\ge 1 \]

and thus there exists a unit vector \(z\) such that \(T^*z=\lambda z\). Define \(W:=\text{span}(\{z\})\). Then \(W\) is \(T^*\)-invariant, and so \(W^{\perp}\) is \(T\)-invariant, with \(\dim W^{\perp}=n-1\). By the induction hypothesis, there exists an orthonormal basis \(\gamma\) for \(W^{\perp}\) such that \([T_{W^{\perp}}]_{\gamma}\) is of the stated form. Define \(\beta:=\gamma\cup \{z\}\) (with \(z\) listed last). Then \(\beta\) is an orthonormal basis for \(V\) such that

\[[T]_{\beta}=\begin{pmatrix}\left.\begin{matrix}[T_{W^{\perp}}]_{\gamma}\\O_{1\times (n-1)}\end{matrix}\ \right| \phi_{\beta}(T(z))\end{pmatrix} \]

is of the stated form, with \([T]_{\beta}(n,n)=\langle T(z),z\rangle=\langle z,T^*(z)\rangle=\langle z,\lambda z\rangle=\lambda\).
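The dimension count used above can be checked numerically for a concrete matrix (a sketch assuming NumPy; the matrix and the eigenvalue \(\lambda=2\) are arbitrary choices, and the adjoint of a real matrix is its transpose):

```python
import numpy as np

# a real matrix with the real eigenvalue lam = 2
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 2.0]])
lam = 2.0
rank = np.linalg.matrix_rank
n = A.shape[0]
# dim ker(A - lam I) = n - rank(A - lam I), and the same count for the adjoint
assert n - rank(A - lam*np.eye(n)) == n - rank(A.T - lam*np.eye(n)) >= 1
```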

Case 2: \(T\) has no eigenvalue.

In this case, \(T^*\) has no eigenvalue either. Since every operator on a nonzero finite-dimensional complex vector space has an eigenvalue, we may pick an eigenvector \(x=x_1\otimes 1+x_2\otimes i\ (x_1,x_2\in V)\) of \((T^*)_{\mathbb{C}}\) with corresponding eigenvalue \(\overline{\lambda}=\lambda_1+i\lambda_2\ (\lambda_1,\lambda_2\in \mathbb{R})\). Here \(\lambda_2\neq 0\) and \(x_1,x_2\) are linearly independent in \(V\), for otherwise the relations below would give \(T^*\) a real eigenvalue. Also, we have

\[\begin{align*} &T^*(x_1)\otimes 1+T^*(x_2)\otimes i\\ =&(T^*)_{\mathbb{C}}(x_1\otimes 1+x_2\otimes i)\\ =&(\lambda_1+i\lambda_2)(x_1\otimes 1+x_2\otimes i)\\ =&(\lambda_1x_1-\lambda_2x_2)\otimes 1+(\lambda_2x_1+\lambda_1x_2)\otimes i \end{align*}\implies \begin{cases}T^*(x_1)=\lambda_1x_1-\lambda_2x_2\\T^*(x_2)=\lambda_2x_1+\lambda_1x_2\end{cases} \]

Define \(W:=\text{span}(\{x_1,x_2\})\). Then \(W\) is a \(2\)-dimensional \(T^*\)-invariant subspace, and so \(W^{\perp}\) is an \((n-2)\)-dimensional \(T\)-invariant subspace. By the induction hypothesis, there exists an orthonormal basis \(\gamma\) for \(W^{\perp}\) such that \([T_{W^{\perp}}]_{\gamma}\) is of the stated form. Let \(\{x'_1,x'_2\}\) be an orthonormal basis for \(W\) (obtained from \(\{x_1,x_2\}\) by Gram–Schmidt), and define \(\beta':=\gamma\cup\{x'_1,x'_2\}\). Then \(\beta'\) is an orthonormal basis for \(V\) and

\[[T]_{\beta'} =\begin{pmatrix} \left.\left.\begin{matrix}[T_{W^{\perp}}]_{\gamma}\\O_{2\times (n-2)}\end{matrix}\right|\phi_{\beta'}(T(x'_1))\right| \phi_{\beta'}(T(x'_2)) \end{pmatrix} \]

is of the stated form. Therefore I) is proved.
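The splitting of a complex eigenvector of \((T^*)_{\mathbb{C}}\) into the pair \(x_1,x_2\) can be illustrated numerically (a sketch assuming NumPy; the matrix is an arbitrary choice with no real eigenvalue, the adjoint is the transpose, and \(\otimes 1,\otimes i\) become real and imaginary parts):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [2.0,  1.0]])             # no real eigenvalue (spectrum 1 -+ 2i)
vals, vecs = np.linalg.eig(A.T)         # eigenpairs of the adjoint
lam, x = vals[0], vecs[:, 0]
l1, l2 = lam.real, lam.imag
x1, x2 = x.real, x.imag

# the real and imaginary parts satisfy the system derived above:
assert np.allclose(A.T @ x1, l1*x1 - l2*x2)
assert np.allclose(A.T @ x2, l2*x1 + l1*x2)
```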

Proof of II). Now we assume that \(T\) is normal. Thanks to the identity \((T_{\mathbb{C}})^*=(T^*)_{\mathbb{C}}\), \(T_{\mathbb{C}}\) is normal as well. Therefore, we have

\[(T^*)_{\mathbb{C}}(x)=\overline{\lambda} x\implies (T_{\mathbb{C}})^*(x)=\overline{\lambda} x\iff T_{\mathbb{C}}(x)=\lambda x\implies \begin{cases}T(x_1)=\lambda_1x_1+\lambda_2x_2\\T(x_2)=-\lambda_2x_1+\lambda_1x_2\end{cases} \]

(For the leftmost "\(\implies\)", see this blog : [https://www.cnblogs.com/chaliceseven/p/17094294.html].)
Consequently, \(W\) is \(T\)-invariant as well and \([T_{W}]_{\{x_1,x_2\}}=\begin{pmatrix}\lambda_1 & -\lambda_2\\ \lambda_2 & \lambda_1 \end{pmatrix}\). Note that

\[\begin{align*} &\langle T(x_1),x_1 \rangle=\langle x_1,T^*(x_1) \rangle\implies \lambda_2 \langle x_2,x_1 \rangle=0\implies \langle x_2,x_1 \rangle=0\\ &\langle T(x_1),x_2 \rangle=\langle x_1,T^*(x_2) \rangle\implies \lambda_2\langle x_2,x_2 \rangle=\lambda_2\langle x_1,x_1 \rangle\implies \|x_1\|=\|x_2\| \end{align*} \]

Without loss of generality, we may assume that \(\|x_1\|=\|x_2\|=1\) (replace \(x\) by \(x/\|x_1\|\)). Define \(\beta:=\gamma\cup \{x_1,x_2\}\); then \(\beta\) is an orthonormal basis for \(V\) such that

\[[T]_{\beta} =\begin{pmatrix} [T_{W^{\perp}}]_{\gamma} & O \\ O & [T_{W}]_{\{x_1,x_2\}} \end{pmatrix}\]

as desired.
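The orthogonality and equal-norm computation above can be checked numerically for a concrete normal matrix (a sketch assuming NumPy; the matrix is an arbitrary choice):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [2.0,  1.0]])
assert np.allclose(A @ A.T, A.T @ A)    # A is normal

vals, vecs = np.linalg.eig(A)
x = vecs[:, 0]                          # eigenvector for a non-real eigenvalue
x1, x2 = x.real, x.imag

# as computed above, x1 and x2 are orthogonal and of equal length
assert np.isclose(x1 @ x2, 0.0)
assert np.isclose(np.linalg.norm(x1), np.linalg.norm(x2))
```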

Proof of III). Note that if a block upper triangular matrix \(\begin{pmatrix}A & C\\O & B\end{pmatrix}\) is unitary, then \(A^*A=I\), \(B^*B+C^*C=I\), and \(BB^*=I\). Since \(B\) is square, \(BB^*=I\) gives \(B^*B=I\), so \(C^*C=O\) and hence \(C=O\); in particular \(A\) and \(B\) are unitary. Now III) follows from II) and this fact, recalling that the eigenvalues of a unitary matrix all have modulus one. \(\blacksquare\)

Remark. In fact, if the block upper triangular matrix \(M:=\begin{pmatrix}A & C\\O & B\end{pmatrix}\) is merely normal, then comparing the top-left blocks of \(M^*M\) and \(MM^*\) gives \(A^*A=AA^*+CC^*\). Taking traces and using \(\text{tr}(A^*A)=\text{tr}(AA^*)\) yields \(\text{tr}(CC^*)=0\); since \(\text{tr}(CC^*)=\sum_{i,j}|c_{ij}|^2\), this implies \(C=O\), and consequently both \(A\) and \(B\) are normal.
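The block identity in the remark is easy to confirm numerically (a sketch assuming NumPy; the block sizes and entries are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))
M = np.block([[A, C], [np.zeros((2, 2)), B]])

# top-left blocks of M^*M and MM^* are A^*A and AA^* + CC^* respectively,
# so normality of M forces tr(CC^*) = 0, i.e. C = O
assert np.allclose((M.T @ M)[:2, :2], A.T @ A)
assert np.allclose((M @ M.T)[:2, :2], A @ A.T + C @ C.T)
```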

posted @ 2023-02-06 01:09  ChaliceSeven