Spectral Theorem
In linear algebra, one is often interested in the canonical forms of a linear transformation. Given a particularly nice basis for the vector spaces in which one is working, the matrix of a linear transformation may also be particularly nice, revealing some information about how the transformation operates on the vector space.
The spectral theorem provides a sufficient criterion for the existence of a particular canonical form. Specifically, the spectral theorem states that if a real matrix \(M\) equals its own transpose, then \(M\) is diagonalizable: there exists an invertible matrix \(C\) such that \(C^{-1} MC \) is a diagonal matrix. Recall that a diagonal matrix is any matrix for which all entries off the main diagonal (the diagonal from top left to bottom right) are zero.
A matrix \(M\) with entries in \(\mathbb{R}\) is called symmetric if \(M =M^{T}\). The spectral theorem states that any symmetric matrix is diagonalizable; in fact, as the proof below shows, the diagonalizing matrix \(C\) can be chosen to be orthogonal, so that \(\mathbb{R}^n\) has an orthonormal basis consisting of eigenvectors of \(M\).
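For a concrete illustration of the theorem, consider the symmetric matrix
\[M = \begin{pmatrix} 5 & -2 \\ -2 & 2 \end{pmatrix}.\]
Its characteristic polynomial is \(\lambda^{2} - 7\lambda + 6 = (\lambda - 6)(\lambda - 1),\) so the eigenvalues are \(6\) and \(1\), with orthogonal eigenvectors \((2,-1)\) and \((1,2)\). Taking
\[C = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix},\]
whose columns are the normalized eigenvectors (so that \(C^{-1} = C^{T}\)), gives
\[C^{-1}MC = \begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix}.\]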
Motivation
Consider the matrix
\[A= \begin{pmatrix} 2 & 6 \\ 0 & -1 \end{pmatrix}.\]
The eigenvalues of this matrix are \(\lambda_1 = 2\) and \(\lambda_2 = -1\), with corresponding eigenvectors \(v_1 = (1,0)\) and \(v_2 = (-2,1)\). These eigenvectors \(\{v_1, v_2\}\) form a basis for \(\mathbb{R}^2\), with change of basis matrix
\[C= \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix},\]
sending the standard basis vector \(e_1 = (1,0)\) to \(v_1\) and \(e_2 = (0,1)\) to \(v_2\). In the basis \(\{v_1, v_2\}\), the matrix for the linear transformation described by \(A\) is precisely
\[C^{-1} AC = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix}.\]
This diagonal matrix is easier to parse than \(A\); looking at it immediately tells one that \(A\) acts on \(\mathbb{R}^2\) by scaling along the two eigenvector directions, one by a factor of \(2\) and the other by a factor of \(-1\) (i.e. a reflection).
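The eigenvalues of \(A\) used above can be computed directly: since \(A\) is upper triangular,
\[\det(A - \lambda I) = \det\begin{pmatrix} 2-\lambda & 6 \\ 0 & -1-\lambda \end{pmatrix} = (2-\lambda)(-1-\lambda),\]
which vanishes exactly when \(\lambda = 2\) or \(\lambda = -1\), and solving \((A - \lambda I)v = 0\) for each value yields the eigenvectors \(v_1 = (1,0)\) and \(v_2 = (-2,1)\).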
A matrix \(M\) is called diagonalizable if there is a basis in which the linear transformation described by \(M\) has a diagonal matrix, i.e. a matrix whose entries off the main diagonal (the diagonal from top left to bottom right) are all zero. Equivalently, \(M\) is diagonalizable if and only if there exists an invertible matrix \(C\) such that \(C^{-1}M C\) is a diagonal matrix. Diagonalizable matrices are easier to work with than non-diagonalizable matrices since they can be placed in this canonical diagonal form with a change of basis.
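Not every matrix is diagonalizable. For example, the (non-symmetric) matrix
\[N = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\]
has \(1\) as its only eigenvalue, and the corresponding eigenspace is the one-dimensional span of \((1,0)\), so there is no basis of \(\mathbb{R}^2\) consisting of eigenvectors of \(N\). The spectral theorem guarantees that this obstruction never arises for symmetric matrices.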
Proof of Spectral Theorem
Let's start off by proving the following useful lemma:
Lemma: All the eigenvalues of a real \(n \times n \) symmetric matrix \(M\) are real.
Proof:
Let \(\lambda \in \mathbb{C} \) be an eigenvalue of \(M\) with corresponding eigenvector \( v \in \mathbb{C}^n \). We will show that \( \overline{\lambda} = \lambda \) by evaluating \( (Mv)^{T} \overline{v} \) in two ways:
\[\begin{align}
(Mv)^{T} \overline{v}
&= v^{T}M^{T}\overline{v} \\
&= v^{T}\big(M\overline{v}\big) \\
&= v^{T}\big(\overline{M}\overline{v}\big) \\
&= v^{T}\big(\overline{Mv}\big) \\
&= v^{T}\big(\overline{\lambda v}\big) \\
&= \overline{\lambda}\,v^{T}\overline{v} &\qquad (1) \\\\
(Mv)^{T} \overline{v}
&= (\lambda v)^{T}\overline{v} \\
&= \lambda v^{T} \overline{v}. &\qquad (2)
\end{align} \]
Here the second line uses \(M^{T} = M\) (symmetry), the third line uses \(\overline{M} = M\) (\(M\) has real entries), and both computations use \(Mv = \lambda v\).
Since \(v \neq 0\), we have \(v^{T}\overline{v} = \sum_{i} |v_{i}|^{2} > 0\), so comparing equations (1) and (2) gives \( \overline{\lambda} = \lambda,\) i.e. \(\lambda \in \mathbb{R}.\ _\square \)
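The symmetry hypothesis is essential here; a real matrix that is not symmetric can have non-real eigenvalues. For example, the rotation matrix
\[J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\]
has characteristic polynomial \(\lambda^{2} + 1,\) so its eigenvalues are \(\pm i \notin \mathbb{R}.\)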
Proof by Induction
Base Case:
For \(n = 1,\) \(M\) is a \(1 \times 1\) matrix, so it is already diagonal: writing \(M = \lambda,\) any non-zero real scalar \(v\) satisfies \(Mv = \lambda v\) and, after normalizing, forms an orthonormal basis of eigenvectors for \(\mathbb{R}. \)
Induction Hypothesis:
Every \( k \times k \) real symmetric matrix can be diagonalized by an orthogonal matrix, for \(k = 1, \ldots, n-1.\) So, let \(M \in M_{n} (\mathbb{R}) \) be symmetric. We break the inductive step into three steps:
Step 1:
By the fundamental theorem of algebra, the characteristic polynomial of \(M\) has a root \(\lambda_{1} \in \mathbb{C},\) and by the above lemma \(\lambda_{1} \in \mathbb{R}.\) Since \(M - \lambda_{1}I\) is then a singular real matrix, we may choose a real eigenvector \(v_{1}\) and normalize it so that \( |v_{1}| = 1 \). Now extend \(v_{1}\) to a basis \( v_{1}, u_{2}, \ldots , u_{n} \) of \(\mathbb{R}^n \) and apply the Gram-Schmidt procedure to obtain an orthonormal basis \( B = \{v_{1}, v_{2}, \ldots , v_{n}\} \) of \(\mathbb{R}^n \) (Gram-Schmidt leaves the unit vector \(v_{1}\) unchanged).
Step 2:
Define \(P= [v_{1}\ v_{2}\ \ldots\ v_{n}] \), the matrix whose columns are the vectors of \(B\). The columns of \(P\) are orthonormal, so \(P\) is an orthogonal matrix; that is, \( P^{T} = P^{-1}\). Also define \( A= P^{-1}MP = P^{T}MP; \) then \[ A^{T} = \big(P^{T}MP\big)^{T} = P^{T}M^{T}\big(P^{T}\big)^{T} = P^{T}M^{T}P = P^{T}MP = A, \] so \(A\) is symmetric. Now we use this to write \(A\) in block form. Applying \(A\) to the standard basis vector \(\mathbf{e_{1}} \) gives \[ A\mathbf{e_{1}} = P^{T}MP\mathbf{e_{1}} = P^{T}M v_{1} = \lambda_{1}P^{T}v_{1} = \lambda_{1}\mathbf{e_{1}}, \] since \(P\mathbf{e_{1}} = v_{1}\) and \(P^{T}v_{1} = \mathbf{e_{1}}\) (the columns of \(P\) are orthonormal and the first column is \(v_{1}\)). So the first column of \(A\) is \(\lambda_{1}\mathbf{e_{1}},\) and because \(A\) is symmetric its first row is \((\lambda_{1}, 0, \ldots, 0)\) as well. Hence \(A\) has the block form \[ A = \begin{bmatrix} \lambda_{1} & 0 \\ 0 & C \end{bmatrix}, \] where \( C \in M_{n-1}(\mathbb{R}) \) is symmetric.
Step 3:
Now, by the induction hypothesis applied to \(C\), there exist a diagonal matrix \(D\) and an orthogonal matrix \(Q\) such that \( D= Q^{-1} CQ = Q^{T} CQ \). Define \[ R = P \begin{bmatrix} 1 & 0 \\ 0 & Q \end{bmatrix}. \] We aim to show that \(R\) is orthogonal and \( R^{T} MR \) is diagonal. For the first part, \[ \begin{bmatrix} 1 & 0 \\ 0 & Q \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & Q^{-1} \end{bmatrix} \implies R^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & Q^{-1} \end{bmatrix} P^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & Q^{T} \end{bmatrix} P^{T} = R^{T}. \] Finally, \[ R^{T}MR = \begin{bmatrix} 1 & 0 \\ 0 & Q^{T} \end{bmatrix} P^{T}MP \begin{bmatrix} 1 & 0 \\ 0 & Q \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & Q^{T} \end{bmatrix} \begin{bmatrix} \lambda_{1} & 0 \\ 0 & C \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & Q \end{bmatrix} = \begin{bmatrix} \lambda_{1} & 0 \\ 0 & Q^{T}CQ \end{bmatrix} = \begin{bmatrix} \lambda_{1} & 0 \\ 0 & D \end{bmatrix}. \] So there exists an orthogonal matrix \(R\) such that \( R^{T}MR \) is diagonal, which is equivalent to saying that there exists an orthonormal basis for \(\mathbb{R}^n \) consisting of eigenvectors of \(M.\ _\square \)
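As a small illustration of the construction, take the symmetric matrix
\[M = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}.\]
One eigenvalue is \(\lambda_{1} = 3\) with unit eigenvector \(v_{1} = \tfrac{1}{\sqrt{2}}(1,1)\). Extending to the orthonormal basis \(\left\{\tfrac{1}{\sqrt{2}}(1,1),\ \tfrac{1}{\sqrt{2}}(1,-1)\right\}\) gives
\[P = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad P^{T}MP = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix},\]
so here the \(1 \times 1\) block \(C = (-1)\) is already diagonal and \(R = P\) diagonalizes \(M\) orthogonally, in agreement with the theorem.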