Jordan Canonical Form
Jordan canonical form is a representation of a linear transformation over a finite-dimensional complex vector space by a particular kind of upper triangular matrix. Every such linear transformation has a unique Jordan canonical form, which has useful properties: it is easy to describe and well-suited for computations.
Less abstractly, one can speak of the Jordan canonical form of a square matrix; every square matrix is similar to a unique matrix in Jordan canonical form, since similar matrices correspond to representations of the same linear transformation with respect to different bases, by the change of basis theorem.
Jordan canonical form can be thought of as a generalization of diagonalizability to arbitrary linear transformations (or matrices); indeed, the Jordan canonical form of a diagonalizable linear transformation (or a diagonalizable matrix) is a diagonal matrix.
Contents
Definition and Properties
A Jordan block is a square matrix of the following form: \[ J_{\lambda} = \begin{pmatrix} \lambda & 1 & & & \\ & \lambda & 1 & & \\ & & \ddots & \ddots & \\ & & & \lambda & 1 \\ & & & &\lambda \end{pmatrix} \] for some complex number \(\lambda.\)
Note that the characteristic polynomial of an \(n\times n\) Jordan block is \((x-\lambda)^n,\) so a Jordan block has precisely one eigenvalue \(\lambda.\) Note also that the \(\lambda\)-eigenspace is one-dimensional, spanned by the vector \(\begin{pmatrix} 1\\0\\\vdots\\0 \end{pmatrix}.\)
Call a square matrix Jordan if it is a block matrix of the form \[ \begin{pmatrix} J_{\lambda_1} & & & \\ & J_{\lambda_2} & & \\ & & \ddots & \\ & & & J_{\lambda_k} \end{pmatrix}, \] where \(\lambda_1, \ldots, \lambda_k\) are (not necessarily distinct) complex numbers, and each \(J_{\lambda_i}\) is a Jordan block.
Jordan Canonical Form Theorem:
Let \(V\) be a finite-dimensional complex vector space, and let \(T \colon V \to V\) be a linear transformation. There is a unique basis \(\mathcal B\) of \(V\) \((\)unique up to ordering of the vectors in \(\mathcal B)\) such that the matrix of \(T\) with respect to \(\mathcal B\) is Jordan. This matrix is called the Jordan canonical form of \(T.\)
Equivalently, let \(A\) be an \(n \times n\) matrix with complex entries. Then \(A\) is similar to a Jordan matrix, called the Jordan canonical form of \(A,\) which is unique up to rearrangement of the Jordan blocks.
Let \(A = \begin{pmatrix} -2&1&4\\-5&2&5\\-1&1&3 \end{pmatrix}.\) Find the Jordan canonical form of \(A.\) What are the eigenvalues and eigenvectors of \(A?\) Is \(A\) diagonalizable?
The characteristic polynomial of \(A\) is \[ \det\begin{pmatrix} t+2&-1&-4\\5&t-2&-5\\1&-1&t-3 \end{pmatrix} = t^3-3t^2+4 = (t-2)^2(t+1). \]
The \(-1\)-eigenspace is the kernel of \( \begin{pmatrix} 1&-1&-4\\5&-3&-5\\1&-1&-4 \end{pmatrix},\) which is a one-dimensional subspace generated by \( \begin{pmatrix} 7\\15\\-2 \end{pmatrix}\) (e.g. by Gauss-Jordan elimination).
The \(2\)-eigenspace is the kernel of \( \begin{pmatrix} 4&-1&-4\\5&0&-5\\1&-1&-1 \end{pmatrix},\) which is a one-dimensional subspace generated by \(\begin{pmatrix} 1\\0\\1 \end{pmatrix}.\)
These eigenvalue and eigenvector computations show that \(A\) is not diagonalizable. Now note that \[ A \begin{pmatrix} 0\\1\\0 \end{pmatrix} = \begin{pmatrix} 1\\2\\1 \end{pmatrix} = 2 \begin{pmatrix} 1\\1\\1 \end{pmatrix} - \begin{pmatrix} 1\\0\\1 \end{pmatrix}. \]
- Note: We will see below how vectors like \(\small{ \begin{pmatrix} 0\\1\\0 \end{pmatrix}}\) can be found in general.
So let \(P = \begin{pmatrix} 1&1&7\\0&1&15\\1&1&-2 \end{pmatrix},\) then \(P^{-1}AP = \begin{pmatrix} 2&1&0 \\ 0&2&0 \\ 0&0&-1 \end{pmatrix},\) which is the Jordan canonical form of \(A.\) \(_\square\)
Generalized Eigenvectors
A diagonalizable linear transformation \(T\) on a finite-dimensional vector space \(V\) has the property that there is an eigenbasis of \(V;\) that is, there is a basis of \(V\) consisting of eigenvectors of \(T.\) The matrix of \(T\) with respect to this basis is diagonal. Another way to say this is every vector in \(V\) can be written uniquely as a sum of elements in each eigenspace of \(T.\) \((\)More concisely, \(V = E_{\lambda_1} \oplus E_{\lambda_2} \oplus \cdots \oplus E_{\lambda_k},\) where \(\lambda_i\) are the eigenvalues and \(E_{\lambda_i}\) is the corresponding eigenspace.\()\)
For the construction of the Jordan canonical form of a linear transformation (or matrix), the idea is to replace the eigenspaces in the last sentence of the above paragraph by larger subspaces called generalized eigenspaces of \(V,\) such that every vector in \(V\) can always be written uniquely as a sum of elements in each generalized eigenspace.
Let \(T \colon V \to V\) be a linear transformation on a complex vector space, and let \(\lambda\) be a complex number. The generalized \(\lambda\)-eigenspace \(W_{\lambda}\) is the subspace of \(V\) consisting of vectors \({\bf v} \in V\) such that \[ (T-\lambda I)^m({\bf v}) = {\bf 0} \] for some positive integer \(m.\) \((\)Here \(I\) is the identity map.\()\)
The vector \(\bf v\) is said to be a generalized eigenvector of rank \(m\) if \(m\) is the smallest positive integer such that \(\bf v\) is in the kernel of \((T-\lambda I)^m.\)
Note that generalized eigenvectors of rank \(1\) are precisely the eigenvectors of \(T,\) because \((T-\lambda I)({\bf v}) = {\bf 0}\) if and only if \(T{\bf v} = \lambda {\bf v}.\)
Let \(A = \begin{pmatrix} -2&1&4\\-5&2&5\\-1&1&3 \end{pmatrix},\) the matrix from the above example. The generalized eigenspace \(W_{-1}\) is the same as the eigenspace \(E_{-1}\): it is one-dimensional, spanned by \(\begin{pmatrix} 7\\15\\-2 \end{pmatrix}.\)
The generalized eigenspace \(W_2\) is not the same as \(E_2.\) In particular, \(W_2\) is two-dimensional, spanned by \(\begin{pmatrix} 1\\0\\1 \end{pmatrix}\) and \(\begin{pmatrix} 0\\1\\0 \end{pmatrix}.\) The first of these two vectors is an eigenvector, but the second is not. Note that \[ \begin{align} A-2I &= \begin{pmatrix} -4&1&4\\-5&0&5\\-1&1&1 \end{pmatrix} \\ (A-2I)^2 &= \begin{pmatrix} 7&0&-7\\15&0&-15\\-2&0&2 \end{pmatrix}, \end{align} \] so while the kernel of \(A-2I\) is one-dimensional, generated by \(\begin{pmatrix} 1\\0\\1 \end{pmatrix},\) the kernel of \((A-2I)^2\) is two-dimensional, generated by \(\begin{pmatrix} 1\\0\\1 \end{pmatrix}\) and \(\begin{pmatrix} 0\\1\\0 \end{pmatrix}.\) \(_\square\)
Algebraic Multiplicity and Geometric Multiplicity
Each eigenvalue of a linear transformation \(T \colon V \to V\) has two different concepts of multiplicity that can be associated to it. These two multiplicities are closely related to the Jordan canonical form.
Let \(V\) be a finite-dimensional complex vector space. The algebraic multiplicity of an eigenvalue \(\lambda\) of a linear transformation \(T \colon V \to V\) is the exponent of \((t-\lambda)\) in the characteristic polynomial \(p_T(t).\)
The geometric multiplicity of \(\lambda\) is the dimension of the eigenspace \(E_{\lambda}.\)
Facts about Multiplicities:
If \(T\) has a Jordan canonical form \((\)every \(T\) does, but this will not be proved until later\(),\) the algebraic multiplicity of an eigenvalue \(\lambda\) is the sum of the sizes of the Jordan blocks with \(\lambda\) on the diagonal. The geometric multiplicity is the number of Jordan blocks with \(\lambda\) on the diagonal.
The algebraic multiplicity of \(\lambda\) is the dimension of the generalized eigenspace \(W_{\lambda}\) \((\)while the geometric multiplicity is the dimension of the eigenspace \(E_{\lambda}).\)
Either of the above statements implies the following fact: the algebraic multiplicity is always \(\ge\) the geometric multiplicity, and equality holds for every eigenvalue if and only if \(T\) is diagonalizable. This fact can also be proved without using the Jordan canonical form theorem.
The algebraic multiplicities always add up to \(n = \text{dim}(V),\) by the fundamental theorem of algebra applied to the characteristic polynomial. Note that this is why \(V\) is assumed to be a complex vector space: the characteristic polynomial needs to factor completely into linear factors.
Continuing the example with \(A = \begin{pmatrix} -2&1&4\\-5&2&5\\-1&1&3 \end{pmatrix},\) the characteristic polynomial is \( (t-2)^2(t+1),\) so the algebraic multiplicity of \(-1\) is \(1\) and the algebraic multiplicity of \(2\) is \(2.\) The geometric multiplicity of \(-1\) is automatically \(1,\) and the geometric multiplicity of \(2\) is \(1\) as well, because the \(2\)-eigenspace is one-dimensional (as seen above). This corresponds to the fact that there is only one Jordan block with eigenvalue \(2.\)
Note that the algebraic multiplicities sum to \(1+2=3,\) the number of rows (and columns) of \(A,\) but the geometric multiplicities sum to \(1+1=2.\) This is a reflection of the fact that \(A\) is not diagonalizable.
Let \(A = \begin{pmatrix} 0&-1\\1&0 \end{pmatrix}.\) The characteristic polynomial \(p_A(t)\) is \(t^2+1,\) which has complex roots \(\pm i.\) Note that \(A\) has no real eigenvalues, and it is not similar over the real numbers to a Jordan matrix. However, it is similar over the complex numbers to the matrix \(\begin{pmatrix} i&0\\0&-i \end{pmatrix},\) which is a Jordan matrix--indeed \(A\) is diagonalizable over the complex numbers.
The point of this example is that \(p_A(t)\) has no real roots, but since every monic polynomial of degree \(n\) over the complex numbers splits into a product of \(n\) linear factors, \(p_A(t)\) must have two complex roots, which in this case both have algebraic and geometric multiplicities equal to \(1.\)
Computing Jordan Canonical Form
Here are some useful facts about generalized eigenvectors:
The set \(W_{\lambda,m}\) of generalized eigenvectors of rank \( \le m\) is a subspace of \(V.\)
Let \(w_{\lambda,m} = \text{dim}(W_{\lambda,m}).\) Then the sequence \(w_{\lambda,1}, w_{\lambda,2}, \ldots\) of positive integers is a nondecreasing sequence. If two consecutive terms are equal, then every subsequent term is equal, and these terms are all equal to the algebraic multiplicity of \(\lambda.\)
The quantity \(w_{\lambda,m} - w_{\lambda,m-1}\) equals the number of Jordan blocks of size \(\ge m\) in the Jordan canonical form.
So the Jordan canonical form is determined by the quantities \(w_{\lambda,m}\) for every eigenvalue \(\lambda\) and positive integer \(m.\)
Put the matrix \( \begin{pmatrix} 1 &0 &0 &0 &0 \\ 1 &−1 &0 &0 &−1 \\ 1 &−1 &0 &0 &−1 \\ 0 &0 &0 &0 &−1 \\ −1 &1 &0 &0 &1 \end{pmatrix} \) in Jordan canonical form.
It is straightforward to compute the characteristic polynomial \(p(t) = t^4(t-1).\) So the algebraic multiplicity of \(1\) is \(1,\) and hence the geometric multiplicity is \(1\) as well. The algebraic multiplicity of \(0\) is \(4,\) so we must compute the geometric multiplicity and the structure of the generalized eigenspace. This involves looking at the kernels of the powers of the matrix \(A-0I,\) which is \(A.\)
Now \(N(A)\) is two-dimensional (e.g. by rank-nullity since the first, second, and fourth rows are clearly a basis for its row space), spanned by \( \begin{pmatrix} 0\\0\\1\\0\\0 \end{pmatrix} \) and \( \begin{pmatrix} 0\\0\\0\\1\\0 \end{pmatrix}.\) So the geometric multiplicity is \(2,\) i.e. there are two Jordan blocks corresponding to the eigenvalue \(0.\)
On the other hand, \[ A^2 = \begin{pmatrix} 1 &0 &0 &0 &0 \\ 1 &0 &0 &0 &0 \\ 1 &0 &0 &0 &0 \\ 1 &−1 &0 &0 &−1 \\ −1 &0 &0 &0 &0 \end{pmatrix}, \] which clearly has rank 2, so \(N(A^2)\) is three-dimensional. It contains \(N(A),\) and a third vector that spans it is \(\begin{pmatrix} 0\\1\\0\\0\\-1 \end{pmatrix}.\)
Finally, \[ A^3 = \begin{pmatrix} 1 &0 &0 &0 &0 \\ 1 &0 &0 &0 &0 \\ 1 &0 &0 &0 &0 \\ 1 &0 &0 &0 &0 \\ −1 &0 &0 &0 &0 \end{pmatrix}, \] so \(N(A^3)\) is four-dimensional, with a fourth spanning vector \(\begin{pmatrix} 0\\0\\0\\0\\1 \end{pmatrix}.\)
So the sequence \(w_{0,1},w_{0,2},\ldots\) equals \(2,3,4,4,4,\ldots.\) There are two Jordan blocks corresponding to the eigenvalue \(0.\) The sizes add up to the algebraic multiplicity, which is \(4.\) The fact that \(w_{0,3}-w_{0,2} = 1\) implies that one of the blocks has size \(\ge 3.\) So there must be one block of size \(3\) and one block of size \(1.\)
Putting it all together, the Jordan canonical form for \(A\) is \[ \begin{pmatrix} 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&1 \end{pmatrix}.\ _\square \]
Which of these matrices is not similar to any of the other three?
\(\)
Notation: Two square matrices \(A,B\) with complex entries are similar if and only if there is an invertible square matrix \(P\) such that \(A = PBP^{-1}.\)
Proof of Existence and Uniqueness
Applications
The Jordan canonical form is convenient for computations. In particular, matrix powers and exponentials are straightforward to compute once the Jordan canonical form is known. Here is an illustrative example.
Let \(A = \begin{pmatrix} 5&-1\\9&-1 \end{pmatrix}.\) Find \(A^{10}.\) Can you find a formula for \(A^k\) for any positive integer \(k?\)
The easiest way to do this problem is to convert \(A\) to a similar matrix in Jordan canonical form, and then to consider powers of this matrix. The characteristic polynomial of \(A\) is \(p(t) = (5-t)(-1-t)+9 = t^2-4t+4 = (t-2)^2.\) The kernel of \(A-2I = \begin{pmatrix} 3&-1 \\ 9&-3 \end{pmatrix}\) is one-dimensional, generated by \(\begin{pmatrix} 1\\3 \end{pmatrix},\) so there is only one Jordan block in \(A.\) Therefore \(A\) is similar to \(J = \begin{pmatrix} 2&1 \\ 0&2 \end{pmatrix}.\)
To solve the rest of the problem, it is necessary to determine the matrix \(P\) such that \(A = PJP^{-1}.\) By the change of basis theorem, the first column \({\bf c_1}\) of \(P\) is an eigenvector with eigenvalue \(2,\) so we can take \({\bf c_1} = \begin{pmatrix} 1\\3 \end{pmatrix}.\) And the second column \({\bf c_2}\) of \(P\) satisfies \(A{\bf c_2} = {\bf c_1} + 2{\bf c_2},\) or \((A-2I){\bf c_2} = {\bf c_1}.\) That is, \(3c_{12} - c_{22} = 1.\) For convenience's sake, we take \(c_{12} = 0\) and \(c_{22} = -1,\) which gives \(P = \begin{pmatrix} 1&0\\3&-1 \end{pmatrix}.\) Note that \(P=P^{-1},\) which will be useful later.
Now \[A^k = \big(PJP^{-1}\big)^k = \big(PJP^{-1}\big)\big(PJP^{-1}\big)(\cdots)\big(PJP^{-1}\big) = PJ^kP^{-1},\] so the problem essentially reduces to the computation of \(J^k.\) \((\)This technique may be familiar in other situations where \(J\) is diagonal, for instance in one derivation of the formula for the Fibonacci numbers. In that case, \(J^k\) is trivial to compute, as its entries are just the powers of the diagonal entries of \(J.)\)
It remains to find a formula for \(J^k.\) It is convenient to write \(J = N+2I,\) where \(N = \begin{pmatrix} 0&1\\0&0 \end{pmatrix}.\) Then \[J^k = (2I+N)^k = \sum_{i=0}^k \binom{k}{i} 2^{k-i} N^i = 2^k I + k 2^{k-1} N\] because \(N^i\) is the zero matrix for \(i \ge 2.\) So \[J^k = \begin{pmatrix} 2^k & k2^{k-1} \\ 0& 2^k \end{pmatrix},\] and \[\begin{align} A^k = PJ^kP^{-1} &= \begin{pmatrix} 1&0 \\ 3&-1 \end{pmatrix} \begin{pmatrix} 2^k &k2^{k-1} \\ 0 & 2^k \end{pmatrix} \begin{pmatrix} 1&0 \\ 3&-1 \end{pmatrix} \\ &= 2^{k-1} \begin{pmatrix} 3k+2 & -k \\ 9k & -(3k-2) \end{pmatrix}.\ _\square \end{align}\]
This technique generalizes to arbitrary Jordan blocks, although the computations are tedious for larger blocks because more powers of the matrix \(N\) are nonzero \((\)for a Jordan block of size \(k,\) \(N^k\) is zero but \(N^{k-1}\) is not\().\)