Eigenvalues and Eigenvectors
For a matrix transformation \( T \), a non-zero vector \( v (\neq 0) \) is called its eigenvector if \( T v = \lambda v \) for some scalar \( \lambda \). This means that applying the matrix transformation to the vector only scales the vector. The corresponding value of \( \lambda \) for \( v \) is an eigenvalue of \( T \).
Contents
Matrix Transformation Example
To understand how matrix transformations act geometrically, it is important to understand their effects on individual vectors.
Consider the matrix transformation
\[\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}.\]
This corresponds to the map \(T: \mathbb{R}^2 \to \mathbb{R}^2\) given by \((x,y) \mapsto (2x, 3y)\). The action of \(T\) is to scale the \(x\)-axis by a factor of 2 and the \(y\)-axis by a factor of 3. In particular, this is equivalent to requiring that \(T\) send \((1,0) \mapsto (2,0)\) and \((0,1) \mapsto (0,3)\). Assuming this, linearity of \(T\) implies that
\[\begin{align} T(x,y) &= T\big((x,0) + (0,y)\big) \\ &= T(x,0) + T(0,y) \\ &= x \cdot T(1,0) + y \cdot T(0,1) \\ &= x \cdot (2,0) + y \cdot (0,3) \\ &= (2x, 3y). \end{align}\]
Thus, it was possible to identify a basis \(\{(1,0), (0,1)\}\) for \(\mathbb{R}^2\) on which \(T\) acts very simply, by scaling. From this knowledge, one can deduce how \(T\) works on every vector in \(\mathbb{R}^2\).
The natural question that sprouts out of this discussion is whether or not it is possible for all matrices to determine such a basis of eigenvectors. At the very least, when it is possible to do so, the matrix at hand may be considered especially well-behaved. For instance, the spectral theorem of linear algebra states that whenever \(A\) is a symmetric matrix, i.e. \(A = A^{T}\), there is a basis consisting of eigenvectors for \(A\).
Computing Eigenvalues and Eigenvectors
It is not too difficult to compute eigenvalues and their corresponding eigenvectors when the matrix transformation at hand has a clear geometric interpretation. For examples, consider the diagonal matrix discussed above and the reflection matrix below:
Consider the reflection matrix transformation \( T = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \) which reflects a vector across the \( y \)-axis. Find the eigenvectors and the corresponding eigenvalues of \( T \).
The vectors that get scaled after a reflection across the \( y \)-axis are either parallel to the \( y \)-axis, i.e. in the span of \((0,1)\), or parallel to the \(x\)-axis, i.e. in the span of \((1,0)\).
In the first case, the vector is not changed at all:
\[T(0,1) = (0,1) = 1\cdot (0,1),\]
and therefore the eigenvalue for \( (0,1) \) is \( 1 \).
In the second case, after applying the transformation, the length of the vector remains the same, but the direction reverses:
\[T(1,0) = (0,-1) = -1 \cdot (0,1),\]
and therefore the eigenvalue for \( (1,0) \) is \( -1 \). \(_\square\)
For arbitrary matrices, however, a more general procedure is necessary.
Let \(A\) be an \(n\)-by-\(n\) matrix, so that it corresponds to a transformation \(\mathbb{R}^n \to \mathbb{R}^n\). If \(\lambda\) is an eigenvalue for \(A\), then there is a vector \(v \in \mathbb{R}^n\) such that \(Av = \lambda v\). Rearranging this equation shows that \((A - \lambda \cdot I)v = 0\), where \(I\) denotes the \(n\)-by-\(n\) identity matrix. This implies the null space of the matrix \(A-\lambda \cdot I\) is nonzero, so \(A-\lambda \cdot I\) has determinant zero.
Note that for every matrix \(A\) has 0 as an eigenvalue, with eigenvector \((0,0, \cdots, 0) \in \mathbb{R}^n\). Generally, one is only concerned with the nonzero eigenvectors associated to an eigenvalue, so convention dictates that \(0\) is considered an eigenvalue of \(A\) only when the null space of \(A\) is nonzero \(\big(\)equivalently, when \(x\) divides \(p_{A} (x)\big).\)
Accordingly, any eigenvalue of \(A\) must be a root of the polynomial \(p_{A} (x) = \det(A - x \cdot I)\). This is called the characteristic polynomial of \(A\). Observe that this implies \(A\) has only finitely many eigenvalues (in fact, at most \(n\) eigenvalues).
In computations, the characteristic polynomial is extremely useful. To determine the eigenvalues of a matrix \(A\), one solves for the roots of \(p_{A} (x)\), and then checks if each root is an eigenvalue.
Consider the matrix
\[A = \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix}.\]
Compute its nonzero eigenvalues and their corresponding eigenvectors.
The characteristic polynomial of \(A\) may be computed as
\[\det(A - x \cdot I) = \det \begin{pmatrix} 1-x & -3 & 3 \\3 & -5-x & 3 \\6 & -6 & 4-x \end{pmatrix} = 16 + 12x - x^3. \]
This factors as \(p_{A} (x) = -(x-4)(x+2)^2\), so the possible for eigenvalues of \(A\) are \(\lambda_1 = 4\) and \(\lambda_2 = -2\).
To prove \(\lambda_1\) is an eigenvalue for \(A\), it suffices to exhibit a vector \(v\in \mathbb{R}^3\) with \(Av = 4v\). This means that there exists a vector that satisfies \((A - 4I)v = 0\).
\[(A - 4I)v = \begin{pmatrix} -3 & -3 & 3 \\3 & -9 & 3 \\6 & -6 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
Note that \(v_1 = \left(\frac12, \frac12, 1\right)\) does the trick. Similarly, we find eigenvectors corresponding to \(\lambda_2 = -2\) by finding the null space of \((A + 2I):\)
\[(A + 2I)v = \begin{pmatrix} 3 & -3 & 3 \\3 & -3 & 3 \\6 & -6 & 6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
This gives just one equation : \(x_1 - x_2 + x_3 = 0\). Two vectors that are linearly independent that work are \(v_2 = (1,1,0)\) and \(v_3 = (0, 1, 1)\).
Solving the system of linear equations corresponding to \(Av = 4v\) shows that any eigenvector satisfying this equation is a multiple of \(\lambda_1\). Similarly, solving the system corresponding to \(Av = -2v\) demonstrates every eigenvector satisfying this equation is a linear combination of \(v_1\) and \(v_2\). Thus, we have found all the eigenvectors of \(A\). \(_\square\)
Connections with Trace and Determinant
Suppose \(A\) is a square matrix. Then
- the sum of its eigenvalues is equal to the trace of \(A;\)
- the product of its eigenvalues is equal to the determinant of \(A.\)
The proof of these properties requires the investigation of the characteristic polynomial of \(A\), which is found by taking the determinant of \((A - \lambda{I}_{n})\). We have
\[\begin{align}A &= \begin{bmatrix} {a}_{11} & \cdots &{a}_{1n} \\ \vdots &\ddots &\vdots \\ {a}_{n1} &\cdots & {a}_{nn}\\ \end{bmatrix}\\\\ A - {I}_{n}\lambda &= \begin{bmatrix} {a}_{11} - \lambda & \cdots &{a}_{1n} \\ \vdots &\ddots &\vdots \\ {a}_{n1} &\cdots & {a}_{nn} - \lambda\\ \end{bmatrix}. \end{align}\]
Observe that \(\det(A - \lambda{I}_{n} ) = \det(A) +\cdots+ \text{tr}(A){(-\lambda)}^{n-1} + {(-\lambda)}^{n}\).
Let \({r}_{1}, {r}_{2}, ...,{r}_{n}\) be the roots of an \(n\)-order polynomial.
\[\begin{align} P(\lambda) &= ({r}_{1} - \lambda)({r}_{2} - \lambda)...({r}_{n} - \lambda)\\ &= \prod _{ i=i }^{ n }{ { r }_{ i } } +\cdots+\sum _{ i=i }^{ n }{ { r }_{ i }{(-\lambda)}^{n-1} + {(-\lambda)}^{n} }. \end{align}\]
Since the eigenvalues are the roots of a matrix polynomial, we can match \(P(x)\) to \(\det(A - \lambda{I}_{n})\). Therefore, it is clear that
\[\prod _{ i=i }^{ n }{ { \lambda }_{ i } = \det(A)}\]
and
\[\sum _{ i=i }^{ n }{ { \lambda }_{ i } = \text{tr}(A)}.\ _\square\]
Applications
In addition to their theoretical significance, eigenvalues and eigenvectors have important applications in various branches of applied mathematics, including signal processing, machine learning, and social network analysis.
For a simple application in network analysis, consider a network modeled by the finite connected graph \(G = (V, E)\). A computer will store the information of this graph in the form of an adjacency matrix. This is an \(n\)-by-\(n\) matrix \(A = \{a_{ij}\}_{1\le i \le j \le n}\), where \(n = |V|\) and \(a_{ij} = 1\) if vertices \(i\) and \(j\) have an edge between them, and \(a_{ij} = 0\) otherwise.
The eigenvalues of \(A\) detect information about the structure of \(G\). For instance, let \(\lambda_{\text{min}}\) and \(\lambda_{\text{max}}\) denote the smallest and largest eigenvalues of \(A\), respectively. Then, \(G\) is bipartite if and only if \(\lambda_{\text{min}} = - \lambda_{\text{min}}\). Accordingly, a computer could determine whether or not \(G\) is bipartite by computing the eigenvalues of \(A\); this is far more computationally feasible than attempting to check \(G\) is bipartite by brute force.
References
- Lantonov, L. Eigenvalue-equation. Retrieved August 24, 2016, from https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors#/media/File:Eigenvalue_equation.svg