Eigenvalues and Eigenvectors

For a matrix transformation $ T $, a non-zero vector $ v\, (\neq 0) $ is called its eigenvector if $ T v = \lambda v $ for some scalar $ \lambda $. This means that applying the matrix transformation to the vector only scales the vector. The corresponding value of $ \lambda $ for $ v $ is an eigenvalue of $ T $.

$The matrix transformation $A$ acts on the eigenvector $x$, scaling it by a factor of the eigenvalue $\lambda$.$ The matrix transformation $A$ acts on the eigenvector $x$, scaling it by a factor of the eigenvalue $\lambda$.^[1]

Matrix Transformation Example

To understand how matrix transformations act geometrically, it is important to understand their effects on individual vectors.

Consider the matrix transformation

\[\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}.\]

This corresponds to the map $T: \mathbb{R}^2 \to \mathbb{R}^2$ given by $(x,y) \mapsto (2x, 3y)$. The action of $T$ is to scale the $x$-axis by a factor of 2 and the $y$-axis by a factor of 3. In particular, this is equivalent to requiring that $T$ send $(1,0) \mapsto (2,0)$ and $(0,1) \mapsto (0,3)$. Assuming this, linearity of $T$ implies that

\[\begin{align} T(x,y) &= T\big((x,0) + (0,y)\big) \\ &= T(x,0) + T(0,y) \\ &= x \cdot T(1,0) + y \cdot T(0,1) \\ &= x \cdot (2,0) + y \cdot (0,3) \\ &= (2x, 3y). \end{align}\]

Thus, it was possible to identify a basis $\{(1,0), (0,1)\}$ for $\mathbb{R}^2$ on which $T$ acts very simply, by scaling. From this knowledge, one can deduce how $T$ works on every vector in $\mathbb{R}^2$.

The natural question that sprouts out of this discussion is whether or not it is possible for all matrices to determine such a basis of eigenvectors. At the very least, when it is possible to do so, the matrix at hand may be considered especially well-behaved. For instance, the spectral theorem of linear algebra states that whenever $A$ is a symmetric matrix, i.e. $A = A^{T}$, there is a basis consisting of eigenvectors for $A$.

Computing Eigenvalues and Eigenvectors

It is not too difficult to compute eigenvalues and their corresponding eigenvectors when the matrix transformation at hand has a clear geometric interpretation. For examples, consider the diagonal matrix discussed above and the reflection matrix below:

Consider the reflection matrix transformation $ T = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} $ which reflects a vector across the $ y $-axis. Find the eigenvectors and the corresponding eigenvalues of $ T $.

The vectors that get scaled after a reflection across the $ y $-axis are either parallel to the $ y $-axis, i.e. in the span of $(0,1)$, or parallel to the $x$-axis, i.e. in the span of $(1,0)$.

In the first case, the vector is not changed at all:

\[T(0,1) = (0,1) = 1\cdot (0,1),\]

and therefore the eigenvalue for $ (0,1) $ is $ 1 $.

In the second case, after applying the transformation, the length of the vector remains the same, but the direction reverses:

\[T(1,0) = (-1,0) = -1 \cdot (1,0),\]

and therefore the eigenvalue for $ (1,0) $ is $ -1 $. $_\square$

For arbitrary matrices, however, a more general procedure is necessary.

Let $A$ be an $n$-by-$n$ matrix, so that it corresponds to a transformation $\mathbb{R}^n \to \mathbb{R}^n$. If $\lambda$ is an eigenvalue for $A$, then there is a vector $v \in \mathbb{R}^n$ such that $Av = \lambda v$. Rearranging this equation shows that $(A - \lambda \cdot I)v = 0$, where $I$ denotes the $n$-by-$n$ identity matrix. This implies the null space of the matrix $A-\lambda \cdot I$ is nonzero, so $A-\lambda \cdot I$ has determinant zero.

Note that for every matrix $A$ has 0 as an eigenvalue, with eigenvector $(0,0, \cdots, 0) \in \mathbb{R}^n$. Generally, one is only concerned with the nonzero eigenvectors associated to an eigenvalue, so convention dictates that $0$ is considered an eigenvalue of $A$ only when the null space of $A$ is nonzero $\big($equivalently, when $x$ divides $p_{A} (x)\big).$

Accordingly, any eigenvalue of $A$ must be a root of the polynomial $p_{A} (x) = \det(A - x \cdot I)$. This is called the characteristic polynomial of $A$. Observe that this implies $A$ has only finitely many eigenvalues (in fact, at most $n$ eigenvalues).

In computations, the characteristic polynomial is extremely useful. To determine the eigenvalues of a matrix $A$, one solves for the roots of $p_{A} (x)$, and then checks if each root is an eigenvalue.

Consider the matrix

\[A = \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix}.\]

Compute its nonzero eigenvalues and their corresponding eigenvectors.

The characteristic polynomial of $A$ may be computed as

\[\det(A - x \cdot I) = \det \begin{pmatrix} 1-x & -3 & 3 \\3 & -5-x & 3 \\6 & -6 & 4-x \end{pmatrix} = 16 + 12x - x^3. \]

This factors as $p_{A} (x) = -(x-4)(x+2)^2$, so the possible for eigenvalues of $A$ are $\lambda_1 = 4$ and $\lambda_2 = -2$.

To prove $\lambda_1$ is an eigenvalue for $A$, it suffices to exhibit a vector $v\in \mathbb{R}^3$ with $Av = 4v$. This means that there exists a vector that satisfies $(A - 4I)v = 0$.

\[(A - 4I)v = \begin{pmatrix} -3 & -3 & 3 \\3 & -9 & 3 \\6 & -6 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]

Note that $v_1 = \left(\frac12, \frac12, 1\right)$ does the trick. Similarly, we find eigenvectors corresponding to $\lambda_2 = -2$ by finding the null space of $(A + 2I):$

\[(A + 2I)v = \begin{pmatrix} 3 & -3 & 3 \\3 & -3 & 3 \\6 & -6 & 6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]

This gives just one equation : $x_1 - x_2 + x_3 = 0$. Two vectors that are linearly independent that work are $v_2 = (1,1,0)$ and $v_3 = (0, 1, 1)$.

Solving the system of linear equations corresponding to $Av = 4v$ shows that any eigenvector satisfying this equation is a multiple of $\lambda_1$. Similarly, solving the system corresponding to $Av = -2v$ demonstrates every eigenvector satisfying this equation is a linear combination of $v_1$ and $v_2$. Thus, we have found all the eigenvectors of $A$. $_\square$

Connections with Trace and Determinant

Suppose $A$ is a square matrix. Then

the sum of its eigenvalues is equal to the trace of $A;$
the product of its eigenvalues is equal to the determinant of $A.$

The proof of these properties requires the investigation of the characteristic polynomial of $A$, which is found by taking the determinant of $(A - \lambda{I}_{n})$. We have

\[\begin{align}A &= \begin{bmatrix} {a}_{11} & \cdots &{a}_{1n} \\ \vdots &\ddots &\vdots \\ {a}_{n1} &\cdots & {a}_{nn}\\ \end{bmatrix}\\\\ A - {I}_{n}\lambda &= \begin{bmatrix} {a}_{11} - \lambda & \cdots &{a}_{1n} \\ \vdots &\ddots &\vdots \\ {a}_{n1} &\cdots & {a}_{nn} - \lambda\\ \end{bmatrix}. \end{align}\]

Observe that $\det(A - \lambda{I}_{n} ) = \det(A) +\cdots+ \text{tr}(A){(-\lambda)}^{n-1} + {(-\lambda)}^{n}$.

Let ${r}_{1}, {r}_{2}, ...,{r}_{n}$ be the roots of an $n$-order polynomial.

\[\begin{align} P(\lambda) &= ({r}_{1} - \lambda)({r}_{2} - \lambda)...({r}_{n} - \lambda)\\ &= \prod _{ i=1 }^{ n }{ { r }_{ i } } +\cdots+\sum _{ i=1 }^{ n }{ { r }_{ i }{(-\lambda)}^{n-1} + {(-\lambda)}^{n} }. \end{align}\]

Since the eigenvalues are the roots of a matrix polynomial, we can match $P(x)$ to $\det(A - \lambda{I}_{n})$. Therefore, it is clear that

\[\prod _{ i=1 }^{ n }{ { \lambda }_{ i } = \det(A)}\]

and

\[\sum _{ i=1 }^{ n }{ { \lambda }_{ i } = \text{tr}(A)}.\ _\square\]

Applications

In addition to their theoretical significance, eigenvalues and eigenvectors have important applications in various branches of applied mathematics, including signal processing, machine learning, and social network analysis.

For a simple application in network analysis, consider a network modeled by the finite connected graph $G = (V, E)$. A computer will store the information of this graph in the form of an adjacency matrix. This is an $n$-by-$n$ matrix $A = \{a_{ij}\}_{1\le i \le j \le n}$, where $n = |V|$ and $a_{ij} = 1$ if vertices $i$ and $j$ have an edge between them, and $a_{ij} = 0$ otherwise.

The eigenvalues of $A$ detect information about the structure of $G$. For instance, let $\lambda_{\text{min}}$ and $\lambda_{\text{max}}$ denote the smallest and largest eigenvalues of $A$, respectively. Then, $G$ is bipartite if and only if $\lambda_{\text{min}} = - \lambda_{\text{min}}$. Accordingly, a computer could determine whether or not $G$ is bipartite by computing the eigenvalues of $A$; this is far more computationally feasible than attempting to check $G$ is bipartite by brute force.

References

Lantonov, L. Eigenvalue-equation. Retrieved August 24, 2016, from https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors#/media/File:Eigenvalue_equation.svg

Contents