Kernel (Nullspace)
The kernel (or nullspace) of a linear transformation \( T \colon {\mathbb R}^n \to {\mathbb R}^m\) is the set \(\text{ker}(T)\) of vectors \( {\bf x} \in {\mathbb R}^n\) such that \( T({\bf x}) = {\bf 0}.\) It is a subspace of \( {\mathbb R}^n\) whose dimension is called the nullity. The rank-nullity theorem relates this dimension to the rank of \( T.\)
When \(T\) is given by left multiplication by an \(m \times n\) matrix \( A,\) so that \( T({\bf x}) = A{\bf x}\) \((\)where \( {\bf x} \in {\mathbb R}^n\) is thought of as an \(n \times 1\) matrix\(),\) it is common to refer to the kernel of the matrix rather than the kernel of the linear transformation, i.e. \(\text{ker}(A)\) instead of \(\text{ker}(T).\)
Many subspaces of \( {\mathbb R}^n\) can be naturally described as kernels of a particular linear transformation (and every subspace of \( {\mathbb R}^n\) can be described as the kernel of some linear transformation). Given a system of linear equations \( A{\bf x} = {\bf b},\) the computation of the kernel of \( A\) (via Gaussian elimination) can be used to give a general solution to the system once a particular solution is known.
Let \(A = \begin{pmatrix} 1&2&3\\4&5&6\\7&8&9 \end{pmatrix}.\) Find \(\text{ker}(A).\)
Gaussian elimination: \[ \begin{pmatrix} 1&2&3\\4&5&6\\7&8&9 \end{pmatrix} \rightarrow \begin{pmatrix} 1&2&3\\0&-3&-6\\0&-6&-12 \end{pmatrix} \rightarrow \begin{pmatrix} 1&2&3\\0&-3&-6\\0&0&0 \end{pmatrix} \rightarrow \begin{pmatrix} 1&2&3\\0&1&2\\0&0&0 \end{pmatrix} \rightarrow\begin{pmatrix} 1&0&-1\\0&1&2\\0&0&0 \end{pmatrix}. \] It is a fact (proved in the properties section below) that row reduction does not change the kernel of a matrix. The kernel of the matrix \(U\) at the end of the elimination process, which is in reduced row echelon form, is computed by writing the pivot variables (\(x_1,x_2\) in this case) in terms of the free (non-pivot) variables (\(x_3\) in this case). That is, \( U{\bf x} = {\bf 0}\) is equivalent to \[ \begin{align} x_1 - x_3 &= 0\\ x_2+2x_3 &= 0 \end{align} \]
so the solution vector \( {\bf x} = \begin{pmatrix} x_1\\x_2\\x_3 \end{pmatrix} \) equals \( \begin{pmatrix} x_3 \\ -2x_3 \\ x_3 \end{pmatrix} = x_3 \begin{pmatrix} 1\\-2\\1 \end{pmatrix}.\)
So the kernel is a one-dimensional subspace of \( {\mathbb R}^3\) with basis consisting of the single vector \( \begin{pmatrix} 1\\-2\\1 \end{pmatrix}.\)
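As a sanity check, this computation can be reproduced with a computer algebra system. Below is a minimal sketch using Python's sympy library (not part of the original computation), whose `rref` and `nullspace` methods carry out exactly the procedure described above:

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

# rref() returns the reduced row echelon form together with the pivot columns.
U, pivots = A.rref()
print(U)       # Matrix([[1, 0, -1], [0, 1, 2], [0, 0, 0]])
print(pivots)  # (0, 1): x_1 and x_2 are pivot variables, x_3 is free

# nullspace() returns a basis for ker(A) as a list of column vectors.
print(A.nullspace())  # [Matrix([[1], [-2], [1]])]
```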
Properties
- Injectivity: The kernel gives a quick check on the injectivity of \( T\):
A linear transformation \(T \colon {\mathbb R}^n \to {\mathbb R}^m\) is injective if and only if \(\text{ker}(T) = \{ {\bf 0}\}.\)
To see this, note that the kernel is the set of vectors mapping to \( {\bf 0},\) so if \(T\) is injective then the kernel can contain only one element, which must be \( {\bf 0}\) since \( T({\bf 0}) = {\bf 0}.\) On the other hand, if the kernel is trivial, then \( T({\bf x}) = T({\bf y})\) implies \( T({\bf x}-{\bf y}) = {\bf 0}\) by linearity, so \( {\bf x}-{\bf y} \in \text{ker}(T) = \{{\bf 0}\}.\) This gives \( {\bf x}-{\bf y} = {\bf 0},\) i.e. \( {\bf x} = {\bf y},\) so \(T\) is injective.
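In computational terms (a supplementary sympy sketch, as above), this criterion amounts to checking that a basis for the nullspace is empty:

```python
from sympy import Matrix

# T is injective iff ker(T) = {0}, i.e. iff the nullspace basis is empty.
A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])  # the running example: nontrivial kernel
B = Matrix([[1, 0],
            [0, 1],
            [1, 1]])     # a 3 x 2 matrix chosen here to have trivial kernel

print(len(A.nullspace()) == 0)  # False: A is not injective
print(len(B.nullspace()) == 0)  # True:  B is injective
```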
- Subspace: To see that the kernel is a subspace of \( {\mathbb R}^n,\) note that it contains \( {\bf 0}\) \((\)since \( A{\bf 0} = {\bf 0}),\) so it is enough to check that it is closed under addition and scalar multiplication: \[ {\bf x}, {\bf y} \in \text{ker}(A) \Rightarrow A{\bf x} = {\bf 0}, A{\bf y} = {\bf 0} \Rightarrow A({\bf x}+{\bf y}) = {\bf 0}+{\bf 0} = {\bf 0} \Rightarrow {\bf x}+{\bf y} \in \text{ker}(A) \] \[ {\bf x} \in \text{ker}(A), r \in {\mathbb R} \Rightarrow A(r{\bf x}) = rA{\bf x} = r{\bf 0} = {\bf 0} \Rightarrow r{\bf x} \in \text{ker}(A). \]
- Orthogonality: One way to express the kernel of a matrix \(A\) in terms of the dot product on \( {\mathbb R}^n\) is: \[ \text{ker}(A) = (R(A))^\perp, \] where \( R(A)\) is the row space of \( A,\) and \( V^\perp\) denotes the set of all vectors which are orthogonal to all the vectors in \( V.\)
This straightforward result is proved in the wiki on row and column spaces.
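This can be illustrated concretely with the running example (a supplementary sympy sketch): the kernel basis vector found above is orthogonal to every row of \(A\):

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

# Since ker(A) = (R(A))^perp, each kernel vector is orthogonal to each row.
for v in A.nullspace():
    for i in range(A.rows):
        print(A.row(i).dot(v))  # prints 0 three times
```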
- Unchanged under elementary row operations: As used in the example above, the elementary row operations in Gaussian elimination do not change the kernel of \(A\). One way to see this is that doing an elementary row operation is the same as multiplying \( A\) on the left by an invertible elementary matrix \( E.\) But the kernel of \( EA\) is the same as the kernel of \(A\): if \( A{\bf x} = {\bf 0}\) then \( EA{\bf x} = E{\bf 0} = {\bf 0},\) and if \( EA{\bf x} = {\bf 0} \) then \( E^{-1} EA{\bf x} = E^{-1}{\bf 0},\) so \( A{\bf x} = {\bf 0}.\) This justifies the method for computing the kernel outlined in the above example.
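As a concrete illustration (again a sympy sketch; the particular elementary matrix \(E\) below is just one example), left multiplication by an invertible elementary matrix leaves the computed kernel unchanged:

```python
from sympy import Matrix, eye

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

# E encodes the elementary row operation R2 -> R2 - 4*R1; it is invertible.
E = eye(3)
E[1, 0] = -4

# ker(EA) and ker(A) have the same basis.
print((E * A).nullspace() == A.nullspace())  # True
```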
- Solving general linear equations via translation: As is common with linear systems of equations, the kernel can be used to solve general equations of the form \( A{\bf x} = {\bf b}.\) If \( A{\bf x_0} = {\bf b},\) then the general solution is of the form \( {\bf x} = {\bf x_0} + {\bf v},\) where \( {\bf v} \in \text{ker } A.\) This is because \[ \begin{align} A{\bf x} = {\bf b} &\Leftrightarrow A{\bf x} - A{\bf x_0} = {\bf b} - {\bf b} = {\bf 0} \\ &\Leftrightarrow A({\bf x}-{\bf x_0}) = {\bf 0} \\ &\Leftrightarrow {\bf x}-{\bf x_0} \in \text{ker } A \\ &\Leftrightarrow {\bf x} = {\bf x_0} + {\bf v}, \, \text{for some } {\bf v} \in \text{ker } A. \\ \end{align} \]
So the set of solutions to \( A{\bf x} = {\bf b} \) is a coset of the kernel, of the form \( {\bf x_0} + \text{ ker}(A).\)
For \( A = \begin{pmatrix} 1&2&3\\4&5&6\\7&8&9 \end{pmatrix},\) find all solutions to \( A{\bf x} = \begin{pmatrix} 0\\3\\6 \end{pmatrix},\) given that \( {\bf x_0} = \begin{pmatrix} 1\\1\\-1 \end{pmatrix} \) is one such solution.
The example in the introduction shows that the kernel of \(A\) is the set of vectors which are scalar multiples of \( \begin{pmatrix} 1\\-2\\1 \end{pmatrix},\) so the set of all solutions to \( A{\bf x} = \begin{pmatrix} 0\\3\\6 \end{pmatrix}\) is \[ {\bf x} = \begin{pmatrix} 1\\1\\-1 \end{pmatrix} + t\begin{pmatrix} 1\\-2\\1 \end{pmatrix}, \] for all \(t\in {\mathbb R}.\)
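This family of solutions can be checked symbolically; the following sympy sketch verifies that \(A{\bf x} = {\bf b}\) holds for every value of the parameter \(t\):

```python
from sympy import Matrix, symbols

A  = Matrix([[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]])
b  = Matrix([0, 3, 6])
x0 = Matrix([1, 1, -1])   # the given particular solution
v  = Matrix([1, -2, 1])   # basis vector of ker(A) from the introduction

t = symbols('t')
x = x0 + t * v            # the claimed general solution

print(A * x - b)          # Matrix([[0], [0], [0]]), independent of t
```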
Rank-nullity
Intuitively, the kernel measures how much the linear transformation \(T\) collapses the domain \({\mathbb R}^n.\) If the kernel is trivial, so that \(T\) does not collapse the domain, then \(T\) is injective (as shown in the previous section); so \( T\) embeds \( {\mathbb R}^n\) into \({\mathbb R}^m.\) But if the kernel is nontrivial, \(T\) is no longer an embedding, so its image in \({\mathbb R}^m\) is smaller.
This intuition suggests an inverse relationship between the sizes of the kernel and of the image of \(T.\) The formal version of this intuition is the rank-nullity theorem. Here it is stated in matrix form:
Let \( A\) be an \( m\times n\) matrix. Then \[ \text{dim}(\text{ker}(A)) + \text{rank}(A) = n. \] Here the rank of \(A\) is the dimension of the column space (or row space) of \(A.\) The first term of the sum, the dimension of the kernel of \(A,\) is often called the nullity of \(A.\)
The most natural way to see that this theorem is true is to view it in the context of the example from the previous two sections. The computation of the kernel of \(A\) made it clear that the dimension of the kernel was equal to the number of free (non-pivot) columns in the reduced row echelon form of \(A.\) On the other hand, recall that the rank equals the number of pivot columns. So the sum of the dimensions equals the number of columns, which is \(n.\)
Again taking \(A = \begin{pmatrix} 1&2&3\\4&5&6\\7&8&9 \end{pmatrix},\) verify rank-nullity for \(A.\)
We have already seen that \(\text{dim}(\text{ker}(A)) = 1,\) and the rank of \(A\) equals the number of pivot columns in the reduced row echelon form \( U = \begin{pmatrix} 1&0&-1\\0&1&2\\0&0&0 \end{pmatrix},\) which is \( 2.\) As expected, \(1+2=3.\)
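The same verification can be automated (a sympy sketch, as before); the nullity is the size of a nullspace basis and the rank is computed directly:

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

nullity = len(A.nullspace())  # dimension of ker(A)
rank = A.rank()               # number of pivot columns

print(nullity, rank)             # 1 2
print(nullity + rank == A.cols)  # True: 1 + 2 = 3 = n
```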
Generalizations
This wiki has concentrated on down-to-earth applications of the kernel using matrices, but the statements above hold more generally for linear transformations \( T \colon V \to W \) between arbitrary vector spaces \( V,W.\) The kernel is still a subspace and can still be used to solve linear equations of the form \( T({\bf x}) = {\bf b};\) the rank-nullity theorem is still correct if the "number of columns" \(n\) is replaced by \( \text{dim}(V).\)
Let \(V = W = {\mathbb P}_4,\) the real vector space of polynomials of degree \( \le 4\) with real coefficients. Verify rank-nullity for the linear transformation \( T\colon V\to W\) given by \(T(f) = f''.\)
The kernel of \(T\) is the set of polynomials in \(V\) whose second derivative vanishes. This is clearly the set of polynomials of the form \( a_0+a_1x,\) with \(a_0,a_1 \in {\mathbb R}.\) This is a two-dimensional subspace of \(V\) with basis \(\{1,x\}.\)
The image of \(T\) is the set of polynomials in \(W\) which are the second derivative of a polynomial in \(V.\) It is not hard to see that this is the subspace of \(W\) consisting of polynomials of degree \(\le 2.\) This is a three-dimensional subspace of \(W\) with basis \( \{1,x,x^2\}.\)
The dimension of \(V\) is \(5,\) since its canonical basis is \( \{1,x,x^2,x^3,x^4\}.\) Rank-nullity is satisfied since \(2+3=5.\)
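Because \(T\) is linear, it is represented by a \(5\times 5\) matrix in the basis \(\{1,x,x^2,x^3,x^4\},\) so the verification reduces to the matrix form of rank-nullity. In the sympy sketch below, the matrix is written out by hand from \(T(x^j) = j(j-1)x^{j-2}\):

```python
from sympy import Matrix

# Column j holds the coordinates of T(x^j) = j*(j-1)*x^(j-2)
# in the basis {1, x, x^2, x^3, x^4}.
M = Matrix([[0, 0, 2, 0,  0],
            [0, 0, 0, 6,  0],
            [0, 0, 0, 0, 12],
            [0, 0, 0, 0,  0],
            [0, 0, 0, 0,  0]])

print(len(M.nullspace()))             # 2: the kernel, spanned by {1, x}
print(M.rank())                       # 3: the image, polynomials of degree <= 2
print(len(M.nullspace()) + M.rank())  # 5 = dim(V)
```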