Singular Value Decomposition
Singular value decomposition generalizes the spectral theorem to arbitrary \(m\)-by-\(n\) matrices. Treating an \(m\)-by-\(n\) matrix as representing a linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\), singular value decomposition states that there are bases for \(\mathbb{R}^n\) and \(\mathbb{R}^m\) in which the matrix of \(T\) takes a diagonal form.
If \(M\) is any \(m\)-by-\(n\) matrix with entries in \(\mathbb{R}\), there exists an \(m\)-by-\(m\) orthogonal matrix \(U\), an \(n\)-by-\(n\) orthogonal matrix \(V\), and an \(m\)-by-\(n\) diagonal matrix \(\Sigma\) such that \[M = U\Sigma V.\] Furthermore, the numbers on the diagonal of \(\Sigma \) are non-negative. \(_\square\)
Recall that an \(m\)-by-\(n\) matrix is called diagonal when every entry whose row index does not equal its column index is zero. For example, the \(3\)-by-\(4\) matrix \[\begin{pmatrix} 2 & 0 & 0 & 0 \\0 & 3 & 0 & 0 \\ 0 & 0 & 6 & 0 \end{pmatrix}\] is diagonal.
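The statement of the theorem can be checked numerically. The sketch below uses NumPy's `np.linalg.svd` on a hypothetical \(3\)-by-\(4\) matrix (not one from the text); note that NumPy's factor `Vh` is already the right-hand orthogonal matrix, playing the role of \(V\) in the statement above.

```python
import numpy as np

# A hypothetical 3-by-4 matrix to decompose.
M = np.array([[2., 0., 1., 0.],
              [0., 3., 0., 1.],
              [1., 0., 6., 0.]])

# NumPy returns M = U @ Sigma @ Vh, where Vh plays the role of the
# orthogonal matrix V in the statement above.
U, s, Vh = np.linalg.svd(M)

Sigma = np.zeros(M.shape)             # m-by-n diagonal matrix
Sigma[:len(s), :len(s)] = np.diag(s)

assert np.allclose(U @ U.T, np.eye(3))     # U is orthogonal
assert np.allclose(Vh @ Vh.T, np.eye(4))   # V is orthogonal
assert np.all(s >= 0)                      # diagonal entries non-negative
assert np.allclose(U @ Sigma @ Vh, M)      # M = U Sigma V
```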
Remark About Special Case: In the case when \(m=n\), note that singular value decomposition is weaker than the spectral theorem in one respect: the two orthogonal matrices whose existence it ensures are not necessarily related, whereas in the spectral theorem one orthogonal matrix is the inverse (indeed, the transpose) of the other. On the other hand, it is stronger in another respect: in singular value decomposition, the entries of the diagonal matrix are ensured to be non-negative, whereas the spectral theorem makes no such guarantee.
Proof Using Spectral Theorem
Remark About Notation: In this proof, we will use block matrix notation. In block matrix notation, submatrices of a matrix are represented as entries of another matrix, in order to ease notation and computation. For example, consider the \(3\)-by-\(3\) matrix \[\begin{pmatrix} 3 & 0 & 0\\ 0 & 2 & 0 \\ 0 & 0 & 0\end{pmatrix}.\] Letting \(D\) denote the diagonal matrix \[\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix},\] we can write the \(3\)-by-\(3\) matrix above in the block form \[\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix},\] where the zeroes represent the zero matrices of dimensions \(2\)-by-\(1\) (top right), \(1\)-by-\(2\) (bottom left), and \(1\)-by-\(1\) (bottom right).
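The block assembly in the example above can be sketched with NumPy's `np.block`, which stitches submatrices together exactly as the notation suggests:

```python
import numpy as np

# The 3-by-3 example above, assembled from its blocks.
D = np.diag([3., 2.])

A = np.block([
    [D,                np.zeros((2, 1))],   # top row of blocks
    [np.zeros((1, 2)), np.zeros((1, 1))],   # bottom row of blocks
])

assert np.allclose(A, np.diag([3., 2., 0.]))
```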
Let \(M\) be the \(m\)-by-\(n\) matrix that we wish to singular value decompose, and let \(N\) denote the \(n\)-by-\(n\) matrix \(M^{T} M\).
Step 1 - Eigenvalues Of \(N\) Are Non-Negative Reals: If \(\lambda\) is an eigenvalue of \(N\), with eigenvector \(v\), we compute \[\lambda \|v\|^2 = v^{T} (\lambda v) = v^{T} N v = v^{T} M^{T} M v = \|Mv\|^2.\] Since \(\|v\|^2 > 0\) (an eigenvector is nonzero) and \(\|Mv\|^2 \ge 0\), this equation is only possible if \(\lambda \ge 0\).
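Step 1 can be illustrated numerically: for a hypothetical random matrix \(M\), every eigenvalue of \(N = M^T M\) comes out non-negative (up to floating-point round-off).

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 5))    # an arbitrary m-by-n matrix
N = M.T @ M                        # n-by-n and symmetric

# eigvalsh is NumPy's symmetric-matrix eigenvalue routine; Step 1 says
# every eigenvalue of N is non-negative (allowing for round-off).
eigenvalues = np.linalg.eigvalsh(N)
assert np.all(eigenvalues >= -1e-10)
```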
Step 2 - Using Spectral Theorem: By the spectral theorem, since \(N\) is symmetric, there is an orthogonal matrix \(V\) such that \(V^{T} N V\) is diagonal. Write the block form \[V^{T} N V = V^{T} M^{T} M V = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}, \] where \(D\) is a diagonal matrix whose diagonal entries are the positive eigenvalues of \(N\) (recall from Step 1 that all eigenvalues of \(N\) are non-negative). We can ensure that the positive eigenvalues appear first on the diagonal of \(V^{T} N V\) by multiplying \(V\) with a suitable permutation matrix.
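Step 2 can be sketched in NumPy: `np.linalg.eigh` returns the eigenvalues in ascending order, so reordering the columns of \(V\) (which is exactly multiplication by a permutation matrix) puts the positive eigenvalues first.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 5))
N = M.T @ M                        # 5-by-5 of rank at most 3, so some eigenvalues are zero

# eigh returns ascending eigenvalues; reordering the columns of V
# (multiplying V by a permutation matrix) puts the positive ones first.
eigenvalues, V = np.linalg.eigh(N)
order = np.argsort(eigenvalues)[::-1]
V = V[:, order]

diagonalized = V.T @ N @ V
assert np.allclose(diagonalized, np.diag(np.sort(eigenvalues)[::-1]))
```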
Step 3 - Breaking \(V\) Into Blocks: Partition \(V\) into the block form \(V = \begin{pmatrix} V_1 & V_2 \end{pmatrix}\), where the columns of \(V_1\) are eigenvectors for the positive eigenvalues of \(N\), so that \[\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} = V^{T} M^{T} M V = \begin{pmatrix} V_1 & V_2 \end{pmatrix}^{T} M^{T} M \begin{pmatrix} V_1 & V_2 \end{pmatrix} = \begin{pmatrix} V_{1}^{T} M^{T} M V_1 & V_{1}^{T} M^{T} M V_2 \\ V_{2}^{T} M^{T} M V_1 & V_{2}^{T} M^{T} M V_2 \end{pmatrix}.\] In particular, \(V_{1}^{T} M^{T} M V_1 = D\) and \(V_{2}^{T} M^{T} M V_{2} = 0\). This latter equation implies \(MV_2 = 0\), since the diagonal entries of \(V_{2}^{T} M^{T} M V_{2} = (MV_2)^{T}(MV_2)\) are the squared norms of the columns of \(MV_2\). Furthermore, note that \(V_1 V_{1}^{T} + V_2 V_{2}^{T} = I\), where \(I\) is the \(n\)-by-\(n\) identity matrix; this is true from a coordinate-wise computation using the fact that \(V\) is orthogonal \(\big(\)namely, \(VV^{T} = I\big)\).
Step 4 - Fudging Blocks To Get What We Want: Since the entries of \(D\) are positive, there is a diagonal matrix \(\sqrt{D}\) whose entries are also positive and whose square is \(D\); furthermore, since the entries are positive, \(\sqrt{D}\) is invertible. Let \(U_1 = MV_1 \sqrt{D}^{-1}\). We compute \[U_1\sqrt{D} V_{1}^{T} = MV_1 \sqrt{D}^{-1} \sqrt{D} V_{1}^{T} = MV_1 V_{1}^{T} = M(I - V_2 V_{2}^{T}) = M - MV_2 V_{2}^{T} = M,\] where the final equality follows from the fact \(MV_2 = 0\). This is almost what we want, but \(U_1\) and \(V_1\) are not necessarily orthogonal because they aren't even necessarily square!
Step 5 - Fudging \(U_1\) and \(V_1\) To Actually Get What We Want: Note \[U_{1}^{T} U_1 = \sqrt{D}^{-1} V_{1}^{T} M^{T} M V_1 \sqrt{D}^{-1} = \sqrt{D}^{-1} D \sqrt{D}^{-1} = I.\] This implies the columns of \(U_1\) are orthonormal, and hence this collection of columns can be extended to an orthonormal basis of \(\mathbb{R}^m\). Consequently, there is a matrix \(U_2\) such that the \(m\)-by-\(m\) block matrix \(U:= \begin{pmatrix} U_1 & U_2 \end{pmatrix}\) is orthogonal. Now, we fudge \(\sqrt{D}\) by padding it with the suitable number of zero rows and columns to make it \(m\)-by-\(n\); call the diagonal matrix obtained by this process \(\Sigma\). Then, a computation using the identity from Step 4 proves \[M = U \Sigma V^{T}.\] Since \(V^{T}\) is itself an orthogonal matrix, this is a singular value decomposition of \(M\), so we are now actually done. \(_\square\)
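The construction in Steps 2 through 5 can be traced numerically. The following sketch uses a hypothetical rank-deficient matrix (so that both the \(V_2\) block and the extension of \(U_1\) are nontrivial) and a QR factorization as one way to carry out the basis-extension step:

```python
import numpy as np

# A hypothetical 4-by-3 matrix of rank 2.
rng = np.random.default_rng(2)
M = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))
m, n = M.shape
N = M.T @ M

# Step 2: diagonalize N with the positive eigenvalues first.
eigenvalues, V = np.linalg.eigh(N)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, V = eigenvalues[order], V[:, order]

# Step 3: V_1 holds eigenvectors for the positive eigenvalues.
r = int(np.sum(eigenvalues > 1e-10))
V1, V2 = V[:, :r], V[:, r:]
assert np.allclose(M @ V2, 0)                  # M V_2 = 0

# Step 4: U_1 = M V_1 (sqrt D)^{-1}.
sqrtD = np.diag(np.sqrt(eigenvalues[:r]))
U1 = M @ V1 @ np.linalg.inv(sqrtD)

# Step 5: extend the orthonormal columns of U_1 to an orthonormal basis
# of R^m.  QR of [U_1 | I] reproduces those columns up to sign, so we
# overwrite the first r columns with U_1 itself; the remaining columns
# supply the block U_2.
Q, _ = np.linalg.qr(np.hstack([U1, np.eye(m)]))
Q[:, :r] = U1
U = Q[:, :m]

Sigma = np.zeros((m, n))                       # sqrt(D) padded to m-by-n
Sigma[:r, :r] = sqrtD

assert np.allclose(U.T @ U, np.eye(m))         # U is orthogonal
assert np.allclose(U @ Sigma @ V.T, M)         # M = U Sigma V^T
```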
Singular Values of a Matrix
If \(U\Sigma V\) is a singular value decomposition of \(M\), the orthogonal matrices \(U\) and \(V\) are not unique. However, the diagonal entries of \(\Sigma\) are unique, at least up to a permutation. These entries are called the singular values of \(M\).
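As the proof suggests, the singular values can also be obtained as the square roots of the eigenvalues of \(M^T M\). A quick numerical sketch (with a hypothetical random matrix) shows the two computations agree, illustrating that the singular values are determined by \(M\) even though \(U\) and \(V\) are not:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 4))

# Singular values directly, and as square roots of the eigenvalues of
# M^T M, sorted in decreasing order; the clip guards against tiny
# negative eigenvalues caused by round-off.
s = np.linalg.svd(M, compute_uv=False)
eigs = np.sort(np.linalg.eigvalsh(M.T @ M))[::-1]
assert np.allclose(s, np.sqrt(np.clip(eigs[:3], 0, None)))
```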
Let \[A=\left(\begin{array}{ccc} 5&-1&2\\ -1&5&2\end{array}\right).\] Then, by singular value decomposition, there is a diagonal matrix \(\Sigma\) and orthogonal matrices \(U\) and \(V\) such that \(A=U\Sigma V\). The sum of the entries on the main diagonal of \(\Sigma\) can be written in the form \(a+b\sqrt{c}\), where \(a,b,c\) are positive integers and \(c\) square-free. Find \(a+b+c\).