Solving Linear Systems Using Matrices
Solving systems of linear equations is a common problem encountered in many disciplines. These problems are so important that techniques for solving them (substitution, elimination) are taught early in algebra studies. This wiki will elaborate on the elementary technique of elimination and explore a few more techniques obtainable from linear algebra.
Row Reduction Techniques
A system of equations can be represented in a couple of different matrix forms. One way is to realize the system as the product of a matrix and the column vector of its variables. This square matrix is called the coefficient matrix because it consists of the coefficients of the variables in the system of equations:
\[ \text{Product of matrices: } \quad \begin{array}{c c c c c c c}
2x & + & 4y & + & 7z & = & 4\\
3x & + & 3y & + & 2z & = & 8\\
5x & + & 6y & + & 3z & = & 0 \\
\end{array} \longrightarrow
\left[ \begin{array}{c c c}
2 & 4 & 7\\
3 & 3 & 2 \\
5 & 6 & 3 \\
\end{array}\right]
\left[ \begin{array}{c}
x \\
y \\
z\\ \end{array} \right]
= \left[ \begin{array}{c}
4 \\
8 \\
0 \end{array} \right].
\]
An alternate representation, called an augmented matrix, is created by appending the column of constants to the coefficient matrix, with the two separated by a vertical bar. The coefficient matrix is placed on the left of this vertical bar, while the constants from the right-hand side of each equation are placed on the right:
\[ \text{Augmented matrix: } \quad \quad \begin{array}{c c c c c c c}
2x & + & 4y & + & 7z & = & 4\\
3x & + & 3y & + & 2z & = & 8\\
5x & + & 6y & + & 3z & = & 0 \\
\end{array} \longrightarrow
\left[ \begin{array}{c c c | c}
2 & 4 & 7 & 4\\
3 & 3 & 2 & 8\\
5 & 6 & 3 & 0
\end{array}\right].
\]
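Both forms are straightforward to build in software. As a minimal sketch (NumPy is an illustrative choice here; the text itself prescribes no tool), the augmented matrix is just the coefficient matrix with the constant column appended:

```python
import numpy as np

# Coefficient matrix A and constant vector b for the system above.
A = np.array([[2, 4, 7],
              [3, 3, 2],
              [5, 6, 3]], dtype=float)
b = np.array([4, 8, 0], dtype=float)

# The augmented matrix [A | b] appends b to A as a final column.
augmented = np.column_stack((A, b))
print(augmented)
```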
The matrices that represent these systems can be manipulated in such a way as to provide easy-to-read solutions. This manipulation is called row reduction. Row reduction techniques transform the matrix into reduced row echelon form without changing the solutions to the system.
The reduced row echelon form of a matrix \(A\) \(\big(\)denoted \( \text{rref}(A)\big)\) is a matrix of the same dimensions that satisfies:
- The leftmost non-zero element in each row is \( 1 \). This element is known as a pivot.
- Any column can have at most \( 1 \) pivot. If a column has a pivot, then the rest of the elements in the column will be \( 0 \).
- For any two columns \(C_{1} \) and \(C_{2}\) that have pivots in rows \( R_{1} \) and \( R_{2}, \) respectively, if the pivot in \( C_{1} \) is to the left of the pivot in \( C_{2}\), then \( R_{1} \) is above \( R_{2} \). In other words, for any two pivots \( P_{1}\) and \(P_{2}\), if \( P_{2} \) is to the right of \( P_{1} \), then \( P_{2} \) is below \( P_{1}\).
- Rows that consist of only zeroes are at the bottom of the matrix.
To convert any matrix to its reduced row echelon form, Gauss-Jordan elimination is performed, using three elementary row operations (illustrated in the code sketch after this list):
- Switch two rows.
- Multiply a row by any non-zero constant.
- Add a scalar multiple of one row to any other row.
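In code, these three operations are simple array manipulations. A minimal NumPy sketch (again an illustrative tool choice; the matrix is the one from the example that follows):

```python
import numpy as np

M = np.array([[ 2, 6, -2],
              [ 1, 6, -4],
              [-1, 4,  9]], dtype=float)

# Switch two rows: swap the first and second rows.
M[[0, 1]] = M[[1, 0]]

# Multiply a row by a non-zero constant: scale the first row by 1/2.
M[0] = M[0] / 2

# Add a scalar multiple of one row to another: add 2 times the
# first row to the third row.
M[2] = M[2] + 2 * M[0]

print(M)
```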
Find \( \text{rref}(A)\) using Gauss-Jordan elimination, where
\[ A = \left[ \begin{array}{c c c} 2 & 6 & -2\\ 1 & 6 & -4 \\ -1 & 4 & 9 \\ \end{array}\right].\]
The leftmost element in the first row must be 1, so the first row is divided by 2:
\[ \left[ \begin{array}{c c c} 2 & 6 & -2\\ 1 & 6 & -4 \\ -1 & 4 & 9 \\ \end{array}\right] \ce{->[\large \text{Divide the first row by 2.}]} \left[ \begin{array}{c c c} 1 & 3 & -1\\ 1 & 6 & -4 \\ -1 & 4 & 9 \\ \end{array}\right]. \]
The top left element is a pivot, so the rest of the elements in the first column must be 0. This can be done by subtracting the first row from the second row. Furthermore, the first row can be added to the third row to obtain the necessary 0s in the first column:
\[ \left[ \begin{array}{c c c} 1 & 3 & -1\\ 1 & 6 & -4 \\ -1 & 4 & 9 \\ \end{array}\right] \ce{->[\large R_2 - R_1 \text{ and } R_3 + R_1]} \left[ \begin{array}{c c c} 1 & 3 & -1\\ 0 & 3 & -3 \\ 0 & 7 & 8 \\ \end{array}\right]. \]
Now that the leftmost column is \( \left[ \begin{array}{c} 1 \\ 0 \\ 0 \\ \end{array}\right] \), the middle element can be made 1 by dividing the second row by 3:
\[ \left[ \begin{array}{c c c} 1 & 3 & -1\\ 0 & 3 & -3 \\ 0 & 7 & 8 \\ \end{array}\right] \ce{->[\large \text{Divide the second row by 3.}]} \left[ \begin{array}{c c c} 1 & 3 & -1\\ 0 & 1 & -1 \\ 0 & 7 & 8 \\ \end{array}\right]. \]
The top and bottom elements in the second column can be made 0 with the appropriate row operations:
\[ \left[ \begin{array}{c c c} 1 & 3 & -1\\ 0 & 1 & -1 \\ 0 & 7 & 8 \\ \end{array}\right] \ce{->[\large R_1 - 3R_2 \text{ and } R_3 - 7R_2]} \left[ \begin{array}{c c c} 1 & 0 & 2\\ 0 & 1 & -1 \\ 0 & 0 & 15 \\ \end{array}\right]. \]
With the middle column now \( \left[ \begin{array}{c} 0 \\ 1 \\ 0 \\ \end{array}\right] \), the method proceeds to the third column by dividing the third row by 15:
\[ \left[ \begin{array}{c c c} 1 & 0 & 2\\ 0 & 1 & -1 \\ 0 & 0 & 15 \\ \end{array}\right] \ce{->[\large \text{Divide the third row by 15} ]} \left[ \begin{array}{c c c} 1 & 0 & 2\\ 0 & 1 & -1 \\ 0 & 0 & 1 \\ \end{array}\right]. \]
In the final step of the process, multiples of the third row are added to the first and second rows so that the last column becomes \( \left[ \begin{array}{c} 0 \\ 0 \\ 1 \\ \end{array}\right]: \)
\[ \left[ \begin{array}{c c c} 1 & 0 & 2\\ 0 & 1 & -1 \\ 0 & 0 & 1 \\ \end{array}\right] \ce{->[\large R_1 - 2R_3 \text{ and } R_2 + R_3 ]} \left[ \begin{array}{c c c} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array}\right]. \ _\square \]
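This computation can also be double-checked in software. For instance, SymPy's `Matrix.rref` method (one possible tool, not part of the original derivation) reproduces the identity matrix obtained above:

```python
from sympy import Matrix

A = Matrix([[ 2, 6, -2],
            [ 1, 6, -4],
            [-1, 4,  9]])

# rref() returns the reduced matrix together with the pivot column indices.
R, pivots = A.rref()
print(R)       # Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
print(pivots)  # (0, 1, 2)
```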
Interpreting Solutions from Reduced Row Echelon Form
The following steps can be used to obtain the solutions to a system of linear equations:
- Convert the given equations to an augmented matrix.
- Perform row operations to get the reduced row echelon form of the matrix.
- Convert the augmented matrix back to a set of equations.
Once in this form, the solution set of the system of linear equations that the augmented matrix represents falls into one of three cases.
Case 1. If \(\text{rref}(A)\) is the identity matrix, then the system has a unique solution.
When read row by row, this augmented matrix says \(x = -1, y = 2,\) and \(z = 3:\)
\[ \left[ \begin{array}{c c c | c} 1 & 0 & 0 & -1\\ 0 & 1 & 0 & 2\\ 0 & 0 & 1 & 3\\ \end{array}\right]. \]
Case 2. If \(\text{rref}(A)\) contains a row of zeroes followed by a zero augmented value, then the system has infinitely many solutions.
Consider the following augmented matrix in reduced row echelon form. The last row reads \(\left[0 \hspace{.15cm} 0 \hspace{.15cm} 0 \hspace{.15cm} | \hspace{.15cm} 0\right] \) or \(0 = 0,\) which is not a contradiction. The other rows read \(x + 4y = -1\) and \(z = 2.\) This determines a value for \(z,\) but there are infinitely many pairs of real numbers \(x\) and \(y\) such that \(x + 4y = -1,\) so this system has an infinite number of solutions:
\[ \left[ \begin{array}{c c c | c} 1 & 4 & 0 & -1\\ 0 & 0 & 1 & 2\\ 0 & 0 & 0 & 0\\ \end{array}\right]. \]
Case 3. If \(\text{rref}(A)\) contains a row of zeroes followed by a nonzero augmented value, then the system has no solution.
This case is similar to the previous one in that the bottom row contains mostly 0s. However, observe that the last row of the matrix is \(\left[0 \hspace{.15cm} 0 \hspace{.15cm} 0 \hspace{.15cm} | \hspace{.15cm} 1\right] \), which reads as \(0 = 1.\) This is a contradiction; thus the system has no solutions:
\[ \left[ \begin{array}{c c c | c} 1 & 4 & 0 & -1\\ 0 & 0 & 1 & 2\\ 0 & 0 & 0 & 1\\ \end{array}\right]. \]
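These three cases can be checked mechanically. As a sketch, SymPy's `linsolve` (again an illustrative tool choice, not prescribed by the text) classifies each of the augmented matrices above:

```python
from sympy import Matrix, linsolve, symbols

x, y, z = symbols('x y z')

# Case 1: a unique solution.
print(linsolve(Matrix([[1, 0, 0, -1],
                       [0, 1, 0,  2],
                       [0, 0, 1,  3]]), (x, y, z)))
# {(-1, 2, 3)}

# Case 2: infinitely many solutions; y remains a free parameter.
print(linsolve(Matrix([[1, 4, 0, -1],
                       [0, 0, 1,  2],
                       [0, 0, 0,  0]]), (x, y, z)))
# {(-4*y - 1, y, 2)}

# Case 3: the row reading 0 = 1 is a contradiction, so no solution.
print(linsolve(Matrix([[1, 4, 0, -1],
                       [0, 0, 1,  2],
                       [0, 0, 0,  1]]), (x, y, z)))
# EmptySet
```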
Find the values of \( x \) and \( y \) satisfying the following system of equations:
\[ \begin{cases} 5x - y =1\\ x + 2y = 9. \end{cases} \]
For this example, row reduction techniques will be favored over standard elimination. The given system of equations can be written as the following augmented matrix:
\[ \left[ \begin{array} {cc|c} 5 & -1 & 1\\ 1 & 2 & 9\\ \end{array} \right]. \]
Gauss-Jordan elimination is performed to produce the reduced row echelon form of the matrix:
\[ \left[ \begin{array} {cc|c} 5 & -1 & 1\\ 1 & 2 & 9\\ \end{array} \right] \ce{->[\text{Swap } R_1 \text{ and } R_2. ]} \left[ \begin{array} {cc|c} 1 & 2 & 9\\ 5 & -1 & 1\\ \end{array} \right] \]
\[ \left[ \begin{array} {cc|c} 1 & 2 & 9\\ 5 & -1 & 1\\ \end{array} \right] \ce{->[R_2 - 5R_1 ]} \left[ \begin{array} {cc|c} 1 & 2 & 9\\ 0 & -11 & -44\\ \end{array} \right] \]
\[ \left[ \begin{array} {cc|c} 1 & 2 & 9\\ 0 & -11 & -44\\ \end{array} \right] \ce{->[\text{Divide the second row by } -11.]} \left[ \begin{array} {cc|c} 1 & 2 & 9\\ 0 & 1 & 4\\ \end{array} \right] \]
\[ \left[ \begin{array} {cc|c} 1 & 2 & 9\\ 0 & 1 & 4\\ \end{array} \right] \ce{->[R_1 - 2R_2]} \left[ \begin{array} {cc|c} 1 & 0 & 1\\ 0 & 1 & 4\\ \end{array} \right]. \]
The first row of this reduced form reads \( x = 1 \) and the second row reads \( y = 4 \). \(_\square\)
Note that this process is essentially the same as standard elimination. The main advantages of this procedure are the brevity of notation \((\)there is no need to write variable names such as \(x, y,\) and \(z\) repetitively\()\) and the ease of keeping track of the steps taken.
Multiplication by Inverse of Coefficient Matrix
As seen before, a system of equations can be represented by the matrix multiplication \(Ax = b.\) From here, the solution represented by the column matrix \(x\) can be obtained by left multiplying both sides of the equation by the inverse of the coefficient matrix \(A^{-1}\):
\[Ax = b \implies A^{-1}Ax = A^{-1}b \implies x = A^{-1}b. \]
One pitfall of this method arises when \(\det(A) = 0.\) Because the determinant of \(A\) is 0, \(A\) has no inverse, and this method fails. In fact, \(\det(A) = 0\) is equivalent to \(\text{rref}(A)\) containing a row of zeroes, so the system of equations that \(Ax = b\) represents does not have a unique solution. However, the system could still have infinitely many solutions or no solution, and a different method is needed to determine which.
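As a minimal sketch of this method (using NumPy, an illustrative assumption), the earlier \(2 \times 2\) system can be solved by computing \(A^{-1}b\) after guarding against a zero determinant:

```python
import numpy as np

# The system 5x - y = 1, x + 2y = 9 from the earlier example.
A = np.array([[5.0, -1.0],
              [1.0,  2.0]])
b = np.array([1.0, 9.0])

if np.isclose(np.linalg.det(A), 0.0):
    print("det(A) = 0: no unique solution; use row reduction instead.")
else:
    x = np.linalg.inv(A) @ b   # x = A^{-1} b
    print(x)                   # [1. 4.]
```

In numerical practice, `np.linalg.solve(A, b)` is preferred over forming the inverse explicitly, but the version above mirrors the formula.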
Cramer's Rule
Cramer's rule is a formula that uses determinants to provide a solution to a system of linear equations. The statement of the rule below uses only three variables, but the rule can be applied to a system of any size.
Consider a system of equations of three variables:
\[\begin{align*} a_{1} x + b_{1} y + c_{1} z &= d_{1} \\ a_{2} x + b_{2} y + c_{2} z &= d_{2} \\ a_{3} x + b_{3} y + c_{3} z &= d_{3}. \end{align*}\]
The solution to this system is given by
\[x=\dfrac{\det(A_{1})}{\det(A)} \quad y = \dfrac{\det(A_{2})}{\det(A)} \quad z= \dfrac{\det(A_{3})}{\det(A)},\]
where \(A= \begin{bmatrix} a_{1} & b_{1} & c_{1} \\ a_{2} & b_{2} & c_{2} \\ a_{3} & b_{3} & c_{3} \end{bmatrix} \) is the coefficient matrix and each \(A_{i}\) is \(A\) with the \(i^\text{th}\) column replaced with \( \begin{bmatrix} d_{1} \\ d_{2} \\ d_{3} \end{bmatrix}.\)
Note:
- If \(\det(A)=0\) and if any one of \(\det(A_1)\), \(\det(A_2),\) or \( \det(A_3)\) is not equal to \(0\), then the system has no solution.
- If \(\det(A) = 0\) and \(\det(A_1)\), \(\det(A_2),\) and \( \det(A_3)\) all equal 0, then a solution may or may not exist. If a solution does exist, then the system of equations will have infinitely many solutions.
- When the determinants are computed by cofactor expansion, the computational complexity of this method grows on the order of \(O(n!),\) so it is unwieldy beyond \(3\times 3 \) systems.
As previously stated, a system of equations can be represented as a product of matrices like
\[\begin{bmatrix} a_{1} & b_{1} & c_{1} \\ a_{2} & b_{2} & c_{2} \\ a_{3} & b_{3} & c_{3} \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} d_{1} \\ d_{2} \\ d_{3} \end{bmatrix}.\]
The determinant of the coefficient matrix \(A\) is given by
\[\det(A) = \left| \begin{array}{ccc} a_{1} & b_{1} & c_{1} \\ a_{2} & b_{2} & c_{2} \\ a_{3} & b_{3} & c_{3} \end{array} \right|.\]
If a column or row of a matrix is multiplied by a constant, then the value of its determinant is multiplied by that constant. So, multiplying the first column by \(x\) gives
\[x\det(A) = \left| \begin{array}{ccc} a_{1}x & b_{1} & c_{1} \\ a_{2}x & b_{2} & c_{2} \\ a_{3}x & b_{3} & c_{3} \end{array} \right|.\]
At this point, it is worth pointing out that elementary column operations affect a determinant in the same way that the corresponding row operations do; in particular, adding a scalar multiple of one column to another column leaves the determinant unchanged. So adding \(y\) times the second column and \(z\) times the third column to the first column yields
\[x\det(A) = \left| \begin{array}{ccc} a_{1}x & b_{1} & c_{1} \\ a_{2}x & b_{2} & c_{2} \\ a_{3}x & b_{3} & c_{3} \end{array} \right| \xrightarrow[]{C_{1} + yC_{2} + zC_{3}} \left| \begin{array}{ccc} a_{1}x+b_{1}y+c_{1}z & b_{1} & c_{1} \\ a_{2}x+b_{2}y+c_{2}z & b_{2} & c_{2} \\ a_{3}x+b_{3}y+c_{3}z & b_{3} & c_{3} \end{array} \right|. \]
However, because \(a_{1}x + b_{1}y + c_{1}z = d_1\) \((\)and similarly for \(d_2\) and \(d_3),\) the new first column is precisely \( \begin{bmatrix} d_{1} \\ d_{2} \\ d_{3} \end{bmatrix}.\) That is,
\[\begin{align} x\det(A) = \left| \begin{array}{ccc} a_{1}x+b_{1}y+c_{1}z & b_{1} & c_{1} \\ a_{2}x+b_{2}y+c_{2}z & b_{2} & c_{2} \\ a_{3}x+b_{3}y+c_{3}z & b_{3} & c_{3} \end{array} \right| = \left| \begin{array}{ccc} d_{1} & b_{1} & c_{1} \\ d_{2} & b_{2} & c_{2} \\ d_{3} & b_{3} & c_{3} \end{array} \right| &= \det(A_1)\quad \text{ or } \\\\ x\det(A) &= \det(A_1) \longrightarrow x = \frac{\det(A_1)}{\det(A)}. \end{align} \]
By a similar construction,
\[y = \frac{\det(A_2)}{\det(A)}\quad \text{ and }\quad z = \frac{\det(A_3)}{\det(A)}. \ _\square\]
Consider again the system solved earlier by row reduction:
\[ \begin{cases} 5x - y =1\\ x + 2y = 9. \end{cases} \]
According to Cramer's rule, the solution to the above system is
\[x = \frac{\left| \begin{array}{cc} 1 & -1 \\ 9 & 2 \end{array} \right|}{\left| \begin{array}{cc} 5 & -1 \\ 1 & 2 \end{array} \right|} = 1 \quad \quad \text{ and } \quad \quad y = \frac{\left| \begin{array}{cc} 5 & 1 \\ 1 & 9 \end{array} \right|}{\left| \begin{array}{cc} 5 & -1 \\ 1 & 2 \end{array} \right|} = 4. \]
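The column-replacement recipe translates directly into code. The sketch below (NumPy again; the helper name `cramer` is hypothetical) rebuilds each \(A_i\) and divides determinants, reproducing the solution just found:

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule; assumes det(A) is nonzero."""
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("det(A) = 0: Cramer's rule does not apply.")
    solution = np.empty(len(b))
    for i in range(len(b)):
        A_i = A.copy()
        A_i[:, i] = b                        # replace the i-th column with b
        solution[i] = np.linalg.det(A_i) / det_A
    return solution

# The system 5x - y = 1, x + 2y = 9 solved above.
A = np.array([[5.0, -1.0],
              [1.0,  2.0]])
b = np.array([1.0, 9.0])
print(cramer(A, b))  # [1. 4.]
```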
The following expressions give, by Cramer's rule, the solution to a certain system of three linear equations in \(x, y,\) and \(z:\)
\[x = \frac{\left| \begin{array}{ccc} 1 & -4 & 1\\ 5 & 1 & 2 \\ 11 & -1 & -3 \end{array} \right|}{\left| \begin{array}{ccc} 3 & -4 & 1\\ 5 & 1 & 2 \\ 1 & -1 & -3 \end{array} \right|}, \quad \quad y = \frac{\left| \begin{array}{ccc} 3 & 1 & 1\\ 5 & 5 & 2 \\ 1 & 11 & -3 \end{array} \right|}{\left| \begin{array}{ccc} 3 & -4 & 1\\ 5 & 1 & 2 \\ 1 & -1 & -3 \end{array} \right|}, \quad \quad z = \frac{\left| \begin{array}{ccc} 3 & -4 & 1\\ 5 & 1 & 5 \\ 1 & -1 & 11 \end{array} \right|}{\left| \begin{array}{ccc} 3 & -4 & 1\\ 5 & 1 & 2 \\ 1 & -1 & -3 \end{array} \right|} \]
Using Cramer's rule, which equation is not solved by the solutions above?