Matrices
A matrix is a rectangular array of numbers, arranged in rows and columns. For instance, the left matrix below has two rows and three columns, while the right matrix has three rows and two columns:

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \qquad \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}$$
Matrices are useful in a variety of fields and form the basis for linear algebra. Their applications include solving systems of linear equations, path-finding in graph theory, and several applications in group theory (especially representation theory). They are also extremely useful in representing linear transformations (especially rotations) and thus complex numbers; for instance, the matrix

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$

represents the complex number $a + bi$.
Formal Definition
A matrix is a rectangular array of any objects for which addition and multiplication are defined. Generally, these objects are numbers, but it is equally valid to have a matrix of symbols like

$$\begin{pmatrix} x & y & z \\ u & v & w \end{pmatrix}$$

so long as there is a suitable understanding of what (for example) $x + v$ and $xu$ are. More formally speaking, a matrix's elements can be drawn from any field. However, it is generally best to consider matrices as collections of real numbers.
Generally, in a matrix, the vertical lines of elements are called columns and the horizontal lines are called rows. The size of a matrix is given by the number of rows and columns it has. The above matrix, for instance, has 2 rows and 3 columns, and thus it is a $2 \times 3$ matrix. Matrices that have the same number of rows as columns are called square matrices and are of particular interest.
The elements of a matrix are specified by the row and column they reside in. For example, the $v$ in the above matrix is at position $(2, 2)$: the $2^\text{nd}$ row and $2^\text{nd}$ column. More explicitly, $a_{22} = v$. This notation is especially convenient when the elements are related by some formula; for instance, the matrix

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{pmatrix}$$

can be more succinctly written as $a_{ij} = ij$ for $1 \le i, j \le 3$, or even more compactly as $A = (a_{ij})_{n \times n}$, where $n \times n$ denotes the size of the matrix. The $i^\text{th}$ row of the matrix can also be denoted by $A_i$, and the $j^\text{th}$ column by $A^j$.
In a given matrix of order $m \times n$, there are $mn$ elements present. For example, in a 3 by 3 matrix the number of elements is $3 \cdot 3 = 9$, and in the case of a 2 by 4 matrix there are $2 \cdot 4 = 8$ elements present.
Finally, it is worth defining a matrix with exactly one column as a column vector, as column vectors are especially useful in representing points in $n$-dimensional space.
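To make the notation concrete, here is a minimal Python sketch (an illustration only, not a standard convention) that stores a matrix as a list of rows and reads off its size and one of its elements:

```python
# A 2 x 3 matrix stored as a list of rows.
A = [
    [1, 2, 3],
    [4, 5, 6],
]

rows, cols = len(A), len(A[0])
print(rows, cols)   # 2 3

# The element a_23 (2nd row, 3rd column); Python indices start at 0.
print(A[1][2])      # 6
```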
Basic Operations
There are several simple operations on matrices and one somewhat complicated one (multiplication). The first is addition: matrix addition is defined only on two matrices of the same size and works by adding corresponding elements:
What is

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}?$$

The matrices are added element-wise, so the result is

$$\begin{pmatrix} 1 + 5 & 2 + 6 \\ 3 + 7 & 4 + 8 \end{pmatrix} = \begin{pmatrix} 6 & 8 \\ 10 & 12 \end{pmatrix}.$$
If

$$A = \begin{pmatrix} 1 & 0 \\ 2 & 3 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 4 & 2 \\ 5 & 7 \end{pmatrix},$$

then find a matrix $X$ such that $A + X = B$.

We have $X = B - A$, where the subtraction is also element-wise:

$$X = \begin{pmatrix} 4 - 1 & 2 - 0 \\ 5 - 2 & 7 - 3 \end{pmatrix} = \begin{pmatrix} 3 & 2 \\ 3 & 4 \end{pmatrix}.$$
More formally, we can state this as follows:

The sum of two matrices $A$ and $B$ of the same size satisfies the relation

$$(A + B)_{ij} = A_{ij} + B_{ij}$$

for all $i, j$ within the size of the matrices.
It is also possible to multiply matrices by scalars, i.e. single numbers, by multiplying element-wise:
What is

$$3.5 \begin{pmatrix} 2 & 4 \\ 6 & 8 \end{pmatrix}?$$

The elements are each multiplied by 3.5, so the result is

$$\begin{pmatrix} 7 & 14 \\ 21 & 28 \end{pmatrix}.$$
More formally, we can state this as follows:

The product of a constant $c$ and a matrix $A$ satisfies the relation

$$(cA)_{ij} = c \cdot A_{ij}$$

for all $i, j$ within the size of the matrix.
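Both operations are straightforward to implement. The following Python sketch mirrors the two relations above; mat_add and scalar_mul are illustrative helper names, not library functions:

```python
def mat_add(A, B):
    # (A + B)_ij = A_ij + B_ij; A and B must have the same size.
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

def scalar_mul(c, A):
    # (cA)_ij = c * A_ij
    return [[c * a for a in row] for row in A]

print(mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[6, 8], [10, 12]]
print(scalar_mul(3.5, [[2, 4], [6, 8]]))            # [[7.0, 14.0], [21.0, 28.0]]
```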
Matrix Multiplication
Finally, there is the more complicated operation of matrix multiplication. The product of two matrices is defined only when the number of columns of the first matrix is the same as the number of rows of the second; in other words, it is only possible to multiply matrices of size $m \times n$ and $n \times p$. The reason for this becomes clear upon defining the product:
The product $AB$ of an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$ satisfies

$$(AB)_{ij} = A_i \cdot B^j$$

for all $i, j$ within the size of the matrices.

Here $A_i$ denotes the $i^\text{th}$ row of $A$, which is a vector, and $B^j$ denotes the $j^\text{th}$ column of $B$, which is also a vector. Thus, the dot in this sense refers to multiplying vectors, as defined by the dot product; written out in full, $(AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$. Note that $i$ and $j$ are defined on $1 \le i \le m$ and $1 \le j \le p$, so the product will be an $m \times p$ matrix.
This rule seems rather arbitrary, so it is best illustrated by an example:
What is

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}?$$

First, note that the first matrix is $2 \times 3$ and the second is $3 \times 2$, so their product is indeed defined and will be a $2 \times 2$ matrix. Consider the top-left element of the product: it is equal to the dot product of the first row of the first matrix and the first column of the second matrix, i.e.

$$(1, 2, 3) \cdot (1, 3, 5) = 1 \cdot 1 + 2 \cdot 3 + 3 \cdot 5 = 22.$$

So the top left entry of the result is 22. The rest of the matrix can be filled out in the same way; for instance, the bottom-right entry is

$$(4, 5, 6) \cdot (2, 4, 6) = 8 + 20 + 36 = 64.$$

The final result is

$$\begin{pmatrix} 22 & 28 \\ 49 & 64 \end{pmatrix}.$$
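The row-times-column rule translates directly into code. In the following Python sketch (mat_mul is an illustrative helper), each entry of the product is the dot product of a row of the first matrix and a column of the second, reproducing the example above:

```python
def mat_mul(A, B):
    # (AB)_ij is the dot product of row i of A and column j of B.
    n = len(B)                        # columns of A must equal rows of B
    assert all(len(row) == n for row in A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6]]                       # 2 x 3
B = [[1, 2],
     [3, 4],
     [5, 6]]                          # 3 x 2

print(mat_mul(A, B))                  # [[22, 28], [49, 64]]
```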
If

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

then find $AB$ and $BA$. What can you conclude from the final two matrices?

Both $A$ and $B$ are square matrices of order $2 \times 2$. Hence both $AB$ and $BA$ are well-defined and are matrices of the same order $2 \times 2$. Computing both products gives

$$AB = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix} \quad \text{and} \quad BA = \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix}.$$

Clearly, you can see that $AB \neq BA$. Thus, we can conclude that multiplication of matrices need not be commutative.
It is still admittedly unclear why matrix multiplication is defined this way. One major reason lies in systems of linear equations. The coefficients of each equation can be assembled into a coefficient matrix, and the variables can be arranged into a column vector. The product of the coefficient matrix and the column vector will itself be a column vector, containing the values of each equation. For example, the system of equations

$$\begin{aligned} 2x + 3y &= 5 \\ x - y &= 1 \end{aligned}$$

can be more succinctly written in the form

$$\begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5 \\ 1 \end{pmatrix}.$$
This is a very useful transformation besides the saving of space; in particular, if it were possible to "divide" matrices, it would be easy to find out what $x$ and $y$ are by dividing out the coefficient matrix. Unfortunately, division takes some more effort to define, so further explanation of this is left to a later section.
As a warning about matrix multiplication, it is extremely important to understand the following:
Matrix multiplication is not commutative. In other words, it is not generally true that $AB = BA$.
The simplest way to see this is that matrix multiplication is defined only on $m \times n$ and $n \times p$ matrices; reversing their order gives the product of an $n \times p$ matrix and an $m \times n$ matrix, which is not even defined unless $p = m$. Even when both products are defined (e.g. for square matrices), $BA$ is generally not equal to $AB$. Matrices that do indeed satisfy $AB = BA$ are (appropriately) called commuting matrices.
Do the two matrices $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ commute?

No, since

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},$$

but

$$\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.$$
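Such a check is easy to automate. The following sketch uses NumPy (whose @ operator performs matrix multiplication) to confirm that these two matrices do not commute:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])

print(A @ B)                         # [[2 1]
                                     #  [1 1]]
print(B @ A)                         # [[1 1]
                                     #  [1 2]]
print(np.array_equal(A @ B, B @ A))  # False: A and B do not commute
```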
Finally, it is worth noting a special matrix: the identity matrix

$$I_n = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix},$$

which is an $n \times n$ matrix that is zero everywhere except for the main diagonal, which contains all 1s. For instance,

$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

It satisfies the property that

$$AI = IA = A$$

for any matrix $A$. The reason should be clear from the above definitions.
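In NumPy, the identity matrix is produced by np.eye, and the property $AI = IA = A$ can be checked directly:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
I = np.eye(3, dtype=int)             # the 3 x 3 identity matrix

print(np.array_equal(A @ I, A))      # True
print(np.array_equal(I @ A, A))      # True
```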
Transpose and Determinant
Two useful functions on matrices are the transpose and the determinant. The transpose of an $m \times n$ matrix $A$ is an $n \times m$ matrix $A^T$ such that the rows of $A$ are the columns of $A^T$, and the columns of $A$ are the rows of $A^T$. For instance,

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}.$$
The transpose satisfies a few useful properties:

- $(A^T)^T = A$
- $(AB)^T = B^T A^T$
- $(A + B)^T = A^T + B^T$

The second of these is the most useful, since it (roughly) means that properties true of left multiplication hold true for right multiplication as well.
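The first two properties are easy to check numerically on a small example, for instance with NumPy:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])            # 2 x 3
B = np.array([[1, 2],
              [3, 4],
              [5, 6]])               # 3 x 2

print(A.T)                                   # the 3 x 2 transpose of A
print(np.array_equal(A.T.T, A))              # (A^T)^T = A       -> True
print(np.array_equal((A @ B).T, B.T @ A.T))  # (AB)^T = B^T A^T  -> True
```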
More interesting is the determinant of a matrix. There are several equally valid definitions of the determinant, though all would seem arbitrary at this point without an understanding of what the determinant is supposed to compute.
Formally, the determinant is a function $\det$ from the set of square matrices to the set of real numbers that satisfies 3 important properties:

- $\det(I) = 1$;
- $\det$ is linear in the rows of the matrix;
- if two rows of a matrix $A$ are equal, then $\det(A) = 0$.
The second condition is by far the most important. It means that if any row of the matrix is written as a linear combination of two other vectors, the determinant can be calculated by "splitting" that row. For instance, in the below example, the second row can be written as $(1, 2) = (1, 0) + (0, 2)$, so

$$\det \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix} = \det \begin{pmatrix} 3 & 4 \\ 1 & 0 \end{pmatrix} + \det \begin{pmatrix} 3 & 4 \\ 0 & 2 \end{pmatrix}.$$
It is not obvious that these three properties describe a sensible function at all; a key theorem shows that they do:
There is exactly one function satisfying the above 3 relations.
Unfortunately, this is very difficult to work with for all but the simplest matrices, so an alternate definition is better to use. There are two major ones: determinant by minors and determinant by permutations.
The first of the two, determinant by minors, uses recursion. The base case is simple: the determinant of a $1 \times 1$ matrix with single element $a$ is simply $a$. Note that this agrees with the conditions above, since

$$\det\big( (a) \big) = a \cdot \det\big( (1) \big) = a$$

because $\det(I) = 1$. The recursive step is as follows: denote by $M_{ij}$ the matrix formed by deleting the $i^\text{th}$ row and $j^\text{th}$ column. For instance,

$$\text{if } A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, \text{ then } M_{12} = \begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix}.$$

Then the determinant is given as follows:

The determinant of an $n \times n$ matrix $A$ is

$$\det(A) = \sum_{j=1}^{n} (-1)^{1+j} \, a_{1j} \det(M_{1j}).$$

For example,

$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = a \det\big( (d) \big) - b \det\big( (c) \big) = ad - bc.$$
What is the determinant of $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$?

We write

$$\det \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = 1 \cdot 4 - 2 \cdot 3 = -2.$$
Unfortunately, these calculations can get quite tedious; as the size of the matrix grows, the fully expanded formula quickly becomes too long to memorize in practice.
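The recursion above translates almost verbatim into code. The following Python sketch (det is an illustrative helper) expands along the first row, exactly as in the formula:

```python
def det(A):
    # Determinant by minors: expand along the first row.
    n = len(A)
    if n == 1:
        return A[0][0]                       # base case: 1 x 1 matrix
    total = 0
    for j in range(n):
        # Minor M_1j: delete row 1 and column j + 1 (j is 0-based here).
        M = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(M)
    return total

print(det([[1, 2], [3, 4]]))                    # -2
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```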
An alternate definition uses permutations. Let $\sigma$ be a permutation of $\{1, 2, \ldots, n\}$, and $S_n$ the set of all such permutations.

Then the determinant of an $n \times n$ matrix $A$ is

$$\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} a_{i \sigma(i)}.$$
This may look more intimidating than the previous formula, but in fact it is more intuitive. It essentially says the following:
Choose $n$ elements of $A$ such that no two are in the same row and no two are in the same column, and multiply them together, possibly also by $-1$ if the corresponding permutation has an odd sign. The determinant is the sum over all choices of these elements.
This definition is especially useful when the matrix contains many zeros, as then most of the products vanish.
Here is an example:
What is the determinant of $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$?

There are two permutations of $\{1, 2\}$: the identity, and the transposition that swaps 1 and 2. The first has a positive sign (as it has 0 transpositions) and the second has a negative sign (as it has 1 transposition), so the determinant is

$$1 \cdot 4 - 2 \cdot 3 = -2.$$

Unsurprisingly, this is the same result as above.
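The permutation definition can also be implemented directly; the sketch below enumerates $S_n$ with itertools.permutations and computes each sign by counting inversions (det_perm and sign are illustrative names):

```python
from itertools import permutations

def sign(p):
    # Sign of a permutation: +1 if the number of inversions is even, else -1.
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det_perm(A):
    # Leibniz formula: sum over all permutations sigma of
    # sgn(sigma) * a_{1,sigma(1)} * ... * a_{n,sigma(n)}.
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= A[i][p[i]]
        total += sign(p) * prod
    return total

print(det_perm([[1, 2], [3, 4]]))  # 1*4 - 2*3 = -2
```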
The determinant is a very important function because it satisfies a number of additional properties that can be derived from the 3 conditions stated above. They are as follows:
- Multiplicativity: $\det(AB) = \det(A) \det(B)$
- Invariance under row operations: if $B$ is a matrix formed by adding a multiple of any row of $A$ to another row, then $\det(B) = \det(A)$
- Invariance under transpose: $\det(A^T) = \det(A)$
- Sign change under row swap: if $B$ is a matrix formed by swapping the positions of two rows of $A$, then $\det(B) = -\det(A)$
As the next section shows, the multiplicative property is of special importance.
Inverting Matrices
At the end of the matrix multiplication section, it was noted that "dividing" matrices would be an extremely useful operation. To attempt to create one, it is important to understand the definition of division in numbers:
Dividing by a number $x$ is equivalent to multiplying by its reciprocal $\frac{1}{x}$.

In other words, dividing by $x$ is equivalent to multiplying by a number $y$ such that $xy = 1$. This makes sense for what division "should" do: dividing by $x$ followed by multiplying by $x$ should be the equivalent of doing nothing, i.e. multiplying by 1. The above definition ensures this.
Matrix "division," should it exist, should follow the same principle: multiplying by a matrix and then dividing by it should be the equivalent of doing nothing. In matrix multiplication, however, the equivalent of doing nothing is multiplying by .
This leads to a natural definition:
The inverse of a matrix $A$ is a matrix $A^{-1}$ such that $A A^{-1} = A^{-1} A = I$.
A natural question to ask is whether all matrices have inverses. Unfortunately, the answer is no, but this should not be surprising: not all numbers have inverses either (it is impossible to divide by 0). Indeed, the multiplicative property of the determinant from the previous section shows this: since $A A^{-1} = I$,

$$\det(A) \det(A^{-1}) = \det(I) = 1.$$

So it is necessary for $\det(A)$ to be nonzero. It is somewhat more difficult to show that this condition is also sufficient, but this is indeed the case:
A matrix has an inverse if and only if it has a nonzero determinant.
It is worth remembering the formula in the $2 \times 2$ case:

The inverse of the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, if it exists, is

$$\frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
This isn't too difficult to verify.
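For instance, the verification can be done symbolically with SymPy:

```python
from sympy import Matrix, symbols, simplify

a, b, c, d = symbols('a b c d')
A = Matrix([[a, b], [c, d]])
A_inv = Matrix([[d, -b], [-c, a]]) / (a*d - b*c)

# Both products should simplify to the 2 x 2 identity matrix.
print((A * A_inv).applyfunc(simplify))   # Matrix([[1, 0], [0, 1]])
print((A_inv * A).applyfunc(simplify))   # Matrix([[1, 0], [0, 1]])
```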
In general, the inverse matrix can be found by analyzing the cofactor matrix, an $n \times n$ matrix $C$ satisfying

$$C_{ij} = (-1)^{i+j} \det(M_{ij}),$$

where $M_{ij}$ refers to the matrix formed by removing the $i^\text{th}$ row and $j^\text{th}$ column from $A$. This matrix satisfies the property that

$$A \, C^T = \det(A) \, I,$$

so whenever $\det(A) \neq 0$, the inverse is $A^{-1} = \frac{1}{\det(A)} C^T$. This provides yet another reason that $A$ is invertible if and only if it has nonzero determinant. It is also worth noting that the cofactor matrix of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is $\begin{pmatrix} d & -c \\ -b & a \end{pmatrix}$, which aligns with the formula above.
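Putting the pieces together gives a general (if inefficient) inversion routine. The following sketch computes determinants by minors and then builds $C^T / \det(A)$; det and cofactor_inverse are illustrative names:

```python
def det(A):
    # Determinant by minors (expansion along the first row).
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] *
               det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cofactor_inverse(A):
    # A^{-1} = C^T / det(A), where C is the cofactor matrix.
    n, d = len(A), det(A)
    if d == 0:
        raise ValueError("matrix is not invertible")
    C = [[(-1) ** (i + j) *
          det([row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i])
          for j in range(n)]
         for i in range(n)]
    return [[C[j][i] / d for j in range(n)] for i in range(n)]  # C^T / det(A)

print(cofactor_inverse([[1, 2], [3, 4]]))  # [[-2.0, 1.0], [1.5, -0.5]]
```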
Solving Systems of Linear Equations
See full article here: Solving Linear Systems Using Matrices.
The above sections provide a general method for solving systems of linear equations:
- Arrange the coefficients in the coefficient matrix $A$.
- Arrange the variables in a vector $\mathbf{x}$.
- Arrange the resulting values into another vector $\mathbf{b}$. The goal is now to solve the equation $A\mathbf{x} = \mathbf{b}$.
- Calculate $A^{-1}$, the inverse of $A$, for example by the cofactor method from the previous section.
- Multiply both sides of the above equation by $A^{-1}$ on the left. Then $\mathbf{x} = A^{-1}\mathbf{b}$, which is a simple matrix multiplication, as the sketch below shows.
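In NumPy these steps collapse to a line or two. The sketch below solves the example system from the matrix multiplication section, $2x + 3y = 5$ and $x - y = 1$:

```python
import numpy as np

# The system  2x + 3y = 5,  x - y = 1  in the form A x = b.
A = np.array([[2.0, 3.0],
              [1.0, -1.0]])
b = np.array([5.0, 1.0])

x = np.linalg.inv(A) @ b        # x = A^{-1} b, as in the last step above
print(x)                        # [1.6 0.6]

# In practice, np.linalg.solve is preferred to forming the inverse explicitly.
print(np.linalg.solve(A, b))    # [1.6 0.6]
```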
There is one potential pitfall: the inverse of $A$ might not exist. This happens exactly when the determinant of $A$ is 0, i.e. when the rows of $A$ are linearly dependent; in other words, some equation in the original system is a combination of the others. In that case, the system has either no solutions or infinitely many.