Lecture 10: Matrices
MAT 461/561
Spring Semester 2009-10
Lecture 10 Notes
for the 𝑛 unknowns 𝑥1 , 𝑥2 , . . . , 𝑥𝑛 . Many data fitting problems, including ones that we have previ-
ously discussed such as polynomial interpolation or least-squares approximation, involve the solu-
tion of such a system. Discretization of partial differential equations often yields systems of linear
equations that must be solved. Systems of nonlinear equations are typically solved using iterative
methods that solve a system of linear equations during each iteration. We will now study the
solution of this type of problem in detail.
The basic idea behind methods for solving a system of linear equations is to reduce it to linear
equations involving a single unknown, because such equations are trivial to solve. Such a reduction
is achieved by manipulating the equations in the system in such a way that the solution does not
change, but unknowns are eliminated from selected equations until, finally, we obtain an equation
involving only a single unknown. These manipulations are called elementary row operations, and
they are defined as follows:
∙ Reordering the equations by interchanging both sides of the 𝑖th and 𝑗th equation in the
system
∙ Multiplying both sides of the 𝑖th equation by a nonzero constant
∙ Replacing equation 𝑖 by the sum of equation 𝑖 and a multiple of both sides of equation 𝑗
The third operation is by far the most useful. We will now demonstrate how it can be used to
reduce a system of equations to a form in which it can easily be solved.
Example Consider the system of linear equations
𝑥1 + 2𝑥2 + 𝑥3 = 5,
3𝑥1 + 2𝑥2 + 4𝑥3 = 17,
4𝑥1 + 4𝑥2 + 3𝑥3 = 26.
First, we eliminate 𝑥1 from the second equation by subtracting 3 times the first equation from the
second. This yields the equivalent system
𝑥1 + 2𝑥2 + 𝑥3 = 5,
−4𝑥2 + 𝑥3 = 2,
4𝑥1 + 4𝑥2 + 3𝑥3 = 26.
Next, we subtract 4 times the first equation from the third, to eliminate 𝑥1 from the third equation
as well:
𝑥1 + 2𝑥2 + 𝑥3 = 5,
−4𝑥2 + 𝑥3 = 2,
−4𝑥2 − 𝑥3 = 6.
Then, we eliminate 𝑥2 from the third equation by subtracting the second equation from it, which
yields the system
𝑥1 + 2𝑥2 + 𝑥3 = 5,
−4𝑥2 + 𝑥3 = 2,
−2𝑥3 = 4.
This system is in upper-triangular form, because the third equation depends only on 𝑥3 , and the
second equation depends on 𝑥2 and 𝑥3 .
Because the third equation is a linear equation in 𝑥3 , it can easily be solved to obtain 𝑥3 = −2.
Then, we can substitute this value into the second equation, which yields −4𝑥2 = 4. This equation
only depends on 𝑥2 , so we can easily solve it to obtain 𝑥2 = −1. Finally, we substitute the values
of 𝑥2 and 𝑥3 into the first equation to obtain 𝑥1 = 9. This process of computing the unknowns
from a system that is in upper-triangular form is called back substitution. □
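As a quick aside (not part of the notes' algorithm), the solution of this example can be checked with NumPy's general-purpose solver; NumPy is an assumption of this illustration.

```python
# Verify the solution of the example above with NumPy's built-in solver.
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [3.0, 2.0, 4.0],
              [4.0, 4.0, 3.0]])
b = np.array([5.0, 17.0, 26.0])

x = np.linalg.solve(A, b)   # same solution as above: x1 = 9, x2 = -1, x3 = -2
```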
In general, a system of 𝑛 linear equations in 𝑛 unknowns is in upper-triangular form if the 𝑖th
equation depends only on the unknowns 𝑥𝑖 , 𝑥𝑖+1 , . . . , 𝑥𝑛 , for 𝑖 = 1, 2, . . . , 𝑛.
It can be seen from this example that continually rewriting a system of equations as it is reduced
can be quite tedious. Therefore, we instead represent a system of linear equations using a matrix,
which is an array of elements, or entries. We say that a matrix 𝐴 is 𝑚 × 𝑛 if it has 𝑚 rows and 𝑛
columns, and we denote the element in row 𝑖 and column 𝑗 by 𝑎𝑖𝑗 . We also denote the matrix 𝐴
by [𝑎𝑖𝑗 ].
With this notation, a general system of 𝑛 equations with 𝑛 unknowns can be represented using
a matrix 𝐴 that contains the coefficients of the equations, a vector x that contains the unknowns,
and a vector b that contains the quantities on the right-hand sides of the equations. Specifically,
\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix},
\qquad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},
\qquad
\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}.
\]
The process of reducing such a system to upper-triangular form by elementary row operations is known as Gaussian elimination, and is described by the following algorithm:
for 𝑗 = 1, 2, . . . , 𝑛 − 1 do
for 𝑖 = 𝑗 + 1, 𝑗 + 2, . . . , 𝑛 do
𝑚𝑖𝑗 = 𝑎𝑖𝑗 /𝑎𝑗𝑗
for 𝑘 = 𝑗 + 1, 𝑗 + 2, . . . , 𝑛 do
𝑎𝑖𝑘 = 𝑎𝑖𝑘 − 𝑚𝑖𝑗 𝑎𝑗𝑘
end
𝑏𝑖 = 𝑏𝑖 − 𝑚𝑖𝑗 𝑏𝑗
end
end
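The elimination pseudocode above can be sketched in Python (NumPy is an assumption of this illustration). As in the pseudocode, entries below the diagonal are not explicitly zeroed, and every pivot 𝑎𝑗𝑗 is assumed to be nonzero.

```python
# Gaussian elimination as in the pseudocode above: reduce A to upper-triangular
# form, applying the same row operations to b. Assumes every pivot A[j, j] is
# nonzero; row interchanges for zero pivots are discussed later.
import numpy as np

def gaussian_elimination(A, b):
    n = len(b)
    for j in range(n - 1):                     # column whose entries are eliminated
        for i in range(j + 1, n):              # rows below the pivot row j
            m = A[i, j] / A[j, j]              # the multiplier m_ij
            A[i, j + 1:] -= m * A[j, j + 1:]   # update the rest of row i
            b[i] -= m * b[j]                   # same operation on the right-hand side
            # A[i, j] is left stale rather than zeroed, matching the pseudocode
    return A, b
```

Only the upper triangle of the returned matrix is meaningful; the stale subdiagonal entries are never read by back substitution.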
The number 𝑚𝑖𝑗 is called a multiplier. It is the number by which row 𝑗 is multiplied before
adding it to row 𝑖, in order to eliminate the unknown 𝑥𝑗 from the 𝑖th equation. Note that this
algorithm is applied to the augmented matrix, as the elements of the vector b are updated by the
row operations as well.
It should be noted that in the above description of Gaussian elimination, each entry below
the main diagonal is never explicitly zeroed, because that computation is unnecessary. It is only
necessary to update entries of the matrix that are involved in subsequent row operations or the
solution of the resulting upper triangular system. This system is solved by the following algorithm
for back substitution. In the algorithm, we assume that 𝑈 is the upper triangular matrix containing
the coefficients of the system, and y is the vector containing the right-hand sides of the equations.
for 𝑖 = 𝑛, 𝑛 − 1, . . . , 1 do
𝑥𝑖 = 𝑦𝑖
for 𝑗 = 𝑖 + 1, 𝑖 + 2, . . . , 𝑛 do
𝑥𝑖 = 𝑥𝑖 − 𝑢𝑖𝑗 𝑥𝑗
end
𝑥𝑖 = 𝑥𝑖 /𝑢𝑖𝑖
end
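The back substitution pseudocode can likewise be sketched in Python (again assuming NumPy), with 𝑈 upper triangular and its diagonal entries nonzero.

```python
# Back substitution as in the pseudocode above: U is upper triangular with
# nonzero diagonal entries, y is the right-hand side after elimination.
import numpy as np

def back_substitution(U, y):
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):               # i = n, n-1, ..., 1 in the notes
        x[i] = y[i] - U[i, i + 1:] @ x[i + 1:]   # subtract the u_ij * x_j terms
        x[i] /= U[i, i]                          # divide by the diagonal entry
    return x
```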
This algorithm requires approximately 𝑛2 arithmetic operations. We will see that when solving
systems of equations in which the right-hand side vector b is changing, but the coefficient matrix
𝐴 remains fixed, it is quite practical to apply Gaussian elimination to 𝐴 only once, and then
repeatedly apply it to each b, along with back substitution, because the latter two steps are much
less expensive.
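This factor-once, solve-repeatedly pattern is what library LU-factorization routines package up. As an aside, it can be sketched with SciPy (SciPy is an assumption of this illustration, not part of the notes):

```python
# Eliminate once, then reuse the factorization for each right-hand side.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1.0, 2.0, 1.0],
              [3.0, 2.0, 4.0],
              [4.0, 4.0, 3.0]])

lu, piv = lu_factor(A)                       # O(n^3) elimination, performed once
for rhs in ([5.0, 17.0, 26.0], [1.0, 0.0, 0.0]):
    x = lu_solve((lu, piv), np.array(rhs))   # O(n^2) per right-hand side
```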
We now illustrate the use of both these algorithms with an example.
Example Consider the system of linear equations
𝑥1 + 2𝑥2 + 𝑥3 − 𝑥4 = 5
3𝑥1 + 2𝑥2 + 4𝑥3 + 4𝑥4 = 16
4𝑥1 + 4𝑥2 + 3𝑥3 + 4𝑥4 = 22
2𝑥1 + 𝑥3 + 5𝑥4 = 15.
This system can be represented by the coefficient matrix 𝐴 and right-hand side vector b, as follows:
\[
A = \begin{bmatrix} 1 & 2 & 1 & -1 \\ 3 & 2 & 4 & 4 \\ 4 & 4 & 3 & 4 \\ 2 & 0 & 1 & 5 \end{bmatrix},
\qquad
\mathbf{b} = \begin{bmatrix} 5 \\ 16 \\ 22 \\ 15 \end{bmatrix}.
\]
To perform row operations to reduce this system to upper triangular form, we define the augmented
matrix
\[
\tilde{A} = \begin{bmatrix} A & \mathbf{b} \end{bmatrix} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 3 & 2 & 4 & 4 & 16 \\ 4 & 4 & 3 & 4 & 22 \\ 2 & 0 & 1 & 5 & 15 \end{array}\right].
\]
We first define 𝐴˜(1) = 𝐴˜ to be the original augmented matrix. Then, we denote by 𝐴˜(2) the result of
the first elementary row operation, which entails subtracting 3 times the first row from the second
in order to eliminate 𝑥1 from the second equation:
\[
\tilde{A}^{(2)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & 1 & 7 & 1 \\ 4 & 4 & 3 & 4 & 22 \\ 2 & 0 & 1 & 5 & 15 \end{array}\right].
\]
Next, we eliminate 𝑥1 from the third equation by subtracting 4 times the first row from the third:
\[
\tilde{A}^{(3)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & 1 & 7 & 1 \\ 0 & -4 & -1 & 8 & 2 \\ 2 & 0 & 1 & 5 & 15 \end{array}\right].
\]
Then, we complete the elimination of 𝑥1 by subtracting 2 times the first row from the fourth:
\[
\tilde{A}^{(4)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & 1 & 7 & 1 \\ 0 & -4 & -1 & 8 & 2 \\ 0 & -4 & -1 & 7 & 5 \end{array}\right].
\]
We now need to eliminate 𝑥2 from the third and fourth equations. This is accomplished by sub-
tracting the second row from the third, which yields
\[
\tilde{A}^{(5)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & 1 & 7 & 1 \\ 0 & 0 & -2 & 1 & 1 \\ 0 & -4 & -1 & 7 & 5 \end{array}\right].
\]
We then subtract the second row from the fourth, which yields
\[
\tilde{A}^{(6)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & 1 & 7 & 1 \\ 0 & 0 & -2 & 1 & 1 \\ 0 & 0 & -2 & 0 & 4 \end{array}\right].
\]
Finally, we subtract the third row from the fourth to obtain the augmented matrix of an upper-triangular system,
\[
\tilde{A}^{(7)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & 1 & 7 & 1 \\ 0 & 0 & -2 & 1 & 1 \\ 0 & 0 & 0 & -1 & 3 \end{array}\right].
\]
Note that in a matrix for such a system, all entries below the main diagonal (the entries where the
row index is equal to the column index) are equal to zero. That is, 𝑎𝑖𝑗 = 0 for 𝑖 > 𝑗.
Now, we can perform back substitution on the corresponding system,
𝑥1 + 2𝑥2 + 𝑥3 − 𝑥4 = 5,
−4𝑥2 + 𝑥3 + 7𝑥4 = 1,
−2𝑥3 + 𝑥4 = 1,
−𝑥4 = 3,
to obtain the solution, which yields 𝑥4 = −3, 𝑥3 = −2, 𝑥2 = −6, and 𝑥1 = 16. □
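The elimination and back substitution carried out by hand above can be reproduced with a short script (a sketch using NumPy; no row interchanges are needed for this matrix).

```python
import numpy as np

# The system from the example above.
A = np.array([[1.0, 2.0, 1.0, -1.0],
              [3.0, 2.0, 4.0,  4.0],
              [4.0, 4.0, 3.0,  4.0],
              [2.0, 0.0, 1.0,  5.0]])
b = np.array([5.0, 16.0, 22.0, 15.0])
n = len(b)

# Forward elimination (all pivots are nonzero for this matrix).
for j in range(n - 1):
    for i in range(j + 1, n):
        m = A[i, j] / A[j, j]
        A[i, j:] -= m * A[j, j:]
        b[i] -= m * b[j]

# Back substitution on the resulting upper-triangular system.
x = np.zeros(n)
for i in range(n - 1, -1, -1):
    x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
# x is now (16, -6, -2, -3), matching the hand computation.
```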
It can be seen from the above pseudocode that the algorithms for Gaussian elimination and back
substitution can break down if a diagonal element of the matrix is equal to zero. In order to work
around this potential pitfall, another elementary row operation can be used: a row interchange.
If, when computing the multiplier 𝑚𝑖𝑗 during Gaussian elimination, the entry 𝑎𝑗𝑗 is equal to zero,
then, to avoid breakdown of the algorithm, row 𝑗 of the augmented matrix can be interchanged
with row 𝑖, for some 𝑖 > 𝑗, where 𝑎𝑖𝑗 ≠ 0, and then Gaussian elimination can continue.
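This safeguard can be sketched in Python (NumPy is an assumption of the illustration). Rather than searching for just any nonzero entry, the code below picks the entry of largest magnitude in the column, the common floating-point practice known as partial pivoting; a zero column signals that the system has no unique solution.

```python
import numpy as np

def eliminate_with_pivoting(A, b):
    """Reduce [A | b] to upper-triangular form, interchanging rows as needed."""
    n = len(b)
    Ab = np.column_stack([A, b]).astype(float)   # augmented matrix [A | b]
    for j in range(n - 1):
        p = j + np.argmax(np.abs(Ab[j:, j]))     # row with the largest pivot candidate
        if Ab[p, j] == 0.0:
            raise ValueError("no nonzero pivot: the system has no unique solution")
        if p != j:
            Ab[[j, p]] = Ab[[p, j]]              # interchange rows j and p
        for i in range(j + 1, n):
            m = Ab[i, j] / Ab[j, j]              # multiplier m_ij
            Ab[i, j:] -= m * Ab[j, j:]
    return Ab[:, :n], Ab[:, n]
```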
Example Consider the system of linear equations
𝑥1 + 2𝑥2 + 𝑥3 − 𝑥4 = 5
3𝑥1 + 6𝑥2 + 4𝑥3 + 4𝑥4 = 16
4𝑥1 + 4𝑥2 + 3𝑥3 + 4𝑥4 = 22
2𝑥1 + 𝑥3 + 5𝑥4 = 15.
The coefficient matrix 𝐴 and the right-hand side vector b for this system are
\[
A = \begin{bmatrix} 1 & 2 & 1 & -1 \\ 3 & 6 & 4 & 4 \\ 4 & 4 & 3 & 4 \\ 2 & 0 & 1 & 5 \end{bmatrix},
\qquad
\mathbf{b} = \begin{bmatrix} 5 \\ 16 \\ 22 \\ 15 \end{bmatrix}.
\]
To perform Gaussian elimination on this system, we perform row operations on the augmented
matrix
\[
\tilde{A} = \begin{bmatrix} A & \mathbf{b} \end{bmatrix} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 3 & 6 & 4 & 4 & 16 \\ 4 & 4 & 3 & 4 & 22 \\ 2 & 0 & 1 & 5 & 15 \end{array}\right].
\]
We first set 𝐴˜(1) = 𝐴˜, and then subtract 3 times the first row from the second row of 𝐴˜(1) to obtain
\[
\tilde{A}^{(2)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 4 & 4 & 3 & 4 & 22 \\ 2 & 0 & 1 & 5 & 15 \end{array}\right].
\]
Then, we subtract 4 times the first row from the third to obtain
\[
\tilde{A}^{(3)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & -4 & -1 & 8 & 2 \\ 2 & 0 & 1 & 5 & 15 \end{array}\right].
\]
We complete the elimination of 𝑥1 by subtracting 2 times the first row from the fourth, which yields
\[
\tilde{A}^{(4)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & -4 & -1 & 8 & 2 \\ 0 & -4 & -1 & 7 & 5 \end{array}\right].
\]
Normally, we would continue by subtracting multiples of the second row from the third and fourth,
to eliminate 𝑥2 from the third and fourth equations, but computing the multipliers requires
dividing by [𝐴˜(4)]22, which is zero. Therefore, we interchange the second and third rows, which
yields
\[
\tilde{A}^{(5)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & -1 & 8 & 2 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & -4 & -1 & 7 & 5 \end{array}\right].
\]
We then subtract the (new) second row from the fourth to obtain
\[
\tilde{A}^{(6)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & -4 & -1 & 8 & 2 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & 0 & 0 & -1 & 3 \end{array}\right].
\]
This matrix is already in upper-triangular form, so there is no need to subtract a multiple of the
third row from the fourth, as Gaussian elimination would normally require. Now, we can perform
back substitution as usual to obtain the solution, which is 𝑥1 = 4, 𝑥2 = −12, 𝑥3 = 22, and 𝑥4 = −3.
□
Of course, it may happen that there is no suitable row 𝑖 with which row 𝑗 can be interchanged.
In this case, Gaussian elimination fails, and it can be concluded that the system of equations does
not have a unique solution. There are two possibilities that remain: either there is no solution,
or there are infinitely many solutions. To determine which scenario applies, one can continue to
apply Gaussian elimination with the remaining columns of the coefficient matrix. Because there is
at least one column, column 𝑗, for which all entries on or below the diagonal are equal to zero, it
will follow that at least one equation in the system will have all coefficients equal to zero. If the
corresponding element on the right-hand side of the equation is nonzero, then there is no solution,
but if it is zero, then there are infinitely many solutions.
Example Consider the system of linear equations
𝑥1 + 2𝑥2 + 𝑥3 − 𝑥4 = 5
3𝑥1 + 6𝑥2 + 4𝑥3 + 4𝑥4 = 16
4𝑥1 + 8𝑥2 + 3𝑥3 + 4𝑥4 = 22
2𝑥1 + 4𝑥2 + 𝑥3 + 5𝑥4 = 15.
The coefficient matrix 𝐴 and the right-hand side vector b for this system are
\[
A = \begin{bmatrix} 1 & 2 & 1 & -1 \\ 3 & 6 & 4 & 4 \\ 4 & 8 & 3 & 4 \\ 2 & 4 & 1 & 5 \end{bmatrix},
\qquad
\mathbf{b} = \begin{bmatrix} 5 \\ 16 \\ 22 \\ 15 \end{bmatrix}.
\]
To perform Gaussian elimination on this system, we perform row operations on the augmented
matrix
\[
\tilde{A} = \begin{bmatrix} A & \mathbf{b} \end{bmatrix} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 3 & 6 & 4 & 4 & 16 \\ 4 & 8 & 3 & 4 & 22 \\ 2 & 4 & 1 & 5 & 15 \end{array}\right].
\]
We first set 𝐴˜(1) = 𝐴˜, and then subtract 3 times the first row from the second row of 𝐴˜(1) to obtain
\[
\tilde{A}^{(2)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 4 & 8 & 3 & 4 & 22 \\ 2 & 4 & 1 & 5 & 15 \end{array}\right].
\]
Then, we subtract 4 times the first row from the third to obtain
\[
\tilde{A}^{(3)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & 0 & -1 & 8 & 2 \\ 2 & 4 & 1 & 5 & 15 \end{array}\right].
\]
We complete the elimination of 𝑥1 by subtracting 2 times the first row from the fourth, which yields
\[
\tilde{A}^{(4)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & 0 & -1 & 8 & 2 \\ 0 & 0 & -1 & 7 & 5 \end{array}\right].
\]
Normally, we would continue by subtracting multiples of the second row from the third and fourth,
to eliminate 𝑥2 from the third and fourth equations, but computing the multipliers requires
dividing by [𝐴˜(4)]22, which is zero. However, we cannot solve this problem by interchanging the
second row with either the third or fourth row, because the entries of those rows in the second
column are also zero. Therefore, this system does not have a unique solution.
To determine whether there is a solution at all, we continue the elimination process with 𝑥3 , in
order to zero elements in the third column. We first add the second row to the third, which yields
\[
\tilde{A}^{(5)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & 0 & 0 & 15 & 3 \\ 0 & 0 & -1 & 7 & 5 \end{array}\right].
\]
We also add the second row to the fourth to obtain
\[
\tilde{A}^{(6)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & 0 & 0 & 15 & 3 \\ 0 & 0 & 0 & 14 & 6 \end{array}\right].
\]
Finally, we subtract 14/15 times the third row from the fourth to complete the elimination:
\[
\tilde{A}^{(7)} = \left[\begin{array}{rrrr|r} 1 & 2 & 1 & -1 & 5 \\ 0 & 0 & 1 & 7 & 1 \\ 0 & 0 & 0 & 15 & 3 \\ 0 & 0 & 0 & 0 & 16/5 \end{array}\right].
\]
The fourth row corresponds to the equation 0 = 16/5, which is a contradiction. Therefore, we
conclude that the original system of linear equations has no solution. If the value on the right-hand
side of this equation had been zero, then the original system would have infinitely many solutions.
□
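The conclusion of this example can also be reached by comparing ranks: the system is consistent exactly when rank(𝐴) = rank([𝐴 b]). This is a sketch using NumPy (an assumption of the illustration); matrix_rank uses an SVD with a tolerance, which is safe for these small integer entries.

```python
import numpy as np

# The coefficient matrix and right-hand side of the example above.
A = np.array([[1, 2, 1, -1],
              [3, 6, 4,  4],
              [4, 8, 3,  4],
              [2, 4, 1,  5]], dtype=float)
b = np.array([5, 16, 22, 15], dtype=float)

rank_A = np.linalg.matrix_rank(A)                          # 3: A is singular
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))   # 4: b adds a new direction

if rank_A == rank_Ab == len(b):
    verdict = "unique solution"
elif rank_A == rank_Ab:
    verdict = "infinitely many solutions"
else:
    verdict = "no solution"       # this example: rank(A) < rank([A | b])
```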