6.1 SIMULTANEOUS LINEAR EQUATIONS
Consider, for example, the following set of simultaneous equations in four unknowns w, x, y, and z:
$$\begin{aligned} 2w + x + 4y + z &= -4, \\ 3w + 4x - y - z &= 3, \\ w - 4x + y + 5z &= 9, \\ 2w - 2x + y + 3z &= 7, \end{aligned} \qquad (6.1)$$
or in matrix form
$$\begin{pmatrix} 2 & 1 & 4 & 1 \\ 3 & 4 & -1 & -1 \\ 1 & -4 & 1 & 5 \\ 2 & -2 & 1 & 3 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -4 \\ 3 \\ 9 \\ 7 \end{pmatrix}. \qquad (6.2)$$
Equations like these can be written in the general matrix form
$$\mathbf{A}\mathbf{x} = \mathbf{v}, \qquad (6.3)$$
where x = (w, x, y, z) and the matrix A and vector v take the appropriate values.
One way to solve equations of this form is to find the inverse of the matrix A and then multiply both sides of (6.3) by it to get the solution $\mathbf{x} = \mathbf{A}^{-1}\mathbf{v}$.
This sounds like a promising approach for solving equations on the computer
but in practice it's not as good as you might think. Inverting the matrix A is
a rather complicated calculation that is inefficient and cumbersome to carry
out numerically. There are other ways of solving simultaneous equations that
don't require us to calculate an inverse and it turns out that these are faster,
simpler, and more accurate. Perhaps the most straightforward method, and
the first one we will look at, is Gaussian elimination.
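For illustration, here is a minimal sketch of the inverse-based approach using NumPy's standard routines numpy.linalg.inv and numpy.linalg.solve (library functions, not code developed in this chapter); solve computes the solution by an elimination-style factorization internally and is the one to prefer in practice:

from numpy import array
from numpy.linalg import inv, solve

A = array([[ 2,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)
v = array([ -4, 3, 9, 7 ], float)

print(inv(A) @ v)    # x = A^(-1) v, via the explicit inverse
print(solve(A, v))   # the same solution, computed without forming the inverse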
Suppose we wish to solve a set of simultaneous equations like Eq. (6.1). We will
carry out the solution by working on the matrix form Ax = v of the equations.
As you are undoubtedly aware, the following useful rules apply:
1. We can multiply any of our simultaneous equations by a constant and
it's still the same equation. For instance, we can multiply Eq. (6.1a) by 2
to get 4w + 2x + 8y + 2z = -8 and the solution for w, x, y, z stays the
same. To put that another way: If we multiply any row of the matrix A by
any constant, and we multiply the corresponding row of the vector v by the same
constant, then the solution does not change.
2. We can take any linear combination of two equations to get another cor-
rect equation. To put that another way: If we add to or subtract from any
row of A a multiple of any other row, and we do the same for the vector v, then
the solution does not change.
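As a quick numerical check of these two rules, here is a short sketch, assuming the matrix and vector of Eq. (6.2) and the solution (w, x, y, z) = (2, -1, -2, 1) that we will derive later in this section:

from numpy import array

A = array([[ 2,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)
v = array([ -4, 3, 9, 7 ], float)
xsol = array([ 2, -1, -2, 1 ], float)   # solution derived later in the section

B = A.copy()
u = v.copy()
B[0,:] *= 2.0            # rule 1: multiply a row of the matrix...
u[0] *= 2.0              # ...and the matching element of v by the same constant
B[1,:] -= 3*B[0,:]       # rule 2: subtract a multiple of another row...
u[1] -= 3*u[0]           # ...doing the same on the right-hand side

print(A @ xsol - v)      # [ 0.  0.  0.  0.]: xsol solves the original system
print(B @ xsol - u)      # [ 0.  0.  0.  0.]: and the transformed one as well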
We can use these operations to solve our equations as follows. Consider the
matrix form of the equations given in Eq. (6.2) and let us perform the following
steps:
1. We divide the first row by the top-left element of the matrix, which has
the value 2 in this case. Recall that we must divide both the matrix itself
and the corresponding element on the right-hand side of the equation, in
order that the equations remain correct. Thus we get:
$$\begin{pmatrix} 1 & 0.5 & 2 & 0.5 \\ 3 & 4 & -1 & -1 \\ 1 & -4 & 1 & 5 \\ 2 & -2 & 1 & 3 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 3 \\ 9 \\ 7 \end{pmatrix}. \qquad (6.4)$$
Because we have divided both on the left and on the right, the solution to
the equations is unchanged from before, but note that the top-left element
of the matrix, by definition, is now equal to 1.
2. Next, note that the first element in the second row of the matrix is a 3. If
we now subtract 3 times the first row from the second, this element will
become zero, thus:
$$\begin{pmatrix} 1 & 0.5 & 2 & 0.5 \\ 0 & 2.5 & -7 & -2.5 \\ 1 & -4 & 1 & 5 \\ 2 & -2 & 1 & 3 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 9 \\ 9 \\ 7 \end{pmatrix}. \qquad (6.5)$$
Notice again that we have performed the same subtraction on the right-
hand side of the equation, to make sure the solution remains unchanged.
3. We now do a similar trick with the third and fourth rows. These have
first elements equal to 1 and 2 respectively, so we subtract 1 times the
first row from the third, and 2 times the first row from the fourth, which
gives us the following:
$$\begin{pmatrix} 1 & 0.5 & 2 & 0.5 \\ 0 & 2.5 & -7 & -2.5 \\ 0 & -4.5 & -1 & 4.5 \\ 0 & -3 & -3 & 2 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 9 \\ 11 \\ 11 \end{pmatrix}. \qquad (6.6)$$
The end result of this series of operations is that the first column of our matrix
has been reduced to the simple form (1, 0, 0, 0),
but the solution of the complete
set of equations is unchanged.
Now we move on to the second row of the matrix and perform a similar
series of operations. We divide the second row by its second element, to get
$$\begin{pmatrix} 1 & 0.5 & 2 & 0.5 \\ 0 & 1 & -2.8 & -1 \\ 0 & -4.5 & -1 & 4.5 \\ 0 & -3 & -3 & 2 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 3.6 \\ 11 \\ 11 \end{pmatrix}. \qquad (6.7)$$
Then we subtract the appropriate multiple of the second row from each of the
rows below it, so as to make the second element of each of those rows zero.
That is, we subtract -4.5 times the second row from the third, and -3 times
the second row from the fourth, to give
$$\begin{pmatrix} 1 & 0.5 & 2 & 0.5 \\ 0 & 1 & -2.8 & -1 \\ 0 & 0 & -13.6 & 0 \\ 0 & 0 & -11.4 & -1 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 3.6 \\ 27.2 \\ 21.8 \end{pmatrix}. \qquad (6.8)$$
Then we move on to the third and fourth rows and do the same thing, dividing
each and then subtracting from the rows below (except that the fourth row
obviously doesn't have any rows below, so it only needs to be divided). The
end result of the entire set of operations is the following:
$$\begin{pmatrix} 1 & 0.5 & 2 & 0.5 \\ 0 & 1 & -2.8 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2 \\ 3.6 \\ -2 \\ 1 \end{pmatrix}. \qquad (6.9)$$
By definition, this set of equations still has the same solution for the variables
w, x, y, and z as the equations we started with, but the matrix is now upper
triangular: all the elements below the diagonal are zero. This allows us to de-
termine the solution for the variables quite simply by the process of backsubsti-
tution.
6.1.2 BACKSUBSTITUTION
Suppose we have a set of equations that are in upper-triangular form with ones down the diagonal:
$$\begin{pmatrix} 1 & a_{01} & a_{02} & a_{03} \\ 0 & 1 & a_{12} & a_{13} \\ 0 & 0 & 1 & a_{23} \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{pmatrix}, \qquad (6.10)$$
which is exactly the form of Eq. (6.9) generated by the Gaussian elimination
procedure. We can write the equations out in full as
$$\begin{aligned} w + a_{01}x + a_{02}y + a_{03}z &= v_0, \qquad (6.11a)\\ x + a_{12}y + a_{13}z &= v_1, \qquad (6.11b)\\ y + a_{23}z &= v_2, \qquad (6.11c)\\ z &= v_3. \qquad (6.11d) \end{aligned}$$
Note that we are using Python-style numbering for the elements here, starting
from zero, rather than one. This isn't strictly necessary, but it will be conve-
nient when we want to translate our calculation into computer code.
Given equations of this form, we see now that the solution for the value of
z is trivial: it is given directly by Eq. (6.11d),
$$z = v_3. \qquad (6.12)$$
But given this value, the solution for y is also trivial, being given by Eq. (6.11c):
$$y = v_2 - a_{23}z. \qquad (6.13)$$
Continuing in the same fashion, Eq. (6.11b) gives
$$x = v_1 - a_{12}y - a_{13}z, \qquad (6.14)$$
and Eq. (6.11a) gives
$$w = v_0 - a_{01}x - a_{02}y - a_{03}z. \qquad (6.15)$$
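For example, applying these formulas to the upper-triangular equations (6.9) gives
$$z = 1, \qquad y = -2 - 0 \times 1 = -2,$$
$$x = 3.6 - (-2.8)(-2) - (-1)(1) = -1, \qquad w = -2 - 0.5(-1) - 2(-2) - 0.5(1) = 2,$$
so the solution of the original equations (6.1) is (w, x, y, z) = (2, -1, -2, 1), as you can check by substituting these values back into the equations.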
Example 6.1: Gaussian elimination. Putting the elimination and backsubstitution steps together, here is a complete program, gausselim.py, to solve Eq. (6.2):

from numpy import array, empty

A = array([[ 2,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)
v = array([ -4, 3, 9, 7 ], float)
N = len(v)

# Gaussian elimination
for m in range(N):
    # Divide row m by its diagonal element
    div = A[m,m]
    A[m,:] /= div
    v[m] /= div
    # Now subtract a multiple of row m from each lower row
    for i in range(m+1,N):
        mult = A[i,m]
        A[i,:] -= mult*A[m,:]
        v[i] -= mult*v[m]

# Backsubstitution
x = empty(N,float)
for m in range(N-1,-1,-1):
    x[m] = v[m]
    for i in range(m+1,N):
        x[m] -= A[m,i]*x[i]
print(x)
There are a number of features to notice about this program. We store the ma-
trices and vectors as arrays, whose initial values are set at the start of the pro-
gram. The elimination portion of the program goes through each row of the
matrix, one by one, and first normalizes it by dividing by the appropriate di-
agonal element, then subtracts a multiple of that row from each lower row.
Notice how the program uses Python's ability to perform operations on entire
rows at once, which makes the calculation faster and simpler to program. The
second part of the program is a straightforward version of the backsubstitution
procedure. Note that the entire program is written so as to work for matrices
of any size: we use the variable N to represent the size, so that no matter what
size of matrix we feed to the program it will perform the correct calculation.
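Applied to the equations of Eq. (6.2), the program prints [ 2. -1. -2.  1.], that is, w = 2, x = -1, y = -2, z = 1, the same solution we found by hand in the backsubstitution example above.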
Exercise 6.1: A circuit of resistors. Consider the following circuit of resistors (figure omitted).
All the resistors have the same resistance R. The power rail at the top is at voltage
V+ = 5 V. What are the other four voltages, V1 to V4?
To answer this question we use Ohm's law and the Kirchhoff current law, which
says that the total net current flow out of (or into) any junction in a circuit must be zero.
Thus for the junction at voltage V1, for instance, we have
$$\frac{V_1 - V_2}{R} + \frac{V_1 - V_3}{R} + \frac{V_1 - V_4}{R} + \frac{V_1 - V_+}{R} = 0,$$
or equivalently
$$4V_1 - V_2 - V_3 - V_4 = V_+.$$
a) Write similar equations for the other three junctions with unknown voltages.
b) Write a program to solve the four resulting equations using Gaussian elimination
and hence find the four voltages (or you can modify a program you already have,
such as the program gausselim.py in Example 6.1).
6.1.3 PIVOTING
Suppose the equations we want to solve are slightly different from those of the
previous section, thus:
$$\begin{pmatrix} 0 & 1 & 4 & 1 \\ 3 & 4 & -1 & -1 \\ 1 & -4 & 1 & 5 \\ 2 & -2 & 1 & 3 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -4 \\ 3 \\ 9 \\ 7 \end{pmatrix}. \qquad (6.17)$$
Just one thing has changed from the old equations in (6.2): the first element of
the first row of the matrix is zero, where previously it was nonzero. But this
makes all the difference in the world, because the first step of our Gaussian
elimination procedure requires us to divide the first row of the matrix by its
first element, which we can no longer do, because we would have to divide by
zero. In cases like these, Gaussian elimination no longer works. So what are
we to do?
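As a concrete illustration of the failure (assuming the arrays of gausselim.py with A[0,0] = 0): the very first step of the elimination computes A[0,:] /= A[0,0], which divides by zero. NumPy does not halt the program in this case; it issues runtime warnings and propagates inf and nan values, so the "solution" that is eventually printed is meaningless.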
The standard solution is to use pivoting, which means simply interchanging
the rows of the matrix to get rid of the problem. Clearly we are allowed to
interchange the order in which we write our simultaneous equations (it will
not affect their solution), so we could, if we like, swap the first and second
equations, which in matrix notation is equivalent to writing
$$\begin{pmatrix} 3 & 4 & -1 & -1 \\ 0 & 1 & 4 & 1 \\ 1 & -4 & 1 & 5 \\ 2 & -2 & 1 & 3 \end{pmatrix} \begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 3 \\ -4 \\ 9 \\ 7 \end{pmatrix}. \qquad (6.18)$$
In other words, we have swapped the first and second rows of the matrix and
also swapped the first and second elements of the vector on the right-hand
side. Now the first element of the matrix is no longer zero, and Gaussian elim-
ination will work just fine.
Pivoting has to be done with care. We must make sure, for instance, that in
swapping equations to get rid of a problem we don't introduce another prob-
lem somewhere else. Moreover, the elements of the matrix change as the Gaus-
sian elimination procedure progresses, so it's not always obvious in advance
where problems are going to arise. A number of different rules or schemes
for pivoting have been developed to guide the order in which the equations
should be swapped. A good, general scheme that works well in most cases is
the so-called partial pivoting method, which is as follows.
As we have seen, the Gaussian elimination procedure works down the rows
of the matrix one by one, dividing each by the appropriate diagonal element,
before performing subtractions. With partial pivoting we consider rearranging
the rows at each stage. When we get to the mth row, we compare it to all lower
rows, looking at the value each row has in its mth element and finding the one
such value that is farthest from zero-either positive or negative. If the row
containing this winning value is not currently the mth row, then we move it up
to the mth place by swapping it with the current mth row. This has the result of
ensuring that the element we divide by in our Gaussian elimination is always
as far from zero as possible.
If we look at Eq. (6.18), we see that in fact we inadvertently did exactly the
right thing when we swapped the first and second rows of our matrix, since
we moved the row with the largest first element to the top of the matrix. Now
we would perform the first step of the Gaussian elimination procedure on the
resulting matrix, move on to the second row and pivot again.
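Here is a minimal sketch of this pivoting step, assuming the arrays A and v and the size N from gausselim.py; only the row-swap logic that would go at the top of the elimination loop is shown, with the divide-and-subtract steps unchanged:

from numpy import argmax

for m in range(N):

    # Partial pivoting: find the row at or below m whose mth element
    # is farthest from zero, and swap it up into position m
    p = m + argmax(abs(A[m:,m]))
    if p != m:
        A[[m,p],:] = A[[p,m],:]     # swap rows m and p of the matrix
        v[m],v[p] = v[p],v[m]       # and the matching elements of v

    # ... now divide row m and subtract from the lower rows as before

Exercise 6.2 below asks you to work this step into a complete program.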
In practice, one should always use pivoting when applying Gaussian elimination,
since we rarely know in advance whether the equations we're trying to
solve will present a problem. Exercise 6.2 invites you to extend the Gaussian
elimination program from Example 6.1 to incorporate partial pivoting.
Exercise 6.2:
a) Modify the program gausselim.py in Example 6.1 to incorporate partial pivot-
ing (or you can write your own program from scratch if you prefer). Run your
program and demonstrate that it gives the same answers as the original program
when applied to Eq. (6.1).
b) Modify the program to solve the equations in (6.17) and show that it can find
the solution to these as well, even though Gaussian elimination without pivoting
fails.
6.1.4 LU DECOMPOSITION