
CHAPTER 6

SOLUTION OF LINEAR AND NONLINEAR EQUATIONS

One of the commonest uses of computers in physics is the solution of equations or sets of equations of various kinds. In this chapter we look at methods for solving both linear and nonlinear equations. Solutions of linear equations involve techniques from linear algebra, so we will spend some time learning how linear algebra tasks such as inversion and diagonalization of matrices can be accomplished on the computer. In the second part of the chapter we will look at schemes for solving nonlinear equations.

6.1 SIMULTANEOUS LINEAR EQUATIONS

A single linear equation in one variable, such as x - 1 = 0, is trivial to solve. We do not need computers to do this for us. But simultaneous sets of linear equations in many variables are harder. In principle the techniques for solving them are well understood and straightforward (all of us learned them at some point in school), but they are also tedious, involving many operations: additions, subtractions, and multiplications, one after another. Humans are slow and prone to error in such calculations, but computers are perfectly suited to the work, and the solution of systems of simultaneous equations, particularly large systems with many variables, is a common job for computers in physics.
Let us take an example. Suppose you want to solve the following four
simultaneous equations for the variables w, x, y, and z:

$$
\begin{aligned}
2w + x + 4y + z &= -4, && (6.1a)\\
3w + 4x - y - z &= 3, && (6.1b)\\
w - 4x + y + 5z &= 9, && (6.1c)\\
2w - 2x + y + 3z &= 7. && (6.1d)
\end{aligned}
$$


For computational purposes the simplest way to think of these is in matrix form: they can be written as

$$
\begin{pmatrix}
2 & 1 & 4 & 1 \\
3 & 4 & -1 & -1 \\
1 & -4 & 1 & 5 \\
2 & -2 & 1 & 3
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -4 \\ 3 \\ 9 \\ 7 \end{pmatrix}.
\qquad (6.2)
$$

Alternatively, we can write this in shorthand as

$$Ax = v, \qquad (6.3)$$

where x = (w, x, y, z) and the matrix A and vector v take the appropriate values.
One way to solve equations of this form is to find the inverse of the matrix A and then multiply both sides of (6.3) by it to get the solution x = A^{-1}v. This sounds like a promising approach for solving equations on the computer, but in practice it's not as good as you might think. Inverting the matrix A is a rather complicated calculation that is inefficient and cumbersome to carry out numerically. There are other ways of solving simultaneous equations that don't require us to calculate an inverse, and it turns out that these are faster, simpler, and more accurate. Perhaps the most straightforward method, and the first one we will look at, is Gaussian elimination.
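To see the difference in code before we build anything ourselves, here is a minimal sketch using numpy's linalg module (standard library routines, not a method developed in this chapter). Both approaches below give the same answer for Eq. (6.1), but the second avoids forming the inverse explicitly, in the spirit of the elimination-based methods this section develops.

from numpy import array
from numpy.linalg import inv, solve

A = array([[ 2,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)
v = array([ -4, 3, 9, 7 ], float)

# Route 1: compute the inverse explicitly, then multiply (inefficient)
print(inv(A).dot(v))

# Route 2: solve the system directly, with no explicit inverse
print(solve(A, v))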

6.1.1 GAUSSIAN ELIMINATION

Suppose we wish to solve a set of simultaneous equations like Eq. (6.1). We will carry out the solution by working on the matrix form Ax = v of the equations. As you are undoubtedly aware, the following useful rules apply:

1. We can multiply any of our simultaneous equations by a constant and it's still the same equation. For instance, we can multiply Eq. (6.1a) by 2 to get 4w + 2x + 8y + 2z = -8 and the solution for w, x, y, z stays the same. To put that another way: if we multiply any row of the matrix A by any constant, and we multiply the corresponding row of the vector v by the same constant, then the solution does not change.

2. We can take any linear combination of two equations to get another correct equation. To put that another way: if we add to or subtract from any row of A a multiple of any other row, and we do the same for the vector v, then the solution does not change. (Both rules are checked numerically in the short sketch below.)
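Here is a minimal numerical check of both rules, using numpy.linalg.solve purely to supply a reference answer (the solver is a standard library routine; nothing here depends on how it works internally):

from numpy import array
from numpy.linalg import solve

A = array([[ 2,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)
v = array([ -4, 3, 9, 7 ], float)
x0 = solve(A, v)              # reference solution

# Rule 1: scale row 0 of A and element 0 of v by the same constant
A[0,:] *= 2
v[0] *= 2
print(solve(A, v) - x0)       # differences are zero (up to rounding)

# Rule 2: subtract 3 times row 0 from row 1, on both sides
A[1,:] -= 3*A[0,:]
v[1] -= 3*v[0]
print(solve(A, v) - x0)       # again zero (up to rounding)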


We can use these operations to solve our equations as follows. Consider the
matrix form of the equations given in Eq. (6.2) and let us perform the following
steps:
1. We divide the first row by the top-left element of the matrix, which has
the value 2 in this case. Recall that we must divide both the matrix itself
and the corresponding element on the right-hand side of the equation, in
order that the equations remain correct. Thus we get:

$$
\begin{pmatrix}
1 & 0.5 & 2 & 0.5 \\
3 & 4 & -1 & -1 \\
1 & -4 & 1 & 5 \\
2 & -2 & 1 & 3
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -2 \\ 3 \\ 9 \\ 7 \end{pmatrix}.
\qquad (6.4)
$$

Because we have divided both on the left and on the right, the solution to
the equations is unchanged from before, but note that the top-left element
of the matrix, by definition, is now equal to 1.
2. Next, note that the first element in the second row of the matrix is a 3. If
we now subtract 3 times the first row from the second this element will
become zero, thus:

$$
\begin{pmatrix}
1 & 0.5 & 2 & 0.5 \\
0 & 2.5 & -7 & -2.5 \\
1 & -4 & 1 & 5 \\
2 & -2 & 1 & 3
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -2 \\ 9 \\ 9 \\ 7 \end{pmatrix}.
\qquad (6.5)
$$

Notice again that we have performed the same subtraction on the right-
hand side of the equation, to make sure the solution remains unchanged.
3. We now do a similar trick with the third and fourth rows. These have
first elements equal to 1 and 2 respectively, so we subtract 1 times the
first row from the third, and 2 times the first row from the fourth, which
gives us the following:

$$
\begin{pmatrix}
1 & 0.5 & 2 & 0.5 \\
0 & 2.5 & -7 & -2.5 \\
0 & -4.5 & -1 & 4.5 \\
0 & -3 & -3 & 2
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -2 \\ 9 \\ 11 \\ 11 \end{pmatrix}.
\qquad (6.6)
$$

The end result of this series of operations is that the first column of our matrix has been reduced to the simple form (1, 0, 0, 0), but the solution of the complete set of equations is unchanged.


Now we move on to the second row of the matrix and perform a similar
series of operations. We divide the second row by its second element, to get

$$
\begin{pmatrix}
1 & 0.5 & 2 & 0.5 \\
0 & 1 & -2.8 & -1 \\
0 & -4.5 & -1 & 4.5 \\
0 & -3 & -3 & 2
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -2 \\ 3.6 \\ 11 \\ 11 \end{pmatrix}.
\qquad (6.7)
$$

Then we subtract the appropriate multiple of the second row from each of the
rows below it, so as to make the second element of each of those rows zero.
That is, we subtract -4.5 times the second row from the third, and -3 times
the second row from the fourth, to give

$$
\begin{pmatrix}
1 & 0.5 & 2 & 0.5 \\
0 & 1 & -2.8 & -1 \\
0 & 0 & -13.6 & 0 \\
0 & 0 & -11.4 & -1
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -2 \\ 3.6 \\ 27.2 \\ 21.8 \end{pmatrix}.
\qquad (6.8)
$$

Then we move on to the third and fourth rows and do the same thing, dividing each and then subtracting from the rows below (except that the fourth row obviously doesn't have any rows below it, so it only needs to be divided). The end result of the entire set of operations is the following:

$$
\begin{pmatrix}
1 & 0.5 & 2 & 0.5 \\
0 & 1 & -2.8 & -1 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -2 \\ 3.6 \\ -2 \\ 1 \end{pmatrix}.
\qquad (6.9)
$$
By definition, this set of equations still has the same solution for the variables w, x, y, and z as the equations we started with, but the matrix is now upper triangular: all the elements below the diagonal are zero. This allows us to determine the solution for the variables quite simply by the process of backsubstitution.

6.1.2 BACKSUBSTITUTION

Suppose we have any set of equations of the form

$$
\begin{pmatrix}
1 & a_{01} & a_{02} & a_{03} \\
0 & 1 & a_{12} & a_{13} \\
0 & 0 & 1 & a_{23} \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{pmatrix},
\qquad (6.10)
$$


which is exactly the form of Eq. (6.9) generated by the Gaussian elimination
procedure. We can write the equations out in full as

$$
\begin{aligned}
w + a_{01}x + a_{02}y + a_{03}z &= v_0, && (6.11a)\\
x + a_{12}y + a_{13}z &= v_1, && (6.11b)\\
y + a_{23}z &= v_2, && (6.11c)\\
z &= v_3. && (6.11d)
\end{aligned}
$$

Note that we are using Python-style numbering for the elements here, starting from zero rather than one. This isn't strictly necessary, but it will be convenient when we want to translate our calculation into computer code.
Given equations of this form, we see now that the solution for the value of z is trivial: it is given directly by Eq. (6.11d),

$$z = v_3. \qquad (6.12)$$

But given this value, the solution for y is also trivial, being given by Eq. (6.11c):

$$y = v_2 - a_{23}z. \qquad (6.13)$$

And we can go on. The solution for x is

$$x = v_1 - a_{12}y - a_{13}z, \qquad (6.14)$$

and

$$w = v_0 - a_{01}x - a_{02}y - a_{03}z. \qquad (6.15)$$

Applying these formulas to Eq. (6.9) gives

$$w = 2, \quad x = -1, \quad y = -2, \quad z = 1, \qquad (6.16)$$

which is, needless to say, the correct answer.
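In detail, the arithmetic runs as follows, reading the coefficients and right-hand sides from Eq. (6.9) and working upward from the bottom row:

$$
\begin{aligned}
z &= 1,\\
y &= -2 - 0 \times 1 = -2,\\
x &= 3.6 - (-2.8) \times (-2) - (-1) \times 1 = -1,\\
w &= -2 - 0.5 \times (-1) - 2 \times (-2) - 0.5 \times 1 = 2.
\end{aligned}
$$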


Thus by the combination of Gaussian elimination with backsubstitution we
have solved our set of simultaneous equations.

EXAMPLE 6.1: GAUSSIAN ELIMINATION WITH BACKSUBSTITUTION

We are now in a position to create a complete program for solving simultaneous equations. Here is a program to solve the equations in (6.1) using Gaussian elimination and backsubstitution:


File: gausselim.py

from numpy import array, empty

A = array([[ 2,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)
v = array([ -4, 3, 9, 7 ], float)
N = len(v)

# Gaussian elimination
for m in range(N):

    # Divide row m by its diagonal element
    div = A[m,m]
    A[m,:] /= div
    v[m] /= div

    # Now subtract a multiple of row m from each lower row
    for i in range(m+1,N):
        mult = A[i,m]
        A[i,:] -= mult*A[m,:]
        v[i] -= mult*v[m]

# Backsubstitution
x = empty(N,float)
for m in range(N-1,-1,-1):
    x[m] = v[m]
    for i in range(m+1,N):
        x[m] -= A[m,i]*x[i]

print(x)

There are a number of features to notice about this program. We store the matrices and vectors as arrays, whose initial values are set at the start of the program. The elimination portion of the program goes through each row of the matrix, one by one, and first normalizes it by dividing by the appropriate diagonal element, then subtracts a multiple of that row from each lower row. Notice how the program uses Python's ability to perform operations on entire rows at once, which makes the calculation faster and simpler to program. The second part of the program is a straightforward version of the backsubstitution procedure. Note that the entire program is written so as to work for matrices of any size: we use the variable N to represent the size, so that no matter what size of matrix we feed to the program it will perform the correct calculation.
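Run as written, the program should print the solution found earlier in Eq. (6.16), that is, values approximately equal to [ 2. -1. -2. 1.].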

Exercise 6.1: A circuit of resistors


Consider the following circuit of resistors:

All the resistors have the same resistance R. The power rail at the top is at voltage V+ = 5 V. What are the other four voltages, V1 to V4?
To answer this question we use Ohm's law and the Kirchhoff current law, which
says that the total net current flow out of (or into) any junction in a circuit must be zero.
Thus for the junction at voltage V1, for instance, we have

$$\frac{V_1 - V_2}{R} + \frac{V_1 - V_3}{R} + \frac{V_1 - V_4}{R} + \frac{V_1 - V_+}{R} = 0,$$

or equivalently

$$4V_1 - V_2 - V_3 - V_4 = V_+.$$
a) Write similar equations for the other three junctions with unknown voltages.
b) Write a program to solve the four resulting equations using Gaussian elimination and hence find the four voltages (or you can modify a program you already have, such as the program gausselim.py in Example 6.1).


6.1.3 PIVOTING

Suppose the equations we want to solve are slightly different from those of the
previous section, thus:

$$
\begin{pmatrix}
0 & 1 & 4 & 1 \\
3 & 4 & -1 & -1 \\
1 & -4 & 1 & 5 \\
2 & -2 & 1 & 3
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -4 \\ 3 \\ 9 \\ 7 \end{pmatrix}.
\qquad (6.17)
$$

Just one thing has changed from the old equations in (6.2): the first element of
the first row of the matrix is zero, where previously it was nonzero. But this
makes all the difference in the world, because the first step of our Gaussian
elimination procedure requires us to divide the first row of the matrix by its
first element, which we can no longer do, because we would have to divide by
zero. In cases like these, Gaussian elimination no longer works. So what are
we to do?
The standard solution is to use pivoting, which means simply interchanging the rows of the matrix to get rid of the problem. Clearly we are allowed to interchange the order in which we write our simultaneous equations (it will not affect their solution), so we could, if we like, swap the first and second equations, which in matrix notation is equivalent to writing

$$
\begin{pmatrix}
3 & 4 & -1 & -1 \\
0 & 1 & 4 & 1 \\
1 & -4 & 1 & 5 \\
2 & -2 & 1 & 3
\end{pmatrix}
\begin{pmatrix} w \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} 3 \\ -4 \\ 9 \\ 7 \end{pmatrix}.
\qquad (6.18)
$$

In other words, we have swapped the first and second rows of the matrix and also swapped the first and second elements of the vector on the right-hand side. Now the first element of the matrix is no longer zero, and Gaussian elimination will work just fine.
Pivoting has to be done with care. We must make sure, for instance, that in swapping equations to get rid of a problem we don't introduce another problem somewhere else. Moreover, the elements of the matrix change as the Gaussian elimination procedure progresses, so it's not always obvious in advance where problems are going to arise. A number of different rules or schemes for pivoting have been developed to guide the order in which the equations should be swapped. A good, general scheme that works well in most cases is the so-called partial pivoting method, which is as follows.


As we have seen, the Gaussian elimination procedure works down the rows of the matrix one by one, dividing each by the appropriate diagonal element before performing subtractions. With partial pivoting we consider rearranging the rows at each stage. When we get to the mth row, we compare it to all lower rows, looking at the value each row has in its mth element and finding the one such value that is farthest from zero, either positive or negative. If the row containing this winning value is not currently the mth row then we move it up to mth place by swapping it with the current mth row. This has the result of ensuring that the element we divide by in our Gaussian elimination is always as far from zero as possible.
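Here is a minimal sketch of how this step might be spliced into the elimination loop of gausselim.py, applied to the troublesome system of Eq. (6.17). It is essentially one possible answer to Exercise 6.2 below, so treat it as a sketch rather than the definitive implementation:

from numpy import array, empty, argmax, abs

A = array([[ 0,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)
v = array([ -4, 3, 9, 7 ], float)
N = len(v)

for m in range(N):

    # Partial pivoting: find the row at or below m whose mth element
    # is farthest from zero and swap it into position m, in both A and v
    p = m + argmax(abs(A[m:,m]))
    if p != m:
        A[[m,p],:] = A[[p,m],:]
        v[m],v[p] = v[p],v[m]

    # The remainder is ordinary Gaussian elimination, as before
    div = A[m,m]
    A[m,:] /= div
    v[m] /= div
    for i in range(m+1,N):
        mult = A[i,m]
        A[i,:] -= mult*A[m,:]
        v[i] -= mult*v[m]

# Backsubstitution, unchanged from gausselim.py
x = empty(N,float)
for m in range(N-1,-1,-1):
    x[m] = v[m]
    for i in range(m+1,N):
        x[m] -= A[m,i]*x[i]

print(x)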
If we look at Eq. (6.18), we see that in fact we inadvertently did exactly the
right thing when we swapped the first and second rows of our matrix, since
we moved the row with the largest first element to the top of the matrix. Now
we would perform the first step of the Gaussian elimination procedure on the
resulting matrix, move on to the second row and pivot again.
In practice, one should always use pivoting when applying Gaussian elimination, since one rarely knows in advance when the equations to be solved will present a problem. Exercise 6.2 invites you to extend the Gaussian elimination program from Example 6.1 to incorporate partial pivoting.

Exercise 6.2:
a) Modify the program gausselim.py in Example 6.1 to incorporate partial pivot-
ing (or you can write your own program from scratch if you prefer). Run your
program and demonstrate that it gives the same answers as the original program
when applied to Eq. (6.1).
b) Modify the program to solve the equations in (6.17) and show that it can find
the solution to these as well, even though Gaussian elimination without pivoting
fails.

6.1.4 LU DECOMPOSITION

The Gaussian elimination method, combined with partial pivoting, is a reliable method for solving simultaneous equations and is widely used. For the types of calculations that crop up in computational physics, however, it is commonly used in a slightly different form from the one we have seen so far. In physics calculations it often happens that we want to solve many different sets of equations Ax = v with the same matrix A but different right-hand sides v. In such cases it would be wasteful to repeat the entire elimination procedure from scratch for each new v, and LU decomposition, the variant of Gaussian elimination described in this section, allows us to avoid doing so.
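As a preview of the idea in code, here is a minimal sketch using scipy's standard linear algebra routines lu_factor and lu_solve (library functions, not code developed in this chapter), which factor the matrix once and then reuse the factorization for each new right-hand side:

from numpy import array
from scipy.linalg import lu_factor, lu_solve

A = array([[ 2,  1,  4,  1 ],
           [ 3,  4, -1, -1 ],
           [ 1, -4,  1,  5 ],
           [ 2, -2,  1,  3 ]], float)

# Factor A once (with partial pivoting)...
lu, piv = lu_factor(A)

# ...then solve cheaply for as many right-hand sides as we like
for v in [ array([ -4, 3, 9, 7 ], float),
           array([ 1, 0, 0, 0 ], float) ]:
    print(lu_solve((lu, piv), v))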
