
NATIONAL OPEN UNIVERSITY OF NIGERIA

SCHOOL OF SCIENCE AND TECHNOLOGY

COURSE CODE: PHY 314

COURSE TITLE: NUMERICAL COMPUTATIONS

PHY 314: Numerical Computations

Course Code PHY 314


Course Title Numerical Computations

Course Developer Dr. A. B. Adeloye


DEPARTMENT OF PHYSICS
UNIVERSITY OF LAGOS

Programme Leader Dr. Ajibola S. O.


National Open University
of Nigeria
Lagos

COURSE GUIDE

NATIONAL OPEN UNIVERSITY OF NIGERIA

Contents
Introduction

The Course
Course Aims
Course Objectives

Working through the Course


Course Material
Study Units
Textbooks

Assessment
Tutor Marked Assignment
End of Course Examination

Summary

Introduction
Numerical Analysis is an important part of Physics and Engineering. This is because
most of the problems encountered in real life do not lend themselves to a solution in a
closed form. In other words, we have to make do with approximate solutions. It is clear,
therefore, that you need to be conversant with the various methods of approximate
solution of problems, as well as the loss of information inherent in replacing the exact
solution with an approximate one.

It is also quite clear that the fastest way of doing numerical computation is through the
computer. It is imperative, then, that you understand one or more of the available
programming languages. In this course, the programming language of interest is C++.

It is quite clear from the foregoing that numerical analysis is an interesting course, and
we would expect you to apply yourself fully to the course, as a lot of your future work in
the field of physics would warrant a sound knowledge of numerical analysis.

THE COURSE
PHY 314 (3 Credit Units)
This 3-credit-unit course introduces you to numerical analysis. Unit 1 discusses the various
types of errors and how they might be minimised.

Unit 2 is on curve-fitting. You would need to deduce some physical parameters from a
given set of readings obtained, perhaps, in a laboratory. Various ways of linearising given
formulas are presented, preparatory to drawing a line of best fit from which the physical
quantity is deduced.

Unit 3 is all about linear systems of simultaneous equations. You shall learn how to
handle a large set of linear equations by writing them in the form of matrices. Such
problems will then be solved with the methods applicable to matrices. You would also
learn how to arrive at solutions through iterative methods.

Unit 4 discusses different methods of finding the roots of algebraic and transcendental
equations.

In Unit 5, you will come across finite differences. You will be introduced to various
kinds of differences, and how to detect the error in difference tables.

Numerical integration is the object of Unit 6. In this Unit, you shall learn how to integrate
a function within a given set of limits (definite integrals).

Unit 7, the concluding part of the theory part of the course discusses the numerical
solution of initial value problems of ordinary differential equations.

The C++ Programming aspect of the course is an introduction to program-writing in one


of the most versatile programming languages.

We wish you success.

COURSE AIMS
The aim of this course is to equip you with the methods of numerical analysis needed to
obtain approximate solutions to physical problems, and to introduce you to writing C++
programs that implement these methods.

COURSE OBJECTIVES
After studying this course, you should be able to
 Understand the various types of errors and how to minimise them.
 Linearise a given expression in order to bring out a physical constant from the
resultant relationship.
 Fit a curve to a given set of data.
 Solve a system of linear equations.
 Find the roots of a given algebraic or transcendental equation.
 Obtain the definite integral of a given function of a single variable.
 Work with finite difference schemes.
 Solve first-order initial value problems of ordinary differential equations.
 Solve higher order initial value problems of ordinary differential equations.
 Write C++ programs for solving the numerical problems.

WORKING THROUGH THE COURSE


Numerical methods provide a powerful way of solving almost any problem in physics,
provided it has been properly formulated. It is our belief that the student would be
motivated enough to put in a good effort in understanding the theoretical part of this
course and be willing to learn to write programs in C++ language.

THE COURSE MATERIAL


You will be provided with the following materials:

Course Guide
Study Material containing study units

At the end of the course, you will find a list of recommended textbooks which are
necessary as supplements to the course material. However, note that it is not compulsory
for you to acquire or indeed read them.

STUDY UNITS for Numerical Analysis


The following study units are contained in this course:

Unit 1: Approximations and Errors in Numerical Computations


Unit 2: Curve Fitting
Unit 3: Linear Systems of Equations
Unit 4: Roots of Algebraic and Transcendental Equations
Unit 5: Finite Differences and Interpolation
Unit 6: Numerical Integration
Unit 7: Initial Value Problems of Ordinary Differential Equations

TEXTBOOKS
Some reference books, which you may find useful, are given below:
1. Numerical Methods in Engineering and Science – Grewal, B. S.
2. Introductory Methods of Numerical Analysis – Sastry, S. S.
3. A friendly Introduction to Numerical Analysis – Bradie, B.

Assessment
There are two components of assessment for this course. The Tutor Marked Assignment
(TMA), and the end of course examination.

Tutor Marked Assignment


The TMA is the continuous assessment component of your course. It accounts for 30% of
the total score. You will be given 4 TMA’s to answer. Three of these must be answered
before you are allowed to sit for the end of course examination. The TMA’s would be
given to you by your facilitator and returned after they have been graded.

End of Course Examination


This examination concludes the assessment for the course. It constitutes 70% of the
whole course. You will be informed of the time for the examination. It may or may not
coincide with the university semester examination.

Summary
This course is designed to lay a foundation for you for further studies in Numerical
Analysis. At the end of this course, you will be able to answer the following types of
questions:

 What is the need for numerical analysis in Physics?


 What are the types of error that can be encountered in numerical work?

 What are the ways of obtaining the line that best fits a set of laboratory data?
 What are the various ways of numerically solving a system of linear equations?
 What are the ways in which we can numerically find the roots of an equation?
 How do I integrate a function that does not lend itself to an analytical solution?
 How do I solve a first order ordinary differential equation?
 How do I tackle a higher order initial value problem of ordinary differential
equation?
 What are the merits and demerits of some of the methods of numerical analysis?

We wish you success.

UNIT 1: Approximations and Errors in Numerical Computations
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Accuracy of Numbers
3.1.1 Approximate Numbers
3.1.2 Significant digits (figures)
3.1.3 Rounding off
3.1.4 Arithmetic precision
3.2 Accuracy of Measurement
3.3 Errors
3.3.1 Rounding Errors
3.3.2 Inherent Errors
3.3.3 Truncation Errors
3.3.4 Absolute Error, Relative Error and Percentage Error
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment (TMA)
7.0 References/Further Readings

1.0 Introduction
Physics is an exact science. However, it is strictly impossible to achieve infinite accuracy
in practice. You are quite aware that your apparatus or instrument is not perfect, neither is
your eye nor your measuring ability. We then see that errors arise in everyday
observations and measurements. The study of errors is very important in all areas of
Science and Technology. This is necessitated by the fact that errors should not swamp
our procedure enough to alter, significantly, conclusions that may be drawn from such
observations or measurements.

Apart from the limitations of observation and measurement, there are some errors
inherent in the problem itself. A good example in Quantum Mechanics is given by the
Heisenberg Uncertainty Principle, which maintains that we cannot measure some pairs of
quantities accurately simultaneously, for example, the position of a body and its
momentum. Any attempt to measure either quantity accurately gives an infinite error in
the other. Some other errors arise as a result of representing an infinite series with a
truncated one. We shall talk a little bit more about this in a while.

2.0 Objectives
At the end of this unit, you would be able to:
 understand the importance of errors in numerical analysis.
 round a number to a certain number of significant figures
 know how to reduce the errors involved in your numerical work.
 understand arithmetic precision

3.0 Main Content
3.1 Accuracy of Numbers
3.1.1 Approximate Numbers
For the sake of numerical computation, all numbers can be classified under two broad
headings: exact numbers and approximate numbers. As the name implies, the former
comprises numbers that are fully represented by some digits. Examples include the
integers, and rational numbers that can and have been completely written, e.g., 3.2158.
Approximate numbers are those that are not fully specified by the digits representing
them. As an example, we could write the rational number 7/3 as 2.3333. You are quite
aware that the actual number is not exactly 2.3333.

By this stage of your study, you must have worked with the rational numbers. These are
numbers which can be written as a fraction of two integers. Although certain rational
numbers are exact numbers, you have also come across a lot of rational numbers that
cannot be written as exact numbers, as in the example above. The irrational numbers are
even more troublesome. An example of an irrational number is $\sqrt{2}$: such numbers cannot
be written as the ratio of any two integers. There are two families of unending decimals:
those that repeat a certain sequence of digits and those that do not, for instance,
12.345454545… and 18.127849342…, respectively. The order of preference in dealing with
numbers in numerical computations is: natural numbers, rational numbers that have a finite
string of digits, rational numbers that have unending strings of digits, and irrational numbers.

3.1.2 Significant digits (figures)


We say a number is of r significant digits (figures) if r digits are used to express it. As an
example, 1.612, 0.004812 and 3806000 all have four significant figures. You would
notice that each of them can be written as $x \times 10^n$ (with no loss of information), where $x$
has $r$ digits ($r = 4$ in this case), neither starting nor ending with zero, and $n$ is an integer,
positive or negative.

The following rules will be of assistance to you. Make sure they become a part of you.

1. The leftmost non-zero digit is the most significant digit, e.g., in 0.001243, 1 is the
most significant digit.
2. Where there is no decimal point, the rightmost non-zero digit is the least
significant, e.g., in 145630000, 3 is the least significant digit.
3. If there is a decimal point, the rightmost digit is the least significant, even if it is zero,
e.g., in 235.34200, the last 0 is the least significant: the number is not 235.34201 or
235.34199.
4. All digits between the least significant and the most significant (inclusive) are
significant, e.g., in the example under rule 1, the digits 1–3 (that is, 1, 2, 4 and 3) are
significant; in the example under rule 2, the digits 1–3 (that is, 1, 4, 5, 6 and 3) are significant.

Take another example: 0.00004 has one significant figure, while 984.13245 has 8
significant figures. It should be obvious to you why they have been classified this way.

There is an exception, however: a zero obtained by rounding is significant. For example,
when 329.5 is rounded to 3 significant figures it becomes 330, and the last zero is
significant in this case. You can compare this with rule 2 above.

3.1.3 Rounding off


The irrational numbers are a perfect example of numbers with unending digits. Even in
the case of rational numbers there can still be an unending string of digits, and in some
other cases we may decide to reduce the number of digits by which a number is
represented. This process is called rounding off.

Rules for Rounding off a number to n significant figures


(a) Discard all digits to the right of the nth digit
(b) If the discarded part of the number is
(I) less than half a unit in the n th place leave the n th digit unchanged
(II) greater than half a unit in the n th place, increase the n th digit by unity
(III) exactly equal to half a unit in the n th place, leave the digit unchanged
if it is even; increase by unity if otherwise.

Examples: Round the following numbers to 5 significant figures:


(i) 3.142857143 (ii) 6.32431925 (iii) 1.4123519

Solution: To 5 significant figures, the numbers are:


(i) 3.1429 (rules (a) and (b)(II): the discarded part, 57143…, is greater than half a
unit in the 5th place)
(ii) 6.3243 (rules (a) and (b)(I): the discarded part is less than half a unit in the 5th place)
(iii) 1.4124 (rules (a) and (b)(II): the discarded part, 519, is greater than half a unit in the 5th place)

Note:
A number rounded off to n significant figures is said to be correct to n significant
places.

3.1.4 Arithmetic precision


As we have said before, it might be necessary to round off our numbers to make them
useful for numerical computation, more so as it would require an infinite computer
memory to store an unending number. The precision of a number is an indication of the
number of digits that have been used to express it. In scientific computing, it is the
number of significant digits or figures, while in financial systems, it is the number of
decimal places. You are quite aware that most currencies in the world are quoted to two
decimal places.

In our own case, arithmetic precision (often referred to simply as precision) is the
specified number of significant figures or digits to which the number of interest is to be
rounded.

3.3 Errors
We said earlier that we would revisit the different types of errors. These are:

3.3.1 Rounding Errors


These are errors incurred by truncating a sequence of digits representing a number, as we
saw in the case of representing the rational number 7/3 by 2.3333, instead of 2.3333…,
which is an unending decimal. Apart from being unable to write this number in an exact
form by hand, our instruments of calculation, be they the calculator or the computer, can
only handle a finite string of digits.

Rounding errors can be reduced if we change the calculation procedure in such a way as
to avoid the subtraction of nearly equal numbers or division by a small number. It can
also be reduced by retaining at least one more significant figure at each step than the one
given in the data, and then rounding off at the last step.

3.3.2 Inherent Errors


As the name implies, these are errors that are inherent in the statement of the problem
itself. This could be due to the limitations of the means of calculation, for instance, the
calculator or the computer. This error could be reduced by using a higher precision of
calculation.

3.3.3 Truncation Errors


If we truncate Taylor’s series, which should be an infinite series, then some error is
incurred. This is the error associated with truncating a sequence or by terminating an
iterative process.

This kind of error also results when, for instance, we carry out numerical differentiation
or integration, because we are replacing an infinitesimal process with a finite one. In
either case, we would have required that the elemental value of the independent variable
tend to zero in order to get the exact value.

3.3.4 Absolute Error, Relative Error and Percentage Error


The absolute error in a measurement is the absolute difference between the measured
value and the actual value of the quantity. Thus, we can write

Absolute error = | actual value − measured value |

The ratio of the absolute error to the actual value is the relative error. We can therefore
write the relative error as

Relative error = | actual value − measured value | / actual value

The relative error expressed as a percentage is the percentage error. Percentage error can
therefore be written as

Percentage error = ( | actual value − measured value | / actual value ) × 100
Examples
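For instance, suppose a student measures the length of a string of actual length 72.5 cm and
obtains 72.4 cm. The absolute error is |72.5 − 72.4| = 0.1 cm, the relative error is
0.1/72.5 ≈ 0.00138, and the percentage error is about 0.138%.

Since the programming language of this course is C++, a minimal illustrative sketch of these
three definitions is given below. The function names are ours, chosen only for this
illustration; they are not part of any standard library.

#include <cmath>
#include <iostream>

// Illustrative helpers for the three error measures defined above.
double absoluteError(double actual, double measured) {
    return std::fabs(actual - measured);
}

double relativeError(double actual, double measured) {
    return absoluteError(actual, measured) / std::fabs(actual);
}

double percentageError(double actual, double measured) {
    return relativeError(actual, measured) * 100.0;
}

int main() {
    double actual = 72.5, measured = 72.4;   // the string-length example above (in cm)
    std::cout << "Absolute error   = " << absoluteError(actual, measured) << " cm\n";
    std::cout << "Relative error   = " << relativeError(actual, measured) << "\n";
    std::cout << "Percentage error = " << percentageError(actual, measured) << " %\n";
}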

4.0 Conclusion
In this Unit you learnt that errors occur in measurement, because the imperfect observer
makes use of imperfect measuring instruments. Some errors are inevitable as they are a
part of the problem under investigation. Moreover, the instruments of calculation, such as
the computer, can only handle a finite number of digits, as the memory is finite. You also
learnt to write a certain number in a specified number of decimal points. You got to know
how to round a number to a number of significant figures. Some ways of reducing some
of these errors were also discussed.

5.0 Summary
In this Unit, you learnt the following:
 Errors are an integral part of life.
 How to round a number to a specific number of significant figures.
 The different types of error and how some of them may be reduced.

6.0 Tutor-Marked Assignment


1. Round the following to the number of significant figures indicated.
(a) 12.0234831 4 significant figures
(b) 295.10542 5 significant figures
(c) 0.0045829 3 significant figures

2. A student measured the length of a string of actual length 72.5 cm as 72.4 cm.
Calculate the absolute error and the percentage error.

7.0 References/Further Readings

Solutions to Tutor Marked Assignment

1. Round the following to the number of significant figures indicated.


(a) 12.0234831 to 4 significant figures = 12.02
(b) 295.10542 to 5 significant figures = 295.11
(c) 0.0045829 to 3 significant figures = 0.00458

2. A student measured the length of a string of actual length 72.5 cm as 72.4 cm.
Calculate the absolute error and the percentage error.

The absolute error is | 72.5 − 72.4 | = 0.1 cm.

The percentage error is (0.1 / 72.5) × 100 ≈ 0.1379%.

UNIT 2: Curve Fitting
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Linear Graph
3.2 Linearisation
3.3 Curve Fitting
3.3.1 Method of Least Squares
3.3.2 Method of group averages
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment (TMA)
7.0 References/Further Readings

1.0 Introduction
In most experiments as a physicist, you will be expected to plot some graphs. This unit
explains in detail how you can interpret the equation governing a particular phenomenon,
plot the appropriate graph from the data obtained so as to illustrate the inherent physical
features, and deduce the values of some physical quantities. The process of fitting a curve
to a set of data is called curve-fitting. We shall now take a look at the possible cases that
could arise in curve-fitting.

2.0 Objectives
At the end of this unit, you should be able to:
 Linearise a given equation in order to plot a linear graph from which some
physical constants can be determined.
 Derive the equations for the least-squares linear fit.
 Derive the equations for the method of group averages.
 Fit a linear graph to a set of data.

3.0 Main Content


3.1 Linear Graph
The law governing the physical phenomenon under investigation could be linear, of the
form $y = mx + c$. It follows that a graph could be plotted of the points $(x_i, y_i)$, $i = 1, \dots, n$,
where $n$ is the number of observations (or sets of data). We could then obtain the line of
best fit by any of a number of methods, which are discussed in Section 3.3.

3.2 Linearisation
A nonlinear relationship can be linearised and the resulting graph analysed to bring out
the relationship between variables. We shall consider a few examples:

Case 1: $y = ae^x$.
(i) We could take the logarithm of both sides to base e:
$$\ln y = \ln(ae^x) = \ln a + \ln e^x = x + \ln a,$$
since $\ln e^x = x$. Thus, a plot of $\ln y$ against $x$ gives a linear graph with slope unity
and a y-intercept of $\ln a$.

(ii) We could also have plotted $y$ against $e^x$. The result is a linear graph through the
origin, with slope equal to $a$.

Case 2: $T = 2\pi\sqrt{\dfrac{l}{g}}$
We can write this expression in three different ways:
(i) Taking logarithms,
$$\ln T = \ln(2\pi) + \frac{1}{2}\ln\frac{l}{g} = \ln(2\pi) + \frac{1}{2}(\ln l - \ln g).$$
Rearranging, we obtain
$$\ln T = \frac{1}{2}\ln l + \left[\ln(2\pi) - \frac{1}{2}\ln g\right].$$
Writing this in the form $y = mx + c$, we see that a plot of $\ln T$ against $\ln l$ gives a slope of
0.5 and a $\ln T$ intercept of $\ln(2\pi) - \frac{1}{2}\ln g$. Once the intercept is read off the graph, you
can then calculate the value of $g$.

(ii) $T = \dfrac{2\pi}{\sqrt{g}}\sqrt{l}$
A plot of $T$ versus $\sqrt{l}$ gives a linear graph through the origin (as the intercept is zero).
The slope of the graph is $\dfrac{2\pi}{\sqrt{g}}$, from which the value of $g$ can be recovered.

(iii) Squaring both sides,
$$T^2 = \frac{4\pi^2}{g}\,l$$
A plot of $T^2$ versus $l$ gives a linear graph through the origin. The slope of the graph is
$\dfrac{4\pi^2}{g}$, and the value of $g$ can be obtained appropriately.

Case 3: $N = N_0 e^{-\lambda t}$
The student can show that a plot of $\ln N$ versus $t$ will give a linear graph with slope
$-\lambda$, and $\ln N$ intercept $\ln N_0$.

What other functions of $N$ and $t$ could you plot in order to get $\lambda$ and $N_0$?

Case 4: $\dfrac{1}{f} = \dfrac{1}{u} + \dfrac{1}{v}$
We rearrange the equation:
$$\frac{1}{v} = \frac{1}{f} - \frac{1}{u}$$
A plot of $v^{-1}$ (y-axis) versus $u^{-1}$ (x-axis) gives a slope of $-1$ and a vertical intercept of
$\dfrac{1}{f}$.

Example
A student obtained the following reading with a mirror in the laboratory.

u 10 20 30 40 50
v -7 -10 -14 -15 -17

Linearise the relationship $\dfrac{1}{f} = \dfrac{1}{u} + \dfrac{1}{v}$. Plot the graph of $v^{-1}$ versus $u^{-1}$ and draw the line
of best fit. Hence, find the focal length of the mirror. All distances are in cm.

Solution

u	v	1/u	1/v
10	-7	0.1	-0.14286
20	-10	0.05	-0.10000
30	-14	0.033333	-0.07143
40	-15	0.025	-0.06667
50	-17	0.02	-0.05882

The graph is plotted in Fig. 1.1.

Fig. 1.1: Linear graph of the function $\dfrac{1}{v} = \dfrac{1}{f} - \dfrac{1}{u}$, with $1/v$ (in cm⁻¹) plotted against $1/u$ (in cm⁻¹)

The slope is $-1.05$ and the intercept $-0.04$. From $\dfrac{1}{v} = \dfrac{1}{f} - \dfrac{1}{u}$, we see that the intercept is
$\dfrac{1}{f} = -0.04$, or $f = \dfrac{1}{-0.04} = -25$ cm.

3.3 Curve Fitting


What we did in Section 3.2, generally, was to plot the values of dependent variable
against the corresponding values of the independent variable. With this done, we got the
line of best fit. The latter could have been obtained by eye judgment. There are some
other ways of deducing the relationship between the variables. We shall first consider the
ones based on linear relationship, or the ones that can be somehow reduced to such
relationships.

3.3.1 Method of Least Squares


Suppose $x_i$, $i = 1, \dots, n$ are the points of the independent variable at which the dependent
variable, with respective values $y_i$, $i = 1, \dots, n$, is measured. Consider the graph below,
where we have assumed a linear relationship of equation $y = mx + c$. Then at each point
$x_i$, the value predicted by the line is $mx_i + c$.

The least-squares method entails minimising the sum of the squares of the differences
between the measured values and the ones predicted by the assumed equation.

Fig. 1.2: Illustration of the error in representing a set of data with the line of best fit; the
vertical deviation at $x_1$ is $y_1 - (mx_1 + c)$

$$S = \sum_{i=1}^{n} \left[y_i - (mx_i + c)\right]^2 \qquad (2.1)$$

We have taken the square of the difference because taking the sum alone might give the
impression that there is no error if the sum of positive differences is balanced by the sum
of negative differences, just as in the case of the relevance of the variance of a set of data.

Now, $S$ is a function of $m$ and $c$, that is, $S = S(m, c)$. This is because we seek a line of
best fit, which will be determined by an appropriate slope and a suitable intercept. In any
case, $x_i$ and $y_i$ are not variables in this case, having been obtained in the laboratory, for
instance.

You have been taught at one point or another that for a function of a single variable
$f(x)$, the extrema are the points where $\frac{df}{dx} = 0$. However, for a function of more than
one variable, partial derivatives are the relevant quantities. Thus, since $S = S(m, c)$, the
condition for extrema is

$$\frac{\partial S}{\partial m} = 0 \quad \text{and} \quad \frac{\partial S}{\partial c} = 0 \qquad (2.2)$$

$$\frac{\partial S}{\partial m} = -2\sum_{i=1}^{n} \left[y_i - (mx_i + c)\right] x_i = 0 \qquad (2.3)$$

$$\frac{\partial S}{\partial c} = -2\sum_{i=1}^{n} \left[y_i - (mx_i + c)\right] = 0 \qquad (2.4)$$

From equation 2.3,

$$\sum_{i=1}^{n} x_i y_i - m\sum_{i=1}^{n} x_i^2 - c\sum_{i=1}^{n} x_i = 0 \qquad (2.5)$$

and from equation 2.4,

$$\sum_{i=1}^{n} y_i - m\sum_{i=1}^{n} x_i - nc = 0 \qquad (2.6)$$

It follows, from the fact that $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and similar expressions for the other averages,
that equations 2.5 and 2.6 give, respectively,

$$\overline{xy} - m\,\overline{x^2} - c\bar{x} = 0 \qquad (2.7)$$
$$\bar{y} - m\bar{x} - c = 0 \qquad (2.8)$$

Multiplying equation 2.8 by $\bar{x}$ gives

$$\bar{x}\bar{y} - m\bar{x}^2 - c\bar{x} = 0 \qquad (2.9)$$

Finally, from equations 2.7 and 2.9,

$$m = \frac{\overline{xy} - \bar{x}\bar{y}}{\overline{x^2} - \bar{x}^2} \qquad (2.10)$$

and from equation 2.8,

$$c = \bar{y} - m\bar{x} \qquad (2.11)$$
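Since the programming language of this course is C++, the following is a minimal
illustrative sketch of how equations 2.10 and 2.11 might be evaluated for a set of data
points. The function name leastSquaresFit is ours, chosen only for this illustration.

#include <iostream>
#include <vector>

// Illustrative sketch of equations 2.10 and 2.11:
//   m = (mean(xy) - mean(x)*mean(y)) / (mean(x^2) - mean(x)^2),  c = mean(y) - m*mean(x).
void leastSquaresFit(const std::vector<double>& x, const std::vector<double>& y,
                     double& m, double& c) {
    const double n = static_cast<double>(x.size());
    double sx = 0.0, sy = 0.0, sxy = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx  += x[i];
        sy  += y[i];
        sxy += x[i] * y[i];
        sxx += x[i] * x[i];
    }
    const double xbar = sx / n, ybar = sy / n;
    m = (sxy / n - xbar * ybar) / (sxx / n - xbar * xbar);
    c = ybar - m * xbar;
}

int main() {
    // Data of the worked example that follows: t = 5, 12, 19, 26, 33; x = 23, 28, 32, 38, 41.
    std::vector<double> t = {5, 12, 19, 26, 33};
    std::vector<double> x = {23, 28, 32, 38, 41};
    double m, c;
    leastSquaresFit(t, x, m, c);
    std::cout << "m = " << m << ", c = " << c << '\n';   // approximately 0.6571 and 19.91
}

Running this sketch on the data of the worked example below reproduces, to rounding, the
slope and intercept obtained by hand.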

Example
A student obtained the following data in the laboratory. By making use of the method of
least squares, find the relationship between x and t.

Thus, for the following set of readings:

t 5 12 19 26 33
x 23 28 32 38 41

The table can be extended to give

t	5	12	19	26	33	$\sum t = 95$	$\bar{t} = 19$
x	23	28	32	38	41	$\sum x = 162$	$\bar{x} = 32.4$
tx	115	336	608	988	1353	$\sum tx = 3400$	$\overline{tx} = 680$
t²	25	144	361	676	1089	$\sum t^2 = 2295$	$\overline{t^2} = 459$

$$m = \frac{\overline{tx} - \bar{t}\bar{x}}{\overline{t^2} - \bar{t}^2} = \frac{680 - 19 \times 32.4}{459 - 19^2} = 0.6571 \qquad (2.12)$$
$$c = \bar{x} - m\bar{t} = 32.4 - 0.6571 \times 19 = 19.9151 \qquad (2.13)$$

Hence, the relationship between x and t is
$$x = 0.6571\,t + 19.9151$$

3.3.2 Method of group averages


As the name implies, the set of data is divided into two groups, each of which is assumed to
have a zero sum of residuals. Thus, given the equation

$$y = mx + c \qquad (2.14)$$

we would like to fit a set of $n$ observations as closely as possible.

The error between the measured value of the variable and the value predicted by the
equation is (as we saw in Fig. 1.2):

$$\varepsilon_i = y_i - (mx_i + c) \qquad (2.15)$$

The fitted line requires two unknown quantities: $m$ and $c$. Thus, two equations are
needed. We achieve these two equations by dividing the data into two groups, one of size
$l$ and the other of size $n - l$, where $n$ is the total number of observations.

The assumption that the sum of errors for each group is zero requires that

$$\sum_{i=1}^{l} \left[y_i - (mx_i + c)\right] = 0 \qquad (2.16)$$

and

$$\sum_{i=l+1}^{n} \left[y_i - (mx_i + c)\right] = 0 \qquad (2.17)$$

From equation 2.16,

$$\sum_{i=1}^{l} y_i = m\sum_{i=1}^{l} x_i + lc \qquad (2.18)$$

and equation 2.17 yields

$$\sum_{i=l+1}^{n} y_i = m\sum_{i=l+1}^{n} x_i + (n-l)c \qquad (2.19)$$

the latter equation being true since $n - l$ is the number of observations that fall into that
group.

Dividing through by $l$ and $n - l$, respectively, equation 2.18 gives

$$\frac{1}{l}\sum_{i=1}^{l} y_i = m\,\frac{1}{l}\sum_{i=1}^{l} x_i + c \qquad (2.20)$$

and from equation 2.19,

$$\frac{1}{n-l}\sum_{i=l+1}^{n} y_i = m\,\frac{1}{n-l}\sum_{i=l+1}^{n} x_i + c \qquad (2.21)$$

Thus, writing $\bar{x}_1, \bar{y}_1$ and $\bar{x}_2, \bar{y}_2$ for the group averages,

$$\bar{y}_1 = m\bar{x}_1 + c, \qquad \bar{y}_2 = m\bar{x}_2 + c \qquad (2.22)$$

Subtracting,

$$\bar{y}_1 - \bar{y}_2 = m(\bar{x}_1 - \bar{x}_2) \qquad (2.23)$$

$$m = \frac{\bar{y}_1 - \bar{y}_2}{\bar{x}_1 - \bar{x}_2} \qquad (2.24)$$

and

$$c = \bar{y}_1 - m\bar{x}_1 \qquad (2.25)$$
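A similarly minimal C++ sketch of equations 2.24 and 2.25 is given below for illustration;
the function name groupAverageFit is ours, and the size of the first group is passed in as a
parameter.

#include <iostream>
#include <vector>

// Illustrative sketch of the method of group averages (equations 2.24 and 2.25):
// split the data into two groups, average each group, and pass the line through
// the two average points.
void groupAverageFit(const std::vector<double>& x, const std::vector<double>& y,
                     std::size_t l, double& m, double& c) {
    double x1 = 0.0, y1 = 0.0, x2 = 0.0, y2 = 0.0;
    for (std::size_t i = 0; i < l; ++i)        { x1 += x[i]; y1 += y[i]; }
    for (std::size_t i = l; i < x.size(); ++i) { x2 += x[i]; y2 += y[i]; }
    const double n1 = static_cast<double>(l), n2 = static_cast<double>(x.size() - l);
    x1 /= n1; y1 /= n1; x2 /= n2; y2 /= n2;
    m = (y1 - y2) / (x1 - x2);      // equation 2.24
    c = y1 - m * x1;                // equation 2.25
}

int main() {
    std::vector<double> t = {5, 12, 19, 26, 33};
    std::vector<double> x = {23, 28, 32, 38, 41};
    double m, c;
    groupAverageFit(t, x, 3, m, c); // first group of size 3, as in the example that follows
    std::cout << "m = " << m << ", c = " << c << '\n';   // approximately 0.67619 and 19.5524
}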

Example

Let us solve the example in Section 3.3.1 using the method of group averages.

t 5 12 19 26 33
x 23 28 32 38 41

We shall divide the data into two groups, such as:

t 5 12 19
x 23 28 32

and

t 26 33
x 38 41

The tables can be extended to give, for the first group:

t	5	12	19	$\sum t = 36$	$\bar{t}_1 = 12$
x	23	28	32	$\sum x = 83$	$\bar{x}_1 = 27.666667$

and for the second group:

t	26	33	$\sum t = 59$	$\bar{t}_2 = 29.5$
x	38	41	$\sum x = 79$	$\bar{x}_2 = 39.5$

$$m = \frac{\bar{x}_1 - \bar{x}_2}{\bar{t}_1 - \bar{t}_2} = \frac{27.666667 - 39.5}{12 - 29.5} = 0.67619$$

and

$$c = \bar{x}_1 - m\bar{t}_1 = 27.666667 - (0.67619 \times 12) = 19.552387$$

Thus, the equation of best fit is

$$x = 0.67619\,t + 19.552387$$
4.0 Conclusion
In this Unit, you learnt how to linearise an expression in order to obtain some relevant
information when written as a linear equation. You also derived the equations for two
different methods of drawing the line of best fit. In addition, you applied these formulas
to a set of data and were able to write the equation of best fit in each case.

5.0 Summary
In this Unit, you learnt:
 How to linearise a nonlinear expression in order to deduce some desired
parameters.
 How to draw the line of best fit with the method of least squares.
 How to draw the line of best fit with the method of group averages.

6.0 Tutor Marked Assignment (TMA)


1. The current flowing in a particular R-C circuit is tabulated against the change in
the time $t - t_0$, such that at time $t = t_0$, the current is 1.2 A. Using the least-squares
method, find the slope and the intercept of the linear function relating the current i to the
time t. Hence, determine the time constant of the circuit.

t 2 2.2 2.4 2.6 2.8 3


i 0.20 0.16 0.13 0.11 0.09 0.07

2. Solve the problem in TMA 1 with the method of group averages by dividing into
two groups of three data sets each.
t 2 2.2 2.4
i 0.20 0.16 0.13

and

t 2.6 2.8 3
i 0.11 0.09 0.07

3. A student performing the simple pendulum experiment obtained the following


table, where t is the time for 50 oscillations.

l (cm) 50 45 40 35 30 25 20 15
t (s) 71 69 65 61 56 52 48 43

Find the acceleration due to gravity at the location of the experiment, using
(i) the method of least squares, and
(ii) the method of group averages.

7.0 References/Further Readings

Solutions to Tutor Marked Assignment

1. The current flowing in a particular R-C circuit is tabulated against the change in
the time $t - t_0$, such that at time $t = t_0$, the current is 1.2 A. Using the least-squares
method, find the slope and the intercept of the linear function relating the current i to the
time t. Hence, determine the time constant of the circuit.

t 2 2.2 2.4 2.6 2.8 3


i 0.20 0.16 0.13 0.11 0.09 0.07

Taking logarithms of $i = i_0 e^{-t/RC}$ to base 10:
$$\log i = \log i_0 - \frac{t}{RC}\log e$$
A plot of $\log i$ against $t$ gives slope $m = -\dfrac{\log e}{RC}$ and intercept $\log i_0$.

t	i	t²	log i	t·log i
2.0	0.20	4	-0.69897	-1.39794
2.2	0.16	4.84	-0.79588	-1.75094
2.4	0.13	5.76	-0.88606	-2.12654
2.6	0.11	6.76	-0.95861	-2.49238
2.8	0.09	7.84	-1.04576	-2.92812
3.0	0.07	9	-1.15490	-3.46471

Sums: $\sum t = 15$, $\sum t^2 = 38.2$, $\sum \log i = -5.54017$, $\sum t\log i = -14.1606$
Averages: $\bar{t} = 2.5$, $\overline{t^2} = 6.3666667$, $\overline{\log i} = -0.92336$, $\overline{t\log i} = -2.3601$

$$m = \frac{\overline{t\log i} - \bar{t}\,\overline{\log i}}{\overline{t^2} - \bar{t}^2} = \frac{-2.3601 - (2.5 \times (-0.92336))}{6.3666667 - 2.5^2} = -0.4431$$
$$c = \overline{\log i} - m\bar{t} = 0.1844$$

Since $m = -\dfrac{\log e}{RC}$, the time constant of the circuit is $RC = -\dfrac{\log e}{m} = \dfrac{0.4343}{0.4431} \approx 0.98$, in the units of $t$.

2. Solve the problem in TMA 1 with the method of group averages by dividing into
two groups of three data sets each.
t 2 2.2 2.4
i 0.20 0.16 0.13

and

t 2.6 2.8 3
i 0.11 0.09 0.07

Group 1
t i log i
2.0 0.20 -0.69897
2.2 0.16 -0.79588
2.4 0.13 -0.88606
6.6 -2.38091
2.2 -0.79364

Group 2
t i log i
2.6 0.11 -0.95861
2.8 0.09 -1.04576
3.0 0.07 -1.1549
8.4 -3.15927
2.8 -1.05309

$$m = \frac{\bar{y}_1 - \bar{y}_2}{\bar{x}_1 - \bar{x}_2} = \frac{-0.79364 - (-1.05309)}{2.2 - 2.8} = -0.4324$$
$$c = \bar{y}_1 - m\bar{x}_1 = -0.79364 - (-0.4324 \times 2.2) = 0.1576$$

3. A student performing the simple pendulum experiment obtained the following


table, where t is the time for 50 oscillations.

l (cm) 50 45 40 35 30 25 20 15
t (s) 71 69 65 61 56 52 48 43

Find the acceleration due to gravity at the location of the experiment, using
(iii) the method of least squares, and
(iv) the method of group averages.

Method of least squares (taking logs)

$$\log T = \log\frac{2\pi}{\sqrt{g}} + \frac{1}{2}\log l$$

A plot of $\log T$ against $\log l$ gives slope 0.5 and intercept $c = \log\dfrac{2\pi}{\sqrt{g}}$, from which the
value of $g$ is $\left(\dfrac{2\pi}{10^{c}}\right)^2$.

l t log l log T (log l)*(log l) (log l)*(log T)


0.50 71 -0.30103 0.152288 0.090619058 -0.04584 0.2966771
0.45 69 -0.34679 0.139879 0.120261561 -0.04851 0.3165404
0.40 65 -0.39794 0.113943 0.158356251 -0.04534 0.3387458

0.35 61 -0.45593 0.086360 0.207873948 -0.03937 0.3639201
0.30 56 -0.52288 0.049218 0.273402182 -0.02574 0.3929817
0.25 52 -0.60206 0.017033 0.362476233 -0.01026 0.4273542
0.20 48 -0.69897 -0.017729 0.488559067 0.012392 0.4694229
0.15 43 -0.82391 -0.065502 0.678825613 0.053967 0.5236588
Sum -4.14951 0.475492 2.380373913 -0.1487
Average -0.51869 0.059436 0.297546739 -0.01859

slope 0.429391
intercept 0.282157
2 pi 6.284
log 2 pi 0.798236
log 2 pi -inter 0.516079
2(log 2π − intercept)	1.032159 = log g
g	≈ 10.8

Method of least squares (taking squares)

$$T^2 = \frac{4\pi^2}{g}\,l$$

A plot of $T^2$ against $l$ gives a line through the origin with slope $m = \dfrac{4\pi^2}{g}$, from which
$g = \dfrac{4\pi^2}{m}$:
m

l (cm)	t	l (m)	T²	l²	T²·l
50	71	0.50	2.016400	0.2500	1.00820
45	69	0.45	1.904400	0.2025	0.85698
40	65	0.40	1.690000	0.1600	0.67600
35	61	0.35	1.488400	0.1225	0.52094
30	56	0.30	1.254400	0.0900	0.37632
25	52	0.25	1.081600	0.0625	0.27040
20	48	0.20	0.921600	0.0400	0.18432
15	43	0.15	0.739600	0.0225	0.11094
Sum		2.6	11.0964	0.9500	4.0041
Average		0.325	1.38705	0.11875	0.500513

slope	3.788286
intercept	0.155857
g	10.42

Method of group averages (taking logs)


Group 1
L t log l log T
0.50 71 -0.3010 0.15229
0.45 69 -0.3468 0.13988
0.40 65 -0.3979 0.11394
0.35 61 -0.4559 0.08636

Sum -1.50169 0.49247
Average	-0.37542	0.123118

Group 2
L t log l log T
0.30 56 -0.5229 0.04922
0.25 52 -0.6021 0.01703
0.20 48 -0.6990 -0.0177
0.15 43 -0.8239 -0.0655
Sum -2.64782 -0.01698
Average -0.66196 -0.00425

slope 0.444496
intercept 0.289991
g 10.38

Method of group averages (taking squares)


Group 1
L t l Tsquare
0.50 71 0.50 2.0164
0.45 69 0.45 1.9044
0.40 65 0.40 1.69
0.35 61 0.35 1.4884
Sum 1.70 7.0992
Average 0.43 1.7748

Group 2
L t l Tsquare
0.30 56 0.30 1.2544
0.25 52 0.25 1.0816
0.20 48 0.20 0.9216
0.15 43 0.15 0.7396
Sum 0.9 3.9972
Average 0.225 0.9993

slope 3.8775
intercept 1.02051
g 10.18

UNIT 3: Linear Systems of Equations
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 System of Linear Equations
3.2 Gaussian Elimination
3.3 Gauss-Jordan Elimination
3.4 LU Decomposition
3.5 Jacobi Iteration
3.6 Gauss-Seidel Iteration
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 References/Further Reading

1.0 Introduction
Perhaps in all areas of Physics, you would come across a system of linear equations. For
example, you might want to know what proportions of two or more variables you would
need to achieve some specific values of a desired composite product. This kind of
problem could lead to a set of linear equations. This unit will equip you with the
necessary tools to solve a system of linear equations. You shall come across direct
methods as well as iterative ways of solving such problems.

2.0 Objectives
You should be able to do the following after studying this Unit:
 Write a system of linear equations in an augmented matrix form
 Solve a system of linear equations.

3.1 System of Linear Equations


It is necessary for us to set the stage by getting to know how to write the general set of
simultaneous linear equations.

Let us consider a linear system of equations


$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned} \qquad (3.1)$$

This can be written in the form

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{pmatrix} \qquad (3.2)$$

3.2 Gaussian Elimination


A recall of the solution of a system of two equations will help in introducing the Gaussian
Elimination method.

For instance, let $(2, 3)$ be a solution set $(x, y)$. Then the following equations are in order:
$$2x + 3y = 13 \qquad (3.3)$$
$$x - y = -1 \qquad (3.4)$$

You might want to verify that these equations are consistent with the given solution set.

We could multiply equation 3.4 by $-2$ and add it to equation 3.3. This yields

$$5y = 15 \qquad (3.5)$$

Equivalently, $y = 3$. Substituting this value of $y$ in either equation 3.3 or 3.4 gives $x = 2$.

The augmented matrix representing our system of two equations is

$$\left(\begin{array}{cc|c} 2 & 3 & 13\\ 1 & -1 & -1 \end{array}\right)$$

By Gaussian elimination, we seek to make every entry below the main diagonal zero.
This we achieve by reducing the 1 to zero, making use of the first row:
$$\left(\begin{array}{cc|c} 2 & 3 & 13\\ 1 & -1 & -1 \end{array}\right) \xrightarrow{(ii)' = (i) - 2(ii)} \left(\begin{array}{cc|c} 2 & 3 & 13\\ 0 & 5 & 15 \end{array}\right) \qquad (3.6)$$

Thus,
$$5y = 15 \;\Rightarrow\; y = 3 \qquad (3.7)$$
Substituting this in the first row gives
$$2x + 3(3) = 13 \qquad (3.8)$$
from which we obtain $x = 2$.

The process of reducing every element below the main diagonal to zero (row echelon
form) is called Gaussian Elimination. That of substituting obtained values to calculate
other variables is called Back Substitution.

You can see that there is nothing new about Gaussian elimination. It is a process you
have been carrying out all along, but which you never called this name.

The same process can be carried over to the case of a system of three equations.

Let (1, 2,1) be a solution set.

Then, the equations below are valid:


2x  y  z  5
x  3 y  2z  5 3.9
3x  2 y  4 z  3

The augmented matrix is

2 1  1 5
 
1 3 2 5
 3  2  4 3

This yields (by Gaussian elimination)

2 1  1 5 2 1 1 5 
  ( ii )' ( i )  2 ( ii )  
1 3 2 5      0  5  5  5
iii )  ( i )  ( 2 / 3)( iii )
 3  2  4 3 (    0 7 / 3 5 / 3 3 
2 1 1 5 
 
 0  5  5  5 3.10
iii )'' ( ii )' (15 / 7 )(iii )'
(     0 0  10 10 
Upon back substitution,
 10 z  10 or z  1
z  1 ; y  z  1  y  2 ; 2 x  y  z  5  x  1

Traditionally, in Mathematics, it is usual to use indices such as x1 , x2 , etc. instead of


x, y, z . Do you have any idea why this is so? It is because if we stay with the alphabets,
we shall soon run out of symbols. Bear in mind that not all the alphabets can be employed
as variables; as an example, a, b, c are commonly used as constants. In addition, it makes
it easy to associate the coefficients a11 , a12 , etc. with x1 , x2 , etc. respectively. More
importantly in numerical work, it makes programming easier. For instance for our system
of three equations, we could use the more general notation:

$$\left(\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right) \xrightarrow[\;(iii)' = (iii) - (a_{31}/a_{11})(i)\;]{(ii)' = (ii) - (a_{21}/a_{11})(i)} \left(\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14}\\ 0 & a_{22}' & a_{23}' & a_{24}'\\ 0 & a_{32}' & a_{33}' & a_{34}' \end{array}\right) \xrightarrow{(iii)'' = (iii)' - (a_{32}'/a_{22}')(ii)'} \left(\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14}\\ 0 & a_{22}' & a_{23}' & a_{24}'\\ 0 & 0 & a_{33}'' & a_{34}'' \end{array}\right) \qquad (3.11)$$

We would like to sound a note of warning here. How do you set $a_{21}$ to zero? From
expression 3.11, the row operation is $(ii)' = (ii) - (a_{21}/a_{11})(i)$, which involves the fraction
$a_{21}/a_{11}$. In order to avoid having to deal with fractions, which could lead to rounding
errors, it is better to put this in the form
$$(ii)' = a_{21}(i) - a_{11}(ii) \qquad (3.12)$$
A better way of writing equation 3.11 is therefore

$$\left(\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right) \xrightarrow[\;(iii)' = a_{31}(i) - a_{11}(iii)\;]{(ii)' = a_{21}(i) - a_{11}(ii)} \left(\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14}\\ 0 & a_{22}' & a_{23}' & a_{24}'\\ 0 & a_{32}' & a_{33}' & a_{34}' \end{array}\right) \xrightarrow{(iii)'' = a_{32}'(ii)' - a_{22}'(iii)'} \left(\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14}\\ 0 & a_{22}' & a_{23}' & a_{24}'\\ 0 & 0 & a_{33}'' & a_{34}'' \end{array}\right) \qquad (3.11)$$
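To see how this indexed notation translates into a program, the following is a minimal
illustrative C++ sketch of Gaussian elimination with back substitution on an n × (n+1)
augmented matrix. It is ours, not the course's official program; for simplicity it uses the
multiplier (fraction) form of the row operation rather than the fraction-free form of
equation 3.12, and it assumes that all pivots are nonzero, so no row interchanges are performed.

#include <iostream>
#include <vector>

// Minimal sketch: Gaussian elimination with back substitution on an augmented
// matrix a (n rows, n + 1 columns). Assumes every pivot a[k][k] is nonzero.
std::vector<double> gaussianElimination(std::vector<std::vector<double>> a) {
    const std::size_t n = a.size();
    // Forward elimination: make every entry below the main diagonal zero.
    for (std::size_t k = 0; k + 1 < n; ++k) {
        for (std::size_t i = k + 1; i < n; ++i) {
            const double factor = a[i][k] / a[k][k];
            for (std::size_t j = k; j <= n; ++j)
                a[i][j] -= factor * a[k][j];
        }
    }
    // Back substitution.
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0; ) {
        double sum = a[i][n];
        for (std::size_t j = i + 1; j < n; ++j)
            sum -= a[i][j] * x[j];
        x[i] = sum / a[i][i];
    }
    return x;
}

int main() {
    // The worked example above: 2x + y - z = 5, x + 3y + 2z = 5, 3x - 2y - 4z = 3.
    std::vector<std::vector<double>> a = {{2, 1, -1, 5}, {1, 3, 2, 5}, {3, -2, -4, 3}};
    for (double v : gaussianElimination(a)) std::cout << v << ' ';   // prints 1 2 -1
    std::cout << '\n';
}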

3.3 Gauss-Jordan Elimination


This entails eliminating, in addition to the entries below the main diagonal, the entries
above it, so that the main matrix becomes a diagonal matrix. In that case, the solution to the
system is given by dividing the element in the augmented part of the matrix by the
diagonal element for that row. In other words, the end product of Gauss-Jordan
elimination looks like

$$\left(\begin{array}{ccc|c} a_{11} & 0 & 0 & a_{14}''\\ 0 & a_{22}'' & 0 & a_{24}''\\ 0 & 0 & a_{33}'' & a_{34}'' \end{array}\right) \qquad (3.12)$$

where the double primes denote the final values after all the row operations, from which it
follows that

$$x_1 = a_{14}''/a_{11}, \qquad x_2 = a_{24}''/a_{22}'', \qquad x_3 = a_{34}''/a_{33}'' \qquad (3.13)$$

Example

We shall solve the example of Section 3.2 using Gauss-Jordan elimination. Luckily, we
have already completed the Gaussian elimination part of this method, so we continue from
where we stopped.

$$\left(\begin{array}{ccc|c} 2 & 1 & -1 & 5\\ 0 & -5 & -5 & -5\\ 0 & 0 & -10 & 10 \end{array}\right) \xrightarrow[\;(ii)' = 2(ii) - (iii)\;]{(i)' = 10(i) - (iii)} \left(\begin{array}{ccc|c} 20 & 10 & 0 & 40\\ 0 & -10 & 0 & -20\\ 0 & 0 & -10 & 10 \end{array}\right)$$

$$\xrightarrow{(i)'' = (i)' + (ii)'} \left(\begin{array}{ccc|c} 20 & 0 & 0 & 20\\ 0 & -10 & 0 & -20\\ 0 & 0 & -10 & 10 \end{array}\right) \qquad (3.14)$$

It follows that $20x = 20$ or $x = 1$; $-10y = -20$ or $y = 2$; and $-10z = 10$ or $z = -1$.

3.4 LU Decomposition
Suppose we could write the matrix

$$\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & 0\\ l_{21} & l_{22} & 0\\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13}\\ 0 & u_{22} & u_{23}\\ 0 & 0 & u_{33} \end{pmatrix} \qquad (3.15)$$

This implies that

$$l_{11}u_{11} = a_{11}, \quad l_{11}u_{12} = a_{12}, \quad l_{11}u_{13} = a_{13} \qquad (3.16)$$
$$a_{21} = l_{21}u_{11}, \quad a_{22} = l_{21}u_{12} + l_{22}u_{22}, \quad a_{23} = l_{21}u_{13} + l_{22}u_{23} \qquad (3.17)$$
$$a_{31} = l_{31}u_{11}, \quad a_{32} = l_{31}u_{12} + l_{32}u_{22}, \quad a_{33} = l_{31}u_{13} + l_{32}u_{23} + l_{33}u_{33} \qquad (3.18)$$

Without loss of generality, we could set the diagonal elements of the L matrix equal to 1.
Then,

$$\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ l_{21} & 1 & 0\\ l_{31} & l_{32} & 1 \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13}\\ 0 & u_{22} & u_{23}\\ 0 & 0 & u_{33} \end{pmatrix} \qquad (3.19a)$$

Multiplying out the right side of equation 3.19a,

$$\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} u_{11} & u_{12} & u_{13}\\ l_{21}u_{11} & l_{21}u_{12} + u_{22} & l_{21}u_{13} + u_{23}\\ l_{31}u_{11} & l_{31}u_{12} + l_{32}u_{22} & l_{31}u_{13} + l_{32}u_{23} + u_{33} \end{pmatrix} \qquad (3.19b)$$

From the equality of matrices, this requires that

$$u_{11} = a_{11} \qquad (3.20)$$
$$u_{12} = a_{12} \qquad (3.21)$$
$$u_{13} = a_{13} \qquad (3.22)$$
$$a_{21} = l_{21}u_{11} \;\Rightarrow\; l_{21} = a_{21}/u_{11} = a_{21}/a_{11} \qquad (3.23)$$
$$a_{31} = l_{31}u_{11} \;\Rightarrow\; l_{31} = a_{31}/u_{11} = a_{31}/a_{11} \qquad (3.24)$$
$$a_{22} = l_{21}u_{12} + u_{22} \;\Rightarrow\; u_{22} = a_{22} - l_{21}u_{12} = a_{22} - \frac{a_{21}}{a_{11}}a_{12} \qquad (3.25)$$
$$a_{23} = l_{21}u_{13} + u_{23} \;\Rightarrow\; u_{23} = a_{23} - l_{21}u_{13} = a_{23} - \frac{a_{21}}{a_{11}}a_{13} \qquad (3.26)$$
$$l_{32} = \frac{1}{u_{22}}\left(a_{32} - l_{31}u_{12}\right) = \frac{1}{u_{22}}\left(a_{32} - \frac{a_{31}}{a_{11}}a_{12}\right) \qquad (3.27)$$
since
$$a_{32} = l_{31}u_{12} + l_{32}u_{22} \qquad (3.28)$$
Finally, from $a_{33} = l_{31}u_{13} + l_{32}u_{23} + u_{33}$,
$$u_{33} = a_{33} - l_{31}u_{13} - l_{32}u_{23} \qquad (3.29)$$

You can see that we have determined all the nine elements of the two matrices in terms of
the elements of the original matrix.

Once we have obtained L and U, we can write the original system

$$\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} y_1\\ y_2\\ y_3 \end{pmatrix} \qquad (3.30)$$

as

$$LUx = y \qquad (3.31)$$

where $x$ and $y$ are column vectors.

We shall write $w = Ux$. Then,
$$Lw = y \qquad (3.32)$$

Example
Solve the following system of equations using the method of LU decomposition.

$$2x + y - z = 5, \qquad x + 3y + 2z = 5, \qquad 3x - 2y - 4z = 3 \qquad (3.33)$$

The corresponding matrix is

$$\begin{pmatrix} 2 & 1 & -1\\ 1 & 3 & 2\\ 3 & -2 & -4 \end{pmatrix}$$

$$u_{11} = a_{11} = 2 \qquad (3.34)$$
$$u_{12} = a_{12} = 1 \qquad (3.35)$$
$$u_{13} = a_{13} = -1 \qquad (3.36)$$
$$l_{21} = a_{21}/a_{11} = 1/2 \qquad (3.37)$$
$$l_{31} = a_{31}/a_{11} = 3/2 \qquad (3.38)$$
$$u_{22} = a_{22} - \frac{a_{21}}{a_{11}}a_{12} = 3 - \frac{1}{2}(1) = \frac{5}{2}, \qquad u_{23} = a_{23} - \frac{a_{21}}{a_{11}}a_{13} = 2 - \frac{1}{2}(-1) = \frac{5}{2} \qquad (3.39)$$
$$l_{32} = \frac{1}{u_{22}}\left(a_{32} - \frac{a_{31}}{a_{11}}a_{12}\right) = \frac{1}{5/2}\left(-2 - \frac{3}{2}(1)\right) = -\frac{7}{5} \qquad (3.40)$$
$$u_{33} = a_{33} - l_{31}u_{13} - l_{32}u_{23} = -4 - \frac{3}{2}(-1) - \left(-\frac{7}{5}\right)\left(\frac{5}{2}\right) = 1 \qquad (3.41)$$

Thus,

$$\begin{pmatrix} 1 & 0 & 0\\ 1/2 & 1 & 0\\ 3/2 & -7/5 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & -1\\ 0 & 5/2 & 5/2\\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 & -1\\ 1 & 3 & 2\\ 3 & -2 & -4 \end{pmatrix} \qquad (3.42)$$

As you can see, we got the decomposition right, as the product of L and U gives the
original matrix.

The original equation is equivalent to

$$LUx = Lw = y \qquad (3.43)$$

$Lw = y$ implies

$$\begin{pmatrix} 1 & 0 & 0\\ 1/2 & 1 & 0\\ 3/2 & -7/5 & 1 \end{pmatrix} \begin{pmatrix} w_1\\ w_2\\ w_3 \end{pmatrix} = \begin{pmatrix} 5\\ 5\\ 3 \end{pmatrix} \qquad (3.44)$$

Solving,
$$w_1 = 5 \qquad (3.45)$$
$$\tfrac{1}{2}w_1 + w_2 = 5 \;\Rightarrow\; w_2 = 5 - \tfrac{1}{2}(5) = \tfrac{5}{2} \qquad (3.46)$$
$$\tfrac{3}{2}w_1 - \tfrac{7}{5}w_2 + w_3 = 3 \;\Rightarrow\; w_3 = 3 + \tfrac{7}{5}\cdot\tfrac{5}{2} - \tfrac{3}{2}(5) = -1 \qquad (3.47)$$

$Ux = w$ implies:

$$\begin{pmatrix} 2 & 1 & -1\\ 0 & 5/2 & 5/2\\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 5\\ 5/2\\ -1 \end{pmatrix} \qquad (3.48)$$

By back substitution,
$$x_3 = -1 \qquad (3.49)$$
$$\tfrac{5}{2}x_2 + \tfrac{5}{2}x_3 = \tfrac{5}{2} \;\Rightarrow\; x_2 = 1 - x_3 = 1 - (-1) = 2 \qquad (3.50, 3.51)$$
$$2x_1 + x_2 - x_3 = 5 \;\Rightarrow\; x_1 = \frac{5 - x_2 + x_3}{2} = \frac{5 - 2 + (-1)}{2} = 1 \qquad (3.52, 3.53)$$

The solution set is therefore
$$x = 1, \quad y = 2, \quad z = -1. \qquad (3.54)$$
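For illustration, a compact C++ sketch of the same procedure is given below: a Doolittle
decomposition with a unit diagonal in L (equations 3.20–3.29), followed by forward
substitution for Lw = y and back substitution for Ux = w. The sketch is ours, assumes no
pivoting is needed, and uses the made-up name luSolve.

#include <iostream>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Doolittle LU decomposition (unit diagonal in L), then Lw = y and Ux = w.
void luSolve(const Matrix& a, const std::vector<double>& y, std::vector<double>& x) {
    const std::size_t n = a.size();
    Matrix L(n, std::vector<double>(n, 0.0)), U(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        L[i][i] = 1.0;
        for (std::size_t j = i; j < n; ++j) {       // row i of U
            double s = 0.0;
            for (std::size_t k = 0; k < i; ++k) s += L[i][k] * U[k][j];
            U[i][j] = a[i][j] - s;
        }
        for (std::size_t j = i + 1; j < n; ++j) {   // column i of L
            double s = 0.0;
            for (std::size_t k = 0; k < i; ++k) s += L[j][k] * U[k][i];
            L[j][i] = (a[j][i] - s) / U[i][i];
        }
    }
    std::vector<double> w(n);
    for (std::size_t i = 0; i < n; ++i) {           // forward substitution: Lw = y
        double s = y[i];
        for (std::size_t k = 0; k < i; ++k) s -= L[i][k] * w[k];
        w[i] = s;
    }
    x.assign(n, 0.0);
    for (std::size_t i = n; i-- > 0; ) {            // back substitution: Ux = w
        double s = w[i];
        for (std::size_t k = i + 1; k < n; ++k) s -= U[i][k] * x[k];
        x[i] = s / U[i][i];
    }
}

int main() {
    Matrix a = {{2, 1, -1}, {1, 3, 2}, {3, -2, -4}};   // the worked example above
    std::vector<double> y = {5, 5, 3}, x;
    luSolve(a, y, x);
    std::cout << x[0] << ' ' << x[1] << ' ' << x[2] << '\n';   // prints 1 2 -1
}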

3.5 Jacobi Iteration


Given the system of equations

$$a_1 x + b_1 y + c_1 z = d_1 \qquad (3.55)$$
$$a_2 x + b_2 y + c_2 z = d_2 \qquad (3.56)$$
$$a_3 x + b_3 y + c_3 z = d_3 \qquad (3.57)$$

solving for $x$, $y$ and $z$, respectively, gives

$$x = \frac{1}{a_1}\left(d_1 - b_1 y - c_1 z\right) \qquad (3.58)$$
$$y = \frac{1}{b_2}\left(d_2 - a_2 x - c_2 z\right) \qquad (3.59)$$
$$z = \frac{1}{c_3}\left(d_3 - a_3 x - b_3 y\right) \qquad (3.60)$$

It is easy to see that, provided the diagonal elements are large relative to the other
coefficients, the sequence of iterations will converge.

For initial values $x_0$, $y_0$ and $z_0$, the scheme would be as shown below:

$$x_1 = \frac{1}{a_1}\left(d_1 - b_1 y_0 - c_1 z_0\right) \qquad (3.61)$$
$$y_1 = \frac{1}{b_2}\left(d_2 - a_2 x_0 - c_2 z_0\right) \qquad (3.62)$$
$$z_1 = \frac{1}{c_3}\left(d_3 - a_3 x_0 - b_3 y_0\right) \qquad (3.63)$$

We can now write, for $n = 0$ and above,

$$x_{n+1} = \frac{1}{a_1}\left(d_1 - b_1 y_n - c_1 z_n\right) \qquad (3.64)$$
$$y_{n+1} = \frac{1}{b_2}\left(d_2 - a_2 x_n - c_2 z_n\right) \qquad (3.65)$$
$$z_{n+1} = \frac{1}{c_3}\left(d_3 - a_3 x_n - b_3 y_n\right) \qquad (3.66)$$

The sequence of iterations continues until there is convergence, in the sense that
$|x_{n+1} - x_n|$, $|y_{n+1} - y_n|$ and $|z_{n+1} - z_n|$ are all less than the prescribed tolerance.

Example
We shall solve the following system of equations using the Jacobi iteration method.

$$25x + y - z = 28 \qquad (3.67)$$
$$x + 30y + 2z = 59 \qquad (3.68)$$
$$3x - 2y - 20z = 19 \qquad (3.69)$$

Equivalently,

$$x = \frac{28 - y + z}{25}, \qquad y = \frac{59 - x - 2z}{30}, \qquad z = \frac{3x - 2y - 19}{20} \qquad (3.70)$$

Let us assume that the initial guess of the solution is $(0, 0, 0)$.

Then, the first set of values for the iteration is:

$$x_1 = \frac{28 - 0 + 0}{25} = \frac{28}{25} = 1.12 \qquad (3.71)$$
$$y_1 = \frac{59 - 0 - 0}{30} = \frac{59}{30} = 1.96666667 \qquad (3.72)$$
$$z_1 = \frac{0 - 0 - 19}{20} = -\frac{19}{20} = -0.95 \qquad (3.73)$$

The second set is:

$$x_2 = \frac{28 - 59/30 + (-19/20)}{25} = 1.00333333 \qquad (3.74)$$
$$y_2 = \frac{59 - 1.12 - 2(-0.95)}{30} = 1.99266667 \qquad (3.75)$$
$$z_2 = \frac{3(1.12) - 2(59/30) - 19}{20} = -0.97866667 \qquad (3.76)$$

Table 3.1 shows the rest of the computation.

Table 3.1: Table for Jacobi iteration

n x y z
1 1.12000000 1.96666667 -0.95000000
2 1.00333333 1.99266667 -0.97866667
3 1.00114667 1.99846667 -0.99876667
4 1.00011067 1.99987956 -0.99967467
5 1.00001783 1.99997462 -0.99997136
6 1.00000216 1.99999750 -0.99999479
7 1.00000031 1.99999958 -0.99999943
8 1.00000004 1.99999995 -0.99999991
9 1.00000001 1.99999999 -0.99999999
10 1.00000000 2.00000000 -1.00000000

3.6 Gauss-Seidel Iteration


You would recall that in each of the Jacobi iterations, we calculated the new values of the
variables using only the old values. The Gauss-Seidel iteration is a modification of this
method, in which the value of x obtained in a particular iteration and the old value of z are
put into the formula for y to obtain a new value for y. The new values of x and y are then
substituted into the equation for z.

Thus, given the system of equations

$$a_1 x + b_1 y + c_1 z = d_1 \qquad (3.77)$$
$$a_2 x + b_2 y + c_2 z = d_2 \qquad (3.78)$$
$$a_3 x + b_3 y + c_3 z = d_3 \qquad (3.79)$$

with the initial condition $x_0$, $y_0$, $z_0$,

$$x_1 = \frac{1}{a_1}\left(d_1 - b_1 y_0 - c_1 z_0\right) \qquad (3.80)$$
$$y_1 = \frac{1}{b_2}\left(d_2 - a_2 x_1 - c_2 z_0\right) \qquad (3.81)$$
$$z_1 = \frac{1}{c_3}\left(d_3 - a_3 x_1 - b_3 y_1\right) \qquad (3.82)$$

and, in general,

$$x_{n+1} = \frac{1}{a_1}\left(d_1 - b_1 y_n - c_1 z_n\right) \qquad (3.83)$$
$$y_{n+1} = \frac{1}{b_2}\left(d_2 - a_2 x_{n+1} - c_2 z_n\right) \qquad (3.84)$$
$$z_{n+1} = \frac{1}{c_3}\left(d_3 - a_3 x_{n+1} - b_3 y_{n+1}\right) \qquad (3.85)$$

As in the case of the Jacobi iteration, the sequence of iterations continues until there is
convergence, in the sense that $|x_{n+1} - x_n|$, $|y_{n+1} - y_n|$ and $|z_{n+1} - z_n|$ are all less than the
prescribed tolerance.
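By way of comparison with the Jacobi sketch above, a minimal illustrative C++ version of
the Gauss-Seidel scheme (equations 3.83–3.85) for the same system is given below; the
only change is that each newly computed value is used immediately in the formulas that
follow it.

#include <cmath>
#include <iostream>

int main() {
    double x = 0.0, y = 0.0, z = 0.0;        // initial guess (0, 0, 0)
    const double tol = 1e-8;
    for (int n = 0; n < 100; ++n) {
        // Gauss-Seidel: xn is used at once in computing yn, and xn, yn in computing zn.
        const double xn = (28.0 - y + z) / 25.0;
        const double yn = (59.0 - xn - 2.0 * z) / 30.0;
        const double zn = (3.0 * xn - 2.0 * yn - 19.0) / 20.0;
        const bool converged = std::fabs(xn - x) < tol && std::fabs(yn - y) < tol &&
                               std::fabs(zn - z) < tol;
        x = xn; y = yn; z = zn;
        if (converged) break;
    }
    std::cout << x << ' ' << y << ' ' << z << '\n';   // converges to 1, 2, -1, faster than Jacobi
}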

Example
We shall solve the following system of equations using the Gauss-Seidel iteration
method. Assume $(0, 0, 0)$ is the initial guess of the solution.

$$25x + y - z = 28 \qquad (3.86)$$
$$x + 30y + 2z = 59 \qquad (3.87)$$
$$3x - 2y - 20z = 19 \qquad (3.88)$$

$$x_1 = \frac{28 - y_0 + z_0}{25}, \qquad y_1 = \frac{59 - x_1 - 2z_0}{30}, \qquad z_1 = \frac{3x_1 - 2y_1 - 19}{20}$$

$$x_1 = \frac{28 - 0 + 0}{25} = 1.12$$
$$y_1 = \frac{59 - 1.12 - 0}{30} = 1.929333333$$
$$z_1 = \frac{3(1.12) - 2(1.929333333) - 19}{20} = -0.974933333$$

You can verify the remaining calculations in Table 3.2.

Table 3.2: Table for Gauss-Seidel iteration


n x y z
1 1.12000000 1.92933333 -0.97493333

2 1.00382933 1.99820124 -0.99924572
3 1.00010212 1.99994631 -0.99997931
4 1.00000298 1.99999852 -0.99999941
5 1.00000008 1.99999996 -0.99999998
6 1.00000000 2.00000000 -1.00000000

Observation: As expected, the Gauss-Seidel iteration converged faster than the Jacobi
iteration.

4.0 Conclusion
In this Unit, you learnt various methods of solving a system of linear equations: some
were direct, while others were iterative in nature. You also got to know the merits and the
demerits of direct and iterative methods. In addition, you found out that it is important, in
elementary row operations, to avoid having to deal with fractions, so as to keep rounding
errors minimal.

5.0 Summary
You learnt the following in this Unit:
 How to write a matrix in the form amenable for programming.
 How to numerically solve a set of linear equations.
 That the Gauss-Seidel iteration converges faster than the Jacobi iteration.
 In numerical work, for the sake of avoiding rounding errors, it is better to retain
fractions for as long as possible.
 Iteration is advisable only if the main diagonal elements are large compared with
the other entries of the equivalent matrix.

6.0 Tutor Marked Assignment

1. Solve the system of linear equations $x + y + z = -1$, $x + 2y + 2z = -4$,
$9x + 6y + z = 7$ using the method of
(i) Gaussian elimination
(ii) Gauss-Jordan elimination
(iii) LU decomposition
(iv) Jacobi iteration
(v) Gauss-Seidel iteration

2. Solve the system of linear equations $x + 2y + 2z = -2$, $2x + 2y + z = -4$,
$9x + 6y + 2z = -14$ using the method of
(i) Gaussian elimination
(ii) Gauss-Jordan elimination
(iii) LU decomposition
(iv) Jacobi iteration
(v) Gauss-Seidel iteration

7.0 References/Further Reading

Solutions to Tutor Marked Assignment

1. Solve the system of linear equations $x + y + z = -1$, $x + 2y + 2z = -4$,
$9x + 6y + z = 7$ using the method of
(i) Gaussian elimination
Initial augmented matrix
1 1 1 -1
1 2 2 -4
9 6 1 7

First round of Gaussian elimination


1 1 1 -1
0 1 1 -3
0 -3 -8 16

Second round of Gaussian elimination


1 1 1 -1
0 1 1 -3
0 0 -5 7

(ii) Gauss-Jordan elimination


Last matrix for Gaussian
elimination
1 1 1 -1
0 1 1 -3
0 0 -5 7

First round of Jordan elimination


5 5 0 2
0 5 0 -8
0 0 -5 7

Second round of Jordan elimination


-25 0 0 -50
0 5 0 -8
0 0 -5 7

(iii) LU decomposition

The system is $x + y + z = -1$, $x + 2y + 2z = -4$, $9x + 6y + z = 7$, with coefficient matrix

$$\begin{pmatrix} 1 & 1 & 1\\ 1 & 2 & 2\\ 9 & 6 & 1 \end{pmatrix}$$

Using equations 3.20–3.29:
$$u_{11} = 1, \quad u_{12} = 1, \quad u_{13} = 1, \quad l_{21} = a_{21}/a_{11} = 1, \quad l_{31} = a_{31}/a_{11} = 9,$$
$$u_{22} = a_{22} - l_{21}u_{12} = 2 - 1 = 1, \quad u_{23} = a_{23} - l_{21}u_{13} = 2 - 1 = 1,$$
$$l_{32} = \frac{1}{u_{22}}\left(a_{32} - l_{31}u_{12}\right) = 6 - 9 = -3, \quad u_{33} = a_{33} - l_{31}u_{13} - l_{32}u_{23} = 1 - 9 + 3 = -5.$$

Thus,

$$\begin{pmatrix} 1 & 0 & 0\\ 1 & 1 & 0\\ 9 & -3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1\\ 0 & 1 & 1\\ 0 & 0 & -5 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1\\ 1 & 2 & 2\\ 9 & 6 & 1 \end{pmatrix}$$

We got the decomposition right, as the product of L and U gives the original matrix.

$Lw = y$ gives $w_1 = -1$; $w_1 + w_2 = -4$, so $w_2 = -3$; and $9w_1 - 3w_2 + w_3 = 7$, so
$w_3 = 7 - 9(-1) + 3(-3) = 7$.

$Ux = w$ gives, by back substitution: $-5x_3 = 7$, so $x_3 = -7/5 = -1.4$; $x_2 + x_3 = -3$, so
$x_2 = -3 + 7/5 = -8/5 = -1.6$; and $x_1 + x_2 + x_3 = -1$, so $x_1 = -1 + 8/5 + 7/5 = 2$.

The solution set is therefore $x = 2$, $y = -1.6$, $z = -1.4$.

Notice that, where necessary, we reverted to fractions to avoid incurring rounding errors.

2. Solve the system of equations $25x + 2y - z = 26$, $-3x + 20y - 2z = 15$,
$x + 4y + 15z = 20$ using:
(i) Jacobi iteration
(ii) Gauss-Seidel iteration
Assume a starting set of values $x_0 = y_0 = z_0 = 0$ and a tolerance of
$|x_{i+1} - x_i| < 5\times10^{-6}$, $|y_{i+1} - y_i| < 5\times10^{-6}$, $|z_{i+1} - z_i| < 5\times10^{-6}$.

(i) Jacobi iteration


1.040000 0.750000 1.333333
1.033333 1.039333 1.064000
0.999413 1.011400 0.987289
0.998580 0.998641 0.996999
0.999989 0.999487 1.000457
1.000059 1.000044 1.000138
1.000002 1.000023 0.999984
0.999998 0.999999 0.999994
1.000000 0.999999 1.000001
1.000000 1.000000 1.000000

(ii) Gauss-Seidel iteration


1.040000 0.906000 1.022400
1.008416 1.003502 0.998505
0.999660 0.999799 1.000076
1.000019 1.000010 0.999996
0.999999 0.999999 1.000000
1.000000 1.000000 1.000000

Observation: The Gauss-Seidel iteration scheme converged faster than the
Jacobi iteration, as was expected.

3. Solve the system of linear equations $x + 2y + 2z = -2$, $2x + 2y + z = -4$,
$9x + 6y + 2z = -14$ using the method of
(iv) Gaussian elimination
Initial augmented matrix
1 2 2 -2
2 2 1 -4
9 6 2 -14

First round of Gaussian elimination


1 2 2 -2
0 -2 -3 0
0 -12 -16 4

Second round of Gaussian elimination


1 2 2 -2
0 -2 -3 0
0 0 -4 -8

Answers
x 0
y -3
z 2

(v) Gauss-Jordan elimination


Last matrix for Gaussian
elimination
1 2 2 -2
0 -2 -3 0
0 0 -4 -8

First round of Jordan elimination


4 8 0 -24
0 -8 0 24
0 0 -4 -8

Second round of Jordan elimination


32 0 0 0
0 -8 0 24
0 0 -4 -8


4. Solve the system of equations 25 x  2 y  z  26 , 3 x  20 y  2 z  15 ,


x  4 y  15 z using:
(i) Jacobi iteration
(ii) Gauss-Seidel iteration
Assume a starting set of values x 0  y 0  z 0  0 and a

(i) Jacobi iteration tolerance of | xi 1  xi | 10 7 , | y i 1  y i | 10 7 ,


| z i 1  z i | 10 7 .
x y z
0.571429 -0.300000 0.800000
0.878571 -0.722857 0.805714
0.910816 -0.907714 0.913429
0.962490 -0.937833 0.980922
0.988746 -0.975586 0.982635
0.992054 -0.991511 0.992485
0.996710 -0.994481 0.998194
0.998961 -0.997845 0.998451
0.999293 -0.999221 0.999346
0.999711 -0.999510 0.999830

(ii) Gauss-Seidel iteration, with a tolerance of $|x_{i+1} - x_i| < 5\times10^{-6}$,
$|y_{i+1} - y_i| < 5\times10^{-6}$, $|z_{i+1} - z_i| < 5\times10^{-6}$:

x y z
0.571429 -0.642857 0.942857
0.954082 -0.966735 0.995878
0.996152 -0.997279 0.999681
0.999692 -0.999783 0.999975
0.999976 -0.999983 0.999998
0.999998 -0.999999 1.000000
1.000000 -1.000000 1.000000

UNIT 4: Roots of Algebraic and Transcendental Equations

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Introduction
3.2 Bisection Method
3.2.1 Merits of the Bisection Method
3.2.2 Demerits of the Bisection Method
3.3 Newton-Raphson Method
3.3.1 Merits of the Newton-Raphson Method
3.3.2 Demerits of the Newton-Raphson Method
3.4 Regula-falsi method
3.5 Secant Method
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 References/Further Reading

1.0 Introduction
In Physics, as well as in many other scientific fields, there is always the need to find the
root of an equation. You have no doubt been tackling such problems since your high
school days. However, up till now, you have only handled simple cases that a calculator
could be used to solve. In this Unit, you shall learn how to handle the more complicated
cases of roots of algebraic and transcendental equations.

2.0 Objectives
By the time you are through with this Unit, you should be able to:
 Find the root of an equation or equivalently the zero of a function.
 Compare the various methods of obtaining the zero of a function.

3.0 Main Content

3.1 Introduction
You are probably quite familiar with the concept of the function of a continuous variable
f (x ) , continuous over a certain interval of the independent variable x. If we equate
f (x ) to zero, we obtain the equation f ( x )  0 . You might even see the process as that of
equating two different functions f1 ( x ) and f 2 ( x) , where the latter is identically zero.

[Figure: the parabola f₁(x) = x² − 3x plotted for −0.5 ≤ x ≤ 3.5, together with the
horizontal lines f₂(x) = 0 and f₂(x) = −2.]

Fig. 4.1

Figure 4.1 shows the graph of f₁(x) = x² − 3x. The x-axis can be seen as the function
f₂(x) = 0. Equating the two functions gives f₁(x) = x² − 3x = 0 = f₂(x). The resulting
equation, x² − 3x = 0, has two solutions, x = 0 and x = 3 (both are indicated in
Figure 4.1). Let us 'slide' f₂(x) down to f₂(x) = −2, the lower horizontal line. The
equation becomes x² − 3x = −2. This is perhaps one of the commonest quadratic
equations you ever came across. The solutions are 1.0 and 2.0, which you can also check
on Fig. 4.1. Shifting f₂(x) lower still, to −2.5, would ensure that the resulting equation
has no real solutions, as the curve would not intersect the line.

The equations we have dealt with so far have been such that could easily be solved using
analytical methods. It should be obvious to you that such equations should form a small
subset of a much larger family of equations, the solutions of most of which do not readily
lend themselves to analytical methods, especially as the power of the polynomial being
equated to zero becomes large. Equating a polynomial to zero gives an algebraic
equation. A transcendental function is a function that ‘transcends’ the normal laws of
algebra as it cannot be expressed as a sequence of the algebraic operations of
addition/subtraction, multiplication/division, an example being the square root of another
function. Other examples include logarithmic, trigonometric, exponential functions and
their inverses. If an equation involves the transcendental expressions, such as
exponentials, trigonometric, logarithmic functions, the equation is said to be a
transcendental equation.

We shall assume that the function whose roots we desire, f(x), is a function of x whose
zeros (or the roots of the resulting equation) lie on the real axis. That is, the roots of
the equation f(x) = 0 are real numbers. There are a number of methods of finding the
roots. We shall now take some of these.

3.2 Bisection Method

[Figure: f(x) = x³ − 3x plotted between x = 0.8 and x = 2.4, showing the successive
bisection points x₁, x₂, x₃ and x₄.]

Fig. 4.2

As the name implies, we first obtain two points x₁ and x₂ such that f(x₁)f(x₂) < 0,
meaning that f has opposite signs at the two points; this tells us that a root exists
between x₁ and x₂. We approximate this root by the average of the two, i.e.,
x₃ = (x₁ + x₂)/2. We then evaluate f(x₃). Next, x₃ is combined with whichever of x₁ or
x₂ gives a function value of sign opposite to that of f(x₃). This gives x₄. The process is
repeated until f(x) attains the prescribed tolerance. We have illustrated this in Fig. 4.2
for the root of the equation x³ − 3x = 0, given that the root lies between x₁ = 1.2 and
x₂ = 2.4. Then x₃ = (x₁ + x₂)/2 = 1.8. Since f(x₃) > 0, we combine x₃ with x₁ to arrive at
x₄ = (x₁ + x₃)/2, and so on.

The convergence of the bisection method is slow but steady.

3.2.1 Merits of the Bisection Method


1. As you can see, the root bisection method always converges. This is because you
would get closer and closer to the root as the distance between the two points of
interest is halved each time.
2. You can also keep a tab on the error. If the root lies between the points a and b,
there will be a sequence:

        b_n − a_n = (1/2)(b_(n−1) − a_(n−1)) = (1/4)(b_(n−2) − a_(n−2)) = ... = (b₁ − a₁)/2^(n−1).

    But you would recall that b₁ = b and a₁ = a. Thus, b_n − a_n = (b − a)/2^(n−1). On the
    other hand, we note that the first iteration point x₃ is at least as close to the root as
    half the interval b₁ − a₁, i.e., |x₃ − x| ≤ (b₁ − a₁)/2. Similarly, for the nth iteration,
    |x_n − x| ≤ (b_n − a_n)/2. But b_n − a_n = (b − a)/2^(n−1). Hence,
    |x_n − x| ≤ (1/2)(b_n − a_n) = (1/2)·(b − a)/2^(n−1) = (b − a)/2^n. We conclude that
    |x_n − x| ≤ (b − a)/2^n, and this gives us an idea of the maximum error in our estimate
    of the root.

3.2.2 Demerits of the Bisection Method


1. The convergence is generally slow.
2.  You might actually be approaching a singularity, for example, while dealing with
    functions that are not continuous between the two initial points. A classical example
    is the function f(x) = 1/x, which is negative for x < 0 and positive for x > 0. If you
    start the bisection method with one point to the right of 0 and another to the left of
    0, you are under the impression that there should be a root in between. If the function
    is continuous between the initial guesses, this problem is eliminated.
3.  The bisection method will not work if the function is tangential to the x-axis at the
    desired root. For example, f(x) = x² is tangential to the x-axis at the point x = 0,
    which is the root of the equation x² = 0. The function is positive on either side of
    x = 0, so you could not even start the method, as it requires that the signs on either
    side of the root be different.
4.  If one of the initial points is close to the root, you would still need many iterations to
    arrive at the root.
5.  If there are multiple roots within the given interval, the scheme narrows down on only
    one of them.
6.  It does not work for repeated roots (roots of even multiplicity), since the function does
    not change sign there.

Example: Find a zero of the function f(x) = 2x³ − 3x² − 2x + 3 between the points 1.4
and 1.7, using the bisection method. Take the tolerance to be |x(j+1) − x(j)| ≤ 10⁻⁵.

Solution
f(1.4) = −0.192
f(1.7) = 0.756
x₃ = (1.4 + 1.7)/2 = 1.55
f(1.55) = 1.4025 × 10⁻¹
x₄ = (1.55 + 1.4)/2 = 1.475
f(1.475) = −0.0588
x₅ = (1.55 + 1.475)/2 = 1.5125

You can confirm the remaining iterations, shown in Table 4.1.

Table 4.1: Table for Bisection method

n x f (x )
1 1.55 0.14025
2 1.475 -5.88E-02
3 1.5125 3.22E-02
4 1.49375 -1.54E-02
5 1.503125 7.87E-03
6 1.498437 -3.89E-03
7 1.500781 1.96E-03
8 1.499609 -9.76E-04
9 1.500195 4.89E-04
10 1.499902 -2.44E-04
11 1.500049 1.22E-04
12 1.499976 -6.10E-05
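A short program makes the repeated halving painless. The following C++ sketch (our own
illustrative code, not part of the original text) applies the method to the function of
this example; the loop stops when the bracketing interval is narrower than the tolerance:

#include <cmath>
#include <cstdio>

double f(double x) { return 2*x*x*x - 3*x*x - 2*x + 3; }

// Bisection on [a, b]; assumes f(a) and f(b) have opposite signs.
double bisect(double a, double b, double tol = 1e-5)
{
    double fa = f(a);
    while (fabs(b - a) > tol) {
        double c = 0.5 * (a + b), fc = f(c);
        if (fa * fc <= 0.0) b = c;      // root lies in [a, c]
        else { a = c; fa = fc; }        // root lies in [c, b]
    }
    return 0.5 * (a + b);
}

int main() { printf("%.6f\n", bisect(1.4, 1.7)); return 0; }   // about 1.5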

3.3 Newton-Raphson Method


Consider the Taylor series
    f(x + Δx) = f(x) + f'(x)Δx + f''(x)(Δx)²/2! + ...                  4.1
To a first-order approximation, we can neglect the second- and higher-order terms. In
that case, if f(x + Δx) = 0, we truncate equation 4.1, leaving only the first two terms on
the right. Then,

f ( x  x )  f ( x)  f ' ( x)x  0 4.2


or
 f ( x)
x  , 4.3
f ' ( x)
so that with an initial guess of x0 , we obtain a better approximation x 0  x , i.e.,
f ( x0 )
x1  x0  x  x 0  4.4
f ' ( x0 )

It is quite clear that the function f(x) must be differentiable for you to be able to apply
the Newton-Raphson method.

More generally,
    x(i+1) = x(i) + Δx = x(i) − f(x(i))/f'(x(i))                       4.5
With an initial guess of x₀, we can then generate a sequence x₁, x₂, ..., which we expect
to converge to the root of the equation.

We can rearrange equation 4.5 to obtain

    f'(x(i)) = [0 − f(x(i))] / (x(i+1) − x(i))                         4.6

meaning that the Newton-Raphson method is equivalent to taking the slope of the function
f(x) at the ith iterative point; the next approximation is the point where this tangent
line crosses the x-axis. See Fig. 4.3.

[Figure: the curve f(x) with the tangent at the point (x(i), f(x(i))) extended to meet the
x-axis at x(i+1).]

Fig. 4.3: Graph showing the gradient relationship of the Newton-Raphson method

3.3.1 Merits of the Newton-Raphson Method


1. The Newton-Raphson method has a fast rate of convergence.
2. It can identify repeated roots, since it does not explicitly look for changes in the
sign of f (x ) .
3. It can find complex roots of polynomials if you started with a complex initial
guess.

52
3.3.2 Demerits of the Newton-Raphson Method
1. It requires that we compute both f (x ) and f ' ( x ) , which makes the scheme
taxing.
2.  Some functions might not be so easy to differentiate. In that case, it might be useful
    to use the approximate derivative [f(x + Δx) − f(x)]/Δx.
3.  It is quite sensitive to the initial condition and may diverge for a wrong choice of
    initial point.
4.  It will not work if f'(x) = 0. Also, if the derivative is sufficiently close to zero, the
    sequence may diverge away from the root, or converge very slowly.
5.  If the derivative changes sign at a test point, the sequence may oscillate around a
    point that may not even be a root.
6.  At a repeated root the usual quadratic convergence is lost, and the iteration converges
    only slowly (linearly).

Example: Find the zeros of the function f(x) = 2x³ − 3x² − 2x + 3 using the Newton-
Raphson method, starting with x = 1.4. Take the tolerance to be |x(j+1) − x(j)| ≤ 10⁻⁵.

Solution
f(x) = 2x³ − 3x² − 2x + 3
f'(x) = 6x² − 6x − 2
x₀ = 1.4
x₁ = x₀ − f(x₀)/f'(x₀) = [x₀f'(x₀) − f(x₀)] / f'(x₀)
   = [6x₀³ − 6x₀² − 2x₀ − 2x₀³ + 3x₀² + 2x₀ − 3] / (6x₀² − 6x₀ − 2)
   = (4x₀³ − 3x₀² − 3) / (6x₀² − 6x₀ − 2)
   = [4(1.4)³ − 3(1.4)² − 3] / [6(1.4)² − 6(1.4) − 2]
   = 1.5412

x₁ = 1.5412,  |x₁ − x₀| = 0.1412
x₂ = 1.5035,  |x₂ − x₁| = 0.0377
x₃ = 1.5,     |x₃ − x₂| = 0.0035
x₄ = 1.5,     |x₄ − x₃| = 0

3.4 Regula-falsi method


The regula-falsi, or method of false position, starts from an assumed ('false') trial value
for the solution of the equation and successively improves it.

You would recall that with the root-bisection method, we knew that a root existed
between x₁ and x₂ if the function was smooth and f(x₁)f(x₂) < 0. Let us again choose
two such points, as in the case of root-bisection.

[Figure: the chord joining (x₁, f(x₁)) and (x₂, f(x₂)) crossing the x-axis at x₃.]
Then, for an arbitrary x and the corresponding y,

    (y − f(x₁)) / (x − x₁) = (f(x₂) − f(x₁)) / (x₂ − x₁)               4.9

gives the equation of the chord joining the points (x₁, f(x₁)) and (x₂, f(x₂)).

Setting y = 0, that is, at the point where the chord crosses the x-axis,

    x₃ = x₁ − f(x₁)(x₂ − x₁) / (f(x₂) − f(x₁))                         4.10

Then we evaluate f(x₃). Just as in the case of root-bisection, if its sign is opposite that
of f(x₁), then a root lies between x₁ and x₃, and we replace x₂ by x₃ in equation 4.10.
Otherwise the root lies between x₃ and x₂, and we replace x₁ by x₃. We repeat this
procedure until we are as close to the root as desired.

Example
Find the root of the equation f(x) = 2x³ − 3x² − 2x + 3 = 0 between x = 1.4 and 1.7 by
the regula-falsi method.

f(1.4) = −0.192,  f(1.7) = 0.756

A solution lies between x = 1.4 and 1.7. Let x₁ = 1.4 and x₂ = 1.7. Then,
    x₃ = x₁ − f(x₁)(x₂ − x₁)/(f(x₂) − f(x₁)) = 1.4 − (−0.192)(1.7 − 1.4)/(0.756 − (−0.192))
       = 1.4607595
    f(1.4607595) = −0.088983
The root lies between 1.46076 and 1.7. Let x₁ = 1.46076 and x₂ = 1.7. Then,
    x₄ = x₁ − f(x₁)(x₂ − x₁)/(f(x₂) − f(x₁)) = 1.4607595 − (−0.088983)(1.7 − 1.46076)/(0.756 − (−0.088983))
       = 1.485953

Table 4.2 gives the remaining iterations.

Table 4.2: Table for Regula-falsi method

n x f (x )
1 1.460759 -0.088983
2 1.485953 -0.033938
3 1.495149 -0.011985
4 1.498346 -0.004118
5 1.499439 -0.001401
6 1.499810 -0.000475
7 1.499936 -0.000161
8 1.499978 -0.000055

3.5 Secant Method


In the case of the secant method, it is not necessary that the root lie between the two
initial points. As such, the condition f ( x1 ) f ( x2 )  0 is not needed. Following the same
analysis with the case of the regula-falsi method,
y  f ( x1 ) f ( x 2 )  f ( x1 )
 4.11
x  x1 x 2  x1
Setting y  0 gives
x 2  x1
x3  x 2  f ( x 2 ) 4.12
f ( x 2 )  f ( x1 )

Thus, having found x n , we can obtain x n 1 as,


x n  x n1
x n1  x n  f ( x n ) , n = 2, 3, … 4.13
f ( x n )  f ( x n1 )
By inspection, if f ( x n )  f ( x n 1 )  0 , the sequence does not converge, because the
formula fails to work for x n 1 . The regula-falsi scheme does not have this problem as the
associated sequence always converges.

Example
Find the root of the equation f(x) = 2x³ − 3x² − 2x + 3 = 0 between x = 1.4 and 1.7 by
the secant method.
x₁ = 1.4,  x₂ = 1.7
f(1.4) = −0.192,  f(1.7) = 0.756

Let x₁ = 1.4 and x₂ = 1.7. Then,
    x₃ = x₂ − f(x₂)(x₂ − x₁)/(f(x₂) − f(x₁)) = 1.7 − (0.756)(1.7 − 1.4)/(0.756 − (−0.192))
       = 1.460759
    f(x₃) = −0.088983
    x₄ = x₃ − f(x₃)(x₃ − x₂)/(f(x₃) − f(x₂)) = 1.460759 − (−0.088983)(1.460759 − 1.7)/(−0.088983 − 0.756)
       = 1.485953

You can continue with this scheme. Table 4.3 shows the other values obtained from the
operation.

Table 4.3: Table for Secant Method

n x f (x )
1 1.460759 -0.088983
2 1.485953 -0.033938
3 1.501487 0.003730
4 1.499949 -0.000129
5 1.500000 0.000000
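As a sketch of how the secant iteration might be coded (the names are again ours, and
the guard against f(x_n) = f(x_(n−1)) reflects the remark above):

#include <cmath>
#include <cstdio>

double f(double x) { return 2*x*x*x - 3*x*x - 2*x + 3; }

// Secant iteration starting from two guesses x0, x1 (no sign condition needed).
double secant(double x0, double x1, double tol = 1e-6, int maxIter = 50)
{
    for (int i = 0; i < maxIter; ++i) {
        double f0 = f(x0), f1 = f(x1);
        if (f1 == f0) break;                      // formula would fail
        double x2 = x1 - f1 * (x1 - x0) / (f1 - f0);
        x0 = x1; x1 = x2;
        if (fabs(x1 - x0) < tol) break;
    }
    return x1;
}

int main() { printf("%.6f\n", secant(1.4, 1.7)); return 0; }   // about 1.5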

4.0 Conclusion
In this Unit, you learnt to find the zeros of an algebraic or transcendental function. We
explored a number of methods, and outlined their merits and demerits. We were also able
to estimate the maximum error in the bisection method.

5.0 Summary
In this Unit, you learnt:
 to find the zeros of an algebraic or transcendental function using several methods.
 the merits and demerits of the methods.
 to estimate the maximum error that can be incurred in using the bisection method.

6.0 Tutor Marked Assignment


1. Find the upper bound of the error you are likely to incur in using the bisection
method in finding the root of an equation if the two starting points are 1.4 and 2.5
and you needed 8 steps to achieve the required tolerance.
2.  Find a root of the equation 2x³ − 3x² − 2x = 0.5 using the following methods, with
    the tolerances indicated:
    (i)   Root bisection [starting points 1.9 and 2.1; tolerance |f(x)| < 0.001]
    (ii)  Newton-Raphson [starting point 2.0]
    (iii) Regula-falsi [starting points 1.9 and 2.1]
    (iv)  Secant [starting points 1.9 and 2.1]

3.  Find a root of the equation x = 2 sin x using:
    (i)   The bisection method, given that the root is between 1.5 and 3, with tolerance
          |f(x)| ≤ 0.02.
    (ii)  The Newton-Raphson method, with starting point 1.35 and tolerance |f(x)| ≤ 10⁻⁶.
    (iii) Regula-falsi [starting points 1.5 and 3.0]
    (iv)  Secant [starting points 1.5 and 3.0]

7.0 References/Further Reading

Solutions to Tutor Marked Assignment

1.  Find the upper bound of the error you are likely to incur in using the bisection
    method in finding the root of an equation if the two starting points are 1.4 and 2.5
    and you needed 8 steps to achieve the required tolerance.

    |x_n − x| ≤ (2.5 − 1.4)/2⁸ = 4.297 × 10⁻³

2.  Find a root of the equation 2x³ − 3x² − 2x = 0.5 using the following methods, with
    the tolerances indicated:

    (i) Root bisection [starting points 1.9 and 2.1; tolerance |f(x)| < 0.001]

Iteration
No. xi f ( xi )
1 2 -0.5
2 2.05 2.27E-02
3 2.025 -0.24434
4 2.0375 -0.11224
5 2.04375 -4.51E-02
6 2.046875 -1.13E-02
7 2.048437 5.72E-03
8 2.047656 -2.78E-03
9 2.048047 1.47E-03
10 2.047851 -6.58E-04

(ii) Newton-Raphson [starting point 2.0]

Iteration
No. xi f ( xi )
1 2.05 0.05
2 2.04792 0.002084
3 2.04791 3.81E-06

(iii) Regula-falsi [starting points 1.9 and 2.1]

Iteration
No. xi f ( xi )
1 2.040918 -0.075613
2 2.047610 -0.003287
3 2.047899 -0.000142

(iv) Secant [starting points 1.9 and 2.1]

Iteration
No. xi f ( xi )
1 2.568354 8.457968
2 1.912709 -1.30566
3 2.000387 -0.49613
4 2.054121 6.79E-02
5 2.047653 -2.81E-03
6 2.047911 -1.49E-05

3.  Find a root of the equation x = 2 sin x using:

    (i) The bisection method, given that the root is between 1.5 and 3, with tolerance
        |f(x)| ≤ 0.02.

Iteration
No.        x_i          f(x_i)
1          2.25         0.69385361
2          1.875        -0.0331716
3          2.0625       0.29944043
4          1.96875      0.12503812
5          1.921875     0.04387042
6          1.8984375    0.00482936
7          1.886719     -0.0143012

(ii) Newton-Raphson method, with starting point 1.35 and tolerance |f(x)| ≤ 10⁻⁶.

Iteration
No. xi f ( xi )
1 2.420215 1.099376
2 1.980780 0.146526
3 1.899250 0.006165
4 1.895502 0.000013
5 1.895494 0.000000

(iii) Regula-falsi [starting points 1.5 and 3.0]

Iteration
No. xi f ( xi )
1 1.731106 -0.243250
2 1.835347 -0.095074
3 1.874712 -0.033632

4 1.888467 -0.011464
5 1.893136 -0.003858
6 1.894705 -0.001292
7 1.895230 -0.000432

(iv) Secant [starting points 1.5 and 3.0]

Iteration
No. xi f ( xi )
1 1.731106 -0.243250
2 1.835347 -0.095074
3 1.902230 0.011077
4 1.895251 -0.000399
5 1.895493 -0.000002

UNIT 5: FINITE DIFFERENCES AND INTERPOLATION
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Finite Differences
3.1.1 Forward Differences
3.1.2 Error in Finite Difference Table
3.2 Interpolation
3.2.1 Newton forward interpolation formula
3.2.2 Newton’s Backward Interpolation Formula
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 References/Further Reading

1.0 Introduction
Given the function f (x ) we can evaluate the values of f at different x , thereby
representing a continuous function with a set of discrete data. On the other hand, it could
be that we have a set of data and we would like to see if they could have been got from a
polynomial or if indeed we could represent the points by a polynomial. Finite differences
would help us in this regard. With the aid of finite differences, we shall then derive
Newton’s forward and Newton’s backward interpolation formulas.

2.0 Objectives
At the end of this Unit, you would be able to:
 Deduce a polynomial from its difference table.
 Derive Newton’s forward and Newton’s backward interpolation formulas.
 Fit a polynomial to a given a set of data
 Interpolate and extrapolate with Newton’s forward difference
 Interpolate and extrapolate with Newton’s backward difference

3.0 Main Content


3.1 Finite Differences
We proceed by defining the finite difference
i. First Forward difference:
f i 1  f i  f i 5.1
ii. First Backward difference:
f i 1  f i  f i 5.2
iii. First Central difference:
 f i1  f i1 
 i/2 5.3
2

The table for forward difference would look like Table 5.1. What do you notice about this
table? You can see that y 0 and the differences related to it appear on the first line
slanting down to the right.

Table 5.1: Forward difference Table
x y y 2 y 3 y
x0 y0
y 0
x1 y1 2 y 0
y1 3 y 0
x2 y2 2 y1
y 2 3 y1
x3 y3 2 y 2
y 3
x4 y4

You can see that differences with similar subscripts form a line slanting downward to the
right from the top.

Table 5.2 is the backward difference table.

Table 5.2: Backward difference table


x y y 2 y 3 y
x0 y0
y1
x1 y1  2 y2
y 2  3 y3
x2 y2  2 y3
y 3 3 y4
x3 y3  2 y4
y 4
x4 y4

Can you spot what makes this table unique? Differences with similar subscripts form a
line slanting upward to the right from the bottom.

Note that for the forward difference, Δ²y₀ = Δy₁ − Δy₀, or generally,

    Δ²y_n = Δy_(n+1) − Δy_n                                            5.4

and for the backward difference,

    ∇²y_n = ∇y_n − ∇y_(n−1)                                            5.5

Of course, we can also get a table for central differences, Table 5.3.

Table 5.3: Central difference table


x y y  2y  3y
x0 y0
 y1 / 2
x1 y1 2 y1
 y3 / 2 3 y 3 / 2
x2 y2 2 y 2
 y5 / 2 3 y 5 / 2
x3 y3 2 y 3
 y7 / 2
x4 y4

Do you notice that like subscripts appear on the same row.

3.1.1 Forward Differences


Suppose the given function is f(x) = x² + 2x + 3. We can evaluate f at x = 0, 1, 2, ..., 6,
and then, with the aid of forward differences, arrive at Table 5.4:

Table 5.4: Forward difference table for y = x² + 2x + 3

x y y 2 y
0 3
3
1 6 2
5
2 11 2
7
3 18 2
9
4 27 2
11
5 38 2
13
6 51

The second forward difference produces a constant value of 2.

A similar operation carried out on the function f ( x )  2 x will produce a constant
difference after only one forward difference.

It follows that the number of forward differences needed to achieve a constant value of
difference is the degree of the polynomial, and the constant value in the second forward
difference is the second differential of the function.

Hence,
    d²f/dx² = 2

Integrating,
    df/dx = 2x + c₁

and finally,
    f(x) = x² + c₁x + c₂

The values of the constants c₁ and c₂ are determined from the values of f at different
values of x:
    f(0) = c₂ = 3
    f(1) = 1 + c₁ + 3 = 4 + c₁ = 6
Thus, c₁ = 2.

The function, therefore, is

    f(x) = x² + 2x + 3.

This was the same function we started with. Of course, if what we started with was just
the table, we could then have obtained the polynomial the way we did.

We could extrapolate for values of x not given on the table, such as for x =  2.0 or 7.0
or interpolate for values such as x = 3.5 and 4.2.
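Building a difference table is a purely mechanical task that a short program handles
well. The C++ sketch below (illustrative only; the function name is ours) builds all the
forward-difference columns of a list of equally spaced y values and, run on the table
above, confirms that the second differences are the constant 2:

#include <cstdio>
#include <vector>
using namespace std;

// Build the forward-difference columns of a set of equally spaced y values.
// diff[k][i] holds the k-th forward difference Delta^k y_i.
vector<vector<double>> forwardDifferences(const vector<double>& y)
{
    vector<vector<double>> diff;
    diff.push_back(y);
    while (diff.back().size() > 1) {
        const vector<double>& prev = diff.back();
        vector<double> next(prev.size() - 1);
        for (size_t i = 0; i + 1 < prev.size(); ++i)
            next[i] = prev[i + 1] - prev[i];
        diff.push_back(next);
    }
    return diff;
}

int main()
{
    vector<double> y = {3, 6, 11, 18, 27, 38, 51};   // y = x^2 + 2x + 3, x = 0..6
    auto d = forwardDifferences(y);
    for (double v : d[2]) printf("%g ", v);          // second differences: all 2
    printf("\n");
    return 0;
}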

3.1.2 Error in Finite Difference Table


Consider Table 5.5 for forward difference table into which we have introduced an error
 through x 4 .

Table 5.5: Forward difference table with error
x y y 2 y 3 y 4 y
x0 y0
y 0
x1 y1 2 y 0
y1 3 y 0
x2 y2 2 y1 4 y 0  
y 2 3 y1  
x3 y3 2 y 2   4 y1  4
y 3   3 y 2  3
x4 y4   2 y 3  2 3 y 2  6
y 4   3 y 3  3
x5 y5 2 y 4   3 y 3  4
y 5 3 y 4  
x6 y6 2 y 5 3 y 4  
y 6 3 y5
x7 y7 2 y 6
y 7
x8 y8

The higher the order of the difference, the more entries are contaminated by the error.
Moreover, you would notice that the error terms in each column are ε multiplied by
binomial coefficients with alternating signs: for order 1 they are ε, −ε; for order 2,
ε, −2ε, ε; for order 3, ε, −3ε, 3ε, −ε; and so on. But can you notice one thing? The errors
in each difference column cancel out (they sum to zero). You shall need this property later.

Example
Find the wrong entry in the following table, given that they represent a cubic polynomial.

x 0 1 2 3 4 5 6 7 8
y -2 4 34 106 238 448 754 1174 1726

Solution
The forward difference table is as shown below on the left part of Table 5.6. The right
part of the table would have resulted if there had been no error.

Table 5.6: Forward difference table with error (a) and without error (b)

(a) From the given table:

x    y       Δy      Δ²y     Δ³y
0    -2
             6
1    4               24
             30              18
2    34               42
             72              20
3    106              62
             134             12
4    240              74
             208             24
5    448              98
             306             16
6    754              114
             420             18
7    1174             132
             552
8    1726

(b) What the table would have looked like had there been no error:

x    y       Δy      Δ²y     Δ³y
0    -2
             6
1    4               24
             30              18
2    34               42
             72              18
3    106              60
             132             18
4    238              78
             210             18
5    448              96
             306             18
6    754              114
             420             18
7    1174             132
             552
8    1726

We recall that the third difference should have been a constant. This constant we can
determine by remembering that the errors in a single difference column cancel out. Thus,
the sum of the entries in the column of third forward differences remains the same as it
would have been had there been no error. This sum is 108. We divide it by 6 to arrive at
18: each entry in that column should have been 18. We notice that the contaminated
entries in the table can be traced back to the entry 240 in the values of y. This is the
entry in error. Moreover, y + ε = 240, Δ³y₁ + ε = 20 and Δ³y₂ − 3ε = 12. But Δ³y₁ = Δ³y₂,
implying that
    20 − ε = 12 + 3ε
Solving for ε, 4ε = 8 and ε = 2. Thus, y + 2 = 240, giving y = 238. You can now see that
table (b) of Table 5.6 is what the table would have been had there been no error.

3.2 Interpolation
3.2.1 Newton forward interpolation formula
At times, we would like to represent a set of values ( xi , y i ) with a function, enabling us,
among other things, to be able to interpolate or extrapolate values that are not in the given
set.

Let the interpolating function be a polynomial given by y (x) . Then, we can write the
polynomial as,
y n ( x)  a 0  a1 ( x  x0 )  a 2 ( x  x 0 )( x  x1 )  a3 ( x  x0 )( x  x1 )( x  x 2 )
+ ...  a n ( x  x0 )( x  x1 )...(x  x n1 ) 5.6
y n (x) must be equal to the tabulated values of y. Thus, we require that:
y 0 ( y at x  x 0 ) = a0 5.7
y1 ( y at x  x1 ) = a0  a1 ( x1  x 0 )
which implies
y  a 0 y1  y 0 y 0
a1  1   5.8
x1  x 0 x1  x0 h
y 2 ( y at x  x 2 ) = a0  a1 ( x 2  x 0 )  a 2 ( x 2  x0 )( x 2  x1 )
= a0  a1 ( x 2  x1  x1  x 0 )  a 2 ( x 2  x0 )( x 2  x1 )
= a0  a1 ( x1  x 0 )  a1 ( x 2  x1 )  2h 2 a 2 (since x 2  x 0  2h )
y
= y1  0 h  2h 2 a 2
h
y 2  y1  y1  y 0  2h 2 a 2
from which
y1  y 0 2 y 0
a2   5.9
2h 2 2!h 2
Similarly,
    a₃ = Δ³y₀ / (3! h³)                                                5.10

Putting these values in equation 5.6 gives


y 2 y 0
y ( x)  y 0  0 ( x  x 0 )  ( x  x 0 )( x  x1 )
h 2 !h 2
3 y 0
 ( x  x 0 )( x  x1 )( x  x 2 )  ... 5.11
3! h 2
Now, let x = x₀ + rh. Then,
    x − x₀ = rh,   x − x₁ = x − x₀ + x₀ − x₁ = rh − h = (r − 1)h,
    x − x₂ = x − x₁ + x₁ − x₂ = (r − 1)h − h = (r − 2)h.
Hence, from equation 5.11,

    y(x) = y(x₀ + rh) = y₀ + rΔy₀ + r(r − 1)Δ²y₀/2! + r(r − 1)(r − 2)Δ³y₀/3! + ...
           + r(r − 1)...(r − (n − 1))Δⁿy₀/n! + ...                     5.12

This is Newton's forward interpolation formula.

Note: Newton’s forward interpolation formula is for


(i) interpolating the values of y near the beginning of a set of tabulated values, and
(ii) extrapolating values of y a little to the left of y 0
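Once the leading differences Δᵏy₀ are known, equation 5.12 can be evaluated directly. A
possible C++ sketch (our own illustration; it uses the leading differences of the cubic
fitted in the worked example later in this section, y₀ = 3, Δy₀ = 6, Δ²y₀ = 12, Δ³y₀ = 6) is:

#include <cstdio>
#include <vector>
using namespace std;

// Evaluate Newton's forward interpolation formula at x, given equally spaced
// nodes starting at x0 with spacing h and the leading differences Delta^k y_0.
double newtonForward(double x, double x0, double h, const vector<double>& lead)
{
    double r = (x - x0) / h;
    double term = 1.0, sum = 0.0, fact = 1.0;
    for (size_t k = 0; k < lead.size(); ++k) {
        if (k > 0) {
            term *= (r - (double)(k - 1));   // r(r-1)...(r-k+1)
            fact *= (double)k;               // k!
        }
        sum += term / fact * lead[k];
    }
    return sum;
}

int main()
{
    vector<double> lead = {3, 6, 12, 6};                  // y0, Dy0, D2y0, D3y0
    printf("%g\n", newtonForward(3.0, 1.0, 1.0, lead));   // 27, i.e. y(3)
    printf("%g\n", newtonForward(0.95, 1.0, 1.0, lead));  // about 2.907375
    return 0;
}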

3.2.2 Newton’s Backward Interpolation Formula


Let us choose y n (x) in the form,
y n ( x)  a0  a1 ( x  x n )  a 2 ( x  x n )( x  x n 1 )
 ...  a n ( x  x n )( x  x n 1 )...(x  x1 ) 5.13

y n (x) must be equal to the tabulated values of y. Thus, we require that:


y n ( y at x  x n ) = a0 5.14
y n 1 ( y at x  x n 1 ) = a0  a1 ( x n1  x n )

y  a 0 y n1  y n y 0
a1  n1   5.15
x n 1  x n x n1  x n h
y n 2 ( y at x  x n 2 ) = a0  a1 ( x n 2  x n )  a 2 ( x n  2  x n )( x n 2  x n1 )
= a0  a1 (2h)  a 2 (2h)( h)
y
y n 2 = y n  2h n  2h 2 a 2
h
= y n  2( y n  y n1 )  2h 2 a 2
= 2 y n1  y n  2h 2 a 2

We can then write


1
a 2  2 ( y n 2  y n  2 y n 1 )
2h
2
But  y n  y n  y n 1  ( y n  y n 1 )  ( y n 1  y n 2 )  y n 2  y n  2 y n 1
Hence,
1
a2  2  2 y n 5.16
2h

Similarly,
1
a3   3 yn 5.17
3 h3

1
an  n
 n yn 5.18
nh
Putting these values in equation 5.13 yields,
( x  xn ) ( x  x n )( x  x n 1 ) 2
y n (x) = y n  y n   y n  ...
h 2h 2
( x  x m )( x  x m 1 )...(x  x1 ) m
 ...   y m  ...
mh m
Setting x  x n  rh , x  x n  rh , x  x n1  x  x n  x n  x n1  rh  h  (r  1)h .
Similarly, x  x n 2  x  x n1  x n 1  x n 2 = (r  1)h  h  (r  2)h . Thus,
x  x1  [r  (n  1)]h .

r (r  1) 2 r (r  1)(r  2) 3
y ( x)  y ( x0  rh)  y n  ry n   yn   y n  ...
2! 3!
r (r  1)...(r  (n  1)) n
+…  y n  ...
n!
This is the Newton’s backward interpolation formula.

Note: Newton’s backward interpolation formula is for


(i) interpolating the values of y near the end of a set of tabulated values, and
(ii) extrapolating values of y a little to the right of y n .

Example
Find the cubic polynomial that fits the following table.

x 1 2 3 4
y 3 9 27 63

Solution
The forward difference table gives:

x y  2 3
1 3
6
2 9 12
18 6
3 27 18
36
4 63

The step-size, h, is 1. Let x  x0  rh , with x0  1 .

r (r  1) 2 r (r  1)(r  2) 3
y ( x)  y ( x0  rh)  y 0  ry 0   y0   y 0  ...
2! 3!
r (r  1)...(r  (n  1)) n
+…  y 0  ...
n!
r  x  x0 = x  1 .
Then,
( x  1)( x  2) ( x  1)( x  2)( x  3)
y ( x)  3  ( x  1)  6   12   6  ...
2! 3!
= 3  6 x  6  6( x 2  3 x  2)  ( x 2  3 x  2)( x  3)  ...
= 6 x  3  6 x 2  18 x  12  x 3  3 x 2  2 x  3 x 2  9 x  6  ...

= x3  x  3

Check: Find the value of y when x = 3:

    y(3) = 3³ − 3 + 3 = 27

You could also get the value of y when x = 0.95, a point a little to the left of x₀ = 1:
    y(0.95) = 0.95³ − 0.95 + 3 = 2.907375

Let us solve the same problem with Newton’s backward formula.

x y y 2 y 3 y
1 3
-6
2 9 12
-18 -6
3 27 18
-36
4 63

r  x  x n = x  4 , since h = 1.
r (r  1) 2 r (r  1)(r  2) 3
y ( x )  y n  r y n   yn   y n  ...
2! 3!
r (r  1)...(r  (n  1)) n
+…  y n  ...
n!
( x  4)( x  3) ( x  4)( x  3)( x  2)
= 63  ( x  4)  36   18   6  ...
2 6
= 63  36 x  144  9  ( x 2  7 x  12)  ( x  4)( x 2  5 x  6)
= 63  36 x  144  9 x 2  63 x  108  x 3  5 x 2  6 x  4 x 2  20 x  24

    = x³ − x + 3

Check: Find the value of y when x = 2:

    y(2) = 2³ − 2 + 3 = 9

You could also find y(3.9), x = 3.9 being a point a little to the left of x_n = 4:
    y(3.9) = 3.9³ − 3.9 + 3 = 58.419

4.0 Conclusion
In this Unit, you have learnt how to carry out the three different difference schemes. You
have also learnt how to deduce a polynomial from tabulated data. Moreover, you can now
detect what and where an error has been introduced into a difference table. You also
derived Newton’s forward and backward interpolation formulas. From the interpolation
formulas, you were able get interpolating functions.

5.0 Summary
In this Unit, you leant to do the following:
 Carry out any of the three difference schemes.
 Derive the polynomial that fits a set of tabulated data.
 Derive Newton’s forward interpolation formula.
 Derive Newton’s backward interpolation formula.
 With the aid of Newton’s forward or backward formula, obtain a function that
takes the values in a set of tabulated data.

6.0 Tutor Marked Assignment

1. Carry out the forward, backward, and the central difference schemes on the set of
data provided below:
1 2 3 4 5 6 7
1 12 47 118 237 416 667

2.  Starting with the function 8x³ − 8x² − 2x − 12, draw up a difference table. Deduce
    the equation that fits the data, starting from the table alone.

3. We have deliberately inserted an error in the data in the table below. If the data
represents a cubic polynomial, find which of the entries is in error.
0 1 2 3 4 5 6 7 8 9
-12 -14 16 126 366 778 1416 2326 3556 5154

4. Find the quartic polynomial that fits the following table.


(i) Using the Newton’s forward interpolation formula.
(ii) Using the Newton’s backward interpolation formula.

7.0 References/Further Reading

Solutions to Tutor Marked Assignment
1. Carry out the forward, backward, and the central difference schemes on the set of
data provided below:
1 2 3 4 5 6 7
1 12 47 118 237 416 667

Solution
Forward difference:
1 1
11
2 12 24
35 12
3 47 36
71 12
4 118 48
119 12
5 237 60
179 12
6 416 72
251
7 667

Backward difference:
1    1
            11
2    12             24
            35              12
3    47             36
            71              12
4    118            48
            119             12
5    237            60
            179             12
6    416            72
            251
7    667

The central difference table contains the same numbers, arranged so that differences
with like subscripts lie on the same row (see Table 5.3).

2.  Starting with the function 8x³ − 8x² − 2x − 12, draw up a difference table. Deduce
    the equation that fits the data, starting from the table alone.

Solution
0 -12
-2

1 -14 32
30 48
2 16 80
110 48
3 126 128
238 48
4 364 176
414
5 778

The degree of the polynomial is 3.

    d³y/dx³ = 48 = y'''

Hence, y = 8x³ + c·x²/2 + dx + e.

Substituting, in turn, three different values of x yields c, d and e as −16, −2 and −12,
respectively.
The polynomial is then y = 8x³ − 8x² − 2x − 12.

3. We have deliberately inserted an error in the data in the table below. If the data
represents a cubic polynomial, find which of the entries is in error.
0 1 2 3 4 5 6 7 8 9
-12 -14 16 126 366 778 1416 2326 3556 5154

Solution
x y y 2 y 3 y
0 -12
-2
1 -14 32
30 48
2 16 80
110 50
3 126 130
240 42
4 366 172
412 54
5 778 226
638 46
6 1416 272
910 48
7 2326 320
1230 48
8 3556 368
1598
9 5154

The third difference should have been a constant. The sum of the errors in a single
difference column cancels out. Thus, the sum of the entries in the column representing
the third forward difference remains the same as it would have been had there been no
error. This sum is 336. We divide this by 7 to arrive at 48: each entry in that column
should have been 48. We notice that the contaminated entries in the table can be traced
back to the entry 366 in the values of y. This is the entry in error. Moreover,
y + ε = 366, Δ³y₁ + ε = 50 and Δ³y₂ − 3ε = 42. But Δ³y₁ = Δ³y₂, implying that
    50 − ε = 42 + 3ε
Solving for ε, 4ε = 8 and ε = 2. Thus, y + 2 = 366, giving y = 364. You can now see that
the table below is what the correct table would have been had there been no error.

x y y 2 y 3 y
0 -12
-2
1 -14 32
30 48
2 16 80
110 48
3 126 128

238 48
4 364 176
414 48
5 778 224
638 48
6 1416 272
910 48
7 2326 320
1230 48
8 3556 368
1598
9 5154

4. Find the quartic polynomial that fits the following table.


(i)  Using Newton's forward interpolation formula.
(ii) Using Newton's backward interpolation formula.
x 0 2 4 6 8
y 8 17 230 1230 3972

Solution
The forward difference table gives:
x y y 2 y 3 y 4 y
0 8
9
2 17 204
213 583
4 230 787 372
1000 955
6 1230 1742
2742
8 3972

The step-size, h, is 2. Let x = x₀ + rh, with x₀ = 0 and h = 2. Hence,
    r = (x − x₀)/h = x/2

r (r  1) 2 r (r  1)(r  2) 3
y ( x)  y ( x0  rh)  y 0  ry 0   y0   y 0  ...
2! 3!
r (r  1)...(r  (n  1)) n
+…  y 0  ...
n!
Then,

x x x x x
(  1) (  1)(  2)
x
y ( x)  8   9  2 2  204  2 2 2  583
2 2! 3!
x x x x
(  1)(  2)(  3)
2 2 2 2  372  ...
4!
Simplifying this expression,
31 4 25 3 19 2 25
y ( x)  x  x  x  x8
32 48 4 6

Solving the same problem with Newton’s backward formula.


x y y 2 y 3 y 4 y
0 8
-9
2 17 204
-213 -583
4 230 787 372
-1000 -955
6 1230 1742
-2742
8 3972

x  xn x  8 x
r =    4 , since h = 2.
h 2 2
r (r  1) 2 r (r  1)(r  2) 3
y ( x )  y n  r y n   yn   y n  ...
2! 3!
r (r  1)...(r  (n  1)) n
+…  y n  ...
n!
x x x x x
(  4)(  3) (  4)(  3)(  2)
x
= 3972  (  4)  2742  2 2  1742  2 2 2  955 
2 2 6
x x x x
(  4)(  3)(  2)(  1)
 2 2 2 2  372  ...
6
Simplifying, we yet again arrive at
31 4 25 3 19 2 25
y ( x)  x  x  x  x8
32 48 4 6

Unit 6: Numerical Integration
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 The Newton-Cotes Quadrature Formula
3.2 The Trapezoidal Rule
3.3 Simpson’s one-third rule
3.4 Simpson’s three-eighth rule
3.5 Errors in the Quadrature formulas
3.5.1 Error in the Trapezoidal rule
3.5.2 Error in the Simpson’s one-third rule
3.6 Romberg’s method
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 References/Further Reading

1.0 Introduction
No doubt, before you could get to this stage of your studies, you integrated quite a
number of function analytically. Perhaps you were told at the onset that the process of
analytical integration arose from discretising the function, that is, ‘slicing’ up the function
into vertical bars as shown in Fig. 5.1 and then adding up the areas of the bars in the limit
as the slivers become infinitesimally narrow. Numerical integration goes back to this
idea, and represents a continuous function by a discrete set of points as this is the way the
program compiler can handle data. Numerical integration is called quadrature when the
function is a function of a single variable. In this unit, you shall learn several methods of
integrating a function numerically.

2.0 Objectives
At the end of this Unit, you should be able to:
 Numerically integrate a given function of a single variable between a given set of
limits.
 Know the merits and demerits of various numerical integration schemes.
 Deduce the error involved in approximating an analytical integral with a
numerical integral.

[Figure: the area under f(x) between x = a and x = b, divided into n vertical strips of
width h at the points a = x₀, x₁, ..., x_(n−1), x_n = b.]

Fig. 5.1: Discretisation of the interval of integration

3.1 The Newton-Cotes Quadrature Formula

As you can see, numerical integration is the process of finding the value of a definite
integral,
b
I   f ( x) dx 6.1
a

with a  x  b (Fig. 5.1). An approximate value of the integral is obtained by replacing


the function by an interpolating polynomial. Thus, different formulas for numerical
integration would result for different interpolating formulas. In our own case, we shall be
making use of take Newton’s forward difference formula.

We shall divide the interval [a, b] into n equal subintervals, a = x₀ < x₁ < ... < x_n = b,
such that x_(j+1) − x_j = h, where the step h = (b − a)/n. Hence, we can write
x₁ = x₀ + h, x₂ = x₁ + h = (x₀ + h) + h = x₀ + 2h. It follows, therefore, that
x_r = x₀ + rh. The integral becomes

    I = ∫ f(x) dx   (from x₀ to x_n)                                   6.2

We can write x  x0  qh and dx  hdq . xn  x0  nh .

Let us make a change of variable from x to q: x  x0  qh . Then, q  ( x  x0 ) / h . It


follows, therefore, that when x  x0 , q 0; when x  xn  x0  nh ,
q  ( x n  x 0 ) / h  nh / h  n .

The integral becomes

    I = ∫ f(x₀ + qh)(h dq) = h ∫ f(x₀ + qh) dq   (q from 0 to n)       6.3

Let us approximate f(x) = f(x₀ + qh) by Newton's forward difference formula. Then, from
equation 6.3, and setting y = f(x),

    I = h ∫₀ⁿ [ y₀ + qΔy₀ + q(q − 1)Δ²y₀/2 + q(q − 1)(q − 2)Δ³y₀/6 + ... ] dq      6.4

Integrating and putting in the limits of integration,

    I = nh [ y₀ + (n/2)Δy₀ + n(2n − 3)Δ²y₀/12 + n(n − 2)²Δ³y₀/24
             + (n⁴/5 − 3n³/2 + 11n²/3 − 3n)Δ⁴y₀/4! + ... ]             6.5

This is the Newton-Cotes quadrature formula.

By setting n equal to 1, 2, 3, …, we obtain different integration formulas.

3.2 The Trapezoidal Rule


Suppose we set n equal to 1, and take the curve between two consecutive points to be
linear. Thus, we terminate the series on the right of equation 6.5 at the linear term, as
the higher difference terms (Δ²y₀, Δ³y₀, etc.) would be zero. Then,

    ∫ f(x) dx (from x₀ to x₀ + h) = h[y₀ + (1/2)Δy₀] = h[y₀ + (1/2)(y₁ − y₀)] = (h/2)(y₀ + y₁)   6.6

Similarly,

    ∫ f(x) dx (from x₀ + h to x₀ + 2h) = h[y₁ + (1/2)Δy₁] = (h/2)(y₁ + y₂)                       6.7
    .
    .
    .
    ∫ f(x) dx (from x₀ + (n − 1)h to x₀ + nh) = (h/2)(y_(n−1) + y_n)                             6.8
Do these equations remind you of an old formula for finding the area of a triangle? Yes,
each of them is the area of a trapezium, hence the procedure is known as the trapezoidal
rule.

Adding all these integrals,

    ∫ f(x) dx (from x₀ to x₀ + nh) = (h/2)(y₀ + y₁) + (h/2)(y₁ + y₂) + ... + (h/2)(y_(n−1) + y_n)
                                   = (h/2)[(y₀ + y_n) + 2(y₁ + y₂ + ... + y_(n−1))]              6.9
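Equation 6.9 translates directly into a short routine. The C++ sketch below is one
possible implementation (names ours); the test function anticipates the quadratic used in
the worked exercise later in this Unit:

#include <cstdio>

// Composite trapezoidal rule for f on [a, b] with n subintervals.
double trapezoid(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i) sum += f(a + i * h);
    return h * sum;
}

double g(double x) { return x * x + 3 * x + 1; }

int main() { printf("%.6f\n", trapezoid(g, 0.0, 3.0, 6)); return 0; }   // 25.625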

3.3 Simpson’s one-third rule
We set n = 2 in equation 6.5, and assume the function is quadratic over each pair of
consecutive intervals. Then,

    ∫ f(x) dx (from x₀ to x₀ + 2h) = 2h[y₀ + Δy₀ + (1/6)Δ²y₀]                      6.10
        = 2h[y₀ + (y₁ − y₀) + (1/6)(Δy₁ − Δy₀)]
        = 2h[y₀ + (y₁ − y₀) + (1/6)((y₂ − y₁) − (y₁ − y₀))]
        = 2h[y₀ + (y₁ − y₀) + (1/6)(y₂ − 2y₁ + y₀)]
        = (2h/6)[6y₀ + 6(y₁ − y₀) + (y₂ − 2y₁ + y₀)]
        = 2h(y₀ + 4y₁ + y₂)/6
        = (h/3)(y₀ + 4y₁ + y₂)                                                     6.11

Similarly,

    ∫ f(x) dx (from x₀ + 2h to x₀ + 4h) = (h/3)(y₂ + 4y₃ + y₄)                     6.12
    .
    .
    .
    ∫ f(x) dx (from x₀ + (n − 2)h to x₀ + nh) = (h/3)(y_(n−2) + 4y_(n−1) + y_n)

Adding all these integrals, with the proviso that n be even (this condition is necessary
as each strip spans two consecutive intervals, x_k to x_(k+1) to x_(k+2)),

    ∫ f(x) dx (from x₀ to x₀ + nh)
        = (h/3)[(y₀ + y_n) + 4(y₁ + y₃ + ... + y_(n−1)) + 2(y₂ + y₄ + ... + y_(n−2))]

With the aid of the summation symbol,

    ∫ f(x) dx (from x₀ to x₀ + nh)
        = (h/3)[(y₀ + y_n) + 4 Σ y_i (i odd, 1 ≤ i ≤ n−1) + 2 Σ y_i (i even, 2 ≤ i ≤ n−2)]   6.13

This is Simpson’s one-third rule.
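A corresponding C++ sketch for equation 6.13 (again our own illustrative code; n must
be even):

#include <cstdio>

// Composite Simpson's one-third rule; n must be even.
double simpson13(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double sum = f(a) + f(b);
    for (int i = 1; i < n; ++i)
        sum += (i % 2 ? 4.0 : 2.0) * f(a + i * h);   // 4 for odd i, 2 for even i
    return h / 3.0 * sum;
}

double g(double x) { return x * x + 3 * x + 1; }

int main() { printf("%.6f\n", simpson13(g, 0.0, 3.0, 6)); return 0; }   // 25.5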

3.4 Simpson’s three-eighth rule


In this case, we set n equal to 3 in equation 6.5 and take the curve over each set of three
intervals as a polynomial of order 3:

    ∫ f(x) dx (from x₀ to x₀ + 3h) = 3h[y₀ + (3/2)Δy₀ + (3/4)Δ²y₀ + (1/8)Δ³y₀]     6.14

The student can show that

    ∫ f(x) dx (from x₀ to x₀ + 3h) = (3h/8)(y₀ + 3y₁ + 3y₂ + y₃)                   6.15

Similarly,

    ∫ f(x) dx (from x₀ + 3h to x₀ + 6h) = (3h/8)(y₃ + 3y₄ + 3y₅ + y₆)              6.16
    .
    .
    .
Adding all these integrals, with the proviso that n be a multiple of 3,

    ∫ f(x) dx (from x₀ to x₀ + nh)
        = (3h/8)[(y₀ + y_n) + 3(y₁ + y₂ + y₄ + y₅ + ... + y_(n−1)) + 2(y₃ + y₆ + ... + y_(n−3))]   6.17
This is Simpson’s three-eighth rule.

Exercise: Integrate the following function of x with respect to x using the trapezoidal
rule, Simpson's one-third rule and Simpson's three-eighth rule:
    x² + 3x + 1;  0 ≤ x ≤ 3;  step size 0.5.
Compare your results with the exact value of the integral.

Solution:
x₀ = 0, x₁ = 1/2, x₂ = 1, x₃ = 3/2, x₄ = 2, x₅ = 5/2, x₆ = 3
y₀ = 1, y₁ = 2.75, y₂ = 5, y₃ = 7.75, y₄ = 11, y₅ = 14.75, y₆ = 19

i.  Trapezoidal rule

    Integral = (h/2)[y₀ + y₆ + 2(y₁ + y₂ + y₃ + y₄ + y₅)] = 25.625

ii. Simpson's one-third rule

    Integral = (h/3)[y₀ + y₆ + 4(y₁ + y₃ + y₅) + 2(y₂ + y₄)] = 25.5

The exact integral is 25.5.

The Simpson’s one-third rule is a second order approximation to the integral. Since the
function is quadratic, an accurate result is obtained.

(iii) Simpson’s three-eighth rule


1 3 5
x 0  0, x1  , x 2  1, x3  , x 4  2, x5  , x6  3
2 2 2
y 0  1, y1  2.75, y 2  5, y 3  7.75, y 4  11, y 5  14.75, y 6  19

82
3h
Integral = [( y 0  y 6 )  3( y1  y 2  y 4  y 5 )  2 y 3 ]
8
31
= [(1  19)  3( 2.75  5  11  14.75)  2( 7.75)]
82
= 25.5

3.5 Errors in the Quadrature formulas


We approximated the function f(x) with the polynomial P(x). The error involved in the
approximation is

    E = ∫ₐᵇ f(x) dx − ∫ₐᵇ P(x) dx                                      6.18

The Taylor series expansion of f(x) about x₀ is, with h = x − x₀,

    f(x) = f(x₀) + hf'(x₀) + (h²/2!)f''(x₀) + ...                      6.19

3.5.1 Error in the Trapezoidal rule


x0  h x0  h  h2 
 x0 f ( x) dx   x0  f ( x0 )  hf ' ( x0 )  2! f ' ' ( x0 )  ... dx 6.20

1
In the first interval, [ x0 , x1 ] , trapezoidal rule gives an area ( f ( x0 )  f ( x1 )) . The
2
integral (integrating term by term) on the right side of equation 6.20 gives
h2 h3
hf ( x0 )  f ' ( x0 )  f ' ' ( x0 )  ... 6.21
2 3  2!
When x  x1 , f ( x)  f ( x1 ) . Thus, from equation 6.19,
h2
f ( x1 )  f ( x 0 )  hf ' ( x 0 )  f ' ' ( x 0 )  ... 6.22
2!
Thus,
h h  h2 
( f ( x 0 )  f ( x1 )) =  f ( x0 )   f ( x 0 )  hf ' ( x 0 )  f ' ' ( x0 )  ...
2 2  2! 
h h2 
= 2 f ( x 0 )  hf ' ( x0 )  f ' ' ( x0 )  ... 6.23
2 2! 
The error in the first interval is therefore (5.21 minus 5.23):

 h2 h3   h2 h3 
 hf ( x 0 )  f ' ( x 0 )  f ' ' ( x 0 )  ...   hf ( x 0 )  f ' ( x 0 )  f ' ' ( x0 )  ...
 2 3  2!   2 2  2! 

 1 1  3 h3
=    h f ' ' ( x 0 )  ... =  f ' ' ( x0 )  ... 6.24
 3  2 2  2 ! 12

In the next interval, [x₁, x₂], the error is −(h³/12)f''(x₁) − ..., and so on.

The total error, therefore, is

    E ≈ −(h³/12)[f''(x₀) + f''(x₁) + f''(x₂) + ... + f''(x_(n−1))]                     6.25

If the largest value of the sequence of second derivatives at the discrete values of x is
f''(x̂), then we can write

    E ≈ −(nh³/12) f''(x̂) = −((b − a)h²/12) f''(x̂)                                      6.26

since n = (b − a)/h.
since n  (b  a ) / h

3.5.2 Error in the Simpson’s one-third rule


x0  2 h x0  2 h  h2 
 x0 f ( x ) dx   x0  0 f ( x )  hf ' ( x 0 )  f ' ' ( x 0 )  ... dx 6.27
2! 

In the first interval, [ x0 , x1 ] , Simpson’s one-third rule gives an area


h
( f ( x0 )  4 f ( x1 )  f ( x 2 )) . The integral on the right side of equation 6.27 gives
3
4h 2 8h 3
2hf ( x 0 )  f ' ( x0 )  f ' ' ( x 0 )  ... 6.28
2! 3!
When x  x1 , f ( x)  f ( x1 ) . Thus, from equation 6.19,
h2 h3
f ( x1 )  f ( x 0 )  hf ' ( x0 ) 
f ' ' ( x0 )  f ' ' ' ( x0 )  ... 6.29
2! 3!
Setting x  x0  2h , f ( x)  f ( x2 ) and
4h 2 8h 3
f ( x 2 )  f ( x 0 )  2hf ' ( x 0 )  f ' ' ( x0 )  f ' ' ' ( x 0 )  ...
2! 3! 6.30
Putting equations 6.29 and 6.30 into equation 6.27 and equating to the approximate
integral in the interval x0 to x0  2h ,
h
( f ( x0 )  4 f ( x1 )  f ( x 2 ))
3
h  h2 h3 
=  f ( x0 )  4 f ( x0 )  hf ' ( x0 )  f ' ' ( x0 )  f ' ' ' ( x0 )  ...
3  2! 3! 
 4h 2 8h 3 
  f ( x 0 )  2hf ' ( x0 )  f ' ' ( x0 )  f ' ' ' ( x0 )  ...
 2! 3! 
2 3 5
4h 8h 5h (iv )
= 2hf ( x0 )  f ' ( x0 )  f ' ' ( x0 )  f (0)  ... 6.31
2! 3  2! 18

The error in the interval [x₀, x₂] is therefore (6.28, the exact expansion, minus 6.31):

    [2hf(x₀) + (4h²/2!)f'(x₀) + (8h³/3!)f''(x₀) + (16h⁴/4!)f'''(x₀) + (32h⁵/5!)f⁽ⁱᵛ⁾(x₀) + ...]
  − [2hf(x₀) + (4h²/2!)f'(x₀) + (8h³/3!)f''(x₀) + (2h⁴/3)f'''(x₀) + (5h⁵/18)f⁽ⁱᵛ⁾(x₀) + ...]
        = (4/15 − 5/18) h⁵ f⁽ⁱᵛ⁾(x₀) + ... = −(h⁵/90) f⁽ⁱᵛ⁾(x₀) − ...                  6.32

In the next double interval, [x₂, x₄], the error is −(h⁵/90)f⁽ⁱᵛ⁾(x₂) − ..., and so on.

The total error, therefore, is

    E ≈ −(h⁵/90)[f⁽ⁱᵛ⁾(x₀) + f⁽ⁱᵛ⁾(x₂) + f⁽ⁱᵛ⁾(x₄) + ...]                               6.33

If the largest value of the fourth derivative at the discrete values of x is f⁽ⁱᵛ⁾(x̂), then
we can write

    E ≈ −(nh⁵/90) f⁽ⁱᵛ⁾(x̂) = −((b − a)h⁴/180) f⁽ⁱᵛ⁾(x̂)                                 6.34

since n = (b − a)/(2h), n being the number of double intervals.

3.6 Romberg’s method


Yet again, we refer to the integral,
b
I   f ( x) dx
a

You would recall that the error in trapezoidal rule in a subinterval h is


(b  a)h 2
E f ' ' ( x)
12
Thus, for subinterval of width h1 , the error in the integral is
2
(b  a )h1
E1   f ' ' ( x) 6.35
12
For subinterval of width h2 , the error in the integral is,
2
(b  a)h2
E2   f ''(x) 6.36
12
We expect that f ' ' ( x) and f ' ' ( x ) would be almost equal. Dividing equation 6.35 by
equation 6.36,
    E₁/E₂ = h₁²/h₂²                                                    6.37

It follows that E₂ = E₁(h₂²/h₁²), so that E₂ − E₁ = E₁(h₂²/h₁² − 1) = E₁(h₂² − h₁²)/h₁².
Hence,

    E₁/(E₂ − E₁) = h₁²/(h₂² − h₁²)                                     6.38
We also note that adding the error to the estimate gives the correct integral I:
    I = I₁ + E₁ = I₂ + E₂                                              6.39
from which
    E₂ − E₁ = I₁ − I₂                                                  6.40

From E₁/(E₂ − E₁) = h₁²/(h₂² − h₁²),

    E₁ = [h₁²/(h₂² − h₁²)](E₂ − E₁) = [h₁²/(h₂² − h₁²)](I₁ − I₂)       6.41

We can therefore write

    I = I₁ + E₁ = I₁ + [h₁²/(h₂² − h₁²)](I₁ − I₂)
      = [I₁(h₂² − h₁²) + h₁²(I₁ − I₂)] / (h₂² − h₁²) = (I₁h₂² − I₂h₁²) / (h₂² − h₁²)   6.42
This is a better approximation to the integral, I. Why, do you think?

Let us take a situation where h₁ = h and h₂ = h/2. Then, equation 6.42 gives

    I = [I₁(h/2)² − I₂h²] / [(h/2)² − h²] = [I₁(h²/4) − I₂h²] / [(h²/4) − h²]          6.43

Multiplying numerator and denominator by 4/h², and denoting this I by I(h, h/2),

    I(h, h/2) = (4I₂ − I₁)/3                                           6.44

But I₁ = I(h) and I₂ = I(h/2). We can therefore write

    I(h, h/2) = [4I(h/2) − I(h)]/3                                     6.45

We can develop the scheme below by applying equation 6.45 to the estimates of the
integral over successively halved intervals.

I(h)
            I(h, h/2)
I(h/2)                        I(h, h/2, h/4)
            I(h/2, h/4)                          I(h, h/2, h/4, h/8)
I(h/4)                        I(h/2, h/4, h/8)
            I(h/4, h/8)
I(h/8)
We continue the table until successive values converge. This gives a better result than
could have been obtained with the trapezoidal rule.

Example
3
2
Let us once again solve the problem  (x  3 x  1)dx.
0

Solution:
Let us choose h = 1.0, 0.5 and 0.25. Then the following table is obtained.

I(h) = 26
                    I(h, h/2) = (4 × 25.625 − 26)/3 = 25.5
I(h/2) = 25.625
                    I(h/2, h/4) = (4 × 25.53125 − 25.625)/3 = 25.5
I(h/4) = 25.53125

25.5 is a better approximation to the integral than the trapezoidal method. Indeed, in this
case, it is the exact integral.
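The whole table can be automated. The C++ sketch below (our own illustration) builds
the first column from trapezoidal estimates with 1, 2, 4, ... subintervals and fills in the
later columns by Richardson extrapolation; the first extrapolated column is exactly
equation 6.45, while the later columns use the analogous factors 4², 4³, ..., which goes a
little beyond what was derived above:

#include <cmath>
#include <cstdio>

double g(double x) { return x * x + 3 * x + 1; }

// Trapezoidal estimate with n subintervals (first Romberg column).
double trap(double a, double b, int n)
{
    double h = (b - a) / n, s = 0.5 * (g(a) + g(b));
    for (int i = 1; i < n; ++i) s += g(a + i * h);
    return h * s;
}

// A small Romberg table: successive halving plus Richardson extrapolation.
double romberg(double a, double b, int levels = 4)
{
    double R[8][8];
    for (int i = 0; i < levels; ++i) {
        R[i][0] = trap(a, b, 1 << i);                 // h, h/2, h/4, ...
        for (int j = 1; j <= i; ++j)                  // extrapolated columns
            R[i][j] = (pow(4.0, j) * R[i][j - 1] - R[i - 1][j - 1]) / (pow(4.0, j) - 1.0);
    }
    return R[levels - 1][levels - 1];
}

int main() { printf("%.6f\n", romberg(0.0, 3.0)); return 0; }   // 25.5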

4.0 Conclusion
In this Unit, you derived the Newton-Cotes quadrature formula. Also, you learnt how to
carry out, with several methods, the numerical integration of a function between a given
limit of integration. Having found the error in the quadrature formula for the different
integration methods, you were able to link up with the Romberg method of numerical
integration.

5.0 Summary
In this Unit, you were able to:
 Derive the Newton-Cotes quadrature formula, and, consequently, the different
formulas for integrating a function between limits.
 Estimate the error in the quadrature formula for different numerical integration
methods

6.0 Tutor Marked Assignment


1.  Integrate the function x(t) = (1/2)t² + (5/2)t + 2, 0 ≤ t ≤ 0.6, with six intervals,
    using the following methods:
(i) Trapezoidal rule
(ii) Simpson’s one-third rule
(iii) Simpson’s three-eighth rule

 /2
2. Evaluate the integral  x sin x dx (where x is in radians) with a step-size of
0

x   / 16 , using
(i) Trapezoidal rule
(ii) Simpson’s one-third rule
(iii) Simpson’s three-eighth rule

7.0 References/Further Reading

Solutions to Tutor Marked Assignment

1.  Integrate the function x(t) = (1/2)t² + (5/2)t + 2, 0 ≤ t ≤ 0.6, with six intervals,
    using the following methods:
(i) Trapezoidal rule
(ii) Simpson’s one-third rule

Formula for the trapezoidal rule:

    ∫ₐᵇ f(x) dx ≈ (Δx/2)[y₀ + 2(y₁ + y₂ + ... + y_(n−1)) + y_n]

Formula for Simpson's one-third rule:

    ∫ₐᵇ f(x) dx ≈ (Δx/3)[y₀ + 4(y₁ + y₃ + ... + y_(n−1)) + 2(y₂ + y₄ + ... + y_(n−2)) + y_n]

Trapezo Simpson
Step 0.1
x f (x )
0 2 First value of f (x ) = 2 2
0.1 2.255
0.2 2.52 33.7 4 times sum of odd
0.3 2.795 Sum of intermediate values of f (x ) 28.05 Simpson’s 1-3 rule
0.4 3.08 (Trapezoidal rule) 11.2 2 times sum of even
0.5 3.375
0.6 3.68 Last value of f (x ) = 3.68 3.68
33.73 50.58
Results Answer 1.6865 1.686

Analytical solution:

    ∫₀^0.6 (2 + 2.5t + t²/2) dt = [2t + 2.5t²/2 + t³/6] (from 0 to 0.6) = 1.686

(iii) Simpson’s three-eighth rule


    Integral = (3h/8)[(y₀ + y₆) + 3(y₁ + y₂ + y₄ + y₅) + 2y₃]
             = (3(0.1)/8)[(2 + 3.68) + 3(2.255 + 2.52 + 3.08 + 3.375) + 2(2.795)]
             = 1.686

 /2
2. Evaluate the integral  x sin x dx (where x is in radians) with a step-size of
0

x   / 16 , using
(i) Trapezoidal rule
(ii) Simpson’s one-third rule

Working with radians

Trapezoidal Simpson’s 1-3


x xsinx
First value of
0 0 f (x ) = 0 0
0.19635 0.038306
4 times
sum of
0.392699 0.150279 10.11957735 odd
Sum of
intermediate
values of
0.589049 0.327258 f (x ) 8.6479081
2 times
sum of
0.785398 0.55536 3.588119468 even
0.981748 0.816293
Last value of
1.178097 1.08842 f (x ) = 1.5707963 1.570796328
1.374447 1.348037
1.570796 1.570796
10.218704 15.27849315
Answer 1.0032188 0.99997483

Unit 7: Initial Value Problems of Ordinary Differential Equations

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Reduction of a higher order ODE to a system of first order ODE
3.2 Methods of Solving First Order Ordinary Differential Equations
3.2.1 Picard’s Method
3.2.2 Euler Method
3.2.3 Modified Euler Method
3.2.4 Runge-Kutta Methods
3.3 Fourth-Order Runge-Kutta Scheme for a System of Three Equations
4.0 Conclusion
5.0 Summary
6.0 Tutor Marked Assignment
7.0 References/Further Reading

1.0 Introduction
Ordinary differential equations abound in Physics. This is because we often have to deal
with a rate of change of function of a single variable. It could be a time-rate of change,
say velocity or acceleration, or it could be a spatial rate of change as you would expect
from the variation of temperature over a metallic bar heated at one end at any particular
fixed instant of time. Unlike analytic differentiation of a function, which is most times
achievable, the larger number of functions do not lend themselves to analytical
integration. We therefore have to resort to numerical integration when confronted with
such functions. In this Unit, you shall learn how to numerically integrate a function of a
single variable.

2.0 Objectives
By the end of this Unit, you should be able to:
 Write an nth order ordinary differential equation in terms of n first order ordinary
differential equations.
 Solve a first order ordinary differential equation.
 Solve a system of first order ordinary differential equations.

3.1 Reduction of a higher order ODE to a system of first order ODE


Every ordinary differential equation can be put in the form
    dy/dx = f(y, x) = y'                                               7.1
or a system of such equations. As an example, take the equation of simple harmonic
oscillation,
    x'' + ω²x = 0                                                      7.2
where ω is the angular frequency of oscillation.
Let
    z = x'                                                             7.3
Then,
    z' = −ω²x                                                          7.4
The last two equations form a system of ordinary differential equations. Likewise, an nth
order ordinary differential equation can be written as a set of n first order ordinary
differential equations. Thus, it suffices to solve the ordinary differential equation
dy/dx = f(y, x).

Example
The Henon-Heiles system of equations leads to chaotic motion. We can reduce the two
second-order differential equations to four first order ordinary differential equations. We
can then solve the equations with the methods to be learnt later in this Unit.

The Henon-Heiles Hamiltonian is

    H = (p₁² + p₂²)/2 + (q₁² + q₂²)/2 + q₁²q₂ − (1/3)q₂³

The resulting equations of motion are

    d²q₁/dt² = −(q₁ + 2q₁q₂)
    d²q₂/dt² = −(q₂ + q₁² − q₂²)

Each of these equations can be broken up into two first order ordinary differential
equations:

    dq₁/dt = p₁
    dp₁/dt = −(q₁ + 2q₁q₂)

    dq₂/dt = p₂
    dp₂/dt = −(q₂ + q₁² − q₂²)

3.2 Methods of Solving First Order Ordinary Differential Equations


We shall now take a look at the various methods of solving a first order ordinary
differential equation.

3.2.1 Picard’s Method


Given the ordinary differential equation
dy
 f ( x, y ) 7.5
dx

we can write
    dy = f(x, y) dx                                                    7.6
Integrating both sides,
    ∫ dy (from y₀ to y) = ∫ f(x, y) dx (from x₀ to x)                  7.7
Then,
    y = y₀ + ∫ f(x, y) dx (from x₀ to x)                               7.8

We take, as a first approximation to the solution y(x), the value of y when x = x₀, that
is, y₀. Then,
    y₁ = y₀ + ∫ f(x, y₀) dx (from x₀ to x)                             7.9

The next approximation to y, that is, y₂, is obtained by substituting y₁ under the integral:
    y₂ = y₀ + ∫ f(x, y₁) dx (from x₀ to x)                             7.10

Thus, we obtain a sequence of approximations to y which will converge to the solution of
the ordinary differential equation provided the function f(x, y) is bounded in a region
about (x₀, y₀) and satisfies the Lipschitz condition:
    |f(x, y) − f(x, ŷ)| ≤ M |y − ŷ|                                    7.11
where M is a constant.

Obviously, a drawback to this method is that most times, the function has to be a simple
function that can be easily integrated. As we have discussed before, only a limited class
of functions satisfies this condition.

3.2.2 Euler Method


We discretize the ordinary differential equation 7.1 as
    (y_(j+1) − y_j)/Δx = f(y_j, x_j)                                   7.12
from which it follows that
    y_(j+1) = y_j + Δx · f(y_j, x_j)                                   7.13
This method is self-starting, but is so low in accuracy that it is rarely ever used in
serious computational work.

Example: With the aid of the Euler method, calculate y(0.8), given the differential
equation
    dy/dx = x + y;  y(0) = 0;  with h = 0.2

Solution:
y_(j+1) = y_j + h · f(y_j, x_j)

j = 0:  y₀ = 0, x₀ = 0;  f(y₀, x₀) = 0 + 0 = 0
        y₁ = y₀ + h · f(y₀, x₀) = 0 + 0.2 × 0 = 0

j = 1:  y₁ = 0, x₁ = 0.2;  f(y₁, x₁) = 0 + 0.2 = 0.2
        y₂ = y₁ + h · f(y₁, x₁) = 0 + 0.2 × 0.2 = 0.04

j = 2:  y₂ = 0.04, x₂ = 0.4;  f(y₂, x₂) = 0.04 + 0.4 = 0.44
        y₃ = y₂ + h · f(y₂, x₂) = 0.04 + 0.2 × 0.44 = 0.128

3.2.3 Modified Euler Method


We could write equation 7.1 as
dy
 f ( x, y )
dx
dy  f ( x, y ) dx
Integrating,
x1
y1  y 0   f ( x, y )dx
x0

Rearranging and generalizing,

    y_(j+1) = y_j + ∫ f(x, y) dx (from x_j to x_(j+1))

With the aid of the trapezoidal rule, we can write the last equation as
    y_(j+1) = y_j + (h/2)[f(x_j, y_j) + f(x_(j+1), y_(j+1))]           7.14
Indeed, it is best to write equation 7.14 as
    y_(j+1)^(i+1) = y_j + (h/2)[f(x_j, y_j) + f(x_(j+1), y_(j+1)^(i))]           7.15
This is the modified Euler method. It is an implicit scheme: the unknown y_(j+1) appears
on both sides, so it is found by iteration.
The starting value y_(j+1)^(0) is obtained from an explicit formula, e.g. the Euler formula.
Thus, the scheme would look like (for j = 0):

    i = 0:  y₁^(1) = y₀ + (h/2)[f(x₀, y₀) + f(x₁, y₁^(0))]
    i = 1:  y₁^(2) = y₀ + (h/2)[f(x₀, y₀) + f(x₁, y₁^(1))]

This is continued until convergence is achieved.

Example
Using the modified Euler method, find y (0.2) , if y '( x  y )  0 , given that y (0)  1 .
Take a step length of 0.1 and the tolerance as | y i( k )  y i( k 1) |  0.0001 .

Solution
x0  0 , y 0  1

Using Euler’s formula,
y1(0 )  y 0  hf ( x0 , y 0 ) = 1  0.1  (0  1) = 1.1

We now apply the modified Euler formula.


h
i  0 y1(1)  y 0  [ f ( x0 , y 0 )  f ( x1 , y1(0 ) )]
2
= 1  0.05[(0  1)  (0.1  1.1)] = 1.11
h
i  1 y1( 2 )  y 0  [ f ( x 0 , y 0 )  f ( x1 , y1(1) )]
2
= 1  0.05[(0  1)  (0.1  1.11)] = 1.1105
| y1( 2)  y1(1) | = 1.1105 - 1.11  0.0005
h
i2 y1( 3)  y 0  [ f ( x 0 , y 0 )  f ( x1 , y1( 2 ) )]
2
= 1  0.05[(0  1)  (0.1  1.1105)] = 1.110525
| y1( 3)  y1( 2) | = 1.110525 - 1.1105  0.000025
With the tolerance satisfied, we can now proceed to get y 2 , that is, y (0.2)

Using Euler’s formula,


y 2(0)  y1  hf ( x1 , y1 ) = 1.110525  0.1  (0  1.110525) = 1.23155

We now apply the modified Euler formula.


h
i  0 y 2(1)  y1  [ f ( x1 , y1 )  f ( x 2 , y 2(0 ) )]
2
= 1.1105  0.05[(0.1  1.1105)  (0.2  1.23155)] = 1.242603
h
i  1 y 2( 2 )  y1  [ f ( x1 , y1 )  f ( x 2 , y 2(1) )]
2
= 1.1105  0.05[(0.1  1.1105)  (0.2  1.242603)] = 1.243155
| y 2( 2)  y 2(1) | = 1.243155 - 1.242603  0.000552
h
i2 y 2(3)  y1  [ f ( x1 , y1 )  f ( x 2 , y 2( 2 ) )]
2
= 1.1105  0.05[(0.1  1.1105)  (0.2  1.243155)] = 1.243183
| y 2( 3)  y 2( 2) | = 1.243183 - 1.243155  0.000028
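The predictor-corrector structure of the example can be captured in a short routine. The
following C++ sketch is one possible implementation (the names and the fixed iteration
cap are ours); run for two steps of h = 0.1 it gives y(0.1) ≈ 1.110525 and y(0.2) ≈ 1.2432,
in line with the values above:

#include <cmath>
#include <cstdio>

double f(double x, double y) { return x + y; }   // y' = x + y of the example

// One modified-Euler step: Euler predictor, then iterate the trapezoidal
// corrector until successive corrections agree to tol.
double modifiedEulerStep(double x, double y, double h, double tol = 1e-4)
{
    double yNext = y + h * f(x, y);                       // predictor (Euler)
    for (int i = 0; i < 50; ++i) {
        double yNew = y + 0.5 * h * (f(x, y) + f(x + h, yNext));
        if (fabs(yNew - yNext) < tol) { yNext = yNew; break; }
        yNext = yNew;
    }
    return yNext;
}

int main()
{
    double x = 0.0, y = 1.0, h = 0.1;
    for (int j = 0; j < 2; ++j) {                         // two steps: y(0.1), y(0.2)
        y = modifiedEulerStep(x, y, h);
        x += h;
        printf("y(%.1f) = %.6f\n", x, y);
    }
    return 0;
}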

3.2.4 Runge-Kutta Methods


We recall that Taylor’s series is given as
f ' ' ( x )(x ) 2
f ( x  x )  f ( x )  f ' ( x )x  
2!
This we can write as (if we set f ( x  x)  y1 , f ( x )  y 0 ) and x  h )

95
f ' ' ( x) h 2
y1  y 0  hf ' ( x)   ...
2!
A first order approximation to the series is
y1  y 0  hf ' ( x) 7.16
This is the Runge-Kutta first order method, which you would also notice is the Euler
method.

On the other hand, we recall equation 7.14,


h
y j 1  y j  [ f ( x j , y j )  f ( x j 1 , y j 1 )]
2
Writing y j 1  y j  hf ( x j , y j ) in equation 7.14 (the modified Euler formula),
h
y j 1  y j  [ f ( x j , y j )  f ( x j 1 , y j  hf ( x j , y j ))]
2
Let us set
hf ( x j , y j )  k1
and
hf ( x j 1 , y j  hf ( x j , y j ))  hf ( x j 1 , y j  k1 )  k 2
Then, we can write
1
y j 1  y j  [ k1  k 2 ]
2 7.17
This is the second-order Runge-Kutta formula.

Example
Find the value of y at x = 0.2 if y' + 2y = 0; y(0) = 1, step-length 0.2.

Solution

y_{j+1} = y_j + (h/2)[f(x_j, y_j) + f(x_{j+1}, y_j + h f(x_j, y_j))]
y_1 = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_0 + h f(x_0, y_0))]
Let us set
h f(x_0, y_0) = k_1 = 0.2 * (-2 y_0) = 0.2 * [-2(1)] = -0.4
and
h f(x_1, y_0 + k_1) = k_2 = 0.2 * [-2(1 - 0.4)] = -0.24
Then, we can write
y_{j+1} = y_j + (1/2)[k_1 + k_2]
y_1 = y_0 + (1/2)[k_1 + k_2] = 1 + (1/2)[-0.4 - 0.24]

Hence,

y(0.2) = 0.68
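As a quick check, a minimal C++ sketch of this second-order Runge-Kutta step (with purely illustrative names) is given below; it prints 0.68 for y(0.2).

#include <iostream>
using namespace std;

float f (float x, float y)
{
return -2.0*y;                 // right-hand side of y' + 2y = 0
}

int main ()
{
float x = 0.0, y = 1.0, h = 0.2;
float k1 = h*f(x, y);          // k1 = h f(x_j, y_j)
float k2 = h*f(x + h, y + k1); // k2 = h f(x_{j+1}, y_j + k1)
y = y + 0.5*(k1 + k2);         // y_{j+1} = y_j + (k1 + k2)/2
cout << "y(0.2) = " << y << "\n";
return (0);
}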

The formula for the third-order Runge-Kutta method is

y_{j+1} = y_j + (1/6)(k_1 + 4k_2 + k_3)                                 7.18
where
k_1 = h f(x_j, y_j)
k_2 = h f(x_j + h/2, y_j + k_1/2)
k_3 = h f(x_j + h, y_j - k_1 + 2k_2)

Example
Using the third-order Runge-Kutta method, find the value of y when x = 0.2, given that
y' = x - y, y(0) = 2, with step length 0.1.

Solution
j = 0:
k_1 = h f(x_0, y_0) = 0.1 * (0 - 2) = -0.2
k_2 = h f(x_0 + h/2, y_0 + k_1/2)
    = 0.1 * f(0.05, 2 - 0.1) = 0.1 * (0.05 - 1.9) = -0.185
k_3 = h f(x_0 + h, y_0 - k_1 + 2k_2)
    = 0.1 * f(0.1, 2 + 0.2 - 0.37) = 0.1 * f(0.1, 1.83) = 0.1 * (0.1 - 1.83) = -0.173
y_1 = y_0 + (1/6)(k_1 + 4k_2 + k_3) = 2 + (1/6)[-0.2 + 4 * (-0.185) + (-0.173)] = 1.8145

j = 1:
k_1 = h f(x_1, y_1) = 0.1 * (0.1 - 1.8145) = -0.17145
k_2 = h f(x_1 + h/2, y_1 + k_1/2)
    = 0.1 * f(0.15, 1.8145 - 0.085725) = 0.1 * (0.15 - 1.728775) = -0.157878
k_3 = h f(x_1 + h, y_1 - k_1 + 2k_2)
    = 0.1 * f(0.2, 1.8145 + 0.17145 - 0.315755) = 0.1 * (0.2 - 1.670195) = -0.147020
y_2 = y_1 + (1/6)(k_1 + 4k_2 + k_3)
    = 1.8145 + (1/6)[-0.17145 + 4 * (-0.157878) + (-0.147020)]
    = 1.656170

Hence, y(0.2) ≈ 1.6562.
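The same calculation can be carried out in a few lines of C++. The sketch below (names illustrative only) advances two steps of the third-order Runge-Kutta method and reproduces the values y(0.1) = 1.8145 and y(0.2) = 1.6562 obtained above.

#include <iostream>
using namespace std;

float f (float x, float y)
{
return x - y;                            // right-hand side of the equation
}

int main ()
{
float x = 0.0, y = 2.0, h = 0.1;
for (int j = 0; j < 2; j++)
{
float k1 = h*f(x, y);
float k2 = h*f(x + 0.5*h, y + 0.5*k1);
float k3 = h*f(x + h, y - k1 + 2.0*k2);
y = y + (k1 + 4.0*k2 + k3)/6.0;          // third-order update, equation 7.18
x = x + h;
cout << x << "  " << y << "\n";
}
return (0);
}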

Fourth Order Runge-Kutta Method


The formula is, where h is the step-length,

y_{j+1} = y_j + h[(1/6) f(y_j, x_j) + (1/3) f(y*_{j+1/2}, x_{j+1/2})
                + (1/3) f(y**_{j+1/2}, x_{j+1/2}) + (1/6) f(y*_{j+1}, x_{j+1})]     7.19
The computation follows the order
(i)    f(y_j, x_j)                                                      7.20
(ii)   x_{j+1/2} = x_j + h/2                                            7.21
(iii)  y*_{j+1/2} = y_j + (h/2) f(y_j, x_j)                             7.22
(iv)   y**_{j+1/2} = y_j + (h/2) f(y*_{j+1/2}, x_{j+1/2})               7.23
(v)    y*_{j+1} = y_j + h f(y**_{j+1/2}, x_{j+1/2})                     7.24
(vi)   Evaluate y_{j+1} with equation 7.19.

Another, equivalent, computation scheme is as follows:

(i)    f(y_j, x_j)                                                      7.25
(ii)   k_1 = h f(x_0, y_0)                                              7.26
(iii)  k_2 = h f(x_0 + h/2, y_0 + k_1/2)                                7.27
(iv)   k_3 = h f(x_0 + h/2, y_0 + k_2/2)                                7.28
(v)    k_4 = h f(x_0 + h, y_0 + k_3)                                    7.29
(vi)   k = (1/6)(k_1 + 2k_2 + 2k_3 + k_4)                               7.30
(vii)  Evaluate y_{j+1} = y_j + k

Example: Solve the following ordinary differential equation using the Runge-Kutta
fourth order method.
dy/dx = y + x;  y(0) = 1. Find y at x = 0.2.

Solution:
x_0 = 0, y_0 = 1, h = 0.2, f(x_0, y_0) = 1
k_1 = h f(x_0, y_0) = 0.2 * 1 = 0.2
k_2 = h f(x_0 + h/2, y_0 + k_1/2) = 0.2 * f(0.1, 1.1) = 0.2400
k_3 = h f(x_0 + h/2, y_0 + k_2/2) = 0.2 * f(0.1, 1.12) = 0.2440
k_4 = h f(x_0 + h, y_0 + k_3) = 0.2 * f(0.2, 1.244) = 0.2888
k = (1/6)(k_1 + 2k_2 + 2k_3 + k_4)
  = (1/6)(0.2 + 0.48 + 0.488 + 0.2888) = 0.2428
Hence, y_1 = y_0 + k = 1.2428

The fourth-order Runge-Kutta method is the most accurate of the Runge-Kutta methods.
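A minimal C++ sketch of one fourth-order Runge-Kutta step for this example is given below (the names are illustrative); it prints 1.2428 for y(0.2).

#include <iostream>
using namespace std;

float f (float x, float y)
{
return y + x;                                  // right-hand side of the equation
}

int main ()
{
float x = 0.0, y = 1.0, h = 0.2;
float k1 = h*f(x, y);
float k2 = h*f(x + 0.5*h, y + 0.5*k1);
float k3 = h*f(x + 0.5*h, y + 0.5*k2);
float k4 = h*f(x + h, y + k3);
float k = (k1 + 2.0*k2 + 2.0*k3 + k4)/6.0;     // weighted average of the four slopes
y = y + k;
cout << "y(0.2) = " << y << "\n";
return (0);
}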

3.3 Fourth-Order Runge-Kutta Scheme for a System of Three Equations


We shall solve the Lorenz system of equations with the fourth-order Runge-Kutta method.
The equations are:
dx/dt = 10(y - x) = f_1(x, y, z)
dy/dt = 100x - y - xz = f_2(x, y, z)
dz/dt = xy - 2z = f_3(x, y, z)

Let us make use of the k-based scheme given in equations 7.25-7.30.

There will be three k_1's, one each for the three variables, three k_2's, and so on.
k_1x = h f_1(t_0, x_0, y_0, z_0)
k_1y = h f_2(t_0, x_0, y_0, z_0)
k_1z = h f_3(t_0, x_0, y_0, z_0)

k_2x = h f_1(t_0 + h/2, x_0 + k_1x/2, y_0 + k_1y/2, z_0 + k_1z/2)
k_2y = h f_2(t_0 + h/2, x_0 + k_1x/2, y_0 + k_1y/2, z_0 + k_1z/2)
k_2z = h f_3(t_0 + h/2, x_0 + k_1x/2, y_0 + k_1y/2, z_0 + k_1z/2)

k_3x = h f_1(t_0 + h/2, x_0 + k_2x/2, y_0 + k_2y/2, z_0 + k_2z/2)
k_3y = h f_2(t_0 + h/2, x_0 + k_2x/2, y_0 + k_2y/2, z_0 + k_2z/2)
k_3z = h f_3(t_0 + h/2, x_0 + k_2x/2, y_0 + k_2y/2, z_0 + k_2z/2)

k_4x = h f_1(t_0 + h, x_0 + k_3x, y_0 + k_3y, z_0 + k_3z)
k_4y = h f_2(t_0 + h, x_0 + k_3x, y_0 + k_3y, z_0 + k_3z)
k_4z = h f_3(t_0 + h, x_0 + k_3x, y_0 + k_3y, z_0 + k_3z)

k_x = (1/6)(k_1x + 2k_2x + 2k_3x + k_4x)
k_y = (1/6)(k_1y + 2k_2y + 2k_3y + k_4y)
k_z = (1/6)(k_1z + 2k_2z + 2k_3z + k_4z)

Hence, x_1 = x_0 + k_x
       y_1 = y_0 + k_y
       z_1 = z_0 + k_z
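The scheme above can be coded directly. The following is a minimal C++ sketch of a single fourth-order Runge-Kutta step for the Lorenz system as written above; the initial values x = 2, y = 3, z = 5 and the step length h = 0.01 are assumed purely for illustration. Since the right-hand sides do not depend on t explicitly, t is not passed to the functions.

#include <iostream>
using namespace std;

float f1 (float x, float y, float z) { return 10.0*(y - x); }
float f2 (float x, float y, float z) { return 100.0*x - y - x*z; }
float f3 (float x, float y, float z) { return x*y - 2.0*z; }

int main ()
{
float x = 2.0, y = 3.0, z = 5.0;   // assumed initial values, for illustration only
float h = 0.01;                    // step length

float k1x = h*f1(x, y, z), k1y = h*f2(x, y, z), k1z = h*f3(x, y, z);
float k2x = h*f1(x + 0.5*k1x, y + 0.5*k1y, z + 0.5*k1z);
float k2y = h*f2(x + 0.5*k1x, y + 0.5*k1y, z + 0.5*k1z);
float k2z = h*f3(x + 0.5*k1x, y + 0.5*k1y, z + 0.5*k1z);
float k3x = h*f1(x + 0.5*k2x, y + 0.5*k2y, z + 0.5*k2z);
float k3y = h*f2(x + 0.5*k2x, y + 0.5*k2y, z + 0.5*k2z);
float k3z = h*f3(x + 0.5*k2x, y + 0.5*k2y, z + 0.5*k2z);
float k4x = h*f1(x + k3x, y + k3y, z + k3z);
float k4y = h*f2(x + k3x, y + k3y, z + k3z);
float k4z = h*f3(x + k3x, y + k3y, z + k3z);

x = x + (k1x + 2.0*k2x + 2.0*k3x + k4x)/6.0;
y = y + (k1y + 2.0*k2y + 2.0*k3y + k4y)/6.0;
z = z + (k1z + 2.0*k2z + 2.0*k3z + k4z)/6.0;

cout << x << "  " << y << "  " << z << "\n";
return (0);
}

Placing the whole block inside a loop, and updating t at the end of each pass, advances the solution over as many steps as required.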

4.0 Conclusion
In this Unit, you got to know how to reduce an nth order ordinary differential equation to n
first order differential equations. In particular, you were able to see how a pair of second-order
ordinary differential equations was reduced to four first-order differential equations.
You also learnt various methods of solving a first order ordinary differential equation.

5.0 Summary
In this Unit, you learnt how to:
 Reduce an nth order ordinary differential equation to n first order ordinary
differential equations.
 Numerically solve a first order ordinary differential equation.
 Numerically solve a system of first order ordinary differential equations.

6.0 Tutor Marked Assignment


1. Given that dy/dx = xy^2; y(0) = -1, evaluate y(0.2) (step length 0.2), using the
(i) Modified Euler method.
(ii) Fourth order Runge-Kutta method.

2. With a step length of 0.1, find the value of y at x = 0.2, given the ordinary
differential equation dy/dx - y + x = 0; y(0) = 0, using the
(i) Second-order Runge-Kutta method.
(ii) Fourth-order Runge-Kutta method.

7.0 References/Further Reading

Solutions to Tutor Marked Assignment

1. Given that dy/dx = xy^2; y(0) = -1, evaluate y(0.2) (step length 0.2), using the
(i) Modified Euler method.
(ii) Fourth order Runge-Kutta method.

Solution
x_0 = 0, y_0 = -1
(i) First, using Euler's formula,
y_1^(0) = y_0 + h f(x_0, y_0) = y_0 + h x_0 y_0^2 = -1 + 0.2 * 0 * (-1)^2 = -1

We can now apply the modified Euler formula.

i = 0:  y_1^(1) = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_1^(0))]
                = -1 + 0.1[0 * (-1)^2 + 0.2 * (-1)^2] = -0.98
i = 1:  y_1^(2) = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_1^(1))]
                = -1 + 0.1[0 * (-1)^2 + 0.2 * (-0.98)^2] = -0.980792
|y_1^(2) - y_1^(1)| = |-0.980792 - (-0.98)| = 0.000792
i = 2:  y_1^(3) = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_1^(2))]
                = -1 + 0.1[0 * (-1)^2 + 0.2 * (-0.980792)^2] = -0.980761
|y_1^(3) - y_1^(2)| = |-0.980761 - (-0.980792)| = 0.000031
Hence, y(0.2) ≈ -0.980761.

(ii) Fourth-order Runge-Kutta method.


x_0 = 0, y_0 = -1, h = 0.2, f(x, y) = xy^2
k_1 = h f(x_0, y_0)
k_2 = h f(x_0 + h/2, y_0 + k_1/2)
k_3 = h f(x_0 + h/2, y_0 + k_2/2)
k_4 = h f(x_0 + h, y_0 + k_3)
k = (1/6)(k_1 + 2k_2 + 2k_3 + k_4)
Hence, y_1 = y_0 + k

With x(0) = 0, y(0) = -1 and step size 0.2, the computation gives:

        x used    y used      k value
k_1     0         -1          0
k_2     0.1       -1          0.02
k_3     0.1       -0.99       0.0196
k_4     0.2       -0.9804     0.03845
k                             0.01961
y_1                           -0.98039

Hence, y(0.2) ≈ -0.98039.

2. With a step length of 0.1, find the value of y at x = 0.2, given the ordinary
differential equation dy/dx - y + x = 0; y(0) = 0, using the
(i) Second-order Runge-Kutta method.
(ii) Fourth-order Runge-Kutta method.

Solution
(i) Second-order Runge-Kutta method, with f(x, y) = y - x:
k_1 = h f(x_0, y_0) = 0.1 * (y_0 - x_0) = 0.1 * (0 - 0) = 0
k_2 = h f(x_1, y_0 + k_1) = 0.1 * [(0 + 0) - 0.1] = -0.01
y_1 = y_0 + (1/2)[k_1 + k_2] = 0 + (1/2)[0 - 0.01] = -0.005

k_1 = h f(x_1, y_1) = 0.1 * (-0.005 - 0.1) = -0.0105
k_2 = h f(x_2, y_1 + k_1) = 0.1 * [(-0.005 - 0.0105) - 0.2] = -0.02155
y_2 = y_1 + (1/2)[k_1 + k_2]
    = -0.005 + (1/2)[-0.0105 - 0.02155] = -0.021025

(ii) Fourth-order Runge-Kutta method, with f(x, y) = y - x:

               First step               Second step
               (x_0 = 0, y_0 = 0)       (x_1 = 0.1, y_1 = -0.00517)
k_1            0                        -0.0105171
k_2            -0.005                   -0.0160429
k_3            -0.00525                 -0.0163192
k_4            -0.01053                 -0.022149
k              -0.00517                 -0.0162317
               y_1 = -0.00517           y_2 = -0.0214026

Hence, y(0.2) ≈ -0.0214.

Elements of C++ Programming

In this chapter we take a look at C++ programming as a tool for numerical analysis. This
should not be taken as a substitute for a good book on C++ programming. Indeed, space
permits us to treat only what you need in order to write scientific programs.

As is usual with most books on C++ programming language, it would be in order to start
with a simple program, the ‘Hello World.’

#include <iostream>
using namespace std;

int main ()
{
// Program to write 'Hello World' on the screen.

cout << "Hello World";


return (0);
}

We shall now examine this program with a view to familiarizing you with the simplest
program in C++ language.

#include <iostream>
A line that begins with # is a directive for the preprocessor; this one tells the preprocessor
to include the iostream standard file. This line is necessary as we shall be making use of
input or output (in this particular case, the standard output stream, cout). The symbol << is
the insertion operator. In the program, the insertion operator inserts the string "Hello
World" into the output stream cout.

using namespace std;


namespace contains all the elements of the standard C++ library. This expression enables
us to use the elements of the standard C++ library.

int main ()
This is the statement that begins the definition of the main function. All C++ programs
are executed beginning from this statement. Thus, it is essential that every C++ program
has a main function.

After the int main () statement, the opening brace '{' signifies the beginning of the code
within the main function. The code ends with a closing brace '}' after the return (0); statement.

// Program to write ‘Hello World’ on the screen


Any statement that begins with two slashes (//) is taken as a comment by the compiler.
Comments are used to make some 'thought sense' of a program. You would be surprised:
a program you wrote a few weeks back might not make any sense anymore if you did not
put in enough comments.

cout << "Hello World";

Note that apart from the #include statement and int main (), every statement in this
program ends with the semi-colon.

The basic ideas of C++ programming can be listed under the following broad headings:
Declaration Statements
Array Dimensioning
Input / Output
Arithmetic / Logical Expressions
Looping
Subroutines and functions

We shall, however, first discuss variables and data types. There are several types:
integer, floating point and string.

Every variable requires an identifier, which distinguishes it from other variables. An

identifier is made up of one or more letters, digits and underscore characters. It cannot
begin with a digit; it usually begins with a letter, although it might begin with an underscore
sign, provided it does not clash with the identifiers reserved for the compiler.

Basic Data Types


It is necessary at this point to mention that the byte (8 bits) is the unit of representation in
C++; the bit is the smallest unit. A byte can store a single character or a small integer.
Integers could be signed or unsigned. A byte can store an integer between 0 and 255 if it is
an unsigned integer. For a signed integer, it can store values between -128 and 127, both
limits inclusive.

Table 1 shows each data type, its size and the range of data that it can take. The size and
range are for a 32-bit system.

Table 1: Data types in C++

Name          Description                                  Size (bytes)   Range

char          A character or small integer                 1              signed: -128 to 127
                                                                          unsigned: 0 to 255
short int     A short integer                              2              signed: -32768 to 32767
                                                                          unsigned: 0 to 65535
int           An integer                                   4              signed: -2147483648 to 2147483647
                                                                          unsigned: 0 to 4294967295
long int      A long integer                               4              signed: -2147483648 to 2147483647
                                                                          unsigned: 0 to 4294967295
float         A floating point number                      4              +/-3.4 x 10^(+/-38)
double        A double precision floating point number     8              +/-1.7 x 10^(+/-308)
long double   A long double precision floating point       8              +/-1.7 x 10^(+/-308)
              number
wchar_t       A wide character                             2 or 4         1 wide character
bool          A Boolean value; it takes true or false      1              true or false

A variable has to be declared before it is used in C++. This is achieved by simply stating
the type of variable it is, for example char, short (short int), int, long (long int), float,
double, long double, bool or wchar_t. This is followed by the variable name. For example,
int number;
or
float age_goat;

Variables of the same type could be declared using the same statement, e.g.,
long number, year;

Moreover, the default is signed. For an unsigned variable, we would need to declare it so.
For example,
unsigned distance;

An exception is char: a plain char has no default sign, so if the sign matters, you have to
declare it signed or unsigned.

Variable names are case-sensitive, meaning that Happy is not the same variable as happy.

Strings
These are non-numeric values that are longer than a single character. It is not a
fundamental type. This necessitates including the header file <string> along with
<iostream>.

The Span of a Variable


A variable in a program could be local or global. A global variable is declared outside all
functions and is visible throughout the program. Local variables are declared within a
block or a function; their scope is limited to the block or the function, usually delineated
by { ... }. Outside the block or the function, the variables are of no relevance. In the
example below, the variable chalk is global, so it has relevance throughout the program,
including the function main (), but chalk_dust has relevance only within the function
minute ().

// Program to demonstrate the span (scope) of variables

#include <iostream>
using namespace std;

float chalk;                  // global variable

void minute ()
{
// The span of chalk_dust is only within the function minute.
float chalk_dust = 0.5;
cout << chalk_dust;
}

int main ()
{
chalk = 3.0;                  // the global variable chalk is visible here
minute ();
return (0);
}

Initialising a Variable
We can fix the initial value of a variable after it might have been appropriately declared,
as in:
int counter;
counter = 5;

On the other hand, we could also set the initial value of a variable as we declare its type,
as in:
int counter = 5;

Another way of initializing a variable is by writing the initial value in parenthesis:


short counter (5);

Strings
These are variables that store non-numeric values longer than a character.

To be able to make use of strings, the programmer would need to include the standard
header file <string>:

#include <iostream>
#include <string>
using namespace std;

int main ()
{
string My_Name;
My_Name = "Johnson";
cout << My_Name;
return 0;
}

Constants
A constant, as the name implies, has a fixed value.

Constants can be further divided into three categories: Literals, Defined constants and
Declared constants.

Literals state the specific values within a program. These can be further divided into
four kinds: integer numerals, floating point numbers, Boolean literals, and character and
string literals.

Integer numerals identify integer decimal, octal (base 8) or hexadecimal (base 16) values.
The last two are expressed, respectively, by putting a prefix 0 and 0x. Thus, decimal 750
is equivalent to 01356 in octal, and 0x2ee in hexadecimal. Integer numerals are by default
integers (int), but we could still declare them unsigned, long or unsigned long by
appending the appropriate suffix (u, l or ul), where it is immaterial whether the letter is
upper or lower case. For example, 750ul.

Floating point numbers are numbers with decimals, and could be with or without
exponents. Examples include 4.1239, 6.64e-34. Floating point literals are of type double
by default, but we can still express a floating point literal as float or long double. In this
case, respectively, we append f or l. For instance, 4.1239f. The appended symbol could be
lower or upper case.

Boolean Literals have only two values: true and false. Their type is bool. For example,
bool decision = true;

Character and String Literals are non-numerical constants. Single characters are enclosed
within single quotes, e.g., ‘t’. A string is expressed within double quotes, “Hello World”
for example.

Declared Constants are constants the user declares with the const keyword. After the
declaration, the values of the constants remain unchanged, as they cannot be modified.
For example, const int Number = 10;

Defined Constants are constants the user might need quite often in a program. A good
example is the number pi. Thus, we could define pi as follows (note that #define takes
no equality sign and no semi-colon):
#define pi 3.142

As is usual with all the lines starting with the hatch sign (#), this is a command for the
preprocessor.
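The short sketch below (the variable names are only illustrative) shows a defined constant and a declared constant working together to compute the area of a circle.

#include <iostream>
using namespace std;

#define pi 3.142                // defined constant, substituted by the preprocessor

int main ()
{
const float radius = 2.0;       // declared constant: its value cannot be modified
cout << pi * radius * radius;   // area of a circle of radius 2
return (0);
}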

OPERATORS
Assignment operator
This is the operator that assigns a value on the right of the equality sign to the variable on
the left. Thus,
int a = 3;
float b = 4.28;

Arithmetic Operators
These are the operators for carrying out the usual arithmetic operations. They are:
Addition +
Subtraction –
Multiplication *
Division /
Modulo %
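As a short illustration (the numbers are arbitrary), the following segment exercises each of the arithmetic operators, including the modulo operator, which gives the remainder of an integer division.

#include <iostream>
using namespace std;

int main ()
{
int a = 17, b = 5;
cout << a + b << "\n";   // 22
cout << a - b << "\n";   // 12
cout << a * b << "\n";   // 85
cout << a / b << "\n";   // 3, since integer division discards the remainder
cout << a % b << "\n";   // 2, the remainder of 17 divided by 5
return (0);
}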

Increase and Decrease


The increase operator is ++ and not +, while the decrease operator is --. Thus,
a++ would mean increase a by 1, or a = a + 1. This could also be written in a compound
way as a += 1. Likewise, decrease by 1 in b would be written as b-- or b-=1.

We could also write ++a or --b. The difference is that in the first form (a++), the number
is incremented after it has been used, while in the second form (++a), the increment is
done before the number is used. Thus,
a = 2;
d = a++;
cout <<d;

the output is 2; d takes the value 2. In this case, a will become 3.

a = 2;
e = ++a;
cout <<e;
the output is 3; e takes the value 3, because the increment had been made before the
number was stored in the variable e. In this case, a is also 3.

Relational Operators
The equality operator for comparing two values is the double equality sign. Thus, if we
would inquire whether variable r is equal to two, we would write
r==2
Note that as a relational, this could be true or false.

The relational operators are:


Equal to ==
Not equal to !=
Less than <
Greater than >
Less than or equal to <=
Greater than or equal to >=

Logical Operators
The (Boolean) logical operators are:
NOT !
AND &&

OR ||

The NOT operator changes True to False and vice versa.


A !A
True False
False True

The AND and the OR operators are used when there are two expressions that will yield a
single relational result. The AND is true only if the two expressions are true. It is false
otherwise.

AND operator
A B A&&B
True True True
True False False
False True False
False False False

The OR operator is true if either expression is true. It is false if both are false.
OR operator
A B A||B
True True True
True False True
False True True
False False False
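The short sketch below (with an arbitrary examination mark) shows the relational and logical operators working together; it prints 1 1 0, since true and false are displayed as 1 and 0.

#include <iostream>
using namespace std;

int main ()
{
int mark = 65;
bool pass = (mark >= 40);                   // relational comparison
bool credit = (mark >= 50) && (mark < 70);  // true only if both conditions hold
bool invalid = (mark < 0) || (mark > 100);  // true if either condition holds
cout << pass << " " << credit << " " << invalid << "\n";
return (0);
}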

The Conditional Operator


The conditional operator is represented by the symbol ?. Thus,
a == b ? v : w returns v if a is equal to b, but returns w if a is not equal to b.

Explicit Type Casting Operator


This allows us, for example, to utilise the integer part of a number that has been declared
as a floating point number:
int Johns_Age;
float JohnsDecimal_Age = 25.36;
Johns_Age = (int) JohnsDecimal_Age;
cout << Johns_Age;

This program segment writes 25 as Johns_Age.

sizeof()
The operator sizeof() takes one parameter (a type or a variable) and gives its length in
bytes. Thus, sizeof(char) is 1, as a character variable has a length of one byte.
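A short illustrative use of sizeof() follows; the values printed for int and double are the typical ones on a 32-bit system and may differ on other machines.

#include <iostream>
using namespace std;

int main ()
{
cout << sizeof(char) << "\n";    // 1
cout << sizeof(int) << "\n";     // typically 4
cout << sizeof(double) << "\n";  // typically 8
return (0);
}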

BASIC INPUT AND OUTPUT STATEMENTS

The basic output statement is the cout. It outputs onto the screen. This, as we have seen
all along, could be used if we included the header file <iostream>. As said earlier << is
the insertion operator.

The basic input statement is the cin. It takes the input from the keyboard. The syntax is
cin >> , where >> is the extraction operator. cin extraction can only take one word,
because it stops whenever a blank space appears. To get an entire line, we use the getline
function.

In the example below, String_var will be given as “I am writing a string with the getline
function”. Later, the String_var will be given the string “It sure is”. You will notice that
String_var would have been replaced by the new string.

#include <iostream>
#include <string>
using namespace std;

int main ()
{
string String_var;
cout << "What am I doing?";
getline (cin, String_var);
cout << "That could be fun";
getline (cin, String_var);
return (0);
}

The screen will show the following (the second and fourth lines are typed in by the user):

What am I doing?
I am writing a string with the getline function.
That could be fun
It sure is

Writing into a file


We would like to write some of our results into an output file. We proceed by opening a
file, for instance, Arearray.txt. To allow us to do this, we need to include the header file
<fstream> and declare an output file stream object:

ofstream myfile;

The file stream can then be used much like cout, as in the following segment (which
belongs inside main ()):

ofstream myfile;
myfile.open ("Arearray.txt");
float Area[5];
float s;
float a[5] = {2.0, 1.5, 4.1, 3.2, 2.3};
float b[5] = {2.0, 4.2, 3.5, 1.9, 1.1};
float c[5] = {2.0, 3.3, 2.4, 1.4, 2.8};
int i=0;
while (i<5) {
s = (a[i]+b[i]+c[i])/2.0;
Area[i]= sqrt(s*(s-a[i])*(s-b[i])*(s-c[i]));   // Hero's formula (needs <math.h>)
myfile <<a[i]<<","<<b[i]<<","<<c[i]<<","<<s<<","<<Area[i]<<"\n";
i++;
}
myfile.close ();

CONTROL STRUCTURES
Central to the concept of control structures is the block. Each block is enclosed in a pair
of braces { }. Thus, the block has one or (usually) more statements enclosed inside a pair
of braces. Note that if the block has only one statement, the braces { } are not necessary.

THE CONDITIONAL STRUCTURE


This has the form
if (condition) statement
where condition is a valid C++ expression. The statement is executed if the condition is
true. If the condition is false, the statement is not executed. The program continues after
this statement, whether the expression is executed or not. As an example, consider the
following program that determines the period of oscillation of a pendulum, given its
length and the acceleration due to gravity.

// Program to evaluate the period of oscillation of a pendulum, given the length and the
// acceleration due to gravity

#include <iostream>
#include <math.h>
using namespace std;

int main ()
{
float length, acceleration_gravity;
cin >> length;
cin >> acceleration_gravity;
float discri = length/acceleration_gravity;
if (discri >= 0) cout << 2.0*3.1415926*sqrt(discri);
return (0);
}

We could also state what the program should do if the statement is false, using the else
keyword.

int main ()
{
float length, acceleration_gravity;
cin >> length;
cin >> acceleration_gravity;
float discri = length/acceleration_gravity;
if (discri >= 0) cout << 2.0*3.1415926*sqrt(discri);
else
cout << "Impossible";
return (0);
}

LOOP STRUCTURES
It might be necessary to repeat a set of statements in the program. This is called a loop.
We shall examine a few ways of doing this.

The for loop


The for loop follows the following routine:

for (initialisation; condition; increase) statement;

As an example, we want to write a program that adds the numbers from 1 to 10.

#include <iostream>
using namespace std;

int main ()
{
int sum = 0;
for (int i = 1; i <= 10; i++)
{
sum = sum + i;
}
cout << sum;
return (0);
}

The while loop


For the while loop, the format is,

while (expression) statement

The while loop repeats the statement for as long as the expression is true.
A similar program, this time adding the numbers from 1 to 50, can be written with the
while loop as shown below.

#include <iostream>
using namespace std;

int main ()
{
int i = 1;
int sum = 0;
while (i <= 50) {
sum = sum + i;
i++;
}
cout << sum;
return (0);
}

The do while loop
The do while loop has the format:
do statement while (condition);

The program written with the for loop and the while loop can also be written with the do
while loop as shown below.

#include <iostream>
using namespace std;

int main ()
{
int i = 1;
int sum = 0;
do {
sum = sum + i;
i++;
} while (i <= 50);
cout << sum;
return (0);
}

JUMP STATEMENTS
The goto statement
With the goto statement, we could jump from one point to another in the program. The
point to jump to is identified by an identifier, followed by a colon (:).

int main ()
{
float length, acceleration_gravity;
new_set:
cin >> length;
cin >> acceleration_gravity;
float discri = length/acceleration_gravity;
if (discri >= 0) cout << 2.0*3.1415926*sqrt(discri);
else
cout << "Impossible";
goto new_set;
}

In the program segment above, we get the opportunity to pick another set of length and
acceleration due to gravity to calculate another value of the period of oscillation.

As another example,
#include <iostream>
using namespace std;

int main ()
{
int i = 1;
int sum = 0;
newnumber:
sum = sum + i;
i++;
if (i <= 50) goto newnumber;
cout << sum;
return (0);
}

In this program, the goto pairs up with the if statement to produce a loop that adds the
numbers from 1 to 50.

The continue statement


The continue statement causes the rest of the current pass through a loop to be skipped,
so that the program jumps to the next iteration of the loop. If we would like to skip 25
while adding the numbers from 1 to 50, we could write:

#include <iostream>
using namespace std;

int main ()
{
int sum = 0;
for (int i = 1; i <= 50; i++)
{
if (i == 25) continue;
sum = sum + i;
}
cout << sum;
return (0);
}

The program would now add from 1 to 50, skipping the number 25.

The break statement


The break statement enables us to leave a loop before the end of the loop. As an example,
let us again write the program for adding from 1 to 50.

#include <iostream>
using namespace std;

int main ()
{
int i = 1;
int sum = 0;
while (i <= 50) {
if (i == 25) break;
sum = sum + i;
i++;
}
cout << sum;
return (0);
}

This program now adds from 1 to 24.

The switch function

115
The switch statement works in a way similar to a chain of if ... else statements, as in the
example below, which classifies the discriminant of a quadratic equation.

int main ()
{
float a, b, c;
cin >> a >> b >> c;
float discri = b*b - 4*a*c;
if (discri < 0) cout << "Impossible";
else if (discri == 0) cout << "coincident roots";
else cout << "different roots";
return (0);
}

The same decision can be taken with a switch. Since the case labels of a switch must be
constant integer values, we first reduce the discriminant to an integer code and then
switch on that code:

int main ()
{
float a, b, c;
cin >> a >> b >> c;
float discri = b*b - 4*a*c;
int sign = (discri < 0) ? -1 : (discri == 0 ? 0 : 1);
switch (sign)
{
case -1: cout << "Impossible"; break;
case 0: cout << "coincident roots"; break;
case 1: cout << "different roots"; break;
}
return (0);
}

Within the switch, each case acts as a label (recall the goto statement); execution falls
through from one case to the next unless a break statement is used.

FUNCTIONS
A function consists of a group of statements that are executed when the function is called
from a point in the program.

A function is of the form:

type name (parameter1, parameter2, …) { statements }

The function returns the data type specifier type. The name is the identifier by which the
function will be called within the program. Each parameter has its data type specifier
declared along with its identifier, e.g., float orange. Finally, the body of the function is
made up of statements enclosed within braces.

As an example,
#include <iostream>
#include <math.h>
using namespace std;

float root_quadratic (float a, float b, float c);   // function prototype

int main ()
{
float root;
root = root_quadratic (1, 2, -2);
cout << root;
return (0);
}

float root_quadratic (float a, float b, float c)
{
float r;
r = (-b + sqrt(b*b - 4*a*c))/(2*a);
return (r);
}

math.h is a header file that provides the mathematical functions, such as sqrt.

The main function calls up the function root_quadratic to provide the value of root in the
main program.

Notice that 1, 2 and -2 correspond respectively (in order) to a, b and c in the function
root_quadratic.

The function type void


When a function does not need to return a value, we use the type void. This could be
declared as follows:

void menu ();


or
void menu (void);
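A short illustrative example of a void function is given below; the function simply prints a menu and returns no value.

#include <iostream>
using namespace std;

void menu ()
{
cout << "1. Enter data" << "\n";
cout << "2. Run calculation" << "\n";
cout << "3. Quit" << "\n";
}

int main ()
{
menu ();        // the call simply executes the statements in the function
return (0);
}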


ARRAYS
Arrays are memory locations within the computer that are reserved for some values that
will eventually be stored in them. An array could be one dimensional (a column or row
vector) or two or more in dimension (a matrix). The type of the array is specified along
with the size. Thus, the following are examples of arrays.
float Abba [4]; is a one-dimensional floating point array that has four memory locations.
int forum [3] [4]; is a two-dimensional integer array with twelve locations, a 3 by 4 matrix.

The memory locations for Abba will be filled with floating point numbers; those of forum
will contain integers.

Initialising an Array
A global array is set to zero, unless otherwise initialised. A local array (for example, one
declared within a function) is not initialised until some values are stored in it. Note that
array indices start from zero. For example, Abba [4] has the locations Abba [0], Abba [1],
Abba [2] and Abba [3].

Just as any variable could have an initial value stated on the same line as the type is
declared, an array could also have its initial values declared along with the type. For
example,

float Abba [5] = {1.2, 2.5, 15.4, 12.1, 6.0};

We could also have written

float Abba [ ] = {1.2, 2.5, 15.4, 12.1, 6.0};

that is, leaving out the size of the array. The compiler reads in the five values and then
gives the array a size of 5, taking the array to be float Abba [5].
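The short sketch below (with the same illustrative values) declares and initialises an array and then traverses it to add up its elements.

#include <iostream>
using namespace std;

int main ()
{
float Abba [5] = {1.2, 2.5, 15.4, 12.1, 6.0};
float sum = 0.0;
for (int i = 0; i < 5; i++)     // the indices run from 0 to 4
{
sum = sum + Abba[i];
}
cout << "Sum = " << sum;
return (0);
}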

APPENDIX II

SOME C++ PROGRAMS

// Program to calculate the area of a triangle, using Hero's formula:
// A = sqrt(s(s-a)(s-b)(s-c)), given a, b, and c, the sides of the
// triangle.

#include <iostream>
#include <fstream>
#include <math.h>
using namespace std;

int main ()
{
ofstream myfile;
myfile.open ("Area.txt");
float Area;
float s;
float a=2.0;
float b=2.0;
float c=2.0;
s = (a+b+c)/2.0;
Area = sqrt(s*(s-a)*(s-b)*(s-c));
myfile << a << ", " << b << ", " << c << ", " << s << ", "<< Area;
myfile.close ();
return 0;
}

We could also write the program in such a way that several values of a, b, and c could be
read in, and values of A calculated in each case.

#include <iostream>
#include <fstream>
#include <math.h>
using namespace std;

int main ()
{
ofstream myfile;
myfile.open ("Arearray.txt");
float Area[5];
float s;
float a[5] = {2.0, 1.5, 4.1, 3.2, 2.3};
float b[5] = {2.0, 4.2, 3.5, 1.9, 1.1};
float c[5] = {2.0, 3.3, 2.4, 1.4, 2.8};
int i=0;

while (i<5) {
s = (a[i]+b[i]+c[i])/2.0;
Area[i]= sqrt(s*(s-a[i])*(s-b[i])*(s-c[i]));
myfile <<i<<","<<a[i]<<","<<b[i]<<","<<c[i]<<","<<s<<","<<Area[i]<<"\n";
i++;
}
myfile.close ();
return 0;
}

#include <iostream>
#include <fstream>
#include <math.h>
using namespace std;

float root_quadratic (float a, float b, float c)
{
float r = 0.0;
float argum = b*b-4*a*c;
if (argum>=0)
{
r = (-b + sqrt(argum))/(2*a);
}
return (r);
}

int main ()
{
ofstream myfile;
myfile.open ("FunctExam.txt");

float root;
root = root_quadratic (1, 2, -2);
//cout <<root;
myfile <<"The root of the equation is" <<"\n";
myfile << root;
return 0;
}

// Newton-Raphson solution of the equation
// x**2-3x+2
#include <iostream>
#include <fstream>
#include <math.h>
using namespace std;
// We make the initial approximate solution 0.8
int main ()
{
ofstream myfile;
myfile.open ("Nraphson.txt");
//float xold=0.8;
cout <<"The initial guess is ";
float xold, ratio;
cin >> xold;
myfile <<xold <<"\n";
myfile <<"Successive iterations yield" <<"\n";
float f;
float fprime;
evaluate:
f=xold*xold-3.*xold+2.0;
fprime=2.*xold-3.0;
ratio=f/fprime;
float xnew=xold-ratio;
float Diff;
Diff = fabs(xnew-xold);
myfile <<xnew << ", " << Diff <<"\n";

if (Diff>0.001)
{
xold=xnew;
goto evaluate;
}
myfile <<"The root of the equation is"<< "\n";
myfile << xnew << "\n";
return 0;
}

// Bisection Method
// 2.0x**3-3x**2-2x-0.5
#include <iostream>
#include <fstream>
#include <math.h>
using namespace std;
// The initial bracketing interval is [1.9, 2.1]
int main ()
{
ofstream myfile;
myfile.open ("bisection1.txt");
float ratio;
float x1,x2,x3,fx1,fx2,fx3,multi,dfx3;
x1=1.9;
x2=2.1;
compute:
fx1=2.0*x1*x1*x1-3.0*x1*x1-2.0*x1-0.5;
fx2=2.0*x2*x2*x2-3.0*x2*x2-2.0*x2-0.5;
x3=(x1+x2)/2.0;

fx3=2.0*x3*x3*x3-3.0*x3*x3-2.0*x3-0.5;
multi=fx1*fx3;
myfile << x3 << " " <<fx3 <<"\n";

if (fx3<0.0){
dfx3=fx3*-1.0;
}
if (fx3>=0.0){
dfx3=fx3;
}
if (dfx3<0.001)
{
goto evaluate;
}

if (multi<0.0)
{
x2=x3;
goto compute;
}

if (multi>0.0)
{
x1=x2;
x2=x3;
goto compute;

}

evaluate:
myfile << x3 << " " <<fx3 <<"\n";

myfile <<"The root of the equation is"<< "\n";


myfile << x3 << "\n";
return 0;
}

// Modified (implicit) Euler method
// y' = x*x + y
#include <iostream>
#include <fstream>
#include <math.h>
using namespace std;
float f(float x1, float x2);
int main ()
{
ofstream myfile;
myfile.open ("eulerim.txt");
float x[10],y[10],b,c,d,e,g,h,diff;

x[0]=0.0;
y[0]=1.0;
myfile << x[0] << " " << y[0] <<"\n";
cout<< x[0];
e = x[0];
g = y[0];

// Step size is 0.1


h = 0.1;
int m=1;

// Get y(1) using the Euler method


compute:
y[1] = y[0]+h*f(x[0],y[0]);
int j = 1;

// Send it for refining by the Modified Euler method


loop:
b=x[0]+h;
c=y[j-1];
d=y[j];
y[j+1]=y[0]+(h/2.)*(f(x[0],y[0])+f(x[0]+h,y[j]));
diff = y[j+1]-y[j];
diff = fabs(diff);
if (diff <= .001)
{
goto write;
}
y[j]=y[j+1];
j = j + 1;
goto loop;

// Write answers and then get nine other steps
write:
myfile << x[0]+h << " " <<y[j+1] <<"\n";
x[0]=x[0]+h;
y[0]=y[j+1];
e = x[0];
g = y[0];
m = m + 1;
if (m < 10)
{
goto compute;
}

stop:

return 0;
}

float f(float xx, float yy) //function declaration


{
return xx*xx+yy;
}

