Introduction To Numerical Analysis
Numerical Analysis
By
Tibebesellassie T/Maraim
March 2005
Contents

Introduction
1 Basic Concepts in Error Estimation
1.1 Sources of Error
1.2 Rounding-off Errors
1.3 Number Representation
1.4 Absolute and Relative Errors
1.5 Propagation of Errors
2 Nonlinear Equations
2.1 Introduction
2.2 Locating Roots
2.3 Bisection Method
2.4 False-Position Method
2.5 The Secant Method
2.6 Fixed-Point Iteration Method
2.7 The Newton-Raphson Iterative Method
3 Systems of Linear Equations
Introduction
3.1 Direct Methods
3.1.1 Upper-Triangular Linear Systems
3.1.2 Gauss Elimination Method
3.1.3 Gaussian Elimination with Partial Pivoting
3.1.4 Gauss-Jordan Method
3.1.5 Matrix Inversion Using Jordan Elimination
3.1.6 Matrix Decomposition
3.1.7 Tridiagonal Matrix Method
3.2 Indirect Methods
3.2.1 Gauss-Jacobi Iterative Method
3.2.2 Gauss-Seidel Method
4 Solving Nonlinear Equations Using Newton's Method
4.1 Introduction
4.2 Newton Method
5 Finite Differences
Introduction
5.1 The Shift Operator E
5.2 The Forward Difference Operator Δ
5.3 The Backward Difference Operator ∇
5.4 The Central Difference Operator δ
5.5 Finite Differences of Polynomials
6 Interpolation
6.1 Linear Interpolation
6.2 Quadratic Interpolation
6.3 Newton Interpolation Formulae
6.3.1 Newton's Forward Difference Formula
6.3.2 Newton's Backward Difference Formula
6.4 Lagrange Interpolation Formula
6.5 Divided Difference Interpolation
Based on the flow chart, a series of instructions is written in the C++ programming language.
Since a thorough understanding of errors is necessary for a proper appreciation of the art of using numerical methods, we devote the first chapter of this material to the theory of error analysis.
Classification of errors
We classify errors in general into three classes depending on their source:
1. Errors which are already present in the statement of a problem before its solution are
called inherent errors. Such errors arise either due to the given data being
approximate or due to the limitations of mathematical tables, calculators or the digital
computer. Inherent errors can be minimized by taking better data or by using high
precision computing aids.
2. Errors due to arithmetic operations using normalized floating-point numbers. Such
errors are called rounding errors. Such errors are unavoidable in most of the
calculations due to the limitations of the computing aids. Rounding errors can,
however, be reduced:
(i) by changing the calculation procedure so as to avoid subtraction of nearly equal
numbers or division by a small number; or
(ii) by retaining at least one more significant figure at each step than that given in the
data and rounding off at the last step.
3. Errors due to finite representation of an inherently infinite process. For example, the use of a finite number of terms in the infinite series expansions of sin x, cos x, eˣ, etc.
Such errors are called truncation errors. Truncation error is a type of algorithm
error. Truncation error can be reduced by retaining more terms in the series or more
steps in the iteration; this, of course, involves extra work.
1.2 Number Representation
In order to carry out a numerical calculation involving real numbers like 1/3 and π in
terms of decimal representation, we are forced to approximate them by a representation
involving a finite number of significant figures. At this stage in a significant
representation of a number we have to notice the following properties.
i. All zeros between two non-zero digits are significant.
ii. If a number having an embedded decimal point ends with a non-zero digit or a sequence of zeros, then all these zeros are significant digits.
iii. All zeros preceding a non-zero digit are non-significant.
For example
Number Number of significant digits
5.0450 5
0.0037 2
0.0001020 4
Similarly, when trailing zeros are used in large numbers, it is not clear how many, if any, of the zeros are significant. For example, at face value the number 23,400 may have three, four, or five significant digits, depending on whether the zeros are known with confidence. Such uncertainty can be resolved by using scientific notation, where 2.34 × 10⁴, 2.340 × 10⁴, and 2.3400 × 10⁴ signify that the number is known to three, four, and five significant figures, respectively.
Notice that as indicated above, we separate the significant figures (the mantissa)
from the power of ten (the exponent); the form in which the exponent is chosen so that
the magnitude of the mantissa is less than 10 but not less than 1 is referred to as a
scientific notation.
Number representation in the computer
Since there is a fixed space of memory in a computer, a given number in a certain base
must be represented in a finite space in the memory of the machine. This means that not
all real numbers can be represented in the memory. The numbers that can be represented
in the memory of a computer are called machine numbers. Numbers outside these cannot
be represented.
There are two conventional ways of representation of machine numbers.
i. Fixed point representation
Suppose a number to be represented has n digits. In a fixed-point representation system, the n digits are subdivided into n₁ and n₂, where n₁ digits are reserved for the integer part and n₂ digits are reserved for the fractional part of the number. Whatever the value of the number, n₁ and n₂ are fixed from the beginning, i.e. the decimal point is fixed.
Example: If n = 10, n₁ = 4, and n₂ = 6, then 30.421 is represented by 0030421000 in a register.
Note: - In modern computers, the fixed point representation is used for integers only.
ii. Floating point representation
Every real number a can be written as
a = p × N^q, where p is a real number, N is the chosen base of representation, and q is an integer. Such a representation of a is said to be normalized if N⁻¹ ≤ |p| < 1.
Most of the time computers store numbers in base two to save memory space. So, for instance, the number (−39.9)₁₀ should first be changed to its binary form
(−100111.11100…)₂ = (−0.10011111100…)₂ × (2⁶)₁₀ so as to be stored in a computer.
A widely used storage scheme for the binary form is the IEEE Standard for Binary Floating-Point Arithmetic. This standard was defined by the Institute of Electrical and Electronics Engineers and adopted by the American National Standards Institute. The single-precision format employs 32 bits, of which the first or most significant bit is reserved for the sign bit S; the first bit for the number (−39.9)₁₀ is therefore equal to 1.
The next 8 bits are used to store a bit pattern representing the exponent r. The binary value of r is not stored directly; rather, it is stored in biased or offset form as a nonnegative binary value c. The relation for the actual exponent r in terms of the stored value c and the bias b is
c = r + b
The advantage of biased exponents is that they are always nonnegative. It is then simpler to compare their relative magnitudes without being concerned with their signs. As a consequence, a magnitude comparator can be used to compare their relative magnitudes during the alignment of the mantissa. Another advantage is that the smallest possible biased exponent consists of all zeros. The floating-point representation of zero is then a zero mantissa and the smallest possible exponent.
The 8-bit value of c ranges from (0000 0000)2 to (1111 1111)2 or from (0)10 to
(255)10. The bias b has a value of (0111 1111)2 or (127)10. Again using (-39.9)10 for which
r is equal to(6)10 we obtain a c value of (133)10 whose 8-bit form is (10000101)2.
The remaining 23 bits of the 32-bit format are used for the mantissa. The mantissa for our example (−39.9)₁₀ is an infinitely long binary sequence (−0.10011111100…)₂, which must be reduced to 23 bits for storage. The method of reduction dictated by the IEEE standard is rounding.
The IEEE format for the decimal number (−39.9)₁₀ is shown below:

S   c          mantissa
1   10000101   10011111100110011001101
Note: A number cannot be represented exactly if it contains more than t bits in the mantissa. In that case it has to be rounded off. Moreover if, in the course of computation, a number
The actual value of ab is 11,729.8; however, (ab)′ is 0.117 × 10⁵. The absolute error in the product is 29.8 and the relative error is approximately 0.25 × 10⁻². Floating-point subtraction and division can be done in a similar manner.
One of the challenges of numerical methods is to determine error estimates in the
absence of knowledge regarding the exact value. For example, certain numerical methods
use an iterative approach to compute answers. In such an approach, a present
approximation is made on the basis of a previous approximation. This process is
performed repeatedly, or iteratively. For such cases, the error is often estimated as the
difference between the previous and the current approximations. Thus, the relative error is determined according to

e_rel = |current approximation − previous approximation| / |current approximation|
For practical reasons, the relative error is usually more meaningful than the absolute error. For example, if a₁ = 1.333, b₁ = 1.334, a₂ = 0.001, and b₂ = 0.002, then the absolute error of bᵢ as an approximation to aᵢ is the same in both cases, namely 10⁻³. However, the relative errors are ¾ × 10⁻³ and 1, respectively. The relative error clearly indicates that b₁ is a good approximation to a₁ but that b₂ is a poor approximation to a₂.
Fig 2.1: a root x* of f inside the interval [a, b].
b) Often it is possible to write the function f (x) as the difference of two simple known
functions. Suppose f ( x) = h( x) − g ( x) . Then, the roots of f ( x) = 0 that is
h( x) = g ( x) , are given by the intersection of the curves h(x) and g(x). For instance if
in the equation
sin x − x + 0.5 = 0
we put f ( x) = sin x − x + 0.5 , it is easy to separate f (x) into two parts, sketch two
curves on the set of axes, and see where they intersect. In this case we sketch
h( x) = sin x and g ( x) = x − 0.5. Since sin x ≤ 1 we are only interested in the interval
− 0.5 ≤ x ≤ 1.5 (outside which x − 0.5 > 1).
We deduce from the graph that the equation has one real root near x = 1.5. We then
tabulate f ( x) = sin x − x + 0.5 near x =1.5 as follows:
We now know that the root lies between 1.49 and 1.50, and we can use a numerical
method to obtain a more accurate answer.
Hence, since |1.497307 − 1.497267| ≈ 0.00004, the approximate root is 1.4973, rounded to 4 decimal places.
Convergence Analysis
Now let us investigate the accuracy with which the bisection method determines a root of
a function. Suppose that f is a continuous function that takes values of opposite sign at
the ends of an interval [a, b]. Since the new interval containing the root is exactly half the length of the previous one, the interval width is reduced by a factor of ½ at each step. At the end of the nth step, the new interval will therefore be of length (b − a)/2ⁿ. If, on repeating this process n times, the latest interval is as small as an error tolerance ε, then (b − a)/2ⁿ ≤ ε. From this relation it is also possible to determine the number of steps required by the bisection method to reach the solution. To find this number of steps n, it is necessary to take the logarithm (with any convenient base) of both sides of the inequality (b − a)/2ⁿ ≤ ε to obtain
n ≥ (log(b − a) − log ε) / log 2
Observe that the error decreases at each step by a factor of ½ (i.e. eₙ₊₁/eₙ = ½); thus we say that the convergence of the bisection method is linear, since eₙ₊₁ is a scalar multiple of eₙ.
Fig 2.3: flow chart for the bisection method. After defining f(x), the chart reads a, b, and ε. If f(a)f(b) > 0, the initial points are inappropriate and the process stops. Otherwise the midpoint x* = (a + b)/2 is computed and the sign of f(x*)f(b) is tested: if it is negative, a is replaced by x*; if it is positive, b is replaced by x*; and if it is zero, x* is a root. The halving is repeated until |b − x*| ≤ ε or |a − x*| ≤ ε, at which point x* is reported as the root and the process stops.
Following the algorithm and the flow chart we can write a program for the bisection method in the C++ programming language as below:
#include <iostream>
#include <cmath>
double f(double x) { return std::sin(x) - x + 0.5; }  // function whose root we seek
int main() {
    double a = 1.0, b = 2.0, eps = 1e-6;              // bracketing interval and tolerance
    if (f(a) * f(b) > 0) { std::cout << "inappropriate initials\n"; return 1; }
    while (b - a > eps) {                             // halve the interval each pass
        double x = (a + b) / 2;
        (f(a) * f(x) <= 0 ? b : a) = x;               // keep the half that brackets the root
    }
    std::cout << "root = " << (a + b) / 2 << '\n';    // approximately 1.4973
}
Fig 2.4: the false-position method, showing successive intermediate points x₁, x₂, x₃ approaching the root within [a, b].
Example 3 To witness how fast the false-position method can be, let us reconsider the problem in Example 1. Thus let f(x) = sin x − x + 0.5; since the root lies between 1.49 and 1.5, the first intermediate point x is given by

x = 1.49 − f(1.49) · (1.5 − 1.49) / (f(1.5) − f(1.49)) = 1.4973

which is the solution of the equation rounded to 4 decimal places. Observe that this was the result found only after 8 steps of bisection in Example 1.
Example 4 Find the root of the equation e − x = x using the false-position method correct
to three decimal places.
Solution: Let f(x) = e⁻ˣ − x. Then, since f(0) = 1 and f(1) = −0.6321, the root lies between 0 and 1. Putting a = 0 and b = 1, the successive intermediate points x are found as shown in the table:

a    b        x        f(x)      |f(x)|
0    1        0.6127   −0.0708   0.0708
0    0.6127   0.5722   −0.0079   0.0079
0    0.5722   0.5677   −0.0009   0.0009
0    0.5677   0.5672   −0.0001   0.0001
From the table we deduce that the root, rounded to 3 decimal places, is 0.567, obtained in 4 iterations. Observe that every approximation was carried out rounded to 4 decimal places and only the final approximation was rounded to 3 decimal places; this is a deliberate act to minimize the absolute error. Note that, had we used bisection, it would have taken 13 iterations to reach the same accuracy! (Check.)
Like the bisection method, the method of false position has almost assured convergence, and it may converge to a root faster. However, it may happen that most or all of the calculated x values are on the same side of the root, in which case convergence may be slower than bisection; see Fig 2.5 and Example 5 below.
Fig 2.5: a case in which all false-position points fall on the same side of the root in [a, b].
Example 5 Use bisection and false position to locate the root of f(x) = x¹⁰ − 1 between x = 0 and x = 1.2, rounded to two decimal places.
Solution: It can be shown that with the bisection method we have to go through 9 iterations, while with the false-position method we have to go through at least twenty iterations to reach the desired accuracy. That is, false position takes more than twice as many iterations as bisection in this case. (Check.)
Fig 2.6: the secant method, with the new point xₙ₊₁ determined from xₙ₋₁ and xₙ.
Algorithm for the Secant Method
1. Choose x₀ and x₁, set the error tolerance ε and the maximum number of iterations M.
2. Set the counter i to 0.
3. If |f(x₁)| < ε, x₁ is the solution; otherwise go to Step 4.
4. Increase i by 1 and calculate an intermediate point x₂ from
   x₂ = x₁ − f(x₁) (x₁ − x₀) / (f(x₁) − f(x₀))
5. If i > M, the process does not converge to the required limit; otherwise go to Step 6.
6. Reset x₀ to x₁, reset x₁ to x₂, and return to Step 3.
As can be seen from the tables above, the bisection method takes 6 iterations, the false-position (interpolation) method takes 4 iterations, and the secant method takes only 3 iterations to reach the approximate root 1.73. Thus, comparing the methods, we see that the fastest is the secant method and the slowest is the bisection method.
Even in the general case, whenever all three methods converge, the secant method is the fastest of the three methods discussed so far in this section.
Rate of Convergence of the Secant Method
With respect to the speed of convergence of the secant method, we have the error at the (n+1)th step from

xₙ₊₁ = xₙ − f(xₙ) (xₙ − xₙ₋₁) / (f(xₙ) − f(xₙ₋₁))    (**)
The convergence of the iterative procedure is judged by the rate at which the error
between the true root and the calculated root decreases. The order of convergence of an
iterative process is defined in terms of the errors ei and ei+1 in successive approximations.
An iterative algorithm is kth-order convergent if k is the largest number such that

lim_{i→∞} |eᵢ₊₁| / |eᵢ|ᵏ ≤ M,

where M is a finite number. In other words, the error in any step is proportional to the kth power of the error in the previous step.
We now find the speed of convergence of the secant method from the following derivation. Let r be the root of the equation f(x) = 0 and let eᵢ = r − xᵢ, i.e. eᵢ is the error in the ith iteration. Then
eₙ₊₁ = r − xₙ₊₁
     = r − [xₙ₋₁ f(xₙ) − xₙ f(xₙ₋₁)] / [f(xₙ) − f(xₙ₋₁)]
     = [(r − xₙ₋₁) f(xₙ) − (r − xₙ) f(xₙ₋₁)] / [f(xₙ) − f(xₙ₋₁)]
Using the Taylor series f(xₙ) = f(r − eₙ) and f(xₙ₋₁) = f(r − eₙ₋₁), together with the fact that f(r) = 0, we obtain
eₙ₊₁ = [eₙ₋₁(f(r) − eₙ f′(r) + (eₙ²/2) f″(r) − …) − eₙ(f(r) − eₙ₋₁ f′(r) + (eₙ₋₁²/2) f″(r) − …)] / {[f(r) − eₙ f′(r) + …] − [f(r) − eₙ₋₁ f′(r) + …]}

     = [(eₙ₋₁ − eₙ) f(r) + ½ eₙ₋₁eₙ(eₙ − eₙ₋₁) f″(r) − …] / [(eₙ₋₁ − eₙ) f′(r) − …]

     ≈ eₙ₋₁eₙ [−f″(r) / (2 f′(r))] ~ eₙ₋₁eₙ.
Fig 2.6: fixed-point iteration, showing the curves y = x and y = g(x) and the iterates x₀, x₁, x₂.
Example 8 Find the root of the equation 3xe x = 1 to an accuracy of 0.0001, using the
method of simple iteration with initial point x0 = 1.
x            Δx        Δ²x
x₁ = 0.6667
             −0.0714
x₂ = 0.5953              0.0854
              0.0140
x₃ = 0.6093

Hence x₄ = x₁ − (Δx₁)² / Δ²x₁ = 0.6667 − (−0.0714)² / 0.0854 = 0.607.
This corresponds to six iterations of the plain iteration. Thus the required root is 0.607.
whence
h ≈ − f(x₀) / f′(x₀)
and, consequently,
x₁ = x₀ − f(x₀) / f′(x₀)
should be a better estimate of the root than x0. Even better approximations may be
obtained by repetition (iteration) of the process, which then becomes
xₙ₊₁ = xₙ − f(xₙ) / f′(xₙ)
Fig 2.7: the Newton-Raphson method, with successive tangents carrying x₀ to x₁, x₂, x₃.
The geometrical interpretation is that each iteration provides the point at which the tangent at the original point cuts the x-axis (Fig 2.7). Thus the equation of the tangent at (x₀, f(x₀)) is
y − f(x₀) = f′(x₀)(x − x₀)
so that (x₁, 0) corresponds to
−f(x₀) = f′(x₀)(x₁ − x₀)
whence
x₁ = x₀ − f(x₀) / f′(x₀)
Algorithm for the Newton-Raphson method
1. Choose x₀, ε, M, and δ.
2. Set i = 0.
3. If |f(x₀)| < ε, x₀ is the estimated solution; otherwise go to step 4.
4. Compute the next approximation a from
   a = x₀ − f(x₀) / f′(x₀),
   increase i by one, reset x₀ to a, and go to step 5.
5. If i ≤ M, go to step 3; otherwise go to step 6.
6. Report that it is impossible to attain the required accuracy within M iterations.
Example 10 Find the root of the equation e − x = x using the Newton-Raphson method
correct to three decimal places given the initial guess x0 = 1.
x       f(x) = sin x − x²
0        0
0.25     0.1849
0.5      0.2294
0.75     0.1191
1       −0.1585
With numbers displayed to 4 decimal places, we see that there is a root in the interval
0.75 < x < 1 at approximately
x₀ = (0.75 × (−0.1585) − 1 × 0.1191) / (−0.1585 − 0.1191)
   = −(1/0.2777)(−0.1189 − 0.1191)
   = 0.2380 / 0.2777
   = 0.8573
Next, we will use the Newton-Raphson method:
f (0.8573) = sin(0.8573) − (0.8573) 2
= 0.7561 − 0.7349 = 0.0211
and
f ' ( x ) = cos x − 2 x
3. Convergence
If we write
φ(x) = x − f(x) / f′(x),
the Newton-Raphson iteration expression
xₙ₊₁ = xₙ − f(xₙ) / f′(xₙ)
may be rewritten
x n +1 = φ ( x n )
We have observed (section 2.6) that, in general, the iteration method converges when |φ′(x)| < 1 near the root. In the case of Newton-Raphson, we have

φ′(x) = 1 − ([f′(x)]² − f(x) f″(x)) / [f′(x)]² = f(x) f″(x) / [f′(x)]²,

so that the criterion for convergence is

|f(x) f″(x)| < [f′(x)]²,

i.e., convergence is not as assured as, say, for the bisection method.
4. Rate of convergence
Since φ′(r) = 0 at a simple root r (because f(r) = 0), a Taylor expansion of φ about r gives

eₙ₊₁ = −½ φ″(r) eₙ² = −[f″(r) / (2 f′(r))] eₙ².
This result states that the error at the (n + 1)th iteration is proportional to the square of the error at the nth iteration; hence, if an answer is correct to one decimal place at one iteration, it should be accurate to two places at the next iteration, four at the next, eight at the next, and so on. This quadratic (second-order) convergence outstrips the rate of convergence of the methods of bisection and false position, which, as we already know, is linear!
For computer programs that are used only rarely, it may be wise to prefer the methods of bisection or false position, since their convergence is virtually assured. However, for hand calculations or for computer routines in constant use, the Newton-Raphson method is usually preferred.
5. The square root
f(x) = x² − a = 0.
1. Determine a root of the following equations correct to 3 decimal places and the
number of iterations to reach the required accuracy using
a)bisection method b)false position method c) secant method d) fixed point
iteration e) Newton-Raphson method
i. x³ − 4x + 1 = 0
ii. x + cos x = 0
iii. 2x − log x = 6
iv. x⁶ − x⁴ − x³ − 1 = 0
v. x − e⁻ˣ = 0
2. Find where the graphs of y = 3x and y = eˣ intersect, correct to four decimal digits.
3. Demonstrate graphically that the equation 50π + sin x = 100 arctan x has infinitely
many roots.
4. If a = 0.1 and b = 1.0, how many steps of the bisection method are needed to determine the root with an error of at most 0.5 × 10⁻⁸?
5. Determine the two smallest roots of the equation f ( x) = x sin x + cos x = 0 correct
to three decimal places using:
a) bisection method b) false position method c) secant method d) fixed point
iteration e) Newton-Raphson method
6. Find the root of the equation 2x = cos x + 3 correct to three decimal places using (i) the iteration method and (ii) Aitken's Δ² method.
7. Determine the real root of ln x² = 0.8:
i) Graphically
ii) Using three iterations of the bisection method, with initial guesses of a=0.5
and b=2.
iii) Using three iterations of false-position method, with the same initial guesses
as in (ii).
8. Find the root or roots of ln[(1 + x)/(1 − x²)] = 0.
9. Denote the successive intervals that arise in the bisection method by [a0 , b0 ] ,
[a1 , b1 ] , [a 2 , b2 ] , and so on.
a. Show that a₀ ≤ a₁ ≤ a₂ ≤ … and that b₀ ≥ b₁ ≥ b₂ ≥ …
b. Show that bₙ − aₙ = 2⁻ⁿ(b₀ − a₀).
c. Show that, for all n, aₙbₙ − aₙbₙ₋₁ = aₙ₋₁bₙ − aₙ₋₁bₙ₋₁.
10. Verify that when Newton's method is used to compute √R (by solving the equation x² = R), the sequence of iterates is defined by

xₙ₊₁ = ½ (xₙ + R / xₙ)
11. Show that the sequence {xₙ} defined by

xₙ₊₁ = xₙ (2 − xₙ R)

can be obtained by applying Newton's method to some f(x). Beginning with x₀ = 0.2, compute the reciprocal of 4 correct to six decimal digits or more by this rule. Tabulate the error at each step and observe the quadratic convergence.
¹ Dense matrices have few zero elements. Such matrices occur in science and engineering problems.
² Sparse matrices have few non-zero elements. Such matrices arise in partial differential equations.
Flow chart for back substitution: read a_ij for i = 1, …, n and j = i, …, n + 1; set xₙ = a_{n,n+1} / a_nn; then, for k = n − 1 down to 1, accumulate Sum = a_{k,k+1}x_{k+1} + … + a_{k,n}xₙ and set x_k = (a_{k,n+1} − Sum) / a_kk; finally, display x₁, …, xₙ and stop.
∑_{k=1}^{m} k = m(m + 1)/2
7. Write a forward-substitution algorithm for solving a system of equations with a lower
triangular coefficient matrix and draw a flow chart for the algorithm.
Step 1: Eliminate the coefficients a21 and a31, using row R1:
We now display the process for the general n × n system, omitting the primes (′) for convenience. Recall that the original augmented matrix was:

⎡ a₁₁  a₁₂  …  a₁ₙ  a₁,ₙ₊₁ ⎤
⎢ a₂₁  a₂₂  …  a₂ₙ  a₂,ₙ₊₁ ⎥
⎢  ⋮    ⋮   ⋱   ⋮     ⋮    ⎥
⎣ aₙ₁  aₙ₂  …  aₙₙ  aₙ,ₙ₊₁ ⎦
Step 1: If necessary, switch rows so that a₁₁ ≠ 0; then eliminate the coefficients a₂₁, a₃₁, …, aₙ₁ by calculating the multipliers:

for i = 2 to n
    m_i1 = a_i1 / a₁₁
    a_i1 = 0
    for j = 2 to n + 1
        a_ij = a_ij − m_i1 · a_1j
whence
which has the exact solution x =1, y = 2. Changing coefficients of the second equation by
1% and the constant of the first equation by 5% yields the system:
It is easily verified that the exact solution of this system is x = 11, y = −18.2. This solution differs greatly from the solution of the first system. Both of these systems are said to be ill-conditioned.
If a system is ill-conditioned, then the usual procedure of checking a numerical solution
by calculation of the residuals may not be valid. In order to see why this is so, suppose
we have an approximation X to the true solution x. The vector of residuals r is then
given by r = b - AX = A(x - X). Thus e = x - X satisfies the linear system Ae = r. In
general, r will be a vector with small components. However, in an ill-conditioned system, even if the components of r are small, so that r is "close" to 0, the solution of the linear system Ae = r could differ greatly from the solution of the system Ae = 0, namely 0. It then follows that X may be a poor approximation to x despite the residuals in r being small.
Obtaining accurate solutions to ill-conditioned linear systems can be difficult, and many
tests have been proposed for determining whether or not a system is ill-conditioned.
Step k. Assume a_kk^(k−1) ≠ 0. Pre-multiplying the kth row of [A, b]^(k−1) by 1/a_kk^(k−1) changes a_kk^(k−1) to 1, but the other coefficients of the kth row are changed as well. In fact, the new coefficients will be

a_kj^(k) = a_kj^(k−1) / a_kk^(k−1)    for j = k, …, (n+1)

The non-diagonal coefficients of the kth column are made zero by subtracting a_ik^(k−1) times the new kth row from the ith row, for i = 1, …, n and i ≠ k. This implies that the new coefficients will be defined by

a_ij^(k) = a_ij^(k−1) − a_ik^(k−1) · a_kj^(k)    for j = (k+1), …, (n+1); i = 1, 2, …, n; i ≠ k
Now, we can write Gauss-Jordan algorithm for a linear system.
Gauss-Jordan Algorithm for Solving a Linear System [A, b]
Gauss-Jordan algorithm without pivoting:
(i) Transforming [A, b] into [I, b′]: for k = 1, 2, …, n,
    a_kj^(k) = a_kj^(k−1) / a_kk^(k−1)    for all j = (k+1), …, (n+1)
    a_ij^(k) = a_ij^(k−1) − a_ik^(k−1) a_kj^(k)    for all j = (k+1), …, (n+1); i = 1, …, n (i ≠ k)
(ii) Solution of the system:
    x_i = a_{i,n+1}^(n)    for i = 1, 2, …, n.
The number of operations necessary to transform the system [A, b]^(k−1) into [A, b]^(k) is (n−1)(n−k+1) additions, (n−1)(n−k+1) multiplications, and (n−k+1) divisions. Therefore, the total number of operations necessary to transform the original system [A, b] into [I, b′] includes

No. of additions = ∑_{k=1}^{n} (n−1)(n−k+1) = n(n² − 1)/2.
[A : I₃] = ⎡  1  −1  −2   1  0  0 ⎤
           ⎢  2  −3  −5   0  1  0 ⎥
           ⎣ −1   3   5   0  0  1 ⎦

R₂ + (−2)R₁ and R₃ + R₁ give

⎡ 1  −1  −2    1  0  0 ⎤
⎢ 0  −1  −1   −2  1  0 ⎥
⎣ 0   2   3    1  0  1 ⎦

(−1)R₂ gives

⎡ 1  −1  −2   1   0  0 ⎤
⎢ 0   1   1   2  −1  0 ⎥
⎣ 0   2   3   1   0  1 ⎦

R₁ + R₂ and R₃ + (−2)R₂ give

⎡ 1  0  −1    3  −1  0 ⎤
⎢ 0  1   1    2  −1  0 ⎥
⎣ 0  0   1   −3   2  1 ⎦

R₁ + R₃ and R₂ + (−1)R₃ give

⎡ 1  0  0    0   1   1 ⎤
⎢ 0  1  0    5  −3  −1 ⎥ = [I : A⁻¹]
⎣ 0  0  1   −3   2   1 ⎦

Thus

A⁻¹ = ⎡  0   1   1 ⎤
      ⎢  5  −3  −1 ⎥
      ⎣ −3   2   1 ⎦
Observe that we can solve the system of equations
e) x₁ + x₂ + x₃ − x₄ = −3          f) x₁ − x₂ + 2x₃ = 7
   2x₁ + 3x₂ + x₃ − 5x₄ = −9          2x₁ − 2x₂ + 2x₃ − 4x₄ = 12
   x₁ + 3x₂ − x₃ − 6x₄ = −7           −x₁ + x₂ − x₃ + 2x₄ = −4
   −x₁ − x₂ − x₃ = 1                  −3x₁ + x₂ − 8x₃ − 10x₄ = −29
2. Solve the following system of linear equations by applying the method of Gauss-Jordan elimination to a large augmented matrix that represents two systems with the same matrix of coefficients:
x₁ + x₂ + 5x₃ = b₁
x₁ + 2x₂ + 8x₃ = b₂
2x₁ + 4x₂ + 16x₃ = b₃
for [b₁, b₂, b₃]ᵀ = [2, 5, 10]ᵀ and [3, 2, 4]ᵀ in turn.
3. Solve the following systems of equations by determining the inverse of the matrix of coefficients and then using matrix multiplication.
a) x₁ + 2x₂ − x₃ = 2        b) x₁ − x₂ = 1
   x₁ + x₂ + 2x₃ = 0           x₁ + x₂ + 2x₃ = 2
   x₁ − x₂ − x₃ = 1            x₁ + 2x₂ + x₃ = 0
4. Draw a flow chart for the Gauss-Jordan elimination method.
The condition that A is nonsingular implies that u_kk ≠ 0 for all k. The notation for the entries in L is m_ij, and the reason for the choice of m_ij instead of l_ij will be pointed out in the next example.
Example Use Gaussian elimination to construct the triangular factorization of the matrix

A = ⎛  4   3  −1 ⎞
    ⎜ −2  −4   5 ⎟
    ⎝  1   2   6 ⎠
The matrix L will be constructed from an identity matrix placed at the left. For each row
operation used to construct the upper-triangular matrix, the multipliers mij will be put in
their proper places at the left. Start with
A = ⎛ 1  0  0 ⎞ ⎛  4   3  −1 ⎞
    ⎜ 0  1  0 ⎟ ⎜ −2  −4   5 ⎟
    ⎝ 0  0  1 ⎠ ⎝  1   2   6 ⎠
Row 1 is used to eliminate the elements of A in column 1 below a11 . The multiples
m 21 = −0.5 and m 31 = 0.25 of row 1 are subtracted from rows 2 and 3, respectively.
These multipliers are put in the matrix at the left and the result is
A = (  1     0    0 ) ( 4    3     -1   )
    ( -0.5   1    0 ) ( 0   -2.5    4.5 )
    (  0.25  0    1 ) ( 0    1.25   6.25)

Row 2 is now used to eliminate the element of column 2 below the diagonal. The multiple m32 = -0.5 of row 2 is subtracted from row 3, and this multiplier is likewise stored at the left:

A = (  1     0     0 ) ( 4    3     -1  )
    ( -0.5   1     0 ) ( 0   -2.5    4.5)
    (  0.25 -0.5   1 ) ( 0    0      8.5)

The upper-triangularization process is now complete. Notice that one array can be used to store the elements of both L and U: the 1's of L are not stored, nor are the 0's of L above the diagonal and of U below it.
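The multiplier bookkeeping just described can be sketched in a few lines of Python (the function name is our own); applying it to the matrix of the example reproduces the factors obtained by hand.

```python
def lu_factor(a):
    """Triangular factorization A = LU by Gaussian elimination:
    the unit lower-triangular L collects the multipliers m_ij,
    U is the upper-triangular result of the elimination."""
    n = len(a)
    U = [row[:] for row in a]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]          # multiplier m_ik
            L[i][k] = m
            U[i] = [U[i][j] - m * U[k][j] for j in range(n)]
    return L, U

# the matrix of the example above
A = [[4.0, 3.0, -1.0], [-2.0, -4.0, 5.0], [1.0, 2.0, 6.0]]
L, U = lu_factor(A)
```

The multipliers -0.5, 0.25 and -0.5 land in L exactly as in the worked example.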
Let A = LU, where

L = ( l11   0    0  )        U = ( 1  u12  u13 )
    ( l21  l22   0  )            ( 0   1   u23 )
    ( l31  l32  l33 )            ( 0   0    1  )

so that

( l11   0    0  ) ( 1  u12  u13 )   ( a11  a12  a13 )
( l21  l22   0  ) ( 0   1   u23 ) = ( a21  a22  a23 )
( l31  l32  l33 ) ( 0   0    1  )   ( a31  a32  a33 )
Then l11 = a11, l21 = a21, l31 = a31, and

  l11·u12 = a12, so u12 = a12/a11
  l11·u13 = a13, so u13 = a13/a11
  l21·u12 + l22 = a22, so l22 = a22 - a21·u12
  l31·u12 + l32 = a32, so l32 = a32 - a31·u12
  l21·u13 + l22·u23 = a23, so u23 = (a23 - a21·u13)/l22
  l31·u13 + l32·u23 + l33 = a33, so l33 = a33 - a31·u13 - l32·u23
We may generalize the above relations by the following concise series of formulas. For j = 1, 2, ..., n:

  l_ij = a_ij - Σ_{k=1}^{j-1} l_ik u_kj ,        i = j, j+1, ..., n,

  u_jk = ( a_jk - Σ_{i=1}^{j-1} l_ji u_ik ) / l_jj ,        k = j+1, j+2, ..., n,

and, in particular,

  l_nn = a_nn - Σ_{k=1}^{n-1} l_nk u_kn .
Now Ax = B becomes LUx = B.
Let Ux = y then Ly=B
And from Ly = B we get
( l11   0    0  ) ( y1 )   ( b1 )
( l21  l22   0  ) ( y2 ) = ( b2 )
( l31  l32  l33 ) ( y3 )   ( b3 )
Thus
l11 . y1 = b1
l 21 y1 + l 22 . y 2 = b2
l31 y1 + l32 . y 2 + l 33 y 3 = b3
Solve for y1 , y 2 , y 3 by forward substitution.
We know that Ux = y
Thus
( 1  u12  u13 ) ( x1 )   ( y1 )
( 0   1   u23 ) ( x2 ) = ( y2 )
( 0   0    1  ) ( x3 )   ( y3 )
Consequently
x1 + u12 x 2 + u13 x3 = y1
x 2 + u 23 x3 = y 2
x3 = y 3
By backward substitution, we get the values of x1, x2, x3.
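The forward- and back-substitution steps can be sketched as follows; the L, U, and b used in the demonstration are hypothetical values of our own, chosen so that the exact solution is (1, 2, 3).

```python
def lu_solve(L, U, b):
    """Solve LUx = b: forward substitution for Ly = b,
    then back substitution for Ux = y."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                       # forward substitution
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):             # backward substitution
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

# hypothetical factors (general L, unit-diagonal U) and right-hand side
L = [[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [1.0, 2.0, 4.0]]
U = [[1.0, 0.5, 0.5], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]
b = [7.0, 14.0, 22.5]
x = lu_solve(L, U, b)
```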
Definition: Let A = (aij) be a square matrix of order n such that aij = 0 whenever |i - j| > 1; then A is called a tridiagonal matrix. Such matrices arise frequently in the numerical solution of differential equations.
A = ( α1   γ1   0    0   ...   0       0     )
    ( β2   α2   γ2   0   ...   0       0     )
    ( 0    β3   α3   γ3  ...   0       0     )
    ( ...                     ...            )
    ( 0    0    0   ...  β_{n-1} α_{n-1} γ_{n-1} )
    ( 0    0    0   ...   0     β_n     α_n  )

Because A is tridiagonal, the factors L and U of A = LU are bidiagonal:

L = ( l1   0   0  ...  0 )        U = ( 1  μ1  0  ...  0      )
    ( β2  l2   0  ...  0 )            ( 0  1   μ2 ...  0      )
    ( 0   β3  l3  ...  0 )            ( 0  0   1  ...  0      )
    ( ...                )            ( ...        ... μ_{n-1} )
    ( 0   0   0   βn  ln )            ( 0  0  ...  0    1     )
This leads us to the following algorithm
l1 = α 1 , μ 1 = γ 1 / l 1
for i = 2 to n
l i = α i − β i μ i −1
if i < n
μi = γ i / li
end for
The solution now consists of forward and backward substitution:

  y1 = b1/l1 ,   y_i = (b_i - β_i y_{i-1}) / l_i ,   i = 2, 3, ..., n
  x_n = y_n ,    x_i = y_i - μ_i x_{i+1} ,           i = n-1, n-2, ..., 1     (2)
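The tridiagonal algorithm above (often called the Thomas algorithm) can be sketched in a few lines of Python; the small test system is a hypothetical one of our own, with exact solution (1, 2, 3).

```python
def thomas(alpha, beta, gamma, b):
    """Solve a tridiagonal system with main diagonal alpha, subdiagonal
    beta (beta[0] unused) and superdiagonal gamma, via the LU algorithm
    above followed by forward and backward substitution."""
    n = len(alpha)
    l = [0.0] * n
    mu = [0.0] * (n - 1)
    l[0] = alpha[0]
    mu[0] = gamma[0] / l[0]
    for i in range(1, n):
        l[i] = alpha[i] - beta[i] * mu[i - 1]
        if i < n - 1:
            mu[i] = gamma[i] / l[i]
    y = [0.0] * n                    # forward substitution
    y[0] = b[0] / l[0]
    for i in range(1, n):
        y[i] = (b[i] - beta[i] * y[i - 1]) / l[i]
    x = [0.0] * n                    # backward substitution
    x[-1] = y[-1]
    for i in range(n - 2, -1, -1):
        x[i] = y[i] - mu[i] * x[i + 1]
    return x

# hypothetical test system: 2x1 + x2 = 4, x1 + 2x2 + x3 = 8, x2 + 2x3 = 8
x = thomas([2.0, 2.0, 2.0], [0.0, 1.0, 1.0], [1.0, 1.0], [4.0, 8.0, 8.0])
```

Note that only the three diagonals are stored, so the cost is O(n) rather than O(n³).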
Step I: Obtain x1^(1) from the first equation as a function of xi^(0), i = 2, 3, ..., n, as follows:

  x1^(1) = [ b1 - ( a12 x2^(0) + a13 x3^(0) + ... + a1n xn^(0) ) ] / a11

Similarly,

  x2^(1) = [ b2 - ( a21 x1^(0) + a23 x3^(0) + ... + a2n xn^(0) ) ] / a22        (from the second equation)

  xk^(1) = [ bk - ( ak1 x1^(0) + ... + a_{k,k-1} x_{k-1}^(0) + a_{k,k+1} x_{k+1}^(0) + ... + akn xn^(0) ) ] / akk        (from the kth equation)

and so on.
Now, to solve the system

  6x + 2y - z = 4
  x + 5y + z = 3
  2x + y + 4z = 27

we rewrite the equations as follows, isolating x in the first equation, y in the second equation, and z in the third equation:

  x = (4 - 2y + z) / 6
  y = (3 - x - z) / 5        (2)
  z = (27 - 2x - y) / 4
Now make an estimate of the solution, say x = 1, y = 1, z = 1. The accuracy of the estimate
affects only the speed with which we get a good approximation to the solution. Let us
label these values x ( 0) , y ( 0 ) , and z ( 0 ) . They are called the initial values of the iterative
process.
x^(0) = 1,   y^(0) = 1,   z^(0) = 1
Substitute these values into the right-hand side system (2) to get the next set of values in
the iterative process.
x^(1) = 0.5,   y^(1) = 0.2,   z^(1) = 6
These values of x,y, and z are now substituted into system (2) again to get
x^(2) = 1.6,   y^(2) = -0.7,   z^(2) = 6.45
This process can be repeated to get x ( 3) , y (3) , z ( 3) , and so on. Repeating the iteration will
give a better approximation to the exact solution at each step. For this straightforward
system of equations, the solution can easily be seen to be
x = 2, y = −1, z = 6.
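A minimal Python sketch of the Jacobi iteration for system (2) confirms this: every sweep uses only the values from the previous iteration, and the process settles on (2, -1, 6).

```python
def jacobi_step(x, y, z):
    """One Jacobi sweep for system (2): each update uses only
    the values from the previous iteration."""
    return ((4 - 2*y + z) / 6,
            (3 - x - z) / 5,
            (27 - 2*x - y) / 4)

x, y, z = 1.0, 1.0, 1.0        # initial values
for _ in range(100):
    x, y, z = jacobi_step(x, y, z)
```

The first sweep reproduces (0.5, 0.2, 6) exactly as above.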
Example: Let us consider the previous system of equations. As before, let us take our
initial guess to be
x^(0) = 1,   y^(0) = 1,   z^(0) = 1
Substituting the latest values of each variable into system (2) gives
  x^(1) = (4 - 2y^(0) + z^(0)) / 6 = 0.5
  y^(1) = (3 - x^(1) - z^(0)) / 5 = 0.3
  z^(1) = (27 - 2x^(1) - y^(1)) / 4 = 6.425
Observe that we have used x^(1), the most up-to-date value of x, to get y^(1). We have used x^(1) and y^(1) to get z^(1). Continuing, we get
  x^(2) = (4 - 2y^(1) + z^(1)) / 6 = 1.6375
  y^(2) = (3 - x^(2) - z^(1)) / 5 = -1.0125
  z^(2) = (27 - 2x^(2) - y^(2)) / 4 = 6.184375
The next two tables below give the results obtained for this system of equations using
both methods. They illustrate the more rapid convergence of the Gauss-Seidel method to
the exact solution x = 2, y = -1, z = 6. The last table gives the difference between the solutions x^(6), y^(6), z^(6) obtained by the two methods after six iterations and the actual solution x = 2, y = -1, z = 6. The Gauss-Seidel method converges much more rapidly than the Jacobi method.
Gauss-Seidel Method

Index            x           y            z
Initial Guess    1           1            1
1                0.5         0.3          6.425
2                1.6375     -1.0125       6.184375
3                2.034896   -1.043854     5.993516
4                2.013537   -1.001411     5.993594
5                1.998597   -0.998597     5.999949
6                1.999524   -0.9998945    6.000212
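The same system solved with a Gauss-Seidel sweep in Python reproduces the table above; note how each update immediately reuses the freshest values.

```python
def gauss_seidel_step(x, y, z):
    """One Gauss-Seidel sweep for system (2): each update
    immediately uses the newest available values."""
    x = (4 - 2*y + z) / 6
    y = (3 - x - z) / 5
    z = (27 - 2*x - y) / 4
    return x, y, z

x, y, z = 1.0, 1.0, 1.0        # initial guess
for _ in range(50):
    x, y, z = gauss_seidel_step(x, y, z)
```

The first sweep yields (0.5, 0.3, 6.425), the first row of the table.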
Since the approximation vector x^(k) is very close to x*, the elements (x*_j - x_j^(k))² are small, and therefore all the higher-order terms, including the second-order terms, are negligible. So the system above becomes

  Σ_{j=1}^{n}  [ ∂f_i(x)/∂x_j ]_{x = x^(k)} · ( x*_j - x_j^(k) ) = - f_i( x^(k) ) ,    i = 1, 2, ..., n        (4.4)
( 6       4    0.5403 ) ( h1 )   ( -7.8415 )
( 3.7183  0.5  1      ) ( h2 ) = ( -3.7183 )
( 1       4    1      ) ( h3 )   ( -1      )
Solving this system gives h1 = -1.2674, h2 = -0.2076, h3 = 1.0979. Using these results, the next approximations for the unknown variables can be calculated as follows.
x (1) = x ( 0 ) + h1 = 1 − 1.2674 = −0.2674
y (1) = y ( 0) + h2 = 1 − 0.2076 = 0.7924
z (1) = z ( 0 ) + h3 = 1 + 1.0979 = 2.0979
Now substituting the values of x(1),y(1),z(1) in Eq (4.6), we construct a new linear system
and solve it to obtain the new correction factor {h} and then calculate the new
approximations x(2),y(2), and z(2). This process is continued until the convergence
condition is fulfilled.
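As an illustration of this correction scheme (solve J h = -F, then update the iterate), here is a minimal Python sketch of Newton's method applied to the two-equation system of Exercise 1 below; the starting guess and function name are our own, and the 2x2 linear system is solved by Cramer's rule for brevity.

```python
def newton2(x, y, tol=1e-12, itmax=50):
    """Newton's method for f = x^2 + y^2 - 25 = 0, g = x^2 - y - 2 = 0.
    Each step solves J h = -F and updates (x, y) <- (x, y) + h,
    as in equation (4.4)."""
    for _ in range(itmax):
        f = x*x + y*y - 25.0
        g = x*x - y - 2.0
        # Jacobian entries: df/dx, df/dy, dg/dx, dg/dy
        j11, j12, j21, j22 = 2*x, 2*y, 2*x, -1.0
        det = j11*j22 - j12*j21
        h1 = (-f*j22 + g*j12) / det      # Cramer's rule for J h = -F
        h2 = (-g*j11 + f*j21) / det
        x, y = x + h1, y + h2
        if abs(h1) + abs(h2) < tol:
            break
    return x, y

x, y = newton2(2.0, 3.0)   # initial guess near the first-quadrant root
```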
Exercise:
1. Consider the nonlinear system
  f(x, y) = x² + y² - 25 = 0
  g(x, y) = x² - y - 2 = 0
Using a software package that has 2D plotting capabilities, illustrate what is going on
in solving such a system by plotting f(x, y), g(x, y), and show their intersection with
the xy-plane. Determine approximate roots of these equations from the plot.
2. Using Newton’s method approximate the solution of
a) x² + y² + z² = 1                       b) x + y + z = 3
   2x² + y² - 4z = 0                         x² + y² + z² = 5
   3x² - 4y + z² = 0                         eˣ + xy - xz = 1
with initial values x^(0) = y^(0) = z^(0) = 0.5      with initial values x^(0) = y^(0) = z^(0) = 0

c) x + y + z = 0
   x² + y² + z² = 2        with initial values (3/4, 1/2, -1/2)
   x(y + z) = -1
Differences
x f(x) 1st 2nd 3rd
0.20 1.22140
0.01228
0.21 1.23368 0.00012
0.01240 0.00000
0.22 1.24608 0.00012
0.01252 0.00001
0.23 1.25860 0.00013
0.01265 0.00000
0.24 1.27125 0.00013
0.01278
0.25 1.28403
Differences
x f(x) 1st 2nd 3rd
0.20 1.22140
0.06263
0.25 1.28403 0.00320
0.06583 0.00018
0.30 1.34986 0.00338
0.06921 0.00016
0.35 1.41907 0.00354
0.07275 0.00020
0.40 1.49182 0.00374
0.07649
0.45 1.56831
Although the round-off errors in f should be less than 1/2 in the last significant place,
they may accumulate; the greatest error that can be obtained corresponds to:
                  Differences
Tabular error   1st   2nd   3rd   4th   5th   6th
+½
–1
–½ +2
+1 –4
+½ –2 +8
–1 +4 – 16
–½ +2 –8 +32
+1 –4 +16
+½ –2 +8
–1 +4
–½ +2
+1
+½
A rough working criterion for the expected fluctuations (`noise level') due to round-off
error is shown in the table:
            Differences
f(x)    1st   2nd   3rd   4th   5th   6th
0
0
0 0
0 ε
0 ε –4 ε
ε –3 ε 10 ε
0+ε –2ε 6ε – 20 ε
–ε 3ε –10 ε
0 ε –4ε
0 –ε
0 0
0
0
Note that the maximum error occurs directly opposite the entry whose functional value is in error by ε.
EXERCISES
1. Construct a difference table for the function f (x) = x3 for x = 0(1) 6.
2. Construct a difference table for each of the polynomial functions:
a) 2x - l for x = 0(1)3.
b) 3x2 + 2x - 4 for x = 0(1)4.
c) 2x3 + 3x - 3 for x = 0(1)5.
Study your resulting tables carefully; note what happens in the final few columns of
each table. Suggest a general result for polynomials of degree n and compare your
answer with the theorem in.
3. Construct a difference table for the function f(x) = eˣ, given to 7S, for x = 0.1(0.05)0.5.
Example 1: Find the value of E²x² when the value of x may vary by a constant increment of 2.
Solution: Let f(x) = x². Since the constant increment is h = 2, we obtain, by definition,
  E²x² = E²f(x) = f(x + 2h),
hence
  E²x² = (x + 4)² = x² + 8x + 16.
Example 2: Find the value of Eⁿeˣ when x may vary by a constant interval of h.
Solution: Similarly to Example 1, Eⁿeˣ = e^(x+nh).
Hence the function whose first difference is eˣ is given by f(x) = eˣ/(eʰ - 1); indeed, Δf(x) = (e^(x+h) - eˣ)/(eʰ - 1) = eˣ.
then

  δf_j = ( E^(1/2) - E^(-1/2) ) f_j = E^(1/2) f_j - E^(-1/2) f_j = f_{j+1/2} - f_{j-1/2}

is the first-order central difference at x_j; similarly δ²f_j = f_{j+1} - 2f_j + f_{j-1} is the second-order central difference at x_j, etc. The central difference of order k is

  δᵏf_j = δ^(k-1)(δf_j) = δ^(k-1)( f_{j+1/2} - f_{j-1/2} ) = δ^(k-1) f_{j+1/2} - δ^(k-1) f_{j-1/2}
Example 7: Find the value of δ²x³ when the value of x may vary by a constant increment of 1.
Solution:
  δ²x³ = ( E^(1/2) - E^(-1/2) )² x³ = ( E - 2 + E^(-1) ) x³
       = Ex³ - 2x³ + E^(-1)x³
       = (x + 1)³ - 2x³ + (x - 1)³
       = 6x.
Forward difference table
A forward difference table and a plain difference table are identical, except that in a forward difference table we use the forward difference operator to label each difference column. See the table below.
Differences
x f(x)
Δ        Δ²        Δ³
0.20 1.22140
0.01228
0.21 1.23368 0.00012
0.01240 0.00000
0.22 1.24608 0.00012
0.01252 0.00001
0.23 1.25860 0.00013
0.01265 0.00000
0.24 1.27125 0.00013
0.01278
0.25 1.28403
Once we construct a forward difference table, we can read off the results of the other difference operators from the same table, so there is no need to construct separate tables for central and backward differences.
Example 8: If we let x0 = 0.20 in the last difference table (so that h = 0.01), we can easily see from the table that Δf1 = 0.01240 while ∇f1 = 0.01228; moreover Δ²f2 = 0.00013, ∇²f2 = 0.00012, and δ²f2 = 0.00012. But we cannot read values like δf1, ∇f0, Δf5, ∇²f1, Δ⁴f4, etc., from this difference table.
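Building such a table programmatically is straightforward. The short Python sketch below (function name our own) reconstructs the difference columns of the table above and reads off several of the quoted values.

```python
def difference_table(values):
    """Return [f, Δf, Δ²f, ...] as successive columns of differences."""
    cols = [list(values)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

# the tabulated f values from the table above (x = 0.20(0.01)0.25)
f = [1.22140, 1.23368, 1.24608, 1.25860, 1.27125, 1.28403]
cols = difference_table(f)
```

With x0 = 0.20, the entry cols[1][1] is Δf1, cols[1][0] is ∇f1, and cols[2][2] is Δ²f2.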
4. Evaluate (Δ/E)² x³, where the interval of differencing equals 1.
5. Prove that
     eˣ = (Δ²/E) eˣ · ( E eˣ / Δ² eˣ ),
   the interval of differencing being h.
6. Show that
a) ( Δ − ∇) ≡ Δ∇ ;
b) (1 + Δ )(1 − ∇) ≡ 1 ;
c) ∇Δ ≡ δ 2 .
7. Show that the difference operators commute with one another; for instance, EΔ ≡ ΔE, δ∇ ≡ ∇δ, etc.
8. Simplify (Δ + ∇)² f(x), the common step length being h.
  ⋮
  Δⁿ f(x) = aₙ n! hⁿ = constant, and
  Δⁿ⁺¹ f(x) = 0.
In passing, the student may recall that in Differential Calculus the increment
Δf ( x ) = f ( x + h ) − f ( x ) is related to the derivative of f (x) at the point x.
Conversely, if the nth finite differences of a tabulated function are constant, then f(x) is a polynomial of degree n.
This converse result is very helpful. If, in an experimental setup, we obtain a table of values relating two variables, the functional relation y = f(x) being unknown, and the differences of some particular order turn out to be all constant, then we may infer that f is a polynomial whose degree corresponds to that constant difference. This concept helps us to establish relationships for experimental results; we shall see in the next chapter how this can be done through numerical examples.
Examples
1) Determine Δ³{(1 + x)(1 - 3x)(1 + 5x)}, where the common interval length is 1.
Solution: Observe that (1 + x)(1 - 3x)(1 + 5x) = 1 + 3x - 13x² - 15x³. From the theorem above, for a polynomial of nth degree the nth difference is constant and equal to aₙ n! hⁿ, where aₙ is the coefficient of xⁿ in the polynomial and h is the interval of differencing. Hence

  Δ³{(1 + x)(1 - 3x)(1 + 5x)} = (-15)(3!)(1)³ = -90.
x      f(x)       Δ       Δ²       Δ³       Δ⁴
5.0 125.000
7.651
5.1 132.651 0.306
7.957 0.006
5.2 140.608 0.312 0.000
8.269 0.006
5.3 148.877 0.318 0.000
8.587 0.006
5.4 157.464 0.324
8.911
5.5 166.375
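The theorem can be checked numerically. Tabulating f(x) = x³ at x = 5.0(0.1)5.5 as above, the third differences should all equal a₃ · 3! · h³ = 1 · 6 · (0.1)³ = 0.006, which is exactly the constant column in the table.

```python
# f(x) = x^3 tabulated at x = 5.0(0.1)5.5; the third differences
# should all equal a3 * 3! * h^3 = 1 * 6 * 0.001 = 0.006
xs = [5.0 + 0.1 * i for i in range(6)]
f = [x ** 3 for x in xs]
d1 = [f[i + 1] - f[i] for i in range(5)]
d2 = [d1[i + 1] - d1[i] for i in range(4)]
d3 = [d2[i + 1] - d2[i] for i in range(3)]
```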
3. Find the tenth term of the sequence 3, 14, 39, 84, 155, 258, …
4. Given f 0 = 3, f 1 = 12, f 2 = 81, f 3 = 200, f 4 = 100, f 5 = 8, find Δ5 f 0 without
constructing a difference table.
5. Show that the kth forward difference Δᵏf₁ in a difference table may be expressed as

   Δᵏf₁ = C(k,0) f_{k+1} - C(k,1) f_k + C(k,2) f_{k-1} - ... + (-1)ᵏ C(k,k) f₁ ,

   where the coefficients C(k, j) = k!/( j!(k - j)! ) are the binomial coefficients.
1. interpolation is still important for functions that are available only in tabular
form (perhaps from the results of an experiment); and
2. interpolation serves to introduce the wider application of finite differences.
In Chapter 5 we have observed that, if the differences of order k are constant (within
round-off fluctuation), the tabulated function may be approximated by a polynomial of
degree k. Linear and quadratic interpolation correspond to the cases k = 1 and k = 2,
respectively.
The first differences are almost constant locally, so that the table is suitable for linear interpolation.
Given three adjacent points x_j, x_{j+1} = x_j + h, and x_{j+2} = x_j + 2h, suppose that f(x) can be approximated by the quadratic

  f(x_j + θh) ≈ f_j + θ Δf_j + ½ θ(θ - 1) Δ²f_j ,   θ = (x - x_j)/h ,

which is the basis of quadratic interpolation.
The linear and quadratic interpolation formulae are based on first and second degree polynomial approximations. Newton derived general forward and backward difference interpolation formulae for tables with constant interval h. The forward formula is

  f(x_j + θh) ≈ f_j + θ Δf_j + [θ(θ-1)/2!] Δ²f_j + ... + [θ(θ-1)...(θ-n+1)/n!] Δⁿf_j ,

which is an approximation based on the values f_j, f_{j+1}, ..., f_{j+n}. It will be exact (within round-off errors) if f is a polynomial of degree n.
Newton's forward and backward difference formulae are well suited for use at
the beginning and end of a difference table, respectively. (Other formulae,
which use central differences, may be more convenient elsewhere.)
As an example, consider the difference table of f (x) = sin x for x = 0°( 10°)50°:
Since the fourth order differences are constant, we conclude that a quartic
approximation is appropriate. (The third-order differences are not quite
constant within expected round-offs, and we anticipate that a cubic
approximation is not quite good enough.) In order to determine sin 5° from the
table, we use Newton's forward difference formula (to fourth order), taking x_j = 0°.
Note that we have kept a guard digit (in parentheses) to minimize accumulated
round-off error.
In order to determine sin 45° from the table, we use Newton's backward
difference formula (to fourth order), taking x_j = 40°.
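Since the table is simply sin x at 0°(10°)50°, the forward-difference computation of sin 5° can be sketched in Python. We regenerate the table in full precision rather than copying the printed rounded values, so the estimate differs from the text's guard-digit hand computation only in round-off.

```python
import math

# table of f(x) = sin x at x = 0°, 10°, ..., 50°
h = 10.0
f = [math.sin(math.radians(x)) for x in range(0, 60, 10)]

# forward differences at x0 = 0
diffs = [f[:]]
for _ in range(4):
    prev = diffs[-1]
    diffs.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])

theta = 5.0 / h            # x = x0 + theta*h with x = 5 degrees
est, coeff = 0.0, 1.0
for k in range(5):         # Newton's forward formula to fourth order
    est += coeff * diffs[k][0]
    coeff *= (theta - k) / (k + 1)
```

The fourth-order estimate agrees with sin 5° = 0.0871557 to about five decimal places.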
Given a set of values f(x0), f(x1), ..., f(xn) with x_j = x0 + jh, we have two interpolation formulae of order n available: Newton's forward difference formula and Newton's backward difference formula.
Checkpoint
1. What is the relationship between the forward and backward linear and
quadratic interpolation formulae (for a table of constant interval h) and
Newton's interpolation formulae?
2. When do you use Newton's forward difference formula?
3. When do you use Newton's backward difference formula?
The linear and quadratic interpolation formulae correspond to first and second-degree
polynomial approximations, respectively. In section 6.3, we have discussed Newton's
forward and backward interpolation formulae and noted that higher order
interpolation corresponds to higher degree polynomial approximation. In this Step we
consider an interpolation formula attributed to Lagrange, which does not require
function values at equal intervals. Lagrange's interpolation formula has the
disadvantage that the degree of the approximating polynomial must be chosen at the
outset. Thus, Lagrange's formula is mainly of theoretical interest for us here; in passing,
we mention that there are some important applications of this formula beyond the scope
of this book - for example, the construction of basis functions to solve differential
equations using a spectral (discrete ordinate) method.
Procedure
Let the function f be tabulated at the n + 1 (not necessarily equidistant) points x_j, j = 0, 1, ..., n, and be approximated by the polynomial

  P_n(x) = Σ_{k=0}^{n} L_k(x) f(x_k) ,

where, for k = 0, 1, 2, ..., n,

  L_k(x) = ∏_{i=0, i≠k}^{n} (x - x_i)/(x_k - x_i) ,

so that L_k(x_i) = 0 for i ≠ k and L_k(x_k) = 1, and hence P_n(x_j) = f(x_j) at every tabular point. The identity

  Σ_{k=0}^{n} L_k(x) = 1

(established by setting f(x) = 1) may be used as a check. Note also that with n = 1 we recover the linear interpolation formula of Section 6.1.
EXERCISE
Given that f (-2) = 46, f (-1 ) = 4, f ( 1 ) = 4, f (3) = 156, and f (4) = 484, use Lagrange's
interpolation formula to estimate the value of f(0).
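A direct transcription of Lagrange's formula in Python may be used to check such a computation; the point values below are taken from the exercise statement.

```python
def lagrange(xs, fs, x):
    """Evaluate the Lagrange interpolating polynomial at x."""
    total = 0.0
    for k in range(len(xs)):
        Lk = 1.0
        for i in range(len(xs)):
            if i != k:
                Lk *= (x - xs[i]) / (xs[k] - xs[i])
        total += Lk * fs[k]
    return total

# data from the exercise above
val = lagrange([-2.0, -1.0, 1.0, 3.0, 4.0],
               [46.0, 4.0, 4.0, 156.0, 484.0], 0.0)
```

Evaluating at x = 0 gives the estimate the exercise asks for.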
Example: Consider the tabulated function

  x       0     1     3     6     10
  f(x)    1    -6     4    169   921

Its divided difference table is

  x     f(x)    1st    2nd    3rd
  0       1
                 -7
  1      -6              4
                  5              1
  3       4             10
                 55              1
  6     169             19
                188
  10    921
It is notable that the third divided differences are constant. Below, we shall
interpolate from the table by using Newton’s divided difference formula, and determine
the corresponding collocation cubic.
f(x) = f(x0) + (x - x0) f[x0, x1] + (x - x0)(x - x1) f[x0, x1, x2] + ...
       + (x - x0)(x - x1)...(x - x_{n-1}) f[x0, x1, ..., xn] + R,
where
  R = (x - x0)(x - x1)...(x - xn) f[x, x0, x1, ..., xn].
Note that the remainder term R is zero at x0 , x1 ,..., x n and we may infer that the other
terms of the right hand side constitute the collocation polynomial or, equivalently, the
Lagrange polynomial. If the degree of collocation polynomial necessary is not known in
advance, it is customary to order the points x0 , x1 ,..., x n according to increasing distance
from x and add terms until R is small enough.
For instance for the tabular function in the above example we may find
f (2) and f (4) by Newton’s divided difference formula as below.
Since the third difference is constant, we can fit a cubic through the five points.
By Newton’s divided difference formula, using x0 = 0, x1 = 1, x 2 = 3, x3 = 6 the cubic is
f(x) = f(0) + x f[0,1] + x(x - 1) f[0,1,3] + x(x - 1)(x - 3) f[0,1,3,6]
     = 1 - 7x + 4x(x - 1) + 1·x(x - 1)(x - 3),
so that
f (2) = 1 − 14 + 8 − 2 = −7.
Note the corresponding collocation polynomial is obviously
1 − 7 x + 4 x 2 − 4 x + x 3 − 4 x 2 + 3 x = x 3 − 8 x + 1.
To find f(4), let us identify x0 = 1, x1 = 3, x2 = 6, x3 = 10, so that

  f(x) = -6 + 5(x - 1) + 10(x - 1)(x - 3) + (x - 1)(x - 3)(x - 6)

and

  f(4) = -6 + 5×3 + 10×3×1 + 3×1×(-2) = 33.
As expected, the collocation polynomial is the same cubic, x³ - 8x + 1.
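The divided-difference computation can be sketched in Python; the routine below (names our own) builds the top edge of the table, i.e. the coefficients of Newton's formula, and evaluates the collocation polynomial in nested form.

```python
def divided_differences(xs, fs):
    """Top edge of the divided-difference table:
    f[x0], f[x0,x1], f[x0,x1,x2], ..."""
    coef = list(fs)
    n = len(xs)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate Newton's divided-difference form at x (nested form)."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

# the tabulated function from the example above
xs = [0.0, 1.0, 3.0, 6.0, 10.0]
coef = divided_differences(xs, [1.0, -6.0, 4.0, 169.0, 921.0])
f2 = newton_eval(xs, coef, 2.0)
f4 = newton_eval(xs, coef, 4.0)
```

The fourth coefficient comes out zero, confirming that a cubic matches the data exactly.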
Exercise
1. Use Newton's divided difference formula to show that an interpolation for ∛20 from the points (0,0), (1,1), (8,2), (27,3), (64,4) on f(x) = ∛x is quite invalid.
2. Given that f (−2) = 46, f (−1) = 4, f (3) = 156 and f (4) = 484, compute f(0) by
Newton’s divided difference formula.
In particular, if we set θ = 0, we arrive at formulae for derivatives at the tabular points {x_j}:

  f'(x_j) ≈ (1/h) ( Δf_j - ½ Δ²f_j + ⅓ Δ³f_j - ¼ Δ⁴f_j + ... ),
  f''(x_j) ≈ (1/h²) ( Δ²f_j - Δ³f_j + (11/12) Δ⁴f_j - ... ).

If we set θ = ½, we have a relatively accurate formula at half-way points (without second differences):

  f'(x_{j+1/2}) ≈ (1/h) ( Δf_j - (1/24) Δ³f_j + ... );

if we set θ = 1 in the formula for the second derivative, we find (without third differences):

  f''(x_{j+1}) ≈ (1/h²) ( Δ²f_j - (1/12) Δ⁴f_j + ... ),

i.e., a formula for the second derivative at the next point.
Note that, if one retains only one term, one arrives at the well-known formulae:

  f'(x_j) ≈ Δf_j / h ,   f''(x_j) ≈ Δ²f_j / h².
Errors in numerical differentiation
FIGURE 14 Interpolating f(x)
It should also be noted that all these formulae involve division of a combination of
differences (which are prone to loss of significance or cancellation errors, especially if h
is small) by a positive power of h. Consequently, if we want to keep round-off errors
down, we should use a large value of h. On the other hand, it can be shown (see Exercise
3 below) that the truncation error is approximately proportional to hp, where p is a
positive integer, so that k must be sufficiently small for the truncation error to be
tolerable. We are in a cleft stick and must compromise with some optimum choice of h.
In brief, large errors may occur in numerical differentiation, based on direct
polynomial approximation, so that an error check is always advisable. There are
alternative methods, based on polynomials, which use more sophisticated procedures
such as least-squares or mini-max, and other alternatives involving other basis
functions (for example, trigonometric functions). However, the best policy is probably
to use numerical differentiation only when it cannot be avoided!
Example
We will estimate the values of f'(0.1) and f''(0.1) for f(x) = eˣ, using the data in the table referred to earlier. If we use the above formulae with θ = 0 (ignoring fourth and higher differences), we obtain estimates of both derivatives. Since f'(0.1) = f''(0.1) = f(0.1) = 1.10517, it is obvious that the second result is much less accurate (due to round-off errors).
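The computation can be sketched in Python. We regenerate a 6D table of eˣ at 0.10(0.05)0.30 (the spacing is our assumption) and apply the θ = 0 formulae above, truncated after third differences.

```python
import math

# simulate a 6D table of f(x) = e^x at x = 0.10(0.05)0.30
h = 0.05
f = [round(math.exp(0.10 + h * i), 6) for i in range(5)]
d1 = [f[i + 1] - f[i] for i in range(4)]      # Δf
d2 = [d1[i + 1] - d1[i] for i in range(3)]    # Δ²f
d3 = [d2[i + 1] - d2[i] for i in range(2)]    # Δ³f

fp  = (d1[0] - d2[0] / 2 + d3[0] / 3) / h     # estimate of f'(0.1)
fpp = (d2[0] - d3[0]) / (h * h)               # estimate of f''(0.1)
```

The first-derivative estimate is accurate to about five decimals, while the second-derivative estimate loses roughly two more figures: dividing the small, rounded differences by h² amplifies the round-off error, exactly as the text warns.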
Checkpoint
How are formulae for the derivatives of a function obtained from interpolation formulae?
a. Estimate the values of f'(1.00) and f"(1.00), using Newton's forward difference
formula.
b. Estimate f'(1.30) and f"(1.30), using Newton's backward difference formula.
3. Use Taylor series to find the truncation errors in the formulae above.
so that numerical integration, or quadrature, must be used.
It is well known that the definite integral may be interpreted as the area under the curve y = f(x) for a ≤ x ≤ b, and may be evaluated by subdivision of the interval and summation of the component areas. This additive property of the definite integral permits evaluation in a piecewise sense. For any subinterval [x_j, x_{j+n}] of [a, b], we may approximate f(x) by the interpolating polynomial P_n(x). Then we obtain the approximation

  ∫ f(x) dx ≈ ∫ P_n(x) dx   over the subinterval,

which will be a good approximation provided n is chosen so that the error |f(x) - P_n(x)| in each tabular subinterval is sufficiently small. Note that the interval [a, b] may be subdivided into N strips of equal width h, such that b = a + Nh; then one can use the additive property.
Accuracy
The trapezoidal rule corresponds to a rather crude polynomial approximation (a straight
line) between successive points xj and xj+1 = xj + h, and hence can only be accurate for
sufficiently small h. An approximate (upper) bound on the error may be derived as
follows:
Ignoring higher-order terms, one arrives at an approximate bound on this error when
using the trapezoidal rule over N subintervals:
Whenever possible, we will choose h small enough to make this error negligible. In the
case of hand computations from tables, this may not be possible. On the other hand, in a
computer program in which f(x) may be generated anywhere in [a, b], the interval
may be resubdivided until sufficient accuracy is achieved. (The integral value for
successive subdivisions can be compared, and the subdivision process terminated when
there is adequate agreement between successive values.)
Example
Obtain an estimate of the integral
using the trapezoidal rule and the data in the table on page 63. If we use T(h) to denote
the approximation with strip width h, we obtain
EXERCISES
Estimate the value of the integral
using the trapezoidal rule and the data given in Exercise 2 of the preceding Step.
Use the trapezoidal rule with h = 1, 0.5, and 0.25 to estimate the value of the integral.
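A composite trapezoidal routine is easy to sketch, and halving h should reduce the error by roughly a factor of four, in line with the h² error bound above; the test integral ∫₀¹ eˣ dx = e - 1 is our own choice.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n strips of width h = (b-a)/n."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * s

exact = math.e - 1.0                     # exact value of the test integral
e1 = abs(trapezoid(math.exp, 0.0, 1.0, 4) - exact)
e2 = abs(trapezoid(math.exp, 0.0, 1.0, 8) - exact)
```

Comparing successive subdivisions like this is exactly the termination criterion suggested in the text.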
A parabolic arc is fitted to the curve y = f(x) at the three tabular points x_j, x_{j+1}, x_{j+2} of each successive pair of strips. Hence, if N = (b - a)/h is even, one obtains Simpson's Rule:

  ∫_a^b f(x) dx ≈ (h/3) [ f_0 + 4f_1 + 2f_2 + 4f_3 + ... + 2f_{N-2} + 4f_{N-1} + f_N ] ,

where f_j = f(a + jh).
Integration by Simpson's Rule involves computation of a finite sum of given values of the
integrand f, as in the case of the trapezoidal rule. Simpson's Rule is also effective for
implementation on a computer; a single direct application in a hand calculation usually
gives sufficient accuracy.
Accuracy
For a given integrand f, it is quite appropriate to let a computer program increase the interval subdivision in order to achieve a required accuracy, while for hand calculations an error bound may again be useful. One may reformulate the quadrature rule over [x_j, x_{j+2}],

  ∫_{x_j}^{x_{j+2}} f(x) dx ≈ (h/3) ( f_j + 4f_{j+1} + f_{j+2} ) ,

by replacing f_{j+2} = f(x_{j+1} + h) and f_j = f(x_{j+1} - h) by their Taylor series about x_{j+1}. A comparison of these two versions shows that the truncation error in a single application of the rule is approximately

  -(h⁵/90) f⁽⁴⁾(x_{j+1}) .

Ignoring higher order terms, we conclude that the approximate bound on this error while estimating ∫_a^b f(x) dx is

  |E| ≈ ( (b - a) h⁴ / 180 ) max_{a≤x≤b} |f⁽⁴⁾(x)| .

Note that the error bound is proportional to h⁴, compared with h² for the cruder trapezoidal rule. Note also that Simpson's rule is exact for cubic functions!
Example
We shall estimate the value of the integral
,
using Simpson's rule and the data in Exercise 2 of STEP29. If we choose h = 0.15 or h = 0.05, there will be an even number of intervals. Denoting the approximation with strip width h by S(h), we obtain S(0.15) and S(0.05), respectively. Since f⁽⁴⁾(x) = -15x^(-7/2)/16, an approximate truncation error bound can be computed to 4D.
Use Simpson's Rule with N = 2 to estimate the value of
.
Estimate to 5D the resulting error, given that the true value of the integral is 0.26247.
1. i) The normalized floating point representations of 10,872 and 0.0066 are respectively
___________________ and _______________ [1 point]
ii) a) The numbers 10,872 and 0.0066 chopped to two significant figures are given
by ____________________ and _________________
b) The numbers 10,872 and 0.0066 chopped to two decimal places are given
by ____________________ and _________________
c) The numbers 10,872 and 0.0066 rounded to two significant figures are given
by ____________________ and _________________
d) The numbers 10,872 and 0.0066 rounded to two decimal places are given
by ____________________ and _________________ [4 points]
iii) Using four-digit normalized floating point arithmetic give the sum [3 point]
a) 10,872 + 0.0066
b) Calculate the absolute and relative errors in your answer for iii(a) above.
iv) Evaluate 4.27 × 3.13 as accurately as possible, assuming that the values 4.27 and 3.13 are correct to three significant figures. [3 points]
2. i) Given the equation -0.4x² + 2.2x + 4.7 = 0 and the table

   x                      -2     -1     0      7      8
   -0.4x² + 2.2x + 4.7   -1.3    2.1   4.7    0.5   -3.3
b. Using the bisection method, determine the highest root rounded to two decimal places. Compute the absolute error after each iteration. Arrange your work in a table. [3 points]
c. Using the method of false position, determine the smallest root rounded to two decimal places. Compute the absolute error after each iteration. Arrange your work in a table. [3 points]
d. Using the method of simple iteration, determine the highest root rounded to four decimal places. Use the result of (b) above as initial guess. [3 points]
3. a) Solve the following system of equations using the Gauss-Jordan elimination
method. [4 points]
2 x + 2 y − 4 z = 14
3x + y + z = 8
2 x − y + 2 z = −1
1. The population of a city in a census taken once in ten years is given below.
x 1935 1945 1955 1965 1975 1985 1995
f(x) 35 42 58 84 120 165 220
a) Construct a forward difference table for the data in the table above.
[4 points]
b) What is the lowest degree of polynomial that matches the data? [2 points]
c) Determine Δ²f₂, δ²f₄, and ∇²f₆. [3 points]
d) Estimate the population in the years 1940, 1980 and 1990. [2 points]
2. Use Lagrange interpolating polynomials of the first and second order to evaluate f(2) on the basis of the following table. [5 points]
x 1 3 6
f(x) -6 4 196
3. Given the table of values
x -2 0 1 2 5
f(x) -15 1 -3 -7 41
a) Construct the corresponding divided difference table and decide the lowest
degree of the polynomial which matches the data exactly [5 points]
b) Find the value of f[1,2,5] [2 points]
c) Write down the corresponding interpolating polynomial. (Do not simplify!) [3 points]
1. Given the Newton forward difference table for f(x) = sin x below,
x      f(x)       Δ       Δ²       Δ³       Δ⁴
0.2 0.1987
0.0487
0.25 0.2474 − 0.0006
0.0481 − 0.0001
0.3 0.2955 − 0.0007 −0.0001
0.0474 − 0.0002
0.35 0.3429 − 0.0009
0.0465
0.4 0.3894
a. Use Newton’s forward difference formula to estimate f ' (0.2) and f " (0.2)
ignoring the fourth difference. [5 points]
b. Use the trapezoidal rule with step length 0.05 to estimate ∫ f(x) dx from 0.2 to 0.4. Give the
Part III Solve the following problems by showing the necessary steps neatly and
clearly.
1. a. Develop an iterative formula, using the Newton-Raphson method, to find the fourth root of a positive number K (i.e. K^(1/4)). (2 points)
b. Draw a flow chart that can help in developing a computer program for finding
the fourth root of a positive number with a given tolerance say ε using the
result of (a). Your flow chart should display the number of iterations required to achieve the result. If this process diverges, devise a termination condition as well. (Hint: use x0, ε, and imax as input.) (3 points)
2. a. Find the inverse of
⎛ 1 1 2⎞
⎜ ⎟
⎜ 1 2 4⎟
⎜ 2 1 3⎟
⎝ ⎠
by using Jordan method (2 points)
b. Solve the system of equations below by using your result in (a):
x + y + 2z = 3
x + 2 y + 4 z = −2 (2 points)
2 x + y + 3z = 1
3. Solve the system of equations below by using LU decomposition method
10 x + y + z = 12
2 x + 10 y + z = 13 (5 points)
2 x + 2 y + 10 z = 14
_____ 1. Let A = (aij) be a square matrix such that aij = 0 where |i - j| ≥ 2; then matrix A is
a. Diagonally dominant      b. Tri-diagonal      c. Upper triangular
d. Lower triangular e. None of these
_____ 2. Which one of the following methods of solving system of linear equations is
different from the rest?
a. Gaussian elimination b. LU decomposition
c. Gauss-Jacobi d. Gauss-Jordan
e. None of the above
_____ 3. Which one of the following operators is used as a basis for defining the rest?
a. E b. ∇ c. δ d. Δ e. None of these
_____ 4. ∇Δf_i =
a. δ²f_i      b. (Δ - ∇)f_i      c. Δ∇f_i
d. All of the above      e. None of these
_____ 5. Let f(x) = (1 - x)(1 - 2x)(1 - 3x)(1 - 4x), x0 = 0, and h = 0.5; then Δ⁴f₀ equals
a. 18 b. 288 c. 1 d. cannot be determined
e. None of the above
_____ 6. Given x0, x1, ..., xn and f0, f1, ..., fn, which one of the following is true about Lk(x) (the kth order Lagrange coefficient)?
a. Lk(xk) = 0      b. Lk(xi) = 0 for k ≠ i      c. Σ_{k=0}^{n} Lk(x) = 0 for x ∈ [x0, xn]
d. Lk(xk) = Σ_{i=0, i≠k}^{n} f_i      e. None of these
_____ 7. Which one of the following interpolation formulas cannot be applied when the tabulated function has a constant interval length between consecutive arguments?
a. Newton’s forward interpolation formula
b. Newton’s backward interpolation formula
c. Lagrange’s interpolation formula
d. Divided difference interpolation formula
e. None of the above
____ 9. The total truncation error in the trapezoidal rule with constant step length h applied over [a, b] is approximately
a. -(h³/12) f''(x_j)      b. -((b - a)h⁴/12) max_{a≤x≤b} f''(x)      c. -((b - a)h²/12) max_{a≤x≤b} f''(x)
d. -((b - a)h⁴/180) max_{a≤x≤b} f⁽⁴⁾(x)      e. None of these
____ 10. Which one of the following methods is the most appropriate one for solving a system of nonlinear equations?
a. Gaussian elimination b. LU decomposition
c. Gauss-Jacobi d. Gauss-Seidel
e. None of these
Part II: Solve the following problems by showing the necessary steps clearly.
1. Given the system of equations
5 x + 2 y + z = 12
x + 4 y + 2 z = 15 and initial guess x = 2, y = 3, z = 0.
x + 2 y + 5 z = 20
answer questions (a) and (b) below.
a. Discuss why the Gauss-Seidel iteration process converges regardless of the initial guess. [2 points]
b. Solve the system using the Gauss Seidel iteration method in two iterations.
[4 points]
2. Given the table of data below, answer the questions below the table.
x -2 -1 0 1 2 3
f(x) 31 5 1 1 11 61
x 0 2 3
f ( x) 7 11 28
x 0 1 2 4 5 6
f ( x) 1 14 15 5 6 19
f. Construct the corresponding divided difference table and decide the lowest degree
of the polynomial which matches the data exactly [4 points]
g. Find the value of f[1,2,4]. [1 point]
h. Write down the corresponding interpolating polynomial.(Do not simplify!)
[3 points]
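A small Python sketch of how a divided-difference table is built, column by column; as an illustration for part g it takes the points (1, 14), (2, 15), (4, 5) from the last table above.

```python
def divided_differences(xs, ys):
    """Return the columns of the divided-difference table; column j
    holds the j-th order differences f[x_i, ..., x_{i+j}]."""
    cols = [list(ys)]
    for j in range(1, len(xs)):
        prev = cols[-1]
        cols.append([(prev[i + 1] - prev[i]) / (xs[i + j] - xs[i])
                     for i in range(len(prev) - 1)])
    return cols

# f[1, 2, 4] from the points (1, 14), (2, 15), (4, 5):
cols = divided_differences([1, 2, 4], [14, 15, 5])
print(cols[2][0])   # the second-order divided difference f[1, 2, 4]
```

For part f, the lowest matching degree is read off the table: it is the order of the last column whose entries are not all zero (equivalently, the order at which the differences become constant).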
5. Given Newton's forward difference table for f(x) = ln x below,
x      f(x)      Δ         Δ²        Δ³        Δ⁴        Δ⁵        Δ⁶
4.0    1.3863
                 0.0488
4.2    1.4351              -0.0023
                 0.0465               0.0003
4.4    1.4816              -0.0020              -0.0003
                 0.0445               0.0000               0.0006
4.6    1.5261              -0.0020               0.0003              -0.0010
                 0.0425               0.0003              -0.0004
4.8    1.5686              -0.0017              -0.0001
                 0.0408               0.0002
5.0    1.6094              -0.0015
                 0.0393
5.2    1.6487
d. Use Newton's forward difference formula to estimate f'(4.0) and f''(4.0),
ignoring the fourth difference. [4 points]
e. Use the trapezoidal rule with step length 0.2 to estimate ∫_4.0^5.2 f(x)dx. Give the
truncation error bound involved in using this method. [5 points]
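Both parts can be checked numerically from the tabulated values. The sketch below uses the standard forward-difference derivative formulas truncated after Δ³, f'(x0) ≈ (1/h)(Δf0 − Δ²f0/2 + Δ³f0/3) and f''(x0) ≈ (1/h²)(Δ²f0 − Δ³f0), together with the composite trapezoidal rule.

```python
# Tabulated values of f(x) = ln x at x = 4.0, 4.2, ..., 5.2 (h = 0.2)
h = 0.2
f = [1.3863, 1.4351, 1.4816, 1.5261, 1.5686, 1.6094, 1.6487]

# Leading forward differences at x0 = 4.0, read off the difference table
d1, d2, d3 = 0.0488, -0.0023, 0.0003

# Newton's forward-difference derivative formulas, truncated after the
# third difference (the exam says to ignore the fourth difference)
fp = (d1 - d2 / 2 + d3 / 3) / h       # f'(4.0); exact value is 1/4
fpp = (d2 - d3) / h ** 2              # f''(4.0); exact value is -1/16

# Composite trapezoidal rule over [4.0, 5.2]
T = h * (f[0] / 2 + sum(f[1:-1]) + f[-1] / 2)

# Truncation error bound: |E| <= (b - a) h^2 / 12 * max|f''|.
# Here f''(x) = -1/x^2, so max|f''| on [4, 5.2] is 1/16, giving
# |E| <= 1.2 * 0.04 / 12 * (1/16) = 0.00025.
print(fp, fpp, T)
```

The trapezoidal estimate 1.82766 agrees with the exact integral x ln x − x evaluated between the limits (≈ 1.82785) to within the stated bound.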
ii. Using five-digit normalized floating-point arithmetic, give the sum [5 points]
d) x + y
e) Calculate the absolute and relative errors due to rounding in your answer for
ii(a) above.
f) Calculate the absolute and relative errors due to chopping in your answer for
ii(a) above.
4. a) Use three bisection steps on the equation x sin x + cos x = 0 in the interval [2, 3]
to give the approximate root. (Arrange your work in a table.) [3 points]
b) How many steps of the bisection method are needed to determine the root of the
equation in problem (2a) above with an error of at most 0.5 × 10⁻⁵? [2 points]
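A Python sketch of parts a and b. Note that f(2) > 0 and f(3) < 0, so [2, 3] brackets a root. For part b the sketch uses the interval-width criterion (b − a)/2ⁿ ≤ ε; textbooks vary on whether the midpoint-error criterion (b − a)/2ⁿ⁺¹ is used instead, which would give one step fewer.

```python
import math

def f(x):
    return x * math.sin(x) + math.cos(x)

def bisect(a, b, steps):
    """Halve the bracketing interval [a, b] the given number of times,
    assuming f(a) and f(b) have opposite signs; return the midpoint."""
    for _ in range(steps):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:   # sign change in [a, m]
            b = m
        else:                  # sign change in [m, b]
            a = m
    return (a + b) / 2

print(bisect(2, 3, 3))   # three steps, as the exam asks

# Part b: smallest n with (b - a) / 2^n <= 0.5e-5
n = math.ceil(math.log2((3 - 2) / 0.5e-5))
print(n)
```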
x1 = 2
− x1 − x2 = −1
2 x2 x3 = 0
x3 x4 x5 = 3
− x5 2 x6 = 0
x6 x7 = 1
x7 = 2
c. Draw a flow chart for the algorithm that you have developed in (3b). [4 points]
5. Solve the system of equations below using the method of LU decomposition. [5 points]
   2x − 5y + z = 12
   −x + 3y − z = −8
   3x − 4y + 2z = 16
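A sketch of the Doolittle variant of LU decomposition (unit diagonal on L, no pivoting, which is enough here since no zero pivot arises), followed by forward and back substitution:

```python
def lu_solve(A, b):
    """Solve Ax = b via Doolittle LU decomposition (L has a unit
    diagonal), then Ly = b and Ux = y by substitution."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1.0
        for j in range(i, n):        # row i of U
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):    # column i of L
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    y = [0.0] * n                    # forward substitution: Ly = b
    for i in range(n):
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    x = [0.0] * n                    # back substitution: Ux = y
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x

A = [[2, -5, 1], [-1, 3, -1], [3, -4, 2]]
b = [12, -8, 16]
print(lu_solve(A, b))
```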
5. Using a third-degree Lagrange polynomial approximation for the function that
passes through the points given in the table below, find P3(1.7).

   xi     F(xi)
   1      0
   1.2    0.182
   1.6    0.47
   1.9    0.642

(Hint: L1(1.7) = -0.101, L2(1.7) = 0.885, L3(1.7) = 0.213) [4 points]
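A direct-evaluation sketch of the Lagrange form P3(x) = Σ f(x_k) L_k(x). Two built-in sanity checks on the coefficients: the L_k sum to 1 at any x, and the polynomial reproduces the tabulated values at the nodes.

```python
def lagrange_basis(xs, k, x):
    """Evaluate the k-th Lagrange coefficient L_k(x) for the nodes xs."""
    Lk = 1.0
    for i, xi in enumerate(xs):
        if i != k:
            Lk *= (x - xi) / (xs[k] - xi)
    return Lk

xs = [1.0, 1.2, 1.6, 1.9]
ys = [0.0, 0.182, 0.47, 0.642]

# P3(1.7) = sum over k of f(x_k) * L_k(1.7)
p = sum(y * lagrange_basis(xs, k, 1.7) for k, y in enumerate(ys))
print(p)
```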
   x    | 0 | 1 | 2 | 3 | 5
   f(x) | 2 | 1 | 6 | 5 | -183
a. Construct the corresponding divided difference table. [4 points]
b. Find the value of f[1, 2, 3, 5]. [1 point]
c. Write down the corresponding interpolating polynomial. (Do not simplify!)
[3 points]
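The interpolating polynomial of part c is the Newton form built from the top diagonal of the divided-difference table, P(x) = f[x0] + f[x0,x1](x − x0) + ..., which is evaluated efficiently by nested multiplication. A sketch on this data:

```python
def newton_coefficients(xs, ys):
    """Return the top diagonal of the divided-difference table:
    f[x0], f[x0,x1], ..., f[x0,...,xn], computed in place."""
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate the Newton-form polynomial by nested multiplication."""
    p = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        p = p * (x - xs[i]) + coef[i]
    return p

xs = [0, 1, 2, 3, 5]
ys = [2, 1, 6, 5, -183]
coef = newton_coefficients(xs, ys)
print(coef)   # coefficients of 1, (x - 0), (x - 0)(x - 1), ...
```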
8. Given the Newton forward difference table for f(x) = e^x below,
x      f(x)      Δ         Δ²        Δ³        Δ⁴
1.0    2.7183
                 0.2859
1.1    3.0042               0.0301
                 0.3160               0.0032
1.2    3.3201               0.0332               0.0003
                 0.3492               0.0035
1.3    3.6693               0.0367               0.0004
                 0.3859               0.0039
1.4    4.0552               0.0406               0.0004
                 0.4265               0.0043
1.5    4.4817               0.0449
                 0.4713
1.6    4.9530
g. Use Newton's forward difference formula to estimate f'(1) and f''(1),
ignoring the fourth difference. [4 points]
h. Use Simpson's rule with step length 0.1 to estimate ∫_1^1.6 f(x)dx. Give the
truncation error bound involved in using this method. [5 points]
i. Draw a flow chart for Simpson’s rule. (Bonus) [10 points]
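The composite Simpson's rule of part h can be sketched directly on the tabulated values; the six subintervals give the even count Simpson's rule requires.

```python
# Composite Simpson's rule on the tabulated values of f(x) = e^x
# at x = 1.0, 1.1, ..., 1.6 (h = 0.1, six subintervals).
h = 0.1
f = [2.7183, 3.0042, 3.3201, 3.6693, 4.0552, 4.4817, 4.9530]

S = (h / 3) * (f[0] + f[-1]
               + 4 * sum(f[1:-1:2])    # odd-indexed ordinates
               + 2 * sum(f[2:-1:2]))   # even-indexed interior ordinates
print(S)

# Truncation error bound: |E| <= (b - a) h^4 / 180 * max|f''''|.
# For f(x) = e^x the maximum of f'''' on [1, 1.6] is e^1.6 ~ 4.953,
# so |E| <= 0.6 * 0.0001 / 180 * 4.953 ~ 1.65e-6.
```

The estimate agrees with the exact value e^1.6 − e to within this bound, as expected for a smooth integrand.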
C
Cholesky method · 53
Chopping · 8

D
Definite integrals · 95
Direct method · 33
Doolittle method · 50

E
Error bound · 99

F
Fixed-point representation · 6
Flow chart · 3

G
Gauss-Jordan Algorithm · 47
graphically · 12

I
Ill-conditioned system · 43
Ill-conditioning · 43

O
Order of convergence · 22
Overflow · 7

P
Partial pivoting · 42

Q
Quadrature · 95, 98

R
relative error · 8
round off error · 8
Rounding errors · 5
Round-off errors and numbers of operations · 42

S
Scientific notation · 6
Secant method · 20
Significant figures · 6
Simpson's Rule · 98
Systems of linear equations · 33