
Numerical Methods - Lecture Notes

Maksat Ashyraliyev
Associate Professor, Dr.
MAT2045
Bahcesehir University

1 Introduction, Error Analysis, O notation
Mathematical problems are used to simulate real life phenomena. Some
of these problems can be solved analytically. For instance, the following
algebraic equation
x³ − x² − 3x + 3 = 0
can be solved by hand. However, most of the mathematical prob-
lems are difficult or even impossible to solve analytically.
Therefore, numerical methods are needed to solve those problems with
the use of computers. Here are some examples of problems which can-
not be solved by hand:
• Nonlinear algebraic equation 2x + 3 cos x − eˣ = 0.
• System of linear algebraic equations Ax = b, where A is a very
large matrix.
• Definite integral ∫₀^π √(1 + cos²x) dx.

• Initial Value Problem dy/dx = e^(x²), y(0) = 1.

• etc.
Computers can only perform four arithmetic operations (addition,
subtraction, multiplication, division) and one logical operation (com-
parison). Numerical methods are techniques by which mathemat-
ical problems are formulated so that they can be solved with these
arithmetic and logical operations only.
A numerical result is always an approximation, while an analytical result is exact. A numerical result can be made as accurate as needed, but it is never exact!


1.1 Absolute and Relative Errors


Definition 1.1. Suppose x̂ is an approximation to x. The absolute
error Ex in the approximation of x is defined as the magnitude of the
difference between exact value x and the approximated value x̂. So,

Ex = |x − x̂|. (1.1)

The relative error Rx in the approximation of x is defined as the


ratio of the absolute error to the magnitude of x itself. So,
Rx = |x − x̂| / |x|,   x ≠ 0.   (1.2)

Example 1.1. Let x̂ = 1.02 be an approximation of x = 1.01594,


ŷ = 999996 be an approximation of y = 1000000 and ẑ = 0.000009 be
an approximation of z = 0.000012. Then

Ex = |x − x̂| = |1.01594 − 1.02| = 0.00406,


Rx = |x − x̂| / |x| = |1.01594 − 1.02| / |1.01594| = 0.0039963,

Ey = |y − ŷ| = |1000000 − 999996| = 4,

Ry = |y − ŷ| / |y| = |1000000 − 999996| / |1000000| = 0.000004,

Ez = |z − ẑ| = |0.000012 − 0.000009| = 0.000003,

Rz = |z − ẑ| / |z| = |0.000012 − 0.000009| / |0.000012| = 0.25.
Note that Ex ≈ Rx, Ry ≪ Ey and Ez ≪ Rz.
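These computations are easy to reproduce on a machine. Below is a minimal Python sketch (the helper names abs_err and rel_err are ours, introduced only for illustration):

# Absolute and relative errors for the three pairs from Example 1.1
def abs_err(exact, approx):
    return abs(exact - approx)

def rel_err(exact, approx):
    # The relative error is undefined for exact == 0
    return abs(exact - approx) / abs(exact)

for exact, approx in [(1.01594, 1.02), (1000000, 999996), (0.000012, 0.000009)]:
    print(exact, approx, abs_err(exact, approx), rel_err(exact, approx))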


1.2 Types of Errors in Numerical Procedures


When applying a numerical method to solve a certain mathematical
problem on the computer, one has to be aware of two main types of
errors which may affect significantly the numerical result. The first
type of errors, so called truncation errors, occur when a mathemat-
ical problem is reformulated so that it can be solved with arithmetic
operations only. The other type of errors, so called round-off errors,
occur when a machine actually does those arithmetic operations.
Here are some real life examples of what can happen when errors
in numerical algorithms are underestimated.

• The Patriot Missile failure in Dhahran, Saudi Arabia, on February 25, 1991, which resulted in 28 deaths, is ultimately attributable to poor handling of rounding errors.

• The explosion of the Ariane 5 rocket just after lift-off on its


maiden voyage off French Guiana, on June 4, 1996, was ulti-
mately the consequence of a simple overflow.

• The sinking of the Sleipner A offshore platform in Gandsfjorden near Stavanger, Norway, on August 23, 1991, resulted in a loss of nearly one billion dollars. It was found to be the result of inaccurate finite element analysis.

1.2.1 Truncation Errors


Four arithmetic operations are sufficient to evaluate any polynomial
function. But how can we evaluate other functions by using only four
arithmetic operations?
For instance, how do we find the value of the function f(x) = √x at x = 5?
With the help of classical Taylor’s theorem, functions can be ap-
proximated with polynomials. Let us first recall Taylor's theorem
known from Calculus.


Theorem 1.1. (Taylor’s Theorem)


Suppose the function f has n + 1 continuous derivatives on the interval [a, b]. Then for any points x and x₀ in [a, b], there exists a number ξ between x and x₀ such that

f(x) = Pₙ(x) + Rₙ(x),   (1.3)

where

Pₙ(x) = f(x₀) + f′(x₀)(x − x₀) + f″(x₀)(x − x₀)²/2! + ... + f^(n)(x₀)(x − x₀)ⁿ/n!   (1.4)

is called the n-th order Taylor polynomial and

Rₙ(x) = f^(n+1)(ξ)(x − x₀)ⁿ⁺¹/(n + 1)!   (1.5)

is called the remainder term. Formulas (1.3)-(1.5) together are called the Taylor expansion of the function f(x) about the point x₀.

Remark 1.1. If the remainder term Rₙ(x) in (1.3) is small enough, then the function f(x) can be approximated with the Taylor polynomial Pₙ(x), so that

f(x) ≈ Pₙ(x).   (1.6)

The error in this approximation is the magnitude of the remainder term Rₙ(x), since E = |f(x) − Pₙ(x)| = |Rₙ(x)|. In numerical analysis, the remainder term (1.5) is usually called the truncation error of the approximation (1.6).

Example 1.2. Let f(x) = eˣ. Then, using Taylor's theorem with x₀ = 0, we have

eˣ = Pₙ(x) + Rₙ(x),

where

Pₙ(x) = 1 + x/1! + x²/2! + x³/3! + ... + xⁿ/n!

is the Taylor polynomial and

Rₙ(x) = e^ξ xⁿ⁺¹/(n + 1)!

is the remainder term (or truncation error) associated with Pₙ(x). Here ξ is some point between 0 and x.
Now, assume that we want to approximate the value of f (1) = e with
Taylor polynomials. The exact value of number e is known from the
literature; it is e = 2.718281828459 . . .
Taylor polynomials up to order 5 for the function f(x) = eˣ are:

P₀(x) = 1,
P₁(x) = 1 + x,
P₂(x) = 1 + x + x²/2,
P₃(x) = 1 + x + x²/2 + x³/6,
P₄(x) = 1 + x + x²/2 + x³/6 + x⁴/24,
P₅(x) = 1 + x + x²/2 + x³/6 + x⁴/24 + x⁵/120.
Then

P₀(1) = 1 =⇒ Error = |f(1) − P₀(1)| = 1.718281828459...
P₁(1) = 2 =⇒ Error = |f(1) − P₁(1)| = 0.718281828459...
P₂(1) = 2.5 =⇒ Error = |f(1) − P₂(1)| = 0.218281828459...
P₃(1) = 2.666666... =⇒ Error = |f(1) − P₃(1)| = 0.051615161792...
P₄(1) = 2.708333... =⇒ Error = |f(1) − P₄(1)| = 0.009948495126...
P₅(1) = 2.716666... =⇒ Error = |f(1) − P₅(1)| = 0.001615161792...
We observe that truncation error decreases as we use higher order
Taylor polynomials.


Now, let us find the order of Taylor polynomial that should be used to approximate f(1) so that the error is less than 10⁻⁶. We have

Rₙ(1) = e^ξ/(n + 1)! ≤ e/(n + 1)! < 3/(n + 1)! < 10⁻⁶ =⇒ n ≥ 9.

So, analyzing the remainder term we can say that the 9-th order Taylor polynomial is certain to give an approximation with error less than 10⁻⁶. Indeed, we have

P₉(1) = 1 + 1/1! + 1/2! + 1/3! + 1/4! + 1/5! + 1/6! + 1/7! + 1/8! + 1/9! = 2.718281525573...

for which |f(1) − P₉(1)| = 0.000000302886... < 10⁻⁶.
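The error table above can be verified with a minimal Python sketch that sums the partial Taylor series for e at x = 1 (only the standard math module is used):

import math

# P_n(1) = 1/0! + 1/1! + ... + 1/n! approximates e = 2.718281828459...
for n in range(10):
    p_n = sum(1.0 / math.factorial(k) for k in range(n + 1))
    print(n, p_n, abs(math.e - p_n))   # the error drops below 1e-6 at n = 9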

Example 1.3. Consider the definite integral

p = ∫₀^(1/2) e^(x²) dx = 0.544987104184...

From the previous example we have the approximation

eˣ ≈ 1 + x/1! + x²/2! + x³/3! + x⁴/4!

Replacing x with x², we get

e^(x²) ≈ 1 + x²/1! + x⁴/2! + x⁶/3! + x⁸/4!

Then

p = ∫₀^(1/2) e^(x²) dx ≈ ∫₀^(1/2) (1 + x² + x⁴/2 + x⁶/6 + x⁸/24) dx =

= [x + x³/3 + x⁵/10 + x⁷/42 + x⁹/216] evaluated from 0 to 1/2 =

= 1/2 + 1/(3·2³) + 1/(10·2⁵) + 1/(42·2⁷) + 1/(216·2⁹) = 0.54498672... = p̂

with the error in the approximation being |p − p̂| ≈ 3.8 × 10⁻⁷.
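The same term-by-term integration can be restated in a few lines of Python; the integral of x^(2k)/k! over [0, 1/2] is x^(2k+1)/((2k+1)·k!) evaluated at x = 1/2, which is exactly the sum computed above:

import math

x = 0.5
# Sum of x^(2k+1) / ((2k+1) * k!) for k = 0, ..., 4
p_hat = sum(x**(2*k + 1) / ((2*k + 1) * math.factorial(k)) for k in range(5))
print(p_hat)   # 0.54498672..., while p = 0.544987104184...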

Example 1.4. What is the maximum possible error in using the approximation

sin x ≈ x − x³/3! + x⁵/5!

when −0.3 ≤ x ≤ 0.3?

Solution: Using Taylor's theorem for the function f(x) = sin x with x₀ = 0, we have

sin x = x − x³/3! + x⁵/5! + R₆(x),

where

R₆(x) = f⁽⁷⁾(ξ)x⁷/7! = −cos(ξ) x⁷/7!

Then for −0.3 ≤ x ≤ 0.3 we have

|R₆(x)| ≤ |x|⁷/7! ≤ 0.3⁷/7! ≈ 4.34 × 10⁻⁸.

Remark 1.2. Truncation error is under the control of the user: it can be reduced as much as is needed. However, it cannot be eliminated entirely!

1.2.2 Round-off Errors


Computers use only a fixed number of digits to represent a number. As a result, the numerical values stored in a computer are said to have a finite precision. Because of this, round-off errors occur when the arithmetic operations performed in a machine involve numbers with only a finite number of digits.

Remark 1.3. Round-off errors depend on hardware and the computer


language used.


To understand the effect of round-off errors we should first under-


stand how numbers are stored on a machine. Although computers use
binary number system (so that every digit is 0 or 1), for simplicity of
explanation we will use here the decimal number system.

Any real number y can be normalized to achieve the form:

y = ±0.d₁d₂...dₖ₋₁dₖdₖ₊₁dₖ₊₂... × 10ⁿ   (1.7)

where n ∈ Z, 1 ≤ d₁ ≤ 9 and 0 ≤ dⱼ ≤ 9 for j > 1. For instance,

y = 38/3 = 0.126666... × 10² or y = 1/33 = 0.3030303030... × 10⁻¹.
A real number y in (1.7) may have infinitely many digits. As we have already mentioned, a computer stores only a finite number of digits. That is why the number y has to be represented with the so-called floating-point form of y, denoted by fl(y), which has, let us say, only k digits. There are two ways to perform this. One method, called chopping, is to simply chop off in (1.7) the digits dₖ₊₁dₖ₊₂... to obtain

flc(y) = ±0.d₁d₂...dₖ₋₁dₖ × 10ⁿ

The other method, called rounding, is performed in the following way: if dₖ₊₁ < 5 then the result is the same as chopping; if dₖ₊₁ ≥ 5 then 1 is added to the k-th digit and the resulting number is chopped. The floating-point form of y obtained with rounding is denoted by flr(y).

Example 1.5. Determine five-digit (a) chopping and (b) rounding values of the numbers 22/7 = 3.14285714... and π = 3.14159265...

Solution:
(a) Five-digit chopping of the numbers gives flc(22/7) = 0.31428 × 10¹ and flc(π) = 0.31415 × 10¹. Therefore, the errors of chopping are

|flc(22/7) − 22/7| ≈ 5.7 × 10⁻⁵;   |flc(π) − π| ≈ 9.3 × 10⁻⁵.


(b) Five-digit rounding of the numbers gives flr(22/7) = 0.31429 × 10¹ and flr(π) = 0.31416 × 10¹. So, the errors of rounding are

|flr(22/7) − 22/7| ≈ 4.3 × 10⁻⁵;   |flr(π) − π| ≈ 7.3 × 10⁻⁶.

We observe that the values obtained with the rounding method have smaller errors than those obtained with the chopping method.
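One way to emulate k-digit chopping and rounding in Python is sketched below. Since Python floats are binary rather than decimal, this only approximates true decimal-digit arithmetic, but it reproduces the five-digit values of Example 1.5 (the function names chop and round_k are ours):

import math

def chop(y, k=5):
    # Keep the first k significant decimal digits of y, dropping the rest.
    n = math.floor(math.log10(abs(y))) + 1   # exponent in y = 0.d1d2... * 10^n
    return math.trunc(y * 10**(k - n)) / 10**(k - n)

def round_k(y, k=5):
    # Round y to k significant decimal digits.
    n = math.floor(math.log10(abs(y))) + 1
    return round(y * 10**(k - n)) / 10**(k - n)

for y in (22/7, math.pi):
    print(chop(y), round_k(y), abs(chop(y) - y), abs(round_k(y) - y))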

Remark 1.4. The error that results from replacing a number with its floating-point form is called round-off error, regardless of whether the rounding or chopping method is used. However, the rounding method results in a smaller error than the chopping method, which is why it is generally preferred.

1.2.3 Loss of Significance


A loss of significance can occur when two nearly equal quantities are subtracted from one another. For example, both numbers a = 0.177241 and b = 0.177589 have 6 significant digits. But their difference b − a = 0.000348 = 0.348 × 10⁻³ has only three significant digits. So, in the subtraction of these numbers three significant digits have been lost! This loss is called subtractive cancellation.
Errors can also occur when two quantities of radically different magnitudes are added. For example, if we add the numbers x = 0.1234 and y = 0.6789 × 10⁻²⁰, the result for x + y will be rounded to 0.1234 by a machine that keeps only 16 significant digits.
The loss of accuracy due to round-off errors can often be avoided by
a careful sequencing of operations or a reformulation of the problem.
We will describe this issue with examples.
Example 1.6. Consider the two functions f(x) = x(√(x+1) − √x) and g(x) = x/(√(x+1) + √x). Evaluate f(500) and g(500) using six-digit

arithmetic with rounding method and compare the results in terms of


errors.
Solution: Using six-digit arithmetic with the rounding method, we have

f̂(500) = 500 · (√501 − √500) = 500 · (22.3830 − 22.3607) = 500 · 0.0223 = 11.1500

and

ĝ(500) = 500/(√501 + √500) = 500/(22.3830 + 22.3607) = 500/44.7437 = 11.1748.
Note that the exact values of these two functions at x = 500 are the same, namely f(500) = g(500) = 11.1747553... Then the absolute errors in the approximations of f(500) and g(500), when six-digit arithmetic with the rounding method is used, are as follows:

Ef = |f(500) − f̂(500)| ≈ 2.5 × 10⁻²,
Eg = |g(500) − ĝ(500)| ≈ 4.5 × 10⁻⁵.


Although f(x) ≡ g(x), we see that using the function g(x) results in a much smaller error than using the function f(x). This is due solely to the loss of significance that occurred in f(x) when we subtracted nearly equal numbers. By rewriting the function f(x) in the form of g(x) we eliminated that subtractive cancellation.
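The two forms are easy to compare on a computer. In the minimal Python sketch below the arithmetic is double precision (about 16 significant digits), so the cancellation in f(x) is far milder than in six-digit arithmetic, but g(x) remains the numerically safer form:

import math

def f(x):
    # Direct form: subtracts two nearly equal square roots for large x
    return x * (math.sqrt(x + 1) - math.sqrt(x))

def g(x):
    # Rationalized form: the subtraction has been eliminated
    return x / (math.sqrt(x + 1) + math.sqrt(x))

print(f(500), g(500))   # both close to 11.1747553...; g is the more reliable form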
Example 1.7. Consider the two polynomials P(x) = x³ − 3x² + 3x − 1 and Q(x) = ((x − 3)x + 3)x − 1. Evaluate P(2.19) and Q(2.19) using three-digit arithmetic with the rounding method and compare the results in terms of errors.
Solution: Using three-digit arithmetic with the rounding method, we have

P̂(2.19) = 2.19³ − 3 · 2.19² + 3 · 2.19 − 1 = 2.19 · 4.80 − 3 · 4.80 + 6.57 − 1 =
= 10.5 − 14.4 + 5.57 = −3.9 + 5.57 = 1.67

and

Q̂(2.19) = ((2.19 − 3) · 2.19 + 3) · 2.19 − 1 = (−0.81 · 2.19 + 3) · 2.19 − 1 =
= (−1.77 + 3) · 2.19 − 1 = 1.23 · 2.19 − 1 = 2.69 − 1 = 1.69.

Note that the polynomials P(x) and Q(x) are the same and P(2.19) = Q(2.19) = 1.685159. Then the absolute errors in the approximations of P(2.19) and Q(2.19), when three-digit arithmetic with the rounding method is used, are as follows:

EP = |P(2.19) − P̂(2.19)| = 0.015159,
EQ = |Q(2.19) − Q̂(2.19)| = 0.004841.

Although P(x) ≡ Q(x), we see that using the polynomial Q(x) results in roughly three times smaller error than using the polynomial P(x). Moreover, if we count the number of arithmetic operations, we observe that evaluating the polynomial Q(x) needs fewer arithmetic operations than evaluating the polynomial P(x). Indeed,

• it requires 4 multiplications and 3 additions/subtractions to evaluate the polynomial P(x);

• it requires 2 multiplications and 3 additions/subtractions to evaluate the polynomial Q(x).

Remark 1.5. The polynomial Q(x) in Example 1.7 is called a nested (Horner) form of the polynomial P(x). Evaluating the nested form of a polynomial requires fewer arithmetic operations and often incurs less error.
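Nested evaluation is a one-loop algorithm. Here is a minimal Python sketch (the function name horner is ours), applied to the polynomial of Example 1.7:

def horner(coeffs, x):
    # Evaluate a_n*x^n + ... + a_1*x + a_0 given coeffs = [a_n, ..., a_1, a_0]
    # in nested form: (...(a_n*x + a_{n-1})*x + ...)*x + a_0
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

# P(x) = x^3 - 3x^2 + 3x - 1; the exact value P(2.19) is 1.685159
print(horner([1.0, -3.0, 3.0, -1.0], 2.19))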

1.2.4 Propagation of Error


When repeated arithmetic operations are performed on the computer, round-off errors sometimes have a tendency to accumulate and therefore

to grow. Let us illustrate this issue with two arithmetic operations: addition and multiplication of two numbers.
Let p̂ be an approximation of the number p and q̂ be an approximation of the number q. Then

p = p̂ + εp and q = q̂ + εq,

where εp and εq are the errors in the approximations p̂ and q̂, respectively. Assume that both εp and εq are positive. Then

p + q = p̂ + q̂ + (εp + εq).

Therefore, approximating p + q with p̂ + q̂ has an error εp + εq. So, the absolute error in the sum p + q can be as large as the sum of the absolute errors in p and q.
Assuming that p ≈ p̂ and q ≈ q̂, we have

Rpq = (pq − p̂q̂)/(pq) = (p̂εq + q̂εp + εpεq)/(pq) ≈ εq/q + εp/p = Rq + Rp.

So, the relative error in the product pq is approximately the sum of the relative errors in p and q.

1.3 O notation
Definition 1.2. Suppose {pₙ} and {qₙ} are sequences of real numbers. If there exist positive constants C and N such that

|pₙ| ≤ C|qₙ| for all n ≥ N,

then we say that {pₙ} is of order {qₙ} as n → ∞ and we write pₙ = O(qₙ).

Example 1.8. Let {pₙ} be the sequence of numbers defined by pₙ = n² + n for all n ∈ N. Since for all n ≥ 1 we have pₙ = n² + n ≤ n² + n² = 2n², it follows that pₙ = O(n²).


Example 1.9. Let {pₙ} be the sequence of numbers defined by pₙ = (n + 1)/n² for all n ∈ N. Since for all n ≥ 1 we have pₙ = (n + 1)/n² ≤ (n + n)/n² = 2/n, it follows that pₙ = O(1/n).

Example 1.10. Let {pₙ} be the sequence of numbers defined by pₙ = (n + 3)/(2n³) for all n ∈ N. Since for all n ≥ 3 we have pₙ = (n + 3)/(2n³) ≤ (n + n)/(2n³) = 1/n², it follows that pₙ = O(1/n²).

Definition 1.3. Let f (x) and g(x) be two functions defined in some
open interval containing point a. If there exist positive constants C
and δ such that
|f (x)| ≤ C|g(x)| for all x with |x − a| < δ,
then we say that f (x) is of order g(x) as x → a and we write
f (x) = O(g(x)).

Example 1.11. Using Taylor's theorem for the function f(x) = ln(1 + x) with x₀ = 0, we have

ln(1 + x) = x − x²/2 + x³/(3(1 + ξ)³),

where ξ is some point between 0 and x. Since 1/(3(1 + ξ)³) is bounded by a constant, we have

ln(1 + x) = x − x²/2 + O(x³) for x close to 0.


1.4 Self-study Problems


Problem 1.1. Compute the absolute error and the relative error in
approximations of p by p̂

a) p = 3.1415927, p̂ = 3.1416

b) p = √2, p̂ = 1.414

c) p = 8!, p̂ = 39900

Problem 1.2. Find the fourth order Taylor polynomial for the func-
tion f (x) = cos x about x0 = 0.

Problem 1.3. Find the third order Taylor polynomial for the function f(x) = x³ − 21x² + 17 about x₀ = 1.

Problem 1.4. Let f (x) = ln (1 + 2x).

a) Find the fifth order Taylor polynomial P5 (x) for given function
f (x) about x0 = 0

b) Use P5 (x) to approximate ln (1.2)

c) Determine the actual error of the approximation in (b)


Problem 1.5. Let f(x) = √(x + 4).

a) Find the third order Taylor polynomial P3 (x) for given function
f (x) about x0 = 0
b) Use P₃(x) to approximate √3.9 and √4.2

c) Determine the actual error of the approximations in (b)


Problem 1.6. Determine the order of the Taylor polynomial Pₙ(x) for the function f(x) = eˣ about x₀ = 0 that should be used to approximate e^0.1 so that the error is less than 10⁻⁶.

Problem 1.7. Determine the order of the Taylor polynomial Pₙ(x) for the function f(x) = cos x about x₀ = 0 that should be used to approximate cos(0.2) so that the error is less than 10⁻⁵.

Problem 1.8. Let f(x) = eˣ.

a) Find the Taylor polynomial P₈(x) of order 8 for the given function f(x) about x₀ = 0

b) Find an upper bound for the error |f(x) − P₈(x)| when −1 ≤ x ≤ 1

c) Find an upper bound for the error |f(x) − P₈(x)| when −0.5 ≤ x ≤ 0.5

Problem 1.9. Let f(x) = eˣ cos x.

a) Find the second order Taylor polynomial P₂(x) for the given function f(x) about x₀ = 0

b) Use P₂(0.5) to approximate f(0.5)

c) Find an upper bound for the error |f(0.5) − P₂(0.5)| and compare it to the actual error

d) Use ∫₀¹ P₂(x) dx to approximate ∫₀¹ f(x) dx

e) Determine the actual error of the approximation in (d)

Problem 1.10. Use three-digit rounding arithmetic to perform the


following calculations. Find the absolute error and the relative error
in obtained results.


a) (121 − 0.327) − 119

b) (121 − 119) − 0.327


c) −10π + 6e − 3/62

Problem 1.11. Let P(x) = x³ − 6.1x² + 3.2x + 1.5

a) Use three-digit arithmetic with chopping method to evaluate


P (x) at x = 4.17 and find the relative error in obtained result

b) Rewrite P (x) in a nested structure. Use three-digit arithmetic


with chopping method to evaluate a nested structure at x = 4.17
and find the relative error in obtained result

Problem 1.12. Let P(x) = x³ − 6.1x² + 3.2x + 1.5

a) Use three-digit arithmetic with rounding method to evaluate


P (x) at x = 4.17 and find the relative error in obtained result

b) Rewrite P (x) in a nested structure. Use three-digit arithmetic


with rounding method to evaluate a nested structure at x = 4.17
and find the relative error in obtained result

Problem 1.13. Consider the two functions f(x) = x + 1 − √(x² + x + 1) and g(x) = x/(x + 1 + √(x² + x + 1)).
a) Use four-digit arithmetic with rounding method to approximate
f (0.1) and find the relative error in obtained result.
(Note that the true value is f (0.1) = 0.0464346247 . . .)

b) Use four-digit arithmetic with rounding method to approximate


g(0.1) and find the relative error in obtained result.
(Note that the true value is g(0.1) = 0.0464346247 . . .)


Problem 1.14. Let f(x) = (eˣ − e⁻ˣ)/x.
a) Evaluate f (0.1) using four-digit arithmetic with rounding method
and find the relative error of obtained value.
(Note that the true value is f (0.1) = 2.0033350004 . . .)
b) Find the third order Taylor polynomial P3 (x) for given function
f (x) about x0 = 0.
(You can use the formula eˣ = 1 + x + x²/2! + x³/3! + x⁴/4! + ...)
c) For approximation of f (0.1) evaluate P3 (0.1) using four-digit
arithmetic with rounding method and find the relative error of
obtained value.

Problem 1.15. For which values of x would the expression √(2x² + 1) − 1 cause a loss of accuracy, and how would you avoid the loss of accuracy?

Problem 1.16. For which values of x and y would the expression ln(x) − ln(y) cause a loss of accuracy, and how would you avoid the loss of accuracy?

Problem 1.17. For each of the following sequences, find the largest possible value of α such that pₙ = O(1/n^α).

a) pₙ = 1/√(n² + n + 1)   b) pₙ = sin(1/n)   c) pₙ = ln(1 + 1/n)

Problem 1.18. For each of the following functions, find the largest possible value of α such that f(x) = O(x^α) as x → 0.

a) f(x) = sin x − x   b) f(x) = (1 − cos x)/x   c) f(x) = (eˣ − e⁻ˣ)/x

2 Numerical Solution of Nonlinear Algebraic Equations
Consider a nonlinear algebraic equation

f (x) = 0. (2.1)

Definition 2.1. p is called a root of equation (2.1) if f (p) = 0.

The problem of finding the roots of equation (2.1), the so-called root-finding problem, is one of the most basic problems of mathematics. In some cases, one can find the root of equation (2.1) analytically. However, most of the time it is impossible to find the root by hand. There are a number of different numerical methods which are used to solve equation (2.1) numerically. These numerical methods can be classified into two classes: bracketing methods and open methods. Here is a list of some of these numerical methods:

• Bracketing methods:

– Bisection method
– False Position method

• Open methods:

– Secant method (will not be studied in this course)


– Fixed-Point method
– Newton-Raphson method

Bracketing methods start with an interval that contains the root, and a procedure is used to obtain smaller and smaller intervals containing the root. Open methods start with one (or more) initial guess points, and in each iteration a new guess of the root is obtained.


Remark 2.1. Open methods are usually more efficient than bracketing
methods. However, open methods may fail to converge to a root while
bracketing methods always converge.

The use of bracketing methods is based on a result which follows from the classical Intermediate Value Theorem known from Calculus. Let us first recall this result.

Theorem 2.1. If function f (x) is continuous on interval [a, b] and


f (a)f (b) < 0 then there is at least one root of equation (2.1) in [a, b].

Remark 2.2. Any numerical root-finding method generates a sequence of approximations {pₙ}. We say that the numerical method converges if lim(n→∞) pₙ = p, where p is an exact root of equation (2.1).

Definition 2.2. Suppose {pₙ} is a sequence that converges to p with pₙ ≠ p for all n. If positive constants K and α exist such that

lim(n→∞) |pₙ₊₁ − p| / |pₙ − p|^α = K,

then we say that the sequence {pₙ} converges to p with order α.

• If α = 1 then the convergence is called linear.

• If α = 2 then the convergence is called quadratic.

Example 2.1. Let {pₙ} be a sequence of numbers with p₀ = 1 and lim(n→∞) |pₙ₊₁|/|pₙ| = 0.5. By definition this means that the sequence {pₙ} converges to p = 0 linearly. Let {qₙ} be a sequence of numbers with q₀ = 1 and lim(n→∞) |qₙ₊₁|/|qₙ|² = 0.5. By definition this means that the sequence {qₙ} converges to q = 0 quadratically.


For simplicity we assume that |pₙ₊₁|/|pₙ| ≈ 0.5 and |qₙ₊₁|/|qₙ|² ≈ 0.5 for all n. Then

|pₙ| ≈ 0.5 · |pₙ₋₁| ≈ 0.5² · |pₙ₋₂| ≈ ... ≈ 0.5ⁿ · |p₀| = 0.5ⁿ

and

|qₙ| ≈ 0.5 · |qₙ₋₁|² ≈ 0.5³ · |qₙ₋₂|⁴ ≈ ... ≈ 0.5^(2ⁿ−1) · |q₀|^(2ⁿ) = 0.5^(2ⁿ−1).

The table below shows the first few terms of these sequences. We can see that quadratic convergence is much faster than linear convergence.

n   pₙ              qₙ
1   5 × 10⁻¹        5 × 10⁻¹
2   2.5 × 10⁻¹      1.25 × 10⁻¹
3   1.25 × 10⁻¹     7.8125 × 10⁻³
4   6.25 × 10⁻²     3.0518 × 10⁻⁵
5   3.125 × 10⁻²    4.6566 × 10⁻¹⁰
6   1.5625 × 10⁻²   1.0842 × 10⁻¹⁹
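The table can be reproduced with a minimal Python sketch that evaluates the closed forms |pₙ| = 0.5ⁿ and |qₙ| = 0.5^(2ⁿ−1) derived above:

# Linear convergence: |p_n| = 0.5^n; quadratic: |q_n| = 0.5^(2^n - 1)
for n in range(1, 7):
    print(n, 0.5**n, 0.5**(2**n - 1))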

Remark 2.3. When a numerical method generates approximations, one has to measure the quality of those approximations. If the true value is known then the approximate value can be compared directly with the exact value. However, most of the time the exact value is not known in advance. In that case one can compare the approximations with each other.

Definition 2.3. Accuracy measures how an approximate value differs from the true value. So, if pₙ is an approximation of p, then

Absolute accuracy = |p − pₙ|,   Relative accuracy = |p − pₙ| / |p|.

Definition 2.4. Precision measures how two consecutive approximations differ from each other. So, if pₙ and pₙ₋₁ are two approximations, then

Absolute precision = |pₙ − pₙ₋₁|,   Relative precision = |pₙ − pₙ₋₁| / |pₙ|.

2.1 Bisection Method


Consider a nonlinear algebraic equation:

f (x) = 0, (2.2)

where f(x) is a continuous function on the interval [a, b] and f(a)f(b) < 0. Then by Theorem 2.1, there is at least one root p of equation (2.2) in the interval [a, b]. We assume for simplicity of our discussion that the root p in the interval [a, b] is unique.
To begin the Bisection method, we set a₁ = a and b₁ = b, as shown in Figure 1. Let p₁ be the midpoint of the interval [a₁, b₁]. So,

p₁ = (a₁ + b₁)/2.

If f(p₁) = 0, then p₁ is the root of equation (2.2) and we can stop the search. If f(p₁) ≠ 0, then f(p₁) has the same sign as either f(a₁) or f(b₁).
• If f(a₁) and f(p₁) have opposite signs, then the root p must be in the interval [a₁, p₁] and we set

a₂ = a₁ and b₂ = p₁.

• If f(b₁) and f(p₁) have opposite signs, then the root p must be in the interval [p₁, b₁] and we set

a₂ = p₁ and b₂ = b₁.


We reapply the process to the interval [a₂, b₂], and continue forming intervals [a₃, b₃], [a₄, b₄], ..., [aₙ, bₙ]. Each new interval contains the exact root p and has length one half of the length of the preceding interval. Figure 1 geometrically illustrates the first 3 steps of the Bisection method. Algorithm 1 shows a pseudo-code of the Bisection method.

Figure 1: Bisection Method

Remark 2.4. Bisection method generates the sequence of intervals:

[a1 , b1 ] =⇒ [a2 , b2 ] =⇒ [a3 , b3 ] =⇒ . . . =⇒ [an−1 , bn−1 ] =⇒ [an , bn ]

such that

a1 ≤ a2 ≤ a3 ≤ . . . ≤ an−1 ≤ an ≤ p ≤ bn ≤ bn−1 ≤ . . . ≤ b3 ≤ b2 ≤ b1 .

It also generates the sequence of approximations:

p1 , p2 , p3 , . . . , pn−1 , pn


where pₙ = (aₙ + bₙ)/2.

Algorithm 1 Pseudo-code of the Bisection Method.

Set a₁ = a and b₁ = b.
Compute f(a₁) and f(b₁).
for n = 1, 2, 3, ... do
    Compute pₙ = (aₙ + bₙ)/2
    Compute f(pₙ)
    if f(pₙ) = 0 then
        Stop the search (pₙ is the root of equation (2.2))
    else if f(aₙ)f(pₙ) < 0 then
        Set aₙ₊₁ = aₙ and f(aₙ₊₁) = f(aₙ)
        Set bₙ₊₁ = pₙ and f(bₙ₊₁) = f(pₙ)
    else
        Set aₙ₊₁ = pₙ and f(aₙ₊₁) = f(pₙ)
        Set bₙ₊₁ = bₙ and f(bₙ₊₁) = f(bₙ)
    end if
    if Stopping condition is satisfied then
        Stop the search (pₙ is a sufficiently accurate approximation to the root of equation (2.2))
    end if
end for
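As a concrete companion to Algorithm 1, here is a minimal Python implementation of the Bisection method (the function name bisection and its signature are our choices), using the error bound (2.4) below as the stopping condition:

def bisection(f, a, b, tol=1e-6, max_iter=100):
    # Assumes f is continuous on [a, b] with f(a)*f(b) < 0 (Theorem 2.1)
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for n in range(1, max_iter + 1):
        p = (a + b) / 2
        fp = f(p)
        if fp == 0 or (b - a) / 2 < tol:   # stopping condition (2.4)
            return p, n
        if fa * fp < 0:
            b, fb = p, fp        # root lies in [a, p]
        else:
            a, fa = p, fp        # root lies in [p, b]
    return p, max_iter

# Root of x^3 - 7x^2 + 14x - 6 = 0 in [0, 1] (see Example 2.2 below)
print(bisection(lambda x: x**3 - 7*x**2 + 14*x - 6, 0.0, 1.0, tol=1e-2))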

Theorem 2.2. Suppose f(x) is a continuous function on the interval [a, b] and f(a)f(b) < 0. Then the Bisection method generates a sequence {pₙ} approximating a root p of equation (2.2) and

|p − pₙ| ≤ (b − a)/2ⁿ for n = 1, 2, 3, ...   (2.3)
Proof. Convergence of the sequence {pₙ} to the root p as n → ∞ is guaranteed by (2.3), since (b − a)/2ⁿ → 0 as n → ∞. Let us prove (2.3). Since every

interval has length one half of the length of the preceding interval, we have

bₙ − aₙ = (bₙ₋₁ − aₙ₋₁)/2 = (bₙ₋₂ − aₙ₋₂)/2² = ... = (b₁ − a₁)/2ⁿ⁻¹ = (b − a)/2ⁿ⁻¹.

Then, since pₙ = (aₙ + bₙ)/2 and p ∈ [aₙ, bₙ], we get

|p − pₙ| ≤ (bₙ − aₙ)/2 = (b − a)/2ⁿ,

which completes the proof of this theorem.
Remark 2.5. (Advantages/disadvantages of the Bisection method)
The results of Theorem 2.2 imply two obvious advantages of the Bisection method. First of all, the Bisection method always converges. Moreover, inequality (2.3) gives us an upper bound for the error when the Bisection method is used: the error after n iterations is at most (b − a)/2ⁿ. However, there is one important disadvantage of the Bisection method: it has very slow convergence. In fact, one can easily show that the Bisection method has linear convergence.
Remark 2.6. (Stopping conditions for the Bisection method)
Suppose we want to obtain an approximation pₙ of the root p within some accuracy TOL, so that we expect |p − pₙ| < TOL to be satisfied for some n. If the exact value of the root p is known in advance then at each step of Algorithm 1 we can directly check the condition:

|p − pₙ| < TOL.

If the exact value of the root p is not known then we can use (2.3) to guarantee the desired accuracy without knowing the value of p itself. So, for the stopping condition in Algorithm 1 we can use:

(b − a)/2ⁿ < TOL.   (2.4)

If after n iterations condition (2.4) is satisfied, then from (2.3) we have

|p − pₙ| ≤ (b − a)/2ⁿ < TOL =⇒ |p − pₙ| < TOL,

which means that pₙ is an approximation of p within the desired accuracy TOL and the Bisection algorithm can then stop.

Example 2.2. Use the Bisection method to approximate a root of the equation

x³ − 7x² + 14x − 6 = 0   (2.5)

in the interval [0, 1] within an accuracy of 10⁻². (You can use the exact value of the root p = 2 − √2 = 0.5857864376269...)

Solution: Let us denote f(x) = x³ − 7x² + 14x − 6. We begin the Bisection method by setting a₁ = 0 and b₁ = 1. Then

f(a₁) = f(0) = −6 < 0 and f(b₁) = f(1) = 2 > 0.

1 step: We have p₁ = (a₁ + b₁)/2 = (0 + 1)/2 = 0.5.
Then f(p₁) = f(0.5) = −0.625 < 0.
Since f(b₁)f(p₁) < 0, we set a₂ = p₁ = 0.5 and b₂ = b₁ = 1.
Accuracy = |p − p₁| ≈ 0.0858 > 10⁻², so we continue with the next step.

2 step: We have p₂ = (a₂ + b₂)/2 = (0.5 + 1)/2 = 0.75.
Then f(p₂) = f(0.75) = 0.984375 > 0.
Since f(a₂)f(p₂) < 0, we set a₃ = a₂ = 0.5 and b₃ = p₂ = 0.75.
Accuracy = |p − p₂| ≈ 0.1642 > 10⁻², so we continue with the next step.

3 step: We have p₃ = (a₃ + b₃)/2 = (0.5 + 0.75)/2 = 0.625.
Then f(p₃) = f(0.625) = 0.259765625 > 0.
Since f(a₃)f(p₃) < 0, we set a₄ = a₃ = 0.5 and b₄ = p₃ = 0.625.
Accuracy = |p − p₃| ≈ 0.0392 > 10⁻², so we continue with the next step.


4 step: We have p₄ = (a₄ + b₄)/2 = (0.5 + 0.625)/2 = 0.5625.
Then f(p₄) = f(0.5625) = −0.161865234375 < 0.
Since f(b₄)f(p₄) < 0, we set a₅ = p₄ = 0.5625 and b₅ = b₄ = 0.625.
Accuracy = |p − p₄| ≈ 0.0233 > 10⁻², so we continue with the next step.

5 step: We have p₅ = (a₅ + b₅)/2 = (0.5625 + 0.625)/2 = 0.59375.
Then f(p₅) = f(0.59375) = 0.05404663085938 > 0.
Accuracy = |p − p₅| ≈ 0.0080 < 10⁻², so we can stop the iterations.

So, in 5 steps the Bisection method gives the approximation p₅ = 0.59375 for the root p = 0.5857864376269... of equation (2.5) within an accuracy of 10⁻². We summarize all results in Table 1.

Table 1: Results of Bisection method for equation (2.5).


n an bn pn f (an ) f (bn ) f (pn ) Accuracy
1 0 1 0.5 − + − 0.0858
2 0.5 1 0.75 − + + 0.1642
3 0.5 0.75 0.625 − + + 0.0392
4 0.5 0.625 0.5625 − + − 0.0233
5 0.5625 0.625 0.59375 − + + 0.0080

Example 2.3. Use the Bisection method to approximate a root of the equation

3x + sin x − eˣ = 0   (2.6)

in the interval [0, 1] within an accuracy of 10⁻¹.


Solution: Let us denote f(x) = 3x + sin x − eˣ. We begin the Bisection method by setting a₁ = 0 and b₁ = 1. Then f(a₁) = f(0) = −1 < 0 and f(b₁) = f(1) = 1.12318915634885... > 0.

1 step: We have p₁ = (a₁ + b₁)/2 = (0 + 1)/2 = 0.5.
Then f(p₁) = f(0.5) = 0.33070426790407... > 0.
Since f(a₁)f(p₁) < 0, we set a₂ = a₁ = 0 and b₂ = p₁ = 0.5.
Upper bound of error = (b − a)/2 = 0.5 > 10⁻¹, so we continue with the next step.

2 step: We have p₂ = (a₂ + b₂)/2 = (0 + 0.5)/2 = 0.25.
Then f(p₂) = f(0.25) = −0.28662145743322... < 0.
Since f(b₂)f(p₂) < 0, we set a₃ = p₂ = 0.25 and b₃ = b₂ = 0.5.
Upper bound of error = (b − a)/2² = 0.25 > 10⁻¹, so we continue with the next step.

3 step: We have p₃ = (a₃ + b₃)/2 = (0.25 + 0.5)/2 = 0.375.
Then f(p₃) = f(0.375) = 0.03628111446785... > 0.
Since f(a₃)f(p₃) < 0, we set a₄ = a₃ = 0.25 and b₄ = p₃ = 0.375.
Upper bound of error = (b − a)/2³ = 0.125 > 10⁻¹, so we continue with the next step.

4 step: We have p₄ = (a₄ + b₄)/2 = (0.25 + 0.375)/2 = 0.3125.
Then f(p₄) = f(0.3125) = −0.12189942659342... < 0.
Upper bound of error = (b − a)/2⁴ = 0.0625 < 10⁻¹, so we can stop the iterations.


So, in 4 steps the Bisection method gives the approximation p₄ = 0.3125 for the root of equation (2.6) within an accuracy of 10⁻¹. We summarize all results in Table 2.

Table 2: Results of Bisection method for equation (2.6).


n an bn pn f (an ) f (bn ) f (pn ) U.B.E.
1 0 1 0.5 − + + 0.5
2 0 0.5 0.25 − + − 0.25
3 0.25 0.5 0.375 − + + 0.125
4 0.25 0.375 0.3125 − + − 0.0625

Note that the last column of Table 2 shows the upper bound of the error in the approximation pₙ. The actual errors can be smaller than these upper bounds. Indeed, the exact value of the root of equation (2.6) is p = 0.360421680476... and the accuracy of the approximation p₄ is |p − p₄| ≈ 0.0479, which is less than the upper bound of 0.0625.

Remark 2.7. With the Bisection method we gain one digit of accuracy roughly every 3-4 iterations, which is not surprising since the method has linear convergence.

2.2 False Position Method


Consider again a nonlinear algebraic equation:

f (x) = 0, (2.7)

where f(x) is a continuous function on the interval [a, b] and f(a)f(b) < 0. We begin the False Position method by setting a₁ = a and b₁ = b. The line that passes through the points (a₁, f(a₁)) and (b₁, f(b₁))


has the equation:

y = f(a₁) + (f(b₁) − f(a₁))/(b₁ − a₁) · (x − a₁).   (2.8)

We take the intersection of this line with the x-axis as the first approximation of the root, denoted by p₁. Then, putting y = 0 and x = p₁ in (2.8) and solving it for p₁, we obtain

p₁ = (a₁f(b₁) − b₁f(a₁)) / (f(b₁) − f(a₁)).

If f(p₁) = 0, then p₁ is the root of equation (2.7) and we can stop the search. If f(p₁) ≠ 0, then f(p₁) has the same sign as either f(a₁) or f(b₁).

• If f(a₁) and f(p₁) have opposite signs, then the root p must be in the interval [a₁, p₁] and therefore we set a₂ = a₁ and b₂ = p₁.

• If f(b₁) and f(p₁) have opposite signs, then the root p must be in the interval [p₁, b₁] and therefore we set a₂ = p₁ and b₂ = b₁.
We reapply the process to the interval [a2 , b2 ], and continue forming
intervals [a3 , b3 ], [a4 , b4 ], . . . , [an , bn ]. Each new interval will contain
the exact root p and therefore the False Position method is also a
bracketing method. Algorithm 2 shows a pseudo-code of False Position
method.
Remark 2.8. False Position method generates the sequence of inter-
vals: [a1 , b1 ], [a2 , b2 ], [a3 , b3 ], . . . , [an , bn ] such that

a1 ≤ a2 ≤ a3 ≤ . . . ≤ an−1 ≤ an ≤ p ≤ bn ≤ bn−1 ≤ . . . ≤ b3 ≤ b2 ≤ b1 .

It also generates the sequence of approximations:

p1 , p2 , p3 , . . . , pn−1 , pn

where pn ∈ [an , bn ].


Algorithm 2 Pseudo-code of the False Position Method.

Set a₁ = a and b₁ = b.
Compute f(a₁) and f(b₁).
for n = 1, 2, 3, ... do
    Compute pₙ = (aₙ f(bₙ) − bₙ f(aₙ)) / (f(bₙ) − f(aₙ))
    Compute f(pₙ)
    if f(pₙ) = 0 then
        Stop the search (pₙ is the root of equation (2.7))
    else if f(aₙ)f(pₙ) < 0 then
        Set aₙ₊₁ = aₙ and f(aₙ₊₁) = f(aₙ)
        Set bₙ₊₁ = pₙ and f(bₙ₊₁) = f(pₙ)
    else
        Set aₙ₊₁ = pₙ and f(aₙ₊₁) = f(pₙ)
        Set bₙ₊₁ = bₙ and f(bₙ₊₁) = f(bₙ)
    end if
    if Stopping condition is satisfied then
        Stop the search (pₙ is a sufficiently accurate approximation to the root of equation (2.7))
    end if
end for
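A minimal Python counterpart of Algorithm 2 is sketched below; it uses the absolute-precision stopping rule |pₙ − pₙ₋₁| < TOL, one of the practical conditions listed in Remark 2.10 below (the function name false_position and the starting value of p_old are our choices):

def false_position(f, a, b, tol=1e-6, max_iter=100):
    # Assumes f is continuous on [a, b] with f(a)*f(b) < 0
    fa, fb = f(a), f(b)
    p_old = a   # any point of [a, b] serves to start the precision test
    for n in range(1, max_iter + 1):
        p = (a * fb - b * fa) / (fb - fa)   # x-intercept of the secant line
        fp = f(p)
        if fp == 0 or abs(p - p_old) < tol:
            return p, n
        if fa * fp < 0:
            b, fb = p, fp
        else:
            a, fa = p, fp
        p_old = p
    return p, max_iter

# Root of x^3 - 7x^2 + 14x - 6 = 0 in [0, 1] (see Example 2.4 below)
print(false_position(lambda x: x**3 - 7*x**2 + 14*x - 6, 0.0, 1.0, tol=1e-4))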

Theorem 2.3. Suppose f(x) is a continuous function on the interval [a, b] and f(a)f(b) < 0. Then the False Position method generates a sequence {pₙ} approximating a root p of equation (2.7) and

|p − pₙ| ≤ max{pₙ − aₙ, bₙ − pₙ} for n = 1, 2, 3, ...   (2.9)

Proof. Convergence of the sequence {pₙ} to the root p as n → ∞ is guaranteed from (2.9), since bₙ − aₙ → 0 as n → ∞. Inequality (2.9) is obvious since pₙ ∈ [aₙ, bₙ] and p ∈ [aₙ, bₙ].
Remark 2.9. (Advantages/disadvantages of the False Position method)
The results of Theorem 2.3 imply the obvious advantages of the False Position method. First of all, the False Position method always converges. Moreover, at each step of the False Position method an upper bound for the

error is known from inequality (2.9). Finally, most of the time the False Position method converges faster than the Bisection method (but not always). However, the convergence of the False Position method is still slow in comparison with the more powerful Newton-Raphson method, which will be discussed later.

Remark 2.10. (Stopping conditions for the False Position method)
Suppose we want to obtain an approximation pₙ of the root p within some accuracy TOL, so that we expect |p − pₙ| < TOL to be satisfied for some n. If the exact value of the root p is known in advance then at each step of Algorithm 2 we can directly check the condition:

|p − pₙ| < TOL.

If the exact value of the root p is not known then we can use (2.9) to guarantee the desired accuracy without knowing the value of p itself. So, for the stopping condition in Algorithm 2 we can use:

max{pₙ − aₙ, bₙ − pₙ} < TOL.   (2.10)

If after n iterations condition (2.10) is satisfied, then from (2.9) we have

|p − pₙ| ≤ max{pₙ − aₙ, bₙ − pₙ} < TOL =⇒ |p − pₙ| < TOL,

which means that pₙ is an approximation of p within the desired accuracy TOL and the False Position algorithm can then stop.

In practice, some other stopping conditions can be used as well if the value of the root p is not known in advance, such as:

1. |f(pₙ)| < TOL

2. |pₙ − pₙ₋₁| < TOL (for absolute precision)

3. |pₙ − pₙ₋₁| / |pₙ| < TOL (for relative precision)


Example 2.4. Use the False Position method to approximate a root of the equation

x³ − 7x² + 14x − 6 = 0   (2.11)

in the interval [0, 1] within an accuracy of 10⁻². (You can use the exact value of the root p = 2 − √2 = 0.5857864376269...)

Solution: Let us denote f(x) = x³ − 7x² + 14x − 6. We begin the False Position method by setting a₁ = 0 and b₁ = 1. Then

f(a₁) = f(0) = −6 < 0 and f(b₁) = f(1) = 2 > 0.

1 step: We have p₁ = (a₁f(b₁) − b₁f(a₁)) / (f(b₁) − f(a₁)) = (0 · f(1) − 1 · f(0)) / (f(1) − f(0)) = 0.75.
Then f(p₁) = f(0.75) = 0.984375 > 0.
Since f(a₁)f(p₁) < 0, we set a₂ = a₁ = 0 and b₂ = p₁ = 0.75.
Accuracy = |p − p₁| ≈ 0.1642 > 10⁻², so we continue with the next step.

2 step: We have p₂ = (a₂f(b₂) − b₂f(a₂)) / (f(b₂) − f(a₂)) = 0.64429530201342...
Then f(p₂) = 0.38177674444195... > 0.
Since f(a₂)f(p₂) < 0, we set a₃ = a₂ = 0 and b₃ = p₂ = 0.64429530201342...
Accuracy = |p − p₂| ≈ 0.0585 > 10⁻², so we continue with the next step.

3 step: We have p₃ = (a₃f(b₃) − b₃f(a₃)) / (f(b₃) − f(a₃)) = 0.60575165300907...
Then f(p₃) = 0.13424920850980... > 0.
Since f(a₃)f(p₃) < 0, we set a₄ = a₃ = 0 and b₄ = p₃ = 0.60575165300907...
Accuracy = |p − p₃| ≈ 0.0200 > 10⁻², so we continue with the next step.


4 step: We have p₄ = (a₄f(b₄) − b₄f(a₄)) / (f(b₄) − f(a₄)) = 0.59249466308157...
Then f(p₄) = 0.04557101018090... > 0.
Accuracy = |p − p₄| ≈ 0.0067 < 10⁻², so we can stop the iterations.

So, in 4 steps the False Position method gives the approximation p₄ = 0.59249466308157... for the root p of equation (2.11) within an accuracy of 10⁻². We summarize all results in Table 3. Note that for the same equation in Example 2.2 the Bisection method needed 5 steps (compare the results in Table 1 and Table 3). So, this example shows that the False Position method converges a bit faster than the Bisection method.

Table 3: Results of False Position method for equation (2.11).


n an bn pn f (an ) f (bn ) f (pn ) Accuracy
1 0 1 0.75 − + + 0.1642
2 0 0.75 0.644295 − + + 0.0585
3 0 0.644295 0.605752 − + + 0.0200
4 0 0.605752 0.592495 − + + 0.0067

Example 2.5. Use the False Position method to approximate a root of the equation

3x + sin x − eˣ = 0   (2.12)

in the interval [0, 1] within an accuracy of 10⁻¹. (You can use the exact value of the root p = 0.360421680476...)

Solution: Let us denote f(x) = 3x + sin x − eˣ. We begin the method by setting a₁ = 0 and b₁ = 1. Then f(a₁) = f(0) = −1 < 0 and f(b₁) = f(1) = 1.12318915634885... > 0.


1 step: We have p₁ = (a₁f(b₁) − b₁f(a₁)) / (f(b₁) − f(a₁)) = 0.47098959459630...
Then f(p₁) = 0.26515881591031... > 0.
Since f(a₁)f(p₁) < 0, we set a₂ = a₁ = 0 and b₂ = p₁ = 0.47098959459630...
Accuracy = |p − p₁| ≈ 0.1106 > 10⁻¹, so we continue with the next step.

2 step: We have p₂ = (a₂f(b₂) − b₂f(a₂)) / (f(b₂) − f(a₂)) = 0.37227705223507...
Then f(p₂) = 0.02953366933827... > 0.
Accuracy = |p − p₂| ≈ 0.0119 < 10⁻¹, so we can stop the iterations.

So, in 2 steps the False Position method gives the approximation p₂ = 0.37227705223507... for the root p of equation (2.12) with an accuracy of 0.0119. We summarize all results in Table 4. Note that for the same equation in Example 2.3 the Bisection method reached an accuracy of only 0.0479 after 4 steps. So, this example again shows that the False Position method converges faster than the Bisection method.

Table 4: Results of False Position method for equation (2.12).


n an bn pn f (an ) f (bn ) f (pn ) Accuracy
1 0 1 0.470990 − + + 0.1106
2 0 0.470990 0.372277 − + + 0.0119


2.3 Fixed-Point Iteration


Assume that a nonlinear algebraic equation

f (x) = 0 (2.13)

is rearranged in the form


x = g(x). (2.14)
Normally this rearrangement can be done in several ways.

Definition 2.5. A fixed point of a function g(x) is a real number p


such that p = g(p).

Remark 2.11. Since (2.13) and (2.14) are equivalent, the fixed point
of function g(x) turns out to be a root of equation (2.13).

Remark 2.12. Note that the fixed point of function g(x) is the inter-
section of the line y = x and the graph of y = g(x).

Theorem 2.4. Assume that function g(x) is continuous on interval


[a, b] and g(x) ∈ [a, b] for all x ∈ [a, b]. Then g(x) has at least one fixed
point in [a, b].


Proof. If g(a) = a or g(b) = b, then g(x) has a fixed point at an


endpoint of interval [a, b]. If not, then g(a) > a and g(b) < b. The
function h(x) = g(x) − x is continuous on [a, b], with

h(a) = g(a) − a > 0 and h(b) = g(b) − b < 0.

The Intermediate Value Theorem implies that there exists p ∈ (a, b)


for which h(p) = 0. This number p is a fixed point for g(x) because

0 = h(p) = g(p) − p implies that g(p) = p.

Theorem 2.5. Assume that the function g(x) is continuous on the interval [a, b] and g(x) ∈ [a, b] for all x ∈ [a, b]. In addition, suppose g(x) is differentiable in (a, b) and a positive constant k < 1 exists such that |g′(x)| ≤ k for all x ∈ (a, b). Then g(x) has a unique fixed point in the interval [a, b].

Proof. Since the assumptions of Theorem 2.4 are satisfied, g(x) has at least one fixed point in [a, b]. Let us prove that the fixed point is unique. Suppose g(x) has two different fixed points p and r in the interval [a, b] with r < p. So, g(p) = p and g(r) = r. The Mean Value Theorem implies that there exists ξ ∈ (r, p) for which

g′(ξ) = (g(p) − g(r)) / (p − r) = (p − r)/(p − r) = 1,

which contradicts the assumption that |g′(x)| < 1 for all x ∈ (a, b). This contradiction proves that the fixed point is unique.

Example 2.6. Show that the function g(x) = (x² − 1)/3 has a unique fixed point in the interval [−1, 1].


Solution: The function g(x) is continuous on the interval [−1, 1]. Moreover, using simple Calculus one can easily show that

max g(x) = g(±1) = 0 and min g(x) = g(0) = −1/3 on [−1, 1].

So, g(x) ∈ [−1, 1] for all x ∈ [−1, 1]. In addition, for all x ∈ (−1, 1) we have

|g′(x)| = |2x/3| ≤ 2/3 < 1.

Then by Theorem 2.5 it follows that the given function g(x) has a unique fixed point in the interval [−1, 1]. Indeed,

g(p) = p =⇒ (p² − 1)/3 = p =⇒ p² − 3p − 1 = 0,

which has only one root p = (3 − √13)/2 ≈ −0.303 in the interval [−1, 1].

Remark 2.13. The hypotheses of Theorem 2.5 are sufficient to guar-


antee a unique fixed point but are NOT necessary.

Example 2.7. Consider the function g(x) = 3⁻ˣ on the interval [0, 1]. Since g′(x) = −3⁻ˣ ln 3 < 0 on [0, 1], the function g(x) is decreasing on [0, 1]. So,

g(1) = 1/3 ≤ g(x) ≤ 1 = g(0)

for 0 ≤ x ≤ 1. Thus, for x ∈ [0, 1] we have g(x) ∈ [0, 1], and from Theorem 2.4 it follows that g(x) has a fixed point in the interval [0, 1]. However,

g′(0) = −ln 3 = −1.098612289...,

so |g′(x)| < 1 does not hold on all of (0, 1) and Theorem 2.5 cannot be used to determine uniqueness. But g(x) is always decreasing, and it is clear from the graph of g(x) = 3⁻ˣ on [0, 1] that the fixed point must be unique.


To approximate the fixed point of a function g(x), we choose an initial approximation p₀ and generate the sequence {pₙ} by letting

pₙ₊₁ = g(pₙ) for each n ≥ 0.

If the sequence {pₙ} converges to p and g(x) is a continuous function, then

p = lim(n→∞) pₙ₊₁ = lim(n→∞) g(pₙ) = g(lim(n→∞) pₙ) = g(p),

so p is a fixed point of the function g(x). This technique is called Fixed-Point iteration. Figure 2 geometrically illustrates the first 3 steps of the Fixed-Point method. Observe that we always obtain the successive iterates in the following way: start on the x-axis at the initial p₀, go vertically to the curve y = g(x), then horizontally to the line y = x, then vertically to the curve y = g(x), then again horizontally to the line, and so on. This process is repeated until the points on the curve converge to a fixed point. Algorithm 3 shows a pseudo-code of the Fixed-Point iteration method.

Figure 2: Fixed-Point Method


Algorithm 3 Pseudo-code of the Fixed-Point Method.

Choose p₀.
for n = 0, 1, 2, 3, ... do
    Set pₙ₊₁ = g(pₙ)
    if Stopping condition is satisfied then
        Stop the search (pₙ₊₁ is a sufficiently accurate approximation to the fixed point of function g(x))
    end if
end for
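A minimal Python counterpart of Algorithm 3 is given below, applied to the function g(x) = 3⁻ˣ from Example 2.7 (the function name fixed_point, the starting point p₀ = 0.5 and the tolerance are our choices):

def fixed_point(g, p0, tol=1e-6, max_iter=100):
    p = p0
    for n in range(1, max_iter + 1):
        p_next = g(p)                      # p_{n+1} = g(p_n)
        if abs(p_next - p) < tol:          # absolute-precision stopping rule
            return p_next, n
        p = p_next
    return p, max_iter

# Fixed point of g(x) = 3^(-x) on [0, 1] (Example 2.7)
print(fixed_point(lambda x: 3.0**(-x), 0.5))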

Theorem 2.6. (Fixed-Point Theorem)
Assume that the function g(x) is continuous on [a, b] and g(x) ∈ [a, b] for all x ∈ [a, b]. In addition, suppose g(x) is differentiable in (a, b) and a positive constant k < 1 exists such that |g′(x)| ≤ k for all x ∈ (a, b). Then for any number p₀ ∈ [a, b], the sequence {pₙ} defined by

pₙ₊₁ = g(pₙ), n ≥ 0   (2.15)

converges to the unique fixed point p in the interval [a, b] and

|pₙ − p| ≤ kⁿ max{p₀ − a, b − p₀} for n = 1, 2, 3, ...   (2.16)

Proof. Convergence of the sequence {pₙ} to the fixed point p as n → ∞ is guaranteed from (2.16), since kⁿ → 0 as n → ∞. Let us prove (2.16). For each n, using the Mean Value Theorem we have

|pₙ₊₁ − p| = |g(pₙ) − g(p)| = |g′(ξₙ)| · |pₙ − p| ≤ k|pₙ − p|,

where ξₙ ∈ (a, b). Applying this inequality inductively gives

|pₙ − p| ≤ k|pₙ₋₁ − p| ≤ k²|pₙ₋₂ − p| ≤ ... ≤ kⁿ|p₀ − p|.

Since p₀ ∈ [a, b] and p ∈ [a, b], we have |p₀ − p| ≤ max{p₀ − a, b − p₀}, which completes the proof of inequality (2.16).


Remark 2.14. The inequality (2.16) gives us an upper bound for the error involved in using pₙ to approximate the fixed point p. As we can see from (2.16), the rate of convergence of the sequence {pₙ} to the fixed point p depends on the factor kⁿ. The smaller the value of k, the faster the convergence will be. However, the convergence may be very slow if k is close to 1.

Example 2.8. Consider the following quadratic equation

x² − 2x − 3 = 0,

which has a root p = 3 in the interval [2, 4].


a) We first rearrange the given equation to

x = √(2x + 3),

so that it is in the form (2.14) with g(x) = √(2x + 3). Then

min g(x) = g(2) = √7 and max g(x) = g(4) = √11 on [2, 4].

So, g(x) ∈ [2, 4] for all x ∈ [2, 4]. In addition, for all x ∈ (2, 4) we have

g′(x) = 1/√(2x + 3) ≤ 1/√7 < 1.

Then from Theorem 2.6 it follows that for any number p₀ ∈ [2, 4] the sequence {pₙ} defined by (2.15) must converge to the unique fixed point p = 3 in the interval [2, 4]. If we start with p₀ = 4, the successive fixed-point iterates are

p₁ = g(p₀) = √(2p₀ + 3) ≈ 3.316625,
p₂ = g(p₁) = √(2p₁ + 3) ≈ 3.103748,
p₃ = g(p₂) = √(2p₂ + 3) ≈ 3.034385,
p₄ = g(p₃) = √(2p₃ + 3) ≈ 3.011440,
p₅ = g(p₄) = √(2p₄ + 3) ≈ 3.003811,

and we see that the values are converging to p = 3.

b) Now we rearrange the given equation to

x = 3/(x − 2),

so that it is in the form (2.14) with g(x) = 3/(x − 2). If we start with p₀ = 4, in this case the successive fixed-point iterates are

p₁ = g(p₀) = 3/(p₀ − 2) = 1.5,
p₂ = g(p₁) = 3/(p₁ − 2) = −6,
p₃ = g(p₂) = 3/(p₂ − 2) = −0.375,
p₄ = g(p₃) = 3/(p₃ − 2) ≈ −1.263158,
p₅ = g(p₄) = 3/(p₄ − 2) ≈ −0.919355,

and we see that the values are converging to the different fixed point p = −1 ∉ [2, 4]. (Why?)
c) Finally, we rearrange the given equation to

x = (x² − 3)/2,

so that it is in the form (2.14) with g(x) = (x² − 3)/2. If we start with p₀ = 4, in this case the successive fixed-point iterates are

p₁ = g(p₀) = (p₀² − 3)/2 = 6.5,
p₂ = g(p₁) = (p₁² − 3)/2 = 19.625,
p₃ = g(p₂) = (p₂² − 3)/2 = 191.0703125,

and we see that the values are diverging. (Why?)

Remark 2.15. As we have seen in the previous example, the Fixed-


Point iteration method may converge to a root different from the ex-
pected one, or it may even diverge. Moreover, different rearrangements
may converge at different rates.

Remark 2.16. (Stopping conditions for the Fixed-Point method)
If the exact value of the root p is known in advance then at each step of Algorithm 3 we can directly use the condition for accuracy:

|p − pₙ| < TOL.

If the exact value of the root p is not known then we can measure the relative or absolute precision. So, for the stopping condition in Algorithm 3 we can use:

|pₙ₊₁ − pₙ| < TOL (for absolute precision)

or

|pₙ₊₁ − pₙ| / |pₙ₊₁| < TOL (for relative precision)

2.4 Newton-Raphson Method


Newton-Raphson (or Newton’s) method is one of the most power-
ful numerical methods for solving a root-finding problem.
Consider again a nonlinear algebraic equation:
f (x) = 0. (2.17)

To begin the Newton-Raphson method, we choose an initial approximation p₀ to the root p of equation (2.17). The tangent line to the curve y = f(x) at the point (p₀, f(p₀)) has the equation:

y = f(p₀) + f′(p₀)(x − p₀).   (2.18)


We take the intersection of this tangent line with the x-axis as the next approximation of the root, denoted by p₁. Then, putting y = 0 and x = p₁ in (2.18) and solving it for p₁, we obtain

p₁ = p₀ − f(p₀)/f′(p₀).

We reapply the process to obtain the next approximation p₂, and continue obtaining approximations p₃, p₄, ..., pₙ. Figure 3 geometrically illustrates the first 2 steps of the Newton-Raphson method. Algorithm 4 shows a pseudo-code of the Newton-Raphson method.

Figure 3: Newton-Raphson Method

Theorem 2.7. Suppose that f(x) is a continuous and twice differentiable function on the interval [a, b]. If p ∈ [a, b] is such that f(p) = 0 and f′(p) ≠ 0, then there exists δ > 0 such that the Newton-Raphson method for equation (2.17) generates a sequence {pₙ} converging to the root p for any initial guess p₀ ∈ [p − δ, p + δ].


Algorithm 4 Pseudo-code of the Newton-Raphson Method.

Choose p₀.
for n = 0, 1, 2, 3, ... do
    Compute f(pₙ)
    Compute f′(pₙ)
    if f′(pₙ) = 0 then
        Stop the search (Newton-Raphson method fails)
    end if
    Compute pₙ₊₁ = pₙ − f(pₙ)/f′(pₙ)
    if Stopping condition is satisfied then
        Stop the search (pₙ₊₁ is a sufficiently accurate approximation to the root of equation (2.17))
    end if
end for
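A minimal Python counterpart of Algorithm 4 is sketched below, applied to equation (2.20) from Example 2.10 below (the function name newton and the tolerance are our choices):

import math

def newton(f, fprime, p0, tol=1e-3, max_iter=50):
    p = p0
    for n in range(1, max_iter + 1):
        fp = fprime(p)
        if fp == 0:
            raise ZeroDivisionError("f'(p_n) = 0: Newton-Raphson method fails")
        p_next = p - f(p) / fp             # Newton-Raphson update
        if abs(p_next - p) < tol:          # absolute-precision stopping rule
            return p_next, n
        p = p_next
    return p, max_iter

# Root of 3x + sin(x) - e^x = 0 with p0 = 0 (Example 2.10 below)
f = lambda x: 3*x + math.sin(x) - math.exp(x)
fprime = lambda x: 3 + math.cos(x) - math.exp(x)
print(newton(f, fprime, 0.0))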

Theorem 2.8. Suppose that the assumptions of Theorem 2.7 are satisfied. Then the Newton-Raphson method for equation (2.17) generates a sequence {pₙ} converging to the root p quadratically.
Proof. Using the Taylor expansion about pₙ, we have

f(p) = f(pₙ) + f′(pₙ)(p − pₙ) + f″(ξ)(p − pₙ)²/2,

where ξ is between p and pₙ. Since f(p) = 0, we get

f′(pₙ)(pₙ − p) − f(pₙ) = f″(ξ)(p − pₙ)²/2.

Dividing both sides by f′(pₙ), we obtain

pₙ − p − f(pₙ)/f′(pₙ) = f″(ξ)(p − pₙ)²/(2f′(pₙ)).

Putting pₙ₊₁ = pₙ − f(pₙ)/f′(pₙ), we arrive at

pₙ₊₁ − p = f″(ξ)(p − pₙ)²/(2f′(pₙ)).

Then

lim(n→∞) |pₙ₊₁ − p| / |pₙ − p|² = K, where K = |f″(p)| / (2|f′(p)|),

which by definition means quadratic convergence.

Remark 2.17. (Advantages/disadvantages of Newton's method)
The results of Theorem 2.8 imply that the Newton-Raphson method has quadratic convergence, so it tends to approximately double the number of digits of accuracy with each successive iteration. However, there are a number of disadvantages. First of all, the Newton-Raphson method is not guaranteed to always converge. Moreover, an upper bound for the error in the approximations is not known when the Newton-Raphson method is used.

Remark 2.18. (Stopping conditions for Newton's method)
If the exact value of the root p is known in advance then at each step of Algorithm 4 we can directly use the condition for accuracy:

|p − pₙ| < TOL.

If the exact value of the root p is not known then we can measure the relative or absolute precision. So, for the stopping condition in Algorithm 4 we can use:

|pₙ₊₁ − pₙ| < TOL (for absolute precision)

or

|pₙ₊₁ − pₙ| / |pₙ₊₁| < TOL (for relative precision)

Example 2.9. Use the Newton-Raphson method to approximate a root of the equation

x³ − 7x² + 14x − 6 = 0   (2.19)

in the interval [0, 1] within an accuracy of 10⁻⁶. (You can use the exact value of the root p = 2 − √2 = 0.5857864376269...)


Solution: Let us denote f(x) = x³ − 7x² + 14x − 6. Then f′(x) = 3x² − 14x + 14. We begin the Newton-Raphson method by choosing an initial approximation p₀ to the root of equation (2.19). Let p₀ = 0.

1 step: We have f(p₀) = f(0) = −6 and f′(p₀) = f′(0) = 14.
Then p₁ = p₀ − f(p₀)/f′(p₀) = 0 − (−6)/14 = 3/7 = 0.42857142857143...
Accuracy = |p − p₁| ≈ 0.1572 > 10⁻⁶, so we continue with the next step.

2 step: We have f(p₁) = −1.2069970845481... and f′(p₁) = 8.55102040816326...
Then p₂ = p₁ − f(p₁)/f′(p₁) = 0.56972383225367...
Accuracy = |p − p₂| ≈ 0.0161 > 10⁻⁶, so we continue with the next step.

3 step: We have f(p₂) = −0.11103911401737... and f′(p₂) = 6.99762208356209...
Then p₃ = p₂ − f(p₂)/f′(p₂) = 0.58559195326555...
Accuracy = |p − p₃| ≈ 1.9448 × 10⁻⁴ > 10⁻⁶, so we continue with the next step.

4 step: We have f(p₃) = −0.00132822059431... and f′(p₃) = 6.83046646147043...
Then p₄ = p₃ − f(p₃)/f′(p₃) = 0.58578640859328...
Accuracy = |p − p₄| ≈ 2.9034 × 10⁻⁸ < 10⁻⁶, so we can stop the iterations.

So, in 4 steps the Newton-Raphson method gives the approximation p₄ = 0.58578640859328... for the root p = 0.5857864376269... of equation (2.19) within an accuracy of 10⁻⁶.


Example 2.10. Use the Newton-Raphson method to approximate a root of the equation

    3x + sin x − e^x = 0        (2.20)

in the interval [0, 1] within an absolute precision 10^(-3).

Solution: Let us denote f(x) = 3x + sin x − e^x. Then f'(x) = 3 + cos x − e^x. We begin the Newton-Raphson method by choosing an initial approximation p0 to the root of equation (2.20). Let p0 = 0.

1 step: We have f(p0) = f(0) = −1 and f'(p0) = f'(0) = 3.
Then p1 = p0 − f(p0)/f'(p0) = 0 − (−1)/3 = 1/3 = 0.33333333333333 . . .
Precision = |p1 − p0| ≈ 0.3333 > 10^(-3), so we continue with the next step.

2 step: We have f(p1) = −0.06841772828994 . . . and f'(p1) = 2.54934452122865 . . .
Then p2 = p1 − f(p1)/f'(p1) = 0.36017071357763 . . .
Precision = |p2 − p1| ≈ 0.0268 > 10^(-3), so we continue with the next step.

3 step: We have f(p2) = −0.00062798507057 . . . and f'(p2) = 2.50226254780641 . . .
Then p3 = p2 − f(p2)/f'(p2) = 0.36042168047602 . . .
Precision = |p3 − p2| ≈ 0.000251 < 10^(-3), so we can stop the iterations.

So, the Newton-Raphson method in 3 steps gives an approximation p3 = 0.36042168047602 . . . for the root of equation (2.20) within an absolute precision 10^(-3).


Remark 2.19. As we have already mentioned, the Newton-Raphson method may fail to converge to the desired root. There might be different reasons for failure:

• We may have the so-called ping-pong effect, illustrated in the figure below. If Newton's method starts with p0 = 1, then following the tangent lines we get p1 = −1 and p2 = 1 = p0. So after 2 steps, we are back to the point where we started.

• We may have the so-called runaway effect, illustrated in the figure below. If Newton's method starts at x0, then the tangent lines will take the next approximations farther away from the root r.

• The Newton-Raphson method may converge to the wrong root, not the one we want. It can happen if the equation has multiple roots.

• If at some step of the Newton-Raphson algorithm we get f'(pn) = 0 then the next approximation pn+1 cannot be found.

The only remedy for failure of the Newton-Raphson method is to try a different initial guess p0.

2.5 Self-study Problems


Problem 2.1. Consider the equation

x − cos x = 0.

a) Verify that the given equation has a root in the interval [0, 1].
b) Use the Bisection method to find first 3 approximations to the
root of given equation in the interval [0, 1].
c) Use the False Position method to find first 3 approximations to
the root of given equation in the interval [0, 1].

Problem 2.2. Consider the equation

x³ − 2x² − 5 = 0

which has a root p = 2.69064744802878 . . . in the interval [2, 3].


a) Use the Bisection method to approximate the root of given equa-
tion in the interval [2, 3] within an accuracy of 10−2 .
b) Use the False Position method to approximate the root of given
equation in the interval [2, 3] within an accuracy of 10−2 .

Problem 2.3. Show that each of the following functions has a unique
fixed point in interval [0, 1]

a) g(x) = 2^(−x)        b) g(x) = cos x


Problem 2.4. Use Fixed-Point iteration method with the starting


value p0 = 2.5 to compute the first five approximations for the fixed point of the function g(x) = 2√(x − 1).

Problem 2.5. Use the Fixed-Point iteration method with the starting
value p0 = 1 to approximate the root p = 1.94712296670701 . . . of the
equation
x⁴ − 3x² − 3 = 0
within an accuracy 10^(-2). (Hint: first rearrange the given equation to the appropriate fixed-point form x = g(x).)

Problem 2.6. Use the Newton-Raphson method with initial guess p0 = 2 to approximate a root of the equation

x³ − 2x² − 5 = 0

in interval [2, 3] within an accuracy of 10−4 . (You can use the exact
value of the root p = 2.69064744802878 . . .)

Problem 2.7. Consider the equation

x(x² + 2) = 4

a) Use the Bisection method to approximate the root of given equa-


tion in the interval [1, 2] within accuracy of 10−2 . (Exact value
of the root is p = 1.17950902 . . .)

b) Use the False Position method to approximate the root of given


equation in the interval [1, 2] within accuracy of 10−2 .

c) Use the Newton-Raphson method with initial guess p0 = 1 to


approximate the root of given equation within accuracy of 10−3 .

3 Solutions of Linear Systems of Equations with Gaussian Elimination Method

Solving systems of linear equations is the most frequently used numerical procedure when real-world situations are modeled. For instance, numerical methods for solving ordinary and partial differential equations depend on them.
When a system of linear equations has many equations, it is difficult
to discuss them without using matrices and vectors. So, we first give
a brief overview of main concepts and terminology which are needed
later.

3.1 Matrices and Vectors


Definition 3.1. A vector is a one-dimensional array of numbers. We distinguish two types of vectors: row vectors and column vectors.

    u = [ u1  u2  ...  um ]  is a row vector with m entries.

    v = [ v1 ]
        [ v2 ]
        [ ...]
        [ vn ]  is a column vector with n entries.

Lowercase letters, such as u, v, a, b, x, . . ., are used to denote vectors.

Definition 3.2. A matrix is a two-dimensional array of numbers.

    A = [ a11  a12  ...  a1m ]
        [ a21  a22  ...  a2m ]
        [ ...            ... ]
        [ an1  an2  ...  anm ]  is a matrix with n rows and m columns.

If matrix A has n rows and m columns then we usually say that matrix A is of size n × m. Another short notation for matrix A of size n × m is A = [aij]n×m. Uppercase letters, such as A, B, U, L, . . ., are used to denote matrices.


Remark 3.1. A vector can be actually considered as a matrix. For


instance, the row vector u in Definition 3.1 is the matrix of size 1 × m
and the column vector v there is the matrix of size n × 1.

Definition 3.3. A matrix with the same number of rows and columns is called a square matrix.

    A = [ a11  a12  ...  a1n ]
        [ a21  a22  ...  a2n ]
        [ ...            ... ]
        [ an1  an2  ...  ann ]  is a square matrix of size n × n.

Definition 3.4. If all elements of a square matrix below the main diagonal are 0 then the matrix is called an upper-triangular matrix. For instance,

    U = [ u11  u12  ...  u1n ]
        [  0   u22  ...  u2n ]        (3.1)
        [ ...            ... ]
        [  0    0   ...  unn ]

is an upper-triangular matrix.

Definition 3.5. If all elements of a square matrix above the main diagonal are 0 then the matrix is called a lower-triangular matrix. For instance,

    L = [ l11   0   ...   0  ]
        [ l21  l22  ...   0  ]        (3.2)
        [ ...            ... ]
        [ ln1  ln2  ...  lnn ]

is a lower-triangular matrix.


Definition 3.6. A square matrix in which all the main diagonal elements are 1's and all the remaining elements are 0's is called an identity matrix.

    In = [ 1  0  ...  0 ]
         [ 0  1  ...  0 ]
         [ ...       ...]
         [ 0  0  ...  1 ]  is an identity matrix of size n.

3.1.1 Operations on Matrices


Definition 3.7. (Addition of matrices)
Let A = [aij ]n×m and B = [bij ]n×m be two matrices of the same size.
The sum of matrices A and B is a matrix C = [cij ]n×m , denoted
by C = A + B, such that cij = aij + bij for all i = 1, 2, . . . , n and
j = 1, 2, . . . , m.

Definition 3.8. (Multiplication of matrix by scalar)


Let A = [aij ]n×m be a matrix and λ be a constant. A scalar mul-
tiplication of matrix A with constant λ is a matrix C = [cij ]n×m ,
denoted by C = λA, such that cij = λ · aij for all i = 1, 2, . . . , n and
j = 1, 2, . . . , m.
   
Example 3.1. Let

    A = [  1  2 ]        B = [  2   3 ]
        [ −1  3 ]  and       [ −4  −1 ]

be two matrices. Then

    A − 2B = [  1  2 ] + [ −4  −6 ] = [ −3  −4 ]
             [ −1  3 ]   [  8   2 ]   [  7   5 ]

Definition 3.9. (Multiplication of matrices)
Let A = [aij]n×m and B = [bij]m×p be two matrices. The product of matrix A by matrix B is a matrix C = [cij]n×p, denoted by C = A · B, such that

    cij = Σ_{k=1}^{m} aik · bkj   for all i = 1, 2, . . . , n and j = 1, 2, . . . , p.

     
Example 3.2.

    [ 1  2 ] · [  1  0  1 ] = [ −1  2  1 ]
    [ 3  4 ]   [ −1  1  0 ]   [ −1  4  3 ]

Remark 3.2. A · B ≠ B · A in general.

Definition 3.10. (Transpose of matrix)
The transpose of a matrix A = [aij]n×m is a matrix of size m × n, denoted by A^T, which is obtained by writing the rows of A as columns.

Example 3.3. Let

    A = [  1  0  1 ]
        [ −1  1  0 ]

Then

    A^T = [ 1  −1 ]
          [ 0   1 ]
          [ 1   0 ]

Definition 3.11. A square matrix A is called symmetric if A = A^T.

Definition 3.12. (Inverse matrix)
Let A be a square matrix of size n × n. If there exists a square matrix B of the same size, such that B · A = A · B = In, then it is called an inverse of A, denoted by B = A^(−1).

Remark 3.3. Finding the inverse of very large square matrix is com-
putationally expensive.

Definition 3.13. (Determinant of square matrix)
The determinant of a square matrix A = [aij]n×n is a number, denoted by det(A) or |A|, which is defined as follows:

    det(A) = Σ_{k=1}^{n!} sgn(pk) · a1,pk(1) · a2,pk(2) · . . . · an,pk(n)

where the sum runs over all n! permutations pk of the indices 1, 2, . . . , n. For instance,

    | a11  a12 |
    | a21  a22 | = a11 a22 − a12 a21

and

    | a11  a12  a13 |
    | a21  a22  a23 | = a11 a22 a33 + a21 a32 a13 + a12 a23 a31 − a31 a22 a13 − a32 a23 a11 − a21 a12 a33.
    | a31  a32  a33 |

Remark 3.4. Calculating the determinant of a square matrix of size


n × n requires approximately n · n! arithmetic operations and therefore
it is computationally very expensive for large matrices.

Theorem 3.1. An inverse of a square matrix A exists if and only if det(A) ≠ 0.

Definition 3.14. A square matrix A is called singular if det(A) = 0. So, a singular matrix does not have an inverse. If det(A) ≠ 0 then the matrix A is called nonsingular. So, a nonsingular matrix does have an inverse.

Remark 3.5. For triangular matrices determinant is equal to the prod-


uct of main diagonal elements. So, for upper-triangular matrix (3.1)
we have det(U) = u11 · u22 · . . . · unn and for lower-triangular matrix
(3.2) we have det(L) = l11 · l22 · . . . · lnn .

3.2 Systems of Linear Equations


Consider the system of linear equations in the standard form:

    a11 x1 + a12 x2 + . . . + a1n xn = b1
    a21 x1 + a22 x2 + . . . + a2n xn = b2        (3.3)
    ..............................
    an1 x1 + an2 x2 + . . . + ann xn = bn

System (3.3) can be written in the so-called matrix form:

    Ax = b,        (3.4)

where x = [ x1, x2, . . . , xn ]^T is the vector of unknowns,

    A = [ a11  a12  ...  a1n ]
        [ a21  a22  ...  a2n ]
        [ ...            ... ]
        [ an1  an2  ...  ann ]

is the given matrix of coefficients and b = [ b1, b2, . . . , bn ]^T is the given right-hand side vector.

Definition 3.15. A system of linear equations is called inconsistent if there exists no solution.

Example 3.4. The system of equations

    x1 + 2x2 = 3
    2x1 + 4x2 = 5

does not have a solution and therefore it is inconsistent. Note that the determinant of the matrix of coefficients is 0 for this system.

Remark 3.6. A system of linear equations may have infinitely many solutions.

Example 3.5. The system of equations

    2x1 + x2 = 3
    4x1 + 2x2 = 6

has infinitely many solutions. Note that the determinant of the matrix of coefficients is 0 for this system.

Theorem 3.2. The system of linear equations (3.4) has a unique solution if and only if det(A) ≠ 0.


The methods which are used to solve the system of linear equa-
tions can be classified into two classes: direct methods and iterative
methods. Here is the list of some of these methods:

• Direct methods

– Cramer’s rule, which is not practical for large systems,


since it requires the calculation of determinants. As we have
already mentioned, calculation of determinants for large ma-
trices is computationally very expensive. Therefore, we will
not discuss Cramer’s rule in this course.
– Matrix Inversion method: Ax = b =⇒ x = A−1 b. It
is not practical for large systems, since finding the inverse
of large matrix is computationally expensive. Therefore, we
will not discuss this method in this course.
– Gaussian Elimination method
– Gauss-Jordan Elimination method (will not be studied
in this course)
– LU decomposition method

• Iterative methods

– Jacobi method
– Gauss-Seidel method
– SOR method (will not be studied in this course)

3.3 Gaussian Elimination Method


Definition 3.16. Two linear systems Ax = b and Ãx = b̃ are called
equivalent if their solution sets are the same.

First suppose we have a system of equations that is in upper-triangular form Ux = b̃, where U is defined by (3.1). Writing it in standard form, we have

    u11 x1 + u12 x2 + . . . + u1n xn = b̃1
             u22 x2 + . . . + u2n xn = b̃2
             ..................
                             unn xn = b̃n

This system can be easily solved by using the so-called back-substitution method:

    xn = b̃n / unn,    xk = ( b̃k − Σ_{j=k+1}^{n} ukj xj ) / ukk,    k = n−1, . . . , 2, 1.        (3.5)

Example 3.6. By using the back-substitution method solve the system of equations:

    5x1 + 3x2 − 2x3 = −3
          6x2 +  x3 = −1
                2x3 = 10

Solution: From the last equation we have x3 = 10/2 = 5. Then from the second equation we get

    x2 = (−1 − x3)/6 = (−1 − 5)/6 = −1.

Finally, from the first equation we obtain

    x1 = (−3 − 3x2 + 2x3)/5 = (−3 + 3 + 10)/5 = 2.
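Formula (3.5) is straightforward to program. A minimal Python sketch (using numpy; the function name back_substitution is our own), checked against Example 3.6:

    import numpy as np

    def back_substitution(U, b):
        """Solve Ux = b for an upper-triangular matrix U by formula (3.5)."""
        n = len(b)
        x = np.zeros(n)
        for k in range(n - 1, -1, -1):
            x[k] = (b[k] - U[k, k+1:] @ x[k+1:]) / U[k, k]
        return x

    U = np.array([[5.0, 3.0, -2.0],
                  [0.0, 6.0,  1.0],
                  [0.0, 0.0,  2.0]])
    b = np.array([-3.0, -1.0, 10.0])
    print(back_substitution(U, b))   # [ 2. -1.  5.]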
5 5

Remark 3.7. All elimination methods used for solving the system (3.4), including the Gaussian elimination method, have a general strategy which consists of two steps:

1. Reduce the system Ax = b to an equivalent system in upper-triangular form Ux = b̃. This step is called triangularization or forward elimination.

2. Use the back-substitution method to solve the resulting system Ux = b̃.

3.3.1 Triangularization
Definition 3.17. The matrix

    [A|b] = [ a11  a12  ...  a1n | b1 ]
            [ a21  a22  ...  a2n | b2 ]
            [ ...                | ...]
            [ an1  an2  ...  ann | bn ]

is called the augmented matrix for the system (3.3).

Theorem 3.3. The following row operations applied to the augmented matrix result in an equivalent linear system:
1. The order of two rows can be changed.
2. A row can be replaced by the sum of that row and a multiple of any other row.

Gaussian Elimination Algorithm for Triangularization:

1 step We start with the augmented matrix

    [ a11(1)  a12(1)  a13(1)  ...  a1n(1) | b1(1) ]
    [ a21(1)  a22(1)  a23(1)  ...  a2n(1) | b2(1) ]
    [ a31(1)  a32(1)  a33(1)  ...  a3n(1) | b3(1) ]
    [  ...     ...     ...          ...   |  ...  ]
    [ an1(1)  an2(1)  an3(1)  ...  ann(1) | bn(1) ]

where aij(1) = aij and bi(1) = bi for all i and j.
At the first step a11(1) is called a pivot element and the first row is called a pivot row. Assume that a11(1) ≠ 0.
For each i = 2, 3, . . . , n we calculate mi1 = ai1(1)/a11(1) and then apply

    aij(2) = aij(1) − mi1 · a1j(1),   j = 2, 3, . . . , n
    bi(2)  = bi(1)  − mi1 · b1(1)                                (3.6)

Formulas (3.6) mean that the i-th row is replaced by the difference of that row and mi1 times the first row, so Ri = Ri − mi1 R1. These operations reduce the augmented matrix to the new form:

    [ a11(1)  a12(1)  a13(1)  ...  a1n(1) | b1(1) ]
    [   0     a22(2)  a23(2)  ...  a2n(2) | b2(2) ]
    [   0     a32(2)  a33(2)  ...  a3n(2) | b3(2) ]
    [  ...     ...     ...          ...   |  ...  ]
    [   0     an2(2)  an3(2)  ...  ann(2) | bn(2) ]

2 step At the second step a22(2) is the pivot element and the second row is the pivot row. Assume that a22(2) ≠ 0.
For each i = 3, 4, . . . , n we calculate mi2 = ai2(2)/a22(2) and then apply

    aij(3) = aij(2) − mi2 · a2j(2),   j = 3, 4, . . . , n
    bi(3)  = bi(2)  − mi2 · b2(2)                                (3.7)

Formulas (3.7) mean that the i-th row is replaced by the difference of that row and mi2 times the second row, so Ri = Ri − mi2 R2. These operations reduce the augmented matrix to the new form:

    [ a11(1)  a12(1)  a13(1)  ...  a1n(1) | b1(1) ]
    [   0     a22(2)  a23(2)  ...  a2n(2) | b2(2) ]
    [   0       0     a33(3)  ...  a3n(3) | b3(3) ]
    [  ...     ...     ...          ...   |  ...  ]
    [   0       0     an3(3)  ...  ann(3) | bn(3) ]

We continue this process until we get an equivalent system in upper-triangular form.
Example 3.7. By using the Gaussian Elimination method solve the system of equations:

     x1 −  x2 + 3x3 = 13
    −3x1 −  x2 + 4x3 = 8
     4x1 − 2x2 +  x3 = 15

Solution: We first apply the Gaussian Elimination Algorithm to reduce the given system to an equivalent system in upper-triangular form.

1 step We start with the augmented matrix

    [  1  −1  3 | 13 ]
    [ −3  −1  4 |  8 ]
    [  4  −2  1 | 15 ]

We first find m21 = −3/1 = −3 and m31 = 4/1 = 4. Then we apply the following row operations: R2 = R2 + 3R1 and R3 = R3 − 4R1. These operations reduce the augmented matrix to the new form:

    [ 1  −1    3 |  13 ]
    [ 0  −4   13 |  47 ]
    [ 0   2  −11 | −37 ]

2 step We first find m32 = 2/(−4) = −0.5. Then the row operation R3 = R3 + 0.5R2 reduces the augmented matrix to the new form:

    [ 1  −1     3 |   13  ]
    [ 0  −4    13 |   47  ]
    [ 0   0  −4.5 | −13.5 ]

So, the given system is equivalent to the following system in upper-triangular form:

    x1 −  x2 +   3x3 = 13
         −4x2 +  13x3 = 47
               −4.5x3 = −13.5

Now, we can use the back-substitution method to find the solution of the resulting system. From the last equation we have x3 = −13.5/(−4.5) = 3. Then from the second equation we get

    x2 = (47 − 13x3)/(−4) = (47 − 39)/(−4) = −2.

Finally, from the first equation we obtain

    x1 = 13 + x2 − 3x3 = 13 − 2 − 9 = 2.
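The forward elimination of Example 3.7 can be coded directly. A minimal numpy sketch without pivoting (all pivots are assumed nonzero; the function name is our own):

    import numpy as np

    def gauss_eliminate(A, b):
        """Reduce [A|b] to upper-triangular form by forward elimination."""
        A, b = A.astype(float).copy(), b.astype(float).copy()
        n = len(b)
        for k in range(n - 1):
            for i in range(k + 1, n):
                m = A[i, k] / A[k, k]        # multiplier m_ik, pivot assumed nonzero
                A[i, k:] -= m * A[k, k:]     # R_i = R_i - m * R_k
                b[i] -= m * b[k]
        return A, b

    A = np.array([[1, -1, 3], [-3, -1, 4], [4, -2, 1]])
    b = np.array([13, 8, 15])
    U, b_new = gauss_eliminate(A, b)
    print(back_substitution(U, b_new))   # [ 2. -2.  3.]; back_substitution as sketched above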

3.3.2 Pivoting
Gaussian Elimination method for triangularization has two pitfalls.
If at some step of the algorithm a zero is encountered on the main
diagonal then we cannot use that row to eliminate the coefficients
below that zero element. The following example illustrates this issue.
Example 3.8. Consider the system of equations:

     x1 −  x2 + 2x3 −  x4 = −8
    2x1 − 2x2 + 3x3 − 3x4 = −20
     x1 +  x2 +  x3       = −2
     x1 −  x2 + 4x3 + 3x4 = 4

The Gaussian Elimination method for triangularization of this system starts with the augmented matrix:

    [ 1  −1  2  −1 |  −8 ]
    [ 2  −2  3  −3 | −20 ]
    [ 1   1  1   0 |  −2 ]
    [ 1  −1  4   3 |   4 ]

We find m21 = 2, m31 = 1 and m41 = 1. Then we apply the following row operations: R2 = R2 − 2R1, R3 = R3 − R1 and R4 = R4 − R1. These operations reduce the augmented matrix to the new form:

    [ 1  −1   2  −1 | −8 ]
    [ 0   0  −1  −1 | −4 ]
    [ 0   2  −1   1 |  6 ]
    [ 0   0   2   4 | 12 ]

Now, the pivot element for the second step is 0 and therefore the procedure cannot continue in the present form.

Remark 3.8. (Trivial Pivoting)
The problem of having a zero pivot element can be easily fixed with the so-called trivial pivoting strategy: at each k-th step of the Gaussian Elimination method

• if akk(k) ≠ 0 then do not switch rows

• if akk(k) = 0 then locate the first r-th row below the k-th row in which ark(k) ≠ 0 and switch rows k and r.

So, the trivial pivoting strategy suggests that for the system in Example 3.8 after the first step of the Gaussian Elimination algorithm we need to switch the second and the third rows. That will change the augmented matrix to the form:

    [ 1  −1   2  −1 | −8 ]
    [ 0   2  −1   1 |  6 ]
    [ 0   0  −1  −1 | −4 ]
    [ 0   0   2   4 | 12 ]

and now the algorithm can be continued with the next step.

If all the calculations could be done using exact arithmetic, the trivial pivoting strategy would be enough. However, there is another potential danger for the Gaussian Elimination method which stems from round-off errors. The following example illustrates this issue.

Example 3.9. Apply the Gaussian Elimination method to the system of equations:

    0.003x1 + 59.14x2 = 59.17
    5.291x1 − 6.130x2 = 46.78

using four-digit arithmetic with rounding. Note that the exact solution of the system is: x1 = 10 and x2 = 1.

Solution: We first apply the Gaussian Elimination Algorithm to reduce the given system to an equivalent system in upper-triangular form. We start with the augmented matrix

    [ 0.003   59.14 | 59.17 ]
    [ 5.291  −6.130 | 46.78 ]

The first multiplier is m21 = 5.291/0.003 = 1763.6666 . . . which rounds to m̂21 = 1764. Then, applying the row operation R2 = R2 − 1764R1 with four-digit rounding arithmetic reduces the augmented matrix to the new form:

    [ 0.003    59.14  |   59.17  ]
    [   0    −104300  | −104400  ]

So, the given system by using the Gaussian Elimination Algorithm with four-digit rounding arithmetic is reduced to the following system in upper-triangular form:

    0.003x1 + 59.14x2 = 59.17
            − 104300x2 ≈ −104400

Now, we can use the back-substitution method to find the solution of the resulting system. From the second equation we have

    x2 ≈ −104400/(−104300) ≈ 1.00095877277 . . .

which rounds to x̂2 = 1.001. We see that x̂2 gives a reasonably good approximation to the actual value x2 = 1. Now, from the first equation by using four-digit rounding arithmetic we obtain

    x1 = (59.17 − 59.14x2)/0.003 ≈ (59.17 − 59.14x̂2)/0.003 = (59.17 − 59.14 · 1.001)/0.003 =
       = (59.17 − 59.19914)/0.003 ≈ (59.17 − 59.20)/0.003 = −0.03/0.003 = −10.

So, x̂1 = −10. This approximation compared to the actual value x1 = 10 illustrates how round-off errors can cause unexpected results.

Remark 3.9. (Partial Pivoting)
Round-off errors can be minimized with the so-called partial pivoting strategy: at each k-th step of the Gaussian Elimination method

• locate the row r in which the element of column k has the largest absolute value, so

    |ark(k)| = max{ |akk(k)|, |ak+1,k(k)|, . . . , |ank(k)| }

• switch rows k and r.
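As a sketch of how this strategy changes the elimination loop, the version below (our own illustration) adds the row search and swap at each step k:

    import numpy as np

    def gauss_eliminate_partial(A, b):
        """Forward elimination with the partial pivoting strategy of Remark 3.9."""
        A, b = A.astype(float).copy(), b.astype(float).copy()
        n = len(b)
        for k in range(n - 1):
            r = k + np.argmax(np.abs(A[k:, k]))   # row with largest |a_rk| in column k
            if r != k:                            # switch rows k and r
                A[[k, r]] = A[[r, k]]
                b[[k, r]] = b[[r, k]]
            for i in range(k + 1, n):
                m = A[i, k] / A[k, k]
                A[i, k:] -= m * A[k, k:]
                b[i] -= m * b[k]
        return A, b

Run on the system of Example 3.9, it performs the same initial row switch as in Example 3.10 below.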


Example 3.10. Let us consider the system from the previous example:

    0.003x1 + 59.14x2 = 59.17
    5.291x1 − 6.130x2 = 46.78

and apply this time the Gaussian Elimination method with partial pivoting using four-digit arithmetic with rounding.

Solution: We start with the same augmented matrix

    [ 0.003   59.14 | 59.17 ]
    [ 5.291  −6.130 | 46.78 ]

Since 5.291 > 0.003, the partial pivoting strategy suggests that we need to switch the rows. That will change the augmented matrix to the form:

    [ 5.291  −6.130 | 46.78 ]
    [ 0.003   59.14 | 59.17 ]

Now, the first multiplier is m21 = 0.003/5.291 = 0.000567000567 . . . which rounds to m̂21 = 0.000567. Then, applying the row operation R2 = R2 − 0.000567R1 with four-digit rounding arithmetic reduces the augmented matrix to the new form:

    [ 5.291  −6.130 | 46.78 ]
    [   0     59.14 | 59.14 ]

So, the given system this time is reduced to the following system in upper-triangular form:

    5.291x1 − 6.130x2 = 46.78
               59.14x2 ≈ 59.14

Now, we can use the back-substitution method to find the solution of the resulting system. From the second equation we immediately have x̂2 = 1 which is the same as the actual value x2 = 1. Then, from the first equation by using four-digit rounding arithmetic we obtain

    x1 = (46.78 + 6.130x2)/5.291 ≈ (46.78 + 6.130x̂2)/5.291 = (46.78 + 6.130)/5.291 = 52.91/5.291 = 10

So, x̂1 = 10 which is the same as the actual value x1 = 10. We see that with partial pivoting we eliminated the round-off errors completely.

Example 3.11. Apply the Gaussian Elimination method with trivial pivoting to solve the following system of equations:

          x2 +  x3 = 2
    x1 +  x2 −  x3 = 1
   2x1 −  x2 −  x3 = 0

Solution: Let us solve the problem with the use of the augmented matrix. Since the first pivot is 0, we switch R1 ↔ R2:

    [ 0   1   1 | 2 ]            [ 1   1  −1 | 1 ]
    [ 1   1  −1 | 1 ]   =⇒       [ 0   1   1 | 2 ]
    [ 2  −1  −1 | 0 ]            [ 2  −1  −1 | 0 ]

With m31 = 2/1 = 2 the row operation R3 = R3 − 2R1 gives

    [ 1   1  −1 |  1 ]
    [ 0   1   1 |  2 ]
    [ 0  −3   1 | −2 ]

and with m32 = −3/1 = −3 the row operation R3 = R3 + 3R2 gives

    [ 1   1  −1 | 1 ]
    [ 0   1   1 | 2 ]
    [ 0   0   4 | 4 ]

So, the given system is equivalent to the following system in upper-triangular form:

    x1 + x2 −  x3 = 1
         x2 +  x3 = 2
              4x3 = 4

Now with the back-substitution method we get x3 = 1, x2 = 1, x1 = 1.


Example 3.12. Apply the Gaussian Elimination method with partial pivoting to solve the following system of equations:

     x1 + 3x2 + 2x3 = 5
    2x1 + 4x2 − 6x3 = −4
     x1 + 5x2 + 3x3 = 10

Solution: Let us solve the problem with the use of the augmented matrix. Since |2| is the largest element in the first column, we switch R1 ↔ R2:

    [ 1  3   2 |  5 ]            [ 2  4  −6 | −4 ]
    [ 2  4  −6 | −4 ]   =⇒       [ 1  3   2 |  5 ]
    [ 1  5   3 | 10 ]            [ 1  5   3 | 10 ]

With m21 = 1/2 = 0.5 (R2 = R2 − 0.5R1) and m31 = 1/2 = 0.5 (R3 = R3 − 0.5R1) we get

    [ 2  4  −6 | −4 ]
    [ 0  1   5 |  7 ]
    [ 0  3   6 | 12 ]

Since |3| > |1| in the second column, we switch R2 ↔ R3:

    [ 2  4  −6 | −4 ]
    [ 0  3   6 | 12 ]
    [ 0  1   5 |  7 ]

and with m32 = 1/3 the row operation R3 = R3 − (1/3)R2 gives

    [ 2  4  −6 | −4 ]
    [ 0  3   6 | 12 ]
    [ 0  0   3 |  3 ]

So, the given system is equivalent to the following system in upper-triangular form:

    2x1 + 4x2 − 6x3 = −4
          3x2 + 6x3 = 12
                 3x3 = 3

Now with the back-substitution method we get x3 = 1, x2 = 2, x1 = −3.

3.3.3 Complexity of Gaussian Elimination method

When we solve the system (3.3) with n linear equations by using the Gaussian Elimination method we have two steps: triangularization and back-substitution. It is easy to show that:

• Triangularization requires 2n³/3 + n²/2 − 7n/6 arithmetic operations

• Back-substitution requires n² arithmetic operations

So, the Gaussian Elimination method in total requires 2n³/3 + 3n²/2 − 7n/6 arithmetic operations.

3.4 Self-study Problems


Problem 3.1. Use Gaussian Elimination method with trivial pivoting
to solve the following system of equations:

 x1 − x2 + 3x3 = 2

3x1 − 3x2 + x3 = −1

x1 + x2 = 3

Problem 3.2. Use Gaussian Elimination method with trivial pivoting


to solve the following system of equations:

 x2 + x3 = 6
x1 − 2x2 − x3 = 4
x1 − x2 + x3 = 5


Problem 3.3. Use Gaussian Elimination method with partial pivoting


to solve the following system of equations:

 3x1 + x2 + 6x3 = 17
x1 + 3x3 = 5
− 5x1 + 2x2 − x3 = −1

Problem 3.4. Use Gaussian Elimination method with partial pivoting


to solve the following system of equations:

2x1 − 3x2 + 2x3 = 5




− 4x1 + 2x2 − 6x3 = 14

2x1 + 2x2 + 4x3 = 8

Problem 3.5. Consider the system of equations:




 x1 − x2 + 2x3 − x4 = −8

 2x −
1 2x2 + 3x3 − 3x4 = −20

 x1 + x2 + x3 = −2

 x −
1 x2 + 4x3 + 3x4 = 4

a) Use Gaussian Elimination method with trivial pivoting to solve


the given system.

b) Use Gaussian Elimination method with partial pivoting to solve


the given system.

Problem 3.6. Consider the system of equations:




 x1 + 2x2 + x3 + 4x4 = 13

 2x1 + 4x3 + 3x4 = 28

 4x1 + 2x2 + 2x3 + x4 = 20

 − 3x + x + 3x + 2x = 6
1 2 3 4


a) Use Gaussian Elimination method with trivial pivoting to solve


the given system.

b) Use Gaussian Elimination method with partial pivoting to solve


the given system.

Problem 3.7. Consider the system of equations:




 2x2 + x4 = 0

 2x +
1 2x2 + 3x3 + 2x4 = −2

 4x1 − 3x2 + x4 = −7

 6x +
1 x2 − 6x3 − x4 = 6

a) Use Gaussian Elimination method with trivial pivoting to solve


the given system.

b) Use Gaussian Elimination method with partial pivoting to solve


the given system.

Problem 3.8. Consider the system of equations:



0.03x1 + 58.9x2 = 59.2
5.31x1 − 6.10x2 = 47.0

a) Use Gaussian Elimination method with trivial pivoting and three-


digit rounding arithmetic to solve the given system and compare
the approximations to the actual solution.

b) Use Gaussian Elimination method with partial pivoting and three-


digit rounding arithmetic to solve the given system and compare
the approximations to the actual solution.

4 Solutions of Linear Systems of Equations with LU Factorization Method
Assume that A is nonsingular matrix and it can be written as

A=L·U (4.1)

where L is a lower-triangular matrix and U is an upper-triangular


matrix. Then the linear system Ax = b can be easily solved for x
using a two-step process:
• First define the temporary vector y = Ux and solve the lower
triangular system Ly = b for y by using forward substitution
procedure;
• Once y is known, solve the upper triangular system Ux = y for
x by using backward substitution procedure.
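Assuming L and U are already known, this two-step process is sketched below in Python (forward_substitution is the lower-triangular analogue of the back-substitution formula (3.5); the function names are our own):

    import numpy as np

    def forward_substitution(L, b):
        """Solve Ly = b for a lower-triangular matrix L."""
        n = len(b)
        y = np.zeros(n)
        for k in range(n):
            y[k] = (b[k] - L[k, :k] @ y[:k]) / L[k, k]
        return y

    def lu_solve(L, U, b):
        """Solve Ax = b given the factorization A = L*U."""
        y = forward_substitution(L, b)   # step 1: Ly = b
        return back_substitution(U, y)   # step 2: Ux = y (back_substitution from Section 3)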

Example 4.1. Consider a linear system:

     x1 +  x2 + 6x3 = 7
    −x1 + 2x2 + 9x3 = 2
     x1 − 2x2 + 3x3 = 10

with matrix of coefficients

    A = [  1   1  6 ]   [  1   0  0 ]   [ 1  1   6 ]
        [ −1   2  9 ] = [ −1   1  0 ] · [ 0  3  15 ] = L · U.
        [  1  −2  3 ]   [  1  −1  1 ]   [ 0  0  12 ]

Then to solve the given system, we first use the forward substitution procedure for the lower triangular system Ly = b:

     y1           = 7
    −y1 + y2      = 2
     y1 − y2 + y3 = 10

which gives y1 = 7, y2 = 9, y3 = 12. Next, we use the backward substitution procedure for the upper triangular system Ux = y:

    x1 + x2 +  6x3 = 7
         3x2 + 15x3 = 9
                12x3 = 12

which gives x3 = 1, x2 = −2, x1 = 3.

In the previous example, the factorization A = LU was given. Now, we have to understand how to obtain the LU factorization.

Theorem 4.1. Suppose that the Gaussian elimination method without row changes (which means without pivoting) can be performed successfully to solve the linear system Ax = b. Then the matrix A can be factored as A = L · U, with

    L = [  1     0     0   ...    0       0 ]
        [ m21    1     0   ...    0       0 ]
        [ m31   m32    1   ...    0       0 ]
        [ ...   ...   ...        ...        ]
        [ mn1   mn2   mn3  ...  mn,n−1    1 ]

and

    U = [ a11(1)  a12(1)  a13(1)  ...  a1,n−1(1)  a1n(1) ]
        [   0     a22(2)  a23(2)  ...  a2,n−1(2)  a2n(2) ]
        [   0       0     a33(3)  ...  a3,n−1(3)  a3n(3) ]
        [  ...                            ...            ]
        [   0       0       0     ...      0      ann(n) ]

where all mij and aij(k) are obtained from the Gaussian Elimination procedure.
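Theorem 4.1 says that the factors fall out of the elimination itself: store each multiplier m_ij in L while the elimination overwrites a copy of A into U. A minimal Python sketch (no pivoting, pivots assumed nonzero; function name our own):

    import numpy as np

    def lu_factor(A):
        """Return L, U with A = L*U, recording the Gaussian elimination multipliers."""
        U = A.astype(float).copy()
        n = U.shape[0]
        L = np.eye(n)
        for k in range(n - 1):
            for i in range(k + 1, n):
                L[i, k] = U[i, k] / U[k, k]      # multiplier m_ik goes into L
                U[i, k:] -= L[i, k] * U[k, k:]   # eliminate below the pivot
        return L, U

    A = np.array([[4, 3, -1], [-2, -4, 5], [1, 2, 6]])   # matrix of Example 4.2 below
    L, U = lu_factor(A)
    # L = [[1, 0, 0], [-0.5, 1, 0], [0.25, -0.5, 1]], U = [[4, 3, -1], [0, -2.5, 4.5], [0, 0, 8.5]]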

Example 4.2. Find the LU factorization of the matrix

    A = [  4   3  −1 ]
        [ −2  −4   5 ]
        [  1   2   6 ]

Solution: With the multipliers m21 = −2/4 = −0.5 (R2 = R2 + 0.5R1) and m31 = 1/4 = 0.25 (R3 = R3 − 0.25R1) we have

    A = [  1     0  0 ]   [ 4    3     −1   ]
        [ −0.5   1  0 ] · [ 0  −2.5    4.5  ]
        [ 0.25   0  1 ]   [ 0   1.25   6.25 ]

and then with m32 = 1.25/(−2.5) = −0.5 (R3 = R3 + 0.5R2)

    A = [  1      0    0 ]   [ 4    3    −1  ]
        [ −0.5    1    0 ] · [ 0  −2.5   4.5 ] = L · U
        [ 0.25  −0.5   1 ]   [ 0    0    8.5 ]
Example 4.3. Solve the following linear system:

     x1 −  x2 + 3x3 = 13
    3x1 +  x2 − 4x3 = −8
    4x1 − 2x2 +  x3 = 15

by using the LU factorization method without pivoting.

Solution: We first find the LU factorization of the matrix of coefficients. With m21 = 3 (R2 = R2 − 3R1) and m31 = 4 (R3 = R3 − 4R1) we have

    A = [ 1  −1   3 ]   [ 1  0  0 ]   [ 1  −1    3 ]
        [ 3   1  −4 ] = [ 3  1  0 ] · [ 0   4  −13 ]
        [ 4  −2   1 ]   [ 4  0  1 ]   [ 0   2  −11 ]

and then with m32 = 2/4 = 0.5 (R3 = R3 − 0.5R2)

    A = [ 1   0    0 ]   [ 1  −1     3  ]
        [ 3   1    0 ] · [ 0   4   −13  ] = L · U
        [ 4  0.5   1 ]   [ 0   0   −4.5 ]

Then to solve the given system, we first use the forward substitution procedure for the lower triangular system Ly = b:

     y1                 = 13
    3y1 +    y2         = −8
    4y1 + 0.5y2 + y3    = 15

which gives y1 = 13, y2 = −47, y3 = −13.5. Next, we use the backward substitution procedure for the upper triangular system Ux = y:

    x1 −  x2 +    3x3 = 13
          4x2 −  13x3 = −47
                −4.5x3 = −13.5

which gives x3 = 3, x2 = −2, x1 = 2.

Remark 4.1. If the Gaussian Elimination method requires pivoting (row changes) then the matrix A cannot be factored directly as A = L · U. Consider for instance the following matrix:

    A = [  1  2   6 ]
        [  4  8  −1 ]
        [ −2  3   5 ]

With m21 = 4 (R2 = R2 − 4R1) and m31 = −2 (R3 = R3 + 2R1) we get

    A = [  1  0  0 ]   [ 1  2    6 ]
        [  4  1  0 ] · [ 0  0  −25 ]
        [ −2  0  1 ]   [ 0  7   17 ]

and we cannot continue without pivoting, since the pivot element in position (2,2) is 0.

4.1 LU factorization with partial pivoting

Definition 4.1. A square matrix which has precisely one entry in each column and each row whose value is 1, and all of whose other entries are 0, is called a permutation matrix, denoted by P. For example,

    [ 0 1 0 ]     [ 1 0 0 ]     [ 0 1 0 ]
    [ 0 0 1 ],    [ 0 0 1 ],    [ 1 0 0 ]
    [ 1 0 0 ]     [ 0 1 0 ]     [ 0 0 1 ]

are permutation matrices.

Remark 4.2. If P is a permutation matrix then the matrix multiplication P · A permutes the rows of A. For example,

    [ 0 1 0 ]   [ 1 2 3 ]   [ 4 5 6 ]
    [ 0 0 1 ] · [ 4 5 6 ] = [ 7 8 9 ]
    [ 1 0 0 ]   [ 7 8 9 ]   [ 1 2 3 ]

Similarly, if P is a permutation matrix then the matrix multiplication A · P permutes the columns of A. For example,

    [ 1 2 3 ]   [ 0 0 1 ]   [ 2 3 1 ]
    [ 4 5 6 ] · [ 1 0 0 ] = [ 5 6 4 ]
    [ 7 8 9 ]   [ 0 1 0 ]   [ 8 9 7 ]

Remark 4.3. Any permutation matrix P has an inverse and P^(−1) = P^T.

Theorem 4.2. If A is a nonsingular matrix then there exists a permutation matrix P such that

    P · A = L · U,

where L is a lower-triangular matrix and U is an upper-triangular matrix.

Example 4.4. Let us consider again the matrix A from Remark 4.1:

    A = [  1  2   6 ]
        [  4  8  −1 ]
        [ −2  3   5 ]

We have already shown that we cannot obtain the LU factorization of this matrix without pivoting. For this matrix A there exists a permutation matrix P such that PA can be factored as LU. Indeed, with

    P = [ 0 1 0 ]
        [ 0 0 1 ]
        [ 1 0 0 ]

we have

    P · A = [  4  8  −1 ]
            [ −2  3   5 ]
            [  1  2   6 ]

and, applying m21 = −0.5 (R2 = R2 + 0.5R1) and m31 = 0.25 (R3 = R3 − 0.25R1),

    P · A = [  1     0  0 ]   [ 4  8   −1   ]
            [ −0.5   1  0 ] · [ 0  7   4.5  ] = L · U
            [ 0.25   0  1 ]   [ 0  0   6.25 ]

In the previous example, we obtained the factorization PA = LU when the permutation matrix P was given. Now, we have to understand how to obtain the permutation matrix P. We illustrate it in the following example.

Example 4.5. Solve the following linear system:

    4x1 + 11x2 −  x3 = 27
    6x1 +  4x2 + 12x3 = 2
    8x1 −  3x2 + 2x3 = 0

by using the LU factorization method with partial pivoting.

Solution: Let us denote

    A = [ 4  11  −1 ]        [ x1 ]        [ 27 ]
        [ 6   4  12 ],   x = [ x2 ],   b = [  2 ]
        [ 8  −3   2 ]        [ x3 ]        [  0 ]

Then the given system can be written in the matrix form Ax = b. We first find matrices P, L and U such that PA = LU. We start with P = I. In the first column, partial pivoting selects the element 8 with the largest absolute value, so we switch R1 ↔ R3 and record the switch in P:

    P = [ 0 0 1 ]            [ 8  −3   2 ]
        [ 0 1 0 ],    PA =   [ 6   4  12 ]
        [ 1 0 0 ]            [ 4  11  −1 ]

The row operations with the multipliers m21 = 6/8 = 0.75 (R2 = R2 − 0.75R1) and m31 = 4/8 = 0.5 (R3 = R3 − 0.5R1) give

    [ 8   −3      2   ]
    [ 0   6.25   10.5 ]
    [ 0  12.5    −2   ]

In the second column, partial pivoting selects 12.5, so we switch R2 ↔ R3, swapping the stored multipliers and the rows of P as well:

    P = [ 0 0 1 ]
        [ 1 0 0 ]
        [ 0 1 0 ]

Finally, m32 = 6.25/12.5 = 0.5 (R3 = R3 − 0.5R2) completes the factorization:

    P · A = [  1     0   0 ]   [ 8   −3     2  ]
            [ 0.5    1   0 ] · [ 0  12.5   −2  ] = L · U
            [ 0.75  0.5  1 ]   [ 0    0   11.5 ]

Now, to solve the given system Ax = b, we first multiply both sides by the obtained permutation matrix P:

    Ax = b  =⇒  PAx = Pb  =⇒  LUx = b̃,

where

    b̃ = Pb = [ 0 0 1 ]   [ 27 ]   [  0 ]
              [ 1 0 0 ] · [  2 ] = [ 27 ]
              [ 0 1 0 ]   [  0 ]   [  2 ]

Then to solve the system LUx = b̃, we first use the forward substitution procedure for the lower triangular system Ly = b̃:

      y1                 = 0
    0.5y1 +    y2        = 27
    0.75y1 + 0.5y2 + y3  = 2

which gives y1 = 0, y2 = 27, y3 = −11.5. Finally, we use the backward substitution procedure for the upper triangular system Ux = y:

    8x1 −   3x2 +  2x3 = 0
          12.5x2 −  2x3 = 27
                  11.5x3 = −11.5

which gives x3 = −1, x2 = 2, x1 = 1.
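The bookkeeping of Example 4.5 (swap the rows of P, the stored multipliers and the working matrix together) can be sketched as follows; this is our own illustration of one common way to organize it:

    import numpy as np

    def plu_factor(A):
        """Return P, L, U with P*A = L*U, using partial pivoting."""
        U = A.astype(float).copy()
        n = U.shape[0]
        L = np.zeros((n, n))
        P = np.eye(n)
        for k in range(n - 1):
            r = k + np.argmax(np.abs(U[k:, k]))
            if r != k:                          # swap rows of U, P and stored multipliers
                U[[k, r]] = U[[r, k]]
                P[[k, r]] = P[[r, k]]
                L[[k, r], :k] = L[[r, k], :k]
            for i in range(k + 1, n):
                L[i, k] = U[i, k] / U[k, k]
                U[i, k:] -= L[i, k] * U[k, k:]
        return P, L + np.eye(n), U

    A = np.array([[4, 11, -1], [6, 4, 12], [8, -3, 2]])   # matrix of Example 4.5
    P, L, U = plu_factor(A)
    # then solve Ly = P @ b by forward substitution and Ux = y by back substitution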

4.2 Complexity of LU factorization method

When we solve the system Ax = b with n linear equations by using the LU factorization method we have three steps. The first step is simply the Gaussian Elimination procedure to obtain the matrices L and U. The second and third steps are forward and backward substitutions. It is easy to show that:

• The Gaussian Elimination procedure to obtain the matrices L and U requires 2n³/3 + n²/2 − 7n/6 arithmetic operations

• Forward substitution to solve the lower triangular system requires n² − n arithmetic operations

• Backward substitution to solve the upper triangular system requires n² arithmetic operations

So, the LU factorization method to solve a linear system of size n in total requires 2n³/3 + 5n²/2 − 13n/6 arithmetic operations.
Remark 4.4. Obviously, LU factorization method is approximately
as expensive as Gaussian Elimination method when we want to solve
one linear system Ax = b. However, if one has to solve the linear
system Ax = b with the same matrix A but different right-hand side
vectors b, then LU factorization method is computationally cheaper
than Gaussian Elimination method.


4.3 Self-study Problems


Problem 4.1. Use LU factorization method without pivoting to
solve the following system of equations:

 x1 + x2 + 6x3 = 23
− x1 + 2x2 + 9x3 = 35
x1 − 2x2 + 3x3 = 7

Problem 4.2. Use LU factorization method without pivoting to


solve the following system of equations:


 4x1 + 8x2 + 4x3 = 8

 x + 5x + 4x − 3x
1 2 3 4 = −4

 x1 + 4x2 + 7x3 + 2x4 = 10

 x + 3x
1 2 − 2x4 = −4

Problem 4.3. Use LU factorization method with partial pivoting


to solve the following system of equations:

 x1 + 2x2 + 6x3 = 9
4x1 + 8x2 − x3 = 11
− 2x1 + 3x2 + 5x3 = 6

Problem 4.4. Use LU factorization method with partial pivoting


to solve the following system of equations:

 −x1 + 2x2 − x3 = −3

− x2 + 2x3 = 1
2x1 − x2

= 3


Problem 4.5. Use LU factorization method with partial pivoting


to solve the following system of equations:

 5x2 + 6x3 = 1
x1 + 2x2 + 3x3 = 0

8x2 + 9x3 = 0

Problem 4.6. Use LU factorization method with partial pivoting


to solve the following system of equations:


 x1 + x2 + 4x4 = 19

 2x1 − x2 + 5x3 = 15

 5x1 + 2x2 + x3 + 2x4 = 20

 − 3x
1 + 2x3 + 6x4 = 27

5 Iterative Methods for Linear Systems
Definition 5.1. A square matrix A of size n × n is called strictly
diagonally dominant if

|a11 | > |a12 | + |a13 | + |a14 | + . . . + |a1n |


|a22 | > |a21 | + |a23 | + |a24 | + . . . + |a2n |
|a33 | > |a31 | + |a32 | + |a34 | + . . . + |a3n |
....................................
|ann | > |an1 | + |an2 | + |an3 | + . . . + |an,n−1 |

For example, the matrix

    A = [  6  −2   1 ]
        [ −2   7   2 ]
        [  1   2  −5 ]

is strictly diagonally dominant, since 6 > |−2| + 1, 7 > |−2| + 2 and |−5| > 1 + 2.
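This condition is easy to test programmatically; a small numpy helper (our own) is sketched below:

    import numpy as np

    def is_strictly_diagonally_dominant(A):
        """Check |a_ii| > sum of |a_ij| over j != i for every row i."""
        D = np.abs(np.diag(A))
        S = np.sum(np.abs(A), axis=1) - D    # off-diagonal row sums
        return bool(np.all(D > S))

    A = np.array([[6, -2, 1], [-2, 7, 2], [1, 2, -5]])
    print(is_strictly_diagonally_dominant(A))   # True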

5.1 Jacobi Method


Consider the linear system of equations

    a11 x1 + a12 x2 + a13 x3 + . . . + a1,n−1 xn−1 + a1n xn = b1
    a21 x1 + a22 x2 + a23 x3 + . . . + a2,n−1 xn−1 + a2n xn = b2
    a31 x1 + a32 x2 + a33 x3 + . . . + a3,n−1 xn−1 + a3n xn = b3        (5.1)
    ...................................................
    an1 x1 + an2 x2 + an3 x3 + . . . + an,n−1 xn−1 + ann xn = bn

We express x1 from the first equation, x2 from the second equation, x3 from the third equation, and so on. That gives


    x1 = (b1 − a12 x2 − a13 x3 − a14 x4 − . . . − a1n xn) / a11
    x2 = (b2 − a21 x1 − a23 x3 − a24 x4 − . . . − a2n xn) / a22
    x3 = (b3 − a31 x1 − a32 x2 − a34 x4 − . . . − a3n xn) / a33        (5.2)
    .............................................
    xn = (bn − an1 x1 − an2 x2 − an3 x3 − . . . − an,n−1 xn−1) / ann

To begin the Jacobi method, we choose an initial approximation

    x(0) = [ x1(0), x2(0), x3(0), . . . , xn(0) ]^T

Then the Jacobi iterations for the approximate solution of the system (5.1) are generated from (5.2) in the following way:

    x1(k+1) = (b1 − a12 x2(k) − a13 x3(k) − a14 x4(k) − . . . − a1n xn(k)) / a11
    x2(k+1) = (b2 − a21 x1(k) − a23 x3(k) − a24 x4(k) − . . . − a2n xn(k)) / a22
    x3(k+1) = (b3 − a31 x1(k) − a32 x2(k) − a34 x4(k) − . . . − a3n xn(k)) / a33        (5.3)
    ......................................................
    xn(k+1) = (bn − an1 x1(k) − an2 x2(k) − an3 x3(k) − . . . − an,n−1 xn−1(k)) / ann

Jacobi iterations (5.3) generate the sequence of approximations:

    x(0) =⇒ x(1) =⇒ x(2) =⇒ x(3) =⇒ . . . =⇒ x(k) =⇒ x(k+1)

Algorithm 5 shows a pseudo-code of the Jacobi method.

Algorithm 5 Pseudo-code of Jacobi Method.

Choose initial x(0).
for k = 0, 1, 2, 3, . . . do
    Compute x(k+1) by using (5.3)
    if Stopping condition is satisfied then
        Stop the search (x(k+1) is sufficiently good approximation to the solution of the system (5.1))
    end if
end for
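In matrix-vector form the update (5.3) reads x(k+1) = D^(−1)(b − (L + U)x(k)) (cf. Remark 5.1 below), which gives a compact numpy sketch of Algorithm 5 with the relative-precision stopping condition (function name and tolerance are our own choices):

    import numpy as np

    def jacobi(A, b, x0, tol=1e-1, max_iter=100):
        """Jacobi iterations (5.3) with a relative-precision stopping condition."""
        D = np.diag(A)                        # diagonal entries a_ii
        R = A - np.diagflat(D)                # off-diagonal part L + U
        x = x0.astype(float)
        for _ in range(max_iter):
            x_new = (b - R @ x) / D           # every component uses the old iterate
            if np.max(np.abs(x_new - x)) < tol * np.max(np.abs(x_new)):
                return x_new
            x = x_new
        raise RuntimeError("no convergence within max_iter iterations")

    A = np.array([[4, -1, 1], [4, -8, 1], [-2, 1, 5]])
    b = np.array([7, -21, 15])
    print(jacobi(A, b, np.array([1.0, 2.0, 2.0])))   # cf. Example 5.1 below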

Theorem 5.1. If the matrix of coefficients A is strictly diagonally


dominant then Jacobi iterations (5.3) for the system (5.1) converge for
any initial guess x(0) .

Remark 5.1. Let D be the diagonal matrix whose diagonal entries are those of A, L be the strictly lower-triangular part of A and U be the strictly upper-triangular part of A. Then

    Ax = b  =⇒  (D + L + U)x = b  =⇒  Dx = b − (L + U)x  =⇒
    =⇒  x = D^(−1)b − D^(−1)(L + U)x  =⇒  x = Tx + b̃,

where T = −D^(−1)(L + U) and b̃ = D^(−1)b. Applying the fixed point iteration method to the system x = Tx + b̃ gives us

    x(k+1) = Tx(k) + b̃

which is exactly the same as the Jacobi iterations (5.3).


Remark 5.2. (Stopping conditions for Jacobi method)
The following conditions can be used to stop the Jacobi iterations:

• ‖x(k+1) − x(k)‖ < TOL for absolute precision

• ‖x(k+1) − x(k)‖ / ‖x(k+1)‖ < TOL for relative precision

• ‖Ax(k+1) − b‖ < TOL, where b is the right-hand side vector of the system (5.1)

Here ‖x‖ denotes the norm of the vector x. One of the most commonly used norms, which we will use here, is the maximum norm defined as

    ‖x‖∞ = max{ |x1|, |x2|, . . . , |xn| }.

Example 5.1. Consider the following linear system:

     4x1 −  x2 +  x3 = 7
     4x1 − 8x2 +  x3 = −21
    −2x1 +  x2 + 5x3 = 15

Use the Jacobi method with the starting values x(0) = [1, 2, 2]^T to approximate the solution of the given system within relative precision 10^(-1).

Solution: The Jacobi iterations for the approximate solution of the given system have the following form:

    x1(k+1) = (7 + x2(k) − x3(k)) / 4
    x2(k+1) = (21 + 4x1(k) + x3(k)) / 8
    x3(k+1) = (15 + 2x1(k) − x2(k)) / 5

1 step: We have

    x1(1) = (7 + 2 − 2)/4 = 1.75
    x2(1) = (21 + 4·1 + 2)/8 = 3.375
    x3(1) = (15 + 2·1 − 2)/5 = 3

So, x(1) = [1.75, 3.375, 3]^T and ‖x(1) − x(0)‖∞ / ‖x(1)‖∞ ≈ 0.41 > 0.1.

2 step: We have

    x1(2) = (7 + 3.375 − 3)/4 = 1.84375
    x2(2) = (21 + 4·1.75 + 3)/8 = 3.875
    x3(2) = (15 + 2·1.75 − 3.375)/5 = 3.025

So, x(2) = [1.84375, 3.875, 3.025]^T and ‖x(2) − x(1)‖∞ / ‖x(2)‖∞ ≈ 0.13 > 0.1.

3 step: We have

    x1(3) = (7 + 3.875 − 3.025)/4 = 1.9625
    x2(3) = (21 + 4·1.84375 + 3.025)/8 = 3.925
    x3(3) = (15 + 2·1.84375 − 3.875)/5 = 2.9625

So, x(3) = [1.9625, 3.925, 2.9625]^T and ‖x(3) − x(2)‖∞ / ‖x(3)‖∞ ≈ 0.03 < 0.1.

So, x(3) gives an approximation to the solution of the given system within relative precision 10^(-1). Note that the exact solution of the given system is (x1, x2, x3) = (2, 4, 3) and we see that after three Jacobi iterations we already have a sufficiently good approximation of the exact solution.

Example 5.2. Consider the following linear system:

    −2x1 +  x2 + 5x3 = 15
     4x1 − 8x2 +  x3 = −21
     4x1 −  x2 +  x3 = 7

Note that it is the same system as the one in Example 5.1 with a different order of equations. Now, the Jacobi iterations have the form

    x1(k+1) = (x2(k) + 5x3(k) − 15) / 2
    x2(k+1) = (21 + 4x1(k) + x3(k)) / 8
    x3(k+1) = 7 − 4x1(k) + x2(k)

Let's start with the same initial values as we had in Example 5.1, x(0) = [1, 2, 2]^T.

1 step: We have

    x1(1) = (2 + 5·2 − 15)/2 = −1.5
    x2(1) = (21 + 4·1 + 2)/8 = 3.375
    x3(1) = 7 − 4·1 + 2 = 5

2 step: We have

    x1(2) = (3.375 + 5·5 − 15)/2 = 6.6875
    x2(2) = (21 + 4·(−1.5) + 5)/8 = 2.5
    x3(2) = 7 − 4·(−1.5) + 3.375 = 16.375

Note that the exact solution of the given system is (x1, x2, x3) = (2, 4, 3) and obviously we have divergence of the Jacobi method here. (why?)


5.2 Gauss-Seidel Method

With small modifications of the Jacobi method we can obtain a more efficient method, called the Gauss-Seidel iterative method. The Gauss-Seidel iterations for the approximate solution of the system (5.1) are generated from (5.2) in the following way:

    x1(k+1) = (b1 − a12 x2(k) − a13 x3(k) − a14 x4(k) − . . . − a1n xn(k)) / a11
    x2(k+1) = (b2 − a21 x1(k+1) − a23 x3(k) − a24 x4(k) − . . . − a2n xn(k)) / a22
    x3(k+1) = (b3 − a31 x1(k+1) − a32 x2(k+1) − a34 x4(k) − . . . − a3n xn(k)) / a33        (5.4)
    ......................................................
    xn(k+1) = (bn − an1 x1(k+1) − an2 x2(k+1) − . . . − an,n−1 xn−1(k+1)) / ann

Gauss-Seidel iterations (5.4) generate the sequence of approximations:

    x(0) =⇒ x(1) =⇒ x(2) =⇒ x(3) =⇒ . . . =⇒ x(k) =⇒ x(k+1)

Algorithm 6 shows a pseudo-code of the Gauss-Seidel method.

Algorithm 6 Pseudo-code of Gauss-Seidel Method.

Choose initial x(0).
for k = 0, 1, 2, 3, . . . do
    Compute x(k+1) by using (5.4)
    if Stopping condition is satisfied then
        Stop the search (x(k+1) is sufficiently good approximation to the solution of the system (5.1))
    end if
end for
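Because each new component is used immediately, the Gauss-Seidel update is most naturally written componentwise. A minimal sketch of Algorithm 6, with the same stopping rule as in the Jacobi sketch above:

    import numpy as np

    def gauss_seidel(A, b, x0, tol=1e-1, max_iter=100):
        """Gauss-Seidel iterations (5.4): new components are used immediately."""
        n = len(b)
        x = x0.astype(float)
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                s = A[i, :i] @ x[:i] + A[i, i+1:] @ x_old[i+1:]
                x[i] = (b[i] - s) / A[i, i]
            if np.max(np.abs(x - x_old)) < tol * np.max(np.abs(x)):
                return x
        raise RuntimeError("no convergence within max_iter iterations")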


Theorem 5.2. If the matrix of coefficients A is strictly diagonally


dominant then Gauss-Seidel iterations (5.4) for the system (5.1) con-
verge for any initial guess x(0) .

Remark 5.3. Stopping conditions for the Gauss-Seidel method are the same as the ones for the Jacobi method discussed in Remark 5.2.

Example 5.3. Consider the following linear system:

     4x1 −  x2 +  x3 = 7
     4x1 − 8x2 +  x3 = −21
    −2x1 +  x2 + 5x3 = 15

Use the Gauss-Seidel method with the starting values x(0) = [1, 2, 2]^T to approximate the solution of the given system within relative precision 10^(-1).

Solution: The Gauss-Seidel iterations for the approximate solution of the given system have the following form:

    x1(k+1) = (7 + x2(k) − x3(k)) / 4
    x2(k+1) = (21 + 4x1(k+1) + x3(k)) / 8
    x3(k+1) = (15 + 2x1(k+1) − x2(k+1)) / 5

1 step: We have

    x1(1) = (7 + 2 − 2)/4 = 1.75
    x2(1) = (21 + 4·1.75 + 2)/8 = 3.75
    x3(1) = (15 + 2·1.75 − 3.75)/5 = 2.95

So, x(1) = [1.75, 3.75, 2.95]^T and ‖x(1) − x(0)‖∞ / ‖x(1)‖∞ ≈ 0.47 > 0.1.

2 step: We have

    x1(2) = (7 + 3.75 − 2.95)/4 = 1.95
    x2(2) = (21 + 4·1.95 + 2.95)/8 = 3.96875
    x3(2) = (15 + 2·1.95 − 3.96875)/5 = 2.98625

So, x(2) = [1.95, 3.96875, 2.98625]^T and ‖x(2) − x(1)‖∞ / ‖x(2)‖∞ ≈ 0.055 < 0.1.

So, x(2) gives an approximation to the solution of the given system within relative precision 10^(-1). Note that for the same system in Example 5.1 we needed three Jacobi iterations.


5.3 General Remarks


Remark 5.4. Condition for convergence of both Jacobi and Gauss-
Seidel methods (strictly diagonally dominant matrix) is sufficient but
not necessary.

Remark 5.5. Each iteration step in both the Jacobi and Gauss-Seidel methods for a system with n linear equations requires 2n² − n arithmetic operations. So, these two methods have the same computational costs. However, the Gauss-Seidel method in general has faster convergence than the Jacobi method.

5.4 Self-study Problems


Problem 5.1. Consider the following linear system:

     6x1 − 2x2 +  x3 = 11
    −2x1 + 7x2 + 2x3 = 5
      x1 + 2x2 − 5x3 = −1

Use the Jacobi method with the starting values (x1(0), x2(0), x3(0)) = (0, 0, 0) and compute the first three approximations (x1(1), x2(1), x3(1)), (x1(2), x2(2), x3(2)) and (x1(3), x2(3), x3(3)).

Problem 5.2. Consider the following linear system:

     x1 − 5x2 −  x3 = 8
    4x1 +  x2 −  x3 = 13
    2x1 −  x2 − 5x3 = −1

Use the Jacobi method with the starting values (x1(0), x2(0), x3(0)) = (0, 0, 0) to approximate the solution of the given system within relative precision 10^(-1).

Problem 5.3. Consider the following linear system:

     6x1 − 2x2 +  x3 = 11
    −2x1 + 7x2 + 2x3 = 5
      x1 + 2x2 − 5x3 = −1

Use the Gauss-Seidel method with the starting values (x1(0), x2(0), x3(0)) = (0, 0, 0) and compute the first two approximations (x1(1), x2(1), x3(1)) and (x1(2), x2(2), x3(2)).

Problem 5.4. Consider the following linear system:

     x1 − 5x2 −  x3 = 8
    4x1 +  x2 −  x3 = 13
    2x1 −  x2 − 5x3 = −1

Use the Gauss-Seidel method with the starting values (x1(0), x2(0), x3(0)) = (0, 0, 0) to approximate the solution of the given system within relative precision 10^(-1).

Problem 5.5. Consider the following linear system:

    4x1 +  x2 −  x3 +  x4 = −2
     x1 + 4x2 −  x3 −  x4 = −1
     x1 +  x2 − 5x3 −  x4 = 0
     x1 −  x2 +  x3 + 3x4 = 1

Starting with the initial values (x1(0), x2(0), x3(0), x4(0)) = (0, 0, 0, 0),

a) find the first two iterations of the Jacobi method

b) find the first two iterations of the Gauss-Seidel method

6 Newton's Method for Systems of Nonlinear Equations
We have seen the Newton-Raphson method for numerical solution of
nonlinear algebraic equation. In this section we will study the gener-
alization of that method for systems of nonlinear equations.
Consider the system of n nonlinear equations

    f1(x1, x2, x3, . . . , xn) = 0
    f2(x1, x2, x3, . . . , xn) = 0
    f3(x1, x2, x3, . . . , xn) = 0        (6.1)
    ........................
    fn(x1, x2, x3, . . . , xn) = 0

in n unknowns x1, x2, x3, . . . , xn. We assume that the functions f1, f2, f3, . . . , fn in (6.1) have first order partial derivatives with respect to all arguments. We denote

    x = [ x1, x2, x3, . . . , xn ]^T   and   F(x) = [ f1(x), f2(x), f3(x), . . . , fn(x) ]^T.

Definition 6.1. The Jacobian of system (6.1) is the matrix function

    J(x) = [ ∂f1/∂x1 (x)   ∂f1/∂x2 (x)   ∂f1/∂x3 (x)   ...   ∂f1/∂xn (x) ]
           [ ∂f2/∂x1 (x)   ∂f2/∂x2 (x)   ∂f2/∂x3 (x)   ...   ∂f2/∂xn (x) ]
           [     ...            ...            ...                ...    ]
           [ ∂fn/∂x1 (x)   ∂fn/∂x2 (x)   ∂fn/∂x3 (x)   ...   ∂fn/∂xn (x) ]

Example 6.1. Consider the following nonlinear system:

    3x1 − cos(x2 x3) − 0.5 = 0
    x1² − 81(x2 + 1)² + sin x3 + 1 = 0
    e^(−x1 x2) + 20x3 − 3 = 0

The Jacobian matrix of the given system has the following form:

    J(x1, x2, x3) = [        3           x3 sin(x2 x3)    x2 sin(x2 x3) ]
                    [      2x1           −162(x2 + 1)        cos x3     ]
                    [ −x2 e^(−x1 x2)    −x1 e^(−x1 x2)         20       ]

To begin Newton's method for the nonlinear system (6.1), we choose an initial approximation

    x(0) = [ x1(0), x2(0), x3(0), . . . , xn(0) ]^T

Then Newton's iterations for the approximate solution of the system (6.1) have the following form:

    x(k+1) = x(k) − (J(x(k)))^(−1) F(x(k))        (6.2)

for k = 0, 1, 2, 3, . . . Note that (6.2) is equivalent to

    J(x(k)) · Δx(k) = −F(x(k)),        (6.3)

where Δx(k) = x(k+1) − x(k). Newton iterations (6.3) generate the sequence of approximations for the vector x:

    x(0) =⇒ x(1) =⇒ x(2) =⇒ x(3) =⇒ . . . =⇒ x(k) =⇒ x(k+1)


Algorithm 7 shows a pseudo-code of Newton's method for the nonlinear system (6.1).

Algorithm 7 Pseudo-code of Newton's Method.

Choose initial x(0).
for k = 0, 1, 2, 3, . . . do
    Compute F(x(k))
    Compute J(x(k))
    Solve the system of linear equations J(x(k)) · Δx(k) = −F(x(k))
    Compute x(k+1) = x(k) + Δx(k)
    if Stopping condition is satisfied then
        Stop the search (x(k+1) is sufficiently good approximation to the solution of the system (6.1))
    end if
end for
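Algorithm 7 reduces each Newton step to one linear solve, for which any method from the previous sections (or numpy's built-in solver, used here for brevity) can be employed. A minimal sketch with user-supplied F and J:

    import numpy as np

    def newton_system(F, J, x0, tol=1e-6, max_iter=50):
        """Newton's method for F(x) = 0: solve J(x_k) dx = -F(x_k), then x_{k+1} = x_k + dx."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            dx = np.linalg.solve(J(x), -F(x))   # one linear solve per iteration
            x = x + dx
            if np.max(np.abs(dx)) < tol:        # absolute-precision stopping condition
                return x
        raise RuntimeError("no convergence within max_iter iterations")

    # Example 6.2 below, starting from (2, 0.25)
    F = lambda x: np.array([x[0]**2 - 2*x[0] - x[1] + 0.5, x[0]**2 + 4*x[1]**2 - 4])
    J = lambda x: np.array([[2*x[0] - 2, -1.0], [2*x[0], 8*x[1]]])
    print(newton_system(F, J, [2.0, 0.25]))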

Remark 6.1. (Stopping conditions for Newton's method)
If the exact solution x of system (6.1) is known, then we can use the following conditions for accuracy to stop the Newton iterations:

• ‖x(k+1) − x‖ < TOL for absolute accuracy

• ‖x(k+1) − x‖ / ‖x‖ < TOL for relative accuracy

If the exact solution is not known then the following conditions for precision can be used to stop the Newton iterations:

• ‖x(k+1) − x(k)‖ < TOL for absolute precision

• ‖x(k+1) − x(k)‖ / ‖x(k+1)‖ < TOL for relative precision


Example 6.2. Consider the following nonlinear system:

    x1² − 2x1 − x2 + 0.5 = 0
    x1² + 4x2² − 4 = 0

Use Newton's method with the starting values x(0) = [2, 0.25]^T to compute the first two approximations x(1) and x(2).

Solution: Let

    x = [ x1, x2 ]^T   and   F(x) = [ x1² − 2x1 − x2 + 0.5 ]
                                    [ x1² + 4x2² − 4       ]

The Jacobian matrix of the given system has the following form:

    J(x) = [ 2x1 − 2   −1  ]
           [  2x1      8x2 ]

1 step: We have

    F(x(0)) = [ 2² − 2·2 − 0.25 + 0.5 ] = [ 0.25 ]
              [ 2² + 4·0.25² − 4      ]   [ 0.25 ]

    J(x(0)) = [ 2·2 − 2    −1     ] = [ 2  −1 ]
              [  2·2      8·0.25  ]   [ 4   2 ]

Then J(x(0)) · Δx(0) = −F(x(0)) gives

    2Δx1(0) −  Δx2(0) = −0.25
    4Δx1(0) + 2Δx2(0) = −0.25

Solving this linear system with Cramer's rule we get

    Δx1(0) = −0.75/8 = −0.09375,    Δx2(0) = 0.5/8 = 0.0625

Then

    x(1) = x(0) + Δx(0) = [2, 0.25]^T + [−0.09375, 0.0625]^T = [1.90625, 0.3125]^T

2 step: We have

    F(x(1)) = [ 1.90625² − 2·1.90625 − 0.3125 + 0.5 ] = [ 0.0087890625 ]
              [ 1.90625² + 4·0.3125² − 4            ]   [ 0.0244140625 ]

    J(x(1)) = [ 2·1.90625 − 2     −1      ] = [ 1.8125   −1  ]
              [  2·1.90625      8·0.3125  ]   [ 3.8125   2.5 ]

Then J(x(1)) · Δx(1) = −F(x(1)) gives

    1.8125Δx1(1) −    Δx2(1) = −0.0087890625
    3.8125Δx1(1) + 2.5Δx2(1) = −0.0244140625

Solving this linear system we get

    Δx1(1) = −0.00555945692884   and   Δx2(1) = −0.00128745318352

Then

    x(2) = x(1) + Δx(1) = [1.90069054307116, 0.31121254681648]^T

Example 6.3. Consider the following nonlinear system:

    x1³ + x2 = 1
    x2³ − x1 = −1

Use Newton's method with the starting values x(0) = [0.5, 0.5]^T to approximate the solution of the given system within absolute accuracy 0.05. Note that the exact solution of the system which we want to approximate is (x1, x2) = (1, 0).

Solution: Let

    x = [ x1, x2 ]^T   and   F(x) = [ x1³ + x2 − 1 ]
                                    [ x2³ − x1 + 1 ]

The Jacobian matrix of the given system has the following form:

    J(x) = [ 3x1²    1   ]
           [ −1     3x2² ]

1 step: We have

    F(x(0)) = [ 0.5³ + 0.5 − 1 ] = [ −0.375 ]
              [ 0.5³ − 0.5 + 1 ]   [  0.625 ]

    J(x(0)) = [ 0.75   1    ]
              [ −1     0.75 ]

Then J(x(0)) · Δx(0) = −F(x(0)) gives

    0.75Δx1(0) +     Δx2(0) = 0.375
       −Δx1(0) + 0.75Δx2(0) = −0.625

Solving this linear system with Cramer's rule we get

    Δx1(0) = 0.90625/1.5625 = 0.58,    Δx2(0) = −0.09375/1.5625 = −0.06

Then

    x(1) = x(0) + Δx(0) = [0.5 + 0.58, 0.5 − 0.06]^T = [1.08, 0.44]^T

    x(1) − x = [0.08, 0.44]^T   =⇒   ‖x(1) − x‖∞ = 0.44 > 0.05

2 step: We have

    F(x(1)) = [ 1.08³ + 0.44 − 1 ] = [ 0.699712 ]
              [ 0.44³ − 1.08 + 1 ]   [ 0.005184 ]

    J(x(1)) = [ 3·1.08²     1       ] = [ 3.4992   1      ]
              [ −1        3·0.44²   ]   [ −1       0.5808 ]

Then J(x(1)) · Δx(1) = −F(x(1)) gives

    3.4992Δx1(1) +       Δx2(1) = −0.699712
         −Δx1(1) + 0.5808Δx2(1) = −0.005184

Solving this linear system we get

    Δx1(1) = −0.13231014448217   and   Δx2(1) = −0.23673234242798

Then

    x(2) = x(1) + Δx(1) = [0.94768985551783, 0.20326765757202]^T

    x(2) − x = [−0.05231014448217, 0.20326765757202]^T   =⇒   ‖x(2) − x‖∞ ≈ 0.20 > 0.05

3 step: We have

    F(x(2)) = [ 0.05440313884529 ]
              [ 0.06070870483311 ]

    J(x(2)) = [ 2.69434818675420    1                ]
              [ −1                  0.12395322184444 ]

Then J(x(2)) · Δx(2) = −F(x(2)) gives

    2.69434818675420Δx1(2) +                  Δx2(2) = −0.05440313884529
                  −Δx1(2) + 0.12395322184444Δx2(2) = −0.06070870483311

Solving this linear system we get

    Δx1(2) = 0.04045453310605   and   Δx2(2) = −0.16340173676555

Then

    x(3) = x(2) + Δx(2) = [0.98814438862387, 0.03986592080647]^T

    x(3) − x = [−0.01185561137613, 0.03986592080647]^T   =⇒   ‖x(3) − x‖∞ ≈ 0.04 < 0.05

So, Newton's method in 3 steps gives an approximation for the solution of the given nonlinear system within an absolute accuracy 0.05.

Remark 6.2. Newton’s method can converge very rapidly once an


approximation is obtained that is near the true solution. However,
it is not always easy to determine starting values that will lead to a
solution.
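
As a computational illustration, the iteration J(x^(k))·Δx^(k) = −F(x^(k)), x^(k+1) = x^(k) + Δx^(k) can be written in a few lines of Python. The sketch below is not part of the original notes; the function names, the stopping test on ‖Δx‖∞, and the step limit are illustrative choices, and the system used for the demonstration is the one from Example 6.3.

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-8, max_steps=50):
    """Newton's method for F(x) = 0: solve J(x_k) dx = -F(x_k), set x_{k+1} = x_k + dx."""
    x = np.array(x0, dtype=float)
    for _ in range(max_steps):
        dx = np.linalg.solve(J(x), -F(x))
        x = x + dx
        if np.linalg.norm(dx, np.inf) < tol:   # stop when the correction is small
            break
    return x

# System of Example 6.3: x1^3 + x2 = 1,  x2^3 - x1 = -1
F = lambda x: np.array([x[0]**3 + x[1] - 1, x[1]**3 - x[0] + 1])
J = lambda x: np.array([[3*x[0]**2, 1.0], [-1.0, 3*x[1]**2]])

print(newton_system(F, J, [0.5, 0.5]))  # converges to the exact solution (1, 0)
```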

6.1 Self-study Problems


Problem 6.1. Consider the following nonlinear system:
$$\begin{cases} x_1^2 + x_2^2 - 2 = 0 \\ x_1^2 - x_2^2 - 1 = 0 \end{cases}$$
Use Newton's method with the starting values $(x_1^{(0)}, x_2^{(0)}) = (1.2, 0.7)$ and compute the first two approximations $(x_1^{(1)}, x_2^{(1)})$ and $(x_1^{(2)}, x_2^{(2)})$.

Problem 6.2. Consider the following nonlinear system:
$$\begin{cases} x_1^2 + x_2^2 - 8x_1 - 4x_2 = 11 \\ x_1^2 + x_2^2 - 20x_1 + 75 = 0 \end{cases}$$
Use Newton's method with the starting values $(x_1^{(0)}, x_2^{(0)}) = (6, -3)$ to approximate the solution of the given system within relative precision $10^{-3}$.


Problem 6.3. Consider the following nonlinear system:
$$\begin{cases} x_1 + x_2 + x_3 = 0 \\ x_1^2 + x_2^2 + x_3^2 = 2 \\ x_1 x_2 + x_1 x_3 = -1 \end{cases}$$
Use Newton's method with the starting values
$$x^{(0)} = \begin{bmatrix} x_1^{(0)} \\ x_2^{(0)} \\ x_3^{(0)} \end{bmatrix} = \begin{bmatrix} 0.75 \\ 0.5 \\ -0.5 \end{bmatrix}$$
to compute the first approximation x^(1).

7 Interpolation and Polynomial Approximations
We have seen already how Taylor polynomials can be used to approx-
imate functions. Taylor polynomials of function f (x) about point x0
give accurate approximation only near that point. A good approxima-
tion polynomial needs to provide a relatively accurate result over large
intervals. Another disadvantage of using Taylor polynomials is that
derivatives of f (x) must be available.
In this section we will study interpolation methods to approximate functions with polynomials that exactly represent a collection of data points. Suppose function y = f(x) is known at n + 1 points, i.e. its graph passes through the points:
i.e. its graph passes through points:

(x0 , y0 ), (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ) (7.1)

where

a ≤ x0 < x1 < x2 < . . . < xn ≤ b and yk = f (xk ), k = 0, 1, . . . , n

A polynomial of degree n can be constructed that passes through these


n + 1 points. Note that to construct such a polynomial only values
of xk and yk are needed. Once the polynomial Pn (x) of degree n is
constructed, it can be used to approximate f (x) at any point x.

• When x0 ≤ x ≤ xn , Pn (x) is called interpolated value.

• When x < x0 or x > xn , Pn (x) is called extrapolated value.

7.1 Lagrange Polynomials


Definition 7.1. Assume that n + 1 points (7.1) are given. The n-th
order Lagrange interpolating polynomial has the form

Pn (x) = Ln,0 (x)y0 + Ln,1 (x)y1 + Ln,2 (x)y2 + · · · + Ln,n (x)yn (7.2)

106
Lecture Notes - Numerical Methods M.Ashyraliyev

where
$$L_{n,k}(x) = \frac{(x-x_0)(x-x_1)\cdots(x-x_{k-1})(x-x_{k+1})\cdots(x-x_n)}{(x_k-x_0)(x_k-x_1)\cdots(x_k-x_{k-1})(x_k-x_{k+1})\cdots(x_k-x_n)}$$

Remark 7.1. Note that


$$L_{n,k}(x_j) = \begin{cases} 1, & \text{if } j = k \\ 0, & \text{if } j \neq k \end{cases}$$

So, for polynomial in (7.2) we have

Pn (xk ) = 0 · y0 + 0 · y1 + · · · + 0 · yk−1 + 1 · yk + 0 · yk+1 + · · · + 0 · yn = yk

for all k = 0, 1, 2 . . . , n and therefore the polynomial Pn (x) in (7.2)


passes through all data points (xk , yk ).

Remark 7.2. Assume that we have only two data points (x0 , y0 ) and
(x1 , y1 ). Using formula (7.2), the first order Lagrange interpolating
polynomial that passes through these two points has the form
$$P_1(x) = \frac{x-x_1}{x_0-x_1}\,y_0 + \frac{x-x_0}{x_1-x_0}\,y_1.$$

Remark 7.3. Assume that we have three data points (x0 , y0 ), (x1 , y1 )
and (x2 , y2 ). Using formula (7.2), the second order Lagrange interpo-
lating polynomial that passes through these three points has the form
$$P_2(x) = \frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)}\,y_0 + \frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)}\,y_1 + \frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)}\,y_2.$$

Remark 7.4. Assume that we have four data points (x0 , y0 ), (x1 , y1 ),
(x2 , y2 ) and (x3 , y3 ). Using formula (7.2), the third order Lagrange


interpolating polynomial that passes through these four points has the
form
$$P_3(x) = \frac{(x-x_1)(x-x_2)(x-x_3)}{(x_0-x_1)(x_0-x_2)(x_0-x_3)}\,y_0 + \frac{(x-x_0)(x-x_2)(x-x_3)}{(x_1-x_0)(x_1-x_2)(x_1-x_3)}\,y_1 + \frac{(x-x_0)(x-x_1)(x-x_3)}{(x_2-x_0)(x_2-x_1)(x_2-x_3)}\,y_2 + \frac{(x-x_0)(x-x_1)(x-x_2)}{(x_3-x_0)(x_3-x_1)(x_3-x_2)}\,y_3.$$

Theorem 7.1. Suppose function f has continuous n + 1 derivatives


on interval [a, b]. Then for any x ∈ [a, b] there exists ξ ∈ [a, b] such
that
f (x) = Pn (x) + En (x),
where Pn (x) is Lagrange polynomial defined in (7.2) and

$$E_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-x_0)(x-x_1)(x-x_2)\cdots(x-x_n)$$
is the error term.

Example 7.1. Consider the data given in the table

xk 0 20 40
yk 3.85 0.8 0.12

where yk = f (xk ). Suppose we want to construct the first order La-


grange polynomial to approximate f (15). We have to choose two data
points out of given three points. x0 = 0 and x1 = 20 are closest points
to x = 15 and therefore to have the most accurate approximation we
use first two points. Then
$$P_1(x) = \frac{x-x_1}{x_0-x_1}\,y_0 + \frac{x-x_0}{x_1-x_0}\,y_1 = \frac{x-20}{0-20}\cdot 3.85 + \frac{x-0}{20-0}\cdot 0.8$$


and
$$f(15) \approx P_1(15) = \frac{15-20}{0-20}\cdot 3.85 + \frac{15-0}{20-0}\cdot 0.8 = 1.5625$$
Now, suppose that we want to construct the second order Lagrange polynomial to approximate f(15). We have to use all three data points. Then
$$P_2(x) = \frac{(x-20)(x-40)}{(0-20)(0-40)}\cdot 3.85 + \frac{(x-0)(x-40)}{(20-0)(20-40)}\cdot 0.8 + \frac{(x-0)(x-20)}{(40-0)(40-20)}\cdot 0.12$$
and
$$f(15) \approx P_2(15) = \frac{(15-20)(15-40)}{(0-20)(0-40)}\cdot 3.85 + \frac{(15-0)(15-40)}{(20-0)(20-40)}\cdot 0.8 + \frac{(15-0)(15-20)}{(40-0)(40-20)}\cdot 0.12 = 1.3403125$$

Example 7.2. Consider the data given in the table

xk −1 0 1 2 3
yk 2 1 2 −7 10

where yk = f (xk ). Construct the third order Lagrange polynomial to


approximate f (0.5) in most accurate way.
Solution: To construct the third order polynomial we have to choose
four data points from given data set. x0 = −1, x1 = 0, x2 = 1 and
x3 = 2 are closest points to x = 0.5 and therefore to have the most
accurate approximation we use first four data points. Then
$$P_3(x) = \frac{(x-0)(x-1)(x-2)}{(-1-0)(-1-1)(-1-2)}\cdot 2 + \frac{(x+1)(x-1)(x-2)}{(0+1)(0-1)(0-2)}\cdot 1 + \frac{(x+1)(x-0)(x-2)}{(1+1)(1-0)(1-2)}\cdot 2 + \frac{(x+1)(x-0)(x-1)}{(2+1)(2-0)(2-1)}\cdot(-7)$$

and
$$f(0.5) \approx P_3(0.5) = \frac{(0.5-0)(0.5-1)(0.5-2)}{(-1-0)(-1-1)(-1-2)}\cdot 2 + \frac{(0.5+1)(0.5-1)(0.5-2)}{(0+1)(0-1)(0-2)}\cdot 1 + \frac{(0.5+1)(0.5-0)(0.5-2)}{(1+1)(1-0)(1-2)}\cdot 2 + \frac{(0.5+1)(0.5-0)(0.5-1)}{(2+1)(2-0)(2-1)}\cdot(-7) = 2$$

So, f (0.5) ≈ P3 (0.5) = 2.
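
The evaluation of (7.2) is easy to automate. The following short Python sketch (an illustration, not part of the original notes; the function name is arbitrary) evaluates the Lagrange polynomial directly from the data and reproduces the result of Example 7.2.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial (7.2) at point x."""
    total = 0.0
    for k in range(len(xs)):
        Lk = 1.0                                  # basis polynomial L_{n,k}(x)
        for j in range(len(xs)):
            if j != k:
                Lk *= (x - xs[j]) / (xs[k] - xs[j])
        total += Lk * ys[k]
    return total

# The four nodes of Example 7.2 closest to x = 0.5
print(lagrange_eval([-1, 0, 1, 2], [2, 1, 2, -7], 0.5))  # 2.0
```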

Remark 7.5. The advantage of using the Lagrange interpolating poly-


nomials is that an error estimate is available, i.e. we know the upper
bound for the error we make. The main disadvantage of this technique
is that there is no recursive formula between two successive Lagrange
polynomials Pn−1 (x) and Pn (x). So, if we desire to add or subtract
a point from the set used to construct the Lagrange polynomial, we
essentially have to start over all computations.

7.2 Newton Polynomials


Here we describe the construction of the polynomial of degree n that
passes through n + 1 points (7.1) with the use of so-called divided
differences. Since the n-th order polynomial that passes through
n + 1 points is unique, we will not get a polynomial different from the
one obtained by Lagrange technique. Actually, we will obtain the same
polynomial just expressed differently.


Definition 7.2. Suppose we are given n + 1 points (7.1). Divided differences of function f are defined as follows:
$$f[x_k] = f(x_k), \quad k = 0, 1, \ldots, n$$
$$f[x_{k-1}, x_k] = \frac{f[x_k] - f[x_{k-1}]}{x_k - x_{k-1}}, \quad k = 1, 2, \ldots, n$$
$$f[x_{k-2}, x_{k-1}, x_k] = \frac{f[x_{k-1}, x_k] - f[x_{k-2}, x_{k-1}]}{x_k - x_{k-2}}, \quad k = 2, 3, \ldots, n$$
$$f[x_{k-3}, x_{k-2}, x_{k-1}, x_k] = \frac{f[x_{k-2}, x_{k-1}, x_k] - f[x_{k-3}, x_{k-2}, x_{k-1}]}{x_k - x_{k-3}}, \quad k = 3, 4, \ldots, n$$
$$\vdots$$
$$f[x_0, x_1, x_2, \ldots, x_n] = \frac{f[x_1, x_2, \ldots, x_n] - f[x_0, x_1, \ldots, x_{n-1}]}{x_n - x_0}$$

Definition 7.3. The n-th order Newton interpolating polynomial


that passes through n + 1 points (7.1) has the form

$$P_n(x) = a_0 + a_1(x-x_0) + a_2(x-x_0)(x-x_1) + a_3(x-x_0)(x-x_1)(x-x_2) + \cdots + a_n(x-x_0)(x-x_1)\cdots(x-x_{n-1}) \quad (7.3)$$
where
$$a_k = f[x_0, x_1, x_2, \ldots, x_k], \quad k = 0, 1, 2, \ldots, n.$$

Example 7.3. Construct the third order Newton polynomial for the
data given in the following table:

xk 1 2 3 4
f (xk ) −3 0 15 48

111
Lecture Notes - Numerical Methods M.Ashyraliyev

Solution: We first compute all divided differences. We have


f [x0 ] = f (x0 ) = −3, f [x1 ] = f (x1 ) = 0,
f [x2 ] = f (x2 ) = 15, f [x3 ] = f (x3 ) = 48.
The first divided differences are:
$$f[x_0, x_1] = \frac{f[x_1] - f[x_0]}{x_1 - x_0} = \frac{0 - (-3)}{2 - 1} = 3, \quad f[x_1, x_2] = \frac{f[x_2] - f[x_1]}{x_2 - x_1} = \frac{15 - 0}{3 - 2} = 15, \quad f[x_2, x_3] = \frac{f[x_3] - f[x_2]}{x_3 - x_2} = \frac{48 - 15}{4 - 3} = 33.$$
The second divided differences are:
$$f[x_0, x_1, x_2] = \frac{f[x_1, x_2] - f[x_0, x_1]}{x_2 - x_0} = \frac{15 - 3}{3 - 1} = 6, \quad f[x_1, x_2, x_3] = \frac{f[x_2, x_3] - f[x_1, x_2]}{x_3 - x_1} = \frac{33 - 15}{4 - 2} = 9.$$
Finally, the third divided difference is:
$$f[x_0, x_1, x_2, x_3] = \frac{f[x_1, x_2, x_3] - f[x_0, x_1, x_2]}{x_3 - x_0} = \frac{9 - 6}{4 - 1} = 1.$$
We summarize all results in the following table.

k xk f [xk ] f [xk−1 , xk ] f [xk−2 , xk−1 , xk ] f [xk−3 , xk−2 , xk−1 , xk ]


0 1 −3
1 2 0 3
2 3 15 15 6
3 4 48 33 9 1

Then the third order Newton polynomial has the form


P3 (x) = −3 + 3 · (x − 1) + 6 · (x − 1)(x − 2) + 1 · (x − 1)(x − 2)(x − 3).


Example 7.4. Consider the data given in the table

x 1.0 1.3 1.6 1.9


f (x) 0.77 0.62 0.44 0.29

Construct the second order Newton polynomial to approximate f (1.5)


in most accurate way.
Solution: To construct the second order polynomial we have to choose
three data points from given data set. x = 1.3, x = 1.6 and x = 1.9
are closest points to x = 1.5 and therefore to have the most accurate
approximation we use last three data points. We compute all divided
differences. Results are given in the following table

k    xk     f[xk]    f[xk−1, xk]    f[xk−2, xk−1, xk]
0    1.3    0.62
1    1.6    0.44     −0.6
2    1.9    0.29     −0.5           1/6

Then the second order Newton polynomial has the form
$$P_2(x) = 0.62 - 0.6(x - 1.3) + \frac{1}{6}(x - 1.3)(x - 1.6).$$
So, f(1.5) ≈ P₂(1.5) = 0.49666666...

Remark 7.6. The main advantage of using the Newton interpolating polynomials is that there is a recursive formula between two successive Newton polynomials P_{n−1}(x) and P_n(x). In fact, from (7.3) we have
$$P_n(x) = P_{n-1}(x) + a_n(x - x_0)(x - x_1)\cdots(x - x_{n-1}).$$
So, if we want to construct the Newton polynomial P_n(x) by adding a point to the set used for the construction of P_{n−1}(x), we do not need to start the computations over; we just need to compute a few extra divided differences.


Remark 7.7. Another important advantage of using the Newton interpolating polynomial is that it allows us to write the polynomial in nested structure directly. For instance, from (7.3) with n = 3 we have
$$P_3(x) = a_0 + a_1(x-x_0) + a_2(x-x_0)(x-x_1) + a_3(x-x_0)(x-x_1)(x-x_2) = \Big(\big(a_3(x-x_2) + a_2\big)(x-x_1) + a_1\Big)(x-x_0) + a_0.$$
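
Both observations above translate directly into code: the divided-difference table can be built in place, and the nested form gives a cheap evaluation. The Python sketch below is illustrative only (function names are arbitrary); the data are those of Example 7.3, so the coefficients should come out as [−3, 3, 6, 1].

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients a_k = f[x_0, ..., x_k] of (7.3)."""
    n = len(xs)
    coef = list(ys)
    for level in range(1, n):
        # update from the bottom so lower-order differences are still available
        for k in range(n - 1, level - 1, -1):
            coef[k] = (coef[k] - coef[k - 1]) / (xs[k] - xs[k - level])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate P_n(x) in the nested form of Remark 7.7."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result

xs, ys = [1, 2, 3, 4], [-3, 0, 15, 48]   # data of Example 7.3
a = divided_differences(xs, ys)
print(a, newton_eval(xs, a, 2.5))        # [-3, 3, 6, 1] and P3(2.5)
```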

7.3 Self-study Problems


Problem 7.1. Consider the graph of function f(x) = sin x over the interval [0, π/2].

a) Use the two points (0, 0) and (π/2, 1) to construct a Lagrange linear interpolation polynomial P₁(x). Use this polynomial to approximate sin(2π/9) and find the upper bound for the error in this approximation.

b) Use the two points (π/6, 1/2) and (π/3, √3/2) to construct a Lagrange linear interpolation polynomial P₁(x). Use this polynomial to approximate sin(2π/9) and find the upper bound for the error in this approximation.

Problem 7.2. Consider the graph of function f(x) = sin x over the interval [0, π/2]. Use the three points (0, 0), (π/4, √2/2), and (π/2, 1) to construct a Lagrange quadratic interpolation polynomial P₂(x). Use this polynomial to approximate sin(2π/9) and find the upper bound for the error in this approximation.

Problem 7.3. Construct Newton polynomial of degree four, P4 (x) for


the data given in the following table:
xk 4 5 6 7 8
f (xk ) 2.00000 2.23607 2.44949 2.64575 2.82843

Problem 7.4. The polynomial


P3 (x) = 2 − (x + 1) + x(x + 1) − 2x(x + 1)(x − 1)
interpolates the first four points in the table:
x −1 0 1 2 3
f (x) 2 1 2 −7 10
Find the polynomial P4 (x) interpolating all five points. What is the
relative error at x = 0.5?

Problem 7.5. In this problem choose the best interpolating method


so that it allows you to add a new data without too many unnecessary
calculations all over again.
a) Construct the interpolating polynomial of degree two, P2 (x) for
the unequally spaced points given in the following table:
x 0 0.1 0.2
f (x) −6 −5.8 −5.6
b) Then add f (0.3) = −4.9 to the table and construct the interpo-
lating polynomial of degree three, P3 (x) and calculate P3 (0.15).

Problem 7.6. For a function f , the Newton divided difference table


is given by
xk          f[xk]        f[xk−1, xk]       f[xk−2, xk−1, xk]
x0 = 0      f[x0] = A
x1 = 0.4    f[x1] = B    f[x0, x1] = C
x2 = 0.7    f[x2] = 6    f[x1, x2] = 10    f[x0, x1, x2] = 50/7

a) Determine the missing entries A, B, and C in the table.


b) Use the table to approximate f (0.5).

Problem 7.7. Let P₃(x) be the interpolating polynomial for the data (0, 0), (1, A), (2, 4), and (4, 2). Find A if the coefficient of x³ in P₃(x) is 5/12.

8 Spline Interpolation
Polynomial interpolation for a set of n + 1 points
(x0 , y0 ), (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn )
is often unsatisfactory when n is large, since high-degree polynomials
can have oscillations. Here we will discuss an alternative approach,
piecewise interpolation method, using lower-degree polynomials which
interpolate between nodes (xk , yk ) and (xk+1 , yk+1 ).
The simplest piecewise polynomial interpolation, called linear splines,
joins the data points (xk , yk ) and (xk+1 , yk+1 ) by a straight line segment.

Figure 4: Linear Splines

Linear splines can be defined as a piecewise linear function:
$$S(x) = \begin{cases} a_0 + b_0(x - x_0), & \text{if } x \in [x_0, x_1) \\ a_1 + b_1(x - x_1), & \text{if } x \in [x_1, x_2) \\ \quad\vdots \\ a_{n-1} + b_{n-1}(x - x_{n-1}), & \text{if } x \in [x_{n-1}, x_n] \end{cases} \quad (8.1)$$
Coefficients a0 , a1 , . . . , an−1 and b0 , b1 , . . . , bn−1 in (8.1) can be chosen
such that
(i) S(xk ) = yk for k = 0, 1, . . . , n, so that the graph of S(x) passes
through all n + 1 data points.
(ii) S− (xk ) = S+ (xk ) for k = 1, 2, . . . , n−1, so that S(x) is continuous
on the whole interval [x0 , xn ].
Here $S_-(x_k) = \lim\limits_{x \to x_k^-} S(x)$ and $S_+(x_k) = \lim\limits_{x \to x_k^+} S(x)$.


A disadvantage of using linear splines is that the approximation


S(x) in (8.1) is not smooth, i.e. it is not differentiable on the whole
interval [x0 , xn ].

8.1 Cubic Splines


To achieve a sufficient smoothness in the piecewise polynomial inter-
polation, often the third order polynomials are used to interpolate
between nodes (xk , yk ) and (xk+1 , yk+1 ).
Consider a piecewise function

a0 + b0 (x − x0 ) + c0 (x − x0 )2 + d0 (x − x0 )3 , if x ∈ [x0 , x1 )




a1 + b1 (x − x1 ) + c1 (x − x1 )2 + d1 (x − x1 )3 , if x ∈ [x1 , x2 )

S(x) =

 .......................................

an−1 + bn−1 (x − xn−1 ) + cn−1 (x − xn−1 )2 + dn−1 (x − xn−1 )3 , x ∈ [xn−1 , xn ]

(8.2)

Definition 8.1. Piecewise function (8.2) is called cubic splines if the


following conditions are satisfied:
(i) S(xk ) = yk for k = 0, 1, . . . , n, so that the graph of S(x) passes
through all n + 1 data points.

(ii) S− (xk ) = S+ (xk ) for k = 1, 2, . . . , n−1, so that S(x) is continuous


on the whole interval [x0 , xn ].

(iii) S′₋(xₖ) = S′₊(xₖ) for k = 1, 2, ..., n − 1, so that S′(x) is continuous on the whole interval [x₀, xₙ].

(iv) S″₋(xₖ) = S″₊(xₖ) for k = 1, 2, ..., n − 1, so that S″(x) is continuous on the whole interval [x₀, xₙ].

Remark 8.1. Note that there are 4n unknown coefficients in (8.2) and there are 4n − 2 conditions in (i)-(iv). So there are 2 degrees of freedom.


Definition 8.2. Cubic spline S(x) in (8.2) is called a natural cubic spline if it satisfies the boundary conditions S″₊(x₀) = 0, S″₋(xₙ) = 0.

Definition 8.3. Cubic spline S(x) in (8.2) is called a clamped cubic spline if it satisfies the boundary conditions S′₊(x₀) = α, S′₋(xₙ) = β, where α and β are given constants.

Example 8.1. Natural cubic spline S(x) on [0, 2] is defined by
$$S(x) = \begin{cases} 1 + 2x - x^3, & \text{if } 0 \le x < 1 \\ a + b(x-1) + c(x-1)^2 + d(x-1)^3, & \text{if } 1 \le x \le 2 \end{cases}$$
Find a, b, c and d.
Solution: We first find the derivatives of S(x):
$$S'(x) = \begin{cases} 2 - 3x^2, & \text{if } 0 < x < 1 \\ b + 2c(x-1) + 3d(x-1)^2, & \text{if } 1 < x < 2 \end{cases}$$
$$S''(x) = \begin{cases} -6x, & \text{if } 0 < x < 1 \\ 2c + 6d(x-1), & \text{if } 1 < x < 2 \end{cases}$$
Continuity of S(x) at x = 1 implies that
$$1 + 2 \cdot 1 - 1^3 = a + b(1-1) + c(1-1)^2 + d(1-1)^3 \;\Longrightarrow\; a = 2.$$
Continuity of S′(x) at x = 1 implies that
$$2 - 3 \cdot 1^2 = b + 2c(1-1) + 3d(1-1)^2 \;\Longrightarrow\; b = -1.$$
Continuity of S″(x) at x = 1 implies that
$$-6 \cdot 1 = 2c + 6d(1-1) \;\Longrightarrow\; c = -3.$$
Finally, from the natural boundary conditions S″₊(0) = 0, S″₋(2) = 0 we have 2c + 6d = 0 and therefore d = 1. So,
$$S(x) = \begin{cases} 1 + 2x - x^3, & \text{if } 0 \le x < 1 \\ 2 - (x-1) - 3(x-1)^2 + (x-1)^3, & \text{if } 1 \le x \le 2 \end{cases}$$
is the natural cubic spline on interval [0, 2].


Example 8.2. Clamped cubic spline S(x) on [1, 3] is defined by
$$S(x) = \begin{cases} 3(x-1) + 2(x-1)^2 - (x-1)^3, & \text{if } 1 \le x < 2 \\ a + b(x-2) + c(x-2)^2 + d(x-2)^3, & \text{if } 2 \le x \le 3 \end{cases}$$
Given S′₊(1) = S′₋(3), find a, b, c and d.
Solution: We first find the derivatives of S(x):
$$S'(x) = \begin{cases} 3 + 4(x-1) - 3(x-1)^2, & \text{if } 1 < x < 2 \\ b + 2c(x-2) + 3d(x-2)^2, & \text{if } 2 < x < 3 \end{cases}$$
$$S''(x) = \begin{cases} 4 - 6(x-1), & \text{if } 1 < x < 2 \\ 2c + 6d(x-2), & \text{if } 2 < x < 3 \end{cases}$$
Continuity of S(x) at x = 2 implies that
$$3(2-1) + 2(2-1)^2 - (2-1)^3 = a + b(2-2) + c(2-2)^2 + d(2-2)^3 \;\Longrightarrow\; a = 4.$$
Continuity of S′(x) at x = 2 implies that
$$3 + 4(2-1) - 3(2-1)^2 = b + 2c(2-2) + 3d(2-2)^2 \;\Longrightarrow\; b = 4.$$
Continuity of S″(x) at x = 2 implies that
$$4 - 6(2-1) = 2c + 6d(2-2) \;\Longrightarrow\; c = -1.$$
Finally, from the given boundary condition S′₊(1) = S′₋(3) we have
$$3 + 4(1-1) - 3(1-1)^2 = b + 2c(3-2) + 3d(3-2)^2 \;\Longrightarrow\; d = \frac{1}{3}.$$
So,
$$S(x) = \begin{cases} 3(x-1) + 2(x-1)^2 - (x-1)^3, & \text{if } 1 \le x < 2 \\ 4 + 4(x-2) - (x-2)^2 + \frac{1}{3}(x-2)^3, & \text{if } 2 \le x \le 3 \end{cases}$$
is the clamped cubic spline on interval [1, 3].


Example 8.3. Find a natural cubic spline to interpolate the data in the following table:

xk   −1   0    1
yk    1   2   −1

Solution: Let
$$S(x) = \begin{cases} a_0 + b_0(x+1) + c_0(x+1)^2 + d_0(x+1)^3, & \text{if } -1 \le x < 0 \\ a_1 + b_1 x + c_1 x^2 + d_1 x^3, & \text{if } 0 \le x \le 1 \end{cases}$$
Then
$$S'(x) = \begin{cases} b_0 + 2c_0(x+1) + 3d_0(x+1)^2, & \text{if } -1 < x < 0 \\ b_1 + 2c_1 x + 3d_1 x^2, & \text{if } 0 < x < 1 \end{cases}$$
$$S''(x) = \begin{cases} 2c_0 + 6d_0(x+1), & \text{if } -1 < x < 0 \\ 2c_1 + 6d_1 x, & \text{if } 0 < x < 1 \end{cases}$$
There are 8 constants in S(x) to be determined, which requires 8 con-
ditions. Three conditions come from the fact that the splines must
agree with the data at the nodes. Hence
S(x0 ) = y0 =⇒ S(−1) = 1 =⇒ a0 = 1
S(x1 ) = y1 =⇒ S(0) = 2 =⇒ a1 = 2
S(x2 ) = y2 =⇒ S(1) = −1 =⇒ a1 + b1 + c1 + d1 = −1
Three more conditions come from the fact that S(x), S′(x) and S″(x) must be continuous at x = 0. Hence
S₋(0) = S₊(0) ⟹ a₀ + b₀ + c₀ + d₀ = a₁
S′₋(0) = S′₊(0) ⟹ b₀ + 2c₀ + 3d₀ = b₁
S″₋(0) = S″₊(0) ⟹ 2c₀ + 6d₀ = 2c₁
Last two conditions come from the natural boundary conditions
S″₊(−1) = 0 ⟹ 2c₀ = 0 ⟹ c₀ = 0
S″₋(1) = 0 ⟹ 2c₁ + 6d₁ = 0

Then, combining all 8 conditions results in a system of linear equations
$$\begin{cases} a_0 = 1 \\ a_1 = 2 \\ a_1 + b_1 + c_1 + d_1 = -1 \\ a_0 + b_0 + c_0 + d_0 - a_1 = 0 \\ b_0 + 2c_0 + 3d_0 - b_1 = 0 \\ c_0 + 3d_0 - c_1 = 0 \\ c_0 = 0 \\ c_1 + 3d_1 = 0 \end{cases}$$
Solving this system of equations gives a₀ = 1, b₀ = 2, c₀ = 0, d₀ = −1, a₁ = 2, b₁ = −1, c₁ = −3, d₁ = 1. So, the natural cubic spline is
$$S(x) = \begin{cases} 1 + 2(x+1) - (x+1)^3, & \text{if } -1 \le x < 0 \\ 2 - x - 3x^2 + x^3, & \text{if } 0 \le x \le 1 \end{cases}$$
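
In practice one rarely assembles this linear system by hand. For instance, SciPy's CubicSpline (assuming SciPy is available) constructs the same natural spline; the sketch below is illustrative and reproduces Example 8.3. The layout of S.c is SciPy's convention: coefficients of the highest power first, one column per subinterval.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Data of Example 8.3; bc_type='natural' imposes S''(x0) = S''(xn) = 0
xs = np.array([-1.0, 0.0, 1.0])
ys = np.array([1.0, 2.0, -1.0])
S = CubicSpline(xs, ys, bc_type='natural')

print(S.c)     # columns give (d_k, c_k, b_k, a_k) for each subinterval
print(S(0.5))  # spline value inside [0, 1]
```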

8.2 Self-study Problems


Problem 8.1. Determine all unknown coefficients for which the following function
$$S(x) = \begin{cases} 1 + a(x-1) + b(x-1)^3, & x \in [1, 2) \\ 1 + c(x-2) - \frac{3}{4}(x-2)^2 + d(x-2)^3, & x \in [2, 3] \end{cases}$$
is the natural cubic spline on interval [1, 3].

Problem 8.2. Determine all unknown coefficients for which the following function
$$S(x) = \begin{cases} a + bx + 2x^2 - 2x^3, & x \in [0, 1) \\ 1 + c(x-1) + d(x-1)^2 + e(x-1)^3, & x \in [1, 2] \end{cases}$$
is the clamped cubic spline on interval [0, 2] with the derivative boundary conditions S′(0) = 0 and S′(2) = 11.


Problem 8.3. Find the natural cubic spline S(x) that passes through
data points given in the table:
xk 1 2 3
yk 2 3 5

Problem 8.4. Find the clamped cubic spline S(x) that passes through
data points given in the table:
xk −1 0 1
yk 1 2 −1

with the derivative boundary conditions S′(−1) = 1 and S′(1) = 1.

Problem 8.5. Determine all unknown coefficients for which the following function
$$S(x) = \begin{cases} a + bx - 2x^2 + x^3, & x \in [0, 1) \\ c + (x-1)^2 + d(x-1)^3, & x \in [1, 2) \\ t - (x-2) - 2(x-2)^2 + r(x-2)^3, & x \in [2, 3] \end{cases}$$
is the clamped cubic spline that passes through data points given in the table:

xk   0   1   2   3
yk   1   1   1   0

and has the derivative boundary conditions S′(0) = 1 and S′(3) = 1.

9 Curve Fitting
Assume that we have the set of paired observations
(x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ) (9.1)
We have already seen the interpolation methods when the approximat-
ing function is constructed such that its graph passes though all data
points. Interpolation assumes that there is some function f (x) such
that yk = f (xk ) for all values of k. However, in practice it is unrea-
sonable to require that the approximating function agree exactly with
the given data due to the presence of errors in the data. For instance,
in the left plot of Figure 5 points represent the data values and it appears that y depends linearly on x. In this case, the data can be represented by a "best" (in some sense) approximating line (or some other curve for the data in the right plot of Figure 5). The problem of finding that "best" approximating curve to represent the data is called the curve fitting problem.

Figure 5: Curve Fitting

9.1 Least Squares Method


Assume that the actual relationship between x and y has the form
y = f (x). We denote the error in observation yk by ek , so that
ek = f (xk ) − yk , k = 1, 2, . . . , n.


Definition 9.1. Least Squares method refers to the problem of minimizing the sum
$$S = \sum_{k=1}^{n} e_k^2 = \sum_{k=1}^{n} \big(f(x_k) - y_k\big)^2. \quad (9.2)$$

Note that by minimizing the sum in (9.2), we actually minimize the


differences between the actual and observed values.

Definition 9.2. To measure the quality of the fit, we define the quantity called Root Mean Square (RMS) as follows:
$$RMS = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\big(f(x_k) - y_k\big)^2}. \quad (9.3)$$

Definition 9.3. Least squares line is the line y = f(x) = Ax + B that minimizes the sum
$$S = \sum_{k=1}^{n}\big(Ax_k + B - y_k\big)^2. \quad (9.4)$$

Theorem 9.1. The coefficients of the least squares line are the solutions of the so-called normal equations:
$$\begin{cases} \displaystyle A\sum_{k=1}^{n} x_k^2 + B\sum_{k=1}^{n} x_k = \sum_{k=1}^{n} x_k y_k \\[2mm] \displaystyle A\sum_{k=1}^{n} x_k + B \cdot n = \sum_{k=1}^{n} y_k \end{cases} \quad (9.5)$$
Proof. The values of parameters A and B in the least squares line minimize the sum (9.4). For a minimum to occur, we need
$$\frac{\partial S}{\partial A} = \sum_{k=1}^{n} 2\big(Ax_k + B - y_k\big)x_k = 0, \qquad \frac{\partial S}{\partial B} = \sum_{k=1}^{n} 2\big(Ax_k + B - y_k\big) = 0.$$


From these two equalities we easily obtain normal equations (9.5).

Example 9.1. Find the least squares line for the data given in the
following table
xk −1 0 1 2 3 4 5 6
yk 10 9 7 5 4 3 0 −1

Solution: We first construct the table to calculate all necessary sums which appear in the normal equations (9.5):

xk    yk    xk²    xk·yk
−1    10    1      −10
0     9     0      0
1     7     1      7
2     5     4      10
3     4     9      12
4     3     16     12
5     0     25     0
6     −1    36     −6
Σxk = 20   Σyk = 37   Σxk² = 92   Σxk·yk = 25

Then the normal equations (9.5) take the form
$$\begin{cases} 92A + 20B = 25 \\ 20A + 8B = 37 \end{cases}$$
Solving this system, we obtain $A = -\frac{45}{28}$ and $B = \frac{121}{14}$. So, the least squares line for the given data has the form:
$$y = -\frac{45}{28}x + \frac{121}{14}.$$


Example 9.2. Find the least squares line for the data given in the
following table
xk −2 −1 0 1 2
yk 1 2 3 3 4
and calculate the RMS.
Solution: We first construct the table to calculate all necessary sums which appear in the normal equations (9.5):

xk    yk    xk²    xk·yk
−2    1     4      −2
−1    2     1      −2
0     3     0      0
1     3     1      3
2     4     4      8
Σxk = 0   Σyk = 13   Σxk² = 10   Σxk·yk = 7

Then the normal equations (9.5) take the form
$$\begin{cases} 10A + 0 \cdot B = 7 \\ 0 \cdot A + 5B = 13 \end{cases}$$
Solving this system, we obtain A = 0.7 and B = 2.6. So, the least squares line for the given data has the form:
$$y = f(x) = 0.7x + 2.6$$
The Root Mean Square of the obtained fit is
$$RMS = \sqrt{\frac{1}{5}\sum_{k=1}^{5}\big(f(x_k) - y_k\big)^2} = \sqrt{\frac{1}{5}\Big[(1.2-1)^2 + (1.9-2)^2 + (2.6-3)^2 + (3.3-3)^2 + (4-4)^2\Big]} = \sqrt{\frac{0.3}{5}} = \sqrt{0.06} \approx 0.245$$

Definition 9.4. Let m be some constant. Least squares power curve is the curve y = f(x) = Axᵐ that minimizes the sum
$$S = \sum_{k=1}^{n}\big(Ax_k^m - y_k\big)^2. \quad (9.6)$$

Theorem 9.2. The coefficient A of the least squares power curve is given by
$$A = \frac{\sum\limits_{k=1}^{n} y_k x_k^m}{\sum\limits_{k=1}^{n} x_k^{2m}}. \quad (9.7)$$

Proof. The value of parameter A in the least squares power curve minimizes the sum (9.6). For a minimum to occur, we need
$$\frac{dS}{dA} = \sum_{k=1}^{n} 2\big(Ax_k^m - y_k\big)x_k^m = 0.$$
From this equality we can easily obtain (9.7).

Example 9.3. Consider the data given in the following table


xk 2.0 2.3 2.6 2.9 3.2
yk 5.1 7.5 10.6 14.4 19.0
Find the least squares power curves

a) y = f(x) = Ax²    b) y = g(x) = Bx³

for the given data. Determine which curve gives a better fit.
Solution: a) Using formula (9.7) with m = 2 we have
$$A = \frac{5.1 \cdot 2.0^2 + 7.5 \cdot 2.3^2 + 10.6 \cdot 2.6^2 + 14.4 \cdot 2.9^2 + 19.0 \cdot 3.2^2}{2.0^4 + 2.3^4 + 2.6^4 + 2.9^4 + 3.2^4} = \frac{447.395}{265.2674} \approx 1.686581163$$

So, the least squares power curve for the given data has the form:
$$y = f(x) = 1.686581163 \cdot x^2$$
The Root Mean Square for the power fit y = f(x) is
$$RMS_f = \sqrt{\frac{1}{5}\sum_{k=1}^{5}\big(f(x_k) - y_k\big)^2} \approx 1.297$$
b) Using formula (9.7) with m = 3 we have
$$B = \frac{5.1 \cdot 2.0^3 + 7.5 \cdot 2.3^3 + 10.6 \cdot 2.6^3 + 14.4 \cdot 2.9^3 + 19.0 \cdot 3.2^3}{2.0^6 + 2.3^6 + 2.6^6 + 2.9^6 + 3.2^6} = \frac{1292.1517}{2189.51681} \approx 0.59015381571791$$
So, the least squares power curve for the given data has the form:
$$y = g(x) = 0.59015381571791 \cdot x^3$$
The Root Mean Square for the power fit y = g(x) is
$$RMS_g = \sqrt{\frac{1}{5}\sum_{k=1}^{5}\big(g(x_k) - y_k\big)^2} \approx 0.287$$
Since RMS_g < RMS_f, the curve y = g(x) gives a better fit than the curve y = f(x).

Definition 9.5. Least squares parabola is the parabola y = f(x) = Ax² + Bx + C that minimizes the sum
$$S = \sum_{k=1}^{n}\big(Ax_k^2 + Bx_k + C - y_k\big)^2. \quad (9.8)$$


Theorem 9.3. The coefficients of the least squares parabola are the solutions of the corresponding normal equations:
$$\begin{cases} \displaystyle A\sum_{k=1}^{n} x_k^4 + B\sum_{k=1}^{n} x_k^3 + C\sum_{k=1}^{n} x_k^2 = \sum_{k=1}^{n} y_k x_k^2 \\[2mm] \displaystyle A\sum_{k=1}^{n} x_k^3 + B\sum_{k=1}^{n} x_k^2 + C\sum_{k=1}^{n} x_k = \sum_{k=1}^{n} y_k x_k \\[2mm] \displaystyle A\sum_{k=1}^{n} x_k^2 + B\sum_{k=1}^{n} x_k + C \cdot n = \sum_{k=1}^{n} y_k \end{cases} \quad (9.9)$$

Proof. The values of parameters A, B and C in the least squares parabola minimize the sum (9.8). For a minimum to occur, we need
$$\frac{\partial S}{\partial A} = \sum_{k=1}^{n} 2\big(Ax_k^2 + Bx_k + C - y_k\big)x_k^2 = 0,$$
$$\frac{\partial S}{\partial B} = \sum_{k=1}^{n} 2\big(Ax_k^2 + Bx_k + C - y_k\big)x_k = 0,$$
$$\frac{\partial S}{\partial C} = \sum_{k=1}^{n} 2\big(Ax_k^2 + Bx_k + C - y_k\big) = 0.$$
From these equalities we can easily obtain normal equations (9.9).

Example 9.4. Find the least squares parabola for the data given in
the following table
xk −3 0 2 4
yk 3 1 1 3

Solution: We first construct the table to calculate all necessary sums


which appear in the normal equations (9.9)


xk    yk    xk²    xk³    xk⁴    xk·yk   xk²·yk
−3    3     9      −27    81     −9      27
0     1     0      0      0      0       0
2     1     4      8      16     2       4
4     3     16     64     256    12      48
Σxk = 3   Σyk = 8   Σxk² = 29   Σxk³ = 45   Σxk⁴ = 353   Σxk·yk = 5   Σxk²·yk = 79

Then the normal equations (9.9) take the form
$$\begin{cases} 353A + 45B + 29C = 79 \\ 45A + 29B + 3C = 5 \\ 29A + 3B + 4C = 8 \end{cases}$$
Solving this system, we obtain A ≈ 0.1785, B ≈ −0.1925, C ≈ 0.8505. So, the least squares parabola for the given data has the form:
$$y = 0.1785x^2 - 0.1925x + 0.8505$$
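
The same pattern extends to (9.9): assemble the 3×3 normal equations and solve. The Python sketch below is illustrative (using the data of Example 9.4) and should recover A ≈ 0.1785, B ≈ −0.1925, C ≈ 0.8505.

```python
import numpy as np

# Normal equations (9.9) for the parabola y = Ax^2 + Bx + C, data of Example 9.4
xs = np.array([-3.0, 0.0, 2.0, 4.0])
ys = np.array([3.0, 1.0, 1.0, 3.0])
M = np.array([[np.sum(xs**4), np.sum(xs**3), np.sum(xs**2)],
              [np.sum(xs**3), np.sum(xs**2), np.sum(xs)],
              [np.sum(xs**2), np.sum(xs),    len(xs)]])
rhs = np.array([np.sum(ys * xs**2), np.sum(ys * xs), np.sum(ys)])
print(np.linalg.solve(M, rhs))  # approximately [0.1785, -0.1925, 0.8505]
```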

9.2 Data Transformations


Suppose we want to find the least squares exponential curve

$$y = f(x) = Ce^{Ax} \quad (9.10)$$
for the data (9.1). So, we want to minimize the sum
$$S = \sum_{k=1}^{n}\big(Ce^{Ax_k} - y_k\big)^2$$
Then, the corresponding normal equations have the form
$$\begin{cases} \displaystyle C\sum_{k=1}^{n} x_k e^{2Ax_k} - \sum_{k=1}^{n} x_k y_k e^{Ax_k} = 0 \\[2mm] \displaystyle C\sum_{k=1}^{n} e^{2Ax_k} - \sum_{k=1}^{n} y_k e^{Ax_k} = 0 \end{cases}$$


which are nonlinear equations. Therefore, no explicit formulas can be


obtained for parameters A and C. Nonlinearity of normal equations
is due to the fact that the model (9.10) is nonlinear with respect to
parameters A and C. This nonlinearity can be removed by using so-
called data transformation method (or data linearization method).
We have
$$y = Ce^{Ax} \;\Longleftrightarrow\; \ln y = Ax + \ln C \;\Longleftrightarrow\; \tilde{y} = Ax + B \quad (9.11)$$
where ỹ = ln y and B = ln C. Now we can fit the data
(x1 , ỹ1 ), (x2 , ỹ2 ), . . . , (xn , ỹn )
to least squares line ỹ = Ax + B to find the values of A and B. Once
B is found, we can obtain C = e^B.

Example 9.5. Consider the data given in the table:


xk 0 1 2 3 4
yk 1.5 2.5 3.5 5.0 7.5
Use the data transformation method to find exponential fit y = CeAx .
Solution: We use transformation (9.11) to linearize the model. Then
we construct the table to calculate all necessary sums which appear in
the normal equations (9.5)
xk    yk     ỹk = ln(yk)   xk²    xk·ỹk
0     1.5    0.4055         0      0
1     2.5    0.9163         1      0.9163
2     3.5    1.2528         4      2.5056
3     5.0    1.6094         9      4.8282
4     7.5    2.0149         16     8.0596
Σxk = 10   Σỹk = 6.1989   Σxk² = 30   Σxk·ỹk = 16.3097

Then the normal equations (9.5) take the form
$$\begin{cases} 30A + 10B = 16.3097 \\ 10A + 5B = 6.1989 \end{cases}$$


Solving this system, we obtain A ≈ 0.3912 and B ≈ 0.4574. Then C = e^B ≈ 1.58. So, the least squares exponential curve for the given data has the form:
$$y = f(x) = 1.58\,e^{0.3912x}$$
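
This transformation is straightforward to carry out numerically. In the Python sketch below (illustrative only; np.polyfit is used as a ready-made least squares line fitter) the line is fitted to the transformed data (x_k, ln y_k) and C is recovered as e^B.

```python
import numpy as np

# Data of Example 9.5; fit y = C e^{Ax} by fitting the line ln y = Ax + ln C
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([1.5, 2.5, 3.5, 5.0, 7.5])
yt = np.log(ys)                  # transformed data  y~ = ln y

A, B = np.polyfit(xs, yt, 1)     # least squares line y~ = Ax + B
C = np.exp(B)
print(A, C)                      # approximately 0.3912 and 1.58
```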

Remark 9.1. Some transformations for data linearization are given in the following table:

Model               Transformation                        New model
y = A/x + B         x̃ = 1/x                               y = Ax̃ + B
y = 1/(Ax + B)      ỹ = 1/y                               ỹ = Ax + B
y = 1/(Ax + B)²     ỹ = 1/√y                              ỹ = Ax + B
y = Cx^A            ỹ = ln y,  x̃ = ln x,  B = ln C        ỹ = Ax̃ + B
y = Cxe^{Ax}        ỹ = ln(y/x),  B = ln C                ỹ = Ax + B
y = D/(x + C)       x̃ = xy,  A = −1/C,  B = D/C           y = Ax̃ + B

9.3 Self-study Problems


Problem 9.1. Find the least squares line y = Ax + B for the data
given in the table:
xk 1.0 1.1 1.3 1.5 1.9 2.1
yk 1.84 1.96 2.21 2.45 2.94 3.18


Problem 9.2. Consider the data given in the table:

xk   0.5   0.8   1.1   1.8   4.0
yk   7.1   4.4   3.2   1.9   0.9

Find the least squares power curves y = A/x and y = B/x² for the given data. Determine which power curve gives a better fit by calculating RMS.

Problem 9.3. Consider the data given in the table:


xk −3 −1 1 3
yk 15 5 1 5

Find the least squares parabola y = Ax2 + Bx + C for given data and
calculate RMS.

Problem 9.4. Consider the data given in the table:


xk −2 −1 0 1 2
yk 2 1 1 1 2
Find the quadratic polynomial that best fits given data in the sense of
least squares.

Problem 9.5. Consider the data given in the table:


xk 1 2 3 4 5
yk 0.6 1.9 4.3 7.6 12.6
Use the data transformation method to find:

a) Exponential fit y = CeAx

b) Power fit y = CxA


c) Determine which curve fits best by calculating RMS.

Problem 9.6. Consider the data given in the table:


xk 0 1 2
yk −0.5 0.5 1.0
Use the data transformation method and find the least squares curve y = 1/(Ax + B) for the given data.

Problem 9.7. Consider the data given in the table:


xk −3.0 −1.5 0.0 1.5 3.0
yk −0.1385 −2.1587 0.8330 2.2774 −0.5110
Derive the normal equations and find the least squares curve
y = A cos x + B sin x.

10 Eigenvalues, Scaled Power Method, Singular Value Decomposition
10.1 Eigenvalues and Eigenvectors of a Matrix
Definition 10.1. Number λ is called an eigenvalue of the square
matrix A if
Av = λv (10.1)
for some nonzero vector v. Vector v in this case is called an eigen-
vector of matrix A associated with eigenvalue λ.
To find the eigenvalues and eigenvectors of the square matrix A of
size n × n, we first rewrite equation (10.1) in the form
(A − λIn )v = 0, (10.2)
where In is an identity matrix. Given λ, (10.2) is a system of n homo-
geneous linear equations. By a standard theorem of linear algebra, it
has a nonzero solution v if and only if the determinant of its coefficient
matrix vanishes; that is, if and only if
$$|A - \lambda I_n| = \begin{vmatrix} a_{11}-\lambda & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22}-\lambda & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn}-\lambda \end{vmatrix} = 0. \quad (10.3)$$

Equation (10.3) is called the characteristic equation of the matrix


A; its roots are the eigenvalues of A. So, to find the eigenvalues of
square matrix A we need to find the roots of the characteristic equation
(10.3). Once the eigenvalues are found, corresponding eigenvectors can
be obtained from (10.2).
Remark 10.1. Note that, if v is an eigenvector associated with the
eigenvalue λ, then so is any nonzero constant scalar multiple cv of v.
So, we can scale eigenvectors any way we want.


Example 10.1. Find the eigenvalues and corresponding eigenvectors of matrix $A = \begin{bmatrix} 1 & 3 \\ 4 & 5 \end{bmatrix}$.
Solution: The characteristic equation of the given matrix is
$$|A - \lambda I_2| = \begin{vmatrix} 1-\lambda & 3 \\ 4 & 5-\lambda \end{vmatrix} = (1-\lambda)(5-\lambda) - 12 = \lambda^2 - 6\lambda - 7 = 0,$$

so we have two eigenvalues λ₁ = 7 and λ₂ = −1.
Let $v_1 = \begin{bmatrix} a \\ b \end{bmatrix}$ be an eigenvector associated with eigenvalue λ₁ = 7. Substitution of λ₁ and v₁ in (10.2) yields the system
$$\begin{bmatrix} -6 & 3 \\ 4 & -2 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \;\Longrightarrow\; \begin{cases} -6a + 3b = 0, \\ 4a - 2b = 0. \end{cases}$$
Since the two scalar equations in this system are equivalent, it has infinitely many nonzero solutions. We can choose a arbitrarily (but nonzero) and then solve for b. For instance, the choice a = 1 yields b = 2, and thus $v_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is an eigenvector associated with eigenvalue λ₁ = 7.
Let $v_2 = \begin{bmatrix} c \\ d \end{bmatrix}$ be an eigenvector associated with eigenvalue λ₂ = −1. Substitution of λ₂ and v₂ in (10.2) yields the system
$$\begin{bmatrix} 2 & 3 \\ 4 & 6 \end{bmatrix}\begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \;\Longrightarrow\; \begin{cases} 2c + 3d = 0, \\ 4c + 6d = 0. \end{cases}$$
The choice c = −3 yields d = 2, and thus $v_2 = \begin{bmatrix} -3 \\ 2 \end{bmatrix}$ is an eigenvector associated with eigenvalue λ₂ = −1.


Remark 10.2. Let λ₁, λ₂, ..., λₙ be eigenvalues of square matrix A and v₁, v₂, ..., vₙ be corresponding eigenvectors. Then by denoting
$$D = \begin{bmatrix} \lambda_1 & 0 & \ldots & 0 \\ 0 & \lambda_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}$$
we have
$$AV = VD \quad\text{or}\quad A = VDV^{-1} \quad (10.4)$$
(10.4) is called an eigen decomposition of matrix A.

Remark 10.3. Eigen decomposition of square matrix A allows us to compute powers of A in a very efficient way. For instance,
$$A^2 = (VDV^{-1})(VDV^{-1}) = VD^2V^{-1}.$$
By induction, it follows that for general positive integer powers,
$$A^k = VD^kV^{-1}, \quad\text{where}\quad D^k = \begin{bmatrix} \lambda_1^k & 0 & \ldots & 0 \\ 0 & \lambda_2^k & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n^k \end{bmatrix}$$
Additionally, it allows us to find the inverse of A,
$$A^{-1} = (VDV^{-1})^{-1} = VD^{-1}V^{-1}, \quad\text{where}\quad D^{-1} = \begin{bmatrix} \frac{1}{\lambda_1} & 0 & \ldots & 0 \\ 0 & \frac{1}{\lambda_2} & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \frac{1}{\lambda_n} \end{bmatrix}$$


10.2 Scaled Power Method


Definition 10.2. Let λ1 , λ2 , . . . , λn be eigenvalues of square matrix A
of size n × n. The spectral radius of matrix A, denoted by ρ(A), is
defined by
ρ(A) = max {|λ1 |, |λ2 |, . . . , |λn |}.

Remark 10.4. The spectral radius of matrix A is the dominant


eigenvalue; that is the eigenvalue with the largest magnitude. Domi-
nant eigenvalue is of primary interest in many applications.

Suppose λ1 is the dominant eigenvalue of the square matrix A. Let


x1 be the eigenvector of A associated with the dominant eigenvalue
λ1 . The Scaled Power method is an iterative technique used to
approximate the dominant eigenvalue λ1 and the associated eigenvector
x1 . We begin the Scaled Power method by choosing initial vector v0 .
To approximate the eigenvector x1 we generate the sequence of vectors
$\{v_k\}_{k=1}^{\infty}$ by letting
$$v_{k+1} = \frac{Av_k}{\|Av_k\|_\infty} \quad \text{for } k = 0, 1, 2, \ldots \quad (10.5)$$
Then the approximations for the dominant eigenvalue λ₁ are defined as follows:
$$\alpha_{k+1} = \frac{(Av_{k+1})^T v_{k+1}}{v_{k+1}^T v_{k+1}} \quad \text{for } k = 0, 1, 2, \ldots \quad (10.6)$$

Algorithm 8 shows a pseudo-code of Scaled Power method.

Remark 10.5. (Stopping conditions for Scaled Power method)


The following stopping conditions can be used:
$$|\alpha_{k+1} - \alpha_k| < TOL \quad \text{(for absolute precision)}$$
or
$$\frac{|\alpha_{k+1} - \alpha_k|}{|\alpha_{k+1}|} < TOL \quad \text{(for relative precision)}$$

Algorithm 8 Pseudo-code of Scaled Power Method.

Choose initial v₀.
for k = 0, 1, 2, 3, ... do
  Compute v_{k+1} by using (10.5)
  Compute α_{k+1} by using (10.6)
  if Stopping condition is satisfied then
    Stop the search (α_{k+1} is a sufficiently good approximation of the dominant eigenvalue)
  end if
end for

Example 10.2. Use the Scaled Power method to approximate the dominant eigenvalue of matrix $A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$ within relative precision 0.05.

Solution: We choose the initial vector $v_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.

1 step: We have $Av_0 = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$, so that $\|Av_0\|_\infty = 3$. Then
$$v_1 = \frac{Av_0}{\|Av_0\|_\infty} = \begin{bmatrix} 1 \\ 2/3 \end{bmatrix}, \qquad Av_1 = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 2/3 \end{bmatrix} = \begin{bmatrix} 13/3 \\ 4 \end{bmatrix} \;\Longrightarrow\; \|Av_1\|_\infty = \frac{13}{3}.$$
So
$$\alpha_1 = \frac{(Av_1)^T v_1}{v_1^T v_1} = \frac{13/3 + 8/3}{1 + 4/9} = \frac{63}{13} = 4.84615\ldots$$
2 step: We have
$$v_2 = \frac{Av_1}{\|Av_1\|_\infty} = \begin{bmatrix} 1 \\ 12/13 \end{bmatrix},$$


$$Av_2 = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 12/13 \end{bmatrix} = \begin{bmatrix} 63/13 \\ 62/13 \end{bmatrix} \;\Longrightarrow\; \|Av_2\|_\infty = \frac{63}{13}.$$
So
$$\alpha_2 = \frac{(Av_2)^T v_2}{v_2^T v_2} = \frac{63/13 + 744/169}{1 + 144/169} = \frac{1563}{313} = 4.99361\ldots$$
Since
$$\frac{|\alpha_2 - \alpha_1|}{|\alpha_2|} \approx 0.03 < 0.05,$$
we can stop the iterations. α₂ ≈ 4.99361 gives an approximation to the dominant eigenvalue of the given matrix within relative precision 0.05. Note that the exact value of the dominant eigenvalue is λ₁ = 5.
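
A direct transcription of Algorithm 8 into Python might look as follows; this is an illustrative sketch, with the relative stopping test of Remark 10.5 and an arbitrary iteration cap.

```python
import numpy as np

def scaled_power(A, v0, tol=0.05, max_steps=100):
    """Scaled Power method (10.5)-(10.6) with the relative stopping test."""
    v = np.array(v0, dtype=float)
    alpha_old = None
    for _ in range(max_steps):
        Av = A @ v
        v = Av / np.linalg.norm(Av, np.inf)    # (10.5)
        Av = A @ v
        alpha = (Av @ v) / (v @ v)             # (10.6)
        if alpha_old is not None and abs(alpha - alpha_old) < tol * abs(alpha):
            break
        alpha_old = alpha
    return alpha, v

A = np.array([[3.0, 2.0], [2.0, 3.0]])
alpha, v = scaled_power(A, [1.0, 0.0])
print(alpha)  # ~ 4.99, close to the exact dominant eigenvalue 5
```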

10.3 Singular Value Decomposition


Singular Value Decomposition (SVD) is a technique for factoring
a general m × n matrix. It has applications in many different areas,
including image compression, least squares fitting of data, signal pro-
cessing and statistics.

Definition 10.3. Matrix V is called orthogonal if VV^T = I.

Theorem 10.1. If matrix A is symmetric, then all eigenvalues of A


are nonnegative real numbers. Moreover, we have

$$A = VDV^T,$$

where V is the orthogonal matrix whose columns are eigenvectors of


A and D is the diagonal matrix whose entries are eigenvalues of A
corresponding to column vectors of V.

140
Lecture Notes - Numerical Methods M.Ashyraliyev

Theorem 10.2. For any matrix A of size m × n, the matrix A^T A is a symmetric matrix of size n × n. Therefore, all eigenvalues of A^T A are nonnegative real numbers.

Definition 10.4. Let A be a matrix of size m × n. Suppose that λ₁ ≥ λ₂ ≥ ... ≥ λₙ ≥ 0 are the eigenvalues of A^T A. Then the numbers
$$\sigma_1 = \sqrt{\lambda_1}, \quad \sigma_2 = \sqrt{\lambda_2}, \quad \ldots, \quad \sigma_n = \sqrt{\lambda_n}$$
are called the singular values of matrix A.

Theorem 10.3. (Singular Value Decomposition)
For any matrix A of size m × n we have
$$A = U\Sigma V^T, \quad (10.7)$$
where U is an orthogonal matrix of size m × m, V is an orthogonal matrix of size n × n and Σ is an m × n matrix whose only nonzero elements lie along the main diagonal.

Remark 10.6. (10.7) is called Singular Value Decomposition (SVD)


of matrix A of size m × n. We assume here that m > n, since in many
important applications m is much larger than n.

Remark 10.7. From (10.7) we have
$$A^T A = (V\Sigma^T U^T)(U\Sigma V^T) = V(\Sigma^T \Sigma)V^T.$$
Therefore, by Theorem 10.1 it follows that the columns of matrix V in (10.7) are the eigenvectors of matrix A^T A and the diagonal elements of matrix Σ in (10.7) are the square roots of the eigenvalues of matrix A^T A; so the diagonal elements of Σ are the singular values of matrix A. Thus, to construct matrices V, Σ and U in the Singular Value Decomposition (10.7) of matrix A of size m × n, we apply the following procedure:


1. We first find all eigenvalues of matrix A^T A. Let λ₁, λ₂, ..., λₙ be the eigenvalues of A^T A ordered from largest to smallest. Then the diagonal elements of matrix Σ are
$$\sigma_1 = \sqrt{\lambda_1}, \quad \sigma_2 = \sqrt{\lambda_2}, \quad \ldots, \quad \sigma_n = \sqrt{\lambda_n}.$$

2. We find all eigenvectors of matrix A^T A. Let v₁, v₂, ..., vₙ be normalized eigenvectors of A^T A associated with the eigenvalues λ₁, λ₂, ..., λₙ, respectively. Then the columns of matrix V are the vectors v₁, v₂, ..., vₙ.

3. Now, assume that the first k singular values of A are positive and the remaining singular values are zero. Then the first k columns of matrix U can be computed by
$$u_i = \frac{Av_i}{\sigma_i}, \quad i = 1, 2, \ldots, k.$$
The remaining m − k columns of U can be obtained from UU^T = I.

Remark 10.8. In the Singular Value Decomposition (10.7), the matrix


Σ is unique. However, matrices V and U will not be unique unless
m = n and σi > 0 for all i. Non-uniqueness is of no concern; we can
choose any orthogonal matrices V and U such that factorization (10.7)
holds.
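
For matrices of any realistic size, the factorization (10.7) is computed numerically rather than by hand. The Python sketch below is illustrative (it relies on numpy.linalg.svd) and checks the worked example that follows (Example 10.3); the signs of the columns of U and V may differ from a hand computation, which is harmless by Remark 10.8.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])

U, s, Vt = np.linalg.svd(A)              # A = U @ Sigma @ Vt, singular values descending
print(s)                                 # [sqrt(3), 1]

Sigma = np.zeros(A.shape)                # rebuild the m x n matrix Sigma
Sigma[:len(s), :len(s)] = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vt))    # True: the factorization reproduces A

print(np.linalg.eigvalsh(A.T @ A))       # eigenvalues of A^T A: 1 and 3
```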
 
Example 10.3. Find the SVD of the matrix $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$.

Solution: We have
$$A^T A = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$


The characteristic equation of matrix A^T A is
$$\begin{vmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{vmatrix} = (2-\lambda)^2 - 1 = \lambda^2 - 4\lambda + 3 = 0,$$
so we have two eigenvalues λ₁ = 3 and λ₂ = 1. Then the singular values of the given matrix A are $\sigma_1 = \sqrt{\lambda_1} = \sqrt{3}$ and $\sigma_2 = \sqrt{\lambda_2} = 1$. Then
$$\Sigma = \begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}$$
Let $\begin{bmatrix} a \\ b \end{bmatrix}$ be an eigenvector of matrix A^T A associated with eigenvalue λ₁ = 3. Then
$$\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \;\Longrightarrow\; \begin{cases} -a + b = 0 \\ a - b = 0 \end{cases} \;\Longrightarrow\; a = b.$$
The choice b = 1 yields a = 1, and thus $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector of matrix A^T A associated with eigenvalue λ₁ = 3. Normalizing this vector gives $v_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$.
Let $\begin{bmatrix} c \\ d \end{bmatrix}$ be an eigenvector of matrix A^T A associated with eigenvalue λ₂ = 1. Then
$$\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \;\Longrightarrow\; c + d = 0.$$
The choice d = −1 yields c = 1, and thus $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is an eigenvector

of matrix A^T A associated with eigenvalue λ₂ = 1. Normalizing this vector gives $v_2 = \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}$. Then
$$V = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}$$
Now, we construct the matrix U. The first 2 columns of U can be computed directly:
$$u_1 = \frac{Av_1}{\sigma_1} = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} = \frac{1}{\sqrt{3}} \begin{bmatrix} \sqrt{2} \\ 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} 2/\sqrt{6} \\ 1/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix}$$
$$u_2 = \frac{Av_2}{\sigma_2} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} 0 \\ -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$$
The last column of U can be obtained from UU^T = I. Let $u_3 = \begin{bmatrix} p \\ q \\ r \end{bmatrix}$. Then
$$UU^T = \begin{bmatrix} 2/\sqrt{6} & 0 & p \\ 1/\sqrt{6} & -1/\sqrt{2} & q \\ 1/\sqrt{6} & 1/\sqrt{2} & r \end{bmatrix} \begin{bmatrix} 2/\sqrt{6} & 1/\sqrt{6} & 1/\sqrt{6} \\ 0 & -1/\sqrt{2} & 1/\sqrt{2} \\ p & q & r \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
results in p² = q² = r² = 1/3 and q = r = −p. The choice r = 1/√3 yields


q = 1/√3 and p = −1/√3. So
$$U = \begin{bmatrix} 2/\sqrt{6} & 0 & -1/\sqrt{3} \\ 1/\sqrt{6} & -1/\sqrt{2} & 1/\sqrt{3} \\ 1/\sqrt{6} & 1/\sqrt{2} & 1/\sqrt{3} \end{bmatrix}$$
Finally, the SVD (10.7) for the given matrix A takes the form
$$A = U\Sigma V^T = \begin{bmatrix} 2/\sqrt{6} & 0 & -1/\sqrt{3} \\ 1/\sqrt{6} & -1/\sqrt{2} & 1/\sqrt{3} \\ 1/\sqrt{6} & 1/\sqrt{2} & 1/\sqrt{3} \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}^T$$

10.4 Self-study Problems


 
Problem 10.1. Let $A = \begin{bmatrix} 1 & 3 \\ 4 & 5 \end{bmatrix}$. Use the Scaled Power method with initial vector $v_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to approximate the dominant eigenvalue of matrix A within relative precision $10^{-2}$. Compare your results to the exact value of the dominant eigenvalue.

 
Problem 10.2. Let $A = \begin{bmatrix} 7 & -2 & 0 \\ -2 & 6 & -2 \\ 0 & -2 & 5 \end{bmatrix}$. Use the Scaled Power method with initial vector $v_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ to approximate the dominant eigenvalue of matrix A within relative precision $10^{-2}$. Compare your results to the exact value of the dominant eigenvalue.


Problem 10.3. Find the Singular Value Decomposition of the matrix


 
$$A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$$

Problem 10.4. Find the Singular Value Decomposition of the matrix


 
$$A = \begin{bmatrix} 2 & -1 \\ -2 & 1 \end{bmatrix}$$

Problem 10.5. Find the Singular Value Decomposition of the matrix


 
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}$$

11 Numerical Differentiation, Numerical Integration
11.1 Numerical Differentiation
Here we will learn the methods to approximate the first and the second
derivatives of function. These methods are often needed for approxi-
mating the solutions of differential equations. Approximation formulas
are based on the Taylor’s formula. Let us recall the Taylor’s theorem.
Theorem 11.1. (Taylor’s Theorem)
Suppose function f has continuous n + 1 derivatives on interval [a, b].
Then for any points x and x0 from [a, b], there exists a number ξ
between x and x0 such that

0 f 00 (x0 )
f (x) = f (x0 ) + f (x0 )(x − x0 ) + (x − x0 )2 +
2!
(11.1)
f (n) (x0 ) n f (n+1) (ξ)
+ ... + (x − x0 ) + (x − x0 )n+1 .
n! (n + 1)!

Remark 11.1. Putting x = x0 + h in (11.1), we get

$$f(x_0 + h) = f(x_0) + hf'(x_0) + \frac{h^2}{2!}f''(x_0) + \ldots + \frac{h^n}{n!}f^{(n)}(x_0) + \frac{h^{n+1}}{(n+1)!}f^{(n+1)}(\xi). \quad (11.2)$$

11.1.1 Approximations for the first derivative


Theorem 11.2. If f 00 (x) is continuous on the interval [a, b], then for
any points x0 and x0 + h from [a, b] we have
$$f'(x_0) = \frac{f(x_0 + h) - f(x_0)}{h} - \frac{h}{2}f''(\xi) \quad (11.3)$$
for some ξ between x0 and x0 + h.

147
Lecture Notes - Numerical Methods M.Ashyraliyev

Proof. With n = 1 in (11.2) we have

$$f(x_0 + h) = f(x_0) + hf'(x_0) + \frac{h^2}{2!}f''(\xi),$$
which gives us immediately (11.3).

Remark 11.2. Assuming that h is small enough, from (11.3) we get


an approximation for the first derivative of function f at point x0
$$f'(x_0) \approx \frac{f(x_0 + h) - f(x_0)}{h} \quad (11.4)$$
When h > 0, (11.4) is called the forward difference formula. Note
that the truncation error in the approximation (11.4) is O(h).

Remark 11.3. Replacing h with −h in (11.4), we obtain

$$f'(x_0) \approx \frac{f(x_0) - f(x_0 - h)}{h}, \quad (11.5)$$
which is called the backward difference formula. The truncation
error in the approximation (11.5) is also O(h).

Theorem 11.3. If f 000 (x) is continuous on the interval [a, b], then for
any points x0 − h and x0 + h from [a, b] we have

$$f'(x_0) = \frac{f(x_0 + h) - f(x_0 - h)}{2h} - \frac{h^2}{6}f'''(\xi) \quad (11.6)$$
for some ξ between x0 − h and x0 + h.

Proof. With n = 2 in (11.2) we have

$$f(x_0 + h) = f(x_0) + hf'(x_0) + \frac{h^2}{2!}f''(x_0) + \frac{h^3}{3!}f'''(\xi_2), \quad (11.7)$$


where ξ₂ is some point between x₀ and x₀ + h. Replacing h with −h in (11.7) gives
$$f(x_0 - h) = f(x_0) - hf'(x_0) + \frac{h^2}{2!}f''(x_0) - \frac{h^3}{3!}f'''(\xi_1), \quad (11.8)$$
where ξ₁ is some point between x₀ − h and x₀. Subtracting the equality (11.8) from the equality (11.7) side by side, we get
$$f(x_0 + h) - f(x_0 - h) = 2hf'(x_0) + \frac{h^3}{6}\big(f'''(\xi_1) + f'''(\xi_2)\big),$$
from which we obtain
$$f'(x_0) = \frac{f(x_0 + h) - f(x_0 - h)}{2h} - \frac{h^2}{12}\big(f'''(\xi_1) + f'''(\xi_2)\big). \quad (11.9)$$
Since f‴(x) is continuous on [a, b] and therefore it is continuous on [ξ₁, ξ₂], there must exist some point ξ between ξ₁ and ξ₂ such that $f'''(\xi) = \frac{f'''(\xi_1) + f'''(\xi_2)}{2}$. Using this in (11.9), we obtain (11.6).

Remark 11.4. Assuming that h is small enough, from (11.6) we get


an approximation for the first derivative of function f at point x₀
$$f'(x_0) \approx \frac{f(x_0 + h) - f(x_0 - h)}{2h}, \quad (11.10)$$
which is called the central difference formula for the first derivative.
Note that the truncation error in the approximation (11.10) is O(h2 ).

Example 11.1. Let f(x) = xeˣ. Approximate f′(2) using

(a) the forward difference formula (11.4)

(b) the central difference formula (11.10)


with h = 0.1, h = 0.01 and h = 0.001. Find the errors in the obtained approximations.
Solution: We have f′(x) = (x + 1)eˣ and therefore the exact value of f′(2) is
$$f'(2) = 3e^2 = 22.16716829679165\ldots$$
(a) The forward difference formula (11.4) with x₀ = 2 takes the form
$$f'(2) \approx \frac{f(2 + h) - f(2)}{h}.$$
When h = 0.1, we have
$$\frac{f(2.1) - f(2)}{0.1} = 23.7084461853\ldots \quad\text{with the error}\quad \Big|f'(2) - \frac{f(2.1) - f(2)}{0.1}\Big| \approx 1.54.$$
When h = 0.01, we have
$$\frac{f(2.01) - f(2)}{0.01} = 22.315567\ldots \quad\text{with the error}\quad \Big|f'(2) - \frac{f(2.01) - f(2)}{0.01}\Big| \approx 0.15.$$
When h = 0.001, we have
$$\frac{f(2.001) - f(2)}{0.001} = 22.1819525\ldots \quad\text{with the error}\quad \Big|f'(2) - \frac{f(2.001) - f(2)}{0.001}\Big| \approx 0.015.$$
(b) The central difference formula (11.10) with x₀ = 2 takes the form
$$f'(2) \approx \frac{f(2 + h) - f(2 - h)}{2h}.$$

When h = 0.1, we have
$$\frac{f(2.1) - f(1.9)}{0.2} = 22.22878688\ldots \quad\text{with the error}\quad \Big|f'(2) - \frac{f(2.1) - f(1.9)}{0.2}\Big| \approx 0.06.$$
When h = 0.01, we have
$$\frac{f(2.01) - f(1.99)}{0.02} = 22.167784\ldots \quad\text{with the error}\quad \Big|f'(2) - \frac{f(2.01) - f(1.99)}{0.02}\Big| \approx 6 \times 10^{-4}.$$
When h = 0.001, we have
$$\frac{f(2.001) - f(1.999)}{0.002} = 22.167174454\ldots \quad\text{with the error}\quad \Big|f'(2) - \frac{f(2.001) - f(1.999)}{0.002}\Big| \approx 6 \times 10^{-6}.$$
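
The two formulas compared in this example are one-liners in code. The following illustrative Python sketch reproduces the error table above and makes the O(h) versus O(h²) behaviour visible.

```python
import numpy as np

def forward_diff(f, x0, h):
    return (f(x0 + h) - f(x0)) / h            # formula (11.4), error O(h)

def central_diff(f, x0, h):
    return (f(x0 + h) - f(x0 - h)) / (2 * h)  # formula (11.10), error O(h^2)

f = lambda x: x * np.exp(x)
exact = 3 * np.exp(2)                          # f'(2) = (x + 1)e^x at x = 2

for h in (0.1, 0.01, 0.001):
    print(h, abs(forward_diff(f, 2, h) - exact),
             abs(central_diff(f, 2, h) - exact))
```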

11.1.2 Approximation for the second derivative


Theorem 11.4. If f (4) (x) is continuous on the interval [a, b], then for
any points x0 − h and x0 + h from [a, b] we have

$$f''(x_0) = \frac{f(x_0 - h) - 2f(x_0) + f(x_0 + h)}{h^2} - \frac{h^2}{12}f^{(4)}(\xi) \quad (11.11)$$
for some ξ between x0 − h and x0 + h.

Proof. With n = 3 in (11.2) we have

$$f(x_0 + h) = f(x_0) + hf'(x_0) + \frac{h^2}{2!}f''(x_0) + \frac{h^3}{3!}f'''(x_0) + \frac{h^4}{4!}f^{(4)}(\xi_2), \quad (11.12)$$


where ξ₂ is some point between x₀ and x₀ + h. Replacing h with −h in (11.12) gives
$$f(x_0 - h) = f(x_0) - hf'(x_0) + \frac{h^2}{2!}f''(x_0) - \frac{h^3}{3!}f'''(x_0) + \frac{h^4}{4!}f^{(4)}(\xi_1), \quad (11.13)$$
where ξ₁ is some point between x₀ − h and x₀. Adding the equalities (11.12) and (11.13) side by side, we get
$$f(x_0 - h) + f(x_0 + h) = 2f(x_0) + h^2 f''(x_0) + \frac{h^4}{24}\big(f^{(4)}(\xi_1) + f^{(4)}(\xi_2)\big),$$
from which we obtain
$$f''(x_0) = \frac{f(x_0 - h) - 2f(x_0) + f(x_0 + h)}{h^2} - \frac{h^2}{24}\big(f^{(4)}(\xi_1) + f^{(4)}(\xi_2)\big). \quad (11.14)$$
Since f⁽⁴⁾(x) is continuous on [a, b] and therefore it is continuous on [ξ₁, ξ₂], there must exist some point ξ between ξ₁ and ξ₂ such that $f^{(4)}(\xi) = \frac{f^{(4)}(\xi_1) + f^{(4)}(\xi_2)}{2}$. Using this in (11.14), we obtain (11.11).

Remark 11.5. Assuming that h is small enough, from (11.11) we get


an approximation for the second derivative of function f at point x₀
$$f''(x_0) \approx \frac{f(x_0 - h) - 2f(x_0) + f(x_0 + h)}{h^2}, \quad (11.15)$$
which is called the central difference formula for the second deriva-
tive. Note that the truncation error in the approximation (11.15) is
O(h2 ).

Example 11.2. Let f(x) = xeˣ. Approximate f″(2) using the central difference formula (11.15) with h = 0.1, h = 0.01 and h = 0.001. Find the errors in the obtained approximations.


Solution: We have f″(x) = (x + 2)eˣ and therefore the exact value of f″(2) is
$$f''(2) = 4e^2 = 29.5562243957226\ldots$$
The central difference formula (11.15) with x₀ = 2 takes the form
$$f''(2) \approx \frac{f(2 - h) - 2f(2) + f(2 + h)}{h^2}.$$
When h = 0.1, we have
$$\frac{f(1.9) - 2f(2) + f(2.1)}{0.01} = 29.5931861\ldots \quad\text{with the error} \approx 0.037.$$
When h = 0.01, we have
$$\frac{f(1.99) - 2f(2) + f(2.01)}{0.0001} = 29.55659385\ldots \quad\text{with the error} \approx 3.7 \times 10^{-4}.$$
When h = 0.001, we have
$$\frac{f(1.999) - 2f(2) + f(2.001)}{0.000001} = 29.55622809\ldots \quad\text{with the error} \approx 3.7 \times 10^{-6}.$$

11.2 Numerical Integration


The exact techniques which are used to evaluate the definite integral $\int_a^b f(x)\,dx$ are based on the Fundamental Theorem of Calculus and

therefore require the antiderivative of function f (x). Most of the time,


it is extremely difficult or even impossible to find the antiderivatives.
Here we will learn the numerical methods to approximate the definite
integral. These methods are called numerical quadratures.

11.2.1 Trapezoidal Rule


Theorem 11.5. If f″(x) is continuous on the interval [x₀, x₁], then there exists a point ξ between x₀ and x₁ such that
$$\int_{x_0}^{x_1} f(x)\,dx = \frac{h}{2}\big(f(x_0) + f(x_1)\big) - \frac{h^3}{12}f''(\xi), \quad (11.16)$$
where h = x₁ − x₀.

Proof. Using the first order Lagrange interpolating polynomial we have
$$f(x) = \frac{x - x_1}{x_0 - x_1}f(x_0) + \frac{x - x_0}{x_1 - x_0}f(x_1) + \frac{f''(\xi)}{2!}(x - x_0)(x - x_1),$$
where ξ is some point between x₀ and x₁. Then
$$\int_{x_0}^{x_1} f(x)\,dx = \int_{x_0}^{x_1}\Big[-\frac{f(x_0)}{h}(x - x_1) + \frac{f(x_1)}{h}(x - x_0) + \frac{f''(\xi)}{2}(x - x_0)(x - x_1)\Big]dx$$
$$= -\frac{f(x_0)}{h}\,\frac{(x - x_1)^2}{2}\bigg|_{x_0}^{x_1} + \frac{f(x_1)}{h}\,\frac{(x - x_0)^2}{2}\bigg|_{x_0}^{x_1} + \frac{f''(\xi)}{2}\Big[\frac{(x - x_0)(x - x_1)^2}{2} - \frac{(x - x_1)^3}{6}\Big]_{x_0}^{x_1}$$
$$= \frac{h}{2}\big[f(x_0) + f(x_1)\big] - \frac{h^3}{12}f''(\xi).$$

154
Lecture Notes - Numerical Methods M.Ashyraliyev

Now, consider the integral $\int_a^b f(x)\,dx$. Let $h = \frac{b-a}{n}$, where n is a positive integer. We denote
$$x_0 = a, \quad x_1 = a + h, \quad x_2 = a + 2h, \quad \ldots, \quad x_n = a + nh = b.$$
Then the sum
$$T_n = \frac{h}{2}\Big[f(x_0) + 2f(x_1) + 2f(x_2) + \ldots + 2f(x_{n-1}) + f(x_n)\Big] = \frac{h}{2}\Big[f(x_0) + 2\sum_{k=1}^{n-1} f(x_k) + f(x_n)\Big] \quad (11.17)$$

is called the Trapezoidal Sum for function f (x) over interval [a, b].

Theorem 11.6. If f″(x) is continuous on the interval [a, b], then there exists a point ξ between a and b such that
$$\int_a^b f(x)\,dx = T_n - \frac{h^2}{12}(b - a)f''(\xi), \quad (11.18)$$
where Tₙ is the Trapezoidal sum defined by (11.17).


Proof. Using the result of Theorem 11.5, we have
Zb Zx1 Zx2 x
Zn−1 Zxn
f (x)dx = f (x)dx + f (x)dx + . . . + f (x)dx + f (x)dx =
a x0 x1 xn−2 xn−1
h  h3 h  h3
00
= f (x0 ) + f (x1 ) − f (ξ1 ) + f (x1 ) + f (x2 ) − f 00 (ξ2 )+
2 12 2 12
h   h3
+ ... + f (xn−2 ) + f (xn−1 ) − f 00 (ξn−1 )+
2 12
h  h3
+ f (xn−1 ) + f (xn ) − f 00 (ξn ) =
2 12
155
Lecture Notes - Numerical Methods M.Ashyraliyev

hh i
= f (x0 ) + 2f (x1 ) + 2f (x2 ) + . . . + 2f (xn−1 ) + f (xn ) −
2
h3  00 00 00 00

− f (ξ1 ) + f (ξ2 ) + . . . + f (ξn−1 ) + f (ξn ) =
12
h3 00 h2
= Tn − nf (ξ) = Tn − (b − a)f 00 (ξ).
12 12

Remark 11.6. Assuming that h is small enough, from (11.18) we get an approximation formula
$$\int_a^b f(x)\,dx \approx \frac{h}{2}\Big[f(x_0) + 2\sum_{k=1}^{n-1} f(x_k) + f(x_n)\Big], \quad (11.19)$$
which is called the Trapezoidal Rule. Note that the truncation error in the approximation (11.19) is O(h²).

Example 11.3. Approximate the integral $\int_0^1 x^4\,dx$ using the Trapezoidal sum with n = 10 and find the error in the obtained approximation.

Solution: We denote f(x) = x⁴. Since $h = \frac{1-0}{10} = 0.1$, we have $x_k = hk = 0.1k$ for k = 0, 1, 2, ..., 10. Then
$$T_{10} = \frac{h}{2}\Big[f(x_0) + 2\sum_{k=1}^{9} f(x_k) + f(x_{10})\Big] = \frac{0.1}{2}\Big[0^4 + 2\big(0.1^4 + 0.2^4 + 0.3^4 + \ldots + 0.9^4\big) + 1^4\Big] = 0.05\big[0 + 2 \cdot 1.5333 + 1\big] = 0.20333.$$


Note that the exact value of the given integral is $\int_0^1 x^4\,dx = \frac{1}{5} = 0.2$ and therefore the error in the approximation of the integral by T₁₀ is
$$E = |T_{10} - 0.2| = |0.20333 - 0.2| = 0.00333.$$
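
The Trapezoidal rule (11.19) is equally short in code. The illustrative Python sketch below reproduces T₁₀ for this example.

```python
def trapezoid(f, a, b, n):
    """Trapezoidal rule (11.19) with n subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += 2 * f(a + k * h)   # interior nodes get weight 2
    return h / 2 * total

print(trapezoid(lambda x: x**4, 0.0, 1.0, 10))  # 0.20333..., vs exact 0.2
```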

11.2.2 Simpson’s Rule


Theorem 11.7. If f⁽⁴⁾(x) is continuous on the interval [x₀, x₂], then there exists a point ξ between x₀ and x₂ such that
$$\int_{x_0}^{x_2} f(x)\,dx = \frac{h}{3}\big(f(x_0) + 4f(x_1) + f(x_2)\big) - \frac{h^5}{90}f^{(4)}(\xi), \quad (11.20)$$
where h = x₁ − x₀ = x₂ − x₁.
Proof. The proof of (11.20) uses the second order Lagrange interpo-
lating polynomial and is similar to the proof of (11.16). So, we leave
it for the self-study.
Now, consider again the integral $\int_a^b f(x)\,dx$. Let n be an even number. We define $h = \frac{b-a}{n}$ and $x_k = a + kh$ for k = 0, 1, 2, ..., n. Then the sum
$$S_n = \frac{h}{3}\Big[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \ldots + 4f(x_{n-1}) + f(x_n)\Big]$$
or
$$S_n = \frac{h}{3}\Big[f(x_0) + 4\sum_{k=1}^{n/2} f(x_{2k-1}) + 2\sum_{k=1}^{(n/2)-1} f(x_{2k}) + f(x_n)\Big] \quad (11.21)$$

is called the Simpson’s Sum for function f (x) over interval [a, b].


Theorem 11.8. If f⁽⁴⁾(x) is continuous on the interval [a, b], then there exists a point ξ between a and b such that
$$\int_a^b f(x)\,dx = S_n - \frac{h^4}{180}(b - a)f^{(4)}(\xi), \quad (11.22)$$

where Sn is the Simpson’s sum defined by (11.21).

Proof. Using the result of Theorem 11.7, we have
$$\int_a^b f(x)\,dx = \int_{x_0}^{x_2} f(x)\,dx + \int_{x_2}^{x_4} f(x)\,dx + \ldots + \int_{x_{n-2}}^{x_n} f(x)\,dx$$
$$= \Big[\frac{h}{3}\big(f(x_0) + 4f(x_1) + f(x_2)\big) - \frac{h^5}{90}f^{(4)}(\xi_1)\Big] + \Big[\frac{h}{3}\big(f(x_2) + 4f(x_3) + f(x_4)\big) - \frac{h^5}{90}f^{(4)}(\xi_2)\Big] + \ldots + \Big[\frac{h}{3}\big(f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)\big) - \frac{h^5}{90}f^{(4)}(\xi_{n/2})\Big]$$
$$= \frac{h}{3}\Big[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \ldots + 4f(x_{n-1}) + f(x_n)\Big] - \frac{h^5}{90}\Big[f^{(4)}(\xi_1) + f^{(4)}(\xi_2) + \ldots + f^{(4)}(\xi_{n/2})\Big]$$
$$= S_n - \frac{h^5}{90}\cdot\frac{n}{2}\,f^{(4)}(\xi) = S_n - \frac{h^4}{180}(b - a)f^{(4)}(\xi).$$

Remark 11.7. Assuming that h is small enough, from (11.22) we get an approximation formula
$$\int_a^b f(x)\,dx \approx \frac{h}{3}\Big[f(x_0) + 4\sum_{k=1}^{n/2} f(x_{2k-1}) + 2\sum_{k=1}^{(n/2)-1} f(x_{2k}) + f(x_n)\Big], \quad (11.23)$$
which is called the Simpson's Rule. Note that the truncation error in the approximation (11.23) is O(h⁴).

Example 11.4. Approximate the integral $\int_0^1 x^4\,dx$ using the Simpson's sum with n = 10 and find the error in the obtained approximation.

Solution: We denote f(x) = x⁴. Since $h = \frac{1-0}{10} = 0.1$, we have $x_k = hk = 0.1k$ for k = 0, 1, 2, ..., 10. Then
$$S_{10} = \frac{h}{3}\Big[f(x_0) + 4\sum_{k=1}^{5} f(x_{2k-1}) + 2\sum_{k=1}^{4} f(x_{2k}) + f(x_{10})\Big] = \frac{0.1}{3}\Big[0 + 4\big(0.1^4 + 0.3^4 + 0.5^4 + 0.7^4 + 0.9^4\big) + 2\big(0.2^4 + 0.4^4 + 0.6^4 + 0.8^4\big) + 1\Big]$$
$$= \frac{0.1}{3}\big[0 + 4 \cdot 0.9669 + 2 \cdot 0.5664 + 1\big] = \frac{0.1}{3} \cdot 6.0004 = 0.2000133\ldots$$
The error in the approximation of the given integral by S₁₀ is
$$E = |S_{10} - 0.2| = 0.0000133\ldots$$
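
Simpson's rule (11.23) differs from the trapezoidal code only in the weight pattern 1, 4, 2, 4, ..., 4, 1. An illustrative Python sketch reproducing S₁₀:

```python
def simpson(f, a, b, n):
    """Simpson's rule (11.23); n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)  # odd nodes weight 4, even weight 2
    return h / 3 * total

print(simpson(lambda x: x**4, 0.0, 1.0, 10))  # 0.2000133..., vs exact 0.2
```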

11.3 Self-study Problems


Problem 11.1. Consider the data given in the table


x 0.5 0.6 0.7


f (x) 0.4794 0.5646 0.6442

a) Approximate f′(0.6) by using the forward difference formula.

b) Approximate f′(0.6) by using the backward difference formula.

c) Approximate f′(0.6) by using the central difference formula.

d) Approximate f″(0.6) by using the central difference formula.

Problem 11.2. Find the values of a, b and c such that
$$f'(x_0) = \frac{af(x_0) + bf(x_0 + h) + cf(x_0 + 2h)}{2h} + O(h^2)$$
and then use the obtained formula to approximate f′(0.5) for the data given in Problem 11.1.

Problem 11.3. Find the values of a, b and c such that
$$f'(x_0) = \frac{af(x_0) + bf(x_0 - h) + cf(x_0 - 2h)}{2h} + O(h^2)$$
and then use the obtained formula to approximate f′(0.7) for the data given in Problem 11.1.

Problem 11.4. Let f(x) = x ln x. Approximate f′(8) using


(a) forward difference formula
(b) central difference formula
with h = 0.1, h = 0.01 and h = 0.001. Find the errors in obtained
approximations.


Problem 11.5. Let f(x) = x ln x. Approximate f″(8) using the central difference formula with h = 0.1, h = 0.01 and h = 0.001. Find the errors in the obtained approximations.

Problem 11.6. Approximate the integral $\int_1^2 x \ln x\,dx$ using

(a) Trapezoidal sum with n = 4 and find the error in the obtained
approximation.

(b) Simpson’s sum with n = 4 and find the error in the obtained
approximation.


Problem 11.7. Approximate the integral $\int_0^{\pi} x^2 \cos x\,dx$ using

(a) Trapezoidal sum with n = 6 and find the error in the obtained
approximation.

(b) Simpson’s sum with n = 6 and find the error in the obtained
approximation.

Problem 11.8. Approximate the integral $\int_0^2 e^{x^2}\,dx$ using

(a) Trapezoidal sum with h = 0.2

(b) Simpson’s sum with h = 0.2
