Comp 361
Comp 361
on
with Solutions
Eusebius Doedel
TABLE OF CONTENTS
In later analysis we shall need a quantity (called vector norm) that measures
the magnitude of a vector.
Let x ≡ (x1 , x2 , · · · , xn )T ∈ Rn .
n
X 1
2
k x k2 ≡ ( xk ) ,2 (the “two-norm ”, or Euclidean length)
k=1
1
k x k1 and k x k2 are special cases of
n
X 1
p
k x kp ≡ ( | xk | ) ,
p (where p is a positive integer),
k=1
2
Vector norms are required to satisfy
(ii) k αx k = | α | k x k, ∀x ∈ Rn , ∀α ∈ R,
3
All of the examples of norms given above satisfy (i) and (ii). (Check !)
= k x k∞ + k y k∞ .
4
EXERCISES :
5
•[005] Prove that k x k1 ≤ n k x k∞ .
SOLUTION :
n
X
k x k1 = | xi | ≤ n max | xi | = n k x k∞ .
i
i=1
√
•[006] Prove that k x k2 ≤ n k x k∞ .
SOLUTION : From
Xn
k x k22 = x2i ≤ n max | xi |2 = n k x k2∞ ,
i
i=1 √
it follows that k x k2 ≤ n k x k∞ .
|xk |
Note that the ratio kxk∞
equals 1 for at least one value of k.
|xk |
We may assume kxk∞
= 1 for m values of k, where 0 < m ≤ n.
|xk |
For the remaining values of k the ratio kxk∞
is strictly less than 1.
It follows that n h
X | xk | ip
→ m as p → ∞ .
k=1
k x k∞
Thus Xn h
| xk | ip p1 1
lim = lim m = 1 .
p
p→∞
k=1
k x k∞ p→∞
We also need a measure of the magnitude of a square matrix (matrix norm).
6
For specific choices of vector norm it is convenient to express the induced
matrix norm directly in terms of the elements of the matrix :
7
Next we show that for any matrix A there always is a vector y for which
k Ay k∞
= R.
k y k∞
Pn
Let k be the row of A for which j=1 | akj | is a maximum, i.e.,
n
X
| akj | = R .
j=1
Then
n n
k Ay k∞ X X
= k Ay k∞ = max | aij yj | = | akj | = R .
k y k∞ i
j=1 j=1
8
Thus we have shown that
EXAMPLE : If
1 2 −3
A = 1 0 4 ,
−1 5 1
then
k A k∞ = max{6, 5, 7} = 7 .
9
Similarly one can show that
n
k Ax k1 X
k A k1 ≡ max = max | aij |
x6=0 k x k1 j
i=1
10
One can also show that
k Ax k2
k A k2 ≡ max = max κi (A) ,
x6=0 k x k2 i
The quantities {κi (A)}ni=1 are called the singular values of the matrix A.
11
EXAMPLE : If
1 1
A= ,
0 1
then
T 1 0 1 1 1 1
A A = = .
1 1 0 1 1 2
12
If A is invertible then we also have
−1 1
kA k2 = .
mini κi (A)
κ1 ≥ κ2 ≥ · · · κn ≥ 0 ,
then
−1 1
k A k2 = κ1 , and kA k2 = .
κn
−1 1 ∼
kA k2 = q √ = 1.618 (!)
(3 − 5)/2
13
EXERCISES :
0 2
•[009] Let A = . Compute k A k2 .
0 0
1 0 0
•[010] Let A = 0 0 1 . Compute k A k2 .
0 −1 0
√
•[011] Prove that k A k2 ≤ n k A k∞ .
14
SOLUTIONS :
0 0 0 2 0 0
•[009] AT A = = ,
2 0 0 0 0 4
√
with eigenvalues λ1 = 0 and λ2 = 4 , so that k A k2 = 4 = 2 .
1 0 0 1 0 0 1 0 0
•[010] AT A = 0 0 −1 0 0 1 = 0 1 0 ,
0 1 0 0 −1 0 0 0 1
√
which has three eigenvalues equal to 1 . Thus k A k2 = 1 = 1 .
√
•[011] k A k2 ≤ n k A k∞ :
√
By a previous exercise k y k2 ≤ n k y k∞ for any vector y .
Also it is easy to see that k x k2 ≥ k x k∞ . Thus
k Ax k2 √ k Ax k∞
k A k2 ≡ max ≤ n max
x6=0 k x k2 x6=0 k x k2
√ k Ax k∞ √
≤ n max = n k A k∞ .
x6=0 k x k∞
•[012] k A k1 is equal to the maximum absolute column sum.
a11 a12 · · · a1n
PROOF : Let a21 a22 · · · a2n
A≡
···
,
··· ··· ···
an1 an2 · · · ann
and n
X
C ≡ max | aij | (maximum absolute column sum) .
j
i=1
First Pn
|
Pn
aij xj |
k Ax k1 i=1 j=1
=
k x k1 k x k1
Pn Pn
i=1 j=1 | aij || xj |
≤
k x k1
Pn Pn
j=1 i=1 | aij || xj |
=
k x k1
Pn Pn
j=1 | xj | i=1 | aij |
=
k x k1
Pn
j=1 | xj | C
≤ = C.
k x k1
PROOF : (continued · · · )
Let k be the index of the column having the maximum absolute column sum.
Let x be the zero vector, except for the kth element which is set to 1.
Then
k x k1 = 1 and k Ax k1 = C ,
so that indeed
k Ax k1
= C.
k x k1
EXERCISES :
•[013] Let A be any n by n matrix. For each of the following state whether
it is true or false. If false then give a counter example.
k A k1 ≤ k A k∞ , k A k∞ ≤ k A k1 .
15
SOLUTIONS :
1 1
For A = we have
0 0
k A k∞ = 2 and k A k1 = 1 , so k A k1 < k A k∞ ,
1 0
while for A = we have
1 0
k A k∞ = 1 and k A k1 = 2 , so k A k∞ < k A k1 .
•[014] Prove that for any square matrix A there is a vector x 6= 0 such that
k Ax k∞ = k A k∞ k x k∞ .
k Ax k
k A k = max
x6=0 kxk
automatically satisfy
(ii) k αA k = | α | k A k, ∀α ∈ R ,
(iii) k A + B k ≤ k A k + k B k .
16
PROOF of (iii) :
k (A + B)x k
kA+Bk = max
x6=0 kxk
k Ax + Bx k
= max
x6=0 kxk
k Ax k + k Bx k
≤ max
x6=0 kxk
k Ax k k Bx k
≤ max + max
x6=0 k x k x6=0 k x k
≡ kAk+kBk . QED !
17
In addition we have
(iv) k AB k ≤ k A k k B k .
PROOF of (iv) :
k (AB)x k
k AB k = max
x6=0 kxk
k A(Bx) k
= max
x6=0 kxk
k A k k Bx k
≤ max
x6=0 kxk
18
EXERCISES : Let A and B be arbitrary n by n matrices.
0 2
•[019] Let A = . Compute spr(A) , the “spectral radius” of A .
0 0
(Here spr(A) is the absolute value of the largest eigenvalue of A.)
19
SOLUTIONS :
•[017] k AB k2 = k A k2 k B k2 is false:
1 0 0 0
Take A = , and B = . Then AB is the zero matrix.
0 0 0 1
So k AB k2 = 0 , while k A k2 = 1 and k B k2 = 1 .
Let B be an n by n matrix .
kBk < 1,
then
I + B is nonsingular
and
−1 1
k (I + B) k ≤ .
1− k B k
20
PROOF :
Suppose on the contrary that I + B is singular.
Then
(I + B) y = 0 ,
for some nonzero vector y .
Hence
B y = −y ,
and
kByk kBk kyk
1 = ≤ = kBk ,
kyk kyk
Hence I + B is nonsingular.
21
We now have
(I + B) (I + B)−1 = I ,
from which
(I + B)−1 = I − B (I + B)−1 .
Hence
k (I + B)−1 k ≤ k I k + k B k k (I + B)−1 k .
(1− k B k) k (I + B)−1 k ≤ 1 ,
from which
−1 1
k (I + B) k ≤ . QED !
1− k B k
22
EXERCISES :
•[020] Consider the n by n tridiagonal matrix Tn = diag[1, 3, 1] .
For example,
3 1 0 0
1 3 1 0
T4 = 0 1 3 1
.
0 0 1 3
Use the Banach Lemma to show that Tn is invertible for all positive
integers n. Also compute an upper bound on k T−1
n k∞ .
23
•[020] SOLUTION :
We can write
Tn = diag[1 , 3 , 1] = 3 ( In + Bn ) ,
where In is the n by n identity matrix, and
1 1
Bn = diag[ , 0 , ].
3 3
Then
2
k Bn k∞ = < 1,
3
so that using the Banach Lemma we have that ( In + Bn ) is invertible,
and
1
T−1
n = (In + Bn )−1 ,
3
with
1 1
k T−1
n k∞ ≤ · 2 = 1.
3 1− 3
1 1 1 1
0 ···
n n n n
n1 0 1
n ··· 1
n
1
n
Bn = n1 1
0 ··· 1 1
.
n n n
· · · ··· · ·
1 1 1 1
n n n ··· n 0
n−1
Here k Bn k∞ = n
< 1, so In + Bn is invertible, and
−1 1 1
k (In + Bn ) k∞ ≤ = n−1 = n .
1− k Bn k∞ 1− n
EXERCISES :
•[022] Let An be the n by n symmetric matrix
2n 1 1 ··· 1 1
1 2n 1 · · · 1 1
An = 1 1 2n · · · 1 1 .
· · · ··· · ·
1 1 1 ··· 1 2n
Show An for the cases n = 2, 3. Prove that An is invertible for any
dimension n , and determine an upper bound on k A−1
n k∞ .
•[023] A square matrix is called diagonally dominant if in each row the ab-
solute value of the diagonal element is greater than the sum of the
absolute values of the off-diagonal elements. Use the Banach Lemma
to prove that a diagonally dominant matrix is invertible.
•[024] Derive an upper bound on k T−1
n k∞ for the n by n tridiagonal matrix
Tn = diag[1 , 2 + 1/n , 1] .
24
•[022] SOLUTION :
6 1 1
4 1
A2 = , A3 = 1 6 1 .
1 4
1 1 6
n−1
Here k Bn k∞ = 2n
< 1, so that In + Bn is invertible,
and
1 1 1 1 1
k A−1
n k∞ = ≤ · = · n−1 = .
2n 1− k Bn k∞ 2n 1 − 2n n+1
•[023] A diagonally dominant matrix A is invertible :
SOLUTION :
Let
A = D + E,
where D contains the diagonal entries of A , and E contains the off-
diagonal entries, that is,
a11 0 0 0 · 0 0 a12 a13 a14 · a1n
0 a22 0 0 · 0 a21 0 a23 a24 · a2n
0 0 a33 0 · 0 a a32 0 a34 · a3n
D= E = 31
0 0 0 a44 · 0 a41 a42 a43 0 · a4n
· · · · · · · · · · · ·
0 0 0 0 · ann an1 an2 an3 an4 · 0
Now
k B k∞ < 1 ,
because by the diagonal dominance assumption we have for each row that
n n
X aij 1 X
| | = | aij | < 1 i = 1, 2, · · · , n .
j=1
aii | aii | j=1
25
1 −2 −1 2 x1 −2
2 0 1 2 x2 = 5 subtract 2 × row 1 from row 2
2 0 4 1 x3 7 subtract 2 × row 1 from row 3
1 6 1 2 x4 16 subtract 1 × row 1 from row 4
1 −2 −1 2 x1 −2
0 4 3 −2 x2 9
=
0 4 6 −3 x3 11 subtract 1 × row 2 from row 3
0 8 2 0 x4 18 subtract 2 × row 2 from row 4
1 −2 −1 2 x1 −2
0 4 3 −2 x2 9
=
0 0 3 −1 x3 2
0 0 −4 4 x4 0 subtract − 43 × row 3 f rom row 4
1 −2 −1 2 x1 −2
0 4 3 −2 x2 9
=
0 0 3 −1 x3 2
0 0 0 8/3 x4 8/3
26
The bold-face numbers at the top left of each submatrix are the pivots :
1 −2 −1 2
0 4 3 −2
0 0 3 −1
0 0 0 8/3
27
The upper triangular system
1 −2 −1 2 x1 −2
0 4 3 −2 x2 = 9 ,
0 0 3 −1 x3 2
0 0 0 8/3 x4 8/3
x4 = (8/3)/(8/3) = 1 ,
x3 = [2 − (−1)1]/3 = 1 ,
x2 = [9 − (−2)1 − (3)1]/4 = 2 ,
28
Operation Count.
Using Gauss elimination for general n by n matrices, counting multiplications
and divisions only (and treating these as equivalent).
29
(ii) Backsubstitution :
• • • • x1 •
◦ ⋆ ⋆ ⋆ x2 = ⋆
◦ ◦ ⋆ ⋆ x3 ⋆
◦ ◦ ◦ ⋆ x4 ⋆
n(n + 1)
1 + 2 + ··· + n = .
2
Taking the total of triangularization and backsubstitution we obtain
n(n − 1)(2n + 5) n(n + 1) n3 2 n
+ = + n − . (Check !)
6 2 3 3
EXAMPLES :
if n = 10 , then the total is 430,
if n = 100 , then the total is 343 430,
if n = 1000, then the total is 336 333 430.
For large values of n the dominant term in the total operation count is n3 /3.
30
Reconsider the Gauss elimination procedure for solving the system
Ax = f ,
given by
1 −2 −1 2 x1 −2
2 0 1 2 x2 5
= .
2 0 4 1 x3 7
1 6 1 2 x4 16
• In each step retain the equation that cancels the operation performed.
31
I A x I f
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
0 1 0 02 0 1 2 x2 = 0 1 0 0 5
0 0 1 02 0 4 1 x3 0 0 1 0 7
0 0 0 1 1 6 1 2 x4 0 0 0 1 16
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
2 1 0 00 4 3 −2
x2 = 2 1 0 0 9
2 0 1 00 4 6 −3 x3 2 0 1 0 11
1 0 0 1 0 8 2 0 x4 1 0 0 1 18
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
2 1 0 00 4 3 −2
x2 = 2 1 0 0 9
2 1 1 00 0 3 −1 x3 2 1 1 0 2
1 2 0 1 0 0 −4 4 x4 1 2 0 1 0
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
2 1 0 00 4 3 −2
x2 = 2 1 0 0 9
2 1 1 00 0 3 −1 x3 2 1 1 0 2
1 2 − 34 1 0 0 0 8
3 x4 1 2 − 43 1 8
3
L U x L g
32
NOTE :
• In addition we have Lg = f .
33
Using the LU-decomposition for multiple right hand sides.
f (k) , k = 1, 2, · · · , m .
Algorithm :
(i) Determine the LU-decomposition of A .
Lg(k) = f (k) ,
(ii) Solve k = 1, 2, · · · , m .
Ux(k) = g(k) ,
34
Operation Count.
n (n − 1) + (n − 1) (n − 2) + · · · + (2) (1)
n−1
X n−1
X n−1
X
= k (k + 1) = k2 + k
k=1 k=1 k=1
(n − 1) n (2n − 1) n (n − 1) n3 n
= + = − . (Check !)
6 2 3 3
35
L g f U x g
1 ◦ ◦ ◦ g1 • • • • • x1 •
⋆ 1 ◦ ◦ g2 = •
◦ ⋆ ⋆ ⋆ x2 = •
⋆ ,
⋆ 1 ◦ g3 • ◦ ◦ ⋆ ⋆ x3 •
⋆ ⋆ ⋆ 1 g4 • ◦ ◦ ◦ ⋆ x4 •
36
Tridiagonal systems.
37
This transform the tridiagonal system into the upper-triangular form
β1 c1 x1 g1
β2 c2 x2 g2
β3 c3 x3 g3
. = . .
. .
βn−1 cn−1 xn−1 gn−1
βn xn gn
gk − ck xk+1
xk = , k = n −1 , n −2 , ··· , 1 .
βk
38
The resulting LU-decomposition is
1 β1 c1
γ2 1 β2 c2
γ3 1
β3 c3
. .
. .
γn−1 1 βn−1 cn−1
γn 1 βn
39
Inverses.
A (A−1 ) = (A−1 ) A = I ,
where
1 0 · 0
0 1 · 0
I ≡
· · · ·
(the identity matrix ) .
0 0 · 1
det A 6= 0 .
40
To compute A−1 we can solve
A (A−1 ) = I ,
n3 2 n 4n3 n
+ (n) n − = − .
3 3 3 3
41
But we can omit some operations , because the right hand side vectors,
i.e., the columns of I , are very special.
42
NOTE :
To solve a n by n linear system
Ax(k) = f (k) ,
with m right hand sides, takes
n3 n
+ mn2 − operations ,
3 3
as derived earlier for the LU-decomposition algorithm.
One can also find the solution vectors by computing A−1 , and setting
x(k) = A−1 f (k) .
But this takes
n3 + mn2 operations ,
43
EXERCISES :
0 0 1 3
Let f = (4, 5, 5, 4)T . Using the matrices L and U, solve Lg = f ,
followed by Ux = g . After having computed the vector x in this way,
check your answer by verifying that x satisfies the equation T4 x = f .
•[026] How many multiplications and divisions are needed to compute the
LU-decomposition of the specific tridiagonal matrix Tn = diag[1, 3, 1]
as a function of n ? Make sure not to count unnecessary operations.
44
•[025] SOLUTION :
0 0 1 3
results in the matrices
1 0 0 0 3 1 0 0
1 1 0 0 0 8 1 0
L = 3 , and U = 3 .
0 3 0 0 21
8
1 0
8
1
8 55
0 0 21 1 0 0 0 21
11 29 55 T
Solving Lg = f gives g = (4 , 3
, 8
, 21
) ,
Tn = diag[1 , 3 , 1] ,
as a function of n ?
SOLUTION :
Only n − 1 divisions are needed, and no multiplications !
SOLUTION :
The estimated time, based only on the number of divisions, is 100 seconds.
EXERCISES :
45
•[028] SOLUTION :
•[029] SOLUTION :
The leading term for the number of ”operations” for Ax = f is n3 /3 .
Thus the estimated time µ per operation is obtained from
(103 )3
µ = 10 seconds ,
3
which gives µ = 3 · 10−8 seconds .
The leading term for solving Lg = f followed by Ux = g is n2 .
46
•[030] Suppose multiplying two general n by n matrices takes 3 seconds on a
given computer, if n = 1000. Estimate how much time it will take to
compute the LU-decomposition of such a matrix.
SOLUTION : Multiplying two n by n matrices takes n3 multipli-
cations, while LU-decomposition takes approximately n3 /3 multiplica-
tions and divisions. Thus LU-decomposition of a matrix of dimension
n = 1000 can be estimated to take one second.
SOLUTION :
n − 1 divisions .
SOLUTION :
n (n − 1)
divisions .
2
Practical Considerations.
• Memory reduction.
In an implementation of the LU decomposition algorithm, the multipliers
can be stored in the lower triangular part of the original matrix A.
47
• Row interchanges.
48
• Loss of accuracy.
More generally, loss of accuracy may occur when there are large multipliers .
EXAMPLE : Solve
0.0000001 1 x1 1
= ,
1 1 x2 2
on a “six-decimal-digit computer”.
NOTE :
• The solution is x1 ∼
= 1 , x2 ∼
= 1.
49
A “Six-decimal-digit computer ” :
is stored as
−3.33333 101
50
0.0000001 1 x1 1
=
1 1 x2 2
x2 = 1.00000E + 00 , x1 = 0.00000E + 00 .
51
Again, the remedy is to interchange rows :
1 1 x1 2
= .
0.0000001 1 x2 1
(a) Elimination :
1.000000E + 00 1.000000E + 00 x1 2.00000E + 00
= .
0 .999999E + 00 x2 .999999E + 00
52
Gauss Elimination with pivoting.
Here rows are interchanged each time a pivot element is sought, so that the
pivot is as large as possible in absolute value.
53
EXAMPLE :
2 2 1 x1 5 interchange row 1 and 3
1 0 1 x2 = 2
4 1 2 x3 7
4 1 2 x1 7
1 0 1 x2 = 2 subtract 41 row 1 f rom row 2
2 2 1 x3 5 subtract 24 row 1 f rom row 3
4 1 2 x1 7
0 −1/4 1/2 x2 = 1/4 interchange row 2 and 3
0 3/2 0 x3 3/2
4 1 2 x1 7
0 3/2 0 x2 = 3/2
0 −1/4 1/2 x3 1/4 subtract −1
6 row 2 f rom row 3
4 1 2 x1 7 backsubstitution : x1 = 1
0 3/2 0 x2 = 3/2 backsubstitution : x2 = 1
0 0 1/2 x3 1/2 backsubstitution : x3 = 1
54
EXERCISES :
•[035] Suppose that Gauss elimination with row pivoting is used to solve
2 1 0 0 x1 4
1 2 1 0 x2 8
= .
0 1 2 1 x3 12
0 0 1 2 x4 11
Are any rows actually interchanged?
Can you also answer this question for general Tn = diag[1, 2, 1] ?
55
•[035] SOLUTION :
0 0 1 2
results in the matrices
1 0 0 0 2 1 0 0
1 1 0 0 0 3
1 0
L = 2 2
0 2 1 0 , and U = 0
4
.
3
0 3
1
0 0 34 1 0 0 0 5
4
Ax = f ,
Suppose a small error is made in the right hand side, i.e., instead we solve
Ay = f + r.
56
From
Ay = f + r ,
subtract
Ax = f ,
to get
A(y − x) = r .
Thus
y − x = A−1 r ,
so that
k y − x k = k A−1 r k ≤ k A−1 k k r k .
57
EXAMPLE :
1 −1.001 x1 2.001
= ,
2.001 −2 x2 4.001
y1 = 2 , y2 = 0 .
58
Note that the small change in the right hand side has norm
k r k∞ = 0.001 .
Also note that the change in the solution is much larger , namely,
k x − y k∞ = 1 .
In this example
−666.44 333.55
A−1 ∼
= .
−666.77 333.22
Hence
k A−1 k∞ ∼
= 1000 , whereas k A k∞ ∼
= 4.
59
Errors always occur in floating point computations due to finite word length.
1
For example, on a “six digit computer” 3
is represented by 3.33333 10−1 .
(A + E)y = f + r .
60
THEOREM : Consider
Ax = f , and (A + E)y = f + r .
Assume that
1 −1
A is nonsingular, and that k E k< , i .e. , k A k kEk < 1.
k A−1 k
Then
ky−xk cond(A) k r k k E k
≤ −1
+ ,
kxk 1− k A kk E k k f k k A k
where
cond(A) ≡ k A−1 kk A k ,
61
PROOF :
First write
A + E = A (I + A−1 E) .
−1 −1 1 1
k (I + A E) k ≤ −1
≤ −1
.
1− k A E k 1− k A kk E k
62
Next
(A + E) y = f + r ,
implies
(I + A−1 E) y = A−1 (f + r) ,
so that
Then
y − x = (I + A−1 E)−1 A−1 (f + r) − (I + A−1 E)x
= (I + A−1 E)−1 x + A−1 r − x − A−1 Ex
63
Finally,
ky−xk k (I + A−1 E)−1 k k A−1 k k r k + k E kk x k
≤
kxk kxk
k A−1 k k r k
≤ −1
+kEk
1− k A k k E k k x k
k A−1 k k A k krk k E k
= −1
+
1− k A k k E k k A k k x k k A k
k A−1 k k A k k r k k E k
≤ −1
+ .
1− k A k k E k k f k k A k
64
From the above theorem we can conclude that :
ky−xk
If cond(A) is large, then the relative error can be large .
kxk
65
EXAMPLE : The 2 by 2 matrix
1 −1.001 x1
,
2.001 −2 x2
cond(A) ∼
= (4)(1000) = 4000 ,
66
Solving ill-conditioned systems numerically, if they can be solved at all on a
given computer, normally requires pivoting.
for which the condition number is approximately 4 using the infinity norm.
But solving a linear system with the above A as coefficient matrix requires
pivoting, at least on a six (decimal) digit computer.
67
EXERCISE :
•[036] Let
0 = t0 < t1 < t2 < · · · < tn = 1 ,
and let
hi = ti − ti−1 , i = 1, 2, · · · , n .
The following tridiagonal matrix arises in cubic spline interpolation
(to be discussed later) :
2(h1 + h2 ) h2
h2 2(h2 + h3 ) h3
Sn−1 = h3 2(h3 + h4 ) h4 .
· · ·
hn−1 2(hn−1 + hn )
68
•[036] SOLUTION : Write Sn−1 = Dn−1 (In−1 + Bn−1 ) , where
2(h1 + h2 )
2(h2 + h3 )
Dn−1 = 2(h3 + h4 )
· · ·
2(hn−1 + hn )
and h2
0 2(h1 +h2 )
h2 h3
2(h2 +h 3)
0 2(h2 +h3 )
Bn−1 = h3 h4 .
2(h3 +h4 ) 0 2(h3 +h4 )
· · ·
hn−1
2(hn−1 +hn ) 0
1
k Bn−1 k∞ = 2
< 1, so In−1 + Bn−1 , and hence Sn−1 , is invertible, with
k D−1
n−1 k∞ 2
k S−1
n−1 k∞ ≤ ≤ ,
1− k Bn−1 k∞ 2 mini {hi + hi+1 }
and
3 maxi {hi + hi+1 }
cond(Sn−1 ) = k Sn−1 k∞ k S−1
n−1 k∞ = ,
mini {hi + hi+1 }
which can be arbitrarily large.
EXERCISES :
•[038] Use the Banach Lemma to prove that the five-diagonal matrix
Tn = diag[1, 1, 5, 1, 1] ,
is invertible for all n ≥ 1 .
Derive an upper bound on cond(Tn ) using the matrix infinity norm.
69
EXERCISES :
70
EXERCISES :
For each of the following statements about matrices, say whether it is true
or false. Explain your answer.
71
EXERCISES :
For each of the following statements about matrices, say whether it is true
or false. Explain your answer.
•[052] If D is a diagonal matrix (i.e., its entries dij are zero if i 6= j), then
k D k1 = k D k2 = k D k∞ .
72
BRIEF SOLUTIONS : (You are expected to provide details.)
Page 69:
1
•[037] In + ǫCn is invertible if ǫ < n
, in which case
1 + ǫn
cond( In + ǫCn ) ≤ .
1 − ǫn
•[038] k T−1
n k ∞ ≤ 1 , and cond(Tn ) = k Tn k ∞ k T−1
n k∞ ≤ 9 .
Introduction.
Ax = f ,
73
We can write a system of n nonlinear equations in n unknowns as
G(x) = 0 ,
where
x , 0 ∈ Rn ,
x = (x1 , x2 , · · · , xn )T ,
74
EXAMPLES : (of possible situations) :
75
NOTE :
76
Some Methods for Scalar Nonlinear Equations.
g(x) = 0 ,
77
The Bisection Method.
Algorithm : For k = 0, 1, 2, · · · :
1
• Set z (k) = 2
(x(k)
+ y (k) ) ,
78
g(x)
(0)
g(y )
x (1)
x (0)
x
x* z (0) y (0)
y (1)
The bisection method works if g(x) is continuous in the interval [x(0) , y (0) ].
In fact we have
1
| x(k) − x∗ | ≤ k
| x(0)
− y (0)
| .
2
79
The Regula Falsi.
80
(0)
g(x)
g(y )
x (1)
x (0)
(0) x
x
*
z y (0)
y (1)
g(x )
(0)
The Regula Falsi.
z (k) is the zero of the line from (x(k) , g(x(k) )) to (y (k) , g(y (k) )) . (Check !)
81
Unlike the bisection method, not both x(k) and y (k) need converge to x∗ :
(0)
g(x)
g(y )
.
.
x (2)
x (1)
x (0) x*
(0)
x
z y (0)
y (1)
(0) z(2)
g(x ) y(2)
82
Newton’s Method.
p0 (x(0) ) = g(x(0) )
and
p′0 (x(0) ) = g ′ (x(0) ) ,
is given by
83
g(x)
g(x(0) )
p0 (x)
x*
x
x(2) x (1) (0)
x
Newton’s method
84
This procedure may now be repeated for the point x(1) .
• g ′ (x∗ ) 6= 0 ,
85
EXAMPLE :
86
The Chord Method.
The only difference is that g ′ (x) is always evaluated at the initial point x(0) .
87
g(x)
g(x(0) )
p0 (x)
x*
(2) (1) (0) x
x x x
g(x(k) )
x(k+1) = x(k) − ′ (0) .
g (x )
88
Compared to Newton’s method :
89
EXAMPLE :
With x(0) = 1.5 the Chord method for solving x2 − 2 = 0 takes the form
(k) 2
(x ) −2
x(k+1) (k)
= x − .
3
90
EXERCISES :
•[054] Show how to use Newton’s method to compute the cube root of 2.
Carry out the first few iterations, using x(0) = 0.6.
•[055] Show how to use the Chord method to compute the cube root of 2.
Carry out the first few iterations, using x(0) = 0.6.
•[057] Consider the equation sin(x) = e−x . Draw the functions sin(x) and
e−x in one graph. How many solutions are there to the above equa-
tion ? Show how one can use Newton’s method to find a solution of
the equation. Carry out the first few Newton iterations, using x(0) = 0.
91
•[054] SOLUTION : Here is a Fortran code for Newton’s method for the
cube root of 2, followed by the output. Note the rapid convergence !
g(x) = x**3 - 2
gp(x) = 3*x**2
nit = 10
x = 0.6
DO k=1,nit
x = x - g(x)/gp(x)
WRITE(6,101)k, x
ENDDO
101 FORMAT(I3,1PE16.6)
STOP
END
1 2.251852E+00
2 1.632705E+00
3 1.338558E+00
4 1.264450E+00
5 1.259937E+00
6 1.259921E+00
7 1.259921E+00
•[055] SOLUTION : Here is a Fortran code for the Chord method for
the cube root of 2, followed by the output. The iteration diverges ! It will
converge if x0 is closer to the cube root of 2, but convergence is slower than
Newton.
g(x) = x**3 - 2
nit = 5
x = 0.6
gp = 3*x**2
DO k=1,nit
x = x - g(x)/gp
WRITE(6,101)k, x
ENDDO
101 FORMAT(I3,1PE16.6)
STOP
END
1 2.251852E+00
2 -6.469230E+00
3 2.460709E+02
4 -1.379588E+07
5 2.431220E+21
•[056] SOLUTION : For the equation sin(x) = 1/x .
Drawing the graphs of sin(x) and 1/x in a single diagram, one observes
that there are infinitely many solutions.
With g(x) = sin(x) − x−1 , Newton’s method for finding a zero of g(x) is
(k+1) (k) sin(x) − x−1
x = f (x ) , where f (x) = x − .
cos(x) + x−2
Taking x(0) = π/2 gives these results :
x(1) = 0.674191 ,
x(2) = 0.962321 ,
x(3) = 1.094709 ,
x(4) = 1.113808 ,
x(5) = 1.114157 ,
x(6) = 1.114157 ,
i.e., the iteration converges to the positive solution that is nearest to zero.
•[057] SOLUTION : For the equation sin(x) = e−x .
Drawing the graphs of sin(x) and e−x in a single diagram, one observes
that there are infinitely many solutions.
x(1) = 0.47852772 ,
x(2) = 0.58415699 ,
x(3) = 0.58852512 ,
x(4) = 0.58853275 ,
x(5) = 0.58853275 ,
x(6) = 0.58853275 .
Newton’s Method for Systems of Nonlinear Equations.
g(x) = 0 .
NOTE :
92
Similarly for systems of the form
G(x) = 0 ,
93
Thus x(k+1) is the solution of the linear system
94
EXAMPLE :
x21 x2 − 1 = 0 ,
x2 − x41 = 0 .
Here
x21 x2
x1 g1 (x) g1 (x1 , x2 ) −1
x = , G(x) = = = .
x2 g2 (x) g2 (x1 , x2 ) x2 − x41
95
Hence Newton’s method for this problem takes the form
for k = 0, 1, 2, · · · .
Thus for each iteration two linear equations in two unknowns must be solved.
96
With the initial guess
(0) (0)
x1 = 2 , x2 = 2 ,
the first iteration consists of solving
(0)
8 4 ∆x1 −7
(0) = ,
−32 1 ∆x2 14
which gives
(0) (0)
∆x1 = −0.463 , ∆x2 = −0.823 ,
and then setting
(1) (0) (0)
x1 = x1 + ∆x1 = 1.537 ,
(1) (0) (0)
x2 = x2 + ∆x2 = 1.177 .
(2) (2)
After a second iteration what will x1 and x2 be ?
97
EXERCISES :
•[058] Describe in detail how Newton’s method can be used to compute solu-
tions (x1 , x2 ) of the system of two nonlinear equations
x21 + x22 − 1 = 0 ,
x2 − ex1 = 0 .
x3 − ex1 = 0 ,
x3 − ex2 = 0 .
98
•[058] SOLUTION : For the 2D problem :
x21 x22
x1 g1 (x) g1 (x1 , x2 ) + −1
x = , G(x) = = = ,
x2 g2 (x) g2 (x1 , x2 ) x2 − ex1
for k = 0, 1, 2, · · · .
•[059] SOLUTION :
becomes
A ∆x(k) = − (Ax(k) − f ) ,
99
A ∆x(k) = − (Ax(k) − f ) ,
NOTE :
100
A ∆x(k) = − (Ax(k) − f ) ,
NOTE :
101
Convergence Analysis for Scalar Equations.
Most iterative methods for solving a scalar equation g(x) = 0 can be written
Sometimes the iteration x(k+1) = x(k) − g(x(k) ) also works. In this method
f (x) = x − g(x) .
102
NOTE :
103
The iteration
known as the logistic equation, models population growth when there are
limited resources.
•[065] when 3 ≤ c ≤ 4 ?
104
In general, an iteration of the form
x(k+1) = f (x(k) ) , k = 0, 1, 2, · · · ,
is sometimes called a fixed point iteration (or a recurrence relation ,
or a discrete dynamical system) .
105
EXAMPLE :
In Newton’s method
g(x)
f (x) = x − ′ .
g (x)
that is,
g(x∗ ) = 0 ,
(assuming that g ′ (x∗ ) 6= 0 .)
106
Assuming that f has a fixed point, when does the fixed point iteration
x(k+1) = f (x(k) ) ,
converge ?
107
y
y=x
y=f(x)
0 x* x
0 (0) (1) (2)
x x x
108
y
y=x
y=f(x)
0 x
0 (3) (2) (1) (0)
x x x x x*
109
THEOREM :
Let f ′ (x) be continuous near a fixed point x∗ of f (x), and assume that
| f ′ (x∗ ) | < 1 .
x(k+1) = f (x(k) ) , k = 0, 1, 2, · · · ,
110
PROOF :
Let α ≡ | f ′ (x∗ ) |.
Then α < 1.
Iǫ ≡ [x∗ − ǫ, x∗ + ǫ] ,
such that
| f ′ (x) | ≤ β in Iǫ ,
111
Let x(0) ∈ Iǫ .
| x(1) −x∗ | = | (x(0) −x∗ )f ′ (η0 ) | = | x(0) −x∗ | | f ′ (η0 ) | ≤ β | x(0) −x∗ | .
112
Again by Taylor’s Theorem (or the Mean Value Theorem)
Hence
| x(2) − x∗ | ≤ β | x(1) − x∗ | ≤ β 2 | x(0) − x∗ | .
| x(k) − x∗ | ≤ β k | x(0) − x∗ | .
x(k) → x∗ as k → ∞ .
113
COROLLARY : Let
Iǫ ≡ [ x∗ − ǫ , x∗ + ǫ ] ,
x(k+1) = f (x(k) ) , k = 0, 1, 2, · · · .
converges to x∗ .
114
COROLLARY :
If
• x∗ is a zero of g(x) = 0 ,
• g ′ (x∗ ) 6= 0 ,
converges to x∗ .
115
PROOF :
In Newton’s method
g(x)
f (x) = x − ′ .
g (x)
Hence
116
EXAMPLE :
satisfy
x∗ = c x∗ (1 − x∗ ) .
∗ ∗ 1
x = 0, and x = 1 − .
c
117
•[066] SOLUTION : Here f (x) = c x (1 − x), so that f ′ (x) = c(1 − 2x).
The fixed points satisfy
x = c x (1 − x) .
which we can rewrite as
x [1 − c (1 − x)] = 0 .
One fixed point is x∗ = 0 , with derivative f ′ (0) = c. Thus this fixed point
is attracting when 0 ≤ c < 1, and repelling when 1 < c ≤ 4.
1
The other fixed point satisfies 1 − c(1 − x)] = 0, from which x∗ = 1 − c
,
with derivative
1 1
f ′ (1 − ) = c[1 − 2(1 − )] = 2 − c .
c c
We see that 1
′
| f (1 − ) |< 1 when 1 < c < 3 (attracting) ,
c
and
1
| f ′ (1 − ) |> 1 when 0 ≤ c < 1 and 3 < c ≤ 4 (repelling) .
c
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 0.9 .
118
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 1.7 .
119
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.46 .
120
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.561 .
121
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.6 .
122
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.77 .
123
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.89 .
124
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.99 .
125
If a fixed point iteration
x(k+1) = f (x(k) ) ,
126
THEOREM :
Let f ′ (x) be continuous near x∗ with | f ′ (x∗ ) |< 1 .
Assume that x(0) is sufficiently close to x∗ , so that the fixed point iteration
x(k+1) = f (x(k) )
converges to x∗ .
Then ek+1
lim = | f ′ (x∗ ) | (linear convergence) .
k→∞ ek
(The value of | f ′ (x∗ ) | is then called the rate of convergence.)
ek+1 1 ′′ ∗
lim 2
= | f (x ) | (quadratic convergence) .
k→∞ ek 2
127
PROOF :
Case : f ′ (x∗ ) 6= 0 :
ek+1 = | x(k+1) − x∗ |
= | f (x(k) ) − x∗ |
= | x(k) − x∗ | | f ′ (ηk ) |
= ek | f ′ (ηk ) | ,
Hence
ek+1
lim = lim | f ′ (ηk ) | = | f ′ (x∗ ) | .
k→∞ ek k→∞
128
Case : f ′ (x∗ ) = 0 :
Thus
ek+1 1 1
lim = lim | f ′′ (ηk ) | = | f ′′ (x∗ ) | QED !
k→∞ e2 k→∞ 2 2
k
129
COROLLARY :
If
• g ′ (x∗ ) 6= 0 ,
130
EXAMPLE :
√
Newton’s method for computing 2 , i.e., for computing a zero of
g(x) = x2 − 2 ,
is given by
(k+1) (k) (x(k) )2 − 2
x = x − (k)
,
2x
that is,
(x(k) )2 + 2
x(k+1) = (k)
,
2x
that is,
x(k+1) = f (x(k) ) ,
where
x2 + 2
f (x) = .
2x
131
(k+1) (k) x2 + 2
x = f (x ) , where f (x) = .
2x
We observe that :
∗
√ ∗
√
• The fixed points of f are x = + 2 and x = − 2 .
• f (x) ∼
= x/2 as | x |→ ∞ .
(Check !)
132
√
Newton’s Method for 2 as a fixed point iteration
133
√
Newton’s Method for 2 as a fixed point iteration (blow-up)
134
√
Newton’s Method for ± 2 as a fixed point iteration
135
From the last graph we see that :
√
• The iteration converges to x = + 2 for any x(0) > 0 .
∗
√
• The iteration converges to x = − 2 for any x(0) < 0 .
∗
(Check !)
136
EXAMPLE :
x∗ = x∗ − γ g(x∗ ) ,
that is,
g(x∗ ) = 0 . (Assuming γ 6= 0 .)
137
In this example
f (x) = x − γ g(x) .
| f ′ (x∗ ) | < 1 ,
i.e., if
| 1 − γ g ′ (x∗ ) | < 1 ,
i.e., if
−1 < 1 − γ g ′ (x∗ ) < 1 ,
i.e., if
−2 < − γ g ′ (x∗ ) < 0 ,
i.e., if
0 < γ g ′ (x∗ ) < 2 .
138
The convergence is quadratic if
f ′ (x∗ ) = 1 − γ g ′ (x∗ ) = 0 ,
that is, if 1
γ = γ̂ ≡ ′ ∗ .
g (x )
139
EXERCISES :
•[067] If the following fixed point iteration converges, then what number will
it converge to? Is the convergence quadratic?
2x3 + 3
x(k+1) = f (x(k) ) , where f (x) = 2
.
3x
•[069] Analytically determine all fixed points of x(k+1) = 2x(k) (1 − x(k) ). Are
these attracting or repelling? If attracting then is the convergence
linear or quadratic? Also give a graphical interpretation.
140
(k+1) (k) 2x3 +3
•[067] x = f (x ) , with f (x) = 3x2
.
SOLUTION : This is Newton’s method for the cube root of 3.
The convergence is quadratic for sufficiently close initial guess.
141
•[070] SOLUTION : Here f (x) = sin(x) , and f ′ (x) = cos(x) .
Here are some of the very long-time iterations, starting with x(0) = π/4 :
1000 0.05455137
2000 0.03864754
3000 0.03157667
4000 0.02735556
5000 0.02447269
6000 0.02234359
7000 0.02068827
8000 0.01935361
9000 0.01824787
10000 0.01731231
1
2x , when x ≤ 2
,
•[071] f (x) =
1
2(1 − x) , when x > .
2
SOLUTION :
The fixed points are x = 0 and x = 32 . Both are repelling since | f ′ (x) |= 2.
•[072] Show how to use the Chord method to compute the cube root of 5.
Carry out the first two iterations of the Chord method, using x(0) = 2 .
Analytically determine all fixed points of this Chord iteration.
For each fixed point, determine whether it is attracting or repelling.
142
•[073] SOLUTION : Newton’s method for the cube root of 2 :
x3 − 2 2(x3 + 1)
x(k+1) = f (x(k) ) , where f (x) = x − = .
3x2 3x2
- the iteration converges for “most” negative x(0) , except for a countably
infinite such x(0) , namely those for which x(k) = 0 for some k.
SOLUTION : continued · · ·
Newton’s method for the cube root of 2, with a converging iteration (red)
having x(0) = 0.35 , and the iteration (black) with x(k) = 0 for some k.
EXERCISES :
•[074] Suppose you enter any number on a calculator and then keep pushing
the cosine button. (Assume the calculator is in “radian-mode”.)
What will happen in the limit to the result shown in the display?
Give a full mathematical explanation and a graphical interpretation.
Do the same for sin(x) and tan(x).
Does the derivative test give conclusive evidence whether or not the
fixed point x = 0 is attracting?
Give a careful graphical interpretation of this fixed point iteration.
What can you say about the convergence of the fixed point iteration?
143
•[074] SOLUTION : x(k+1) = cos(x(k) ) , k = 0, 1, 2, · · · .
Here f (x) = cos(x) and f ′ (x) = − sin(x) .
There is only a single fixed point, namely, x∗ ≈ 0.739085133 .
The fixed point is attracting, with | f ′ (x∗ ) | ≈ 0.673612029 .
A careful graphical interpretation shows there is convergence to x∗
for any initial guess x(0) .
144
(k+1) 1
•[076] x = √ , k = 0, 1, 2, · · · .
x(k)
SOLUTION :
Here
− 12 ′ 1 −3
f (x) = x , and f (x) = − x 2 .
2
1
Furthermore, | f ′ (1) | = 2
, so x = 1 is attracting.
1
- | f ′ (x∗ ) | = 2
< 1 , so x∗ is attracting
- graphically we also see the iteration diverges for any x(0) < −1
- the divergence for x(0) < −1 is slow as x(k) becomes more negative
SOLUTION : continued · · ·
G(x) = 0 ,
can be written as
x(k+1) = F(x(k) ) , k =, 1, 2, · · · ,
145
EXAMPLE :
Thus
So here
F(x) = x − G′ (x)−1 G(x) .
146
NOTE : Fixed point iterations also arise as models of physical processes,
where they are called difference equations or discrete dynamical systems .
EXAMPLE :
The equations
(k+1) (k) (k) (k) (k)
x1 = λ x1 (1 − x1 ) − c1 x1 x2 ,
(k+1) (k) (k) (k)
x2 = c2 x2 + c1 x1 x2 ,
147
Derivatives :
f (x + h) − f (x) − f ′ (x)h
→ 0 as h → 0 .
h
148
Similarly for vector valued functions F(x) we say that F is differentiable at
x if there exists a matrix F′ (x) such that
and if
x ≡ (x1 , x2 , · · · , xn )T ,
then
′ ∂f i
{F (x)}i,j ≡ .
∂xj
149
THEOREM :
k F′ (x∗ ) k < 1 ,
x(k+1) = F(x(k) ) , k = 0, 1, 2, · · · ,
150
NOTE :
k F′ (x∗ ) k < 1 ,
Here
spr(F′ (x∗ )) is the spectral radius of F′ (x∗ ) ,
151
PROOF (of the Theorem) :
whenever x ∈ Bδ (x∗ ).
Here
Bδ (x∗ ) ≡ {x : k x − x∗ k ≤ δ} .
152
Let x(0) ∈ Bδ (x∗ ). Then
k x(1) − x∗ k = k F(x(0) ) − F(x∗ ) k
≤ ǫ k x(0) − x∗ k + α k x(0) − x∗ k
≤ (ǫ + α) k x(0) − x∗ k
1−α
= ( + α) k x(0) − x∗ k
2
1+α
= k x(0) − x∗ k ≤ β δ ,
2
where 1+α
β ≡ < 1.
2
Thus x(1) ∈ Bδ (x∗ ) .
153
Since x(1) ∈ Bδ (x∗ ) , we also have
k x(2) − x∗ k ≤ β k x(1) − x∗ k ≤ β2 δ .
k x(k) − x∗ k ≤ βk δ .
154
EXAMPLE : In Newton’s method
Hence
G′ (x)F(x) = G′ (x)x − G(x) .
⇒ G′′ (x)F(x) + G′ (x)F′ (x) = G′′ (x)x + G′ (x) − G′ (x) = G′′ (x)x
⇒ F′ (x∗ ) = (G′ (x∗ ))−1 G′′ (x∗ )(G′ (x∗ ))−1 G(x∗ ) = O (zero matrix) .
because G(x∗ ) = 0.
155
Thus if
k x(k+1) − x∗ k
lim (k) ∗ 2
≤ C, for some constant C.
k→∞ k x −x k
156
EXERCISE :
(k)
This is a “predator-prey ” model, where, for example, x1 denotes the
(k)
biomass of “fish” and x2 denotes the biomass of “sharks” in year k .
(k) (k)
Numerically determine the long-time behavior of x1 and x2 for
the following values of λ :
λ = 0.5, 1.0, 1, 5, 2.0, 2.5 ,
(0) (0)
taking, for example, x1 = 0.1 and x2 = 0.1.
What can you say analytically about the fixed points of this system?
157
•[078] SOLUTION :
Here is a ”quick” Fortran code for doing the iteration. Note the use of the
”intermediate variables” t1 and t2 so that the iteration is done correctly:
rl = 2.5
x1 = 0.1
x2 = 0.1
nit = 5000
npr = 500
DO k = 1,nit
t1 = rl*x1*(1-x1) - 0.2*x1*x2
t2 = 0.9*x2 + 0.2*x1*x2
x1 = t1
x2 = t2
IF (MOD(k,npr).EQ.0) PRINT*, k, x1, x2
ENDDO
STOP
END
SOLUTION : (continued · · · )
For each of the given values of λ the iteration converges to finite values,
which are shown in the Table below. For the cases λ = 1.0 and λ = 2.0
the convergence is extremely slow, and a very large number of iterations is
needed to converge to the fixed points shown in the Table.
λ x1 x2
0.5 0.000000 0.000000
1.0 0.000000 0.000000
1.5 0.333333 0.000000
2.0 0.500000 0.000000
2.5 0.500000 1.250000
Analytically it can be seen that the fixed points in the Table are indeed
attracting (”stable”), although the fixed points for λ = 1.0 and for λ = 2.0
are “critically stable”. Indeed, the eigenvalues of the Jacobian matrix for
this problem are strictly less than 1 for λ = 0.5, 1.5, 2.5, while for λ = 1.0
and λ = 2.0 there is an eigenvalue of magnitude 1. It must be mentioned
that there are also repelling fixed points, but we leave out details on these
here.
SOLUTION : (continued · · · )
Function Norms.
k f k∞ ≡ max | f (x) | .
[a,b]
158
A function norm must satisfy :
(ii) k αf k = | α | k f k , ∀α ∈ R , ∀f ∈ C[a, b] ,
159
All of the norms above satisfy these requirements. (Check !)
EXAMPLE :
Z b
k f + g k1 = | f (x) + g(x) | dx
a
Z b
≤ | f (x) | + | g(x) | dx
a
Z b Z b
= | f (x) | dx + | g(x) | dx
a a
= k f k1 + k g k1 .
160
NOTE : If a function is “small” in a given function norm then it need not
be small in another norm.
f k (x)
x
0 1/2 1
1/2 - 1/k 2 1/2 + 1/k 2
161
Then
k fk k∞ = k → ∞ as k → ∞ ,
while
1
1
Z
k fk k1 = | fk (x) | dx = → 0 as k → ∞ ,
0 k
and
Z 1
1 p
2
k fk k2 = { fk (x) dx} 2 = 2/3 (Check !) .
0
162
EXAMPLE :
Approximate
f (x) = x3
by
3 2 1
p(x) = x − x,
2 2
on the interval [0, 1].
Then
1
p(x) = f (x) for x = 0 , , 1,
2
163
Graph of f (x) = x3 (blue) and its interpolant p(x) = 23 x2 − 12 x (red) .
164
A measure of “how close” f and p are is then given by, for example,
Z 1
1
2
k f − p k2 = { (f (x) − p(x)) dx} .
2
We find that
√
210 ∼
k f − p k2 = = 0.0345. (Check !)
420
165
The Lagrange Interpolation Polynomial.
p(xk ) = f (xk ) , k = 0, 1, · · · , n .
166
1
Graph of f (x) = 10
+ 15 x + x2 sin(2πx) (blue)
and its Lagrange interpolant p(x) ∈ P5 (red)
at six interpolation points (n = 5) .
167
The following questions arise :
168
To answer the above questions let
n
Y (x − xk )
ℓi (x) ≡ , i = 0, 1, · · · , n ,
k=0,k6=i
(xi − xk )
169
EXAMPLE : If n = 2 we have
(x − x1 )(x − x2 )
ℓ0 (x) = ,
(x0 − x1 )(x0 − x2 )
(x − x0 )(x − x2 )
ℓ1 (x) = ,
(x1 − x0 )(x1 − x2 )
and
(x − x0 )(x − x1 )
ℓ2 (x) = .
(x2 − x0 )(x2 − x1 )
NOTE : ℓi ∈ P2 , i = 0, 1, 2 , and
0 if k 6= i ,
ℓi (xk ) =
1 if k = i.
170
1 1 1
x0 x1 x2 x0 x1 x2 x0 x1 x2
171
Now given f (x) let
n
X
p(x) = f (xk ) ℓk (x) .
k=0
Then
p ∈ Pn ,
and n
X
p(xi ) = f (xk ) ℓk (xi ) = f (xi ) ,
k=0
172
THEOREM : Let f (x) be defined on [a, b] and let
a ≤ x0 < x1 < · · · < xn ≤ b .
PROOF :
Hence r(x) ≡ 0 .
173
EXAMPLE : Let f (x) = ex .
Given f (0) = 1, f (1) = 2.71828, f (2) = 7.38905, we want to approximate
f (1.5) by polynomial interpolation at x = 0, 1, 2.
Here
(1.5 − 1) (1.5 − 2) 1
ℓ0 (1.5) = = − ,
(0 − 1) (0 − 2) 8
(1.5 − 0) (1.5 − 2) 6
ℓ1 (1.5) = = ,
(1 − 0) (1 − 2) 8
(1.5 − 0) (1.5 − 1) 3
ℓ2 (1.5) = = ,
(2 − 0) (2 − 1) 8
so that
p(1.5) = f (0) ℓ0 (1.5) + f (1) ℓ1 (1.5) + f (2) ℓ2 (1.5)
1 6 3
= (1) (− ) + (2.71828) ( ) + (7.38905) ( ) = 4.68460 .
8 8 8
174
Graph of f (x) = ex (blue) and its Lagrange interpolant p(x) ∈ P2 (red).
175
THE LAGRANGE INTERPOLATION THEOREM :
Let
x0 < x1 < · · · < xn , and let x∈R.
Define
a ≡ min{x0 , x} and b ≡ max{xn , x} .
Then n
(n+1)
f (ξ) Y
f (x) − p(x) = (x − xk ) ,
(n + 1)! k=0
for some point ξ ≡ ξ(x) ∈ [a, b].
176
PROOF :
Let n
Y f (x) − p(x)
w(z) ≡ (z − xk ) and c(x) ≡ .
k=0
w(x)
177
Let
F (z) ≡ f (z) − p(z) − w(z) c(x) .
Then
and
f (x) − p(x)
F (x) = f (x) − p(x) − w(x) = 0.
w(x)
178
F
x0 x1 xk xk+1 xn
x
F’
F ’’
.
.
etc.
.
.
ξ
F (n+1)
179
Hence, by Rolle’s Theorem, F ′ (z) has n + 1 distinct zeroes in [a, b] ,
We find that F (n+1) (z) has (at least) one zero in [a, b] , say,
Hence
F (n+1) (ξ) = f (n+1) (ξ) − (n + 1)! c(x) = 0 .
It follows that
f (n+1) (ξ)
c(x) = .
(n + 1)!
180
EXAMPLE : In the last example we had
n=2, f (x) = ex , x0 = 0 , x1 = 1, x2 = 2 ,
By the Theorem
f (3) (ξ)
f (x) − p(x) = (x − 0)(x − 1)(x − 2) , ξ ∈ [0, 2] .
3!
181
Qn
The graph of wn+1 (x) = k=0 (x − xk ) for equally spaced interpolation
points in the interval [−1, 1] , for the cases n + 1 = 3 , 6 , 7 , 10 .
182
n max n max n max n max
183
EXERCISES :
•[080] Also answer the above question for equally spaced interpolation points
in [−1, 1], using the Table on the preceding page .
•[081,082] Also answer the above questions for the case of f (x) = ex in [−1, 1].
184
•[079] Consider the polynomial pn (x) of degree n or less that interpolates
sin(x) at n + 1 distinct points in [−1, 1]. For distinct, but arbitrary interpola-
tion points, how big should n be to guarantee that the maximum interpolation
error in [−1, 1] is less than 10−2 ?
SOLUTION :
(n+1) max | wn+1 (x) |
max | sin(x) − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
2n+1 ∼
≤ = 6.35 10−3 when n = 7 .
(n + 1)!
•[080] Also answer the above question for equally spaced interpolation points
in [−1, 1], using the Table on the preceding page .
SOLUTION :
(n+1) max | wn+1 (x) |
max | sin(x) − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
max | wn+1 (x) | 0.19749 ∼
≤ = = 8.23 10−3 when n = 3 .
(n + 1)! 4!
•[081] Consider the polynomial pn (x) of degree n or less that interpolates ex
at n + 1 distinct points in [−1, 1]. For distinct, but arbitrary interpolation
points, how big should n be to guarantee that the maximum interpolation
error in [−1, 1] is less than 10−2 ?
SOLUTION :
x (n+1) max | wn+1 (x) |
max | e − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
2n+1 ∼
≤ e · = 3.836 10−3 when n = 8 .
(n + 1)!
•[082] Also answer the above question for equally spaced interpolation points
in [−1, 1], using the Table on the preceding page .
SOLUTION :
x (n+1) max | wn+1 (x) |
max | e − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
max | wn+1 (x) | 0.11348 ∼
≤e · = e · = 2.5706 10−3 when n = 4 .
(n + 1)! (n + 1)!
•[083] Consider the problem of interpolating a smooth function f (x) at two
points, x0 = −h/2 and x1 = h/2, by a polynomial p ∈ P3 such that
p(x0 ) = f (x0 ), p′ (x0 ) = f ′ (x0 ), p(x1 ) = f (x1 ), p′ (x1 ) = f ′ (x1 ).
Prove that this interpolation problem has one and only one solution.
h h h h h
p( ) = c0 + c1 + c2 ( )2 + c3 ( )3 = f ( ),
2 2 2 2 2
′ h h h 2 ′ h
p (− ) = c1 − 2 c2 + 3( ) c3 = f (− ) ,
2 2 2 2
h h h h
p′ ( ) = c1 + 2 c2 + 3( )2 c3 = f ′ ( ) .
2 2 2 2
SOLUTION : continued ··· :
h2 h3
− h2
1 4 −8 c0
f0
In matrix form 1 h h2 h3
c1 f 1
2 4 8
= ′ ,
0 3h2 c2 f0
1 −h 4
0 1 h 3h2 c3 f1′
4
where the matrix can be transformed to upper-triangular form as follows :
h2 3 h2 3
− h2 − h8 − h2 − h8
1 4 1 4
1 h h2 h3 0 h3
2 4 8
h 0 4
→
0 1 3h2 3h2
−h 4
0 1 −h 4
3h2 3h2
0 1 h 4 0 1 h 4
h2 3
h2 h3
1 − h2 − h8
1 − h2 −8
4 4
0 h3 h3
h 0 4
0 h 0 4
→ →
0 0 h2 h2
−h 2
0 0 −h 2
h2 0 0 0 h2
0 0 h 2
NOTE :
• It does not follow that k f − p k∞ → 0 as n → ∞ .
• There are examples where k f − p k∞ → ∞ as n → ∞.
• For given n, we can choose the points {xk }nk=0 so k wn+1 k∞ is minimized .
185
EXAMPLE :
Let n = 1 and place the x0 and x1 symmetrically in [−1, 1] :
w2 (x)
−1 x0 0 x1 1
−η +η
Then
w2 (x) = (x − x0 ) (x − x1 ) = (x + η) (x − η) = x2 − η 2 .
k w2 k∞ ≡ max | w2 (x) |
[−1,1]
is minimized .
186
1−η2 1−η2
w2 (x)
−1 x0 0 x1 1
−η +η
−η2
1√ 1
Thus η = 2 and k w2 k∞ = .
2 2
187
In general, the points
Also
Tk+1 (x) = 2x Tk (x) − Tk−1 (x) .
188
Tk+1 (x) = 2x Tk (x) − Tk−1 (x)
189
Thus, with T0 (x) ≡ 1 and T1 (x) = x , we obtain
·
·
·
190
THE CHEBYSHEV THEOREM :
Let n
Y
wn+1 (x) ≡ (x − xk ) .
k=0
is minimized if the points {xk }nk=0 are the zeroes of Tn+1 (x) .
191
PROOF :
π
Tn+1 (x) = 0 if (n + 1) cos−1 (x) = (2k + 1) ,
2
i.e., Tn+1 (x) = 0 if
2k + 1
x = cos( π) , k = 0, 1, 2, · · · , n .
2(n + 1)
192
2k + 1
x = cos( π) , k = 0, 1, 2, · · · , n .
2(n + 1)
x 1
4
x
3
7π/10 9π/10
x 0 π/10 π
2 3π/10
x
1
x
0 −1
193
Tn+1 (x) = cos( (n + 1) cos−1 (x) ) .
Tn+1 (x) = ± 1 if
(n + 1) cos−1 (x) = kπ ,
that is, if,
k
x = cos( π) , k = 0, 1, 2, · · · , n + 1 .
n+1
194
The graph of Tn for the cases n = 2, 3, 4, 5 .
195
Recall that from the recurrence relation
Let
∗
wn+1 (x) ≡ 2−n Tn+1 (x) = (x − x0 ) (x − x1 ) · · · (x − xn ) ,
Then
k wn+1 ∗ k∞ = k 2−n Tn+1 k∞ = 2−n k Tn+1 k∞ = 2−n .
196
−n
2
w5*(x)
1
0 11
00
01010
−1 x0
0
1 00
11
x1 x2 x11
00
00
11
3 x4 1010 1
−n
−2
∗
Qn
The graph of wn+1 = k=0 (x − xk ) for the case n = 4 .
Claim :
There does not exist w ∈ Pn+1 , with leading coefficient 1, such that
k w k∞ < 2−n .
197
Suppose there does exist a wn+1 ∈ Pn+1 , with leading coefficient 1, such that
−n
2
1
0
0
1 w5*(x)
11
00
00
11
1
0
0
1
00
11
00
11 01010 11
00
00
11
1
0
0
1
−1 x 0 x1 x2 x3 x4 1
00
11 1
0
1
0 0
1
w5 (x)
−n
−2
198
Then wn+1 must intersect wn+1 ∗ at least n + 1 times in [−1, 1] .
But
(wn+1 − wn+1 ∗ ) ∈ Pn
199
n uniform Chebyshev n uniform Chebyshev
200
EXAMPLE :
Let f (x) = ex on [−1, 1] and take n = 2 .
T3 (x) = 4x3 − 3x has zeroes
1√ 1√
x0 = − 3, x1 = 0 , x2 = 3.
2 2
(0.5 − x0 )(0.5 − x2 ) 4
ℓ1 (0.5) = = ,
(x1 − x0 )(x1 − x2 ) 6
√
(0.5 − x0 )(0.5 − x1 ) 1+ 3
ℓ2 (0.5) = = .
(x2 − x0 )(x2 − x1 ) 6
201
Thus
| e0.5 − p(0.5) | ∼
= | 1.648 − 1.697 | = 0.049 .
202
Graph of f (x) = ex (blue) on the interval [−1, 1] ,
and its Lagrange interpolating polynomial p(x) ∈ P2 (red)
at three Chebyshev interpolation points (n = 2) .
203
EXAMPLE : More generally, if we interpolate
Thus
x k f (n+1) k∞
max | e − p(x) | ≤ k wn+1 k∞
x∈[−1,1] (n + 1)!
e
≤ k wn+1 k∞
(n + 1)!
e
= 2−n .
(n + 1)!
204
NOTE :
k pC − f k ∞ ≤ k pU − f k ∞ ,
205
EXERCISES :
206
SOLUTIONS :
(n+1) 2−n
•[084] max | sin(x) − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
2−n ∼
≤ = 4.34 10−5 when n = 5 .
(n + 1)!
−n
2
•[085] max | ex − pn (x) | ≤ max | f (n+1) (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
e · 2−n ∼
≤ = 8.43 10−6 when n = 6 .
(n + 1)!
Let f ∈ C n [a, b] .
207
The function ex (blue) and its Taylor polynomials pk (x) about x0 = 0 :
k = 1 : purple, k = 2 : red, k = 3 : brown, k = 4 : green, k = 5 : black .
208
As for Lagrange interpolation, we have the following questions :
209
Existence :
n
X f (k) (x0 )
p(x) = (x − x0 )k .
k=0
k!
Clearly
DEFINITION :
p(x) is called the Taylor polynomial of degree n for f (x) about the point x0 .
210
TAYLOR’s THEOREM :
Let p(x) ∈ Pn be the Taylor polynomial for f about the point x0 , i.e.,
n
X f (k) (x0 )
p(x) = (x − x0 )k .
k=0
k!
f (n+1) (ξ)
DEFINITION : Rn (x) ≡ (x − x0 )n+1
(n + 1)!
is called the Taylor remainder .
211
•[087] PROOF of TAYLOR’s THEOREM (EXERCISE !) :
• Define
f (x) − p(x)
c(x) = n+1
.
(x − x0 )
212
• Show F ′ (ξ0 ) = 0 for some ξ0 between x0 and x . Graph F ′ (z) .
• etc.
• Show how Taylor’s Theorem follows from this last step. QED !
213
EXERCISES :
•[088] Write down the Taylor polynomials pn (x) of degree n (or less) for
f (x) = ex about the point x0 = 0, for the cases n = 1, 2, 3, 4.
•[089] Do the same for f (x) = sin(x) in [0, 1] about the point x0 = 0 .
214
•[088] SOLUTION :
The Taylor polynomials pn (x) of degree n for f (x) = ex about the point
x0 = 0, for n = 1, 2, 3, 4:
p1 (x) = 1 + x,
1
p2 (x) = 1 + x + 2
x2 ,
1 1
p3 (x) = 1 + x + 2
x2 + 6
x3 ,
1 1 1
p4 (x) = 1 + x + 2
x2 + 6
x3 + 24
x4 .
The Taylor polynomials pn (x) for f (x) = sin(x) about x0 = 0 all have
odd degree, since the coefficients of even degree terms are zero :
p1 (x) = x,
1
p3 (x) = x − 6
x3 ,
1 1
p5 (x) = x − 6
x3 + 120
x5 ,
1 1 1
p7 (x) = x − 6
x3 + 120
x5 − 5040
x7 ,
1 1 1 1
p9 (x) = x − 6
x3 + 120
x5 − 5040
x7 + 362880
x9 .
(The formulas for p5 (x), p7 (x), and p9 (x) were actually not asked for.)
How big should n be so | sin(x) − pn (x) |< 10−4 for x ∈ [0, 1] ?
The error bound for odd values of n is
1 1
= ≈ 2.756 10−6 when n = 7.
(n + 2)! 362880
Graph of the function sin(x) (blue) and its Taylor polynomials pk (x)
about x0 = 0 : k = 1: purple, k = 3: red, k = 5: brown, k = 7: black .
215
•[090] SOLUTION :
f (x) = ln(x) has derivatives
f (n) (x) = (−1)n−1 (n − 1)! x−n ,
for example,
f (1) (x) = x−1 , f (2) (x) = − x−2 , f (3) (x) = 2x−3 ,
(n+1) (x − 1)n+1
| f (x) − pn (x) | = f (ξ)
(n + 1)!
n+1
n −(n+1) (x − 1)
= (−1) n! ξ
(n + 1)!
1 −(n+1) 1 n+1 1 1
≤ = ,
2 2 n+1 n+1
Does
k f − p k∞ → 0 as n → ∞ ?
216
EXAMPLE : If
1
f (x) = on [−5, 5] ,
1 + x2
10
xk = − 5 + k ∆x , k = 0, 1, · · · , n , ∆x = ,
n
k f − p k∞ → ∞ as n → ∞.
217
1
Graph of f (x) = 1+x2
on the interval [−5, 5]
and its Lagrange interpolant p(x) ∈ P9 (red)
at ten equally spaced interpolation points (n = 9) .
218
1
Graph of f (x) = 1+x2
on the interval [−5, 5]
and its Lagrange interpolant p(x) ∈ P13 (red)
at fourteen equally spaced interpolation points (n = 13) .
219
Conclusion :
Alternative :
220
For given integer N let
b−a
h ≡ ,
N
and partition [a, b] into
where
tj = a + jh , j = 0, 1, · · · , N .
221
1
Local polynomial interpolation of f (x) = 1+x2
at 3 points in 5 intervals.
222
1
Local polynomial interpolation of f (x) = 1+x2
at 3 points in 10 intervals.
223
1
Local polynomial interpolation of f (x) = 1+x2
at 2 Chebyshev points.
224
1
Local polynomial interpolation of f (x) = 1+x2
at 3 Chebyshev points.
225
By the Lagrange Interpolation Theorem
1
max | f (x) − pj (x) | ≤ max | f (n+1) (x) | max | wn+1 (x) |
[tj−1 ,tj ] (n + 1)! [tj−1 ,tj ] [tj−1 ,tj ]
where n
Y b−a
wn+1 (x) = (x − xj,i ) , h ≡ tj − tj−1 = .
i=0
N
The Tables on Page 183, 200 show values of Cn ≡ max[−1,1] | wn+1 (x) |
for uniform and for Chebyshev interpolation points.
A scaling argument shows that for uniformly spaced local interpolation points
h
max | wn+1 (x) | ≤ ( )n+1 Cn ,
[tj−1 ,tj ] 2
while for local Chebyshev points we have
h n+1 −n
max | wn+1 (x) | ≤ ( ) 2 .
[tj−1 ,tj ] 2
226
NOTE :
227
EXAMPLE : If we approximate
π
f (x) = cos(x) on [0, ],
2
by local interpolation at 3 equally spaced local interpolation points
tj−1 + tj
xj,0 = tj−1 , xj,1 = , xj,2 = tj ,
2
k f (3) k∞ h 3 1 h3
max | f (x) − pj (x) | ≤ ( ) C2 ≤ 0.3849 .
[tj−1 ,tj ] 3! 2 6 8
228
Local polynomial interpolation at 3 points in 4 intervals.
229
EXERCISES :
•[091] f (x) = sin(x) on [0, 2π] , with arbitrary local interpolation points.
•[092] f (x) = sin(x) on [0, 2π] , with equally spaced local points.
•[093] f (x) = sin(x) on [0, 2π] , with local Chebyshev interpolation points.
230
•[093] SOLUTION : Local interpolation of sin(x) at Chebyshev points :
If N is the number of intervals then their size is h = 2π/N .
The local error for sin(x) for an interval of size h is bounded by
1
| sin(x) − pn (x) | ≤ max | wn+1 (x) | .
(n + 1)!
For the reference interval [−1, 1] the maximum of | wn+1 (x) | is 2−n .
The adjusted bound for a local interval of size h is
1 h n+1
| sin(x) − pn (x) | ≤ 2−n ,
(n + 1)! 2
which for the case n = 3 gives
1 h 4 −3
| sin(x) − p3 (x) | ≤ 2 ,
4! 2
which is less than 10−4 when h < 0.74448 , that is, when N ≥ 9 .
NOTE: For arbitrary points use 2n+1 as maximum of | wn+1 (x) | in [−1, 1] ,
while for equally spaced points use instead the Table on Page 183.
NUMERICAL DIFFERENTIATION.
where n
Y (x − xk )
ℓi (x) = .
k=0,k6=i
(xi − xk )
231
EXAMPLE :
n = 2, m = 2, x =0,
x0 = − h , x1 = 0 , x2 = h .
f0 , f1 , and f2 , ( fi ≡ f (xi ) ) .
232
f (x)
h h
x0 x1 x2
In this case
f ′′ (x1 ) ∼
= p′′ (x1 ) = f0 ℓ′′0 (x1 ) + f1 ℓ′′1 (x1 ) + f2 ℓ′′2 (x1 ) .
233
f ′′ (x1 ) ∼
= f0 ℓ′′0 (x1 ) + f1 ℓ′′1 (x1 ) + f2 ℓ′′2 (x1 ) .
Here
(x − x1 )(x − x2 )
l0 (x) = ,
(x0 − x1 )(x0 − x2 )
so that 2 1
ℓ′′0 (x) = = 2 .
(x0 − x1 )(x0 − x2 ) h
In particular,
1
ℓ′′0 (x1 ) = 2 .
h
Similarly
2 1
ℓ′′1 (x1 ) = − 2 , ℓ′′2 (x1 ) = 2 . (Check !)
h h
Hence f0 − 2f1 + f2
′′ ∼
f (x1 ) = .
h 2
234
To derive an optimal error bound we use Taylor’s Theorem :
f2 − 2f1 + f0 ′′
− f1
h2
1 ′ h2 ′′ h3 ′′′ h4 ′′′′
= 2
f1 + hf1 + f1 + f1 + f (ζ1 )
h 2 6 24
− 2f1
h2 ′′ h3 ′′′ h4 ′′′′
+ f1 − hf1′ + f1 − f1 + f (ζ2 ) − f1′′
2 6 24
h2 ′′′′ h2 ′′′′
= f (ζ1 ) + f ′′′′ (ζ2 ) = f (η) ,
24 12
where η ∈ (x0 , x2 ) .
235
EXAMPLE : With n = 4 , m = 2 , and x = x2 , and reference interval
x0 = − 2h , x1 = − h , x2 = 0 , x3 = h , x4 = 2h ,
we have
4
f ′′ (x2 ) ∼
X
= fi ℓ′′i (x2 ) .
i=0
Here
(x − x1 )(x − x2 )(x − x3 )(x − x4 )
l0 (x) =
(x0 − x1 )(x0 − x2 )(x0 − x3 )(x0 − x4 )
236
Similarly
16 −30 16 −1
ℓ′′1 (x2 ) = , ℓ′′2 (x2 ) = , ℓ′′3 (x2 ) = , ℓ′′4 (x2 ) = .
12h2 12h2 12h2 12h2
(Check !)
By Taylor expansion one can show that the leading error term is
h4 f (6) (x2 )
. (Check !)
90
237
EXERCISES :
238
•[098] SOLUTION : For the formula
−3f (0) + 4f (h) − f (2h)
f ′ (0) ∼
= ,
2h
we use the notation f0 = f (0) , f1 = f (h) , and f2 = f (2h).
The Lagrange interpolating polynomial is
(x − h)(x − 2h) (x − 0)(x − 2h) (x − 0)(x − h)
p(x) = f0 + f1 + f2 ,
(0 − h)(0 − 2h) (h − 0)(h − 2h) (2h − 0)(2h − h)
so that
(x − h) + (x − 2h) (x − 0) + (x − 2h) (x − 0) + (x − h)
p′ (x) = f0 + f1 + f2 ,
(0 − h)(0 − 2h) (h − 0)(h − 2h) (2h − 0)(2h − h)
with
′ (0 − h) + (0 − 2h) (0 − 0) + (0 − 2h) (0 − 0) + (0 − h)
p (0) = f0 + f1 + f2 ,
(0 − h)(0 − 2h) (h − 0)(h − 2h) (2h − 0)(2h − h)
from which
−3h −2h −h −3f0 + 4f1 − f2
f ′ (0) ∼
= p ′
(0) = f 0 + f 1 + f 2 = .
2h2 −h2 2h2 2h
SOLUTION : continued · · · :
−3f0 + 4f1 − f2
− f0′
2h
1
= − 3f0
2h
h2 ′′ h3 ′′′
+ 4 [ f0 + hf0′
+ f0 + f0 + · · · ]
2 6
2 3
(2h) (2h)
− [ f0 + 2hf0′ + f0′′ + f0′′′ + · · · ] − f0′
2 6
1 4h3 ′′′ (2h)3 ′′′
= f0 − f0 + · · ·
2h 6 6
h2 ′′′
= − f0 + higher order terms .
3
Thus this formula is of second order accuracy.
EXERCISES :
•[099] For the reference interval [0, 3h] , give complete details on the deriva-
tion of the four weights in the numerical differentiation formula
•[100] For the reference interval [−3h/2, 3h/2], give complete details on the
derivation of the weights in the numerical differentiation formula
239
BEST APPROXIMATION IN THE k · k2 .
hx, yi ≡ x1 y1 + x2 y2 + x3 y3 .
240
• The length or norm of a vector is defined in terms of the inner product :
1
q
k x k2 ≡ hx, xi 2 = x21 + x22 + x23 .
241
Let
k e1 k2 = k e2 k2 = k e3 k2 = 1,
242
Let S2 denote the x1 , x2 -plane .
Then
S2 = Span{e1 , e2 } .
S2 is a 2-dimensional subspace of R3 .
p∗ ∈ S2 ,
to a given vector x ∈ R3 .
k x − p k2 ,
over all p ∈ S2 .
243
x
x-p*
e
1
S
2
p*
e
2
244
Geometrically we see that k x − p k2 is minimized if and only if
(x − p) ⊥ S2 ,
hx − p , e1 i = 0 , and hx − p , e2 i = 0 ,
hx , e1 i = hp , e1 i , and hx , e2 i = hp , e2 i .
245
Since p ∈ S2 we have
p = c1 e1 + c2 e2 ,
Hence
p∗ = c1 e1 + c2 e2 ,
with
c1 = hx, e1 i and c2 = hx, e2 i .
246
Best Approximation in General.
247
THEOREM :
Let X be a vector space with an inner product satisfying the properties above.
Then 1
kxk ≡ hx, xi ,2
defines a norm on X.
1 1
2
(ii) k αx k = hαx, αxi 2 = ( α hx, xi ) 2 = |α| kxk.
248
Let
hx, yi hx, yi
α ≡ = 2
, where x, y ∈ X .
hy, yi kyk
Then
0 ≤ k x − αy k2 = hx − αy, x − αyi
= k x k2 − 2αhx, yi + α2 k y k2
249
Now
k x + y k2 = hx + y, x + yi
= k x k2 + 2hx, yi + k y k2
≤ k x k2 + 2 | hx, yi | + k y k2
≤ k x k2 + 2 k x kk y k + k y k2
= ( k x k + k y k )2 .
Hence
kx+y k ≤ kxk + ky k . ( Triangle Inequality ) QED !
250
Suppose {ek }nk=1 is an orthonormal set of vectors in X , i.e.,
0, if l 6= k ,
heℓ , ek i =
1, if l = k .
Let Sn ⊂ X be defined by
Sn = Span{ek }nk=1 .
251
THEOREM :
hx, ek i
ck = , if the basis is orthogonal ,
hek , ek i
and
252
PROOF : Let n
X
F (c1 , c2 , · · · , cn ) ≡ kx− ck ek k2 .
k=1
253
We had
n
X n
X
F (c1 , c2 , · · · , cn ) = k x k2 − 2 ck hx, ek i + c2k hek , ek i .
k=1 k=1
∂F
Setting ∂cℓ
= 0 gives
−2hx, eℓ i + 2cℓ heℓ , eℓ i = 0 .
hx, eℓ i
cℓ = ,
heℓ , eℓ i
QED !
254
NOTE :
C[0, 1] with k · k∞ ,
255
Gram-Schmidt Orthogonalization.
To construct
• Set
hv2 , e1 i
e2 = v2 − e1 .
he1 , e1 i
Then
he2 , e1 i = 0 . (Check !)
256
v
2
e
1
e =v -e
2 2
< v2 , e1 >
e= ( ) e1
< e1 , e1 >
257
Inductively, suppose we have mutually orthogonal {ek }m−1
k=1 , (m ≤ n).
m−1
• Choose vm ∈ Sn linearly independent from the {ek }k=1 .
• Set
m−1
X hvm , ek i
em = vm − ek .
k=1
hek , ek i
Then
hem , eℓ i = 0 . ℓ = 1, 2, · · · , m − 1 . (Check !)
258
Best Approximation in a Function Space.
X = C[−1, 1] ,
This definition satisfies all conditions an inner product must satisfy. (Check !)
−1
is a norm on C[−1, 1] .
259
Suppose we want to find p∗ ∈ Pn that best approximates a given function
f ∈ C[−1, 1] ,
in the k · k2 .
and where the {ek }nk=0 denote the first n + 1 orthogonal polynomials .
260
Use the Gram-Schmidt procedure to construct an orthogonal basis of Pn :
Then R1
hv1 , e0 i −1
x dx
= R1 = 0.
he0 , e0 i 12 dx
−1
Hence
e1 (x) = v1 (x) − 0 · e0 (x) = x.
261
Take v2 (x) = x2 .
Then R1 2
hv2 , e0 i −1
x dx 1
= R1 = ,
he0 , e0 i 12 dx 3
−1
and R1 3
hv2 , e1 i −1
x dx
= R1 = 0.
he1 , e1 i x2 dx
−1
Hence
1 2 1
e2 (x) = v2 (x) − e0 (x) − 0 · e1 (x) = x − .
3 3
262
Take v3 (x) = x3 . Then
R13
hv3 , e0 i −1
x dx
= R1 = 0,
he0 , e0 i 2
1 dx
−1
and R14
hv3 , e1 i −1
x dx 3
= R1 = ,
he1 , e1 i x2 dx 5
−1
and R1 3 2 1
hv3 , e2 i −1
x (x − 3
) dx
= R1 = 0.
he2 , e2 i 2 1 2
(x − 3 ) dx
−1
Hence
3 3
e3 (x) = v3 (x) − 0 · e0 (x) − e1 (x) − 0 · e2 (x) = x3 − x.
5 5
etc.
263
EXAMPLE :
f (x) = ex , on [−1, 1] , in k · k2 ,
is given by
p∗ (x) = c0 e0 (x) + c1 e1 (x) + c2 e2 (x) ,
where
hf, e0 i hf, e1 i hf, e2 i
c0 = , c1 = , c2 = .
he0 , e0 i he1 , e1 i he2 , e2 i
264
We find that
R1 x
hf, e0 i −1
e dx 1 1
c0 = = R1 = (e − ) = 1.175 ,
he0 , e0 i 2
1 dx 2 e
−1
R1 x
hf, e1 i e x dx 3
c1 = = −1
R1 = (x − 1)ex |1−1 = 1.103 ,
he1 , e1 i x2 dx 2
−1
R1 x 2 1
hf, e2 i −1
e (x − 3
) dx 45 2 5 x1
c2 = = R1 = (x − 2x + )e |−1 = 0.536 .
he2 , e2 i 1
(x2 − 3 )2 dx 8 3
−1
Therefore
∗ 1 2
p (x) = 1.175 (1) + 1.103 (x) + 0.536 (x − )
3
= 0.536 x2 + 1.103 x + 0.996 . (Check !)
265
Best approximation of f (x) = ex in [−1, 1] by a polynomial p ∈ P2 .
266
EXERCISES :
267
•[101] Use the Gram-Schmidt procedure to construct an orthogonal basis of
the polynomial space P4 on the interval [−1, 1], by deriving e4 (x), given
e0 (x) = 1 , e1 (x) = x , e2 (x) = x2 − 13 , and e3 (x) = x3 − 35 x .
Hence
3
e2 (x) = v2 (x) − 5
e1 (x) = x3 − 53 x .
SOLUTION : continued · · ·
We found thet e0 (x) = 1, e1 (x) = x, and e2 (x) = x3 − 35 x .
6
R1
hf, e1 i −1
x dx 3
c1 = = R1 = ,
he1 , e1 i x2 dx 7
−1
R1 5 3 3
hf, e2 i −1
x (x − 5
x) dx 10
c2 = = R1 = .
he2 , e2 i 3 3 2
(x − 5 x) dx 9
−1
Thus
p∗ (x) = c0 e0 (x) + c1 e1 (x) + c2 e2 (x)
3 10
= 7
x + 9
(x3 − 53 x) = 10 3
9
x − 5
21
x .
Best approximation of f (x) = x5 (blue) in [−1, 1]
by a polynomial p ∈ Span{1, x, x3 } (red).
•[103] Use the Gram-Schmidt procedure to construct an orthogonal basis of
the linear space Span{1, x2 , x4 } for the interval [−1, 1]. Determine the
best approximation in the k · k2 to f (x) = x6 .
R1 6 2 1
hf, e1 i −1
x (x − 3
) dx 5 ∼
c1 = = R1 = = 0.71428573 ,
he1 , e1 i 2 1 2
(x − 3 ) dx 7
−1
R1 6 4 6 2 3
hf, e2 i x (x − x + ) dx
c2 = = −1
R1
7 35 ∼
= 1.3636366 .
he2 , e2 i 6 3
(x4 − 7 x2 + 35 )2 dx
−1
268
Most formulas are based on integrating local interpolating polynomials of f :
Z b N Z tj
f (x) dx ∼
X
= pj (x) dx ,
a j=1 tj−1
269
The Trapezoidal Rule.
If n = 1 , and if pj ∈ P1 interpolates f at tj−1 and tj , then
Z tj
h
pj (x) dx = (fj−1 + fj ) , (local integration formula) .
tj−1 2
f 1
0
0
1
0
1
11
00 11
00
00
11 00
11
pN
p1 1
0
1
0
0
1 p2 0
1
t0 t1 t2 tN
a b
270
The composite integration formula then becomes

    \int_a^b f(x) \, dx \cong \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx

        = \sum_{j=1}^{N} \frac{h}{2} (f_{j-1} + f_j)

        = h \, \Big( \frac{1}{2} f_0 + f_1 + \cdots + f_{N-1} + \frac{1}{2} f_N \Big) ,

where f_j ≡ f(t_j) .
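A minimal sketch (not from the notes) of this composite Trapezoidal Rule on a uniform mesh :

```python
# Composite Trapezoidal Rule: h * ( f0/2 + f1 + ... + f_{N-1} + fN/2 ).
import numpy as np

def trapezoid(f, a, b, N):
    t = np.linspace(a, b, N + 1)     # mesh points t_0, ..., t_N
    h = (b - a) / N
    fv = f(t)
    return h * (fv[0] / 2 + fv[1:-1].sum() + fv[-1] / 2)

print(trapezoid(np.sin, 0.0, np.pi, 100))   # ~2.0 (the exact integral is 2)
```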
In general

    p_j(x) = \sum_{i=0}^{n} f(x_{ji}) \, \ell_{ji}(x) ,

where

    \ell_{ji}(x) = \prod_{k=0, k \ne i}^{n} \frac{x - x_{jk}}{x_{ji} - x_{jk}} .

The integrals

    \int_{t_{j-1}}^{t_j} \ell_{ji}(x) \, dx ,

are the weights of the local integration formula.
Simpson's Rule.

Let n = 2 , and in each subinterval [t_{j-1} , t_j] choose the interpolation points

    t_{j-1} , \quad t_{j-\frac{1}{2}} \equiv \frac{1}{2} (t_{j-1} + t_j) , \quad and \quad t_j .

[Figure: the piecewise quadratic interpolants p_1 , p_2 , · · · , p_N of f .]

[Figure: the local quadratic p interpolating f at the points -h/2 , 0 , h/2 of the reference interval.]
The weights are

    \int_{-h/2}^{h/2} \frac{(x - 0)(x - \frac{h}{2})}{(-\frac{h}{2} - 0)(-\frac{h}{2} - \frac{h}{2})} \, dx = \frac{h}{6} ,

    \int_{-h/2}^{h/2} \frac{(x + \frac{h}{2})(x - \frac{h}{2})}{(0 + \frac{h}{2})(0 - \frac{h}{2})} \, dx = \frac{4h}{6} ,

    \int_{-h/2}^{h/2} \frac{(x + \frac{h}{2})(x - 0)}{(\frac{h}{2} + \frac{h}{2})(\frac{h}{2} - 0)} \, dx = \frac{h}{6} .

(Check !)
With uniformly spaced {t_j}_{j=0}^{N} , the composite integration formula becomes

    \int_a^b f(x) \, dx \cong \sum_{j=1}^{N} \frac{h}{6} \big( f_{j-1} + 4 f_{j-\frac{1}{2}} + f_j \big)

        = \frac{h}{6} \big( f_0 + 4 f_{\frac{1}{2}} + 2 f_1 + 4 f_{1\frac{1}{2}} + 2 f_2 + \cdots + 2 f_{N-1} + 4 f_{N-\frac{1}{2}} + f_N \big) .
The local polynomials (red) in Simpson's Rule for numerically integrating f(x) = \frac{1}{1+x^2} (blue) .
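A minimal sketch (not from the notes) of this composite Simpson's Rule, using the midpoints t_{j-1/2} :

```python
# Composite Simpson's Rule: (h/6) * sum_j ( f_{j-1} + 4 f_{j-1/2} + f_j ).
import numpy as np

def simpson(f, a, b, N):
    t = np.linspace(a, b, N + 1)        # subinterval endpoints
    tm = (t[:-1] + t[1:]) / 2           # midpoints t_{j-1/2}
    h = (b - a) / N
    return (h / 6) * np.sum(f(t[:-1]) + 4 * f(tm) + f(t[1:]))

f = lambda x: 1 / (1 + x**2)
print(simpson(f, 0.0, 1.0, 10))   # ~0.78539816 = pi/4
```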
THEOREM :

    \Big| \int_a^b f(x) \, dx - \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| \; \le \; \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+1} C_n (b-a)}{2^{n+1}} ,

where

    h = \frac{b-a}{N} ,

and where the value of the constant C_n depends on the placement of the n + 1 interpolation points within each subinterval.
PROOF : The local error is

    \Big| \int_{t_{j-1}}^{t_j} f(x) \, dx - \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| = \Big| \int_{t_{j-1}}^{t_j} f(x) - p_j(x) \, dx \Big|

        = \Big| \int_{t_{j-1}}^{t_j} \frac{f^{(n+1)}(\xi(x))}{(n+1)!} \prod_{i=0}^{n} (x - x_{ji}) \, dx \Big|

        \le \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \Big( \frac{h}{2} \Big)^{n+1} C_n \, | t_j - t_{j-1} |

        = \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+2} C_n}{2^{n+1}} .
The error in the composite formula is now easily determined :

    \Big| \int_a^b f(x) \, dx - \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| = \Big| \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} f(x) - p_j(x) \, dx \Big|

        \le \sum_{j=1}^{N} \Big| \int_{t_{j-1}}^{t_j} f(x) - p_j(x) \, dx \Big|

        \le N \, \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+2} C_n}{2^{n+1}}

        = \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+1} C_n (b-a)}{2^{n+1}} ,

where the last step uses the fact that

    h = \frac{b-a}{N} , \quad i.e. , \quad N = \frac{b-a}{h} .

QED !
Thus we have shown that

    \Big| \int_a^b f(x) \, dx - \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| \; \le \; \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+1} C_n (b-a)}{2^{n+1}} .
EXAMPLES :

For the Trapezoidal Rule ( n = 1 ) the error bound becomes

    \frac{h^2}{8} \, \| f'' \|_\infty \, (b-a) .

Indeed the Trapezoidal Rule is O(h^2) .
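This rate is easy to observe numerically; a small self-contained check (not from the notes) :

```python
# Verifying the O(h^2) behaviour of the composite Trapezoidal Rule:
# doubling N (halving h) should reduce the error by a factor of ~4.
import numpy as np

exact = 2.0                                   # integral of sin over [0, pi]
for N in (10, 20, 40, 80):
    t = np.linspace(0.0, np.pi, N + 1)
    err = abs(np.trapz(np.sin(t), t) - exact)
    print(N, err)
```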
EXAMPLE : For Simpson's Rule, Taylor expansion of the local error on the reference interval gives

    \int_{-h/2}^{h/2} f(x) \, dx - \frac{h}{6} \big[ f(-\tfrac{h}{2}) + 4 f(0) + f(\tfrac{h}{2}) \big]

    = \int_{-h/2}^{h/2} \Big[ f_0 + x f_0' + \frac{x^2}{2} f_0'' + \frac{x^3}{6} f_0''' + \frac{x^4}{24} f_0'''' + \cdots \Big] \, dx

      - \frac{h}{6} \Big[ \; f_0 - \frac{h}{2} f_0' + \frac{1}{2} \Big(\frac{h}{2}\Big)^2 f_0'' - \frac{1}{6} \Big(\frac{h}{2}\Big)^3 f_0''' + \frac{1}{24} \Big(\frac{h}{2}\Big)^4 f_0'''' + \cdots

                 + 4 f_0

                 + f_0 + \frac{h}{2} f_0' + \frac{1}{2} \Big(\frac{h}{2}\Big)^2 f_0'' + \frac{1}{6} \Big(\frac{h}{2}\Big)^3 f_0''' + \frac{1}{24} \Big(\frac{h}{2}\Big)^4 f_0'''' + \cdots \Big]

    = \Big[ x f_0 + \frac{x^2}{2} f_0' + \frac{x^3}{6} f_0'' + \frac{x^4}{24} f_0''' + \frac{x^5}{120} f_0'''' + \cdots \Big]_{-h/2}^{h/2}

      - \frac{h}{6} \Big[ 6 f_0 + \frac{h^2}{4} f_0'' + \frac{h^4}{192} f_0'''' + \cdots \Big]

    = - \frac{h^5}{2880} f_0'''' + higher \; order \; terms.

Thus the leading error term of the composite Simpson's Rule is bounded by

    \frac{h^4}{2880} \, \| f'''' \|_\infty \, (b-a) .
EXERCISE :

•[105] The Local Midpoint Rule, for numerically integrating a function f(x)
       over the reference interval [-h/2, h/2], is given by

           \int_{-h/2}^{h/2} f(x) \, dx \cong h f(0) .

       Write down the formula for the Composite Midpoint Rule for integrating
       f(x) over a general interval [a, b].
       How big must N be for the global error to be less than 10^{-6} , when
       integrating f(x) = sin(x) over the interval [0, 1] ?
•[105] SOLUTION : Taylor expand for the local error :

    \int_{-h/2}^{h/2} f(x) \, dx - h f_0

        = \int_{-h/2}^{h/2} \Big[ f_0 + x f_0' + \frac{x^2}{2} f_0'' + \cdots \Big] \, dx - h f_0

        = h f_0 + \frac{x^2}{2} f_0' \Big|_{-h/2}^{h/2} + \frac{x^3}{6} f_0'' \Big|_{-h/2}^{h/2} + \cdots - h f_0

        = \frac{h^3}{24} f_0'' + higher \; order \; terms.
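A sketch (not from the notes) of one way to complete [105]. Summing the local errors gives a global error of about (h^2/24) \| f'' \|_\infty (b-a) , and for f = sin on [0, 1] (so \| f'' \|_\infty \le 1) this drops below 10^{-6} once N \ge 205 or so :

```python
# Composite Midpoint Rule, h * sum_j f(t_{j-1/2}), with an error check.
import numpy as np

def midpoint(f, a, b, N):
    h = (b - a) / N
    tm = a + h * (np.arange(N) + 0.5)   # midpoints t_{j-1/2}
    return h * f(tm).sum()

exact = 1.0 - np.cos(1.0)               # integral of sin over [0, 1]
print(abs(midpoint(np.sin, 0.0, 1.0, 205) - exact))  # below 1e-6
```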
•[106] The local Trapezoidal Rule for the reference interval [-h/2, h/2] is

           \int_{-h/2}^{h/2} f(x) \, dx \cong \frac{h}{2} \big[ f(-h/2) + f(h/2) \big] .

       Use Taylor expansions to derive the local error formula.
THE GAUSS QUADRATURE THEOREM :

If in each subinterval [t_{j-1} , t_j] the interpolation points {x_{ji}}_{i=0}^{n} are taken as
the n + 1 zeroes of the orthogonal polynomial e_{n+1} ( relative to [t_{j-1} , t_j] ) ,
then the local integration formula has error O(h^{2n+3}) , as derived in the proof below,
so that the composite formula has error O(h^{2n+2}) .
EXAMPLE : The case n = 1 :

The two Gauss points relative to [-1, 1] are the zeroes of e_2(x) , i.e.,

    x_0 = - \frac{\sqrt{3}}{3} \quad and \quad x_1 = \frac{\sqrt{3}}{3} .

Relative to the reference interval I_h ≡ [-h/2 , h/2] the Gauss points are

    x_0 = - \frac{h \sqrt{3}}{6} \quad and \quad x_1 = \frac{h \sqrt{3}}{6} ,

where

    \ell_0(x) = \frac{x - x_1}{x_0 - x_1} = \frac{x - h\sqrt{3}/6}{-h\sqrt{3}/3} ,

and

    \ell_1(x) = \frac{x - x_0}{x_1 - x_0} = \frac{x + h\sqrt{3}/6}{h\sqrt{3}/3} .

The local integration formula is

    \int_{-h/2}^{h/2} f(x) \, dx \cong f(x_0) \int_{-h/2}^{h/2} \ell_0(x) \, dx + f(x_1) \int_{-h/2}^{h/2} \ell_1(x) \, dx = \frac{h}{2} \big[ f(x_0) + f(x_1) \big] ,

since each of the two weights equals h/2 . (Check !)
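A sketch (not from the notes) of the corresponding composite 2-point Gauss rule on [a, b] :

```python
# Composite 2-point Gauss rule: on each subinterval of width h, evaluate f at
# midpoint +/- h*sqrt(3)/6 and weight each value by h/2.
import numpy as np

def gauss2(f, a, b, N):
    h = (b - a) / N
    mid = a + h * (np.arange(N) + 0.5)   # subinterval midpoints
    d = h * np.sqrt(3) / 6               # Gauss-point offset
    return (h / 2) * (f(mid - d) + f(mid + d)).sum()

print(gauss2(np.exp, 0.0, 1.0, 4))   # ~1.7182818 = e - 1, already very accurate
```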
PROOF (of the Gauss Quadrature Theorem) :

Since p interpolates f at the n + 1 zeroes of e_{n+1} , we can write

    f(x) - p(x) = c(x) \, e_{n+1}(x) .

FACT : If f(x) is very smooth then c(x) has n + 1 continuous derivatives.

Expand c(x) in the orthogonal polynomials, call the remainder r(x) , and use the fact that each summation term is in P_n :

    c(x) = \sum_{k=0}^{n} c_k e_k(x) + r(x) .
We have

    \int_{-h/2}^{h/2} f(x) - p(x) \, dx = \int_{-h/2}^{h/2} c(x) \, e_{n+1}(x) \, dx ,

and

    c(x) = \sum_{k=0}^{n} c_k e_k(x) + r(x) .

It follows that

    \Big| \int_{-h/2}^{h/2} f(x) - p(x) \, dx \Big| = \Big| \int_{-h/2}^{h/2} \Big[ \sum_{k=0}^{n} c_k e_k(x) + r(x) \Big] e_{n+1}(x) \, dx \Big|

        = \Big| \sum_{k=0}^{n} c_k \int_{-h/2}^{h/2} e_k(x) \, e_{n+1}(x) \, dx + \int_{-h/2}^{h/2} r(x) \, e_{n+1}(x) \, dx \Big| .
Note that all terms in the summation are zero by orthogonality, so that

    \Big| \int_{-h/2}^{h/2} f(x) - p(x) \, dx \Big| = \Big| \int_{-h/2}^{h/2} r(x) \, e_{n+1}(x) \, dx \Big|

        = \Big| \int_{-h/2}^{h/2} c^{(n+1)}(\eta(x)) \, \frac{x^{n+1}}{(n+1)!} \prod_{i=0}^{n} (x - x_i) \, dx \Big|

        \le h \max_{x \in I_h} \Big| c^{(n+1)}(\eta(x)) \, \frac{x^{n+1}}{(n+1)!} \prod_{i=0}^{n} (x - x_i) \Big|

        \le h \, \frac{(h/2)^{n+1}}{(n+1)!} \max_{x \in I_h} | c^{(n+1)}(x) | \, h^{n+1}

        = \frac{h^{2n+3}}{2^{n+1} (n+1)!} \max_{x \in I_h} | c^{(n+1)}(x) | .
EXERCISE :
•[107] Give complete details on the derivation of the local 3-point Gauss
integration formula. Also write down the composite 3-point Gauss
formula for integrating a function f (x) over a general interval [a, b].
•[108] Are the following True or False for any sufficiently smooth f (x) ?
DISCRETE LEAST SQUARES APPROXIMATION

So far we measured the error of an approximation p to f in the continuous least squares sense,

    \| p - f \|_2^2 \equiv \int_{-1}^{1} [ p(x) - f(x) ]^2 \, dx .

Next we solve the discrete least squares problem : given data points

    \{ (x_i , y_i) \}_{i=1}^{N} ,

find the function p(x) , from a given class of functions, that fits the data as well as possible in the least squares sense.
Linear Least Squares

[Figure: average daily high temperature in Montreal in March (Temperature versus Day).]

Suppose that we look for a linear approximation to the temperatures :

    T_k = c_1 + c_2 k , \quad k = 1, 2, \cdots , 31 .
[Figure: average daily high temperatures, with a linear approximation.]

• There are many ways to determine such a linear approximation.

[Figure: the least squares error versus c_1 and c_2 .]
From setting the partial derivatives to zero, we have

    \sum_{k=1}^{N} \big( T_k - (c_1 + c_2 x_k) \big) = 0 , \qquad \sum_{k=1}^{N} x_k \big( T_k - (c_1 + c_2 x_k) \big) = 0 .

EXAMPLE : For our "March temperatures" example, we find

    c_1 = -2.111 \quad and \quad c_2 = 0.272 .
[Figure: average daily high temperatures, with linear least squares approximation.]
General Least Squares

More generally we can fit the data with a linear combination of given basis functions,

    p(x) = \sum_{i=1}^{n} c_i \phi_i(x) .

EXAMPLES :

• p(x) = c_1 + c_2 x . (Already done !)
For any vector x ∈ R^N we have

    \| x \|_2^2 \equiv x^T x \equiv \sum_{k=1}^{N} x_k^2 . \quad (T denotes transpose).

Then

    E_L \equiv \sum_{i=1}^{N} [ p(x_i) - y_i ]^2 = \Big\| \begin{pmatrix} p(x_1) \\ \vdots \\ p(x_N) \end{pmatrix} - \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \Big\|_2^2

        = \Big\| \begin{pmatrix} \sum_{i=1}^{n} c_i \phi_i(x_1) \\ \vdots \\ \sum_{i=1}^{n} c_i \phi_i(x_N) \end{pmatrix} - \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \Big\|_2^2

        = \Big\| \begin{pmatrix} \phi_1(x_1) & \cdots & \phi_n(x_1) \\ \vdots & & \vdots \\ \phi_1(x_N) & \cdots & \phi_n(x_N) \end{pmatrix} \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} - \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \Big\|_2^2 \equiv \| A c - y \|_2^2 .
THEOREM : The least squares error E_L is minimized when c satisfies the normal equations

    A^T A \, c = A^T y .

PROOF :

    E_L = \| A c - y \|_2^2

        = (Ac)^T Ac - (Ac)^T y - y^T Ac + y^T y

        = c^T A^T A c - c^T A^T y - y^T A c + y^T y .

Setting the gradient of E_L with respect to c equal to zero then gives

    2 A^T A c - 2 A^T y = 0 , \quad i.e. , \quad A^T A c = A^T y .
EXAMPLE : Given the data points

    \{ (x_i , y_i) \}_{i=1}^{4} = \{ (0, 1) , (1, 3) , (2, 2) , (4, 3) \} ,

find the coefficients c_1 and c_2 of p(x) = c_1 + c_2 x ,
that minimize

    E_L \equiv \sum_{i=1}^{4} [ (c_1 + c_2 x_i) - y_i ]^2 .
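A sketch (not from the notes) of how this example can be solved numerically via the normal equations :

```python
# Straight-line least squares fit for the four data points above,
# by assembling A and solving A^T A c = A^T y.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 3.0])

A = np.column_stack([np.ones_like(x), x])    # columns: phi_1 = 1, phi_2 = x
c = np.linalg.solve(A.T @ A, A.T @ y)
print(c)   # c1 = 1.6, c2 = 13/35 ~ 0.3714
```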
EXAMPLE : Given the same data points, find the coefficients of

    p(x) = c_1 + c_2 x + c_3 x^2 ,

that minimize

    E_L \equiv \sum_{i=1}^{4} [ (c_1 + c_2 x_i + c_3 x_i^2) - y_i ]^2 .

SOLUTION : Here

    N = 4 , \quad n = 3 , \quad \phi_1(x) = 1 , \quad \phi_2(x) = x , \quad \phi_3(x) = x^2 .

Use the Theorem :

    \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 1 & 4 & 16 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 4 & 16 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 1 & 4 & 16 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \\ 2 \\ 3 \end{pmatrix} ,

or

    \begin{pmatrix} 4 & 7 & 21 \\ 7 & 21 & 73 \\ 21 & 73 & 273 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 9 \\ 19 \\ 59 \end{pmatrix} .
The least squares approximations from the preceding two examples :

[Figure: the data with the two fits; left: p(x) = c_1 + c_2 x ; right: p(x) = c_1 + c_2 x + c_3 x^2 .]
EXAMPLE : From actual data :

    January     -5
    February    -3
    March        3
    April       11
    May         19
    June        24
    July        26
    August      25
    September   20
    October     13
    November     6
    December    -2

Source : http://weather.uk.msn.com
[Figure: average daily high temperature in Montreal (by month).]
EXAMPLE : continued · · ·

The graph suggests using a 3-term least squares approximation.

[Figure: least squares fit of average daily high temperatures.]
EXAMPLE : Consider the following experimental data :

[Figure: the experimental data points (y versus x).]
EXAMPLE : continued · · ·

Suppose we try to fit the data with a function of the form

    y = c_1 x^{c_2} e^{-c_3 x} .

Note that :

• y depends nonlinearly on the parameters c_2 and c_3 .

• What to do ?

Take logarithms :

    \log y = \log c_1 + c_2 \log x - c_3 x .

Thus

• We can now use regular least squares.
EXAMPLE : continued · · ·

[Figure: the logarithm of the original y-values versus x .]
EXAMPLE : continued · · ·

We had

    y = c_1 x^{c_2} e^{-c_3 x} ,

and

    \log y = \hat{c}_1 \phi_1(x) + c_2 \phi_2(x) + c_3 \phi_3(x) ,

with

    \phi_1(x) = 1 , \quad \phi_2(x) = \log x , \quad \phi_3(x) = -x ,

and

    \hat{c}_1 = \log c_1 .

Solving the resulting linear least squares problem gives, in particular,

    c_1 = e^{\hat{c}_1} = 0.995 .
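A sketch of this transformation trick (not from the notes; the original data set is not reproduced here, so synthetic data generated from the assumed model are used) :

```python
# Fit log y = c1_hat + c2*log(x) - c3*x by linear least squares,
# then recover c1 = exp(c1_hat).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.5, 8.0, 40)
y = 1.0 * x**2 * np.exp(-1.5 * x) * np.exp(0.05 * rng.standard_normal(x.size))

A = np.column_stack([np.ones_like(x), np.log(x), -x])   # phi_1, phi_2, phi_3
coef = np.linalg.solve(A.T @ A, A.T @ np.log(y))
c1, c2, c3 = np.exp(coef[0]), coef[1], coef[2]
print(c1, c2, c3)   # close to the true values 1.0, 2.0, 1.5
```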
EXAMPLE : continued · · ·

[Figure: the least squares approximation of the transformed data (log y versus x).]
EXAMPLE : continued · · ·

[Figure: the least squares approximation shown with the original data.]
EXERCISES :
SMOOTH INTERPOLATION BY PIECEWISE POLYNOMIALS

The cubic spline that interpolates the indicated data points.
Cubic Spline Interpolation.

Find a function p(t) such that :

• p is a cubic polynomial in each interval [t_{j-1} , t_j] ,

• p ∈ C^2[a, b] ,

• p(t_j) = f(t_j) , \quad j = 0, 1, \cdots , N ,

• ⋆ : p''(t_0) = p''(t_N) = 0 .

• With the above choice of ⋆ the spline is called the natural cubic spline.
This spline is "formally well defined", because the number of conditions matches the 4N coefficients of the N local cubics :

    interpolation equations                N + 1
    C^2 continuity at interior points      3(N - 1)
    end conditions ⋆                       2
    total                                  4N .
NOTE : Splines are typically used when the number of data points

    \{ ( t_j , f_j ) \}_{j=0}^{N}

is relatively small.
Consider the interval [t_{j-1} , t_j] of size h_j .

In each interval the local cubic can be written in terms of the four values

    p_0 , \; p_1 , \; p_0'' , \; p_1'' ,

where, for the interval [t_0 , t_1] , p(t_0) = p_0 , p(t_1) = p_1 , etc.

In fact, for the interval [t_0 , t_1] , one finds the polynomial

    p_1(t) = \frac{p_0''}{6 h_1} (t_1 - t)^3 + \frac{p_1''}{6 h_1} (t - t_0)^3 + \Big( \frac{p_1}{h_1} - \frac{p_1'' h_1}{6} \Big) (t - t_0) + \Big( \frac{p_0}{h_1} - \frac{p_0'' h_1}{6} \Big) (t_1 - t) .

Indeed, p_1 ∈ P_3 , and

    p_1(t_0) = p_0 , \quad p_1(t_1) = p_1 , \quad p_1''(t_0) = p_0'' , \quad p_1''(t_1) = p_1'' . (Check !)
By construction the local polynomials p_1 and p_2 connect continuously at t_1 ,
and similarly at the other interior mesh points. Requiring the first derivatives
of consecutive local polynomials to match as well leads to consistency relations.

For consecutive intervals [t_{j-1} , t_j] and [t_j , t_{j+1}] , the consistency relation is

    h_j p_{j-1}'' + 2 (h_j + h_{j+1}) p_j'' + h_{j+1} p_{j+1}'' = 6 \Big( \frac{p_{j+1} - p_j}{h_{j+1}} - \frac{p_j - p_{j-1}}{h_j} \Big) ,

where

    h_j \equiv t_j - t_{j-1} \quad and \quad h_{j+1} \equiv t_{j+1} - t_j .

We have one such equation for each interior mesh point . The interpolation conditions fix

    p_j = f_j , \quad j = 0, 1, \cdots , N .
This gives a tridiagonal system of equations for the unknown values

    p_j'' , \quad for \quad j = 1, \cdots , N - 1 ,

namely,

    \begin{pmatrix}
    2(h_1 + h_2) & h_2 & & & \\
    h_2 & 2(h_2 + h_3) & h_3 & & \\
    & \cdot & \cdot & \cdot & \\
    & & \cdot & \cdot & \cdot \\
    & & & h_{N-1} & 2(h_{N-1} + h_N)
    \end{pmatrix}
    \begin{pmatrix} p_1'' \\ p_2'' \\ \cdot \\ p_{N-2}'' \\ p_{N-1}'' \end{pmatrix}
    =
    \begin{pmatrix} F_1 \\ F_2 \\ \cdot \\ F_{N-2} \\ F_{N-1} \end{pmatrix} ,

where

    F_j \equiv 6 \Big( \frac{f_{j+1} - f_j}{h_{j+1}} - \frac{f_j - f_{j-1}}{h_j} \Big) .
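A sketch (not from the notes) that assembles and solves this tridiagonal system for the natural cubic spline ( p_0'' = p_N'' = 0 ) on a general mesh :

```python
# Second derivatives p''_j of the natural cubic spline on mesh t with data f.
import numpy as np

def natural_spline_second_derivs(t, f):
    h = np.diff(t)                     # h_1, ..., h_N
    N = len(h)
    A = np.zeros((N - 1, N - 1))
    F = np.zeros(N - 1)
    for j in range(1, N):              # interior mesh points t_1 .. t_{N-1}
        i = j - 1
        A[i, i] = 2 * (h[j - 1] + h[j])
        if i > 0:
            A[i, i - 1] = h[j - 1]
        if i < N - 2:
            A[i, i + 1] = h[j]
        F[i] = 6 * ((f[j + 1] - f[j]) / h[j] - (f[j] - f[j - 1]) / h[j - 1])
    pdd = np.zeros(N + 1)              # p''_0 = p''_N = 0 (natural spline)
    pdd[1:N] = np.linalg.solve(A, F)   # dense solve here; tridiagonal in practice
    return pdd

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
print(natural_spline_second_derivs(t, np.sin(t)))
```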
NOTE :

• In each row the diagonal entry is bigger than the sum of the other entries.

• Thus the matrix is strictly diagonally dominant, and the tridiagonal system has a unique solution.

The cubic spline that interpolates the indicated data points.
NUMERICAL METHODS FOR INITIAL VALUE PROBLEMS

Here we discuss some basic concepts that arise in the numerical solution of
initial value problems (IVPs) in ordinary differential equations (ODEs) ,

    u'(t) = f( u(t) ) , \quad u(0) = u_0 .

Here u , f(·) ∈ R^n .
Many higher order ODEs can be rewritten as first order systems .

EXAMPLE :

    u'' = g( u(t) , u'(t) ) ,
    u(0) = u_0 ,
    u'(0) = v_0 ,

can be rewritten as

    u'(t) = v(t) ,
    v'(t) = g( u(t) , v(t) ) ,
    u(0) = u_0 , \quad v(0) = v_0 .
EXAMPLE : The equations of motion of the circular restricted three-body problem
involve the distances to the two primary bodies,

    r_1 = \sqrt{(x + \mu)^2 + y^2 + z^2} , \quad r_2 = \sqrt{(x - 1 + \mu)^2 + y^2 + z^2} .

Rewritten as a first order system, the position equations become

    x' = v_x , \quad y' = v_y , \quad z' = v_z ,

together with corresponding equations for v_x' , v_y' , v_z' .

Here \mu is the mass ratio , i.e.,

    \mu \equiv \frac{m_2}{m_1 + m_2} ,

where m_1 is the mass of the larger body, and m_2 of the smaller body.

For example,

    \mu \cong 0.01215 for the Earth-Moon system,
    \mu \cong 9.53 \cdot 10^{-4} for the Sun-Jupiter system,
    \mu \cong 3.0 \cdot 10^{-6} for the Sun-Earth system.

The larger body is located at (-\mu, 0, 0) , and the smaller body at (1-\mu, 0, 0) .

A trajectory connecting a periodic "Halo orbit" to itself.
Numerical Methods.

Let

    t_j \equiv j \, \Delta t , \quad j = 0, 1, 2, \cdots .

Below we give several basic numerical methods for solving the IVP

    u'(t) = f( u(t) ) , \quad u(0) = u_0 .

Euler's Method :

From

    \frac{u(t_{j+1}) - u(t_j)}{\Delta t} \cong u'(t_j) = f( u(t_j) ) ,

we have

    u_{j+1} = u_j + \Delta t \, f(u_j) , \quad j = 0, 1, 2, \cdots ,

where u_j denotes the numerical approximation of u(t_j) .
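A minimal sketch (not from the notes) of Euler's method for a scalar or vector u :

```python
# Euler's method for u' = f(u), u(0) = u0.
import numpy as np

def euler(f, u0, dt, nsteps):
    u = np.asarray(u0, dtype=float)
    for _ in range(nsteps):
        u = u + dt * f(u)
    return u

# Example: u' = -u, u(0) = 1; the exact value u(1) = e^{-1} ~ 0.36788.
print(euler(lambda u: -u, 1.0, 0.001, 1000))
```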
The Trapezoidal Method :

    u_{j+1} = u_j + \frac{\Delta t}{2} \big[ f(u_j) + f(u_{j+1}) \big] , \quad j = 0, 1, 2, \cdots .

Note that this method is implicit : each step requires solving an equation (in general nonlinear) for u_{j+1} .
A Two-Step (Three-Point) Backward Differentiation Formula (BDF) :

Here we have

    u_{j+1} = \frac{4}{3} u_j - \frac{1}{3} u_{j-1} + \frac{2 \Delta t}{3} f(u_{j+1}) , \quad j = 1, 2, \cdots .
A Two-Step (Three-Point) Forward Differentiation Formula :
The Improved Euler Method :

    \hat{u}_{j+1} = u_j + \Delta t \, f(u_j) ,

    u_{j+1} = u_j + \frac{\Delta t}{2} \big[ f(u_j) + f(\hat{u}_{j+1}) \big] ,

for j = 0, 1, 2, \cdots .
An Explicit 4th order accurate Runge-Kutta Method :

    k_1 = f(u_j) ,

    k_2 = f(u_j + \frac{\Delta t}{2} k_1) ,

    k_3 = f(u_j + \frac{\Delta t}{2} k_2) ,

    k_4 = f(u_j + \Delta t \, k_3) ,

    u_{j+1} = u_j + \frac{\Delta t}{6} \{ k_1 + 2 k_2 + 2 k_3 + k_4 \} ,

for j = 0, 1, 2, \cdots .
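A sketch (not from the notes) of this Runge-Kutta step; each step costs four evaluations of f :

```python
# Classical 4th order Runge-Kutta for u' = f(u).
import numpy as np

def rk4(f, u0, dt, nsteps):
    u = np.asarray(u0, dtype=float)
    for _ in range(nsteps):
        k1 = f(u)
        k2 = f(u + 0.5 * dt * k1)
        k3 = f(u + 0.5 * dt * k2)
        k4 = f(u + dt * k3)
        u = u + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

print(rk4(lambda u: -u, 1.0, 0.1, 10))   # ~0.36788, very close to e^{-1}
```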
The order of accuracy of a local formula can be found by Taylor expansion .
Stability of Numerical Approximations.

The very simple model equation

    u'(t) = 0 ,

has solution

    u(t) = u_0 , \quad (constant) .

Assume that

    u_0 \; is \; given ,

and (if m > 1) that the additional starting values u_1 , \cdots , u_{m-1} are also given.

General m-step approximation of u'(t) = 0 , u(0) = u_0 :

    \alpha_m u_{j+1} + \alpha_{m-1} u_j + \cdots + \alpha_0 u_{j+1-m} = 0 .
The difference equation

    \alpha_m u_{j+1} + \alpha_{m-1} u_j + \cdots + \alpha_0 u_{j+1-m} = 0 ,

can be solved explicitly : try solutions of the form u_j = z^j .

Then we have the

Characteristic Equation :

    \alpha_m z^m + \alpha_{m-1} z^{m-1} + \cdots + \alpha_1 z + \alpha_0 = 0 .

If its roots z_1 , z_2 , \cdots , z_m are distinct, the general solution of the difference equation is

    u_j = \gamma_1 z_1^j + \gamma_2 z_2^j + \cdots + \gamma_m z_m^j .
FACT : Suppose some root z_k of the characteristic equation satisfies | z_k | > 1 .
In such a case the u_j can become arbitrarily large in a fixed time interval
by taking \Delta t sufficiently small.

THEOREM :

A necessary condition for numerical stability of a multistep method is that
the roots of the characteristic equation

    \alpha_m z^m + \alpha_{m-1} z^{m-1} + \cdots + \alpha_0 = 0 ,

satisfy | z_k | \le 1 , with any root of modulus one being simple.
Consider the last two examples in more detail :

Case (2) : Here the characteristic roots are 1 and 1/3 , and using the initial data one finds

    u_j = \Big( \frac{3}{2} u_1 - \frac{1}{2} u_0 \Big) + \Big( \frac{3}{2} u_0 - \frac{3}{2} u_1 \Big) \Big( \frac{1}{3} \Big)^j .

If

    u_1 = u_0 ,

then we see that

    u_j = u_0 , \quad for \; all \; j .

Moreover, if

    u_1 = u_0 + \epsilon ,

then

    u_j = u_0 + \frac{3}{2} \epsilon - \frac{3}{2} \epsilon \Big( \frac{1}{3} \Big)^j ,

so a small perturbation of the initial data stays small for all j .
Case (3) : Here the general solution is

    u_j = \gamma_1 (1)^j + \gamma_2 (3)^j .

Using the initial data

    \gamma_1 + \gamma_2 = u_0 , \quad \gamma_1 + 3 \gamma_2 = u_1 ,

we find

    \gamma_1 = \frac{3}{2} u_0 - \frac{1}{2} u_1 , \quad \gamma_2 = \frac{1}{2} u_1 - \frac{1}{2} u_0 .

Hence

    u_j = \Big( \frac{3}{2} u_0 - \frac{1}{2} u_1 \Big) + \Big( \frac{1}{2} u_1 - \frac{1}{2} u_0 \Big) (3)^j .

But if u_1 = u_0 + \epsilon then u_j = u_0 - \frac{1}{2} \epsilon + \frac{1}{2} \epsilon \, 3^j ,
so an arbitrarily small perturbation \epsilon is amplified like 3^j : this method is unstable.
THEOREM :

If the roots of

    \alpha_m z^m + \alpha_{m-1} z^{m-1} + \cdots + \alpha_1 z + \alpha_0 = 0 ,

satisfy

    | z_k | \le 1 , \quad and \quad | z_k | = 1 \Rightarrow z_k \; is \; simple ,

then (for a consistent method)

    u_j \rightarrow u(t_j) \quad as \quad \Delta t \rightarrow 0 .

PROOF : Omitted.
Stiff Differential Equations.

Consider the model equation

    u'(t) = \lambda u(t) ,

with

    u(0) = u_0 .

The solution is

    u(t) = e^{\lambda t} u_0 .
Consider the case where

    Re(\lambda) \ll 0 ,

i.e., \lambda has large negative real part . Then the solution decays very rapidly,
and we don't want any z_k outside the unit disk in the complex plane.

However, for many difference formulas \Delta t must be taken very small
in order that the roots of the characteristic equation lie inside the unit disk.
More generally the IVP

    u'(t) = f( u(t) ) ,
    u(0) = u_0 ,

is called stiff if some eigenvalues \lambda_i of the Jacobian matrix

    f_u( u(t) ) ,

satisfy

    Re(\lambda_i) \ll 0 .
EXAMPLES :

We will approximate

    u'(t) = \lambda u(t) ,

by several basic methods. Assume

    \Delta t > 0 , \quad and \quad Re(\lambda) < 0 ,

so that

    Re(\Delta t \lambda) < 0 .
Explicit Euler.

Applied to

    u'(t) = \lambda u(t) ,

Euler's method gives u_{j+1} = (1 + \Delta t \lambda) u_j , so the approximations stay bounded only when

    | 1 + \Delta t \lambda | \le 1 .

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the disk of radius 1 centered at -1 , meeting the real axis at -2 and 0 .]
EXAMPLE : Take \lambda = -10^6 .

Then

    u(t) = e^{-10^6 t} u_0 ,

which decays extremely rapidly. Nevertheless explicit Euler requires

    \Delta t \lambda > -2 ,

that is,

    \Delta t < 2 \cdot 10^{-6} \; !
Implicit Euler.

Applied to u'(t) = \lambda u(t) , the implicit Euler method u_{j+1} = u_j + \Delta t \lambda u_{j+1} gives
u_{j+1} = u_j / (1 - \Delta t \lambda) , which stays bounded precisely when

    | \Delta t \lambda - 1 | \ge 1 .

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the exterior of the disk of radius 1 centered at +1 , which contains the entire negative half plane.]
Trapezoidal Method.

When applied to u'(t) = \lambda u(t) , the Trapezoidal Method gives

    \frac{1}{\Delta t} (u_{j+1} - u_j) = \frac{1}{2} \lambda (u_j + u_{j+1}) .

Thus the characteristic equation is

    \Big( 1 - \frac{1}{2} \Delta t \lambda \Big) z - \Big( 1 + \frac{1}{2} \Delta t \lambda \Big) = 0 , \quad with \; zero \quad z = \frac{1 + \frac{1}{2} \Delta t \lambda}{1 - \frac{1}{2} \Delta t \lambda} .

The region of stability is now precisely the entire negative half plane.

A disadvantage is that the decay rate becomes smaller when Re(\lambda) \rightarrow -\infty , since

    \lim_{\lambda \rightarrow -\infty} z(\lambda) = \lim_{\lambda \rightarrow -\infty} \frac{1 + \frac{1}{2} \Delta t \lambda}{1 - \frac{1}{2} \Delta t \lambda} = -1 .

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the left half plane.]
Backward Differentiation Formulas (BDF).

For the differential equation u'(t) = f(u(t)) the BDF take the form

    \frac{1}{\Delta t} \sum_{i=0}^{m} \alpha_i u_{j+1-i} = f(u_{j+1}) .

The \{\alpha_i\}_{i=0}^{m} are chosen so the order is as high as possible, namely, O(\Delta t^m) .

Let S_m denote the stability region of the m-step BDF.

m = 1, 2 : S_m contains the negative half plane.

m = 3, 4, 5, 6 : S_m contains the negative axis, but not the entire negative half plane.

m \ge 7 : These methods are unstable , even for solving u'(t) = 0 !

[Figure: stability regions of the Backward Differentiation Formulas.]
Collocation at 2 Gauss Points.

The 2-point Gauss collocation method for taking a time step for the IVP

    u'(t) = f(u(t)) , \quad u(0) = u_0 ,

is defined by finding a local polynomial p ∈ P_2 that satisfies

    p(t_j) = u_j ,

and

    p'(x_{j,i}) = f( p(x_{j,i}) ) , \quad i = 1, 2 , \quad (collocation) ,

where

    x_{j,i} = \frac{t_j + t_{j+1}}{2} \pm \frac{\Delta t \sqrt{3}}{6} ,

and then setting

    u_{j+1} = p(t_{j+1}) .
It can be shown that the stability region is

    S \equiv \Big\{ \Delta t \lambda \; : \; \Big| \frac{1 + \frac{1}{2} \Delta t \lambda + \frac{1}{12} (\Delta t \lambda)^2}{1 - \frac{1}{2} \Delta t \lambda + \frac{1}{12} (\Delta t \lambda)^2} \Big| \le 1 \Big\} ,

which is precisely the negative half plane.

All Gauss collocation methods have this property and thus are A-stable .

However,

    \lim_{\lambda \rightarrow -\infty} z(\Delta t \lambda) = 1 ,

so very rapidly decaying solution components are damped only weakly.

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the left half plane.]
BOUNDARY VALUE PROBLEMS IN ODEs

EXAMPLE : Consider the two-point boundary value problem

    y''(x) - y(x) = -5 \sin(2x) ,

    y(0) = 0 , \quad y(\pi) = 0 ,

with exact solution

    y(x) = \sin(2x) .

Partition [0, \pi] into a grid or mesh :

    x_j = j h , \quad j = 0, 1, \cdots , N , \quad with \quad h = \frac{\pi}{N} .
We want to find approximations u_j to y(x_j) , j = 0, 1, 2, \cdots , N .
Replacing y'' by its standard difference approximation gives the finite difference equations

    u_0 = 0 ,

    \frac{u_2 - 2 u_1 + u_0}{h^2} - u_1 = -5 \sin(2 x_1) ,

    \frac{u_3 - 2 u_2 + u_1}{h^2} - u_2 = -5 \sin(2 x_2) ,

        \cdots

    \frac{u_N - 2 u_{N-1} + u_{N-2}}{h^2} - u_{N-1} = -5 \sin(2 x_{N-1}) ,

    u_N = 0 .
Write the finite difference equations as

    \frac{1}{h^2} u_{j-1} - \Big( 1 + \frac{2}{h^2} \Big) u_j + \frac{1}{h^2} u_{j+1} = -5 \sin(2 x_j) , \quad j = 1, 2, \cdots , N-1 ,

and put them in matrix form :

    \begin{pmatrix}
    -1 - \frac{2}{h^2} & \frac{1}{h^2} & & & \\
    \frac{1}{h^2} & -1 - \frac{2}{h^2} & \frac{1}{h^2} & & \\
    & \cdot & \cdot & \cdot & \\
    & & \frac{1}{h^2} & -1 - \frac{2}{h^2} & \frac{1}{h^2} \\
    & & & \frac{1}{h^2} & -1 - \frac{2}{h^2}
    \end{pmatrix}
    \begin{pmatrix} u_1 \\ u_2 \\ \cdot \\ u_{N-2} \\ u_{N-1} \end{pmatrix}
    =
    \begin{pmatrix} f_1 \\ f_2 \\ \cdot \\ f_{N-2} \\ f_{N-1} \end{pmatrix} ,

where

    \begin{pmatrix} f_1 \\ f_2 \\ \cdot \\ f_{N-2} \\ f_{N-1} \end{pmatrix}
    =
    \begin{pmatrix} -5 \sin(2 x_1) \\ -5 \sin(2 x_2) \\ \cdot \\ -5 \sin(2 x_{N-2}) \\ -5 \sin(2 x_{N-1}) \end{pmatrix} ,

and where the matrix has dimensions N-1 by N-1 .

We found that :

    A_h u_h = f_h ,

where

    A_h = diag[ \; \frac{1}{h^2} , \; -(1 + \frac{2}{h^2}) , \; \frac{1}{h^2} \; ] ,

    u_h \equiv (u_1 , u_2 , \cdots , u_{N-1})^T ,

and

    f_h \equiv (f_1 , f_2 , \cdots , f_{N-1})^T .
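A sketch (not from the notes) that solves this system and compares with the exact solution y(x) = sin(2x) :

```python
# Finite difference solve of y'' - y = -5 sin(2x), y(0) = y(pi) = 0.
import numpy as np

N = 100
h = np.pi / N
x = np.linspace(0.0, np.pi, N + 1)

A = (np.diag(np.full(N - 1, -1 - 2 / h**2)) +
     np.diag(np.full(N - 2, 1 / h**2), 1) +
     np.diag(np.full(N - 2, 1 / h**2), -1))
fh = -5 * np.sin(2 * x[1:-1])

u = np.linalg.solve(A, fh)      # dense solve; tridiagonal elimination in practice
print(np.max(np.abs(u - np.sin(2 * x[1:-1]))))   # O(h^2), consistent with 4h^2/3
```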
QUESTIONS :

• How accurate is the numerical solution, i.e. , what is

    \max_j | u_j - y(x_j) | \; ?

• How to solve the linear systems efficiently, especially when N is large ?

ANSWER : The matrix A_h is tridiagonal.
Thus the linear system can be solved by the specialized Gauss elimination
algorithm for tridiagonal systems.
• How to approximate derivatives and find the error in the approximation ?

We found that

    \tau_j \equiv \frac{y_{j+1} - 2 y_j + y_{j-1}}{h^2} - y_j'' = \frac{h^2}{12} y''''(\eta_j) .

For our example y(x) = \sin(2x) , so that | y''''(x) | \le 16 , and

    | \tau_j | \le \frac{16}{12} h^2 = \frac{4}{3} h^2 , \quad j = 1, 2, \cdots , N-1 .
• What is the actual error after solving the system ?

i.e., what is

    \max_j | u_j - y(x_j) | \; ?

ANSWER :
We already showed that

    | \tau_j | \equiv \Big| \frac{y_{j-1} - 2 y_j + y_{j+1}}{h^2} - y_j'' \Big| \le \frac{4 h^2}{3} , \quad j = 1, 2, \cdots , N-1 .

Now

    \frac{1}{h^2} y_{j-1} - \Big( 1 + \frac{2}{h^2} \Big) y_j + \frac{1}{h^2} y_{j+1}

        = \frac{y_{j+1} - 2 y_j + y_{j-1}}{h^2} - y_j

        = y_j'' + \tau_j - y_j

        = \tau_j - 5 \sin(2 x_j) .

Thus if we define

    y_h \equiv ( y_1 , y_2 , \cdots , y_{N-1} )^T , \quad and \quad \tau_h \equiv ( \tau_1 , \tau_2 , \cdots , \tau_{N-1} )^T ,

then

    A_h y_h = \tau_h + f_h .
We found that

    A_h y_h = \tau_h + f_h .

Since

    A_h u_h = f_h ,

subtraction gives

    A_h (y_h - u_h) = \tau_h .

Hence, if \| A_h^{-1} \|_\infty \le K , then

    \| y_h - u_h \|_\infty = \| A_h^{-1} \tau_h \|_\infty \le \| A_h^{-1} \|_\infty \| \tau_h \|_\infty \le K \, \frac{4 h^2}{3} .
Now write A_h as

    A_h = diag[ \; \frac{1}{h^2} , \; -1 - \frac{2}{h^2} , \; \frac{1}{h^2} \; ]

        = - \frac{h^2 + 2}{h^2} I_h + \frac{1}{h^2} T ,

where T denotes the tridiagonal matrix with zeroes on the diagonal and ones on the sub- and super-diagonal. Hence

    A_h = - \frac{h^2 + 2}{h^2} \Big[ I_h - \frac{h^2}{h^2 + 2} \frac{1}{h^2} T \Big]

        = - \frac{h^2 + 2}{h^2} \Big[ I_h - \frac{1}{h^2 + 2} T \Big] .
We have

    A_h = - \frac{h^2 + 2}{h^2} \Big[ I_h - \frac{1}{h^2 + 2} T \Big] \equiv - \frac{h^2 + 2}{h^2} ( I_h + B_h ) ,

where B_h \equiv - \frac{1}{h^2 + 2} T . Since each row of B_h contains at most two nonzero entries of size \frac{1}{h^2 + 2} ,

    \| B_h \|_\infty \le \frac{2}{h^2 + 2} < 1 ,

and it follows by the Banach Lemma that (I_h + B_h)^{-1} exists and that

    \| (I_h + B_h)^{-1} \|_\infty \le \frac{1}{1 - \frac{2}{h^2 + 2}} = \frac{h^2 + 2}{h^2} .
We have

    A_h = - \frac{h^2 + 2}{h^2} ( I_h + B_h ) ,

and

    \| (I_h + B_h)^{-1} \|_\infty \le \frac{h^2 + 2}{h^2} .

Hence

    \| A_h^{-1} \|_\infty = \Big\| \frac{-h^2}{h^2 + 2} (I_h + B_h)^{-1} \Big\|_\infty \le \frac{h^2}{h^2 + 2} \cdot \frac{h^2 + 2}{h^2} = 1 .

Thus K = 1 , and

    \| y_h - u_h \|_\infty \le \frac{4 h^2}{3} .
A Nonlinear Boundary Value Problem.

EXAMPLE : Consider the Gelfand-Bratu problem

    u''(x) + \lambda e^{u(x)} = 0 , \quad u(0) = u(1) = 0 .

The finite difference approximation

    \frac{u_{j-1} - 2 u_j + u_{j+1}}{h^2} + \lambda e^{u_j} = 0 , \quad j = 1, \cdots , N-1 , \quad u_0 = u_N = 0 ,

is a system of nonlinear algebraic equations, which we write as

    G(u) = 0 ,

where u \equiv (u_1 , u_2 , \cdots , u_{N-1})^T . Its Jacobian matrix is tridiagonal :

    G'(u) = \begin{pmatrix}
    -\frac{2}{h^2} + \lambda e^{u_1} & \frac{1}{h^2} & & \\
    \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_2} & \frac{1}{h^2} & \\
    & \cdot & \cdot & \cdot \\
    & & \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_{N-1}}
    \end{pmatrix} .
Each Newton iteration for solving the nonlinear system

    G(u) = 0 ,

consists of solving the tridiagonal linear system

    \begin{pmatrix}
    -\frac{2}{h^2} + \lambda e^{u_1^{(k)}} & \frac{1}{h^2} & & \\
    \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_2^{(k)}} & \frac{1}{h^2} & \\
    & \cdot & \cdot & \cdot \\
    & & \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_{N-1}^{(k)}}
    \end{pmatrix} \Delta u^{(k)} = - G(u^{(k)}) ,

where

    \Delta u^{(k)} \equiv (\Delta u_1^{(k)} , \Delta u_2^{(k)} , \cdots , \Delta u_{N-1}^{(k)})^T ,

and updating

    u^{(k+1)} = u^{(k)} + \Delta u^{(k)} .
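A sketch (not from the notes) of this Newton iteration for the discretized Gelfand-Bratu problem, starting from the zero function :

```python
# Newton's method for the discretized u'' + lam*e^u = 0, u(0) = u(1) = 0.
import numpy as np

lam, N = 2.0, 100
h = 1.0 / N
u = np.zeros(N - 1)                        # initial guess u = 0

for _ in range(20):
    um = np.concatenate(([0.0], u, [0.0]))             # include boundary values
    G = (um[:-2] - 2 * um[1:-1] + um[2:]) / h**2 + lam * np.exp(u)
    J = (np.diag(-2 / h**2 + lam * np.exp(u)) +
         np.diag(np.full(N - 2, 1 / h**2), 1) +
         np.diag(np.full(N - 2, 1 / h**2), -1))
    du = np.linalg.solve(J, -G)
    u += du
    if np.max(np.abs(du)) < 1e-12:
        break

print(u.max())    # maximum of the computed solution for lambda = 2
```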
Solutions of the Gelfand-Bratu equations for different values of λ.
DIFFUSION PROBLEMS

EXAMPLE : Consider the heat equation

    u_t = u_{xx} , \quad 0 < x < 1 , \quad t > 0 ,

with

    u(x, 0) = g(x) ,

    u(0, t) = u(1, t) = 0 .

First discretize in space :

    u_j'(t) = \frac{u_{j-1}(t) - 2 u_j(t) + u_{j+1}(t)}{\Delta x^2} , \quad j = 1, \cdots , N-1 ,

with

    u_j(0) = g(x_j) ,

    u_0(t) = u_N(t) = 0 ,

where

    u_j(t) \cong u(x_j , t) ,

and where ' denotes differentiation with respect to t .
[Figure: the spatial mesh 0 = x_0 < x_1 < x_2 < \cdots < x_N = 1 .]
In matrix-vector notation we can write the space-discretized equations as

    u'(t) = \frac{1}{\Delta x^2} D \, u(t) ,

where

    D \equiv \begin{pmatrix}
    -2 & 1 & & & \\
    1 & -2 & 1 & & \\
    & \cdot & \cdot & \cdot & \\
    & & 1 & -2 & 1 \\
    & & & 1 & -2
    \end{pmatrix} ,

and

    u \equiv ( u_1 , u_2 , \cdots , u_{N-2} , u_{N-1} )^T .
Now discretize in time, using the Trapezoidal Method (Crank-Nicolson) :

    \frac{u^{k+1} - u^k}{\Delta t} = \frac{1}{2 \Delta x^2} D \, \{ u^{k+1} + u^k \} ,

where

    u^k \equiv ( u_1^k , u_2^k , \cdots , u_{N-1}^k )^T ,

and

    u_j^k approximates u(x_j , t_k) .

Assume that the solution has been computed up to time t_k . Then u^{k+1} follows by solving the resulting tridiagonal linear system.
[Figure: the space-time grid : mesh points x_0 , x_1 , \cdots , x_N and time levels t_0 , t_1 , t_2 , \cdots .]
NOTE :
• We can also use an explicit method in time, for example explicit Euler.
• But this can be a bad choice because the ODE system is stiff.
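A sketch (not from the notes) of the implicit time stepping above for the heat equation, checked against the exact solution for g(x) = sin(\pi x) :

```python
# Crank-Nicolson for u_t = u_xx: (I - r D) u^{k+1} = (I + r D) u^k, r = dt/(2 dx^2).
import numpy as np

N, nsteps, dt = 50, 100, 0.001
dx = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)

D = (np.diag(np.full(N - 1, -2.0)) +
     np.diag(np.ones(N - 2), 1) + np.diag(np.ones(N - 2), -1))
I = np.eye(N - 1)
r = dt / (2 * dx**2)

u = np.sin(np.pi * x[1:-1])          # initial data g(x) = sin(pi x)
for _ in range(nsteps):
    u = np.linalg.solve(I - r * D, (I + r * D) @ u)

exact = np.exp(-np.pi**2 * dt * nsteps) * np.sin(np.pi * x[1:-1])
print(np.max(np.abs(u - exact)))     # small discretization error
```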
For the system of ODEs

    u'(t) = \frac{1}{\Delta x^2} D \, u(t) , \quad with \quad D = diag[ \; 1 , -2 , 1 \; ] ,

the stiffness is governed by the eigenvalues \lambda of \frac{1}{\Delta x^2} D . The eigenvalue problem leads to the difference equation

    \frac{1}{\Delta x^2} ( v_{j-1} - 2 v_j + v_{j+1} ) = \lambda v_j , \quad with \quad v_0 = v_N = 0 .

Try a solution of the form v_j = z^j .

The general solution of the difference equation then has the form

    v_j = c_1 z_1^j + c_2 z_1^{-j} .
From the first boundary condition we have

    v_0 = 0 \; \Rightarrow \; c_1 + c_2 = 0 .

Take

    c_1 = c \quad and \quad c_2 = -c .

Then

    v_j = c \, ( z_1^j - z_1^{-j} ) .

The second boundary condition gives

    v_N = 0 \; \Rightarrow \; c \, ( z_1^N - z_1^{-N} ) = 0 ,

from which

    z_1^{2N} = 1 \; \Rightarrow \; z_1 = e^{\frac{k 2 \pi i}{2N}} .
The eigenvalues are therefore

    \lambda_k = \frac{z + z^{-1} - 2}{\Delta x^2}

        = \frac{e^{\frac{k 2 \pi i}{2N}} + e^{-\frac{k 2 \pi i}{2N}} - 2}{\Delta x^2}

        = \frac{2 \cos( \frac{k 2 \pi}{2N} ) - 2}{\Delta x^2}

        = \frac{2 \big( \cos( \frac{k 2 \pi}{2N} ) - 1 \big)}{\Delta x^2}

        = - \frac{4}{\Delta x^2} \sin^2 \Big( \frac{k \pi}{2N} \Big) , \quad k = 1, 2, \cdots , N-1 .
415
The eigenvalue with largest negative real part is
4 2 (N − 1)π
λN −1 = − 2
sin ( ),
∆x 2N
which for large N behaves like
∼ ∗ 4
λN −1 = λ ≡ − 2
.
∆x
416
Nonlinear Diffusion Equations.

Another example is the time-dependent Gelfand-Bratu equation

    u_t = u_{xx} + \lambda e^u .

We illustrate the numerical solution procedure for the general equation

    u_t = u_{xx} + f(u) , \quad 0 < x < 1 ,

with

    u(x, 0) = g(x) ,

    u(0, t) = u(1, t) = 0 ,

where f and g are given functions.
We approximate this equation as follows : first discretize in space,

    u_j'(t) = \frac{u_{j-1}(t) - 2 u_j(t) + u_{j+1}(t)}{\Delta x^2} + f(u_j(t)) ,

for j = 1, 2, \cdots , N-1 , with

    u_j(0) = g(x_j) ,

    u_0(t) = u_N(t) = 0 .

Then discretize in time, here by the implicit Euler method :

    \frac{u_j^{k+1} - u_j^k}{\Delta t} = \frac{u_{j-1}^{k+1} - 2 u_j^{k+1} + u_{j+1}^{k+1}}{\Delta x^2} + f(u_j^{k+1}) .
Rewrite these equations as

    F_j^{k+1} \equiv u_j^{k+1} - u_j^k - \frac{\Delta t}{\Delta x^2} ( u_{j-1}^{k+1} - 2 u_j^{k+1} + u_{j+1}^{k+1} ) - \Delta t \, f(u_j^{k+1}) = 0 ,

for j = 1, 2, \cdots , N-1 ,

with

    u_0^{k+1} = 0 \quad and \quad u_N^{k+1} = 0 .
As initial approximation to u_j^{k+1} in Newton's method use

    (u_j^{k+1})^{(0)} = u_j^k , \quad j = 1, 2, \cdots , N-1 .

Each Newton iteration then consists of solving a linear tridiagonal system

    T^{k+1,(\nu)} \, \Delta u^{k+1,(\nu)} = - F^{k+1,(\nu)} ,

where

    T^{k+1,(\nu)} = \begin{pmatrix}
    1 + \frac{2 \Delta t}{\Delta x^2} - \Delta t f_u(u_1^{k+1,(\nu)}) & -\frac{\Delta t}{\Delta x^2} & & \\
    -\frac{\Delta t}{\Delta x^2} & 1 + \frac{2 \Delta t}{\Delta x^2} - \Delta t f_u(u_2^{k+1,(\nu)}) & -\frac{\Delta t}{\Delta x^2} & \\
    & \cdot & \cdot & \cdot \\
    & & -\frac{\Delta t}{\Delta x^2} & 1 + \frac{2 \Delta t}{\Delta x^2} - \Delta t f_u(u_{N-1}^{k+1,(\nu)})
    \end{pmatrix}

and

    \Delta u^{k+1,(\nu)} = \begin{pmatrix} \Delta u_1^{k+1,(\nu)} \\ \Delta u_2^{k+1,(\nu)} \\ \cdot \\ \Delta u_{N-1}^{k+1,(\nu)} \end{pmatrix} , \quad F^{k+1,(\nu)} = \begin{pmatrix} F_1^{k+1,(\nu)} \\ F_2^{k+1,(\nu)} \\ \cdot \\ F_{N-1}^{k+1,(\nu)} \end{pmatrix} .

Then set the next approximation to the solution at time t = t_{k+1} equal to

    u^{k+1,(\nu+1)} = u^{k+1,(\nu)} + \Delta u^{k+1,(\nu)} .
Time-evolution of solutions of the Gelfand-Bratu equations for \lambda = 2 .

Time-evolution of solutions of the Gelfand-Bratu equations for \lambda = 4 .