Comp 361
Comp 361
on
with Solutions
Eusebius Doedel
TABLE OF CONTENTS
In later analysis we shall need a quantity (called vector norm) that measures
the magnitude of a vector.
Let x ≡ (x1 , x2 , · · · , xn )T ∈ Rn .
n
X 1
2
k x k2 ≡ ( xk ) ,2 (the “two-norm ”, or Euclidean length)
k=1
1
k x k1 and k x k2 are special cases of
n
X 1
p
k x kp ≡ ( | xk | ) ,
p (where p is a positive integer),
k=1
2
Vector norms are required to satisfy
(ii) k αx k = | α | k x k, ∀x ∈ Rn , ∀α ∈ R,
3
All of the examples of norms given above satisfy (i) and (ii). (Check !)
= k x k∞ + k y k∞ .
4
EXERCISES :
5
•[005] Prove that k x k1 ≤ n k x k∞ .
SOLUTION :
n
X
k x k1 = | xi | ≤ n max | xi | = n k x k∞ .
i
i=1
√
•[006] Prove that k x k2 ≤ n k x k∞ .
SOLUTION : From
Xn
k x k22 = x2i ≤ n max | xi |2 = n k x k2∞ ,
i
i=1 √
it follows that k x k2 ≤ n k x k∞ .
|xk |
Note that the ratio kxk∞
equals 1 for at least one value of k.
|xk |
We may assume kxk∞
= 1 for m values of k, where 0 < m ≤ n.
|xk |
For the remaining values of k the ratio kxk∞
is strictly less than 1.
It follows that n h
X | xk | ip
→ m as p → ∞ .
k=1
k x k∞
Thus Xn h
| xk | ip p1 1
lim = lim m = 1 .
p
p→∞
k=1
k x k∞ p→∞
We also need a measure of the magnitude of a square matrix (matrix norm).
6
For specific choices of vector norm it is convenient to express the induced
matrix norm directly in terms of the elements of the matrix :
7
Next we show that for any matrix A there always is a vector y for which
k Ay k∞
= R.
k y k∞
Pn
Let k be the row of A for which j=1 | akj | is a maximum, i.e.,
n
X
| akj | = R .
j=1
Then
n n
k Ay k∞ X X
= k Ay k∞ = max | aij yj | = | akj | = R .
k y k∞ i
j=1 j=1
8
Thus we have shown that
EXAMPLE : If
1 2 −3
A = 1 0 4 ,
−1 5 1
then
k A k∞ = max{6, 5, 7} = 7 .
9
Similarly one can show that
n
k Ax k1 X
k A k1 ≡ max = max | aij |
x6=0 k x k1 j
i=1
10
One can also show that
k Ax k2
k A k2 ≡ max = max κi (A) ,
x6=0 k x k2 i
The quantities {κi (A)}ni=1 are called the singular values of the matrix A.
11
EXAMPLE : If
1 1
A= ,
0 1
then
T 1 0 1 1 1 1
A A = = .
1 1 0 1 1 2
12
If A is invertible then we also have
−1 1
kA k2 = .
mini κi (A)
κ1 ≥ κ2 ≥ · · · κn ≥ 0 ,
then
−1 1
k A k2 = κ1 , and kA k2 = .
κn
−1 1 ∼
kA k2 = q √ = 1.618 (!)
(3 − 5)/2
13
EXERCISES :
0 2
•[009] Let A = . Compute k A k2 .
0 0
1 0 0
•[010] Let A = 0 0 1 . Compute k A k2 .
0 −1 0
√
•[011] Prove that k A k2 ≤ n k A k∞ .
14
SOLUTIONS :
0 0 0 2 0 0
•[009] AT A = = ,
2 0 0 0 0 4
√
with eigenvalues λ1 = 0 and λ2 = 4 , so that k A k2 = 4 = 2 .
1 0 0 1 0 0 1 0 0
•[010] AT A = 0 0 −1 0 0 1 = 0 1 0 ,
0 1 0 0 −1 0 0 0 1
√
which has three eigenvalues equal to 1 . Thus k A k2 = 1 = 1 .
√
•[011] k A k2 ≤ n k A k∞ :
√
By a previous exercise k y k2 ≤ n k y k∞ for any vector y .
Also it is easy to see that k x k2 ≥ k x k∞ . Thus
k Ax k2 √ k Ax k∞
k A k2 ≡ max ≤ n max
x6=0 k x k2 x6=0 k x k2
√ k Ax k∞ √
≤ n max = n k A k∞ .
x6=0 k x k∞
•[012] k A k1 is equal to the maximum absolute column sum.
a11 a12 · · · a1n
PROOF : Let a21 a22 · · · a2n
A≡
···
,
··· ··· ···
an1 an2 · · · ann
and n
X
C ≡ max | aij | (maximum absolute column sum) .
j
i=1
First Pn
|
Pn
aij xj |
k Ax k1 i=1 j=1
=
k x k1 k x k1
Pn Pn
i=1 j=1 | aij || xj |
≤
k x k1
Pn Pn
j=1 i=1 | aij || xj |
=
k x k1
Pn Pn
j=1 | xj | i=1 | aij |
=
k x k1
Pn
j=1 | xj | C
≤ = C.
k x k1
PROOF : (continued · · · )
Let k be the index of the column having the maximum absolute column sum.
Let x be the zero vector, except for the kth element which is set to 1.
Then
k x k1 = 1 and k Ax k1 = C ,
so that indeed
k Ax k1
= C.
k x k1
EXERCISES :
•[013] Let A be any n by n matrix. For each of the following state whether
it is true or false. If false then give a counter example.
k A k1 ≤ k A k∞ , k A k∞ ≤ k A k1 .
15
SOLUTIONS :
1 1
For A = we have
0 0
k A k∞ = 2 and k A k1 = 1 , so k A k1 < k A k∞ ,
1 0
while for A = we have
1 0
k A k∞ = 1 and k A k1 = 2 , so k A k∞ < k A k1 .
•[014] Prove that for any square matrix A there is a vector x 6= 0 such that
k Ax k∞ = k A k∞ k x k∞ .
k Ax k
k A k = max
x6=0 kxk
automatically satisfy
(ii) k αA k = | α | k A k, ∀α ∈ R ,
(iii) k A + B k ≤ k A k + k B k .
16
PROOF of (iii) :
k (A + B)x k
kA+Bk = max
x6=0 kxk
k Ax + Bx k
= max
x6=0 kxk
k Ax k + k Bx k
≤ max
x6=0 kxk
k Ax k k Bx k
≤ max + max
x6=0 k x k x6=0 k x k
≡ kAk+kBk . QED !
17
In addition we have
(iv) k AB k ≤ k A k k B k .
PROOF of (iv) :
k (AB)x k
k AB k = max
x6=0 kxk
k A(Bx) k
= max
x6=0 kxk
k A k k Bx k
≤ max
x6=0 kxk
18
EXERCISES : Let A and B be arbitrary n by n matrices.
0 2
•[019] Let A = . Compute spr(A) , the “spectral radius” of A .
0 0
(Here spr(A) is the absolute value of the largest eigenvalue of A.)
19
SOLUTIONS :
•[017] k AB k2 = k A k2 k B k2 is false:
1 0 0 0
Take A = , and B = . Then AB is the zero matrix.
0 0 0 1
So k AB k2 = 0 , while k A k2 = 1 and k B k2 = 1 .
Let B be an n by n matrix .
kBk < 1,
then
I + B is nonsingular
and
−1 1
k (I + B) k ≤ .
1− k B k
20
PROOF :
Suppose on the contrary that I + B is singular.
Then
(I + B) y = 0 ,
for some nonzero vector y .
Hence
B y = −y ,
and
kByk kBk kyk
1 = ≤ = kBk ,
kyk kyk
Hence I + B is nonsingular.
21
We now have
(I + B) (I + B)−1 = I ,
from which
(I + B)−1 = I − B (I + B)−1 .
Hence
k (I + B)−1 k ≤ k I k + k B k k (I + B)−1 k .
(1− k B k) k (I + B)−1 k ≤ 1 ,
from which
−1 1
k (I + B) k ≤ . QED !
1− k B k
22
EXERCISES :
•[020] Consider the n by n tridiagonal matrix Tn = diag[1, 3, 1] .
For example,
3 1 0 0
1 3 1 0
T4 = 0 1 3 1
.
0 0 1 3
Use the Banach Lemma to show that Tn is invertible for all positive
integers n. Also compute an upper bound on k T−1
n k∞ .
23
•[020] SOLUTION :
We can write
Tn = diag[1 , 3 , 1] = 3 ( In + Bn ) ,
where In is the n by n identity matrix, and
1 1
Bn = diag[ , 0 , ].
3 3
Then
2
k Bn k∞ = < 1,
3
so that using the Banach Lemma we have that ( In + Bn ) is invertible,
and
1
T−1
n = (In + Bn )−1 ,
3
with
1 1
k T−1
n k∞ ≤ · 2 = 1.
3 1− 3
1 1 1 1
0 ···
n n n n
n1 0 1
n ··· 1
n
1
n
Bn = n1 1
0 ··· 1 1
.
n n n
· · · ··· · ·
1 1 1 1
n n n ··· n 0
n−1
Here k Bn k∞ = n
< 1, so In + Bn is invertible, and
−1 1 1
k (In + Bn ) k∞ ≤ = n−1 = n .
1− k Bn k∞ 1− n
EXERCISES :
•[022] Let An be the n by n symmetric matrix
2n 1 1 ··· 1 1
1 2n 1 · · · 1 1
An = 1 1 2n · · · 1 1 .
· · · ··· · ·
1 1 1 ··· 1 2n
Show An for the cases n = 2, 3. Prove that An is invertible for any
dimension n , and determine an upper bound on k A−1
n k∞ .
•[023] A square matrix is called diagonally dominant if in each row the ab-
solute value of the diagonal element is greater than the sum of the
absolute values of the off-diagonal elements. Use the Banach Lemma
to prove that a diagonally dominant matrix is invertible.
•[024] Derive an upper bound on k T−1
n k∞ for the n by n tridiagonal matrix
Tn = diag[1 , 2 + 1/n , 1] .
24
•[022] SOLUTION :
6 1 1
4 1
A2 = , A3 = 1 6 1 .
1 4
1 1 6
n−1
Here k Bn k∞ = 2n
< 1, so that In + Bn is invertible,
and
1 1 1 1 1
k A−1
n k∞ = ≤ · = · n−1 = .
2n 1− k Bn k∞ 2n 1 − 2n n+1
•[023] A diagonally dominant matrix A is invertible :
SOLUTION :
Let
A = D + E,
where D contains the diagonal entries of A , and E contains the off-
diagonal entries, that is,
a11 0 0 0 · 0 0 a12 a13 a14 · a1n
0 a22 0 0 · 0 a21 0 a23 a24 · a2n
0 0 a33 0 · 0 a a32 0 a34 · a3n
D= E = 31
0 0 0 a44 · 0 a41 a42 a43 0 · a4n
· · · · · · · · · · · ·
0 0 0 0 · ann an1 an2 an3 an4 · 0
Now
k B k∞ < 1 ,
because by the diagonal dominance assumption we have for each row that
n n
X aij 1 X
| | = | aij | < 1 i = 1, 2, · · · , n .
j=1
aii | aii | j=1
25
1 −2 −1 2 x1 −2
2 0 1 2 x2 = 5 subtract 2 × row 1 from row 2
2 0 4 1 x3 7 subtract 2 × row 1 from row 3
1 6 1 2 x4 16 subtract 1 × row 1 from row 4
1 −2 −1 2 x1 −2
0 4 3 −2 x2 9
=
0 4 6 −3 x3 11 subtract 1 × row 2 from row 3
0 8 2 0 x4 18 subtract 2 × row 2 from row 4
1 −2 −1 2 x1 −2
0 4 3 −2 x2 9
=
0 0 3 −1 x3 2
0 0 −4 4 x4 0 subtract − 43 × row 3 f rom row 4
1 −2 −1 2 x1 −2
0 4 3 −2 x2 9
=
0 0 3 −1 x3 2
0 0 0 8/3 x4 8/3
26
The bold-face numbers at the top left of each submatrix are the pivots :
1 −2 −1 2
0 4 3 −2
0 0 3 −1
0 0 0 8/3
27
The upper triangular system
1 −2 −1 2 x1 −2
0 4 3 −2 x2 = 9 ,
0 0 3 −1 x3 2
0 0 0 8/3 x4 8/3
x4 = (8/3)/(8/3) = 1 ,
x3 = [2 − (−1)1]/3 = 1 ,
x2 = [9 − (−2)1 − (3)1]/4 = 2 ,
28
Operation Count.
Using Gauss elimination for general n by n matrices, counting multiplications
and divisions only (and treating these as equivalent).
29
(ii) Backsubstitution :
• • • • x1 •
◦ ⋆ ⋆ ⋆ x2 = ⋆
◦ ◦ ⋆ ⋆ x3 ⋆
◦ ◦ ◦ ⋆ x4 ⋆
n(n + 1)
1 + 2 + ··· + n = .
2
Taking the total of triangularization and backsubstitution we obtain
n(n − 1)(2n + 5) n(n + 1) n3 2 n
+ = + n − . (Check !)
6 2 3 3
EXAMPLES :
if n = 10 , then the total is 430,
if n = 100 , then the total is 343 430,
if n = 1000, then the total is 336 333 430.
For large values of n the dominant term in the total operation count is n3 /3.
30
Reconsider the Gauss elimination procedure for solving the system
Ax = f ,
given by
1 −2 −1 2 x1 −2
2 0 1 2 x2 5
= .
2 0 4 1 x3 7
1 6 1 2 x4 16
• In each step retain the equation that cancels the operation performed.
31
I A x I f
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
0 1 0 02 0 1 2 x2 = 0 1 0 0 5
0 0 1 02 0 4 1 x3 0 0 1 0 7
0 0 0 1 1 6 1 2 x4 0 0 0 1 16
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
2 1 0 00 4 3 −2
x2 = 2 1 0 0 9
2 0 1 00 4 6 −3 x3 2 0 1 0 11
1 0 0 1 0 8 2 0 x4 1 0 0 1 18
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
2 1 0 00 4 3 −2
x2 = 2 1 0 0 9
2 1 1 00 0 3 −1 x3 2 1 1 0 2
1 2 0 1 0 0 −4 4 x4 1 2 0 1 0
1 0 0 0 1 −2 −1 2 x1 1 0 0 0 −2
2 1 0 00 4 3 −2
x2 = 2 1 0 0 9
2 1 1 00 0 3 −1 x3 2 1 1 0 2
1 2 − 34 1 0 0 0 8
3 x4 1 2 − 43 1 8
3
L U x L g
32
NOTE :
• In addition we have Lg = f .
33
Using the LU-decomposition for multiple right hand sides.
f (k) , k = 1, 2, · · · , m .
Algorithm :
(i) Determine the LU-decomposition of A .
Lg(k) = f (k) ,
(ii) Solve k = 1, 2, · · · , m .
Ux(k) = g(k) ,
34
Operation Count.
n (n − 1) + (n − 1) (n − 2) + · · · + (2) (1)
n−1
X n−1
X n−1
X
= k (k + 1) = k2 + k
k=1 k=1 k=1
(n − 1) n (2n − 1) n (n − 1) n3 n
= + = − . (Check !)
6 2 3 3
35
L g f U x g
1 ◦ ◦ ◦ g1 • • • • • x1 •
⋆ 1 ◦ ◦ g2 = •
◦ ⋆ ⋆ ⋆ x2 = •
⋆ ,
⋆ 1 ◦ g3 • ◦ ◦ ⋆ ⋆ x3 •
⋆ ⋆ ⋆ 1 g4 • ◦ ◦ ◦ ⋆ x4 •
36
Tridiagonal systems.
37
This transform the tridiagonal system into the upper-triangular form
β1 c1 x1 g1
β2 c2 x2 g2
β3 c3 x3 g3
. = . .
. .
βn−1 cn−1 xn−1 gn−1
βn xn gn
gk − ck xk+1
xk = , k = n −1 , n −2 , ··· , 1 .
βk
38
The resulting LU-decomposition is
1 β1 c1
γ2 1 β2 c2
γ3 1
β3 c3
. .
. .
γn−1 1 βn−1 cn−1
γn 1 βn
39
Inverses.
A (A−1 ) = (A−1 ) A = I ,
where
1 0 · 0
0 1 · 0
I ≡
· · · ·
(the identity matrix ) .
0 0 · 1
det A 6= 0 .
40
To compute A−1 we can solve
A (A−1 ) = I ,
n3 2 n 4n3 n
+ (n) n − = − .
3 3 3 3
41
But we can omit some operations , because the right hand side vectors,
i.e., the columns of I , are very special.
42
NOTE :
To solve a n by n linear system
Ax(k) = f (k) ,
with m right hand sides, takes
n3 n
+ mn2 − operations ,
3 3
as derived earlier for the LU-decomposition algorithm.
One can also find the solution vectors by computing A−1 , and setting
x(k) = A−1 f (k) .
But this takes
n3 + mn2 operations ,
43
EXERCISES :
0 0 1 3
Let f = (4, 5, 5, 4)T . Using the matrices L and U, solve Lg = f ,
followed by Ux = g . After having computed the vector x in this way,
check your answer by verifying that x satisfies the equation T4 x = f .
•[026] How many multiplications and divisions are needed to compute the
LU-decomposition of the specific tridiagonal matrix Tn = diag[1, 3, 1]
as a function of n ? Make sure not to count unnecessary operations.
44
•[025] SOLUTION :
0 0 1 3
results in the matrices
1 0 0 0 3 1 0 0
1 1 0 0 0 8 1 0
L = 3 , and U = 3 .
0 3 0 0 21
8
1 0
8
1
8 55
0 0 21 1 0 0 0 21
11 29 55 T
Solving Lg = f gives g = (4 , 3
, 8
, 21
) ,
Tn = diag[1 , 3 , 1] ,
as a function of n ?
SOLUTION :
Only n − 1 divisions are needed, and no multiplications !
SOLUTION :
The estimated time, based only on the number of divisions, is 100 seconds.
EXERCISES :
45
•[028] SOLUTION :
•[029] SOLUTION :
The leading term for the number of ”operations” for Ax = f is n3 /3 .
Thus the estimated time µ per operation is obtained from
(103 )3
µ = 10 seconds ,
3
which gives µ = 3 · 10−8 seconds .
The leading term for solving Lg = f followed by Ux = g is n2 .
46
•[030] Suppose multiplying two general n by n matrices takes 3 seconds on a
given computer, if n = 1000. Estimate how much time it will take to
compute the LU-decomposition of such a matrix.
SOLUTION : Multiplying two n by n matrices takes n3 multipli-
cations, while LU-decomposition takes approximately n3 /3 multiplica-
tions and divisions. Thus LU-decomposition of a matrix of dimension
n = 1000 can be estimated to take one second.
SOLUTION :
n − 1 divisions .
SOLUTION :
n (n − 1)
divisions .
2
Practical Considerations.
• Memory reduction.
In an implementation of the LU decomposition algorithm, the multipliers
can be stored in the lower triangular part of the original matrix A.
47
• Row interchanges.
48
• Loss of accuracy.
More generally, loss of accuracy may occur when there are large multipliers .
EXAMPLE : Solve
0.0000001 1 x1 1
= ,
1 1 x2 2
on a “six-decimal-digit computer”.
NOTE :
• The solution is x1 ∼
= 1 , x2 ∼
= 1.
49
A “Six-decimal-digit computer ” :
is stored as
−3.33333 101
50
0.0000001 1 x1 1
=
1 1 x2 2
x2 = 1.00000E + 00 , x1 = 0.00000E + 00 .
51
Again, the remedy is to interchange rows :
1 1 x1 2
= .
0.0000001 1 x2 1
(a) Elimination :
1.000000E + 00 1.000000E + 00 x1 2.00000E + 00
= .
0 .999999E + 00 x2 .999999E + 00
52
Gauss Elimination with pivoting.
Here rows are interchanged each time a pivot element is sought, so that the
pivot is as large as possible in absolute value.
53
EXAMPLE :
2 2 1 x1 5 interchange row 1 and 3
1 0 1 x2 = 2
4 1 2 x3 7
4 1 2 x1 7
1 0 1 x2 = 2 subtract 41 row 1 f rom row 2
2 2 1 x3 5 subtract 24 row 1 f rom row 3
4 1 2 x1 7
0 −1/4 1/2 x2 = 1/4 interchange row 2 and 3
0 3/2 0 x3 3/2
4 1 2 x1 7
0 3/2 0 x2 = 3/2
0 −1/4 1/2 x3 1/4 subtract −1
6 row 2 f rom row 3
4 1 2 x1 7 backsubstitution : x1 = 1
0 3/2 0 x2 = 3/2 backsubstitution : x2 = 1
0 0 1/2 x3 1/2 backsubstitution : x3 = 1
54
EXERCISES :
•[035] Suppose that Gauss elimination with row pivoting is used to solve
2 1 0 0 x1 4
1 2 1 0 x2 8
= .
0 1 2 1 x3 12
0 0 1 2 x4 11
Are any rows actually interchanged?
Can you also answer this question for general Tn = diag[1, 2, 1] ?
55
•[035] SOLUTION :
0 0 1 2
results in the matrices
1 0 0 0 2 1 0 0
1 1 0 0 0 3
1 0
L = 2 2
0 2 1 0 , and U = 0
4
.
3
0 3
1
0 0 34 1 0 0 0 5
4
Ax = f ,
Suppose a small error is made in the right hand side, i.e., instead we solve
Ay = f + r.
56
From
Ay = f + r ,
subtract
Ax = f ,
to get
A(y − x) = r .
Thus
y − x = A−1 r ,
so that
k y − x k = k A−1 r k ≤ k A−1 k k r k .
57
EXAMPLE :
1 −1.001 x1 2.001
= ,
2.001 −2 x2 4.001
y1 = 2 , y2 = 0 .
58
Note that the small change in the right hand side has norm
k r k∞ = 0.001 .
Also note that the change in the solution is much larger , namely,
k x − y k∞ = 1 .
In this example
−666.44 333.55
A−1 ∼
= .
−666.77 333.22
Hence
k A−1 k∞ ∼
= 1000 , whereas k A k∞ ∼
= 4.
59
Errors always occur in floating point computations due to finite word length.
1
For example, on a “six digit computer” 3
is represented by 3.33333 10−1 .
(A + E)y = f + r .
60
THEOREM : Consider
Ax = f , and (A + E)y = f + r .
Assume that
1 −1
A is nonsingular, and that k E k< , i .e. , k A k kEk < 1.
k A−1 k
Then
ky−xk cond(A) k r k k E k
≤ −1
+ ,
kxk 1− k A kk E k k f k k A k
where
cond(A) ≡ k A−1 kk A k ,
61
PROOF :
First write
A + E = A (I + A−1 E) .
−1 −1 1 1
k (I + A E) k ≤ −1
≤ −1
.
1− k A E k 1− k A kk E k
62
Next
(A + E) y = f + r ,
implies
(I + A−1 E) y = A−1 (f + r) ,
so that
Then
y − x = (I + A−1 E)−1 A−1 (f + r) − (I + A−1 E)x
= (I + A−1 E)−1 x + A−1 r − x − A−1 Ex
63
Finally,
ky−xk k (I + A−1 E)−1 k k A−1 k k r k + k E kk x k
≤
kxk kxk
k A−1 k k r k
≤ −1
+kEk
1− k A k k E k k x k
k A−1 k k A k krk k E k
= −1
+
1− k A k k E k k A k k x k k A k
k A−1 k k A k k r k k E k
≤ −1
+ .
1− k A k k E k k f k k A k
64
From the above theorem we can conclude that :
ky−xk
If cond(A) is large, then the relative error can be large .
kxk
65
EXAMPLE : The 2 by 2 matrix
1 −1.001 x1
,
2.001 −2 x2
cond(A) ∼
= (4)(1000) = 4000 ,
66
Solving ill-conditioned systems numerically, if they can be solved at all on a
given computer, normally requires pivoting.
for which the condition number is approximately 4 using the infinity norm.
But solving a linear system with the above A as coefficient matrix requires
pivoting, at least on a six (decimal) digit computer.
67
EXERCISE :
•[036] Let
0 = t0 < t1 < t2 < · · · < tn = 1 ,
and let
hi = ti − ti−1 , i = 1, 2, · · · , n .
The following tridiagonal matrix arises in cubic spline interpolation
(to be discussed later) :
2(h1 + h2 ) h2
h2 2(h2 + h3 ) h3
Sn−1 = h3 2(h3 + h4 ) h4 .
· · ·
hn−1 2(hn−1 + hn )
68
•[036] SOLUTION : Write Sn−1 = Dn−1 (In−1 + Bn−1 ) , where
2(h1 + h2 )
2(h2 + h3 )
Dn−1 = 2(h3 + h4 )
· · ·
2(hn−1 + hn )
and h2
0 2(h1 +h2 )
h2 h3
2(h2 +h 3)
0 2(h2 +h3 )
Bn−1 = h3 h4 .
2(h3 +h4 ) 0 2(h3 +h4 )
· · ·
hn−1
2(hn−1 +hn ) 0
1
k Bn−1 k∞ = 2
< 1, so In−1 + Bn−1 , and hence Sn−1 , is invertible, with
k D−1
n−1 k∞ 2
k S−1
n−1 k∞ ≤ ≤ ,
1− k Bn−1 k∞ 2 mini {hi + hi+1 }
and
3 maxi {hi + hi+1 }
cond(Sn−1 ) = k Sn−1 k∞ k S−1
n−1 k∞ = ,
mini {hi + hi+1 }
which can be arbitrarily large.
EXERCISES :
•[038] Use the Banach Lemma to prove that the five-diagonal matrix
Tn = diag[1, 1, 5, 1, 1] ,
is invertible for all n ≥ 1 .
Derive an upper bound on cond(Tn ) using the matrix infinity norm.
69
EXERCISES :
70
EXERCISES :
For each of the following statements about matrices, say whether it is true
or false. Explain your answer.
71
EXERCISES :
For each of the following statements about matrices, say whether it is true
or false. Explain your answer.
•[052] If D is a diagonal matrix (i.e., its entries dij are zero if i 6= j), then
k D k1 = k D k2 = k D k∞ .
72
BRIEF SOLUTIONS : (You are expected to provide details.)
Page 69:
1
•[037] In + ǫCn is invertible if ǫ < n
, in which case
1 + ǫn
cond( In + ǫCn ) ≤ .
1 − ǫn
•[038] k T−1
n k ∞ ≤ 1 , and cond(Tn ) = k Tn k ∞ k T−1
n k∞ ≤ 9 .
Introduction.
Ax = f ,
73
We can write a system of n nonlinear equations in n unknowns as
G(x) = 0 ,
where
x , 0 ∈ Rn ,
x = (x1 , x2 , · · · , xn )T ,
74
EXAMPLES : (of possible situations) :
75
NOTE :
76
Some Methods for Scalar Nonlinear Equations.
g(x) = 0 ,
77
The Bisection Method.
Algorithm : For k = 0, 1, 2, · · · :
1
• Set z (k) = 2
(x(k)
+ y (k) ) ,
78
g(x)
(0)
g(y )
x (1)
x (0)
x
x* z (0) y (0)
y (1)
The bisection method works if g(x) is continuous in the interval [x(0) , y (0) ].
In fact we have
1
| x(k) − x∗ | ≤ k
| x(0)
− y (0)
| .
2
79
The Regula Falsi.
80
(0)
g(x)
g(y )
x (1)
x (0)
(0) x
x
*
z y (0)
y (1)
g(x )
(0)
The Regula Falsi.
z (k) is the zero of the line from (x(k) , g(x(k) )) to (y (k) , g(y (k) )) . (Check !)
81
Unlike the bisection method, not both x(k) and y (k) need converge to x∗ :
(0)
g(x)
g(y )
.
.
x (2)
x (1)
x (0) x*
(0)
x
z y (0)
y (1)
(0) z(2)
g(x ) y(2)
82
Newton’s Method.
p0 (x(0) ) = g(x(0) )
and
p′0 (x(0) ) = g ′ (x(0) ) ,
is given by
83
g(x)
g(x(0) )
p0 (x)
x*
x
x(2) x (1) (0)
x
Newton’s method
84
This procedure may now be repeated for the point x(1) .
• g ′ (x∗ ) 6= 0 ,
85
EXAMPLE :
86
The Chord Method.
The only difference is that g ′ (x) is always evaluated at the initial point x(0) .
87
g(x)
g(x(0) )
p0 (x)
x*
(2) (1) (0) x
x x x
g(x(k) )
x(k+1) = x(k) − ′ (0) .
g (x )
88
Compared to Newton’s method :
89
EXAMPLE :
With x(0) = 1.5 the Chord method for solving x2 − 2 = 0 takes the form
(k) 2
(x ) −2
x(k+1) (k)
= x − .
3
90
EXERCISES :
•[054] Show how to use Newton’s method to compute the cube root of 2.
Carry out the first few iterations, using x(0) = 0.6.
•[055] Show how to use the Chord method to compute the cube root of 2.
Carry out the first few iterations, using x(0) = 0.6.
•[057] Consider the equation sin(x) = e−x . Draw the functions sin(x) and
e−x in one graph. How many solutions are there to the above equa-
tion ? Show how one can use Newton’s method to find a solution of
the equation. Carry out the first few Newton iterations, using x(0) = 0.
91
•[054] SOLUTION : Here is a Fortran code for Newton’s method for the
cube root of 2, followed by the output. Note the rapid convergence !
g(x) = x**3 - 2
gp(x) = 3*x**2
nit = 10
x = 0.6
DO k=1,nit
x = x - g(x)/gp(x)
WRITE(6,101)k, x
ENDDO
101 FORMAT(I3,1PE16.6)
STOP
END
1 2.251852E+00
2 1.632705E+00
3 1.338558E+00
4 1.264450E+00
5 1.259937E+00
6 1.259921E+00
7 1.259921E+00
•[055] SOLUTION : Here is a Fortran code for the Chord method for
the cube root of 2, followed by the output. The iteration diverges ! It will
converge if x0 is closer to the cube root of 2, but convergence is slower than
Newton.
g(x) = x**3 - 2
nit = 5
x = 0.6
gp = 3*x**2
DO k=1,nit
x = x - g(x)/gp
WRITE(6,101)k, x
ENDDO
101 FORMAT(I3,1PE16.6)
STOP
END
1 2.251852E+00
2 -6.469230E+00
3 2.460709E+02
4 -1.379588E+07
5 2.431220E+21
•[056] SOLUTION : For the equation sin(x) = 1/x .
Drawing the graphs of sin(x) and 1/x in a single diagram, one observes
that there are infinitely many solutions.
With g(x) = sin(x) − x−1 , Newton’s method for finding a zero of g(x) is
(k+1) (k) sin(x) − x−1
x = f (x ) , where f (x) = x − .
cos(x) + x−2
Taking x(0) = π/2 gives these results :
x(1) = 0.674191 ,
x(2) = 0.962321 ,
x(3) = 1.094709 ,
x(4) = 1.113808 ,
x(5) = 1.114157 ,
x(6) = 1.114157 ,
i.e., the iteration converges to the positive solution that is nearest to zero.
•[057] SOLUTION : For the equation sin(x) = e−x .
Drawing the graphs of sin(x) and e−x in a single diagram, one observes
that there are infinitely many solutions.
x(1) = 0.47852772 ,
x(2) = 0.58415699 ,
x(3) = 0.58852512 ,
x(4) = 0.58853275 ,
x(5) = 0.58853275 ,
x(6) = 0.58853275 .
Newton’s Method for Systems of Nonlinear Equations.
g(x) = 0 .
NOTE :
92
Similarly for systems of the form
G(x) = 0 ,
93
Thus x(k+1) is the solution of the linear system
94
EXAMPLE :
x21 x2 − 1 = 0 ,
x2 − x41 = 0 .
Here
x21 x2
x1 g1 (x) g1 (x1 , x2 ) −1
x = , G(x) = = = .
x2 g2 (x) g2 (x1 , x2 ) x2 − x41
95
Hence Newton’s method for this problem takes the form
for k = 0, 1, 2, · · · .
Thus for each iteration two linear equations in two unknowns must be solved.
96
With the initial guess
(0) (0)
x1 = 2 , x2 = 2 ,
the first iteration consists of solving
(0)
8 4 ∆x1 −7
(0) = ,
−32 1 ∆x2 14
which gives
(0) (0)
∆x1 = −0.463 , ∆x2 = −0.823 ,
and then setting
(1) (0) (0)
x1 = x1 + ∆x1 = 1.537 ,
(1) (0) (0)
x2 = x2 + ∆x2 = 1.177 .
(2) (2)
After a second iteration what will x1 and x2 be ?
97
EXERCISES :
•[058] Describe in detail how Newton’s method can be used to compute solu-
tions (x1 , x2 ) of the system of two nonlinear equations
x21 + x22 − 1 = 0 ,
x2 − ex1 = 0 .
x3 − ex1 = 0 ,
x3 − ex2 = 0 .
98
•[058] SOLUTION : For the 2D problem :
x21 x22
x1 g1 (x) g1 (x1 , x2 ) + −1
x = , G(x) = = = ,
x2 g2 (x) g2 (x1 , x2 ) x2 − ex1
for k = 0, 1, 2, · · · .
•[059] SOLUTION :
becomes
A ∆x(k) = − (Ax(k) − f ) ,
99
A ∆x(k) = − (Ax(k) − f ) ,
NOTE :
100
A ∆x(k) = − (Ax(k) − f ) ,
NOTE :
101
Convergence Analysis for Scalar Equations.
Most iterative methods for solving a scalar equation g(x) = 0 can be written
Sometimes the iteration x(k+1) = x(k) − g(x(k) ) also works. In this method
f (x) = x − g(x) .
102
NOTE :
103
The iteration
known as the logistic equation, models population growth when there are
limited resources.
•[065] when 3 ≤ c ≤ 4 ?
104
In general, an iteration of the form
x(k+1) = f (x(k) ) , k = 0, 1, 2, · · · ,
is sometimes called a fixed point iteration (or a recurrence relation ,
or a discrete dynamical system) .
105
EXAMPLE :
In Newton’s method
g(x)
f (x) = x − ′ .
g (x)
that is,
g(x∗ ) = 0 ,
(assuming that g ′ (x∗ ) 6= 0 .)
106
Assuming that f has a fixed point, when does the fixed point iteration
x(k+1) = f (x(k) ) ,
converge ?
107
y
y=x
y=f(x)
0 x* x
0 (0) (1) (2)
x x x
108
y
y=x
y=f(x)
0 x
0 (3) (2) (1) (0)
x x x x x*
109
THEOREM :
Let f ′ (x) be continuous near a fixed point x∗ of f (x), and assume that
| f ′ (x∗ ) | < 1 .
x(k+1) = f (x(k) ) , k = 0, 1, 2, · · · ,
110
PROOF :
Let α ≡ | f ′ (x∗ ) |.
Then α < 1.
Iǫ ≡ [x∗ − ǫ, x∗ + ǫ] ,
such that
| f ′ (x) | ≤ β in Iǫ ,
111
Let x(0) ∈ Iǫ .
| x(1) −x∗ | = | (x(0) −x∗ )f ′ (η0 ) | = | x(0) −x∗ | | f ′ (η0 ) | ≤ β | x(0) −x∗ | .
112
Again by Taylor’s Theorem (or the Mean Value Theorem)
Hence
| x(2) − x∗ | ≤ β | x(1) − x∗ | ≤ β 2 | x(0) − x∗ | .
| x(k) − x∗ | ≤ β k | x(0) − x∗ | .
x(k) → x∗ as k → ∞ .
113
COROLLARY : Let
Iǫ ≡ [ x∗ − ǫ , x∗ + ǫ ] ,
x(k+1) = f (x(k) ) , k = 0, 1, 2, · · · .
converges to x∗ .
114
COROLLARY :
If
• x∗ is a zero of g(x) = 0 ,
• g ′ (x∗ ) 6= 0 ,
converges to x∗ .
115
PROOF :
In Newton’s method
g(x)
f (x) = x − ′ .
g (x)
Hence
116
EXAMPLE :
satisfy
x∗ = c x∗ (1 − x∗ ) .
∗ ∗ 1
x = 0, and x = 1 − .
c
117
•[066] SOLUTION : Here f (x) = c x (1 − x), so that f ′ (x) = c(1 − 2x).
The fixed points satisfy
x = c x (1 − x) .
which we can rewrite as
x [1 − c (1 − x)] = 0 .
One fixed point is x∗ = 0 , with derivative f ′ (0) = c. Thus this fixed point
is attracting when 0 ≤ c < 1, and repelling when 1 < c ≤ 4.
1
The other fixed point satisfies 1 − c(1 − x)] = 0, from which x∗ = 1 − c
,
with derivative
1 1
f ′ (1 − ) = c[1 − 2(1 − )] = 2 − c .
c c
We see that 1
′
| f (1 − ) |< 1 when 1 < c < 3 (attracting) ,
c
and
1
| f ′ (1 − ) |> 1 when 0 ≤ c < 1 and 3 < c ≤ 4 (repelling) .
c
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 0.9 .
118
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 1.7 .
119
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.46 .
120
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.561 .
121
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.6 .
122
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.77 .
123
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.89 .
124
1.0
0.9
0.8
0.7
0.6
y
0.5
0.4
0.3
0.2
0.1
0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x
The logistic equation : c = 3.99 .
125
If a fixed point iteration
x(k+1) = f (x(k) ) ,
126
THEOREM :
Let f ′ (x) be continuous near x∗ with | f ′ (x∗ ) |< 1 .
Assume that x(0) is sufficiently close to x∗ , so that the fixed point iteration
x(k+1) = f (x(k) )
converges to x∗ .
Then ek+1
lim = | f ′ (x∗ ) | (linear convergence) .
k→∞ ek
(The value of | f ′ (x∗ ) | is then called the rate of convergence.)
ek+1 1 ′′ ∗
lim 2
= | f (x ) | (quadratic convergence) .
k→∞ ek 2
127
PROOF :
Case : f ′ (x∗ ) 6= 0 :
ek+1 = | x(k+1) − x∗ |
= | f (x(k) ) − x∗ |
= | x(k) − x∗ | | f ′ (ηk ) |
= ek | f ′ (ηk ) | ,
Hence
ek+1
lim = lim | f ′ (ηk ) | = | f ′ (x∗ ) | .
k→∞ ek k→∞
128
Case : f ′ (x∗ ) = 0 :
Thus
ek+1 1 1
lim = lim | f ′′ (ηk ) | = | f ′′ (x∗ ) | QED !
k→∞ e2 k→∞ 2 2
k
129
COROLLARY :
If
• g ′ (x∗ ) 6= 0 ,
130
EXAMPLE :
√
Newton’s method for computing 2 , i.e., for computing a zero of
g(x) = x2 − 2 ,
is given by
(k+1) (k) (x(k) )2 − 2
x = x − (k)
,
2x
that is,
(x(k) )2 + 2
x(k+1) = (k)
,
2x
that is,
x(k+1) = f (x(k) ) ,
where
x2 + 2
f (x) = .
2x
131
(k+1) (k) x2 + 2
x = f (x ) , where f (x) = .
2x
We observe that :
∗
√ ∗
√
• The fixed points of f are x = + 2 and x = − 2 .
• f (x) ∼
= x/2 as | x |→ ∞ .
(Check !)
132
√
Newton’s Method for 2 as a fixed point iteration
133
√
Newton’s Method for 2 as a fixed point iteration (blow-up)
134
√
Newton’s Method for ± 2 as a fixed point iteration
135
From the last graph we see that :
√
• The iteration converges to x = + 2 for any x(0) > 0 .
∗
√
• The iteration converges to x = − 2 for any x(0) < 0 .
∗
(Check !)
136
EXAMPLE :
x∗ = x∗ − γ g(x∗ ) ,
that is,
g(x∗ ) = 0 . (Assuming γ 6= 0 .)
137
In this example
f (x) = x − γ g(x) .
| f ′ (x∗ ) | < 1 ,
i.e., if
| 1 − γ g ′ (x∗ ) | < 1 ,
i.e., if
−1 < 1 − γ g ′ (x∗ ) < 1 ,
i.e., if
−2 < − γ g ′ (x∗ ) < 0 ,
i.e., if
0 < γ g ′ (x∗ ) < 2 .
138
The convergence is quadratic if
f ′ (x∗ ) = 1 − γ g ′ (x∗ ) = 0 ,
that is, if 1
γ = γ̂ ≡ ′ ∗ .
g (x )
139
EXERCISES :
•[067] If the following fixed point iteration converges, then what number will
it converge to? Is the convergence quadratic?
2x3 + 3
x(k+1) = f (x(k) ) , where f (x) = 2
.
3x
•[069] Analytically determine all fixed points of x(k+1) = 2x(k) (1 − x(k) ). Are
these attracting or repelling? If attracting then is the convergence
linear or quadratic? Also give a graphical interpretation.
140
(k+1) (k) 2x3 +3
•[067] x = f (x ) , with f (x) = 3x2
.
SOLUTION : This is Newton’s method for the cube root of 3.
The convergence is quadratic for sufficiently close initial guess.
141
•[070] SOLUTION : Here f (x) = sin(x) , and f ′ (x) = cos(x) .
Here are some of the very long-time iterations, starting with x(0) = π/4 :
1000 0.05455137
2000 0.03864754
3000 0.03157667
4000 0.02735556
5000 0.02447269
6000 0.02234359
7000 0.02068827
8000 0.01935361
9000 0.01824787
10000 0.01731231
1
2x , when x ≤ 2
,
•[071] f (x) =
1
2(1 − x) , when x > .
2
SOLUTION :
The fixed points are x = 0 and x = 32 . Both are repelling since | f ′ (x) |= 2.
•[072] Show how to use the Chord method to compute the cube root of 5.
Carry out the first two iterations of the Chord method, using x(0) = 2 .
Analytically determine all fixed points of this Chord iteration.
For each fixed point, determine whether it is attracting or repelling.
142
•[073] SOLUTION : Newton’s method for the cube root of 2 :
x3 − 2 2(x3 + 1)
x(k+1) = f (x(k) ) , where f (x) = x − = .
3x2 3x2
- the iteration converges for “most” negative x(0) , except for a countably
infinite such x(0) , namely those for which x(k) = 0 for some k.
SOLUTION : continued · · ·
Newton’s method for the cube root of 2, with a converging iteration (red)
having x(0) = 0.35 , and the iteration (black) with x(k) = 0 for some k.
EXERCISES :
•[074] Suppose you enter any number on a calculator and then keep pushing
the cosine button. (Assume the calculator is in “radian-mode”.)
What will happen in the limit to the result shown in the display?
Give a full mathematical explanation and a graphical interpretation.
Do the same for sin(x) and tan(x).
Does the derivative test give conclusive evidence whether or not the
fixed point x = 0 is attracting?
Give a careful graphical interpretation of this fixed point iteration.
What can you say about the convergence of the fixed point iteration?
143
•[074] SOLUTION : x(k+1) = cos(x(k) ) , k = 0, 1, 2, · · · .
Here f (x) = cos(x) and f ′ (x) = − sin(x) .
There is only a single fixed point, namely, x∗ ≈ 0.739085133 .
The fixed point is attracting, with | f ′ (x∗ ) | ≈ 0.673612029 .
A careful graphical interpretation shows there is convergence to x∗
for any initial guess x(0) .
144
(k+1) 1
•[076] x = √ , k = 0, 1, 2, · · · .
x(k)
SOLUTION :
Here
− 12 ′ 1 −3
f (x) = x , and f (x) = − x 2 .
2
1
Furthermore, | f ′ (1) | = 2
, so x = 1 is attracting.
1
- | f ′ (x∗ ) | = 2
< 1 , so x∗ is attracting
- graphically we also see the iteration diverges for any x(0) < −1
- the divergence for x(0) < −1 is slow as x(k) becomes more negative
SOLUTION : continued · · ·
G(x) = 0 ,
can be written as
x(k+1) = F(x(k) ) , k =, 1, 2, · · · ,
145
EXAMPLE :
Thus
So here
F(x) = x − G′ (x)−1 G(x) .
146
NOTE : Fixed point iterations also arise as models of physical processes,
where they are called difference equations or discrete dynamical systems .
EXAMPLE :
The equations
(k+1) (k) (k) (k) (k)
x1 = λ x1 (1 − x1 ) − c1 x1 x2 ,
(k+1) (k) (k) (k)
x2 = c2 x2 + c1 x1 x2 ,
147
Derivatives :
f (x + h) − f (x) − f ′ (x)h
→ 0 as h → 0 .
h
148
Similarly for vector valued functions F(x) we say that F is differentiable at
x if there exists a matrix F′ (x) such that
and if
x ≡ (x1 , x2 , · · · , xn )T ,
then
′ ∂f i
{F (x)}i,j ≡ .
∂xj
149
THEOREM :
k F′ (x∗ ) k < 1 ,
x(k+1) = F(x(k) ) , k = 0, 1, 2, · · · ,
150
NOTE :
k F′ (x∗ ) k < 1 ,
Here
spr(F′ (x∗ )) is the spectral radius of F′ (x∗ ) ,
151
PROOF (of the Theorem) :
whenever x ∈ Bδ (x∗ ).
Here
Bδ (x∗ ) ≡ {x : k x − x∗ k ≤ δ} .
152
Let x(0) ∈ Bδ (x∗ ). Then
k x(1) − x∗ k = k F(x(0) ) − F(x∗ ) k
≤ ǫ k x(0) − x∗ k + α k x(0) − x∗ k
≤ (ǫ + α) k x(0) − x∗ k
1−α
= ( + α) k x(0) − x∗ k
2
1+α
= k x(0) − x∗ k ≤ β δ ,
2
where 1+α
β ≡ < 1.
2
Thus x(1) ∈ Bδ (x∗ ) .
153
Since x(1) ∈ Bδ (x∗ ) , we also have
k x(2) − x∗ k ≤ β k x(1) − x∗ k ≤ β2 δ .
k x(k) − x∗ k ≤ βk δ .
154
EXAMPLE : In Newton’s method
Hence
G′ (x)F(x) = G′ (x)x − G(x) .
⇒ G′′ (x)F(x) + G′ (x)F′ (x) = G′′ (x)x + G′ (x) − G′ (x) = G′′ (x)x
⇒ F′ (x∗ ) = (G′ (x∗ ))−1 G′′ (x∗ )(G′ (x∗ ))−1 G(x∗ ) = O (zero matrix) .
because G(x∗ ) = 0.
155
Thus if
k x(k+1) − x∗ k
lim (k) ∗ 2
≤ C, for some constant C.
k→∞ k x −x k
156
EXERCISE :
(k)
This is a “predator-prey ” model, where, for example, x1 denotes the
(k)
biomass of “fish” and x2 denotes the biomass of “sharks” in year k .
(k) (k)
Numerically determine the long-time behavior of x1 and x2 for
the following values of λ :
λ = 0.5, 1.0, 1, 5, 2.0, 2.5 ,
(0) (0)
taking, for example, x1 = 0.1 and x2 = 0.1.
What can you say analytically about the fixed points of this system?
157
•[078] SOLUTION :
Here is a ”quick” Fortran code for doing the iteration. Note the use of the
”intermediate variables” t1 and t2 so that the iteration is done correctly:
rl = 2.5
x1 = 0.1
x2 = 0.1
nit = 5000
npr = 500
DO k = 1,nit
t1 = rl*x1*(1-x1) - 0.2*x1*x2
t2 = 0.9*x2 + 0.2*x1*x2
x1 = t1
x2 = t2
IF (MOD(k,npr).EQ.0) PRINT*, k, x1, x2
ENDDO
STOP
END
SOLUTION : (continued · · · )
For each of the given values of λ the iteration converges to finite values,
which are shown in the Table below. For the cases λ = 1.0 and λ = 2.0
the convergence is extremely slow, and a very large number of iterations is
needed to converge to the fixed points shown in the Table.
λ x1 x2
0.5 0.000000 0.000000
1.0 0.000000 0.000000
1.5 0.333333 0.000000
2.0 0.500000 0.000000
2.5 0.500000 1.250000
Analytically it can be seen that the fixed points in the Table are indeed
attracting (”stable”), although the fixed points for λ = 1.0 and for λ = 2.0
are “critically stable”. Indeed, the eigenvalues of the Jacobian matrix for
this problem are strictly less than 1 for λ = 0.5, 1.5, 2.5, while for λ = 1.0
and λ = 2.0 there is an eigenvalue of magnitude 1. It must be mentioned
that there are also repelling fixed points, but we leave out details on these
here.
SOLUTION : (continued · · · )
Function Norms.
k f k∞ ≡ max | f (x) | .
[a,b]
158
A function norm must satisfy :
(ii) k αf k = | α | k f k , ∀α ∈ R , ∀f ∈ C[a, b] ,
159
All of the norms above satisfy these requirements. (Check !)
EXAMPLE :
Z b
k f + g k1 = | f (x) + g(x) | dx
a
Z b
≤ | f (x) | + | g(x) | dx
a
Z b Z b
= | f (x) | dx + | g(x) | dx
a a
= k f k1 + k g k1 .
160
NOTE : If a function is “small” in a given function norm then it need not
be small in another norm.
f k (x)
x
0 1/2 1
1/2 - 1/k 2 1/2 + 1/k 2
161
Then
k fk k∞ = k → ∞ as k → ∞ ,
while
1
1
Z
k fk k1 = | fk (x) | dx = → 0 as k → ∞ ,
0 k
and
Z 1
1 p
2
k fk k2 = { fk (x) dx} 2 = 2/3 (Check !) .
0
162
EXAMPLE :
Approximate
f (x) = x3
by
3 2 1
p(x) = x − x,
2 2
on the interval [0, 1].
Then
1
p(x) = f (x) for x = 0 , , 1,
2
163
Graph of f (x) = x3 (blue) and its interpolant p(x) = 23 x2 − 12 x (red) .
164
A measure of “how close” f and p are is then given by, for example,
Z 1
1
2
k f − p k2 = { (f (x) − p(x)) dx} .
2
We find that
√
210 ∼
k f − p k2 = = 0.0345. (Check !)
420
165
The Lagrange Interpolation Polynomial.
p(xk ) = f (xk ) , k = 0, 1, · · · , n .
166
1
Graph of f (x) = 10
+ 15 x + x2 sin(2πx) (blue)
and its Lagrange interpolant p(x) ∈ P5 (red)
at six interpolation points (n = 5) .
167
The following questions arise :
168
To answer the above questions let
n
Y (x − xk )
ℓi (x) ≡ , i = 0, 1, · · · , n ,
k=0,k6=i
(xi − xk )
169
EXAMPLE : If n = 2 we have
(x − x1 )(x − x2 )
ℓ0 (x) = ,
(x0 − x1 )(x0 − x2 )
(x − x0 )(x − x2 )
ℓ1 (x) = ,
(x1 − x0 )(x1 − x2 )
and
(x − x0 )(x − x1 )
ℓ2 (x) = .
(x2 − x0 )(x2 − x1 )
NOTE : ℓi ∈ P2 , i = 0, 1, 2 , and
0 if k 6= i ,
ℓi (xk ) =
1 if k = i.
170
1 1 1
x0 x1 x2 x0 x1 x2 x0 x1 x2
171
Now given f (x) let
n
X
p(x) = f (xk ) ℓk (x) .
k=0
Then
p ∈ Pn ,
and n
X
p(xi ) = f (xk ) ℓk (xi ) = f (xi ) ,
k=0
172
THEOREM : Let f (x) be defined on [a, b] and let
a ≤ x0 < x1 < · · · < xn ≤ b .
PROOF :
Hence r(x) ≡ 0 .
173
EXAMPLE : Let f (x) = ex .
Given f (0) = 1, f (1) = 2.71828, f (2) = 7.38905, we want to approximate
f (1.5) by polynomial interpolation at x = 0, 1, 2.
Here
(1.5 − 1) (1.5 − 2) 1
ℓ0 (1.5) = = − ,
(0 − 1) (0 − 2) 8
(1.5 − 0) (1.5 − 2) 6
ℓ1 (1.5) = = ,
(1 − 0) (1 − 2) 8
(1.5 − 0) (1.5 − 1) 3
ℓ2 (1.5) = = ,
(2 − 0) (2 − 1) 8
so that
p(1.5) = f (0) ℓ0 (1.5) + f (1) ℓ1 (1.5) + f (2) ℓ2 (1.5)
1 6 3
= (1) (− ) + (2.71828) ( ) + (7.38905) ( ) = 4.68460 .
8 8 8
174
Graph of f (x) = ex (blue) and its Lagrange interpolant p(x) ∈ P2 (red).
175
THE LAGRANGE INTERPOLATION THEOREM :
Let
x0 < x1 < · · · < xn , and let x∈R.
Define
a ≡ min{x0 , x} and b ≡ max{xn , x} .
Then n
(n+1)
f (ξ) Y
f (x) − p(x) = (x − xk ) ,
(n + 1)! k=0
for some point ξ ≡ ξ(x) ∈ [a, b].
176
PROOF :
Let n
Y f (x) − p(x)
w(z) ≡ (z − xk ) and c(x) ≡ .
k=0
w(x)
177
Let
F (z) ≡ f (z) − p(z) − w(z) c(x) .
Then
and
f (x) − p(x)
F (x) = f (x) − p(x) − w(x) = 0.
w(x)
178
F
x0 x1 xk xk+1 xn
x
F’
F ’’
.
.
etc.
.
.
ξ
F (n+1)
179
Hence, by Rolle’s Theorem, F ′ (z) has n + 1 distinct zeroes in [a, b] ,
We find that F (n+1) (z) has (at least) one zero in [a, b] , say,
Hence
F (n+1) (ξ) = f (n+1) (ξ) − (n + 1)! c(x) = 0 .
It follows that
f (n+1) (ξ)
c(x) = .
(n + 1)!
180
EXAMPLE : In the last example we had
n=2, f (x) = ex , x0 = 0 , x1 = 1, x2 = 2 ,
By the Theorem
f (3) (ξ)
f (x) − p(x) = (x − 0)(x − 1)(x − 2) , ξ ∈ [0, 2] .
3!
181
Qn
The graph of wn+1 (x) = k=0 (x − xk ) for equally spaced interpolation
points in the interval [−1, 1] , for the cases n + 1 = 3 , 6 , 7 , 10 .
182
n max n max n max n max
183
EXERCISES :
•[080] Also answer the above question for equally spaced interpolation points
in [−1, 1], using the Table on the preceding page .
•[081,082] Also answer the above questions for the case of f (x) = ex in [−1, 1].
184
•[079] Consider the polynomial pn (x) of degree n or less that interpolates
sin(x) at n + 1 distinct points in [−1, 1]. For distinct, but arbitrary interpola-
tion points, how big should n be to guarantee that the maximum interpolation
error in [−1, 1] is less than 10−2 ?
SOLUTION :
(n+1) max | wn+1 (x) |
max | sin(x) − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
2n+1 ∼
≤ = 6.35 10−3 when n = 7 .
(n + 1)!
•[080] Also answer the above question for equally spaced interpolation points
in [−1, 1], using the Table on the preceding page .
SOLUTION :
(n+1) max | wn+1 (x) |
max | sin(x) − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
max | wn+1 (x) | 0.19749 ∼
≤ = = 8.23 10−3 when n = 3 .
(n + 1)! 4!
•[081] Consider the polynomial pn (x) of degree n or less that interpolates ex
at n + 1 distinct points in [−1, 1]. For distinct, but arbitrary interpolation
points, how big should n be to guarantee that the maximum interpolation
error in [−1, 1] is less than 10−2 ?
SOLUTION :
x (n+1) max | wn+1 (x) |
max | e − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
2n+1 ∼
≤ e · = 3.836 10−3 when n = 8 .
(n + 1)!
•[082] Also answer the above question for equally spaced interpolation points
in [−1, 1], using the Table on the preceding page .
SOLUTION :
x (n+1) max | wn+1 (x) |
max | e − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
max | wn+1 (x) | 0.11348 ∼
≤e · = e · = 2.5706 10−3 when n = 4 .
(n + 1)! (n + 1)!
•[083] Consider the problem of interpolating a smooth function f (x) at two
points, x0 = −h/2 and x1 = h/2, by a polynomial p ∈ P3 such that
p(x0 ) = f (x0 ), p′ (x0 ) = f ′ (x0 ), p(x1 ) = f (x1 ), p′ (x1 ) = f ′ (x1 ).
Prove that this interpolation problem has one and only one solution.
h h h h h
p( ) = c0 + c1 + c2 ( )2 + c3 ( )3 = f ( ),
2 2 2 2 2
′ h h h 2 ′ h
p (− ) = c1 − 2 c2 + 3( ) c3 = f (− ) ,
2 2 2 2
h h h h
p′ ( ) = c1 + 2 c2 + 3( )2 c3 = f ′ ( ) .
2 2 2 2
SOLUTION : continued ··· :
h2 h3
− h2
1 4 −8 c0
f0
In matrix form 1 h h2 h3
c1 f 1
2 4 8
= ′ ,
0 3h2 c2 f0
1 −h 4
0 1 h 3h2 c3 f1′
4
where the matrix can be transformed to upper-triangular form as follows :
h2 3 h2 3
− h2 − h8 − h2 − h8
1 4 1 4
1 h h2 h3 0 h3
2 4 8
h 0 4
→
0 1 3h2 3h2
−h 4
0 1 −h 4
3h2 3h2
0 1 h 4 0 1 h 4
h2 3
h2 h3
1 − h2 − h8
1 − h2 −8
4 4
0 h3 h3
h 0 4
0 h 0 4
→ →
0 0 h2 h2
−h 2
0 0 −h 2
h2 0 0 0 h2
0 0 h 2
NOTE :
• It does not follow that k f − p k∞ → 0 as n → ∞ .
• There are examples where k f − p k∞ → ∞ as n → ∞.
• For given n, we can choose the points {xk }nk=0 so k wn+1 k∞ is minimized .
185
EXAMPLE :
Let n = 1 and place the x0 and x1 symmetrically in [−1, 1] :
w2 (x)
−1 x0 0 x1 1
−η +η
Then
w2 (x) = (x − x0 ) (x − x1 ) = (x + η) (x − η) = x2 − η 2 .
k w2 k∞ ≡ max | w2 (x) |
[−1,1]
is minimized .
186
1−η2 1−η2
w2 (x)
−1 x0 0 x1 1
−η +η
−η2
1√ 1
Thus η = 2 and k w2 k∞ = .
2 2
187
In general, the points
Also
Tk+1 (x) = 2x Tk (x) − Tk−1 (x) .
188
Tk+1 (x) = 2x Tk (x) − Tk−1 (x)
189
Thus, with T0 (x) ≡ 1 and T1 (x) = x , we obtain
·
·
·
190
THE CHEBYSHEV THEOREM :
Let n
Y
wn+1 (x) ≡ (x − xk ) .
k=0
is minimized if the points {xk }nk=0 are the zeroes of Tn+1 (x) .
191
PROOF :
π
Tn+1 (x) = 0 if (n + 1) cos−1 (x) = (2k + 1) ,
2
i.e., Tn+1 (x) = 0 if
2k + 1
x = cos( π) , k = 0, 1, 2, · · · , n .
2(n + 1)
192
2k + 1
x = cos( π) , k = 0, 1, 2, · · · , n .
2(n + 1)
x 1
4
x
3
7π/10 9π/10
x 0 π/10 π
2 3π/10
x
1
x
0 −1
193
Tn+1 (x) = cos( (n + 1) cos−1 (x) ) .
Tn+1 (x) = ± 1 if
(n + 1) cos−1 (x) = kπ ,
that is, if,
k
x = cos( π) , k = 0, 1, 2, · · · , n + 1 .
n+1
194
The graph of Tn for the cases n = 2, 3, 4, 5 .
195
Recall that from the recurrence relation
Let
∗
wn+1 (x) ≡ 2−n Tn+1 (x) = (x − x0 ) (x − x1 ) · · · (x − xn ) ,
Then
k wn+1 ∗ k∞ = k 2−n Tn+1 k∞ = 2−n k Tn+1 k∞ = 2−n .
196
−n
2
w5*(x)
1
0 11
00
01010
−1 x0
0
1 00
11
x1 x2 x11
00
00
11
3 x4 1010 1
−n
−2
∗
Qn
The graph of wn+1 = k=0 (x − xk ) for the case n = 4 .
Claim :
There does not exist w ∈ Pn+1 , with leading coefficient 1, such that
k w k∞ < 2−n .
197
Suppose there does exist a wn+1 ∈ Pn+1 , with leading coefficient 1, such that
−n
2
1
0
0
1 w5*(x)
11
00
00
11
1
0
0
1
00
11
00
11 01010 11
00
00
11
1
0
0
1
−1 x 0 x1 x2 x3 x4 1
00
11 1
0
1
0 0
1
w5 (x)
−n
−2
198
Then wn+1 must intersect wn+1 ∗ at least n + 1 times in [−1, 1] .
But
(wn+1 − wn+1 ∗ ) ∈ Pn
199
n uniform Chebyshev n uniform Chebyshev
200
EXAMPLE :
Let f (x) = ex on [−1, 1] and take n = 2 .
T3 (x) = 4x3 − 3x has zeroes
1√ 1√
x0 = − 3, x1 = 0 , x2 = 3.
2 2
(0.5 − x0 )(0.5 − x2 ) 4
ℓ1 (0.5) = = ,
(x1 − x0 )(x1 − x2 ) 6
√
(0.5 − x0 )(0.5 − x1 ) 1+ 3
ℓ2 (0.5) = = .
(x2 − x0 )(x2 − x1 ) 6
201
Thus
| e0.5 − p(0.5) | ∼
= | 1.648 − 1.697 | = 0.049 .
202
Graph of f (x) = ex (blue) on the interval [−1, 1] ,
and its Lagrange interpolating polynomial p(x) ∈ P2 (red)
at three Chebyshev interpolation points (n = 2) .
203
EXAMPLE : More generally, if we interpolate
Thus
x k f (n+1) k∞
max | e − p(x) | ≤ k wn+1 k∞
x∈[−1,1] (n + 1)!
e
≤ k wn+1 k∞
(n + 1)!
e
= 2−n .
(n + 1)!
204
NOTE :
k pC − f k ∞ ≤ k pU − f k ∞ ,
205
EXERCISES :
206
SOLUTIONS :
(n+1) 2−n
•[084] max | sin(x) − pn (x) | ≤ max | f (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
2−n ∼
≤ = 4.34 10−5 when n = 5 .
(n + 1)!
−n
2
•[085] max | ex − pn (x) | ≤ max | f (n+1) (x) | ·
x∈[−1,1] x∈[−1,1] (n + 1)!
e · 2−n ∼
≤ = 8.43 10−6 when n = 6 .
(n + 1)!
Let f ∈ C n [a, b] .
207
The function ex (blue) and its Taylor polynomials pk (x) about x0 = 0 :
k = 1 : purple, k = 2 : red, k = 3 : brown, k = 4 : green, k = 5 : black .
208
As for Lagrange interpolation, we have the following questions :
209
Existence :
n
X f (k) (x0 )
p(x) = (x − x0 )k .
k=0
k!
Clearly
DEFINITION :
p(x) is called the Taylor polynomial of degree n for f (x) about the point x0 .
210
TAYLOR’s THEOREM :
Let p(x) ∈ Pn be the Taylor polynomial for f about the point x0 , i.e.,
n
X f (k) (x0 )
p(x) = (x − x0 )k .
k=0
k!
f (n+1) (ξ)
DEFINITION : Rn (x) ≡ (x − x0 )n+1
(n + 1)!
is called the Taylor remainder .
211
•[087] PROOF of TAYLOR’s THEOREM (EXERCISE !) :
• Define
f (x) − p(x)
c(x) = n+1
.
(x − x0 )
212
• Show F ′ (ξ0 ) = 0 for some ξ0 between x0 and x . Graph F ′ (z) .
• etc.
• Show how Taylor’s Theorem follows from this last step. QED !
213
EXERCISES :
•[088] Write down the Taylor polynomials pn (x) of degree n (or less) for
f (x) = ex about the point x0 = 0, for the cases n = 1, 2, 3, 4.
•[089] Do the same for f (x) = sin(x) in [0, 1] about the point x0 = 0 .
214
•[088] SOLUTION :
The Taylor polynomials pn (x) of degree n for f (x) = ex about the point
x0 = 0, for n = 1, 2, 3, 4:
p1 (x) = 1 + x,
1
p2 (x) = 1 + x + 2
x2 ,
1 1
p3 (x) = 1 + x + 2
x2 + 6
x3 ,
1 1 1
p4 (x) = 1 + x + 2
x2 + 6
x3 + 24
x4 .
The Taylor polynomials pn (x) for f (x) = sin(x) about x0 = 0 all have
odd degree, since the coefficients of even degree terms are zero :
p1 (x) = x,
1
p3 (x) = x − 6
x3 ,
1 1
p5 (x) = x − 6
x3 + 120
x5 ,
1 1 1
p7 (x) = x − 6
x3 + 120
x5 − 5040
x7 ,
1 1 1 1
p9 (x) = x − 6
x3 + 120
x5 − 5040
x7 + 362880
x9 .
(The formulas for p5 (x), p7 (x), and p9 (x) were actually not asked for.)
How big should n be so | sin(x) − pn (x) |< 10−4 for x ∈ [0, 1] ?
The error bound for odd values of n is
1 1
= ≈ 2.756 10−6 when n = 7.
(n + 2)! 362880
Graph of the function sin(x) (blue) and its Taylor polynomials pk (x)
about x0 = 0 : k = 1: purple, k = 3: red, k = 5: brown, k = 7: black .
215
•[090] SOLUTION :
f (x) = ln(x) has derivatives
f (n) (x) = (−1)n−1 (n − 1)! x−n ,
for example,
f (1) (x) = x−1 , f (2) (x) = − x−2 , f (3) (x) = 2x−3 ,
(n+1) (x − 1)n+1
| f (x) − pn (x) | = f (ξ)
(n + 1)!
n+1
n −(n+1) (x − 1)
= (−1) n! ξ
(n + 1)!
1 −(n+1) 1 n+1 1 1
≤ = ,
2 2 n+1 n+1
Does
k f − p k∞ → 0 as n → ∞ ?
216
EXAMPLE : If
1
f (x) = on [−5, 5] ,
1 + x2
10
xk = − 5 + k ∆x , k = 0, 1, · · · , n , ∆x = ,
n
k f − p k∞ → ∞ as n → ∞.
217
1
Graph of f (x) = 1+x2
on the interval [−5, 5]
and its Lagrange interpolant p(x) ∈ P9 (red)
at ten equally spaced interpolation points (n = 9) .
218
1
Graph of f (x) = 1+x2
on the interval [−5, 5]
and its Lagrange interpolant p(x) ∈ P13 (red)
at fourteen equally spaced interpolation points (n = 13) .
219
Conclusion :
Alternative :
220
For given integer N let
b−a
h ≡ ,
N
and partition [a, b] into
where
tj = a + jh , j = 0, 1, · · · , N .
221
1
Local polynomial interpolation of f (x) = 1+x2
at 3 points in 5 intervals.
222
1
Local polynomial interpolation of f (x) = 1+x2
at 3 points in 10 intervals.
223
1
Local polynomial interpolation of f (x) = 1+x2
at 2 Chebyshev points.
224
1
Local polynomial interpolation of f (x) = 1+x2
at 3 Chebyshev points.
225
By the Lagrange Interpolation Theorem
1
max | f (x) − pj (x) | ≤ max | f (n+1) (x) | max | wn+1 (x) |
[tj−1 ,tj ] (n + 1)! [tj−1 ,tj ] [tj−1 ,tj ]
where n
Y b−a
wn+1 (x) = (x − xj,i ) , h ≡ tj − tj−1 = .
i=0
N
The Tables on Page 183, 200 show values of Cn ≡ max[−1,1] | wn+1 (x) |
for uniform and for Chebyshev interpolation points.
A scaling argument shows that for uniformly spaced local interpolation points
h
max | wn+1 (x) | ≤ ( )n+1 Cn ,
[tj−1 ,tj ] 2
while for local Chebyshev points we have
h n+1 −n
max | wn+1 (x) | ≤ ( ) 2 .
[tj−1 ,tj ] 2
226
NOTE :
227
EXAMPLE : If we approximate
π
f (x) = cos(x) on [0, ],
2
by local interpolation at 3 equally spaced local interpolation points
tj−1 + tj
xj,0 = tj−1 , xj,1 = , xj,2 = tj ,
2
k f (3) k∞ h 3 1 h3
max | f (x) − pj (x) | ≤ ( ) C2 ≤ 0.3849 .
[tj−1 ,tj ] 3! 2 6 8
228
Local polynomial interpolation at 3 points in 4 intervals.
229
EXERCISES :
•[091] f (x) = sin(x) on [0, 2π] , with arbitrary local interpolation points.
•[092] f (x) = sin(x) on [0, 2π] , with equally spaced local points.
•[093] f (x) = sin(x) on [0, 2π] , with local Chebyshev interpolation points.
230
•[093] SOLUTION : Local interpolation of sin(x) at Chebyshev points :
If N is the number of intervals then their size is h = 2π/N .
The local error for sin(x) for an interval of size h is bounded by
1
| sin(x) − pn (x) | ≤ max | wn+1 (x) | .
(n + 1)!
For the reference interval [−1, 1] the maximum of | wn+1 (x) | is 2−n .
The adjusted bound for a local interval of size h is
1 h n+1
| sin(x) − pn (x) | ≤ 2−n ,
(n + 1)! 2
which for the case n = 3 gives
1 h 4 −3
| sin(x) − p3 (x) | ≤ 2 ,
4! 2
which is less than 10−4 when h < 0.74448 , that is, when N ≥ 9 .
NOTE: For arbitrary points use 2n+1 as maximum of | wn+1 (x) | in [−1, 1] ,
while for equally spaced points use instead the Table on Page 183.
NUMERICAL DIFFERENTIATION.
where n
Y (x − xk )
ℓi (x) = .
k=0,k6=i
(xi − xk )
231
EXAMPLE :
n = 2, m = 2, x =0,
x0 = − h , x1 = 0 , x2 = h .
f0 , f1 , and f2 , ( fi ≡ f (xi ) ) .
232
f (x)
h h
x0 x1 x2
In this case
f ′′ (x1 ) ∼
= p′′ (x1 ) = f0 ℓ′′0 (x1 ) + f1 ℓ′′1 (x1 ) + f2 ℓ′′2 (x1 ) .
233
f ′′ (x1 ) ∼
= f0 ℓ′′0 (x1 ) + f1 ℓ′′1 (x1 ) + f2 ℓ′′2 (x1 ) .
Here
(x − x1 )(x − x2 )
l0 (x) = ,
(x0 − x1 )(x0 − x2 )
so that 2 1
ℓ′′0 (x) = = 2 .
(x0 − x1 )(x0 − x2 ) h
In particular,
1
ℓ′′0 (x1 ) = 2 .
h
Similarly
2 1
ℓ′′1 (x1 ) = − 2 , ℓ′′2 (x1 ) = 2 . (Check !)
h h
Hence f0 − 2f1 + f2
′′ ∼
f (x1 ) = .
h 2
234
To derive an optimal error bound we use Taylor’s Theorem :
f2 − 2f1 + f0 ′′
− f1
h2
1 ′ h2 ′′ h3 ′′′ h4 ′′′′
= 2
f1 + hf1 + f1 + f1 + f (ζ1 )
h 2 6 24
− 2f1
h2 ′′ h3 ′′′ h4 ′′′′
+ f1 − hf1′ + f1 − f1 + f (ζ2 ) − f1′′
2 6 24
h2 ′′′′ h2 ′′′′
= f (ζ1 ) + f ′′′′ (ζ2 ) = f (η) ,
24 12
where η ∈ (x0 , x2 ) .
235
EXAMPLE : With n = 4 , m = 2 , and x = x2 , and reference interval
x0 = − 2h , x1 = − h , x2 = 0 , x3 = h , x4 = 2h ,
we have
4
f ′′ (x2 ) ∼
X
= fi ℓ′′i (x2 ) .
i=0
Here
(x − x1 )(x − x2 )(x − x3 )(x − x4 )
l0 (x) =
(x0 − x1 )(x0 − x2 )(x0 − x3 )(x0 − x4 )
236
Similarly
16 −30 16 −1
ℓ′′1 (x2 ) = , ℓ′′2 (x2 ) = , ℓ′′3 (x2 ) = , ℓ′′4 (x2 ) = .
12h2 12h2 12h2 12h2
(Check !)
By Taylor expansion one can show that the leading error term is
h4 f (6) (x2 )
. (Check !)
90
237
EXERCISES :
238
•[098] SOLUTION : For the formula
−3f (0) + 4f (h) − f (2h)
f ′ (0) ∼
= ,
2h
we use the notation f0 = f (0) , f1 = f (h) , and f2 = f (2h).
The Lagrange interpolating polynomial is
(x − h)(x − 2h) (x − 0)(x − 2h) (x − 0)(x − h)
p(x) = f0 + f1 + f2 ,
(0 − h)(0 − 2h) (h − 0)(h − 2h) (2h − 0)(2h − h)
so that
(x − h) + (x − 2h) (x − 0) + (x − 2h) (x − 0) + (x − h)
p′ (x) = f0 + f1 + f2 ,
(0 − h)(0 − 2h) (h − 0)(h − 2h) (2h − 0)(2h − h)
with
′ (0 − h) + (0 − 2h) (0 − 0) + (0 − 2h) (0 − 0) + (0 − h)
p (0) = f0 + f1 + f2 ,
(0 − h)(0 − 2h) (h − 0)(h − 2h) (2h − 0)(2h − h)
from which
−3h −2h −h −3f0 + 4f1 − f2
f ′ (0) ∼
= p ′
(0) = f 0 + f 1 + f 2 = .
2h2 −h2 2h2 2h
SOLUTION : continued · · · :
−3f0 + 4f1 − f2
− f0′
2h
1
= − 3f0
2h
h2 ′′ h3 ′′′
+ 4 [ f0 + hf0′
+ f0 + f0 + · · · ]
2 6
2 3
(2h) (2h)
− [ f0 + 2hf0′ + f0′′ + f0′′′ + · · · ] − f0′
2 6
1 4h3 ′′′ (2h)3 ′′′
= f0 − f0 + · · ·
2h 6 6
h2 ′′′
= − f0 + higher order terms .
3
Thus this formula is of second order accuracy.
EXERCISES :
•[099] For the reference interval [0, 3h] , give complete details on the deriva-
tion of the four weights in the numerical differentiation formula
•[100] For the reference interval [−3h/2, 3h/2], give complete details on the
derivation of the weights in the numerical differentiation formula
239
BEST APPROXIMATION IN THE k · k2 .
hx, yi ≡ x1 y1 + x2 y2 + x3 y3 .
240
• The length or norm of a vector is defined in terms of the inner product :
1
q
k x k2 ≡ hx, xi 2 = x21 + x22 + x23 .
241
Let
k e1 k2 = k e2 k2 = k e3 k2 = 1,
242
Let S2 denote the x1 , x2 -plane .
Then
S2 = Span{e1 , e2 } .
S2 is a 2-dimensional subspace of R3 .
p∗ ∈ S2 ,
to a given vector x ∈ R3 .
k x − p k2 ,
over all p ∈ S2 .
243
x
x-p*
e
1
S
2
p*
e
2
244
Geometrically we see that k x − p k2 is minimized if and only if
(x − p) ⊥ S2 ,
hx − p , e1 i = 0 , and hx − p , e2 i = 0 ,
hx , e1 i = hp , e1 i , and hx , e2 i = hp , e2 i .
245
Since p ∈ S2 we have
p = c1 e1 + c2 e2 ,
Hence
p∗ = c1 e1 + c2 e2 ,
with
c1 = hx, e1 i and c2 = hx, e2 i .
246
Best Approximation in General.
247
THEOREM :
Let X be a vector space with an inner product satisfying the properties above.
Then 1
kxk ≡ hx, xi ,2
defines a norm on X.
1 1
2
(ii) k αx k = hαx, αxi 2 = ( α hx, xi ) 2 = |α| kxk.
248
Let
hx, yi hx, yi
α ≡ = 2
, where x, y ∈ X .
hy, yi kyk
Then
0 ≤ k x − αy k2 = hx − αy, x − αyi
= k x k2 − 2αhx, yi + α2 k y k2
249
Now
k x + y k2 = hx + y, x + yi
= k x k2 + 2hx, yi + k y k2
≤ k x k2 + 2 | hx, yi | + k y k2
≤ k x k2 + 2 k x kk y k + k y k2
= ( k x k + k y k )2 .
Hence
kx+y k ≤ kxk + ky k . ( Triangle Inequality ) QED !
250
Suppose {ek }nk=1 is an orthonormal set of vectors in X , i.e.,
0, if l 6= k ,
heℓ , ek i =
1, if l = k .
Let Sn ⊂ X be defined by
Sn = Span{ek }nk=1 .
251
THEOREM :
hx, ek i
ck = , if the basis is orthogonal ,
hek , ek i
and
252
PROOF : Let n
X
F (c1 , c2 , · · · , cn ) ≡ kx− ck ek k2 .
k=1
253
We had
n
X n
X
F (c1 , c2 , · · · , cn ) = k x k2 − 2 ck hx, ek i + c2k hek , ek i .
k=1 k=1
∂F
Setting ∂cℓ
= 0 gives
−2hx, eℓ i + 2cℓ heℓ , eℓ i = 0 .
hx, eℓ i
cℓ = ,
heℓ , eℓ i
QED !
254
NOTE :
C[0, 1] with k · k∞ ,
255
Gram-Schmidt Orthogonalization.
To construct
• Set
hv2 , e1 i
e2 = v2 − e1 .
he1 , e1 i
Then
he2 , e1 i = 0 . (Check !)
256
v
2
e
1
e =v -e
2 2
< v2 , e1 >
e= ( ) e1
< e1 , e1 >
257
Inductively, suppose we have mutually orthogonal {ek }m−1
k=1 , (m ≤ n).
m−1
• Choose vm ∈ Sn linearly independent from the {ek }k=1 .
• Set
m−1
X hvm , ek i
em = vm − ek .
k=1
hek , ek i
Then
hem , eℓ i = 0 . ℓ = 1, 2, · · · , m − 1 . (Check !)
258
Best Approximation in a Function Space.
X = C[−1, 1] ,
This definition satisfies all conditions an inner product must satisfy. (Check !)
−1
is a norm on C[−1, 1] .
259
Suppose we want to find p∗ ∈ Pn that best approximates a given function
f ∈ C[−1, 1] ,
in the k · k2 .
and where the {ek }nk=0 denote the first n + 1 orthogonal polynomials .
260
Use the Gram-Schmidt procedure to construct an orthogonal basis of Pn :
Then R1
hv1 , e0 i −1
x dx
= R1 = 0.
he0 , e0 i 12 dx
−1
Hence
e1 (x) = v1 (x) − 0 · e0 (x) = x.
261
Take v2 (x) = x2 .
Then R1 2
hv2 , e0 i −1
x dx 1
= R1 = ,
he0 , e0 i 12 dx 3
−1
and R1 3
hv2 , e1 i −1
x dx
= R1 = 0.
he1 , e1 i x2 dx
−1
Hence
1 2 1
e2 (x) = v2 (x) − e0 (x) − 0 · e1 (x) = x − .
3 3
262
Take v3 (x) = x3 . Then
R13
hv3 , e0 i −1
x dx
= R1 = 0,
he0 , e0 i 2
1 dx
−1
and R14
hv3 , e1 i −1
x dx 3
= R1 = ,
he1 , e1 i x2 dx 5
−1
and R1 3 2 1
hv3 , e2 i −1
x (x − 3
) dx
= R1 = 0.
he2 , e2 i 2 1 2
(x − 3 ) dx
−1
Hence
3 3
e3 (x) = v3 (x) − 0 · e0 (x) − e1 (x) − 0 · e2 (x) = x3 − x.
5 5
etc.
263
EXAMPLE :
f (x) = ex , on [−1, 1] , in k · k2 ,
is given by
p∗ (x) = c0 e0 (x) + c1 e1 (x) + c2 e2 (x) ,
where
hf, e0 i hf, e1 i hf, e2 i
c0 = , c1 = , c2 = .
he0 , e0 i he1 , e1 i he2 , e2 i
264
We find that
R1 x
hf, e0 i −1
e dx 1 1
c0 = = R1 = (e − ) = 1.175 ,
he0 , e0 i 2
1 dx 2 e
−1
R1 x
hf, e1 i e x dx 3
c1 = = −1
R1 = (x − 1)ex |1−1 = 1.103 ,
he1 , e1 i x2 dx 2
−1
R1 x 2 1
hf, e2 i −1
e (x − 3
) dx 45 2 5 x1
c2 = = R1 = (x − 2x + )e |−1 = 0.536 .
he2 , e2 i 1
(x2 − 3 )2 dx 8 3
−1
Therefore
∗ 1 2
p (x) = 1.175 (1) + 1.103 (x) + 0.536 (x − )
3
= 0.536 x2 + 1.103 x + 0.996 . (Check !)
265
Best approximation of f (x) = ex in [−1, 1] by a polynomial p ∈ P2 .
266
EXERCISES :
267
•[101] Use the Gram-Schmidt procedure to construct an orthogonal basis of
the polynomial space P4 on the interval [−1, 1], by deriving e4 (x), given
e0 (x) = 1 , e1 (x) = x , e2 (x) = x2 − 13 , and e3 (x) = x3 − 35 x .
Hence
3
e2 (x) = v2 (x) − 5
e1 (x) = x3 − 53 x .
SOLUTION : continued · · ·
We found thet e0 (x) = 1, e1 (x) = x, and e2 (x) = x3 − 35 x .
6
R1
hf, e1 i −1
x dx 3
c1 = = R1 = ,
he1 , e1 i x2 dx 7
−1
R1 5 3 3
hf, e2 i −1
x (x − 5
x) dx 10
c2 = = R1 = .
he2 , e2 i 3 3 2
(x − 5 x) dx 9
−1
Thus
p∗ (x) = c0 e0 (x) + c1 e1 (x) + c2 e2 (x)
3 10
= 7
x + 9
(x3 − 53 x) = 10 3
9
x − 5
21
x .
Best approximation of f (x) = x5 (blue) in [−1, 1]
by a polynomial p ∈ Span{1, x, x3 } (red).
•[103] Use the Gram-Schmidt procedure to construct an orthogonal basis of
the linear space Span{1, x2 , x4 } for the interval [−1, 1]. Determine the
best approximation in the k · k2 to f (x) = x6 .
R1 6 2 1
hf, e1 i −1
x (x − 3
) dx 5 ∼
c1 = = R1 = = 0.71428573 ,
he1 , e1 i 2 1 2
(x − 3 ) dx 7
−1
R1 6 4 6 2 3
hf, e2 i x (x − x + ) dx
c2 = = −1
R1
7 35 ∼
= 1.3636366 .
he2 , e2 i 6 3
(x4 − 7 x2 + 35 )2 dx
−1
268
Most formulas are based on integrating local interpolating polynomials of f :
Z b N Z tj
f (x) dx ∼
X
= pj (x) dx ,
a j=1 tj−1
269
The Trapezoidal Rule.
If n = 1 , and if pj ∈ P1 interpolates f at tj−1 and tj , then
Z tj
h
pj (x) dx = (fj−1 + fj ) , (local integration formula) .
tj−1 2
f 1
0
0
1
0
1
11
00 11
00
00
11 00
11
pN
p1 1
0
1
0
0
1 p2 0
1
t0 t1 t2 tN
a b
270
The composite integration formula then becomes

    \int_a^b f(x) \, dx \cong \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx

        = \sum_{j=1}^{N} \frac{h}{2} (f_{j-1} + f_j)

        = h \, \Big( \frac{1}{2} f_0 + f_1 + \cdots + f_{N-1} + \frac{1}{2} f_N \Big) ,

where f_j ≡ f(t_j) .
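A minimal sketch (not from the notes) of this composite Trapezoidal Rule on a uniform mesh :

```python
# Composite Trapezoidal Rule: h * ( f0/2 + f1 + ... + f_{N-1} + fN/2 ).
import numpy as np

def trapezoid(f, a, b, N):
    t = np.linspace(a, b, N + 1)     # mesh points t_0, ..., t_N
    h = (b - a) / N
    fv = f(t)
    return h * (fv[0] / 2 + fv[1:-1].sum() + fv[-1] / 2)

print(trapezoid(np.sin, 0.0, np.pi, 100))   # ~2.0 (the exact integral is 2)
```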
In general

    p_j(x) = \sum_{i=0}^{n} f(x_{ji}) \, \ell_{ji}(x) ,

where

    \ell_{ji}(x) = \prod_{k=0, k \ne i}^{n} \frac{x - x_{jk}}{x_{ji} - x_{jk}} .

The integrals

    \int_{t_{j-1}}^{t_j} \ell_{ji}(x) \, dx ,

are the weights of the local integration formula.
Simpson's Rule.

Let n = 2 , and in each subinterval [t_{j-1} , t_j] choose the interpolation points

    t_{j-1} , \quad t_{j-\frac{1}{2}} \equiv \frac{1}{2} (t_{j-1} + t_j) , \quad and \quad t_j .

[Figure: the piecewise quadratic interpolants p_1 , p_2 , · · · , p_N of f .]

[Figure: the local quadratic p interpolating f at the points -h/2 , 0 , h/2 of the reference interval.]
The weights are

    \int_{-h/2}^{h/2} \frac{(x - 0)(x - \frac{h}{2})}{(-\frac{h}{2} - 0)(-\frac{h}{2} - \frac{h}{2})} \, dx = \frac{h}{6} ,

    \int_{-h/2}^{h/2} \frac{(x + \frac{h}{2})(x - \frac{h}{2})}{(0 + \frac{h}{2})(0 - \frac{h}{2})} \, dx = \frac{4h}{6} ,

    \int_{-h/2}^{h/2} \frac{(x + \frac{h}{2})(x - 0)}{(\frac{h}{2} + \frac{h}{2})(\frac{h}{2} - 0)} \, dx = \frac{h}{6} .

(Check !)
With uniformly spaced {t_j}_{j=0}^{N} , the composite integration formula becomes

    \int_a^b f(x) \, dx \cong \sum_{j=1}^{N} \frac{h}{6} \big( f_{j-1} + 4 f_{j-\frac{1}{2}} + f_j \big)

        = \frac{h}{6} \big( f_0 + 4 f_{\frac{1}{2}} + 2 f_1 + 4 f_{1\frac{1}{2}} + 2 f_2 + \cdots + 2 f_{N-1} + 4 f_{N-\frac{1}{2}} + f_N \big) .
The local polynomials (red) in Simpson's Rule for numerically integrating f(x) = \frac{1}{1+x^2} (blue) .
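A minimal sketch (not from the notes) of this composite Simpson's Rule, using the midpoints t_{j-1/2} :

```python
# Composite Simpson's Rule: (h/6) * sum_j ( f_{j-1} + 4 f_{j-1/2} + f_j ).
import numpy as np

def simpson(f, a, b, N):
    t = np.linspace(a, b, N + 1)        # subinterval endpoints
    tm = (t[:-1] + t[1:]) / 2           # midpoints t_{j-1/2}
    h = (b - a) / N
    return (h / 6) * np.sum(f(t[:-1]) + 4 * f(tm) + f(t[1:]))

f = lambda x: 1 / (1 + x**2)
print(simpson(f, 0.0, 1.0, 10))   # ~0.78539816 = pi/4
```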
THEOREM :

    \Big| \int_a^b f(x) \, dx - \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| \; \le \; \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+1} C_n (b-a)}{2^{n+1}} ,

where

    h = \frac{b-a}{N} ,

and where the value of the constant C_n depends on the placement of the n + 1 interpolation points within each subinterval.
PROOF : The local error is

    \Big| \int_{t_{j-1}}^{t_j} f(x) \, dx - \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| = \Big| \int_{t_{j-1}}^{t_j} f(x) - p_j(x) \, dx \Big|

        = \Big| \int_{t_{j-1}}^{t_j} \frac{f^{(n+1)}(\xi(x))}{(n+1)!} \prod_{i=0}^{n} (x - x_{ji}) \, dx \Big|

        \le \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \Big( \frac{h}{2} \Big)^{n+1} C_n \, | t_j - t_{j-1} |

        = \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+2} C_n}{2^{n+1}} .
The error in the composite formula is now easily determined :

    \Big| \int_a^b f(x) \, dx - \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| = \Big| \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} f(x) - p_j(x) \, dx \Big|

        \le \sum_{j=1}^{N} \Big| \int_{t_{j-1}}^{t_j} f(x) - p_j(x) \, dx \Big|

        \le N \, \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+2} C_n}{2^{n+1}}

        = \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+1} C_n (b-a)}{2^{n+1}} ,

where the last step uses the fact that

    h = \frac{b-a}{N} , \quad i.e. , \quad N = \frac{b-a}{h} .

QED !
Thus we have shown that

    \Big| \int_a^b f(x) \, dx - \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} p_j(x) \, dx \Big| \; \le \; \frac{\| f^{(n+1)} \|_\infty}{(n+1)!} \, \frac{h^{n+1} C_n (b-a)}{2^{n+1}} .
EXAMPLES :

For the Trapezoidal Rule ( n = 1 ) the error bound becomes

    \frac{h^2}{8} \, \| f'' \|_\infty \, (b-a) .

Indeed the Trapezoidal Rule is O(h^2) .
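This rate is easy to observe numerically; a small self-contained check (not from the notes) :

```python
# Verifying the O(h^2) behaviour of the composite Trapezoidal Rule:
# doubling N (halving h) should reduce the error by a factor of ~4.
import numpy as np

exact = 2.0                                   # integral of sin over [0, pi]
for N in (10, 20, 40, 80):
    t = np.linspace(0.0, np.pi, N + 1)
    err = abs(np.trapz(np.sin(t), t) - exact)
    print(N, err)
```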
EXAMPLE : For Simpson's Rule, Taylor expansion of the local error on the reference interval gives

    \int_{-h/2}^{h/2} f(x) \, dx - \frac{h}{6} \big[ f(-\tfrac{h}{2}) + 4 f(0) + f(\tfrac{h}{2}) \big]

    = \int_{-h/2}^{h/2} \Big[ f_0 + x f_0' + \frac{x^2}{2} f_0'' + \frac{x^3}{6} f_0''' + \frac{x^4}{24} f_0'''' + \cdots \Big] \, dx

      - \frac{h}{6} \Big[ \; f_0 - \frac{h}{2} f_0' + \frac{1}{2} \Big(\frac{h}{2}\Big)^2 f_0'' - \frac{1}{6} \Big(\frac{h}{2}\Big)^3 f_0''' + \frac{1}{24} \Big(\frac{h}{2}\Big)^4 f_0'''' + \cdots

                 + 4 f_0

                 + f_0 + \frac{h}{2} f_0' + \frac{1}{2} \Big(\frac{h}{2}\Big)^2 f_0'' + \frac{1}{6} \Big(\frac{h}{2}\Big)^3 f_0''' + \frac{1}{24} \Big(\frac{h}{2}\Big)^4 f_0'''' + \cdots \Big]

    = \Big[ x f_0 + \frac{x^2}{2} f_0' + \frac{x^3}{6} f_0'' + \frac{x^4}{24} f_0''' + \frac{x^5}{120} f_0'''' + \cdots \Big]_{-h/2}^{h/2}

      - \frac{h}{6} \Big[ 6 f_0 + \frac{h^2}{4} f_0'' + \frac{h^4}{192} f_0'''' + \cdots \Big]

    = - \frac{h^5}{2880} f_0'''' + higher \; order \; terms.

Thus the leading error term of the composite Simpson's Rule is bounded by

    \frac{h^4}{2880} \, \| f'''' \|_\infty \, (b-a) .
EXERCISE :

•[105] The Local Midpoint Rule, for numerically integrating a function f(x)
       over the reference interval [-h/2, h/2], is given by

           \int_{-h/2}^{h/2} f(x) \, dx \cong h f(0) .

       Write down the formula for the Composite Midpoint Rule for integrating
       f(x) over a general interval [a, b].
       How big must N be for the global error to be less than 10^{-6} , when
       integrating f(x) = sin(x) over the interval [0, 1] ?
•[105] SOLUTION : Taylor expand for the local error :

    \int_{-h/2}^{h/2} f(x) \, dx - h f_0

        = \int_{-h/2}^{h/2} \Big[ f_0 + x f_0' + \frac{x^2}{2} f_0'' + \cdots \Big] \, dx - h f_0

        = h f_0 + \frac{x^2}{2} f_0' \Big|_{-h/2}^{h/2} + \frac{x^3}{6} f_0'' \Big|_{-h/2}^{h/2} + \cdots - h f_0

        = \frac{h^3}{24} f_0'' + higher \; order \; terms.
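A sketch (not from the notes) of one way to complete [105]. Summing the local errors gives a global error of about (h^2/24) \| f'' \|_\infty (b-a) , and for f = sin on [0, 1] (so \| f'' \|_\infty \le 1) this drops below 10^{-6} once N \ge 205 or so :

```python
# Composite Midpoint Rule, h * sum_j f(t_{j-1/2}), with an error check.
import numpy as np

def midpoint(f, a, b, N):
    h = (b - a) / N
    tm = a + h * (np.arange(N) + 0.5)   # midpoints t_{j-1/2}
    return h * f(tm).sum()

exact = 1.0 - np.cos(1.0)               # integral of sin over [0, 1]
print(abs(midpoint(np.sin, 0.0, 1.0, 205) - exact))  # below 1e-6
```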
•[106] The local Trapezoidal Rule for the reference interval [-h/2, h/2] is

           \int_{-h/2}^{h/2} f(x) \, dx \cong \frac{h}{2} \big[ f(-h/2) + f(h/2) \big] .

       Use Taylor expansions to derive the local error formula.
THE GAUSS QUADRATURE THEOREM :

If in each subinterval [t_{j-1} , t_j] the interpolation points {x_{ji}}_{i=0}^{n} are taken as
the n + 1 zeroes of the orthogonal polynomial e_{n+1} ( relative to [t_{j-1} , t_j] ) ,
then the local integration formula has error O(h^{2n+3}) , as derived in the proof below,
so that the composite formula has error O(h^{2n+2}) .
EXAMPLE : The case n = 1 :

The two Gauss points relative to [-1, 1] are the zeroes of e_2(x) , i.e.,

    x_0 = - \frac{\sqrt{3}}{3} \quad and \quad x_1 = \frac{\sqrt{3}}{3} .

Relative to the reference interval I_h ≡ [-h/2 , h/2] the Gauss points are

    x_0 = - \frac{h \sqrt{3}}{6} \quad and \quad x_1 = \frac{h \sqrt{3}}{6} ,

where

    \ell_0(x) = \frac{x - x_1}{x_0 - x_1} = \frac{x - h\sqrt{3}/6}{-h\sqrt{3}/3} ,

and

    \ell_1(x) = \frac{x - x_0}{x_1 - x_0} = \frac{x + h\sqrt{3}/6}{h\sqrt{3}/3} .

The local integration formula is

    \int_{-h/2}^{h/2} f(x) \, dx \cong f(x_0) \int_{-h/2}^{h/2} \ell_0(x) \, dx + f(x_1) \int_{-h/2}^{h/2} \ell_1(x) \, dx = \frac{h}{2} \big[ f(x_0) + f(x_1) \big] ,

since each of the two weights equals h/2 . (Check !)
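A sketch (not from the notes) of the corresponding composite 2-point Gauss rule on [a, b] :

```python
# Composite 2-point Gauss rule: on each subinterval of width h, evaluate f at
# midpoint +/- h*sqrt(3)/6 and weight each value by h/2.
import numpy as np

def gauss2(f, a, b, N):
    h = (b - a) / N
    mid = a + h * (np.arange(N) + 0.5)   # subinterval midpoints
    d = h * np.sqrt(3) / 6               # Gauss-point offset
    return (h / 2) * (f(mid - d) + f(mid + d)).sum()

print(gauss2(np.exp, 0.0, 1.0, 4))   # ~1.7182818 = e - 1, already very accurate
```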
PROOF (of the Gauss Quadrature Theorem) :

Since p interpolates f at the n + 1 zeroes of e_{n+1} , we can write

    f(x) - p(x) = c(x) \, e_{n+1}(x) .

FACT : If f(x) is very smooth then c(x) has n + 1 continuous derivatives.

Expand c(x) in the orthogonal polynomials, call the remainder r(x) , and use the fact that each summation term is in P_n :

    c(x) = \sum_{k=0}^{n} c_k e_k(x) + r(x) .
We have

    \int_{-h/2}^{h/2} f(x) - p(x) \, dx = \int_{-h/2}^{h/2} c(x) \, e_{n+1}(x) \, dx ,

and

    c(x) = \sum_{k=0}^{n} c_k e_k(x) + r(x) .

It follows that

    \Big| \int_{-h/2}^{h/2} f(x) - p(x) \, dx \Big| = \Big| \int_{-h/2}^{h/2} \Big[ \sum_{k=0}^{n} c_k e_k(x) + r(x) \Big] e_{n+1}(x) \, dx \Big|

        = \Big| \sum_{k=0}^{n} c_k \int_{-h/2}^{h/2} e_k(x) \, e_{n+1}(x) \, dx + \int_{-h/2}^{h/2} r(x) \, e_{n+1}(x) \, dx \Big| .
Note that all terms in the summation are zero by orthogonality, so that

    \Big| \int_{-h/2}^{h/2} f(x) - p(x) \, dx \Big| = \Big| \int_{-h/2}^{h/2} r(x) \, e_{n+1}(x) \, dx \Big|

        = \Big| \int_{-h/2}^{h/2} c^{(n+1)}(\eta(x)) \, \frac{x^{n+1}}{(n+1)!} \prod_{i=0}^{n} (x - x_i) \, dx \Big|

        \le h \max_{x \in I_h} \Big| c^{(n+1)}(\eta(x)) \, \frac{x^{n+1}}{(n+1)!} \prod_{i=0}^{n} (x - x_i) \Big|

        \le h \, \frac{(h/2)^{n+1}}{(n+1)!} \max_{x \in I_h} | c^{(n+1)}(x) | \, h^{n+1}

        = \frac{h^{2n+3}}{2^{n+1} (n+1)!} \max_{x \in I_h} | c^{(n+1)}(x) | .
EXERCISE :
•[107] Give complete details on the derivation of the local 3-point Gauss
integration formula. Also write down the composite 3-point Gauss
formula for integrating a function f (x) over a general interval [a, b].
•[108] Are the following True or False for any sufficiently smooth f (x) ?
DISCRETE LEAST SQUARES APPROXIMATION

So far we measured the error of an approximation p to f in the continuous least squares sense,

    \| p - f \|_2^2 \equiv \int_{-1}^{1} [ p(x) - f(x) ]^2 \, dx .

Next we solve the discrete least squares problem : given data points

    \{ (x_i , y_i) \}_{i=1}^{N} ,

find the function p(x) , from a given class of functions, that fits the data as well as possible in the least squares sense.
Linear Least Squares

[Figure: average daily high temperature in Montreal in March (Temperature versus Day).]

Suppose that we look for a linear approximation to the temperatures :

    T_k = c_1 + c_2 k , \quad k = 1, 2, \cdots , 31 .
[Figure: average daily high temperatures, with a linear approximation.]

• There are many ways to determine such a linear approximation.

[Figure: the least squares error versus c_1 and c_2 .]
From setting the partial derivatives to zero, we have

    \sum_{k=1}^{N} \big( T_k - (c_1 + c_2 x_k) \big) = 0 , \qquad \sum_{k=1}^{N} x_k \big( T_k - (c_1 + c_2 x_k) \big) = 0 .

EXAMPLE : For our "March temperatures" example, we find

    c_1 = -2.111 \quad and \quad c_2 = 0.272 .
[Figure: average daily high temperatures, with linear least squares approximation.]
General Least Squares

More generally we can fit the data with a linear combination of given basis functions,

    p(x) = \sum_{i=1}^{n} c_i \phi_i(x) .

EXAMPLES :

• p(x) = c_1 + c_2 x . (Already done !)
For any vector x ∈ R^N we have

    \| x \|_2^2 \equiv x^T x \equiv \sum_{k=1}^{N} x_k^2 . \quad (T denotes transpose).

Then

    E_L \equiv \sum_{i=1}^{N} [ p(x_i) - y_i ]^2 = \Big\| \begin{pmatrix} p(x_1) \\ \vdots \\ p(x_N) \end{pmatrix} - \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \Big\|_2^2

        = \Big\| \begin{pmatrix} \sum_{i=1}^{n} c_i \phi_i(x_1) \\ \vdots \\ \sum_{i=1}^{n} c_i \phi_i(x_N) \end{pmatrix} - \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \Big\|_2^2

        = \Big\| \begin{pmatrix} \phi_1(x_1) & \cdots & \phi_n(x_1) \\ \vdots & & \vdots \\ \phi_1(x_N) & \cdots & \phi_n(x_N) \end{pmatrix} \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} - \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \Big\|_2^2 \equiv \| A c - y \|_2^2 .
THEOREM : The least squares error E_L is minimized when c satisfies the normal equations

    A^T A \, c = A^T y .

PROOF :

    E_L = \| A c - y \|_2^2

        = (Ac)^T Ac - (Ac)^T y - y^T Ac + y^T y

        = c^T A^T A c - c^T A^T y - y^T A c + y^T y .

Setting the gradient of E_L with respect to c equal to zero then gives

    2 A^T A c - 2 A^T y = 0 , \quad i.e. , \quad A^T A c = A^T y .
EXAMPLE : Given the data points

    \{ (x_i , y_i) \}_{i=1}^{4} = \{ (0, 1) , (1, 3) , (2, 2) , (4, 3) \} ,

find the coefficients c_1 and c_2 of p(x) = c_1 + c_2 x ,
that minimize

    E_L \equiv \sum_{i=1}^{4} [ (c_1 + c_2 x_i) - y_i ]^2 .
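A sketch (not from the notes) of how this example can be solved numerically via the normal equations :

```python
# Straight-line least squares fit for the four data points above,
# by assembling A and solving A^T A c = A^T y.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 3.0])

A = np.column_stack([np.ones_like(x), x])    # columns: phi_1 = 1, phi_2 = x
c = np.linalg.solve(A.T @ A, A.T @ y)
print(c)   # c1 = 1.6, c2 = 13/35 ~ 0.3714
```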
EXAMPLE : Given the same data points, find the coefficients of

    p(x) = c_1 + c_2 x + c_3 x^2 ,

that minimize

    E_L \equiv \sum_{i=1}^{4} [ (c_1 + c_2 x_i + c_3 x_i^2) - y_i ]^2 .

SOLUTION : Here

    N = 4 , \quad n = 3 , \quad \phi_1(x) = 1 , \quad \phi_2(x) = x , \quad \phi_3(x) = x^2 .

Use the Theorem :

    \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 1 & 4 & 16 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 4 & 16 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 1 & 4 & 16 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \\ 2 \\ 3 \end{pmatrix} ,

or

    \begin{pmatrix} 4 & 7 & 21 \\ 7 & 21 & 73 \\ 21 & 73 & 273 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 9 \\ 19 \\ 59 \end{pmatrix} .
The least squares approximations from the preceding two examples :

[Figure: the data with the two fits; left: p(x) = c_1 + c_2 x ; right: p(x) = c_1 + c_2 x + c_3 x^2 .]
EXAMPLE : From actual data :

    January     -5
    February    -3
    March        3
    April       11
    May         19
    June        24
    July        26
    August      25
    September   20
    October     13
    November     6
    December    -2

Source : http://weather.uk.msn.com
[Figure: average daily high temperature in Montreal (by month).]
EXAMPLE : continued · · ·

The graph suggests using a 3-term least squares approximation.

[Figure: least squares fit of average daily high temperatures.]
EXAMPLE : Consider the following experimental data :

[Figure: the experimental data points (y versus x).]
EXAMPLE : continued · · ·

Suppose we try to fit the data with a function of the form

    y = c_1 x^{c_2} e^{-c_3 x} .

Note that :

• y depends nonlinearly on the parameters c_2 and c_3 .

• What to do ?

Take logarithms :

    \log y = \log c_1 + c_2 \log x - c_3 x .

Thus

• We can now use regular least squares.
EXAMPLE : continued · · ·

[Figure: the logarithm of the original y-values versus x .]
EXAMPLE : continued · · ·

We had

    y = c_1 x^{c_2} e^{-c_3 x} ,

and

    \log y = \hat{c}_1 \phi_1(x) + c_2 \phi_2(x) + c_3 \phi_3(x) ,

with

    \phi_1(x) = 1 , \quad \phi_2(x) = \log x , \quad \phi_3(x) = -x ,

and

    \hat{c}_1 = \log c_1 .

Solving the resulting linear least squares problem gives, in particular,

    c_1 = e^{\hat{c}_1} = 0.995 .
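A sketch of this transformation trick (not from the notes; the original data set is not reproduced here, so synthetic data generated from the assumed model are used) :

```python
# Fit log y = c1_hat + c2*log(x) - c3*x by linear least squares,
# then recover c1 = exp(c1_hat).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.5, 8.0, 40)
y = 1.0 * x**2 * np.exp(-1.5 * x) * np.exp(0.05 * rng.standard_normal(x.size))

A = np.column_stack([np.ones_like(x), np.log(x), -x])   # phi_1, phi_2, phi_3
coef = np.linalg.solve(A.T @ A, A.T @ np.log(y))
c1, c2, c3 = np.exp(coef[0]), coef[1], coef[2]
print(c1, c2, c3)   # close to the true values 1.0, 2.0, 1.5
```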
EXAMPLE : continued · · ·

[Figure: the least squares approximation of the transformed data (log y versus x).]
EXAMPLE : continued · · ·

[Figure: the least squares approximation shown with the original data.]
EXERCISES :
SMOOTH INTERPOLATION BY PIECEWISE POLYNOMIALS

The cubic spline that interpolates the indicated data points.
Cubic Spline Interpolation.

Find a function p(t) such that :

• p is a cubic polynomial in each interval [t_{j-1} , t_j] ,

• p ∈ C^2[a, b] ,

• p(t_j) = f(t_j) , \quad j = 0, 1, \cdots , N ,

• ⋆ : p''(t_0) = p''(t_N) = 0 .

• With the above choice of ⋆ the spline is called the natural cubic spline.
This spline is "formally well defined", because the number of conditions matches the 4N coefficients of the N local cubics :

    interpolation equations                N + 1
    C^2 continuity at interior points      3(N - 1)
    end conditions ⋆                       2
    total                                  4N .
NOTE : Splines are typically used when the number of data points

    \{ ( t_j , f_j ) \}_{j=0}^{N}

is relatively small.
Consider the interval [t_{j-1} , t_j] of size h_j .

In each interval the local cubic can be written in terms of the four values

    p_0 , \; p_1 , \; p_0'' , \; p_1'' ,

where, for the interval [t_0 , t_1] , p(t_0) = p_0 , p(t_1) = p_1 , etc.

In fact, for the interval [t_0 , t_1] , one finds the polynomial

    p_1(t) = \frac{p_0''}{6 h_1} (t_1 - t)^3 + \frac{p_1''}{6 h_1} (t - t_0)^3 + \Big( \frac{p_1}{h_1} - \frac{p_1'' h_1}{6} \Big) (t - t_0) + \Big( \frac{p_0}{h_1} - \frac{p_0'' h_1}{6} \Big) (t_1 - t) .

Indeed, p_1 ∈ P_3 , and

    p_1(t_0) = p_0 , \quad p_1(t_1) = p_1 , \quad p_1''(t_0) = p_0'' , \quad p_1''(t_1) = p_1'' . (Check !)
By construction the local polynomials p_1 and p_2 connect continuously at t_1 ,
and similarly at the other interior mesh points. Requiring the first derivatives
of consecutive local polynomials to match as well leads to consistency relations.

For consecutive intervals [t_{j-1} , t_j] and [t_j , t_{j+1}] , the consistency relation is

    h_j p_{j-1}'' + 2 (h_j + h_{j+1}) p_j'' + h_{j+1} p_{j+1}'' = 6 \Big( \frac{p_{j+1} - p_j}{h_{j+1}} - \frac{p_j - p_{j-1}}{h_j} \Big) ,

where

    h_j \equiv t_j - t_{j-1} \quad and \quad h_{j+1} \equiv t_{j+1} - t_j .

We have one such equation for each interior mesh point . The interpolation conditions fix

    p_j = f_j , \quad j = 0, 1, \cdots , N .
This gives a tridiagonal system of equations for the unknown values

    p_j'' , \quad for \quad j = 1, \cdots , N - 1 ,

namely,

    \begin{pmatrix}
    2(h_1 + h_2) & h_2 & & & \\
    h_2 & 2(h_2 + h_3) & h_3 & & \\
    & \cdot & \cdot & \cdot & \\
    & & \cdot & \cdot & \cdot \\
    & & & h_{N-1} & 2(h_{N-1} + h_N)
    \end{pmatrix}
    \begin{pmatrix} p_1'' \\ p_2'' \\ \cdot \\ p_{N-2}'' \\ p_{N-1}'' \end{pmatrix}
    =
    \begin{pmatrix} F_1 \\ F_2 \\ \cdot \\ F_{N-2} \\ F_{N-1} \end{pmatrix} ,

where

    F_j \equiv 6 \Big( \frac{f_{j+1} - f_j}{h_{j+1}} - \frac{f_j - f_{j-1}}{h_j} \Big) .
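A sketch (not from the notes) that assembles and solves this tridiagonal system for the natural cubic spline ( p_0'' = p_N'' = 0 ) on a general mesh :

```python
# Second derivatives p''_j of the natural cubic spline on mesh t with data f.
import numpy as np

def natural_spline_second_derivs(t, f):
    h = np.diff(t)                     # h_1, ..., h_N
    N = len(h)
    A = np.zeros((N - 1, N - 1))
    F = np.zeros(N - 1)
    for j in range(1, N):              # interior mesh points t_1 .. t_{N-1}
        i = j - 1
        A[i, i] = 2 * (h[j - 1] + h[j])
        if i > 0:
            A[i, i - 1] = h[j - 1]
        if i < N - 2:
            A[i, i + 1] = h[j]
        F[i] = 6 * ((f[j + 1] - f[j]) / h[j] - (f[j] - f[j - 1]) / h[j - 1])
    pdd = np.zeros(N + 1)              # p''_0 = p''_N = 0 (natural spline)
    pdd[1:N] = np.linalg.solve(A, F)   # dense solve here; tridiagonal in practice
    return pdd

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
print(natural_spline_second_derivs(t, np.sin(t)))
```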
NOTE :

• In each row the diagonal entry is bigger than the sum of the other entries.

• Thus the matrix is strictly diagonally dominant, and the tridiagonal system has a unique solution.

The cubic spline that interpolates the indicated data points.
NUMERICAL METHODS FOR INITIAL VALUE PROBLEMS

Here we discuss some basic concepts that arise in the numerical solution of
initial value problems (IVPs) in ordinary differential equations (ODEs) ,

    u'(t) = f( u(t) ) , \quad u(0) = u_0 .

Here u , f(·) ∈ R^n .
Many higher order ODEs can be rewritten as first order systems .

EXAMPLE :

    u'' = g( u(t) , u'(t) ) ,
    u(0) = u_0 ,
    u'(0) = v_0 ,

can be rewritten as

    u'(t) = v(t) ,
    v'(t) = g( u(t) , v(t) ) ,
    u(0) = u_0 , \quad v(0) = v_0 .
EXAMPLE : The equations of motion of the circular restricted three-body problem
involve the distances to the two primary bodies,

    r_1 = \sqrt{(x + \mu)^2 + y^2 + z^2} , \quad r_2 = \sqrt{(x - 1 + \mu)^2 + y^2 + z^2} .

Rewritten as a first order system, the position equations become

    x' = v_x , \quad y' = v_y , \quad z' = v_z ,

together with corresponding equations for v_x' , v_y' , v_z' .

Here \mu is the mass ratio , i.e.,

    \mu \equiv \frac{m_2}{m_1 + m_2} ,

where m_1 is the mass of the larger body, and m_2 of the smaller body.

For example,

    \mu \cong 0.01215 for the Earth-Moon system,
    \mu \cong 9.53 \cdot 10^{-4} for the Sun-Jupiter system,
    \mu \cong 3.0 \cdot 10^{-6} for the Sun-Earth system.

The larger body is located at (-\mu, 0, 0) , and the smaller body at (1-\mu, 0, 0) .

A trajectory connecting a periodic "Halo orbit" to itself.
Numerical Methods.

Let

    t_j \equiv j \, \Delta t , \quad j = 0, 1, 2, \cdots .

Below we give several basic numerical methods for solving the IVP

    u'(t) = f( u(t) ) , \quad u(0) = u_0 .

Euler's Method :

From

    \frac{u(t_{j+1}) - u(t_j)}{\Delta t} \cong u'(t_j) = f( u(t_j) ) ,

we have

    u_{j+1} = u_j + \Delta t \, f(u_j) , \quad j = 0, 1, 2, \cdots ,

where u_j denotes the numerical approximation of u(t_j) .
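A minimal sketch (not from the notes) of Euler's method for a scalar or vector u :

```python
# Euler's method for u' = f(u), u(0) = u0.
import numpy as np

def euler(f, u0, dt, nsteps):
    u = np.asarray(u0, dtype=float)
    for _ in range(nsteps):
        u = u + dt * f(u)
    return u

# Example: u' = -u, u(0) = 1; the exact value u(1) = e^{-1} ~ 0.36788.
print(euler(lambda u: -u, 1.0, 0.001, 1000))
```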
The Trapezoidal Method :

    u_{j+1} = u_j + \frac{\Delta t}{2} \big[ f(u_j) + f(u_{j+1}) \big] , \quad j = 0, 1, 2, \cdots .

Note that this method is implicit : each step requires solving an equation (in general nonlinear) for u_{j+1} .
A Two-Step (Three-Point) Backward Differentiation Formula (BDF) :

Here we have

    u_{j+1} = \frac{4}{3} u_j - \frac{1}{3} u_{j-1} + \frac{2 \Delta t}{3} f(u_{j+1}) , \quad j = 1, 2, \cdots .
A Two-Step (Three-Point) Forward Differentiation Formula :
The Improved Euler Method :

    \hat{u}_{j+1} = u_j + \Delta t \, f(u_j) ,

    u_{j+1} = u_j + \frac{\Delta t}{2} \big[ f(u_j) + f(\hat{u}_{j+1}) \big] ,

for j = 0, 1, 2, \cdots .
An Explicit 4th order accurate Runge-Kutta Method :

    k_1 = f(u_j) ,

    k_2 = f(u_j + \frac{\Delta t}{2} k_1) ,

    k_3 = f(u_j + \frac{\Delta t}{2} k_2) ,

    k_4 = f(u_j + \Delta t \, k_3) ,

    u_{j+1} = u_j + \frac{\Delta t}{6} \{ k_1 + 2 k_2 + 2 k_3 + k_4 \} ,

for j = 0, 1, 2, \cdots .
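A sketch (not from the notes) of this Runge-Kutta step; each step costs four evaluations of f :

```python
# Classical 4th order Runge-Kutta for u' = f(u).
import numpy as np

def rk4(f, u0, dt, nsteps):
    u = np.asarray(u0, dtype=float)
    for _ in range(nsteps):
        k1 = f(u)
        k2 = f(u + 0.5 * dt * k1)
        k3 = f(u + 0.5 * dt * k2)
        k4 = f(u + dt * k3)
        u = u + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

print(rk4(lambda u: -u, 1.0, 0.1, 10))   # ~0.36788, very close to e^{-1}
```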
The order of accuracy of a local formula can be found by Taylor expansion .
Stability of Numerical Approximations.

The very simple model equation

    u'(t) = 0 ,

has solution

    u(t) = u_0 , \quad (constant) .

Assume that

    u_0 \; is \; given ,

and (if m > 1) that the additional starting values u_1 , \cdots , u_{m-1} are also given.

General m-step approximation of u'(t) = 0 , u(0) = u_0 :

    \alpha_m u_{j+1} + \alpha_{m-1} u_j + \cdots + \alpha_0 u_{j+1-m} = 0 .
The difference equation

    \alpha_m u_{j+1} + \alpha_{m-1} u_j + \cdots + \alpha_0 u_{j+1-m} = 0 ,

can be solved explicitly : try solutions of the form u_j = z^j .

Then we have the

Characteristic Equation :

    \alpha_m z^m + \alpha_{m-1} z^{m-1} + \cdots + \alpha_1 z + \alpha_0 = 0 .

If its roots z_1 , z_2 , \cdots , z_m are distinct, the general solution of the difference equation is

    u_j = \gamma_1 z_1^j + \gamma_2 z_2^j + \cdots + \gamma_m z_m^j .
FACT : Suppose some root z_k of the characteristic equation satisfies | z_k | > 1 .
In such a case the u_j can become arbitrarily large in a fixed time interval
by taking \Delta t sufficiently small.

THEOREM :

A necessary condition for numerical stability of a multistep method is that
the roots of the characteristic equation

    \alpha_m z^m + \alpha_{m-1} z^{m-1} + \cdots + \alpha_0 = 0 ,

satisfy | z_k | \le 1 , with any root of modulus one being simple.
Consider the last two examples in more detail :

Case (2) : Here the characteristic roots are 1 and 1/3 , and using the initial data one finds

    u_j = \Big( \frac{3}{2} u_1 - \frac{1}{2} u_0 \Big) + \Big( \frac{3}{2} u_0 - \frac{3}{2} u_1 \Big) \Big( \frac{1}{3} \Big)^j .

If

    u_1 = u_0 ,

then we see that

    u_j = u_0 , \quad for \; all \; j .

Moreover, if

    u_1 = u_0 + \epsilon ,

then

    u_j = u_0 + \frac{3}{2} \epsilon - \frac{3}{2} \epsilon \Big( \frac{1}{3} \Big)^j ,

so a small perturbation of the initial data stays small for all j .
Case (3) : Here the general solution is

    u_j = \gamma_1 (1)^j + \gamma_2 (3)^j .

Using the initial data

    \gamma_1 + \gamma_2 = u_0 , \quad \gamma_1 + 3 \gamma_2 = u_1 ,

we find

    \gamma_1 = \frac{3}{2} u_0 - \frac{1}{2} u_1 , \quad \gamma_2 = \frac{1}{2} u_1 - \frac{1}{2} u_0 .

Hence

    u_j = \Big( \frac{3}{2} u_0 - \frac{1}{2} u_1 \Big) + \Big( \frac{1}{2} u_1 - \frac{1}{2} u_0 \Big) (3)^j .

But if u_1 = u_0 + \epsilon then u_j = u_0 - \frac{1}{2} \epsilon + \frac{1}{2} \epsilon \, 3^j ,
so an arbitrarily small perturbation \epsilon is amplified like 3^j : this method is unstable.
THEOREM :

If the roots of

    \alpha_m z^m + \alpha_{m-1} z^{m-1} + \cdots + \alpha_1 z + \alpha_0 = 0 ,

satisfy

    | z_k | \le 1 , \quad and \quad | z_k | = 1 \Rightarrow z_k \; is \; simple ,

then (for a consistent method)

    u_j \rightarrow u(t_j) \quad as \quad \Delta t \rightarrow 0 .

PROOF : Omitted.
Stiff Differential Equations.

Consider the model equation

    u'(t) = \lambda u(t) ,

with

    u(0) = u_0 .

The solution is

    u(t) = e^{\lambda t} u_0 .
Consider the case where

    Re(\lambda) \ll 0 ,

i.e., \lambda has large negative real part . Then the solution decays very rapidly,
and we don't want any z_k outside the unit disk in the complex plane.

However, for many difference formulas \Delta t must be taken very small
in order that the roots of the characteristic equation lie inside the unit disk.
More generally the IVP

    u'(t) = f( u(t) ) ,
    u(0) = u_0 ,

is called stiff if some eigenvalues \lambda_i of the Jacobian matrix

    f_u( u(t) ) ,

satisfy

    Re(\lambda_i) \ll 0 .
EXAMPLES :

We will approximate

    u'(t) = \lambda u(t) ,

by several basic methods. Assume

    \Delta t > 0 , \quad and \quad Re(\lambda) < 0 ,

so that

    Re(\Delta t \lambda) < 0 .
Explicit Euler.

Applied to

    u'(t) = \lambda u(t) ,

Euler's method gives u_{j+1} = (1 + \Delta t \lambda) u_j , so the approximations stay bounded only when

    | 1 + \Delta t \lambda | \le 1 .

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the disk of radius 1 centered at -1 , meeting the real axis at -2 and 0 .]
EXAMPLE : Take \lambda = -10^6 .

Then

    u(t) = e^{-10^6 t} u_0 ,

which decays extremely rapidly. Nevertheless explicit Euler requires

    \Delta t \lambda > -2 ,

that is,

    \Delta t < 2 \cdot 10^{-6} \; !
Implicit Euler.

Applied to u'(t) = \lambda u(t) , the implicit Euler method u_{j+1} = u_j + \Delta t \lambda u_{j+1} gives
u_{j+1} = u_j / (1 - \Delta t \lambda) , which stays bounded precisely when

    | \Delta t \lambda - 1 | \ge 1 .

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the exterior of the disk of radius 1 centered at +1 , which contains the entire negative half plane.]
Trapezoidal Method.

When applied to u'(t) = \lambda u(t) , the Trapezoidal Method gives

    \frac{1}{\Delta t} (u_{j+1} - u_j) = \frac{1}{2} \lambda (u_j + u_{j+1}) .

Thus the characteristic equation is

    \Big( 1 - \frac{1}{2} \Delta t \lambda \Big) z - \Big( 1 + \frac{1}{2} \Delta t \lambda \Big) = 0 , \quad with \; zero \quad z = \frac{1 + \frac{1}{2} \Delta t \lambda}{1 - \frac{1}{2} \Delta t \lambda} .

The region of stability is now precisely the entire negative half plane.

A disadvantage is that the decay rate becomes smaller when Re(\lambda) \rightarrow -\infty , since

    \lim_{\lambda \rightarrow -\infty} z(\lambda) = \lim_{\lambda \rightarrow -\infty} \frac{1 + \frac{1}{2} \Delta t \lambda}{1 - \frac{1}{2} \Delta t \lambda} = -1 .

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the left half plane.]
Backward Differentiation Formulas (BDF).

For the differential equation u'(t) = f(u(t)) the BDF take the form

    \frac{1}{\Delta t} \sum_{i=0}^{m} \alpha_i u_{j+1-i} = f(u_{j+1}) .

The \{\alpha_i\}_{i=0}^{m} are chosen so the order is as high as possible, namely, O(\Delta t^m) .

Let S_m denote the stability region of the m-step BDF.

m = 1, 2 : S_m contains the negative half plane.

m = 3, 4, 5, 6 : S_m contains the negative axis, but not the entire negative half plane.

m \ge 7 : These methods are unstable , even for solving u'(t) = 0 !

[Figure: stability regions of the Backward Differentiation Formulas.]
Collocation at 2 Gauss Points.

The 2-point Gauss collocation method for taking a time step for the IVP

    u'(t) = f(u(t)) , \quad u(0) = u_0 ,

is defined by finding a local polynomial p ∈ P_2 that satisfies

    p(t_j) = u_j ,

and

    p'(x_{j,i}) = f( p(x_{j,i}) ) , \quad i = 1, 2 , \quad (collocation) ,

where

    x_{j,i} = \frac{t_j + t_{j+1}}{2} \pm \frac{\Delta t \sqrt{3}}{6} ,

and then setting

    u_{j+1} = p(t_{j+1}) .
It can be shown that the stability region is

    S \equiv \Big\{ \Delta t \lambda \; : \; \Big| \frac{1 + \frac{1}{2} \Delta t \lambda + \frac{1}{12} (\Delta t \lambda)^2}{1 - \frac{1}{2} \Delta t \lambda + \frac{1}{12} (\Delta t \lambda)^2} \Big| \le 1 \Big\} ,

which is precisely the negative half plane.

All Gauss collocation methods have this property and thus are A-stable .

However,

    \lim_{\lambda \rightarrow -\infty} z(\Delta t \lambda) = 1 ,

so very rapidly decaying solution components are damped only weakly.

[Figure: the stability region in the complex (\Delta t \lambda)-plane : the left half plane.]
BOUNDARY VALUE PROBLEMS IN ODEs

EXAMPLE : Consider the two-point boundary value problem

    y''(x) - y(x) = -5 \sin(2x) ,

    y(0) = 0 , \quad y(\pi) = 0 ,

with exact solution

    y(x) = \sin(2x) .

Partition [0, \pi] into a grid or mesh :

    x_j = j h , \quad j = 0, 1, \cdots , N , \quad with \quad h = \frac{\pi}{N} .
We want to find approximations u_j to y(x_j) , j = 0, 1, 2, \cdots , N .
Replacing y'' by its standard difference approximation gives the finite difference equations

    u_0 = 0 ,

    \frac{u_2 - 2 u_1 + u_0}{h^2} - u_1 = -5 \sin(2 x_1) ,

    \frac{u_3 - 2 u_2 + u_1}{h^2} - u_2 = -5 \sin(2 x_2) ,

        \cdots

    \frac{u_N - 2 u_{N-1} + u_{N-2}}{h^2} - u_{N-1} = -5 \sin(2 x_{N-1}) ,

    u_N = 0 .
Write the finite difference equations as

    \frac{1}{h^2} u_{j-1} - \Big( 1 + \frac{2}{h^2} \Big) u_j + \frac{1}{h^2} u_{j+1} = -5 \sin(2 x_j) , \quad j = 1, 2, \cdots , N-1 ,

and put them in matrix form :

    \begin{pmatrix}
    -1 - \frac{2}{h^2} & \frac{1}{h^2} & & & \\
    \frac{1}{h^2} & -1 - \frac{2}{h^2} & \frac{1}{h^2} & & \\
    & \cdot & \cdot & \cdot & \\
    & & \frac{1}{h^2} & -1 - \frac{2}{h^2} & \frac{1}{h^2} \\
    & & & \frac{1}{h^2} & -1 - \frac{2}{h^2}
    \end{pmatrix}
    \begin{pmatrix} u_1 \\ u_2 \\ \cdot \\ u_{N-2} \\ u_{N-1} \end{pmatrix}
    =
    \begin{pmatrix} f_1 \\ f_2 \\ \cdot \\ f_{N-2} \\ f_{N-1} \end{pmatrix} ,

where

    \begin{pmatrix} f_1 \\ f_2 \\ \cdot \\ f_{N-2} \\ f_{N-1} \end{pmatrix}
    =
    \begin{pmatrix} -5 \sin(2 x_1) \\ -5 \sin(2 x_2) \\ \cdot \\ -5 \sin(2 x_{N-2}) \\ -5 \sin(2 x_{N-1}) \end{pmatrix} ,

and where the matrix has dimensions N-1 by N-1 .

We found that :

    A_h u_h = f_h ,

where

    A_h = diag[ \; \frac{1}{h^2} , \; -(1 + \frac{2}{h^2}) , \; \frac{1}{h^2} \; ] ,

    u_h \equiv (u_1 , u_2 , \cdots , u_{N-1})^T ,

and

    f_h \equiv (f_1 , f_2 , \cdots , f_{N-1})^T .
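A sketch (not from the notes) that solves this system and compares with the exact solution y(x) = sin(2x) :

```python
# Finite difference solve of y'' - y = -5 sin(2x), y(0) = y(pi) = 0.
import numpy as np

N = 100
h = np.pi / N
x = np.linspace(0.0, np.pi, N + 1)

A = (np.diag(np.full(N - 1, -1 - 2 / h**2)) +
     np.diag(np.full(N - 2, 1 / h**2), 1) +
     np.diag(np.full(N - 2, 1 / h**2), -1))
fh = -5 * np.sin(2 * x[1:-1])

u = np.linalg.solve(A, fh)      # dense solve; tridiagonal elimination in practice
print(np.max(np.abs(u - np.sin(2 * x[1:-1]))))   # O(h^2), consistent with 4h^2/3
```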
QUESTIONS :

• How accurate is the numerical solution, i.e. , what is

    \max_j | u_j - y(x_j) | \; ?

• How to solve the linear systems efficiently, especially when N is large ?

ANSWER : The matrix A_h is tridiagonal.
Thus the linear system can be solved by the specialized Gauss elimination
algorithm for tridiagonal systems.
• How to approximate derivatives and find the error in the approximation ?

We found that

    \tau_j \equiv \frac{y_{j+1} - 2 y_j + y_{j-1}}{h^2} - y_j'' = \frac{h^2}{12} y''''(\eta_j) .

For our example y(x) = \sin(2x) , so that | y''''(x) | \le 16 , and

    | \tau_j | \le \frac{16}{12} h^2 = \frac{4}{3} h^2 , \quad j = 1, 2, \cdots , N-1 .
• What is the actual error after solving the system ?

i.e., what is

    \max_j | u_j - y(x_j) | \; ?

ANSWER :
We already showed that

    | \tau_j | \equiv \Big| \frac{y_{j-1} - 2 y_j + y_{j+1}}{h^2} - y_j'' \Big| \le \frac{4 h^2}{3} , \quad j = 1, 2, \cdots , N-1 .

Now

    \frac{1}{h^2} y_{j-1} - \Big( 1 + \frac{2}{h^2} \Big) y_j + \frac{1}{h^2} y_{j+1}

        = \frac{y_{j+1} - 2 y_j + y_{j-1}}{h^2} - y_j

        = y_j'' + \tau_j - y_j

        = \tau_j - 5 \sin(2 x_j) .

Thus if we define

    y_h \equiv ( y_1 , y_2 , \cdots , y_{N-1} )^T , \quad and \quad \tau_h \equiv ( \tau_1 , \tau_2 , \cdots , \tau_{N-1} )^T ,

then

    A_h y_h = \tau_h + f_h .
We found that

    A_h y_h = \tau_h + f_h .

Since

    A_h u_h = f_h ,

subtraction gives

    A_h (y_h - u_h) = \tau_h .

Hence, if \| A_h^{-1} \|_\infty \le K , then

    \| y_h - u_h \|_\infty = \| A_h^{-1} \tau_h \|_\infty \le \| A_h^{-1} \|_\infty \| \tau_h \|_\infty \le K \, \frac{4 h^2}{3} .
Now write A_h as

    A_h = diag[ \; \frac{1}{h^2} , \; -1 - \frac{2}{h^2} , \; \frac{1}{h^2} \; ]

        = - \frac{h^2 + 2}{h^2} I_h + \frac{1}{h^2} T ,

where T denotes the tridiagonal matrix with zeroes on the diagonal and ones on the sub- and super-diagonal. Hence

    A_h = - \frac{h^2 + 2}{h^2} \Big[ I_h - \frac{h^2}{h^2 + 2} \frac{1}{h^2} T \Big]

        = - \frac{h^2 + 2}{h^2} \Big[ I_h - \frac{1}{h^2 + 2} T \Big] .
We have

    A_h = - \frac{h^2 + 2}{h^2} \Big[ I_h - \frac{1}{h^2 + 2} T \Big] \equiv - \frac{h^2 + 2}{h^2} ( I_h + B_h ) ,

where B_h \equiv - \frac{1}{h^2 + 2} T . Since each row of B_h contains at most two nonzero entries of size \frac{1}{h^2 + 2} ,

    \| B_h \|_\infty \le \frac{2}{h^2 + 2} < 1 ,

and it follows by the Banach Lemma that (I_h + B_h)^{-1} exists and that

    \| (I_h + B_h)^{-1} \|_\infty \le \frac{1}{1 - \frac{2}{h^2 + 2}} = \frac{h^2 + 2}{h^2} .
We have

    A_h = - \frac{h^2 + 2}{h^2} ( I_h + B_h ) ,

and

    \| (I_h + B_h)^{-1} \|_\infty \le \frac{h^2 + 2}{h^2} .

Hence

    \| A_h^{-1} \|_\infty = \Big\| \frac{-h^2}{h^2 + 2} (I_h + B_h)^{-1} \Big\|_\infty \le \frac{h^2}{h^2 + 2} \cdot \frac{h^2 + 2}{h^2} = 1 .

Thus K = 1 , and

    \| y_h - u_h \|_\infty \le \frac{4 h^2}{3} .
A Nonlinear Boundary Value Problem.

EXAMPLE : Consider the Gelfand-Bratu problem

    u''(x) + \lambda e^{u(x)} = 0 , \quad u(0) = u(1) = 0 .

The finite difference approximation

    \frac{u_{j-1} - 2 u_j + u_{j+1}}{h^2} + \lambda e^{u_j} = 0 , \quad j = 1, \cdots , N-1 , \quad u_0 = u_N = 0 ,

is a system of nonlinear algebraic equations, which we write as

    G(u) = 0 ,

where u \equiv (u_1 , u_2 , \cdots , u_{N-1})^T . Its Jacobian matrix is tridiagonal :

    G'(u) = \begin{pmatrix}
    -\frac{2}{h^2} + \lambda e^{u_1} & \frac{1}{h^2} & & \\
    \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_2} & \frac{1}{h^2} & \\
    & \cdot & \cdot & \cdot \\
    & & \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_{N-1}}
    \end{pmatrix} .
Each Newton iteration for solving the nonlinear system

    G(u) = 0 ,

consists of solving the tridiagonal linear system

    \begin{pmatrix}
    -\frac{2}{h^2} + \lambda e^{u_1^{(k)}} & \frac{1}{h^2} & & \\
    \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_2^{(k)}} & \frac{1}{h^2} & \\
    & \cdot & \cdot & \cdot \\
    & & \frac{1}{h^2} & -\frac{2}{h^2} + \lambda e^{u_{N-1}^{(k)}}
    \end{pmatrix} \Delta u^{(k)} = - G(u^{(k)}) ,

where

    \Delta u^{(k)} \equiv (\Delta u_1^{(k)} , \Delta u_2^{(k)} , \cdots , \Delta u_{N-1}^{(k)})^T ,

and updating

    u^{(k+1)} = u^{(k)} + \Delta u^{(k)} .
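A sketch (not from the notes) of this Newton iteration for the discretized Gelfand-Bratu problem, starting from the zero function :

```python
# Newton's method for the discretized u'' + lam*e^u = 0, u(0) = u(1) = 0.
import numpy as np

lam, N = 2.0, 100
h = 1.0 / N
u = np.zeros(N - 1)                        # initial guess u = 0

for _ in range(20):
    um = np.concatenate(([0.0], u, [0.0]))             # include boundary values
    G = (um[:-2] - 2 * um[1:-1] + um[2:]) / h**2 + lam * np.exp(u)
    J = (np.diag(-2 / h**2 + lam * np.exp(u)) +
         np.diag(np.full(N - 2, 1 / h**2), 1) +
         np.diag(np.full(N - 2, 1 / h**2), -1))
    du = np.linalg.solve(J, -G)
    u += du
    if np.max(np.abs(du)) < 1e-12:
        break

print(u.max())    # maximum of the computed solution for lambda = 2
```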
Solutions of the Gelfand-Bratu equations for different values of λ.
DIFFUSION PROBLEMS

EXAMPLE : Consider the heat equation

    u_t = u_{xx} , \quad 0 < x < 1 , \quad t > 0 ,

with

    u(x, 0) = g(x) ,

    u(0, t) = u(1, t) = 0 .

First discretize in space :

    u_j'(t) = \frac{u_{j-1}(t) - 2 u_j(t) + u_{j+1}(t)}{\Delta x^2} , \quad j = 1, \cdots , N-1 ,

with

    u_j(0) = g(x_j) ,

    u_0(t) = u_N(t) = 0 ,

where

    u_j(t) \cong u(x_j , t) ,

and where ' denotes differentiation with respect to t .
[Figure: the spatial mesh 0 = x_0 < x_1 < x_2 < \cdots < x_N = 1 .]
In matrix-vector notation we can write the space-discretized equations as

    u'(t) = \frac{1}{\Delta x^2} D \, u(t) ,

where

    D \equiv \begin{pmatrix}
    -2 & 1 & & & \\
    1 & -2 & 1 & & \\
    & \cdot & \cdot & \cdot & \\
    & & 1 & -2 & 1 \\
    & & & 1 & -2
    \end{pmatrix} ,

and

    u \equiv ( u_1 , u_2 , \cdots , u_{N-2} , u_{N-1} )^T .
Now discretize in time, using the Trapezoidal Method (Crank-Nicolson) :

    \frac{u^{k+1} - u^k}{\Delta t} = \frac{1}{2 \Delta x^2} D \, \{ u^{k+1} + u^k \} ,

where

    u^k \equiv ( u_1^k , u_2^k , \cdots , u_{N-1}^k )^T ,

and

    u_j^k approximates u(x_j , t_k) .

Assume that the solution has been computed up to time t_k . Then u^{k+1} follows by solving the resulting tridiagonal linear system.
[Figure: the space-time grid : mesh points x_0 , x_1 , \cdots , x_N and time levels t_0 , t_1 , t_2 , \cdots .]
NOTE :
• We can also use an explicit method in time, for example explicit Euler.
• But this can be a bad choice because the ODE system is stiff.
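A sketch (not from the notes) of the implicit time stepping above for the heat equation, checked against the exact solution for g(x) = sin(\pi x) :

```python
# Crank-Nicolson for u_t = u_xx: (I - r D) u^{k+1} = (I + r D) u^k, r = dt/(2 dx^2).
import numpy as np

N, nsteps, dt = 50, 100, 0.001
dx = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)

D = (np.diag(np.full(N - 1, -2.0)) +
     np.diag(np.ones(N - 2), 1) + np.diag(np.ones(N - 2), -1))
I = np.eye(N - 1)
r = dt / (2 * dx**2)

u = np.sin(np.pi * x[1:-1])          # initial data g(x) = sin(pi x)
for _ in range(nsteps):
    u = np.linalg.solve(I - r * D, (I + r * D) @ u)

exact = np.exp(-np.pi**2 * dt * nsteps) * np.sin(np.pi * x[1:-1])
print(np.max(np.abs(u - exact)))     # small discretization error
```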
For the system of ODEs

    u'(t) = \frac{1}{\Delta x^2} D \, u(t) , \quad with \quad D = diag[ \; 1 , -2 , 1 \; ] ,

the stiffness is governed by the eigenvalues \lambda of \frac{1}{\Delta x^2} D . The eigenvalue problem leads to the difference equation

    \frac{1}{\Delta x^2} ( v_{j-1} - 2 v_j + v_{j+1} ) = \lambda v_j , \quad with \quad v_0 = v_N = 0 .

Try a solution of the form v_j = z^j .

The general solution of the difference equation then has the form

    v_j = c_1 z_1^j + c_2 z_1^{-j} .
From the first boundary condition we have

    v_0 = 0 \; \Rightarrow \; c_1 + c_2 = 0 .

Take

    c_1 = c \quad and \quad c_2 = -c .

Then

    v_j = c \, ( z_1^j - z_1^{-j} ) .

The second boundary condition gives

    v_N = 0 \; \Rightarrow \; c \, ( z_1^N - z_1^{-N} ) = 0 ,

from which

    z_1^{2N} = 1 \; \Rightarrow \; z_1 = e^{\frac{k 2 \pi i}{2N}} .
The eigenvalues are therefore

    \lambda_k = \frac{z + z^{-1} - 2}{\Delta x^2}

        = \frac{e^{\frac{k 2 \pi i}{2N}} + e^{-\frac{k 2 \pi i}{2N}} - 2}{\Delta x^2}

        = \frac{2 \cos( \frac{k 2 \pi}{2N} ) - 2}{\Delta x^2}

        = \frac{2 \big( \cos( \frac{k 2 \pi}{2N} ) - 1 \big)}{\Delta x^2}

        = - \frac{4}{\Delta x^2} \sin^2 \Big( \frac{k \pi}{2N} \Big) , \quad k = 1, 2, \cdots , N-1 .
415
The eigenvalue with largest negative real part is
4 2 (N − 1)π
λN −1 = − 2
sin ( ),
∆x 2N
which for large N behaves like
∼ ∗ 4
λN −1 = λ ≡ − 2
.
∆x
416
Nonlinear Diffusion Equations.

Another example is the time-dependent Gelfand-Bratu equation

    u_t = u_{xx} + \lambda e^u .

We illustrate the numerical solution procedure for the general equation

    u_t = u_{xx} + f(u) , \quad 0 < x < 1 ,

with

    u(x, 0) = g(x) ,

    u(0, t) = u(1, t) = 0 ,

where f and g are given functions.
We approximate this equation as follows : first discretize in space,

    u_j'(t) = \frac{u_{j-1}(t) - 2 u_j(t) + u_{j+1}(t)}{\Delta x^2} + f(u_j(t)) ,

for j = 1, 2, \cdots , N-1 , with

    u_j(0) = g(x_j) ,

    u_0(t) = u_N(t) = 0 .

Then discretize in time, here by the implicit Euler method :

    \frac{u_j^{k+1} - u_j^k}{\Delta t} = \frac{u_{j-1}^{k+1} - 2 u_j^{k+1} + u_{j+1}^{k+1}}{\Delta x^2} + f(u_j^{k+1}) .
Rewrite these equations as

    F_j^{k+1} \equiv u_j^{k+1} - u_j^k - \frac{\Delta t}{\Delta x^2} ( u_{j-1}^{k+1} - 2 u_j^{k+1} + u_{j+1}^{k+1} ) - \Delta t \, f(u_j^{k+1}) = 0 ,

for j = 1, 2, \cdots , N-1 ,

with

    u_0^{k+1} = 0 \quad and \quad u_N^{k+1} = 0 .
As initial approximation to u_j^{k+1} in Newton's method use

    (u_j^{k+1})^{(0)} = u_j^k , \quad j = 1, 2, \cdots , N-1 .

Each Newton iteration then consists of solving a linear tridiagonal system

    T^{k+1,(\nu)} \, \Delta u^{k+1,(\nu)} = - F^{k+1,(\nu)} ,

where

    T^{k+1,(\nu)} = \begin{pmatrix}
    1 + \frac{2 \Delta t}{\Delta x^2} - \Delta t f_u(u_1^{k+1,(\nu)}) & -\frac{\Delta t}{\Delta x^2} & & \\
    -\frac{\Delta t}{\Delta x^2} & 1 + \frac{2 \Delta t}{\Delta x^2} - \Delta t f_u(u_2^{k+1,(\nu)}) & -\frac{\Delta t}{\Delta x^2} & \\
    & \cdot & \cdot & \cdot \\
    & & -\frac{\Delta t}{\Delta x^2} & 1 + \frac{2 \Delta t}{\Delta x^2} - \Delta t f_u(u_{N-1}^{k+1,(\nu)})
    \end{pmatrix}

and

    \Delta u^{k+1,(\nu)} = \begin{pmatrix} \Delta u_1^{k+1,(\nu)} \\ \Delta u_2^{k+1,(\nu)} \\ \cdot \\ \Delta u_{N-1}^{k+1,(\nu)} \end{pmatrix} , \quad F^{k+1,(\nu)} = \begin{pmatrix} F_1^{k+1,(\nu)} \\ F_2^{k+1,(\nu)} \\ \cdot \\ F_{N-1}^{k+1,(\nu)} \end{pmatrix} .

Then set the next approximation to the solution at time t = t_{k+1} equal to

    u^{k+1,(\nu+1)} = u^{k+1,(\nu)} + \Delta u^{k+1,(\nu)} .
Time-evolution of solutions of the Gelfand-Bratu equations for \lambda = 2 .

Time-evolution of solutions of the Gelfand-Bratu equations for \lambda = 4 .