Arbenz Peter, Numerical Methods For Computational Science and Engineering
Arbenz Peter, Numerical Methods For Computational Science and Engineering
1/48
References
2/48
Scientific Computing
3/48
4/48
5/48
x ,
div e(x) = 0,
x .
is accelerator cavity.
2. Particles move according to Newtons law of motion
dx(t)
= v,
dt
dv(t)
q
=
(E + v B) .
dt
m0
6/48
7/48
1
(x)
0
8/48
9/48
10/48
11/48
12/48
13/48
No emphasis on
I
14/48
Goals
Knowledge of the fundamental algorithms in numerical
mathematics
Knowledge of the essential terms in numerical mathematics
and the techniques used for the analysis of numerical
algorithms
Ability to choose the appropriate numerical method for
concrete problems
Ability to interpret numerical results
Ability to implement numerical algorithms efficiently in
Matlab
15/48
Literature
Uri Ascher & Chen Greif: A First Course in Numerical
Methods. SIAM, 2011.
http://www.siam.org/books/cs07/
Excellent reference.
16/48
Literature (cont.)
W. Dahmen & A. Reusken: Numerik f
ur Ingenieure und
Naturwissenschaftler, Springer, 2006.
A lot of simple examples and good explanations, but also rigorous mathematical treatment. Target
audience: undergraduate students in science and engineering.
17/48
Prerequisites
Essential prerequisite for this course is a solid knowledge in linear
algebra and calculus. Familiarity with the topics covered in the
first semester courses is taken for granted, see
K. Nipp and D. Stoffer, Lineare Algebra, vdf
Hochschulverlag, Z
urich, 5 ed., 2002.
M. Gutknecht, Lineare algebra, lecture notes, SAM, ETH
Z
urich, 2009.
http://www.sam.math.ethz.ch/~mhg/unt/LA/HS07/.
M. Struwe, Analysis f
ur Informatiker. Lecture notes, ETH
Z
urich, 2009.
18/48
Organization
Lecturer:
Prof. Peter Arbenz
Assistants:
Daniel Hupp
Christian Sch
uller
Alexander Lobbe
Sharan Jagathrakashakan
Alexander Bohn
Manuel Moser
Fabian Th
uring
Timo Welti
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
19/48
Venue
Classes:
Tutorials:
20/48
Assignments
The assignment sheets will be uploaded on the course
webpage on Monday every week the latest.
The exercise should be solved until the following tutorial class.
(Hand them in to the assistant.)
21/48
Examination
Three-hour written examination involving coding problems to
be done at the computer on
TBA
Dry-run for computer based examination:
Does not exist anymore.
Try out a computer in the student labs in HG.
Pre-exam question session:
TBA
22/48
Examination (cont.)
Topics of examination:
I All topics, that have been addressed in class or in a
homework assignment.
I One exam question will be one of the homework
assignment.
Lecture slides will be available as (a single) PDF file during
the examination.
The Ascher-Greif book will be made available, too.
The exam questions will be asked in English.
23/48
I
I
I
I
24/48
25/48
1
1
1.5
100
100
0.99
1.01
1.2
99.99
99
Absolute
Error
0.01
0.01
0.3
0.01
1
Relative
Error
0.01
0.01
0.2
0.0001
0.01
26/48
Approximation example
The Stirling approximation
v = Sn =
2n
n n
e
27/48
Types of errors
I
Approximation errors
I
I
I
Roundoff errors
I
28/48
29/48
f 0 (x0 )
f (x0 + h) f (x0 )
h
30/48
x0 < < x0 + h.
f
(x
)
0
0
f (x0 )
= f () f (x0 )
h
2
2
Or, using the big-O notation:
0
f (x0 ) Dx ,h (f ) = O(h).
0
31/48
Results
Try for f (x) = sin(x) at x0 = 1.2.
(So, we are approximating cos(1.2) = 0.362357754476674 . . .)
h
0.1
0.01
0.001
104
107
Absolute error
4.71667 102
4.666196 103
4.660799 104
4.660256 105
4.619326 108
32/48
33/48
Absolute error
4.36105 1010
5.594726 108
1.669696 107
7.938531 106
6.851746 104
8.173146 102
3.623578 101
34/48
35/48
Algorithm properties
Performance features that may be expected from a good numerical
algorithm.
I
Accuracy
Relates to errors. How accurate is the result going to be when
a numerical algorithm is run with some particular input data.
Efficiency
I
I
I
Robustness
(Numerical) software should run under all circumstances.
Should yield correct results to within an acceptable error or
should fail gracefully if not successful.
36/48
Complexity I
Complexity/computational cost of an algorithm : number of
elementary operators
Asymptotic complexity =
leading order term of complexity w.r.t.
large problem size parameters
The usual choice of problem size parameters in numerical linear
algebra is the number of independent real variables needed to
describe the input data (vector length, matrix sizes).
operation
inner product
outer product
tensor product
matrix product
description
(x Rn , y Rn ) 7 xH y
(x Rm , y Rn ) 7 xyH
#mul/div
n
nm
#add/sub
n1
0
O(n)
O(mn)
(A Rmn , B Rnk ) 7 AB
mnk
mk(n 1)
O(mnk)
37/48
as n .
38/48
39/48
Complexity II
To a certain extent, the asymptotic complexity allows to predict
the dependence of the runtime of a particular implementation of
an algorithm on the problem size (for large problems). For
instance, an algorithm with asymptotic complexity O(n2 ) is likely
to take 4 as much time when the problem size is doubled.
One may argue that the memory accesses are more decisive for run
times than floating point operations. Often there is a linear
dependence among the two. So, there is no difference in the O
notation.
40/48
Scaling
Scaling multiplication with diagonal matrices (with non-zero
diagonal entries) from left and/or right.
It is important to know the different effects of multiplying with a
diagonal matrix from left or right:
DA
vs. AD
41/48
Elementary matrices
Matrices of the form A = I + uvT are called elementary.
Again we can apply A to a vector x in a straightforward and a
more clever way:
Ax = (I + uvT )x
or
Ax = x + u(vT x)
Cf. exercises.
42/48
43/48
An unstable algorithm
44/48
A stable algorithm
45/48
Unstable algorithm
Problem statement: evaluate the integrals
Z 1
xn
dx, for n = 0, 1, 2, . . . , 30.
yn =
0 x + 10
Algorithm development: observe that analytically, for n > 0,
Z 1 n
Z 1
x + 10x n1
1
yn + 10yn1 =
dx =
x n1 dx = .
x + 10
n
0
0
Also,
Z
y0 =
0
1
dx = log(11) log(10).
x + 10
Algorithm:
I Evaluate y0 = log(1.1).
I For n = 1, 2, . . . , 30, evaluate yn =
NumCSE, Lecture 1, Sept 18, 2014
1
n
10yn1 .
46/48
47/48
48/48
2/34
Introduction
One of the important tasks of numerical mathematics is the
determination of the accuracy of results of some computation.
There are three types of errors that limit accuracy:
1. Errors in the mathematical model of the problem to be solved.
Simplified models are easier to solve (shape of objects,
unimportant chemical reactants).
2. Discretization or approximation errors depend on the chosen
algorithm or the type of discretization.
I
1
1
1
+
+
+
1! 2! 3!
3/34
Introduction (cont.)
I
I
4/34
n := 2
n
Fn = cos 2n sin 2n
An = nFn = n cos
n
n
sin
2
2
as n
5/34
n0 = 6,
6 = 2/n0 = 60 ,
sin 6 =
2
6/34
7/34
Integers
Also integers suffer from the finiteness of computers.
Matlab represents integers by 32-bit signed ints (in the twos
complement format)
30
P
falls a31 = 0
ai 2 i ,
i=0
a=
30
P
(232
ai 2i ), falls a31 = 1
i=0
8/34
Real numbers
A number x R (in the binary number system) has the form
x = (1.d1 d2 d3 dt1 dt dt+1 ) 2e
e is an integer exponent, the binary digits di are either 0 or 1.
d1 d2 d3
+ 2 + 3 +
2
2
2
In general, infinitely many digits are needed to represent a real
number.
1.d1 d2 d3 = 1 +
9/34
P
P
1 m
3. (0.010101 . . .)2 =
22m = 41
= 14 11 1 = 13
4
m=1
4.
1
5
m=0
10/34
11/34
12/34
Chopping
fl(x) = (1.d1 d2 d3 dt ) 2e
1
= 2t .
2
13/34
The exponent has 8 bits, the mantissa 23 Bit. There is a sign bit.
The value of a normalized 32-bit IEEE floating point number V is
V = (-1)S x 2(E-127) x (1.M)
Normalized means 0 < E < 255 = 28 1. (127 is called a bias.)
NumCSE, Lecture 2, Sept 22, 2014
14/34
double:
1 sign bit
11 bits exponent
52 bits mantissa
The value of a normalized 64-bit IEEE floating point number
V is
V = (-1)S x 2(E-1023) x (1.M)
Normalized means, that 0 < E < 2047 = 211 1.
15/34
0 (zero): e = 0, m = 0, s arbitrary.
-Infinity, +Infinity: e = all ones, m = 0.
e = all ones, m 6= 0: NaN
16/34
24
23
-125
128
2
6 108
53
52 -1021
1024 2
1 1016
63 -16381 16384 264 5 1020
Lemma
If x 6= 0 is a normalized floating point number and fl(x) obtained
after rounding with t digits, then
|fl(x) x| 2et /2
|fl(x) x|
2t /2
x
NumCSE, Lecture 2, Sept 22, 2014
17/34
Rounding errors
We assume that all numbers are normalized.
Let t be the length of the mantissa.
Between powers of 2, the floating point numbers are equidistant.
Definition
Machine precision = 2(t+1) (half of Matlabs eps)
This is half the distance of the numbers between 1 and 2.
NumCSE, Lecture 2, Sept 22, 2014
18/34
19/34
|i | .
20/34
Guard digit
Floating point system with = 10 and t = 4. So, = 21 103 . Let
x = .1103, = 1.103 101 ,
y = 9.963 103 .
|0.1003370.1003|
0.100337
21/34
fl()fl() = 10.01110001 22
fl(fl()fl()) = 1.00111 23 = 9.75
So,
2 fl(fl()fl()) 0.12
2 fl(fl()fl())
0.012
2
22/34
23/34
Wilkinsons Principle
The result of a numerical computation on the computer
is the exact result with slightly perturbed initial data.
This also holds for good implementations of (library) functions!
24/34
Cancellation
Cancellation (dt. Ausl
oschung) is a special kind of rounding error.
Consider the following two numbers with 5 decimal digits:
1.2345 e0
1.2344 e0
0.0001 e0 = 1.0000 e4
If the two numbers were exact, the result delivered by the
computer would also be exact. But if the first two numbers had
been obtained by previous calculations and were affected by
rounding errors, then the result would at best be 1.xxxx e4,
where the digits denoted by x are unknown.
25/34
Cancellation (cont.)
Suppose z = x y , where x y . Then
|z fl(z)| |x fl(x)| + |y fl(y )|,
from which it follows that the relative error satisfies
|x fl(x)| + |y fl(y )|
|z fl(z)|
,
|z|
|x y |
Numerator: OK.
If the denominator is very close to zero, x y , then the relative
error in z can become large.
26/34
(13% error)
a + b = fl(a + b) = 2.3
a b = fl(a b) = 0.10
(a + b)(a b) = 0.23
(0% error)
27/34
# positions 99 70 2
2
1.0
4
0.02000
6
0.00530000
10
0.005050660000
= 0.0050506338833466 .
9801 + 9800
results
9801 9800
0.0
0.01000
0.00510000
0.005050640000
1
9801+ 9800
0.0050
0.005051
0.00505063
0.005050633884
The reason for the numerical instability of the first two algorithms
is the subtraction of almost equal numbers (indicated by appended
zeros).
NumCSE, Lecture 2, Sept 22, 2014
28/34
1 x
e e x .
2
5
x3
+
,
6
120
|| < x.
29/34
Remark on overflow
An overflow is obtained when a number is too large to fit into the
floating point system in use.
q
Example: kxk2 = x12 + x22 + + xn2
To make things simple, let n = 2.
If x = (1060 , 1)T , then the exact result is kxk2 = 1060 .
But in the course of the computation we have to form x12 = 10120 .
To avoid overflow, we scale with a positive scalar s, e.g.,
s = max(|xi |)
r
x 2
x1 2 x2 2
n
kxk2 = s
+
+ +
s
s
s
30/34
31/34
32/34
Stopping criterion exploits fact that A6 < < An < A2n <
NumCSE, Lecture 2, Sept 22, 2014
33/34
34/34
Goals
I
2/34
Topics of today
I
Bisection algorithm
Reference
I
Next time
I
3/34
The problem
Want to find solutions of the scalar nonlinear equation
f (x) = 0
=
a6=0
x =
b
a
4/34
Examples
1. f (x) = x 1 on [a, b] = [0, 2].
2. f (x) = sin(x)
On [a, b] = [ 2 , 3
2 ] there is one root x = .
On [a, b] = [0, 4] there are five roots, cf. Fig. on next page.
3. f (x) = x 3 30x 2 + 2552 on [0, 20].
4. f (x) = 10 cosh(x/4) on < x <
cosh(t) =
1
2
(e t + e t )
5/34
Examples (cont.)
6/34
7/34
Even for polynomials this holds only for very low orders.
(Plot)
8/34
9/34
Bisection
I
10/34
Code bisect
function [x,fx] = bisect(f,a,b,tol)
%BISECT [x,y] = bisect(f,a,b) computes a zero of f(x) = 0
% in interval [a,b] assuming that f(a)f(b) < 0.
fa = f(a);
fb = f(b);
11/34
(x < b)),
try with
f=@(x) cos(x)*cosh(x)+1
[x,fx] = bisect(f,0,3,1e-9)
[x,fx] = bisect(f,0,3)
NumCSE, Lecture 3, Sept 25, 2014
12/34
Bisection (cont)
I
13/34
Properties of bisection
I
Simple
Slow
14/34
f (x) = 0
as
x = g (x).
()
so that f (x ) = 0 g (x ) = x .
x is a fixed point of the mapping g .
15/34
g (x) = x.
Then:
1. Start from an initial guess x0 .
2. For k = 0, 1, 2, . . . set
xk+1 = g (xk ),
k = 0, 1, . . .
16/34
x [0, 1]
17/34
xk+1 := g1 (xk )
0.500000000000000
0.606530659712633
0.545239211892605
0.579703094878068
0.560064627938902
0.571172148977215
0.564862946980323
0.568438047570066
0.566409452746921
0.567559634262242
0.566907212935471
xk+1 := g2 (xk )
0.500000000000000
0.566311003197218
0.567143165034862
0.567143290409781
0.567143290409784
0.567143290409784
0.567143290409784
0.567143290409784
0.567143290409784
0.567143290409784
0.567143290409784
xk+1 := g3 (xk )
0.500000000000000
0.675639364649936
0.347812678511202
0.855321409174107
-0.156505955383169
0.977326422747719
-0.619764251895580
0.713713087416146
0.256626649129847
0.924920676910549
-0.407422405542253
18/34
|xk x |
0.067143290409784
0.039387369302849
0.021904078517179
0.012559804468284
0.007078662470882
0.004028858567431
0.002280343429460
0.001294757160282
0.000733837662863
0.000416343852458
0.000236077474313
|xk x |
0.067143290409784
0.000832287212566
0.000000125374922
0.000000000000003
0.000000000000000
0.000000000000000
0.000000000000000
0.000000000000000
0.000000000000000
0.000000000000000
0.000000000000000
|xk x |
0.067143290409784
0.108496074240152
0.219330611898582
0.288178118764323
0.723649245792953
0.410183132337935
1.186907542305364
0.146569797006362
0.310516641279937
0.357777386500765
0.974565695952037
19/34
20/34
x (a, b),
21/34
22/34
g1 : linear convergence?
g2 : quadratic convergence?
g3 : no convergence?
23/34
24/34
left: 1 < g 0 (x ) 0
at least linear convergence
right: g 0 (x ) < 1
divergence
25/34
left: 0 g 0 (x ) < 1
at least linear convergence
right: 1 < g 0 (x )
divergence
26/34
Rate of convergence
Let x be a fixed point of the iteration xk+1 = g (xk ) and
= |g 0 (x )|
with
0 < < 1.
27/34
0.999
0.99
0.90
0.50
0.10
rate
0.00044
0.0044
0.046
0.30
1
k
2302
230
22
4
1
28/34
29/34
Termination criteria
Residual based termination:
|f (xk )|
=
prescribed tolerance > 0
= no guaranteed accuracy
f
30/34
abs
|xk+1 xk | or
rel |xk |
abs /rel are prescribed absolute / relative tolerances > 0
Usually (but not always) the relative criterion is more robust than
the absolute one.
A combination of the first two is
31/34
|xk+1 x | 1 |xk+1 xk |
That is: is we know (or can estimate) then the formula tells us
how close we are to the solution x .
Proof.
Idea of the proof: Bound |xk+m xk+1 |, m > 1, independent of m
and let m .
|xk+2 xk+1 | = |g (xk+1 ) g (xk )| |xk+1 xk |
NumCSE, Lecture 3, Sept 25, 2014
32/34
m2
m2
+ + )|xk+1 xk |
m3
+ + 1)|xk+1 xk |
1 m1
|xk+1 xk | <
|xk+1 xk |
1
1
m > 2
33/34
34/34
Topics of today
I
Zeroin (fzero)
Multiple zeros
Scalar optimization
References
I
2/40
3/40
Newton iteration
I
(1)
4/40
k = 0, 1, . . .
(2)
5/40
6/40
f (xk )
f 0 (xk )
7/40
8/40
a, a > 0:
f (x) = x 2 a,
f 0 (x) = 2x,
x2 a
x2 + a
1
a
=
= (x + ),
2x
2x
2
x
2+a
x
1
a
g 0 (x) = 1
= 2,
g 0 ( a) = 0,
2
2x
2 2x
a
1
g 00 (x) = 3 ,
g 00 ( a) = .
x
a
g (x) = x
a
1
xk+1 = (xk + )
2
xk
NumCSE, Lecture 4, Sept 29, 2014
|xk+1
a| =
1
|xk a|2 .
2|xk |
9/40
xk
2.00000000000000000
1.50000000000000000
1.41666666666666652
1.41421568627450966
1.41421356237468987
1.41421356237309492
ek := xk 2
0.58578643762690485
0.08578643762690485
0.00245310429357137
0.00000212390141452
0.00000000000159472
0.00000000000000022
log
|ek |
|ek1 |
: log
|ek1 |
|ek2 |
1.850
1.984
2.000
0.630
10/40
g (x) = 2 cosh(x/4).
11/40
2 cosh(xk /4) xk
.
0.5 sinh(xk /4) 1
12/40
0
-4.76e-1
1
8.43e-2
2
1.56e-3
3
5.65e-7
4
7.28e-14
5
1.78e-15
13/40
Order of convergence
The method is said to be
I
14/40
Sketch of a proof.
(i) It is easy to verify that
g 0 (x) =
f (x)f 00 (x)
.
(f 0 (x))2
So, g 0 (x ) = 0.
15/40
16/40
Another example
The equation f (x) = x 3 3x + 2 = (x + 2)(x 1)2 has two zeros:
2 and 1. The Newton iteration is
2xk
2
+
3
3(xk + 1)
2x(x + 2)
2
2
g 0 (x) =
=
3(x + 1)2
3 3(x + 1)2
xk+1 = g (xk ) =
g 0 (2) = 0
1
g 0 (1) =
2
The double zero makes troubles.
17/40
f (x)
(x x )m h(x)
=
x
f 0 (x)
m(x x )m1 h(x) + (x x )m h0 (x)
(x x )h(x)
=x
m h(x) + (x x )h0 (x)
18/40
19/40
k = 0, 1, . . .
20/40
Damped Newton
To avoid overshooting one can damp (shorten) the Newton step
xk+1 = xk k f (xk )/f 0 (xk ),
k = 0, 1, . . .
21/40
Secant method
xk+1 = xk f (xk )
xk xk1
f (xk ) f (xk1 )
secant method.
f (xk ) f (xk1 )
.
xk xk1
22/40
23/40
24/40
0
2.26
1
-4.76e-1
2
-1.64e-1
3
2.45e-2
4
-9.93e-4
5
-5.62e-6
6
1.30e-9
25/40
f 0 (t) = q 0 (t),
f 00 (t) = q 00 (t),
f (t2 ) = q(t2 ),
f (t3 ) = q(t3 ),
26/40
Inverse interpolation
Given a data set
(xi , yi = f (xi )),
i = 0, 1, . . . , n.
27/40
y yk1
y yk
f 1 (yk1 )
yk yk1
yk yk1
28/40
xk+1 =
29/40
30/40
xk
0.08520390058175
0.16009252622586
0.79879381816390
0.63094636752843
0.56107750991028
0.56706941033107
0.56714331707092
0.56714329040980
f (xk )
-0.90721814294134
-0.81211229637354
0.77560534067946
0.18579323999999
-0.01667806436181
-0.00020413476766
0.00000007367067
0.00000000000003
ek := xk x
-0.48193938982803
-0.40705076418392
0.23165052775411
0.06380307711864
-0.00606578049951
-0.00007388007872
0.00000002666114
0.00000000000001
3.33791154378839
2.28740488912208
1.82494667289715
1.87323264214217
1.79832936980454
1.84841261527097
31/40
Matlabs fzero
Combine the reliability of bisection with the convergence speed of
secant and inverse quadratic interpolation (IQI). Requires only
function evaluation.
Outline:
I
I
I
I
32/40
If the IQI or secant step is in the interval [a, b], take it.
33/40
Comparision of methods
Comparison of some methods for computing the smallest zero of
f (x) = cos(x) cosh(x) + 1 = 0. (x 1.8751, tol = 109 )
method
bisection
secant method
secant method
Newton
fzero
fzero
start
[0, 3]
[1.5, 3]
[0, 3]
x0 = 1.5
[0, 3]
[1.5, 3]
# steps
32
8
15
5
8
7
# function evals
34
9
16
11
10
9
34/40
f (x)
,
x z
35/40
f 0 (x)
xz
f (x)
(xz)2
f (x)
xz
1
f 0 (x)
.
f (x)
x z
1
f 0 (xk ) 1
f (xk ) xk z
36/40
37/40
h2 00
(x ) +
2
h2 00
[ (x ) + O(h)].
2
38/40
39/40
x
10
sinh( ) 1,
4
4
00 (x) = f 0 (x) =
10
x
cosh( ).
16
4
40/40
Topics of today
I
Reference
I
AscherGreif, Section 4.
2/35
n
X
xi i (t).
i=1
i = 1, . . . , m.
(1)
3/35
A Rnn
4/35
1 t1 t12 t1n1
1 t2 t 2 t n1
2
2
A = . . . .
.. .
..
.. .. ..
.
1 tn tn2
tnn1
1i<jn (tj
ti ).
5/35
Numerical example
t = [0,0.1,0.8,1]; b = [1,-0.9,10,9];
A = fliplr(vander(t)); % notice the definition of
% MATLABs vander function
x = A \ b;
% This solves the system Ax = b
6/35
i = 1, . . . , m,
7/35
m
X
i=1
= min
x
m
X
i=1
"
bi
n
X
#2
xj j (ti )
i=1
A Rmn , b Rm , x Rn , m n.
8/35
=
=
=
=
[0,0.1,0.8,1]; b = [1,-0.9,10,9];
fliplr(vander(t));
A(:,1:3); % only quadratic polynomial
A \ b;
% This solves the system Ax = b
9/35
10/35
i = 1, . . . , N 1
or
1 2 1
v0
g (t1 )
v1 g (t2 )
1 2 1
..
..
1 2 1
=
.
.
h2
.
.
.
.
.
.
vN1 g (tN2 )
.
.
.
1 2 1
vN
g (tN1 )
NumCSE, Lecture 5, Oct 2, 2014
11/35
12/35
2
1
h2
1
2 1
1
2 1
..
..
.
.
1
v1
v2
v3
..
.
g (t1 )
g (t2 )
g (t3 )
..
.
.
..
2 1 vN1 g (tN1 )
2
2
vN
g (tN )
1
g (tN ).
2
13/35
Picture at
http://www.geologie.ens.fr/~vigny/images/gps-cours1.jpg
NumCSE, Lecture 5, Oct 2, 2014
14/35
Picture from
http://en.wikipedia.org/wiki/
Global_Positioning_System
15/35
16/35
u=
si =
z
zi ,
b
i
| {z }
{z
}
|
data from satellite i
unknown data about receiver
Actual distance of satelite i and receiver:
q
(xi x)2 + (yi y )2 + (zi z)2 + b = i .
Thus,
(xi x)2 + (yi y )2 + (zi z)2 = (i b)2 ,
17/35
1
1
hsi , si i hsi , ui + hu, ui = 0,
2
2
i.
(2)
18/35
1
x1 y1 z1 1
hs1 , s1 i
x2 y2 z2 2
1
1 hs2 , s2 i
1
B=.
, a = . , e = . , = hu, ui.
.
.
.
.
.
.
.
.
.
2 .
2
.
.
.
.
.
1
xn yn zn n
hsn , sn i
B is an n 4 matrix, a and e are n-vectors, and is a scalar. We
now can simultaneously write the n equations from (2) as
a Bu + e = 0
Bu = (a + e)
19/35
(3)
(Q)
20/35
u2 = B + (a + 2 e).
(4)
It turns out that only one of these two solutions is useful, namely
the one that leads to receiver coordinates that are on the surface
of the earth.
To check this, the first three components of u are used.
21/35
22/35
1
(x1 + + xN )
N
Let
xk = xk m,
k = 1, . . . , N.
23/35
1
BB T Rpp .
N 1
24/35
4
B = [
x1 ,
x2 ,
x3 ,
x4 ] = 2
4
10
6
8
S = 6
0 8
1
2 3
2
4 0 .
8 4 0
0
8
32
25/35
26/35
27/35
1
BT
N 1
AT A = S.
Let
UV T = A
be the economic singular value decomposition of A. Then
AT A = V U T UV T = S
V T SV = 2 = diagonal.
p
X
j2 .
j=1
28/35
22 = 13.8430,
32 = 1.6057,
tr(S) = 50.
Since 34.5513
+ 13.8430
= 48.3943
= 0.9679 the first two component
50
50
50
contain 97% of the total variation (information).
NumCSE, Lecture 5, Oct 2, 2014
29/35
30/35
31/35
32/35
33/35
34/35
Further reading
I
35/35
2/40
General remarks
I
I
I
3/40
Pivoting
Diagonally dominant matrices
Cholesky factorization
References
I
4/40
a22 a2n
A=
..
..
.
.
ann
Algorithm: backward substitution
x = zeros(n,1);
for k=n:-1:1
x(k) = (b(k) - a(k,k+1:n)*x(k+1:n))/a(k,k);
end
Complexity: n(n 1)/2 subtractions/multiplications and n
divisions = n(n + 1)/2 = O(n2 ) floating point operations.
5/40
Example:
x1 4x2 + 3x3 = 2,
5x2 3x3 =
7,
2x3 = 2.
1 4
3
x1
2
0
5 3 x2 = 7
0
0 2
x3
2
Solution: x3 = 1, x2 = 2, x1 = 3.
6/40
a11
a12 a22
A= .
.
.
.
.
.
.
.
.
an1 an2
ann
7/40
Gaussian elimination
To solve the linear system of equation A x = b we proceed as
follows:
1. Compute LU factorization
LU = A.
with L a lower unit triangular matrix and U an upper
triangular matrix, respectively.
2. Solve Ly = b by forward substitution. (Can be fused with 1.)
3. Solve Ux = y by backward substitution.
The cost is 32 n3 + O(n2 ) for the LU factorization and 2n2 + O(n)
for each forward and backward substitution, in total
2 3
n + O(n2 ) flops.
3
NumCSE, Lecture 6, Oct 6, 2014
8/40
row interchanges.
9/40
1
`21 1
`31
(1)
1
L =
..
.
.
.
.
`n1
1
After the first step, the modified linear system of equations is
A(1) x = b(1) ,
NumCSE, Lecture 6, Oct 6, 2014
A(1) = L(1) A,
b(1) = L(1) b.
10/40
`32 1
(1)
L =
..
.
.
.
.
`n2
1
After the 2nd step, the linear system of equations is
A(2) x = b(2) ,
11/40
1
`21 1
`31 `32 1
(1) 1 (2) 1
(n1) 1
L=L
L
L
=
.
..
..
..
..
.
.
.
.
`n1 `n2 `n3 1
12/40
Numerical example
I
Solve Ax = b for
1
A = 1
3
1 3
1 0
2 1
2
b = 5
6
1 0 0
1 1 3
2
= 1 1 0 , A(1) = L(1) A = 0 2 3 , b(1) = L(1) b = 7
3 0 1
0 1 8
12
L(1)
1 1
3
2
1 0 0
= 0 1 0 , A(2) = L(2) A(1) = 0 2 3 , b(2) = L(2) b(1) = 7
17
0 0 13
0 12 1
2
2
L(2)
13/40
Note that
[L(1) ]1 [L(2) ]1
1
= 1
3
0
1
0
1
0
0 0
1
0
0
1
1
2
0
1
0 = 1
1
3
0
1
1
2
0
0 = L
1
and
1
LU = 1
3
0
1
1
2
0
1
0 0
1
0
1
3
1
2 3 = 1
3
0 13
2
1 3
1 0 = A.
2 1
14/40
15/40
k=1
16/40
0 1
x1
4
=
1 1
x2
7
x1 x2 1
0 1 4
1 1 7
17/40
x1
x2
1
0.00035 1.2654
3.5267
0
4535.0 12636
with
l21 = 1.2547/0.00035 3584.9
u22 = 1.3182 3584.9 1.2654 1.3182 4536.3 4535.0
c2 = 6.8541 3584.9 (3.5267) 6.8541 12643 12636.
18/40
x2 = 2.7863
19/40
|apk
(k1)
| = max |aik
ik
20/40
x1
x2
1
1.2547 1.3182 6.8541
0
1.2650 3.5248
Then
l21 = 0.00027895
x2 = 2.7864
x1 = (6.8541 1.3182 2.7864)/1.2547
(6.8541 3.6730)/1.2547 = 3.1811/1.2547 2.5353
There is a deviation from the exact solution in the last digit only.
NumCSE, Lecture 6, Oct 6, 2014
21/40
GEPP for A x = b
With partial pivoting we have after n 1 stages an upper
triangular matrix U,
U = A(n1) = L(n1) P (n1) L(2) P (2) L(1) P (1) A.
For n = 4 this is
U = L(3) P (3) L(2) P (2) L(1) P (1) A.
This can be rewritten as
U = L(3) (P (3) L(2) P (3)
|
P (3)
22/40
23/40
24/40
GEPP stability
Question: Does the incorporation of a partial pivoting procedure
guarantee stability of the Gaussian elimination algorithm, in the
sense that roundoff errors do not get amplified by factors that
grow unboundedly as the matrix size n increases?
We have seen that |lik | 1.
i.e., the elements in U do not grow too fast in the course of GE.
Unfortunately, it is not possible in general to guarantee stability of
GEPP.
25/40
((aij ))ni,j=1
A10
1
1
1
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
1
1
aij = 1
with
0
0
1
1
1
1
1
1
1
1
0
0
0
1
1
1
1
1
1
1
0
0
0
0
1
1
1
1
1
1
, if i = j j = n
, if i > j
else.
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
1
1
0
0
1 1
1
0
1 1 1
1
1 1 1 1
1
1
1
1
1
1
1
1
1
1
26/40
Complete pivoting
In complete pivoting we look for (one of) the largest element in
modulus
(k1)
(k1)
|aqr | = max |aij
|
ki,jn
27/40
()
k6=i
Theorem
If A is nonsingular and diagonally dominant then the LU
factorization can be computed without pivoting.
28/40
Proof
We show that after reduction of the first row, the reduced system
is again diagonally dominant. We have
(1)
aik = aik
ai1 a1k
,
a11
i, k = 2, . . . , n.
i = 2, . . . , n.
(1)
|aii |
n
X
(1)
|aik |,
i = 2, . . . , n.
k=2
k6=i
29/40
Proof (cont.)
Now for the sum of the moduli of the off-diagonal elements of row
i (i = 2, . . . , n) we get
n
X
k=2
k6=i
(1)
|aik |
n
n
n
X
ai1 X
ai1 a1k X
|aik | +
|a1k |
=
aik a11
a
11
k=2
k=2
k=2
k6=i
n
X
k=1
k6=i
k6=i
k6=i
)
( n
ai1 X
|aik | |ai1 | +
|a1k | |a1i |
a11
k=2
ai1
ai1 a1i (1)
a
|aii | |ai1 | +
{|a11 | |a1i |} = |aii |
ii
a11
a11
30/40
for all x 6= 0
31/40
Proof.
1. aii = eT
i Aei > 0
2. (ei + ek )T A(ei + ek ) = aii 2 + 2aik + akk > 0, . This
quadratic equation has no real zero , therefore its
2 4a a
discriminant 4aik
ii kk must be negative.
3. Clear.
NumCSE, Lecture 6, Oct 6, 2014
32/40
= yT Ay > 0.
33/40
Cholesky decomposition
From the proof we see that
A = LU = LDLT
with
U = DLT
Definition
Let L1 = LD 1/2 . Then
A = L1 LT
1
is called the Cholesky decomposition of A.
NumCSE, Lecture 6, Oct 6, 2014
34/40
35/40
(Cholesky decomposition)
(Forward substitution)
(Backward substitution)
36/40
Efficient implementation
I
37/40
38/40
39/40
Important remark
I
I
I
I
40/40
Topics of today
I
Fill-in
Error estimation
2/41
3/41
4/41
VLSI chips
Modern electric circuits (VLSI chips):
105 107 circuit elements
I Each element is connected to only
a few nodes
I
5/41
6/41
7/41
Banded matrices
a11
..
.
a
p1
A=
I
I
I
I
I
. . . a1q
..
.
..
.
..
.
..
.
..
.
..
..
..
.
..
..
.
..
.
an,np+1 . . .
anq+1,n
..
.
ann
So, aij = 0, if i j + p, or if j i + q.
The bandwidth is p + q 1.
Bandwidth = 1: diagonal matrices (p = q = 1)
Bandwidth = 3: tridiagonal matrices (p = q = 2)
Matrices of this type are supported by LAPACK.
8/41
a11
a21
A=
a31
a12
a22 a23
a11
a21
a31
a12
a22
a32
a42
9/41
0
0
0
0
0 0 0
0
0
0
0 0
0
0 0
0 0
10/41
11/41
12/41
13/41
A=
2
3
4
5
6
8
7
10
[1, 2, 5, 3, 6, 9, 4, 8, 7, 10]
col ind =
[1, 2, 3, 2, 3, 5, 2, 4, 3, 5]
row ptr =
[1, 2, 4, 7, 9, 11]
Explanation:
1. real/complex vector for the matrix elements,
2. int vector with column index of each matrix element,
3. int vector, encoding the partition of val and row ptr.
NumCSE, Lecture 7, Oct 9, 2014
14/41
15/41
16/41
Reordering can make a difference!
NumCSE, Lecture 7, Oct 9, 2014
17/41
18/41
A=gallery(poisson,n);
spy(A);
19/41
subplot(1,3,1), spy(A)
p = symrcm(A);
p = symamd(A);
subplot(1,3,2), spy(Arcm,red)
subplot(1,3,3), spy(Aamd,green
Arcm = A(p,p);
Aamd = A(p,p);
pause
L = chol(A,lower);
Lrcm = chol(Arcm,lower);
Lamd = chol(Aamd,lower);
subplot(1,3,1), spy(L)
subplot(1,3,2), spy(Lrcm,red)
subplot(1,3,3), spy(Lamd,green)
20/41
Error estimation
Two questions regarding the accuracy of
x as an approximaton to
the solution x of the linear system of equations A x = b.
1. First we investigate what we can derive from the size of the
residual r = b A
x.
Note that r = b A x = 0.
2. Then, what is the effect of errors in the initial data (b, A) on
the solution x? That is, how sensitive is the solution to
perturbations in the initial data?
21/41
0.8642
b=
0.1440
x=
0.4870
108
108
r = b A x =
= krk = 108
2
Since x =
= k
x xk = 1.513 which is 108
2
times larger than the residual!
Then,
22/41
23/41
kE k2
1
=
.
kAk2
2 (A)
24/41
1.513
2
<
3.27
0.8642
kzk
krk
(A)
kxk
kbk
25/41
,
0 7.7 109
0 0.86480.2161
0.1441
1.2969
So, indeed,
kE k
1
.
kAk
(A)
(This estimate holds in `2 and ` norms.)
26/41
x = x + x.
27/41
Ax = b A x A x.
28/41
29/41
1
kAk
.
kxk
kbk
Therefore, we have
kxk
kA1 k
kbk
+
kAk
kxk
(1 kA1 kkAk) kxk
kA1 kkAk
kbk kAk
+
(1 kA1 kkAk) kbk
kAk
NumCSE, Lecture 7, Oct 9, 2014
30/41
+
kxk
kAk
1 (A) kAk kbk
kAk
31/41
Rule of thumb
Lets assume we compute with d decimal digits such that the
initial errors in the input data are about
kbk
5 10d ,
kbk
kAk
5 10d .
kAk
32/41
33/41
Note on A and b
It can be shown that Gaussian elimination with pivoting yield a
perturbation bounded by
kAk (n)gn (A)
where (n) is a low order polynomial in the order of the matrix n
(cubic at most) and is the rounding unit. The bound on forward
and backward substitutions and on b are significantly smaller.
Thus, as long as the pivoting keeps gn (A) growing only moderately
and n is not too large then the overall perturbations A and b are
not larger than a few orders of magnitude times .
34/41
Scaling
I
I
35/41
Scaling (cont.)
Example (Forsythe & Moler)
10 1000 000
1
1
0
x1
100 000
=
x2
2
36/41
% exact solution of 1s
% corresponding right hand side
37/41
38/41
Theorem on scaling
Theorem (`1 row scaling)
Let A Rnn be nonsingular. Let the diagonal matrix Dz be
defined by
1
n
X
dii = |aij | .
j=1
Then
(Dz A) (DA)
39/41
Remark on determinants
Although det(A) = 0 for singular matrices, small determinants do
not indicate bad condition!
1 1 1
..
..
..
.
.
.
A=
..
. 1
1
NumCSE, Lecture 7, Oct 9, 2014
40/41
Condition estimation
After having solved A x = b via PA = LU we would like to
ascertain the number of correct digits in the computed
x.
Need estimate of (A) = kAk kA1 k
Known: kAk = max
n
P
|aij |
1in j=1
41/41
Topics of today
A basic problem in science is to fit a model to observations subject
to errors.
The more observations that are available, the more accurately will
it be possible to calculate the parameters in the model.
This gives rise to the problem of solving overdetermined linear or
nonlinear systems of equations.
(Ake Bj
orck, Numerical Methods for Least Squares Problems, SIAM,
1996)
2/25
b Rm , x Rn ,
m>n
or
mn
Reference
I
AscherGreif, Chapter 6.
3/25
http://en.wikipedia.org/wiki/Least_squares
NumCSE, Lecture 8, Oct 13, 2014
4/25
Points measured on
two orthogonal lines
g1 and g2 .
5/25
6/25
7/25
A Rmn ,
b Rm ,
x Rn ,
m > n.
1
kb Axk22
2
is minimized.
8/25
9/25
1
with (x) = krk22
2
(x) = 0,
xk
k = 1, . . . , n.
10/25
11/25
B Rnn .
for all z
Q(z) = 0 Az = 0 z = 0
So, in particular,
1
kA xk22 > 0,
2
for all x 6= 0.
and
(x + x) > (x),
NumCSE, Lecture 8, Oct 13, 2014
for all x 6= 0.
12/25
where A has full column rank, has a unique solution that satisfies
the normal equations
(AT A)x = AT b.
1 T
We have x = AT A
A b. The matrix multiplying b is called
the pseudo-inverse of A:
13/25
Geometrical interpretation
From
grad (x) = AT A x AT b = AT (A x b) = 0
we see that AT r = 0.
Ax
14/25
3. L z = y,
(normal equations B x = y)
(Cholesky factorization)
LT x
4. r = Ax b
=z
(Compute residual)
Complexity:
I
Step 2:
1 3
3n
+ O(n2 ) flops
15/25
1
2
A=
5
3
1
0
1
3
5
53
3 2
R ,
5
4
6
3
4
2
5
b=
5 R
2
1
40 30 10
18
B = AT A = 30 79 47 ,
y = AT b = 5
10 47 55
21
The final solution is x = (0.3472, 0.3990, 0.7859)T , correct to
the number of digits shown. The residual is
r = (4.4387, 0.0381, 0.495, 1.893, 1.311)T .
NumCSE, Lecture 8, Oct 13, 2014
16/25
Data fitting
Generally, data fitting problems arise as follows:
I
The task is to find x such that the predicted data match the
observed data to the extent possible.
Here, we study the linear case where predicted data are given
by Ax.
(The condition that A has maximal rank means that there is
no redundancy in the representation of the predicted data.)
17/25
1 t1
!
Pm
1 t2
m
i=1 ti
A = . .
B = Pm
Pm 2
.. ..
t
t
i=1 i
i=1 i
1 tm
i
ti
bi
1
2
3
0.0 1.0 2.0
0.1 0.9 2.0
0.05
x=
0.95
18/25
v (ti ) bi ,
i = 1, . . . , m.
1 t0 t0n2 t0n1
1 t1 t n2 t n1
1
1
mn
A = . .
.
..
.. R
.. ..
.
.
n2 t n1
1 tm tm
m
19/25
20/25
21/25
22/25
c1 + n1 x + n2 y = 0,
n12 + n22 = 1,
g2 :
c2 n2 x + n1 y = 0,
n12 + n22 = 1,
23/25
1
1
..
.
..
.
0
0
..
.
x1
x2
..
.
y1
y2
..
.
0
1
1
..
.
xm1
ym1 +1
ym1 +2
..
.
ym1
xm1 +1
xm1 +2
..
.
0 1 ym1 +m2
xm1 +m2
0
0
..
.
c1
c2 0
n1 0
0
n2
..
24/25
Remarks
I
25/25
Topics of today
I
b Rm , x Rn ,
m > n,
Reference
I
AscherGreif, Chapter 6.
2/34
1
kb Axk22
2
is minimized.
3/34
A=
0
0
1
0
1
0
,
0
1 + 2
1
1
1 + 2
1 .
B = AT A = 1
1
1
1 + 2
4/34
0.2
0.3
0.5
0.5
1.0
0.8
1.5
1.0
2.0
1.2
3.0
1.3
5/34
0.166667
0.333333
0.500000
0.600000
0.666667
0.750000
0.3
0.181269
0.393469 0.5
0.632121
1 = 0.8
1.0
0.776870
2
1.2
0.864665
0.950213
1.3
6/34
x
+ 1.03782(1 e x ).
1+x
with residual
r (0.0478, 0.0364, 0.0481, 0.0368, 0.0465, 0.0257)T .
Since the eigenvalues of B are 1 4.59627 and 2 0.0009006,
the condition number of B is (B) 5104. We lost 4 out of 6
decimal digits.
A computation with 12 decimal digits gives
1 0.382495,
2 1.03915.
The residual does not change showing the sensitivity of the system.
NumCSE, Lecture 9, Oct 16, 2014
7/34
8/34
Hence also Q 1 = Q T .
9/34
(*)
10/34
c
Q b=
d
T
Then
krk2 = kb Axk2 = kc Rxk2 + kdk2 .
1.
2.
3.
4.
11/34
1 0
A = 1 1
1 2
In Matlab execute
0.1
b = 0.9
2.0
[Q, T] = qr(A)
[m,n] = size(A);
R = T(1:n, 1:n)
Q*b
In the end, we have to solve
1.7321
1.7321 1.7321
c
R
1.3435
0
1.4142 x
min
x
= min
x
x
d
O
0.1225
0
0
NumCSE, Lecture 9, Oct 16, 2014
12/34
Economy size QR
R
R
A=Q
= [ Q1 , Q2 ]
= Q1 R.
|{z} |{z}
O
O
n mn
With the pseudo-inverse of A we get
x = A+ b (AT A)1 AT b = (R T Q1T Q1 R)1 R T Q1T b
= (R T R)1 R T Q1T b = R 1 Q1T b
Notice, that R T R is the Cholesky factorization of B: B = R T R.
13/34
(full version)
(economy version)
14/34
for n = 300:100:1000
m = 3*n+1; % m = n+1; % or something else
A = randn(m,n); b = randn(m,1);
% solve and find execution times; first, Matlabs way usin
tic; xqr = A\b; tqr(n/100-2) = toc;
% next use normal equations
tic; B = A*A; y = A*b; xne = B\y; tne(n/100-2) = toc;
end
plot(300:100:1000,tqr./tne) % plot ratio tqr/tne
15/34
16/34
Rotation
a=
cos
,
sin
> 0,
cos sin
cos
QT a =
sin cos
sin
cos( + )
=
sin( + )
(
Choose s.t.
+ = 0, or
+ = .
17/34
Givens rotation
The orthogonal matrix G (i, j, ) Rn,n rotates coordinates i and j
of some vector. (The other coordinates are not altered.)
q
xj
s = sin = q
.
xi2 + xj2
18/34
19/34
QR decomposition
Theorem (QR decomposition)
For every matrix A Rmn , m > n, with maximal rank n there is
an orthogonal matrix Q Rmm such that
R
A=Q
, with R Rn,n ,
O
with R upper triangular and nonsingular.
Proof.
By successively zeroing elements in A by the right sequence of
Givens rotations we can prove the statement. The sequence is
(1, 2), (1, 3), . . . , (1, m), (2, 3), (2, 4), . . . , (2, m), (3, 4), . . . (n, m).
Later rotations do not destroy previously generated zeros.
NumCSE, Lecture 9, Oct 16, 2014
20/34
x
+ 2 (1 e x ).
1+x
Now we get
R=
1.68492
,
0.0486849
1.32508
0.0
T
c=Q b=
2.25773
,
0.0505901
0.0478834
0.0363745
0.0481185
r = b Ax =
0.0367843
0.0464831
0.0257141
21/34
Householder reflectors
Householder reflectors are very special elementary matrices:
H = I 2uuT ,
kuk2 = 1.
H T = H 1 ,
Hv = v
if v u.
22/34
Therefore,
v := 2u = x kxk2 e1 ,
i.e., we know the direction of u as soon as we have determined the
sign.
In order to avoid cancellation we choose the sign equal to the sign
of x1 :
v = x + sign(x1 )kxk2 e1 .
NumCSE, Lecture 9, Oct 16, 2014
23/34
Example
If x = [3, 1, 5, 1]T and v = [9, 1, 5, 1]T , then
27
vvT
1
9
H =I 2 T =
v v
54 45
9
9
53
5
1
45 9
5 1
29 5
5 53
24/34
Matlab function
25/34
26/34
Householder QR factorization
After the application of three Householder
reflectors to a 20 5 matrix.
I
etc.
27/34
Costs O(nm2 )
[Q,R] = qr(A,0);
Costs O(n2 m)
28/34
Timings of QR with m = n2
29/34
30/34
31/34
32/34
33/34
Tridiagonal systems
Tridiagonal systems can be solved in O(n) flops by Givens
rotations. (Do not generate Q = G1 Gn1 .)
A = full(gallery(tridiag,10,-2,3,-1));
n = size(A,1);
for k=1:n-1,
[c,s] = givens(A(k,k),A(k+1,k));
A(k:k+1,k:n) = [c -s;s c]*A(k:k+1,k:n);
A(k+1,k) = 0;
end
34/34
Topics of today
I
References
I
I
2/40
nn
nonsingular.
3/40
4/40
5/40
Stationary temperature
distribution.
6/40
7/40
u(x, y ) = f (x, y ),
x 2
y 2
u(x, y ) = p(x, y ) on ,
u(x, y ) =
u(x, y ) 2 u(x, y )
2
x
y
1
(4u(x, y ) u(x + h, y ) u(x h, y )
h2
u(x, y + h) u(x, y h)) + O(h2 )
8/40
1
(4u(xi , yj ) u(xi+1 , yj ) u(xi1 , yj )
h2
u(xi , yj+1 ) u(xi , yj1 ))
9/40
10/40
1 i, j n.
11/40
u1,1
u2,1
..
.
uN,1
u1,2
u = u2,2 ,
..
.
uN,2
u1,3
..
.
uN,N
NumCSE, Lecture 10, Oct 20, 2014
b1,1
b2,1
..
.
bN,1
b1,2
b = b2,2 ,
..
.
bN,2
b1,3
..
.
bi,j = h2 f (xi , yj )
+ possible boundary
terms
bN,N
12/40
13/40
J
I
A=
I
J
..
.
I
..
.
I
..
.
J
I
I
J
4
1
J=
1
4 1
..
..
.
.
1
,
.
4 1
1
4
..
14/40
In 3 dimensions: n = N 3 . Bandwidth of A is N 2 .
N = 100 n = 106 , N = 1000 n = 109 .
Gaussian elimination with the original matrix ordering costs
O(N 7 ) = O(n2.33 ) flops.
15/40
1 i, j N.
max 8.
Thus i,j > 0 for all 1 i, j N, and we see that the matrix A is
also positive definite.
Knowing the eigenvalues explicitly is helpful in understanding
performance, convergence and accuracy issues related to linear
solvers.
16/40
k = 0, 1, . . .
17/40
M x = Nx + b = M x + b A x.
xk+1 = xk + M 1 rk
Note:
I In practice, M 1 should be a good (but cheaper to invert)
approximation to A1 (preconditioner).
NumCSE, Lecture 10, Oct 20, 2014
18/40
xk+1 = T xk + c
N = M A.
19/40
Theorem on convergence
Theorem (Convergence theorem for stationary iterations)
Let A = M N be a matrix splitting with M invertible. Then,
(1) The iteration
Mxk+1 = Nxk + b,
(+)
T = M 1 N = I M 1 A.
20/40
Convergence rate
I
1
log10 (T )
1
log10 (T ).
k
The rate is higher (faster convergence), the smaller .
rate =
21/40
7
A = 3
1
E = tril(A)
3
1
10
2
7 15
Jacobi iteration:
0
0
7
10
0 , E = 3
0 15
1
0
10
7
0
0
15
xk+1 = xk + D 1 rk
GaussSeidel iteration:
NumCSE, Lecture 10, Oct 20, 2014
7
D = 0
0
xk+1 = xk + E 1 rk
22/40
Jacobi example
7x1 = 3 3x2 x3 ,
10x2 = 4 + 3x1 2x3 ,
15x3 = 2 x1 7x2 .
Evaluate right hand side at current iterate k and left hand side as
unknown at step k + 1:
(k+1)
x1
(k+1)
x2
(k+1)
x3
1
(k)
(k)
3 3x2 x3
,
7
1
(k)
(k)
=
4 + 3x1 2x3
,
10
1
(k)
(k)
=
2 x1 7x2
.
15
=
23/40
GaussSeidel example
7x1 = 3 3x2 x3 ,
10x2 3x1 = 4 2x3 ,
15x3 + x1 + 7x2 = 2.
Evaluate right hand side at current iterate k and left hand side as
unknown at step k + 1:
(k+1)
x1
(k+1)
x2
(k+1)
x3
1
(k)
(k)
3 3x2 x3
,
7
1
(k+1)
(k)
=
4 + 3x1
2x3
,
10
1
(k+1)
(k+1)
=
2 x1
7x2
.
15
=
24/40
1 i, j N.
Eigenvalues
l,m = 4 2(cos
I
l
m
+ cos
),
N +1
N +1
1 l, m N.
1
T = I D 1 A = I A.
4
25/40
Spectral radius
(T ) = cos
1 l, m N.
1 .
N +1
n
Rate of convergence
rate = log (T ) 1/n.
Thus, O(n) iterations are required for error reduction by a
constant factor.
26/40
(A) tol.
kxk
kbk
Thus, the tolerance must be taken suitably small if the
condition number of A is large.
27/40
kT k
kxk xk1 k
1 kT k
28/40
Both methods are simple but slow. Used as building blocks for
faster, more complex methods.
29/40
30/40
31/40
2
q
> 1,
1 + 1 2J
32/40
2
,
1 + sin n+1
.
1 + sin n+1
1 + sin N+1
7. Once the optimal parameter is used, convergence is much
faster than with Jacobi or GaussSeidel.
33/40
M x(k+ 2 ) = N x(k) + b,
e x(k+1) = N
e x(k+ 12 ) + b,
M
e = 1 D L.
N
Note:
I
34/40
A11
A21
A= .
..
A12
A22
..
.
Am1 Am2
A1m
A2m
..
.
Amm
A11
A21
M= .
..
A22
..
.
Am1 Am2
..
Amm
35/40
MATLAB
M = D = diag(diag(A))
M= DB = triu(tril(A,1),-1)
M=tril(A)
M=tril(A,1)
M1 =tril(A)/sqrt(D); M2 = M1T
M1 =tril(A,1)/chol(DB ); M2 = M1T
D/omega + tril(A,-1)
triu(tril(A,1),-1)/omega + tril(A,-m)
nit
341
176
174
90
90
48
32
24
36/40
n = 312
2157
1093
1085
547
85
61
n = 632
7787
3943
3905
1959
238
132
37/40
38/40
Iterative refinement/improvement
Let Ax = b be solved via GEPP, PA = LU.
We wish to improve the accuracy of the computed solution
x.
If we execute
r = b A
x
Solve L y = Pr
Solve U z = y
x0 =
x+z
then, in exact arithmetic,
A
x0 = A
x + A z = (b r) + r = b.
39/40
on iterative improvement
0.58824;
0.64286;
0.36842;
0.38462]
0.15380]
x = single(A)\single(b);
r = b - A*x
r = b - A*double(x)
X = A\b;
R = b - A*X
x-X, r-R
z = single(A)\r;
x = x + z
40/40
Topics of today
I
References
I
2/47
Nonstationary methods
I
I
I
I
I
I
3/47
4/47
kxkA =
5/47
Descent directions
Definition: Suppose that for a functional and vectors x, d there
is a 0 such that
(x + d) < (x),
0 < 0 .
2 T
d Ad.
2
6/47
7/47
8/47
max min
kek kA .
max + min
(A) 1
(A) + 1
k+1
ke0 kA .
9/47
10/47
Iteration steps
How many iteration steps k = k() are needed such that
kep kA ke0 kA ?
We write
1
1 + 1/(A)
1 1/(A)
k
11/47
A2 =
7.5353 1.8588
1.8588
0.5647
(A2 ) = {0.1, 8}
12/47
13/47
LT x = y.
Note:
I
14/47
15/47
16/47
A by M 1 A,
rk by zk = M 1 rk , and
So,
k =
zT
zT
zT
k Mzk
k rk
k Mzk
=
=
TAz
Ts
1 A z )
zT
M(M
z
z
k
k
k
k
k k
rk+1 = rk k sk .
17/47
18/47
19/47
(A1 ) = 2,
7.5353 1.8588
1.8588
0.5647
A2 =
(A2 ) = 80.
Jacobi: M = D 2 = diag(diag(A2 ))
1/2
(D2
1/2
A2 D2
1/2
1/2
A2 D2
) = 19.22
1/2
20/47
21/47
(xk+1 )=min
k =
pT
k rk
T
pk Apk
rk+1 = rk k Apk
Search directions must be conjugate:
T
pT
k1 Apk = pk1 A(rk +k1 pk1 ) = 0
k1 =
rkT Apk1
pT
k1 Apk1
rkT rk
k
= T
,
T
pk Apk
pk Apk
k =
k+1
,
k
k = krk k2 .
22/47
Theorem
k 6= j.
(1)
Proof. By induction.
Theorem
In the k-th step of the cg method, xk minimizes (x) in
x0 + Kk (A; r0 ) where the Krylov space Kk (A; r0 ) is defined by
Kk (A; r0 ) = span(r0 , Ar0 , A2 r0 , . . . , Ak1 r0 ).
and r0 = b Ax0 is the initial residual.
In general, dim(Kk (A; r0 )) = k.
23/47
Note
We do not consider CG as an algorithm that terminates after a
finite number of steps (as Gaussian elimination) but as an iterative
method.
24/47
25/47
Convergence of CG
Convergence rate:
kek kA 2
!k
p
(A) 1
p
ke0 kA .
(A) + 1
(2)
26/47
27/47
Algorithm PCG
Choose x0 . Set r0 = b Ax0 . Solve Mz0 = r0 . 0 = zT
0 r0 .
Set p0 = z0 .
for k = 0, 1, . . ., maxit do
sk = Apk .
k = k /pT
k sk .
xk+1 = xk + k pk .
rk+1 = rk k sk .
Solve Mzk+1 = rk+1 .
(Solve with preconditioner)
k+1 = zT
k+1 rk+1 .
if k+1 < 2 0 exit.
(Test for convergence)
k = k+1 /k .
pk+1 = zk+1 + k pk .
endfor
28/47
PCG tests
Problem: Poisson equation on square [revisited]
PCG with preconditioner as indicated
preconditioner
Jacobi
block Jacobi
symmetric Gauss-Seidel
symmetric block Gauss-Seidel
SSOR ( = 1.8)
block SSOR ( = 1.8)
n = 312
76
57
33
22
18
15
n = 632
149
110
58
39
26
21
29/47
LLT = A,
30/47
Numerical example
Jacobi
0.45 (76)
18.0 (234)
Block Jacobi
1.23 (57)
54.1 (166)
Sym. GS
0.34 (33)
10.1 (84)
ICCG(0)
0.28 (28)
8.8 (73)
31/47
32/47
33/47
with A 6= AT
(3)
We could solve
AT Ax = AT b
by the (preconditioned) conjugate gradient method.
Two potential issues:
1. Each iteration step requires multiplication with A and AT .
2. Condition number (AAT ) = (AT A) = 2 (A).
Consensus:
Only when A is well conditioned and matvec with AT is cheap.
NumCSE, Lecture 11, Oct 23, 2014
34/47
Krylov spaces
In steepest descent and CG we have
rk+1 = rk k Apk ,
pk = rk + k1 pk1 ,
k
X
cj Aj r0
rk = pk (A)r0 ,
pk (0) = 1,
j=1
Moreover,
Pk
j
j=1 cj A rj
= rk r0 = A(xk x0 ). So,
xk = x0
k1
X
cj+1 Aj r0
()
j=0
NumCSE, Lecture 11, Oct 23, 2014
35/47
36/47
37/47
38/47
j
X
vi hij .
i=1
hj+1,j := kwj k2 .
Thus,
Avj =
j+1
X
vi hij .
i=1
NumCSE, Lecture 11, Oct 23, 2014
39/47
40/47
Arnoldi relation
Define Vk := [v1 , . . . , vk ]. Then we get the Arnoldi relation
AVk = Vk+1 Hk+1,k .
with
Hk+1,k
h11
h21
h12
h22
h3,2
..
.
h1,k
h2,k
h3,k
..
.
R(k+1)k
hk+1,k
41/47
42/47
GMRES
The GMRES (Generalized Minimal Residual) algorithm computes
xm Km that leads to the smallest residual exploiting the Arnoldi
relation.
Goal: Cheaply minimize
krm k2 = kb Axm k2 ,
xm x0 + Km (A, r0 ),
43/47
44/47
Preconditioning
I
u(x, y ) +
u
u
+
= g (x, y ).
x
y
45/47
Preconditioning (cont.)
46/47
Practical issues
I
I
I
47/47
Topics of today
1. Eigenvalue problems, definitions
2. Power method and variants
3. Googles pagerank
References
I
2/34
Ax = x
(A + I )x = ( + )x
Ax = x
Ak x = k x
3/34
Ay =
n
X
j=1
j Axj =
n
X
xj j and
j j xj .
j=1
4/34
5/34
Numerical example
I
0.9880
1.8000 0.8793 0.5977 0.7819
1.9417 0.5835 0.1846 0.7250
1.0422
0.8222
1.4453
1.3369 0.6069
0.8043
0.4187 0.2939
1.4814 0.2119 1.2771
has eigenvalues given approximately by 1 = 2,
2 = 1 + 2.5, 3 = 1 2.5, 4 = 2, and 5 = 2.
It is known that closed form formulas for the roots of a
polynomial do not generally exist if the polynomial is of
degree 5 or higher. Thus we cannot expect to be able to solve
the eigenvalue problem in a finite procedure.
6/34
7/34
L
R
~
~
8/34
1
1
L
0
C + L
1
2
1
A = L
C + R1 + L
L
1
1
0
L
C + L
0 0 0
L1
0
C 0 0
L
1 1
2
1
= 0 R1 0 + 0 C 0 +
L
L L
1
1
0 0 C
0 0 0
0 L
L
or
A() := W + C
S,
W , C , S Rnn symmetric
9/34
Apply rescirc.m:
Scan u = A()1 e1
for 0 < < 2.
Plot of |ui (U())|,
i = 1, 2, 3 for
R=L=C =1
(scaled model)
Blow-up of some
nodal potentials for
certain !
NumCSE, Lecture 12, Oct 27, 2014
10/34
11/34
1
S is singular
1
x
1
S)x = 0
x = y:
W S
x
C 0
x
=
I 0
y
0
I
y
| {z } |{z}
|
{z
}
z
M
B
NumCSE, Lecture 12, Oct 27, 2014
12/34
13/34
Resonant frequencies
for circuit (including
decaying modes with
Im() > 0)
14/34
..
A=S
.
A Cnn
1
S ,
n
{z
}
D
y = Ay, z = S 1 y
S Cnn nonsingular
z = Dz
15/34
y(0) = y0 Cn
z = Dz,
z(0) = z0 = S 1 y0
16/34
A vi = i vi ,
k = 1, 2, . . .
Clearly,
xk := Ak x0 .
NumCSE, Lecture 12, Oct 27, 2014
17/34
Convergence
Let xj = V yj , j 0. Then,
yk = V 1 xk = V 1 Axk1 = V 1 V V 1 xk1 = yk1 .
From this we have
k
yk = y0 =
k
= diag(1,
k1
k
y0 .
2
n
, . . . , )k diag(1, 0, 0, . . . , 0).
k
1
1
Therefore,
1
y
k1 k k
NumCSE, Lecture 12, Oct 27, 2014
(0)
y1 e1 ,
(0)
k 1k yk y1 e1 k = O 2 .
1
1
18/34
Convergence (cont.)
From xk = V yk it follows that as k :
xk points in the direction of v1 = V e1 , i.e., into the direction of the
eigenvector coresponding to the largest eigenvalue (in magnitude).
What about the eigenvalue?
Good choice: Rayleigh quotient
(x) =
xT A x
xT x
19/34
v = A vk1
vk =
v/k
vk
(k)
T
1 = vk Avk
end for
The convergence criterion may be, e.g., the angle between vk and
vk1 :
T
T
sin (vk , vk1 ) = kvk vk1 (vk1
vk )k = k(I vk1 vk1
)vk k.
20/34
Example
Absolute error:
(k)
(k)
|1 1 | |1 32|.
31
32
0.968
30
32
0.938
21/34
Then, for any particular query, Google finds the pages on the
Web that match that query and lists those pages in the order
of their PageRank.
22/34
23/34
24/34
0
1
0
G =
0
0
1
3
0
0
1
1
0
0
4
0
0
0
1
1
1
5
1
0
0
0
0
0
6
0
0
0
0
0
0
1
0
0
0
25/34
1
2
0
A=
0
0
1
2
I
0
0
1
2
1
2
0
0
0 1
0 0
0 0
1
3 0
1
3 0
1
3 0
1
6
1
6
1
6
1
6
1
6
1
6
1
0
0
0
26/34
I
I
I
I
I
27/34
x = Ax
exists and is unique to within a scaling factor. If this scaling factor
is chosen so that
n
X
xi = 1
i=1
28/34
Matlab code
function [x,cnt] = pagerankpow(G)
% PAGERANKPOW PageRank by power method with no matrix operations.
% x = pagerankpow(G) is the PageRank of the graph G.
% [x,cnt] = pagerankpow(G) also counts the number of iterations.
% There are no matrix operations. Only the link structure
% of G is used with the power method.
% Link structure
[n,n] = size(G);
for j = 1:n
L{j} = find(G(:,j)); % set of links coming into node j
c(j) = length(L{j}); % in-degree
end
% Power method
p = .85; delta = (1-p)/n;
NumCSE, Lecture 12, Oct 27, 2014
29/34
30/34
31/34
(A I ) xk := xk1 ,
k = 1, 2, . . .
Convergence rate:
1
2
1
1
1
.
2
32/34
33/34
Experiment
A = diag(1, 2, . . . , 30, 31, 32) R3232
Shift = 33, 35
Rayleigh quotient
iteration:
First shift = 33, after
two steps RQ shift.
Cubic convergence if
A is symmetric.
34/34
Topics of today
1. QR algorithm and Matlabs eig
2. Krylov space methods and Matlabs eigs
3. Singular value decomposition
References
I
2/34
(1)
(2)
3/34
4/34
(o)
5/34
The QR algorithm
The QR algorithm (not to be confused with the QR factorization)
is an extremely clever algorithm to compute the Schur
decomposition of an arbitrary matrix A.
At the origin of the algorithm is the observation (mostly due to
Rutishauser) that the sequence {Aj }
j=0 defined by
A0 = A,
Qj Rj = Aj ,
(QR factorization)
Aj+1 = Rj Qj ,
converges to an upper triangular matrix. Since the Aj are similar
the diagonal elements of this upper triangular matrix are the
eigenvalues of A0 .
NumCSE, Lecture 13, Oct 30, 2014
6/34
7/34
[ -4.4529e-01
[ -6.3941e+00
[ 3.6842e+00
[ 3.1209e+00
4.9063e+00
1.3354e+01
-6.6617e+00
-5.2052e+00
-8.7871e-01
1.6668e+00
-6.0021e-02
-1.4130e+00
6.3036e+00]
1.1945e+01]
-7.0043e+00]
-2.8484e+00]
A(20) =
[ 4.0000e+00
[ -3.9562e-04
[ 5.7679e-07
[ -1.0563e-12
2.0896e-02
2.9998e+00
1.8846e-04
-2.6442e-10
-7.0425e+00
8.7266e-01
2.0002e+00
-1.4682e-06
2.1898e+01]
3.2019e+00]
-3.6424e+00]
1.0000e+00]
A(20)./A(19) =
[ 1.0000
[ 0.7495
[ 0.5000
[ -0.2500
0.9752
1.0000
0.6668
-0.3334
1.0000
0.9988
1.0000
-0.4999
-1.0000]
-1.0008]
-1.0001]
1.0000]
8/34
Aj+1 = Rj Qj + I ,
Still Aj+1 Aj .
Right choice of shifts gives quadratic convergence to
individual eigenvalues one by one.
3. Deflation. As soon as one eigenvalue has converged, deflate
and continue computation with smaller submatrix.
NumCSE, Lecture 13, Oct 30, 2014
9/34
10/34
t11 i
t12
t1i
t1n
x1
0
t
x
22
2n
i
2i
2 0
.
.
.
.
.
. . ..
..
.. ..
0
tin
1 0
. .
.
.
..
..
.. ..
tnn i
0
0
can be solved easily provided that tkk 6= i for k < i.
NumCSE, Lecture 13, Oct 30, 2014
11/34
(Hm Hessenberg)
or
AV m = Vm Tm + wm eT
m,
(A symmetric, Tm tridiagonal)
12/34
13/34
14/34
1
.
15/34
eigs
I
16/34
17/34
18/34
kxk=1
kxk=1
n
X
i2 .
i=1
NumCSE, Lecture 13, Oct 30, 2014
19/34
kxk=1
Nullspace of a matrix:
The columns of V corresponding to zero singular values span
N (A).
Range of a matrix:
The columns of U corresponding to positive singular values
span R(A).
20/34
21/34
22/34
.. ..
B is bidiagonal.
U1T AV1 = B =
,
.
.
For details see, e.g., Golub & van Loan, Matrix Computations.
23/34
k
X
i u i v T .
i=1
min
F Rnm
rank(F ) = k
kA F k,
24/34
25/34
26/34
27/34
We proceed as follows:
28/34
Example
We reconsider the problem of solving
1 1
2
A=
,
b=
.
3 3
6
Here, A is singular, but b R(A). Matlab yields the SVD
0.316227766016838 0.948683298050514
U=
0.948683298050514
0.316227766016838
0.707106781186547
0.707106781186548
V =
0.707106781186548 0.707106781186547
4.47213595499958
0
=
0
4.01876204512712e 16
NumCSE, Lecture 13, Oct 30, 2014
29/34
Example (cont.)
We determine that A has rank r = 1. So,
z1 = 6.32455532033676 y1 = 1.4142135623731 x =
1
.
1
Recall that all solutions for this problem have the form
x = (1 + , 1 )T . Since
k
xk22 = (1 + )2 + (1 )2 = 2(1 + 2 ),
we see that our SVD procedure has stably found the solution of
minimum 2-norm for this singular problem.
30/34
kxk=1
kxk=1
= min kV T xk
kxk=1
= min kyk,
kyk=1
y = V T x,
= ken k = n
31/34
c1 + n1 x + n2 y = 0,
n12 + n22 = 1,
g2 :
c2 n2 x + n1 y = 0,
n12 + n22 = 1,
32/34
1
1
..
.
1
c
A
=
0
n
..
.
0
0
..
.
x1
x2
..
.
y1
y2
..
.
0
1
1
..
.
xm1
ym1 +1
ym1 +2
..
.
ym1
xm1 +1
xm1 +2
..
.
0 1 ym1 +m2
xm1 +m2
0
0
..
.
c1
c2 0
n1 0
0
n2
..
33/34
r22 r23
r33
A: A = QR.
c1
r14
c2
r24
0
r34 n1
n2
r44
34/34
Topics of today
I
References
I
2/27
Introduction: Optimization
I
I
3/27
with x = (x1 , x2 , . . . , xn )T Rn .
(Constrained optimization: x in a true subset of Rn .)
I
4/27
2
grad =
.. = 0.
.
xn
5/27
f(x) = 0
fn (x1 , x2 , . . . , xn ) = 0,
where f : Rn 7 Rn .
We claimed earlier that Newtons method extends directly to
higher-dimensional problems. But we need derivatives.
NumCSE, Lecture 14, Nov 3, 2014
6/27
(parabola)
(circle)
Two roots:
1
(1)
x
=
0
0
x (2) =
1
7/27
f
f1
f1
1
. . . x
x1
x2
n
f
2 f2 . . . f2
x1 x2
xn
J(x) =
..
..
.. .
.
.
.
fm
fm
fm
. . . xn
x1
x2
Pointwise: fi (x + p) = fi (x) +
n
P
j=1
fi
xj pj
+ O(kpk2 ), i = 1, . . . , m.
8/27
Newtons method
I
I
I
9/27
10/27
11/27
12/27
f(x ) = 0
Jg (x ) = 0.
13/27
x12
x22
(parabola)
1=0
The Jacobian is
J(x) =
2x1 2 1
2x1
2x2
(circle)
14/27
(k)
x1
1.000000000000000
1.500000000000000
1.083333333333333
1.015350877192982
1.000124514863663
1.000000034615861
1.000000000000001
(k)
x2
1.000000000000000
0
-0.166666666666667
-0.004385964912281
-0.000231826605833
-0.000000015495331
-0.000000000000001
kf(xk )k
1.4142e+00
1.2748e+00
2.6589e-01
3.1300e-02
3.4030e-04
7.0945e-08
1.8841e-15
15/27
16/27
0 < t < 1,
i = 1, . . . , N 1.
17/27
0 < t < 1,
u(0) = u(1) = 0.
The system becomes
2 1
1
2 1
1
1
2 1
2
h
.
.. ... ...
1
2
u1
u2
u3
..
.
e u1
e u2
e u3
..
.
uN1
e uN1
0
0
0
=
..
.
0
+ e ui .
18/27
2 + h2 e u1
1
1
h2
1
2 + h2 e u2
1
fi
uj :
1
2 + h2 e u3
..
.
1
..
.
..
h2 e uN1
2 +
19/27
function J = Dff(u)
% test function Jacobian
global T
f = T*u + exp(u);
J = T + diag(exp(u));
20/27
21/27
22/27
23/27
Example 2 continued
Linear convergence
for = 5.
24/27
25/27
Piecewise linear
convergence for
= 5.
Jacobian is updated
every 5th iteration
step.
26/27
27/27
Topics of today
I
Unconstrained optimization
Descent methods
References
I
2/40
Optimization
I
Maximization:
max (x)
x
x Rn .
and
min (x)
x
3/40
Example 9.4:
For n = 2, x = (x1 , x2 )T , we specify the function
(x) = x12 + x24 + 1.
This function obviously has the minimum value of 1 at
x = (0, 0)T = 0.
Matlab:
x = [-2:.1:2]; y = x;
[X,Y] = meshgrid(x,y);
F = X.^2 + Y.^4 + 1;
surf(x,y,F)
shading interp
NumCSE, Lecture 15, Nov 6, 2014
4/40
5/40
Practical issues
I
6/40
2
2
.
.
.
2
x
x
x
x
n
1
2
1
x1
1
2
2
2
x
.
.
.
2
x2
x2 xn
x2
x2 x1
grad (x) =
,
(x)
=
.
..
..
..
..
.
.
.
.
xn
2
2
2
xn x1
xn x2 . . .
x 2
n
7/40
Local minimum
Let x be a local minimum: (x ) (x) in a neighborhood of x .
Then for any direction p we have
1
(x +p) = (x )+grad (x )T p+ pT 2 (x )p+O(kpk3 ) (x ).
2
For very small kpk we have
(x + p) = (x ) + grad (x )T p + arbitrary small terms.
From this we have grad (x ) = 0, i.e., x is a critical point.
Otherwise: If grad (x ) 6= 0 choose p = grad (x ) to get
(x + p) < (x ).
NumCSE, Lecture 15, Nov 6, 2014
8/40
9/40
10/40
11/40
12/40
13/40
Example 9.5.
Consider minimizing
1
(x) =
2
2
2
2 !
3
9
21
2
3
x1 (1 x2 ) +
x1 (1 x2 ) +
x1 (1 x2 )
.
2
4
8
14/40
x = (0, 1)T .
Found starting at
(8, .8)T .
Look at the Hessian
matrix at
x.
15/40
kxk x k
0
1
2
3
4
5
6
7
8
5.01e+00
8.66e-01
6.49e-02
1.39e-01
2.10e-02
1.38e-03
3.03e-06
2.84e-11
|(xk )
(x )|
4.09e+01
1.21e+00
1.20e-02
1.72e-03
6.91e-05
1.43e-07
1.09e-12
6.16e-23
fkT pk
kxk
xk
7.3e+01
2.3e+00
2.3e-02
3.2e-03
1.4e-04
2.9e-07
2.2e-12
8.00e+00
6.95e+00
7.21e+00
1.36e+01
3.16e+00
4.44e-01
6.90e-02
2.43e-04
1.50e-09
|(xk )
(
x)|
6.08e+00
6.97e+00
6.99e+00
2.80e+02
1.90e+01
8.78e-01
2.12e-03
1.42e-08
0.00e+00
fkT pk
1.7e+00
2.9e-02
3.2e-01
4.7e+02
3.2e+01
1.9e+00
4.2e-03
2.8e-08
5.4e-19
fk = grad (xk ).
16/40
Descent direction
I
17/40
Descent methods
I
How to choose Bk ?
18/40
Gradient descent
Gradient descent: Bk = I
pk = grad (xk )
+ Garanteed reduction of .
+ No linear system to be solved.
We know from system solving: slow convergence.
19/40
As expensive as Newton
20/40
Line search
Search for an k such that (xk+1 ) = (xk + k pk ) < (xk ).
Simple backtracking algorithm:
Set k = 1.
while (xk + k pk ) > (xk ) do
Set k = k /2;
end while
More sophisticated approach:
Starting with
= 1, determine quadratic polynomial () such
that (0) = (xk ), 0 (0) = pT
k grad (xk ), and
(
) = ((xk +
pk ) where
is current unsatisfactory step size.
Minimize () to get k .
NumCSE, Lecture 15, Nov 6, 2014
21/40
Example 9.6.
With (x) = x14 + x1 x2 + (1 + x2 )2 we get
4x13 + x2
12x12 1
2
grad (x) =
,
(x) =
.
x1 + 2(1 + x2 )
1
2
has unique minimum at x (0.6959, 1.3479)T , where
(x ) 0.5824.
I
22/40
1
0.147
0.386
0.479
1
1
1
kxk x k
1.79e+00
1.88e+00
8.89e-01
3.37e-01
2.85e-03
1.14e-05
1.83e-10
|(xk ) (x )|
2.27e+00
2.09e+00
6.43e-01
1.31e-01
1.74e-05
2.76e-10
2.22e-16
fkT pk
1.38e+00
4.15e+00
1.63e+00
2.62e-01
3.46e-05
5.51e-10
1.42e-19
23/40
Inexact Newton
I
24/40
Quasi-Newton methods
What to do when Hessian 2 (x) is not available and numerical
differentiation is too expensive?
I
Thus, require
Bk+1 wk = yk ,
25/40
f 0 (xk+1 ) f 0 (xk )
xk+1 xk
(difference quotient)
26/40
wk = xk+1 xk ,
27/40
28/40
10
10
Small 2 2 example.
10
Normen
10
10
10
10
Broyden: ||F(x(k))||
Broyden: Fehlernorm
Newton: ||F(x(k))||
Newton: Fehlernorm
Newton (vereinfacht)
12
10
14
10
5
6
Iterationsschritt
10
11
29/40
30/40
g : Rn 7 Rm , m > n, g C 2 .
1X
1
(x) := kg(x) bk2 =
[gi (x) bi ]2 .
2
2
i=1
31/40
(x) X
gi (x)
=
[gi (x) bi ]
= 0,
xj
xj
j = 1, . . . , n.
i=1
g
x )
32/40
m
X
2 gk
(gk bk )
xi xj
k=1
33/40
dk = b g(xk ),
34/40
35/40
Example 9.8:
We generate data using the function u(t) = e 2t cos(20t), see the
solid blue curve in the next figure. At the 51 points ti = .02(i 1),
i = 1, . . . , 51, we add 20% random noise to u(ti ) to generate data
bi (green circles in figure).
We now pretend that we have never seen the blue curve and
attempt to fit to this b a function of the form
v (t) = x1 e x2 t cos(x3 t).
In the above notation we have m = 51, n = 3, and the ith rows of
g and A are
gi = x1 e x2 ti cos(x3 ti ),
ai,2 = ti gi ,
ai,1 = e x2 ti cos(x3 ti ),
ai,3 = ti x1 e x2 ti sin(x3 ti ),
1 i m.
36/40
37/40
38/40
39/40
Simplified Newton
Quasi-Newton
Inexact Newton: iterative solution of Newton step
(preconditioner?)
40/40
2/41
Topics of today
I
What is interpolation
Monomial basis
Lagrange basis
References
I
3/41
Interpolating data
We are given a collection of data samples {(xi , yi )}ni=0
I
The {xi }ni=0 are called the abscissae, the {yi }ni=0 are called
the data values.
v (xi ) = yi ,
i = 0, 1, . . . , n.
4/41
5/41
Interpolating functions
A function f (x) may be given explicitly or implicitly. Want
interpolant v (x) such that
v (xi ) = f (xi ),
i = 0, 1, . . . , n.
6/41
7/41
Tabulated numbers
Stock performance
8/41
Interpolation formulation
Assume a linear form of interpolation
    v(x) = Sum_{j=0}^n cj phi_j(x).
9/41
The interpolation conditions v(xi) = yi give the linear system
    [phi_0(x0) phi_1(x0) . . . phi_n(x0)] [c0]   [y0]
    [phi_0(x1) phi_1(x1) . . . phi_n(x1)] [c1] = [y1]
    [   ...                             ] [..]   [..]
    [phi_0(xn) phi_1(xn) . . . phi_n(xn)] [cn]   [yn]
10/41
    A = [1 2  4   8
         1 6 36 216
         1 4 16  64
         1 7 49 343],    y = (14, 24, 25, 15)^T.
11/41
12/41
Example: tabulated values of f(x) = Int e^{sin t} dt:
    x:    0.4    0.5    0.6    0.7    0.8    0.9
    f(x): 0.4904 0.6449 0.8136 0.9967 1.1944 1.4063
13/41
We want values between tabulated abscissae, a <= x <= b; an
interpolant built from this table gives f(0.66) ~ 0.9216.
14/41
Polynomial interpolation
I
i = 0, . . . , n.
()
15/41
    [1 x0 . . . x0^{n-1} x0^n] [c0    ]   [y0]
    [1 x1 . . . x1^{n-1} x1^n] [ ..   ]   [y1]
    [   ...                  ] [c_{n-1}] = [..]      (1)
    [1 xn . . . xn^{n-1} xn^n] [cn    ]   [yn]
     \__________ V _________/  \_ c _/    \y/
16/41
The determinant of V is
    det(V) = Prod_{i>j} (xi - xj),
which is nonzero iff the abscissae are distinct.
I
17/41
Uniqueness of interpolation
The polynomial pn(x) in (*) is the only polynomial of degree <= n
that interpolates f at the n + 1 distinct points x0, . . . , xn.
Proof.
If qn(x) in Pn were another such polynomial, then
    pn(xk) = qn(xk) = yk,   k = 0, 1, . . . , n,
so pn - qn in Pn has n + 1 distinct zeros and vanishes identically.
18/41
Polynomials in Matlab
Matlab displays polynomials as row vectors containing the
coefficients ordered by descending powers.
p(x) = c0 + c1 x + . . . + cn x n
c = [cn , . . . , c1 , c0 ]
19/41
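For example, using the data of the earlier 4-point example (standard built-ins polyfit/polyval):

x = [2 6 4 7];  y = [14 24 25 15];
c = polyfit(x, y, 3);     % coefficients, highest power first
v = polyval(c, 5)         % evaluate the interpolant at x = 5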
20/41
21/41
Lagrange interpolation
Lagrange polynomials are polynomials of degree n:
    Li(x) = Prod_{j=0, j/=i}^n (x - xj)/(xi - xj),   i = 0, 1, . . . , n.
They satisfy
    Li(xk) = delta_ik = { 1, if i = k;  0, if i /= k }.
Therefore,
    pn(x) = Sum_{j=0}^n f(xj) Lj(x).      (2)
22/41
Lagrange polynomials of
degree 4, with abscissae at
x = 2, 4, 6, 7, and value 1 at
one of these x.
23/41
Define
    lambda_j = Prod_{i/=j} (xj - xi),   wj = 1/lambda_j,   j = 0, . . . , n,
and
    omega(x) = Prod_{i=0}^n (x - xi).
Then,
    p(x) = omega(x) Sum_{j=0}^n wj yj / (x - xj),   yj = f(xj).
24/41
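A minimal Matlab sketch of barycentric evaluation (the function name and the node-hit safeguard are my own):

function p = baryeval(x, y, z)
% BARYEVAL evaluates the interpolating polynomial through (x_i, y_i)
% at the points z using the barycentric formula.
n = length(x);
w = ones(1,n);
for j = 1:n                 % weights w_j = 1/prod_{i~=j}(x_j - x_i)
    w(j) = 1/prod(x(j) - x([1:j-1, j+1:n]));
end
p = zeros(size(z));
for k = 1:length(z)
    d = z(k) - x;
    [m, i] = min(abs(d));
    if m == 0
        p(k) = y(i);        % z(k) is a node: return the data value
    else
        p(k) = sum(w.*y./d) / sum(w./d);
    end
end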
Interpolating the constant 1 gives 1 = omega(x) Sum_{j=0}^n wj/(x - xj),
therefore omega(x) = 1 / Sum_{j=0}^n wj/(x - xj) and
    p(x) = [ Sum_{j=0}^n wj yj/(x - xj) ] / [ Sum_{j=0}^n wj/(x - xj) ].
25/41
Compute the barycentric weights
    wj = 1 / Prod_{i/=j} (xj - xi),   j = 0, 1, . . . , n,
then evaluate p(x) by the barycentric formula above.
26/41
27/41
Newton interpolation
For Newton interpolation we represent the interpolation polynomial
in a different way:
pn (x) := c0 + c1 (x x0 ) + c2 (x x0 )(x x1 ) +
+ cn (x x0 )(x x1 ) (x xn1 ).
All terms but the first vanish at x0:
    pn(x0) = c0  =>  c0 = y0.
At x1:
    c1 = (y1 - y0)/(x1 - x0).
28/41
At x2:
    c2 = [ (y2 - y0)/(x2 - x0) - (y1 - y0)/(x1 - x0) ] / (x2 - x1).
29/41
This gives c1 = 2 and c2 = -2/3, so
    p2(x) = 1 + 2(x - 1) - (2/3)(x - 1)(x - 2).
30/41
With the Newton basis pi_0(x) = 1, pi_1(x) = x - 2,
pi_2(x) = (x - 2)(x - 6), pi_3(x) = (x - 2)(x - 6)(x - 4), the
interpolation conditions at x = 2, 6, 4, 7 give a lower triangular matrix
    A = [1 0  0  0
         1 4  0  0
         1 2 -4  0
         1 5  5 15].
31/41
Newton interpolation
Zeroth divided difference:
    f[xi] := f(xi).
First divided difference:
    f[x0, x1] := (f[x1] - f[x0])/(x1 - x0) = (f(x1) - f(x0))/(x1 - x0).
k-th divided difference:
    f[xi, . . . , xi+k] := (f[xi+1, . . . , xi+k] - f[xi, . . . , xi+k-1]) / (xi+k - xi),
where k = 2, 3, . . . , n and i = 0, 1, . . . , n - k.
32/41
    f[x0, x1] = (f(x1) - f(x0))/(x1 - x0),
    f[x0, x1, x2] = ( (f(x2)-f(x1))/(x2-x1) - (f(x1)-f(x0))/(x1-x0) ) / (x2 - x0),
    f[x0, x1, x2, x3] = (f[x1, x2, x3] - f[x0, x1, x2]) / (x3 - x0).
33/41
Divided difference tableau, case n = 3:
    x0  f[x0]
                f[x0,x1]
    x1  f[x1]             f[x0,x1,x2] = c2
                f[x1,x2]                f[x0,x1,x2,x3] = c3
    x2  f[x2]             f[x1,x2,x3]
                f[x2,x3]
    x3  f[x3]
34/41
Matlab code
function [c,D] = coeffnewton0(x,y)
% COEFFNEWTON computes the divided differences needed for
%
constructing the interpolating polynomial
%
through (x_i,y_i)
n = length(x)-1;
% degree of interpolating polynomial
for i=1:n+1
% divided differences
D(i,1) = y(i);
for j=1:i-1
D(i,j+1) = (D(i,j)-D(i-1,j))/(x(i)-x(i-j));
end
end
c=diag(D);
35/41
36/41
i = n - 1, n - 2, . . . , 0.
37/41
Matlab code
function y = intnewton(x,c,z)
% INTNEWTON evaluates the Newton interpolating polynomial
%
at the new points z: y = P_n(z) using the
%
Horner form and the diagonal c of the
%
divided difference scheme.
n = length(x)-1;
y = c(n+1);
for i= n:-1:1
y = y.*(z - x(i)) + c(i);
end
38/41
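Putting the two routines together (the data comes from the example on the next slide):

x = [1.4 1.5 1.6 1.7];
y = log(x);                    % table of natural logarithms
[c, D] = coeffnewton0(x, y);   % divided differences
v = intnewton(x, c, 1.57)      % approximates log(1.57)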
Matlab example
We want to compute log(1.57) = 0.451075619360217. We have a
table with 4 values of the natural logarithm:
    x     y
    1.4   0.336472236621213
    1.5   0.405465108108164
    1.6   0.470003629245736
    1.7   0.530628251062170
Interpolating with increasing degree:
    p(1.57)            error
    0.451109779691149  3.416e-05
    0.451053032333184  2.259e-05
    0.451077622854969  2.003e-06
39/41
Evaluate
    pn(x) = Sum_{j=0}^n cj Prod_{i=0}^{j-1} (x - xi)
by Horner's rule.
40/41
Algorithms comparison
Basis     phi_j(x)                  construction  evaluation  selling feature
                                    cost          cost
Monomial  x^j                       (2/3) n^3     2n          simple
Lagrange  Lj(x)                     n^2           5n          cj = yj, most stable
Newton    Prod_{i=0}^{j-1}(x - xi)  (3/2) n^2     2n          adaptive
41/41
Topics of today
I
Chebyshev points
Interpolating derivatives
References
I
2/25
Polynomial interpolation
I
i = 0, . . . , n.
(+)
3/25
Newton interpolation
For Newton interpolation we represent the interpolation polynomial
in a different way:
pn (x) := c0 + c1 (x x0 ) + c2 (x x0 )(x x1 ) +
+ cn (x x0 )(x x1 ) (x xn1 ).
The coefficient cj is determined by the j-th divided differences
cj = f [x0 , x1 , . . . , xj ],
j = 0, . . . , n.
4/25
i = 0, . . . , n.
where k = 1, 2, . . . , n and i = 0, 1, . . . , n k.
5/25
Divided difference tableau, case n = 3:
    x0  f[x0]
                f[x0,x1]
    x1  f[x1]             f[x0,x1,x2] = c2
                f[x1,x2]                f[x0,x1,x2,x3] = c3
    x2  f[x2]             f[x1,x2,x3]
                f[x2,x3]
    x3  f[x3]
6/25
Error expression
The interpolation error involves the next divided difference:
    en(x) = f(x) - pn(x) = f[x0, . . . , xn, x] Prod_{j=0}^n (x - xj).
7/25
There exists xi in the interval spanned by x0, . . . , xn and x such that
    f[x0, x1, . . . , xn, x] = f^(n+1)(xi) / (n + 1)!.
8/25
    f(x) - pn(x) = f^(n+1)(xi)/(n + 1)! * Prod_{j=0}^n (x - xj).
By consequence
    |en(x)| <= max_{a<=t<=b} |f^(n+1)(t)| / (n + 1)! * max_{a<=s<=b} Prod_{j=0}^n |s - xj|,
    ||en||_inf <= ||f^(n+1)||_inf (b - a)^{n+1} / (n + 1)!.
9/25
Example
Consider {(x0, y0), (x1, y1)}. So n = 1, hence n + 1 = 2.
    ||e1||_inf <= (1/2) ||f''||_inf max_{a<=s<=b} |(s - x0)(s - x1)|.
The maximum is attained at (x0 + x1)/2. Thus,
    ||e1||_inf <= (1/8) (x1 - x0)^2 ||f''||_inf.
10/25
Data table:
    t: 3    6    9    12   15   18   21   24   27
    y: 31.2 32.0 35.3 34.1 35.0 35.5 34.1 35.1 36.0
11/25
We do in general
not know the
n + 1-st derivative
of f .
Maybe we can
choose the
interpolation
points xi ?
Compare
equidistant with
Chebyshev points.
Here n = 8.
12/25
    |en(x)| <= max_{a<=t<=b} |f^(n+1)(t)| / (n + 1)! * max_{a<=s<=b} Prod_{j=0}^n |s - xj|.
13/25
Chebyshev points
Chebyshev points minimize
    min max_{-1<=x<=1} Prod_{j=0}^n |x - xj|
over all choices of nodes; the resulting error bound is
    ||f - pn||_inf <= 1/(2^n (n + 1)!) max_{-1<=t<=1} |f^(n+1)(t)|.
For a general interval [a, b], scale and translate [-1, 1] onto [a, b]:
    x = a + (b - a)(t + 1)/2,   t in [-1, 1].
14/25
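A short Matlab experiment contrasting equidistant and Chebyshev nodes on the Runge function (the node formula is standard; polyfit is used only for brevity):

n = 8; a = -1; b = 1;
t = cos((2*(0:n)+1)*pi/(2*n+2));        % Chebyshev points on [-1,1]
xc = a + (b-a)*(t+1)/2;                 % mapped to [a,b]
xe = linspace(a, b, n+1);               % equidistant points
f = @(x) 1./(1+25*x.^2);
z = linspace(a, b, 200);
pc = polyval(polyfit(xc, f(xc), n), z); % Chebyshev interpolant
pe = polyval(polyfit(xe, f(xe), n), z); % equidistant interpolant
[max(abs(pc - f(z))), max(abs(pe - f(z)))]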
15/25
16/25
17/25
0 < x < 1.
18/25
Interpolating derivatives
We now admit also derivatives at the interpolation points.
Example: find the quadratic p with
    p(0) = f(0) = 1.5,   p'(0) = f'(0) = 1,   p(20) = f(20) = 0.
Writing p(x) = 1.5 + x + c x^2 and imposing p(20) = 0 gives
c = -(1.5 + 20)/400 = -21.5/400, hence
    p(x) = 1.5 + x - (21.5/400) x^2.
19/25
Hermite interpolation
I
Repeated nodes are handled via the limit
    f[xk, xk] := lim_{h->0} f[xk, xk + h] = f'(xk).
20/25
Find P in P_{2n+1} with
    P_{2n+1}(xi) = yi,   P'_{2n+1}(xi) = yi',   i = 0, . . . , n.
Divided difference tableau with doubled nodes (n = 1):
    x0  f[x0] = y0 = c0
                f[x0,x0] := y0' = c1
    x0  f[x0]              f[x0,x0,x1] = c2
                f[x0,x1]                 f[x0,x0,x1,x1] = c3
    x1  f[x1]              f[x0,x1,x1]
                f[x1,x1] := y1'
    x1  f[x1] = y1
21/25
22/25
Example with derivative data: the divided-difference scheme yields
    c0 = 0,  c1 = 5,  c2 = 7.6066,  c3 = 11.4620
(intermediate entries 2.6066, 1, 1.2487, 3.8554).
23/25
    n + 1:         2     3      4       5
    Hermite error: 1.8   0.077  0.0024  0.000043
    2n + 2:        4     6      8       10
    Newton error:  0.40  0.038  0.0018  0.000044
24/25
Data:
    f(ti):   17.564921,  18.505155
    f'(ti):   3.116256,   3.151762
    f''(ti):  0.120482
Divided-difference columns on the repeated nodes:
    f[.]:         17.564921, 17.564921, 17.564921, 18.505155, 18.505155
    f[.,.]:        3.116256,  3.116256,  3.130780,  3.151762
    f[.,.,.]:      0.060241,  0.048413,  0.069400
    f[.,.,.,.]:   -0.039426,  0.071756
    f[.,.,.,.,.]:  0.370604
25/25
Topics of today
I
Spline interpolation
Parametric interpolation
References
I
2/40
Recall the error bound ||f - pn||_inf <= ||f^(n+1)||_inf (b-a)^{n+1}/(n+1)!
at the nodes xi, i = 0, 1, . . . , n: the factor (b-a)^{n+1}/(n+1)! is
harmless, but ||f^(n+1)||_inf / (n + 1)! is not.
High order polynomials tend to oscillate unreasonably.
3/40
Construct interpolant
    v(x) = si(x),   ti <= x <= ti+1,   i = 0, . . . , r - 1.
4/40
5/40
ti and xi
I
6/40
7/40
8/40
9/40
Piecewise linear ("hat") basis functions:
    phi_i(x) = (x - xi-1)/(xi - xi-1),  xi-1 < x <= xi,
    phi_i(x) = (x - xi+1)/(xi - xi+1),  xi < x < xi+1,
    phi_i(x) = 0,                       otherwise.
The broken-line interpolant is
    v(x) = Sum_{i=0}^n f(xi) phi_i(x).
10/40
Then, on each subinterval (the general bound with m = 1),
    |e(x)| <= max_{xi<=t<=xi+1} |f''(t)|/2 * max_{xi<=s<=xi+1} |(s - xi)(s - xi+1)|
           <= max_{xi<=t<=xi+1} |f''(t)|/2 * |xi+1 - xi|^2 / 4
           <= (h^2/8) ||f''||_inf.
12/40
i = 0, . . . , r 1.
13/40
Piecewise cubic Hermite: on each [ti, ti+1] require
    si(ti) = f(ti),  si(ti+1) = f(ti+1),  si'(ti) = f'(ti),  si'(ti+1) = f'(ti+1).
The error is bounded by
    (h^4/384) max_{a<=s<=b} |f''''(s)|.
14/40
15/40
On [xi, xi+1] require
    si(xi) = yi,  si(xi+1) = yi+1,  si'(xi) = yi',  si'(xi+1) = y'_{i+1}
(interpolate; continuous derivative).
16/40
Use the local variable t = (x - xi)/hi with hi := xi+1 - xi; then
    qi(0) = yi,  qi(1) = yi+1,  qi'(0) = hi yi',  qi'(1) = hi y'_{i+1}.
17/40
    qi(t) = yi H1(t) + yi+1 H2(t) + hi yi' H3(t) + hi y'_{i+1} H4(t)    (1)
with
    H1(t) = 1 - 3t^2 + 2t^3,   H2(t) = 3t^2 - 2t^3,
    H3(t) = t - 2t^2 + t^3,    H4(t) = -t^2 + t^3.
18/40
(An equivalent representation in the x-variable uses scaled versions of
these basis functions with hi := ti+1 - ti.)
19/40
Newton form on the doubled nodes:
    c0 = yi,
    c1 = [yi, yi] = hi yi',
    c2 = [yi, yi, yi+1],
    c3 = [yi, yi, yi+1, yi+1],   with [yi+1, yi+1] = hi y'_{i+1}.
20/40
21/40
22/40
Given only function values {xi, f(xi)}, x0 < x1 < . . . < xn, require
    si(xi) = f(xi),  si(xi+1) = f(xi+1),               i = 0, . . . , n-1,
    si'(xi+1) = s'_{i+1}(xi+1),  si''(xi+1) = s''_{i+1}(xi+1),   i = 0, . . . , n-2.
23/40
24/40
Construction of C 2 spline
We have v (x) = si (x) for xi x xi+1 , i = 0, . . . , n 1
si (x) = ai + bi (x xi ) + ci (x xi )2 + di (x xi )3
si0 (x) = bi + 2ci (x xi ) + 3di (x xi )2
si00 (x) = 2ci + 6di (x xi )
Clearly,
    ai = f(xi),   i = 0, . . . , n - 1,      (2)
and interpolation at xi+1 gives
    ai + bi hi + ci hi^2 + di hi^3 = f(xi+1),   hi = xi+1 - xi.      (3)
25/40
Continuity of v', i.e. si'(xi+1) = s'_{i+1}(xi+1):
    bi + 2 ci hi + 3 di hi^2 = bi+1,   i = 0, . . . , n - 2.      (4)
Continuity of v'', i.e. si''(xi+1) = s''_{i+1}(xi+1):
    ci + 3 hi di = ci+1,   i = 0, . . . , n - 2.      (5)
Eliminating,
    di = (ci+1 - ci) / (3 hi),
    bi = f[xi, xi+1] - (hi/3)(2 ci + ci+1),   i = 0, . . . , n - 2.
26/40
i = 1, . . . , n 1.
27/40
Inserting into (4) yields the tridiagonal equations
    h_{i-1} c_{i-1} + 2(h_{i-1} + h_i) c_i + h_i c_{i+1}
        = 3 (f[x_i, x_{i+1}] - f[x_{i-1}, x_i]),   i = 1, . . . , n - 1,
i.e. a tridiagonal matrix A with diagonal entries 2(h_{i-1} + h_i) and
off-diagonal entries h_i.
28/40
Boundary conditions
1. Natural boundary conditions:
    s0''(x0) = s''_{n-1}(xn) = 0   <=>   c0 = cn = 0.
29/40
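In Matlab a cubic spline can be built with the built-in spline (not-a-knot) or, for other boundary conditions, with csape/fnval from the Curve Fitting Toolbox; a minimal sketch:

x = 0:9;  y = sin(x);             % some data
v  = spline(x, y, 4.5);           % not-a-knot spline, evaluated at 4.5
pp = csape(x, y, 'variational');  % natural spline (c0 = cn = 0)
v2 = fnval(pp, 4.5);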
Clamped plate
2. Clamped plate
Derivatives of f are prescribed:
b0 = f 0 (x0 ),
bn = f 0 (xn ).
30/40
Not-a-knot condition
3. Not-a-knot condition of de Boor: the two polynomials in the
first/last two intervals shall be the same:
    s0(x) == s1(x)  and  s_{n-2}(x) == s_{n-1}(x),
equivalently s0'''(x1) = s1'''(x1) and s'''_{n-2}(x_{n-1}) = s'''_{n-1}(x_{n-1}), or
    d0 = d1,   d_{n-2} = d_{n-1}.
31/40
Not-a-knot cubic spline interpolation for the Runge example at 20
equidistant points:
    pp = csape(x,y,'not-a-knot');
    fnplt(pp)
32/40
Interpolant              Local?     Smooth?   Selling features
Piecewise constant       yes        bounded   Accommodates general f
Broken line              yes        C0        Simple, max and min at data values
Piecewise cubic Hermite  yes        C1        Elegant and accurate
Spline (not-a-knot)      not quite  C2        Accurate, smooth, requires only f data
33/40
Shape preservation
Interpolation that preserves monotonicity and shape of the data.
Matlab function:
    v = pchip(t,y,x);
t: sampling points, y: sampling values, x: evaluation points xi,
v: vector s(xi).
34/40
Parametric curves
We would like to generate a curve through the n given points in
the plane (xi , yi ), i = 0, 1, . . . , n.
Note: The numbering of the points is crucial; reordering them will
give us another curve.
Plane curves are represented by parametric functions
(x(s), y (s))
with
s0 s sn .
We can interpret the given points as function values for some (yet
to be determined) parametrization
x(si ) = xi ,
y (si ) = yi
i = 0, 1, . . . , n.
35/40
A common choice is the chordal parametrization
    s0 = 0,   s_{i+1} = s_i + sqrt((x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2),
    i = 0, . . . , n - 1,
and one interpolates the two tables
    s: s1 s2 . . . sn   vs.   x: x1 x2 . . . xn,
    s: s1 s2 . . . sn   vs.   y: y1 y2 . . . yn.
36/40
Example
For the points
x
y
37/40
38/40
39/40
Introduction
Computer Aided Geometric Design (CAGD) applications make
extensive use of parametric curve fitting.
These applications require
I
2/39
Topics today
I
Bezier polynomials
Bezier curves
References
I
3/39
Bernstein polynomials
From the binomial theorem we have for any n
    1 = ((1 - t) + t)^n = Sum_{i=0}^n C(n,i) (1 - t)^{n-i} t^i.
The Bernstein polynomials Bi,n(t) = C(n,i) (1 - t)^{n-i} t^i therefore satisfy
    Sum_{i=0}^n Bi,n(t) = 1.
Examples: B1,1(t) = t,  B2,2(t) = t^2,  B3,3(t) = t^3.
5/39
6/39
Substitute t = (u - a)/(b - a) to get
    Bi,n(u; a, b) := Bi,n((u - a)/(b - a))
                   = C(n,i) (b - u)^{n-i} (u - a)^i / (b - a)^n.
7/39
8/39
()
9/39
    t^k = t^k ((1 - t) + t)^{n-k}
        = Sum_{j=0}^{n-k} C(n-k, j) (1 - t)^{n-k-j} t^{j+k}
        = Sum_{r=k}^{n} [ C(n-k, r-k) / C(n, r) ] B_{r,n}(t).
In particular, t = Sum_{r=0}^n (r/n) B_{r,n}(t).
10/39
Definition: a Bezier polynomial is
    p(t) = Sum_{i=0}^n beta_i Bi,n(t).    (+)
11/39
Example
12/39
Example (cont.)
Theorem
I
In fact:
    p'(t) = Sum_{i=0}^n beta_i C(n,i) d/dt [(1 - t)^{n-i} t^i].
13/39
de Casteljau algorithm
The de Casteljau algorithm computes the value of the Bezier
polynomial. It exploits the recurrences (*).
Given the Bezier polynomial p(t) = Sum_{i=0}^n beta_i Bi,n(t):
    p(t) = Sum_{i=0}^n beta_i Bi,n(t)
         = beta_0 (1-t) B0,n-1(t) + Sum_{i=1}^{n-1} [. . .] + beta_n t Bn-1,n-1(t)
         = Sum_{i=0}^{n-1} [beta_i (1-t) + beta_{i+1} t] B_{i,n-1}(t)
         = Sum_{i=0}^{n-1} beta_i^(1) B_{i,n-1}(t)
         = Sum_{i=0}^{n-2} beta_i^(2) B_{i,n-2}(t) = . . . = beta_0^(n),
with
    beta_i^(k) = beta_i^(k-1) (1 - t) + beta_{i+1}^(k-1) t.
15/39
In coordinates (points Pi = (xi, yi)):
    xi^(1) = (1-t) xi^(0) + t x_{i+1}^(0),   yi^(1) = (1-t) yi^(0) + t y_{i+1}^(0),
    xi^(2) = (1-t) xi^(1) + t x_{i+1}^(1),   yi^(2) = (1-t) yi^(1) + t y_{i+1}^(1),
etc.
16/39
17/39
Bezier curves
Bezier curves are polynomial curves represented in the basis of
Bernstein polynomials:
Pn (t) =
n
X
Bi,n (t)Pi ,
t [0, 1]
i=0
18/39
19/39
    P(t) = Sum_{i=0}^n Bi,n(t) Pi,   t in [0, 1].
Theorem
For the end points of a Bezier curve we have
    P(0) = P0,              P(1) = Pn,
    P'(0) = n(P1 - P0),     P'(1) = n(Pn - Pn-1).
20/39
Example
21/39
Algorithm of de Casteljau
The following Matlab function bezier computes a point on the
Bezier curve with the de Casteljau algorithm.
function [y] = bezier(t,P)
% BEZIER evaluates the Bezier polynomial defined
%
by the control points P for parameter t
%
using the de Casteljau scheme
[m,n] = size(P); n=n-1;
T = zeros(m,n+1,n+1);
for i = 0:n
T(:, i+1,1) = P(:,i+1);
for j = 1:i
T(:,i+1,j+1) = (1-t)*T(:,i,j) + t*T(:,i+1,j);
  end
end
y = T(:,n+1,n+1);
22/39
Example: control points P = [2 3 8 10; 1 8 7 3], t = 1/2:
23/39
T(:,:,2) = [0 2.5000 5.5000 9.0000; 0 4.5000 7.5000 5.0000]
T(:,:,3) = [0 0 4.0000 7.2500; 0 0 6.0000 6.2500]
T(:,:,4) = [0 0 0 5.6250; 0 0 0 6.1250]
so P(1/2) = (5.625, 6.125)^T.
24/39
25/39
    P(t) = Sum_{i=0}^n Bi,n(t) Pi
         = [P0, P1, . . . , Pn] [B0,n(t); B1,n(t); . . . ; Bn,n(t)].
Let Bi,n(t) = Sum_{j=0}^n bij t^{n-j}. Then, with Bn = ((bij)),
    P(t) = [P0, . . . , Pn] Bn (t^n, t^{n-1}, . . . , 1)^T.
26/39
    P3(t) = (x(t), y(t))^T
          = [2 3 8 10; 1 8 7 3] [-1  3 -3 1;  3 -6 3 0; -3 3 0 0; 1 0 0 0]
            (t^3, t^2, t, 1)^T
          = [-7 12  3 2;  5 -24 21 1] (t^3, t^2, t, 1)^T,
i.e.
    x(t) = -7 t^3 + 12 t^2 + 3 t + 2,
    y(t) =  5 t^3 - 24 t^2 + 21 t + 1.
27/39
28/39
A composite Bezier curve uses
    P(u) = Sum_{i=0}^n P_{i,j} Bi,n(u; u_{j-1}, u_j),   u in [u_{j-1}, u_j].
29/39
Since
    d/du Bi,n(u; u_{j-1}, u_j) = (1/h_j) d/dt Bi,n(t),   t = (u - u_{j-1})/h_j,
C^1 continuity at u_j requires the joint P_{n,j} = P_{0,j+1} to satisfy
    P_{n,j} = ( h_{j+1} P_{n-1,j} + h_j P_{1,j+1} ) / ( h_j + h_{j+1} ).
30/39
Example
I
31/39
Example (cont.)
    P(t) = Sum_{i=0}^3 Pi Bi,3(t)
         = (1,0)^T (1-t)^3 + (1,alpha)^T 3(1-t)^2 t
           + (alpha,1)^T 3(1-t) t^2 + (0,1)^T t^3.
How to choose alpha? Let's try to have the Bezier curve meet the
circle at t = 1/2:
    P(1/2) = (1/8) [ (1,0)^T + 3(1,alpha)^T + 3(alpha,1)^T + (0,1)^T ]
           = (sqrt(1/2), sqrt(1/2))^T
    =>  3 alpha + 4 = 8 sqrt(1/2)
    =>  alpha = (8 sqrt(1/2) - 4)/3 ~ 0.552285.
32/39
Profile of an airfoil.
Most pictures today from Schwarz/Koeckler: Numerische Mathematik,
Springer.
33/39
34/39
Bezier surfaces
Similarly as curves it is possible to define surfaces in 3D:
Let Pij, i = 0, 1, . . . , n; j = 0, 1, . . . , m be 3-dimensional vectors.
We define a surface by
    x(s, t) := Sum_{i=0}^n Sum_{j=0}^m Bi,n(s) Bj,m(t) Pij,   s, t in [0, 1].
35/39
Tensor splines
Let (xi, yj), i = 0, 1, . . . , n; j = 0, 1, . . . , m be a regular grid.
Let Bi(x), Bj(y) be the piecewise linear functions that satisfy
    Bi(xk) = delta_ik,   Bj(y_l) = delta_jl,
and interpolate with v(x, y) = Sum_i Sum_j f(xi, yj) Bi(x) Bj(y).
Inside a rectangle [xi, xi+1] x [yj, yj+1] this function is bilinear, i.e.
of the form
    alpha + beta x + gamma y + delta xy.
The 4 values alpha, beta, gamma, delta are determined by the 4 values
of f at the corners of the rectangle.
36/39
Piecewise
bilinear basis
function.
This is the
way
Matlab
works.
37/39
Bicubic interpolation additionally takes derivative values such as
    df/dy (xi, yj)   and   d^2 f/(dx dy) (xi, yj)
into account.
38/39
Piecewise
bicubic basis
function.
39/39
2/43
Topics of today
1. Best least squares approximation
2. Orthogonal basis functions
3. Trigonometric basis function
4. Weighted least squares
5. Chebyshev polynomials
References
I
3/43
n
X
cj j (x),
j=0
4/43
5/43
    ||g|| >= 0;  ||g|| = 0 iff g(x) == 0,
    ||alpha g|| = |alpha| ||g|| for all scalars alpha,
    ||f + g|| <= ||f|| + ||g||.
Examples:
    L2:    ||g||_2 = ( Int_a^b g(x)^2 dx )^{1/2}   (least squares)
    L1:    ||g||_1 = Int_a^b |g(x)| dx
    Linf:  ||g||_inf = max_{a<=x<=b} |g(x)|   (maximum norm)
6/43
Two functions f and g are orthogonal on [a, b] if
    Int_a^b f(x) g(x) dx = 0.
7/43
8/43
Find v(x) = Sum_{j=0}^n cj phi_j(x) minimizing
    min_c Int_a^b ( f(x) - Sum_{j=0}^n cj phi_j(x) )^2 dx = ||f - v||_2^2.
9/43
Setting the partial derivatives to zero:
    Int_a^b ( f(x) - Sum_j cj phi_j(x) ) phi_k(x) dx = 0,   k = 0, . . . , n,
or
    Sum_{j=0}^n cj Int_a^b phi_j(x) phi_k(x) dx = Int_a^b f(x) phi_k(x) dx,
    k = 0, . . . , n,
i.e. B c = b with Bjk = Int phi_j phi_k and bk = Int f phi_k.
10/43
1. Calculate
    Bjk = Int_a^b phi_j(x) phi_k(x) dx,   j, k = 0, 1, . . . , n,
    bj  = Int_a^b f(x) phi_j(x) dx,       j = 0, 1, . . . , n.
2. Solve B c = b for the coefficients c = (c0, c1, . . . , cn)^T.
11/43
For monomials on [0, 1]:
    Bjk = Int_0^1 x^{j+k} dx = 1/(j + k + 1),   j, k = 0, 1, . . . , n
(the notoriously ill-conditioned Hilbert matrix), and the integrals
    bj = Int_0^1 f(x) x^j dx.
12/43
13/43
a
I
14/43
For an orthogonal basis, B is diagonal with
    Bjj = Int_a^b phi_j(x)^2 dx,   j = 0, 1, . . . , n,
and cj = bj / Bjj, j = 0, 1, . . . , n.
15/43
Legendre polynomials
Defined on the interval [-1, 1] by
    phi_0(x) = 1,   phi_1(x) = x,
    phi_{j+1}(x) = ((2j + 1)/(j + 1)) x phi_j(x) - (j/(j + 1)) phi_{j-1}(x),
    j >= 1.
So,
    phi_2(x) = (3 x^2 - 1)/2,
    phi_3(x) = (5/3) x (3x^2 - 1)/2 - (2/3) x = (5 x^3 - 3 x)/2.
16/43
Orthogonality:
    Int_{-1}^1 phi_j(x) phi_k(x) dx = 0,            j /= k,
                                    = 2/(2j + 1),   j = k.
17/43
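The three-term recurrence translates directly into Matlab (a minimal sketch; the function name is my own):

function p = legendre_eval(j, x)
% LEGENDRE_EVAL evaluates the Legendre polynomial phi_j at x
% via the three-term recurrence above.
pm1 = ones(size(x));              % phi_0
if j == 0, p = pm1; return; end
p = x;                            % phi_1
for k = 1:j-1
    pnew = ((2*k+1)*x.*p - k*pm1)/(k+1);
    pm1 = p;  p = pnew;
end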
18/43
19/43
20/43
Trigonometric polynomials
A family of orthogonal basis functions:
    phi_0(x) = 1/2,
    phi_{2l}(x) = cos(l x),   phi_{2l-1}(x) = sin(l x),   l = 1, 2, . . .
21/43
Fourier Analysis
For 2pi-time-periodic real-valued functions f one often considers a
frequency or Fourier decomposition:
    f(t) ~ a0/2 + Sum_{k=1}^inf ak cos(kt) + Sum_{k=1}^inf bk sin(kt),    (1)
with coefficients
    ak = (1/pi) Int_{-pi}^{pi} f(t) cos(kt) dt,   k >= 0,
    bk = (1/pi) Int_{-pi}^{pi} f(t) sin(kt) dt,   k >= 1.
Does the series converge?
22/43
The Fourier series converges to f(x0) if f is continuous at x0,
and to (f0- + f0+)/2 if f is discontinuous at x0. Here,
    f0-+ = lim_{h->+0} f(x0 -+ h).
23/43
Example 1: f(x) = |x|,  -pi < x < pi.
    a0 = (1/pi) Int_{-pi}^{pi} |x| dx = (2/pi) Int_0^pi x dx = pi,
    ak = (2/pi) Int_0^pi x cos(kx) dx
       = (2/pi) [ (x/k) sin(kx) |_0^pi - (1/k) Int_0^pi sin(kx) dx ]
       = (2/(pi k^2)) cos(kx) |_0^pi = (2/(pi k^2)) [(-1)^k - 1],   k > 0.
24/43
Hence
    g(x) = pi/2 - (4/pi) [ cos(x)/1^2 + cos(3x)/3^2 + cos(5x)/5^2 + . . . ].
25/43
Example 2: f (x) = x 2
The 2pi-periodic function f(x) is defined in (0, 2pi) as
    f(x) = x^2,   0 < x < 2pi.
One finds
    a0 = 8 pi^2 / 3,   ak = 4/k^2,   bk = -4 pi / k,
such that
    g(x) = 4 pi^2 / 3 + Sum_{k=1}^inf [ (4/k^2) cos(kx) - (4 pi/k) sin(kx) ].
27/43
28/43
as to minimize
Z
a
I
j 6= k.
29/43
and
bj =
30/43
Orthogonal polynomials satisfy a three-term recurrence
    phi_j(x) = (x - alpha_j) phi_{j-1}(x) - beta_j phi_{j-2}(x),   j >= 2,
where
    alpha_j = Int_a^b x w(x) [phi_{j-1}(x)]^2 dx / Int_a^b w(x) [phi_{j-1}(x)]^2 dx,
    j >= 1,
and beta_j is defined analogously for j >= 2.
31/43
6= 0.
q(x)(x x1 )(x x2 ) (x xj ) dx = 0.
1
32/43
x p(x).
33/43
For w(x) = e^{-x} and [a, b] = [0, inf) the Laguerre polynomials form
an orthogonal family:
    phi_0(x) = 1,   phi_1(x) = 1 - x,
    phi_{j+1}(x) = ((2j + 1 - x)/(j + 1)) phi_j(x) - (j/(j + 1)) phi_{j-1}(x),
    j > 0.
34/43
For w(x) = e^{-x^2}, [a, b] = (-inf, inf): Hermite polynomials
    phi_0(x) = 1,   phi_1(x) = 2x,
    phi_{j+1}(x) = 2x phi_j(x) - 2j phi_{j-1}(x),   j > 0.
35/43
Chebyshev polynomials
Chebyshev polynomials are orthogonal w.r.t. the weight
    w(x) = 1/sqrt(1 - x^2)  on [-1, 1].
Define
    phi_j(x) = Tj(x) = cos(j arccos(x)) = cos(j theta),  where x = cos(theta).
Orthogonality:
    Int_{-1}^1 Tj(x) Tk(x) / sqrt(1 - x^2) dx
        = Int_0^pi cos(j theta) cos(k theta) d theta
        = 0 for j /= k,  pi/2 for j = k > 0.
Three-term recurrence: T_{j+1}(x) = 2x Tj(x) - T_{j-1}(x), j >= 1,
and |Tj(x)| <= 1 for -1 <= x <= 1.
37/43
38/43
39/43
40/43
Among all monic polynomials Prod_{k=1}^n (x - xk) of degree n,
the scaled Chebyshev polynomial minimizes the maximum norm on [-1, 1]:
    min max_{-1<=x<=1} |Prod_{k=1}^n (x - xk)| = 1/2^{n-1}.
41/43
    T~1(x) = x,   T~2(x) = x^2 - 1/2,   . . . ,
    T~n(x) = Tn(x)/2^{n-1} = Prod_{k=1}^n (x - xk),   n >= 1,
with extrema (-1)^k / 2^{n-1} attained alternately.
42/43
43/43
Goals
I
2/38
Topics today
I
Filtering
References
I
3/38
2k (x) = cos(kx),
k > 0.
4/38
5/38
6/38
Consider (yet again) the approximation v(x) = Sum_{j=0}^n cj phi_j(x).
Family of orthogonal basis functions: trigonometric polynomials
    phi_0(x) = 1,   phi_{2k}(x) = cos(kx),   phi_{2k-1}(x) = sin(kx),
and the Fourier series
    f(t) ~ a0/2 + Sum_{k=1}^inf ak cos(kt) + Sum_{k=1}^inf bk sin(kt).
7/38
The truncated approximation v(x) = Sum_{j=0}^n cj phi_j(x)
corresponds (n = 2l) to
    a0/2 + Sum_{k=1}^{l-1} (ak cos(kt) + bk sin(kt)) + a_l cos(lt).
8/38
9/38
In complex form:
    f(x) = Sum_{k=-inf}^{inf} ck e^{ikx},
    ck = (1/2pi) Int_{-pi}^{pi} f(t) e^{-ikt} dt,   k = 0, +-1, +-2, . . .
10/38
    z = x + iy.
Magnitude: |z| = sqrt(x^2 + y^2) = sqrt((x + iy)(x - iy)) = sqrt(z z-bar),
where z-bar = x - iy is the conjugate of z.
Euler identity:
    e^{i theta} = cos(theta) + i sin(theta),
where the angle theta is in radian.
In polar coordinates
    z = r e^{i theta} = r cos(theta) + i r sin(theta) = x + iy,
where r = |z| and tan(theta) = y/x.
11/38
Application: convolution
Problem: evaluate the convolution
    phi(x) = Int g(x - s) f(s) ds.
Expanding g in its Fourier series and substituting xi = x - s,
    phi(x) = Sum_k c_k^g Int e^{ik(x-s)} f(s) ds
           = 2pi Sum_k c_k^g c_k^f e^{ikx},
i.e. the Fourier coefficients multiply: c_k^phi = 2pi c_k^g c_k^f.
12/38
Application: differentiation
I
13/38
Equidistant points
    xi = 2 pi i / (2n) = pi i / n,   i = 0, 1, . . . , 2n - 1.
Determine the coefficients of
    p2n(x) = (1/2)(a0 + an cos(nx)) + Sum_{k=1}^{n-1} [ak cos(kx) + bk sin(kx)]
by interpolation:
    yi = p2n(xi),   i = 0, 1, . . . , 2n - 1.
14/38
Discrete orthogonality relations:
    (1/n) Sum_{i=0}^{2n-1} cos(kxi) cos(jxi) = 0,  k /= j,
                                             = 1,  0 < k = j < n,
                                             = 2,  k = j = 0 or k = j = n,
    (1/n) Sum_{i=0}^{2n-1} sin(kxi) sin(jxi) = 0,  k /= j,
                                             = 1,  0 < k = j < n,
    (1/n) Sum_{i=0}^{2n-1} sin(kxi) cos(jxi) = 0,  for all j, k.
15/38
DFT:
    ak = (1/n) Sum_{i=0}^{2n-1} yi cos(kxi),   k = 0, 1, . . . , n,
    bk = (1/n) Sum_{i=0}^{2n-1} yi sin(kxi),   k = 1, 2, . . . , n - 1.
Then
    p2n(x) = (1/2)(a0 + an cos(nx)) + Sum_{k=1}^{n-1} [ak cos(kx) + bk sin(kx)].
16/38
1 t 2.
(
1, 1 x + 1,
f (x) =
0, otherwise.
17/38
18/38
19/38
20/38
Trigonometric interpolation
Let {yj}_{j=0}^{2n-1} be a 2n-periodic sequence. We define a
trigonometric polynomial of degree 2n by
    p2n(x) = a0 + Sum_{k=1}^n ak cos(kx) + Sum_{k=1}^{n-1} bk sin(kx),
interpolating at
    xj = j pi / n,   j = 0, 1, . . . , 2n - 1.
21/38
Trigonometric interpolation
To construct a trigonometric polynomial of degree 2n, we first
rewrite it in complex form: for -n <= k <= n, we let
    ck = (1/2)(ak - i bk),        1 <= k <= n - 1,
    ck = (1/2)(a_{-k} + i b_{-k}),  -n + 1 <= k <= -1,
    c0 = a0,   cn = an.
We also use that
    e^{iz} = cos(z) + i sin(z),
    cos(z) = (e^{iz} + e^{-iz})/2,   sin(z) = (e^{iz} - e^{-iz})/(2i).
22/38
    p2n(x) = a0 + (1/2) Sum_{k=1}^{n-1} ak (e^{ikx} + e^{-ikx})
                + (1/2i) Sum_{k=1}^{n-1} bk (e^{ikx} - e^{-ikx})
                + (1/2) an (e^{inx} + e^{-inx})
           = Sum_{k=-n+1}^{n-1} ck e^{ikx} + (1/2) cn (e^{inx} + e^{-inx}).
23/38
At the interpolation points,
    yj = Sum_{k=-n+1}^{n-1} ck e^{ik xj} + (1/2) cn (e^{in xj} + e^{-in xj})
       = Sum_{k=0}^{2n-1} ck e^{pi i jk/n} = Sum_{k=0}^{2n-1} ck omega_{2n}^{jk},
with omega_m = e^{2 pi i / m}.
24/38
In matrix form we have V c = y with
    V = ( omega_{2n}^{jk} )_{j,k=0}^{2n-1}
      = [1 1               1                . . . 1
         1 omega_{2n}      omega_{2n}^2     . . . omega_{2n}^{2n-1}
         . . .
         1 omega_{2n}^{2n-1} . . .                omega_{2n}^{(2n-1)^2}],
    c = (c0, . . . , c_{2n-1})^T,   y = (y0, . . . , y_{2n-1})^T.
25/38
With Vjk = omega_N^{(j-1)(k-1)}, 1 <= j, k <= N, we have for 1 <= k, l <= N
    [V*V]kl = Sum_{j=1}^N omega_N^{-(k-1)(j-1)} omega_N^{(j-1)(l-1)}
            = Sum_{j=0}^{N-1} omega_N^{jm},   m = l - k.
26/38
Case k = l: m = 0 and omega_N^{jm} = 1, thus [V*V]kk = N.
Case k /= l:
    Sum_{j=0}^{N-1} omega_N^{jm} = (1 - omega_N^{Nm}) / (1 - omega_N^m) = 0,
therefore [V*V]kl = 0.
So, V*V = N I_N, or V^{-1} = (1/N) V*, and
    c = (1/N) V* y.
27/38
Interpolating polynomial:
    p(x) = Sum_{k=0}^{N-1} ck e^{ikx}.    (+)
28/38
At the grid points:
    yj = Sum_{k=0}^{N-1} ck omega_N^{jk} = Sum_{k=0}^{N-1} ck e^{2 pi i jk/N}.    (*)
29/38
30/38
For N = 2n split the sum into even and odd indices:
    yk = Sum_{j=0}^{N-1} cj omega_N^{jk}
       = Sum_{j=0}^{n-1} c_{2j} omega_N^{2jk} + Sum_{j=0}^{n-1} c_{2j+1} omega_N^{(2j+1)k}
       = Sum_{j=0}^{n-1} c_{2j} omega_n^{jk}
         + omega_N^k Sum_{j=0}^{n-1} c_{2j+1} omega_n^{jk},
since omega_N^{2jk} = omega_n^{jk}, for 0 <= k < N.
31/38
With ce/co the even/odd parts of c,
    (F_N^{-1} c)_k     = (F_n^{-1} ce)_k + omega_N^k (F_n^{-1} co)_k,
    (F_N^{-1} c)_{k+n} = (F_n^{-1} ce)_k - omega_N^k (F_n^{-1} co)_k,
    0 <= k < n.
32/38
In matrix terms (butterfly factorization),
    F_N = [ I_n  Omega_n ; I_n  -Omega_n ] [ F_n  0 ; 0  F_n ] P_N,
    Omega_n = diag(1, omega_N, . . . , omega_N^{n-1}),    (1)
where P_N is the N-by-N permutation matrix that picks first the even and
then the odd components of the N-vector c.
33/38
    y = F_N x = [ I_n  Omega_n ; I_n  -Omega_n ] (I_2 (x) F_n) P_N x = . . . ,
and the factorization is applied recursively to F_n, F_{n/2}, etc.
After the permutations, the directly neighbored vector elements
with indices 2k and 2k + 1 are combined. In the next step,
elements that are two apart are combined. Finally, elements that
are four (= n/2) apart are combined.
34/38
function y = myIFFT(x)
%MYIFFT y = myIFFT(x) computes the inverse Fourier transform of
%
vector x having length equal a power of 2
%
using the Cooley-Tukey recursive algorithm.
n = length(x);
if (n == 1)
y = x;
else
w = exp(2i*pi/n*[0:n/2-1]);
ze = myIFFT(x(1:2:n-1));
zo = myIFFT(x(2:2:n));
y = [ze+w.*zo; ze-w.*zo];
end
To compute the forward Fourier transform, one merely needs to replace
e 2k/n by e 2k/n and divide the final result by n.
35/38
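A quick consistency check against the built-ins; note the scaling convention (myIFFT omits the division by n that Matlab's ifft performs):

x = randn(8,1);
norm(myIFFT(x) - 8*ifft(x))   % should be of order 1e-15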
Example: IFFT
IFFT of (1, 2, 3, 4)^T (without the final division by n). The recursion
splits into the even part (1, 3) and the odd part (2, 4):
    ifft(1,3): 1 + 1*3 = 4,    1 - 1*3 = -2
    ifft(2,4): 2 + 1*4 = 6,    2 - 1*4 = -2
Combining with the twiddle factors (1, i):
    4 + 1*6 = 10,    -2 + i*(-2) = -2 - 2i,
    4 - 1*6 = -2,    -2 - i*(-2) = -2 + 2i.
36/38
Matlab
The Matlab built-in functions fft and ifft implement the
discrete Fourier transform and its inverse in compiled code.
Unlike in the definitions (+) and (*), the FFT in Matlab puts
the division by n in the inverse transform, i.e., Matlab uses
    (Fn y)_k = Sum_{j=0}^{n-1} yj e^{-2 pi i jk/n},
    (Fn^{-1} c)_j = (1/n) Sum_{k=0}^{n-1} ck e^{2 pi i jk/n}.
37/38
Complexity
Let C(n) be the complexity to execute the Fourier transform of
length n = 2^q. Then,
    C(n) = (3n/2) + 2 C(n/2)
         = (3n/2) + 2 (3n/4) + 4 C(n/4)
         = . . .
         = (3n/2) log2 n + n = O(n log2 n).
Notice that these are complex flops. As a complex
addition/subtraction amounts to two real additions/subtractions
and a complex multiplication amounts to four real multiplications
and two additions/subtractions, the real complexity of the FFT of
length n = 2^q is correspondingly larger.
38/38
Topics today
We continue with the discussion of the Discrete Fourier Transform,
keeping in mind its fast variant, the FFT.
I
Convolution
Filtering
Circulant matrices
2/38
3/38
Cosine transform: with the points
    xi = (i + 1/2) pi / m,   i = 0, 1, . . . , n,
the coefficients
    ak = (2/m) Sum_i yi cos(k xi),   k = 0, 1, . . . , n,
define
    pn(x) = a0/2 + Sum_{k=1}^n ak cos(kx),
with
    yi = pn(xi) = a0/2 + Sum_{k=1}^n ak cos(k xi),   i = 0, 1, . . . , n.
4/38
5/38
Approximation errors for 1 <= t <= 2:
    error (r = 2):  9.8e-3, 2.4e-3, 3.9e-4
    error (r = 10): 2.0e-5, 2.2e-8, 2.5e-11
6/38
and in the periodic case:
    Error (r = 2):  1.3e-13, 2.1e-15, 3.7e-15
    Error (r = 10): 6.5e-6, 2.5e-14, 6.4e-14
7/38
Signal processing
I
For sound signals, it is often the case that f (t) is periodic (or
nearly periodic) with period T , i.e., that
f (t + T ) = f (t),
for all t R.
Sampling: the signal is sampled at discrete equidistant times ti,
giving vectors t, y with yi = f(ti).
8/38
9/38
Filter
Discrete finite linear time-invariant causal channel (filter)
    input signal (xk)  ->  filter  ->  output signal (yk)
10/38
Impulse response
In order to link digital filters to linear algebra, we have to assume
certain properties that are indicated by the attributes finite ,
linear, time-invariant and causal:
I
11/38
[Figure: impulse at t0 and its impulse response h1, h2, h3, . . . ,
hm, . . . , hn at the times t0, t1, t2, . . . , tn.]
12/38
13/38
14/38
By linearity and time-invariance, the output is the superposition of
shifted impulse responses:
    (y0, y1, . . . , y_{2n-2})^T
        = x0 (h0, . . . , h_{n-1}, 0, . . . , 0)^T
        + x1 (0, h0, . . . , h_{n-1}, 0, . . .)^T
        + . . .
        + x_{n-1} (0, . . . , 0, h0, . . . , h_{n-1})^T,
i.e.
    yk = Sum_{j=0}^{n-1} h_{k-j} xj,   k = 0, . . . , 2n - 2
(with hm := 0 for m < 0 or m >= n).
15/38
    (y0, . . . , y_{2n-2})^T = T (x0, . . . , x_{n-1})^T,
where T is the (2n-1)-by-n banded matrix whose j-th column contains
h0, . . . , h_{n-1} starting in row j:
    T = [h0
         h1      h0
         ...     h1      ...
         h_{n-1} ...     ...  h0
                 h_{n-1} ...  h1
                         ...  ...
                              h_{n-1}]   (Toeplitz matrix).
16/38
Discrete convolution
Definition (discrete convolution)
Given x = (x0 , . . . , xn1 )T Rn , h = (h0 , . . . , hn1 )T Rn their
discrete convolution is the vector y R2n1 with components
yk =
n1
X
hkj xj
k = 0, . . . , 2n 2
j=0
Notation: y = h x.
    x0, . . . , x_{n-1}  ->  FIR(h0, . . . , h_{n-1})  ->  y
    h0, . . . , h_{n-1}  ->  FIR(x0, . . . , x_{n-1})  ->  y
(convolution is commutative).
17/38
Multiplication of polynomials
The bilinear operation that takes two input vectors and produces
an output vector with double the number of entries also governs
the multiplication of polynomials:
    p(z) = Sum_{k=0}^{n-1} ak z^k,   q(z) = Sum_{k=0}^{n-1} bk z^k
    =>  (pq)(z) = Sum_{k=0}^{2n-2} ( Sum_j aj b_{k-j} ) z^k,
        ck := Sum_j aj b_{k-j}.
18/38
Convolution of sequences
The notion of a discrete convolution extends naturally to
sequences N0 7 R:
The (discrete) convolution of two sequences (xj )jN0 , (yj )jN0 is
the sequence (zj )jN0 defined by
    zk := Sum_{j=0}^k x_{k-j} yj = Sum_{j=0}^k xj y_{k-j},   k in N0.
19/38
    yk = Sum_{j=0}^{n-1} h_{k-j} xj   (indices taken mod n).
20/38
    H y = [h0      h_{n-1} . . . h1
           h1      h0      . . . h2
           ...                  ...
           h_{n-1} h_{n-2} . . . h0] (y0, . . . , y_{n-1})^T.
H is a circulant matrix.
21/38
22/38
    (Fn(y * z))_k = (1/n) Sum_{j=0}^{n-1} Sum_{l=0}^{n-1} yl z_{j-l} e^{-2 pi i jk/n}
                  = (1/n) Sum_{l=0}^{n-1} yl e^{-2 pi i lk/n}
                          Sum_{j=0}^{n-1} z_{j-l} e^{-2 pi i (j-l)k/n}
                  = n [ (1/n) Sum_l yl e^{-2 pi i lk/n} ]
                      [ (1/n) Sum_j zj e^{-2 pi i jk/n} ]
                  = n (Fn y)_k (Fn z)_k.
23/38
    zk := Sum_{j=0}^{n-1} x_{k-j} yj = Sum_{j=0}^{n-1} y_{k-j} xj,
    k in Z   (indices mod n).
24/38
To compute an ordinary (acyclic) convolution
    ck = Sum_{j=0}^{n-1} aj b_{k-j},   k = 0, . . . , 2n - 2,
25/38
with the cyclic machinery, zero-pad both vectors to length 2n:
    (a0, . . . , a_{n-1}, 0, . . . , 0).
26/38
Matrix notation
The padded convolution (y0, . . . , y_{2n-2})^T equals the
(2n-1)-by-n Toeplitz matrix built from h0, . . . , h_{n-1}
(shifted copies of h in consecutive columns, zeros elsewhere),
applied to (x0, . . . , x_{n-1})^T -- exactly the matrix shown earlier.
27/38
Circulant matrices
Let C be a circulant matrix:
    C = [c0      c_{n-1} . . . c1
         c1      c0      . . . c2
         ...                  ...
         c_{n-1} c_{n-2} . . . c0].
Then,
    C = Sum_{k=0}^{n-1} ck Jn^k,
where Jn in R^{n,n} is the cyclic shift matrix with ones on the
superdiagonal and a one in the bottom-left corner.
28/38
The eigenvalue equation Jn v = lambda v gives
v = (1, lambda, lambda^2, . . . , lambda^{n-1})^T and the condition
lambda^n = 1. The numbers omega_n^{-k} = e^{-2 pi i k/n},
k = 0, . . . , n - 1, are n different solutions of lambda^n = 1.
Therefore, with the Fourier matrix
    Fn = ( omega_n^{jk} )_{j,k=0}^{n-1}
       = [1 1                . . . 1
          1 omega_n          . . . omega_n^{n-1}
          ...
          1 omega_n^{n-1}    . . . omega_n^{(n-1)^2}],
29/38
    Jn = Fn Lambda_n Fn^{-1} = Fn Lambda_n (1/n) Fn*,
where Lambda_n = diag(1, omega_n^{-1}, omega_n^{-2}, . . . , omega_n^{-(n-1)}).
The eigenvalue decomposition of C is now easily found:
    C = Sum_{k=0}^{n-1} ck Jn^k
      = Sum_{k=0}^{n-1} ck Fn Lambda_n^k (1/n) Fn*
      = Fn ( Sum_{k=0}^{n-1} ck Lambda_n^k ) (1/n) Fn*.
30/38
The diagonal entries (eigenvalues) are
    lambda_j = Sum_{k=0}^{n-1} ck omega_n^{-kj},
i.e. the vector of eigenvalues is the (conjugate) Fourier transform of
the first column of C:
    (lambda_0, . . . , lambda_{n-1})^T = Fn-bar (c0, . . . , c_{n-1})^T.
31/38
For H = circ(h0, . . . , hn-1), the product y = Hx with
H = Fn Lambda (1/n) Fn* is computed as follows:
1. Use FFTs to compute h^ = Fn-bar h and x^ = Fn-bar x.
2. Compute w^ = n (h^ o x^) (elementwise product).
3. Use the IFFT to compute y = Fn^{-1} w^.
Since steps 1 and 3 each cost O(n log n) and step 2 costs O(n),
the overall cost is O(n log n).
32/38
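A minimal Matlab sketch of the three steps; with Matlab's fft/ifft conventions the factor n cancels, so multiplying a circulant matrix (first column h) with a vector x reduces to:

n = 8;
h = randn(n,1);  x = randn(n,1);
y = ifft(fft(h) .* fft(x));                % = circulant(h)*x, O(n log n)
% check against the dense circulant matrix:
H = toeplitz(h, [h(1); h(end:-1:2)]);
norm(y - H*x)                              % ~ 1e-15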
Matlab functions
I
33/38
34/38
[Figure: a sampled signal (left) and its power spectrum |ck|^2 vs.
coefficient index k (right).]
35/38
36/38
[Figure: location of high and low frequencies in the DFT coefficient
vector.]
37/38
[Figure: signal, noisy signal, low pass filter, high pass filter --
time domain (left) and spectrum |ck| (right).]
38/38
Topics of today
1. Newton-Cotes rules: trapezoidal rule, Simpson's rule
2. Romberg integration
3. Gauss quadrature
4. Adaptive quadrature
References
I
2/40
Introduction
Let f(x) be a continuous real-valued function of a real variable,
defined on a finite interval a <= x <= b.
We seek to compute the value of the integral
    Int_a^b f(x) dx.
3/40
Introduction (cont.)
The fundamental additive property of definite integrals is the basis
for many quadrature formulae. For a < c < b we have
    Int_a^b f(x) dx = Int_a^c f(x) dx + Int_c^b f(x) dx.
4/40
Midpoint rule:
    M = h f((a + b)/2).
Trapezoidal rule:
    T = h (f(a) + f(b))/2,   h = b - a.
The accuracy of a quadrature rule can be predicted in part by
examining its behavior on polynomials.
5/40
Newton-Cotes rule
One obvious way to approximate f is by polynomial interpolation
and then integrate the polynomial.
Let a = x0 < x1 < x2 < . . . < xn = b be equidistant points in
[a, b]. The Lagrange interpolating polynomial is given by
    Pn(x) = Sum_{i=0}^n Li(x) f(xi),
    Li(x) = Prod_{j/=i} (x - xj)/(xi - xj),   i = 0, . . . , n.
Then
    Int_a^b f(x) dx ~ Int_a^b Pn(x) dx
                    = Sum_{i=0}^n f(xi) Int_a^b Li(x) dx
                    = Sum_{i=0}^n f(xi) wi.
6/40
For n = 1,
    P1(x) = f(a)(b - x)/(b - a) + f(b)(x - a)/(b - a),
and (with x = a + xi h)
    Int_a^b (b - x)/(b - a) dx = Int_a^b (x - a)/(b - a) dx
        = h Int_0^1 xi d xi = h/2.
So,
    T = (h/2)(f(a) + f(b)).
7/40
Simpson's rule
Simpson's rule is the Newton-Cotes rule for n = 2 (x = a + 2 xi h,
c = (a + b)/2):
    P2(xi) = f(a)(1 - 2xi)(1 - xi) + f(c) 4xi(1 - xi) + f(b) xi(2xi - 1).
S=
h
(f (a) + 4f (c) + f (b)),
6
c=
a+b
.
2
Note that
2
1
S = M + T.
3
3
It turns out that Simpson's rule also integrates cubics exactly, but
not quartics, so its order is four.
Newton-Cotes methods with order up to 8 exist.
Even higher order methods can be devised, but some of their
weights wi get negative, which is undesirable for potential round-off.
8/40
Rule                                                 Error
Midpoint:  (b - a) f((a + b)/2)                      f''(eta1) (b - a)^3 / 24
Trapezoid: (b - a) [f(a) + f(b)] / 2                -f''(eta2) (b - a)^3 / 12
Simpson:   (b - a)/6 [f(a) + 4 f((a+b)/2) + f(b)]   -f''''(eta3)/90 * ((b - a)/2)^5
9/40
Composite Rules
I
Rb
Applying a quadrature rule to a f (x) dx may not yield an
approximation with the desired accuracy.
10/40
11/40
Composite trapezoidal rule on [a, b] with h = (b - a)/n, xi = a + ih:
    T(h) = h ( y0/2 + y1 + . . . + y_{n-1} + yn/2 ),   yi = f(xi).
12/40
13/40
function T=trapez(f,a,b,tol);
% TRAPEZ(f,a,b,tol) tries to integrate int_a^b f(x) dx
% to a relative tolerance tol using the composite
% trapezoidal rule.
%
h = b-a; s = (f(a)+f(b))/2;
tnew = h*s; zh = 1; told = 2*tnew;
fprintf('h = 1/%2.0f, T(h) = %12.10f\n',1/h, tnew)
while abs(told-tnew)>tol*abs(tnew),
told = tnew; zh = 2*zh; h = h/2;
s = s+sum(f(a+[1:2:zh]*h));
tnew = h*s;
fprintf('h = 1/%2.0f, T(h) = %12.10f\n',1/h, tnew)
end;
T = tnew;
14/40
Example
    I = Int_0^1 x e^x / (x + 1)^2 dx = (e - 2)/2 = 0.3591409142 . . .
    h     T(h)          (T(hi) - I)/(T(hi-1) - I)
    1     0.339785228
    1/2   0.353083866   0.31293
    1/4   0.357515195   0.26840
    1/8   0.358726477   0.25492
    1/16  0.359036783   0.25125
    1/32  0.359114848   0.25031
    1/64  0.359134395   0.25007
If f'' does not vary too much, the error decreases by a factor 4 per step.
15/40
    Int_{x_{i-1}}^{x_{i+1}} f(x) dx ~ (h/3)(y_{i-1} + 4 y_i + y_{i+1});
summing over pairs of subintervals,
    S(h) = (h/3)(y0 + 4y1 + 2y2 + 4y3 + . . . + 2y_{2n-2} + 4y_{2n-1} + y_{2n}).
16/40
The error is O(h^4), with leading term proportional to f''''(xi),
a <= xi <= b.
17/40
Halving the step size: with s1 = f(a) + f(b), s2 = values with weight 2,
s4 = values with weight 4,
    s1new = s1,   s2new = s2 + s4,   s4new = sum of new function values,
    S(h/2) = (h/2)/3 (s1new + 4 s4new + 2 s2new).
18/40
function S=simpson(f,a,b,tol);
% SIMPSON (f,a,b,tol) tries to integrate int_a^b f(x) dx
% to a relative tolerance tol using the composite
% Simpson rule.
%
h = (b-a)/2; s1 = f(a)+f(b); s2 = 0;
s4 = f(a+h); snew = h*(s1+4*s4)/3;
zh = 2; sold = 2*snew;
fprintf('h = 1/%2.0f, S(h) = %12.10f\n',1/h, snew)
while abs(sold-snew)>tol*abs(snew),
sold = snew; zh = 2*zh; h = h/2; s2 = s2+s4;
s4 = sum(f(a +[1:2:zh]*h));
snew = h*(s1+2*s2+4*s4)/3;
fprintf('h = 1/%2.0f, S(h) = %12.10f\n',1/h, snew)
end
S = snew;
19/40
Same example:
    I = Int_0^1 x e^x / (x + 1)^2 dx = (e - 2)/2 = 0.3591409142 . . .
    h     S(h)          (S(hi) - I)/(S(hi-1) - I)
    1     0.357516745
    1/2   0.358992305   0.09149
    1/4   0.359130237   0.07184
    1/8   0.359140219   0.06511
    1/16  0.359140870   0.06317
    1/32  0.359140911   0.06267
    1/64  0.359140914   0.06254
20/40
Romberg method
The asymptotic expansion for the approximation of an integral by
the trapezoidal rule is given by
    T(h) = Int_a^b g(t) dt + K1 h^2 + K2 h^4 + . . . + Km h^{2m} + Rm,
so one extrapolates with the polynomial (in h^2)
    P2m(h) = K0 + K1 h^2 + K2 h^4 + . . . + Km h^{2m},
    with K0 = Int_a^b g(t) dt.
21/40
Extrapolate the values
    h0^2  T(h0)
    h1^2  T(h1)
    ...
    hi^2  T(hi)
(as a function of h^2) to h = 0.
22/40
This yields the Romberg scheme Tkk with
    T_{i+1,j} = (4^j T_{i+1,j-1} - T_{i,j-1}) / (4^j - 1),   0 <= j <= i <= k.
23/40
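A minimal Romberg sketch built on this recurrence (the function name is my own; f is assumed vectorized):

function I = romberg0(f, a, b, k)
% ROMBERG0 Romberg extrapolation of the composite trapezoidal rule.
T = zeros(k+1);
h = b - a;
T(1,1) = h*(f(a) + f(b))/2;
for i = 1:k
    h = h/2;
    T(i+1,1) = T(i,1)/2 + h*sum(f(a + (1:2:2^i)*h));  % refine T(h)
    for j = 1:i
        T(i+1,j+1) = (4^j*T(i+1,j) - T(i,j))/(4^j - 1);
    end
end
I = T(k+1,k+1);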
For I = Int_0^1 x e^x/(x + 1)^2 dx = (e - 2)/2 = 0.3591409142 . . .
the Romberg tableau reads, column by column:
24/40
Column 1: 0.35751674591915, 0.35899230563633, 0.35913023759497,
          0.35914021901962, 0.35914087030714, 0.35914091147685
Column 2: 0.35909067628414, 0.35913943305888, 0.35914088444793,
          0.35914091372630, 0.35914091422149
Column 3: 0.35914020697594, 0.35914090748585, 0.35914091419104,
          0.35914091422935
Column 4: 0.35914091023295, 0.35914091421734, 0.35914091422950
Column 5: 0.35914091422123, 0.35914091422951
Column 6: 0.35914091422952
25/40
Another example
Let's integrate Int_0^1 sqrt(x) dx:
    [I,T]=romberg(@sqrt,0,1,1e-8)
generates a warning message that 15 extrapolation steps were not
sufficient to meet the tolerance of 1e-8.
romberg is hardly more accurate than simpson! (The integrand is not
smooth at 0, so the asymptotic expansion does not hold.)
26/40
Gauss quadrature
Gauss quadrature rules are usually formulated on [-1, 1]; for a general
interval [a, b], the transformation
    x = (b - a)/2 * t + (a + b)/2,   t in [-1, 1],
    t = (2x - b - a)/(b - a),
may be needed:
    Int_a^b f(x) dx = (b - a)/2 Int_{-1}^1 f( (b - a)/2 t + (a + b)/2 ) dt.
27/40
    Int_{-1}^1 f(x) dx ~ Sum_{k=1}^n wk f(xi_k)    (Gauss)
Theorem
The order of an n-point quadrature rule (Gauss) is bounded by 2n:
polynomials of degree up to 2n - 1 can be integrated exactly,
but not all polynomials of degree 2n.
28/40
Determine w1, w2, xi1, xi2 from exactness on 1, x, x^2, x^3:
    Int_{-1}^1 1 dx   = 2   = w1 + w2,
    Int_{-1}^1 x dx   = 0   = xi1 w1 + xi2 w2,
    Int_{-1}^1 x^2 dx = 2/3 = xi1^2 w1 + xi2^2 w2,
    Int_{-1}^1 x^3 dx = 0   = xi1^3 w1 + xi2^3 w2.
29/40
Solution:
    w1 = 1, w2 = 1, xi1 = -1/sqrt(3), xi2 = 1/sqrt(3),
and the corresponding Gauss-Legendre rule for n = 2 is
    Int_{-1}^1 f(x) dx ~ f(-1/sqrt(3)) + f(1/sqrt(3)).
This approach (solving a nonlinear system of 2n equations) doesn't
work for big n.
30/40
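Applying the 2-point rule on a general interval combines it with the transformation above (a minimal sketch, using the running example):

f = @(x) x.*exp(x)./(x+1).^2;      % the running example integrand
a = 0; b = 1;
t = [-1 1]/sqrt(3);  w = [1 1];    % 2-point Gauss-Legendre on [-1,1]
x = (b-a)/2*t + (a+b)/2;
G = (b-a)/2 * sum(w.*f(x))         % approximates 0.35914...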
31/40
For f of degree <= 2n - 1, write f = q pi_n + r with deg q, deg r <= n - 1,
where pi_n is the orthogonal polynomial whose roots are the nodes xi_i. Then
    Sum_{i=1}^n wi f(xi_i) = Sum_{i=1}^n wi [q(xi_i) pi_n(xi_i) + r(xi_i)]
                           = Sum_{i=1}^n wi r(xi_i)
                           = Int_{-1}^1 r(x) dx = Int_{-1}^1 f(x) dx,
since Int q pi_n = 0 by orthogonality and the rule integrates r
(degree <= n - 1) exactly.
32/40
33/40
Adaptive Quadrature
I
34/40
Example: Int_0^1 sqrt(x) dx = 2/3. On subintervals away from 0, the local
error bound |f''(xi)| h^3/12 ~ 6.357e-7 < 1e-5 shows that large steps
suffice there; only near 0 are small steps needed.
35/40
36/40
37/40
38/40
39/40
Comparison
With f(x) = x e^x / (x + 1)^2 we get . . .
40/40
Topics of today
1. Single-step methods
I
I
I
References
I
2/37
Differential Equations
I
a < t < b.
3/37
Example: y'(t) = -y(t) + t, t >= 0.
The general solution is
    y(t) = t - 1 + alpha e^{-t},   any alpha in R;
with initial value y(0) = c one gets
    y(t) = t - 1 + (c + 1) e^{-t}.
4/37
dy (t)
= f (t, y (t))
dt
(1)
(2)
5/37
6/37
()
7/37
Example (pendulum): with y1 = theta, y2 = theta',
    y1' = y2,   y2' = -g sin(y1),
i.e.
    y'(t) = f(t, y(t)) = [ y2 ; -g sin(y1) ].
8/37
With
    y(t) = ( u(t), u'(t), v(t), v'(t) )^T,
the orbit equations become
    y'(t) = ( u'(t), -u(t)/r(t)^3, v'(t), -v(t)/r(t)^3 )^T.
9/37
Predator-prey model:
    u' = (alpha - beta v) u       (prey)
    v' = -(gamma - delta u) v     (predator)
i.e.
    y' = f(y),   y = (u, v)^T,
    f(y) = [ (alpha - beta v) u ; -(gamma - delta u) v ].
10/37
11/37
12/37
Forward Euler
I
y (a) = c.
Method derivation
Explicit vs. implicit methods
Local truncation error and global error
Order of accuracy
Convergence
Absolute stability and stiffness.
13/37
Taylor expansion:
    f(tj, y(tj)) = y'(tj) = (y(tj+1) - y(tj))/h - (h/2) y''(xi_j).
Therefore,
    y(tj+1) = y(tj) + h f(tj, y(tj)) + (h^2/2) y''(xi_j).
14/37
Then, set
y0 = y (t0 ) = c,
yj+1 = yj + h f (tj , yj ),
j = 0, . . . , N 1.
15/37
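A minimal forward Euler sketch in Matlab (scalar case; the function name is my own):

function [t, y] = feuler(f, a, b, c, N)
% FEULER forward Euler for y' = f(t,y), y(a) = c, with N steps on [a,b].
h = (b-a)/N;
t = a + (0:N)*h;
y = zeros(1, N+1);  y(1) = c;
for j = 1:N
    y(j+1) = y(j) + h*f(t(j), y(j));
end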
16/37
17/37
Note the simplicity of the code, as well as the rather small h, which yields
322,000 time steps.
Execute Matlab program Example16_4Figure16_3.m.
18/37
Backward Euler: use the backward difference
    y'(tj+1) = (y(tj+1) - y(tj))/h + (h/2) y''(xi_j).
So,
    y(tj+1) = y(tj) + h f(tj+1, y(tj+1)) + O(h^2).
The initial value problem is then solved by
    y0 = c,
    yj+1 = yj + h f(tj+1, yj+1),   j = 0, . . . , N - 1.
19/37
Trapezoidal rule
Integrating over [tj, tj+1] with the trapezoidal rule gives the implicit
trapezoidal method
    yj+1 = yj + (h/2) (f(tj, yj) + f(tj+1, yj+1)).
20/37
21/37
Midpoint rule
Let us try the analog to the midpoint rule for integrating functions.
This requires a second point of evaluation of f .
s1 = f (tj , yj ),
h
h
s2 = f (tj + , yj + s1 ),
2
2
yj+1 = yj + h s2 ,
tj+1 = tj + h.
The midpoint rule has accuracy order 2.
22/37
Comparison
We compare the four methods (all except implicit trapezoidal rule)
for the initial value problem
    y'(t) = -2 t y^2,   y(0) = 1,
with exact solution
    y(t) = 1/(1 + t^2).
We see that midpoint rule and trapezoidal rule are much more
accurate than both Euler methods.
23/37
Comparison (cont.)
24/37
General single-step method:
    yj+1 = yj + h Psi(tj, yj, h),   y0 = y(t0).    (Ah)
Local truncation error:
    eta_h(t) = (1/h)(y(t + h) - y(t)) - Psi(t, y(t), h).    (3)
Consistency: eta_h -> 0 as h -> 0, since
    lim_{h->0} (1/h)(y(t + h) - y(t)) = f(t, y) = Psi(t, y, 0).
26/37
27/37
28/37
Combining (3) and yj+1 = yj + h Psi(tj, yj, h) (Ah):
    e_h(tj+1) = e_h(tj) + h (Psi(tj, y(tj), h) - Psi(tj, yj, h)) + h eta_h(tj).
Using the Lipschitz condition on Psi we get
    |e_h(tj+1)| <= |e_h(tj)| + hL |y(tj) - yj| + h |eta_h(tj)|
                = (1 + hL) |e_h(tj)| + h |eta_h(tj)|.
29/37
With |eta_h| <= D,
    |e_h(tj+1)| <= (1 + hL) |e_h(tj)| + hD.
Applying this inequality recursively to j = n, n - 1, . . . , 1 yields
    |e_h(tn)| <= ((1 + hL)^n - 1)/L * D + (1 + hL)^n |e_h(t0)|.
30/37
Theorem
If |eta_h(t)| <= M h^p, the error of the method (Ah) at tn = nh + t0
satisfies
    |e_h(tn)| <= (M h^p / L) (e^{L(tn - t0)} - 1).
31/37
Absolute stability
Test equation: y' = lambda y, y(0) = y0, with solution y(t) = y0 e^{lambda t}.
32/37
For lambda < 0, forward Euler gives yj+1 = (1 + h lambda) yj; absolute
stability requires |1 + h lambda| <= 1, i.e.
    h <= 2/|lambda|.
33/37
I
I
y (0) = 1,
34/37
Stiffness
Definition
The initial-value ODE problem is stiff if the step size needed to
maintain absolute stability of the forward Euler method is much
smaller than the step size needed to represent the solution
accurately.
Remark: The example of the previous slide would require h ~ 0.1
to accurately represent cos(t). It is the stability of forward Euler
that requires h ~ 0.002.
Remark: The example can be made even stiffer by increasing the
constant 1000.
35/37
Backward Euler applied to the test equation gives
    yi+1 = yi / (1 - h lambda),
which is stable for every h > 0 when lambda < 0.
36/37
I
I
j = 1, . . . , m.
37/37
Topics of today
1. RungeKutta methods
2. Stability of single-step methods
3. Multistep methods
References
I
2/44
Introduction
Initial value problem for ordinary differential equation (ODE): find
function y (t) that satisfies
y0 (t) =
dy(t)
= f (t, y(t))
dt
(1)
(2)
3/44
4/44
5/44
Taylor expansion
    y(t + h) = y(t) + h y'(t) + (h^2/2) y''(t) + . . . + (h^p/p!) y^(p)(t + theta h),
so
    (y(t + h) - y(t))/h = f(t, y) + (h/2) [f_t(t, y) + f_y(t, y) f(t, y)] + . . .
6/44
Matching a method yj+1 = yj + h [a1 k1 + a2 k2] with
k2 = f(t + p1 h, y + p2 h k1) gives the order-2 conditions
    a1 + a2 = 1,   a2 p1 = 1/2,   a2 p2 = 1/2.
7/44
    k1 := f(t, y),
    k2 := f(t + h/2, y + (h/2) k1),
    k3 := f(t + h/2, y + (h/2) k2),
    k4 := f(t + h,   y + h k3).
8/44
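One step of the classical scheme in Matlab (a minimal sketch; the function name is my own):

function y1 = rk4step(f, t, y, h)
% RK4STEP one step of the classical 4th-order Runge-Kutta method.
k1 = f(t, y);
k2 = f(t + h/2, y + h/2*k1);
k3 = f(t + h/2, y + h/2*k2);
k4 = f(t + h,   y + h*k3);
y1 = y + h/6*(k1 + 2*k2 + 2*k3 + k4);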
9/44
10/44
    Euler error  rate | RK2 error  rate | RK4 error  rate
    4.7e-3            | 3.3e-4          | 2.0e-7
    2.3e-3       1.01 | 7.4e-5     2.15 | 1.4e-8     3.90
    1.2e-3       1.01 | 1.8e-5     2.07 | 8.6e-10    3.98
    4.6e-4       1.00 | 2.8e-6     2.03 | 2.2e-11    4.00
    2.3e-4       1.00 | 6.8e-7     2.01 | 1.4e-12    4.00
    1.2e-4       1.00 | 1.7e-7     2.01 | 8.7e-14    4.00
    4.6e-5       1.00 | 2.7e-8     2.00 | 1.9e-15    4.19
where rate(h) = log2( e(2h)/e(h) ).
11/44
Predator-prey example:
    y1(0) = 80,   y2' = -y2 + .01 y1 y2,   y2(0) = 30.
Prey: y1, predator: y2.
12/44
RungeKutta methods
Definition: A s-stage (explicit) RungeKutta method is a
single-step method where (t, y, h) is defined by the
2s 1 + s(s 1)/2 real numbers ij , 2 i s, 1 j < i, and
c1 , . . . , cs , 2 , . . . , s :
(t, y, h) := c1 k1 + + cs ks
where
k1 := f(t, y),
k2 := f(t + 2 h, y + h21 k1 ),
k3 := f(t + 3 h, y + h(31 k1 + 32 k2 )),
..
.
ks := f(t + s h, y + h(s1 k1 + + s,s1 ks1 )).
13/44
The coefficients are arranged in a Butcher tableau:
    alpha_2 | beta_21
    alpha_3 | beta_31 beta_32
    ...
    alpha_s | beta_s1 beta_s2 . . . beta_{s,s-1}
            | c1      c2      . . . c_{s-1}  cs
Psi(t, y, h) is consistent if c1 + c2 + . . . + cs = 1.
For high order approximation we have to have
    alpha_i = Sum_{l=1}^{i-1} beta_il.
`=1
14/44
Examples (s = 1, 2):
    Forward Euler:              |      with c = (1)
    Modified Euler (midpoint):  1/2 | 1/2      with c = (0, 1)
    Heun (trapezoidal):         1   | 1        with c = (1/2, 1/2)
15/44
Classical 4th-order RK (order 4, related to Simpson's quadrature rule):
    1/2 | 1/2
    1/2 | 0    1/2
    1   | 0    0    1
        | 1/6  2/6  2/6  1/6
Kutta's 3/8 rule (also order 4):
    1/3 | 1/3
    2/3 | -1/3 1
    1   | 1    -1   1
        | 1/8  3/8  3/8  1/8
16/44
17/44
Apply RK4 to y' = lambda y, y(0) = 1, lambda in C:
18/44
    k1 = lambda yj,
    k2 = lambda (yj + (h/2) k1) = (lambda + (h/2) lambda^2) yj,
    k3 = lambda (yj + (h/2) k2)
       = (lambda + (h/2) lambda^2 + (h^2/4) lambda^3) yj,
    k4 = lambda (yj + h k3)
       = (lambda + h lambda^2 + (h^2/2) lambda^3 + (h^3/4) lambda^4) yj.
For
    yj+1 = yj + (h/6) {k1 + 2 k2 + 2 k3 + k4}
19/44
we get
    yj+1 = F(h lambda) yj,
    F(z) = 1 + z + z^2/2 + z^3/6 + z^4/24.
Note that these are the first five terms of the Taylor series of e^z.
20/44
Therefore:
|F(h lambda)| < 1 cannot hold for all negative values of h lambda.
21/44
With z = h lambda, the stability region is {z in C : |F(z)| <= 1}.
22/44
23/44
Time-marching method
Let's try to determine the temperature u(x) in a 1D room of
width 1, when the temperatures at the walls are 0 and 1. The heat
(diffusion) equation is
    du/dt = d^2 u / dx^2,
    u(0, t) = 0,   u(1, t) = 1,
with steady-state boundary values
    u(0) = 0,   u(1) = 1.
24/44
25/44
26/44
Semi-discretization in space gives u' = A u + b for all i; the trapezoidal
rule in time (Crank-Nicolson) reads
    (I - (Dt/2) A) u_{j+1} = (I + (Dt/2) A) u_j + Dt b.
27/44
28/44
Multistep methods
I
29/44
Linear s-step method:
    Sum_{j=0}^s alpha_j y_{i+1-j} = h Sum_{j=0}^s beta_j f_{i+1-j}.
However, the explicit RK2 and RK4 are not in the linear
multistep form.
30/44
31/44
Integrate the interpolant of f through (t_{i-1}, f_{i-1}), (t_i, f_i)
over [t_i, t_{i+1}]:
    Int_{ti}^{ti+1} p(t) dt = h Int_0^1 p(ti + tau h) d tau
        = h Int_0^1 [fi + tau (fi - fi-1)] d tau
        = h ( (3/2) fi - (1/2) fi-1 ).
So, the method is
    yi+1 = yi + (h/2) [3 fi - fi-1]   (Adams-Bashforth 2);
    d_h(ti+1) = C h^2 y'''(ti) + O(h^3), the method has order 2.
32/44
    (1,1) error  rate | (2,2) error  rate | (4,4) error  rate
    4.7e-03           | 9.3e-04           | 1.6e-04
    2.3e-03      1.01 | 2.3e-04      2.02 | 1.2e-05      3.76
    1.2e-03      1.01 | 5.7e-05      2.01 | 7.9e-07      3.87
    4.6e-04      1.00 | 9.0e-06      2.01 | 2.1e-08      3.94
    2.3e-04      1.00 | 2.3e-06      2.00 | 1.4e-09      3.97
    1.2e-04      1.00 | 5.6e-07      2.00 | 8.6e-11      3.99
    4.6e-05      1.00 | 9.0e-08      2.00 | 2.2e-12      3.99
33/44
Adams-Moulton (implicit):
    yi+1 = yi + (h/12) [ -fi-1 + 8 fi + 5 fi+1 ].
34/44
    (1,1) error  rate | (1,2) error  rate | (3,4) error  rate
    4.6e-03           | 1.8e-04           | 1.1e-05
    2.3e-03      0.99 | 4.5e-05      2.01 | 8.4e-07      3.73
    1.1e-03      1.00 | 1.1e-05      2.00 | 5.9e-08      3.85
    4.6e-04      1.00 | 1.8e-06      2.00 | 1.6e-09      3.92
    2.3e-04      1.00 | 4.5e-07      2.00 | 1.0e-10      3.97
    1.2e-04      1.00 | 1.1e-07      2.00 | 6.5e-12      3.98
    4.6e-05      1.00 | 1.8e-08      2.00 | 1.7e-13      3.99
35/44
Order conditions: define
    C0 = Sum_{j=0}^s alpha_j,
    Ci = (-1)^i [ (1/i!) Sum_{j=1}^s j^i alpha_j
                  + (1/(i-1)!) Sum_{j=0}^s j^{i-1} beta_j ],   i = 1, 2, . . .
The method has order p iff C0 = C1 = . . . = Cp = 0 and Cp+1 /= 0.
36/44
For the 2-step Adams-Bashforth method the conditions give the small
linear system with solution
    beta_1 = 3/2,   beta_2 = -1/2,
and the local truncation error
    d_h(ti) = (5/12) h^2 y'''(ti) + O(h^3).
37/44
BDF methods
BDF methods are implicit multistep methods of the form
    Sum_{j=0}^s alpha_j y_{i+1-j} = h beta_0 f_{i+1}.
38/44
    yi+1 = (4/3) yi - (1/3) yi-1 + (2h/3) fi+1.    (BDF2)
39/44
    (1,1) error  rate | (2,2) error  rate | (4,4) error  rate
    4.6e-03           | 7.3e-04           | 7.6e-05
    2.3e-03      0.99 | 1.8e-04      2.02 | 6.1e-06      3.65
    1.1e-03      1.00 | 4.5e-05      2.00 | 4.3e-07      3.81
    4.6e-04      1.00 | 7.2e-06      2.00 | 1.2e-08      3.91
    2.3e-04      1.00 | 1.8e-06      2.00 | 7.8e-10      3.96
    1.2e-04      1.00 | 4.5e-07      2.00 | 4.9e-11      3.98
    4.6e-05      1.00 | 7.2e-08      2.00 | 1.3e-12      3.99
40/44
Predictor-corrector methods
I
41/44
42/44
Prediction (Adams-Bashforth):
    y0_{j+1} = yj + (h/12) [ 5 fj-2 - 16 fj-1 + 23 fj ].
Correction (Adams-Moulton):
    yj+1 = yj + (h/24) [ 9 f(tj+1, y0_{j+1}) + fj-2 - 5 fj-1 + 19 fj ].
43/44
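One PECE step of this pair in Matlab (a minimal sketch; fm2, fm1, f0 denote f_{j-2}, f_{j-1}, f_j):

function [ynew, fnew] = pece(f, t, y, h, fm2, fm1, f0)
% PECE one predictor-corrector step: AB3 predictor, AM3 corrector.
yp   = y + h/12*(5*fm2 - 16*fm1 + 23*f0);      % Predict
fp   = f(t + h, yp);                           % Evaluate
ynew = y + h/24*(9*fp + fm2 - 5*fm1 + 19*f0);  % Correct
fnew = f(t + h, ynew);                         % Evaluate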
However, for stiff problems, the BDF methods are still popular.
44/44
Topics of today
1. Stiff problems
2. Step size control
3. Wrap-up of the whole lecture
References
I
2/28
Introduction
Initial value problem for ordinary differential equation (ODE): find
function y (t) that satisfies
dy (t)
= f (t, y (t))
dt
(1)
(2)
3/28
Stiff problems
I
4/28
Stiff system
Stiffness involves the eigenvalues of the Jacobian matrix
    J = df/dy = ( dfi/dyj )_{i,j}
      = [ df1/dy1 . . . df1/dyn
          ...
          dfm/dy1 . . . dfm/dyn ].
5/28
6/28
Example (chemical reaction kinetics): the 8-by-8 Jacobian contains
constant entries such as -1.71, .43, 8.32, -10.03, .035, -1.12, -1.745,
.69, -1.81 and state-dependent entries such as 280 y8 and 280 y6.
Initial condition: y(0) = y0 = (1, 0, 0, 0, 0, 0, 0, .0057)^T.
7/28
8/28
y (0) = ,
0 t 2/.
9/28
10/28
Example: a small linear system with solution of the form
    y(t) = c1 e^{-0.5 t} + c2 e^{-45 t} + c3 e^{-75 t}
(componentwise, with fixed vectors).
The system has very different decay rates.
11/28
At t = 0.3 also the e^{-45t} term has become so small that we just
have to make sure that it does not blow up. (h ~ h-bar is fine.)
12/28
13/28
14/28
15/28
16/28
Step size selection: choose the new step as
    h_new = h ( tol / |y^_{i+1} - y_{i+1}| )^{1/q},
with q the order of the lower-order method of the pair.
17/28
18/28
19/28
An embedded pair advances with a linear combination
    yj+1 = yj + h ( c1 k1 + c3 k3 + c4 k4 + c5 k5 + c6 k6 )
(coefficients with denominators 450, 25, 225, 30, 25 in the original),
and the error is estimated by
    ||y^_{j+1} - yj+1|| ~ |d_h(tj+1)|
        ~ (h/300) || -2 k1 + 9 k3 - 64 k4 - 15 k5 + 72 k6 ||.
20/28
y (0) = 1.
21/28
22/28
Restricted three-body problem (Arenstorf orbit):
    u1'' = u1 + 2 u2' - mu' (u1 + mu)/D1 - mu (u1 - mu')/D2,
    u2'' = u2 - 2 u1' - mu' u2/D1 - mu u2/D2.
Initial conditions:
    u1(0) = 0.994, u2(0) = 0, u1'(0) = 0,
    u2'(0) = -2.00158510637908.
Solution is periodic with period 17.0652 . . .
23/28
24/28
    r(t) = sqrt( u(t)^2 + v(t)^2 ).
25/28
Epilog
I
26/28
Epilog (cont.)
I
Examination
I
I
I
I
I
I
27/28
Epilog (cont.)
28/28