0% found this document useful (0 votes)
64 views

Lecture Notes Introduction To Pdes and Numerical Methods: Winter Term 2002/03

This document provides lecture notes on partial differential equations (PDEs) and numerical methods. The first chapter introduces PDEs through the derivation of the heat equation from physical principles like energy conservation. It presents analytical solutions to the heat equation for simple cases and introduces finite difference methods for numerical solutions. It discusses spatial and temporal discretization, stability analysis, and extending the methods to multiple dimensions.

Uploaded by

Sebastian VP
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
64 views

Lecture Notes Introduction To Pdes and Numerical Methods: Winter Term 2002/03

This document provides lecture notes on partial differential equations (PDEs) and numerical methods. The first chapter introduces PDEs through the derivation of the heat equation from physical principles like energy conservation. It presents analytical solutions to the heat equation for simple cases and introduces finite difference methods for numerical solutions. It discusses spatial and temporal discretization, stability analysis, and extending the methods to multiple dimensions.

Uploaded by

Sebastian VP
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 104

Lecture Notes

Introduction to PDEs and Numerical


Methods
Winter Term 2002/03

Hermann G. Matthies
Oliver Kayser-Herold
Institute of Scientific Computing
Technical University Braunschweig
Contents

1 An Introductory Example 5
1.1 Derivation of the PDE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Energy Conservation . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.2 From the Integral Form to the PDE . . . . . . . . . . . . . . . . . . 7
1.1.3 Constitutive Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.4 Initial and Boundary Conditions . . . . . . . . . . . . . . . . . . . . 9
1.1.5 General Way of Modelling Physical Systems . . . . . . . . . . . . . 9
1.2 Analytical Solutions of PDEs . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.1 Heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.2 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2.3 General Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.4 Solutions with Source Terms and Initial Conditions . . . . . . . . . 14
1.3 Non-Dimensional Form of the Heat Equation . . . . . . . . . . . . . . . . . 14
1.4 Finite Difference methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4.1 Spatial approximation of the heat equation . . . . . . . . . . . . . . 19
1.4.2 Method of Lines / Semi-Discrete Approximation . . . . . . . . . . . 21
1.4.3 Analysis of the Spatial Discretisation . . . . . . . . . . . . . . . . . 21
1.4.4 Time Discretisation . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.4.5 Von Neumann Stability Analysis . . . . . . . . . . . . . . . . . . . 30
1.4.6 Stability and Consistency . . . . . . . . . . . . . . . . . . . . . . . 33
1.5 FD Methods in More Dimensions . . . . . . . . . . . . . . . . . . . . . . . 38
1.5.1 Basic Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
1.5.2 Computational Molecules/Stencils . . . . . . . . . . . . . . . . . . . 40
1.5.3 Boundary Treatment . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.5.4 Time Discretisation . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

2
CONTENTS 3

2 Equilibrium Equation and Iterative Solvers 42


2.1 Equilibrium equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.2 Iterative methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2.1 Timestepping, Richardson’s Method . . . . . . . . . . . . . . . . . . 44
2.2.2 Jacobi’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.2.3 Matrix Splitting methods . . . . . . . . . . . . . . . . . . . . . . . 45
2.3 Multigrid methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.3.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.3.2 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.3.3 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3 Weighted residual methods 58


3.1 Basic theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.1.1 Weak form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.1.2 Variational formulation . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.1.3 Numerical methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.2 Example: The Finite Element method . . . . . . . . . . . . . . . . . . . . 61
3.2.1 Nodal basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.2.2 Matrix assembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.3 Example: The Finite Volume method . . . . . . . . . . . . . . . . . . . . . 65
3.4 Higher dimensional elements . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.1 Isoparametric mapping . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.2 Quadrilateral elements . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.4.3 Triangular elements . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.4.4 Higher order elements . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.5 Time dependent problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

4 Hyperbolic equations 80
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.1.1 Telegraph equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.1.2 Analytical solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.1.3 Fourier series solution . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.1.4 D’Alambert’s solution . . . . . . . . . . . . . . . . . . . . . . . . . 88
4 CONTENTS

4.1.5 Characteristics of 1st order equations . . . . . . . . . . . . . . . . . 89


4.1.6 Group velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.1.7 Eigenvector decomposition . . . . . . . . . . . . . . . . . . . . . . . 93
4.2 Numerical methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.2.1 Finite difference approximation . . . . . . . . . . . . . . . . . . . . 94
4.2.2 Stability analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.2.3 Friedrich’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.4 Lax-Wendroff method . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.2.5 Dispersion of numerical methods . . . . . . . . . . . . . . . . . . . 99
4.3 Time integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.3.1 General remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.3.2 Analysis of the time integration . . . . . . . . . . . . . . . . . . . . 101
Chapter 1

An Introductory Example

In this introductory chapter we will go through the steps of setting up a mathematical


model for heat conduction. This will be derived from basic physical principles and will
lead to the integral form of a partial differential equation. We will look at exact solutions
for very idealised situations, in order to see the typical behaviour. For more complicated
circumstances we have to resort to numerical methods. These will again be studied in a
very idealised setting.

1.1 Derivation of the PDE


To illustrate the way how to derive a partial differential equation describing a physical
system out of the basic laws of physics we will consider a simple rod consisting of a normal
material (Fig. 1.1).
The rod should be insulated against any heat loss on the whole length. Only at the ends
it can gain or loose heat. We are interested in the temperature distribution inside this rod
at a specific time. As for most dynamical systems we must know the exact state at a given
time t0 . And certainly the temperature at both ends is important, too.

1.1.1 Energy Conservation


The conservation law that seems right for this problem is the conservation of energy because
the temperature is equivalent to the motion energy of the molecules that build the rod.
First we have the heat (or thermal energy) in the rod. Its density per unit length is:

A · ρ · c · θ(x, t) (1.1)

Here ρ is the density of the material, c the specific heat capacity, and A the cross sectional
area of the rod. They could all be functions of space and temperature, but for the sake

5
6 Chapter 1. An Introductory Example

of simplicity we shall assume them to be constant. The function θ(x, t) describes the
temperature at a given point in space and time. And so the change of energy of an
arbitrary piece of rod from a to b is:
Z b

R1 = A · ρ · c · θ(x, t)dx (1.2)
∂t a

Next we will take a look at the energy that goes into the rod or out of it. As mentioned
before this can only happen at the two ends of the rod.
There we have the heat flow which is described by the function q(x, t). So the energy that
goes into the rod is:

R2 = A · (q(a, t) − q(b, t)) = −A · (q(b, t) − q(a, t)) (1.3)

This equation can be transformed with the fundamental theorem of calculus into:
Z b 

−A · (q(b, t) − q(a, t)) = −A q(x, t) dx (1.4)
a ∂x

Finally we assume an internal source of heat. This effect should model something similar
to a microwave oven which heats something from the inside. We introduce the function
h(x, t) which describes the power density of additional heat sources.
Z b
R3 = A · h(x, t)dx (1.5)
a

Conservation of energy means that we must have

R1 = R2 + R3 (1.6)

Inserting the equations again into the short form gives:


Z b Z b Z b
∂ ∂
Acρθ(x, t)dx = −A q(x, t)dx + A h(x, t)dx (1.7)
∂t a a ∂x a

Separating the parts of the equation with known functions and the parts with unknown
functions leads to:
Z b Z b Z b
∂ ∂
Acρ θ(x, y)dx + A q(x, t)dx = A h(x, t)dx (1.8)
a ∂t a ∂x a

Finally we obtain for any a, b ∈ [0, l]:


1.1. Derivation of the PDE 7

Z b  Z b
∂ ∂
cρ θ(x, t) + q(x, t) dx = h(x, t)dx (1.9)
a ∂t ∂x a

This is the integral form of the PDE.

1.1.2 From the Integral Form to the PDE


Up to this point all equations were integral equations which gave some restrictions on the
solution. Now the following lemma allows the transformation of these Integrals into a PDE
under some conditions:

Lemma 1 (Fundamental lemma of calculus of variations) Let ϕ be a continuous func-


tion ϕ : [A, B] → R. If for arbitrary a, b ∈ [A, B] with b > a
Z b
ϕ(x)dx = 0, (1.10)
a
then
∀x ∈ [a, b], ϕ(x) = 0 (1.11)

Proof (by contradiction) :

Assume ∃x0 : ϕ(x0 ) > 0


ϕ continuous ⇒ there is a neighbourhood of x0 , ([x0 − ε, x0 + ε]) where ϕ(x) ≥ δ > 0.
Then with a = x0 − ε, b = x0 + ε
Z b Z b Z b
ϕ(x)dx ≥ δdx = δ dx = δ(b − a) = 2δε > 0 (1.12)
a a a

in contradiction to Eq. (1.10).


Going back to the relations describing the heat transfer in the rod, we have the following
equation:
Z b 
∂ ∂
cρ θ(x, t) + q(x, t) − h(x, t) dx = 0 (1.13)
a ∂t ∂x
| {z }
ϕ(x)

If we assume that ϕ(x) is continuous then the fundamental lemma of variational calculus
gives directly the differential or pointwise form of the PDE:

∂ ∂
cρ θ(x, t) + q(x, t) = h(x, t) (1.14)
∂t ∂x
8 Chapter 1. An Introductory Example

With given boundary conditions q(a, t), q(b, t) and initial conditions θ(x, 0).
One important aspect of this assumption is that the expression under the integral has
to be continuous. In contrast the original integral equation can also be satisfied by a
discontinuous function which may appear in real life problems. So one must keep in mind
that the partial differential equations come originally from the integral form and therefore
the strict continuity requirements of the PDE may sometimes be neglected. In fact, in the
sequel unless stated otherwise, write the differential form – as it is simpler – but we will
mean the integral form.

1.1.3 Constitutive Laws


To get a solvable equation one of the two unknown functions must be replaced by a known
function. Often this is done with a constitutive law which connects two physical properties
with a function. For the heat equation the Fourier Law provides this kind of function.


q(x, t) = −λ θ(x, t) (1.15)
∂x
Where λ is the heat conductivity. This again could be a function of temperature or position,
but again for simplicity we shall assume it constant. Inserting this constitutive law into
the PDE gives finally the well known heat equation:
 
∂ ∂ ∂
ρ θ(x, t) − λ θ(x, t) = h(x, t) (1.16)
∂t ∂x ∂x

sorting the constants gives:

∂2
 
∂ λ h(x, t)
θ(x, t) − 2
θ(x, t) = = η(x, t) (1.17)
∂t cρ ∂x cρ

The time derivative will be abbreviated with a superposed dot:


θ(x, t) = θ̇(x, t) (1.18)
∂t
Another possible constitutive law which can be applied in this context is the law of con-
vective transport. While Fourier’s law describes a slow diffusive transport of energy, the
convective transport is similar to putting a cup of hot water into a river. The energy is
transported with the speed of the water flowing in the river:

q(x, t) = cρθv (1.19)

where v is the velocity of the transport medium.


1.1. Derivation of the PDE 9

1.1.4 Initial and Boundary Conditions


Most PDEs have an infinite number of admissible solutions. Thus the PDE alone is not
sufficient to get a unique solution. Usually some boundary conditions and initial conditions
are required.
For the heat equation the simplest boundary conditions are fixed temperatures at both
ends:

θ(0, t) = h1 (t) (1.20)


θ(l, t) = h2 (t) (1.21)

where l is the length of the rod and h1 (t) the temperature at the first end and h2 (t) the
temperature at the second end.
The initial conditions specify an arbitrary initial temperature distribution inside the rod:

θ(x, 0) = θ0 (x) (1.22)


.

1.1.5 General Way of Modelling Physical Systems


Basically many PDEs in mathematical physics are derived in the way, shown in the exam-
ple. So if we have a quantity with density u which should be conserved, the change of that
quantity for an arbitrary piece [a, b] is:
Z b

udx (1.23)
∂t a

It is equal to the amount going in or out through the boundary with flow density p:

−p|ba (1.24)

and the amount generated or consumed inside the domain:


Z b
j(x)dx (1.25)
a

which finally gives us the general form of a conservation law:


Z b Z b

udx = −p|ba + j(x)dx (1.26)
∂t a a
10 Chapter 1. An Introductory Example

Z b Z b Z b
∂ ∂
⇒ udx = − p+ j(x)dx (1.27)
∂t a a ∂x a

The situation does not change if the domain is part of a multidimensional space like R2
or R3 . Only the flux into the domain changes a little bit when going from 1D to higher
dimensions. If we consider a domain Ω in R2 or R3 , and an arbitrary part V with a given
flux field p on the boundary ∂V (see Fig. 1.2) the amount which goes into the domain
through a point on ∂V is exactly p · n where n is the normal vector in that point. Here
∂V denotes the boundary of V .
So the conservation law becomes:
Z Z Z

u dV = − p · n dS + j dV (1.28)
∂t V ∂V V

This equation must also be satisfied on every small subdomain V of Ω. Applying the Gauss-
Theorem to the integral over the boundary in Eq. (1.29) gives finally for any subdomain
V ⊂ Ω:

Z z ϕ
}| {
u̇ + div p − j dV = 0 (1.29)
V

This is again the integral form of the PDE. If the expression under the integral in Eq. (1.29)
– th function ϕ – is continuous, we may again use the fundamental lemma of the calculus
of variations (suitably modified for higher dimensions), to arrive at the differential form:

u̇ + div p − j = 0 (1.30)
If we introduce the characteristic function of the subdomain V which is defined as:

1 if x ∈ V
χV (x) = (1.31)
0 otherwise
the condition that the conservation is also satisfied on every subdomain can be written as:
Z
χV (u̇ + div p − j) dV = 0, ∀χV (1.32)

The integral is now over the complete domain Ω. If we take linear combinations of different
χV , and with certain continuity arguments we may deduce that instead of χV in Eq. (1.32)
we may take any function ψ such that the integral
Z
ψ(u̇ + div p − j) dV = 0, ∀ψ (1.33)

is still meaningful. This is the so called weak form of the PDE.


1.2. Analytical Solutions of PDEs 11

1.2 Analytical Solutions of PDEs


Although most Partial Differential Equations have no closed solution on complex domains,
it is possible to find solutions for some basic equations on simple domains. They are
especially important to verify the accuracy and correctness of numerical methods.

1.2.1 Heat equation


We will start again with the heat equation for the rod from section 1.1. It can be written
– without convection – in a simplified form as:

∂u ∂2u
− β2 2 = 0 (1.34)
∂t ∂x
Initial and boundary conditions are also required. But these conditions are not necessary
for the first steps. The first thing on the way towards a solution is an idea how the function
which satisfies the PDE should look like. Here we assume that the solution is the product
of two unknown functions A(x) and B(t) – a so called product-ansatz:

u(x, t) = A(x) · B(t) (1.35)

After that the partial derivatives of u with respect to t and x can be computed:

∂u
= u̇ = A(x) · Ḃ(t) (1.36)
∂t
∂u
= u0 = A0 (x) · B(t) (1.37)
∂x
∂2u
= u00 = A00 (x) · B(t) (1.38)
∂x2
(A dot means the time derivative while the prime denotes the spatial derivate). Inserting
these derivatives into the original PDE gives the following result:

A(x) · Ḃ(t) − β 2 A00 (x) · B(t) = 0 (1.39)


2 00
or A(x) · Ḃ(t) = β A (x) · B(t) (1.40)

Obviously the trivial solution u(x, t) = 0 satisfies the PDE, but we are not interested in
the trivial solution, so we can assume that u(x, t) = A(x)B(t) 6= 0 and thus multiply with
1
AB
:

Ḃ(t) A00 (x)


= β2 (1.41)
B(t) A(x)
12 Chapter 1. An Introductory Example

This equation can only be satisfied if both sides are constant. So it is possible to introduce
a constant κ2 :

Ḃ(t) A00 (x)


= β2 = −κ2 (1.42)
B(t) A(x)

From this we get the following equation:

Ḃ(t) = −κ2 B(t) (1.43)

It is easy to see that the solution of that equation is the exponential function:

2t
B(t) = B0 e−κ (1.44)

Applying the same steps to the second part of Eq. (1.42) gives:

κ2
A00 (x) = − A(x) (1.45)
β2
with the solutions

κ κ
A(t) = cos x, A(t) = sin x (1.46)
β β

Finally going back to the ansatz Eq. (1.35) we get:

2 κ
A(x)B(t) = B0 e−κ t cos x (1.47)
β
2 κ
and A(x)B(t) = B0 e−κ t sin x (1.48)
β
as solutions for the heat equation.

1.2.2 Boundary Conditions


If we want to impose the boundary conditions u(0, t) = 0 and u(l, t) = 0 on the beginning
and the end of the rod, the parameters κ and β have to satisfy certain conditions depending
on the length of the rod l:

2
A(0)B(t) = B0 e−κ t sin 0 = 0 (1.49)
2 κ
A(L)B(t) = B0 e−κ t sin l = 0 (1.50)
β
1.2. Analytical Solutions of PDEs 13

Condition (1.49) is always satisfied but Eq. (1.50) leads to the following relation between
κ and an arbitrary integer k :

κk β
L = kπ ⇒ κk = kπ (1.51)
β L

1.2.3 General Solution


Because the heat equation is a linear PDE the sum of two functions satisfying the PDE is
also a solution of the PDE. This leads to the following equation:


X 2 κk
θ(x, t) = Bk eκk t sin( x) (1.52)
k=1
β

The type of solution is only valid for some special boundary conditions (i.e. u(0, t) = 0
and u(l, t) = 0). But by also using the cosine functions it is possible to satisfy arbitrary
boundary conditions.
The function which defines the initial conditions must be decomposed into sines and cosines
by a Fourier analysis to find the parameters Bk for the initial conditions.
Another solution can be obtained by integrating the solution from −∞ to +∞:
Z +∞ 2
2 κ 1 − x
e−κ t cos x dκ = √ e 4β2 t (1.53)
−∞ β 2β πt

This solution is called the fundamental solution of the heat equation (cf. also Fig. 1.3). In-
troducing a coordinate transform gives the following more general form of the fundamental
solution:

1 −(x−ξ)2
θ(x, t) = p e 4β 2 t (1.54)
4β 2 πt

Here ξ is the parameter which specifies the distance the function is shifted along the x-axis.
Although it might seem that the function disappears slowly the following equation holds:
Z ∞
∀t > 0 θ(x, t) dx = 1 (1.55)
−∞

As t → 0+ the fundamental solution approaches the so called Delta Function denoted by


δ(x), which is not a function in the classical meaning. Looking at the graph of the function
(Fig. 1.3) one might guess what it looks like. At an infinitely small part of the X-axis
centred around zero the function has an infinite value.
14 Chapter 1. An Introductory Example

It is only defined in a weak sense. That means only the integral of this function together
with another function v(x) ∈ C 0 (R) has a defined value:
Z +∞
δ(x)v(x) dx = v(0) (1.56)
−∞

Z +∞
and δ(x − ξ)v(x) dx = v(ξ) (1.57)
−∞

Using the following limit:


Z ∞
lim θ(x, t)v(x) dx = v(0) (1.58)
t→0 −∞

shows that θ(x, 0) must be the Delta Function.

1.2.4 Solutions with Source Terms and Initial Conditions


Using the property that θ(x, 0) is the Delta Function and the linearity of the Laplace oper-
ator allows the construction of analytical solutions which satisfy arbitrary initial conditions
or functions generating energy or heat.
Without more explanations the following equations come out:
a.) and internal sources h(t, x)

t
x2
Z
x
θ̂(t, x) = p h(t − τ )τ 3/2 exp(− )dτ (1.59)
4β 2 π 0 4β 2 τ

b.) initial conditions θ(0, x) = f (x) and no internal sources.

+∞
(x − ξ)2
Z
1
θ̃(t, x) = p f (ξ) exp(− )dξ (1.60)
4β 2 πt −∞ 4β 2 t

1.3 Non-Dimensional Form of the Heat Equation


In this section the behaviour of the PDE for different scales should be examined. One
example may be the diffusion of some chemical substances in the sea which a large ship
looses through a leakage, another may be one drop of milk in a cup of coffee. First step in
this examination is the introduction of a coordinate transformation, to make all quantities
in the equation non-dimensional
1.3. Non-Dimensional Form of the Heat Equation 15

θ = ϑ(x, y, z, t)θ̄ (1.61)

with

x = ξL (1.62)
y = ηL (1.63)
z = ζL (1.64)
t = τ ·T (1.65)

where L is a reference length and T a reference time.


This time we will consider the heat equation together with convective transport:

θ̇ − β 2 ∆θ + v T ∇θ = 0 (1.66)

Here v is the velocity of the convective transport. Perhaps the gulf stream or stirring the
cup of coffee. Now the partial derivatives in Eq. (1.66) must be replaced by the derivatives
with respect to the new variables ξ, η, ζ and τ .

∂ 1 ∂ ∂2 1 ∂2
= ⇒ = (1.67)
∂x L ∂ξ ∂x2 L2 ∂ξ 2
2
∂ 1 ∂ ∂ 1 ∂2
= ⇒ = (1.68)
∂y L ∂η ∂x2 L2 ∂η 2
2
∂ 1 ∂ ∂ 1 ∂2
= ⇒ = (1.69)
∂z L ∂ζ ∂x2 L2 ∂ζ 2
∂ 1 ∂
= · (1.70)
∂t T ∂τ

And the velocity of the convective flow must obviously also be adapted to the new scales:

L
v=υ (1.71)
T
With these equations the gradient and the Laplacian become:

T
∂2 ∂2 ∂2
  
∂ ∂ ∂
∇ξ = , , and ∆ξ = + + (1.72)
∂ξ ∂η ∂ζ ∂ξ 2 ∂η 2 ∂ζ 2

and the heat equation thus:


16 Chapter 1. An Introductory Example

1 ∂ β 2 θ̄ θ̄
ϑθ̄ − ∆ξ ϑ + υ T · ∇ξ ϑ = 0 (1.73)
T ∂τ L T
Multiplying with T and dividing by θ̄ gives:

∂ 1
ϑ− ∆ξ ϑ + υ T · ∇ξ ϑ = 0 (1.74)
∂τ Pe
Where P e = βL2 T . In this equation the reference time and length totally disappeared except
for the factor 1/P e in front of the Laplacian. As β 2 = cρλ
, we have P e = cρ·L
λ·T
. It is a non-
dimensional number like in many other areas (Reynolds number, Mach number, . . .). All
scales of the actual configuration go into that number. So physical phenomena on domains
with totally different sizes and different materials can have the same behaviour if their
Peclet number is the same.
1.3. Non-Dimensional Form of the Heat Equation 17

Area A

q(b,t)

X
l
b
Isolation

a
q(a,t)

Figure 1.1: Insulated rod


18 Chapter 1. An Introductory Example

p
n

Figure 1.2: The domain Ω and a part V

0.8
0.6 1
0.4
0.8
0.2
0 0.6
-4
t
-2
0.4
0
x 2 0.2
4

Figure 1.3: Fundamental solution of the heat equation


1.4. Finite Difference methods 19

1.4 Finite Difference methods


One result of the last section was the PDE which describes the heat transfer in a insulated
rod. Furthermore, several analytical solutions of this PDE were presented. But these
solutions satisfied very special initial and boundary conditions. If we want to solve real
problems with arbitrary boundary and initial conditions, it will almost be impossible to
find analytical solutions.
Thus this section will show one possible way to find a numerical approximation for the
solution of the PDE. Next the properties of this approximation will be compared with the
properties of the analytical solution. At the end some other schemes will be introduced
and analysed.

1.4.1 Spatial approximation of the heat equation


If we consider again the heat equation:

∂u
− β 2 ∆u = f, (1.75)
∂t
∀x u(x, 0) = ũ0 (x) given, (1.76)
∀t > 0 u(0, t) = û0 (t), (1.77)
u(l, t) = 0. (1.78)

we see two partial derivatives. One with respect to time and the other with respect to
spatial variables. Although some newer methods (Time-Space Finite Elements) treat the
time derivatives in the same way as the spatial derivatives, most classical approaches
separate the time and space directions and start with a numerical approximation of the
space derivative.
Because the real solution u(x, t) of the PDE is defined on infinitely many points inside the
domain, it is impossible to handle the complete function inside the computer. So we must
limit our solution to a finite number of points in space. For simplicity we assume these
points are distributed equidistant on the domain. So each point has a distance of h to its
left and right neighbour.
2
The goal of the approximation is to find an expression for ∂∂xu2 , which depends only on some
neighbour points. One way to derive this expression is a Taylor expansion of u around a
given point x. The first approximation is used for the right neighbour:

∂u 1 ∂2u 2 1 ∂3u 3 1 ∂4u 4


u(x + h) = u(x) + (x)h + h + h + h + O(h5 ), (1.79)
∂x 2 ∂x2 3! ∂x3 4! ∂x4
the second one for the left neighbour of point x:
20 Chapter 1. An Introductory Example

∂u 1 ∂2u 2 1 ∂3u 3 1 ∂4u 4


u(x − h) = u(x) − (x)h + 2
h − 3
h + 4
h − O(h5 ). (1.80)
∂x 2 ∂x 3! ∂x 4! ∂x
Adding Eq. (1.79) and Eq. (1.80) results in:

∂2u 2 2 ∂4u 4
u(x + h) + u(x − h) = 2u(x) + 0 + 2
h + 0 + 4
h + O(h6 ). (1.81)
∂x 4! ∂x

Dividing by h2 and rearranging gives:

∂2u 1 1 ∂4u 2
= (u(x + h) − 2u(x) + u(x − h)) − h + O(h4 ). (1.82)
∂x2 h2 12 ∂x4
As we only want to use the values at the points x − h, x, x + h, we may shorten this to

∂2u 1
2
= 2 (u(x + h) − 2u(x) + u(x − h)) + O(h2 ). (1.83)
∂x h
Because we have a finite number of equidistant points it is possible to label these points
from 0 to N , where h · N = l. At a typical point xj = x0 + j · h we introduce the notation

uj := u(xj ) (1.84)
∂uj ∂u( xj )
:= , etc. (1.85)
∂x ∂x

Introducing this numbering gives for an arbitrary point xj :

∂ 2 uj 1
2
= 2 (uj+1 − 2uj + uj−1 ) + O(h2 ) (1.86)
∂x h
This equation provides already an error estimate. Reducing the distance between two
points to one half of the original distance reduces the error to roughly one quarter of the
previous value.
Another way to derive this equation is to use the well known relation that the second
derivative of a function is the derivative of the first derivative of this function. The same
applies to the differences. Here we take the difference between the first forward difference
and the first backward difference.
 
1 uj+1 − uj uj − uj−1 1
− = (uj+1 − 2uj + uj−1 ) (1.87)
h h h h2
1.4. Finite Difference methods 21

1.4.2 Method of Lines / Semi-Discrete Approximation


By inserting the approximation for the second derivative in Eq. (1.75) we obtain approxi-
mately:

∂uj β2
− 2 (uj−1 − 2uj + uj+1 ) = fj (t), j ∈ [1..N − 1] (1.88)
∂t h
The PDE has now become a system of ODEs. Introducing the vector u
 
u1 (t)
..
.
 
 
u(t) =  uj (t) (1.89)
 

 .. 
 . 
uN −1 (t)
allows us to write the system of ODEs in matrix form:

d
u(t) = Au(t) + f (t) (1.90)
dt
with
2
   
β
2 −1 0 0 f1 (t) + ∆x 2 û0 (t)

 −1 2 −1
β2  f2 (t)
  
A=− 2 and f (t) =  (1.91)
  
∆x  . . . . . .  .. 
. . .   . 
0 −1 2 fN −1 (t)
One problem occurs at the boundarys which lie at the points u0 and uN . Here we have
circumvented it by assuming the simple boundary conditions in Eq. (1.75), where the first
(inhomogeneous one) at x0 gives a contribution to the vector f . Other boundary conditions
will be treated later.
The name Method of Lines comes from the fact that we have reduced the original problem
of finding a solution u(x, t) at an infinite number of points in the space-time domain to the
problem of finding solutions uj (t) on a finite number of lines in the space-time domain (cf.
Fig. 1.4). These solutions can be obtained by solving the system of ODEs analytically or
by using another numerical method to discretise these ODEs as well in time.

1.4.3 Analysis of the Spatial Discretisation


In this section a general analytical solution for the system of ODEs which came from the
spatial discretisation will be derived. For simplicity we consider the heat equation with
boundary conditions as in Eq. (1.75), with f ≡ 0 and û0 ≡ 0.
22 Chapter 1. An Introductory Example

j
...

u
h
t

Figure 1.4: Scheme of the Method of Lines

The spatially discrete system Eq. (1.90) from the method of lines then simply reads

u̇ = Au (1.92)

where the matrix A in Eq. (1.91) is symmetric A = AT and thus has the following proper-
ties:

• A has N − 1 orthogonal eigenvectors which form a basis of RN −1

• A has real eigenvalues

For our analysis we need an analytical solution for Eq. (1.92). We start with the following
Ansatz:

u(t) = v · eαt (1.93)

where α is a number and v a vector. Inserting Eq. (1.93) into Eq. (1.92) gives:

αveαt = eαt Av ⇒ Av = αv (1.94)


1.4. Finite Difference methods 23

and hence v and α have to be eigenvector and eigenvalue of A in order that Eq. (1.93) is a
solution of Eq. (1.92). One problem with this solution is that it does not satisfy the initial
conditions u(x, 0) = ũ0 (x).
It is possible to overcome this problem because the eigenvectors of A provide an orthogonal
basis. Every vector of initial conditions can then be build up from the eigenvectors:
 
u1 (0) N −1
u(0) =  ..  X 0
= βj vj (1.95)

.
uN −1 (0) j=1

The solution vector at an arbitrary time is decomposed in the same way:


 
u1 (0) N −1
u(t) =  .
.  X
= βj (t)vj (1.96)

.
uN −1 (t) j=1

Obviously this solution must satisfy the system of ODEs which gives the following relation:

−1 −1
N N
! N −1 N −1
X X X X
β̇j (t)vj = A βj (t)vj = βj (t)Avj = βj (t)λj vj (1.97)
j=1 j=1 j=1 j=1

This leads to the following condition for the variables βj :

N
X −1
(β̇j (t) − βj (t)λj )vj = 0 (1.98)
j=1

As {vj } is a basis, this is only possible if the parenthesised term vanishes for each j.
With this basis transformation it is possible to split the original system of coupled ODEs
into a set of uncoupled linear ODEs:

β̇j (t) = λj βj (t), βj (0) = βj0 (1.99)

with the analytical solutions:

βj (t) = βj0 eλj t (1.100)

After this preparation we have everything together to analyse the behaviour of the ana-
lytical solution of the system of ODEs which we obtained from the spatial discretisation
of the heat equation. One very important thing about the solutions of the heat equation
was the fact that all solutions were decaying if no internal heat sources were present. If
24 Chapter 1. An Introductory Example

our spatial discretisation can not guarantee that these properties remain in the solutions
of the ODEs it will be not very useful, because the goal of our work is to get a method
which can be used to compute reliable predictions.
From Eq. (1.100) it can be seen that the eigenvalues λj of the matrix A are essential for
the solutions. If λj > 0 it is clear that the exponent will grow as time increases and thus
the solution will also grow. So a decaying solution requires that all λj are smaller than
zero. To find out if this is true for our matrix A we need a general eigenvalue analysis
of the matrix A. Fortunately a closed formula exists for the eigenvalues of a tridiagonal
symmetric matrix.

Lemma 2 (Eigenvalues of a tridiagonal matrix) Let A be a symmetric tridiagonal


matrix of size N − 1 × N − 1 with the following structure:
 
a b
 b a b 
 
A=
 . .. .. ...
. 

 
 b a b 
b a

Then the eigenvalues λj of A are:


 

λj = a + 2b cos , j = [1 . . . N − 1]
N
and the eigenvectors vj of A are:
 
vj1  
kjπ
vj =  ...  , vjk = sin , k, j = [1 . . . N − 1]
 
N
vjk

In our case we have:

2β 2 β2
a=− 2 , b= 2 (1.101)
h h
So we obtain:

2β 2 2β 2 2β 2
  
jπ jπ
λj = − 2 + 2 cos = 2 cos −1 (1.102)
h h N h N

The first part of Eq. (1.102) is just a positive constant. So whether the largest eigenvalue is
greater than zero is determined by the last part, which can only become zero if the cosine
becomes one. Because the expression j/N never becomes zero the cosine never reaches 1
1.4. Finite Difference methods 25

and the eigenvalues λj are always negative. This shows that the analytical solutions of
the ODEs will always decay and thus reproduce qualitatively the original behaviour of the
PDE.

1.4.4 Time Discretisation


Although we have found an analytical solution for the system of ODEs coming from the
spatial discretisation this task will become more difficult and most often impossible if we
consider more complex domains. Therefore we need another discretisation which approxi-
mates the time derivative and allows us to solve the ODEs numerically (See Fig. 1.5)

j,n
...

...
h
t

∆t

Figure 1.5: Scheme of a full discretisation

Forward Differences

To approximate the time derivative we use again a Taylor series expansion of u around a
given time t. Let ∆t denote the time step size, then we have:

∂u
u(t + ∆t) = u(t) + ∆t + O(∆t2 ), (1.103)
∂t t
26 Chapter 1. An Introductory Example

or

∂u u(t + ∆t) − u(t)
= + O(∆t). (1.104)
∂t t ∆t

If we insert this approximation of the time derivative into the spatially discretised heat
equation, we obtain:

u(t + ∆t) − u(t)


= Au(t) (1.105)
∆t
The two approximation errors of size O(∆t) for the time discretisation and O(h2 ) for the
space discretisation bring a total discretisation error of O(∆t) + O(h2 ) = O(∆t + h2 ).
Assuming that the size of the time steps stays constant it is possible to number the different
discrete time points:

tn = t0 + n · ∆t (1.106)

Together with the spatial discretisation we have a solution vector at every time point:
   
u1 (tn ) u1,n
un =  ..   .. 
= .  (1.107)

.
uj (tn ) uj,n

With these vectors the discrete heat equation can be written as:

un+1 = un + ∆tAun = (I + ∆tA)un (1.108)


| {z }
B

This method for ODEs is also known as the Euler forward method. It is now a fully discrete
linear dynamical system of difference equations with matrix B.
An important question is now whether the numerical solutions of this difference equation
also decay. To find an answer another eigenvalue analysis with the matrix B is necessary.
Again the matrix is tridiagonal which makes the eigenvalue analysis easy.
 
1 − 2β 2 ∆th2
β 2 ∆t
h2
0 0 . . .
 β 2 ∆t
h 2 1 − 2β 2 ∆t
h2
β 2 ∆t
h2
0 ... 
B= (1.109)
 
2 ∆t 2 ∆t 2 ∆t
 0 β h2 1 − 2β h2 β h2 0 . . .  
.. .. .. ..
. . . .

Here a = 1 − 2r and b = r with r = β 2 ∆t


h2
and thus:
1.4. Finite Difference methods 27

 
jπ jπ
λj = 1 − 2r + 2r cos = 1 − 2r 1 − cos (1.110)
N N
The solution of linear difference equations is growing if the absolute value of one eigenvalue
is greater than one. Therefore we must look if one of the eigenvalues is greater than one or
less than one. During the analysis of the spatial approximation we already saw that cos jπ N
never becomes zero. From this fact we see that Eq. (1.110) is always less then one. The
other ”dangerous” value is −1. If we set j = N − 1 the cosine approaches its maximum
negative value:
 
(N − 1)π
λN −1 = 1 − 2r 1 − cos (1.111)
N
To guarantee decreasing solutions we can make the condition a little bit stronger by re-
quiring:

1
λN −1 > λN = 1 − 4r > −1, or r < , (1.112)
2
which gives the following relation for β, h and ∆t:

h2
∆t < (1.113)
2β 2
Satisfying this relation guarantees a stable behaviour with decaying solutions. One inter-
esting thing about this equation is the fact that the time step size depends on the spatial
discretisation. So reducing the distance between the points in space requires a reduction
of the time step, but with a quadratic dependence !. If we want the solution to be four
times as accurate, we have to double the number of spatial points (O(h2 )), and divide the
time step by 4, both for accuracy (O(∆t)) and stability (Eq. (1.113)) reasons.

θ - Methods

To overcome the restrictions of the forward differences in time, other time discretisation
schemes must be used. One idea is to use not only the forward difference but to take also
the backward difference.
The forward difference is defined as:

∂u un+1 − un
= + O(∆t) (1.114)
∂t t=tn
∆t

This difference leads, as we already know, to the Euler forward method for ODEs. Inserting
this finite difference approximation into the original system of ODEs results in:
28 Chapter 1. An Introductory Example

un+1 − un
= Aun (1.115)
∆t
The backward difference is:

∂u un+1 − un
= + O(∆t) (1.116)
∂t t=tn+1
∆t

This leads to the Euler backward method for ODEs. We insert this approximation into the
original system of ODEs to obtain:

un+1 − un
= Au,n+1 (1.117)
∆t
The class of θ-methods is based on a linear combination of the forward and backward
difference formulas. Introducing a weighting parameter θ we get:

∂u
θbackw. + (1 − θ)forw. ≈ + O(∆tp ). (1.118)
∂t t=tn+θ

For θ = 1/2 the order of the method is p = 2. All other methods achieve only an order of
p = 1. Inserting Eq. (1.115) and Eq. (1.117) into Eq. (1.118) gives:

un+1 − un
= θAun+1 + (1 − θ)Aun (1.119)
∆t
By solving for u,n+1 we obtain:

(I − θ∆tA)un+1 = (I + (1 − θ)∆tA)un (1.120)

In this equation we can observe several properties of the θ-method. Non astonishingly for
θ = 0 it is exactly the same as the Euler forward method. Furthermore we can see that
the system of linear equations which must be solved to get the next solution vector un+1 is
non-trivial for all θ > 0. Hence larger timesteps through better stability properties of the
method have to be bought at the expense of more floating point operations per time step.
To see if we may use larger time steps with the θ-methods we need the same type of analysis
as for the finite difference method.
As both (I − θ∆tA) = B1 and (I + (1 − θ)∆tA) = B2 are tridiagonal and symmetric, they
have the same eigenvectors and may be diagonalised simultaneously, with

jπ jπ
λi (B1 ) = 1 + 2rθ − 2rθ cos = 1 + 2rθ(1 − cos ) (1.121)
N N
and
1.4. Finite Difference methods 29


λi (B2 ) = 1 − 2r(1 − θ)(1 − cos ). (1.122)
N
The system in Eq. (1.120) can be written as

un+1 = B−1
1 B2 un = Bun (1.123)

and hence B has eigenvalues

λj (B2 ) 1 − 2r(1 − θ)(1 − cos jπ


N
)
λj (B) = = jπ (1.124)
λj (B1 ) 1 + 2rθ(1 − cos N )

(and the same eigenvectors as B1 and B2 ). We require that

−1 < λj (B) < 1. (1.125)

The right inequality leads to 1 − 2r(1 − cos jπ


N
) < 1 which is satisfied for all j, and the left
inequality gives the requirement


r(1 − cos )(1 − 2θ) < 1. (1.126)
N
This is certainly satisfied if θ ≥ 1/2, and hence those θ-methods are stable for any com-
bination of ∆t and h; this is called unconditionally stable. For θ < 1/2 the inequality is
1
certainly satisfied if r · 2 · (1 − 2θ) < 1, or r < 2(1−2θ) . For θ = 0 this is relation Eq. (1.112).
30 Chapter 1. An Introductory Example

1.4.5 Von Neumann Stability Analysis


Some error estimates were obtained by the Taylor series expansion of the PDE in time and
space (Eq. (1.118)). These error estimates showed the consistency of the numerical ap-
proximation which, means that the numerical solution is an approximation to the solution
of the PDE.
But consistency is not enough to get correct solutions for the PDE. Another requirement is
the stability of the numerical solution. The condition for stability Eq. (1.113) was derived
by the matrix stability analysis. Stability and consistency guarantee together that the
numerical solution converges to the real solution of the PDE.
In this section another method to find the stability conditions for a method will be pre-
sented. This method starts with an assumption about the analytical solutions. These
solutions consist of sine and cosine functions of different frequencies at each time instance:

u(x) = cos(k · x) + i sin(k · x) = eikx (1.127)

Here i is the imaginary unit and k is the wavenumber. For this analysis we also assume
that the number of discrete points is infinite. Then looking at this function at our discrete
grid points where x = j · h reveals:

u(j) = eikjh (1.128)

Currently this Ansatz captures only the spatial structure of the solution. From the ana-
lytical solution we know that the time evolution of the function is an exponential function.
In the discrete case this exponential function is approximated by the gain factor, G(k)n
where:

G(k) = eα(k) (1.129)

Bringing Eq. (1.128) and Eq. (1.129) together gives the following ansatz function for the
solution in one of the discrete points:

un,j = G(k)n eikjh (1.130)

β 2 ∆t
Using again r = h2
, the general form of the Theta-methods can be written as:

−θrun+1,j−1 +(1+2θr)un+1,j −θrun+1,j+1 = (1−θ)run,j−1 +(1−2(1−θ)r)un,j +(1−θ)run,j+1


(1.131)
Inserting the Ansatz Eq. (1.130) into the difference formula gives:
1.4. Finite Difference methods 31

(1 + 2θr)G(k)n+1 eikjh − θr(G(k)n+1 eik(j+1)h + G(k)n+1 eik(j−1)h )


=(1 − 2(1 − θ)r)G(k)n eikjh + (1 − θ)r(G(k)n eik(j+1)h + G(k)n eik(j−1)h )
(1.132)

Dividing by G(k)n eikjh , which is nonzero, simplifies the equation to:

(1 + 2θr)G(k) − θrG(k)(eikh + e−ikh ) = (1 − 2(1 − θ)r) + (1 − θ)r(eikh + e−ikh ) (1.133)

From eiξ = cos ξ + i sin ξ it is easy to derive the following two formulae:

1 iξ
e + e−iξ

cos ξ = (1.134)
2
1 iξ
e − e−iξ

sin ξ = (1.135)
2i
Using the first of these gives:

(1 + 2θr − 2θr cos(kh))G(k) = 1 − 2(1 − θ)r + 2(1 − θ)r cos(kh) (1.136)

Solving for G(k), we finally arrive at the following expression for the gain factor:

1 − 2(1 − θ)r(1 − cos(kh))


G(k) = (1.137)
1 + 2θr(1 − cos(kh))
Obviously the gain factor depends on the wave number and the spatial discretisation. For
stability the following condition must be satisfied:

|G(k)| ≤ 1 (1.138)

Another important component in the stability analysis is the highest wavenumber k which
will be included in our examination. This wavenumber is naturally given by the spatial
discretisation with alternating values at successive grid points. This means the upper limit
is kmax = πh . Higher frequencies appear as lower frequencies. This effect is known as
aliasing and follows directly from Shannon’s theorem about the discretisation of signals.
The extreme values of G which are important for the stability analysis depend mainly on
the cosine in the quotient of Eq. (1.137). Demanding cos(kh) = 1 leads to k = 0 which is
the lowest possible frequency and thus:

1−0
G(0) = =1 (1.139)
1+0
32 Chapter 1. An Introductory Example

This extreme value does not cause any trouble (it is actually necessary for consistency)
because it only reaches the stability limit. Now we have to examine the other extreme
value cos(kh) = −1, kh = π ⇒ k = πh :

π 1 − 4(1 − θ)r
G( ) = (1.140)
k 1 + 4θr
While the first limit exactly measures the amplification of the lowest frequency, the lower
limit corresponds to the amplification of the highest frequencies which can be resolved with
the given spatial discretisation. And the second limit can become less than −1 and is thus
the ”dangerous” limit which needs further investigation:

1 − 4(1 − θ)r 1
≥ −1 ⇒ (1 − 2θ)r ≤ (1.141)
1 + 4θr 2
For θ ≥ 12 we get an unconditionally stable method for all r > 0. If θ < 12 a restriction on
the time step must be imposed to get a stable method (r < 1/2(1 − 2θ)). Comparing this
stability result with the matrix stability analysis for the Euler method (θ = 0) shows that
we get the same restriction on r.
If G < 0 the first factor Gn of the discrete solution will change its sign with every time step.
These solutions are called oscillatory solutions. Because the analytical solution does not
show this behaviour it would be nice to avoid also this unwanted characteristic. Inserting
this requirement into the equation for the gain factor reveals:

1 − 4(1 − θ)r 1
≥0⇒r≤ (1.142)
1 + 4θr 4(1 − θ)
A last conditions can be derived from the numerical schemes. It is called positivity and
should prevent the solution from becoming negative. Looking at Fig. 1.6 shows how the
solution at a given point depends on the neighbour points:

un+1,j = un,j + (1 − θ)r(un,j−1 − 2un,j + un,j+1 )


= (1 − 2(1 − θ)r)un,j + (1 − θ)r(un,j−1 + un,j+1 ) (1.143)
| {z }
a

The important criteria for positivity is the part a in Eq. (1.143) because the rest of the
equation is always positive, if the algorithm is started with positive initial conditions. It
follows that:

1
(1 − 2(1 − θ)r) ≥ 0 ⇒ r ≤ (1.144)
2(1 − θ)
In summary we have found the following three criteria which can be used to find the right
parameters for the numerical solution:
1.4. Finite Difference methods 33

(1− θ )r

j+1
1−2(1− θ)r

j
(1− θ )r

j−1
n+1

Figure 1.6: Computational molecule or difference star for the theta methods

1
• Stability : r ≤ 2(1−2θ)

1
• Positivity : r ≤ 2(1−θ)

1
• No oscillations : r ≤ 4(1−θ)

For the three schemes which are used most the result are shown in Table 1.1
Euler fwd. Trap.Rule/Crank Nicholson Euler bwd.
Stability r ≤ 1/2 r≤∞ r≤∞
Positivity r ≤ 1/2 r≤1 r≤∞
No oscill. r ≤ 1/4 r ≤ 1/2 r≤∞

Table 1.1: Limits for θ = 0, θ = 1/2, θ = 1

1.4.6 Stability and Consistency


In the previous section we analysed the stability of the Theta-methods, which were fortu-
nately consistent. Otherwise the methods could have been stable and nevertheless been
34 Chapter 1. An Introductory Example

producing wrong results. The meaning of consistency, stability and convergence will be
illustrated in the next chapter with some examples which show the need for these criteria.

Well posedness

A very useful demand on PDEs is the well posedness. Following the definition of Hadamard
a PDE

L(u) = f (1.145)

is well posed if it possesses three properties:

• the solution exists

• the solution is unique

• the solution depends continuously on auxiliary data

To show the existence of a solution may be a difficult problem, but usually depends on
the proper formulation of the problem. It requires the operator L is surjective, i.e. for
any f there is at least one u satisfying Eq. (1.145). For the uniqueness of the solution the
operator L must be injective, i.e. there is at most one u satisfying Eq. (1.145). The last
requirement can be satisfied if L and also L−1 are continuous.
Although well posed problems are very nice, not all physical phenomena can be described
by a well posed PDE. A simple example is an elastic rod with one fixed end and an
increasing force acting in the direction of the rod on the other end. For small forces the
problem is well posed. The deformation of the rod follows simply Hooke’s law. But at a
certain point, when the rod starts buckling, the problem is no longer well posed because
the rod can buckle to an arbitrary direction. So infinitely many solutions which are all
physically correct can exist.

Convergence

The most important criterium for a numerical approximation is the convergence which
demands that the approximate solution gets closer to the exact solution as the discretisation
is made finer.
Let L(u) = f define the exact solution and Lh (uh ) = fh be the discrete approximation.
Then convergence is:

uh → u, as (h → 0) (1.146)
1.4. Finite Difference methods 35

With this definition one open question remains. How to measure if a function approaches
another function. For this purpose the concept of norms, which is known from finite
dimensional spaces, is transferred to function spaces. A first basic norm is the L2 norm
which is defined by:
sZ
||u||L2 = u(x)2 dx (1.147)

Utilising an arbitrary norm the convergence can be written as:

||uh − u|| → 0, as (h → 0) (1.148)

A weaker criterium than the convergence is the consistency, which requires that the discrete
system approaches the continuous one as h → 0 (with fixed u) !

Lh (u) → L(u)
, as (h → 0) (1.149)
fh → f

The last important thing is the stability of a method, which was examined in the previous
sections. Formally it can be written as (the inverse or solution operator is uniformly
bounded):

||L−1
h || ≤ C, ∀h > 0 (1.150)

Where we shall now assume that both L and Lh are linear operators. These three conditions
are brought together by the following theorem.

Theorem 1 Consistency and Stability ⇔ Convergence

Proof:

||u − uh || = ||L−1 −1
h (Lh (u) − L(u)) + Lh (f − fh )|| (1.151)

With the triangle inequality we can find the following upper bound:

≤ ||L−1 −1
h (Lh (u) − L(u))|| + ||Lh (f − fh )|| (1.152)
≤ ||L−1 −1
h || · ||(Lh (u) − L(u))|| + ||Lh || · ||(f − fh )|| (1.153)
= ||L−1
h ||(||(Lh (u) − L(u))|| + ||(f − fh )||) (1.154)

Stability allows us to introduce another bound:


36 Chapter 1. An Introductory Example

≤ C(||(Lh (u) − L(u))|| + ||(f − fh )||) (1.155)

From consistency we get that:

||Lh (u) − L(u)|| → 0


as (h → 0) (1.156)
||fh − f || → 0
and thus:

C(||(Lh (u) − L(u))|| + ||(f − fh )||) → 0, as (h → 0) (1.157)

which shows the convergence. The other direction needs some deeper results from func-
tional analysis, and will not be given here.

Richardson scheme

As we have seen in one of the previous sections, approximating the time derivative with
forward or backward differences gives only an accuracy of O(∆t) in time. To overcome
this shortcoming, Richardson developed another scheme which has second order accuracy
in time. He simply replaced the forward difference by a difference over two time steps at
a given point:

∂u un+1,j − un−1,j
≈ (1.158)
∂t 2∆t
Including this approximation into the spatial discretisation of the heat equation generates
the following scheme:

un+1,j − un−1,j β2
− 2 (un,j−1 − 2un,j + un,j+1 ) = 0 (1.159)
2∆t h
The stability is examined again with a von Neumann stability analysis. We start with the
ansatz:

un,j = G(k)n · eikjh (1.160)

Inserting this Ansatz into the difference scheme gives:

1 β2
(G(k)n+1 eikjh − G(k)n−1 eikjh ) + 2 G(k)n [−eikh(j+1) + 2eikhj − eikh(j−1) ] = 0 (1.161)
2∆t h

Dividing by G(k)n eikjh = un,j :


1.4. Finite Difference methods 37

1 β2
(G(k) − G(k)−1 ) + 2 [−eihk + 2 − e−ihk ] = 0 (1.162)
2∆t h

Replacing again cos(x) = 21 eix + e−ix :

kh kh
G(k) − G(k)−1 = 4r(cos(kh) − 1) = 4r(−2 sin2 ( )) = −8r sin2 ( ) (1.163)
2 2

Multiplying with G(k) gives the following quadratic equation:

kh
G(k)2 − 1 = −8rG(k) sin2 ( ) (1.164)
2

with solutions:

r
2 kh kh
G(k)1,2 = −4r sin ( ) ± 1 + 16r2 sin4 ( ) (1.165)
2 2

The expression below the square root is always positive because of the square and the fourth
power and larger than 1. Furthermore the first part of Eq. (1.165) is always negative. Thus
the dangerous limit is −1 and it is clear that the Richardson method will always have a gain
factor less than −1. As a consequence the Richardson method is unconditionally unstable.
No choice of time step or spatial discretisation can make this method stable. Therefore the
only useful application of the Richardson method is as an example for an unstable method.

DuFort-Frankel scheme

One reason for the instability of the Richardson method is probably the fact that the time
step where the spatial derivative is computed is not coupled to the time steps where the
time derivative is computed. The DuFort-Frankel scheme tries to overcome this problem
by replacing the midpoint of the Richardson scheme un , j with the average of un−1 , j and
un+1 , j. Written in the normal way the DuFort-Frankel scheme takes the following form:

un+1,j − un−1,j β2
− 2 (un,j−1 − (un−1,j + un+1,j ) + un,j+1 ) = 0 (1.166)
2∆t h

The von Neumann stability analysis shows that this scheme is unconditionally stable. But
this method has another drawback which can be analysed by a consistency analysis. Using
Taylor expansions for the points used in Eq. (1.166):
38 Chapter 1. An Introductory Example

∂u 1 ∂2u 2
un+1,j = u(t + ∆t, x) = un,j + ∆t + ∆t + O(∆t3 ) (1.167)
∂t 2 ∂t2
∂u 1 ∂2u 2
un−1,j = u(t − ∆t, x) = un,j − ∆t + ∆t + O(∆t3 ) (1.168)
∂t 2 ∂t2
∂u 1 ∂2u 2
un,j+1 = u(t, x + h) = un,j + h+ h + O(h3 ) (1.169)
∂x 2 ∂x2
∂u 1 ∂2u 2
un,j−1 = u(t, x + h) = un,j − h+ h + O(h3 ) (1.170)
∂x 2 ∂x2

and inserting these equations into Eq. (1.166) we obtain:

2 ∂2u
2 ∂u
∂t
∆t + O(∆t3 ) β 2 (2un,j + ∂∂xu2 h2 + O(h3 )) β 2 (2un,j + ∂t2
∆t2 + O(∆t3 ))
− − = 0 (1.171)
2∆t h2 h2

Some simplifications give:

∂u ∂2u
− β2 2 + E = 0 (1.172)
∂t ∂x

with

β 2 ∆t2 ∂ 2 u
E= + O(∆t2 ) + O(h) (1.173)
h2 ∂t2

Looking at the error reveals that we do not only have the normal and unavoidable dis-
cretisation errors, but also an additional term which does not exist in the original PDE.
If we use the DuFort-Frankel scheme without any restrictions, we will get the solution
for a different PDE. This is called inconsistency. If we use the method to solve the heat
equation, we have to require that ∆th
→ 0 as ∆t, h → 0, which is incidentally satisfied by
the stability requirements we saw earlier, with ∆t = O(h2 ).

1.5 FD Methods in More Dimensions


The numerical solution of 1D problems serves as an introduction to the treatment of
problems in higher dimensions. As soon as it comes to the solution of 2 or 3 dimensional
problems, the use of numerical solution methods is almost always unavoidable. Here we
will cover the basic ideas of finite difference methods in more dimensions.
1.5. FD Methods in More Dimensions 39

1.5.1 Basic Ideas


If we consider again the instationary heat equation, but this time in 2 dimensions, we
obtain:
∂u ∂2u ∂2u
− β 2( 2 + 2 ) = f (1.174)
∂t ∂x ∂y
Recalling that in the one dimensional case the second spatial derivative was replaced by
a finite difference, this idea can be applied straightforward to the 2 dimensional equation.
Prior to doing this we again have to introduce a discretisation of the domain (See Fig. 1.7)

Missing figure!!!

Figure 1.7: Scheme of the 2D discretisation

The coordinates can be expressed in terms of the indices j and l:


x = j · ∆x (1.175)
y = l · ∆y. (1.176)
Then the partial derivatives can be replaced by finite differences:
∂ 2 uj,l 1
2
= (uj−1,l − 2uj,l + uj+1,l ) (1.177)
∂x ∆x2
∂ 2 uj,l 1
= (uj,l−1 − 2uj,l + uj,l+1 ) (1.178)
∂y 2 ∆y 2
Inserting these expressions into Eq. (1.174) we obtain:
∂uj,l β2 β2
− (u j−1,l − 2u j,l + u j+1,l ) − (uj,l−1 − 2uj,l + uj,l+1 ) = f (1.179)
∂t ∆x2 ∆y 2

Going one dimension up to three dimensional problems, the basic idea stays the same.
Introducing another coordinate z:
z = k · ∆z (1.180)
we get the following approximation for the second partial derivative with respect to z:
∂ 2 uj,l,k 1
2
= (uj,l,k−1 − 2uj,l,k + uj,l,k+1 ) (1.181)
∂z ∆z 2
The semi-discretisation of the three dimensional instationary heat equation then obviously
becomes:
∂uj,l,k β2 β2
− (u j−1,l,k − 2u j,l,k + u j+1,l,k ) − (uj,l−1,k − 2uj,l,k + uj,l+1,k )
∂t ∆x2 ∆y 2
(1.182)
β2
− (uj,l,k−1 − 2uj,l,k + uj,l,k+1 ) = f.
∆z 2
40 Chapter 1. An Introductory Example

1.5.2 Computational Molecules/Stencils

Another simplification is to use the same step size in both space directions. This leads
then in 2D to the following expression with h = ∆x = ∆y being the unique discretisation
parameter:
∂uj,l β 2
− 2 (−4uj,l + uj−1,l + uj+1,l + uj,l−1 + uj,l+1 ) = f. (1.183)
∂t h
or in 3d to:

∂uj,l,k β 2
− 2 (−8uj,l,k + uj−1,l,k + uj+1,l,k + uj,l−1,k + uj,l+1,k + uj,l,k−1 + uj,l,k+1 ) = f. (1.184)
∂t h

A very nice way to visualise these schemes is to draw the points used in the schemes
with their weights in the original computational domain. For the two schemes shown here
one obtains pictures as shown in Fig. 1.8 and Fig. 1.9. These are often referred to as
Computational Molecules or Stencils.
1

1
1
−4

−8
1

1
1
1

Figure 1.8: Stencil for 2D Laplace opera- Figure 1.9: Stencil for 3D Laplace opera-
tor tor

1.5.3 Boundary Treatment

As we know from 1-D problems, the solution is only completely specified if the boundary
conditions are satisfied along with the differential equation. These boundary conditions
have to be discretised also for a numerical treatment. When the boundaries are not straight,
this becomes a cumbersome procedure for finite difference methods. We will not treat these
here, and refer to specialist texts. We will also see that this problem is much easier with the
finite element method, which will be treated next in the more general context of weighted
residual methods.

1.5.4 Time Discretisation


Similar to the one dimensional case, the resulting system of ordinary differential equations
normally does not possess an analytical solution, which makes the use of numerical methods
necessary. Using the θ-method as an example we obtain the following system of equations:

(u^{n+1}_{j,l} − u^n_{j,l})/∆t = (β²/h²)[(1 − θ)(−4u^n_{j,l} + u^n_{j−1,l} + u^n_{j+1,l} + u^n_{j,l−1} + u^n_{j,l+1}) + θ(−4u^{n+1}_{j,l} + u^{n+1}_{j−1,l} + u^{n+1}_{j+1,l} + u^{n+1}_{j,l−1} + u^{n+1}_{j,l+1})] + f    (1.185)
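As an illustration, the explicit case θ = 0 of Eq. (1.185) can be written as a single update per time step. This is only a sketch under simplifying assumptions (uniform grid, boundary values kept fixed, f given as a grid array); for stability the step size has to obey roughly ∆t ≤ h²/(4β²):

    import numpy as np

    def step_explicit(u, dt, h, beta, f):
        # one explicit Euler step (theta = 0 in Eq. (1.185)) for the 2D heat equation
        un = u.copy()
        stencil = (un[:-2, 1:-1] + un[2:, 1:-1] + un[1:-1, :-2] + un[1:-1, 2:]
                   - 4.0 * un[1:-1, 1:-1])
        u[1:-1, 1:-1] = un[1:-1, 1:-1] + dt * (beta**2 / h**2 * stencil + f[1:-1, 1:-1])
        return u

For θ > 0 a (sparse) linear system has to be solved in every time step instead.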

For three and higher dimensional problems the idea and the implementation are straightforward.
But it should be noted that the computational effort increases extremely fast with higher
dimensions. While in one dimension, taking h = 0.01 on a unit interval leads to approximately
100 points, the same discretisation size for a unit cube in three dimensions already leads to
1,000,000 points! Hence for higher dimensional problems the practical implementation
often becomes the real problem.
Chapter 2

Equilibrium Equation and Iterative Solvers

The solution of the heat equation with a time-independent right hand side and constant
boundary conditions approaches a stationary state.

∂u/∂t(x, y, z, t) − β²∆u(x, y, z, t) = f(x, y, z)    (2.1)

u(x, y, z, t) → ũ(x, y, z) as t → ∞    (2.2)

This is the steady state of the instationary heat equation and also the solution of the
equilibrium equation.

∂ũ/∂t(x, y, z) = 0  ⇒  −β²∆ũ(x, y, z) = f(x, y, z)    (2.3)
In this chapter this equilibrium equation, or stationary heat equation, will be introduced.
After that, some methods to find a solution of this equation will be presented.

2.1 Equilibrium equation


The general form of the heat equation was:


∂u/∂t(x, y, z, t) − β²∆u(x, y, z, t) = f(x, y, z, t)    (2.4)
Other physical phenomena like diffusion can also be modelled with this type of equation.
This equation is a member of the family of parabolic equations. A more detailed description
of the different classes of partial differential equations will follow in a later chapter.


In order to have a unique solution of this equation we need some boundary and initial
conditions. After spatial discretisation we have the following system of ODEs:


∂u/∂t + Au = f    (2.5)
If the right hand side term f is independent of the time, and all boundary conditions are
also constant in time, the solution of Eq. (2.4) will converge to a steady state as t → ∞.
In the steady state the solution does not change anymore, so ∂u/∂t = 0 and thus the steady
state will also satisfy the following partial differential equation:

−β 2 ∆u(x, y, z) = f (x, y, z) (2.6)

together with the boundary conditions. Now the equation is of elliptic type. Several other
problems, such as the stationary state of mechanical systems (e.g. the displacement of the
membrane of a drum or the displacement of a simple beam), can also be described by elliptic equations.
If we apply a finite difference approximation for the spatial derivative we obtain a system of
linear equations:

Au = f (2.7)

with

            ⎡  2  −1                 ⎤
            ⎢ −1   2  −1             ⎥
A = (β²/h²) ⎢      ⋱    ⋱    ⋱      ⎥ .    (2.8)
            ⎢           −1   2  −1   ⎥
            ⎣                −1   2  ⎦

This matrix is tridiagonal and hence very sparse (its entries are mostly zeros). Tridiagonal
matrices can be factorised by direct elimination in O(n) operations (the so called Thomas
algorithm).
Normally, the discretisation of PDEs leads to sparse and often very large matrices with
solution vectors of several million unknowns, because the solution becomes more accurate
if the spatial and temporal discretisation is refined.
For not too large systems of linear equations the fastest solution is often to use a direct
solution method like Gaussian elimination. Especially for one dimensional problems one
can achieve a numerical complexity of O(n) where n is the number of unknowns. But for
higher dimensional problems the complexity of efficient direct solvers becomes O(n2 ) for
typical grid problems in 3D. This makes the use of an alternative approach for very large
systems of equations necessary.

2.2 Iterative methods


While the direct solvers try to find the solution of the system of equations in a finite and
predetermined number of steps, the iterative solution methods start with an initial guess
of the solution, and try then to get closer to the correct solution with each iteration. One
then usually stops the iteration when the iteration error is of the same order of magnitude
as the discretisation error. All the iterative methods replace the direct solution of the
original system of equations with the direct solution of a simpler system, which has to be
iterated over and over.

2.2.1 Timestepping, Richardson’s Method


If we look back to the instationary heat equation we can already identify a first iterative
method. We found out that the steady state solution as t → ∞ of the instationary heat
equation is a solution of Eq. (2.6) and thus also a solution of Eq. (2.7). Starting with
the initial conditions the Euler forward method allows us to come closer to the stationary
solution without solving any systems of equations.
Obviously it will not be possible to come to t = ∞ with finite time steps, but assuming
we have chosen a stable time step size we can be sure that every iteration brings the
approximation closer to the correct solution of the equilibrium equation. Hence an arbitrary
accuracy can be achieved after a finite number of time steps.
The Euler forward method for

u̇ + Au = f (2.9)

was

(u^{n+1} − u^n)/∆t + Au^n = f.    (2.10)
Rewriting it in matrix form gives:

un+1 = (I − ∆tA)un + ∆tf (2.11)

This method is equivalent to Richardson’s method for solving a linear system of equations
Au = f :

un+1 = (I − ϑA)un + ϑf = un + ϑ(f − Aun ) (2.12)

with a parameter ϑ which must be sufficiently small.
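As a sketch (not part of the original notes), Richardson's method is only a few lines; for a symmetric positive definite A it converges for 0 < ϑ < 2/λ_max(A):

    import numpy as np

    def richardson(A, f, u0, theta, n_iter=100):
        # u_{n+1} = u_n + theta * (f - A u_n), Eq. (2.12)
        u = u0.copy()
        for _ in range(n_iter):
            u = u + theta * (f - A @ u)
        return u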



2.2.2 Jacobi’s Method


A slightly different view to Eq. (2.12) reveals that every iteration is the solution of a very
simple system of linear equations:

Iun+1 = ϑf − (ϑA − I)un (2.13)

Jacobi’s method may be seen as replacing the identity matrix with a matrix of similar
complexity which is closer to the original system of linear equations. The diagonal matrix
D = diag(A) has the same structure as the identity matrix but is closer to the original
system of linear equations and is thus used for the Jacobi method:

Dun+1 = ϑf − (ϑA − D)un (2.14)

Another view, and the one initially motivating Jacobi, of the same method is illustrated
in Fig. 2.1.
Assuming the solution is known on all nodes except our current node j, we simply solve
the system of equations for that node:

Σ_{i=1}^{N} a_{ji} u^{(i)} = f^{(j)}  ⇔  a_{jj} u^{(j)} = f^{(j)} − Σ_{i≠j} a_{ji} u^{(i)}  ⇒  u^{(j)}_{n+1} = (1/a_{jj}) ( f^{(j)} − Σ_{i≠j} a_{ji} u^{(i)}_n )    (2.15)

Obviously, Eq. (2.15) is equivalent to Eq. (2.14) with ϑ = 1.
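A corresponding sketch of the Jacobi iteration (Eq. (2.14) with ϑ = 1), again only illustrative and written for a dense numpy matrix:

    import numpy as np

    def jacobi(A, f, u0, n_iter=100):
        D = np.diag(A)              # diagonal entries of A
        R = A - np.diag(D)          # off-diagonal part
        u = u0.copy()
        for _ in range(n_iter):
            u = (f - R @ u) / D     # D u_{n+1} = f - (A - D) u_n
        return u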

2.2.3 Matrix Splitting methods


One common principle of the Eq. (2.13) and Eq. (2.14) was the solution of a simpler system
of equations in each iteration. This principle is generalised in the matrix splitting methods.
Instead of solving the original system of equations one time, simpler systems which are
similar to the original system of equations are solved several times to approximate the
solution.
A formal derivation starts with the original system of linear equations:

Au = f (2.16)

Multiplying the system with a factor ω and adding Mu gives:

Mu = Mu + ω(f − Au) (2.17)



From this system of equations, which is equivalent to the original system, the iterative
method is derived as:

Mun+1 = Mun + ω(f − Aun ) (2.18)

For ω = 1 we have

Mu^{n+1} = f − (A − M)u^n = f + (M − A)u^n    (2.19)

So we see the matrix A is split into the parts M and A − M.


It is important to have a matrix M which allows a fast solution of the system of equations.
A broad class of very popular methods is based on the splitting of A into the strictly lower
triangular part E, the strictly upper triangular part ET and the diagonal part D (See
Fig. 2.2):

A = D − E − ET (2.20)

Gauss-Seidel

If we consider again the Jacobi method, written with the splitting matrices, we obtain
(with ω = 1):

Dun+1 = Dun + f − (D − E − ET )un = f + Eun + ET un (2.21)

Under the assumption that our algorithm starts at the first unknown u1 and goes down to
the last unknown uN, we already have new values for ui, i = 1 . . . j − 1, at position j. The
Gauss-Seidel algorithm takes this into account by using these new values as soon as they
are available. From Eq. (2.15) we have

a_{jj} u^{(j)}_{n+1} = f^{(j)} − Σ_{i<j} a_{ji} u^{(i)}_{n+1} − Σ_{i>j} a_{ji} u^{(i)}_n ,    (2.22)

or in matrix form:

Dun+1 = f + Eun + ET un+1 (2.23)

But as is often the case, the advantage of faster convergence comes with some drawbacks. For large scale
applications it is often necessary to use parallel computers. The Jacobi method allows an
almost trivial parallelisation of the algorithm. Each processor gets some unknowns and
can compute the next iteration independently of the other processors. After each iteration
the new results must be distributed.

In contrast the Gauss-Seidel algorithm cannot be parallelised in its original form because
the steps j + 1..N can only be started after the results 1..j are known. To overcome this
problem algorithms like the Block-Gauss-Seidel method were developed.
A typical implementation of the Gauss-Seidel method is shown below (a sketch in Python with a dense numpy matrix; the fixed iteration count stands in for a real convergence test):

    import numpy as np

    def gauss_seidel(A, f, u, n_iter=100):
        # Gauss-Seidel sweep, Eq. (2.22): new values u[0..j-1] are used as soon as
        # they are available, old values are used for u[j+1..N-1]
        N = len(f)
        for k in range(n_iter):          # iterate "until convergence"
            for j in range(N):
                s = A[j, :j] @ u[:j] + A[j, j+1:] @ u[j+1:]
                u[j] = (f[j] - s) / A[j, j]
        return u

Successive Over-Relaxation (SOR)

Another acceleration of the solution process is achieved by the SOR method which is the
abbreviation for Successive Over Relaxation. Here the assumption is that each iteration
brings the solution closer to the right solution by a small amount ∆u. So for the Jacobi
or Gauss Seidel method we have something like:

uk+1 = uk + ∆uk (2.24)

If ∆u points into the direction of the solution we can come even closer to the solution if
we go a little bit further in that direction. Hence the SOR method uses:

uk+1 = uk + ω∆uk (2.25)

Besides the same parallelisation problems as for the Gauss-Seidel method, the optimal choice
of ω is another issue with the SOR method. For most problems relaxation parameters
like ω = 1.1 can already bring a slight improvement.
A last variation is the SSOR method which changes the sweep direction after each iteration. The
first iteration goes from j = 1 to j = N and the next from j = N back to j = 1.

Summary

In the previous sections we saw some of the basic ideas of iterative solvers. The following list
gives an overview of the most popular matrix splitting methods:

• Richardson: M = I

• Jacobi: M = D

• Gauss-Seidel: M = D − E or M = D − Eᵀ

• SOR: M = (1/ω)D − E or M = (1/ω)D − Eᵀ

• SSOR: once M = (1/ω)D − E and once M = (1/ω)D − Eᵀ

2.3 Multigrid methods


The way to increase the accuracy of numerical solutions of PDEs is to increase the number of
unknowns. This leads to huge systems of linear or nonlinear equations which must be
solved efficiently. Direct solvers and the simple iterative solvers shown in the previous
section reach their limits at roughly several thousands of unknowns. Complexity analysis
shows this behaviour and will be introduced briefly in the last subsection.
Because problems like fluid dynamics need even more unknowns, a more sophisticated
iterative solution strategy called multigrid is often used. The basic ideas and concepts
will be shown in the next sections.

2.3.1 Idea

The basis of the multigrid method is the clever use of the so-called smoothing property of
most iterative solvers for systems of linear equations. Consider the system of linear
equations coming from the stationary heat equation, with the unknowns located along the x-axis.
Starting with a random initial guess for the solution vector u, the residual r = f − Au
looks very irregular along the x-axis (see Fig. 2.3). Interpreting the residual as a signal,
all spatial frequencies are present.
If we start iterating with the Gauss-Seidel method we can observe that each iteration makes
the curve of the residual look more smooth. This means that the higher spatial frequencies
(wavenumbers) are diminished (See Fig. 2.4).
Continuing the iteration, at some point only a smooth residual is left, which decreases
very slowly (see Fig. 2.5). Looking at the norm of the residual vector also shows that the
error decreases very fast in the beginning and quite slowly towards the end (see Fig. 2.6).
From this observation the basic idea is not far away. Transferring the residual on the fine
grid to a coarser grid by a restriction operator lets it look "rougher" to the
iterative solver on the coarser grid, which consequently performs better.
After the smooth parts of the residual were decimated on the coarse grid, the correction
to the solution is transferred back to the finer grid with an interpolation operator. Here
only the rough parts are left and can be smoothed away by the iterative solver. This
grid-transfer process is then repeated again and again.

2.3.2 Algorithm
The simplest implementation of the idea is the twogrid iteration. It uses a coarse and a fine
grid. Furthermore an interpolation and a restriction operator are required. Probably the
simplest restriction operator is to take only every second node of the grid. For interpolation
an easy and often used method is the linear interpolation which takes the average of the
two neighbouring points.

Twogrid iteration

For the variables the subscript h or H denotes whether the variable is defined on the fine or on the
coarse grid. The superscript is used for the iteration number.
The current solution vector is denoted by v, the matrix is called A and the residual r.
As the exact solution u satisfies Au = f, the error e = u − v satisfies Ae = Au − Av =
f − Av = r.
With these definitions we get the following algorithm to compute the next iteration k + 1
of the solution vector v_h^k:

1. Smooth: v_h^k → ṽ_h^k

2. Compute the residual r_h^k = f_h − A_h ṽ_h^k

3. Transfer r_h^k → r_H^k (restriction)

4. On grid H solve A_H e_H^k = r_H^k

5. Transfer e_H^k → e_h^k (prolongation)

6. v_h^{k+1} = ṽ_h^k + e_h^k

7. Optionally smooth v_h^{k+1}

Graphically this algorithm can be visualised as shown in Fig. 2.7. Especially for more
complicated iteration schemes this visualisation becomes useful for understanding the al-
gorithm.
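The following sketch (not from the original notes) implements one twogrid cycle for the 1D model problem −u″ = f on [0, 1] with zero boundary values. It assumes 2^k + 1 equidistant grid points, uses a weighted Jacobi smoother instead of Gauss-Seidel, injection (every second node) as restriction and linear interpolation as prolongation:

    import numpy as np

    def smooth(u, f, h, sweeps=3, omega=2.0/3.0):
        # weighted Jacobi sweeps for -u'' = f (boundary values stay fixed)
        for _ in range(sweeps):
            u[1:-1] += omega * 0.5 * (h*h*f[1:-1] + u[:-2] + u[2:] - 2.0*u[1:-1])
        return u

    def residual(u, f, h):
        r = np.zeros_like(u)
        r[1:-1] = f[1:-1] - (2.0*u[1:-1] - u[:-2] - u[2:]) / (h*h)
        return r

    def restrict(r):
        return r[::2]                        # injection: every second node

    def prolong(e):
        ef = np.zeros(2*len(e) - 1)          # linear interpolation to the fine grid
        ef[::2] = e
        ef[1::2] = 0.5*(e[:-1] + e[1:])
        return ef

    def twogrid(u, f, h):
        u = smooth(u, f, h)                  # 1. pre-smoothing
        r = residual(u, f, h)                # 2. residual on the fine grid
        rH = restrict(r)                     # 3. restriction
        H, nH = 2.0*h, len(rH)
        AH = (np.diag(2.0*np.ones(nH-2)) - np.diag(np.ones(nH-3), 1)
              - np.diag(np.ones(nH-3), -1)) / (H*H)
        eH = np.zeros(nH)
        eH[1:-1] = np.linalg.solve(AH, rH[1:-1])   # 4. coarse-grid solve
        u += prolong(eH)                     # 5./6. prolongation and correction
        return smooth(u, f, h)               # 7. post-smoothing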

Multigrid iteration

One point in the two grid algorithm is not totally satisfying. In the fourth step the direct
solution of a smaller system of equations is required. For large problems this system of
equations may again be too large to solve directly. So the idea of the multigrid iteration
is to introduce another two grid scheme to solve this system of equations. Applying this
recursion several times gives a complete hierarchy of levels.

The variable names are the same as in the twogrid algorithm. Instead of the subscripts h and
H an index variable l is introduced. Additionally we need a stopping criterion for the
recursion, which is given by the number of levels lev.
Starting with an initial guess v_1 on the fine grid we call the function MG(1, v_1, f_1, lev):

fct x_l = MG(l, v_l^k, f_l, lev)

    if l = lev
        solve directly A_l x_l = f_l
    else
        smooth v_l^k → ṽ_l^k
        compute the residual r_l^k = f_l − A_l ṽ_l^k
        transfer r_l^k → r_{l+1}^k (restriction)
        on grid l+1: e_{l+1}^{k+1} = MG(l+1, e_{l+1}^k, r_{l+1}^k, lev)
        transfer e_{l+1}^{k+1} → e_l^{k+1} (prolongation)
        correct x_l = ṽ_l^k + e_l^{k+1}
    end

A graphical visualisation of the multigrid algorithm is shown in Fig. 2.8. Because of its
V-shape in the visualisation, a complete iteration is often called a V-cycle.

Full Multigrid V-Cycle (FMV)

Another improvement to the multigrid idea is the Full Multigrid V-Cycle which starts on
the coarsest level and takes several iterations limited to the two coarsest grids. This gives
the iteration on finer grids good starting values. After that the number of levels included
into the iteration is increased by one. This process continues until all levels are involved
in the V-Cycle (See Fig. 2.9). Empirical analysis shows that the FMV algorithm is one of
the most efficient algorithms for several problem types.

2.3.3 Complexity
An important issue regarding solvers for systems of linear equations is their complexity.
This is a function which describes the asymptotic runtime of the algorithm depending
on one or more variables describing the size of the problem.
Table 2.1 provides an overview about the complexity of several solvers for systems of
linear equations coming from a typical test problem, a finite difference discretisation of the
Laplace equation on a regular grid. The value in the table represents the exponent k in the
complexity function O(n^k) where n is the number of unknowns. Because the structure of
the matrix plays an important role in the runtime behaviour of the solvers, the dimension
of the test problems appears in the first row.
A first observation is that the complexity of iterative solvers decreases with increasing
dimension, while the complexity of the direct solver increases. As a rule of thumb direct

Dimension/Method 1D 2D 3D
Jacobi/GS 3 2 5/3
SOR 2 3/2 4/3
FMV 1 1 1
Direct 1 3/2 2
PCG 3/2 5/4 7/6

Table 2.1: Complexity of linear solvers

solvers perform well for 1 and 2 dimensional problems but are often unusable for large
problems in 3 dimensions. Iterative solvers become better for higher dimensional problems
and a large number of unknowns. But the performance of iterative solvers depend heavily
on the matrix, whereas direct solvers depend only on the structure of the matrix and are
therefore more robust.
Full Multigrid solvers seem to be perfectly suited for problems in any dimension and also
achieve the optimal performance. But they are generally not usable as "black box" solvers.
Often the adaptation to a special problem is very difficult. So most of the time they are used in
programs which can cope only with a special kind of problem, like fluid solvers.

Figure 2.1: Scheme of the Jacobi method



Figure 2.2: Matrix splitting A = D − E − Eᵀ

Figure 2.3: Initial residual

Figure 2.4: Residual after 10 iterations



Figure 2.5: Residual after 150 iterations

Figure 2.6: Norm of the residual over the iterations

Figure 2.7: Twogrid algorithm



Figure 2.8: Multigrid algorithm



Figure 2.9: Full Multigrid V-Cycle


Chapter 3

Weighted residual methods

In this section another method, or more precisely a general recipe for a whole family of methods,
will be introduced. Although the finite difference method is convincing by its intuitive
approach and its simplicity, it becomes quite difficult to apply to irregular domains.
Another disadvantage is the missing general framework for theoretical analysis; such a framework is
available for the weighted residual methods and gives some insight and a deeper
understanding of this class of methods.

3.1 Basic theory


As a simple example we will consider the stationary heat equation (often also called Poisson-
equation):

−∆u = f (3.1)

3.1.1 Weak form


The main idea is now to multiply the partial differential equation with a weighting function
ϕ and to integrate over the whole domain Ω:
⇒ ∫_Ω −∆u · ϕ dΩ = ∫_Ω f · ϕ dΩ,  ∀ϕ ∈ V    (3.2)

If Eq. (3.2) holds for every ϕ it is equivalent to Eq. (3.1). For sake of simplicity we assume
that u = 0 on ∂Ω. Then with Gauss’ theorem the following equation can be derived:
∫_Ω (∇u)ᵀ · ∇ϕ dΩ = ∫_Ω f · ϕ dΩ,  ∀ϕ    (3.3)


3.1.2 Variational formulation


An alternative use of the Poisson equation is to describe the displacement of an elastic bar
under load. It is well recognised that elastic structures minimise their internal energy. So
mechanical systems possess a natural minimisation principle:

−∆u = f  ⇔  min p(u),   where p(u) = (1/2)∫_Ω ||∇u||² dΩ − ∫_Ω u f dΩ is the energy    (3.4)

To minimise this functional it is necessary that its first variation becomes zero.

p(u + v) = ∫_Ω (1/2)||∇u + ∇v||² − f(u + v) dΩ = p(u) + ∫_Ω (∇u · ∇v − f v) dΩ + (1/2)∫_Ω (∇v)² dΩ    (3.5)

where the middle term ∫_Ω (∇u · ∇v − f v) dΩ must be zero for a minimum.

If u minimises p, then
∫_Ω (∇u)ᵀ · ∇v dΩ = ∫_Ω f · v dΩ,  ∀v    (3.6)

which is equivalent to Eq. (3.3). The solution obtained by using the weighted residual
methods is thus equivalent to minimising the energy of the system.

3.1.3 Numerical methods


To solve the weak form (Eq. (3.3)) it is necessary to introduce an approximation of the func-
tion u. In the most general form this approximation is the sum of several ansatzfunctions
Ni which are multiplied with coefficients ui :

u(x) ≈ u_h(x) = Σ_{i=1}^{N} u_i N_i(x)    (3.7)

If this approximation is put into Eq. (3.3) it is not possible to satisfy the equation for
all ϕ. Instead a finite subspace V_h ⊂ V must also be selected for the weighting functions.
This subspace may only have as many spanning functions as the space of ansatzfunctions
in order to have a solution for Eq. (3.3). So the weighting function ϕ can be expressed
similarly as:

ϕ(x) ≈ ϕ_h(x) = Σ_{i=1}^{N} ϕ_i(x)    (3.8)

Depending on the type of weightingfunctions the numerical methods have different names.

Bubnov Galerkin methods

The characteristic of Bubnov Galerkin methods is that the weightingfunctions are the same
as the ansatzfunctions.

ϕ i = Ni (3.9)

It is one of the most popular weighted residual methods. Often the name Finite Element
Method or FEM is used synonymously with this type of weightingfunctions.

Petrov Galerkin methods

Petrov Galerkin methods are all weighted residual methods where the weightingfunctions
are different from the ansatzfunctions. It is obvious that this includes all methods which
are not Bubnov Galerkin. Nevertheless in literature most of the methods got different
names.
One choice for the weightingfunctions is the delta function:

ϕi = δ(x − xi ) (3.10)

Because integrating with the delta functions gives the function value at one point this
method is called pointwise collocation. Another choice is the characteristic function of
some subdomain Ωi inside the original domain Ω:

ϕ i = χ Ωi (3.11)

For obvious reasons this method is called subdomain collocation. It was independently
developed for conservation laws and is therefore often also called the Finite Volume Method.

Least Squares

Although the Least Squares method can also be seen as a Bubnov Galerkin method it
has some special properties. The idea is to apply the differential operator twice. One
time to the ansatzfunctions and once to the weightingfunctions. If we consider an abstract
differential operator L the least squares formulation is:

∫_Ω (Lu − f)(Lϕ) dΩ = 0    (3.12)

This method causes some difficulties when applied directly to higher order partial differ-
ential equations. Hence the most common approach is to convert the partial differential
equation into a first order system first.

Types of ansatzfunctions

Beside the different choices for the weightingfunctions there are also several possible ways
to choose the ansatzfunctions Ni . Some are:

• Polynomials: N_i = x^i

• First N eigenfunctions of L: L N_i = λ_i · N_i

• Trigonometric functions: N_i = sin(ix), N_i = cos(ix)

• Piecewise polynomials: N_i = x^i on Ω_i

Not every set of functions is well suited for the solution of partial differential equations.
And functions which may be good from the analytical point of view may cause problems in
the numerical treatment. The most popular choice today are piecewise polynomials because
they have some very useful properties. For special problems like weather simulation also
the trigonometric functions are used. These methods are called spectral methods.

3.2 Example: The Finite Element method


Now the ingredients are complete to find an approximate solution for the partial differential
equation. Inserting the ansatzfunctions and the weightingfunctions into Eq. (3.3) gives:

∫_Ω (∂/∂x Σ_{i=1}^{N} u_i N_i) · (∂N_j/∂x) dΩ = ∫_Ω f N_j dΩ,   j = 1, …, N    (3.13)

Evaluating these integrals for every index pair (i, j) ∈ [1 . . . N ] × [1 . . . N ] transforms this
equation into a system of linear equations:

⇒ Au = f (3.14)

Figure 3.1: Linear ansatzfunctions in 1D

3.2.1 Nodal basis

For the sparsity, which means the matrix A has only a few nonzero entries, a local definition
of the ansatzfunctions is necessary. Piecewise polynomials are widely used for this purpose.
Here we will look at piecewise linear functions in one dimension. The one dimensional
domain is then subdivided into several smaller parts Ωi = [xi . . . xi+1]. The ansatzfunctions
on this interval are then (see Fig. 3.1):

N_i(x) = (x_{i+1} − x)/l  for x ∈ [x_i, x_{i+1}],  0 otherwise    (3.15)

N_{i+1}(x) = (x − x_i)/l  for x ∈ [x_i, x_{i+1}],  0 otherwise    (3.16)

where l = xi+1 − xi is the length of the interval. The complete domain is then covered by
these functions (see Fig. 3.2).
It can easily be seen that the ansatzfunction Ni is one at the point xi and zero at all other
points xj, j ≠ i. So if we find a solution vector u the value of our approximate solution
(Eq. (3.7)) at the point xi is equal to the value of the coefficient ui . For the interpretation
of the solution this property is very helpful because it makes the reconstruction of the
approximate solution unnecessary. The points xi are often called nodes which also gives
the name for this type of ansatzfunctions.

Figure 3.2: Ansatzfunctions on the whole domain



Figure 3.3: Assembly of the global matrix K from the element matrices K1, …, K4

3.2.2 Matrix assembly

Another advantage of the nodal basis is the local definition of the ansatzfunctions. This
property allows the easy evaluation of Eq. (3.13). Because the ansatzfunctions are only
nonzero inside the local subdomain, the product of two ansatzfunctions can also be nonzero
only in the local subdomain. So the common way to get the global matrix in Eq. (3.14) is
to assemble it from the contributions of the small subdomains Ωi, which are called elements
in the Finite Element method.
Consider the subdomain Ωi going from xi to xi+1 with length li = xi+1 − xi . The local
system of equations is then:

 R xi+1 ∂ Rx   R xi+1
N ∂ N dx R xii+1 ∂x
∂ ∂  
R xxi+1 ∂x i ∂x i
Ni+1 ∂x Ni dx u1 xi
f (x)Ni dx
i
∂ x = xi+1
N ∂ N dx xii+1 ∂x∂ ∂
R
xi ∂x i ∂x i+1
Ni+1 ∂x Ni+1 dx u2 xi
f (x)Ni+1 dx
(3.17)
Solving the integrals we obtain:

      R xi+1 
1 1 −1 u1 x
f (x)N i dx
= R xi+1
i
(3.18)
li −1 1 u2 xi
f (x)Ni+1 dx
| {z }
Ki

Summing up these local systems of equations gives the global system of linear equations
(see Fig. 3.3).

Ku = f (3.19)
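A minimal sketch of this assembly process in Python (not taken from the notes; boundary conditions are not applied and the load integrals are approximated by the midpoint rule):

    import numpy as np

    def assemble_1d(x, f):
        # assemble K and the load vector for linear elements on the nodes x
        n = len(x)
        K, rhs = np.zeros((n, n)), np.zeros(n)
        for i in range(n - 1):                             # element [x_i, x_{i+1}]
            l = x[i+1] - x[i]
            Ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / l  # element matrix K_i, Eq. (3.18)
            fm = f(0.5*(x[i] + x[i+1]))
            K[i:i+2, i:i+2] += Ke                          # add local contribution (Fig. 3.3)
            rhs[i:i+2] += 0.5 * l * fm                     # midpoint rule for both load entries
        return K, rhs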

3.3 Example: The Finite Volume method


The original idea for the finite volume methods came from the conservation laws written
in integral form. But as already mentioned in subsection 3.1.3 it can also be interpreted as
subdomain collocation. If we look at the heat equation again we get the following integral
form with subdomains Ωi = [xi , xi+1 ]:

∫_Ω (∂²u/∂x²) χ_{Ω_i} dx = ∫_Ω f χ_{Ω_i} dx   ∀i    (3.20)

Using the properties of the characteristic function these integrals can be written as:

∫_{x_i}^{x_{i+1}} (∂²u/∂x²) · 1 dx = ∫_{x_i}^{x_{i+1}} f dx    (3.21)

With partial integration we obtain:

[∂u/∂x · 1]_{x_i}^{x_{i+1}} − ∫_{x_i}^{x_{i+1}} (∂u/∂x) · 1′ dx = ∫_{x_i}^{x_{i+1}} f dx    (3.22)

where the second integral is zero because 1′ = 0.

It follows directly that:

∂u/∂x(x_{i+1}) − ∂u/∂x(x_i) = ∫_{x_i}^{x_{i+1}} f dx    (3.23)

This equation represents the original idea of the finite volume method. On the left side
it has the flux ∂u/∂x on both sides of the small subdomain Ωi (this subdomain is called
control volume in the Finite Volume method) and the source term on the right hand side.
So what goes into the control volume and does not go out must be equal to the amount
coming from the source term f .
Inserting locally defined piecewise linear functions which have the same boundaries as the
subdomains Ωi we get the following result (shown for Ω1):

(u_1 ∂N_1/∂x(x_2) + u_2 ∂N_2/∂x(x_2)) − (u_1 ∂N_1/∂x(x_1) + u_2 ∂N_2/∂x(x_1)) = ∫_{x_1}^{x_2} f dx    (3.24)

⇒ 0 = ∫_{x_1}^{x_2} f dx    (3.25)

It is clear that Eq. (3.24) is not very helpful. One possible way to get around this problem
is to put the control volume boundaries not onto the nodes of the ansatzfunctions but to
put them around the nodes (see Fig. 3.4).

Figure 3.4: Position of the subdomains Ωi for the FVM



With this ansatz we get the following equations for a control volume Ωi inside the domain
Ω:

(u_{i−1} ∂N_{i−1}/∂x(x_{i+1}) + u_i ∂N_i/∂x(x_{i+1}) + u_{i+1} ∂N_{i+1}/∂x(x_{i+1}))
− (u_{i−1} ∂N_{i−1}/∂x(x_i) + u_i ∂N_i/∂x(x_i) + u_{i+1} ∂N_{i+1}/∂x(x_i)) = ∫_{x_1}^{x_2} f dx    (3.26)

Looking at Fig. 3.4 it is easy to find the appropriate values for the derivatives (assuming
the nodes of the ansatzfunctions are equidistant):

(u_{i−1} · 0 + u_i · (−1/l) + u_{i+1} · (1/l)) − (u_{i−1} · (−1/l) + u_i · (1/l) + u_{i+1} · 0) = ∫_{x_1}^{x_2} f dx    (3.27)

Finally we get:
(1/l)(u_{i−1} − 2u_i + u_{i+1}) = ∫_{x_i}^{x_{i+1}} f dx    (3.28)

which, after approximating the right hand side integral by l · f_i and dividing both sides by l, is exactly the same system of equations as in the finite difference method.

3.4 Higher dimensional elements


In one dimension the advantages of the Finite Element method do not seem overwhelming.
But already in two dimensions it is possible to model complex geometries, which cannot be
handled easily by the finite difference method, without difficulties. The next
sections will cover the basic ideas needed to create finite elements of arbitrary spatial dimension
and arbitrarily high order, although naturally most of the time the dimension will be less than four
and higher order elements do not always have advantages.

3.4.1 Isoparametric mapping


For the simple 1D elements it was easy to find the ansatzfunctions on an element directly
in the global coordinate system. In more dimensions this task becomes quite difficult. One
solution is to define ansatzfunctions on a convenient domain and to introduce a coordinate
transformation from this domain or local coordinate system to the global coordinate system
(see Fig. 3.5).


Figure 3.5: Coordinate transformation between the local and the global coordinate system

The two most used intervals for the local coordinate system are either the interval [−1 . . . 1]
or [0 . . . 1]. In higher dimensions the products of these intervals are used. It is also clear
that these intervals define lines, quadrilaterals and cubes in 1,2 and 3 dimensions. For
triangular elements slightly different domains are used.
In this lecture note the interval [−1 . . . 1] will be used. For 1D elements we get the following
ansatzfunctions on the local coordinate system:

N_1(ξ) = (1/2)(1 − ξ)    (3.29)

N_2(ξ) = (1/2)(1 + ξ)    (3.30)

Often this interval together with the ansatzfunctions is called Master- or Urelement because
it is the basis to derive all local elements.
Now a coordinate transformation from the interval [−1 . . . 1] to an arbitrary interval [xi . . . xi+1 ]
is required. The class of isoparametric elements uses the same ansatzfunctions for the co-
ordinate transformation. Other choices are the ansatzfunctions of lower order (lower poly-
nomial degree) which then give subparametric elements or ansatzfunctions of higher order
which result in superparametric elements. The latter two element classes can cause trouble
and thus are not used very often. For the isoparametric elements we then get the following
coordinate transformation from the masterelement to the element i with the coordinates
x_i, x_{i+1} in the global coordinate system:

x_glob(ξ) = x_i N_1(ξ) + x_{i+1} N_2(ξ)    (3.31)

Going back to the weak form of the heat equation we had the following equation for the
element stiffness matrix K:
K_{ij} = ∫_{x_i}^{x_{i+1}} (∂N_j/∂x) · (∂N_i/∂x) dx,   i, j ∈ [1, 2]    (3.32)
Inserting the coordinate transformation we get:

K_{ij} = ∫_{−1}^{1} (∂N_j/∂x)(x_glob(ξ)) · (∂N_i/∂x)(x_glob(ξ)) · (dx_glob(ξ)/dξ) dξ,   i, j ∈ [1, 2]    (3.33)

One little problem remains in Eq. (3.33). The partial derivatives of the ansatzfunctions
are still with respect to the global coordinate system. With the chain rule we obtain the
following equation:
∂N/∂ξ = (∂N/∂x)(x_glob(ξ)) · (∂x_glob/∂ξ)   ⇔   (∂N/∂x)(x_glob(ξ)) = (∂x_glob/∂ξ)⁻¹ · ∂N/∂ξ    (3.34)

Inserting this into Eq. (3.33) gives finally the integral equation for one element stiffness
matrix on the master element:

K_{ij} = ∫_{−1}^{1} ((∂x_glob/∂ξ)⁻¹ ∂N_j/∂ξ) · ((∂x_glob/∂ξ)⁻¹ ∂N_i/∂ξ) · (dx_glob(ξ)/dξ) dξ,   i, j ∈ [1, 2]    (3.35)

Computing this integral shows that it is equivalent to the equation obtained by integrating
in the global domain. For higher dimensions the integral equations on the master element
are derived exactly the same way.

3.4.2 Quadrilateral elements


In higher dimensions ansatzfunctions which have only local support are again required
to get sparse matrices. The simplest idea to get ansatzfunctions is thus to use the same
functions as in 1D in each spatial direction. Doing this we get the following ansatzfunctions
on the master element [−1 . . . 1] × [−1 . . . 1] (see Fig. 3.6) :

N_1(ξ, η) = (1/4)(1 − ξ)(1 − η)    (3.36)

N_2(ξ, η) = (1/4)(1 + ξ)(1 − η)    (3.37)

N_3(ξ, η) = (1/4)(1 + ξ)(1 + η)    (3.38)

N_4(ξ, η) = (1/4)(1 − ξ)(1 + η)    (3.39)

They look similar to a pyramid around a node (see Fig. 3.7). The isoparametric coordinate
transformation then becomes:

(x_glob, y_glob)(ξ, η) = (x_1, y_1) N_1(ξ, η) + (x_2, y_2) N_2(ξ, η) + (x_3, y_3) N_3(ξ, η) + (x_4, y_4) N_4(ξ, η)    (3.40)
where x1 , . . . , y4 are the global coordinates of the corner nodes of the quadrilateral. Some
difficulties appear when going to higher dimensions. Again the heat equation should il-
lustrate the use of the coordinate transformation. In 2D we have for the element stiffness
matrix:
K_{ij} = ∫_{Ω_elm} (∂N_j/∂x, ∂N_j/∂y) · (∂N_i/∂x, ∂N_i/∂y) dΩ_elm,   i, j ∈ [1, …, 4]    (3.41)

Figure 3.6: Masterelement for quadrilaterals

Figure 3.7: Schematic view of the ansatzfunction N_{i,j}

Inserting the coordinate transformation we get:

K_{ij} = ∫_{−1}^{1} ∫_{−1}^{1} (∂N_j/∂x, ∂N_j/∂y)(x_glob(ξ, η), y_glob(ξ, η)) · (∂N_i/∂x, ∂N_i/∂y)(x_glob(ξ, η), y_glob(ξ, η)) |J(ξ, η)| dξ dη,   i, j ∈ [1, …, 4]    (3.42)

Here |J| should denote the determinant of J, which is the Jacobian of the coordinate
transformation:

J = ⎡ ∂x_glob/∂ξ   ∂x_glob/∂η ⎤
    ⎣ ∂y_glob/∂ξ   ∂y_glob/∂η ⎦    (3.43)

Now we can again apply the chain rule to the spatial derivatives of the ansatzfunctions in
the masterelement:

∂N/∂ξ = (∂N/∂x)(∂x_glob/∂ξ) + (∂N/∂y)(∂y_glob/∂ξ)    (3.44)

∂N/∂η = (∂N/∂x)(∂x_glob/∂η) + (∂N/∂y)(∂y_glob/∂η)    (3.45)

With the Jacobian J this can be written more compactly:

(∂N/∂ξ, ∂N/∂η)ᵀ = Jᵀ (∂N/∂x, ∂N/∂y)ᵀ    (3.46)

Bringing the Jacobian to the left side gives:

J⁻ᵀ (∂N/∂ξ, ∂N/∂η)ᵀ = (∂N/∂x, ∂N/∂y)ᵀ    (3.47)

So the derivatives with respect to the global coordinate system in Eq. (3.42) can be replaced
by derivatives in the local coordinate system:

K_{ij} = ∫_{−1}^{1} ∫_{−1}^{1} (J⁻ᵀ (∂N_j/∂ξ, ∂N_j/∂η)ᵀ) · (J⁻ᵀ (∂N_i/∂ξ, ∂N_i/∂η)ᵀ) |J| dξ dη,   i, j ∈ [1, …, 4]    (3.48)

Higher dimensional elements can be treated in the same way. One point causing some
trouble in practical implementations is the term J−T . It implies some requirements for the
coordinate transformation. First the Jacobian must always and everywhere be invertible.
Furthermore a Jacobian with negative or zero determinant should be avoided.
A common problem in that context is the wrong ordering of the nodes in Eq. (3.40). For
2 dimensional quadrilaterals the nodes in the global coordinate system must be ordered
counterclockwise to have a positive determinant of the Jacobian.
Another cause for a negative Jacobian can be a highly distorted element where the angle at
one corner is greater than 180 degrees. Sometimes this can happen together with automatic
mesh deformation.
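Putting Eqs. (3.36)–(3.48) together, the element stiffness matrix of a bilinear quadrilateral can be evaluated with a 2×2 Gauss rule. The following is an illustrative sketch (in Python, not part of the notes); xy holds the four corner coordinates in counterclockwise order:

    import numpy as np

    def quad4_stiffness(xy):
        g = 1.0 / np.sqrt(3.0)
        gauss = [(-g, -g), (g, -g), (g, g), (-g, g)]   # 2x2 Gauss points, all weights 1
        K = np.zeros((4, 4))
        for xi, eta in gauss:
            # derivatives of N1..N4 with respect to (xi, eta)
            dN = 0.25 * np.array([[-(1-eta), -(1-xi)],
                                  [ (1-eta), -(1+xi)],
                                  [ (1+eta),  (1+xi)],
                                  [-(1+eta),  (1-xi)]])
            J = xy.T @ dN                  # Jacobian of Eq. (3.43)
            dNdx = dN @ np.linalg.inv(J)   # global derivatives, cf. Eq. (3.47)
            K += dNdx @ dNdx.T * np.linalg.det(J)
        return K

A nearly degenerate element shows up here as a very small or negative determinant of J.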

3.4.3 Triangular elements


The other fundamental element type beside the quadrilateral and its higher dimensional
relatives is the triangular element. It was also the first finite element ever. Isoparametric
mapping can be used for the triangular elements as well. The ansatzfunctions in the master
element (see Fig. 3.8) are:

N1 (ξ, η) = ξ (3.49)
N2 (ξ, η) = η (3.50)
N3 (ξ, η) = 1 − ξ − η (3.51)

For the isoparametric coordinate transformation we get:



Figure 3.8: Masterelement for triangular elements

(x_glob, y_glob)(ξ, η) = (x_1, y_1) N_1(ξ, η) + (x_2, y_2) N_2(ξ, η) + (x_3, y_3) N_3(ξ, η)    (3.52)

The element stiffness matrix can then be derived the same way as shown for the quadrilat-
eral. From the numerical point of view the quadrilateral elements achieve a higher accuracy
with the same number of nodes. In mechanical systems the triangular elements also tend
to be too stiff. Nevertheless in several areas triangular elements are still used because they
have some advantages. The first is that they are quite robust. This means they do not
fail numerically when they undergo large deformations. If they become degenerated they
lose accuracy, but they don't cause trouble like the quadrilaterals, which cannot withstand
inner angles greater than 180 degrees. Another advantage is the availability of powerful
automatic mesh generators. Research is going on in the field of mesh generation tools for
quadrilaterals or cubes, but the automatic generation of triangular or tetrahedral meshes
is still more powerful and robust.

3.4.4 Higher order elements

Consider a triangulation1 of an arbitrary domain Ω where the partial differential equation


should be solved. The accuracy of the approximate solution which can be computed with
the finite element method depends on the size of the elements which are used in the
discretisation. To describe this size, the diameter of the smallest circle that completely
covers the element is used in 2D. For 3D elements it is the diameter of the smallest ball.
The diameter will be named h.
Let the error between the exact solution u and the finite element approximation uh be
measured in the W₂¹ norm:
1
The word triangulation is used for general element patterns which are used to discretise a domain. It
does not always mean that the discretisation uses triangles

Figure 3.9: Area coordinates for triangular elements

||u − u_h||₁² = ∫_Ω ||∇(u − u_h)||² + ∫_Ω |u − u_h|²    (3.53)

For the Laplacian, the following estimate for the error ||u − uh ||21 can be found:

||u − uh ||21 ≤ C · hp (3.54)

where C is a constant and p depends on the order of the ansatzfunctions. From Eq. (3.54)
it can be seen that the error can be reduced either by increasing the number of elements
and thus reducing h or by increasing the order of the ansatzfunctions p. In the next part
methods to get elements with high order ansatzfunctions will be shown for triangles and
quadrilaterals.

Triangles

Another convenient way to write the ansatzfunctions for triangles is in terms of area co-
ordinates. These are defined as the quotient of the area of the triangles, which can be
constructed from a point inside the triangle, and the area of the complete triangle (see
Fig. 3.9):

L_j = A_j / A_tot    (3.55)

where Aj denotes the area of triangle Aj in Fig. 3.9. With these functions the ansatzfunc-
tions in the triangle can easily be written as:

N_1(ξ, η) = 1 − ξ − η = L_1(ξ, η)    (3.56)

N_2(ξ, η) = ξ = L_2(ξ, η)    (3.57)

N_3(ξ, η) = η = L_3(ξ, η)    (3.58)

To get a higher order element it is necessary to put some new nodes into the element.
For the next step, the midpoints of the edges of the triangle are a good choice. The
ansatzfunctions on these points must be constructed such that they are zero on all other
nodes and one at that point. For the fourth node, which should be located between node
1 and node 2, the product of L1 and L2 satisfies these conditions. Both are zero at node
3, and at node 1 or 2 one of these functions vanishes. At node 4 both L1 and L2 are 1/2, so a
correction factor of 4 must also be added. Hence:

N4 (ξ, η) = 4 · L1 (ξ, η) · L2 (ξ, η) (3.59)

The ansatzfunctions for node 5 and 6 can be constructed similarly. After that some cor-
rections must be applied to the old functions N1 to N3 because they must now become
zero on the additional nodes 4 to 6. This can be done by subtracting the newly created
functions N4 to N6 .
Pascal’s triangle can be used to determine the number and position of the nodes in advance.
It includes all the terms which appear in the expansion of (x + y)^n. In Fig. 3.10 the relation can be seen.

Lagrange basis

It was easy to derive the quadrilateral and hexahedral elements from the 1D ansatzfunctions
by simply taking the products of these function. To get higher order quadrilaterals it is
therefore only necessary to look at the 1D elements. On these elements the ansatzfunctions
of arbitrary order can be computed using the Lagrange interpolation formulas, which give
also the name for this basis:

l_k(ξ) = ∏_{j≠k} (ξ − ξ_j) / ∏_{j≠k} (ξ_k − ξ_j)    (3.60)

Here the ξk are the interpolation points or the nodal points in the finite element language.
The lk (ξ) is zero on all nodal points except for the kth where it is exactly one. For quadratic
elements with nodes at −1, 0, 1 in the masterelement we get :

Figure 3.10: Pascal's triangle



N_1^lin(ξ) = (ξ − 0)(ξ − 1) / ((−1 − 0)(−1 − 1)) = (1/2)(ξ² − ξ)    (3.61)

N_2^lin(ξ) = (ξ + 1)(ξ − 1) / ((0 − 1)(0 + 1)) = 1 − ξ²    (3.62)

N_3^lin(ξ) = (ξ − 0)(ξ + 1) / ((1 + 1)(1 − 0)) = (1/2)(ξ² + ξ)    (3.63)

With these functions the ansatzfunctions in the quadrilateral masterelement become:

N_1^quad(ξ, η) = N_1^lin(ξ) · N_1^lin(η)    (3.64)

N_2^quad(ξ, η) = N_1^lin(ξ) · N_2^lin(η)    (3.65)

N_3^quad(ξ, η) = N_1^lin(ξ) · N_3^lin(η)    (3.66)

N_4^quad(ξ, η) = N_2^lin(ξ) · N_1^lin(η)    (3.67)

…    (3.68)
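The Lagrange formula of Eq. (3.60) is easy to turn into code. A small sketch (Python, not from the notes) that builds the 1D basis for arbitrary nodes and forms the tensor products of Eqs. (3.64)–(3.67):

    import numpy as np

    def lagrange_basis(nodes):
        # returns the functions l_k of Eq. (3.60) for the given interpolation points
        def make(k):
            def lk(xi):
                num = den = 1.0
                for j, xj in enumerate(nodes):
                    if j != k:
                        num *= (xi - xj)
                        den *= (nodes[k] - xj)
                return num / den
            return lk
        return [make(k) for k in range(len(nodes))]

    N_lin = lagrange_basis([-1.0, 0.0, 1.0])                 # N1..N3 of Eqs. (3.61)-(3.63)
    N1_quad = lambda xi, eta: N_lin[0](xi) * N_lin[0](eta)   # Eq. (3.64)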

At last some remarks about the higher order elements. In most finite element codes
quadratic elements will be the highest order elements available. One point is that besides
being more accurate, higher order elements are much more expensive. That means they
need more computational time. One reason is the higher number of nodes (a quadratic
hexahedron already has 27 nodes). Most elements cannot be evaluated analytically any-
more, so numerical integration formulas are used. These formulas must also become more
accurate and thus more expensive if the ansatzfunctions have higher order. So at a certain order
the theoretical benefits of higher order elements are eaten up by their higher numerical costs.
Another disadvantage is that the elements become numerically less robust. Moving the
mid node on an edge too far away from the geometrical centre of the edge can cause a
failure of the isoparametric mapping and of the element.

3.5 Time dependent problems


For time dependent problems the weighted residual methods can be used exactly the same
way as for stationary problems. Consider the instationary heat equation:

u̇ − ∆u = f (3.69)

together with boundary conditions and initial conditions. Applying a weighted residual
method we get:

∫_Ω u̇ ϕ dΩ + ∫_Ω (∇u)ᵀ · ∇ϕ dΩ = ∫_Ω f ϕ dΩ   ∀ϕ    (3.70)

From Eq. (3.70) two methods for the time discretisation can be derived. One is the time-
space finite element method, which will not be treated here and the other is the method of
lines which separates time- and space discretisation. Looking at the approximation of u:

u ≈ u_h = Σ_{i=1}^{N} u_i N_i    (3.71)

it is clear that also:

u̇ ≈ u̇_h = Σ_{i=1}^{N} u̇_i N_i    (3.72)

holds. Inserting this ansatz into Eq. (3.70) allows us to transform the instationary partial
differential equation into a system of ODEs for the coefficients u_i:

Σ_{i=1}^{N} [ (∫_Ω N_j N_i dΩ) u̇_i + (∫_Ω (∇N_j)ᵀ · ∇N_i dΩ) u_i ] = ∫_Ω N_j f dΩ   ∀j    (3.73)

where the first integral defines M_{ij}, the second K_{ij}, and the right hand side f_j(t).

Writing this system in matrix form we obtain:

M u̇(t) + K u(t) = f(t)   ⇒   u̇ = −M⁻¹ K u + M⁻¹ f    (3.74)

So instead of a system of linear equations for the discretisation of a stationary problem we


get a system of ordinary differential equations. This is called a semidiscretisation because
it discretises only the spatial directions. Often the matrix M is called the mass matrix.
This name stems from the analysis of mechanical systems, where the matrix M is related
to the mass of a mechanical system.
In most cases another numerical method is required to find a solution which satisfies the
system of ordinary differential equations. One possible choice is the Euler forward method:

u(t + ∆t) = u(t) + ∆t(−M⁻¹ K u(t) + M⁻¹ f(t))    (3.75)

Although the Euler method is an explicit method, for this system it involves the solution of
a system of linear equations (instead of computing the inverse of M, which should never be
done in real applications). So the disadvantages of the explicit Euler method stay, while

the advantage of not having to solve a system of equations is lost. To circumvent this
problem often a lumped mass matrix is used instead of the correct matrix. The lumped
matrix is a diagonal matrix which is easy to invert. Its diagonal elements are simply the
sum of all entries in the row of the diagonal element.

M_L = diag(m_k),   m_k = Σ_{j=1}^{N} M_{kj}    (3.76)

Because the mass matrix only represents the inertia of the system, the error introduced in the
description of the physical system is tolerable for many applications.
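A sketch of the lumping and of the resulting cheap explicit step (Python, illustrative only):

    import numpy as np

    def lump(M):
        # row-sum lumping, Eq. (3.76): M_L = diag(m_k), m_k = sum_j M_kj
        return np.diag(M.sum(axis=1))

    def euler_forward_lumped(u, dt, M, K, f):
        # one explicit Euler step of Eq. (3.74) with the lumped mass matrix,
        # so only a division by the diagonal entries is needed
        mL = M.sum(axis=1)
        return u + dt * (f - K @ u) / mL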
Chapter 4

Hyperbolic equations

In the first chapter the Fourier’s law for heat transport or diffusion processes was in-
troduced. A slight variation of this equation was the transport equation which describes
convective heat transport. The difference between this two equations might seem small but
for the numerical treatment it is quite important. Similar equations also appear in many
other physical phenomena. Examples like the wave equation, the telegraph equation and
the transport equation will be shown. After that some properties of the solutions of hyper-
bolic equations will be analysed. Finally finite difference schemes to find an approximate
solution will be shown.

4.1 Introduction
Many physical phenomena like sound and electromagnetic fields need to be modelled with
waves. Thus the wave equation:

∂²u/∂t² − ∂²u/∂x² − ∂²u/∂y² = 0   ⇔   ü − ∆u = 0    (4.1)

is one prototype of a hyperbolic equation. Another one comes from transport processes,
as shown in the heat equation with convective transport:

∂u/∂t − β²∆u + (vᵀ · ∇u) = f    (4.2)
where v T is a prescribed velocity field. In the extreme case β = 0, which describes a pure
convective transport, the equation becomes the transport equation (shown in 1D):

∂u/∂t + v ∂u/∂x = 0    (4.3)


Another example is an elastic string (in a piano, or a guitar). If u(x, t) is the displacement
of the string at a certain point, the acceleration is ü(x, t). According to Newton's law the
inertia force is then −ρ ∂²u/∂t², where ρ is the density of the string. Assuming small displacements, the
restoring force from the elastic deformation is T ∂²u/∂x², with T being a material constant describing
the strength of the string. Putting these terms together with an external force term it
follows that:

ρ ∂²u/∂t² − T ∂²u/∂x² = f    (4.4)

So the motion of an elastic string can also be described by the wave equation.

4.1.1 Telegraph equation


Now we will consider an example from electrical engineering. It is called the telegraph
equation because it models the behaviour of electrical signals on a telegraph line. The first
step in building the model is to replace a small piece of the wire by a quadrupole built
from resistors, capacitors and coils (see Fig. 4.1). Letting the size of this piece go to zero
we obtain a partial differential equation describing the behaviour of signals on the wire.
Using Kirchhoff's laws we obtain the following two equations for the quadrupole:

−U + I · R′l + L′l · dI/dt + U + ∆U = 0    (4.5)

I − I_C − I_G − (I + ∆I) = 0    (4.6)

The two currents at the capacitor C and the conductivity G can be expressed in terms of
the voltage change:

I_C = (C′l) d(U + ∆U)/dt    (4.7)

I_G = (l/G′)(U + ∆U)    (4.8)

Letting l go to zero and inserting Eq. (4.7) into Eq. (4.5) we get:

∂U/∂x = −R′ · I − L′ ∂I/∂t    (4.9)

∂I/∂x = −C′ ∂U/∂t − S′ U    (4.10)

Figure 4.1: Quadrupole replacing a small piece of the telegraph line

Here S 0 is a replacement for l/G0 . Using matrix notation Eq. (4.9) and Eq. (4.10) can be
written as:

diag(C′, L′) ∂/∂t (U, I)ᵀ = − ⎡0 1; 1 0⎤ ∂/∂x (U, I)ᵀ − diag(S′, R′) (U, I)ᵀ    (4.11)

Multiplying Eq. (4.9) with the partial differential operator ∂/∂x and Eq. (4.10) with ∂/∂t
gives:

∂²U/∂x² = −R′ ∂I/∂x − L′ ∂²I/∂x∂t    (4.12)

∂²I/∂x∂t = −C′ ∂²U/∂t² − S′ ∂U/∂t    (4.13)

Inserting Eq. (4.13) into Eq. (4.12) results in:

∂²U/∂x² = S′R′ U + C′R′ ∂U/∂t + S′L′ ∂U/∂t + C′L′ ∂²U/∂t²    (4.14)

Sorting the terms and adding a source term v(x, t) we finally obtain:

∂²U/∂t² + (R′/L′ + S′/C′) ∂U/∂t − (1/(C′L′)) ∂²U/∂x² + (S′R′/(C′L′)) U = v(x, t)    (4.15)

Looking at Eq. (4.15) shows that this equation is very similar to the wave equation. Two
additional terms c₁U and c₂ ∂U/∂t are the only difference. The effect of these terms will
be examined later. But the main result is that the propagation of signals on a wire can be
seen as a wave phenomenon and can thus be described by a hyperbolic equation.

4.1.2 Analytical solutions

Again the analysis of hyperbolic equations should be started with analytical solutions to
these equations. Exponential functions in time and space should be a good first ansatz:

u(x, t) = A · e^{pt} e^{ikx},   A ≠ 0    (4.16)

where p and k are some constants. p describes the amplification of the solution in time,
while k is the wave number of the solution. Higher k correspond to higher frequencies
(perhaps for the telegraph equation the frequency of the input signal).

Transport equation

The partial derivatives of the ansatz with respect to t and x are:

∂u/∂t = p · u,   ∂u/∂x = ik u    (4.17)

Inserting these expressions into the transport equation Eq. (4.3), which is the simplest
hyperbolic equation, we get:

p · u + v ik u = 0   ⇒   p = −ivk    (4.18)

So if Eq. (4.16) satisfies the transport equation, the amplification factor p is purely imagi-
nary. Hence it does not describe an amplification but an oscillation in time. Introducing
the circular frequency ω it follows that:

iω = p = −ivk   ⇒   ω = −vk    (4.19)

From this relation we also get another form for the analytical solution which satisfies the
transport equation:

u(x, t) = A e^{iωt} e^{ikx} = A e^{i(ωt + kx)} = A e^{ik(x − vt)}    (4.20)

Looking into the spatial direction this solution is a trigonometric function or wave. On the
other hand an observer standing at one point of the domain will see that the solution in
time is also a wave. If the observer moves with the top of a wave, the time and spatial
wave must stay in constant phase, which means x − vt = const. Therefore the observer must
choose his position such that:

x = vt   ⇔   x = c_p t    (4.21)

where c_p is the phase velocity, which in this case is equal to v.

Wave equation

Now we will take a look at the wave equation as another typical hyperbolic equation:

∂²u/∂t² − c² ∂²u/∂x² = 0    (4.22)
The partial derivatives are:

∂²u/∂t² = (−iω)² A e^{i(kx−ωt)} = −ω² A e^{i(kx−ωt)}    (4.23)

∂²u/∂x² = (ik)² A e^{i(kx−ωt)} = −k² A e^{i(kx−ωt)}    (4.24)
Inserting these terms into Eq. (4.22) we obtain:

−ω 2 · u + c2 k 2 · u = 0 ⇒ ω 2 = c2 k 2 ⇒ ω = ±ck (4.25)

Eq. (4.25) is called the dispersion relation of a wave. The dispersion describes the difference
in speed of waves with different frequency. If ω/k = const holds, all waves travel with the
same speed. So there is no dispersion. A signal build from several waves of different
frequencies will travel along the domain unchanged.
Putting the dispersion relation into the ansatz we get for u:

u = Aei(kx∓ckt) (4.26)

It can be seen that for the wave equation two phase speeds exist. One with positive sign
and the other with negative sign. Information can travel from a point at time t into
both directions with the same speed. But for the wave equation it is not possible that
information travels faster than the phase speed. Thus it is possible to draw an area in the
time space domain which can be influenced by the information at a given point in space
and time (see Fig. 4.2).
Because for electromagnetic waves the speed of light is the phase speed this area of influence
is often called the lightcone. Applying a binomial formula to the wave equation shows that
it can be seen as two transport equations with different directions:

(∂/∂t − c ∂/∂x)(∂/∂t + c ∂/∂x) u = 0    (4.27)

Klein-Gordon equation

A slight variation of the pure wave equation is the Klein-Gordon equation. Although it
has it origins in quantum physics it can also be interpreted as a string which oscillates in
some foam which damps the oscillations:

∂²u/∂t² − c′ ∂²u/∂x² + du = 0    (4.28)
The term du is responsible for the damping. Deriving the dispersion relation shows that:

ω = ± c0 k 2 + d (4.29)

Drawing this function together with the dispersion relation of the wave equation (see
Fig. 4.3) shows that this time there is dispersion. So waves with longer wavelength travel
slower than waves with shorter wavelength. A signal put into that system will become
a different signal as time progresses. For the telegraph equation a similar result can be
obtained. Therefore it is not possible to transfer information losslessly over long distances.
After some time a signal having rectangular shape would become unrecognisable.

Figure 4.2: Lightcone of the wave equation

Figure 4.3: Dispersion relation of the Klein-Gordon equation (blue) compared with the dispersion of the wave equation (green)

Beam equation

Another equation which seems to be hyperbolic is the beam equation:

ρ ∂²u/∂t² + (EI/ρ) · ∂⁴u/∂x⁴ = 0    (4.30)

Here EI denotes the elastic modulus and ρ is the density of the material while u is the
displacement of the beam. Putting all material constants together in one constant a2 gives:

∂²u/∂t² + a² · ∂⁴u/∂x⁴ = 0    (4.31)
The partial derivatives of the ansatz Eq. (4.16) are:

∂²u/∂t² = −ω² e^{i(kx−ωt)}    (4.32)

∂⁴u/∂x⁴ = k⁴ e^{i(kx−ωt)}    (4.33)

With Eq. (4.31) we obtain:

−ω 2 ei(kx−ωt) + a2 k 4 ei(kx−ωt) = 0 (4.34)

and hence:

ω(k) = ±ak²    (4.35)

So the transmission speed is not limited. It can become infinitely large if the frequency
is high enough. Actually the equation is not really hyperbolic. It is a parabolic equation
which only looks like a hyperbolic equation. For parabolic equations it is known that these
allow infinite transmission speeds. But although it looks as if this observation can be used
to achieve infinite transmission speeds with the help of beams, it is not possible because
the model does not represent the real physics anymore if the frequencies become infinitely
high.

4.1.3 Fourier series solution


Also for the hyperbolic equations it is possible to construct a solution for arbitrary initial
conditions by a Fourier series approximation. As shown in the previous sections, the
exponential function is a solution of the hyperbolic equations. So the integral over different
wavenumbers must be also a solution of the hyperbolic equations:
Φ(x) = ∫ Φ̂(k) e^{ikx} dk    (4.36)

With the dispersion relation for the equation the time dependent solution can be found:

u(x, t) = Σ_ω ∫ Φ̂(k) e^{i(kx − ω(k)t)} dk    (4.37)

It is the sum over the different branches of the dispersion relation. This solution will
become quite useful for the stability analysis and for the derivation of the group speed.

4.1.4 D’Alambert’s solution


In the previous section the ansatz function was always an exponential function. For the
wave and transport equation another analytical solution exists. Taking the following initial
conditions:

u(x, 0) = u0 (x) = Φ(x) (4.38)

the transport equation has the following analytical solution:

u(x, t) = Φ(x + vt) (4.39)

For this solution we have the partial derivatives:


∂u/∂x = Φ′(x + vt)    (4.40)
∂u/∂t = v·Φ′(x + vt)    (4.41)

Inserting these derivatives into the transport equation gives:

v Φ′(x + vt) − v Φ′(x + vt) = 0  ⇔  0 = 0    (4.42)



Obviously this equation is always satisfied, and thus Eq. (4.39) is a solution of the transport
equation. For the wave equation a similar result can be derived. Here the analytical
solution is (with the same initial conditions as for the transport equation):

u(x, t) = αΦ(x + ct) + βΦ(x − ct)    (4.43)

For the second partial derivatives we compute:

∂²u/∂x² = αΦ″(x + ct) + βΦ″(x − ct)    (4.44)
∂²u/∂t² = c²(αΦ″(x + ct) + βΦ″(x − ct))    (4.45)

With this solution Eq. (4.22) becomes:

c²(αΦ″(x + ct) + βΦ″(x − ct)) − c²(αΦ″(x + ct) + βΦ″(x − ct)) = 0  ⇔  0 = 0    (4.46)

So Eq. (4.43) satisfies the wave equation. Although the result might seem trivial it is quite
useful for the analysis of numerical schemes, because it offers a huge range of analytical
solutions which can be used as a test case.
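The following is a minimal sketch (Python/NumPy; the profile Φ and all constants are illustrative choices) of how such d'Alembert-type solutions can serve as analytical test cases:

    import numpy as np

    # Sketch of d'Alembert-type reference solutions (Eqs. 4.39 and 4.43).
    def phi(s):
        """Initial profile Phi: a smooth Gaussian bump."""
        return np.exp(-s ** 2)

    def transport_solution(x, t, v=1.0):
        """Exact solution u(x, t) = Phi(x + v t) of the transport equation, Eq. (4.39)."""
        return phi(x + v * t)

    def wave_solution(x, t, c=1.0, alpha=0.5, beta=0.5):
        """Exact solution u = alpha Phi(x + c t) + beta Phi(x - c t), Eq. (4.43);
        alpha = beta = 1/2 corresponds to zero initial velocity."""
        return alpha * phi(x + c * t) + beta * phi(x - c * t)

    x = np.linspace(-5.0, 5.0, 201)
    assert np.allclose(wave_solution(x, 0.0), phi(x))   # check u(x, 0) = Phi(x)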

4.1.5 Characteristics of 1st order equations


A first order hyperbolic equation can be written in general form as:

a ∂u/∂ξ + b ∂u/∂η = c    (4.47)

Geometrically the function u(ξ, η) describes a surface in a three-dimensional space with
coordinates u, ξ, η. A normal vector can thus be found at every point of the surface.
From calculus it is known that this vector is:

(∂u/∂ξ, ∂u/∂η, −1)ᵀ    (4.48)

Using the standard scalar product, Eq. (4.47) can be written as:

(a, b, c)ᵀ · (∂u/∂ξ, ∂u/∂η, −1)ᵀ = 0    (4.49)

This allows another interpretation of the partial differential equation: its solution is the
surface whose normal vector is orthogonal to the coefficient vector of the partial dif-
ferential equation. The idea of the method of characteristics is to find a coordinate
transformation which reduces the partial differential equation to an ordinary differential
equation. Introducing the parameter s the coordinates become:

ξ(s), η(s), u(s) (4.50)

The geometric interpretation is a line in the three dimensional space. We now choose that
u should depend linearly on s:

du/ds = c    (4.51)
Writing u in terms of the coordinates ξ(s), η(s) we get

u(ξ, η) = u(ξ(s), η(s)) (4.52)

and thus (with the chain rule):

du/ds = (∂u/∂ξ)(dξ/ds) + (∂u/∂η)(dη/ds) = c    (4.53)

Comparing this equation with Eq. (4.47) it is clear that:

dξ/ds = a,   dη/ds = b    (4.54)
From these equations it can be seen that the coordinate transformation is a line in the ξ, η
plane (at least for constant coefficients a and b). The steepness of this line with respect to
time limits the speed at which signals can be transmitted. If c = 0 the solution u does not
change along this line because du/ds = c = 0. So the initial conditions will be transported
along the characteristic (see Fig. 4.4).
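A minimal sketch of the method of characteristics for constant coefficients might look as follows (Python/NumPy; the values of a, b, c and the initial profile are illustrative choices):

    import numpy as np

    # Sketch of the method of characteristics for Eq. (4.47) with constant
    # coefficients.  Each characteristic is the straight line
    # (xi, eta) = (xi0 + a s, b s), and along it du/ds = c (Eq. 4.51).
    a, b, c = 1.0, 2.0, 0.0              # c = 0: u is constant along each characteristic
    phi = lambda xi: np.exp(-xi ** 2)    # initial data u(xi, 0) = Phi(xi) on the line eta = 0

    def characteristic(xi0, s):
        """Return one characteristic curve and the solution along it."""
        xi = xi0 + a * s                 # d xi / ds = a   (Eq. 4.54)
        eta = b * s                      # d eta / ds = b  (Eq. 4.54)
        u = phi(xi0) + c * s             # d u / ds = c    (Eq. 4.51)
        return xi, eta, u

    s = np.linspace(0.0, 2.0, 50)
    curves = [characteristic(xi0, s) for xi0 in np.linspace(-3.0, 3.0, 13)]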

4.1.6 Group velocity


The dispersion relation showed the theoretical limits for the transmission of waves. Nor-
mally it is not very useful to send pure harmonic waves along a medium because they do
not transport information. For practical purposes it is more important to know how fast
the maximal amplitude of a signal travels through the domain. As already mentioned, the
solution of hyperbolic equations can be formulated as a Fourier integral:

u(x, t) = ∫ Φ̂(k) e^{i(kx−ω(k)t)} dk    (4.55)
Figure 4.4: Transmission of initial conditions u₀ along the characteristic

Considering two waves with slightly different wavenumbers, the solution contains:

e^{i(kx−ωt)} + e^{i((k+Δk)x−(ω+Δω)t)}    (4.56)

A short manipulation gives:

e^{i(kx−ωt)} (1 + e^{i(Δkx−Δωt)})    (4.57)

where the term in parentheses equals 2 if (Δkx − Δωt) = 0. So the condition which must
be satisfied at the point of maximal amplitude is (Δkx − Δωt) = 0. Bringing the speed
x/t to the left side of the equation we obtain:

x/t = Δω/Δk    (4.58)

Letting Δk go to zero we get the definition of the group speed:

lim_{Δk→0} Δω/Δk = dω(k)/dk = c_gr    (4.59)

While the phase speed limits the transmission of single waves (which carry no information),
the group speed limits the transfer of information. Applying this result to the wave
equation with the dispersion relation ω(k) = ±ck we obtain the following speeds:

c_ph = ω(k)/k = ±c,   c_gr = dω(k)/dk = ±c    (4.60)

Here the phase speed is equal to the group speed. The wave equation thus transports
information with the same speed as the waves. Looking at d'Alembert's solution this is
clear, because arbitrary initial conditions are transported without any change, so the
maximum of the solution travels with the same speed as everything else. Taking again a
look at the beam equation with the dispersion relation ω(k) = ±ak² we obtain the following
speeds:

c_ph = ω(k)/k = ±ak,   c_gr = dω(k)/dk = ±2ak    (4.61)

It is interesting that here the group speed is even higher than the phase speed, so the
maximum travels faster than the waves themselves. But as mentioned earlier this is due
to the insufficient model, which does not describe the wave phenomena in beams correctly
at high frequencies.
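As a quick check, phase and group speed can be evaluated numerically from a given dispersion relation; the following sketch (Python/NumPy, illustrative constants) approximates dω/dk by a central difference:

    import numpy as np

    # Sketch: phase and group speed for a given dispersion relation omega(k).
    def speeds(omega, k, dk=1e-6):
        c_ph = omega(k) / k                                  # phase speed, Eqs. (4.60)/(4.61)
        c_gr = (omega(k + dk) - omega(k - dk)) / (2 * dk)    # group speed d omega / d k
        return c_ph, c_gr

    a, c, k = 1.0, 1.0, 2.0
    print(speeds(lambda k: c * k, k))        # wave equation: (c, c) = (1.0, 1.0)
    print(speeds(lambda k: a * k ** 2, k))   # beam equation: (a k, 2 a k) = (2.0, 4.0)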

4.1.7 Eigenvector decomposition


Let u be a vector of time dependent functions:

u(x, t) = (u₁, . . . , u_d)ᵀ    (4.62)

Then a multidimensional hyperbolic equation can be written as:

∂u/∂t + A ∂u/∂x = 0    (4.63)

where A ∈ R^{d×d} is a matrix. With a set of eigenvectors {e₁, . . . , e_d} and eigenvalues
λ₁, . . . , λ_d the following equation must hold:

A e_j = λ_j e_j   ∀ j ∈ [1, . . . , d]    (4.64)

By using the eigenvector basis it is possible to write the function vector u in terms of the
eigenvectors:

u(x, t) = Σ_{j=1}^{d} a_j(x, t) e_j    (4.65)

For the partial derivatives we obtain:

∂u/∂t = Σ_{j=1}^{d} (∂a_j(x, t)/∂t) e_j    (4.66)
∂u/∂x = Σ_{j=1}^{d} (∂a_j(x, t)/∂x) e_j    (4.67)

Inserting these expressions into Eq. (4.63) the partial differential equation becomes:

Σ_{j=1}^{d} (∂a_j(x, t)/∂t) e_j + A Σ_{j=1}^{d} (∂a_j(x, t)/∂x) e_j = 0    (4.68)

Using Eq. (4.64) and sorting the terms gives:

Σ_{j=1}^{d} (∂a_j/∂t + λ_j ∂a_j/∂x) e_j = 0    (4.69)

Now this equation can be divided into d independent transport equations for the functions
aj :

∂a_j/∂t + λ_j ∂a_j/∂x = 0   ∀ j ∈ [1, . . . , d]    (4.70)
The initial conditions for the functions aj must be constructed such that:

u(x, 0) = Σ_{j=1}^{d} a_j(x, 0) e_j    (4.71)

With these equations it is possible to find analytical solutions even for multidimensional
hyperbolic equations. But this method is limited to cases without dispersion.
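A minimal sketch of this decomposition in Python/NumPy follows; the matrix A (with characteristic speeds ±1) and the initial condition are illustrative choices, not taken from the notes:

    import numpy as np

    # Sketch of the eigenvector decomposition (Eqs. 4.62-4.71) for a constant
    # coefficient system u_t + A u_x = 0.
    A = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
    lam, E = np.linalg.eig(A)                 # eigenvalues lambda_j, eigenvectors = columns of E

    def u0(x):
        """Initial condition u(x, 0), one row per component."""
        return np.vstack([np.exp(-x ** 2), np.zeros_like(x)])

    def solve(x, t):
        rows = []
        for j, lam_j in enumerate(lam):
            a0 = np.linalg.solve(E, u0(x - lam_j * t))   # coefficients a_j at the shifted points, Eq. (4.71)
            rows.append(a0[j])                           # a_j(x, t) = a_j(x - lambda_j t, 0), Eq. (4.70)
        return E @ np.vstack(rows)                       # recombine according to Eq. (4.65)

    x = np.linspace(-5.0, 5.0, 401)
    u = solve(x, 1.0)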

4.2 Numerical methods


Now we will consider numerical methods for hyperbolic equations. First the three simplest
finite difference approximations will be shown. After that a stability analysis will show
which methods may be used. Then some comparisons between numerical and analytical
solutions regarding the propagation of waves will be made. Finally the influence of the
time discretisation will be examined.

4.2.1 Finite difference approximation


The transport equation in 1D consists of a first partial derivative with respect to time and
a first partial derivative with respect to x. In analogy to the discretisation of the heat
equation the continuous time and space domain is divided into discrete points (compare
Fig. 1.5). For the time derivative we use forward differences:

∂u/∂t ≈ (1/Δt)(u_{n+1,j} − u_{n,j}) + O(Δt)    (4.72)

The spatial derivative will be approximated with three different schemes, whose properties
will be analysed later:

∂u/∂x ≈ (u_{n,j+ε} − u_{n,j−η}) / ((ε + η)h)    (4.73)

Here ε and η are the parameters determining the type of spatial discretisation:

• ε = 1, η = 1: central differences, O(h²)

• ε = 1, η = 0: forward differences, O(h)

• ε = 0, η = 1: backward differences, O(h)

With these approximations we get the discrete form of the transport equation:

(1/Δt)(u_{n+1,j} − u_{n,j}) + c (u_{n,j+ε} − u_{n,j−η}) / ((ε + η)h) = 0    (4.74)
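A minimal sketch of the scheme (4.74) on a periodic grid might look as follows (Python/NumPy; mesh size, time step, wave speed and initial condition are illustrative choices):

    import numpy as np

    # Sketch of the explicit scheme (4.74); (eps, eta) selects the spatial
    # difference.  Here r = c dt / h = 0.5.
    c, h, dt = 1.0, 0.01, 0.005
    x = np.arange(0.0, 1.0, h)
    u = np.exp(-200 * (x - 0.5) ** 2)          # initial condition

    def step(u, eps, eta):
        """One forward Euler step of Eq. (4.74); np.roll realises periodic boundaries."""
        du_dx = (np.roll(u, -eps) - np.roll(u, eta)) / ((eps + eta) * h)
        return u - dt * c * du_dx

    for n in range(100):
        u = step(u, eps=0, eta=1)              # eps = 0, eta = 1: backward (upwind) differences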

4.2.2 Stability analysis


For the stability analysis we use the following ansatz:

u_{n,j} = e^{i(kx−ωt)}    (4.75)

Inserting the index to coordinate transformations x = j·h and t = n·Δt we obtain:

u_{n,j} = e^{−iωΔt·n} e^{ikjh} = G(k)ⁿ e^{ikjh}    (4.76)

where G(k) is the gain factor which describes the amplification of a wave with wavenumber
k. Using this ansatz in Eq. (4.74) gives the following expression:

G(k)^{n+1} e^{ikjh} − G(k)ⁿ e^{ikjh} + (r/(ε + η)) (G(k)ⁿ e^{ik(j+ε)h} − G(k)ⁿ e^{ik(j−η)h}) = 0    (4.77)

where r = cΔt/h is the Courant number, an important parameter for the numerical
solution of hyperbolic equations. Dividing by G(k)ⁿ e^{ikjh} and solving for G(k) the result
is:

G(k) = 1 − (r/(ε + η)) (e^{ikεh} − e^{−ikηh})    (4.78)
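The gain factor (4.78) can also be checked numerically; the following sketch (Python/NumPy, illustrative values for r and h) evaluates |G(k)| for the three schemes over the resolvable wavenumbers:

    import numpy as np

    # Sketch: von Neumann check |G(k)| <= 1 for the three spatial schemes.
    h, r = 0.01, 0.5                          # mesh size and Courant number c dt / h
    k = np.linspace(1e-3, np.pi / h, 500)     # wavenumbers up to the grid limit

    def gain(k, eps, eta):
        return 1.0 - r / (eps + eta) * (np.exp(1j * k * eps * h) - np.exp(-1j * k * eta * h))

    for name, (eps, eta) in {"forward": (1, 0), "backward": (0, 1), "central": (1, 1)}.items():
        stable = np.all(np.abs(gain(k, eps, eta)) <= 1.0 + 1e-12)
        print(name, "stable" if stable else "unstable")
    # expected: forward -> unstable, backward -> stable (r <= 1), central -> unstable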

Forward differences

Inserting the parameters ε = 1, η = 0 for the forward differences into Eq. (4.78), it follows
that

G(k) = 1 − r(e^{ikh} − 1) = 1 + r − r e^{ikh}    (4.79)

Recalling that for stability |G(k)| ≤ 1, we can see from Fig. 4.5 that this method is unstable
for all r > 0. So it is clear that it cannot be used for hyperbolic equations.
Figure 4.5: Stability region of the forward difference method
Figure 4.6: Stability region of the backward difference method
Figure 4.7: Stability region of the central difference method

Taking a look at the computational stencil reveals that the numerical method takes the
value in front of the current point (let the front be defined as the direction in which the
convection transports the solution). Therefore this method is often referred to as the
downwind method.

Backward differences

For the backward differences the amplification G(k) becomes:

G(k) = 1 − r(1 − e^{−ikh}) = 1 − r + r e^{−ikh}    (4.80)

From Fig. 4.6 it can be seen that this time the amplification factor lies within the stability
region, at least for

r = cΔt/h ≤ 1    (4.81)

This relation is called the Courant-Friedrichs-Lewy condition. If it is satisfied, the transport
equation can be solved with the backward difference method, which is often called, in
analogy to the forward differences, the upwind method because it uses the value which lies
upstream.
It should be noted that, similar to the heat equation, this condition relates the convective
velocity, the time step and the spatial discretisation: if the velocity is higher or the spatial
discretisation becomes finer, the time step size must be reduced.

Central differences

Finally the gain factor for the central difference scheme is:

G(k) = 1 − (r/2)(e^{ikh} − e^{−ikh}) = 1 − i·r·sin(kh)    (4.82)

Hence the absolute value will always be greater than or equal to one, which makes this
method unstable. But the central difference scheme is the only one of the three which
achieves second order accuracy. Therefore methods have been proposed which cure the
instability of the central difference scheme while retaining its accuracy.

4.2.3 Friedrichs method

An intuitive explanation for the unconditional instability of the central difference method
is similar to that of the Richardson scheme for the heat equation: the points used for the
space discretisation are not coupled to the points which are involved in the time discretisation.
A workaround, which corresponds to the Du Fort-Frankel scheme for the heat equation, is
to replace u_{n,j} in the time derivative by (1/2)(u_{n,j−1} + u_{n,j+1}). This scheme is called the
Friedrichs method:

u_{n+1,j} − (1/2)(u_{n,j−1} + u_{n,j+1}) + (r/2)(u_{n,j+1} − u_{n,j−1}) = 0    (4.83)

Doing a von Neumann stability analysis gives the following gain factor:

G(k) = (1/2 + r/2) e^{−ikh} + (1/2 − r/2) e^{ikh} = cos(kh) − i·r·sin(kh)    (4.84)

For the absolute value of G we get:

|G(k)|² = cos²(kh) + r² sin²(kh) ≤ 1   for r ≤ 1    (4.85)

As a consequence the Friedrichs scheme for the transport equation is stable for Courant
numbers not larger than one, and it is second order accurate in space, which is an advantage
over the upwind scheme.
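A minimal sketch of one Friedrichs step on a periodic grid (Python/NumPy; the Courant number and the initial condition are illustrative choices):

    import numpy as np

    # Sketch of the Friedrichs scheme (4.83) with Courant number r.
    r = 0.8
    u = np.exp(-200 * (np.arange(0.0, 1.0, 0.01) - 0.5) ** 2)

    def friedrichs_step(u):
        u_left, u_right = np.roll(u, 1), np.roll(u, -1)          # u_{n,j-1}, u_{n,j+1}
        return 0.5 * (u_left + u_right) - 0.5 * r * (u_right - u_left)

    for n in range(50):
        u = friedrichs_step(u)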
Figure 4.8: Lax-Wendroff method

4.2.4 Lax-Wendroff method

Another proposed method for the transport equation is the Lax-Wendroff method. Its
goal is to also achieve second order accuracy in time. Central difference approximations
have this property, but simply replacing the forward difference used for the time derivative
by a central difference produces an unstable method. Instead, the first step of the idea is
to introduce intermediate points which lie between the discretisation points. This is
possible due to the fact that the Friedrichs scheme does not use u_{n,j}. So if we insert
u_{n,j} and u_{n,j+1} for u_{n,j−1} and u_{n,j+1}, the Friedrichs scheme (with half the step sizes)
computes a point u_{n+1/2,j+1/2}.

After that has been done for all points, these points can be used as the basis for applying
a central difference scheme (see Fig. 4.8).

Writing out the three steps (two applications of the Friedrichs method, one central difference
step) we get:

(2/Δt)(u_{n+1/2,j+1/2} − (1/2)(u_{n,j} + u_{n,j+1})) + (c/h)(u_{n,j+1} − u_{n,j}) = 0   (right triangle in Fig. 4.8)    (4.86)
(2/Δt)(u_{n+1/2,j−1/2} − (1/2)(u_{n,j} + u_{n,j−1})) + (c/h)(u_{n,j} − u_{n,j−1}) = 0   (left triangle in Fig. 4.8)    (4.87)
(1/Δt)(u_{n+1,j} − u_{n,j}) + (c/h)(u_{n+1/2,j+1/2} − u_{n+1/2,j−1/2}) = 0   (central difference)    (4.88)

After some manipulation these equations can be combined into a form which only involves
the original discretisation points:

(1/Δt)(u_{n+1,j} − u_{n,j}) + (c/(2h))(u_{n,j+1} − u_{n,j−1}) − (c²Δt/(2h²))(u_{n,j+1} − 2u_{n,j} + u_{n,j−1}) = 0    (4.89)

The last term approximates −(c²Δt/2)·∂²u/∂x².

It is easy to see that the Lax-Wendroff scheme adds a term which corresponds to a diffusive
part in the partial differential equation although there is no such term in the original
equation. This is called numerical diffusion. The smoothing property of the Laplace
operator stabilises the numerical scheme. Also in the Finite Element method adding a
small diffusive part was one of the first methods to handle the problem occurring with
hyperbolic equations.
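A minimal sketch of the combined Lax-Wendroff step (4.89) on a periodic grid (Python/NumPy; the Courant number and the initial condition are illustrative choices):

    import numpy as np

    # Sketch of the Lax-Wendroff step in the combined form of Eq. (4.89).
    r = 0.8
    u = np.exp(-200 * (np.arange(0.0, 1.0, 0.01) - 0.5) ** 2)

    def lax_wendroff_step(u):
        u_left, u_right = np.roll(u, 1), np.roll(u, -1)          # u_{n,j-1}, u_{n,j+1}
        return (u - 0.5 * r * (u_right - u_left)                 # central difference in space
                  + 0.5 * r ** 2 * (u_right - 2 * u + u_left))   # numerical diffusion term

    for n in range(50):
        u = lax_wendroff_step(u)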

4.2.5 Dispersion of numerical methods


During the analysis of the properties of the analytical solutions it turned out that dispersion
is an important aspect of hyperbolic equations. Now the reproduction of the dispersion
relation by a numerical method will be examined in more detail.
One analytical solution of the transport equation was:

u(x, t) = ∫ Φ̂(k) e^{i(kx−ω(k)t)} dk    (4.90)

where Φ̂(k) depends on the initial conditions. The phase speed c_ph was defined as:

c_ph = ω(k)/k    (4.91)

and the group speed as:

c_gr = dω(k)/dk    (4.92)

The discretisation replaced the exponential term in Eq. (4.90) by a discrete counterpart:

e^{i(kx−ω(k)t)}  ⇒  e^{i(kjh−ω(k)Δt·n)} = (e^{−iω(k)Δt})ⁿ e^{ikjh}    (4.93)

where the factor e^{−iω(k)Δt} plays the role of G(k).

Introducing the numerical dispersion relation ω̂(k) we get:

G(k) = e^{−iω̂(k)Δt}  ⇒  ln G(k) = −iω̂(k)Δt    (4.94)

and hence:

ω̂(k) = (i/Δt) ln G(k)    (4.95)
Now it is trivial to derive the numerical phase and group speed ĉph and ĉgr :

ĉ_ph(k) = ω̂(k)/k    (4.96)
ĉ_gr(k) = dω̂(k)/dk    (4.97)

A further investigation of these equations will be given in the next section.
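As a small illustration, the numerical dispersion relation (4.95) and the numerical phase speed (4.96) of the upwind scheme can be evaluated directly from its gain factor; the following sketch (Python/NumPy, illustrative grid parameters) does this:

    import numpy as np

    # Sketch: numerical dispersion of the upwind scheme from its gain factor (4.80).
    c, h, dt = 1.0, 0.01, 0.008
    r = c * dt / h
    k = np.linspace(1.0, 0.99 * np.pi / h, 400)

    G = 1.0 - r + r * np.exp(-1j * k * h)     # gain factor of the upwind scheme
    omega_hat = 1j * np.log(G) / dt           # numerical dispersion relation, Eq. (4.95)
    c_ph_hat = omega_hat.real / k             # numerical phase speed, Eq. (4.96)
    damping = -omega_hat.imag                 # > 0 means the corresponding wave is damped

    # c_ph_hat is close to c for well resolved waves (k h << 1) and deviates for short waves.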

4.3 Time integration


In the previous sections two methods were proposed for the discretisation of the time deriva-
tive. One was the well known Euler forward method and the other was the central dif-
ference scheme, which was only stable together with an artificial diffusion term (in the
Lax-Wendroff method). Now the aspects of time integration will be analysed in more detail,
because they are quite important for the overall behaviour of the numerical solution.

4.3.1 General remarks


During the analysis of the beam equation, which is a parabolic equation, it turned out
that parabolic equations allow infinite transmission speeds. So explicit time integration
methods, which compute the result of the next time step only from values which lie close
to the computed point, cannot reproduce this infinite transmission speed. In contrast
implicit methods like the Euler backward method or the trapezoidal rule can reproduce
this behaviour because all points are coupled through the system of linear equations.

Comparing this with the situation for hyperbolic equations, where we have only a finite transmission
speed, the explicit methods seem more appropriate. Especially the upwind method repro-
duces the behaviour of the transport equation very intuitively: it simply takes the value
of the point which lies upstream and "transports" it to the next point. Explicit
methods are therefore more "natural" for hyperbolic equations than implicit methods.
Nevertheless, if large time steps are to be used, implicit methods are also necessary for hy-
perbolic equations. This is clear because for large time steps the value of a point may have
to be transported over a distance which is longer than the distance between two neighbouring
grid points. Only an implicit method can do this.
Summarising these results we get the following rules of thumb:

• Parabolic equations with diffusive solutions: implicit time integration methods in the
method of lines are "natural".

• Parabolic equations with waves: explicit methods are "natural" but enforce severe
restrictions on the time step size Δt.

• Hyperbolic equations with wave behaviour: explicit methods are "natural" but the
time step size must be kept small enough.

• Hyperbolic equations with large time step sizes: implicit methods are "natural".

4.3.2 Analysis of the time integration


Measuring the accuracy of a numerical method is not trivial, because especially for wave
phenomena the pointwise difference between the exact and the numerical solution may be
large although the solution is not bad at all (see Fig. 4.9). There the distance between the
two functions is large, but essentially only the frequency is slightly different.
For the wave equation a comparison of the numerical and the analytical group
or phase speed will therefore give more useful results.

Spatial discretisation

Starting again with the wave equation:

∂²u/∂t² − c² ∂²u/∂x² = 0    (4.98)
a spatial discretisation (today mostly finite elements) transforms the partial differential
equation into a system of second order differential equations:

Mv̈ + Kv = f̃ (4.99)
Figure 4.9: Problem of measuring the error between waves
Figure 4.10: Relation between numerical group speed and analytical group speed for finite elements (schematic)

In the previous sections the group and phase speed were defined as:

c_ph = ω/k,   c_gr = dω/dk    (4.100)
Now we look at the dispersion relation of the spatially discretised wave equation. As
ansatz function the exponential function is used again:

v(t) = v₀ e^{iωt}    (4.101)
⇒ v̈(t) = −ω² v₀ e^{iωt}    (4.102)

With this ansatz the discrete equation becomes:

K v₀ = ω² M v₀    (4.103)

This can be seen as a generalised eigenvalue problem with v₀ being an eigenvector and
ω² the eigenvalue. Computing these eigenvalues, the dispersion relation of the spatially
discretised wave equation can be found. In Fig. 4.10 the relation between the group speed of
the analytical solution and the group speed of the numerical solution is shown for different
relative wavelengths (relative to the size of the finite elements). If linear elements are used,
especially the short waves travel much too slowly. Cubic elements improve this behaviour.
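A minimal sketch of this generalised eigenvalue problem for 1D linear elements on a periodic mesh follows (Python with NumPy and SciPy; the mesh size, the number of elements and the wave speed are illustrative choices, and the element matrices are the standard consistent ones):

    import numpy as np
    from scipy.linalg import eigh

    # Sketch: frequencies of the spatially discretised wave equation from
    # the generalised eigenvalue problem K v0 = omega^2 M v0, Eq. (4.103).
    n_el, h, c = 40, 0.1, 1.0
    K = np.zeros((n_el, n_el))
    M = np.zeros((n_el, n_el))
    Ke = c ** 2 / h * np.array([[1.0, -1.0], [-1.0, 1.0]])    # element stiffness matrix
    Me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])         # consistent element mass matrix
    for e in range(n_el):                                     # assembly on a periodic mesh
        dofs = (e, (e + 1) % n_el)
        for a, A in enumerate(dofs):
            for b, B in enumerate(dofs):
                K[A, B] += Ke[a, b]
                M[A, B] += Me[a, b]

    omega2, modes = eigh(K, M)                # generalised eigenvalue problem
    omega_bar = np.sqrt(np.abs(omega2))       # discrete frequencies of the FE model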

Time discretisation

Summarising the results of the previous sections, there exist three dispersion relations:

• ω(k) dispersion relation of the partial differential equation

• ω̄(k) dispersion relation of the spatially discretised equation

• ω̂(k) dispersion relation of the totally discrete equation

Numerical methods for first order ordinary differential equations can be analysed by using
the test equation:

ẋ = λx (4.104)

with its analytical solution:

x(t) = x₀ e^{λt}    (4.105)



This equation is not sufficient to examine numerical methods for second order differen-
tial equations, because these equations describe waves in time. Instead, the following test
equation proves to be very useful:

ẍ = −ω² x    (4.106)

It has the following exact solution:

x(t) = A e^{iωt} + B e^{−iωt}    (4.107)

Here the initial conditions enter through the parameters A and B. With the starting values
x₀ = x(0) and v₀ = ẋ₀ = ẋ(0), a standard numerical method for the equivalent first order
system can be written as a map:

(xₙ, vₙ)ʰ → (xₙ₊₁, vₙ₊₁)ʰ    (4.108)

Introducing the operator A, which maps the solution at one time step to the solution at
the next time step, the numerical method can be written as:

(xₙ₊₁, vₙ₊₁)ʰ = A (xₙ, vₙ)ʰ    (4.109)

so that after n steps (xₙ, vₙ)ʰ = Aⁿ (x₀, v₀)ʰ.

Analysing the eigenvalues of the operator A will therefore give some insight into the
development of the solution.
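As a small illustration, the following sketch (Python/NumPy; ω, Δt and the choice of the forward Euler method are illustrative, not prescribed by the notes) builds the operator A for the test equation ẍ = −ω²x and inspects its eigenvalues:

    import numpy as np

    # Sketch: amplification operator A of Eq. (4.109) for forward Euler
    # applied to the first order system (x, v) of the test equation.
    omega, dt = 2.0, 0.1

    A = np.array([[1.0,               dt ],
                  [-dt * omega ** 2,  1.0]])    # (x, v)_{n+1} = A (x, v)_n

    print(np.abs(np.linalg.eigvals(A)))         # |1 +/- i omega dt| = sqrt(1 + (omega dt)^2) > 1
    # The spectral radius exceeds one, so forward Euler slowly amplifies the oscillation.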
(My notes end here ...)
