Numerical Methods
Julius B. Kirkegaard
Contents
1 Introduction
2 Numerical Differentiation
2.1 First Order Derivatives
2.2 Higher Order Derivatives
2.3 Deriving Schemes
2.4 Machine Precision
2.5 Spectral Methods
All of these formulas are, for good reason, called finite difference
expressions. To express how good an approximation is, we consider
what happens when applying them to the Taylor expansions of
functions. Smooth functions can (locally) be approximated by their
Taylor expansion
$$f(x) = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(x_0)\,(x - x_0)^n \qquad (2.6)$$
$$\;\;= f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \cdots$$
Let us try to use our finite difference formulas on these Taylor expansions. First we use the forward scheme of Eq. (2.2). Without loss of generality we evaluate the derivative at $x = 0$:
$$f'(0) \approx \frac{f(\Delta x) - f(0)}{\Delta x} = f'(0) + \frac{1}{2} f''(0)\,\Delta x + \cdots \qquad (2.7)$$
We find that the error of the approximation has a term that grows
proportional to Δ𝑥. The next error term grows like Δ𝑥 2 , but for
small Δ𝑥 this will be much smaller, and the following terms even
smaller than that. For this reason one writes
Forward Derivative
$$f'(x) = \frac{f(x + \Delta x) - f(x)}{\Delta x} + \mathcal{O}(\Delta x). \qquad (2.8)$$
The notation O (Δ𝑥) signifies that the error grows like ∼ Δ𝑥. If we
do the same calculation for the central derivative we find
Central Derivative
$$f'(x) = \frac{f(x + \Delta x) - f(x - \Delta x)}{2\Delta x} + \mathcal{O}(\Delta x^2). \qquad (2.9)$$
It is in this sense that the central derivative is better: the error grows
like Δ𝑥 2 , which for small Δ𝑥 will be a lot smaller than Δ𝑥.
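To see the difference in practice, here is a minimal sketch (in numpy; the test function and step sizes are our choices, purely for illustration) that differentiates $\sin(x)$ at $x = 1$ with both schemes:

```python
import numpy as np

# Differentiate f(x) = sin(x) at x = 1; the exact answer is cos(1).
f, exact = np.sin, np.cos(1.0)

for dx in [1e-1, 1e-2, 1e-3]:
    forward = (f(1.0 + dx) - f(1.0)) / dx
    central = (f(1.0 + dx) - f(1.0 - dx)) / (2 * dx)
    print(f"dx={dx:.0e}  forward error={abs(forward - exact):.1e}  "
          f"central error={abs(central - exact):.1e}")
```

Shrinking $\Delta x$ by a factor of ten shrinks the forward error by roughly ten but the central error by roughly a hundred, until machine precision interferes (see Section 2.4).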
Schemes for higher derivatives can be built by nesting the first-derivative formulas. For instance, applying the central scheme to $f'$ gives
$$f''(x) = \frac{f'(x + \Delta x) - f'(x - \Delta x)}{2\Delta x} + \mathcal{O}(\Delta x^2).$$
And then using the formula for $x + \Delta x$ and $x - \Delta x$ we have
$$f'(x + \Delta x) = \frac{f(x + 2\Delta x) - f(x)}{2\Delta x} + \mathcal{O}(\Delta x^2)$$
$$f'(x - \Delta x) = \frac{f(x) - f(x - 2\Delta x)}{2\Delta x} + \mathcal{O}(\Delta x^2)$$
Combining these with Eq. (2.9) in the central second-derivative scheme applied to $f'$, i.e. $f'''(x) \approx [f'(x + \Delta x) - 2 f'(x) + f'(x - \Delta x)]/\Delta x^2$, we find
$$f'''(x) = \frac{1}{2\Delta x^3}\left[ f(x + 2\Delta x) - 2 f(x + \Delta x) + 2 f(x - \Delta x) - f(x - 2\Delta x) \right] + \mathcal{O}(\Delta x^2). \qquad (2.12)$$
Note that the schemes all sum to zero and are symmetric for even
derivatives and anti-symmetric for odd derivatives. Why must this
be the case?
We will also give a few schemes for forward derivatives
where the 𝑎’s are the finite difference coefficients for this custom
stencil, which in general will depend on 𝑥.
The derivation of the formula is quite simple: we just need to
ensure that the terms of the Taylor expansion equal zero except at
the derivative we are trying to calculate, where instead we need to
correct for the factorial. We skip a step-by-step derivation and simply state that the correct coefficients for evaluation at $x = 0$ are found by solving the following linear equation
$$\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ x_1 & x_2 & x_3 & x_4 & x_5 \\ x_1^2 & x_2^2 & x_3^2 & x_4^2 & x_5^2 \\ x_1^3 & x_2^3 & x_3^3 & x_4^3 & x_5^3 \\ x_1^4 & x_2^4 & x_3^4 & x_4^4 & x_5^4 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{pmatrix} = d!\, \begin{pmatrix} \delta_{0,d} \\ \delta_{1,d} \\ \delta_{2,d} \\ \delta_{3,d} \\ \delta_{4,d} \end{pmatrix}. \qquad (2.13)$$
Example
To find the third derivative scheme for the regular stencil
{𝑥 − 2Δ𝑥, 𝑥 − Δ𝑥, 𝑥, 𝑥 + Δ𝑥, 𝑥 + 2Δ𝑥}, one would solve
$$\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ -2\Delta x & -\Delta x & 0 & \Delta x & 2\Delta x \\ (-2\Delta x)^2 & (-\Delta x)^2 & 0 & \Delta x^2 & (2\Delta x)^2 \\ (-2\Delta x)^3 & (-\Delta x)^3 & 0 & \Delta x^3 & (2\Delta x)^3 \\ (-2\Delta x)^4 & (-\Delta x)^4 & 0 & \Delta x^4 & (2\Delta x)^4 \end{pmatrix} \begin{pmatrix} a_{-2} \\ a_{-1} \\ a_0 \\ a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 6 \\ 0 \end{pmatrix}.$$
With this formula we can calculate schemes for any order of deriva-
tive for any stencil (as long as 𝑁 > 𝑑). To calculate the schemes
presented in the tables above, you can set Δ𝑥 = 1.
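As a sketch of how this might look in code (using numpy; the stencil and derivative order are the ones from the example above):

```python
from math import factorial
import numpy as np

# Solve Eq. (2.13) for the third derivative (d = 3) on the regular
# stencil {-2, -1, 0, 1, 2}, i.e. with Delta x set to 1.
stencil = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
d = 3

M = np.vander(stencil, increasing=True).T   # row n holds stencil**n
rhs = np.zeros(len(stencil))
rhs[d] = factorial(d)

a = np.linalg.solve(M, rhs)
print(a)  # [-0.5, 1, 0, -1, 0.5], the coefficients of Eq. (2.12)
```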
Initial-Value Problem
(Figure: a solution $f(t)$ plotted against $t$, evolving from the initial condition.)
Solve $\boldsymbol f'(t) = \boldsymbol F(\boldsymbol f(t), t)$ forward in time from $t = 0$, given the initial condition $\boldsymbol f(0) = \boldsymbol f_0$.
Boundary-Value Problem
(Figure: a solution $f(x)$ plotted against $x$ on a bounded domain.)
We know from Eq. (2.8) that the forward scheme has an error of size $\mathcal{O}(\Delta t)$, which means that each step in the Euler method has an error of size $\mathcal{O}(\Delta t^2)$. In order to integrate the equation from $t = 0$ all the way to $t = T$ using a step of size $\Delta t$, we will need to apply the above approximation $n \approx T/\Delta t$ times. This means that the error of the final term $\boldsymbol f(T)$ will be of order $T/\Delta t \times \mathcal{O}(\Delta t^2) = \mathcal{O}(\Delta t)$. Although the method is exceedingly simple to implement, this large error makes it unfit for many applications unless a very small $\Delta t$ is used.
Example
Let us solve
$$f'(t) = \sqrt{f(t)}, \qquad f(0) = 1 \qquad (3.7)$$
using the Euler method with $\Delta t = 0.1$. The first time step gives us
$$f(\Delta t) = f(0.1) \approx f(0) + \sqrt{f(0)}\,\Delta t = 1 + 0.1 = 1.1. \qquad (3.8)$$
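A sketch of the full integration in code (the exact solution $f(t) = (1 + t/2)^2$ is easy to verify by substitution):

```python
import numpy as np

# Euler's method for f'(t) = sqrt(f(t)), f(0) = 1 [Eq. (3.7)].
dt, T = 0.1, 1.0
f = 1.0
for _ in range(int(T / dt)):
    f = f + np.sqrt(f) * dt
print(f)  # ~2.22, versus the exact f(1) = (1 + 1/2)**2 = 2.25
```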
Verlet/Leapfrog Integration
For each time step do
$$x_{\Delta t/2} = x(t) + \frac{1}{2} v(t)\,\Delta t$$
$$v(t + \Delta t) = v(t) + a(x_{\Delta t/2})\,\Delta t \qquad (3.11)$$
$$x(t + \Delta t) = x_{\Delta t/2} + \frac{1}{2} v(t + \Delta t)\,\Delta t$$
The error of this method is only O (Δ𝑡 3 ) per time step, which is
a big improvement over the Euler Method. This is despite being
computationally as cheap as the Euler method: 𝑎(𝑥), which is
typically the most computationally expensive term to calculate, is
only evaluated once per time step for both methods.
A conservative system of equations will conserve energy, but
we cannot in general guarantee this from numerical approxima-
tions. The Verlet (or Leap-Frog) method is a so-called symplectic
method, which precisely ensures that the (time-averaged) energy is
conserved. This feature can be crucial in many physical simulations.
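As a sketch, here is Eq. (3.11) applied to a simple harmonic oscillator, $a(x) = -x$ (a test case of our choosing):

```python
import numpy as np

# Leapfrog/Verlet integration of x'' = a(x) = -x over one period.
def a(x):
    return -x

dt, x, v = 0.01, 1.0, 0.0
for _ in range(int(2 * np.pi / dt)):
    x_half = x + 0.5 * v * dt       # half step in position
    v = v + a(x_half) * dt          # full step in velocity
    x = x_half + 0.5 * v * dt       # remaining half step in position
print(x, v)  # close to (1, 0); the energy (x**2 + v**2)/2 stays bounded
```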
$$k_1 = F(f(t), t)$$
$$k_2 = F\!\left(f(t) + \tfrac{1}{2} k_1 \Delta t,\; t + \tfrac{1}{2}\Delta t\right) \qquad (3.12)$$
$$f(t + \Delta t) = f(t) + k_2\,\Delta t$$
Example
Let us again solve
$$f'(t) = \sqrt{f(t)}, \qquad f(0) = 1 \qquad (3.13)$$
$$k_1 = F(f(t), t)$$
$$k_2 = F\!\left(f(t) + \tfrac{1}{2} k_1 \Delta t,\; t + \tfrac{1}{2}\Delta t\right)$$
$$k_3 = F\!\left(f(t) + \tfrac{1}{2} k_2 \Delta t,\; t + \tfrac{1}{2}\Delta t\right) \qquad (3.14)$$
$$k_4 = F\!\left(f(t) + k_3 \Delta t,\; t + \Delta t\right)$$
$$f(t + \Delta t) = f(t) + \frac{1}{6}\left[ k_1 + 2 k_2 + 2 k_3 + k_4 \right]\Delta t$$
The error of this method is $\mathcal{O}(\Delta t^5)$ per time step, which allows for significantly larger step sizes than those required by the Euler method. Other fourth-order schemes exist with different choices of $k$'s; in fact, the above scheme is not the best fourth-order Runge–Kutta scheme, but it is the easiest one to implement.
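A sketch of Eq. (3.14) in code, reusing the test problem from the examples above:

```python
import numpy as np

# Fourth-order Runge-Kutta for f'(t) = sqrt(f(t)), f(0) = 1.
def F(f, t):
    return np.sqrt(f)

dt, T, f, t = 0.1, 1.0, 1.0, 0.0
for _ in range(int(T / dt)):
    k1 = F(f, t)
    k2 = F(f + 0.5 * k1 * dt, t + 0.5 * dt)
    k3 = F(f + 0.5 * k2 * dt, t + 0.5 * dt)
    k4 = F(f + k3 * dt, t + dt)
    f = f + (k1 + 2 * k2 + 2 * k3 + k4) * dt / 6
    t = t + dt
print(f)  # very close to the exact value 2.25
```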
Using this scheme, large time steps will be taken when the solution
is smooth and small time steps will be taken when the solution
varies rapidly. This is computationally much more efficient than
taking the same time step at all times.
The method is not fully specified in the above presentation, as
there is still some choice in which Runge–Kutta formulas to use. A
brilliant scheme, and the one most widely in use, is the Dormand–
Prince method, which uses a fourth and fifth order scheme that
share most of their 𝑘-expressions.
Note that the right-hand side now contains 𝑓 (𝑡 + Δ𝑡), which is what
we are trying to calculate. This is the reason these methods are
called implicit: the methods only provide implicit equations.
Example
Consider the initial-value problem
$$f'(t) = -a f(t), \qquad f(0) = f_0 > 0. \qquad (3.16)$$
Using explicit Euler, the update rule would be
$$f(t + \Delta t) = f(t) - a f(t)\,\Delta t. \qquad (3.17)$$
Note that if $\Delta t > \frac{2}{a}$, this scheme becomes unstable: even though $f(t)$ should decay, it will instead undergo increasingly large oscillations.
Example
Consider
This problem has two time scales: 1/𝑎 and 1/𝑏. For initial
conditions 𝑓1 (0) = 1 and 𝑓2 (0) = 0, the system has the
solution
$$f_1(t) = \frac{1}{2}\left( e^{-2at} + e^{-2bt} \right),$$
$$f_2(t) = \frac{1}{2}\left( e^{-2at} - e^{-2bt} \right).$$
Note that if, say, $a \ll b$ then $e^{-2bt}$ will very quickly become negligible. Yet, for the explicit method to work, we will nonetheless be forced to choose $\Delta t \ll 1/b$. In other words: even in cases where the shortest time scale does nothing important for the final solution, we are still forced to use a small time step in explicit methods.
The implicit scheme for the above equation is
$$f_1(t + \Delta t) = \frac{(1 + a\Delta t + b\Delta t)\, f_1(t) + (b - a)\Delta t\, f_2(t)}{(1 + 2a\Delta t)(1 + 2b\Delta t)},$$
and similarly for $f_2(t + \Delta t)$.
For the implicit time step, $a$ and $b$ appear in both the numerator and denominator, which stabilises the scheme: even for large $\Delta t$, large values of e.g. $b$ will not drive the right-hand side to be large.
While the Implicit Euler Method is much more stable, its ac-
curacy is similar to that of the Euler scheme, i.e. the error of each
time step is of order O (Δ𝑡 2 ). Implicit Runge–Kutta schemes also
exist, and we will give just one example of such:
Crank–Nicolson Method
For each time step solve the following system of equations
$$k_1 = F(f(t), t)$$
$$k_2 = F(f(t + \Delta t), t + \Delta t) \qquad (3.20)$$
$$f(t + \Delta t) = f(t) + \frac{1}{2}(k_1 + k_2)\,\Delta t$$
This implicit method has an error of order $\mathcal{O}(\Delta t^3)$, but even though it is more stable than the explicit Euler method, it is not stable for arbitrary values of $\Delta t$.
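The stability difference is easy to demonstrate numerically. A minimal sketch for Eq. (3.16), deliberately using a time step larger than $2/a$:

```python
# Explicit vs implicit Euler on f'(t) = -a f(t) with dt > 2/a.
a, dt, steps = 10.0, 0.3, 20
f_explicit = f_implicit = 1.0
for _ in range(steps):
    f_explicit = f_explicit - a * f_explicit * dt   # Eq. (3.17)
    f_implicit = f_implicit / (1 + a * dt)          # implicit step, solved for f(t + dt)
print(f_explicit, f_implicit)  # explicit oscillates and blows up; implicit decays to ~0
```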
$$f(a) = \alpha \qquad (3.22)$$
$$f'(a) = \alpha \qquad (3.23)$$
$$f'(b) = \beta, \qquad (3.25)$$
$$\boldsymbol x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_{N-1} \\ x_N \end{pmatrix}, \qquad \boldsymbol f = \begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_{N-1} \\ f_N \end{pmatrix}, \qquad \boldsymbol g = \begin{pmatrix} g_1 \\ g_2 \\ g_3 \\ \vdots \\ g_{N-1} \\ g_N \end{pmatrix}, \qquad (3.27)$$
where 𝑥 1 = 𝑎 and 𝑥 𝑁 = 𝑏. Note that 𝒇 is not known, as this is the
one we are solving for. For regular grids we will have a constant
spacing 𝑥𝑖+1 − 𝑥𝑖 = Δ𝑥 for all 𝑖, which we will assume for now.
Using the central finite difference scheme of Eq. (2.11), we
have
$$f''_i \approx \frac{f_{i+1} + f_{i-1} - 2 f_i}{\Delta x^2} \qquad (3.28)$$
for $2 \leq i \leq N - 1$. So except for the edge cases $i = 1$ and $i = N$, we can write the second derivative as a matrix multiplication:
$$\boldsymbol f'' = \frac{1}{\Delta x^2} \begin{pmatrix} ? & ? & ? & ? & \cdots & ? & ? & ? & ? \\ 1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 & -2 & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 & -2 & 1 \\ ? & ? & ? & ? & \cdots & ? & ? & ? & ? \end{pmatrix} \boldsymbol f. \qquad (3.29)$$
We could use the central scheme to fill out the first and last rows if we were solving a periodic problem. For the present problem we could also use a forward or backward finite difference scheme to fill out these rows to estimate the second derivative there. However, we do not need to do this, as we instead need to use these rows to enforce the boundary conditions. At $x = a$ we have to enforce $f_1 = \alpha$. At $x = b$ we need to enforce a Neumann boundary condition. We do this by using a backward scheme for the first derivative. We could for instance enforce $f_N - f_{N-1} = \beta \Delta x$. However, as we use a second-order scheme with error $\mathcal{O}(\Delta x^2)$ in the interior, we should do the same for the boundary condition. The backward scheme of second order is given by $\frac{3}{2} f_N - 2 f_{N-1} + \frac{1}{2} f_{N-2} = \beta \Delta x$.
All in all, the finite difference version of Eq. (3.21) with boundary conditions applied reduces to a linear system. Finally solve
$$A_{bc}\, \boldsymbol f = \boldsymbol b_{bc} \qquad (3.32)$$
to find $\boldsymbol f$, where the subscript bc denotes the arrays with the boundary-condition rows updated.
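A sketch of the whole procedure in numpy, with an arbitrary right-hand side $g(x) = \sin x$ of our choosing: we solve $f''(x) = g(x)$ with $f(a) = \alpha$ and $f'(b) = \beta$, using the second-order backward row derived above.

```python
import numpy as np

# Solve f''(x) = g(x) on [0, 1] with f(0) = alpha and f'(1) = beta.
N, alpha, beta = 101, 0.0, 1.0
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]

A = np.zeros((N, N))
b = np.sin(x)                       # right-hand side g(x), for illustration
for i in range(1, N - 1):           # interior rows: central second derivative
    A[i, i - 1 : i + 2] = np.array([1.0, -2.0, 1.0]) / dx**2

A[0, 0], b[0] = 1.0, alpha          # Dirichlet row: f_1 = alpha
A[-1, -3:] = np.array([0.5, -2.0, 1.5]) / dx   # second-order backward f'(b)
b[-1] = beta

f = np.linalg.solve(A, b)
print(f[-1])  # ~0.70; the exact answer is -sin(1) + (1 + cos(1))
```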
Example
Consider an equation with a first-derivative term, $f''(x) + h(x)\, f'(x) = 0$ [Eq. (3.33)]. The first-derivative part is discretised by the matrix
$$A_1 = \frac{1}{2\Delta x} \begin{pmatrix} -3 & 4 & -1 & 0 & \cdots & 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & \cdots & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 & -4 & 3 \end{pmatrix},$$
where we used Eq. (2.9) for the inner rows and the forward and backward schemes for the first and last rows. The total
matrix representing the differential equation of Eq. (3.33) is
therefore
$$A = A_2 + \mathrm{diag}(\boldsymbol h)\, A_1, \qquad (3.34)$$
where diag(𝒉) is a matrix of zeros with 𝒉 filled along its
diagonal, and matrix multiplication is implied. Now the
boundary conditions can be applied to A. The right-hand
side is zero, so 𝒃 bc = (𝛼, 0, 0, · · · , 0, 0, 𝛽)𝑇 .
we would use
$$A = A_2 + A_1\, \mathrm{diag}(\boldsymbol h), \qquad (3.36)$$
$$A\, \boldsymbol f = \boldsymbol b, \qquad (3.39)$$
these are similar to the above with smart choices for the step size.
We will furthermore present an alternative method [Eq. (4.32)] in
the chapter on partial differential equations.
Spectral Method
Example
Consider Eq. (3.41) on the domain $[0, 2\pi)$ with $\eta = 1$. We will solve this on the grid
$$\boldsymbol x = (0.00,\ 0.79,\ 1.57,\ 2.36,\ 3.14,\ 3.93,\ 4.71,\ 5.50),$$
i.e. with $\Delta x = \pi/4 \approx 0.79$. We will solve the ODE for a right-hand side function $g$ that on our grid takes the values
$$\boldsymbol g = g(\boldsymbol x) = (1.0,\ 1.7,\ 0.0,\ -1.7,\ -1.0,\ 0.3,\ 0.0,\ -0.3).$$
Using the fast Fourier transform we find
$$\mathrm{rfft}(\boldsymbol g) \approx (0,\ 4.0,\ -4.0\,i,\ 0,\ 0),$$
Thus we have
$$\frac{\mathrm{rfft}(\boldsymbol g)}{-\boldsymbol k^2 + i \boldsymbol k} = (0.0/0.0,\ -2 - 2i,\ -0.4 + 0.8i,\ 0,\ 0).$$
We replace $0.0/0.0$ by zero, and then solve the ODE by calculating
$$\boldsymbol f = \mathrm{ifft}\!\left( \frac{\mathrm{rfft}(\boldsymbol g)}{-\boldsymbol k^2 + i \boldsymbol k} \right) = (-0.6,\ -0.2,\ 0.6,\ 0.9,\ 0.4,\ -0.2,\ -0.4,\ -0.5).$$
In the above example, we did not tell you what the function 𝑔 was,
but only defined it in terms of its values at the grid points. As can
This means that we would have obtained the exact same solution
if we had used 𝑔2 and the same number of grid points. This effect
is called aliasing. When using spectral methods there is therefore
a simple rule that must be followed: use a grid-spacing Δ𝑥 that is
small enough to capture the highest frequencies in all functions.
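A sketch of the spectral solve in numpy, assuming (as the Fourier-space denominator $-\boldsymbol k^2 + i\boldsymbol k$ suggests) that Eq. (3.41) reads $f'' + \eta f' = g$ with $\eta = 1$:

```python
import numpy as np

# Spectral solution of f'' + f' = g on [0, 2*pi) with 8 grid points,
# reproducing the numbers in the example above.
N = 8
g = np.array([1.0, 1.7, 0.0, -1.7, -1.0, 0.3, 0.0, -0.3])

k = np.fft.rfftfreq(N, d=1.0 / N)     # k = 0, 1, 2, 3, 4
denom = -k**2 + 1j * k
denom[0] = 1.0                         # dummy value to avoid 0/0 at k = 0
f_hat = np.fft.rfft(g) / denom
f_hat[0] = 0.0                         # the k = 0 mode is set to zero
f = np.fft.irfft(f_hat, n=N)
print(np.round(f, 1))  # (-0.6, -0.2, 0.6, 0.9, 0.4, -0.2, -0.4, -0.5)
```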
$$A\, \boldsymbol f = N(\boldsymbol f). \qquad (3.47)$$
$$f_i = F(\{ f_j \}), \qquad (3.48)$$
$$f_i \leftarrow F(\{ f_j \}). \qquad (3.49)$$
This method can be used in many ways, and how well it works can
vary significantly depending on the problem at hand.
Example
To solve Eq. (3.47), choose a starting guess 𝒇 and repeat
$$\boldsymbol f \leftarrow A^{-1} N(\boldsymbol f). \qquad (3.50)$$
Example
Consider
$$f''(x) = N(f(x), x). \qquad (3.51)$$
Using central differences we can discretise this as
$$\frac{f_{i-1} + f_{i+1} - 2 f_i}{\Delta x^2} = N(f_i, x_i). \qquad (3.52)$$
We can rewrite this in the form of Eq. (3.48) as
$$f_i = \frac{1}{2}\left( f_{i-1} + f_{i+1} - N(f_i, x_i)\,\Delta x^2 \right), \qquad (3.53)$$
which can be iterated to find the solution. In this case we did
not even have to form a matrix.
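A sketch of such an iteration in code, for the hypothetical choice $N(f, x) = -e^{f(x)}$ with $f = 0$ on the boundary:

```python
import numpy as np

# Relaxation, Eq. (3.53), for f''(x) = -exp(f(x)) on [0, 1], f(0) = f(1) = 0.
N_grid = 21
x = np.linspace(0.0, 1.0, N_grid)
dx = x[1] - x[0]
f = np.zeros(N_grid)   # starting guess; boundary entries stay fixed at 0

for _ in range(2000):
    f[1:-1] = 0.5 * (f[:-2] + f[2:] + np.exp(f[1:-1]) * dx**2)
print(f.max())  # ~0.14 at the midpoint
```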
Newton’s Method
To solve for 𝒙 in a system of non-linear equations, rewrite
the equation in the form of (3.54).
From an initial guess 𝑥 0 , use the Jacobian
$$J_F = \begin{pmatrix} \frac{\partial F_1}{\partial x_1} & \frac{\partial F_1}{\partial x_2} & \cdots & \frac{\partial F_1}{\partial x_n} \\ \frac{\partial F_2}{\partial x_1} & \frac{\partial F_2}{\partial x_2} & \cdots & \frac{\partial F_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial F_n}{\partial x_1} & \frac{\partial F_n}{\partial x_2} & \cdots & \frac{\partial F_n}{\partial x_n} \end{pmatrix} \qquad (3.55)$$
and repeatedly solve
$$J_F(\boldsymbol x_n)\, \boldsymbol a = -\boldsymbol F(\boldsymbol x_n) \qquad (3.56)$$
for the step $\boldsymbol a$, updating $\boldsymbol x_{n+1} = \boldsymbol x_n + \boldsymbol a$.
Example
Consider the equations
$$y^3 = 1 - x^2, \qquad y = -e^x. \qquad (3.58)$$
Rewriting we have
$$\boldsymbol F(x, y) = \begin{pmatrix} y^3 + x^2 - 1 \\ y + e^x \end{pmatrix}. \qquad (3.59)$$
The Jacobian is
$$J_F(x, y) = \begin{pmatrix} 2x & 3y^2 \\ e^x & 1 \end{pmatrix}. \qquad (3.60)$$
$$\frac{\partial f(t, \boldsymbol x)}{\partial t} = \gamma \nabla^2 f(t, \boldsymbol x), \qquad (4.1)$$
the advection equation
$$\frac{\partial f(t, \boldsymbol x)}{\partial t} = \nabla \cdot \left( \boldsymbol u\, f(t, \boldsymbol x) \right), \qquad (4.2)$$
the most straightforward way. The method is, for obvious reasons,
known as the Forward-Time-Central-Space (FTCS) scheme:
FTCS Scheme
To solve PDEs of the form Eq. (4.4), use forward Euler
to discretise the time-derivative, and a central scheme to
discretise spatial derivatives.
For one spatial dimension this yields a simple explicit update formula.
Note that this method can only be used for time-dependent prob-
lems. It cannot be used to tackle equations of the form of Eq.
(4.5).
Example
Consider
$$\frac{\partial f(t, x)}{\partial t} = \gamma \frac{\partial^2 f(t, x)}{\partial x^2} \qquad (4.7)$$
orthogonality, we obtain
$$\Delta t \leq \frac{\Delta x^2}{2\gamma}. \qquad (4.13)$$
This is the von Neumann stability criterion for the FTCS scheme applied to the diffusion equation. There are two main takeaways: (1) the better spatial resolution you require, the smaller $\Delta t$ you have to choose, and (2) the larger the diffusion constant $\gamma$, the smaller $\Delta t$ must be for the scheme to remain stable. Finally, we note that stability is not the same as accuracy.
²Note that $c_0$ does not change with time. (This is because the steady state of the diffusion equation is a constant.)
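A sketch of the FTCS scheme on a periodic domain, with $\Delta t$ chosen safely inside the bound of Eq. (4.13):

```python
import numpy as np

# FTCS for the diffusion equation on a periodic domain.
gamma, N = 1.0, 64
dx = 2 * np.pi / N
dt = 0.4 * dx**2 / (2 * gamma)        # inside the stability bound, Eq. (4.13)
x = np.arange(N) * dx
f = np.sin(x)

for _ in range(1000):
    f = f + gamma * dt * (np.roll(f, 1) + np.roll(f, -1) - 2 * f) / dx**2
print(f.max())  # the sin mode decays like exp(-gamma * t)
```

Increasing `dt` above the bound makes the solution blow up within a few hundred steps.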
Consider now the one-dimensional advection equation
$$\frac{\partial f(t, x)}{\partial t} = u \frac{\partial f(t, x)}{\partial x}, \qquad (4.14)$$
which has the FTCS scheme
$$f(t + \Delta t, x_i) = f(t, x_i) + u\, \frac{f(t, x_{i+1}) - f(t, x_{i-1})}{2\Delta x}\, \Delta t. \qquad (4.15)$$
Using the von Neumann approach, we find in this case
$$\frac{c_n(t + \Delta t)}{c_n(t)} = 1 + \left( e^{i k_n \Delta x} - e^{-i k_n \Delta x} \right) \frac{u \Delta t}{2 \Delta x}. \qquad (4.16)$$
The stability condition is therefore³
$$\left| 1 + i\, \frac{u \Delta t}{\Delta x} \sin(k_n \Delta x) \right| \leq 1. \qquad (4.17)$$
But we can never satisfy this condition for all $n$. Therefore the FTCS scheme is unstable for any value of $\Delta t$ for the advection equation. For this reason we do not recommend the FTCS scheme unless you are certain that you are applying it to a system for which you know it is stable. Explicit schemes that work for the advection equation do exist: for instance the Lax–Wendroff method.
³Using $e^{ix} - e^{-ix} = 2i \sin(x)$.
Example
Consider the diffusion equation with a source term
$$\frac{\partial f(t, x)}{\partial t} = \gamma \frac{\partial^2 f(t, x)}{\partial x^2} + g(x) \qquad (4.23)$$
on [𝑎, 𝑏] with boundary conditions 𝑓 (𝑡, 𝑎) = 𝛼 and 𝑓 (𝑡, 𝑏) =
𝛽. Using Eq. (3.29) and Eq. (4.21), we have
$$A = I - \frac{\Delta t}{\Delta x^2} \begin{pmatrix} ? & ? & ? & ? & \cdots & ? & ? & ? & ? \\ 1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 & -2 & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 & -2 & 1 \\ ? & ? & ? & ? & \cdots & ? & ? & ? & ? \end{pmatrix}$$
and
$$\boldsymbol b = \boldsymbol f(t) + \boldsymbol g\, \Delta t. \qquad (4.24)$$
Applying our boundary conditions we find
$$A_{bc} = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ -\frac{\Delta t}{\Delta x^2} & 1 + \frac{2\Delta t}{\Delta x^2} & -\frac{\Delta t}{\Delta x^2} & \cdots & 0 & 0 & 0 \\ 0 & -\frac{\Delta t}{\Delta x^2} & 1 + \frac{2\Delta t}{\Delta x^2} & \cdots & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -\frac{\Delta t}{\Delta x^2} & 1 + \frac{2\Delta t}{\Delta x^2} & -\frac{\Delta t}{\Delta x^2} \\ 0 & 0 & 0 & \cdots & 0 & 0 & 1 \end{pmatrix}$$
and
$$\boldsymbol b_{bc} = \begin{pmatrix} \alpha \\ f(t, x_2) + g(x_2)\Delta t \\ f(t, x_3) + g(x_3)\Delta t \\ \vdots \\ f(t, x_{N-2}) + g(x_{N-2})\Delta t \\ f(t, x_{N-1}) + g(x_{N-1})\Delta t \\ \beta \end{pmatrix}.$$
Each time step is then taken by solving for $\boldsymbol f(t + \Delta t)$ in $A_{bc}\, \boldsymbol f(t + \Delta t) = \boldsymbol b_{bc}$.
Note that it makes sense to use an iterative solver (such as Eq. (3.40)
or the method of conjugate gradients) to solve the linear equation,
since we have an excellent initial guess for 𝒇 (𝑡 + Δ𝑡) in the form of
𝒇 (𝑡).
If the problem is higher-dimensional, the approach is the same,
except that the boundary-value problem to be solved for each time
step is higher-dimensional. The next section is devoted to the
solution of such problems.
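A sketch of the full implicit loop for Eq. (4.23) (in numpy; $\gamma$, the grid and the source $g(x) = \sin(\pi x)$ are our illustrative choices). Note that the time step can be taken far beyond the FTCS stability bound:

```python
import numpy as np

# Implicit time stepping for df/dt = gamma f'' + g(x), with f = 0 at both ends.
gamma, N, dt = 1.0, 51, 0.1
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]
g = np.sin(np.pi * x)

A = np.eye(N)
for i in range(1, N - 1):
    A[i, i - 1 : i + 2] -= gamma * dt / dx**2 * np.array([1.0, -2.0, 1.0])
A[0, :] = A[-1, :] = 0.0
A[0, 0] = A[-1, -1] = 1.0            # boundary rows enforce f = alpha, beta

f = np.zeros(N)
for _ in range(100):
    b = f + g * dt
    b[0] = b[-1] = 0.0               # alpha = beta = 0 here
    f = np.linalg.solve(A, b)
print(f.max())  # approaches the steady state sin(pi x)/pi**2, max ~0.10
```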
$$A\, \boldsymbol f = \boldsymbol b, \qquad (4.26)$$
$$\boldsymbol f = \begin{pmatrix} f(x_1, y_1) \\ f(x_2, y_1) \\ \vdots \\ f(x_N, y_1) \\ f(x_1, y_2) \\ f(x_2, y_2) \\ \vdots \\ f(x_N, y_{N-1}) \\ f(x_1, y_N) \\ f(x_2, y_N) \\ \vdots \\ f(x_N, y_N) \end{pmatrix},$$
⁶For finite differences, periodicity implies that e.g. $x_1$ and $x_N$ are neighbouring points.
$$\frac{1}{2\Delta x} \begin{pmatrix}
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0
\end{pmatrix} \begin{pmatrix} f(x_1, y_1) \\ f(x_2, y_1) \\ f(x_3, y_1) \\ f(x_4, y_1) \\ f(x_1, y_2) \\ f(x_2, y_2) \\ f(x_3, y_2) \\ f(x_4, y_2) \\ f(x_1, y_3) \\ f(x_2, y_3) \\ f(x_3, y_3) \\ f(x_4, y_3) \\ f(x_1, y_4) \\ f(x_2, y_4) \\ f(x_3, y_4) \\ f(x_4, y_4) \end{pmatrix}.$$
This is somewhat hard to read, so do not feel bad if you skip it: this is easier to program than to read! But do focus on a single, random row and make sure to understand that one. For instance, the second row states $\partial_x f(x_2, y_1) = \frac{1}{2\Delta x}\left( f(x_3, y_1) - f(x_1, y_1) \right)$. Note that the matrix would be very different if we needed $\partial_y f$. Also note that as we increase the number of grid points $N$, the fraction of zeros increases. It is thus a very good idea to use a sparse matrix representation.
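A sketch of how such a matrix can be built with scipy's sparse module, using Kronecker products (assuming the row-major flattening used above, so that $\partial_x$ acts within each block):

```python
import scipy.sparse as sp

# Periodic central-difference d/dx on an N x N grid, as a sparse matrix.
N, dx = 4, 1.0
D1 = sp.diags([-1.0, 1.0], [-1, 1], shape=(N, N), format="lil")
D1[0, -1], D1[-1, 0] = -1.0, 1.0      # periodic wrap-around
Dx = sp.kron(sp.identity(N), D1) / (2 * dx)
print(Dx.toarray() * 2 * dx)          # reproduces the 16 x 16 matrix above
```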
in two dimensions. It generalises straightforwardly to higher dimensions. Again, make sure you understand just one or two rows; that should suffice.
Example
Consider
$$\nabla^2 f(x, y) = g(x, y). \qquad (4.29)$$
The finite difference equation is then simply
$$\mathsf L\, \boldsymbol f = \boldsymbol g, \qquad (4.30)$$
Jacobi Method
A linear equation
$$A\, \boldsymbol f = \boldsymbol b \qquad (4.31)$$
can be solved by splitting the matrix into its diagonal and off-
diagonal parts A = D + O. Here D only contains elements on
the diagonal and O only contains elements off the diagonal.
Then until convergence do
$$\boldsymbol f \leftarrow D^{-1}\left( \boldsymbol b - O\, \boldsymbol f \right) \qquad (4.32)$$
⁸The condition we have stated is sufficient for convergence, but not necessary. The necessary and sufficient condition is that all eigenvalues of $D^{-1} O$ are strictly less than 1 in absolute value.
Example
Consider again
Many methods are intuitive to read. This one requires a bit more
thought, so make sure you understand how to obtain Eq. (4.34)
from Eq. (4.32). This method is simpler to implement, but not as
efficient as those found in many modern linear algebra libraries.
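A minimal sketch of Eq. (4.32) on a small, diagonally dominant test matrix (our choice, for illustration):

```python
import numpy as np

# Jacobi iteration for A f = b.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])

D = np.diag(A)            # diagonal entries as a vector
O = A - np.diag(D)        # off-diagonal part
f = np.zeros_like(b)
for _ in range(50):
    f = (b - O @ f) / D   # dividing by D is applying D^{-1}
print(A @ f - b)          # residual ~0 (machine precision)
```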
(Figure: a domain $\Omega$ with boundary $\Gamma$, and a domain $\Omega$ whose boundary is split into two parts $\Gamma_1$ and $\Gamma_2$.)
Example
Consider
$$\nabla^2 f(x, y) = 0 \qquad (4.37)$$
on a square 𝑥 ∈ [𝑎 𝑥 , 𝑏 𝑥 ] and 𝑦 ∈ [𝑎 𝑦 , 𝑏 𝑦 ]. Suitable boundary
conditions could be
$$f(x, a_y) = f(a_x, y) = \beta$$
$$\partial_x f(b_x, y) = \alpha_1 \qquad (4.38)$$
$$\partial_y f(x, b_y) = \alpha_2$$
Here we used two Neumann conditions:
(Figure: the square domain, with $\partial_y f = \alpha_2$ imposed on the top edge and $\partial_x f = \alpha_1$ on the right edge.)
$$A\, \boldsymbol f = \boldsymbol b, \qquad (4.39)$$
$$B\, \boldsymbol f = \boldsymbol \beta, \qquad (4.40)$$
$$A_{RR}\, \boldsymbol f_R + A_{RM}\, \boldsymbol f_M = \boldsymbol b_R \qquad (4.42)$$
(Figure: a triangle with corners $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$.)
$$\phi_i(x, y) = \alpha_1 + \alpha_2 x + \alpha_3 y. \qquad (4.55)$$
$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \frac{1}{A} \begin{pmatrix} x_2^i y_3^i - x_3^i y_2^i \\ y_2^i - y_3^i \\ x_3^i - x_2^i \end{pmatrix} \qquad (4.57)$$
where
$$A = \begin{vmatrix} 1 & x_1^i & y_1^i \\ 1 & x_2^i & y_2^i \\ 1 & x_3^i & y_3^i \end{vmatrix} = x_1^i y_2^i - x_1^i y_3^i - x_2^i y_1^i + x_2^i y_3^i + x_3^i y_1^i - x_3^i y_2^i$$
is the determinant of the matrix (this equals twice the area of the triangle). Our basis function therefore evaluates to
$$\phi_i(x, y) = \frac{1}{A}\left[ (x_2^i y_3^i - x_3^i y_2^i) + (y_2^i - y_3^i)\, x + (x_3^i - x_2^i)\, y \right] \qquad (4.58)$$
inside this specific triangle. Note that 𝜙𝑖 will have different formulas
inside different triangles and is non-zero only inside triangles that
have node 𝑖 as a corner.
It should be pretty clear that the above equation holds for any choice of $h$ if Eq. (4.53) holds, but why this is a useful thing to consider might be less obvious. In fact, if we require Eq. (4.59) to hold for any choice of $h$, then Eq. (4.53) and Eq. (4.59) are exactly equivalent! This integral version [Eq. (4.59)] is called the weak formulation of the differential equation and is the one used in FEM. The function $h$ is called a test function.
By integrating by parts, we obtain
$$-\int_\Omega \nabla f \cdot \nabla h \,\mathrm d\Omega + \int_\Gamma h\, \nabla f \cdot \hat{\boldsymbol n} \,\mathrm d\Gamma = \int_\Omega g\, h \,\mathrm d\Omega, \qquad (4.60)$$
To deal with the requirement 'for all $h$', we will also expand $h$ in our basis functions:
$$h(x, y) = \sum_i h_i\, \phi_i(x, y). \qquad (4.63)$$
since if the equation holds for all basis functions, it will also hold
for all combinations of them and therefore for any ℎ.
We now simply need to perform the integrals, which only depend
on quantities that are known, and then we have a linear equation for
{ 𝑓𝑖 } that can be solved by our usual approach.
You now know all of the main ingredients of the finite element
method. What is left is the mathematics of carrying out the in-
tegrals. In principle, you could simply do numerical integration,
but we can also do it analytically, which is naturally better. Let us
explicitly evaluate the left-hand side of Eq. (4.64) for our linear
basis function.
Clearly
$$\int_\Omega \nabla\phi_i \cdot \nabla\phi_j \,\mathrm d\Omega = 0 \qquad (4.65)$$
if node 𝑖 and node 𝑗 do not share a triangle. If they do share a
triangle, we have
$$\phi_i(x, y) = \frac{1}{A}\left[ (x_2^i y_3^i - x_3^i y_2^i) + (y_2^i - y_3^i)\, x + (x_3^i - x_2^i)\, y \right],$$
$$\phi_j(x, y) = \frac{1}{A}\left[ (x_2^j y_3^j - x_3^j y_2^j) + (y_2^j - y_3^j)\, x + (x_3^j - x_2^j)\, y \right]$$
inside the shared triangle. Note that with our slightly cumbersome
notation for the points, our formula is valid for both the case when
𝑖 = 𝑗 and 𝑖 ≠ 𝑗.
Thus within the triangle
$$\nabla\phi_i \cdot \nabla\phi_j = \frac{1}{A^2}\left[ (y_2^i - y_3^i)(y_2^j - y_3^j) + (x_3^i - x_2^i)(x_3^j - x_2^j) \right]$$
Example
Consider Laplace’s equation
$$\nabla^2 f(x, y) = 0 \qquad (4.67)$$
on a triangulated domain with nodes labelled 1–16:
(Figure: the triangulated domain, where grey colour indicates boundary nodes and edges.)
We take the following boundary conditions:
$$f(x, y) = \alpha \quad \text{on edge 1–2},$$
$$f(x, y) = \beta \quad \text{on edge 6–7}, \qquad (4.68)$$
$$\nabla f(x, y) \cdot \hat{\boldsymbol n} = 0 \quad \text{on the remaining boundary}.$$
Physically the solution of this boundary-value problem cor-
responds to the steady state temperature profile of the con-
sidered region, where the edge between node 1 and 2 is kept
at temperature 𝛼, and the edge between node 6 and 7 is kept
at temperature 𝛽, and no heat can escape through the rest of
the boundary.
Taking 𝑔 = 0 in Eq. (4.64), we write our equation as
A 𝒇 = 0, (4.69)
$$A_{bc}\, \boldsymbol f = \boldsymbol b_{bc} \qquad (4.70)$$
In each triangle, six points are solved for in order to fix a quadratic polynomial¹⁴ on the triangle. These are typically chosen as the three corners and the three edge midpoints.¹⁵
¹⁴ $\phi_i(x, y) = \alpha_1 + \alpha_2 x + \alpha_3 y + \alpha_4 x^2 + \alpha_5 y^2 + \alpha_6 xy$
¹⁵ We have to put three nodes on each edge in order to make the one-dimensional quadratic polynomials well-defined on the edge, which is shared between two triangles. Some element types will have nodes inside the triangles as well (e.g. P3 elements, which use cubic polynomials, will have a single node inside the triangles and four on each edge). Elements can also be defined that do not share any nodes between neighbouring triangles/cells. This can be useful if the solution is expected to have discontinuities.
Example
Consider
$$\frac{\partial^2 f(x)}{\partial x^2} = \gamma \qquad (4.71)$$
for $x \in [a, b]$ with $f(a) = \alpha$ and $f'(b) = \beta$, where $\gamma$ is a constant. Multiplying by $h(x)$ and integrating by parts, our weak formulation becomes
$$-\int_a^b f'(x)\, h'(x) \,\mathrm dx + \left[ f'(x)\, h(x) \right]_a^b = \int_a^b \gamma\, h(x) \,\mathrm dx \quad \text{for all } h.$$
(Figure: the piecewise linear 'hat' basis function $\phi_i(x)$, peaking at $x_i$ and vanishing outside $[x_{i-1}, x_{i+1}]$.) This function can be written as
$$\phi_i(x) = \begin{cases} \dfrac{x - x_{i-1}}{x_i - x_{i-1}} & x_{i-1} < x \leq x_i \ \text{and} \ i > 1 \\[4pt] \dfrac{x_{i+1} - x}{x_{i+1} - x_i} & x_i < x \leq x_{i+1} \ \text{and} \ i < N \\[4pt] 0 & \text{elsewhere.} \end{cases} \qquad (4.72)$$
Writing $f(x) = \sum_i f_i \phi_i(x)$ and $h(x) = \phi_j(x)$, the equations become
$$-\sum_i f_i \int_a^b \phi_i'(x)\, \phi_j'(x) \,\mathrm dx = \begin{cases} ? & j = 1 \\ \frac{1}{2}(x_{j+1} - x_{j-1})\,\gamma & 1 < j < N \\ \beta & j = N. \end{cases}$$
find
$$\int_a^b \phi_i'(x)\, \phi_{i+1}'(x) \,\mathrm dx = \frac{\delta_{N,i} - 1}{x_{i+1} - x_i}. \qquad (4.75)$$
All other integrals are equal to zero. We can now construct the matrix $A$, which has entries $A_{ij} = -\int_a^b \phi_i'(x)\, \phi_j'(x) \,\mathrm dx$, apply our boundary conditions and finally solve $A_{bc}\, \boldsymbol f = \boldsymbol b_{bc}$ as usual.
$$\begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\ \frac{1}{\Delta x} & \frac{-2}{\Delta x} & \frac{1}{\Delta x} & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\Delta x} & \frac{-2}{\Delta x} & \frac{1}{\Delta x} & \cdots & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & \frac{1}{\Delta x} & \frac{-2}{\Delta x} & \frac{1}{\Delta x} & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & \frac{1}{\Delta x} & \frac{-2}{\Delta x} & \frac{1}{\Delta x} \\ 0 & 0 & 0 & 0 & \cdots & 0 & 0 & \frac{-1}{\Delta x} & \frac{1}{\Delta x} \end{pmatrix} \boldsymbol f = \begin{pmatrix} \alpha \\ \gamma \Delta x \\ \gamma \Delta x \\ \vdots \\ \gamma \Delta x \\ \gamma \Delta x \\ \beta \end{pmatrix},$$
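A sketch of assembling and solving this system in numpy (on a regular grid, matching the matrix just shown):

```python
import numpy as np

# FEM solve of f'' = gamma on [0, 1] with f(0) = alpha, f'(1) = beta.
N, alpha, beta, gamma = 51, 0.0, 1.0, 1.0
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]

A = np.zeros((N, N))
b = np.full(N, gamma * dx)
for i in range(1, N - 1):
    A[i, i - 1 : i + 2] = np.array([1.0, -2.0, 1.0]) / dx

A[0, 0], b[0] = 1.0, alpha                # Dirichlet row
A[-1, -2:] = np.array([-1.0, 1.0]) / dx   # Neumann row
b[-1] = beta

f = np.linalg.solve(A, b)
print(f[-1])  # ~0.5; the exact solution is f(x) = x**2 / 2
```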
steps in turn for each operator. This is called Strang Splitting and
will improve accuracy.
Operator splitting is particularly useful when solving systems
of PDEs that depend on one another as it allows us to solve the time
step of each equation independently of the others.
Example
Consider the incompressible Navier–Stokes equations
$$\frac{\partial \boldsymbol u}{\partial t} = -(\boldsymbol u \cdot \nabla)\boldsymbol u + \nu \nabla^2 \boldsymbol u - \nabla p, \qquad (4.82)$$
$$\nabla \cdot \boldsymbol u = 0.$$
After solving for $p(t + \Delta t)$, we insert the solution into Eq. (4.84) and finally evaluate $\boldsymbol u(t + \Delta t)$. In this way each time step of the Navier–Stokes equation is taken by solving three equations: one for a tentative version of the velocity field $\boldsymbol u^*$, one for the pressure field $p$, and finally one for the real velocity field $\boldsymbol u$.
(Figure: a staggered grid, with velocity components $\boldsymbol u$ stored on the cell faces and pressures $p$ at the cell centres.)
Note that a central finite difference scheme of e.g. $p$ in an equation for $\boldsymbol u$ is naturally formulated on such grids. This is good for accuracy and helps avoid some numerical problems ('checkerboard problems') when using non-projection methods to solve fluid problems. We will not discuss such approaches further here, but mention them only so you are aware of their existence.
Stochastic Systems
We start with some seed 𝑥 0 and then keep applying the above
formula. For instance, starting with 𝑥 0 = 1656264184 yields
Example
To sample a random number from the exponential distribution $p(x) = \lambda e^{-\lambda x}$ (defined on $[0, \infty)$) we first need to solve
$$\int_0^x \lambda e^{-\lambda x'} \,\mathrm dx' = 1 - e^{-\lambda x} = u \qquad (5.7)$$
for $x$ as a function of $u$. This one is easy and we find
$$Q(u) = -\frac{1}{\lambda} \log(1 - u). \qquad (5.8)$$
Now we can use a standard sampler on a uniform interval to
sample exponentially distributed numbers. Note that if 𝑈 is
uniform on [0, 1] then so is 1 − 𝑈, so we can also use
$$Q(u) = -\frac{1}{\lambda} \log(u). \qquad (5.9)$$
Observe that indeed $Q(u)$ maps to $[0, \infty)$ for input in $[0, 1]$, as must be the case.
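In code (a minimal numpy sketch):

```python
import numpy as np

# Inverse transform sampling of the exponential distribution, Eq. (5.9).
rng = np.random.default_rng(0)
lam = 2.0
u = rng.uniform(size=100_000)
samples = -np.log(u) / lam
print(samples.mean())  # ~1/lam = 0.5
```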
Rejection Sampling
To sample from a probability distribution $p(x)$, choose a proposal distribution $q(x)$ from which it is simpler to sample and which is non-zero for all $x$ where $p(x)$ is non-zero. Find an $M$ (preferably as small as possible) such that $p(x) \leq M q(x)$ for all $x$. Then
1. Sample $x$ from $q(x)$.
2. Sample a uniform random number $U$ on $[0, 1]$.
3. If $U \leq \frac{p(x)}{M q(x)}$, keep the sample; otherwise start over.
This method is also very simple to use, but how fast it is depends
on the choice of 𝑞(𝑥). Preferably, 𝑞 should be chosen to be as
close to 𝑝(𝑥) as possible in order to avoid rejection by the last step.
Intuitively, you expect to sample about 𝑀 numbers before getting an
acceptance. Therefore it is important to choose a 𝑞 that minimizes
𝑀.
We will also skip a formal derivation of this method, but again it should be fairly intuitive: you sample from $q$ and then adjust for the fact that this is the wrong distribution by rejecting samples in proportion to the distance $M q(x) - p(x)$.
Example
Consider sampling from the two-dimensional distribution
$$p(x, y) = \frac{1}{4\pi^2}\left( 1 + \cos(x + y) \right) \qquad (5.11)$$
for $x, y \in [0, 2\pi]$. As proposal distribution we simply choose the uniform distribution
$$q(x, y) = \frac{1}{4\pi^2}, \qquad (5.12)$$
which is extremely easy to sample from, as we just sample two uniform random numbers on $[0, 2\pi]$. The maximal value of $p(x, y)$ is $\frac{1}{2\pi^2}$, and so the best $M$ we can choose is $M = 2$.
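A sketch of this sampler in code:

```python
import numpy as np

# Rejection sampling of p(x, y) of Eq. (5.11) with uniform proposal, M = 2.
rng = np.random.default_rng(0)

def sample():
    while True:
        x, y = rng.uniform(0.0, 2 * np.pi, size=2)    # draw from q
        if rng.uniform() <= (1 + np.cos(x + y)) / 2:  # p / (M q)
            return x, y

pts = np.array([sample() for _ in range(10_000)])
print(pts.mean(axis=0))  # both marginals are uniform, so the means are ~pi
```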
MCMC: Metropolis–Hastings
To sample $N$ points from $p(x)$, choose a jump distribution $g(x \mid x')$ from which it is easy to sample. Choose a starting point $x_0$. Then for $n \in [1, 2, \cdots, N]$
1. Sample a proposal $x$ from $g(x \mid x_{n-1})$.
2. Calculate the acceptance ratio $\alpha = \dfrac{p(x)\, g(x_{n-1} \mid x)}{p(x_{n-1})\, g(x \mid x_{n-1})}$.
3. • If $\alpha > 1$ set $x_n = x$
   • Otherwise sample a uniform random number $U$ on $[0, 1]$.
     – If $U \leq \alpha$ set $x_n = x$
     – Otherwise set $x_n = x_{n-1}$.
Note that Markov Chain Monte Carlo works even if you do not have access to a normalised distribution, as it only uses the ratio $p(x)/p(x')$. This is extremely useful both for physical simulations and for data modelling.
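As a sketch, here is Metropolis–Hastings with a symmetric Gaussian jump distribution (for which the $g$-factors cancel in $\alpha$), applied to the unnormalised target $p(x) \propto e^{-x^4}$ — both choices are ours, for illustration:

```python
import numpy as np

# Metropolis-Hastings sampling of p(x) ~ exp(-x**4).
rng = np.random.default_rng(0)

def log_p(x):
    return -x**4

N, step = 50_000, 1.0
samples = np.empty(N)
x = 0.0
for n in range(N):
    x_new = x + step * rng.normal()       # symmetric proposal
    if np.log(rng.uniform()) <= log_p(x_new) - log_p(x):
        x = x_new                         # accept; otherwise keep old x
    samples[n] = x
print(samples.var())  # ~0.34 for this target
```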
where (𝑥𝑖 , 𝑦𝑖 ) are the sampled values of 𝑥 and 𝑦. The error on the
estimation of 𝐼 will be of order O (𝑁 −1/2 ).
𝒙(𝑡) to denote the current state. For Eq. (5.20) this would be
𝒙(𝑡) = (𝑁 𝐴 (𝑡), 𝑁 𝐵 (𝑡), 𝑁𝐶 (𝑡)).
Tau-Leaping
Choose a time step Δ𝑡, and repeat until end of simulation:
$\mathcal O(\Delta t^n)$ if
for all integer values of $m$. Note that if Eq. (5.23) holds, then so does Eq. (5.24), but not the other way around.
Tau-leaping is order $\mathcal{O}(\Delta t)$ in weak convergence, but only $\mathcal{O}(\sqrt{\Delta t})$ in strong convergence. You therefore need to use very small $\Delta t$ when using this method. But very small can still be significantly larger than what is required by the Gillespie method for problems with large rates. Note that the definition of error is over the entire simulation, not per time step. Thus, the above should be compared e.g. to the total error of the Euler method of $\mathcal{O}(\Delta t)$ (since for ODEs the largest error will typically be found at the last time step $t = T$).
If you have never seen this notation before, it can seem a bit weird. Informally, you should think of $\mathrm dW$ as an infinitesimally small random number:
$$\mathrm dW = \lim_{\Delta t \to 0} \Delta W, \qquad (5.26)$$
where $\Delta W$ is a normally distributed random number with mean zero and variance $\Delta t$. Note that this means that the standard deviation of $\Delta W$ is $\sqrt{\Delta t}$. Physicists often use the notation
$$\frac{\mathrm dX}{\mathrm dt} = \mu(X, t) + \sigma(X, t)\, \xi(t), \qquad (5.27)$$
where 𝜉 (𝑡) is a noise term. The former notation, however, is math-
ematically more well-defined and in fact also more natural for in-
troducing numerical methods.
Euler–Maruyama Method
Each time step is taken by updating
$$X(t + \Delta t) = X(t) + \mu(X, t)\, \Delta t + \sigma(X, t)\, \Delta W,$$
where $\Delta W$ is drawn from a normal distribution with mean zero and variance $\Delta t$.
The Euler–Maruyama method has weak order of error $\mathcal{O}(\Delta t)$, but only $\mathcal{O}(\sqrt{\Delta t})$ in strong order of convergence. When $\sigma(X, t)$ does not depend on $X$ though, the strong order of convergence is $\mathcal{O}(\Delta t)$. If $\sigma$ does depend on $X$, a slightly better method which for all choices of $\sigma$ has $\mathcal{O}(\Delta t)$ in strong order of convergence is:
Milstein Method
Each time step is taken by updating
$$X(t + \Delta t) = X(t) + \mu(X, t)\, \Delta t + \sigma(X, t)\, \Delta W + \frac{1}{2} \sigma(X, t)\, \frac{\partial \sigma(X, t)}{\partial X}\left( \Delta W^2 - \Delta t \right).$$
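A sketch of Euler–Maruyama for an Ornstein–Uhlenbeck test process, $\mathrm dX = -X\,\mathrm dt + \mathrm dW$ (our choice of test case), whose stationary variance is $1/2$:

```python
import numpy as np

# Euler-Maruyama for dX = -X dt + dW, many paths at once.
rng = np.random.default_rng(0)
dt, T, n_paths = 0.01, 10.0, 5000
X = np.zeros(n_paths)
for _ in range(int(T / dt)):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X - X * dt + dW
print(X.var())  # ~0.5, the stationary variance
```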
The SDEs we have considered use the Itô integral. If you are aware of the distinction between Itô and Stratonovich integrals, you will know that Stratonovich SDEs are more common in physics than Itô ones. Conveniently, any Stratonovich SDE can be rewritten in the Itô interpretation by calculating the noise-induced drift term. After such a conversion the above methods can be applied. We note, nonetheless, that specific schemes designed for Stratonovich SDEs also exist.