Lecture Notes For "Introduction To Mathematical Modeling" - Freie Universität Berlin, Winter Semester 2017/2018
Contents
1 Introduction
9 Formal justice
9.1 Functional equations
9.2 Criticism and possible extensions
References
1 Introduction
Mathematical tools & concepts: basic ODE
Suggested references: [Ari94, Ben00]
Year    Population (millions)
1790      3.93
1810      7.24
1830     12.87
1850     23.19
1870     39.82
1890     62.95
1910     91.97
1930    122.78
1950    150.70
1970    208.00
1990    248.14
2010    308.19
\[ \gamma(t) = \lim_{\Delta t \to 0} \frac{1}{N(t)} \, \frac{N(t + \Delta t) - N(t)}{\Delta t} \]
(Note that this assumes that the limit exists, which is nonsense given that
census data are only available at discrete times.) This suggests the following
model for N as a function of t:
\[ \dot{N}(t) = \gamma(t) N(t) , \tag{1.1} \]
where the dot means differentiation with respect to t. This completes Steps 1
and 2 above. Now suppose that γ is independent of t. The solution of (1.1)
then is
\[ N(t) = N_0 e^{\gamma t} , \tag{1.2} \]
which, depending on the sign of γ, means that the population will either grow
(γ positive), die out (γ negative) or stay constant (γ = 0). Sounds okay! So,
let us skip Step 3 and the question of how to obtain γ, and proceed directly
with Step 4: Assuming γ > 0, it holds that
\[ \lim_{t \to \infty} N(t) = \infty , \]
which cannot be true. (Make sure you understand why the model must be
rejected on the basis of this prediction.) So let’s go back to Step 2 and take
into account that the growth rate of a population will depend on its size due to
limited resources, food supply etc. Specifically, let
\[ \tilde{\gamma} \colon [0, \infty) \to \mathbb{R} , \quad N \mapsto \tilde{\gamma}(N) \]
be strictly decreasing for sufficiently large N , with γ̃(N ) → −∞ for N → ∞,
so as to make sure that the reproduction rate becomes negative once a certain
population size is exceeded. Getting the precise census data to estimate γ̃(N )
will be difficult, but we may be happy with a rough estimate; the simplest
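possible scenario is
\[ \tilde{\gamma}(N) = \gamma \left( 1 - \frac{N}{K} \right) \]
with constants γ, K > 0. The resulting model
\[ \dot{N}(t) = \gamma N(t) \left( 1 - \frac{N(t)}{K} \right) \]
is called the logistic growth model. It is clear that when N grows, the right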
hand side of the equation will become negative and so will the reproduction
rate. This guarantees that N remains finite for all t. It can be shown that
\[ N(t) = \frac{K N_0 e^{\gamma t}}{K + N_0 \left( e^{\gamma t} - 1 \right)} , \tag{1.6} \]
For obvious reasons K is called the system’s capacity. This looks much better
than before, and we may now see how well this model fits the data given in
Table 1. Clearly, the model could be extended in various ways, e.g., by splitting
the population into subpopulations according to sex and age, or by incorporating
additional external factors, such as war, immigration etc.
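As a quick plausibility check of (1.6), the following Python sketch compares the closed-form expression with a numerical solution of the logistic equation dN/dt = γN(1 − N/K). The integrator (classical fourth-order Runge-Kutta), the function names and the parameter values are assumptions made for illustration only; in particular, γ and K are not fitted to the census data.

```python
import math

def logistic_closed_form(t, n0, gamma, k):
    """Closed-form solution (1.6) of the logistic growth model."""
    e = math.exp(gamma * t)
    return k * n0 * e / (k + n0 * (e - 1.0))

def logistic_rk4(t_end, n0, gamma, k, steps=1000):
    """Integrate dN/dt = gamma * N * (1 - N / K) with the classical RK4 scheme."""
    def rhs(n):
        return gamma * n * (1.0 - n / k)
    h = t_end / steps
    n = n0
    for _ in range(steps):
        k1 = rhs(n)
        k2 = rhs(n + 0.5 * h * k1)
        k3 = rhs(n + 0.5 * h * k2)
        k4 = rhs(n + h * k3)
        n += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n

# Hypothetical parameter values (illustration only, not estimated from Table 1).
N0, GAMMA, K = 3.93, 0.03, 400.0
exact = logistic_closed_form(200.0, N0, GAMMA, K)
numeric = logistic_rk4(200.0, N0, GAMMA, K)
print(exact, numeric)
```

Both values should agree to many digits, and the closed form approaches K for large t, which is the saturation behaviour the model was designed to have.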
Problems
Exercise 1.1. Consider the logistic growth model from above.
a) Discuss the issue of parameter estimation and model validation: How could
the unknown parameters γ, K be computed? How well does the model fit
the data? How would you judge the predictive power of the model?
b) Would you trust the model, if N0 was, say, 2 or 3? If not, explain why
and discuss possible ways to improve the model. In case you do trust the
model, interpret the role of the parameter γ.
Figure 2.1: The classical pendulum. The radial position at time t is given by
the arclength s(t) = Lθ(t), hence the radial force on the mass is −mLθ̈(t).
Let us start with some motivation and look at the classical pendulum (see
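Fig. 2.1). The governing equation of motion for the angle θ as a function of t is
\[ \ddot{\theta}(t) = -\frac{g}{L} \sin\theta(t) , \]
with g the acceleration due to gravity and L the length of the pendulum. When θ
is small, θ ≈ sin θ and we may replace the last equation by
\[ \ddot{\theta}(t) = -\omega^2 \theta(t) , \qquad \omega = \sqrt{g/L} , \]
whose general solution is
\[ \theta(t) = A \sin(\omega t) + B \cos(\omega t) , \]
with A, B depending on the initial conditions θ(0) and θ̇(0). Since sine and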
cosine have a period of 2π, we find that the pendulum has period
\[ T = \frac{2\pi}{\omega} = 2\pi \sqrt{\frac{L}{g}} , \tag{2.4} \]
which is independent of the mass m of the pendulum and which does not depend
on the initial position θ(0) = θ0 .
Derivation from scale arguments. Let us now derive the essential depen-
dence of T on L and g without using any differential equations. To this end we
conjecture that there exists a function f such that
T = f (θ, L, g, m) . (2.5)
We denote the physical units (a.k.a. dimensions) of the variables θ, L, g, m by
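square brackets. Specifically,
\[ [L] = \mathrm{m} , \quad [g] = \mathrm{m}\,\mathrm{s}^{-2} , \quad [m] = \mathrm{kg} , \quad [T] = \mathrm{s} , \]
and we make the ansatz
\[ [T] = [L]^{\alpha_1} [g]^{\alpha_2} [m]^{\alpha_3} . \]
Note that we have ignored θ as it does not carry any physical units. By
comparison of coefficients we then find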
\[ \alpha_1 = 1/2 , \quad \alpha_2 = -1/2 , \quad \alpha_3 = 0 , \]
which yields
\[ T \propto \sqrt{\frac{L}{g}} \tag{2.7} \]
and which is consistent with (2.4). Note, however, that we cannot say anything
about a possible dependence of T on the dimensionless angle variable θ. (The
unknown dependence on the angle is the constant prefactor 2π.)
y = f (x1 , . . . , xn ) (2.8)
In the SI system there are exactly seven fundamental physical units: mass (L1 =
kg), length (L2 = m), time (L3 = s), electric current (L4 = A), temperature
(L5 = K), amount of substance (L6 =mol) and luminous intensity (L7 =cd),
and we postulate that the physical dimension of any measurable scalar quantity
can be expressed as a product of powers of the L1 , . . . , L7 .
\[ [\text{Energy}] = \frac{\mathrm{kg}\,\mathrm{m}^2}{\mathrm{s}^2} = L_1 L_2^2 L_3^{-2} . \]
Here, the number of fundamental physical units is m = 3.
Step 1: Remove redundancies from the model. If the unknown function
f is a function of n variables x1 , . . . , xn with m ≤ n fundamental physical units,
using the strategy of the pendulum example may lead to an underdetermined system
of equations (there we had 4 variables with only 3 fundamental units).
To remove such redundancies, it is helpful to translate the problem into the
language of linear algebra: Let L1 , . . . , Lm be our fundamental physical units
and identify Li with the i-th canonical basis vector
\[ e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T , \]
\[ v_i = (\alpha_{i,1}, \ldots, \alpha_{i,m}) \in \mathbb{R}^m , \quad i = 1, \ldots, m , \]
\[ z = p_1^{\alpha_1} \cdots p_m^{\alpha_m} , \tag{2.12} \]
1 We assume that the set of primary variables exists, otherwise we have to rethink our
We want to express Π solely as a function of the primary variables. To this end
note that we can write
for suitable coefficients αj,1 , . . . , αj,m ; this can be done for all the sj . Along the
lines of the previous considerations we introduce zj with [zj ] = [sj ] by
\[ z_j = p_1^{\alpha_{j,1}} \cdots p_m^{\alpha_{j,m}} \]
and define the dimensionless quantity Πj = sj /zj . Note that, by the rank-nullity
theorem, there are exactly n − m such quantities, where n − m is the dimension
of the nullspace of the matrix whose columns are the dimension vectors of
x1 , . . . , xn . Replacing all the sj by
zj Πj , we can recast (2.13) as
\[ y = \Pi \, p_1^{\alpha_1} \cdots p_m^{\alpha_m} . \tag{2.16} \]
2.2 A historical example
To appreciate the power and usefulness of Buckingham’s theorem we have to
see it in action. The following example is taken from [IBM+05]; see also [Tay50]
for the original article. When the U.S. tested the atomic bomb “Trinity” in the
New Mexico desert in 1945, the British physicist and mathematician Sir Geoffrey Taylor2
could quite accurately estimate the energy released by the bomb based on the dimensional
analysis of the radius of the shock wave as a function of time, using only film
footage of the explosion. (The data was still classified then.) Taylor assumed
that the radius R of the expanding shock wave due to the explosion could be expressed as
\[ R = f(t, E, \rho, p) \tag{2.19} \]
where t is time, E the released energy (that is a function of the mass of the
bomb), ρ is the density of the ambient air and p denotes air pressure. The
corresponding physical units are (cf. Example 2.1)
\[ [R] = L_2 , \quad [t] = L_3 , \quad [E] = L_1 L_2^2 L_3^{-2} , \quad [\rho] = L_1 L_2^{-3} , \quad [p] = L_1 L_2^{-1} L_3^{-2} , \]
with the three fundamental physical units L1 (mass), L2 (length) and L3 (time).
The latter implies that there are three primary variables where, without loss of
generality, we pick t, E, ρ. Then
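\[ [t] = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} , \quad [E] = \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix} , \quad [\rho] = \begin{pmatrix} 1 \\ -3 \\ 0 \end{pmatrix} . \tag{2.20} \]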
(Remember that Taylor wanted to find out how large E was, so our choice was
only to about 67 percent arbitrary.) Expressing [R] in terms of the chosen basis
then leads to the linear system of equations
Ax = b (2.21)
By construction the matrix A has maximum rank and so the unique solution of
(2.21)–(2.22) is x = (2/5, 1/5, −1/5)^T , from which we find
\[ [R] = \left[ \left( \frac{t^2 E}{\rho} \right)^{1/5} \right] . \tag{2.23} \]
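The linear system behind (2.21) can be checked mechanically. The sketch below, in Python, assumes that the columns of A are the exponent vectors of t, E and ρ from (2.20) and that b is the exponent vector of a length; the helper `solve_exact` is a hypothetical name for a plain Gauss-Jordan elimination over the rationals, not something taken from the notes.

```python
from fractions import Fraction

def solve_exact(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination over the rationals."""
    n = len(a)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row with a nonzero pivot (raises StopIteration if a is singular).
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Rows: mass, length, time. Columns: exponents in [t], [E], [rho], cf. (2.20).
A = [[0, 1, 1],
     [0, 2, -3],
     [1, -2, 0]]
b = [0, 1, 0]  # [R] is a length
x = solve_exact(A, b)
print(x)
```

Exact rational arithmetic is used so that the exponents come out as fractions rather than rounded floating-point numbers. The same helper would apply to the pendulum example with the exponent vectors of L, g and m.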
Time (ms)    Radius (m)
0.10         11.1
0.24         19.9
0.38         25.4
0.52         28.8
0.66         31.9
...          ...
3.53         61.1
3.80         62.9
4.07         64.3
4.34         65.6
4.61         67.4
...          ...
Here Φ(·) is a yet unknown function of Π1 that must be determined from
appropriate data.3 In Taylor’s case a reasonable approximation was Φ(Π1 ) ≈ Φ(0),
simply because t was small whereas E was fairly large; in other words,
\[ p \ll \left( \frac{E^2 \rho^3}{t^6} \right)^{1/5} , \]
assuming that the numerical values of p and ρ were of order 1. Taylor did
experiments with small explosives and found out that Φ(0) ≈ 1, which led him
to conclude that [Tay50]
\[ R = \left( \frac{t^2 E}{\rho} \right)^{1/5} \tag{2.27} \]
describes the radius of the shock wave as a function of time t and the parameters
E, ρ. Given measurement data (t, R(t)) and the value of the density of air
ρ = 1.25 kg/m³, it is then possible to estimate E and hence the yield of the
nuclear bomb. Taylor had the following data:
Taking the log on both sides of (2.27) yields
\[ \log R = \frac{2}{5} \log t + b , \qquad b = \frac{1}{5} \log E - \frac{1}{5} \log \rho \tag{2.28} \]
3 More about this in the next section.
from which E ≈ 8.05 · 10^13 Joules can be obtained by a least squares fit of the
data. (See the next section.) Using the conversion factor 1 kiloton of TNT = 4.184 · 10^12
Joules, Taylor estimated the yield of the nuclear bomb Trinity as 19.2 kilotons.
The true yield of the bomb was about 21 kilotons, which was revealed much
later. Thus Taylor’s estimate proved indeed quite accurate.
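The fit can be sketched in a few lines of Python. The sketch assumes the slope in (2.28) is held fixed at 2/5, so that the least squares problem reduces to averaging the intercept, and it uses only the ten rows printed in the table above (the elided rows are not filled in). The result therefore need not reproduce the value quoted in the text; the function name is hypothetical.

```python
import math

# (time in ms, radius in m): only the rows printed in the table above.
DATA = [(0.10, 11.1), (0.24, 19.9), (0.38, 25.4), (0.52, 28.8), (0.66, 31.9),
        (3.53, 61.1), (3.80, 62.9), (4.07, 64.3), (4.34, 65.6), (4.61, 67.4)]
RHO = 1.25  # density of air in kg/m^3, as quoted in the text

def estimate_energy(data, rho):
    """Least squares fit of log R = (2/5) log t + b with the slope held fixed."""
    # With a fixed slope, the minimizer of sum_i (y_i - 0.4 x_i - b)^2 is the mean residual.
    residuals = [math.log(r) - 0.4 * math.log(t_ms * 1e-3) for t_ms, r in data]
    b = sum(residuals) / len(residuals)
    # Invert b = (log E - log rho) / 5.
    return rho * math.exp(5.0 * b)

energy = estimate_energy(DATA, RHO)
print(f"E = {energy:.3e} J")
```

Fitting the slope as well would be a natural variation and gives a check on how well the exponent 2/5 is supported by the data.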
Problems
Exercise 2.4. Explain the statement p ≪ (E² ρ³ /t⁶ )^{1/5} below equation (2.26).
Why would a statement like t⁶ ≪ E² or t ≈ 0 be meaningless?
Exercise 2.5. Prove that the α1 , . . . , αm in (2.12) are unique.
Exercise 2.6. A recurrent theme in both U.S. kitchens and books on
mathematical modelling is the question of how to cook a turkey. Cookbooks sometimes
give directions of the form: “Set the oven to T0 = 180 °C and put the turkey in the oven
for 20 minutes per pound of weight.” Analyse (and criticise) this rule of thumb
based on the following modelling assumptions:
a) A piece of meat is cooked when its minimum internal temperature has
reached a certain value Tmin that may depend, e.g., on the type of meat.
\[ [\kappa] = \frac{[\text{energy}] \times [\text{length}]}{[\text{area}] \times [\text{time}] \times [\text{temperature}]} . \]
3 Arguments from data
Mathematical tools & concepts: linear algebra, random variables
Suggested reference: [BTF+99]
Recall the problem of Section 2.2: Determine the size of a nuclear bomb
from measurement data {(ti , R(ti )) : i = 1, 2, . . . , N } based on the model (2.27).
The equivalent logarithmic representation (2.28) is an equation of the form
y(t) = αx(t) + β ,
with the new variables y = log R and x = log t, known parameter α and un-
known parameter β. If the measurement data and the model were exact, it
would be possible to estimate the unknown coefficient β from a single measure-
ment (x(t1 ), y(t1 )) = (log t1 , log R(t1 )).
If we take into account that measurement data are subject to measurement
errors coming, e.g., from the measurement apparatus or from other sources of
error that are not part of the measurement model, then an apparently more
realistic model could be an equation of the form
\[ Y(t) = \alpha X(t) + \beta + \epsilon(t) , \tag{3.1} \]
where ε is a (typically stationary Gaussian) stochastic process that represents
the measurement or, more generally, statistical noise.4
with Y the total production (the real value of all goods produced in a year),
L the labor input (the total number of person-hours worked in a year), C the
capital input (the real value of all machinery, equipment and buildings), κ the
productivity, and ε the statistical error. The coefficients αi are called the output
elasticities; they are a measure for the susceptibility of the output to a change
in levels of either labor or capital used. If α1 + α2 = 1, then doubling the usage
of capital C and labor L will also double output Y.
Taking the logarithm on both sides of (3.4), we have
\[ \log Y = \alpha_1 \log L + \alpha_2 \log C + \log \kappa + \epsilon , \tag{3.5} \]
where the right hand side of the equation is an affine function of the form
\[ f(x) = \alpha^T x + \beta + \epsilon , \tag{3.6} \]
with the coefficients α = (α1 , α2 )^T and β = log κ and independent variables
(X1 , X2 ) = (log L, log C). It is commonly assumed that ε is a zero-mean Gaussian
random variable with variance σ² that is independent of C and L.5
Figure 3.1: Linear regression for n = 1: Find the straight line that minimizes
the sum of squared deviations from the data points.
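\[ E[\epsilon] = 0 , \quad E[\epsilon \epsilon^T] = \sigma^2 I_{N \times N} . \]
\[ S_N \colon \mathbb{R}^n \to \mathbb{R} , \quad \alpha \mapsto (\mathcal{Y} - \mathcal{X}\alpha)^T (\mathcal{Y} - \mathcal{X}\alpha) , \tag{3.12} \]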
Theorem 3.4 (Least squares estimator). The LSE is given by
\[ \alpha^* = (\mathcal{X}^T \mathcal{X})^{-1} \mathcal{X}^T \mathcal{Y} \tag{3.14} \]
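Proof. The first and second derivatives of SN with respect to the parameter
vector α are given by
\[ \nabla S_N(\alpha) = 2 \mathcal{X}^T \mathcal{X} \alpha - 2 \mathcal{X}^T \mathcal{Y} , \qquad \nabla^2 S_N(\alpha) = 2 \mathcal{X}^T \mathcal{X} . \]
Since the Hessian is positive semidefinite, SN is convex and any minimizer
satisfies the normal equations
\[ \mathcal{X}^T \mathcal{X} \alpha - \mathcal{X}^T \mathcal{Y} = 0 , \]
that, under the assumption that the data matrix X ∈ R^{N×n} has maximum rank
n, has the unique solution
\[ \alpha^* = (\mathcal{X}^T \mathcal{X})^{-1} \mathcal{X}^T \mathcal{Y} . \]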
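For a straight-line fit the normal equations are a 2 × 2 system that can be solved by hand. The Python sketch below assumes a data matrix with columns (x_i, 1), i.e. slope and intercept as the two parameters; the function name is hypothetical and the closed-form inverse of the 2 × 2 matrix is used instead of a general solver.

```python
def least_squares_line(xs, ys):
    """Solve the normal equations X^T X a = X^T Y for the data matrix X = [x, 1]."""
    n = len(xs)
    sxx = sum(x * x for x in xs)   # entries of X^T X = [[sxx, sx], [sx, n]]
    sx = sum(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))   # entries of X^T Y = [sxy, sy]
    sy = sum(ys)
    det = sxx * n - sx * sx        # nonzero exactly when X has full rank 2
    if det == 0:
        raise ValueError("data matrix does not have full rank")
    slope = (n * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    return slope, intercept

slope, intercept = least_squares_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept)
```

The rank condition of the theorem shows up here as the requirement that the determinant is nonzero, which fails when all x values coincide.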
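\[ \alpha^* = \operatorname{argmin}_{\alpha \in \mathbb{R}^n} \| \mathcal{X}\alpha - \mathcal{Y} \|^2 , \]
\[ E[\epsilon \epsilon^T] = \sigma^2 I_{N \times N} . \]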
7 Thanks to the central limit theorem, this is often a reasonable choice.
Then in the linear regression model (3.8), the dependent variables Y are
Gaussian with
\[ Y \sim \mathcal{N}(\mathcal{X}\alpha, \sigma^2) . \tag{3.16} \]
We define the likelihood function of Y as the Gaussian density of Y , but con-
sidered as a function of the parameters α and σ 2 , i.e.
\[ L(\alpha, \sigma^2; x, y) = (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{|y - \alpha^T x|^2}{2\sigma^2} \right) . \tag{3.17} \]
By Assumption 3.7, the likelihood function of the data vector Y then is
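\[ \begin{aligned} L(\alpha, \sigma^2; \mathcal{X}, \mathcal{Y}) &= \prod_{i=1}^{N} L(\alpha, \sigma^2; x(t_i), y(t_i)) \\ &= (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{1}{2\sigma^2} (\mathcal{Y} - \mathcal{X}\alpha)^T (\mathcal{Y} - \mathcal{X}\alpha) \right) \end{aligned} \tag{3.18} \]
\[ L(\alpha, \sigma^2; \mathcal{X}, \mathcal{Y}) = f(\mathcal{X}, \mathcal{Y}; \alpha, \sigma^2) \tag{3.19} \]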
the maximum likelihood estimator (MLE) of (α, σ 2 ) given the data (X , Y).
The logarithm is monotonic, and it is often more convenient to maximize
the log-likelihood
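\[ \log L(\alpha, \sigma^2; \mathcal{X}, \mathcal{Y}) = -\frac{N}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} (\mathcal{Y} - \mathcal{X}\alpha)^T (\mathcal{Y} - \mathcal{X}\alpha) \tag{3.21} \]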
rather than the likelihood function. It can be readily seen that the maximizer
of the log-likelihood function is the MLE. The proof of the next theorem is left
as an exercise to the reader.
Theorem 3.9 (Maximum likelihood estimator). The MLE of (α, σ 2 ) is given
by
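\[ \begin{aligned} \hat{\alpha} &= (\mathcal{X}^T \mathcal{X})^{-1} \mathcal{X}^T \mathcal{Y} \\ \hat{\sigma}^2 &= \frac{1}{N} (\mathcal{Y} - \mathcal{X}\hat{\alpha})^T (\mathcal{Y} - \mathcal{X}\hat{\alpha}) \end{aligned} \tag{3.22} \]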
Proof. Exercise.
Note that α̂ = α∗ , i.e., the MLE of α agrees with the LSE. It should be
stressed that both LSE and MLE are linear transformations of the random
observation Y, hence they are both random variables [BTF+99].
[Figure 3.2 plot, two panels; axes: X (horizontal), Y (vertical)]
Figure 3.2: LSE and MLE of the one-dimensional model (3.23) with N = 10² 
and N = 10⁶ data points (red: exact model, green: estimated model).
\[ \sigma^2 = \lim_{N \to \infty} E[\hat{\sigma}^2(N)] . \]
We then computed the LSE (equivalently: the MLE) for the given data. Figure
3.2 shows two typical realizations for N1 = 10² and N2 = 10⁵ . The corresponding
estimates are α1∗ = 0.5572 for N1 = 10² and α2∗ = 0.5000 for N2 = 10⁵ . The
fact that the estimator is linear in Y , together with the fact that the noise has
mean zero, implies that the LSE and MLE estimators are unbiased with
\[ E[\alpha^*] = E[\hat{\alpha}] = \alpha . \]
This is to say that both MLE and LSE estimators will always fluctuate around
the true value α, no matter how small N is. By the law of large numbers, their
empirical mean converges to α when the estimation is repeated infinitely often.8
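Unbiasedness for small N can be illustrated by a small Monte Carlo experiment. The Python sketch below assumes a scalar model y = αx + ε without intercept, fixed design points in (0, 1], Gaussian noise and the hypothetical values α = 0.5, σ = 0.5; none of these choices is taken from the model (3.23), which is not restated here.

```python
import random

def lse_slope(xs, ys):
    """LSE (X^T X)^{-1} X^T Y for the scalar model y = alpha * x + noise."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def mean_estimate(alpha, sigma, n, repetitions, seed=0):
    """Average the LSE over many independent data sets of size n."""
    rng = random.Random(seed)                  # fixed seed for reproducibility
    xs = [(i + 1) / n for i in range(n)]       # fixed design points in (0, 1]
    total = 0.0
    for _ in range(repetitions):
        ys = [alpha * x + rng.gauss(0.0, sigma) for x in xs]
        total += lse_slope(xs, ys)
    return total / repetitions

# Even with only n = 5 data points per experiment, the average over many
# repetitions should be close to the true value alpha = 0.5.
print(mean_estimate(alpha=0.5, sigma=0.5, n=5, repetitions=4000))
```

A single estimate with n = 5 can be far off; it is only the mean over repetitions that settles near α, which is the content of the unbiasedness statement.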
\[ E[\epsilon \epsilon^T] = \sigma^2 W , \]
8 Clearly the estimator converges faster when N is larger, because by the central limit
with W ∈ R^{N×N} being a symmetric and positive definite (s.p.d.) matrix. A
noticeable feature of this so-called generalized linear model is that there are
N (N + 1)/2 additional unknowns in the game, if W is not known a priori.
Therefore it is impossible to estimate both α and σ² W without further
assumptions on the measurement error, if the sample size N is fixed.
For simplicity we assume that W is given. In this case it is possible to reduce
the parameter estimation problem for the generalized model to the estimation
problem for (3.11). To this end, recall that both W and its inverse W −1 have
s.p.d. square roots, i.e., there exist s.p.d. matrices Q, R, such that
W = QQ , W −1 = RR .
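\[ R\mathcal{Y} = R\mathcal{X}\alpha + R\epsilon , \tag{3.24} \]
which, with the definitions
\[ \tilde{\mathcal{Y}} = R\mathcal{Y} , \quad \tilde{\mathcal{X}} = R\mathcal{X} , \quad \tilde{\epsilon} = R\epsilon , \]
can be recast as
\[ \tilde{\mathcal{Y}} = \tilde{\mathcal{X}}\alpha + \tilde{\epsilon} , \tag{3.25} \]
Rescaling the observation variables by the square root of the inverse error
covariance is known by the name of whitening because now
\[ E[\tilde{\epsilon} \tilde{\epsilon}^T] = \sigma^2 I_{N \times N} . \]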
The assertion that α̃∗ is unbiased and indeed the best linear estimator for
the parameters in the generalized linear model (3.24) is called the Gauss-Markov
theorem; the interested reader is referred to [BTF+99, Thm. 4.4] for details.
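The whitening step can be tried out on a toy problem. The Python sketch below makes two simplifying assumptions that are not in the notes: W is diagonal, so that R = W^{-1/2} is obtained entrywise, and the model is scalar without intercept. It compares the LSE of the whitened data with the direct weighted formula (X^T W^{-1} X)^{-1} X^T W^{-1} Y; all names and numbers are hypothetical.

```python
import math

def lse_slope(xs, ys):
    """LSE (X^T X)^{-1} X^T Y for the scalar model y = alpha * x + noise."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def whitened_lse(xs, ys, w_diag):
    """Whiten with R = W^{-1/2} (W diagonal here), then apply the ordinary LSE."""
    r = [1.0 / math.sqrt(w) for w in w_diag]
    x_tilde = [ri * x for ri, x in zip(r, xs)]
    y_tilde = [ri * y for ri, y in zip(r, ys)]
    return lse_slope(x_tilde, y_tilde)

def weighted_lse(xs, ys, w_diag):
    """Direct formula (X^T W^{-1} X)^{-1} X^T W^{-1} Y, for comparison."""
    num = sum(x * y / w for x, y, w in zip(xs, ys, w_diag))
    den = sum(x * x / w for x, w in zip(xs, w_diag))
    return num / den

xs = [0.5, 1.0, 1.5, 2.0]
ys = [0.3, 0.4, 0.9, 0.8]
w_diag = [1.0, 4.0, 1.0, 9.0]   # larger entries mean noisier observations
print(whitened_lse(xs, ys, w_diag), weighted_lse(xs, ys, w_diag))
```

Observations with a large variance entry are scaled down before the fit, so they influence the estimate less than in the unweighted LSE.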
Problems
Exercise 3.11. For the linear model (3.1) with unknown scalar coefficients
(α, β), we define the LSE (α∗ , β ∗ ) as the minimizer of the function
\[ S_N(\alpha, \beta) = \sum_{i=1}^{N} \left( y(t_i) - \alpha x(t_i) - \beta \right)^2 \]
Exercise 3.12. If we drop the assumption that the data matrix X ∈ RN ×n has
full rank n ≤ N , the LSE α∗ is given by any solution of the normal equations
X T X α − X T Y = 0.
In this case α∗ is no longer unique. Show that SN (α) as defined in (3.12) attains
its minimum for any solution of the normal equations.
Exercise 3.13. Consider the linear model (3.1) with unknown scalar coefficients
(α, β). Compute LSE and MLE of the parameters (α, β); cf. exercise 3.11.
Figure 4.1: Italian Front 1915–1917 (source: History Department of the US
Military Academy).
available for the prey (e.g. plankton), but they are eaten by the predator. The
rate of change for the prey per capita can be modelled by
\[ \frac{\dot{N}(t)}{N(t)} = a - bP(t) , \tag{4.1} \]
which describes exponential growth of the prey with effective growth rate a −
bP (t). The reproduction rate of the predator population depends on whether
there is enough for them to eat; they die without prey. Letting cN (t) − d be
the effective growth rate, the size of the predator population is governed by
\[ \frac{\dot{P}(t)}{P(t)} = -d + cN(t) . \tag{4.2} \]
We assume that a (prey reproduction rate), b (the rate of predation upon the
prey), c (growth rate of the predator population) and d (predator mortality)
are all strictly positive and that (4.1)–(4.2) are equipped with suitable initial
conditions N (0) = N0 > 0 and P (0) = P0 > 0.
It is customary to rescale the free variable t and the dependent variables
N, P to recast the equations in dimensionless form.9 To this end we define
\[ \tau = at , \quad u = \frac{c}{d} N , \quad v = \frac{b}{a} P \]
in terms of which the Lotka-Volterra equations read (see Exercise 4.6)
\[ \begin{aligned} \frac{du}{d\tau} &= u(1 - v) , & u(0) &= u_0 \\ \frac{dv}{d\tau} &= \mu v(u - 1) , & v(0) &= v_0 , \end{aligned} \tag{4.3} \]
with µ = d/a.
Vector field and fixed points. Even though there is an explicit solution
to (4.3), we will not take advantage of this fact, but rather try to get some
qualitative insight into the dynamics of the Lotka-Volterra system by studying
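the underlying autonomous (i.e. time-independent) vector field. Let
\[ F_\mu \colon \mathbb{R}^2 \to \mathbb{R}^2 , \quad F_\mu(u, v) = \begin{pmatrix} u(1 - v) \\ \mu v(u - 1) \end{pmatrix} \]
be the family of vector fields associated with the Lotka-Volterra system, i.e. the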
right hand side of (4.3) parametrized by µ > 0. The Lotka-Volterra vector field
is depicted in Figure 4.2 for µ = 1. Since Fµ is locally Lipschitz, the Picard-
Lindelöf existence and uniqueness theorem for initial value problems [Tes12]
implies that (4.3) has a unique solution. The solutions then are the integral
curves of Fµ , i.e., for every (u0 , v0 ) ∈ R+ × R+ a differentiable curve
[Figure 4.2: the Lotka-Volterra vector field for µ = 1; axes: u (horizontal), v (vertical)]
In other words, the solution trajectories are everywhere tangential to the vector
field. This gives us some idea of what a typical solution of (4.3) could look like.
Important features of any vector field are its critical points:
Definition 4.1. A point (ueq , veq ) ∈ R+ × R+ is called a critical point, equilibrium
or fixed point of (4.6) if Fµ (ueq , veq ) = 0.
By definition, a solution that goes through a critical point is constant, hence
the names “equilibrium” or “fixed point”. The Lotka-Volterra system has only
two critical points in the positive orthant, namely
(ueq , veq ) = (0, 0) and (ueq , veq ) = (1, 1) . (4.7)
The dynamics when one of the populations is absent at time τ = 0 is relatively
easy to understand: if u0 = 0 and v0 > 0, the first equation in (4.3) entails
u(τ ) = u0 which, together with the second equation, implies that
\[ v(\tau) = e^{-\mu\tau} v_0 . \tag{4.8} \]
Thus the predators are bound to die out. Conversely, if v0 = 0 and u0 > 0 it
follows by the analogous argument that
\[ u(\tau) = e^{\tau} u_0 , \tag{4.9} \]
assuming that the prey population has infinite resources available. (They will
die out later when they have eaten all the plankton.) If, however, both u0 and
v0 are different from zero but small, then u0 ≫ u0 v0 and v0 ≫ u0 v0 , which
suggests neglecting the bilinear terms in (4.3) and employing the approximation
\[ \begin{aligned} \frac{du}{d\tau} &\approx u , & u(0) &= u_0 \\ \frac{dv}{d\tau} &\approx -\mu v , & v(0) &= v_0 . \end{aligned} \tag{4.10} \]
This means that the prey population will still grow even though there is a small
predator population, while the number of predators will decrease initially; they
do not have enough to eat, so they die before they can reproduce. Clearly, once
the prey population grows so that u(τ )v(τ ) is no longer small compared to u(τ ),
the approximation that is behind (4.10) is no longer valid.
Now let us consider the other equilibrium (ueq , veq ) = (1, 1). To linearize Fµ
about the point (1, 1), it is convenient to introduce new coordinates by ξ = u−1
and η = v − 1, in terms of which (4.3) reads
\[ \begin{aligned} \frac{d\xi}{d\tau} &= -\eta(1 + \xi) , & \xi(0) &= u_0 - 1 \\ \frac{d\eta}{d\tau} &= \mu\xi(\eta + 1) , & \eta(0) &= v_0 - 1 . \end{aligned} \tag{4.11} \]
Upon noting that u, v ≈ 1 is equivalent to ξ, η ≈ 0, this leads to the following
linearized system of differential equations
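\[ \begin{aligned} \frac{d\xi}{d\tau} &\approx -\eta , & \xi(0) &= u_0 - 1 \\ \frac{d\eta}{d\tau} &\approx \mu\xi , & \eta(0) &= v_0 - 1 . \end{aligned} \tag{4.12} \]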
Upon replacing the “≈” in the linearized equation by equality signs, the latter
is equivalent to the differential equation of the pendulum,
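\[ \frac{d^2 \xi}{d\tau^2} = -\mu \xi(\tau) , \]
from page 6, whose solutions are linear combinations of sin(√µ τ ) and cos(√µ τ )
and hence periodic with period
\[ T = 2\pi \mu^{-1/2} . \tag{4.14} \]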
We will come back to the validity of these kinds of arguments that are based on
linearization later on.
Integral curves and periodic orbits. The above reasoning suggests that
the solutions of the Lotka-Volterra equation are periodic, at least in the neigh-
bourhood of the critical point (1, 1). We will now show that all nonstationary
solutions of (4.3), i.e. all solutions away from the critical points are indeed pe-
riodic. For this purpose we need the following definition.
Definition 4.2. A function I : R2 → R is called a first integral (also: constant
of motion or conserved quantity) if
I(γ(τ )) = I(γ(0)) ∀τ ∈ D .
First integrals of an ODE, such as (4.3), are useful in either finding explicit
solutions or in finding periodic orbits. Specifically, if an ODE has an integral
with compact level sets, then these level sets are candidates for periodic orbits.
Theorem 4.3. Let u0 , v0 > 0, (u0 , v0 ) ≠ (1, 1). Then τ ↦ (u(τ ), v(τ )) is
periodic, i.e., there exists a positive number T ∈ (0, ∞), such that
\[ \frac{dv}{du} = -\mu \frac{v(1 - u)}{u(1 - v)} , \quad v(u_0) = v_0 , \]
yields
log v − v + C1 = µ(u − log u) + C2 ,
with integration constants C1 = v0 − log v0 and C2 = µ(log u0 − u0 ). Defining
C = C1 − C2 and switching back to the free variable τ , it follows that
The function
\[ I \colon X \to \mathbb{R} , \quad (u, v) \mapsto \mu u + v - \log(u^{\mu} v) \]
with X = R+ × R+ is strictly convex with compact level curves
for all
\[ C \ge \min_{(u,v) \in X} I(u, v) = \mu + 1 . \]
The left panel of Figure 4.3 shows the numerically computed solution for
three different initial values that have been computed with the Matlab function
ode15s. Due to finite precision of the numerical solver, the numerical solution
is not exactly periodic and spirals inwards (see the right panel of the figure).
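How well a numerical scheme respects the first integral I(u, v) = µu + v − log(u^µ v) can be monitored directly. The Python sketch below is not the ode15s computation mentioned above; it swaps in a hand-written classical Runge-Kutta (RK4) scheme with fixed step size and reports the largest deviation of I from its initial value. Step size, time horizon and function names are assumptions for illustration.

```python
import math

def lotka_volterra(u, v, mu):
    """Right-hand side of the dimensionless system (4.3)."""
    return u * (1.0 - v), mu * v * (u - 1.0)

def rk4_step(u, v, mu, h):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    k1u, k1v = lotka_volterra(u, v, mu)
    k2u, k2v = lotka_volterra(u + 0.5 * h * k1u, v + 0.5 * h * k1v, mu)
    k3u, k3v = lotka_volterra(u + 0.5 * h * k2u, v + 0.5 * h * k2v, mu)
    k4u, k4v = lotka_volterra(u + h * k3u, v + h * k3v, mu)
    return (u + h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0,
            v + h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0)

def first_integral(u, v, mu):
    """Conserved quantity I(u, v) = mu*u + v - log(u^mu * v)."""
    return mu * u + v - math.log(u ** mu * v)

def integral_drift(u0, v0, mu=1.0, h=0.01, steps=20000):
    """Largest deviation of I from its initial value along the numerical orbit."""
    u, v = u0, v0
    i0 = first_integral(u, v, mu)
    worst = 0.0
    for _ in range(steps):
        u, v = rk4_step(u, v, mu, h)
        worst = max(worst, abs(first_integral(u, v, mu) - i0))
    return worst

print(integral_drift(1.2, 1.2))
```

A drift that grows with the number of steps indicates that the numerical orbit slowly leaves the level curve of I, which is the spiralling effect described above.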
We can say more about the oscillations around the equilibrium (ueq , veq ) =
(1, 1): Their running mean is equal to the equilibrium value.
\[ \frac{1}{T} \int_0^T u(t) \, dt = \frac{1}{T} \int_0^T v(t) \, dt = 1 . \]
[Figure 4.3 plot, two panels; axes: u (horizontal), v (vertical)]
Figure 4.3: Left panel: Solutions of (4.3) for different initial conditions (u0 , v0 ) =
(1.2, 1.2) (red), (u0 , v0 ) = (0.5, 0.5) (orange) and (u0 , v0 ) = (1, 3) (green). Right
panel: Numerically computed green solution trajectory for 500 time periods.
Proof. Let us prove the rightmost part of the above equality and consider the
equation for u only:
\[ \frac{du}{d\tau} = u(1 - v) . \]
By Theorem 4.3 there exists T ∈ (0, ∞), such that u(T ) = u(0). Separating
variables and integrating from 0 to T yields
\[ \int_{u(0)}^{u(T)} \frac{du}{u} = \int_0^T (1 - v(t)) \, dt , \]
which, using that the upper and lower limit in the integral on the left side of
the equality coincide, can be recast as
\[ T = \int_0^T v(t) \, dt \quad \Leftrightarrow \quad \frac{1}{T} \int_0^T v(t) \, dt = 1 . \]
The other part of the assertion can be proved in exactly the same way by solving
the equation for v.
The effect of fishing. We now want to model the effect that fishing has on
fish predators and their prey. For the sake of simplicity we assume that the
reduction rate of predator and prey population due to fishing pressure is given
by a single parameter δ > 0. The unscaled equations (4.1)–(4.2) then become
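\[ \begin{aligned} \frac{dN^\delta}{dt} &= N^\delta (a - bP^\delta - \delta) , & N^\delta(0) &= N_0 \\ \frac{dP^\delta}{dt} &= -P^\delta (d - cN^\delta + \delta) , & P^\delta(0) &= P_0 . \end{aligned} \tag{4.15} \]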
In the unscaled form, the nontrivial equilibrium is
\[ \left( N^\delta_{eq} , P^\delta_{eq} \right) = \left( (d + \delta)/c , \, (a - \delta)/b \right) , \tag{4.16} \]
[Figure 4.4 plot; axes: u (horizontal), v (vertical)]
Figure 4.4: The effect of fishing for δ/a = δ/d = 0.3: typical solutions without
(red) and with fishing (blue).
\[ N^\delta_{eq} = \frac{1}{T} \int_0^T N^\delta(t) \, dt , \quad P^\delta_{eq} = \frac{1}{T} \int_0^T P^\delta(t) \, dt . \tag{4.18} \]
As a consequence the total average catch is given by
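\[ \delta \left( N^\delta_{eq} + P^\delta_{eq} \right) = \delta \left( \frac{d}{c} + \frac{a}{b} \right) + \delta \left( \frac{\delta}{c} - \frac{\delta}{b} \right) \tag{4.19} \]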
Assuming that the effect of the fishing kicks in immediately whereas the equi-
libria represent long term properties of the ecosystem, the average catch after
the recovery phase is smaller than it was before, if and only if
b > c, (4.20)
which is the case when the reduction of the predator population due to fishing
leads to a relatively higher survival rate of its prey.
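The arithmetic behind (4.16) and (4.19) is easy to check numerically. The Python sketch below evaluates the fished equilibrium, verifies that it is a zero of the right-hand side of (4.15), and compares the two expressions for the average catch. The parameter values and function names are hypothetical and serve only as an example.

```python
def equilibrium(a, b, c, d, delta):
    """Nontrivial equilibrium (4.16) of the fished system (4.15)."""
    return (d + delta) / c, (a - delta) / b

def rhs_fished(n, p, a, b, c, d, delta):
    """Right-hand side of (4.15)."""
    return n * (a - b * p - delta), -p * (d - c * n + delta)

def average_catch(a, b, c, d, delta):
    """Left-hand side of (4.19): delta times the sum of the equilibrium values."""
    n_eq, p_eq = equilibrium(a, b, c, d, delta)
    return delta * (n_eq + p_eq)

def average_catch_expanded(a, b, c, d, delta):
    """Right-hand side of (4.19), split into the delta and delta^2 contributions."""
    return delta * (d / c + a / b) + delta * (delta / c - delta / b)

# Hypothetical parameter values, chosen only for illustration.
params = dict(a=1.0, b=0.5, c=0.25, d=0.8, delta=0.3)
n_eq, p_eq = equilibrium(**params)
print(n_eq, p_eq, average_catch(**params), average_catch_expanded(**params))
```

Splitting the catch in this way makes the sign of the δ² term visible, which is what the comparison of b and c in the text is about.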
tion about these points. As we have seen, the linear model shares some fea-
tures of the nonlinear model, e.g., periodic oscillations around the critical point
(ueq , veq ) = (1, 1), or the exponential growth of the prey population close to the
origin. The idea behind this kind of analysis is that the solution of the original
nonlinear system and its linearization should behave similarly in a small neigh-
bourhood of the fixed point. Under certain assumptions this idea can indeed be
justified, and we will give precise statements below.10
Note that this is in accordance with the usual exponential series
\[ \exp(z) = \sum_{k=0}^{\infty} \frac{z^k}{k!} \]
Av = λv
v1 − αv2 = 0 .
which proves that the assumption that v1 and v2 are linearly dependent must be
wrong. Let V = (v1 , v2 ) be the 2 × 2 matrix that diagonalizes A, i.e. V^{−1} AV =
Λ with Λ = diag(λ1 , λ2 ). Then, by definition of the matrix exponential, we have
where exp(Λt) is a diagonal matrix with entries exp(λi t). Depending on whether
the real part of the λi is positive, negative or zero, the exponential e^{λi t} will either
grow, decay or oscillate. As a consequence we can analyse the solution (4.24)
in terms of the eigenvalues of the matrix A.
Depending on whether eigenvalues are real or complex, we can distinguish 5
cases as is illustrated in Figure 4.5; the case when the two eigenvalues λ1,2 = ±iω
are pure imaginary, in which case the solution of (4.3) is a linear combination
of sines and cosines, is not shown separately.11 Critical points with the property
that the eigenvalues λ1,2 = ±iω of the Jacobian matrix have zero real part are called elliptic,
otherwise the critical point is called hyperbolic. The following theorem that is
stated in an informal way guarantees that the linearization of a nonlinear ODE
preserves the properties of hyperbolic equilibria.12
Figure 4.5: Classification of equilibria of dy/dt = Ay in terms of the eigenvalues
of A ∈ R^{2×2} . The eigenvalues are given by λ1,2 = τ /2 ± √(τ² /4 − ∆) where
∆ = det A and τ = trace A (figure taken from: [Izh07]).
has the two real eigenvalues λ1 = −µ < 0 and λ2 = 1. Hence the origin is a sad-
dle or semistable equilibrium. As a second example we consider the linearization
(4.12) around the fixed point (1, 1); this equilibrium is neutrally stable, because
solutions are periodic, which means that any solution in a sufficiently small
neighbourhood of (1, 1) stays within this neighbourhood without approaching
the fixed point. As a consequence the linearization cannot be informative, for
otherwise the critical point would be hyperbolic (e.g. asymptotically stable)
rather than elliptic. The Jacobian matrix at (1, 1) reads
\[ A = \begin{pmatrix} 0 & -1 \\ \mu & 0 \end{pmatrix} , \tag{4.29} \]
and indeed the two eigenvalues of A are λ1,2 = ±i√µ. Note that even though
the linearized system is again neutrally stable, Theorem 4.5 does not apply.13
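Both examples can be reproduced with the trace-determinant formula from the caption of Figure 4.5. In the Python sketch below the Jacobian at (1, 1) is the matrix (4.29), while the Jacobian at the origin, diag(1, −µ), is an assumption read off from the linearization (4.10). The function names and the three-way classification (including a "neither" case when only one real part vanishes) are illustrative choices.

```python
import cmath

def eigenvalues_2x2(a):
    """Eigenvalues of a real 2x2 matrix via lambda = tau/2 +- sqrt(tau^2/4 - Delta)."""
    tau = a[0][0] + a[1][1]                          # trace
    delta = a[0][0] * a[1][1] - a[0][1] * a[1][0]    # determinant
    root = cmath.sqrt(tau * tau / 4.0 - delta)
    return tau / 2.0 + root, tau / 2.0 - root

def classify(a, tol=1e-12):
    """Label a fixed point by the real parts of the Jacobian's eigenvalues."""
    zero_real = [abs(lam.real) < tol for lam in eigenvalues_2x2(a)]
    if all(zero_real):
        return "elliptic"
    if not any(zero_real):
        return "hyperbolic"
    return "neither"

MU = 4.0  # hypothetical value of mu = d/a
jacobian_origin = [[1.0, 0.0], [0.0, -MU]]   # assumed from (4.10)
jacobian_center = [[0.0, -1.0], [MU, 0.0]]   # matrix (4.29)
print(eigenvalues_2x2(jacobian_origin), classify(jacobian_origin))
print(eigenvalues_2x2(jacobian_center), classify(jacobian_center))
```

For the second matrix the eigenvalues are purely imaginary, so a statement about hyperbolic equilibria gives no information there.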
13 Further note that elliptic and hyperbolic equilibria are not mutually exclusive and so
the fact that both eigenvalues are pure imaginary does not imply that the nonlinear system
has an elliptic fixed point too. For instance, it can happen that the nonlinear system has a
critical point that is unstable in one direction and neutrally stable in the other direction,
whereas the linearized system has an elliptic fixed point.
Problems
Exercise 4.6. Show that (4.1)–(4.2) and (4.3) are equivalent under the substi-
tutions
\[ \tau = at , \quad u = \frac{c}{d} N , \quad v = \frac{b}{a} P . \]
d a
Exercise 4.7. Consider the linear ODE system
\[ \begin{aligned} \frac{dx}{dt} &= y , & x(0) &= x_0 \\ \frac{dy}{dt} &= -x , & y(0) &= y_0 . \end{aligned} \]
a) Show that I(x, y) = x² + y² is a constant of motion.
b) Consider the forward Euler scheme
\[ \begin{aligned} x_{n+1} &= x_n + h y_n \\ y_{n+1} &= y_n - h x_n , \quad n = 0, 1, 2, \ldots \end{aligned} \]
for sufficiently small step size h > 0 and show that dn (x0 , y0 ) = x_n² + y_n²
is strictly increasing for all (x0 , y0 ) ∈ R² \ {(0, 0)}, with
\[ \lim_{n \to \infty} d_n = \infty . \]
Exercise 4.11. The following plots show linear 2-dimensional vector fields
along with some typical solution trajectories (shown in red). Classify the sta-
bility of the associated fixed points according to the eigenvalues of the Jacobian:
[Six panels of linear vector fields with solution trajectories in red; axes: y1 (horizontal), y2 (vertical)]
5 Basic principles of control theory
Mathematical tools & concepts: ODE, optimization
Suggested reference: [Whi96]
where x(t) denotes the fish population at time t, b(t) the number of boats
operating at time t and h(t) the harvesting rate at time t. For simplicity,
we assume that all functions can take real values, even though the number
of boats will be an integer number. Our harvesting strategy will be based
on controlling the number of boats that are used for the fishing; we call b the
control variable, even though, strictly speaking, it is a (e.g. piecewise continuous)
function b : [0, ∞) → R.
There are clearly other players in the game of finding an optimal harvesting
strategy that come in the form of parameters or boundary conditions, such as legal
requirements, wages or overhead costs of maintaining a fishing fleet. Specifically,
we consider the following parameters: cB > 0 the overhead cost per boat and
unit of time, n the number of fishermen per boat, w their salary per unit of
time, p the market price of one unit of fish. The boundary conditions and
available parameters determine what a good harvesting strategy is. For example,
maximizing the sustainable catch is different from maximizing the long-term
profit, which may be different from maximizing the short-term profit.14
where q > 0 is a proportionality constant that depends on the efficacy of the
fishing boats (e.g. the nets used etc.). The harvesting rate is the rate by which
the growth rate of a fish population is reduced as an effect of fishing; we assume
that the fish population evolves according to the logistic equation:
\[ \frac{dx}{dt} = \gamma x \left( 1 - \frac{x}{K} \right) - h , \quad x(0) = x_0 > 0 \tag{5.3} \]
where γ > 0 is the initial growth rate of the population when x is small and
K > 0 is the capacity of the ecosystem without fishing (cf. p. 5). Maximizing
any given objective, such as sustainable catch or profit under the constraint
that the fish population evolves according to the dynamics (5.3) is not possible
without further specifying what the admissible controls b(·) are. Here we assume
that the only admissible strategies are of the form
\[ b \colon [0, \infty) \to \mathbb{R} , \quad b(t) = \begin{cases} 0 & t \le t^* \\ b_0 & t > t^* , \end{cases} \tag{5.4} \]
with the two adjustable, but a priori unknown parameters t∗ ≥ 0 and b0 > 0. As
a consequence our harvesting strategy can be controlled by choosing the right
time t∗ at which fishing is started or resumed and the corresponding number b0
of boats. The resulting logistic model then is a switched ODE of the form
\[ \frac{dx}{dt} = \begin{cases} \gamma x \left(1 - x/K\right) , & t \le t^* \\ \gamma x \left(1 - q b_0/\gamma - x/K\right) , & t > t^* , \end{cases} \tag{5.5} \]
where we have used the constitutive relation (5.2) in the second equation.
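The switched model (5.5) is easy to explore numerically. The following is a minimal sketch, not the method used to produce the figures in these notes: it integrates (5.5) with the classical fourth-order Runge-Kutta scheme, using the parameter values quoted in the caption of Figure 5.1 as defaults. The function name, the step size `dt` and the choice of integrator are assumptions made for illustration.

```python
def simulate_switched_logistic(gamma=0.25, K=30.0, q=0.025, b0=2.0,
                               t_star=60.0, x0=2.0, t_end=100.0, dt=0.01):
    """Integrate the switched logistic equation (5.5) with RK4 (illustrative)."""
    def rhs(t, x):
        # No boats up to t_star, a constant fleet of b0 boats afterwards.
        b = 0.0 if t <= t_star else b0
        return gamma * x * (1.0 - x / K) - q * b * x

    n_steps = int(round(t_end / dt))
    t, x = 0.0, x0
    ts, xs = [t], [x]
    for i in range(n_steps):
        k1 = rhs(t, x)
        k2 = rhs(t + dt / 2, x + dt / 2 * k1)
        k3 = rhs(t + dt / 2, x + dt / 2 * k2)
        k4 = rhs(t + dt, x + dt * k3)
        x += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t = (i + 1) * dt
        ts.append(t)
        xs.append(x)
    return ts, xs
```

With the default values the population first approaches K = 30 and, after fishing starts, relaxes towards the reduced level K(1 − qb0/γ) = 24.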
Figure 5.1: Solution of the logistic equation (5.5) with parameters γ = 0.25,
K = 30, q = 0.025, b0 = 2, t∗ = 60 and initial value x0 = 2. [Plot of the fish
population x against time t, with the levels K (no fishing) and K (fishing) marked.]
Bear in mind that the solution to the logistic equation for b0 = 0 satisfies
\[ \lim_{t \to \infty} x(t; b_0 = 0) = K . \]
That is, the fishing reduces the capacity of the ecosystem by the factor 1 − qb0/γ.
A representative solution of (5.5) is shown in Figure 5.1. We now define the
average long-term catch as
\[ J_0(b_0) = \lim_{T \to \infty} \frac{1}{T} \int_0^T h(t)\,dt , \tag{5.6} \]
where the expression for the associated sustainable catch rate follows from (5.2):
We observe that the maximum sustainable catch is independent of the efficacy
q, which seems counterintuitive, but is understandable if we realize that b∗0 is
inversely proportional to q, which makes the optimal harvesting rate independent
of q. Roughly speaking, a lower efficacy requires the use of more boats and vice
versa: With too many boats the fish population is depleted too much, which
results in a lower catch; the same happens when too few boats are at work,
which conserves the fish population, but is suboptimal in terms of the catch.
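The independence of q can be checked numerically. The sketch below assumes that the sustainable catch rate is the harvesting rate q·b0·x evaluated at the fished equilibrium x = K(1 − qb0/γ), which is the reduced capacity stated above; the function names and the brute-force grid search are illustrative choices only.

```python
def sustainable_catch(b0, gamma, K, q):
    """Catch rate q*b0*x at the fished equilibrium x = K*(1 - q*b0/gamma)."""
    x_eq = max(K * (1.0 - q * b0 / gamma), 0.0)
    return q * b0 * x_eq


def best_fleet_size(gamma, K, q, n_grid=100001):
    """Grid search for the fleet size that maximizes the sustainable catch."""
    b_max = gamma / q  # with this many boats the equilibrium population is zero
    grid = [b_max * i / (n_grid - 1) for i in range(n_grid)]
    return max(grid, key=lambda b: sustainable_catch(b, gamma, K, q))
```

For γ = 0.25 and K = 30 the search returns a fleet size close to γ/(2q) for any q, while the maximal catch stays at γK/4 = 1.875: doubling q halves the optimal number of boats and leaves the catch unchanged.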
The total profit until time t = T is then obtained by integrating over the profit
rate from 0 to T. To simplify matters we assume that T = ∞ and we discount
the future profit with a discount rate δ > 0. Together with the constitutive
relation (5.2) the overall profit as a function of b turns out to be
\[ J(b) = \int_0^\infty e^{-\delta t}\, b(t) \left( p q x(t) - c \right) dt , \tag{5.12} \]
with the shorthand c = cB + nw. The discount factor δ accounts for inflation,
interest rates or the fact that future rewards are less profitable than immediate
rewards; it also ensures that J is finite for our choice of admissible controls b(·).
over the set of admissible control strategies defined by (5.4) and subject to
\[ \frac{dx}{dt} = \begin{cases} \gamma x \left(1 - x/K\right) , & t \le t^* \\ \gamma x \left(1 - q b_0/\gamma - x/K\right) , & t > t^* . \end{cases} \tag{5.14} \]
Figure 5.2: Solution of the switched logistic equation. The solution at the
switching point t∗ is continuous but non-differentiable, because the control variable
has a jump discontinuity at t∗ and jumps from b(t∗) = 0 to b(t∗+) = b0.
[Plot of the fish population against time, with the levels K (no fishing) and K (fishing) marked.]
details and alternative methods for solving optimal control problems. Note that
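\[ \begin{aligned} J(b) &= \int_0^{t^*} e^{-\delta t}\, b(t) \left( p q x(t) - c \right) dt + \int_{t^*}^\infty e^{-\delta t}\, b(t) \left( p q x(t) - c \right) dt \\ &= \int_{t^*}^\infty e^{-\delta t}\, b_0 \left( p q x(t) - c \right) dt . \end{aligned} \]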
Solving the equation for t∗ yields
\[ t^* = \gamma^{-1} \left[ \log\left( \frac{K}{x_0} - 1 \right) + \log\left( \frac{\gamma}{q b_0} - 1 \right) \right] , \tag{5.17} \]
which determines the optimal switching time t∗ = t∗ (b0 ) as a function of
the number of boats (via the capacity that is a function of b0 ).
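Formula (5.17) can be sanity-checked against the explicit solution of the unfished logistic equation. The sketch below assumes that t∗ is characterized by the unfished population reaching the reduced capacity K(1 − qb0/γ); the function names are hypothetical.

```python
import math


def logistic(t, x0, gamma, K):
    """Explicit solution of dx/dt = gamma*x*(1 - x/K) with x(0) = x0."""
    return K / (1.0 + (K / x0 - 1.0) * math.exp(-gamma * t))


def switching_time(b0, x0, gamma, K, q):
    """Evaluate (5.17); meaningful for x0 < K and q*b0 < gamma."""
    return (math.log(K / x0 - 1.0) + math.log(gamma / (q * b0) - 1.0)) / gamma
```

For γ = 0.25, K = 30, q = 0.025, x0 = 2 and b0 = 2 this gives t∗ = 4 log 56 ≈ 16.1, and `logistic(t_star, ...)` returns the reduced capacity 24.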
2. As a next step we eliminate the constraint from J, by noting that
x(t) = x∗ ∀t ≥ t∗ . (5.18)
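Hence
\[ \begin{aligned} J(b) &= \int_{t^*}^\infty e^{-\delta t}\, b_0 \left( p q x(t) - c \right) dt \\ &= b_0 \int_{t^*(b_0)}^\infty e^{-\delta t} \left( p q K \left( 1 - \frac{q b_0}{\gamma} \right) - c \right) dt \\ &= \frac{b_0}{\delta} \left( p q K \left( 1 - \frac{q b_0}{\gamma} \right) - c \right) e^{-\delta t^*(b_0)} . \end{aligned} \tag{5.19} \]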
The profit function is nonnegative when pqK(1 − qb0/γ) > c, with c denoting
the total cost per boat. Then for
\[ 0 \le b_0 \le \frac{\gamma}{q} \left( 1 - \frac{c}{p q K} \right) \tag{5.20} \]
the function J is bounded from below by zero and has a unique maximum
by Rolle’s theorem. (Recall that the optimal fleet size for the maximum
sustainable catch was b∗0 = γ/(2q).) An example with the parameters
γ = 0.25, K = 30, q = 0.025, p = 10, δ = 0.2, x0 = 2 , (5.21)
is shown in Figure 5.3.
Problems
Exercise 5.3. Prove Lemma 5.1.
Exercise 5.4. Consider the time-discrete logistic model with seasonal fishing
\[ \begin{aligned} x^*_{n+1} &= x_n + \tilde\gamma x_n \left( 1 - \frac{x_n}{K} \right) \\ x_{n+1} &= (1 - \tilde q b_0)\, x^*_{n+1} \end{aligned} \]
that can be interpreted as the forward Euler discretization of (5.5) with γ̃ = γ∆t1
and q̃ = q∆t2 . Compute the maximum sustainable average catch as a function
of ∆t1 (recovery period) ∆t2 (fishing season) and b0 (number of boats).
Exercise 5.5. Compute the optimal fleet size for the parameters (5.21) to obtain
a) the maximum sustainable catch,
b) the maximum long-term profit,
as described in the text. Plot the profit functions in both cases, compare the
results and discuss the role of the discount parameter δ.
[Figure 5.3: plot of the profit J against the fleet size b0.]
Figure 6.1: Detailed radiative energy balance (source: Trenberth et al., Bulletin
of the American Meteorological Society, 2009).
temperature T (t) by one kelvin (the heat capacity varies a lot between land, wa-
ter, etc, but we consider again the global average value C). The energy needed
to increase T by an amount ∆T after a time ∆t, i.e. T (t + ∆t) = T (t) + ∆T is
thus AC∆T , where A is the surface area of the planet.
Now let Ein be the average amount of solar energy reaching one square
meter of the Earth’s surface per unit time, and Eout be the average amount of
energy emitted by one square meter of the Earth’s surface and released into the
stratosphere per unit time. Then we have
Letting ∆t tend to zero, we obtain the global energy balance model describing
the evolution of T ,
\[ C \frac{dT}{dt} = E_{\rm in} - E_{\rm out} . \tag{6.1} \]
We assume that no forcing modifies the solar energy or radiative properties; Ein
and Eout thus do not depend explicitly on time (but they depend on T ). If the
incoming energy balances the outgoing energy, the Earth’s temperature remains
constant and the planet is said to be in thermal equilibrium. To specify Ein
and Eout , we consider the following:
• Viewed from the Sun, the Earth is a disk of area πR2 , where R is the
radius of the Earth.
\[ F_{\rm SB}(T) = \sigma T^4 . \]
\[ E_{\rm out}(T) = 4 \pi R^2 \sigma T^4 . \]
[Figure: albedo [-] as a function of temperature [K].]
Considering a typical albedo value of α = 0.3, the solar energy flux density
S0 = 1368 W m−2 (thus Q = 342 W m−2), the equilibrium temperature is
T∗ = 254.8 K. However, the actual value of the surface average temperature is
T∗ = 287.7 K. The difference is largely explained by the greenhouse effect of
the Earth's atmosphere, that is, the effect of gases like CO2, water vapor and
methane. The greenhouse gases only affect the infrared (long wavelengths) part
of the energy spectrum, and thus only the energy radiated by the Earth. We
include this effect through a factor 0 < ε < 1 which reduces the outgoing energy,
leading to
\[ C \frac{dT}{dt} = (1 - \alpha) Q - \varepsilon \sigma T^4 , \tag{6.3} \]
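The equilibrium temperatures quoted above can be reproduced by setting the right-hand side of (6.3) to zero. The sketch below is illustrative: `eps` stands for the reduction factor multiplying the outgoing term σT⁴ (with `eps = 1` meaning no greenhouse effect), the value of the Stefan-Boltzmann constant is the standard one, and the function names are assumptions.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant in W m^-2 K^-4


def equilibrium_temperature(alpha=0.3, Q=342.0, eps=1.0):
    """Temperature at which (1 - alpha)*Q balances eps*SIGMA*T**4."""
    return ((1.0 - alpha) * Q / (eps * SIGMA)) ** 0.25


def required_eps(T, alpha=0.3, Q=342.0):
    """Reduction factor for which T is an equilibrium of the balance model."""
    return (1.0 - alpha) * Q / (SIGMA * T ** 4)
```

Without the reduction factor the model yields roughly 255 K; a factor of about 0.62 is needed to match the observed 287.7 K.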
[Figure: radiation [W m−2] as a function of temperature [K].]
Qualitative solution Steady states or fixed points are obtained when the left-hand
side of (6.4) vanishes. There are three equilibria. Two of them are stable
and one is unstable. The leftmost is a snowball solution, the rightmost is an
ice-free solution.
[Figure: bifurcation diagram of the temperature [K] against the bifurcation parameter Q/Q0 [-]; legend: stable, unstable, current, T cr; two tipping points are marked.]
[Figure: three bifurcation diagrams showing stable and unstable branches of the equilibria x*.]
Thus, the equilibrium x = 0 is stable for λ < 0 and unstable for λ > 0, while the
equilibrium x = λ is unstable for λ < 0 and stable for λ > 0. This transcritical
bifurcation arises in systems where there is some ”trivial” solution branch (here
corresponding to x = 0), which exists for all values of the parameter λ. There is
a second branch x = λ that crosses the first one at the bifurcation point (x, λ) =
(0, 0). When the branches cross one solution goes from stable to unstable while
the other goes from unstable to stable (Fig. 6.5).
\[ \dot{x} = \lambda - x^2 . \tag{6.7} \]
\[ \dot{x} = \lambda + x - x^3 . \tag{6.8} \]
The equation f (λ, x) = 0 has one solution for λ < −λ∗ , three solutions for
−λ∗ < λ < λ∗ , and one solution for λ > λ∗ . For λ = ±λ∗ , there are two
solutions.
The dynamics of (6.8) have a particularity: suppose we have the capability to
continuously vary the value of the parameter λ, for example by changing some
of the physics (e.g. emitting CO2 into the atmosphere, releasing large amounts of
freshwater into the ocean by melting the polar ice cap, etc.). If we start with a
large negative value of λ, the system will eventually reach an equilibrium on the
lower branch of the bifurcation diagram. As we keep increasing λ, the system
will ”slide” along the stable branch of the bifurcation diagram until it exceeds
the value λ∗ , where it will transition quickly to the upper stable branch of the
bifurcation diagram (why quickly?). As we further increase λ, the system will
now slide along the upper branch. The reverse change from the upper to the
lower branch will however necessitate a large decrease of λ, since it will occur
at the value −λ∗ . That is, the parameter value at which the transition occurs
depends on the direction in which the parameter is varied. This phenomenon is
called hysteresis.
\[ \dot{x} = \lambda x - x^3 . \tag{6.9} \]
\[ \dot{x} = \lambda x + x^3 . \tag{6.10} \]
In this case, we have three equilibria x = 0 (stable) and x = ±√(−λ) (unstable)
for λ < 0, and one unstable equilibrium x = 0 for λ > 0. A supercritical
pitchfork bifurcation leads to a "soft" loss of stability, in which the system can
go to nearby stable equilibria x = ±√λ when the equilibrium at x = 0 loses
stability as λ passes through 0. On the other hand, a subcritical pitchfork
bifurcation leads to a "hard" loss of stability, in which there are no nearby
equilibria and the system goes to some far-off dynamics (perhaps to infinity)
when the equilibrium at x = 0 loses stability.
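The equilibria of the two pitchfork normal forms (6.9) and (6.10) and their stability can be tabulated with a few lines of code. This is a small sketch with a hypothetical function name; stability is decided by the sign of the derivative of the right-hand side at the equilibrium, so the degenerate case λ = 0, where this derivative vanishes, is not classified reliably.

```python
import math


def pitchfork_equilibria(lam, supercritical=True):
    """Return (x*, label) pairs for x' = lam*x - x**3 or x' = lam*x + x**3."""
    sign = -1.0 if supercritical else 1.0  # sign of the cubic term
    roots = [0.0]
    # Nonzero equilibria solve lam + sign*x**2 = 0, i.e. x**2 = -lam/sign.
    r2 = -lam / sign
    if r2 > 0:
        r = math.sqrt(r2)
        roots = [-r, 0.0, r]
    result = []
    for x in roots:
        slope = lam + 3.0 * sign * x * x  # derivative of the right-hand side
        result.append((x, "stable" if slope < 0 else "unstable"))
    return result
```

For example, `pitchfork_equilibria(1.0)` lists the unstable origin between two stable equilibria at ±1, while `pitchfork_equilibria(1.0, supercritical=False)` returns only the unstable origin.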
\[ \frac{\partial f}{\partial x}(\lambda, x^*) \neq 0 . \]
This is a consequence of the implicit function theorem. Therefore, solution
branches are expected to meet at points where f(λ, x∗) = 0 and ∂f/∂x (λ, x∗) = 0.
These are candidates for bifurcation points.
7 Modelling of chemical reactions
Mathematical tools & concepts: conditional probabilities, ODE
Suggested reference: [Hig08]
The main theme of this section will be the stochastic modelling of chemical
reactions. Nonetheless the reader may replace chemical reaction by evolutionary
game or the like. To begin with, we mention two prototypical examples:
\[ S + E \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} SE \overset{k_2}{\longrightarrow} E + P \]
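\[ \mathrm{Cl} + \mathrm{O}_3 \overset{k_1}{\longrightarrow} \mathrm{ClO} + \mathrm{O}_2 \]
and
\[ \mathrm{ClO} + \mathrm{O}_3 \overset{k_2}{\longrightarrow} \mathrm{Cl} + 2\,\mathrm{O}_2 . \]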
The second reaction recreates the original chlorine atom, which can repeat the
first reaction and continue to destroy ozone (i.e., chlorine acts as a catalyst).
the state vector, with Xi (t) ∈ {0, 1, 2, 3, . . .} being the number of molecules at
time t ≥ 0. If any of the M reactions fires at time t, say, the j-th reaction, the
state vector is updated according to the rule
Figure 7.1: Chemical reaction catalysed by an enzyme.
The function $a_j : \mathbb{N}_0^N \to \mathbb{R}_+$ is called the propensity of the reaction. The exact
functional form of aj depends on the type of the reaction.
Example 7.1. As an example that will guide us through this section consider
three species A, B, C with the single binary reaction
\[ A + B \overset{c}{\longrightarrow} C . \]
Since the reaction turns one A and one B into one C, the stoichiometric vector
is ν1 = (−1, −1, 1). Now suppose that we have an initial mixture
consisting of four molecules of type A, three molecules of type B and zero C
molecules, i.e. X(0) = (4, 3, 0). Then, since the total number of particles is
finite, the set of possible states at time t > 0 is
Note that the state X(t) = (1, 0, 3) is a fixed point, also called absorbing state,
since all the B molecules are eaten up and no further reaction can happen.
The propensity of the reaction results from the consideration that, in a well-
stirred system, the probability of a reaction happening per unit of time must be
proportional to the number of both A and B molecules, which implies that
a1 (x1 , x2 , x3 ) = cx1 x2
of molecules at time t. The idea is to derive a differential equation for the
probability to have x = (x1 , . . . , xN ) molecules at time t,
ρ(x, t) = P (X(t) = x) , (7.4)
given that we know the probability distribution of states at time t = 0. (Note
that this entails the situation that X(0) is known exactly.)
Proof. Since the {Bj}j=0,...,M are a partition of Ω, we can write A ⊂ Ω as
\[ A = A \cap \bigcup_{j=0}^{M} B_j = \bigcup_{j=0}^{M} (A \cap B_j) , \]
where we have used the distributive law in the second equality. Since any
probability measure P is countably additive (σ-additive) and all the A ∩ Bj are disjoint,
we have
\[ \begin{aligned} P(A) &= P\Big( \bigcup_{j=0}^{M} (A \cap B_j) \Big) \\ &= \sum_{j=0}^{M} P(A \cap B_j) \\ &= \sum_{j=0}^{M} P(A | B_j)\, P(B_j) . \end{aligned} \]
Rearranging the terms and dividing by dt yields
\[ \frac{\rho(x, t + dt) - \rho(x, t)}{dt} = \sum_{j=1}^{M} a_j(x - \nu_j)\, \rho(x - \nu_j, t) - \sum_{j=1}^{M} a_j(x)\, \rho(x, t) . \tag{7.6} \]
Chemical master equation We take advantage of the fact that the right
hand side of (7.6) is independent of dt and that the expression on the left
is a finite-difference approximation of the partial derivative with respect to t:
Letting dt → 0, we obtain the chemical master equation (CME)16
\[ \frac{\partial}{\partial t} \rho(x, t) = \sum_{j=1}^{M} a_j(x - \nu_j)\, \rho(x - \nu_j, t) - \sum_{j=1}^{M} a_j(x)\, \rho(x, t) . \tag{7.7} \]
\[ \frac{\partial}{\partial t} \rho(x_1, x_2, x_3, t) = c (x_1 + 1)(x_2 + 1)\, \rho(x_1 + 1, x_2 + 1, x_3 - 1, t) - c x_1 x_2\, \rho(x_1, x_2, x_3, t) . \]
To see that the CME is indeed equivalent to a linear ODE system, let us define
the vector u = (u1 , . . . , u4 ) with 0 ≤ ui ≤ 1 given by
all. The reason is that the propensities were only defined in terms of the infinitesimal time
increment dt, i.e., the right hand side of (7.5) is already a linearization in dt, which is why
after dividing by the time increment it becomes constant (i.e. independent of dt).
In terms of the new state vector u the CME can be recast as
u̇1 = −12cu1
u̇2 = 12cu1 − 6cu2
u̇3 = 6cu2 − 2cu3
u̇4 = 2cu3
where the dot denotes the derivative with respect to t. In other words, we have
rewritten the CME as the linear system of equations
\[ u'(t) = A u(t) , \quad A = \begin{pmatrix} -12c & 0 & 0 & 0 \\ 12c & -6c & 0 & 0 \\ 0 & 6c & -2c & 0 \\ 0 & 0 & 2c & 0 \end{pmatrix} \]
with real eigenvalues λ ∈ {0, −2c, −6c, −12c}. The simple eigenvalue λ1 = 0
corresponds to the asymptotically stable equilibrium point of the CME, which is
the stationary probability of the absorbing state x∗ = (1, 0, 3), i.e.,
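The convergence of the linear system to the absorbing state can be observed numerically. The following sketch integrates the four equations for u with the forward Euler method, assuming that u1 is the probability of the initial state X(0) = (4, 3, 0) and u4 the probability of the absorbing state, as the structure of the equations suggests; the function names, the value of c and the step size are illustrative.

```python
def cme_rhs(u, c):
    """Right-hand side A*u of the linear system for the reaction A + B -> C."""
    u1, u2, u3, u4 = u
    return [-12 * c * u1,
            12 * c * u1 - 6 * c * u2,
            6 * c * u2 - 2 * c * u3,
            2 * c * u3]


def integrate_cme(c=1.0, t_end=10.0, dt=1e-3):
    """Forward Euler integration starting from the initial state with probability one."""
    u = [1.0, 0.0, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        du = cme_rhs(u, c)
        u = [ui + dt * dui for ui, dui in zip(u, du)]
    return u
```

Since the columns of A sum to zero, the total probability u1 + u2 + u3 + u4 is conserved, and for large t practically all probability mass sits in u4.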
Time until next reaction To make this intuitive idea precise, let X(t) = x
and consider the time τ until the next reaction fires. Call
the probability that no reaction happens in the finite interval [t, t + τ ) with
τ > 0. Further let us suppose that whatever happens in [t, t + τ ) is independent
of what happens in [t + τ, t + τ + s) for all s > 0, in other words, the system
is memoryless. Then, by independence, the probability that no reaction fires in
[t, t + τ + dτ ) can be written as
\[ p_0(\tau + d\tau; x, t) = p_0(\tau; x, t) \Big( 1 - \sum_{j=1}^{M} a_j(x)\, d\tau \Big) , \tag{7.10} \]
where, by definition of the propensities, the term in the parenthesis is the prob-
ability of no reaction between t + τ and t + τ + dτ , given that X(t + τ ) = x.
Rearranging terms, dividing by dτ and letting dτ → 0, it follows that
\[ \frac{dp_0}{d\tau} = -a_{\rm tot}\, p_0 , \tag{7.11} \]
with
\[ a_{\rm tot}(x) = \sum_{j=1}^{M} a_j(x) . \tag{7.12} \]
which is to say that τ is an exponential waiting time with parameter atot .17
This implies that the average waiting time between two reactions is
Next reaction index To determine the next reaction, we define p1 (τ, j; x, t)dτ
to be the probability that no reaction happens in the interval [t, t+τ ) and the j-
th reaction fires in [t+τ, t+τ +dτ ), given that X(t) = x. Then, by independence
and definition of the propensities, we have
Using (7.13), the latter can be recast as the product of two probability densities:
\[ p_1(\tau, j; x, t) = \frac{a_j(x)}{a_{\rm tot}(x)} \, \big( a_{\rm tot}(x) \exp(-a_{\rm tot}(x) \tau) \big) . \tag{7.16} \]
We notice that p1 is the joint probability density of the time until the next reaction
τ and the reaction index j. The (conditional) probability of the reaction index j
is proportional to aj with atot as normalization constant. Since p1 is a product
density, both random variables τ and j are independent and can be drawn
independently. The following algorithm goes back to [Gil77].
The following Lemma is helpful for generating exponentially distributed ran-
dom variables from a uniformly distributed random variable.
Z̃ = F −1 (U )
17 Waiting times are memoryless iff they are exponentially distributed.
Algorithm 1 Stochastic Simulation Algorithm
Given X(0) = x, define atot(x) = Σj aj(x).
while t < T do
  Generate exponential waiting time τ ∼ Exp(atot(X(t))).
  Pick reaction index j randomly with probability aj(X(t))/atot(X(t)).
  Set t ↦ t + τ and update state vector X(t) ↦ X(t) + νj.
end while
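Algorithm 1 translates almost line by line into code. The sketch below is one possible implementation and not the reference code of [Gil77]: the function signature is an assumption, the waiting time is generated by inverse transform sampling from a uniform random number, and the loop additionally stops when the total propensity vanishes (an absorbing state), a case that the pseudocode leaves implicit.

```python
import math
import random


def ssa(x0, nus, propensities, t_end, rng):
    """Simulate one trajectory; returns a list of (time, state) pairs."""
    t, x = 0.0, list(x0)
    path = [(t, tuple(x))]
    while t < t_end:
        a = [a_j(x) for a_j in propensities]
        a_tot = sum(a)
        if a_tot <= 0.0:  # absorbing state: no reaction can fire
            break
        # Inverse transform: -log(1 - U)/a_tot is exponential with rate a_tot.
        tau = -math.log(1.0 - rng.random()) / a_tot
        # Pick reaction j with probability a[j]/a_tot.
        threshold, cumulative, j = rng.random() * a_tot, 0.0, 0
        for j, a_j in enumerate(a):
            cumulative += a_j
            if threshold < cumulative:
                break
        t += tau
        x = [xi + ni for xi, ni in zip(x, nus[j])]
        path.append((t, tuple(x)))
    return path
```

For Example 7.1 one would call `ssa((4, 3, 0), [(-1, -1, 1)], [lambda x: c * x[0] * x[1]], T, random.Random(0))` with some rate constant `c` and final time `T`; for large `T` every trajectory ends in the absorbing state (1, 0, 3) after exactly three reactions.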
where we have used the monotonicity of F in the third equality and the fact
that u ∈ [0, 1] is uniformly distributed in the last equality. This shows that the
distribution function of Z̃ is F . The rest of the proof is left as an exercise.
If the cumulative distribution function is a continuous monotonic function,
then the generalized inverse agrees with the standard inverse function.
Problems
Exercise 7.9. Let x = (x1 , x2 , . . .) be the state vector of a system with species
A, B, . . .. Construct propensity functions a(x) for the following reactions.
a) zeroth-order reactions: $\emptyset \overset{c_0}{\longrightarrow} A$
b) first-order reactions: $A \overset{c_1}{\longrightarrow} B$
c) dimerization: $A + A \overset{c_2}{\longrightarrow} B$
with rate constants c± > 0. Let X(t) = (XA (t), XB (t), XC (t)) be the state
vector of the system at time t ≥ 0.
ρ(x, t) = P (X(t) = x) .
b) Let X(0) = (3, 2, 1). Recast the CME as an equivalent system of linear
ODEs and compute its equilibrium state as a function of c±. Explain the
meaning of the equilibrium (Hint: use that ρ(x, t) is a pdf ).
Exercise 7.11. Consider the Michaelis-Menten system
\[ S + E \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} SE \overset{k_2}{\longrightarrow} E + P \]
8 Modelling of traffic flow
Mathematical tools & concepts: delay differential equations, Euler method,
scalar conservation laws, partial differential equations.
Suggested reference: [MeG07, BCD02]
Figure 8.1: Density of vehicles and micro-macro passage.
for, in the real world, nothing can grow or decay forever. One special case of the
DDE is its Markovian limit τ → 0, in which case we obtain a nonlinear system
of N − 1 ordinary time-inhomogeneous differential equations
\[ \begin{aligned} x_1(t) &= \phi(t) \\ \dot{x}_{j+1}(t) &= \frac{k}{m} \log |x_{j+1}(t) - x_j(t)| + a_{j+1} , \quad j = 1, \ldots, N - 1 . \end{aligned} \tag{8.4} \]
\[ \rho(x, t) = \frac{\#\,\text{vehicles in } (x - s, x + s) \text{ at time } t}{2s} , \tag{8.5} \]
where we assume that the street section is symmetric around the position x ∈ R.
We regard ρ as a macroscopic variable that replaces the detailed microscopic
description in terms of the positions of single vehicles by a coarse-grained de-
scription in terms of (average) numbers of cars per street section; clearly ρ
depends on the length of the street section over which we average, but it can
be shown that ρ = ρN,s converges to a limit as N → ∞ and l 2s → 0
with N l → const [BCD02]. Here we assume the birds-eye perspective (e.g.
seen from a traffic surveillance helicopter) and busy traffic conditions and, as a
consequence, we may safely ignore the dependence of ρ on N, s.
Example 8.1. One situation in which ρ is independent of N < ∞ and s is
when all vehicles are at equal distance d at any time (which implies that they
are all moving at the same speed), in which case
(cf. Exercise 8.9). As the vehicle length is equal to l, the denominator is bounded
from below by l; therefore the maximum achievable density is the density of
bumper-to-bumper traffic at constant speed, with
Figure 8.2: Fundamental diagram of pedestrian flows (from: [BMR11]).
We want to analyse the maximum capacity of the traffic lane under equilib-
rium conditions. To this end we assume that the observed speed v of vehicles
at (x, t) depends only on the density ρ. In an abuse of notation, we write
It is known from empirical data of traffic flows that there exists a critical density
ρcrit , below which the vehicles move at the maximum possible speed vmax , and
that there is a maximum density ρmax , at which the flow stops. From Example
8.1 it readily follows that ρmax ≤ 1/l. From the critical to the maximum density,
v decays towards zero where it is also known from experimental data that v is
a decreasing function of the density, i.e.
v 0 (ρ) ≤ 0 . (8.9)
Figure 8.2 shows experimental and simulation data of pedestrian flows under
various environmental conditions that shows the universal signature of almost
all traffic flows; the graphical relation v(ρ) is called fundamental diagram.
Steady state and equilibrium flow We suppose that all vehicles (cars,
pedestrians, . . . ) are separated by a distance d > 0 and move at the same
constant speed v. The equilibrium density corresponding to this situation is (cf.
Exercise 8.1)
ρ(x, t) = (d + l)−1 , (x, t) ∈ R × [0, ∞) . (8.10)
In equilibrium, all vehicles move at the same speed vj = dxj /dt, hence together
with the DDE (8.3) it follows that
v = λ log(d + l) + a , (8.11)
where we have introduced the shorthands λ = k/m and a = aj+1 for j =
1, . . . , N − 1. Combining the last two equations, we find the fundamental equi-
librium relation between the speed v and density ρ:
v = −λ log ρ + a , (8.12)
with the yet unknown parameters a and λ that must be determined from data;
by definition of ρmax , it holds that v(ρmax ) = 0, which is equivalent to
Hence
\[ v = -\lambda \log \frac{\rho}{\rho_{\max}} . \tag{8.14} \]
An expression for λ is easily obtained by requiring that v is continuous as a
function of ρ. Setting vmax = v(ρcrit), the last equation entails
\[ \lambda = v_{\max} \left( \log \frac{\rho_{\max}}{\rho_{\rm crit}} \right)^{-1} , \tag{8.15} \]
which, together with the empirical finding that v(ρ) equals vmax below the
critical vehicle density yields the surprisingly general relation
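\[ v(\rho) = \begin{cases} v_{\max} , & \rho \le \rho_{\rm crit} \\ v_{\max} \left\{ \log \dfrac{\rho_{\max}}{\rho_{\rm crit}} \right\}^{-1} \log \dfrac{\rho_{\max}}{\rho} , & \rho > \rho_{\rm crit} . \end{cases} \tag{8.16} \]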
J(ρ) = ρv(ρ) .
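The speed-density relation (8.16) and the resulting flux J(ρ) = ρv(ρ) are simple to evaluate. The sketch below uses hypothetical function names; any numerical values passed to it (for instance vmax = 30, ρcrit = 0.02, ρmax = 0.2) are purely illustrative and not taken from data.

```python
import math


def speed(rho, v_max, rho_crit, rho_max):
    """Equilibrium speed (8.16) as a function of the density rho > 0."""
    if rho <= rho_crit:
        return v_max
    return v_max * math.log(rho_max / rho) / math.log(rho_max / rho_crit)


def flux(rho, v_max, rho_crit, rho_max):
    """Traffic flux J(rho) = rho * v(rho)."""
    return rho * speed(rho, v_max, rho_crit, rho_max)
```

On the logarithmic branch the flux is proportional to ρ log(ρmax/ρ), whose derivative log(ρmax/ρ) − 1 vanishes at ρ = ρmax/e. Provided ρmax/e > ρcrit, this is where the flux attains its maximum, which a grid search over `flux` confirms.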
8.2 Traffic jams and propagation of perturbations
We want to study what happens when the first vehicle brakes, i.e., we want to
study the effect of a perturbation of the lead vehicle on the pursuing vehicles,
when the traffic flows close to the maximum flux point.
To this end, let us go back to the microscopic picture again and consider a
platoon of vehicles under maximum flux conditions as described in the previous
section. We suppose that all vehicles move at constant speed
\[ v(\rho^*) = v_{\max} \left( \log \frac{\rho_{\max}}{\rho_{\rm crit}} \right)^{-1} \tag{8.19} \]
where we have used (8.16) with ρ∗ = ρmax /e and have tacitly assumed that
ρ∗ > ρcrit . Let us further assume that we can extend the time t ≥ 0 to the
whole real axis, t ∈ R, and that the lead vehicle crosses the origin x = 0 at time
t = 0, i.e. x1 (0) = 0. With the sign convention
xj−1 − xj ≥ l > 0 (8.20)
and the shorthand v ∗ = v(ρ∗ ), equation (8.3) becomes
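\[ \begin{aligned} \frac{d}{dt} x_{j+1}(t + \tau) &= \lambda \log |x_{j+1}(t) - x_j(t)| + a \\ &= v^* \log (x_j(t) - x_{j+1}(t)) + v^* \log \rho_{\max} \\ &= v^* \log \left( \rho_{\max} (x_j(t) - x_{j+1}(t)) \right) \end{aligned} \tag{8.21} \]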
where we have used that v ∗ = λ, which follows from (8.15) and (8.20) and which
together with (8.13) entails the relation a = v ∗ log ρmax .
where we set b(t) = kt exp((tb − t)/tb ). Solving the ODE for x1 (t), using (8.22)–
(8.24) we find
\[ x_1(t) = \int_0^t \phi(s)\, ds = v^* t - v^* \int_0^t b(s)\, ds , \quad t > 0 , \tag{8.25} \]
18 The reader should think of v∗ as a model parameter, rather than the instantaneous velocity
of individual vehicles that is given by vj = dxj/dt.
with
\[ \int_0^t b(s)\, ds = e k t_b \left[ t_b - (t + t_b) e^{-t/t_b} \right] . \tag{8.26} \]
We call yj (t) the hypothetical position of the j-th vehicle, if the lead vehicle
had not braked, i.e. without the perturbation. We further call
zj (t) = xj (t) − yj (t) , j = 1, . . . , N , (8.27)
the perturbation displacement due to the perturbation of the lead vehicle’s mo-
tion. The perturbation displacement of the first vehicle then is
\[ z_1(t) = -v^* \int_0^t b(s)\, ds , \quad t > 0 . \tag{8.28} \]
By (8.23), it follows for the pursuing vehicles with indices j = 2, . . . , N ,
zj (t) = xj (t) − v ∗ t + (j − 1)(d + l) , t > 0. (8.29)
Note that zj (t) = 0 for t ≤ 0 and all j = 1, . . . , N . Further note that the
non-collision constraint
xj−1 (t) − xj (t) ≥ l ∀t ∈ R . (8.30)
entails
zj (t) − zj−1 (t) ≤ d ∀t ∈ R . (8.31)
The latter follows from (8.29), together with
l ≤ xj−1 (t) − xj (t) = zj−1 (t) − zj (t) + d + l ∀t ∈ R . (8.32)
Reaction time and the onset of traffic jam Equations (8.27)–(8.31) allow
us to recast (8.22)–(8.23) as a DDE for the perturbation displacement. Bear-
ing in mind that d + l = e/ρmax holds under maximum flow conditions, the
perturbation displacement (8.29) of the pursuing vehicles reads
\[ z_j(t) = x_j(t) - v^* t + \frac{(j - 1) e}{\rho_{\max}} , \quad t > 0 . \tag{8.33} \]
Plugging the last equation into (8.22), using (8.32), we obtain a closed
DDE system for the perturbation displacement for t > 0:
\[ \frac{d}{dt} z_j(t + \tau) = v^* \log \left( \rho_{\max} \left[ \frac{e}{\rho_{\max}} + z_{j-1}(t) - z_j(t) \right] \right) - v^* , \tag{8.34} \]
with j = 2, . . . , N , the lead vehicle displacement
\[ z_1(t) = -v^* \int_0^t b(s)\, ds \tag{8.35} \]
and the initial conditions
zj (0) = 0 , j = 2, . . . , N . (8.36)
Note that (8.34) is equivalent to
\[ \frac{d}{dt} z_j(t) = v^* \log \left\{ 1 + \frac{\rho_{\max}}{e} \left( z_{j-1}(t - \tau) - z_j(t - \tau) \right) \right\} , \tag{8.37} \]
which follows from shifting the independent variable t according to t ↦ t − τ
and moving the rightmost term −v∗ in (8.34) under the logarithm.
Figure 8.3 shows a simulation of (8.34)–(8.36) for different reaction times τ ;
cf. Exercise 8.8 for the parameter values used and for further details.
[Two panels plotting the displacement against time.]
Figure 8.3: The left panel shows how a perturbation propagates in case of short
reaction time (no accident), the right panel shows the case of a too long reaction
time; the 11th car cannot brake anymore and crashes into the 12th car.
2nd vehicle. The equations (8.37) for vehicles 2 to N are DDEs, and care
must be taken with their numerical integration.
If we apply the forward Euler scheme to (8.37) for j = 2, we obtain
\[ \tilde z_2(t + h) = \tilde z_2(t) + h v^* \log \left\{ 1 + \frac{\rho_{\max}}{e} \left( \tilde z_1(t - \tau) - \tilde z_2(t - \tau) \right) \right\} \tag{8.40} \]
The initial condition z̃2 (0) = 0 is not enough to solve (8.40) due to the presence
of the time delay τ > 0: In order to compute the first iterate z̃2 (h), we need
z̃1 (−τ ) and z̃2 (−τ ). In general, the initial condition z̃1 (t) = z̃2 (t) = 0 for
t ∈ [−τ, 0) (which we luckily have) is needed to iterate (8.40) forward in time.
In addition, z̃1 (t) from equation (8.39) is needed for t = 0, h, . . . , hN in order to
compute z̃2 (t) for t = 0, h, . . . , hN using (8.40). Thus we can solve the equation
for the 2nd vehicle after we solved the equation for the 1st.
Iterating this argument, we can solve the equation for the (j + 1)st vehicle
after we solved the equation for the jth vehicle using forward Euler. This even-
tually leads to the numerical solution (z̃j (0), z̃j (h), . . . , z̃j (hN )) being available
for all j = 1, . . . , N .
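The vehicle-by-vehicle procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the code behind Figure 8.3: all parameter values are made up (the values of Exercise 8.8 are not reproduced here), the delay τ is assumed to be an integer multiple of the step size h, the lead vehicle's displacement is evaluated from the closed form (8.26), and the history on [−τ, 0] is set to zero.

```python
import math


def simulate_platoon(n_cars=4, v_star=10.0, rho_max=0.2, k=0.3, t_b=1.0,
                     tau=0.5, h=0.05, t_end=80.0):
    """Forward Euler for the delay equations (8.37); parameters are illustrative."""
    lag = int(round(tau / h))  # delay measured in time steps
    n_steps = int(round(t_end / h))

    def z_lead(t):
        # Displacement of the lead vehicle, using the closed form (8.26).
        if t <= 0.0:
            return 0.0
        integral = math.e * k * t_b * (t_b - (t + t_b) * math.exp(-t / t_b))
        return -v_star * integral

    # z[j][n] approximates the displacement of vehicle j+1 at time n*h.
    z = [[z_lead(n * h) for n in range(n_steps + 1)]]
    for j in range(1, n_cars):
        zj = [0.0] * (n_steps + 1)
        for n in range(n_steps):
            m = n - lag  # index of the delayed time t - tau
            ahead = z[j - 1][m] if m >= 0 else 0.0
            own = zj[m] if m >= 0 else 0.0
            arg = 1.0 + rho_max / math.e * (ahead - own)
            if arg <= 0.0:
                raise ValueError("headway is not positive; the logarithm is undefined")
            zj[n + 1] = zj[n] + h * v_star * math.log(arg)
        z.append(zj)
    return z
```

Because vehicle j only needs the already computed history of vehicle j − 1, the outer loop runs over the vehicles and the inner loop over time. With the mild perturbation chosen here, all followers eventually settle at the same displacement as the lead vehicle.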
where v is the observed speed at location x and time t. We assume that J and
ρ are nonnegative functions. We also make the simplistic assumption that the
speed is a function of density alone, i.e. v = v(ρ). In the following, we will keep
track of the total number of cars in [x1 , x2 ] during a time [t1 , t2 ].
The number of cars entering [x1, x2] through the point x1 during the time
interval [t1, t2] is given by $\int_{t_1}^{t_2} J(x_1, t)\, dt$, and the number of cars leaving [x1, x2]
through the point x2 is $\int_{t_1}^{t_2} J(x_2, t)\, dt$. The number of cars to be found in the
space interval [x1, x2] at time t1 is given by $\int_{x_1}^{x_2} \rho(x, t_1)\, dx$ and the number at
time t2 is given by $\int_{x_1}^{x_2} \rho(x, t_2)\, dx$.
Since the total number of cars in [x1 , x2 ] during a time [t1 , t2 ] is conserved
(no cars disappear or appear), we have
\[ \int_{x_1}^{x_2} \rho(x, t_2)\, dx - \int_{x_1}^{x_2} \rho(x, t_1)\, dx = \int_{t_1}^{t_2} J(x_1, t)\, dt - \int_{t_1}^{t_2} J(x_2, t)\, dt \tag{8.41} \]
The right-hand side can be rewritten similarly, leading to
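\[ \int_{x_1}^{x_2} \int_{t_1}^{t_2} \frac{\partial}{\partial t} \rho(x, t)\, dt\, dx = - \int_{t_1}^{t_2} \int_{x_1}^{x_2} \frac{\partial}{\partial x} J(x, t)\, dx\, dt . \]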
which is true for any choice of rectangle [x1 , x2 ] × [t1 , t2 ]. By the Fundamental
Theorem of the Calculus of Variation (see the following Lemma for a simple
version) this implies that
\[ \frac{\partial}{\partial t} \rho(x, t) + \frac{\partial}{\partial x} J(x, t) = 0 , \tag{8.44} \]
which is a first-order conservation law.
Lemma 8.3. If f (x, t) is a continuous function defined on R2 such that
\[ \iint_R f(x, y)\, dx\, dy = 0 . \]
then
\[ \iint_{R_\delta} f(x, y)\, dx\, dy \ge \frac{1}{2} \iint_{R_\delta} f(x_0, y_0)\, dx\, dy . \]
\[ 0 \ge 2 \delta^2 f(x_0, y_0) , \]
The state equation has the right behavior, i.e. the traffic flux J increases linearly
for small density ρ, levels off until a maximum is reached and then decreases
until J becomes zero at bumper-to-bumper traffic. However, the derivative has
a jump discontinuity at ρ = ρcrit , which is not so realistic. Moreover, it was
derived under equilibrium conditions which will not necessarily be satisfied.
We will now create a similar state equation for J = J(ρ) that is differentiable
for all admissible ρ. A simple choice is J(ρ) = aρ(b − ρ) with a parameter a > 0
and b = ρmax . When doing this, we assume that J = J(ρ) even outside of
equilibrium conditions, meaning that the flux adjusts smoothly to a change in
density.
With this choice, the conservation law (8.44) takes the form
\[ \frac{\partial}{\partial t} \rho(x, t) + J'(\rho) \frac{\partial}{\partial x} \rho(x, t) = 0 . \tag{8.45} \]
This is a first-order partial differential equation (PDE) for ρ. Together with
initial and boundary conditions, this model is solvable using the method of
characteristics.
Remark 8.4. From the definition of the density flux, we have that J(ρ) = v(ρ)ρ.
Therefore the derivative J 0 (ρ) has the dimension of velocity. We will see later
that the PDE expresses that ”traffic waves” propagate with a velocity given by
J 0 (ρ).
but we see that the terms in δ can be dropped since both partial derivatives in
(8.45) are of order δ. Thus we obtain the linearized form of the PDE
\[ \frac{\partial}{\partial t} \rho(x, t) + J'(\rho_0) \frac{\partial}{\partial x} \rho(x, t) = 0 . \tag{8.46} \]
Notice here that J 0 (ρ0 ) is a constant and has the dimension of velocity. We can
call it v0 and get
\[ \frac{\partial}{\partial t} \rho(x, t) + v_0 \frac{\partial}{\partial x} \rho(x, t) = 0 . \tag{8.47} \]
Now by substitution, one can easily see that ρ = f(x − v0 t) is a solution for any
differentiable f(x). Note that ρ = f(x − v0 t) describes a wave moving with
velocity v0. For v0 > 0 the wave moves to the right, for v0 < 0 it moves
to the left.
Example 8.5. If f (x) = sin x, then ρ = sin(x − v0 t), the point (x, t) such that
x − v0 t = π/2 is at the crest of the wave and it moves in the x − t plane along
the straight line x = v0 t + π/2. Thus the solutions of (8.46) represent linear
traffic waves. The velocity v0 is given by
\[ v_0 = J'(\rho_0) = v_{\max} \left( 1 - \frac{2 \rho_0}{\rho_{\max}} \right) . \]
It is important to realise that this velocity is relative to the road surface. Note that
when ρ0 ≈ 0 we have v0 ≈ vmax. This says that density changes propagate
with the velocity of the cars when there are few cars, which is reasonable. It also
means that traffic waves move with the traffic (again reasonable for light traffic).
At the other extreme, when ρ0 ≈ ρmax, we have v0 ≈ −vmax. In this case cars are
moving slowly, and the waves move backward relative to the cars' motion at the
high speed of vmax. This happens when cars move slowly in packed traffic and
one car suddenly stops. The wave of red brake lights caused by many following
cars braking can move towards a driver quickly.
Figure 8.4: The left panel shows the characteristic base curves for Example 8.6,
the right panel shows the corresponding solution f(x, t).
\[ x(t) = x_0 e^{1 - \cos t} . \]
Back to nonlinear scalar conservation laws The method of characteristics
can be used for nonlinear conservation equations like our traffic model.
Consider the nonlinear scalar conservation law given by
\[ \frac{\partial \rho}{\partial t} + J'(\rho) \frac{\partial \rho}{\partial x} = 0 , \tag{8.54} \]
with characteristic base curves such that
\[ \frac{dx}{dt} = J'(\rho(x, t)) , \quad x(0) = x_0 . \]
Assuming a smooth enough ρ, we get a solution x(t) and can rewrite the con-
servation law as
\[ \frac{d}{dt} \rho(x(t), t) = 0 . \]
Thus as before, ρ(x(t), t) = ρ0(x0), meaning that ρ is constant along characteristic
curves. The corresponding characteristic ODE is
\[ \frac{dx}{dt} = J'(\rho(x(t), t)) = J'(\rho_0(x_0)) , \]
and since ρ0 (x0 ) is constant we can integrate to get
In conclusion, the PDE (8.54) has characteristic base curves that are straight
lines and explicitly computable.
Each line has a slope of $[J'(\rho(x_0))]^{-1}$, corresponding to a propagation speed
for the density of $J'(\rho(x_0))$.
[Figure 8.5: characteristic base curves in the (x, t)-plane; the region marked "?" is not reached by any of them.]
The figure reveals a gap region which is not reached by any characteristic
base line. There is an inadequacy in our approach: this gap originates from our
discontinuous change in density, which we will correct for. In other words, we
will now modify the problem in order to have a continuous change in density.
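Let us define a modified initial density
$$\rho_\varepsilon(x,0) = \begin{cases} 1, & x \le -\varepsilon, \\ \dfrac{1}{2} - \dfrac{x}{2\varepsilon}, & -\varepsilon < x \le \varepsilon, \\ 0, & x > \varepsilon, \end{cases} \tag{8.56}$$
with a small parameter ε > 0. Now in the transition region, where
ρ0(x0) = ρε(x0, 0) = 1/2 − x0/(2ε), we get
$$x(t) = (1 - 2\rho_0(x_0))\,t + x_0 = \left(1 - 2\left(\frac{1}{2} - \frac{x_0}{2\varepsilon}\right)\right)t + x_0 = x_0\left(1 + \frac{t}{\varepsilon}\right).$$
The set of these characteristic base curves is known as the rarefaction fan (or
expansion fan) and is shown in Figure 8.6.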
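The characteristic base curves for the smoothed initial density can be tabulated directly. The sketch below assumes the normalisation vmax = ρmax = 1 that is implicit in the speed 1 − 2ρ0 used above; the names `rho_initial`, `characteristic` and `eps` (the half-width of the transition region) are chosen here for illustration.

```python
def rho_initial(x0, eps):
    """Smoothed initial density: 1 behind the light, 0 ahead, linear ramp between."""
    if x0 <= -eps:
        return 1.0
    if x0 <= eps:
        return 0.5 - x0 / (2.0 * eps)
    return 0.0


def characteristic(x0, t, eps):
    """Position at time t of the straight characteristic that starts at x0."""
    speed = 1.0 - 2.0 * rho_initial(x0, eps)   # J'(rho) for v_max = rho_max = 1
    return x0 + speed * t


eps = 0.1   # assumed width parameter of the transition region
t = 2.0
for x0 in (-1.0, -0.05, 0.0, 0.05, 1.0):
    print(x0, characteristic(x0, t, eps))
```

Characteristics that start inside the transition region spread out like x0 (1 + t/eps), so for small eps they fill the whole wedge between the lines x = −t and x = t.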
Figure 8.6: The expansion fan fills in the gap of Figure 8.5.
Waiting time The left end of the expansion fan is the traffic wave
corresponding to ρmax and has a velocity of −vmax. If we consider a car at a
distance D behind the light, the time until the wave reaches the car is thus
t = D/vmax. In city traffic vmax might be 50 km/h. If we assume a car spacing
of 6 m, the waiting time per car is t = (6/1000) · (3600/50) ≈ 0.43 seconds.
However, the typically larger human reaction time has to be added to that value.
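The unit conversion in this estimate is easy to get wrong, so here is a minimal sketch of the computation t = D/vmax. The function name `wave_arrival_time` and the example of a car ten spacings behind the light are illustrative assumptions; human reaction time is not included.

```python
def wave_arrival_time(distance_m, v_max_kmh):
    """Seconds until the start-up wave, moving at speed v_max, has covered distance_m."""
    v_max_ms = v_max_kmh * 1000.0 / 3600.0   # convert km/h to m/s
    return distance_m / v_max_ms


per_car = wave_arrival_time(6.0, 50.0)         # one car spacing of 6 m
tenth_car = wave_arrival_time(10 * 6.0, 50.0)  # a car ten spacings behind the light
print(round(per_car, 2), round(tenth_car, 2))
```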
Vehicle path Once the car located at a distance D behind the light encounters
the traffic wave (which moves with velocity −vmax), it will begin to move. The
car's movement after that is completely independent of the traffic wave and can
be calculated. The density in the expansion fan is
$$\rho(x,t) = \rho_{\max}\,\frac{v_{\max} t - x}{2 v_{\max} t}. \tag{8.58}$$
We already know that the velocity of the car is v = vmax (1 − ρ/ρmax). Inserting
(8.58) in this expression, we obtain the velocity of a car as a function of x and
t. That gives us
$$\frac{dx}{dt} = v_{\max}\left(1 - \frac{1}{\rho_{\max}}\,\rho_{\max}\,\frac{v_{\max} t - x}{2 v_{\max} t}\right). \tag{8.59}$$
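Equation (8.59) is a linear ODE and can be solved in closed form. The following worked derivation is a sketch under the assumption that the car sits at x = −D and starts to move at the time t0 = D/vmax at which the wave reaches it; the symbols t0 and C are introduced here only for illustration.

$$\frac{dx}{dt} = \frac{v_{\max}}{2} + \frac{x}{2t}, \qquad x(t_0) = -D, \qquad t_0 = \frac{D}{v_{\max}}.$$

The homogeneous equation has the solutions $C\sqrt{t}$, and $x = v_{\max} t$ is a particular solution, so that

$$x(t) = v_{\max} t + C\sqrt{t}, \qquad -D = D + C\sqrt{D/v_{\max}} \;\Longrightarrow\; C = -2\sqrt{D\,v_{\max}},$$

and therefore

$$x(t) = v_{\max} t - 2\sqrt{D\,v_{\max}\,t}, \qquad t \ge \frac{D}{v_{\max}}.$$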
Which cars get through Let us say that the light stays green for a time tG .
The last car to get through the light is the one starting from a distance Dlast
such that its position at time tG is xlast (tG ) = 0.
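A numerical check of this condition is sketched below. It assumes that a car starting at distance D behind the light follows dx/dt = vmax/2 + x/(2t), the simplified form of (8.59), from the moment t0 = D/vmax at which the wave reaches it. For this initial value problem the closed-form solution x(t) = vmax t − 2 sqrt(D vmax t) vanishes at t = 4D/vmax, which suggests Dlast = vmax tG / 4. The green phase of 30 seconds and the helper `car_position` are illustrative assumptions.

```python
def car_position(D, v_max, t_end, steps=20000):
    """Integrate dx/dt = v_max/2 + x/(2t) from x(D/v_max) = -D up to t_end (classical RK4)."""

    def f(t, x):
        return 0.5 * v_max + x / (2.0 * t)

    t, x = D / v_max, -D
    h = (t_end - t) / steps
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


v_max = 50.0 / 3.6               # 50 km/h expressed in m/s
t_green = 30.0                   # assumed length of the green phase in seconds
D_last = v_max * t_green / 4.0   # candidate starting distance of the last car, in metres

x_at_end = car_position(D_last, v_max, t_green)
print(D_last, x_at_end)
```

If the candidate value is right, the integrated position at the end of the green phase is zero up to the discretisation error, i.e. the car just reaches the light.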
Problems
Exercise 8.7. Simulate the Markovian approximation (8.4) of the DDE (8.3)
for pedestrians in Matlab using forward Euler.
9 Formal justice
Mathematical tools & concepts: functional equations
Suggested reference: [Ill90]
$$\frac{x}{y} = \frac{m(x)}{m(y)}, \qquad x, y \ge 0. \tag{9.1}$$
$$m(x) = cx \tag{9.2}$$
for some constant c ≥ 0, where the positivity follows from the fact that m takes
only positive (non-negative) values. To see that m(x) is indeed proportional to
x, set y = 1 which implies that m(x) = m(1)x. As we will show below, this is
in fact the only solution of the functional equation (9.1).
c) the system is accurate, i.e., relations between objects (people) are reflected
by the relations between the assigned numbers (wages), and
d) there is an accurate inverse in the sense that for any two different wages
there exist well-defined prototypes of people who qualify for these wages.
Aristotle’s concept of proportional justice meets the first two requirements.
Translating the third requirement, condition c), into mathematics, it entails that
the function m should be a homomorphism, i.e. a structure-preserving map
between relations between people and wage relations. This homomorphism, by the
condition d) on its inverse, is specified to be invertible, hence an isomorphism.
We stipulate that the function m should be a homomorphism with respect to
the ratio scale, in other words, we measure relations between two people with
qualification x > 0 and y > 0 by the ratio x/y. This is to say that we seek a
function m : [0, ∞) → [0, ∞) that satisfies the equation
$$m\left(\frac{x}{y}\right) = \frac{m(x)}{m(y)}, \qquad x, y \ge 0. \tag{9.3}$$
Theorem 9.1. Let m : [0, ∞) → [0, ∞) satisfy the functional equation
$$m\left(\frac{x}{y}\right) = \frac{m(x)}{m(y)}, \qquad x, y \ge 0.$$
If m is continuous at some z ∈ [0, ∞), then m is of the form
$$m(x) = x^s, \qquad s \in \mathbb{R}.$$
Proof. It is clear that $m(x) = x^s$ solves (9.3); however, we have to prove the
converse statement, namely that all solutions of (9.3) that have a point of
continuity are of the form $m(x) = x^s$. The proof proceeds in three steps:
1. We first observe that (9.3) is equivalent to
$$m(xy) = m(x)\,m(y), \qquad x, y \ge 0, \tag{9.4}$$
which follows from the fact that
$$m(xy) = m\left(\frac{x}{1/y}\right) = \frac{m(x)}{m(1/y)} = m(x)\,\frac{m(y)}{m(1)},$$
with
$$m(1) = m\left(\frac{x}{x}\right) = \frac{m(x)}{m(x)} = 1.$$
2. Define the function h : R → R by $h(u) = \log(m(e^u))$. This function solves
Cauchy’s functional equation
$$h(u + v) = h(u) + h(v), \tag{9.5}$$
as can be seen by noting that
$$\begin{aligned} h(u + v) &= \log(m(e^u e^v)) \\ &= \log(m(e^u)\,m(e^v)) \\ &= \log(m(e^u)) + \log(m(e^v)) \\ &= h(u) + h(v). \end{aligned}$$
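Obviously, (9.5) is satisfied by any linear function
$$h(u) = cu, \qquad c \in \mathbb{R}. \tag{9.6}$$
Conversely, applying (9.5) repeatedly gives h(qu) = q h(u) for every integer
q ≥ 1, hence h(1) = p h(1/p) for every integer p ≥ 1; together with
h(−u) = −h(u) this yields h(q/p) = (q/p) h(1),
which proves that (9.6) holds, with c = h(1), for all rational arguments u = q/p. By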
the above assumptions m is continuous at z ≥ 0, as a consequence h is
continuous at w = log z. But then, since
Remark 9.2. There are other solutions to Cauchy’s functional equation (9.5),
but they represent pathological cases; in particular they are nowhere continuous;
for details we refer to [Kuc09].
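The easy direction of Theorem 9.1 and the additivity of h can be spot-checked numerically. The sketch below draws random arguments and measures how far a power law is from satisfying (9.3) and (9.5) in floating-point arithmetic; the exponent s = 1.4, the sampling ranges and the function names are illustrative assumptions, and such a check is of course no substitute for the proof.

```python
import math
import random


def m(x, s):
    """Candidate solution m(x) = x**s of the functional equation (9.3)."""
    return x ** s


def h(u, s):
    """Logarithmic transform h(u) = log(m(exp(u))) used in the proof."""
    return math.log(m(math.exp(u), s))


random.seed(0)
s = 1.4  # illustrative exponent
max_err_ratio = 0.0
max_err_cauchy = 0.0
for _ in range(1000):
    x, y = random.uniform(0.1, 10.0), random.uniform(0.1, 10.0)
    u, v = random.uniform(-2.0, 2.0), random.uniform(-2.0, 2.0)
    # deviation from m(x/y) = m(x)/m(y)
    max_err_ratio = max(max_err_ratio, abs(m(x / y, s) - m(x, s) / m(y, s)))
    # deviation from h(u + v) = h(u) + h(v)
    max_err_cauchy = max(max_err_cauchy, abs(h(u + v, s) - h(u, s) - h(v, s)))
print(max_err_ratio, max_err_cauchy)
```

Both deviations should be at the level of rounding errors.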
Testing whether wages are formally just The solution to the functional
equation does not say anything about how the wages should grow with the
qualification of an employee, because (9.3) does not tell us what s is (or what
it should be in an ideal world). Nevertheless we can use it to test whether a
wage system is consistent, i.e. whether all employees in a company or in a country
receive payment that follows the same “law”. Accordingly (9.3) can be used as
a means for decision making, for example, when negotiating the salary with a
potential future employee or when considering whether to cap bankers’ bonuses
by law.
We suppose that a fair wage scale is given by
$$m(x) = r x^s,$$
with r > 0 being a scaling factor that accounts for, e.g., the currency in which
wages are paid; cf. (9.9) below. Figure 9.1 shows possible qualification-wage
curves for different values of s. Note that the function $m(x) = r x^s$ with s < 0
meets the requirement of formal justice; however, paying the more qualified
candidate the lower salary does not appear to be just by any sensible standard.
[Figure 9.1: Qualification-wage curves m(x) for s = 1, s > 1, 0 < s < 1 and s < 0.]
As an illustration of how our model of formal justice can be used to test wage
scales, consider the situation of three employees working for company X, who
are paid according to their seniority: Alice has been working for her company
for 25 years and makes €2,000,000 per year, Bob, who joined the company 16
years ago, earns €60,000 per year, and Carol after only 3 years gets €40,000.
We do a least squares fit of the linear model
$$\log m(x) = \log r + s \log x.$$
On a logarithmic scale a fair wage scale is a straight line with slope s and,
when the payment is fair, all data points should lie roughly on this straight
line.²⁰ Figure 9.2, which shows the least squares fit of the data, suggests (not
very surprisingly) that Alice is significantly overpaid, whereas Bob is underpaid.
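The fit for the three employees can be reproduced in a few lines. This is a minimal sketch of an ordinary least squares fit of log(salary) against log(years); expressing salaries in thousands of euros and using the natural logarithm are assumptions made here (they match the axis range of Figure 9.2), and neither choice affects the slope s.

```python
import math

# Seniority in years and salary in thousand euros, from the example above
data = {"Alice": (25, 2000), "Bob": (16, 60), "Carol": (3, 40)}

xs = [math.log(years) for years, _ in data.values()]
ys = [math.log(salary) for _, salary in data.values()]

# Ordinary least squares for the line y = log_r + s * x
n = len(xs)
x_mean, y_mean = sum(xs) / n, sum(ys) / n
s = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
     / sum((x - x_mean) ** 2 for x in xs))
log_r = y_mean - s * x_mean

# Positive residual: paid above the fitted line; negative: paid below it
residuals = {name: math.log(sal) - (log_r + s * math.log(yrs))
             for name, (yrs, sal) in data.items()}
print(round(s, 2), residuals)
```

The sign of each residual indicates on which side of the fitted line an employee lies, which is how the over- and underpayment is read off.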
(You may think of other sensible choices. By the proof of Theorem 9.1, all
solutions of this functional equation are linear.) Nonetheless using ratios to
compare measurable quantities is a well-established approach in sociology and
quantitative research, hence we will stick to the ratio scale; moreover it is in
line with the historical notion of proportional justice.
20 Clearly three points do not give sufficient statistics, but the example is just meant to
Figure 9.2: Least squares fit of a wage system (log scale), with exponent s ≈ 1.4.
which is different from our original functional equation (9.3). We can, however,
account for this lack of scale invariance by simply replacing our model (9.3) by
the rescaled model (9.9), which then has solutions of the form
Formal justice with multiple objectives One may argue that the previ-
ous concept of formal justice is “too one-dimensional”, in that it tacitly assumes
that qualification or merit can be measured by a single parameter. It is more
reasonable to assume that the regular payment that an employee receives de-
pends on various independent parameters x, y, z . . ., such as formal degree of
education, seniority, extra professional qualifications and so on, which means
that the compensation will be a function m(x, y, z, . . .).
As an example we consider the case m = m(x, y). The idea is that the rule
of formal justice, i.e. (9.3) or (9.9) should apply to each qualification measure
separately. That is, we require that m : [0, ∞) × [0, ∞) → [0, ∞) solves the
Problems
Exercise 9.3. Let h : R → R solve the functional equation h(t + s) = h(t) + h(s)
for all s, t ∈ R. Prove that
a) h(−t) = −h(t), and
b) h(t − s) = h(t) − h(s) for all s, t ∈ R.
Exercise 9.4. Salaries for headteachers in England and Wales (excluding
London) range from £42,803 to £106,148 based upon a performance group index
that involves, e.g., school leadership, management or pupil progress. The 2013
pay ranges for headteachers are recorded in the following table:
group    annual salary in £
1        42,803 – 57,520
2        44,971 – 61,901
3        48,505 – 66,623
4        52,131 – 71,701
5        57,520 – 79,081
6        61,901 – 87,229
7        66,623 – 96,166
8        73,480 – 106,148
For comparison, the following table shows the 2003 base salaries of players
in the U.S. National Football League, depending on their match experience:
group        annual salary in $
Rookies      225,000
2 yrs        300,000
3 yrs        375,000
4 – 6 yrs    450,000
7 – 9 yrs    655,000
10 yrs       755,000
Compare the two salary scales, and explain the rationale behind your
comparison. Would you rate any of the above salary scales as fair?
(Hint: Determine the exponent s in the qualification-salary relation $m(x) = c x^s$
by a least squares fit of the data given.)
References
[And11] D.F. Anderson and T.G. Kurtz. Continuous Time Markov Chain Models for
Chemical Reaction Networks. In: Design and Analysis of Biomolecular Circuits,
H. Koeppl, G. Setti, M. di Bernardo, and D. Densmore (eds.), pp. 3–42, Springer,
New York, 2011.
[Ari94] R. Aris. Mathematical Modelling Techniques. Dover, Mineola, 1994.
[BMR11] A.L. Ballinas-Hernández, A. Muñoz-Meléndez, A. Rangel-Huerta. Multiagent
System Applied to the Modeling and Simulation of Pedestrian Traffic in Coun-
terflow. J. Artif. Soc. Soc. Simulat. 14(3), 2, 2011.
[BCD02] N. Bellomo, V. Coscia, M. Delitala. On the Mathematical Theory of Vehicular
Traffic Flow I. Fluid Dynamic and Kinetic Modelling. Math. Mod. Meth. App.
Sc. 12, 1801–1843, 2002.
[Ben00] E.A. Bender. An Introduction to Mathematical Modeling. Dover, Mineola, 2000.
[Bie05] A.A. Biewener. Biomechanical consequences of scaling. J. Exp. Biol. 208, 1665–
1676, 2005.
[Buc14] E. Buckingham. On Physically Similar Systems; Illustrations of the Use of Di-
mensional Equations. Phys. Rev. 4, 345–376, 1914.
[Dou76] P.H. Douglas. The Cobb-Douglas Production Function Once Again: Its History,
Its Testing, and Some New Empirical Values. J. Polit. Econ. 84, 903–916, 1976.
[Gil77] D.T. Gillespie. Exact Stochastic Simulation of Coupled Chemical Reactions. J.
Phys. Chem. 81, 2340–2361, 1977.
[Hig08] D.J. Higham. Modeling and Simulating Chemical Reactions. SIAM Review 50,
347–368, 2008.
[Ill90] R. Illner. Formal justice and functional equations. Technical Reports (Mathe-
matics and Statistics), University of Victoria DMS-541-IR, 1990.
[IBM+05] R. Illner, C.S. Bohun, S. McCollum, and T. van Roode. Mathematical Modelling:
A Case Studies Approach. AMS, Providence, 2005.
[Izh07] E.M. Izhikevich. Dynamical Systems in Neuroscience: The Geometry of Ex-
citability and Bursting. MIT Press, Cambridge, 2007.
[KaEn] H. Kaper and H. Engler. Mathematics and Climate. SIAM, Philadelphia,
2013.
[Kuc09] M. Kuczma. An introduction to the theory of functional equations and inequal-
ities. Cauchy’s equation and Jensen’s inequality. Birkhäuser, Basel, 2009.
[Lot10] A.J. Lotka. Contribution to the Theory of Periodic Reactions. J. Phys. Chem.
14, 271–274, 1910.
[MeG07] M. Mesterton-Gibbons. A Concrete Approach to Mathematical Modelling. Wiley,
Hoboken, 2007.
[Met87] N. Metropolis. The beginning of the Monte Carlo method. Los Alamos Science
15(584), 125–130, 1987.
[BTF+99] C.R. Rao, H. Toutenburg, A. Fieger, C. Heumann, T. Nittner, and S. Scheid.
Linear Models: Least Squares and Alternatives. Springer, Berlin, 1999.
[Tay50] G.I. Taylor. The formation of a blast wave by a very intense explosion II: The
atomic explosion of 1945. Proc. Roy. Soc. A 201, 175–186, 1950.
[Tes12] G. Teschl. Ordinary Differential Equations and Dynamical Systems. AMS,
Providence, 2012.
[Vol26] V. Volterra. Variazioni e fluttuazioni del numero d’individui in specie animali
conviventi. Mem. Acad. Lincei Roma 2, 31–113, 1926.
[Whi96] P. Whittle. Optimal control: Basics and beyond. Wiley & Sons, Chichester,
1996.