MATH 6192 - Main Lecture Notes - New Version - February 21st - 2024

The document is a comprehensive guide on advanced mathematical modeling, covering various topics such as Newton's laws, harmonic oscillators, calculus of variations, and mathematical epidemiology. It includes theoretical concepts, worked examples, and exercises aimed at graduate students. The text is structured to facilitate understanding and application of mathematical modeling techniques in different scientific contexts.

Uploaded by

Kene De Gannes

Advanced Mathematical Modelling

Donna M. G. Comissiong

January 2024
Contents

Dedication
Preface
Acknowledgements

1 Introduction to Mathematical Modelling
    1.1 Focus of This Course

2 Models from Newton’s Laws of Motion
    2.1 Derivation of Kepler’s Laws of Planetary Motion
        2.1.1 Exercise
    2.2 The Laws of Conservation of Energy
        2.2.1 Example 1: Escape Velocity
        2.2.2 Example 2: Oscillation Period of a Pendulum
        2.2.3 Example 3: An Oscillating System of Masses
        2.2.4 Example 4: Period of Oscillation
        2.2.5 Example 5: Collision of Two Bodies

3 Harmonic Oscillators
    3.1 Introduction
    3.2 Worked Example
    3.3 Case with Friction
    3.4 Case With Friction and External Periodic Force
    3.5 Parametric Resonance
    3.6 Worked Example
    3.7 Exercise

4 Calculus of Variations
    4.1 Introduction
    4.2 Brachistochrone Problem
    4.3 Euler-Lagrange Differential Equation
        4.3.1 Application in Optics
    4.4 Theory of Lagrange Multipliers
        4.4.1 Worked Example
        4.4.2 Exercise
    4.5 Least Action Principle
        4.5.1 Worked Example
        4.5.2 Exercise

5 Concepts in Mechanics
    5.1 Laws of Conservation
        5.1.1 Exercise
    5.2 Lagrangian vs Eulerian Equations of Motion
        5.2.1 Equation for Caustics
        5.2.2 Exercises

6 Linear Stability Analysis
    6.1 Introduction
    6.2 An Application - A Theory for the Formation of Stars
        6.2.1 Challenge Problem
    6.3 Instabilities at the Interface of Two Fluid Layers
        6.3.1 Exercise: Kelvin-Helmholtz Instability
    6.4 The Rayleigh-Taylor Instability
        6.4.1 Exercise

7 Heat Flow Problems
    7.1 Equation of Heat Conduction
    7.2 Self-Similar Solutions
        7.2.1 Example 1
        7.2.2 Example 2
    7.3 The Convection Diffusion Equation and Applications
        7.3.1 Introduction
        7.3.2 Rayleigh-Bénard Problem
        7.3.3 Marangoni Problem

8 Introduction to Mathematical Epidemiology
    8.1 What is Epidemiology
    8.2 Classification of Infectious Diseases
    8.3 Basic Definitions
    8.4 Kermack-McKendrick SIR Epidemic Model
    8.5 Basic Reproduction Number
    8.6 Estimating Parameters
        8.6.1 Recovery Rate
        8.6.2 Transmission Rate Constant
    8.7 SIS Model
        8.7.1 Qualitative Analysis of the Logistic Equation
        8.7.2 Stability of a Nonlinear System
    8.8 SIS Model with Saturating Treatment
        8.8.1 Case 1: Unique Positive Endemic Equilibrium
        8.8.2 Case 2: Two Positive Endemic Equilibria or No Endemic Equilibria
        8.8.3 Bistability
    8.9 Models for Population Growth
        8.9.1 Malthusian Model
        8.9.2 Logistic Model
        8.9.3 Simplified Logistic Model
    8.10 SIR Model with Demography
    8.11 Phase-plane Analysis
        8.11.1 Discussion about R0
    8.12 Linearization of 2D Linear Systems
    8.13 Stability of the Equilibria for 2D Linear Systems
    8.14 Local Stability Analysis for SIR Model
    8.15 Bifurcation Diagram
    8.16 Oscillations in Epidemic Models
    8.17 Techniques for Computing R0
        8.17.1 The Construction of More Complex Epidemiological Models
        8.17.2 Stages of Contagion
        8.17.3 Stages Related to Control Strategies
        8.17.4 Stages Related to Pathogen or Host Homogeneity
    8.18 Calculating R0 via the Jacobian
        8.18.1 Example #1: Jacobian reduces to a 2D matrix
        8.18.2 Routh-Hurwitz Criteria in Higher Dimensions
        8.18.3 The Next-Generation Approach (*Van den Driessche and Watmough Method)
        8.18.4 Example: SEIT Model with Treatment and Relapse

Dedication

For my beloved grandfather, the late great William Scholasticus Henry Pyke.

Preface

This text is based on a series of lectures given to graduate students at The University of the West Indies, St Augustine Campus. My main objective is to introduce graduate students to the process of developing mathematical models as a means for solving real-world problems. Several modelling situations will be addressed, starting with some of the more popular examples from Theoretical Physics - such as the derivation of Kepler’s Laws of planetary motion from Newton’s laws of motion. We will include a discussion on the general method of linear stability analysis, and later apply this theory to fluid systems - leading to a discussion of some of the better known phenomena in the field of fluid mechanics/hydrodynamic stability. With respect to chemical applications, we will consider heat flow problems, inclusive of models for convective-diffusive systems. Finally, we will focus on the general theory corresponding to mathematical epidemiology (compartmental models), with applications to the dynamics of infectious diseases.
With such a wide assortment of problems from the basic sciences (Physics, Chemistry and Biology), it is hoped that this text will be a useful tool for graduate students in the applied sciences who might be interested in research and publication in the general field of mathematical modelling.

"I have had my results for a long time: but I do not yet know how I am to arrive
at them." - Carl Friedrich Gauss

Acknowledgements

My sincere appreciation is extended to Professor Harold Ramkissoon for introducing me to the world of applied mathematics, and for his constant support and encouragement throughout my academic career.

Donna M. G. Comissiong, Ph.D.


(married name: Donna M. G. Dyer)
January 2024

Chapter 1

Introduction to Mathematical Modelling

A mathematical model is a description of a system via the tools and language of mathematics. The process of developing the model is what is referred to as mathematical modelling. This process can be applied to any system, biological, chemical, financial or otherwise. Mathematical models are created to help describe a system, to study the components of that system, and to predict possible behaviour of the system subject to given conditions.
The first step in building a mathematical model is to obtain a clear description of the process being modelled. The translation of this description into mathematical equations should remain faithful to the underlying system. A first model should be as simple as possible, addressing only the particular aspects of the system that are relevant to the specific goal of the study. Any assumptions that are being made while formulating these equations must be clearly stated, as the limitations of the resulting model must be understood. Next, the parameters associated with the model need to be estimated. This can be done via statistical/stochastic techniques.
Once the model has been created:

• It may be analyzed to determine the critical quantities that govern the overall behaviour of the system.
• It may be fitted to data or used to simulate data for predicting the behaviour of the system.
• It may be used to determine the importance of each parameter for the overall behaviour of the system.

Models can be classified in multiple ways:

• Linear/Nonlinear: If there is nonlinear dependence on the variables in the equations describing the process, the model is said to be nonlinear. Otherwise, it is said to be linear.
• Static/Dynamic: A dynamic model accounts for variations in time, while a static model assumes that all quantities are time-invariant.
• Discrete/Continuous: Discrete models treat time and system states as taking discrete values. Continuous models allow time and system states to vary continuously.
• Deterministic/Stochastic: In a deterministic model, the state of every variable is uniquely determined by the initial state and the parameters in the model. Stochastic models are random by nature, and variable states are determined by probability distributions.

1.1 Focus of This Course


We will focus mainly on models that are nonlinear, dynamic, continuous and deterministic in nature. We will utilize the basic laws of Physics for the first part of the course. The more common models found in fluid mechanics will be introduced and discussed. We will then switch to one of the most common biological applications - the mathematical modelling of infectious diseases. This is particularly important these days with the onset of the COVID-19 pandemic, and the race to control the spread of the virus.
Chapter 2

Models from Newton’s Laws of Motion

2.1 Derivation of Kepler’s Laws of Planetary Motion

In astronomy, Kepler’s laws give a description of the motion of planets around the
Sun. Kepler’s laws are:

1. The orbit of every planet is an ellipse with the Sun at one of the two foci.
2. A line joining a planet and the Sun sweeps out equal areas during equal intervals
of time. (See Figure 2.1)
3. The square of the orbital period of a planet is directly proportional to the cube
of the semi-major axis of its orbit.

The third law can be written as

\[ \frac{a^3}{T^2} = \text{constant} \]

where $a$ is the length of the semi-major axis and $T$ is the orbital period.
Sir Isaac Newton computed in his "Philosophiæ Naturalis Principia Mathematica"
the acceleration of a planet moving according to Kepler’s first and second laws. He
noted that

1. The direction of the acceleration is always towards the Sun.
2. The magnitude of the acceleration is inversely proportional to the square of the
distance from the Sun.

Figure 2.1: The area $A_1$ is equal to the area $A_2$, as these areas are swept out in the
same time interval $t$.

His theories were based on the assumption that the Sun was the physical cause
of the acceleration of planets. Newton stated that the magnitude of the force on a
planet is in direct proportion to the mass of the planet and in inverse proportion to
the square of the distance from the Sun. This is represented by

\[ F = \frac{GMm}{r^2} \]

where $F = |\vec{F}|$ is the magnitude of the force between the masses, $G$ is the gravitational constant (see footnote 1), $M$ is the
mass of the Sun, $m$ is the mass of the planet, and $r$ is the distance between the centers
of the Sun and the planet in question. Newton’s law of universal gravitation can be
written as a vector equation to account for the direction of the gravitational force as
well as its magnitude as

\[ \vec{F} = -\frac{GMm}{r^3}\,\vec{r} \tag{2.1} \]

which is the same as the scalar form given earlier, except that $\vec{F}$ is now a vector
quantity, and the right hand side is multiplied by the appropriate unit vector.

Footnote 1: Assuming SI units, $F$ is measured in newtons, $M$ and $m$ in kilograms, $r$ in metres, and the
constant $G$ is approximately equal to $6.674 \times 10^{-11}\ \mathrm{N\,m^2\,kg^{-2}}$. The value of the constant $G$ was
first accurately determined from the results of the Cavendish experiment conducted by the British
scientist Henry Cavendish in 1798, although Cavendish did not himself calculate a numerical value
for $G$. This experiment was also the first test of Newton’s theory of gravitation between masses in
the laboratory. It took place 111 years after the publication of Newton’s Principia and 71 years after
Newton’s death, so none of Newton’s calculations could use the value of $G$; instead he could only
calculate a force relative to another force.

From Newton’s second law

\[ \vec{F} = m\vec{a} = m\frac{d\vec{v}}{dt} = m\frac{d^2\vec{r}}{dt^2} \tag{2.2} \]

where $\vec{F}$ represents force, $m$ is mass, $\vec{a}$ is acceleration, $\vec{v}$ is velocity and $\vec{r}$ is the
displacement. Comparing (2.1) and (2.2), we can say that

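\[ \vec{F} = m\left(-\frac{GM}{r^3}\,\vec{r}\right) \tag{2.3} \]

which allows us to say that

\[ \frac{d^2\vec{r}}{dt^2} = -\frac{GM}{r^3}\,\vec{r} \tag{2.4} \]

Now consider

\[ \frac{d}{dt}\left(\vec{r}\times\vec{v}\right) = \frac{d}{dt}\left(\vec{r}\times\frac{d\vec{r}}{dt}\right) \]

Using the product rule,

\[ \frac{d}{dt}\left(\vec{r}\times\vec{v}\right) = \frac{d\vec{r}}{dt}\times\frac{d\vec{r}}{dt} + \vec{r}\times\frac{d^2\vec{r}}{dt^2} = 0 + \vec{r}\times\frac{d^2\vec{r}}{dt^2} \tag{2.5} \]

since the cross product of a vector with itself is zero. Also, from (2.4) it follows that

\[ \frac{d}{dt}\left(\vec{r}\times\vec{v}\right) = \vec{r}\times\left(-\frac{GM}{r^3}\,\vec{r}\right) = -\frac{GM}{r^3}\left(\vec{r}\times\vec{r}\right) = 0 \]

This means that

\[ \vec{r}\times\vec{v} = \text{constant} \]

and so

\[ \vec{r}\times m\vec{v} = \text{constant} \]

(where $m\vec{v}$ is the momentum). Since $\vec{r}$ is always perpendicular to this constant vector, the orbit of the planet is planar.
We can therefore draw the orbit in a single plane and introduce polar coordinates as
follows:

\[ \hat{e}_r = \hat{i}\cos\theta + \hat{j}\sin\theta, \qquad \hat{e}_\theta = -\hat{i}\sin\theta + \hat{j}\cos\theta \tag{2.6} \]

Since

\[ \vec{r} = r\,\hat{e}_r \tag{2.7} \]

it follows from (2.4) and (2.7) that

\[ m\frac{d^2\vec{r}}{dt^2} = -\frac{GMm}{r^3}\,\vec{r} = -\frac{GMm}{r^3}\,r\,\hat{e}_r = -\frac{GMm}{r^2}\,\hat{e}_r \]

which leads to

\[ \frac{d^2\vec{r}}{dt^2} = -\frac{GM}{r^2}\,\hat{e}_r \tag{2.8} \]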
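
Figure 2.2: Introduction of polar coordinates

Also

\[ \frac{d\vec{r}}{dt} = \frac{d}{dt}\left(r\,\hat{e}_r\right) = \frac{dr}{dt}\,\hat{e}_r + r\,\frac{d\hat{e}_r}{dt} \]

Now since $\hat{e}_r$ depends only on $\theta$, and $\theta$ changes with time (see Figure 2.2),
we have

\[ \frac{d\vec{r}}{dt} = \frac{dr}{dt}\,\hat{e}_r + r\,\frac{d\hat{e}_r}{d\theta}\frac{d\theta}{dt} \tag{2.9} \]

In addition, from (2.6)

\[ \frac{d\hat{e}_r}{d\theta} = -\hat{i}\sin\theta + \hat{j}\cos\theta = \hat{e}_\theta \tag{2.10} \]

Using (2.10) in (2.9) we have

\[ \frac{d\vec{r}}{dt} = \frac{dr}{dt}\,\hat{e}_r + r\,\frac{d\theta}{dt}\,\hat{e}_\theta \]

Since $\dfrac{d\hat{e}_r}{d\theta} = \hat{e}_\theta$, it follows that (using the chain rule, and "dots" to indicate time derivatives)

\[ \frac{d}{dt}\hat{e}_r = \frac{d\hat{e}_r}{d\theta}\frac{d\theta}{dt} = \dot{\theta}\,\frac{d\hat{e}_r}{d\theta} = \dot{\theta}\,\hat{e}_\theta \tag{2.11} \]

Now let us take

\[ \frac{dr}{dt} = \dot{r}, \qquad \frac{d\theta}{dt} = \dot{\theta} \]

so we have

\[ \frac{d^2\vec{r}}{dt^2} = \frac{d}{dt}\left(\dot{r}\,\hat{e}_r + r\dot{\theta}\,\hat{e}_\theta\right) = \ddot{r}\,\hat{e}_r + \dot{r}\,\dot{\hat{e}}_r + \dot{r}\dot{\theta}\,\hat{e}_\theta + r\ddot{\theta}\,\hat{e}_\theta + r\dot{\theta}\,\dot{\hat{e}}_\theta \tag{2.12} \]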
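
Now from (2.6)

\[ \frac{d}{d\theta}\hat{e}_\theta = \frac{d}{d\theta}\left(-\hat{i}\sin\theta + \hat{j}\cos\theta\right) = -\left(\hat{i}\cos\theta + \hat{j}\sin\theta\right) = -\hat{e}_r \tag{2.13} \]

Hence

\[ \frac{d}{dt}\hat{e}_\theta = \frac{d\hat{e}_\theta}{d\theta}\frac{d\theta}{dt} = \dot{\theta}\,\frac{d\hat{e}_\theta}{d\theta} = -\dot{\theta}\,\hat{e}_r \tag{2.14} \]

Using (2.11) and (2.14) in (2.12), it follows that

\[
\begin{aligned}
\frac{d^2\vec{r}}{dt^2} &= \ddot{r}\,\hat{e}_r + \dot{r}\,\dot{\hat{e}}_r + \dot{r}\dot{\theta}\,\hat{e}_\theta + r\ddot{\theta}\,\hat{e}_\theta + r\dot{\theta}\,\dot{\hat{e}}_\theta \\
&= \ddot{r}\,\hat{e}_r + \dot{r}\dot{\theta}\,\hat{e}_\theta + \dot{r}\dot{\theta}\,\hat{e}_\theta + r\ddot{\theta}\,\hat{e}_\theta - r\dot{\theta}^2\,\hat{e}_r \\
&= \ddot{r}\,\hat{e}_r + 2\dot{r}\dot{\theta}\,\hat{e}_\theta + r\ddot{\theta}\,\hat{e}_\theta - r\dot{\theta}^2\,\hat{e}_r
\end{aligned}
\]

This leads to

\[ \frac{d^2\vec{r}}{dt^2} = \left(\ddot{r} - r\dot{\theta}^2\right)\hat{e}_r + \left(2\dot{r}\dot{\theta} + r\ddot{\theta}\right)\hat{e}_\theta \tag{2.15} \]

Now since we have from (2.8) that

\[ \frac{d^2\vec{r}}{dt^2} = -\frac{GM}{r^2}\,\hat{e}_r \]

it follows that

\[ \left(\ddot{r} - r\dot{\theta}^2\right)\hat{e}_r + \left(2\dot{r}\dot{\theta} + r\ddot{\theta}\right)\hat{e}_\theta = -\frac{GM}{r^2}\,\hat{e}_r \]

If we let

\[ k = GM \]

we get

\[ \left(\ddot{r} - r\dot{\theta}^2\right)\hat{e}_r + \left(2\dot{r}\dot{\theta} + r\ddot{\theta}\right)\hat{e}_\theta = -\frac{k}{r^2}\,\hat{e}_r \tag{2.16} \]

Clearly this is only possible if the coefficient of $\hat{e}_\theta$ vanishes, and so

\[ 2\dot{r}\dot{\theta} + r\ddot{\theta} = 0 \]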
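
which means that

\[ \frac{1}{r}\frac{d}{dt}\left(r^2\dot{\theta}\right) = 0 \]

and so

\[ r^2\frac{d\theta}{dt} = r^2\dot{\theta} = \text{constant} \tag{2.17} \]

Since the area swept out by the radius vector in time $dt$ is $dA = \tfrac{1}{2}r^2\,d\theta$, this implies that

\[ \frac{dA}{dt} = \text{constant} \]

(see Figure 2.3), which is Kepler’s second law. It states that any planet in the solar
system moves faster near the Sun, so the same area is swept out in a given time as
at larger distances, where the planet moves more slowly.

Figure 2.3: $\dfrac{dA}{dt}$ is always constant

From (2.16), we know also that the coefficients of $\hat{e}_r$ must be equal on both sides
of the equation, which gives

\[ \ddot{r} - r\dot{\theta}^2 = -\frac{k}{r^2} \tag{2.18} \]

and from (2.17)

\[ r^2\dot{\theta} = h = \text{constant} \]

which means that

\[ \dot{\theta} = \frac{h}{r^2} \;\Rightarrow\; \dot{\theta}^2 = \frac{h^2}{r^4} \tag{2.19} \]

Substituting (2.19) into (2.18), we have

\[ \ddot{r} - \frac{h^2}{r^3} = -\frac{k}{r^2} \]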
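
Assuming that the trajectory is a function of $\theta$, i.e. $r = r(\theta)$, we have

\[ \dot{r} = \frac{dr}{d\theta}\frac{d\theta}{dt} = \frac{dr}{d\theta}\,\dot{\theta} = \frac{dr}{d\theta}\frac{h}{r^2} \tag{2.20} \]

from (2.19). It follows that

\[ \ddot{r} = \frac{d}{dt}\dot{r} = \frac{d}{dt}\left(\frac{dr}{d\theta}\frac{h}{r^2}\right) = \frac{d}{d\theta}\left(\frac{dr}{d\theta}\frac{h}{r^2}\right)\frac{d\theta}{dt} \]

Using (2.19) again

\[ \ddot{r} = \frac{d}{d\theta}\left(\frac{dr}{d\theta}\frac{h}{r^2}\right)\dot{\theta} = \frac{d}{d\theta}\left(\frac{dr}{d\theta}\frac{h}{r^2}\right)\frac{h}{r^2} \tag{2.21} \]

Substituting (2.19) and (2.21) into (2.18), we have

\[ \frac{d}{d\theta}\left(\frac{dr}{d\theta}\frac{h}{r^2}\right)\frac{h}{r^2} - r\left(\frac{h}{r^2}\right)^2 = -\frac{k}{r^2} \]

which is

\[ \frac{d}{d\theta}\left(\frac{dr}{d\theta}\frac{h}{r^2}\right)\frac{h}{r^2} - \frac{h^2}{r^3} = -\frac{k}{r^2} \]

This can be written as

\[ \frac{d}{d\theta}\left(\frac{d}{d\theta}\left(-\frac{h}{r}\right)\right)\frac{h}{r^2} - \frac{h^2}{r^3} = -\frac{k}{r^2} \]

and this is equivalent to

\[ -\frac{d^2}{d\theta^2}\left(\frac{h}{r}\right)\frac{h}{r^2} - \frac{h^2}{r^3} = -\frac{k}{r^2} \]

Since $h$ is a constant, we have

\[ \frac{d^2}{d\theta^2}\left(\frac{1}{r}\right) + \frac{1}{r} = \frac{k}{h^2} \tag{2.22} \]

Let us take $z = \dfrac{1}{r}$ in (2.22). This leads to

\[ \frac{d^2 z}{d\theta^2} + z = \frac{k}{h^2} \]

which is a second order ordinary differential equation. The general solution of this
equation (using the method of undetermined coefficients) may be written as

\[ z = A\cos\theta + B\sin\theta + \frac{k}{h^2} \]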
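
which is equivalent to

\[ z = C\cos(\theta - \theta_0) + \frac{k}{h^2} \tag{2.23} \]

where $C$ and $\theta_0$ are constants. Now since $z = \dfrac{1}{r}$, it follows that

\[ r = \frac{1}{C\cos(\theta - \theta_0) + \dfrac{k}{h^2}} = \frac{h^2/k}{1 + E\cos(\theta - \theta_0)} = \frac{P}{1 + E\cos(\theta - \theta_0)} \tag{2.24} \]

where

\[ P = \frac{h^2}{k}, \qquad E = \frac{C}{k/h^2} \]

It follows from (2.24) that, for $0 \le E < 1$, the maximum distance of the orbiting planet from the
Sun is

\[ r_1 = \frac{P}{1 - E} \tag{2.25} \]

and the minimum distance of the orbiting planet from the Sun is

\[ r_2 = \frac{P}{1 + E} \tag{2.26} \]

It follows then that the semi-major axis of the orbit is

\[ a = \frac{1}{2}\left(r_1 + r_2\right) = \frac{1}{2}\left(\frac{P}{1 - E} + \frac{P}{1 + E}\right) = \frac{1}{2}\,\frac{P + EP + P - EP}{1 - E^2} = \frac{P}{1 - E^2} \tag{2.27} \]

Equation (2.24) is the polar equation of a conic section with one focus at the origin; for $0 \le E < 1$ it is an ellipse. This is Kepler’s first law (i.e. that the orbit of every planet is an ellipse with the Sun
at one of the two foci). See Figure 2.4.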
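
The derivation above can be checked numerically. The following is a minimal sketch, not part of the original notes: it integrates equation (2.4) with a velocity Verlet scheme (an assumed choice of integrator) in units where $k = GM = 1$, starting at the point of closest approach so that $\theta_0 = 0$. The function names and the initial values are illustrative assumptions. It monitors how well $r^2\dot{\theta} = h$ is conserved, as in (2.17), and how closely the computed orbit follows $r = P/(1 + E\cos\theta)$ from (2.24).

```python
import math


def acceleration(x, y, k):
    """Right-hand side of (2.4): -k * r_vec / r^3, with k = GM."""
    r3 = (x * x + y * y) ** 1.5
    return -k * x / r3, -k * y / r3


def check_orbit(k=1.0, r0=1.0, v0=1.2, dt=1e-3, steps=15000):
    """Integrate one bound orbit and compare it with (2.17) and (2.24).

    Assumed set-up: the planet starts on the x-axis at distance r0 with a
    velocity v0 perpendicular to the radius vector (closest approach).
    """
    x, y = r0, 0.0
    vx, vy = 0.0, v0
    h = x * vy - y * vx          # r^2 * dtheta/dt, the constant in (2.17)
    P = h * h / k                # P = h^2 / k, as in (2.24)
    E = P / r0 - 1.0             # from r_min = P / (1 + E), equation (2.26)
    ax, ay = acceleration(x, y, k)
    worst_h = 0.0
    worst_r = 0.0
    for _ in range(steps):
        # velocity Verlet: half kick, drift, half kick
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration(x, y, k)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        r = math.hypot(x, y)
        theta = math.atan2(y, x)
        r_formula = P / (1.0 + E * math.cos(theta))
        worst_h = max(worst_h, abs((x * vy - y * vx) - h) / h)
        worst_r = max(worst_r, abs(r - r_formula) / r_formula)
    return E, worst_h, worst_r


E, worst_h, worst_r = check_orbit()
print(f"eccentricity E = {E:.3f}")
print(f"largest relative drift in h = {worst_h:.2e}")
print(f"largest relative deviation from (2.24) = {worst_r:.2e}")
```

Both reported deviations should be small compared with 1; the second one shrinks further as the time step `dt` is reduced.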
Let us now demonstrate Kepler’s third law, which is that

\[ \frac{a^3}{T^2} = \text{constant} \]

To do this, we must find the period of revolution $T$. If $r = r(\theta)$, from (2.19) we get

\[ \dot{\theta} = \frac{h}{r^2(\theta)} \;\Rightarrow\; \frac{d\theta}{dt} = \frac{h}{r^2(\theta)} \]

Integrating we get

\[ \int r^2(\theta)\,d\theta = h\int dt \]

Figure 2.4: Kepler’s first law

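Recall that the area swept out in time $dt$ is $dA = \tfrac{1}{2}r^2\,d\theta$, so that

\[ \frac{dA}{dt} = \frac{1}{2}r^2\dot{\theta} = \frac{h}{2} = \text{constant} \]

so if we call $A$ the area swept out by a complete revolution of the planet in question
about the Sun, and $T$ is the period of revolution, then

\[ \frac{A}{T} = \frac{h}{2} \tag{2.28} \]

We refer now to Figure 2.5 and the general property of any ellipse that the distance
from any point of an ellipse to one focus plus the distance from that same point to
the other focus is constant. Here $L$ is the distance from the centre of the ellipse to a focus, and $x$ is the distance from a focus to an end of the minor axis. Therefore

\[ 2x = (a - L) + (a + L) = 2a \]

and so

\[ x = a \]

where $a$ is the semi-major axis calculated before in (2.27). The area of the ellipse is

\[ A = \pi a b \tag{2.29} \]

where

\[ b = \sqrt{a^2 - L^2} \tag{2.30} \]

Now from (2.26) we have that

\[ r_2 = a - L = \frac{P}{1 + E} \]

which gives

\[ L = a - \frac{P}{1 + E} \]

Figure 2.5: Elliptical orbit of a planet

and using (2.27) we get

\[ L = \frac{P}{1 - E^2} - \frac{P}{1 + E} = \frac{PE}{1 - E^2} = aE \tag{2.31} \]

Using (2.31) in (2.30), we get

\[ b = \sqrt{a^2 - L^2} = \sqrt{a^2 - a^2E^2} = a\sqrt{1 - E^2} \tag{2.32} \]

It follows then that the area of the ellipse is

\[ A = \pi a \cdot a\sqrt{1 - E^2} \tag{2.33} \]

and (2.28) becomes

\[ \frac{\pi a \cdot a\sqrt{1 - E^2}}{T} = \frac{h}{2} \tag{2.34} \]

Now we already have that

\[ a = \frac{P}{1 - E^2} \;\Rightarrow\; \sqrt{1 - E^2} = \sqrt{\frac{P}{a}} \tag{2.35} \]

and substituting (2.35) into (2.34) we get

\[ \frac{\pi a^2\sqrt{P}}{\sqrt{a}\,T} = \frac{h}{2} \;\Rightarrow\; \frac{a^{3/2}}{T} = \frac{h}{2\pi\sqrt{P}} = \frac{h}{2\pi\sqrt{h^2/k}} = \frac{\sqrt{k}}{2\pi} = \frac{\sqrt{GM}}{2\pi} \]

since $k = GM$. But $G$ and $M$ are constants, so we get that

\[ \frac{a^{3/2}}{T} = \text{constant} \]

which means that

\[ \frac{a^3}{T^2} = \text{constant} \]

which is Kepler’s third law (i.e. the square of the orbital period of a planet is directly
proportional to the cube of the semi-major axis of its orbit).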
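
As a quick numerical illustration of the third law (a sketch under stated assumptions, not part of the original notes), the period can be computed directly from the relation $\int r^2(\theta)\,d\theta = h\int dt$ with the orbit $r = P/(1 + E\cos\theta)$, taking $\theta_0 = 0$ and $k = GM = 1$. The helper name, the midpoint quadrature rule and the sample values of $h$ and $E$ are illustrative choices. If the derivation holds, $a^3/T^2$ should come out the same for every bound orbit.

```python
import math


def period_and_semi_major_axis(h, E, k=1.0, n=20000):
    """Return (T, a) for the orbit r = P / (1 + E cos(theta)), P = h^2 / k.

    From (2.19), dt = r^2 dtheta / h, so one revolution takes
    T = (1 / h) * integral of r(theta)^2 over 0 <= theta <= 2 pi.
    The integral is approximated with the midpoint rule (assumed choice).
    """
    P = h * h / k
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        r = P / (1.0 + E * math.cos(theta))
        total += r * r * dtheta
    T = total / h
    a = P / (1.0 - E * E)        # semi-major axis, equation (2.27)
    return T, a


ratios = []
for h, E in [(1.0, 0.0), (1.2, 0.44), (0.8, 0.7)]:
    T, a = period_and_semi_major_axis(h, E)
    ratios.append(a ** 3 / T ** 2)
    print(f"h = {h}, E = {E}: a = {a:.4f}, T = {T:.4f}, a^3/T^2 = {ratios[-1]:.6f}")

print("spread of a^3/T^2 across orbits:", max(ratios) - min(ratios))
```

The three orbits have different sizes and eccentricities, yet the printed ratio should agree to within the quadrature error.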

2.1.1 Exercise
Suppose that we had another gravitation law

\[ |\vec{F}| = \frac{GMm}{r^3} \]

Derive the orbit of planetary motion for this case, and see whether there is stable
planetary motion. Obtain $r = r(\theta)$, and sketch qualitatively the orbits of such
planets.

2.2 The Laws of Conservation of Energy


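We begin with Newton’s second law of motion, Force = Mass × Acceleration:

\[ \vec{F} = m\frac{d\vec{v}}{dt} = m\frac{d^2\vec{r}}{dt^2} = \frac{d\vec{p}}{dt} \]

where $\vec{p}$ is the momentum (i.e. force is the rate of change of momentum). Now if
$\vec{F} = 0$, it follows that $\dfrac{d\vec{p}}{dt} = 0$, which is the same as saying that the momentum is
constant.
Newton’s second law of motion for every particle in a system can be expressed as

\[ \frac{d\vec{p}_i}{dt} = \sum_j \vec{F}_{j \to i} \]

where $\vec{p}_i$ is the momentum of the $i$th particle and $\vec{F}_{j \to i}$ is the force exerted on particle $i$ by particle $j$. For $N$ particles, it follows that

\[ \sum_{i=1}^{N}\frac{d\vec{p}_i}{dt} = \sum_i\sum_j \vec{F}_{j \to i} = 0 \]

The double sum vanishes because Newton’s third law of motion is that every action has an equal and opposite
reaction, which means that

\[ \vec{F}_{i \to j} = -\vec{F}_{j \to i} \]

and therefore

\[ \vec{F}_{1 \to 2} + \vec{F}_{2 \to 1} + \dots + \vec{F}_{3 \to 5} + \vec{F}_{5 \to 3} + \dots = 0 \]

If we take

\[ \sum_i \vec{p}_i = \vec{P} \tag{2.36} \]

then

\[ \sum_i \frac{d\vec{p}_i}{dt} = 0 \;\Rightarrow\; \frac{d\vec{P}}{dt} = \frac{d}{dt}\sum_i m_i\vec{v}_i = 0 \]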

This means that

\[ \sum_i m_i\vec{v}_i = \text{constant} = M\vec{V} \]

where we are considering the system as one big particle of mass $M$

\[ M = \sum_i m_i \tag{2.37} \]

moving with a velocity $\vec{V}$ referred to as the "moving frame velocity" (otherwise
referred to as the centre of mass velocity) defined as

\[ \vec{V} = \frac{\sum_i m_i\vec{v}_i}{M} \tag{2.38} \]

The total momentum measured in the frame moving with this "big" particle is

\[ \vec{P}' = \sum_i m_i\left(\vec{v}_i - \vec{V}\right) = \sum_i m_i\left(\vec{v}_i - \frac{\sum_j m_j\vec{v}_j}{M}\right) \]

which is

\[ \vec{P}' = \sum_i m_i\vec{v}_i - \frac{\sum_i m_i}{M}\sum_j m_j\vec{v}_j \]

Now since

\[ M = \sum_i m_i \]

then

\[ \frac{\sum_i m_i}{M} = 1 \]

which means that

\[ \vec{P}' = \sum_i m_i\vec{v}_i - \sum_j m_j\vec{v}_j = 0 \]

Therefore, the momentum related to this "moving frame velocity" of "one big particle"
is zero.
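
The result can be illustrated with a few arbitrary particles. The sketch below is an illustration only: the masses and two-dimensional velocities are made-up sample values and the function names are hypothetical. It computes the centre of mass velocity from (2.38) and then sums $m_i(\vec{v}_i - \vec{V})$.

```python
def centre_of_mass_velocity(masses, velocities):
    """Equation (2.38): V = sum(m_i * v_i) / M, component by component."""
    M = sum(masses)
    Vx = sum(m * v[0] for m, v in zip(masses, velocities)) / M
    Vy = sum(m * v[1] for m, v in zip(masses, velocities)) / M
    return Vx, Vy


def momentum_in_moving_frame(masses, velocities):
    """Total momentum sum(m_i * (v_i - V)) measured relative to V."""
    Vx, Vy = centre_of_mass_velocity(masses, velocities)
    Px = sum(m * (v[0] - Vx) for m, v in zip(masses, velocities))
    Py = sum(m * (v[1] - Vy) for m, v in zip(masses, velocities))
    return Px, Py


# Sample data (assumed values, chosen only for illustration)
masses = [1.0, 2.5, 0.7]
velocities = [(1.0, 0.0), (-0.4, 2.0), (3.0, -1.5)]

print("centre of mass velocity:", centre_of_mass_velocity(masses, velocities))
print("momentum in the moving frame:", momentum_in_moving_frame(masses, velocities))
```

Up to floating-point rounding, both components of the second result should vanish, whatever masses and velocities are supplied.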
Now the centre of mass velocity from (2.38) and (2.37) is

\[ \vec{V} = \frac{\sum_i m_i\vec{v}_i}{M} = \frac{\sum_i m_i\vec{v}_i}{\sum_i m_i} \]

Also

\[ \vec{V} = \frac{d\vec{R}}{dt} = \frac{\sum_i m_i\vec{v}_i}{\sum_i m_i} = \frac{\sum_i m_i\dfrac{d\vec{r}_i}{dt}}{\sum_i m_i} = \frac{d}{dt}\left(\frac{\sum_i m_i\vec{r}_i}{\sum_i m_i}\right) \]

where

\[ \vec{R} = \frac{\sum_i m_i\vec{r}_i}{\sum_i m_i} \]

is referred to as the centre of mass of the system of particles. For a closed system (as
no particles will enter or leave the system)

\[ \vec{V} = \frac{d\vec{R}}{dt} = \text{constant} \]

Now recall that

\[ \vec{F} = m\frac{d\vec{v}}{dt} \]

hence

\[ m\frac{d\vec{v}}{dt}\cdot\vec{v} = \vec{F}\cdot\vec{v} = \vec{F}\cdot\frac{d\vec{r}}{dt} \]

Therefore

\[ \int_{\text{point 1}}^{\text{point 2}} m\vec{v}\cdot d\vec{v} = \int_{\text{point 1}}^{\text{point 2}} \vec{F}\cdot d\vec{r} \]

and

\[ \frac{m\vec{v}^{\,2}_{\text{point 2}}}{2} - \frac{m\vec{v}^{\,2}_{\text{point 1}}}{2} = \int_{\text{point 1}}^{\text{point 2}} \vec{F}\cdot d\vec{r} \tag{2.39} \]

Now since

\[ E = \frac{m\vec{v}^{\,2}}{2} = \text{kinetic energy} \]

and

\[ \int \vec{F}\cdot d\vec{r} = \text{work done} \]

then (2.39) can be expressed in words by saying that the work done by the force is
the difference between the kinetic energy at the final and the initial points. If the
work done by a force depends only on the initial and the final positions and not on
the path taken, we refer to this force as a potential force. If the path is closed, it
follows that

\[ W = \oint \vec{F}\cdot d\vec{r} = 0 \]
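
Using Stokes’ theorem, this can be expressed as

\[ W = \oint \vec{F}\cdot d\vec{r} = 0 = \iint_S \left(\nabla\times\vec{F}\right)\cdot d\vec{S} \]

Hence

\[ \nabla\times\vec{F} = 0 \]

so $\vec{F}$ is irrotational. It follows that $\vec{F}$ is a potential force, and we can say that

\[ \vec{F} = -\nabla U \]

where $U$ is the potential energy. We can therefore write the difference in kinetic
energies $E_{\text{point 2}} - E_{\text{point 1}}$ as

\[
\begin{aligned}
E_{\text{point 2}} - E_{\text{point 1}} &= \frac{m\vec{v}^{\,2}_{\text{point 2}}}{2} - \frac{m\vec{v}^{\,2}_{\text{point 1}}}{2} = -\int_{\text{point 1}}^{\text{point 2}} \nabla U\cdot d\vec{r} \\
&= -\left(U_{\text{point 2}} - U_{\text{point 1}}\right) = U_{\text{point 1}} - U_{\text{point 2}}
\end{aligned}
\]

In other words

\[ E_{\text{point 2}} - E_{\text{point 1}} = U_{\text{point 1}} - U_{\text{point 2}} \]

Therefore

\[ U_{\text{point 1}} + E_{\text{point 1}} = U_{\text{point 2}} + E_{\text{point 2}} \tag{2.40} \]

and this is referred to as the principle of conservation of energy for a closed potential system,
i.e.

\[ \frac{m\vec{v}^{\,2}}{2} + U = \text{constant for a closed potential system} \tag{2.41} \]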

2.2.1 Example 1: Escape Velocity


Let us now consider an application of the above theory by calculating the escape
velocity for a given object of mass $m$ from a planet of mass $M$. The work done
by the gravity field of the given planet as the object moves from point $r_1$ to point $r_2$ is

\[ W = -\int_{r_1}^{r_2}\frac{GMm}{r^2}\,dr = \left[\frac{GMm}{r}\right]_{r_1}^{r_2} = GMm\left(\frac{1}{r_2} - \frac{1}{r_1}\right) \]

Now for a potential field

\[ W = U_1 - U_2 \]

and since potential energy is defined as

\[ U = -\frac{GMm}{r} \]

therefore, for the given gravity field, the work done as the object moves is

\[ W = U_1 - U_2 = \frac{GMm}{r_2} - \frac{GMm}{r_1} \]

As the system is closed, if the velocity of the object is $v_{\text{object}}$ and the radius of the
planet is $R$, then by the principle of conservation of energy (2.41)

\[ \frac{m v_{\text{object}}^2}{2} - \frac{GMm}{R} = \text{constant} \tag{2.42} \]

From (2.42), if

\[ \frac{m v_{\text{object}}^2}{2} - \frac{GMm}{R} < 0 \]

then the object will not be able to escape the gravity field of the planet. However if

\[ \frac{m v_{\text{object}}^2}{2} - \frac{GMm}{R} = 0 \tag{2.43} \]

then the object will just be able to escape the gravity field, and this velocity $v_{\text{object}} = v_0$
is the escape velocity that we seek. We solve (2.43) to get the escape velocity of the
planet to be

\[ v_0 = \sqrt{\frac{2GM}{R}} \tag{2.44} \]

Now for a satellite of mass $m$ to leave the ground and to orbit the earth (of mass
$M$ and radius $R$), its velocity on leaving the ground must be $v_{\text{initial}}$, where

\[ \frac{GMm}{R^2} = \frac{m v_{\text{initial}}^2}{R} \]

which means that

\[ v_{\text{initial}} = \sqrt{\frac{GMmR}{mR^2}} = \sqrt{\frac{GM}{R}} \]

But we know that

\[ \frac{GMm}{R^2} = mg \;\Rightarrow\; \frac{GM}{R} = Rg \]

and so

\[ v_{\text{initial}} = \sqrt{Rg} \]

and for the earth we know that $R = 6400\ \mathrm{km}$, $g = 9.81\ \mathrm{m\,s^{-2}}$, and so for any given
satellite to orbit the earth, its velocity leaving the earth must be

\[ v_{\text{initial}} = \sqrt{6400\times 10^3 \times 9.81} \approx 8\times 10^3\ \mathrm{m\,s^{-1}} = 8\ \mathrm{km\,s^{-1}} \]

Also, the escape velocity $v_0$ for the earth can be calculated from (2.44)

\[ v_0 = \sqrt{\frac{2GM}{R}} = \sqrt{2Rg} \approx \sqrt{2}\times 8\times 10^3\ \mathrm{m\,s^{-1}} \approx 11\ \mathrm{km\,s^{-1}} \]
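
The two estimates for the earth can be reproduced in a couple of lines. This is a minimal sketch using the rounded values $R = 6400\ \mathrm{km}$ and $g = 9.81\ \mathrm{m\,s^{-2}}$ quoted above; the variable names are illustrative.

```python
import math

R = 6400e3   # radius of the earth in metres (rounded value used in the text)
g = 9.81     # gravitational acceleration at the surface in m/s^2

v_initial = math.sqrt(R * g)        # speed needed to orbit just above the ground
v_escape = math.sqrt(2.0 * R * g)   # escape velocity, (2.44) with GM/R = R g

print(f"orbital speed: {v_initial / 1000:.1f} km/s")
print(f"escape speed:  {v_escape / 1000:.1f} km/s")
```

The ratio of the two speeds is $\sqrt{2}$ regardless of the planet, since both follow from the same combination $GM/R$.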

2.2.2 Example 2: Oscillation Period of a Pendulum

Consider a pendulum of length $L$ and mass $m$ initially held at an angle $\theta_0$ to the
vertical (see Figure 2.6). The period of oscillation of the pendulum when the angle
is small is given by

\[ T = 2\pi\sqrt{\frac{L}{g}} \]

Figure 2.6: Simple pendulum

Now the kinetic energy of the pendulum is

\[ E_k = \frac{m v^2}{2} \]

where

\[ v = L\dot{\theta} \]

Hence the kinetic energy at any point in time can be written as

\[ E_k = \frac{m\left(L\dot{\theta}\right)^2}{2} = \frac{mL^2\dot{\theta}^2}{2} \]

Also the potential energy at any point in time can be expressed as

\[ U = -mgL\cos\theta \]

As the pendulum is initially at rest, the initial kinetic energy $E_{k0}$ is zero, and all the
energy in the system is potential, $U_0 = -mgL\cos\theta_0$. By the principle of conservation
of energy, if the initial angle of the pendulum before release is $\theta_0$, then

\[ -mgL\cos\theta_0 + 0 = -mgL\cos\theta + \frac{mL^2\dot{\theta}^2}{2} \]

which gives

\[ \dot{\theta}^2 = 2\frac{g}{L}\left(\cos\theta - \cos\theta_0\right) \;\Rightarrow\; \dot{\theta} = \sqrt{2\frac{g}{L}\left(\cos\theta - \cos\theta_0\right)} \]

It follows that

\[ \int \frac{d\theta}{\sqrt{2\frac{g}{L}\left(\cos\theta - \cos\theta_0\right)}} = \int dt = t + C \]

The period of oscillation of the pendulum is therefore

\[ T = 2\int_{-\theta_0}^{\theta_0} \frac{d\theta}{\sqrt{2\frac{g}{L}\left(\cos\theta - \cos\theta_0\right)}} = 4\int_{0}^{\theta_0} \frac{d\theta}{\sqrt{2\frac{g}{L}\left(\cos\theta - \cos\theta_0\right)}} \]

Remark 2.2.1 By making an appropriate variable transformation, we can show that

\[ T = 4\sqrt{\frac{L}{g}}\,K\!\left(\sin\frac{\theta_0}{2}\right) \]

where $K$ is the elliptic integral

\[ K(s) = \int_0^{\pi/2} \frac{d\varphi}{\sqrt{1 - s^2\sin^2\varphi}} \]
2.2.3 Example 3: An Oscillating System of Masses


Consider a horizontally moving mass $m_2$ with velocity $v$ that collides head-on with a
stationary mass $m_1$ attached to an elastic spring (with spring constant $k$) on a moving
contact. See Figure 2.7 for a rough representation of the problem.

Figure 2.7: The spring constant is taken to be $k$

Upon impact, the two masses move together instantaneously with velocity $V$. Our problem is to
determine the maximum deformation of the spring after the masses collide. We will
use a very rough model for this system (although of course it can be improved... can
you suggest how?).

As the momentum must be conserved, then just after the point of collision

\[ m_2 v = (m_1 + m_2)V \tag{2.45} \]

and as the energy is also conserved, then

\[ \frac{m_2 v^2}{2} = \frac{(m_1 + m_2)}{2}V^2 + k\frac{(\Delta x)^2}{2} \tag{2.46} \]

where $\Delta x$ represents the deformation of the spring, and $k\dfrac{(\Delta x)^2}{2}$ is the potential energy
transferred to the spring after the collision. From (2.45)

\[ V = \frac{m_2}{m_1 + m_2}\,v \]

and this may be substituted into (2.46) to get

\[ k\frac{(\Delta x)^2}{2} = \frac{m_2 v^2}{2} - \frac{(m_1 + m_2)}{2}\,\frac{m_2^2 v^2}{(m_1 + m_2)^2} = \frac{m_2 v^2}{2}\left(1 - \frac{m_2}{m_1 + m_2}\right) \]

This can be further simplified as

\[ k\frac{(\Delta x)^2}{2} = \frac{m_2 v^2}{2}\left(\frac{m_1 + m_2 - m_2}{m_1 + m_2}\right) = \frac{m_1 m_2}{m_1 + m_2}\,\frac{v^2}{2} \]

We can then find the deformation of the spring $\Delta x$ accordingly.
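
Solving the last relation for the deformation gives $\Delta x = v\sqrt{m_1 m_2 / \big(k(m_1 + m_2)\big)}$. The following short sketch evaluates the rough model step by step, following (2.45) and (2.46) exactly as stated; the function name and the sample numbers are assumptions made for illustration.

```python
import math


def spring_deformation(m1, m2, v, k):
    """Return (V, dx) for the rough model of Example 3.

    V  : common velocity just after impact, from (2.45)
    dx : spring deformation, from the energy balance (2.46)
    """
    V = m2 * v / (m1 + m2)
    spring_energy = 0.5 * m2 * v * v - 0.5 * (m1 + m2) * V * V
    dx = math.sqrt(2.0 * spring_energy / k)
    return V, dx


# Sample values (assumed): masses in kg, speed in m/s, spring constant in N/m
m1, m2, v, k = 3.0, 1.0, 2.0, 100.0
V, dx = spring_deformation(m1, m2, v, k)
closed_form = v * math.sqrt(m1 * m2 / (k * (m1 + m2)))

print(f"V = {V:.3f} m/s, deformation = {dx:.4f} m, closed form = {closed_form:.4f} m")
```

The step-by-step value and the closed-form expression should agree, which is a convenient check on the algebra above.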

2.2.4 Example 4: Period of Oscillation


Consider a mass $m$ attached to a spring oscillating horizontally along a smooth surface
with a velocity $\vec{v}$ (see Figure 2.8).

Figure 2.8: Oscillating ball on a smooth horizontal surface

The potential energy of the mass attached to the spring is

\[ U = \frac{k x^2}{2} \]

where $x$ is the trajectory (in a given direction) of the mass and $k$ is the spring constant.
The total energy of the system is

\[ E = \frac{m\vec{v}^{\,2}}{2} + U = \frac{m\dot{x}^2}{2} + U \]

Therefore

\[ \dot{x} = \sqrt{\frac{2}{m}\left(E - U\right)} \]

and since

\[ \dot{x} = \frac{dx}{dt} \]

it follows that

\[ \int \frac{dx}{\sqrt{\frac{2}{m}\left(E - U\right)}} = \int dt = t + C \]

which is the general solution for all systems of this form for a given trajectory $x$, and
this provides an implicit form for the trajectory $x(t)$ of the particle that is oscillating.
We note that in order for oscillatory motion to occur, we need

\[ E - U(x) > 0 \]

For example, consider the potential well illustrated in Figure 2.9. Clearly $E - U(x) > 0$
only between $x_{\min}$ and $x_{\max}$. It follows that oscillation only occurs between $x_{\min}$ and
$x_{\max}$.
In general, we can show that the period of oscillation of a particle with potential
energy $U(x)$ is given by

\[ T = 2\int_{x_{\min}}^{x_{\max}} \frac{dx}{\sqrt{\frac{2}{m}\left(E - U(x)\right)}} \tag{2.47} \]
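
Formula (2.47) can be evaluated numerically for any potential well once the turning points are known. The sketch below is an illustration, not the method of the notes: the substitution $x = c + A\sin\varphi$ (with $c$ the midpoint and $A$ the half-width of the interval) is an assumed device for taming the integrable singularities at the turning points, and the midpoint rule, the function name and the sample values are likewise assumptions. For the spring potential $U = kx^2/2$ the result should reproduce the familiar $T = 2\pi\sqrt{m/k}$.

```python
import math


def oscillation_period(U, E, m, x_min, x_max, n=2000):
    """Approximate (2.47) for a particle of mass m and energy E in the well U.

    x_min and x_max are the turning points, where E - U(x) = 0.
    With x = c + A sin(phi), dx = A cos(phi) dphi and phi runs from
    -pi/2 to pi/2; midpoints never touch the turning points themselves.
    """
    c = 0.5 * (x_min + x_max)
    A = 0.5 * (x_max - x_min)
    dphi = math.pi / n
    total = 0.0
    for i in range(n):
        phi = -0.5 * math.pi + (i + 0.5) * dphi
        x = c + A * math.sin(phi)
        speed = math.sqrt(2.0 / m * (E - U(x)))
        total += A * math.cos(phi) / speed * dphi
    return 2.0 * total


# Harmonic example (assumed values): m = 2, k = 8, E = 1, so x_max = sqrt(2E/k)
m, k, E = 2.0, 8.0, 1.0
x_max = math.sqrt(2.0 * E / k)
T = oscillation_period(lambda x: 0.5 * k * x * x, E, m, -x_max, x_max)

print(f"numerical period = {T:.6f}, 2*pi*sqrt(m/k) = {2.0 * math.pi * math.sqrt(m / k):.6f}")
```

The same function can be pointed at other potentials by supplying a different `U` together with its turning points.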

Exercise
Using the formula (2.47), find the oscillation period of a particle with the following
potentials as a function of its energy
1. $U(x) = -\dfrac{V_0}{\cosh^2(\alpha x)}$
2. $U(x) = U_0\tan^2(\alpha x)$

Figure 2.9: Potential well (dotted curve U) against total energy in system (constant
line E). Oscillation can only occur between points $x_{\min}$ and $x_{\max}$

2.2.5 Example 5: Collision of Two Bodies


Consider the head-on collision of a pool ball of mass $m$ and velocity $\vec{v}$ with two stationary pool balls that both have mass $m$, as shown in Figure 2.10. Given that the two pool balls move away from each other with velocities $\vec{v}_1$ and $\vec{v}_2$ respectively, and that the original ball stops moving after impact, by the principle of conservation of linear momentum
\[ m\vec{v} = m\,[\vec{v}_2 + \vec{v}_1] \]
This means that
\[ \vec{v} = \vec{v}_2 + \vec{v}_1 \]
By the principle of conservation of energy, the total energy before the collision $E$ is equal to the sum of the energies of the two moving balls after impact
\[ E = E_1 + E_2 \]
which is
\[ \frac{1}{2} m \vec{v}^{\,2} = \frac{1}{2} m \vec{v}_1^{\,2} + \frac{1}{2} m \vec{v}_2^{\,2} \]
This means that
\[ \vec{v}^{\,2} = \vec{v}_1^{\,2} + \vec{v}_2^{\,2} \]
and by the Pythagorean theorem, this implies that the balls move away at $90^\circ$ to each other. (Equivalently, squaring the momentum relation gives $\vec{v}^{\,2} = \vec{v}_1^{\,2} + \vec{v}_2^{\,2} + 2\,\vec{v}_1\cdot\vec{v}_2$, so $\vec{v}_1\cdot\vec{v}_2 = 0$.)

Figure 2.10: Pool balls of equal mass $m$ before impact: two of the balls are stationary and one is moving. After impact the moving ball stops.
Now the total potential energy $U$ depends only on the distance between the two masses (as seen in Figure 2.11), so $U = U(r)$, and
\[ U(\vec{r}) = U(\vec{r}_1 - \vec{r}_2) \]
where
\[ \vec{r} = \vec{r}_1 - \vec{r}_2 \tag{2.48} \]

Figure 2.11: The potential depends only on the distance between the two masses

Introducing a new coordinate system with origin at the centre of mass
\[ \vec{R} = \frac{m_1}{m_1 + m_2}\,\vec{r}_1 + \frac{m_2}{m_1 + m_2}\,\vec{r}_2 \]
let us take
\[ \tilde{\vec{r}}_1 = \vec{r}_1 - \vec{R} = \vec{r}_1 - \frac{m_1}{m_1 + m_2}\,\vec{r}_1 - \frac{m_2}{m_1 + m_2}\,\vec{r}_2 = \frac{m_2}{m_1 + m_2}\,(\vec{r}_1 - \vec{r}_2) \]
Hence using (2.48) we have
\[ \tilde{\vec{r}}_1 = \frac{m_2}{m_1 + m_2}\,\vec{r} \tag{2.49} \]
In a similar fashion, we take
\[ \tilde{\vec{r}}_2 = \vec{r}_2 - \vec{R} = \vec{r}_2 - \frac{m_1}{m_1 + m_2}\,\vec{r}_1 - \frac{m_2}{m_1 + m_2}\,\vec{r}_2 = -\frac{m_1}{m_1 + m_2}\,(\vec{r}_1 - \vec{r}_2) \]
and using (2.48) we have
\[ \tilde{\vec{r}}_2 = -\frac{m_1}{m_1 + m_2}\,\vec{r} \tag{2.50} \]
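Now if we take
\[ \tilde{\vec{v}}_1 = \dot{\tilde{\vec{r}}}_1 \]
and
\[ \tilde{\vec{v}}_2 = \dot{\tilde{\vec{r}}}_2 \]
then we have the total kinetic energy in the system to be
\[ T = \frac{m_1 \tilde{\vec{v}}_1^{\,2}}{2} + \frac{m_2 \tilde{\vec{v}}_2^{\,2}}{2} \]
which, using (2.49) and (2.50), may be written as
\[ T = \frac{m_1}{2}\left(\frac{m_2}{m_1 + m_2}\,\dot{\vec{r}}\right)^2 + \frac{m_2}{2}\left(\frac{m_1}{m_1 + m_2}\,\dot{\vec{r}}\right)^2 \]
This can be simplified to
\[ T = \frac{1}{2}\,\frac{m_1 m_2^2 + m_2 m_1^2}{(m_1 + m_2)^2}\,\dot{\vec{r}}^{\,2} = \frac{1}{2}\,\frac{m_1 m_2\,(m_1 + m_2)}{(m_1 + m_2)^2}\,\dot{\vec{r}}^{\,2} \]
therefore
\[ T = \frac{1}{2}\,\frac{m_1 m_2}{m_1 + m_2}\,\dot{\vec{r}}^{\,2} \]
Now the usual notation is $r = |\vec{r}|$, so $(\vec{r})^2 = r^2$ and $(\dot{\vec{r}})^2 = \dot{r}^2$, giving
\[ T = \frac{1}{2}\,\frac{m_1 m_2}{m_1 + m_2}\,\dot{r}^2 = \frac{1}{2} M \dot{r}^2 \]
where
\[ M = \frac{m_1 m_2}{m_1 + m_2} = \text{reduced mass} \]
This two-body problem therefore has total energy that may be expressed as
\[ E = \frac{1}{2} M \dot{r}^2 + U(\vec{r}) \]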

An Application to Planetary Motion

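Now recall that for planar motion
\[ \vec{r} \times \vec{p} = \vec{M} = \text{constant} \]
where $\vec{p} = m\,\dfrac{d\vec{r}}{dt} = m\,\dot{\vec{r}}$ is the momentum, and $M = |\vec{M}|$ now denotes the angular momentum (not the reduced mass of the previous example). This is so as
\[ \frac{d}{dt}\,(\vec{r} \times \vec{p}) = \frac{d\vec{r}}{dt} \times \vec{p} + \vec{r} \times \frac{d\vec{p}}{dt} = 0 \]
since $\vec{p} = m\,\dfrac{d\vec{r}}{dt}$ and therefore $\dfrac{d\vec{r}}{dt} \times \vec{p} = 0$; and since $\dfrac{d\vec{p}}{dt}$ is collinear with $\vec{r}$, also $\vec{r} \times \dfrac{d\vec{p}}{dt} = 0$. Now the angular component of velocity is $r\dot{\varphi}$, and the angular component of momentum for mass $m$ is therefore $m r \dot{\varphi}$, and therefore
\[ M = m r^2 \dot{\varphi} \;\Rightarrow\; \dot{\varphi} = \frac{M}{m r^2} \tag{2.51} \]
and it follows that the total energy is
\[ E = \frac{1}{2} m \left[\dot{r}^2 + r^2 \dot{\varphi}^2\right] + U(r) = \frac{1}{2} m \left[\dot{r}^2 + r^2 \left(\frac{M}{m r^2}\right)^2\right] + U(r) \]
and so
\[ E = \frac{1}{2} m \dot{r}^2 + \frac{M^2}{2 m r^2} + U(r) = \frac{1}{2} m \dot{r}^2 + U_{\mathrm{eff}}(r) \tag{2.52} \]
where $U_{\mathrm{eff}}(r)$ is the effective potential energy defined as
\[ U_{\mathrm{eff}}(r) = \frac{M^2}{2 m r^2} + U(r) \]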
Recall that $E = \text{constant}$ by the principle of conservation of energy. Now
\[ E = \frac{1}{2} m \dot{r}^2 + U_{\mathrm{eff}} \;\Rightarrow\; \dot{r} = \sqrt{\frac{2}{m}\,(E - U_{\mathrm{eff}})} \tag{2.53} \]
which means that
\[ \int \frac{dr}{\sqrt{\frac{2}{m}\,(E - U_{\mathrm{eff}})}} = \int dt = t + \text{constant} \]
The solution of the above integral provides an implicit form of the trajectory $r(t)$. From (2.51) we also have
\[ \varphi = \int \frac{M}{m r^2}\,dt \]

Figure 2.12: When $E < 0$, motion is possible as there are two intersection points between the effective potential energy and the total energy.
Now from (2.52), if we have that
\[ U(r) = -\frac{G M_s m}{r} \]
where $M_s$ denotes the mass of the attracting body (written with a subscript here to distinguish it from the angular momentum $M$), then
\[ U_{\mathrm{eff}} = -\frac{G M_s m}{r} + \frac{M^2}{2 m r^2} \]
and we see that $U_{\mathrm{eff}}$ can be negative or positive. A sketch of such a graph for $U_{\mathrm{eff}}$ is shown in Figure 2.12. When $E < 0$, the motion is bounded, as there is both an $r_{\min}$ and an $r_{\max}$, so the object of mass $m$ is "trapped inside" by the gravitational pull. However, when $E = 0$, the object of mass $m$ has attained the escape velocity (see Figure 2.13), as there is an intersection point for $r_{\min}$ but no possible intersection point for $r_{\max}$.

Figure 2.13: The $U_{\mathrm{eff}}$ curve does not intersect $E$ a second time, therefore there is no $r_{\max}$ and a closed orbit is not possible.
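The turning points $r_{\min}$ and $r_{\max}$ of a bound orbit can be found by setting $E = U_{\mathrm{eff}}(r)$, which for the gravitational potential reduces to a quadratic in $r$. The sketch below illustrates this under the assumption that the product of $G$, the central mass and $m$ is lumped into a single constant `k`; the function names and the dimensionless values are illustrative only.

```python
import math

def turning_points(E, k, M, m):
    """Roots of E = -k/r + M^2/(2 m r^2), i.e. E r^2 + k r - M^2/(2m) = 0.

    k stands for the product G * (central mass) * m. Returns (r_min, r_max)
    for a bound orbit (E < 0), or None if there are not two turning points.
    """
    disc = k * k + 2.0 * E * M * M / m
    if E >= 0 or disc < 0:
        return None
    root = math.sqrt(disc)
    r1 = (-k + root) / (2.0 * E)
    r2 = (-k - root) / (2.0 * E)
    return min(r1, r2), max(r1, r2)

def u_eff(r, k, M, m):
    """Effective potential -k/r + M^2/(2 m r^2)."""
    return -k / r + M * M / (2.0 * m * r * r)

# Illustrative, dimensionless values (assumptions, not data from the notes).
E, k, M, m = -0.3, 1.0, 1.0, 1.0
r_min, r_max = turning_points(E, k, M, m)
print(r_min, r_max, u_eff(r_min, k, M, m), u_eff(r_max, k, M, m))
```

Both printed values of the effective potential should equal $E$, confirming that the radial velocity (2.53) vanishes at the two turning points.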
Chapter 3

Harmonic Oscillators

3.1 Introduction
Consider the case of 1-D motion, with the total energy in the system given by
\[ E = \frac{m}{2}\,\dot{x}^2 + U(x) \]
where the potential is at a minimum at the point $x = x_0$, as depicted in Figure 3.1.

Figure 3.1: Potential has a minimum at $x_0$

Taking a Taylor expansion of $U(x)$ about the minimum point $x = x_0$, we have (bearing in mind that $U'(x_0) = 0$ at the minimum point)
\[ U(x) = U(x_0) + \frac{1}{2}\,U''(x_0)\,(x - x_0)^2 + \dots \]
Let us refer to $U''(x_0)$ as $k$, so that we have the estimated potential energy as
\[ U(x) \approx U(x_0) + \frac{1}{2}\,k\,(x - x_0)^2 \]
and therefore
\[ E = \frac{m}{2}\,\dot{x}^2 + U(x_0) + \frac{1}{2}\,k\,(x - x_0)^2 \]
Now if we take $q = x - x_0$, then
\[ E = m\,\frac{\dot{q}^2}{2} + \frac{k}{2}\,q^2 + \text{constant} \]
Differentiating this with respect to $t$ we get
\[ m \dot{q}\,\ddot{q} + k q\,\dot{q} = 0 \]
Therefore
\[ \dot{q}\,[m\ddot{q} + kq] = 0 \;\Rightarrow\; m\ddot{q} + kq = 0 \]
so if we take $(\omega_0)^2 = \dfrac{k}{m}$, then
\[ \ddot{q} + (\omega_0)^2 q = 0 \]
The solution to this equation is
\[ q(t) = A\cos(\omega_0 t) + B\sin(\omega_0 t) \tag{3.1} \]
Therefore, if we have a potential energy with a minimum (or a near minimum), then we can describe the motion close to that point as a harmonic oscillator. Note that we can do some manipulations to (3.1) to get (for $A > 0$)
\[ q(t) = \sqrt{A^2 + B^2}\,\cos\left[\omega_0 t + \tan^{-1}(-B/A)\right] \]
which is in the form
\[ q(t) = a\cos[\omega_0 t + \alpha] \]
where $a$ is the amplitude of the oscillations and $\alpha$ is the initial phase. A simple plot of the oscillating solution is shown in Figure 3.2.
Suppose now that we wish to solve
\[ \ddot{q} + (\omega_0)^2 q = F(t) \]
where the external force $F(t)$ is given as
\[ F(t) = \frac{f}{m}\cos(\omega t) \]
for some given frequency $\omega$. Using the method of undetermined coefficients, we can show that
\[ q(t) = A\cos(\omega_0 t) + B\sin(\omega_0 t) + \frac{f/m}{\omega_0^2 - \omega^2}\cos(\omega t) \]

Figure 3.2: Oscillations $q(t)$ are between $-a$ and $a$

Clearly this solution is only valid when $\omega \neq \omega_0$. The closer $\omega$ is to $\omega_0$, the larger the amplitude of the forced oscillation, which grows without bound as $\omega \to \omega_0$.
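When $\omega = \omega_0$, we have
\[ \ddot{q} + (\omega_0)^2 q = \frac{f}{m}\cos(\omega_0 t) \tag{3.2} \]
The homogeneous solution is of the form
\[ q_h = C_1\cos(\omega_0 t) + C_2\sin(\omega_0 t) \]
The form of the particular solution is
\[ q_p = A\,[t\cos(\omega_0 t)] + B\,[t\sin(\omega_0 t)] \]
Differentiating this once, we have
\[ \dot{q}_p = A\,[\cos(\omega_0 t) - t\omega_0\sin(\omega_0 t)] + B\,[\sin(\omega_0 t) + t\omega_0\cos(\omega_0 t)] \]
which simplifies to
\[ \dot{q}_p = \cos(\omega_0 t)\,[A + t\omega_0 B] + \sin(\omega_0 t)\,[B - t\omega_0 A] \tag{3.3} \]
Differentiating again
\[ \ddot{q}_p = \cos(\omega_0 t)\,[\omega_0 B] - [A + t\omega_0 B]\,\omega_0\sin(\omega_0 t) - A\omega_0\sin(\omega_0 t) + \omega_0\,[B - tA\omega_0]\cos(\omega_0 t) \]
which is
\[ \ddot{q}_p = \cos(\omega_0 t)\left[2\omega_0 B - \omega_0^2 t A\right] + \sin(\omega_0 t)\left[-2\omega_0 A - \omega_0^2 t B\right] \tag{3.4} \]
Substituting $q_p$ and (3.4) into (3.2), the terms proportional to $t$ cancel, leaving $2\omega_0 B\cos(\omega_0 t) - 2\omega_0 A\sin(\omega_0 t) = \frac{f}{m}\cos(\omega_0 t)$, and so
\[ A = 0, \quad B = \frac{f}{2m\omega_0} \]
so the solution is
\[ q = C_1\cos(\omega_0 t) + C_2\sin(\omega_0 t) + \frac{f}{2m\omega_0}\,t\sin(\omega_0 t) \]
This clearly grows with time (resonance effect), as shown in Figure 3.3.

Figure 3.3: Resonance effect (solution $q$ grows with time)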
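The resonant particular solution can be checked numerically. The following sketch, which is only an illustration and not part of the original derivation, inserts $q_p = \frac{f}{2m\omega_0}\,t\sin(\omega_0 t)$ into (3.2) using a central finite difference for the second derivative. The function names, the step size `h` and the parameter values are assumptions chosen for the demonstration.

```python
import math

def q_p(t, f, m, w0):
    """Resonant particular solution q_p = f/(2 m w0) * t * sin(w0 t)."""
    return f / (2.0 * m * w0) * t * math.sin(w0 * t)

def residual(t, f, m, w0, h=1e-4):
    """q'' + w0^2 q - (f/m) cos(w0 t), with q'' from a central difference."""
    q_dd = (q_p(t + h, f, m, w0) - 2.0 * q_p(t, f, m, w0)
            + q_p(t - h, f, m, w0)) / h**2
    return q_dd + w0**2 * q_p(t, f, m, w0) - f / m * math.cos(w0 * t)

# Illustrative parameter values (assumptions).
f, m, w0 = 1.5, 2.0, 3.0
worst = max(abs(residual(0.1 * j, f, m, w0)) for j in range(1, 200))
print(worst)
```

The residual should be small (limited only by the finite-difference error), while the envelope of $q_p$ itself grows linearly in $t$.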

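Another way to analyze this sort of problem is by expressing the trigonometric terms as complex exponentials, as illustrated below
\[ a\cos(\omega t + \alpha) = \mathrm{Re}\,\{a\exp(i\alpha)\exp(i\omega t)\} = \mathrm{Re}\,\{A\exp(i\omega t)\} \]
where $A = a\exp(i\alpha)$ is referred to as the complex amplitude. For example, we can solve the problem
\[ \ddot{x} + (\omega_0)^2 x = \frac{f}{m}\cos(\omega t) \]
by using the fact that
\[ \frac{f}{m}\cos(\omega t) = \mathrm{Re}\left[\frac{f}{m}\exp(i\omega t)\right] \]
We can therefore solve everything in terms of exponentials, and then take the real part at the end to get the solution. Let us illustrate this by considering the related problem
\[ \ddot{x} + (\omega_0)^2 x = \frac{f}{m}\exp(i\omega t) \]
Substituting $x = A\exp(i\omega t)$ we get
\[ -\omega^2 A\exp(i\omega t) + (\omega_0)^2 A\exp(i\omega t) = \frac{f}{m}\exp(i\omega t) \]
which leads to
\[ A\left[(\omega_0)^2 - \omega^2\right] = \frac{f}{m} \;\Rightarrow\; A = \frac{f}{m\left[(\omega_0)^2 - \omega^2\right]} \]
so
\[ x = \frac{f}{m\left[(\omega_0)^2 - \omega^2\right]}\exp(i\omega t) \]
Now resonance occurs when $\omega \to \omega_0$, so let us take
\[ \omega = \omega_0 + \varepsilon \]
for small $\varepsilon$. Then
\[ x = \frac{f}{m\left[(\omega_0)^2 - (\omega_0 + \varepsilon)^2\right]}\exp(i\omega_0 t)\exp(i\varepsilon t) = \frac{f\exp(i\varepsilon t)}{m\,(-2\omega_0\varepsilon - \varepsilon^2)}\exp(i\omega_0 t) \]
As $\varepsilon \to 0$, $\varepsilon^2$ is very small, so we can approximate this as
\[ x \simeq -\frac{f\exp(i\varepsilon t)}{2m\omega_0\varepsilon}\exp(i\omega_0 t) = A\exp(i\omega_0 t) \]
where the complex amplitude is
\[ A = -\frac{f\exp(i\varepsilon t)}{2m\omega_0\varepsilon} \]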

3.2 Worked Example


Carbon monoxide gas is composed of molecules of two common isotopic forms, $\mathrm{C^{12}O^{16}}$ and $\mathrm{C^{14}O^{18}}$. Each of these molecules has internal oscillations between its atoms. One way to determine the actual chemical composition of a sample of gas is to find the resonance frequency of the molecules comprising the gas. This is done by taking note of the wavelengths that are absorbed most by the gas, which occurs when the molecules are in resonance. If you can find the ratio of the natural frequencies of the molecules comprising the gas, you can get an idea of the actual composition of molecules contained in the gas. Find the ratio $\dfrac{\omega_1}{\omega_2}$ of the natural oscillation frequencies of the molecules $\mathrm{C^{12}O^{16}}$ and $\mathrm{C^{14}O^{18}}$.
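SOLUTION: We have previously reduced the two-body oscillating problem to
\[ E = m\,\frac{\dot{x}^2}{2} + U(x) \]
where
\[ U(x) = U(x_0) + \frac{k}{2}\,(x - x_0)^2 \]
and $m$ is the reduced mass
\[ m = \frac{m_1 m_2}{m_1 + m_2} \]
Let us take
\[ q = x - x_0 \]
so we can write
\[ E = m\,\frac{\dot{q}^2}{2} + \frac{k}{2}\,q^2 + \text{constant} \]
Differentiating, we have
\[ m\,\frac{2\dot{q}\,\ddot{q}}{2} + k q\,\dot{q} = 0 \]
which can be written as
\[ \dot{q}\,\left[m\ddot{q} + kq\right] = 0 \]
Therefore we have
\[ m\ddot{q} + kq = 0 \;\Rightarrow\; \ddot{q} + \frac{k}{m}\,q = 0 \]
This is the familiar equation
\[ \ddot{q} + (\omega_0)^2 q = 0 \]
where the natural frequency is
\[ \omega_0 = \sqrt{k/m} \]
Now the relative (reduced) mass of $\mathrm{C^{12}O^{16}}$ is
\[ m_{\mathrm{composite}\,1} = \frac{12 \times 16}{12 + 16} = \frac{192}{28} \]
and the relative (reduced) mass of $\mathrm{C^{14}O^{18}}$ is
\[ m_{\mathrm{composite}\,2} = \frac{14 \times 18}{14 + 18} = \frac{252}{32} \]
Assuming the same constant $k$ for both molecules, the ratio of oscillation frequencies for these two molecules is therefore
\[ \frac{\omega_1}{\omega_2} = \sqrt{\frac{k / m_{\mathrm{composite}\,1}}{k / m_{\mathrm{composite}\,2}}} = \sqrt{\frac{m_{\mathrm{composite}\,2}}{m_{\mathrm{composite}\,1}}} = \sqrt{\frac{252}{32} \times \frac{28}{192}} = \sqrt{1.148} \simeq 1.072 \]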
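The arithmetic in this worked example is easy to reproduce. The short sketch below computes the reduced masses and the frequency ratio, assuming (as in the solution) that both molecules share the same constant $k$; the helper names `reduced_mass` and `frequency_ratio` are illustrative.

```python
import math

def reduced_mass(m1, m2):
    """Reduced mass m1*m2/(m1+m2) of a two-body system."""
    return m1 * m2 / (m1 + m2)

def frequency_ratio(pair_1, pair_2):
    """omega_1 / omega_2 = sqrt(mu_2 / mu_1), assuming the same stiffness k."""
    return math.sqrt(reduced_mass(*pair_2) / reduced_mass(*pair_1))

# Mass numbers of the two molecules considered in the worked example.
ratio = frequency_ratio((12, 16), (14, 18))
print(round(ratio, 3))
```

Note that the ratio of reduced masses alone is about 1.148; the frequency ratio is the square root of that quantity.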

3.3 Case with Friction

In any real system, there is always friction, so for oscillating systems, the types of equations we will solve are of the form
\[ m\ddot{x} + kx = -\alpha\dot{x} \]
where $\alpha$ is the friction coefficient. Dividing through by $m$, we get
\[ \ddot{x} + \frac{\alpha}{m}\,\dot{x} + \frac{k}{m}\,x = 0 \]
which can be written as
\[ \ddot{x} + 2\lambda\dot{x} + (\omega_0)^2 x = 0 \]
where $(\omega_0)^2 = \dfrac{k}{m}$ and $2\lambda = \dfrac{\alpha}{m}$. If $x = \exp(\mu t)$, we get the characteristic equation
\[ \mu^2 + 2\lambda\mu + (\omega_0)^2 = 0 \]
which gives the two roots
\[ \mu_{1,2} = -\lambda \pm \sqrt{\lambda^2 - (\omega_0)^2} \]
In the case where $\lambda^2 < (\omega_0)^2$, we have
\[ \mu_{1,2} = -\lambda \pm i\omega \]
where $\omega = \sqrt{(\omega_0)^2 - \lambda^2}$, and we get solutions
\[ x = \exp(-\lambda t)\,[\cos\omega t \pm i\sin\omega t] = \exp(-\lambda t)\exp(\pm i\omega t) \]
Recall that for the solution of the original (real) problem, we need to take the real part of this expression, which is
\[ \mathrm{Re}\,(\exp(-\lambda t)\exp(i\omega t)) = \exp(-\lambda t)\cos\omega t \]
It follows that the solution will oscillate, but it is also damped over time (see Figure 3.4). This is referred to as the underdamped case.
In the case where $\lambda^2 = (\omega_0)^2$, we have identical roots
\[ \mu_{1,2} = -\lambda \]
so our solution behaves like
\[ x = t\exp(-\lambda t) \]
where the energy is dissipated by the system as time increases (see Figure 3.5).
In the case where $\lambda^2 > (\omega_0)^2$, we have
\[ \mu_{1,2} = -\lambda \pm \sqrt{\lambda^2 - (\omega_0)^2} \]
which gives two negative real roots. This is referred to as the overdamped case: the system does not oscillate at all, as the friction is too large (see Figure 3.6).
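The three damping regimes can be distinguished directly from the sign of $\lambda^2 - (\omega_0)^2$. The sketch below is a minimal illustration of this classification; the function name `damping_case` and the sample values are assumptions, and the exact comparison with zero for the critical case is only meaningful for inputs that are exactly representable.

```python
import cmath

def damping_case(lam, w0):
    """Classify x'' + 2*lam*x' + w0^2 x = 0 and return (label, root1, root2)."""
    disc = lam * lam - w0 * w0
    root = cmath.sqrt(disc)
    r1, r2 = -lam + root, -lam - root
    if disc < 0:
        label = "underdamped"
    elif disc == 0:
        label = "critically damped"
    else:
        label = "overdamped"
    return label, r1, r2

# Illustrative damping values (assumptions), with w0 = 2.
for lam in (0.5, 2.0, 3.0):
    print(lam, damping_case(lam, 2.0))
```

In every case the real parts of both roots are negative, which is why all three types of solution decay with time.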
Figure 3.4: Sketch of underdamped case

Figure 3.5: Energy is dissipated by the system over time

Figure 3.6: Sketch of overdamped case


3.4 Case With Friction and External Periodic Force


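Let us now consider the case where there is an external periodic force acting as well as friction. The form of the equation for such an oscillator is
\[ \ddot{x} + 2\lambda\dot{x} + (\omega_0)^2 x = \frac{f}{m}\cos(\omega t) \tag{3.5} \]
As we saw before in the case with just friction, the homogeneous solution dies out with time, so we are more interested in the particular solution to the equation. Let us substitute $x = A\exp(i\omega t)$ (where $A$ is the complex amplitude) into equation (3.5), with the right-hand side replaced by $\frac{f}{m}\exp(i\omega t)$ (later we can use the fact that $\cos(\omega t) = \mathrm{Re}\,[\exp(i\omega t)]$). We have
\[ -\omega^2 A\exp(i\omega t) + 2\lambda\omega i A\exp(i\omega t) + (\omega_0)^2 A\exp(i\omega t) = \frac{f}{m}\exp(i\omega t) \]
Removing the exponentials, we have
\[ -\omega^2 A + 2\lambda\omega i A + (\omega_0)^2 A = \frac{f}{m} \]
which leads to
\[ A = \frac{f}{m\left[(\omega_0)^2 - \omega^2 + 2\lambda\omega i\right]} \]
This is
\[ A = \frac{\frac{f}{m}\left[(\omega_0)^2 - \omega^2 - 2\lambda\omega i\right]}{\left[(\omega_0)^2 - \omega^2 + 2\lambda\omega i\right]\left[(\omega_0)^2 - \omega^2 - 2\lambda\omega i\right]} \]
so
\[ A = \frac{\frac{f}{m}\left[(\omega_0)^2 - \omega^2 - 2\lambda\omega i\right]}{\left[(\omega_0)^2 - \omega^2\right]^2 + 4\lambda^2\omega^2} \]
If we convert $A$ to polar form
\[ A = \frac{\frac{f}{m}\left[(\omega_0)^2 - \omega^2 - 2\lambda\omega i\right]}{\left[(\omega_0)^2 - \omega^2\right]^2 + 4\lambda^2\omega^2} = a\exp(i\delta) \]
where $\delta$ represents the delay (phase lag) of the response relative to the external force. The "phase" $\delta$ depends on the ratio of the imaginary and the real parts of $A$, namely $\tan\delta = -2\lambda\omega / \left[(\omega_0)^2 - \omega^2\right]$. To find the amplitude $a$, we do the following
\[ A A^* = a^2 = \frac{f}{m\left[(\omega_0)^2 - \omega^2 + 2\lambda\omega i\right]} \cdot \frac{f}{m\left[(\omega_0)^2 - \omega^2 - 2\lambda\omega i\right]} \]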

Therefore
\[ a^2 = \frac{f^2}{m^2\left\{\left[(\omega_0)^2 - \omega^2\right]^2 + 4\lambda^2\omega^2\right\}} \;\Rightarrow\; a = \frac{f/m}{\sqrt{\left[(\omega_0)^2 - \omega^2\right]^2 + 4\lambda^2\omega^2}} \]
A sketch of the amplitude is shown in Figure 3.7. We see from this sketch that for larger values of the external frequency $\omega$, the amplitude of the oscillations diminishes. This may be interpreted by saying that, owing to the inertia of the mass in the system, it cannot follow very fast oscillations. The size of the amplitude depends on the size of the friction-related term $\lambda$.

Figure 3.7: A sketch of the amplitude $a$ as $\varepsilon \to 0$

If we consider
\[ \omega = \omega_0 + \varepsilon \]
then as $\varepsilon \to 0$, $\omega \to \omega_0$, and
\[ a \to \frac{f/m}{\sqrt{4\lambda^2\omega_0^2}} = \frac{f}{2m\lambda\omega_0} \]
which corresponds to the peak we have indicated in Figure 3.7. Clearly at resonance, smaller values of $\lambda$ (corresponding to a smaller friction force) allow the maximum peak of the amplitude to be greater, increasing the risk of breakage for the oscillating system.
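The amplitude formula can be cross-checked against direct complex arithmetic. The sketch below, an illustration under assumed parameter values, computes the complex amplitude $A$ from the substitution $x = A\exp(i\omega t)$ and compares $|A|$ with the closed-form expression for $a$; the function names are hypothetical.

```python
import cmath
import math

def complex_amplitude(f, m, w0, lam, w):
    """A = (f/m) / (w0^2 - w^2 + 2i*lam*w), from substituting x = A exp(i w t)."""
    return (f / m) / complex(w0**2 - w**2, 2.0 * lam * w)

def amplitude(f, m, w0, lam, w):
    """a = (f/m) / sqrt((w0^2 - w^2)^2 + 4 lam^2 w^2)."""
    return (f / m) / math.sqrt((w0**2 - w**2) ** 2 + 4.0 * lam**2 * w**2)

# Illustrative values (assumptions): weak damping, natural frequency 2.
f, m, w0, lam = 1.0, 1.0, 2.0, 0.1
for w in (0.5, 1.9, 2.0, 2.1, 4.0):
    A = complex_amplitude(f, m, w0, lam, w)
    print(w, abs(A), amplitude(f, m, w0, lam, w), cmath.phase(A))
```

At $\omega = \omega_0$ the modulus reduces to $f/(2m\lambda\omega_0)$, and the phase of $A$ is negative throughout, consistent with the response lagging behind the force.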

3.5 Parametric Resonance


Let us consider a simple pendulum
\[ \ddot{x} + (\omega_0)^2 x = 0 \]
where the frequency of oscillation $\omega_0$ changes periodically with time
\[ \omega_0 = \omega_0(t) \]
If $\omega_0$ is a periodic function with period $T$, then
\[ \omega_0(T + t) = \omega_0(t) \]
It follows that if $X(t)$ is a solution, then $X(t + T)$ is also a solution. Suppose that the two independent solutions of the equation are $(x_1(t), x_2(t))$. If now we take $t \to t + T$, we expect to get a combination of the previous solutions, which (for a suitable choice of $x_1$ and $x_2$) takes the form
\[ (x_1(t), x_2(t)) \;\xrightarrow{\;t + T\;}\; (\mu_1 x_1(t), \mu_2 x_2(t)) \]
so as time proceeds, depending on the size of $\mu_1$ and $\mu_2$, at least one of the solutions may grow, causing the amplitude of the oscillations to grow. This phenomenon is called parametric resonance. (Note: asymptotic techniques can be utilized to investigate the solutions of such systems, but this is beyond the scope of this course.)
To illustrate this, consider the case of a pendulum which has a length that changes periodically with time
\[ \ddot{x} + (\omega_0)^2 (1 + \varepsilon\cos\omega t)\,x = 0 \tag{3.6} \]
with initial conditions
\[ x(0) = a, \quad \dot{x}(0) = 0 \]
where $\varepsilon \ll 1$. Using a regular perturbation expansion, we can seek a solution of the form
\[ x = x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \dots \tag{3.7} \]
Substituting (3.7) into (3.6), and setting the coefficients of all powers of $\varepsilon$ in the resulting equation to zero, we get for the order one problem (identify all the coefficients of $\varepsilon^0$)
\[ \ddot{x}_0 + (\omega_0)^2 x_0 = 0 \]
subject to the initial conditions
\[ x_0(0) = a, \quad \dot{x}_0(0) = 0 \]
This has the general solution
\[ x_0(t) = A\cos(\omega_0 t) + B\sin(\omega_0 t) \]
Now from the initial conditions we get $A = a$ and $B = 0$, so the solution for $x_0(t)$ is therefore
\[ x_0(t) = a\cos(\omega_0 t) \tag{3.8} \]
The order $\varepsilon$ problem (found by identifying all the coefficients of $\varepsilon^1$) is
\[ \ddot{x}_1 + (\omega_0)^2 x_1 = -x_0\,(\omega_0)^2\,[\cos(\omega t)] \tag{3.9} \]
Substituting (3.8) into (3.9), we get
\[ \ddot{x}_1 + (\omega_0)^2 x_1 = -(a\cos(\omega_0 t))\,(\omega_0)^2\,[\cos(\omega t)] = -\frac{(\omega_0)^2 a}{2}\,\{\cos[(\omega + \omega_0)t] + \cos[(\omega - \omega_0)t]\} \]
For resonance, one of the forcing terms on the right must oscillate at the natural frequency $\omega_0$, so we need to have
\[ \omega + \omega_0 = \omega_0 \;\Rightarrow\; \omega = 0 \]
or
\[ \omega - \omega_0 = \omega_0 \;\Rightarrow\; \omega = 2\omega_0 \]
The case $\omega = 0$ is not interesting as it means that the frequency does not change with time. The case $\omega = 2\omega_0$ means that resonance occurs when $\omega$ is twice the natural frequency $\omega_0$.
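The prediction $\omega = 2\omega_0$ can be explored by integrating (3.6) numerically. The sketch below uses a classical fourth-order Runge-Kutta scheme, which is a choice made here for illustration rather than a method from the notes; the function name `max_amplitude`, the step size and the values $\varepsilon = 0.2$, $\omega_0 = 1$ are all assumptions.

```python
import math

def max_amplitude(w, w0=1.0, eps=0.2, a=1.0, t_end=100.0, dt=0.01):
    """Integrate x'' + w0^2 (1 + eps cos(w t)) x = 0, x(0)=a, x'(0)=0 with RK4.

    Returns the largest |x| seen on [0, t_end].
    """
    def accel(t, x):
        return -w0**2 * (1.0 + eps * math.cos(w * t)) * x

    t, x, v, peak = 0.0, a, 0.0, abs(a)
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(t, x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(t + 0.5 * dt, x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(t + 0.5 * dt, x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(t + dt, x + dt * k3x)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        t += dt
        peak = max(peak, abs(x))
    return peak

print(max_amplitude(2.0))   # modulation at twice the natural frequency
print(max_amplitude(3.0))   # modulation away from 2*w0
```

With these settings the amplitude should grow substantially when $\omega = 2\omega_0$ and stay close to its initial value when $\omega = 3\omega_0$, in line with the perturbation argument above.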

3.6 Worked Example


Consider a pendulum made of a mass $m$ attached to an elastic spring, where the mass is constrained to move along the horizontal axis. The length of the spring is $L$ when it is in the vertical position (with the mass located at the origin $x = 0$). The mass $m$ is given a small push to the right so that it moves a small distance $x$ away from the origin along the horizontal axis, generating an elastic force of magnitude $F$ in the spring. Upon release, the mass will oscillate along the horizontal axis. Find an expression for the frequency of oscillation $\omega$ of this pendulum in terms of the elastic force $F$. (*Note: Ignore all friction forces at the contact surface.)
SOLUTION - Method 1: Consider the system depicted in Figure 3.8. Let us assume that the spring makes an angle $\theta$ to the horizontal when the mass $m$ is displaced by the distance $x$ from its equilibrium position at the origin, as shown in Figure 3.9.

Figure 3.8: Oscillating pendulum on an elastic spring of length $L$ in the vertical position

This generates an elastic force of magnitude $F$ in the spring. According to Hooke's law, this elastic force $\vec{F}$ is dependent on the spring constant $k$ and the length of extension of the spring $\Delta x$, i.e., by Hooke's law $|\vec{F}| = F = k\,\Delta x$. This elastic force acts in the direction of the stretched spring. By Newton's second law (Force = mass $\times$ acceleration), the elastic force in the spring will cause the mass to oscillate about the origin according to the equation
\[ -F\cos\theta = m\ddot{x} \;\Longrightarrow\; \ddot{x} + \frac{F}{m}\cos\theta = 0 \tag{3.10} \]
Note that we used $\cos\theta$ on the left hand side to determine the horizontal component of the elastic force $F$. Using trig rules (see Figure 3.10)
\[ \cos\theta = \frac{x}{\sqrt{x^2 + L^2}} \]

Figure 3.9: Reference diagram for pendulum

Figure 3.10: Finding $\cos(\theta)$ via trig rules

Now using Taylor expansions to find a suitable approximation as follows (since $x$ is small):
\[ \frac{x}{L}\left[1 + \left(\frac{x}{L}\right)^2\right]^{-1/2} = \frac{x}{L}\left[1 + O\!\left(x^2\right)\right] = \frac{x}{L} + O\!\left(x^3\right) \tag{3.11} \]
we can write (3.10) as
\[ \ddot{x} + \frac{F}{m}\left[\frac{x}{L} + O\!\left(x^3\right)\right] = 0 \]
Ignoring terms that are non-linear (*NOTE: This is an approximation that only holds when the distance $x$ is small), this simplifies to
\[ \ddot{x} + \frac{F}{mL}\,x = 0 \]
Comparing this with the general pendulum equation
\[ \ddot{x} + \omega^2 x = 0 \]
we see that the frequency of oscillation is
\[ \omega = \sqrt{\frac{F}{mL}} \tag{3.12} \]
SOLUTION - Method 2: This problem can also be solved by using the principle of conservation of energy. The kinetic energy at any point in time is
\[ T = \frac{m\dot{x}^2}{2} \tag{3.13} \]
Let us re-draw the system slightly differently to make this easier. For the spring which is of length $L$ at the equilibrium position, let us denote the change in spring length as the mass oscillates along the horizontal at any point in time by $\Delta L$ (see Figure 3.11).

Figure 3.11: Oscillating system (note the new notation)

Using trig rules
\[ L + \Delta L = \sqrt{L^2 + x^2} = L\sqrt{1 + \left(\frac{x}{L}\right)^2} \]
and we can approximate this using Taylor expansions as
\[ L + \Delta L \simeq L\left[1 + \frac{x^2}{2L^2}\right] = L + \frac{x^2}{2L} \]
hence we approximate
\[ \Delta L = \frac{x^2}{2L} \tag{3.14} \]
At any point in time, the change in potential energy is
\[ F\,(\Delta L) = F\,\frac{x^2}{2L} \tag{3.15} \]
Therefore, the total energy in the system is
\[ E = \frac{m\dot{x}^2}{2} + F\,\frac{x^2}{2L} \]
which gives
\[ \dot{x}^2 + \frac{F}{mL}\,x^2 = \frac{2E}{m} = \text{constant} \]
Taking a derivative of this with respect to time
\[ 2\dot{x}\,\ddot{x} + 2\,\frac{F}{mL}\,x\,\dot{x} = 0 \]
which gives
\[ 2\dot{x}\left[\ddot{x} + \frac{F}{mL}\,x\right] = 0 \]
Since we don't have $\dot{x} = 0$ always, this means that
\[ \ddot{x} + \frac{F}{mL}\,x = 0 \]
We can recognize this as the general equation for an oscillator
\[ \ddot{x} + \omega^2 x = 0 \]
and therefore the frequency of oscillation $\omega$ is given by
\[ \omega = \sqrt{\frac{F}{mL}} \]
which is identical to our previous result (from method 1).

3.7 Exercise
Use complex variables to find the amplitude and phase of the forced oscillations for an oscillator that is subject to friction and external periodic forcing, as described in the equation
\[ \ddot{x} + 2\lambda\dot{x} + (\omega_0)^2 x = \frac{f_0}{m}\,[\exp(\alpha t)]\,[\cos(\gamma t)] \]
Chapter 4

Calculus of Variations

4.1 Introduction
The main purpose of the calculus of variations is to find the path, curve, surface, etc., for which a given functional has a stationary value (which, in physical problems, is usually a minimum or maximum). This has numerous applications in economics, financial mathematics and engineering. The brachistochrone problem was one of the earliest problems posed in the calculus of variations. It was originally proposed by Johann Bernoulli in June 1696 as follows

Proposition 4.1.1 I, Johann Bernoulli, address the most brilliant mathematicians in the world. Nothing is more attractive to intelligent people than an honest, challenging problem, whose possible solution will bestow fame and remain as a lasting monument. Following the example set by Pascal, Fermat, etc., I hope to gain the gratitude of the whole scientific community by placing before the finest mathematicians of our time a problem which will test their methods and the strength of their intellect. If someone communicates to me the solution of the proposed problem, I shall publicly declare him worthy of praise.

The problem he posed was as follows. Given two points $a$ and $b$ in a vertical plane, what is the curve traced out by a point acted on only by gravity, which starts at $a$ and reaches $b$ in the shortest time? (Note that Pascal's most famous challenge concerned the cycloid, which Johann Bernoulli knew at this stage to be the solution to the brachistochrone problem, and his method of solving the problem used ideas due to Fermat.) Galileo in 1638 had previously studied the problem and had incorrectly deduced that the path of quickest descent from $a$ to $b$ would be the arc of a circle. Having already figured out how to solve the problem himself, Johann Bernoulli challenged others to find a solution. Upon hearing of the challenge, Newton (a known rival of the Bernoulli brothers) solved the problem in a single evening after returning home. After posting his solution, which was later published by the Royal Society of London, his famous retort was

Remark 4.1.1 I do not love to be dunned [pestered] and teased by foreigners about
mathematical things ...

There were five solutions published in total: by Newton, Leibniz and L'Hôpital in addition to the two Bernoulli brothers Jacob and Johann Bernoulli. The Bernoulli brothers continued to solve related problems and documented their results. Based on the ideas put forward by the two Bernoulli brothers, Euler later developed the basis for the well-known Euler-Lagrange differential equation for a functional that depends on the maximizing or minimizing function and its derivative. The idea is to find a function which maximizes or minimizes a certain quantity where the said function satisfies certain constraints.

4.2 Brachistochrone Problem


Brachistochrone is a Greek word (brachistos - the shortest, chronos - time) that describes the path of fastest descent between two points traversed by a point mass, given that the mass starts at the first point with zero speed and is constrained to move along this path to the second point, under the action of constant gravity and assuming no friction. See Figure 4.1 for a sketch of the problem under consideration.

Figure 4.1: Brachistochrone Problem

For brevity, we shall adopt the notation
\[ y(a) = y_a, \quad y(b) = y_b \]

We know that
\[ dt = \frac{ds}{v} \]
which leads to an expression for the time for the mass to travel from point $a$ to point $b$
\[ t = \int_0^L \frac{ds}{v} \]
where $L$ is the length of the curve. In order to solve the problem, we wish to find an expression for $v$ as a function of its position.
From the principle of conservation of energy, we have
\[ m g y_a = m g y + \frac{m v^2}{2} \]
where the potential energy is $mgy$ and the kinetic energy is $\dfrac{mv^2}{2}$. This leads to an expression for the velocity in terms of the position of the mass
\[ v = \sqrt{2g\,(y_a - y)} \]
Now the arc length in general is
\[ ds = \sqrt{(dx)^2 + (dy)^2} = dx\,\sqrt{1 + \left(\frac{dy}{dx}\right)^2} = dx\,\sqrt{1 + (y')^2} \]
and therefore
\[ t = \int_a^b \frac{\sqrt{1 + (y')^2}}{\sqrt{2g\,(y_a - y)}}\,dx \tag{4.1} \]
Let us now introduce some important notation. In the most basic sense of the word, a function is something that maps a number to another number. We refer to a functional as a function which maps a function to a number. For example
\[ I(y) = \int_a^b y(x)\,dx \]
is the area under the graph $y(x)$ from point $a$ to point $b$. This area is a number, and the functional here is $I(y)$. A linear functional is one that has the property of linearity. For example,
\[ \int_a^b \left[\alpha y_1(x) + \beta y_2(x)\right] dx = \alpha\int_a^b y_1(x)\,dx + \beta\int_a^b y_2(x)\,dx \]
i.e.
\[ I(\alpha y_1 + \beta y_2) = \alpha I(y_1) + \beta I(y_2) \]
so we can say that $I(y)$ is a linear functional. Another example of a linear functional is $\delta$, since
\[ \delta(y(x)) = y(0) \]
and
\[ \delta(\alpha y_1(x) + \beta y_2(x)) = \alpha y_1(0) + \beta y_2(0) = \alpha\,\delta(y_1(x)) + \beta\,\delta(y_2(x)) \]
Using this concept, we define the functional $L(y)$ for finding the length of the curve to be
\[ L(y) = \int_a^b \sqrt{1 + (y')^2}\,dx \]
We note that $L(y)$ is not a linear functional.
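The difference between a linear and a non-linear functional can be seen numerically. The sketch below approximates the area functional $I(y)$ and the length functional $L(y)$ with a midpoint rule on $[0, 1]$ and tests the linearity property on two sample functions. The helper names, the test functions $y_1 = x$, $y_2 = x^2$ and the coefficients 2 and 3 are illustrative assumptions.

```python
import math

def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def area(y, a=0.0, b=1.0):
    """I(y) = integral of y(x) dx: a linear functional."""
    return integrate(y, a, b)

def length(dy, a=0.0, b=1.0):
    """L(y) = integral of sqrt(1 + y'(x)^2) dx, given the derivative dy = y'."""
    return integrate(lambda x: math.sqrt(1.0 + dy(x) ** 2), a, b)

y1, dy1 = (lambda x: x), (lambda x: 1.0)
y2, dy2 = (lambda x: x * x), (lambda x: 2.0 * x)

# Linearity holds for the area functional ...
lhs_area = area(lambda x: 2.0 * y1(x) + 3.0 * y2(x))
rhs_area = 2.0 * area(y1) + 3.0 * area(y2)
# ... but fails for the length functional: L(y1 + y2) != L(y1) + L(y2).
lhs_len = length(lambda x: dy1(x) + dy2(x))
rhs_len = length(dy1) + length(dy2)
print(lhs_area, rhs_area, lhs_len, rhs_len)
```

The first pair of numbers should agree, while the second pair should differ noticeably.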


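Let us define a new functional
\[ I(y) = \int_a^b F(x, y, y')\,dx \]
We now wish to find the path $y(x)$ for which the functional $I(y)$ attains its maximum or minimum value. Consider
\[ f(x + \Delta x) \simeq f(x) + f'(x)\,(\Delta x) + \frac{f''(x)}{2}\,(\Delta x)^2 + \dots \]
At the maximum or minimum of $f(x)$, we know that $f'(x) = 0$, and so at that point
\[ f(x + \Delta x) \simeq f(x) + \frac{f''(x)}{2}\,(\Delta x)^2 + \dots \tag{4.2} \]
Consider now the functional $I$ evaluated on a slightly varied path $y + \delta y$
\[ I(y + \delta y) = \int_a^b F(x, y + \delta y, y' + \delta y')\,dx \]
Using Taylor expansions
\[ I(y + \delta y) = \int_a^b \left[F(x, y, y') + \frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y' + \dots\right] dx = I(y) + \int_a^b \left[\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y' + \dots\right] dx \]
We define
\[ \delta I = \int_a^b \left[\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y'\right] dx \]
as the first variation of the functional $I$.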

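In order to find the maximum or minimum of $I$, the first variation $\delta I$ must be zero for the same reason used to obtain (4.2). In other words, in order for the functional $I(y)$ to have a minimum or maximum value, we must have
\[ \delta I = 0 \]
Hence
\[ \int_a^b \left[\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y'\right] dx = 0 = \int_a^b \frac{\partial F}{\partial y}\,\delta y\,dx + \int_a^b \frac{\partial F}{\partial y'}\,\delta y'\,dx \tag{4.3} \]
Using integration by parts, we have that
\[ \int_a^b \frac{\partial F}{\partial y'}\,\delta y'\,dx = \left[\frac{\partial F}{\partial y'}\,\delta y\right]_a^b - \int_a^b \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\delta y\,dx \tag{4.4} \]
Since the end points are fixed and the variation $\delta y$ is zero at the end points, it follows that
\[ \left[\frac{\partial F}{\partial y'}\,\delta y\right]_a^b = 0 \tag{4.5} \]
Substituting (4.4) and (4.5) into (4.3), we get
\[ \int_a^b \left[\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y'\right] dx = 0 = \int_a^b \frac{\partial F}{\partial y}\,\delta y\,dx - \int_a^b \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\delta y\,dx \]
Hence
\[ \int_a^b \left[\frac{\partial F}{\partial y}\,\delta y - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\delta y\right] dx = 0 \]
which is
\[ \int_a^b \left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]\delta y\,dx = 0 \tag{4.6} \]
Since (4.6) must hold for every admissible variation $\delta y$, we get the well-known Euler-Lagrange equation
\[ \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0 \tag{4.7} \]
which is a condition on the function $y(x)$ for the functional $I(y)$ to be maximum or minimum. Using the notation
\[ \frac{\partial F}{\partial y} = F_y \]
and the chain rule, (4.7) can be expressed as
\[ F_y - (F_{y'x}) - (F_{yy'})\,y' - (F_{y'y'})\,y'' = 0 \tag{4.8} \]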

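In the specific case where $F = F(y, y')$ (i.e. no $x$ dependence) we can simplify this to
\[ F_y - (F_{yy'})\,y' - (F_{y'y'})\,y'' = 0 \tag{4.9} \]
Now if we multiply the Euler-Lagrange equation (4.7) by $y'$, we get
\[ y'\,\frac{\partial F}{\partial y} - y'\,\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0 \]
Since $\dfrac{dF}{dx} = F_y\,y' + F_{y'}\,y''$ when $F$ has no explicit $x$ dependence,
\[ y' F_y = \frac{dF}{dx} - F_{y'}\,y'' \]
we get
\[ \frac{dF}{dx} - F_{y'}\,y'' - y'\,\frac{d}{dx}\left(F_{y'}\right) = 0 \]
which can be written as
\[ \frac{dF}{dx} - \frac{d}{dx}\left[(F_{y'})\,y'\right] = 0 \]
This is
\[ \frac{d}{dx}\left[F - F_{y'}\,y'\right] = 0 \]
which means that
\[ F - F_{y'}\,y' = \text{constant} \tag{4.10} \]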
Now consider the functional $L(y)$ for the length of the curve from points $a$ to $b$
\[ L(y) = \int_a^b \sqrt{1 + (y')^2}\,dx \]
where
\[ F(y') = \sqrt{1 + (y')^2} \]
As there is no $y$ in this ($y'$ is considered a different entity)
\[ F_y = 0 \]
Also
\[ \frac{\partial F}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}} \]
and substituting these into (4.7), we get
\[ -\frac{d}{dx}\left[\frac{y'}{\sqrt{1 + (y')^2}}\right] = 0 \;\Rightarrow\; \frac{y'}{\sqrt{1 + (y')^2}} = \text{constant} \]
which means that $y'$ is a constant, i.e. the shortest curve between the two points is a straight line.


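Now returning to the brachistochrone problem, recall that (4.1)
\[ t = \int_a^b \frac{\sqrt{1 + (y')^2}}{\sqrt{2g\,(y_a - y)}}\,dx \]
Here
\[ F = \frac{\sqrt{1 + (y')^2}}{\sqrt{2g\,(y_a - y)}} \]
and
\[ \frac{\partial F}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}\,\sqrt{2g\,(y_a - y)}} \]
Now using the first integral (4.10) of the Euler-Lagrange equation, we have
\[ \frac{\sqrt{1 + (y')^2}}{\sqrt{2g\,(y_a - y)}} - y'\left[\frac{y'}{\sqrt{1 + (y')^2}\,\sqrt{2g\,(y_a - y)}}\right] = \text{constant} \tag{4.11} \]
Let us call the constant $A$; hence we have
\[ \frac{\sqrt{1 + (y')^2}}{\sqrt{2g\,(y_a - y)}} - \frac{(y')^2}{\sqrt{1 + (y')^2}\,\sqrt{2g\,(y_a - y)}} = A \]
which simplifies to
\[ \frac{1}{\sqrt{1 + (y')^2}\,\sqrt{2g\,(y_a - y)}} = A \tag{4.12} \]
Let us take the constants
\[ A_1 = \frac{1}{A} = \sqrt{1 + (y')^2}\,\sqrt{2g\,(y_a - y)} \]
\[ A_2 = A_1^2 = \left[1 + (y')^2\right] 2g\,(y_a - y) \]
together with the change of variable
\[ z = y_a - y \]
and
\[ A_3 = \frac{A_2}{2g} = \left[1 + (z')^2\right] z \]
Assuming that (with $t$ here denoting a parameter along the curve, not the travel time)
\[ z' = \cot t \]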

then
\[ 1 + (z')^2 = 1 + \cot^2 t = \mathrm{cosec}^2\,t = \frac{1}{\sin^2 t} \]
Hence, $A_3$ becomes
\[ A_3 = \frac{z}{\sin^2 t} \]
and
\[ z = A_3\sin^2 t = A_3\left[\frac{1 - \cos 2t}{2}\right] \]
Now since $z = y_a - y$, we have
\[ y_a - y = \frac{A_3}{2}\,(1 - \cos 2t) \]
which is
\[ y = y_a - \frac{A_3}{2}\,(1 - \cos 2t) \tag{4.13} \]
Differentiating this, we have
\[ dy = -A_3\sin 2t\,dt \tag{4.14} \]
Also, as $z' = \cot t$, and $z' = -y'$, then
\[ \frac{dy}{dx} = -\cot t \;\Rightarrow\; dx = -\frac{dy}{\cot t} = -dy\,(\tan t) = -dy\,\frac{\sin t}{\cos t} \]
Substituting for $dy$ into this, we obtain
\[ dx = -\frac{\sin t}{\cos t}\,(-A_3\sin 2t)\,dt = A_3\,\frac{\sin t}{\cos t}\,[2\cos t\sin t]\,dt = 2A_3\sin^2 t\,dt \]
It follows that
\[ dx = 2A_3\left[\frac{1 - \cos 2t}{2}\right] dt = A_3\,(1 - \cos 2t)\,dt \]
and integrating this,
\[ x = A_3\left[t - \frac{1}{2}\sin 2t\right] + C \tag{4.15} \]
where $C$ is a constant of integration. Let $\theta = 2t$, and write this equation as
\[ x = A_3\left[\frac{2t - \sin 2t}{2}\right] + C = \frac{A_3}{2}\,(\theta - \sin\theta) + C \]
and also from (4.13)
\[ y = y_a - \frac{A_3}{2}\,(1 - \cos\theta) \]
We note that the general equations for a cycloid are
\[ x = r\,(t - \sin t), \quad y = r\,(1 - \cos t) \]
which is the basic form of $x$ and $y$ that we found above (with $r = A_3/2$, and with $y$ measured downwards from $y_a$). The path we seek (as correctly postulated by Johann Bernoulli et al.) can therefore be described as a piece of a cycloid. A cycloid is defined as the locus of a point on the rim of a circle of radius $a$ (so that $r = a$ in the equations above) rolling along a straight line (see Figure 4.2).

Figure 4.2: A cycloid is the locus of a point on the rim of a circle of radius a rolling
along a straight line
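The cycloid solution can be checked against the first integral $\left[1 + (y')^2\right](y_a - y) = A_3$, and its descent time can be compared with that of a straight ramp between the same end points. The sketch below is an illustration only: the function names and the values $A_3 = 2$, $g = 9.81$ are assumptions, and the two descent-time formulas used (time $\theta\sqrt{r/g}$ along the cycloid from rest, and $\sqrt{2\ell^2/(g d)}$ for a frictionless incline of length $\ell$ and drop $d$) are standard results quoted here rather than derived in the notes.

```python
import math

def cycloid_point(theta, A3, ya=0.0, C=0.0):
    """x = (A3/2)(theta - sin theta) + C,  y = ya - (A3/2)(1 - cos theta)."""
    x = 0.5 * A3 * (theta - math.sin(theta)) + C
    y = ya - 0.5 * A3 * (1.0 - math.cos(theta))
    return x, y

def first_integral(theta, A3, ya=0.0):
    """(1 + y'^2)(ya - y), with y' = (dy/dtheta)/(dx/dtheta)."""
    dy = -0.5 * A3 * math.sin(theta)
    dx = 0.5 * A3 * (1.0 - math.cos(theta))
    _, y = cycloid_point(theta, A3, ya)
    return (1.0 + (dy / dx) ** 2) * (ya - y)

A3, g = 2.0, 9.81                       # illustrative values (assumptions)
values = [first_integral(th, A3) for th in (0.5, 1.0, 2.0, 3.0)]

# Descent times from theta = 0 to theta = pi (the lowest point of the arch).
r = 0.5 * A3
t_cycloid = math.pi * math.sqrt(r / g)
x_end, y_end = cycloid_point(math.pi, A3)
drop = -y_end
t_line = math.sqrt(2.0 * (x_end**2 + drop**2) / (g * drop))
print(values, t_cycloid, t_line)
```

Every entry of `values` should equal $A_3$, and the cycloid time should come out shorter than the straight-line time.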

4.3 Euler-Lagrange Differential Equation

As described in the previous section, the Euler-Lagrange differential equation is the fundamental equation of the calculus of variations. It states that if a functional $J$ is defined by an integral of the form
\[ J = \int f(t, y, y')\,dt \]
where
\[ y' = \frac{dy}{dt} \]
then $J$ has a stationary value if the Euler-Lagrange differential equation
\[ \frac{\partial f}{\partial y} - \frac{d}{dt}\left(\frac{\partial f}{\partial y'}\right) = 0 \]
is satisfied.
Let us illustrate this theory with a suitable example. Consider two equal-sized metal rings of radius $r$ placed into a soap solution, and then separated. A soap film is subsequently formed, stretched out between the two rings. The shape of this film is such as to provide minimal area (see Figure 4.3). For this problem, the surface area that is to be minimized is expressed as
\[ \text{surface area} = C\int y\,\sqrt{1 + (y')^2}\,dx \]

Figure 4.3: Soap film formed between two metal rings each of radius $r$

where $C = 2\pi$ for a circle or cylinder. Let us consider the functional
\[ F(y, y') = C\,y\,\sqrt{1 + (y')^2} \]
The Euler-Lagrange equation that should be used is the one for the form $F = F(y, y')$ which is given in (4.10)
\[ F - F_{y'}\,y' = \text{constant} = A \tag{4.16} \]
Now
\[ F_{y'} = \frac{1}{2}\,\frac{C\,y\,(2y')}{\sqrt{1 + (y')^2}} \tag{4.17} \]
Substituting (4.17) into (4.16) gives
\[ C\,y\,\sqrt{1 + (y')^2} - \frac{1}{2}\,\frac{C\,y\,(2y')}{\sqrt{1 + (y')^2}}\,y' = A \]
hence
\[ C\,y\left[\frac{1 + (y')^2 - (y')^2}{\sqrt{1 + (y')^2}}\right] = A \]
This means that
\[ C\,y = A\,\sqrt{1 + (y')^2} \;\Rightarrow\; \left(\frac{C}{A}\right)^2 y^2 = 1 + (y')^2 \]
Hence
\[ y' = \frac{dy}{dx} = \sqrt{\left(\frac{C}{A}\right)^2 y^2 - 1} \]
Therefore
\[ \int \frac{1}{\sqrt{\left(\frac{C}{A}\right)^2 y^2 - 1}}\,dy = \int dx \]
Integrating, we get
\[ x = \frac{A}{C}\cosh^{-1}\left(\frac{Cy}{A}\right) + B \]
This leads to
\[ \frac{Cy}{A} = \cosh\left[\frac{C\,(x - B)}{A}\right] \]
Hence
\[ y = \frac{A}{C}\cosh\left[\frac{C\,(x - B)}{A}\right] \]
where $A$ can be negative or positive. The shape of the film drawn out between the two circles is therefore two symmetric back-to-back hyperbolic cosine curves (catenary shape). Note that due to symmetry, we can safely assume that $B = 0$. Also, if boundary conditions are taken into account, it can be proven that $A = 0$ is a possible solution, meaning that a soap film will not necessarily be formed between the two rings when they are separated.
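The catenary profile can be verified against the first integral (4.16). The sketch below evaluates $F - F_{y'}\,y'$ along $y = \frac{A}{C}\cosh\left[\frac{C(x - B)}{A}\right]$ at a few sample points; the function names and the value $A = 3$ are illustrative assumptions.

```python
import math

def catenary(x, A, C=2.0 * math.pi, B=0.0):
    """Return y = (A/C) cosh(C (x - B) / A) and its slope y' = sinh(C (x - B) / A)."""
    s = C * (x - B) / A
    return (A / C) * math.cosh(s), math.sinh(s)

def first_integral(x, A, C=2.0 * math.pi):
    """F - F_{y'} y' for F = C y sqrt(1 + y'^2), evaluated on the catenary."""
    y, dy = catenary(x, A, C)
    F = C * y * math.sqrt(1.0 + dy * dy)
    F_dy = C * y * dy / math.sqrt(1.0 + dy * dy)
    return F - F_dy * dy

A = 3.0                                  # illustrative constant (assumption)
values = [first_integral(x, A) for x in (-0.4, 0.0, 0.25, 0.5)]
print(values)
```

Each value should equal the constant $A$, independent of $x$, as required by (4.16).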

4.3.1 Application in Optics


Let us investigate Fermat's principle for the propagation of light. In optics, Fermat's
principle states that the path taken between two points by a ray of light is the path
that can be traversed in the least time. Such a path can be found from the
Euler-Lagrange equations. This principle can be used to describe the properties of
light rays reflected off mirrors and other surfaces (see Figure 4.4), or refracted
through different media (see Figure 4.5). It follows mathematically from Huygens'
principle (in the limit of small wavelength), and can be used to derive Snell's law of
refraction and the law of reflection. Light travels in straight lines in a vacuum at the
speed $c_0 \simeq 3 \times 10^8\ \mathrm{m\,s^{-1}}$. If light travels through a uniform
medium with a speed of $c$, then we define the refraction index $n$ of that medium to be
\[ n = \frac{c_0}{c} \]
Clearly the smaller the value of $c$, the larger the value of $n$, and hence the more
optically dense the medium is.
Let us consider a medium where the refraction index $n_1$ is dependent on the depth
of the medium, i.e. $n_1 = n_1(y)$. Since
\[ c = \frac{ds}{dt} \]
it follows that the time taken for light to travel along an element of path $ds$ is
\[ dt = \frac{ds}{c} = \frac{ds}{(c_0 / n_1)} = \frac{1}{c_0}\, n_1\, ds \]
Therefore the time taken for light to travel from a point $x_1$ to a point $x_2$ is
\[ t = \int_{x_1}^{x_2} \frac{n_1(y)}{c_0}\, ds \]
and since we know that
\[ ds = \sqrt{1 + (y')^2}\, dx \]
then
\[ t = \int_{x_1}^{x_2} \frac{n_1(y)}{c_0} \sqrt{1 + (y')^2}\, dx \]

Figure 4.4: Reflected light

Figure 4.5: Light is refracted as it passes from one medium to another with a different
optical density

Let us consider the functional
\[ F(y, y') = n(y) \sqrt{1 + (y')^2} \]
where $n(y) = \dfrac{n_1(y)}{c_0}$. The Euler-Lagrange equation for such a functional is
\[ F - (F_{y'})\, y' = C \]
where C is a constant. Therefore, we have that
\[ n(y) \sqrt{1 + (y')^2} - \frac{n(y)\,(y')^2}{\sqrt{1 + (y')^2}} = C \]
Therefore
\[ n(y) \left[1 + (y')^2\right] - n(y)\,(y')^2 = C \sqrt{1 + (y')^2} \]
\[ n(y) \left[1 + (y')^2 - (y')^2\right] = C \sqrt{1 + (y')^2} \]
\[ n(y) = C \sqrt{1 + (y')^2} \]
\[ (y')^2 = \frac{[n(y)]^2}{C^2} - 1 \]
\[ y' = \sqrt{\frac{[n(y)]^2}{C^2} - 1} \]

Now using the fact that $y' = \dfrac{dy}{dx}$, we may write
\[ C \int \frac{dy}{\sqrt{[n(y)]^2 - C^2}} = x + C_1 \]
We note that we must have $[n(y)]^2 - C^2 > 0$, i.e.
\[ [n(y)]^2 > C^2 = \text{constant} \]
As $n(y) = \dfrac{n_1(y)}{c_0}$ is just a scaled refractive index of the medium, this
result can be roughly translated to mean that the refractive index in the medium must be
above a certain value for light to propagate through it. This means that light waves can
be trapped between two regions bounded by a limiting refractive index in a properly
graded medium of varying densities. This is the principle behind "waveguides", which
confine light waves travelling through a given region.

4.4 Theory of Lagrange Multipliers


Suppose we wish to find the extrema of the function $z = f(x, y)$ subject to a given
constraint $F(x, y) = h = \text{constant}$. At the maximum of the function
$z = f(x, y(x))$, we have that
\[ \frac{dz}{dx} = 0 \;\Rightarrow\; \frac{dz}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \frac{dy}{dx} = 0 \tag{4.18} \]
Now
\[ F(x, y(x)) = h \]
therefore differentiating, we have that
\[ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y} \frac{dy}{dx} = 0 \]
and it follows that
\[ \frac{dy}{dx} = -\frac{\partial F / \partial x}{\partial F / \partial y} \tag{4.19} \]
Substituting (4.19) into (4.18) we get
\[ f_x - f_y \frac{F_x}{F_y} = 0 \]
Separating the variables, we have that
\[ \frac{f_x}{F_x} = \frac{f_y}{F_y} = \lambda = \text{constant} \]
where $\lambda$ is a constant that we refer to as a Lagrange multiplier. We therefore have
two equations
\[ f_x - \lambda F_x = 0, \qquad f_y - \lambda F_y = 0 \]
so if we had formed the equation
\[ g(x, y) = f(x, y) - \lambda F(x, y) \]
then at any given maximum or minimum point of $f(x, y)$ subjected to the constraint
$F(x, y)$, this new function $g(x, y)$ also has an extremum at
\[ g_x = f_x - \lambda F_x = 0, \qquad g_y = f_y - \lambda F_y = 0 \]
(*see notes for MATH 2270: Multivariable Calculus for examples).

In summary, given a functional
\[ I(y) = \int_a^b F(x, y, y')\, dx \]
to find its extremum with a given constraint
\[ \int_a^b G(x, y, y')\, dx = k = \text{constant} \]
we consider a new functional
\[ J(y) = \int_a^b (F - \lambda G)\, dx \]
(where $\lambda$ is the Lagrange multiplier) and then we proceed to find the Euler-Lagrange
equation for $F - \lambda G$. We then would have a differential equation to solve with two
boundary conditions and another variable $\lambda$ which we can find from the constraint.
That is:
\[ (\text{Euler-Lagrange equation for } F - \lambda G) \;+\; \int_a^b G\, dx = k \]
Note that the actual meaning of the Lagrange multiplier is
\[ \lambda = \frac{dI}{dk} \]

4.4.1 Worked Example


Suppose we are faced with the following problem. We have a loose non-elastic string
tied between two fixed points a and b on the x-axis. We wish to determine the shape
the string must take in order to enclose the maximum possible area between the string
and the x-axis. Since the length of the string is fixed, this is a constraint that we
must take into account in order to solve the problem. To state the problem in
mathematical terms, we are trying to maximize the area
\[ \text{Area} = \int_a^b y\, dx \]
subject to the constraint
\[ \text{length of string} = L = \int_a^b \sqrt{1 + (y')^2}\, dx \]
where L is a constant. In order to solve this and other such problems, we may utilize
the theory of Lagrange multipliers.
Let us consider
\[ I(y) = \int_a^b \left[ y - \lambda \sqrt{1 + (y')^2} \right] dx \]
where
\[ g = y - \lambda \sqrt{1 + (y')^2} \]
The corresponding Euler-Lagrange equation is
\[ \frac{\partial g}{\partial y} - \frac{d}{dx} \left( \frac{\partial g}{\partial y'} \right) = 0 \tag{4.20} \]
Recall the Euler-Lagrange equation that should be used is the one for the form
$F = F(y, y')$ from (4.16)
\[ F - F_{y'}\, y' = \text{constant} = A \]
Using this equation with (4.20), we get
\[ y - \lambda \sqrt{1 + (y')^2} + \left[ \frac{\lambda y'}{\sqrt{1 + (y')^2}} \right] y' = A \]
which simplifies to
\[ y + \lambda\, \frac{(y')^2 - \left[1 + (y')^2\right]}{\sqrt{1 + (y')^2}} = A \]
\[ y - \frac{\lambda}{\sqrt{1 + (y')^2}} = A \]
\[ y - A = \frac{\lambda}{\sqrt{1 + (y')^2}} \]
Squaring both sides
\[ (y - A)^2 = \frac{\lambda^2}{1 + (y')^2} \]
which gives
\[ (y')^2 = \frac{\lambda^2}{(y - A)^2} - 1 \]
hence
\[ y' = \sqrt{\frac{\lambda^2}{(y - A)^2} - 1} \]
Now since $y' = \dfrac{dy}{dx}$, we get
\[ \int \frac{(y - A)}{\sqrt{\lambda^2 - (y - A)^2}}\, dy = \int dx \]
which is
\[ -\sqrt{\lambda^2 - (y - A)^2} = x + C \]
and finally
\[ \lambda^2 = (x + C)^2 + (y - A)^2 \]
We see therefore that the Lagrange multiplier $\lambda$ is the radius of a circle centred at
$(-C, A)$. The shape of the string to enclose a maximum possible area is therefore a
semi-circular arc, and the area can be calculated accordingly.
We can also solve this problem as follows. Let us take instead the Euler-Lagrange
equation (in its original form before simplification)
\[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) = 0 \]
In our case
\[ F = y - \lambda \sqrt{1 + (y')^2} \]
and so we get (recall that y is considered separately from $y'$)
\[ 1 + \lambda \frac{d}{dx} \left( \frac{y'}{\sqrt{1 + (y')^2}} \right) = 0 \]
Simplifying this, we get
\[ 1 + \lambda \left( \frac{y'' \sqrt{1 + (y')^2} - y'\, \dfrac{y' y''}{\sqrt{1 + (y')^2}}}{1 + (y')^2} \right) = 0 \]
\[ 1 + \lambda \left[ \frac{y'' \left[1 + (y')^2\right] - y'' (y')^2}{\left[1 + (y')^2\right]^{3/2}} \right] = 0 \]
\[ 1 + \lambda\, \frac{y''}{\left[1 + (y')^2\right]^{3/2}} = 0 \]
This may be expressed as
\[ \frac{y''}{\left[1 + (y')^2\right]^{3/2}} = -\frac{1}{\lambda} = \frac{1}{R} \]
This is the equation of curvature where $-\lambda = R$ = radius of curvature. Hence we are
saying that the curvature is constant, which implies that the shape of the string for
maximal area is a semi-circular arc, as previously determined. As $\lambda$ is negative, this
indicates that the semi-circular arc is inverted as shown in Figure 4.6.

Figure 4.6: For maximal area the string must be in the shape of a semi-circular arc
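
The claim that a circular arc encloses the largest area can be spot-checked numerically. The following is a minimal sketch, not taken from the notes: it compares the area under a circular arc with the area under an isosceles triangle made from a string of the same length, for an assumed distance `chord` between the fixed points. The function names `arc_area` and `triangle_area` and the sample values are hypothetical.

```python
import math

def arc_area(chord, length, tol=1e-12):
    """Area between a circular arc of the given length and its chord.

    Requires length > chord. The half-angle phi subtended by the arc
    satisfies sin(phi)/phi = chord/length on (0, pi).
    """
    ratio = chord / length
    lo, hi = 1e-9, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # sin(phi)/phi decreases from 1 to 0 on (0, pi)
        if math.sin(mid) / mid > ratio:
            lo = mid
        else:
            hi = mid
    phi = 0.5 * (lo + hi)
    radius = length / (2.0 * phi)
    # area of a circular segment with half-angle phi
    return radius * radius * (phi - math.sin(phi) * math.cos(phi))

def triangle_area(chord, length):
    """Area under a string pulled into an isosceles triangle over the chord."""
    half = length / 2.0
    height = math.sqrt(half * half - (chord / 2.0) ** 2)
    return 0.5 * chord * height

chord, length = 2.0, math.pi  # string length chosen so the arc is a semicircle
print(arc_area(chord, length), triangle_area(chord, length))
```

With these sample values the arc is a semicircle of unit radius, so its area should be close to $\pi/2$, and it exceeds the area of the triangle built from the same string.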

4.4.2 Exercise
Find the shape of a chain that is hanging at equilibrium between two fixed points,
where the length of the chain is given. (*Hint: the potential energy of the chain is
minimal in this state).

4.5 Least Action Principle


The principle of least action (or principle of stationary action) is a variational
principle that, when applied to the action of a mechanical system, can be used to obtain
the equations of motion for that system. The principle led to the development of
the Lagrangian and Hamiltonian formulations of classical mechanics. This principle
is often applied in the theory of relativity, quantum mechanics and quantum field
theory. The "action" is a mathematical functional which takes the trajectory (also
called the path or history) of the system as its argument. Like any other functional,
it has a real number as its result. As the action is path dependent, it takes different
values for different paths. The trajectory traced out by any physical system is that
for which the action is minimized, or, more strictly, is stationary. As can be expected,
the classical equations of motion of a system may be derived from the principle of
least action.
The action S is represented as an integral over time, taken along the path of the
system between the initial time and the final time of the development of the system.
Naturally, for the action to be well defined, the trajectory has to be bounded in time
and space. It has the dimensions of energy $\times$ time, and its SI unit is the joule-second.
It is represented in mathematical terms as
\[ S = \int_{t_1}^{t_2} L\, dt \tag{4.21} \]
where the integrand L is called the Lagrangian, named after Joseph Louis Lagrange.
The Lagrangian of a dynamical system is a function that summarizes the dynamics of
the system. The concept of a Lagrangian was introduced by Lagrange in his reformulation
of classical mechanics, now known as Lagrangian mechanics. In classical mechanics, the
Lagrangian is defined as the kinetic energy, T, of the system minus its potential energy, U:
\[ L = T - U \tag{4.22} \]
If the Lagrangian of a system is known, then the equations of motion of the system
may be obtained by a direct substitution of the expression for the Lagrangian into
the Euler-Lagrange equation.
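In order to explain these concepts more clearly, let us consider the motion of a
particle in a potential well $U(x)$, where
\[ F = -\frac{dU}{dx} \]
By Newton's second law, we have
\[ m \frac{d^2 x}{dt^2} = F = -\frac{dU}{dx} \]
which gives
\[ -\frac{dU}{dx} - m \frac{d^2 x}{dt^2} = 0 \]
Hence
\[ -\frac{dU}{dx} - \frac{d}{dt} \left( m \dot{x} \right) = 0 \]
which can be expressed as (via the chain rule)
\[ -\frac{dU}{dx} - \frac{d}{dt} \left[ \frac{\partial}{\partial \dot{x}} \left( \frac{m \dot{x}^2}{2} \right) \right] = 0 \]
Since
\[ \frac{\partial}{\partial x} \left[ \frac{m \dot{x}^2}{2} \right] = 0 = \frac{\partial}{\partial t} \frac{\partial}{\partial \dot{x}}\, U \]
then
\[ \frac{\partial}{\partial x} \left[ \frac{m \dot{x}^2}{2} - U \right] - \frac{d}{dt} \frac{\partial}{\partial \dot{x}} \left[ \frac{m \dot{x}^2}{2} - U \right] = 0 \]
However we know that the kinetic energy
\[ T = \frac{m \dot{x}^2}{2} \]
Therefore we have
\[ \frac{\partial}{\partial x} [T - U] - \frac{d}{dt} \frac{\partial}{\partial \dot{x}} [T - U] = 0 \]
Now we have previously defined the Lagrangian as
\[ L = T - U \]
It follows that
\[ \frac{\partial}{\partial x} [L] - \frac{d}{dt} \frac{\partial}{\partial \dot{x}} [L] = 0 \tag{4.23} \]
Equation (4.23) is a variant of the Euler-Lagrange equation that involves the Lagrangian L.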
Now we already know that $E = T + U$ is the equation used for the total energy
in a given system. Therefore, it follows that
\[ L = T - U = T - (E - T) = 2T - E = 2\, \frac{m (v)^2}{2} - E = m v^2 - E \]
Since
\[ \vec{v} \cdot \vec{v} = v^2 \]
we may write
\[ L = m\, \vec{v} \cdot \vec{v} - E \]
The "action" S is therefore
\[ S = \int_{t_1}^{t_2} L\, dt = \int_{t_1}^{t_2} \left( m\, \vec{v} \cdot \vec{v} - E \right) dt \]
and since $m \vec{v} = \vec{p}$ is the momentum and $\vec{v}\, dt = d\vec{r}$, where the
trajectory is the path taken from the point $r_1$ to the point $r_2$, therefore
\[ S = \int_{t_1}^{t_2} L\, dt = \int_{r_1}^{r_2} \vec{p} \cdot d\vec{r} - E\, (\Delta t) \]
The path chosen by the particle is such that $E\, (\Delta t)$ is minimal.

4.5.1 Worked Example


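Consider a pendulum made of an elastic string supporting a mass $m$. Clearly the
length of the oscillating pendulum will vary at any particular point in time. Let us
take the angle to the vertical at the point during the swing at pendulum length $L_e$ to
be $\theta$ (see Figure 4.7), where $L_e$ and $\theta$ are both functions of time. Let $L_0$
be the "rest" length of the elastic string, meaning that the string extends by the length
$L_e - L_0$ as the pendulum swings. Resolving in the horizontal and vertical directions,
we have
\[ x = L_e \sin\theta, \qquad y = L_e \cos\theta \]

Figure 4.7: Elastic pendulum has length $L_e$ when the angle to the vertical is $\theta$

and therefore
\[ \dot{x} = \dot{L}_e \sin\theta + L_e \dot{\theta} \cos\theta \]
\[ \dot{y} = \dot{L}_e \cos\theta - L_e \dot{\theta} \sin\theta \]
Squaring these, we have
\[ \dot{x}^2 = \dot{L}_e^2 \sin^2\theta + L_e^2 \dot{\theta}^2 \cos^2\theta + 2 L_e \dot{L}_e \dot{\theta} \sin\theta \cos\theta \]
and
\[ \dot{y}^2 = \dot{L}_e^2 \cos^2\theta + L_e^2 \dot{\theta}^2 \sin^2\theta - 2 L_e \dot{L}_e \dot{\theta} \sin\theta \cos\theta \]
Now as the kinetic energy is
\[ T = \frac{m}{2} \left( \dot{x}^2 + \dot{y}^2 \right) \]
we have that
\[ T = \frac{m}{2} \left[ \dot{L}_e^2 \sin^2\theta + L_e^2 \dot{\theta}^2 \cos^2\theta + \dot{L}_e^2 \cos^2\theta + L_e^2 \dot{\theta}^2 \sin^2\theta \right] \]
which gives
\[ T = \frac{m}{2} \left[ \dot{L}_e^2 + L_e^2 \dot{\theta}^2 \right] \tag{4.24} \]
The potential energy is (where $L_0$ is the "rest" length of the elastic string and k is
the elastic constant)
\[ U = -mgy + \frac{k}{2} (L_e - L_0)^2 = -mg L_e \cos\theta + \frac{k}{2} (L_e - L_0)^2 \tag{4.25} \]
It follows that the Lagrangian for this system is
\[ L = \frac{m}{2} \left[ \dot{L}_e^2 + L_e^2 \dot{\theta}^2 \right] - \left[ -mg L_e \cos\theta + \frac{k}{2} (L_e - L_0)^2 \right] \]
so
\[ L = \frac{m}{2} \left[ \dot{L}_e^2 + L_e^2 \dot{\theta}^2 \right] + mg L_e \cos\theta - \frac{k}{2} (L_e - L_0)^2 \tag{4.26} \]
Now
\[ \frac{\partial L}{\partial L_e} - \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{L}_e} \right) = 0 \]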
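
which gives
\[ m L_e \dot{\theta}^2 + mg \cos\theta - k (L_e - L_0) - \frac{d}{dt} \left( m \dot{L}_e \right) = 0 \]
which simplifies to
\[ m L_e \dot{\theta}^2 + mg \cos\theta - k (L_e - L_0) - m \ddot{L}_e = 0 \tag{4.27} \]
Also
\[ \frac{\partial L}{\partial \theta} - \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{\theta}} \right) = 0 \]
which gives
\[ -mg L_e \sin\theta - \frac{d}{dt} \left( m L_e^2 \dot{\theta} \right) = 0 \]
which simplifies to
\[ -mg L_e \sin\theta - m \left( 2 L_e \dot{L}_e \dot{\theta} + L_e^2 \ddot{\theta} \right) = 0 \]
Dividing this by $-m L_e$ we have
\[ g \sin\theta + 2 \dot{L}_e \dot{\theta} + L_e \ddot{\theta} = 0 \tag{4.28} \]
At steady state, all the time derivatives from (4.27) drop out and we get
\[ mg \cos\theta - k (L_e - L_0) = 0 \;\Rightarrow\; L_e = L_0 + \frac{mg \cos\theta}{k} = L_{equilibrium} \tag{4.29} \]
where $\dfrac{mg \cos\theta}{k}$ represents the elongation of the elastic string. Similarly at
steady state we get from (4.28) that
\[ \sin\theta = 0 \;\Rightarrow\; \theta = 0 \tag{4.30} \]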

We now linearize about the steady state as follows. Let $\theta$ be very small (i.e. a small
deviation from its steady state value of zero), and $L_e = L_{equilibrium} + q$, where q is a
small deviation from the equilibrium length. Since $\theta$ is small
\[ \cos\theta \simeq 1 - \frac{(\theta)^2}{2} \simeq 1 \]
and so
\[ L_e - L_0 = L_{equilibrium} + q - L_0 = \frac{mg}{k} + q \]
and
\[ \ddot{L}_e = \ddot{q} \]
Therefore equation (4.27) becomes after linearization
\[ mg - k \left( \frac{mg}{k} + q \right) - m \ddot{q} = 0 \]
which means
\[ kq + m \ddot{q} = 0 \tag{4.31} \]
Therefore we have frequency
\[ \omega_0 = \sqrt{\frac{k}{m}} \tag{4.32} \]
Also since $\theta$ is small
\[ \sin\theta \simeq \theta \]
and equation (4.28) becomes after linearization
\[ L_{equilibrium}\, \ddot{\theta} + g \theta = 0 \tag{4.33} \]
with frequency
\[ \omega_1 = \sqrt{\frac{g}{L_{equilibrium}}} \tag{4.34} \]
The general solution for q is therefore
\[ q = A \cos(\omega_0 t) + B \sin(\omega_0 t) \tag{4.35} \]
and for $\theta$
\[ \theta = C \cos(\omega_1 t) + D \sin(\omega_1 t) \tag{4.36} \]
i.e.
\[ q \sim e^{i \omega_0 t} \quad \text{and} \quad \theta \sim e^{i \omega_1 t} \]
Now if we did not drop nonlinear terms from the expressions, we would have
obtained after the small perturbation from the steady state the equations
2
k g
q+ q = Lequilibrium ( )2 (4.37)
m 2
and
Lequilibrium +g = 2 q q2 (4.38)

Using regular perturbation expansions

q = q0 + q1 + ::: and = 0 + 1 + :::



the …rst order system is


2
k g
q0 + q0 = 0 Lequilibrium ( 0 )2 (4.39)
m 2

Lequilibrium 0 +g 0 = 2 q0 0 q02 0 (4.40)

Considering only the …rst equation (4:39), we see that


2
i!1 t 2 2i!1 t
0 e ) 0 e ; 0 e2i!1 t

and
q0 ei!0 t
Therefore the right hand side of (4.39) is periodic with oscillating frequency $2\omega_1$,
while the left hand side is periodic with oscillating frequency $\omega_0$. We therefore get
resonance (i.e. the solution grows without bound) if $2\omega_1 = \omega_0$, which means
resonance happens if
\[ 2 \sqrt{\frac{g}{L_{equilibrium}}} = \sqrt{\frac{k}{m}} \]
Since $L_{equilibrium} = L_0 + \dfrac{mg}{k}$, then
\[ 2 \sqrt{\frac{g}{L_0 + \frac{mg}{k}}} = \sqrt{\frac{k}{m}} \]
squaring we get
\[ 4g = \frac{k}{m} \left( L_0 + \frac{mg}{k} \right) = \frac{k}{m} L_0 + g \]
which gives a condition for resonance of
\[ \frac{k L_0}{mg} = 3 \]
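
The resonance condition can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `frequencies` and the sample values of $m$, $L_0$ and $g$ are assumptions, and $k$ is chosen so that $kL_0/(mg) = 3$. It evaluates $\omega_0$ from (4.32) and $\omega_1$ from (4.34), using $L_{equilibrium} = L_0 + mg/k$.

```python
import math

def frequencies(m, k, L0, g=9.81):
    """Return (omega_0, omega_1) for the linearized elastic pendulum."""
    L_eq = L0 + m * g / k          # equilibrium length at theta = 0
    omega_0 = math.sqrt(k / m)     # radial (spring) oscillation, eq. (4.32)
    omega_1 = math.sqrt(g / L_eq)  # angular (pendulum) oscillation, eq. (4.34)
    return omega_0, omega_1

m, L0, g = 0.5, 0.3, 9.81   # hypothetical sample values
k = 3.0 * m * g / L0        # chosen so that k*L0/(m*g) = 3
omega_0, omega_1 = frequencies(m, k, L0, g)
print(omega_0, 2.0 * omega_1)  # the two numbers should agree
```

For any other choice of $k$ the two printed values differ, so the quadratic forcing term does not drive the radial mode at its natural frequency.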

4.5.2 Exercise
1. Consider the following double pendulum (see Figure 4.8) of two masses $m_1$
and $m_2$ supported by rods of length $L_1$ and $L_2$ respectively. Find expressions
for the kinetic energy and potential energy for each mass, the corresponding
Lagrangians and Lagrange equations of motion.
2. Consider the following double pendulum (see Figure 4.9), where the first mass
$m_1$ is attached to a spring along the x-axis with spring constant k, and the
second mass $m_2$ is supported by a rod of length L, and the whole system is
oscillating. Use the Lagrangian formulation to find the frequencies of oscillation
of both masses.

Figure 4.8: Double pendulum with masses $m_1$ and $m_2$ (pendulum lengths $L_1$ and $L_2$
respectively)

Figure 4.9: Double pendulum where the first mass $m_1$ is attached to a spring along
the x-axis and the second mass $m_2$ is connected to the first mass by a fixed rod.
Chapter 5

Concepts in Mechanics

5.1 Laws of Conservation


Consider a fluid of constant density $\rho$ passing through the surface of a fixed volume V
(see Figure 5.1). If the velocity of the fluid is $\vec{v}$, the mass m of the fluid passing
through V may be found as
\[ m = \int_V \rho\, dV \]

Figure 5.1: Fluid of velocity v, density $\rho$ and mass m passing through the surface of
a fixed volume V

If the surface of V is S, and an element of the surface through which the fluid passes
is denoted dS with unit normal in the direction of the fluid flow $\hat{n}$, then the mass flux
of fluid $\vec{j}$ (where $\vec{j} = \rho \vec{v}$) through the surface S is
\[ \frac{\partial m}{\partial t} = \frac{\partial}{\partial t} \int_V \rho\, dV = -\int_S \rho\, \vec{v} \cdot \hat{n}\, dS \]

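It follows that
\[ \frac{\partial}{\partial t} \int_V \rho\, dV + \int_S \rho\, \vec{v} \cdot \hat{n}\, dS = 0 \]
Using Gauss' divergence theorem, we may write this as
\[ \int_V \frac{\partial \rho}{\partial t}\, dV + \int_V \nabla \cdot (\rho \vec{v})\, dV = 0 \]
which is
\[ \int_V \left[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \vec{v}) \right] dV = 0 \]
Hence we have the continuity equation
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \vec{v}) = 0 \tag{5.1} \]
Recalling the notation for the mass flux for a fluid of density $\rho$
\[ \vec{j} = \rho \vec{v} \]
we may write the continuity equation as
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot \vec{j} = 0 \tag{5.2} \]
Now since
\[ \nabla \cdot \vec{j} = \nabla \cdot (\rho \vec{v}) = \nabla \rho \cdot \vec{v} + \rho\, \nabla \cdot \vec{v} \]
then (5.1) can be written as
\[ \frac{\partial \rho}{\partial t} + \nabla \rho \cdot \vec{v} + \rho\, \nabla \cdot \vec{v} = 0 \]
If the fluid is incompressible, $\rho$ is constant and therefore
\[ \nabla \rho = 0, \qquad \frac{\partial \rho}{\partial t} = 0 \]
leading to
\[ \rho\, \nabla \cdot \vec{v} = 0 \]
which gives the continuity equation for an incompressible fluid as
\[ \nabla \cdot \vec{v} = 0 \tag{5.3} \]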

Figure 5.2: Ideal fluid of velocity v flows through a volume V upon which an external
pressure P is acting

An ideal fluid is one with no internal friction between the molecules of the fluid.
Consider an external pressure P acting on the surface of a volume V through which
an ideal fluid is flowing with velocity $\vec{v}$ (see Figure 5.2). If the surface of V is S,
and an element of the surface through which the fluid passes is denoted dS with unit
normal in the direction of the fluid flow $\hat{n}$, then we may say that
\[ d\vec{f} = -P\, \hat{n}\, dS \]
The total force $\vec{F}$ acting on the surface S of the volume V is therefore (recall
Pressure = Force / Area)
\[ \vec{F} = \int_S d\vec{f} = -\int_S P\, \hat{n}\, dS \]
By Gauss' divergence theorem (with a "twist" since $\nabla$ here is not a divergence but a
gradient since P is scalar), we may say that
\[ \vec{F} = -\int_V \nabla P\, dV \]
where $-\nabla P$ is the force per unit volume acting on the surface of the ideal fluid.
Consider the force acting on a small element of ideal fluid of mass dm and volume
dV. It follows that
\[ d\vec{f} = -\nabla P\, dV \]
where $-\nabla P$ is the force per unit volume acting on the surface of the small element of
ideal fluid. Now since we know that
\[ dm = \rho\, dV \]
and Newton's second law is
\[ d\vec{f} = dm\, \frac{d\vec{v}}{dt} \]
it follows that
\[ -\nabla P\, dV = \rho\, dV\, \frac{d\vec{v}}{dt} \tag{5.4} \]
Hence
\[ -\nabla P = \rho\, \frac{d\vec{v}}{dt} \tag{5.5} \]
Now since in general
\[ d\vec{r} \cdot \nabla = dx \frac{\partial}{\partial x} + dy \frac{\partial}{\partial y} + dz \frac{\partial}{\partial z} \]
and
\[ d\vec{v} = \frac{\partial \vec{v}}{\partial t}\, dt + \frac{\partial \vec{v}}{\partial x}\, dx + \frac{\partial \vec{v}}{\partial y}\, dy + \frac{\partial \vec{v}}{\partial z}\, dz \]
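
we may say that
\[ d\vec{v} = \frac{\partial \vec{v}}{\partial t}\, dt + (d\vec{r} \cdot \nabla)\, \vec{v} \]
Therefore we have
\[ \frac{d\vec{v}}{dt} = \frac{\partial \vec{v}}{\partial t} + \left( \frac{d\vec{r}}{dt} \cdot \nabla \right) \vec{v} \]
and
\[ \frac{d\vec{v}}{dt} = \frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\, \vec{v} \tag{5.6} \]
Substituting (5.6) into (5.5), we have
\[ -\nabla P = \rho \left[ \frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\, \vec{v} \right] \]
which gives the so-called Euler equation
\[ \frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\, \vec{v} = -\frac{1}{\rho} \nabla P \tag{5.7} \]
Let us take
\[ \vec{v} = v_1 \hat{i} + v_2 \hat{j} + v_3 \hat{k} \]
It follows that
\[ \vec{v} \cdot \nabla = v_1 \frac{\partial}{\partial x} + v_2 \frac{\partial}{\partial y} + v_3 \frac{\partial}{\partial z} \]
This allows us to write (5.7) in component form as follows
\[ \frac{\partial v_1}{\partial t} + v_1 \frac{\partial v_1}{\partial x} + v_2 \frac{\partial v_1}{\partial y} + v_3 \frac{\partial v_1}{\partial z} = -\frac{1}{\rho} \frac{\partial P}{\partial x} \tag{5.8} \]
\[ \frac{\partial v_2}{\partial t} + v_1 \frac{\partial v_2}{\partial x} + v_2 \frac{\partial v_2}{\partial y} + v_3 \frac{\partial v_2}{\partial z} = -\frac{1}{\rho} \frac{\partial P}{\partial y} \tag{5.9} \]
\[ \frac{\partial v_3}{\partial t} + v_1 \frac{\partial v_3}{\partial x} + v_2 \frac{\partial v_3}{\partial y} + v_3 \frac{\partial v_3}{\partial z} = -\frac{1}{\rho} \frac{\partial P}{\partial z} \tag{5.10} \]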
In the presence of an additional external force, this equation may take a slightly
different form. Consider the case where the gravity force is included. Equation (5.4)
would be
\[ -\nabla P\, dV = \rho\, dV\, \frac{d\vec{v}}{dt} - \rho\, \vec{g}\, dV \]
giving a variant of the equation (5.5) as
\[ -\nabla P = \rho\, \frac{d\vec{v}}{dt} - \rho\, \vec{g} \]
This leads to a slightly different Euler equation under gravity
\[ \frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\, \vec{v} = -\frac{1}{\rho} \nabla P + \vec{g} \tag{5.11} \]
At steady state, there is no fluid motion ($\vec{v} = 0$) and we get
\[ -\frac{1}{\rho} \nabla P + \vec{g} = 0 \;\Longrightarrow\; \nabla P = \rho\, \vec{g} \tag{5.12} \]
which represents the equilibrium state of the fluid. (*Note: for any general external
force, we replace $\vec{g}$ by $\vec{f}$ in the equations above).
By writing equation (5.12) in terms of its z component (with the z axis vertically
upwards, $\vec{g} = -g \hat{k}$), and assuming that the density $\rho$ is constant throughout the
volume of fluid (i.e., there is no significant compression of fluid under the action of the
gravitational force), we obtain
\[ -\frac{1}{\rho} \frac{\partial P}{\partial z} - g = 0 \;\Longrightarrow\; \frac{\partial P}{\partial z} = -\rho g \tag{5.13} \]
and integrating (5.13), we may determine how the pressure varies with the depth of
the fluid (in the z direction). In general:
\[ P = -\rho g z + \text{constant} \]

Let us use the concepts above to determine how pressure behaves in the earth's
atmosphere. As the density is not constant for a gas, we must use the ideal gas law
\[ PV = \frac{m}{M} RT \]
where P is the pressure of the gas, V is the volume, m is the mass of the gas, M is
the mass of one mole of the gas, R is the universal gas constant, and T is the absolute
temperature. Hence
\[ P = \frac{m}{V} \frac{RT}{M} = \rho\, \frac{RT}{M} \]
and the pressure is directly proportional to the density as expected. Now
\[ \rho = \frac{M}{RT}\, P \tag{5.14} \]
and substituting (5.14) into (5.13), we have
\[ \frac{RT}{MP} \frac{dP}{dz} = -g \]
Integrating this
\[ \int_{P_0}^{P} \frac{1}{P}\, dP = -\int_0^z \frac{gM}{RT}\, dz \]
which is
\[ \ln P - \ln P_0 = -\frac{gM}{RT}\, z \]
where $P_0$ is the atmospheric pressure at sea level. Hence
\[ P = P_0 \exp\left( -\frac{Mg}{RT}\, z \right) \tag{5.15} \]
which is the Barometric formula. For air on the earth,
\[ \frac{RT}{gM} = \frac{(8.31)(300)}{(9.81)(29 \times 10^{-3})} \simeq 10^4\ \mathrm{m} = 10\ \mathrm{km} \]
After using this as a distance scale in (5.15) to non-dimensionalize the equation,
we can predict that for every 10 km of vertical distance away from the earth, the air
pressure drops by a factor of approximately 3 (more precisely, by a factor of $e$ for
each distance $RT/(gM)$).
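
The distance scale and the pressure profile (5.15) can be evaluated directly. The sketch below uses the same values of R, T, g and M as the estimate above. The function names `scale_height` and `pressure`, and the sea-level pressure of 101325 Pa, are assumptions introduced for illustration and do not come from the notes.

```python
import math

def scale_height(T=300.0, M=29e-3, g=9.81, R=8.31):
    """Distance scale RT/(gM) in metres for an isothermal atmosphere."""
    return R * T / (g * M)

def pressure(z, P0=101325.0, T=300.0, M=29e-3, g=9.81, R=8.31):
    """Barometric formula (5.15): pressure in Pa at height z in metres.

    P0 = 101325 Pa is an assumed sea-level pressure (one standard atmosphere).
    """
    return P0 * math.exp(-M * g * z / (R * T))

H = scale_height()
print(H)                      # somewhat below 10 km
print(pressure(H) / 101325.0) # pressure ratio after one scale height
```

The exact value of $RT/(gM)$ with these numbers is a little under $10^4$ m, which is why 10 km serves as an order-of-magnitude scale, and the pressure ratio after one such distance is $1/e$.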

5.1.1 Exercise
Consider an ideal fluid inside a cylindrical vessel which is placed on top of a rotating
table. What is the shape of the surface of the liquid inside the cylinder if the table
rotates with angular velocity $\omega$? (*you may use the Euler equation or any other
reasonable method).

5.2 Lagrangian vs Eulerian Equations of Motion


The analysis of a system may be carried out using the framework of Newtonian or
Lagrangian mechanics. A particle that moves along a particular path in the absence
of friction may be tracked via Newtonian (Eulerian) mechanics by calculating the
time-varying constraint forces that are required to keep the particle moving along the
path. Instead we could use the principles of Lagrangian mechanics to find a set of
independent generalized coordinates to characterize the motion of the particle as it
travels along the path. This Lagrangian approach is sometimes beneficial, as there are
fewer equations involved, and it is no longer necessary to incorporate into the analysis
the constraint forces that are required to keep the particle travelling along the path.
Let us consider a particle moving along a given path, as depicted in Figure 5.3.
Now $x(t_0) = x_0 = \xi$, and $\dfrac{dx}{dt} = v(x, t)$. We may describe the motion of the
particle in terms of its initial position, $x(\xi, t)$, as follows:
\[ \frac{\partial x}{\partial t} = v(\xi, t) \]

Figure 5.3: Path travelled by particle

By integrating, we have
\[ x = x(t_0) + \int_{t_0}^{t} v(\xi, \tau)\, d\tau = \xi + \int_{t_0}^{t} v(\xi, \tau)\, d\tau \]
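Example 5.2.1 In the case where $v(x, t) = t$, we have
\[ v = \frac{dx}{dt} = t \]
and
\[ x = x(t_0) + \int_{t_0}^{t} \tau\, d\tau = \xi + \frac{(t^2 - t_0^2)}{2} \]

Example 5.2.2 In the case where $v(x, t) = x$, we have
\[ v = \frac{dx}{dt} = x \]
and
\[ \int \frac{dx}{x} = \int dt \;\Rightarrow\; \ln x = t + C \;\Rightarrow\; x = A \exp(t) \]
If $x = \xi$ at $t = t_0$, we have
\[ \xi = A \exp(t_0) \;\Rightarrow\; A = \xi \exp(-t_0) \]
and therefore
\[ x = \xi \exp(t - t_0) \]

Example 5.2.3 In the case where $v(x, t) = xt$, we have
\[ v = \frac{dx}{dt} = xt \]
and
\[ \int \frac{dx}{x} = \int t\, dt \;\Rightarrow\; \ln x = \frac{t^2}{2} + C \;\Rightarrow\; x = A \exp\left( \frac{t^2}{2} \right) \]
If $x = \xi$ at $t = t_0$, we have
\[ \xi = A \exp\left( \frac{t_0^2}{2} \right) \;\Rightarrow\; A = \xi \exp\left( -\frac{t_0^2}{2} \right) \]
and therefore
\[ x = \xi \exp\left( \frac{t^2 - t_0^2}{2} \right) \]
and
\[ v(\xi, t) = \xi\, t \exp\left( \frac{t^2 - t_0^2}{2} \right) \]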
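
The closed-form trajectory of Example 5.2.3 can be compared with a direct numerical integration of $dx/dt = v(x, t)$. The sketch below is one possible check, not the method of the notes: it uses a classical fourth-order Runge-Kutta scheme, and the function name `track_particle` together with the values of $\xi$, $t_0$ and the final time are hypothetical.

```python
import math

def track_particle(xi, t0, t_end, v, steps=2000):
    """Integrate dx/dt = v(x, t) from x(t0) = xi using classical RK4."""
    h = (t_end - t0) / steps
    x, t = xi, t0
    for _ in range(steps):
        k1 = v(x, t)
        k2 = v(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = v(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = v(x + h * k3, t + h)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x

xi, t0, t_end = 1.5, 0.5, 2.0  # hypothetical initial position and times
numeric = track_particle(xi, t0, t_end, lambda x, t: x * t)
exact = xi * math.exp((t_end**2 - t0**2) / 2.0)
print(numeric, exact)  # the two values should agree closely
```

The same routine can be reused for the other examples by changing the velocity function passed as `v`.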
Note that
\[ \left. \frac{\partial}{\partial t} \right|_{x = \text{constant}} \neq \left. \frac{\partial}{\partial t} \right|_{\xi = \text{constant}} \]
The difference is the frame of reference. It is a different thing to track the value of a
given variable always at the initial point $\xi$ rather than measuring the given variable
at a point x = constant (see Figure 5.4).

Figure 5.4: Different to track the value of a variable at the initial point $x = \xi$ rather
than at some other point x.

For example, suppose we wanted to find the rate of change of temperature $\theta(x, t)$ in a
Lagrangian sense, i.e. from the perspective of the initial point $x = \xi$, then
\[ \frac{d}{dt}\, \theta(x, t) = \frac{\partial \theta}{\partial t} + \frac{\partial \theta}{\partial x} \frac{\partial x}{\partial t} = \frac{\partial \theta}{\partial t} + v(x, t) \frac{\partial \theta}{\partial x} \]
i.e.
\[ \frac{d}{dt} = \frac{\partial}{\partial t} + v \frac{\partial}{\partial x} \]
Consider the case $\dfrac{d\theta(x, t)}{dt} = 0$, leading to the typical advection equation
\[ 0 = \frac{\partial \theta}{\partial t} + v(x, t) \frac{\partial \theta}{\partial x} \]
Suppose we were to use the finite difference method to solve for the temperature at
$x = \xi$ (forward difference in time, forward difference in space, step size in time $\Delta t$,
step size in space $\Delta x$). We would have
\[ \frac{\theta_i^{j+1} - \theta_i^j}{\Delta t} = -v(x_i, t_j) \left( \frac{\theta_{i+1}^j - \theta_i^j}{\Delta x} \right) \]
However, this suggests that the evolution of the temperature depends on adjacent
particles throughout, which would not be the case if we are only looking at the
temperature at the initial position $x = \xi$. The correct idea would have been to first
switch over to Lagrangian coordinates. We would do this as follows:
In general, we can express the rate of change of temperature in Lagrangian
coordinates as
\[ \frac{d\theta}{dt} = g(x, t, \theta) \tag{5.16} \]
or in Eulerian coordinates as
\[ \frac{\partial \theta}{\partial t} + v(x, t) \frac{\partial \theta}{\partial x} = g(x, t, \theta) \tag{5.17} \]
In order to use the Lagrangian coordinates, we would consider
\[ \frac{dx}{dt} = v(x, t), \qquad x(t_0) = \xi \]
and solve this to get an expression for the trajectories along which everything is
transported
\[ x = \varphi(t, t_0, \xi) \tag{5.18} \]
We would then substitute (5.18) into (5.16) to get
\[ \frac{d\theta}{dt} = g\left( [\varphi(t, t_0, \xi)], t, \theta \right), \qquad \theta(t_0) = \theta_0(\xi) \tag{5.19} \]
and we can then proceed to solve (5.19).
For example, let us consider the particular case of quasi-linear equations
\[ a(x, t) \frac{\partial \theta}{\partial x} + b(x, t) \frac{\partial \theta}{\partial t} = g(x, t, \theta) \]
we may write this (using Eulerian coordinates)
\[ \frac{\partial \theta}{\partial t} + \frac{a(x, t)}{b(x, t)} \frac{\partial \theta}{\partial x} = \frac{1}{b(x, t)}\, g(x, t, \theta) \]
where comparing with equation (5.17), $\dfrac{a(x, t)}{b(x, t)} = v(x, t)$. If we now switch to
Lagrangian coordinates, we have
\[ \frac{d\theta}{dt} = \frac{1}{b(x, t)}\, g(x, t, \theta) \]
where $\dfrac{d}{dt}$ represents the derivative along the trajectory. We refer to the
characteristics of a flow with respect to a given variable as the paths along which the
variable in question remains constant. It follows that the equation for characteristics
of this problem is
\[ \frac{dx}{dt} = v(x, t) = \frac{a(x, t)}{b(x, t)} \]
hence
\[ \frac{dx}{a(x, t)} = \frac{dt}{b(x, t)} \]
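Example 5.2.4 Consider
\[ x \frac{\partial \theta}{\partial x} + 2t \frac{\partial \theta}{\partial t} = 3x^2 \theta \]
given that when $t = 1$, $\theta = 5x^2$. Therefore
\[ \frac{\partial \theta}{\partial t} + \frac{x}{2t} \frac{\partial \theta}{\partial x} = \frac{3x^2}{2t}\, \theta \tag{5.20} \]
The equation for the characteristics is
\[ \frac{dx}{x} = \frac{dt}{2t} \;\Rightarrow\; \ln x = \frac{1}{2} \ln t + C \;\Rightarrow\; x = A \sqrt{t} \]
so when $t = t_1$, $x = x_1$, and we get
\[ x_1 = A \sqrt{t_1} \;\Rightarrow\; A = \frac{x_1}{\sqrt{t_1}} \]
which means
\[ x = x_1 \sqrt{\frac{t}{t_1}} \tag{5.21} \]
The equation for the characteristics is therefore
\[ \frac{x^2}{t} = \frac{x_1^2}{t_1} = \text{constant} \tag{5.22} \]
Writing equation (5.20) in Lagrangian coordinates, we have
\[ \frac{d\theta}{dt} = \frac{3x^2}{2t}\, \theta \]
Along the characteristics we have from (5.22) that
\[ \frac{d\theta}{dt} = \frac{3x_1^2}{2t_1}\, \theta \]
and therefore
\[ \int \frac{d\theta}{\theta} = \int \frac{3x_1^2}{2t_1}\, dt \]
Hence
\[ \ln \theta = \frac{3x_1^2}{2t_1}\, t + C \;\Rightarrow\; \theta = A \exp\left( \frac{3x_1^2}{2t_1}\, t \right) \]
Next, using the initial condition $\theta(t_1) = \theta_1$, we get
\[ \theta_1 = A \exp\left( \frac{3x_1^2}{2t_1}\, t_1 \right) \;\Rightarrow\; A = \theta_1 \exp\left( -\frac{3x_1^2}{2} \right) \]
and therefore we have
\[ \theta = \theta_1 \exp\left( \frac{3x_1^2}{2t_1} [t - t_1] \right) \tag{5.23} \]
Let us take
\[ p = \frac{x_1}{\sqrt{t_1}} = \text{constant} \]
and so from (5.21), we have (along the characteristics)
\[ x = p \sqrt{t} \]
Therefore $p^2 = \dfrac{x^2}{t}$. Since we are given that at $t = 1$, $\theta = 5x^2$, then on
characteristics at $t_1 = 1$ it follows that $x = p$ and
\[ \theta_1 = 5p^2 = 5\, \frac{x^2}{t} \]
Also, since on the characteristics
\[ \frac{x^2}{t} = \frac{x_1^2}{t_1} = \text{constant} \]
we may express (5.23) as
\[ \theta = 5\, \frac{x^2}{t} \exp\left( \frac{3x^2}{2t} [t - 1] \right) \]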
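
A solution obtained by the method of characteristics can be verified by substituting it back into the equation. The sketch below assumes the equation of Example 5.2.4 reads $x\,\theta_x + 2t\,\theta_t = 3x^2\theta$ with $\theta = 5x^2$ at $t = 1$ (the dependent variable is written here as `theta`), and estimates the residual with central differences. The function names and the test point are hypothetical.

```python
import math

def theta(x, t):
    """Candidate solution: 5*(x^2/t)*exp((3*x^2/(2*t))*(t - 1))."""
    s = x * x / t  # constant along each characteristic
    return 5.0 * s * math.exp(1.5 * s * (t - 1.0))

def pde_residual(x, t, h=1e-5):
    """Central-difference estimate of x*theta_x + 2*t*theta_t - 3*x^2*theta."""
    dtheta_dx = (theta(x + h, t) - theta(x - h, t)) / (2.0 * h)
    dtheta_dt = (theta(x, t + h) - theta(x, t - h)) / (2.0 * h)
    return x * dtheta_dx + 2.0 * t * dtheta_dt - 3.0 * x * x * theta(x, t)

print(pde_residual(0.7, 1.3))          # should be close to zero
print(theta(0.7, 1.0), 5.0 * 0.7**2)   # initial condition at t = 1
```

A residual near zero at several sample points, together with agreement at $t = 1$, gives some confidence that the algebra along the characteristics was carried out consistently.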
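Example 5.2.5 Suppose we have in Lagrangian coordinates the following velocity
field
\[ v(\xi, t) = \xi + t, \qquad t_0 = 0 \]
We can find the velocity field in Eulerian coordinates as follows:
\[ \frac{dx}{dt} = v(\xi, t) \]
therefore
\[ x = \xi + \int_{t_0}^{t} v(\xi, t)\, dt = \xi + \int_0^t (\xi + t)\, dt \]
\[ x = \xi + \xi t + \frac{t^2}{2} = \xi (1 + t) + \frac{t^2}{2} \]
It follows that
\[ \xi = \frac{x - \dfrac{t^2}{2}}{1 + t} \]
and finally
\[ v = \xi + t = \frac{x - \dfrac{t^2}{2}}{1 + t} + t \]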

5.2.1 Equation for Caustics


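In general, velocity and density of any system are dependent on position and time
\[ v(x, t), \qquad \rho(x, t) \]
Recall the equation of continuity from the last section in one dimension
\[ \frac{\partial \rho}{\partial t} + \frac{\partial (\rho v)}{\partial x} = 0 \tag{5.24} \]
and in two or three dimensions
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \vec{v}) = 0 \tag{5.25} \]
Using the one dimensional version (5.24), we have (after using the product rule)
\[ \frac{\partial \rho}{\partial t} + v(x, t) \frac{\partial \rho}{\partial x} + \rho \frac{\partial v}{\partial x} = 0 \]
which means that
\[ \frac{\partial \rho}{\partial t} + v(x, t) \frac{\partial \rho}{\partial x} = -\rho \frac{\partial v}{\partial x} \tag{5.26} \]
In Lagrangian coordinates, (5.26) may be expressed as
\[ \frac{d\rho}{dt} = -\rho \frac{\partial v}{\partial x} \tag{5.27} \]
Let us take
\[ \frac{dx}{dt} = v(x, t), \qquad x(t_0) = \xi \]
and use the notation
\[ x = \varphi(t, \xi, t_0) \]
We wish to solve
\[ \frac{dx}{dt} = v(x, t), \qquad x(t_1) = x_1 \]
where
\[ x = \varphi(t, x_1, t_1), \qquad \xi = \varphi(t_0, x_1, t_1) \]
Dropping the subscripts "1" (which may be done as they are arbitrary), we have
\[ x = \varphi(t, x, t), \qquad \xi = \varphi(t_0, x, t) \]
The continuity equation in one dimension in Lagrangian coordinates (5.27) is therefore
\[ \frac{d\rho}{dt} = -\rho\, \frac{\partial}{\partial x}\, v(\varphi(t, \xi, t_0), t) \]
This can be integrated to get
\[ \ln \frac{\rho}{\rho_0} = -\int_{t_0}^{t} \frac{\partial}{\partial x}\, v(\varphi(t, \xi, t_0), t)\, dt \]
and therefore
\[ \rho = \frac{\rho_0}{\exp\left\{ \int_{t_0}^{t} \frac{\partial}{\partial x}\, v(\varphi(t, \xi, t_0), t)\, dt \right\}} \tag{5.28} \]
Let us now interpret these results in terms of Lagrangian coordinates. Referring
to Figure 5.5, let us take $y = \dfrac{dx}{d\xi}$. It follows that
\[ \frac{dv(x, t)}{d\xi} = \frac{d}{dt} \frac{\partial x}{\partial \xi} = \frac{dy}{dt} = \frac{\partial v(x, t)}{\partial \xi} = \frac{\partial v}{\partial x} \frac{\partial x}{\partial \xi} = \frac{\partial v}{\partial x}\, y \]
Also from Figure 5.5, since $x(t_0) = \xi$, then
\[ \frac{\partial x(t_0)}{\partial \xi} = y(t_0) = y_0 = 1 \]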
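
Figure 5.5: Diagram showing Eulerian/Lagrangian coordinates

In summary, we wish to solve
\[ \frac{dy}{dt} = \frac{\partial v}{\partial x}\, y, \qquad y(t_0) = y_0 = 1 \]
Integrating this, we obtain
\[ \int_{y_0}^{y} \frac{dy}{y} = \int_{t_0}^{t} \frac{\partial v}{\partial x}\, dt \;\Rightarrow\; \ln \frac{y}{y_0} = \int_{t_0}^{t} \frac{\partial v}{\partial x}\, dt \]
which is
\[ y = y_0 \exp\left( \int_{t_0}^{t} \frac{\partial v}{\partial x}\, dt \right) \]
Since $y_0 = 1$, we have
\[ \frac{\partial x}{\partial \xi} = \exp\left( \int_{t_0}^{t} \frac{\partial v}{\partial x}\, dt \right) \tag{5.29} \]
Now in general we can express the velocity field in Lagrangian coordinates as follows:
\[ \frac{dx}{dt} = v(\xi, t) \]
Therefore
\[ x = \xi + \int_{t_0}^{t} v(\xi, t)\, dt \tag{5.30} \]
Differentiating, we have
\[ \frac{\partial x}{\partial \xi} = y = 1 + \int_{t_0}^{t} \frac{\partial v(\xi, t)}{\partial \xi}\, dt \tag{5.31} \]
Previously we had from the continuity equation in one dimension that
\[ \rho = \frac{\rho_0}{\exp\left\{ \int_{t_0}^{t} \frac{\partial}{\partial x}\, v(\varphi(t, \xi, t_0), t)\, dt \right\}} \]
Now from (5.29) and (5.31), we see that this can be expressed as
\[ \rho = \frac{\rho_0(\xi)}{\dfrac{\partial x}{\partial \xi}} = \frac{\rho_0(\xi)}{1 + \int_{t_0}^{t} \dfrac{\partial v(\xi, t)}{\partial \xi}\, dt} \]
Using (5.31) in (??), we get
\[ \rho = \frac{\rho_0(\xi)}{1 + \int_{t_0}^{t} \dfrac{\partial v(\xi, t)}{\partial \xi}\, dt} \tag{5.32} \]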
Note that
\[ \int_{t_0}^{t} \frac{\partial v}{\partial \xi}\, dt \neq \int_{t_0}^{t} \frac{\partial v}{\partial x}\, dt \]
as these are two different functions of time.
Now let us try to visualize these results. The equation that shows how x moves
as a function of the initial position $\xi$ is given in (5.30)
\[ x = \xi + \int_{t_0}^{t} v(\xi, t)\, dt \]
The equation (5.30) suggests that at each time t, there is a set of values for v as x
varies, as well as a set of values for v as $\xi$ varies. To illustrate what is going on, let
us consider the simplest case where $v = v(\xi)$ (i.e. independent of time). In this case
\[ x = \xi + [v(\xi)]\, [t - t_0] \tag{5.33} \]
and
\[ \frac{dx}{d\xi} = 1 + \frac{dv}{d\xi}\, (t - t_0) \tag{5.34} \]
At the critical points, $\dfrac{dx}{d\xi} = 0$, which gives
\[ 1 + \frac{dv}{d\xi}\, (t - t_0) = 0 \;\Rightarrow\; t - t_0 = -\frac{1}{v'(\xi)} \]
therefore
\[ x = \xi - \frac{v(\xi)}{v'(\xi)}, \qquad t = t_0 - \frac{1}{v'(\xi)} \tag{5.35} \]
These equations above are a parametric representation of "caustics". In optics, a
caustic is the envelope of light rays that are reflected and refracted off a curved
surface or object and projected onto another surface. It is the surface to which each
of the light rays is tangent (cusp singularities), resulting in a curve of concentrated
light. Such concentration of light (especially sunlight) can burn, hence the name
"caustic". A common situation where caustics are visible is when light shines on a
drinking glass. The glass casts a shadow, but also produces a curved region of bright
light. Rippling caustics are commonly formed when light shines through waves on a
body of water, such as on the surface of a swimming pool.
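
The parametric form (5.35) is straightforward to tabulate. The sketch below is an illustration under assumptions that are not in the notes: it takes the hypothetical velocity profile $v(\xi) = -\sin\xi$ with $t_0 = 0$, for which $v'(\xi) = -\cos\xi$, and evaluates the caustic point for a few values of $\xi$. The function name `caustic_point` is also hypothetical.

```python
import math

def caustic_point(xi, v, dv, t0=0.0):
    """Point (x, t) on the caustic from (5.35) for a time-independent v(xi).

    v is the velocity profile and dv its derivative with respect to xi;
    dv(xi) must be non-zero.
    """
    slope = dv(xi)
    return xi - v(xi) / slope, t0 - 1.0 / slope

# Hypothetical profile chosen for illustration only
v = lambda xi: -math.sin(xi)
dv = lambda xi: -math.cos(xi)

for xi in (-0.6, -0.3, 0.0, 0.3, 0.6):
    x, t = caustic_point(xi, v, dv)
    print(f"xi = {xi:+.1f}   x = {x:+.4f}   t = {t:.4f}")
```

For this profile $t = 1/\cos\xi$, so the earliest time on the caustic occurs at $\xi = 0$, where $v''(\xi) = 0$; the two branches for $\xi < 0$ and $\xi > 0$ meet there.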
In order to demonstrate this, we should investigate the critical points of the sys-
d2 v d2 x
tem. Di¤erentiating (5:34) ; we get 2 (t t0 ) = 2 : At the critical points (xcr ; tcr ) ;
d d
the …rst and second derivatives of x must be zero, meaning that

1+v ( cr ) (tcr t0 ) = 0; v ( cr ) =0

Now di¤erentiating (5:35) ; we have

dt v ( ) dx [v ( )]2 [v ( )] [v ( )] [v ( )] [v ( )]
= ; =1 =
d [v ( )]2 d [v ( )]2
[v ( )]2
Hence at critical points (xcr ; tcr ) ; we have
dt v ( cr ) dx [v ( cr )] [v ( cr )]
= = 0; = =0
d cr
[v ( cr )]2 d cr
[v ( cr )]2
Near the critical point, if we expand t in Taylor series, we have
2 3
t = tcr + ( cr ) + ( cr ) + ( cr ) + :::
dt
Now = 0 since d
= 0: If we denote cr = ; we have
cr

2 3
t tcr = + + ::: = Y (5.36)

Also, if we expand x in Taylor series, we have


2 3
x = xcr + A ( cr ) +B( cr ) +C( cr ) + :::

but we know that at the critical points (xcr ; tcr ) ; the …rst and second derivatives of
x must be zero, hence A = B = 0; yielding
3
x xcr = C + ::: = S (5.37)

Therefore, we may say that


2 3
= a1 Y + b1 S = u; = a2 Y + b 2 S = v (5.38)
90 CHAPTER 5. CONCEPTS IN MECHANICS

Figure 5.6: Structure of caustic is a semi-cubic parabola


5.2. LAGRANGIAN VS EULERIAN EQUATIONS OF MOTION 91

and it follows that


u3 = v 2 ) u = v 2=3
We see here that the structure of the caustic is a semi-cubic parabola, referred to as a cusp singularity (see Figure 5.6). Now let us see what happens to the density at the caustics. From (5.32) we have that
\[ \rho = \frac{\rho_0(\xi)}{\partial x/\partial \xi}. \]
Now from (5.37), $x - x_{cr} = C\delta^3 + \dots = S$, and since $\delta = \xi - \xi_{cr}$, we may say that
\[ \frac{dx}{d\xi} \sim (\xi - \xi_{cr})^2. \]
Hence from (5.38), it can be deduced that
\[ \frac{dx}{d\xi} \sim (x - x_{cr})^{2/3}. \]
This gives an expression for the density
\[ \rho \sim \frac{\rho_0}{(x - x_{cr})^{2/3}} \]
which means that as we approach the critical point, the density approaches $\infty$, demonstrating further why this type of critical point is referred to as a cusp singularity.
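
The $-2/3$ scaling can be checked numerically. The following is a minimal sketch, assuming that only the cubic term of (5.37) is kept and that the offset $x - x_{cr}$ and the coefficient $C$ are positive; the function names and the default values of `C` and `rho0` are illustrative and do not come from the notes.

```python
import math

def density_near_caustic(x_offset, C=1.0, rho0=1.0):
    """Leading-order density at a distance x_offset = x - x_cr > 0 from the caustic.

    Assumes x - x_cr = C * delta**3 with C > 0 (only the cubic term is kept).
    C and rho0 are illustrative values, not taken from the notes.
    """
    delta = (x_offset / C) ** (1.0 / 3.0)   # invert x - x_cr = C * delta^3
    dx_dxi = 3.0 * C * delta ** 2           # derivative of C * delta^3 with respect to delta
    return rho0 / dx_dxi

def loglog_slope(x_a, x_b):
    """Slope of log(density) against log(x - x_cr) between two offsets."""
    rise = math.log(density_near_caustic(x_b)) - math.log(density_near_caustic(x_a))
    return rise / (math.log(x_b) - math.log(x_a))

slope = loglog_slope(1e-6, 1e-3)
print(slope)  # close to -2/3
```

The slope on a log-log plot is the exponent of the power law, so a value near $-2/3$ is consistent with the density estimate above.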

5.2.2 Exercises
1. For the particular case
\[ x_0 = x(t_0) = \xi \quad \text{and} \quad v = \frac{1}{1 + \xi^2} \]
use the equation
\[ x = \xi + \int_{t_0}^{t} v(\xi, t)\,dt \]
to find the parametric representation of the caustics. Hence find the critical point $(x_{cr}, t_{cr})$ at which the caustic appears (you should get two points, but one of the critical points is unrealistic as it is not possible to have a negative value for $t_{cr}$).
2. For the velocity distribution of particles in Lagrange coordinates
v = 2 2t
find the equation for the caustics and the density in Euler coordinates $(x, t)$.
Chapter 6

Linear Stability Analysis

6.1 Introduction
Stability theory addresses the behaviour of trajectories of dynamical systems subjected to small perturbations of initial conditions. A dynamical system is said to be stable if its trajectories do not change too much under these small perturbations. One of the key ideas in stability theory is that the qualitative behavior of an orbit under perturbations can be analyzed using the linearization of the system near the orbit. If the corresponding eigenvalues are negative real numbers or complex numbers with negative real parts, then the point is called a stable attracting fixed point since all nearby points converge to it at an exponential rate.
At each equilibrium point of a smooth dynamical system with an $N$-dimensional phase space, there is a corresponding $N \times N$ matrix $M$ with eigenvalues that characterize the behavior of the nearby points. Whenever none of the eigenvalues of $M$ is purely imaginary (or zero), the attracting directions are related to the eigenspaces of $M$ whose eigenvalues have negative real part, and the repelling directions are related to the eigenspaces of $M$ whose eigenvalues have positive real part. If a mechanical system is in a stable equilibrium state then a small push will result in a localized motion, for example, small oscillations as in the case of a pendulum which is initially at rest. In a system with damping, a stable equilibrium state is said to be asymptotically stable. On the other hand, for an unstable equilibrium, such as a ball resting on top of a hill, a small push results in motion with a large amplitude that may or may not converge to the original state. There are useful tests of stability for the case of a linear system. Stability of a nonlinear system can often be inferred from a detailed study of the corresponding linearized system.
For purposes of illustration, let us consider an oscillating pendulum of length $L$ (see Figure 6.1). The equation representing the oscillating motion of the pendulum (subject to friction) is
\[ \ddot{\theta} + \gamma\dot{\theta} + \omega_0^2 \sin\theta = 0 \]
where $\omega_0$ is the frequency of oscillation, and $\gamma$ is related to the friction in the system, with
\[ \omega_0^2 = \frac{g}{L}. \]
We may write this as a system of first order equations
\[ \dot{\theta} = \psi, \qquad \dot{\psi} = -\gamma\psi - \omega_0^2 \sin\theta \tag{6.1} \]

Figure 6.1: Simple oscillating pendulum

At steady state, all the time derivatives are zero, so
\[ \psi_0 = 0, \qquad \sin\theta_0 = 0 \;\Rightarrow\; \theta_0 = n\pi, \quad n \in \mathbb{Z} \]
as shown in Figure 6.2.

Figure 6.2: The steady states of the oscillating pendulum are nodes along the $\theta$ axis

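We now need to determine whether the steady states are stable. Let us consider a small perturbation from the steady state
\[ \theta = \theta_0 + \tilde{\theta}, \qquad \psi = \psi_0 + \tilde{\psi} \]
where $\tilde{\theta}$ and $\tilde{\psi}$ are very small perturbations from the steady state. For the steady state $\theta_0 = \psi_0 = 0$, we get $\theta = \tilde{\theta}$, $\psi = \tilde{\psi}$, and the system of differential equations (6.1) becomes
\[ \dot{\tilde{\theta}} = \tilde{\psi}, \qquad \dot{\tilde{\psi}} = -\gamma\tilde{\psi} - \omega_0^2 \sin\tilde{\theta}. \]
Since $\tilde{\theta}$ is very small, $\sin\tilde{\theta} \simeq \tilde{\theta}$, which gives
\[ \dot{\tilde{\theta}} = \tilde{\psi}, \qquad \dot{\tilde{\psi}} = -\gamma\tilde{\psi} - \omega_0^2\,\tilde{\theta}. \]
This can be represented as
\[ \frac{d}{dt}\begin{bmatrix} \tilde{\theta} \\ \tilde{\psi} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\omega_0^2 & -\gamma \end{bmatrix} \begin{bmatrix} \tilde{\theta} \\ \tilde{\psi} \end{bmatrix} \tag{6.2} \]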
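
How good the small-angle replacement $\sin\tilde{\theta} \simeq \tilde{\theta}$ is can be explored numerically. The sketch below is an illustration only: it integrates the nonlinear system (6.1) and the linearized system (6.2) side by side with a classical fourth-order Runge-Kutta step. The helper names, the step size and the parameter values $\gamma = 1$, $\omega_0 = 2$ (the values quoted on the phase portraits) are assumptions made for this example.

```python
import math

def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for d(state)/dt = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def max_angle_gap(theta_start, gamma=1.0, omega0=2.0, dt=0.001, steps=5000):
    """Largest difference in angle between the nonlinear and linearized pendulum."""
    def nonlinear(s):
        return [s[1], -gamma * s[1] - omega0 ** 2 * math.sin(s[0])]

    def linear(s):
        return [s[1], -gamma * s[1] - omega0 ** 2 * s[0]]

    a = [theta_start, 0.0]   # nonlinear state (theta, psi), released from rest
    b = [theta_start, 0.0]   # linearized state
    gap = 0.0
    for _ in range(steps):
        a = rk4_step(nonlinear, a, dt)
        b = rk4_step(linear, b, dt)
        gap = max(gap, abs(a[0] - b[0]))
    return gap

print(max_angle_gap(0.1), max_angle_gap(1.0))
```

For a small release angle the two trajectories stay close, while for a release angle of order one the gap is much larger, which is why the linearized system is only trusted near the steady state.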

As this system is linear with constant coefficients, it is invariant under a shift in time. That means that if $t \to t + A$, then a solution vector $\vec{a}$ becomes $C\vec{a}$, where $A$ and $C$ are constants. This suggests that the solution to the problem is in the form of an exponential function, since
\[ \exp(\sigma[t + A]) = \exp(\sigma A)\exp(\sigma t) = C\exp(\sigma t). \]
We therefore look for solutions of the form
\[ \begin{bmatrix} \tilde{\theta} \\ \tilde{\psi} \end{bmatrix} = \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} \exp(\sigma t) \tag{6.3} \]
When $\sigma > 0$, the solution increases in amplitude without bound as $t \to \infty$, so the steady state is said to be unstable. Conversely when $\sigma < 0$, the solution decays to zero as $t \to \infty$, so the steady state is stable. If $\sigma$ is a complex number and $\mathrm{Re}(\sigma) > 0$, the steady state is unstable, whereas when $\mathrm{Re}(\sigma) < 0$, the steady state is stable.
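In order to find $\sigma$, we substitute (6.3) into (6.2) to get
\[ \sigma \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} \exp(\sigma t) = \begin{bmatrix} 0 & 1 \\ -\omega_0^2 & -\gamma \end{bmatrix} \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} \exp(\sigma t). \]
Using the notation
\[ M = \begin{bmatrix} 0 & 1 \\ -\omega_0^2 & -\gamma \end{bmatrix} \]
and dividing through by $\exp(\sigma t)$, we have
\[ (M - \sigma I) \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} = 0 \tag{6.4} \]
Here $I$ is the $2 \times 2$ identity matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, and
\[ (M - \sigma I) = \begin{bmatrix} -\sigma & 1 \\ -\omega_0^2 & -\gamma - \sigma \end{bmatrix}. \]
In order for (6.4) to have a nontrivial solution, the determinant of $(M - \sigma I)$ must be zero, which leads to the characteristic equation
\[ -\sigma(-\gamma - \sigma) + \omega_0^2 = 0. \]
Solving this for $\sigma$, we get eigenvalues
\[ \sigma = -\frac{\gamma}{2} \pm \sqrt{\frac{\gamma^2}{4} - \omega_0^2}. \]
When $\gamma^2 < 4\omega_0^2$, $\sigma$ is the complex number
\[ \sigma = -\frac{\gamma}{2} \pm \omega i \]
where
\[ \omega = \sqrt{\omega_0^2 - \frac{\gamma^2}{4}}. \]
As $\mathrm{Re}(\sigma) < 0$, we have a stable spiral point at the origin (see Figure 6.3).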
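
The eigenvalues at the origin can be evaluated directly for a particular case. The following is a minimal sketch using the values $\gamma = 1$ and $\omega_0 = 2$ that label the phase portrait in Figure 6.3; the helper `eigenvalues_2x2` is a name introduced here for illustration and relies only on the trace and determinant of a $2 \times 2$ matrix.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from its trace and determinant."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace / 4.0 - det)
    return trace / 2.0 + root, trace / 2.0 - root

def origin_eigenvalues(gamma, omega0):
    """Eigenvalues of M = [[0, 1], [-omega0**2, -gamma]] from (6.2)."""
    return eigenvalues_2x2(0.0, 1.0, -omega0 ** 2, -gamma)

s1, s2 = origin_eigenvalues(gamma=1.0, omega0=2.0)
print(s1, s2)  # real part -gamma/2, imaginary parts +/- sqrt(omega0^2 - gamma^2/4)
```

Both eigenvalues have real part $-\gamma/2 < 0$ and a nonzero imaginary part, which is the stable spiral described above.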
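Let us look next at the equilibrium point (steady state) $\theta_0 = \pi$, $\psi_0 = 0$. Let us consider a small perturbation from the steady state
\[ \theta = \pi + \tilde{\theta}, \qquad \psi = 0 + \tilde{\psi} \]
where $\tilde{\theta}$ and $\tilde{\psi}$ are very small perturbations from the steady state. The system of differential equations (6.1) becomes
\[ \dot{\tilde{\theta}} = \tilde{\psi}, \qquad \dot{\tilde{\psi}} = -\gamma\tilde{\psi} - \omega_0^2 \sin(\pi + \tilde{\theta}). \]
Now
\[ \sin(\pi + \tilde{\theta}) = \sin\pi\cos\tilde{\theta} + \cos\pi\sin\tilde{\theta} = -\sin\tilde{\theta} \]
and since $\tilde{\theta}$ is very small, $\sin\tilde{\theta} \simeq \tilde{\theta}$, and the system of equations is now
\[ \dot{\tilde{\theta}} = \tilde{\psi}, \qquad \dot{\tilde{\psi}} = -\gamma\tilde{\psi} + \omega_0^2\,\tilde{\theta} \tag{6.5} \]
Again, we look for solutions of the form
\[ \begin{bmatrix} \tilde{\theta} \\ \tilde{\psi} \end{bmatrix} = \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} \exp(\sigma t) \tag{6.6} \]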

Figure 6.3: Stable spiral point at the origin - all nearby orbits spiral into the origin. (Phase portrait of x' = y, y' = - g y - w^2 x with g = 1, w = 2; horizontal axis x, vertical axis y.)
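
Substituting (6.6) into (6.5) we get
\[ \sigma \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} \exp(\sigma t) = \begin{bmatrix} 0 & 1 \\ \omega_0^2 & -\gamma \end{bmatrix} \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} \exp(\sigma t). \]
Using the notation
\[ M_2 = \begin{bmatrix} 0 & 1 \\ \omega_0^2 & -\gamma \end{bmatrix} \]
and dividing through by $\exp(\sigma t)$, we have
\[ (M_2 - \sigma I) \begin{bmatrix} \tilde{\theta}_0 \\ \tilde{\psi}_0 \end{bmatrix} = 0 \tag{6.7} \]
Here $I$ is the $2 \times 2$ identity matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, and
\[ (M_2 - \sigma I) = \begin{bmatrix} -\sigma & 1 \\ \omega_0^2 & -\gamma - \sigma \end{bmatrix}. \]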

In order for (6.7) to have a nontrivial solution, the determinant of $(M_2 - \sigma I)$ must be zero, which leads to the characteristic equation
\[ -\sigma(-\gamma - \sigma) - \omega_0^2 = 0. \]
Solving this for $\sigma$, we get eigenvalues
\[ \sigma = -\frac{\gamma}{2} \pm \sqrt{\frac{\gamma^2}{4} + \omega_0^2}. \]
This time we have $\frac{\gamma^2}{4} + \omega_0^2$, which is always positive, so the solutions are real values $\sigma_1 > 0$ and $\sigma_2 < 0$. This indicates the presence of an unstable saddle point at $\theta_0 = \pi$, $\psi_0 = 0$.
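
The same linearization can be repeated at any steady state $\theta_0 = n\pi$, where $\sin(n\pi + \tilde{\theta}) \simeq \cos(n\pi)\,\tilde{\theta}$. The sketch below classifies each equilibrium from the trace and determinant of the resulting matrix; the function name and the returned labels are illustrative choices. Note that it also reports a stable node when $\gamma^2 \geq 4\omega_0^2$, a strongly damped case that the spiral analysis above (which assumed $\gamma^2 < 4\omega_0^2$) does not cover.

```python
import cmath

def classify_equilibrium(n, gamma, omega0):
    """Classify the pendulum equilibrium (theta0, psi0) = (n*pi, 0)."""
    cos_theta0 = 1.0 if n % 2 == 0 else -1.0   # cos(n*pi)
    # Linearized matrix: [[0, 1], [-omega0^2 * cos(n*pi), -gamma]]
    trace = -gamma
    det = omega0 ** 2 * cos_theta0
    root = cmath.sqrt(trace ** 2 / 4.0 - det)
    s1, s2 = trace / 2.0 + root, trace / 2.0 - root
    if s1.real > 0 or s2.real > 0:
        return "saddle" if det < 0 else "unstable"
    if abs(s1.imag) > 0:
        return "stable spiral"
    return "stable node"

for n in range(-2, 3):
    print(n, classify_equilibrium(n, gamma=1.0, omega0=2.0))
```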
It can be proven that in general for this system, all even multiples of $\pi$ are stable spiral points, and all odd multiples of $\pi$ are unstable saddle points (see the numerical solution provided in Figure 6.4).

Figure 6.4: Equilibrium points along the $\theta$ axis (even multiples of $\pi$ are stable spirals and odd multiples of $\pi$ are unstable saddle points). (Phase portrait of x' = y, y' = - g y - (w^2) sin(x) with g = 1, w = 2; horizontal axis x, vertical axis y.)

Application: Flutter in Aircraft Wings

Flutter is the term used to describe the potentially destructive vibration where aerodynamic forces on an object couple with a structure's natural mode of vibration to produce rapid periodic motion. This phenomenon may occur in any object subjected to a strong fluid flow. Structures exposed to aerodynamic forces - including aircraft wings - are designed carefully within known parameters to prevent flutter from occurring. In complex structures where both the aerodynamics and the mechanical properties of the structure are not fully understood, flutter can only be avoided through detailed experimental testing. A careful evaluation of the design of the entire aircraft is crucial, since slight changes in the mass distribution of an aircraft or the stiffness of a given component may induce flutter in an apparently unrelated aerodynamic component. Flutter can develop uncontrollably with great speed and may cause serious damage to or lead to the eventual destruction of the aircraft.
In the absence of aerodynamic forces, the associated system of equations is
\[ I_1\ddot{x}_1 + I_{12}\ddot{x}_2 + k_1 x_1 = 0 \]
\[ I_{12}\ddot{x}_1 + I_2\ddot{x}_2 + k_2 x_2 = 0 \]
where $x_1$ refers to the bending motions, $x_2$ to the twisting motions, $I_1$ is the moment of inertia for $x_1$, $I_2$ is the moment of inertia for $x_2$, and $I_{12}$ is the moment of inertia for the interaction between the two motions. When the stability of this system is investigated, the corresponding eigenvalues are imaginary, suggesting that the solutions are bounded although they oscillate. However, when there are lift forces included (when the aircraft is in flight), there are additional terms in the equations which may result in flutter.
\[ I_1\ddot{x}_1 + I_{12}\ddot{x}_2 + k_1 x_1 - b_1 v^2 x_2 = 0 \]
\[ I_{12}\ddot{x}_1 + I_2\ddot{x}_2 + k_2 x_2 - b_2 v^2 x_2 = 0 \]
An approximation for the critical velocity $v_{cr}$ at which flutter occurs may be found by carrying out a linear stability analysis on the above system of equations.
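
A possible way to carry out that analysis numerically is sketched below. Assuming solutions of the form $x_j = X_j \exp(\sigma t)$, the values of $\sigma^2$ are the eigenvalues of minus the inverse inertia matrix times the stiffness matrix, and the motion stays bounded only while both values of $\sigma^2$ are real and negative. Everything specific here is an assumption made for illustration: the dimensionless parameter values, the minus signs in front of the lift terms, and the function names. With these particular values, boundedness is lost when the two values of $\sigma^2$ merge and become complex.

```python
import cmath

# Hypothetical, dimensionless parameters chosen only for illustration.
PARAMS = dict(I1=1.0, I2=1.0, I12=-0.2, k1=1.0, k2=4.0, b1=0.5, b2=0.05)

def sigma_squared(v, p=PARAMS):
    """Return the two values of sigma**2 for x = X exp(sigma t) at flow speed v."""
    c1 = -p["b1"] * v ** 2             # coupling of x2 into the first equation
    c2 = p["k2"] - p["b2"] * v ** 2    # effective stiffness in the second equation
    det_m = p["I1"] * p["I2"] - p["I12"] ** 2
    # Entries of A = -(inertia matrix)^(-1) (stiffness matrix).
    a11 = -(p["I2"] * p["k1"]) / det_m
    a12 = -(p["I2"] * c1 - p["I12"] * c2) / det_m
    a21 = (p["I12"] * p["k1"]) / det_m
    a22 = -(-p["I12"] * c1 + p["I1"] * c2) / det_m
    half_trace = 0.5 * (a11 + a22)
    det_a = a11 * a22 - a12 * a21
    root = cmath.sqrt(half_trace ** 2 - det_a)
    return half_trace + root, half_trace - root

def is_bounded(v, p=PARAMS, tol=1e-12):
    """True when both sigma**2 are real and negative (sigma purely imaginary)."""
    return all(abs(s.imag) < tol and s.real < 0 for s in sigma_squared(v, p))

def critical_speed(v_low=0.0, v_high=10.0, iterations=60):
    """Bisect for the speed at which bounded oscillations are first lost."""
    for _ in range(iterations):
        mid = 0.5 * (v_low + v_high)
        if is_bounded(mid):
            v_low = mid
        else:
            v_high = mid
    return 0.5 * (v_low + v_high)

v_cr = critical_speed()
print(v_cr)
```

The bisection assumes the motion is bounded at `v_low` and unbounded at `v_high`, with a single transition in between, which holds for the illustrative values above but would need checking for other parameter sets.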

6.2 An Application - A Theory for the Formation of Stars

Consider a spherical mass of gas in space of radius $r$ under the influence of gravity. If the density of the gas is $\rho(\vec{x}, t)$ and the mass is $m$, then as density = mass/volume, we have
\[ m = \frac{4}{3}\pi r^3 \rho. \]
The continuity equation is
\[ \frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\vec{v}) = 0 \]
and the Euler equation is
\[ \frac{\partial \vec{v}}{\partial t} + (\vec{v}\cdot\nabla)\vec{v} = -\frac{1}{\rho}\nabla P + \vec{f} \]
where $P$ is pressure. Since gravitational force is a potential force, then
\[ \vec{f} = -\nabla\phi \]
and
\[ \vec{f} = -\frac{Gm}{r^3}\vec{r} = -\frac{G\,\frac{4}{3}\pi r^3 \rho}{r^3}\vec{r} = -\frac{4}{3}\pi G\rho\,\vec{r} \]
represents the force acting on a gas molecule taken at a point in the sphere with radius vector $\vec{r}$. It follows that
\[ \nabla\phi = \frac{4}{3}\pi G\rho\,\vec{r}. \]
Taking the divergence of this, we get
\[ \nabla^2\phi = \frac{4}{3}\pi G\rho\,\nabla\cdot\vec{r}. \]
Since
\[ \nabla\cdot\vec{r} = \frac{\partial}{\partial x}x + \frac{\partial}{\partial y}y + \frac{\partial}{\partial z}z = 3 \]
we get
\[ \nabla^2\phi = \Delta\phi = 4\pi G\rho \]
which is the equation for gravitational potential known as the Poisson equation (*named after the French mathematician, geometer and physicist Siméon-Denis Poisson).
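The governing equations for this system are therefore
\[ \frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\vec{v}) = 0 \tag{6.8} \]
\[ \frac{\partial \vec{v}}{\partial t} + (\vec{v}\cdot\nabla)\vec{v} = -\frac{1}{\rho}\nabla P - \nabla\phi \tag{6.9} \]
\[ \Delta\phi = 4\pi G\rho \tag{6.10} \]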
We can now investigate the stability of this spherical mass of gas. We first must determine the steady state solution. At steady state
\[ \rho = \rho_0 = \text{constant} \]
and
\[ \vec{v} = 0 \]
since initially the gas does not move. We will work in one dimension (with respect to the spatial variable $x$) for simplicity. Hence, $v = v(x,t)$, $\rho = \rho(x,t)$ and $\phi = \phi(x,t)$. At steady state then, we have from (6.10)
\[ \frac{\partial^2 \phi_0}{\partial x^2} = 4\pi G\rho_0. \]
Solving this, we get
\[ \phi_0 = 2\pi G\rho_0 x^2 + Ax + B \]
hence clearly we see that the gravitational potential at steady state is not constant (it actually varies with $x$ - like a quadratic graph with a minimum point). We make an assumption for simplicity that $\phi_0$ = constant. At steady state:
\[ \vec{v} = 0, \qquad \rho = \rho_0 = \text{constant}, \qquad \phi = \phi_0 = \text{constant}, \qquad P = P_0 = \text{constant}. \]
Let us linearize about this steady state. We take
\[ \vec{v} = \tilde{\vec{v}}, \qquad \rho = \rho_0 + \tilde{\rho}, \qquad \phi = \phi_0 + \tilde{\phi}, \qquad P = P_0 + \tilde{P}. \]
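Substituting these into the one dimensional equivalent of the governing equations (6.8) to (6.10) and linearizing, we get (eventually)
\[ \frac{\partial \tilde{\rho}}{\partial t} + \rho_0\frac{\partial \tilde{v}}{\partial x} = 0 \tag{6.11} \]
\[ \frac{\partial \tilde{v}}{\partial t} = -\frac{1}{\rho_0}\frac{d\tilde{P}}{dx} - \frac{d\tilde{\phi}}{dx} \tag{6.12} \]
\[ \frac{\partial^2 \tilde{\phi}}{\partial x^2} = 4\pi G\tilde{\rho} \tag{6.13} \]
Since $P = P(\rho)$, i.e., pressure is density dependent for a gas (this is considered a constitutive equation), we have
\[ \frac{d\tilde{P}}{dx} = \left(\frac{\partial P}{\partial \rho}\right)_0 \frac{\partial \tilde{\rho}}{\partial x} \tag{6.14} \]
where $\left(\frac{\partial P}{\partial \rho}\right)_0$ is the derivative at steady state. Using (6.14) in (6.12), we have
\[ \frac{\partial \tilde{v}}{\partial t} = -\frac{1}{\rho_0}\left(\frac{\partial P}{\partial \rho}\right)_0 \frac{\partial \tilde{\rho}}{\partial x} - \frac{d\tilde{\phi}}{dx} \tag{6.15} \]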
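As explained in previous examples, the solutions are in the form of exponential functions:
\[ \tilde{\rho} = R\exp(\sigma t + ikx), \qquad \tilde{v} = V\exp(\sigma t + ikx), \qquad \tilde{\phi} = F\exp(\sigma t + ikx) \tag{6.16} \]
Substituting (6.16) into (6.11), (6.15), and (6.13), we get after dropping the exponentials
\[ \sigma R + ik\rho_0 V = 0 \]
\[ \frac{1}{\rho_0}\left(\frac{\partial P}{\partial \rho}\right)_0 ikR + \sigma V + ikF = 0 \]
\[ 4\pi G R + k^2 F = 0. \]
These can be expressed in matrix form $A\vec{b} = 0$, where
\[ A = \begin{bmatrix} \sigma & ik\rho_0 & 0 \\ \frac{1}{\rho_0}\left(\frac{\partial P}{\partial \rho}\right)_0 ik & \sigma & ik \\ 4\pi G & 0 & k^2 \end{bmatrix}, \qquad \vec{b} = \begin{bmatrix} R \\ V \\ F \end{bmatrix}. \]
In order for a nontrivial solution to exist, we need the determinant of the matrix $A$ to be zero
\[ \begin{vmatrix} \sigma & ik\rho_0 & 0 \\ \frac{1}{\rho_0}\left(\frac{\partial P}{\partial \rho}\right)_0 ik & \sigma & ik \\ 4\pi G & 0 & k^2 \end{vmatrix} = 0 \]
which gives
\[ \sigma^2 k^2 - ik\rho_0\left[\frac{ik^3}{\rho_0}\left(\frac{\partial P}{\partial \rho}\right)_0 - 4\pi G\,ik\right] = 0. \]
This simplifies to
\[ \sigma^2 = -k^2\left(\frac{\partial P}{\partial \rho}\right)_0 + 4\pi G\rho_0 \]
which is an expression that relates the growth rate $\sigma$ to the wavenumber $k$ and the other parameters in the equations. This expression is referred to as the dispersion relation.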
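In the absence of gravity (i.e. $G = 0$), since $\left(\frac{\partial P}{\partial \rho}\right)_0 > 0$, we get
\[ \sigma = \pm ik\sqrt{\left(\frac{\partial P}{\partial \rho}\right)_0}. \]
This means that we have waves with frequency
\[ \omega = k\sqrt{\left(\frac{\partial P}{\partial \rho}\right)_0} \]
and speed is frequency divided by the wavenumber, so the speed is
\[ \frac{\omega}{k} = \sqrt{\left(\frac{\partial P}{\partial \rho}\right)_0}. \]
Now by definition,
\[ \left(\frac{\partial P}{\partial \rho}\right)_0 = c^2 \]
where $c$ is the speed of sound. The result of having zero gravity is therefore sound waves that propagate with the speed of sound.
In the presence of gravity, i.e. $G \neq 0$, when
\[ k^2 < \frac{4\pi G\rho_0}{c^2} \]
we get (*note that $k = \frac{2\pi}{\lambda}$ where $\lambda$ is the wavelength of the waves)
\[ \frac{2\pi}{\lambda} = k < \sqrt{\frac{4\pi G\rho_0}{c^2}} \]
hence
\[ \lambda > c\sqrt{\frac{\pi}{G\rho_0}} = \lambda_{critical}. \]
It follows that one solution of $\sigma$ will be real and positive, while the other will be real and negative. One solution will therefore be bounded and the other unbounded. This leads to sudden changes in the density of the molecular cloud of gas, which is thought to be the reason for the formation of stars in space.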
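
The dispersion relation and the critical wavelength are easy to evaluate for a given cloud. The sketch below is illustrative only: the sound speed and density passed in the example call are round numbers chosen for demonstration, not values from the notes, and the function names are assumptions.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (m^3 kg^-1 s^-2)

def growth_rate_squared(k, c, rho0):
    """sigma^2 = 4 pi G rho0 - c^2 k^2 from the dispersion relation."""
    return 4.0 * math.pi * G * rho0 - (c * k) ** 2

def critical_wavelength(c, rho0):
    """Wavelength above which sigma^2 > 0, lambda_critical = c sqrt(pi / (G rho0))."""
    return c * math.sqrt(math.pi / (G * rho0))

# Illustrative values: sound speed 200 m/s, density 1e-18 kg/m^3.
c_sound, rho_cloud = 200.0, 1e-18
lam = critical_wavelength(c_sound, rho_cloud)
k_crit = 2.0 * math.pi / lam
print(lam, growth_rate_squared(0.5 * k_crit, c_sound, rho_cloud))
```

Perturbations with wavenumber below `k_crit` (wavelength above `lam`) give a positive $\sigma^2$ and therefore one growing mode, while shorter wavelengths give a negative $\sigma^2$ and oscillate as sound waves modified by gravity.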

6.2.1 Challenge Problem


Consider the same problem where the gas is viscous (instead of the Euler equation
you will need to use the Navier Stokes equation). Can you repeat this analysis in this
case, and predict what will happen?

6.3 Instabilities at the Interface of Two Fluid Layers

Consider two layers of ideal fluid of the same density superimposed on each other. The top fluid is moving to the right with a constant velocity, and the bottom fluid is at rest (see Figure 6.5). We expect that the interface between the two fluids will be deformed as the top fluid moves. For simplicity, we assume that there is no gravity or any external forces acting on the system under consideration, and that the pressure is the same for both fluids at the interface. We shall utilize the subscripts "1" and "2" to indicate the top and the bottom fluids respectively.

Figure 6.5: Two superimposed fluids. The top fluid moves with speed $U_0$, while the bottom fluid is at rest. The interface between the fluids will be perturbed.

The governing equations for any two superimposed ideal fluids of velocities $\vec{v}_1$ and $\vec{v}_2$ respectively will be the Euler equations
\[ \frac{\partial \vec{v}_1}{\partial t} + (\vec{v}_1\cdot\nabla)\vec{v}_1 = -\frac{1}{\rho}\nabla P_1 \]
\[ \frac{\partial \vec{v}_2}{\partial t} + (\vec{v}_2\cdot\nabla)\vec{v}_2 = -\frac{1}{\rho}\nabla P_2 \]
Also for simplicity, we assume that there is no surface tension at the interface. Let us denote the curved interface (curved as a result of the instability created by the moving fluid) between the two fluids by $y = \eta(x,t)$, where $y = 0$ if the interface is flat and unperturbed. At steady state (using the subscript zero to indicate steady state), using a bracketed vector type notation for the velocities:
\[ \eta_0 = 0, \qquad P_{10} = P_{20} = P_0, \qquad \vec{v}_{10} = (U_0, 0), \qquad \vec{v}_{20} = (0, 0). \]
We now perform a linear stability analysis. Consider a small perturbation from the steady state
\[ \eta = \eta_0 + \tilde{\eta} = \tilde{\eta} \]
\[ P_1 = P_{10} + \tilde{P}_1 = P_0 + \tilde{P}_1, \qquad P_2 = P_{20} + \tilde{P}_2 = P_0 + \tilde{P}_2 \]
\[ \vec{v}_1 = (U_0, 0) + (\tilde{u}_1, \tilde{v}_1), \qquad \vec{v}_2 = (0, 0) + (\tilde{u}_2, \tilde{v}_2). \]
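Substituting these into the governing equations and linearizing, we get
\[ \frac{\partial \tilde{u}_1}{\partial t} + U_0\frac{\partial \tilde{u}_1}{\partial x} = -\frac{1}{\rho}\frac{\partial \tilde{P}_1}{\partial x} \]
\[ \frac{\partial \tilde{v}_1}{\partial t} + U_0\frac{\partial \tilde{v}_1}{\partial x} = -\frac{1}{\rho}\frac{\partial \tilde{P}_1}{\partial y} \tag{6.17} \]
and
\[ \frac{\partial \tilde{u}_2}{\partial t} = -\frac{1}{\rho}\frac{\partial \tilde{P}_2}{\partial x} \]
\[ \frac{\partial \tilde{v}_2}{\partial t} = -\frac{1}{\rho}\frac{\partial \tilde{P}_2}{\partial y} \tag{6.18} \]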

Now let us look at the boundary conditions at the interface between the two fluids. We will linearize these boundary conditions to simplify the analysis (*note that this is another assumption). First of all, the pressures are the same in both fluids at the interface
\[ P_1 = P_2 \tag{6.19} \]
and the normal velocities are equal at the deformed interface
\[ v_{1n} = v_{2n} \tag{6.20} \]
We need to find a kinematic condition for the curved interface $y = \eta(x,t)$, as shown in Figure 6.6. If we take $\vec{v} = (u, v)$, where $u$ is the $x$ component of the velocity and $v$ is its $y$ component, then
\[ v = \frac{dy}{dt} = \frac{d\eta}{dt} = \frac{\partial \eta}{\partial t} + \frac{\partial \eta}{\partial x}\frac{dx}{dt} = \frac{\partial \eta}{\partial t} + \frac{\partial \eta}{\partial x}u. \]

Figure 6.6: Actual interface is curved, and is referred to as $y = \eta(x,t)$

The kinematic condition is therefore
\[ \frac{\partial \eta}{\partial t} + \frac{\partial \eta}{\partial x}u = v \tag{6.21} \]
At the interface $y = 0$, we can use the kinematic condition (6.21), take into account the small perturbation from the steady state and linearize the result to obtain for the top fluid
\[ \frac{\partial \tilde{\eta}}{\partial t} + U_0\frac{\partial \tilde{\eta}}{\partial x} = \tilde{v}_1\big|_{y=0} \tag{6.22} \]
Now the normal velocity at the interface is the projection of $\vec{v}$ onto the normal direction. This may be expressed as
\[ v_n = \vec{v}\cdot\vec{n} \]
where the normal $\vec{n}$ is
\[ \vec{n} = \frac{(-\eta_x, 1)}{\sqrt{1 + (\eta_x)^2}}. \]
Therefore the normal velocity is
\[ v_n = (u, v)\cdot\frac{(-\eta_x, 1)}{\sqrt{1 + (\eta_x)^2}} = \frac{-u\eta_x + v}{\sqrt{1 + (\eta_x)^2}}. \]
It follows that at the interface between any two fluids, continuity of normal velocity is expressed as
\[ \frac{-u_1\eta_x + v_1}{\sqrt{1 + (\eta_x)^2}} = \frac{-u_2\eta_x + v_2}{\sqrt{1 + (\eta_x)^2}} \]
which gives
\[ -u_1\eta_x + v_1 = -u_2\eta_x + v_2. \]
Taking into account the small perturbation from the steady state and linearizing, we eventually get from the above at $y = 0$
\[ -U_0\tilde{\eta}_x + \tilde{v}_1\big|_{y=0} = \tilde{v}_2\big|_{y=0} \tag{6.23} \]


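The next boundary condition comes from the linearized continuity equation
\[ \frac{\partial \tilde{u}_1}{\partial x} + \frac{\partial \tilde{v}_1}{\partial y} = 0, \qquad \frac{\partial \tilde{u}_2}{\partial x} + \frac{\partial \tilde{v}_2}{\partial y} = 0 \tag{6.24} \]
We also know that at the interface, $P_1(\tilde{\eta}) = P_2(\tilde{\eta})$. Making use of Taylor series expansions (as $\tilde{\eta}$ is very small), we have
\[ P_1(\eta) = P_{10}(\eta) + \tilde{P}_1(\eta) \simeq P_{10}(0) + \frac{\partial P_{10}}{\partial y}\eta + \tilde{P}_1(0) + \text{non-linear terms} \]
\[ P_2(\eta) = P_{20}(\eta) + \tilde{P}_2(\eta) \simeq P_{20}(0) + \frac{\partial P_{20}}{\partial y}\eta + \tilde{P}_2(0) + \text{non-linear terms} \]
As $P_{10} = P_{20} = P_0$ is constant at steady state, then $\frac{\partial P_{10}}{\partial y} = \frac{\partial P_{20}}{\partial y} = 0$ and we get
\[ P_0(\eta) + \tilde{P}_1(\eta) \simeq P_0(0) + \tilde{P}_1(0) \]
\[ P_0(\eta) + \tilde{P}_2(\eta) \simeq P_0(0) + \tilde{P}_2(0) \]
hence at $y = 0$ we get
\[ \tilde{P}_1(0) = \tilde{P}_2(0) \tag{6.25} \]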
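We are now ready to use the method of normal modes (i.e. taking the solutions to be of exponential form)
\[ \tilde{u}_1 = \Psi_1(y)\exp(\sigma t + ikx) \tag{6.26} \]
\[ \tilde{u}_2 = \Psi_2(y)\exp(\sigma t + ikx) \tag{6.27} \]
\[ \tilde{v}_1 = V_1(y)\exp(\sigma t + ikx) \tag{6.28} \]
\[ \tilde{v}_2 = V_2(y)\exp(\sigma t + ikx) \tag{6.29} \]
\[ \tilde{P}_1 = \Pi_1(y)\exp(\sigma t + ikx) \tag{6.30} \]
\[ \tilde{P}_2 = \Pi_2(y)\exp(\sigma t + ikx) \tag{6.31} \]
\[ \tilde{\eta} = \varphi\exp(\sigma t + ikx) \tag{6.32} \]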
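Let us first take the set of governing equations (6.17)
\[ \frac{\partial \tilde{u}_1}{\partial t} + U_0\frac{\partial \tilde{u}_1}{\partial x} = -\frac{1}{\rho}\frac{\partial \tilde{P}_1}{\partial x} \]
\[ \frac{\partial \tilde{v}_1}{\partial t} + U_0\frac{\partial \tilde{v}_1}{\partial x} = -\frac{1}{\rho}\frac{\partial \tilde{P}_1}{\partial y} \]
Differentiating the first equation with respect to $x$ and the second with respect to $y$ and summing the two results gives
\[ \frac{\partial}{\partial t}\left(\frac{\partial \tilde{u}_1}{\partial x} + \frac{\partial \tilde{v}_1}{\partial y}\right) + U_0\left(\frac{\partial^2 \tilde{u}_1}{\partial x^2} + \frac{\partial}{\partial y}\frac{\partial \tilde{v}_1}{\partial x}\right) = -\frac{1}{\rho}\left[\Delta\tilde{P}_1\right]. \]
However, from the linearized continuity equation
\[ \frac{\partial \tilde{u}_1}{\partial x} = -\frac{\partial \tilde{v}_1}{\partial y} \;\Rightarrow\; \frac{\partial^2 \tilde{u}_1}{\partial x^2} = -\frac{\partial}{\partial x}\frac{\partial \tilde{v}_1}{\partial y} = -\frac{\partial}{\partial y}\frac{\partial \tilde{v}_1}{\partial x} \]
hence we get
\[ \Delta\tilde{P}_1 = 0 \tag{6.33} \]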
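Similarly, from the next set of governing equations (6.18)
\[ \frac{\partial \tilde{u}_2}{\partial t} = -\frac{1}{\rho}\frac{\partial \tilde{P}_2}{\partial x}, \qquad \frac{\partial \tilde{v}_2}{\partial t} = -\frac{1}{\rho}\frac{\partial \tilde{P}_2}{\partial y}. \]
Differentiating the first equation with respect to $x$ and the second with respect to $y$ and summing the two results gives
\[ \frac{\partial}{\partial t}\left(\frac{\partial \tilde{u}_2}{\partial x} + \frac{\partial \tilde{v}_2}{\partial y}\right) = -\frac{1}{\rho}\left[\Delta\tilde{P}_2\right] \]
and from the linearized continuity equation (6.24)
\[ \frac{\partial \tilde{u}_2}{\partial x} = -\frac{\partial \tilde{v}_2}{\partial y} \]
hence we get
\[ \Delta\tilde{P}_2 = 0 \tag{6.34} \]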
Now using (6.30) in (6.33), we get
\[ -k^2\Pi_1\exp(\sigma t + ikx) + \frac{d^2\Pi_1}{dy^2}\exp(\sigma t + ikx) = 0. \]
Dropping the exponential terms
\[ -k^2\Pi_1 + \frac{d^2\Pi_1}{dy^2} = 0 \]
which is a second order differential equation with general solution (when $k > 0$)
\[ \Pi_1 = M_{11}\exp(ky) + M_{12}\exp(-ky). \]
As $\tilde{P}_1$ must decay as $y \to \infty$, we must have $M_{11} = 0$, resulting in
\[ \Pi_1 = M_{12}\exp(-ky). \]
Therefore we have
\[ \tilde{P}_1(y) = M_{12}\exp(-ky)\exp(\sigma t + ikx). \]
Similarly, using (6.31) in (6.34), and repeating the analysis (now requiring decay as $y \to -\infty$), we get eventually
\[ \tilde{P}_2(y) = M_{21}\exp(ky)\exp(\sigma t + ikx). \]
From the boundary condition (6.25), $\tilde{P}_1(0) = \tilde{P}_2(0)$, which gives
\[ M_{12} = M_{21} = M. \]
Therefore
\[ \tilde{P}_1(y) = M\exp(-ky)\exp(\sigma t + ikx) \tag{6.35} \]
\[ \tilde{P}_2(y) = M\exp(ky)\exp(\sigma t + ikx) \tag{6.36} \]
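Next, we substitute (6.26) and the result for $\tilde{P}_1(y)$ in (6.35) into
\[ \frac{\partial \tilde{u}_1}{\partial t} + U_0\frac{\partial \tilde{u}_1}{\partial x} = -\frac{1}{\rho}\frac{\partial \tilde{P}_1}{\partial x} \]
to get
\[ \frac{\partial}{\partial t}\left[\Psi_1(y)\exp(\sigma t + ikx)\right] + U_0\frac{\partial}{\partial x}\left[\Psi_1(y)\exp(\sigma t + ikx)\right] = -\frac{1}{\rho}\frac{\partial}{\partial x}\left[M\exp(-ky)\exp(\sigma t + ikx)\right]. \]
This is (after dropping the exponential terms)
\[ \sigma\Psi_1 + [ik\Psi_1]U_0 = -\frac{1}{\rho}\left[ikM\exp(-ky)\right] \]
which may be expressed as
\[ \Psi_1 = -\frac{ikM\exp(-ky)}{\rho(\sigma + ikU_0)} \tag{6.37} \]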

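Next substitute (6.28) and the result for $\tilde{P}_1(y)$ in (6.35) into
\[ \frac{\partial \tilde{v}_1}{\partial t} + U_0\frac{\partial \tilde{v}_1}{\partial x} = -\frac{1}{\rho}\frac{\partial \tilde{P}_1}{\partial y} \]
to get
\[ \frac{\partial}{\partial t}\left[V_1(y)\exp(\sigma t + ikx)\right] + U_0\frac{\partial}{\partial x}\left[V_1(y)\exp(\sigma t + ikx)\right] = -\frac{1}{\rho}\frac{\partial}{\partial y}\left[M\exp(-ky)\exp(\sigma t + ikx)\right]. \]
This is (after dropping the exponential terms)
\[ \sigma V_1 + [ikV_1]U_0 = \frac{k}{\rho}M\exp(-ky) \]
which may be expressed as
\[ V_1 = \frac{kM\exp(-ky)}{\rho(\sigma + ikU_0)} \tag{6.38} \]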
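Next substitute (6.27) and the result for $\tilde{P}_2(y)$ in (6.36) into
\[ \frac{\partial \tilde{u}_2}{\partial t} = -\frac{1}{\rho}\frac{\partial \tilde{P}_2}{\partial x} \]
to get
\[ \frac{\partial}{\partial t}\left[\Psi_2(y)\exp(\sigma t + ikx)\right] = -\frac{1}{\rho}\frac{\partial}{\partial x}\left[M\exp(ky)\exp(\sigma t + ikx)\right]. \]
This is (after dropping the exponential terms)
\[ \Psi_2 = -\frac{ikM}{\rho\sigma}\exp(ky) \tag{6.39} \]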
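Next substitute (6.29) and the result for $\tilde{P}_2(y)$ in (6.36) into
\[ \frac{\partial \tilde{v}_2}{\partial t} = -\frac{1}{\rho}\frac{\partial \tilde{P}_2}{\partial y} \]
to get
\[ \frac{\partial}{\partial t}\left[V_2(y)\exp(\sigma t + ikx)\right] = -\frac{1}{\rho}\frac{\partial}{\partial y}\left[M\exp(ky)\exp(\sigma t + ikx)\right] \]
which gives (after dropping the exponential terms)
\[ V_2 = -\frac{kM}{\rho\sigma}\exp(ky) \tag{6.40} \]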
Finally we look at the boundary conditions and make use of the expressions we obtained in (6.37), (6.38), (6.39) and (6.40). Now recall (6.23)
\[ -U_0\tilde{\eta}_x + \tilde{v}_1\big|_{y=0} = \tilde{v}_2\big|_{y=0}. \]
Substituting (6.32), (6.28) and (6.29) into this, we get
\[ -U_0\frac{\partial}{\partial x}\left[\varphi\exp(\sigma t + ikx)\right] + V_1(y)\exp(\sigma t + ikx)\big|_{y=0} = V_2(y)\exp(\sigma t + ikx)\big|_{y=0} \]
which is (after dropping the exponentials)
\[ -U_0 ik\varphi + V_1(0) = V_2(0) \tag{6.41} \]
Now from (6.38),
\[ V_1(y) = \frac{kM\exp(-ky)}{\rho(\sigma + ikU_0)} \;\Rightarrow\; V_1(0) = \frac{kM}{\rho(\sigma + ikU_0)} \]
and from (6.40)
\[ V_2(y) = -\frac{kM}{\rho\sigma}\exp(ky) \;\Rightarrow\; V_2(0) = -\frac{kM}{\rho\sigma}. \]
Hence we have from (6.41)
\[ -U_0 ik\varphi + \frac{kM}{\rho(\sigma + ikU_0)} = -\frac{kM}{\rho\sigma} \]
which may be expressed as
\[ -U_0 ik\varphi + M\left[\frac{k}{\rho(\sigma + ikU_0)} + \frac{k}{\rho\sigma}\right] = 0 \tag{6.42} \]

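Recall that (6.22) is
\[ \frac{\partial \tilde{\eta}}{\partial t} + U_0\frac{\partial \tilde{\eta}}{\partial x} = \tilde{v}_1\big|_{y=0}. \]
Substituting (6.32) and (6.28) into this, we get
\[ \frac{\partial}{\partial t}\left[\varphi\exp(\sigma t + ikx)\right] + U_0\frac{\partial}{\partial x}\left[\varphi\exp(\sigma t + ikx)\right] = V_1(y)\exp(\sigma t + ikx)\big|_{y=0}. \]
This is (after dropping the exponentials)
\[ \sigma\varphi + U_0 ik\varphi = V_1(0). \]
As $V_1(0) = \frac{kM}{\rho(\sigma + ikU_0)}$, we get
\[ \sigma\varphi + U_0 ik\varphi = \frac{kM}{\rho(\sigma + ikU_0)} \]
which is
\[ (\sigma + U_0 ik)\varphi - M\frac{k}{\rho(\sigma + ikU_0)} = 0 \tag{6.43} \]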
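We can now write (6.42) and (6.43) in the matrix form
\[ \begin{bmatrix} -U_0 ik & \dfrac{k}{\rho(\sigma + ikU_0)} + \dfrac{k}{\rho\sigma} \\ \sigma + U_0 ik & -\dfrac{k}{\rho(\sigma + ikU_0)} \end{bmatrix} \begin{bmatrix} \varphi \\ M \end{bmatrix} = 0. \]
This has a nontrivial solution if the determinant of the matrix is zero, leading to
\[ \begin{vmatrix} -U_0 ik & \dfrac{k}{\rho(\sigma + ikU_0)} + \dfrac{k}{\rho\sigma} \\ \sigma + U_0 ik & -\dfrac{k}{\rho(\sigma + ikU_0)} \end{vmatrix} = 0. \]
The resulting dispersion relation simplifies to (eventually)
\[ U_0 ik^2 = k(\sigma + ikU_0) + \frac{k}{\sigma}(\sigma + ikU_0)^2. \]
Multiplying through by $\sigma/k$, we get
\[ \sigma^2 + (\sigma + ikU_0)^2 = 0 \]
which is a quadratic in $\sigma$
\[ \sigma^2 + (ikU_0)\sigma - \frac{k^2U_0^2}{2} = 0. \]
The roots of this equation $\sigma_{1,2}$ are
\[ \sigma_{1,2} = \frac{-ikU_0 \pm \sqrt{-k^2U_0^2 + 2k^2U_0^2}}{2} = \frac{-ikU_0 \pm kU_0}{2}. \]
Therefore
\[ \sigma_1 = \frac{-ikU_0 + kU_0}{2} = \frac{kU_0}{2}[1 - i] \]
and
\[ \sigma_2 = \frac{-ikU_0 - kU_0}{2} = \frac{kU_0}{2}[-1 - i]. \]
This means that the interface between the two fluids is in the form of a wave travelling with frequency
\[ \omega = \frac{kU_0}{2} \]
and speed
\[ \frac{\omega}{k} = \frac{U_0}{2}. \]
Since $\mathrm{Re}(\sigma_1) = \frac{kU_0}{2} > 0$ for every $k > 0$, the amplitude of this wave grows exponentially, so the flat interface is unstable.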
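
The roots of the quadratic dispersion relation can be confirmed numerically. The sketch below solves $2\sigma^2 + 2ikU_0\sigma - k^2U_0^2 = 0$ (the relation $\sigma^2 + (\sigma + ikU_0)^2 = 0$ expanded) with the quadratic formula; the function name and the sample values of $k$ and $U_0$ are illustrative assumptions.

```python
import cmath

def interface_growth_rates(k, U0):
    """Roots of 2 sigma^2 + 2 i k U0 sigma - k^2 U0^2 = 0."""
    a = 2.0
    b = 2j * k * U0
    c = -(k * U0) ** 2
    root = cmath.sqrt(b * b - 4.0 * a * c)
    return (-b + root) / (2.0 * a), (-b - root) / (2.0 * a)

sigma_1, sigma_2 = interface_growth_rates(k=3.0, U0=2.0)
print(sigma_1, sigma_2)  # (k U0 / 2)(1 - i) and (k U0 / 2)(-1 - i)
```

The real part of the first root is the growth rate $kU_0/2$, and the common imaginary part $-kU_0/2$ gives the propagation speed $U_0/2$.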

6.3.1 Exercise: Kelvin-Helmholtz Instability

The Kelvin–Helmholtz instability (named after Lord Kelvin and Hermann von Helmholtz) can occur when there is sufficient velocity difference across the interface between two fluids. One example is wind blowing over a water surface, where the wind causes the relative motion between the water and air. In this instance, the instability is seen as small waves that are generated on the water surface. This type of instability has been observed in cloud formations, in the rings of Saturn, in the ocean, and in the Sun's corona. The theory can also be used to predict the onset of instability and transition to turbulent flow between fluids of different densities moving at various speeds.
Consider two ideal fluids of different densities that are moving in different directions at different velocities, as depicted in Figure 6.7. Carry out a linear stability analysis and obtain the relevant dispersion relation. Analyze your results and obtain the relevant stability criteria.

Figure 6.7: Kelvin-Helmholtz instability created at the interface of two superimposed fluids of different densities moving at different velocities

6.4 The Rayleigh-Taylor Instability

The Rayleigh–Taylor instability is a phenomenon named after Lord Rayleigh and G. I. Taylor. It refers to the interfacial instability between two superimposed immiscible fluids of different densities with the heavier (more dense) fluid on top. Both fluids are subject to gravity. There is a release of potential energy as the heavier fluid sinks to displace the lighter fluid underneath. This was the set-up as studied by Lord Rayleigh. As the instability develops at the interface, characteristic inter-penetrating "Rayleigh–Taylor fingers" are formed as fluid is displaced in either direction, hence the commonly used term "fingering instability". The upward-moving, lighter material is shaped like mushroom caps. This process is evident not only in many terrestrial examples, from salt domes to weather inversions, but also in astrophysics and electrohydrodynamics.

Consider two superimposed fluids of different densities that are initially at rest and subject to gravity (see Figure 6.8). If $\rho_2 > \rho_1$, we expect that the upper fluid will sink and the lower fluid will rise.

Figure 6.8: Two fluids of different densities subject to gravity

Assuming that the two fluids are ideal and that viscosity is neglected, but taking into account surface tension effects, the governing equations for each fluid will be the Euler equation and the continuity equation:
\[ \frac{\partial \vec{v}_1}{\partial t} + (\vec{v}_1\cdot\nabla)\vec{v}_1 = -\frac{1}{\rho_1}\nabla P_1 + \vec{g} \]
\[ \frac{\partial \vec{v}_2}{\partial t} + (\vec{v}_2\cdot\nabla)\vec{v}_2 = -\frac{1}{\rho_2}\nabla P_2 + \vec{g} \]
\[ \nabla\cdot\vec{v}_1 = 0, \qquad \nabla\cdot\vec{v}_2 = 0. \]
At steady state, there is no motion and the pressure does not change in time, so using the subscript "0" to indicate the steady state, we have
\[ \vec{v}_{10} = \vec{v}_{20} = 0. \]

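Also, since there is no dependence on $x$ due to symmetry, we have
\[ -\frac{1}{\rho_1}\nabla P_{10} + \vec{g} = 0 \;\Rightarrow\; -\frac{1}{\rho_1}\frac{\partial P_{10}}{\partial y} - g = 0 \]
\[ -\frac{1}{\rho_2}\nabla P_{20} + \vec{g} = 0 \;\Rightarrow\; -\frac{1}{\rho_2}\frac{\partial P_{20}}{\partial y} - g = 0. \]
Integrating the two previous equations, we get
\[ P_{10} = \Pi_{10} - \rho_1 g y, \qquad P_{20} = \Pi_{20} - \rho_2 g y \]
where $\Pi_{10}$ and $\Pi_{20}$ are constants of integration. At the interface at steady state, $y = 0$, which gives
\[ P_{10} = P_{20} \;\Rightarrow\; \Pi_{10} = \Pi_{20} = \Pi_0 \]
and hence at steady state
\[ P_{10} = \Pi_0 - \rho_1 g y, \qquad P_{20} = \Pi_0 - \rho_2 g y. \]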
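Now taking the perturbed interface to be $y = \eta(x,t)$, let us consider a small perturbation from the steady state
\[ \vec{v}_1 = 0 + \tilde{\vec{v}}_1 = \tilde{\vec{v}}_1, \qquad \vec{v}_2 = 0 + \tilde{\vec{v}}_2 = \tilde{\vec{v}}_2 \]
\[ P_1 = P_{10} + \tilde{P}_1, \qquad P_2 = P_{20} + \tilde{P}_2. \]
Substituting these into the governing equations and linearizing, we get eventually
\[ \frac{\partial \tilde{\vec{v}}_1}{\partial t} = -\frac{1}{\rho_1}\nabla\tilde{P}_1 \tag{6.44} \]
\[ \frac{\partial \tilde{\vec{v}}_2}{\partial t} = -\frac{1}{\rho_2}\nabla\tilde{P}_2 \tag{6.45} \]
In bracketed vector notation, $\tilde{\vec{v}}_1 = (u_1, v_1)$ and $\tilde{\vec{v}}_2 = (u_2, v_2)$. The continuity equation is therefore
\[ \frac{\partial u_1}{\partial x} + \frac{\partial v_1}{\partial y} = 0 = \frac{\partial u_2}{\partial x} + \frac{\partial v_2}{\partial y} \tag{6.46} \]
Writing (6.44) and (6.45) in components, we get
\[ \frac{\partial u_1}{\partial t} = -\frac{1}{\rho_1}\frac{\partial \tilde{P}_1}{\partial x}, \qquad \frac{\partial u_2}{\partial t} = -\frac{1}{\rho_2}\frac{\partial \tilde{P}_2}{\partial x} \tag{6.47} \]
\[ \frac{\partial v_1}{\partial t} = -\frac{1}{\rho_1}\frac{\partial \tilde{P}_1}{\partial y}, \qquad \frac{\partial v_2}{\partial t} = -\frac{1}{\rho_2}\frac{\partial \tilde{P}_2}{\partial y} \tag{6.48} \]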

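Differentiating the first equation in (6.47) with respect to $x$, and the first equation in (6.48) with respect to $y$ and adding the result, we get
\[ \frac{\partial}{\partial t}\left(\frac{\partial u_1}{\partial x} + \frac{\partial v_1}{\partial y}\right) = -\frac{1}{\rho_1}\left[\Delta\tilde{P}_1\right]. \]
Using (6.46) in this, we get
\[ -\frac{1}{\rho_1}\left[\Delta\tilde{P}_1\right] = 0 \;\Rightarrow\; \Delta\tilde{P}_1 = 0 \tag{6.49} \]
In the same way, differentiating the second equation in (6.47) with respect to $x$, and the second equation in (6.48) with respect to $y$ and adding the result, we get
\[ \frac{\partial}{\partial t}\left(\frac{\partial u_2}{\partial x} + \frac{\partial v_2}{\partial y}\right) = -\frac{1}{\rho_2}\left[\Delta\tilde{P}_2\right]. \]
Using (6.46) in this, we get
\[ -\frac{1}{\rho_2}\left[\Delta\tilde{P}_2\right] = 0 \;\Rightarrow\; \Delta\tilde{P}_2 = 0 \tag{6.50} \]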

Next, we take
\[ \tilde{P}_1 = f_1(y)\exp(\sigma t + ikx) \tag{6.51} \]
\[ \tilde{P}_2 = f_2(y)\exp(\sigma t + ikx) \tag{6.52} \]
and substitute this into (6.49) and (6.50) to obtain
\[ -k^2 f_1 + \frac{d^2 f_1}{dy^2} = 0, \qquad -k^2 f_2 + \frac{d^2 f_2}{dy^2} = 0. \]
This means that
\[ f_1 = A_1\exp(ky) + B_1\exp(-ky), \qquad f_2 = A_2\exp(ky) + B_2\exp(-ky) \]
and since we need $f_1 \to 0$ as $y \to -\infty$ and also $f_2 \to 0$ as $y \to +\infty$, we need to have $B_1 = A_2 = 0$, therefore
\[ f_1 = A_1\exp(ky), \qquad f_2 = B_2\exp(-ky) \]
and
\[ \tilde{P}_1 = A_1\exp(ky)\exp(\sigma t + ikx) \tag{6.53} \]
\[ \tilde{P}_2 = B_2\exp(-ky)\exp(\sigma t + ikx) \tag{6.54} \]
Substituting (6.53) and (6.54) into (6.47) and (6.48) gives
\[ u_1 = -\frac{1}{\rho_1}\frac{ik}{\sigma}A_1\exp(ky)\exp(\sigma t + ikx) \tag{6.55} \]
\[ v_1 = -\frac{1}{\rho_1}\frac{k}{\sigma}A_1\exp(ky)\exp(\sigma t + ikx) \tag{6.56} \]
\[ u_2 = -\frac{1}{\rho_2}\frac{ik}{\sigma}B_2\exp(-ky)\exp(\sigma t + ikx) \tag{6.57} \]
\[ v_2 = \frac{1}{\rho_2}\frac{k}{\sigma}B_2\exp(-ky)\exp(\sigma t + ikx) \tag{6.58} \]

Next we must consider the boundary conditions for this system at the interface, which when perturbed, is expressed as $y = \eta(x,t)$. First, we have the continuity of the vertical velocities, expressed as
\[ v_1\big|_{y=0} = v_2\big|_{y=0} \tag{6.59} \]
Next we consider the kinematic condition or free-surface condition at $y = \eta(x,t)$:
\[ \frac{d\eta}{dt} = \frac{\partial \eta}{\partial t} + u\frac{\partial \eta}{\partial x} = v. \]
This condition holds for the lower fluid. Linearizing this and using the subscript "1" to indicate that this is applied to the lower fluid which has a free upper surface in contact with the upper fluid, we obtain
\[ \frac{\partial \eta}{\partial t} = v_1\big|_{y=0} \tag{6.60} \]
Thirdly, we must consider the pressure difference across the interface. If there were no surface tension effects, then $P_1 = P_2$ (as in the previous example). However, as we now take into account the surface tension, we expect the interface to be curved and we can no longer assume that $P_1 = P_2$, since a surface that bends creates an additional pressure. The change in pressure can be expressed as
\[ P_2 - P_1 = \frac{2\alpha}{R} \]
where $\alpha$ is the surface tension and $R$ is the radius of curvature of the surface. Therefore, we have that
\[ P_1 = P_2 - \frac{2\alpha}{R} = P_2 - \alpha K_c \tag{6.61} \]
where
\[ K_c = \frac{2}{R} = \frac{\eta_{xx}}{\left[1 + (\eta_x)^2\right]^{3/2}} \]
is referred to as the mean curvature. Hence
\[ P_1 = P_2 - \alpha\frac{\eta_{xx}}{\left[1 + (\eta_x)^2\right]^{3/2}}. \]
Taking Taylor expansions (assuming that $\eta$ is very small) and linearizing, we have
\[ P_1 \simeq P_{10}(0) + \frac{\partial P_{10}}{\partial y}\eta + \tilde{P}_1(0) \simeq P_{20}(0) + \frac{\partial P_{20}}{\partial y}\eta + \tilde{P}_2(0) - \alpha\eta_{xx}. \]
Since
\[ P_{10}(0) = P_{20}(0), \qquad \frac{\partial P_{10}}{\partial y} = -\rho_1 g, \qquad \frac{\partial P_{20}}{\partial y} = -\rho_2 g \]
we eventually get the third boundary condition at the interface $y = 0$ to be
\[ \tilde{P}_1(0) - \tilde{P}_2(0) = (\rho_1 - \rho_2)g\eta - \alpha\eta_{xx} \tag{6.62} \]
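We are now ready to let
\[ \eta = H\exp(\sigma t + ikx) \tag{6.63} \]
and to substitute (6.63) and the expressions (6.55), (6.57), (6.56) and (6.58) into the boundary conditions (6.59), (6.60) and (6.62) to get (after simplifying and dropping the exponential terms)
\[ -\frac{1}{\rho_1}\frac{k}{\sigma}A_1 = \frac{1}{\rho_2}\frac{k}{\sigma}B_2 \;\Rightarrow\; B_2 = -\frac{\rho_2}{\rho_1}A_1 \tag{6.64} \]
\[ \sigma H = -\frac{1}{\rho_1}\frac{k}{\sigma}A_1 \;\Rightarrow\; H = -\frac{k}{\rho_1\sigma^2}A_1 \tag{6.65} \]
\[ A_1 - B_2 = \left[(\rho_1 - \rho_2)g + \alpha k^2\right]H. \]
Using (6.64) and (6.65), we can express this as
\[ A_1 + \frac{\rho_2}{\rho_1}A_1 = -\left[(\rho_1 - \rho_2)g + \alpha k^2\right]\frac{k}{\rho_1\sigma^2}A_1 \]
which is
\[ 1 + \frac{\rho_2}{\rho_1} = -\left[(\rho_1 - \rho_2)g + \alpha k^2\right]\frac{k}{\rho_1\sigma^2} \]
\[ \rho_1 + \rho_2 = -\left[(\rho_1 - \rho_2)g + \alpha k^2\right]\frac{k}{\sigma^2} \]
\[ \sigma^2 = -\frac{k\left[(\rho_1 - \rho_2)g + \alpha k^2\right]}{\rho_1 + \rho_2}. \]
The dispersion relation is therefore
\[ \sigma^2 = k\left[\frac{\rho_2 - \rho_1}{\rho_1 + \rho_2}g - \frac{\alpha k^2}{\rho_1 + \rho_2}\right] \tag{6.66} \]
We note that if the bracketed term
\[ \left[\frac{\rho_2 - \rho_1}{\rho_1 + \rho_2}g - \frac{\alpha k^2}{\rho_1 + \rho_2}\right] \tag{6.67} \]
is positive, we will get one positive root and one negative root, leading to unbounded solutions.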
We can interpret the results as follows:

When $\rho_2 > \rho_1$, we will get instability for the case
\[ k^2 < \frac{(\rho_2 - \rho_1)g}{\alpha}. \]

When $\rho_2 < \rho_1$, the bracketed term (6.67) is negative, leading to purely imaginary roots for $\sigma$, which means we will have stationary waves on the interface.

When $\alpha = 0$ (case of no surface tension) we get
\[ \sigma^2 = kg\,\frac{\rho_2 - \rho_1}{\rho_1 + \rho_2} \;\Rightarrow\; \sigma = \pm\sqrt{kg\,\frac{\rho_2 - \rho_1}{\rho_1 + \rho_2}}. \]
If also $\rho_2 = 0$, we get
\[ \sigma^2 = -kg \;\Rightarrow\; \sigma = \pm\sqrt{kg}\,i \]
which is interpreted to be the case of "gravity waves" on the interface.

From the dispersion relation (6.66), we see that in general if $\alpha \neq 0$ and $\rho_2 > \rho_1$, then the growth of disturbances on the interface will be slowed (as the magnitude of $\sigma$ will be smaller). The presence of surface tension therefore in general has a stabilizing effect on the interface. On the other hand, the gravity effect is destabilizing (as it increases the magnitude of $\sigma$), as expected.
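
These cases can be explored by evaluating the dispersion relation (6.66) directly. The following is a minimal sketch: the symbol `alpha` for the surface tension, the function names and the sample values (a water-like liquid above an air-like gas, in SI units) are assumptions made for illustration.

```python
import math

def sigma_squared(k, rho1, rho2, g, alpha):
    """sigma^2 from (6.66); rho1 is the lower fluid and rho2 the upper fluid."""
    return k * ((rho2 - rho1) * g - alpha * k ** 2) / (rho1 + rho2)

def cutoff_wavenumber(rho1, rho2, g, alpha):
    """Wavenumber below which a heavy-over-light interface is unstable.

    Returns None when there is no such cutoff (light fluid on top, or alpha <= 0).
    """
    if rho2 <= rho1 or alpha <= 0:
        return None
    return math.sqrt((rho2 - rho1) * g / alpha)

# Illustrative values: heavy liquid (1000 kg/m^3) above light gas (1.2 kg/m^3).
rho_lower, rho_upper, g, alpha = 1.2, 1000.0, 9.81, 0.072
k_c = cutoff_wavenumber(rho_lower, rho_upper, g, alpha)
print(k_c, sigma_squared(0.5 * k_c, rho_lower, rho_upper, g, alpha))
```

Wavenumbers below `k_c` give $\sigma^2 > 0$ (instability), while shorter waves are stabilized by surface tension. Swapping the two densities makes $\sigma^2$ negative for every $k$, which corresponds to waves on a stably stratified interface.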

6.4.1 Exercise
Modify the Kelvin-Helmholtz instability from the previous exercise for the case with surface tension in the fluid boundary. Provide all the details of your linear stability analysis, and obtain the relevant dispersion relation. Analyze your results and obtain the relevant stability criteria.
Chapter 7

Heat Flow Problems

7.1 Equation of Heat Conduction


In 1855, a German-born physician and physiologist named Adolf Fick introduced Fick's law of diffusion, which governs the diffusion of a gas across a fluid membrane. This law states that diffusive flux always occurs from regions of higher to lower concentration, with a magnitude that is proportional to the concentration gradient. The analogous statement for heat conduction is known as Fourier's law. In terms of the heat flux $q_n$ in a given direction $\vec{n}$, it can be represented as
\[ q_n = -\kappa\nabla T\cdot\vec{n} \]
where $\kappa$ is the heat conductivity of the medium through which the heat is travelling, and $T$ is the temperature. This is not surprising, as it is easily shown experimentally that heat flow through any medium is controlled by the temperature gradient through the medium.
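The heat flux is the total heat travelled per unit area and per unit time. For a
given volume $V$ enclosed by a surface $S$, the total heat $Q$ is
\[ Q = \int_V \rho\, C_p\, T \, dV \]
where $\rho$ is the density, $C_p$ is the heat capacity of the medium inside $V$, and $T$ is the
temperature. It follows that the total rate of change of heat in $V$ must be equal to
the rate of loss of heat through $S$
\[ \left.\frac{\partial Q}{\partial t}\right|_V = -\int_S q_n \, dS \]
This may be represented as
\[ \frac{\partial}{\partial t}\int_V \rho\, C_p\, T \, dV = -\int_S \left(-\lambda \nabla T \cdot \vec{n}\right) dS = \int_S \left(\lambda \nabla T \cdot \vec{n}\right) dS \]
Now by Gauss' divergence theorem
\[ \int_S \left(\lambda \nabla T \cdot \vec{n}\right) dS = \int_V \nabla \cdot \left(\lambda \nabla T\right) dV \]
hence we may say that
\[ \frac{\partial}{\partial t}\left(\rho\, C_p\, T\right) = \nabla \cdot \left(\lambda \nabla T\right) \]
As $\rho$, $C_p$ and $\lambda$ are constant, we have
\[ \rho\, C_p \frac{\partial T}{\partial t} = \lambda \nabla^2 T \]
This leads to the equation of heat conduction
\[ \frac{\partial T}{\partial t} = \frac{\lambda}{\rho\, C_p} \nabla^2 T = \kappa \nabla^2 T \]
where $\kappa$ is the thermal diffusivity of the material.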


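In one spatial dimension, the equation of heat conduction is
\[ \frac{\partial T}{\partial t} = \kappa \frac{\partial^2 T}{\partial x^2} \]
At steady state, we have $T = T(x)$ and
\[ \frac{d^2 T}{dx^2} = 0 \]
Integrating twice in $x$, we have
\[ T = Ax + B \]
If $0 \le x \le L$, then
\[ T(0) = B = T_0 \]
and
\[ T(L) = AL + B = AL + T_0 = T_1 \]
which gives
\[ A = \frac{T_1 - T_0}{L} \]
Therefore
\[ T = T_0 + \frac{T_1 - T_0}{L}\, x \]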
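As a quick check of the steady state result, the following Python sketch relaxes the one-dimensional heat equation with an explicit finite-difference scheme and compares the long-time temperature with the linear profile above. The grid size, the boundary temperatures, the diffusivity value and the function names are all assumptions made for illustration.

```python
def steady_profile(x, L, T0, T1):
    """Linear steady state T = T0 + (T1 - T0) x / L."""
    return T0 + (T1 - T0) * x / L


def relax_heat_equation(n=11, L=1.0, T0=100.0, T1=20.0, kappa=1.0, steps=2000):
    """March dT/dt = kappa d2T/dx2 forward in time until it settles.

    Uses forward Euler in time and central differences in space, with
    a time step below the stability limit dx^2 / (2 kappa).
    """
    dx = L / (n - 1)
    dt = 0.4 * dx * dx / kappa
    # Fixed end temperatures; the interior starts from an arbitrary guess
    T = [T0] + [0.0] * (n - 2) + [T1]
    for _ in range(steps):
        new_T = T[:]
        for i in range(1, n - 1):
            new_T[i] = T[i] + kappa * dt / dx**2 * (T[i + 1] - 2.0 * T[i] + T[i - 1])
        T = new_T
    return T


numerical = relax_heat_equation()
exact = [steady_profile(i * 0.1, 1.0, 100.0, 20.0) for i in range(11)]
largest_gap = max(abs(a - b) for a, b in zip(numerical, exact))
```

The value of `largest_gap` should be negligibly small, since the transient decays exponentially and only the linear profile survives.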
Consider the interface between two different media, "medium I" and "medium
II", at temperatures $T^{I}$ and $T^{II}$ respectively (see Figure 7.1). At the interface, there
are two boundary conditions. First, the temperature in medium I is equal to the
temperature in medium II:
\[ T^{I} = T^{II} \]
The second condition has to do with the heat flux across the interface. If there is no
chemical reaction or heat absorption to account for, the heat flux across the interface
is the same. This is expressed as
\[ \vec{n} \cdot \lambda_1 \nabla T^{I} = \vec{n} \cdot \lambda_2 \nabla T^{II} \]
which is equivalent to
\[ \lambda_1 \frac{\partial T^{I}}{\partial n} = \lambda_2 \frac{\partial T^{II}}{\partial n} \]

Figure 7.1: Interface between two media at different temperatures

7.2 Self-Similar Solutions


A self-similar solution is a dimensionless solution that holds for any amount of time.
Let us demonstrate this concept via two concrete examples.

7.2.1 Example 1:

Consider a medium with constant temperature $T_0$ that is next to a cold wall with
temperature $T_1$ at position $x = 0$, such that $T_0 > T_1$ (see Figure 7.2). The governing
equation for the transfer of heat is
\[ \frac{\partial T}{\partial t} = \kappa \nabla^2 T \tag{7.1} \]
Now the temperature at any point in the system is dependent on the values of $T_0$ and
$T_1$ as well as on the heat diffusivity, position in space and in time:
\[ T = T(x, t) = T(T_0, T_1, \kappa, x, t) \]

Figure 7.2: Medium at temperature T0 next to a cold wall (at x = 0) with temperature
T1 such that T0 > T1 :

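To study the system in dimensionless form to obtain a self-similar dimensionless
solution, we can consider
\[ T = f(\eta) \]
where the dimensionless $\eta$ is defined as
\[ \eta = \frac{x}{\sqrt{\kappa t}} \]
Substituting this into the heat conduction equation (7.1), we have
\[ f'(\eta) \frac{\partial \eta}{\partial t} = \kappa \frac{\partial^2}{\partial x^2} f(\eta) \tag{7.2} \]
Now
\[ \frac{\partial \eta}{\partial t} = -\frac{\kappa x}{2 (\kappa t)^{3/2}} \tag{7.3} \]
also
\[ \frac{\partial f(\eta)}{\partial x} = f'(\eta) \frac{\partial \eta}{\partial x} = f'(\eta) \frac{1}{\sqrt{\kappa t}} \]
and
\[ \frac{\partial^2 f(\eta)}{\partial x^2} = f''(\eta) \left(\frac{\partial \eta}{\partial x}\right)^2 = f''(\eta) \frac{1}{\kappa t} \tag{7.4} \]
Using (7.3) and (7.4) in (7.2), we have
\[ f'(\eta) \left[-\frac{\kappa x}{2 (\kappa t)^{3/2}}\right] = \kappa f''(\eta) \frac{1}{\kappa t} \]
which is
\[ -f'(\eta) \frac{x}{2\sqrt{\kappa t}} = f''(\eta) \]
\[ -f'(\eta) \left[\frac{\eta}{2}\right] = f''(\eta) \]
\[ f''(\eta) + f'(\eta) \left[\frac{\eta}{2}\right] = 0 \]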
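As this is a second order differential equation, we can use the method of reduction
of order to solve it by taking
\[ y = f' \]
which gives
\[ y' + \frac{1}{2} \eta\, y = 0 \]
\[ \ln y = -\frac{1}{4} \eta^2 + C \]
\[ y = \exp\left(-\frac{1}{4} \eta^2 + C\right) = y_0 \exp\left(-\frac{1}{4} \eta^2\right) = f' \]
Hence we have an expression for $T$ as
\[ f = T = y_0 \int_0^{\eta} \exp\left(-\frac{1}{4} \eta^2\right) d\eta + C \]
At $x = 0$, $\eta = 0$ and $T = T_1 = f(0)$. Therefore $C = T_1$ and we have
\[ f = y_0 \int_0^{\eta} \exp\left(-\frac{1}{4} \eta^2\right) d\eta + T_1 \]
Let us take
\[ s = \frac{\eta}{2} \;\Rightarrow\; d\eta = 2\, ds \]
then we can write
\[ f = 2 y_0 \int_0^{\eta/2} \exp\left(-s^2\right) ds + T_1 \tag{7.5} \]
Also since as $x \to \infty \Rightarrow \eta \to \infty$ where $T = T_0$, then
\[ T_0 = 2 y_0 \int_0^{\infty} \exp\left(-s^2\right) ds + T_1 \]
Now we know that
\[ \int_0^{\infty} \exp\left(-s^2\right) ds = \frac{\sqrt{\pi}}{2} \]
so
\[ T_0 = 2 y_0 \frac{\sqrt{\pi}}{2} + T_1 \]
and
\[ 2 y_0 = \frac{2}{\sqrt{\pi}} \left(T_0 - T_1\right) \tag{7.6} \]
Substituting (7.6) into (7.5), we have
\[ f = \left(T_0 - T_1\right) \left[\frac{2}{\sqrt{\pi}} \int_0^{\eta/2} \exp\left(-s^2\right) ds\right] + T_1 \]
which is
\[ f = T_1 + \left(T_0 - T_1\right) \left[\frac{2}{\sqrt{\pi}} \int_0^{\frac{x}{2\sqrt{\kappa t}}} \exp\left(-s^2\right) ds\right] \]
By definition
\[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^{x} \exp\left(-s^2\right) ds \]
so our final self-similar (dimensionless) solution is
\[ T = T_1 + \left(T_0 - T_1\right) \operatorname{erf}\left(\frac{x}{2\sqrt{\kappa t}}\right) \]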

Figure 7.3 shows the general shape of the temperature profiles as time proceeds (it shows
how the medium gradually cools until it is at the same temperature as the cold wall).

Figure 7.3: Temperature profiles

We see that the solution is valid since the initial condition holds: as $t \to 0$ (for $x > 0$), we get
$\operatorname{erf}(\infty) = 1$, and $f = T_1 + (T_0 - T_1) = T_0$.
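The error function solution can also be checked numerically. The Python sketch below evaluates the self-similar temperature profile with the standard library function `math.erf` and estimates, by central finite differences, how closely it satisfies the one-dimensional heat conduction equation. The function names, the step size and the sample parameter values are assumptions for illustration only.

```python
import math


def temperature(x, t, T0, T1, kappa):
    """Self-similar solution T = T1 + (T0 - T1) erf(x / (2 sqrt(kappa t)))."""
    return T1 + (T0 - T1) * math.erf(x / (2.0 * math.sqrt(kappa * t)))


def pde_residual(x, t, T0, T1, kappa, h=1e-3):
    """Finite-difference estimate of dT/dt - kappa d2T/dx2 at (x, t)."""
    dT_dt = (temperature(x, t + h, T0, T1, kappa)
             - temperature(x, t - h, T0, T1, kappa)) / (2.0 * h)
    d2T_dx2 = (temperature(x + h, t, T0, T1, kappa)
               - 2.0 * temperature(x, t, T0, T1, kappa)
               + temperature(x - h, t, T0, T1, kappa)) / h**2
    return dT_dt - kappa * d2T_dx2


# Illustrative (assumed) values: medium at 100, wall at 20
residual = pde_residual(0.5, 1.0, 100.0, 20.0, 1.0)
wall_value = temperature(0.0, 1.0, 100.0, 20.0, 1.0)   # equals T1 at the wall
far_value = temperature(50.0, 1.0, 100.0, 20.0, 1.0)   # approaches T0 far away
```

A residual close to zero indicates that the profile satisfies the governing equation, while the wall and far-field values reproduce the two boundary conditions used in the derivation.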

7.2.2 Example 2
Consider the identical geometry as in the previous example, but in the case where
the temperature of the cold wall is not constant, but is oscillating in time
\[ x = 0, \quad T = T_0 \exp(-i\omega t) \]
We seek the self-similar solution of the same one-dimensional diffusion equation
\[ \frac{\partial T}{\partial t} = \kappa \frac{\partial^2 T}{\partial x^2} \tag{7.7} \]
of the form
\[ T = \left[\exp(-i\omega t)\right] f(x) \tag{7.8} \]
Substituting (7.8) into (7.7), we have
\[ -i\omega f = \kappa f'' \]
which is
\[ f'' + \frac{i\omega}{\kappa} f = 0 \]
Therefore
\[ f \sim \exp(\beta x) \]
where
\[ \beta = \sqrt{\frac{\omega}{\kappa}} \sqrt{-i} \tag{7.9} \]
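We can simplify this as follows. Let us take
\[ (a + ib)^2 = -i \]
where $a$ and $b$ are real. Then
\[ a^2 + 2abi - b^2 = -i \]
Separating the real and imaginary parts
\[ a^2 - b^2 = 0 \;\Rightarrow\; a = \pm b \]
and
\[ 2ab = -1 \]
If $a = +b$, then $b^2 = -\frac{1}{2}$, which is not the case as $b$ is real. Hence, we need to take
\[ a = -b \;\Rightarrow\; 2b^2 = 1 \;\Rightarrow\; b = \pm\frac{1}{\sqrt{2}}, \quad a = \mp\frac{1}{\sqrt{2}} \]
and therefore, choosing the root with negative real part (so that the solution decays as $x \to \infty$),
\[ -\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\, i = \sqrt{-i} \tag{7.10} \]
Using (7.10) in (7.9), we get
\[ \beta = \sqrt{\frac{\omega}{\kappa}} \left(-\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\, i\right) = \sqrt{\frac{\omega}{2\kappa}}\, (-1 + i) \]
Finally
\[ f \sim \exp\left(\sqrt{\frac{\omega}{2\kappa}}\; x\, [-1 + i]\right) \]
and from (7.8)
\[ T = A \exp(-i\omega t) \exp\left(\sqrt{\frac{\omega}{2\kappa}}\; x\, [-1 + i]\right) \]
where $A$ is a constant. From the boundary condition
\[ x = 0, \quad T = T_0 \exp(-i\omega t) \]
we have
\[ T_0 \exp(-i\omega t) = A \exp(-i\omega t) \]
so that $A = T_0$ and
\[ T = T_0 \exp(-i\omega t) \exp\left(\sqrt{\frac{\omega}{2\kappa}}\; x\, [-1 + i]\right) = T_0 \exp\left(-\sqrt{\frac{\omega}{2\kappa}}\; x\right) \exp\left[i\left(\sqrt{\frac{\omega}{2\kappa}}\; x - \omega t\right)\right] \]
This is the form of a travelling thermal wave with decaying amplitude
\[ T_0 \exp\left(-\sqrt{\frac{\omega}{2\kappa}}\; x\right) \]
of speed
\[ v = \frac{\omega}{\sqrt{\dfrac{\omega}{2\kappa}}} = \sqrt{2\kappa\omega} \]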
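A short numerical check of the travelling thermal wave is sketched below in Python, using complex arithmetic from the standard library. It evaluates the decaying wave with the root $(-1 + i)/\sqrt{2}$ of $\sqrt{-i}$, estimates the residual of the diffusion equation by finite differences, and computes the wave speed. The function names and the sample values of $T_0$, $\omega$ and $\kappa$ are illustrative assumptions.

```python
import cmath
import math


def thermal_wave(x, t, T0, omega, kappa):
    """Complex thermal wave T0 exp(-q x) exp(i (q x - omega t)), q = sqrt(omega / (2 kappa))."""
    q = math.sqrt(omega / (2.0 * kappa))
    return T0 * cmath.exp(-q * x) * cmath.exp(1j * (q * x - omega * t))


def wave_speed(omega, kappa):
    """Phase speed v = sqrt(2 kappa omega) of the thermal wave."""
    return math.sqrt(2.0 * kappa * omega)


def pde_residual(x, t, T0, omega, kappa, h=1e-3):
    """Finite-difference estimate of dT/dt - kappa d2T/dx2 at (x, t)."""
    dT_dt = (thermal_wave(x, t + h, T0, omega, kappa)
             - thermal_wave(x, t - h, T0, omega, kappa)) / (2.0 * h)
    d2T_dx2 = (thermal_wave(x + h, t, T0, omega, kappa)
               - 2.0 * thermal_wave(x, t, T0, omega, kappa)
               + thermal_wave(x - h, t, T0, omega, kappa)) / h**2
    return dT_dt - kappa * d2T_dx2


# Illustrative (assumed) values
residual = pde_residual(0.5, 0.3, 1.0, 2.0, 1.0)
speed = wave_speed(2.0, 1.0)
```

The amplitude decays over a distance of order $\sqrt{2\kappa/\omega}$, so rapid oscillations of the wall temperature penetrate only a thin layer of the medium.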

7.3 The Convection Diffusion Equation and Applications

7.3.1 Introduction

In bracketed vectorial notation, if $\vec{v} = (v_x, v_y, v_z)$ it follows that
\[ \vec{v} \cdot \nabla = v_x \frac{\partial}{\partial x} + v_y \frac{\partial}{\partial y} + v_z \frac{\partial}{\partial z} \]
The general convection diffusion equation may be expressed as
\[ (\vec{v} \cdot \nabla)\, T + \frac{\partial T}{\partial t} = \kappa \nabla^2 T \]
where $(\vec{v} \cdot \nabla)\, T$ represents the convective heat transfer through the medium.

7.3.2 Rayleigh-Bénard Problem


Rayleigh–Bénard convection is the name given to the natural convection that occurs
in a plane horizontal layer of fluid heated from below, in which the fluid develops
a regular pattern of convection cells that are commonly referred to as Bénard cells
(see Figure 7.4). This type of convection is one of the most commonly studied
convection phenomena. The study of these problems was initiated by the experimental
observations of the French physicist Henri Bénard in 1900. In 1916, a few years before
his death, the well-known English physicist Lord Rayleigh$^1$ carried out the first ever
analytical attempt to explain the formation of Bénard cells. His analysis was based on
the assumption that buoyancy, and hence gravity, was responsible for the appearance
of these convection cells. Since there is a density gradient between the top and the
bottom plate, gravity acts trying to pull the cooler, denser liquid from the top to the
bottom. This gravitational force is opposed by the viscous damping force in the fluid.
The balance of these two forces is expressed by a non-dimensional parameter that is
referred to as the Rayleigh number.

Figure 7.4: A reproduction of one of Bénard's original pictures showing the regular
hexagonal pattern on the surface of a fluid.

$^1$ Lord Rayleigh, "On convection currents in a horizontal layer of fluid, when the higher
temperature is on the under side", Philosophical Magazine Series 6, Vol. 32 (192), 1916.
Consider a thin layer of fluid of thickness $h$, in which the temperature $T_0$ on the lower
surface is greater than the temperature $T_1$ on the top surface (see Figure 7.5).
For simplicity, we assume that the upper and lower surfaces are free.

Figure 7.5: Thin horizontal layer of fluid of thickness $h$. The bottom surface of the
fluid is held at temperature $T_0$ and the top surface at $T_1$ such that $T_0 > T_1$.

The governing equations for this system are first the Navier-Stokes equation
\[ \rho_0 \left[\frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\, \vec{v}\right] = -\nabla P + \mu \nabla^2 \vec{v} + \rho\, \vec{g} \tag{7.11} \]
where $\mu$ is the dynamic viscosity, and $\vec{g}$ is the gravity acting downwards, $\vec{g} = -g\, \hat{k}$.
The convection diffusion equation is next
\[ (\vec{v} \cdot \nabla)\, T + \frac{\partial T}{\partial t} = \kappa \nabla^2 T \tag{7.12} \]
Finally we have the continuity equation
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\, \vec{v}) = 0 \tag{7.13} \]
The equation that describes the linear dependence of the density on temperature is
\[ \rho = \rho_0 \left[1 - \alpha (T - T_0)\right] \tag{7.14} \]
where $\rho_0$ is the density of the fluid at temperature $T_0$, and
\[ \alpha = -\frac{1}{\rho_0} \frac{\partial \rho}{\partial T} = \text{constant} \]

We make a few assumptions in order to simplify the system to be studied. First


we take = constant since the dependence of on temperature T is negligible. We
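also assume that the density is constant throughout the fluid layer (as it is very thin)
to reduce the continuity equation to $\nabla \cdot \vec{v} = 0$. These assumptions are commonly
referred to collectively as the Oberbeck-Boussinesq approximation, and the resulting
set of governing equations is
\[ \frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\, \vec{v} = -\frac{1}{\rho_0} \nabla P + \nu \nabla^2 \vec{v} + \vec{g} \left[1 - \alpha (T - T_0)\right] \tag{7.15} \]
\[ (\vec{v} \cdot \nabla)\, T + \frac{\partial T}{\partial t} = \kappa \nabla^2 T \tag{7.16} \]
\[ \nabla \cdot \vec{v} = 0 \tag{7.17} \]
where $\nu$ is the kinematic viscosity (the dynamic viscosity divided by the density $\rho_0$).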
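At steady state
\[ \vec{v} = 0 \tag{7.18} \]
Next (7.16) reduces to
\[ \nabla^2 T = 0 \]
Assuming that the temperature varies in the $z$ direction only, we have
\[ \frac{d^2 T}{dz^2} = 0 \;\Rightarrow\; T = Az + B \]
Using the conditions $T(0) = T_0$, $T(h) = T_1$, we get
\[ T = T_0 + (T_1 - T_0) \frac{z}{h} \tag{7.19} \]
At steady state, (7.15) becomes
\[ -\frac{1}{\rho_0} \frac{dP}{dz} - g \left[1 - \alpha (T_1 - T_0) \frac{z}{h}\right] = 0 \]
This can be solved to get
\[ P = -\rho_0\, g z + \rho_0\, g\, \alpha (T_1 - T_0) \frac{z^2}{2h} + \text{constant} \tag{7.20} \]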
Let us consider a perturbation about the steady state (where $T_{ss}$ is the steady
state temperature and $P_{ss}$ is the steady state pressure)
\[ \vec{v} = \vec{0} + \tilde{\vec{v}} \]
\[ T = T_{ss} + \tilde{T} \]
\[ P = P_{ss} + \tilde{P} \]
Substituting these into (7.15) to (7.17) and linearizing, we get
\[ \frac{\partial \tilde{\vec{v}}}{\partial t} = -\frac{1}{\rho_0} \nabla \tilde{P} + \nu \nabla^2 \tilde{\vec{v}} - \alpha\, \vec{g}\, \tilde{T} \tag{7.21} \]
\[ (\tilde{\vec{v}} \cdot \nabla)\, T_{ss} + \frac{\partial \tilde{T}}{\partial t} = \kappa \nabla^2 \tilde{T} \tag{7.22} \]
\[ \nabla \cdot \tilde{\vec{v}} = 0 \tag{7.23} \]
Using the method of normal modes, we take the two components of $\tilde{\vec{v}}$
\[ \tilde{\vec{v}} = (U(z), V(z)) \exp(ikx + \sigma t) \]
as well as
\[ \tilde{P} = M(z) \exp(ikx + \sigma t) \]
\[ \tilde{T} = \Theta(z) \exp(ikx + \sigma t) \]
Here $k$ is the wavenumber of the disturbance, and $\sigma$ is the growth rate. When $\sigma < 0$,
we have stable solutions and when $\sigma > 0$, we have unstable solutions. When $\sigma = 0$,
we have neutral or marginal stability. In what follows, let us consider the case $\sigma = 0$,
the so-called marginally stable case.
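Now from (7.19), if we let $T_0 - T_1 = \Delta T$, then
\[ T_{ss} = T_0 - \Delta T\, \frac{z}{h} \]
and (7.22) gives
\[ -\frac{\Delta T}{h}\, V = \kappa \left(\Theta'' - k^2 \Theta\right) \;\Rightarrow\; V = -\frac{\kappa h}{\Delta T} \left(\Theta'' - k^2 \Theta\right) \tag{7.24} \]
Next, we get from (7.23)
\[ ikU + \frac{\partial V}{\partial z} = 0 \]
Using (7.24), we have
\[ ikU = \frac{\partial}{\partial z} \left[\frac{\kappa h}{\Delta T} \left(\Theta'' - k^2 \Theta\right)\right] = \frac{\kappa h}{\Delta T} \left(\Theta'' - k^2 \Theta\right)' \]
That is
\[ U = \frac{\kappa h}{\Delta T} \frac{1}{ik} \left(\Theta'' - k^2 \Theta\right)' \tag{7.25} \]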
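
Next, we consider the $x$ component of (7.21):
\[ 0 = -\frac{1}{\rho_0}\, ikM + \nu \left(U'' - k^2 U\right) \;\Rightarrow\; M = \frac{\rho_0 \nu}{ik} \left(\frac{d^2}{dz^2} - k^2\right) U \]
and using (7.25), we get
\[ M = \frac{\rho_0 \nu}{ik} \left(\frac{d^2}{dz^2} - k^2\right) \frac{\kappa h}{\Delta T} \frac{1}{ik} \left(\Theta'' - k^2 \Theta\right)' \]
\[ M = \frac{\rho_0 \nu}{ik} \frac{\kappa h}{\Delta T} \frac{1}{ik} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta' = -\frac{\rho_0 \nu \kappa h}{k^2 \Delta T} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta' \tag{7.26} \]
Next we consider the $z$ component of (7.21):
\[ 0 = -\frac{1}{\rho_0} M' + \nu \left(V'' - k^2 V\right) + g \alpha \Theta \]
Using (7.26) in this, we get
\[ 0 = \frac{1}{\rho_0} \frac{\rho_0 \nu \kappa h}{k^2 \Delta T} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta'' + \nu \left(\frac{d^2}{dz^2} - k^2\right) V + g \alpha \Theta \tag{7.27} \]
Now we also know from (7.24) that
\[ V = -\frac{\kappa h}{\Delta T} \left(\Theta'' - k^2 \Theta\right) = -\frac{\kappa h}{\Delta T} \left(\frac{d^2}{dz^2} - k^2\right) \Theta \]
Using this in (7.27), we obtain
\[ 0 = \frac{\nu \kappa h}{k^2 \Delta T} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta'' - \frac{\nu \kappa h}{\Delta T} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta + g \alpha \Theta \]
This can be written as
\[ \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta'' - k^2 \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta + \frac{g \alpha (\Delta T)}{\nu \kappa h}\, k^2 \Theta = 0 \]
which is equivalent to
\[ \left(\frac{d^2}{dz^2} - k^2\right)^3 \Theta + \frac{g \alpha (\Delta T)}{\nu \kappa h}\, k^2 \Theta = 0 \tag{7.28} \]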

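Let us now express (7.28) in dimensionless form by taking
\[ \hat{z} = \frac{z}{h} \;\Rightarrow\; z = \hat{z} h \]
and
\[ k = \frac{\hat{k}}{h} \]
We can therefore express (7.28) as
\[ \left(\frac{1}{h^2} \frac{d^2}{d\hat{z}^2} - \frac{\hat{k}^2}{h^2}\right)^3 \Theta + \frac{g \alpha (\Delta T)}{\nu \kappa h} \left(\frac{\hat{k}}{h}\right)^2 \Theta = 0 \]
which is
\[ \left(\frac{d^2}{d\hat{z}^2} - \hat{k}^2\right)^3 \Theta + \frac{g \alpha (\Delta T)\, h^3}{\nu \kappa}\, \hat{k}^2 \Theta = 0 \]
We define the dimensionless Rayleigh number as
\[ Ra = \frac{g \alpha (\Delta T)\, h^3}{\nu \kappa} \]
which allows us to write
\[ \left(\frac{d^2}{d\hat{z}^2} - \hat{k}^2\right)^3 \Theta + (Ra)\, \hat{k}^2 \Theta = 0 \tag{7.29} \]
Substituting $\Theta = \exp(q \hat{z})$ gives the characteristic equation
\[ \left(q^2 - \hat{k}^2\right)^3 + (Ra)\, \hat{k}^2 = 0 \]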

We turn our attention now to the boundary conditions in non-dimensional form. We
first have
\[ U'|_{\hat{z} = 0} = U'|_{\hat{z} = 1} = V|_{\hat{z} = 0} = V|_{\hat{z} = 1} = 0 \]
From $V = 0$ we get for $\hat{z} = 0$ and $\hat{z} = 1$
\[ \Theta'' - \hat{k}^2 \Theta = 0 \tag{7.30} \]
while from $U' = 0$ we get for $\hat{z} = 0$ and $\hat{z} = 1$
\[ \Theta'''' - \hat{k}^2 \Theta'' = 0 \tag{7.31} \]
Also, we have that on the top and bottom boundaries
\[ \Theta|_{\hat{z} = 0} = \Theta|_{\hat{z} = 1} = 0 \]
which means that for $\hat{z} = 0$ and $\hat{z} = 1$
\[ \Theta'' = 0, \quad \Theta'''' = 0 \]
and indeed all even derivatives of $\Theta$ are zero on the boundaries. It follows that for
any integer $n$
\[ \Theta = A \sin(n \pi \hat{z}) \tag{7.32} \]
Using (7.32) in (7.29), we get
\[ A \left(-n^2 \pi^2 - \hat{k}^2\right)^3 + (Ra)\, \hat{k}^2 A = 0 \]
and if $A \neq 0$, then
\[ -\left(n^2 \pi^2 + \hat{k}^2\right)^3 + (Ra)\, \hat{k}^2 = 0 \]
which gives
\[ Ra = \frac{\left(n^2 \pi^2 + \hat{k}^2\right)^3}{\hat{k}^2} \]
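Therefore, for each $n$, there is a Rayleigh number. When $n = 1$ (i.e. at the first
convection), we get
\[ Ra = \frac{\left(\pi^2 + \hat{k}^2\right)^3}{\hat{k}^2} \]
The minimum point of the plot of $Ra$ against $\hat{k}$ occurs when
\[ \frac{\partial Ra}{\partial \hat{k}} = \frac{2\hat{k}\, \hat{k}^2\, (3) \left(\pi^2 + \hat{k}^2\right)^2 - \left(\pi^2 + \hat{k}^2\right)^3 2\hat{k}}{\left(\hat{k}^2\right)^2} = 0 \]
which means that at the minimum point
\[ 2\hat{k}\, \hat{k}^2\, (3) \left(\pi^2 + \hat{k}^2\right)^2 - \left(\pi^2 + \hat{k}^2\right)^3 2\hat{k} = 0 \]
\[ 6\hat{k}^3 \left(\pi^2 + \hat{k}^2\right)^2 - \left(\pi^2 + \hat{k}^2\right)^3 2\hat{k} = 0 \]
\[ 3\hat{k}^2 - \left(\pi^2 + \hat{k}^2\right) = 0 \;\Rightarrow\; \hat{k}^2 = \frac{\pi^2}{2} \]
When $\hat{k}^2 = \dfrac{\pi^2}{2}$, we get the critical Rayleigh number $Ra_{cr}$ to be
\[ Ra_{cr} = \frac{\left(\pi^2 + \dfrac{\pi^2}{2}\right)^3}{\dfrac{\pi^2}{2}} = \frac{27 \pi^4}{4} \simeq 657.5 \]
as depicted in Figure 7.6. Figure 7.6 is sometimes referred to as the marginal stability
curve. The critical Rayleigh number is the value of $Ra$ when convection first occurs.
Below the curve (otherwise known as the stability boundary) the system is stable.
Once the stability boundary is crossed, the system is unstable.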
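The marginal stability curve and its minimum can be reproduced numerically. The Python sketch below evaluates $Ra = (n^2\pi^2 + \hat{k}^2)^3 / \hat{k}^2$ and locates the minimum for $n = 1$ by a simple grid search over $\hat{k}$. The function names, the search interval and the number of samples are assumptions made for illustration; a root finder applied to $\partial Ra / \partial \hat{k} = 0$ would serve equally well.

```python
import math


def rayleigh_marginal(k_hat, n=1):
    """Rayleigh number on the marginal stability curve for mode n."""
    return (n**2 * math.pi**2 + k_hat**2) ** 3 / k_hat**2


def critical_rayleigh(n=1, k_min=0.5, k_max=6.0, samples=100001):
    """Grid search for the wavenumber that minimizes Ra on [k_min, k_max]."""
    best_k, best_ra = None, float("inf")
    for i in range(samples):
        k_hat = k_min + (k_max - k_min) * i / (samples - 1)
        ra = rayleigh_marginal(k_hat, n)
        if ra < best_ra:
            best_k, best_ra = k_hat, ra
    return best_k, best_ra


k_cr, ra_cr = critical_rayleigh()
analytic_ra = 27.0 * math.pi**4 / 4.0      # closed-form minimum for n = 1
analytic_k = math.pi / math.sqrt(2.0)      # wavenumber at the minimum
```

The numerical minimum should agree with the analytical values $\hat{k} = \pi/\sqrt{2}$ and $Ra_{cr} = 27\pi^4/4$ to within the resolution of the grid.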

Figure 7.6: Plot of the Rayleigh number Ra against the dimensionless wave number.
The critical Rayleigh number Racr is the minimum value of Ra

7.3.3 Marangoni Problem


It has been shown experimentally that a liquid with a high surface tension pulls
more strongly on the surrounding liquid than one with a low surface tension. Such
a gradient in surface tension may be caused by a pronounced temperature gradient,
since it is a well known fact that surface tension is a function of temperature. This
so-called Marangoni effect is named after Italian physicist Carlo Marangoni, who studied
it for his doctoral dissertation at the University of Pavia and published his results
in 1865. One particular instance of the Marangoni effect appears in the behavior of
Bénard cells studied in the previous section. It is clear that for a very thin layer of
fluid with a free upper surface, surface tension effects should be taken into account.
Let us consider a thin horizontal layer of fluid of depth $h$ bounded below by an
impenetrable rigid surface and with a free upper surface. The lower surface is held at
temperature $T_0$ while the upper free surface has a temperature $T_1$, such that $T_0 > T_1$.
As in the previous problem, convection currents will affect the overall stability of the
system. If the layer is very thin, surface tension effects are likely to play a pivotal part
in the overall thermal stability of the problem, and should not be ignored. In order to
focus only on the effect of surface tension, we simplify the problem by ignoring gravity.
We consider the lower surface to be rigid and impenetrable (with no-slip boundary
conditions), and the upper surface to be free (to account for surface tension effects).
The governing equations are the Navier-Stokes equation
\[ \frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\, \vec{v} = -\frac{1}{\rho_0} \nabla P + \nu \nabla^2 \vec{v} \tag{7.33} \]

the convection diffusion equation
\[ (\vec{v} \cdot \nabla)\, T + \frac{\partial T}{\partial t} = \kappa \nabla^2 T \tag{7.34} \]
and the reduced form of the continuity equation (for constant density; this is an
assumption since the layer is very thin)
\[ \nabla \cdot \vec{v} = 0 \tag{7.35} \]
At the steady state from (7.34) we get
\[ \nabla^2 T = 0 \]
and since the temperature is considered to vary only in the $z$ direction, and $T|_{z = 0} = T_0$,
$T|_{z = h} = T_1$, we get (as in the Rayleigh-Bénard problem)
\[ T = T_0 + (T_1 - T_0) \frac{z}{h} \]
Also at steady state, we get from (7.33) if pressure only varies in the $z$ direction
\[ -\frac{1}{\rho_0} \frac{\partial P}{\partial z} = 0 \;\Rightarrow\; P = P_{ss} = \text{constant} \]

Considering a small perturbation from the steady state
\[ \vec{v} = \vec{0} + \tilde{\vec{v}} \]
\[ T = T_{ss} + \tilde{T} \]
\[ P = P_{ss} + \tilde{P} \]
Substituting these into the equations (7.33) to (7.35) and linearizing, we get
\[ \frac{\partial \tilde{\vec{v}}}{\partial t} = -\frac{1}{\rho_0} \nabla \tilde{P} + \nu \nabla^2 \tilde{\vec{v}} \tag{7.36} \]
\[ (\tilde{\vec{v}} \cdot \nabla)\, T_{ss} + \frac{\partial \tilde{T}}{\partial t} = \kappa \nabla^2 \tilde{T} \tag{7.37} \]
\[ \nabla \cdot \tilde{\vec{v}} = 0 \tag{7.38} \]
Using the method of normal modes, we take the two components of $\tilde{\vec{v}}$
\[ \tilde{\vec{v}} = (U(z), V(z)) \exp(ikx + \sigma t) \]
as well as
\[ \tilde{P} = M(z) \exp(ikx + \sigma t) \]
\[ \tilde{T} = \Theta(z) \exp(ikx + \sigma t) \]
Here $k$ is the wavenumber of the disturbance, and $\sigma$ is the growth rate. When $\sigma < 0$,
we have stable solutions and when $\sigma > 0$, we have unstable solutions. When $\sigma = 0$,
we have neutral or marginal stability. In what follows, let us consider the case $\sigma = 0$,
the so-called marginally stable case.
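Substituting these into (7.37), we get
\[ V \frac{\partial T_{ss}}{\partial z} = \kappa \left(\Theta'' - k^2 \Theta\right) \tag{7.39} \]
If we let $T_0 - T_1 = \Delta T$, then
\[ T_{ss} = T_0 - \Delta T\, \frac{z}{h} \;\Rightarrow\; \frac{\partial T_{ss}}{\partial z} = -\frac{\Delta T}{h} \]
which we may put into (7.39) to get
\[ V = -\frac{\kappa h \left[\Theta'' - k^2 \Theta\right]}{\Delta T} \tag{7.40} \]
Also, from (7.38), we have
\[ ikU + \frac{\partial V}{\partial z} = 0 \]
and from (7.40), this is
\[ ikU - \frac{\kappa h \left[\Theta'' - k^2 \Theta\right]'}{\Delta T} = 0 \;\Rightarrow\; U = \frac{\kappa h \left[\Theta'' - k^2 \Theta\right]'}{ik\, (\Delta T)} \tag{7.41} \]
Finally, looking at (7.36) and the two components of velocity, we get
\[ 0 = -\frac{1}{\rho_0}\, ikM + \nu \left(U'' - k^2 U\right) \tag{7.42} \]
and
\[ 0 = -\frac{1}{\rho_0} M' + \nu \left(V'' - k^2 V\right) \tag{7.43} \]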
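From (7.42), we obtain
\[ M = \frac{\rho_0 \nu}{ik} \left(\frac{d^2}{dz^2} - k^2\right) U \]
and from (7.41), this is
\[ M = \frac{\rho_0 \nu}{ik} \left(\frac{d^2}{dz^2} - k^2\right) \frac{\kappa h}{ik\, (\Delta T)} \left(\frac{d^2}{dz^2} - k^2\right) \Theta' \]
which simplifies to
\[ M = -\frac{\nu \kappa h \rho_0}{k^2 (\Delta T)} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta' \tag{7.44} \]
In the same way, from (7.43), we have
\[ 0 = -\frac{1}{\rho_0} M' + \nu \left(V'' - k^2 V\right) \]
and using (7.40) and (7.44) this becomes
\[ 0 = \frac{\nu \kappa h}{k^2 (\Delta T)} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta'' - \frac{\nu \kappa h}{\Delta T} \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta \]
This may be simplified to get
\[ \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta'' - k^2 \left(\frac{d^2}{dz^2} - k^2\right)^2 \Theta = 0 \]
which is
\[ \left(\frac{d^2}{dz^2} - k^2\right)^3 \Theta = 0 \tag{7.45} \]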
Let us now express this in dimensionless form by taking
\[ \hat{z} = \frac{z}{h} \;\Rightarrow\; z = \hat{z} h \]
and
\[ k = \frac{\hat{k}}{h} \]
We get
\[ \left(\frac{1}{h^2} \frac{d^2}{d\hat{z}^2} - \left(\frac{\hat{k}}{h}\right)^2\right)^3 \Theta = 0 \]
and this is
\[ \left(\frac{d^2}{d\hat{z}^2} - \hat{k}^2\right)^3 \Theta = 0 \]
Dropping the "hat" notation for simplicity, we have the same equation (7.45). The
general solution to this equation is
\[ \Theta(z) = \left(A_1 + B_1 z + C_1 z^2\right) \cosh(kz) + \left(A_2 + B_2 z + C_2 z^2\right) \sinh(kz) \tag{7.46} \]

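Equation (7.45) is a sixth order differential equation. This means that we need six
boundary conditions to determine the six unknown coefficients in (7.46).
We look first at the boundary conditions at the upper free surface, which in
dimensionless form, corresponds to $z = 1$. The first boundary condition at $z = 1$
deals with the surface tension
\[ \mu \frac{\partial U}{\partial z} = -\gamma_T \frac{\partial \tilde{T}}{\partial x} \tag{7.47} \]
where $\gamma_T = -\dfrac{\partial \gamma}{\partial T}$ is the variation rate of the surface tension $\gamma$ with temperature $T$, $\mu$ is
the dynamic viscosity, and $U$ comes from the $x$ component of $\tilde{\vec{v}}$. Now recall that
\[ U = \frac{\kappa h \left[\Theta'' - k^2 \Theta\right]'}{ik\, (\Delta T)} = \frac{\kappa h}{ik\, (\Delta T)} \left(\frac{d^2}{dz^2} - k^2\right) \Theta' \]
which gives
\[ \frac{\partial U}{\partial z} = \frac{\kappa h}{ik\, (\Delta T)} \left(\frac{d^2}{dz^2} - k^2\right) \Theta'' \tag{7.48} \]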
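Also $\tilde{T} = \Theta(z) \exp(ikx + \sigma t)$, so
\[ \frac{\partial \tilde{T}}{\partial x} = ik\, \Theta \exp(ikx + \sigma t) \tag{7.49} \]
Using (7.48) and (7.49) in (7.47), we obtain after dropping the exponential terms
\[ \frac{\mu \kappa h}{ik\, (\Delta T)} \left(\frac{d^2}{dz^2} - k^2\right) \Theta'' = -\gamma_T\, ik\, \Theta \tag{7.50} \]
where $\gamma_T$ denotes the variation rate of surface tension with temperature and $\mu$ the dynamic viscosity.
We must put this in dimensionless form before using it, by taking
\[ \hat{z} = \frac{z}{h} \;\Rightarrow\; z = \hat{z} h \]
and
\[ k = \frac{\hat{k}}{h} \]
This leads to (eventually)
\[ \frac{\mu \kappa h^2}{i \hat{k}\, (\Delta T)} \left(\frac{1}{h^2} \frac{d^2}{d\hat{z}^2} - \frac{\hat{k}^2}{h^2}\right) \frac{\Theta''}{h^2} = -\gamma_T\, i\, \frac{\hat{k}}{h}\, \Theta \]
which is
\[ \left(\frac{d^2}{d\hat{z}^2} - \hat{k}^2\right) \Theta'' = \hat{k}^2\, \frac{\gamma_T\, h\, \Delta T}{\mu \kappa}\, \Theta \]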

This is commonly written as
\[ \left(\frac{d^2}{d\hat{z}^2} - \hat{k}^2\right) \Theta'' = \hat{k}^2\, (Ma)\, \Theta \tag{7.51} \]
where the dimensionless constant
\[ Ma = \frac{\gamma_T\, h\, \Delta T}{\mu \kappa} \]
(with $\gamma_T$ the variation rate of surface tension with temperature, $\mu$ the dynamic viscosity
and $\kappa$ the thermal diffusivity) is known as the Marangoni number, and it represents the rate of change of surface
tension with respect to temperature. It is often used to characterize the relative
effects of surface tension and viscous forces.
As we did previously, we drop the "hat" notation for simplicity. Equation (7.51)
represents our first boundary condition on the free upper surface
\[ \left(\frac{d^2}{dz^2} - k^2\right) \Theta''(1) = k^2\, (Ma)\, \Theta(1) \]
We substitute the general solution for $\Theta(z)$ given in (7.46) to get
\[ A_1 \left[-Ma\right] + B_1 \left[2k \tanh k - Ma\right] + C_1 \left[10 + 4k \tanh k - Ma\right] \]
\[ {} + A_2 \left[-Ma \tanh k\right] + B_2 \left[2k - Ma \tanh k\right] + C_2 \left[4k + 10 \tanh k - Ma \tanh k\right] = 0 \tag{7.52} \]
Secondly at the upper free surface, we have $V(1) = 0$. This is expressed as
\[ \left(\frac{d^2}{dz^2} - k^2\right) \Theta(1) = 0 \]
Substituting the general solution for $\Theta(z)$ given in (7.46) into this and setting $z = 1$,
we get
\[ B_1 \left[2k \tanh k\right] + C_1 \left[2 + 4k \tanh k\right] + B_2 \left[2k\right] + C_2 \left[4k + 2 \tanh k\right] = 0 \tag{7.53} \]
The third and …nal boundary condition at the free upper surface comes from Newton’s
law of cooling
@T h zi
= T
@z h
which in dimensionless form (after dropping the hat notation) is
d2
k2 = (Bi)
dz 2
Here Bi is the dimensionless Biot number
h
Bi =
c

where $c$ is the heat transfer coefficient for the fluid in question. The Biot number$^2$ is
the ratio of heat transfer resistances inside of and at the surface of a given body. This
ratio determines whether or not the temperatures inside a body will vary significantly
in space when subject to a thermal gradient applied to its surface. Problems involving
small Biot numbers (i.e. $Bi \ll 1$) are thermally simple, due to uniform temperature fields
inside the body. Biot numbers much larger than 1 signal more difficult problems due
to non-uniformity of temperature fields within the object.
At the top surface z = 1; we have

d2
k2 (1) = (Bi) (1)
dz 2

obtain

A1 [k tanh k Bi] + B1 [1 + k tanh k Bi]


+C1 [2 + k tanh k Bi] + A2 [k (Bi) tanh k]
+B2 [tanh k + k (Bi) tanh k] + C2 [2 tanh k + k (Bi) tanh k] = 0 (7.54)

We switch our attention now to the bottom surface $z = 0$. The first two boundary
conditions at the lower rigid surface are the no-slip conditions $U(0) = V(0) = 0$.
First $U(0) = 0$ implies that
\[ \left(\frac{d^2}{dz^2} - k^2\right) \Theta'(0) = 0 \]
Substituting the general solution for $\Theta(z)$ given in (7.46) eventually leads to
\[ k B_1 + 3 C_2 = 0 \tag{7.55} \]
Next, $V(0) = 0$ is
\[ \left(\frac{d^2}{dz^2} - k^2\right) \Theta(0) = 0 \]
Substituting the general solution for $\Theta(z)$ given in (7.46) into this, we get
\[ C_1 + k B_2 = 0 \tag{7.56} \]
The third and final boundary condition at the lower rigid surface is that the
temperature perturbation vanishes there, $\Theta(0) = 0$, which leads to
\[ A_1 = 0 \tag{7.57} \]
2
Named after the French physicist Jean-Baptiste Biot (1774–1862).

As the fluid layer is very thin, for simplicity, we take the Biot number to be zero.
We then may write equations (7.52) to (7.57) in matrix form as follows:
\[ G \begin{pmatrix} A_1 \\ B_1 \\ C_1 \\ A_2 \\ B_2 \\ C_2 \end{pmatrix} = 0 \]
where $G$ is a $6 \times 6$ matrix. The solvability condition for this system is that the
determinant of the matrix $G$ is zero, which can be simplified by MAPLE, and solved
to get an expression for the Marangoni number $Ma$ in terms of the dimensionless
wavenumber $k$. We can then produce a marginal stability curve and determine the
critical Marangoni number $Ma_{cr}$ at which the system loses stability.
Chapter 8

Introduction to Mathematical
Epidemiology

8.1 What is Epidemiology


Epidemiology is the study of patterns of health, illness and associated factors for
human populations. The term "epidemiology" was first used to describe the study of
epidemics in the early 19th century, and until the twentieth century, epidemiological
studies were centred around the study of infectious diseases. In this chapter, we will
consider the mathematical modelling of infectious diseases.

8.2 Classification of Infectious Diseases


An infectious disease is a clinical illness that results from the presence of a pathogenic
microbial agent in the human body. This microbial agent can be bacterial (as in
the case of tuberculosis and pneumonia), viral (causing HIV, influenza, SARS and covid-19, for
example), fungal (resulting in fungal diseases of the skin etc.), or parasitic (among
the more common parasites are protozoa). There are also cases of microbial
agents in the form of toxic proteins known as prions.
Communicable diseases are infectious diseases that can be transmitted from one
infectious person to another either directly or indirectly. Many infectious diseases
are communicable diseases, but there are infectious diseases that are not
communicable (such as tetanus). Transmittable diseases are infectious diseases that can be
transmitted from one person to another in different ways, such as through surgical
instruments or transplants.
Infectious diseases are classified according to the means of transmission as follows:

• Person-to-person transmitted: These require direct contact (such as sexually
transmitted diseases like HIV, viral infections like covid-19) or indirect
contact (via exchange of an infected object, blood or other body fluids).

• Airborne transmission: Occurs on the inhalation of infected air (examples
of such diseases include influenza, smallpox, chickenpox, tuberculosis, and
possibly covid-19).

• Food and waterborne diseases: Transmitted through the ingestion of
contaminated food (such as salmonella poisoning) or water (such as cholera).

• Vector-borne diseases: Transmitted via a vector, such as a mosquito, tick
or snail. Examples include malaria, dengue, Zika, West Nile virus etc.

• Vertical transmission: Occurs when the disease is transmitted through the
placenta from mother to child before or at birth. (Examples: HIV, hepatitis B,
syphilis, rubella).

The model utilized for a particular type of infectious disease depends on the mode
of transmission. We should note that many infectious diseases have more than one
pathway of transmission (such as HIV, avian influenza).

8.3 Basic Definitions


The following concepts are essential for the construction of mathematical models:

• Exposed Individuals: Healthy individuals who have contact with a disease
transmitter. Note that an exposed individual may not develop the disease, and
is typically not infectious himself. In some mathematical models, a simplifying
assumption is made that all exposed individuals will develop the disease.

• Infected/Infectious Individuals: After the pathogen has established itself
in the exposed individual, he becomes infected. Infected individuals who can
transmit the disease are infectious. (Note that infected individuals are not
necessarily infectious for the entire infected period).

• Latent Individuals: Individuals who are infected but not yet infectious.

• Latent Period: Period of time from infection up to the point when the host
becomes infectious.

• Incubation Period: Period of time between exposure to an infectious agent
and the onset of the symptoms of the disease.

• Incidence: Number of individuals who become ill during a specified time
interval.

• Prevalence: Number of people who have the disease at a given time.

8.4 Kermack-McKendrick SIR Epidemic Model


The SIR model for infectious disease spread was first proposed by Kermack and
McKendrick in 1927. It is based on a classification of the population in three groups:

• Susceptible: These are the individuals who are healthy but can potentially
contract the disease. This class is referred to as $S$.

• Infected: These are the individuals who are sick with the disease. In this
model, an assumption is made that the infected individuals are also infectious.
This class is referred to as $I$.

• Removed/Recovered: These are the individuals who have recovered and can
no longer be infected by the disease (i.e. they are now immune). This class is
referred to as $R$.

The number of individuals in each class may change with time, so we say that
each is a function of time (i.e. $S(t)$, $I(t)$, $R(t)$ are all functions of time $t$). The basic
SIR model consists of a system of ordinary differential equations that describes the
flow of individuals between each of the three classes over time.
The total population size $N$ is the sum of each of these three classes, so the first
equation we can write to capture this is
\[ N = S(t) + I(t) + R(t). \tag{8.1} \]
Note that several assumptions have been made in this simple model. The main
assumptions are that the infected individuals are all infectious, and that the total
population size $N$ remains constant (and these assumptions can be relaxed in more
complicated models).
A susceptible individual from class $S$ becomes infected when he/she comes into
contact with an infectious individual from class $I$, and the individual then moves from
class $S$ to class $I$. It follows that the number of susceptible individuals decreases as
the number of infected individuals increases. The number of individuals who become
infected per unit time is called incidence. The rate of change of susceptible class $S$ is
therefore expressed as
\[ \frac{dS}{dt} = -\text{incidence} \tag{8.2} \]
In order to complete this idea, we make the following assumptions:

• The number of contacts made by one infectious individual is directly
proportional to the total population size $N$ with per capita contact rate $c$. Hence we
may express the number of contacts made by an infected individual per unit
time as $cN$.

• The probability that a contact is made with a susceptible individual is $\dfrac{S}{N}$.

It follows that the number of contacts with susceptible individuals that any given
infectious individual makes per unit time is
\[ cN \frac{S}{N} = cS. \tag{8.3} \]

Since not every contact with a susceptible individual will lead to transmission and
hence the infected state of that individual, let $p$ be the probability that such contact
results in transmission. Therefore from (8.3) it follows that the number of susceptible
individuals who become infected per unit time per infectious individual is
\[ pcS = \beta S \tag{8.4} \]
where we take $\beta = pc$. Here $\beta$ is a constant of proportionality that is referred to as
the transmission rate constant.
We can therefore look back at our definition of incidence (i.e. the number of
individuals who become infected per unit time), and conclude from (8.4) that
\[ \text{incidence} = \beta I S = \lambda S \tag{8.5} \]
where $\lambda = \lambda(t)$ is a time dependent function defined as $\lambda(t) = \beta I$. This function
$\lambda(t)$ is referred to as the force of infection. We refer to the number of infected
individuals $I(t)$ as the prevalence of the disease. Using (8.5) in (8.2), we get the
ordinary differential equation for the susceptible class to be
\[ \frac{dS}{dt} = S' = -\beta I S. \tag{8.6} \]
The individuals who belong to the infected class $I$ can either recover from the
illness or die from it. For this simple model, we classify all individuals who leave the
infected class as "recovered" or "removed". The "recovery rate" for the removal of
individuals from this class is termed $\alpha$, and for simplicity it is taken to be constant.
Hence the number of infected individuals who recover per unit time is $\alpha I$. We can now
write the ordinary differential equation that describes the flow of infected individuals
from class $I$ as
\[ \frac{dI}{dt} = I' = \beta I S - \alpha I \tag{8.7} \]
Individuals who recover from the disease move from class $I$ to class $R$. The
ordinary differential equation describing this is
\[ \frac{dR}{dt} = R' = \alpha I. \tag{8.8} \]

The full system of differential equations for the SIR model is
\[ S'(t) = -\beta I S \tag{8.9} \]
\[ I'(t) = \beta I S - \alpha I \tag{8.10} \]
\[ R'(t) = \alpha I \tag{8.11} \]
and it is solved subject to a set of initial conditions $S(0)$, $I(0)$, $R(0)$.
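To illustrate how the system (8.9) to (8.11) may be solved in practice, the following Python sketch integrates it with the classical fourth order Runge-Kutta method. The choice of integrator, the function names, the step size and the parameter values ($\beta$, $\alpha$ and the initial conditions) are all assumptions made purely for illustration, and any standard ODE solver could be used instead.

```python
def sir_rhs(state, beta, alpha):
    """Right-hand sides of (8.9)-(8.11) for state = (S, I, R)."""
    S, I, R = state
    new_infections = beta * I * S   # incidence
    recoveries = alpha * I
    return (-new_infections, new_infections - recoveries, recoveries)


def rk4_step(state, dt, beta, alpha):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = sir_rhs(state, beta, alpha)
    k2 = sir_rhs(shifted(state, k1, dt / 2.0), beta, alpha)
    k3 = sir_rhs(shifted(state, k2, dt / 2.0), beta, alpha)
    k4 = sir_rhs(shifted(state, k3, dt), beta, alpha)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(S0, I0, R0, beta, alpha, t_end, dt=0.1):
    """Return a list of (t, S, I, R) tuples from t = 0 to t = t_end."""
    steps = int(round(t_end / dt))
    state = (float(S0), float(I0), float(R0))
    history = [(0.0,) + state]
    for n in range(1, steps + 1):
        state = rk4_step(state, dt, beta, alpha)
        history.append((n * dt,) + state)
    return history


# Illustrative (assumed) values: 990 susceptible, 10 infected, none recovered
history = simulate_sir(990, 10, 0, beta=0.0005, alpha=0.1, t_end=100.0)
```

Each entry of `history` holds the time and the sizes of the three classes, so the sum $S + I + R$ can be monitored along the trajectory as a simple consistency check on the computation.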
As always, when a model is formulated, we must be very careful to ensure that the
units of all quantities are taken into account and balanced accordingly on both sides
of each equation. The derivatives on the left-hand side of all ordinary differential
equations (8.9) to (8.11) have the units: number of people per unit time. It follows
that since $S$ and $I$ have units of "number of people", then from equation (8.9), $\beta$
(transmission rate constant) must have units (number of people $\times$ unit of time)$^{-1}$. Since
$\beta = pc$, and $p$ is a probability (i.e. a dimensionless number between 0 and 1), then
the units of $c$ (per capita contact rate) are also (number of people $\times$ unit of time)$^{-1}$.
Hence the number of contacts made by an infected individual per unit time, defined
as $cN$ previously, has units equivalent to (unit of time)$^{-1}$. From the equation (8.10),
we can also see that the recovery rate $\alpha$ must have units equivalent to (unit of time)$^{-1}$.
Well-posed mathematical models of physical phenomena have the following prop-
erties:

1. A solution exists

2. The solution is unique

3. The solution’s behaviour changes continuously with the initial conditions.

Clearly, we must ensure that any model that is proposed is well-posed. For the
system of ordinary differential equations (8.9)-(8.11) describing the problem to be
well-posed, every set of initial conditions must produce a unique solution. Since we
are dealing with population sizes, and we can never have a negative population,
we must require that the initial conditions $S(0)$, $I(0)$, $R(0)$ are nonnegative, and that the
solutions remain nonnegative always. Since we know that
\[ N = S(t) + I(t) + R(t) \]
it follows that the total population size $N$ at time zero is given by
\[ N(0) = S(0) + I(0) + R(0) \]
and also that
\begin{align*}
N'(t) &= S'(t) + I'(t) + R'(t) \\
&= -\beta I S + \beta I S - \alpha I + \alpha I = 0.
\end{align*}
Hence $N$ is a constant in this simple model, i.e. $N = N(0)$. The type of model that we
have described above is commonly referred to as a compartment model, because each
individual in the population belongs to a particular compartment. Such models can
be depicted using simple flow-charts. The Kermack-McKendrick SIR compartmental
model that we have described previously can be represented by the simple flowchart
shown in Figure 8.1.

Figure 8.1: Kermack-McKendrick SIR model (taken from the text "An Introduction
to Mathematical Epidemiology" by Maia Martcheva, page 12).
There are several assumptions that are made in this simple model:

1. There are no births and deaths in the population, so the number of individuals
is fixed at all times.
2. The population is closed - nobody can leave, and no individual from outside
can enter the population.
3. All recovered individuals have complete immunity and cannot be re-infected.
Some examples of such diseases are rubella, chickenpox and smallpox.

Of course, these assumptions are simplifying ones, and can be relaxed. If they are
relaxed, the resulting model (which will no longer be the one we are discussing here)
has more complicated dynamics.
There are several things we can say about the dynamics of the Kermack-McKendrick
model. As $S'(t) < 0$ for all time $t$, it follows that $S(t)$ is a monotonically decreasing
function of time. This means that although $S(t)$ is always positive (as we cannot have a
negative number of people), the number of individuals in this compartment decreases
over time. Since $S(t)$ is decreasing and bounded below by zero, we may conclude that
\[ \lim_{t \to \infty} S(t) = S_\infty \tag{8.12} \]
where $S_\infty$ is to be determined.
In the same manner, the number of recovered people is always increasing, as $R'(t) > 0$
for all time $t$. Of course, the number of recovered can never surpass $N$, the total
population under study. We may say then that
\[ \lim_{t \to \infty} R(t) = R_\infty \]
where the value of $R_\infty$ is to be determined.


We note that the number of infected people may be monotonically decreasing, or
it may increase to a maximum number and then decrease to zero (as the number of
infected people cannot keep increasing indefinitely with a constant population $N$).
Recall that the equation for compartment $I$ is
\[ I'(t) = \beta I S - \alpha I \]
hence at the initial time we have
\[ I'(0) = (\beta S(0) - \alpha) I(0). \]
Since $I(0) > 0$ always, a necessary and sufficient condition for $I'(0) > 0$ is that
$\beta S(0) - \alpha > 0$, i.e.
\[ \frac{\beta S(0)}{\alpha} > 1. \]
This is the case where the number of infected individuals starts out increasing. We
defined prevalence to be the number of people who have the disease at a given time,
i.e. the number of people in compartment $I$. An epidemic is defined as a sudden
increase in prevalence and then a subsequent decline to zero.
Consider now the equations (8.9) and (8.11) governing the flow in compartments
$S$ and $R$. Dividing these, we get
\[ \frac{S'(t)}{R'(t)} = \frac{-\beta I S}{\alpha I} \]
which gives
\[ \frac{dS}{dR} = -\frac{\beta}{\alpha} S. \]
Separating and integrating, we have
\begin{align*}
\int \frac{dS}{S} &= -\frac{\beta}{\alpha} \int dR \\
\ln |S| &= -\frac{\beta}{\alpha} R + C \\
S &= A e^{-(\beta/\alpha) R}
\end{align*}
where $A = e^C$ is a constant. At time $t = 0$, $S = S(0)$ and $R = R(0) = 0$, so we have
\[ S(0) = A. \]
Hence we get
\[ S = S(0) e^{-(\beta/\alpha) R}. \]
Now since $R \le N$ always, it follows that $e^{-(\beta/\alpha) R} \ge e^{-(\beta/\alpha) N}$, and therefore for all
time $t$
\[ S = S(0) e^{-(\beta/\alpha) R} \ge S(0) e^{-(\beta/\alpha) N} > 0 \]
since $S(0) > 0$ and $e^{-(\beta/\alpha) N} > 0$ at all times $t$. Therefore the final size of the epidemic
$S_\infty$ is always positive, i.e. there are always susceptible individuals present. This is
actually a realistic result, since there are always individuals in any society who have
natural immunity and never catch the disease. To illustrate that the epidemic dies
out, we must show that
\[ \lim_{t \to \infty} I(t) = I_\infty = 0. \tag{8.13} \]

Now
\[ \int_0^\infty S'(t)\,dt = S_\infty - S_0 \]
(where we use the notation $S(0) = S_0$), and since from (8.9)
\[ \int_0^\infty S'(t)\,dt = -\beta \int_0^\infty I(t) S(t)\,dt \]
it follows that
\[ S_\infty - S_0 = -\beta \int_0^\infty I(t) S(t)\,dt. \]
Also, since $S(t) \ge S_\infty$ for all time $t$, we may say that
\[ S_0 - S_\infty \ge \beta S_\infty \int_0^\infty I(t)\,dt \]
which implies that $I(t)$ is integrable over the time interval $[0, \infty)$ and that (8.13)
holds.
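Let us now solve the system of equations (8.9)-(8.11), re-written below:
\begin{align}
S'(t) &= -\beta I S \tag{8.14}\\
I'(t) &= \beta I S - \alpha I \tag{8.15}\\
R'(t) &= \alpha I \tag{8.16}
\end{align}
where
\[ N = I + S + R. \]
Since the $S$ and $I$ equations are coupled and do not involve $R$, we take
\[ R = N - S - I \]
and solve the $S$ and $I$ equations separately. Dividing equation (8.10) by (8.9), we
have
\[ \frac{I'}{S'} = \frac{\beta I S - \alpha I}{-\beta I S} = -1 + \frac{\alpha}{\beta S} \]
so that
\[ I' = \left( -1 + \frac{\alpha}{\beta S} \right) S'. \]
Integrating, we get
\[ I = -S + \frac{\alpha}{\beta} \ln S + C \tag{8.17} \]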

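where $C$ is the integration constant. Given initial conditions $I(0) = I_0$, $S(0) = S_0$,
we get
\[ I_0 = -S_0 + \frac{\alpha}{\beta} \ln S_0 + C \]
which gives
\[ C = I_0 + S_0 - \frac{\alpha}{\beta} \ln S_0. \tag{8.18} \]
Hence, substituting (8.18) into (8.17), we have
\[ I = -S + \frac{\alpha}{\beta} \ln S + I_0 + S_0 - \frac{\alpha}{\beta} \ln S_0. \tag{8.19} \]
Also, from (8.12) and (8.13) we know that $\lim_{t \to \infty} S(t) = S_\infty$ and $\lim_{t \to \infty} I(t) = I_\infty = 0$,
so taking the limit of (8.17) as $t \to \infty$, we have
\[ 0 = -S_\infty + \frac{\alpha}{\beta} \ln S_\infty + C \]
so
\[ C = S_\infty - \frac{\alpha}{\beta} \ln S_\infty. \tag{8.20} \]
Equating the results from (8.18) and (8.20), we get
\begin{align}
I_0 + S_0 - \frac{\alpha}{\beta} \ln S_0 &= S_\infty - \frac{\alpha}{\beta} \ln S_\infty \notag\\
I_0 + S_0 - S_\infty &= \frac{\alpha}{\beta} \left( \ln S_0 - \ln S_\infty \right) \notag\\
\frac{\beta}{\alpha} &= \frac{\ln (S_0 / S_\infty)}{I_0 + S_0 - S_\infty}. \tag{8.21}
\end{align}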
The maximum number of infected individuals ($I_{max}$) during the epidemic is an
important number to determine, as it measures the actual extent of the epidemic
(beyond that point, the number of infected individuals decreases). We can
calculate $I_{max}$ from the results we have obtained so far. Recall from (8.10) that
\[ I'(t) = \beta I S - \alpha I. \]
The maximum number of infected individuals during the epidemic $I_{max}$ will occur
when $I'(t) = 0$, i.e. when
\[ 0 = \beta I S - \alpha I \;\Rightarrow\; S = \frac{\alpha}{\beta}. \tag{8.22} \]
We can now substitute (8.22) into (8.19) to obtain
\[ I_{max} = -\frac{\alpha}{\beta} + \frac{\alpha}{\beta} \ln \frac{\alpha}{\beta} + I_0 + S_0 - \frac{\alpha}{\beta} \ln S_0. \tag{8.23} \]
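As a quick check of (8.23), one can evaluate the orbit relation (8.19) on a fine grid of $S$ values and confirm that its largest value agrees with the closed-form peak. The sketch below does this; the shorthand `rho` for $\alpha/\beta$ and the numerical values are assumptions made for illustration only.

```python
import math


def infected_on_orbit(s, s0, i0, rho):
    """I as a function of S along a solution, equation (8.19), with rho = alpha/beta."""
    return -s + rho * math.log(s) + i0 + s0 - rho * math.log(s0)


def peak_infected(s0, i0, rho):
    """Closed-form maximum (8.23): the orbit evaluated at S = alpha/beta."""
    return infected_on_orbit(rho, s0, i0, rho)


def brute_force_peak(s0, i0, rho, n_grid=10_000):
    """Largest value of I over a uniform grid of S values in (0, s0]."""
    grid = (s0 * k / n_grid for k in range(1, n_grid + 1))
    return max(infected_on_orbit(s, s0, i0, rho) for s in grid)


if __name__ == "__main__":
    s0, i0, rho = 900.0, 10.0, 300.0   # hypothetical values
    print(peak_infected(s0, i0, rho), brute_force_peak(s0, i0, rho))
```

The two numbers should agree to within the grid resolution, since (8.19) is concave in $S$ with its maximum at $S = \alpha/\beta$.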

8.5 Basic Reproduction Number


Since $R$ is determined once $S$ and $I$ are known (as the $S$ and $I$ equations are coupled
and we have an equation linking $I$ and $S$ to $R$), we can ignore the $R$ equation and
consider only the $S$ and $I$ equations
\begin{align}
S'(t) &= -\beta I S \tag{8.24}\\
I'(t) &= \beta I S - \alpha I. \tag{8.25}
\end{align}
Note that the model only makes sense if $S(t) \ge 0$ and $I(t) \ge 0$, and if either $S(t) = 0$
or $I(t) = 0$, we consider the epidemic to be over. Hence, we consider $S(t) > 0$ and
$I(t) > 0$. From the equation
\[ S'(t) = -\beta I S \]
we see that $S' < 0$, so $S$ decreases for all $t$. From the equation
\[ I'(t) = (\beta S - \alpha) I \]
we see that $I' > 0$ if and only if $S > \alpha/\beta$. Since $S$ decreases for all $t$, it follows that $I$ must
ultimately decrease and approach zero. When $S(0) < \alpha/\beta$, $I$ decreases to zero (so there
is no epidemic), and when $S(0) > \alpha/\beta$, there is an epidemic: $I$ first increases
to a maximum value $I_{max}$ (attained when $S(t) = \alpha/\beta$) and then decreases to zero. The
quantity $\beta S(0)/\alpha$ is a threshold quantity referred to as the basic reproduction
number $R_0$. This number determines whether or not there is an epidemic. If $R_0 < 1$,
there is no epidemic and the infection dies out. If $R_0 > 1$, there is an epidemic.
The basic reproduction number $R_0$ is the number of secondary infections caused by
a single infected individual introduced into a susceptible population.
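The threshold condition can be expressed in a few lines of code. The sketch below computes $R_0 = \beta S(0)/\alpha$ and the initial growth rate $I'(0) = (\beta S(0) - \alpha) I(0)$; the function names and the sample values are hypothetical and chosen only to show one case on each side of the threshold.

```python
def basic_reproduction_number(s0, beta, alpha):
    """R0 = beta * S(0) / alpha for the Kermack-McKendrick SIR model."""
    return beta * s0 / alpha


def initial_growth(i0, s0, beta, alpha):
    """I'(0) = (beta * S(0) - alpha) * I(0)."""
    return (beta * s0 - alpha) * i0


def epidemic_occurs(s0, beta, alpha):
    """True when R0 > 1, i.e. when the number of infected initially increases."""
    return basic_reproduction_number(s0, beta, alpha) > 1.0


if __name__ == "__main__":
    # Hypothetical parameters: alpha/beta = 400, so the threshold is S(0) = 400.
    for s0 in (990, 300):
        r0 = basic_reproduction_number(s0, beta=0.0005, alpha=0.2)
        print(s0, round(r0, 3), epidemic_occurs(s0, 0.0005, 0.2))
```

Note that $R_0 > 1$ and $I'(0) > 0$ are the same condition, so the two functions always agree in sign.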

8.6 Estimating Parameters


8.6.1 Recovery Rate
Most infectious diseases are characterized by the mean duration of the infectious
period of individuals carrying the disease. We can use this to estimate the recovery
rate $\alpha$ as follows.
Assume that there is no inflow into the infectious class $I$, and that $I(0) = I_0$. It
follows that the equation governing the dynamics of class $I$ can be expressed as
\[ I'(t) = -\alpha I, \qquad I(0) = I_0. \]
Integrating, we have
\begin{align*}
\int \frac{I'}{I}\,dt &= -\int \alpha\,dt \\
I(t) &= A e^{-\alpha t}
\end{align*}
where $A$ is a constant. Since $I(0) = I_0$, it follows that for $t \ge 0$,
\[ I(t) = I_0 e^{-\alpha t} \;\Rightarrow\; \frac{I(t)}{I_0} = e^{-\alpha t}. \]
Hence the proportion of people who are still infectious at time $t$ is given by $e^{-\alpha t}$.
This is also the probability of still being infectious at time $t$. It follows that the fraction
of individuals who have left the infectious class by time $t \ge 0$ is
\[ 1 - e^{-\alpha t}. \]
We may therefore say that this is the probability of recovering and leaving the
infectious class by time $t \ge 0$. Hence we may define the related probability distribution function
$F(t)$ for recovering/leaving the infectious class $I$ as
\[ F(t) = \begin{cases} 0, & t < 0 \\ 1 - e^{-\alpha t}, & t \ge 0. \end{cases} \]
It follows that the probability density function is
\[ f(t) = \frac{dF}{dt} = \begin{cases} 0, & t < 0 \\ \alpha e^{-\alpha t}, & t \ge 0. \end{cases} \]
The average (mean) time spent in the infectious class is therefore (via the formula for
the expected value of a random variable)
\[ \int_{-\infty}^{\infty} t f(t)\,dt = \frac{1}{\alpha}. \]
To illustrate these ideas with an example, consider the well-known infectious
disease influenza. Usually, people are sick with influenza for 3 to 7 days, so the mean
time spent as infectious is approximately 5 days. The recovery rate for influenza is
therefore estimated to be $\alpha = 1/5$, measured in units of [days]$^{-1}$.
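The identity "mean infectious period $= 1/\alpha$" can be checked numerically by integrating $t f(t)$ with the trapezoidal rule. This is only a sketch: the truncation point of the integral and the number of subintervals are assumptions chosen so that the neglected tail is negligible.

```python
import math


def infectious_density(t, alpha):
    """Density f(t) = alpha * exp(-alpha * t) for t >= 0, and 0 otherwise."""
    return alpha * math.exp(-alpha * t) if t >= 0 else 0.0


def mean_infectious_period(alpha, n=200_000):
    """Trapezoidal approximation of the integral of t * f(t) over [0, 40/alpha].

    The tail beyond 40/alpha is of order exp(-40) and is ignored.
    """
    t_max = 40.0 / alpha
    h = t_max / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * t * infectious_density(t, alpha)
    return total * h


if __name__ == "__main__":
    # Influenza example from the text: alpha = 1/5 per day.
    print(mean_infectious_period(0.2))
```

For $\alpha = 1/5$ the computed mean should be very close to 5 days, in line with the influenza estimate above.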

8.6.2 Transmission Rate Constant


Estimating the transmission rate constant $\beta$ is usually the most difficult task faced when
using an infectious disease model. As the Kermack-McKendrick model is a simple
one, the related model equations are straightforward. We have already studied these
equations to obtain mathematical expressions relating several other key parameters,
so it is possible to obtain a good estimate for $\beta$ using these equations.
The following example$^1$ illustrates the process for estimating $\beta$ for an infectious
disease via the Kermack-McKendrick model. The case being studied is a 1978 epidemic
of influenza at an English boarding school with 763 resident boys. One boy returned
from vacation with a fever over the period 15th - 18th January, and on January 22nd,
there were three boys who were ill. The number of boys who never got sick was 19.
The average time each of the sick boys was ill was between 5 and 6 days, and it
was estimated that each boy was infectious for just over 2 days. The table below$^2$
summarizes the daily number of boys who fell ill with influenza from January 22nd
onwards.

Day    # Infected        Day    # Infected
 3         25             9        192
 4         75            10        126
 5        227            11         71
 6        296            12         28
 7        258            13         11
 8        236            14          7

$^1$ Taken from the textbook "An Introduction to Mathematical Epidemiology" by Maia Martcheva.
$^2$ Taken from "Influenza in a Boarding School", British Medical Journal, March 1978.

From the data, we can take $S_3 = 738$ as the number of susceptible boys on the
third day (from the table, if 25 are infected, then 738 are susceptible, as the total
number of boys is 763), and $I_3 = 25$. Taking day 3 as the initial time, we set
$S_0 = S_3 = 738$ and $I_0 = I_3 = 25$. As 19 boys escape infection, it follows that
$S_\infty = 19$. Using equation (8.21),
\[ \frac{\beta}{\alpha} = \frac{\ln (S_0 / S_\infty)}{I_0 + S_0 - S_\infty} = \frac{\ln (738/19)}{25 + 738 - 19} = 0.004918689310. \]
If the infectious period is taken to be 2.1 days, it follows that the recovery rate is
\[ \alpha = \frac{1}{2.1} = 0.4761904762 \]
and hence
\begin{align*}
\beta &= 0.004918689310 \times 0.4761904762 \\
&= 0.002342233005.
\end{align*}
Since
\[ \frac{\alpha}{\beta} = \frac{1}{0.004918689310} = 203.3061934 \]
we can calculate the maximum number of expected infected individuals $I_{max}$ from
equation (8.23):
\begin{align*}
I_{max} &= -\frac{\alpha}{\beta} + \frac{\alpha}{\beta} \ln \frac{\alpha}{\beta} + I_0 + S_0 - \frac{\alpha}{\beta} \ln S_0 \\
&= -203.3061934 + 203.3061934 \ln 203.3061934 + 738 + 25 - 203.3061934 \ln 738 \\
&= 298 \quad \text{(rounded off to the nearest individual).}
\end{align*}
Note that the dataset lists a maximum number of infected individuals of
296, so this model does make a good prediction of the epidemic.
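The arithmetic of this example is easy to reproduce. The following sketch recomputes $\beta/\alpha$, $\alpha$, $\beta$ and $I_{max}$ from the boarding school data quoted above, using equations (8.21) and (8.23); the variable names are illustrative choices.

```python
import math

# Data from the boarding school example (day 3 taken as the initial time).
S0, I0, S_INF = 738, 25, 19
INFECTIOUS_PERIOD_DAYS = 2.1

# Equation (8.21): beta/alpha from the initial and final susceptible numbers.
beta_over_alpha = math.log(S0 / S_INF) / (I0 + S0 - S_INF)

# Recovery rate is the reciprocal of the mean infectious period.
alpha = 1.0 / INFECTIOUS_PERIOD_DAYS
beta = beta_over_alpha * alpha

# Equation (8.23): peak number of infected, with rho = alpha/beta.
rho = 1.0 / beta_over_alpha
i_max = -rho + rho * math.log(rho) + I0 + S0 - rho * math.log(S0)

print(f"beta/alpha = {beta_over_alpha:.9f}")
print(f"alpha      = {alpha:.6f} per day")
print(f"beta       = {beta:.9f}")
print(f"alpha/beta = {rho:.4f}")
print(f"I_max      = {i_max:.2f}")
```

Rounding `i_max` to the nearest individual gives the value 298 reported in the text.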

8.7 SIS Model


When the assumption of permanent recovery is relaxed, repeated reinfection is
possible (as in diseases like influenza). In the simplest of cases, a recovered individual is
immediately susceptible again. In reality, there is often a small period of immunity
after recovery.
For the simplest case where a recovered individual becomes immediately
susceptible, the SIS model is utilized. This is depicted in Figure 8.2 (note the flow
from the compartment $I$ back to the susceptible compartment $S$). The equations
describing this model are
\begin{align}
\frac{dS}{dt} &= S' = -\beta I S + \alpha I, \tag{8.26}\\
\frac{dI}{dt} &= I' = \beta I S - \alpha I. \tag{8.27}
\end{align}

Figure 8.2: Diagram illustrating the SIS model (taken from the text "An Introduction to
Mathematical Epidemiology" by Maia Martcheva, page 19).
As the total number of individuals in the population is $N = S + I$, it follows that
\[ S' + I' = N' = 0 \]
which indicates that the total population size $N$ is constant over time, so $N = N(0)$.
With respect to the initial number of infected and susceptible individuals $S(0)$, $I(0)$,
it also follows that
\[ N = S(0) + I(0). \]
We can reduce the SIS model equations to a single logistic equation by first expressing
$S$ as
\[ S = N - I \]
and then writing equation (8.27) as
\begin{align*}
I' &= \beta I (N - I) - \alpha I = I (\beta N - \alpha - \beta I) \\
&= I (\beta N - \alpha) \left( 1 - \frac{\beta I}{\beta N - \alpha} \right).
\end{align*}
This can be written as a logistic equation
\[ I'(t) = r I \left( 1 - \frac{I}{K} \right) \tag{8.28} \]
where
\[ r = \beta N - \alpha, \qquad K = \frac{r}{\beta}. \]

For the logistic equation (8.28), the parameter $r$ is referred to as the growth rate.
When $r < 0$, the number of infected individuals $I(t) \to 0$ as $t \to \infty$. This can be
seen from (8.28): when $r < 0$, then $K < 0$, so $1 - I/K > 1$ for $I > 0$ and hence
$I'(t) \le rI < 0$. Note that if we were to solve the equation
\[ I'(t) = rI \]
we get
\[ I(t) = I(0) e^{rt} \]
and this clearly tends to zero as $t \to \infty$ for $r < 0$.
When instead $r > 0$, we can solve equation (8.28) via separation of variables
as follows:
\begin{align*}
\frac{dI}{dt} &= r I \left( 1 - \frac{I}{K} \right) \\
\int \frac{1}{I \left( 1 - \frac{I}{K} \right)}\,dI &= \int r\,dt \\
\int \frac{1}{I}\,dI + \int \frac{1}{K - I}\,dI &= rt + C \\
\ln \left| \frac{I}{K - I} \right| &= rt + C
\end{align*}
where $C$ is an integration constant, and where we assumed that $I(t) \ne K$ (otherwise
we could not divide by $1 - I/K$ in the solution above). Since $I(t) = K$ is indeed a
solution to the original logistic equation (8.28), we will add it to the set of solutions
afterwards. To find the integration constant $C$, we use the initial condition $I = I(0)$
at $t = 0$ to get
\[ \ln \left| \frac{I(0)}{K - I(0)} \right| = C \]
hence we have
\[ \ln \left| \frac{I}{K - I} \right| = rt + \ln \left| \frac{I(0)}{K - I(0)} \right| \]
i.e.
\[ \ln \left| \frac{I (K - I(0))}{I(0) (K - I)} \right| = rt. \]
Since $I > 0$ for all $t$, and since both $K - I(0)$ and $K - I$ will have the same sign
(either both positive or both negative), we can remove the modulus sign to have
\[ \ln \frac{I (K - I(0))}{I(0) (K - I)} = rt \]
which means that
\[ \frac{I (K - I(0))}{I(0) (K - I)} = e^{rt} \]
so
\[ \frac{I}{K - I} = B e^{rt} \]
where $B = \dfrac{I(0)}{K - I(0)}$. It follows that
\[ I \left( 1 + B e^{rt} \right) = K B e^{rt} \]
hence we obtain a solution for $I(t)$ as
\[ I(t) = \frac{K B e^{rt}}{1 + B e^{rt}}. \]
When $t \to \infty$, we get
\[ \lim_{t \to \infty} I(t) = \lim_{t \to \infty} \frac{K B e^{rt}}{1 + B e^{rt}} = \lim_{t \to \infty} \frac{K B}{e^{-rt} + B} = K \]
which means that the disease is endemic in the population (it remains indefinitely in
the population), and the number of infected individuals in the population in the long
term will be approximately $K = \dfrac{\beta N - \alpha}{\beta}$. Recall that the above results are for the
case $r = \beta N - \alpha > 0$, i.e. $\dfrac{\beta N}{\alpha} > 1$. This condition is written as
\[ R_0 = \frac{\beta N}{\alpha} > 1 \]
where $R_0$ is the basic reproduction number of the disease. It follows that the condition
for the disease to remain endemic in the population is $R_0 > 1$. If instead $R_0 < 1$,
the number of infected individuals declines to zero, and the disease eventually will no
longer be present in the population.
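The closed-form solution of the SIS model can be evaluated directly. The sketch below implements $I(t) = KB/(e^{-rt} + B)$ with $r = \beta N - \alpha$, $K = r/\beta$ and $B = I(0)/(K - I(0))$; this form avoids overflow of $e^{rt}$ for large $t$ when $r > 0$. The function name and parameter values are assumptions for illustration, and the code requires $r \ne 0$ and $I(0) \ne K$.

```python
import math


def sis_infected(t, i0, beta, alpha, n):
    """Closed-form SIS solution I(t) = K*B / (exp(-r*t) + B).

    r = beta*N - alpha, K = r/beta, B = I(0) / (K - I(0)).
    Assumes r != 0 and I(0) != K.
    """
    r = beta * n - alpha
    k = r / beta
    b = i0 / (k - i0)
    return k * b / (math.exp(-r * t) + b)


if __name__ == "__main__":
    # Hypothetical parameters with R0 = beta*N/alpha = 2, so K = 500.
    for t in (0, 2, 5, 10, 50):
        print(t, round(sis_infected(t, i0=10, beta=0.002, alpha=1.0, n=1000), 3))
```

With $R_0 > 1$ the values approach $K$, while choosing $\alpha$ large enough that $R_0 < 1$ makes the same formula decay to zero.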

8.7.1 Qualitative Analysis of the Logistic Equation


It is possible to deduce properties of the solutions of a differential equation
without directly solving the equation to obtain an explicit solution. This technique is
often utilized, as the equations governing the behaviour of a disease are often highly
nonlinear, and therefore more complicated than the ones that we have encountered
in the simple SIR and SIS models previously described. In order to illustrate this
technique, we will utilize the SIS model developed in this section. Note that this
method can also be applied to a system of differential equations.
A nonlinear differential equation with constant coefficients usually has solutions
that are independent of time. These solutions are referred to as equilibrium points.
They can be found by setting all time derivatives to zero, and solving
the resulting equations. Since the equilibrium points are independent of time, they
dictate the long-term behaviour of the actual solutions of the equation.
Consider the equation for the SIS model
\[ I' = \beta I (N - I) - \alpha I \]
which we expressed previously as the logistic equation
\[ I'(t) = r I \left( 1 - \frac{I}{K} \right) \tag{8.29} \]
with
\[ r = \beta N - \alpha, \qquad K = \frac{r}{\beta}. \]
The equilibrium solutions are found by setting the time derivative $I'$ to zero, and solving
the resulting equation
\[ r I \left( 1 - \frac{I}{K} \right) = 0. \]
This equation has two solutions, which are the equilibrium points $I_1$ and $I_2$ where
\[ I_1 = 0, \qquad I_2 = K = \frac{\beta N - \alpha}{\beta}. \]
The equilibrium point $I_1 = 0$ is the trivial case (since there are no infected individuals
present) commonly referred to as the disease-free equilibrium. In this case, the entire
population is contained within the susceptible compartment. The second equilibrium
point $I_2 = K$ will only occur when the basic reproduction number $R_0 = \dfrac{\beta N}{\alpha} > 1$, and
this is referred to as the endemic equilibrium.
When $R_0 < 1$, all solutions of equation (8.29) approach the disease-free
equilibrium $I_1 = 0$. Hence we can say that $I(t) \to 0$ for every possible initial condition
$I(0) > 0$. We therefore say that the disease-free equilibrium $I_1$ is globally stable.
When $R_0 > 1$, there are two possible equilibria: $I_1 = 0$ and $I_2 = K$. All solutions
that start from a given initial condition $I(0) > 0$ will move away from the disease-free
equilibrium $I_1$ and approach the endemic equilibrium $I_2 = K$. In this case, we say
that the disease-free equilibrium $I_1$ is unstable, and the endemic equilibrium $I_2 = K$
is globally stable.
Note that in more complicated models, there are more equilibria, and there may
be multiple endemic equilibria as a result. In such cases, the local stability of
equilibria becomes important. An equilibrium point is said to be asymptotically stable if
solutions that start close to the equilibrium point approach that equilibrium point as
$t \to \infty$.

8.7.2 Stability of a Nonlinear System


The stability of a nonlinear system can be deduced to some extent from the stability
of its corresponding linearized system. Consider a general differential equation
\[ x'(t) = f(x). \tag{8.30} \]
The equilibrium points $x^*$ are found by setting the derivative to zero and solving the
resulting equation
\[ f(x^*) = 0. \]
We linearize this by shifting the equilibrium to zero by introducing a small perturbation
\[ u(t) = x(t) - x^*. \]
It follows that solutions of (8.30) that are close to $x^*$ will in time approach $x^*$ if
$u(t) \to 0$. Now since $x(t) = u(t) + x^*$, we have $u'(t) = x'(t) = f(x^* + u(t))$, and it
follows that (assuming that $f$ is differentiable and can be expanded around $x^*$ as a Taylor series)
\[ u'(t) = f(x^*) + f'(x^*) u(t) + \frac{f''(\xi)}{2!} (u(t))^2 \]
where $\xi$ lies between $x^*$ and $x^* + u(t)$.
Assuming that the terms that are nonlinear in $u$ are small enough to be neglected (this is
essentially what linearization is), we can ignore the term with $(u(t))^2$ to get
\[ u'(t) = f(x^*) + f'(x^*) u(t). \]
Also, since $x^*$ is an equilibrium point, $f(x^*) = 0$, and we have the linearized
equation
\[ u'(t) = f'(x^*) u(t). \tag{8.31} \]
Note that the term $f'(x^*)$ is a known constant, so we can write $f'(x^*) = \lambda$ to get
\[ u'(t) = \lambda u(t). \]
The solution of this equation is
\[ u(t) = u(0) e^{\lambda t}. \]
Note that for the case $\lambda < 0$, $u(t) = x(t) - x^* \to 0$ as $t \to \infty$, and so
$x(t) \to x^*$ as $t \to \infty$. We can conclude that for the case $\lambda < 0$, all solutions of (8.30) that
start from initial conditions that are close enough to the equilibrium point $x^*$ will
converge to that equilibrium point $x^*$. We say here that the equilibrium point $x^*$ is
locally asymptotically stable.
For the case $\lambda > 0$, we see that $|u(t)| \to \infty$ as $t \to \infty$. In this case, solutions of
(8.30) that start near $x^*$ move away from the equilibrium point $x^*$, and we say that the equilibrium
$x^*$ is unstable.

Theorem 8.7.1 An equilibrium $x^*$ of the differential equation $x'(t) = f(x)$ is locally
asymptotically stable if $f'(x^*) < 0$ and unstable if $f'(x^*) > 0$.

Note that for the case $f'(x^*) = 0$ we can make no conclusion from this type of
analysis. If $f'(x^*) \ne 0$, we say that the equilibrium $x^*$ is hyperbolic, and when
$f'(x^*) = 0$, we say that the equilibrium $x^*$ is nonhyperbolic.
We now illustrate this with the SIS model considered previously in this section.
Consider the logistic equation that describes the model
\[ I'(t) = r I \left( 1 - \frac{I}{K} \right) \]
with
\[ r = \beta N - \alpha, \qquad K = \frac{r}{\beta}. \]
We found that for the case $R_0 = \dfrac{\beta N}{\alpha} > 1$, there are two equilibria: the disease-free
one at $I_1 = 0$ and an endemic equilibrium at $I_2 = K$. Let
\[ f(I) = r I \left( 1 - \frac{I}{K} \right). \]
Finding the derivative of this, we get
\[ f'(I) = r \left( 1 - \frac{I}{K} \right) + r I \left( -\frac{1}{K} \right) = r \left( 1 - \frac{I}{K} \right) - \frac{r}{K} I \]
hence
\[ f'(I^*) = r \left( 1 - \frac{I^*}{K} \right) - \frac{r}{K} I^*. \]
By the previous theorem, at the disease-free equilibrium $I_1 = 0$, we get
\[ f'(0) = r \left( 1 - \frac{0}{K} \right) - \frac{r}{K} (0) = r \]
which is positive, meaning that the disease-free equilibrium is unstable. For the
endemic equilibrium at $I_2 = K$, we have
\[ f'(K) = r \left( 1 - \frac{K}{K} \right) - \frac{r}{K} K = -r < 0 \]
meaning that the endemic equilibrium is locally asymptotically stable.
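The derivative test of Theorem 8.7.1 can also be applied numerically when $f'$ is awkward to compute by hand. The sketch below estimates $f'(x^*)$ with a central difference and classifies the equilibrium; the step size, the tolerance and the sample values $r = 1$, $K = 500$ are assumptions made for illustration.

```python
def classify_equilibrium(f, x_star, h=1e-6, tol=1e-8):
    """Classify an equilibrium of x' = f(x) from the sign of f'(x_star).

    f'(x_star) is approximated by a central difference with step h.
    Slopes within tol of zero are reported as inconclusive.
    """
    slope = (f(x_star + h) - f(x_star - h)) / (2.0 * h)
    if slope < -tol:
        return "locally asymptotically stable"
    if slope > tol:
        return "unstable"
    return "inconclusive (nonhyperbolic)"


def logistic_rhs(i, r=1.0, k=500.0):
    """Right-hand side f(I) = r*I*(1 - I/K) of the SIS logistic equation."""
    return r * i * (1.0 - i / k)


if __name__ == "__main__":
    print("I = 0:", classify_equilibrium(logistic_rhs, 0.0))
    print("I = K:", classify_equilibrium(logistic_rhs, 500.0))
```

For $r > 0$ this reproduces the conclusions above: the disease-free equilibrium is unstable and the endemic equilibrium is locally asymptotically stable.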


The theorem can also be used to determine the stability of equilibria of an equation
$x' = f(x)$ graphically. A simple plot of the function $f(x)$ against $x$ provides the
equilibrium points $x^*$ at every intersection of $f(x)$ with the $x$-axis. The stability of
these equilibria $x^*$ can then be determined by observing the slope of the tangent to
the curve at each equilibrium point. If the slope of the tangent line at the equilibrium
point $x^*$ is positive, then $f'(x^*) > 0$ and by the theorem $x^*$ is unstable. If instead the
slope of the tangent line at the equilibrium point $x^*$ is negative, then $f'(x^*) < 0$ and
by the theorem $x^*$ is locally asymptotically stable. Note that if the slope of the tangent
to the graph at $x^*$ is zero, then no conclusion can be made. This is illustrated in
Figure 8.3.

8.8 SIS Model with Saturating Treatment


In the SIS model, the per capita "treatment/recovery rate" for the removal of
individuals from the infected class $I$ is termed $\alpha$, and for simplicity it was previously
taken to be constant. It is more realistic to consider the treatment/recovery rate
to be dependent on the availability of treatment resources. When the treatment
resources are limited and therefore decrease with the number of infected individuals,
the treatment/recovery rate can assume a form similar to $\dfrac{\alpha}{1 + I}$, where the constant
$\alpha$ is the treatment/recovery rate when there are very few infected individuals. The
new SIS model with "saturating treatment" is as follows:
\begin{align}
S'(t) &= -\beta I S + \frac{\alpha I}{1 + I} \tag{8.32}\\
I'(t) &= \beta I S - \frac{\alpha I}{1 + I}. \tag{8.33}
\end{align}
Note that $N = S + I$, so again we have that $N' = 0$, meaning that the total population
$N$ is constant over time. It follows as well that $N = S(0) + I(0)$, where $I(0)$ and $S(0)$
are the initial infected and susceptible population sizes. Since $S(t) = N - I(t)$, we
may use this in equation (8.33) to get
\[ I'(t) = \beta I (N - I) - \frac{\alpha I}{1 + I}. \]
We can find the equilibrium points by solving
\[ 0 = \beta I (N - I) - \frac{\alpha I}{1 + I}. \]

Figure 8.3: Graph of the function $f(x)$. Equilibria are found at intersection points
with the horizontal axis at $x_1 = 0$, $x_2 = 2$, $x_3 = 3$. As the slope of the tangent at $x_2$ is
negative, $x_2$ is a locally stable equilibrium point. The slope of the tangent at $x_3$ is
positive, so $x_3$ is an unstable equilibrium point. However, as the slope of the tangent
at $x_1$ is zero, no conclusion can be made about its stability from the graph.

The first solution we see is the disease-free equilibrium case $I_1 = 0$. To find possible
endemic equilibria, we may now cancel out one factor of $I$ (as we have already counted the zero
solution) and solve as follows:
\[ \beta (N - I) - \frac{\alpha}{1 + I} = 0. \]
This equation can be rearranged to get
\[ (N - I)(1 + I) = \frac{\alpha}{\beta}. \tag{8.34} \]
Consider the parabola $g(I)$ defined by
\[ g(I) = (N - I)(1 + I) \]
and the straight line
\[ y = \frac{\alpha}{\beta}. \]
If we plot these functions (with the function $g(I)$ plotted against $I$), the parabola $g(I)$
opens downward (it has a maximum), while the straight line
is a horizontal line parallel to the $I$ axis. Note that $g(0) = (N - 0)(1 + 0) = N$. As
before, we define the reproduction number for the system as
\[ R_0 = \frac{\beta N}{\alpha}. \]

8.8.1 Case 1: Unique Positive Endemic Equilibrium


When $g(0) = N > \dfrac{\alpha}{\beta}$, then $R_0 = \dfrac{\beta N}{\alpha} > 1$. In this case, equation (8.34) has a unique
positive solution $I_2$. This is depicted in Figure 8.4.

Figure 8.4: Case 1: Graph illustrating the intersection between the graph of $g(I)$ and
the straight line $y = \alpha/\beta$ for the case where $N = 15$ and $\alpha/\beta = 10$. Note the
existence of a unique positive intersection point, yielding a single positive endemic
equilibrium solution $I_2$.

8.8.2 Case 2: Two Positive Endemic Equilibria or No Endemic Equilibria

When $g(0) = N < \dfrac{\alpha}{\beta}$, then $R_0 < 1$. For the case of two positive solutions, we need
the maximum of the parabola described by $g(I)$ to be to the right of the vertical axis.
The intersection points of $g(I)$ with the horizontal $I$ axis are found from
\[ (N - I)(1 + I) = 0 \]
which gives $I = N$ and $I = -1$. The maximum height of the parabola will therefore
occur at $I_{max}$, which will be midway between these two points, so
\[ I_{max} = \frac{N - 1}{2} > 0 \]
which means that $N > 1$. In order to obtain two equilibria, the horizontal line $y = \alpha/\beta$
must lie below the maximum of the parabola, i.e.
\[ (N - I_{max})(1 + I_{max}) > \frac{\alpha}{\beta}. \tag{8.35} \]
Once this condition (8.35) holds, then we will obtain two endemic equilibria at $I_2$, $I_3$.
This is illustrated in Figure 8.5. If instead condition (8.35) does not hold, there are
no endemic equilibria, as illustrated in Figure 8.6.
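The three situations (one, two or no endemic equilibria) can be checked by solving the quadratic behind (8.34) directly. The sketch below assumes the equilibrium condition has the form $(N - I)(1 + I) = \alpha/\beta$, which expands to $I^2 - (N - 1)I + (\alpha/\beta - N) = 0$, and uses $N = 15$ with $\alpha/\beta = 10$, $30$, $70$ as read from the figure captions. Both the form of the equation and these values are reconstructed from the surrounding text and should be treated as assumptions.

```python
import math


def endemic_equilibria(n, alpha_over_beta):
    """Positive roots I of (N - I)(1 + I) = alpha/beta, in increasing order.

    Expanded form: I^2 - (N - 1)*I + (alpha/beta - N) = 0.
    """
    b = -(n - 1.0)
    c = alpha_over_beta - n
    disc = b * b - 4.0 * c
    if disc < 0:
        return []   # line y = alpha/beta lies above the parabola: no intersection
    sqrt_disc = math.sqrt(disc)
    roots = {(-b - sqrt_disc) / 2.0, (-b + sqrt_disc) / 2.0}
    return sorted(root for root in roots if root > 0)


if __name__ == "__main__":
    for ratio in (10, 30, 70):
        print(ratio, [round(x, 3) for x in endemic_equilibria(15, ratio)])
```

Under these assumptions the first case gives one positive root, the second gives two, and the third gives none, matching Cases 1 and 2 described above.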

8.8.3 Bistability
As previously outlined, we can determine the stability of an equilibrium $I^*$ by finding
the sign of $f'(I^*)$. A sketch of the function $f(I)$ plotted against $I$ can be utilized
to determine this. For the case $R_0 < 1$, when there is only a single disease-free
equilibrium point $I_1 = 0$, all solutions of equation (8.33) will be attracted to this
equilibrium state, as there is no other equilibrium state available. For this reason,
the trivial disease-free equilibrium is said to be globally stable.
When $R_0 < 1$ and there are three possible equilibrium states (the disease-free state
$I_1 = 0$ and two endemic equilibrium states $I_2$, $I_3$, where $I_2 < I_3$), then bistability is
possible. We outline how this is seen from a graph of $f(I)$ plotted against $I$ using the
example depicted in Figure 8.7. For an initial condition $I(0)$ such that $0 = I_1 < I(0) < I_2$,
we have $I_1 < I(t) < I_2$ for all $t$. We see from Figure 8.7 that $f(I) < 0$ (as the graph
of $f(I)$ is below the horizontal axis), hence $\dfrac{dI}{dt} < 0$, meaning that over time $I(t)$
decreases to the disease-free equilibrium state $I_1 = 0$, i.e. $\lim_{t \to \infty} I(t) = I_1 = 0$. When
the initial condition $I(0)$ is such that $I_2 < I(0) < I_3$, then $I_2 < I(t) < I_3$ for all $t$.
We see from Figure 8.7 that $f(I) > 0$ (as the graph of $f(I)$ is above the horizontal
axis), hence $\dfrac{dI}{dt} > 0$. Over time, $I(t)$ will increase to the endemic equilibrium $I_3$,
i.e. $\lim_{t \to \infty} I(t) = I_3$. Finally, for the case when the initial condition $I(0)$ is such that
$I(0) > I_3$, then $I(t) > I_3$ for all $t$. From Figure 8.7, we see that for this case $f(I) < 0$
(graph below the horizontal axis), so $\dfrac{dI}{dt} < 0$, meaning that over time $I(t)$ decreases
to the endemic equilibrium state $I_3$, i.e. $\lim_{t \to \infty} I(t) = I_3$.
Clearly for the above example, depending on the initial conditions, the solutions
may converge either to the disease-free equilibrium $I_1$ or to the endemic equilibrium
$I_3$, while the intermediate equilibrium $I_2$ is unstable. This is referred to as bistability,
as for this case, there is no globally stable
equilibrium state (since $I(t)$ converges to either the disease-free equilibrium or the
endemic equilibrium depending on the initial condition $I(0)$).
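Bistability can be seen directly by integrating $I' = \beta I (N - I) - \alpha I/(1 + I)$ from different starting values. The sketch below uses a simple forward Euler scheme with the hypothetical values $\beta = 1$, $\alpha = 30$, $N = 15$ (so that $R_0 = 0.5 < 1$); with these values the two endemic equilibria are $7 \mp \sqrt{34}$, roughly 1.17 and 12.83. The step size and time horizon are assumptions chosen to be comfortably small and long for this example.

```python
def saturating_sis_rhs(i, beta, alpha, n):
    """f(I) = beta*I*(N - I) - alpha*I/(1 + I) for the SIS model with saturating treatment."""
    return beta * i * (n - i) - alpha * i / (1.0 + i)


def long_run_infected(i0, beta=1.0, alpha=30.0, n=15.0, t_end=20.0, dt=1e-3):
    """Forward Euler integration of I' = f(I); returns I at time t_end."""
    i = float(i0)
    for _ in range(int(round(t_end / dt))):
        i += dt * saturating_sis_rhs(i, beta, alpha, n)
    return i


if __name__ == "__main__":
    # Start below the smaller endemic equilibrium, between the two, and above the larger.
    for i0 in (0.5, 2.0, 14.9):
        print(i0, "->", round(long_run_infected(i0), 4))
```

Starting values below the smaller endemic equilibrium are drawn to zero, while starting values above it settle at the larger endemic equilibrium, which is the behaviour described in the text.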

Figure 8.5: Case 2: Graph illustrating the intersection between the graph of $g(I)$ and
the straight line $y = \alpha/\beta$ for the case where $N = 15$ and $\alpha/\beta = 30$. Note the existence of
two positive intersection points, yielding two positive endemic equilibria at $I_2$ and
$I_3$.

Figure 8.6: Case 2: Graph illustrating the intersection between the graph of $g(I)$ and
the straight line $y = \alpha/\beta$ for the case where $N = 15$ and $\alpha/\beta = 70$. There are no
intersection points, meaning that there are no endemic equilibria.

Figure 8.7: Plot of $f(I)$ against $I$ for an SIS model when $R_0 < 1$. Here there are three
equilibria: a disease-free equilibrium at $I_1 = 0$, and two possible endemic equilibria
at $I_2$ and $I_3$, where $I_2 < I_3$. These equilibria are located at the intersection points of
$f(I)$ with the horizontal $I$ axis.

8.9 Models for Population Growth


The previous models considered do not include the effect of births and deaths in the
population. When these effects are not included, the resulting models are applicable
only for fast epidemic outbreaks that are short-lived. When the time period for the
development of the outbreak is short (for fast diseases like influenza that develop and
spread rapidly), there is no need to consider changes in the population size from births
and deaths, as this will not significantly affect the dynamics of the model. For slowly
developing diseases however (such as HIV), it is important to incorporate the birth
and death of individuals into any model utilized to study the spread of the disease.
Population growth is measured by the change in the number of individuals in the
population over time. The study of the change in human population is referred to as
demography.

8.9.1 Malthusian Model


The Malthusian model is based on a simple assumption that the rate of change of a
population is proportional to the size of the population. The two underlying
assumptions are:

1. The disease affects all individuals in the population identically irrespective of
age, sex or any other characteristic.

2. The environment is constant in both space and time, and the supply of resources
to enable the population to grow is unlimited.

If the population size is $N(t)$, the per capita birth rate is $b$ and the per capita
death rate is $\mu$, the Malthusian model is expressed as
\[ N'(t) = b N(t) - \mu N(t) = r N(t) \tag{8.36} \]
where $r = b - \mu$ is referred to as the population growth rate. The solution of this
equation is the exponential function
\[ N(t) = N(0) e^{rt} \tag{8.37} \]
so the Malthusian model is sometimes referred to as the exponential model. From
(8.37), we see that when $r > 0$, the population grows exponentially, and when $r < 0$, it
decreases exponentially. When $r = 0$, the population remains constant (as expected).

8.9.2 Logistic Model


It is more realistic to assume that the environment has a specific carrying capacity,
rather than to assume that the population can grow indefinitely with unlimited
resources available (as in the Malthusian model). When the population size approaches
its limit (carrying capacity), we would expect the per capita growth rate to decrease
or become negative. This effect is reflected in the logistic model (of the same form as
those studied earlier)
\[ \frac{N'(t)}{N(t)} = r \left( 1 - \frac{N}{K} \right). \]
When $N(t)$ is very small (i.e. for very small populations), the per capita population growth
rate approaches its maximum possible value $r$. The per capita growth rate decreases
with increasing population size $N$, and reaches zero when $N = K$. We therefore refer
to $K$ as the carrying capacity of the population. When the population size exceeds
$K$, the growth rate becomes negative, causing a decrease in population size. Note
that as demographers believe that there is no constant carrying capacity for human
populations, this simple model is rarely utilized.

8.9.3 Simplified Logistic Model

When the total birth rate is assumed to be constant and independent of population
size, and the per capita death rate is assumed to be constant, the Malthusian
model (8.36) may be modified to

\[ N'(t) = \Lambda - \mu N \]

where $\Lambda$ is the total birth rate, and $\mu$ is the constant per capita death rate. The
equation may be integrated to obtain the solution

\[ N(t) = N(0)\,e^{-\mu t} + \frac{\Lambda}{\mu}\left(1 - e^{-\mu t}\right). \]

As $t \to \infty$, we get a limiting population size of $N(t) \to \Lambda/\mu$. Although it is far from
perfect, this is the most commonly used model for human population dynamics in
epidemic models.
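
The closed-form solution of the simplified logistic model can be checked numerically. The following is a minimal Python sketch, writing `birth` for the total birth rate and `death` for the constant per capita death rate; the function name and the parameter values are illustrative assumptions and are not taken from these notes.

```python
import math

def simplified_logistic(t, n0, birth, death):
    """Closed-form solution of N' = birth - death * N with N(0) = n0."""
    decay = math.exp(-death * t)
    return n0 * decay + (birth / death) * (1.0 - decay)

# Illustrative (made-up) values: the limiting size is birth / death = 2500.
birth, death, n0 = 50.0, 0.02, 1000.0
for t in (0.0, 50.0, 200.0, 1000.0):
    print(t, round(simplified_logistic(t, n0, birth, death), 3))
```

The printed values start at the initial size and approach the ratio of the total birth rate to the per capita death rate as $t$ grows, whatever the initial size.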

8.10 SIR Model with Demography


We incorporate demography into the SIR model by including a per capita death rate
$\mu$ in each of the three compartments. The total death rates in the $S$, $I$, $R$ compartments
are therefore $\mu S$, $\mu I$ and $\mu R$ respectively. The total birth rate $\Lambda$ is included in
the $S$ compartment. The new equations for $S$, $I$, $R$ are as follows:

\[ S'(t) = \Lambda - \beta I S - \mu S, \]
\[ I'(t) = \beta I S - \alpha I - \mu I, \]
\[ R'(t) = \alpha I - \mu R, \]

where, as in the SIR model without demography, $\beta$ is the transmission coefficient and
$\alpha$ is the per capita recovery rate. Since $N = S + I + R$, then $N' = S' + I' + R'$. When
we add the three equations, we therefore obtain

\[ N'(t) = S'(t) + I'(t) + R'(t) = \Lambda - \beta I S - \mu S + \beta I S - \alpha I - \mu I + \alpha I - \mu R \]
\[ = \Lambda - \mu S - \mu I - \mu R = \Lambda - \mu N. \]

Note that this has the same form as the simplified logistic model previously discussed.
It follows that the population size is not constant here, and as $t \to \infty$, we get a
limiting population size of $N(t) \to \Lambda/\mu$.
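
The behaviour of the total population can be illustrated by integrating the three equations numerically. The sketch below uses a hand-written fourth-order Runge-Kutta step; the names `lam`, `beta`, `alpha`, `mu` stand for the total birth rate, transmission coefficient, recovery rate and per capita death rate, and the numerical values are hypothetical choices made only for illustration.

```python
def sir_rhs(state, lam, beta, alpha, mu):
    """Right-hand sides of the SIR model with demography."""
    s, i, r = state
    return (lam - beta * i * s - mu * s,
            beta * i * s - alpha * i - mu * i,
            alpha * i - mu * r)

def rk4_step(f, state, dt, *params):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    k1 = f(state, *params)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)), *params)
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)), *params)
    k4 = f(tuple(x + dt * k for x, k in zip(state, k3)), *params)
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(state, t_end, dt, *params):
    """Integrate from t = 0 to t = t_end and return the final state."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(sir_rhs, state, dt, *params)
    return state

# Hypothetical parameters: lam / mu = 1000 is the limiting population size.
lam, beta, alpha, mu = 20.0, 0.0005, 0.1, 0.02
final = simulate((500.0, 5.0, 0.0), 1000.0, 0.1, lam, beta, alpha, mu)
print("S, I, R =", final, " N =", sum(final))
```

With these values the total $S + I + R$ settles at the ratio of the total birth rate to the death rate, even though the initial population is only about half that size.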
When the population is not constant (i.e. when we take demography into account),
the incidence is proportional to the product of $I$ and $S$ (called the mass action incidence
by analogy with the law of mass action for chemical reactions)

\[ \text{mass action incidence} = \beta S I. \]

When we normalize the mass action incidence with the total population size, we
obtain the standard incidence defined as

\[ \text{standard incidence} = \frac{\beta S I}{N}. \]

Mass action incidence is utilized when modelling a disease (such as influenza) for
which the contact rate that results in infection increases with an increase in the
population size. When considering diseases (such as sexually transmitted diseases)
where the contact rate cannot increase indefinitely with increasing population size,
we utilize instead standard incidence.
As the $S$ and $I$ equations are independent of $R$, we can take $R = N - S - I$ and
reduce the governing equations to a nonlinear system of two equations in $S$
and $I$:

\[ S'(t) = \Lambda - \beta I S - \mu S = f(S, I) \tag{8.38} \]
\[ I'(t) = \beta I S - \alpha I - \mu I = g(S, I) \tag{8.39} \]

Note that the coefficients of the system above are independent of time. The units of
the quantities in the $S$ equation are as follows. $S'$ is measured in number of people
per unit time. The total birth rate $\Lambda$ has units of number of people born per unit time.
The number of people $S$ is multiplied by the per capita death rate $\mu$, so $\mu S$ has units
of people per unit time. The force of infection is $\beta I$, which is a per capita rate with
units time$^{-1}$. It follows that the transmission coefficient $\beta$ must have units
[people $\times$ time]$^{-1}$.

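To make the system dimensionless and thereby simplify it, we define the
nondimensional time $\tau = (\alpha + \mu)\,t$. Let

\[ \hat{N}(\tau) = N\!\left(\frac{\tau}{\alpha + \mu}\right) = N(t), \qquad
   \hat{S}(\tau) = S\!\left(\frac{\tau}{\alpha + \mu}\right) = S(t), \qquad
   \hat{I}(\tau) = I\!\left(\frac{\tau}{\alpha + \mu}\right) = I(t). \]

Hence

\[ \frac{d\hat{S}}{d\tau} = \frac{1}{\alpha + \mu}\frac{dS}{dt}, \qquad
   \frac{d\hat{I}}{d\tau} = \frac{1}{\alpha + \mu}\frac{dI}{dt}. \tag{8.40} \]

Next let

\[ x(\tau) = \frac{\mu \hat{S}}{\Lambda}, \qquad y(\tau) = \frac{\mu \hat{I}}{\Lambda}. \tag{8.41} \]

Using (8.40, 8.41) in (8.38), with primes now denoting differentiation with respect to $\tau$, we get

\[ (\alpha + \mu)\frac{\Lambda}{\mu}\,x' = \Lambda - \beta\,\frac{\Lambda y}{\mu}\,\frac{\Lambda x}{\mu} - \mu\,\frac{\Lambda x}{\mu} \]
\[ x' = \frac{\mu}{\alpha + \mu} - \frac{\beta\Lambda}{\mu(\alpha + \mu)}\,xy - \frac{\mu}{\alpha + \mu}\,x \]
\[ x' = p - R_0\,xy - p\,x = p\,(1 - x) - R_0\,xy \tag{8.42} \]

where the nondimensional parameters $p$ and $R_0$ are defined as

\[ p = \frac{\mu}{\alpha + \mu}, \qquad R_0 = \frac{\beta\Lambda}{\mu(\alpha + \mu)}. \tag{8.43} \]

Here $R_0$ is the basic reproduction number of the disease.
In a similar way, using (8.40, 8.41) in (8.39), we get

\[ (\alpha + \mu)\frac{\Lambda}{\mu}\,y' = \beta\,\frac{\Lambda y}{\mu}\,\frac{\Lambda x}{\mu} - \alpha\,\frac{\Lambda y}{\mu} - \mu\,\frac{\Lambda y}{\mu} \]
\[ y' = \frac{\beta\Lambda}{\mu(\alpha + \mu)}\,xy - \frac{\alpha + \mu}{\alpha + \mu}\,y \]
\[ y' = R_0\,xy - y = (R_0\,x - 1)\,y \tag{8.44} \]

where the new dependent variables $x(\tau)$ and $y(\tau)$ in (8.42, 8.44) are dimensionless
quantities. Note that the dimensionless form is equivalent to the original dimensional
form, but the number of parameters that must be studied is now reduced to two (i.e.
$p$ and $R_0$). As both versions will have identical long-term behaviour, it is common
practice to work with the dimensionless form for simplicity.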

8.11 Phase-plane Analysis


Our system is of the general form

\[ x' = f(x, y), \qquad y' = g(x, y) \]

where $f(x, y) = p\,(1 - x) - R_0\,xy$, and $g(x, y) = (R_0\,x - 1)\,y$. Curves may be plotted
for the points $(x(\tau), y(\tau))$ as the nondimensional time parameter $\tau$ varies. These
solution curves are referred to commonly as orbits or trajectories in the phase plane.
The long-term behaviour of these orbits depends on the equilibrium (or singular)
points of the system, which are solutions for which $x' = 0$ and $y' = 0$.
For the dimensionless SIR model with demography, we find equilibrium points
by solving the set of equations

\[ p\,(1 - x) - R_0\,xy = 0 \]
\[ (R_0\,x - 1)\,y = 0 \]

The most obvious solution to this set of equations is the case when $y = 0$, $x = 1$,
which corresponds to the disease-free equilibrium (since $y = 0$, and $y$ represents the
infectious compartment $I$). This first equilibrium point is sometimes referred to as
a boundary equilibrium, as it lies on the actual boundary of the region for possible
solutions $x \geq 0$, $y \geq 0$.
For the solution when $y \neq 0$, the second equation requires that $x = 1/R_0$.
Substituting this into the first equation, we have

\[ p\left(1 - \frac{1}{R_0}\right) - R_0\,\frac{1}{R_0}\,y = 0 \]

hence we get

\[ y = p\left(1 - \frac{1}{R_0}\right). \]

The second equilibrium point, which is an endemic equilibrium point, is therefore
$\left(\frac{1}{R_0},\; p\left(1 - \frac{1}{R_0}\right)\right)$. Note that this endemic equilibrium is only possible for $R_0 > 1$,
because we are restricted by $y \geq 0$.
The slope of the trajectory at any point $(x_0, y_0)$ provides the direction of the
trajectory at that point. This can be determined by finding

\[ \left.\frac{dy}{dx}\right|_{(x_0, y_0)} = \frac{g(x_0, y_0)}{f(x_0, y_0)}. \]

Note that this slope is defined at non-equilibrium points $(x_0, y_0)$ where $f \neq 0$ (where
$f = 0$ but $g \neq 0$ the trajectory is vertical), whereas at the equilibria $f = g = 0$ and the
slope is undefined. Since the flow stops at these equilibria, they are often referred
to as fixed points. A plot that depicts a collection of tangent vectors is referred to as
a direction field, and it is often included in the phase-plane diagram. A phase-plane
diagram illustrating the direction field for the dimensionless SIR model for different
initial conditions (for the case $p = 0.5$, $R_0 = 2$) is presented in Figure 8.8.

Figure 8.8: Phase plane plot showing direction field and orbits $(x, y)$ with different
initial conditions for the dimensionless SIR model with $p = 0.5$, $R_0 = 2$. There is an
endemic equilibrium point at $(0.5, 0.25)$ and a disease-free equilibrium at $(1, 0)$. Plot
obtained numerically using Matlab (pplane8).
The x-zero isocline or x-nullcline for the system is the set of points in the $(x, y)$
plane that satisfy the equation $f(x, y) = 0$, while the y-zero isocline or y-nullcline is
the set of points in the $(x, y)$ plane that satisfy the equation $g(x, y) = 0$. To determine
the nullclines for the dimensionless SIR model, we solve first

\[ p\,(1 - x) - R_0\,xy = 0 \]

to get the x-nullcline to be

\[ y = \frac{p\,(1 - x)}{R_0\,x}. \]

The y-nullclines are found likewise by solving

\[ (R_0\,x - 1)\,y = 0 \]

which gives two possible solutions, the horizontal line (corresponding to the x-axis)
at $y = 0$ and the vertical line at $x = 1/R_0$. The points of intersection of the x-nullcline
with the y-nullclines are the equilibrium points of the dynamical system.
For the case $R_0 < 1$, there is only one possible intersection point of the x-nullcline
with the y-nullcline $y = 0$, at the disease-free equilibrium point $(1, 0)$. Since $1/R_0 > 1$
for this case, the y-nullcline $x = 1/R_0$ cannot intersect the x-nullcline $y = \frac{p(1 - x)}{R_0 x}$ to give
a positive $y$ solution, and since we know that $y \geq 0$, there is no other equilibrium
point for this case.
For the case $R_0 > 1$, there are two possible intersection points between the
x-nullcline $y = \frac{p(1 - x)}{R_0 x}$ and each of the y-nullclines. The intersection with the first
y-nullcline $y = 0$ is at the disease-free equilibrium point $(1, 0)$. Since in this case $1/R_0 < 1$,
there is also an intersection of the x-nullcline $y = \frac{p(1 - x)}{R_0 x}$ with the y-nullcline $x = 1/R_0$ in
the positive quadrant, providing the endemic equilibrium point at $\left(\frac{1}{R_0},\; p\left(1 - \frac{1}{R_0}\right)\right)$.
We conclude that the dimensionless SIR model has two equilibria. The disease-free
equilibrium at $(1, 0)$ exists always, while the endemic equilibrium at $\left(\frac{1}{R_0},\; p\left(1 - \frac{1}{R_0}\right)\right)$
exists only when $R_0 > 1$.
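
The equilibria of the dimensionless SIR model can be computed and checked directly. The sketch below is one possible way to do this in Python; the function names `sir_rhs` and `equilibria` are hypothetical, and `r0` stands for the basic reproduction number $R_0$.

```python
def sir_rhs(x, y, p, r0):
    """Right-hand sides f and g of the dimensionless SIR model."""
    return p * (1.0 - x) - r0 * x * y, (r0 * x - 1.0) * y

def equilibria(p, r0):
    """Return the equilibria with x >= 0 and y >= 0 as (x, y) pairs."""
    points = [(1.0, 0.0)]            # disease-free equilibrium, always present
    if r0 > 1.0:                     # endemic equilibrium needs y > 0
        points.append((1.0 / r0, p * (1.0 - 1.0 / r0)))
    return points

# Parameter values used for Figure 8.8.
for point in equilibria(p=0.5, r0=2.0):
    print(point, sir_rhs(point[0], point[1], p=0.5, r0=2.0))
```

Each printed pair is followed by the values of $f$ and $g$ at that point, which should both vanish at an equilibrium.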

8.11.1 Discussion about R0


The basic reproduction number R0 of a disease indicates how many secondary cases
are created in an entirely susceptible population by one infected individual while that
individual himself remains infective.

We had previously determined that as $t \to \infty$, we get a limiting population size
of $\Lambda/\mu$. Hence we may conclude that for an entirely susceptible population, $S = \Lambda/\mu$.
Since $\alpha + \mu$ is the rate at which individuals will leave the infective class $I$, the
average time that an infected individual remains infective is $\frac{1}{\alpha + \mu}$.
The number of transmissions per unit time is represented by the incidence rate
$\beta I S$. For a single infective individual, it follows that $I = 1$, so as $S = \Lambda/\mu$, it follows
that the incidence rate is

\[ \beta I S = \beta\,(1)\,\frac{\Lambda}{\mu} = \frac{\beta\Lambda}{\mu}. \]

The basic reproduction number $R_0$ is the number of transmissions that one infective
individual can make while himself remaining infective, and so

\[ R_0 = \frac{\beta\Lambda}{\mu}\cdot\frac{1}{\alpha + \mu} = \frac{\beta\Lambda}{\mu(\alpha + \mu)}. \]

8.12 Linearization of 2D Linear Systems


It is possible to obtain information about the behaviour of the system near equilibrium
points via linearization. This is done by considering a small perturbation of the
solution starting from an initial condition near to the equilibrium point (as done
previously for …rst-order nonlinear equations).
Let $(x^*, y^*)$ be an equilibrium point of the two-dimensional system

\[ x' = f(x, y), \qquad y' = g(x, y). \]

Consider a small perturbation of the solution from the equilibrium state

\[ x(\tau) = u(\tau) + x^*, \qquad y(\tau) = v(\tau) + y^*. \]

Substituting this into the system, we get

\[ u' = f(u + x^*, v + y^*), \qquad v' = g(u + x^*, v + y^*). \]

Assuming that the functions are sufficiently differentiable, expanding in a Taylor
series (using the form for a function of two variables), and ignoring all second order
terms (which we can do since $u$ and $v$ are very small, so second order terms are much
smaller), we obtain

\[ u' \approx f(x^*, y^*) + f_x(x^*, y^*)\,u(\tau) + f_y(x^*, y^*)\,v(\tau) \]
\[ v' \approx g(x^*, y^*) + g_x(x^*, y^*)\,u(\tau) + g_y(x^*, y^*)\,v(\tau) \]

Now as $(x^*, y^*)$ is an equilibrium point of the two-dimensional system, it follows that

\[ f(x^*, y^*) = g(x^*, y^*) = 0 \]

which allows us to write the linearized two-dimensional system as

\[ u' = f_x(x^*, y^*)\,u(\tau) + f_y(x^*, y^*)\,v(\tau) \]
\[ v' = g_x(x^*, y^*)\,u(\tau) + g_y(x^*, y^*)\,v(\tau) \]

which can be expressed as

\[ \begin{pmatrix} u \\ v \end{pmatrix}' =
   \begin{pmatrix} f_x(x^*, y^*) & f_y(x^*, y^*) \\ g_x(x^*, y^*) & g_y(x^*, y^*) \end{pmatrix}
   \begin{pmatrix} u \\ v \end{pmatrix}. \]

We refer to the matrix

\[ J = \begin{pmatrix} f_x(x^*, y^*) & f_y(x^*, y^*) \\ g_x(x^*, y^*) & g_y(x^*, y^*) \end{pmatrix} \]

as the Jacobian $J$ of the system evaluated at the equilibrium point $(x^*, y^*)$.
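
When the partial derivatives are tedious to compute by hand, the Jacobian can be approximated numerically. The following Python sketch uses central finite differences (a standard technique, not one prescribed in these notes) and applies it to the dimensionless SIR model at its endemic equilibrium; the helper name `jacobian` and the step size `h` are assumptions made for illustration.

```python
def jacobian(f, g, x, y, h=1e-6):
    """Approximate the 2x2 Jacobian of (f, g) at (x, y) by central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    gx = (g(x + h, y) - g(x - h, y)) / (2.0 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2.0 * h)
    return [[fx, fy], [gx, gy]]

# Dimensionless SIR model with the parameter values of Figure 8.8.
p, r0 = 0.5, 2.0
f = lambda x, y: p * (1.0 - x) - r0 * x * y
g = lambda x, y: (r0 * x - 1.0) * y

# Endemic equilibrium (1/R0, p(1 - 1/R0)).
J = jacobian(f, g, 1.0 / r0, p * (1.0 - 1.0 / r0))
print(J)
```

Since $f$ and $g$ are quadratic here, central differences reproduce the analytic partial derivatives up to rounding error, so the result can be compared entry by entry with a Jacobian computed by hand.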

8.13 Stability of the Equilibria for 2D Linear Systems

Consider the 2D linear homogeneous system

\[ u' = a\,u(\tau) + b\,v(\tau), \qquad v' = c\,u(\tau) + d\,v(\tau) \tag{8.45} \]

where $a, b, c, d$ are constants. The equilibria are found by solving

\[ a\,u(\tau) + b\,v(\tau) = 0, \qquad c\,u(\tau) + d\,v(\tau) = 0. \]

The system can be written in matrix form as

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} u(\tau) \\ v(\tau) \end{pmatrix} = 0. \]

A trivial solution of this system is the equilibrium point $(0, 0)$. Consider the coefficient
matrix

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

If the determinant of $A$ is non-zero, then $A$ is invertible, and the only solution is the
trivial equilibrium $(0, 0)$. Assuming that $A$ is invertible, i.e. $\det A = ad - bc \neq 0$, we
seek exponential solutions of the linearized system by taking

\[ u(\tau) = \bar{u}\,e^{\lambda\tau}, \qquad v(\tau) = \bar{v}\,e^{\lambda\tau} \tag{8.46} \]

where $\bar{u} \neq 0$, $\bar{v} \neq 0$. Substituting (8.46) into (8.45), we get the linear homogeneous
system

\[ a\bar{u} + b\bar{v} = \lambda\bar{u}, \qquad c\bar{u} + d\bar{v} = \lambda\bar{v} \]

which may be expressed as

\[ \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} \begin{pmatrix} \bar{u} \\ \bar{v} \end{pmatrix} = 0. \]

This has a non-trivial solution only if the determinant of the coefficient matrix is zero,
i.e.

\[ \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = 0 \]

which gives the characteristic equation of the linearized system

\[ (a - \lambda)(d - \lambda) - bc = 0. \]

This is the quadratic equation

\[ \lambda^2 - (a + d)\,\lambda + (ad - bc) = 0. \]

Note that this is in the form

\[ \lambda^2 - p\,\lambda + q = 0 \]

where $p = a + d$ is the trace of the related Jacobian matrix $J$, and $q = ad - bc$
is the determinant of $J$. The equation may be solved to obtain the two
eigenvalues $\lambda_{1,2}$ of the Jacobian $J$:

\[ \lambda_{1,2} = \frac{p \pm \sqrt{p^2 - 4q}}{2}. \]
When the two eigenvalues $\lambda_{1,2}$ of the Jacobian $J$ are real and distinct, the solution
of the linearized system (8.45) is given by

\[ u(\tau) = C_1 e^{\lambda_1\tau} + C_2 e^{\lambda_2\tau} \]
\[ v(\tau) = C_3 e^{\lambda_1\tau} + C_4 e^{\lambda_2\tau} \]

where $C_1, C_2, C_3, C_4$ are constants to be determined. As $\tau \geq 0$, it follows that the
perturbations $u(\tau)$ and $v(\tau)$ will approach zero - returning the system to equilibrium
at $(x^*, y^*)$ - only when $\lambda_{1,2} < 0$.
When the two eigenvalues of the Jacobian $J$ are real and equal, i.e. $\lambda_1 = \lambda_2 = \lambda$,
the solution of the linearized system (8.45) is given by

\[ u(\tau) = C_1 e^{\lambda\tau} + C_2\,\tau e^{\lambda\tau} \]
\[ v(\tau) = C_3 e^{\lambda\tau} + C_4\,\tau e^{\lambda\tau} \]

where $C_1, C_2, C_3, C_4$ are constants to be determined. As $\tau \geq 0$, it follows that the
perturbations $u(\tau)$ and $v(\tau)$ will approach zero - returning the system to equilibrium
at $(x^*, y^*)$ - only if $\lambda < 0$.
When the two eigenvalues of the Jacobian $J$ are complex conjugates, i.e.
$\lambda_1 = \sigma + i\omega$, $\lambda_2 = \sigma - i\omega$, the solution of the linearized system (8.45) is given by

\[ u(\tau) = C_1 e^{\sigma\tau}\sin\omega\tau + C_2 e^{\sigma\tau}\cos\omega\tau \]
\[ v(\tau) = C_3 e^{\sigma\tau}\sin\omega\tau + C_4 e^{\sigma\tau}\cos\omega\tau \]

where $C_1, C_2, C_3, C_4$ are constants to be determined. As $\tau \geq 0$, it follows that the
perturbations $u(\tau)$ and $v(\tau)$ will approach zero - returning the system to equilibrium
at $(x^*, y^*)$ - only if $\sigma < 0$.
We may conclude that a necessary and sufficient condition for an equilibrium of a 2D
linear system to be locally asymptotically stable is that the eigenvalues of the related
Jacobian matrix are negative, or if they are complex conjugates, that they have negative
real part.
An equilibrium point $(x^*, y^*)$ for a 2D linear system can be classified as a node,
spiral (or vortex or focus), saddle or center. The equilibrium point is said to
be a node if the eigenvalues $\lambda_1, \lambda_2$ are real and of the same sign. When $\lambda_1$ and $\lambda_2$ are
both real and positive, $(x^*, y^*)$ is an unstable node. When $\lambda_1$ and $\lambda_2$ are both real
and negative, $(x^*, y^*)$ is a stable node. When $\lambda_1 = \lambda_2 = \lambda$, where $\lambda$ is real, $(x^*, y^*)$
is said to be a degenerate node. When two eigenvectors correspond to $\lambda$, then we
say it is a proper degenerate node, and when only one eigenvector corresponds to
$\lambda$, we say it is an improper degenerate node.
The equilibrium point is said to be a saddle if the eigenvalues $\lambda_1, \lambda_2$ are real and
opposite in sign. As one of the eigenvalues is positive, a saddle is always unstable.
The equilibrium point is said to be a spiral (focus) if the eigenvalues $\lambda_1, \lambda_2$
are complex conjugates with non-zero real part. When the real part of the eigenvalues
is positive, it is an unstable spiral point, and when the real part of the eigenvalues is
negative, it is a stable spiral point.
The equilibrium point is said to be a center when the eigenvalues $\lambda_1, \lambda_2$ are purely
imaginary complex conjugates, and the orbits around such equilibrium points are periodic.
A center is said to be neutrally stable (not asymptotically), as the orbits revolve
around the equilibrium point but never approach it, or deviate from it.
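
The classification above can be summarized in a short routine that works from the trace and determinant of the coefficient matrix, since the eigenvalues are the roots of the characteristic quadratic. This is a minimal Python sketch; the function name, the returned labels and the tolerance `tol` are illustrative assumptions.

```python
def classify_equilibrium(a, b, c, d, tol=1e-12):
    """Classify the origin of u' = a u + b v, v' = c u + d v."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det      # discriminant of the characteristic equation
    if abs(det) <= tol:
        return "degenerate (zero eigenvalue)"
    if det < 0.0:
        return "saddle"                   # real eigenvalues of opposite sign
    if disc > tol:                        # real, distinct, same sign
        return "stable node" if trace < 0.0 else "unstable node"
    if abs(disc) <= tol:                  # real and equal
        return "stable degenerate node" if trace < 0.0 else "unstable degenerate node"
    if abs(trace) <= tol:                 # purely imaginary pair
        return "center"
    return "stable spiral" if trace < 0.0 else "unstable spiral"

print(classify_equilibrium(0.0, -1.0, 1.0, 0.0))
print(classify_equilibrium(-1.0, -1.0, 0.5, 0.0))
print(classify_equilibrium(1.0, 2.0, 3.0, -1.0))
```

Working with the trace and determinant avoids computing the eigenvalues explicitly, which is convenient when the matrix entries depend on parameters.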
Example 8.13.1 Consider the 2D linear system

\[ x' = -y, \qquad y' = x. \]

We can write this in vector form as

\[ \begin{pmatrix} x \\ y \end{pmatrix}' = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

Note that the matrix $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ is invertible, as $\det A \neq 0$, hence the only solution
is the trivial equilibrium $(0, 0)$. Note also that the equilibria can be found by solving
the system

\[ 0 = -y, \qquad 0 = x \]

which gives only one equilibrium point at $(0, 0)$.
We seek exponential solutions of the system by taking

\[ x = \bar{x}\,e^{\lambda\tau}, \qquad y = \bar{y}\,e^{\lambda\tau} \]

where $\bar{x} \neq 0$, $\bar{y} \neq 0$. Substituting into the linear system, we get eventually the linear
homogeneous system

\[ -\lambda\bar{x} - \bar{y} = 0, \qquad \bar{x} - \lambda\bar{y} = 0 \]

which may be expressed as

\[ \begin{pmatrix} 0 - \lambda & -1 \\ 1 & 0 - \lambda \end{pmatrix} \begin{pmatrix} \bar{x} \\ \bar{y} \end{pmatrix} = 0. \]

This has a non-trivial solution only if the determinant of the coefficient matrix is
zero, i.e.

\[ \begin{vmatrix} -\lambda & -1 \\ 1 & -\lambda \end{vmatrix} = 0 \]

which gives the characteristic equation

\[ \lambda^2 + 1 = 0 \;\Rightarrow\; \lambda_{1,2} = \pm i. \]

As the eigenvalues are complex conjugates with zero real part, the equilibrium point
$(0, 0)$ is a neutrally stable center. A phase plane depicting this center is shown in Figure
8.9.

8.14 Local Stability Analysis for SIR Model


Recall that the dimensionless form of the SIR model with demography is

\[ x' = p\,(1 - x) - R_0\,xy \]
\[ y' = (R_0\,x - 1)\,y \]

where $p$ and $R_0$ are defined as

\[ p = \frac{\mu}{\alpha + \mu}, \qquad R_0 = \frac{\beta\Lambda}{\mu(\alpha + \mu)}. \]
Figure 8.9: Phase plane for the system $x' = -y$, $y' = x$, depicting the neutral center
at $(0, 0)$.

The Jacobian $J$ of the system evaluated at an equilibrium point $(x^*, y^*)$ is

\[ J = \begin{pmatrix} -p - R_0\,y^* & -R_0\,x^* \\ R_0\,y^* & R_0\,x^* - 1 \end{pmatrix}. \]

Recall that when $R_0 > 1$, there is a disease-free equilibrium point at $(1, 0)$ as well as
an endemic equilibrium point at $\left(\frac{1}{R_0},\; p\left(1 - \frac{1}{R_0}\right)\right)$. When $R_0 < 1$, there is only the
disease-free equilibrium at $(1, 0)$.
We evaluate the Jacobian $J$ at the disease-free equilibrium point $(1, 0)$ to get

\[ J = \begin{pmatrix} -p & -R_0 \\ 0 & R_0 - 1 \end{pmatrix}. \]

Since this is upper triangular, the eigenvalues of the Jacobian are its diagonal entries
(from the rules of linear algebra). Therefore we have that $\lambda_1 = -p$, $\lambda_2 = R_0 - 1$. As
$p > 0$, it follows that $\lambda_1 < 0$. When $R_0 < 1$, then $\lambda_2 < 0$ as well, and the disease-free
equilibrium is a stable node. When $R_0 > 1$, then $\lambda_2 > 0$, and the disease-free
equilibrium is a saddle, which is unstable.
Next we evaluate the Jacobian $J$ at the endemic equilibrium point $\left(\frac{1}{R_0},\; p\left(1 - \frac{1}{R_0}\right)\right)$,
which exists when $R_0 > 1$. We get

\[ J = \begin{pmatrix} -p - R_0\,p\left(1 - \frac{1}{R_0}\right) & -R_0\,\frac{1}{R_0} \\[4pt] R_0\,p\left(1 - \frac{1}{R_0}\right) & R_0\,\frac{1}{R_0} - 1 \end{pmatrix}
   = \begin{pmatrix} -pR_0 & -1 \\ R_0\,p - p & 0 \end{pmatrix}. \]

To find the eigenvalues, we seek the characteristic equation from

\[ \begin{vmatrix} -pR_0 - \lambda & -1 \\ R_0\,p - p & -\lambda \end{vmatrix} = 0 \]
\[ (-pR_0 - \lambda)(-\lambda) + R_0\,p - p = 0 \]
\[ \lambda^2 + pR_0\,\lambda + p\,(R_0 - 1) = 0. \]

Solving for $\lambda$, we have

\[ \lambda_{1,2} = \frac{-pR_0 \pm \sqrt{(pR_0)^2 - 4p\,(R_0 - 1)}}{2}. \]

When $(pR_0)^2 - 4p\,(R_0 - 1) > 0$, then $\lambda_{1,2}$ are real and negative (since $p\,(R_0 - 1) > 0$,
the square root is smaller than $pR_0$), so the endemic
equilibrium will be a stable node. When $(pR_0)^2 - 4p\,(R_0 - 1) < 0$, then $\lambda_{1,2}$ are
complex conjugates with negative real part, so the endemic equilibrium will be a
stable spiral.
We may therefore conclude that for the SIR model with demography, when $R_0 < 1$,
there is a disease-free equilibrium point only, which is stable (locally). When
$R_0 > 1$, there are two equilibria: a disease-free equilibrium which is unstable, and an
endemic equilibrium which is stable (locally).
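
The eigenvalues at the endemic equilibrium of the dimensionless SIR model are the roots of $\lambda^2 + pR_0\lambda + p(R_0 - 1) = 0$, so the distinction between a stable node and a stable spiral can be checked numerically. The sketch below is a hedged illustration in Python; the function names are hypothetical and `r0` stands for $R_0$.

```python
import cmath

def endemic_eigenvalues(p, r0):
    """Roots of lambda^2 + p*r0*lambda + p*(r0 - 1) = 0 (requires r0 > 1)."""
    if r0 <= 1.0:
        raise ValueError("the endemic equilibrium exists only for R0 > 1")
    root = cmath.sqrt((p * r0) ** 2 - 4.0 * p * (r0 - 1.0))
    return (-p * r0 + root) / 2.0, (-p * r0 - root) / 2.0

def endemic_type(p, r0):
    """Classify the endemic equilibrium from the sign of the discriminant."""
    disc = (p * r0) ** 2 - 4.0 * p * (r0 - 1.0)
    if disc > 0.0:
        return "stable node"
    if disc == 0.0:
        return "stable degenerate node"
    return "stable spiral"

for p, r0 in ((0.5, 2.0), (0.9, 4.0)):
    print(p, r0, endemic_type(p, r0), endemic_eigenvalues(p, r0))
```

For the parameter values of Figure 8.8 ($p = 0.5$, $R_0 = 2$) the discriminant is negative, which is consistent with orbits spiralling into the endemic equilibrium.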

8.15 Bifurcation Diagram


Recall that

\[ R_0 = \frac{\beta\Lambda}{\mu(\alpha + \mu)}. \]

This basic reproduction number represents the number of transmissions that one
infective individual can make from an entirely susceptible population while himself
remaining in the infective state.
We have shown that when $R_0 < 1$, there is only one equilibrium, which is a stable
disease-free state. Every solution of the system approaches this equilibrium, so the
disease-free state is said to be attractive. It follows that the disease will eventually
die out when $R_0 < 1$ for the system.
When instead $R_0 > 1$, there is an unstable disease-free equilibrium and a stable
endemic equilibrium. The endemic equilibrium is said to be attractive, so all solutions
approach this state at infinite time. This means that the disease will remain endemic
in the population for $R_0 > 1$.
A bifurcation diagram (also called a forward bifurcation diagram) summarizes this
information. It is a plot of the dimensionless equilibrium value $y^*$ representing
the infective individuals (placed on the y-axis) against the basic reproduction
number $R_0$ (placed on the x-axis). We have previously found that the stable equilibrium
value is

\[ y^* = \begin{cases} 0, & R_0 < 1 \\[4pt] p\left(1 - \dfrac{1}{R_0}\right), & R_0 > 1 \end{cases} \]

Typically, locally stable equilibria are plotted with solid lines, and unstable equilibria
with dashed lines. The disease-free equilibrium is locally asymptotically stable (solid
line) for $R_0 < 1$, and it becomes unstable (dashed line) for $R_0 > 1$. The endemic
equilibrium is nonexistent until $R_0 > 1$, and it is stable beyond this point (solid line).
A sketch of the bifurcation diagram is shown in Figure 8.10.

Figure 8.10: Bifurcation diagram for the SIR model with demography. (From the
text "An Introduction to Mathematical Epidemiology" by Maia Martcheva, page 52.)
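
The data behind such a diagram can be tabulated with a few lines of code. The following Python sketch lists, for a given $R_0$, each equilibrium value of the dimensionless infective variable together with a flag saying whether the linearization gives stability; the function name and the flag are illustrative assumptions, and the borderline case $R_0 = 1$ (where the linearization is inconclusive) is simply reported as not linearly stable.

```python
def bifurcation_branches(r0, p):
    """Return (y_star, linearly_stable) pairs for the equilibria at this R0."""
    branches = [(0.0, r0 < 1.0)]                       # disease-free branch
    if r0 > 1.0:
        branches.append((p * (1.0 - 1.0 / r0), True))  # endemic branch
    return branches

p = 0.5
for r0 in (0.5, 0.9, 1.5, 2.0, 4.0):
    print(r0, bifurcation_branches(r0, p))
```

Plotting the stable pairs with solid lines and the unstable pairs with dashed lines against $R_0$ gives the forward bifurcation picture described in the text.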

8.16 Oscillations in Epidemic Models


Although ODE models that reduce to one-dimensional dynamical systems do not
have cycles (i.e. we cannot see oscillations in the related data), planar ODE systems
(including planar epidemic models) can exhibit periodicity. In planar epidemic
models, periodic solutions are often generated when an endemic equilibrium point loses
stability after passing through a Hopf bifurcation. In dynamical systems theory,
a bifurcation point in general refers to an event that leads to a sudden change of
stability in a dynamical system (from stable to unstable or vice versa). Hopf
bifurcations take place when a pair of complex-conjugate eigenvalues (found from the
linearization around a non-trivial equilibrium point) crosses the imaginary axis of the
complex plane. If the fixed point was stable, the complex eigenvalues $\lambda_1, \lambda_2$ would
have both been located in the left half complex plane, as $\mathrm{Re}(\lambda) < 0$. If a parameter in
the system of equations was changed and this caused the complex eigenvalues $\lambda_1, \lambda_2$
to cross over to the right half of the complex plane, $\mathrm{Re}(\lambda) > 0$, we say that a Hopf
bifurcation has taken place. In such a case, the fixed point loses stability through
the creation of a limit cycle, which is a small-amplitude sinusoidal cyclic oscillation
about the former stationary point. Hopf bifurcations are themselves classified as
subcritical or supercritical, but a more comprehensive discussion of this is beyond the
scope of this course.
For 2D planar epidemic models, it is possible to utilize the Jacobian of the system
to make conclusions about the stability of its equilibrium states. This is done by
finding the trace ($\mathrm{Tr}\,J$) and the determinant ($\mathrm{Det}\,J$) of the Jacobian $J$, as stated in
the following theorem.

Theorem 8.16.1 Consider the planar system

\[ x' = f(x, y), \qquad y' = g(x, y) \]

and let $(x^*, y^*)$ be an equilibrium state of that system. The Jacobian $J$ of this system
evaluated at the equilibrium state $(x^*, y^*)$ is given by

\[ J(x^*, y^*) = \begin{pmatrix} f_x(x^*, y^*) & f_y(x^*, y^*) \\ g_x(x^*, y^*) & g_y(x^*, y^*) \end{pmatrix} \]

and the following results can be used to determine the stability of the equilibrium state
$(x^*, y^*)$:
1. The equilibrium state $(x^*, y^*)$ is locally asymptotically stable iff $\mathrm{Tr}\,J < 0$
and $\mathrm{Det}\,J > 0$.
2. $(x^*, y^*)$ is a saddle iff $\mathrm{Det}\,J < 0$.
3. $(x^*, y^*)$ loses stability and undergoes a Hopf bifurcation if for some value of
the parameter $\theta$, called $\theta_0$, the following is true:

\[ \mathrm{Tr}\,J\left(x^*(\theta_0), y^*(\theta_0)\right) = 0, \]
\[ \mathrm{Det}\,J\left(x^*(\theta_0), y^*(\theta_0)\right) > 0, \]

and

\[ \left.\frac{d\,\mathrm{Tr}\,J}{d\theta}\right|_{\theta = \theta_0} \neq 0. \]

Let us illustrate the above theorem with an example. Consider the SIR model
where the transmission coefficient of infection is linearly dependent on the number of
infected individuals: $\beta\,(1 + vI)$, with $v > 0$. This causes the contact rate/probability
of infection to increase with the number of infected individuals. The model equations
can be written as

\[ S'(t) = \Lambda - \beta\,(1 + vI)\,IS - \mu S \]
\[ I'(t) = \beta\,(1 + vI)\,IS - (\alpha + \mu)\,I \]

where $R = N - S - I$, so we can omit the equation for $R$ accordingly. Mathematically
therefore, the SIR system can be written in the general form

\[ S'(t) = f(S, I), \qquad I'(t) = g(S, I) \]

which is a system of two differential equations in two unknowns ($S$
and $I$), i.e. a planar system. We will investigate this system without
nondimensionalizing it first.
The total population size $N = S + I + R$ satisfies $N' = \Lambda - \mu N$, and we assume
that the initial population size is $N(0) = S(0) + I(0) + R(0)$. The disease-free
equilibrium state (which always exists) can be found to be $E_0 = (\Lambda/\mu,\, 0)$, and the basic
reproduction number is given by

\[ R_0 = \frac{\beta\Lambda}{\mu(\alpha + \mu)}. \]

We can also prove that the disease-free equilibrium is locally asymptotically stable
when $R_0 < 1$ and unstable when $R_0 > 1$.
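The endemic equilibria for the system can be found by solving

\[ \Lambda - \beta\,(1 + vI)\,IS - \mu S = 0 \]
\[ \beta\,(1 + vI)\,IS - (\alpha + \mu)\,I = 0 \]

From the second equation, since $I \neq 0$, we have $\beta\,(1 + vI)\,S = (\alpha + \mu)$. Using this
in the first equation, we get

\[ \Lambda - (\alpha + \mu)\,I - \mu S = 0 \]
\[ \Rightarrow\; S = \frac{\Lambda - (\alpha + \mu)\,I}{\mu} = \frac{\Lambda}{\mu} - \frac{(\alpha + \mu)\,I}{\mu}. \]

Since $\beta\,(1 + vI)\,S = (\alpha + \mu)$, substituting the expression above for $S$ gives

\[ \beta\,(1 + vI)\,\frac{\Lambda - (\alpha + \mu)\,I}{\mu} = \alpha + \mu \]
\[ (1 + vI)\,\frac{\Lambda - (\alpha + \mu)\,I}{\mu} = \frac{\alpha + \mu}{\beta} \tag{8.47} \]

Let $f(I)$ represent the parabola on the left hand side of equation (8.47). The endemic
equilibria of this model are found by identifying the intersection points of the parabola
$f(I)$ with the horizontal line $y = (\alpha + \mu)/\beta$.
When $f(0) > (\alpha + \mu)/\beta$ (which happens when $R_0 > 1$), there is a unique (positive)
endemic equilibrium point $E^* = (S^*, I^*)$. When $f(0) < (\alpha + \mu)/\beta$ (which
happens when $R_0 < 1$), the downward-opening parabola may meet the line at no
positive value of $I$, at one (in the case of tangency), or at two, so there may be
up to two positive endemic equilibria. If two endemic equilibria exist, we may denote
them by $E_1 = (S_1, I_1)$ and $E_2 = (S_2, I_2)$. The stability of these endemic equilibria
can be determined from the Jacobian

\[ J = \begin{pmatrix} -\beta\,(1 + vI)\,I - \mu & -\beta vIS - (\alpha + \mu) \\ \beta\,(1 + vI)\,I & \beta vIS \end{pmatrix} \]

and the theorem above. (Note that we used $\beta\,(1 + vI)\,S = (\alpha + \mu)$ to simplify the
entries of the Jacobian $J$.)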
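
Since the condition for an endemic equilibrium reduces to a quadratic in $I$, the equilibria and the trace and determinant of the Jacobian can be computed directly. The sketch below is a hedged illustration in Python: the names `lam`, `beta`, `alpha`, `mu`, `v` stand for the total birth rate, transmission coefficient, recovery rate, death rate and the coefficient $v$, and the numerical values are hypothetical choices, not values from these notes.

```python
import math

def endemic_equilibria(lam, beta, alpha, mu, v):
    """Positive equilibria (S, I) from (1 + v I)(lam - (alpha+mu) I)/mu = (alpha+mu)/beta."""
    a = v * (alpha + mu)                 # quadratic a I^2 + b I + c = 0, with v > 0
    b = (alpha + mu) - v * lam
    c = mu * (alpha + mu) / beta - lam
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    roots = {(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)}
    points = []
    for i in sorted(roots):
        s = (lam - (alpha + mu) * i) / mu
        if i > 0.0 and s > 0.0:          # keep biologically meaningful points only
            points.append((s, i))
    return points

def trace_det(s, i, beta, alpha, mu, v):
    """Trace and determinant of the Jacobian at an endemic equilibrium."""
    j11 = -beta * (1.0 + v * i) * i - mu
    j12 = -beta * v * i * s - (alpha + mu)
    j21 = beta * (1.0 + v * i) * i
    j22 = beta * v * i * s
    return j11 + j22, j11 * j22 - j12 * j21

one = endemic_equilibria(20.0, 0.0005, 0.1, 0.02, 0.01)   # R0 > 1
two = endemic_equilibria(20.0, 0.0001, 0.1, 0.02, 0.05)   # R0 < 1
print(one, [trace_det(s, i, 0.0005, 0.1, 0.02, 0.01) for s, i in one])
print(two, [trace_det(s, i, 0.0001, 0.1, 0.02, 0.05) for s, i in two])
```

Applying the trace and determinant conditions of the theorem to each returned point then indicates whether that equilibrium is locally stable or a saddle.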

8.17 Techniques for Computing R0


8.17.1 The Construction of More Complex Epidemiological
Models
The simple SI, SIS, SIR, SIRS models can be extended to incorporate more realistic
features of the disease being investigated. As the models get more complicated
though, the related flowcharts get longer with additional compartments and possible
transitions. As a result, the system of differential equations also gets larger and more
complicated to solve. Another consequence of this is that it becomes more
complicated to find the basic reproduction number $R_0$. In this section, we will discuss some
more realistic models, and introduce techniques that can be used to find $R_0$.

8.17.2 Stages of Contagion


The most important stages that we have omitted from our discussion thus far are as
follows:

1. Exposed/Latent Stage: For many diseases, it takes a while before infected
individuals become infectious, i.e., before they can spread the disease. We refer
to this time lag as the latent period. This time lag is sometimes necessary
for the disease pathogen to replicate itself and become firmly established in
the host individual. In addition, there may be an incubation period for the
disease pathogen, which is the period between infection and the onset of
symptoms. It should be noted that the length of the latent and incubation periods
need not always coincide. An additional "exposed" compartment $E(t)$ can be
added to the model to represent infected individuals who have not yet become
infectious, i.e. before they are classified as belonging to the compartment $I(t)$.
The exposed (latent) period normally follows the susceptible stage. When we
incorporate a latent period into an SIR model, the resulting model is referred
to as the SEIR model. It should be noted that SEIR models are often used
to describe the dynamics of spread of viruses like influenza and covid-19. The
system of equations describing the general SEIR model is

\[ S'(t) = \Lambda - \beta IS - \mu S, \]
\[ E'(t) = \beta SI - (\eta + \mu)\,E, \]
\[ I'(t) = \eta E - (\alpha + \mu)\,I, \]
\[ R'(t) = \alpha I - \mu R, \]

where $\Lambda$ is the total birth rate, $\eta$ is the per capita rate of becoming
infectious, and $\mu$ is the per capita death rate. The approximate length of the latent
period is $1/\eta$. The basic flowchart for this SEIR model is shown in Figure 8.11.
Note that the demographic rates (i.e. $\Lambda$ and $\mu$) are not included in this flowchart.

Figure 8.11: Flowchart of the SEIR model. Demographic rates are omitted.

2. Asymptomatic Stage: Some diseases have an associated asymptomatic stage,
where infected individuals do not exhibit symptoms of the disease. Individuals
who are asymptomatic are still infectious, and therefore can contribute to the
dynamics of spread to a significant extent. This type of infection is common
for many types of influenza and corona viruses (including covid-19). For
diseases with these characteristics, an asymptomatic compartment $A(t)$ may be
included in the model in addition to the infectious compartment $I(t)$ - in order
to account for individuals who never exhibit symptoms, and may therefore be
more likely to spread the disease. Exposed individuals in compartment $E(t)$
may progress to the symptomatic infectious compartment $I(t)$ with probability
$p$, and to the asymptomatic infectious compartment $A(t)$ with probability
$(1 - p)$. Asymptomatic individuals can also be assumed to be infectious at a
reduced transmission rate $q\beta$. The recovery rate for the asymptomatic
individuals is represented by $\gamma$. In general, the symptomatic infectious period $1/\alpha$ is
less than the asymptomatic infectious period $1/\gamma$. The governing equations for
this type of SEIAR model are:

\[ S'(t) = \Lambda - \beta S\,(I + qA) - \mu S, \]
\[ E'(t) = \beta S\,(I + qA) - (\eta + \mu)\,E, \]
\[ I'(t) = p\,\eta E - (\alpha + \mu)\,I, \]
\[ A'(t) = (1 - p)\,\eta E - (\gamma + \mu)\,A, \]
\[ R'(t) = \alpha I + \gamma A - \mu R, \]

and a flowchart depicting this model (note that the diagram does not include
the demographic rates) is shown in Figure 8.12.

Figure 8.12: Basic flowchart for SEIAR model. Demographic rates are not incorporated.

3. Carrier Stage: In order to account for people who do not get sick but carry
the pathogen and can infect others, a Carrier compartment $C(t)$ can be
incorporated into the model. This individual may have been asymptomatic and
therefore never treated for the infection, or he may be a person who was treated
for the infection, but was never cured of the disease completely. This is typical
of viral diseases (such as covid-19) and bacterial diseases (such as diphtheria). In
such models, individuals enter the carrier stage after infection. Some of these
individuals may become infected/infectious, while others may recover from the
pathogen without ever becoming infected. Recall that asymptomatic individuals
are assumed to be infectious at a reduced transmission rate $q\beta$. A flow
chart for a general SCIRS model (note that the diagram does not include the
demographic rates) is shown in Figure 8.13, and the governing equations for the
system are:

\[ S'(t) = \Lambda - \beta S\,(I + qC) - \mu S + \rho R, \]
\[ C'(t) = \beta S\,(I + qC) - (\eta + \gamma + \mu)\,C, \]
\[ I'(t) = \eta C - (\alpha + \mu)\,I, \]
\[ R'(t) = \alpha I + \gamma C - (\rho + \mu)\,R. \]

Figure 8.13: Basic flowchart for the SCIRS model (with demographic rates excluded).

4. Passive Immunity Stage: This is a form of immunity that can be transferred
to an individual passively in the form of antibodies. This can occur naturally
when a pregnant woman transfers antibodies to her fetus through the placenta,
or when she breast-feeds her child after birth. Passive immunity can also be
transmitted artificially by injecting blood plasma containing antibodies from a
recovered individual to someone who has the same illness. This type of
treatment is currently being tested for covid-19. The passive immunity stage is
denoted by $M(t)$, and this stage can be inserted into the model before the
individuals are considered susceptible. A simple MSIR model with passive
immunity can have the following governing equations:

\[ M'(t) = \Lambda - \rho M - \mu M, \]
\[ S'(t) = \rho M - \beta SI - \mu S, \]
\[ I'(t) = \beta SI - (\alpha + \mu)\,I, \]
\[ R'(t) = \alpha I - \mu R, \]

where $\rho$ is the rate of loss of maternal antibodies per unit time.
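
A useful consistency check when building extended compartmental models of this kind is that, with mass-action incidence and the same per capita death rate in every compartment, the right-hand sides should sum to the total birth rate minus the death rate times the total population. The Python sketch below performs this check for an SEIAR model; the parameter names (`eta` for the rate of leaving the exposed class, `gamma` for the asymptomatic recovery rate) and all numerical values are assumptions made only for illustration.

```python
def seiar_rhs(state, lam, beta, q, eta, p, alpha, gamma, mu):
    """Right-hand sides of an SEIAR model with demography."""
    s, e, i, a, r = state
    force = beta * (i + q * a)            # force of infection on susceptibles
    ds = lam - force * s - mu * s
    de = force * s - (eta + mu) * e
    di = p * eta * e - (alpha + mu) * i
    da = (1.0 - p) * eta * e - (gamma + mu) * a
    dr = alpha * i + gamma * a - mu * r
    return ds, de, di, da, dr

# Hypothetical state and parameter values.
state = (800.0, 40.0, 25.0, 15.0, 120.0)
params = dict(lam=20.0, beta=0.0005, q=0.5, eta=0.2, p=0.7,
              alpha=0.1, gamma=0.08, mu=0.02)
derivs = seiar_rhs(state, **params)
print(sum(derivs), params["lam"] - params["mu"] * sum(state))
```

The two printed numbers agree because every transfer term between compartments cancels in the sum, leaving only births and deaths.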

8.17.3 Stages Related to Control Strategies


Compartmental models may also include compartments that are associated with dis-
ease control strategies. Examples of these are quarantine/isolation, vaccination and
treatment.

1. Quarantine/Isolation: Quarantine is a compulsory isolation that is imposed
on an individual to contain the spread of a disease. It is applied after
an individual comes into close contact with an infectious (or suspected to be
infectious) individual. Isolation is the term used for the confinement of an
infectious individual in order to reduce contact with healthy susceptible
individuals in the population. The quarantined/isolated individuals are
represented by the compartment $Q(t)$ in the model. For the resulting SIQR
model, it is common practice to assume standard incidence rather than
mass-action incidence. All individuals who are not quarantined/isolated are
referred to as active individuals, hence the active class $A(t)$ can be
considered to be

\[
A(t) = S(t) + I(t) + R(t).
\]

The other governing equations for the SIQR model can be written as

\[
\begin{aligned}
S'(t) &= \Lambda - \frac{\beta S I}{A} - \mu S, \\
I'(t) &= \frac{\beta S I}{A} - (\alpha + \gamma + \mu) I, \\
Q'(t) &= \gamma I - (\eta + \mu) Q, \\
R'(t) &= \alpha I + \eta Q - \mu R,
\end{aligned}
\]

and a basic flow chart for this SIQR model (without the demographic rates)
is shown in Figure 8.14.
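The SIQR system can be explored numerically. The following is a minimal sketch rather than a definitive implementation: it integrates the four equations with a classical Runge-Kutta step using only the Python standard library. The rate names are assumptions made for this sketch (alpha for recovery of infectious individuals, gamma for the quarantine rate, eta for the rate of leaving quarantine, mu for natural death, Lam for recruitment), and all numerical values are hypothetical.

```python
def siqr_rhs(state, Lam, beta, alpha, gamma, eta, mu):
    """Right-hand side of the SIQR system with standard incidence."""
    S, I, Q, R = state
    A = S + I + R                      # active (non-quarantined) population
    new_infections = beta * S * I / A  # standard incidence divides by A
    return (
        Lam - new_infections - mu * S,
        new_infections - (alpha + gamma + mu) * I,
        gamma * I - (eta + mu) * Q,
        alpha * I + eta * Q - mu * R,
    )


def rk4_step(state, dt, params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = siqr_rhs(state, *params)
    k2 = siqr_rhs(shifted(k1, dt / 2), *params)
    k3 = siqr_rhs(shifted(k2, dt / 2), *params)
    k4 = siqr_rhs(shifted(k3, dt), *params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, dt, steps, params):
    """Apply rk4_step repeatedly and return the final state."""
    for _ in range(steps):
        state = rk4_step(state, dt, params)
    return state


# Hypothetical values, in the order (Lam, beta, alpha, gamma, eta, mu).
params = (10.0, 0.5, 0.1, 0.2, 0.1, 0.01)
initial = (999.0, 1.0, 0.0, 0.0)       # total equals Lam / mu = 1000
final = simulate(initial, dt=0.1, steps=2000, params=params)
print("S, I, Q, R at t = 200:", [round(x, 2) for x in final])
```

Because the four equations sum to Lam - mu (S + I + Q + R), a run that starts with total population Lam / mu should keep that total constant, which gives a simple sanity check on the integration.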
2. Treatment: The care provided to reduce morbidity of a disease can be
incorporated by adding a treatment compartment $T(t)$ to the model. This
includes the administering of medications that alleviate symptoms of illness,
or boost the immune system to aid in recovery. This compartment may replace
the recovered class $R(t)$ to create an SEIT model. Since some patients will
not respond adequately to the treatment, and some patients do not complete the
treatment regimen, a treated infectious individual either relapses to the
exposed/latent class with probability $p$, or is successfully treated with
probability $q$ (with $p + q = 1$). Possible governing equations for an SEIT
model are as follows:

\[
\begin{aligned}
S'(t) &= \Lambda - \frac{\beta_1 S I}{N} - \mu S, \\
E'(t) &= \frac{\beta_1 S I}{N} + \frac{\beta_2 T I}{N} - (\mu + \kappa + r_1) E + p r_2 I, \\
I'(t) &= \kappa E - (r_2 + \mu) I, \\
T'(t) &= r_1 E + q r_2 I - \frac{\beta_2 T I}{N} - \mu T.
\end{aligned}
\]

Here $r_1$ is the treatment rate of exposed individuals, $r_2$ is the treatment
rate of infectious individuals, $\kappa$ represents the rate of progression of
exposed individuals to the infectious compartment $I(t)$, $\beta_1$ is the
transmission rate constant for the susceptible class $S(t)$, and $\beta_2$ is
the transmission rate constant for the treatment class $T(t)$. A basic flow
chart for this SEIT model (without the demographic rates) is shown in
Figure 8.15.

Figure 8.14: Flow chart for the basic SIQR model (without demographic rates).

Figure 8.15: SEIT flow diagram (excluding demographic rates).

3. Vaccination: A vaccine is normally manufactured from dead or weakened
antigenic material. Vaccines are administered to individuals in an attempt to
make them immune (permanently or temporarily) to the disease. There are
two common approaches to incorporating this into the compartmental model.
The first is to administer the vaccine to all individuals who are entering the
population under study. This results in a proportion $p$ entering the
susceptible class, and a proportion $1 - p$ entering the recovered (or immune)
class. Another way to account for vaccinations is via the incorporation of a
vaccinated class $V(t)$, with susceptible individuals moving to this
compartment after receiving the vaccine.

8.17.4 Stages Related to Pathogen or Host Heterogeneity

In this section, we consider the impact of host or pathogen heterogeneities on
disease dynamics.

1. Pathogen Genetic Heterogeneities: When there are multiple strains of a
disease that are genetically distinct, there are implications for disease
progression and spread. This is because each strain of the pathogen may
respond quite differently to the control measures that are adopted, meaning
that the transmission rate constants for each strain would be different, as
would the rate of recovery of infected individuals. Multi-strain compartmental
models may be utilized to account for this. An SIS model for such a disease
with two strains may include a compartment $I_1$ with individuals infected by
the first strain, and a compartment $I_2$ with those who are infected by the
second strain. The system of governing equations for this compartmental model
is as follows:

\[
\begin{aligned}
S'(t) &= \Lambda - \beta_1 S I_1 - \beta_2 S I_2 - \mu S + \alpha_1 I_1 + \alpha_2 I_2, \\
I_1'(t) &= \beta_1 S I_1 - (\mu + \alpha_1) I_1, \\
I_2'(t) &= \beta_2 S I_2 - (\mu + \alpha_2) I_2,
\end{aligned}
\]

and the related flowchart is shown in Figure 8.16.

2. Host Heterogeneities: It is sometimes possible for the pathogen to infect
multiple host species. For example, some types of influenza can infect animals
and humans (such as swine flu). To account for the differences that would be
inherent in such cases, even the simplest SI models should have two susceptible
and two infected populations. A simple multihost single pathogen SI model
with two susceptible populations $S_w, S_d$ and two infected populations
$I_w, I_d$ would have governing equations

\[
\begin{aligned}
S_w'(t) &= \Lambda_w - \beta_{11} S_w I_w - \beta_{12} S_w I_d - \mu_w S_w, \\
I_w'(t) &= \beta_{11} S_w I_w + \beta_{12} S_w I_d - (\mu_w + \alpha_w) I_w, \\
S_d'(t) &= \Lambda_d - \beta_{21} S_d I_w - \beta_{22} S_d I_d - \mu_d S_d, \\
I_d'(t) &= \beta_{21} S_d I_w + \beta_{22} S_d I_d - (\mu_d + \alpha_d) I_d,
\end{aligned}
\]

with the related flowchart (excluding demographic rates) shown in Figure 8.17.

Figure 8.16: Flowchart for a simple two-strain SIS model (demographic rates not
included).

Figure 8.17: Flowchart for a simple two-host single pathogen SI model (excluding
demographic rates).

8.18 Calculating R0 via the Jacobian


The basic reproduction number R0 provides us with a threshold condition for the
stability of the disease-free equilibrium state. In general, the basic reproduction
number should have the following properties:

1. It should be nonnegative for nonnegative parameter values.

2. It is zero when there is no disease transmission taking place.

3. It indicates the number of secondary infections.


After the Jacobian of the system at the disease-free equilibrium point has been
identified, for stability we must impose the condition that all eigenvalues of
the Jacobian (the roots of the related characteristic equation) have negative
real parts. In the 2D case, this holds exactly when the trace of the Jacobian
$J$ is negative and the determinant of $J$ is positive (i.e. $\mathrm{Tr}\,J < 0$
and $\mathrm{Det}\,J > 0$). For higher dimensional cases, this simple test no
longer applies. However, it is sometimes possible to reduce the characteristic
equation to the 2D case (by applying matrix reduction techniques), as seen in
the following examples:

8.18.1 Example #1: Jacobian reduces to a 2D matrix


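Consider the SIRS model with a carrier stage incorporated (previously described),
with governing equations:

\[
\begin{aligned}
S'(t) &= \Lambda - \beta S (I + qC) - \mu S + \rho R, \\
C'(t) &= \beta S (I + qC) - (\eta + \gamma + \mu) C, \\
I'(t) &= \eta C - (\alpha + \mu) I, \\
R'(t) &= \alpha I + \gamma C - (\mu + \rho) R.
\end{aligned}
\]

The related Jacobian at the disease-free equilibrium is found to be

\[
J = \begin{pmatrix}
-\mu & -q\beta S^* & -\beta S^* & \rho \\
0 & q\beta S^* - (\eta + \gamma + \mu) & \beta S^* & 0 \\
0 & \eta & -(\alpha + \mu) & 0 \\
0 & \gamma & \alpha & -(\mu + \rho)
\end{pmatrix}
\]

where $S^* = \Lambda/\mu$. By setting $|J - \lambda I| = 0$, we obtain the
eigenvalues for the problem.

We can expand the $4 \times 4$ determinant along the first column to obtain the
first eigenvalue $\lambda_1 = -\mu$. The reduced $3 \times 3$ determinant can
then be expanded along the last column to give the eigenvalue
$\lambda_2 = -(\mu + \rho)$. The remaining two eigenvalues are deduced from the
"reduced" Jacobian matrix

\[
J_1 = \begin{pmatrix}
q\beta S^* - (\eta + \gamma + \mu) & \beta S^* \\
\eta & -(\alpha + \mu)
\end{pmatrix}.
\]

We can then apply the normal conditions to guarantee that the eigenvalues of
$J_1$ will have negative real part, i.e., $\mathrm{Tr}\,J_1 < 0$ and
$\mathrm{Det}\,J_1 > 0$. The second inequality gives

\[
-\left( q\beta S^* - (\eta + \gamma + \mu) \right)(\alpha + \mu) - \eta\beta S^* > 0,
\]

which leads to the basic reproduction number

\[
R_0 = \frac{q\beta S^*}{\eta + \gamma + \mu}
      + \frac{\eta\beta S^*}{(\eta + \gamma + \mu)(\alpha + \mu)}. \tag{8.48}
\]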
Note that the condition $R_0 < 1$ implies that $\mathrm{Tr}\,J_1 < 0$ and
$\mathrm{Det}\,J_1 > 0$. Therefore, when $R_0 < 1$, the disease-free equilibrium
is locally asymptotically stable, and when $R_0 > 1$, the disease-free
equilibrium is unstable. For the basic reproduction number (8.48), it should be
noted that the first term $\dfrac{q\beta S^*}{\eta + \gamma + \mu}$ represents
the number of secondary infections produced by one carrier in an entirely
susceptible population during its lifetime as a carrier. The second term
$\dfrac{\eta\beta S^*}{(\eta + \gamma + \mu)(\alpha + \mu)}$ corresponds to the
number of secondary infections produced by one infectious individual in an
entirely susceptible population during its lifetime as an infectious
individual, weighted by the fraction $\eta/(\eta + \gamma + \mu)$ of carriers
who progress to the infectious class.
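The link between the threshold on the basic reproduction number and the trace/determinant test can be checked numerically. The sketch below is illustrative only: the parameter names (Lam for recruitment, beta for transmission, q for the relative infectivity of carriers, eta for progression from carrier to infectious, gamma and alpha for recovery of carriers and infectious individuals, mu for natural death) are assumptions of this sketch, and the numerical values are hypothetical.

```python
def carrier_r0(Lam, beta, q, eta, gamma, alpha, mu):
    """Basic reproduction number for the SIRS model with a carrier stage."""
    S_star = Lam / mu  # susceptible population at the disease-free equilibrium
    carrier_term = q * beta * S_star / (eta + gamma + mu)
    infectious_term = eta * beta * S_star / ((eta + gamma + mu) * (alpha + mu))
    return carrier_term + infectious_term


def reduced_jacobian(Lam, beta, q, eta, gamma, alpha, mu):
    """The 2x2 matrix governing the (C, I) directions at the equilibrium."""
    S_star = Lam / mu
    return [[q * beta * S_star - (eta + gamma + mu), beta * S_star],
            [eta, -(alpha + mu)]]


def trace_det_stable(J):
    """True when Tr J < 0 and Det J > 0 for a 2x2 matrix J."""
    trace = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return trace < 0 and det > 0


# Hypothetical parameter values, chosen only for illustration.
base = dict(Lam=1.0, q=0.5, eta=0.2, gamma=0.1, alpha=0.3, mu=0.1)

for beta in (0.01, 0.05):
    r0 = carrier_r0(beta=beta, **base)
    stable = trace_det_stable(reduced_jacobian(beta=beta, **base))
    print(f"beta={beta}: R0={r0:.3f}, disease-free equilibrium stable: {stable}")
```

With these values the smaller transmission rate gives a reproduction number below one and passes the trace/determinant test, while the larger one exceeds one and fails it.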

8.18.2 Routh-Hurwitz Criteria in Higher Dimensions


It is often the case with more complicated models that the Jacobian computed for
the disease-free equilibrium cannot be reduced to a $2 \times 2$ matrix (as was
the case in the previous example). The characteristic polynomial for such cases
is often of degree 3 or higher. In such cases, the basic reproduction number
$R_0$ for the system can be obtained from the constant term of the
characteristic polynomial. Whether the basic reproduction number is greater or
less than one determines the sign of this constant term.
The Routh-Hurwitz criteria can provide necessary and sufficient conditions for
the eigenvalues to have negative real parts for such cases.
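Theorem 8.18.1 (Routh-Hurwitz Criteria): Consider the $n$th degree characteristic
polynomial with real constant coefficients

\[
P(\lambda) = \lambda^n + a_1 \lambda^{n-1} + \cdots + a_{n-1}\lambda + a_n.
\]

Define $n$ Hurwitz matrices using the coefficients $a_i$ of the characteristic
polynomial as follows:

\[
H_1 = (a_1), \qquad
H_2 = \begin{pmatrix} a_1 & 1 \\ a_3 & a_2 \end{pmatrix}, \qquad
H_3 = \begin{pmatrix} a_1 & 1 & 0 \\ a_3 & a_2 & a_1 \\ a_5 & a_4 & a_3 \end{pmatrix}
\]

and

\[
H_n = \begin{pmatrix}
a_1 & 1 & 0 & 0 & \cdots & 0 \\
a_3 & a_2 & a_1 & 1 & \cdots & 0 \\
a_5 & a_4 & a_3 & a_2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & a_n
\end{pmatrix}
\]

where $a_j = 0$ if $j > n$. All roots of the polynomial $P(\lambda)$ are negative
or have negative real part if and only if the determinants of all Hurwitz
matrices are positive:

\[
\mathrm{Det}\,H_j > 0, \qquad j = 1, \dots, n.
\]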
For $n = 2$, the Routh-Hurwitz criterion simplifies to $a_1 > 0$ and
$a_1 a_2 > 0$. Note also that $a_3 = 0$ in $H_2$. These conditions are equivalent
to $a_1 > 0$ and $a_2 > 0$ and are analogous to the conditions applied before:
$\mathrm{Tr}\,J_1 < 0$ and $\mathrm{Det}\,J_1 > 0$. The Routh-Hurwitz criteria for
polynomials of degree $n = 2, 3, 4, 5$ are given in the table below:

n   Coefficient Signs                          Additional Conditions (if any)
2   $a_1 > 0,\ a_2 > 0$                        n/a
3   $a_1 > 0,\ a_2 > 0,\ a_3 > 0$              $a_1 a_2 > a_3$
4   $a_1 > 0,\ a_2 > 0,\ a_3 > 0,\ a_4 > 0$    $a_1 a_2 a_3 > a_3^2 + a_1^2 a_4$
5   $a_1 > 0,\ a_2 > 0,\ a_3 > 0,\ a_4 > 0,$   $a_1 a_2 a_3 > a_3^2 + a_1^2 a_4$, and
    $a_5 > 0$                                  $(a_1 a_4 - a_5)(a_1 a_2 a_3 - a_3^2 - a_1^2 a_4) > a_5 (a_1 a_2 - a_3)^2 + a_1 a_5^2$

Note that a necessary but not sufficient condition for the roots of the
polynomial $P(\lambda)$ to be negative or to have negative real part is that its
coefficients satisfy $a_i > 0$ for all $i = 1, \dots, n$.
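The determinant test in the theorem is mechanical enough to automate. The following is a minimal sketch, not a library routine: the function names are hypothetical, the polynomial is assumed monic with coefficients supplied as the list [a1, ..., an], and the determinant uses plain cofactor expansion, which is adequate only for the small matrices that arise here.

```python
def hurwitz_matrix(coeffs, k):
    """Return the k-th Hurwitz matrix for coefficients [a1, ..., an].

    Entry (i, j), counted from 1, is a_{2i - j}, with a_0 = 1 and
    a_m = 0 whenever m < 0 or m > n.
    """
    n = len(coeffs)

    def a(index):
        if index == 0:
            return 1
        if 1 <= index <= n:
            return coeffs[index - 1]
        return 0

    return [[a(2 * i - j) for j in range(1, k + 1)] for i in range(1, k + 1)]


def determinant(matrix):
    """Determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total


def routh_hurwitz_stable(coeffs):
    """True when every Hurwitz determinant Det H_1, ..., Det H_n is positive."""
    n = len(coeffs)
    return all(determinant(hurwitz_matrix(coeffs, k)) > 0 for k in range(1, n + 1))


# (x + 1)(x + 2)(x + 3) = x^3 + 6x^2 + 11x + 6 has roots -1, -2, -3.
print(routh_hurwitz_stable([6, 11, 6]))

# x^3 + x^2 + x + 6 has positive coefficients but a1*a2 < a3, so the test fails.
print(routh_hurwitz_stable([1, 1, 6]))
```

The second polynomial illustrates why positive coefficients alone are necessary but not sufficient: all of its coefficients are positive, yet the second Hurwitz determinant is negative.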
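The Routh-Hurwitz criterion can be used to derive the basic reproduction number
$R_0$ for the SEIR model with asymptomatic stage. Recall that the governing
equations for the basic SEIAR model we outlined previously are:

\[
\begin{aligned}
S'(t) &= \Lambda - \beta S (I + qA) - \mu S, \\
E'(t) &= \beta S (I + qA) - (\eta + \mu) E, \\
I'(t) &= p\eta E - (\alpha + \mu) I, \\
A'(t) &= (1 - p)\eta E - (\gamma + \mu) A, \\
R'(t) &= \alpha I + \gamma A - \mu R.
\end{aligned}
\]

It can be easily shown that the disease-free equilibrium in this model is
$(S^*, 0, 0, 0, 0)$, where $S^* = \Lambda/\mu$. The Jacobian at the disease-free
equilibrium is found to be

\[
J = \begin{pmatrix}
-\mu & 0 & -\beta S^* & -q\beta S^* & 0 \\
0 & -(\eta + \mu) & \beta S^* & q\beta S^* & 0 \\
0 & p\eta & -(\alpha + \mu) & 0 & 0 \\
0 & (1 - p)\eta & 0 & -(\gamma + \mu) & 0 \\
0 & 0 & \alpha & \gamma & -\mu
\end{pmatrix} \tag{8.49}
\]

The characteristic equation is found from $|J - \lambda I| = 0$, and we can then
expand the related determinant by the first column and then by the last column
to obtain two of the eigenvalues $\lambda_1 = -\mu$, $\lambda_2 = -\mu$. The
last three eigenvalues can be found by finding the eigenvalues of the reduced
$3 \times 3$ matrix

\[
J_1 = \begin{pmatrix}
-(\eta + \mu) & \beta S^* & q\beta S^* \\
p\eta & -(\alpha + \mu) & 0 \\
(1 - p)\eta & 0 & -(\gamma + \mu)
\end{pmatrix}
\]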
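and $|J_1 - \lambda I| = 0$ gives

\[
\begin{vmatrix}
-(\lambda + \eta + \mu) & \beta S^* & q\beta S^* \\
p\eta & -(\lambda + \alpha + \mu) & 0 \\
(1 - p)\eta & 0 & -(\lambda + \gamma + \mu)
\end{vmatrix} = 0,
\]

that is,

\[
(\lambda + \eta + \mu)(\lambda + \alpha + \mu)(\lambda + \gamma + \mu)
- (1 - p) q \eta\beta S^* (\lambda + \alpha + \mu)
- p\eta\beta S^* (\lambda + \gamma + \mu) = 0,
\]

which gives the cubic characteristic equation

\[
P(\lambda) := \lambda^3 + a_1 \lambda^2 + a_2 \lambda + a_3 = 0
\]

where

\[
\begin{aligned}
a_1 &= \eta + \mu + \alpha + \mu + \gamma + \mu, \\
a_2 &= (\eta + \mu)(\alpha + \mu) + (\eta + \mu)(\gamma + \mu) + (\alpha + \mu)(\gamma + \mu)
       - (1 - p) q \eta\beta S^* - p\eta\beta S^*, \\
a_3 &= (\eta + \mu)(\alpha + \mu)(\gamma + \mu)
       - (\alpha + \mu)(1 - p) q \eta\beta S^* - (\gamma + \mu) p\eta\beta S^*.
\end{aligned}
\]

As stated before, the basic reproduction number $R_0$ for the system can be
obtained from the constant term of the characteristic polynomial, which
in this case is $a_3$. Whether the basic reproduction number is greater or less
than one determines the sign of this constant term $a_3$. As $R_0 < 1$ should be
equivalent to $a_3 > 0$, we can define $R_0$ to be

\[
R_0 = \frac{(1 - p) q \eta\beta S^*}{(\eta + \mu)(\gamma + \mu)}
      + \frac{p\eta\beta S^*}{(\eta + \mu)(\alpha + \mu)}.
\]

With this definition, $a_3 = (\eta + \mu)(\alpha + \mu)(\gamma + \mu)(1 - R_0)$,
and we see that $R_0 < 1$ corresponds to $a_3 > 0$.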


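To determine whether the disease-free equilibrium is locally asymptotically stable
when $R_0 < 1$, we must normally show that, for this equilibrium state, all the
roots of the characteristic equation $P(\lambda) = 0$ are either negative real
numbers or complex numbers with negative real parts. We can do this without
actually finding the roots by using the Routh-Hurwitz criteria for a polynomial
$P(\lambda)$ of degree three, which are that $a_1 > 0$, $a_2 > 0$, $a_3 > 0$, and
also $a_1 a_2 > a_3$.
When $R_0 < 1$, we know already that $a_3 > 0$. It is clear that $a_1 > 0$, since
$\eta, \alpha, \gamma, \mu$ are all positive constants. As $R_0 < 1$, each of its
two terms is less than one, so
$(\eta + \mu)(\gamma + \mu) > (1 - p) q \eta\beta S^*$ and
$(\eta + \mu)(\alpha + \mu) > p\eta\beta S^*$. It therefore follows that
$a_2 > 0$.

Since $(\eta + \mu)(\gamma + \mu) > (1 - p) q \eta\beta S^*$ and
$(\eta + \mu)(\alpha + \mu) > p\eta\beta S^*$, it follows that

\[
\begin{aligned}
a_2 &> p\eta\beta S^* + (\alpha + \mu)(\gamma + \mu) + (1 - p) q \eta\beta S^*
       - (1 - p) q \eta\beta S^* - p\eta\beta S^* \\
    &= (\alpha + \mu)(\gamma + \mu)
\end{aligned}
\]

and since

\[
a_1 = \eta + \mu + \alpha + \mu + \gamma + \mu > \eta + \mu
\]

it follows that

\[
a_1 a_2 > (\eta + \mu)(\alpha + \mu)(\gamma + \mu).
\]

Next, it is quite clear that

\[
\begin{aligned}
a_3 &= (\eta + \mu)(\alpha + \mu)(\gamma + \mu)
       - (\alpha + \mu)(1 - p) q \eta\beta S^* - (\gamma + \mu) p\eta\beta S^* \\
    &< (\eta + \mu)(\alpha + \mu)(\gamma + \mu).
\end{aligned}
\]

Hence it follows that

\[
a_1 a_2 > (\eta + \mu)(\alpha + \mu)(\gamma + \mu) > a_3
\]

and the Routh-Hurwitz criterion implies that the disease-free equilibrium is
locally asymptotically stable when $R_0 < 1$.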
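The argument above can be spot-checked numerically for particular parameter values. The sketch below is only an illustration under stated assumptions: the parameter names (Lam for recruitment, beta for transmission, q for the relative infectivity of asymptomatic individuals, p for the symptomatic fraction, eta for progression out of the exposed class, alpha and gamma for recovery of symptomatic and asymptomatic individuals, mu for natural death) and all numerical values are hypothetical.

```python
def seiar_quantities(Lam, beta, q, p, eta, alpha, gamma, mu):
    """Return (a1, a2, a3, R0) for the reduced cubic of the SEIAR model."""
    S_star = Lam / mu
    e, i, a = eta + mu, alpha + mu, gamma + mu   # total exit rates of E, I, A
    asym = (1 - p) * q * eta * beta * S_star     # asymptomatic transmission term
    sym = p * eta * beta * S_star                # symptomatic transmission term
    a1 = e + i + a
    a2 = e * i + e * a + i * a - asym - sym
    a3 = e * i * a - i * asym - a * sym
    r0 = asym / (e * a) + sym / (e * i)
    return a1, a2, a3, r0


def routh_hurwitz_cubic(a1, a2, a3):
    """Routh-Hurwitz conditions for a monic cubic polynomial."""
    return a1 > 0 and a2 > 0 and a3 > 0 and a1 * a2 > a3


# Hypothetical parameter values, chosen only for illustration.
base = dict(Lam=1.0, q=0.5, p=0.6, eta=0.2, alpha=0.3, gamma=0.4, mu=0.1)

for beta in (0.01, 0.2):
    a1, a2, a3, r0 = seiar_quantities(beta=beta, **base)
    print(f"beta={beta}: R0={r0:.4f}, a3={a3:.4f}, "
          f"Routh-Hurwitz satisfied: {routh_hurwitz_cubic(a1, a2, a3)}")
```

In both runs the sign of the constant term agrees with the side of one on which the reproduction number falls, as the derivation predicts.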
Mathematically, everything is correct. From an epidemiological point of view, we
must interpret $R_0$ to be the number of secondary cases resulting from a single
infected individual. Recall that we had found the basic reproduction number to be

\[
R_0 = \frac{(1 - p) q \eta\beta S^*}{(\eta + \mu)(\gamma + \mu)}
      + \frac{p\eta\beta S^*}{(\eta + \mu)(\alpha + \mu)}.
\]

The first term in $R_0$ is the number of secondary infections that one
asymptomatic individual will produce in an entirely susceptible population
during its lifespan as asymptomatic:

\[
R_a = \frac{(1 - p) q \eta\beta S^*}{(\eta + \mu)(\gamma + \mu)}.
\]

We come to this conclusion since $q\beta S^*$ is the number of newly exposed
individuals resulting from contact/interaction with one asymptomatic individual
per unit of time in an entirely susceptible population. The fraction
$\dfrac{(1 - p)\eta}{\eta + \mu}$ of exposed individuals move from the exposed
class to the asymptomatic class. A single asymptomatic individual remains
asymptomatic, and able to infect other individuals, for $1/(\gamma + \mu)$ units
of time.

The second term in $R_0$ is the number of secondary infections that one
symptomatic infectious individual will produce in an entirely susceptible
population during its lifespan as symptomatic and infectious:

\[
R_s = \frac{p\eta\beta S^*}{(\eta + \mu)(\alpha + \mu)}.
\]

We can say this since $\beta S^*$ is the number of newly exposed individuals
resulting from one infectious individual per unit of time in an entirely
susceptible population. The fraction $\dfrac{p\eta}{\eta + \mu}$ of exposed
individuals progress from the exposed class to the infectious stage. A single
infectious individual remains infectious to susceptible individuals for
$1/(\alpha + \mu)$ units of time.
We can therefore say that $R_0 = R_a + R_s$.

Remark 8.18.1 The Jacobian approach works well when the necessary and sufficient
conditions for stability of the Jacobian can be reduced to a single condition.
This leads to the reduction of the Jacobian $J$ to a $2 \times 2$ matrix, which
can then be analyzed accordingly. In such cases, $\mathrm{Tr}\,J < 0$ is either
automatically satisfied, or follows immediately from the requirement that
$R_0 < 1$ (that is, $\mathrm{Det}\,J > 0$). Sometimes though (as is often the
case when host heterogeneities are included in the model), $\mathrm{Tr}\,J < 0$
is not automatic, and also does not follow from the condition
$\mathrm{Det}\,J > 0$. The Jacobian approach does not work well for these types
of models, and it is not easy to define $R_0$ using this approach.

8.18.3 The Next-Generation Approach (Van den Driessche and Watmough Method)

For compartmental models of infectious diseases that involve systems of
ordinary differential equations, we can define a next-generation matrix. The
compartments are classified as either infected or not infected, depending on
whether the individuals in that compartment are infected or not, and regardless
of whether they are infectious or not (i.e. we do not care if the individuals
are still in a "latent" phase)$^3$. If there are $n$ infected compartments and
$m$ noninfected compartments in the model, the system of ordinary differential
equations should have $n + m$ dependent variables in it. We let $x$ be the
vector of dependent variables in the infected compartments (i.e.
$x \in \mathbb{R}^n$), and $y$ be the vector of dependent variables in the
non-infected compartments (i.e. $y \in \mathbb{R}^m$). We then proceed as
follows:

$^3$ P. van den Driessche and J. Watmough, Reproduction numbers and sub-threshold
endemic equilibria for compartmental models of disease transmission, Math.
Biosci., 180 (2002), pp. 29–48. John A. Jacquez memorial volume.

1. Arrange the system of equations so that the first $n$ equations correspond to
infected compartments, and the last $m$ equations to the non-infected
compartments. Hence the system of ODEs is written as

\[
\begin{aligned}
x_i' &= f_i(x, y), & i &= 1, \dots, n, \\
y_j' &= g_j(x, y), & j &= 1, \dots, m.
\end{aligned} \tag{8.50}
\]

2. Split the right hand side of the infected compartments as follows:

\[
\begin{aligned}
x_i' &= \mathcal{F}_i(x, y) - \mathcal{V}_i(x, y), & i &= 1, \dots, n, \\
y_j' &= g_j(x, y), & j &= 1, \dots, m,
\end{aligned} \tag{8.51}
\]

where $\mathcal{F}_i(x, y)$ is the rate of appearance of new infections in
compartment $i$, and $\mathcal{V}_i(x, y)$ has all the remaining transitional
terms (i.e. births, deaths, disease progression and recovery).
Note that this decomposition (into infected and non-infected compartments,
and then subsequently into $\mathcal{F}$ and $\mathcal{V}$ within the infected
compartments) may not be unique. Different decompositions lead to different
interpretations of the disease process, and different expressions for the basic
reproduction number. Note though that the following must be satisfied:

• $\mathcal{F}_i(0, y) = 0$ and $\mathcal{V}_i(0, y) = 0$ for all $y \geq 0$ and
  $i = 1, \dots, n$. Hence, all new infections are secondary infections
  resulting from infected hosts. This also ensures that there is no immigration
  of susceptible individuals into the disease compartments.
• $\mathcal{F}_i(x, y) \geq 0$ for all $x, y \geq 0$.
• $\mathcal{V}_i(x, y) \leq 0$ whenever $x_i = 0$, for $i = 1, \dots, n$. Each
  component $\mathcal{V}_i$ represents the net outflow from compartment $i$, so
  when the compartment is empty only inflow (a negative value) is possible.
• $\sum_{i=1}^{n} \mathcal{V}_i(x, y) \geq 0$ for all $x, y \geq 0$. The total
  outflow from all infected compartments is nonnegative.

3. Assume that the disease-free system

\[
y' = g(0, y)
\]

has a unique disease-free equilibrium $E_0 = (0, y_0)$ such that all solutions
with initial conditions of the form $(0, y)$ approach $(0, y_0)$ as
$t \to \infty$. Next, find the disease-free equilibrium $E_0$.

4. Determine the matrices $F$ and $V$ with components

\[
F = \left[ \frac{\partial \mathcal{F}_i(0, y_0)}{\partial x_j} \right], \qquad
V = \left[ \frac{\partial \mathcal{V}_i(0, y_0)}{\partial x_j} \right].
\]

These matrices come from the linearization of the system around the
disease-free equilibrium point $E_0$, and it can be shown that

\[
\frac{\partial \mathcal{F}_i(0, y_0)}{\partial y_j}
= \frac{\partial \mathcal{V}_i(0, y_0)}{\partial y_j} = 0
\quad \text{for all pairs } (i, j).
\]

Hence, the linearized equations for the infected compartments $x$ at the
disease-free equilibrium are decoupled from the remaining equations. The
linearized system for the infected compartments can be expressed as

\[
x' = (F - V)\, x.
\]

5. The next-generation matrix $K$ is defined as

\[
K = F V^{-1}
\]

and the basic reproduction number $R_0$ can be found by finding the spectral
radius of the matrix $K$:

\[
R_0 = \rho\left( F V^{-1} \right).
\]

(Note: $\rho(A)$ denotes the spectral radius of the matrix $A$.)

Definition 8.18.1 The spectral radius of a matrix $A$ is defined to be the
maximum of the absolute values of the eigenvalues of $A$:

\[
\rho(A) = \sup \{ |\lambda| : \lambda \in \sigma(A) \},
\]

where $\sigma(A)$ denotes the set of eigenvalues of $A$.

It can be shown that $V$ is a nonsingular M-matrix, where an M-matrix is
defined below:

Definition 8.18.2 A matrix $A$ is called an M-matrix if

• The off-diagonal elements of $A$ are nonpositive.
• The inverse of $A$ exists, and has nonnegative elements, i.e. $A^{-1} \geq 0$.

Since $V$ is an M-matrix, $V^{-1} \geq 0$, i.e., $V^{-1}$ has only nonnegative
entries. Since $F$ also has nonnegative entries, the next-generation matrix
$K = F V^{-1}$ is also nonnegative. Hence it follows that $K$ has its spectral
radius as an eigenvalue (by the Perron-Frobenius theorem), and there are no
other eigenvalues with larger modulus. The largest positive eigenvalue gives us
the basic reproduction number $R_0$.

Summary 8.18.1 The basic reproduction number $R_0$ is the largest positive
eigenvalue of the next-generation matrix $K$.
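The procedure can be tried on the SIRS model with a carrier stage from Example #1. The sketch below rests on choices that are assumptions of this illustration rather than part of the notes: the infected compartments are taken to be C and I, all new infections are assumed to enter C, the helper functions handle only 2x2 matrices, and the parameter names and values are hypothetical.

```python
import cmath


def inverse_2x2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius_2x2(M):
    """Largest modulus among the two eigenvalues of a 2x2 matrix."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return max(abs((trace + root) / 2), abs((trace - root) / 2))


# Hypothetical parameter values, chosen only for illustration.
Lam, mu = 1.0, 0.1
beta, q = 0.01, 0.5
eta, gamma, alpha = 0.2, 0.1, 0.3
S_star = Lam / mu

# New infections (into C) and remaining transitions, linearized at (S*, 0, 0, 0).
F = [[q * beta * S_star, beta * S_star],
     [0.0, 0.0]]
V = [[eta + gamma + mu, 0.0],
     [-eta, alpha + mu]]

K = matmul_2x2(F, inverse_2x2(V))
R0_next_generation = spectral_radius_2x2(K)

# Closed form obtained from the Jacobian approach in Example #1.
R0_jacobian = (q * beta * S_star / (eta + gamma + mu)
               + eta * beta * S_star / ((eta + gamma + mu) * (alpha + mu)))

print(R0_next_generation, R0_jacobian)
```

For this particular decomposition the spectral radius of K agrees with the expression found via the Jacobian, although, as the text stresses, other decompositions need not give the same formula.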

It can also be shown that if $R_0 < 1$, the disease-free equilibrium $E_0$ is
locally asymptotically stable, and if $R_0 > 1$, it is unstable. The
disease-free equilibrium $E_0$ is locally asymptotically stable if all the
eigenvalues of the matrix $F - V$ have negative real part, i.e. if the spectral
bound of the matrix $F - V$ is negative, where the spectral bound is defined
below:

Definition 8.18.3 The spectral bound of a matrix $A$ is given by the maximum real
part of all its eigenvalues:

\[
m(A) = \sup \{ \operatorname{Re} \lambda : \lambda \in \sigma(A) \}.
\]

The relationship between the spectral bound of the linearized matrix $F - V$ and
the spectral radius of the next-generation matrix $F V^{-1}$ is given in the
following theorem:

Theorem 8.18.2 The following equivalences hold:

\[
\begin{aligned}
m(F - V) < 0 &\iff \rho\left( F V^{-1} \right) < 1, \\
m(F - V) > 0 &\iff \rho\left( F V^{-1} \right) > 1.
\end{aligned}
\]
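The two equivalences can be illustrated numerically. The following sketch is a check on one example, not a proof: it reuses the matrices of the SIRS model with carriers (with C and I assumed to be the infected compartments), sweeps the transmission rate, and compares the sign of the spectral bound of F - V with the position of the spectral radius of the next-generation matrix relative to one. The helper functions are restricted to 2x2 matrices and the parameter names and values are hypothetical.

```python
import cmath


def eigenvalues_2x2(M):
    """Both eigenvalues of a 2x2 matrix, as complex numbers."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def spectral_bound(M):
    """Maximum real part of the eigenvalues."""
    return max(lam.real for lam in eigenvalues_2x2(M))


def spectral_radius(M):
    """Maximum modulus of the eigenvalues."""
    return max(abs(lam) for lam in eigenvalues_2x2(M))


def next_generation(F, V):
    """Return F V^{-1} for 2x2 matrices F and V."""
    (a, b), (c, d) = V
    det = a * d - b * c
    V_inv = [[d / det, -b / det], [-c / det, a / det]]
    return [[sum(F[i][k] * V_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def carrier_matrices(beta, S_star=10.0, q=0.5, eta=0.2, gamma=0.1,
                     alpha=0.3, mu=0.1):
    """F and V for the carrier model; default values are hypothetical."""
    F = [[q * beta * S_star, beta * S_star], [0.0, 0.0]]
    V = [[eta + gamma + mu, 0.0], [-eta, alpha + mu]]
    return F, V


results = []
for beta in (0.005, 0.01, 0.03, 0.05, 0.1):
    F, V = carrier_matrices(beta)
    F_minus_V = [[F[i][j] - V[i][j] for j in range(2)] for i in range(2)]
    results.append((beta, spectral_bound(F_minus_V),
                    spectral_radius(next_generation(F, V))))

for beta, bound, radius in results:
    print(f"beta={beta}: m(F-V)={bound:+.4f}, rho(FV^-1)={radius:.4f}")
```

In every row of the sweep the spectral bound is negative exactly when the spectral radius is below one.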

Remark 8.18.2 Since the matrices $F$ and $V$ may be different (depending on our
interpretation of the disease processes in the model), the next-generation
matrix is not unique. Different approaches may therefore lead to different
expressions for the basic reproduction number $R_0$. In fact, the expression
found for $R_0$ with the Jacobian approach is often different from the
expression obtained via the next-generation matrix method described above. The
real advantage of the next-generation matrix method is that it can always result
in an expression for $R_0$, while the Jacobian approach may sometimes fail.
However, the form of the expression for $R_0$ determined by the Jacobian
approach is often easier to interpret.

8.18.4 Example: SEIT Model with Treatment and Relapse


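Consider the SEIT model with treatment and relapse that we previously outlined,
with the governing system of equations:

\[
\begin{aligned}
S'(t) &= \Lambda - \frac{\beta_1 S I}{N} - \mu S, \\
E'(t) &= \frac{\beta_1 S I}{N} + \frac{\beta_2 T I}{N} - (\mu + \kappa + r_1) E + p r_2 I, \\
I'(t) &= \kappa E - (r_2 + \mu) I, \\
T'(t) &= r_1 E + q r_2 I - \frac{\beta_2 T I}{N} - \mu T.
\end{aligned}
\]

If we take $E$ and $I$ as the "infected" compartments, we can show that the
disease-free equilibrium is given by
$(S, E, I, T) = \left( \Lambda/\mu, 0, 0, 0 \right)$.
If we decide that the relapse term $p r_2 I$ should be considered "new
infections", then as $\mathcal{F}_i(x, y)$ is the rate of appearance of new
infections in compartment $i$, and $\mathcal{V}_i(x, y)$ has all the remaining
transitional terms (i.e. births, deaths, disease progression and recovery) from
compartment $i$, we will obtain the following:

\[
\mathcal{F} = \begin{pmatrix}
\dfrac{\beta_1 S I}{N} + \dfrac{\beta_2 T I}{N} + p r_2 I \\
0
\end{pmatrix}, \qquad
\mathcal{V} = \begin{pmatrix}
(\mu + \kappa + r_1) E \\
-\kappa E + (r_2 + \mu) I
\end{pmatrix}.
\]

Evaluating the derivatives of $\mathcal{F}$ and $\mathcal{V}$ at the
disease-free equilibrium point
$(S, E, I, T) = \left( \Lambda/\mu, 0, 0, 0 \right)$, where $S/N = 1$ and
$T = 0$, gives the following linearized matrices

\[
F = \begin{pmatrix} 0 & \beta_1 + p r_2 \\ 0 & 0 \end{pmatrix}, \qquad
V = \begin{pmatrix} \mu + \kappa + r_1 & 0 \\ -\kappa & r_2 + \mu \end{pmatrix}.
\]

Taking the inverse of the M-matrix $V$, we get

\[
V^{-1} = \begin{pmatrix}
\dfrac{1}{\mu + \kappa + r_1} & 0 \\
\dfrac{\kappa}{(\mu + \kappa + r_1)(\mu + r_2)} & \dfrac{1}{\mu + r_2}
\end{pmatrix}.
\]

As $V$ is an M-matrix, it follows that $V^{-1}$ should always have nonnegative
elements.

The next-generation matrix $K$ is given by

\[
K = F V^{-1} = \begin{pmatrix}
\dfrac{\kappa (\beta_1 + p r_2)}{(\mu + \kappa + r_1)(\mu + r_2)} &
\dfrac{\beta_1 + p r_2}{\mu + r_2} \\
0 & 0
\end{pmatrix}.
\]

The basic reproduction number $R_0$ is the largest positive eigenvalue of the
next-generation matrix $K$, which is

\[
R_0 = \frac{\kappa (\beta_1 + p r_2)}{(\mu + \kappa + r_1)(\mu + r_2)}.
\]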
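Remark 8.18.3 Note that if instead we had chosen the relapse term $p r_2 I$ to be
existing infections, it would have been included as part of $\mathcal{V}$ as
follows:

\[
\mathcal{F} = \begin{pmatrix}
\dfrac{\beta_1 S I}{N} + \dfrac{\beta_2 T I}{N} \\
0
\end{pmatrix}, \qquad
\mathcal{V} = \begin{pmatrix}
(\mu + \kappa + r_1) E - p r_2 I \\
-\kappa E + (r_2 + \mu) I
\end{pmatrix},
\]

which would have led to the following expression for the basic reproduction
number $R_0$:

\[
R_0 = \frac{\kappa \beta_1}{(\mu + \kappa + r_1)(\mu + r_2) - \kappa p r_2}.
\]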
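The two ways of treating the relapse term can be compared numerically. The sketch below is illustrative only: it builds the linearized 2x2 matrices for each choice, forms the next-generation matrix, and takes its spectral radius. The names beta1 (transmission from susceptibles), kappa (progression from exposed to infectious), r1 and r2 (treatment rates), p (relapse probability) and mu (natural death) are assumptions of this sketch, and the numerical values are hypothetical.

```python
import cmath


def spectral_radius_of_FVinv(F, V):
    """Spectral radius of F V^{-1} for 2x2 matrices F and V."""
    (a, b), (c, d) = V
    det_V = a * d - b * c
    V_inv = [[d / det_V, -b / det_V], [-c / det_V, a / det_V]]
    K = [[sum(F[i][k] * V_inv[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = K[0][0] + K[1][1]
    det_K = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    root = cmath.sqrt(trace * trace - 4 * det_K)
    return max(abs((trace + root) / 2), abs((trace - root) / 2))


def seit_r0(beta1, kappa, mu, r1, r2, p, relapse_is_new):
    """R0 for the SEIT model under either treatment of the relapse term."""
    if relapse_is_new:
        # Relapse p*r2*I counted among the new infections.
        F = [[0.0, beta1 + p * r2], [0.0, 0.0]]
        V = [[mu + kappa + r1, 0.0], [-kappa, r2 + mu]]
    else:
        # Relapse p*r2*I counted as a transition between infected compartments.
        F = [[0.0, beta1], [0.0, 0.0]]
        V = [[mu + kappa + r1, -p * r2], [-kappa, r2 + mu]]
    return spectral_radius_of_FVinv(F, V)


# Hypothetical parameter values, chosen only for illustration.
rates = dict(kappa=0.2, mu=0.1, r1=0.1, r2=0.3, p=0.4)

for beta1 in (0.5, 1.0):
    as_new = seit_r0(beta1, relapse_is_new=True, **rates)
    as_existing = seit_r0(beta1, relapse_is_new=False, **rates)
    print(f"beta1={beta1}: relapse as new={as_new:.4f}, "
          f"relapse as existing={as_existing:.4f}")
```

The two expressions give different numbers, but in both runs they fall on the same side of one, so either choice identifies the same stability threshold for the disease-free equilibrium.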

Remark 8.18.4 Another variation to the next-generation approach is that developed
by Castillo-Chavez, Feng and Huang$^4$. They split the equations according to
three groups: compartments of noninfected individuals, compartments of infected
but not infectious individuals, and compartments of infectious individuals. In
this approach, we do not need to decide what infections are new. It should be
noted that the implementation of this method will lead to the first version of
the basic reproduction number for the previous example:

\[
R_0 = \frac{\kappa (\beta_1 + p r_2)}{(\mu + \kappa + r_1)(\mu + r_2)}.
\]

$^4$ C. Castillo-Chavez, Z. Feng, and W. Huang, On the computation of R0 and its
role on global stability, in Mathematical approaches for emerging and reemerging
infectious diseases: An introduction (Minneapolis, MN, 1999), vol. 125 of IMA
Vol. Math. Appl., Springer, New York, 2002, pp. 229–250.
