Australian Mathematical Society Lecture Series

Modelling with
Differential and
Difference Equations

GLENN FULFORD
PETER FORRESTER
ARTHUR JONES

AUSTRALIAN MATHEMATICAL SOCIETY LECTURE SERIES

Editor-in-chief: Professor J.H. Loxton, School of Mathematics, Physics,
Computing and Electronics, Macquarie University, NSW 2109,
Australia

1 Introduction to Linear and Convex Programming, N. CAMERON
2 Manifolds and Mechanics, A. JONES, A. GRAY & R. HUTTON
3 Introduction to the Analysis of Metric Spaces, J. R. GILES
4 An Introduction to Mathematical Physiology and Biology, J. MAZUMDAR
5 2-Knots and their Groups, J. HILLMAN
6 The Mathematics of Projectiles in Sport, N. DE MESTRE
7 The Petersen Graph, D. A. HOLTON & J. SHEEHAN
8 Low Rank Representations and Graphs for Sporadic Groups,
C. PRAEGER & L. SOICHER
9 Algebraic Groups and Lie Groups, G. LEHRER (ed)

Modelling with Differential
and Difference Equations

GLENN FULFORD
Department of Mathematics,
University College ADFA, Canberra

PETER FORRESTER
Department of Mathematics,
Melbourne University

ARTHUR JONES
Department of Mathematics,
La Trobe University

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press

The Edinburgh Building, Cambridge CB2 2RU, UK

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521440691

© Cambridge University Press 1997

This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without
the written permission of Cambridge University Press.

First published 1997

Reprinted 2001

A catalogue record for this publication is available from the British Library

ISBN-13 978-0-521-44069-1 hardback

ISBN-10 0-521-44069-6 hardback

ISBN-13 978-0-521-44618-1 paperback

ISBN-10 0-521-44618-X paperback

Transferred to digital printing 2006

Contents

Preface
Introduction to the student
Part one: Simple Models in Mechanics
1 Newtonian mechanics
1.1 Mechanics before Newton
1.2 Kinematics and dynamics
1.3 Newton's laws
1.4 Gravity near the Earth
1.5 Units and dimensions
2 Kinematics on a line
2.1 Displacement and velocity
2.2 Acceleration
2.3 Derivatives as slopes
2.4 Differential equations and antiderivatives
3 Ropes and pulleys
3.1 Tension in the rope
3.2 Solving pulley problems
3.3 Further pulley systems
3.4 Symmetry
4 Friction
4.1 Coefficients of friction
4.2 Further applications
4.3 Why does the wheel work?
5 Differential equations: linearity and SHM
5.1 Guessing solutions
5.2 How many solutions?
5.3 Linearity
5.4 The SHM equation
6 Springs and oscillations
6.1 Force in a spring
6.2 A basic example
6.3 Further spring problems
Part two: Models with Difference Equations
7 Difference equations
7.1 Introductory example
7.2 Difference equations: basic ideas
7.3 Constant solutions and fixed points
7.4 Iteration and cobweb diagrams
8 Linear difference equations in finance and economics
8.1 Linearity
8.2 Interest and loan repayment
8.3 The cobweb model of supply and demand
8.4 National income: 'acceleration models'
9 Non-linear difference equations and population growth
9.1 Linear models for population growth
9.2 Restricted growth: non-linear models
9.3 A computer experiment
9.4 A coupled model of a measles epidemic
9.5 Linearizing non-linear equations
10 Models for population genetics
10.1 Some background genetics
10.2 Random mating with equal survival
10.3 Lethal recessives, selection and mutation
Part three: Models with Differential Equations
11 Continuous growth and decay models
11.1 First-order differential equations
11.2 Exponential growth
11.3 Restricted growth
11.4 Exponential decay
12 Modelling heat flow
12.1 Newton's model of heating and cooling
12.2 More physics in the model
12.3 Conduction and insulation
12.4 Insulating a pipe
13 Compartment models of mixing
13.1 A mixing problem
13.2 Modelling pollution in a lake
13.3 Modelling heat loss from a hot water tank
Part four: Further Mechanics
14 Motion in a fluid medium
14.1 Some basic fluid mechanics
14.2 Archimedes' Principle
14.3 Falling sphere with Stokes' resistance
14.4 Falling sphere with velocity-squared drag
15 Damped and forced oscillations
15.1 Constant-coefficient differential equations
15.2 Damped oscillations
15.3 Forced harmonic motion
16 Motion in a plane
16.1 Kinematics in a plane
16.2 Motion down an inclined plane
16.3 Projectiles
17 Motion on a circle
17.1 Kinematics on a circle
17.2 Uniform circular motion
17.3 The pendulum and linearization
Part five: Coupled Models
18 Models with linear interactions
18.1 Two-compartment mixing
18.2 Solving constant-coefficient equations
18.3 A model for detecting diabetes
18.4 Nutrient exchange in the placenta
19 Non-linear coupled models
19.1 Predator-prey interactions
19.2 Phase-plane analysis
19.3 Models of combat
19.4 Epidemics
References
Index

Preface

This book provides an introduction to modelling with both differential
and difference equations. Our approach to mathematical modelling is
to emphasize what is involved by looking at specific examples from
a variety of disciplines. From each discipline enough background is
provided to enable students to understand both the assumptions and the
predictions of the models. Exercises have been included at the end of
each section. They are intended to provide a balanced development of
some of the main skills used in mathematical modelling, and hence they
are an essential part of the book.
The main mathematical tools used in the book are differential and
difference equations. Differential equations have their origins in mechanics:
Newton's laws of motion lead to differential equations whose solutions
can be used to predict the position of a body at some later time.
Differential equations have been closely associated with the rise of physical
science in previous centuries and they are now being used as models for
real world problems in a variety of other disciplines. Difference equations
are the discrete analogues of differential equations. They have risen
to prominence in the last decade, during which it has become generally
known that solutions of even very simple difference equations can exhibit
complex chaotic behaviour.
To allow time for the development of other modelling skills besides
solving the equations arising from the models, we have selected only
models involving differential equations which are relatively easy to solve.
Although our treatment of differential equations is intended to be
self-contained, it is only fair to point out that our students were taking
concurrently a first course in mathematical methods (beginning with
the elements of differentiation and ending with some practice at solving
separable and linear constant-coefficient differential equations, towards
the end of the year). Some chapters of the book assume a knowledge
of linear equations, complex numbers, vector algebra, or the elements of
probability theory. The mathematical prerequisites are listed at the start
of each chapter.
This book grew out of notes prepared for a first-year course given at
La Trobe University for each of the last six years. The total time allotted
to the course was 65 hours, including lectures and practice sessions. Not
all chapters were covered in the same year and some choice is possible.
A longer course could be organized by covering all the material in the
book and including some computer work where relevant.
Each year we refined the material and its presentation, based on
our experience in teaching the course during the previous years. Some
curious incidental difficulties were faced each year by some students.
These included (a) correct use of minus signs in setting up equations;
(b) sketching diagrams to illustrate the choice of a particular coordinate
in a mechanics problem; (c) distinguishing between the parameters of
a problem and its unknowns. We have attempted to address these and
other difficulties in both the text and the exercises. We have also analysed
the steps involved in solving various types of problem, at least in the
early part of the book, and we find this helps students to present their
solutions to exercises clearly.
We wish to thank Sid Morris and Ed Smith for assistance with the
overall planning of the original course and Alan Andrew, Jeff Brooks,
Peter Stacey and John Strantzen for improvements in certain sections.
One of the authors (G.F.) also wishes to thank Colin Pask for his
encouragement. We also thank Dorothy Berridge and Annabelle Lippiatt
for assisting with the typing.

G.F., P.F., A.J.

1996
Introduction to the student

Scientists, engineers and economists, working on a wide variety of problems,
nowadays find it useful to set up mathematical models of the
systems which they are investigating. To do this, they give a simplified
description of the problem which allows equations to be set up and then
solved to make predictions.
The following sample of the problems treated in this book shows the
wide-ranging areas in which the modelling process is currently being
applied.
Mechanics. We model the tension in a rope passing over pulleys and
then use the model to explain how it is possible for a worker pulling on
a rope to raise a heavy load many times his own weight.
Genetics. In a genetics problem we describe a model which allows us to
predict how many generations it takes for a recessive gene resulting in
defective offspring to effectively disappear.
Thermal physics. By modelling the loss of heat through the insulation
surrounding a hot water pipe, we derive the paradoxical result that, in
certain circumstances, it is possible for more heat to be lost from the pipe
with insulation than without it.
Engineering. We study the phenomenon of resonance, which was responsible
for a famous bridge collapsing due only to wind fluctuations.
Fisheries management. We study a model of interacting populations of
certain types of fish and sharks and use it to explain how increasing the
amount of fishing can actually increase the fish population.
Some other areas in which models are constructed in this book are
finance, economics, population studies, and physiology.
Enough background is given in each of these subjects to make the
problems intelligible and to enable you to make suitable modifications to
the models when the problems are changed slightly. Carrying out these
modifications will help you to develop many of the skills which are used
in mathematical modelling.
Setting up a mathematical model of a problem involves the following
steps.
Identifying the quantities most relevant to the problem and then making
assumptions about the way in which the quantities are related. This
usually involves simplifying the original problem so as to emphasize the
features which are likely to be most important.
Introducing symbols to denote the various quantities, and then writing
the assumptions as mathematical equations.
Solving the equations and interpreting their solutions as statements about
the original problem.
Checking the results obtained to see whether they seem reasonable and,
if possible, whether they are in agreement with experimental data.

Figure: The modelling cycle
(a) Make simplifying assumptions about real-world problem.
(b) Write these assumptions as mathematical equations.
(c) Solve equations and interpret the solutions in terms of the real-world problem.
(d) Do the predictions of the model agree with experimental results?
If YES: use model to make further predictions, and for control, in the real world.

The modelling process is illustrated in the above figure, which brings
out its cyclic nature. The process may fail at stage (c) if the equations
are too complicated to be solved. One then returns to stage (a) of
the process and tries to simplify the modelling assumptions to produce
equations which are easier to solve. At stage (d), moreover, there may
be insufficient agreement between the actual experimental results and the
results predicted from the model. If this happens, one again returns to
stage (a) to see whether the assumptions can be made more realistic.
The process of returning to (a) may be repeated many times until a
satisfactory model is obtained.
The problems discussed in this book will illustrate various stages of
the modelling cycle. While stage (a) of the process is in some ways the
most creative, it is often also the most difficult, involving the intuition
and experience of specialists in the various areas. Hence, although we
describe lots of models in full detail, the examples we set for practice are
aimed more at the later stages of the process:
introducing mathematical symbols and writing the assumptions as equations;
solving the equations and interpreting their solutions;
checking to see if the answers seem reasonable.
The book is divided into parts, each of which corresponds to the areas
in which the problems arise and to the types of mathematical equations
to which they lead.
Thus in Part one we model some basic problems in mechanics. The
equations which arise in these models express the rate of change of one
quantity in terms of other quantities. They are known as differential
equations. Elementary mechanics provides a lot of interesting models in
which the differential equations can be solved easily.
In Part two we turn to some models in which the equations express
the value of a quantity at the end of the year, say, in terms of the values
of the quantity at the end of previous years. Such equations are known
as difference equations.
In Parts three, four and five we consider some models from a variety
of areas which lead to progressively more advanced types of differential
equations.
By studying the material in this book you will appreciate the extent
to which mathematical modelling is currently being used in the various
sciences and why mathematics is so useful in helping to solve scientific
problems. By working through the exercises, moreover, you will develop
many of the basic skills used by scientists engaged in mathematical
modelling. In addition, the book provides practice and consolidation
of the parts of mathematics associated with differential and difference
equations. Stimulated by their potential applications, these topics now
form one of the most flourishing areas of mathematical research.
Part one
Simple Models in Mechanics
1
Newtonian mechanics

The aim of mechanics is to explain and predict the motion of bodies.
Early in the history of mankind the motion of celestial objects (sun,
moon, stars, planets, comets) became a source of curiosity and wonder.
At the terrestrial level, questions arising from the motion of falling bodies
and of projectiles, whether stones or arrows, attracted the interest
of some of the greatest thinkers of antiquity.
The system of mechanics devised by Newton in the seventeenth century
made it possible to explain, for the first time, motion of both celestial
and terrestrial objects with the one set of postulates, or laws. Newtonian
mechanics proved to be one of the most successful mathematical models
ever devised and it showed conclusively the value of mathematics in
understanding nature. For these reasons, its advent is generally regarded
as one of the turning points in the history of human thought.
This chapter introduces Newton's laws and sets them against the
background of the set of postulates for mechanics due to the ancient
Greek philosopher Aristotle. It also refers to the results discovered by
Galileo and Kepler, which showed the inadequacy of the model proposed
by Aristotle and thereby set the stage for Newton.
The outstanding success of Newtonian mechanics should not be allowed
to blind us to the fact that it is only a model of what really
happens. In the twentieth century it has been found inadequate to explain
motion at the subatomic level and at speeds close to that of light, which
occur in astronomy. More useful mathematical models at these two
extremes are provided by quantum mechanics and relativity respectively.

1.1 Mechanics before Newton

The first attempt to construct a mathematical model for the motion of
bodies was made by ARISTOTLE (384-322 BC). During the Middle
Ages, European scholars regarded him as the authority on all scientific
matters, including mechanics. By contrasting the assumptions made by
Aristotle with the laws of motion given by Newton, we can gain a better
appreciation of Newton's achievement.
The mechanics of Aristotle, like that of Newton, involves the idea of a
force as an explanation of why bodies move. The basic idea of a force is
familiar from everyday experience. If we want to move an object, be
it a book, a chair or a wardrobe, we must apply a force to the object.
Everyday experience also suggests, in a rudimentary way, how we might
compare forces. Thus, most of us would agree that two people pushing
together can exert, on average, twice the force exerted by one person
pushing alone.
Relationships between force and motion are stated explicitly in the
first two of the following assumptions introduced by Aristotle:

1. Force is necessary to maintain a body in motion and once the force
is removed the body will come to rest.
2. The force acting on a body is proportional to the velocity that it
produces.
3. With equal bulk, heavier bodies fall faster than lighter ones.

Aristotle claimed even more than is stated in the third of these assumptions,
namely, the speed at which bodies fall is proportional to their weight.
Thus if a ball is twice as heavy as another one, it should reach the ground
in half the time (when released simultaneously from the same height).
Although these assumptions did not go unchallenged during the Middle
Ages, the decisive break with Aristotle's ideas came during the
Renaissance with the work of GALILEO (1564-1642). Part of the folklore
surrounding Galileo concerns his dropping unequal weights from
the leaning tower of Pisa to disprove Aristotle's claim about the relative
speeds at which bodies fall. Less well known but equally deserving of
mention is an interesting 'thought experiment' which Galileo suggested
to show that this claim of Aristotle led to absurdity. This is explained
later in the exercises.
To obtain further details about the motion of falling bodies, Galileo
rolled a brass ball down a wooden beam. This had the effect of slowing the
falling motion of the ball, making it possible to record, with reasonable
accuracy, the times at which the ball passed various points marked on the
beam. Whereas Aristotle's assumptions had referred to velocity, Galileo
expressed the result of this experiment in terms of acceleration (or rate of
increase of velocity). His results were consistent with the postulate that
bodies fall with uniform acceleration.
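
One consequence of uniform acceleration can be checked numerically. The sketch below assumes a ball starting from rest with constant acceleration a, so that the distance covered after time t is a t^2 / 2 (a standard result which is not derived in this section). The value of a and the choice of times are arbitrary illustrative assumptions, not Galileo's data.

```python
def distance_from_rest(a, t):
    """Distance travelled after time t, starting from rest with constant acceleration a."""
    return 0.5 * a * t**2


a = 2.0  # illustrative acceleration in arbitrary units (an assumption)
times = [0, 1, 2, 3, 4]
positions = [distance_from_rest(a, t) for t in times]

# Distances covered in successive equal time intervals.
gaps = [later - earlier for earlier, later in zip(positions, positions[1:])]
print(positions)
print(gaps)
```

The distances covered in successive equal intervals of time come out in the ratio 1 : 3 : 5 : 7, the pattern of odd numbers which is characteristic of uniform acceleration from rest.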
In place of Aristotle's claim that force was necessary to maintain a
body in motion, Galileo further postulated that
in the absence of forces a body need not be
at rest but can proceed with uniform speed.
A body moving in the absence of forces is said to be kept going by its
own inertia. Galileo's views on such inertial motion, however, were, by
later standards, rather timid. To account for the fact that objects do not
fly off the Earth into outer space, Galileo allowed the inertial motion to
take place along circles. He did point out, however, that motion along a
big enough circle would be indistinguishable from motion along a line.
Meanwhile Johannes KEPLER (1571-1630) had been investigating the
orbits of the planets. He believed he could explain the existence of six
planets (Neptune, Uranus and Pluto were then undiscovered) and the sizes
of their orbits from the geometry of the regular solids. Although none of
this is taken seriously today, Kepler devoted his life to elaborating these
ideas. Almost as a by-product he discovered the three laws of planetary
motion for which he is now famous.
1. The planetary orbits are ellipses with the sun as a focus.
2. The area swept out by a ray from the sun to the planet is proportional
to the time taken.
3. The squares of the periods of the planets are proportional to the
cubes of the major axes of their orbits.
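
The third of these laws is easy to test against figures. The sketch below uses approximate, commonly quoted modern values for the six planets known to Kepler; these numbers are supplied by us as an assumption and do not come from the text. Semi-major axes are used in place of major axes, which does not affect the proportionality since one is simply twice the other.

```python
# Approximate semi-major axes (astronomical units) and periods (years).
# Commonly quoted modern values, supplied here as an assumption.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

# Third law: period**2 / axis**3 should be the same for every planet.
ratios = {name: period**2 / axis**3 for name, (axis, period) in planets.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} T^2 / a^3 = {ratio:.3f}")
```

In these units the ratio is exactly 1 for the Earth, and for the other five planets it should differ from 1 only by a fraction of a per cent.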
Kepler's laws were based on the observations of the planets' motion by
the astronomer Tycho BRAHE (1546-1601). The basis for Kepler's laws
was thus empirical. There was at that time no mathematical model from
which these laws could have been deduced.
Anyone interested in learning more about the work of Aristotle, Galileo
or Kepler should consult the book by Cohen (1987), the relevant chapters
being 2, 5 and 6.

Exercises 1.1
1. Two balls have the same size but one is made of material which makes it five
times as heavy as the other. According to Aristotle's model, what is the ratio of
the speeds with which they fall?
2. Galileo refuted the idea that, in the absence of air-resistance, heavier bodies
fall faster than light ones by the following 'thought experiment': Suppose it were
true that

'heavier bodies fall faster in a vacuum'.

(a) Of the two bodies shown below, which would fall faster, given the above
assumption?

[Figure: two separate bodies, one of 2 kg and one of 1 kg.]

(b) In view of (a), what should be the effect on the speed of the 2 kg body
if you placed the 1 kg body beneath it, as shown below?

[Figure: the 2 kg body with the 1 kg body placed beneath it.]

1.2 Kinematics and dynamics

Two distinct aspects of mechanics, called kinematics and dynamics, are
clearly discernible in the work of Galileo and Kepler.

Kinematics
Kinematics is concerned with the geometry of motion. What paths
do moving objects follow? What coordinate systems are best suited
to describing their paths? What are their velocities? What are their
accelerations? In other words, kinematics is concerned with the most
obvious aspects of motion: the things we can see with our eyes.
Kepler's laws of planetary motion belong to kinematics since they give
a geometric description of the path traced out by a planet. Galileo's law
that bodies fall with uniform acceleration also belongs to kinematics.
The idea of a particle is basic for the kinematic model we shall be using
in this book. A particle is defined as a body which has zero dimensions
and hence may be regarded as occupying a single point. Clearly such a
body cannot exist in the real world, but the idea is none the less useful
in modelling, say, the solar system since the size of a planet is very small
compared with the interplanetary distances. On the other hand, once a
rocket ship gets close to a planet it is no longer appropriate to regard
the planet as a single point, although it may well be still appropriate to
think of the rocket ship in this way. In this book the primary concern
is with situations in which it is appropriate to model the bodies by
particles.
The concepts of velocity and acceleration will be explained more fully
later in the book. For the present, however, it is sufficient to recall that
velocity measures how fast a body is moving, while acceleration measures
how quickly the velocity is changing. Velocity and acceleration must be
measured relative to a frame of reference. For example, Parliament House
is at rest relative to the Earth, but it is moving with a large velocity relative
to the sun. Hence its velocity depends on whether the Earth or the sun
is used as the frame of reference.
Strictly speaking, velocity and acceleration should be considered as
vector quantities since they have direction as well as magnitude. Initially,
however, we shall simplify matters by supposing that all motion takes
place along a line; hence the number of possible directions is reduced to
two (up and down, left and right, etc.).

Dynamics
Dynamics, in contrast to kinematics, is concerned with what causes bodies
to move in certain ways. It involves concepts like mass and force, which
are so basic that it is difficult to define them in terms of any simpler ideas.
Hence, instead of attempting to define these concepts in a general way,
we point to some of the everyday situations from which these concepts
have arisen.
As to the concept of force, it has already been mentioned in the
previous section that we exert a force when we push against objects, with
a view to moving them. Although we cannot see a force directly, we are
aware that a force is acting when we push against a table because we feel
the tension in our limbs. Galileo is generally regarded as the founder of
modern dynamics because he stated what happens when there is no force
acting on the body: it moves with uniform velocity (whereas Aristotle
had claimed it would necessarily come to rest).
Like velocity and acceleration, force is a vector quantity, which has
direction as well as magnitude. For example, the result of kicking a
football depends not only on how big the kick is, but also on the direction
in which the kick is aimed. It will be assumed, moreover, that if two
forces act at the same point of a body, then they will have exactly the
same effect as their sum, calculated in accordance with the rules of vector
algebra, as illustrated in Figure 1.2.1.

Fig. 1.2.1. Forces combine by vector addition.

Fig. 1.2.2. Attaching signs to forces. (The figure shows two arrows along a
line with the positive direction marked, labelled '3 units of force' and
'-2 units of force'.)

Initially, all the action will be taking place along a fixed line and so
there will be only two directions to worry about. One of the directions
will be selected as the positive direction and the opposite direction will
be called the negative direction. Our convention for representing the
direction of a force acting along the line is illustrated in Figure 1.2.2.
On the left of the figure is shown a force of magnitude 3 units, acting in
the positive direction. On the right, however, the force has magnitude 2
units but acts in the negative direction. (Our convention is thus different
from the usual one, in which both arrows would be labelled with positive
numbers.)
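
The two ideas above, forces combining by vector addition and signed numbers for forces along a line, can be sketched in a few lines of Python. The function names are hypothetical, and treating the two forces of 3 and -2 units as acting on the same body is an assumption made only for illustration.

```python
import math


def net_force_on_line(signed_forces):
    """Sum forces acting along one line; the sign of each number encodes its direction."""
    return sum(signed_forces)


def add_forces(f1, f2):
    """Add two forces in a plane, given as (x, y) components, by vector addition."""
    return (f1[0] + f2[0], f1[1] + f2[1])


# 3 units in the positive direction and 2 units in the negative direction.
net = net_force_on_line([3, -2])

# Two perpendicular forces in a plane (illustrative values).
resultant = add_forces((3.0, 0.0), (0.0, 4.0))
magnitude = math.hypot(resultant[0], resultant[1])
print(net, resultant, magnitude)
```

With the sign convention, the net force along the line is found by ordinary addition of signed numbers, here 3 + (-2) = 1 unit in the positive direction. The perpendicular forces of 3 and 4 units combine into a single force of magnitude 5 units.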
The other dynamical concept, that of mass, arises from our everyday
experiences in which a given force does not always produce the same
effect. For example, kicking a football full of water produces a markedly
different result from kicking the same ball full of air. The motion is
smaller in the first case because of the increased mass of the ball.
Newton himself described mass as the 'quantity of matter' in a body.
Implicit in this description is the idea that putting two identical bodies
together gives double the mass of each of the individual bodies. This
idea opens up the possibility of constructing a graded set of standard
masses against which the mass of any body (of moderate size!) can be
compared on a balance, as in Figure 1.2.3. (Strictly speaking it is their
weights which are being compared, but at a given point on the Earth's
surface, these are proportional to their masses.)

Fig. 1.2.3. Scales are used to determine when masses are equal.

The above discussion is intended to provide a general idea of the
meaning of the concepts of force and mass, but it stops short of giving
precise definitions. The status of the concepts of mass and force in
mechanics is analogous to that of a point and a line in Euclidean
geometry. These concepts are so basic for geometry that it is not possible
to define them in terms of any ideas which are more basic. Instead,
there are the axioms or postulates of geometry, which tell us all we need
to know in order to make logical deductions about points and lines.
In mechanics, the role of the axioms is taken over by Newton's laws,
discussed in the next section.

Exercises 1.2

1. How many standard masses ('weights') do you need to be able to find the
mass, correct to the nearest kilogram, of any mass less than 10 kg?

1.3 Newton's laws

Isaac NEWTON (1642-1727) was born in the year in which Galileo died.
His tremendous contribution to mechanics (and hence to the whole of
physical science) was to discover a set of postulates or laws for dynamics
which would explain the kinematical discoveries of his predecessors. In
this way Newton achieved a complete synthesis of the laws of falling
bodies near the Earth's surface (discovered by Galileo) and the laws of
planetary motion (discovered by Kepler).
In substance, Newton's laws of motion state that when velocities and
accelerations are measured in a suitable frame of reference:

1. Every body continues in a state of rest or uniform motion in a
straight line unless compelled to change its state by the action of
a force.
2. The force produces an acceleration in the direction in which it acts.
The magnitude of the force is jointly proportional to the acceleration
and the mass of the body.
3. To every action there is an equal and opposite reaction; that is, the
forces which two bodies exert on each other are always equal and
opposite.

Newton assumed the existence of an ideal frame of reference in which his
laws would be true. The Earth itself provides an approximation to this
ideal frame of reference, which is often useful for the study of motion
near its surface. A better approximation is obtained by regarding the sun
or stars as fixed. These latter frames of reference would be more useful
for the study of the motion of the planets or of space vehicles. A frame
of reference in which Newton's laws are true is called an inertial frame.
We now make some comments on the significance of the above laws.
Newton's first law of motion is sometimes said to express the principle
of inertia. Newton is here in agreement with Galileo, as opposed to
Aristotle, that in the absence of a force, bodies maintain uniform speed.
Whereas Galileo believed the motion could be in circles, however, Newton
asserts that it must take place along a line.
In the second law, Newton asserts that the force depends on the acceleration
rather than on the velocity, as Aristotle had assumed.
Nowadays, of course, anyone who has travelled in a jet aircraft will have
had direct confirmation of the correctness of Newton's view. You feel the
seat thrusting in your back when the jet accelerates from rest to take-off
speed, not when it is cruising steadily at near sonic speeds above the
clouds.
Given a suitable choice of units, we may write Newton's second law
as an equation:

{force} = {mass} × {acceleration}

or, more simply, as

F = ma.
As will be seen later, this is really a differential equation. Hence our
ability to solve problems in Newtonian mechanics is inextricably linked
to our ability to solve differential equations.
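
To see in what sense F = ma is a differential equation, note that acceleration is the rate of change of velocity, so for motion along a line the law reads m dv/dt = F. The sketch below takes a constant force and steps this equation forward in time by the simple Euler method, a numerical technique which is not part of this chapter. The values of the force, mass and time are illustrative assumptions. The result is compared with the exact solution v = (F/m) t for a body starting from rest.

```python
def euler_velocity(force, mass, t_end, steps):
    """Approximate v(t_end) for m * dv/dt = force, starting from rest, by Euler's method."""
    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        acceleration = force / mass  # Newton's second law, a = F / m
        v += acceleration * dt       # velocity changes at the rate a
    return v


force, mass, t_end = 6.0, 2.0, 5.0  # illustrative values (assumed)
v_numeric = euler_velocity(force, mass, t_end, steps=1000)
v_exact = (force / mass) * t_end
print(v_numeric, v_exact)
```

For a constant force the two answers agree apart from rounding error, because the acceleration never changes. When the force varies with time, position or velocity, the stepwise answer is only an approximation, and finding the exact solution means solving the differential equation.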
Although we have stated Newton's second law in its most ready-to-use
form, there is a slightly more general version which equates force with
the time rate of change of momentum. This more general version can be
applied to a body whose mass changes with time, for example a body
consisting of rocket plus fuel. We shall not use this form of Newton's
second law in this book. As to Newton's third law, a simple illustration is
provided by the propulsion of rocket ships. The rocket ship exerts a force
on the material which it expels backwards. This material exerts an equal
but opposite reaction on the rocket ship, which causes it to accelerate
forwards.
In later chapters, detailed models will be given for the forces which
act in various situations, such as those where friction or air-resistance
is present. There is, however, one type of force which acts, according
to Newton, on every body in the universe. Hence it deserves to be
mentioned here, alongside his laws of motion. Newton's law of universal
gravitation states:
Each pair of bodies in the universe exerts a force of mutual attraction
of magnitude
$$\frac{G m_1 m_2}{r^2}$$
where $m_1$ and $m_2$ are the masses of the bodies, $r$ is the distance
between them, and G is a constant (independent of the bodies).
On the basis of the above laws, Newton was able to derive Kepler's
laws of planetary motion, one of the first big successes of Newtonian
mechanics.
A very readable account of Newton's work in celestial mechanics is
given in Koestler (1958), pages 504-517. A popular biography of Newton
is Andrade (1979). A critical discussion of Newton's own statement of
his laws of motion is contained in Westfall (1971), chapter 8.

Exercises 1.3
1. To each statement below attach the most appropriate name from the list:
Aristotle, Galileo, Newton.
(a) A force is necessary to maintain a body in motion.
(b) The negation of (a).
(c) Bodies fall to earth with uniform acceleration.
(d) Velocity is proportional to the force acting on the body.
(e) Acceleration is proportional to the force acting on the body.
(f) In the absence of forces, bodies move along straight lines.
(g) In the absence of forces, bodies move with uniform speed but can move
around circles.
[Figure: a particle of mass m above the surface of the Earth, with its weight mg acting on it.]

Fig. 1.4.1. Weight is the force mg due to gravity which acts on a particle of
mass m.

1.4 Gravity near the Earth

According to Galileo's model, falling bodies in the absence of air-
resistance have a uniform acceleration towards the Earth, which is the
same for all bodies and is denoted by g. Some aspects of this model will
now be discussed from the standpoint of Newton's laws.
First, Newton's second law of motion shows that the force acting on
a particle of mass m which has an acceleration g towards the Earth is
mg vertically downwards, as shown in Figure 1.4.1. This force, which is
due to the gravitational attraction between the particle and the Earth, is
called the weight of the particle (as distinct from its mass).
Second, Newton's law of universal gravitation, stated in the previous
section, suggests that the gravitational attraction between the particle
and the Earth diminishes with the height of the particle, and prompts
us to discuss the range of validity of Galileo's model. Before this can
be done, however, there is a mathematical difficulty to overcome in that
Newton's law of universal gravitation applies, strictly speaking, to a pair
of particles. For motion near the Earth's surface, it is clearly no longer
appropriate to regard the Earth as a single point in space.
This difficulty was overcome by Newton himself. By using the integral
calculus, which he invented for solving such problems, he was able to
show that if a particle is attracted by the gravitational pull of a sphere
of homogeneous material it experiences the same force as if the entire
mass of the sphere is concentrated at its centre. Thus, the Earth being
taken as a homogeneous sphere of mass M, the force of gravity pulling
a particle of mass m towards the centre of the Earth is

$$\frac{GMm}{r^2}$$

where $r$ is the distance of the particle from the centre of the Earth, as
shown in Figure 1.4.2.

Fig. 1.4.2. Gravitational attraction of a solid sphere equals that of a single point
of the same mass.
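
As a rough numerical illustration of the formula GMm/r², the sketch below evaluates the force on a 1 kg particle at the Earth's surface. The values of G and of the Earth's mass M are not given in the text and are assumed here from standard tables; the radius is the "about 6400 km" quoted in this section, and the function name `gravity_force` is purely illustrative.

```python
# Assumed constants (not given in the text): G and the Earth's mass M.
G = 6.674e-11        # gravitational constant, N m^2 / kg^2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.4e6      # radius of the Earth, m (the text's "about 6400 km")


def gravity_force(m, r):
    """Force (newtons) on a particle of mass m (kg) at distance r (m)
    from the centre of the Earth, using GMm / r^2."""
    return G * M_EARTH * m / r**2


# Force per kilogram at the surface; this should be close to g.
g_estimate = gravity_force(1.0, R_EARTH)
print(f"Force on 1 kg at the surface: {g_estimate:.2f} N")
```

With these assumed values the force per kilogram comes out within about 0.1 of the 9.8 metres/s² used for g, which is as close as the rounded radius allows.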
It follows from this formula that, as the altitude of a particle increases,
the gravitational attraction of the Earth on the particle decreases. For
everyday purposes, however, the variation of gravity with height can be
ignored. The radius of the Earth is so large (about 6400 km) that changes
in height lead to very small proportional changes in the distance of the
particle from the centre of the Earth and hence to correspondingly small
changes in gravity. This is illustrated by the following example.

Example 1. When a jet airliner ascends from take-off to an altitude of 10 km, by
how much does the gravitational attraction acting on it decrease?
Solution. Let $r_0$ and $r_1$ be the distances of the airliner from the centre of the Earth
at take-off and at an altitude of 10 km respectively. The ratio of the gravitational
force when the distance is $r_1$ to that when it is $r_0$ is
$$\left(\frac{GMm}{r_1^2}\right) \Big/ \left(\frac{GMm}{r_0^2}\right) = \left(\frac{r_0}{r_1}\right)^2 = \left(\frac{6400}{6410}\right)^2 = 0.99688.$$
Hence the fraction by which gravity decreases is 1 - 0.99688 = 0.00312 and hence
the percentage decrease is about 0.31 %.
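
The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the function name `gravity_ratio` is hypothetical, and the radius of 6400 km is the rounded value used in the text.

```python
R_EARTH_KM = 6400.0  # radius of the Earth used in the text, km


def gravity_ratio(altitude_km, radius_km=R_EARTH_KM):
    """Ratio of the gravitational force at the given altitude to that
    at the surface, from (r0 / r1)^2 with r1 = r0 + altitude."""
    r0 = radius_km
    r1 = radius_km + altitude_km
    return (r0 / r1) ** 2


ratio = gravity_ratio(10.0)
percent_decrease = 100.0 * (1.0 - ratio)
print(f"ratio = {ratio:.5f}, decrease = {percent_decrease:.2f} %")
```

Because G, M and m cancel in the ratio, no physical constants beyond the radius are needed.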

The problem of determining the value of g accurately has an interesting
history and it impinges on several topics studied later in this book. The
earliest method of determining g, using direct observations of falling
bodies, had limited accuracy because of the speed with which bodies
fall. In 1664, however, Christiaan HUYGENS accurately determined the
value of g from pendulum experiments (using the formula which he had
discovered for the period of oscillation of a pendulum). Within a few
years it was discovered that g had a smaller value at the equator than
it had in Paris. Newton's explanation for this was that the Earth is not
a perfect sphere, but is flattened at the poles. The larger distance of a
particle from the centre of the Earth when the particle is on the equator
accounts for the smaller gravitational attraction there.
Some values of g obtained experimentally at various latitudes are given
in Cohen (1987), page 175. They range from about 9.78 metres/s² at
the equator to about 9.83 metres/s² at the poles. Unless stated otherwise
we shall ignore the variation in the value of g and use the value of 9.8
metres/s². This will give a satisfactory model for the gravitational force
on a particle provided it stays near the surface of the Earth and the
duration of its motion is not too large.

Exercises 1.4
1. At what altitude is the weight of a given sample of material 1 % less than
when it is on the ground, directly beneath?
2. What is the percentage increase in the weight of a given sample of material
when it is taken from the equator to a pole?

1.5 Units and dimensions

The units for the various quantities used in mechanics are shown in
Table 1.5.1 for the international system (SI). Since velocity is obtained
by dividing length by time, its units are derived from those for length
and time in the obvious way. Similar remarks apply to the unit for
acceleration, which comes from dividing velocity by time. The unit for
force is chosen to make the constant of proportionality in Newton's
second law of motion equal to unity.
How big is a force of 1 newton? To answer this note that, by Newton's
second law, if the mass m of a body is measured in kilograms and the
acceleration g due to gravity is measured in metres/s² then its weight is
mg newtons. In particular if the mass of the body is about 0.1 kg then its
weight is approximately 1 newton. This seems rather appropriate in view
of the story about Newton and the falling apple, the mass of an apple of
moderate size being about 0.1 kg !
Each quantity in mechanics has associated with it an expression in-
volving the letters M, T, L which is called the dimensions of the quantity.
The dimensions of some common types of quantities are listed in Table
1.5.2.

Table 1.5.1. SI units for various quantities in mechanics.

Quantity        Name of unit          Abbreviation

mass            kilogram              kg
time            second                s
length          metre                 m
velocity        metres per second     m/s or m s⁻¹
acceleration    metres per second²    m/s² or m s⁻²
force           newton                N
area            square metres         m²
volume          cubic metres          m³

One of the main uses of dimensions is to check formulae involving
physical quantities where the formulae are valid independently of any
choice of a system of units. The check consists in seeing whether both
sides of the equation have the same dimensions. The rules which enable
us to get the relevant dimensions needed to apply the check are as
follows:

1. Only quantities with the same dimension may be added or subtracted.
The result of adding or subtracting these quantities gives a quantity
of the same dimensions.
2. The dimension of a product or quotient of two quantities is the
product or quotient respectively of their dimensions.
3. All numerical factors, such as 1/2 and $\pi$, do not change the
dimensions of a quantity. We therefore say that these factors are
dimensionless.

These rules can be written in symbols. Let $a$ and $b$ denote two
quantities and let the symbol `[ ]' stand for the operation of taking
dimensions of a quantity. The rules are:

$$[a \pm b] = [a] = [b], \qquad [ab] = [a][b], \qquad [a/b] = [a]/[b].$$

There is also a rule for the $m$th power of a quantity, where $m$ is a fraction:

$$[a^m] = [a]^m.$$

Table 1.5.2. Dimensions of various quantities in mechanics.

Type of quantity    Dimensions

mass                M
time                T
length              L
velocity            L T⁻¹
acceleration        L T⁻²
force               M L T⁻²
area                L²
volume              L³

Note that, from rule 3, $[\pi] = 1$, for example.

Example 1. The formula for the period $\tau$ of small oscillations of a simple pendulum
is
$$\tau = 2\pi \sqrt{\ell/g} \qquad (1)$$
where $\ell$ is the length of the pendulum and g is the acceleration due to gravity.
Check that this formula is dimensionally correct.
Solution.
$$[\mathrm{LHS}] = [\tau] = T$$
$$[\mathrm{RHS}] = [2\pi \sqrt{\ell/g}] = [\ell/g]^{1/2} = \left(L/(LT^{-2})\right)^{1/2} = T.$$
Thus each side of the formula has the same dimensions, as required. Note that the
multiplicative numerical factor $2\pi$ in (1) is dimensionless.
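
The dimension check in this example can also be carried out mechanically. The sketch below is one possible way of doing so, not a method from the text: it assumes a dimension is stored as a tuple of exponents of (M, L, T), and the helper names `multiply`, `divide` and `power` are hypothetical.

```python
from fractions import Fraction

# A dimension is stored as a tuple of exponents of (M, L, T).
MASS = (Fraction(1), Fraction(0), Fraction(0))
LENGTH = (Fraction(0), Fraction(1), Fraction(0))
TIME = (Fraction(0), Fraction(0), Fraction(1))
DIMENSIONLESS = (Fraction(0), Fraction(0), Fraction(0))


def multiply(a, b):
    """Dimensions of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))


def divide(a, b):
    """Dimensions of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))


def power(a, m):
    """Dimensions of an m-th power: exponents are multiplied by m."""
    m = Fraction(m)
    return tuple(x * m for x in a)


ACCELERATION = divide(LENGTH, power(TIME, 2))   # L T^-2
FORCE = multiply(MASS, ACCELERATION)            # M L T^-2

# Pendulum formula: the factor 2*pi is dimensionless, so the right-hand
# side has the dimensions of (length / acceleration)^(1/2).
rhs = multiply(DIMENSIONLESS,
               power(divide(LENGTH, ACCELERATION), Fraction(1, 2)))
print("Pendulum formula dimensionally correct:", rhs == TIME)
```

Using exact fractions for the exponents avoids rounding problems when taking square roots of dimensions.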

Exercises 1.5
1. The kinetic energy of a particle of mass m which is moving with a velocity v
is defined to be
$$\tfrac{1}{2} m v^2.$$

What is the unit for kinetic energy when mass and velocity are measured in SI
units? What are the dimensions of kinetic energy?
2. Let F be a force, let $m_1$ and $m_2$ be masses and let $\ell$ be a length. Is the
following formula correct dimensionally?
$$F = \frac{m_1}{m_1 + m_2}\, \ell.$$
2
Kinematics on a line

This chapter contains a careful discussion of the ideas of velocity and
acceleration for a particle moving along a line. Dealing first with this
special case enables the use of vectors to be postponed. Familiarity is
assumed, however, with the elements of the differential calculus including
the idea of the derivative $\phi'$ of a function $\phi$, its geometrical interpretation
as the slope of the tangent to the graph of the function, and the basic
rules for differentiating functions.
It will be assumed that the displacement x of a particle moving along
a line is given as a function of the time t by
$$x = \phi(t)$$
for some twice differentiable function $\phi$. The discussion will then lead us
to define the velocity $\dot{x}$ of the particle at time t in terms of the derivative
of $\phi$ by
$$\dot{x} = \phi'(t).$$
The acceleration $\ddot{x}$ of the particle at time t will be defined similarly in
terms of the second derivative of $\phi$ by
$$\ddot{x} = \phi''(t).$$

The velocity and acceleration can then be calculated by using the rules
of differentiation.
Just as important for mechanics is the reverse problem : given the veloc-
ity or the acceleration as a function of the time, what is the displacement
at a given time? This problem will be solved by antidifferentiation and
will provide an introduction to the idea of a differential equation.
This chapter provides all that is needed on differential equations for
Chapters 3 and 4, while Chapter 5 introduces some harder types of
differential equations, which will be used in Chapter 6.

2.1 Displacement and velocity

The idea of uniform speed will be explained first. To illustrate this we
consider a car moving between two towns A and B. The car makes the
trip from A to B, as in Figure 2.1.1, taking 1/2 hour to do it. For a
precise measurement of the distance travelled it is necessary to idealize
the towns A and B as points and the car as a particle.
Its average speed for the trip is defined as
$$\frac{\text{distance travelled}}{\text{time taken}} = \frac{30}{1/2} = 60 \text{ km/hour}.$$
Along some stretch of the road, it may well happen that the average
speed between every pair of points is the same. The speed for this part
of the trip is then said to be uniform or constant. The speed could not be
uniform for the whole trip, however, because there is an initial period as
the car leaves A when it is picking up speed and a final period as the car
approaches B when it is slowing down.
While speed is always $\geq 0$, velocity may be negative to allow for
motion in the opposite direction. Problems of uniform velocity involve
only elementary arithmetic, whereas those of non-uniform velocity are
much more difficult and involve the use of calculus. Before studying such
problems, however, it is necessary to understand how plus and minus
signs are used to help describe the position of the particle.

[Figure: points A and B, 30 km apart.]
Fig. 2.1.1. A car travelling between two points.

Coordinates and displacement

On the line along which the particle is moving, a point O is chosen to act
as origin. Points on one side of O are assigned positive numbers which
measure their distances from O, in suitable units. Points on the opposite
side of O are then assigned negative numbers, as in Figure 2.1.2. The
absolute value of the number assigned to a point measures the distance
of the point from O. The numbers are called the coordinates of the points
on the line.

[Figure: a line marked with coordinates from -6 to 6.]
Fig. 2.1.2. Coordinates along a line.

[Figure: the origin O and a particle at coordinate x.]
Fig. 2.1.3. A particle with coordinate x.

The choice of the origin is at the discretion of the user. It does not
really matter which point is chosen, provided it is fixed in the inertial
frame of reference. It is often the case, however, that a suitable choice of
origin can simplify subsequent calculations.
Likewise, the choice of the side of the origin to label with positive num-
bers is entirely up to the user. In Figure 2.1.2, we have chosen to label
the points to the right of O with positive numbers, but we could have
equally well chosen to label the points to the left with them. The line
with coordinates attached is called a coordinate axis. The direction in
which the coordinates increase is called the positive direction of the co-
ordinate axis, while the reverse direction is called its negative direction.
The positive direction of the x-axis is also called the positive direction
for the coordinate x itself.
If a particle moves along the line, its coordinate will change as time
goes on. For this reason, a letter such as x is used to denote the coordinate
of the particle at an arbitrary time t. When showing the significance of
the coordinate x on a diagram, it is important to show the particle in a
`typical' or `general' position, as in Figure 2.1.3, rather than in a special
position such as the origin.
The coordinate x is often called the displacement of the particle from
the origin O. An alternative way of representing x which goes with this
terminology is as an arrow from the origin to the particle, as shown in
Figure 2.1.4.
Sometimes the coordinate x for the particle at time t is implicitly
specified by the problem, without explicit mention of the origin or the
coordinate axis. In such problems the choice of the origin and the
positive direction of the axis has been made already. The following
example illustrates how the origin of the coordinate axis and its positive
direction may be inferred from the description given in the problem.

[Figure: an arrow labelled x drawn from the origin to the particle.]
Fig. 2.1.4. The coordinate x measures the displacement of the particle from the
origin.

[Figure: a particle at height x above ground level.]
Fig. 2.1.5. Here the coordinate x measures the displacement of the particle above
ground level.

Example 1. A particle is moving in a vertical line. A coordinate x is defined for
the particle by the statement that x is the height of the particle above ground level
at time t. Where is the origin and which is the positive direction for the x-axis?
Solution. The displacement x of the particle above ground level at a typical time t
is shown in Figure 2.1.5.
The origin is at ground level since this is where the particle must be when x = 0.
The positive direction for the coordinate axis, moreover, is upwards since this is
the direction in which the particle moves if x increases.

It will be assumed that the displacement x of the particle may be
expressed as a function of the time t by
$$x = \phi(t)$$
where $\phi$ is a twice differentiable function mapping times into
displacements. The function $\phi$ is called the displacement-time function since it
relates the displacement to the time. It is not necessarily assumed that
x = 0 when t = 0 and negative values of t are not necessarily excluded.
Students often confuse the use of negative values of t with the science-
fiction feat of going backwards in time. Negative values of t, however,
simply indicate that we have agreed to label some instant as the origin
for time; any instant prior to that is then labelled with a negative t value.
For example, if we choose today as day 0, then tomorrow is day 1 and
yesterday was day -1.
Some books use the same symbol x for the displacement and for the
function $\phi$. This can lead to confusing notation such as `x = x(t)'. We
try to avoid such usage in this book, even though it means we have to
introduce extra symbols.
With the aid of the above ideas on displacement, we can now complete
our discussion of velocity.

Non-uniform velocity
In non-uniform motion, the velocity may change from one instant to
another. Hence there is a need to analyze the idea of velocity at an
instant or instantaneous velocity. At first sight, however, there appears to
be something paradoxical about this idea. An instant has zero duration,
hence during an instant the particle cannot move. How then can it have
an instantaneous velocity?
We resolve this paradox by looking at smaller and smaller time intervals
surrounding the given instant and taking the limit of average velocities
over these intervals. Let the time change from $t$ to $t + \delta t$ and let the
displacement change from $x$ to $x + \delta x$. At the start of the change
$$x = \phi(t)$$
and at the end of the change
$$x + \delta x = \phi(t + \delta t).$$
Hence by subtraction
$$\delta x = \phi(t + \delta t) - \phi(t).$$
This gives the change in displacement as the time changes from $t$ to $t + \delta t$
and hence the average velocity of the particle during this period is
$$\frac{\delta x}{\delta t} = \frac{\phi(t + \delta t) - \phi(t)}{\delta t}.$$
Since $\phi$ was assumed differentiable, this ratio will approach a limit as
$\delta t \to 0$ which is just the derivative of $\phi$ at $t$. Hence we may define the
instantaneous velocity of the particle at time $t$ as
$$\lim_{\delta t \to 0} \frac{\delta x}{\delta t}.$$
Thus we have been led to define the instantaneous velocity (or simply
the velocity) at time t as the derivative at t of the displacement-time
function. The function $\phi'$ is called the velocity-time function.
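
The limiting process can be watched numerically. The sketch below is an illustration under assumptions rather than part of the text's argument: it borrows the cubic displacement-time function from Example 2 of this section, and the function name `average_velocity` is hypothetical. It computes average velocities over shorter and shorter intervals starting at t = 1.

```python
def phi(t):
    """Displacement-time function; the cubic from Example 2 of this section."""
    return t**3 + 2 * t**2 - 9 * t + 9


def average_velocity(phi, t, dt):
    """Average velocity over the interval from t to t + dt."""
    return (phi(t + dt) - phi(t)) / dt


t = 1.0
for dt in (1.0, 0.1, 0.01, 0.001):
    print(f"dt = {dt:<6} average velocity = {average_velocity(phi, t, dt):.4f}")
```

As the interval shrinks, the printed values settle towards the derivative of the displacement-time function at t = 1, which is the instantaneous velocity there.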

Notation
The velocity at time t is written in Leibniz's notation as
$$\frac{dx}{dt}.$$
This notation helps recall the way in which the velocity was defined above
as the limit of a ratio : a small increase in x divided by a small increase
in t. It also reminds us that velocity is a rate of change : of displacement
with respect to time. For these reasons it is understandably popular with
those who are primarily concerned with setting up mathematical models
of physical problems.
On the other hand, Leibniz's notation ignores the functions which
link the physical quantities. There are many situations in mathematics
where it is necessary to recognize these functions quite explicitly and then
the notation $\phi'(t)$ has the advantage. Familiarity with both notations,
together with the ability to translate from one to the other, is therefore
desirable.
An alternative notation for the velocity, the one used by Newton in
fact, is
$$\dot{x}.$$
This is logically equivalent to Leibniz's notation in that one thinks of the
dot as meaning $d/dt$, and it is quicker to write.

Sign of the velocity

Since
$$\dot{x} = \lim_{\delta t \to 0} \frac{\delta x}{\delta t}$$
it follows that, if $\dot{x} > 0$ at some time $t$, then for $\delta t$ sufficiently small
$$\frac{\delta x}{\delta t} > 0.$$
Hence,
if $\delta t > 0$ then $\delta x > 0$.
This means that, during the interval when the time increases from $t$ to
$t + \delta t$, the net change of displacement of the particle has been in the
positive direction of the coordinate x. We may summarize this by saying
that
when $\dot{x} > 0$ the particle moves in the positive direction for the
x-coordinate.
Similarly it can be shown that
when $\dot{x} < 0$ the particle moves in the negative direction for the
x-coordinate.
The particle is said to be stationary when $\dot{x} = 0$ because the velocity is
then instantaneously zero.
The following example illustrates the ideas that have been introduced
in this section.

Example 2. A particle moves along a horizontal line in such a way that its
displacement x metres to the right of an origin O at time t seconds is given by
$$x = t^3 + 2t^2 - 9t + 9.$$
At the instant when t = 1, find the displacement and velocity of the particle. State
also the side of the origin on which the particle lies, the direction in which it is
moving, and its speed at this instant.
Solution. Here the displacement-time function is defined by $\phi(t) = t^3 + 2t^2 - 9t + 9$.
By differentiation, the velocity at time t is $\dot{x} = \phi'(t) = 3t^2 + 4t - 9$. Thus, when
t = 1,
$$x = 3 \quad \text{and} \quad \dot{x} = -2.$$
Since the particle moves to the right if x increases, this is the positive direction for
the coordinate x. Hence, when t = 1, the particle is to the right of the origin but
is moving left. Its speed is 2 m/s.
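
The steps of this solution can be sketched in Python as a check on the arithmetic. The derivative is entered by hand, as in the solution, and the variable names are illustrative only.

```python
def phi(t):
    """Displacement (metres to the right of the origin) at time t seconds."""
    return t**3 + 2 * t**2 - 9 * t + 9


def phi_prime(t):
    """Velocity, obtained by differentiating phi term by term."""
    return 3 * t**2 + 4 * t - 9


t = 1
x = phi(t)
v = phi_prime(t)

# The positive direction for x is to the right, so the signs of x and v
# give the side of the origin and the direction of motion.
side = "right" if x > 0 else "left" if x < 0 else "at the origin"
direction = "right" if v > 0 else "left" if v < 0 else "stationary"
speed = abs(v)

print(f"x = {x} m ({side} of origin), velocity = {v} m/s "
      f"(moving {direction}), speed = {speed} m/s")
```

Speed is taken as the absolute value of the velocity, which is why it is positive even though the particle is moving in the negative direction.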

Exercises 2.1
1. A particle drops over the edge of a table and falls vertically downwards. The
coordinate x is chosen as the distance through which the particle has fallen at an
arbitrary time t.
(a) Which of the four diagrams below best conveys the meaning of x?
(b) Which is the positive direction for the coordinate x? Why?

[Figure: four diagrams showing possible meanings of x.]
2. Repeat Example 2 in the text, but this time suppose that x metres is the
displacement of the particle to the left of the origin at time t seconds.
3. Let $\phi$ be the displacement-time function for Example 2 in the text. Write down
each of the numbers $\phi(0)$, $\phi(2)$, $\phi(t + 1)$. Write down also $\phi'(0)$, $\phi'(2)$, $\phi'(t + 1)$.
4. Two particles $P_1$ and $P_2$ are connected by a string (of constant length). $P_1$ lies
on the table while $P_2$ hangs from the rightmost edge of the table, shown below.
Let x be the distance of $P_1$ from the rightmost edge of the table and let y be the
distance of $P_2$ below this edge at time t.

[Figure: particle $P_1$ on a table, joined by a string over the rightmost edge to the hanging particle $P_2$.]

(a) Copy the diagram and mark x and y on it.
(b) Which is the positive direction for the coordinate x? for the coordinate
y?
(c) In each of the following cases state the direction of motion for each
particle:
(i) $\dot{x} > 0$ (ii) $\dot{y} > 0$.
(d) Write the length of the string in terms of x and y and then, using the
rule for differentiating a sum, deduce a relationship between $\dot{x}$ and $\dot{y}$.

2.2 Acceleration
In everyday usage, a car is said to be accelerating when its speed is
increasing and decelerating when its speed is decreasing. In mechanics,
however, the word acceleration is used in a slightly more technical way,
being defined in terms of rate of change of velocity rather than speed,
and it may be positive or negative.
To frame a definition of acceleration we use the notation of the
previous section and we suppose that as the time changes from $t$ to $t + \delta t$
the velocity changes from $\dot{x}$ to $\dot{x} + \delta\dot{x}$ so that
$$\dot{x} + \delta\dot{x} = \phi'(t + \delta t)$$
with
$$\dot{x} = \phi'(t)$$
and so, by subtraction,
$$\delta\dot{x} = \phi'(t + \delta t) - \phi'(t).$$
This gives the change in velocity as the time changes from $t$ to $t + \delta t$ and
hence the ratio
$$\frac{\delta\dot{x}}{\delta t} = \frac{\phi'(t + \delta t) - \phi'(t)}{\delta t}$$
gives the change in velocity per unit time. This ratio is called the average
acceleration during the time interval from $t$ to $t + \delta t$. Since $\phi'$ is assumed
differentiable, the ratio on the right approaches the limit $\phi''(t)$ as $\delta t \to 0$.
Hence we are led to define the instantaneous acceleration (or simply the
acceleration) of the particle at time t as
$$\lim_{\delta t \to 0} \frac{\delta\dot{x}}{\delta t}.$$
Thus the acceleration of the particle at time t is the first derivative of the
velocity-time function or the second derivative of the displacement-time
function at this instant.

Notation
The acceleration of the particle at time t is written in Leibniz's notation
as
$$\frac{d}{dt}\left(\frac{dx}{dt}\right) \quad \text{or, more briefly, as} \quad \frac{d^2x}{dt^2}.$$
As this is a little cumbersome to write, however, we shall mainly use
Newton's notation $\ddot{x}$. Thus we may write
$$\ddot{x} = \phi''(t).$$

Sign of the acceleration

Suppose that the acceleration $\ddot{x}$ of the particle is positive at some time
t. This implies that a small positive change $\delta t$ in the time will produce a
positive change $\delta\dot{x}$ in the velocity. Thus the velocity increases from $\dot{x}$ to
$\dot{x} + \delta\dot{x}$.
This does not necessarily mean, however, that the speed increases. For
example: if $\dot{x} = 2$ and $\delta\dot{x} = 0.05$ then both the velocity and the speed
increase from
2 to 2.05.
If, however, $\dot{x} = -2$ and $\delta\dot{x} = 0.05$ then the velocity increases from
-2 to -1.95
whereas the speed decreases from
2 to 1.95.

Thus there is some divergence between the everyday usage of the word
acceleration and its use in mechanics. Our usage is summarized by saying
that:

If $\ddot{x} > 0$ the particle is accelerating in the positive direction of the
x-axis; if $\ddot{x} < 0$ it is accelerating in the negative direction of the
x-axis.

Example 1. Suppose that a particle moves along a horizontal line in such a way
that its displacement x metres to the right of an origin O at time t seconds is given
by $x = \phi(t)$ where
$$\phi(t) = t^2 - t - 2.$$
(a) Sketch the graph of the displacement-time function. At which times is the
particle
(i) at the origin ?
(ii) to the right of the origin ?
(iii) to the left of the origin?
(b) At which times is the particle
(i) stationary?
(ii) moving to the right?
(iii) moving to the left?
(c) Which is the leftmost point reached by the particle?
(d) Find the acceleration. In which direction is the particle accelerating?

Solution. The coordinate x of the particle at a typical time t is shown in Figure
2.2.1. The positive direction for the x-axis is to the right since this is the direction
in which the particle moves if x increases.
(a) To sketch the graph of $x = \phi(t)$, note that the quadratic $\phi(t)$ factorizes to
give
$$x = t^2 - t - 2 = (t + 1)(t - 2).$$
This leads to the following table of signs

t    large -ve    -1    between -1 and 2    2    large +ve
x    large +ve     0    -ve                 0    large +ve

which in turn helps us to sketch the graph in Figure 2.2.2, a parabola with
vertex at the bottom.
Note that in sketching the graph we draw the x-axis vertically upwards even
though in the original problem it was horizontal. From the graph it is clear
that the particle is
(i) at the origin when t = -1 or t = 2,
(ii) to the right of the origin when t < -1 or t > 2,
(iii) to the left of the origin when -1 < t < 2.
(b) Differentiation gives
$$\dot{x} = \phi'(t) = 2t - 1.$$
The particle is thus
(i) stationary when $\dot{x} = 0$, hence $t = \tfrac{1}{2}$,
(ii) moving right when $\dot{x} > 0$, hence $t > \tfrac{1}{2}$,
(iii) moving left when $\dot{x} < 0$, hence $t < \tfrac{1}{2}$.
(c) The particle reaches its leftmost point when $\dot{x} = 0$, hence $t = \tfrac{1}{2}$ and so
$x = -2\tfrac{1}{4}$. The leftmost point is thus $2\tfrac{1}{4}$ metres to the left of O.
(d) Differentiation of $\dot{x}$ with respect to t gives
$$\ddot{x} = \phi''(t) = 2.$$
Thus the acceleration is 2 m/s². Since $\ddot{x} > 0$, the particle is accelerating
towards the right.

[Figure: the origin O and a particle at coordinate x, with the positive direction marked.]
Fig. 2.2.1. Coordinates for Example 1.

[Figure: graph of x against t.]
Fig. 2.2.2. Displacement-time graph for Example 1.
[Figure: tracking diagram drawn above the x-axis, which is marked from -3 to 4.]

Fig. 2.2.3. Tracking diagram for the motion described in Example 1.
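
The main conclusions of Example 1 can be checked with a short Python sketch. The derivatives are entered by hand, and the helper names `position_side` and `motion_direction` are hypothetical, introduced only for this illustration.

```python
def phi(t):
    """Displacement x (metres to the right of the origin) at time t."""
    return t**2 - t - 2


def velocity(t):
    """First derivative of phi."""
    return 2 * t - 1


ACCELERATION = 2  # second derivative of phi, constant in this example


def position_side(t):
    """Side of the origin on which the particle lies at time t."""
    x = phi(t)
    return "right" if x > 0 else "left" if x < 0 else "origin"


def motion_direction(t):
    """Direction of motion at time t, from the sign of the velocity."""
    v = velocity(t)
    return "right" if v > 0 else "left" if v < 0 else "stationary"


t_stationary = 0.5            # solves 2t - 1 = 0
leftmost = phi(t_stationary)  # displacement at the turning point

print("at origin when t = -1 and t = 2:", phi(-1) == 0 and phi(2) == 0)
print("leftmost displacement:", leftmost)
print("direction at t = 0:", motion_direction(0))
print("direction at t = 1:", motion_direction(1))
```

The turning point at t = 1/2 is where the motion reverses, which is the feature a tracking diagram makes visible.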

Geometry is a valuable tool in the study of problems in mechanics.
The well-known saying that a good picture is worth a thousand words is
just as true in mechanics as in any other part of mathematics. In addition
to representing the motion of a particle by a displacement-time graph,
we often find it informative to have diagrams which relate the motion
directly to the line on which it is taking place.
Thus, in Example 1, the displacement-time graph may be used to
derive the diagram shown in Figure 2.2.3, which helps us to keep track of
what is happening on the horizontal line on which the particle is moving.
There does not seem to be any standard name for these diagrams. We
shall call them tracking diagrams because they help us keep track of the
particle's motion.

Exercises 2.2
1. In Example 1 in the text, how would the solution change if x were defined to
be the displacement of the particle to the left of the origin at time t?
2. Suppose that a particle moves along a horizontal line in such a way that its
displacement to the right of an origin O at time t is given by $x = \phi(t)$ where
$$\phi(t) = t^3 - 6t^2 + 9t.$$
(a) Sketch the graph of the displacement-time function, showing where it
crosses the axes.
(b) Express $\dot{x}$ and $\ddot{x}$ as functions of time, and sketch their graphs.
(c) At which times is the particle
(i) at the origin?
(ii) to the right of the origin?
(iii) to the left of the origin?
(d) At which times is the particle
(i) stationary?
(ii) moving to the right?
(iii) moving to the left?
(e) At which times is the particle accelerating towards the right?
(f) Sketch a tracking diagram showing how the particle moves along the
line.

3. Repeat the above exercise, but with $\phi(t) = -t^4 + 2t^2 - 1$. What is the rightmost
point reached by the particle?

2.3 Derivatives as slopes

Velocity was defined earlier in this chapter as the derivative of the
displacement-time function. The geometrical interpretation of the
derivative, as the slope of a tangent to the graph of the original function, can
therefore be used to predict various features of the velocity-time graph
directly from the displacement-time graph. In a similar way, information
about the acceleration-time graph can be obtained from the velocity-time
graph.
To illustrate the process consider the graph in Figure 2.3.1 showing the
displacement x of a certain particle as a function of the time t, given by
$x = \phi(t)$. The tangent is shown at a typical point $(t, x)$ on the graph. Its
slope is the `rise over run' between any two of its points. This slope is the
derivative $\phi'(t)$ of the function $\phi$ at time t and hence taking $\dot{x}$ to be this
slope gives a point $(t, \dot{x})$ on the velocity-time graph. By plotting points
$(t, \dot{x})$ in this way for a number of values of t, we can build up a rough
picture of the velocity-time graph. The following example illustrates this
procedure.

[Figure: graph of x against t with the tangent drawn at a typical point.]
Fig. 2.3.1. The derivative is the slope of the tangent.
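
The procedure of reading slopes off a displacement-time graph has a simple numerical counterpart. The sketch below is an assumption-laden illustration: it stands in for the graph with the quadratic from Example 1 of Section 2.2, and it approximates the tangent slope by the rise over run between two nearby points on the curve (a central difference, which is a swapped-in technique and not the graphical method of the text). The name `tangent_slope` is hypothetical.

```python
def tangent_slope(phi, t, h=1e-4):
    """Estimate the slope of the tangent to the graph of phi at time t
    as rise over run between two nearby points on either side of t."""
    return (phi(t + h) - phi(t - h)) / (2 * h)


def phi(t):
    """Displacement-time function from Example 1 of Section 2.2."""
    return t**2 - t - 2


# Build a rough velocity-time graph as a list of points (t, slope).
times = [-1.0, 0.0, 0.5, 1.0, 2.0]
velocity_points = [(t, tangent_slope(phi, t)) for t in times]

for t, v in velocity_points:
    print(f"t = {t:5.1f}   estimated velocity = {v:7.3f}")
```

Applying the same function to the estimated velocities would give points on the acceleration-time graph, in the same way that the text takes slopes on the velocity-time graph.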

[Figure: graph of x against the t-axis.]
Fig. 2.3.2. Displacement-time graph for Example 1.

[Figure: graph of $\dot{x} = \phi'(t)$ against the t-axis, which is marked from 1 to 7.]
Fig. 2.3.3. Velocity-time graph for Example 1.

Example 1. Suppose that a particle moves along a line in such a way as to give
the graph in Figure 2.3.2 for the displacement x metres as a function of the time t
seconds. By examining slopes of tangents at various points obtain a rough sketch
of the velocity-time graph. Show also the general shape of the acceleration-time
graph.

Solution. The part of the graph for which t < 1 is a line segment of slope 2, while
the part of the graph for which $t \geq 3$ is a line segment of slope -1. In between,
as t increases from 1 to 3 the slope decreases steadily from 2 to -1. Hence the
velocity-time graph contains a horizontal line segment at height 2 for t < 1 and
another such segment of height -1 for t > 3. These are linked by a curved portion
which drops smoothly from height 2 to height -1 as in Figure 2.3.3.
The acceleration-time graph is now obtained by taking slopes on the velocity-
time graph. Along the constant parts of the velocity-time graph the slopes are zero.
Between t = 1 and t = 3 the slope drops from zero to some negative minimum value
and then comes back up again to zero. Hence the acceleration-time graph has the
general shape shown in Figure 2.3.4.
It is difficult to give a precise estimate of the minimum value of the acceleration
since the curved portion of the $\dot{x}$-$t$ graph is only a rough approximation.

[Figure: graph of acceleration against time.]
Fig. 2.3.4. Acceleration-time graph for Example 1.

Exercises 2.3
1. A particle is moving along a horizontal line and its displacement to the right
of an origin O at time t is x. The displacement-time graph for the motion is
shown below.

[Figure: displacement-time graph, $x = \phi(t)$ plotted against the t-axis.]

What is the velocity when t = 1 ? In which direction is the particle then moving?
2. Repeat the previous exercise, but this time suppose that x denotes the
displacement of the particle to the left of the origin O at time t.
3. The graph below shows the velocity-time graph for the motion of a particle
whose velocity and acceleration are both positive when t = 2.

[Figure: velocity-time graph, $\dot{x} = \phi'(t)$ plotted against the t-axis, with $t = 2$ marked.]

(a) Sketch similar velocity-time graphs to illustrate each of the following
possibilities, when t = 2:
(i) velocity positive and acceleration negative,
(ii) velocity negative and acceleration positive,
(iii) velocity negative and acceleration negative.
(b) In which of the cases in part (a) is the speed decreasing when t = 2?

4. A particle moves on a horizontal line and x is its displacement to the right of
the origin at time t. The tracking diagram for the motion is shown below.

[Figure: tracking diagram along the x-axis, marked from -4 to 6.]

(a) Give a rough sketch of the displacement-time graph.

(b) At which times is the velocity zero?
(c) Between which pair of times must there be an instant when the acceler-
ation is zero? Choose the times as close together as you can and give
reasons for your answer.

5. The displacement-time graph for the motion of a certain particle is shown
below. Copy the graph and then, directly underneath, sketch
(a) the velocity-time graph,
(b) the acceleration-time graph.
[Figure: displacement-time graph, with Displacement plotted against Time.]

2.4 Differential equations and antiderivatives

In previous sections it was shown how to proceed in the direction

displacement as a function of time
↓
velocity as a function of time
↓
acceleration as a function of time.

When expressed in terms of our definitions, all that is involved here is
the process of differentiation (once or twice in succession).
The reverse process is just as important in mechanics. To proceed in
the reverse direction, going back from acceleration to displacement, turns
out to involve the process of antidifferentiation. How this works will be
illustrated by an example.

Example 1. What can be said about the displacement x of a particle if its
acceleration is given as a function of the time t by
$$\ddot{x} = t \qquad (1)$$
and the particle starts from rest at the origin?

Solution. First, we find all the functions $\phi$ such that, when $x = \phi(t)$,
$$\ddot{x} = t$$
or equivalently
$$\phi''(t) = t.$$
The first antidifferentiation gives the derivative $\phi'$:
$$\phi'(t) = \tfrac{1}{2} t^2 + c_1$$
for some constant $c_1$. A further antidifferentiation now gives
$$\phi(t) = \tfrac{1}{6} t^3 + c_1 t + c_2 \qquad (2)$$
for some constant $c_2$. But since the particle starts from rest, at the origin, the initial
values of $x$ and $\dot{x}$ are $x = 0$ and $\dot{x} = 0$ at $t = 0$. Hence $\phi(0) = 0$ and $\phi'(0) = 0$ from
which using (2) it may be deduced that $c_1 = c_2 = 0$. Hence the required answer is
$$x = \tfrac{1}{6} t^3.$$

The equation (1) is an example of a simple differential equation - that
is, an equation involving derivatives. In the above example we solved the
differential equation by antidifferentiating both sides of the equation.
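
The two antidifferentiations in Example 1 can be mimicked exactly when the acceleration is a polynomial in t. The following sketch is illustrative only: representing a polynomial by its list of coefficients, and the helper names `antiderivative` and `evaluate`, are assumptions of this example rather than anything used in the text.

```python
from fractions import Fraction


def antiderivative(coeffs, constant):
    """Antidifferentiate a0 + a1*t + a2*t**2 + ... given as [a0, a1, a2, ...].

    `constant` is the constant of integration (the value at t = 0).
    """
    return [Fraction(constant)] + [Fraction(a) / (k + 1) for k, a in enumerate(coeffs)]


def evaluate(coeffs, t):
    """Value of the polynomial with the given coefficients at time t."""
    return sum(a * t**k for k, a in enumerate(coeffs))


# Acceleration t, that is 0 + 1*t.
acceleration = [0, 1]
# Starts from rest, so the constant c1 (the velocity at t = 0) is 0.
velocity = antiderivative(acceleration, constant=0)
# Starts at the origin, so the constant c2 (the displacement at t = 0) is 0.
displacement = antiderivative(velocity, constant=0)

print("velocity coefficients:    ", velocity)
print("displacement coefficients:", displacement)
print("displacement at t = 3:    ", evaluate(displacement, 3))
```

The only non-zero displacement coefficient is the one multiplying t cubed, in agreement with the answer found above. Other initial conditions are handled by passing different constants.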

Exercises 2.4
The exercises from number 4 onwards provide practice at applying differential equa-
tions to constant acceleration problems. In each of these exercises you must set up
a differential equation and then solve it - merely quoting a formula is not what
is wanted. You may assume the acceleration g due to gravity is 9.8 m/s$^2$ vertically
downwards.
1. Use antidifferentiation to find the solution of the differential equation
$$\dot{x} = t + 1$$
which satisfies the initial condition $x = 2$ when $t = 0$.
2. Find the solution of the differential equation
$$\ddot{x} = e^{t} + e^{-t}$$
which satisfies the initial conditions $x = 2$ and $\dot{x} = 1$ when $t = 0$.
3. A particle moves along a line and its displacement to one side of an origin
at time t is x. The acceleration is given as a function of time by the differential
equation
$$\ddot{x} = \sin(2t).$$
Find x as a function of t, given that $x = \dot{x} = 1$ when $t = 0$.
4. Read through the following problem, then answer the questions (a) to (e)
below :

A stone was dropped from rest at the top of a cliff and a clunk
was heard 3 seconds later when it struck the ground at the foot of
the cliff. How high was the cliff? With what speed did the stone
hit the ground?
(a) There are two obvious points $P_1$ and $P_2$ either of which would be a
natural choice for origin. Which points are they?
(b) Show on a diagram how to choose a coordinate x for the particle, with
$P_1$ as the origin for the x-axis.
(c) Write down a differential equation for x as a function of t and state the
initial conditions. Hence solve the original problem, stated above.
(d) Repeat parts (b) and (c) but this time choose the coordinate x so that
$P_2$ is the origin for the x-axis.
(e) Which of the two choices of origin gives the neater solution?

5. Consider the following problem; then answer the questions below.

A jet airliner has a constant acceleration of 1 g m/s2 while on the
runway, and it becomes airborne at a speed of 300 km/h. Find
how long it takes to achieve take-off speed, starting from rest, and
find the distance it travels along the runway during this period.
How would your answer change if there were a headwind?
(a) Draw a diagram showing how to choose an origin and showing the
displacement x of the airliner from the origin at time t.
(b) Write down a differential equation for x as a function of t and state the
initial conditions. Hence solve the original problem.

6. Consider the following problem.

A particle moves along a line with uniform acceleration. In the first second of its
motion it moves a distance of 1 metre. What is the total distance it has moved
after 2 seconds? 3 seconds? 4 seconds? n seconds?
(a) Write down a differential equation for the particle's displacement x
metres to one side of the origin O at time t seconds.
(b) Hence solve the original problem.
(Galileo used the answers to the above problem in designing his experiment to
test the uniformity of acceleration down an inclined plane. Thus his ability as an
experimentalist depended on his prior ability as a mathematician!)
7. On a frosty morning, a person of average height accidentally drops a milk
bottle. Estimate approximately how far the bottle falls before it hits the ground.
Hence estimate the time before the bottle strikes the ground and the speed with
which it hits.
8. Assume that, in dry weather, a car travelling at 60 km/h takes a distance of
38 metres to stop once the brakes are applied. What distance would the car need
to pull up if initially it had been travelling at 90 km/h?
9. Consider the following problem.
A particle is thrown vertically upwards with an initial velocity of $v_0$. Find the time
which elapses before the particle reaches its maximum height and then express
the maximum height as a function of $v_0$.
(a) Choose a coordinate x to measure the position of the particle at time t
and illustrate on a sketch.
(b) Write down a differential equation for x as a function of t and hence
solve the problem.

10. A particle moving along a line is displaced a distance x to one side of an
origin at time t. It is at rest at the origin when t = 0.
(a) Show from the relevant differential equation that if the particle has a
constant acceleration a (a > 0) then the velocity and displacement at
time t satisfy
$$\dot{x}^2 = 2ax.$$
(b) Verify that the above equation is satisfied by $x = \tfrac{1}{2} a t^2$. Can you spot
another function $x = \phi(t)$ which satisfies it?

(c) Write down the converse of the result stated in part (a). Is the statement
you have written down true?
3
Ropes and pulleys

This chapter is about mechanical systems in which particles are attached
to a rope which passes around some pulleys. To keep the mathematical
model simple, we shall ignore friction, together with the masses of the
rope and the pulleys. Newton's laws can then be used to find the tension
in the rope and the acceleration of the particles.
On the basis of this mathematical model, it is possible to explain the
operation of the `block and tackle', which is often used in factories to
raise heavy loads. It will be shown how a small force exerted by a
workman pulling on one end of a rope can be converted by the pulleys
into a large force acting on the heavy load.
The mathematical model also provides the theory underlying Atwood's
machine, a contraption which is sometimes used to measure the acceler-
ation due to gravity.
The main skill which this chapter aims to develop is that of using
Newton's laws to derive equations of motion. The chapter also provides
incidental practice at solving systems of simultaneous linear equations,
solving differential equations by antidifferentiation, and manipulation of
inequalities.
First, however, a model will be constructed for the forces acting in the
rope.

3.1 Tension in the rope

The dynamical role of a piece of rope may be illustrated by what happens
in a tug of war. By pulling on the rope, one team is able to exert a force
on the other team, even though the teams are some distance apart. Thus
the rope enables force to be transmitted over a distance.
Does the magnitude of the force diminish along the length of the rope?


Fig. 3.1.1. The idea of tension in a rope.

Most people who have thought about it at all would probably answer
NO! The pull exerted by the team at one end of the rope will be the
same as the force felt by the team at the other end. A mathematical
model for the rope will be set up in which this can actually be proved
from Newton's laws of motion. First, however, it is necessary to analyse
the idea of tension in a rope.
Consider a length of rope which is being pulled at either end. Figure
3.1.1 shows a section through a rope at a point along its length. We
imagine this section as determining two separate pieces of rope, say A
and B.
The piece of rope A will have a force acting on it due to B of magnitude
say T > 0. Since the piece of rope B can pull, but cannot push (it would
go slack), the direction of this force on A must be in the direction shown
in Figure 3.1.1. By Newton's third law, the piece of rope B will have
a force acting on it due to A of the same magnitude T > 0, but with
opposite direction.
We define the tension in the rope at the point of section to be the
common magnitude T > 0 of the above forces.
A rope is said to be light if it has zero mass. Although such ropes
cannot occur in practice, we assume their existence as part of our idealized
mathematical model. They should be closely approximated in practice by
ropes whose masses are small compared with the masses of the objects
at either end. We also assume Newton's laws are applicable even when
the mass is zero. The assumption of a light rope gives the following
proposition, which is the key to modelling many practical problems.

Fig. 3.1.2. Diagram used in the proof of Proposition 1.

Proposition 1 For a light rope, stretched taut by forces at either end, the
tension stays constant along the length of the rope.

Proof Let $T_1$ and $T_2$ be the tensions at any two points along the rope,
so that the forces acting on the piece of rope between these points are as
in Figure 3.1.2.
The net force acting on the rope is $T_2 - T_1$ and so, by Newton's second
law,
$$T_2 - T_1 = \{\text{mass of rope}\} \times \{\text{acceleration}\} = 0 \times \{\text{acceleration}\} = 0.$$
Thus
$$T_2 = T_1.$$

This shows the tension is the same at the two points. As these two
points were chosen arbitrarily, the tension remains constant along the
entire length of the rope.
The tension in a rope may change as it passes around a pulley. For
example, if the axle of a pulley is not properly lubricated, a large part
of the force exerted by the worker in Figure 3.1.3 might be expended in
getting the pulley to turn, instead of in raising the load.
A similar result might ensue if the pulley were large and massive - a
lot of the force would be wasted in getting the pulley to spin.
The contrary case, in which there is little friction at the centre of the
pulley and the mass of the pulley is small compared with that of the
load, will be modelled by an idealized pulley in which both friction and
mass are zero. Such a pulley is said to be smooth and light. It is possible
to argue - although we shall not go into detail here - that such a
pulley does not change the tension in the rope.

Fig. 3.1.3. Using a pulley to lift a load.

The upshot of this section is therefore that, if a light rope passes around
one or more smooth light pulleys, the tension will be the same at either end
of the rope.

Exercises 3.1
1. By modifying the proof of Proposition 1 show that for a rope not necessarily
light, stretched taut by forces at either end, the tension stays constant along the
length of the rope provided its acceleration is zero.
2. A light rope has a particle of positive mass firmly attached at a point
somewhere between its two ends, as shown below. Are the two tensions at either
end of the rope always equal when the rope is stretched taut? Give reasons for
your answer.

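[Figure: a taut rope with a particle attached between its ends; the forces at the two ends are $-T_1$ and $T_2$.]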

3.2 Solving pulley problems

In a typical pulley problem, particles will be suspended by a rope passing
over pulleys. The aim will be to find the tension in the rope and the
accelerations of the particles. Although the central step in the solution
is to apply Newton's second law, some preliminary steps and some
concluding steps are also necessary. In outline, the procedure for solving
these problems is as follows.

STEP 1: Draw a diagram. Show the particles in typical positions.
Introduce a coordinate for each particle and a letter for the tension
in the rope.
STEP 2: Express the length of the rope in terms of these coordi-
nates. Hence obtain a relationship between the accelerations of the
particles.
STEP 3: For each particle in turn, draw a diagram showing all the
forces acting on that particle.
STEP 4: Apply Newton's second law to each particle in turn.
STEP 5: Solve the resulting equations for the unknowns - the
accelerations of the particles and the tension in the rope.
STEP 6: Look carefully at your answers, think about what they
mean physically, and check whether they seem reasonable.

While some students may wish to refer to the above list as a guide to
solving the problem, others may use it merely as a checklist at the end
to ensure nothing essential has been omitted from their solutions. The
following example illustrates how these steps are carried out in detail.

Example 1. Two particles, of mass 1 kg and 2 kg respectively, are attached to the
ends of a light rope which passes over a smooth light pulley suspended at a fixed
distance below the ceiling. Find the accelerations of the particles and the tension in
the rope.

Solution.
STEP 1: Draw a diagram and set up notation. The particles are shown in typical
positions in Figure 3.2.1 at time t.
Let x1 and x2 be the coordinates of the particles as in the diagram. Thus
x1 = distance of first particle below centre of pulley
and similarly for x2.
Now note that, for either particle, if its coordinate increases then it moves down-
wards. Hence downwards is the positive direction for each coordinate.
STEP 2: Relate the coordinates. Note from the diagram that
$$x_1 + x_2 = \text{length of the rope} - \tfrac{1}{2} \times \text{circumference of pulley}.$$
Fig. 3.2.1. Setting up coordinates for Example 1.

Fig. 3.2.2. Force diagrams for each mass.

But, since this does not change with time, $x_1 + x_2$ must be constant and hence its
derivative with respect to t is 0. By the rule for differentiating a sum, applied twice
in succession, it therefore follows that
$$\ddot{x}_1 + \ddot{x}_2 = 0. \qquad (1)$$
STEP 3: Show forces on each particle. Since the rope can only pull, the forces it
exerts on the particles must be in the upwards direction, which is negative for each
coordinate. In addition, each particle has its weight acting on it in the downwards
direction. Hence the forces on each particle are as in Figure 3.2.2.
STEP 4: Apply Newton's second law. For each particle, the mass times the
acceleration equals the net force, which can be read off Figure 3.2.2. Hence
$$1\,\ddot{x}_1 = 1g - T \qquad (2)$$
$$2\,\ddot{x}_2 = 2g - T \qquad (3)$$
STEP 5: Solve the equations. The equations (1), (2) and (3) form a system of
three simultaneous linear equations in three unknowns $\ddot{x}_1$, $\ddot{x}_2$ and $T$. The standard
method for solving such systems is the Gaussian elimination algorithm. To apply
this procedure, first bring all the unknowns to the left-hand side of each equation
to get
$$\begin{aligned} \ddot{x}_1 + \ddot{x}_2 &= 0 \\ \ddot{x}_1 + T &= g \\ 2\ddot{x}_2 + T &= 2g \end{aligned}$$
Next, the coefficients of the unknowns are read off and placed in a matrix alongside
the column of right-hand sides to give
$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & g \\ 0 & 2 & 1 & 2g \end{array}\right]$$
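The aim now is to reduce the coefficient matrix to the unit matrix by using
operations on the rows. This gives the following matrices.
$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & -1 & 1 & g \\ 0 & 2 & 1 & 2g \end{array}\right] \quad \begin{array}{l} \mbox{} \\ \text{new row 2 = row 2 - row 1} \\ \mbox{} \end{array}$$
$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & 1 & -1 & -g \\ 0 & 0 & 3 & 4g \end{array}\right] \quad \begin{array}{l} \mbox{} \\ \text{new row 2 = $-$ row 2} \\ \text{new row 3 = row 3 + 2 $\times$ row 2} \end{array}$$
$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & 1 & -1 & -g \\ 0 & 0 & 1 & \tfrac{4}{3} g \end{array}\right] \quad \begin{array}{l} \mbox{} \\ \mbox{} \\ \text{new row 3 = $\tfrac{1}{3} \times$ row 3} \end{array}$$
$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & \tfrac{1}{3} g \\ 0 & 0 & 1 & \tfrac{4}{3} g \end{array}\right] \quad \begin{array}{l} \mbox{} \\ \text{new row 2 = row 2 + row 3} \\ \mbox{} \end{array}$$
$$\left[\begin{array}{ccc|c} 1 & 0 & 0 & -\tfrac{1}{3} g \\ 0 & 1 & 0 & \tfrac{1}{3} g \\ 0 & 0 & 1 & \tfrac{4}{3} g \end{array}\right] \quad \begin{array}{l} \text{new row 1 = row 1 - row 2} \\ \mbox{} \\ \mbox{} \end{array}$$
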

The final matrix is called the row echelon form of the original matrix.
Reinsertion of the unknowns in the row echelon form makes the solutions obvious:
$$\ddot{x}_1 = -\tfrac{1}{3} g \qquad (4a)$$
$$\ddot{x}_2 = \tfrac{1}{3} g \qquad (4b)$$
$$T = \tfrac{4}{3} g \qquad (4c)$$
Thus the first particle has an acceleration vertically upwards of magnitude $\tfrac{1}{3} g$ m/s$^2$,
the second particle has a downwards acceleration of the same magnitude, and the
tension in the rope is $\tfrac{4}{3} g$ newton.

STEP 6: Check the answers. Certain features of the answers (4) could have been
predicted from Figure 3.2.1. For example $\ddot{x}_1$ should be negative and $\ddot{x}_2$ should be
positive because the lighter particle will accelerate upwards and the heavier one
downwards. The magnitude of the accelerations should be less than g, moreover,
because the fall of the heavier particle is impeded by the upwards pull of the
rope.
The answer for the tension should (and does) lie between g and 2g since the
tension must be larger than the weight of the lighter particle (to accelerate it
upwards) and less than the weight of the heavier particle (to allow it to accelerate
downwards).

The answers obtained for the accelerations provide differential equations,
which may be solved (subject to suitable initial conditions) to give
a complete description of the motion of the particles. The steps involved
in applying the Gaussian elimination algorithm are routine and can easily
be implemented on a computer.
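
As a rough indication of what such an implementation might look like, here is a minimal sketch that reduces the augmented matrix of Example 1 to the form with the unit matrix on the left. It is not taken from the text: the function name `gauss_jordan` is hypothetical, the rows are processed in a slightly different order from the hand calculation above, and the unknowns are measured in units of g (so g is replaced by 1).

```python
from fractions import Fraction


def gauss_jordan(augmented):
    """Reduce an augmented matrix [A | b] to [I | x] and return the column x.

    Assumes A is square and invertible. Fractions keep the arithmetic exact.
    """
    rows = [[Fraction(v) for v in row] for row in augmented]
    n = len(rows)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        # Subtract multiples of the pivot row to clear the rest of the column.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Unknowns in order: acceleration of particle 1, acceleration of particle 2, tension.
system = [
    [1, 1, 0, 0],  # equation (1)
    [1, 0, 1, 1],  # equation (2), rearranged, with g = 1
    [0, 2, 1, 2],  # equation (3), rearranged, with g = 1
]
acc1, acc2, tension = gauss_jordan(system)
print(acc1, acc2, tension)
```

The three values returned are the coefficients of g in the answers (4a), (4b) and (4c).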

Example 2. Suppose that in Example 1 the particles are released from rest when
the heavier particle is at a height of 2 metres above the floor. Find how long it takes
to reach the floor (given that the rope is long enough for this to happen before the
lighter particle hits the pulley).

Solution. By Example 1, the distance $x_2$ of the heavier particle below the centre of
the pulley, when considered as a function of time, satisfies the differential equation
$$\ddot{x}_2 = \tfrac{1}{3} g.$$
Suppose that initially the particle is a distance a metres below the centre of the
pulley. The initial conditions may be written as
$$x_2 = a \quad \text{and} \quad \dot{x}_2 = 0 \quad \text{when } t = 0.$$
The problem is to find t such that $x_2 = a + 2$, since the particle has to drop a
further 2 metres.
Antidifferentiation applied twice to the differential equation and use of the initial
conditions gives the solution
$$x_2 = \tfrac{1}{6} g t^2 + a.$$
Hence $x_2 = a + 2$ when $2 = \tfrac{1}{6} g t^2$ and hence when $t = \sqrt{12/g}$, as $t > 0$. Thus the
particle hits the ground after about 1.1 seconds.
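
The arithmetic in the last step is easily checked by machine. The short sketch below assumes g = 9.8 m/s$^2$ (the value quoted in the exercises of Section 2.4); the function name `fall_time` is introduced here purely for illustration.

```python
import math


def fall_time(distance, acceleration):
    """Time to cover `distance` from rest under constant `acceleration`.

    Follows from distance = (1/2) * acceleration * t**2 with t > 0.
    """
    return math.sqrt(2 * distance / acceleration)


g = 9.8  # m/s^2, assumed value
# The heavier particle drops 2 metres with acceleration g/3.
t_floor = fall_time(distance=2.0, acceleration=g / 3)
print(f"time to reach the floor: {t_floor:.2f} s")
```

Changing `distance` or `acceleration` gives the corresponding time for other release heights or other pairs of masses.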

Exercises 3.2

1. Two particles, of mass 2 kg and 3 kg respectively, are connected by a light
rope passing over a smooth light pulley. The coordinates for the particles are to
be taken as their respective heights above the floor, say $y_1$ and $y_2$ metres. The
particles are shown in typical positions in the diagram below.
[Figure: a pulley suspended from the ceiling, with the 3 kg and 2 kg particles hanging from the two ends of the rope.]

(a) Show the coordinates $y_1$ and $y_2$ of the particles on the above diagram.
(b) Suppose now that the system is set in motion. What does your physical
intuition tell you about
(i) the sign of $\ddot{y}_1$,
(ii) the sign of $\ddot{y}_2$,
(iii) the range in which the tension in the rope must lie.
(c) Suppose the particles are initially released from rest when the second
particle is at a height of 2 metres above the floor. What are the values
of $y_2$ and $\dot{y}_2$ when t = 0?

2. Repeat the solution of Example 1 in the text, but this time choose the
coordinates $x_1$ and $x_2$ to be the heights above the ground of the respective
particles, at time t.
How have the answers changed with the new choice of coordinates? Does the
first particle still accelerate upwards?
3. Two particles, of mass 2 kg and 3 kg respectively, are attached to the ends
of a light rope which passes over a smooth light pulley, which is suspended at a
fixed distance below the ceiling. Find the accelerations of the particles and the
tension in the rope, by following the steps explained in the text.
4. Suppose that in the preceding exercise the particles are released from rest
when the heavier particle is at a height of 1 metre above the floor. Find how
long it takes to reach the floor (given that the rope is sufficiently long for this to
occur before the lighter particle reaches the pulley).

3.3 Further pulley systems

Each of the following examples exhibits some new feature which was
not present in Example 1 of Section 3.2. Since the same general procedure
for solving pulley problems is still applicable, we will not give the
complete solutions but will concentrate instead on the steps which need
modification.

Fig. 3.3.1. Setting up the coordinates for Example 1.

The first example generalizes Example 1 of Section 3.2 to allow for
arbitrary masses $m_1$ and $m_2$ for the attached particles. Thus $m_1$ and $m_2$
come with the problem and are regarded as `knowns'. The aim in solving
the problem is to express the `unknowns' in terms of them. Letters
like $m_1$ and $m_2$ which allow us to consider a whole range of problems
simultaneously are called parameters of the problem.

Example 1. Two particles of mass $m_1 > 0$ and $m_2 > 0$ respectively are attached to
the ends of a light rope which passes over a smooth light pulley suspended a fixed
distance below the ceiling. Find the accelerations of the particles and the tension in
the rope.
Discussion STEPS 1-5: These may be followed much as in Example 1 of
Section 3.2. As always the first step is to draw a diagram to indicate which
coordinates are to be used, as in Figure 3.3.1.
The remaining steps then lead to the following answers for the accelerations
and the tension:
$$\ddot{x}_1 = \frac{m_1 - m_2}{m_1 + m_2}\, g \qquad (1a)$$
$$\ddot{x}_2 = \frac{m_2 - m_1}{m_1 + m_2}\, g \qquad (1b)$$
$$T = \frac{2 m_1 m_2}{m_1 + m_2}\, g \qquad (1c)$$
STEP 6: Because the answers (1) contain parameters, it is possible to use them
to make a wide range of predictions about the behaviour of the system in Figure
3.3.1 and to perform a variety of checks on the answers.
The simplest of the checks is just to substitute $m_1 = 1$ and $m_2 = 2$ into the
answers (1) and observe that these are then the same as the answers (4) found
for Example 1 of Section 3.2. Also, it is easily verified that $\ddot{x}_1 + \ddot{x}_2 = 0$ which
must be true since the length of the rope stays fixed. Another check is that $T > 0$
which must hold by the definition of tension.
Further checks are as follows.
(a) Dimensional checks. Recall that $[g] = LT^{-2}$. The answer (1a) gives for the
dimensions of $\ddot{x}_1$
$$[\ddot{x}_1] = \left[\frac{m_1 - m_2}{m_1 + m_2}\right][g] = M M^{-1} [g] = LT^{-2}$$
which are the correct dimensions for acceleration. The answer (1c) gives
$$[T] = \left[\frac{2 m_1 m_2}{m_1 + m_2}\right][g] = M^2 M^{-1} [g] = MLT^{-2}$$
which are the correct dimensions for a force.
(b) Equilibrium cases. If the two particles have equal masses, then their weights
should exactly balance. Hence the particles should stay at rest or move
with uniform velocity. The answers predict this will happen, since putting
$m_1 = m_2$ in (1a) and (1b) gives
$$\ddot{x}_1 = \ddot{x}_2 = 0$$
while (1c) shows the tension is then the common weight of the particles.
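(c) Limiting values of the parameters. If we keep the first mass fixed and allow
the other to approach zero, we expect to obtain answers appropriate to
free fall of the first particle. This is what happens since
$$\ddot{x}_1 = \frac{m_1 - m_2}{m_1 + m_2}\, g \quad \text{by (1a)}$$
$$\to \frac{m_1 - 0}{m_1 + 0}\, g = g \quad \text{as } m_2 \to 0$$
while
$$T = \frac{2 m_1 m_2}{m_1 + m_2}\, g \quad \text{by (1c)}$$
$$\to \frac{0}{m_1}\, g = 0 \quad \text{as } m_2 \to 0.$$
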

(d) Further inequality checks. In the case $m_1 < m_2$, it is shown in one of the
exercises how to derive from the answers the following inequalities:
$$0 < \ddot{x}_2 < g \qquad (2)$$
$$m_1 g < T < m_2 g. \qquad (3)$$

These inequalities say that the answers must lie within certain ranges
which are very plausible on physical grounds. Thus the inequalities (2)
say that the heavier particle must accelerate downwards and that the
magnitude of the acceleration must be less than that for free fall. The
inequalities (3), on the other hand, say that the tension in the rope must
be larger than the weight of the lighter particle (to make it accelerate
upwards) and less than the weight of the heavier particle (to allow it to
accelerate downwards). A similar discussion applies in the case where
$m_1 > m_2$.
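
Several of the checks in Step 6 lend themselves to quick numerical experiment. The sketch below simply codes the answers (1a), (1b) and (1c); the function name `atwood`, the use of exact fractions and the default g = 1 (so that results come out as multiples of g) are choices made for this illustration only.

```python
from fractions import Fraction


def atwood(m1, m2, g=1):
    """Answers (1a)-(1c): the two accelerations and the tension.

    The positive direction is downwards for both particles. With the
    default g = 1 the results are expressed as multiples of g.
    """
    m1, m2, g = Fraction(m1), Fraction(m2), Fraction(g)
    total = m1 + m2
    acc1 = (m1 - m2) / total * g
    acc2 = (m2 - m1) / total * g
    tension = 2 * m1 * m2 / total * g
    return acc1, acc2, tension


# Substituting m1 = 1, m2 = 2 should reproduce the answers (4) of Section 3.2.
print(atwood(1, 2))

# Equilibrium case: equal masses.
print(atwood(5, 5))

# Limiting case: keep m1 fixed and let m2 shrink towards zero.
for m2 in (Fraction(1, 10), Fraction(1, 1000)):
    print(m2, atwood(1, m2))
```

A numerical experiment of this kind cannot replace the dimensional check, but it gives a quick way of spotting a mis-copied formula.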
The solution to the above example was based on assumptions about
the lightness of the rope and the pulley, and the smoothness of the
pulley. Examples encountered in practice only approximate our idealized
model, without satisfying our assumptions exactly. Hence it is desirable
to consider ways to test the accuracy with which our model approximates
the real world.
To this end, recall that the answers for the accelerations provide very
simple differential equations which can be solved to give the heights of
the particles as functions of the time. By adjusting the relative masses
of the particles, we could achieve small accelerations and hence, with the
aid of a stop-watch, get empirical plots of height against time. These
results could then be compared with those predicted by our model.
Alternatively, assuming the validity of the model, we could use the
measurements to determine the acceleration g due to gravity. When put
to this use, the mechanical system shown in Figure 3.3.1 is called Atwood's
machine.
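
To indicate how such a determination of g might go, the following sketch inverts the model: the heavier particle (mass $m_2$) falls a known distance from rest in a measured time, and g is recovered from answer (1b). The function name `estimate_g` is hypothetical, and the numbers are synthetic values generated from the model itself, not experimental measurements.

```python
def estimate_g(m1, m2, drop, time):
    """Estimate g from the time the heavier particle (m2 > m1) takes to fall `drop` from rest.

    Uses drop = (1/2) * a * time**2 together with a = (m2 - m1) / (m1 + m2) * g.
    """
    acceleration = 2 * drop / time**2
    return acceleration * (m1 + m2) / (m2 - m1)


# Round-trip check on synthetic numbers: choose g, compute the fall time the
# model predicts, then recover g from that time.
g_true = 9.8                      # m/s^2, assumed
m1, m2, drop = 0.95, 1.05, 1.0    # kg, kg, metres (illustrative values only)
a = (m2 - m1) / (m1 + m2) * g_true
time = (2 * drop / a) ** 0.5
print(f"predicted fall time: {time:.2f} s")
print(f"recovered g: {estimate_g(m1, m2, drop, time):.3f} m/s^2")
```

Nearly equal masses give a small acceleration and hence a fall time long enough to measure with a stop-watch, as suggested above.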
The next example introduces some of the complexities that arise when
more than one pulley is involved.

Example 2. A light rope is attached to the ceiling at one end. It then

(a) passes under a smooth light pulley from which a particle of mass m1 > 0 is
suspended,
(b) passes over a smooth light pulley suspended from the ceiling, and finally
(c) has a particle of mass m2 > 0 attached to its other end.
The portions of the rope not directly in contact with the pulleys lie in the vertical
direction. Find the accelerations of the particles and the tension in the rope.
Discussion A suitable choice of coordinates for the particles is shown in
Figure 3.3.2.
In this example, it is convenient to choose the coordinate of the first particle
to be the distance between the centre of the first pulley and the particle. Since
the distance of the first particle below the ceiling differs from $x_1$ by a constant,
its acceleration will be $\ddot{x}_1$.
In deriving a relation between the coordinates, you should note that the
$x_1$-coordinate contributes twice to the length of the rope. As a result, the relation
between $\ddot{x}_1$ and $\ddot{x}_2$ will be different from that obtained in previous examples.
On the dynamical side, some thought is needed to model the forces acting on
the first particle. Let T be the tension in the rope. Since the rope is pulling
upwards on both sides of the pulley, as in Figure 3.3.3, a plausible assumption
is that it exerts a net upwards force of magnitude 2T on the pulley. Since the
pulley is light, this force will be transmitted unchanged to the first particle.
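
The modelling remarks above can be turned into a small numerical check. The sketch below assumes that both coordinates are measured positive downwards, so that the rope constraint reads $2\ddot{x}_1 + \ddot{x}_2 = 0$ (this is the last row of the matrix quoted in Exercise 9), and it takes the net upward force on the first particle to be 2T as suggested in the discussion. The function name `example2_solution` and the choice of units with g = 1 are assumptions made for illustration.

```python
from fractions import Fraction


def example2_solution(m1, m2, g=1):
    """Solve  m1*a1 = m1*g - 2*T,  m2*a2 = m2*g - T,  2*a1 + a2 = 0.

    Returns (a1, a2, T). With the default g = 1 the results are multiples of g.
    """
    m1, m2, g = Fraction(m1), Fraction(m2), Fraction(g)
    # Put a2 = -2*a1 and T = m2*(g - a2) into the first equation, then solve for a1.
    a1 = (m1 - 2 * m2) * g / (m1 + 4 * m2)
    a2 = -2 * a1
    tension = m2 * (g - a2)
    return a1, a2, tension


# Equal masses: the first particle should rise and the second should fall.
a1, a2, tension = example2_solution(1, 1)
print(a1, a2, tension)

# Substitute back into the two equations of motion as a check.
print(1 * a1 == 1 - 2 * tension, 1 * a2 == 1 - tension)
```

Taking the first mass to be twice the second gives zero accelerations, which is the equilibrium case one would expect when a force 2T supports the first particle and a force T supports the second.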

Fig. 3.3.2. Coordinates for Example 2.

Fig. 3.3.3. Forces on a light pulley.

Pulleys find practical application in the use of `block and tackle' to lift
heavy loads. A small force exerted by a worker can thereby be converted
into a large force acting on the load. Further details are given in Exercises
7 and 8 below.

Exercises 3.3
1. Complete the solution of Example 1 in the text by carrying out Steps 2-5
(explained in Section 3.2).
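To solve the linear equations you may use the fact that the matrix below on
the left has the row echelon form shown on the right, where $\mu = (m_1 + m_2)^{-1} g$.
$$\left[\begin{array}{ccc|c} m_1 & 0 & 1 & m_1 g \\ 0 & m_2 & 1 & m_2 g \\ 1 & 1 & 0 & 0 \end{array}\right] \qquad \left[\begin{array}{ccc|c} 1 & 0 & 0 & (m_1 - m_2)\mu \\ 0 & 1 & 0 & -(m_1 - m_2)\mu \\ 0 & 0 & 1 & 2 m_1 m_2 \mu \end{array}\right]$$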
2. (a) Show that the answers obtained for $\ddot{x}_1$ and $\ddot{x}_2$ in Example 1 in the text
can be written so that they involve the masses $m_1$ and $m_2$ only in the
combination $m_1/m_2$.
(b) If the masses are both doubled, what happens to the accelerations?
(c) How would you choose the mass ratio $m_1/m_2$ to make $\ddot{x}_1$ small and
positive?

3. Each of the following answers is suggested for the tension in Example 1 in
the text. Show in each case that the proposed answer is wrong by using one of
the checks explained in the text.
$$\text{(a)}\quad T = \frac{(m_1 m_2)^2}{m_1 + m_2}\, g \qquad\qquad \text{(b)}\quad T = \frac{m_1^2 + m_2^2}{m_1 + m_2}\, g.$$

4. Suppose that in Example 1 the particles start from rest with the second
particle at a height of 1 metre above ground level. If the second particle takes
1 second to reach the ground, what is the ratio of the two masses? Assume, of
course, that the first particle does not run out of rope.
[You are to solve this problem by solving a suitable differential equation with
the relevant initial conditions.]
5. Two particles, of mass m1 and m2 respectively, are connected by a light rope
passing over a smooth light pulley. A third particle, of mass m3, hangs by a light
rope from the second particle. The coordinates of the first two particles at time t
are their respective distances x1 and x2 below the centre of the pulley.

(a) Copy the diagram above and show the coordinates $x_1$ and $x_2$ on it.
(b) Use your physical intuition to find, in each case, a necessary and sufficient
condition on $m_1$, $m_2$ and $m_3$ to ensure that throughout the motion
(i) $\ddot{x}_1 > 0$ (ii) $\ddot{x}_1 = 0$.
(c) Over which sections of the rope must the tension stay constant? How
many different tensions are there?
(d) Which two particles have the same acceleration?

6. For the mechanical system in Exercise 5, find the accelerations of the particles
and the tension in the rope by carrying out Steps 1-6 (explained in Section 3.2).
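To solve the linear equations you may assume that the matrix below on the
left has the row echelon form on the right, where $\mu = (m_1 + m_2 + m_3)^{-1} g$.
$$\left[\begin{array}{cccc|c} m_1 & 0 & 1 & 0 & m_1 g \\ 0 & m_2 & 1 & -1 & m_2 g \\ 0 & m_3 & 0 & 1 & m_3 g \\ 1 & 1 & 0 & 0 & 0 \end{array}\right] \qquad \left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & (m_1 - m_2 - m_3)\mu \\ 0 & 1 & 0 & 0 & -(m_1 - m_2 - m_3)\mu \\ 0 & 0 & 1 & 0 & 2 m_1 (m_2 + m_3)\mu \\ 0 & 0 & 0 & 1 & 2 m_1 m_3 \mu \end{array}\right]$$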

7. The contented worker shown below is using sound mechanical principles to
help him raise a heavy load. Show the tension in the various sections of the
rope when the man pulls with a force T. What is the net force then exerted by
the ropes on the heavy load? Assume the ropes stay vertical and the load stays
horizontal.

8. When a heavy load is raised by a system of pulleys, the ratio
$$\frac{\{\text{force exerted on load}\}}{\{\text{force exerted by worker}\}}$$
is called the mechanical advantage of the system.
(a) What is the mechanical advantage of the system in Exercise 7?
(b) Design a similar system with a mechanical advantage of 4.

9. Complete the solution of Example 2 in the text by carrying out Steps 1-6
from Section 3.2.
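To solve the linear equations you may assume that the matrix below on the
left has the row echelon form shown on the right, where $\mu = (m_1 + 4 m_2)^{-1} g$.
$$\left[\begin{array}{ccc|c} m_1 & 0 & 2 & m_1 g \\ 0 & m_2 & 1 & m_2 g \\ 2 & 1 & 0 & 0 \end{array}\right] \qquad \left[\begin{array}{ccc|c} 1 & 0 & 0 & (m_1 - 2 m_2)\mu \\ 0 & 1 & 0 & -2(m_1 - 2 m_2)\mu \\ 0 & 0 & 1 & 3 m_1 m_2 \mu \end{array}\right]$$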
10. Each of the following answers is suggested for the acceleration of the first
particle in Example 2. Show in each case that the proposed answer is wrong by
using one of the checks explained in the text.

-
2 2
m1 ml m2
(a) z=
1
2m2
,+ 4m2 g (b) xi = g
m1 MI +M2

11. A light rope passes over two smooth light pulleys suspended at a fixed height
below the ceiling. Attached to the ends of the rope are two particles of mass m1
and m3 respectively. The central portion of the rope passes under a pulley which
supports a particle of mass m2 in such a way that the sections of the rope not
touching the pulleys hang vertically, as in the diagram below.

Choose coordinates for each particle so that their positive directions are
downwards and then find the accelerations of the particles and the tension in the
rope following Steps 1-6, explained in Section 3.2.
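To solve the linear equations you may assume that the matrix below on the
left has the row echelon form on the right, where $\mu = (m_1 m_2 + m_2 m_3 + 4m_3 m_1)^{-1}$.
$$
\begin{pmatrix}
m_1 & 0 & 0 & 1 & m_1 g \\
0 & m_2 & 0 & 2 & m_2 g \\
0 & 0 & m_3 & 1 & m_3 g \\
1 & 2 & 1 & 0 & 0
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & 0 & 0 & 0 & (1 - 4m_2 m_3\mu)g \\
0 & 1 & 0 & 0 & (1 - 8m_1 m_3\mu)g \\
0 & 0 & 1 & 0 & (1 - 4m_1 m_2\mu)g \\
0 & 0 & 0 & 1 & 4m_1 m_2 m_3\mu g
\end{pmatrix}
$$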

12. Find a necessary and sufficient condition on the masses m1, m2, m3 in
Exercise 11 in order that the particles remain stationary if initially at rest.

13. Three particles of mass m1, m2 and m3 respectively are firmly attached to a
light rope - one at either end and the remaining one at an intermediate point
along the rope, as shown below. The portion of the rope joining particles one
and two passes over a smooth light pulley attached to the ceiling; that joining
particles two and three passes over a similar pulley.

Find the tension in the rope and the accelerations of the particles by following
Steps 1-6 of Section 3.2. [Note that this problem differs substantially from
Exercise 11. It involves five equations in five unknowns.]
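To solve the linear equations you may assume that the matrix
$$
\begin{pmatrix}
m_1 & 0 & 0 & 1 & 0 & m_1 g \\
0 & m_2 & 0 & 1 & 1 & m_2 g \\
0 & 0 & m_3 & 0 & 1 & m_3 g \\
1 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0
\end{pmatrix}
$$
has the row echelon form
$$
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & (m_1 - m_2 + m_3)\mu \\
0 & 1 & 0 & 0 & 0 & -(m_1 - m_2 + m_3)\mu \\
0 & 0 & 1 & 0 & 0 & (m_1 - m_2 + m_3)\mu \\
0 & 0 & 0 & 1 & 0 & 2m_1 m_2\mu \\
0 & 0 & 0 & 0 & 1 & 2m_2 m_3\mu
\end{pmatrix}
$$
where $\mu = (m_1 + m_2 + m_3)^{-1} g$.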

14. (a) Refer to Example 1 in the text, and assume m1 < m2. Show that
inequalities (2) and (3) hold. To show (3) observe first that T can be
written in each of the forms
$$T = \frac{m_2 + m_2}{m_1 + m_2}\, m_1 g, \qquad T = \frac{m_1 + m_1}{m_1 + m_2}\, m_2 g.$$

(b) Show that
$$\frac{2}{T} = \frac{1}{m_1 g} + \frac{1}{m_2 g}.$$
This equation may be expressed by saying
that T is the harmonic mean of the two weights.

3.4 Symmetry
In everyday life, symmetry is seen in the patterns on wall paper, in
the design of furniture and in the architecture of great buildings. In
nature, symmetry is most evident in the crystalline structure of solids like
common salt and snowflakes. Some of the most interesting problems in
mechanics also possess symmetry and this is reflected in special properties
of the equations of motion for such systems.

Fig. 3.4.1. Example of a mechanical problem with symmetry. [Two views of the system, with coordinates x1 and x2: from the front and from the rear.]

An example of a mechanical problem with symmetry is provided by
Example 1 of Section 3.3. If you look at this system from behind the
page you find that you get exactly the same system except that the roles
of the first and second particles have been interchanged. The second
particle now appears on the left as shown in Figure 3.4.1.
The same system is observed from the front as from the rear. Thus the
same process we applied to m1 and m2 (in that order) to get the tension
T should give us the correct answer when applied to m2 and m1 (in this
order). As a consequence, our answer for T should not change when we
interchange m1 and m2.
Let us verify this. The answer actually obtained was
$$T = \frac{2m_1 m_2}{m_1 + m_2}\, g.$$
If, on the right-hand side (RHS), the letters m1 and m2 are interchanged
we get
$$\frac{2m_2 m_1}{m_2 + m_1}\, g.$$
But this is obviously equal to what we had before the interchange. Thus,
the symmetry of the original problem reflects itself in the answer for the
tension.
Similar remarks apply to the accelerations. Because of symmetry,
interchange of the subscripts `1' and `2' wherever they occur leaves the
formula valid; the formula for $\ddot{x}_1$ changes into the formula for $\ddot{x}_2$ and
vice versa.
Recognizing symmetry in a mechanical system thus leads to an additional
check on the answers. Thus, for example, the formula
$$T = \frac{2m_1 m_2}{m_1 + 2m_2}\, g$$
cannot be the correct answer to the above problem since interchange of
m1 and m2 transforms the RHS into
$$\frac{2m_1 m_2}{m_2 + 2m_1}\, g.$$
This is not equal to what we had before the interchange and so the
required symmetry is lacking.
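
The symmetry check described above can also be carried out mechanically. The following Python sketch is an illustration only and is not part of the original text: the function names and the sample masses are assumptions chosen for the example. It evaluates a candidate formula for T before and after interchanging m1 and m2, using exact rational arithmetic and units in which g = 1.

```python
from fractions import Fraction

def tension_correct(m1, m2, g):
    """Tension from Example 1 of Section 3.3: T = 2*m1*m2*g / (m1 + m2)."""
    return 2 * m1 * m2 * g / (m1 + m2)

def tension_proposed(m1, m2, g):
    """The faulty formula discussed in the text: T = 2*m1*m2*g / (m1 + 2*m2)."""
    return 2 * m1 * m2 * g / (m1 + 2 * m2)

def is_symmetric(formula, samples, g=Fraction(1)):
    """True if swapping m1 and m2 leaves the formula unchanged on every sample."""
    return all(formula(a, b, g) == formula(b, a, g) for a, b in samples)

# Hypothetical sample masses; any pairs with m1 != m2 would do.
samples = [(Fraction(1), Fraction(2)), (Fraction(3), Fraction(5)), (Fraction(7), Fraction(2))]

print(is_symmetric(tension_correct, samples))   # True
print(is_symmetric(tension_proposed, samples))  # False
```

Passing this test on a few samples does not prove that a formula is correct; failing it on any sample, however, shows that the formula cannot be right for a symmetric system.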
In more advanced courses, symmetry can occur in mechanical systems
in quite complicated ways. The systematic study of symmetry was closely
associated with the rise of modern algebra, particularly group theory.

Exercises 3.4
1. Is the mechanical system of Example 2 of Section 3.3 symmetric with respect
to the first and second particles? Do you expect the formula for T to stay the
same when m1 and m2 are interchanged? Verify your answer by carrying out this
interchange in the formula for T.
2. Is the mechanical system of Exercise 3.3.11 symmetric with respect to the first
and third particles? Do you expect the formula for T to stay the same when m1
and m3 are interchanged? Verify your answer by carrying out this interchange in
the formula for T.
4
Friction

Together with gravity, friction forces are the ones which play the biggest
role in shaping everyday life. Without friction, we would be unable to
drive our cars, to walk, or even to hold our pens. The simple laws
of friction on which our model is based seem to have been first stated
by Leonardo da Vinci (1452-1519), who wrote prolifically about lots of
things.
Although these laws for friction are very simple, they provide useful
estimates and qualitative predictions for a wide range of behaviour as-
sociated with friction. More sophisticated models are sometimes used,
however, in specialized areas (such as the design of bearings in engineer-
ing).

4.1 Coefficients of friction

Friction forces arise as a result of contact between two surfaces. A good
way to get a feeling for these forces is to experiment with the two surfaces
consisting of the top of the table and the palm of your hand.
An instructive experiment is to rest your hand on the table and then
exert a gentle forwards pressure, but not enough to move your hand.
You should then be able to feel the backwards reaction from the table
opposing your forwards push. Now gradually increase the forwards
pressure. The backwards reaction force builds up to a maximum. After
this your hand slides forwards. The backwards reaction force which
opposes your push in the forwards direction arises from the friction
between the palm of your hand and the table.
In order to explain our model of friction, it is convenient to replace
your hand in the above experiment by a block of some solid substance.
It is free to move on a plane surface made of a second solid substance.

Fig. 4.1.1. Static friction F opposes a pushing force P.

The forces acting on the block which act parallel to the plane are shown
in Figure 4.1.1, where right has been chosen as the positive direction.
The pushing force is denoted by P (where P < 0), while the friction
force which opposes it is denoted by F (where F > 0). In the diagram it
is assumed that the friction force acts in the positive direction while the
pushing force acts in the opposite direction. While the block is at rest
(or moving with uniform velocity), Newton's second law gives P + F = 0
and hence F = -P.
Thus, as |P| increases from zero, |F| increases by an equal amount until
it reaches a maximum value, at which stage the block begins to slide in
the direction of the push. The maximum magnitude of the friction force
will be denoted by Fmax.

The magnitude of the friction force depends on how hard one pushes the
block and can assume any value between 0 and Fmax.
The value of Fmax depends on how hard the two surfaces are pressed
together. It is harder to make a heavy load slide than a lighter one. To
measure the extent to which two surfaces are pressed together, we note
that, by Newton's third law, the block and the plane exert equal and
opposite forces on each other in the direction normal (perpendicular) to
the plane. These forces are illustrated in Figure 4.1.2. Their common
magnitude N is called the normal reaction between the plane and the
block.
In the model we adopt for friction, it is assumed that the maximum
friction is a linear function of the normal reaction.
Law of static friction. For a given pair of substances (for the surfaces in
contact) there is a constant µs > 0 such that
$$F_{\max} = \mu_s N.$$

Note that Fmax and N are both positive quantities since they are defined as
magnitudes.
Fig. 4.1.2. Normal reaction.

The dimensionless constant µs is called the coefficient of static friction
between the two surfaces. It is assumed to be independent of the shape
of the block and of the area of the portion which maintains contact. It
depends only on the particular substances of which the block and plane
are composed.
The law of static friction is essentially a pair of inequalities
$$-\mu_s N \le F \le \mu_s N$$
for the friction force F which acts when the block is stationary, relative
to the surface. If no other force is applied in a direction along the plane,
the friction force will be zero. When a pushing force is applied, however,
the friction force acts in the opposite direction and assumes the extreme
values only if the push is hard enough.
The reason for the adjective `static' in the description of the above
coefficient is as follows. As soon as the block begins to move, the
magnitude of the friction force drops to a value Fkin which is smaller
than Fmax. In the model we adopt, Fkin is assumed to be a linear function
of the normal reaction.
Law of kinetic friction. For a given pair of substances there is a constant
µk > 0 such that
$$F_{\text{kin}} = \mu_k N.$$

The dimensionless constant µk is called the coefficient of kinetic friction
for the two substances. Sometimes it is called the coefficient of dynamic
friction and sometimes the coefficient of sliding friction.
When the block is sliding over the plane, the friction force is given by
one of the two equalities
$$F = \mu_k N \quad\text{or}\quad F = -\mu_k N,$$
its direction being opposite to that of the velocity of the block, as
illustrated in Figure 4.1.3.

Fig. 4.1.3. Kinetic friction opposes the motion.

Fig. 4.1.4. On a horizontal surface the normal reaction force is opposite to gravity.

For the same pair of substances, the following inequality holds between
the two coefficients of friction.
$$\mu_k < \mu_s.$$

Some typical values are shown in the following table.

Table 4.1.1. Values of coefficient of friction.

Pair of substances        µk      µs
Steel on ice              0.06    0.1
Rubber on dry concrete    0.7     1.0

This table makes it clear why you should never brake your car so hard
that it begins to slide: the friction force is reduced to about 70% of its
value prior to sliding.
Applying the laws of friction involves calculating the normal reaction
between the block and the surface. In the special case in which the surface
is horizontal, this is particularly easy since both the normal reaction and
gravity act vertically, as in Figure 4.1.4. The net force on the block in
the vertically upwards direction is N - mg where m is the mass of the
block. The surface being fixed, the vertical acceleration of the block is
zero. Hence N - mg = 0 and so N = mg.
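
The two laws of friction can be combined into a single rule for the signed friction force on the block. The following Python sketch is one way of writing that rule down; it is an illustration rather than part of the original text. The function name and argument names are hypothetical, and one modelling choice is an assumption: a block at rest that is pushed harder than Fmax is treated as already sliding, so the kinetic value is returned.

```python
def friction_force(push, velocity, normal, mu_s, mu_k):
    """Signed friction force on the block, parallel to the plane.

    push     -- applied force along the plane (signed)
    velocity -- velocity of the block along the plane (signed)
    normal   -- magnitude N of the normal reaction
    mu_s     -- coefficient of static friction
    mu_k     -- coefficient of kinetic friction
    """
    if velocity != 0:
        # Sliding: magnitude mu_k * N, direction opposite to the velocity.
        return -mu_k * normal if velocity > 0 else mu_k * normal
    f_max = mu_s * normal
    if abs(push) <= f_max:
        # At rest and the push does not exceed Fmax: friction balances it.
        return -push
    # Assumption: the push exceeds Fmax, so the block starts to slide in the
    # direction of the push and the kinetic value applies.
    return -mu_k * normal if push > 0 else mu_k * normal


# Rubber on dry concrete (Table 4.1.1) with a normal reaction of 100 N.
N = 100.0
print(friction_force(-50.0, 0.0, N, mu_s=1.0, mu_k=0.7))   # gentle push: friction balances it
print(friction_force(-120.0, 0.0, N, mu_s=1.0, mu_k=0.7))  # hard push: block slides
```

The first call corresponds to the static case with F = -P, and the second to a push larger than Fmax, where the magnitude of the friction force falls to µk N.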

Exercises 4.1

1. A block of mass 3 kg lies on a horizontal table. The coefficients of friction

between the block and the table are given by
µs = 0.3 and µk = 0.2.
State the direction of the friction force acting on the block in each of the following
cases and state its magnitude.
(a) The block was given a push and is now moving to the right.
(b) The block is at rest but pressure is being exerted on it and it is on the
point of moving to the left.

2. A car of mass 2600 kg is parked across a driveway with the brakes on and
the owner has lost his keys. What is the magnitude of the force required to
(a) just start the car sliding?
(b) keep it sliding, once in motion?
Use the coefficients of friction given in the text.
3. Discuss the following statement (in which µ denotes the coefficient of static
friction).
Values for µ depend on the materials in contact and the state of the surfaces,
and range from about 0.04 for ski wax on dry snow, through 0.4 for brake lining
on cast iron and 1 for rubber on a hard dry road, to values considerably greater
than 1 for very wide drag-racing tyres.

4.2 Further applications

The following examples will illustrate how the coefficients of friction may
be used to study the motion of a block sliding over a horizontal surface.
The coefficient of static friction µs can be used to decide whether the
force applied to the block is enough to set it in motion. Once the block is
moving, however, the coefficient of kinetic friction µk is used to determine
the magnitude of the friction force, which acts in the opposite direction
to the velocity.
The steps used to get the equations of motion for the block are similar
to those used in Section 3.2: introduce a coordinate for the block (now
regarded as a particle), draw a diagram showing the forces acting on the
block and then apply Newton's second law of motion. Only those problems
leading to constant acceleration will be considered in this section, so that
no new methods will be needed to solve the differential equations.

Fig. 4.2.1. Coordinates for Example 1.

Example 1. A block of mass m > 0 is given a sufficiently hard push along a
horizontal surface to set it in motion, with an initial speed of v. The coefficient of
kinetic friction between the block and surface is µ > 0. Show that the block comes
to rest after a time $v/(\mu g)$, having covered a distance $\tfrac{1}{2}v^2/(\mu g)$.

Solution. Let x be the displacement of the block from the initial point O at time
t, with the positive direction for x chosen to be the direction of the velocity of the
block.
The friction force F acts in the negative direction for the coordinate x, as shown
in Figure 4.2.1. Hence
$$F = -\mu N$$
where N is the normal reaction between the plane and the block. Since the plane is
horizontal, N = mg. Hence
$$F = -\mu m g.$$
By Newton's second law,
$$m\ddot{x} = F;$$
hence the equation of motion is
$$\ddot{x} = -\mu g. \qquad (1)$$
The initial conditions are x = 0 and $\dot{x} = v$ when t = 0; hence by
antidifferentiation (1) is equivalent to
$$\dot{x} = -\mu g\, t + c \quad\text{where } c = v \qquad (2)$$
$$x = -\tfrac{1}{2}\mu g\, t^2 + v t + d \quad\text{where } d = 0. \qquad (3)$$
The block comes to rest when $\dot{x} = 0$; hence by (2) when
$$t = v/(\mu g).$$
Its distance from the initial point O at this time is now obtained from (3) as
$$x = -\tfrac{1}{2}\mu g\left(\frac{v}{\mu g}\right)^2 + v\left(\frac{v}{\mu g}\right) = \frac{1}{2}\,\frac{v^2}{\mu g}$$
which is the required distance.

Fig. 4.2.2. Coordinates for Example 2.

As a check on these answers note that both the time and distance
taken for the block to stop increase with the initial speed v, but decrease
as the coefficient of friction µ increases. The dimensions of the answers are correct
also.
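
A further check on Example 1 is to compare the formulas with a direct step-by-step calculation of the motion. The Python sketch below is an illustration only: the function names, the step size and the sample values v = 2 m/s and µ = 0.5 are assumptions made for the example. It advances the block under the constant acceleration -µg until the velocity reaches zero and compares the result with v/(µg) and v²/(2µg).

```python
def stopping_time(v, mu, g=9.8):
    """Time for the block to come to rest: v / (mu * g)."""
    return v / (mu * g)

def stopping_distance(v, mu, g=9.8):
    """Distance covered before the block stops: v**2 / (2 * mu * g)."""
    return 0.5 * v**2 / (mu * g)

def simulate(v, mu, g=9.8, dt=1e-4):
    """Step the motion forward from x = 0 with initial speed v until the block stops.

    The acceleration is the constant -mu*g, so each position update is exact
    over its step; the only error comes from overshooting the stopping
    instant by less than one step dt.
    """
    t, x, speed = 0.0, 0.0, v
    a = -mu * g
    while speed > 0:
        x += speed * dt + 0.5 * a * dt**2
        speed += a * dt
        t += dt
    return t, x

t_sim, x_sim = simulate(2.0, 0.5)
print(t_sim, stopping_time(2.0, 0.5))
print(x_sim, stopping_distance(2.0, 0.5))
```

The simulated and predicted values should agree to within about one time step. Doubling v in the formulas doubles the stopping time and quadruples the stopping distance, in line with the check described in the text.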

Example 2. A block of mass 1 kg lies on a table and is attached to one end of a

light string. The string passes over a smooth light pulley at the edge of the table
and supports a block of mass 2 kg hanging down over the edge of the table. The
coefficient of kinetic friction between the block and the table is 0.1.
Find the acceleration of the block on the table when it is projected (a) away
from the pulley, and (b) towards the pulley, assuming the string remains taut.
Solution. Let x metres be the distance of the block on the table from the pulley, at
time t seconds, and let y metres be the height above ground level of the suspended
block, as in Figure 4.2.2. To any change in x will correspond an equal change in
y, hence $\dot{x} = \dot{y}$ always and so
$$\ddot{x} = \ddot{y}. \qquad (1)$$
The tension in the string will be denoted by T newton.

Case (a) : the block on the table is projected away from the pulley.
The normal reaction N newton between the block and the table exactly balances
the weight of the block so
N = 1g.
The remaining forces on the blocks are shown in Figure 4.2.3, where F newton is
the friction force opposing the motion of the block. Since the friction acts in the
opposite direction to the velocity, F acts in the negative x-direction and so F is
negative. Hence
$$F = -0.1N = -0.1g.$$
An application of Newton's second law to each block in turn now gives
$$\ddot{x} = F - T = -0.1g - T \qquad (2)$$
$$2\ddot{y} = T - 2g. \qquad (3)$$

Fig. 4.2.3. Forces on each mass.

Solution of the simultaneous linear equations (1), (2) and (3) for $\ddot{x}$, $\ddot{y}$ and T gives
$$\ddot{x} = \ddot{y} = -0.7g, \qquad T = 0.6g.$$
Thus the block on the table has an acceleration of 0.7g m/s² towards the pulley
and the tension in the string is 0.6g newton.
Case (b): the block on the table is projected towards the pulley. The direction of
the friction force F is now reversed. Thus F acts in the positive x-direction and so
F = 0.1g. This leads to the answers
$$\ddot{x} = \ddot{y} = -0.63g, \qquad T = 0.73g.$$
Thus the block on the table has an acceleration of 0.63g m/s² towards the pulley
and the tension in the string is 0.73g newton.

The above answers are valid while the blocks continue to move in the
original directions of projection. They will cease to be valid when the
velocity of the block on the table becomes zero or when the suspended
mass hits the ground.
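
The simultaneous equations in Example 2 can be checked by machine. The following Python sketch is an illustration, not the method of the text: it uses a small Gauss-Jordan elimination routine written for the purpose (the names `solve` and `example2` are hypothetical) with exact fractions, in units where g = 1, so the answers are multiples of g.

```python
from fractions import Fraction

def solve(aug):
    """Gauss-Jordan elimination on a square system given as an augmented matrix.

    Assumes the system has a unique solution; returns the solution column.
    """
    n = len(aug)
    rows = [row[:] for row in aug]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [entry / lead for entry in rows[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [e - factor * p for e, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def example2(friction_sign):
    """Unknowns (x'', y'', T) for Example 2, in units where g = 1."""
    F = friction_sign * Fraction(1, 10)  # F = -0.1g in case (a), +0.1g in case (b)
    aug = [
        [Fraction(1), Fraction(-1), Fraction(0), Fraction(0)],    # (1) x'' - y'' = 0
        [Fraction(1), Fraction(0), Fraction(1), F],               # (2) x'' + T = F
        [Fraction(0), Fraction(2), Fraction(-1), Fraction(-2)],   # (3) 2y'' - T = -2g
    ]
    return solve(aug)

print([str(v) for v in example2(-1)])  # ['-7/10', '-7/10', '3/5']
print([str(v) for v in example2(+1)])  # ['-19/30', '-19/30', '11/15']
```

The exact values -19/30 and 11/15 in case (b) round to the -0.63 and 0.73 quoted in the text.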

Exercises 4.2
1. Repeat the solution to Example 1 in the text, but this time choose the positive
direction for the x-coordinate to be opposite to the direction of the velocity.
2. Verify that the answers obtained in Example 1 in the text, for the time and
distance needed by the block to come to rest, have the correct dimensions.
3. Suppose that in Example 2 in the text, the suspended block is given an initial
downwards velocity of 1 m/s. By solving a suitable differential equation find how
long it takes for the block to descend i metre.
4. As a generalization of Example 2 in the text, suppose that the block on the
table has mass m1 > 0, the suspended block has mass m2 > 0
and the coefficient of friction is µ > 0. Show that
(a) when the block on the table is projected away from the pulley its
acceleration is
$$\ddot{x} = \frac{-\mu m_1 - m_2}{m_1 + m_2}\, g,$$
(b) when it is projected towards the pulley its acceleration is
$$\ddot{x} = \frac{\mu m_1 - m_2}{m_1 + m_2}\, g.$$

5. Apply appropriate checks to the answers given in Exercise 4.

6. Suppose that in Example 1 in the text, the answer for the time the block
takes to come to rest had been given as vµ/g. How could you tell this was wrong
(short of working out the correct answer)?

4.3 Why does the wheel work?

In particle mechanics, the modelling process reduces complicated bodies
such as cars, space ships and planets to single points. The study of the
wheel reveals the limitations of this procedure, and to explain why the
wheel works it will be necessary to go outside particle mechanics, at least
temporarily. Instead of modelling both a block and a wheel as single
points, it will be necessary to take into account the basic differences in
their geometry.
The wheel was discovered at some unknown time very early in the
history of civilization. Stylish wheels complete with spokes appear
in pictorial records of horse-drawn chariots from ancient Egypt and
Mesopotamia. More primitive looking wheels, solid without spokes,
appear on chariots belonging to the Philistines. At the other extreme,
the American Indians were apparently unaware of the wheel before the
arrival of the Europeans.
The discovery of the wheel is widely regarded as one of mankind's
great technological advances. The reason why the wheel works, however,
does not appear to be so widely understood.
First, one must understand that the wheel is basically a device for
by-passing friction. In a frictionless world, wheels would be unnecessary
because the heaviest load could be set in motion by the slightest push.
Once started, it could maintain its velocity on a horizontal plane without
further pushing. The smallness of the coefficient of friction between wood
and ice provides an approximation to this ideal situation and has enabled
Inuit (Eskimos) to transport things efficiently by sledge, without the use
of wheels.
Fig. 4.3.1. The instantaneous point of contact on a rolling wheel. [This point is at rest instantaneously.]

Fig. 4.3.2. Possible stationary frictional forces for a wheel.

Second, the geometry of the circle makes it possible for a wheel to
move along a road by rolling. When this happens, at each instant
the points of the wheel in contact with the road are stationary,
as illustrated in Figure 4.3.1. Because the points of the wheel in contact
with the road are instantaneously at rest, no slipping occurs. Hence it
seems reasonable to try to explain the friction force acting on the wheel
by using our model for stationary friction, rather than that for kinetic
friction.
This model of friction permits the friction force F between the wheel
and the road to have any value consistent with the inequalities
$$-\mu_s N \le F \le \mu_s N$$
(where µs is the coefficient of static friction between the wheel and the
road, and N is the normal reaction). Thus, the model for stationary
friction permits the friction force to act in either direction and does not
exclude the possibility of its being zero. Some possibilities are illustrated
in Figure 4.3.2.
By way of contrast, note that to explain the friction forces on a sledge
being dragged along a road we would need to use the model for kinetic
friction, because slipping occurs between the road and the sledge. This
model permits only one value for the magnitude of the friction force,
namely µk N, and so precludes it from having the value zero.

Exercises 4.3
1. Discuss the following statement.
The frictional force that opposes one body rolling over another is much less than
that for a sliding motion and this, indeed, is the advantage of the wheel over
the sledge. This reduced friction is due in large part to the fact that, in rolling,
the microscopic welds are `peeled' apart rather than `sheared' apart as in sliding
friction. This may reduce the frictional force by as much as 1000-fold.
[This quote is taken from Halliday and Resnick (1974), pages 80 and 81.]
5
Differential equations : linearity and SHM

The differential equations which arise in Chapter 6, where the motion
produced by springs is studied, are of a different type from those encountered
so far. In particular, they can no longer be solved by the
simple process of antidifferentiation. They are called the simple
harmonic motion (SHM) equations. The goal of this chapter is to
provide practice at recognizing SHM equations and writing down their
solutions.
SHM equations are autonomous and linear. These important proper-
ties make it very easy to find all the solutions of the SHM equations.
This chapter assumes familiarity with the sine and cosine functions,
including their graphs and their derivatives. It does not depend on
Chapter 3 or 4, however, and may be started immediately after Chapter
2.

5.1 Guessing solutions

The differential equations in previous chapters have been of the form
$$\ddot{x} = f(t) \qquad (1)$$
where f(t) is a known function of t. Such differential equations may be
solved by antidifferentiating both sides with respect to t.
To extend the idea of a differential equation, we now permit the
right-hand side to be a known function of x to get, say,
$$\ddot{x} = g(x) \qquad (2)$$
where g(x) is a known function of x. It is no longer possible to solve
such differential equations by antidifferentiation. The trouble is that the
right-hand side of (2), unlike that of (1), involves the unknown function
$x = \phi(t)$. You can't antidifferentiate a function if you don't even know
which function it is!
Thus we are forced to look for a different way of solving differential
equations of the form (2). A crude, but none the less effective, way is to
guess a solution, then substitute it back into the differential equation to see
if it works. Even though your first guess may be not completely correct,
it will often be easy to see how to modify it to give a correct solution.
Constant solutions are particularly easy to guess.

Example 1. Find two solutions of the differential equation
$$\ddot{x} = x(x - 1). \qquad (3)$$
Solution. If x is a constant function of t then
$$\text{LHS} = \ddot{x} = 0$$
while
$$\text{RHS} = x(x - 1).$$
The two sides will thus be equal if the constant x satisfies the algebraic equation
x(x - 1) = 0. Thus the differential equation has the two constant solutions
x = 0 and x = 1.

A useful aid to guessing solutions of differential equations is familiarity
with the derivatives of the standard elementary functions. For example,
the successive derivatives of sin and cos are shown in the following table:
$$
\begin{array}{ll}
x = \sin(t) & x = \cos(t) \\
\dot{x} = \cos(t) & \dot{x} = -\sin(t) \\
\ddot{x} = -\sin(t) & \ddot{x} = -\cos(t)
\end{array}
$$
Repeated differentiation of either sin or cos produces the negative of the
original function and then, later, the original function itself.
The differential equation in the following example has the constant
solution in which x = 0 for all values of t. The above table helps us guess
two other solutions.

Example 2. Find two solutions, other than the constant solution, of the differential
equation
$$\ddot{x} = -x.$$
Solution. The second derivative is to be the negative of the function, suggesting sin
and cos. Choose x = sin(t), to get
$$\text{LHS} = \ddot{x} = -\sin(t)$$
$$\text{RHS} = -x = -\sin(t).$$
Thus x = sin(t) is a solution. That x = cos(t) is a solution may be checked similarly.
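
The guess-and-substitute procedure can be imitated numerically. The following Python sketch is an illustration only (the function names, step size and sample times are assumptions): it estimates the second derivative of a trial function by a central difference and reports how far LHS - RHS is from zero for the differential equation of Example 2. A small value is consistent with the guess being a solution; a large value shows that it is not.

```python
import math

def second_derivative(f, t, h=1e-4):
    """Central-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

def residual(f, t):
    """LHS - RHS of the equation x'' = -x for the trial solution x = f(t)."""
    return second_derivative(f, t) - (-f(t))

def worst_residual(f):
    """Largest |LHS - RHS| over the sample times t = 0, 0.1, ..., 4.9."""
    return max(abs(residual(f, 0.1 * k)) for k in range(50))

for name, f in [("sin", math.sin), ("cos", math.cos), ("exp", math.exp)]:
    print(name, worst_residual(f))
```

For sin and cos the printed values are close to zero, limited only by rounding in the difference quotient, while for the exponential function they are not. A numerical check of this kind supports a guess but does not replace the substitution carried out in the example.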

Along with sin and cos, the exponential function figures prominently
in the solution of many commonly occurring differential equations.
Successive differentiations give the following table:
$$
\begin{array}{ll}
x = e^{t} & x = e^{-t} \\
\dot{x} = e^{t} & \dot{x} = -e^{-t} \\
\ddot{x} = e^{t} & \ddot{x} = e^{-t}
\end{array}
$$
In each of the above cases, two differentiations give back the original
function. This enables us to guess the answer to the following problem.

Example 3. Find two non-constant solutions of the differential equation
$$\ddot{x} = x.$$
Solution. The second derivative is to equal the original function. Choose $x = e^{t}$ and
substitute into the differential equation to get
$$\text{LHS} = \ddot{x} = e^{t}$$
$$\text{RHS} = x = e^{t}.$$
Thus $x = e^{t}$ is a solution. That $x = e^{-t}$ is also a solution may be checked similarly.

Further useful tables of derivatives may be obtained by first scaling the
time t by a constant factor. For example, the chain rule for differentiating
composite functions gives the following table:
$$
\begin{array}{l}
x = \cos(2t) \\
\dot{x} = -2\sin(2t) \\
\ddot{x} = -4\cos(2t)
\end{array}
$$
Such results greatly extend the number of differential equations for which
we can guess non-constant solutions.

Exercises 5.1
In each case guess the required solutions and then substitute back into the differ-
ential equation. Work out both LHS and RHS and check whether they are equal.
1. In each case find all the constant solutions of the differential equation.
(a) x = x(x + 1) (b) X + x2 = 1

(c) X = X (d) z+33c+2x=4

2. Find a non-constant solution in each case.
(a) $\dot{x} = x$ (b) $\dot{x} + x = 0$
(c) $\dot{x} = 2x$ (d) $\dot{x} = -3x$
3. Find two non-constant solutions in each case.
(a) $\ddot{x} + x = 0$ (b) $\ddot{x} - x = 0$
(c) $\ddot{x} = -4x$ (d) $\ddot{x} = 4x$
(e) $\ddot{x} = -9x$ (f) $\ddot{x} + 16x = 0$
4. Let ω be a fixed real number. Find two non-constant solutions for the
differential equation
$$\ddot{x} = -\omega^2 x$$
in the cases
(a) ω = 0 (b) ω ≠ 0
5. Let $x^{(iv)}$ denote the fourth derivative of x with respect to t. In each case give
five solutions of the differential equation.
(a) $x^{(iv)} = x$ (b) $x^{(iv)} = 16x$

5.2 How many solutions?

In the previous section a number of solutions of various differential
equations were found by guessing. There may clearly be other solutions
besides those that were guessed. How can we tell when we have found
all the solutions? To answer this question it is necessary to introduce the
idea of the order of a differential equation.
A first-order differential equation is one which expresses the first
derivative $\dot{x}$ in terms of t and x. It may thus be written in the form
$$\dot{x} = F(t, x) \qquad (1)$$
where F(t, x) denotes a known function of t and x (for example,
$F(t, x) = tx^2 + x$).
A second-order differential equation is one which expresses the second
derivative $\ddot{x}$ in terms of t, x and $\dot{x}$. It may thus be written in the form
$$\ddot{x} = G(t, x, \dot{x}) \qquad (2)$$
where $G(t, x, \dot{x})$ denotes a known function of t, x and $\dot{x}$.

Third- and higher-order differential equations are defined in a similar
way.

Example 1. Each of the differential equations
$$\dot{x} = t^3, \qquad \dot{x} = x^3, \qquad \dot{x} = x + 5t$$
is of first order, while each of the differential equations
$$\ddot{x} = t^3, \qquad \ddot{x} = x^3, \qquad \ddot{x} = \dot{x} + \cos(x)\,t^4$$
is of second order.

Most of the differential equations which arise in practice (or rather,

the functions which define their right-hand sides such as F and G above)
have a property known as smoothness. In first-year calculus, examples are
given of functions which fail to have a derivative at some point of their
domains, or which fail to be continuous. Roughly speaking, a smooth
function is one which does not behave in such nasty ways. In courses on
advanced calculus the concept of smoothness is given a precise definition,
applicable to functions of several variables. For the purpose of this book,
however, it is sufficient for you to know that the differential equations
we shall use have all been prechecked for smoothness. Each solution of a
differential equation will be assumed to have an interval for its domain,
which is chosen to be as large as possible.
The question as to how many solutions a differential equation has is
given a precise answer by the following theorem.
Theorem (Existence-Uniqueness). Let $t_0$ and $x_0$ be any real numbers. If the
function F is sufficiently smooth then the first-order differential equation
(1) has a unique solution $x = \phi(t)$ satisfying the initial condition
$$x = x_0 \quad\text{when } t = t_0.$$
If, furthermore, $\dot{x}_0$ is any real number and the function G is sufficiently
smooth then the second-order differential equation (2) has a unique solution
satisfying the initial conditions
$$x = x_0 \quad\text{and}\quad \dot{x} = \dot{x}_0 \quad\text{when } t = t_0.$$

Similar results hold for differential equations of third and higher

orders, the number of initial conditions being equal to the order of the
differential equation. Although the proof of this theorem lies outside the
scope of this book, this should not prove an impediment to those who
merely wish to use the theorem.

Example 2. The existence-uniqueness theorem implies that the second-order
differential equation
$$\ddot{x} = -x$$
has a solution $x = \phi(t)$ satisfying the initial conditions
$$x = 1 \quad\text{and}\quad \dot{x} = 1 \quad\text{when } t = 0.$$

Since none of the solutions of $\ddot{x} = -x$ found in Section 5.1 satisfies
these initial conditions, there must be more solutions than those already
guessed. One way of getting new solutions will now be discussed.
A differential equation is said to be autonomous if it does not involve
the time t explicitly on the RHS. For example, each of the differential
equations
$$\dot{x} = x, \qquad \ddot{x} = x^2 + x, \qquad \ddot{x} = 0$$
is autonomous, whereas none of the differential equations
$$\dot{x} = t, \qquad \dot{x} = tx, \qquad \ddot{x} = x + t^2$$
is autonomous.
Theorem (Phase-Shift). If $x = \phi(t)$ is a solution of an autonomous
differential equation and c is any real number, then $x = \phi(t + c)$ is also a
solution.
Although the proof of this theorem is omitted, it involves little more
than an application of the chain rule for differentiating composite functions.
To see a typical application of the theorem, recall from Example 2
in Section 5.1 that the autonomous differential equation
$$\ddot{x} = -x$$
has the solution x = sin(t). From the theorem it therefore follows that,
for each c,
$$x = \sin(t + c)$$
is also a solution. (You may and should also verify this by direct
substitution in the differential equation.) Thus the theorem has enabled
us to get infinitely many solutions from one solution: not a bad trick!
As yet, however, we have not obtained all the solutions.
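
The chain-rule argument behind the phase-shift theorem can be sketched as follows for a second-order autonomous equation. This outline is an addition for illustration, not the authors' proof; the symbol $\psi$ and the form $\ddot{x} = G(x, \dot{x})$ (with no explicit t on the right-hand side) are notation assumed for the sketch. Suppose $x = \phi(t)$ is a solution and put $\psi(t) = \phi(t + c)$.

```latex
\begin{align*}
\dot{\psi}(t)  &= \dot{\phi}(t + c)\cdot\frac{d}{dt}(t + c) = \dot{\phi}(t + c), \\
\ddot{\psi}(t) &= \ddot{\phi}(t + c)
                = G\bigl(\phi(t + c),\, \dot{\phi}(t + c)\bigr)
                = G\bigl(\psi(t),\, \dot{\psi}(t)\bigr).
\end{align*}
```

The middle equality in the second line uses the fact that $\phi$ satisfies the differential equation at the time $t + c$. It is here that autonomy matters: if G also depended explicitly on t, the right-hand side would be evaluated at $t + c$ rather than at t, and the argument would break down.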

Exercises 5.2
1. In each case give the order of the differential equation and say whether the
differential equation is autonomous.
(a) x=tx3 (b) X= X4

(c) x=cos(x) (d) X+2ac+x=t3

2. Recall from Example 3 of Section 5.1 that $x = e^{t}$ is a solution of the
differential equation
$$\ddot{x} = x.$$
Now use the phase-shift theorem to show that, for each c > 0, $x = ce^{t}$ is also a
solution. Is this also a solution for c < 0?

3. (a) Verify, by direct substitution, that the differential equation
$\dot{x} = -x^2$
has the solution $x = 1/t$, defined on the interval for which t > 0. Sketch
the graph of the solution.
(b) State why the phase-shift theorem is applicable and hence give infinitely
many solutions of the differential equation.
(c) Hence find the solution satisfying the initial condition x = 1 when t = 0.

5.3 Linearity
A differential equation of the second order is said to be linear if it can
be written in the form

$\ddot{x} = f(t)\dot{x} + g(t)x + h(t)$

where f(t), g(t) and h(t) are known functions of t, which may be constants
and may be zero. Thus the unknown function and its derivatives occur in
a linear way. Linear differential equations are usually written with all the
terms involving the unknown function transferred to the left-hand side
to give
$\ddot{x} - f(t)\dot{x} - g(t)x = h(t)$. (1)

The function h(t) is then referred to as the right-hand side of the
differential equation. A linear differential equation is said to be homogeneous
if its right-hand side is the zero function so that it can be written
$\ddot{x} - f(t)\dot{x} - g(t)x = 0$. (2)

The homogeneous linear differential equation (2) is called the
homogenized version of (1).

Table 5.3.1. Examples of differential equations.

Differential equation            Order  Linear  Homogeneous  Homogenized version
$\ddot{x} + 2\dot{x} + x = 1$    2      Yes     No           $\ddot{x} + 2\dot{x} + x = 0$
$\ddot{x} = -x$                  2      Yes     Yes
$\ddot{x} = t$                   2      Yes     No           $\ddot{x} = 0$
$\ddot{x} = x^2 + t$             2      No
$\ddot{x} = t^2 + x$             2      Yes     No           $\ddot{x} = x$
$\dot{x} + \cos(t)x = 0$         1      Yes     Yes
$\dot{x} + \cos(x)t = 0$         1      No

Similar definitions apply to differential equations of other orders. Thus
a first-order differential equation is linear if it can be written as
$\dot{x} - g(t)x = h(t)$
and the homogenized version has h(t) replaced by zero. Table 5.3.1
illustrates the above definitions.
Since the terms `homogeneous' and `homogenized version' refer only
to linear equations, the above spaces have been left blank for the non-
linear ones. The following theorem shows how we can write down lots
of solutions of a linear homogeneous differential equation. The proof is
omitted.
Theorem (Homogeneous Superposition). If $x = \phi_1(t)$ and $x = \phi_2(t)$ are
solutions of a linear homogeneous differential equation then
$x = c_1\phi_1(t) + c_2\phi_2(t)$
is also a solution for each $c_1 \in \mathbb{R}$ and $c_2 \in \mathbb{R}$.
For a second-order differential equation which is homogeneous linear,
it can be shown that the above formula gives all the solutions provided
that neither of the pair $\phi_1$ and $\phi_2$ is a constant multiple of the
other. A pair of functions satisfying this condition is said to be linearly
independent.
For a first-order differential equation which is homogeneous linear,
things are a bit simpler. If $x = \phi_1(t)$ is any solution of the differential
equation, not identically zero, then all solutions are given by
$x = c_1\phi_1(t)$
for some arbitrary constant $c_1 \in \mathbb{R}$.

Example 1. Find all solutions of the differential equation

$\ddot{x} + x = 0$.

Solution. This differential equation is linear and homogeneous. From Section 5.1, it
has the solutions $x = \cos(t)$ and $x = \sin(t)$. Hence, by the homogeneous
superposition theorem, for each choice of the constants $c_1$ and $c_2$,
$x = c_1\cos(t) + c_2\sin(t)$
is a solution. Since the differential equation is also second order, and cos and sin
are linearly independent, it follows that this formula gives all the solutions.

The final theorem gives a way to solve linear differential equations
which are not necessarily homogeneous. The proof is omitted.
Theorem (Inhomogeneous Superposition). Let $x = \phi_P(t)$ be a particular
solution of a linear differential equation. If $x = \phi_H(t)$ is any solution of
the homogenized differential equation, then

$x = \phi_P(t) + \phi_H(t)$
is a solution of the original linear differential equation. Each solution of
the differential equation can be obtained in this way, moreover, by suitable
choice of $\phi_H$.
For a second-order linear differential equation, it follows that, if $\phi_1$
and $\phi_2$ is a linearly independent pair of solutions of the homogenized
equation, then every solution of the original equation can be written as

$x = \phi_P(t) + c_1\phi_1(t) + c_2\phi_2(t)$

for some arbitrary constants $c_1$ and $c_2 \in \mathbb{R}$.

Similarly, for a first-order differential equation, if $\phi_1$ is a non-zero
solution of the homogenized equation, then every solution of the equation
can be written as
$x = \phi_P(t) + c_1\phi_1(t)$
for some arbitrary constant $c_1 \in \mathbb{R}$.
The following example illustrates how this theorem can be used to
solve linear differential equations.

Example 2. Find all solutions of the differential equation

$\ddot{x} + x = 1$.

Solution. This differential equation is linear, but not homogeneous. Hence the
inhomogeneous superposition theorem is the one to use.
To guess a particular solution, try a constant so the second derivative is zero.
This suggests the choice
$x = 1$ (3)

which gives
LHS = $\ddot{x} + x = 0 + 1 = 1$
RHS = 1.
Thus the guess is correct and (3) gives a particular solution.
Next, the homogenized linear differential equation is $\ddot{x} + x = 0$, which was
solved in Example 1 of Section 5.3, each solution being given by
$x = c_1\cos(t) + c_2\sin(t)$ (4)
for suitable $c_1$ and $c_2 \in \mathbb{R}$.
Finally, the sum of the particular solution (3) and the solution (4) of the
homogenized equation gives the solution
$x = 1 + c_1\cos(t) + c_2\sin(t)$.
In view of the inhomogeneous superposition theorem, each solution is given by this
formula for a suitable choice of $c_1$ and $c_2$.
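As a numerical cross-check of Example 2, the sketch below (an illustration added here, with hypothetical helper names and arbitrarily chosen constants) substitutes x = 1 + c1 cos(t) + c2 sin(t) into the left-hand side x'' + x, using a central difference for the second derivative, and compares the result with the right-hand side 1.

```python
import math

def x(t, c1, c2):
    """Candidate solution x = 1 + c1*cos(t) + c2*sin(t)."""
    return 1.0 + c1 * math.cos(t) + c2 * math.sin(t)

def lhs(t, c1, c2, h=1e-3):
    """Estimate x'' + x, with a central difference standing in for x''."""
    xdd = (x(t + h, c1, c2) - 2.0 * x(t, c1, c2) + x(t - h, c1, c2)) / (h * h)
    return xdd + x(t, c1, c2)

# Largest deviation of the left-hand side from the right-hand side 1
# over a small grid of constants and times (values chosen arbitrarily).
worst = max(abs(lhs(t, c1, c2) - 1.0)
            for c1 in (-3.0, 0.0, 2.0)
            for c2 in (-1.0, 0.0, 4.0)
            for t in (0.0, 0.7, 5.0))
print(worst)
```

The deviation stays small for every choice of the constants, which is what the superposition theorem leads one to expect.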

Exercises 5.3
1. Copy and complete the following table giving the classification of the
differential equations for x as a function of t.

Differential equation             Order  Linear  Homogeneous  Homogenized version
$\ddot{x} + \dot{x} + x = e^t$
$\ddot{x} + 7\dot{x} + tx = 0$
$\ddot{x} = \sin(2t)$
$\ddot{x} = \sin(2x)$

2. Find all the solutions of the differential equation $\ddot{x} + x = 2$. Hence find the
solution which satisfies the initial conditions $x = 1$ and $\dot{x} = 1$ when $t = 0$.

3. For each of the following differential equations (i) guess a linearly independent
pair of solutions, as in Section 5.1, and then (ii) find all solutions:
(a) $\ddot{x} + x = 0$    (b) $\ddot{x} - x = 0$
(c) $\ddot{x} = -4x$    (d) $\ddot{x} = 4x$
(e) $\ddot{x} = -9x$    (f) $\ddot{x} + 16x = 0$.
4. Find all the solutions of each of the following differential equations:
(a) $\ddot{x} + x = 5$    (b) $\ddot{x} - x = 1$
(c) $\ddot{x} = -4x + 2$    (d) $\ddot{x} = 4x + 1$
(e) $\ddot{x} = -9x + 3$    (f) $\ddot{x} + 16x = 32$.

5.4 The SHM equation

This section makes a special study of differential equations of the form
$\ddot{x} = -\omega^2 x$ (1)

where $\omega > 0$. Although (1) is called the SHM (simple harmonic motion)
equation, it is more accurately an equation involving a parameter and
represents infinitely many equations, one for each choice of the parameter
$\omega$. Examples which have arisen earlier in this chapter are the differential
equations
$\ddot{x} = -x$, $\ddot{x} = -4x$ and $\ddot{x} = -9x$,
which correspond to the cases $\omega = 1$, $\omega = 2$ and $\omega = 3$, respectively.
With the aid of the results given in Section 5.1, it is easy to guess the
following pair of solutions to (1):
$x = \phi_1(t) = \sin(\omega t)$ and $x = \phi_2(t) = \cos(\omega t)$. (2)

There are now two ways in which to obtain the remaining solutions using
properties discussed previously.

Two forms of solution

The first way uses the fact that the SHM equation is homogeneous linear
and so, by the superposition theorem, it has the solution
$x = c_1\sin(\omega t) + c_2\cos(\omega t)$ (3)

for each choice of $c_1, c_2 \in \mathbb{R}$. As the pair of solutions (2) is linearly
independent and the SHM equation is second order, all its solutions are
given by (3).
The second way uses the fact that the SHM equation is also autonomous.
Given an arbitrary constant $A \ge 0$, we choose $c_1 = A$ and
$c_2 = 0$ in (3) to get the solution
$x = A\sin(\omega t)$.

We may add an arbitrary constant $\varepsilon \in \mathbb{R}$ to the time t to get the solution

$x = A\sin(\omega(t + \varepsilon))$

or, alternatively,
$x = A\sin(\omega t + \delta)$ (4)

where $\delta = \omega\varepsilon$ can be any real number.

Note that (3) involves two arbitrary constants $c_1$ and $c_2$, while (4) also
involves two arbitrary constants $A \ge 0$ and $\delta$. It can be shown, moreover,
that, by suitable choice of $A$ and $\delta$, it is possible to satisfy any initial
conditions on $x$ and $\dot{x}$. Hence, by the existence-uniqueness theorem, (4)
also gives all the solutions of the SHM equation. An alternative way to
show the equivalence of (3) and (4) is to use the well-known trigonometric
identities (see Exercise 3). One then finds that, for $A > 0$, the constants are
related by
$A = \sqrt{c_1^2 + c_2^2}$, $\cos(\delta) = c_1/A$ and $\sin(\delta) = c_2/A$.
The form (4) makes it easier to sketch the graphs of the solutions, because
it involves only one trigonometric function whereas to graph (3) you must
add the graphs of two trigonometric functions.
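The passage from form (3) to form (4) can be sketched in a few lines of code. This is an illustrative sketch only: the function name and the sample values of c1, c2, the parameter w and the time t are assumptions, and the two-argument arctangent is used here as one convenient way of choosing a phase with cos(delta) = c1/A and sin(delta) = c2/A.

```python
import math

def to_amplitude_phase(c1, c2):
    """Convert x = c1*sin(w*t) + c2*cos(w*t) into x = A*sin(w*t + delta)."""
    A = math.hypot(c1, c2)          # A = sqrt(c1^2 + c2^2)
    delta = math.atan2(c2, c1)      # cos(delta) = c1/A, sin(delta) = c2/A
    return A, delta

# Hypothetical sample values, used only to compare the two forms.
c1, c2, w, t = 3.0, 4.0, 2.0, 0.3
A, delta = to_amplitude_phase(c1, c2)
form3 = c1 * math.sin(w * t) + c2 * math.cos(w * t)
form4 = A * math.sin(w * t + delta)
print(A, delta, form3, form4)
```

Other phases differing from this one by a multiple of 2*pi describe the same solution.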

Period and amplitude

In the case $\delta = 0$, the graph of (4) has the general shape shown in Figure
5.4.1 when A > 0. As t increases, x oscillates between -A and A and the
graph repeats itself after each time interval of length $2\pi/\omega$. For these
reasons, A is called the amplitude of the solution and $2\pi/\omega$ is called its
period. The reciprocal of the period, $\omega/2\pi$, is the number of complete
oscillations per unit time and is called the frequency of the solution.
If $\delta > 0$, the graph of (4) is obtained by shifting the graph in Figure
5.4.1 a distance $\delta/\omega$ to the left, since
$\sin(\omega t + \delta) = \sin(\omega(t + \delta/\omega))$. The number $\delta$ is called
the phase of the solution.

Fig. 5.4.1. Graph of a solution of the SHM equation of amplitude A and period
$2\pi/\omega$.

Example 1. Find the amplitude and the phase of the solution of the differential
equation
$\ddot{x} = -4x$
which satisfies the initial conditions $x = 1$ and $\dot{x} = 2$ when $t = 0$.

Solution. The differential equation is the SHM equation with parameter $\omega = 2$, so
by (4) we look for a solution of the form
$x = A\sin(2t + \delta)$
where $\delta \in \mathbb{R}$ and $A \ge 0$. Since

$\dot{x} = 2A\cos(2t + \delta)$

the initial conditions are equivalent to the equations

$1 = A\sin(\delta)$
$1 = A\cos(\delta)$.

Squaring these equations and adding them and then using $\cos^2(\delta) + \sin^2(\delta) = 1$
shows that A can only be $\sqrt{2}$. Hence

$\sin(\delta) = \cos(\delta) = 1/\sqrt{2}$

which have a solution $\delta = \pi/4$. This gives

$x = \sqrt{2}\sin(2t + \pi/4)$
which clearly satisfies the required initial conditions. Hence the amplitude of the
desired solution is $\sqrt{2}$ and its phase is $\pi/4$.
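The calculation in Example 1 can be repeated for other initial conditions. The sketch below is an illustration under stated assumptions: the function name and argument names are hypothetical, and it relies only on the two relations x(0) = A sin(delta) and x'(0) = A w cos(delta) that follow from the form x = A sin(w t + delta).

```python
import math

def amplitude_and_phase(x0, v0, omega):
    """Amplitude A >= 0 and phase delta of x = A*sin(omega*t + delta),
    given x = x0 and dx/dt = v0 at t = 0."""
    # At t = 0: x0 = A*sin(delta) and v0 = A*omega*cos(delta).
    A = math.hypot(x0, v0 / omega)
    delta = math.atan2(x0, v0 / omega)
    return A, delta

# The data of Example 1: omega = 2, x = 1 and dx/dt = 2 when t = 0.
A, delta = amplitude_and_phase(x0=1.0, v0=2.0, omega=2.0)
print(A, delta)   # compare with sqrt(2) and pi/4 from the worked solution
```

The same function could be applied to the initial conditions listed in the exercises, with omega chosen to match the equation.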
Exercises 5.4
1. State which of the following differential equations are of the SHM type and,
if they are, give the period and frequency of their oscillatory solutions:
(a) $\ddot{x} + 4x = 0$    (b) $\ddot{x} + 4\dot{x} = 0$
(c) $\ddot{x} = -9t$    (d) $\ddot{x} = -9x$
(e) $\ddot{x} = x$    (f) $\ddot{x} - 9x = 0$
(g) $\ddot{x} + \dfrac{k}{m}x = 0$ where k and m are positive constants.
2. In each case, find the solution of the differential equation
$\ddot{x} = -9x$
which satisfies the initial conditions:
(a) $x = 0$ and $\dot{x} = 3$ when $t = 0$
(b) $x = 1$ and $\dot{x} = 0$ when $t = 0$
(c) $x = 1$ and $\dot{x} = 3$ when $t = 0$
(d) $x = 1$ and $\dot{x} = -3$ when $t = 0$.

3. By first equating the two forms of solution

$x = c_1\cos(\omega t) + c_2\sin(\omega t)$ and $x = A\sin(\omega t + \delta)$,
use standard trigonometric identities to show that
$A = \sqrt{c_1^2 + c_2^2}$ and $\sin(\delta) = \dfrac{c_1}{A}$, $\cos(\delta) = \dfrac{c_2}{A}$.

4. What is the general solution of the equation

$\ddot{x} + \omega^2 x = 0$?
Include all cases for the value of $\omega$.

5. (a) Find the solution of the differential equation

$\ddot{x} + \omega^2 x = 0$
which satisfies the initial conditions $x = 1$ and $\dot{x} = 0$ when $t = 0$ in
each of the cases $\omega = 1$, $\omega = 2$, $\omega = 4$. Sketch the graphs of these
solutions on the same diagram. Guess what happens to the solution as
$\omega$ approaches 0.
(b) For each fixed t, calculate
$\lim_{\omega \to 0} \sin(\omega t + \pi/2)$.

Does the answer substantiate your guess in part (a)?

6
Springs and oscillations

This chapter is based on a simple model for the force in a spring which
was first proposed by Robert Hooke, a contemporary of Newton. In this
model, the force exerted by a spring is assumed to be directly proportional
to the distance by which the spring is extended. It is quite a useful model
when the extension of the spring is not too large.
Some interesting mechanical systems arise when particles are attached
to the ends of springs. A consequence of Hooke's law is that the equations
of motion for such particles are linear differential equations, usually the
SHM equation or some simple variant of it. The solutions of these
differential equations can be expressed in terms of trigonometric functions
and hence the model predicts oscillatory motion for the particles.
Later in the book, when you have learned more about differential
equations, you will be able to include the effect of damping forces in the
model.
Oscillatory phenomena occur widely in nature: the alternate rising and
setting of the sun, the waxing and waning of the moon, the ebb and
flow of the tides are examples from physics, while the regular beating
of your heart is an equally familiar example from biology. Although
these phenomena may all be described by differential equations, these
differential equations turn out to be non-linear and hence much harder
to solve than the simple linear ones used in this chapter.

6.1 Force in a spring

A spring which is neither extended nor compressed, but is just lying on
a table for example, has a certain length. This length is called the natural
length of the spring. Some springs are so tightly coiled that they cannot
be compressed to anything shorter than their natural lengths. The springs
used to pull wire doors shut are of this type. On the other hand, springs
used in the suspension of a car permit both expansion to longer than,
and compression to shorter than, their natural lengths. It is with springs
of this latter type that we shall be mainly concerned.

Fig. 6.1.1. Pulling a spring.

Hooke's law
If you attach one end of a spring to a fixed object, say the wall, and pull
on the other end so as to stretch the spring beyond its natural length,
the spring will exert a force on your hand which tends to pull it back
towards the wall as in Figure 6.1.1.
The further you stretch the spring, the larger this force becomes. If
you pull too far, however, the spring may become permanently stretched
and thereby be reduced to just a twisted piece of wire. In the problems
in this book, it will be assumed that the spring is never stretched to this
extent.
If, on the other hand, you push the spring back towards the wall so as
to compress it to less than its natural length, the spring will exert a force
on your hand which tends to push it back from the wall as in Figure
6.1.2.
The further you compress the spring, the larger this force becomes.
Eventually the spring becomes so compressed that each coil of the spring
touches the next one; after this, no further compression is possible. In
the problems in this book it will be assumed that springs are never
compressed as far as this.
Fig. 6.1.2. Pushing a spring.

Thus, whether extended or compressed, the force exerted by the spring
on your hand acts in a direction which tends to restore the spring to its
natural length. For this reason, the force exerted by the spring is often
called a restoring force. For extensions and compressions which are not
too large, the magnitude of the force is given (to a reasonable degree of
accuracy) by the law first stated by Robert Hooke (1635-1703).
Hooke's law. The magnitude of the restoring force in a spring is directly
proportional to the length by which the spring is extended or compressed.
By introducing a constant of proportionality k > 0 we may write
Hooke's law as an equality:

{restoring force} = k × {length by which spring is extended or compressed}.

The constant k is called the stiffness of the spring. Its dimensions are
those of force per unit length so that
$[k] = \dfrac{MLT^{-2}}{L} = MT^{-2}$.

In the SI system the units of k are newtons/metre or kg/s$^2$. The
stiffness of a spring depends on the composition of the steel of which it
is made, the processes used in its manufacture, the thickness of the wire,
the number of coils in the spring, and so on.
Hooke established his claim to the discovery of his law by publishing
a famous anagram in 1676 consisting of the letters
c e i i i n o s s s t t u v,
which are a rearrangement of the letters in the Latin phrase
ut tensio, sic vis
which means: as the extension, so the force. It was not till two years later
that he revealed the meaning of the anagram to his colleagues.
Hooke's law provides the theoretical basis for the spring balance, which
is commonly used to measure weights and other forces. The fact that
equal increments in the force produce equal increments in the length of
the spring makes the spring balance easy to calibrate.

Fig. 6.1.3. A spring with both ends free to move.
In the discussion so far, it has been tacitly assumed that one end of
the spring stays fixed. A possible situation in which both ends are free to
move is shown in Figure 6.1.3, where particles are attached to each end
of the spring.
To simplify the modelling for this situation, we assume henceforth that
all springs are light, that is, have zero mass. An argument similar
to that given in Section 3.1 for ropes then shows that the forces exerted
by the spring at either end have equal magnitude, which we assume to be
given by Hooke's law. As to the directions of the forces acting on the
two particles: both will be inwards towards the centre of the spring if the
spring is extended beyond its natural length, but both will be outwards
away from the centre if the spring is compressed.

Exercises 6.1

1. How would you test the validity of Hooke's law for a particular spring and
how would you determine its stiffness?

6.2 A basic example

This section will investigate the motion of a particle attached to one
end of a spring when the other end is fixed. This will introduce various
ideas and techniques which can later be used to solve more complicated
problems involving springs.
Many different choices of coordinate system are possible in spring
problems. Some of these choices will lead to simpler equations of motion
than others. In Example 1 below we choose the coordinate of the particle
to be the extension of the spring. This choice fits in naturally with
Hooke's law.
A frequent source of difficulty in solving spring problems lies in
attaching the correct sign to the force exerted by the spring on the particle.
A safe way to achieve this is to consider separately the two cases
(a) the spring is extended,
(b) the spring is compressed.
Let x metres denote the extension of the spring. A negative extension is
a positive compression; hence, in the case x > 0 the spring is extended,
while in the case x < 0 the spring is compressed.
The direction of the force exerted by the spring will thus depend on
the sign of the extension x. It is important to realize that
if x < 0 then -x is positive.
This is so because if x < 0 then x contains a `built-in' negative sign. For
example: if x = -2 then -x = -(-2) = 2, which is positive. Thus the
absolute value or magnitude of the extension is given by
$|x| = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -x & \text{if } x < 0. \end{cases}$
These basic ideas about signed numbers will show us how to attach the
correct sign to the forces arising from springs, whether under extension
or compression.

Example 1. A light spring has stiffness k > 0. One end of the spring is attached
to a wall and the other end is attached to a particle of mass m > 0. The particle
and the spring lie on the floor, assumed smooth, and are free to move in a line
perpendicular to the wall. Find the equation of motion of the particle when the
coordinate for the particle is the extension of the spring.

Solution. At any instant the spring may be extended or compressed. To define a
coordinate x for the particle, put
$\ell + x$ = {length of the spring}
where $\ell$ denotes the natural length of the spring. Since the spring assumes its
natural length when x = 0, the origin for the x-coordinate is at a distance $\ell$ from
the wall. If x increases, moreover, the particle will move away from the wall and
so this direction is the positive direction for the x-coordinate.
Our aim now is to express the force exerted on the particle by the spring as a
function of the coordinate x, distinguishing between the cases x > 0 and x < 0. Put
F = {force acting on the particle}, measured in the positive x-direction.
Case (a): x > 0. This case is illustrated in Figure 6.2.1. In this case the spring is
extended beyond its natural length, its extension being x metres. The spring pulls
the particle back towards the wall and so the force on the particle acts in the
negative x-direction, hence F < 0. But, by Hooke's law, the magnitude of F is
{stiffness} × {extension} = $kx$,
which is clearly positive. Since F is negative, however, this implies that

$F = -kx$ when $x > 0$. (1)

Fig. 6.2.1. Coordinates for Example 1, Case (a): spring extended.

Case (b): x < 0. This case is illustrated in Figure 6.2.2.
In this case the spring is compressed shorter than its natural length by a distance
of $|x| = -x$ metres. The spring pushes the particle away from the wall and so the
force on the particle acts in the positive x-direction, hence F > 0. But, by Hooke's
law, the magnitude of F is
{stiffness} × |{extension}| = $k|x| = k(-x)$
which is positive. Since F is positive, this implies that
$F = k(-x) = -kx$ when $x < 0$. (2)

Fig. 6.2.2. Coordinates for Example 1, Case (b): spring compressed.

From (1) and (2) we have in both cases

F = -kx (3)
for the force F acting on the particle (provided the magnitude of x stays sufficiently
small for Hooke's law to be applicable). This is also true when x = 0.
Now regard the coordinate x of the particle as a function of the time t and apply
Newton's second law to get
$m\ddot{x} = F = -kx$
provided $|x|$ remains sufficiently small. Hence

$\ddot{x} = -\dfrac{k}{m}x$ (4)

which is the desired equation of motion for the particle.

As a simple check on the equation of motion, note that it admits the
constant solution in which x stays zero. At this point the spring exerts
no force on the particle. If the particle is initially placed at rest in this
position, it will not move. We call this point an equilibrium point for the
particle.
As a further check, note that in the limit as $k/m \to 0$ the equation of
motion becomes simply
$\ddot{x} = 0$
so that the particle moves with constant velocity. This seems physically
plausible since, if the stiffness of the spring is small compared with the
mass of the particle, we would expect the spring to exert a relatively
small force on the particle.
The above solution of the problem in Example 1 may seem excessively
laborious inasmuch as the formula (3) for the force F has been derived
by considering each case separately. It should, none the less, be instructive
for beginners in mechanics to work through the various cases since one
of their main difficulties lies in attaching the correct signs to the various
forces. With the benefit of hindsight, however, it is now possible to see
why each case leads to the same formula (3),
$F = -kx$.
The `minus' sign is valid here in every case because F is a restoring force
and so must have the opposite sign to the extension x.
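The case-by-case reasoning and the single formula can be compared directly in code. The following sketch is purely illustrative (the function names and the sample stiffness are assumptions): one function attaches the sign by cases, as in Example 1, and the other uses F = -kx outright.

```python
def spring_force_by_cases(x, k):
    """Force on the particle in the positive x-direction, built case by case."""
    if x > 0:
        # Spring extended: magnitude k*x, directed back towards the wall.
        return -(k * x)
    if x < 0:
        # Spring compressed: magnitude k*(-x), directed away from the wall.
        return +(k * (-x))
    return 0.0  # natural length: the spring exerts no force

def spring_force(x, k):
    """Hooke's law as the single formula F = -k*x."""
    return -k * x

# Hypothetical stiffness and a few extensions, both positive and negative.
k = 5.0
for x in (-0.2, -0.05, 0.0, 0.1, 0.3):
    print(x, spring_force_by_cases(x, k), spring_force(x, k))
```

Both functions return a force whose sign is opposite to that of x, which is the restoring property noted above.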
Our discussion of Hooke's law began with its statement in English and
Latin and now includes the above translation into an algebraic formula.
To round off the discussion it therefore seems appropriate to represent
the law as a graph.
When restricted to the domain for which Hooke's law is valid, F is a
linear function of x with negative slope. Hence the graph of F against x
has the form shown in Figure 6.2.3.
The types of motion which may be predicted on the basis of Hooke's
law will now be discussed.

Example 2. Describe the possible types of motion for the particle in Example 1.
Fig. 6.2.3. Hooke's law holds on some interval.

Solution. Let the coordinate x for the particle be the extension of the spring beyond
its natural length $\ell$, at time t. Hence, by the solution to Example 1, the possible
motions are obtained by solving the differential equation
$\ddot{x} = -\dfrac{k}{m}x$
for x as a function of t. This is the SHM equation, discussed in Section 5.4, with
parameter $\omega = \sqrt{k/m}$. The solutions are given by

$x = A\sin\left(\sqrt{\dfrac{k}{m}}\,t + \delta\right)$

where $A, \delta \in \mathbb{R}$ are arbitrary constants with $A \ge 0$. The values of these constants
are determined from initial conditions.
It follows that the particle either remains at the equilibrium point where x = 0
(when A = 0) or oscillates about the equilibrium point with SHM of amplitude A
(when A > 0) as shown in the tracking diagram in Figure 6.2.4. It is assumed that
A is small enough to lie in the interval for which Hooke's law is valid. Changing $\delta$
corresponds to changing the origin of time and in the diagram we have put $\delta = 0$.
The period of the oscillations is $2\pi\sqrt{m/k}$. It is interesting to note that this period
is independent of the amplitude A of the oscillations.
As a check on our answer for the period, note that its dimensions are correct,
being given by

$[2\pi\sqrt{m/k}] = [m/k]^{1/2} = (M/MT^{-2})^{1/2} = T$.

Our answer also shows that the period increases with the ratio m/k, which seems
physically plausible: particles with larger mass would take longer to complete an
oscillation, for a given stiffness.
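The claim that the period depends on m/k but not on the amplitude can be tested by simulating the equation of motion directly. The sketch below is not from the original text: it assumes hypothetical values m = 2 and k = 8, integrates m x'' = -kx with a velocity-Verlet step (a standard numerical scheme chosen here for convenience), and times the interval between two successive upward crossings of x = 0.

```python
import math

def measured_period(m, k, amplitude, dt=1e-4):
    """Integrate m*x'' = -k*x from rest at x = amplitude and time one oscillation."""
    x, v, t = amplitude, 0.0, 0.0
    a = -k * x / m
    crossings = []            # times at which x crosses zero going upwards
    while len(crossings) < 2:
        # One velocity-Verlet step.
        x_new = x + v * dt + 0.5 * a * dt * dt
        a_new = -k * x_new / m
        v += 0.5 * (a + a_new) * dt
        if x < 0.0 <= x_new:
            # Linear interpolation for the crossing time within this step.
            crossings.append(t + dt * (-x) / (x_new - x))
        x, a, t = x_new, a_new, t + dt
    return crossings[1] - crossings[0]

m, k = 2.0, 8.0                                  # hypothetical mass and stiffness
predicted = 2.0 * math.pi * math.sqrt(m / k)     # period from the formula above
t_small = measured_period(m, k, amplitude=0.1)
t_large = measured_period(m, k, amplitude=1.0)
print(predicted, t_small, t_large)
```

Within the accuracy of the time step, the two measured periods agree with each other and with the formula, although the amplitudes differ by a factor of ten.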
Fig. 6.2.4. Tracking diagram where the period $\tau = 2\pi\sqrt{m/k}$.

Exercises 6.2
1. A particle moves on a line and the force F acting on the particle is given as
a function of its displacement x to the right of an origin O by the formula
$F = kx$, $k > 0$.
(a) Which is the equilibrium point for the particle?
(b) In each of the cases x > 0 and x < 0 sketch a diagram showing the
particle in a typical position, the direction of the force and the direction
in which the particle would move if started from rest.
(c) Is the force restoring (in the sense that it always pushes the particle back
towards the equilibrium point)?

2. In each of the following cases, repeat Exercise 1 but use the new law of force:
(a) $F = -kx^2$
(b) $F = -kx^3$
(c) $F = k(1 - x)$ (distinguish now the cases x > 1 and x < 1).

3. A particle moving along a line is a distance y to the right of an origin O at
time t.
(a) In each of the following cases state which point is the equilibrium point:
(i) the equation of motion is $\ddot{y} + 3y = 6$,
(ii) the equation of motion is $\ddot{y} + 3y = 0$.
(b) If the equation of motion is linear and homogeneous, which point must
be an equilibrium point?

4. Repeat Example 1, but this time choose as coordinate for the particle its
distance y from the wall and so obtain the equation of motion
$\ddot{y} + \dfrac{k}{m}y = \dfrac{k}{m}\ell$.
[Hints: Follow the steps given at the start of the section, making whatever
changes are necessary in the notation. Be sure to get the correct formula for the
extension of the spring in terms of y and the natural length.]
5. How does the equation of motion obtained in Exercise 4 differ from that
obtained in Example 1? Solve the equation of motion in Exercise 4 and show
that the resulting description of the possible types of motion for the particle
agrees with that found in Example 2.

6.3 Further spring problems

The procedure used in Section 6.2 to obtain the equation of motion
for the particle attached to the spring will now be analysed to provide a
number of simple steps. These steps will provide a useful guide for solving
other spring problems, even when the problems are more complicated.
In reading through these steps, you may find it helpful to refer back to
Example 1 in Section 6.2. The effect of each step, as it occurred in that
example, is shown below in parentheses.
STEP 1: Introduce notation for the natural length of the spring ($\ell$),
a coordinate for the particle (x), the spring force (F) on the particle
(measured in the positive coordinate direction).
STEP 2: In case (a) (extended spring), draw a diagram to show the spring,
the coordinate of the particle, an arrow giving the direction of the spring
force, the extension of the spring, and apply Hooke's law to get a formula
for the spring force.
In case (b) (compressed spring), draw a similar diagram showing the
above quantities and get a formula for the spring force.
Finally give a formula for the spring force covering all cases. (These
steps resulted in Figures 6.2.1 and 6.2.2 and the formula F = -kx.)
STEP 3: Write down the net force on the particle and apply Newton's
second law to get the equation of motion ($m\ddot{x} = -kx$).
STEP 4: Checks: is the equation of motion dimensionally correct? Does
it give the correct equilibrium point?
Fig. 6.3.1. Coordinates for Example 1, Case (a): spring extended.

These steps should be regarded as a guide, to help ensure nothing
essential is missing from your solution, rather than as inflexible
instructions which must always be carried out. In subsequent exercises some of
the steps may already be done for you. The following example provides
a further illustration of their use.

Example 1. A light spring has stiffness k > 0. One end of the spring is attached
to the ceiling, and to the other end, hanging vertically below it, there is attached a
particle of mass m. Find the equation of motion of the particle when its distance
below the ceiling is taken as coordinate.

Solution.
STEP 1: Let $\ell$ be the natural length of the spring, let y be the distance of the
particle below the ceiling at time t. Since the particle moves downwards if y increases,
downwards is the positive direction for the y-coordinate. Let $F_S$ be the force on the
particle due to the spring, measured in the downwards direction.
STEP 2: Here we get a formula for the spring force $F_S$ by considering the possible
cases for y.
Case (a): $y > \ell$, spring extended. This case is illustrated in Figure 6.3.1.
The spring force acts upwards, opposite to the positive direction for the y-coordinate.
Hence $F_S < 0$. The extension of the spring is $y - \ell > 0$. Hence, by Hooke's law,
the magnitude of $F_S$ is
{stiffness} × |{extension}| = $k(y - \ell)$
which is positive. Since $F_S$ is negative, this implies
$F_S = -k(y - \ell)$ when $y > \ell$. (1)
Case (b): $y < \ell$, spring compressed. This case is illustrated in Figure 6.3.2.
The spring force now acts downwards, in the positive direction for the y-coordinate.
Hence $F_S > 0$. The spring is compressed a distance $\ell - y > 0$. Hence, by Hooke's
law, the magnitude of $F_S$ is
{stiffness} × |{extension}| = $k(\ell - y)$
which is positive. Since $F_S$ is positive, this implies
$F_S = k(\ell - y) = -k(y - \ell)$ when $y < \ell$. (2)

Fig. 6.3.2. Coordinates for Example 1, Case (b): spring compressed.

Thus by (1) and (2) the spring force is given in all cases by
$F_S = -k(y - \ell)$
provided that $|y - \ell|$ is sufficiently small.
STEP 3: The net force F on the particle is the sum of the spring force and gravity.
Hence
$F = -k(y - \ell) + mg$.
By Newton's second law,
$m\ddot{y} = F = -k(y - \ell) + mg$
so the equation of motion is

$\ddot{y} = -\dfrac{k}{m}(y - \ell) + g$. (4)

STEP 4: As a check, note that $[k/m] = MT^{-2}M^{-1} = T^{-2}$ so that each term in the
equation of motion has the dimensions of acceleration.
As another check, consider the equilibrium point of the system. This will be at
some distance, say d, below the natural length of the spring where the spring force
is balanced by gravity. At this point,
$y = \ell + d$ and $mg = kd$.
Thus $d = mg/k$ and
$y = \ell + mg/k$.
Thus the equilibrium point lies a distance $mg/k$ below the natural length position.
This physical argument thus gives the same value $y = \ell + mg/k$ as that obtained
by putting $\ddot{y} = 0$ in (4).
Fig. 6.3.3. Spring with two particles.

Solving the differential equation

The equation of motion (4), although linear, is not homogeneous. The
reason for the lack of homogeneity is that we did not choose the
y-coordinate to have its origin at the equilibrium point where, in fact,
$y = \ell + mg/k$. This value of y, however, provides a constant solution of
the non-homogeneous equation, thereby facilitating its complete solution.

Example 2. Describe the possible types of motion for the particle in Example 1.
Solution. The equation of motion (4) obtained in Example 1 for the y-coordinate
of the particle may be written as
ÿ + (k/m)y = (k/m)ℓ + g.
A particular solution of this second-order linear differential equation is the constant
solution y = ℓ + mg/k. The homogenized differential equation is the SHM equation
ÿ + (k/m)y = 0
with the solutions y = A sin(√(k/m) t + δ) where A, δ ∈ R with A ≥ 0. Hence the
solutions of the equation of motion are given by

y = ℓ + mg/k + A sin(√(k/m) t + δ).

Putting A = 0 gives the equilibrium solution. For A > 0 and sufficiently small we
get a solution in which the particle oscillates up and down with SHM about the
equilibrium point with amplitude A and period 2π√(m/k).
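
The solution of Example 2 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the default value g = 9.8 and the sample values of k, m, ℓ, A and δ are all assumptions chosen for illustration. It estimates ÿ by a central difference and confirms that the residual of the equation of motion is close to zero.

```python
import math


def period(k, m):
    """Period 2*pi*sqrt(m/k) of the oscillation."""
    return 2 * math.pi * math.sqrt(m / k)


def position(t, k, m, ell, g=9.8, A=0.05, delta=0.0):
    """y(t) = ell + m*g/k + A*sin(sqrt(k/m)*t + delta)."""
    return ell + m * g / k + A * math.sin(math.sqrt(k / m) * t + delta)


def residual(t, k, m, ell, g=9.8, A=0.05, delta=0.0, h=1e-4):
    """Estimate y'' + (k/m)*(y - ell) - g, which should be close to zero."""
    def y(s):
        return position(s, k, m, ell, g, A, delta)

    # Central-difference approximation to the second derivative of y at t.
    y_ddot = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    return y_ddot + (k / m) * (y(t) - ell) - g


if __name__ == "__main__":
    k, m, ell = 40.0, 0.5, 0.2  # illustrative values only (assumed)
    print("residual at t = 0.3:", residual(0.3, k, m, ell))
    print("period:", period(k, m))
```

Setting A = 0 in `position` returns the equilibrium value ℓ + mg/k for every t, in agreement with the check made in Step 4 of Example 1.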

The steps used to solve Example 1 may be adapted to help you with
the solution of more complicated spring problems. Such problems may
involve two particles attached to opposite ends of the same spring, as in
Figure 6.3.3, or several springs attached to the same particle, as in Figure
6.3.4. We assume the springs slide on a smooth horizontal table. The
modifications needed to solve these problems are as follows.
For the system consisting of two particles attached to the same spring,
two coordinates are needed, one for each particle. The extension of
the spring can then be expressed in terms of these coordinates and the
natural length of the spring.

Fig. 6.3.4. One particle and two springs: systems (a) and (b).

If, as we assume, the spring is light, the forces exerted on the particles
at opposite ends of the spring will act in opposite directions. Since two
coordinates are involved, the motion of the particles will be described by
a pair of simultaneous differential equations.
For systems with several springs attached to the same particle, there
may be an increase in the number of cases needed to obtain all possible
combinations of extensions and compressions for the two springs. To
reduce the number of cases, we shall assume certain relationships between
the natural lengths of the springs.
In Figure 6.3.4, if for system (a) the two springs have the same natural
length, then at any given time they will both be extended or both be
compressed (or both be in the equilibrium position). Thus there are still only
two cases to consider. To simplify system (b), however, we assume the
sum of the natural lengths is equal to the distance between the walls. If
one spring is extended, the other is then compressed. Hence, once again,
there are only two cases to consider.

Exercises 6.3
1. Repeat Example 1, but this time choose as coordinate for the particle its
distance x below the equilibrium position and so obtain the equation of motion

ẍ + (k/m)x = 0.
[Hints: Again follow the steps given. When sketching diagrams showing the spring
and the forces on the particle in the various cases, recall that the equilibrium
point is a distance d below the natural length where d = mg/k. When the
coordinate of the particle is x, the particle is a distance x below this again. Hence
obtain the extension of the spring in terms of x and d.]
2. How does the equation of motion obtained in Exercise 1 differ from that
obtained in Example 1? Solve the equation of motion in Exercise 1 and show
that the resulting description of the possible types of motion for the particle
agrees with that found in Example 2.
3. Two springs are attached to a wall at one end and to a single particle at the
other, as shown below. Their natural lengths are ℓ_1 and ℓ_2 respectively, where
ℓ_1 < ℓ_2. The springs lie in a line perpendicular to the wall and the distance of
the particle from the wall is y.

[Figure: the two springs joining the wall to the particle, which is at distance y from the wall.]

(a) Find the extension (or compression) of the springs in each of the cases
(i) ℓ_2 < y
(ii) ℓ_1 < y < ℓ_2
(iii) y < ℓ_1.
(b) Indicate the directions of the forces F1 and F2 exerted by the springs on
the particle in each of the above cases. What further information would
you need, to be able to write down their magnitudes?

4. A particle of mass m is placed on a small table and is attached to a wall at
the left by a pair of light springs with the same natural length ℓ but possibly
different stiffnesses k_1 and k_2. The springs are free to move in a line perpendicular
to the wall. Choose as coordinate for the particle the extension x of the springs,
as shown below.

[Figure: the positive direction for x is marked.]

(a) Show, giving all relevant steps, that the equation of motion of the particle
is

ẍ + ((k_1 + k_2)/m)x = 0.
(b) Solve the equation of motion and hence find the possible types of motion
for the particle.
(c) If the pair of springs were to be replaced by a single spring having the
same net effect, what should be its natural length? Its stiffness?

5. A pair of springs have natural lengths ℓ_1 and ℓ_2 and stiffnesses k_1 and k_2
respectively. One end of each spring is attached to a particle of mass m while
the remaining ends of the springs are attached to opposite walls, the springs
remaining in a line perpendicular to the walls. The distance between the walls
is the sum of the natural lengths of the springs. Choose as coordinate for the
particle its distance x to the right of the point where each spring has its natural
length, as shown below.

[Figure: the particle of mass m between the two walls, joined to them by springs of natural lengths ℓ_1 and ℓ_2; the positive direction is marked.]

(a) Find the equation of motion of the particle, giving all relevant steps.
(b) A student gets the answer for (a) as

ẍ + (k_1 k_2/m)x = 0.
What is obviously wrong with this answer? Where did he go wrong?

6. As a model of a vibrating molecule consider two particles of equal mass m
connected by a light spring which stays in a fixed horizontal line. The spring
has natural length a and stiffness k. Choose as coordinates for the particles their
distances x and y from some fixed origin O, as shown below.

[Figure: the two particles at distances x and y from the origin O, joined by the spring.]

(a) Show, giving all relevant steps, that in these coordinates the equations
of motion of the particles are
ẍ = (k/m)(y - x - a)
ÿ = -(k/m)(y - x - a).
(b) To `uncouple' these simultaneous differential equations introduce new
coordinates u and v by putting
u = y + x
v = y - x.
Since x and y are functions of t, so are u and v, so you may differentiate
with respect to t to find ü and v̈ in terms of ẍ and ÿ. Hence show from
(a) that u and v satisfy the `uncoupled' differential equations
ü = 0
v̈ + (2k/m)v = (2k/m)a.

(c) Solve these differential equations for u and v as functions of t.

(d) What do the solutions tell you about the motion of the point midway
between the two particles? Describe the possible motions of the
particles.
Part two
Models with Difference Equations
7
Difference equations

The quantities involved in mechanics such as displacement, velocity
and acceleration are typically related to time by smooth functions
defined on an entire interval. Problems in mechanics lead, via Newton's
second law, to differential equations. By way of contrast, the mathematical
models to be studied in this part of the course involve quantities
whose values are known only at certain specified times, equally spaced.
Such quantities are expressed as functions of the time via sequences. The
assumptions in the models can then be expressed as difference equations
for these sequences. The difference between models leading to differential
equations and those leading to difference equations is often expressed by
saying that the former are continuous whereas the latter are discrete.
This chapter introduces the idea of a difference equation via a problem
involving rabbit populations. Basic ideas regarding the solutions of these
equations are then explained.

7.1 Introductory example

Leonardo of Pisa, or Fibonacci as he was better known, is often claimed
to be the greatest mathematician of the Middle Ages. His book, Liber
abaci, completed in 1202, took advantage of the Hindu-Arabic numerals.
Among the problems which the book contains, the one of greatest interest
to later mathematicians is as follows :
How many pairs of rabbits will be produced in a year, beginning with a single pair,
if every month each pair produces a new pair, which becomes productive two months
after birth ?

Our interest in this problem stems from the fact that it leads to the idea
of a difference equation.

Fig. 7.1.1. Months 1 to 4. A mature pair of rabbits (shaded grey) each month produce a new
pair of rabbits (shaded white). The new rabbits mature after two months.

The way in which the rabbits breed for the first few months is illustrated
in Figure 7.1.1. The pairs of rabbits are shown month by month within
their enclosure. It is assumed that none of the rabbits die and, for ease of
identification, each pair of rabbits is shown in the same position within
the enclosure from month to month. The newly born rabbits are shown
directly under their parents with an arrow pointing to them.
At the top of the enclosure each month is the original pair of rabbits,
the only pair during the first month. This pair is assumed to be new
born when placed in the enclosure and hence produces a new pair of
baby rabbits in the second and in each subsequent month. The first pair
of baby rabbits, born in the second month, do not produce any offspring
till the fourth month.

Example 1. What is the total number of pairs of rabbits, each month, up to and
including the fourth month?
Solution. It is convenient to introduce some notation. Put
y_k = {number of pairs of rabbits in the month k}
for each integer k ≥ 1. Inspection of the enclosures in Figure 7.1.1 shows that
y_1 = 1, y_2 = 2, y_3 = 3, y_4 = 5. (1)

By continuing to sketch the rabbits in the enclosure month by month
up till the twelfth, we could find successively the numbers

y_5, y_6, y_7, y_8, y_9, y_10, y_11, y_12

and thereby solve the original problem. This would be quite laborious,
however, and a more efficient way is to first derive a mathematical
formula showing how the numbers in any month depend on those in
previous months.

Example 2. Develop a formula relating the number of rabbits present this month
to the number of rabbits present in the previous month.
Solution. Note first that, since we neglect rabbits' deaths, their total number can
only be affected by births. Hence, for the pairs of rabbits,
{number present this month} = {number present last month} + {number born this month} (2)
Since the rabbits take two months to become productive and then produce only one
pair per month, the last term on the RHS of (2) is given by
{number born this month} = {number present two months ago} (3)
provided the current month is at least the third month. Hence, substituting (3) back
into the RHS of (2) gives
{number present this month} = {number present last month} + {number present two months ago} (4)
This is the desired relationship between the numbers this month and those for the
previous two months.
To express (4) in terms of the notation introduced previously, let the current
month be the kth month (k ≥ 3). Hence the last month was the (k - 1)th and the
one before that was the (k - 2)th. Thus (4) may be abbreviated to
y_k = y_{k-1} + y_{k-2} (5)
where k = 3, 4, 5, ... and we shall refer to (5) as the Fibonacci equation (despite
the fact that it was first written down explicitly by Kepler).

Equation (5), which relates the numbers this month to those in the two
preceding months, is an example of a difference equation.
To see how the difference equation helps in the calculation of the
remaining numbers, take k = 5 in (5) to get
y_5 = y_4 + y_3.
Since y_3 and y_4 are already known from (1), this gives y_5 = 5 + 3 = 8.
The next step is to take k = 6 in (5) to get
y_6 = y_5 + y_4.
But now both y_5 and y_4 are known so this gives y_6 = 8 + 5 = 13. By
continuing in this way we can calculate in turn y_7, y_8, ... stopping where
we please. In particular we can find y_12 in this way and hence solve the
original problem. The answer turns out to be 233 pairs of rabbits.
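
The month-by-month calculation just described is easy to automate. The sketch below is an illustration only (the function name `rabbit_pairs` is an assumption, not something from the text); it starts from y_1 = 1 and y_2 = 2, as in (1), and applies the Fibonacci equation (5) repeatedly.

```python
def rabbit_pairs(months, y1=1, y2=2):
    """Return [y_1, ..., y_months] where y_k = y_{k-1} + y_{k-2} for k >= 3."""
    ys = [y1, y2]
    for _ in range(3, months + 1):
        ys.append(ys[-1] + ys[-2])  # equation (5)
    return ys[:months]


if __name__ == "__main__":
    for month, pairs in enumerate(rabbit_pairs(12), start=1):
        print(month, pairs)
```

The last value printed is y_12 = 233, matching the answer quoted in the text. Passing `y2=1` instead gives the sequence that starts 1, 1, 2, 3, 5, ....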
Fig. 7.1.2. Graph shows the total number of pairs of rabbits present each month (vertical axis: Rabbits; horizontal axis: Months).
Because the time is discrete the graph consists of discrete points.

A sequence may be regarded as a function whose domain is the
positive integers. A sequence thus has a graph, but it consists of isolated
points rather than a continuous curve. As an illustration, the solution to
Fibonacci's equation is graphed in Figure 7.1.2.
Despite its mathematical interest, Fibonacci's problem gives a greatly
oversimplified view of the breeding patterns of rabbits. In practice, there
is a wide variation in the size of their litters, with seven or eight per litter
being quite common. Rabbits, moreover, normally take considerably
longer to reach sexual maturity than the two months stated in the
problem. One aspect of the model which does fit the facts, however, is
the gestation period of a month: in practice it is normally 31 days.
An interesting article on the solutions of the Fibonacci equation and
their relevance to areas quite remote from rabbit populations is given
in Gardner (1981), Chapter 13. Further information about Fibonacci's
mathematical achievements may be found in Boyer (1968), Chapter 14.

Exercises 7.1
In each problem, yk denotes the number of pairs of rabbits present in the
enclosure in the kth month.
1. For the original Fibonacci rabbit problem, it was shown in the text that
y_5 = 8 and y_6 = 13.
(a) Use a suitable choice of k in Fibonacci's equation, (5) in the text, to find
y_7.
(b) In a similar way find in succession the numbers y_8, y_9, y_10, y_11, y_12 thereby
completing the solution of Fibonacci's problem.

2. Suppose that we modify Fibonacci's problem by increasing the number of
rabbits born each month from one pair to two pairs.
(a) What change (if any) is required in equations (2), (3), (4) and (5) in the
text?
(b) Starting from y_1 = 1 and y_2 = 3, find in succession the number of pairs
of rabbits each month from the third to the sixth by using the difference
equation found in part (a).

3. Repeat Exercise 2, but now modify the original Fibonacci problem by
increasing one pair to three pairs.
4. Suppose that now we modify the original Fibonacci problem by increasing
the time taken for the rabbits to become productive from two months to three
months.
(a) What change (if any) is required in equations (2), (3), (4) and (5) in the text?
(b) Starting from y_1 = 1, y_2 = 1 and y_3 = 2, find in succession the number
of pairs of rabbits each month up to the sixth month by using the
difference equation found in part (a).

5. Repeat Exercise 4, but this time suppose the time taken for the rabbits to
become productive is four months. In part (b) you will need to choose values for y_1, y_2,
y_3, y_4 appropriately.

7.2 Difference equations - basic ideas

The idea of a difference equation will now be formulated in a general
way, applicable to a wide variety of problems. Difference equations arise
in problems like that of Fibonacci concerning the rabbits, where the
solution leads to a sequence of numbers

y_1, y_2, y_3, y_4, ...

which we can imagine as never ending. Typically the sequence of numbers
will represent the measurements of some quantity, made at equal intervals
of time; hence the phrase 'discrete-time problems' is often used to describe
the practical problems from which difference equations arise.
A difference equation may be defined as a rule which expresses each
member of the sequence, from some point on, in terms of the previous
members of the sequence. If the rule defines the kth member of the
sequence in terms of the (k - 1)th member (and possibly also the number
k itself), then it is said to be a first-order difference equation. Once
a value is specified for y_1, the difference equation then determines the
rest of the sequence uniquely. The value given for y_1 is called an initial
condition and the sequence obtained is called a solution of the difference
equation.

Example 1. For the first-order difference equation

y_k = k(y_{k-1})^2 (k = 2, 3, 4, ...)

and the initial condition y_1 = 1, determine the solution as far as y_5.

Solution. In the difference equation take successively k = 2, 3, 4, 5 to get

y_2 = 2(y_1)^2 = 2 × 1^2 = 2
y_3 = 3(y_2)^2 = 3 × 2^2 = 12
y_4 = 4(y_3)^2 = 4 × 12^2 = 576
y_5 = 5(y_4)^2 = 5 × 576^2 = 1 658 880.
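
The same successive substitution can be sketched in a few lines of code. This is an illustrative sketch rather than anything from the text: the helper `iterate_first_order` and its argument names are assumptions. It accepts any rule giving y_k in terms of k and y_{k-1}, together with the initial condition y_1.

```python
def iterate_first_order(rule, y1, last_k):
    """Return {k: y_k} for k = 1..last_k, where y_k = rule(k, y_{k-1})."""
    ys = {1: y1}
    for k in range(2, last_k + 1):
        ys[k] = rule(k, ys[k - 1])
    return ys


# Example 1: y_k = k * (y_{k-1})^2 with y_1 = 1.
example1 = iterate_first_order(lambda k, prev: k * prev**2, y1=1, last_k=5)

if __name__ == "__main__":
    for k, value in example1.items():
        print(k, value)
```

Because Python integers are exact, the value obtained for y_5 is 1658880, agreeing with the hand calculation above.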

A rule which defines the kth member of the sequence in terms of the
(k - 2)th member (and possibly also the (k - 1)th member or the number
k itself) is called a second-order difference equation. A unique solution of
such a difference equation is determined once the values of both y_1 and
y_2 are specified. Difference equations of the third and higher orders may
be defined in a similar way.

Example 2. Note that the Fibonacci difference equation

y_k = y_{k-1} + y_{k-2} (k = 3, 4, 5, ...)

is of second order. Given the initial conditions y_1 = 1 and y_2 = 1, determine the
solution as far as y_6. (The solution is called the Fibonacci sequence.)

Solution. Taking successively k = 3, 4, 5, 6 in the difference equation gives the
following equations
y_3 = y_2 + y_1
y_4 = y_3 + y_2
y_5 = y_4 + y_3
y_6 = y_5 + y_4.
The initial conditions, together with these equations used successively, give
y_1 = 1, y_2 = 1, y_3 = 2, y_4 = 3, y_5 = 5, y_6 = 8.

The above process of repeatedly substituting old values back into the
difference equation to produce new ones is known as iteration. It is clear
that this process will eventually produce y_k for any prescribed value of k.

While iteration has the advantage of being repetitive, and therefore
easy to apply, it has some drawbacks. For example, if you wished to
calculate y_100, say, by iteration then you would also have to write down
all the preceding members of the sequence y_1, y_2, ..., y_98, y_99 regardless
of whether you wanted them or not.
For some difference equations it is possible to find a simple formula
giving the solution y_k as a function of k. Such a formula is said to provide
a 'closed-form' solution of the difference equation and enables y_100, say,
to be calculated directly, without the need to calculate all the preceding
members of the sequence.
The following examples illustrate how a closed-form solution may be
guessed and then checked.

Example 3. Guess a formula for the solution of the first-order difference equation
y_k = 1 + y_{k-1} + 2√(1 + y_{k-1}) (k = 2, 3, 4, ...) (1)
which satisfies the initial condition y_1 = 0.
Solution. To ensure our guess will be an informed one, we first calculate the numbers
y_1, y_2, y_3, y_4. From the difference equation and the initial condition it follows by
iteration that these four numbers are respectively 0, 3, 8, 15. These numbers look
very close to the perfect squares 1, 4, 9, 16. More precisely
y_1 = 1^2 - 1
y_2 = 2^2 - 1
y_3 = 3^2 - 1
y_4 = 4^2 - 1.
This leads to the guess
y_k = k^2 - 1 (2)
for all k ≥ 1.

The formula (2) remains a guess at this stage since it has only been
verified to hold for four of the infinitely many possible values of k. The
following example shows how to verify it is valid for all k ≥ 1.

Example 4. In the previous example, verify that the formula (2) for yk gives the
correct solution to the difference equation (1).

Fig. 7.2.1. Graph of the solution (2) of the difference equation (1) (horizontal axis: k; vertical axis: y).

Solution. It will be shown that the formula for y_k satisfies both the initial condition
and the difference equation.
Putting k = 1 in (2) gives y_1 = 0; hence the initial condition is satisfied. Next,
to check that the difference equation is satisfied, first replace k in (2) by k - 1 to
get

y_{k-1} = (k - 1)^2 - 1 (k = 2, 3, 4, ...) (3)

(this replacement being valid since (2) was assumed to hold for all k ≥ 1).
Substitution of these formulae in the difference equation (1) now gives, for k ≥ 2,

RHS = 1 + y_{k-1} + 2√(1 + y_{k-1})
= (k - 1)^2 + 2√((k - 1)^2), by (3),
= (k - 1)^2 + 2(k - 1), as k ≥ 1,
= k^2 - 1,
= y_k, by (2),
= LHS.

Thus the difference equation is satisfied, both sides being equal.
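
A numerical spot-check can complement the algebraic verification, although it proves nothing beyond the values of k actually tried. The sketch below is illustrative only (the function name `closed_form_matches` is an assumption); it iterates the difference equation (1) from y_1 = 0 and compares each term with the guess k^2 - 1 from (2).

```python
import math


def closed_form_matches(last_k):
    """Iterate y_k = 1 + y_{k-1} + 2*sqrt(1 + y_{k-1}) from y_1 = 0 and
    report whether every term up to y_last_k equals k^2 - 1."""
    y = 0.0  # y_1
    for k in range(2, last_k + 1):
        y = 1 + y + 2 * math.sqrt(1 + y)  # difference equation (1)
        if not math.isclose(y, k**2 - 1):  # closed-form guess (2)
            return False
    return True


if __name__ == "__main__":
    print(closed_form_matches(100))
```

Only the argument given in Example 4 establishes the formula for every k ≥ 1; the code merely guards against a slip in the guess.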

The graph of the solution (2) is sketched in Figure 7.2.1. Since the
solution is a sequence, its graph is a discrete set of points rather than
a continuous curve. Note the reduced scale on the vertical axis to
accommodate the points.

Exercises 7.2

1. In each case state the order of the difference equation and the number of
initial conditions needed to determine a solution uniquely :

(a) y_k = 2y_{k-1} + (y_{k-1})^3 (k ≥ 2),

(b) y_k = 2y_{k-1} + (y_{k-2})^3 (k ≥ 3),
(c) y_k = 2y_{k-2} (k ≥ 3).

2. Suppose that y_k = y_{k-1} for all k ≥ 2. What can be said about the sequence
y_1, y_2, y_3, ...?

3. Suppose that y_k = (y_{k-1})^2 for k ≥ 2. For which values of k does it follow that
y_{k+1} = (y_k)^2?

4. For the first-order difference equation

y_k = y_{k-1} + 2 (k ≥ 2)

and the initial condition y_1 = 0:

(a) calculate y_2, y_3, y_4, y_5 and then guess a formula for y_k in general,
(b) verify that your formula satisfies the difference equation and the initial
condition.

5. Repeat Exercise 4, but with the difference equation

y_k = 2y_{k-1} (k ≥ 2)
and the initial condition y_1 = 1.
6. Repeat Exercise 4, but with the difference equation
y_{k+1} = y_k/(1 + y_k)

and the initial condition y_1 = 1.

7. Write down a first-order difference equation and one initial condition having
the sequence 1, -1, 1, -1, 1, -1, ... as the solution.
8. Write down a second-order difference equation and two initial conditions
having the sequence 1, -1, 1, -1, 1, -1, ... as the solution.
9. Suppose that a sequence y_1, y_2, y_3, ... is defined by putting

y_k = √k (k ≥ 1).

Write down a similar formula which must be satisfied by y_{k-1}. Hence obtain
k in terms of y_{k-1} and then obtain a first-order difference equation which the
sequence satisfies.

7.3 Constant solutions and fixed points

In the applications of difference equations, a solution represents some
quantity measured at equal intervals of time. A solution in which the
measured values do not change with time is called a constant or steady-
state solution. Although a solution chosen at random is unlikely to be of
this type, it may approach a steady-state solution over a long period of
time. For this reason, steady-state solutions play a major role in the study
of difference equations. To help place them in a geometrical context, we
shall show how they correspond to fixed points of functions.
To encourage notational flexibility, we shall change our notation and
denote a typical sequence of real numbers by

x_0, x_1, x_2, ...

and shall write x_n for the typical member of this sequence where n is
an integer ≥ 0. The sequence's being constant means that x_n does not
change with n and hence that

x_n = x_0

for all n. Equivalently

x_{n+1} = x_n (n = 0, 1, 2, ...).

To find constant solutions of a difference equation, either of these two
criteria may be used.

Example 1. Find the constant solutions of the difference equation

x_{n+1} = (x_n)^2 (n = 0, 1, 2, ...). (1)

Solution. If the solution is constant, then all members of the sequence have the same
value, which we denote by s. Hence x_{n+1} = x_n = s for all n ≥ 0. The difference
equation (1) implies that
s = s^2
and hence s = 0 or 1. Thus the only possible constant solutions of the difference
equation are the sequences
x_0 = 0, x_1 = 0, x_2 = 0, ... and x_0 = 1, x_1 = 1, x_2 = 1, ... .
Conversely, these two constant sequences clearly satisfy the difference equation (1).

Constant solutions of difference equations will now be interpreted as
fixed points of functions. This leads to a useful interpretation in terms
of graphs.
The difference equations to be considered here are first-order ones
having the form
x_{n+1} = g(x_n) (n = 0, 1, 2, ...) (2)
for some function g: I → R with I ⊆ R. The function g is said to
correspond to the difference equation (2). It is important to be able to
recognize the function g when it appears in specific examples.

Example 2. Find the function g corresponding to the difference equation

x_{n+1} = (x_n)^2 (3)

and describe its graph.

Solution. Comparison of (2) and (3) shows that the function g corresponding to
this difference equation is given by
g(x_n) = (x_n)^2
for all x_n. In describing this function, however, we would normally write
g(x) = x^2
for all x. Hence g is just the squaring function, whose graph is a parabola.

For the difference equation (2), an initial condition x_0 = s will give a
constant solution
x_0 = s, x_1 = s, x_2 = s, ...
if and only if the number s satisfies the equation
s = g(s).
A number s which satisfies this equation is called a fixed point of the
function g. The significance of this terminology is apparent from the
mapping diagram, Figure 7.3.1, where g, when applied to s, leaves s fixed.

In the following example, the constant solutions of the difference
equation found in Example 1 reappear in the guise of fixed points of the
corresponding function.

Example 3. Find all the fixed points s of the function g with g(x) = x^2.
Solution. The fixed points s are the solutions of s = g(s), that is
s = s^2.
Hence the fixed points of g are the numbers 0 and 1.

Fig. 7.3.1. Mapping diagram showing a fixed point s of the function g.

Fig. 7.3.2. Graph showing fixed point s of the function g.

Fixed points have the advantage of a simple graphical interpretation,
which often provides information about fixed points even in cases where
we cannot solve the equations exactly. This interpretation is shown in
Figure 7.3.2: a number s is a fixed point of g if and only if the point
(s, g(s)) is a point of intersection of the graphs of y = g(x) and y = x.
A typical application of this graphical interpretation is given in the
following example.

Example 4. Show that the cosine function has just one fixed point s which lies in
the interval 0 < s < π/2.

Solution. The graphs of y = cos(x) and y = x are shown in Figure 7.3.3. They
intersect in just one point, whose x-coordinate lies between 0 and π/2. Hence there
is only one fixed point s and it satisfies 0 < s < π/2.
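
The graph locates the fixed point only roughly, but its value can be estimated numerically. The following sketch uses bisection on g(x) - x; this is a technique brought in here for illustration and is not the method of the text, and the function name and tolerance are assumptions. It relies on cos(x) - x being positive at x = 0 and negative at x = π/2, so a sign change lies between them.

```python
import math


def fixed_point_by_bisection(g, lo, hi, tol=1e-12):
    """Approximate a solution of g(s) = s in [lo, hi] by bisection,
    assuming g(x) - x changes sign between lo and hi."""
    def h(x):
        return g(x) - x

    if h(lo) * h(hi) > 0:
        raise ValueError("g(x) - x does not change sign on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(lo) * h(mid) <= 0:
            hi = mid  # the sign change lies in the left half
        else:
            lo = mid  # the sign change lies in the right half
    return (lo + hi) / 2


s = fixed_point_by_bisection(math.cos, 0.0, math.pi / 2)

if __name__ == "__main__":
    print(s, math.cos(s))
```

The value found is approximately 0.739, which does lie between 0 and π/2 as the graphical argument predicts.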

Fig. 7.3.3. Graph showing the single fixed point of the cosine function.

Exercises 7.3
1. In each case find all steady-state solutions of the difference equation using
the method of Example 1 in the text.
(a) x_{n+1} = 2x_n (n = 0, 1, 2, ...)
(b) x_{n+1} = 2x_n + 1
(c) x_{n+1} = 2((x_n)^2 - 3).

2. How many constant solutions does the following difference equation have?
x_k = x_{k-1} (k = 2, 3, 4, ...)

3. The function g corresponding to a certain first-order difference equation is
given by the formula g(x) = 3x + 2. What is the difference equation? [Give x_{n+1}
as the appropriate function of x_n, where n = 0, 1, 2, ....]
4. In each case find all of the fixed points of the function g.
(a) g(x) = 2x - 2
(b) g(x) = x2 - 4

(c) g(y) = (1/2)(y + 1/y) (y > 0).

5. Suppose that a function g: R -> R assumes only positive values. What follows
about the fixed points (if any) of g?
6. Let g be the function corresponding to the difference equation
x_{n+1} = (x_n)^2 - 3 (n = 0, 1, 2, ...).
(a) Write down a formula giving g(x).
(b) Find the fixed points of g.
(c) Hence find the steady-state solutions of the difference equation.

7. Repeat Exercise 6, but with the difference equation

x_{n+1} = a(x_n)^2 (n = 0, 1, 2, ...)

where a ≠ 0 is a constant.

8. In each case give a formula for g(x) where g is the function corresponding to
the difference equation.

(a) N_{k+1} = N_k(1 - N_k) (k = 0, 1, 2, ...),

(b) y_{n+1} = e^{y_n} + 3 (n = 0, 1, 2, ...).

9. In each case sketch graphs of y = g(x) and y = x on the same axes. Hence
decide whether the function g has any fixed points.
(a) g(x) = x^2 - 1
(b) g(x) = x^2 + 1
(c) g(x) = e^x
(d) g(x) = e^{-x}
(e) g(x) = ln(x) (x > 0)
(f) g(x) = tan(x) (-π/2 < x < π/2).

7.4 Iteration and cobweb diagrams

Cobweb diagrams are used in order to provide a simple graphical
interpretation of the iteration process used to solve difference equations.
They provide valuable insight into the possible ways in which solutions
of difference equations can behave.
The difference equations to which such diagrams are relevant are
first-order equations of the form

x_{n+1} = g(x_n) (n = 0, 1, 2, ...) (1)

Here g is the function corresponding to the difference equation, as
explained in the previous section.
The main idea behind the construction of a cobweb diagram is to
interpret equation (1) on a graph. To start with, graphs of both y = g(x)
and y = x are sketched on the same set of axes. It is assumed also that
the number xn is given, and plotted on the x-axis. Figure 7.4.1 (a) shows
how things might appear at this stage for a typical function g.
Given this set-up and given that (1) is satisfied, the question now is:
where on the x-axis should we put x_{n+1}? The answer is shown in Figure
7.4.1 (b). To sketch the arrows in Figure 7.4.1 (b) and thence to get the
point x_{n+1} we perform the following steps.

Fig. 7.4.1. Going from x_n to x_{n+1} via the graph. Panels (a) and (b).

STEP 1: Start at the point P on the line y = x directly above
x_n and then project vertically to get the point Q on the graph of
y = g(x).
STEP 2: From Q project across horizontally until the point R is
reached on the line y = x.
STEP 3: Choose the next point to be the x-coordinate of R.

We call this the next-point procedure. The reasons why it gives the
correct point for xn+1 are as follows:

(a) The y-coordinate of Q is g(x_n) (because Q lies on the graph of
y = g(x)).
(b) Hence R has g(x_n) for its y-coordinate also (as it is level with Q).
(c) Hence R has g(x_n) for its x-coordinate (as it lies on the line y = x).

Thus the procedure gives xn+1 = g(xn), as required.

To construct a complete solution of the difference equation (1) we
now choose an initial value xo and mark it on the x-axis. By repeatedly
applying the next-point procedure we obtain from xo a sequence of
points

x0, x1, x2, ...

lying on the x-axis and forming a solution of the difference equation (1).
The pattern of arrows arising from use of the next-point procedure will
typically form a cobweb (although it might sometimes be better called a
zig-zag) path.
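
The next-point procedure translates directly into a short routine that lists the corner points of the cobweb path. This is a sketch for illustration only; the function name `cobweb_path` and the choice to return a list of (x, y) pairs are assumptions. Each pass through the loop performs Steps 1 to 3: from the current point on y = x it moves vertically to Q on y = g(x), then horizontally to R on y = x.

```python
def cobweb_path(g, x0, steps):
    """Return the corner points P, Q, R, Q, R, ... of the cobweb path
    for x_{n+1} = g(x_n), starting from the initial value x0."""
    x = x0
    path = [(x, x)]          # P: the point on y = x directly above x_0
    for _ in range(steps):
        y = g(x)
        path.append((x, y))  # Q: on the graph of y = g(x)
        path.append((y, y))  # R: on the line y = x, so x_{n+1} = y
        x = y
    return path


if __name__ == "__main__":
    for point in cobweb_path(lambda x: 0.5 * x, 1.0, 3):
        print(point)
```

Joining consecutive points of the returned list with straight segments (for instance in a plotting package) reproduces the pattern of arrows; the x-coordinates of the points lying on y = x are the successive values x_0, x_1, x_2, ....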

The following example gives a number of difference equations, whose
solutions illustrate a variety of possible types of behaviour.

Example 1. For each of the following difference equations construct a cobweb
diagram showing the solution of the difference equation which satisfies the initial
condition x_0 = 1:
(a) x_{n+1} = 2x_n (n = 0, 1, 2, ...)
(b) x_{n+1} = (1/2)x_n
(c) x_{n+1} = -2x_n
(d) x_{n+1} = -(1/2)x_n.

Solution. In each case the first step is to write down the function g for which the
difference equation has the form
x_{n+1} = g(x_n).

The graphs of y = g(x) and y = x are then plotted relative to the same pair of axes
and the initial value x_0 is plotted on the x-axis. Repeated application of the
next-point procedure then produces the cobwebs shown in Figure 7.4.2 and the values
for
x_0, x_1, x_2, x_3, ...

(which check against those obtained directly from the relevant difference equation
by iteration).

The cobweb diagrams exemplify some important types of long-term

behaviour for solutions of difference equations. One of the most impor-
tant questions in this respect is whether the solution converges to a limit.
The answers, in the various cases, are as follows.
Case (a) : no. We can make x as large as we like by making n sufficiently
large. Hence we say that the sequence diverges to oo.
Case (b) : yes. We can make xn as close as we like to 0 by making n
sufficiently large. Hence we say that the sequence converges to the limit
0.

Case (c) : no. As n increases, xn oscillates from one side of the origin to
the other with ever-increasing amplitude. Hence we say that the sequence
does not converge to any limit.
Case (d) : yes. Although xn oscillates as n increases, the amplitude dies
away and approaches 0. We can thus make xn as close as we like to 0 by
making n sufficiently large. Hence we say that the sequence converges to
the limit 0.
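These four kinds of behaviour can be seen directly by iterating. The sketch below assumes the coefficients in cases (b) and (d) are 1/2 and -1/2, as the descriptions of those cases indicate; the helper name `iterate` is a hypothetical choice.

```python
def iterate(g, x0, n):
    """Return [x0, x1, ..., xn] where x_{k+1} = g(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs

# The four linear maps of Example 1, each started from x0 = 1.
maps = {
    "(a)": lambda x: 2 * x,      # grows without bound
    "(b)": lambda x: 0.5 * x,    # shrinks towards 0
    "(c)": lambda x: -2 * x,     # oscillates with growing amplitude
    "(d)": lambda x: -0.5 * x,   # oscillates with shrinking amplitude
}
results = {label: iterate(g, 1.0, 4) for label, g in maps.items()}
for label, xs in results.items():
    print(label, xs)
```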
Fig. 7.4.2. Cobweb diagrams for Example 1. Cases (b) and (d) converge to a
fixed point whereas (a) and (c) do not.

In the preceding example, the functions corresponding to the difference
equations are all linear, in the sense of having straight-line graphs. While
this makes the graphs easy to draw, it also makes the behaviour patterns
for the solutions rather special. Thus, for example, if the function is
linear, it cannot have exactly two fixed points. The following example
provides a contrast with the linearity of the previous example.

Example 2. Use a zig-zag diagram to investigate the long-term behaviour of the
solutions of the difference equation
x_{n+1} = (x_n)^2 (2)

for each of the following initial conditions: (a) x0 = 0.9 and (b) x0 = 1.1.

Solution. The function g corresponding to the difference equation (2) is given by the
formula g(x) = x^2. Hence the cobweb diagrams are as in Figure 7.4.3. The solution
in case (a) converges to the limit 0 whereas the solution in case (b) diverges to ∞.
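The conclusion of Example 2 can also be checked without a diagram. Each step squares the previous value, which suggests the closed form sketched below; this formula is not stated in the text and is offered only as a supplementary check.

```latex
\[
x_n = x_0^{\,2^n} \quad (n = 0, 1, 2, \ldots),
\qquad\text{so}\qquad
x_n \to 0 \ \text{ if } |x_0| < 1,
\qquad
x_n \to \infty \ \text{ if } |x_0| > 1 .
\]
```

With x0 = 0.9 the first case applies, and with x0 = 1.1 the second, in agreement with the cobweb diagrams.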
Fig. 7.4.3. Cobweb diagrams for Example 2 show different behaviour for different
initial conditions: (a) x0 = 0.9, (b) x0 = 1.1.

Our final example illustrates how a cobweb diagram may, in exceptional
cases, collapse to a square and thereby give rise to periodic
behaviour.

Example 3. Use a cobweb diagram to discuss the behaviour of the solution of the
difference equation
xn+1 = -xn (3)

which satisfies the initial condition x0 = 1.

Solution. The function g corresponding to the difference equation (3) is given by
the formula g(x) = -x. The graphs of y = g(x) and y = x are shown in Figure
7.4.4. Because of the symmetry between the two graphs, two applications of the
next-point procedure give back the starting point. Hence the cobweb diagram closes
up to give a square described infinitely many times.
The solution with initial condition x0 = 1 is obtained by starting at the upper
right corner of the square and following it around anti-clockwise. The values of the
x-coordinate at successive crossings of the x-axis are then recorded to give
x0 = 1, x1 = -1, x2 = 1, x3 = -1, ... .

Fig. 7.4.4. Cobweb diagram which illustrates a 2-cycle.

The above sequence has the interesting feature that every second
member has the same value, even though the sequence is not constant.
This property may be expressed by saying that the solution has period 2
or is a 2-cycle. More generally, we say that a sequence

x0, x1, x2, ...

is periodic if there is a positive integer p such that

x_{n+p} = x_n (n = 0, 1, 2, ...)

The smallest such p is then called the period of the sequence and the
sequence is called a p-cycle. In particular, a sequence of period 1 is a
constant solution.
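The definition of the period translates into a simple test on computed terms. The sketch below is an illustration only: the function name `period_of` and the tolerance are assumptions, and a finite list of terms can suggest periodicity but cannot prove it for the whole sequence.

```python
def period_of(xs, tol=1e-12):
    """Smallest p >= 1 with xs[n + p] == xs[n] for all available n, else None."""
    for p in range(1, len(xs) // 2 + 1):
        if all(abs(xs[n + p] - xs[n]) <= tol for n in range(len(xs) - p)):
            return p
    return None

# First ten terms of the solution of x_{n+1} = -x_n with x0 = 1 (Example 3).
xs = [1.0]
for _ in range(9):
    xs.append(-xs[-1])

print(period_of(xs))          # the 2-cycle of Example 3
print(period_of([3.0] * 6))   # a constant sequence has period 1
```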

Exercises 7.4

1. Sketch graphs y = x and y = x + 1, for 0 ≤ x ≤ 5, relative to the same pair of
axes. Hence draw a cobweb diagram for the solution of the difference equation
x_{n+1} = x_n + 1 (n = 0, 1, 2, ...)
which satisfies the initial condition x0 = 0. Show x0, x1, ..., x5 on the x-axis.
2. In each of the cases shown below, sketch a cobweb diagram for the solution
of the difference equation
xn+1 = g(xn) (n = 0, 1, 2,...)
[Figure: graphs of y = g(x) for cases (a) and (b).]

3. For each of the difference equations in Example 1 in the text, use iteration
to find x1, x2, x3, given that x0 = 1. Hence check the statements made in Figure
7.4.2.

4. Repeat Example 1 in the text, but this time suppose the initial condition is
x0 = -1.
5. Repeat Example 2 in the text, but this time choose the initial conditions to
be x0 = -0.9 in case (a) and x0 = -1.1 in case (b).

In Exercises 6 and 7 you will need to use graph paper.

6. Let g be the function with g(x) = x^{1/2} for x ≥ 0. Tabulate the values of g
at 0, 0.1, 0.2, ..., 0.9, 1. Plot the corresponding points on the graph of y = g(x)
and connect them to form a smooth curve. Also draw the line y = x. From your
graph, make the cobweb construction of the solution of the difference equation
x_{n+1} = (x_n)^{1/2} (n = 0, 1, 2, ...)
which satisfies the initial condition x0 = 0.1.
To which limit does the solution appear to be converging?
7. Let g be the function with g(x) = x/(1 + x) for x ≠ -1. Tabulate the values
of g at 0, 0.1, ..., 0.5. Plot the corresponding points on the graph of y = g(x)
and connect them to form a smooth curve. Also draw the line y = x. From your
graph make the cobweb construction for the solution of the difference equation
x_{n+1} = x_n/(1 + x_n) (n = 0, 1, 2, ...)
which satisfies the initial condition x0 = 0.5.
To which limit does the solution appear to be converging?
8. Consider the difference equation
x_{n+1} = 1/x_n (n = 0, 1, 2, ...)
where it is assumed that each xn ≠ 0.
(a) Draw a cobweb diagram for the solution which satisfies the initial
condition x0 = 2. Check your answers by iteration, directly from the
difference equation.
(b) Find all the numbers a ≠ 0 for which the initial condition x0 = a
determines a constant solution.
(c) Deduce that all non-constant solutions are 2-cycles.

9. Consider the difference equation

xn+1 = g(xn)

where g is one of the two functions shown below. By experimenting with cobweb
diagrams corresponding to various initial conditions find:
(a) a solution of period 2 when g is given by the graph (i),
(b) a solution of period 3 when g is given by the graph (ii).

[Figure: graphs (i) and (ii) of y = g(x).]
8
Linear difference equations in finance and
economics

This chapter begins with a study of linear difference equations, which
are usually easier to solve in closed form than non-linear ones. A closed-form
solution is found for a special type of linear difference equation
that arises frequently in problems from finance and economics.
The applications to finance concern interest on loans and the repayment
of debt. Hence they should prove just as useful to those who have
money and wish to invest it as to those who don't have money and wish
to borrow it.
Economists set up mathematical models with which they hope to
predict movements in price, interest rates, level of unemployment, and so
on. Although some of the economic concepts involved in these models
are rather abstract, we have a rough idea of their meaning from everyday
usage and this is all that is needed for the models studied in this chapter.
Economic models contain assumptions about the interdependence of
the various quantities of interest to economists. A model is regarded as
successful if these assumptions lead to reliable forecasts. The quantities
studied in economics cannot always be measured with the same accuracy
as those studied in the physical sciences and hence economists are
more concerned with qualitative predictions: whether the prices will
go up or down or whether they will oscillate about some equilibrium
value.
Economists often simplify their mathematical models by assuming the
functions involved are linear, in the sense of having straight-line graphs.
This is particularly appropriate when the quantities involved stay near
equilibrium values, because a sufficiently small piece of a smooth curve
looks like a line segment. A quantity Y is said to be a linear function of
another quantity X if Y = mX + c, where m and c are constants. Y
is said to be directly proportional to X if Y = mX.


Table 8.1.1. Examples of first-order difference equations and their classifications.

Difference equation     Linear   Homogeneous   Constant coefficient
y_{k+1} = 3y_k + 2      Yes      No            Yes
y_{k+1} = 3y_k          Yes      Yes           Yes
y_{k+1} = (y_k)^2       No       -             -
y_{k+1} = k^2 y_k       Yes      Yes           No

8.1 Linearity
The difference equations arising in this chapter are mainly first-order
linear ones, which have the form

y_{k+1} = a y_k + b (k = 0, 1, 2, ...) (1)

where a and b are given functions of k. In searching for closed-form
solutions later, we shall suppose that both a and b are constants.
The difference equation is said to be homogeneous if b is the constant 0
and to be constant coefficient if a is a constant. The homogenized equation
for (1) is the difference equation

x_{k+1} = a x_k (k = 0, 1, 2, ...) (2)

which is obtained from (1) by putting b = 0. To avoid confusing
solutions of the two equations, moreover, we have also replaced `y' by
`x'. Table 8.1.1 illustrates these definitions.
The solutions of linear difference equations obey superposition theo-
rems, analogous to those presented in Section 5.3 for differential equa-
tions, which show how solutions may be combined to form new ones.
Such theorems lead to a useful procedure for solving linear difference
equations in closed form. The procedure will be illustrated by an example.

Example 1. Find the solution of the first-order linear difference equation

y_{k+1} = 2y_k + 3 (k = 0, 1, 2, ...) (3)

which satisfies the initial condition y0 = 2.

Solution.

STEP 1: Write down and solve the homogenized equation for (3). The homogenized
equation is

x_{k+1} = 2x_k (k = 0, 1, 2, ...) (4)

Iteration suggests that its solutions are given by the formula

x_k = 2^k x_0 (5)

where the initial value x0 can be chosen arbitrarily. Substitution of (5) back in (4)
shows that (4) is satisfied.

STEP 2: Guess one particular solution of the original equation (3). The RHS
of (3) involves only constants (apart from y_k). This suggests trying a constant
solution, in which y_{k+1} = y_k. The original equation (3) is then equivalent to

y_k = 2y_k + 3

and hence to

y_k = -3. (6)

Substitution of (6) back into (3) shows that (3) is satisfied.

STEP 3: Add the solutions (5) of the homogenized equation to the particular
solution (6) of the original equation; this gives all the solutions of the original
equation. Adding the solutions gives the formula

y_k = -3 + x_0 2^k (7)

for the solutions of the original difference equation (3), in terms of the parameter
x0.

STEP 4: Choose x0 to fit the given initial condition y0 = 2. Put k = 0 in (7) to
get y0 = -3 + x0 and hence x0 = 3 + y0 = 5. Thus the required solution is given by

y_k = -3 + 5 × 2^k (k = 0, 1, 2, ...).

Finally, it is a good idea to check that this formula satisfies both the initial
condition (put k = 0) and the original difference equation (substitute back in
(3)).
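The recommended check can be written out in two lines. The following worked verification is a sketch of that check for the solution found in Example 1, first at k = 0 and then by substituting into the right-hand side of (3):

```latex
\begin{align*}
y_0 &= -3 + 5 \times 2^{0} = 2, \\
2y_k + 3 &= 2\left(-3 + 5 \times 2^{k}\right) + 3
          = -3 + 5 \times 2^{k+1} = y_{k+1}.
\end{align*}
```

Both the initial condition and the difference equation are therefore satisfied.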

A general formula for the solutions of (1), which includes the solution
found in the above example as a special case, is given by the following
proposition.
Fig. 8.1.1. Cobweb diagram for a linear constant-coefficient difference equation:
exponential or linear growth depending on the value of a. The diagram shows the
lines y = ax + b and y = x.

Proposition 1 The linear first-order difference equation

y_{k+1} = a y_k + b,

with a and b constant, has its solutions given by the formula

y_k = a^k (y_0 - b/(1 - a)) + b/(1 - a)   if a ≠ 1,
y_k = y_0 + kb                            if a = 1,

where k = 0, 1, 2, ... and where the initial value y0 can be chosen as
desired.
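A quick numerical comparison is one way to gain confidence in the formula. The sketch below evaluates the closed form of Proposition 1 and compares it with direct iteration; the function names `closed_form` and `by_iteration` are hypothetical, and the test values are those of Example 1 (a = 2, b = 3, y0 = 2).

```python
def closed_form(a, b, y0, k):
    """Value of y_k from Proposition 1 for constant a and b."""
    if a == 1:
        return y0 + k * b                 # linear growth when a = 1
    fixed = b / (1 - a)                   # the constant solution of the equation
    return a**k * (y0 - fixed) + fixed

def by_iteration(a, b, y0, k):
    """Value of y_k obtained by applying y -> a*y + b exactly k times."""
    y = y0
    for _ in range(k):
        y = a * y + b
    return y

for k in range(6):
    print(k, closed_form(2, 3, 2, k), by_iteration(2, 3, 2, k))
```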

The derivation of this general formula in the case a ≠ 1 involves the
use of steps analogous to those used in Example 1 above. The derivation
is even simpler in the case a = 1. The details are set as exercises later. In
this regard, most of us would find it easier to remember the derivations
than to remember the formulae!
The solutions do exhibit an interesting feature, however, which is
worthy of comment. In the case a ≠ 1 the solution gives y_k as an
exponential function of k, but in the case a = 1 it is a linear function of
k. The reason for the anomalous behaviour when a = 1 is revealed by
the cobweb diagrams for the solutions in the two cases, shown in Figure
8.1.1.
Fig. 8.1.2. Convergence of solutions to a limit.

Fig. 8.1.3. Non-convergence of solutions (one panel is labelled a < -1).

As the parameter a approaches 1, the line y = ax + b swings around
and approaches a direction parallel to that of the line y = x; hence the
diagram on the left becomes more and more like the one on the right.
This gives rise to the solutions, for a = 1, which grow linearly rather
than exponentially.
In the applications given later, it will be important to understand the
long-term behaviour of the solutions given in Proposition 1. The possible
cases which can arise, when a ≠ 1, are illustrated in Figure 8.1.2 and
Figure 8.1.3.
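Since the panels of those figures are not reproduced here, the cases can be summarised in code. This is a sketch based on the formula in Proposition 1, in which the distance from the constant solution b/(1 - a) is multiplied by a at each step; it assumes y0 is not itself equal to b/(1 - a), and the function name and wording of the labels are illustrative.

```python
def long_term_behaviour(a: float) -> str:
    """Describe solutions of y_{k+1} = a*y_k + b (a != 1) that do not start at b/(1-a)."""
    if a == 1:
        raise ValueError("a = 1 is the linear-growth case, treated separately")
    if a == -1:
        return "2-cycle"                     # the deviation changes sign but not size
    if a == 0:
        return "reaches the fixed point after one step"
    trend = "converges" if abs(a) < 1 else "diverges"
    manner = "monotonically" if a > 0 else "with oscillation"
    return f"{trend} {manner}"

for a in (0.9, 1.1, -0.5, -2.0, -1.0):
    print(a, long_term_behaviour(a))
```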
Although this section has been concerned exclusively with first-order
linear difference equations, some of the ideas extend to second-order
difference equations of the type

y_{k+2} = a y_{k+1} + b y_k + c (k = 0, 1, 2, ...),
which are said to be second-order linear, to have constant coefficients if a
and b are constants, and to be homogeneous if c is the constant 0. Use of
appropriate superposition theorems makes it possible to find closed-form
solutions, at least when a, b and c are constant. An example will be set
in the exercises suggesting how this may be done.

Exercises 8.1
In the exercises which ask you to `classify' a difference equation, state the order of
the difference equation and whether it is linear, homogeneous, has constant coeffi-
cients.
1. In each case, classify the difference equation and state whether Proposition 1 in
the text is applicable. If so, use it to get the closed-form solution of the difference
equation satisfying the stated initial condition y0 = 3. Assume k = 0, 1, 2, ....
(a) y_{k+1} = 3y_k + 10 (b) y_{k+1} = 2(y_k)^3
(c) y_{k+1} = y_k + 10 (d) y_{k+2} = 2y_{k+1} + 4y_k
2. Classify each of the following difference equations and state whether Propo-
sition 1 in the text is applicable.
(a) Xk+2 = 2kXk (b) Nk+1 = 3Nk
(C) Zk+1 = cos(Zk) (d) Yk+2 = -6 Yk+1 + 7 Yk + 5
3. For the difference equation
y_{k+1} = 1.1 y_k (k = 0, 1, 2, ...)
(a) write down in closed form the solution satisfying the initial condition
y0 = 100,
(b) use your calculator to find the integer k such that
y_k < 200 < y_{k+1},
(c) state the last integer k for which yk is less than 200,
(d) state the first integer k for which yk exceeds 200.
(e) To which of the cases illustrated in Figure 8.1.2 does the above difference
equation belong?

4. For the difference equation

y_{k+1} = 0.9 y_k (k = 0, 1, 2, ...)
(a) write down in closed form the solution satisfying the initial condition
y0 = 0.01,
(b) use your calculator to find the integer k such that
y_{k+1} < 0.005 < y_k,
(c) state the last integer for which yk is greater than 0.005,
(d) state the first integer for which yk is smaller than 0.005.
(e) To which of the cases illustrated in Figure 8.1.2 does the above difference
equation correspond?

5. Find, in closed form, the solution of the difference equation

y_{k+1} = 3y_k + 2 (k = 0, 1, 2, ...)
which satisfies the initial condition y0 = 1. Use steps similar to those given in
Example 1 in the text. Check your answer.
6. Derive the solution given in Proposition 1 in the text in the case a ≠ 1. Use
steps similar to those given in Example 1 in the text. Check the answer.
7. Derive the solution given in Proposition 1 in the text in the case a = 1. Show
how the solution may be guessed from the result of applying iteration. Check
the answer.
8. Suppose that (1) and (2) in the text are satisfied. Show that if z_k denotes
y_k + x_k then
z_{k+1} = a z_k + b (k = 0, 1, 2, ...).
[This shows that adding a solution of the homogenized equation (2) to a solution
of the original equation (1) gives another solution of (1). This is the superposition
theorem on which the steps in Example 1 in the text are based.]
9. Consider the difference equation
y_{k+1} = (y_k)^2 - 2 (k = 0, 1, 2, ...)
and the `homogenized' version
x_{k+1} = (x_k)^2.
(a) Show that the original equation is satisfied by the constant solution
y_k = 2.
(b) Show that the `homogenized' equation is satisfied by the constant solution
x_k = 1.
(c) Is the sequence defined by putting z_k = y_k + x_k a solution of the original
difference equation? Does this contradict the superposition principle
proved in Exercise 8? Explain your answer.

10. This exercise illustrates how the methods of this section may be extended to
second-order difference equations. Consider the Fibonacci equation
y_{k+2} = y_{k+1} + y_k (k = 1, 2, 3, ...),
which is second-order linear homogeneous with constant coefficients. The solution
satisfying the initial conditions y1 = 1 and y2 = 1 will be found in closed form.
(a) Guessing solutions. The solutions for first-order equations suggest trying
exponential solutions, of the form
y_k = a^k (k = 1, 2, 3, ...)
where a is a constant. Show that this satisfies the Fibonacci equation
only if a satisfies the quadratic equation a^2 - a - 1 = 0.
[Note that this gives two solutions, y_k = (a_1)^k and y_k = (a_2)^k, where a_1
and a_2 are the roots of the quadratic equation.]
(b) Superposing solutions. Check that

y_k = c_1 (a_1)^k + c_2 (a_2)^k

is a solution of the Fibonacci equation where c_1 and c_2 are arbitrary real
constants.
(c) Fitting initial values. Hence deduce that the solution of the Fibonacci
equation satisfying the initial conditions y1 = 1 and y2 = 1 is given by

y_k = (1/√5) [((1 + √5)/2)^k - ((1 - √5)/2)^k] (k = 1, 2, 3, ...).

[Note: It is surprising that this closed-form solution involving square
roots produces only integers!]

11. Repeat Exercise 10 for the difference equation

y_{k+2} = y_{k+1} + 2y_k

with the initial conditions y0 = 1, y1 = 2.

12. Will the technique used in Exercise 10 work for the difference equation

y_{k+2} = y_{k+1} + (y_k)^2?

Give reasons.

8.2 Interest and loan repayment

A sum of money, when lent to a financial institution such as a bank,
earns interest. The initial sum deposited, called the principal, thereby
grows in value to an amount which is the sum of the principal and the
interest earned. This may be expressed by writing

{amount on deposit} = {principal} + {interest}.

The amount which the lender finally gets back depends, among other
things, on whether the money was lent at simple interest or at compound
interest. The annual rate at which interest is earned is expressed as a
percentage, say p% (or p% per annum). For compound interest, however,
there is also an interest conversion period, which is usually a year or some
fraction α of a year. How the interest is calculated and how the amount
on deposit grows will now be explained for each type of loan.

Simple interest
For loans of this type, the interest earned in any year is obtained by
applying the rate to the initial principal and thus it stays the same
throughout the duration of the loan.
Thus, for an annual interest rate of p%, the interest earned during any
one year is p% of the principal, that is,

{interest} = (p/100) × {principal}

and hence

{amount on deposit after k + 1 years} = {amount on deposit after k years} + (p/100) × {principal}.

To express this as a difference equation, let S_0 be the principal and for
each k let S_k be the amount into which it grows after k years, giving

S_{k+1} = S_k + (p/100) S_0 (k = 0, 1, 2, ...).

This is a first-order linear difference equation of the type studied in
Section 8.1 with a = 1 and b = (p/100)S_0. Hence, by Proposition 1 of
Section 8.1, the solution is given by the formula

S_k = (1 + kp/100) S_0 (1)

which is called the simple interest formula.
The significance of this formula is that it gives the amount S_k into which
the principal S_0 grows when it earns simple interest for k years at an annual
rate of p%.

Compound interest
This type of interest is more usual for loans made over longer periods.
Interest is added to the principal at regular intervals, called conversion
periods, and the new amount (rather than the principal) is used for
calculating the interest for the next conversion period. The fraction
of a year occupied by the conversion period is denoted by α so that
conversion periods of 1 month, 1 quarter, 6 months and 1 year are given
respectively by α = 1/12, α = 1/4, α = 1/2 and α = 1. Instead of saying that the
conversion period is 1 month, for example, we may say that the interest
is compounded monthly.
For an interest rate of p% and conversion period equal to a fraction
α of a year, the interest earned for the period is αp% of the amount on
deposit at the start of the period, that is,

{interest} = (αp/100) × {amount on deposit at the start of the conversion period}

and hence

{amount on deposit after k + 1 conversion periods} = {amount on deposit after k conversion periods} + (αp/100) × {amount on deposit after k conversion periods}.

To express this as a difference equation, for each k let S_k denote the
amount on deposit after k conversion periods, giving

S_{k+1} = S_k + (αp/100) S_k
        = (1 + αp/100) S_k (k = 0, 1, 2, ...).

This is a first-order linear homogeneous difference equation of the type
studied in Section 8.1 with a = 1 + αp/100. Hence the solution is given
by

S_k = S_0 (1 + αp/100)^k (2)

which is called the compound interest formula.
This formula gives the amount S_k into which the principal S_0 grows when
it earns compound interest for k conversion periods, each equal to a fraction
α of a year, at an interest rate of p%.
The formulas (1) and (2) show that the amount Sk increases as a linear
function of k when the interest is simple, but as an exponential function
of k when the interest is compound. The cobweb diagrams in Figure 8.1.1
illustrate the difference between these rates of growth, the exponential
growth being much greater than the linear one, at least in the long term.
When applying these formulae, recall that in (2) the letter k stands for
the number of conversion periods (rather than the number of years, as
in (1)). Thus to find the amount on deposit after 10 years at compound
interest with a monthly conversion period, take k = 120 in (2) (rather
than k = 10, as in equation (1)).
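The difference in the meaning of k is easy to get wrong, so a small numerical sketch may help. The functions below implement formulas (1) and (2) as stated; the principal of $1000 and the rate of 6% are hypothetical values chosen only for illustration, and `alpha` stands for the fraction of a year occupied by one conversion period.

```python
def simple_amount(S0, p, years):
    """Formula (1): amount after `years` years of simple interest at p% per annum."""
    return (1 + years * p / 100) * S0

def compound_amount(S0, p, alpha, periods):
    """Formula (2): amount after `periods` conversion periods, each a fraction alpha of a year."""
    return S0 * (1 + alpha * p / 100) ** periods

S0, p = 1000.0, 6.0                                    # hypothetical principal and rate
simple_10y = simple_amount(S0, p, 10)                  # k = 10 years
compound_10y = compound_amount(S0, p, 1 / 12, 120)     # k = 120 monthly periods

print(round(simple_10y, 2), round(compound_10y, 2))
```

Passing 10 instead of 120 as the number of periods in the compound case would give the amount after only ten months, which is the mistake the text warns against.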

Loan repayments
Arguments similar to those used above may be applied to the study of
loan repayments. The particular scheme considered here is the one normally
used for the repayment of housing loans and is called amortization.
Repayments are made at regular intervals, and usually in equal amounts,
to reduce the principal (the amount borrowed) and to pay interest on the
amount still owing.
It is supposed that compound interest at p% is charged on the outstanding
debt, with conversion period equal to the same fraction α of
the year as the period between repayments. Between payments, the debt
increases because of the interest charged on the debt still outstanding
after the last payment. Hence

{debt after k + 1 payments} = {debt after k payments} + {interest on this debt} - {payment}.

To write this as a difference equation, let the initial debt to be repaid be
D_0, for each k let the outstanding debt after the kth payment be D_k, and
let the payment made after each conversion period be R. Hence

D_{k+1} = D_k + (αp/100) D_k - R
        = (1 + αp/100) D_k - R.

This is a first-order linear difference equation of the type studied in
Section 8.1 with a = 1 + αp/100 and b = -R. Hence the solution is given
by Proposition 1 in Section 8.1 as

D_k = (1 + αp/100)^k (D_0 - 100R/(αp)) + 100R/(αp) (3)

which we shall call the loan repayment formula.

This formula gives the debt remaining on an initial debt D_0 after k payments
of R made at the end of conversion periods of length equal to a
fraction α of a year, the interest being compound interest at p%.
Note that if R = 0 this reverts to the compound interest formula (with
debt instead of credit).

Example 2. A loan of $10 000 is to be repaid according to the amortization scheme.
Calculate the monthly repayments needed to pay off the loan in five years if interest
is charged at 15% compounded monthly.

Solution. Let R be the required monthly payment. In the loan repayment formula
choose
α = 1/12 (since a month is 1/12 of a year)
p = 15 (the annual percentage rate)
D_0 = 10 000 (the initial debt)
k = 60 (since 5 years = 60 months)
and hence get
D_60 = 21071.81 - 88.57R.
But, for the loan to be repaid after 60 months, D_60 = 0. Hence R = 237.91 and so
the required monthly repayment is $237.91.
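The calculation in Example 2 can be reproduced with a short sketch of the loan repayment formula (3). The function names are hypothetical, and `payment_to_clear` is obtained by setting D_k = 0 in (3) and solving for R. Carried out without intermediate rounding, the payment comes to about $237.90, roughly a cent below the figure obtained above from the rounded coefficients 21071.81 and 88.57.

```python
def remaining_debt(D0, p, alpha, R, k):
    """Formula (3): debt left after k payments of R at p% per annum, period alpha years."""
    growth = 1 + alpha * p / 100
    level = 100 * R / (alpha * p)          # debt at which R exactly covers the interest
    return growth**k * (D0 - level) + level

def payment_to_clear(D0, p, alpha, k):
    """Payment R that makes the debt zero after exactly k payments (D_k = 0 in (3))."""
    growth_k = (1 + alpha * p / 100) ** k
    return D0 * growth_k * (alpha * p / 100) / (growth_k - 1)

R = payment_to_clear(10_000, 15, 1 / 12, 60)
print(round(R, 2))
print(round(remaining_debt(10_000, 15, 1 / 12, R, 60), 6))
```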

Difference equations are derived for interest and loan repayment prob-
lems in Goldberg (1958), pages 87-90. Technical details concerning a
wide variety of financial calculations are given in Ayres (1963).

Exercises 8.2
The simple and compound interest formulae (1) and (2) in the text can be used
wherever they are relevant.
1. Find p so that interest compounded annually at p% produces the same
amount at the end of a year as 12% compounded quarterly.
2. Suppose that compound interest is earned on a certain investment, with an
interest conversion period of six months. Calculate the interest rate p% necessary
to double the initial principal in four years.

3. (a) Tabulate the amount on deposit at the end of each year for five years,
for an initial principal of $1000 earning simple interest at an annual rate
of 12.5%.
(b) Calculate the number of years it would take for the principal to double
at this rate.

4. Repeat Exercise 3, but with a compound interest rate of 12.5% and an interest
conversion period of one year.
5. Repeat Exercise 3, but with a compound interest rate of 12.5% and an interest
conversion period of six months.
6. Two insurance companies offer investors insurance bonds earning compound
interest at identical rates of interest. One company takes 5% of the initial
principal as their up-front charge while the other company leaves the entire
principal invested but takes 5% of the final amount as its withdrawal fee.
Does either scheme yield a better return to the investor? Give reasons for your
answer. You may assume interest rates stay fixed.
7. (a) By making any necessary changes to the derivation given in the text of
the compound interest formula, set up a difference equation to describe
the following situation:
An initial sum of money So is deposited to earn compound interest at
a rate of p% and the conversion period is a fraction α of a year. At
the end of each conversion period a further sum So is deposited, to
give a total amount Sk at the end of the kth conversion period.
(b) Find in closed form the solution of the difference equation found in (a).

8. Using Exercise 7 solve the following problem.

A company deposits $2000 at the end of each quarter in a fund
which earns 10% interest compounded quarterly. Calculate the
amount in the fund 10 years after the initial deposit.

9. In the loan repayment formula in the text, find the value of the repayment R
which just keeps the debt at its initial level. What happens if R is less than this
value? greater than this value? Argue from the formula.
10. The average loan on an Australian home is often quoted to be $55 000.
Calculate the monthly repayment necessary to have the loan repaid after 25 years
if the interest rate is 7.5%. What is the total amount paid back on the loan?
11. Bernoulli's inequality states that, for each real number x ≥ -1 and each
positive integer n, (1 + x)^n ≥ 1 + nx. Use this inequality to prove that, for the same
principal and at the same rate of interest, compound interest always produces an
amount at least as great as that for simple interest.
12. Show that (1 + x/n)^n is an increasing function of n > 0 where x is a fixed
real number ≥ 0. Hence show that, for a given principal and rate of interest,
decreasing the conversion period can only increase the amount on deposit after
a specified time.
13. Discuss how you would modify some of the models from this chapter to
take account of inflation.

8.3 The cobweb model of supply and demand

Economics courses often begin with a discussion of the ideas of supply
and demand and their effect on prices. Indeed, even people without
formal training in economics have often heard of the `law of supply and
demand'. If pressed for an explanation of what it means they would
probably say something like `if the supply goes down or the demand
goes up, then prices will rise'. The cobweb model, to be discussed in this
section, is a refinement of this idea to include a time lag in the supply.
This leads to a difference equation for the price.
The assumptions of the cobweb model, listed below, are suggested
by the marketing of agricultural produce. Farmers, encouraged by high
prices for a crop one season, will plant a larger area with that crop in
anticipation of a similarly high price the following season. Thus supply
lags one season behind the price of that particular crop. This provides
the motivation for assumption (a) in our description below of the cobweb
model. The assumption (b) relates demand to price in the same season, as
would normally be expected. The assumption (c) is relevant to perishable
goods like fruit and vegetables: the price will adjust to clear the market
by making demand equal to supply.
The cobweb model. This model of supply and demand for a commodity
makes the following assumptions relative to consecutive time periods:

(a) The supply in period k (where k = 1, 2, 3, ...) is a linear function of
the price in the previous period k - 1, with the supply increasing when
the price increases.
(b) The demand in period k is a linear function of the price in period
k, with the demand decreasing when price increases.
(c) The market price is determined by the available supply, with the
transaction taking place at the price which makes the demand equal
to the supply.

In the above model, the functions have been assumed to be linear merely
for the sake of simplicity. Some economists use models in which the
functions may be non-linear.
To illustrate these assumptions we shall use the following notation for
the various quantities which are involved. For k = 0, 1, 2,... let

S_k = {number of units of the commodity supplied in the kth period},
D_k = {number of units of the commodity demanded in the kth period},
p_k = {price of a unit of the commodity in the kth period}.

Example 1. Suppose that the supply and demand of potatoes are related to the
price by the straight-line graphs shown in Figure 8.3.1. Verify that assumptions (a)
and (b) of the cobweb model are satisfied. Show that, if (c) is assumed also, then
the price satisfies the difference equation
p_k = -(1/2)p_{k-1} + 1 (k = 1, 2, 3, ...).
Fig. 8.3.1. Hypothetical supply and demand graphs for potatoes: supply S_k (kg)
against price p_{k-1} (dollars), and demand D_k (kg) against price p_k (dollars).

Solution. The assumption (a) is satisfied because the supply S_k is a linear function
of the price p_{k-1} and, from the graph, S_k increases when p_{k-1} increases. Similarly,
assumption (b) is satisfied because D_k is a linear function of p_k and D_k decreases
if p_k increases.
It is easy to write down explicit formulae for the linear functions implicit in the
graphs. In each case we read off the slope from the graph (`rise over run') and then
add a constant to the RHS of the formula to make the graph pass through one of
the end points. This gives
S_k = 500p_{k-1} + 500,
D_k = -1000p_k + 1500.
The assumption (c) translates to D_k = S_k and hence from the last two equations
p_k = -(1/2)p_{k-1} + 1 (k = 1, 2, 3, ...).
This is the required difference equation for the price.
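The market-clearing step can be traced season by season. The sketch below uses the supply and demand formulae from the solution above and solves D_k = S_k for the new price; the starting price of $1.00 and the name `next_price` are assumptions made for illustration.

```python
def next_price(p_prev):
    """Price that clears the market, given last season's price (Example 1 data)."""
    supply = 500 * p_prev + 500            # S_k depends on the previous price
    return (1500 - supply) / 1000          # solve -1000*p + 1500 = supply for p

prices = [1.00]                            # hypothetical initial price in dollars
for _ in range(8):
    prices.append(next_price(prices[-1]))

print([round(p, 4) for p in prices])
```

The printed prices alternate above and below the constant solution p = 2/3 of the difference equation while approaching it, which is the oscillating but convergent pattern expected when the coefficient is -1/2.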

A description of the cobweb model of supply and demand may be
found in Lipsey, Langley and Mahoney (1981), pages 149-152 (but
note that, in the graph on page 150, mathematicians would regard the
functions graphed as the inverses of the ones claimed). Lipsey et al. give
some figures concerning acres planted and prices obtained by growers in
the South Australian potato industry. Unfortunately, in this instance, the
cobweb model does not give good agreement with the results.
The difference equation for the cobweb model is derived in Archibald
and Lipsey (1973), pages 300-304, and in Goldberg (1958), pages 179-
182.

Exercises 8.3
1. Suppose that in Example 1 in the text the price of potatoes is initially 90
cents.
(a) Sketch a cobweb diagram and deduce how the price behaves in the long
term.
(b) Find the solution of the difference equation for the price in closed form
and use it to check your answer in part (a) concerning the long-term
behaviour of the price.

2. Repeat Example 1 in the text, but this time suppose the graphs of supply and
demand against price are as shown below. Derive the difference equation for the
price,
Pk = -2Pk-1 + 5/2 (k = 1, 2, 3, ...).

[Figure: two straight-line graphs, Sk (kg) against Pk-1 (dollars) and Dk (kg) against Pk (dollars).]

3. (a) Write the assumptions of the cobweb model as mathematical equations
using the symbols Sk, Dk, Pk defined in the text together with any other
notation you need to introduce.
(b) Hence derive a difference equation for the price of the form
Pk = -aPk-1 + b (k = 1, 2, 3, ...)
in which a and b are constants. Express a in terms of the notation you
have introduced in part (a) and explain why a > 0. What happens if
b ≤ 0?
(c) Describe the long-term behaviour of the price in each of the cases
(i) 0 < a < 1 (ii) a = 1 (iii) a > 1.
Relate these inequalities to the slopes of the graphs of Sk against Pk-1
and Dk against Pk.
8.4 National income: `acceleration models'
The national income of a country is intended to measure the total level
of economic activity within the country during a given period of time.
A term often used synonymously with national income is GDP (gross
domestic product). It provides a measure of living standards, dropping
during a recession and rising during a boom. For this reason there is
great interest in analyzing the factors which affect national income and
in setting up models to predict its values in the future. One of the criteria
used by economists to judge the effectiveness of these models is whether
they predict regular fluctuations in the national income, thereby reflecting
the business cycle.
The simple models studied here originated in 1939 with the American
economist P. A. Samuelson (who won the Nobel Prize for Economics in
1970). The basis for these models is the acceleration principle according to
which investment occurs not because of a high level of national income
but rather because of an increase in the level of national income. Such
an increase leads to a greater demand for goods and services, and hence
to a need for greater capacity in the factories to produce more goods.
Thus there is an increased opportunity for the investor.
The quantities to be used in the model, during the kth accounting
period, are as follows :

Yk = {national income}
Ck = {consumer expenditure}
Ik = {induced private investment}
Gk = {government expenditure} .

The consumer expenditure is the amount spent on consumer goods like
food, clothing and appliances and the induced private investment is the
amount invested in machinery, training programmes, etc.
The model will consist of four statements, expressed verbally, which
relate these four quantities. It will be left as an exercise to translate these
statements into mathematical equations involving the above symbols.
Model I for the national income consists of the four statements (a), (b),
(c) and (d) listed below. Statement (a) is called the accounting equation
and may be regarded as a definition of national income, while statement
(c) expresses the acceleration principle explained above.

(a) The national income is the sum of consumer expenditure, induced
private investment and government expenditure.
(b) Consumer expenditure in each period is directly proportional to the
national income in that period.
(c) Induced private investment in any period is directly proportional to
the increase in the national income for that period above the national
income for the preceding period.
(d) The government expenditure stays constant from one period to the
next.

These statements can be written as mathematical equations involving
the symbols introduced above for the various quantities. It can then be
shown that the national income satisfies a difference equation of the type
$$Y_k = aY_{k-1} + b \qquad (k = 1, 2, \ldots) \tag{1}$$

where a and b are constants with a > 1 and b < 0. The details are left
till later as a useful exercise. The difference equation (1) is first-order
linear constant coefficient of the type studied in Section 8.1 and hence its
solutions are given by
$$Y_k = a^k \left( Y_0 - \frac{b}{1-a} \right) + \frac{b}{1-a}. \tag{2}$$
Because a > 1, the formula (2) shows that Yk grows exponentially with
k (at least if Y0 > b/(1 - a)), but it does not oscillate. Hence Model I
does not predict the regular fluctuations in the national income which
we know occur in practice. To obtain more realistic predictions, some
modifications to the model are needed.
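The absence of oscillation in Model I can be checked numerically. The following Python sketch iterates equation (1) and compares the result with formula (2); the function names and the values a = 1.5, b = -10 and Y0 = 30 are assumptions chosen only to satisfy a > 1, b < 0 and Y0 > b/(1 - a).

```python
import math


def iterate_income(a, b, y0, steps):
    """Iterate Y_k = a * Y_{k-1} + b and return [Y_0, ..., Y_steps]."""
    incomes = [y0]
    for _ in range(steps):
        incomes.append(a * incomes[-1] + b)
    return incomes


def closed_form_income(a, b, y0, k):
    """Formula (2): Y_k = a**k * (Y_0 - b/(1-a)) + b/(1-a), valid for a != 1."""
    steady = b / (1 - a)
    return a**k * (y0 - steady) + steady


# Assumed illustrative values with a > 1 and b < 0; here b/(1 - a) = 20.
a, b, y0 = 1.5, -10.0, 30.0
incomes = iterate_income(a, b, y0, 10)

# The iterates agree with the closed-form solution at every step.
for k, y in enumerate(incomes):
    assert math.isclose(y, closed_form_income(a, b, y0, k))

growing = all(later > earlier for earlier, later in zip(incomes, incomes[1:]))
print(incomes[:4], "strictly increasing:", growing)
```

With these values every iterate exceeds the one before it, so the sequence grows without ever turning down.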
An obvious point at which to try to improve Model I is at the
interpretation (c) of the accelerator principle. Knowledge of the latest
increase in national income comes too late for investors to take advantage
of it. Hence it would be more realistic to introduce a time lag and to use
the increase from one period back. This leads to the following model for
the national income.
Model II consists of the statements (a), (b) and (d) from Model I together
with the modified statement (c') of the accelerator principle:
(c') Induced private investment in any period is directly proportional to
the increase in the national income for the previous period above
the national income for the period before that.
This model leads to a second-order difference equation for the national
income of the type
Yk + aYk-1 + bYk-2 = c (3)
where a, b and c are constants satisfying certain inequalities. An example
appears in the exercises which shows such an equation can have a solution
showing cyclic (periodic) behaviour. Hence, in Model II, our objection
to Model I has been removed.
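A short numerical sketch shows the kind of cyclic behaviour meant here. It uses the constants a = -1, b = 1, c = 100 and the initial values Y0 = 100, Y1 = 101 from Exercise 3(c) of Exercises 8.4; the function name is invented for this illustration.

```python
def iterate_second_order(a, b, c, y0, y1, last_index):
    """Iterate Y_k = c - a * Y_{k-1} - b * Y_{k-2} and return [Y_0, ..., Y_last_index]."""
    incomes = [y0, y1]
    while len(incomes) < last_index + 1:
        incomes.append(c - a * incomes[-1] - b * incomes[-2])
    return incomes


# Constants and initial conditions from Exercise 3(c).
incomes = iterate_second_order(-1, 1, 100, 100, 101, 12)
print(incomes)
# The first six values are 100, 101, 101, 100, 99, 99 and the pattern then repeats.
```

The values return to the starting pair (100, 101) after six periods, so this particular solution is periodic with period 6.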
Samuelson's models for the national income are described in Gandolfo
(1971), pages 63-73, Kenkel (1974), pages 241-259, and in Pfouts (1972),
pages 116-119. Kenkel gives some interesting comments as to how these
models may be made more realistic.

Exercises 8.4
1. (a) In Model I write each of the statements (a), (b), (c) and (d) in terms of
the symbols introduced in the text for the various quantities. (Use the
letters A and B for the constants of proportionality in statements (b)
and (c) respectively.)
(b) Deduce from your answers to part (a) of this exercise that the national
income satisfies the difference equation
$$(1 - A - B) Y_k = -B Y_{k-1} + G_0 \qquad (k = 1, 2, 3, \ldots).$$

2. Let A and B be as in Exercise 1 (b).

(a) Use the idea that consumer spending cannot exceed national income to
derive an inequality for the constant A.
(b) Use the idea that the investment to match a given increase in income must
be greater than that increase to deduce an inequality for the constant B.
(c) Hence show that the difference equation introduced in Exercise 1 (b) has
the form
$$Y_k = aY_{k-1} + b \qquad (k = 1, 2, 3, \ldots)$$
where a and b are constants satisfying the inequalities claimed in the
text.

3. (a) In Model II write each of the statements (a), (b), (c') and (d) in terms of
the symbols introduced in the text for the various quantities.
(b) From your answers to part (a) show that the national income satisfies a
difference equation of the form
$$Y_k + aY_{k-1} + bY_{k-2} = c \qquad (k = 2, 3, 4, \ldots)$$
where a, b and c are constants, as claimed in the text.
(c) In the case a = -1, b = 1, c = 100 show that the solution of this difference
equation satisfying the initial conditions Y0 = 100 and Y1 = 101 exhibits
cyclic (periodic) behaviour.

4. Consider the model for the national income consisting of statements (a), (c)
and (d) in the text together with the following assumption:
(b') Consumer spending in each period is directly proportional to
the national income in the preceding period.
Write down each statement in terms of the symbols introduced in the text for the
various quantities and hence find a difference equation for the national income.
5. Consider the model for the national income consisting of statements (a), (b)
and (d) in the text together with the following assumption :
(c') Induced private investment in any period is directly propor-
tional to the increase in consumer spending for that period
above the consumer spending for the preceding period.
Write down each statement in terms of the symbols introduced in the text for the
various quantities and hence find a difference equation for the national income.
9
Non-linear difference equations and
population growth

Linear difference equations have the advantage that a closed-form
solution can be easily obtained. But, in many cases, the behaviour of linear
difference equation models is not consistent with observation. This is
true in many areas of biology, and particularly in studies of populations,
where non-linear models are better.
In this chapter non-linear models are developed which describe how a
population of individuals grows over time. Difference equation models
are appropriate when a species has a distinct breeding season. The
simplest non-linear model, the `logistic equation' is studied in detail.
Similar ideas are also used to model a measles epidemic. This involves
iterating a pair of simultaneous difference equations for the number of
those who can infect others and for the number of those susceptible to
being infected.
Closed-form solutions usually cannot be found for non-linear dif-
ference equations. Thus to interpret the models one has to resort to
numerical simulation or devise approximate closed-form solutions to the
equations. Both approaches are developed here. The ideas rely on
concepts developed in Chapters 7 and 8.

9.1 Linear models for population growth

Many people are very interested in the way populations grow and in
determining what factors influence their growth. Knowledge of this kind
is important in studies of bacterial growth, wildlife management, ecology
and harvesting. In this section a very simple model is formulated for
a population which breeds at fixed time intervals. This model forms a
starting point for the development of more realistic models.
While some models for population growth are very simple, others can
be very sophisticated. A certain amount of insight into population growth
can be gained by first looking at simple models which incorporate the
most important features which affect population growth. Although simple
models may not be accurate they are still valuable. The mathematical
biologist, J. Maynard-Smith (1968), makes the following comments:

Any attempt to formulate the problem mathematically necessarily leaves
out many relevant factors. The attempt is nevertheless illuminating:
it provides a rapid way of discovering the kind of effect various fea-
tures may have on the behaviour of a population and it suggests what
needs to be measured before the behaviour of any particular species
can be understood.

Many animals tend to breed only during a short, well-defined, breeding
season. It is then natural to think of the population changing from
season to season. Thus time is measured discretely with positive inte-
gers denoting each breeding season. Hence the obvious approach for
describing the growth of such a population is to write down a suitable
difference equation. Later in this book, differential equations will be used
to study populations which breed continuously (for example, human
populations).

Cell division
In nature, species typically compete with other species for food and are
themselves sometimes preyed upon. Thus the populations of different
species interact with each other. In the laboratory, however, a given
species can be studied in isolation. We shall therefore concentrate, at
first, on models for a single species. The first example we shall study is
a population of yeast cells which reproduce by dividing into two. The
rate at which yeast cells divide is governed by environmental factors such
as nutrient availability and temperature. In the laboratory these factors
may be controlled so that the rate of dividing is constant.

Example 1. (Cell Division). Suppose a single cell divides every minute. Assuming
that none of the cells die, determine how many minutes it will take before there are
more than one million cells.
Solution. We measure time in integer multiples of one minute. Let Nk be the number
of cells after k minutes (where k is an integer), so that Nk+1 is the number of cells
one minute later. If each cell divides into two, the number of cells doubles every
minute. Hence we may write down the difference equation
$$N_{k+1} = 2N_k \quad \text{with } N_0 = 1. \tag{1}$$
This linear difference equation, which is of the type studied in Chapter 8, has the
solution
$$N_k = 2^k \tag{2}$$

obtained by direct iteration. Now we want k such that Nk = 10^6, so

$$2^k = 10^6. \tag{3}$$

To solve for k, take natural logarithms to get

$$\ln(2^k) = \ln(10^6),$$

which simplifies to
$$k = \frac{6 \ln(10)}{\ln(2)} \approx 19.93. \tag{4}$$
Thus in 19 minutes less than a million cells are produced and in 20 minutes more
than a million cells are produced. So 20 minutes is the required time.
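The answer can be cross-checked by doubling until the target is passed. The following Python sketch does this and also evaluates the logarithm formula (4); the function name minutes_to_exceed is an invention for this illustration.

```python
import math


def minutes_to_exceed(target, start=1):
    """Count the doublings needed before the number of cells exceeds target."""
    cells, minutes = start, 0
    while cells <= target:
        cells *= 2
        minutes += 1
    return minutes


# Value of k from formula (4), where 2**k equals one million exactly.
k_exact = 6 * math.log(10) / math.log(2)

print(minutes_to_exceed(10**6), round(k_exact, 2))
```

The whole-number answer is the exact value of k rounded up, since the population only changes at the end of each minute.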

Births and deaths

In the previous model for cell division the population was affected only
by two new members replacing an old member. We are also interested,
however, in setting up models for populations of higher organisms (for
example, insects, fish and mammals). In these populations (unlike those
of cells) a number of individuals survive for at least several breeding
seasons. To do this, we must take account of the births and the deaths
which occur between the start of one breeding season and that of the
next.
To make progress with the modelling it is expedient to make some
simplifying assumptions about the population. We assume that we are
dealing only with large populations. Thus we can treat the population
as a whole and we do not have to deal with individuals. We then assume
that the population growth is governed by the average behaviour of its
individual members. With this in mind we make the following additional
assumptions:
• Each member of the population produces the same number of offspring.
• Each member has an equal chance of dying (or surviving) before the
next breeding season.
• The ratio of females to males remains the same in each breeding
season.
We also assume:
• Age differences between members of the population can be ignored.
• The population is isolated; there is no immigration or emigration.
It should be clear that the first two assumptions are reasonable only
when dealing with large populations, where it is expected that differences
between individuals are not significant. In the exercises models are
formulated which relax the final two assumptions. We also note that, in
certain populations where the ratio of females to males is not roughly
the same in each season (e.g. many insect populations), it is more practical
to count only the females and completely ignore the males.
Suppose that on average each member of the population gives birth
to the same number of offspring, α, each season. The constant α is called
the per-capita birth rate for an individual of the population. It is also
possible to think of α as the probability that a given member of the
population gives birth to a single offspring during the breeding season.
We also define β as the probability that an individual will die before the
start of the next breeding season. We call β the per-capita death rate.
Thus
(a) the number of individuals born in a particular breeding season is
directly proportional to the population at the start of the breeding
season, and
(b) the number of individuals who have died during the interval between
the end of consecutive breeding seasons is directly proportional to
the population at the start of the breeding season.
If we let Nk denote the number of individuals of the population at the
start of the kth breeding season then
$$\{\text{number born in breeding season}\} = \alpha N_k \quad \text{and} \quad \{\text{number who die in breeding season}\} = \beta N_k. \tag{5}$$
Experimental measurements of per-capita birth and death rates are
usually expressed as average values. Because of this the value of Nk
obtained from an equation involving these quantities will usually not
be an integer. However, although Nk is not calculated as an integer it
may be interpreted as an integer in the model by rounding Nk to the
nearest integer. (We note that it is not always necessary to interpret the
population as an integer. A practical way of measuring large populations
is by counting the number of individuals in sample areas. This number,
which is measured in individuals per unit area, is called the population
density, and it will not usually be an integer.)

Example 2. In a species of animals a constant fraction of the population α = 5.3
are born each breeding season and a constant fraction β = 4.97 die. Formulate a
difference equation for the population and find the number of individuals after two
seasons given the initial number is N0 = 987.

Solution. Let Nk be the number of individuals in the current breeding season. In
the next breeding season
$$N_{k+1} = \{\text{current number of individuals}\} - \{\text{number who die}\} + \{\text{number born}\}. \tag{6}$$
Using (5), (6) becomes
$$N_{k+1} = N_k - \beta N_k + \alpha N_k,$$
that is,
$$N_{k+1} = (1 + \alpha - \beta) N_k. \tag{7}$$
Thus with α = 5.3, β = 4.97 and N0 = 987 we calculate
$$N_1 = 1312.71, \qquad N_2 = 1745.9043.$$

Thus we interpret the population after the first season as 1313 individuals and after
the second season as 1746 individuals.
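The arithmetic of Example 2 can be reproduced with a few lines of Python. This is a sketch only; the function and argument names are invented here, and the rounding to whole individuals follows the interpretation described in the text.

```python
def linear_population(n0, birth_rate, death_rate, seasons):
    """Iterate N_{k+1} = (1 + birth_rate - death_rate) * N_k."""
    factor = 1 + birth_rate - death_rate
    populations = [n0]
    for _ in range(seasons):
        populations.append(factor * populations[-1])
    return populations


# Values from Example 2: birth rate 5.3, death rate 4.97, N_0 = 987.
populations = linear_population(987, 5.3, 4.97, 2)
for k, n in enumerate(populations):
    # Show the computed value and its interpretation as a whole number.
    print(k, round(n, 4), "->", round(n))
```

Only the difference between the two rates enters the calculation, since the multiplying factor is 1 plus the birth rate minus the death rate.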

The difference equation (7) is of the linear type studied in the previous
chapter. Its closed form solution, given the initial number No, is

$$N_k = (1 + \alpha - \beta)^k N_0 \qquad (k = 0, 1, 2, \ldots).$$

The model depends on the combination

$$r = \alpha - \beta. \tag{8}$$

This quantity is called the growth rate. Clearly if r < 0 (corresponding

to the per-capita death rate exceeding the per-capita birth rate) then the
population decreases towards extinction but if r > 0 then the population
increases indefinitely. This is illustrated in Figure 9.1.1. This is not,
however, what is often observed in practice, as will be seen in the next
section where more realistic non-linear models are introduced.
[Figure 9.1.1: population plotted against k (breeding seasons).]

Fig. 9.1.1. Population growth as given by a linear difference equation with growth
rate r.

Exercises 9.1
1. How many cells will a single cell produce after 10 divisions?
2. Suppose that a single yeast cell divides every 2 minutes. Also suppose 75%
of yeast cells survive to divide in the next generation.
(a) What is the growth rate?
(b) How many yeast cells will there be after 3 hours?
3. A population of birds on an island has a constant per-capita birth rate α and
a constant per-capita death rate β (per individual per year). Also, a constant
number I of birds migrate to the island each year.
(a) What is the appropriate time period to be used here?
(b) Formulate a suitable difference equation.
(c) Obtain the closed-form solution given the initial number of individuals,
No.

4. In a certain type of female insect population all the adults die before the eggs
hatch. Each adult contributes a constant number of eggs b, of which a fraction
f survive and develop into adult females.
(a) Set up a difference equation for the population after the kth hatching.
(b) What is the minimum number of eggs that should be laid so that the
population does not become extinct, given that the fraction which survive is
20%?

5. Female houseflies produce approximately 120 eggs in one laying. Approximately
half these develop into females. If all survive, starting with one female,
what is the number of females after one year (seven layings)?
6. Modify the difference equation in the previous exercise given that all the flies
die before the next hatching. Will the population be greater or smaller than that
of the previous question given the same initial number?
7. A culture of bacteria grows from 2 x 106 cells to 3 x 101 cells in 2 hours.
From this information deduce the time between successive cell divisions.

9.2 Restricted growth: non-linear models

The simple linear difference equation of the previous section is not
generally suitable as a model of population growth since it predicts
unbounded growth as the population increases. This is not what is
observed in nature or in populations raised under controlled laboratory
conditions. Rather than reject the model outright we try to build into
it modifications so that it better approximates the observed behaviour.
This leads to models described by non-linear difference equations.

The carrying capacity

Laboratory studies have shown that as a population increases the per-
capita death rate goes up and the per-capita birth rate goes down.
This is due to overcrowding and competition for food. In Figure 9.2.1
populations of laboratory-raised beetles are plotted against time. In each
case there is a number which is representative of the number of beetles
that a given environment can support. This is known as the carrying
capacity of the environment. It corresponds to the number of individuals
in the population when the birth rate and death rate are equal. In
Figure 9.2.1(a) the population appears to be converging to a carrying
capacity of approximately 1100. Figure 9.2.1(b) illustrates a population
fluctuating above and below a carrying capacity of approximately 200.
In Figure 9.2.1(c) the population does not appear to follow any simple
pattern. We now introduce a non-linear model whose solutions exhibit
all the different types of behaviour observed in Figure 9.2.1, when a
parameter is varied.
Recall the linear difference equation of the previous section,
Nk+1 = Nk + rNk (1)
where the constant r is the growth rate (the per-capita birth rate minus
the per-capita death rate). To incorporate overcrowding, (1) is replaced
by
Nk+1 = Nk + R(Nk)Nk (2)
[Figure 9.2.1: three panels (a), (b) and (c), each showing a beetle population plotted against time.]

Fig. 9.2.1. Examples of populations of three different strains of stored product
beetle, from May (1976).

where R(Nk) is a growth rate which is a function of the size Nk of the
population. It must satisfy certain requirements imposed on the growth
of the population:
• Due to overcrowding, the per-capita death rate increases and the per-
capita birth rate decreases; so R(Nk) must decrease as Nk increases.
• When Nk = K, the carrying capacity, the growth rate is zero; so
R(K) = 0.
• As Nk → 0 the effects of overcrowding diminish and the growth rate
tends towards a constant value r, which we call the unrestricted growth
rate (measured per head of population per breeding season). Thus
R(0) = r.
The unrestricted growth rate is determined approximately by measuring
the growth of a population in a situation which is not affected by
overcrowding.

The discrete logistic equation

There are many choices of R(Nk), as a function of Nk, which satisfy the
above conditions. In the spirit of mathematical modelling we choose the
simplest such function. This has as its graph a straight line from (0, r)
to (K, 0) (see Figure 9.2.2). The slope of the line is -r/K and so the
equation of the line is
$$R(N_k) = r \left( 1 - \frac{N_k}{K} \right). \tag{3}$$

Substituting (3) into (2) shows that the population satisfies

$$N_{k+1} = N_k + r N_k \left( 1 - \frac{N_k}{K} \right). \tag{4}$$

This non-linear difference equation is called the discrete logistic equation.

[Figure 9.2.2: straight-line graph of R(Nk) against Nk.]

Fig. 9.2.2. Growth rate R(Nk), plotted against Nk, for the discrete logistic equation.

Because the discrete logistic equation is non-linear it is not possible to
use the general solution from Section 8.1. In fact a general closed-form
solution has not yet been discovered. However, it is still quite simple to
iterate numerically for given values of the parameters r (the unrestricted
growth rate), K (the carrying capacity) and No (the initial population),
as is done in the next section.
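As a minimal sketch of such an iteration, the following Python code applies equation (4) repeatedly. The function names are invented for this illustration, and the parameter values K = 1000, r = 0.5 and N0 = 200 are those of Exercise 1(a) in Exercises 9.2.

```python
def logistic_step(n, r, K):
    """One step of the discrete logistic equation (4)."""
    return n + r * n * (1 - n / K)


def iterate_logistic(n0, r, K, steps):
    """Return [N_0, N_1, ..., N_steps] for the discrete logistic equation."""
    populations = [n0]
    for _ in range(steps):
        populations.append(logistic_step(populations[-1], r, K))
    return populations


# Parameter values taken from Exercise 1(a) of Exercises 9.2.
populations = iterate_logistic(200, 0.5, 1000, 5)
print([round(n, 1) for n in populations])
```

Changing r, K or the starting value in the last call is all that is needed to try other cases.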

General features of the discrete logistic equation

Before iterating numerically it is useful to explore the equations to see if
some preliminary information can be extracted.
First we look for steady-state solutions. The steady-state solutions of
the discrete logistic equation are determined by setting Nk+1 = Nk = s, as
explained in Section 7.3. Substitution into the discrete logistic equation
(4) yields
$$s = s + r s \left( 1 - \frac{s}{K} \right)$$
or

$$s \left( 1 - \frac{s}{K} \right) = 0. \tag{5}$$

This equation has two solutions, s = 0 and s = K. The more interesting

of these is s = K since this gives the steady-state solution corresponding
to the carrying capacity of the environment. Thus, if the initial population
is given by No = K, the population will remain at that same value.
Second, by writing the logistic equation (4) in the form

$$N_{k+1} - N_k = r N_k \left( 1 - \frac{N_k}{K} \right) \tag{6}$$

we can discover some important features of the solutions. The LHS of
(6) is the change in the population between successive time periods. Thus
it follows that if Nk < K then the total population will increase in the
next time interval, since Nk+1 - Nk > 0. Conversely, if Nk > K then the
population decreases in the next time interval.
Note that (6) shows the change Nk+1 - Nk is directly proportional to
the unrestricted growth rate r. Hence we might expect the population
to increase steadily towards the carrying capacity when r is small, but
to oscillate above and below the carrying capacity when r is large. It is
interesting to observe that these two types of behaviour are exhibited by
the population of beetles in Figures 9.2.1 (a) and (b).
A third feature of the discrete logistic equation, that can be found
directly from the equation, is that it can produce negative values of the
population if r > 3. This is a major limitation of this model.
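One way to see where the threshold r = 3 comes from is to compare the largest value that the right-hand side of (4) can take with the value beyond which it becomes negative. The following is a sketch of the argument for r > 0; the symbol f is shorthand introduced here and does not appear elsewhere in the text.

```latex
% f(N) denotes the right-hand side of (4), so that N_{k+1} = f(N_k).
\begin{align*}
f(N) &= N + rN\left(1 - \frac{N}{K}\right) = N\left(1 + r - \frac{rN}{K}\right), \\
% For N > 0 the next value is negative exactly when the bracket is negative.
f(N) < 0 &\iff N > \frac{K(1+r)}{r} \qquad (N > 0), \\
% f is a downward parabola in N; its maximum is attained at N = K(1+r)/(2r).
\max_{N} f(N) &= f\!\left(\frac{K(1+r)}{2r}\right) = \frac{K(1+r)^2}{4r}, \\
% A positive population can be followed, one step later, by a value large
% enough to give a negative population only if the maximum exceeds the bound.
\frac{K(1+r)^2}{4r} > \frac{K(1+r)}{r} &\iff 1 + r > 4 \iff r > 3.
\end{align*}
```

For 0 < r ≤ 3 the same comparison shows that a population starting between 0 and K(1 + r)/r can never become negative.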

Other models
There are many examples of populations raised in laboratories which
have growth rates greater than 3. Thus the fact that the logistic equation
produces negative population for r > 3 means that this model is not
suitable for such populations.
We must not forget that the discrete logistic equation is based on an
assumption of simplicity, namely the straight-line form of R(Nk). In
practice the form of R(Nk) can be more complicated. There are many
choices that we can make for the form of R(Nk) based on experimental
evidence. One such model, which still has the advantage of being fairly
simple, is popular in the biological literature. In particular it has been
used for fish populations; for example, see Greenwell and Ng (1984).
This model uses an exponential curve, yielding a difference equation
$$N_{k+1} = N_k e^{a(1 - N_k/K)}, \tag{7}$$

where a is a constant and K is the carrying capacity.
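Because the exponential factor in (7) is always positive, a positive population remains positive under this model whatever the value of a. The following Python sketch illustrates this; the function names are invented here, the first run uses the values of Exercise 1(b) in Exercises 9.2, and a = 4 in the second run is an assumed, deliberately large value.

```python
import math


def exponential_step(n, a, K):
    """One step of equation (7): N_{k+1} = N_k * exp(a * (1 - N_k / K))."""
    return n * math.exp(a * (1 - n / K))


def iterate_exponential(n0, a, K, steps):
    """Return [N_0, N_1, ..., N_steps] for equation (7)."""
    populations = [n0]
    for _ in range(steps):
        populations.append(exponential_step(populations[-1], a, K))
    return populations


# K, N_0 and a = ln(1.5) are the values used in Exercise 1(b) of Exercises 9.2.
gentle = iterate_exponential(200, math.log(1.5), 1000, 5)
# a = 4 is an assumed, deliberately large value.
violent = iterate_exponential(200, 4.0, 1000, 50)

print([round(n, 1) for n in gentle])
print("smallest population with a = 4:", min(violent))
```

In the second run the population swings widely but its smallest value is still greater than zero.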

This model, along with others, is explored in the exercises in the same
way that we explored the discrete logistic equation in this section. In
each of these models the unrestricted growth rate r is defined as the limit
of R(Nk) as Nk → 0. The carrying capacity K, on the other hand, can
be regarded as the population for which the growth rate is zero; that
is, R(K) = 0. A discussion of some of these models can be found in
Chapter 3 of Edelstein-Keshet (1988).
Exercises 9.2

1. Use a calculator to find the population for the first five breeding seasons:
(a) using the discrete logistic equation with K = 1000, r = 0.5, No = 200;
(b) using equation (7) in the text with K = 1000, No = 200 and a = ln(1.5).

2. Define the variable Xk = Nk/K, where K is the carrying capacity. Note
that Xk is a dimensionless form of the population Nk, scaled with respect to the
carrying capacity K.
(a) Substitute into the discrete logistic equation and obtain
$$X_{k+1} = X_k + r X_k (1 - X_k).$$
This is the dimensionless form of the discrete logistic equation.
(b) What is the carrying capacity corresponding to this equation?

3. A model for insect populations (in which all adults are assumed to die before
next breeding) leads to the difference equation
$$N_{k+1} = \frac{\lambda N_k}{1 + a N_k}$$

where λ and a are positive constants.

(a) Write the equation in the form Nk+1 = Nk + R(Nk)Nk and hence identify
the growth rate.
(b) Show the general shape of the graph of R(Nk) as a function of Nk, on a
diagram similar to Figure 9.2.2.
(c) Express the unrestricted growth rate r and the carrying capacity K, for
this model, in terms of the parameters a and λ.
(d) Find the steady-state solutions of this model and state briefly their
biological significance.
(e) In this model, can the population ever switch from positive to negative
values?

4. Another model for restricted population growth is given by

$$N_{k+1} = N_k e^{a(1 - N_k/K)},$$
where K is the carrying capacity and a is a positive constant.
(a) Write the equation in the form Nk+l = Nk + R(Nk)Nk and identify the
variable growth rate R(Nk).
(b) Sketch R(Nk) in a diagram similar to Figure 9.2.2. Also sketch, with this,
the growth rate for the discrete logistic equation.
(c) Show that a = ln(r + 1) where r, the unrestricted growth rate, is defined
as the limit of R(Nk) as Nk → 0.

9.3 A computer experiment

In the absence of a closed-form solution to a non-linear difference equa-
tion, the next best thing is a numerical solution obtained from the
difference equation by iteration. This is performed for a range of values
of both the initial condition and any parameter in the difference equa-
tion. Any resulting changes in the way the population grows can then
be observed. In this way we hope to obtain an overview of all possible
types of behaviour which the equation predicts.
The process of carrying out numerical iterations for a range of param-
eter values is very tedious if done by hand. We recommend the use of a
programmable calculator or a personal computer. A personal computer
has the advantage that it can be programmed to display the results of
the iteration graphically. For those who are not confident at writing
programs the use of a `spreadsheet program' is suggested. A spreadsheet
program is used to manipulate rows and columns of data. It provides
a convenient environment for numerical experimentation since one can
change a parameter, recalculate the spreadsheet and view the graph with
just a few keystrokes.
In the following we carry out a computer experiment on the discrete
logistic equation (4) of Section 9.2,

$$N_{k+1} = N_k + r N_k \left( 1 - \frac{N_k}{K} \right). \tag{1}$$
The most interesting parameter to vary is r since an increase in r corre-
sponds to an increase in fertility of a typical individual. Here r is varied
in the range 0 < r ≤ 3 while K and N0 are held constant at K = 1000
and N0 = 100. (Note that for r > 3 the population can become negative
and the model ceases to apply.) The type of growth varies as r varies
and we have attempted to classify below a number of different modes of
growth: stable growth, cyclic growth and chaotic growth.
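For readers who prefer a short program to a spreadsheet, the experiment can be sketched in Python as follows. The function name, the choice of 200 iterations and the decision to print only the last four values are assumptions made for this illustration; the values of r are those used in the figures of this section.

```python
def iterate_logistic(n0, r, K, steps):
    """Iterate N_{k+1} = N_k + r * N_k * (1 - N_k / K)."""
    populations = [n0]
    for _ in range(steps):
        n = populations[-1]
        populations.append(n + r * n * (1 - n / K))
    return populations


K, N0 = 1000, 100
tails = {}
for r in (0.2, 0.8, 1.6, 1.9, 2.1, 2.4):
    # Keep only the last four of 200 iterates to see the long-term behaviour.
    tails[r] = iterate_logistic(N0, r, K, 200)[-4:]
    print(r, [round(n, 1) for n in tails[r]])
```

Values of r whose last four iterates all agree have settled to a steady state, while a pattern that repeats every second value indicates a 2-cycle.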

Stable growth (0 < r < 2)

In Figure 9.3.1 populations corresponding to low values of r, 0 < r < 1,
are plotted. The populations, in each case, tend towards a carrying capac-
ity K = 1000 as time increases. The population at first appears to grow
exponentially, as in Figure 9.2.1(a), but then the effect of overcrowding
becomes more pronounced as the population becomes larger, causing the
population to level out. A gently sloping curve results. For the larger
value of r the population growth initially is more rapid than for the

[Figure 9.3.1: two panels, r = 0.2 and r = 0.8, each plotting Nk (0 to 1500) against k (time intervals, 0 to 40).]

Fig. 9.3.1. Discrete logistic equation for r = 0.2 and r = 0.8. The population
increases then levels out to an equilibrium value.

[Figure 9.3.2: two panels, r = 1.6 and r = 1.9, each plotting Nk (0 to 1500) against k (time intervals, 0 to 40).]

Fig. 9.3.2. Discrete logistic equation with r = 1.6 and r = 1.9. Damped oscilla-
tions.

smaller value of r, as we would expect. However, overcrowding manifests
itself more rapidly and the population tends toward the carrying capacity
K = 1000 more quickly than for smaller values of r.
For r > 1 the character of the population growth changes as shown
in Figure 9.3.2. In the early stages the growth of the population is
so rapid that the population is able to overshoot the carrying capacity
before the overcrowding effect is felt. Thus, from Figure 9.2.2, R(Nk)
is negative, so the population decreases in the next time interval to
below the carrying capacity. Next the population
overshoots the carrying capacity again but this time it is closer to the
carrying capacity. A `damped oscillation' results, which converges to the
carrying capacity. Note, from Figure 9.3.2, that for r = 1.6 the maximum
amplitude of the oscillation is smaller compared with that of r = 1.9 and
the damping of the oscillation is greater.

[Figure 9.3.3: two panels, r = 2.1 and r = 2.4, each plotting Nk (0 to 1500) against k (time intervals, 0 to 40).]

Fig. 9.3.3. Discrete logistic equation with r = 2.1 and r = 2.4. Each case
demonstrates a 2-cycle.

Cyclic growth (2 < r < 2.57)

In Figure 9.3.3 populations corresponding to r in the range 2 < r <
2.57 are plotted. A new type of behaviour occurs. The population
tends towards an oscillation which is no longer damped but, at large
times, fluctuates periodically above and below the carrying capacity
K = 1000. In Figure 9.3.3 the populations oscillate, coming back every
second breeding season. We call this a 2-cycle. For r = 2.1 the population
oscillates between the values 824 and 1129 whereas for r = 2.4 the
population oscillates between 640 and 1193. (In Exercise 5 these values
are derived analytically.)
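The 2-cycle values quoted above can be checked against the closed-form expression stated in Exercise 5, s = ((2 + r) ± sqrt(r^2 - 4)) / (2r), scaled by K. The following is a minimal sketch under that assumption; the function names two_cycle_values and step are illustrative only.

```python
import math


def two_cycle_values(r, K=1000.0):
    """Populations visited by the 2-cycle, using the formula of Exercise 5."""
    root = math.sqrt(r * r - 4.0)  # real only for r >= 2
    low = K * ((2.0 + r) - root) / (2.0 * r)
    high = K * ((2.0 + r) + root) / (2.0 * r)
    return low, high


def step(n, r, K=1000.0):
    """One iteration of the discrete logistic equation."""
    return n + r * n * (1.0 - n / K)


if __name__ == "__main__":
    for r in (2.1, 2.4):
        low, high = two_cycle_values(r)
        print(f"r = {r}: 2-cycle between {low:.0f} and {high:.0f}; "
              f"one step from the lower value gives {step(low, r):.0f}")
```

Applying one step of the equation to the lower value should return the upper value, which is what it means for the pair to form a 2-cycle.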
As we increase r further to r = 2.5 the long-term behaviour of the
population changes (see Figure 9.3.4). Instead of repeating itself after
two time intervals, the population now repeats itself every fourth time
interval, with two values below the carrying capacity and two values
above. This is called a 4-cycle. Beyond r = 2.5, 4-cycles become 8-cycles
and then 16-cycles and so on up to r ≈ 2.57. This is called period
doubling.
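One rough way to observe period doubling numerically is to discard a long transient and then look for the smallest number of steps after which the population returns to the same value. The sketch below does this; the function name detect_period, the transient length, the tolerance and the cap on the period are all illustrative assumptions rather than values from the text.

```python
def step(n, r, K=1000.0):
    """One iteration of the discrete logistic equation."""
    return n + r * n * (1.0 - n / K)


def detect_period(r, K=1000.0, n0=100.0, transient=5000, max_period=64, tol=1e-6):
    """Smallest p <= max_period with |N_{k+p} - N_k| < tol*K after the transient.

    Returns None if no such p is found.
    """
    n = n0
    for _ in range(transient):
        n = step(n, r, K)
    start = n
    for p in range(1, max_period + 1):
        n = step(n, r, K)
        if abs(n - start) < tol * K:
            return p
    return None


if __name__ == "__main__":
    for r in (1.5, 2.1, 2.4, 2.5, 2.55, 2.6):
        print(f"r = {r}: detected period {detect_period(r)}")
```

A result of None only says that no short period was found within the chosen tolerance; it is consistent with chaotic behaviour but does not prove it.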
The American physicist Mitchell Feigenbaum noted values of the
parameter at which these cycles first appeared (although he used a
slightly different form of the discrete logistic equation; see Exercise 4).
He was thereby led to the discovery of a famous number 4.669... which
now bears his name.

Chaotic growth (2.57 < r < 3)

As we increase the value of r past 2.57 some remarkable behaviour is
noticed (see Figure 9.3.5). The pattern of the growth appears to be
random, even though the population is predicted by a simple difference

[Figure 9.3.4: single panel, r = 2.5, plotting Nk (0 to 1500) against k (time intervals, 0 to 40).]

Fig. 9.3.4. Discrete logistic equation with r = 2.5: a 4-cycle.

[Figure 9.3.5: two panels, r = 2.6 and r = 3, each plotting Nk (0 to 1500) against k (time intervals, 0 to 40).]

Fig. 9.3.5. Chaotic growth with r = 2.6 and r = 3.

equation, the discrete logistic equation. This random type of behaviour
is called chaotic. One of the first people to realize how such simple
models could lead to such complicated behaviour was Robert May, who
published several articles on this topic; see May (1975) and May (1976).
It should be noted, however, that in the regime 2.57 ≤ r ≤ 3 other
types of behaviour can occur. For example, for a small range of values
of r, 3-cycles occur, which then go through the same period-doubling
phenomenon as we saw with 2-cycles.
The discovery that apparently random patterns of growth could arise
from simple equations was very important in the study of populations.
Before this discovery, observations from the field which did not exhibit
a simple pattern (such as Figure 9.2.1(c)) were thought to be due to
external random environmental effects. It is now clear, however, that
random behaviour can be generated from within the system.
It is often difficult, in practice, to tell whether the behaviour is chaotic
or merely periodic with a very long period. One feature of chaotic
behaviour is that if the initial population is varied, ever so slightly, the
population at subsequent times can change dramatically from that with
the original value of N0.
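This sensitivity to the initial population can be explored with a few lines of code. The sketch below compares N100 for two nearby starting values; the function name population_after is illustrative, and r = 2.6 is used simply because it is one of the values shown in Figure 9.3.5.

```python
def population_after(r, n0, steps=100, K=1000.0):
    """Return N_steps for the discrete logistic equation started at n0."""
    n = n0
    for _ in range(steps):
        n = n + r * n * (1.0 - n / K)
    return n


if __name__ == "__main__":
    for r in (0.5, 2.6):
        a = population_after(r, 100.0)
        b = population_after(r, 101.0)
        print(f"r = {r}: N_100 from N_0 = 100 is {a:.3f}, from N_0 = 101 is {b:.3f}")
```

For small r both runs settle at the carrying capacity, so the two printed values agree; for values of r in the chaotic range they would generally be expected to differ noticeably.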

Discussion
We have now completed our numerical investigation of the effect of
varying the parameter r since, when r > 3, Nk becomes negative and
the model ceases to apply. Varying the parameter K does not provide
any useful insight since this corresponds merely to a change of scale in
the vertical axis. Varying the initial population N0 does not normally
affect the long-term behaviour of the population except when the
behaviour is chaotic. This is also investigated in the exercises.
Numerical experimentation has now become an accepted part of ap-
plied mathematics. The value of numerical experiment is illustrated by
the results of the computer experiments, for the discrete logistic equa-
tion model, in Figures 9.3.1-9.3.5 since they raise some very interesting
questions. For instance, we might ask how realistic the predictions of
our model are compared with the results shown in Figure 9.2.1 for beetle
populations. Also, what is the exact value of r for which the population
first executes oscillations (2-cycles)? But most importantly, the
results lead us to look for underlying causes of the oscillations and their
biological significance.
The key to understanding why the population oscillates depends on
two factors. Firstly, the population is self-regulating through the pop-
ulation-dependent growth rate. (Note that we found, in the previous
section from studying the difference equation, that population decreases
in the next time interval when it is greater than the carrying capacity
and increases when it is less than the carrying capacity.) Secondly, the
regulating effect is felt in the next time interval but actually determined
by the population in the current time interval. In other words there is a
natural delay in the population responding to overcrowding. As a result
when the ideal growth rate r is sufficiently high the population responds
by overcorrecting itself which leads to oscillations and sometimes chaos.
Note that for continuously breeding populations (described by differential
equations) the growth rate responds instantaneously to the population
and oscillations do not normally occur, except through external seasonal
factors.
One major limitation of the discrete logistic model is that it predicts
negative populations when r > 3, which is clearly unrealistic since many
real populations have growth rates exceeding 3. Despite this limitation the
model is still useful since it predicts the qualitative range of behaviour
as seen in Figure 9.2.1 and it does illustrate the importance of the
growth rate. These considerations are certainly important to population
biologists.
For general discussions on chaos see the books by Gleick (1987) and
Stewart (1990). A more technical, but still very readable reference, is
the book by Devaney (1986). Also the book by Tuck and De Mestre
(1991), which discusses how to use the programming language BASIC to
do computer experiments in population dynamics and ecology, is pitched
at an introductory level.

Exercises 9.3
1. (Computer Simulation). Use a computer to iterate numerically the equation
$$N_{k+1} = N_k e^{a(1 - N_k/K)},$$
where K is the carrying capacity and a is a positive constant. You are given
K = 1000, N0 = 100 and you should use your own choice of values of a. Describe
the types of behaviour which appear for different values of a. In particular:
(a) When do 2-cycles first appear?
(b) When do 2-cycles become 4-cycles?
(c) Does the model exhibit chaos?

2. (Computer Simulation). A model which has been used to analyse insect
populations is a modification of the one introduced in Exercises 9.2, question 3.
The model leads to the difference equation
$$N_{k+1} = \frac{\lambda N_k}{(1 + N_k)^b}$$
for the insect population Nk. Using a computer, sketch solutions for the following
values of the parameters:

                 λ       b
Moth             1.3     0.1
Mosquito         10.6    1.9
Potato Beetle    75.0    3.4

3. (Computer experiment). For the discrete logistic equation with K = 1000
examine N100 as you vary N0 slightly in each of the following cases:
(a) r = 0.5,
(b) r = 2.5,
(c) r = 3.
Start with N0 = 100 and then try N0 = 101 and N0 = 99. What happens for
each value of r?
4. The discrete logistic equation can be put into another form. The following
shows how to do this.
(a) Write the discrete logistic equation
$$N_{k+1} = N_k + r N_k \left(1 - \frac{N_k}{K}\right)$$
in the form
$$N_{k+1} = A N_k \left(1 - \frac{N_k}{K_1}\right)$$
and identify the constants A and K1 in terms of r and K.

(b) Make the substitution Xk = Nk/K1 and show that the above difference
equation becomes
Xk+1 = AXk(1 - Xk)
(c) Find all the steady-state solutions for the difference equation in (b).
(d) Draw a cobweb diagram for the difference equation in (b). Use a few
typical values of r.
[Note : The advantage of the first form of the discrete logistic equation is that the
parameters have a direct biological meaning, but the form involving Xk is better
to analyse mathematically because it has a simpler cobweb diagram.]
5. Consider the dimensionless discrete logistic equation
Xk+1 = Xk + rXk (1 - Xk )
(a) Show that every second term in the sequence Xo, X1, X2, X3, ... satisfies
the difference equation
$$X_{k+2} = (1 + 2r + r^2) X_k - (2r + 3r^2 + r^3) X_k^2 + (2r^2 + 2r^3) X_k^3 - r^3 X_k^4.$$
(b) Put Xk+2 = Xk = s and hence show that the steady-state solutions satisfy
a quartic equation.
(c) Explain why s = 0 and s = 1 must be solutions of this quartic equation.
Hence show that the other two real solutions are
$$s = \frac{(2 + r) \pm \sqrt{r^2 - 4}}{2r}, \qquad \text{if } r \ge 2.$$
[Note: This shows that 2-cycles first appear when r > 2. The two values of s
give the values that the 2-cycle fluctuates between (check this with the numerical
simulation in the text).]
9.4 A coupled model of a measles epidemic
In nature, populations of different species interact with each other. For
example, one species may be the food for another species or two species
may be in direct competition for the same food supply. Even populations
of a single species may be divided into several groups which interact
with each other. An example of this is the study of infectious diseases
where the population can be divided into several groups : those who
have recovered and those who are susceptible to catching the disease.
In this section, we illustrate this with a model which describes the
spread of measles amongst children. This leads to coupled difference
equations which have as their solution a pair of infinite sequences. The
approach taken here to solve the difference equations is that of numerical
investigation using a computer since we are unable to find a closed-form
solution.

Measles epidemics
Measles is a highly contagious disease, caused by a virus and spread
by effective contact between individuals. It tends to affect mainly chil-
dren. Epidemics of measles have been observed in Britain and the
United States roughly every two to three years. They occur more
frequently in developing countries. An important problem is to un-
derstand what factors affect the timing and severity of measles epi-
demics.
Let us now look at the duration of the disease for a single child. A
child who has not yet been exposed to measles is called a susceptible.
Immediately after the child first catches the disease there is a latent period
where the child is not contagious and does not exhibit any symptoms of
the disease. This is because the virus has not yet multiplied sufficiently.
The latent period lasts, on average, 5 to 7 days. After this the child
enters the contagious period. The child is now called an infective since it
is possible for another child who comes in contact with the infective to
catch the disease. This period lasts approximately one week. After this
time red spots appear on the skin of the child for a few days after which
the child recovers. During this period, and subsequently, the child is
immune to the disease and cannot be reinfected. Note that an individual
cannot become an infective until a week after they have been infected,
due to the latent period.

Formulating the model

For simplicity let us assume that both the latent period and contagious
period last one week. Furthermore, we also assume that all contact
between infectives and susceptibles occurs only on weekends, so that the
number of infectives and susceptibles remains constant over the rest of
the week.
There is then a time delay of one week from when a susceptible first
catches the disease until that person becomes an infective; and a further
delay of one week before the infective recovers from the disease. This
suggests a model in the form of a difference equation where time is
measured discretely in intervals of one week. We thus define
Ik = {number of contagious infectives present in the kth week},
Sk = {number of susceptibles present in the kth week}.
Next we need to write down equations which determine the number
of infectives and susceptibles one week later. The following numerical
example illustrates how to do this.

Example 1. Suppose there are 10 infectives and 1000 susceptibles present in the
kth week. Suppose that each infective infects two susceptibles. How many infectives
and susceptibles are there in the (k + 1)th week if we ignore births and deaths?

Solution. We have Sk = 1000 and Ik = 10. Each of the 10 infectives infects two
susceptibles, so Ik+1 = 2 × 10 = 20 and Sk+1 = 1000 - 20 = 980.

Note that none of the 10 original infectives is counted in the (k + 1)th week since
after one week they are no longer contagious.

To develop an equation for the number of infectives we consider the
number of infectives one week later, in the (k + 1)th week. Now
{number of infectives in (k + 1)th week} = {number of susceptibles who caught measles at beginning of kth week}
which we write as
Ik+1 = {number of susceptibles who caught measles at beginning of kth week}.    (1)
It is generally thought that the number of new births is an important
factor in measles epidemics. We also have to account for the susceptibles
who get infected. So
Sk+1 = {number of susceptibles in kth week} - {number of susceptibles who caught measles at beginning of kth week} + {number of births during kth week}.    (2)
It is further assumed that the number of births each week is a constant
B. To calculate the number of susceptibles infected in a week it is assumed
that a single infective infects a constant fraction f of the total number of
susceptibles. Thus fSk is the number of susceptibles infected by a single
infective so, with a total Ik infectives, then
{number who caught measles at beginning of kth week} = f Sk Ik.

Hence (1) and (2) become

$$\begin{aligned} I_{k+1} &= f S_k I_k, \\ S_{k+1} &= S_k - f S_k I_k + B, \end{aligned} \qquad (3)$$
where B and f are constant parameters of the model. Note that the
higher the value of f the more easily the disease is spread between
individuals.

Computer results
In an article, Anderson and May (1982) use this model to discuss measles
epidemics in a typical city in Britain and in Nigeria. They assume
initial values of infectives and susceptibles of I0 = 20 and S0 = 30000
respectively. They also choose f = 0.3 × 10^-4; it can be shown that this
value of f corresponds approximately to one infective infecting a single
susceptible during one week (with S0 = 30000 we have f S0 = 0.9). In their
article the number of new births in one week in the typical British city is
given as B = 120.
Numerical iteration of (3) with these values gives the results shown in
Figure 9.4.1. This shows that there is a dramatic increase in the number
of infectives every 130 weeks (roughly 2-3 years). This corresponds to
the epidemics observed every 2-3 years in Britain.
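The numerical iteration of the coupled equations (3) can be sketched as below. This is a minimal illustration using the parameter values quoted above, not the program behind Figure 9.4.1; the function name measles_orbit and its argument names are assumptions.

```python
def measles_orbit(f=0.3e-4, B=120.0, i0=20.0, s0=30000.0, weeks=300):
    """Iterate I_{k+1} = f*S_k*I_k and S_{k+1} = S_k - f*S_k*I_k + B.

    Returns two lists: infectives [I_0..I_weeks] and susceptibles [S_0..S_weeks].
    """
    infectives, susceptibles = [i0], [s0]
    for _ in range(weeks):
        i, s = infectives[-1], susceptibles[-1]
        new_cases = f * s * i  # susceptibles infected during this week
        infectives.append(new_cases)
        susceptibles.append(s - new_cases + B)
    return infectives, susceptibles


if __name__ == "__main__":
    I, S = measles_orbit()
    for k in range(0, 301, 25):
        print(f"week {k:3d}: I = {I[k]:10.1f}, S = {S[k]:10.1f}")
```

Calling measles_orbit(B=360.0) repeats the experiment with the higher birth rate discussed for a developing country.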
Now let us examine what happens when we change the parameter B
to 360 (three times its previous value) corresponding to the birth rate
of a developing country such as Nigeria. Here the birth rate is much
higher. The results are shown in Figure 9.4.2. In this case the model

[Figure 9.4.1: plots against k (weeks), 0 to 300.]

Fig. 9.4.1. Numerical simulation of a measles epidemic for a typical British city,
with B = 120 births per week.

predicts epidemics every year and the epidemics are much more severe.
This result is consistent with observation.

Steady-state solutions
Steady-state solutions are sometimes useful for understanding simple
models. Finding steady-state solutions for coupled difference equations is
the same, in principle, as for single difference equations. Now, however,
both the quantities Ik and Sk are to assume constant values I and S, say.
This leads to a pair of simultaneous equations for I and S. Care must be
taken that all solutions are found. It is also a good idea to verify results
by substitution back into the original equations.

Example 2. Find all steady-state solutions of (3).

[Figure 9.4.2: two panels plotted against k (weeks), 0 to 300, with vertical scales 0 to 40 000 and 0 to 2000.]

Fig. 9.4.2. Computer simulation of a measles epidemic in a typical city in Nigeria,
with B = 360 births per week.

Solution. Let us denote the steady-state number of infectives by I and the
steady-state number of susceptibles by S, where I and S are constants. Then,
Ik+1 = Ik = I and Sk+1 = Sk = S    (4)
are the steady-state solutions. Substituting (4) into (3) we obtain
I = f S I,
S = S - f S I + B,
or

I(1 - f S) = 0,    (5a)
f S I - B = 0.    (5b)

Our aim is to solve for I and S. From equation (5a) there are two cases to be
considered: I = 0 or S = 1/f.
Case (a): I = 0. Substituting I = 0 into (5b), to determine S, yields B = 0. But
this contradicts the fact that B is a positive constant. Thus, there is no solution of
both (5a) and (5b) corresponding to I = 0.
Case (b): S = 1/f. Substituting S = 1/f into (5b), to determine I, yields I = B.
Hence the pair S = 1/f, I = B is a solution of the coupled system (5a) and (5b),
which is easily verified by substitution.
One can also try to find solutions of (5b) first and then substitute into (5a) but
this does not yield any additional solutions.
Thus there is only one steady-state solution:

Ik = B, Sk = 1/f,
which corresponds to all of the newborns being infected.

Discussion
The first of the equations (3) shows that if Sk is below 1/f then
Ik+1 < Ik,
so Ik decreases as k increases. Now the second of the equations (3) shows
that for small Ik (after an epidemic is over), Sk increases due to births
until Sk eventually becomes greater than 1/f and the number of infectives
begins to increase again. This signifies the important part played by the
birth rate, as seen in the numerical simulations. Also one can easily see
the advantage in trying to keep the number of susceptibles down below
1/f, through vaccination, to prevent epidemics.
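Both the steady state I = B, S = 1/f and the threshold role of 1/f can be checked with a single step of the model. The sketch below assumes the British parameter values from the text (f = 0.3 × 10^-4, B = 120); the function names measles_step and threshold are illustrative.

```python
def measles_step(i, s, f, B):
    """One week of the model (3): returns (I_{k+1}, S_{k+1})."""
    new_cases = f * s * i
    return new_cases, s - new_cases + B


def threshold(f):
    """Number of susceptibles S = 1/f above which infectives increase."""
    return 1.0 / f


if __name__ == "__main__":
    f, B = 0.3e-4, 120.0
    s_star = threshold(f)

    # The steady state (I, S) = (B, 1/f) should map to itself, up to rounding.
    print("steady state after one week:", measles_step(B, s_star, f, B))

    # S0 = 30000 lies below the threshold, so the infectives initially fall.
    i_next, _ = measles_step(20.0, 30000.0, f, B)
    print(f"threshold 1/f = {s_star:.0f}; I0 = 20 gives I1 = {i_next:.1f}")
```

With these values the threshold is roughly 33 000 susceptibles, so the initial value S0 = 30000 starts just below it and the number of infectives falls until births push Sk above the threshold.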
A comprehensive discussion of this model is given in the article by
Anderson and May (1982). This article is certainly accessible to students.
Further models of epidemics will be discussed in Chapter 19 of this book.

Exercises 9.4
1. In some diseases infected individuals do not become immune but return to
the susceptible class. Ignoring births, put together an argument which gives
$$\begin{aligned} I_{k+1} &= f S_k I_k, \\ S_{k+1} &= S_k + I_k - f S_k I_k. \end{aligned}$$

2. Find all steady-state solutions of the difference equations in Exercise 1.

3. Deduce from the equations in Exercise 1 that
(a) Sk + Ik = M where M is a constant.
(b) Hence deduce that Ik+1 = f Ik(M - Ik). Find the steady-state solutions.
(c) Relate the equation in (b) to the discrete logistic equation.

4. Modify the measles model in the text given that a constant fraction γ of those
recovered are reinfected. You will need to introduce an additional variable.
5. Modify the measles model in the text given that a constant fraction δ of the
new births are vaccinated.
6. In the model in the text we assumed that the number of new infected
individuals per week was given by f IkSk. Another model assumes a constant
probability p of contact between two individuals selected at random.
(a) Given the probability p above, what is the probability that a given sus-
ceptible does not have contact with a single infected, picked at random?
Hence find the probability that a given susceptible does not have contact
with any of the Ik infected and hence argue that the expected number
of new infected individuals during week k is
$$S_k - S_k (1 - p)^{I_k}.$$
(b) Put e^{-γ} = 1 - p and show that the expected number of new infections is
given by
$$I_{k+1} = S_k \left[1 - e^{-\gamma I_k}\right].$$
Hold Sk constant and sketch a graph of the number of new infections
as Ik varies. Is this model better than the one used in the text when the
number of infectives is large?

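7. Using the result of Exercise 6 show that the modified model from Exercise 1
becomes
$$\begin{aligned} I_{k+1} &= S_k \left[1 - e^{-\gamma I_k}\right], \\ S_{k+1} &= I_k + S_k e^{-\gamma I_k}. \end{aligned}$$
Hence show that
$$I_{k+1} = (M - I_k) \left[1 - e^{-\gamma I_k}\right],$$
where M is a constant.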
8. (Computer Simulation). Modify the model in the text for a measles epidemic,
using the result of Exercise 6. Use γ = 0.3 × 10^-4. What differences do you
observe?
9. In a host-parasite system, a parasite searches for a host on which to deposit
its eggs. Define
Nk = {number of host species in kth breeding season}
Pk = {number of parasite species in kth breeding season}
f = {fraction of hosts not parasitized}
c = {average number of eggs laid by parasite which survive}
λ = {host growth rate, given that all adults die before their offspring can breed}.
Argue that Nk and Pk satisfy
$$N_{k+1} = \lambda f N_k \quad \text{and} \quad P_{k+1} = c N_k [1 - f].$$

10. (Computer experiment). The Nicholson-Bailey model is a model for
host-parasite systems (see previous question). It uses probability theory to argue that
f = e^{-γ Pk}. Numerically iterate for γ = 0.068, c = 1 and λ = 2. What happens?

9.5 Linearizing non-linear equations

In the previous two sections, non-linear difference equations have been
studied via numerical iteration. The reason for this choice of method
is that no closed-form solution is in general possible for non-linear
equations. However, it will be explained in this section that, if the initial
value is close to a fixed point, then it is possible to obtain a linear
difference equation which approximates the non-linear equation. The
linear equation can be solved exactly and some properties of the solution
can therefore be discussed for all values of the parameters.

Linearizing the discrete logistic equation

This technique of linearization about a fixed point is applicable to both
single and coupled non-linear difference equations. However, the details
of the method are simplest when applied to single equations and so for
this reason the linearization of the discrete logistic equation of Section
9.2 will be our principal example. The following question is addressed: if
a population is described by the discrete logistic equation

$$N_{k+1} = N_k + r N_k \left(1 - \frac{N_k}{K}\right) \qquad (1)$$

for what values of the growth rate r will the population tend to the carrying
capacity K if the initial population is close to K?
To answer this question the scaled variable

$$X_k = \frac{N_k}{K} \qquad (2)$$

is introduced, which is the ratio of the population to the carrying capacity.

Hence changes in the population which are small compared with the
carrying capacity correspond to changes in Xk which are small compared
with 1. This feature plays a crucial role in the linearization procedure.
In terms of the scaled variable, the discrete logistic equation becomes

Xk+1 = Xk + rXk (1 - Xk )

The fixed point of this equation corresponding to Nk = K is Xk = 1. The

equation can be studied close to this fixed point by introducing the small
variable Yk such that
$$X_k = \{\text{steady-state solution}\} + Y_k = 1 + Y_k. \qquad (3)$$
The new variable Yk has the advantage of being equal to 0 when Xk is
at the fixed point and being small when Xk is close to the fixed point.
Substituting (3) into the difference equation and rearranging gives
$$Y_{k+1} = (1 - r) Y_k - r (Y_k)^2. \qquad (4)$$

Although we cannot solve this difference equation in closed form, it
is easy to find a simple approximation to the RHS, valid when Yk is
small. To see this note that if Yk is small compared with 1 then Yk^2 is
very much smaller. For example, if Yk = 10^-3 then Yk^2 = 10^-6, so that
Yk^2 is not only small, but small in comparison with Yk. Hence, as a good
approximation, the term r(Yk)^2 in (4) can be ignored since it represents
only a small correction to the RHS. We are therefore left with the linear
difference equation

$$Y_{k+1} = (1 - r) Y_k. \qquad (5)$$

The solutions of the linear difference equation (5) are only approximate
solutions of (4) but they have the advantage that they can be found in
closed form.
The closed-form solution of equation (5) is
$$Y_k = (1 - r)^k Y_0,$$
where Y0 can be calculated from the initial population N0 using equations
(2) and (3). Hence an approximate solution to the original equation (1)
is given by
$$N_k \approx K + K (1 - r)^k Y_0. \qquad (6)$$

Now for |1 - r| < 1, the term (1 - r)^k tends to zero as k approaches
infinity. Thus the approximate solution converges to the steady-state
solution. This means that for 0 < r < 2 the approximate solution is
attracted to the steady-state, while for r > 2 the solution is repelled.
This is in precise agreement with numerical experiments on the original
non-linear difference equation, provided that the initial population N0 is
sufficiently close to the carrying capacity K.
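The agreement between the linearized solution (6) and the original equation (1) can be examined numerically. The following sketch compares the two for an initial population close to K; the function names and the starting value N0 = 990 are illustrative assumptions.

```python
def exact_orbit(r, K, n0, steps):
    """Iterate the non-linear equation (1) and return [N_0, ..., N_steps]."""
    orbit = [n0]
    for _ in range(steps):
        n = orbit[-1]
        orbit.append(n + r * n * (1.0 - n / K))
    return orbit


def linearized_orbit(r, K, n0, steps):
    """Approximation (6): N_k ~ K + K*(1 - r)**k * Y_0, with Y_0 = N_0/K - 1."""
    y0 = n0 / K - 1.0
    return [K + K * (1.0 - r) ** k * y0 for k in range(steps + 1)]


if __name__ == "__main__":
    K, n0, steps = 1000.0, 990.0, 6
    for r in (0.5, 1.5, 2.1):
        exact = exact_orbit(r, K, n0, steps)
        approx = linearized_orbit(r, K, n0, steps)
        print(f"r = {r}")
        for k in range(steps + 1):
            print(f"  k = {k}: exact {exact[k]:9.3f}   linear {approx[k]:9.3f}")
```

For r between 0 and 2 both sequences approach K; for r = 2.1 the linear approximation moves away from K, and it can only be trusted while the deviation from K remains small.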
The types of behaviour just discussed occur not only for the discrete
logistic equation, but for many other difference equations as well. In
general, given a non-linear difference equation, a steady-state solution s is
called an attractor, or is said to be stable, if all the approximate solutions
obtained from the linearized equation converge to the steady-state. If, on
the other hand, all these solutions (apart from the steady-state solution)
diverge away from the steady-state, then it is said to be a repellor, or is
said to be unstable.

Linearization by differentiation
The above discussion illustrates the main ideas involved in the lineariza-
tion of a difference equation. The method used, however, is only suitable
when the function on the RHS of the difference equation is a polynomial
of low degree. A more generally applicable technique uses a formula
involving the derivative to approximate the RHS. This new technique
will now be explained.
Consider a general first-order non-linear difference equation
$$X_{k+1} = g(X_k) \quad (k = 0, 1, \ldots) \qquad (7)$$
where the sequence members Xk are real numbers and g is a sufficiently
smooth function. Our interest is in the following problem: obtain an
approximate solution of the non-linear equation (7) if the initial value X0
is close to a steady-state solution Xk = s.
The first step is to introduce a new variable Yk such that
$$X_k = s + Y_k. \qquad (8)$$
The new variable Yk equals 0 when Xk is at the fixed point s and is small
when Xk is close to the fixed point. We shall assume that Yk stays small
enough to ensure that
Yk^2 is very small compared with Yk.    (9)

By (8) the equation (7) becomes
$$Y_{k+1} + s = g(s + Y_k). \qquad (10)$$
The function g(s + Yk) can be approximated as
$$g(s + Y_k) \approx g(s) + g'(s) Y_k. \qquad (11)$$

A theorem of calculus (the Taylor series expansion) tells us that the
magnitude of the term ignored on the RHS of (11) is less than a constant
times Yk^2. By the assumption (9), this term is very small relative to Yk
and so can be ignored while Yk stays small. The approximation (11)
has the graphical interpretation of replacing the function g(s + Yk) by its
tangent at the fixed point s. See Figure 9.5.1.
If we regard (11) as an exact equality (rather than as an approximate
one) and then substitute into (10) we get
Yk+1 + s = g(s) + g'(s)Yk
But
s=g(s),

[Figure 9.5.1: sketch showing the y-axis and the line y = x.]

Fig. 9.5.1. Linear approximation via tangent to graph.

since s is a fixed point, and so the linear difference equation
$$Y_{k+1} = g'(s) Y_k$$
is obtained. This difference equation is not exactly the same as (10),
but only approximately so. We nevertheless expect its solutions will stay
close to those of (10) provided Yk stays small.
The solution of this linear equation is
$$Y_k = (g'(s))^k Y_0$$
and hence, after substituting for Xk using (8), we see that the desired
approximate solution of the non-linear equation
$$X_{k+1} = g(X_k),$$
when X0 is close to the fixed point s, is
$$X_k = s + (g'(s))^k (X_0 - s). \qquad (12)$$
It is simple to use this approximate solution to determine under what
conditions the fixed point is an attractor (see Exercise 4). The method of
linearization by differentiation is summarized in Table 9.5.1.

Example 1. Find the non-zero fixed point of the difference equation

$$X_{n+1} = (X_n)^2$$
and determine if the fixed point is an attractor.

Table 9.5.1. Linearization by differentiation.

Aim: To obtain an approximate solution of the non-linear difference
equation
$$X_{k+1} = g(X_k)$$
when the initial value X0 is close to a fixed point s.
Method and solution: Introduce the small variable Yk by defining
$$X_k = s + Y_k.$$
The solution of the linear equation
$$Y_{k+1} = g'(s) Y_k,$$
when added to s, then approximates the solution of the original non-linear
equation.
Key approximation: The method relies on Yk being small enough to
ensure that
Yk^2 is very small compared with |Yk|.

Properties of the solution: If |g'(s)| < 1 then the fixed point is an attractor
and the solution converges to s, while if |g'(s)| > 1 then the fixed point
is a repellor and the solution does not converge to s.

Solution. The difference equation is of the form
$$X_{n+1} = g(X_n)$$
with
$$g(x) = x^2.$$
The fixed points s are given by the solutions to the equation
$$s = g(s) = s^2$$
and so are s = 0 and s = 1. Thus the non-zero fixed point is s = 1.
Now g'(x) = 2x, so
$$|g'(1)| = 2$$
and thus
$$|g'(s)| > 1.$$

Hence the fixed point is a repellor rather than an attractor.
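The test in Table 9.5.1 is easy to automate when the derivative is awkward to compute by hand. The sketch below estimates g'(s) with a central difference, which is a numerical stand-in for the exact derivative used in the text; the function names and the step size h are illustrative assumptions.

```python
def derivative(g, s, h=1e-6):
    """Central-difference estimate of g'(s)."""
    return (g(s + h) - g(s - h)) / (2.0 * h)


def classify_fixed_point(g, s):
    """Apply the test of Table 9.5.1 to a fixed point s of X_{k+1} = g(X_k)."""
    slope = abs(derivative(g, s))
    if slope < 1.0:
        return "attractor"
    if slope > 1.0:
        return "repellor"
    return "undetermined"  # |g'(s)| = 1: the linearization gives no verdict


if __name__ == "__main__":
    # Example 1: g(x) = x^2 at the fixed point s = 1.
    print(classify_fixed_point(lambda x: x * x, 1.0))

    # Scaled discrete logistic equation at s = 1 for two growth rates.
    for r in (1.5, 2.5):
        print(r, classify_fixed_point(lambda x, r=r: x + r * x * (1.0 - x), 1.0))
```

Because the derivative is only estimated, values of |g'(s)| very close to 1 should be treated with caution.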

A thorough discussion of the mathematics involved in linearizing a
difference equation is given in Devaney (1986).
Exercises 9.5
1. The approximation used in the linearization procedure (equation (11) in the
text) is
$$g(s + y) \approx g(s) + g'(s) y$$
where y is assumed small. Use your calculator to test the accuracy of this
approximation for
(a) g(x) = x2, s = 1, y = 0.1,
(b) g(x) = sin(x), s = 0, y = 0.1.
by comparing right and left hand sides of the approximation. Repeat (a) and (b)
for y = 0.05. Does the accuracy improve?
2. Graphically, the approximation
$$g(s + y) \approx g(s) + g'(s) y$$
corresponds to replacing the function y → g(s + y) by its tangent at the point s.
Tabulate, as a function of y,
$$g(1 + y) = (1 + y)^2$$
for y in the interval [-0.1,0.1] at spacings of 0.02 and plot the points on graph
paper. Now plot, as a function of y,
g(s) + g'(s)y = 1 + 2y
on the axes. Are the two graphs similar?
3. Use differentiation to linearize the discrete logistic equation, in scaled form,
$$X_{k+1} = X_k + r X_k (1 - X_k)$$
about the fixed point s = 1 and thus re-derive equation (5) in the text.
4. Use the approximate solution (12) in the text to deduce the `properties of the
solution' listed in Table 9.5.1: if $|g'(s)| < 1$ the fixed point is an attractor and the
sequence converges to $s$, while if $|g'(s)| > 1$ then the fixed point is a repellor and
the sequence will not converge to $s$.
5. (a) Show that $N_k = K$ is a fixed point of the population model described
by the difference equation
$$N_{k+1} = N_k e^{a(1 - N_k/K)}.$$
Introduce the scaled variable
$$X_k = N_k/K$$
to rewrite this difference equation.
(i) Use the method of linearization by differentiation to obtain the linearized
form of the scaled equation and thus determine an approximate
solution to the original equation.
(ii) An initial population of 20 000 is well described by the above difference
equation with $a = 1.5$ and $K = 19\,800$. Calculate the subsequent
evolution of the population in the next three time periods using the
approximate solution of (i). Does the population appear to be stable?
(In other words, is the fixed point an attractor?)
(iii) Show that the fixed point is an attractor if $0 < a < 2$.
10
Models for population genetics

In the previous chapter, difference equations were used to predict the
change in the total population of a species from generation to generation.
This leads naturally to the question of predicting the change in a
particular characteristic of the individuals in the population. Since the
heredity units which determine characteristics are the genes, this question
comes under the heading of population genetics.
The formulation of models in population genetics requires a knowledge
of the fundamentals of the theory of genetics. This chapter begins
by presenting the required theory, before formulating particular models
in terms of difference equations. These models differ from those obtained
in the previous two chapters in that they are based on some firmly
established laws. In this respect they are similar to the models obtained
in mechanics. The laws of genetics, however, are expressed as probabilities
and hence are only relevant, in practice, to populations which are
sufficiently large.
The models in population genetics give rise to non-linear difference
equations. This chapter assumes the elements of probability theory.

10.1 Some background genetics

An ability to predict the most probable characteristics of offspring from
knowledge of the characteristics of the parents is a skill of prime importance
to plant and animal breeders as well as to medical scientists. In
many cases this problem can be reduced to the study of difference equations,
but only after the fundamentals of the theory of genetics have
been learned. This section presents the necessary background theory.
As in many of the physical sciences, there are some basic principles
(or laws) by which the results of a large number of experiments can be
explained. In genetics, the laws are due to the monk, amateur scientist
and plant breeder Gregor Mendel, who performed a series of famous
experiments in the mid 1800s.

Alleles and genotypes

For our purposes, the main result from Mendel's experiments is that
some characteristics of plants and animals are determined by just two
genes. Fur colour and fur length in hamsters, eye colour and wing length
in flies, and the colour of flowers in many plants are a few examples of
characteristics controlled by pairs of genes. Two genes responsible for
the same characteristic, say eye colour, are called alleles of each other,
and the two genes are said to form a pair of alleles.
Different pairs of alleles, with different locations along the chromosome,
are responsible for different characteristics. Once a particular
characteristic has been selected, say the colour of the flower in a pea
plant, it is convenient to denote the pair of alleles by, say, the letters

A and a.

In any given individual in the species, the alleles can occur in just one of
the combinations

AA, Aa (which is equivalent to aA) or aa,

which are called genotypes. Each individual can therefore be classified as
being of a particular genotype. A given genotype determines a physical
characteristic of the individual.
For example, in pea plants there is a pair of alleles responsible for
flower colour consisting of A, which causes white flowers, and a, which
causes pink flowers. As to the genotypes (each of which contains two of
these genes), the genotypes AA will, of course, have white flowers and the
genotypes aa pink flowers. It turns out that the genotypes Aa have white
flowers. For this reason, the A-allele, which causes the white flowers, is
said to dominate the a-allele, which causes the pink flowers. We also say
the gene corresponding to a is recessive. In some cases, however, the Aa
genotype is different from both the AA genotype and the aa genotype,
so the concept of dominant and recessive genes has no meaning. It is
also noted that many physical characteristics are due to the expression
of several different genes. Here, however, we will assume that only one
gene is responsible, for the sake of simplicity.

Fig. 10.1.1. Genotypes of parents.

Fig. 10.1.2. Alleles from first and second parents.

Reproduction
Our aim is to determine how the proportions of the three genotypes in
a population vary with the time. To achieve this aim, it is necessary to
know how the genes are transmitted during reproduction. The only type
of reproduction we shall consider is the most common one in which the
male and female both contribute one gamete (an egg or a sperm) to the
offspring. The gamete contains only one of the alleles A or a, the allele
being chosen in accordance with the following law.

Mendel's first law. The allele in the gamete is chosen at random from
the two alleles in the genotype of the parent.

This law is sometimes called the law of segregation. The following
example illustrates the use of this law in predicting the genotypes of the
offspring.

Example 1. If an AA genotype mates with an Aa genotype, what are the possible
genotypes of the offspring and what are the probabilities of occurrence of each
genotype?

Solution. The genotypes of the parents are shown in Figure 10.1.1.

According to Mendel's first law, all the gametes of the first parent will contain
the allele A, while on average half the gametes of the second parent will contain
the allele A and the other half the allele a. These alleles are shown in Figure
10.1.2.
The possible combinations of these alleles, one from each parent, are
AA, Aa, AA, aA.

Fig. 10.1.3. Genotypes of a hypothetical population of six individuals.

Thus the possible genotypes of the offspring are AA and Aa, noting that aA is the
same as Aa. In this example, AA and Aa each occur with the same probability, 1/2.
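
The enumeration in Example 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `offspring_distribution` and the representation of genotypes as two-letter strings are assumptions, not notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(parent1, parent2):
    """Genotype probabilities for the offspring of two given genotypes.

    Each parent passes on one of its two alleles, chosen with equal
    probability (Mendel's first law). Genotypes are two-letter strings
    such as "AA", "Aa" or "aa".
    """
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that "aA" and "Aa" are counted as the same genotype.
        genotype = "".join(sorted(allele1 + allele2))
        counts[genotype] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Example 1: an AA parent mated with an Aa parent.
print(offspring_distribution("AA", "Aa"))
```

For the parents of Example 1 the four equally likely allele combinations collapse to the two genotypes AA and Aa, each with probability 1/2.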

The above example provides a simple illustration of how Mendel's law
can be used to predict the genotypes of the offspring. To apply this
law to more complicated examples, it is first desirable to introduce some
notation and formulae for the proportions of the different genotypes
and alleles in a population. This will provide an important step towards
our ultimate goal of modelling how these proportions change as the
population increases.

Proportions of genotypes and alleles

As an illustration of how the proportions of genotypes are calculated,
consider the population of six individuals shown in Figure 10.1.3.
There is one individual of genotype AA, three of genotype Aa and two
of genotype aa. Thus 1/6 of the population has genotype AA, 1/2 has
genotype Aa and 1/3 has genotype aa.
The general procedure for calculating the proportion of a population
with a given genotype is to divide the number of individuals with that
genotype by the total number of individuals in the population. Thus
$$G(AA) = \frac{\mathcal{N}(AA)}{N} \tag{1a}$$
where $G(AA)$ is the proportion with genotype AA, $\mathcal{N}(AA)$ is the number
with genotype AA, and $N$ is the total number in the population. Similar
formulae hold for the proportions with genotypes Aa and aa:
$$G(Aa) = \frac{\mathcal{N}(Aa)}{N}, \tag{1b}$$
$$G(aa) = \frac{\mathcal{N}(aa)}{N}. \tag{1c}$$
Note also that
$$N = \mathcal{N}(AA) + \mathcal{N}(Aa) + \mathcal{N}(aa).$$

Fig. 10.1.4. Gene pool for the population in Figure 10.1.3.

As an illustration, we apply these formulae to the population shown
above in Figure 10.1.3 for which

$$\mathcal{N}(AA) = 1, \quad \mathcal{N}(Aa) = 3 \quad \text{and} \quad \mathcal{N}(aa) = 2,$$

while $N = 6$. Thus the formulae (1) give

$$G(AA) = 1/6, \quad G(Aa) = 1/2 \quad \text{and} \quad G(aa) = 1/3,$$

in agreement with the results noted earlier.

Two quantities which will be needed to predict the genotypes of the
offspring are the proportions of the alleles A and a in the population.
To clarify the meaning of these quantities it is convenient to introduce
the concept of a gene pool to which each member of the population
contributes the two alleles in its genotype. Thus an individual of genotype
AA contributes two A-alleles, and so on.
When applied to the population in Figure 10.1.3, this procedure gives
the gene pool shown in Figure 10.1.4. This gene pool contains five
A-alleles and seven a-alleles. Hence we say that the proportions of the
alleles A and a in the population are 5/12 and 7/12 respectively.
For an arbitrary population let
$$P(A) = \{\text{proportion of $A$-alleles in the gene pool}\},$$
$$P(a) = \{\text{proportion of $a$-alleles in the gene pool}\}. \tag{2}$$
To express these proportions in terms of the numbers (1) of each genotype,
note that each individual contributes two alleles to the gene pool.
Hence the number of A-alleles in the gene pool is $2\mathcal{N}(AA) + \mathcal{N}(Aa)$
while the total number of genes is $2N$. This gives
$$P(A) = \frac{2\mathcal{N}(AA) + \mathcal{N}(Aa)}{2N}. \tag{3a}$$
Similarly
$$P(a) = \frac{2\mathcal{N}(aa) + \mathcal{N}(Aa)}{2N}. \tag{3b}$$
As an illustration, we apply the formulae (3) to the population in
Figure 10.1.3, for which $\mathcal{N}(AA) = 1$, $\mathcal{N}(Aa) = 3$ and $\mathcal{N}(aa) = 2$. This
gives
$$P(A) = \frac{2 \times 1 + 3}{2 \times 6} = \frac{5}{12}$$
and
$$P(a) = \frac{2 \times 2 + 3}{2 \times 6} = \frac{7}{12},$$
in agreement with the answers stated earlier.
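
The counting rules in formulae (1) and (3) translate directly into code. The sketch below is illustrative only: the function names and the choice of exact fractions are assumptions, not part of the text.

```python
from fractions import Fraction


def genotype_proportions(n_AA, n_Aa, n_aa):
    """Formulae (1): proportion of the population with each genotype."""
    total = n_AA + n_Aa + n_aa
    return (Fraction(n_AA, total), Fraction(n_Aa, total), Fraction(n_aa, total))


def allele_proportions(n_AA, n_Aa, n_aa):
    """Formulae (3): proportions of A- and a-alleles in the gene pool."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # two alleles per individual
    p_A = Fraction(2 * n_AA + n_Aa, total_alleles)
    p_a = Fraction(2 * n_aa + n_Aa, total_alleles)
    return p_A, p_a


# The six individuals of Figure 10.1.3: one AA, three Aa and two aa.
print(genotype_proportions(1, 3, 2))
print(allele_proportions(1, 3, 2))
```

For the population of Figure 10.1.3 this reproduces the genotype proportions 1/6, 1/2, 1/3 and the allele proportions 5/12 and 7/12 found above.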

Proportions as probabilities
The proportions of genotypes and alleles in the population can be regarded
as the probabilities of certain events. For example, since $G(AA)$
is the proportion of AA genotypes in the population, we may also write
$$G(AA) = \{\text{probability that an individual selected at random has genotype } AA\}. \tag{4}$$
Again, since $P(A)$ is the proportion of A-alleles in the gene pool, it is
also the probability that an allele selected at random from the gene pool
happens to be an A-allele. By Mendel's first law this implies that
$$P(A) = \{\text{probability that a gamete from a randomly selected individual in the population contains an $A$-allele}\}. \tag{5}$$
Interpretations similar to (4) or (5) also apply to $G(Aa)$, $G(aa)$ and $P(a)$.

During reproduction a gamete is needed from each of the male and
female parents. Provided that the gene A is not sex linked (contained on
the X or Y chromosome), the probability on the RHS of (5) will be the
same when calculated for male and female individuals separately. Hence
$P(A)$ gives this probability for each parent, as we assume from now on.
The formulae given above can be used to predict the probability of
a particular genotype occurring in the offspring in more complicated
examples than Example 1.

Example 2. Suppose that 40% of a population are of genotype AA, 40% are of
genotype Aa, and 20% of genotype aa. Find the probability that a gamete from a
randomly selected individual in the population contains an A-allele. Find also the
probability that it contains an a-allele.

Solution. Let $N$ be the total number of individuals in the population, so that
$$\mathcal{N}(AA) = 0.4N, \quad \mathcal{N}(Aa) = 0.4N, \quad \mathcal{N}(aa) = 0.2N.$$
From (3) the proportions of alleles are given by
$$P(A) = \frac{2 \times 0.4N + 0.4N}{2N} = 0.6,$$
and
$$P(a) = \frac{2 \times 0.2N + 0.4N}{2N} = 0.4.$$
Note that $P(A) + P(a) = 1$.

Example 3. For the problem in Example 2 state the possible genotypes of the
offspring and find the probability of each genotype occurring, given that the two
parents are chosen at random.

Solution. The possible genotypes for the offspring are AA, Aa and aa. For offspring
of genotype AA, each parent must contribute a gamete with the A-allele. Whether
this occurs for the male parent is independent of whether it occurs for the female,
the probability of occurrence being $P(A)$ in either case. Hence the probability of
both occurring, and so giving offspring of genotype AA, is the product
$$P(A)P(A) = [P(A)]^2 = 0.36.$$
Similarly, for offspring of genotype aa, the probability is
$$P(a)P(a) = [P(a)]^2 = 0.16.$$
Finally, offspring of genotype Aa may be obtained by either the male parent
contributing the A-allele and the female the a-allele, or vice versa. Hence the probability
of offspring having genotype Aa is
$$P(A)P(a) + P(a)P(A) = 2P(A)P(a) = 0.48.$$
Thus we have obtained the probabilities of all the possible genotypes.
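
Examples 2 and 3 can be chained together in a short Python sketch. The function names below are hypothetical, and exact fractions are used only so that the printed values are free of rounding; the allele proportions are computed from the genotype proportions by dividing formulae (3) of this section through by $N$.

```python
from fractions import Fraction


def allele_proportions_from_genotypes(g_AA, g_Aa, g_aa):
    """Allele proportions P(A), P(a) from genotype proportions."""
    p_A = g_AA + g_Aa / 2
    p_a = g_aa + g_Aa / 2
    return p_A, p_a


def offspring_probabilities(p_A, p_a):
    """Offspring genotype probabilities when both parents are chosen at random."""
    return {
        "AA": p_A * p_A,
        "Aa": 2 * p_A * p_a,  # A from the male and a from the female, or vice versa
        "aa": p_a * p_a,
    }


# Example 2: 40% AA, 40% Aa and 20% aa.
p_A, p_a = allele_proportions_from_genotypes(
    Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)
)
probs = offspring_probabilities(p_A, p_a)
print(p_A, p_a)
print(probs)
```

The output corresponds to $P(A) = 0.6$, $P(a) = 0.4$ and offspring probabilities 0.36, 0.48 and 0.16, which sum to 1.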
Once the offspring are born, the genotype composition of the population
alters and this must be taken into account in predicting the genotypes
of those born subsequently. Various assumptions which enable us to do
this will be considered in the following sections.
Finally, recall that the sum of probabilities of mutually exclusive events,
exhausting all possible cases, is 1. This leads to two useful identities:
$$G(AA) + G(Aa) + G(aa) = 1, \tag{6}$$
$$P(A) + P(a) = 1. \tag{7}$$
You should also check that these identities hold in the previous examples.
This section has covered all the background theory from genetics
needed for the task of predicting the changes in genotype proportions.
You may well be interested, however, in learning more about genetics.
If so, you should consult one of the more specialized books on genetics,
such as Hexter and Yost (1976) or Hartl (1980).

Exercises 10.1

1. Match each of the symbols on the left, which were introduced in the text,
with the appropriate description on the right:
(a) $G(AA)$              (a') total population
(b) $P(A)$               (b') number with genotype AA
(c) $\mathcal{N}(AA)$    (c') proportion of A-alleles
(d) $N$                  (d') proportion with genotype AA
Interpretations as probabilities were given in the text for $G(AA)$ and $P(A)$. Write
down similar interpretations as probabilities for $G(Aa)$, $G(aa)$ and $P(a)$.
2. Verify the identities stated in the text,
G(AA) + G(Aa) + G(aa) = 1 and P (A) + P (a) = 1,
by using the definitions (1) and (3) in the text.
3. A population consists of seven individuals : two of genotype AA, two of
genotype Aa and three of genotype aa.
(a) Draw a diagram like that in Figure 10.1.3 showing the genotypes and
then write down the proportions of each genotype in the population.
(b) Draw a diagram like that in Figure 10.1.4 showing the gene pool for the
population and then write down the proportions of each of the alleles A
and a in the gene pool.

4. A population consists entirely of genotypes Aa.
(a) What answers would you expect for the allele proportions $P(A)$ and
$P(a)$?
(b) Verify your answer to part (a) by using the formulae (3) in the text.
5. A population consists of 20% genotypes AA, 40% genotypes Aa, and 40%
genotypes aa. Calculate the allele proportions P (A) and P (a).
6. In humans there is a blood group known as the `MN group'. It is composed
of individuals with the genotypes MM, MN or NN, each genotype consisting of
a combination of the genes M and N.
(a) What are the possible genotypes of the children of an MM and an NN
parent? What is the probability of occurrence of each genotype?
(b) As in (a), but now suppose both parents are of genotype MN.
7. Suppose that, in a survey of people from the MN blood group, it was found
that 1/5 were of genotype MM, 1/5 were of genotype MN, and 3/5 of genotype
NN.
(a) Calculate the proportions of the genes M and N respectively, in the gene
pool.
(b) Give the probabilities for each of the possible genotypes occurring in
children of those surveyed, assuming the mating is random.
8. The colouring of fur in rabbits is determined by a single pair of genes, to be
denoted by A and a. The allele A is responsible for the pigmentation of the fur,
while the allele a is responsible for white fur. The allele A is dominant to the
allele a.
(a) What genotypes have coloured fur?
(b) Suppose that, in a population of rabbits, 25% of genotypes are AA, 50%
of genotypes are Aa, and 25% are aa. Calculate P (A) and P (a).
(c) Deduce the probability that a particular offspring is a white rabbit,
assuming the mating is random.
9. Show from appropriate formulae in the text that
$$P(A) = G(AA) + \tfrac{1}{2}G(Aa),$$
$$P(a) = \tfrac{1}{2}G(Aa) + G(aa).$$

10.2 Random mating with equal survival

This section begins by discussing some assumptions from which the likely
changes in the genotype proportions can be predicted over a period of
time. All the assumptions of the previous section will be retained. The
main objective is to analyse how the proportion of a recessive gene in
the gene pool changes over time. To make some progress we must make
some simplifying assumptions about the way in which genes may be
combined and also on how many of the various genotypes survive to
produce offspring.
Basic model I
In this section we will assume random mating between individuals:

(a) mating occurs at random: the choice of mate does not depend on
the mate's genotype.

This particular assumption is mentioned explicitly so as to emphasize
its role in this section. It is applicable whenever the different genotypes
within the population are either unaware of or indifferent to the genotypes
of their partners. For example, mating in humans is typically random
with respect to the genes responsible for blood group type, whereas it
may not be completely random with respect to the genes controlling
height. In the latter case there may be a bias towards tall people
marrying tall people, the choice of partner being thus influenced by
the partner's genotype. This modification of assumption (a) is dealt with
in Section 10.4.
Besides the random mating assumption, further modelling of the population
is necessary before it is possible to predict the changes in genotype
proportions. These models rely on the idea of a generation, which will
now be explained.
In certain populations, breeding occurs during short well-defined periods,
equally spaced. Crops sown by farmers, at the same time every year,
fall into this pattern as do many species of migratory animals. The role
of this assumption here, as in Chapter 9, is to yield difference equations
(rather than differential equations). Thus the population divides into
discrete generations. For each integer $k \geq 0$, the kth generation consists
of all the individuals born during the kth breeding period.
To simplify our models, moreover, it will be assumed that the generations
do not overlap and that different generations do not interbreed.
Each generation begins with the fertilized egg (or zygote) and ends
with reproduction to form the next generation, as illustrated in Figure
10.2.1. This assumption is certainly applicable for many animal and plant
species which only breed at a specific time and where all the adults die
after breeding. It is not, strictly speaking, correct for human populations.
It may be a useful approximation, however, if we take a generation to be
25 years and only bother to measure the population every generation.
The expected proportions of the different genotypes in the population
at the end of the kth generation will be denoted respectively by

$G_k(AA)$, $G_k(Aa)$ and $G_k(aa)$.

Similar notation will be used for the values of the other quantities
associated with the population, at that time. The first of our models will
now be described.

[Figure: two successive timelines, the kth generation followed by the (k + 1)th generation, each running from "Born" to "Mates".]
Fig. 10.2.1. Generations.

The other major assumptions, to be used in this section, are equal
survival and equal fertility. These assumptions are as follows.

(b) Equal survival: each genotype has the same chance of surviving from
the fertilized egg to the end of the generation, where mating occurs.
(c) Equal fertility: each couple produces on average, the same number
of viable sperm and eggs, regardless of the genotypes in the couple.

The equal survival assumption means that we can assume that a
constant fraction, $r$ say, of the number of each genotype of offspring
survives to the end of the generation. Later, in Section 10.3, we shall
see how to modify assumption (b) for when different genotypes have
different chances of surviving. We shall refer to the model encompassing
(a) random mating, (b) equal survival, and (c) equal fertility, as Model I
for population genetics.

Offspring genotype proportions from random mating

Our overall objective is to obtain a difference equation for one of the
allele proportions. First we need to determine how to get from the end
of one generation to the beginning of the next generation, under the
assumption of (a) random mating.
We introduce the notation $P_k(A)$ and $P_k(a)$ for the two allele proportions
at the end of the kth generation, and $G_k(AA)$, $G_k(Aa)$ and $G_k(aa)$
for the three genotype proportions, at the end of the kth generation.
Let us use an asterisk to denote quantities at the beginning of the
generation. Thus
$$G^*_{k+1}(AA), \quad G^*_{k+1}(Aa) \quad \text{and} \quad G^*_{k+1}(aa)$$
denote the various genotype proportions of the fertilized eggs, which give
rise to the (k + 1)th generation.

Example 1. Find an expression for $G^*_{k+1}(AA)$ in terms of the allele proportions
from the kth generation.
Solution. Recall that, by definition,
$$G^*_{k+1}(AA) = \{\text{probability that an individual at the beginning of the $(k+1)$th generation has genotype } AA\}.$$
Hence, by assumption (c), equal fertility,
$$G^*_{k+1}(AA) = \{\text{probability that an offspring from a given couple in the $k$th generation has genotype } AA\}.$$
This, in turn, is equal to the product of the separate probabilities that each parent
contributes an A-allele. Each of these latter probabilities is equal to the proportion
of A-alleles in the gene pool at the time of mating (by assumption (a)) and hence
to $P_k(A)$. Thus
$$G^*_{k+1}(AA) = P_k(A)P_k(A) = [P_k(A)]^2.$$

Similar results are easily obtained for $G^*_{k+1}(Aa)$ and $G^*_{k+1}(aa)$. These
are summarized as
$$\begin{aligned}
G^*_{k+1}(AA) &= [P_k(A)]^2, \\
G^*_{k+1}(Aa) &= 2P_k(A)P_k(a), \\
G^*_{k+1}(aa) &= [P_k(a)]^2.
\end{aligned} \tag{1}$$
These equations are always valid for random mating.
The British geneticist R.C. Punnet deduced the formulae (1) from a
diagram which is now called the Punnet square. In the Punnet square,
shown in Figure 10.2.2, the alleles present in the gametes of the male
parents and their expected proportions are arranged along the top of
the square, and those of the female along the side. The genotypes of
the offspring and their expected proportions are contained within the
square.
Thus the genotype proportions of the offspring, just after birth, can be
read from the Punnet square as $[P_k(A)]^2$ for the AA genotype, $[P_k(a)]^2$
for the aa genotype and $2P_k(A)P_k(a)$ for the Aa genotype. Note that the
cross terms in the Punnet square add to give the Aa genotype since Aa is
the same as aA.
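
                        |  Male gamete A, $P_k(A)$              |  Male gamete a, $P_k(a)$
------------------------+---------------------------------------+---------------------------------------
Female gamete A, $P_k(A)$ |  AA: $G^*_{k+1}(AA) = [P_k(A)]^2$     |  Aa: $P_k(A)P_k(a)$
Female gamete a, $P_k(a)$ |  aA: $P_k(a)P_k(A)$                   |  aa: $G^*_{k+1}(aa) = [P_k(a)]^2$

Fig. 10.2.2. The Punnet square.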

Obtaining difference equations

We now have enough relations between the various quantities to enable
us to derive a difference equation, giving say Pk+1(a) in terms of Pk(a).
Solving this difference equation will then tell us how the proportion of
the recessive gene a changes over time.
The procedure for deriving a difference equation for allele proportions
is essentially a two-step process. This is described below.

STEP 1: Using (1) and the fraction of each genotype surviving,
determine the numbers of each genotype at the end of the (k + 1)th
generation.
STEP 2: Now count up one of the allele proportions and eliminate
the other using $P_k(A) + P_k(a) = 1$.

The following example shows how to use these steps.

Example 2. Find a difference equation for the recessive gene a given that a constant
fraction r of each genotype of fertilized egg survives to the end of each generation.

Solution. We now carry out the two steps in detail for $P_{k+1}(a)$ in terms of $P_k(a)$.

STEP 1: Define $N^*_{k+1}$ as the total number of fertilized eggs giving rise to the
(k + 1)th generation and let $\mathcal{N}^*_{k+1}(AA)$, $\mathcal{N}^*_{k+1}(Aa)$ and $\mathcal{N}^*_{k+1}(aa)$ denote the number
of each genotype of the fertilized eggs. Now $\mathcal{N}^*_{k+1}(AA) = G^*_{k+1}(AA)N^*_{k+1}$, with
similar results for the other genotypes. Thus, from (1), valid for random mating,
$$\begin{aligned}
\mathcal{N}^*_{k+1}(AA) &= [P_k(A)]^2 N^*_{k+1}, \\
\mathcal{N}^*_{k+1}(Aa) &= 2P_k(A)P_k(a) N^*_{k+1}, \\
\mathcal{N}^*_{k+1}(aa) &= [P_k(a)]^2 N^*_{k+1}.
\end{aligned} \tag{2}$$
From assumption (b), equal survival, a fraction $r$ of each genotype of fertilized egg
survives to the end of the (k + 1)th generation, giving $\mathcal{N}_{k+1}(AA)$, $\mathcal{N}_{k+1}(Aa)$ and
$\mathcal{N}_{k+1}(aa)$. Thus
$$\begin{aligned}
\mathcal{N}_{k+1}(AA) &= r\mathcal{N}^*_{k+1}(AA) = r[P_k(A)]^2 N^*_{k+1}, \\
\mathcal{N}_{k+1}(Aa) &= r\mathcal{N}^*_{k+1}(Aa) = 2rP_k(A)P_k(a) N^*_{k+1}, \\
\mathcal{N}_{k+1}(aa) &= r\mathcal{N}^*_{k+1}(aa) = r[P_k(a)]^2 N^*_{k+1}.
\end{aligned} \tag{3}$$

STEP 2: The right-hand side of (2) contains the allele proportions $P_k(A)$ and $P_k(a)$
(measured at the end of the kth generation). To obtain a difference equation we need
to obtain the allele proportions at the end of the (k + 1)th generation. Now
$$P_{k+1}(a) = \frac{\{\text{number of $a$-alleles}\}}{\{\text{total number of alleles in gene pool}\}}
= \frac{2\mathcal{N}_{k+1}(aa) + \mathcal{N}_{k+1}(Aa)}{2\mathcal{N}_{k+1}(AA) + 2\mathcal{N}_{k+1}(Aa) + 2\mathcal{N}_{k+1}(aa)}.$$
Substituting from (3), we thus obtain
$$P_{k+1}(a) = \frac{[P_k(a)]^2 + P_k(A)P_k(a)}{[P_k(A)]^2 + 2P_k(A)P_k(a) + [P_k(a)]^2}.$$
To eliminate $P_k(A)$ we use the identity
$$P_k(A) + P_k(a) = 1$$
obtaining, after some straightforward algebra, the rather simple difference equation
$$P_{k+1}(a) = P_k(a). \tag{4}$$
Thus we have obtained a difference equation for $P_k(a)$ as required.
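
The algebra in the last step can be spelled out in two lines. This worked step is added for clarity and uses only the identity $P_k(A) + P_k(a) = 1$: the numerator of the quotient factorizes and the denominator is a perfect square.

$$\begin{aligned}
P_{k+1}(a) &= \frac{P_k(a)\,[P_k(a) + P_k(A)]}{[P_k(A) + P_k(a)]^2} \\
&= \frac{P_k(a) \cdot 1}{1^2} = P_k(a).
\end{aligned}$$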

This simple difference equation is a remarkable result. It states that
the a-allele proportion in the next generation is equal to the a-allele proportion
in the current generation and hence that the a-allele proportion
remains the same throughout all generations.
On reflection this result makes perfect sense in the case where each
couple has just two offspring, all of whom survive to the end of the
generation, since the gene pool always contains exactly the same genes.
In the more general case it is the relative numbers of genes which remain
the same since there is no mechanism built into the assumptions to
change this. Note that the result is independent of the survival fraction
r.
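
The independence from the survival fraction can be checked by following the two steps with actual numbers. The sketch below is illustrative only: the function name `next_a_proportion`, the parameter `n_eggs` and the trial values of $r$ are assumptions chosen for the demonstration.

```python
from fractions import Fraction


def next_a_proportion(p_a, r, n_eggs=1):
    """One generation of Model I: return P_{k+1}(a) given P_k(a).

    r is the common survival fraction and n_eggs the number of
    fertilized eggs; both cancel, which is the point of the check.
    """
    p_A = 1 - p_a
    # STEP 1: numbers of each genotype surviving to the end of the generation.
    n_AA = r * p_A ** 2 * n_eggs
    n_Aa = r * 2 * p_A * p_a * n_eggs
    n_aa = r * p_a ** 2 * n_eggs
    # STEP 2: count a-alleles and divide by the total number of alleles.
    return (2 * n_aa + n_Aa) / (2 * (n_AA + n_Aa + n_aa))


# Hypothetical starting proportion 3/10, tried with three survival fractions.
for r in (Fraction(1, 10), Fraction(1, 2), Fraction(1)):
    print(r, next_a_proportion(Fraction(3, 10), r, n_eggs=1000))
```

Whatever survival fraction is used, the function returns the proportion it was given, in line with the statement that the a-allele proportion does not change.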
Having established that under random mating and equal survival
the allele proportions do not change it is natural to ask whether the
genotype proportions change over time. The following example shows
how to determine this.

Example 3. Suppose that an initial generation of flowering plants consists of
50% white flowers (all of genotype AA) and 50% yellow flowers (all of genotype
aa). Calculate the expected genotype proportions in each subsequent generation,
assuming that Model I is valid for the flowering plants.
Solution. First we obtain the initial conditions for the difference equation. Let the
number of plants in the initial generation (for which $k = 0$) be $N_0$. The numbers
of the various genotypes in this generation are then given by
$$\mathcal{N}_0(AA) = \tfrac{1}{2}N_0, \quad \mathcal{N}_0(Aa) = 0, \quad \mathcal{N}_0(aa) = \tfrac{1}{2}N_0.$$
Hence, by the formulae (3) of Section 10.1,
$$P_0(a) = \tfrac{1}{2}. \tag{5}$$
The solutions of the difference equation (4) are all constant functions of $k$. From
the initial condition (5) it therefore follows that
$$P_k(a) = P_0(a) = \tfrac{1}{2} \quad (k = 0, 1, 2, \ldots).$$
Since $P_k(A) + P_k(a) = 1$, it follows that also
$$P_k(A) = \tfrac{1}{2} \quad (k = 0, 1, 2, \ldots).$$
Equations (1) now give the expected genotype proportions as constant functions of
$k$:
$$G^*_{k+1}(AA) = \tfrac{1}{4}, \quad G^*_{k+1}(Aa) = \tfrac{1}{2}, \quad G^*_{k+1}(aa) = \tfrac{1}{4} \quad (k = 0, 1, 2, \ldots).$$
From assumption (b), equal survival, the genotype proportions cannot change
during a generation, so it follows that
$$G_{k+1}(AA) = \tfrac{1}{4}, \quad G_{k+1}(Aa) = \tfrac{1}{2}, \quad G_{k+1}(aa) = \tfrac{1}{4} \quad (k = 0, 1, 2, \ldots).$$
Note that $G_{k+1}(AA) + G_{k+1}(Aa) + G_{k+1}(aa) = 1$.

In the above example, the expected genotype proportions remain the
same from the first generation onwards. This can be proved to occur
always with the assumptions of Model I and is known as the
Hardy-Weinberg law.
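
The behaviour described by the Hardy-Weinberg law can be seen by iterating Model I for a few generations. The following is a minimal sketch rather than a proof: the function name `next_genotype_proportions` is an assumption, and the starting values are those of Example 3.

```python
from fractions import Fraction


def next_genotype_proportions(g_AA, g_Aa, g_aa):
    """Genotype proportions one generation later under Model I."""
    p_A = g_AA + g_Aa / 2  # allele proportions at the end of the generation
    p_a = g_aa + g_Aa / 2
    # Random mating gives the proportions among the fertilized eggs;
    # equal survival leaves them unchanged during the generation.
    return p_A ** 2, 2 * p_A * p_a, p_a ** 2


# Example 3: half the initial generation is AA and half is aa.
proportions = (Fraction(1, 2), Fraction(0), Fraction(1, 2))
history = [proportions]
for _ in range(4):
    proportions = next_genotype_proportions(*proportions)
    history.append(proportions)

for k, (g_AA, g_Aa, g_aa) in enumerate(history):
    print(k, g_AA, g_Aa, g_aa)
```

The proportions jump to 1/4, 1/2, 1/4 in the first generation and are the same in every generation after that.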
There is an interesting historical anecdote associated with the discovery
of this law. At a meeting in the early 1900s to discuss developments in
genetics, R.C. Punnet (a famous geneticist who invented the Punnet
square) was presenting a talk on Mendel's theory of heredity. During
question time Punnet was asked to explain why, although the allele
responsible for brown eyes is dominant to the allele responsible for blue
eyes, there are still so many blue-eyed people in the population. Punnet
was not able to provide his audience with a convincing answer to this
question, but he did realize it had a mathematical answer. Each weekend
Punnet played in the same Cambridge University cricket team as the
mathematician G.H. Hardy, and so he related the problem during a
game. Hardy subsequently solved the problem and the result now bears
his name together with that of the German geneticist W. Weinberg, who
independently solved the problem at about the same time.

Exercises 10.2
1. In each of the following cases state whether the assumption is consistent with
the assumptions in the equal survival model (Model I). If the answer is `no', state
which assumption of Model I would be violated.
(a) Between birth and parenthood 10% of each of the genotypes AA, Aa, aa
die.
(b) Between birth and parenthood 5% of the genotype AA die, 10% of Aa
and 15% of aa die.
(c) The genotypes AA mate only with the genotypes AA or Aa.
(d) The genotypes AA have, on average, twice as many offspring as the
genotypes aa.

2. Suppose that a population satisfies the assumptions of Model I. Derive a
difference equation for the proportion of A-alleles.
3. Let $g_0$ be a number between 0 and 1 and suppose that in an initial generation
of some population
$$G_0(AA) = (g_0)^2, \quad G_0(Aa) = 2g_0(1 - g_0), \quad G_0(aa) = (1 - g_0)^2.$$
(a) Calculate $P_0(A)$ and $P_0(a)$.
(b) Assuming Model I is applicable, write down the allele proportions $P_k(A)$
and $P_k(a)$ for each $k \geq 0$.
(c) Use your answer to part (b), and the Punnet square, to calculate
$$G_k(AA), \quad G_k(Aa), \quad G_k(aa) \quad (k = 1, 2, 3, \ldots).$$
(d) The genotype proportions are said to have reached equilibrium when
there is no further change in their values with the passage of time. In
which generation do they reach equilibrium in the current exercise?

4. Suppose, as seems likely, that the simple random mating model is applicable
to members of the MN blood group in humans. Suppose, furthermore, that the
genotype proportions have already reached equilibrium values, as in Exercise
3. Given that 36% of the members of this blood group are of genotype MM,
calculate the expected percentages of genotypes MN and NN.

10.3 Lethal recessives, selection and mutation

In this section we look at two possible ways for the allele proportions
to change over time. The first is where the equal survival assumption
is violated. We will consider a lethal combination of genes, where each
individual with that combination dies before reaching maturity, and
we will also consider where different genotypes have different relative
chances of surviving to pass on their genes, an important situation from
the point of view of evolution. Finally we look at a second way in which
the allele proportions can change. Here the changes can occur because
the alleles can mutate from one form to another.

A lethal recessive gene model II

An obvious example where the equal survival assumption (b), from
Section 10.2, is not valid is where the recessive gene a is such that all of
the genotype aa die before they reach maturity. Such a gene is called a
lethal recessive. An example of this is the genetic disorder, cystic fibrosis.
Equivalently, a lethal recessive gene could result in all of the embryos
being aborted.
Another situation, equivalent to the existence of a lethal recessive
gene, is where animal breeders deliberately slaughter animals where the
recessives manifest a certain undesirable characteristic, so as to improve
their breed. This practice is known as culling.
Because the Aa genotype still carries the lethal gene it will not be im-
mediately lost from the gene pool. Thus we are interested in determining
how quickly the proportion of the lethal recessives Pk(a) diminishes.
Model II for population genetics consists of two of the assumptions of
the equal survival model (Model I), random mating (a), and also equal
fertility (c), but with equal survival (b) replaced by (b") below.

(b") A fraction r of the AA and Aa newborn genotypes survive to the
end of the generation but none of the aa genotypes survive.

Modelling this situation only requires a small modification to Step 1 of
the procedure used in Section 10.2. The following example explains how
to do this.

Example 1. Find a difference equation for $P_k(a)$ if aa is a lethal recessive
and a fraction r of the newborn genotypes AA and Aa survive to the end of
the generation.
Solution. We adopt the same notation as in Example 1 of Section 10.2.
STEP 1: Assuming random mating we can use equations (1) from Section 10.2.
Thus
$$\mathcal{N}_{k+1}(AA) = [P_k(A)]^2 N_{k+1},$$
$$\mathcal{N}_{k+1}(Aa) = 2P_k(A)P_k(a)N_{k+1}, \qquad (1)$$
$$\mathcal{N}_{k+1}(aa) = [P_k(a)]^2 N_{k+1}.$$
Given that a fraction r of newborn genotypes AA and Aa survive to the end
of the (k+1)th generation, and none of the aa genotype survives, the numbers
of survivors, written here as $\tilde{\mathcal{N}}_{k+1}$, are
$$\tilde{\mathcal{N}}_{k+1}(AA) = r\mathcal{N}_{k+1}(AA) = r[P_k(A)]^2 N_{k+1},$$
$$\tilde{\mathcal{N}}_{k+1}(Aa) = r\mathcal{N}_{k+1}(Aa) = 2rP_k(A)P_k(a)N_{k+1}, \qquad (2)$$
$$\tilde{\mathcal{N}}_{k+1}(aa) = 0.$$

STEP 2: Now
$$P_{k+1}(a) = \frac{\{\text{number of a-alleles}\}}{\{\text{total number of alleles in gene pool}\}} = \frac{2\tilde{\mathcal{N}}_{k+1}(aa) + \tilde{\mathcal{N}}_{k+1}(Aa)}{2\tilde{\mathcal{N}}_{k+1}(AA) + 2\tilde{\mathcal{N}}_{k+1}(Aa) + 2\tilde{\mathcal{N}}_{k+1}(aa)}.$$
Substituting from (2), we obtain
$$P_{k+1}(a) = \frac{P_k(A)P_k(a)}{[P_k(A)]^2 + 2P_k(A)P_k(a)}.$$
Finally, to eliminate $P_k(A)$, the identity
$$P_k(A) + P_k(a) = 1$$
is used and we obtain
$$P_{k+1}(a) = \frac{P_k(a)}{1 + P_k(a)}. \qquad (3)$$
For clarity we introduce the notation $X_k = P_k(a)$ and thus (3) can be written as
$$X_{k+1} = \frac{X_k}{1 + X_k}, \qquad (4)$$
which is the required difference equation.

The difference equation (4) is a non-linear difference equation so the
standard techniques developed in Chapter 8 are not directly applicable
to it. It is generally unlikely that a closed-form solution can be found
for a non-linear difference equation. We are fortunate in the case of
(4), however, that the solution can be easily guessed by iteration. The
following example explains how to do this.

Example 2. Solve the difference equation
$$X_{k+1} = \frac{X_k}{1 + X_k}$$
given an initial number $X_0$.

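Solution. Iteration gives
$$X_1 = \frac{X_0}{1 + X_0},$$
$$X_2 = \frac{X_1}{1 + X_1} = \frac{\dfrac{X_0}{1 + X_0}}{1 + \dfrac{X_0}{1 + X_0}} = \frac{X_0}{1 + 2X_0},$$
$$X_3 = \frac{X_2}{1 + X_2} = \frac{\dfrac{X_0}{1 + 2X_0}}{1 + \dfrac{X_0}{1 + 2X_0}} = \frac{X_0}{1 + 3X_0}.$$

This suggests that for every k
$$X_k = \frac{X_0}{1 + kX_0}. \qquad (5)$$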

To verify that this is the solution we check that it satisfies both the initial condition
and the difference equation. The initial condition is satisfied since putting k = 0 in
(5) gives $X_0$. To check the difference equation is satisfied note that when (5) is
used in (4)
$$\text{LHS} = X_{k+1} = \frac{X_0}{1 + (k+1)X_0},$$
$$\text{RHS} = \frac{X_k}{1 + X_k} = \frac{\dfrac{X_0}{1 + kX_0}}{1 + \dfrac{X_0}{1 + kX_0}} = \frac{X_0}{1 + (k+1)X_0}.$$

Thus the two sides are equal, as required.

The closed-form solution is thus
$$P_k(a) = X_k = \frac{X_0}{1 + kX_0},$$
where $X_0$ is the initial proportion of a-alleles. Suppose that
$P_0(A) = P_0(a) = 0.5$; then a simple calculation shows that it takes eight generations
for the recessive allele proportion to be 0.1, it takes 98 generations to be
0.01 and it takes 998 generations for it to be 0.001. The elimination of
the lethal recessive from the population is a fairly slow process. This is
because the lethal gene can be carried by the hybrid genotype Aa without
any ill effect.
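
The generation counts quoted above can be checked by direct iteration. The
following Python sketch is an illustration only: the function names are our
own, and exact rational arithmetic (the standard `fractions` module) is
assumed so that the comparison with the closed form is free of rounding error.

```python
from fractions import Fraction


def lethal_recessive_step(x):
    """One generation of equation (4): X_{k+1} = X_k / (1 + X_k)."""
    return x / (1 + x)


def generations_until(x0, target):
    """Smallest k with X_k <= target, found by iterating equation (4)."""
    x, k = x0, 0
    while x > target:
        x = lethal_recessive_step(x)
        k += 1
    return k


x0 = Fraction(1, 2)  # P_0(a) = 0.5, as in the text

# The iterates agree with the closed form X_k = X_0 / (1 + k X_0).
x = x0
for k in range(1, 6):
    x = lethal_recessive_step(x)
    assert x == x0 / (1 + k * x0)

for target in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(f"proportion {float(target)}: {generations_until(x0, target)} generations")
```

Starting from $X_0 = 0.5$ the closed form reduces to $X_k = 1/(k + 2)$, which
is why the three targets are reached after 8, 98 and 998 generations.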
Natural selection model III
The lethal recessive model is an extreme example where some genotypes
are favoured over others. In the theory of evolution by natural selection,
individuals with certain characteristics have a higher chance of surviving
infancy and thus mating and passing on their genes to the next generation.
This also violates the equal survival assumption (b) of Model I from
Section 10.2.
Model III for population genetics consists of two of the assumptions
in Model I of Section 10.2, the random mating assumption (a) and also
equal fertility (c), but with (b) replaced by (b") below.

(b") A fraction $r_1$ of the AA and Aa newborn genotypes survive to the
end of the generation and a different fraction $r_2$ of the newborn aa
genotypes survive to the end of the generation.

Note that we assume that both male and female are equally likely to
survive. To illustrate the new assumption we consider the frequently
studied example of the Peppered Moth Biston betularia. The dominant
forms, AA and Aa, are a coal black colour whereas the recessive form,
aa, is a pale speckled colour. For moths living in industrial cities the pale
speckled coloured moths are less well camouflaged and thus are more
likely to be eaten by predators. We thus say that the pale speckled moths
are at a selective disadvantage compared with the coal black moths in
industrial cities.
The details of setting up the model and deriving a difference equation
are left to the exercises, where it is shown that the proportion of a-alleles,
$X_k = P_k(a)$, satisfies the non-linear difference equation
$$X_{k+1} = \frac{(\beta - 1)X_k^2 + X_k}{1 + (\beta - 1)X_k^2} \qquad (6)$$
where β is given by β = $r_2/r_1$. The number β is called the relative fitness
of the genotype aa and measures the fitness to survive of the recessive
aa genotype relative to the AA and Aa genotypes. Note that the special
case β = 1 corresponds to the equal survival example from Section 10.2
(Model I) and the case β = 0 corresponds to the lethal recessive
gene model (Model II) of this section.
A numerical iteration of equation (6) has been carried out for various
values of the parameter β starting with an allele distribution of 90%
recessive and 10% dominant. The results are shown in Figure 10.3.1.
Note that as β decreases the recessive allele proportion tends to decrease
very rapidly within the first few generations. The decrease is more gradual
for higher values of β.

Fig. 10.3.1. Recessive allele proportions $P_k(a)$ for Model III, plotted
against the generation k, for various values of the parameter β
(β = 1, 0.9, 0.7 and 0.3), starting from $P_0(a) = 0.9$.
It has been estimated that the relative fitness for the pale speckled
Peppered Moth in Manchester, UK, compared with the coal black moth
is approximately β = 0.7. Figure 10.3.1 shows that there is a rapid
decrease from 90% to 40% in only 10 generations.
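
An iteration of the kind behind Figure 10.3.1 is easy to carry out. The sketch
below is a minimal Python illustration, assuming the difference equation
$X_{k+1} = [(\beta - 1)X_k^2 + X_k] / [1 + (\beta - 1)X_k^2]$ for Model III;
the function names and the choice of 25 generations are our own.

```python
def selection_step(x, beta):
    """One generation of Model III for the a-allele proportion x."""
    return ((beta - 1.0) * x * x + x) / (1.0 + (beta - 1.0) * x * x)


def iterate(x0, beta, generations):
    """Return the list [X_0, X_1, ..., X_generations]."""
    xs = [x0]
    for _ in range(generations):
        xs.append(selection_step(xs[-1], beta))
    return xs


# Relative fitness values of the recessive genotype to compare.
for beta in (1.0, 0.9, 0.7, 0.3):
    xs = iterate(0.9, beta, 25)
    # Print every fifth generation to keep the table short.
    row = ", ".join(f"{x:.3f}" for x in xs[::5])
    print(f"beta = {beta}: {row}")
```

Setting `beta = 1.0` leaves the proportion unchanged (Model I), while
`beta = 0.0` reproduces the lethal recessive recursion of Model II, which
gives a quick consistency check on the code.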
Another interesting application of these ideas is to the genetic dis-
order sickle cell anaemia in West Africa. Here a defective recessive
gene causes a minor chemical change in the blood cells. Those who
inherit the recessive gene from both parents have a low survival rate.
However, the gene is not wholly recessive. The hybrid genotypes Aa
are slightly affected, but not enough to cause a fatal condition. In fact
the hybrids have an enhanced resistance to malaria, which is prevalent
in West Africa. Thus the hybrids, Aa, are the most likely to sur-
vive, followed by the pure dominants, AA, and then the recessives, aa.
To model this situation it is necessary to postulate different survival
fractions for each genotype. This case is considered, in detail, in the
exercises.
Mutation model IV
Another factor which can affect the distribution of genes in a population
is mutation. This happens when external factors (for example, background
radiation) cause an allele A to change into an allele a. The rate at which
mutation of genes occurs is usually very small (typically a fraction $10^{-5}$
or $10^{-6}$ of the alleles per generation). Also, mutation more commonly
acts to change a dominant gene into a recessive one.
Model IV for population genetics assumes random mating, equal sur-
vival of genotypes and also incorporates mutation of A-alleles into a-
alleles. Incorporating mutation into the modelling of population genetics
requires us to modify Step 2 of the procedure set out in Section 10.2.
This is illustrated in the following example.

Example 3. Find a difference equation for $P_k(a)$ given equal survival and that the
A-alleles mutate to the a-alleles at a rate of µ alleles per allele per generation.

Solution. Assume that, for each genotype, a fraction r survives from the beginning
of the generation to the end of the generation, and assume that during a single
generation a fraction µ of the A-alleles mutates into a-alleles.

STEP 1: This step is exactly the same as in Example 2 of Section 10.2 and we thus
obtain, for the numbers $\tilde{\mathcal{N}}_{k+1}$ of each genotype surviving to
the end of the generation,
$$\tilde{\mathcal{N}}_{k+1}(AA) = r[P_k(A)]^2 N_{k+1},$$
$$\tilde{\mathcal{N}}_{k+1}(Aa) = 2rP_k(A)P_k(a)N_{k+1}, \qquad (7)$$
$$\tilde{\mathcal{N}}_{k+1}(aa) = r[P_k(a)]^2 N_{k+1}.$$

STEP 2: Now, with mutation present,
$$P_{k+1}(a) = \frac{\{\text{number of a-alleles}\} + \mu\{\text{number of A-alleles}\}}{\{\text{total number of alleles in gene pool}\}}$$
$$= \frac{2\tilde{\mathcal{N}}_{k+1}(aa) + \tilde{\mathcal{N}}_{k+1}(Aa) + 2\mu\tilde{\mathcal{N}}_{k+1}(AA) + \mu\tilde{\mathcal{N}}_{k+1}(Aa)}{2\tilde{\mathcal{N}}_{k+1}(AA) + 2\tilde{\mathcal{N}}_{k+1}(Aa) + 2\tilde{\mathcal{N}}_{k+1}(aa)}. \qquad (8)$$

Note that, even though some of the alleles have changed due to mutation, the total
number of alleles remains the same so the denominator is still twice the sum of
$\tilde{\mathcal{N}}_{k+1}(AA)$, $\tilde{\mathcal{N}}_{k+1}(Aa)$ and $\tilde{\mathcal{N}}_{k+1}(aa)$.
Substituting (7) into (8), and letting $X_k = P_k(a)$, we obtain
$$X_{k+1} = (1 - \mu)X_k + \mu \qquad (9)$$
as the required difference equation.


Note that the case µ = 0 corresponds to the equal survival example
(Model I) from the previous section, as expected. Equation (9) is a linear
constant-coefficient difference equation. A closed-form solution can be
found directly using the methods of Chapter 8. This solution is (see
Exercises 10.3)
$$X_k = 1 - (1 - X_0)(1 - \mu)^k$$
where $X_0$ is the initial proportion of a-alleles. To get some feeling for
how slow the process of mutation is let $X_0 = 0$ (no a-alleles initially) and
take µ = $10^{-5}$. Setting $X_k = 0.5$ gives $k = -\ln 2 / \ln(1 - \mu)$, so
it takes approximately 69 300 generations
for the a-allele proportion to reach 0.5.
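
The number of generations needed can be computed directly from the closed-form
solution. The following Python sketch is an illustration under the assumption
that $X_k = 1 - (1 - X_0)(1 - \mu)^k$; the function names are our own, and
`math.log1p` is used so that $\ln(1 - \mu)$ is evaluated accurately for very
small µ.

```python
import math


def mutation_closed_form(x0, mu, k):
    """a-allele proportion after k generations of A -> a mutation."""
    return 1.0 - (1.0 - x0) * (1.0 - mu) ** k


def generations_to_reach(x0, mu, target):
    """Smallest integer k for which the closed form reaches target."""
    # X_k >= target  <=>  (1 - mu)^k <= (1 - target) / (1 - x0).
    ratio = (1.0 - target) / (1.0 - x0)
    return math.ceil(math.log(ratio) / math.log1p(-mu))


mu = 1e-5  # mutation rate per allele per generation
k = generations_to_reach(0.0, mu, 0.5)
print(k, mutation_closed_form(0.0, mu, k))
```

Because µ is so small, $-\ln(1 - \mu)$ is very close to µ itself, so the
answer is essentially $\ln 2 / \mu$.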
Further discussions of mathematical models in population genetics
may be found in Chapter 3 of Edelstein-Keshet (1988), Maynard-Smith
(1968) and the article by Sandfur (1968). One of the pioneering articles
in this field is the one by Haldane (1924).

Exercises 10.3

1. (a) For Model II (lethal recessives) show that the proportion of A-alleles
satisfies the difference equation
$$Y_{k+1} = \frac{1}{2 - Y_k}$$
where $Y_k = P_k(A)$.
(b) Iterate and hence guess the solution of this difference equation. Prove
that your guess is a solution.

2. Consider the difference equation from Example 2 (for Model II),
$$X_{k+1} = \frac{X_k}{1 + X_k}.$$
By making the transformation $Z_k = 1/X_k$ in the difference equation show that it
reduces to a linear difference equation. Hence obtain the solution of the original
difference equation.
3. Consider Model III (natural selection) where the recessive genotype aa has a
lower survival rate than the other genotypes.
(a) Show that $X_k = P_k(a)$ satisfies
$$X_{k+1} = \frac{(\beta - 1)X_k^2 + X_k}{1 + (\beta - 1)X_k^2}$$
where β is the fitness of the recessive relative to the other genotypes.

(b) Find all the equilibrium solutions, and discuss their significance.
4. Find a difference equation for $P_k(A)$ for Model III.
5. For Model III show that the ratio $U_k$ of A-allele proportion to a-allele
proportion satisfies the difference equation
$$U_{k+1} = \frac{U_k(U_k + 1)}{\beta + U_k}.$$
Find solutions for the special cases β = 1 and β = 0.
6. Develop a model for sickle cell anaemia by assuming that a fraction $r_1$ of AA
survives, a fraction $r_2$ of Aa survives and a fraction $r_3$ of aa survives to the end
of each generation.
You are given that the AA genotype has a fitness 0.81 relative to the hybrid Aa
and the aa genotype has a fitness 0.2 relative to the hybrid Aa. Hence determine
the a-allele proportions in the first, second and third generations. If you have
access to a computer, determine the a-allele proportions in further generations.
7. Develop a difference equation for Pk(A) for Model IV (mutation with equal
survival).
8. Develop a model for mutation of the A-allele to an a-allele, where the a-allele
is a lethal recessive, so that all aa die before mating. Give difference equations
for both the a- and A-alleles.
9. Suppose that the A-allele mutates to the a-allele at a rate µ alleles per
generation per allele and there is back mutation of the a-allele to the A-allele at
a rate ν alleles per generation per allele. Assume that random mating and equal
survival apply.
(a) Set up a difference equation for the recessive allele proportion $P_k(a)$.
(b) Solve the difference equation in (a) and use the solution to determine
the number of generations for the recessive allele proportion to become
50% given that initially it is 10%. Take µ = $10^{-5}$ and ν = $10^{-6}$.

10. (a) Find all the steady-state solutions for the difference equations for Pk (a)
for Models I-IV in the text.
(b) For those who have studied Section 9.4 of Chapter 9: determine which of
the steady-state solutions in (a) are attractors.
Part three
Models with Differential Equations
11
Continuous growth and decay models

In this chapter some problems of growth and decay will be studied for
which differential equations, rather than difference equations, are the
appropriate mathematical models. Such problems include:

the growth of large populations in which breeding is not restricted to
specific seasons,
the absorption of drugs into the body tissues,
the decay of radioactive substances.

The differential equations which arise from the above problems are all
of the first order. The two methods of solution which we explain are
sufficient to solve all the differential equations which arise in the next
three chapters. The theoretical background for these two methods is
contained in Chapter 5. The first of the two methods, which applies only
to linear differential equations, is very similar to the method already given
in Section 8.1 for solving linear difference equations. The continuous
models used in this chapter are similar to the discrete models discussed
in Chapter 9.

11.1 First-order differential equations

The two types of differential equations which you need to be able to solve
in this chapter are called linear with constant coefficients and variables
separable differential equations. The former arise from problems of
unrestricted growth, while the latter appear when the growth is restricted.
How to recognize and solve the two types of differential equations will
now be explained.


Linear with constant coefficients

These differential equations, when of the first order, have the form
$$\dot{x} = ax + b \qquad (1)$$
where a and b are constants. The differential equation is said to be
homogeneous if b = 0. Replacing b by 0 in (1) gives the homogenized
equation corresponding to (1). The linear differential equations occurring
in this chapter will all be homogeneous, but in later chapters some
inhomogeneous examples occur as well.

Homogeneous equations
The following example illustrates how to find all the solutions when the
differential equation is homogeneous. It corresponds to the choice a = 2
and b = 0 in (1).

Example 1. Find all the solutions of the first-order linear homogeneous
constant-coefficient differential equation
$$\dot{x} = 2x. \qquad (2)$$

Solution. To guess a solution, recall that the exponential function is its own
derivative. Hence our first guess is $x = e^t$. This gives $\dot{x} = x$, however,
which is out by a factor of 2. Hence our next guess is
$$x = e^{2t}. \qquad (3)$$
Substitution of (3) into the differential equation (2) gives
$$\text{LHS} = \dot{x} = 2e^{2t} = 2x = \text{RHS}.$$
Thus (3) is indeed a solution of (2).
Because the differential equation (2) is first-order linear homogeneous, it follows
that every solution is some constant multiple of the particular solution (3) (by
the superposition theorem for homogeneous equations in Section 5.3). Thus each
solution has the form
$$x = Ce^{2t} \qquad (4)$$
for some constant C.


Fig. 11.1.1. Graph of a solution.

The graph of the solution (4) for which C = 1 is shown in Figure 11.1.1.
An alternative way to arrive at the particular solution (3) in the above
example, which involves slightly less guesswork, is as follows: try for a
solution of the form
$$x = e^{\lambda t} \qquad (5)$$
where λ is a constant to be determined. Substitution of (5) in the
differential equation (2) gives
$$\text{LHS} = \dot{x} = \lambda e^{\lambda t},$$
$$\text{RHS} = 2x = 2e^{\lambda t}.$$
Hence both sides will be equal if λ = 2. So once again we get the
particular solution $x = e^{2t}$ for the differential equation (2). The rest of
the solutions are then obtained as in Example 1.

Inhomogeneous equations
It is now easy to solve an inhomogeneous equation, such as
$$\dot{x} = 2x - 6. \qquad (6)$$
First, try for a particular solution in which x is constant so that $\dot{x} = 0$.
The equation (6) then becomes
$$0 = 2x - 6$$
and hence a particular solution is
$$x = 3.$$
Second, add the solutions for the homogenized equation $\dot{x} = 2x$, which
were obtained as equation (4) in Example 1, to get
$$x = 3 + Ce^{2t}.$$
This formula gives all the solutions of (6) (by the superposition theorem
for inhomogeneous equations in Section 5.3).
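
A quick numerical check of such a formula is often reassuring. The Python
sketch below is illustrative only: it assumes the family $x = 3 + Ce^{2t}$ and
tests, by a central finite difference (a numerical approximation, not a
proof), that $\dot{x} - (2x - 6)$ is close to zero for a few hypothetical
values of C and t. The function names are our own.

```python
import math


def x(t, C):
    """Candidate solution x = 3 + C e^{2t} of the equation x' = 2x - 6."""
    return 3.0 + C * math.exp(2.0 * t)


def residual(t, C, h=1e-6):
    """Approximate x'(t) - (2 x(t) - 6) using a central difference."""
    dxdt = (x(t + h, C) - x(t - h, C)) / (2.0 * h)
    return dxdt - (2.0 * x(t, C) - 6.0)


for C in (-2.0, 0.0, 1.5):
    for t in (0.0, 0.5, 1.0):
        print(f"C = {C:5.1f}, t = {t:3.1f}, residual = {residual(t, C):.2e}")
```

The case C = 0 is the constant particular solution x = 3, for which the
residual vanishes exactly.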

Variables separable
These differential equations have the form
$$\frac{dx}{dt} = f(x)g(t) \qquad (7)$$
where f and g are known functions (which we assume to be smooth so
that the existence-uniqueness theorem of Section 5.2 is applicable).
For example, the differential equation
$$\frac{dx}{dt} = x(1 - x)t \qquad (8)$$
is of the variables separable type since it has the form (7) with
f(x) = x(1 - x) and g(t) = t. The reason for the term `variables separable' in
describing these equations will become clear later, after the procedure for
solving them has been explained.

Constant solutions
These solutions, which are also called equilibrium or steady-state solutions,
are important because they are easy to find and they provide a framework
for the study of other solutions. To find these constant solutions, solve
the equation f(x) = 0; in the example (8) this gives
$$x = 0 \quad\text{or}\quad x = 1.$$
It is easy to check that each of these is a solution of the differential
equation since, when substituted in (8), each gives
$$\text{LHS} = \dot{x} = 0,$$
$$\text{RHS} = x(1 - x)t = 0 \cdot t = 0.$$
The graphs of these constant solutions x = 0 and x = 1 are horizontal
lines, as shown in Figure 11.1.2.
If x = φ(t) is any other solution, its graph cannot intersect either
of these lines (by the existence-uniqueness theorem of Section 5.2) and
hence its graph must lie entirely within one of the three horizontal strips
determined by these lines.

Fig. 11.1.2. Graphs of solutions cannot intersect.

Other solutions
The procedure for finding non-constant solutions of variables separable
differential equations will now be illustrated. The idea behind the proce-
dure is to try to `separate' the variables so that x appears on one side, t
on the other.

Example 2. Find the solution x = φ(t) of the separable differential equation
$$\frac{dx}{dt} = x(1 - x)t$$
which satisfies the initial condition $x = \tfrac{1}{2}$ when t = 0.

Solution. The constant solutions are x = 0 and x = 1. The initial condition places
the solution x = φ(t) in the horizontal strip between the lines x = 0 and x = 1, as
in Figure 11.1.2. Hence, for all time, this solution satisfies
$$0 < x < 1. \qquad (9)$$

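STEP 1: Divide both sides of the differential equation by f(x) = x(1 - x), which
is non-zero. This gives
$$\frac{1}{x(1 - x)}\frac{dx}{dt} = t.$$
STEP 2: Integrate both sides, with respect to t, from the initial time 0 to the
current time t. This gives
$$\int_{t=0}^{t} \frac{1}{x(1 - x)}\frac{dx}{dt}\,dt = \int_{t=0}^{t} t\,dt.$$
STEP 3: Apply the rule for integrating by substitution to the LHS. This gives
(since $\phi(0) = \tfrac{1}{2}$ and $\phi(t) = x$)
$$\int_{x=1/2}^{x} \frac{1}{x(1 - x)}\,dx = \int_{t=0}^{t} t\,dt.$$
STEP 4: Evaluate the integrals, using partial fractions on the LHS. This gives
$$\int_{x=1/2}^{x} \left(\frac{1}{1 - x} + \frac{1}{x}\right)dx = \tfrac{1}{2}t^2,$$
$$\Big[-\ln(1 - x) + \ln(x)\Big]_{x=1/2}^{x} = \tfrac{1}{2}t^2.$$
Note that the arguments of log are both positive since 0 < x < 1, by (9). Hence
$$\ln\left(\frac{x}{1 - x}\right) - \ln(1) = \tfrac{1}{2}t^2,$$
$$\frac{x}{1 - x} = e^{t^2/2},$$
$$x = 1 - \frac{1}{1 + e^{t^2/2}}. \qquad (10)$$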
STEP 5: Check the solution. Substituting (10) back into the differential equation
gives
$$\text{LHS} = \frac{dx}{dt} = \frac{t\,e^{t^2/2}}{\left(1 + e^{t^2/2}\right)^2},$$
$$\text{RHS} = x(1 - x)t = \frac{e^{t^2/2}}{1 + e^{t^2/2}} \cdot \frac{1}{1 + e^{t^2/2}} \cdot t.$$
Thus the differential equation is satisfied. Also, (10) shows that, when t = 0,
$$x = 1 - \frac{1}{1 + e^{0}} = 1 - \frac{1}{2} = \frac{1}{2}$$
so that the initial condition is satisfied. Thus (10) does indeed give the required
solution of the differential equation.
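
The check in Step 5 can also be made numerically. The Python sketch below is
an illustration only, assuming the solution $x = 1 - 1/(1 + e^{t^2/2})$ with
$x = \tfrac{1}{2}$ at t = 0: it integrates $dx/dt = x(1 - x)t$ with the
classical fourth-order Runge-Kutta method (a technique not used in the text,
chosen here purely as an independent check) and compares the result with the
closed form. The function names and step count are our own.

```python
import math


def f(t, x):
    """Right-hand side of the separable equation dx/dt = x(1 - x)t."""
    return x * (1.0 - x) * t


def exact(t):
    """Closed-form solution with x(0) = 1/2."""
    return 1.0 - 1.0 / (1.0 + math.exp(0.5 * t * t))


def rk4(t_end, steps, x0=0.5):
    """Integrate from t = 0 to t_end with the classical Runge-Kutta method."""
    h = t_end / steps
    t, x = 0.0, x0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2.0, x + h * k1 / 2.0)
        k3 = f(t + h / 2.0, x + h * k2 / 2.0)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x


for t_end in (0.5, 1.0, 2.0):
    print(t_end, rk4(t_end, 200), exact(t_end))
```

The two columns should agree to many decimal places, and both stay strictly
between the constant solutions x = 0 and x = 1.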

Note that, in Example 2, the initial conditions are used in evaluating
the definite integrals. Hence the initial conditions are automatically
included in the final solution. The need to evaluate an arbitrary constant
C is thereby obviated.
Differential equations of the form
$$\frac{dx}{dt} = f(x)$$
are of the variables separable type since we can write f(x) as f(x)g(t) by
choosing g to be the constant function 1. In particular, linear
constant-coefficient differential equations
$$\frac{dx}{dt} = ax + b$$
are of the variables separable type, although in practice it is quicker to
ignore this and to stick to the method of Example 1. On the other hand,
the variables separable method is the way to solve differential equations
like
$$\frac{dx}{dt} = x^{3/2} \quad\text{and}\quad \frac{dx}{dt} = x(1 - x),$$
which are non-linear.

Recognizing the variables

In the differential equations studied so far in this book the aim has been
to solve for quantities denoted by x or y as functions of the time t.
The use of the letters x and y was particularly appropriate in mechanics
where the quantities they denoted were coordinates or displacements. In
other branches of science there are other quantities to be expressed as
functions of the time such as temperature, pressure and concentration and
these are usually denoted by other letters such as u, p or c.
Thus, instead of a differential equation being written as
$$\frac{dx}{dt} = ax + b,$$
it may appear as
$$\frac{du}{dt} = au + b, \quad \frac{dp}{dt} = ap + b \quad\text{or}\quad \frac{dc}{dt} = ac + b.$$
The quantity to be expressed as a function of the time is called the
dependent variable (because it depends on the time) while, in this context,
the time t is called the independent variable. Quantities other than the
time may sometimes be used as independent variable.
Unless stated otherwise, all letters appearing in the differential equa-
tions other than the dependent and independent variables are to be
regarded as constants. Thus, in the above equations, a and b are to
be taken as constants (so all the equations are linear with constant
coefficients).

Exercises 11.1

1. Copy and complete the following table to show the classification of the given
first-order differential equations.

Differential equation    Linear homogeneous    Constant coefficient    Variables separable
$\dot{x} + 2x = 0$
$\dot{x} = tx^{3/2}$

2. From the solutions found in the text for the differential equation $\dot{x} = 2x$,
(a) find the solution which satisfies the initial condition x = 3 when t = 0,
(b) show that the solution which satisfies the initial condition $x = x_0$ when
t = 0 is
$$x = x_0 e^{2t}.$$

3. (a) Use the method of Example 1 in the text to find all the solutions of the
differential equation $\dot{x} = -3x$.
(b) To which types of differential equations is the method used in (a)
applicable?
(c) Find the solution of the differential equation $\dot{x} = -3x + 6$ which satisfies
the initial condition $x = x_0$ when t = 0.

4. Suppose that the graph of the solution of the differential equation $\dot{x} = \lambda x$
which satisfies the initial condition x = 1 when t = 0 is as in the following
diagram. Sketch the graph of the solution of $\dot{x} = -\lambda x$ which satisfies the same
initial condition.

5. Consider the differential equation
$$\frac{dx}{dt} = xt.$$
(a) Although this differential equation is linear homogeneous, it is not of
the type to which we apply the method of Example 1 in the text. Why?
(b) Which constant solution does the differential equation have? Find the
solution which satisfies the initial condition x = 1 when t = 0 by
separating the variables as in Example 2 in the text.

6. (a) Find the solution of the differential equation
$$\frac{dx}{dt} = x(1 - x)t$$
which satisfies the initial condition x = 2 when t = 0, by suitably
changing the working for Example 2 in the text.
(b) Sketch the graph of the solution you have found.
7. Find the solution of the differential equation
$$\frac{dx}{dt} = x(1 - x)$$
which satisfies the initial condition $x = \tfrac{1}{2}$ when t = 0.

8. For each of the differential equations in the list below:
(a) identify the dependent variable, the independent variable and list the
other symbols, which are constants (or parameters),
(b) indicate which of them are first-order linear, which are variables separa-
ble, and which are neither of these two types.
NOTE: You are NOT asked to solve these differential equations.
(i) dm = Amt/3 du
(ii) J = 2nrL
dr
du 141,

(iii) hA(u - us) (iv) - = aelh -18h2

(v)
It = atp -}- p2
IP Ix
(vi) XC = a sin(pt) - e"c.
dt dt
9. Radioactive iodine 131, produced by nuclear tests, settles on vegetation which
is eaten by deer grazing on the vegetation. The iodine 131 accumulates in the
thyroid gland of the deer. Let y be the amount of iodine 131 in the thyroid of a
deer after t days and suppose $I_0$ is the initial amount deposited in the vegetation
that will be eaten by a single animal. You are given that y satisfies
$$\frac{dy}{dt} = -\lambda_2 y + I_0\lambda_1 e^{-\lambda_1 t}, \qquad \lambda_1 \neq \lambda_2,$$
where $\lambda_1$ and $\lambda_2$ are positive constants.
(a) If there is no iodine 131 in the thyroid initially solve the differential
equation and show that
$$y = \frac{I_0\lambda_1}{\lambda_2 - \lambda_1}\left[e^{-\lambda_1 t} - e^{-\lambda_2 t}\right].$$
(b) After a nuclear test in Colorado in the USA in 1964 the following
estimates of $\lambda_1$ and $\lambda_2$ were made for a Colorado deer population. They
were
$$\lambda_1 = 0.126 \quad\text{and}\quad \lambda_2 = 0.107.$$
Use the model to estimate the maximum percentage increase in the
amount of iodine 131 in the deers' thyroids.

11.2 Exponential growth

This section introduces the study of growth models in the context of
population growth with no restrictions on the ultimate size of the popu-
lation. The effect of imposing such restrictions will be examined in the
next section. We begin with some remarks which are relevant to both
sections.

Continuous models
The study of population models helps bring into focus the distinction be-
tween discrete and continuous models. The population models discussed
in Chapter 9 were discrete, being appropriate for species of animals which
breed during specific breeding seasons, equally spaced. The population
models of interest in this chapter, however, are more appropriate for
large populations which reproduce continuously, rather than at regular
intervals. Human populations are naturally modelled in this way, as are
certain types of microscopic organisms.
In the continuous models, the number of individuals in a population at
time t will be modelled by a solution N = φ(t) of a differential equation.
Thus both of the variables N and t will assume all the real values in
some interval, fractional and irrational values included.
The justification for using N as a real variable in this way is that, when
the population is sufficiently large, differences of one or two individuals
are of little consequence. At the end of the problem, we simply round
N to the nearest integer value. A similar justification applies to our
use of t as a real variable: in the absence of specific breeding seasons,
reproduction can occur at any time; for a sufficiently large population,
it is then natural to think of reproduction as occurring continuously.

Microscopic organisms
Populations of microscopic organisms are attractive to model since they
may be grown in the laboratory. This enables data concerning their
growth to be collected easily and the environment to be controlled.
Examples of such micro-organisms for which data are available are
yeast cells and E. coli. The former are involved in brewing and in the
commercial production of certain vitamins; the latter are a species of
bacteria which occur in the intestines of man and other animals.
Both of these examples are single-celled organisms which reproduce by
dividing into two, a process known as binary fission. The cells absorb
nutrients which are dissolved in the liquid in which they are immersed.
Thus the amount of nutrients available to the cells can be controlled.

Exponential growth model

We consider a population consisting of yeast cells and we suppose that
the environment in which the cells are multiplying does not alter with
passage of time. In particular, we assume that nutrients are added to
the liquid to replace those used by the cells and that the cells have
enough room to multiply without overcrowding. On average then, each
cell divides into two in a fixed time. Hence we could expect that
$$\left\{\text{rate of increase of the number of cells}\right\} \ \text{is proportional to} \ \left\{\text{number of cells in the population}\right\}. \qquad (1)$$
This leads us to propose as a mathematical model for the growth of the
population the differential equation
$$\frac{dN}{dt} = aN \qquad (2)$$
where N (when rounded to the nearest integer) is the number of yeast
cells in the population at time t and where a is a positive real constant.
The constant a measures the average growth rate per unit time per
individual.
Let us suppose that, at the time when observations are started (t = 0),
the number of cells in the population is a positive number $N_0$. This gives
the initial condition
$$N = N_0 \ \text{ when } \ t = 0. \qquad (3)$$
Since the differential equation (2) is first-order linear homogeneous with
constant coefficient, it can be solved as in Section 11.1. It is left as an
exercise to show that the solution satisfying the initial condition (3) is
$$N = N_0 e^{at}. \qquad (4)$$

Thus the assumption (1) implies that the population of cells grows
exponentially.
The formula (4) implies that the population grows indefinitely large at
an ever-accelerating rate, as can be seen from the typical graph sketched
in Figure 11.2.1.
Fig. 11.2.1. Exponential growth: the curve $N = N_0 e^{at}$ plotted against t.

Such unrestricted growth is impossible in practice since eventually the
population runs out of space and nutrients. The above `J' curve is often
observed, however, during the initial stages of a population's growth.

Finding the constants

In order to be able to use the formula (4) to predict the number of cells
at any time t > 0, it is necessary to substitute numerical values for $N_0$
and a. Now $N_0$ is just the number of cells present when t = 0. If the
number of cells is known at some later time, then the growth rate a can
be found by solving a suitable equation. The following example shows
how.

Example 1. Suppose that, in a population of yeast cells which is growing
exponentially, the initial number of cells is 1000 and ten minutes later it is
1500. Find the growth rate, a, for the population.
Solution. Substituting $N_0 = 1000$ in the formula (4) gives
$$N = 1000e^{at}.$$
But N = 1500 when t = 10 so that
$$1500 = 1000e^{10a}.$$
Hence
$$e^{10a} = 1.5,$$
$$10a = \ln(1.5).$$
Thus, to two decimal places,
$$a = \ln(1.5)/10 = 0.04 \ \text{(per minute per cell)}.$$
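
The same calculation applies to any pair of observations. As a small Python
sketch (the function name `growth_rate` is our own, and exponential growth
$N = N_0 e^{at}$ is assumed throughout):

```python
import math


def growth_rate(n0, n1, elapsed):
    """Growth rate a such that n1 = n0 * exp(a * elapsed)."""
    return math.log(n1 / n0) / elapsed


# Data of Example 1: 1000 cells initially, 1500 cells ten minutes later.
a = growth_rate(1000, 1500, 10)
print(round(a, 2))              # 0.04 per minute per cell
print(1000 * math.exp(a * 10))  # recovers the observed 1500 cells
```

Keeping more decimal places of a (about 0.0405) matters if the formula is then
used to predict the population far into the future.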
Fig. 11.2.2. Logarithm of exponential growth: the straight line
$\ln(N) = \ln(N_0) + at$, with intercept $\ln(N_0)$ on the ln(N)-axis.

Testing the model

To test the accuracy of the above model for a particular population, one
might determine the constant a as in the above example and then use
the formula (4) to predict the size of the population at subsequent times.
Comparison could then be made with the observed values.
A more convenient way is to first apply the function In to both sides
of the formula (4) for exponential growth. It is left as an exercise to show
that the formula then becomes

$$\ln(N) = \ln(N_0) + at. \qquad (5)$$
This shows that, while N is an exponential function of t, ln(N) is a
linear function of t, as in Figure 11.2.2. The slope of this linear function,
moreover, is just the growth rate a.
Thus to test the population for exponential growth, we simply plot the
logarithms of the observed population values against the time. The extent
to which the resulting points lie along a straight line is a measure of the
accuracy of the model.
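
In practice the straight line is usually fitted by least squares. The Python
sketch below shows one way this might be done; the observations are
hypothetical (made up for illustration, not taken from any experiment), and
the function name `fit_line` is our own. The fitted slope estimates the
growth rate a, and the coefficient of determination $R^2$ is one common
measure of how nearly the points lie on a line.

```python
import math


def fit_line(ts, ys):
    """Least-squares fit y = intercept + slope * t; returns slope, intercept, R^2."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(ts, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical observations: time in minutes and number of cells.
times = [0, 10, 20, 30, 40]
counts = [1000, 1500, 2250, 3375, 5063]

slope, intercept, r_squared = fit_line(times, [math.log(n) for n in counts])
print(f"estimated a   = {slope:.4f} per minute")
print(f"estimated N_0 = {math.exp(intercept):.0f}")
print(f"R^2           = {r_squared:.6f}")
```

A value of $R^2$ close to 1 supports the exponential model over the period of
observation; it says nothing about growth outside that period.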
A nice illustration of a population which is growing exponentially was
given by Monod (1949), whose results are also described in Rubinow
(1975). He allowed a population of the bacteria E. coli to grow in a
medium containing glucose as the nutrient and observed the density D
of the population (dry weight of the cells per unit volume) at various
times. In Figure 11.2.3 we have plotted the natural logarithms of the
population densities which he observed. The points lie very neatly along
a straight line, indicating the appropriateness of the exponential growth
model at least during the period of observation.

Fig. 11.2.3. Exponential growth of E. coli bacteria (logarithm of the
population density plotted against time in hours).

Human populations
In adapting the preceding growth model to human populations, we must
take into account deaths as well as births. A plausible assumption is that
births and deaths both occur at a rate which is proportional to the size
N of the population at any time t. Hence we may write
$$ \frac{dN}{dt} = \alpha N - \beta N = (\alpha - \beta)N \tag{6} $$
where $\alpha$ and $\beta$ are positive constants denoting the average rates
of births and deaths respectively (per head of population per year). If
$\alpha - \beta > 0$ then (6) has the same form as (2) with
$a = \alpha - \beta$, and so the model
predicts exponential growth, given by (4). It is left as an exercise to show
that, if a population grows exponentially, then there is a fixed length of
time that it takes to double.
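
The doubling property can be illustrated numerically. Setting $N_0 e^{a(t+T)} = 2N_0 e^{at}$ gives $T = \ln(2)/a$, whatever the starting time t. The sketch below checks this for an illustrative growth rate of 0.02 per year; the function names are our own.

```python
import math

def doubling_time(a):
    """Time T for which exp(a * T) = 2."""
    return math.log(2) / a

def population(n0, a, t):
    """Exponential growth formula N = n0 * exp(a * t)."""
    return n0 * math.exp(a * t)

a = 0.02                     # illustrative growth rate, per year
T = doubling_time(a)

# Ratio N(s + T) / N(s) for several starting times s; each should equal 2
ratios = [population(1.0, a, s + T) / population(1.0, a, s)
          for s in (0, 10, 250)]
```

The ratio does not depend on the starting time, which is exactly the statement that the population doubles in every period of length T.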
An interesting historical account of theories of population growth is
given in Hutchinson (1978). The idea that populations grow exponentially
can be found in the works published by Graunt (1662) and Malthus
(1798), who made estimates of the times taken for various populations to
double. All sorts of data concerning the growth of populations around
the world can be found in the Demographic Yearbook published by the
United Nations. For example, the following annual growth rates for
the world's population, shown in Table 11.2.1, were obtained from this
source.

Table 11.2.1. Annual growth rate for the population of the world.

    Years      Growth rate (per annum)
    1965-70    1.9%
    1970-75    1.9%
    1975-80    1.8%
    1980-85    1.7%

An article in The Age newspaper (25 May 1989) quoted United Nations
sources as stating that the world's population growth, after having slowed
down in the 1970s, was speeding up again, and that the current population
of about 5.25 billion people would double in 39 years, at present rates.
The above figures are in rough agreement with an exponential growth
model for the world's population for the period considered with
a growth rate a of about 0.02 new individuals per head of population
per annum. The exponential growth model cannot be valid in the long
term, however, as the population would run out of food and space. A
more realistic model of population growth, taking such limitations into
account, will be described in the next section.

Exercises 11.2
1. Let a be any real constant. Find all the solutions of the differential equation
$$ \frac{dN}{dt} = aN $$
by using the appropriate method from Section 11.1. Hence show that the solution
which satisfies the initial condition $N = N_0$ when t = 0 is
$$ N = N_0 e^{at}. $$

2. Show that if N and $N_0$ are positive numbers then
$$ N = N_0 e^{at} \implies \ln(N) = \ln(N_0) + at. $$
Prove the converse of this implication also.
3. Suppose that a population is growing exponentially, in accordance with the
formula (4) in the text. Prove that, if the population doubles during the first T
hours, then it doubles during every T hour period.
4. Find the growth rate a for the world's population, given that it doubles every
39 years. Is your answer consistent with the United Nations data quoted in the
text ?

5. Find approximately the value of the growth rate a for the population in
Figure 11.2.3.

6. The population of Australia at the various census dates since Federation
is shown in the following table. Decide whether the data is consistent with
exponential growth. If so, find an approximate value for the growth rate a.

    Date    Population
    1901     3 773 801
    1911     4 455 005
    1921     5 435 734
    1933     6 629 839
    1947     7 579 358
    1954     8 986 530
    1961    10 508 186
    1966    11 550 462
    1971    12 755 638
    1976    13 548 472
    1981    14 576 330

Can you think of any historical reasons for any anomalies in the data?

11.3 Restricted growth

We now discuss a useful modification of the exponential growth model
which takes into account the fact that, in practice, there is a limit to the
size to which a population can grow. The ideas underlying the model are
similar to those explained in Section 9.2, but now lead to a differential
equation, called the logistic equation. Unlike the discrete logistic equation,
however, its solutions can be expressed in closed form and there is no
chaotic behaviour.

Logistic growth model

A population growing in a favourable environment can initially grow very
rapidly, in accordance with the exponential growth law. Restrictions on
space and food supply, however, eventually come into play and impose a
limit on the maximum sustainable size for the population. This is called
the carrying capacity of the environment, and is denoted by K.
If N is the size of the population at time t, we can think of the ratio
$$ \frac{1}{N}\frac{dN}{dt} $$
as the growth rate at this time, giving the increase in population per
unit time per head of population. In the exponential growth model this
assumes a constant value a. In the logistic model, however, the growth
rate is assumed to start at a when N = 0, and then to decrease linearly
until it reaches the value 0 when N = K.

Fig. 11.3.1. Logistic decline in growth rate: (1/N)dN/dt plotted against N.

Thus the growth rate is given by the formula in Figure 11.3.1. Hence
the logistic model says that
$$ \frac{1}{N}\frac{dN}{dt} = a\left(1 - \frac{N}{K}\right), \tag{1} $$
that is,
$$ \frac{dN}{dt} = aN\left(1 - \frac{N}{K}\right). \tag{2} $$

This differential equation, for N as a function of t, is the logistic equation.

It involves the parameters a and K. The parameter a, which is relevant
to the initial phase of the population's growth before the restrictions on
growth are significant, is the unrestricted growth rate of the population
for uncrowded conditions.
The differential equation (2) is first order and of the variables separable
type. It is left as an exercise to show that, if $0 < N_0 < K$, then the solution
which satisfies the initial condition $N = N_0$ when t = 0 is
$$ N = \frac{K}{\left(\dfrac{K}{N_0} - 1\right)e^{-at} + 1}. \tag{3} $$
Note that this formula gives the correct initial condition $N = N_0$ when
t = 0. As $t \to \infty$, moreover, the term involving $e^{-at}$ approaches 0 as a is
positive. Hence
$$ \lim_{t \to \infty} N = K. $$
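
The properties of the solution (3) just noted can be checked numerically. The sketch below uses hypothetical parameter values chosen only for illustration, and the function name logistic is our own. It also compares a central-difference estimate of dN/dt with the right-hand side of the logistic equation (2).

```python
import math

def logistic(t, a, K, n0):
    """Solution (3) of the logistic equation with N = n0 when t = 0."""
    return K / ((K / n0 - 1.0) * math.exp(-a * t) + 1.0)

# Hypothetical parameter values, not taken from the text
a, K, n0 = 0.5, 1000.0, 10.0

start = logistic(0.0, a, K, n0)      # initial condition
late = logistic(100.0, a, K, n0)     # value at a large time

# Central-difference estimate of dN/dt at t = 5, compared with aN(1 - N/K)
h = 1e-5
t = 5.0
N = logistic(t, a, K, n0)
lhs = (logistic(t + h, a, K, n0) - logistic(t - h, a, K, n0)) / (2 * h)
rhs = a * N * (1 - N / K)
```

Agreement between lhs and rhs at one time is of course not a proof that (3) solves (2), but it is a quick way of catching an algebraic slip in the separation of variables.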

Fig. 11.3.2. Logistic growth curve: N plotted against t, lying between the
lines N = 0 and N = K.

It also follows from (2) that dN/dt is positive as long as 0 < N < K. Hence
the graph of the solution (3) is the `S'-shaped curve shown in Figure 11.3.2.
Thus the model predicts that the population increases steadily from
the value $N_0$ and approaches the carrying capacity K as the time becomes
arbitrarily large. The relevance of the model to various types of
populations will now be discussed.

Microscopic organisms
The logistic model is reputed to give reasonably good predictions for
the behaviour of populations of yeast cells, bacteria, and protozoans
(the most primitive form of animal life), when grown under suitable
laboratory conditions.
To test the relevance of the model to the growth of such populations,
we shall refer to Table 11.3.1, which is based on actual laboratory
measurements of Carlsen (1913).
When the points in Table 11.3.1 are plotted, as in Figure 11.3.3, they
are seen to lie along an `S'-shaped curve. In this respect, at least, they
are in agreement with the predictions of the logistic model.
Graphs similar to that in Figure 11.3.3 for the growth of the population
of yeast cells may be seen for example in Emlen (1984), page 43, Emmel
(1976), page 103, Hutchinson (1971), page 24, and Kormondy (1976),
page 78.

Table 11.3.1. Growth of yeast cells.

    Time in hours    Number of yeast cells
          t                    N
          0                   10
          2                   29
          4                   71
          6                  175
          8                  351
         10                  513
         12                  584
         14                  641
         16                  651
         18                  662

Fig. 11.3.3. Population of yeast cells showing logistic growth (N plotted
against t in hours).

How can we further test the agreement with the logistic model? A
simple geometric answer is provided by Figure 11.3.1: the logistic model
is characterized by a linear decline in the growth rate
$$ \frac{1}{N}\frac{dN}{dt} $$
when considered as a function of N. Now, by using a ruler, we can
approximate tangents to the graph of N as a function of t and hence find
the approximate values of dN/dt at various points. This process, when
applied to Figure 11.3.3, yields the results shown in Table 11.3.2.

Table 11.3.2.

    Time in hours, t             2     4     6     8     10    12    14
    Number of yeast cells, N     29    71    175   351   513   584   641
    Slope of tangent, dN/dt      15    31    75    117   57    29    14
    Growth rate, (1/N)dN/dt      0.52  0.44  0.43  0.33  0.10  0.05  0.02

Fig. 11.3.4. Linear decline in growth rate ((1/N)dN/dt plotted against N).

The points on the graph of (1/N)dN/dt as a function of N given by
this table have been plotted in Figure 11.3.4. It can be seen that the
points do indeed lie approximately on a straight line, in accordance with
the logistic model.
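
As a rough numerical counterpart to reading the graph by eye, a least-squares line can be fitted to the growth rates computed from the slopes in Table 11.3.2. According to the logistic model the line is $a(1 - N/K)$, so its intercept estimates a and its zero estimates K. This is only a sketch: the use of least squares here is our own choice and is not a method described in the text.

```python
# Data from Table 11.3.2
N = [29, 71, 175, 351, 513, 584, 641]
slope = [15, 31, 75, 117, 57, 29, 14]

# Growth rate (1/N) dN/dt at each tabulated point
rate = [s / n for s, n in zip(slope, N)]

# Least-squares line: rate = intercept + gradient * N
count = len(N)
n_mean = sum(N) / count
r_mean = sum(rate) / count
sxx = sum((n - n_mean) ** 2 for n in N)
sxy = sum((n - n_mean) * (r - r_mean) for n, r in zip(N, rate))
gradient = sxy / sxx
intercept = r_mean - gradient * n_mean

a_est = intercept                # growth rate at N = 0
K_est = -intercept / gradient    # value of N at which the line reaches zero
```

Since the slopes were themselves measured with a ruler, the resulting values of a_est and K_est should be regarded as rough estimates only.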

Fisheries management
Models for population growth find ready application in the fishing
industry, which aims at maintaining a permanent supply of fish. Too
much fishing in a particular year might so deplete the population that
it would take a long time to recover or it might even become extinct.
Too little fishing, on the other hand, might leave the population intact
but result in a smaller harvest than necessary. Biologists employed by
the fishing industry are therefore interested in determining the maximum
rate, somewhere in between these two extremes, at which fish can be
harvested without reducing the population in the long term. Models for
population growth play an important part in determining this maximum
rate, as we now show.
First, it is clear that in order to maintain the fish population at
a constant level, only the increase in population should be harvested
during any one season. Hence, to maximize the harvest,
the population should be kept at the size N for which its rate of increase
dN/dt is a maximum.
The value of N which this condition determines will depend on which
population model is being used. For the logistic model, we simply choose
N to maximize the RHS of the logistic equation (2), which is the quadratic
$$ aN\left(1 - \frac{N}{K}\right). $$
A little calculation shows that the desired choice is N = K/2. This means
that the population should be maintained at half the carrying capacity.
To get the maximum value of dN/dt we now substitute this value of N
back into the quadratic to get
$$ \left(\frac{dN}{dt}\right)_{\max} = \frac{aK}{4}. $$
This is the maximum rate at which fish can be harvested, if the population
is to be kept at a constant size.
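
The "little calculation" behind this result can be set out as follows. This is a sketch of one way to do it, by differentiating the quadratic; the symbol f is introduced here only for convenience and is not notation from the text.

```latex
\begin{align*}
f(N)   &= aN\left(1 - \frac{N}{K}\right) = aN - \frac{a}{K}N^{2}, \\
f'(N)  &= a - \frac{2a}{K}N = 0 \iff N = \frac{K}{2}, \\
f''(N) &= -\frac{2a}{K} < 0 \quad \text{(so the stationary point is a maximum)}, \\
f\!\left(\frac{K}{2}\right) &= a \cdot \frac{K}{2} \cdot \left(1 - \frac{1}{2}\right) = \frac{aK}{4}.
\end{align*}
```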
A discussion of how the answer for the maximum value depends on the
particular population model is given in Ginzburg (1985), pages 130, 131.
Some pros and cons of the logistic model, when used in this context, are
discussed in Walter (1981).

Human populations
The data in Section 11.2 suggest that the world's population is currently
growing exponentially with a growth rate of about 0.02 per year. If this
were to continue for the next three centuries, however, the population
would increase by a factor of $(1.02)^{300}$, which is about 380. Hence the
average density of the world's population (over the surface area of the
inhabited countries) would increase from its 1985 value of 36 people per
km² to about 13 680 per km². This latter figure is truly fantastic: only
about 73 square metres of land for each living person. Long before this
happened, of course, the food supply would have been exhausted and
the population would have exceeded its maximum sustainable size.
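
The figures in this paragraph are easily checked. The short sketch below repeats the arithmetic, using the growth factor and the 1985 density quoted above; the variable names are our own.

```python
# Growth factor after 300 years at 2% per year
factor = 1.02 ** 300

# Average density, starting from the 1985 value of 36 people per square km
density_1985 = 36
density_later = density_1985 * factor

# Land area per person in square metres (1 square km = 1 000 000 square metres)
area_per_person = 1_000_000 / density_later
```

Note that compounding annually at 2% gives a slightly smaller factor than the continuous formula $e^{0.02 \times 300}$, which is about 403; the conclusion is unaffected.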
As a more realistic alternative, the logistic model was proposed for the
growth of human populations by Verhulst in the middle of the nineteenth
century. He used the logistic model to estimate the maximum values for
the population of various countries.
The use of the logistic model to study human populations was revived
in 1920 by Pearl and Reed. They compared the census figures for the
population of the USA from the years 1790 to 1910 with the values which
could be predicted from the logistic model. The remarkable agreement
between the actual and predicted values is shown in Table 11.3.3.

Table 11.3.3. Actual and predicted values of the population
of the USA in millions.

    Year    Actual    Predicted
    1790     3.929      3.929
    1800     5.308      5.336
    1810     7.240      7.228
    1820     9.638      9.756
    1830    12.866     13.109
    1840    17.069     17.506
    1850    23.192     23.191
    1860    31.443     30.412
    1870    38.558     39.371
    1880    50.156     50.177
    1890    62.948     62.769
    1900    75.995     76.870
    1910    91.972     91.972

To get the predicted values, Pearl and Reed assumed that the logistic
equation (2) was satisfied with N denoting the population of the USA at
a time t years after some initially chosen year; hence the population is
given as a function of the time by the solution (3) of the logistic equation.
They chose the parameters a and K in such a way as to make the formula
(3) give the actual values of the population in the years 1790, 1850 and
1910. As explained in the exercises, the values they obtained for these
parameters were
a = 0.03134 per year,
K = 197 273 000 individuals.
The formula (3) then gives the remaining predicted values in Table 11.3.3.
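
The fitting procedure can be sketched in Python. The code below assumes the two relations stated in Exercise 4 of this section (for three equally spaced observations) and applies them to the census values for 1790, 1850 and 1910. The function names are our own, and the code is an illustration rather than a reconstruction of how Pearl and Reed actually computed their values.

```python
import math

def carrying_capacity(n0, n1, n2):
    """K from three equally spaced values, by the formula of Exercise 4(b)."""
    return n1 * (n2 * (n1 - n0) - n0 * (n2 - n1)) / (n1 ** 2 - n0 * n2)

def growth_rate(n0, n1, K, T):
    """a from (N1/N0) exp(-aT) = (K - N1)/(K - N0), as in Exercise 4(a)."""
    return -math.log(n0 * (K - n1) / (n1 * (K - n0))) / T

def logistic(t, a, K, n0):
    """Solution (3) of the logistic equation with N = n0 when t = 0."""
    return K / ((K / n0 - 1.0) * math.exp(-a * t) + 1.0)

# Census values in millions for 1790, 1850 and 1910 (Table 11.3.3)
n0, n1, n2 = 3.929, 23.192, 91.972
K = carrying_capacity(n0, n1, n2)       # in millions
a = growth_rate(n0, n1, K, 60)          # per year
predicted_1800 = logistic(10, a, K, n0)
```

The computed K and a can then be compared with the values quoted above, and the remaining entries in the predicted column obtained by evaluating logistic at t = 20, 30, and so on.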
As figures from later censuses became available, the remarkable agreement
between actual and predicted values for the population of the USA
continued, as can be seen from the first few rows of Table 11.3.4. After
1950, however, the predicted values consistently underestimated the actual
size of the population, and by 1980 the actual population was well
in excess of the previously estimated carrying capacity, K.

Table 11.3.4. Good then bad news for the logistic model.

    Year    Actual     Predicted
    1920    105.711    107.395
    1930    122.775    122.398
    1940    131.670    136.318
    1950    150.679    148.678
    1960    179.323    159.231
    1970    203.235    167.944
    1980    226.546    174.942

The failure of the Pearl and Reed model to give a realistic prediction
of the maximum sustainable population for the USA highlights the
difficulties of making predictions about the growth of human populations
in the long term. An obvious difficulty is that human beings can change
their environment in such a way as to invalidate the values of the
parameters previously relevant to the model. Thus, for example, technological
advances in agricultural production and distribution can improve the
supply of food and thereby increase the carrying capacity. Advances
in medical science can decrease the death rate and thereby increase the
growth rate. It is easy to think of many other factors under the control
of human beings which can affect the values of the parameters.
Human population models are discussed in Braun (1983), Section 1.5,
Hutchinson (1978), pages 22-23, and in Keyfitz (1977), pages 213-220.

Exercises 11.3

1. This exercise is about the population of yeast cells from which the data in
Table 11.3.1 were obtained.
(a) Estimate the growth rate a and the carrying capacity K for this
population by comparing Figure 11.3.4 with Figure 11.3.1. What is the
accuracy of your estimates?
(b) Hence compare the number N of yeast cells at time t predicted by the
formula (3) in the text with the observed values given in the table, for
t = 0, 2, 4, ..., 18 hours.
2. Let a, K, and $N_0$ be positive real numbers such that $0 < N_0 < K$. Use
separation of variables to show that the solution of the logistic equation
$$ \frac{dN}{dt} = aN\left(1 - \frac{N}{K}\right) $$
which satisfies the initial condition $N = N_0$ when t = 0 is
$$ N = \frac{K}{\left(\dfrac{K}{N_0} - 1\right)e^{-at} + 1}. $$

3. Repeat Exercise 2, but this time suppose that $N_0 > K$. Sketch the graph of a
typical solution.
(a) What does your graph tell you about what happens in the long term if
the population initially exceeds the carrying capacity?

4. Let N be given as a function of the time by the solution of the logistic
equation in Exercise 2. Let $N_0$, $N_1$ and $N_2$ be the values of N when
t = 0, t = T and t = 2T respectively.
(a) Show that
$$ \frac{N_1}{N_0}\,e^{-aT} = \frac{K - N_1}{K - N_0}, \qquad
   \frac{N_2}{N_0}\,e^{-2aT} = \frac{K - N_2}{K - N_0}. $$

(b) Eliminate T and solve for K to show that
$$ K = N_1\,\frac{N_2(N_1 - N_0) - N_0(N_2 - N_1)}{N_1^2 - N_0 N_2}. $$

5. (a) Show that if the population N reaches half its carrying capacity when
the time $t = t_1$ then the solution of the logistic equation in Exercise 2
may be written
$$ N = \frac{K}{e^{-a(t - t_1)} + 1}. $$
(b) On page 32 of Braun (1975) it is stated that Pearl and Reed calculated
that the population of the USA reached half its carrying capacity in
April 1913. It follows from the above formula that in year t
$$ N = \frac{197\,273\,000}{e^{-0.03134(t - 1913.25)} + 1}. $$
Does this formula give the values predicted in Table 11.3.3? Comment.

6. Growth of a tumour. A tumour grown in a laboratory with a plentiful supply
of nutrients provides an example of restricted growth which does not follow the
logistic model. The number of cells in the tumour is proportional to its volume
V at time t. Its growth rate is then
$$ \frac{1}{V}\frac{dV}{dt}. $$
The growth rate, instead of decreasing linearly with V (as in the logistic model),
decreases exponentially with t, its value at time t being found empirically to be
$$ \alpha e^{-\lambda t} $$
where $\alpha$ and $\lambda$ are positive constants.

(a) Express this empirical law as a differential equation.
(b) Show, by separation of variables, that the solution which satisfies the
initial condition $V = V_0$ when t = 0 is
$$ V = V_0 \exp\left(\frac{\alpha}{\lambda}\left(1 - e^{-\lambda t}\right)\right). $$

[The law of growth for the tumour is called the Gompertz growth law. Further
discussion may be found, for example in Braun (1983), and Rubinow (1975),
page 43.]

11.4 Exponential decay

A variety of processes involving the decay of some substance can be
usefully modelled by the assumption that the rate of decrease of the
substance at any time is proportional to the amount of the substance
present.

Radioactive decay
A typical example of such a process is the decay of a radioactive element,
such as radium. Since the decrease in mass is caused by the emission
of alpha particles, the decay is really a discrete process. The mass of
an alpha particle, however, is very small compared with the mass of the
sample and hence it is appropriate to regard the mass as a quantity which
can change continuously. On average, the larger the sample, the greater
will be the number of alpha particles emitted per unit time. The simplest
way to model this is to assume that, at any time,
{rate of decrease of mass of sample} is proportional to
{mass of sample still present}.
To express this as a differential equation, let m denote the mass of the
sample still present at time t and so obtain
$$ \frac{dm}{dt} = -km \tag{1} $$
where k is a positive constant. The minus sign ensures that the derivative
of m with respect to t is negative; hence the mass of the sample decreases
as time goes on.

Fig. 11.4.1. Exponential decay.

Since this differential equation is linear homogeneous with constant
coefficient, it can be solved by the method of Section 11.1. The solution
satisfying the initial condition $m = m_0$ when t = 0 is found in this way
to be
$$ m = m_0 e^{-kt}. \tag{2} $$
The graph of a typical solution is sketched in Figure 11.4.1 and the
process just modelled is said to be an example of exponential decay (in
contrast to the exponential growth of populations in Section 11.2).
When a substance decays exponentially, it takes a fixed time T for the
amount to decrease by a factor of 2 (just as with exponential growth,
it takes a fixed time for a population to double its size). The time T is
called the half-life of the decaying substance and is related to the decay
constant k by the equation
$$ T = \frac{\ln(2)}{k}. \tag{3} $$

Radium has a half-life of 1600 years while the extremely dangerous

radioactive element plutonium (used in atomic weapons and nuclear
power stations) has a half-life of 24100 years. A table of the half-
lives of radioactive elements is given in Giancolo (1985). Some uses
and dangers of radioactive elements are discussed in Marion (1976).
An interesting application of the exponential decay model to carbon-14
dating of archaeological finds is explained later in an exercise.
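
The relation (3) between half-life and decay constant is easily put to work. The following sketch uses the half-life of radium quoted above; the function names are our own and the initial mass of 1 is arbitrary.

```python
import math

def decay_constant(half_life):
    """Decay constant k obtained from T = ln(2)/k."""
    return math.log(2) / half_life

def mass_remaining(m0, k, t):
    """Exponential decay formula m = m0 * exp(-k * t)."""
    return m0 * math.exp(-k * t)

k_radium = decay_constant(1600.0)                    # per year
after_one = mass_remaining(1.0, k_radium, 1600.0)    # after one half-life
after_two = mass_remaining(1.0, k_radium, 3200.0)    # after two half-lives
```

After each further half-life the remaining mass is halved again, whatever the mass at the start of that period.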

Drug absorption
Another process which also leads to an exponential decay model is the
absorption of drugs from the bloodstream into the body tissues. When
a drug is administered by an injection, it mixes with the blood. As time
goes on, the amount of the drug in the bloodstream diminishes, being
absorbed by the body tissues or excreted from the body. When medical
staff administer a drug it is important for them to know how much to
give in the next injection: too little and the drug is ineffective, too
much and undesirable side effects could result.
The significant quantity to monitor is the concentration of the drug in
the bloodstream, which is defined as the amount of drug per unit volume
of blood, and is usually measured in mg/litre. For most drugs the rate
of absorption from the bloodstream increases with higher concentration.
As with radioactive decay, the simplest model consistent with this is the
assumption that, at any time,
{rate of decrease of concentration} is proportional to {concentration}.

With c denoting the concentration of the drug in the bloodstream at time
t this may be written as a differential equation
$$ \frac{dc}{dt} = -\mu c \tag{4} $$
where µ is a positive constant. This differential equation has an obvious
analogy with the differential equation (1) for radioactive decay, and can
be solved in a similar way.
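
By analogy with radioactive decay, the solution with initial concentration $c_0$ is $c = c_0 e^{-\mu t}$, and the time taken to fall to a given level follows by taking logarithms. The sketch below illustrates this. All the numerical values are hypothetical and are not taken from the text or its exercises; the function names are our own.

```python
import math

def concentration(c0, mu, t):
    """Concentration c = c0 * exp(-mu * t) at time t."""
    return c0 * math.exp(-mu * t)

def time_to_reach(c0, c_min, mu):
    """Time at which the concentration has fallen from c0 to c_min."""
    return math.log(c0 / c_min) / mu

# Hypothetical values: 12 mg/litre initially, mu = 0.25 per hour,
# and a minimum effective level of 4 mg/litre
c0, mu, c_min = 12.0, 0.25, 4.0
t_min = time_to_reach(c0, c_min, mu)    # hours until the next dose is needed
```

A calculation of this kind indicates how long one may wait before the next injection, once $\mu$ has been measured for the drug concerned.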

Exercises 11.4
1. In the differential equation (1) in the text:
(a) What quantities do the symbols m, t, and dm/dt stand for?
(b) Why is there a minus sign?
(c) Assume mass is measured in grams and time in years. What are the SI
units for the decay constant k? What are its dimensions?

2. Obtain the solution of the differential equation (1) in the text which satisfies
the initial condition m = mo when t = 0. What makes the solution decrease more
rapidly : large k or small k ?
3. Show from the solution (2) in the text that, if the mass of a radioactive sample
decays from mo to mo/2 in time T, then T = ln(2)/k.

4. Given that the half-life of radium is 1600 years, what is the value of its decay
constant k ? How long does it take for the mass of a given sample to decrease to
3 of its value? a of its value? n of its value?
5. Carbon-14 dating. While a plant or animal is living, the ratio of 14C to 12C
in its tissues is a small constant, the same for all living tissue. When a plant
or animal dies, however, this ratio decays exponentially with a half-life of 5730
years.
A sample of charcoal was found at the cave at Lascaux in France containing
the famous prehistoric painting, for which the ratio of 14C to 12C had decayed
to 14.5% of its original value. How many years ago did the wood grow?
(Further information about carbon-14 dating is given in Braun (1983),
Section 1.3.)

6. (a) Match each symbol occurring in the differential equation (1) in the text
with the symbol which plays a similar role in the differential equation
(4).
(b) Why is there a minus sign in the differential equation (4)?
(c) Suppose the concentration is measured in mg/litre and time is measured
in hours. What are the units for µ?
(d) Use the solution given in the text for (1) to write down the solution of
(4) which satisfies the initial condition c = co when t = 0.
(e) How long does it take for the concentration to reduce to half its initial
value? Express your answer in terms of µ.

7. The concentration of drug in a patient's bloodstream reduces to half its initial

value in 30 minutes. What is the concentration after 2 hours?
8. Suppose that after 4 hours an additional injection is given to the patient in
Exercise 7.
(a) Find the concentration 5 hours after the initial injection was given. See
if you can identify the constituents from the first and second injections.
(b) Sketch a graph of the concentration against the time.

9. When the drug Theophylline is administered for asthma, a concentration in
the blood below 5 mg/litre of blood has little effect while undesirable side effects
appear if the concentration exceeds 20 mg/litre. Suppose a dose corresponding
to 14 mg/litre of blood is administered initially. The concentration satisfies the
differential equation
$$ \frac{dc}{dt} = -\frac{c}{6} $$
where the time t is measured in hours.
(a) Find the concentration at time t.
(b) Show that a second injection will need to be given after about 6 hours
to prevent the concentration becoming ineffective.
(c) Given that the second injection also increases the concentration by 14
mg/litre, how long is it before another injection is necessary?

(d) What is the shortest safe time that a second injection may be given so
that side effects do not occur?

10. One method of administering a drug is to feed it continuously into the
bloodstream by a process called intravenous infusion. This may be modelled by the
linear differential equation
$$ \frac{dc}{dt} = -\mu c + D $$
where c is the concentration in the blood at time t, $\mu$ is a positive constant, and
D is also a positive constant which is the rate at which the drug is administered.
(a) Find the constant (or equilibrium) solution of the differential equation.
(b) Given $c = c_0$ when t = 0, find the concentration at time t. What limit
does the concentration approach as $t \to \infty$? Compare with your answer
to part (a).
(c) Sketch the graph of a typical solution.
12
Modelling heat flow

Some typical processes from everyday life which involve the flow of heat
from one region to another are the heating of beverages, food and living
areas, and the cooling of foodstuffs in refrigerators. The flow of heat
involved in such processes is best described by mathematical models.
This chapter introduces some simple mathematical models which are
based on Newton's law of cooling and Fourier's law of heat conduction.
These laws lead to very simple differential equations of the type studied
in Chapter 11. At the end of this chapter these ideas are used to model
the loss of heat from an insulated water pipe. The model makes some
unexpected predictions.
The only concept from physics which is assumed initially is that of
temperature which indicates the hotness of a body, and is measured
with a thermometer.

12.1 Newton's model of heating and cooling

A hot cup of coffee, when left standing for a while, cools as heat is
lost to the surrounding air. The temperature of the coffee drops and,
if the coffee is left standing for long enough, its temperature eventually
reaches that of its surroundings. This example is typical of many processes
involving cooling, and heating, which occur in a wide variety of situations.
Fortunately there is a very simple mathematical model for such problems,
due to Newton, which is both reliable and versatile.
In this section the simplest version of Newton's model is described,
which uses only the concept of temperature. The model will be refined
in the next section so as to include the effects of such factors as the
size of the heated body and the material of which it is composed. This
refinement, however, will involve us in a discussion of the concept of the
amount of heat in a body, which is a little more sophisticated than that
of temperature.

The model
Although the model for cooling applies to any heated object, we shall
stay with the cup of coffee as an illustration. We aim at predicting how
the temperature of the coffee changes with time.
The intuitive starting point for modelling this problem is the idea that
the greater the difference between the temperature of the coffee and that
of the surrounding room, the greater will be the rate of cooling of the
coffee. The simplest mathematical model consistent with this requirement
is to have

{rate of cooling} is proportional to {temperature difference}. (1)

This is known as Newton's law of cooling. In proposing this law, Newton

assumed that the coffee was fanned by a continuous stream of air at the
temperature of the surroundings.
To express the model in terms of mathematical symbols we let

u = {temperature of the coffee at a time t after being placed in the room},

$u_s$ = {temperature of the surrounding room},

and we assume $u_s$ to be constant. Note that u is a function of the time t.
The law (1) may then be written as a differential equation for u = u(t),
$$ \frac{du}{dt} = -\lambda(u - u_s) \tag{2} $$
where $\lambda > 0$ is the constant of proportionality. Note that the minus
sign is required so that for $u > u_s$ (that is, the coffee is hotter than the
surroundings) we obtain du/dt < 0 (that is, temperature is a decreasing
function of time).
Equation (2) is also applicable to situations in which a cold object is
placed in a hot room. Here $u - u_s < 0$ so that du/dt > 0 (that is, the
temperature of the object increases with time).

Solving the differential equation

Newton's law of cooling thus provides us with the differential equation
(2) for the temperature u as a function of the time t. By solving this
differential equation, with a given initial condition, we can find how the
temperature varies with the time. The value of the parameter $u_s$, being the
temperature of the surrounding room, is easily determined, while ways
of determining the value of $\lambda$ will be discussed later.

Example 1. A cup of coffee is initially at boiling point, 100 °C. The temperature
of the room is 20 °C. Find the temperature of the coffee as a function of the time.

Solution. Let u be the temperature of the coffee after time t. Since $u_s = 20$, the
differential equation (2) is now
$$ \frac{du}{dt} = -\lambda(u - 20) \tag{3} $$
with the initial condition u = 100 when t = 0. The differential equation, being
linear constant coefficient, may be solved by the method of Section 11.1.
First, we find all solutions of the homogenized equation
$$ \frac{du}{dt} = -\lambda u. \tag{4} $$
Try $u = e^{mt}$ where m is a constant to be determined. By substituting in the
homogenized equation (4) we find that $m = -\lambda$. Hence one solution of (4) is
$u = e^{-\lambda t}$. As (4) is homogeneous, every solution therefore has the form
$$ u = Ce^{-\lambda t} \tag{5} $$
for some real constant C.
Second, we guess the original equation (3) has a particular solution in which u
is a constant. This means that du/dt = 0 and hence by (3) that u - 20 = 0. Thus
a particular solution of (3) is the constant solution
$$ u = 20. \tag{6} $$
Finally, all the solutions of the original equation (3) are obtained by adding the
solutions (5) and (6) to get
$$ u = Ce^{-\lambda t} + 20. \tag{7} $$
The initial condition when used in (7) gives
$$ 100 = Ce^{0} + 20 = C + 20 $$
and hence C = 80. Thus the required solution is
$$ u = 80e^{-\lambda t} + 20. \tag{8} $$
It is left as an exercise to check that (8) satisfies both the differential equation (3)
and the initial condition. This formula gives the temperature u of the coffee as a
function of the time t.
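
The formula (8) can also be checked against a direct numerical solution of the differential equation (3). The sketch below uses Euler's method, which is not the method used in the text, and a hypothetical value of $\lambda$ (0.1 per minute) chosen only for illustration.

```python
import math

LAM = 0.1      # hypothetical value of lambda, per minute
U_S = 20.0     # room temperature in degrees C
U_0 = 100.0    # initial temperature of the coffee in degrees C

def exact(t):
    """Formula (8): u = 80 exp(-lambda t) + 20."""
    return 80.0 * math.exp(-LAM * t) + 20.0

def euler(t_end, steps):
    """Euler's method for du/dt = -LAM (u - U_S) with u = U_0 at t = 0."""
    h = t_end / steps
    u = U_0
    for _ in range(steps):
        u += h * (-LAM * (u - U_S))
    return u

approx = euler(10.0, 100_000)    # numerical temperature after 10 minutes
```

With a small enough step the numerical value agrees closely with exact(10.0), which supports the algebra leading to (8).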

Fig. 12.1.1. Graph of temperature of coffee against time.

Behaviour of solutions
The constant solution (6) obtained in the course of the above working
has the interesting physical interpretation that if the coffee is initially at
room temperature 20 °C, then it will stay at this temperature indefinitely.
It is therefore called the steady-state temperature for the coffee.
Note that the formula (8) does not provide a complete answer to
Example 1 since the value of the parameter $\lambda$ has not yet been specified.
In spite of this, however, it is possible to indicate the general shape of a
typical graph of temperature against time, as in Figure 12.1.1.
The graph was obtained from (8) by observing that
$$ u = 100 \quad \text{when } t = 0, \tag{9} $$
$$ \frac{du}{dt} < 0 \quad \text{for } t \ge 0 \tag{10} $$
and
$$ \lim_{t \to \infty} u = 20. \tag{11} $$
The property (11) means that, as the time becomes arbitrarily large, the
temperature of the coffee approaches the steady-state temperature, equal
to that of the surroundings.

Determining the parameter A

To determine the value of the temperature of the coffee at any time, it is
necessary to know the value of the parameter A. This requires information
additional to that given in Example 1. One way is to give the temperature
236 Modelling heat flow

at some other time, besides the initial one. A second way, which will
be explored in the next section, is to reformulate the model, taking into
account the physics of heat transfer. This will show how the parameter $\lambda$
depends on such factors as the mass of the heated object, the material of
which it is composed, and its surface area. Such information will make
the model more versatile and will play an essential role later in our study
of the effect of insulating a hot water pipe.
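The first approach can be sketched in code. Given one further reading $u = u_1$ at time $t = t_1$, the solution $u = (u_0 - u_s)e^{-\lambda t} + u_s$ can be rearranged to give $\lambda = -\ln\big((u_1 - u_s)/(u_0 - u_s)\big)/t_1$. The second reading used below (60 °C after 10 minutes) is a hypothetical value chosen only for illustration, as is the function name.

```python
import math

def estimate_lambda(u0, u1, t1, us):
    """Estimate lambda in Newton's law of cooling from a second reading.

    u0: temperature at t = 0, u1: temperature at t = t1 > 0,
    us: temperature of the surroundings (all in the same units).
    """
    ratio = (u1 - us) / (u0 - us)
    if not 0.0 < ratio < 1.0:
        raise ValueError("readings are not consistent with Newton's law")
    return -math.log(ratio) / t1

# Hypothetical data: coffee at 100 degrees cools to 60 degrees in 10 minutes
# in a room at 20 degrees.
lam = estimate_lambda(u0=100.0, u1=60.0, t1=10.0, us=20.0)
print(lam)  # lambda per minute; here it equals ln(2)/10
```

The same function applies to an object that is warming up, since the ratio of temperature differences again lies between 0 and 1.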

Exercises 12.1
1. An object is at a temperature $u$ which is colder than the temperature $u_s$ of
its surroundings. Let $\lambda > 0$. Which of the following differential equations predict
that the temperature of the object will increase with time?
(a) $\dfrac{du}{dt} = \lambda u$    (b) $\dfrac{du}{dt} = \lambda(u_s - u)^3$
(c) $\dfrac{du}{dt} = -\lambda(u - u_s)^2$    (d) $\dfrac{du}{dt} = -\lambda\,|u_s - u|$
2. Let $\lambda$, $u_s$ and $u_0$ be constants. Show that the solution of the differential
equation
$$\frac{du}{dt} = -\lambda(u - u_s)$$
which satisfies the initial condition $u = u_0$ when $t = 0$ is
$$u = (u_0 - u_s)e^{-\lambda t} + u_s.$$
[Hint: Use the method of Example 1 in the text, which utilizes the fact that the
differential equation is linear with constant coefficients.]
3. Repeat Exercise 2, but use the method of separation of variables.
4. A cold beer is at a temperature of 10 °C. After 10 minutes the beer is at
a temperature of 15 °C. Find how long it takes for the beer to warm to 20 °C,
given that the temperature of the room is 30 °C.
5. Sketch the general shape of the graph of temperature against time in Exercise
2, assuming $\lambda > 0$ and $u_0 < u_s$.
6. From the expression in Exercise 2 for temperature as a function of time, say
what happens when uo = us. Explain this physically.
7. Suppose that, instead of Newton's law of cooling (2) in the text, the law of
cooling is
$$\frac{du}{dt} = f(u - u_s)$$
for some function f. Explain the physical interpretations of each of the following
conditions on the function f and state whether they seem realistic.
(a) f (0) = 0. (b) f (x) > 0 for x > 0
(c) f (x) < 0 for x < 0 (d) f (x) = -f(-x).
8. A student was seen to enter a tutor's office at 4.00 p.m. The tutor was later
at the bar at 4.30 p.m. drinking heavily. At 6 p.m. the cleaners discovered the
student's body in the tutor's office and called the police. The police first measured
the temperature of the corpse at exactly 6.30 p.m. as 30 °C and later at 8.30 p.m.
as 27 °C. The temperature of the office remained at a constant 25 °C. [Hint:
Assume Newton's Law of cooling, and use the solution obtained in Exercise 2.]
(a) What is $u_s$?
(b) Why is it a good idea to set $t = 0$ to correspond to 6.30 p.m.? What is
the initial temperature?
(c) Write down an expression for the temperature at time $t$ and hence
determine $\lambda$ from the information given in the question.
(d) Hence determine the time of death, assuming that the temperature of the
student just before the murder was 37 °C (normal body temperature).

9. As mentioned in the text, Newton's law of cooling assumes that air at room
temperature is blown past the cooling body ('forced cooling'). For cooling in still
air ('natural cooling') a better model is to assume that the rate of temperature
decrease of the cooling body is directly proportional to the 5/4th power of
the difference between the temperature of the body and the temperature of the
surrounding air.
(a) Introduce appropriate notation and thus write the law for natural cooling
as a differential equation. Is it linear?
(b) Show that the temperature $u$ of the cooling body at time $t$ is given by
the formula
$$u = u_s + \left((u_0 - u_s)^{-1/4} + \frac{\lambda t}{4}\right)^{-4}$$
where $u = u_0$ when $t = 0$, $u_s$ is the temperature of the
surrounding air and $\lambda$ is a constant.

12.2 More physics in the model

The model for cooling in the previous section involved the constant of
proportionality $\lambda$. In this section we make the model more versatile
by explaining how the parameter $\lambda$ depends on physical aspects of the
cooling body, such as its mass and size. We do this by reformulating
the model, taking into account more of the underlying physics. The
idea is to express the model in terms of loss of heat, rather than loss of
temperature.

Heat and temperature

Recall that temperature is measured with a thermometer in °C and
indicates how hot a substance is. Heat, on the other hand, is a quantity
which can flow from a hotter substance to a colder one, thereby raising
its temperature. The microscopic origin of heat is the motion of atoms
and molecules which compose the substance. Heat is a form of energy,
which in the SI system is measured in joules (J).
The change in the heat of a given substance depends on both the mass
of the substance and the change in temperature, in a manner which we
now describe.
As to the dependence on mass, suppose, for example, that 1 joule of
heat flows into 1 kg of the substance, raising its temperature by 1 °C. It
then seems reasonable to suppose that 2 joules of heat will be required
to raise the temperature of 2 kg by the same amount. More generally,
we assume that when the temperature of a given substance is raised by
a fixed amount
{change in heat} is proportional to {mass of substance}. (1)

If, furthermore, we keep the original mass of the substance fixed but
wish to raise its temperature by 2 °C we might expect that it would
take twice as much heat energy and, in general, for a fixed mass of the
substance,
{change in heat} is proportional to {change in temperature}. (2)

Measurements show, however, that (2) is only approximately true. It is
reasonably accurate provided that the temperature stays close enough
to some initially fixed temperature of, say, 20 °C. To simplify our model,
however, we shall simply assume that (2) always holds.
We now combine the assumptions (1) and (2) in a single formula.
Firstly we define some notation by letting
m = {mass of a given substance},
H = {amount of heat in the sample of mass m},
u = {temperature of the sample}.
Here $H$ and $u$ vary with time and $m$ is a constant. Now if $\delta u$ denotes the
change in temperature due to a change in the amount of heat $\delta H$ then
(1) and (2) combine to give
$$\delta H = cm\,\delta u \tag{3}$$

where c is a positive constant of proportionality.

Table 12.2.1. Specific heat c for some common substances, taken at 20 °C
except where otherwise indicated.

Substance          c (J kg⁻¹ °C⁻¹)    Substance     c (J kg⁻¹ °C⁻¹)
Aluminium          896                Asbestos      841
Copper             383                Brick         840
Iron               452                Concrete      837
Stainless steel    461                Glass         800
Water (at 0 °C)    4226               Butter fat    2300
Water (at 20 °C)   4182               Lamb          3430
Water (at 100 °C)  4211               Potatoes      3520

For our purposes it will be more useful to recast (3) in a form which
refers to the rate of change of heat. Suppose, therefore, that the changes
in (3) occur during a time interval of length $\delta t$. Dividing both sides of
(3) by $\delta t$ and then letting $\delta t$ approach 0 gives in the limit
$$\frac{dH}{dt} = cm\frac{du}{dt}. \tag{4}$$

This gives the desired relationship between rate of change of heat and
rate of change of temperature at any given instant.
The constant c depends on the type of substance being heated (aluminium,
brick, glass, etc.) and is called the specific heat of that substance.
Specific heats of some common substances are shown in Table 12.2.1. In
the table the specific heats are said to be `taken at 20 °C' to indicate that
the formula (3) is valid provided the temperatures stay close to 20 °C.
Thus we can see from Table 12.2.1 and equation (3) that metals such
as aluminium, copper and iron have lower values of c and thus require
much less heat energy per kilogram to raise their temperature than do water and
food products such as butter fat, lamb and potatoes (which contain a
substantial proportion of water).
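The comparison can be made concrete with a short Python sketch of equation (3), $\delta H = cm\,\delta u$. The specific heats are copied from Table 12.2.1; the choice of a 1 kg sample and a 10 °C rise is an assumption made only for illustration, as are the names used in the code.

```python
# Specific heats in J kg^-1 degC^-1, copied from Table 12.2.1.
SPECIFIC_HEAT = {
    "aluminium": 896.0,
    "copper": 383.0,
    "iron": 452.0,
    "water_20C": 4182.0,
    "potatoes": 3520.0,
}

def heat_required(c, mass, delta_u):
    """Heat change in joules from equation (3): delta_H = c * m * delta_u."""
    return c * mass * delta_u

# Assumed illustration: heat needed to raise 1 kg of each substance by 10 degC.
for substance, c in SPECIFIC_HEAT.items():
    print(f"{substance:12s} {heat_required(c, 1.0, 10.0):8.0f} J")
```

Because the formula is linear in both the mass and the temperature change, doubling either one doubles the heat required.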

Newton's law of cooling revisited

A useful formula for the parameter $\lambda$ occurring in Newton's law of
cooling will now be derived. The derivation will be based on the idea
that a hot object, placed in cold surroundings, cools by giving up heat
to the surroundings. Similarly, a cold object heats up by gaining heat
from its surroundings. This, of course, means that the temperature of
the surroundings will change as it gains or loses heat. Since the heat is
spread over such a large region by convection, however, the change in
temperature is usually neglected.
Table 12.2.2. Some heat transfer coefficients.

Type of convection at surface             h (W m⁻² °C⁻¹)
Plate in still air                        4.5
Airflow at 2 m/s over plate               12
Airflow at 35 m/s over plate              75
Cylinder, 5 cm diameter, in still air     6.5
Cylinder, 2 cm diameter, in still water   890

We now model how the heat is lost to (or gained from) the surroundings.
The key quantity to consider is the rate of change of heat energy
contained within the object, which we have denoted by dH/dt.
First, since the heat loss occurs at the surface of the object, it seems
reasonable to suppose that
{rate of change of heat} is proportional to {surface area of object}. (5)

Second, when cooling is expressed as a loss of heat, Newton's law of
cooling says that
{rate of change of heat} is proportional to {temperature difference}. (6)

Denoting the surface area of the object by A, (5) and (6) may be combined
to give
$$\frac{dH}{dt} = -hA(u - u_s) \tag{7}$$
where h is a positive constant of proportionality.
The constant h is known by several different names: the convective
heat transfer coefficient, the surface conductance, and the surface convection
coefficient. Experimentally determined values of this coefficient are
shown in Table 12.2.2 under various circumstances. The unit for h is
W m⁻² °C⁻¹ (where the watt, W, is the unit of power, equal to 1 joule per second).

We can now use the relationship (4) between heat and temperature to
write (7) as the following differential equation for the temperature.
$$\frac{du}{dt} = -\frac{hA}{cm}(u - u_s).$$

Comparison of this differential equation with Newton's law of cooling
(2) in the previous section shows that
$$\lambda = \frac{hA}{cm}. \tag{8}$$
Thus, by including some of the physics of heat transfer in our model
of cooling, we have been able to derive a formula for the constant of
proportionality $\lambda$. The formula involves the heat transfer coefficient h, the
surface area A of the heated object, the specific heat c of the substance
of which the object is composed, and the mass m of the object.
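A minimal Python sketch of formula (8) follows. The object considered (a 1 kg copper plate of total surface area 0.5 m² in still air) is a hypothetical example of our own; the values of c and h are taken from Tables 12.2.1 and 12.2.2. With these SI units, $\lambda$ comes out in units of s⁻¹.

```python
def cooling_parameter(h, area, c, mass):
    """Parameter lambda in Newton's law of cooling, from formula (8):
    lambda = h * A / (c * m).

    h: heat transfer coefficient (W m^-2 degC^-1), area: surface area (m^2),
    c: specific heat (J kg^-1 degC^-1), mass: mass (kg). Result is in s^-1.
    """
    return h * area / (c * mass)

# Hypothetical object: 1 kg copper plate, total surface area 0.5 m^2, still air.
lam = cooling_parameter(h=4.5, area=0.5, c=383.0, mass=1.0)
print(lam)      # lambda in s^-1
print(1 / lam)  # the corresponding time scale 1/lambda in seconds
```

The formula shows directly that a larger surface area or a stronger airflow (larger h) speeds up cooling, while a larger mass or specific heat slows it down.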

Exercises 12.2

1. Discuss the significance of the minus sign multiplying the RHS of the
differential equation (7) in the text.
2. Find the parameter $\lambda$ in Newton's law of cooling for an iron plate whose total
surface area is 2 m² which is cooling in a stream of air flowing over the plate at
35 m/s. Assume the mass of the plate is 2 kg.
3. Consider a 3 kg plate of iron with total surface area 2 m², initially at a
temperature 150 °C. In each of the following cases, find how long it takes to cool
to a temperature 100 °C if the temperature of the surroundings is 20 °C and
(a) the plate is in still air,
(b) air flows over the plate at a speed of 35 m/s.
Useful data are given in various tables in the text.

12.3 Conduction and insulation

This section is about how heat flows through a sample of a given material,
this process being known as conduction of heat. In some situations it is
desirable to maximize the conduction of heat through the material so as
to ensure efficient heating; in others the aim is to minimize conduction
so as to reduce heat loss. A material which does not conduct heat readily
is called an insulator.

Steady-state conduction
We will consider the conduction of heat through insulating material
between the inner and outer walls of a house, as in Figure 12.3.1 below.
Fig. 12.3.1. Heat flow through a wall. (Labels: inner wall, insulating material, outer wall; heat flowing in, heat flowing out at J joules/min.)

Suppose that the inner wall is at a temperature of 20 °C and the outer
wall is at a temperature of 10 °C. Thus heat flows from the inner wall
to the outer wall. The temperature inside the insulating material varies
continuously between the temperatures of the inner and outer walls.
As heat flows from the inner to the outer wall, some of the heat
goes into raising the temperature of the insulation. We suppose that
eventually the temperature at each point inside the insulation reaches a
steady-state; this steady-state temperature is independent of the time but
varies continuously with respect to distance, from inside to outside. When
the temperatures inside the insulation have reached this steady-state, the
rate of flow of heat going into the insulation must equal the rate of flow
of heat coming out.

Fourier's law for heat flow

A model for heat conduction will now be described which can be used
to predict the rate at which heat flows through the insulation, once the
steady-state has been reached.
First we need to introduce some notation. Imagine a cross-section
through the insulating material which is perpendicular to the direction
of heat flow, as in Figure 12.3.2. Suppose that this cross-section is at a
distance x from the inside wall, and has area A. Now put
J = {rate at which heat is flowing through a cross-section of area A in the x-direction}. (1)
Intuition suggests that a cross-section of double the area will have double
the heat flowing through it, and, more generally, that
J is proportional to {the area A through which the heat flows}. (2)

Fig. 12.3.2. Rate of heat flow at distance x. (Labels: cross-section, outer wall, heat flows out.)

Intuition also suggests that the rate of heat flow, along the x-direction,
will depend on the drop in temperature per unit length in this direction.
A larger drop per unit length will produce a larger rate of heat flow. The
simplest model consistent with this idea is to assume that
J is proportional to {temperature gradient}. (3)

This law of heat conduction is named after the famous mathematical
physicist Joseph Fourier, who proposed it in 1822.
Once the temperature in the insulation has reached a steady-state, the
temperature will be a function of x only, say $u = \phi(x)$, and so the
temperature gradient at any point is just the derivative $du/dx$ at that
point. Thus, combining (2) and (3) into the one formula gives
$$J = -kA\frac{du}{dx} \tag{4}$$
where k is a positive constant of proportionality. Reference to Figure 12.3.2
indicates why the minus sign is necessary: if temperature is
decreasing along the x-direction then J is positive (as heat flows from
hot to cold) while $du/dx$ is negative.
The constant k which occurs in (4) is called the conductivity of the
material in the insulation. In Table 12.3.1 are listed conductivities of
some common materials. Note that a large value of k indicates a good
conductor, while a low value indicates a good insulator.
Table 12.3.1. Thermal conductivities of some common materials. These
are measured at 20 °C except where otherwise indicated.

Substance          k (W m⁻¹ °C⁻¹)    Substance      k (W m⁻¹ °C⁻¹)
Aluminium          204               Asbestos       0.113
Copper             386               Brick          0.38-0.52
Iron               73                Concrete       0.128
Stainless steel    14                Glass          0.81
Water (at 0 °C)    0.57              Wood           0.15
Lamb (at 5 °C)     0.42              Rock wool      0.04
Butter (at 5 °C)   0.20              Polystyrene    0.157

How good a model is Fourier's law? If we want (4) to hold exactly,
then for most materials we must allow k to vary with temperature.
For small ranges of temperature, however, our assumption that the
thermal conductivity is constant is a good approximation to what actually
happens.
If in (1) we assume that the area of the cross-section stays constant as
x varies, then the rate of heat flow J will also stay constant with respect
to x, once a steady-state has been reached. Fourier's law (4) then gives a
particularly simple differential equation,
$$\frac{du}{dx} = -\frac{J}{kA} = \text{constant},$$
which can be solved by antidifferentiation for the temperature u as a
function of the distance x. This in turn enables us to determine the heat
flow J, as in the following example.

Example 1. In a furnace the temperature of an inner wall of area 3 m² is
500 °C. The temperature of the outer wall is 100 °C. There is 1 metre of asbestos
insulation between the walls (the furnace having been built before the carcinogenic
property of asbestos was realized). How much heat escapes in one minute?

Solution. Let J be the rate at which heat flows across a cross-section
of area 3 m² parallel to the walls at a distance x metres from the inner wall
(as in Figure 12.3.2). Since the conductivity k in Table 12.3.1 is given in
W m⁻¹ °C⁻¹, J is measured in watts, that is, in joules per second. We assume
steady-state temperatures, hence J is a constant, which is to be determined.
Let u be the temperature at distance x. Hence by (4) the differential equation
$$\frac{du}{dx} = -\frac{J}{kA}$$
is satisfied where A = 3 m². The initial condition is u = 500 °C when x = 0. Solving
the differential equation by antidifferentiation gives
$$u = -\frac{J}{kA}x + 500.$$
Now u = 100 °C when x = 1 m. Hence
$$J = 400kA = 1200k \ (\text{as } A = 3) = 135.6 \ (\text{as } k = 0.113).$$
Thus about 136 joules of heat escape per second, which amounts to
135.6 × 60 ≈ 8140 joules in one minute.
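The calculation in Example 1 can be reproduced with a short Python sketch. Solving $du/dx = -J/(kA)$ with $u = u_a$ at $x = 0$ and $u = u_b$ at $x = \ell$ gives $J = kA(u_a - u_b)/\ell$, where $\ell$ is the thickness of the insulation. The function and argument names below are our own; note that with k in W m⁻¹ °C⁻¹ the result is in watts (joules per second).

```python
def slab_heat_flow(k, area, u_inner, u_outer, thickness):
    """Steady-state rate of heat flow through a flat slab (Fourier's law):
    J = k * A * (u_inner - u_outer) / thickness.

    k in W m^-1 degC^-1, area in m^2, temperatures in degC, thickness in m.
    Returns J in watts (joules per second).
    """
    return k * area * (u_inner - u_outer) / thickness

# Data of Example 1: asbestos (k = 0.113), area 3 m^2, 500 degC to 100 degC, 1 m thick.
J = slab_heat_flow(k=0.113, area=3.0, u_inner=500.0, u_outer=100.0, thickness=1.0)
print(J)       # rate of heat flow in watts
print(J * 60)  # heat escaping in one minute, in joules
```

The formula makes the proportionalities explicit: J doubles if the area or the temperature difference doubles, and halves if the thickness doubles.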

Exercises 12.3
1. Complete the solution to Example 1 in the text by verifying the claims made
concerning the solution of the differential equation satisfying the given initial
condition.
2. Give reasons why the following would not be suitable to use in place of
Fourier's Law.
(a) $J = -A\left(\dfrac{du}{dx}\right)^2$.
(b) $J = -A^2\,\dfrac{du}{dx}$.
[Hint: Consider dimensions.]
Exercises 3, 4, 5, 6, 7 and 8 refer to the rectangular slab of material shown below. It may
be regarded as a wall of a heated room or of a furnace.

[Diagram: rectangular slab, with the inner face and a cross-section labelled.]

The slab, of thickness $\ell$, has an inner face and an outer face, each of area A.
The inner face is at a uniform temperature $u_a$, and the outer face is at a cooler
temperature $u_b$. Heat is assumed to flow straight through the slab from inner to
outer face. All points on a cross-section at distance x from the inner face have the
same temperature u. Heat flows through this cross-section at the rate J. Assume the
temperature has reached the steady-state. In Exercise 7(c) below, you will express
J in terms of the other parameters.

3. In terms of the notation introduced above, what is the value of the temperature
(a) when $x = 0$,
(b) when $x = \ell$?
4. In our model we have assumed that the heat flows straight through the
slab from inner to outer face, none of it escaping out the other sides. Is this
assumption more appropriate when $\ell$ is large or when $\ell$ is small?
5. Given that the temperature has reached the steady-state, what follows about
the value of J as x increases from 0 to $\ell$?
6. On the basis of physical intuition, decide the effect on the value of J of each
of the following separate changes.
(a) Increasing ua.
(b) Increasing ub.
(c) Decreasing $\ell$.
(d) Increasing A.
(e) Replacing the material by one with greater thermal conductivity k.

7. Recall that Fourier's law
$$J = -kA\frac{du}{dx}$$
implies that, once the steady-state has been reached,
$$\frac{du}{dx} = \text{constant}.$$
(a) What does this tell you about the shape of the graph of u against x?
Hence state why this graph is as shown in the diagram below.
(b) Use the diagram below and the interpretation of the derivative as a slope
to obtain $du/dx$ in terms of $u_a$, $u_b$, and $\ell$.
(c) Deduce that
$$J = kA\,\frac{u_a - u_b}{\ell}$$
and hence verify your answers to Exercise 6.

[Diagram: graph of u against x, with axes labelled u-axis and x-axis.]

8. (a) Write down Fourier's law as a differential equation for the temperature
u as a function of the distance x. The differential equation will involve
the parameters k, A, and J. Why can the differential equation be solved
by antidifferentiation?
(b) Find the solution of the differential equation which satisfies the initial
condition $u = u_a$ when $x = 0$.
(c) Now use the value of the temperature u at the outer face to find J in
terms of the parameters k, A, $u_a$, $u_b$, $\ell$.
(d) Does your answer agree with that found in Exercise 7(c)?

9. Suppose a stone slab has a surface area of 10 m² and thickness 2.7 m. The
inner and outer faces are at steady temperatures of 20 °C and 0 °C respectively.
Given that the thermal conductivity of stone is 2.7 W m⁻¹ °C⁻¹, calculate J from
your answer to Exercise 7(c).
Exercises 10, 11, 12, 13, 14 and 15 below refer to the figure below (which gives an end-on
view of the slab in the diagram above).

[Diagram: end-on view of the slab. Hot air inside at temperature $u_{sa}$; inner face of slab at temperature $u_a$; outer face of slab at temperature $u_b$; cold air outside at temperature $u_{sb}$.]

Besides the assumptions already listed for Exercises 3 to 8, the following
additional assumptions also apply. The outer face of the slab, at a temperature of
$u_b$, is cooled by a stream of cold air at a temperature of $u_{sb}$.
Newton's law of cooling applies to the loss of heat from the outer face of the
slab to the cold air. From (7) of Section 12.2, this means that
$$\frac{dH_b}{dt} = -h_b A(u_b - u_{sb})$$
where $-dH_b/dt$ is positive and denotes the rate at which heat is being lost to the
cold air at the outer face, and $h_b$ denotes the heat transfer coefficient between the outer
face and the cold air. Similarly, the inner face of the slab, at a temperature of $u_a$,
is heated by a stream of hot air at a temperature of $u_{sa}$. Newton's law of cooling
applies to this face also. In Exercise 15 you will express the rate of heat flow J in
terms of the temperatures of the hot and cold air (rather than the temperatures of
the inner and outer faces, over which we have no direct control).

10. (a) Arrange the temperatures $u_a$, $u_b$, $u_{sa}$, $u_{sb}$ in increasing order.
(b) What are the signs of (i) $u_b - u_{sb}$ and (ii) $u_a - u_{sa}$?
(c) Extend the diagram shown in Exercise 7 to show also the temperature
of the hot air (corresponding to points with $x < 0$) and to show the
temperature of the cold air (corresponding to points with $x > \ell$).
(d) At which points is there a discontinuity in the graph you have drawn in
(c)?

11. Check, from the formula given above for $dH_b/dt$ and your answer to
Exercise 10(b), that $-dH_b/dt$ is positive.

12. (a) The steady-state temperatures having been attained, what is the relationship
between the rate J at which heat is arriving at the outer face
and the rate $-dH_b/dt$ at which heat is being lost from the outer face?
Deduce that
$$J = h_b A(u_b - u_{sb}).$$
(b) Hence find the temperature $u_b$ at the outer face in terms of the temperature
$u_{sb}$ of the cold air (and the parameters J, $h_b$, A).

13. Newton's law of cooling at the inner face may be written, with a suitable
choice of notation, as
$$\frac{dH_a}{dt} = -h_a A(u_a - u_{sa}).$$
What is the sign of $dH_a/dt$? What is the physical significance of this quantity?

14. (a) The steady-state temperature having been attained, what is the relationship
between the rate J at which heat is entering into the slab from the
inner face and the rate $dH_a/dt$ at which heat is being transferred to the
inner face from the hot air? Deduce that
$$J = h_a A(u_{sa} - u_a).$$
(b) Hence find the temperature $u_a$ of the inner face in terms of the temperature
$u_{sa}$ of the hot air (and the parameters J, $h_a$, A).

15. (a) From your answers to Exercises 7(c), 12(b) and 14(b), show that
$$J = \frac{A(u_{sa} - u_{sb})}{h_a^{-1} + h_b^{-1} + \ell k^{-1}}.$$
(b) What features of this answer agree with your physical intuition?

Exercise 16 refers to the diagram below, which shows two slabs of material joined
together.

[Diagram: two slabs of material joined together, with the inner slab labelled.]

The notation to be used for each slab is similar to that used in the previous
exercises. The inner and outer faces of the combination will be assumed to be at
the respective temperatures $u_a$ and $u_b$, where $u_a > u_b$. The thermal conductivities
of the respective slabs are denoted by $k_1$ and $k_2$. Assume the temperatures have
reached the steady-state; hence J is the same for each slab of material. For the
first slab
$$J = -k_1 A\frac{du}{dx} \qquad (0 < x < \ell).$$

16. (a) Write out in words the meaning of the above differential equation for
the temperature in the first slab.
(b) Write down a similar differential equation for the temperature in the
second slab.
(c) Solve these two differential equations, using the fact that $u = u_a$ when
$x = 0$ and $u = u_b$ when $x = 2\ell$.
(d) Hence show that
$$J = \frac{A(u_a - u_b)}{\ell\,(k_1^{-1} + k_2^{-1})}.$$

12.4 Insulating a pipe

In a normal Australian house, narrow pipes are used to convey hot water
from a supply to the taps. In cold weather it is desirable to reduce heat
loss from such pipes so as to minimize heating costs. For cold water
pipes, moreover, sufficient heat may be lost to freeze the water in the
pipe. The usual answer to these problems is to insulate the pipes.
We shall formulate a mathematical model of the heat flow through a
layer of insulation wrapped around a hot water pipe. The model will be
used to investigate how thick the insulation should be. The presentation
in this section concentrates on the formulation of the model and the
analysis of the results, leaving the detailed calculations for the exercises.

The problem stated

We consider a length L of a typical hot water pipe, as in Figure 12.4.1.
The pipe has an outer radius a and is surrounded by a layer of insulation
so that the radius of the exposed surface is b, where b > a. Thus the
thickness of the insulation is b - a.
The pipe is assumed to be at the same temperature, say $u_w$, as the hot
water. The surrounding air is at a cooler temperature $u_s$. Heat is lost by
flowing through the insulation and then escaping to the surrounding air.
The problem is to determine the extent to which the insulation reduces
loss of heat from the pipes.

Fig. 12.4.1. An insulated pipe. On the right is a cross-section of the pipe, with the radii a and b marked.

The model
Our model will be based on Fourier's law of heat conduction (to describe
the flow of heat outwards through the insulation) and Newton's law
of cooling (to describe the loss of heat from the outer surface of the
insulation to the surrounding air). Instead of imagining the heat as
flowing across plane faces as in Section 12.3, however, it will now be
regarded as flowing across cylindrical surfaces, which we now describe.
For each r > 0, the points which are at the same distance r from the
axis of the pipe form a cylinder of radius r, as shown in Figure 12.4.2.
(The cylinder is a surface, not a solid.) We assume the cylinder has the
same length L as the pipe. The cylinder coincides with the outer surface
of the pipe when r = a and with the outer surface of the insulation
when r = b. If r lies between a and b, then the cylinder lies inside the
insulation.
Note that, since a circle of radius r has circumference $2\pi r$, the surface
area of this cylinder is given by
$$A(r) = 2\pi rL. \tag{1}$$
Heat is assumed to flow radially outwards through these cylinders with
the temperature having the same value at each point of a given cylinder.
Fig. 12.4.2. Coordinate system for the pipe.

Let
u = {temperature at each point of the cylinder of radius r},
J = {rate of heat flow radially through the cylinder}.

We assume that the steady-state temperatures have been attained. It
follows that J is constant with respect to r, and that u is a function of r,
say $u = \phi(r)$. The temperature gradient in the direction of the heat flow
can hence be written as $du/dr$.
Fourier's law of heat conduction, introduced in Section 12.3, will be
used to model heat flow through the insulation. Recall that, according to
this law, the rate of heat flow is jointly proportional to the temperature
gradient and to the area through which heat flows. Thus
$$J = -kA(r)\frac{du}{dr} \tag{2}$$
where the positive constant k denotes the thermal conductivity of the
insulation. By (1) this can be written as
$$\frac{du}{dr} = -\frac{J}{2\pi kL}\,\frac{1}{r}. \tag{3}$$
Because the steady-state has been attained, J and $-J/(2\pi kL)$ are constants;
hence (3) is a very simple differential equation for u as a function
of r.
The inner boundary of the insulation is assumed to be at the common
temperature of the pipe and the water. Hence the initial condition for
the solution of (3) is
$$u = u_w \quad \text{when } r = a. \tag{4}$$

This solution of (3) will contain the parameter J, whose value is to be
determined. An extra equation for determining J comes from information
about the temperature of the insulation at the outside boundary. Thus
Newton's law of cooling, in the version given in Section 12.2, gives
$$\frac{dH}{dt} = -hA(b)(u_b - u_s) \tag{5}$$
where $-dH/dt$ is the rate of heat loss from the insulation to the surrounding
air, h is the heat transfer coefficient from the insulation to the
air, and $u_b$ is the temperature of the insulation at the boundary. In the
steady-state,
$$-\frac{dH}{dt} = J$$
(since in the steady-state the rate at which heat is being lost to the air
must equal the rate at which heat is flowing through the insulation). The
last two equations give
$$J = hA(b)(u_b - u_s). \tag{6}$$

The unknown $u_b$, furthermore, is the value of u when r = b. Hence
$u_b$ can be found by solving the differential equation (3) with the initial
condition (4). This gives (as will be shown in an exercise)
$$u_b = -\frac{J}{2\pi kL}\ln\left(\frac{b}{a}\right) + u_w. \tag{7}$$
When this value for $u_b$ is substituted into (6), we get an equation for J in
terms of known quantities. The solution (as will be shown in an exercise)
is
$$J = \frac{2\pi(u_w - u_s)hLb}{1 + \dfrac{hb}{k}\ln\left(\dfrac{b}{a}\right)}. \tag{8}$$

Predictions from the model

Our aim is to find from (8) how J varies with respect to b while all the
other parameters are held constant. This tells us how the rate of loss of
heat varies as we increase the amount of insulation wrapped around the
pipe. To illustrate the possibilities contained in the equations, we now
substitute some typical values for the parameters.
Example 1. A hot water pipe has outer radius 15 mm and 5 mm of insulation.
The temperature of the water is 60 °C and that of the surrounding air is 15 °C. The
insulation is made of fibreglass with conductivity 0.05 W m⁻¹ °C⁻¹ and the surface
heat transfer coefficient is 10 W m⁻² °C⁻¹. Compare the rate of loss of heat per
metre length of this pipe with that of a pipe when there is no insulation and the
surface heat transfer coefficient is 8 W m⁻² °C⁻¹.

Solution. We convert all quantities to SI units. Thus for the first pipe
a = 0.015 m,   $u_s$ = 15 °C,   k = 0.05 W m⁻¹ °C⁻¹,
b = 0.020 m,   $u_w$ = 60 °C,   h = 10 W m⁻² °C⁻¹.
Substituting into (8), with L = 1 m, gives
J = 26.3 W.
In the case of no insulation, b = 0.015 and h = 8, giving
J = 33.93 W.
Thus addition of 5 mm of fibreglass insulation has led to a 22% decrease in heat
loss.

Example 2. Repeat Example 1 but now assume the pipe has outer radius 5 mm
and has 2 mm of asbestos insulation. For asbestos take k = 0.11 W m⁻¹ °C⁻¹ and
h = 8 W m⁻² °C⁻¹.

Solution. We obtain J = 13.5 W with the insulation and J = 11.3 W without it.
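The figures quoted in Examples 1 and 2 can be checked with a short Python sketch of formula (8). The function name and argument names are our own. The values a = 0.015 m and a = 0.005 m are those used as the pipe radius in the calculations, and the length is taken as L = 1 m.

```python
import math

def pipe_heat_loss(a, b, k, h, u_w, u_s, length=1.0):
    """Rate of heat loss J (watts) from an insulated pipe, formula (8).

    a: outer radius of pipe (m), b: outer radius of insulation (m),
    k: conductivity of insulation (W m^-1 degC^-1),
    h: surface heat transfer coefficient (W m^-2 degC^-1),
    u_w, u_s: water and surrounding air temperatures (degC).
    With b == a the logarithm vanishes and the formula reduces to
    Newton's law of cooling applied to the bare pipe.
    """
    numerator = 2 * math.pi * (u_w - u_s) * h * length * b
    denominator = 1 + (h * b / k) * math.log(b / a)
    return numerator / denominator

# Example 1: fibreglass insulation, then the bare pipe with h = 8.
ex1_insulated = pipe_heat_loss(0.015, 0.020, 0.05, 10.0, 60.0, 15.0)
ex1_bare = pipe_heat_loss(0.015, 0.015, 0.05, 8.0, 60.0, 15.0)

# Example 2: asbestos insulation on a narrower pipe, then the bare pipe.
ex2_insulated = pipe_heat_loss(0.005, 0.007, 0.11, 8.0, 60.0, 15.0)
ex2_bare = pipe_heat_loss(0.005, 0.005, 0.11, 8.0, 60.0, 15.0)

print(round(ex1_insulated, 1), round(ex1_bare, 2))
print(round(ex2_insulated, 1), round(ex2_bare, 1))
```

In the first case the insulated value is the smaller of the pair, while in the second it is the larger, in agreement with the values stated in the two solutions.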

Note that, in this second example, adding insulation causes an increase
in heat lost. This is quite a surprising result and it takes a little thought
to work out why.
The expected effect of adding insulation is to increase the resistance
to heat flow. However there is another important effect for heat flow in
cylinders. Adding insulation increases the surface area and heat is lost
to the surroundings at a rate proportional to the surface area. There are
competing effects; sometimes the first effect wins and heat loss is reduced
as more insulation is added (as in Example 1) but sometimes the second
effect wins and heat loss is increased as more insulation is added (as in
Example 2). Clearly it is important to decide what will happen before
deciding to insulate a pipe.
Fig. 12.4.3. Graph showing rate of heat loss versus outer radius of insulation. (Vertical axis: J; horizontal axis: outer radius b of insulation, with the values a and k/h marked.)

Practical considerations
To obtain a rule of thumb as to whether a pipe should be insulated or
not we graph the rate of heat loss J against the outer radius b, as given
in equation (8). The details have been left to the exercises and the graph
is presented in Figure 12.4.3. The turning point is at

$$b = \frac{k}{h}. \tag{9}$$
If b is below this value then the rate of heat loss initially increases as b
increases. If b is above this value, however, then adding more insulation
decreases the rate of heat loss.
One way to guarantee that adding insulation always decreases the rate
of heat loss is to make sure that the outer radius of the pipe is greater
than the critical value given in equation (9). Thus in Example 2, this
condition requires the pipe to have outside diameter a > k/h = 0.11/8
m or 13.75 mm. Alternatively, for a given pipe size we should choose the
type of insulating material to give a critical value (9) less than the radius
of the given pipe.
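
The rule of thumb can be checked numerically. The following Python sketch evaluates formula (8) on a grid of outer radii b and locates the radius at which J is largest. The pipe length L and the temperature difference u_w - u_s are placeholder values chosen for illustration (assumptions, not data from the text), and the function name `heat_loss` is hypothetical.

```python
import math

def heat_loss(b, a, k, h, L=1.0, du=1.0):
    """Rate of heat loss J from equation (8) for outer insulation radius b.

    a: outer radius of the bare pipe (m); k: conductivity of the insulation;
    h: heat transfer coefficient. L (pipe length) and du (= u_w - u_s) are
    assumed placeholder values; they scale J but do not move the turning point.
    """
    return 2 * math.pi * du * h * L * b / (1 + (h / k) * b * math.log(b / a))

k, h = 0.11, 8.0          # asbestos values from Example 2
a = 0.0025                # pipe radius: half the 5 mm outside diameter
critical = k / h          # equation (9)

# Scan b from a up to 40 mm in steps of 0.01 mm and find the largest J.
radii = [a + i * 1e-5 for i in range(3751)]
b_max = max(radii, key=lambda b: heat_loss(b, a, k, h))

print(f"critical radius k/h = {critical * 1000:.2f} mm")
print(f"numerical maximum of J at b = {b_max * 1000:.2f} mm")
```

Because the pipe radius in Example 2 is below k/h, the computed J at b = 4.5 mm exceeds the value for the bare pipe (b = a), in line with the direction of the result quoted for that example.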
This chapter has been based on a model introduced in the Open
University module : `Modelling Heat' (1975).

Exercises 12.4

1. (a) State briefly the meaning of each of the symbols occurring in equation
(2) in the text,
$$ J = -k A(r) \frac{du}{dr}. $$
(b) Derive the differential equation (3) in the text,
$$ \frac{du}{dr} = -\frac{J}{2\pi k L}\,\frac{1}{r}. $$
(c) Steady-state temperatures having been attained, which symbols in the
above differential equation are constants? What are the dependent and
independent variables? Solve the differential equation subject to the
initial condition $u = u_w$ when $r = a$.
(d) Hence derive the formula (7) in the text,
$$ u_b = -\frac{J}{2\pi k L} \ln\left(\frac{b}{a}\right) + u_w. $$

2. (a) Explain the meaning of the symbols occurring in Newton's law of cooling
(5) in the text,
$$ \frac{dH}{dt} = h A(b)(u_b - u_s). $$
(b) Deduce, under suitable conditions which should be stated, that
$$ J = h A(b)(u_b - u_s). $$
(c) Deduce from the results of part (b) and Exercise 1(d) that the rate at
which heat flows through the insulation is given by (8) in the text,
$$ J = \frac{2\pi (u_w - u_s) h L b}{1 + \frac{h}{k}\, b \ln\left(\frac{b}{a}\right)}. $$
(d) The water being at a higher temperature than the surrounding air, what
is the sign of uw - us ? Given the geometrical interpretations of a and b
in the text, what is the sign of ln(b/a) in the above formula?

3. (a) What do you expect to be the effect on the rate of heat loss from the
heated pipe of each of the following separate changes?
(i) The temperature of the water is increased.
(ii) The temperature of the surroundings is increased.
(iii) The length of the pipe is increased.
(iv) The thermal conductivity of the material insulating the pipe is de-
creased.
(b) Check your answers to part (a) of this exercise by using the formula
from Exercise 2(c).

4. (a) Suppose that f is a differentiable function which does not assume the
value 0 at any point of its domain. Show that (c/ f)'(x) = 0 precisely for
those x such that f'(x) = 0, where c is any non-zero constant.
(b) Let J be given as a function of b by the formula in Exercise 2(c). Show
that
$$ \frac{dJ}{db} = 0 \quad \text{precisely when} \quad b = \frac{k}{h}, $$
as claimed in the text. [Hint : you can simplify your calculations by using
the result from part (a).]
13
Compartment models of mixing

Compartment modelling is a means of constructing a differential equation
for a complicated process by considering just the inputs and outputs of
the process, during a small time interval. The basic ideas are developed
in the context of a model describing the mixing of a dye and water.
Compartment models are then formulated for the pollution in a lake and
the temperature of a domestic hot water system. The latter model uses
ideas about the flow of heat from Chapter 12. The differential equations
obtained are mainly of the first-order linear constant-coefficient type.

13.1 A mixing problem

One of the aims of modelling is to isolate the most important factors
in a problem and ignore those which may not be important. Even
very complicated processes can initially be analysed using very simple
mathematical models which may later be extended to more complex and
realistic models by incorporating more features. In problems involving
the mixing of two or more substances, simple models may be formulated
by considering the input and output to a compartment containing the
quantity of interest.
The following problem will be used to illustrate these ideas. The
problem is illustrated in Figure 13.1.1.

Statement of problem
In a dye factory a large vat is used to mix dye and water. The water
flows in at a rate of 2 litres/minute and the dye flows in at a rate of 6
litres/minute. The mixture is drawn off at a rate of 8 litres/minute. Initially
the vat contains 100 litres of pure water. How does the strength
of dye in the water change with time?

Fig. 13.1.1. Mixing dye and water. [Figure: the vat initially, holding 100 litres of pure water, and later, holding the dye-water mixture.]

We shall set up a compartment model for the mixing of dye and water
within the vat to produce a mixture of the two liquids. In this problem,
since the total flow rate of ingredients into the vat equals the flow rate
of mixture out of the vat, the volume of mixture in the vat is constant.
However, the amount of dye in the tank changes with time.
We now introduce some relevant terminology for problems involving
mixing of liquids. We then look at some introductory examples of
the small time-interval technique which we use to formulate differential
equations for this type of problem.

Concentration
In mixing problems in general we refer to the mixture as the solution, the
substance we are introducing as the solute and the liquid which dissolves
the solute as the solvent. In our problem the dye-water mixture is the
solution, the dye is the solute, and the pure water is the solvent.
To obtain a measure of the strength of dye in the dye-water mixture
we introduce the concentration of the mixture, which is the ratio of
the amount of solute to the amount of solution. In our problem it is
convenient to define the concentration as a ratio of volumes:
$$ \{\text{concentration}\} = \frac{\{\text{volume of solute}\}}{\{\text{volume of solution}\}}. \tag{1} $$
This definition assumes that the solute is homogeneously mixed in the
solution. Note that the maximum concentration, which is unity, occurs
when the whole of the mixture is pure dye; the minimum concentration,
which is zero, corresponds to pure water.
In other problems it may be more useful to define concentration as a
ratio of mass of solute to volume of solution. For example, many drug
prescriptions have concentrations measured in milligrams per litre. In
chemistry, concentration is often defined as the ratio moles of solute per
unit volume of solution. (The mole is defined so that the mass of one
mole of Carbon-12 atoms is exactly 12 grams.)

Formulating a differential equation

In this model we are concerned only with the input and output of
dye to the vat. We are not concerned with any fine detail about the
distribution of dye within the vat. This is an appropriate assumption to
make when the contents of the vat are well stirred. The next phase of
the model building process involves formulating a differential equation
for the amount of dye in the tank, at time t. This is done using a small
interval approximation, which we now illustrate.
To formulate a differential equation we need to account for the volume
of dye entering and leaving the vat. We need to determine how the flow
rate of mixture leaving the vat affects the amount of dye leaving the vat
in a given time interval. Let us first consider an illustrative example in
which the concentration of dye in the mixture remains constant.

Example 1. Mixture flows out of a tank at a rate of 8 litres/min. Suppose there
are 6 litres of dye in the mixture of 100 litres. What volume of dye leaves the tank
in a time $\delta t$ minutes?

Solution. Firstly,
$$ \{\text{volume of mixture flowing out}\} = 8\,\delta t \ \text{(litres)}. $$
Of the mixture, a constant fraction 6/100 is dye. Thus
$$ \{\text{volume of dye leaving tank}\} = 8\,\delta t \times \frac{6}{100} = \frac{12}{25}\,\delta t \ \text{litres}. $$
Note that the fraction 6/100 is the concentration of the dye.

More realistically, we would expect the volume of dye in the vat,

and hence also the concentration of dye in the mixture, to change with
time. For very small time intervals, however, the volume of dye and the
concentration do not change significantly. Thus we can approximate the
volume of dye and the concentration of dye by constants over this small
time interval.
Now let us obtain an expression for the change in the volume of dye
in the mixture over some small time interval $\delta t$. Let
$$ x = \phi(t) = \{\text{volume of dye in vat at time } t\} $$
and let $\delta x$ denote the change in volume in a small time interval $\delta t$.
The starting point for formulating the differential equation is to write
down an equation relating the change in volume of dye to the input and
output of dye from the system. Thus
$$ \delta x = \{\text{volume of dye entering vat}\} - \{\text{volume of dye leaving vat}\}. \tag{2} $$

Note that the volume of dye is the appropriate measure of the amount
of dye here since the concentrations in the problem are given as the
ratio of volume of dye in the mixture per unit volume of the mixture. In
the following example we obtain approximately the volume of dye which
leaves the vat in a small time interval $\delta t$.

Example 2. Find the approximate change in volume of dye $\delta x$ in the vat shown in
Figure 13.1.1, during a small time interval $\delta t$.

Solution. Pure dye flows into the vat at a rate 6 litres/minute. So
$$ \{\text{volume of dye entering vat in time } \delta t\} = 6\,\delta t \ \text{litres}. \tag{3} $$
The mixture flows out at a rate 8 litres/minute. So
$$ \{\text{volume of mixture leaving vat in time } \delta t\} = 8\,\delta t \ \text{litres}. \tag{4} $$
Because the mixture is homogeneous (i.e. well stirred) a fraction x/100 of the
mixture flowing out is dye, at the beginning of the time interval. We assume this
does not vary significantly over the small time interval $\delta t$. Thus, of the $8\,\delta t$ litres
leaving the vat in a time $\delta t$, we obtain
$$ \{\text{volume of dye leaving vat in time } \delta t\} \approx \frac{x}{100}\, 8\,\delta t \ \text{litres}, \tag{5} $$
with the approximation becoming more accurate as $\delta t$ becomes smaller. Hence,
using (2), (3) and (5) we obtain
$$ \delta x \approx 6\,\delta t - \frac{8x}{100}\,\delta t \ \text{litres} \tag{6} $$
as the approximate change in the volume of dye in the vat in a small time interval
$\delta t$.
Having accounted for the input and output of dye to the vat we are
finally ready to derive a differential equation for the volume of dye in
the vat at time t. First we divide (6) by $\delta t$ and then we let $\delta t \to 0$. Since
$$ \frac{dx}{dt} = \lim_{\delta t \to 0} \frac{\delta x}{\delta t} $$
we obtain
$$ \frac{dx}{dt} = 6 - \frac{8x}{100}. \tag{7} $$
But the problem was posed in terms of finding how the strength, or
concentration, of the dye in the mixture varied with time. To answer this,
a differential equation for the concentration of dye in the vat at time t is
more relevant.

Equation for the concentration

Let
$$ c = \{\text{concentration of dye in the vat at time } t\}. \tag{8} $$

Since the volume of mixture in the tank remains constant at 100 litres,
then from (1)
$$ c = \frac{x}{100} \quad \text{or} \quad x = 100c. \tag{9} $$
Substituting (9) into (7) we obtain
$$ \frac{d}{dt}(100c) = 6 - \frac{8 \times 100c}{100} $$
which simplifies to
$$ \frac{dc}{dt} = 0.06 - 0.08\,c. \tag{10} $$
The initial condition for this differential equation is
$$ c = 0 \quad \text{at} \quad t = 0 \tag{11} $$
since there was no dye in the vat initially. It is to be verified in
Exercise 13.1.4 below that the solution of the differential equation (10)
subject to the initial condition (11) is
$$ c = 0.75\left(1 - e^{-0.08t}\right). \tag{12} $$

A sketch of the solution (12) is given in Figure 13.1.2 below. The
value 0.75 corresponds to the steady-state solution, obtained from the
differential equation by setting $dc/dt = 0$. From (10) we see that, if $c < 0.75$,
then $dc/dt > 0$. Thus all concentrations with $c_0 < 0.75$ increase with time.
They all tend to the steady-state value 0.75.

Fig. 13.1.2. Graph of how the concentration of dye varies with time, from
equation (12). [Figure: c plotted against time t, rising towards the steady-state concentration.]
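
The small time-interval argument can also be run directly as a computation. The Python sketch below repeatedly applies the approximate balance (6) with a finite time step and compares the resulting concentration at t = 10 minutes with the exact solution (12). The function names and the choice of step sizes are illustrative assumptions, not part of the text. The discrepancy shrinks as the step is reduced, which is the behaviour the limit in the derivation of (7) relies on.

```python
import math

def step_dye(x, dt):
    """Apply the approximate balance (6): dye in minus dye out over a time dt."""
    return x + 6 * dt - 8 * (x / 100) * dt

def concentration_by_stepping(t_end, dt):
    """Step the volume of dye x forward from x = 0 and return c = x / 100."""
    steps = round(t_end / dt)
    x = 0.0
    for _ in range(steps):
        x = step_dye(x, dt)
    return x / 100

def exact_concentration(t):
    """Solution (12) of dc/dt = 0.06 - 0.08 c with c = 0 at t = 0."""
    return 0.75 * (1 - math.exp(-0.08 * t))

t_end = 10.0
for dt in (1.0, 0.1, 0.01):
    approx = concentration_by_stepping(t_end, dt)
    error = abs(approx - exact_concentration(t_end))
    print(f"dt = {dt:<5} c(10) ~ {approx:.5f}  error = {error:.2e}")
```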

Summary
A summary of the procedure for formulating a differential equation for
mixing problems is now given.

STEP 1: Draw a diagram showing the rates of flow and concentration
of the liquids in and out of the mixture. Check to see if the
total volume of the mixture remains constant.
STEP 2: Write down word equations relating the change in the
amount of solute in the system to the amount input and output, in
a small time interval $\delta t$. Convert to symbols using data given in the
problem.
STEP 3: Divide by $\delta t$ and take the limit as $\delta t \to 0$, obtaining a
differential equation for the amount of solute present as a function
of time.
STEP 4: If necessary, use a substitution to obtain a differential
equation for another quantity of interest (e.g. concentration).
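
As an illustration of how these steps fit together, here is the same calculation in symbols for a constant-volume tank. The symbols V (volume of mixture), r (rate at which solute enters) and F (rate at which mixture is drawn off) are introduced here for illustration only and are not notation from the text.

```latex
\begin{align*}
\text{Step 2:}\quad & \delta x \approx r\,\delta t - F\,\frac{x}{V}\,\delta t, \\
\text{Step 3:}\quad & \frac{dx}{dt} = r - \frac{F}{V}\,x, \\
\text{Step 4:}\quad & x = Vc \ \Longrightarrow\ \frac{dc}{dt} = \frac{r}{V} - \frac{F}{V}\,c .
\end{align*}
```

With r = 6, F = 8 and V = 100 this reduces to equation (10), and setting dc/dt = 0 gives the steady-state concentration c = r/F = 0.75.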

Fig. 13.1.3. Compartment diagram for the dye mixing problem. [Diagram: an arrow for dye flowing in, a box for the volume of dye in the vat, and an arrow for mixture flowing out.]

This type of approach can be applied to a variety of problems other
than the mixing of liquids. For example, in the next section we model
pollution in a lake. Here the quantity flowing out of the lake is the mass
of pollutant. In Section 13.3 we model the heat loss from a hot water
system where the quantity going into and out of the hot water system is
the heat energy.
The approach used to analyse such problems is called compartment
modelling. This involves drawing a compartment diagram. For the dye
mixing problem in this section, the compartment diagram consists of a
box showing the amount of dye contained in the vat together with an
arrow showing the input through dye entering the vat and another arrow
showing the output through dye leaving the vat as the mixture is drawn
off. A compartment diagram for this problem is shown in Figure 13.1.3.
Note that the input of water is not relevant to the compartment diagram
here since the amount of dye is the quantity of interest.
Compartment diagrams, while apparently trivial for the dye mixing
problem of this section, are useful aids when dealing with more compli-
cated models where there is more than one compartment (and sometimes
many compartments) and there is an interaction between compartments.
In Chapters 18 and 19 we consider some examples of two-compartment
models including interaction between species of animal, insulin and glu-
cose interaction in diabetics and models of combat between two armies.

Exercises 13.1

1. In which of the following situations is the volume of the mixture constant?

(a) Dye flows into a container at a rate 3 litres/min and water flows in at a
rate 11 litres/min. The mixture flows out at a rate 12 litres/min.
(b) Pure water flows in at a rate 5 litres/s, a chemical flows in at a rate
2 litres/s and the mixture flows out at a rate 7 litres/s.

2. In each case determine the amount of the substance entering the compartment
in a time interval $\delta t$. Give the appropriate units.
(a) Pure dye flows in at the rate 9 litres/min. Give the amount of dye.
(b) Dye, at a concentration 5%, flows into a vat at a rate of 2 litres/min.
Give the amount of dye.
(c) Water containing 0.5 kg of salt per litre enters a tank at a rate of 3
litres/min. Give the amount of salt.

3. In each case below assume that there is an amount x of the substance present
at time t and then determine the amount $\delta x$ of the substance entering and the
amount leaving the compartment in the small time interval from t to $t + \delta t$.
Indicate units, and also indicate where your answer is exact, and where it is
approximate for very small $\delta t$. Hence determine the differential equation for the
amount of substance and then for the concentration in each case.
(a) Pure dye flows into a vat at a rate of 4 litres per minute and pure water
flows into the vat at a rate of 8 litres per minute. Initially there are 50
litres of pure water in the vat. The well stirred mixture is drawn off at a
rate of 12 litres per minute.
(b) Dye of concentration 40% by volume flows into a vat at a flow rate of
2 litres per minute. Initially there are 100 litres of pure water in the vat.
The well-stirred mixture flows out at the same rate.
(c) A salt-water mixture containing 15 grams of salt per litre flows into a
lake of volume 2000 litres at a flow rate of 10 litres per minute. The
flow rate out of the lake is 16 litres per minute and pure water flows into
the lake at a flow rate of 6 litres per minute. Assume the mixture is well
stirred.

4. Solve the differential equation
$$ \frac{dc}{dt} = -\alpha c + \beta $$
where $\alpha$ and $\beta$ are constants, using the method explained in Section 11.1.
5. Dye at 50% concentration enters a tank at a rate of 1 litre/min. Fresh water
enters the tank at a rate 2 litres/min. The well-stirred mixture leaves at a rate
3 litres/min. The tank contains initially 50 litres of a 50% concentration of dye
and water.
(a) Formulate a differential equation for the concentration.
(b) What is the concentration after 15 minutes?
(c) What is the steady-state concentration?

6. At 6.00 pm on a Friday night a public bar opens and is rapidly filled with
clients of whom the majority are smokers. The bar is equipped with ventilators
which exchange the smoke-air mixture with fresh air. Unfortunately cigarette
smoke contains 4% carbon monoxide, CO, and a prolonged exposure to a
concentration of more than 0.012% of CO can be fatal. The bar has dimensions
of 20 m by 15 m by 4 m and it is estimated that smoke enters the room at a
constant rate 0.006 m$^3$/minute. The ventilators remove the mixture of smoke and
air at 10 times the rate that smoke is produced. The problem is to find the time
when the concentration of CO reaches 0.012%.
(a) Formulate a differential equation for the concentration of CO at time t.
(b) By solving the differential equation find at what time the lethal concen-
tration will be reached.

7. Using the data of the previous exercise determine the rate at which the
air-conditioners should operate if the concentration is never to reach the lethal
level.
8. A dam contains $10^6$ litres of water. Fresh water enters the dam at a rate of
$10^4$ litres per day and the same amount flows out each day. Suppose someone
spills a barrel of pesticide into the dam.
(a) Set up a differential equation for the concentration of pesticide in the
dam.
(b) Suppose that at a given time the concentration is 5 times the safe level
for use by stock. How long before the stock can use the dam?

9. A 100 litre tank originally contains 50 litres of fresh water. Beginning at time
t = 0, water containing 50% of pollutant flows into the tank at a rate 2 litres per
minute and the well-stirred mixture leaves at a rate of 1 litre per minute. This
exercise involves a situation where the volume of the mixture is not constant.
(a) Find the volume of mixture in the tank as a function of time.
(b) Formulate a differential equation for the volume of pollutant in the tank.
(c) Hence show that the concentration of pollutant at the time the tank
overflows is approximately 38%.

13.2 Modelling pollution in a lake

Surrounding the Great Lakes system, on the eastern United States-
Canada border, is an area of extensive industrial activity. It has been
common practice to dump waste products into the Lakes. As a result
of this the Lakes have been seriously polluted. In this section we adopt
the compartment modelling approach to estimate how long it takes to
reduce the pollution levels in a lake through the flow of water into and
out of the lakes given that no more waste products are put into the lakes.
In particular, we look at Lake Superior, the largest of the Great
Lakes (see Figure 13.2.1). The volume of Lake Superior is approximately
$1.2 \times 10^{13}$ litres and it is estimated that $6.5 \times 10^{10}$ litres of fresh water
enters the lake each year. We shall consider the following problem.
Suppose the authorities suddenly stop all pollution flowing into the
lake. Find how long it takes for the concentration of the pollutant to
decrease to 10% of its original value.

Fig. 13.2.1. The Great Lakes system on the eastern USA and Canadian border.

Fig. 13.2.2. Simple compartment model for removal of pollution from Lake
Superior. [Diagram: a box for the mass of pollutants in the lake, with an arrow for mixture flowing out.]

The model
The mechanism for clearing the lake of pollution is the inflow of pure
water which dilutes the water-pollutant mixture. This process is what we
wish to model. We start with the following assumptions:
• the lake is well stirred so that pollutant is uniformly mixed throughout
the lake,
• the flow rate of mixture out of the lake is equal to the flow rate of
pure water into the lake (i.e. we ignore the net effect of rainfall and
evaporation).

From the second of these two assumptions we see that the volume of the
lake remains constant.
The quantity of interest is the amount of pollutant in the lake at any
given time, measured in tonnes. In this problem we choose initial time to
be when pollution input to the lake stops. A compartment diagram for
this situation is shown in Figure 13.2.2.
We now define some notation. Let us write
$$ m = m(t) = \{\text{mass of pollutant in lake at time } t\} $$
and define the constants
$$ V = \{\text{volume of lake}\} = 1.2 \times 10^{13} \ \text{litres} \tag{1} $$
and
$$ F = \{\text{flow rate of mixture out of lake}\} = 6.5 \times 10^{10} \ \text{litres/year}. \tag{2} $$
We also define the concentration of pollutant in the lake c by
$$ c = \frac{m}{V}. $$
We would expect the concentration to be a decreasing function of time
in this problem.

Formulating the differential equation

We now write down the fundamental equation which relates the change
in the mass of pollutants to the input and output of pollutant. Let $\delta m$
denote the change of pollutant in the lake in a small time interval from
t to $t + \delta t$. For $t \ge 0$ there is no flow of pollutant into the lake and so
$$ \delta m = 0 - \{\text{mass of pollutant flowing out}\}. \tag{3} $$
Now,
$$ \{\text{volume of mixture flowing out of lake in time } \delta t\} = F\,\delta t \ \text{litres} \tag{4} $$
since F is the flow rate of mixture in litres per unit time. Thus
$$ \{\text{mass of pollutant flowing out of lake in time } \delta t\} = \{\text{volume of mixture flowing out of lake in time } \delta t\} \times \{\text{fraction of pollutant in mixture}\} \approx F\,\delta t \times \frac{m}{V} \ \text{tonnes} \tag{5} $$
since m/V gives the mass of pollutant per unit volume of the mixture.
Hence by (3) and (5) we obtain
$$ \delta m \approx -\frac{F m}{V}\,\delta t. \tag{6} $$
Now, dividing by $\delta t$ and letting $\delta t \to 0$, we obtain
$$ \frac{dm}{dt} = -\left(\frac{F}{V}\right) m \tag{7} $$
as the differential equation for the mass of pollutant in the lake at time
t.
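We define the concentration
$$ c = \{\text{concentration of pollutant in lake at time } t\} = \frac{\{\text{mass of pollutant in lake}\}}{\{\text{volume of lake}\}} = \frac{m}{V}. \tag{8} $$
Substituting into (7) we obtain the differential equation
$$ \frac{dc}{dt} = -\left(\frac{F}{V}\right) c \tag{9} $$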
for the concentration of pollutant in the lake as a function of time.

Now we need to find an initial condition. We have not been given
the initial concentration or initial mass of pollutant in the lake, so we
introduce the symbol $c_0$ for the initial concentration. The solution of (9)
with the initial condition $c = c_0$ when $t = 0$ is
$$ c = c_0 e^{-(F/V)t}. \tag{10} $$
Thus to find the time $t = T$ when the concentration reaches 10% of its
initial value, we obtain the equation
$$ \frac{c_0}{10} = c_0 e^{-(F/V)T}. \tag{11} $$
Solving for T (omitting the algebra) yields
$$ T = \frac{V}{F} \ln(10). \tag{12} $$
Substituting the values for F and V for Lake Superior from (1) and (2)
we find that our model predicts that it takes approximately 425 years
for the concentration of pollutant to be reduced to 10% of its initial value,
even if no more pollutants are put into the lake.
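
The arithmetic behind this estimate is easy to reproduce. Cancelling the initial concentration in (11) and taking logarithms gives (F/V)T = ln(10), which is (12). The Python sketch below evaluates this with the values in (1) and (2). The function name `clearance_time` and its `fraction` argument are illustrative assumptions, included so that targets other than 10% can be tried.

```python
import math

def clearance_time(volume, flow_rate, fraction):
    """Time for the concentration to fall to `fraction` of its initial value.

    Solves c0 * fraction = c0 * exp(-(F/V) * T) for T, so T = (V/F) * ln(1/fraction).
    The result is in the time unit of flow_rate (years here).
    """
    return (volume / flow_rate) * math.log(1 / fraction)

V = 1.2e13   # volume of Lake Superior in litres, from (1)
F = 6.5e10   # flow rate out of the lake in litres/year, from (2)

T = clearance_time(V, F, 0.10)
print(f"Time to reach 10% of the initial concentration: {T:.0f} years")
```

Only the ratio V/F enters the result, so the prediction is sensitive to the estimated flow rate but not to the units chosen for volume.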

Limitations of the model

The model developed in this section ignores many factors which might
influence the clearance of pollution from a lake. Some of these are as
follows:
• The flow of water through the lake depends on the existence of stagnant
regions caused by eddy currents, thermal layers and wind. Thus the
assumption of a well-stirred mixture is not always true.
• Some pollutants may settle on the bottom of the lake.
• Bacterial action can affect the concentrations.
• The volume of the lake may not be constant over a whole year.
Despite these limitations, the model does provide a starting point
for further investigations. A detailed discussion of this model and its
applicability is given in the article by Rainey (1967). In the exercises we
also look at simple extensions. These include a model where pollutants
are fed into one lake from an outflow from another lake and a model
where bacteria consume some of the pollution.

Exercises 13.2
1. For Lake Erie the flow rate of water into and out of the lake is $1.75 \times 10^{11}$
litres/year. The volume of the lake is $4.6 \times 10^{11}$ litres. How long does it take for
the concentration of pollutant in the lake to reduce to one quarter of its initial value?
2. For Lake Ontario, about 84% of its inflow comes from Lake Erie. Using the
data in Exercise 1 together with the volume of Lake Ontario as $1.6 \times 10^{11}$ litres,
find an expression for the concentration of pollutants at time t. [Hint: You will
need to account for a variable concentration in the input of pollutant to Lake
Ontario.]
The next two exercises involve a slightly different type of problem which nevertheless
uses the same technique of a small interval analysis to formulate the governing
differential equation.

3. In a cylindrical container, water drains through a small circular hole at the
bottom of the container. The radius of the circular hole is a, the radius of the
cylinder is b and the height of the cylinder is H. The water emerges with velocity
v.
(a) Using a small interval analysis show that the height h of the water in the
tank at time t satisfies
$$ \frac{dh}{dt} = -\left(\frac{a^2}{b^2}\right) v. $$
(b) Torricelli's law states that $v^2 = 2gh$ where g is acceleration due to gravity.
By solving the differential equation show that the time taken for the
cylinder to empty, given that it was initially full, is
$$ T = \frac{b^2}{a^2} \sqrt{\frac{2H}{g}}. $$

Fig. 13.3.1. Hot water system to be modelled. [Figure: the hot water tank, with heat lost from the system to the surroundings.]

4. It is well known that a stream of water emerging from a cylindrical hole
contracts so that its cross-sectional area is 0.6 times the area of the hole. How
would this change the model of the previous exercise? Does this help the model
agree better with your experimental observations?

13.3 Modelling heat loss from a hot water tank

In this section we will formulate a mathematical model which describes
heat loss from a typical domestic hot water tank. The model is formulated
using the compartment approach adopted in this chapter. It also uses
some of the ideas developed in Chapter 12.
Suppose we have a cylindrical tank, which is partially full of water
(see Figure 13.3.1). The water is heated by a heating element which is
immersed in the water. Heat is supplied at a constant rate 3000 watts.
Some heat is lost to the surroundings, at temperature 15 °C, from the
surface of the tank. Our problem is to find how long it takes to heat the
water to a comfortable 60 °C, given that the water is initially at the same
temperature as the surroundings, 15 °C.

The model
The main idea of the model is to account for the flow of heat into and
out of the water. We assume that it only requires a negligible amount of
heat to raise the temperature of the metal casing to that of the water. A
compartment diagram of the heat flow is shown in Figure 13.3.2.
Fig. 13.3.2. Compartment diagram for the model. [Diagram: an arrow for heat from the heating element, a box for the heat energy contained in the water, and an arrow for heat lost to the surroundings.]

To arrive at a differential equation we examine the heat energy input
and output to the system in a very small time interval $\delta t$. First, we define
$$ u = \phi(t) = \{\text{temperature at time } t\}, \qquad H = \psi(t) = \{\text{heat contained in water at time } t\}. \tag{1} $$
We also define symbols for the following constants:
$$ q = \{\text{rate at which heat is supplied to water}\} = 3000 \ \text{watts}, \quad u_s = \{\text{temperature of surroundings}\} = 15\ ^\circ\text{C}, \quad u_0 = \{\text{initial temperature of water}\} = 15\ ^\circ\text{C}. \tag{2} $$
The change in heat energy in the water, $\delta H$, in the time interval $\delta t$ is
given by
$$ \delta H = \{\text{heat entering water from heater}\} - \{\text{heat lost to surroundings}\}. \tag{3} $$
Since q is the constant rate of heat supplied to the water, then
$$ \{\text{heat entering water from heater}\} = q\,\delta t. \tag{4} $$
The heat lost to the surroundings is modelled using Newton's law of
cooling. From Section 12.2, the rate of heat loss is $hA(u - u_s)$ where A is
the surface area of the tank and h is the heat transfer coefficient. Thus
$$ \{\text{heat lost to surroundings}\} \approx hA(u - u_s)\,\delta t. \tag{5} $$

Note that this is positive for u > us. Note also that this is only an
approximate expression since the temperature u changes with time. Over
a small time interval, however, we can neglect the variation in the
temperature u.
Substituting (4) and (5) into (3) we obtain
$$ \delta H \approx q\,\delta t - hA(u - u_s)\,\delta t. \tag{6} $$
Dividing by $\delta t$ and then letting $\delta t \to 0$ we obtain
$$ \frac{dH}{dt} = q - hA(u - u_s). \tag{7} $$
Our objective is to obtain a differential equation for the temperature. To
do this we must relate heat H to temperature u. We saw how to do this
in Section 12.2 where we argued $\delta H = cm\,\delta u$, where c is the specific heat
and m is the mass of water in the tank. Dividing by $\delta t$ and letting
$\delta t \to 0$ we thus obtain the relation
$$ \frac{dH}{dt} = cm \frac{du}{dt} \tag{8} $$
which we then substitute into (7). Hence
$$ \frac{du}{dt} = \frac{q}{cm} - \frac{hA}{cm}(u - u_s) \tag{9} $$
is the differential equation for the temperature of the water as a function
of the time. From (2), the initial condition for (9) is
$$ u = u_0 \quad \text{at} \quad t = 0. \tag{10} $$

Time to heat the water

To find the time to heat the water to 60 °C we must first solve the
differential equation. We can see the form of the differential equation
more easily by lumping the constants together. Thus we write (9) and (10) in
the simpler form
$$ \frac{du}{dt} + \alpha u = \beta, \qquad u = u_0 \ \text{at} \ t = 0, \tag{11} $$
with the `lumped' constants $\alpha$ and $\beta$ given by
$$ \alpha = \frac{hA}{cm} \quad \text{and} \quad \beta = \frac{q}{cm} + \frac{hA u_s}{cm}. $$
The solution is calculated (Exercise 13.3.1) as
$$ u = u_0 e^{-\alpha t} + \frac{\beta}{\alpha}\left(1 - e^{-\alpha t}\right). $$
To find the time, $t = T$, when the water reaches 60 °C, we need to solve
the equation
$$ 60 = u_0 e^{-\alpha T} + \frac{\beta}{\alpha}\left(1 - e^{-\alpha T}\right) $$
for T. After some simple algebra, we obtain the expression
$$ T = \frac{1}{\alpha} \ln\left(\frac{\beta/\alpha - u_0}{\beta/\alpha - 60}\right). \tag{12} $$
Note that (12) is a well-defined expression since it can be shown that the
term inside the brackets is always positive and thus we never take the
log of zero or a negative number.
We now substitute the appropriate values into (12). Using (2) together
with the typical values
m = {mass of water} = 50 kg
c = {specific heat of water} = 4200 J kg$^{-1}$ °C$^{-1}$
A = {surface area of tank} = 1 m$^2$
h = {heat transfer coefficient} = 10 W m$^{-2}$ °C$^{-1}$,

we obtain the values $\alpha = 4.76 \times 10^{-5}$ and $\beta = 1.5 \times 10^{-2}$ and hence
$$ T \approx 3413 \ \text{seconds} \approx 57 \ \text{minutes}. $$

(Note that the calculation gives the time in seconds since all quantities in
the problem have been converted to SI units.) Thus our model predicts
that it takes approximately 57 minutes for the water to be heated from
15 °C to 60 °C.
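
These numbers can be reproduced with a few lines of Python. The sketch below computes the lumped constants and evaluates (12). The function name `heating_time` and its `target` argument are illustrative assumptions rather than notation from the text.

```python
import math

def heating_time(q, h, A, c, m, u_s, u_0, target):
    """Time in seconds for the water to reach `target` degrees C, from (12).

    alpha = hA/(cm) and beta = q/(cm) + hA*u_s/(cm) are the lumped constants,
    and beta/alpha is the steady-state temperature, which must exceed target.
    """
    alpha = h * A / (c * m)
    beta = q / (c * m) + h * A * u_s / (c * m)
    steady = beta / alpha
    if steady <= target:
        raise ValueError("target temperature is never reached")
    return (1 / alpha) * math.log((steady - u_0) / (steady - target))

# Values from (2) and the list of typical values in the text (SI units).
T = heating_time(q=3000, h=10, A=1, c=4200, m=50, u_s=15, u_0=15, target=60)
print(f"T = {T:.0f} seconds = {T / 60:.0f} minutes")
```

The guard on the steady-state temperature reflects the condition under which the logarithm in (12) is defined: the heater must be able to reach the target temperature at all.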
In this problem we have introduced symbols for all the constants. Not
only does this make the algebra simpler but it can also be easier to
perform dimensional and physical checks on the answer. Also, now that
we have a general expression we can easily look at times for heating with
different values of the various constants. One can then easily determine
how these times change as one of the parameters (for example, the surface
area of the tank) changes.

Discussion
Is this a good model? To answer this we should check the prediction
of the model with a real water heater. One possible problem with this
model is that we assumed that the temperature was the same at all points
in the water. If the water was well stirred then there would not be a
problem. This will not be true in practice, however, since the hot water
will rise to the top. Nevertheless, the model does incorporate most of the
essential physics. It should be able to be used, albeit cautiously, to gain
some insight into the following questions :
• How much better is it to have the heater inside the house rather than
outside?
• What advantages are there in insulating the system?
• How much saving is made by using off-peak heating?
To use the model to help answer some of these questions you should
attempt some of the exercises. Further discussion may be found in Open
University module `Modelling Heat' (1975).

Exercises 13.3
1. Obtain the solution, given in the text, of the differential equation
$$ \frac{du}{dt} + \alpha u = \beta \quad \text{with} \quad u = u_0 \ \text{when} \ t = 0, $$
where $\alpha$ and $\beta$ are constants.
2. What is the steady-state temperature for the problem in the previous exercise?
Hence argue that the expression
$$ \ln\left(\frac{\beta/\alpha - u_0}{\beta/\alpha - 60}\right) $$
is a well-defined expression.
3. Imagine that the water heater is inside the house (at temperature 20 °C).
Recalculate the time taken to heat the water.
4. Plot a graph showing the time taken to heat the water as the surface area of
the tank is varied. For all other data use the values in the text.
5. Plot a graph showing the time taken to heat the water as the heat transfer
coefficient h is varied. For all other data use the values in the text.
6. Plot a graph showing the time taken to heat the water as the outside
temperature uQ is varied. Assume that the initial temperature is the same as the
outside temperature. For all other data use the values in the text.
7. Suppose the water heater is perfectly insulated so that no heat is lost to the
surroundings. Modify the expression obtained in the text for the time to heat the
water to 60 °C. Calculate this time using the data given in the text.
8. Modify the model used in the text to take account of heat lost to the metal
casing. You are given that the specific heat of the metal is 400 J kg$^{-1}$ °C$^{-1}$ and the
mass of the metal casing is 5 kg. How much difference does this make to the
result for the time to heat the water? Express your answer as a percentage.
9. The cost of heating is proportional to the time it takes to heat the water. Is
it cheaper to switch the heater off all night (for eight hours) and have it turn
on in the morning or is it better to use a thermostat which switches the heater
on whenever the temperature falls below 60 °C? Assume the temperature of the
surroundings is 5 °C. (You will have to find what temperature the water cools to
in eight hours overnight and how long it takes to reheat to 50 °C, amongst other
things.)
Part four
Further Mechanics
14
Motion in a fluid medium

The refinement of some of the simple mechanics models obtained in Part

A is to be undertaken in this chapter and the next. Thus, for example,
in Chapter 2 the problem of free fall under gravity was considered. The
medium through which the object moves was completely ignored and
so was the size and shape of the falling object. In this chapter these
features will be included and it will be seen that two new forces become
relevant: the drag force and the buoyant force. These forces provide
the mechanism for some phenomena not present in the model of Chapter
2: the decrease in acceleration of all free falling objects, and the ability
of some objects such as balloons to rise rather than fall.
The differential equations obtained in this chapter are first-order lin-
ear with constant coefficients or first-order separable. Knowledge of
Section 11.1 of Chapter 11 is therefore required.

14.1 Some basic fluid mechanics

As an object moves through a fluid, a force is exerted by the fluid on the
object which is in the opposite direction to the motion of the object. This
force is called the drag force. To gain an understanding of the quantities
influencing the drag force in a fluid, it is necessary first to discuss two
fundamental properties of a fluid: the viscosity and Reynolds' number.

Viscosity
Gases and liquids are collectively known as fluids since they can both be
made to flow if a force is applied. While all fluids will flow, we know
that pouring water out of a bottle is a faster process than pouring, for
example, cream or honey. One obvious distinction between these three

Table 14.1.1. Coefficients of viscosity of selected liquids and gases.

Liquid                        Viscosity $\eta$ (poise)
Water (at 0 °C)               $1.792 \times 10^{-2}$
Water (at 20 °C)              $1.005 \times 10^{-2}$
Water (at 40 °C)              $0.656 \times 10^{-2}$
Ethyl alcohol (at 20 °C)      $1.20 \times 10^{-2}$
Castor oil (at 20 °C)         $9.86 \times 10^{-2}$
Mercury (at 20 °C)            $1.55 \times 10^{-2}$

Gas                           Viscosity $\eta$ (poise)
Air (at 0 °C)                 $1.71 \times 10^{-4}$
Air (at 20 °C)                $1.81 \times 10^{-4}$
Air (at 40 °C)                $1.90 \times 10^{-4}$
Hydrogen (at 20 °C)           $0.93 \times 10^{-4}$
Ammonia (at 20 °C)            $0.97 \times 10^{-4}$
Carbon dioxide (at 20 °C)     $1.46 \times 10^{-4}$

fluids is their `thickness', which is an intuitive measure of how close the

fluid is to a rigid object. Here we are using the word `thickness' as in the
phrase `thickened cream' or in the saying `blood is thicker than water'.
A precise measure of the thickness of a fluid is found in fluid mechanics
and it is called the coefficient of viscosity, which is sometimes denoted by
the symbol $\eta$.
The dimensions of $\eta$ are $ML^{-1}T^{-1}$ (mass × length$^{-1}$ × time$^{-1}$). The SI
unit of viscosity is 1 kg m$^{-1}$ s$^{-1}$, but it is common practice to use a unit
known as the poise, which is one-tenth of the SI unit. A value of $10^{-4}$
poise is typical of a gas whereas liquids typically have a value of $\eta$ of
about $10^{-2}$ poise (see Table 14.1.1). It is common experience that in hot
weather honey flows rapidly in comparison with its behaviour in colder
weather. This is saying that the coefficient of viscosity is temperature
dependent, as is evident from an inspection of Table 14.1.1.

Reynolds' number
When an object falls through a fluid medium, a disturbance is caused as
the fluid is pushed away from the path of the object. The fluid dynamicist
Osborne Reynolds (1883) demonstrated that the fluid flow can change
character as the speed of the object relative to the fluid increases. This
is illustrated in Figure 14.1.1 where the curves, known as streamlines,

Fig. 14.1.1. Three types of flow of a fluid around a sphere. Here the frame of
reference is such that the spheres are held fixed.

indicate the motion of the fluid around the sphere. For low speeds the
flow looks the same upstream and downstream as in Figure 14.1.1(a). As
the speed increases a wake is formed behind the object and the
fluid recirculates inside the wake, as in Figure 14.1.1(b). The type of flow
in Figure 14.1.1(a) and Figure 14.1.1(b) is called laminar. For very high
speeds the flow in the wake region is no longer smooth but turbulent as
in Figure 14.1.1(c). It might be expected that the modelling of the drag
will be different in each of these three typical situations.
Of course Reynolds realized that not only the velocity of the object
but also the properties of the fluid would affect the type of flow. To take
account of this he introduced a dimensionless parameter (now known as
the Reynolds' number of the motion). The Reynolds' number is defined
by the formula
$$R = \frac{\rho_f |\dot{x}| d}{\eta}$$
where $\rho_f$ denotes the density (mass per unit volume) of the fluid, $|\dot{x}|$
denotes the speed of the object, $\eta$ is the coefficient of viscosity of the
fluid, and d denotes a characteristic length of the object (for example,
if the object is a sphere the number d could denote the diameter of the
sphere). The formula shows that the Reynolds' number is proportional
to the speed of the object, but inversely proportional to the viscosity of
the fluid.
For a smooth, falling sphere, a Reynolds' number in the range R <
1 typically gives a flow of the type in Figure 14.1.1(a), a Reynolds'
number in the range 1 < R < 3000 typically gives a flow of the type
in Figure 14.1.1(b) and for a Reynolds' number R > 3000 the flow is
turbulent, as in Figure 14.1.1(c). The critical Reynolds' number for when
the flow becomes turbulent is generally smaller for spheres with rough
surfaces (e.g. a baseball or cricket ball).

Example 1. Calculate the Reynolds' number for the flow past a raindrop of radius
1 mm falling in air at 5 m/s. (Data: $\rho_f = 1.23$ kg/m$^3$, $\eta = 1.8 \times 10^{-5}$ kg/m s.)

Solution. Choosing the characteristic length as the diameter of the raindrop gives
d = 0.002 m.
Substituting this value and the given data in the formula for the Reynolds' number
gives
$$R = \frac{1.23 \times 5 \times 0.002}{1.8 \times 10^{-5}} \approx 680.$$
Since 1 < R < 3000, we might expect that the flow around the raindrop will be like
that in Figure 14.1.1(b).
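
The arithmetic of Example 1 can be checked with a short script. This is only a sketch: the function names `reynolds_number` and `flow_regime` are illustrative choices, not notation from the text, and the regime boundaries are the approximate ranges quoted above for a smooth sphere.

```python
def reynolds_number(rho_f, speed, d, eta):
    """Return R = rho_f * |speed| * d / eta (all quantities in SI units)."""
    return rho_f * abs(speed) * d / eta


def flow_regime(R):
    """Classify the flow past a smooth sphere using the ranges quoted in the text."""
    if R < 1:
        return "laminar, as in Figure 14.1.1(a)"
    if R < 3000:
        return "laminar with a wake, as in Figure 14.1.1(b)"
    return "turbulent, as in Figure 14.1.1(c)"


# Data of Example 1: raindrop of radius 1 mm (diameter 2 mm) falling at 5 m/s in air.
R_raindrop = reynolds_number(rho_f=1.23, speed=5.0, d=0.002, eta=1.8e-5)
print(round(R_raindrop), flow_regime(R_raindrop))
```

With the data exactly as stated, the computed value is a little over 680, which lies well inside the intermediate range.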

It is to be expected that modelling of the drag will be different for

each of the three flows shown in Figure 14.1.1 corresponding to low,
intermediate and high Reynolds' numbers. Thus it is to be expected that
the drag will depend on the Reynolds' number (and hence on the density
and viscosity of the fluid, and the velocity and size of the object) and
also on the shape of the object. Our concern will be solely with spherical
objects.

Stokes' law
For a sphere of radius $r$ moving at velocity $\dot{x}$, Stokes in 1845 derived the
following expression for the magnitude of the drag force:

$$|F_D| = 6\pi\eta r|\dot{x}|.$$

The formula is valid only for very low Reynolds' numbers ($R \ll 1$).
This requirement puts a severe limitation on the applicability of Stokes'
law. Example 1 shows that, even for a sphere as small as a raindrop,
the Reynolds' number is much larger than 1. Stokes' law is applicable,
however, to dust-like particles, such as smoke and other pollutants in
the air, and silt in lakes and streams. In Section 14.3 an application of
Stokes' law to the calculation of the charge of the electron will be given.

Fig. 14.1.2. Coefficient of drag $C_D$ against Reynolds' number $R$ for a sphere (logarithmic axes).

The velocity-squared drag law

For intermediate and large values of the Reynolds' number (typically,
R > 1) no theoretical calculations of the drag force are available and
we have to use experimental measurements of the dimensionless drag
coefficient CD obtained from the equation

$$F_D = \tfrac{1}{2}\rho_f C_D A \dot{x}^2.$$

Here $A$ denotes the cross-sectional area of the object which presents itself
to the fluid, $\rho_f$ is the density of the fluid and $\dot{x}$ is the velocity of the
object. For a sphere, $A$ is the area of a disk and is thus given in
terms of the radius $r$ by $A = \pi r^2$.
A plot of $C_D$ against the Reynolds' number for a sphere is given
in Figure 14.1.2. Note that the plot in Figure 14.1.2 has been done on
logarithmic graph paper. The value of $C_D$ for $R$ between $10^3$ and $2.5 \times 10^5$
lies in the range 0.4 to 0.5. As an approximation, the value of $C_D$, for a
sphere, will be taken as the constant 0.45, and furthermore this will be
assumed to be the value for all values of $R$ from 1 up to $2.5 \times 10^5$. The
magnitude of the drag force is then given by

$$|F_D| = 0.225\pi\rho_f r^2\dot{x}^2.$$
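
The two drag laws for a sphere can be placed side by side in code. The sketch below is illustrative only: the function names are hypothetical, and switching between the laws at exactly $R = 1$ is a simplifying assumption made here, since the text states that Stokes' law really requires $R \ll 1$.

```python
import math

C_D = 0.45  # constant drag coefficient adopted in the text for a sphere


def stokes_drag(eta, r, speed):
    """Magnitude of the Stokes drag, 6*pi*eta*r*|speed|."""
    return 6.0 * math.pi * eta * r * abs(speed)


def quadratic_drag(rho_f, r, speed):
    """Magnitude of the velocity-squared drag, 0.5*rho_f*C_D*A*speed**2 with A = pi*r**2."""
    area = math.pi * r ** 2
    return 0.5 * rho_f * C_D * area * speed ** 2


def drag_magnitude(rho_f, eta, r, speed):
    """Choose a drag law from the Reynolds' number based on the diameter 2*r.

    Assumption for illustration: Stokes' law below R = 1, velocity-squared above.
    """
    R = rho_f * abs(speed) * (2.0 * r) / eta
    if R < 1.0:
        return stokes_drag(eta, r, speed)
    return quadratic_drag(rho_f, r, speed)


# Raindrop of Example 1 in Section 14.1: R is several hundred, so the
# velocity-squared law is selected.
F_raindrop = drag_magnitude(rho_f=1.23, eta=1.8e-5, r=0.001, speed=5.0)
print(F_raindrop)
```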

Exercises 14.1

1. (a) From the dimensions stated in the text for $\eta$, show that the Reynolds'
number $R$ is dimensionless.
(b) Verify that Stokes' law is dimensionally correct.
(c) Show that the drag coefficient CD is dimensionless.

2. For each of the following motions a typical speed $|\dot{x}|$ and characteristic length
$d$ are given. Compute the corresponding Reynolds' number. For motion in air
use $\rho_f = 1.22$ kg/m$^3$ and $\eta = 1.8 \times 10^{-5}$ kg m$^{-1}$ s$^{-1}$, while for motions in water use
$\eta = 10^{-3}$ kg m$^{-1}$ s$^{-1}$ and $\rho_f = 1000$ kg/m$^3$. Indicate the cases for which Stokes'
law should be applicable.
(a) A peregrine falcon in a hunting dive: $|\dot{x}| = 70$ m/s; $d = 0.15$ m.
(b) A minnow swimming in a quiet stream: $|\dot{x}| = 1$ m/s; $d = 0.03$ m.
(c) Airborne dust particles settling on a calm day: $|\dot{x}|$ = 2 x 101 m/s;
$d$ = 4 x 101 m.
(d) A cruising yacht: $|\dot{x}| = 10$ m/s; $d = 10$ m.
(e) Silt particles settling in a still lake: $|\dot{x}| = 1.6 \times 10^{-3}$ m/s; $d = 4 \times 10^{-5}$ m.

3. From the graph in Figure 14.1.2, for low Reynolds' numbers,

$$\log_{10} C_D \approx K - \log_{10} R$$
where $K$ is a constant.
(a) Deduce a formula for $C_D$ in terms of $R$.
(b) Substitute the formula defining $R$ into the RHS of the formula you
found in (a).
(c) Deduce from (b) and the equation $F_D = \tfrac{1}{2}\rho_f C_D A \dot{x}^2$ that
$$|F_D| = H r \eta |\dot{x}|$$
where $H$ is another constant. Compare this with Stokes' formula.

14.2 Archimedes' Principle

We all know that not all objects released in a fluid medium will fall.
Air bubbles in a carbonated drink and helium-filled balloons are typical
examples. The reason for this behaviour is the presence of a buoyant
force acting in the opposite direction to the weight force. Archimedes, in
250 BC, formulated what is now known as Archimedes' principle:
The buoyant force acting on a body either fully or partially immersed in a fluid is
equal in magnitude but opposite in direction to the weight of the fluid displaced
by the body.

Fig. 14.2.1. Force diagram for a floating chunk of ice. B denotes the buoyant
force and W the weight force.

The story goes that Archimedes discovered this principle when observing
that the level in the municipal bath increased upon his entry. In his
excitement he ran naked through the streets shouting `Eureka' (which is
Greek for `I have found it').
It is not too difficult to understand the nature of Archimedes' principle.
Consider any undisturbed region of the fluid. There are two types of
force acting on this region : a gravitational force equal to the weight
of the particles, and the pressure force exerted at the boundary of the
region by the fluid particles outside this region. If the region of fluid
is to remain stationary, the pressure must exactly balance the weight
of the fluid particles. If a solid object displaces the fluid particles, the
same pressure forces act on the object's surface as once acted on the now
displaced fluid. Hence the object experiences a buoyant force equal in
magnitude, but opposite in direction, to the weight of fluid it displaced.

Example 1. Use Archimedes' principle to calculate the fraction of a chunk of ice

which is below water level in a fresh water lake (data: the densities of water and
ice are given by $\rho_{\text{water}} = 1000$ kg/m$^3$ and $\rho_{\text{ice}} = 920$ kg/m$^3$ respectively).

Solution. The calculation can be divided into steps:

STEP 1: Draw a diagram indicating the forces acting on the chunk of ice (see
Figure 14.2.1).
STEP 2: Use Archimedes' principle to calculate B :
If the downwards direction is taken as positive then
$$B = -m_f g,$$
where $m_f$ is the mass of the fluid displaced.
STEP 3: Apply Newton's second law to obtain a relation between W and B :
If an object is floating there is no acceleration so the two forces must add to
zero. Thus
W = -B
Fig. 14.2.2. A hydrometer in (a) water, and (b) a liquid more dense than water.
Here BI> B'$ and x > 0.

which implies
$$m = m_f,$$
where m denotes the mass of the chunk of ice.
STEP 4: Express the masses in terms of volumes by using the given densities:
In general
{mass} = {volume} x {density}.
Thus
$$m = 920V, \qquad m_f = 1000V_s,$$

where $V$ is the total volume, and $V_s$ is the submerged volume of the ice, and so
$$V_s/V = 0.92.$$
Hence 92% of the chunk of ice is below water level.
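
The steps of this example reduce to a ratio of densities, which can be sketched in a few lines. The function name `submerged_fraction` is a hypothetical helper introduced here for illustration; it assumes a homogeneous object floating at rest.

```python
def submerged_fraction(rho_object, rho_fluid):
    """Fraction V_s / V of a floating object's volume that lies below the surface.

    Follows from m = m_f, i.e. rho_object * V = rho_fluid * V_s.
    """
    if rho_object > rho_fluid:
        raise ValueError("object is denser than the fluid and will not float")
    return rho_object / rho_fluid


# Data of Example 1: ice (920 kg/m^3) floating in fresh water (1000 kg/m^3).
ice_in_fresh_water = submerged_fraction(920.0, 1000.0)
print(ice_in_fresh_water)
```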

The hydrometer
A study of the previous example shows that the fraction of any particular
floating object above the surface level depends only on the density of
the fluid. This is the principle behind the hydrometer (see Figure 14.2.2)
which is used to measure the density of a liquid.
It is first necessary to mark on the stem of the hydrometer the surface
level of distilled water (density 1000 kg/m$^3$), and to calculate the submerged
volume, $V_s^{(0)}$ say. The hydrometer is then floated in the liquid
whose density is to be determined, and the displacement x of the surface
level, which is measured as the distance below the mark for distilled
water, is noted. If the stem of the hydrometer has cross-sectional area A,
then it is straightforward to show (see Exercise 14.2.4 below) that

$$\rho_f = \frac{1000\,V_s^{(0)}}{V_s^{(0)} - xA}$$
where $\rho_f$ denotes the density of the fluid.

Exercises 14.2
1. There's a well known saying: `That's just the tip of the iceberg'. Given that
the density of ice is 920 kg/m3 and the density of seawater is 1020 kg/m3, use
Archimedes' principle to calculate the fraction of the total volume of an iceberg
which is submerged under water.
2. A block of wood with dimensions 2 x 5 x 30 cm and density 400 kg/m3 is
held beneath the surface of a bucket of water by a straight string. Calculate the
tension in the string.
3. A 70 kg pig is marooned on a wooden board which is floating down a flooded
river. If the board is 15 cm thick and 2 m long and has a density of 600 kg/m3,
what is the width of the board if the top surface is level with the water?
4. Use Archimedes' principle to show that, for the hydrometer of Figure 14.2.2,
$$\rho_f = \frac{1000\,V_s^{(0)}}{V_s^{(0)} - xA}$$
where the symbols are defined in the text.
5. A `striped' cocktail is a drink made of three different liqueurs, which lie in
layers one on top of the other. The colours and densities of these liqueurs are
shown in the diagram below. The densities are in gram/ml, and each layer is
1 cm thick.

Liqueur            Density (g/ml)   Colour
Cognac             0.956            Red
Crème de Cacao     1.124            White
Grenadine          1.134            Red

A small plastic cube of volume 1 cm3 and density 1.130 g/ml is carefully
lowered into the drink.
(a) Between which layers does it float?
(b) Show that 3/5 of its volume is in one layer and 2/5 in another.
6. The molecular weight of helium is 4.00 gram/mole. The molecular weight of
air is 28.9 gram/mole. A helium filled balloon rises because it displaces air in
excess of its own weight. The molar volume is approximately the same for all
gases: 0.0224 m$^3$/mole at a temperature of 0 °C and a pressure of 1 atm. Estimate
the volume of helium required to lift a mass of 100 kg. What is the required
radius of the balloon?

Fig. 14.3.1. Forces on a moving sphere in a fluid medium. First diagram: sphere moving upwards; second diagram: sphere moving downwards.

14.3 Falling sphere with Stokes' resistance

The above sections tell us that a sphere falling through a fluid medium
is subject to three distinct forces :

a weight force (W),

a drag force (FD) and
a buoyant force (B).

As indicated in Figure 14.3.1, the direction of FD depends on whether

the sphere is moving upwards or downwards.
Furthermore, as discussed in Section 14.1, there are two different forms
of the drag force FD to consider (depending on the size of the Reynolds'
number). The first form of FD will be considered in this section while
the second form is left until the next section. Thus Stokes' law will be
considered, so that the magnitude of the resistive force is

$$|F_D| = 6\pi\eta r|\dot{x}|.$$

To determine the sign of $F_D$, both the cases of an upward moving
sphere and a downward moving sphere must be considered separately.
Suppose the displacement $x$ is measured upward from the ground. When
the sphere moves upward $\dot{x}$ is positive and, from Figure 14.3.1, $F_D$ is
negative. Thus
$$F_D = -6\pi\eta r\dot{x}.$$

When the sphere is moving towards the ground, $\dot{x}$ is negative and, from
Figure 14.3.1, $F_D$ is positive. Thus

$$F_D = 6\pi\eta r|\dot{x}| = -6\pi\eta r\dot{x}$$

since $\dot{x} < 0$, and so here the formula for $F_D$ is in fact independent of
whether the sphere is moving upwards or downwards.
Newton's second law gives the equation of motion
$$m\ddot{x} = W + B + F_D.$$
Since upwards has been chosen as the positive direction, Figure 14.3.1
shows that the weight force $W$ is negative while the buoyant force $B$
is positive. Inserting the formulae for all the forces in the equation of
motion then gives
$$m\ddot{x} = -(m - m_f)g - 6\pi\eta r\dot{x}$$
where $m$ is the mass of the sphere and $m_f$ the mass of the displaced
fluid. The quantities of interest can now be found from the differential
equation by applying the theory from Chapter 11.

Example 1. Obtain a first-order equation for the velocity. Calculate the terminal
velocity.

Solution. Writing
$$v = \dot{x} \quad (\text{and thus } \dot{v} = \ddot{x})$$
in the differential equation, and then slightly rearranging the result gives
$$\dot{v} = -\frac{6\pi\eta r}{m}v - \frac{(m - m_f)g}{m}.$$
This equation is a first-order linear inhomogeneous differential equation with constant
coefficients of the type studied in Chapter 11. The terminal velocity $v_t$ occurs
when $\dot{v} = 0$. The differential equation is then simply solved for $v$ to give

$$v_t = -\frac{(m - m_f)g}{6\pi\eta r}$$
as the required terminal velocity.

The general solution of the differential equation, to be obtained in the
exercises, is
$$\dot{x} = v_0 e^{-t/\tau} + v_t(1 - e^{-t/\tau})$$
where $v_0$ is the initial velocity and
$$\tau = \frac{m}{6\pi\eta r}$$

is called the characteristic time.

From the general solution, it follows that if $v_0 = 0$, then $t = \tau$
corresponds to the time it takes the sphere to reach a fraction $1 - e^{-1} \approx$
63% of its terminal velocity. For times $t > 5\tau$, the sphere has reached
more than 99% of its terminal velocity. Highly accurate calculations
can then be carried out by assuming that the velocity is precisely the
terminal velocity. This is especially relevant when the characteristic time
is small in comparison with the duration of the motion.
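
How quickly the velocity approaches its terminal value can be checked numerically from the general solution. The sketch below works in units of the characteristic time, so no physical data are needed; the function names are illustrative and not from the text.

```python
import math


def stokes_velocity(t, v0, vt, tau):
    """Velocity v(t) = v0*exp(-t/tau) + vt*(1 - exp(-t/tau))."""
    decay = math.exp(-t / tau)
    return v0 * decay + vt * (1.0 - decay)


def fraction_of_terminal(t_over_tau):
    """Fraction v/vt reached after t_over_tau characteristic times, starting from rest."""
    return stokes_velocity(t_over_tau, v0=0.0, vt=1.0, tau=1.0)


# Fractions of the terminal velocity reached after 1, 3, 5 and 7 characteristic times.
for k in (1, 3, 5, 7):
    print(k, round(fraction_of_terminal(k), 4))
```

Because the approach is exponential, each additional characteristic time reduces the remaining gap to the terminal velocity by the same factor $e^{-1}$.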

The Millikan oil drop experiment

In a famous experiment performed in 1909, R. Millikan used a modifi-
cation of the expression for the terminal velocity vt of a sphere falling
under Stokes' law to determine the charge on the electron. The experi-
ment consisted of observing vt for small oil droplets under the influence
of an electric field created by two parallel plates.
If there is no electric field, the unknown radius $r$ of the oil droplet
can be expressed in terms of the terminal velocity, $v_t^{(0)}$ say. This follows
from manipulation of the formula for the terminal velocity, obtained in
Example 1, giving

$$r = \left(\frac{9\eta\,|v_t^{(0)}|}{2(\rho - \rho_f)g}\right)^{1/2}$$
where $\rho$ denotes the density of the oil. As the characteristic time is very
small, the oil droplet can be assumed to be falling at the constant velocity
$v_t^{(0)}$. Thus, by measuring the time $t$ it takes the droplet to fall a distance
$\ell$, the terminal velocity is determined by the simple formula
$$|v_t^{(0)}| = \ell/t.$$
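
The manipulation leading to the expression for the radius can be sketched as follows. It assumes that the droplet is a sphere, so that $m = \tfrac{4}{3}\pi r^3\rho$ and $m_f = \tfrac{4}{3}\pi r^3\rho_f$, and it starts from the magnitude of the terminal velocity found in Example 1.

```latex
\begin{align*}
|v_t^{(0)}| &= \frac{(m - m_f)g}{6\pi\eta r}
             = \frac{\tfrac{4}{3}\pi r^3(\rho - \rho_f)g}{6\pi\eta r}
             = \frac{2r^2(\rho - \rho_f)g}{9\eta}, \\
r^2 &= \frac{9\eta\,|v_t^{(0)}|}{2(\rho - \rho_f)g}
\quad\Longrightarrow\quad
r = \left(\frac{9\eta\,|v_t^{(0)}|}{2(\rho - \rho_f)g}\right)^{1/2}.
\end{align*}
```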

Suppose that now the electric field is switched on. Since the oil droplets
have acquired a charge of a few excess electrons when they were formed
by being sprayed from an atomiser, they feel an electric force FE. If
the top plate is kept at potential V, the bottom plate at zero potential
(i.e. earthed) and the oil droplet has excess charge $q = ne$, the laws of
electrostatics give
$$F_E = n|e|\frac{V}{d},$$

where d is the plate separation and the upwards direction has been taken
as positive (see Figure 14.3.2).
When the oil drop is moving, a drag force obeying Stokes' law will
act in addition to the forces in Figure 14.3.2. The equation of motion is
therefore
$$n|e|\frac{V}{d} - (m - m_f)g - 6\pi\eta r\dot{x} = m\ddot{x}.$$

Fig. 14.3.2. Forces on a charged oil drop at rest in an electric field created by the
Millikan apparatus.

Putting $\ddot{x} = 0$ and solving for $\dot{x}$ gives the terminal velocity, $v_t$ say, as
$$v_t = \frac{n|e|V}{6\pi\eta r d} + v_t^{(0)}.$$
By measuring $v_t$ for a positive and negative voltage of the same magnitude,
and subtracting, this formula gives a value
$$\frac{n|e|V}{3\pi\eta r d}.$$
It is possible to vary the charge $n|e|$ on the oil droplet by using radiation.
The above formula predicts that the resulting differences in upward and
downward terminal velocities will always be an integer multiple of some
constant. This was found by Millikan, and the value of the constant
accurately gave the magnitude of the electronic charge. Further details
on the Millikan oil drop experiment can be found in Melissinos (1968).

Exercises 14.3
1. It was shown in the text that the equation of motion for a falling sphere
subject to Stokes' resistance is
$$\dot{v} = -\frac{6\pi\eta r}{m}v - \frac{(m - m_f)g}{m}.$$
(a) Identify the symbols in this equation.
(b) Find the general solution of the homogenized form of this equation.
(c) Find a particular solution and hence determine the general solution of
the original equation.

2. The differential equation for the velocity of a sphere subject to Stokes'
resistance is given in Exercise 1. If the sphere is released from rest, it follows, as
a special case of the solution to Exercise 1(c), that
$$\dot{x} = v_t(1 - e^{-t/\tau}).$$
Use antidifferentiation to calculate the displacement x as a function of time,
where x is measured downwards from the point of release.
3. The turbidity (muddiness) of lakes and streams - often a serious environ-
mental problem - depends on the concentration of solid particles suspended
in them. To understand the persistence of small particles in lakes and streams,
calculate the time it takes a silt-sized particle ($r \approx 2 \times 10^{-5}$ m) to fall one metre
at terminal speed. (Data: $\eta = 10^{-3}$ kg/m s, $\rho = 2800$ kg/m$^3$, $\rho_f = 10^3$ kg/m$^3$.)
4. A typical oil droplet in the Millikan experiment has radius $10^{-6}$ m. If
$\eta = 1.8 \times 10^{-5}$ kg m$^{-1}$ s$^{-1}$, $\rho = 883$ kg/m$^3$ and $\rho_f = 1.29$ kg/m$^3$, calculate the
characteristic time for the fall of a droplet with the electric field turned off. Use
your answer to explain why accurate calculations can be performed by assuming
that the sphere falls at precisely its terminal velocity.
5. For the Millikan oil drop experiment, suppose that an experimenter measures
the time it takes a droplet to
(a) fall a distance $\ell$ with the electric field turned off (time $t^{(0)}$),
(b) fall a distance $\ell$ with the electric field on (time $t^{(V)}$),
(c) rise a distance $\ell$ with the electric field reversed (time $t^{(-V)}$).
The results are tabulated in the following table. Different rows of the table
correspond to the charge of the droplet having been altered by radiation.

$t^{(0)}$ (s)    $t^{(V)}$ (s)    $t^{(-V)}$ (s)

19.38        2.46         3.18
20.3         1.7          2.0
20.18        4.34         7.74
19.52        4.82         7.96

(a) Give some possible reasons why $t^{(0)}$ is not precisely the same in each
case. What is the average value of $t^{(0)}$?
(b) Use the average value of $t^{(0)}$ to calculate the radius of the drop $r$.
(Data: $\eta = 1.60 \times 10^{-5}$ kg m$^{-1}$ s$^{-1}$, $\ell = 6.1 \times 10^{-3}$ m, $\rho = 883$ kg/m$^3$, $\rho_f =
1.29$ kg/m$^3$.)
(c) Use an appropriate formula from the text to calculate $n|e|$ for each line
in the above table. (Data: $V = 500$ volts.) You should find that, to
a good approximation, each of the values is an integer multiple of the
smallest such value.
(d) From your answer to part (c) calculate the approximate value of the
magnitude of the charge of the electron $|e|$. The unit of charge in the
MKS system is the coulomb.

14.4 Falling sphere with velocity-squared drag

In Section 14.1 the magnitude of the velocity-squared drag was given as
$$|F_D| = 0.225\pi\rho_f r^2\dot{x}^2$$

Fig. 14.4.1. Choice of a coordinate for a sphere falling over a cliff.

for a sphere of radius r. The right hand side of this expression is always
positive, irrespective of the sign of the velocity $\dot{x}$, whereas the direction of
the drag force FD must always be opposite the direction of motion (recall
Figure 14.3.1). The equation of motion can be written down correctly if
this is kept in mind. The solution of the equation of motion requires the
technique of separation of variables, discussed in Section 11.1.

Example 1. A wooden sphere of radius 5 cm is released from rest at time t = 0

at the top of a cliff. Find the subsequent velocity of the sphere assuming the $\dot{x}^2$
resistive law. (Data: $\rho = 600$ kg/m$^3$, $\rho_f = 1.2$ kg/m$^3$, $g = 9.8$ m/s$^2$.)
Solution. The method of solution can be broken into a number of distinct steps.
STEP 1: Choose a coordinate system and indicate this on a diagram. In Fig-
ure 14.4.1, the displacement of the sphere has been measured from the point of
release. The downwards direction is therefore positive.
STEP 2: Draw a force diagram for the sphere, and use Newton's second law to
obtain the equation of motion. The force diagram is given in the second diagram
of Figure 14.3.1. By Newton's second law, the equation of motion is
$$m\ddot{x} = W + B + F_D.$$
Since the downwards direction is positive, the second diagram of Figure 14.3.1 shows
that the weight force $W$ is positive while the buoyant force $B$ and drag force $F_D$ are
negative. Substituting the formulae for the forces in the equation of motion gives
$$m\ddot{x} = (m - m_f)g - 0.225\pi\rho_f r^2\dot{x}^2 \qquad (1)$$
where $m$ is the mass of the sphere, and $m_f$ is the mass of the air displaced by the
sphere.
STEP 3: Write $m$ and $m_f$ in terms of the density of the sphere and the density of
the fluid respectively, and rewrite the differential equation with $v = \dot{x}$. Since
the volume $V$ of a sphere is given in terms of its radius $r$ by

$$V = \tfrac{4}{3}\pi r^3$$

and
{mass} = {density} x {volume},

Fig. 14.4.2. The velocity-time graph for the falling sphere of Example 1 (velocity $v$ in m/s against time $t$ in s, with the terminal velocity $v_t$ marked).

we see

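$$m = \tfrac{4}{3}\pi r^3\rho \quad\text{and}\quad m_f = \tfrac{4}{3}\pi r^3\rho_f.$$

Now divide both sides of the differential equation (1) by $m$ and make the above
substitutions for $m$ and $m_f$, and then put $v = \dot{x}$ and thus $\dot{v} = \ddot{x}$. The differential
equation (1) then becomes
$$\dot{v} = -A^2v^2 + B^2$$
where

$$A = 0.411\left(\frac{\rho_f}{r\rho}\right)^{1/2} \quad\text{and}\quad B = \left[\left(1 - \frac{\rho_f}{\rho}\right)g\right]^{1/2}.$$
STEP 4: Solve the differential equation using the technique of separation of
variables. It is to be shown in the exercises that this method gives for the solution
$$\dot{x} = \frac{B}{A}\,\frac{e^{2ABt} - 1}{e^{2ABt} + 1}.$$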

A plot of this solution is given in Figure 14.4.2.
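
The numbers behind this example can be reproduced with a short script. It is a sketch under the data stated in Example 1; the function names and the choice of a classical Runge-Kutta integrator as an independent check on the closed-form solution are illustrative additions, not part of the text's method.

```python
import math

# Data of Example 1 (SI units).
r, rho, rho_f, g = 0.05, 600.0, 1.2, 9.8

A = 0.411 * math.sqrt(rho_f / (r * rho))
B = math.sqrt((1.0 - rho_f / rho) * g)


def velocity(t):
    """Closed-form v(t) = (B/A)*(exp(2ABt) - 1)/(exp(2ABt) + 1), released from rest."""
    e = math.exp(2.0 * A * B * t)
    return (B / A) * (e - 1.0) / (e + 1.0)


def velocity_rk4(t_end, steps=2000):
    """Integrate dv/dt = -A^2 v^2 + B^2 from v(0) = 0 with classical Runge-Kutta."""
    def f(v):
        return -A * A * v * v + B * B

    h, v = t_end / steps, 0.0
    for _ in range(steps):
        k1 = f(v)
        k2 = f(v + 0.5 * h * k1)
        k3 = f(v + 0.5 * h * k2)
        k4 = f(v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v


v_terminal = B / A  # limiting value of v(t) as t grows
print(round(v_terminal, 1), round(velocity(5.0), 1), round(velocity_rk4(5.0), 1))
```

The closed-form and numerically integrated velocities should agree closely, and both level off towards the limiting value $B/A$.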

Terminal velocity and characteristic time

In Example 1, it was shown that the differential equation for the velocity
of a falling sphere could be written in the form
$$\dot{v} = -A^2v^2 + B^2.$$
The terminal velocity $v_t$ occurs when $\dot{v} = 0$. Thus
$$v_t = B/A,$$

which is in agreement with the behaviour seen in Figure 14.4.2 with
appropriate values of $A$ and $B$.
The exact solution
$$\dot{x} = \frac{B}{A}\,\frac{e^{2ABt} - 1}{e^{2ABt} + 1}$$

mentioned in Example 1 suggests the choice

$$\tau = \frac{1}{2AB}$$
for a characteristic time. In this model, if $t = \tau$, the sphere has reached
approximately 46% of its terminal velocity. For times $t > 8\tau$, the sphere
has reached more than 99.9% of its terminal velocity.
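
These percentages can be read off by rewriting the exact solution with the hyperbolic tangent. The following is a sketch of that step, using only the identity $(e^{2s} - 1)/(e^{2s} + 1) = \tanh s$ and the definitions of $v_t$ and $\tau$ above.

```latex
\begin{align*}
\frac{v}{v_t} &= \frac{e^{2ABt} - 1}{e^{2ABt} + 1} = \tanh(ABt) = \tanh\!\left(\frac{t}{2\tau}\right), \\
t = \tau &:\quad \tanh\left(\tfrac{1}{2}\right) \approx 0.462, \\
t = 8\tau &:\quad \tanh(4) \approx 0.9993.
\end{align*}
```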

Exercises 14.4
1. Obtain the Reynolds' number for the wooden sphere of Example 1 falling at
terminal velocity. Does the value obtained lie in the range appropriate for the
choice of drag constant CD = 0.45 (recall Figure 14.1.2)?
2. Consider the differential equation
$$\dot{v} = -A^2v^2 + B^2,$$
with initial condition $v = 0$ at $t = 0$.
(a) State why $0 < v < B/A$ for $t > 0$.
(b) What are the dimensions of $A$ and $B$, given that $[v] = LT^{-1}$?
(c) Obtain the partial fractions expansion
$$\frac{1}{B^2 - A^2v^2} = \frac{1}{2B}\left(\frac{1}{B - Av} + \frac{1}{B + Av}\right).$$

(d) Show that the antiderivative of $\dfrac{1}{B - Av}$ with respect to $v$ is
$$-\frac{1}{A}\ln(B - Av)$$
and obtain a similar result for $\dfrac{1}{B + Av}$.
(e) Use the technique of separation of variables and the results of (c) and
(d) to show that
$$\ln\left(\frac{B + Av}{B - Av}\right) = 2ABt.$$
Check that both sides of this equation are dimensionless.
(f) From part (e), obtain the result
$$v = \frac{B}{A}\,\frac{e^{2ABt} - 1}{e^{2ABt} + 1}.$$
3. Suppose a sphere of radius $r$ and mass $m$ is thrown vertically upwards with
velocity $v_0$.
(a) If the coordinate $x$ is measured upwards from the point of release, and
the sphere is subject to the velocity-squared resistance law, show that
the equation of motion is
$$-(m - m_f)g - 0.225\pi\rho_f r^2\dot{x}^2 = m\ddot{x}.$$
Identify all the symbols not previously defined in the question.
(b) If $m > m_f$, show that the above equation of motion can be written in
the form
$$\dot{v} = -A^2v^2 - B^2$$
where $A$ and $B$ are defined in Example 1.
(c) Verify that the antiderivative of $\dfrac{1}{A^2v^2 + B^2}$ with respect to $v$ is
$$\frac{1}{AB}\arctan\left(\frac{Av}{B}\right).$$
Use this result and the method of separation of variables to show that
for the differential equation in part (b)
$$\arctan\left(\frac{Av}{B}\right) - C = -ABt$$
where $C = \arctan(Av_0/B)$.
(d) From (c) obtain the result

$$v = \frac{B}{A}\tan(C - ABt).$$
(e) How long is the ball moving upwards? For which values of t is the
solution in (d) valid?

4. For a lacrosse ball, r = 3 cm and p = 600 kg/m3. If the ball is propelled

vertically upward with velocity 15 m/s, use the answer to Exercise 3(d) to calculate
the time it takes before the ball begins to fall. Do you expect the time it takes
to fall to be less than, equal to, or greater than the time it takes to reach its
maximum height?
5. Draw the velocity-time graph of an object projected vertically upwards with
speed 10 m/s ignoring the effect of air-resistance. On the same axis sketch the
velocity-time graph of the same object with air-resistance. The precise form of
the drag force is unimportant here; the relevant feature is that it opposes the
direction of motion. In each case compare the speed of the object at a certain
point on its rise and the same point on its descent.
15
Damped and forced oscillations

In Chapter 6, the equation of motion for a particle on a spring was set

up and solved. This model will now be extended to include damping and
driving forces. As expected, the effect of a damping force is to decrease
the amplitude of the oscillation of the spring. On the other hand, forced
motion of a spring can lead to the phenomenon of resonance, when the
amplitude of the oscillations becomes very large.
The differential equations which occur in this chapter are of the linear
second-order, constant-coefficient type. Both homogeneous and inhomo-
geneous equations occur. Their solution relies on the general theory of
Chapter 5, and as a computational tool complex numbers are utilized.
We assume basic familiarity with the complex number system, including
real and imaginary parts and the polar form of a complex number.

15.1 Constant-coefficient differential equations

In this section we briefly explain a technique for solving linear constant-
coefficient second-order differential equations. We first show how to find
the solution of the homogenized equation. We then show how to find
particular solutions of differential equations with trigonometric forcing
terms.

The homogeneous equation

The differential equation
$$a\ddot{x} + b\dot{x} + cx = 0 \qquad (1)$$

where a, b and c are constants is second-order linear homogeneous and

has constant coefficients. A method of solution of such equations is to

look for solutions of the form
$$x = e^{\lambda t}.$$
Substitution into (1) gives
$$(a\lambda^2 + b\lambda + c)e^{\lambda t} = 0.$$

Since $e^{\lambda t}$ is never zero, this implies $\lambda$ must satisfy the quadratic equation
$$a\lambda^2 + b\lambda + c = 0. \qquad (2)$$

This is called the characteristic equation of (1).

Solution of the characteristic equation allows the general solution of
the differential equation to be obtained. There are three distinct cases :

(i) the roots are real and distinct,

(ii) the roots are complex conjugate,
(iii) the roots are equal to each other.
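The three cases can be told apart from the sign of the discriminant $b^2 - 4ac$ of the characteristic equation. The following is a minimal Python sketch, assuming real coefficients with $a \neq 0$; the function name `classify_roots` is ours and is introduced only for illustration.

```python
import cmath


def classify_roots(a, b, c):
    """Classify the roots of a*lam**2 + b*lam + c = 0 (real a, b, c with a != 0).

    Returns the case name together with the two roots as complex numbers.
    The test disc == 0 is exact, so it is only reliable for coefficients
    that are represented exactly (e.g. small integers).
    """
    disc = b * b - 4 * a * c
    sqrt_disc = cmath.sqrt(disc)          # works for negative discriminants too
    lam1 = (-b + sqrt_disc) / (2 * a)
    lam2 = (-b - sqrt_disc) / (2 * a)
    if disc > 0:
        case = "real distinct"
    elif disc < 0:
        case = "complex conjugate"
    else:
        case = "equal"
    return case, lam1, lam2


# The coefficient triples (1, 3, 2), (1, 2, 2) and (1, 2, 1) give one case each.
for coeffs in [(1, 3, 2), (1, 2, 2), (1, 2, 1)]:
    print(coeffs, classify_roots(*coeffs))
```

The roots are returned as complex numbers in all three cases so that the same code path serves each case; for real roots the imaginary part is simply zero.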

Complex variables. To deal with case (ii) above, where $\lambda$ is complex, it
is necessary to understand the meaning of the exponential of a complex
number. For a complex number $a + ib$, $a, b \in \mathbb{R}$, we define

$$e^{a+ib} = e^a(\cos b + i\sin b).$$

For example,

$$e^{i\pi} = -1, \quad e^{i\pi/2} = i, \quad\text{and}\quad e^{i\pi/4} = \frac{1}{\sqrt{2}}(1+i).$$

It is easy to verify from this definition that the exponential of a sum is

the product of exponentials (just as for exponentials of real numbers).
Now let $\lambda = a + ib$ be complex and let $t \in \mathbb{R}$. From the above definition
$$e^{\lambda t} = e^{at}\cos(bt) + ie^{at}\sin(bt).$$

This is an example of a function of the form

$$f(t) = f_1(t) + if_2(t), \quad t \in \mathbb{R},$$

where $f_1$ and $f_2$ are real-valued functions. We assign a meaning to
differentiating complex-valued functions by differentiating the real part
$f_1(t)$ and the imaginary part $f_2(t)$ separately, so that
$$f'(t) = f_1'(t) + if_2'(t).$$

It is left as an exercise to show, using these definitions, that

$$\frac{d}{dt}\left(e^{\lambda t}\right) = \lambda e^{\lambda t}$$

holds even when $\lambda$ is complex.
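Before attempting the exercise, the derivative rule can be checked numerically. The sketch below is an illustration only: it compares a central-difference estimate of the derivative with $\lambda e^{\lambda t}$ for one arbitrarily chosen complex $\lambda$. The sample values of `lam`, `t0` and the step `h` are our assumptions, and the helper names are hypothetical.

```python
import cmath


def exp_lt(lam, t):
    """Complex exponential e^{lam*t}; for lam = a + ib this equals
    e^{at} (cos(bt) + i sin(bt)), which is what cmath.exp computes."""
    return cmath.exp(lam * t)


def central_difference(f, t, h=1e-6):
    """Approximate f'(t) by (f(t+h) - f(t-h)) / (2h); real and imaginary
    parts are differenced together since complex subtraction is componentwise."""
    return (f(t + h) - f(t - h)) / (2 * h)


lam = complex(-1.0, 2.0)   # sample complex value (assumed for illustration)
t0 = 0.7                   # sample time (assumed for illustration)

numeric = central_difference(lambda t: exp_lt(lam, t), t0)
exact = lam * exp_lt(lam, t0)
error = abs(numeric - exact)
print(error)               # small, limited by the finite-difference step
```

A numerical check of this kind does not replace the proof asked for in the text, but it is a quick way to catch a sign error in the definitions.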

We now discuss how to obtain the general solution of the homogeneous
constant-coefficient differential equation (1) depending on the nature of
the solutions of the characteristic equation (2). We consider each of the
three cases listed above : (i) real and distinct roots, (ii) complex roots,
and (iii) equal real roots.

(i) Real distinct roots. This case is particularly straightforward, as is

illustrated in the following example.

Example 1. Find the general solution of the differential equation

$$\ddot{x} + 3\dot{x} + 2x = 0.$$

Solution. Looking for solutions of the form

$$x = e^{\lambda t}$$
gives the equation
$$\lambda^2e^{\lambda t} + 3\lambda e^{\lambda t} + 2e^{\lambda t} = 0$$
from which we obtain the characteristic equation
$$\lambda^2 + 3\lambda + 2 = 0.$$
This has solutions
$$\lambda = -2 \quad\text{and}\quad \lambda = -1$$
and hence two linearly independent solutions of the differential equation are
$$x = e^{-2t} \quad\text{and}\quad x = e^{-t}.$$
Since the differential equation is second-order linear and homogeneous, the superposition theorem says that the general solution is a linear combination of the two
particular solutions. Thus
$$x = c_1e^{-2t} + c_2e^{-t}, \quad\text{where } c_1, c_2 \in \mathbb{R},$$
is the required general solution.

(ii) Complex conjugate roots. When the roots are complex we first get
complex solutions, using the complex exponential; real solutions can
then be obtained from the complex ones. The following example shows
how.

Example 2. Find the general solution of the differential equation

$$\ddot{x} + 2\dot{x} + 2x = 0.$$

Solution. Looking for solutions of the form $e^{\lambda t}$ gives

$$\lambda^2e^{\lambda t} + 2\lambda e^{\lambda t} + 2e^{\lambda t} = 0$$
from which we obtain the characteristic equation,
$$\lambda^2 + 2\lambda + 2 = 0.$$
The solutions of this quadratic equation are
$$\lambda = -1 + i \quad\text{and}\quad \lambda = -1 - i,$$
and thus two complex solutions of the differential equation are
$$\phi_1(t) = e^{(-1+i)t} \quad\text{and}\quad \phi_2(t) = e^{(-1-i)t}.$$
From the definition of the exponential of a complex number,
$$e^{a+ib} = e^a\cos(b) + ie^a\sin(b),$$
the two solutions can be written as
$$\phi_1(t) = e^{-t}\cos(t) + ie^{-t}\sin(t) \quad\text{and}\quad \phi_2(t) = e^{-t}\cos(t) - ie^{-t}\sin(t).$$
Since the equation is second order, homogeneous and linear, the superposition theorem says the general solution is a linear combination of two independent particular
solutions. To get real solutions we choose the combinations

$$\frac{1}{2}\left(\phi_1(t) + \phi_2(t)\right) \quad\text{and}\quad \frac{1}{2i}\left(\phi_1(t) - \phi_2(t)\right).$$

Thus
$$x = c_1e^{-t}\cos t + c_2e^{-t}\sin t, \quad c_1, c_2 \in \mathbb{R},$$
is the required general real solution.

(iii) Equal roots. When the roots of the characteristic equation are the
same then we only obtain one solution of the form
$$\phi_1(t) = e^{\lambda t}$$
for the equation (1). It can be verified that another solution, linearly
independent from the first, can be obtained by multiplying the first
solution $\phi_1(t)$ by $t$:
$$\phi_2(t) = te^{\lambda t}.$$

The general solution is thus

$$x = c_1e^{\lambda t} + c_2te^{\lambda t}, \quad c_1, c_2 \in \mathbb{R}.$$

Inhomogeneous second-order equations

The equations which will be obtained in our study of the harmonic
oscillator are of the form
$$\ddot{x} + b\dot{x} + cx = F\cos\omega t \qquad (3a)$$
and

$$\ddot{x} + b\dot{x} + cx = F\sin\omega t. \qquad (3b)$$

According to the superposition theorem of Chapter 5, the general solution

of these equations consists of the solution of the homogenized equation
plus a particular solution. The homogenized equation can be solved by
looking for a solution of the form $x = e^{\lambda t}$ and solving the characteristic
equation, as we have just seen. Our technique of finding a particular
solution uses the complex exponential function. The following steps
should be performed.
STEP 1: In the differential equation, replace $\cos(\omega t)$ or $\sin(\omega t)$ by $e^{i\omega t}$.
(We do this because $e^{i\omega t} = \cos(\omega t) + i\sin(\omega t)$, so (3a) corresponds to
taking the real part and (3b) corresponds to taking the imaginary part
of the resulting complex equation.)
STEP 2: Look for a particular solution of the new (complex) equation
of the form $\phi_p(t) = \alpha e^{i\omega t}$, where $\alpha$ is a complex number which is to be
determined.

STEP 3: Take the real or imaginary parts of the particular solution of

the complex equation to obtain the particular solution to the original
equation.
These steps are illustrated by the following example.

Example 3. Find a particular solution of the equation

$$\ddot{x} - \dot{x} + 6x = \sin 2t. \qquad (4)$$

Solution.
STEP 1: Consider the complex equation, with $\sin 2t$ replaced by $e^{2it}$,
$$\ddot{x} - \dot{x} + 6x = e^{2it}. \qquad (5)$$

STEP 2: Look for a particular solution of the form

$$\phi_p(t) = \alpha e^{2it}.$$

When substituted into (5), this gives the equation

$$\alpha(-2^2 - 2i + 6)e^{2it} = e^{2it}.$$
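Solving for $\alpha$ and then writing $\alpha$ in polar form gives

$$\alpha = \frac{1}{2 - 2i} = \frac{1 + i}{4} = \frac{1}{2\sqrt{2}}e^{i\pi/4}.$$

Hence a particular complex solution to (5) is

$$\phi_p(t) = \frac{1}{2\sqrt{2}}e^{i(2t + \pi/4)}.$$

In (5), let $x = x_{\mathrm{re}} + ix_{\mathrm{im}}$ where $x_{\mathrm{re}}$ and $x_{\mathrm{im}}$ are the real and imaginary parts of
$x$ respectively. By equating the real and imaginary parts of (5), the two equations
$$\ddot{x}_{\mathrm{re}} - \dot{x}_{\mathrm{re}} + 6x_{\mathrm{re}} = \cos 2t$$
and
$$\ddot{x}_{\mathrm{im}} - \dot{x}_{\mathrm{im}} + 6x_{\mathrm{im}} = \sin 2t$$
are then obtained. It thus follows that the imaginary part of $\phi_p(t)$ satisfies the
original equation (4).
STEP 3: Choose the imaginary part of
$$\phi_p(t) = \frac{1}{2\sqrt{2}}\left(\cos\left(2t + \frac{\pi}{4}\right) + i\sin\left(2t + \frac{\pi}{4}\right)\right)$$

to get the particular solution

$$\frac{1}{2\sqrt{2}}\sin\left(2t + \frac{\pi}{4}\right)$$
of (4).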

The real particular solution just obtained, of the differential equation
(4), describes an oscillation with amplitude $A = \frac{1}{2\sqrt{2}}$. One of the
advantages of using the complex formulation is that we can easily obtain
this amplitude as the modulus of the complex solution. Thus, in the
above example,

$$\text{modulus of complex solution} = \left|\alpha e^{2it}\right| = |\alpha| = \left|\frac{1+i}{4}\right| = \frac{\sqrt{2}}{4} = \frac{1}{2\sqrt{2}} = \text{amplitude of real solution}.$$

The general proof of this result is left to the exercises.
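The three steps can also be carried out with complex arithmetic on a computer. The sketch below is a minimal illustration, assuming an equation of the form $\ddot{x} + b\dot{x} + cx = F\sin\omega t$ with unit leading coefficient; the function names `complex_amplitude` and `particular_solution` are hypothetical. Substituting $\alpha e^{i\omega t}$ gives $\alpha = F/\left((i\omega)^2 + b\,i\omega + c\right)$, which is evaluated here for the equation of Example 3.

```python
import cmath


def complex_amplitude(b, c, F, omega):
    """Return alpha such that alpha*e^{i*omega*t} solves
    x'' + b x' + c x = F e^{i*omega*t} (STEP 1 and STEP 2)."""
    denom = (1j * omega) ** 2 + b * (1j * omega) + c
    if denom == 0:
        raise ValueError("i*omega is a root of the characteristic equation")
    return F / denom


def particular_solution(alpha, omega, t):
    """STEP 3 for a sine forcing term: the imaginary part of alpha*e^{i*omega*t}."""
    return (alpha * cmath.exp(1j * omega * t)).imag


# Example 3: x'' - x' + 6x = sin 2t, so b = -1, c = 6, F = 1, omega = 2.
alpha = complex_amplitude(b=-1.0, c=6.0, F=1.0, omega=2.0)
amplitude = abs(alpha)          # modulus of alpha = amplitude of the real solution
phase = cmath.phase(alpha)      # argument of alpha = phase shift of the real solution

print(alpha, amplitude, phase)
```

For a cosine forcing term the only change would be to take the real part in `particular_solution`.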

Exercises 15.1
1. Find the real and imaginary parts of the following expressions involving
complex numbers.
(a) i(3i + 1)
(b) (2i + 1)/(-2i + 1)
(c) (4i + 1)/i

2. Find the complex solutions of the following quadratic equations.

(a) $\lambda^2 + 2 = 0$
(b) $\lambda^2 - 3\lambda + 3 = 0$
(c) $2\lambda^2 + \lambda + 1 = 0$

3. Differentiate the following functions involving complex numbers.

(a) $f(t) = ie^{3it}$
(b) f (t) = test

4. Find the general solutions of each of the following differential equations.

(a) $\ddot{x} + 5\dot{x} + 6x = 0$
(b) $2\ddot{x} + 4\dot{x} + 2x = 0$
(c) $\ddot{x} + \dot{x} + x = 0$
(d) $2\ddot{x} + \dot{x} + x = 0$

5. Use the complex exponential to find a particular solution of each of the

following inhomogeneous equations.
(a) $\ddot{x} + \dot{x} + x = 2\cos 3t$
(b) $2\ddot{x} + \dot{x} + x = 2\sin 2t$
(c) $\ddot{x} + 4x = \sin\omega t$, $\omega \in \mathbb{R}$, $\omega \neq 0$
(d) $\ddot{x} + \mu\dot{x} + 4x = \sin 2t$, $\mu \in \mathbb{R}$, $\mu \neq 0$

6. (a) Find the complex number $\alpha$ so that $z = \alpha e^{3it}$ is a solution of the

differential equation
$$\ddot{z} - \dot{z} + 12z = 2e^{3it}.$$
(b) Hence find a real solution of the differential equation
$$\ddot{x} - \dot{x} + 12x = 2\cos(3t).$$
(c) Check that the amplitude of the real solution obtained in part (b) is
equal to the modulus of the complex solution found in part (a).

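7. Let $\alpha$ be a complex number whose polar form is $|\alpha|e^{i\theta}$. Verify that if

$$z = \alpha e^{i\omega t}, \quad \omega, t \in \mathbb{R},$$
then the real and imaginary parts of $z = x + iy$ are given by
$$x = |\alpha|\cos(\omega t + \theta) \quad\text{and}\quad y = |\alpha|\sin(\omega t + \theta).$$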
(Thus the real and imaginary parts of z give oscillations whose amplitude equals
the modulus of z.)

Fig. 15.2.1. Cross-section of a dashpot. This symbol is used to represent ideal

damping in a spring.

15.2 Damped oscillations

In Chapter 6, a study was made of the motion executed by a particle
attached to a light spring. The magnitude of the force experienced by
the particle due to the restoring nature of the spring was taken as being
directly proportional to the extension of the spring (Hooke's law). The
subsequent motion of the particle was shown to be periodic, with the
period of oscillation determined by the spring stiffness and the mass of
the particle.
Everyday experience tells us that the amplitude of the oscillations will
not remain the same forever. Rather it will decrease with time until the
motion eventually stops altogether. Physically, this can be understood as
a consequence of the frictional (damping) forces present within both the
spring and the attachments to the spring.
A common device which produces significant damping in a spring
system is known as a dashpot. A dashpot consists of a cylinder filled
with a fluid, sealed at one end, with a loose plunger at the other end (see
Figure 15.2.1). Examples can be found in car shock absorbers and on
some `self closing' doors. The damping in a dashpot is due to the viscous
effect of the fluid. From the discussion in Chapter 14, the magnitude of
the damping force $F_D$ will thus be a complicated function of the velocity
$\dot{x}$, which increases as $|\dot{x}|$ increases.
The simplest choice, from the viewpoint of the resulting equation of
motion, is to have the magnitude of $F_D$ directly proportional to the speed
of the mass $m$. Thus

$$|F_D| = \gamma|\dot{x}| \qquad (1)$$

where $\gamma > 0$ is a proportionality constant called the damping constant.

This choice will be only approximately correct in general. However (1)
allows the equation of motion to be solved exactly, and it is found that
the type of motion obtained from the solution of the equation of motion
corresponds to that observed in real spring systems. The choice (1) will
be referred to as ideal damping, and the dashpot will be assumed light
so that its mass can be neglected in the equation of motion.

Fig. 15.2.2. A spring and dashpot system.

Equation of motion
The equation of motion for a particle on an ideally damped spring can
be determined by modifying the procedure of Section 6.2 for the case
without damping.

Example 1. Obtain and classify the equation of motion for the damped spring
system of Figure 15.2.2. The table is assumed smooth, the spring light and the
damping ideal. The natural length of the spring is $\ell$ metres and its spring constant is
$k$ newton/metre. The particle has mass $m$ kg. Suppose the coordinate $x$ is measured
as the extension of the spring beyond its natural length.
Solution. In the horizontal direction there are two forces acting on the particle: $F_S$
due to the spring, and $F_D$ due to the dashpot. An application of Hooke's law, as
detailed in Section 6.2, gives
$$F_S = -kx.$$
The formula (1) gives the magnitude of $F_D$. Its sign is determined from the criterion
that the damping force is always opposite in direction to the velocity. Hence
$$F_D = -\gamma\dot{x}.$$
Newton's second law applied to the particle says
$$F_D + F_S = m\ddot{x}$$
and thus
$$-\gamma\dot{x} - kx = m\ddot{x}.$$
Rearrangement gives
$$\ddot{x} + \frac{\gamma}{m}\dot{x} + \frac{k}{m}x = 0, \qquad (2)$$
which is a homogeneous linear second-order differential equation with constant coefficients.
Solution of the equation of motion
Let us suppose the spring in Example 1 is initially stretched a distance
$x_0$ metres and released from rest. It has already been remarked that
the expected consequence of a damping force in a spring system is the
gradual diminishing of the amplitude of the oscillations of the particle
about the equilibrium point. Does the solution of the differential equation
(2) make this prediction?
The solution of (2) can be obtained using the method of Section 15.1
for homogeneous equations. Recall that the first step is to try a solution
of the form $e^{\lambda t}$ and hence obtain the characteristic equation
$$\lambda^2 + \frac{\gamma}{m}\lambda + \frac{k}{m} = 0.$$
It is found that as the parameters $k$, $\gamma$ and $m$ vary, three distinct types of
solutions are possible for the damped spring system, corresponding to the
nature of the roots of the characteristic equation. These cases are termed
underdamped, overdamped and critically damped harmonic motion. The
underdamped motion corresponds to the gradual diminishing of the
amplitude mentioned above. Each of the three cases will now be discussed
separately. Some of the mathematical details are left for the exercises.

Underdamped motion
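It is to be shown in Exercise 15.2.1 that for $\gamma < 2(mk)^{1/2}$ the solution of
(2) with the initial conditions $x = x_0$ and $\dot{x} = 0$ at $t = 0$ is

$$x = x_0e^{-t/\tau}\cos(\omega t) + \frac{x_0}{\omega\tau}e^{-t/\tau}\sin(\omega t) \qquad (3)$$

where
$$\tau = \frac{2m}{\gamma}$$

and

$$\omega = \left(\frac{k}{m}\right)^{1/2}\left(1 - \frac{\gamma^2}{4km}\right)^{1/2}.$$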

The solution (3), with $x_0 = 1$, $\omega = \pi$ and $\tau = 5$, is plotted in

Figure 15.2.3. The curve begins at $x_0$ and oscillates with decreasing amplitudes bounded by the envelope curves $x = \pm x_0e^{-t/\tau}$. The time interval
which elapses between any two successive maxima of the displacement
is $2\pi/\omega$. The quantity $\tau$ is the characteristic time for the decay of the
amplitude.
Fig. 15.2.3. Underdamped harmonic motion with $x = 1$, $\dot{x} = 0$ at $t = 0$ and $\omega = \pi$, $\tau = 5$.

The motion described by (3) is said to be underdamped. A feature of
underdamped motion is that the ratio of any two successive maxima is
independent of the initial conditions and the position of the maxima.
Thus if we denote the first maximum by $x_1$, the second by $x_2$, etc., then
$$\frac{x_1}{x_2} = \frac{x_2}{x_3} = \cdots = \frac{x_n}{x_{n+1}}.$$

The logarithmic decrement, denoted by $\Delta$, is defined as

$$\Delta = \ln\frac{x_n}{x_{n+1}} = \ln\frac{x_1}{x_2}.$$

From the solution (3), after some algebra, this can be written in terms of
$\gamma$, $k$ and $m$ as
$$\Delta = 2\pi\left(\frac{4km}{\gamma^2} - 1\right)^{-1/2}.$$
A plot of $\Delta$ as a function of the dimensionless parameter $\gamma(km)^{-1/2}$ is
given in Figure 15.2.4.
For fixed values of $k$ and $m$ we see that the ratio of the first maximum
to the second maximum increases as the damping constant $\gamma$ increases,
and becomes infinite as the value $\gamma = 2(km)^{1/2}$ is reached. For $\gamma$ larger
than this critical value the motion is no longer underdamped.
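The relation between the underdamped solution and the logarithmic decrement can be checked numerically. The sketch below is an illustration under assumed sample values ($m = 1$, $k = 1$, damping constant $0.2$, $x_0 = 1$); the function names are hypothetical. It evaluates the displacement at $t = 0$ and one period $2\pi/\omega$ later, which are successive maxima when the particle is released from rest, and compares the logarithm of their ratio with the closed-form decrement.

```python
import math


def underdamped_parameters(m, k, gamma):
    """Return (tau, omega, decrement) for gamma < 2*sqrt(m*k)."""
    if gamma >= 2 * math.sqrt(m * k):
        raise ValueError("motion is not underdamped")
    tau = 2 * m / gamma                                        # decay time of the envelope
    omega = math.sqrt(k / m) * math.sqrt(1 - gamma ** 2 / (4 * k * m))
    decrement = 2 * math.pi / math.sqrt(4 * k * m / gamma ** 2 - 1)
    return tau, omega, decrement


def displacement(t, x0, tau, omega):
    """Underdamped displacement for release from rest at x = x0."""
    return x0 * math.exp(-t / tau) * (
        math.cos(omega * t) + math.sin(omega * t) / (omega * tau)
    )


# Sample values, assumed for illustration only.
tau, omega, decrement = underdamped_parameters(m=1.0, k=1.0, gamma=0.2)
period = 2 * math.pi / omega

first_max = displacement(0.0, 1.0, tau, omega)
second_max = displacement(period, 1.0, tau, omega)
measured_decrement = math.log(first_max / second_max)

print(measured_decrement, decrement)   # the two values should agree closely
```

Changing `gamma` towards $2(km)^{1/2}$ in this sketch shows the decrement growing without bound, in line with the remark above.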

Overdamped motion
Fig. 15.2.4. The logarithmic decrement $\Delta$ as a function of the dimensionless
damping parameter $\gamma(km)^{-1/2}$ for underdamped motion.

If the damping force is sufficiently strong, it might be expected that
a particle on an extended spring, when released, will return to the
equilibrium point without any oscillations at all. This can be predicted
from the solution of (2) when

$$\gamma > 2(mk)^{1/2}.$$

The characteristic equation then has real, negative and distinct roots
which we denote by $\lambda = -\lambda_1$ and $\lambda = -\lambda_2$. The general solution is of the
form

$$x = c_1e^{-\lambda_1 t} + c_2e^{-\lambda_2 t}$$

where $c_1$ and $c_2$ are determined by the initial conditions. If the initial
displacement is $x_0$ and the initial velocity is zero, then

$$x = \frac{x_0\lambda_2}{\lambda_2 - \lambda_1}e^{-\lambda_1 t} + \frac{x_0\lambda_1}{\lambda_1 - \lambda_2}e^{-\lambda_2 t}. \qquad (4)$$

Although x is never exactly zero for any positive time t, it will approach
zero in the limit as t approaches infinity. In practice therefore, the
distance of the particle from the origin will become imperceptible after
a finite time.
Plots of (4) corresponding to the initial conditions $x = 1$ and $\dot{x} = 0$
at $t = 0$, for various values of the dimensionless damping parameter
$\gamma(mk)^{-1/2}$, are given in Figure 15.2.5. A feature of these plots is that for
fixed values of $m$ and $k$, the greater the value of the damping constant $\gamma$,
the slower the return to the equilibrium position $x = 0$.
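The effect of increasing the damping constant can be seen by evaluating the overdamped solution directly. The following sketch is illustrative only: it assumes $m = k = 1$ (so $k/m = 1$, as in the figure) and two sample damping constants, and the function name `overdamped_displacement` is hypothetical. The two decay rates are obtained from the characteristic equation of (2).

```python
import math


def overdamped_displacement(t, x0, m, k, gamma):
    """Displacement for release from rest at x0 when gamma > 2*sqrt(m*k).

    The characteristic roots are -lam1 and -lam2 with 0 < lam1 < lam2.
    """
    disc = gamma ** 2 - 4 * m * k
    if disc <= 0:
        raise ValueError("motion is not overdamped")
    root = math.sqrt(disc)
    lam1 = (gamma - root) / (2 * m)     # slow decay rate
    lam2 = (gamma + root) / (2 * m)     # fast decay rate
    return (x0 * lam2 / (lam2 - lam1) * math.exp(-lam1 * t)
            + x0 * lam1 / (lam1 - lam2) * math.exp(-lam2 * t))


# Sample values, assumed for illustration: m = k = 1, compare gamma = 3 and gamma = 5.
x_light = overdamped_displacement(4.0, 1.0, 1.0, 1.0, 3.0)
x_heavy = overdamped_displacement(4.0, 1.0, 1.0, 1.0, 5.0)

print(x_light, x_heavy)   # the more heavily damped system is further from x = 0
```

The slow rate `lam1` decreases as the damping constant grows, which is why heavier damping delays the return to equilibrium.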
Fig. 15.2.5. Overdamped motion with the initial conditions $x = 1$, $\dot{x} = 0$ for
various values of the dimensionless damping parameter $\gamma(mk)^{-1/2}$ where $k/m = 1$.

Critically damped motion

As suggested by Figure 15.2.5, the value of the damping constant $\gamma$ which
corresponds to the fastest return to the equilibrium point is
$$\gamma = 2(mk)^{1/2}. \qquad (5)$$
As $\gamma$ decreases to values less than $2(mk)^{1/2}$, the motion switches from
overdamped to underdamped. For this reason, damped harmonic motion
with the condition (5) is said to be critically damped. When (5) holds,
the characteristic equation has only a single root $-\lambda_1$, which is negative
and real. The general solution of (2) is
$$x = c_1e^{-\lambda_1 t} + c_2te^{-\lambda_1 t}.$$

The constants $c_1$ and $c_2$ are determined by the initial conditions, $x = x_0$
and $\dot{x} = 0$ at $t = 0$. This gives
$$x = x_0e^{-\lambda_1 t} + x_0\lambda_1te^{-\lambda_1 t}. \qquad (6)$$

Once again, although x is never exactly zero for any t > 0, it approaches
zero as the time t approaches infinity. Hence the distance of the particle
from the origin will be imperceptible after some finite time. In most
applications of overdamping, it is desired to return to the equilibrium
point as fast as possible. The damping constant is thus chosen to have
the critical value (5).

Fig. 15.2.6. Essential features of the artillery gun. The shaded region is the barrel.

Example 2. The barrel of an artillery gun (see Figure 15.2.6) weighs 500 kg and
has a recoil spring of stiffness 40000 N/m. If the gun is fired horizontally and the
barrel recoils 1 m, determine the critical damping coefficient of an attached dashpot
with ideal damping.

Solution. Critical damping occurs when $\gamma = 2(mk)^{1/2}$. Substituting for $m$ and $k$ gives
$$\gamma = 4\sqrt{5} \times 10^3 \text{ kg/s}$$
as the required damping constant.

Example 3. For the problem in Example 2 find the time required for the barrel to
return to a position 5 cm from its initial position.
Solution. To answer this question, it is necessary to set up and solve the appropriate
equation of motion. Let $\ell$ metres denote the natural length of the spring, and let
the displacement $x$ metres be the extension of the spring as measured from $\ell$. Since
the displacement has reached a minimum at $x = -1$, the initial conditions are
$$x = -1, \quad \dot{x} = 0, \quad\text{at } t = 0.$$
It is desired to calculate $t$ such that
$$x = -0.05.$$
With this choice of coordinate system, as explained in Example 1, the equation of
motion is given by the differential equation (2). Substituting the numerical values
of $\gamma$, $m$ and $k$ gives
$$\ddot{x} + 8\sqrt{5}\,\dot{x} + 80x = 0.$$
The corresponding characteristic equation has a double root at $-4\sqrt{5}$. The general
solution is thus
$$x = c_1e^{-4\sqrt{5}t} + c_2te^{-4\sqrt{5}t}, \quad c_1, c_2 \in \mathbb{R}.$$
Substituting both initial conditions gives
$$-1 = c_1$$
and
$$0 = -4\sqrt{5}c_1 + c_2.$$
Hence $c_1 = -1$ and $c_2 = -4\sqrt{5}$ so that the solution of the equation of motion and
the initial conditions is
$$x = -e^{-4\sqrt{5}t} - 4\sqrt{5}te^{-4\sqrt{5}t}.$$
Fig. 15.2.7. Graphical solutions of $0.05e^{4\sqrt{5}t} = 1 + 4\sqrt{5}t$.

From this expression for the displacement, it follows that $x = -0.05$ when
$$0.05e^{4\sqrt{5}t} = 1 + 4\sqrt{5}t.$$
It is not possible to solve this equation using only algebra. An approximate solution
can be obtained, however, by plotting both sides of the equation and reading off the
point at which the graphs intersect (see Figure 15.2.7). It is thus found that the
required time is
t = 0.53 seconds.
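The graphical estimate can be cross-checked numerically. The sketch below is one possible approach, not the method of the text: it recomputes the critical damping constant from the data of Example 2 and then locates the root of $0.05e^{4\sqrt{5}t} - (1 + 4\sqrt{5}t)$ by bisection. The routine `bisect_root` is a hand-written hypothetical helper, and the bracketing interval $[0.1, 1]$ is our assumption.

```python
import math


def bisect_root(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                    # sign change lies in the left half
        else:
            lo, f_lo = mid, f_mid       # sign change lies in the right half
    return 0.5 * (lo + hi)


# Data of Example 2: barrel mass 500 kg, spring stiffness 40000 N/m.
m, k = 500.0, 40000.0
gamma = 2 * math.sqrt(m * k)            # critical damping constant, in kg/s
lam = gamma / (2 * m)                   # double root is -lam, with lam = 4*sqrt(5)


def residual(t):
    """Zero exactly when the displacement equals -0.05."""
    return 0.05 * math.exp(lam * t) - (1 + lam * t)


t_return = bisect_root(residual, 0.1, 1.0)
print(gamma, t_return)
```

Bisection is used here because it needs only sign information and is guaranteed to converge once the root is bracketed.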

Exercises 15.2
1. The differential equation

$$\ddot{x} + \frac{\gamma}{m}\dot{x} + \frac{k}{m}x = 0$$
describes ideal damped harmonic motion.
(a) Identify the symbols in this equation, and indicate the choice of coordi-
nate system on a diagram.
(b) Use the theory of Section 15.1 to show that there are three distinct types
of solutions depending on whether
$\gamma < 2(mk)^{1/2}$, $\gamma = 2(mk)^{1/2}$ or $\gamma > 2(mk)^{1/2}$,
and obtain the general solution in each case. When the roots are complex,
write the solution in terms of the exponential and cosine function only.
(c) In each of these cases, determine the arbitrary constants in the general
solution so that the initial conditions $x = x_0$ and $\dot{x} = 0$ when $t = 0$ are
satisfied. Compare your answers with (3), (4) and (6) in the text.
(d) Give the name of the type of motion in each case.

2. Consider the spring system of Figure 15.2.2. The table is assumed smooth,
the spring light and the damping ideal. Suppose the natural length of the spring
is 0.5 m and its stiffness is 10 N/m. Let the mass of the attached particle be 5 kg
and the damping constant 10 kg/s.
(a) If the displacement x metres is measured as the extension of the mass
from its equilibrium point, write down the equation of motion.
(b) With the initial conditions
$x = 0.1$ and $\dot{x} = 0.2$, at $t = 0$,
obtain the solution of the equation of motion in terms of the exponential
and cosine functions only.
(c) Suppose instead that the displacement x metres is measured from the
wall. Show that the equation of motion in this case is
$$5\ddot{x} = -10\dot{x} + 10(0.5 - x).$$
Use the workings of (b) to write down the general solution of the
homogenized version of this equation. Find a particular solution and
thus obtain the solution of the equation of motion when the initial
conditions are as in (b).
(d) Compare your answers obtained in parts (b) and (c) and comment. What
is the time interval between any two successive maxima?

3. (a) Prove that for ideal underdamped harmonic motion, the ratio of the
heights of any two successive maxima is a constant. Use this fact to show
that the logarithmic decrement $\Delta$ can be written as
$$\Delta = \frac{1}{n}\ln\frac{x_1}{x_{n+1}}$$
where $x_n$ is the $n$th maximum.

(b) Suppose the ratio of any two successive maxima for a car shock absorber
is designed to be 0.1. If m = 1000 kg and k = 5000 N/m, use an
appropriate formula in the text to calculate the required value of the
damping constant $\gamma$.

4. Consider a particle of mass M suspended from a light spring and an ideal

dashpot, which hang from a ceiling.
(a) If the spring has natural length $\ell$ and the displacement $x$ is measured
as the distance of the particle below the ceiling, obtain the equation of
motion
$$M\ddot{x} = -\gamma\dot{x} + k(\ell - x) + Mg.$$
(b) Explain from the equation of motion why the criteria for the types of
motion are the same as in Exercise 1.
(c) Find a particular solution and then use your answer to 1(b) to write
down the general solution in each case.

5. For the system of Exercise 4, suppose $\ell = 0.2$ m and $M = 5$ kg.

(a) Calculate $\gamma$ so that the motion is critically damped.
(b) If the spring is compressed so that the particle is 0.1 m from the ceiling
and then released from rest, calculate the subsequent displacement of
the particle, assuming $\gamma$ is as in (a).
(c) Plot the displacement-time graph in (b).
(d) Calculate how long it takes the particle to return 1 mm from its equilib-
rium point.

6. If a hydrometer is floated in water (recall Figure 14.2.2) and displaced

vertically from its equilibrium point, it will undergo damped harmonic motion.
(a) To understand this effect, suppose the stem of the hydrometer, of cross-sectional
area $a$ m$^2$, is displaced a distance $x$ metres below its
floating level. Use Archimedes' principle to show that a net force F
newton, given by
F = -1000 ax,
acts on the body, where the downwards direction has been taken as
positive. Show that the same formula applies if the hydrometer is
displaced a distance x metres above its floating point.
(b) Assuming a resistive force proportional to the velocity of the hydrometer,
obtain the equation of motion. For what values of the damping constant
will the motion be underdamped?

15.3 Forced harmonic motion

A feature of the three types of solutions for damped harmonic motion,
from the previous section, is that the amplitude progressively decreases.
For this reason, the motion is referred to as transient (temporary). To
maintain the amplitude of oscillation in a damped system, it is necessary
to have an external force acting on the system. This is known to us all
from experience on a swing. Motion of the legs and body at regular
intervals provides a force capable of maintaining a steady amplitude of
oscillation, even though significant friction is present.
The key requirement for the external force $F_f(t)$ (the subscript $f$
indicates that this term forces the oscillations) is that it must be periodic
in time. Perhaps the simplest choices with this feature are

$$F_f(t) = F_0\sin\omega_f t \quad\text{and}\quad F_f(t) = F_0\cos\omega_f t \qquad (1)$$

where $F_0$ and $\omega_f$ are constants.

Fig. 15.3.1. The spring system and coordinates for Example 1.

Equation of motion
An equation of motion with a forcing term of the form (1) can be
obtained in a spring system by periodically varying the position of the
attachment of the spring to the wall.

Example 1. Suppose a light spring and ideal dashpot are attached at one end to
a particle of mass $M$ which moves on a smooth table. The other ends of the spring
and dashpot are forced to oscillate with displacement given by
$$y = a\sin\omega_f t$$
as shown in Figure 15.3.1.
If $x$ is the displacement of the particle as measured from the point with coordinate
$\ell$ (the natural length of the spring), and $z = x - y$, show that
$$\ddot{z} + \frac{\gamma}{M}\dot{z} + \omega^2z = F_0\sin\omega_f t \qquad (2)$$

where
$$F_0 = a\omega_f^2, \quad \omega^2 = \frac{k}{M},$$
and $k$ is the stiffness of the spring and $\gamma$ is the damping constant.

Solution. Let the direction to the right of the origin in Figure 15.3.1 be positive.
There are two forces acting on the particle: $F_S$ due to the spring, and $F_D$ due to the
dashpot. To calculate $F_S$, note from Figure 15.3.1 that the extension of the spring
beyond its natural length is $x - y$, assuming $x > y$. Hence, from Hooke's law, since
the positive direction is to the right,
$$F_S = -k(x - y).$$
This formula holds if $y > x$ also, since then the spring is compressed so the force
is positive.
The force due to an ideal dashpot is directly proportional to the difference between
the velocities of the two ends of the dashpot. Since the rightmost end is moving
at velocity $\dot{x}$, and the leftmost end at velocity $\dot{y}$, this difference is $\dot{x} - \dot{y}$. If $\dot{x} > \dot{y}$,
the force is in the opposite direction to the particle motion, so
$$F_D = -\gamma(\dot{x} - \dot{y}).$$
This formula also applies if $\dot{x} < \dot{y}$.

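Since the origin for the $x$-coordinate is fixed, the acceleration of the particle is
$\ddot{x}$. Hence, by Newton's second law,
$$M\ddot{x} = F_S + F_D.$$
Changing from $x$ to the relative coordinate $z = x - y$ gives
$$F_S = -kz, \quad F_D = -\gamma\dot{z} \quad\text{and}\quad \ddot{x} = \ddot{z} + \ddot{y}.$$
Hence the equation of motion $M(\ddot{z} + \ddot{y}) = F_S + F_D$ becomes
$$M\ddot{z} = -kz - \gamma\dot{z} - M\ddot{y}$$
and so
$$M\ddot{z} = -kz - \gamma\dot{z} + Ma\omega_f^2\sin\omega_f t,$$
since $y = a\sin(\omega_f t)$. Hence

$$\ddot{z} + \frac{\gamma}{M}\dot{z} + \frac{k}{M}z = F_0\sin(\omega_f t) \qquad (3)$$

as required.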

Note that if, in Example 1 of Section 15.2, an additional force
$F_0\sin(\omega_f t)$ were applied directly to the particle of mass $M$, then the
equation of motion would become

$$\ddot{x} + \frac{\gamma}{M}\dot{x} + \frac{k}{M}x = F_0\sin(\omega_f t).$$
This equation is exactly the same as (3), but with $x$ in place of $z$.

The non-transient solution

The solution of the equation of motion (2) can be obtained by using
the method of Section 15.1. The general solution is the sum of the
homogenized solution, which for $\gamma \neq 0$ is a transient function, and a
particular solution. The particular solution has the same period as the
forcing term, whereas the amplitude depends on all the parameters as
illustrated in the following example.

Example 2. Consider the equation of motion

$$\ddot{z} + 0.1\dot{z} + 9z = \sin(\omega_f t) \qquad (4)$$
which from (2) describes forced harmonic motion with $\gamma/M = 0.1$, $\omega = 3$ and
$F_0 = 1$. Find the amplitude of the particular solution and plot its dependence on
$\omega_f$ for $0 < \omega_f < 4$.
Solution. To find a particular solution of (4) we use the method of Section 15.1.
We consider the complex equation
$$\ddot{z} + 0.1\dot{z} + 9z = e^{i\omega_f t}, \qquad (5)$$
noting that $\sin(\omega_f t)$ is the imaginary part of $e^{i\omega_f t} = \cos(\omega_f t) + i\sin(\omega_f t)$. We try a
particular solution of the form
$$z_p = \alpha e^{i\omega_f t}$$
where $\alpha$ is a complex constant, to be determined.
Substituting $z_p$ into the complex differential equation (5) gives
$$\left[\alpha(i\omega_f)^2 + 0.1\alpha i\omega_f + 9\alpha\right]e^{i\omega_f t} = e^{i\omega_f t}$$
from which it follows that
$$\alpha = \frac{1}{9 - \omega_f^2 + 0.1i\omega_f}.$$

Hence
$$z_p = \frac{1}{9 - \omega_f^2 + 0.1i\omega_f}e^{i\omega_f t} \qquad (6)$$

is the complex solution of (5). To get the real solution of (4) we need to take the
imaginary part of (6). (Note that if the right-hand side of (4) had been $\cos(\omega_f t)$
then we would need to take the real part of the particular solution (6).)
However we are only interested here in the amplitude $A$ of the real solution to
(4). This can be obtained (see the discussion at the end of Section 15.1) directly
from the complex solution (6) as $A = |\alpha|$. Hence the required amplitude $A$ is
$$A = \left|\frac{1}{9 - \omega_f^2 + 0.1i\omega_f}\right| = \frac{1}{\left|9 - \omega_f^2 + 0.1i\omega_f\right|}.$$

Hence
$$A = \left((9 - \omega_f^2)^2 + 0.01\omega_f^2\right)^{-1/2}.$$
A plot of the amplitude $A$ against the forcing frequency $\omega_f$ is given in Figure 15.3.2.
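The dependence of the amplitude on the forcing frequency can also be tabulated directly from the modulus of the complex amplitude. The sketch below is a minimal illustration for equation (4); the function name `amplitude`, its keyword parameters and the grid spacing of 0.001 are our assumptions. It scans $0 < \omega_f < 4$ for the frequency at which the amplitude is largest.

```python
def amplitude(omega_f, damping=0.1, omega_sq=9.0, forcing=1.0):
    """Amplitude of the particular solution of
    z'' + damping*z' + omega_sq*z = forcing*sin(omega_f*t),
    computed as the modulus of the complex amplitude."""
    alpha = forcing / (omega_sq - omega_f ** 2 + 1j * damping * omega_f)
    return abs(alpha)


# Scan the range 0 < omega_f < 4 on a uniform grid (spacing assumed for illustration).
grid = [i * 0.001 for i in range(1, 4000)]
peak_omega = max(grid, key=amplitude)

print(peak_omega, amplitude(peak_omega))   # the peak lies very close to omega_f = 3
```

Setting `damping` to a larger value in this sketch lowers and broadens the peak, while a smaller value makes it sharper.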

Figure 15.3.2 shows that the amplitude of the oscillations of the particular
solution peaks very close to $\omega_f = 3$ (i.e. the frequency $\omega_f/2\pi = 3/2\pi$).
The value $3/2\pi$ is the frequency of the system with damping and forcing
removed, that is, the natural frequency of the system. The frequency
$\omega_f/2\pi$ at which this peak occurs is called the resonant frequency of the
system. In general, for small damping, when the frequency of the forcing
term approximately equals the natural frequency of the spring system,
the amplitude of the non-transient solution peaks. The phenomenon is
known as resonance.
Fig. 15.3.2. Displacement amplitude against frequency near resonance: $\omega_f = 2.9$.

Resonance is responsible for some surprising happenings. A well-known
example is a singer holding a note at a certain frequency and
shattering a glass. This occurs when the natural frequency
of vibration of the molecules in the glass coincides with the frequency of
the note. Another example is the massive tides in the Bay of Fundy on
the Canadian east coast. In the oceans, the tidal rise is approximately
0.3 m, but in the bay it is 11 m. This occurs because the characteristic
period of water as it moves back and forth across the bay is about 13
hours, which is close to the period of 12.4 hours between successive high
tides. Since the ocean tide can be regarded as a driving force, a large
resonant amplitude in the bay results.
Resonance, induced by periodic fluctuations in the wind, has also
been used as an explanation of the collapse of a bridge in the State of
Washington, USA. An entertaining discussion of this is given by Braun
(1983).

Exercises 15.3

1. For the equation of motion

$$\ddot{z} + 9z = \cos(\omega_f t),$$
which describes forced harmonic motion without damping, find the amplitude of
the particular solution and sketch its dependence on $\omega_f$ for $0 < \omega_f \le 6$, $\omega_f \neq 3$.

2. (a) Recall that the natural frequency of a system is the frequency of the
oscillations when the damping force and the external force are removed.
What is the natural frequency of the system in Exercise 1?
(b) What is the resonant frequency for the system in Exercise 1?
3. Repeat Exercise 1 for the equation
$$\ddot{z} + 0.05\dot{z} + 4z = \cos\omega_f t,$$
here sketching the amplitude in the range $0 \le \omega_f \le 4$.
4. Suppose a particle rests on a light spring. If the weight of the particle causes
the spring to be compressed a distance of 5 cm, use Newton's and Hooke's laws
to deduce the ratio of the mass of the particle to the spring constant, and thus
determine the resonant frequency, assuming small damping.
5. Consider a particle resting on a smooth table and attached to a spring of
stiffness 10 N/m and an ideal dashpot. The spring and dashpot are attached to
a wall.
(a) Suppose the spring is extended and released. You are given that the
time interval between successive maxima is 1.5 s and that the ratio of
the amplitude of successive maxima is 2/7. Use appropriate formulae in
the text to calculate the mass of the particle and the damping constant.
(b) Determine the amplitude and phase of the non-transient part of the
motion if an external force Ff(t) = 2 sin(4t) acts on the system.
6. The diagram below shows the essential elements of a vibration-measuring
device (seismometer or accelerometer). If in the y-direction there is a displacement
of the form
$$y = Y\sin(\omega_f t)$$
(due to an earthquake for example), the aim is to deduce Y from the recorded
amplitude of the relative displacement
$$z = x - y + \frac{mg}{k}$$
of the particle.

[Diagram: vibration-measuring device.]
(a) By following the working of Example 1, show that the equation of
motion for the particle can be written
$$m\ddot{z} + \gamma\dot{z} + kz = mY\omega_f^2\sin(\omega_f t) - mg.$$
(b) Show that the amplitude A of the non-transient part of the solution to
this equation is
$$A = \frac{Y(\omega_f/\omega)^2}{\sqrt{[1 - (\omega_f/\omega)^2]^2 + (2\zeta\omega_f/\omega)^2}}$$
where $\omega = (k/m)^{1/2}$ and $2\zeta = \gamma/(mk)^{1/2}$.
(c) For a seismometer, $\omega$ is very small in comparison to $\omega_f$, so that
$\omega_f/\omega$ is large. Show from your answer to (b) that in this circumstance
$$Y \approx A.$$
What is the motion of the particle?
(d) For the accelerometer, the ratio $\omega_f/\omega$ is small. Show that here
$$Y \approx (\omega/\omega_f)^2 A.$$
16
Motion in a plane

This chapter introduces some problems from particle mechanics in which
the action takes place in a plane rather than along a line. Typical
problems are
motion down an inclined plane,
motion of a projectile,
where in each case the particle is assumed to move in a vertical plane.
The first problem is suggested by Galileo's famous experiment, but it
also has relevance to the motion of a skier down a slope. The second
problem has relevance to ball games as well as to warfare.
Our overall approach to these problems is to define the acceleration
of the particle as a vector $\ddot{X}$. After writing the net force acting on the
particle as a vector F, we then apply Newton's second law in its vector
form
$$F = m\ddot{X},$$
m being the mass of the particle. This leads to differential equations,
from which the motion of the particle can be predicted.
The basic ideas of vector algebra will be assumed, but ideas about
vector-valued functions and their derivatives will be explained fully.

16.1 Kinematics in a plane

In this section the ideas of velocity and acceleration will be formulated
in sufficient generality to be applicable to particles moving in a plane.
Our definitions of these quantities will be closely related to the method
by which they are calculated. An important geometrical interpretation
of velocity in terms of tangent vectors to curves will be explained later.

Fig. 16.1.1. Cartesian coordinates and position vector of a particle.

Position and displacement

A familiar way of specifying the position of a particle moving in a plane
is to give its cartesian coordinates x and y relative to a pair of axes, as
in Figure 16.1.1. The axes, which should be fixed relative to an inertial
frame, are chosen in two mutually perpendicular directions which fit in
with the geometry of the mechanical problem. These directions will often
(but not always) be horizontal and vertical respectively.
An alternative way to specify the position of the point is to give the
position vector X of the particle relative to the origin O. This vector is
just the directed line segment from O to the point in question. As can be
seen from the triangle on the right in Figure 16.1.1, the position vector is
given in terms of the cartesian coordinates of the point by the formula

$$X = x\,i + y\,j \qquad (1)$$

where i and j are the vectors of unit length pointing in the directions of
the x- and y-axes respectively. Note that the cartesian coordinates of a
point determine its position vector uniquely, and conversely.

Curves
As the time varies, the particle traces out a curve in the (x, y)-plane, as
illustrated in Figure 16.1.2.
The coordinates of the particle are thus functions of time given by,
say,
$$x = \phi(t) \quad\text{and}\quad y = \psi(t) \qquad (2)$$

where $\phi$ and $\psi$ are functions mapping times into displacements along
the axes. The position vector of the particle is also a function of the time
which, from (1) and (2), is given by

$$X = \phi(t)\,i + \psi(t)\,j \qquad (3)$$

Fig. 16.1.2. Curve traced out in the (x, y)-plane.

Fig. 16.1.3. Particle moving on a parabola in the (x, y)-plane.

The following examples show how the curve on which the particle is
moving may be obtained once these functions are known.

Example 1. The cartesian coordinates of a particle are given as functions of the
time by
$$x = t \quad\text{and}\quad y = t^2.$$
Find the curve in the (x, y)-plane on which the particle moves.

Solution. Eliminating t from these two equations gives the single equation relating
x and y,
$$y = x^2.$$
This is the equation of a parabola in the (x, y)-plane, as shown in Figure 16.1.3.
Since x increases as t increases, the particle moves along the parabola in the
direction shown by the arrow.

Fig. 16.1.4. Particle moving along a straight line in the (x, y)-plane.

In the above example, the equation $y = x^2$ is called the equation of the
curve along which the particle moves. The pair of equations $x = t$ and
$y = t^2$, however, are called the parametric equations of this curve because
they involve the parameter t.
While the equation of the curve gives a purely `static' picture, the
parametric equations tell which point on the curve the particle occupies
at any given time. Thus, in eliminating t, we lose some information about
the motion of the particle.
The following example shows how to get the equation of the curve
when the position vector, rather than each coordinate, is given as a
function of time.

Example 2. Describe the curve along which the particle moves in the (x, y)-plane
when its position vector is given as a function of time by
X = -ti + (2t + 1)j.

Solution. The cartesian coordinates x and y of the particle are just the respective
coefficients of i and j on the RHS of the formula for X. Hence
$$x = -t \quad\text{and}\quad y = 2t + 1.$$
Elimination of t gives
$$y = -2x + 1$$
and so the curve along which the particle moves is a straight line, as shown in
Figure 16.1.4. The particle moves along the line in the direction indicated by the
arrow because the parametric equations show that x decreases as t increases.
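
The elimination of the parameter in Example 2 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sampled times are assumptions chosen for illustration. It samples the parametric equations and confirms that every sampled point satisfies y = -2x + 1 and that x decreases as t increases.

```python
# Sketch: sample the parametric equations of Example 2 and check the
# eliminated equation y = -2x + 1 at each sampled time.

def position(t):
    """Coordinates (x, y) at time t for X = -t i + (2t + 1) j."""
    return (-t, 2.0 * t + 1.0)

def on_line(x, y, tol=1e-12):
    """True if (x, y) satisfies y = -2x + 1 to within tol."""
    return abs(y - (-2.0 * x + 1.0)) < tol

# Assumed sample times t = -2.0, -1.5, ..., 2.0.
times = [k * 0.5 for k in range(-4, 5)]
points = [position(t) for t in times]

# Every sampled point should lie on the line.
all_on_line = all(on_line(x, y) for x, y in points)

# x decreases as t increases, which fixes the direction of travel.
x_decreasing = all(p[0] > q[0] for p, q in zip(points, points[1:]))
```

The parametric form carries the timing information; the line equation alone would not tell us in which direction the particle travels.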
Velocity and acceleration
Our approach to these ideas is motivated by Galileo's work on projec-
tiles. He imagined the motion in the vertical plane to consist of two
independent components : one vertical and the other horizontal. Ver-
tically the motion was as for free fall, while horizontally the motion
was with constant velocity. By arguing in this way, Galileo was able to
deduce the correct result that the path traced out by the projectile was a
parabola.
We now apply this idea quite generally to a particle moving in a
plane in such a way that its cartesian coordinates x and y are twice
differentiable functions of time. The velocity consists essentially of two
components: $\dot{x}$ along the direction of the x-axis and $\dot{y}$ along the direction
of the y-axis. Just as we combined the two coordinates in equation (1)
to give a single position vector

$$X = x\,i + y\,j$$

we now combine the two velocity components to get a single velocity
vector

$$\dot{X} = \dot{x}\,i + \dot{y}\,j \qquad (4)$$

Likewise we combine the two components of acceleration $\ddot{x}$ and $\ddot{y}$ to get
a single acceleration vector

$$\ddot{X} = \ddot{x}\,i + \ddot{y}\,j \qquad (5)$$

The formulae (4) and (5) may be regarded as the definitions of velocity
and acceleration for a particle moving in a plane, with position vector
X given by (1), at time t. The speed of the particle is defined as the
magnitude of the velocity vector, and is given by

$$\|\dot{X}\| = \sqrt{\dot{x}^2 + \dot{y}^2}.$$
The advantage of combining the components to produce a single
vector will be apparent later when we give the geometrical interpretation
of the velocity vector. Meanwhile, the following example shows how
to calculate the velocity and acceleration vectors using familiar rules of
differentiation.

Fig. 16.1.5. Position, velocity and acceleration vectors.

Example 3. A particle moving in the (x, y)-plane has position vector at time t given
by
$$X = -t\,i + t^2\,j.$$
Find the velocity and acceleration vectors $\dot{X}$ and $\ddot{X}$ at time t. When t = -1, show
X, $\dot{X}$ and $\ddot{X}$ on the curve traced out by the particle. Find also the speed of the
particle at this time, if distance is in metres and time in seconds.

Solution. By our definitions, the velocity and acceleration are given as functions of
t by
$$\dot{X} = -i + 2t\,j, \qquad \ddot{X} = 2j.$$

If t = -1 then $X = i + j$, $\dot{X} = -i - 2j$ and $\ddot{X} = 2j$.

These vectors are shown in Figure 16.1.5 together with the curve along which the
particle moves, which has the equation $y = x^2$. Thus
$$\|\dot{X}\| = \sqrt{(-1)^2 + (-2)^2} = \sqrt{5} \text{ m/s}$$
is the speed of the particle when t = -1.
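
The componentwise definitions of velocity and acceleration can be checked against Example 3 by numerical differentiation. The sketch below is an illustration only: the finite-difference step sizes and function names are assumptions, and central differences stand in for the exact derivatives used in the text.

```python
import math

def X(t):
    """Position (x, y) of Example 3: X = -t i + t^2 j."""
    return (-t, t * t)

def velocity(t, h=1e-5):
    """Central-difference estimate of the velocity components."""
    (x0, y0), (x1, y1) = X(t - h), X(t + h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

def acceleration(t, h=1e-4):
    """Second central-difference estimate of the acceleration components."""
    (x0, y0), (x1, y1), (x2, y2) = X(t - h), X(t), X(t + h)
    return ((x2 - 2 * x1 + x0) / h**2, (y2 - 2 * y1 + y0) / h**2)

t = -1.0
vx, vy = velocity(t)         # the text gives -i - 2j at t = -1
ax, ay = acceleration(t)     # the text gives 2j
speed = math.hypot(vx, vy)   # the text gives sqrt(5) m/s
```

Each component is differentiated separately, exactly as in definitions (4) and (5); the speed is then the length of the resulting velocity vector.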

Velocity and tangency

Note that in Figure 16.1.5 the velocity vector $\dot{X}$ seems to be tangent to
the curve along which the particle is moving. The following argument
makes it seem plausible that this will always be the case: by definition

$$\dot{X} = \lim_{\delta t \to 0} \frac{\delta x}{\delta t}\,i + \lim_{\delta t \to 0} \frac{\delta y}{\delta t}\,j$$

where $\delta x$ and $\delta y$ are the changes in the coordinates produced by a change
$\delta t$ in the time. Hence, when $\delta t$ is sufficiently small, $\dot{X}$ will be close to the
vector
$$\frac{\delta x}{\delta t}\,i + \frac{\delta y}{\delta t}\,j = \frac{\delta x\,i + \delta y\,j}{\delta t} = \frac{\delta X}{\delta t}$$
where $\delta X$ is the change in the position vector produced by changes $\delta x$
and $\delta y$ in the coordinates. Now, for $\delta t > 0$, the vector
$$\frac{\delta X}{\delta t}$$
has the same direction as the vector $\delta X$. Hence Figure 16.1.6 makes it
plausible that
$$\lim_{\delta t \to 0} \frac{\delta X}{\delta t} = \dot{X}$$
has the same direction as the tangent to the curve along which the particle
moves, at the point with position vector X.

Fig. 16.1.6. Tangency of the velocity vector.

The notations
$$\frac{dX}{dt} \quad\text{and}\quad \frac{d^2X}{dt^2}$$
are often used for the velocity and the acceleration.
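
The plausibility argument for tangency can be illustrated numerically. The following sketch is an assumption-laden illustration rather than anything from the text: it reuses the parabola of Example 3, picks the time steps arbitrarily, and measures the angle between the chord from X(t) to X(t + dt) and the exact velocity direction as dt shrinks.

```python
import math

def X(t):
    """Position on the curve of Example 3: X = -t i + t^2 j."""
    return (-t, t * t)

def chord_direction(t, dt):
    """Unit vector along the change in position X(t + dt) - X(t), dt > 0."""
    (x0, y0), (x1, y1) = X(t), X(t + dt)
    dx, dy = x1 - x0, y1 - y0
    n = math.hypot(dx, dy)
    return (dx / n, dy / n)

t = -1.0
# The exact velocity at time t is -i + 2t j, so normalise it.
norm = math.hypot(-1.0, 2.0 * t)
tangent = (-1.0 / norm, 2.0 * t / norm)

# Angle (radians) between the chord and the velocity direction for
# a sequence of shrinking time steps (assumed values).
angles = []
for dt in (0.5, 0.1, 0.01, 0.001):
    cx, cy = chord_direction(t, dt)
    dot = max(-1.0, min(1.0, cx * tangent[0] + cy * tangent[1]))
    angles.append(math.acos(dot))
```

The angles shrink towards zero with the time step, which is what the limiting argument leads us to expect.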

Exercises 16.1

1. In each case state what is wrong with the equation involving vectors :
(a) a= b+ c (b) a= b+ c
(c) a = b c (d) X = t + 2t3j.
2. Plot each of the following points in the (x, y)-plane and then express the
position vector of the point as a linear combination of the vectors i and j.
(a) (0,1) (b) (1, 0)
(c) (1, 1) (d) $(\cos(\pi/6), \sin(\pi/6))$.

3. In each case sketch the curve traced out in the (x, y)-plane by a particle when
its coordinates are given as functions of time by the stated formulae. Show the
direction of motion along the curve.
(a) $x = t^2$ and $y = t$ (b) $x = t^2$ and $y = t^2$
(c) $x = 1$ and $y = t$ (d) $x = e^t$ and $y = e^{-t}$.
4. In each case sketch the curve traced out in the (x, y)-plane by a particle when
its position vector is given as a function of time by the stated formula. Calculate
the velocity and acceleration vectors and the speed of the particle.
(a) $X = 2t\,i + 2t\,j$ (b) $X = -t\,i - t^2\,j$
(c) $X = t\,i + e^t\,j$ (d) $X = e^t\,i + e^{2t}\,j$.

5. The position vector X of a particle at time t is given by the formula

$$X = t(\cos(\pi/3)\,i + \sin(\pi/3)\,j).$$
(a) Sketch the curve which the particle traces out and indicate the direction
of motion.
(b) Guess the velocity and acceleration vectors and the speed.
(c) Calculate the velocity and acceleration vectors by differentiation and
hence check your answer to (b).

6. Repeat Exercise 5, but now assume

$$X = t(\cos(\alpha)\,i + \sin(\alpha)\,j) \quad\text{where } 0 < \alpha < \pi/2.$$

7. A particle moves around a circle. Its position and velocity vectors at time
t are X and $\dot{X}$. What is the angle between X and $\dot{X}$? (A careful sketch should
reveal the answer!) What is the value of the dot product $X \cdot \dot{X}$?
8. Suppose that a particle moves in such a way that its position vector at time t
is given by
$$X = e^t\,i + e^{-t}\,j.$$
Calculate the velocity $\dot{X}$ and verify that it is perpendicular to X when t = 0.

16.2 Motion down an inclined plane
Galileo tested his hypothesis that bodies fall towards the Earth with
uniform acceleration by rolling a brass ball down a plane inclined to the
horizontal at various angles. The inclined plane `diluted' the effect of
gravity and, by slowing the motion, made it possible to measure accu-
rately the times at which the balls passed through various points along its
path. Galileo found that, for a given inclination, the distance which a ball
rolled down the plane when released from rest was proportional to the
square of the time taken - just as would be expected if the acceleration
were uniform.
The aim here is to study motion down an inclined plane using Newton's
laws. To simplify our model we shall replace the brass ball by a particle
and shall ignore the effect of friction. The problems which arise in trying
to make a more realistic mathematical model of Galileo's experiment will
be discussed later.
To find the equations of motion for the particle, we shall perform the
following steps :
STEP 1: Choose a coordinate system for the vertical plane containing the
line along which the particle moves.
STEP 2: Express the forces acting on the particle as a linear combination
of the vectors i and j.
STEP 3: Apply Newton's second law in its vector form to get the equations
of motion of the particle.
Similar steps apply to other problems involving motion of a particle in a
plane.

Example 1. A particle is released from rest on a smooth plane making an angle
$\alpha$ with the horizontal, where $0 < \alpha < \pi/2$. Show that the particle moves down the
plane with constant acceleration. Find also the magnitude of the normal reaction of
the plane on the particle.
Solution.
STEP 1: Choose coordinates. Let the x-axis run straight down the plane and let
the y-axis point up perpendicularly to it with the origin at the point where the
particle is initially released, as in Figure 16.2.1. Thus the particle moves along the
line in the (x, y)-plane with y = 0.
The unit vectors i and j point along the directions of the x- and y-axes, as usual.
We let
$$X = x\,i + y\,j$$
be the position vector of the particle at time t, hence
$$\ddot{X} = \ddot{x}\,i + \ddot{y}\,j.$$

Fig. 16.2.1. Choosing the x- and y-axes.

Fig. 16.2.2. Forces on the particle.

Since we assume the particle stays on the plane, it follows that y = 0 throughout
the motion. Hence $\ddot{y} = 0$.
STEP 2: Write the force as a vector. Suppose the particle has mass m. There are
two forces which act on it: its weight, of magnitude mg, and the normal reaction
of the plane, of magnitude N, say. These forces are shown in Figure 16.2.2.
These forces are to be expressed in terms of the vectors i and j. Since j is the
unit vector with the same direction as the normal reaction,
$$\{\text{normal reaction}\} = N\,j.$$
To express the weight as a vector, we first find a unit vector pointing vertically
downwards. So consider the right-angled triangle in Figure 16.2.3 with one side
parallel to the inclined plane and with hypotenuse of length 1.
The angle at the bottom of the triangle is $\pi/2 - \alpha$; hence the angle at the top is
$\pi/2 - (\pi/2 - \alpha) = \alpha$. Thus the remaining sides of the triangle have lengths $\sin(\alpha)$
and $\cos(\alpha)$ as shown. Hence the unit vector in the vertically downwards direction is
$$\sin(\alpha)\,i - \cos(\alpha)\,j.$$

Fig. 16.2.3. Unit vector vertically downwards.

Hence, in vector form,
$$\{\text{weight}\} = mg(\sin(\alpha)\,i - \cos(\alpha)\,j).$$
The net force F acting on the particle is now given by
$$\begin{aligned}
F &= \{\text{normal reaction}\} + \{\text{weight}\} \\
  &= N\,j + mg(\sin(\alpha)\,i - \cos(\alpha)\,j) \\
  &= mg\sin(\alpha)\,i + (N - mg\cos(\alpha))\,j.
\end{aligned}$$

STEP 3: Apply Newton's second law. The vector form of this law, $F = m\ddot{X}$, now
shows that
$$mg\sin(\alpha)\,i + (N - mg\cos(\alpha))\,j = m\ddot{x}\,i + m\ddot{y}\,j.$$
Equating coefficients of i and j on each side and using $\ddot{y} = 0$ gives
$$\ddot{x} = g\sin(\alpha), \qquad (1)$$
$$0 = N - mg\cos(\alpha). \qquad (2)$$
Equation (1) shows that the acceleration down the plane is the constant $g\sin(\alpha)$.
Equation (2) shows that the normal reaction has magnitude $mg\cos(\alpha)$.

Note that (1) is a differential equation for x as a function of t.

Since the RHS is a constant, the differential equation can be solved by
antidifferentiation, as explained in Section 2.4. In this way a complete
description of the motion of the particle can be obtained.
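
As a concrete illustration of the antidifferentiation step, the sketch below evaluates the motion for a particle released from rest at the origin. It is a minimal sketch under stated assumptions: the value g = 9.8 m/s², the 30 degree inclination, the mass of 2 kg and the function names are all chosen for illustration and do not come from the text.

```python
import math

G = 9.8  # m/s^2, an assumed value for g

def incline_motion(alpha, t, g=G):
    """Return (x, x_dot) at time t for release from rest at the origin.

    Antidifferentiating x'' = g sin(alpha) twice, with x(0) = 0 and
    x'(0) = 0, gives x' = g sin(alpha) t and x = (1/2) g sin(alpha) t^2.
    """
    accel = g * math.sin(alpha)
    return (0.5 * accel * t * t, accel * t)

def normal_reaction(m, alpha, g=G):
    """Magnitude of the normal reaction from equation (2): N = m g cos(alpha)."""
    return m * g * math.cos(alpha)

alpha = math.radians(30.0)           # assumed inclination
x1, v1 = incline_motion(alpha, 1.0)  # after 1 s
x2, v2 = incline_motion(alpha, 2.0)  # after 2 s
ratio = x2 / x1                      # doubling the time quadruples the distance
N = normal_reaction(2.0, alpha)      # assumed mass of 2 kg
```

The ratio of the two distances reflects Galileo's observation that the distance travelled from rest is proportional to the square of the time.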

Effect of friction
It is relatively easy to include in the above model the effect of friction
between the plane and the particle. (See Chapter 4 for the basic ideas of
friction.) Exercises to work out the details are set later.
Even when friction is included, however, the particle model is still
inadequate as a description of Galileo's experiment with rolling balls.
For example, a ball will start rolling down the plane even for very small
inclinations $\alpha$ to the horizontal, but for a particle with friction there
is a certain threshold inclination below which the particle will remain
stationary.
The main obstacle to obtaining a simple model for the rolling balls is
that the laws of friction merely give an inequality for the friction acting
on a rolling body, as explained in Section 4.3. (This difficulty can be
overcome by using the principle of conservation of energy for a rigid
body, but this topic lies beyond the scope of this book.) It turns out that
the actual acceleration of the balls is considerably less than that predicted
by the particle on the smooth plane, at least for small inclinations of the
plane.

Exercises 16.2
1. Recall from the text that, for a particle moving down a smooth plane inclined
at an angle $\alpha$ to the horizontal, the acceleration is given by the formula
$$\ddot{x} = g\sin(\alpha), \qquad 0 < \alpha < \pi/2.$$
(a) Find the limit of the RHS as $\alpha \to 0$.
(b) Find the limit of the RHS as $\alpha \to \pi/2$.
(c) What are the physical interpretations of your answers to (a) and (b)?
(d) Suggest a further check which can be applied to the answer in the text.

2. Recall from the text that, for a particle moving down a smooth plane inclined
at an angle $\alpha$ to the horizontal, the normal reaction on the particle has magnitude
given by the formula
$$N = mg\cos(\alpha).$$
(a) Find the limit of the RHS as $\alpha \to 0$.
(b) Find the limit of the RHS as $\alpha \to \pi/2$.
(c) What are the physical interpretations of your answers to (a) and (b)?

3. Solve the differential equation

$$\ddot{x} = g\sin(\alpha),$$
which arises in Example 1 in the text, after imposing the relevant initial conditions.
Does your answer show that the displacement of the particle from the origin is
proportional to the square of the time taken?
4. Consider a vertical circle of radius r > 0. Suppose that a smooth plank is
placed so that it forms a chord of the circle with lower end at the point where
the circle touches the ground, as shown below. Let the plank have inclination $\alpha$
radians with respect to the ground.

(a) Find the length of the plank in terms of r and $\alpha$.

(b) Show that the time taken by a particle to roll from rest down the length
of the plank is the same no matter what its inclination $\alpha$ happens to be.
[This result is Galileo's famous circle theorem.]

5. Consider a particle lying at rest on a rough plane with coefficient of static
friction $\mu_s$ between the plane and the particle. Suppose the inclination $\alpha$ of the
plane to the horizontal is gradually increased until the particle is on the point of
sliding down the plane. At this stage:
(a) show all the forces acting on the particle,
(b) use the fact that the acceleration is zero to show that
$$\mu_s = \tan(\alpha),$$
(c) and hence devise a practical method of estimating the coefficient of static
friction.

6. Suppose a particle is sliding down a rough inclined plane making an angle $\alpha$
with the horizontal and let the coefficient of kinetic friction between the plane
and the particle be $\mu_k$.
(a) Show that if the particle moves a distance x down the plane in time t
then
$$\ddot{x} = g(\sin(\alpha) - \mu_k\cos(\alpha)).$$
(b) Deduce there is an angle $\alpha_0$ such that the particle
(i) accelerates up the plane if $\alpha < \alpha_0$,
(ii) maintains constant velocity if $\alpha = \alpha_0$,
(iii) accelerates down the plane if $\alpha > \alpha_0$.

7. Solve the differential equation in Exercise 6(a), assuming that the particle was
initially at rest.
8. Investigate how the results stated in Exercise 6 are affected if the particle has
been initially projected so as to slide up the plane.
9. Find the time taken by a skier to ski 1 km down a slope making an angle of
30° to the horizontal
(a) in the absence of friction,
(b) if the coefficient of kinetic friction between the skis and the snow is 0.04.

16.3 Projectiles
In the study of projectiles, the aim is to set up a mathematical model for
the motion of a body projected near the Earth's surface, like a cricket
ball or an artillery shell. In the simple model investigated in this section,
the Earth is regarded as an inertial frame and hence the projectile is
assumed to move in the vertical plane containing its initial direction of
projection. While this simplification is appropriate for a cricket ball, it is
less so for a shell. The only force taken into account, moreover, is gravity
(assumed constant in magnitude and direction).
This model predicts that the path traced out by the particle (its
trajectory) is a parabola as was already known to Galileo, who
realized that the horizontal and vertical components of the motion could
be analysed separately. We achieve the same end by the use of vectors,
obtaining a pair of uncoupled simultaneous differential equations to
describe the motion.
If the model is modified to include the effect of air-resistance, the use of
vectors makes it easy to obtain the modified equations of motion, as will
be seen from a later exercise. The modified equations are coupled and,
while it is possible to uncouple them by a suitable change of coordinates,
the easiest way to get useful information about the behaviour of the
solutions is by approximate numerical techniques.
The steps to follow in deriving the equations of motion are much the
same as in the previous section.

Example 1. Find the equations of motion in suitable coordinates, of a particle

moving in a vertical plane subject only to the force of gravity, assumed constant.
Solution.
STEP 1: Choose a coordinate system. Since the only force on the particle acts
vertically, we choose the y-axis vertically upwards and hence the x-axis horizontal.
The origin is chosen at the point of projection, assumed to be at ground level.
STEP 2: Write the force as a vector. Let the mass of the particle be m. The only
force acting on the particle is its weight. This has magnitude mg and direction that
of the unit vector -j, as shown in Figure 16.3.1. Thus
{weight} = -mgj.

Fig. 16.3.1. Weight as a vector.

STEP 3: Apply Newton's second law. The vector form of this law shows that
$$-mg\,j = m\ddot{X} = m\ddot{x}\,i + m\ddot{y}\,j.$$
Equating the coefficients of i and of j on each side gives, after cancellation of m,
$$\ddot{x} = 0, \qquad (1)$$
$$\ddot{y} = -g. \qquad (2)$$
The simultaneous differential equations (1) and (2) are the desired equations of
motion.

The subsequent motion of the particle can be found from these dif-
ferential equations once the point from which the particle is projected
and the velocity of projection are specified. The initial velocity can be
obtained from the initial speed and angle of projection by elementary
trigonometry, as in the following example.

Example 2. Let the x-axis and the y-axis be horizontal and vertically upwards
respectively, as in the previous example. Suppose that a particle is projected up from
the origin at ground level with an initial speed of 10 m/s at an angle of $\pi/4$ radians
to the x-axis. Find the coordinates x and y of the particle as functions of the time
t after projection, and describe the trajectory of the particle in the (x, y)-plane.
(Assume x and y are in metres, t in seconds.)
Solution. To write the initial velocity vector in terms of i and j, we first get a
vector of unit length in the direction of projection.
By the construction given in Figure 16.3.2, this unit vector is

$$\cos\left(\frac{\pi}{4}\right) i + \sin\left(\frac{\pi}{4}\right) j = \frac{1}{\sqrt{2}}\,i + \frac{1}{\sqrt{2}}\,j.$$

Fig. 16.3.2. Unit vector in direction of projection.

To get the correct magnitude for the initial velocity vector, we now multiply this
unit vector by 10 giving
$$\dot{X} = \dot{x}\,i + \dot{y}\,j = \frac{10}{\sqrt{2}}\,i + \frac{10}{\sqrt{2}}\,j \quad\text{for } t = 0.$$
Equating coefficients of i and j in this vector equation and using the fact that the
particle starts at the origin gives

$$x = 0, \quad \dot{x} = 5\sqrt{2} \quad\text{and}\quad y = 0, \quad \dot{y} = 5\sqrt{2} \quad\text{when } t = 0. \qquad (3)$$

Solving the differential equations (1) and (2) subject to the initial conditions (3)
gives the solutions
$$x = 5\sqrt{2}\,t, \qquad y = -\tfrac{1}{2}gt^2 + 5\sqrt{2}\,t, \qquad (4)$$
which are valid while $y \ge 0$.

Eliminating t between these equations gives as the equation of the trajectory

$$y = -\frac{1}{100}\,g x^2 + x.$$

Thus the trajectory is part of a parabola, as shown in Figure 16.3.3.

Fig. 16.3.3. The trajectory of the projectile.
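
The solutions of Example 2 can be checked against the trajectory equation by direct evaluation. The sketch below is illustrative only: the value g = 9.8 m/s², the sample times and the function names are assumptions. It samples the position at times while the particle is still above ground level and compares y with the value predicted by the trajectory equation at the same x.

```python
import math

G = 9.8  # m/s^2, an assumed value for g

def initial_velocity(speed, angle):
    """Components of the initial velocity from its speed and angle."""
    return (speed * math.cos(angle), speed * math.sin(angle))

def position(t, speed=10.0, angle=math.pi / 4, g=G):
    """Coordinates (x, y) at time t, from x'' = 0 and y'' = -g."""
    vx, vy = initial_velocity(speed, angle)
    return (vx * t, vy * t - 0.5 * g * t * t)

def trajectory_y(x, g=G):
    """Trajectory of Example 2 with t eliminated: y = -(g/100) x^2 + x."""
    return -(g / 100.0) * x * x + x

# Assumed sample times t = 0.0, 0.1, ..., 1.4 s (all with y >= 0).
samples = [position(0.1 * k) for k in range(15)]
max_gap = max(abs(y - trajectory_y(x)) for x, y in samples)
```

A negligible `max_gap` indicates that the parametric solution and the eliminated equation describe the same parabola.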

One of the main applications of the mathematics of projectile motion is
to military ballistics (see Hart and Croft, 1988; and Farrar and Leeming,
1983). Some interesting stroboscopic photographs of a moving projectile
are given in Cohen (1985). To model accurately the motion of shells our
model needs to be extended to take into account rotation of the Earth,
air-resistance, and the spinning of the shell.
Applications of the mathematics of projectiles to sport are also in-
teresting; a very accessible account is given in Chapter 8 of De Mestre
(1990). This book also discusses the effect of air-resistance on projectile
motion.

Exercises 16.3
1. Verify the claims made in the solution of Example 2 in the text concerning
the solutions of the simultaneous differential equations
$$\ddot{x} = 0,$$
$$\ddot{y} = -g$$
with the relevant initial conditions.
2. Use the solutions obtained in the text to Example 2 to determine:
(a) how long the particle takes to return to ground level,
(b) how far it had then moved horizontally,
(c) the maximum height reached by the particle.

3. Let the x- and y-axes be horizontal and vertically upwards respectively, with
the origin at ground level. Suppose a particle is projected from the origin in the
(x, y)-plane with the speed u > 0 at an angle $\alpha$ to the x-axis ($0 < \alpha < \pi/2$).
(a) By following the method used in Example 2, show that the equation of
the trajectory of the particle is
$$y = -\frac{g}{2\cos^2(\alpha)\,u^2}\,x^2 + \tan(\alpha)\,x \qquad (y \ge 0).$$
(b) Verify that this equation is dimensionally correct.

4. In each of the following cases find the limiting form of the equation given in
Exercise 3 and describe the limiting trajectory :
(a) when u approaches $\infty$
(b) when $\alpha$ approaches 0.
(c) Do your results seem plausible physically? Give reasons.

5. In Exercise 3, show that the particle returns to ground level at a point whose
distance from the origin is
$$\sin(2\alpha)\,u^2/g.$$
Apply some checks to this answer. For which angle of projection is this distance
a maximum?
6. Show that, if $0 < \beta < \alpha$, the time t which elapses before the particle in
Exercise 3 again crosses the line through O making an angle $\beta$ with the
horizontal is given by
$$t = \frac{2u}{g}\cos(\alpha)(\tan(\alpha) - \tan(\beta)).$$
Apply some checks to this answer.
7. A projectile of mass m moves in a vertical plane subject to the forces of
gravity and air-resistance. The magnitude of the air-resistance is assumed to be
proportional to the square of the speed of the particle. The direction in which it
acts is opposite to the velocity vector. Suppose that the Cartesian coordinates of
the particle at time t are x and y where the x-axis is horizontal and the y-axis is
vertically upwards.
(a) In terms of $\dot{x}$ and $\dot{y}$ find
(i) the magnitude of the velocity vector $\dot{x}\,i + \dot{y}\,j$,
(ii) a vector of unit length with the direction of the velocity vector,
(iii) the speed of the projectile,
(iv) the force due to air-resistance.
(b) Deduce that the equations of motion of the projectile are
$$m\ddot{x} = -k\dot{x}\sqrt{\dot{x}^2 + \dot{y}^2},$$
$$m\ddot{y} = -mg - k\dot{y}\sqrt{\dot{x}^2 + \dot{y}^2},$$
for some constant k > 0.
[These differential equations can be uncoupled, and hence solved in
closed form (see Synge and Griffiths, 1959); however, the resulting
formulae are complicated. Alternatively, numerical approximations to
the solutions can be obtained, on a computer, and hence graphs can be
sketched.]
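
A numerical approximation of the kind mentioned in the note above might be sketched as follows. Everything specific here is an assumption made for illustration: the weight is taken as mg, so that the vertical equation reads $\ddot{y} = -g - (k/m)\dot{y}\sqrt{\dot{x}^2 + \dot{y}^2}$, the classical fourth-order Runge-Kutta method is one choice among many, and the values m = 1, k = 0.05, g = 9.8 and the step size are arbitrary.

```python
import math

def derivatives(state, m, k, g):
    """Right-hand side of the first-order system for (x, y, vx, vy)."""
    x, y, vx, vy = state
    speed = math.hypot(vx, vy)
    return (vx, vy, -(k / m) * vx * speed, -g - (k / m) * vy * speed)

def rk4_step(state, dt, m, k, g):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = derivatives(state, m, k, g)
    k2 = derivatives(shifted(state, k1, dt / 2), m, k, g)
    k3 = derivatives(shifted(state, k2, dt / 2), m, k, g)
    k4 = derivatives(shifted(state, k3, dt), m, k, g)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def horizontal_range(speed, angle, m=1.0, k=0.0, g=9.8, dt=1e-3):
    """Integrate until the particle drops below ground level; return x there."""
    state = (0.0, 0.0, speed * math.cos(angle), speed * math.sin(angle))
    while True:
        new = rk4_step(state, dt, m, k, g)
        if new[1] < 0.0:
            # Linear interpolation between the last two points to y = 0.
            frac = state[1] / (state[1] - new[1])
            return state[0] + frac * (new[0] - state[0])
        state = new

range_no_drag = horizontal_range(10.0, math.pi / 4, k=0.0)
range_with_drag = horizontal_range(10.0, math.pi / 4, k=0.05)
```

Setting k = 0 recovers the drag-free model, which gives a convenient check on the integrator before air-resistance is switched on.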
17
Motion on a circle

This chapter explains the mathematical ideas needed to study the motion
of a particle moving on a circle. These ideas enable us to formulate
mathematical models for a number of interesting problems. For example,
as an introduction to the mechanics of the solar system you are shown
how to derive one of Kepler's laws for the special case of a planet which
moves in a circular orbit. In another application you are shown how
to set up the equation of motion for a pendulum, from which useful
information can be obtained by linearizing the differential equation.

17.1 Kinematics on a circle

The key step in solving many problems in mechanics is to introduce
coordinates which fit in with the geometry of the problems. Up till now,
the only coordinates which we have used in mechanics have been cartesian
coordinates, which measure distances along axes. The coordinate which
seems most suited to problems involving circles, however, measures an
angle rather than a distance.

The angular coordinate

Consider a particle P moving around a circle of centre O in the (x, y)-plane
with radius a > 0, as in Figure 17.1.1. The angular coordinate of
the particle is defined as the angle $\theta$ radians ($-\pi < \theta \le \pi$) which the
position vector OP makes with the x-axis.

Fig. 17.1.1. The angular coordinate $\theta$.

The values of the angular coordinate at various points on the circle
are shown in Figure 17.1.2. If, for example, the particle moves around
the circle anti-clockwise then $\theta$ assumes the value 0 when the particle
passes through the point (a, 0) and then increases to the value $\pi$ when it
reaches (-a, 0). After this, $\theta$ drops to values near $-\pi$ and then increases
up to 0 as the particle passes through (a, 0) once more. Note the sudden
jump, or discontinuity, in the values of $\theta$ as the particle passes through
(-a, 0).

Fig. 17.1.2. Values of $\theta$ at various points.
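
The jump in the angular coordinate can be seen numerically. The sketch below is an illustration under assumptions: it uses Python's standard `math.atan2`, which returns an angle between $-\pi$ and $\pi$, to stand in for the angular coordinate, and the radius and the two sample points either side of (-a, 0) are chosen arbitrarily.

```python
import math

def angular_coordinate(x, y):
    """Angle in radians that the position vector (x, y) makes with the x-axis."""
    return math.atan2(y, x)

a = 1.0  # assumed radius

# Two points on the circle just either side of (-a, 0), reached by
# travelling anti-clockwise through angles of 3.1 and 3.2 radians.
just_before = angular_coordinate(a * math.cos(3.1), a * math.sin(3.1))
just_after = angular_coordinate(a * math.cos(3.2), a * math.sin(3.2))

# The particle has turned through only 0.1 radians, but the angular
# coordinate has dropped by almost 2*pi.
jump = just_after - just_before
```

The small physical rotation of 0.1 radians appears as a change of about 0.1 minus $2\pi$ in the coordinate, which is the discontinuity described above.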

Fig. 17.1.3. Change in angular coordinate over a small time interval $\delta t$.

Angular velocity and acceleration

Imagine now a particle moving around the circle, so that its angular
coordinate $\theta$ is a function of the time, say $\theta = f(t)$. We assume that the
function f is twice differentiable. As the time changes from t to $t + \delta t$,
let the angular coordinate change from $\theta$ to $\theta + \delta\theta$, as in Figure 17.1.3.
Thus $\delta\theta$ is the angle swept out by the ray joining the origin to the
particle during the time interval from t to $t + \delta t$, where $\delta t \ne 0$. We
assume, for the present, that $\theta \ne \pi$ so as to avoid the discontinuity in
the value of $\theta$.
The ratio
$$\frac{\delta\theta}{\delta t} = \frac{f(t + \delta t) - f(t)}{\delta t}$$
is called the average angular velocity of the particle during this interval.
The angular velocity at time t is defined to be
$$\lim_{\delta t \to 0} \frac{\delta\theta}{\delta t} = f'(t)$$
and we denote it by $\dot\theta$. Thus the angular velocity is the rate of increase
of the angular coordinate with respect to time. It measures the rate at
which the particle is spinning around the origin. If the angular velocity
is positive, the particle is moving anti-clockwise around the circle; if it is
negative, however, the particle is moving clockwise. If $\dot\theta$ is constant, the
motion of the particle around the circle is said to be uniform.
In a similar way, the ratio
$$\frac{\delta\dot\theta}{\delta t} = \frac{f'(t + \delta t) - f'(t)}{\delta t}$$
is called the average angular acceleration of the particle during the time
interval from t to $t + \delta t$. The angular acceleration at time t is defined to
be
$$\lim_{\delta t \to 0} \frac{\delta\dot\theta}{\delta t} = f''(t)$$
and we denote it by $\ddot\theta$. The angular acceleration is the rate of increase of
the angular velocity with respect to the time.
In spite of the jump in the value of the angular coordinate whenever
the particle crosses the negative x-axis it turns out that, in the problems of
interest to us, there is no such jump in the angular velocity or acceleration.
We now relate the angular quantities to their cartesian counterparts.

Fig. 17.1.4. Cartesian coordinates in terms of the angular coordinate $\theta$.

Cartesian coordinates
As can be seen from Figure 17.1.4, the cartesian coordinates of the
particle are given in terms of its angular coordinate by the formulae
$$x = a\cos\theta, \qquad y = a\sin\theta.$$
Hence the position vector of the particle is given in terms of its angular
coordinate by the formula

$$X = a\cos(\theta)\,i + a\sin(\theta)\,j. \qquad (1)$$

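In accordance with the definitions in Section 16.1, the velocity $\dot{\mathbf{X}}$ of
the particle is obtained from (1) by differentiating each component with
respect to the time $t$. Now, by the rule for differentiating composite
functions,
$$\frac{d}{dt}\,a\cos(\theta) = \frac{d}{d\theta}\,a\cos(\theta)\,\frac{d\theta}{dt} = -a\sin(\theta)\,\dot{\theta},$$
$$\frac{d}{dt}\,a\sin(\theta) = \frac{d}{d\theta}\,a\sin(\theta)\,\frac{d\theta}{dt} = a\cos(\theta)\,\dot{\theta}$$
and hence from (1)
$$\dot{\mathbf{X}} = -a\sin(\theta)\,\dot{\theta}\,\mathbf{i} + a\cos(\theta)\,\dot{\theta}\,\mathbf{j}. \tag{2}$$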

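In a similar way the acceleration vector $\ddot{\mathbf{X}}$ is obtained from (2) by
differentiating each component with respect to the time $t$. This involves
using both the product and the composite rules for differentiation. The
result is
$$\ddot{\mathbf{X}} = \left(-a\cos(\theta)\,\dot{\theta}^2 - a\sin(\theta)\,\ddot{\theta}\right)\mathbf{i} + \left(-a\sin(\theta)\,\dot{\theta}^2 + a\cos(\theta)\,\ddot{\theta}\right)\mathbf{j}. \tag{3}$$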

Tangent and normal vectors

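The equations for the velocity and acceleration will now be rewritten so
as to show their geometrical significance. In equation (2) take out the
common factor $a\dot{\theta}$ to get
$$\dot{\mathbf{X}} = a\dot{\theta}\left(-\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j}\right) \tag{4}$$

while in equation (3) expand and then interchange the second and third
terms to get
$$\ddot{\mathbf{X}} = -a\cos(\theta)\,\dot{\theta}^2\,\mathbf{i} - a\sin(\theta)\,\dot{\theta}^2\,\mathbf{j} - a\sin(\theta)\,\ddot{\theta}\,\mathbf{i} + a\cos(\theta)\,\ddot{\theta}\,\mathbf{j}$$
and hence
$$\ddot{\mathbf{X}} = -a\dot{\theta}^2\left(\cos(\theta)\,\mathbf{i} + \sin(\theta)\,\mathbf{j}\right) + a\ddot{\theta}\left(-\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j}\right). \tag{5}$$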

Notice that equation (5) expresses $\ddot{\mathbf{X}}$ as a linear combination of the two
vectors
$$\cos(\theta)\,\mathbf{i} + \sin(\theta)\,\mathbf{j} \quad\text{and}\quad -\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j},$$
which we shall denote by $\boldsymbol{\nu}$ ('nu') and $\boldsymbol{\tau}$ ('tau') respectively.
We imagine the vectors $\boldsymbol{\nu}$ and $\boldsymbol{\tau}$ as sitting with their tails at the point of
the circle occupied by the particle. The geometrical significance of these
vectors is described in the following proposition, which is illustrated in
Figure 17.1.5(b).
Proposition. The vectors
$$\boldsymbol{\nu} = \cos(\theta)\,\mathbf{i} + \sin(\theta)\,\mathbf{j} \quad\text{and}\quad \boldsymbol{\tau} = -\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j} \tag{6}$$
have length 1 and, at the point on the circle with angular coordinate $\theta$,
$\boldsymbol{\nu}$ is normal to the circle, pointing outwards from the circle, while
$\boldsymbol{\tau}$ is tangent to the circle and points anti-clockwise.

Fig. 17.1.5. Unit tangent and normal vectors.

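In terms of these vectors, the velocity $\dot{\mathbf{X}}$ and the acceleration $\ddot{\mathbf{X}}$ may be
written very simply as

$$\dot{\mathbf{X}} = a\dot{\theta}\,\boldsymbol{\tau} \tag{7}$$

and
$$\ddot{\mathbf{X}} = -a\dot{\theta}^2\,\boldsymbol{\nu} + a\ddot{\theta}\,\boldsymbol{\tau}. \tag{8}$$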

Proof. The vectors $\boldsymbol{\nu}$ and $\boldsymbol{\tau}$ have length 1; this follows directly from (6).
The direction of $\boldsymbol{\nu}$ is the same as that of $\mathbf{X}$, as comparison of (1) and (6)
shows. Thus $\boldsymbol{\nu}$ points out radially from the circle.
On the other hand it follows directly from (6) that the dot product of
$\boldsymbol{\nu}$ and $\boldsymbol{\tau}$ is zero. Hence $\boldsymbol{\tau}$ is perpendicular to $\boldsymbol{\nu}$ and so must be tangent to
the circle.
Finally, the formulae (7) and (8) follow immediately from (4) and (5)
on use of (6). $\square$
The geometric significance of the formulae (7) and (8) for the velocity
and acceleration is now clear. Since $\boldsymbol{\nu}$ and $\boldsymbol{\tau}$ are the unit vectors normal
and tangent to the circle, these formulae express the velocity and
acceleration in terms of normal and tangential components, as illustrated in
Figure 17.1.6.

Fig. 17.1.6. Velocity and acceleration in terms of tangent and normal vectors.

Thus (7) shows that the velocity is in the direction of the tangent to
the circle and has magnitude $|a\dot{\theta}| = a|\dot{\theta}|$, which is also called the speed of
the particle.
On the other hand, (8) shows that the acceleration has both a tangential
component of magnitude $|a\ddot{\theta}| = a|\ddot{\theta}|$ and a normal component, of
magnitude $a\dot{\theta}^2$, directed towards the centre of the circle.
The formulae (7) and (8) are the key to solving problems involving
motion in a circle and they should be remembered.
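
As a check on formulae (7) and (8), the following minimal sketch compares them with finite-difference derivatives of the position vector (1). The particular motion $\theta = t^2/2$, the radius, the sample time and the step size are all assumptions chosen only for illustration; they do not come from the text.

```python
import math

def position(a, theta):
    """Position vector X = a cos(theta) i + a sin(theta) j, as in (1)."""
    return (a * math.cos(theta), a * math.sin(theta))

def unit_vectors(theta):
    """Outward unit normal nu and anti-clockwise unit tangent tau, as in (6)."""
    nu = (math.cos(theta), math.sin(theta))
    tau = (-math.sin(theta), math.cos(theta))
    return nu, tau

def velocity_and_acceleration(a, theta, theta_dot, theta_ddot):
    """Velocity from (7) and acceleration from (8)."""
    nu, tau = unit_vectors(theta)
    vel = tuple(a * theta_dot * tc for tc in tau)
    acc = tuple(-a * theta_dot ** 2 * nc + a * theta_ddot * tc
                for nc, tc in zip(nu, tau))
    return vel, acc

def f(t):
    """Hypothetical angular coordinate theta = f(t) = t^2 / 2 (illustration only)."""
    return 0.5 * t * t

a, t, h = 2.0, 1.2, 1e-4          # assumed radius, sample time, difference step
# For f(t) = t^2 / 2 we have f'(t) = t and f''(t) = 1.
vel, acc = velocity_and_acceleration(a, f(t), t, 1.0)

# Central finite differences applied directly to the position vector.
xm, x0, xp = position(a, f(t - h)), position(a, f(t)), position(a, f(t + h))
vel_fd = tuple((p - m) / (2 * h) for p, m in zip(xp, xm))
acc_fd = tuple((p - 2 * c + m) / h ** 2 for p, c, m in zip(xp, x0, xm))

print("velocity     (7):", vel, " finite difference:", vel_fd)
print("acceleration (8):", acc, " finite difference:", acc_fd)
```

The two pairs of vectors should agree to within the accuracy of the finite differences, which illustrates that (7) and (8) are just the component formulae (2) and (3) written in terms of $\boldsymbol{\nu}$ and $\boldsymbol{\tau}$.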

Exercises 17.1
1. Each of the following points lies on the circle of radius 2 and centre (0, 0) in
the (x, y)-plane:
$(0, -2)$, $(2, 0)$, $(0, 2)$, $(-1, \sqrt{3})$, $(-2, 0)$.
Plot each of these points on the circle and state the value of its angular coordinate.
2. A particle moves around the circle in Exercise 1 in such a way that its position
vector at time $t$ is given by the formula
$$\mathbf{X} = 2\cos(t)\,\mathbf{i} + 2\sin(t)\,\mathbf{j}.$$
(a) Express the angular coordinate $\theta$ as a function of $t$ when $-\pi < t \le \pi$.
Hence find the values of the angular velocity $\dot{\theta}$ and angular acceleration
$\ddot{\theta}$ at time $t$. Find also the speed of the particle.
(b) Now use appropriate formulae in the text to express the velocity $\dot{\mathbf{X}}$ and
the acceleration vector $\ddot{\mathbf{X}}$ as functions of $t$ for $-\pi < t < \pi$.
(c) State briefly why the formulae obtained in (b) are still valid when
$\pi < t \le 3\pi$.
3. By using the formulae given in the text for the unit tangent and unit normal
vectors, verify that they each have length 1.
4. By using the formulae given in the text for the unit tangent and unit normal
vectors $\boldsymbol{\tau}$ and $\boldsymbol{\nu}$ at a point on the circle, verify that their dot product $\boldsymbol{\tau} \cdot \boldsymbol{\nu}$ is zero.
To which geometrical fact does this correspond?
5. The bob P of a simple pendulum oscillates in a vertical plane along the arc
of a circle between two points A and B (at the same height). Let $\theta$ be the angle
which the pendulum makes with the vertical at time $t$, as shown below.

(a) Which point does the particle P occupy when its angular coordinate $\theta$
assumes (i) its maximum value? (ii) its minimum value?
(b) What is the value of the angular velocity $\dot{\theta}$ at these points?
(c) Show the direction of the acceleration vector $\ddot{\mathbf{X}}$ at each of these points.

6. In Exercise 5, which point does the particle P occupy when its angular
acceleration is zero? Show the direction of its acceleration vector $\ddot{\mathbf{X}}$ at this point.
7. Complete the details given in the text of the derivation of formula (3) for $\ddot{\mathbf{X}}$
from formula (2) for $\dot{\mathbf{X}}$.

17.2 Uniform circular motion

A number of interesting problems in physics can be modelled by a
particle moving around a circle with constant angular velocity. In this
special case, the formula for the acceleration of the particle simplifies. As
in the previous section, we assume the particle is moving on a circle of
radius $a$ and centre at the origin and we let $\theta$ and $\mathbf{X}$ denote the angular
coordinate and the position vector of the particle at time $t$.
The assumption of constant angular velocity means that
$$\dot{\theta} = \omega \tag{1}$$
for some constant $\omega \neq 0$ and hence that $\ddot{\theta} = 0$. Hence in the formula
(8) of Section 17.1 the tangential component drops out and the formula
simplifies to
$$\ddot{\mathbf{X}} = -a\dot{\theta}^2\,\boldsymbol{\nu} = -a\omega^2\,\boldsymbol{\nu}. \tag{2}$$

Fig. 17.2.1. Acceleration vector points to centre of circle.

This shows that, when circular motion is uniform, the acceleration always
points in the direction from the particle to the origin, as shown in Figure
17.2.1.
We assume that the particle starts on the x-axis so that $\theta = 0$ when
$t = 0$. The differential equation (1) together with this initial condition
has the solution
$$\theta = \omega t \tag{3}$$

provided $t$ is in the interval for which $-\pi < \theta \le \pi$. If $t$ lies outside this
interval, then a multiple of $2\pi$ must be added to the RHS of (3). Hence
the formula
$$\mathbf{X} = a\cos(\theta)\,\mathbf{i} + a\sin(\theta)\,\mathbf{j}$$
gives in all cases
$$\mathbf{X} = a\cos(\omega t)\,\mathbf{i} + a\sin(\omega t)\,\mathbf{j}. \tag{4}$$

This formula shows that $\mathbf{X}$ is a periodic function of $t$ with period $2\pi/|\omega|$.
Thus the particle makes one complete revolution around the circle in
time $2\pi/|\omega|$. The distance which the particle moves during this time is
the length $2\pi a$ of the circumference of the circle. Hence the speed $u$ of
the particle is given by
$$u = \frac{\{\text{distance moved}\}}{\{\text{time taken}\}} = \frac{2\pi a}{2\pi/|\omega|} = a|\omega|. \tag{5}$$
This result can also be obtained from the more generally applicable
formula (4) of Section 17.1 (in which the angular velocity need not be
constant). The formula (5) enables us to deduce the speed of the particle
from its angular velocity, and vice versa.

The following example gives a simple application of the above kinematical
formulae to astronomy. In this example the Earth's acceleration
is calculated from other data about its orbit.

Example 1. The Earth moves around the sun in an orbit which is nearly circular
of radius $1.49 \times 10^8$ km. Find
(a) the angular velocity of the Earth around the sun,
(b) the speed of the Earth,
(c) the magnitude of its acceleration towards the sun.

Solution. We choose the centre of the sun as the origin and let
$\mathbf{X}$ = {position vector of the Earth at time $t$},
$\boldsymbol{\nu}$ = {outward unit normal to the Earth's orbit at this point},
$\omega$ = {angular velocity of the Earth around the sun},
$u$ = {speed of the Earth}.
(a) The position vector of the Earth sweeps out an angle of $2\pi$ radians in 1
sidereal year, of length 365.25 days. Thus the angular velocity $\omega$ is given by
$$\omega = \frac{\{\text{angle swept out}\}}{\{\text{time taken}\}} = \frac{2\pi}{365.25 \times 24 \times 60 \times 60} = 1.99 \times 10^{-7} \text{ rad/s}.$$
(b) The speed of the Earth is given by (5) as
$$u = a\omega = 1.49 \times 10^8 \times 1.99 \times 10^{-7} = 29.7 \text{ km/s}.$$
(c) From (2), the acceleration of the Earth is directed towards the sun and has
magnitude
$$a\omega^2 = 1.49 \times 10^8 \times (1.99 \times 10^{-7})^2 = 5.90 \times 10^{-6} \text{ km/s}^2.$$
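
The arithmetic in Example 1 can be reproduced with a few lines of code. This is only a sketch: the function name `uniform_circular_motion` is our own, and the code carries the unrounded value of $\omega$ through the calculation, so the last digit of the acceleration may differ slightly from the value above, which was obtained from the rounded $\omega = 1.99 \times 10^{-7}$.

```python
import math

def uniform_circular_motion(radius, period):
    """Return (omega, speed, acceleration magnitude) for uniform circular motion.

    omega = 2*pi/period, speed u = a*omega as in (5), acceleration a*omega^2 as in (2).
    """
    omega = 2 * math.pi / period
    speed = radius * omega
    accel = radius * omega ** 2
    return omega, speed, accel

radius_km = 1.49e8                     # radius of the Earth's orbit, from Example 1
year_s = 365.25 * 24 * 60 * 60         # length of the year in seconds

omega, u, acc = uniform_circular_motion(radius_km, year_s)
print(f"angular velocity = {omega:.3e} rad/s")
print(f"speed            = {u:.1f} km/s")
print(f"acceleration     = {acc:.3e} km/s^2")
```

The same function can be reused for other orbits (for example, those in Exercises 17.2) by changing the radius and the period.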

Fig. 17.2.2. A car crossing a bridge. See Example 2.

Similar calculations to those in the above example can be made for
other planets and their satellites by using data about their orbits obtained,
for example, from the CRC Handbook of Chemistry and Physics.
While the above example involves only kinematics, the following example
requires the use of Newton's second law of motion in its vector
form, $\mathbf{F} = m\ddot{\mathbf{X}}$.

Example 2. A car is crossing a bridge whose vertical cross-section has the form of
an arc of a circle of radius 50 metres. Show that, if the car is to maintain contact
with the bridge at the highest point, then its speed $u$ must satisfy the inequality
$u \le 22$ m/s.

Solution. We regard the car as a particle. The solution follows the usual steps for
a dynamical problem: introduce notation; draw a force diagram; apply Newton's
second law.
We choose the centre of the circle as origin and let
$\mathbf{X}$ = {position vector of car at time $t$},
$\omega$ = {angular velocity of car around origin},
$\boldsymbol{\nu}$ = {outward unit normal at top of circle},
$\boldsymbol{\tau}$ = {unit tangent at top of circle},
$N$ = {magnitude of normal reaction on car},
$f$ = {magnitude of friction force on car},
$m$ = {mass of car}.
We need only consider the forces acting on the car when it is at the highest point
of the road. The directions of the individual forces are then either horizontal or
vertical, as shown in Figure 17.2.3. Thus the net force acting on the car, at the top
of the road, is given by
$$\mathbf{F} = (N - mg)\,\boldsymbol{\nu} - f\,\boldsymbol{\tau}.$$

Fig. 17.2.3. Forces on the car at highest point of road.

Now, while the car remains in contact with the road, it is moving around a circle
of radius $a = 50$ metres. Hence the formula (2) gives
$$\ddot{\mathbf{X}} = -a\omega^2\,\boldsymbol{\nu}$$
where $a = 50$ metres. Application of Newton's second law $\mathbf{F} = m\ddot{\mathbf{X}}$ now gives
$$(N - mg)\,\boldsymbol{\nu} - f\,\boldsymbol{\tau} = -ma\omega^2\,\boldsymbol{\nu}.$$
Equating coefficients of $\boldsymbol{\nu}$ on each side of this equation gives
$$N - mg = -ma\omega^2$$
and hence
$$a\omega^2 = g - N/m.$$
But, since $N \ge 0$, this implies that
$$a\omega^2 \le g.$$
Hence by (5)
$$u^2 \le ag,$$
and so $u \le \sqrt{ag}$. With $a = 50$ m and $g \approx 9.8$ m/s$^2$ this gives $u \le 22$ m/s.
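
The bound in Example 2 is easy to evaluate numerically. In the sketch below the value $g = 9.8$ m/s$^2$ and the car mass of 1000 kg are assumptions made for illustration, and the function names are our own. The normal reaction is computed from $N - mg = -ma\omega^2$ together with $u = a\omega$.

```python
import math

G = 9.8  # assumed gravitational acceleration in m/s^2

def max_contact_speed(radius, g=G):
    """Largest speed u with u^2 <= a*g, i.e. with normal reaction N >= 0."""
    return math.sqrt(radius * g)

def normal_reaction(mass, speed, radius, g=G):
    """N = m*(g - u^2/a), from N - m*g = -m*a*omega^2 with u = a*omega."""
    return mass * (g - speed ** 2 / radius)

radius = 50.0                       # metres, from Example 2
u_max = max_contact_speed(radius)
print(f"maximum speed = {u_max:.1f} m/s")

mass = 1000.0                       # kg, hypothetical mass for illustration
for speed in (10.0, 20.0, u_max):
    print(f"u = {speed:5.1f} m/s  ->  N = {normal_reaction(mass, speed, radius):8.1f} N")
```

The normal reaction falls to zero as the speed approaches the bound, which is the point at which the car is about to lose contact with the road.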

It can be shown that, for the car to maintain contact with the road at
points near the top of the bridge, an even stricter speed limit must be
applied than that obtained in the above example.

Exercises 17.2

1. The moon orbits the Earth once every 27.3 days in a nearly circular orbit of
radius 3.83 x 108 metres. Find the acceleration of the moon.
2. A car of mass 1000 kg is travelling at a constant speed of 20 m/s as it moves
down one hill and up the next as in the diagram below. Near the lowest point,
the vertical cross-section of the road may be regarded as part of the arc of a
circle of radius 200 metres.

Find (i) the normal reaction of the road on the car and (ii) the net frictional
force acting on the tyres as the car passes over the lowest point on the road.
3. The breaking strength of a 50 cm long string of a simple pendulum is 30
newtons. What is the maximum mass that may be used for the bob of the
pendulum if the speed of the bob at the lowest point of the string is 1 m/s?
4. An unbanked curve on a highway has the shape of a circular arc. What is
the minimum safe radius for the arc if the coefficient of static friction between
the tyres and the road is 0.6 and the speed limit is 100 km/h?

INTRODUCTION TO ASTRONOMY: Exercise 5 below shows how to
derive one of Kepler's three famous laws of planetary motion:
$T^2$ is proportional to $r^3$
where $T$ is the period of revolution of a planet around the sun and $r$ is
the radius of its orbit. To simplify the mathematics, it will be assumed that
the orbits are circular, although it would be more realistic to take them as
ellipses. You will need to assume Newton's Law of Universal Gravitation,
which states:
each pair of bodies in the universe attract each other with
a force which has magnitude $GmM/r^2$
where $m$ and $M$ are the masses of the two bodies, $r$ is the distance between
them, and $G$ is the universal constant of gravitation.

5. Assume that a planet moves around the sun of mass $M$ in a circular orbit of
radius $r$, with angular velocity $\omega$.
(a) Introduce notation for the position vector of the planet at time $t$ and
the unit normal to the circle at the point with this position vector.
(b) From Newton's law of universal gravitation, write down the force acting
on the planet, at time $t$.
(c) Apply Newton's second law of motion to derive the formula
$$\omega^2 = GM/r^3.$$
Hence obtain Kepler's third law.

17.3 The pendulum and linearization

When Galileo performed his famous experiment of rolling balls down
an inclined plane, the device he used to measure equal intervals of time
was similar to an hour-glass, but involving the flow of water rather than
sand. The accuracy obtainable with such a device was very limited and
developments in physics and astronomy during the seventeenth century
called for a more accurate way of measuring time.
This need was met by the invention of the pendulum clock in 1657.
It was invented by the Dutch physicist and mathematician Christiaan
Huygens, to whom the mathematical theory of the pendulum is due.
Although the equation of motion for the pendulum is non-linear, Huygens
was able to obtain useful information about its solutions, by using the
technique of linearization.

Fig. 17.3.1. Angular coordinate for the pendulum.

The model
Our model for the pendulum consists of a light rod of length $\ell$ pivoted
smoothly at one end and with a particle of mass $m$ attached to the other
end. The particle is called the bob of the pendulum. Motion is possible
in a vertical plane. We choose $\theta$ to be the angle which the pendulum
makes with the vertical at time $t$, as shown in Figure 17.3.1.
Thus the bob of the pendulum is free to move in a vertical circle under
gravity. The bob is held on the circle by the force due to the rod, which
is assumed to act along the direction of the rod.

The equation of motion

The derivation of the equation of motion of the bob of the pendulum
uses methods from earlier in this chapter. Working through the details
provides useful revision of these methods and is set below as Exercise 1.
The equation of motion so obtained is
$$\ddot{\theta} = -\frac{g}{\ell}\sin\theta. \tag{1}$$

Fig. 17.3.2. The bob resting and oscillating.

This differential equation is second order, but non-linear. Hence the
methods of solution explained earlier in this book do not apply. Since,
however, the RHS is a smooth function of $\theta$, the existence-uniqueness
theorem of Section 5.2 is still applicable and tells us that there are
solutions of (1) (even though we cannot express them in terms of functions
which we already know).

Some types of solutions

Because we cannot solve the equation of motion (1) in terms of familiar
functions, we reverse the usual procedure : instead of using the solutions
of (1) to tell us about the physics, we use our physical intuition to tell
us about the behaviour of the solutions. In particular, physical intuition
leads us to expect the following types of motion to be possible.
A stable equilibrium point. The lowest point on the circle is where $\theta = 0$,
as shown on the left of Figure 17.3.2. If the bob starts at this point with
zero velocity it will stay there for ever. This means that the equation
of motion (1) should have the constant solution $\theta = 0$. This point is
called an equilibrium point for the bob. The equilibrium point is said to
be stable because, if you give the bob a small push, it stays close to the
equilibrium point.
Small oscillations. If, before being released, the bob is displaced away
from the equilibrium point, then it oscillates. Information about these
oscillations is not so easy to obtain directly from the equation of motion
(1).
In the special case where the oscillations have small amplitude, however,
Huygens was able to obtain useful information about them by
approximating the differential equation (1) by a linear differential
equation which he could solve easily. The approximation process which
he used is now called linearization. As it is very important in the study
of differential equations, it will now be explained in some detail. Those
who have already read Section 9.4 will notice that the ideas are much
the same as in linearising difference equations.

Fig. 17.3.3. Linear approximation based at $\theta = 0$.

Linearization near $\theta = 0$
Huygens knew that during small oscillations the values of $\theta$ stay close
to 0. So instead of using the actual RHS of (1), given by
$$\text{RHS} = -\frac{g}{\ell}\sin\theta,$$
he looked for a simpler RHS which approximates this closely when $\theta$
is small. As we see from the graph in Figure 17.3.3, the tangent at the
point (0,0) follows the graph closely when $\theta$ is small. So following the
tangent rather than the original graph gives a good approximation and
it has the advantage of being linear in $\theta$.
Now the tangent to the graph is the straight line through the origin
with the same slope as the graph at (0, 0). We use differentiation to get
the slope. Thus the new RHS is given by
$$\text{RHS} = -\frac{g}{\ell}\sin'(0)\,\theta = -\frac{g}{\ell}\cos(0)\,\theta = -\frac{g}{\ell}\,\theta.$$
In the differential equation (1) we now substitute this new RHS in place
of the old one. This gives the new, and simpler, differential equation

$$\ddot{\theta} = -\frac{g}{\ell}\,\theta. \tag{2}$$

The differential equation (2) is called the linearization near $\theta = 0$ of the
differential equation (1).
The linearized differential equation (2) is like (1) in that it is a second-
order differential equation, but is unlike (1) in that it is linear and it can
be solved easily. It is, in fact, just the SHM equation from Chapter 5,
and so we know its solutions are given by

$$\theta = A\sin\left(\sqrt{\frac{g}{\ell}}\,t + \varepsilon\right) \tag{3}$$

where $A$ and $\varepsilon$ are arbitrary constants. These solutions are oscillatory
with period given by

$$T = 2\pi\sqrt{\frac{\ell}{g}}, \tag{4}$$
which is independent of the amplitude $A$.
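
To see how good the linearization is, we can integrate the non-linear equation (1) numerically and compare the observed period with formula (4). The sketch below is not part of Huygens' argument: it uses a standard fourth-order Runge-Kutta step, and the length $\ell = 1$ m, the value $g = 9.8$ m/s$^2$, the step size and the sample amplitudes are all assumptions chosen for illustration. The bob is released from rest at angle `amplitude` (which must lie strictly between 0 and $\pi$), and the period is taken as four times the time needed to reach $\theta = 0$.

```python
import math

def linear_period(length=1.0, g=9.8):
    """Period of the linearized pendulum, formula (4)."""
    return 2 * math.pi * math.sqrt(length / g)

def pendulum_period(amplitude, length=1.0, g=9.8, dt=1e-4):
    """Period of the non-linear pendulum (1), released from rest at `amplitude`.

    Integrates theta'' = -(g/l) sin(theta) with RK4 until theta first reaches 0
    (one quarter of a period), then multiplies the elapsed time by 4.
    """
    k = g / length

    def accel(theta):
        return -k * math.sin(theta)

    theta, omega, t = amplitude, 0.0, 0.0
    while True:
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        new_theta = theta + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        new_omega = omega + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        if new_theta <= 0.0:
            # Linear interpolation for the instant at which theta crosses zero.
            frac = theta / (theta - new_theta)
            return 4 * (t + frac * dt)
        theta, omega, t = new_theta, new_omega, t + dt

T_lin = linear_period()
for amplitude in (0.05, 0.5, 1.0):
    T = pendulum_period(amplitude)
    print(f"amplitude {amplitude:4.2f} rad: period {T:.4f} s, ratio to (4): {T / T_lin:.4f}")
```

For small amplitudes the ratio should be very close to 1, while for larger amplitudes the true period is noticeably longer than (4) predicts; this is why the linearized model is trusted only for small oscillations.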

Exercises 17.3
1. Show that $\theta = 0$ is an equilibrium point of the differential equation
$$\ddot{\theta} = \theta - \sin(2\theta)$$
and then linearise the differential equation near this equilibrium point.
Part five
Coupled Models
18
Models with linear interactions

In this chapter we look at simple models in which two quantities interact

with each other. The models lead to pairs of simultaneous differential
equations for the two quantities and, because of the interaction, the
equations are coupled.
The models considered will be simple enough to produce differential
equations which are first-order linear, with constant coefficients. A
systematic method for uncoupling and hence solving such equations
will be explained. It involves eliminating one of the two quantities to
give a second-order differential equation of the type studied in Chapter
15.
The ideas will be illustrated by a mixing model, similar to the one in
Chapter 13, but involving a pair of vats. Two models from physiology
are then presented: the first models the glucose-insulin homeostasis in
the bloodstream, while the second models the mother-fetus exchange of
nutrients via the placenta.

18.1 Two-compartment mixing

From now on we shall be considering models which lead to a pair

of simultaneous differential equations, rather than a single differential
equation. The equations will involve a pair of quantities, rather than a
single quantity, which are to be expressed as functions of the time, say. In
this section we show how an extension of the mixing problem discussed
in Chapter 13 leads to such a pair of equations.


Fig. 18.1.1. Two interconnected vats.

Mixing with two vats

Consider two interconnected vats, each containing a mixture of dye and
water, as in Figure 18.1.1. Dye runs into the first vat. Mixture from the
first vat is pumped into the second vat and vice versa. As part of the
model we assume the mixture in each vat is well stirred and hence is
homogeneous (of uniform concentration throughout).
Given the various rates of flow of the liquids into and out of the two
vats, we shall show how to set up simultaneous differential equations for
the concentrations of the two mixtures.

Formulating the equations: an example

The following problem illustrates the procedure for setting up the
differential equations.

Assume that initially the two vats each contain 100 litres of pure water.
Pure dye is then pumped into the first vat at a fixed rate of 1
litre/minute, while pure water is pumped into the second vat at the
same rate. Pumps exchange the mixtures between the two vats - at
a rate of 4 litres/minute from vat 1 into vat 2, and 3 litres/minute
from vat 2 into vat 1. The diluted mixture is drawn off from vat 2 at a
rate of 2 litres/minute. Derive a pair of differential equations for the
concentrations of dye in each vat.

The steps involved in the solution follow closely those used in the
example in Section 13.1. First we draw a diagram and show the rates of
flow of the liquids into the vats, as in Figure 18.1.2.
We then note that, for vat 1, liquid (pure dye and mixture 2) flows in
at the rate of 4 litres/minute. This is exactly balanced by the outflow
of mixture 1 at 4 litres/minute. Hence the liquid in vat 1 maintains its
initial volume of 100 litres. A similar argument shows that the volume
of the mixture in vat 2 also stays fixed.

Fig. 18.1.2. Rates of flow of the liquids.

Fig. 18.1.3. Volumes and concentrations at time $t \ge 0$.

The next step is to introduce notation for the quantities involved and
to show them on the diagram.
Let $x_1$ and $x_2$ litres be the volumes of dye in each tank at time $t \ge 0$.
Hence the concentrations are
$$c_1 = \frac{\{\text{volume of dye in vat 1}\}}{\{\text{volume of mixture in vat 1}\}} = \frac{x_1}{100}, \qquad c_2 = \frac{\{\text{volume of dye in vat 2}\}}{\{\text{volume of mixture in vat 2}\}} = \frac{x_2}{100}. \tag{1}$$
These volumes and concentrations are shown in Figure 18.1.3.
We now consider what happens to the volumes of the dye in the vats
during a small time interval from $t$ to $t + \delta t$. The volumes of dye in the
respective vats at time $t$ are denoted by $x_1$ and $x_2$; at time $t + \delta t$ they
become $x_1 + \delta x_1$ and $x_2 + \delta x_2$.
Thus $\delta x_1$ is the change in volume of the dye in vat 1 during this time
interval and hence
$$\delta x_1 = \{\text{volume of dye flowing into vat 1}\} - \{\text{volume of dye flowing out of vat 1}\}.$$
From Figure 18.1.2 it now follows that
$$\delta x_1 = \{\text{volume of pure dye flowing into vat 1}\} + \{\text{volume of dye in mixture 2 flowing into vat 1}\} - \{\text{volume of dye in mixture 1 flowing out of vat 1}\}$$
$$\approx (1\,\delta t) \times 1 + (3\,\delta t) \times \frac{x_2}{100} - (4\,\delta t) \times \frac{x_1}{100}. \tag{2}$$

Here the first factor in each term is the volume of the appropriate mixture
flowing into or out of vat 1 during the time $\delta t$. These factors are each
multiplied by the fraction of dye (concentration) in that mixture to give
the volume of the dye. The final result is only an approximation because
the concentrations change slightly during the time interval.
Similarly, $\delta x_2$ is the change in volume of the dye in vat 2 during this
time interval and hence
$$\delta x_2 = \{\text{volume of dye flowing into vat 2}\} - \{\text{volume of dye flowing out of vat 2}\}$$
$$\approx (4\,\delta t) \times \frac{x_1}{100} - (3\,\delta t) \times \frac{x_2}{100} - (2\,\delta t) \times \frac{x_2}{100}. \tag{3}$$

As in Section 13.1, we can show that the errors involved in (2) and (3)
are small compared with $\delta t$. Hence dividing each of (2) and (3) by $\delta t$,
and then letting $\delta t$ approach 0 gives
$$\frac{dx_1}{dt} = 1 - \frac{4}{100}x_1 + \frac{3}{100}x_2, \qquad \frac{dx_2}{dt} = \frac{4}{100}x_1 - \frac{5}{100}x_2. \tag{4}$$

Thus we have derived a pair of simultaneous differential equations for
$x_1$ and $x_2$ as functions of $t$. The initial conditions
$$x_1 = 0 \quad\text{and}\quad x_2 = 0 \quad\text{when } t = 0 \tag{5}$$
must also be satisfied, since there is no dye in either vat initially.

We could solve the initial-value problem (4)-(5) for $x_1$ and $x_2$ as
functions of $t$, and then obtain the concentrations $c_1$ and $c_2$ from (1).
Alternatively, we can use (1) to get
$$x_1 = 100c_1 \quad\text{and}\quad x_2 = 100c_2.$$
Substitution in (4) then gives the following pair of simultaneous
differential equations for $c_1$ and $c_2$ as functions of time,
$$\frac{dc_1}{dt} = \frac{1}{100} - \frac{4}{100}c_1 + \frac{3}{100}c_2, \qquad \frac{dc_2}{dt} = \frac{4}{100}c_1 - \frac{5}{100}c_2, \tag{6}$$
while the initial condition (5) now becomes
$$c_1 = 0 \quad\text{and}\quad c_2 = 0 \quad\text{when } t = 0. \tag{7}$$
Thus we have obtained the desired pair of simultaneous differential
equations for the concentrations, together with the relevant initial conditions.
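
The small-time-interval balances (2) and (3) can also be stepped forward directly on a computer, which gives a rough picture of how the concentrations behave before any solution method is available. The sketch below simply applies (2) and (3) repeatedly with a fixed small $\delta t$ (this is Euler's method); the function name and the choice $\delta t = 0.05$ minutes are assumptions made for illustration.

```python
def simulate_mixing(t_end, dt=0.05):
    """Step the dye-volume balances (2) and (3) from t = 0 to t = t_end (minutes).

    Returns the concentrations (c1, c2) = (x1/100, x2/100) at time t_end.
    """
    x1 = x2 = 0.0                                   # initial conditions (5): no dye
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Balance (2): pure dye in, mixture 2 in, mixture 1 out.
        dx1 = (1 * dt) * 1 + (3 * dt) * x2 / 100 - (4 * dt) * x1 / 100
        # Balance (3): mixture 1 in, mixture 2 out to vat 1, mixture 2 drawn off.
        dx2 = (4 * dt) * x1 / 100 - (3 * dt) * x2 / 100 - (2 * dt) * x2 / 100
        x1 += dx1
        x2 += dx2
    return x1 / 100, x2 / 100

for minutes in (0, 30, 60, 120, 240, 2000):
    c1, c2 = simulate_mixing(minutes)
    print(f"t = {minutes:5d} min: c1 = {c1:.4f}, c2 = {c2:.4f}")
```

Both concentrations start at zero, increase, and appear to level off after a long time.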

Steady-state solutions
How to solve pairs of simultaneous differential equations like (4) and
(6) will be explained in the next section. One thing we can do straight
away, however, is to look for the constant solutions of the differential
equations (which correspond to steady-states or equilibrium states of the
mixing problem). These solutions can be found by setting
$$\frac{dc_1}{dt} = 0 \quad\text{and}\quad \frac{dc_2}{dt} = 0$$
in (6) to get the simultaneous pair of linear algebraic equations
$$-4c_1 + 3c_2 = -1, \qquad 4c_1 - 5c_2 = 0. \tag{8}$$
These equations have the solution
$$c_1 = \frac{5}{8} \quad\text{and}\quad c_2 = \frac{1}{2},$$
which are therefore the steady-state concentrations for the mixing problem.
The next question to ask is whether the actual concentrations approach
these steady-state concentrations as time goes on. We will be able to
answer this in the next section, by taking limits as $t \to \infty$, after we have
found the time-dependent solutions of the initial-value problem (6)-(7).
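
The pair of algebraic equations (8) is small enough to solve by hand, but the same steady-state calculation recurs in the exercises, so a short sketch may be useful. It solves a general $2 \times 2$ linear system by Cramer's rule using exact fractions; the function name `solve_2x2` is our own.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*u + a12*v = b1, a21*u + a22*v = b2 by Cramer's rule.

    Integer (or Fraction) inputs give an exact Fraction answer.
    """
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the equations have no unique solution")
    u = Fraction(b1 * a22 - a12 * b2, det)
    v = Fraction(a11 * b2 - b1 * a21, det)
    return u, v

# Equations (8): -4*c1 + 3*c2 = -1 and 4*c1 - 5*c2 = 0.
c1, c2 = solve_2x2(-4, 3, 4, -5, -1, 0)
print("steady-state concentrations:", c1, c2)
```

For the system (8) this reproduces the values $c_1 = 5/8$ and $c_2 = 1/2$ found above.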
Exercises 18.1
1. Initially two vats each contain 50 litres of pure water. Pure dye is then
pumped into the first vat at a fixed rate of 2 litres/minute, while pure water is
pumped into the second vat at a rate of 2 litres/minute. Pumps exchange the
mixtures between the two vats - at a rate of 6 litres/minute from vat 1 into vat
2, and 4 litres/minute from vat 2 into vat 1. The diluted mixture is drawn off
from vat 2 at a rate of 4 litres/minute.
Show that the concentrations of dye in the vats at time $t$ minutes after the
start satisfy the following simultaneous differential equations:
$$\frac{dc_1}{dt} = \frac{1}{25} - \frac{3}{25}c_1 + \frac{2}{25}c_2, \qquad \frac{dc_2}{dt} = \frac{3}{25}c_1 - \frac{4}{25}c_2.$$
What are the initial conditions for these differential equations?
2. Find the steady-state concentrations for the mixing problem in Exercise 1.
3. As in Exercise 1 except that vat 1 contains 50 litres of pure water initially
and vat 2 contains 100 litres of a 25% mixture of dye initially.
4. As in Exercise 1 except that dye is now pumped into the first vat at the
rate of 3 litres/minute. Show now that the concentrations satisfy the following
differential equations until such time as the first vat overflows:
$$\dot{c}_1 = (3 - 7c_1 + 4c_2)/(50 + t), \qquad \dot{c}_2 = (6c_1 - 8c_2)/50.$$

18.2 Solving constant-coefficient equations

In this section we introduce a procedure which is capable of solving the
simultaneous pair of differential equations from the previous section. Let
$A$, $B$, $C$, $D$, $P$ and $Q$ be given functions of $t$. A pair of simultaneous
differential equations of the form
$$\frac{dx}{dt} = Ax + By + P, \qquad \frac{dy}{dt} = Cx + Dy + Q \tag{1}$$
is said to be first order because it involves only the first derivatives of $x$
and $y$, and linear because of the linear way in which $x$ and $y$ appear on
the RHS. If $P = Q = 0$ then it is said to be homogeneous.
When $A$, $B$, $C$, $D$ are constants, the pair of equations is called constant
coefficient. This is the case to which the solution procedure is applicable,
and it includes the pair of differential equations of the previous section,
for the concentrations of the dye.

Example 1. Determine in each case if the pair of differential equations is linear,
constant coefficient, or homogeneous.
(a) $\dfrac{dx}{dt} = 3x - 2xy, \qquad \dfrac{dy}{dt} = 2x - y$
(b) $\dfrac{dx_1}{dt} = 2x_1 + 2t\,x_2, \qquad \dfrac{dx_2}{dt} = x_1 + x_2$
(c) $\dfrac{dx}{dt} = 7x - 2y + t^2, \qquad \dfrac{dy}{dt} = 3x - y + 1$
(d) $\dfrac{dx}{dt} = 7x - 2y, \qquad \dfrac{dy}{dt} = 3x - y$
Solution. The pair (a) is not linear because of the $2xy$ term in the first equation.
All other pairs are linear. The pair (b) is not constant coefficient, because of the
coefficient $2t$ in the first equation, but (c) and (d) are constant coefficient. The
pairs (b) and (d) are homogeneous.

A solution of the pair of differential equations (1) is a pair of functions
$\phi$ and $\psi$ such that when we substitute
$$x = \phi(t) \quad\text{and}\quad y = \psi(t)$$
into (1) we get LHS = RHS for each of the pair of differential equations.
The domain of each function is assumed to be an interval, chosen as
large as possible.

The solution procedure

We now give a step-by-step procedure for solving a pair of simultaneous
differential equations of the form
$$\dot{x} = ax + by \tag{2a}$$
$$\dot{y} = cx + dy \tag{2b}$$
where $a$, $b$, $c$, $d$ are constants with $b \neq 0$. To simplify the algebra, we will
explain the solution procedure only for a pair of homogeneous equations.
The extension to inhomogeneous ones is left as Exercise 3. (You should
try to understand why the procedure works. Do not try to memorize the
detailed results of each step.)
STEP 1: Find a second-order differential equation for $x$ alone by
differentiating (2a) to get
$$\ddot{x} = a\dot{x} + b\dot{y}$$
$$= a\dot{x} + b(cx + dy) \quad \text{(removing } \dot{y} \text{ with (2b))}$$
$$= a\dot{x} + b\left(cx + d\,\frac{\dot{x} - ax}{b}\right) \quad \text{(removing } y \text{ with (2a))}.$$
Thus
$$\ddot{x} = (a + d)\dot{x} + (bc - ad)x. \tag{3}$$

This is the desired second-order differential equation involving $x$ and its
derivatives (but not $y$). It is linear with constant coefficients; hence it can
be solved as in Chapter 15.
STEP 2: Solve (3) for $x$ as a function of $t$ (involving two arbitrary
constants $\alpha_1$ and $\alpha_2$).
STEP 3: Use (2a) to get $y$ as a function of $t$ (involving $\alpha_1$ and $\alpha_2$). By
(2a)
$$y = \frac{1}{b}\left(\dot{x} - ax\right).$$
We now use the solution obtained for $x$ as a function of $t$ in Step 2 to
replace the RHS of this formula by a function of $t$.

Remarks about the procedure

If at Step 3 we had used (2b) instead of (2a), we would have obtained
a differential equation for $y$ rather than an algebraic equation. This
would have made a lot of extra work for us.
The solution obtained at Steps 2 and 3 for $x$ and $y$ involves two
arbitrary constants $\alpha_1$ and $\alpha_2$. Specific values of these constants can be
found so as to satisfy initial conditions of the form
$$x = x_0 \quad\text{and}\quad y = y_0 \quad\text{when } t = 0.$$
The assumption $b \neq 0$ is needed for the validity of the above procedure.
(In Exercise 5 you are asked to determine how to proceed if $b = 0$.)
The above procedure is not the only one which can be used to solve
simultaneous linear differential equations with constant coefficients.
A particularly efficient method (using more matrix theory and linear
algebra than this book assumes) may be seen in Braun (1975), for
example.
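
The three steps of the procedure can be sketched in code. The sketch below is restricted to the homogeneous case and assumes that the characteristic equation of the second-order equation for $x$ has two distinct real roots (the other cases treated in Chapter 15 are not handled). The function name `solve_pair` and the test system $\dot{x} = y$, $\dot{y} = x$ are hypothetical choices made for illustration.

```python
import math

def solve_pair(a, b, c, d, x0, y0):
    """Solve x' = a*x + b*y, y' = c*x + d*y with x(0) = x0, y(0) = y0.

    Sketch only: requires b != 0 and distinct real characteristic roots.
    Returns two functions x(t) and y(t).
    """
    if b == 0:
        raise ValueError("the procedure requires b != 0")
    # Step 1: x'' = (a + d) x' + (b*c - a*d) x, so the characteristic
    # equation is r^2 - (a + d) r - (b*c - a*d) = 0.
    p, q = a + d, b * c - a * d
    disc = p * p + 4 * q
    if disc <= 0:
        raise ValueError("this sketch handles distinct real roots only")
    r1 = (p + math.sqrt(disc)) / 2
    r2 = (p - math.sqrt(disc)) / 2
    # Step 2: x = k1*exp(r1*t) + k2*exp(r2*t). The constants follow from
    # x(0) = x0 and x'(0) = a*x0 + b*y0 (equation (2a) at t = 0).
    v0 = a * x0 + b * y0
    k1 = (v0 - r2 * x0) / (r1 - r2)
    k2 = x0 - k1

    def x(t):
        return k1 * math.exp(r1 * t) + k2 * math.exp(r2 * t)

    def x_dot(t):
        return k1 * r1 * math.exp(r1 * t) + k2 * r2 * math.exp(r2 * t)

    # Step 3: y = (x' - a*x) / b, from (2a).
    def y(t):
        return (x_dot(t) - a * x(t)) / b

    return x, y

# Hypothetical test system x' = y, y' = x with x(0) = 1, y(0) = 0,
# whose solution is x = cosh(t), y = sinh(t).
x, y = solve_pair(0.0, 1.0, 1.0, 0.0, 1.0, 0.0)
print(x(1.0), math.cosh(1.0))
print(y(1.0), math.sinh(1.0))
```

Note that Step 3 is purely algebraic, exactly as remarked above: once $x$ is known, $y$ is obtained without solving a second differential equation.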

Application to the mixing problem

By using the above procedure, we can now solve the differential equations
for the mixing problem modelled in the previous section.

Example 2. Find the solution of the differential equations

_ 1 4 3
c'
100 c' + C2 (4)
100 100
4 5
62 = Cl - 100c2 (5)
100
which satisfies the initial conditions cl = c2 = O when t = 0.
Solution. We follow the steps in the above procedure.
STEP 1: We find a second-order equation for c1, by differentiating (4) to get
4 3
62
100 c' + 100
_ 4 3 4 5
(by (5))
100 c' + 100 100 c' - 100c2
_ 4 3 4 5 100 4
i-00" + 100
(j-Ci - 100 3
c'
1

100 + 100
cl (by (4)).
Thus
9 8 5
c'
100
c' - 1002 c' + 1002
or
.
c'+100 e'
9 8
c'__ 5
(6)
+1002 1002
which is a second-order linear constant-coefficient differential equation.
STEP 2: We solve (6) for $c_1$ as a function of $t$, by using the method from
Chapter 15. Substituting
\[ c_1 = e^{\lambda t} \]
into the homogenized version of (6) gives the characteristic equation
\[ \lambda^2 + \frac{9}{100}\lambda + \frac{8}{100^2} = 0 \]
which has the roots $\lambda_1 = -\frac{8}{100}$ and $\lambda_2 = -\frac{1}{100}$.
Looking for a constant solution of (6) itself gives $c_1 = \frac{5}{8}$. Hence,
by the method given in Chapter 15, we obtain
\[ c_1 = a_1 e^{-\frac{8}{100}t} + a_2 e^{-\frac{1}{100}t} + \frac{5}{8} \tag{7} \]
where $a_1$ and $a_2$ are constants whose values are to be found from the
initial conditions.
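STEP 3: We find $c_2$ as a function of $t$, by substituting (7) back into (4). This
gives
\[
\begin{aligned}
c_2 &= \frac{100}{3}\left(\dot{c}_1 + \frac{4}{100}c_1 - \frac{1}{100}\right) \\
&= \frac{100}{3}\left(-\frac{8}{100}a_1 e^{-\frac{8}{100}t} - \frac{1}{100}a_2 e^{-\frac{1}{100}t}
 + \frac{4}{100}\left(a_1 e^{-\frac{8}{100}t} + a_2 e^{-\frac{1}{100}t} + \frac{5}{8}\right) - \frac{1}{100}\right).
\end{aligned}
\]
Thus
\[ c_2 = -\frac{4}{3}a_1 e^{-\frac{8}{100}t} + a_2 e^{-\frac{1}{100}t} + \frac{1}{2}. \tag{8} \]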

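STEP 4: Finally the initial conditions, $c_1 = c_2 = 0$ when $t = 0$, imply by (7)
and (8) that
\[ a_1 + a_2 = -\frac{5}{8}, \qquad \frac{4}{3}a_1 - a_2 = \frac{1}{2}. \]
These equations have the solutions $a_1 = -\frac{3}{56}$ and $a_2 = -\frac{4}{7}$. Hence
\[
\begin{aligned}
c_1 &= -\frac{3}{56}e^{-\frac{8}{100}t} - \frac{4}{7}e^{-\frac{1}{100}t} + \frac{5}{8}, \\
c_2 &= \frac{1}{14}e^{-\frac{8}{100}t} - \frac{4}{7}e^{-\frac{1}{100}t} + \frac{1}{2}.
\end{aligned}
\tag{9}
\]
It is easy to verify that $c_1$ and $c_2$, given as functions of $t$ by (9), satisfy both the
differential equations (4) and (5) and the initial conditions.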
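
The verification mentioned above can also be carried out numerically. The
following is a minimal sketch, not part of the original text, which compares the
closed-form solution (9) with a direct numerical integration of (4) and (5).
The function names and the choice of the classical Runge-Kutta method are
assumptions made only for this illustration.

```python
import math


def closed_form(t):
    """Concentrations c1, c2 from equation (9) at time t (minutes)."""
    e8 = math.exp(-8 * t / 100)
    e1 = math.exp(-t / 100)
    c1 = -3 / 56 * e8 - 4 / 7 * e1 + 5 / 8
    c2 = 1 / 14 * e8 - 4 / 7 * e1 + 1 / 2
    return c1, c2


def rhs(c1, c2):
    """Right-hand sides of equations (4) and (5)."""
    dc1 = 1 / 100 - 4 / 100 * c1 + 3 / 100 * c2
    dc2 = 4 / 100 * c1 - 5 / 100 * c2
    return dc1, dc2


def rk4(t_end, steps=2000):
    """Integrate (4), (5) from c1 = c2 = 0 up to t_end with classical RK4."""
    h = t_end / steps
    c1 = c2 = 0.0
    for _ in range(steps):
        k1 = rhs(c1, c2)
        k2 = rhs(c1 + h / 2 * k1[0], c2 + h / 2 * k1[1])
        k3 = rhs(c1 + h / 2 * k2[0], c2 + h / 2 * k2[1])
        k4 = rhs(c1 + h * k3[0], c2 + h * k3[1])
        c1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return c1, c2


if __name__ == "__main__":
    for t in (0, 60, 120, 240):
        print(t, closed_form(t), rk4(t))
```

If the two columns printed for each time agree to many decimal places, that
supports the algebra in Steps 1 to 4.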

Discussion of the solutions

From (9) it follows that $c_1 \to \frac{5}{8}$ and $c_2 \to \frac{1}{2}$ as $t \to \infty$. This means that
the concentrations approach the steady-state solutions in the previous
section as the time becomes arbitrarily large. The graphs of $c_1$ and $c_2$
against time are shown in Figure 18.2.1. The graphs show that it takes
about 4 hours for the concentration $c_2$ of the mixture being drawn off
to reach 90% of its steady-state value.
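
The 90% figure can be checked directly from the formula for c2 in (9) instead
of being read off the graph. The sketch below is an illustration only; the
helper name `time_to_fraction` and the use of bisection are assumptions, and
time is taken to be measured in minutes as in the figure.

```python
import math


def c2(t):
    """Concentration c2 from equation (9), with t in minutes."""
    return math.exp(-8 * t / 100) / 14 - 4 / 7 * math.exp(-t / 100) + 1 / 2


def time_to_fraction(fraction, lo=0.0, hi=1000.0, tol=1e-9):
    """Bisection for the time at which c2 reaches `fraction` of its limit 1/2.

    This relies on c2 increasing steadily from 0 towards 1/2.
    """
    target = fraction * 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if c2(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    t90 = time_to_fraction(0.9)
    print(f"c2 reaches 90% of its limit after {t90:.1f} minutes ({t90 / 60:.2f} hours)")
```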

Exercises 18.2
1. In each of the following cases, state if the pair of simultaneous differential
equations is linear, constant-coefficient, or homogeneous.
(a) \[ \frac{dx}{dt} = 3x - 7y, \qquad \frac{dy}{dt} = x - ty^2 \]
(b) \[ \frac{dx}{dt} = 7x - 2y, \qquad \frac{dy}{dt} = x - ty^2 \]
(c) \[ \frac{dx_1}{dt} = 2x_1 - 2x_2, \qquad \frac{dx_2}{dt} = 2x_1 - 2x_2 \]
(d) \[ \frac{du}{dt} = 2u + v + \cos(t), \qquad \frac{dv}{dt} = v + 2u + \sin(t) \]
2. Use the procedure described in the text to find the solution of the pair of
simultaneous linear constant-coefficient differential equations
\[ \dot{x} = x + 2y, \qquad \dot{y} = 2x + y \]
which satisfies the initial conditions $x = y = 1$ when $t = 0$.

Fig. 18.2.1. Concentrations approach the steady-state. (Panels: $c_1$ against $t$
and $c_2$ against $t$, with $t$ from 0 to 240.)

3. Show that the procedure given in the text may be used to solve pairs of
simultaneous differential equations of the form

\[ \dot{x} = ax + by + p, \qquad \dot{y} = cx + dy + q \]
where a, b, c, d are constants and p, q are given functions of t (thereby extend-
ing the procedure to linear constant-coefficient equations which may not be
homogeneous).

4. In Exercise 1 of Section 18.1 the following system of differential equations
was obtained:
\[ \frac{dc_1}{dt} = \frac{1}{25} - \frac{3}{25}c_1 + \frac{2}{25}c_2, \qquad
   \frac{dc_2}{dt} = \frac{3}{25}c_1 - \frac{4}{25}c_2. \]
Using appropriate initial conditions, solve this system to find $c_1$ as a function
of $t$.

5. The procedure in the text shows how to solve the pair of linear
constant-coefficient differential equations
\[ \dot{x} = ax + by, \qquad \dot{y} = cx + dy \]
when $b \neq 0$. Explain how you would solve these equations in each of the
remaining cases (i) $b = c = 0$ and (ii) $b = 0$ but $c \neq 0$.
18.3 A model for detecting diabetes
Glucose, an end product of carbohydrate digestion, is converted into
energy in the cells of the body. The hormone insulin, secreted by the
pancreas, facilitates the absorption of glucose by cells other than those
of the brain and nervous system.
A delicate balance is normally maintained between the amounts of
glucose and insulin in the bloodstream. If the insulin concentration is
too low, then too little glucose is absorbed from the bloodstream; the
unabsorbed glucose is then lost in the urine along with other nutrients.
If, on the other hand, the insulin concentration is too high, then too
much glucose is absorbed by cells other than those of the brain and
nervous system; lack of glucose available to the cells of the brain then
impairs its function. The end result in either case, whether too little or
too much insulin, can be coma and even death.
In the medical disorder Diabetes Mellitus, not enough insulin is secreted
by the pancreas. People suffering from this require supplements of insulin
in the form of regular injections, together with a modification of their
diet to regulate glucose input. In this section a simple model of the
interaction between glucose and insulin in the body is presented; we then
use this model to discuss a clinical test for the detection of mild forms
of diabetes.

The model
The main features that a model of the glucose-insulin regulation system
must take into account are as follows.
(a) A rise in the concentration of glucose in the bloodstream results in
the liver absorbing more of the glucose, which it converts and stores
as glycogen; a drop in the concentration of glucose reverses the
process.
(b) A rise in the concentration of insulin in the bloodstream enables the
glucose to pass more readily through the membranes of the cells in
skeletal muscle, resulting in greater absorption of glucose from the
bloodstream.
(c) A rise in the concentration of glucose in the bloodstream stimulates
the pancreas to produce insulin at a faster rate; a drop in the glucose
concentration lowers the rate of insulin production.
(d) Insulin, produced by the pancreas, is constantly being degraded by
the liver.

The model omits details of the biochemistry involved and ignores the
effects of other hormones. It treats the bloodstream, moreover, as if
it were contained in a single compartment throughout which concen-
trations of glucose and insulin are uniform at each instant. In spite
of these simplifications, the model is nonetheless suitable as a basis for
understanding what is, in reality, a complicated situation.
Provided there has been no recent digestion, glucose and insulin con-
centrations will be in equilibrium. We are interested in how the system
responds to a change in that equilibrium. Thus we put
\[ g = \{\text{excess glucose concentration}\}, \qquad h = \{\text{excess insulin concentration}\}, \]
at time $t$. We use `h' because insulin is a hormone. Equilibrium occurs
for $g = h = 0$. Positive values of $g$ or $h$ correspond to concentrations
greater than the equilibrium values and negative values to concentrations
less than the equilibrium values.
If either of g or h is given a non-zero value, then the body tries to
restore the equilibrium. We assume that the rates of change of these
quantities depend only on the values of g and h so that

\[ \frac{dg}{dt} = F_1(g, h), \qquad \frac{dh}{dt} = F_2(g, h), \]

for some functions $F_1$ and $F_2$.

The simplest way to construct a model is to assume that these differ-
ential equations are linear with constant coefficients. Since g = h = 0 are
equilibrium solutions, it now follows that the linear differential equations
must be homogeneous. Hence we assume them to be of the form
\[ \frac{dg}{dt} = -ag - bh \tag{1} \]
\[ \frac{dh}{dt} = cg - dh \tag{2} \]

where a, b, c and d are constants.

Signs of the coefficients

To determine how the solutions of the differential equations behave, it
is necessary to have some more information about the coefficients which
occur in them. A particularly useful fact is that each of a, b, c and d is
positive.
To illustrate why this is so we show that d > 0. We do this by looking
at what must happen if initially g = 0 and h > 0. From (2)
\[ \frac{dh}{dt} = -dh \]
at the initial instant. But the liver will immediately start to degrade the
insulin, as noted in (d) above, since the concentration of the insulin has
exceeded its equilibrium value. Thus its concentration starts to drop so
that initially
\[ \frac{dh}{dt} < 0. \]
Hence the previous equation shows that d must be positive.
In Exercise 1 it is left for you to derive in a similar way that a, b and
c must be positive too. The signs found for the coefficients will be used
later to predict some features of the behaviour of the model.

Testing for diabetes

In a glucose tolerance test a patient is asked to fast overnight and the
following morning is given an injection of glucose. Blood samples are then
taken at subsequent times and the concentration of glucose measured,
to test the response of the glucose-insulin regulatory system. We might
expect the glucose concentration to return after a time to the equilibrium
level and to take longer in diabetic patients than in normal ones.
In modelling this test we suppose that during the short time interval
while the glucose is being injected, the insulin concentration stays zero.
Thus if $g_0$ is the total amount of glucose injected, then
\[ g = g_0, \quad h = 0, \quad \text{at } t = 0. \tag{3} \]

Our model for the glucose and insulin concentrations is thus the solution
of the pair of differential equations (1), (2) with the initial conditions (3).
To solve the differential equations (1), (2) we apply the procedure from
the previous section and so obtain the second-order equation
\[ \ddot{g} + (a + d)\dot{g} + (ad + bc)g = 0. \tag{4} \]

The solution for $g$ as a function of $t$ can then be substituted into

\[ h = -\frac{1}{b}(\dot{g} + ag). \tag{5} \]

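Table 18.3.1. Different types of solutions for the glucose-insulin model.

Solutions of characteristic equation : Solutions of differential equation

Two real solutions $\lambda_1$ and $\lambda_2$ : $g = \dfrac{g_0}{\lambda_2 - \lambda_1}\left((a + \lambda_2)e^{\lambda_1 t} - (a + \lambda_1)e^{\lambda_2 t}\right)$

One real solution $\lambda$ : $g = g_0\left(1 - (a + \lambda)t\right)e^{\lambda t}$

Two unreal solutions $\lambda = -\alpha \pm i\omega$ : $g = g_0\left(\cos(\omega t) - \dfrac{a - \alpha}{\omega}\sin(\omega t)\right)e^{-\alpha t}$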

The initial conditions for (4), which can be found from (3) and (5), are

\[ g = g_0, \quad \dot{g} = -ag_0 \quad \text{at } t = 0. \tag{6} \]

To solve the second-order equation (4) we use the procedure of Chapter
15. The type of solution obtained will depend on the number of real
solutions of the characteristic equation

\[ \lambda^2 + (a + d)\lambda + (ad + bc) = 0. \tag{7} \]

This in turn will depend on the values of a, b, c and d. The solutions for
$g$ in the various cases are listed in Table 18.3.1.
From the fact that a, b, c and d are all positive it can be shown that
the solutions of (7) for $\lambda$ are negative or have negative real part. Hence
the factors $e^{\lambda_1 t}$, $e^{\lambda_2 t}$, $e^{\lambda t}$ and $e^{-\alpha t}$ which occur in the above table must all
decay exponentially with time. Thus our model predicts that the glucose
concentration will approach its original undisturbed value with sufficient
lapse of time. This is just as we would expect.
In these solutions $g$ is a linear function of $g_0$. This is a consequence of
our assumption that the differential equations are linear. It can be shown
that these solutions are good approximations to those of any smooth
non-linear model of the problem, if $g_0$ is sufficiently small.
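
The claim about the roots of the characteristic equation can be explored
numerically before it is proved. The sketch below is an illustration only: the
function name `characteristic_roots` and the random sampling range are
assumptions, and a numerical experiment of this kind is no substitute for the
argument asked for in the exercises.

```python
import cmath
import random


def characteristic_roots(a, b, c, d):
    """Roots of lambda^2 + (a + d) lambda + (a d + b c) = 0, equation (7)."""
    p = a + d
    q = a * d + b * c
    disc = cmath.sqrt(p * p - 4 * q)  # complex square root handles both cases
    return (-p + disc) / 2, (-p - disc) / 2


if __name__ == "__main__":
    random.seed(0)
    for _ in range(5):
        # Arbitrary positive coefficients, chosen only for illustration.
        a, b, c, d = (random.uniform(0.1, 5.0) for _ in range(4))
        r1, r2 = characteristic_roots(a, b, c, d)
        print(f"{r1.real:+.3f}{r1.imag:+.3f}i   {r2.real:+.3f}{r2.imag:+.3f}i")
```

In every sampled case the printed real parts should be negative, in line with
the statement in the text.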

Experimental results
The model can be used to make numerical predictions about the
concentration of glucose at various times once the constants occurring in
the differential equations (1), (2) are known. In order to determine these
constants, Bolie (1961) used three different methods, based on data from
previous experiments with dogs, which he extrapolated to humans.
Measuring glucose in grams, insulin in `units' and time in hours, he obtained
the following averaged values for normal individuals:

a = 2.92, b = 4.34, c = 0.208, d = 0.780,

measured in units corresponding to grams for mass and hours for time.
Substituting these values into the characteristic equation (7) and then
solving for $\lambda$ gives

\[ \lambda_1 = -1.36, \qquad \lambda_2 = -2.34. \]
Thus the characteristic equation has two real roots. Hence we can
substitute the numerical values into the first row of Table 18.3.1 to get $g$
as a function of $t$ and then use (5) to get $h$. This gives
\[
\begin{aligned}
g &= g_0\left(-0.56e^{\lambda_1 t} + 1.56e^{\lambda_2 t}\right), \\
h &= 0.202g_0\left(e^{\lambda_1 t} - e^{\lambda_2 t}\right).
\end{aligned}
\tag{8}
\]

By putting $g = 0$ we can find the time at which the glucose concentration
returns to normal (and then slightly undershoots before coming back up
again to approach the equilibrium value exponentially). The value that
we obtain is about 1 hour, for a normal individual. The graph of g
against t has the general shape shown in Figure 18.3.1.
Possible refinements of the model are discussed in Bolie (1961) and
Edelstein-Keshet (1988).
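
The numerical steps of this subsection can be retraced in a few lines. The
sketch below is an illustration under stated assumptions: it takes Bolie's four
coefficients as quoted above, solves the characteristic equation, builds the
two-real-root solution satisfying $g(0) = g_0$ and $\dot{g}(0) = -ag_0$, and
locates the time at which $g$ returns to zero by bisection. The function names
are hypothetical. Because the roots are not rounded here, the computed
coefficients need not agree with the two-figure values in (8) in the second
decimal place.

```python
import math

# Bolie's averaged coefficients for normal individuals, as quoted in the text.
A, B, C, D = 2.92, 4.34, 0.208, 0.780


def real_roots(a, b, c, d):
    """Two real roots of lambda^2 + (a + d) lambda + (a d + b c) = 0."""
    p, q = a + d, a * d + b * c
    disc = p * p - 4 * q
    if disc <= 0:
        raise ValueError("the characteristic equation has no pair of distinct real roots")
    root = math.sqrt(disc)
    return (-p + root) / 2, (-p - root) / 2


def glucose(t, g0=1.0):
    """Excess glucose g(t) with g(0) = g0 and g'(0) = -a*g0 (two-real-root case)."""
    lam1, lam2 = real_roots(A, B, C, D)
    k1 = (A + lam2) / (lam2 - lam1)
    k2 = -(A + lam1) / (lam2 - lam1)
    return g0 * (k1 * math.exp(lam1 * t) + k2 * math.exp(lam2 * t))


def first_zero(lo=0.1, hi=3.0, tol=1e-10):
    """Bisection for the time (hours) at which g returns to zero."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if glucose(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print("roots:", real_roots(A, B, C, D))
    print("g returns to zero after about", round(first_zero(), 2), "hours")
```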

Orally administered glucose

In an alternative version of the glucose-tolerance test, the glucose is
administered orally, rather than by injection. The differential equations
which model this test are no longer homogeneous. In particular, the
differential equation (4) is replaced by

\[ \ddot{g} + (a + d)\dot{g} + (ad + bc)g = S(t) \]

where $S(t)$ is a `forcing' term which takes account of the glucose coming in
through the digestive system.
In the approach given by Ackerman, Rosevear and McGuckin (1964)
no attempt is made to determine the parameters a, b, c and d. Instead they
work directly with the above second-order differential equation, writing
it in the form

\[ \ddot{g} + 2\alpha\dot{g} + \omega_0^2 g = S(t) \tag{9} \]

where $2\alpha = a + d$ and $\omega_0^2 = ad + bc$ and where the forcing term $S(t)$
is chosen on the basis of their modelling assumptions. They then put
$\omega^2 = \omega_0^2 - \alpha^2$ and assume $\omega^2 > 0$ (so that the characteristic equation
has two unreal solutions) and they deduce that the solutions of (9) are

\[ g = A\sin(\omega t)e^{-\alpha t} \tag{10} \]

where $A$ is an arbitrary constant.

Fig. 18.3.1. Glucose concentration returning to normal after glucose tolerance
test.

During the test, the concentration of glucose (cg) in the individual's
bloodstream is measured at regular intervals. Ackerman et al. then chose
the parameters in (10) so as to get a curve which fits the measured
data. Typical examples of data and the fitted curves are shown in Figure
18.3.2(a-c). As a result of data obtained in this way from hundreds of
individuals, Ackerman et al. concluded that the value of the parameter
$\omega_0$ is a reliable guide as to whether or not an individual is diabetic.
They called $T_0 = 2\pi/\omega_0$ the `resonant period' and claimed that normal
individuals have resonant periods of less than 4 hours, whereas diabetics
have periods greater than 4 hours.
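
The arithmetic behind the resonant-period criterion is short enough to sketch
in code. The function names below are hypothetical and the 4-hour threshold is
simply the figure quoted above. Feeding in $\omega_0^2 = ad + bc$ computed from
Bolie's coefficients is done only for illustration; this pairing is not taken
from either paper.

```python
import math


def resonant_period(omega0_squared):
    """Resonant period T0 = 2*pi/omega0 (hours), given omega0 squared."""
    if omega0_squared <= 0:
        raise ValueError("omega0 squared must be positive")
    return 2 * math.pi / math.sqrt(omega0_squared)


def period_exceeds_threshold(omega0_squared, threshold_hours=4.0):
    """True if the resonant period is longer than the quoted 4-hour threshold."""
    return resonant_period(omega0_squared) > threshold_hours


if __name__ == "__main__":
    # Illustration only: omega0^2 = a*d + b*c with Bolie's values for a, b, c, d.
    a, b, c, d = 2.92, 4.34, 0.208, 0.780
    w0sq = a * d + b * c
    print(f"T0 = {resonant_period(w0sq):.2f} hours,",
          "exceeds threshold:", period_exceeds_threshold(w0sq))
```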
An account of this work is included in Middleman (1972). Note that
Burghes and Borrie (1981) switch the roles of $\omega$ and $\omega_0$ and then state a
conclusion which is different from that given in Ackerman et al. (1964).

Fig. 18.3.2. Responses of individuals to an oral glucose-tolerance test. Case
(c) suggests diabetes because the period of oscillation is greater than 4 hours,
indicating that the response to the excess glucose is too slow. Graphs are from
Ackerman et al. (1964).

Exercises 18.3
1. The differential equations for the glucose and insulin concentrations are given
in the text as
\[ \frac{dg}{dt} = -ag - bh, \qquad \frac{dh}{dt} = cg - dh \]
where a, b, c and d are constants. We showed in the text that $d > 0$. Show in a
similar way, by referring to the relevant principles from physiology, that each of
a, b and c is positive.
2. Derive, from the equations in Exercise 1, the following equations.

\[ \ddot{g} + (a + d)\dot{g} + (ad + bc)g = 0, \qquad h = -\frac{1}{b}(\dot{g} + ag). \]
(Apply the solution procedure for simultaneous linear differential equations with
constant coefficients from Section 18.2.)
Show, furthermore, that the initial conditions
\[ g = g_0, \quad h = 0 \quad \text{at } t = 0 \]
are equivalent to
\[ g = g_0, \quad \dot{g} = -ag_0 \quad \text{at } t = 0. \]

3. Given that a, b, c and d are all positive, show that the solutions of the
characteristic equation
\[ \lambda^2 + (a + d)\lambda + (ad + bc) = 0 \]
are either negative or have negative real part.
Fig. 18.4.1. Placenta provides interface between the bloodstreams of mother and
fetus. (Labels: uterine artery, placenta.)

4. On the basis of Bolie's model for the glucose-tolerance test, we obtained the
following formula for the insulin concentration $t$ hours after the glucose injection:
\[ h = 0.202g_0\left(e^{\lambda_1 t} - e^{\lambda_2 t}\right) \]
where $\lambda_1 = -1.36$ and $\lambda_2 = -2.34$. Choose any positive value for $g_0$ and then
sketch the graph of $h$ against $t$ and find the time at which $h$ reaches its maximum
value. (For any other choice of $g_0$, the graph can be obtained simply by scaling
in the vertical direction.)
5. In an insulin-tolerance test, an injection of insulin is given to an individual
after fasting and the level of glucose in the blood is measured at subsequent times.
Assume that the concentrations of glucose and insulin in the bloodstream
satisfy the differential equations of Exercise 1, together with the initial conditions
\[ g = 0, \quad h = h_0 \quad \text{at } t = 0. \]
Show that, on the basis of Bolie's estimates for the coefficients in the differential
equations,
\[ g = 4.35h_0\left(-e^{\lambda_1 t} + e^{\lambda_2 t}\right), \qquad
   h = h_0\left(1.57e^{\lambda_1 t} - 0.57e^{\lambda_2 t}\right). \]
Sketch the graph of $h$ as a function of $t$.

18.4 Nutrient exchange in the placenta

In the placenta nutrients pass from mother to fetus, while waste products
from the fetus go the other way. During this exchange the blood of the
mother and the fetus do not mix but are separated by a membrane across
which nutrients and waste must pass. The nutrients flow from a high
concentration in the maternal blood to a lower concentration in the fetal
blood. For both mother and fetus, blood flows to the placenta along an
artery and returns via a vein, as shown in Figure 18.4.1.

Fig. 18.4.2. Two different types of blood flow in a placenta: (a) maternal pool,
(b) countercurrent flow. (Labels: porous membrane, maternal blood vessel.)

There is some variation among species in the arrangement of blood
vessels within the placenta. In humans, fetal blood vessels are bathed
in maternal blood, as in Figure 18.4.2a. In rabbits and sheep, on the
other hand, there is a system of maternal blood vessels adjacent to the
fetal ones. The blood which they contain is believed to flow in opposite
directions as shown in Figure 18.4.2b.
Simple mathematical models can be used to compare the advantages
and disadvantages of the different types of arrangements of blood vessels
within the placenta. The model to be described here is for the type of
placenta shown in Figure 18.4.2b, appropriate to sheep and rabbits.
The nutrient concentrations in both maternal and fetal blood vessels
can be expected to vary with the distance along the blood vessels since, as
nutrients are transferred, the concentrations change. It will be assumed
that the concentrations have reached a steady-state, so that they depend
only on the distance along the blood vessels.

The model
The model of the placenta we will describe is illustrated in Figure 18.4.3,
our notation being as follows. We use $Q_1$ and $Q_2$ to denote the rates of
flow of the maternal and fetal blood respectively. We suppose that the
blood vessels of the mother and fetus stay in contact with the membrane
along a total length L.
We choose as coordinate the distance x of a typical point along the
blood vessels from the point where they first make contact with the
membrane. The concentration of nutrient in each blood vessel is then a
function of $x$ and we put

\[
\begin{aligned}
c_1 &= \{\text{nutrient concentration in the maternal blood at a distance } x\} = \phi_1(x), \\
c_2 &= \{\text{nutrient concentration in the fetal blood at a distance } x\} = \phi_2(x).
\end{aligned}
\tag{1}
\]

Thus $c_1$ and $c_2$ are functions of $x$ while $L$, $Q_1$ and $Q_2$ are constants,
independent of $x$.

Fig. 18.4.3. Schematic view of countercurrent bloodflow in neighbouring blood
vessels in a rabbit or sheep placenta.

We now consider the amount of nutrient contained within two planes,
perpendicular to the blood vessels, which pass through the points with
coordinates $x$ and $x + \delta x$ respectively. These planes are shown in Figure
18.4.4a.
In the maternal bloodstream, nutrient enters at the first plane and
leaves at the second. It also leaves, to enter the fetal bloodstream, via
the membrane. Thus, as the mass of nutrient is conserved,

\[ \{\text{mass entering through plane at } x\} = \{\text{mass leaving through plane at } x + \delta x\} + \{\text{mass leaving across membrane}\}. \tag{2} \]

Fig. 18.4.4. (a) Planes through the blood vessels. (b) Nutrient flow in maternal
blood vessels.

Now, in any time interval of length $\delta t$,

\[ \{\text{mass entering through plane at } x\} = \{\text{rate of flow of blood}\} \times \{\text{concentration of nutrient}\} \times \{\text{time}\} = Q_1\phi_1(x)\,\delta t \tag{3} \]
and similarly
\[ \{\text{mass leaving through plane at } x + \delta x\} = Q_1\phi_1(x + \delta x)\,\delta t. \tag{4} \]
The principle which enables us to estimate the amount of nutrient
transported across the membrane, known as Fick's law, states that if the
concentrations on either side of the membrane were homogeneous then
\[ \{\text{rate of transport through the membrane}\} = P \times \{\text{area of membrane}\} \times \{\text{difference between concentrations on either side}\} \]

where $P$ is a constant called the permeability of the membrane.

In our problem, we wish to apply Fick's law across the portion of
membrane cut off by the two planes, which has area $b\,\delta x$ where $b$
is the width of the membrane, assumed constant. The difference in
concentration across this portion of membrane is approximately
$\phi_1(x) - \phi_2(x) = c_1 - c_2$, with an error which approaches 0 as $\delta x$ approaches 0.
Thus Fick's law gives
\[ \{\text{rate of flow through membrane}\} \approx P \times b\,\delta x \times (c_1 - c_2) \]

and hence
\[ \{\text{mass of nutrient leaving across membrane}\} \approx P \times b\,\delta x \times (c_1 - c_2) \times \delta t. \tag{5} \]
Now substitute (3), (4) and (5) into (2) and then divide by $\delta t$ and
rearrange to get
\[ Q_1\left(\phi_1(x + \delta x) - \phi_1(x)\right) + Pb(c_1 - c_2)\,\delta x \approx 0, \]

the error involved in the approximation being small compared with $\delta x$.
Hence, dividing by $\delta x$ and then letting $\delta x$ approach 0 gives
\[ \frac{dc_1}{dx} = -\alpha_1(c_1 - c_2) \tag{6} \]

where $\alpha_1 = Pb/Q_1$.
A similar derivation for the fetal bloodstream gives
\[ \frac{dc_2}{dx} = -\alpha_2(c_1 - c_2) \tag{7} \]

where $\alpha_2 = Pb/Q_2$.

Solving the differential equations

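The pair of differential equations (6), (7) are linear with constant
coefficients and hence may be solved by the procedure given in Section 18.2.
This leads to the second-order differential equation
\[ \frac{d^2c_1}{dx^2} = (-\alpha_1 + \alpha_2)\frac{dc_1}{dx}. \tag{8} \]

From this we can deduce that, if $\alpha_1 \neq \alpha_2$, then the differential equations
(6), (7) have the solutions
\[
\begin{aligned}
c_1 = \phi_1(x) &= \phi_1(0) - \frac{\alpha_1}{\alpha_2 - \alpha_1}\left(\phi_1(0) - \phi_2(0)\right)\left(e^{(\alpha_2 - \alpha_1)x} - 1\right), \\
c_2 = \phi_2(x) &= \phi_2(0) - \frac{\alpha_2}{\alpha_2 - \alpha_1}\left(\phi_1(0) - \phi_2(0)\right)\left(e^{(\alpha_2 - \alpha_1)x} - 1\right).
\end{aligned}
\tag{9}
\]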
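
The solutions (9) can be checked against the differential equations (6) and
(7) numerically. The sketch below is an illustration only: the function names,
the sample values of the constants and the use of a central-difference
approximation to the derivatives are all assumptions, not part of the model.

```python
import math


def concentrations(x, phi1_0, phi2_0, alpha1, alpha2):
    """c1 and c2 from equation (9); requires alpha1 != alpha2."""
    if alpha1 == alpha2:
        raise ValueError("equation (9) assumes alpha1 != alpha2")
    k = alpha2 - alpha1
    growth = math.exp(k * x) - 1.0
    diff0 = phi1_0 - phi2_0
    c1 = phi1_0 - alpha1 / k * diff0 * growth
    c2 = phi2_0 - alpha2 / k * diff0 * growth
    return c1, c2


def residuals(x, phi1_0, phi2_0, alpha1, alpha2, h=1e-5):
    """Left minus right sides of (6) and (7), using central differences."""
    c1p, c2p = concentrations(x + h, phi1_0, phi2_0, alpha1, alpha2)
    c1m, c2m = concentrations(x - h, phi1_0, phi2_0, alpha1, alpha2)
    c1, c2 = concentrations(x, phi1_0, phi2_0, alpha1, alpha2)
    r1 = (c1p - c1m) / (2 * h) + alpha1 * (c1 - c2)
    r2 = (c2p - c2m) / (2 * h) + alpha2 * (c1 - c2)
    return r1, r2


if __name__ == "__main__":
    # Hypothetical values for the end concentrations and the constants.
    print(residuals(2.0, phi1_0=1.0, phi2_0=0.4, alpha1=0.3, alpha2=0.5))
```

Both residuals should be close to zero for any choice of the sample values
with $\alpha_1 \neq \alpha_2$.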

Comparisons
The placenta modelled above is called a countercurrent type of
placenta because the two bloodstreams flow in opposite directions.
Middleman (1972) gives further details, and he models other types of placenta,
obtaining solutions analogous to (9) for the concentrations of the
nutrients. On the basis of such solutions, he is able to make some comparisons
between the efficiency of the various types of placenta in exchanging
nutrients.
Models analogous to that of the countercurrent nutrient exchange
system also occur in other applications. These include simple models
of an artificial kidney machine, which is described in Burghes and Borrie
(1981), and oxygen exchange in the swim bladders of deep sea fish,
described in Rodin and Jacques (1989).

Exercises 18.4
1. In the text, the differential equation for the concentration $c_1$ of nutrient in the
maternal bloodstream at a distance $x$ along the placental membrane was shown
to be
\[ \frac{dc_1}{dx} = -\alpha_1(c_1 - c_2) \]
for a suitable constant $\alpha_1$.
By arguing in a similar way, derive the corresponding differential equation
\[ \frac{dc_2}{dx} = -\alpha_2(c_1 - c_2) \]
for the concentration of nutrient in the fetal bloodstream.
Under which physical condition does $\alpha_1 = \alpha_2$?
2. Why can the procedure given in Section 18.2 be used to solve the differential
equations in Exercise 1? Use this procedure to solve these equations in the case
$\alpha_1 \neq \alpha_2$. Check your answers against those given in the text.
3. Repeat Exercise 2 but with $\alpha_1 = \alpha_2$. Interpret your answers.
4. Use the solutions (9) in the text to establish each of the following statements.
(a) If the concentration of nutrient entering via the maternal artery is
equal to that leaving via the fetal vein, then the concentrations must be
constant along the entire length of the placental membrane.
(b) If the concentration of nutrient entering via the maternal artery exceeds
that leaving via the fetal vein, then at each point along the membrane
the concentration in the maternal bloodstream exceeds that in the fetal
bloodstream.

5. Modify the model given in the text if, instead of flowing in opposite directions,
the maternal and fetal bloodstreams flow in the same direction. (This is called
concurrent exchange.)
19
Non-linear coupled models

Further models of two interacting quantities are studied in this
chapter. These models are distinguished from those of the previous chapter
because they lead to coupled non-linear, rather than linear, differential
equations. Such equations usually cannot be solved explicitly. Instead, a
method of eliminating the independent variable, called the phase-plane
technique, is developed for studying properties of the solution.
Non-linear equations often exhibit unusual or non-intuitive behaviour.
A classic example considered in this chapter is the Lotka-Volterra equa-
tions which describe the interactions between predators and their prey.
Applied to the study of fish and shark populations they predict that an
increase in fishing will actually lead to an increase in the fish population.
In addition to the predator-prey system, models of combat and epi-
demics are presented.

19.1 Predator-prey interactions

Our concern in this chapter is with models which lead to coupled non-
linear differential equations. In this section it will be shown how a
model describing the interaction of a predator (shark) and its prey (fish)
leads to coupled non-linear differential equations for the shark and fish
populations as functions of time.

Some unusual data

The Italian mathematician Vito Volterra (1860-1940) developed a model
of predator-prey interaction in response to some unusual data. Extensive
records had been kept of the yearly catches of fish and sharks at an Italian
sea port (Fiume, 1914-1923). Table 19.1.1 shows the percentage ratio
of sharks to total catch. It reveals that this ratio, and thus the relative
number of sharks, increased substantially during a time of reduced fishing
(1915 to 1919 corresponding to the First World War).

Table 19.1.1. Percentage catch of sharks at Fiume, Italy, 1914-1923.

Year    Sharks (% total catch)
1914    11.9
1915    21.4
1916    22.1
1917    21.2
1918    36.4
1919    27.3
1920    16.0
1921    15.9
1922    14.8
1923    10.7

How can these data be understood? Volterra set up the following model
to try to explain why the decrease in fishing increased the percentage
catch of sharks.

The model
Following Volterra, our objective is to formulate some differential equa-
tions describing the shark and fish populations, which will be done using
the small time-interval method introduced in Chapter 13. The following
influencing factors will be taken into account.
(i) Natural births and deaths of the sharks and fish in isolation from
each other.
(ii) Decline of the fish population due to the fish being the prey of
the sharks.
(iii) Increase in the shark population due to the presence of more fish.
(iv) Fishing of both sharks and fish.
For the populations, we introduce the notation
\[ x = \{\text{number of fish at time } t\}, \qquad y = \{\text{number of sharks at time } t\} \]
and denote by $\delta x$ and $\delta y$ the change in the corresponding populations
during a small time interval $\delta t$. From the points (i)-(iv) it follows that

\[ \delta x = \{\text{fish born in isolation}\} - \{\text{fish deaths in isolation}\} - \{\text{fish eaten by sharks}\} - \{\text{fish caught by fishermen}\} \tag{1} \]

and

\[ \delta y = \{\text{sharks born in isolation}\} - \{\text{shark deaths in isolation}\} + \{\text{extra sharks surviving due to fish}\} - \{\text{sharks caught by fishermen}\} \tag{2} \]

where each of the quantities on the right-hand side refers to the number
of sharks and fish which are born or die during the time interval $\delta t$.
We need to relate the various quantities in (1) and (2) to the shark
and fish populations. To do this we will make the following modelling
assumptions.

(a) The change in the shark and fish populations, in isolation, is pro-
portional to the present population of sharks and fish, respectively.
(This is the same assumption as for the linear model of population
growth made in Section 9.1.) The proportionality constant for the
shark population is negative, indicating that the shark population
would decrease if isolated from the fish population. This is a
consequence of the fish being the food supply of the sharks.
(b) The number of sharks and fish caught by fishermen is directly
proportional to the present shark and fish populations
respectively. The proportionality constant is the same in both
cases, which means that the fishing methods do not discriminate
between sharks and fish.
(c) The number of fish eaten by sharks is directly proportional to the
product of the number of fish present and the number of sharks
present. This is equivalent to assuming that each shark eats a
constant fraction of the fish population.
(d) The additional number of sharks surviving is directly proportional
to the number of fish eaten.

In the exercises you are asked to show that, when written as
mathematical equations and substituted into equations (1) and (2), the assumptions
(a)-(d) imply the differential equations
\[
\begin{aligned}
\frac{dx}{dt} &= (r - f)x - \alpha xy, \\
\frac{dy}{dt} &= -(s + f)y + \beta xy,
\end{aligned}
\tag{3}
\]
where $r$, $s$, $f$, $\alpha$ and $\beta$ are all positive constants. The constants $r$ and $-s$
are the growth rates for the fish and shark populations if they existed
in isolation, the constant $\alpha$ is the rate at which fish are eaten by a
single shark and $\beta/\alpha$ gives the fraction of a shark surviving by eating
one fish. These equations are called the Lotka-Volterra equations after
Volterra and Lotka, another mathematician who formulated them
independently.
The Lotka-Volterra equations are coupled since the unknowns x and
y appear in both of the equations. Furthermore they are non-linear due
to the presence of the terms $\alpha xy$ and $\beta xy$. Because of the non-linear
terms, it is not possible to solve these equations explicitly for general
initial conditions. We therefore seek ways of obtaining useful information
about the solutions. The steady-state solutions provide an informative
beginning.
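
Although the equations cannot be solved explicitly, they can be integrated
numerically for particular parameter values. The following is a minimal
sketch, assuming hypothetical values for $r$, $s$, $f$, $\alpha$ and $\beta$
(none of them come from the text) and using the classical Runge-Kutta method,
which is a choice made here only for illustration.

```python
def lotka_volterra_rhs(x, y, r, s, f, alpha, beta):
    """Right-hand sides of the Lotka-Volterra equations (3)."""
    dx = (r - f) * x - alpha * x * y
    dy = -(s + f) * y + beta * x * y
    return dx, dy


def simulate(x0, y0, t_end, steps, r, s, f, alpha, beta):
    """Integrate (3) with classical RK4; returns a list of (t, x, y) tuples."""
    h = t_end / steps
    x, y = x0, y0
    p = (r, s, f, alpha, beta)
    path = [(0.0, x, y)]
    for n in range(1, steps + 1):
        k1 = lotka_volterra_rhs(x, y, *p)
        k2 = lotka_volterra_rhs(x + h / 2 * k1[0], y + h / 2 * k1[1], *p)
        k3 = lotka_volterra_rhs(x + h / 2 * k2[0], y + h / 2 * k2[1], *p)
        k4 = lotka_volterra_rhs(x + h * k3[0], y + h * k3[1], *p)
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        path.append((n * h, x, y))
    return path


# Hypothetical parameter values, chosen only for illustration.
PARAMS = dict(r=1.0, s=0.5, f=0.1, alpha=0.02, beta=0.01)

if __name__ == "__main__":
    for t, x, y in simulate(40.0, 30.0, 20.0, 2000, **PARAMS)[::200]:
        print(f"t = {t:5.1f}   fish = {x:7.2f}   sharks = {y:7.2f}")
```

With these sample values a run started at $x = 60$, $y = 45$ stays there,
which is the kind of constant solution examined next.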

Steady-state solutions
The steady-state solutions (i.e. time-independent constant solutions) are
the easiest particular solutions to find for an autonomous differential
equation. To find these solutions set
\[ x = X \quad \text{and} \quad y = Y, \]
where $X$ and $Y$ are constants, and substitute into the Lotka-Volterra
equations (3) to give two non-linear simultaneous equations
\[ (r - f)X - \alpha XY = 0, \qquad -(s + f)Y + \beta XY = 0. \]

In the exercises, you are asked to show that the constant solutions of
the equations are
\[ (X, Y) = (0, 0) \]

and

\[ (X, Y) = \left(\frac{s + f}{\beta}, \frac{r - f}{\alpha}\right). \]

The first of these solutions corresponds to extinction of both species,
which we know doesn't happen in practice. The second solution
contains the more interesting information. One interesting thing about this
solution of the Lotka-Volterra equations is that the steady-state fish
population depends on the shark growth rate parameter $s$ but not on the
fish growth rate parameter $r$. Similarly, the steady-state shark population
depends on the fish growth rate parameter $r$ but not on the shark growth
rate parameter $s$.
It will be shown in Section 19.2 via a phase-plane analysis that the
second steady-state solution gives the value about which the shark and
fish populations oscillate. This value may thus be regarded as the average
population of the shark and fish populations. The effect of fishing can
therefore be deduced by simply considering this steady-state solution.

Effect of fishing
Consider the circumstance that there is less fishing, as in the First World
War. In the Lotka-Volterra model this means that f is decreased but all
other parameters remain the same. The second constant solution above
then gives that the fish population actually decreases while the shark
population increases. Hence the ratio of sharks to fish increases, which
is consistent with the data of Table 19.1.1. The mechanism for this result
will be discussed in the next section.
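
The effect of fishing on the steady state can be illustrated with a few lines
of arithmetic. The sketch below assumes hypothetical parameter values and
hypothetical function names. Since assumption (b) takes the same fishing
constant $f$ for both species, the share of sharks in the catch is taken here
to equal the share of sharks in the combined steady-state population.

```python
def steady_state(r, s, f, alpha, beta):
    """Non-zero steady state (X, Y) = ((s + f)/beta, (r - f)/alpha) of (3)."""
    return (s + f) / beta, (r - f) / alpha


def shark_share(r, s, f, alpha, beta):
    """Fraction of sharks in the combined steady-state population."""
    X, Y = steady_state(r, s, f, alpha, beta)
    return Y / (X + Y)


if __name__ == "__main__":
    # Hypothetical values; only the fishing constant f differs between runs.
    base = dict(r=1.0, s=0.5, alpha=0.02, beta=0.01)
    for label, f in (("heavy fishing", 0.3), ("light fishing", 0.1)):
        X, Y = steady_state(f=f, **base)
        share = shark_share(f=f, **base)
        print(f"{label}: fish = {X:.1f}, sharks = {Y:.1f}, shark share = {share:.1%}")
```

Lowering $f$ in this sketch lowers the steady-state fish population and raises
the steady-state shark population, in line with the discussion above.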

Exercises 19.1

1. Consider the statements (a)-(d) of the text which relate the quantities
influencing the shark and fish populations to the populations themselves. Suppose
that the proportionality constants are $r$ and $-s$ in (a) for the fish and shark
populations respectively, $f$ in (b), $\alpha$ in (c) and $\beta$ in (d). Write the statements as
mathematical equations, keeping in mind that each quantity on the right-hand
side of equations (1) and (2) is also directly proportional to the time interval
$\delta t$. Substitute these equations in (1) and (2) and thus derive the Lotka-Volterra
equations.
2. Factorize and thus solve the two non-linear equations given in the text which
specify the steady-state solutions of the Lotka-Volterra equations.

3. (a) Suppose there are two different species of prey x and z, which are the food
of a single predator y. Derive differential equations for the changes in
the populations of the species by using modelling assumptions analogous
to those used to derive the Lotka-Volterra equations.
(b) Find all the steady-state solutions of the differential equations found in
part (a), and thus show that at least one species of prey becomes extinct
in this instance.

4. An orchard is infested by a population of aphids. The aphids are preyed upon
by a type of beetle. The owner of the orchard decides to use a pesticide which
kills a fixed fraction of both aphids and beetles. Use the steady-state solution of
the Lotka-Volterra equations to decide whether or not this is a wise move.
5. Modify the Lotka-Volterra equations to account for the fish population
growing logistically in the absence of sharks (recall Section 11.3). Do the original
conclusions regarding the effect of fishing alter?

19.2 Phase-plane analysis

The Lotka-Volterra equations are examples of first-order, autonomous,
coupled differential equations of the form

$$\dot{x} = f(x, y) \quad\text{and}\quad \dot{y} = g(x, y), \tag{1}$$

where f and g are non-linear functions of x and y. Although in general
it is not possible to solve these equations exactly, as seen in the previous
section, useful information can be obtained from the steady-state solution.
This information can be further supplemented by use of a phase-plane
analysis.
Instead of solving (1) for x and y as functions of t, we obtain a single
differential equation for y as a function of x, by use of the chain rule.
The resulting curves in the (x, y)-plane are called phase-plane trajectories.

The method
There are three main steps in applying a phase-plane analysis, which we
will illustrate by the following example.

Example 1. Find, and sketch, the phase-plane trajectories for the coupled system
$$\dot{x} = -y \quad\text{and}\quad \dot{y} = x. \tag{2}$$

Fig. 19.2.1. Phase-plane trajectories for the coupled system (2) in Example 1. (Axes: x-axis and y-axis; the trajectory K = 3 is marked.)

Solution.
STEP 1: Use the chain rule, substituting the values of $\dot{x}$ and $\dot{y}$ from the
given coupled equations, to obtain a single first-order differential equation in y and x.
The chain rule says that
$$\frac{dy}{dx}\,\frac{dx}{dt} = \frac{dy}{dt},$$
or equivalently,
$$\frac{dy}{dx}\,\dot{x} = \dot{y}.$$
Thus, since $\dot{x} = -y$ and $\dot{y} = x$, the differential equation
$$\frac{dy}{dx} = -\frac{x}{y}$$
is obtained. This equation is of the first-order separable type studied in Chapter 11.
STEP 2: Solve the differential equation for x and y and sketch the solution in
the (x, y)-plane (the phase-plane) for various initial values.
Solving this differential equation (using the methods explained in Section
11.1) we obtain
$$x^2 + y^2 = x_0^2 + y_0^2,$$
where $x_0$ and $y_0$ are the initial values of x and y. Since $x_0$ and $y_0$ are constants,
we can write
$$x_0^2 + y_0^2 = K^2,$$
where K is a constant. The relationship between x and y defines a circle of radius
K, and varying K gives a family of circles as sketched in Figure 19.2.1.
STEP 3: Determine the directions of the trajectories by finding if x and y increase
or decrease with t at a few selected regions in the phase-plane.
From the original differential equations, for the region x > 0, y > 0, we see that
$\dot{x} < 0$, which means x decreases with t;
$\dot{y} > 0$, which means y increases with t.
Thus the trajectories are anti-clockwise around the circles, as indicated in Figure 19.2.1.
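
The three steps of Example 1 can be cross-checked numerically. The sketch below integrates the system (2) with a classical fourth-order Runge-Kutta step, which is an assumed choice of method and is not taken from the text. It then confirms that the point stays on the circle it started on and moves anti-clockwise. The helper names `rk4_step` and `example_system` are hypothetical.

```python
import math


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for the autonomous system state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def example_system(state):
    """Right-hand side of system (2): x' = -y, y' = x."""
    x, y = state
    return [-y, x]


state = [3.0, 0.0]                     # start on the circle of radius K = 3
dt = 0.001
angle_before = math.atan2(state[1], state[0])
for _ in range(1000):                  # integrate from t = 0 to t = 1
    state = rk4_step(example_system, state, dt)
angle_after = math.atan2(state[1], state[0])

radius = math.hypot(state[0], state[1])
print(f"radius after t = 1: {radius:.6f}")
print("anti-clockwise" if angle_after > angle_before else "clockwise")
```

The polar angle increases from 0 to about 1 radian over one unit of time, which matches the exact solution x = A cos(t + δ), y = A sin(t + δ).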

Features of the phase-plane trajectories

Some features exhibited by the phase-plane trajectories of Figure 19.2.1
are common to all phase-plane trajectories. These features are as follows.
• Through any point (x, y) in the phase-plane there is at most one
trajectory. This follows from the uniqueness theorem for the solution
of coupled first-order equations.
• Each constant solution of the equations is given by a single point in
the phase-plane.
• Closed trajectories in the phase-plane correspond to oscillatory solutions
of the coupled equations.
The last result is illustrated by the exact solution of the coupled system
(2), which is
$$x = A\cos(t + \delta) \quad\text{and}\quad y = A\sin(t + \delta).$$
These solutions oscillate with period $2\pi$.

The Lotka-Volterra equations

In the exercises you are asked to apply the phase-plane technique of
Example 1 to the Lotka-Volterra equations,
$$\dot{x} = (r - f)x - \alpha x y,$$
$$\dot{y} = -(s + f)y + \beta x y.$$
This gives, for the phase-plane trajectories, the equation
$$\bar{r}\ln y - \alpha y + \bar{s}\ln x - \beta x = K, \tag{3}$$
where
$$K = \bar{r}\ln y_0 - \alpha y_0 + \bar{s}\ln x_0 - \beta x_0$$
is a constant which depends on the initial populations $x_0$ and $y_0$.
The notation $\bar{r} = r - f$ and $\bar{s} = s + f$ has been used, and $x_0$ and $y_0$ are
the initial values of x and y.

Fig. 19.2.2. Computer-generated phase-plane trajectories for the Lotka-Volterra
model, shown in four panels with the fish population X on the horizontal axis.
(a) $\bar{r} = 2$, $\bar{s} = 1$, $\alpha = 1$, $\beta = 1$; (b) $\bar{r} = 1$, $\bar{s} = 2$, $\alpha = 1$, $\beta = 1$;
(c) $\bar{r} = 1$, $\bar{s} = 1$, $\alpha = 2$, $\beta = 1$; (d) $\bar{r} = 1$, $\bar{s} = 1$, $\alpha = 1$, $\beta = 2$. Values of K for
some trajectories are marked.

Unfortunately, it is not possible to solve (3) for y in terms of x.
However, phase-plane trajectories can be obtained using a contouring
software package. This is a computer program which evaluates the left-
hand side of (3) for a large number of values of x and y and then joins
up those points for which the left-hand side equals K. The procedure is
the same as that which produces isobars on a weather map. Some typical
plots are shown in Figure 19.2.2. In each plot the phase-plane trajectories
are closed curves. In fact, it turns out that the trajectories for x > 0 and
y > 0 are always closed curves, for all positive values of the parameters
$\bar{r}$, $\bar{s}$, $\alpha$ and $\beta$. A detailed proof of this result is given in Braun (1975),
Section 4.9.
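
As a numerical complement to the contouring approach, one can integrate the Lotka-Volterra equations directly and check that the left-hand side of (3) stays constant along the computed trajectory. The sketch below assumes the equations in the form dx/dt = x(r̄ - αy), dy/dt = -y(s̄ - βx) and uses a hand-written fourth-order Runge-Kutta step. The function names and parameter values are illustrative assumptions.

```python
import math


def lotka_volterra(state, rbar, sbar, alpha, beta):
    """Right-hand side with x = fish and y = sharks."""
    x, y = state
    return (x * (rbar - alpha * y), -y * (sbar - beta * x))


def rk4_step(state, dt, *params):
    """One classical Runge-Kutta step for the Lotka-Volterra system."""
    def shifted(k, h):
        return (state[0] + h * k[0], state[1] + h * k[1])

    k1 = lotka_volterra(state, *params)
    k2 = lotka_volterra(shifted(k1, dt / 2), *params)
    k3 = lotka_volterra(shifted(k2, dt / 2), *params)
    k4 = lotka_volterra(shifted(k3, dt), *params)
    return tuple(state[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                 for i in range(2))


def trajectory_constant(state, rbar, sbar, alpha, beta):
    """Left-hand side of (3): rbar ln y - alpha y + sbar ln x - beta x."""
    x, y = state
    return rbar * math.log(y) - alpha * y + sbar * math.log(x) - beta * x


params = (2.0, 1.0, 1.0, 1.0)          # rbar, sbar, alpha, beta (illustrative)
state = (2.0, 1.0)                     # initial fish and shark populations
K_start = trajectory_constant(state, *params)
drift = 0.0
for _ in range(20000):                 # integrate to t = 20 with dt = 0.001
    state = rk4_step(state, 0.001, *params)
    drift = max(drift, abs(trajectory_constant(state, *params) - K_start))
print(f"K = {K_start:.4f}, largest drift along the trajectory = {drift:.2e}")
```

A drift that stays at the level of the integration error is consistent with the trajectory being a contour of the left-hand side of (3).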
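The direction of the plots can be deduced by writing the Lotka-Volterra
equations in the form
$$\dot{x} = x(\bar{r} - \alpha y),$$
$$\dot{y} = -y(\bar{s} - \beta x).$$

Fig. 19.2.3. Direction of the trajectories in each of the four quadrants. (The lines
$x = \bar{s}/\beta$ and $y = \bar{r}/\alpha$ through the fixed point divide the phase-plane into
region I, lower right, where $\dot{x} > 0$ and $\dot{y} > 0$; region II, upper right, where $\dot{x} < 0$
and $\dot{y} > 0$; region III, upper left, where $\dot{x} < 0$ and $\dot{y} < 0$; and region IV, lower
left, where $\dot{x} > 0$ and $\dot{y} < 0$.)
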
As indicated in Figure 19.2.3, the phase-plane can be divided into four
quadrants about the fixed point (X, Y) according to the signs of $\dot{x}$ and $\dot{y}$. If
$x > \bar{s}/\beta$ and $y > \bar{r}/\alpha$ (in region II), then $\dot{x} < 0$ and $\dot{y} > 0$. This means that,
as we move along a trajectory in this region, x is decreasing and y is increasing. This
implies that the trajectories are traversed in the anti-clockwise direction.

Interpretation
Consideration of Figure 19.2.3 allows aspects of the predator-prey system
of sharks and fish to be better understood. Suppose initially both the
shark and fish populations are above their steady-state values and thus
in region II of Figure 19.2.3. In this circumstance the shark population
increases at the expense of the fish population until the fish population
drops below its steady-state value into region III. Now there is not
enough food for the sharks so the shark population also decreases until
it drops below its steady-state value into region IV. The fish population
can now recover as there are fewer sharks. It eventually increases to
above its steady-state value into region I. Now there is sufficient food for
the shark population to recover, so it begins to increase until region II is
again entered and the cycle repeated.
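
The cycle II, III, IV, I described above can be confirmed by evaluating the signs of the two derivatives at one sample point in each region. This is a small sketch assuming the equations in the form dx/dt = x(r̄ - αy), dy/dt = -y(s̄ - βx), with regions labelled as in the interpretation above (region II has both populations above their steady-state values). The function name `direction` and the parameter values are illustrative assumptions.

```python
def direction(x, y, rbar, sbar, alpha, beta):
    """Signs of (dx/dt, dy/dt) for dx/dt = x(rbar - alpha y), dy/dt = -y(sbar - beta x)."""
    xdot = x * (rbar - alpha * y)
    ydot = -y * (sbar - beta * x)

    def sign(v):
        return "+" if v > 0 else "-" if v < 0 else "0"

    return sign(xdot), sign(ydot)


rbar, sbar, alpha, beta = 1.0, 1.0, 1.0, 1.0   # illustrative values only
X, Y = sbar / beta, rbar / alpha               # steady-state fish and sharks

# One sample point per region (x = fish, y = sharks).
samples = {
    "I   (fish above, sharks below)": (1.5 * X, 0.5 * Y),
    "II  (fish above, sharks above)": (1.5 * X, 1.5 * Y),
    "III (fish below, sharks above)": (0.5 * X, 1.5 * Y),
    "IV  (fish below, sharks below)": (0.5 * X, 0.5 * Y),
}
for region, (x, y) in samples.items():
    sx, sy = direction(x, y, rbar, sbar, alpha, beta)
    print(f"{region}: sign of dx/dt = {sx}, sign of dy/dt = {sy}")
```

In region II the fish decrease while the sharks increase, in region III both decrease, in region IV the fish recover while the sharks still decline, and in region I both increase, which is the anti-clockwise cycle.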

Exercises 19.2

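1. Elimination of the independent variable from the Lotka-Volterra equations
leads to the differential equation
$$\left(\frac{\bar{r}}{y} - \alpha\right)\frac{dy}{dx} = -\left(\frac{\bar{s}}{x} - \beta\right),$$
where $\bar{r} = r - f$ and $\bar{s} = s + f$. Given that $x = x_0$ and $y = y_0$ initially, solve this equation to
show that
$$\bar{r}\ln y - \alpha y + \bar{s}\ln x - \beta x = K,$$
where K is a constant depending on the initial conditions.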
2. Consider the system of equations
$$\dot{x} = 2y, \qquad \dot{y} = -x.$$

(a) Use the chain rule to show that

$$2y\,\frac{dy}{dx} = -x.$$
(b) Hence obtain the equations for the family of phase-plane trajectories.
(c) Sketch the family of phase-plane trajectories in the (x, y)-plane. Remem-
ber to indicate by arrows the direction in which the solution moves along
these curves.

3. Repeat Exercise 2 for the system

$$\dot{x} = x^2, \qquad \dot{y} = y, \qquad x > 0, \; y > 0.$$
4. Suppose that the fish population grows according to the logistic equation, in
the absence of sharks (see Exercise 5 of Section 19.1). What would you expect
the phase-plane trajectories to look like? (If you have access to a differential
equation solving program you can use a computer to check your answer.)

19.3 Models of combat

The combat situation of two armies fighting a battle can be modelled,
subject to some simplifying assumptions, as a pair of coupled differential
equations. Mathematical models of combat can be used to understand
what factors can influence the outcome of a battle: some questions which
might be asked include which side is the victor, how many survivors
remain, how long does the battle take?
In this section we look at one particular combat situation where one
army is exposed to fire and the other is hidden. This situation may be
used to model guerilla warfare.

Guerilla combat model

The exposed army will be termed the `home' army and the hidden army
the `enemy' army. We will use the notation
$$x = \text{number of home soldiers at time } t, \qquad y = \text{number of enemy soldiers at time } t,$$
and assume that the number of soldiers can be approximated as contin-
uous variables.
In an isolated battle the major factor reducing the size of each army is
the number of soldiers put out of action (either killed, wounded or taken
prisoner) by the opposing army. We will assume that
• neither army takes prisoners,
• each army is using gunfire against the other.
Thus, with $\delta x$ and $\delta y$ denoting the changes in the numbers of the
respective armies during a time interval $\delta t$,
$$\delta x = -\{\text{number of home soldiers hit by gunfire of enemy army}\}$$
and
$$\delta y = -\{\text{number of enemy soldiers hit by gunfire of home army}\}.$$
The number of soldiers hit in a small time interval $\delta t$ is equal to the
product of
(i) the rate at which each soldier shoots ($R_x$ shots per unit time for
the home army and $R_y$ for the enemy),
(ii) the probability that a single shot hits its target ($P_x$ for the home
army and $P_y$ for the enemy),
(iii) the number of soldiers firing the shots,
multiplied by the length $\delta t$ of the interval.
Hence
$$\delta x = -R_y P_y\, y\,\delta t \quad\text{and}\quad \delta y = -R_x P_x\, x\,\delta t. \tag{1}$$

The firing rates $R_x$ and $R_y$ are assumed to be constant, while the probabilities
$P_x$ and $P_y$ are determined according to whether the target is
exposed or hidden.
If the target is exposed (the home army soldiers), it is reasonable to
assume that each single shot has a constant probability of hitting its
target, independent of the number of home soldiers. So $P_y$ is a constant
with respect to x and y.
If the target is hidden, however, then the probability of hitting a
soldier by a shot fired at random into a given area will depend on the
concentration of hidden soldiers in the area. If there are y enemy soldiers,
and if each enemy soldier has on average an area $\alpha$ exposed, then
$$P_x = \frac{\alpha y}{A},$$
where A is the total area occupied by the enemy soldiers. Note that $\alpha y$
gives the total area of soldiers available to be hit by random fire.
Substituting this formula into (1) and dividing by $\delta t$ gives the coupled
non-linear equations
$$\frac{dx}{dt} = -ay \quad\text{and}\quad \frac{dy}{dt} = -bxy, \tag{2}$$
where $a = R_y P_y$ and $b = R_x \alpha / A$ are positive constants.

Steady-state solutions
Setting x = X and y = Y, where X and Y are constants, gives
0 = -aY and 0 = -bX Y.
Hence the steady-state solution is Y = 0, X unspecified, which corre-
sponds to the enemy (guerilla) army being defeated. However, as will be
shown below via a phase-plane analysis, certain initial conditions lead to
X = 0 and thus victory to the enemy army.

Phase-plane analysis
The phase-plane technique of Section 19.2 allows us to eliminate t in the
equations (2) to obtain the differential equation
$$\frac{dy}{dx} = \frac{b}{a}\,x.$$
This equation is first-order separable so the methods of Section 11.1 can
be used to show that it has solution
$$y = \frac{b}{2a}\,x^2 + K, \tag{3}$$
where
$$K = y_0 - \frac{b}{2a}\,x_0^2 \tag{4}$$
is a constant depending on the initial conditions.
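
The parabolic trajectories can be checked against a direct numerical solution of the guerilla combat equations dx/dt = -ay, dy/dt = -bxy. The sketch below uses a hand-written fourth-order Runge-Kutta step, which is an assumed choice of method, and made-up values of a, b and the initial army sizes. It monitors how far y - (b/2a)x² strays from its initial value K.

```python
def combat_rhs(x, y, a, b):
    """Guerilla combat model: dx/dt = -a y, dy/dt = -b x y."""
    return -a * y, -b * x * y


def rk4_step(x, y, dt, a, b):
    """One classical Runge-Kutta step for the combat equations."""
    k1 = combat_rhs(x, y, a, b)
    k2 = combat_rhs(x + dt / 2 * k1[0], y + dt / 2 * k1[1], a, b)
    k3 = combat_rhs(x + dt / 2 * k2[0], y + dt / 2 * k2[1], a, b)
    k4 = combat_rhs(x + dt * k3[0], y + dt * k3[1], a, b)
    x_new = x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    y_new = y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x_new, y_new


a, b = 0.1, 0.0001                     # illustrative values only
x, y = 1000.0, 100.0                   # initial home and enemy soldiers
K = y - b / (2 * a) * x ** 2           # the constant K from the initial values
worst = 0.0
for _ in range(5000):                  # integrate to t = 5 with dt = 0.001
    x, y = rk4_step(x, y, 0.001, a, b)
    worst = max(worst, abs(y - b / (2 * a) * x ** 2 - K))
print(f"K = {K:.3f}, largest deviation from the parabola = {worst:.2e}")
```

A deviation at the level of rounding and integration error indicates that the computed solution stays on the parabola through the initial point.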
The parabolas defined by (3) for varying K are sketched in Figure 19.3.1.
Only the region of the phase-plane $x, y \ge 0$ is sketched since
the number of soldiers cannot be negative. Note also from (2) that, if
x, y > 0, then $\dot{x}, \dot{y} < 0$ so all the trajectories point towards the axes.

Fig. 19.3.1. Phase-plane trajectories for the guerilla combat model.

Suppose that a victory to the home army occurs when x > 0 and y = 0
(although in practice the defeated army would probably surrender before
this stage). From (3) and (4) this occurs when K < 0 and thus when
$$y_0 < \frac{b}{2a}\,x_0^2.$$
On the other hand, a victory to the hidden army occurs when K > 0 so
that
$$y_0 > \frac{b}{2a}\,x_0^2.$$
As an application of these results, let us suppose it appeared that
the two armies were heading for mutual annihilation. From (3) and (4)
mutual annihilation will occur if
$$K = 0$$
and thus
$$\frac{y_0}{x_0^2} = \frac{b}{2a}. \tag{5}$$
However if the enemy army were to double the number of soldiers it had
initially we see that the home army would only have to increase the size
of its army by a factor of $\sqrt{2}$ to match the enemy army.
Generally, however, the hidden army often has an advantage over the
exposed army. If the hidden army is spread over a very large area A then
the parameter $b = R_x \alpha / A$ is a small number compared with $a = R_y P_y$.
Thus (5) states that the two armies are evenly matched when $x_0$ is large
and $y_0$ is relatively small.
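
The victory conditions above reduce to the sign of K = y0 - (b/2a)x0², so the outcome can be predicted without solving the equations. The sketch below is one possible implementation. The function name `outcome`, the returned survivor counts (found by setting y = 0 or x = 0 in the parabola y = (b/2a)x² + K) and all numerical values are illustrative assumptions.

```python
import math


def outcome(x0, y0, a, b):
    """Predict the winner from the sign of K = y0 - (b / 2a) x0**2.

    Returns (winner, survivors): survivors is the x-intercept of the
    parabola when the home army wins, or its y-intercept when the
    enemy (hidden) army wins.
    """
    K = y0 - b / (2 * a) * x0 ** 2
    if K < 0:
        return "home", math.sqrt(-2 * a * K / b)   # x when y reaches 0
    if K > 0:
        return "enemy", K                           # y when x reaches 0
    return "mutual annihilation", 0.0               # exact tie, rare in floating point


a, b = 0.1, 0.0001                                  # illustrative values only
print(outcome(1000.0, 100.0, a, b))
print(outcome(100.0, 100.0, a, b))

# Evenly matched armies satisfy y0 = (b / 2a) x0**2 ...
x0 = 1000.0
y0 = b / (2 * a) * x0 ** 2
# ... and remain matched if y0 is doubled while x0 grows by a factor sqrt(2).
K_scaled = 2 * y0 - b / (2 * a) * (math.sqrt(2) * x0) ** 2
print(f"K after scaling: {K_scaled:.2e}")
```

Because the balance condition is quadratic in x0 but only linear in y0, doubling the hidden army is offset by a factor of √2 in the exposed army, as noted above.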

Other combat models

Models of combat similar to the above were first devised by F.W. Lanch-
ester. Details may be found in a reprinted article in Newman (1956),
pp. 2138-2157. Additional discussions may be found in Giordano and
Weir (1985) and in Braun (1975).
Other types of combat models include
• hidden armies versus hidden armies,
• exposed armies versus exposed armies.
It is also possible to take account of reinforcements and illness amongst
troops. Some of these ideas are considered in the exercises. One particularly
interesting situation arises when considering an exposed versus
exposed battle, which is discussed in Exercise 5. Here it is shown how an
army which would lose a battle by committing all of its soldiers to the
battle can gain victory by committing half of the army to a first battle
with the whole of the enemy army and then sending the second half to
fight the surviving enemy.
These combat models have been applied to some real battles. Braun
(1975) discusses an exposed versus exposed model incorporating reinforcements
which gives excellent agreement with the battle of Iwo Jima
which occurred during the Second World War.

Exercises 19.3
1. In this exercise a model of combat in which both armies are exposed to gunfire
is to be developed.
(a) Using equation (1) of the text, obtain the coupled differential equations
$$\frac{dx}{dt} = -py \quad\text{and}\quad \frac{dy}{dt} = -qx$$
and specify the constants p and q in terms of the average firing rates
and probability of hitting the target for each army.
(b) Use a phase-plane analysis to determine an inequality relating p and q
such that the home army (x) wins. Do the same for a win to the enemy
army.

2. (Continuation of Exercise 1) If one side was to double their initial number,
by how much should the other side increase their firing rate so that the former
action has no net effect on the outcome of the battle?
3. (Continuation of Exercise 1) Suppose a soldier from the home army fires at a
rate of two shots per minute with a probability of success of 80% and suppose
that the enemy army fires at a rate of three shots per minute with a probability
of success of 70%. If initially there are 1000 home soldiers and 900 enemy, who
wins the battle? How many survivors are there?

4. The equations obtained in Exercise 1 are linear and can thus be solved using
the method of Section 18.2.
(a) Hence find x and y as functions of t, given that initially $x = x_0$ and
$y = y_0$.
(b) By eliminating t from your solutions in (a) find a relation between y and
x. You should thus reproduce the result derived in Exercise 1(b).

5. Suppose that initially there are 80 000 soldiers of the `home' army and 100 000
soldiers of the enemy, and assume that both armies are exposed as in Exercise 1
above, with soldiers on each side being equally effective. (This means that p = q
in the differential equations.)
(a) If all soldiers take part in the battle, show that the enemy army wins the
battle.
(b) Show that the home army can win by first engaging half of its army
against the enemy army and then fighting the other half against the
enemy survivors of the first battle.

6. (a) Argue that a battle with long range artillery will satisfy the differential
equations
$$\dot{x} = -\alpha x y \quad\text{and}\quad \dot{y} = -\beta x y,$$
where $\alpha$ and $\beta$ are positive constants. Express these constants in terms
of other meaningful constants such as firing rates and the number of soldiers
per missile launcher.
(b) Sketch the phase-plane trajectories, and give a condition for one of the
sides to win.

19.4 Epidemics
A mathematical model of a measles epidemic was presented in Chapter 9.
The model was presented as a pair of coupled non-linear difference
equations. The reason for the applicability of difference equations was
the significant latent period between catching the disease and becoming
contagious. If this latent period is small (ideally zero) a model of an
epidemic involving coupled differential equations can be formulated. This
will be done in this section. A phase-plane analysis of the model is left
to the exercises.

The model
For the purposes of formulating our model for the spread of a disease,
the population will be divided into three groups: susceptibles (who are
not immune to the disease), infectives (who are capable of infecting
susceptibles) and removed (who have previously had the disease and may
not be reinfected because they are immune, have been quarantined or
have died from the disease). The symbols S, I and R will be used to
denote the number of susceptibles, infectives and removed, respectively,
in the population at time t.
The following modelling assumptions will be made.
(a) The disease is transmitted by close proximity or contact between an
infective and susceptible.
(b) A susceptible becomes an infective immediately after transmission.
(c) Infectives eventually become removed.
(d) The population of susceptibles is not altered by emigration, immi-
gration, births or deaths.
(e) Each infective infects a constant fraction $\beta$ of the susceptible population
per unit time.
(f) The number of infectives removed is proportional to the number of
infectives present.
As mentioned earlier, it is assumption (b) which makes a formulation
involving differential equations rather than difference equations rele-
vant. Diseases for which this assumption is applicable include diphtheria,
scarlet fever and herpes. Assumption (e) is the same as that used in
Section 9.4. It is valid provided that the number of infectives is small in
comparison to the number of susceptibles.
To set up the differential equations using a $\delta t$ argument, let $\delta S$, $\delta I$ and
$\delta R$ denote the changes in the population of susceptibles, infectives and
removed during a small time interval $\delta t$. By assumptions (a), (c) and (d)
$$\delta S = -\{\text{number of susceptibles infected in time } \delta t\}.$$
By assumptions (a), (b) and (c)
$$\delta I = \{\text{number of susceptibles infected in time } \delta t\} - \{\text{number of infectives removed in time } \delta t\}.$$
By assumptions (a), (c) and (d)
$$\delta R = \{\text{number of infectives removed in time } \delta t\}.$$
But from assumptions (e) and (f)
$$\{\text{number of susceptibles infected in time } \delta t\} = \beta S I\,\delta t,$$
$$\{\text{number of infectives removed in time } \delta t\} = \gamma I\,\delta t.$$
Substituting the last two equations into the previous three, dividing by
$\delta t$ and letting $\delta t \to 0$ gives the coupled differential equations
$$\frac{dS}{dt} = -\beta S I, \tag{1a}$$
$$\frac{dI}{dt} = \beta S I - \gamma I, \tag{1b}$$
$$\frac{dR}{dt} = \gamma I, \tag{1c}$$
where $\beta$ and $\gamma$ are positive constants of proportionality. Here $\beta$ is known
as the infection rate, which governs how fast the disease is spread from
one infective to one susceptible, and $\gamma$ is the removal rate, which governs
how fast infectives are removed (by dying, becoming immune or by being
quarantined).
Equations (1a-c) give three differential equations in three unknowns
S, I and R. Notice, however, that the variable R does not occur in (1a)
or (1b); hence equation (1c) is not coupled to the system (1a-b).
Note also that adding the three equations (1a), (1b) and (1c) gives
$$\frac{d}{dt}(S + I + R) = 0,$$
which implies that for all times
$$S + I + R = N_0, \tag{2}$$
where $N_0$ denotes the initial population. Equation (2) is evident from the
formulae for $\delta S$, $\delta I$ and $\delta R$, which in turn follow from assumptions (b),
(c) and (d) used in the formulation of the model.
The equations (1a) and (1b) can be analysed using the phase-plane
technique, which is to be done in the exercises. An important consequence
of the analysis is the following result:
If the initial number of susceptibles is greater than $\gamma/\beta$ then the number of
infectives will increase and we say that an epidemic has occurred. If it is smaller
than $\gamma/\beta$ then the number of infectives decreases and thus no epidemic occurs.

Thus knowing $\beta$ and $\gamma$ can help decide vaccination strategies: one would
try to decrease the number of susceptibles through vaccination to below
the value $\gamma/\beta$ (called the threshold value).
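
The threshold result can be illustrated by integrating equations (1a) and (1b) numerically for one initial susceptible population below the threshold and one above it. The sketch below uses simple Euler time-stepping, which is an assumed choice made for brevity and is not the method of the text. The function name `simulate_sir` and all parameter values are hypothetical.

```python
def simulate_sir(S0, I0, beta, gamma, dt=0.01, steps=20000):
    """Integrate dS/dt = -beta S I, dI/dt = beta S I - gamma I with Euler steps.

    Returns (peak number of infectives, final S, final I).
    """
    S, I = float(S0), float(I0)
    peak = I
    for _ in range(steps):
        new_infections = beta * S * I * dt
        removals = gamma * I * dt
        S -= new_infections
        I += new_infections - removals
        peak = max(peak, I)
    return peak, S, I


beta, gamma = 0.001, 0.3          # illustrative values only
threshold = gamma / beta          # threshold number of susceptibles
for S0 in (200, 500):             # one value below and one above the threshold
    peak, S_end, I_end = simulate_sir(S0, I0=5, beta=beta, gamma=gamma)
    verdict = "epidemic" if peak > 5 else "no epidemic"
    print(f"S0 = {S0} (threshold {threshold:.0f}): peak I = {peak:.1f} -> {verdict}")
```

When S0 is below the threshold the number of infectives never exceeds its initial value, while above the threshold it first grows and only declines once the susceptibles have fallen below the threshold.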
Alternative versions of the model can be formulated by altering some
of the assumptions (a)-(f). These include allowance for births, diseases
where immunity is not conferred on sufferers with the disease, and
diseases with carriers. Some such models are considered in the exercises.
These models are prototypes for much more complicated stochastic models
which allow for random variation in infectivity and are not restricted to
large populations. Braun (1975) and Edelstein-Keshet (1988) pp. 242-
253 give fairly comprehensive discussions of epidemics. A more advanced
treatment, including stochastic models, is given by Bailey (1975).

Exercises 19.4
1. Consider the equations (1a) and (1b) in the text, which describe the model for
an epidemic with removal.
(a) Eliminate t and solve the resulting differential equation to obtain the
formula
$$I = K - S + \frac{\gamma}{\beta}\ln S,$$
where K is a constant depending on the initial values of I and S.
(b) From your answer to (a), find the maximum of I regarded as a function
of S and sketch some typical trajectories. Thus determine a condition
on $S_0$ for which the number of infectives will keep decreasing with time.
Also give the condition on $S_0$ for which the number of infectives will
initially increase with time. In this instance an epidemic is said to occur.
(c) Find a method to determine the number of susceptibles remaining when
the disease has run its course and the number of infectives is zero.

2. The differential equations

$$\dot{S} = -\beta S I + \lambda S, \qquad \dot{I} = \beta S I - \gamma I$$
model a disease spread by contact. The same assumptions apply as for the
model in the text except that now, in place of assumption (c), we allow birth of
susceptibles.
(a) Identify which terms in the RHS of each differential equation arise from
birth of susceptibles.
(b) Make a correspondence between these differential equations and the
Lotka-Volterra system. Use this correspondence to sketch some phase-
plane trajectories. What are the average values of the number of infec-
tives and the number of susceptibles?

3. Generalize the model of the text to include the possibility of those recovering
from the disease becoming reinfected.

4. With $\gamma = 0$ and thus zero rate of removal, equations (1a) and (1b) of the text
read
$$\dot{S} = -\beta S I \quad\text{and}\quad \dot{I} = \beta S I.$$
(a) Determine, and sketch, the phase-plane trajectories.
(b) Show that
$$\dot{I} = \beta I (N_0 - I),$$
where $N_0 = S_0 + I_0$.
(c) Solve this differential equation, and check that the solution is consistent
with the phase-plane trajectories.
(d) Give some examples of diseases which could be modelled by these
equations.

5. The following set of coupled non-linear differential equations has been used
to model transmission of AIDS (May, Anderson and McLean, 1988). Here X
represents a susceptible population and Y represents those infected with the HIV
virus, and N = X + Y.
$$\frac{dX}{dt} = vN - (\lambda + \mu)X,$$
$$\frac{dY}{dt} = \lambda X - (v + \mu)Y,$$
where $\lambda$ is the probability of acquiring infection from any one partner.
(a) Discuss why it is plausible to write $\lambda = \beta c Y / N$, where $\beta$ is the probability
of acquiring infection from one infected partner and c is the rate at which
new partners are acquired.
(b) Which terms in the equation represent
(i) input of new susceptibles,
(ii) HIV carriers who get AIDS,
(iii) those who become infected with HIV?
(c) Show that the total population N satisfies the differential equation
$$\frac{dN}{dt} = vN - \mu N - vY.$$
References

Chapter 1
Andrade, E.N. da C. (1979) Sir Isaac Newton. Greenwood Press, Westport
Connecticut.
Cohen, I.B. (1987) The Birth of a New Physics. Penguin Books, Middlesex.
Koestler, A. (1958) The Sleepwalkers. Penguin Books, Middlesex.
Westfall, R.S. (1971) Force in Newtonian Physics. Elsevier, New York.

Chapter 4
Halliday, D. and Resnick, R. (1974) Physics Parts I and II. Wiley, New York.

Chapter 7
Gardner, M. (1981) Mathematical Circus. Penguin, Middlesex.

Chapter 8
Archibald, G.C. and Lipsey, R.G. (1973) An Introduction to a Mathematical
Treatment of Economics. 3rd edition. Weidenfeld and Nicolson, London.
Ayres, F. (1963) Theory and Problems of Mathematics of Finance. Schaum, New
York.
Gandolfo, G.C. (1971) Mathematical Methods and Models in Economic
Dynamics. North Holland, Amsterdam.
Goldberg, S. (1958) An Introduction to Difference Equations: with illustrated
examples from Economics, Psychology and Sociology. Wiley, New York.
Kenkel, J.L. (1974) Dynamic Linear Economic Models. Gordon and Breach,
London.
Lipsey, R.G., Langley, P.C. and Mahoney, D.M. (1981) Positive Economics for
Australian Students. Weidenfeld and Nicolson, London.
Pfouts, R.W. (1972) Elementary Economics: a Mathematical Approach. Wiley,
New York.

Chapter 9
Anderson, R. and May, R. (1982) `The logic of vaccination.' New Scientist,
November.
Devaney, R.L. (1986) An Introduction to Chaotic Dynamical Systems.
Benjamin-Cummings, Menlo-Park, California.

Edelstein-Keshet, L. (1988) Mathematical Models in Biology. Random House,
New York.
Gleick, J. (1987) Chaos: Making a New Science. Cardinal, London.
Greenwell, R.N. and Ng, H.K. (1984) `The Ricker Salmon Model' (Unit 653)
UMAP Modules, Comap.
May, R.M. (1975) `Biological Populations obeying Difference Equations: stable
points, stable cycles, and chaos.' Journal of Theoretical Biology, 51, 511-524.
May, R.M. (ed.) (1976) Theoretical Ecology: Principles and Applications.
Blackwell, Oxford.
May, R.M. (1976) `Simple mathematical models with very complicated
dynamics.' Nature, 261, 459-467.
Maynard-Smith, J. (1968) Mathematical Ideas in Biology. Cambridge University
Press.
Stewart, I. (1990) Does God Play Dice? The Mathematics of Chaos. Penguin,
Middlesex.
Tuck, E. and de Mestre N. (1991) Computer Ecology and Chaos: an Introduction
to Mathematical Computing. Longman Cheshire, Melbourne.

Chapter 10
Edelstein-Keshet, L. (1988) Mathematical Models in Biology. Random House,
New York.
Haldane, J.B.S. (1924). `A mathematical theory of artificial and natural selection
- I.' Transactions of the Cambridge Philosophical Society, 23, 19-41.
(Reprinted in the Bulletin of Mathematical Biology, 52, 209-240. 1990.)
Hartl, D.L. (1980). Principles of Population Genetics. Sinauer Associates,
Sunderland, Massachusetts.
Hexter, W. and Yost, H.T. (1976) The Science of Genetics. Prentice-Hall,
Englewood Cliffs, New Jersey.
Maynard-Smith, J. (1968) Mathematical Ideas in Biology. Cambridge University
Press.
Sandefur, J.T. (1968) `Difference Equations in Genetics.' The UMAP Journal, 10,
257-274.

Chapter 11
Braun, M. (1975) Differential Equations and their Applications: an Introduction
to Applied Mathematics. Springer, New York.
Emlen, J.M. (1984) Population Biology: The Coevolution of Population Dynamics
and Behaviour. Macmillan, New York.
Emmel, T.C. (1976) Population Biology. Harper and Row, New York.
Giancoli, D.C. (1985) Physics. 2nd edition. Prentice Hall, Englewood Cliffs, New
Jersey.
Hutchinson, G.E. (1971) An Introduction to Population Ecology. Yale University
Press, Yale.
Keyfitz, N. (1977) Introduction to the Mathematics of Populations.
Addison-Wesley, Reading, Massachusetts.
Kormondy, E.J. (1976) Concepts of Ecology. Prentice-Hall, Englewood Cliffs,
New Jersey.
Marion, J.B. (1976) Physics in the Modern World. Academic Press, Reading,
Massachusetts.
Rubinow, S.I. (1975) An Introduction to Mathematical Biology. Wiley, New York.

Chapter 12
Holman, J.P. (1981) Heat Transfer. McGraw-Hill, New York.
Modelling Heat. Open University Module, Open University Press, Milton
Keynes.

Chapter 13
Rainey, R.H. (1967) `Natural Displacement of Pollution from the Great Lakes.'
Science, 155, 1244.
Modelling Heat. Open University Module, Open University Press, Milton
Keynes.

Chapter 14
Melissinos, A.C. (1966) Experiments in Modern Physics. Academic Press,
Reading, Massachusetts.
Streeter, V.L. (1966) Fluid Mechanics. McGraw-Hill, New York.

Chapter 15
Braun, M. (1975) Differential Equations and their Applications: an Introduction
to Applied Mathematics. Springer, New York.

Chapter 16
Cohen, I.B. (1987) The Birth of a New Physics. Penguin, Middlesex.
De Mestre, N. (1990) The Mathematics of Projectiles in Sport. Cambridge
University Press.

Chapter 17
Toeplitz, O. (1963) Calculus: A Genetic Approach. University of Chicago Press,
Chicago.

Chapter 18
Ackerman, E., Rosevear, J.W. and McGuckin, W.F. (1964) Phys. Med. Biol. 9,
203.
Bolie, V.W. (1960) `Coefficients of normal glucose regulation.' Journal of Applied
Physiology, 16, 783.
Braun, M. (1975) Differential Equations and their Applications: an Introduction
to Applied Mathematics. Springer, New York.
Burghes, D.N. and Borrie (1981) Modelling with Differential Equations. Ellis
Horwood Limited, Chichester.
Edelstein-Keshet, L. (1988) Mathematical Models in Biology. Random House,
New York.
Middleman (1972) Transport Phenomena in the Cardiovascular System.
Wiley-Interscience, New York.
Rodin, E.Y. and Jacques, S. (1989) `Countercurrent oxygen exchange in the
swim bladders of deep sea fish : a mathematical model.' Mathematical and
Computer Modelling, 12, 389.

Chapter 19
Bailey, N.T.J. (1975) The Mathematical Theory of Infectious Diseases and its
Applications. 2nd edition. Charles Griffin and Company Limited.
Braun, M. (1975) Differential Equations and their Applications: an Introduction
to Applied Mathematics. Springer, New York.
Edelstein-Keshet, L. (1988) Mathematical Models in Biology. Random House,
New York.
Giordano, F.R. and Weir, M.D. (1985) A First Course in Mathematical Modeling.
Belmont, California.
Lanchester : chapter in Newman (1956) The World of Mathematics - Volume
Two. Simon and Schuster, New York.
Index

acceleration, 28, 322 constant-coefficient, 127, 131, 295

acceleration, angular, 338, 339 convection, 239
air-resistance, 331 convective heat transfer coefficient, 240
alleles, 178-185 convergence, 120
amortization, 136 corpse, cooling, 237
angular acceleration, 338, 339 countercurrent, 377
angular velocity, 338, 343 coupled non-linear differential equations,
Archimedes' principle, 282 379
Aristotle, 7 critically damped motion, 307
armies, 389
artillery, 394 damping, 302, 304
artillery gun, 308 damping constant, 302
attractor, 172 dashpot, 302
Atwood's machine, 41 death rate, 148
average acceleration, 29 deaths, 147
diabetes, 366
battle, 389, 393 difference equation, 107, 109
birth rate, 148 discrete logistic equation, 153
births, 147 disease, 163, 395
block and tackle, 41 displacement, 319
blood vessels, 374 drag coefficient, 281
breeding season, 145 drag force, 277
buoyancy, 282 drug absorption, 229
dynamics, 11
Carbon-14 dating, 230
carrying capacity, 152, 218 Earth, orbit of, 345
cell division, 147 epidemic, 163, 394, 397
chaotic growth, 159 equilibrium states, 359
characteristic equation, 296 existence-uniqueness, 75
circular motion, 343 exponential decay, 227
cobweb diagrams, 118, 120, 129 exponential growth, 213, 216, 223
cobweb model, 138
compartment, 257, 367 Fibonacci, 105, 108
compartment diagram, 263, 270 fish populations, 379
compartment model, 258, 268 fixed points, 114, 116
computer, 157-163, 166 fluid mechanics, 277
concentration, 229, 258, 357 forcing term, 312, 314
conduction, 241 Fourier, 243
conductivity, 243 Fourier's law, 242, 250
constant solutions, 206 friction, 60-70, 326, 328

friction, kinetic, 62
friction, static, 61

Galileo, 8, 9, 16, 326
genetics, 176
genotype, 178, 180
genotype proportions, 187
glucose, 366, 368
glucose-tolerance test, 368, 381
Great Lakes system, 265
guerilla combat, 390

Hardy-Weinberg law, 191
harmonic motion, 299, 304
harvesting fish, 222
heat and temperature, 237
heat flow, radial, 250
homogeneous, 78, 81, 127, 131, 204, 295, 298
Hooke's law, 86, 87
hot water tank, 270
housing loans, 136
human populations, 224
hydrometer, 284, 311

immune, 396
inclined plane, 326
infective, 164
infectives, 395
inhomogeneous, 205
injection, 229
insulation, 241, 242, 244, 249, 254, 274
insulin, 366, 368
interest, compound, 133, 134
interest, simple, 133, 134
intravenous infusion, 231
iteration, 111, 118

Kepler, 7, 9
Kepler's laws, 9, 348
kinematics, 10, 21-40, 318, 336

Lanchester, 393
latent period, 164
Leibniz's notation, 26
Leonardo of Pisa, 105
lethal recessive gene, 193-200
linear difference equations, 126-145
linear differential equation, 77, 81, 369
linear first-order difference equation, 129
linearization, 173, 348, 351
linearization about a fixed point, 171
logistic growth, 218
Lotka, 382
Lotka-Volterra equations, 382-384, 386-388

measles, 164-170, 394
Mendel, 178
military ballistics, 334
Millikan oil drop experiment, 288
mixing, 257, 356
mixture, 258, 356
mutation, 198

national income, 142
natural frequency, 314
natural length, 85
natural selection, 196
Newton, 13, 232
Newton's law of cooling, 232-234, 239, 252, 271
Newton's laws, 326
Newton's laws of motion, 13
normal vector, 340

overdamped motion, 305

particular solution, 297, 299, 313
pendulum, 349
period and amplitude, 82
phase-plane, 391, 398
phase-shift, 76
placenta, 373
planetary orbits, 9
pollution, 265
position vector, 319, 322
predator, 379-384
prey, 379-384
projectile, 322, 331, 334
pulley, 41

radial heat flow, 250
radioactive decay, 227
random mating, 185-192
repellor, 172
resonance, 314
resonant frequency, 314
Reynolds number, 278, 279, 280

shark populations, 379
small interval approximation, 259, 269
solute, 258
sport, mathematics in, 334
springs, 85-101, 302, 304
stable, 171
stable equilibrium point, 350
steady-state, 114, 154, 167, 171, 206, 369
steady-state conduction, 241
steady-state temperature, 235
stiffness, 87
stochastic models, 397
Stokes' law, 280
superposition, 78, 79, 81
superposition theorems, 127
supply and demand, 138-140
survival fraction, 190
susceptible, 164
susceptibles, 395
symmetry, 57, 58

tangent vector, 323, 340, 341
tension, 41
terminal velocity, 287, 288, 292
Torricelli's law, 269
trajectory, 331
transient, 311, 313
turbidity, 290

underdamped motion, 304
unstable, 172

vaccination, 169, 396
variables separable, 206
variables separable differential equations, 203
vector, tangent, 323
velocity, 322
velocity, angular, 338, 343
velocity-squared drag law, 281
viscosity, 277-279
Volterra, 379, 380, 382

wheel, 68

yeast cells, 212, 213, 220, 225
Printed in the United Kingdom
by Lightning Source UK Ltd.
The real world can be modelled using mathematics, and the construction of
such models is the theme of this book. The authors concentrate on the
techniques used to set up mathematical models and describe many systems in
full detail, covering both differential and difference equations in depth.
Amongst the broad spectrum of topics studied in this book are: mechanics,
genetics, thermal physics, economics and population studies. Any student
wishing to solve problems via mathematical modelling will find that this
book provides an excellent introduction to the subject.

CAMBRIDGE
UNIVERSITY PRESS

ISBN 9780521440691