Mathematical Techniques An Introduction For The Engineering, Physical
FOURTH EDITION
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
by Oxford University Press Inc., New York
© D. W. Jordan and P. Smith, 2008
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)
First edition 1994
Second edition 1997
Third edition 2002
Fourth edition 2008
Reprinted 2010
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organization. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose the same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data available
Typeset by Graphicraft Limited, Hong Kong
Printed in Italy
on acid-free paper by
L.E.G.O S.p.A. – Lavis TN
ISBN 978–0–19–928201–2
3 5 7 9 10 8 6 4 2
Preface to the fourth edition
Supplementary material
The book has an associated Resource Centre at Oxford University Press which is
open access at
www.oxfordtextbooks.co.uk/orc/jordan_smith4e
It includes a Solutions Manual with model solutions of over 3000 end-of-chapter
problems in Mathematical Techniques, and a Computer Program Companion
which lists Mathematica™ programs for use with Chapter 42. It also features
figures from the book in electronic format, for lecturers to use for teaching
purposes.
The supplementary Chapter 42 comprises a list of over 120 projects, following
the text chapter by chapter, which can be used as possible questions to be solved
Acknowledgements
We should like to continue to acknowledge our thanks for help received from
individuals mentioned in previous editions. The development, writing and organization
of textbooks together with colour printing, web-based resource centres,
and associated software has become an increasingly complex process. We wish to
express our appreciation of the helpfulness of the staff at Oxford University Press
during the production of this new edition.
Keele DWJ
March 2008 PS
Brief Contents
2 Differentiation
8 Determinants
8.1 The determinant of a square matrix
8.2 Properties of determinants
8.3 The adjoint and inverse matrices
Problems
CONTENTS
9 Elementary operations with vectors
9.1 Displacement along an axis
9.2 Displacement vectors in two dimensions
9.3 Axes in three dimensions
9.4 Vectors in two and three dimensions
9.5 Relative velocity
9.6 Position vectors and vector equations
9.7 Unit vectors and basis vectors
9.8 Tangent vector, velocity, and acceleration
9.9 Motion in polar coordinates
Problems
18.1 Differential equations and their solutions
18.2 Solving first-order linear unforced equations
18.3 Solving second-order linear unforced equations
18.4 Complex solutions of the characteristic equation
18.5 Initial conditions for second-order equations
Problems
27.1 Sine and cosine transforms
27.2 The exponential Fourier transform
27.3 Short notations: alternative expressions
27.4 Fourier transforms of some basic functions
27.5 Rules for manipulating transforms
27.6 The delta function and periodic functions
27.7 Convolution theorem for Fourier transforms
27.8 The shah function
27.9 Energy in a signal: Rayleigh’s theorem
27.10 Diffraction from a uniformly radiating strip
27.11 General source distribution and the inverse transform
27.12 Transforms in radiation problems
Problems
35 Sets
35.1 Notation
35.2 Equality, union, and intersection
35.3 Venn diagrams
Problems
36 Boolean algebra: logic gates and switching functions
36.1 Laws of Boolean algebra
36.2 Logic gates and truth tables
36.3 Logic networks
36.4 The inverse truth-table problem
36.5 Switching circuits
Problems
39 Probability
39.1 Sample spaces, events, and probability
39.2 Sets and probability
39.3 Frequencies and combinations
39.4 Conditional probability
39.5 Independent events
39.6 Total probability
39.7 Bayes’ theorem
Problems
Part 8 Projects
Appendices
A Some algebraical rules
B Trigonometric formulae
C Areas and volumes
D A table of derivatives
E Table of indefinite and definite integrals
F Laplace transforms, inverses, and rules
G Exponential Fourier transforms and rules
H Probability distributions and tables
I Dimensions and units
Index
Part 1
Elementary methods, differentiation, complex numbers
1 Standard functions and techniques
This is a long chapter covering a variety of subjects, some of which you will have
met before. It is not necessary to work through every section in detail; to a large
extent the chapter can be used for reference as required later on. However, you
should read it carefully in order to find what is in it, and to pick up terms and
notations used regularly in the rest of the book. If you find that a familiar subject
is treated in an unfamiliar way, try to understand the fresh approach since the
ideas behind it are liable to reappear in later chapters.
2. A rational number is any number that can be expressed as a fraction having the
form p/q, where p and q are integers. They consist of all numbers expressible
as finite or recurring decimals. Examples of rational numbers with recurring
decimals are 1/3, which has the decimal representation 0.3333… written as
$0.\dot{3}$, and 1/7, which has the recurring decimal form $0.\dot{1}42\,85\dot{7}$ (the dots mark out
the decimal repetition pattern). Notice that the integers are rational numbers
in this definition.
3. The rest are irrational numbers. These are the numbers that cannot be
expressed as fractions made up of integers; they are represented by infinite,
non-recurring decimals. Although there is an infinite number of rational numbers,
there is a sense in which there are infinitely more irrational numbers, so
they appear everywhere. For example, the hypotenuse of a right-angled triangle
with sides of unit length has length √2, and this is known to be an irrational
number. The number π is irrational, and so is the number e which we will meet
in Section 1.8. Irrational numbers can be approximated as closely as we wish
by rational numbers: retain the appropriate number of decimal places, and
The notations $a^{1/2}$ and $\sqrt{a}$ always stand for the positive square root of a. If we
want the negative square root we must attach a minus sign. Thus the solutions of
the equation $x^2 = 2$ are written separately as $\sqrt{2}$ and $-\sqrt{2}$, or as $\pm\sqrt{2}$.
The condition a > 0 is necessary if the rules are to apply to all exponents; for
example, $(-2)^{1/2}$ has no meaning in real-number terms since the square of any real
number is never negative. If a is negative, then sometimes $a^x$ is a real number,
but only if x = p/q in its lowest terms where p and q are integers and q is an odd
integer. For example, $(-8)^{1/3} = -2$, because $(-2)^3 = -8$.
Self-test 1.1
The number x satisfies the inequalities 2 x 4 and | x | 3. Expressed as
a single expression, what values can x take?
typically labelled x and y, at right angles, meeting at the common origin O. Axes
are right handed if, when we walk along the x axis in the direction of increasing
x, the positive y axis is on our left. If you look at Fig. 1.2 in a mirror you will see
left-handed axes.
Fig. 1.2 Axes and coordinates. Points shown: P : (x, y), A : (1.5, 1), B : (2.5, −2), C : (−2, −1.5).
The position of a point is determined by two coordinates (x, y). They represent,
in order, the signed ‘distances’ of the point from the y and x axes respectively,
as read off from the numbers on the axis scales. In Fig. 1.2 the point A has
coordinates x = 1.5, y = 1. We shall use the notation A : (1.5, 1) for such a point,
so as to display the name of the point together with its coordinates. On Fig. 1.2 we
also show the point B : (2.5, −2), and a general point P : (x, y). For a point
P : (x, y), x is called the abscissa and y the ordinate of P.
In Fig. 1.3a, the x and y scales are supposed to be equal, so that the distance of
P : (x, y) from the origin is OP. By Pythagoras’s theorem,
$$OP = \sqrt{OU^2 + UP^2},$$
where U is the base of the perpendicular from P on to the x axis (Fig. 1.3a). If we
put OP = r, then
$$r = \sqrt{x^2 + y^2}. \tag{1.3a}$$
Note that distances, such as OP and r, are always counted as positive numbers.
Similarly, from Fig. 1.3b, for any two points $P_1 : (x_1, y_1)$ and $P_2 : (x_2, y_2)$ in the
plane, the distance $P_1P_2$ between them is given by
$$P_1P_2 = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}. \tag{1.3b}$$
Fig. 1.3 (a) The point P : (x, y) with OP = r and U on the x axis. (b) The points $P_2 : (x_2, y_2)$ and $U : (x_1, y_2)$.
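The distance formulae (1.3a) and (1.3b) translate directly into a short program. The following is a minimal sketch in Python; the function name `distance` and the sample points are our own illustrative choices, not part of the text.

```python
import math

def distance(p1, p2):
    """Distance between p1 = (x1, y1) and p2 = (x2, y2), as in eqn (1.3b)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

origin = (0.0, 0.0)
# Taking one point at the origin recovers r = sqrt(x^2 + y^2), eqn (1.3a).
print(distance(origin, (3.0, 4.0)))
print(distance((1.0, 1.0), (4.0, 5.0)))
```

The result is always non-negative, in line with the convention that distances are counted as positive numbers.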
Self-test 1.2
Find the distances between the points A : (1, 2), B : (2, −3) and C : (7, −2).
Confirm that $AC^2 = AB^2 + BC^2$. What can you deduce about the angle ABC?
1.3 Graphs
If x and y are connected by an equation, then this relation can be represented by
a curve or curves in the (x, y) plane which is known as the graph of the equation
(often known as a cartesian equation in this context).
x    −3   −2   −1    0    1    2    3
y   −27   −8   −1    0    1    8   27
We then plot the points corresponding to this set of coordinates and draw a smooth
curve through them as shown in Fig. 1.4. The greater the number of values of x in the
interval, the greater is the reliability of the graph. It is assumed that the curve has a
smooth or regular behaviour between consecutive plotted points. In Fig. 1.4 the scales
are not the same on the two axes. Since y has a much greater spread of values (54)
compared with x (6), the vertical scale has been compressed to give a convenient picture.
Generally, unequal scales distort lengths and angles.
Fig. 1.4 Graph of $y = x^3$: note that the x and y scales are unequal.
Fig. 1.5 The line through A : (−1, −1) and B : (1, 2), with a general point P : (x, y) and the points Q and C.
Example 1.2 Find the equation of the straight line through the points
A : (−1, −1) and B : (1, 2).
The line is shown in Fig. 1.5. Let P : (x, y) be any point on the line. PAQ and BAC are
similar triangles, so that
$$\frac{QP}{AQ} = \frac{y + 1}{x + 1} = \frac{CB}{AC} = \frac{3}{2}.$$
Therefore
$$2(y + 1) = 3(x + 1) \quad\text{or}\quad y = \tfrac{3}{2}x + \tfrac{1}{2}.$$
This is the equation of the straight line through the points (1, 2) and (−1, −1).
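The similar-triangles argument of Example 1.2 amounts to computing a slope and then an intercept. As a hedged sketch (the helper `line_through` is a hypothetical name of our own, and it assumes integer coordinates and a non-vertical line), the calculation can be carried out exactly with fractions:

```python
from fractions import Fraction

def line_through(a, b):
    """Return (m, c) for the line y = m*x + c through the points a and b.

    Assumes integer coordinates and a non-vertical line.
    """
    (x1, y1), (x2, y2) = a, b
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    m = Fraction(y2 - y1, x2 - x1)   # slope, the ratio CB/AC in Example 1.2
    c = y1 - m * x1                  # intercept, from y1 = m*x1 + c
    return m, c

m, c = line_through((-1, -1), (1, 2))
print(m, c)   # slope and intercept as exact fractions
```

Exact fractions are used so that the slope and intercept can be compared directly with the hand calculation.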
From (1.7), if we require the line through $A : (x_1, y_1)$ with given slope m, its
equation is
$$y - y_1 = m(x - x_1). \tag{1.8}$$
Consider the lines $y = m_1 x$ and $y = m_2 x$ through the origin (these are parallel to
the original lines). Extend them into the upper half-plane y > 0. If the two extensions
both lie in the same quadrant (the first, x > 0, or the second, x < 0), then
the angle between them is clearly less than 90°, and also $m_1 m_2 > 0$. Therefore we
need only investigate the remaining case, of branches lying in different quadrants
as shown in Fig. 1.7.
In y > 0, A and B are any points on $y = m_1 x$ and $y = m_2 x$ respectively. Construct
AP and BQ perpendicular to the x axis. The acute angles α, β are both repres-
A circle consists of all points which are a constant distance from a given point.
In Fig. 1.8, the circle has radius r, and its centre is at (a, b). The point P : (x, y)
represents any point on the circle. Equation (1.3b) for the distance between
two points gives
$$\sqrt{(x - a)^2 + (y - b)^2} = r.$$
Square this expression to get rid of the square root, and we have the equation of a
circle in its standard form in eqn (1.10).
Fig. 1.8 A circle with centre (a, b) and a point P : (x, y) on it.
$$4x^2 + 4y^2 - 4x + 8y - 11 = 0. \tag{i}$$
To convert (i) to the form (1.10), rewrite it in the form
$$x^2 - x + y^2 + 2y = \tfrac{11}{4}. \tag{ii}$$
Write the terms in x as
$$x^2 - x = (x - \tfrac{1}{2})^2 - \tfrac{1}{4}$$
(this process is used in many different contexts, and is called completing the square).
Treat the terms in y similarly:
$$y^2 + 2y = (y + 1)^2 - 1.$$
Replace the terms in (ii) by the new forms; we get
$$(x - \tfrac{1}{2})^2 - \tfrac{1}{4} + (y + 1)^2 - 1 = \tfrac{11}{4} \quad\text{or}\quad (x - \tfrac{1}{2})^2 + (y + 1)^2 = 4.$$
Therefore the centre is at $(\tfrac{1}{2}, -1)$, and the radius is 2.
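Completing the square for a circle can be mechanized. The sketch below assumes an equation written as $Ax^2 + Ay^2 + Dx + Ey + F = 0$; the coefficient names A, D, E, F and the function name are our own labels, introduced only for illustration.

```python
import math

def circle_centre_radius(A, D, E, F):
    """Centre and radius of A*x**2 + A*y**2 + D*x + E*y + F = 0, with A != 0."""
    # Divide through by A, then complete the square in x and in y separately.
    d, e, f = D / A, E / A, F / A
    centre = (-d / 2, -e / 2)
    r_squared = (d / 2) ** 2 + (e / 2) ** 2 - f
    if r_squared < 0:
        raise ValueError("no real circle: the squared radius is negative")
    return centre, math.sqrt(r_squared)

# The circle 4x^2 + 4y^2 - 4x + 8y - 11 = 0 treated in the text.
centre, radius = circle_centre_radius(4, -4, 8, -11)
print(centre, radius)
```

The check on the sign of the squared radius matters because not every equation of this shape represents a real circle.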
Fig. 1.9 (a) An ellipse $x^2/a^2 + y^2/b^2 = 1$. (b) A hyperbola $x^2/a^2 - y^2/b^2 = 1$, with the lines $y = \pm bx/a$. (c) A parabola $y = x^2$.
The list of second-degree curves is completed by the parabola, which has the
standard form $y = x^2$ (Fig. 1.9c).
These curves, the ellipse, hyperbola and parabola, are known as conic sections
since they can be constructed as plane sections of a cone.
Self-test 1.3
Find the radii and centres of the circles
$x^2 + y^2 - 2x = 1$, $x^2 + y^2 - 4x - 2y = -1$.
Find the coordinates of their points of intersection.
1.4 Functions
The area A of a circle depends on its radius r, and the dependence is expressed in
the formula $A = \pi r^2$. In general, suppose that the values of a certain independent
variable x, say, determine the values of a dependent variable y in such a way that
y = f(x), y = g(x),
and so on, where the letters f, g, etc., can be used to distinguish different forms of
dependence which can be thought of pictorially in terms of different graphs. The
letters f, g, and so on, standing alone, need not be associated with a formula
in the usual sense. They can stand for any rule, program, or calculation process
which produces a definite single value for y when we offer a number x to it.
A function can be thought of as an input–output device as in Fig. 1.10. In y = f(x),
x is called the independent variable and y the dependent variable.
Now suppose that the input is not simply x, but another function of x; say 2x.
For example we might be plotting the graph of y = f(2x) where f is the sine function,
sin, and x is the independent variable. We then speak of 2x as being the
argument of f. We shall see many instances of this usage.
Functions can be defined implicitly by means of formulae. For example,
$$x^2 + y^2 = 1$$
represents a circle, centre the origin and radius 1. But if we solve the equation
for y, we obtain $y = \pm\sqrt{1 - x^2}$, which is not a single function, but two separate,
single-valued, functions
$$y = \sqrt{1 - x^2} \quad\text{and}\quad y = -\sqrt{1 - x^2},$$
representing the upper and lower semicircles which together make up the circle.
The following result is frequently required. Suppose that c is a positive constant,
and we are given a function f, with graph y = f(x). The graph y = f(x − c) is
exactly the same as that of f(x), except that it is moved, or translated, a distance c
to the right along the x axis. There is a similar result for f(x + c), the movement
being to the left. Therefore, for c > 0,
the graph of y = f(x − c) is the graph of y = f(x) translated a distance c to the right;
the graph of y = f(x + c) is the graph of y = f(x) translated a distance c to the left. (1.11)
Thus $y = x^2$ and $y = (x + 2)^2$ have the same shape, but the second is a distance 2 to
the left of the first.
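Translation of a graph can be illustrated by building the shifted function from the original one. This is only a sketch: `translate` and `square` are hypothetical helper names of our own, and a negative shift is used to represent a movement to the left.

```python
def translate(f, c):
    """Return g with g(x) = f(x - c): the graph of f moved a distance c to the right."""
    return lambda x: f(x - c)

def square(x):
    return x ** 2

# y = (x + 2)^2 is y = x^2 moved 2 to the left, i.e. translated by c = -2.
shifted = translate(square, -2)

for x in (-4, -3, -2, -1, 0):
    print(x, square(x), shifted(x))
```

The lowest point of $y = x^2$ is at x = 0, and the printed values show the lowest point of the shifted curve at x = −2.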
Sometimes it is helpful to adopt a more formal way of presenting a function. For
example, instead of putting simply $f(x) = \sqrt{1 - x^2}$ we may say
the function f defined, for −1 ≤ x ≤ 1, by $f(x) = \sqrt{1 - x^2}$,
or, by changing the independent variable from x to t,
the function f defined, for −1 ≤ t ≤ 1, by $f(t) = \sqrt{1 - t^2}$,
which has exactly the same meaning. Any letter may be used as the independent
variable to specify the formula or rule that f symbolizes; it is sometimes called a
dummy variable for this reason. When we call on the function f in the course of a
particular problem, we then revert to the symbols that are natural to the problem:
we might want f(r) or $f(x^2)$ or f(x − y) or just a single value f(2π). In these
examples, the symbols r, $x^2$, x − y, and 2π are the arguments of the function f. For
example, if a function g is defined by
$$g(t) = (1 - t)^2 \quad\text{for all values of } t,$$
then, with the new argument 1 − x,
$$g(1 - x) = [1 - (1 - x)]^2 = x^2,$$
or, with argument $t^2$,
$$g(t^2) = (1 - t^2)^2.$$
It is useful to have terms in which symmetry of a graph can be described. For
example, the graph of the parabola $y = x^2$ shown in Fig. 1.9c is symmetrical about
the y axis; the two halves for x > 0 and x < 0 are reflections of each other in the
y axis. Functions with such graphs are called even functions. On the other hand $y = x^3$
(Fig. 1.4) is its own reflection in the origin: the function $f(x) = x^3$ is an example of
an odd function. The corresponding algebraic properties are defined by
$$f(-x) = f(x) \ \text{(even function)}, \qquad f(-x) = -f(x) \ \text{(odd function)}. \tag{1.12}$$
For example, in plotting $y = f(x) = x^3$ in Example 1.1, we did not really have to
calculate $x^3$ for negative values of x. All that was necessary was to notice that $x^3$
is an odd function since $(-x)^3 = -(x^3)$, and this gives the table for negative x by
changing the sign of the entries for x positive.
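The saving described here can be sketched in a few lines. The helper names below are our own, and the symmetry test only samples a few points, so it is a numerical check rather than a proof.

```python
def is_even(f, samples):
    """True if f(-x) == f(x) at every sample point (a check, not a proof)."""
    return all(f(-x) == f(x) for x in samples)

def is_odd(f, samples):
    """True if f(-x) == -f(x) at every sample point (a check, not a proof)."""
    return all(f(-x) == -f(x) for x in samples)

def cube(x):
    return x ** 3

positive_x = [1, 2, 3]
table = {0: cube(0)}
for x in positive_x:
    table[x] = cube(x)
    table[-x] = -table[x]   # oddness supplies the entry for -x without recomputing

print(is_odd(cube, positive_x), is_even(lambda x: x ** 2, positive_x))
print(sorted(table.items()))
```

Only the entries for x = 0, 1, 2, 3 are actually calculated; the rest follow from the sign change.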
Some functions of practical significance have graphs that are not entirely
smooth. For example, we may wish to model a device that is turned on at a given
time, being quiescent before that time but active afterwards. A sudden change
in the state of the device can be represented by a function which has a jump or
discontinuity in its graph at the critical moment. The basic building block for
functions with a jump is the unit step function H(t) (also known as the Heaviside
function after its inventor, and sometimes denoted by U(t)) which we shall define,
using t to represent time, by
$$H(t) = \begin{cases} 0 & \text{when } t < 0, \\ 1 & \text{when } t \ge 0 \end{cases} \tag{1.13}$$
(see Fig. 1.11a).
If switch-on is required at t = t0 then we can use the translation
$$H(t - t_0) = \begin{cases} 0 & \text{when } t < t_0, \\ 1 & \text{when } t \ge t_0, \end{cases}$$
shown in Fig. 1.11b: it is the same graph translated to the right a distance $t_0$
by (1.11).
Fig. 1.11 (a) y = H(t). (b) $y = H(t - t_0)$.
Fig. 1.12
Hence for
t < −1:       f(t) = 1 + 0 − 1 = 0;
−1 ≤ t ≤ 2:   f(t) = 1 + 1 − 1 = 1;
t > 2:        f(t) = 0 + 1 − 1 = 0.
The graph is shown in Fig. 1.12.
This function would switch a device on at t = −1 and switch it off at t = 2.
$$\operatorname{sgn} t = H(t) - H(-t) = \begin{cases} -1 & \text{when } t < 0, \\ 0 & \text{when } t = 0, \\ 1 & \text{when } t > 0, \end{cases}$$
is called the signum function (from the Latin signum meaning ‘sign’, used to avoid
verbal confusion with the trigonometric sine). Its graph is shown in Fig. 1.13a.
H(t) and sgn t can be used along with other functions to produce a variety of
functions having discontinuities in either value or direction at assigned points.
Figures 1.13b,c show the even functions y = t sgn(t) and $y = \operatorname{sgn}(1 - t^2)$. Note that
$\operatorname{sgn}(1 - t^2)$ has discontinuities where $1 - t^2 = 0$; that is, where t = ±1.
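The step and signum functions are easy to experiment with numerically. The sketch below assumes the convention H(0) = 1 for the value at the jump; that choice is an assumption of ours, made so that sgn 0 = H(0) − H(−0) = 0 as required.

```python
def H(t):
    """Unit step function; taking H(0) = 1 is an assumed convention."""
    return 1 if t >= 0 else 0

def sgn(t):
    """Signum function built from the step function: sgn t = H(t) - H(-t)."""
    return H(t) - H(-t)

# Columns: t, sgn t, t sgn t (which equals |t|), and sgn(1 - t^2).
for t in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
    print(t, sgn(t), t * sgn(t), sgn(1 - t ** 2))
```

The last column changes value at t = ±1, which is where the discontinuities of sgn(1 − t²) were located above.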
Fig. 1.13 (a) y = sgn(t). (b) y = t sgn(t). (c) $y = \operatorname{sgn}(1 - t^2)$.
Self-test 1.4
Sketch the graph of
y = [H(t + 1) + H(1 − t) − 1] t sgn t
For everyday purposes angles are measured in degrees, so we are still following
the Babylonian practice of dividing the circle into 360 sectors each of which
subtends a degree (1°). For mathematical purposes, a less arbitrary measure is
desirable. The absolute unit is the radian, which represents about 57°. The special
property which makes the unit valuable is its connection with length.
Figure 1.14 shows a circle of radius R with a sector AOB containing an angle θ.
The length of the arc AB is proportional to R, and it is proportional to θ whatever
the angular units, so it is proportional to the product Rθ. One radian is the unit
of angle such that the length of the arc AB is numerically equal to Rθ.
Fig. 1.14
The radian measure is not just a matter of convention, like measuring lengths
in metres rather than feet. An angle θ of 30° is equal to an angle of $\tfrac{1}{6}\pi$
radians, so θ subtends an arc of length $R\theta = \tfrac{1}{6}\pi \times R$ on the circumference of a
circle of radius R; not 30 × R, which would be ridiculous. This observation
has consequences in other places. For example, if you have already learned some
calculus you might know that (d/dθ )sin θ = cos θ, and that if θ is small enough,
sin θ is approximately equal to θ. But neither result is true unless θ is measured
in radians.
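These remarks can be checked numerically. The sketch below uses Python's standard `math.radians` for the conversion; the helper `arc_length` and the sample values are our own.

```python
import math

def arc_length(radius, angle_degrees):
    """Arc length R*theta, with theta converted from degrees to radians first."""
    return radius * math.radians(angle_degrees)

R = 2.0
print(arc_length(R, 30))      # equals (pi/6) * R, not 30 * R
print((math.pi / 6) * R)

# For a small angle measured in radians, sin(theta) is close to theta.
theta = 0.01
print(math.sin(theta), theta)

# The same number taken as degrees gives something quite different.
print(math.sin(math.radians(0.01)))
```

The final line shows why the small-angle approximation fails if the angle is measured in degrees.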
We assume that you know the meanings of sine, cosine, and tangent for positive
acute angles, as in ordinary trigonometry. We shall extend their meaning to
The trigonometric functions sine, cosine, and tangent for all angles are defined
by the construction in Fig. 1.16, in which P : (x, y) is any point, and θ is treated as
a polar angle. The length OP is given by
$$OP = r = \sqrt{x^2 + y^2} > 0.$$
Fig. 1.16 Diagram for cos θ, sin θ, tan θ.
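Then the definitions of the trigonometric functions for arbitrary angles θ are as
follows:
$$\cos\theta = \frac{x}{r}, \qquad \sin\theta = \frac{y}{r}, \qquad \tan\theta = \frac{y}{x}. \tag{1.15}$$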
These definitions are extensions, to all four quadrants of the (x, y) plane, of the
familiar geometrical meanings in the first quadrant. The length r is positive, but
the coordinates x and y are signed quantities which determine the signs of the
trigonometric functions sin θ, cos θ, tan θ in the four quadrants. The following
lists the ones which are positive:
1st quadrant, x > 0, y > 0: all are > 0.
2nd quadrant, x < 0, y > 0: sin θ > 0.
3rd quadrant, x < 0, y < 0: tan θ > 0.
4th quadrant, x > 0, y < 0: cos θ > 0.
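The quadrant rules follow at once from the ratios x/r, y/r and y/x. As an illustrative sketch (the function `trig_from_point` is a hypothetical helper of our own), we can compute the three ratios for one point in each quadrant and compare them with the library functions evaluated at the polar angle given by `math.atan2`:

```python
import math

def trig_from_point(x, y):
    """Return (cos, sin, tan) of the polar angle of P : (x, y) as x/r, y/r, y/x."""
    r = math.hypot(x, y)
    if r == 0:
        raise ValueError("the origin has no polar angle")
    tan = y / x if x != 0 else None   # tan is undefined on the y axis
    return x / r, y / r, tan

# One sample point in each quadrant, in order 1st to 4th.
for x, y in [(3, 4), (-3, 4), (-3, -4), (3, -4)]:
    c, s, t = trig_from_point(x, y)
    theta = math.atan2(y, x)          # one valid polar angle of the point
    print((x, y), c, s, t, math.isclose(c, math.cos(theta)))
```

Reading down the printed columns shows which of the three functions is positive in each quadrant.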
Example 1.5 Obtain (a) $\sin\tfrac{1}{3}\pi$, (b) $\tan\tfrac{1}{6}\pi$, (c) $\cos\tfrac{1}{4}\pi$. (The angles are in radians.)
Equation (1.14a) gives the angles in degrees. Use the triangles in Fig. 1.17.
(a) $\sin\tfrac{1}{3}\pi = \sin 60^\circ = \sqrt{3}/2$.
(b) $\tan\tfrac{1}{6}\pi = \tan 30^\circ = 1/\sqrt{3}$.
(c) $\cos\tfrac{1}{4}\pi = \cos 45^\circ = 1/\sqrt{2}$.
Fig. 1.17
Example 1.6 Obtain (a) $\cos 2\pi$, (b) $\sin\tfrac{3}{2}\pi$, (c) $\sin(-\tfrac{3}{4}\pi)$.
The points P on Fig. 1.18 have OP = r = 1, and polar angles equal to the angles given in
the question. The x, y coordinates of P are easy to find.
(a) P is at (x, y) = (1, 0), so that $\cos 2\pi = x/r = 1$.
(b) P is at (x, y) = (0, −1), so that $\sin\tfrac{3}{2}\pi = y/r = -1$.
(c) P is at $(x, y) = (-1/\sqrt{2}, -1/\sqrt{2})$, so that $\sin(-\tfrac{3}{4}\pi) = y/r = -1/\sqrt{2}$.
Fig. 1.18 (a) $\theta = 2\pi$. (b) $\theta = \tfrac{3}{2}\pi$. (c) $\theta = -\tfrac{3}{4}\pi$.
Fig. 1.19 Graphs of y = cos θ and y = sin θ.
The graphs of cos θ and sin θ are shown in Fig. 1.19. Observe the following:
1. The curves for cos θ and sin θ have identical shape, but are displaced a distance
$\tfrac{1}{2}\pi$ (radians) from each other. They are related by
$$\cos\theta = \sin(\theta + \tfrac{1}{2}\pi).$$
2. The functions cos θ and sin θ are said to be periodic, with period (or wave-
length) equal to 2π; that is, the curves repeat themselves at intervals of length 2π.
This is evident from the definition of a polar angle, because in terms of the
polar angle of a point P, an increase or decrease of 2π radians is equivalent to
a complete revolution.
3. cos θ is an even function (see eqn (1.12)), so that cos(−θ ) = cos θ; sin θ is an odd
function, so that sin(−θ ) = −sin θ.
4. The values taken by cos θ and sin θ oscillate between −1 and +1.
The graphs of tan θ, cot θ, sec θ, and cosec θ are shown in Fig. 1.20.
There are many trigonometric identities in common use. The following are
some of the more important (a more extensive list is given in Appendix B):
Fig. 1.20 (a) y = tan θ. (b) y = cot θ. (c) y = sec θ. (d) y = cosec θ.
Trigonometric identities
For all angles A and B:
(a) Sums and differences of angles
$$\sin(A \pm B) = \sin A\cos B \pm \cos A\sin B,$$
$$\cos(A \pm B) = \cos A\cos B \mp \sin A\sin B,$$
$$\tan(A \pm B) = \frac{\tan A \pm \tan B}{1 \mp \tan A\tan B}.$$
(b) Products as sums and differences
$$\cos A\cos B = \tfrac{1}{2}[\cos(A + B) + \cos(A - B)],$$
$$\cos A\sin B = \tfrac{1}{2}[\sin(A + B) - \sin(A - B)],$$
$$\sin A\sin B = \tfrac{1}{2}[-\cos(A + B) + \cos(A - B)].$$
(c) Double angles
$$\cos^2 A + \sin^2 A = 1,$$
$$\sin 2A = 2\sin A\cos A,$$
$$\cos^2 A = \tfrac{1}{2}(1 + \cos 2A),$$
$$\sin^2 A = \tfrac{1}{2}(1 - \cos 2A).$$
(d) Cosine rule. In a triangle with side lengths a, b, c and opposite angles A,
B, C,
$$c^2 = a^2 + b^2 - 2ab\cos C.$$
(e) Sine rule. In a triangle with side lengths a, b, c and opposite angles A, B, C,
$$\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c}.$$
(Since the identities are the same as for positive acute angles we do not prove
them here.) (1.17)
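Although the identities are not proved here, it is reassuring to test them numerically for angles that are not positive and acute. The following sketch checks a selection of them at two arbitrarily chosen angles; agreement at sample values is, of course, a check and not a proof.

```python
import math

A, B = 0.7, -1.9   # arbitrary test angles in radians, one of them negative
sin, cos, tan = math.sin, math.cos, math.tan

checks = {
    "sin(A + B)":  (sin(A + B), sin(A) * cos(B) + cos(A) * sin(B)),
    "cos(A + B)":  (cos(A + B), cos(A) * cos(B) - sin(A) * sin(B)),
    "tan(A - B)":  (tan(A - B), (tan(A) - tan(B)) / (1 + tan(A) * tan(B))),
    "cos A sin B": (cos(A) * sin(B), 0.5 * (sin(A + B) - sin(A - B))),
    "sin A sin B": (sin(A) * sin(B), 0.5 * (-cos(A + B) + cos(A - B))),
    "cos^2 A":     (cos(A) ** 2, 0.5 * (1 + cos(2 * A))),
    "sin^2 A":     (sin(A) ** 2, 0.5 * (1 - cos(2 * A))),
}
results = {name: math.isclose(lhs, rhs, abs_tol=1e-12)
           for name, (lhs, rhs) in checks.items()}
print(results)

# Cosine rule for a right-angled triangle with a = 3, b = 4, C = pi/2.
c_side = math.sqrt(3 ** 2 + 4 ** 2 - 2 * 3 * 4 * cos(math.pi / 2))
print(c_side)
```

A tolerance is needed in the comparisons because the two sides of each identity are computed by different sequences of floating-point operations.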
We have so far encountered cos θ, sin A, etc., in which θ and A are understood
to represent certain angles arising in a geometrical context, but trigonometric
functions are used in many applications which have nothing directly to do with
angles. For example, expressions such as cos ωt will occur, in which t stands for
We then obtain
a cos u + b sin u = c(cos φ cos u − sin φ sin u) = c cos(u + φ)
from (1.17a) with u and φ in place of A and B. We have obtained the identity:
Harmonic functions
a cos u + b sin u = c cos(u + φ),
where c and φ are polar coordinates of the point (a, −b) in cartesian axes. (1.18)
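The identity (1.18) gives a practical recipe: find the polar coordinates of the point (a, −b). The sketch below does this with `math.hypot` and `math.atan2`; the function name `amplitude_phase` and the sample values a = 3, b = 4 are our own.

```python
import math

def amplitude_phase(a, b):
    """Return (c, phi) such that a*cos(u) + b*sin(u) = c*cos(u + phi)."""
    c = math.hypot(a, b)       # distance of (a, -b) from the origin
    phi = math.atan2(-b, a)    # a polar angle of the point (a, -b)
    return c, phi

a, b = 3.0, 4.0
c, phi = amplitude_phase(a, b)
print(c, phi)

# Compare the two sides of the identity at a few values of u.
for u in (0.0, 0.4, 2.0, -1.3):
    print(a * math.cos(u) + b * math.sin(u), c * math.cos(u + phi))
```

Using `atan2` rather than an arctangent of a ratio places φ in the correct quadrant for any signs of a and b.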
A function having the form A cos(ku + α ), where A, k, and α are any constants,
Self-test 1.5
Find numerical formulas for $\cos\tfrac{1}{12}\pi$ and $\sin\tfrac{1}{12}\pi$.
Let y = f(x), where f is a given function. It is often necessary to find a value of x
corresponding to a given value of y, which amounts to solving a certain equation.
For example, if f is defined by
$$y = f(x) = x^3 \quad\text{and}\quad y = 8,$$
then the resulting equation, $8 = f(x) = x^3$, is solved uniquely by $x = 8^{1/3} = 2$. In this
case there is a unique value of x corresponding to every value of y, positive or
negative. These x values each depend on y, so we say that x is a function of y.
Denoting this function by F, we can write
$$x = F(y) = y^{1/3}.$$
The function F is called the inverse function of f. Substituting each relation into the other gives
$$F\{f(x)\} = F(y) = (x^3)^{1/3} = x, \quad\text{for every value of } x,$$
and
$$f\{F(y)\} = f(x) = (y^{1/3})^3 = y, \quad\text{for every value of } y.$$
Initially, the values of x and y were connected through the relation y = f(x), but in
the final form of these equations:
F{f(x)} = x, for every value of x,
f{F(y)} = y, for every value of y.
We may use any letters to indicate the variables in place of x and y. In fact, these
two equations are identities (see Section 1.4). For example, if we were concerned
with an application involving an angle θ it might be convenient to write the first
one in the form
F{f(θ )} = θ, for every value of θ,
or we could substitute x for y in the second equation, without changing their
meaning.
Returning to the original problem, if we know the inverse function F we can
solve the equation
c = f(x)
by using the first reciprocal relation F{ f(x)} = x. Taking the inverse F of both
sides of the equation, we obtain
F(c) = F{f(x)} = x,
so that the required value of x is F(c).
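The reciprocal relations can be tried out for the cube and cube-root pair. The sketch below is ours; note in particular that the real cube root of a negative number has to be handled through its sign, because raising a negative float to the power 1/3 in Python does not return the real root.

```python
import math

def f(x):
    return x ** 3

def F(y):
    """Real cube root, the inverse of f, valid for negative y as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

# Solve c = f(x) by applying the inverse function to both sides: x = F(c).
c = 8.0
print(F(c))

# The two reciprocal relations, checked at a few sample values.
for v in (-2.5, -1.0, 0.0, 0.3, 4.0):
    print(v, F(f(v)), f(F(v)))
```

The printed values agree with v only up to rounding error, since the cube root is computed in floating-point arithmetic.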
We shall now give a geometrical description of the operations we have just gone
through. Figure 1.23a is the graph of $y = f(x) = x^3$. Choose any number a. To find
$a^3$, locate x = a at A and follow the track ABC. The point C on the y axis represents
$y = f(a) = a^3$. Now choose a number b with the aim of obtaining $b^{1/3}$. Read the
graph backwards: locate y = b at U and follow the track UVW. Then W represents
$b^{1/3}$. Therefore the same curve generates values of $x^3$ and also of its inverse function
$F(x) = x^{1/3}$. The two identities given above amount to the obvious fact that if we
follow the tracks ABCBA and CBABC respectively in Fig. 1.23a, then we arrive
back at the starting point in each case.
Fig. 1.23 (a) The function $f(x) = x^3$. (b) The function $f(x) = x^3$ and its inverse $F(x) = x^{1/3}$: since the
scales are equal, f(x) and F(x) reflect each other in the 45° line (see e.g. P : (a, b) and Q : (b, a)).
In order to obtain cube roots, we might prefer a graph from which the cube
root can be read off in the ordinary way: from a horizontal x axis to a vertical
y axis. Suppose that we plot the curves $y = f(x) = x^3$, and the inverse in the form
$y = F(x) = x^{1/3}$, on the same sheet of paper (see Fig. 1.23b). We also arrange that the
x and y scales are equal. Let P : (a, b), where $b = a^3$, be any point on $y = f(x) = x^3$.
The corresponding point on the graph of the inverse function $y = F(x) = x^{1/3}$ is
Q : (b, a). Since the x and y scales are equal, Q is the reflection of P in the straight
line through the origin inclined at 45° to the x axis. Therefore the graphs of
y = f(x) and of its inverse y = F(x) are reflections of each other in the 45° radial
line. This is shown for $f(x) = x^3$ and its inverse $F(x) = x^{1/3}$ in Fig. 1.23b.
The arguments are basically the same for any function and its inverse. Whatever
f may be, we obtain the graph of its inverse function F by plotting y = f(x)
with equal scales, and reflecting it in the 45° line. Also, the reciprocal identities
apply in the general case, though this is usually subject to restrictions on the range
of the function F, in order to ensure that a single, unique value is assigned to F(x).
The example $f(x) = x^3$ is particularly straightforward, since the functions $y = x^{1/3}$
and $y = x^3$ are unique inverses of each other for every value of x and y. The problem
of single-valuedness of F arises if the graph of f(x) ‘turns over’ at some point,
or points. A simple example is $y = f(x) = x^2$ (see Fig. 1.9c) which turns over at
x = 0, so that the graph falls into two parts, on the left and right sides of the
y axis. The inverse function $F_1$ corresponding to the right-hand branch of $y = x^2$ is
$y = F_1(x) = x^{1/2}$, valid only for x ≥ 0 and y ≥ 0. A second inverse function, defined
by $y = F_2(x) = -x^{1/2}$ for x ≥ 0 and y ≤ 0, arises from the left-hand branch. The two
inverse functions taken together provide, for example, the two expected solutions
of the equation $f(x) = x^2 = 2$, namely
$$x = F_1(2) = +2^{1/2} \quad\text{and}\quad x = F_2(2) = -2^{1/2}.$$
Fig. 1.24 Some positive integer powers $x^n$ and their inverses $x^{1/n}$, showing reflection across the 45° line.
The problem arises particularly in Section 1.8 in connection with inverse trigonometric
functions.
Figure 1.24 illustrates the general character of positive integer powers $x^n$ and their
inverses $x^{1/n}$, the picture being confined to the range 0 ≤ x ≤ 1 for clarity. Notice
the symmetry of the inverse pairs $x^n$ and $x^{1/n}$ about the 45° line $y = x^1 = x$. Graphs
corresponding to other powers of x lie between those shown, in a regular way.
Fig. 1.25 The intersections of the graphs $x = \tfrac{1}{2}$ and $x = \sin\theta$ give the solutions of the equation
$\sin\theta = \tfrac{1}{2}$.
we are given any single one of these values we can easily use it to construct all
of them. Suppose θ = α is any one solution of the equation sin θ = c, where c is
a constant with −1 ≤ c ≤ 1. Then all the values of θ satisfying the equation
sin θ = c are obtainable from α by means of the formula
$$\theta = n\pi + (-1)^n \alpha,$$
where n is any integer.
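The formula for all the solutions of sin θ = c is easily put to work. In this sketch the particular solution α is taken from `math.asin`, which is only one possible choice, and the function name and the range of n are our own.

```python
import math

def solutions_of_sine(c, n_values):
    """Solutions theta = n*pi + (-1)**n * alpha of sin(theta) = c, for the given n."""
    if not -1 <= c <= 1:
        raise ValueError("sin(theta) = c has no real solution")
    alpha = math.asin(c)   # one particular solution of the equation
    return [n * math.pi + (-1) ** n * alpha for n in n_values]

thetas = solutions_of_sine(0.5, range(-2, 5))
for theta in thetas:
    print(theta / math.pi, math.sin(theta))   # angle as a multiple of pi, and its sine
```

Each printed sine is equal to 0.5 up to rounding error, whichever integer n is used.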
Fig. 1.26 The inverse trigonometric functions over their standard ranges. In (a) and (b), the scales
of θ and x are the same. (a) θ = arcsin x, $-\tfrac{1}{2}\pi \le \theta \le \tfrac{1}{2}\pi$, −1 ≤ x ≤ 1. (b) θ = arccos x, 0 ≤ θ ≤ π,
−1 ≤ x ≤ 1. (c) θ = arctan x, $-\tfrac{1}{2}\pi < \theta < \tfrac{1}{2}\pi$, all x.
Self-test 1.6
Find a formula for sin(2 arctan x).
$$r = OP = \sqrt{x^2 + y^2} \ge 0.$$
Fig. 1.28 Polar coordinates: x = r cos θ, y = r sin θ.
The angle θ is any polar angle (see Section 1.7) that locates P; we choose the
simplest one for the illustration, but our statements are true for any of the valid
polar angles (which are given by θ + 2πn for every integer n).
Definite values for r and θ locate a unique point P just as well as the cartesian
coordinates x, y, so in suitable cases we can use r, θ as coordinates in place of x, y.
They are called polar coordinates. From the definitions of sine and cosine in
eqn (1.15), the cartesian and polar coordinates of P are related in the following way:
$x = r\cos\theta, \quad y = r\sin\theta.$ (1.20)
Polar coordinates are often easier to use than x, y coordinates, especially for
curves that surround the origin. The simplest example is a circle, centre the origin
and radius c, whose polar equation is
r = c.
A spiral, such as a track on a compact disk with inner radius a, outer radius b, and
track width h, is described by
$r = b - \dfrac{h}{2\pi}\,\theta$,
in which θ runs from zero to 2πN, where N = (b − a)/h is the number of revolutions.
Note the enormous size attained by the polar angle as it follows the
rotation through many revolutions.
An extensive catalogue of graphs of curves in polar and cartesian coordinates
can be found in Seggern (1990).
Example 1.8 (a) Obtain the polar equation of the central ellipse
$\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$
(see Fig. 1.9a). (b) Obtain the polar equation for the same ellipse tilted through an
angle α.
(a) Referring to eqn (1.20), the cartesian equation becomes
$\dfrac{r^2\cos^2\theta}{a^2} + \dfrac{r^2\sin^2\theta}{b^2} = 1.$ (i)
From the identity (1.17c) we can express $\cos^2\theta$ and $\sin^2\theta$ in terms of $\cos 2\theta$, so that (i)
can be written as
$r^2\left[\dfrac{1 + \cos 2\theta}{2a^2} + \dfrac{1 - \cos 2\theta}{2b^2}\right] = 1,$
which simplifies to
$r = \dfrac{\sqrt{2}\,ab}{\sqrt{(a^2 + b^2) - (a^2 - b^2)\cos 2\theta}}.$
When θ runs from, say, zero to 2π, the complete ellipse is traced out once.
(b) To tilt the ellipse through an angle α, simply replace θ by θ − α (by analogy with
eqn (1.11) for a change of origin of the x axis). Then the equation of the tilted ellipse is
$r = \dfrac{\sqrt{2}\,ab}{\sqrt{(a^2 + b^2) - (a^2 - b^2)\cos 2(\theta - \alpha)}}.$ (ii)
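As a quick numerical check of Example 1.8, the sketch below evaluates the polar formula and confirms that the resulting points satisfy the cartesian equation of the ellipse. The function names and the sample values a = 3, b = 2 are illustrative assumptions, not part of the text.

```python
import math

def ellipse_r(theta, a, b, alpha=0.0):
    """Polar radius of the ellipse x^2/a^2 + y^2/b^2 = 1, tilted through alpha."""
    denom = (a**2 + b**2) - (a**2 - b**2) * math.cos(2 * (theta - alpha))
    return math.sqrt(2) * a * b / math.sqrt(denom)

def on_untilted_ellipse(theta, a, b):
    """Convert (r, theta) back to (x, y) and test the cartesian equation."""
    r = ellipse_r(theta, a, b)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return math.isclose(x**2 / a**2 + y**2 / b**2, 1.0, rel_tol=1e-12)

# sample angles covering one full revolution
checks = [on_untilted_ellipse(0.1 * k, 3.0, 2.0) for k in range(63)]
```

With α = 0 the radius equals a at θ = 0 and b at θ = π/2, as expected for the semi-axes.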
(i) if x = r cos θ is positive, a value of the polar angle of P : (x, y) is arctan(y /x);
(ii) if x = r cos θ is negative, a value for the polar angle is arctan(y /x) ± π.
The result (ii) follows from the fact that tan θ has period π; so
tan[arctan(y /x) ± π] = tan[arctan(y/x)] = y/x,
as required. As always, any multiple of 2π may be added to these polar angles.
Many symbolic computer systems accept x and y arguments to avoid this
problem. For example, in Mathematica the polar angle is given correctly by the
command ArcTan[x, y].
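Python offers the same facility through `math.atan2`, which takes its arguments in the order (y, x), the reverse of the Mathematica command. The sketch below compares it with rules (i) and (ii); the helper names are hypothetical, and the sign choice in (ii) is made here so that the result lies between −π and π, which is an assumption of this illustration.

```python
import math

def polar_angle(x, y):
    """Polar angle of (x, y) in (-pi, pi] via the two-argument arctangent."""
    return math.atan2(y, x)   # note the argument order: y first, then x

def polar_angle_by_rule(x, y):
    """Rules (i) and (ii): arctan(y/x) if x > 0, arctan(y/x) +/- pi if x < 0.

    Assumes x != 0; the sign in (ii) is chosen to land in (-pi, pi].
    """
    base = math.atan(y / x)
    if x > 0:
        return base
    return base + math.pi if y >= 0 else base - math.pi

points = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-2.0, 0.0)]
agree = [math.isclose(polar_angle(x, y), polar_angle_by_rule(x, y))
         for x, y in points]
```

The two-argument form also handles x = 0, where arctan(y/x) is undefined.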
Self-test 1.7
Find the equivalent polar equation in (r, θ) for the cartesian equation
$(x^2 + y^2 - x)^2 - (x^2 + y^2) = 0$ assuming $r \ge 0$. Sketch the curve.
$9^x = 3^{2x}$, and so on. We shall show later (Example 1.9) that all the functions $y = a^x$
can be displayed on the same curve provided that the x-scales are contracted
or extended by appropriate factors, so we really only need one, standard, base to
describe them all.
Fig. 1.29 Graphs of $y = a^x$ for a = 1, 1.25, 1.5, 2, and 3.
A standard base may be chosen according to what is most convenient for later
requirements. For example, at one time the base a = 10 was adopted in order
to simplify the arithmetic involved in large calculations, but nowadays we have
better methods. The base a = 2 is used in the theory of binary processes such as
$\dfrac{NQ}{NT} \approx 1,$
the approximation improving as h becomes smaller. We have $NQ = e^h - 1$ and
$PN = h$, so that
$\dfrac{e^h - 1}{h} \approx 1,$ (1.21)
and, in fact, the approximation can be made as close as we wish by taking h small
enough. To estimate e, multiply through by h to give $e^h \approx 1 + h$. Finally, raise both
sides to the power 1/h to isolate the number e:
$e \approx (1 + h)^{1/h},$
to any degree of accuracy, provided h is made small enough. The following table,
constructed using a calculator or computer, illustrates how the desired value of e
gradually emerges as h approaches zero:
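A table of this kind can be generated with a few lines of Python. This is only a sketch: the particular values of h are chosen here for illustration and are not taken from the book's table.

```python
import math

def estimate_e(h):
    """Approximate e by (1 + h)**(1/h), which improves as h -> 0."""
    return (1 + h) ** (1 / h)

# print the estimate and its error for successively smaller h
for h in (0.1, 0.01, 0.001, 0.0001):
    approx = estimate_e(h)
    print(f"h = {h:<7} (1 + h)^(1/h) = {approx:.6f}   error = {math.e - approx:.6f}")
```

The error shrinks roughly in proportion to h, so each tenfold reduction in h gains about one more correct decimal place.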
Fig. 1.31 The graph of ln x obtained from that of $e^x$.
The logarithm has the following properties, which are proved below:
Proof. (a) This is the fundamental property (1.19) of inverse functions applied
to this case.
(b) $e^0 = 1$ (see (1.2b)), so 0 = ln 1 (from (1.23a)). Also, $e^1 = e$, so 1 = ln e (from
(1.23a)).
(c) From the definition (1.23a), applied three times:
$e^{\ln ab} = ab = e^{\ln a} e^{\ln b} = e^{\ln a + \ln b}$ (from (1.2a)).
By equating powers of e on the two sides, we have
ln ab = ln a + ln b.
(d) Put a = (a/b)b, and take the logarithm of both sides:
ln a = ln[(a/b)b] = ln(a/b) + ln b
from the product rule (1.23c). The first result in (1.23d) follows immediately.
Put a = 1 so that ln 1 = 0 to obtain the second result.
(e) From (1.23a),
$a = e^{\ln a}$, so that $a^x = (e^{\ln a})^x = e^{x \ln a}$.
By the definition (1.23a) with $a^x$ in place of a, the logarithm of $a^x$ is x ln a.
(f) These follow from (a) from the limit of $e^{\ln x} = x$ as x → ∞ and as x → −∞.
Example 1.9 Prove that all the graphs $y = a^x$, for a > 1, become identical to that
of $e^x$ if, for each case, the x axis is scaled by the appropriate factor.
To fix ideas, think of a as being a given constant such as 2. As in the proof of (1.23e)
above, we can write $y = a^x$ in the form
$y = a^x = e^{x \ln a} = e^{kx},$
say, where k = ln a. The required scale factor is k > 0. To make the graph of $a^x$ lie
along that of $e^x$, the x axis must be stretched if k > 1 (i.e. if a > e), and compressed if
0 < k < 1 (i.e. if 1 < a < e).
Self-test 1.8
Obtain y in terms of x when ln(y − 1) = 2 ln x + ln y − x for x > 0.
where A and c are constants. This class includes functions such as $2^t$, since, by
Example 1.9, they can all be expressed in the form (1.24); in this case $2^t = e^{t \ln 2}$.
If c > 0, then y is said to have exponential growth. To get an idea of what this
implies, we shall consider the doubling period of y. Choose any moment of
time, t. At some later time t + T, y will have doubled its value, so that
$A e^{c(t+T)} = 2A e^{ct}$ or $A e^{ct} e^{cT} = 2A e^{ct}$.
After cancelling the factor $A e^{ct}$ we have an equation for T:
$e^{cT} = 2,$
so that cT = ln 2, from which we obtain the unknown T:
T = (1/c) ln 2.
This result is independent of t and A, so we have
eqn (1.24). More generally, the value of $A e^{ct}$ is multiplied by a factor N over every
interval of length (1/c) ln N. The successive values form a geometric progression
(see Section 1.15) with common ratio N.
Example 1.11 The number N of scientists and engineers in the USA doubled
every 10 years between 1900 and 1935, and in 1935 they numbered about
$1.5 \times 10^5$. This suggests exponential growth $N = A e^{ct}$. Find c, and predict
the number N for 1990 on the assumption that the trend continued.
Suppose that we count 1900 as t = 0. The doubling period is 10 years; so $N = A e^{ct}$,
where, by (1.25),
$c = \tfrac{1}{10} \ln 2 = 0.0693.$
Thus
$N = A e^{0.0693t}.$
In 1935, where t = 35 (years), $N = 1.5 \times 10^5$, so that
$1.5 \times 10^5 = A e^{0.0693 \times 35},$
or A = 13 265.
Therefore
$N = 13\,265\, e^{0.0693t}.$
In 1990 t = 90, from which it follows that
$N = 6.8 \times 10^6.$
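The arithmetic of Example 1.11 can be reproduced as follows. This is a sketch under the example's own assumptions (doubling period 10 years, t measured from 1900); the variable and function names are illustrative. Because c is not rounded to 0.0693 here, the computed A differs slightly from the 13 265 quoted above, while the 1990 prediction still rounds to 6.8 × 10^6.

```python
import math

doubling_period = 10.0                  # years
c = math.log(2) / doubling_period       # from T = (1/c) ln 2
A = 1.5e5 / math.exp(c * 35)            # fit N = 1.5e5 at t = 35 (year 1935)

def N(t):
    """Predicted number at t years after 1900, assuming N = A e^{ct}."""
    return A * math.exp(c * t)

print(f"c = {c:.4f}, A = {A:.0f}, N(90) = {N(90):.2e}")
```

The check `N(45) == 2 * N(35)` (up to rounding) confirms that the fitted curve doubles every 10 years regardless of the starting time.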
Self-test 1.9
In a radioactive element, the number of radioactive nuclei, x, present at time t
is given by $x = x_0 e^{-kt}$, where k is a constant, and $x_0$ is the number of nuclei
present at time t = 0. Find the time taken for 90% of the nuclei to decay.
1.13 Hyperbolic functions
$\cosh x = \tfrac12(e^x + e^{-x}), \quad \sinh x = \tfrac12(e^x - e^{-x}).$ (1.26)
Since
$\cosh(-x) = \tfrac12(e^{-x} + e^{-(-x)}) = \tfrac12(e^{-x} + e^x) = \cosh x,$
it follows that the graph of cosh x is symmetrical about the y axis, that is, cosh x
is an even function. By a similar argument, it can be shown that sinh x is an odd
function of x. Graphs of the two functions are shown in Fig. 1.32a.
Fig. 1.32 Graphs of the hyperbolic functions: (a) cosh x and sinh x; (b), (c) tanh x, coth x, sech x, and cosech x.
$\tanh x = \dfrac{\sinh x}{\cosh x}, \quad \coth x = \dfrac{\cosh x}{\sinh x}, \quad \operatorname{sech} x = \dfrac{1}{\cosh x}, \quad \operatorname{cosech} x = \dfrac{1}{\sinh x}.$ (1.27)
Graphs of tanh x, coth x, sech x, and cosech x are shown in Fig. 1.32b, c.
From the definitions, a number of identities follow which parallel those for trigonometric
functions but with important sign differences. Some are derived below.
For (a):
$\cosh^2 x + \sinh^2 x = \tfrac14(e^{2x} + 2 + e^{-2x}) + \tfrac14(e^{2x} - 2 + e^{-2x}) = \tfrac12(e^{2x} + e^{-2x}) = \cosh 2x.$
For (b):
.
1 ± tanh x1 tanh x2 (1.29)
The inverse hyperbolic functions corresponding to sinh, cosh, and tanh are
indicated respectively by the notations
$\sinh^{-1}$, $\cosh^{-1}$, and $\tanh^{-1}$.
The index (−1) is traditional: do not mistake it as standing for a negative
power; $\sinh^{-1} x$ does not mean 1/sinh x. (Note that the commands ArcSinh[x],
ArcCosh[x], and ArcTanh[x] are used for inverse hyperbolic functions in symbolic
computation in Mathematica.) The intervals for which they are defined are
as follows:
$e^y = x \pm \sqrt{x^2 + 1},$
where the sign √ means the positive square root. The negative sign in the right-hand
side corresponds to a solution which cannot represent $e^y$, since $e^y$ is always
positive but $x - \sqrt{x^2 + 1}$ is always negative. Therefore we select the positive sign.
By taking the logarithm of both sides and using (1.23a) we obtain
$y = \sinh^{-1} x = \ln[x + \sqrt{x^2 + 1}],$
valid for all x. The inverses of the other hyperbolic functions are obtained similarly,
and are shown in the following table:
Self-test 1.10
Prove that $\tanh^{-1} x = \tfrac12 \ln[(1 + x)/(1 - x)]$.
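The logarithmic form of $\sinh^{-1} x$ obtained above can be compared with the library function `math.asinh`. The sketch below is a numerical check at a few sample points, not a proof; the function name `asinh_via_log` is hypothetical.

```python
import math

def asinh_via_log(x):
    """Inverse hyperbolic sine from the formula ln[x + sqrt(x^2 + 1)]."""
    return math.log(x + math.sqrt(x * x + 1))

samples = (-3.0, -0.5, 0.0, 0.5, 3.0)
matches = [math.isclose(asinh_via_log(x), math.asinh(x), abs_tol=1e-12)
           for x in samples]
```

Applying `math.sinh` to the result recovers x, which is the defining property of the inverse.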
(ii) $\dfrac{x}{4x^2 - 1} \equiv \dfrac14\,\dfrac{1}{2x - 1} + \dfrac14\,\dfrac{1}{2x + 1},$
(iii) $\dfrac{3x + 2}{x^2(x + 1)} \equiv \dfrac{1}{x} + \dfrac{2}{x^2} - \dfrac{1}{x + 1}.$
The terms on the right are individually simpler than the functions on the left.
This break-up into simpler constituents is useful for many purposes. In this
section, we show how to break up a complicated function into simpler terms of
the type above.
ning. No proofs will be given here, but the reader should learn the techniques.
It is the denominator Q(x) which determines what the form of the constituent
partial fractions will take. Suppose that the denominator is broken up into factors
as far as possible. For example,
$2x^4 + x^3 - 4x^2 + x - 6 = (2x - 3)(x + 2)(x^2 + 1),$
and it cannot be factorized any further. We shall consider only the cases where the
factors are of the type:
$ax + b$ (a simple factor), $(cx + d)^n$ (a repeated factor of order n), and
$px^2 + qx + r$ with $q^2 < 4pr$ (an irreducible quadratic).
The rules affecting these are as follows:
P(x) is involved in these rules only to the extent that it will affect the values of the
coefficients K, L1, etc. The following examples show how to determine the values
of the coefficients.
We can use any convenient letters for the unknown coefficients in the terms. The
denominator has two simple factors, x − 1 and x + 2, so (1.32a) says that the partial
fractions must have the form
$\dfrac{x}{(x - 1)(x + 2)} \equiv \dfrac{A}{x - 1} + \dfrac{B}{x + 2}.$ (i)
Multiply through by (x − 1)(x + 2):
x = A(x + 2) + B(x − 1). (ii)
The constants must be chosen so that this becomes an identity. An identity has to be
true for any x, so that if we put any value of x into (ii), the result must be correct. Any
two substitutions of numbers for x form two simultaneous equations for the two
unknown constants A and B. For example, if we put x = −10 and x = 100 we obtain
−10 = −8A − 11B, 100 = 102A + 99B.
The numbers we chose are inconvenient, but according to (1.32) we get the same A and
B whatever values of x we use. Therefore, choose values of x that make the equations as
simple as possible:
x = −2 gives −2 = 0 − 3B, so $B = \tfrac23$,
x = 1 gives 1 = 3A + 0, so $A = \tfrac13$.
Therefore, from (i),
$\dfrac{x}{(x - 1)(x + 2)} \equiv \dfrac{1/3}{x - 1} + \dfrac{2/3}{x + 2}.$
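The substitution method above can be mirrored in exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the helper names `lhs` and `rhs` are illustrative. It also confirms that the "inconvenient" substitutions x = −10 and x = 100 are satisfied by the same A and B.

```python
from fractions import Fraction

# From x = A(x + 2) + B(x - 1), substitute the zeros of the denominator.
B = Fraction(-2, -2 - 1)      # x = -2 gives -2 = -3B
A = Fraction(1, 1 + 2)        # x = 1 gives 1 = 3A

def lhs(x):
    """Original function x / ((x - 1)(x + 2)), evaluated exactly."""
    return Fraction(x) / ((x - 1) * (x + 2))

def rhs(x):
    """Partial-fraction form A/(x - 1) + B/(x + 2)."""
    return A / (x - 1) + B / (x + 2)

# the two forms should agree at any x other than 1 and -2
agree = all(lhs(x) == rhs(x) for x in (-10, 0, 3, 100))
```

Exact fractions avoid rounding error, so the comparison `lhs(x) == rhs(x)` is a genuine equality test at each sample point.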
x = 0 gives −1 = A − B + C, so $B = 1 + A + C = \tfrac59$.
Finally,
$\dfrac{3x - 1}{(2x + 1)(x - 1)^2} \equiv -\dfrac{10}{9}\,\dfrac{1}{2x + 1} + \dfrac{5}{9}\,\dfrac{1}{x - 1} + \dfrac{2}{3}\,\dfrac{1}{(x - 1)^2}.$
Then x = 0 gives 1 = A + 0, so
A = 1.
There are no other very easy values of x to choose. Put the value of A just found into (i)
and rearrange: we get
−x = Bx + C. (ii)
If the degree of the numerator is greater than or equal to the degree of the
denominator, the case is not covered by (1.32), but we can treat it as follows.
Example 1.15 Put $(x^3 + 1)/[x(x - 1)]$ into the form of a polynomial plus partial
fractions.
Carry out polynomial division, until the remainder is of lower degree than the divisor:
divide $x^3 + 1$ by $x^2 - x$; the quotient is $x + 1$:
  subtract $x(x^2 - x) = x^3 - x^2$, leaving $x^2 + 1$;
  subtract $1 \cdot (x^2 - x) = x^2 - x$, leaving the remainder $x + 1$.
Therefore
$\dfrac{x^3 + 1}{x(x - 1)} \equiv x + 1 + \dfrac{x + 1}{x(x - 1)}.$
The last term is of the right type for partial fractions, and finally
$\dfrac{x^3 + 1}{x(x - 1)} \equiv x + 1 - \dfrac{1}{x} + \dfrac{2}{x - 1}.$
Self-test 1.11
Express f(x) = x/[(x − a)(x − b)(x − c)] in partial fractions if (a) a, b, c are all
different; (b) b = c ≠ a; (c) a = b = c.
The sign ∑ (sigma) is a large Greek capital S, standing for ‘the sum of …’. It is
used in the following way. Suppose, for example, we are provided with a string of
which is read ‘the sum of all the $u_n$ from n = 1 through n = 6’. Similarly
$u_2 + u_3 + u_4 + u_5$ is written $\sum_{n=2}^{5} u_n$.
Any letter can be used as the counting index instead of n, provided that there is
no conflict; so we could also write, for instance,
$u_3 + u_4 + u_5 + u_6 = \sum_{i=3}^{6} u_i.$
We index a sequence according to convenience. The first index does not have to
be 1. For example, consider the important sequence
$1, x, x^2, x^3, \ldots,$
which is the same as
$x^0, x^1, x^2, x^3, \ldots.$
This is called, for historical reasons, a geometric sequence or geometric progression.
Each term in turn is got from its predecessor by multiplying by the common
ratio x. The natural way to index such a sequence is to start with n = 0 instead of
n = 1. Suppose then we want the sum of the first six terms. It can be expressed as
$1 + x + x^2 + x^3 + x^4 + x^5 = \sum_{n=0}^{5} x^n,$
or as
$\sum_{n=1}^{6} x^{n-1}$
instead. Such a sum, whether or not it starts with the $x^0$, or constant, term, is
called a geometric series.
We will obtain an expression for the sum S of a geometric series having any
value of the common ratio x (except x = 1) and which runs from the term
in $x^0$ to the term in $x^N$. Thus $S = \sum_{n=0}^{N} x^n$. Note that it contains N + 1 terms
(i.e. not N terms). Written at length:
$S = 1 + x + x^2 + \cdots + x^{N-1} + x^N.$
Then
$xS = x + x^2 + x^3 + \cdots + x^N + x^{N+1}.$
Subtract the second line from the first. All the terms cancel except for two; we
obtain
$S(1 - x) = 1 - x^{N+1}$, so $S = (1 - x^{N+1})/(1 - x)$.
If x = 1, then $S = \sum_{n=0}^{N} 1 = N + 1$.
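The closed form just derived can be compared with a term-by-term sum. The following is a minimal sketch; the function name `geometric_sum` is hypothetical, and exact fractions are used in the comparison so that equality is tested without rounding error.

```python
from fractions import Fraction

def geometric_sum(x, N):
    """Sum of x**n for n = 0..N (that is, N + 1 terms), using the closed form."""
    if x == 1:
        return N + 1                      # special case: every term equals 1
    return (1 - x ** (N + 1)) / (1 - x)

x = Fraction(1, 10)
direct = sum(x ** n for n in range(5))    # 1 + x + x^2 + x^3 + x^4
closed = geometric_sum(x, 4)
```

Note that `range(5)` runs over n = 0, 1, 2, 3, 4, matching the N + 1 = 5 terms for N = 4.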
Example 1.16 Find the following sums. (a) $\sum_{n=0}^{4} (0.1)^n$, (b) $\sum_{n=0}^{6} \dfrac{1}{2^n}$,
(c) $\sum_{n=0}^{N} e^{nx}$, (d) $\sum_{n=0}^{N} (-1)^n$, (e) $\sum_{n=0}^{5} 2^n$.
The brackets contain a series of type (1.33), with common ratio $r^3$. The number of terms
N + 1 is equal to 5, so that N = 4. Then
1 − (r 3 )5 1 − r 15
N      1     2      3       4        …
$x^N$  0.1   0.01   0.001   0.0001   …
The sequence of terms $1, x, x^2, x^3, x^4, x^5, \ldots$ is formed by multiplying each term in
turn by x to get the succeeding term, so if |x| < 1, the terms become steadily
smaller in magnitude, and in fact (though we do not prove it here) can be made as
close as we wish to zero if we take a large enough value of N.
Therefore, referring back to (1.34),
if |x| < 1, then $S_N \to \dfrac{1}{1 - x}$ as N → ∞,
where the sign ‘→’ signifies ‘approaches’. In this way the idea of an infinitely
extended geometric series can be given a meaning. Its sum to infinity, $S_\infty$, is
expressed by
$S_\infty = \sum_{n=0}^{\infty} x^n = \dfrac{1}{1 - x}.$ (1.36)
On the other hand, if |x| > 1, the magnitude of the term $x^{N+1}/(1 - x)$ in (1.34) will
increase to infinity as N increases, so the infinite series cannot be said to have a
sum at all. If x = 1, the series becomes
1 + 1 + 1 + ··· ,
which simply continues to grow to infinity as the number of terms increases. The
case of x = −1 is indeterminate. To summarize:
The second term on the right in (1.34) is called the remainder or error: it
represents the error incurred by using only the first N terms to approximate to the
infinite sum. For the infinite series to be useful, this quantity must approach zero
as N approaches infinity.
Example 1.18 Express the recurring decimals (a) 0.4444… , (b) 0.96 96 96… , in
1
Self-test 1.12
Sum the series
$1 + 4x + 7x^2 + 10x^3 + \cdots + (3N + 1)x^N.$
What is the sum to infinity of the series if |x| < 1?
We can obtain the number of words without writing them all down by using the
following argument. Imagine that we are writing them down; then we have four
Example 1.19 How many four-letter words can be made out of the six letters
A, B, C, D, E, F, with no repetitions within a word?
Put another way, how many permutations are there of n = 6 distinct objects, taken r = 4
at a time, or what is 6 P4? There are six choices for the first letter. With each such choice
there are only five letters available for the second (no repetition), so there are 6 × 5
possible choices for the first two letters. There are four letters left to supply the third
letter, so there are 6 × 5 × 4 possibilities for the first three letters, and inclusion of the
fourth letter gives finally
6 P4 = 6 × 5 × 4 × 3 = 360.
Example 1.20 There are six different books and we must choose one book for
each of four children as a present. How many different distributions of books
to the children are possible?
A decision is required as to whether to distinguish by letter the children, or the books, or
both. We choose the children, distinguished by W, X, Y, Z, say.
Imagine that we are listing all the possibilities. Child W may receive any one of six
books. Whichever one W receives, X will have one of the remaining five; Y will have
one of the remaining four; and Z one of the remaining three. The number of entries in
our list is therefore
6 P4 = 6 × 5 × 4 × 3 = 360.
It might seem at first sight that this treatment favours child W and that child Z is
shabbily treated. However, the process describes a systematic way to arrange a list
containing all possible assignments, only instead of writing them down, we count them.
No choice of any gift is involved.
Guided by this discussion we can now obtain a formula for nPr. We firstly need
the factorial notation n!. If n > 0 is a positive integer, then the meaning of n! is
and then we can use formulae such as (1.39) without making an exceptional case
for r = n.
Permutations
The number of possible permutations of r objects, $1 \le r \le n$, taken without
repetition from among n distinct objects is given by
${}^nP_r = n(n - 1)(n - 2) \cdots (n - r + 1) = \dfrac{n!}{(n - r)!}.$ (1.41)
Proof. There are n possibilities for the first place in a permutation. With each of
these, the second can contain any of the remaining (n − 1) objects, so that there
are n(n − 1) possibilities for the first two places. The third place can contain any
of the remaining (n − 2) objects, so there are n(n − 1)(n − 2) possibilities for the
first three places. This continues until we have completed r factors corresponding
to the r entries in each permutation; these form the product n(n − 1)(n − 2) …
(n − r + 1). Then use (1.39).
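Formula (1.41) can be checked against a brute-force listing. The sketch below counts the four-letter words of Example 1.19 with `itertools.permutations`; the function name `n_P_r` is hypothetical.

```python
import math
from itertools import permutations

def n_P_r(n, r):
    """Number of permutations of r objects taken from n distinct objects."""
    return math.factorial(n) // math.factorial(n - r)

# explicit list of all four-letter words from A..F with no repeated letter
words = list(permutations("ABCDEF", 4))
```

Listing the permutations is feasible only for small n; the formula gives the count directly.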
The following example shows how it is sometimes possible to relate a problem
involving repetitions to one in which all the elements are distinct.
replications corresponding to any distinct word is 3 P3 = 3! = 6: this is the number of
permutations of the symbols C1, C2, C3, all of which are equivalent. Therefore there are
3! times as many entries in the list of permutations of ABC1C2C3 as there are in the list
Example 1.22 How many distinct permutations exist which use all the
14 letters in the word ASSASSINATIONS?
It does not matter which string, or anagram, we treat as a source of letters, so start
instead from one which displays the repetitions clearly:
SSSSSAAAIINNOT. (i)
We shall enforce a distinction between the repeated letters of each type by indexing them:
S1S2S3S4S5A1A2A3I1I2N1N2OT. (ii)
There are 5! × 3! × 2! × 2! permutations within the indexed groups in (ii), taken
separately, which all correspond to the same word (i). Similarly, there are 5! × 3! × 2! × 2!
rearrangements of indexed letters corresponding to any distinct permutation of the
ordinary letters. There are altogether 14 P14 = 14! permutations of the indexed letters.
Therefore, if the number of distinct permutations of the letters in ASSASSINATIONS is
denoted by n, then:
number of permutations of the indexed symbols in (ii) = 14! = (5! × 3! × 2! × 2!)n.
Finally
n = 14!/(5! × 3! × 2! × 2!) = 30 270 240.
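The same count can be obtained for any word by dividing the factorial of its length by the factorial of each letter's multiplicity, as in Example 1.22. This is a sketch; the function name `distinct_permutations` is hypothetical.

```python
import math
from collections import Counter

def distinct_permutations(word):
    """Number of distinct arrangements of the letters of `word`."""
    counts = Counter(word)                 # multiplicity of each letter
    total = math.factorial(len(word))      # permutations of indexed letters
    for k in counts.values():
        total //= math.factorial(k)        # remove rearrangements within a group
    return total

n = distinct_permutations("ASSASSINATIONS")
```

For ASSASSINATIONS the multiplicities are 5 (S), 3 (A), 2 (I), 2 (N), 1 (O), and 1 (T), matching the indexed string (ii).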
Example 1.23 (Circular permutations) Five people sit round a circular table. In
how many distinct orders may they sit?
The meaning of ‘distinct’ here is that two arrangements are regarded as being the same
if each person has the same person on his or her right (or left would do as well). If the
people are named A, B, C, D, E, then rotation of a particular grouping, say BADEC,
bodily around the table does not count as a new circular permutation: go clockwise
(say) from any fixed position noting the order of seating; then BADEC, ADECB,
DECBA, ECBAD, CBADE are to be treated as the same permutation.
The number of ordinary 5-letter permutations is 5!. Let the number of circular
permutations be NC . To each of the circular permutations there are five ordinary
permutations, so that 5NC = 5!, and finally
NC = 5!/5 = 4! = 24.
In general, if there are n persons, the number of circular permutations is (n − 1)!.
Permutations are sequences: if the order of the elements is changed the permutation
is counted as a different one. We shall now consider problems in which
rearrangements of the same group, collection, or set of objects are regarded as
equivalent: what defines the set is simply which items it contains, without regard
to order. Such a set is called a combination. For example, an apple (A), a banana
(B), and a carrot (C) in a plastic bag can be regarded as a mere combination of
purchases, but the decision to consume them in a certain time order involves
consideration of the possible permutations ABC, BAC, and so on.
Suppose there are six distinct objects, A, B, C, D, E, F, and we want to count
how many different combinations consisting of three elements can be selected.
Combinations
The number of combinations of r distinct objects ($1 \le r \le n$) selected from n
distinct objects is given by
${}^nC_r = \dfrac{{}^nP_r}{r!} = \dfrac{n!}{r!(n - r)!}.$ (1.42)
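Formula (1.42) can be compared with a direct enumeration, here for three objects chosen from A to F. This is a minimal sketch; the function name `n_C_r` is hypothetical.

```python
import math
from itertools import combinations

def n_C_r(n, r):
    """Number of combinations of r objects selected from n distinct objects."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# explicit list of all unordered selections of three letters from A..F
triples = list(combinations("ABCDEF", 3))
```

Each unordered triple corresponds to 3! = 6 ordered permutations, which is why nCr equals nPr divided by r!.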
Example 1.25 A lucky-dip jar contains seven sweets; there are one each of
flavours A, B, C, and D, and three of flavour E. A child reaches into the jar and
pulls out four sweets. How many distinct combinations of flavours might the
child obtain?
The combinations may contain three Es, two Es, one E, or no Es:
three Es: one other choice out of four; possible combinations 4.
two Es: two other choices out of four; possible combinations 4C2 = 4!/(2!2!) = 6.
Example 1.26 Prove that (a) nCr = nCn−r , (b) nCr = n−1Cr + n−1Cr−1.
(a) From (1.42) nCr = n!/[r!(n − r)!] and nCn−r = n!/[(n − r)!r!].
(b) Starting with the right-hand side
n−1Cr + n−1Cr−1 = (n − 1)!/[r!(n − r − 1)!] + (n − 1)!/[(r − 1)!(n − r)!]
Self-test 1.13
A pin number for cash machines is a 4-digit number chosen from 0, 1, 2, 3, … , 9.
How many pins are there if: (a) all the digits are different? (b) repetitions are
permitted except that pins with 4 digits the same are not allowed?
power of x 0 1 2 3 4
coefficient 1 4 6 4 1
Notice that there are n + 1 terms in the expansion of $(1 + x)^n$, and that the coefficients
have a symmetrical pattern, coefficients equidistant from the ends being equal.
We shall show later that these properties hold for all positive integer powers n.
The process of repeated multiplication soon becomes arduous. We firstly describe
a more efficient method for powers given numerically, and secondly obtain an
Notice how the coefficient of $x^2$ in $(1 + x)^5$ is arrived at: it is the sum of the
coefficients of $x^2$ and x in $(1 + x)^4$ (underlined in the sum above). Similarly, the
coefficient of $x^3$ in $(1 + x)^5$ is equal to the sum of the coefficients of $x^3$ and $x^2$ in
$(1 + x)^4$, and so on, the only exceptions to the rule being the first and last coefficients,
which are both equal to 1. The same rule applies whenever we calculate $(1 + x)^{n+1}$
from $(1 + x)^n$:
By using this rule we can develop an efficient and rapid method, or algorithm,
for obtaining the coefficients, called Pascal’s triangle. It is a triangular array, as
shown in (1.43), whose rows consist of the coefficients of $x^0, x^1, x^2, \ldots$ in the
expansion of $(1 + x)^n$ for n = 1, 2, … . The rows are constructed successively.
Each row is obtained from the preceding row by the rules just given: place a ‘1’
at the beginning and end of each new row; then every intermediate entry in that
row is equal to the sum of two entries from the previous row, one directly above
and one to the left. Two instances are indicated in the table (1.43).
power of x 0 1 2 3 4 5 …
n=1 1 1
n=2 1 2 1
n=3 1 3 3 1
↓
n=4 1 4 6 4 1
n=5 1 5 10 10 5 1
↓
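The row-by-row construction of Pascal's triangle translates directly into a short generator. The sketch below is one possible rendering of the rule; the function name `pascal_rows` is hypothetical.

```python
def pascal_rows(n_max):
    """Yield the coefficient rows of (1 + x)^n for n = 1, 2, ..., n_max."""
    row = [1, 1]                                  # n = 1
    for _ in range(n_max):
        yield row
        # '1' at each end; each inner entry is the sum of the entry directly
        # above and the one to its left in the previous row
        row = [1] + [row[i - 1] + row[i] for i in range(1, len(row))] + [1]

rows = list(pascal_rows(6))
for n, coefficients in enumerate(rows, start=1):
    print(f"n={n}", *coefficients)
```

Each new row is built from the previous one only, so the cost grows with the number of entries rather than with repeated polynomial multiplication.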
powers of 10 are implicit in the positions of the digits in the sequence 365. Similarly,
in Pascal’s triangle we temporarily hide the powers xr, and only have to manip-
We shall now prove the binomial theorem, which provides an explicit formula
(rather than an algorithm) for the coefficients in $(a + b)^n$ where n is any positive
integer. We shall start with the standard case $(1 + x)^n$.
All the essentials of the general result can be illustrated by the special case
n = 3. Consider the following expansion, obtained by multiplication:
Each term on the right is really the product of three elements (either 1s or xs), one
from each of the three brackets on the left. Thus the first term is really 1 × 1 × 1, and
$x_1 x_3$ arises from the product $x_1 \times 1 \times x_3$. The terms are then sorted into groups
according to the number of x factors. A formula for the number of terms in each
group can be obtained by using the result (1.42) of Section 1.17. For example, the
number of terms having two x factors is equal to the number of ways of choosing
two out of the three available x factors. This number is given by
${}^3C_2 = \dfrac{3!}{2!\,1!} = 3.$
Similarly for the other groups: for r = 1, 2, 3 the group containing the products of
r x-factors has ${}^3C_r$ members. Finally, suppose that all the x elements are made
equal by putting
$x_1 = x_2 = x_3 = x.$
The bracketed expression in (1.45) containing the products of r of the x’s, with
r = 1, 2, or 3, collapses into ${}^3C_r x^r$, so we obtain:
$(1 + x)^3 = 1 + {}^3C_1 x + {}^3C_2 x^2 + {}^3C_3 x^3.$
We now prove the general result:
Binomial theorem
If n is a positive integer and a, b are any numbers:
(a) $(1 + x)^n = 1 + {}^nC_1 x + {}^nC_2 x^2 + \cdots + {}^nC_{n-1} x^{n-1} + {}^nC_n x^n.$
(b) $(a + b)^n = a^n + {}^nC_1 a^{n-1} b + {}^nC_2 a^{n-2} b^2 + \cdots + {}^nC_{n-1} a b^{n-1} + {}^nC_n b^n,$
where
${}^nC_r = \dfrac{n!}{r!(n - r)!} = \dfrac{n(n - 1) \cdots (n - r + 1)}{r!}.$
(The notation $\binom{n}{r} = {}^nC_r$
(The meaning of ‘an r-fold product’ is one containing exactly r different x’s: thus
$x_p x_q$, p ≠ q, is a two-fold product.) For each fixed value of r, the number of r-fold
products is equal to the number, nCr, of combinations of r objects selected without
repetition from the n different objects $x_1$ to $x_n$.
Now put
$x_1 = x_2 = x_3 = \cdots = x_n = x$
into (1.45). For each value of r, the sum of the r-fold products collapses into the
form
$(x^r + x^r + x^r + \cdots \text{ to } {}^nC_r \text{ terms}) = {}^nC_r x^r.$
This gives the result (1.46a).
(b) To obtain the expansion of $(a + b)^n$, we follow the process that led to
eqn (1.44), namely
$(a + b)^n = a^n[1 + (b/a)]^n,$
and use (1.44a) with x = b/a; this becomes
$(a + b)^n = a^n\left[1 + {}^nC_1\left(\dfrac{b}{a}\right) + {}^nC_2\left(\dfrac{b}{a}\right)^2 + \cdots + {}^nC_{n-1}\left(\dfrac{b}{a}\right)^{n-1} + \left(\dfrac{b}{a}\right)^n\right].$
After removing the brackets the sum is as given in (b). An alternative proof of this
theorem, using a calculus method, is given at the end of Chapter 4.
It is simplest to add a line to Pascal’s triangle (eqn (1.43)) to obtain the coefficients, rather
than to use the general form of the binomial theorem (1.46):
power of x:  0  1  2   3   4   5  6  …
n = 5        1  5  10  10  5   1
n = 6        1  6  15  20  15  6  1
Then
$(x + x^{-1})^6 = x^6(1 + x^{-2})^6$
$= x^6[1 + 6(x^{-2}) + 15(x^{-2})^2 + 20(x^{-2})^3 + 15(x^{-2})^4 + 6(x^{-2})^5 + (x^{-2})^6]$
$= x^6[1 + 6x^{-2} + 15x^{-4} + 20x^{-6} + 15x^{-8} + 6x^{-10} + x^{-12}]$
$= x^6 + 6x^4 + 15x^2 + 20 + 15x^{-2} + 6x^{-4} + x^{-6}.$
Self-test 1.14
Using the binomial theorem, find a series expansion for $(1 + x)^{2n} + (1 - x)^{2n}$.
Problems
1.1 (Section 1.3). Sketch graphs of the following equations over the intervals stated:
(a) $y = x^4$, $-1.5 \le x \le 1$;
(b) $y = x(1 - x)$, $-1 \le x \le 2$;
(c) $y = 1 + x + x^2$, $|x - 1| \le 2$;
(d) $y = |x - 1|$, $-3 \le x \le 3$;
(e) $y = |x| + |x - 3| + |x + 2|$, $-3 \le x \le 4$;
(f) $y = ||x| - 1|$, $-2 \le x \le 2$;
(g) $y = \sqrt{x^2 + 1}$, $|x| \le 2$.
1.2 (Straight lines, Section 1.3). Find the straight lines through the following pairs of points:
(a) (1, 1), (−1, 5);
(b) (0, 1), (2, 1);
(c) (2, 1), (−1, −1).
Sketch the triangle formed by these lines. Find the lengths of each side of the triangle.
1.3 (Straight line, Section 1.3). What are the slopes of the following straight lines, and where do they
cut the coordinate axes?
(a) y = x − 1; (b) 3y = x − 2; (c) 2x + 5y = 4.
1.4 (Straight line, Section 1.3). Find the equations of the following straight lines:
(a) passing through (1, 2) inclined at 45° to the x axis;
(b) passing through (−1, −2) with slope −2;
(c) with slope 0.5 and x axis intercept x = 1;
(d) through (1, 2) parallel to the line y = 3x − 4;
(e) through (−1, 3) perpendicular to the line y = 4x − 1.
1.5 Show that the following pairs of lines are mutually perpendicular:
(a) 2x + 3y − 2 = 0 and −3x + 2y − 3 = 0;
(b) y = 2x + 1 and $y = \tfrac12(3 - x)$;
(b) f(t) = 80 (t 0), 9 2t (t 0);
1.7 (Circles, Section 1.3). Find the centre and radius for each of the following circles:
(a) $x^2 + y^2 = 9$;
(b) $(x - 1)^2 + y^2 = 4$;
(c) $x^2 + y^2 - 2x - 2y - 21 = 0$;
(d) $4x^2 - 4x + 4y^2 + 4y = 9$.
1.8 (Circles, Section 1.3). Find the equation of the circle centred at (1, −2) with radius 3.
1.14 (Trigonometric functions, Section 1.6). Using the methods of Examples 1.5 and 1.6, obtain
(a) $\sin \tfrac14\pi$; (b) $\sin \tfrac12\pi$; (c) $\sin \pi$;
(d) $\sin(-\tfrac34\pi)$; (e) $\cos \tfrac16\pi$; (f) $\cos \tfrac56\pi$;
(g) $\sin(-\tfrac13\pi)$; (h) $\cos(-\tfrac23\pi)$.
1.15 (Trigonometric functions, Section 1.6). Use (1.18c) to show that
(a) $\cos^4 A = \tfrac18(3 + 4\cos 2A + \cos 4A)$;
(b) $\sin^4 A = \tfrac18(3 - 4\cos 2A + \cos 4A)$.
to solve the equation $x^3 - x + 1 = y$ for x.)
1.22 (Sections 1.10, 1.11). Solve the following equations for x:
(a) $e^{2x} = 13$; (b) ln 3x = 2;
(c) ln x− –3 = 1; (d) $3e^{3x} = 1$;
(e) $e^x + e^{-x} = 2$ (hint: multiply through by $e^x$ first);
(f) $e^{\ln 2x} = 4$; (g) $\ln e^{2x} + 3 \ln e^{5x} = 2$;
(h) ln(x + 1) + ln(x − 1) = 0;
(i) ln(x + 1) + ln(x − 1) = e;
(j) $2^x = 3$; (k) $3^{2x} = \tfrac12$;
(l) sinh 2x = 4; (m) 2 sinh x = 2 cosh x + 3.
1.23 Express $2^x$ as a power of e.
1.24 (Section 1.12). Prove that $10^x$ doubles its value in any interval of length equal to
$\dfrac{\ln 2}{\ln 10}.$
1.25 Sketch regions in the (x, y) plane defined by the following inequalities:
(a) (x − 1)2 + y 2 9;
(b) x 0, y 0, and x + y 1;
(c) x2/4 + y2/9 1;
(d) x 2 + y 2 1 and x 0;
(e) | x | + | y| 1.
1.26 Prove that $\tanh^{-1} x = \tfrac12 \ln[(1 + x)/(1 - x)]$ for $-1 < x < 1$.
1.27 Figure 1.33 shows a cross-section of a simple model of a piston and crankshaft. The crankshaft
displacement AC is given by
$AC = 2.5[\sin \omega t + \sqrt{4 - \cos^2 \omega t}]$
(in cm), where t is measured from θ = 0, and state ω in radians per second.
1.28 An oscillation takes the form
x = 3 cos ωt + 4 sin ωt.
By finding numbers c and φ such that
c cos φ = 3, c sin φ = 4
express x as a single cosine term. What are the amplitude and phase of the oscillation?
1.29 The exponential function $f(t) = C e^{-\alpha t}$ satisfies the conditions f(0) = 2 and f(1) = 0.5.
Find the constants C and α. What is the value f(2)?
1.30 A yacht, which has a draught of 2 metres, is anchored in a tidal estuary, in which the depth of
water around the yacht is
5 + 4.5 sin 0.5t
(in metres), where the time t is measured in hours. What is the tidal period in hours? Over how many
hours in one period can the yacht float free of the estuary?
1.31 Draw sketches of the graphs of the following curves given in polar coordinates, by constructing
a table of values of r for equally spaced angles (say 15° intervals):
(a) the cardioid r = 0.5(1 + cos θ);
(b) the folium $r = (4 \sin^2\theta - 1) \cos\theta$;
(c) the four-leaved rose r = sin 2θ;
(d) the Archimedean spiral r = 0.04θ (extend the interval in θ to [0, 6π]);
(e) the equiangular spiral $r = 0.1\, e^{0.1\theta}$ (extend the interval in θ to [0, 6π]).
(a) sin 4x; (b) cos(π + t); (c) sin t + cos 2t; following series:
(d) sin(x2); (e) e−sin x; (f) cos2x; 7 6 5
(b) Express the following in terms of factorials: obtain the numbers in each such grouping.
(i) 2 × 4 × 6 × ··· × (2m), (ii) 1 × 3 × 5 × ··· × Check the total against the number in (a).
(2m + 1), where m is a positive integer.
1.53 Suppose there is a collection of N objects of
1.48 (a) Calculate the numbers represented by different types A, B, etc., all the separate types
(i) 5 P4; (ii) 9 P3; (iii) 6 P3; (iv) 7C3; (v) 7C4; (vi) 10C5; being distinct from the others in some way, but
(vii) 100C98; (viii) (107). objects of a particular type are identical. There are
(b) Show that n Pn = n Pn−1, and explain why by NA identical objects of type A, NB identical objects
using an example. of type B, and so on. Show that:
(a) (A generalization of Example 1.22.) the
1.49 Given four letters A, B, C, D, obtain: possible number of distinct permutations
(a) the number of possible permutations of the four of the N objects is
letters, without repetitions of letters within a N!
permutation; ;
NA! NB! NC! …
(b) the number of three-letter combinations of
the letters, taken three at a time without (b) the combined number of distinct combinations
repetitions; that may be formed out of 1, 2, … , and N of
(c) the number of distinct four-letter the objects is equal to
permutations, in which all possible repetitions [(NA + 1)(NB + 1) … ] − 1.
within a permutation are allowed;
(d) the number of distinct three-letter 1.54 (a) Five representatives from each of the
combinations in which a letter may be countries France, Germany, Italy, and the UK are
repeated up to three times; to be seated along one side of a long table. Each
(e) the total number of permutations containing national group should sit together. In how many
from one to four letters without repetition; orders may the individuals be seated?
(f) the total number of distinct combinations (b) Suppose that the table is circular with the
containing from one to four letters, when up to representatives seated all round it. How many
three occurrences of a single letter is allowed. distinct orders are possible then? (Distinct
permutations are to be understood as circular
1.50 Find: permutations in the sense of Example 1.23.)
(a) the number of distinct three-letter ‘words’
obtainable from the letters A, B, C, D, E, in 1.55 Three prizes are to be distributed among
which E may occur 0, 1, or 2 times, but the 10 candidates. How many possible distinct
rest may occur only once; distributions are there in the following cases?
(b) the possible number of distinct six-letter words (a) The prizes are all equal with at most one for
in which E occurs exactly twice and the other any person.
letters only once. (b) The prizes are all unequal, and only one may
go to any person.
1.51 (a) How many distinct four-digit numbers (c) All the prizes are equal, and any person may
may be made up by using the digits 1, 2, 3, 4, 5, no receive up to three prizes.
digit being used more than once? (d) As in (c), but the prizes are all different.
(b) How many of the numbers in (a) are divisible
by 5? 1.56 (a) A committee of 4 is to be formed from
(c) How many of the numbers in (a) are divisible a pool of 2 accountants, 2 lawyers and 3 doctors.
by 2? How many committees contain (i) exactly 1
(d) How many distinct positive numbers are accountant, (ii) exactly 1 doctor? (Hint: enumerate
obtainable by using not more than four of the the possibilities.)
digits taken without repetition from the digits (b) The argument which follows is fallacious,
0, 1, 2, 3, 4? and the result it produces is false:
“Given N different objects, attempt to obtain a
1.52 There are four women and three men eligible new expression for the number of combinations
to fill four posts. NCn of n objects by means of a two-stage process.
(a) What is the total number of distinct Choose any r < n; there are NCr combinations of
combinations of personnel that can be selected? size r. To obtain combinations of size n,
supplement each of these by all combinations completed year (the annual growth rate) from the
N−rCn−r of size n − r taken form the remaining time it was purchased. Show that its value VN at the
N − r objects. This gives a total of N−rCn−r NCr end of the N th completed year is VN = A(1 + R)N.
combinations. Therefore N−rCn−r NCr = NCn.” Calculate VN over 5, 10, and 15 years when
Show that this result is false in general, and A = £1000 and R = 0.03 (usually expressed as 3%).
locate the source of the fallacy, by considering the (b) To value an investment when the time t
simple case where N = 7, n = 4 and r = 3, using from purchase is not a whole number of complete
letters for the objects. years the analogous formula Vt = A(1 + R)t is to
be used, where t is measured in years. Show that
1.57 (a) Write from memory the binomial over every period of T years the investment
expansions of the expressions (1 + x)n and (a + b)n, grows by a factor (1 + R)T (the proposed
where n is a positive integer. extension therefore has exactly the property
(b) Expand the expression (1 − x)6. we should hope for).
(c) Expand and simplify (x + x −1)5 and (x − x −1)5 (c) Obtain the doubling period of the investment
where x ≠ 0. when R = 3%, 6% and 9%. Obtain the 10-times
period when R = 6%.
1.58 Use the binomial theorem to show that
(1.01)10 ≈ 1.105. In a similar way, make an 1.63 Income from an investment is at a rate
approximation to (0.99)8. expressed as R per annum, but it is paid out
monthly to the investor at a rate r per month
1.59 By giving special values to the constants in on the current balance. Express r in terms of R.
the binomial theorem prove that, when n is a Why is R > 12r?
positive integer:
(a) 1 + 2 nC1 + 22 nC2 + ··· + 2n nCn = 3n, and 1.64 Money is borrowed from a finance company
1 − nC1 + nC2 − ··· + (−1)n nCn = 0. at an interest rate of rM per month. What is the
(b) 1 + nC2 + nC4 + ··· = 2n−1 = nC1 + nC3 + nC5 + ··· . equivalent compounded rate per annum? Calculate
the annual rate when the monthly rate is 1% and
1.60 Let
3%.
F(n, k) = nC0 + n+1C1 + n+2C2 + ··· + n+kCk,
where n and k are positive integers. Show that 1.65 (a) (Geometric series: a model savings
F(n, k) + n+k+1Ck+1 = F(n, k + 1). Check that scheme.) At the start of every year an amount A
F(n, 0) = n+1C0 , F(n, 1) = n+2C1 is put into a savings scheme. The interest on the
F(n, k + 1) = F(n, k) + n+k+1Ck+1 current balance at the end of each complete year
is reinvested, the (constant) annual rate being R.
for all n and k. By starting with k = 0 and k = 1, Show that the value VN of the fund at the end of
deduce that year N is given by VN = A(1 + R)[(1 + R)N − 1]/R.
F(n, k) = n+k+1Ck (b) Calculate VN after 10 years at 5% interest,
for all values of k. the annual subscriptions being £100, and find the
percentage gain on the total sum invested.
1.61 Expand 1/(x 2 + 3x + 2) in powers of x using (c) Find the expression for the fund value if the
partial fractions. saver contributes an amount 2A every 2 years,
over a period of 2M years, where M is an
1.62 (a) The value of a single investment of integer. Obtain the value of the fund using
amount A grows by a constant fraction R in each the data in (b).
Differentiation
2
CONTENTS
Figure 2.1 shows the graph of a straight line. The x and y coordinates are assumed
to have the same scale. Choose any two points P : (x1, y1) and Q : (x2, y2) which
lie on the line. If we measure the angle α from the positive x direction then
$$\tan\alpha = \frac{y_2 - y_1}{x_2 - x_1} \tag{2.1}$$
(see (1.6)). The value of tan α remains the same whether Q is to the right or left of
P, since the value of the fraction on the right is unchanged. The angle α itself will
differ in the two cases by an amount equal to π (or 180°), but this does not affect
the value of tan α. (If we refer to α itself to indicate the steepness of a line, we
choose the value that lies between ±90°, but normally we only need tan α.) Notice
that if the x and y scales differed, the angle α as depicted would not satisfy (2.1);
it would be too great or too small.
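[Fig. 2.1: a straight line through P : (x1, y1) and Q : (x2, y2), inclined at angle α to the x direction; the triangle PNQ has sides x2 − x1 and y2 − y1.]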
The slope or gradient of a straight line is defined to be the quantity tan α. If the
line is horizontal, tan α is zero. It is positive or negative according as the line
slopes upwards or downwards respectively as we go from left to right. Its magni-
tude increases or decreases as the inclination increases or decreases respectively,
becoming ±∞ when α = ±90°.
Consider now the slope or gradient of a curve at a point. Figure 2.2a shows a
typical curve. By the slope of the curve at the point P we mean the slope of the
tangent line to the curve at P. We can think of the tangent line as the line joining
two points on the curve which are ‘infinitely close together’, but it is no use
making P and Q coincide, since we simply get tan α = 0/0, which has no definite
meaning. It is necessary to carry out an indirect process.
Let P be the fixed point (see Fig. 2.2b). Take any other point Q on the curve
and join PQ by a straight line, called the chord PQ. If Q is some distance from P,
then the slope of PQ will not be close to that of PT, but if we take a succession of
points Q closer and closer to P, then the slope of the chord PQ can be made as
close as we wish to that of PT. The succession of points Q that we consider is
said to approach P. The corresponding value of the slope of PQ then approaches
a limit or a limiting value, and this is equal to the slope of the curve at P. As in
Section 1.10 the sign → signifies the word ‘approaches’, so we can write:
[Fig. 2.2: (a) a typical curve; (b) the chord PQ and the tangent line PT at the fixed point P.]
We shall be able to obtain the exact value of the slope of PT, which is the same
as the slope of the curve at P, by carrying out the approach of Q to P in algebraic
terms. To do this, we introduce a new symbol
δx
(pronounced ‘delta-x’). This is a single symbol: the Greek letter δ stands for the
words ‘the increment in’ or ‘the change in’ something: in this case, the increment
in the value of x from P to Q. We represent the point Q by the abscissa x + δx.
Note that δx can be positive or negative depending on the position of Q in
relation to P. The corresponding incremental change in y from P to Q is δy.
The coordinates of Q are (x + δx, y + δy) as shown in Fig. 2.3. The separation
between P and Q is indicated by the differences δx and δy in the triangle PNQ.
Then, by (2.1),
$$\text{slope of } PQ = \frac{\delta y}{\delta x}. \tag{2.3}$$
Now let δx → 0 so that Q → P: the ratio δy/δx approaches a number which is
equal to the value of the slope at P. We first show what happens numerically in a
particular case.
[Fig. 2.3: the points P : (x, y) and Q : (x + δx, y + δy) on the curve, with the triangle PNQ of sides δx and δy.]
Example 2.1 Find the slope of the curve $y = x^2$ at the point $P : (\tfrac13, \tfrac19)$ on the curve (Fig. 2.4).
[Fig. 2.4: the curve $y = x^2$ near $P : (\tfrac13, \tfrac19)$, with $Q : (\tfrac13 + \delta x, \tfrac19 + \delta y)$ and the triangle PNQ of sides δx and δy.]
We could have worked out the slope at P in this example without doing any
calculation. From (2.4),
$$\frac{\delta y}{\delta x} = \frac{\tfrac23\,\delta x + (\delta x)^2}{\delta x} = \tfrac23 + \delta x \tag{2.5}$$
2.2
Therefore
$$\frac{\delta y}{\delta x} = \frac{2x\,\delta x + (\delta x)^2}{\delta x} = 2x + \delta x.$$
This process is a model for treating other functions: for example, you could now
show in the same way that the slope of the graph of y = x3 at any point is equal
to 3x 2.
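The approach of the chord slope to a limit can also be watched numerically. The sketch below is only an illustration: the helper name `chord_slope` and the particular values of δx are assumptions made here, not part of the text. It tabulates δy/δx for y = x² at x = 1/3, where the values should settle near 2/3, and for y = x³ at x = 2, where they should settle near 3x² = 12.

```python
def chord_slope(f, x, dx):
    """Slope of the chord joining (x, f(x)) and (x + dx, f(x + dx)); dx must be nonzero."""
    return (f(x + dx) - f(x)) / dx


def square(x):
    return x * x


def cube(x):
    return x ** 3


if __name__ == "__main__":
    # Q approaches P from both sides: positive and negative increments.
    increments = (0.1, 0.01, 0.001, -0.001, -0.01, -0.1)
    for dx in increments:
        s2 = chord_slope(square, 1 / 3, dx)
        s3 = chord_slope(cube, 2.0, dx)
        print(f"dx = {dx:7.3f}   x^2 at 1/3: {s2:.6f}   x^3 at 2: {s3:.6f}")
```

For y = x² the printed slope differs from 2/3 by exactly δx (up to rounding), which is what the algebra predicts.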
Limit notation
Let y = f(x), so that δy = f(x + δx) − f(x). Then the value approached by δy/δx
when δx → 0 is denoted by
$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x}. \tag{2.7}$$
Read this as ‘The limit, or the limiting value, of δy/δx as δx → 0’. (The lim sign is
used in many other contexts too.)
The result of the process limδx→0 δy/δx, where δy = f(x + δx) − f(x), is called the
derivative of y with respect to x, or the derivative of f(x). The process is called dif-
ferentiation. We worked out earlier that, if y = f(x) = x 2, then the derivative is equal
to 2x. The following notations are standard short ways of indicating a derivative:
$$\frac{dy}{dx}, \quad \frac{df(x)}{dx}, \quad\text{or}\quad \frac{d}{dx}f(x) \quad\text{all signify}\quad \lim_{\delta x \to 0} \frac{\delta y}{\delta x}.$$
The symbol dy/dx is usually pronounced ‘dee-y by dee-x’. Notice that the letter
used is an ordinary d, not δ.
$$\frac{dy}{dx} \quad\text{or}\quad \frac{d(x^2)}{dx} \quad\text{or}\quad \frac{d}{dx}x^2 = 2x.$$
Strictly speaking, dy/dx should be regarded as a single shorthand symbol
representing the longer expression limδx→0 δy/δx, and not as a ratio which can
be taken to pieces. However, its great usefulness is that it often behaves just
like an ordinary ratio of nonzero quantities, and we shall later see cases where
this property guides us to true results and makes them easy to remember.
(b) The slope m of a curve at any point (x0, y0), where y0 = f(x0), is given by
$$m = \left(\frac{dy}{dx}\right)_{x = x_0},$$
Self-test 2.1
Find the equations of the tangent to the curve $y = x - x^3$ at (1, 0), and the normal
to the same curve at $(\tfrac12, \tfrac38)$. Find where the tangent and normal intersect.
2.3 Rates of change
The quantity $\lim_{\delta x \to 0} \delta y/\delta x$ is usually needed to solve problems which have no
immediate connection with the slope of graphs: this idea was only introduced to
give the reader a picture to hold on to. Moreover, it is not always appropriate to
call the variables x and y if other letters arise more naturally.
For example, suppose that a car is moving along a straight road, represented by
an x axis, and that at time t its displacement (which may be positive or negative)
from the origin is given by
x = f(t).
We can deduce the velocity from moment to moment from this information.
Choose any moment t, and suppose that, between times t and t + δt, the car
moves from x to x + δx. Then δx is given by
δx = f(t + δt) − f(t).
The quantities δt and δx could be imagined as being recorded with a stopwatch
and distance meter, and the average velocity over the interval δt is
or alternatively, by (2.8),
$$v = \frac{dx}{dt}.$$
We can borrow the result (2.6) to complete the calculation in one case.
Suppose that
x = t 2.
Equation (2.6) says in effect that
$$\text{if } y = x^2 \text{ then } \frac{dy}{dx} = 2x,$$
and by changing the letters x and y to t and x respectively we obtain:
$$\text{if } x = t^2 \text{ then } \frac{dx}{dt} = 2t.$$
Therefore the velocity is
v = 2t.
Another way of expressing the meaning of velocity is that velocity is the rate of
change of displacement with time. Similarly, acceleration a is the rate of change
so
$$a = \frac{dv}{dt} = \lim_{\delta t \to 0} \frac{\delta v}{\delta t} = 2.$$
The expression ‘rate of change’ means the same as the term ‘growth rate’ that we
used in Section 1.10. As seen in the next example, the idea of rate of change is
quite general and need not involve time.
Example 2.2 Find the rate of change of the area of a circle with respect to
its radius.
Call the radius r and the area A. The rate of change of A with respect to r is
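$$\frac{dA}{dr} \quad\text{or}\quad \lim_{\delta r \to 0} \frac{\delta A}{\delta r}.$$
Since $A = \pi r^2$, we have
$$\delta A = \pi(r + \delta r)^2 - \pi r^2 = \pi(2r\,\delta r + (\delta r)^2).$$
Therefore
$$\frac{\delta A}{\delta r} = \pi(2r + \delta r).$$
Now let δr → 0; we obtain
$$\frac{dA}{dr} = \lim_{\delta r \to 0} \frac{\delta A}{\delta r} = 2\pi r.$$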
This result could have been obtained by using our previous result
$$\frac{d(t^2)}{dt} = 2t,$$
with r in place of t, and multiplying it by π. (Notice also that 2πr is the circumference:
the result can be interpreted as meaning that if we increase r by a small amount δr,
then the area increase is nearly equal to that of a narrow strip of length 2πr and
breadth δr.)
Self-test 2.2
The volume V of a sphere of radius r is given by $V = \tfrac43 \pi r^3$. Find the rate of
change of volume with radius.
2.4 Derivative of $x^n$ (n = 0, 1, 2, 3, … )
The following is our first general result:
(a) If y = c, where c is a constant, then
$$\frac{dy}{dx} = 0.$$
(b) If $y = x^n$, where n = 1, 2, 3, … , then
$$\frac{dy}{dx} = n x^{n-1}. \tag{2.9}$$
To prove (a): the graph of y = c is a horizontal straight line; therefore its slope
is zero, so dy/dx = (d/dx)c = 0.
To prove (b) in the most elementary way, we shall use an identity: if n is a
positive integer and a, b are any numbers,
$$a^n - b^n \equiv (a - b)(a^{n-1} + a^{n-2}b + a^{n-3}b^2 + \cdots + b^{n-1}).$$
This can be verified by multiplying out the two brackets on the right; everything
cancels except for the two terms on the left.
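Follow (2.8a), with $f(x) = x^n$, so that $\delta y = (x + \delta x)^n - x^n$:
$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x} = \lim_{\delta x \to 0} \frac{1}{\delta x}\left[(x + \delta x)^n - x^n\right].$$
Put $a = x + \delta x$ and $b = x$ into the identity, noticing that
$$a - b = (x + \delta x) - x = \delta x.$$
Then
$$\frac{\delta y}{\delta x} = \frac{1}{\delta x}(\delta x)\left[(x + \delta x)^{n-1} + (x + \delta x)^{n-2}x + \cdots + x^{n-1}\right]$$
$$= (x + \delta x)^{n-1} + (x + \delta x)^{n-2}x + \cdots + x^{n-1}$$
when δx ≠ 0. Now let δx → 0; we obtain
$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x} = x^{n-1} + x^{n-1} + \cdots + x^{n-1}.$$
There are n terms on the right, each equal to $x^{n-1}$, so finally,
$$\frac{dy}{dx} = n x^{n-1}.$$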
(In Section 3.4 we show that (2.9b) is in fact true for all values of n.)
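The role of the identity in this proof can be checked numerically. The following sketch is an illustration only (the function names are chosen here and are not from the text): it evaluates the n-term sum that the identity produces for δy/δx, compares it with the directly computed quotient, and confirms that putting δx = 0 in the sum gives n xⁿ⁻¹.

```python
def quotient_via_identity(x, dx, n):
    """The n-term sum a^(n-1) + a^(n-2) b + ... + b^(n-1) with a = x + dx and b = x."""
    a, b = x + dx, x
    return sum(a ** (n - 1 - k) * b ** k for k in range(n))


def quotient_direct(x, dx, n):
    """The difference quotient ((x + dx)^n - x^n) / dx, valid only for dx != 0."""
    return ((x + dx) ** n - x ** n) / dx


if __name__ == "__main__":
    x, n = 2.0, 5
    for dx in (0.1, 0.01, 0.001):
        print(dx, quotient_via_identity(x, dx, n), quotient_direct(x, dx, n))
    # With dx = 0 the sum is still defined, and every one of its n terms equals x^(n-1).
    print(quotient_via_identity(x, 0.0, n), n * x ** (n - 1))
```

The direct quotient cannot be evaluated at δx = 0 (it gives 0/0), whereas the sum can, which is the point of rewriting δy/δx before taking the limit.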
Example 2.3 Obtain (a) the general expression for dy/dx when $y = x^3$; (b) the
slope of the curve $y = x^3$ at P : (2, 8); (c) the angle of inclination of the tangent
line to $y = x^3$ at the point P; (d) the equation of the tangent line through P;
(e) the velocity v and acceleration a of a point with coordinate x, when $x = t^3$.
(a) From (2.9) with n = 3, we have
$$\frac{dy}{dx} = 3x^{3-1} = 3x^2.$$
(b) The slope of the curve at P is equal to the value of dy/dx at x = 2, which is 12.
(c) The slope is equal to tan α, where α is the angle of inclination, so α = 85.2°.
(d) Let (x, y) now represent any point on the tangent line at (2, 8). The slope of the
tangent is equal to 12, so from (2.1),
$$\frac{y - 8}{x - 2} = 12.$$
Therefore the equation of the tangent line is y = 12x − 16.
(e) From Section 2.3, v = dx/dt = (d/dt)t 3 = 3t 2. Also
$$a = \frac{dv}{dt} = \frac{d}{dt}(3t^2) = 3\frac{d}{dt}t^2 = 6t.$$
A little thought about the process of finding limδt→0 δv/δt will persuade the reader that it
is right to take the constant 3 from under the differentiation sign in the last line; see also
the next section.
The result (2.10c) follows easily by repeated use of (a) and (b). We can use
this rule together with (2.9) to obtain the derivatives of polynomials, as in the
following example.
2.5
(a) From (2.10c),
Example 2.5 A car travels along a straight road with varying velocity v for
one hour. At time t hours, its displacement from the starting point O is given by
$x = 60t^2(3 - 2t)$ $(0 \le t \le 1)$ kilometres. Find expressions for (a) the velocity v;
(b) the acceleration a.
(a) The velocity is the rate of change of displacement with time:
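$$\begin{aligned}
v = \frac{dx}{dt} &= \frac{d}{dt}\left[60t^2(3 - 2t)\right] \\
&= 60\frac{d}{dt}(3t^2 - 2t^3) = 60\left(3\frac{d(t^2)}{dt} - 2\frac{d(t^3)}{dt}\right) \\
&= 60[3(2t) - 2(3t^2)] = 360(t - t^2) \quad \text{(in km h}^{-1}\text{)}.
\end{aligned}$$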
The car stops after 1 hour at t = 1.
(b) Acceleration is the rate of change of velocity with time:
$$a = \frac{dv}{dt}.$$
Therefore
a = 360(1 − 2t) (in km h−2).
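The two results of Example 2.5 can be checked against numerical slopes. In the sketch below, which is illustrative only, the helper `central_difference` and its step size h are choices made here and not part of the text; the formulas for displacement, velocity and acceleration are those of the example.

```python
def displacement(t):
    """x = 60 t^2 (3 - 2t), in kilometres, for 0 <= t <= 1 (t in hours)."""
    return 60 * t ** 2 * (3 - 2 * t)


def velocity(t):
    """v = 360 (t - t^2), in km/h, as obtained in part (a)."""
    return 360 * (t - t ** 2)


def acceleration(t):
    """a = 360 (1 - 2t), in km/h^2, as obtained in part (b)."""
    return 360 * (1 - 2 * t)


def central_difference(f, t, h=1e-5):
    """Numerical estimate of df/dt using points on both sides of t."""
    return (f(t + h) - f(t - h)) / (2 * h)


if __name__ == "__main__":
    for t in (0.1, 0.25, 0.5, 0.9):
        print(t,
              velocity(t), central_difference(displacement, t),
              acceleration(t), central_difference(velocity, t))
```

The velocity is zero at both t = 0 and t = 1, and the acceleration changes sign at t = 0.5, where the velocity is greatest.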
Self-test 2.3
Find the derivative of y = 10x7 + 7x10.
First consider
$$\lim_{\varepsilon \to 0} \frac{e^{\varepsilon} - 1}{\varepsilon},$$
where e = 2.718 28… is the number defined in Section 1.10. If we put ε = 0 we get
0/0, which is meaningless, but the approach to a limit can be seen in the following
table:
(An approach to zero through negative ε values is similar.) It looks as if the limit
is equal to 1.
To prove this, recall that in Section 1.10 it was shown that the graph y = ex
intersects the y axis at 45°; that is to say, its slope there is equal to 1. (This is the
characteristic property of the base e = 2.718 28….) The same thing is true if we plot
y = eε against ε, as in Fig. 2.5. Referring to this figure:
$$\frac{e^{\varepsilon} - 1}{\varepsilon} = \frac{RQ - OP}{PN} = \frac{NQ}{PN},$$
which represents the slope of the chord PQ. When ε → 0, the slope of the chord
PQ approaches the slope of the tangent PT, which is equal to 1. Therefore we have
proved that
$$\lim_{\varepsilon \to 0} \frac{e^{\varepsilon} - 1}{\varepsilon} = 1. \tag{2.11}$$
[Fig. 2.5: Graph of $y = e^x$ (equal scales).]
$$\lim_{\varepsilon \to 0} \frac{\sin \varepsilon}{\varepsilon},$$
ε being measured in radians. The approach to the limit is shown in the following
table, which includes negative values of ε:
$$\frac{\sin \varepsilon}{\varepsilon} = \frac{PN}{\text{arc } PR}. \tag{2.12}$$
[Fig. 2.6: panels (a) and (b), showing the angle ε, the points A, B, N, P, Q, R and the arc PR.]
Now let Q recede some distance towards the left, as illustrated in Fig. 2.6b. The
angle ε decreases, and ε → 0 as Q recedes to infinity. At the same time the arc PR
approaches the straight line PN, tending ultimately to coincide with it. Therefore,
when ε → 0, the length of the arc PR approaches the length of PN; so, from (2.12),
$$\lim_{\varepsilon \to 0} \frac{\sin \varepsilon}{\varepsilon} = 1. \tag{2.13}$$
Finally we consider
$$\lim_{\varepsilon \to 0} \frac{\ln(1 + \varepsilon)}{\varepsilon}.$$
Figure 2.7 shows the graphs of y = ln ε and y = ln(1 + ε ). The graph y = ln ε (see
Fig. 1.31) passes through the point (1, 0) at 45° to the ε axis. The graph y = ln(1 + ε )
is the same graph moved over to the left by a distance 1, so it passes through the
origin O at 45°: that is to say, it has slope equal to 1 at the origin. Therefore
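$$\lim_{\varepsilon \to 0} \frac{\ln(1 + \varepsilon)}{\varepsilon} = 1. \tag{2.14}$$
[Fig. 2.7: the graphs of y = ln ε and y = ln(1 + ε), crossing the ε axis at 45° at ε = 1 and at the origin respectively.]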
You may be glad to know that there are no more complicated limits to be
evaluated.
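The numerical approach to each of the three limits (2.11), (2.13) and (2.14) can be tabulated with a few lines of code. This is a sketch for illustration: the function names and the chosen values of ε are assumptions made here, and ε is in radians for the sine ratio.

```python
import math


def exp_ratio(eps):
    """(e^eps - 1) / eps, for eps != 0."""
    return (math.exp(eps) - 1) / eps


def sin_ratio(eps):
    """sin(eps) / eps, with eps in radians and eps != 0."""
    return math.sin(eps) / eps


def log_ratio(eps):
    """ln(1 + eps) / eps, for eps != 0 and eps > -1."""
    return math.log(1 + eps) / eps


if __name__ == "__main__":
    print("   eps      (e^eps-1)/eps   sin(eps)/eps   ln(1+eps)/eps")
    for eps in (0.5, 0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
        print(f"{eps:7.3f}   {exp_ratio(eps):13.6f}   {sin_ratio(eps):12.6f}   {log_ratio(eps):13.6f}")
```

All three columns move towards 1 as ε shrinks, from either side of zero; none of the functions can be evaluated at ε = 0 itself.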
Self-test 2.4
What are the following limits?
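(a) $\lim_{\varepsilon \to 0} \dfrac{e^{3\varepsilon} - e^{\varepsilon}}{\varepsilon}$; (b) $\lim_{\varepsilon \to 0} \dfrac{\sin 2\varepsilon}{\varepsilon}$; (c) $\lim_{\varepsilon \to 0} \dfrac{\ln(1 + 3\varepsilon)}{\varepsilon}$.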
2.7
y = e x.
Then according to the definition (2.8),
$$\frac{d}{dx}e^x = e^x. \tag{2.15}$$
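Putting $\tfrac12 \delta x = \varepsilon$, we have from (2.13),
$$\frac{dy}{dx} = \lim_{\varepsilon \to 0} \frac{\sin \varepsilon}{\varepsilon}\cos(x + \varepsilon) = \lim_{\varepsilon \to 0} \frac{\sin \varepsilon}{\varepsilon}\,\lim_{\varepsilon \to 0} \cos(x + \varepsilon) = \cos x.$$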
Therefore
$$\frac{d}{dx}\sin x = \cos x \quad \text{(where } x \text{ is in radians)}. \tag{2.16}$$
By a closely similar argument, it can be shown that
$$\frac{d}{dx}\cos x = -\sin x \quad (x \text{ in radians}). \tag{2.17}$$
$$\frac{d}{dx}\ln x = \frac{1}{x}. \tag{2.18}$$
Function y = f(x)          Derivative dy/dx or df(x)/dx
c (c = constant)           0
x^n (n = 1, 2, … )         n x^(n−1)
e^x                        e^x
sin x                      cos x
cos x                      −sin x
ln x (x positive)          1/x (or x^(−1))
                                                        (2.19)
The derivatives of more complicated functions can be obtained from these by
using the rules described in the next chapter. A more extensive table is given
in Appendix D. Remember rule (2.10) for the addition of functions and
multiplication by constants.
Example 2.7 Obtain the equation of the tangent line at the point $(\tfrac12\pi, \pi)$ on the
graph of $y = 2x - 3\cos x$.
At a general point on the curve,
$$\frac{dy}{dx} = \frac{d}{dx}(2x - 3\cos x) = 2\frac{dx}{dx} - 3\frac{d}{dx}\cos x \quad \text{(by (2.10))}$$
$$= 2 - 3(-\sin x) = 2 + 3\sin x$$
(from the table). At $(\tfrac12\pi, \pi)$ this becomes equal to
$$2 + 3\sin\tfrac12\pi = 5,$$
and this is the slope of the tangent line at the point. The equation of the tangent line is
therefore
$$\frac{y - \pi}{x - \tfrac12\pi} = 5,$$
or $y = 5x - \tfrac32\pi$.
Self-test 2.5
Find the derivatives of cosh x and sinh x.
$$\frac{d}{dx}\left[\frac{d}{dx}\left(\frac{dy}{dx}\right)\right] = \frac{d}{dx}\left(\frac{d^2y}{dx^2}\right) = \frac{d^3y}{dx^3} \quad\text{or}\quad \frac{d^3f(x)}{dx^3},$$
and so on.
For any polynomial of degree n the (n + 1)th derivative will be zero.
Example 2.9 Write down the sequence y, dy /dx, d2y /dx2, … , d7y /dx7 when
y = sin x.
The sequence is
sin x, cos x, −sin x, −cos x, sin x, cos x, −sin x, −cos x
(and it continues in this regular way).
The following example involves the factorial n!. As we saw in Section 1.17, n! is
defined as
n! = n(n − 1)(n − 2) … 2· 1.
Remember that 0! is defined to be 1.
2.10
(b) When r = n, we obtain from (i)
$$\frac{d^n}{dx^n}(x^n) = n(n - 1)(n - 2) \cdots (n - n + 1)x^0 = n(n - 1)(n - 2) \cdots 1 = n!,$$
Self-test 2.6
Find the higher derivative $\dfrac{d^r(x^{2r})}{dx^r}$.
[Fig. 2.8: Change of sign of dy/dx across a point where dy/dx = 0 in two cases: (a) case $d^2y/dx^2 > 0$; (b) case $d^2y/dx^2 < 0$.]
Example 2.11 Sketch one period 2π of the graph of sin x and indicate the signs
[Fig. 2.9: panels (a) and (b), showing y = sin x over one period from 0 to 2π, with the signs of dy/dx and $d^2y/dx^2$ indicated.]
Problems
2.1 (Computational). A point P is given on each of the following curves. Choose a sequence of points Q
which lie closer and closer to P on the curve, and make a table giving the slopes of the chords PQ.
From this table, estimate the slope of the curve at P. (Consider points on both sides of P.)
(a) $y = x^3$ at P : (1, 1);
(b) $y = x^{1/2}$ at P : (1, 1);
(c) $y = \cos x$ at P : $(\tfrac14\pi, 2^{-1/2})$;
(d) $y = e^x$ at P : (0, 1);
(e) $y = e^{2x}$ at P : (0, 1);
(f) $y = x^3 + x^{1/2}$ at P : (1, 2) (compare (a) and (b));
(g) $y = \ln x$ at P : (1, 0).

2.2 (Sections 2.1, 2.2). Obtain dy/dx in each of the following cases at the given point P. Do this
from first principles; that is, find δy in terms of δx, simplify δy/δx, and let δx → 0 to obtain
$\lim_{\delta x \to 0} \delta y/\delta x$, or dy/dx.
(a) $y = 3x$ at P : (2, 6); (b) $y = 3 - 2x$ at P : (1, 1);
(c) $y = 3x^2$ at P : (1, 3); (d) $y = x^3$ at P : (1, 1);
(e) $y = 1/x$ at P : $(2, \tfrac12)$; (f) $y = 3x + 2x^2$ at P : (1, 5);
(g) $y = (1 + 2x)^2$ at P : (−1, 1).

2.3 (Sections 2.1, 2.2). Obtain dy/dx from first principles (see Problem 2.2) at a general
point P : (x, y) on the given curves.
(a) $y = 3x^2$; (b) $y = x^3$;
(c) $y = 1/x$; (d) $y = x^2 + \tfrac12$;
(e) $y = x + 1/x$; (f) $y = 2x^2 - 3$.

2.4 (Section 2.3). Let x be the displacement of a point moving on a straight line, and let t represent
the time elapsed. Form a table by taking the given value of t and calculating the average velocity
between t and t + δt for diminishing values of δt. Use the table to estimate the velocity at time t.
(a) $x = 3t$ at t = 1; (b) $x = 5t^2$ at t = 3;
(c) $x = 2t - 5t^2$ at t = 1; (d) $x = 2t - 5t^2$ at t = 0.2.

2.5 Use the formula (2.9) to find dy/dx at the given points in the following cases.
(a) $y = x$ at any point; (b) $y = x^3$ at x = 3;
(c) $y = x^4$ at x = 2 and at x = −2.

2.6 From (2.9), write down the derivatives, dy/dx or (d/dx)f(x), for the given functions f(x). Use this
information to sketch rough graphs of f(x) (notice the sign and the magnitude of the slope of y = f(x)).
(a) $y = x$; (b) $y = x^2$; (c) $y = x^3$;
(d) $y = x^4$; (e) $y = x^5$.

2.7 Sketch a velocity–time graph and an acceleration–time graph for a point moving on a
straight line with displacement $x = t^3$. Use these to sketch a graph of acceleration against distance.
(See Example 2.5.)

2.8 In the following, different letters for the variables are used in place of the usual x and y.
Write down the derivatives in the appropriate form. (For example, if $w = r^3$, then $dw/dr = 3r^2$.)
(a) $V = \tfrac43\pi r^3$; (b) $S = \pi d^2$;
(c) $E = kT^4$ (k is a constant);
(d) $I = V/R$ (R is a constant);
(e) $H = RI^2$ (R is a constant);
(f) $V = RT/P$ (R and P are constant).

2.9 Differentiate the following functions by using (2.10):
(a) $3x^2 - 2x + 1$; (b) $x^7 - 3x^6 + x + 1$;
(c) $x + C$ (where C is a constant);
(d) $x(x - 1)$;
(e) $x^2(x^2 + 1) - 1$;
(f) $ax^2 + bx + c$ (where a, b, c are constants);
(g) $(x - 1)^2$.

2.10 Prove that the following pairs of curves intersect in a right angle at the points given.
(Hint: find dy/dx at the point for each curve.)
(a) $y = 1 + x - x^2$ and $y = 1 - x + x^2$ at (1, 1);
(b) $y = \tfrac12(1 - x^2)$ and $y = x - 1$ at (1, 0);
(c) $y = 1 - \tfrac13 x^3$ and $y = \tfrac16 + \tfrac12 x^2$ at $(1, \tfrac23)$.

2.11 Find the angle between the following curves at their points of intersection. (Hint: the angle of
intersection is the angle between the tangents to the curves at the point; then consider (1.7) and
the tangent formula of (1.17a) for the difference of angles.)
(a) $y = x^2$ and $y = 1 - x^2$;
(b) $y = \tfrac13 x^3$ and $y = x^2 - 2x + \tfrac43$.

2.12 (See Section 2.6.) Find the limits of the following functions when ε → 0. (Remember:
0/0 has no definite meaning.)
(a) $\dfrac{\varepsilon}{\varepsilon}$; (b) $\dfrac{\varepsilon}{2\varepsilon}$; (c) $\dfrac{\varepsilon^2}{\varepsilon}$;
(d) $\dfrac{e^{2\varepsilon} - 1}{2\varepsilon}$; (e) $\dfrac{e^{2\varepsilon} - 1}{\varepsilon}$; (f) $\dfrac{\sin 2\varepsilon}{2\varepsilon}$;
(g) $\dfrac{\sin 2\varepsilon}{\varepsilon}$; (h) $\dfrac{\ln(1 + \varepsilon^2)}{\varepsilon^2}$;
(i) $\dfrac{\sin \varepsilon}{\varepsilon}$ when ε is an angle measured in degrees;
(j) $\dfrac{\tan \varepsilon}{\varepsilon}$; (k) $\dfrac{\sinh \varepsilon}{\varepsilon}$; (l) $\dfrac{e^{-\varepsilon} - 1}{\varepsilon}$.

2.13 (See Section 2.7.) Obtain d(cos x)/dx in the same way that (2.16), for sin x, was obtained.

2.14 (See Section 2.7.) (a) Differentiate $e^{2x}$ by following the method leading to (2.15).
(b) Differentiate sin 2x by following the method leading to (2.16).
(c) Prove that $(d/dx)e^{-x} = -e^{-x}$ by following part-way the method leading to (2.15). (Hint:
$\lim_{\varepsilon \to 0}[(e^{-\varepsilon} - 1)/(-\varepsilon)] = 1$.)
Use this result to differentiate sinh x and cosh x (see (1.26) for the definitions).

2.15 Differentiate the following functions.
(a) $2\sin x - 3\cos x$;
(b) $\ln 3x$ (see Section 1.11 for the properties of the logarithm);
(c) $\ln x^3$ (see Section 1.11); (d) $\sin x - x$;
(e) $e^x - 1 - x - \tfrac12 x^2$.

2.16 Find the equations of the tangent lines in the following cases.
(a) $y = x^3$ at (1, 1);
(b) $y = x^4 - 2x^2 + 1$ at (2, 9);
(c) $y = \cos x$ at $(\tfrac12\pi, 0)$;
(d) $y = \ln x$ at (e, 1);
(e) $y = \tfrac{1}{\sqrt2}\sin x + \tfrac{1}{\sqrt2}\cos x$ at $(\tfrac14\pi, 1)$;
(f) $y = 3e^x - 4x$ at (0, 3).

2.17 Obtain dy/dx, $d^2y/dx^2$, $d^3y/dx^3$ in the following cases.
(a) $y = x^6$; (b) $y = 3x^2 - 2x + 2$; (c) $y = x^6 - x^2$;
(d) $y = 2\sin x - 3\cos x$; (e) $e^x - 1 - x - \tfrac12 x^2$.

2.18 Show that, if N is a positive whole number, then $(d^N/dx^N)x^N = N!$.

2.19 For the curve $y = x^2(x^2 - 3)$, find the ranges in x for which (a) dy/dx is positive (so that y is
increasing); (b) dy/dx is negative (so that y is decreasing); (c) $d^2y/dx^2$ is positive (so that the slope
is increasing); (d) $d^2y/dx^2$ is negative (so that the slope is decreasing). Deduce the general shape of
the curve from these facts. (Hint: if dy/dx changes sign at some point, then dy/dx must be zero at the
point. But dy/dx does not necessarily change sign where dy/dx = 0.)

2.20 Find the equation of the normal to the parabola $y = ax^2$ at any point $x = x_0$.
Further techniques for differentiation
3
CONTENTS
In the course of Chapter 2, the derivatives were found of the elementary functions
xn (where n is a positive integer), ex, sin x, and so on (see (2.19)). However, we
shall need a lot more, even for quite ordinary applications. For example, we
should surely like to be able to find dy/dx if $y = x^a$ when a is not a positive integer
(for example $y = x^{1/2}$, or $x^{-1}$ (= 1/x)); if $y = e^{ax}$, where a is any constant; if $y = \sin ax$;
and so on. Much more complicated cases frequently arise, like $y = \sin^{1/2}(\ln x)$.
Fortunately we do not have to work out each one separately by means of a
lengthy argument. The linear combination rule (eqn (2.10c)), the product rule
(eqn (3.1)), and the quotient/reciprocal rule (eqn (3.2)), together with the results
in the Table (2.19), enable us to differentiate xn when n is a negative integer; also
rational fractions involving terms listed in (2.19) such as tan x (which equals
sin x/cos x), mixed expressions such as x−3 cos[x/(1 + ex)].
The derivatives of xa, sin(Ax + B) and so on, where a may take any value, and
also the inverse functions such as arcsin x, are obtained by applying the chain
rule or ‘function-of-a-function’ rule (Section 3.3) to results that we already have
(for example those in the Tables (3.4) and (3.5)). The chain rule has many general
applications – it is very important to recognize when it can be used.
By using combinations of these rules as appropriate, any finite expression built
out of the basic functions, however complicated, can be differentiated.
3.1
The derivatives of a product of several functions can be obtained when the deriva-
tives of its individual components are known. Examples of such products are
Product rule
If y(x) = u(x)v(x), then
$$\frac{dy}{dx} = \frac{d}{dx}(uv) = u\frac{dv}{dx} + v\frac{du}{dx}. \tag{3.1}$$
$$= x e^x \cos x + \sin x \,\frac{d}{dx}(x e^x). \tag{i}$$
To evaluate (d/dx)(x ex ), use the product rule again, putting u = x and v = ex. Then
$$\frac{d}{dx}(x e^x) = x\frac{d}{dx}e^x + e^x\frac{d}{dx}x = x e^x + e^x. \tag{ii}$$
Substitute (ii) into (i):
$$\frac{dy}{dx} = x e^x \cos x + (\sin x)(x e^x + e^x)$$
$$= e^x(x \cos x + x \sin x + \sin x).$$
More generally, if y = uvw, the product rule, applied twice as in Example 3.3,
becomes
$$\frac{dy}{dx} = \frac{du}{dx}vw + \frac{dv}{dx}wu + \frac{dw}{dx}uv,$$
which generalizes in an obvious way to products of four or more functions.
A general method of dealing with the product of several terms, which is usually
more convenient, is given in Section 3.7. You are strongly recommended to write
out all the steps completely at first, otherwise mistakes are likely to occur.
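Applying the product rule twice can be mirrored step by step in code. The sketch below is an illustration under stated assumptions: the helper `product_rule`, the function names and the step size of the numerical check are introduced here and are not part of the text. It builds the derivative of $x e^x \sin x$ from two applications of (3.1) and compares it with the simplified result $e^x(x \cos x + x \sin x + \sin x)$ and with a numerical slope.

```python
import math


def product_rule(u, du, v, dv):
    """Return the derivative of the product u*v as a function, using (uv)' = u v' + v u'."""
    return lambda x: u(x) * dv(x) + v(x) * du(x)


def x_exp(x):
    """The product x e^x."""
    return x * math.exp(x)


# First application: u = x, v = e^x.
d_x_exp = product_rule(lambda x: x, lambda x: 1.0, math.exp, math.exp)


def y(x):
    """The triple product (x e^x) sin x."""
    return x_exp(x) * math.sin(x)


# Second application: u = x e^x, v = sin x.
dy = product_rule(x_exp, d_x_exp, math.sin, math.cos)


def closed_form(x):
    """The simplified derivative e^x (x cos x + x sin x + sin x)."""
    return math.exp(x) * (x * math.cos(x) + x * math.sin(x) + math.sin(x))


def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x) using points on both sides of x."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (0.3, 1.0, 2.5):
        print(x, dy(x), closed_form(x), central_difference(y, x))
```

Keeping each application of the rule as a separate step, as here, is the programming counterpart of writing out all the steps by hand.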
Self-test 3.1
Find dy/dx where $y = e^x \sin x$.
3.2
Suppose that
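$$\frac{\delta y}{\delta x} = \left(\frac{u + \delta u}{v + \delta v} - \frac{u}{v}\right)\frac{1}{\delta x} = \frac{uv + v\,\delta u - uv - u\,\delta v}{v(v + \delta v)\,\delta x}$$
$$= \frac{v\,\delta u - u\,\delta v}{v(v + \delta v)\,\delta x} = \frac{1}{v(v + \delta v)}\left(v\frac{\delta u}{\delta x} - u\frac{\delta v}{\delta x}\right).$$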
Let δx → 0; then δy/δx, δu/δx, δv/δx become dy/dx, du/dx, dv/dx, and δv → 0.
Therefore
$$\frac{dy}{dx} = \frac{d}{dx}\left(\frac{u}{v}\right) = \frac{1}{v^2}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right).$$
It is worth noting the special case of the reciprocal of a function. In that case,
u(x) = 1, so du/dx = 0. Finally we have
From (3.2),
$$\frac{dy}{dx} = \frac{1}{v^2}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right) = \frac{1}{\cos^2 x}\left[\cos x \cos x - \sin x(-\sin x)\right]$$
$$= \frac{1}{\cos^2 x}(\cos^2 x + \sin^2 x) = \frac{1}{\cos^2 x} = \sec^2 x.$$
(Remember that cos2A + sin2A = 1.)
(a) Put v = x into the reciprocal rule (3.2b) (or u = 1 and v = x into (3.2a)):
\[ \frac{dy}{dx} = -\frac{1}{v^2} \frac{dv}{dx} = -\frac{1}{x^2}. \]
(b) Put v = x^2 into (3.2b):
\[ \frac{dy}{dx} = -\frac{1}{x^4}\, 2x = -\frac{2}{x^3}. \]
Self-test 3.2
Find dy/dx where y = (ln x)/(1 + x^2).
3.3
An example of this is
y = cos(x^3),
Example 3.7 (a) You are given that (d/dx) e^x = e^x (see the table, (2.19)). Deduce
that (d/dx) e^{ax} = a e^{ax}, where a is any constant. (b) Find the derivative of e^{−x}.
(c) Use this result to obtain the derivatives of sinh x and cosh x (see (1.26) for
the definitions of these functions).
(a) Rewrite y = e^{ax} in the form
y = e^u, where u = ax.
FURTHER TECHNIQUES FOR DIFFERENTIATION
3.4
Put y = 1/u, where u = x^2 + 1. Then
Example 3.11 Find du/dt when u = a cos k(x − ct) where a, k, c, and x are
constant, t and u being the only variables.
We should not use u for the intermediate variable in the chain rule (3.3), because it is
already in use (as the name of the dependent variable). Instead of u, use an uncommitted
letter such as w as the intermediate variable, putting
u = a cos w, where w = kx − kct.
The chain rule takes the form
\[ \frac{du}{dt} = \frac{du}{dw} \frac{dw}{dt}, \]
in which
\[ \frac{du}{dw} = -a \sin w \quad \text{and} \quad \frac{dw}{dt} = -kc. \]
Therefore
\[ \frac{du}{dt} = (-a \sin w)(-kc) = akc \sin k(x - ct). \]
Self-test 3.3
Find dy/dx where y = (1 + 12e12x)12.
Derivative of x^n
If y = x^n, where n may take any value whatever, then
\[ \frac{dy}{dx} = n x^{n-1}. \tag{3.4} \]
To prove (3.4), we use the chain rule (3.3). Note that, for all x > 0, x = e^{ln x} (see (1.21)),
so that
\[ y = x^n = (e^{\ln x})^n = e^{n \ln x}. \]
To use the chain rule, we put this in the form
\[ y = e^u, \quad \text{where } u = n \ln x, \]
so that
\[ \frac{dy}{du} = e^u \quad \text{and} \quad \frac{du}{dx} = \frac{n}{x}. \]
Then
\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = e^u \frac{n}{x} = x^n \frac{n}{x} = n x^{n-1} \]
(where we used e^u = y = x^n again).
(a) Here n = 3/2 in (3.4), so
\[ \frac{d}{dx}(x^{3/2}) = \tfrac{3}{2} x^{1/2}. \]
(b) This may be written y = x^{−3/2}, so n = −3/2 in (3.4), and
\[ \frac{d}{dx}(x^{-3/2}) = -\tfrac{3}{2} x^{-5/2}. \]
(c) y = x^{−1/2}, so
\[ \frac{dy}{dx} = -\tfrac{1}{2} x^{-3/2}. \]
(d) Write y = (2x^{1/3} + x)^{−1}. We can use the chain rule: put y = u^{−1}, where
u = 2x^{1/3} + x. Then
\[ \frac{dy}{du} = -u^{-2} \ \text{(by (3.4))}, \qquad \frac{du}{dx} = \tfrac{2}{3} x^{-2/3} + 1 \ \text{(by (3.4))}. \]
Therefore, by the chain rule (3.3),
\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = (-u^{-2})\left(\tfrac{2}{3} x^{-2/3} + 1\right) = -\frac{\tfrac{2}{3} x^{-2/3} + 1}{(2x^{1/3} + x)^2}. \]
3.5 Functions of ax + b
A frequently occurring application of the chain rule (3.3) is in connection with
functions like e^{ax+b}, sin(ax + b), (ax + b)^n, and in general f(ax + b). The spirit of the
chain rule is to say: ‘If the functions were e^x, sin x, x^n, f(x), then they would be
easy. Therefore, try the chain rule with u = ax + b.’
Suppose that, in general, we want to differentiate y when
y = f(ax + b),
and that we know how to differentiate f(x). Write
u = ax + b, y = f(u).
Then the chain rule gives
Function        Derivative
e^{ax}          a e^{ax}
sin ax          a cos ax
cos ax          −a sin ax
a^x (a > 0)     a^x ln a
Self-test 3.4
Find dy/dx if y = a^{kx}, where a and k are constants and a > 0.
The result can be extended in an obvious way to any number of intermediate
variables, but it is seldom that more than two would be needed:
Self-test 3.5
Find dy/dx where y = sin(ex ).
2
3
Logarithmic differentiation
If y = uvw, then
\[ \frac{dy}{dx} = uvw \left( \frac{1}{u} \frac{du}{dx} + \frac{1}{v} \frac{dv}{dx} + \frac{1}{w} \frac{dw}{dx} \right) \tag{3.7} \]
(and so on for any number of terms in the product defining y).
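Example 3.14 Find dy/dx when y = (x^{1/2} sin^2 x)/(x^2 + 1).
Put y = uvw, where
\[ u = x^{1/2}, \quad v = \sin^2 x, \quad w = (x^2 + 1)^{-1}. \]
Then
\[ \ln y = \ln(x^{1/2}) + \ln(\sin^2 x) + \ln(x^2 + 1)^{-1}. \]
Differentiating with respect to x and multiplying through by y gives
\[ \frac{dy}{dx} = \frac{x^{1/2} \sin^2 x}{x^2 + 1} \left( \frac{1}{2x} + \frac{2 \cos x}{\sin x} - \frac{2x}{x^2 + 1} \right). \]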
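The logarithmic-differentiation result of Example 3.14 can be tested numerically. The sketch below takes y = x^{1/2} sin^2 x/(x^2 + 1), evaluates y times the sum of the three logarithmic derivatives, and compares it with a difference quotient; the test point and helper names are illustrative assumptions.

```python
import math

def y(x):
    # y = x^(1/2) sin^2 x / (x^2 + 1), valid for x > 0
    return math.sqrt(x) * math.sin(x) ** 2 / (x ** 2 + 1)

def dy_dx(x):
    # y multiplied by (1/u)u' + (1/v)v' + (1/w)w', as in (3.7)
    bracket = 1 / (2 * x) + 2 * math.cos(x) / math.sin(x) - 2 * x / (x ** 2 + 1)
    return y(x) * bracket

def central_difference(f, x, h=1e-6):
    # Symmetric difference quotient approximating f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.2  # test point chosen so that sin x is not zero
numeric = central_difference(y, x0)
formula = dy_dx(x0)
print(numeric, formula)
```

Note that the bracketed form is undefined where sin x = 0, even though dy/dx itself exists there; this is a limitation of the logarithmic form, not of the function.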
Self-test 3.6
Using logarithmic differentiation, find dy/dx where y = x^{1/2} cos^2 x ln x.
x explicitly.
Example 3.15 Find a general expression for dy /dx at any point on the curve
given by f(x, y) = x + y + sin x + cos y = 1.
So long as we stay on the curve, f(x, y) does not change when x changes, so
df(x, y)/dx = 0. Therefore
\[ 1 + \frac{dy}{dx} + \cos x - \sin y \, \frac{dy}{dx} = 0, \]
so finally
\[ \frac{dy}{dx} = \frac{\cos x + 1}{\sin y - 1}. \]
Such a result is not quite so simple as its neatness suggests, because we would
still find it hard to say what values of y are to be associated with a particular value
of x in the new formula: this would in effect involve solving the original equation
for y in terms of x.
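The difficulty described here can be made concrete. In the sketch below, the y belonging to a given x on the curve x + y + sin x + cos y = 1 is found by bisection (an assumed, simple choice of solver; it works because y + cos y never decreases as y increases), and the implicit formula for dy/dx is then compared with a difference quotient taken along the curve.

```python
import math

def solve_y(x, lo=-10.0, hi=10.0):
    # For fixed x the curve requires y + cos y = 1 - x - sin x.
    # The left-hand side is non-decreasing in y, so bisection finds the unique y.
    target = 1.0 - x - math.sin(x)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + math.cos(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def slope_formula(x, y):
    # dy/dx from implicit differentiation
    return (math.cos(x) + 1) / (math.sin(y) - 1)

x0 = 0.5
y0 = solve_y(x0)
h = 1e-5
numeric = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)
formula = slope_formula(x0, y0)
print(y0, numeric, formula)
```

The formula needs both x and y, so a numerical solution for y is unavoidable whenever the slope at a particular x is wanted.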
3
Self-test 3.7
Find dy/dx as a function of x and y for 2y = x − y + 2y3 − 3 ln x. What is the
value of dy/dx at (x, y) = (1, 1)?
\[ \frac{dy}{dx} \frac{dx}{dy} = 1 \tag{3.8} \]
3.10
x = e^y.
Then
[Fig. 3.1: right-angled triangle ABC with AB = 1, BC = tan y, hypotenuse AC = √(1 + tan^2 y), and angle y at A.]
Self-test 3.8
Find dy/dx if y = tanh−1(2x).
\[ \frac{\delta y}{\delta x} = \frac{\delta y}{\delta t} \bigg/ \frac{\delta x}{\delta t}, \]
since δt cancels on the right-hand side. Let δt → 0; then δx → 0, and we have
From (3.8),
\[ \frac{dy}{dx} = \frac{dy}{dt} \bigg/ \frac{dx}{dt} = \frac{t^2}{-2t} = -\tfrac{1}{2} t. \]
This equals −1 when t = 2. The slope of the curve is negative at this point, so the
tangent to the path slopes downwards from left to right as shown in Fig. 3.3. The
actual direction in which the vehicle is moving is, however, from right to left. It is
facing north west as shown.
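[Figure: path of the vehicle, showing north (N), the tangent, and the direction of motion.]
[Fig. 3.4: neighbouring points P and Q on the curve, separated by the arc element δs, with increments δx and δy.]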
From information such as that given in the previous example, the speed of a
moving point can be calculated. Suppose that a point moves so that
x = x(t), y = y(t),
where t represents time. Figure 3.4 shows the effect of changing t to t + δt, where
δt is small: the point moves from P to Q, a short distance δs say along the curve
(δs is called an element of arc-length). Then the average speed over this short time
is given by
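\[ \frac{\delta s}{\delta t} \approx \frac{[(\delta x)^2 + (\delta y)^2]^{1/2}}{\delta t} = \left[ \left( \frac{\delta x}{\delta t} \right)^2 + \left( \frac{\delta y}{\delta t} \right)^2 \right]^{1/2}. \]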
Now let δt → 0. Then δx/δt and δy/δt become dx/dt and dy/dt, and finally we
have the result:
\[ \text{speed} = \frac{ds}{dt} = \left[ \left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2 \right]^{1/2} \ge 0, \tag{3.10} \]
where ds stands for an element of arc-length.
Example 3.19 Find the speed of the vehicle in Example 3.18 when t = 2.
In general
\[ \frac{ds}{dt} = \left[ \left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2 \right]^{1/2} = (4t^2 + t^4)^{1/2}. \]
The speed is therefore 4√2 when t = 2.
3
Self-test 3.9
An ellipse is given parametrically by x = a cos t and y = b sin t. Find dy/dx as
a function of t. Find the points on the ellipse where the slope of the tangent
to the ellipse is (−1).
Problems
3.1 (Product rule, (3.1)). Obtain df(x)/dx for the following f(x):
(a) x e^x; (b) x sin x; (c) x cos x;
(d) e^x sin x; (e) x ln x; (f) x^2 ln x;
(g) e^x ln x; (h) x^2 e^x; (i) sin x cos x;
(j) x^2 x^3 (this is the same as x^5: show that the result is the same for both forms).

3.2 (Quotient and reciprocal rule, (3.2)). Obtain df(x)/dx for the following f(x):
(a) cot x; (b) x/(x + 1); (c) (sin x)/x;
(d) e^x/x; (e) (x^2 − 1)/(x^2 + 1);
(f) (tan x)/x^2; (g) (sin x + cos x)/(sin x − cos x);
(h) sec x (= 1/cos x); (i) cosec x (= 1/sin x);
(j) x/(3x^2 − 2); (k) 1/x(x^3 + 1); (l) 1/ln x;
(m) x^n where n is a negative integer (x^n = 1/x^{−n});
(n) 1/(x + 1); (o) e^{−x} (= 1/e^x);
(p) 1/tan x; (q) x^{−2} ln x.

3.3 Find the first, second, and third derivatives of
(a) 1/(1 − x); (b) x sin x; (c) x/(x − 1);
(d) f(x)g(x), where f and g are any functions.

3.4 (Chain rule, (3.3)). Obtain df(x)/dx for the following f(x). (Set out the calculation
systematically, as in the examples in Section 3.3.)
(a) sin^2 x; (b) cos^2 x; (c) sin x^2; (d) cos x^2;
(e) tan^2 x; (f) tan x^2; (g) cos(1/x);
(h) e^{−x} (compare Problem 3.2(o)); (i) (x + 1)^5;
(j) (x^3 + 1)^4; (k) sin 3x; (l) cos (1/2)x;
(m) tan (1/2)x; (n) e^{−3x}; (o) sin(2x + 1);
(p) cos(3x − 2); (q) tan(1 − 2x); (r) e^{1/x};
3.5 (General powers of x, Section 3.4).
(a) x^{−2}; (b) x^{−1}; (c) x^{1/3}; (d) x^{−1/3}; (e) x^{3/2};

(d) sin^2(3t + 1); (e) e^{−t} cos t; (f) e^{−t} sin t;
(g) e^{−2t} cos 3t; (h) e^{−3t} cos 2t; (i) sin x cos^2 x;
(j) sin^2 x cos x; (k) ((sin x)/x)^2;
(l) x sin^3 x; (m) x cos^3 x.

3.7 Differentiate cos^2 x and sin^2 x, (a) by using the identities
cos^2 A = (1/2)(1 + cos 2A) and sin^2 A = (1/2)(1 − cos 2A), (b) by using the product
rule, (c) by using the chain rule.

3.8 Confirm the correctness of the following statements. The letters A, B, C, D, and n
stand for any constants.
(a) If x = A cos 2t + B sin 2t, then d^2x/dt^2 + 4x = 0.
(b) If x = A cos nt + B sin nt, then d^2x/dt^2 + n^2 x = 0.
(c) If x = A e^{3t} + B e^{−3t}, then d^2x/dt^2 − 9x = 0.
(d) If x = A e^{nt} + B e^{−nt}, then d^2x/dt^2 − n^2 x = 0.
(e) If x = A e^{−t} cos t + B e^{−t} sin t, then d^2x/dt^2 + 2 dx/dt + 2x = 0.
(f) If y = A e^x + B e^{−x} + C cos x + D sin x, then d^4y/dx^4 − y = 0.

3.9 (Chain rule (3.3); or, more easily, the extension (3.6)). Differentiate the following
functions.
(a) e^{cos^2 x}; (b) e^{−cos x^2}; (c) ln(cos x^2); (d) (e^{x^2} − 1)^4.

3.10 (Logarithmic differentiation, Section 3.7, is easiest.) Differentiate the following.
(a) x e^x sin x; (b) t e^t cos t; (c) x^{1/2} e^{2x} sin^{1/2} 3x.

3.11 (Implicit differentiation, Section 3.8). Proceed as in Example 3.15 to obtain
expressions for dy/dx in the following.
(a) Show that if x^2 + y^2 = 4, then dy/dx = −x/y.
testing it with y = ±(4 − x^2)^{1/2}. Interpret the

3.12 The same expression for dy/dx in Problem 3.11a is obtained when the radius is
changed; for example, if x^2 + y^2 = 9, we still get dy/dx = −x/y. Is this paradoxical?
(Notice that even in the general case of f(x, y) = c, a constant, the expression for
dy/dx will not depend on c: think of the difference between the form of the expression
and the values it takes.)

3.13 Find expressions for dy/dx and then d^2y/dx^2 if xy^2 − x^2 y = 1.

3.14 Differentiate the following inverse functions, using the method of Section 3.9.
The results are quite important, and are included in the table of derivatives, Appendix D.
(a) arcsin x; (b) arccos x;
(c) arctan x; (d) sinh^{−1} x;
(e) cosh^{−1} x; (f) tanh^{−1} x.

3.15 (Parametric differentiation, Section 3.10). The curves in the following are in polar
coordinates. Find dy/dx at the point specified.
(a) r = sin (1/2)θ at θ = (1/2)π;
(b) r = 1 + sin 2θ at θ = (1/4)π.

3.16 Obtain dy/dx in terms of t, then re-express it in terms of x, when the path of a
point is given parametrically by the following.
(a) x = t^3, y = t^2; (b) x = 2 cos t, y = 2 sin t.

3.17 The path of a point is given parametrically by x = a cos t, y = b sin t. Show that
the point travels around the ellipse
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]
Express dy/dx in terms of t. Suppose that t represents time. Express the speed as a
function of t.

3.18 Show that (d/dx)(a^x) = a^x ln a when a > 0.
(Hint: write a^x in exponential form with base e.)
4 Applications of differentiation
4.1
\[ f'(u) = \frac{d}{du} f(u) = 2u - 3. \]
Notice that the result (c) is different from the result (b): f′(x^2) is not the same as
(d/dx)f(x^2). In (b) we first find f(x^2) and then differentiate with respect to x; in
(c) we first find f′(u) and then put u = x^2.
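The distinction between f′(x^2) and (d/dx)f(x^2) is easy to see with numbers. The sketch below assumes f(u) = u^2 − 3u, which is one function consistent with f′(u) = 2u − 3 above (the definition of f is not reproduced here, so this choice is an assumption), and evaluates both quantities at an arbitrary point.

```python
def f(u):
    # Assumed form of f, chosen so that f'(u) = 2u - 3
    return u ** 2 - 3 * u

def f_prime(u):
    # Derivative of f with respect to its own argument
    return 2 * u - 3

def d_dx_f_of_x_squared(x):
    # Chain rule: (d/dx) f(x^2) = f'(x^2) * 2x
    return f_prime(x ** 2) * 2 * x

x0 = 1.5
result_c = f_prime(x0 ** 2)          # f'(x^2): differentiate first, then put u = x^2
result_b = d_dx_f_of_x_squared(x0)   # (d/dx) f(x^2): substitute first, then differentiate

# Independent check of result_b by a difference quotient on x -> f(x^2)
h = 1e-6
numeric_b = (f((x0 + h) ** 2) - f((x0 - h) ** 2)) / (2 * h)
print(result_c, result_b, numeric_b)
```

The two results differ by the factor 2x supplied by the chain rule.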
Example 4.2 Express (a) the product rule (3.1); (b) the quotient rule (3.2a);
(c) the chain rule (3.3); in terms of the ‘dash’ notation.
(a) Product rule
\[ \frac{d}{dx}[u(x)v(x)] = u(x)v'(x) + v(x)u'(x), \]
or simply
\[ (uv)' = uv' + vu'. \]
(b) Quotient rule
\[ \left( \frac{u}{v} \right)' = \frac{1}{v^2}(vu' - uv'). \]
(c) Chain rule
\[ \frac{d}{dx} f(u(x)) = f'(u(x))\,u'(x). \]
any terms available. (b) Verify the correctness of (a) in the special case when
f(5x − 3) = sin(5x − 3).
(a) Since the particular function f is not specified, the only thing to be done is to express
(d/dx)f(5x − 3) in terms of f ′, which is also unspecified. Then, from the chain rule (c) in
Example 4.2, with u = 5x − 3,
\[ \frac{d}{dx} f(5x - 3) = 5f'(5x - 3). \]
It is awkward to express the right-hand side without using the dash notation. One
alternative is to write it as
\[ 5\left[ \frac{d}{du} f(u) \right]_{u = 5x - 3}. \]
(b) In this case f(u) = sin u, so
f ′(u) = cos u.
Self-test 4.1
If f(u) = e^u cos(u^2), obtain (a) f′(x); (b) f′(x^2); (c) df(x^2)/dx.
4.2
sign from positive to negative or negative to positive. The derivative of f(x) is zero
(the tangent is horizontal) at such turnover points: for example, it is easy to verify
[Fig. 4.1: (a) a local maximum at the point A, where x = a, f′(a) = 0 and f″(a) < 0; (b) a local minimum at the point B, where x = b, f′(b) = 0 and f″(b) > 0.]
\[ f''(x) = \frac{d^2 y}{dx^2} = \frac{d}{dx}\left( \frac{dy}{dx} \right), \]
and this is negative at x = c. Therefore the slope, dy /dx, or f ′(x), is decreasing
across x = c, and since f ′(x) = 0 at x = c, f ′(x) must be positive on the left of c and
negative on the right. Thus the graph is of the type shown in Fig. 4.1a, and the
point is a local maximum.
If x = c is a point where
f′(c) = 0 and f″(c) > 0,
then f′(x) is increasing, and therefore goes from negative to positive, across x = c.
The point is therefore a minimum, like the point B in Fig. 4.1b.
In the special case when
f′(c) = 0 but f″(c) = 0
there might occur a maximum (as with y = −x^4 at x = 0), or a minimum (as with
y = x^4 at x = 0), or another feature called a stationary point of inflection (as with
y = ±x^3 at x = 0). These cases are illustrated in Fig. 4.2. One way to classify such
a point is to examine directly the sign of dy/dx on both sides of the point.
Fig. 4.2 Cases for which f′(0) = 0 and f″(0) = 0. (a) y = −x^4 (maximum). (b) y = x^4 (minimum). (c) y = x^3 (point of inflection). (d) y = −x^3 (point of inflection).
To summarize:
4.2
The stationary points are where f ′(x) = 0; that is, where
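[Fig. 4.3: graph with x and y axes marked from −2 to 2.]
[Fig. 4.4: circuit containing the constant voltage V, the fixed resistance R, and the variable resistance x.]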
Example 4.5 In the circuit shown in Fig. 4.4, V is a constant voltage and R
and x represent two resistances: R is fixed and x is variable. The rate of heat
generation y in resistance x is equal to I^2 x where I is the current. Show that y
is a maximum when x = R.
Current equals voltage divided by total resistance, so
\[ I = \frac{V}{R + x}. \]
Therefore the rate of heat generation is
\[ y = \frac{V^2 x}{(R + x)^2} = f(x), \]
say. If there is a maximum, it will occur when f′(x) = 0. From the quotient rule (3.2)
\[ f'(x) = \frac{V^2}{(R + x)^4}[(R + x)^2 - x \cdot 2(R + x)] = V^2 \frac{R - x}{(R + x)^3}. \tag{i} \]
This is zero when x = R.
To show that f(x) has a maximum when x = R we may work out the sign of f″(R).
From (i),
\[ f''(x) = \frac{V^2}{(R + x)^6}[(R + x)^3(-1) - (R - x) \cdot 3(R + x)^2] = \frac{V^2}{(R + x)^4}(-4R + 2x). \]
Therefore
f″(R) = −V^2/(8R^3),
which is negative, so x = R corresponds to a maximum of y.
However, it is easier to look instead at the expression (i) for f′(x). When x < R,
we have f′(x) > 0, so f(x) is increasing. When x > R, we have f′(x) < 0, so f(x) is
decreasing. This ensures that a maximum has been obtained without the need to
differentiate again.
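The conclusion of Example 4.5 can be illustrated by tabulating the heat rate on a grid of resistances. The values V = 12 and R = 5 below are arbitrary illustrative choices, as are the function names; the sketch simply confirms that the largest tabulated value occurs at x = R and that f′(x) changes sign there.

```python
V, R = 12.0, 5.0  # illustrative values, not taken from the example

def heat_rate(x):
    # y = V^2 x / (R + x)^2
    return V ** 2 * x / (R + x) ** 2

def heat_rate_derivative(x):
    # Expression (i): f'(x) = V^2 (R - x) / (R + x)^3
    return V ** 2 * (R - x) / (R + x) ** 3

# Scan x = 0.01, 0.02, ..., 20.00 and pick the resistance giving the largest heat rate
xs = [0.01 * k for k in range(1, 2001)]
best_x = max(xs, key=heat_rate)
print(best_x, heat_rate(best_x))
```

At the maximum the heat rate equals V^2/(4R), which follows from putting x = R in the formula for y.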
Example 4.6 x and y are two numbers subject to the restriction that x + y = 1.
Find the maximum possible value of xy.
There are two variables, x and y, but we can reduce the problem to one involving
only x by using the fact that x + y = 1, so that
y = 1 − x. (i)
In that case,
xy = x(1 − x) = x − x2 = f(x),
say. Now f(x) has a stationary point (a maximum, minimum, or point of inflection)
where f ′(x) = 0, that is to say, where
1 − 2x = 0, or x = 1/2.
By (4.2), this value of x delivers a maximum, because f″(x) = −2 (for any value of x)
which is negative. From (i), y = 1/2 when x = 1/2, so the maximum value of xy is 1/4.
Self-test 4.2
Find and classify the stationary points of f(x) = 2x^3 − 9x^2 + 12x + 1.
Example 4.7 Suppose that the values of x to be considered are restricted to lie
between 0 and 1 inclusive: that is, 0 ≤ x ≤ 1. Find the points on this interval at
which x − x^2 takes maximum and minimum values.
In problems of the type illustrated in Example 4.6 this situation can arise
naturally, as in the following example.
[Figures: graphs of f(x) against x on the restricted intervals 0 ≤ x ≤ 1 and −1 ≤ x ≤ 1.]
Example 4.8 Find the maximum and minimum values of x^2 − y^2 on the circle
x^2 + y^2 = 1.
It is evident that the point (x, y) can only be on the circle if x and y both have values
between −1 and 1 inclusive, that is if
−1 ≤ x ≤ 1 and −1 ≤ y ≤ 1. (i)
A restricted interval therefore arises naturally in the problem. On the circle x^2 + y^2 = 1,
we have
y^2 = 1 − x^2, (ii)
so
x^2 − y^2 = 2x^2 − 1 = f(x), (iii)
say. To find the stationary points of f(x) we see that f′(x) = 4x, which is zero when x = 0.
Also f″(0) = 4 > 0, so x = 0 is a local minimum of f(x), whose value is f(0) = −1.
However, we have overlooked something. In Fig. 4.6, we show the graph of
f(x) = 2x^2 − 1, within the permitted interval −1 ≤ x ≤ 1. The local minimum at x = 0
can be seen, but there are also maxima at the end-points x = −1 and x = 1, where f(x)
takes the value +1.
Alternatively, the maxima at x = ±1 can be found by substituting for x instead
of y at the first stage. Put x^2 = 1 − y^2, so that
x^2 − y^2 = 1 − 2y^2 = g(y),
say, and solve g′(y) = 0: we then find a local maximum at y = 0, where x = ±1. However,
we also lose sight of the minima we found before. The subject is discussed again in
Section 28.2.
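On a restricted interval, a safe routine is to compare the function values at the stationary points with those at the end-points. A minimal sketch for f(x) = 2x^2 − 1 on −1 ≤ x ≤ 1 follows; the list `candidates` is assembled by hand from the working above rather than computed.

```python
def f(x):
    # x^2 - y^2 restricted to the circle, expressed in terms of x only
    return 2 * x ** 2 - 1

# Candidates for extreme values: the end-points and the stationary point x = 0
candidates = [-1.0, 0.0, 1.0]
values = {x: f(x) for x in candidates}

min_value = min(values.values())
max_value = max(values.values())
min_points = [x for x in candidates if values[x] == min_value]
max_points = [x for x in candidates if values[x] == max_value]
print(min_points, min_value, max_points, max_value)
```

Solving f′(x) = 0 alone would have reported only the minimum; the end-point values supply the maxima.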
[Fig. 4.7: graph of a function with points A, B, and C at which there is no definite tangent.]
Another possibility is that there may be points at which the graph of y = f(x)
does not have a definite tangent. Then f ′(x) or dy/dx has no meaning at such
points. For example, in Fig. 4.7, there is no tangent at the points A, B, and C. The
points A and B could qualify as local maxima, and C as a local minimum, but at
A and C the graph suddenly changes direction, and at B there is a jump in the
value of f(x). These points cannot be located by solving f ′(x) = 0, because f ′(x)
does not exist at A, B, and C.
For example, the derivative of f(x) = | x | is not defined at x = 0, but the function
clearly has a minimum value at x = 0.
Self-test 4.3
Obtain and classify the stationary points of f(x) = 3x^4 − 8x^3 + 6x^2. Sketch the
graph of y = f(x).
Fig. 4.8 (a) y = 1/x^2. (b) y = −1/x^2. (c) y = −1/(1 + x)^2. (d) y = 1 − 1/(1 + x)^2.
In Fig. 4.8d of Example 4.9 we see that, as x increases, becoming large and
positive, the value of
\[ y = 1 - \frac{1}{(1 + x)^2} \]
gets closer and closer to 1. The same is true when x becomes large and negative.
This is obviously an important feature of the graph. It can be seen to be true by
thinking what happens to y when we put a large value of x into the formula for y
(think of a very large number: x = 1 000 000, rather than x = 10). Then obviously
1/(1 + x)^2 is very small, so y gets very close to 1, and the larger x becomes, the
nearer y is to 1. The same is true when x is large and negative.
We say that, as x increases, the graph approaches the line y = 1, which in gen-
eral terms is called an asymptote of the graph. When x approaches −1, the graph
approaches the vertical line x = −1; this is also called an asymptote. The two
continuous halves of the graph to the left and right of x = −1 are called branches.
Suppose that y = f(x) is to be sketched. A general question to be asked is ‘What
happens to y when x increases towards infinity (or decreases towards minus
infinity)?’ We normally say ‘as x approaches ±∞’, and as usual indicate the
approach by ‘→’:
x → ±∞.
For example, 1/x → 0 when x → −∞. Also
\[ \frac{x - 1}{3x + 2} \to \tfrac{1}{3} \quad \text{when } x \to \infty \ (\text{or } x \to -\infty). \]
To see this, think of the effect of giving x an immense value. Only the terms x and
3x are significant; they are said to dominate the expression, so
\[ \frac{x - 1}{3x + 2} \to \frac{x}{3x} = \tfrac{1}{3}. \]
The limit notation can be used in this context (see Section 2.2). We can write,
for example,
\[ \lim_{x \to \infty} \frac{1 - 2x}{1 + x} = -2. \]
The reasoning is the same as in the earlier case: think of a very large value of x.
Very often the function has no definite limit as x → ∞. For example, limx→ ∞ sin x
does not exist; no definite single number is approached, since sin x simply goes up
and down between ±1 for ever. However, it is quite usual to write, say,
\[ \lim_{x \to \infty} x^2 = \infty, \]
We shall not prove (4.3); but, to convey the feel of it, a table of values is given for
the special case of x^3 e^{−x}:
x             0     1     2     3     4     8     10
x^3 e^{−x}    0     0.37  1.08  1.34  1.17  0.17  0.05
Fairly large values are needed before the function settles down to approach
zero, because x^3 is increasing, and therefore competes with e^{−x} in the early stage.
However, e^{−x} will beat any power of x down to zero eventually. In the following
example, we sketch the graph of the function in the table without using the
calculated values above.
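The table of values of x^3 e^{−x} is easily regenerated, which also makes it simple to extend to larger x. The short sketch below is only an illustration; the function name `g` and the choice of two decimal places are assumptions made here.

```python
import math

def g(x):
    # The special case discussed in the text: x^3 e^(-x)
    return x ** 3 * math.exp(-x)

# Tabulate the function at the same points as the table in the text
for x in [0, 1, 2, 3, 4, 8, 10]:
    print(x, round(g(x), 2))

# Further out, the exponential factor dominates completely
print(g(50))
```

The values rise to a peak at x = 3 (where the derivative x^2 e^{−x}(3 − x) vanishes) before the exponential decay takes over.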
4.4
6. Behaviour as x → −∞. As x → −∞, we have x^3 → −∞ and e^{−x} → ∞ (think, for
example, of x = −1000). Therefore, x^3 e^{−x} → −∞ (very rapidly).
[Fig. 4.9: graph of y = x^3 e^{−x}.]
[Fig. 4.10: graphs, for 0 ≤ x ≤ 2π, of the factors e^{−x/3} and sin 2x, and of y = e^{−x/3} sin 2x together with the broken lines y = ±e^{−x/3}.]
(x is assumed to be in radians.) Split the expression into its two factors, e^{−x/3} and sin 2x.
These are shown in Fig. 4.10a,b. The value of e^{−x/3} drops to about 1/8 at x = 2π. Also,
are estimated by the size of the factor e^{−x/3}, shown as a broken line, which multiplies the
maxima and minima of sin 2x. The new maxima and minima do not occur at exactly
the same points: it is left to the reader to show that the new maxima and minima occur
at values of x which satisfy the equation tan 2x = 6.
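The statement that the stationary points satisfy tan 2x = 6 can be checked numerically. The sketch below takes the first positive solution, x = (1/2) arctan 6, and confirms that the derivative of e^{−x/3} sin 2x vanishes there; it also shows that this point lies slightly to the left of x = π/4, where sin 2x alone has its first maximum. The variable names are illustrative.

```python
import math

def y(x):
    return math.exp(-x / 3) * math.sin(2 * x)

def dy_dx(x):
    # Product rule: e^(-x/3) (2 cos 2x - (1/3) sin 2x)
    return math.exp(-x / 3) * (2 * math.cos(2 * x) - math.sin(2 * x) / 3)

x_star = 0.5 * math.atan(6.0)   # first positive solution of tan 2x = 6
x_sine_max = math.pi / 4        # first maximum of sin 2x alone

print(x_star, x_sine_max)
print(dy_dx(x_star))            # should be zero to rounding error
print(math.exp(-2 * math.pi / 3))  # value of the decaying factor at x = 2*pi, about 1/8
```

The shift is small because the factor e^{−x/3} changes slowly compared with sin 2x.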
Fig. 4.11 (a) y = 1/x^2. (b) y = 1/x.
Look out for the obvious things first. At x = 0, we have y = −1/2. The function is infinite at
x = 2 and x = −1. It does not cross the x axis anywhere.
Now consider the sign of 1/[(x − 2)(x + 1)]. It is positive when x < −1 (try e.g. x = −3).
It is positive when x > 2 (try e.g. x = 3). It is negative when −1 < x < 2, which is linked
with the facts that the graph does not cross the x axis and, as we already know, that y is
negative when x = 0.
We know now that
y → ∞ as x → −1 from the left;
y → −∞ as x → −1 from the right;
y → −∞ as x → 2 from the left;
y → ∞ as x → 2 from the right;
so Fig. 4.12 is emerging.
[Fig. 4.12: emerging sketch of y = 1/[(x − 2)(x + 1)], with vertical asymptotes at x = −1 and x = 2 and the value y = −1/2 at x = 0.]
We now locate precisely the obvious maximum between x = −1 and 2, and make sure
there are no other stationary points. By the reciprocal rule (3.2b), we have
\[ \frac{dy}{dx} = -\frac{1}{(x - 2)^2 (x + 1)^2} \frac{d}{dx}[(x - 2)(x + 1)] = -\frac{2x - 1}{(x - 2)^2 (x + 1)^2}. \]
We now return to asymptotes and show that there can be asymptotes that slope.
Consider the function
\[ y = \frac{x^2 - 1}{2x + 1}, \]
when x is large, positive, or negative. The term x^2 − 1 is dominated by x^2, meaning
that the part −1 is negligible compared with x^2 when x is large. Likewise the
dominant term in 2x + 1 is 2x. It is therefore obvious that
y → ±∞ when x → ±∞.
However, we can do much better than this, because it can be seen by polynomial
division that
\[ y = \frac{x^2 - 1}{2x + 1} = \tfrac{1}{2} x - \tfrac{1}{4} - \frac{3/4}{2x + 1}. \]
The last term tends to zero as x → ±∞, so y is close to (1/2)x − 1/4
when x is large. As in the earlier instances we have seen, the line y = (1/2)x − 1/4 is said
to be an asymptote of the original graph. The notation
y ∼ (1/2)x − 1/4 when x → ±∞
is sometimes used, meaning that the curve approaches the line y = (1/2)x − 1/4 when x
is large. The curve is sketched in Example 4.13.
In the same way, a function may be asymptotic to a curve as x → ±∞. For
example, if
\[ y = \frac{1}{x} - \frac{1}{x^3} \sin x, \]
then
\[ y \sim \frac{1}{x} \quad \text{when } x \to \pm\infty. \]
[Fig. 4.13: graph of y = (x^2 − 1)/(2x + 1), with the asymptotes y = (1/2)x − 1/4 and x = −1/2.]
\[ \frac{dy}{dx} = \frac{2x(2x + 1) - 2(x^2 - 1)}{(2x + 1)^2} = 2\,\frac{x^2 + x + 1}{(2x + 1)^2}. \]
This is never zero, because the equation x2 + x + 1 = 0 has no real solutions. Therefore,
there are no stationary points.
4.5
of a single variable.
Example 4.14 Let y = x + 1/x. Estimate the change δy in y when x changes from
x = 2 to x = 1.8. Compare the estimate with the exact value of δy.
Put
\[ y = x + \frac{1}{x} = f(x). \]
Then
\[ \frac{dy}{dx} \ \text{or} \ f'(x) = 1 - \frac{1}{x^2}, \]
so that f ′(2) = 0.75. Here δx = 1.8 − 2.0 = − 0.2; so, by (4.4a),
δy ≈ 0.75 × (−0.2) = − 0.15.
The exact value is given by δy = (1.8 + 1/1.8) − (2.0 + 1/2.0) = −0.1444… .
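The comparison made in Example 4.14 takes only a few lines to reproduce, and the same pattern can be reused for other functions. In this sketch the names `estimate` and `exact` are illustrative; the estimate is f′(x) δx, as in the incremental approximation.

```python
def f(x):
    return x + 1 / x

def f_prime(x):
    return 1 - 1 / x ** 2

x0, dx = 2.0, -0.2           # x changes from 2 to 1.8

estimate = f_prime(x0) * dx  # incremental approximation: f'(x) * dx
exact = f(x0 + dx) - f(x0)   # exact change in y
print(estimate, exact)
```

Reducing the size of dx brings the two values closer together, which is what the tangent-line picture suggests.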
centimetres rather than metres, the numbers are even larger; then δr becomes
10 (cm) and δV is about 5 × 10^6 (cm^3). On the other hand, if the units had been
kilometres then δr and δV would have looked very small indeed. But nothing at
all is changed except the units of measurement. We still get only a 5% error in the
estimate. The reason for this is that the ratio
Estimated δV/Exact δV = [f ′(r) δr]/[f(r + δr) − f(r)]
is dimensionless; that is to say it is unaffected by the choice of units (see Appendix I).
There is no easy way to predict when the method will work well: geometrically
speaking, we are content to guess that the graph sticks sufficiently closely to its
tangent line at a within the interval a ± δx.
Example 4.16 The cosine rule for a triangle ABC is c2 = a2 + b2 − 2ab cos C. In
a triangle for which a = 3 and b = 4, estimate the change in c when C increases
The quantity δC must be measured in radians, because radian measure was assumed
in obtaining the derivatives of the sin and cos functions. So we put
C = 60° = (1/3)π radians, δC = 5π/180 = 0.087 radians.
We know that cos C = 1/2 and sin C = (1/2)√3, so
f′((1/3)π) = 6√3/√13.
Therefore, by the incremental approximation (4.4),
δc ≈ (6√3/√13) × 0.087 = 0.25.
The exact change is δc = 0.2489… .
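The figures quoted in Example 4.16 can be reproduced directly from the cosine rule. The sketch below assumes that C increases from 60° to 65° (inferred from δC = 5π/180 radians) and uses dc/dC = ab sin C / c, obtained by differentiating c^2 = a^2 + b^2 − 2ab cos C with respect to C; the function names are illustrative.

```python
import math

a, b = 3.0, 4.0

def side_c(C):
    # Cosine rule, with the angle C in radians
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(C))

def dc_dC(C):
    # From 2c dc/dC = 2ab sin C
    return a * b * math.sin(C) / side_c(C)

C0 = math.pi / 3        # 60 degrees
dC = math.radians(5)    # assumed increase of 5 degrees, in radians

estimate = dc_dC(C0) * dC
exact = side_c(C0 + dC) - side_c(C0)
print(estimate, exact)
```

Working in degrees instead of radians would give an estimate that is wrong by a factor of 180/π, which is why the conversion step matters.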
Self-test 4.4
The surface area A of a sphere of radius r is given by A = 4πr^2. Estimate the
change in area if the radius increases from 2.0 to 2.1. Compare this value
with the exact change.
4.6
For such cases, there are many methods for obtaining numerical solutions, which
are applicable no matter how complicated the equation is. We describe one of
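[Figures: (a) graph of y = x^4 + x^3 − 1; (b) construction for Newton's method, showing successive approximations x0, x1, x2 (points A0, A1, A2 on the x axis and B0, B1, B2 on the curve) approaching the solution at C.]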
The following examples work through the equations with which we opened
the section.
five-decimal accuracy.
We have
Example 4.18 The equation e−x = x has a solution near to x = 0.5. Find the
solution accurately to five decimal places.
We have
x_0 = 0.5, f(x) = e^{−x} − x, f′(x) = −e^{−x} − 1.
From (4.4),
\[ x_{n+1} = x_n - \frac{e^{-x_n} - x_n}{-e^{-x_n} - 1} = \frac{x_n + 1}{e^{x_n} + 1} \]
(the last step for simplicity of calculation). We obtain the following sequence:
x_0        x_1        x_2        x_3
0.500 00   0.566 31   0.567 14   0.567 14
The repetitive process described is easy to program for a computer for individual
cases as they arise, and then the complexity of the equation is of no
importance. The same program can be adapted to scan a range of x in order to
get a provisional idea of where the solutions are to be found. A simple program
combined for safety’s sake with inspection of the whole sequence of values output
would satisfy most requirements.
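A simple program of the kind described might look like the following sketch. It applies the Newton step x − f(x)/f′(x) repeatedly and returns the whole sequence of values so that it can be inspected; the function name `newton`, the tolerance, and the step limit are illustrative choices rather than anything prescribed in the text.

```python
import math

def newton(f, f_prime, x0, tol=1e-10, max_steps=50):
    # Repeat the Newton step until two successive values agree to within tol
    x = x0
    history = [x]
    for _ in range(max_steps):
        slope = f_prime(x)
        if slope == 0:
            raise ZeroDivisionError("horizontal tangent: Newton step undefined")
        x_new = x - f(x) / slope
        history.append(x_new)
        if abs(x_new - x) < tol:
            return x_new, history
        x = x_new
    raise RuntimeError("no convergence within max_steps")

# Example 4.18: solve e^(-x) = x starting from x0 = 0.5
root, history = newton(lambda x: math.exp(-x) - x,
                       lambda x: -math.exp(-x) - 1,
                       0.5)
for n, value in enumerate(history):
    print(n, round(value, 5))
```

Printing the whole sequence, rather than only the final value, is the safety check recommended above: a sequence that jumps about or drifts away signals that the starting value was poorly chosen.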
However, to write a program which will automatically, without intervention,
find all the solutions for any function f(x) that might be presented to it is a very
different matter. For example, we would have to find means to be absolutely sure
that none of the possible tangents would by chance carry us an irrecoverable
distance away from the solution we are seeking (see e.g. Problem 4.15) by design-
ing a way of automatically recognizing and rectifying the situation if it occurs.
Self-test 4.5
The equation e^x − 3 cos x = 0 has a solution near x = 1. Using Newton’s
method starting with x0 = 1, find the solution to four significant figures. How
many steps are required?
A proof of the binomial theorem has been given in Section 1.18 (eqn (1.46)),
obtained by counting combinations of terms. There follows a simpler proof,
obtained by repeated differentiation.
(4.6)
Proof. Consider firstly the case (1 + x)^n, where n is any positive integer. It is clear
that this can be expressed as a polynomial of degree n. Suppose the coefficients
are c_0, c_1, c_2, … , c_n (putting x = 0 shows at once that c_0 = 1). Then
\[ (1 + x)^n \equiv 1 + c_1 x + c_2 x^2 + \cdots + c_n x^n. \tag{4.7} \]
We have written ‘≡’ in place of ‘=’ to stress that this is an identity: that is, it is true
for all values of x (see Section 1.1). Differentiate both sides once, twice, … , n times:
\[ n(1 + x)^{n-1} \equiv c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \cdots + n c_n x^{n-1}, \]
\[ n(n - 1)(1 + x)^{n-2} \equiv 2c_2 + 3 \cdot 2 c_3 x + 4 \cdot 3 c_4 x^2 + \cdots + n(n - 1) c_n x^{n-2}, \]
\[ n(n - 1)(n - 2)(1 + x)^{n-3} \equiv 3 \cdot 2 c_3 + 4 \cdot 3 \cdot 2 c_4 x + \cdots + n(n - 1)(n - 2) c_n x^{n-3} \]
up to the nth derivative, which is
\[ n(n - 1)(n - 2) \cdots 1 \equiv n(n - 1)(n - 2) \cdots 1 \cdot c_n. \]
Since these are all identities, they are true when x = 0, and for this value they
become
n = c_1, n(n − 1) = 2c_2, n(n − 1)(n − 2) = 3·2c_3, …,
and in general, for the rth derivative, with 1 ≤ r ≤ n,
\[ n(n - 1)(n - 2) \cdots (n - r + 1) = r!\,c_r, \tag{4.8} \]
where the term on the left contains r factors. This immediately gives the result (4.6b).
Equation (4.6a) is derived from (4.6b) by writing
(a + b)^n = a^n[1 + (b/a)]^n.
We now know the coefficients c_r, so we may put x = b/a into (4.7):
\[ (a + b)^n = a^n[1 + c_1(b/a) + c_2(b/a)^2 + \cdots + c_n(b/a)^n] \]
\[ = a^n + c_1 a^{n-1} b + c_2 a^{n-2} b^2 + \cdots + c_n b^n, \]
as required.
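Formula (4.8) can be checked for a particular n by computing each c_r from the product n(n − 1)…(n − r + 1) divided by r!, and then confirming that the resulting polynomial reproduces (1 + x)^n. In the sketch below the choices n = 6 and x = 3 are arbitrary, and the comparison with Python's built-in `math.comb` is included only as an independent check.

```python
import math

def coefficient(n, r):
    # c_r = n(n-1)...(n-r+1) / r!, from (4.8); the empty product gives c_0 = 1
    numerator = 1
    for k in range(r):
        numerator *= n - k
    return numerator // math.factorial(r)

n = 6
coeffs = [coefficient(n, r) for r in range(n + 1)]
print(coeffs)

# Evaluate the polynomial 1 + c_1 x + ... + c_n x^n at an integer x
x = 3
polynomial_value = sum(c * x ** r for r, c in enumerate(coeffs))
print(polynomial_value, (1 + x) ** n)
```

Integer division is exact here because a product of r consecutive integers is always divisible by r!.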
Problems
4.1 (See Section 4.1 on the ‘dash’ notation.) 4.7 Solve Problem 4.6 for the case when the lid is
The function f is defined by f(u) = u2. Obtain not included in the restriction.
the following.
d 2 4.8 Sketch the graphs of the following functions.
(a) f ′(t); (b) f ′(t 2); (c) f(t ); (a) 1/(x2 + 1) (this is an even function: see (1.12)).
dt
e−x . (c) x/(x − 1). (d) x e−x.
2
d (b)
(d) f ′(t –); (e) f(t –); (f) f ″(t –).
1 1 1
2 2 2
(e) x2 e−x. (f) x3 e−x. (g) e2x − 4 ex.
dt
(h) (ln x) /x for x 0 ((ln x)/x → 0 when x → ∞;
this can be proved by putting x = eu and
4.2 (See Section 4.2.) Find the stationary points of
letting u → ∞).
the following functions and classify them as maxima,
(i) [ln(−x)]/x for x 0 (compare (h)).
minima, or points of inflection.
( j) x ln x − x for x 0 (x ln x → 0 when x → 0;
(a) x2 − x; (b) x2 − 2x − 3; (c) x ln x (x 0); this can be seen by writing x = e−u and letting
(d) x e ; −x
(e) 1/(x2 + 1); (f ) x2 − 3x + 2; u → ∞).
(g) e + e ;
x −x
(h) x + 4x + 2; (i) x − x3;
2
(k) sin 1/x (Start by finding where it crosses the
( j) x (x − 1); (k) sin x − cos x (in 0 x 2π);
2 axis, using the fact that sin u = 0 when u = 0,
(l) sin x cos x (−π x π); (m) e−x sin x; ±π, ± 2π, … .)
− –x
(n) e sin 2x (see Example 4.11); (o) x − cos x;
1
3
(l) (x2 − 1)2 (This is an even function: see (1.12).)
(m) x(x2 − 1)2 (This is an odd function: see (1.12).)
(p) 2ex − --12 e2x; (q) x2 e−x; (r) (ln x)/x (x 0);
(n) (sin x) /x (You will not be able to find the exact
(s) (1 − x)3; (u) e−x ;
2
(t) sin3x; positions of the maxima and minima; be
x −x
(w) x + x ; (x) x3 e−x.
−1
2
(v) e ; content to indicate the trend. It is an even
function: see (1.12). For the value approached
4.3 Let y = f(u(x)). Use two successive applications at x = 0, see (2.13).)
of the chain rule in the form of Example 4.2c to
show that 4.9 Sketch the graphs of the following functions.
dy2 (a) 1 /(x2 − 1) (Hint: write x2 − 1 = (x + 1)(x − 1),
= f ″(u(x))[u′(x)]2 + f ′(u(x))u″(x). and then follow Example 4.12; alternatively,
dx2
sketch y = x2 − 1 and imagine taking its
Show that if f ′(u) is always greater than zero, reciprocal.)
or always less than zero, then f(u(x)) and u(x) (b) x /(x2 − 1). (c) 1/x(x − 2).
have the same stationary points. Consider, for (d) x3/(1 − x) (Hint: see the note on curved
example, Problem 4.2v in this connection, asymptotes following Example 4.12.)
with f(u) = eu and u(x) = x2 − x: it becomes (e) (x + 2)/(x − 1) (See the hint in (d).)
rather obvious. (f) 1 /(x + 1) + 1 /(x + 2).
4.4 A rectangular piece of ground is to be 4.10 (See Section 4.5.) Find the approximate value
marked out, which must have a given area A. of the change δy in y due to a small change δx in x
Find the dimensions of the plot which requires using the incremental approximation (4.4) in the
the minimum length of perimeter fence. (This is a following cases. Compare the approximate and
‘restricted’ problem, like Example 4.6. Call the exact values of δy.
sides x and y.) (a) y = x3 when x = 2 and δx = 0.1;
(b) y = x sin x when x = 12 π and δx = −0.2;
4.5 A tunnel cross-section is to have the shape (c) y = cos x when x = 14 π and δx = 0.1;
of a rectangle surmounted by a semicircular roof. (d) y = (1 + x)/(1 − x) when x = 2 and δx = − 0.2;
The total cross-sectional area must be A, but the (e) y = tan x when x = 14 π and δx = 0.1;
perimeter minimized to save building costs. Find (f) y = 1 /(1 − x2) when x = 0.5 and δx = ±0.1.
its dimensions.
4.11 (a) If the focal length of a lens is f, and a
4.6 A circular-cylindrical oil drum is required to viewed object is at distance u, then the image is at
have a given surface area (including its lid and distance v where v = uf/(u − f ). Let f = 0.75 (m).
base). Find the proportions of the design which Find approximately the change in v if u changes
contain the greatest volume. from 1.25 to 1.30 (m).
(b) In a Wheatstone bridge circuit, the out-of-balance voltage v is given by
where $s = \tfrac{1}{2}(a + b + c)$. Find an approximate expression for δA in terms of δc. (Hint: use logarithmic differentiation to shorten the
4.15 (a) Supposing y = f(x) to have a continuous graph, illustrate graphically that the following
4.16 (Computational). (a) Suppose that an equation f(x) = 0 is known to have exactly one solution in a particular finite interval $a \le x \le b$.
\[ R = \frac{[1 + f'(x_0)^2]^{3/2}}{f''(x_0)}. \]
R is known as the radius of curvature of the curve (see also Section 10.1). Find the radius of curvature of the parabola $y = x^2$ at every point on the curve.

\[ (fg)^{(n)} = f^{(n)}g + {}^nC_1 f^{(n-1)}g^{(1)} + {}^nC_2 f^{(n-2)}g^{(2)} + \cdots + {}^nC_n f g^{(n)}, \]
where ${}^nC_r$ is the rth binomial coefficient, given by $n!/[r!(n - r)!]$. (Hint: try writing out the first three derivatives at full length: notice how the repetitions of terms in $f^{(n-r)}g^{(r)}$ combine to produce the coefficients.)
5 Taylor series and approximations
We extend the idea of infinite series introduced in Section 1.16 for the special case
of infinite geometric series. The so-called Taylor series expansion is a type of
infinite series. In the simplest case, a given function f may be expressible in the
form
\[ f(x) = c_0 + c_1 x + c_2 x^2 + \cdots, \]
whose terms involve successive positive integer powers of x. In general such series
are called power series. A power series holds good (that is, truly represents the
function) on some interval of validity $-r < x < r$, where $r > 0$ is called the radius
of convergence (which may be infinite). Within this range two or three terms may
be adequate to provide a good approximation to the function if |x| is small enough.
For example, if $|x| < 0.5$ radians only the first two non-zero terms, $x - \tfrac{1}{6}x^3$, of the
series for sin x (eqn (5.4c)) are required for 0.05% accuracy.
We first obtain the formula for the coefficients of the Taylor series of a general
function, and work out the series for the standard elementary functions $e^x$,
sin x, and so on (see Table (5.4)). For composite functions such as tan x, arcsin x,
or $e^{-x}\sin x$, substitution in the general formula can become complicated, so we
also show how the results in the table can be used directly to obtain any fixed
number of terms in such cases.
We shall use yet another standard notation for derivatives in this chapter. Since we
shall have to keep track of derivatives of high orders we modify the ‘dash’ notation
of (4.1) to provide a brief form: $f^{(n)}(x)$ denotes the nth derivative of f(x).
Thus if $f(x) = x^3$, then $f^{(1)}(x) = 3x^2$, $f^{(2)}(x) = 6x$, and $f^{(3)}(x) = 6$. As with the dash
notation, we encounter such forms as $f^{(2)}(u) = 6u$, $f^{(2)}(0) = 0$, and $f^{(2)}(x - c) = 6(x - c)$.
Fig. 5.1 [Three panels comparing y = 1/(1 − x) with the approximations y = 1, y = 1 + x, and y = 1 + x + x2.]
We need a way to continue improving the approximation, P(x) say, to a further
stage and beyond. At present we have reached the tangent approximation $P(x) = 1 + x$,
which was chosen so that $P(0) = f(0)$ and $P^{(1)}(0) = f^{(1)}(0)$. To obtain the next
approximation choose a P(x) which also matches the second derivative at x = 0:
\[ P(0) = f(0), \quad P^{(1)}(0) = f^{(1)}(0), \quad P^{(2)}(0) = f^{(2)}(0). \]
This involves adding a term in $x^2$, and we can choose its coefficient so that the
extra condition is satisfied without disturbing the two terms we have already
found. Continuing with the example,
\[ f^{(2)}(x) = \frac{2}{(1 - x)^3}, \quad\text{so}\quad f^{(2)}(0) = 2. \]
It is easy to check that $P(x) = 1 + x + x^2$ satisfies the three conditions. Therefore
\[ \frac{1}{1 - x} \approx 1 + x + x^2 \]
is an improved approximation. This represents the parabolic curve shown in
Fig. 5.1c. Notice that the series which is emerging is the geometric series which
Derivatives of a polynomial in x at x = 0
Suppose that
\[ P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_N x^N \]
is a polynomial of degree N. Then
\[ P(0) = a_0, \quad P^{(1)}(0) = a_1, \quad P^{(2)}(0) = 2!\,a_2, \]
and in general
\[ P^{(n)}(0) = n!\,a_n \quad\text{for } n = 1, 2, 3, \ldots, N. \tag{5.2} \]
\[ a_2 = \frac{1}{2!}P^{(2)}(0) = \frac{1}{2!}f^{(2)}(0), \qquad a_3 = \frac{1}{3!}P^{(3)}(0) = \frac{1}{3!}f^{(3)}(0), \]
and so on. By writing the coefficients $a_n$ in terms of the known values $f^{(n)}(0)$ we
obtain the Taylor polynomial approximation:
x      −4       −3       −2       −1       −0.5     0    0.5      1        2        3        4
e^x    0.0183   0.0498   0.1353   0.3679   0.6065   1    1.6487   2.7183   7.3891   20.086   54.598
P(x)   −3.533   −0.6500  0.0667   0.3666   0.6065   1    1.6487   2.7167   7.2667   18.400   42.867
The approximating polynomial P(x) clings to the true values for a considerable range
around the origin (see Fig. 5.2).
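The tabulated values of P(x) are consistent with the fifth-degree Taylor polynomial of $e^x$ about the origin; that identification is an inference from the numbers, since the defining formula has not survived here. Under that assumption, a minimal sketch for regenerating the table (the function names are illustrative only) is:

```python
import math

def taylor_exp(x, degree=5):
    """Taylor polynomial of e**x about 0: sum of x**n / n! for n = 0..degree."""
    return sum(x**n / math.factorial(n) for n in range(degree + 1))

def table(points, degree=5):
    """Return (x, e**x, P(x)) triples for the given sample points."""
    return [(x, math.exp(x), taylor_exp(x, degree)) for x in points]

if __name__ == "__main__":
    for x, exact, approx in table([-4, -3, -2, -1, -0.5, 0, 0.5, 1, 2, 3, 4]):
        print(f"{x:5.1f}  {exact:10.4f}  {approx:10.4f}")
```

Changing `degree` shows how the range over which P(x) clings to $e^x$ widens as more terms are kept.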
Fig. 5.2 [Graphs of y = ex and y = P(x).]
Example 5.2 (a) Obtain the Taylor polynomial approximation of any degree
N for the function $1/(1 - x)$ near x = 0. (b) Obtain an expression for the error
in the approximation.
(a) Putting $1/(1 - x) = f(x)$, the sequence of derivatives of f(x) is
\[ f^{(1)}(x) = \frac{1}{(1 - x)^2}, \quad f^{(2)}(x) = \frac{2 \cdot 1}{(1 - x)^3}, \quad f^{(3)}(x) = \frac{3 \cdot 2 \cdot 1}{(1 - x)^4}, \]
and in general
\[ f^{(n)}(x) = \frac{n!}{(1 - x)^{n+1}}. \]
Therefore, referring to (5.3), the Taylor polynomial of degree N is
\[ 1 + x + x^2 + x^3 + \cdots + x^N. \]
(b) The error in an estimation using this approximation is equal to
\[ P(x) - f(x) = 1 + x + \cdots + x^N - \frac{1}{1 - x} = \frac{(1 - x)(1 + x + x^2 + \cdots + x^N) - 1}{1 - x} = \frac{-x^{N+1}}{1 - x}. \]
You should experiment with this expression using various values of N and x. (i) If x is
very small, the error involved is very small even if N is only 2 or 3. (ii) If we take any
fixed value of x in the range $-1 < x < 1$, the error will approach zero when we take
approximations of higher and higher degree (because, when $-1 < x < 1$, $x^{N+1}$
approaches zero as N increases: try this numerically with, say, x = 0.9). (iii) The
approximation fails altogether if $x > 1$ or $x < -1$. The error will be large, and to
increase N will make it still larger because $|x^{N+1}|$ increases when N increases.
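The suggested experiment can be sketched as follows. The function names are illustrative; the code simply compares the directly computed error P(x) − f(x) with the closed form $-x^{N+1}/(1 - x)$ obtained in (b).

```python
def geometric_partial_sum(x, N):
    """P(x) = 1 + x + x**2 + ... + x**N."""
    return sum(x**n for n in range(N + 1))

def error(x, N):
    """P(x) - f(x) for f(x) = 1/(1 - x), computed directly."""
    return geometric_partial_sum(x, N) - 1.0 / (1.0 - x)

def error_formula(x, N):
    """Closed form of the same error: -x**(N+1) / (1 - x)."""
    return -x**(N + 1) / (1.0 - x)

if __name__ == "__main__":
    for x in (0.1, 0.9, 1.5):
        for N in (2, 10, 50):
            print(f"x={x:4.1f} N={N:3d} error={error(x, N): .3e}")
```

For x = 0.9 the printed error shrinks as N grows, while for x = 1.5 it grows without bound, as described in (ii) and (iii).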
Self-test 5.1
Obtain a fifth-degree polynomial P(x) which approximates $e^{2x}$ for small |x|.
Find the difference $[e^{2x} - P(x)]$ at x = 1.
carry out we never reach the end. However, this does not mean that the infinite
series does not add up to a definite number, only that we cannot reach it exactly
over, it seems obvious, for example, that the values of a function and its derivatives
at the origin only cannot possibly predict values elsewhere if we allow functions
to be completely arbitrary at other points.
For example, the graph of f(x) where $f(x) = e^{-1/x^2}$ for x ≠ 0 and f(0) = 0, is perfectly
smooth everywhere. But f(x) and all its derivatives $f^{(1)}(x), f^{(2)}(x), \ldots,$ are zero when
x = 0. Therefore the coefficients in the approximating polynomials (5.3) are all
zero, and the only point at which the Taylor series gives the correct value is the
origin, where the value is zero.
However, ordinary functions do follow the simple pattern illustrated by the
case of f(x) = 1 /(1 − x) in Example 5.2. Each function has an individual range of
values of x, called its interval of validity, in which the Taylor series converges to
the exact value of f(x). Elsewhere, the series must not be used for approximation.
The following table (5.4) displays the infinite Taylor series about the origin for
several important functions, together with their ranges of validity. You should
confirm the coefficients, as in Examples 5.1 and 5.2.
\[ \ln(1 + x) = x - \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 - \cdots. \tag{5.4e} \]
Example 5.3 Find how many terms of the Taylor series for sin x are needed to
obtain three-decimal accuracy over the range $-1 \le x \le 1$ (in radians).
The intuitive requirement is that we should stop at the point where we can see that taking
further terms is not likely to affect the third decimal place. The magnitude (modulus) of
the terms in (5.4c) increases when the magnitude of x increases, so it should be sufficient to
provide an approximation good for the largest value, x = 1. The magnitudes of successive
terms when x = 1 are equal to
\[ 1, \quad 0.1\dot{6}, \quad 0.008\dot{3}, \quad 0.0002, \quad 2.8 \times 10^{-6}, \ \ldots, \]
using the recurring decimal notation (Section 1.1). It is therefore enough to retain three
terms of the series; that is to say we should retain powers of x up to $x^5$. To three
decimals, then,
\[ \sin x \approx x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 \quad\text{for } -1 \le x \le 1. \]
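The term-counting argument can be checked numerically. The sketch below lists the term magnitudes at x = 1 and measures the largest error of the two-term and three-term polynomials on a grid over the range; the grid size and function names are arbitrary choices made for illustration.

```python
import math

def sin_term_magnitudes(x, count):
    """Magnitudes of the first `count` nonzero terms of the sine series at x."""
    return [abs(x)**(2*k + 1) / math.factorial(2*k + 1) for k in range(count)]

def sin_poly(x, terms):
    """Sum of the first `terms` nonzero terms of the sine series."""
    return sum((-1)**k * x**(2*k + 1) / math.factorial(2*k + 1)
               for k in range(terms))

def max_error(terms, samples=2001):
    """Largest |sin x - polynomial| over an even grid on [-1, 1]."""
    xs = [-1 + 2 * i / (samples - 1) for i in range(samples)]
    return max(abs(math.sin(x) - sin_poly(x, terms)) for x in xs)

if __name__ == "__main__":
    print(sin_term_magnitudes(1.0, 5))
    print(max_error(2), max_error(3))
```

Three-decimal accuracy corresponds to an error below 0.0005; the three-term polynomial meets this on the grid and the two-term polynomial does not.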
Self-test 5.2
How many terms of the Taylor series for cos x are needed to obtain three-decimal
accuracy over the range $-1 \le x \le 1$?
We can obtain new Taylor series from the standard ones in (5.4).
when $-2 < x < 2$.
To find the first few terms in the Taylor series for a composite function f(x) such as
\[ f(x) = \frac{e^{-x}}{(1 + x)^{1/2}}, \]
it is usually best not to start from first principles by calculating f(0), f (1)(0), f (2)(0),
and so on, which can lead to great complication, but to manipulate standard
expansions as in the following examples.
Write
Example 5.7 Obtain the first three nonzero terms of the Taylor expansion for
sec x = 1/cos x.
There are several ways of doing this problem.
(a) Working from (5.3). You might try this, but it is rather arduous.
(b) Using the power series for cos x given by (5.4d). Write
\[ \frac{1}{\cos x} = \frac{1}{1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots}. \]
The problem is to find the first three terms in the reciprocal of the infinite series; we then
have a Taylor polynomial. Anticipate that only the even powers of x will occur, as in the
expansion of the even function cos x. Then we expect
\[ \frac{1}{\cos x} = \frac{1}{1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots} = b_0 + b_2 x^2 + b_4 x^4 + \cdots. \]
We have to find $b_0$, $b_2$, $b_4$. To do this, cross-multiply:
\[ 1 = \left(1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots\right)(b_0 + b_2 x^2 + b_4 x^4 + \cdots) \]
\[ \phantom{1} = b_0 + (b_2 - \tfrac{1}{2}b_0)x^2 + (b_4 - \tfrac{1}{2}b_2 + \tfrac{1}{24}b_0)x^4 + \cdots \]
(retaining only powers up to $x^4$). Match the coefficients of powers of x on both sides,
starting with the constant term; we obtain
\[ b_0 = 1, \]
and, since the coefficients of $x^2$ and $x^4$ on the left are zero,
\[ b_2 - \tfrac{1}{2}b_0 = 0 \quad\text{and}\quad b_4 - \tfrac{1}{2}b_2 + \tfrac{1}{24}b_0 = 0. \]
The last two equations can be solved successively to give
\[ b_2 = \tfrac{1}{2} \quad\text{and}\quad b_4 = \tfrac{5}{24}. \]
Finally
\[ 1/\cos x \approx 1 + \tfrac{1}{2}x^2 + \tfrac{5}{24}x^4. \]
(c) Polynomial division. We can evaluate $1/(1 - \tfrac{1}{2}x^2 + \tfrac{1}{24}x^4 - \cdots)$ by long division,
setting it out like this, ignoring powers higher than $x^4$:

                                    1 + (1/2)x^2 + (5/24)x^4
    1 − (1/2)x^2 + (1/24)x^4   )    1
    subtract:                       1 − (1/2)x^2 + (1/24)x^4
                                        (1/2)x^2 − (1/24)x^4
    subtract:                           (1/2)x^2 −  (1/4)x^4
                                                   (5/24)x^4
    subtract:                                      (5/24)x^4
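The coefficient matching in method (b) is a recurrence that is easy to mechanize. The sketch below uses exact fractions; the helper names are illustrative, and the recurrence $b_k = -(1/c_0)\sum_{j=1}^{k} c_j b_{k-j}$ is the cross-multiplication step written out for a general power series with $c_0 \ne 0$.

```python
from fractions import Fraction
from math import factorial

def cos_coeffs(n_terms):
    """Coefficients c_0, c_1, ... of cos x in powers of x (odd ones are zero)."""
    return [Fraction((-1)**(k // 2), factorial(k)) if k % 2 == 0 else Fraction(0)
            for k in range(n_terms)]

def reciprocal_series(c, n_terms):
    """Coefficients b_k such that (sum c_k x^k)(sum b_k x^k) = 1 up to x^(n_terms-1).

    Requires c[0] != 0 and len(c) >= n_terms.
    """
    b = [1 / c[0]]
    for k in range(1, n_terms):
        # The coefficient of x^k in the product must vanish.
        s = sum(c[j] * b[k - j] for j in range(1, k + 1))
        b.append(-s / c[0])
    return b

if __name__ == "__main__":
    print(reciprocal_series(cos_coeffs(5), 5))
```

Passing more coefficients extends the expansion of sec x to higher powers without any further algebra.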
Self-test 5.3
Find the Taylor series for $(1 + x)^{1/2} e^{-x}$ as far as $x^3$.
\[ \left(1 + \frac{1}{x}\right)^{1/2} \approx 1 + \frac{1}{2x} - \frac{1}{8x^2}, \]
when x is large enough (positively or negatively), the approximation improving as |x| gets
larger.
process that amounts to changing the origin, as in the following example.
Example 5.10 Obtain a Taylor series about the point x = π for the function cos x.
There exists already the series (5.4d), which is valid at x = π. However, if we are
interested in approximating to cos x near x = π, an expansion in powers of x − π should
be more economical and expressive than one consisting of powers of x. We show two
ways of finding the series.
(a) On the lines of Example 5.9. Write
\[ \cos x = \cos[\pi + (x - \pi)] = \cos\pi\cos(x - \pi) - \sin\pi\sin(x - \pi) = -\cos(x - \pi). \]
We can use (5.4d) to expand this, by putting x − π in place of x. We obtain
\[ \cos x = -\cos(x - \pi) = -1 + \frac{1}{2!}(x - \pi)^2 - \frac{1}{4!}(x - \pi)^4 + \cdots. \]
This is valid for all values of x. A two-term approximation shows that cos x has a
parabolic shape near x − π = 0 or x = π, where cos x has a local minimum.
(b) Matching the value and the derivatives at x = π. The derivatives of f(x) = cos x at
x = π are given by
\[ f(\pi) = \cos\pi = -1, \quad f^{(1)}(\pi) = -\sin\pi = 0, \quad f^{(2)}(\pi) = -\cos\pi = 1, \]
and so on. The same relations hold good between the coefficients of a polynomial
in powers of x − π and the values of its derivatives at x = π, as was stated in (5.3) for
polynomials in x at x = 0. We simply put x − π in place of x in (5.3). The required
Taylor series is
\[ f(x) = f(\pi) + \frac{1}{1!}f^{(1)}(\pi)(x - \pi) + \frac{1}{2!}f^{(2)}(\pi)(x - \pi)^2 + \cdots, \]
which is the same as the result obtained in (a).
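The claim that powers of x − π are more economical near x = π can be illustrated numerically. The sketch below compares three-term partial sums of the two expansions at a point close to π; the sample point and function names are illustrative assumptions.

```python
import math

def cos_about_pi(x, terms=3):
    """Partial sum of -cos(x - pi) in even powers of (x - pi)."""
    h = x - math.pi
    return -sum((-1)**k * h**(2*k) / math.factorial(2*k) for k in range(terms))

def cos_about_zero(x, terms=3):
    """Partial sum of the series (5.4d) in even powers of x."""
    return sum((-1)**k * x**(2*k) / math.factorial(2*k) for k in range(terms))

if __name__ == "__main__":
    x = 3.0
    print(math.cos(x), cos_about_pi(x), cos_about_zero(x))
```

With the same number of terms, the expansion about π is accurate to several decimals at x = 3, whereas the expansion about the origin is still far from the true value there.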
The general result is the following:
Self-test 5.4
Find the Taylor series of sin x about $x = \tfrac{1}{2}\pi$, including the general term.
\[ y = \frac{\sin x}{x} \]
specifies a value for y for all values of x except for x = 0. At this point the formula
gives y = 0/0, which is a meaningless or indeterminate expression. The graph of y
against x therefore contains a gap at x = 0. However, as we approach x = 0 from
either side, y may approach a single, finite value, y(0) say, that plugs the gap. If such
a value exists, it is given by the limiting operation
\[ y(0) = \lim_{x \to 0} \frac{\sin x}{x}, \]
which can be evaluated in the following way.
Use the Taylor series to write, for x ≠ 0,
\[ \frac{\sin x}{x} = \frac{x - \frac{1}{3!}x^3 + \cdots}{x} = 1 - \frac{1}{3!}x^2 + \cdots, \]
after cancelling the factor x. This new expression has no peculiarities. Therefore
we put x = 0 in the new series, obtaining
\[ \lim_{x \to 0} \frac{\sin x}{x} = 1, \]
and this is the missing value y(0).
Also
\[ 1 - \cos x = 1 - \left(1 - \tfrac{1}{2}x^2 + \text{higher powers}\right) = \tfrac{1}{2}x^2 + \text{higher powers}. \tag{ii} \]
Observe that only the leading terms, or the dominant terms, of the Taylor series
are needed explicitly in order to obtain the limiting value.
Suppose that we require the limit of the ratio f(x)/g(x) as x → a, but f(a) and
g(a) are both zero. The dominant terms in the Taylor series can be expressed in
terms of derivatives of f(x) and g(x) as in (5.5). This leads to the following formal
statement, known as l’Hôpital’s rule:
L’Hôpital’s rule
Let f(x), g(x) be represented by Taylor series at x = a. If f(a) = g(a) = 0, and
g′(a) ≠ 0, then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}. \tag{5.6} \]
If f′(a) = g′(a) = 0 and g″(a) ≠ 0, then l’Hôpital’s rule generalizes to
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f''(a)}{g''(a)}, \]
with obvious extensions.
For example, if f(x) = x and g(x) = sin x, then, since f(0) = g(0) = 0,
\[ \lim_{x \to 0} \frac{f(x)}{g(x)} = \lim_{x \to 0} \frac{f'(x)}{g'(x)} = \lim_{x \to 0} \frac{1}{\cos x} = 1. \]
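A quick numerical check of rule (5.6) is to compare the ratio f/g evaluated close to a with the ratio of derivatives at a. The sketch below does this for the example f(x) = x, g(x) = sin x; the helper names and the step size h are illustrative choices, and the derivatives are supplied by hand rather than computed.

```python
import math

def ratio_near(f, g, a, h):
    """f(a + h) / g(a + h): the original ratio evaluated close to x = a."""
    return f(a + h) / g(a + h)

def lhopital(df, dg, a):
    """f'(a) / g'(a); applies when f(a) = g(a) = 0 and g'(a) != 0."""
    return df(a) / dg(a)

if __name__ == "__main__":
    near = ratio_near(lambda x: x, math.sin, 0.0, 1e-4)
    rule = lhopital(lambda x: 1.0, math.cos, 0.0)
    print(near, rule)
```

The two printed numbers agree to many decimal places, which is what the leading terms of the Taylor series predict.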
Self-test 5.5
Using l’Hôpital’s rule find
\[ \lim_{x \to 0} \frac{e^x \sin x}{(1 - x)^{1/2} - 1}. \]
Problems
5.1 Obtain a four-term Taylor polynomial (a) arcsin x; (b) arccos x; (c) arctan x;
approximation valid near x = 0 for each of the (d) e−x sin x; (e) e−x cos x.
following. Estimate the ranges of x over which
three-term polynomials will give two-decimal 5.5 (See Section 5.5.) Find three nonzero terms in
accuracy (you cannot usually tell until you have the Taylor series at x = 0 for the following
seen the next-higher term). functions and state the ranges of validity.
(a) 1 /(1 + 3x); (b) 1/(2 − x); (c) (3 − x)–;
1
(c) $[\ln(1 - x)]^2/x^2$.

5.3 For each of the following series give the Taylor polynomial having the lowest degree which you think will safely give four-decimal accuracy over the ranges given.
(a) $e^x$ over $-2 \le x \le 2$;
(b) sin x over $-2 \le x \le 2$;
(c) cos x over $-2 \le x \le 2$;
(d) $(1 + x)^{1/2}$ over $-0.5 \le x \le 0.5$;
(e) ln(1 + x) over $-0.5 \le x \le 0.5$.

5.4 Obtain the first two nonzero terms in the Taylor series at the origin for the following.

5.7 (See Example 5.7.) Find the first three nonzero terms in the Taylor expansions at the origin for the following.
(a) 1/[1 + ln(1 + x)];
(b) tan x;
(c) $1/(1 + e^x)$;
(d) tanh x, or $(e^x - e^{-x})/(e^x + e^{-x})$. (It is less complicated if you firstly reduce this to a more manageable form.)
(e) x/sin x.
5.8 (See Section 5.6.) Find a three-term 5.13 Obtain
approximation, valid for large enough values of x,
in each of the following cases. (1 − x)12 − 1 sin x − x
(a) lim ; (b) lim ;
–12 x→0 (1 − x)10 − 1 x→0 sin x − x cos x
A 1D A 1D
(a) 1− ; (b) 1+ – ; cos x + 1 sin x − 1
C xF C xF
1
2
(c) lim ; (d) lim1 .
x→π x−π x→ –2 π cos 5x
(c) x /(1 + x) ;
–12 –12
(c) Show that, when x is large enough, approach −∞ as x approaches zero from the left,
(2 + x)– − (1 + x)– ≈ 1/(2x–).
1
2
1
2
1
2
and +∞ as x approaches from the right.
(d) Show that, when x is nonzero but small
5.15 (a) Obtain limx→0 sin3(3x) /(1 − cos x) by
enough, then 1/(1 − cos x)– ≈ (√2/x) + (x√2/24).
1
2
5.11 Suppose that f(x) has a stationary point at x = c. Write down the form of its Taylor series about x = c, taking this into account.
(a) By considering the first three terms, rediscover the conditions on f″(c) which determine the type of stationary point (see (4.2)).
(b) By considering further terms of the Taylor series, extend the criteria to obtain a general rule which covers the case f″(c) = 0.

5.17 Using (5.4) identify the functions which have the following Taylor series:
(a) $\sum_{n=0}^{\infty} x^n \left[\frac{1}{n!} - (-1)^n\right]$; (b) $\sum_{n=1}^{\infty} \frac{x^{n+2}}{n}$; (c) $\sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}$;
and indicate the values of x for which the series is valid.
5.12 The following expressions are undetermined at x = 0. Where possible, obtain the appropriate values there which make up continuous functions.
(a) $(e^x - 1)/x$;
(b) $(1 - \cos x)/x^2$;
(c) [ln(1 + x) − x]/sin x;
(d) sin x/(1 − cos x).

5.18 By identifying the Taylor series find the sums of the following series.
(a) $2 - \frac{2^3}{3!} + \frac{2^5}{5!} - \frac{2^7}{7!} + \cdots$;
(b) $1 + (\tfrac{1}{2}) + \frac{1}{2!}(\tfrac{1}{2})^2 + \frac{1}{3!}(\tfrac{1}{2})^3 + \cdots$;
(c) $1 - \tfrac{1}{4} + \tfrac{1}{16} - \tfrac{1}{64} + \tfrac{1}{256} - \cdots$.
6 Complex numbers
A general complex number z can be written in the standard form z = x + iy,
where x and y are real numbers. In this expression x is known as the real part
Example 6.1 Express $i^2$, $i^3$, $i^4$, $i^5$, and $i^6$ in standard form.
The standard forms are
\[ i^2 = -1, \quad i^3 = i^2 i = -1 \times i = -i. \]
Since −i can be written 0 + (−1)i, it follows that Re $i^3$ = 0 and Im $i^3$ = −1.
\[ i^4 = i^2 i^2 = (-1)(-1) = 1, \quad i^5 = i\,i^4 = i, \quad i^6 = i\,i^5 = -1. \]
Example 6.2 Express the following in standard form, and state the real and
imaginary parts in each case:
(a) (2 + i) − (3 + 3i); (b) i(i + 2); (c) (1 − i)(1 + 2i); (d) (2 − 3i)(2 + 3i).
(a) (2 + i) − (3 + 3i) = 2 + i − 3 − 3i = −1 − 2i.
Real part = −1; imaginary part = −2.
(b) $i(i + 2) = i^2 + 2i = -1 + 2i$.
Real part = −1; imaginary part = 2.
(c) $(1 - i)(1 + 2i) = 1 + 2i - i - 2i^2 = 1 + 2i - i + 2 = 3 + i$.
Real part = 3; imaginary part = 1.
(d) $(2 - 3i)(2 + 3i) = 2^2 - (3i)^2 = 4 - 9i^2 = 4 + 9 = 13$.
Real part = 13; imaginary part = 0.
Let z1 = x1 + iy1 and z2 = x2 + iy2. In formal terms, the principal rules are as
follows.
1. Equality Two complex numbers z1 and z2 are said to be equal if and only if
x1 = x2 and y1 = y2: we write z1 = z2. (This rule may seem too obvious to be worth
recording. However, it is called for very frequently in applications and you should
attend to it – especially the ‘only if’.)
its real part being the sum of the real parts, and its imaginary part the sum of the
imaginary parts of z1 and z2.
3. Difference Similarly the difference $z_1 - z_2$ is
(a) $\dfrac{1}{2 + 3i}$; (b) $\dfrac{1}{2 - 3i}$; (c) $\dfrac{1 - i}{1 + i}$; (d) $\dfrac{1}{i}$.
Example 6.4 Find the standard form of the complex numbers (a) $z_1 + z_2$,
(b) $2z_1 - 3z_2$, (c) $z_1 z_2$, (d) $z_1^2/z_2$, where $z_1 = -1 + 2i$ and $z_2 = 2 - 3i$.
(a) $z_1 + z_2 = (-1 + 2i) + (2 - 3i) = 1 - i$.
(b) $2z_1 - 3z_2 = 2(-1 + 2i) - 3(2 - 3i) = -2 + 4i - 6 + 9i = -8 + 13i$.
(c) $z_1 z_2 = (-1 + 2i)(2 - 3i) = -2 + 4i + 3i - 6i^2 = -2 + 4i + 3i + 6 = 4 + 7i$.
(d) First
\[ z_1^2 = (-1 + 2i)(-1 + 2i) = 1 - 4i + 4i^2 = -3 - 4i. \]
Then
\[ \frac{z_1^2}{z_2} = \frac{-3 - 4i}{2 - 3i} = \frac{(-3 - 4i)(2 + 3i)}{(2 - 3i)(2 + 3i)} = \frac{-6 - 8i - 9i - 12i^2}{4 - 6i + 6i - 9i^2} = \frac{6 - 17i}{4 + 9} = \frac{6}{13} - \frac{17}{13}i. \]
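Hand calculations of this kind are easy to confirm with a language that has complex numbers built in. The sketch below repeats Example 6.4 with Python's `complex` type (Python writes the imaginary unit as `j`); the dictionary keys are just labels.

```python
z1 = complex(-1, 2)
z2 = complex(2, -3)

results = {
    "z1 + z2": z1 + z2,
    "2*z1 - 3*z2": 2 * z1 - 3 * z2,
    "z1*z2": z1 * z2,
    "z1**2 / z2": z1**2 / z2,
}

if __name__ == "__main__":
    for label, value in results.items():
        print(f"{label:12s} = {value}")
```

The last result is printed in floating point, so it should be compared with 6/13 and −17/13 to within rounding.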
The conjugate $\bar{z}$ of z has the following further properties, which are simply
applications of the rule:
to obtain the conjugate, change i to −i wherever it occurs (explicitly or
implicitly).
Self-test 6.1
Find the standard form of (a) $z_1 z_2$; (b) $z_1/z_2$; (c) $(z_1^2/z_2) - (\bar{z}_1^2/\bar{z}_2)$, where $z_1 = 1 + 2i$
and $z_2 = 2 - i$.
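(a) $|z_1 + z_2|$; (b) $|z_1| + |z_2|$; (c) $|z_1 z_2|$;
(c) $|z_1 z_2| = |(1 - 3i)(3 - 2i)| = |-3 - 11i| = [(-3)^2 + (-11)^2]^{1/2} = \sqrt{130}$.
(d) $|z_1||z_2| = |1 - 3i||3 - 2i| = \sqrt{10}\sqrt{13} = \sqrt{130}$
(i.e. the same as (c)).
(e)
\[ \frac{z_1}{z_2} = \frac{1 - 3i}{3 - 2i} \cdot \frac{3 + 2i}{3 + 2i} = \frac{1}{13}(9 - 7i). \]
Therefore
\[ \left|\frac{z_1}{z_2}\right| = \frac{1}{13}|9 - 7i| = \frac{1}{13}\sqrt{130} = \frac{\sqrt{10}}{\sqrt{13}}. \]
(f)
\[ \frac{|z_1|}{|z_2|} = \frac{|1 - 3i|}{|3 - 2i|} = \frac{\sqrt{10}}{\sqrt{13}} \quad\text{(i.e. the same as (e)).} \]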
The following results hold good for the modulus; they are illustrated in
Example 6.6 above.
The identity (6.4b) follows directly from (6.2). We shall defer the proof of (c) and
(d), but the truth is illustrated in Example 6.6c,d,e,f. The modulus of a sum or dif-
ference cannot be split in this way: contrast the results of Examples 6.6a and 6.6b.
The sum of two complex numbers can be interpreted by the parallelogram law
of addition in the Argand diagram, as in Fig. 6.2. Construct a parallelogram on
OP1 and OP2, where P1 and P2 correspond to the complex numbers z1 and z2. The
corner P of the parallelogram represents the sum z1 + z2. This follows from the
addition rule for complex numbers. If you know anything about vectors, you will
recognize that complex numbers add like vectors.
The conjugate $\bar{z} = x - iy$ is the reflection of z in the x axis, shown in Fig. 6.3.
Fig. 6.2 Parallelogram law of addition.
Fig. 6.3 Argand diagram showing z and its conjugate $\bar{z}$.
Self-test 6.2
If z = 1 + i, plot the points on the Argand diagram of $z$, $\bar{z}$, $z^2$, $\bar{z}^2$, $2z$, $z\bar{z}$.
Example 6.7 Find the moduli and principal values of the arguments of the
following complex numbers: (a) $z_1 = 2i$; (b) $z_2 = -1 - i$; (c) $z_3 = -2$;
(d) $z_4 = \tfrac{1}{2} + \tfrac{1}{2}i\sqrt{3}$.
The moduli are given by:
(a) $|z_1| = |2i| = 2$;
(b) $|z_2| = |-1 - i| = \sqrt{1 + 1} = \sqrt{2}$;
(c) $|z_3| = |-2| = 2$;
(d) $|z_4| = |\tfrac{1}{2} + \tfrac{1}{2}i\sqrt{3}| = \sqrt{\tfrac{1}{4} + \tfrac{3}{4}} = 1$.
Fig. 6.4 [Argand diagrams (a) to (d) showing the positions of z1, z2, z3, z4 and their arguments θ1, θ2, θ3, θ4.]
A sketch of the Argand diagram for the complex numbers helps to decide their
arguments. Figure 6.4 shows their locations. Thus
(a) Arg $z_1 = \theta_1 = \tfrac{1}{2}\pi$; (b) Arg $z_2 = \theta_2 = -\tfrac{3}{4}\pi$; (c) Arg $z_3 = \theta_3 = \pi$; (d) Arg $z_4 = \theta_4 = \tfrac{1}{3}\pi$.
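The moduli and principal arguments found above can be checked with Python's standard `cmath` module, whose `phase` function returns an angle in the range −π to π. The dictionary of test values below is an illustrative arrangement of the four numbers in Example 6.7.

```python
import cmath
import math

def modulus_and_arg(z):
    """Return (|z|, Arg z), with the argument in radians between -pi and pi."""
    return abs(z), cmath.phase(z)

numbers = {
    "z1": complex(0, 2),
    "z2": complex(-1, -1),
    "z3": complex(-2, 0),
    "z4": complex(0.5, math.sqrt(3) / 2),
}

if __name__ == "__main__":
    for name, z in numbers.items():
        r, theta = modulus_and_arg(z)
        print(f"{name}: modulus {r:.4f}, argument {theta / math.pi:.4f} pi")
```

Printing the argument as a multiple of π makes the comparison with ½π, −¾π, π and ⅓π immediate.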
In Fig. 6.1, the coordinates (x, y) and the polar coordinates (r, θ ) are related by
x = r cos θ, y = r sin θ.
Hence the complex number z = x + iy can be written
z = r cos θ + ir sin θ = r(cos θ + i sin θ ), (6.5)
Fig. 6.5 [Argand diagram showing the point −1 + √3i and its argument θ.]
Example 6.9 Obtain (a) $|\cos\theta + i\sin\theta|$; (b) $|1/(\cos\theta + i\sin\theta)|$.
(a) $|\cos\theta + i\sin\theta| = (\cos^2\theta + \sin^2\theta)^{1/2} = 1$.
Self-test 6.3
If $z = 1 - \sqrt{3}\,i$, find the polar forms of $z$, $\bar{z}$, $2z$ and $z^2$.
applying the rule of replacing i by (−i).
From (6.5), any complex number can be written in the exponential form
\[ z = r e^{i\theta}. \]
Remember that, in $r e^{i\theta}$, the numbers r and θ are polar coordinates. For these simple
cases, we can therefore put the points straight on an Argand diagram and read off
the coordinates, without needing to work out cos θ and sin θ.
(a) r = 2, $\theta = \tfrac{1}{2}\pi$ (90°). Hence $2e^{\frac{1}{2}\pi i} = 2i$.
\[ \cos\theta = \frac{1}{2}(e^{i\theta} + e^{-i\theta}), \quad \sin\theta = \frac{1}{2i}(e^{i\theta} - e^{-i\theta}). \tag{6.10} \]
Equation (6.6) will still be true if we replace the angle θ by nθ, where n is an
integer. Hence we obtain De Moivre’s theorem:
\[ \cos n\theta + i\sin n\theta = e^{ni\theta} = (e^{i\theta})^n = (\cos\theta + i\sin\theta)^n. \]
De Moivre’s theorem
If n is any integer and θ real, then
\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. \tag{6.11} \]
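De Moivre’s theorem is simple to spot-check in floating point. The sketch below evaluates both sides of (6.11) for a given angle and integer power; the function name and the sample values are illustrative.

```python
import math

def de_moivre_sides(theta, n):
    """Return ((cos t + i sin t)**n, cos(n t) + i sin(n t)) for comparison."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return lhs, rhs

if __name__ == "__main__":
    for n in (-3, 0, 2, 5):
        lhs, rhs = de_moivre_sides(0.7, n)
        print(n, lhs, rhs, abs(lhs - rhs))
```

Negative integers are included because the theorem is stated for any integer n; the difference between the two sides is at the level of rounding error.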
The complex numbers having arguments θ and θ + 2nπ are equal for all integer
values of n, since 2π is a complete revolution on the Argand diagram. Thus
\[ e^{i(\theta + 2n\pi)} = e^{i\theta}e^{i2n\pi} = e^{i\theta} \cdot 1 = e^{i\theta}. \]
If $z_1 = r_1 e^{i\theta_1}$, $z_2 = r_2 e^{i\theta_2}$, then the product $z_1 z_2 = r_1 r_2 e^{i(\theta_1 + \theta_2)}$: its argument is the sum
(e) r cos θ = 3, r sin θ = −4. Hence $r = \sqrt{9 + 16} = 5$ while the angle α is the principal
value of the argument such that $\cos\alpha = \tfrac{3}{5}$, $\sin\alpha = -\tfrac{4}{5}$. The exponential form is
\[ 3 - 4i = 5e^{i\alpha}, \]
where α = −53.1° or −0.927 radians.
Example 6.12 By expressing −1 + i in the form $r e^{i\theta}$, find $(-1 + i)^{-8}$ as a complex
number in standard form.
First $r = |-1 + i| = \sqrt{2}$. From its position on an Argand diagram θ = 3 × 45°, or $\tfrac{3}{4}\pi$ in
radians. Then
\[ (-1 + i)^{-8} = (\sqrt{2}\,e^{\frac{3}{4}\pi i})^{-8} = (\sqrt{2})^{-8} e^{-6\pi i} = \tfrac{1}{16}e^{-6\pi i}. \]
On an Argand diagram the polar coordinates are $r = \tfrac{1}{16}$ and θ = −6π = −3(2π). This
value of θ, equivalent to three complete revolutions, puts us on the positive real axis
again, so that
\[ (-1 + i)^{-8} = \tfrac{1}{16}. \]
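The two routes to this answer, through the polar form and by direct powering, can be compared using `cmath.polar` and `cmath.rect` from the Python standard library. The variable names are illustrative.

```python
import cmath

z = complex(-1, 1)

# Polar route: r = sqrt(2), theta = 3*pi/4, then raise r to -8 and multiply theta by -8.
r, theta = cmath.polar(z)
via_polar = cmath.rect(r ** -8, -8 * theta)

# Direct route: let the complex type do the powering.
direct = z ** -8

if __name__ == "__main__":
    print(r, theta)
    print(via_polar, direct)
```

Both values agree with 1/16 up to rounding; the tiny imaginary part that may appear in `via_polar` comes from evaluating the sine of −6π in floating point.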
Self-test 6.4
If z = −1 − i, express $z^{10}$ in standard form.
in which t represents time. Recast $c\,e^{(\alpha + i\beta)t}$ by writing it as $c\,e^{\alpha t + i\beta t}$, and we have
The form $c\,e^{(\alpha + i\beta)t}$ (α, β, c, t real)
(a) $c\,e^{\alpha t}\cos\beta t = \operatorname{Re}\, c\,e^{(\alpha + i\beta)t}$,
(b) $c\,e^{\alpha t}\sin\beta t = \operatorname{Im}\, c\,e^{(\alpha + i\beta)t}$. (6.13)
Self-test 6.5
Express cos 4θ in powers of cos θ.
Since, for real z, we have cosh z ≥ 1, we expect the equation to have complex roots.
In exponential form,
\[ \tfrac{1}{2}(e^z + e^{-z}) = -1, \]
or
\[ e^{2z} + 2e^z + 1 = 0, \]
or
\[ (e^z + 1)^2 = 0. \]
Hence
Self-test 6.6
Find all (complex) solutions of
\[ \cosh^2 z - 3\cosh z + 2 = 0. \]
so that
\[ r^5 = 2^{5/2} \quad\text{and}\quad 5\theta = \pi(-\tfrac{1}{4} + 2n), \]
can express $\cos^n\theta$ and $\sin^n\theta$ in terms of cosines and sines of multiple angles.
Example 6.20 Expand $\cos^6\theta$ in terms of multiple angles.
Let z = cos θ + i sin θ. By De Moivre’s theorem, with n an integer,
\[ z^n = \cos n\theta + i\sin n\theta, \quad \frac{1}{z^n} = \cos n\theta - i\sin n\theta. \]
By adding these two results, it follows that
\[ \cos n\theta = \frac{1}{2}\left(z^n + \frac{1}{z^n}\right). \tag{6.14} \]
Hence
\[ (2\cos\theta)^6 = \left(z + \frac{1}{z}\right)^6 = z^6 + 6z^4 + 15z^2 + 20 + \frac{15}{z^2} + \frac{6}{z^4} + \frac{1}{z^6} \]
\[ \phantom{(2\cos\theta)^6} = \left(z^6 + \frac{1}{z^6}\right) + 6\left(z^4 + \frac{1}{z^4}\right) + 15\left(z^2 + \frac{1}{z^2}\right) + 20 \]
\[ \phantom{(2\cos\theta)^6} = 2\cos 6\theta + 12\cos 4\theta + 30\cos 2\theta + 20, \]
by repeated use of (6.14). Finally
\[ \cos^6\theta = \tfrac{1}{32}\cos 6\theta + \tfrac{3}{16}\cos 4\theta + \tfrac{15}{32}\cos 2\theta + \tfrac{5}{16}. \]
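The same binomial bookkeeping works for any positive integer power. The sketch below collects the coefficients of cos kθ in the expansion of $\cos^n\theta$ by pairing the terms of $(z + 1/z)^n$, and then checks the n = 6 result numerically; the function names are illustrative.

```python
import math
from fractions import Fraction

def cos_power_coeffs(n):
    """Coefficients a_k with cos(t)**n = sum of a_k * cos(k t), from (z + 1/z)**n / 2**n."""
    coeffs = {}
    for j in range(n + 1):
        k = abs(n - 2 * j)  # z**(n - 2j) and z**-(n - 2j) both contribute to cos(k t)
        coeffs[k] = coeffs.get(k, 0) + Fraction(math.comb(n, j), 2 ** n)
    return coeffs

def evaluate(coeffs, theta):
    """Evaluate the multiple-angle sum at a given angle."""
    return sum(float(a) * math.cos(k * theta) for k, a in coeffs.items())

if __name__ == "__main__":
    c6 = cos_power_coeffs(6)
    print(c6)
    print(math.cos(0.4) ** 6, evaluate(c6, 0.4))
```

For n = 6 the dictionary holds 1/32, 3/16, 15/32 and 5/16 against the keys 6, 4, 2 and 0, matching the final line of the example.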
We can also use the polar form to sum certain series as in the following
example.
Self-test 6.7
Problems
6.1 (Section 6.1). Find the solutions of the following quadratic equations:
(a) $x^2 + 2x + 5 = 0$; (b) $x^2 - 6x + 10 = 0$; (c) $x^2 + 2ix + 3 = 0$.

6.2 (Section 6.1). Find all the complex solutions of $x^4 + 3x^2 - 4 = 0$.

6.3 (Section 6.1). Express the following complex numbers in standard form:
(a) (1 − i) + (3 + 4i); (b) 2(3 − i) + 3(−1 − i);
(c) 3(−1 + i) − 4(2 − 3i); (d) 3(1 + i)(2 − i);
(e) $\dfrac{2 + i}{3 - i}$; (f) $\dfrac{(2 + i)(7 + 5i)}{3 - i}$; (g) $(-1 + 2i)^2$;
(h) $(-1 + 2i)^2 + \dfrac{1}{(-1 + 2i)^2}$; (i) $(1 + i)^5$.

6.4 (Section 6.1). Find the boundary curve in the (p, q) plane which separates the (p, q) values giving real roots from those for the complex roots in the quadratic equation
\[ x^2 + px + q = 0, \]
where p and q are real parameters. Of the real solutions, where in the (p, q) plane do the solutions which are both negative lie?

6.5 (Section 6.1). Let $z_1 = 3 - i$ and $z_2 = 1 + 2i$. Find, in standard form, the complex numbers
(a) $z_1 + z_2$; (b) $z_1 z_2$; (c) $z_1/z_2$; (d) $z_1/z_2^2$.

6.6 (Section 6.1). Let $z_1 = 2 + 3i$ and $z_2 = -2 + i$. Find the following complex conjugates:
(a) $\overline{z_1 + z_2}$; (b) $\overline{z_1 z_2}$; (c) $\overline{z_1/z_2}$; (d) $\bar{z}_1/\bar{z}_2$.

6.7 Let z = 1 + i. Find the following complex numbers in standard form and plot their corresponding points in the Argand diagram:
(a) $\bar{z}$; (b) $z^2$; (c) $\bar{z}^2$; (d) $1/\bar{z}$; (e) $z/\bar{z}$.

6.8 (Section 6.5). Three complex numbers are given by $z_1 = 2e^{1+i}$, $z_2 = 3e^{-i}$, and $z_3 = \tfrac{1}{2}e^{-1+2i}$. Express the following complex numbers in standard form:
(a) $z_1 + z_2 + z_3$; (b) $z_1 z_2 + \bar{z}_3$; (c) $z_1 z_2 z_3$;
(d) $z_1 \bar{z}_2/z_3$; (e) $z_1^2 z_2^2 - 2z_3^2$.

6.9 (Section 6.3). Find the modulus and principal argument of each of the following complex numbers:
(a) $z_1 = -2 + 2i$; (b) $z_2 = 4 - 4\sqrt{3}\,i$; (c) $z_3 = -5i$;
(d) $z_4 = -3$; (e) $z_5 = 3 + 4i$.

6.10 (Section 6.2). Let z = x + iy. Express each of the following equations in the complex variable z in real form in terms of x and y. Sketch, and identify in each case, the corresponding curve in the Argand diagram:
(a) $z\bar{z} = 1$; (b) Im z = 2;
(c) |z − a| = 1, where a is a complex number;
(d) $(z - \bar{z})^2 = -8(z + \bar{z})$;
(e) |z − 1| + |z + 1| = 4;
(f) Arg $z = \tfrac{1}{4}\pi$ (see Section 6.3); (g) |z| = arg z.

6.11 (Section 6.4). Express the following complex numbers in exponential form with principal values of the arguments:
(a) −1 + i; (b) −2; (c) −3i;
(d) $7 - 7i\sqrt{3}$; (e) $(1 - i)(1 + i\sqrt{3})$;
(f) $\dfrac{1 - i}{1 + \sqrt{3}}$; (g) $e^{2+i}$; (h) $(1 + i)e^{2i}$;
(i) $(1 - i\sqrt{3})^9$; (j) $\dfrac{(1 + i)^4}{2 - 2i}$.

6.12 (Section 6.3). Using Euler’s formula (6.8) for $e^{\pm i\theta}$, obtain the trigonometric identities for $\cos(\theta_1 \pm \theta_2)$ and $\sin(\theta_1 \pm \theta_2)$.

6.13 (Section 6.2). Using the parallelogram rule, sketch the locations in the Argand diagram, for general complex numbers $z_1$ and $z_2$, the following points: $z_1 + z_2$, $\bar{z}_1 + \bar{z}_2$, $z_1 - z_2$, $\bar{z}_1 + z_2$, $z_1 - \bar{z}_2$.
6.14 Let f(θ) = cos θ + i sin θ. Verify that
\[ \frac{d^2 f(\theta)}{d\theta^2} = -f(\theta). \]
Show that it is still true if
\[ f(\theta) = a\cos\theta + b\sin\theta, \]
where a and b are arbitrary complex numbers.

6.15 Prove that tan ia = i tanh a, where a is a real number.

6.16 Find all the complex solutions of the following equations:
(a) cosh z = 1; (b) sinh z = 1;
(c) $e^z = -1$; (d) cos z = √2.

6.17 The logarithm of a complex number $z = re^{i\theta}$ is defined by
\[ \log z = \ln r + i\theta + 2n\pi i, \]
where n is any integer, since $e^{\ln r + i\theta + i2n\pi} = e^{\ln r + i\theta}e^{i2n\pi}$ and $e^{i2n\pi} = 1$. Therefore log z is a multi-valued function. The principal value of the logarithm is denoted by Log z (note the capital letter L), and is defined by
\[ \operatorname{Log} z = \ln r + i\theta, \]
where $-\pi < \theta \le \pi$.
(a) Find the principal value Log(1 + i√3), and indicate its location on the Argand diagram.
(b) Find all roots of the equation log z = πi.
(c) Express Log($e^i$) in standard form.
(d) Show that $e^{\log z} = z$.

6.18 If z ≠ 0 and c are complex numbers, then $z^c$ is defined by
\[ z^c = e^{c \ln z} \]
(see Problem 6.17).
(a) Express $2^i$ in standard form.
(b) Find the principal value of the argument of $i^i$.
(c) Find all complex roots of $z^i = -1$.

6.19 (Section 6.4). Find all complex solutions of $z^5 = -1$, and sketch their locations on the Argand diagram.

6.20 (Section 6.5). Find the modulus, argument, and real and imaginary parts of each of the following complex numbers:
(a) $2e^{3+2i}$; (b) $4e^{i}$; (c) $5e^{\cos\frac{1}{4}\pi + i\sin\frac{1}{4}\pi}$; (d) $e^{1+i}$.

6.21 (Section 6.5). An oscillation in a system is given by $x = 0.04\,e^{-0.01t}\sin 12t$. Write this in the form
\[ x = \operatorname{Re}(c\,e^{\alpha + i\beta}). \]

6.22 (Section 6.5). The current in a branch of a circuit is given by
\[ i(t) = c\,e^{-0.05t}\sin(0.4t + 0.5). \]
Write this in the form of the real part of a complex function.

6.23 A function f(z), where z = x + iy, is known as a function of complex variable z. Find the real and imaginary parts of the following functions in terms of x and y:
(a) $z^2$; (b) $z + 2z^2 + 3z^3$; (c) sin z;
(d) cos z; (e) $e^z\cos z$; (f) $e^{z^2}$.

6.24 Let w = f(z), where z = x + iy and w = u + iv are complex variables. If $f(z) = z^2$, find u and v in terms of x and y. The relation represents a mapping between two Argand diagrams. What curves do the hyperbolas $x^2 - y^2 = 1$ and xy = 1 map into in the (u, v) plane?

6.25 Show that the mapping (see Problem 6.24)
\[ w = z + \frac{c}{z}, \]
where z = x + iy and w = u + iv and c is a real number, maps the circle |z| = 1 in the z plane into an ellipse in the w plane, and find its equation.

6.26 (Section 6.7). Show that
\[ \cos^6\theta = \tfrac{1}{32}(\cos 6\theta + 6\cos 4\theta + 15\cos 2\theta + 10), \]
and find $\sin^6\theta$.

6.27 The damped oscillation of a vibrating block is given by
\[ x = \operatorname{Re} z, \quad z = e^{(-0.2 + 0.5i)t}, \]
in terms of the time t. Find x, and determine the values of t where x is zero. Find the velocity of the block
(a) as dx/dt;
(b) as Re dz/dt;
and confirm that the answers are the same.

6.28 Given that 2 + i is a solution of the equation
\[ z^4 - 2z^3 - z^2 + 2z + 10 = 0, \]
find the other three solutions.

6.29 Find the sum of the series
(a) $1 - \sin\theta + \dfrac{\sin 2\theta}{2!} - \dfrac{\sin 3\theta}{3!} + \cdots$;
(b) $1 + 2\cos\theta + \dfrac{2^2\cos 2\theta}{2!} + \dfrac{2^3\cos 3\theta}{3!} + \cdots$.
Part 2
Matrix and vector algebra
7 Matrix algebra
\[ A = \begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & -4 \end{bmatrix} \]
is a matrix with two rows and three columns. The individual terms are known as
elements: the element in the second row and third column is − 4. This matrix is
said to be of order 2 × 3, or a 2 × 3 matrix. A general m × n matrix, one with m
rows and n columns, can be represented by the notation
where $a_{ij}$ is the element in the ith row and jth column of A; or by
\[ A = [a_{ij} : i = 1, \ldots, m;\ j = 1, \ldots, n], \]
or simply
\[ A = [a_{ij}] \]
for brevity, if it is clear in context that the matrix is m × n.
A 1 × 1 matrix is simply a number: for example, [−5] = −5. Matrices which have
either one row or one column are known as vectors. Thus
\[ \begin{bmatrix} 1.3 & -1.1 & 2.9 & 4.6 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} -1.1 \\ 6.5 \\ -2.0 \end{bmatrix} \]
are respectively row and column vectors.
A matrix in which the number of rows m equals the number of columns n is
called a square matrix: if m ≠ n then the matrix is said to be rectangular.
Self-test 7.1
The elements of a matrix are given by
\[ a_{ij} = (-i)^j \quad (i = 1, 2, 3;\ j = 1, 2, 3). \]
Write out the matrix in full.
1. Equality. Two matrices can only be equated if they are of the same order: that
is, if they each have the same number of rows and the same number of columns.
They are then said to be equal if the corresponding elements are equal. Thus if
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} e & f \\ g & h \end{bmatrix}, \]
then A = B if and only if a = e, b = f, c = g, and d = h. In general, if A = [aij] and
B = [bij] are both m × n matrices, then A = B if and only if aij = bij for i = 1, 2, … ,
m and j = 1, 2, … , n.
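2. Multiplication by a scalar. If k is a number (a scalar), then kA is the matrix in which every element of A is multiplied by k. Thus, if
\[ A = \begin{bmatrix} 5 & 25 & -30 \\ 10 & 15 & -5 \end{bmatrix}, \quad\text{then}\quad A = 5\begin{bmatrix} 1 & 5 & -6 \\ 2 & 3 & -1 \end{bmatrix}. \]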
3. Zero matrix. Any matrix in which every element is zero is called a zero or null
matrix. If A is a zero matrix, we can simply write A = 0.
4. Matrix sums and differences. The sum of two matrices A and B has meaning
only if A and B are of the same order, in which case A + B is defined as the matrix
C whose elements are the sums of the corresponding elements in A and B. We
write C = A + B. Thus, if A = [aij] and B = [bij] are both m × n matrices, then
C = A + B = [aij + bij].
To summarize:
If A = [aij] and B = [bij] (1 ≤ i ≤ m; 1 ≤ j ≤ n) then
1. Equality: A = B if and only if aij = bij;
2. Scalar factor k: kA = [kaij];
3. Zero: A = 0 if and only if aij = 0 for all i, j;
4. Sum and difference: A + B = [aij + bij], A − B = [aij − bij].
(7.1)
Example 7.2 If
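\[ A = \begin{bmatrix} 1 & 3 \\ 2 & 2 \\ 3 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} -4 & -6 \\ -5 & -5 \\ -6 & -4 \end{bmatrix}, \]
then find A + B, B + A, and A + 2B.
We have
\[ A + B = \begin{bmatrix} 1-4 & 3-6 \\ 2-5 & 2-5 \\ 3-6 & 1-4 \end{bmatrix} = \begin{bmatrix} -3 & -3 \\ -3 & -3 \\ -3 & -3 \end{bmatrix} = -3\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \quad\text{(by Rule 2).} \]
Also
\[ B + A = \begin{bmatrix} -4+1 & -6+3 \\ -5+2 & -5+2 \\ -6+3 & -4+1 \end{bmatrix} = \begin{bmatrix} -3 & -3 \\ -3 & -3 \\ -3 & -3 \end{bmatrix} = A + B. \]
Further
\[ A + 2B = \begin{bmatrix} 1 & 3 \\ 2 & 2 \\ 3 & 1 \end{bmatrix} + 2\begin{bmatrix} -4 & -6 \\ -5 & -5 \\ -6 & -4 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 2 & 2 \\ 3 & 1 \end{bmatrix} + \begin{bmatrix} -8 & -12 \\ -10 & -10 \\ -12 & -8 \end{bmatrix} \quad\text{(by Rule 2)} \]
\[ = \begin{bmatrix} -7 & -9 \\ -8 & -8 \\ -9 & -7 \end{bmatrix}. \]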
As the second sum suggests, the commutative property of the real numbers,
namely aij + bij = bij + aij, implies the commutative property of matrix addition,
that is
A + B = B + A.
The difference of two matrices is written as A − B which is interpreted as
A + (−1)B, using Rule 2 for the multiplication of B by the number −1, and then
Rule 4 for the sum of A and (−1)B. In practice, we simply take the difference of
corresponding elements.
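Rules 2 and 4 translate directly into element-by-element operations. The following is a minimal sketch, assuming matrices are stored as lists of rows; the function names `scalar_multiple`, `add`, and `subtract` are illustrative choices, and the data are the matrices of Example 7.2.

```python
def scalar_multiple(k, M):
    """Rule 2: multiply every element of M by the number k."""
    return [[k * a for a in row] for row in M]

def add(M, N):
    """Rule 4: add corresponding elements; the orders must agree."""
    if len(M) != len(N) or len(M[0]) != len(N[0]):
        raise ValueError("matrices must be of the same order")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(M, N)]

def subtract(M, N):
    """A - B interpreted as A + (-1)B, as in the text."""
    return add(M, scalar_multiple(-1, N))

A = [[1, 3], [2, 2], [3, 1]]
B = [[-4, -6], [-5, -5], [-6, -4]]

print(add(A, B))                      # [[-3, -3], [-3, -3], [-3, -3]]
print(add(A, scalar_multiple(2, B)))  # [[-7, -9], [-8, -8], [-9, -7]]
print(subtract(A, B))                 # [[5, 9], [7, 7], [9, 5]]
print(add(A, B) == add(B, A))         # True: addition commutes
```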
\[ = \begin{bmatrix} -4 & 4 & 13 \\ -3 & 4 & -3 \end{bmatrix}. \]
(b) k(A + B) = kA + kB, (k + l )A = kA + lA.
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix}, \quad B = \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}. \]
\[ AB = \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix} = [a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31}] = C. \tag{7.2} \]
Here, the single surviving element in C is the sum of the products of corresponding elements from the row in A and the column in B. Thus the product of a 1 × 3 matrix and a 3 × 1 matrix is a 1 × 1 matrix (or simply an ordinary number). This is known as a row-on-column operation.
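Suppose now that A is a 2 × 3 matrix and that B is a 3 × 2 matrix, given by
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}, \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}. \]
Then
\[ AB = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} \\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32} \end{bmatrix} = C. \tag{7.3} \]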
Note that each row in A ‘operates’ on each column in B giving four elements in
the 2 × 2 matrix C.
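\[ A = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 3 \\ 1 & -1 \\ -2 & 4 \end{bmatrix}. \]
We have
\[ AB = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & -3 \end{bmatrix} \begin{bmatrix} 0 & 3 \\ 1 & -1 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 1 \times 0 + (-1) \times 1 + 0 \times (-2) & 1 \times 3 + (-1) \times (-1) + 0 \times 4 \\ 2 \times 0 + 1 \times 1 + (-3) \times (-2) & 2 \times 3 + 1 \times (-1) + (-3) \times 4 \end{bmatrix} = \begin{bmatrix} -1 & 4 \\ 7 & -7 \end{bmatrix}. \]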
\[ \sum_{i=1}^{n} c_i, \]
where i runs through all the integers from the lower limit on i under the ∑ symbol,
to the upper limit above. Thus, for example,
\[ \sum_{i=3}^{8} h_i = h_3 + h_4 + h_5 + h_6 + h_7 + h_8. \]
We can also use the summation notation with the double-suffix notation, as in
\[ \sum_{i=1}^{4} h_{i6} = h_{16} + h_{26} + h_{36} + h_{46}. \]
\[ AB = [a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31}] = \Bigl[\sum_{j=1}^{3} a_{1j}b_{j1}\Bigr] = \sum_{j=1}^{3} a_{1j}b_{j1}. \]
\[ AB = \Bigl[\sum_{k=1}^{3} a_{ik}b_{kj} : i, j = 1, 2\Bigr]. \]
This example gives a clue to the general expression for the product of an m × n
matrix A and an n × p matrix B. Remember that the number of columns in A must
always equal the number of rows in B for the product to be defined. Thus, the
row-on-column definition of the product is the m × p matrix
\[ AB = \Bigl[\sum_{k=1}^{n} a_{ik}b_{kj} : i = 1, \dots, m;\ j = 1, \dots, p\Bigr]. \]
Multiplication rule
The element in the ith row and jth column of the product consists of the
row-on-column product of the ith row in A and the jth column in B. (7.4)
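The multiplication rule can be written almost word for word as a double loop over rows of A and columns of B. This is a minimal sketch, assuming matrices stored as lists of rows; `matmul` is an illustrative name, and the data are the 2 × 3 and 3 × 2 matrices of the worked product above.

```python
def matmul(A, B):
    """Row-on-column product of an m x n matrix A and an n x p matrix B."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # Element (i, j) is the sum over k of a_ik * b_kj.
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)]
            for row in A]

A = [[1, -1, 0], [2, 1, -3]]      # 2 x 3
B = [[0, 3], [1, -1], [-2, 4]]    # 3 x 2

AB = matmul(A, B)                 # 2 x 2
BA = matmul(B, A)                 # 3 x 3
print(AB)                         # [[-1, 4], [7, -7]]
print(len(BA), len(BA[0]))        # 3 3
```

Calling `matmul(A, A)` raises `ValueError`, since a 2 × 3 matrix cannot be multiplied on the right by another 2 × 3 matrix.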
One conclusion which can be inferred from the previous example is that matrix
multiplication does not, in general, commute; that is, in general, AB ≠ BA. As the
previous example indicates, one or both products may not be defined; when both
are defined, AB and BA may be of different order; and, even when both are defined
and of the same order, AB is generally not equal to BA. So we must be careful about
the order of multiplication. In the product AB, we say that A is multiplied on the
right by B, or that B is multiplied on the left by A. The expressions ‘A postmultiplied
by B’ and ‘B premultiplied by A’ are also used. Statements such as ‘A is multiplied
by B’ can be ambiguous without carefully stating how the product occurs.
Example 7.6 If
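\[ A = \begin{bmatrix} 1 & -1 & 0 \\ 3 & -2 & -1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix}, \]
calculate AB and BA.
AB will be a 2 × 2 matrix and BA a 3 × 3 matrix. We have
\[ AB = \begin{bmatrix} 1 & -1 & 0 \\ 3 & -2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0, \]
and
\[ BA = \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 & 0 \\ 3 & -2 & -1 \end{bmatrix} = \begin{bmatrix} 7 & -5 & -2 \\ 7 & -5 & -2 \\ 7 & -5 & -2 \end{bmatrix}. \]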
This example illustrates the point that AB can be a zero matrix without either
A or B or BA being zero. Also, as a consequence, A(B − C) = 0 does not necessarily
imply B = C.
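The conclusions of Example 7.6 are easy to confirm numerically. The sketch below assumes the list-of-rows storage used for illustration here; `matmul` and `is_zero` are hypothetical helper names.

```python
def matmul(A, B):
    """Row-on-column product (assumes the orders are compatible)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def is_zero(M):
    """True when every element of M is zero."""
    return all(a == 0 for row in M for a in row)

A = [[1, -1, 0], [3, -2, -1]]
B = [[1, 2], [1, 2], [1, 2]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)            # [[0, 0], [0, 0]]
print(BA)            # [[7, -5, -2], [7, -5, -2], [7, -5, -2]]
print(is_zero(AB), is_zero(A), is_zero(B), is_zero(BA))  # True False False False
```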
We state the following results concerning sums and products, but proofs are
omitted:
Self-test 7.2
If
\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 3 & 0 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 2 & -1 \\ 1 & -1 \end{bmatrix}, \]
calculate AB and BA.
The transpose of any matrix is one in which the rows and columns are inter-
changed. Thus the first row becomes the first column, the second row the second
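column, and so on. We denote the transpose of A by $A^T$. Hence,
\[ \text{if}\quad A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \quad\text{then}\quad A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{bmatrix}. \]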
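Also, note that
\[ A^T + (B^T)^T = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 1 & 1 \end{bmatrix} + \begin{bmatrix} 3 & -1 & 0 \\ 1 & 2 & -2 \end{bmatrix} = \begin{bmatrix} 4 & -1 & -1 \\ 3 & 3 & -1 \end{bmatrix} = (A + B^T)^T. \]
\[ (AB)^T = \left( \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 3 & -1 & 0 \\ 1 & 2 & -2 \end{bmatrix} \right)^T = \begin{bmatrix} 5 & 3 & -4 \\ 1 & 2 & -2 \\ -2 & 3 & -2 \end{bmatrix}^T = \begin{bmatrix} 5 & 1 & -2 \\ 3 & 2 & 3 \\ -4 & -2 & -2 \end{bmatrix}, \]
\[ B^TA^T = \begin{bmatrix} 3 & 1 \\ -1 & 2 \\ 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 \\ 2 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 1 & -2 \\ 3 & 2 & 3 \\ -4 & -2 & -2 \end{bmatrix}. \]
Hence $(AB)^T = B^TA^T$.
1. Properties of the transpose. Provided that the sum A + B and product AB are defined for two matrices A and B, the last example points to the following two results concerning transposes:
(a) $(A + B)^T = A^T + B^T$;
(b) $(AB)^T = B^TA^T$.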
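Both transpose results can be checked on the matrices of the preceding example. A minimal sketch, assuming list-of-rows storage; `transpose`, `add`, and `matmul` are illustrative helper names.

```python
def transpose(M):
    """Interchange rows and columns: row i of M becomes column i."""
    return [list(col) for col in zip(*M)]

def add(M, N):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(M, N)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [0, 1], [-1, 1]]      # 3 x 2
B = [[3, -1, 0], [1, 2, -2]]       # 2 x 3

# (a) transpose of a sum, applied to A and B^T (both 3 x 2)
lhs_sum = transpose(add(A, transpose(B)))
rhs_sum = add(transpose(A), B)
# (b) transpose of a product reverses the order of the factors
lhs_prod = transpose(matmul(A, B))
rhs_prod = matmul(transpose(B), transpose(A))

print(lhs_sum == rhs_sum)    # True
print(lhs_prod == rhs_prod)  # True
print(lhs_prod)              # [[5, 1, -2], [3, 2, 3], [-4, -2, -2]]
```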
\[ A = \begin{bmatrix} 1 & 3 & -2 \\ 3 & 2 & 4 \\ -2 & 4 & -1 \end{bmatrix} \]
is a 3 × 3 symmetric matrix.
A square matrix A for which $A = -A^T$ is said to be skew-symmetric. Note that, if A is any square matrix, then $A + A^T$ is symmetric and $A - A^T$ is skew-symmetric. The elements along the leading diagonal of a skew-symmetric matrix must all be zero. Thus
\[ \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & -3 \\ -2 & 3 & 0 \end{bmatrix} \]
is skew-symmetric.
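The remark that $A + A^T$ is symmetric and $A - A^T$ is skew-symmetric can be tested on any square matrix. In the sketch below the matrix `M` is an arbitrary illustrative choice, not one from the text, and the helper names are assumptions. Halving the two parts (a standard consequence, not stated above) recovers M as a symmetric part plus a skew-symmetric part.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def combine(M, N, sign):
    """Return M + sign * N, element by element."""
    return [[a + sign * b for a, b in zip(r, s)] for r, s in zip(M, N)]

def is_symmetric(M):
    return M == transpose(M)

def is_skew_symmetric(M):
    return M == [[-a for a in row] for row in transpose(M)]

M = [[1, 4, 0], [2, 3, -1], [6, 5, 2]]   # arbitrary square matrix

P = combine(M, transpose(M), +1)         # M + M^T
Q = combine(M, transpose(M), -1)         # M - M^T
half = Fraction(1, 2)
S = [[half * a for a in row] for row in P]   # symmetric part
K = [[half * a for a in row] for row in Q]   # skew-symmetric part

print(is_symmetric(P), is_skew_symmetric(Q))   # True True
print(combine(S, K, +1) == M)                  # True
```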
3. Row and column vectors. As we defined them in Section 7.1, a row vector is a matrix with one row, and a column vector is one with one column. For vectors, we usually use bold-faced small letters and write, for example,
\[ a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}, \quad b = \begin{bmatrix} b_1 & b_2 & \dots & b_n \end{bmatrix}. \]
Example 7.8 If
\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 3 & 1 & -4 \\ -1 & 2 & 1 \end{bmatrix}, \quad x = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \quad d = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \]
find the set of equations for x, y, z represented by Ax = d.
The matrix equation in full is
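\[ \begin{bmatrix} 1 & -1 & 2 \\ 3 & 1 & -4 \\ -1 & 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \]
or
\[ \begin{bmatrix} x - y + 2z \\ 3x + y - 4z \\ -x + 2y + z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}. \]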
The set of linear equations for x, y, z is
x − y + 2z = 2,
3x + y − 4z = 1,
−x + 2y + z = −1.
4. Diagonal matrices. A square matrix all of whose elements off the leading
diagonal are zero is called a diagonal matrix. Thus, if A = [aij] is an n × n matrix,
then A is diagonal if aij = 0 for all i ≠ j. Hence
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]
is a diagonal matrix.
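5. Identity matrix. The diagonal matrix with all diagonal elements 1 is called the identity or unit matrix $I_n$. Hence, the 3 × 3 identity is
\[ I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
(If there is no confusion likely to arise, $I_n$ or $I_3$ are simply replaced by the universal symbol I.) The reason for the definition becomes clear if we multiply a 3 × 3 matrix by $I_3$. If A is a general 3 × 3 matrix, then
\[ AI_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = A. \]
Similarly $I_3A = A$.
A need not be square: provided that the products are defined, AI = A and IA = A for the appropriate identity matrix in each case.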
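\[ A = \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}, \]
then
\[ A^2 = \begin{bmatrix} d_1^2 & 0 & 0 \\ 0 & d_2^2 & 0 \\ 0 & 0 & d_3^2 \end{bmatrix}, \quad A^3 = \begin{bmatrix} d_1^3 & 0 & 0 \\ 0 & d_2^3 & 0 \\ 0 & 0 & d_3^3 \end{bmatrix}, \quad\text{etc.} \]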
Self-test 7.3
If the symmetric matrices A and B are given by
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & -1 & 1 \\ 3 & 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 1 & 1 \\ 1 & -1 & 3 \\ 1 & 3 & 0 \end{bmatrix}, \]
compute 2A + 3B, $A^2$, AB, BA and AB + BA. Which of these matrices is symmetric?
If A and B are square matrices, each of order n × n, which satisfy the equations
\[ AB = BA = I_n, \tag{7.5} \]
then B is called the inverse of A. We say the inverse because, if a matrix B exists with this property, then it is uniquely determined by A (although we shall not prove this here). We write $B = A^{-1}$ (not $B = I_n/A$). Since the definition is ‘symmetric’, it follows that A is the inverse of B, that is $A = B^{-1}$. The inverse matrix defines ‘division’ for matrices, but analogies with numbers must not be taken too far. It is a particularly useful operation since it enables us to manipulate matrix equations. Thus, if AB = C, and the inverse of B exists, then we can solve the equation and find A as $A = CB^{-1}$.
It appears that we need to find a matrix B which satisfies both equations in (7.5). However it can be proved that if $AB = I_n$, then $BA = I_n$, and conversely (see Whitelaw (1983) for a proof of this). Hence it is sufficient that either $AB = I_n$ or $BA = I_n$.
How do we find the inverse? Does it always exist? Let us look first at the case in
which A is a 2 × 2 matrix, and consider the equation
Ax = d,
where
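\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad d = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}. \]
Thus
\[ a_{11}x_1 + a_{12}x_2 = d_1, \tag{7.7} \]
\[ a_{21}x_1 + a_{22}x_2 = d_2. \tag{7.8} \]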
These are linear equations in the unknowns x1 and x2: we shall say more about
their solution in Chapter 12. Eliminate x2 by multiplying (7.7) by a22, (7.8) by a12,
and by subtracting the two equations so that
(a11a22 − a21a12)x1 = a22d1 − a12d2.
Similarly, elimination of x1 leads to
−(a11a22 − a21a12)x2 = a21d1 − a11d2.
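Provided that $a_{11}a_{22} - a_{21}a_{12} \neq 0$, it follows that
\[ x_1 = \frac{a_{22}d_1 - a_{12}d_2}{a_{11}a_{22} - a_{21}a_{12}}, \quad x_2 = \frac{-a_{21}d_1 + a_{11}d_2}{a_{11}a_{22} - a_{21}a_{12}}. \]
\[ x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \frac{1}{a_{11}a_{22} - a_{21}a_{12}} \begin{bmatrix} a_{22}d_1 - a_{12}d_2 \\ -a_{21}d_1 + a_{11}d_2 \end{bmatrix} = Cd, \]
where
\[ C = \frac{1}{\det A} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}, \quad\text{with}\quad \det A = a_{11}a_{22} - a_{21}a_{12}. \]
\[ A^{-1} = C = \frac{1}{\det A} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}. \tag{7.9} \]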
It is worth remembering the rule for 2 × 2 matrices by which $A^{-1}$ can be constructed from A.
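The 2 × 2 rule (interchange the diagonal elements, change the sign of the off-diagonal ones, divide by the determinant) can be sketched as follows. Exact fractions are used to avoid rounding; the names `det2` and `inverse2` and the test matrices are illustrative assumptions, and integer elements are assumed.

```python
from fractions import Fraction

def det2(A):
    """det A = a11*a22 - a21*a12 for a 2 x 2 matrix."""
    (a11, a12), (a21, a22) = A
    return a11 * a22 - a21 * a12

def inverse2(A):
    """Inverse by rule (7.9); raises if det A = 0."""
    d = det2(A)
    if d == 0:
        raise ValueError("det A = 0: the matrix has no inverse")
    (a11, a12), (a21, a22) = A
    f = Fraction(1, d)
    return [[f * a22, -f * a12], [-f * a21, f * a11]]

A = [[2, 1], [5, 3]]          # det A = 1
print(det2(A))                # 1
print(inverse2(A) == [[3, -1], [-5, 2]])   # True
```

A matrix such as `[[4, 2], [2, 1]]` has zero determinant, so `inverse2` raises `ValueError` for it.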
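The number $\det A = a_{11}a_{22} - a_{21}a_{12}$ is known as the determinant of A. It may also be written directly in terms of the corresponding matrix as
\[ \det A = \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}. \]
\[ \det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}. \]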
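\[ A = \begin{bmatrix} 1 & 3 \\ -1 & 4 \end{bmatrix}, \]
is singular or not. If it is non-singular, find its inverse.
Here $a_{11} = 1$, $a_{12} = 3$, $a_{21} = -1$, and $a_{22} = 4$. Hence
\[ \det A = \begin{vmatrix} 1 & 3 \\ -1 & 4 \end{vmatrix} = 1 \times 4 - (-1) \times 3 = 4 + 3 = 7. \]
Since det A ≠ 0, then A is non-singular. Its inverse is, by the rule above,
\[ A^{-1} = \frac{1}{7} \begin{bmatrix} 4 & -3 \\ 1 & 1 \end{bmatrix}. \]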
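Example 7.10 If
\[ A = \begin{bmatrix} 1 & 3 \\ -1 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix}, \]
find $A^{-1}$, $B^{-1}$, and $(AB)^{-1}$.
Always check the determinants first. Here
\[ \det A = 4 - (-3) = 7, \quad \det B = -1 - 2 = -3. \]
These are not zero, so A and B have inverses:
\[ A^{-1} = \frac{1}{7} \begin{bmatrix} 4 & -3 \\ 1 & 1 \end{bmatrix}, \quad B^{-1} = -\frac{1}{3} \begin{bmatrix} -1 & -2 \\ -1 & 1 \end{bmatrix}. \]
Also
\[ AB = \begin{bmatrix} 1 & 3 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 3 & -6 \end{bmatrix}. \]
Thus, det(AB) = −24 + 3 = −21, and
\[ (AB)^{-1} = -\frac{1}{21} \begin{bmatrix} -6 & 1 \\ -3 & 4 \end{bmatrix}. \]
Note that
\[ B^{-1}A^{-1} = -\frac{1}{21} \begin{bmatrix} -1 & -2 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ 1 & 1 \end{bmatrix} = -\frac{1}{21} \begin{bmatrix} -6 & 1 \\ -3 & 4 \end{bmatrix} = (AB)^{-1}. \]
This last result suggests the following correct rule for the inverse of the product of two square matrices, namely:
\[ (AB)^{-1} = B^{-1}A^{-1}. \tag{7.11} \]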
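Rule (7.11) can be confirmed for the matrices of Example 7.10, and the same check shows that the factors must not be taken in the other order. A minimal sketch with exact fractions; `matmul` and `inverse2` are illustrative helper names.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def inverse2(A):
    """2 x 2 inverse by the determinant rule; assumes det A != 0."""
    (a11, a12), (a21, a22) = A
    f = Fraction(1, a11 * a22 - a21 * a12)
    return [[f * a22, -f * a12], [-f * a21, f * a11]]

A = [[1, 3], [-1, 4]]
B = [[1, 2], [1, -1]]

lhs = inverse2(matmul(A, B))               # (AB)^-1
rhs = matmul(inverse2(B), inverse2(A))     # B^-1 A^-1
wrong = matmul(inverse2(A), inverse2(B))   # A^-1 B^-1

print(lhs == rhs)     # True
print(lhs == wrong)   # False
```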
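For the inverse of a 3 × 3 matrix we can adopt the same approach as for the 2 × 2 case by eliminating x1, x2, … successively between the set of equations Ax = d or
\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= d_1, \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= d_2, \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= d_3. \end{aligned} \tag{7.12} \]
The result is
\[ x = A^{-1}d, \]
where
\[ A^{-1} = \frac{1}{\det A} \begin{bmatrix} a_{22}a_{33} - a_{32}a_{23} & -(a_{12}a_{33} - a_{32}a_{13}) & a_{12}a_{23} - a_{22}a_{13} \\ -(a_{21}a_{33} - a_{31}a_{23}) & a_{11}a_{33} - a_{31}a_{13} & -(a_{11}a_{23} - a_{21}a_{13}) \\ a_{21}a_{32} - a_{31}a_{22} & -(a_{11}a_{32} - a_{31}a_{12}) & a_{11}a_{22} - a_{21}a_{12} \end{bmatrix} \tag{7.13} \]
with
\[ \det A = a_{11}(a_{22}a_{33} - a_{32}a_{23}) - a_{12}(a_{21}a_{33} - a_{31}a_{23}) + a_{13}(a_{21}a_{32} - a_{31}a_{22}). \]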
Equation (7.13) gives the inverse matrix, as can be verified by calculation of the products $AA^{-1}$ and $A^{-1}A$. Even for 3 × 3 matrices, the formula for the inverse is quite complicated. If det A = 0, then the matrix is called singular, and has no inverse.
Determinants, which have arisen in the context of inverse matrices, have important properties, particularly with regard to their evaluation. They will be discussed in more detail in the next chapter.
Hence B = A−1.
Note that we need only verify that either BA = I3 or AB = I3, not both. If BA = I3, then
AB = I3, and vice versa.
Self-test 7.4
The matrix A is given by
\[ A = \begin{bmatrix} 0 & a & 0 & 0 \\ 0 & 0 & 0 & b \\ c & 0 & 0 & 0 \\ 0 & 0 & d & 0 \end{bmatrix}, \]
where a, b, c, d are non-zero constants. Compute $A^2$, $A^3$ and $A^4$. Hence find the inverse of A.
Problems
7.1 (Section 7.1). The matrix $A = [a_{ij}]$ is given by
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ -1 & 0 & 1 \\ 2 & -2 & 4 \\ 1 & 5 & -3 \end{bmatrix}. \]
Identify the elements $a_{13}$ and $a_{31}$.

7.2 (Section 7.2). Solve the equation A = B, where
\[ A = \begin{bmatrix} 1 & -2 \\ 3 & 1 \\ -1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & x \\ y - x & 1 \\ -1 & 2 \end{bmatrix}, \]
for x and y.

7.3 (Section 7.2). Given that
\[ A = \begin{bmatrix} 1 & 2 & -3 \\ -1 & 0 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & -1 & 3 \\ 4 & 1 & 2 \end{bmatrix}, \]
find the matrices A + B, A − B, and 2A − 3B.

7.4 (Section 7.2). Given that
\[ A = \begin{bmatrix} 1 & 3 & 0 \\ 2 & 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 \\ 2 & 1 \\ -1 & -1 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 1 \\ -1 & 1 \\ 0 & 1 \end{bmatrix}, \]
verify the distributive law A(B + C) = AB + AC for the three matrices.

\[ A = \begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} -2 & -1 \\ 4 & 2 \end{bmatrix}. \]
Show that AB = 0, but that BA ≠ 0.

7.7 (Section 7.3). Let
\[ A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & -1 & 2 \\ -2 & 1 & 1 \end{bmatrix}. \]
Find a matrix C such that A + C is the identity matrix $I_3$. Deduce that AC = CA. Find AC, and hence the matrix $A^2 + C^2$.

7.8 (Section 7.3). A general n × n matrix is given by $A = [a_{ij}]$. Show that $A + A^T$ is a symmetric matrix, and that $A - A^T$ is skew-symmetric. Express the matrix
\[ A = \begin{bmatrix} 2 & 1 & 3 \\ -2 & 0 & 1 \\ 3 & 1 & 2 \end{bmatrix} \]
as the sum of a symmetric matrix and a skew-symmetric matrix.

7.9 (Section 7.3). Let
\[ A = \begin{bmatrix} 1 & 3 \\ -1 & 2 \\ 0 & 1 \end{bmatrix}. \]
Write down $A^T$, and find the products $AA^T$ and $A^TA$.

7.10 (Section 7.3). If
\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 3 & 0 & 1 \\ -1 & 2 & -3 \end{bmatrix}, \quad x = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \quad d = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}, \]
write down the set of equations defined by Ax = d. Confirm that the same set of equations is given by $x^TA^T = d^T$.

\[ A = \begin{bmatrix} 2 & 0 & 1 \\ 2 & -2 & 2 \\ 0 & 4 & -4 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{4} \\ 1 & -1 & -\frac{1}{4} \\ 1 & -1 & -\frac{1}{2} \end{bmatrix}, \]
find the products AB and BA, and confirm that B is the inverse of A.

7.13 Let
\[ A = \begin{bmatrix} 2 & 0 & 1 \\ 2 & -2 & 2 \\ 0 & 4 & 1 \end{bmatrix}. \]
Find the powers $A^2$ and $A^3$, and verify that
\[ A^3 - A^2 - 12A = -12I_3. \]
Hence find the inverse matrix $A^{-1}$ by multiplying the equation on both sides by $A^{-1}$.

7.14 (Section 7.4). Using the rule for inverses of 2 × 2 matrices, write down the inverses of:
(a) $\begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$; (b) $\begin{bmatrix} 2 & 3 \\ -7 & 11 \end{bmatrix}$; (c) $\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}$; (d) $\begin{bmatrix} 10 & -7 \\ 8 & 0 \end{bmatrix}$; (e) $\begin{bmatrix} -99 & 100 \\ 97 & 98 \end{bmatrix}$.

7.15 (Section 7.4). The sparsely filled matrix A is given by
\[ A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]
Thinking about the row-on-column rule for matrix multiplication, can you guess the columns in the inverse matrix $A^{-1}$? How would this rule generalize to the matrix
\[ A = \begin{bmatrix} 0 & a & 0 & 0 \\ 0 & 0 & b & 0 \\ c & 0 & 0 & 0 \\ 0 & 0 & 0 & d \end{bmatrix}? \]

7.16 (Section 7.4). Write down the set of equations given by Ax = d, where
\[ A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & -2 & 2 \\ 1 & 0 & 1 \end{bmatrix}, \quad x = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \quad d = \begin{bmatrix} 6 \\ 3 \\ -9 \end{bmatrix}. \]
Find $A^{-1}$ and calculate the product $A^{-1}d$. What is the solution of the equation?

7.17 (Section 7.4). If A and B are both n × n matrices with A non-singular, show that
\[ (A^{-1}BA)^2 = A^{-1}B^2A. \]
Let $A = \begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 \\ -1 & 0 \end{bmatrix}$. Calculate $A^{-1}B^4A$.

7.18 (Section 7.4). For interpolation purposes for given data, it is required that the parabola $y = a + bx + cx^2$ should pass through the three points with coordinates $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$ in the (x, y) plane. Show that the matrix equation for the constants a, b, and c can be written as
\[ \begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}. \]
Verify that the inverse of the 3 × 3 matrix on the left is
\[ \begin{bmatrix} \dfrac{x_2x_3}{(x_2 - x_1)(x_3 - x_1)} & \dfrac{x_3x_1}{(x_3 - x_2)(x_1 - x_2)} & \dfrac{x_1x_2}{(x_1 - x_3)(x_2 - x_3)} \\[2ex] -\dfrac{x_2 + x_3}{(x_2 - x_1)(x_3 - x_1)} & -\dfrac{x_3 + x_1}{(x_3 - x_2)(x_1 - x_2)} & -\dfrac{x_1 + x_2}{(x_1 - x_3)(x_2 - x_3)} \\[2ex] \dfrac{1}{(x_2 - x_1)(x_3 - x_1)} & \dfrac{1}{(x_3 - x_2)(x_1 - x_2)} & \dfrac{1}{(x_1 - x_3)(x_2 - x_3)} \end{bmatrix}, \]
provided that certain conditions are met. What are they, and what implications have they for the given points in the plane? Find a, b, and c in terms of the given data. Find the equation of the parabola through the points (−2, 0), (1, −2), (3, 4).

7.19 The elements in a 3 × 3 matrix $A = [a_{ij}]$ are given by the rule
\[ a_{ij} = (-j)^i - ij. \]
Write down the matrix A. Calculate det A and the inverse of A.

7.20 If
\[ A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & -1 & 2 \\ 1 & 2 & 1 \end{bmatrix}, \]
show that $A^3 - 2A^2 - 9A = 0$, but that $A^2 - 2A - 9I_3 \neq 0$. Does the inverse of A exist?

7.21 An nth-order square matrix A satisfies $A^2 = A$ and $A \neq I_n$. Show that
(a) det A = 0;
(b) $(I_n + A)^{-1} = I_n - \frac{1}{2}A$;
(c) $(I_n + A)^m = I_n + (2^m - 1)A$ for any positive integer m.

7.22 Let
\[ A_1 = \begin{bmatrix} x_1 & y_1 \\ -y_1 & x_1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} x_2 & y_2 \\ -y_2 & x_2 \end{bmatrix}. \]
Calculate $A_1 + A_2$, $A_1A_2$, $A_2A_1$, $A_1^{-1}$. Compare your results with $z_1 + z_2$, $z_1z_2$, and $1/z_1$, where $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ are complex numbers (see Chapter 6). Consider the possibility of developing further parallels, such as to $|z|$ and $e^z$.
8 Determinants
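Expansion of 2 × 2 determinant
\[ \det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{21}a_{12}. \tag{8.1} \]
(The notation |A| is also used extensively for the determinant.) For the 3 × 3 matrix
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \]
the determinant is
\[ \det A = a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}. \tag{8.2} \]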
In (8.2) there are six terms, each of which is the product of three elements, one from each row and one from each column. In other words, there are never two elements in any term from the same row or column. It can be seen that there must be just 3 × 2 × 1 = 6 terms of this form, because the element from row 1 can be chosen in three ways, the element from row 2 in two ways (from the two remaining columns), and the element from row 3 is then determined.
Each term is prefixed by either +1 or −1. This is decided according to the following rule. Write each term in the form
\[ a_{1j_1}a_{2j_2}a_{3j_3}, \]
in which the first suffixes are in consecutive increasing order. Examine the permutations of the second suffixes $j_1 j_2 j_3$ (see Section 1.17). The permutation is said to be even (odd) if it has an even (odd) number of inversions. An inversion occurs whenever a larger integer precedes a smaller one. Thus the permutation 132 is odd, since 3 precedes 2, but 312 is even since there are two inversions because 3 precedes 1 and 2. If the number of inversions is even, then a + sign is attached; if the number is odd, then a − sign is attached. This rule can be extended to a determinant of any order.
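The sign rule can be turned into a direct, if inefficient, way of evaluating a determinant of any order: sum over all permutations of the column suffixes, attaching the sign given by the parity of the inversion count. This is a sketch only; `inversions` and `det_by_permutations` are illustrative names, and columns are numbered from 0 inside the code.

```python
from itertools import permutations

def inversions(perm):
    """Count the pairs in which a larger integer precedes a smaller one."""
    return sum(1 for i in range(len(perm))
                 for j in range(i + 1, len(perm))
                 if perm[i] > perm[j])

def det_by_permutations(A):
    """Sum of signed products, one element from each row and each column."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = -1 if inversions(perm) % 2 else 1   # odd -> minus, even -> plus
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total

print(inversions((1, 3, 2)))   # 1, so 132 is odd
print(inversions((3, 1, 2)))   # 2, so 312 is even
print(det_by_permutations([[1, 2, 3], [2, -1, 3], [-1, 4, -2]]))   # 13
```

The number of terms grows as n!, which is why this form says more about the structure of a determinant than about how to compute it.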
While this expansion of the determinant says something about the structure
of the determinant, it is not really a practical rule for evaluating determinants.
Returning to (8.2), we can rewrite det A as
det A = a11(a22a33 − a32a23) − a12(a21a33 − a31a23) + a13(a21a32 − a31a22).
The terms in brackets are themselves 2 × 2 determinants. Thus
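Expansion of 3 × 3 determinant
\[ \det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}. \tag{8.3} \]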
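This expression is called an expansion by the top row. The term associated with $a_{11}$, namely
\[ C_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \]
is called the cofactor of $a_{11}$. Similarly, the cofactors of $a_{12}$ and $a_{13}$ are
\[ C_{12} = -\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}, \quad C_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}, \]
where the signs attached should be noted. In the same way the cofactors of the elements in the second and third rows are defined as follows:
\[ C_{21} = -\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}, \quad C_{22} = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}, \quad C_{23} = -\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix}, \]
\[ C_{31} = \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}, \quad C_{32} = -\begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}, \quad C_{33} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}. \]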
The signs associated with the cofactors alternate, starting with a + at the top left as we move across or down from the top left-hand corner as shown:
\[ \begin{matrix} + & - & + \\ - & + & - \\ + & - & + \end{matrix} \]
The sign associated with $C_{ij}$ is + if i + j is even and − if i + j is odd, and can be expressed as $(-1)^{i+j}$.
For example, if det A is expanded by the third column, then
det A = a13C13 + a23C23 + a33C33.
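\[ \det A = \begin{vmatrix} 1 & 2 & k \\ 2 & -1 & 3 \\ -1 & 4 & -2 \end{vmatrix}, \]
for any k. Find the value of k for which the determinant is zero.
Expanding by the first row gives
\[ \det A = \begin{vmatrix} 1 & 2 & k \\ 2 & -1 & 3 \\ -1 & 4 & -2 \end{vmatrix} = 1 \times \begin{vmatrix} -1 & 3 \\ 4 & -2 \end{vmatrix} - 2 \times \begin{vmatrix} 2 & 3 \\ -1 & -2 \end{vmatrix} + k \times \begin{vmatrix} 2 & -1 \\ -1 & 4 \end{vmatrix} \]
\[ = 1 \times (2 - 12) - 2 \times (-4 + 3) + k \times (8 - 1) = -10 + 2 + 7k = -8 + 7k. \]
Hence det A = 0 if $k = \frac{8}{7}$.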
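Expansion by the top row lends itself to a short recursive routine, since each cofactor is itself a smaller determinant. The sketch below (helper names `minor`, `det`, and `example` are assumptions) reproduces the result det A = −8 + 7k for a few values of k, using exact fractions for k = 8/7.

```python
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j (0-based) from A."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by expansion along the top row."""
    if len(A) == 1:
        return A[0][0]
    # The sign (-1)**j alternates +, -, +, ... across the top row.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def example(k):
    return [[1, 2, k], [2, -1, 3], [-1, 4, -2]]

print(det(example(0)))                 # -8
print(det(example(1)))                 # -1
print(det(example(Fraction(8, 7))))    # 0
```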
Self-test 8.1
Let
\[ A = \begin{bmatrix} 1 & 2 & k \\ k & 2 & -1 \\ 1 & 2 & 1 \end{bmatrix}. \]
Find det A and the value of k for which det A = 0.
and all terms in this expansion can be identified with those in (8.2). Hence
\[ \det A^T = \det A. \]
\[ \det A = \begin{vmatrix} 1 & 28 & -29 \\ 0 & 1 & -4 \\ 0 & -2 & 5 \end{vmatrix}. \]
Since the determinant has two zeros in the first column, it is advantageous to use Rule 1. The determinant of the transpose of A is given by
\[ \det A^T = \begin{vmatrix} 1 & 0 & 0 \\ 28 & 1 & -2 \\ -29 & -4 & 5 \end{vmatrix}, \]
which now has two zeros in the first row. Hence the expansion by the top row becomes particularly easy:
\[ \det A^T = 1 \times \begin{vmatrix} 1 & -2 \\ -4 & 5 \end{vmatrix} = 5 - 8 = -3. \]
2. Scalar factor. If every element of any single row or column of the matrix A is
multiplied by a scalar k, then the determinant of this matrix is k det A.
(Note: this rule is different from Rule 2, Section 7.2, for matrices.)
This is a self-evident result, since just one element from every row and column
appears in every term. Thus, by (8.2), if every element of the second row in A is
multiplied by k, then
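\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ ka_{21} & ka_{22} & ka_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}ka_{22}a_{33} - a_{11}ka_{23}a_{32} - a_{12}ka_{21}a_{33} + a_{12}ka_{23}a_{31} + a_{13}ka_{21}a_{32} - a_{13}ka_{22}a_{31} = k \det A. \]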
By putting k = 0 in this result, note that any determinant must have zero value
if all the elements of any row or column are zeros.
Conversely, a common factor can be taken out as a multiplier of the determinant.
Since the second column obviously has a factor of 11, then we can remove this factor
from the second column before expansion. Thus, by Rule 2,
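\[ \Delta = 11 \times \begin{vmatrix} -1 & 9 & 1 \\ 2 & 3 & -2 \\ 3 & 5 & 1 \end{vmatrix} = 11 \times \left( (-1) \times \begin{vmatrix} 3 & -2 \\ 5 & 1 \end{vmatrix} - 9 \times \begin{vmatrix} 2 & -2 \\ 3 & 1 \end{vmatrix} + 1 \times \begin{vmatrix} 2 & 3 \\ 3 & 5 \end{vmatrix} \right) \]
\[ = 11 \times [-(3 + 10) - 9 \times (2 + 6) + (10 - 9)] = 11 \times (-13 - 72 + 1) = -924. \]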
3. Row/column exchange. If B is obtained from A by interchanging two rows
(or columns) then det B = −det A.
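Then, by analogy with (8.2), the expansion of B by its first row is given by
\[ \det B = a_{31} \begin{vmatrix} a_{22} & a_{23} \\ a_{12} & a_{13} \end{vmatrix} - a_{32} \begin{vmatrix} a_{21} & a_{23} \\ a_{11} & a_{13} \end{vmatrix} + a_{33} \begin{vmatrix} a_{21} & a_{22} \\ a_{11} & a_{12} \end{vmatrix} \]
\[ = a_{31}a_{22}a_{13} - a_{31}a_{23}a_{12} - a_{32}a_{21}a_{13} + a_{32}a_{23}a_{11} + a_{33}a_{21}a_{12} - a_{33}a_{22}a_{11}. \]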
These are the same terms as those present in (8.2) except that the sign of every
term is changed. Therefore in this case
det B = −det A.
The same is true whichever row or column pairs are exchanged.
The rule applies to a determinant of any order.
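Rules 1 to 3 can be checked numerically on a small example. The sketch below uses a recursive top-row expansion as the determinant routine (an illustrative choice) and the 3 × 3 matrix whose determinant was found above to be −3; the multiplier 7 is arbitrary.

```python
def det(A):
    """Determinant by expansion along the top row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1, 28, -29], [0, 1, -4], [0, -2, 5]]

row_scaled = [A[0], [7 * a for a in A[1]], A[2]]     # one row times 7
all_scaled = [[7 * a for a in row] for row in A]     # the matrix 7A
swapped = [A[2], A[1], A[0]]                         # rows 1 and 3 exchanged

print(det(A))              # -3
print(det(transpose(A)))   # -3    (Rule 1)
print(det(row_scaled))     # -21   (Rule 2: factor 7)
print(det(all_scaled))     # -1029 (every row scaled: factor 7**3)
print(det(swapped))        # 3     (Rule 3: sign change)
```

The fourth line illustrates the note under Rule 2: multiplying the whole matrix by k, as in Rule 2 of Section 7.2, multiplies a 3 × 3 determinant by k³, not by k.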
There are several ways of approaching the evaluation of this determinant since the
second row and third column each have three zeros. It is obviously advantageous to
have as many zeros as possible in the top row. With this in view, interchange rows 1
and 2 using Rule 3:
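\[ \Delta = -\begin{vmatrix} 0 & 2 & 0 & 0 \\ 1 & 2 & 1 & 2 \\ -1 & 3 & 0 & 4 \\ -1 & 2 & 0 & -1 \end{vmatrix}. \]
Expanding by row 1, remembering the sign rule for cofactors:
\[ \Delta = 2 \times \begin{vmatrix} 1 & 1 & 2 \\ -1 & 0 & 4 \\ -1 & 0 & -1 \end{vmatrix}. \]
Now successively use Rule 1 and interchange rows with columns, and then Rule 3 and interchange the new rows 1 and 2:
\[ \Delta = 2 \times \begin{vmatrix} 1 & -1 & -1 \\ 1 & 0 & 0 \\ 2 & 4 & -1 \end{vmatrix} = (-2) \times \begin{vmatrix} 1 & 0 & 0 \\ 1 & -1 & -1 \\ 2 & 4 & -1 \end{vmatrix} = (-2) \times \begin{vmatrix} -1 & -1 \\ 4 & -1 \end{vmatrix} = (-2) \times (1 + 4) = -10. \]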
4. Expansion by any row or column. From (8.2), by grouping the terms differently, we can write, for example,
\[ \det A = a_{31}(a_{12}a_{23} - a_{13}a_{22}) - a_{32}(a_{11}a_{23} - a_{13}a_{21}) + a_{33}(a_{11}a_{22} - a_{12}a_{21}) \]
\[ = a_{31} \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix} - a_{32} \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} + a_{33} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} \]
in terms of cofactors. Here the elements $a_{31}$, $a_{32}$, $a_{33}$ constitute the third row, and we call this the expansion of det A by the third row.
It can be shown that the expansion can be written down similarly using any
row or column. Thus
det A = a12C12 + a22C22 + a32C32
is an expansion by the second column.
Linear dependence
The rows r1, r2, r3 are said to be linearly dependent if α, β, γ, not all zero, exist
such that α r1 + β r2 + γ r3 = (0, 0, 0).
Otherwise the three rows are linearly independent. (8.4)
linearly dependent. There may be more than one such relation if the dimension n > 3; in all cases the whole group of n rows is said to be linearly dependent.
Notice that the same terminology is applicable to any collection of n-dimensional vectors (see Chapters 9, 10, 11). For example, if three non-zero three-dimensional vectors are coplanar, they are linearly dependent, and conversely, if they are linearly dependent, they are coplanar.
We return to property 5.
5. Zero determinant. If the rows (or columns) of A are linearly dependent, then det A = 0.
The simplest case is when two rows are identical. By Rule 3, interchange of two rows changes the sign of det A; but if the rows are identical, the interchange leaves the matrix unchanged. Therefore det A = −det A. This is possible only if det A = 0.
Suppose one row is a multiple k ≠ 0 of another. Then by Rule 2 we may take k out as a factor, leaving k times a determinant with two identical rows, which is zero by the previous case. Hence det A = 0. (There may be more than one such relation.)
We prove the general case of linear dependence, illustrating it for three dimensions, using the notation $A = [a_{ij}]$ for the matrix elements and $C_{ij}$ for the corresponding cofactors as in Section 8.1. Suppose, for example, that the first row consists of a linear combination of the second and third rows, so that $a_{1j} = pa_{2j} + qa_{3j}$, j = 1, 2, 3, where p and q are constants, not both zero.
Expand the determinant det A by the top row:
\[ \det A = (pa_{21} + qa_{31})C_{11} + (pa_{22} + qa_{32})C_{12} + (pa_{23} + qa_{33})C_{13} \]
\[ = p(a_{21}C_{11} + a_{22}C_{12} + a_{23}C_{13}) + q(a_{31}C_{11} + a_{32}C_{12} + a_{33}C_{13}) \]
in which $C_{11}$, $C_{12}$, $C_{13}$ are the cofactors corresponding respectively to $a_{11}$, $a_{12}$, $a_{13}$. Each bracketed expression on the right has the form of an expanded determinant with a repeated row, and is therefore zero, as we showed above. Therefore det A = 0. For dimensions n with n > 3 there will simply be more non-zero coefficients, p, q, r, … , and correspondingly more terms in the expression.
Thus, for example,
\[ \begin{vmatrix} 99 & 18 & 63 \\ 11 & 2 & 7 \\ -2 & 3 & 4 \end{vmatrix} = 9 \begin{vmatrix} 11 & 2 & 7 \\ 11 & 2 & 7 \\ -2 & 3 & 4 \end{vmatrix} \ \text{(by Rule 2)} \ = 0 \ \text{(by Rule 5).} \]
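\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} + ka_{11} & a_{32} + ka_{12} & a_{33} + ka_{13} \end{vmatrix} = (a_{31} + ka_{11})C_{31} + (a_{32} + ka_{12})C_{32} + (a_{33} + ka_{13})C_{33} \quad\text{(expanding by row 3)} \]
\[ = a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} + k(a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33}) \]
\[ = \det A + k \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \end{vmatrix} = \det A, \]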
since the second determinant vanishes by Rule 5 having two rows with the same
elements.
Note that
a11C31 + a12C32 + a13C33 = 0
that is, in its general form, the sum of the products of the elements of one row (or
column) and the cofactors of the elements of another row (column) is zero. This
follows since the left-hand side must arise from a matrix with two identical rows
(columns).
Rule 6 is a particularly useful rule for simplifying the elements in a determinant before expansion and evaluation. We illustrate a number of these points in the next example.
Usually we use the rules (particularly 6) either to introduce zeros into the matrix or to
reduce the size of elements as far as possible. It is important to list the operations in
order to make the sequence of operations intelligible. For this purpose we identify the
current rows by r1, r2, …, and the current columns by c1, c2, … . Denote the new rows
and columns which have been changed by r′1, r′2, … and c′1, c′2, … respectively. There are
many ways of approaching the evaluation of ∆. A first step in this example could be to
add column 3 (c3) to column 2 (c2) since this produces a zero at the top of column 2. This
operation is represented by c′2 = c2 + c3, and we list the operations on the right-hand side
as we proceed. The second operation is to subtract the new row 3 from the new row 2.
A decision is taken at each step in the light of the new matrix. By Rule 6, these
operations do not affect the value of ∆. Hence
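\[ \Delta = \begin{vmatrix} 2 & 99 & -99 \\ 999 & 1000 & 1001 \\ 1000 & 1001 & 998 \end{vmatrix} \]
\[ = \begin{vmatrix} 2 & 0 & -99 \\ 999 & 2001 & 1001 \\ 1000 & 1999 & 998 \end{vmatrix} \quad (c'_2 = c_2 + c_3) \]
\[ = \begin{vmatrix} 2 & 0 & -99 \\ -1 & 2 & 3 \\ 1000 & 1999 & 998 \end{vmatrix} \quad (r'_2 = r_2 - r_3) \]
\[ = \begin{vmatrix} 2 & 0 & -99 \\ -2 & 2 & 2 \\ 0.5 & 1999 & -1.5 \end{vmatrix} \quad (c'_1 = c_1 - \tfrac{1}{2}c_2,\ c'_3 = c_3 - \tfrac{1}{2}c_2) \]
\[ = \begin{vmatrix} 2 & 0 & -93 \\ -2 & 2 & -4 \\ 0.5 & 1999 & 0 \end{vmatrix} \quad (c'_3 = c_3 + 3c_1) \]
\[ = 2(4 \times 1999) - 93[(-2 \times 1999) - 1] = 387\,899. \]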
Note that while r′2 = r2 + kr3 does not affect the value of the determinant,
r′2 = kr2 + r3 will change its value by a factor k.
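The value found in Example 8.7, and the closing remark about the two kinds of row operation, can be confirmed with integer arithmetic. This is a sketch only: `det` is a recursive top-row expansion, the operation r′2 = r2 − r3 is applied here directly to the original determinant, and the factor k = 3 is an arbitrary choice.

```python
def det(A):
    """Determinant by expansion along the top row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

D = [[2, 99, -99], [999, 1000, 1001], [1000, 1001, 998]]

# r2' = r2 - r3: adding a multiple of another row leaves the value unchanged.
unchanged = [D[0], [a - b for a, b in zip(D[1], D[2])], D[2]]

# r2' = k*r2 + r3: scaling the row being replaced multiplies the value by k.
k = 3
scaled = [D[0], [k * a + b for a, b in zip(D[1], D[2])], D[2]]

print(det(D))          # 387899
print(det(unchanged))  # 387899
print(det(scaled))     # 1163697
```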
Self-test 8.2
The determinant
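\[ D_n = \begin{vmatrix} x & a & a & \dots & a \\ a & x & a & \dots & a \\ \dots & \dots & \dots & \dots & \dots \\ a & a & a & \dots & x \end{vmatrix} \]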
has n rows. Factorize Dn in terms of x.
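We can now rewrite the formula for the inverse given in Section 7.4, using cofactors. The transposed matrix of cofactors given by
\[ \operatorname{adj} A = \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix} \]
is called the adjoint (or adjugate) of A. Then, provided that det A ≠ 0,
\[ A\,\frac{\operatorname{adj} A}{\det A} = \frac{1}{\det A} \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix} = \frac{1}{\det A} \begin{bmatrix} \det A & 0 & 0 \\ 0 & \det A & 0 \\ 0 & 0 & \det A \end{bmatrix} = I_3. \]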
This confirmation uses the results that the sum of the products of the elements
of one row and their cofactors is the value of the determinant whilst the sum of
the products of one row and the cofactors of another row is zero.
of hand calculations unless the determinant is only sparsely filled with nonzero
elements or can be reduced to such a determinant. Such computations become a
fertile source of errors. There are computer packages available which will quickly
perform the arithmetic operations for determinants of reasonable size.
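(a) Determinant of A
\[ \det A = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}; \]
(b) Adjoint or adjugate of A
\[ \operatorname{adj} A = \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix}; \]
(c) Inverse of A
\[ A^{-1} = \frac{\operatorname{adj} A}{\det A}. \tag{8.7} \]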
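Formula (8.7) can be sketched as a short routine built from cofactors. The helper names below are illustrative, integer elements are assumed, and exact fractions are used for the division. The matrix and right-hand side are those of Example 7.8, so the last line also gives the solution of that set of equations as $x = A^{-1}d$.

```python
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def cofactor(A, i, j):
    return (-1) ** (i + j) * det(minor(A, i, j))

def adjugate(A):
    """Transposed matrix of cofactors: element (i, j) is C_ji."""
    n = len(A)
    return [[cofactor(A, j, i) for j in range(n)] for i in range(n)]

def inverse(A):
    d = det(A)
    if d == 0:
        raise ValueError("det A = 0: the matrix is singular")
    return [[Fraction(c, d) for c in row] for row in adjugate(A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, -1, 2], [3, 1, -4], [-1, 2, 1]]
d = [[2], [1], [-1]]

print(det(A))                   # 22
print(adjugate(A))              # [[9, 5, 2], [1, 3, 10], [7, -1, 4]]
print(matmul(A, adjugate(A)))   # [[22, 0, 0], [0, 22, 0], [0, 0, 22]]
x = matmul(inverse(A), d)       # x = 21/22, y = -5/22, z = 9/22
```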
Self-test 8.3
Let
\[ A = \begin{bmatrix} 1 & k & 2 \\ 2 & -1 & -2 \\ 1 & -1 & 1 \end{bmatrix}. \]
Find the adjoint and inverse of A, and state the value of k for which the
inverse does not exist. For this value of k calculate the product A adj A. Is the
answer predictable?
Problems
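(a) $\begin{vmatrix} 2 & 3 & 4 \\ 4 & 6 & 8 \\ 1 & -1 & 2 \end{vmatrix}$; (b) $\begin{vmatrix} -1 & 2 & 3 \\ 3 & 1 & -2 \\ -2 & -3 & -1 \end{vmatrix}$; (c) $\begin{vmatrix} a & b & c \\ b & c & a \\ a - b & b - c & c - a \end{vmatrix}$; (d) $\begin{vmatrix} 1 & 1 & 1 \\ 3 & 0 & 0 \\ 5 & 0 & 0 \end{vmatrix}$.

8.3 Given that
\[ \Delta = \begin{vmatrix} a & b & c \\ b & c & a \\ c & a & b \end{vmatrix}, \]
what is the value of
\[ \begin{vmatrix} a^3 & ab & ac^2 \\ ab & c & ac \\ ac & a & bc \end{vmatrix} \]
in terms of ∆?

8.4 Simplify first and then evaluate the following determinants
(a) $\begin{vmatrix} 99 & 100 & 200 \\ 98 & 102 & 199 \\ -1 & 2 & 3 \end{vmatrix}$; (b) $\begin{vmatrix} 77 & 84 & 55 \\ 75 & 87 & 57 \\ 1 & -2 & 3 \end{vmatrix}$; (c) $\begin{vmatrix} 2 & -1 & 1 \\ 99 & 98 & 55 \\ 200 & 197 & 111 \end{vmatrix}$; (d) $\begin{vmatrix} 87 & 84 & 83 & 81 \\ 77 & 76 & 77 & 75 \\ 54 & 53 & 52 & 54 \\ -43 & -44 & -46 & -4 \end{vmatrix}$.

8.5 Explain why the determinant
\[ \Delta = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} \]
has factors b − c, c − a, and a − b. Express the value of ∆ as the product of factors.

8.6 Factorize the determinant
\[ \Delta = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^3 & b^3 & c^3 \end{vmatrix}. \]

8.7 Explain, using one of the rules for determinants, why the equation
\[ \begin{vmatrix} x & y & 1 \\ a_1 & b_1 & 1 \\ a_2 & b_2 & 1 \end{vmatrix} = 0 \]
represents the straight line through the points $(a_1, b_1)$ and $(a_2, b_2)$. If $X_1$ and $X_2$ are the cofactors of x and y, what is the slope of this line in terms of the cofactors? Using this method, find the equation of the straight line through the points:
(a) (1, −1) and (2, 3); (b) (−1, 0) and (4, −1).

8.8 Find the value of a which makes the determinant
\[ \begin{vmatrix} 1 & 1 & -1 \\ 1 & a & 2 \\ -1 & 1 & 2 \end{vmatrix} \]
equal to zero.

8.9 Explain why
\[ \begin{vmatrix} x & 2 & -2 \\ 2 & x & 3 \\ x & -1 & x \end{vmatrix} = 0 \]
will be at most a cubic equation in x, but that
\[ \begin{vmatrix} 1 & 1 & 2 \\ 3 & x & 2 \\ x & 1 & x \end{vmatrix} = 0 \]
will be at most a quadratic equation in x. Solve both equations, and find all roots including any complex ones.

8.10 Show that
\[ \begin{vmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} + \begin{vmatrix} b_{11} & b_{12} & b_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}. \]

8.11 The determinant
\[ \begin{vmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \\ a_{31} + b_{31} & a_{32} + b_{32} & a_{33} + b_{33} \end{vmatrix} \]
is required as the sum of determinants each of which has just a's or b's in columns. How many determinants are there in the sum? If the determinant is n × n, how many determinants would there be in the sum?

8.12 Show that
\[ \begin{vmatrix} 1 & a_1 - b_1 & a_1 + b_1 \\ 1 & a_2 - b_2 & a_2 + b_2 \\ 1 & a_3 - b_3 & a_3 + b_3 \end{vmatrix} = 2 \begin{vmatrix} 1 & a_1 & b_1 \\ 1 & a_2 & b_2 \\ 1 & a_3 & b_3 \end{vmatrix}. \]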
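8.13 Let $D_n$ be the n × n tridiagonal determinant defined by
\[ D_n = \begin{vmatrix} 2 & 1 & 0 & \cdots & 0 \\ 1 & 2 & 1 & \cdots & 0 \\ 0 & 1 & 2 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 1 \\ 0 & 0 & \cdots & 1 & 2 \end{vmatrix}. \]
Show that

(b) Write down $A^T$ and find its determinant $\det A^T$. Confirm that
\[ \det A^T = \det A. \]
(c) Find $A^{-1}$ and $\det A^{-1}$, assuming that det A ≠ 0. Confirm that
\[ \det A^{-1} = 1/\det A. \]
(d) Show that
\[ \det \operatorname{adj} A = \det A. \]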
A vector quantity in geometry, mechanics and physics is one that has both mag-
nitude and direction, and satisfies certain other strict physical properties. A great
convenience of using vectors is to represent and manipulate a multi-dimensional
variable by a single symbol (such as v for velocity). Using the simplest types,
arising in two- and three-dimensional geometry, we demonstrate the concepts of
a directed line segment, components, vector addition, and position vector (see
Sections 9.1 to 9.7). We then consider derivatives of vectors that are functions of
time or position. This process delivers the less intuitive vector quantities velocity,
acceleration, and the tangent and curvature of curves.
ELEMENTARY OPERATIONS WITH VECTORS
Fig. 9.1 The points P and Q on the x axis, with origin O.
This is done by attaching a plus or minus sign to the distance. We use plus if Q
as viewed from P is in the positive direction of the x axis (to the right in this case),
and minus if Q is in the negative direction (to the left in this case). This quantity
is called the displacement of Q relative to P, or the displacement of Q from P, and
is defined in terms of xQ and xP by
displacement of Q from P = xQ − xP .
In this case the displacement of Q from P is equal to 1.5 − (−2) = 3.5. This is
positive, showing that Q is to the right of P. By the same rule,
displacement of P from Q = xP − xQ = (−2) − 1.5 = −3.5.
The minus sign indicates that P is to the left of Q.
Example 9.1 A pedestrian wanders up and down the high street, which extends
east and west. Starting at the bus stop, she strolls 80 m east, 25 m west, 50 m
east, then races 100 m west, at which point the returning bus drives off. Where
was she, relative to the bus stop, at this time?
Fig. 9.2 The walk from the bus stop B: 80 m east to C, 25 m west to D, 50 m east to E, then 100 m west to F, which is 5 m east of B.
There is no difficulty about this question: Fig. 9.2 shows that she ends up east of the bus
stop with 5 more metres to go. Notice how natural it is to count one direction as positive
and the other as negative. We shall formalize this, because we can get useful illustrations
about handling displacements from this problem.
In Fig. 9.3 we have drawn an east-pointing axis x. The origin is at O (it will make no
difference where it is) and the bus stop is at B. The direction changes at C, D, E, and the
end-point is F.
We want to find the displacement of F from B. This is defined by xF − xB. Write it in
the form
xF − xB = (xF − xE) + (xE − xD) + (xD − xC) + (xC − xB),
which is identically true because xE, xD, and xC cancel out. The quantities in the
brackets are relative displacements: for example, (xD − xC) represents the displacement
of D relative to C.
Fig. 9.3 The points B, F, D, C, E in order along the east-pointing x axis.
The data of the problem consists of these displacements; all we have to do is to get
the signs right. For example, since we chose the positive direction to be east, and the
movement from C to D is west, the displacement of D from C is −25. By substituting
all the information we obtain
xF − xB = 80 + (−25) + 50 + (−100) = 5.
Since this is positive, she ends up 5 m east of the bus stop.
We did not need to know the actual coordinates of any of the points B to F; the position
of the origin O makes no difference to relative displacements.
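The bookkeeping of Example 9.1 can be sketched in a few lines of Python. This is only an illustration: the names legs, final_displacement, and final_coordinate are assumptions introduced here, east is counted as positive as in the example, and the trial coordinates of B are arbitrary values chosen to show that the origin does not matter.

```python
# Signed displacements in metres, east counted as positive (as in Example 9.1).
legs = [80, -25, 50, -100]

def final_displacement(legs):
    """Sum the successive relative displacements."""
    return sum(legs)

def final_coordinate(x_start, legs):
    """Coordinate of the end-point for an arbitrary coordinate of the start."""
    x = x_start
    for d in legs:
        x += d
    return x

displacement = final_displacement(legs)
# The displacement of F from B is the same wherever the origin is placed.
same_for_any_origin = all(
    final_coordinate(x_b, legs) - x_b == displacement for x_b in (0, 7, -300)
)
print(displacement, same_for_any_origin)
```

A positive result means that the end-point lies east of the bus stop.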
Fig. 9.4 Track of the ferry boat: A to B (50 km), B to C (20 km), C to D (20 km), on axes x, y marked in units of 10 km.
Fig. 9.5 A displacement from P to Q of 1 km south-east, with components 1/√2 km east and −1/√2 km north.
A_D = C_D + B_C + A_B.
The components are ordinary numbers, so we can change the order in which
Example 9.2 Figure 9.6 shows the track of the ferry boat again (see Fig. 9.4).
(a) Find the displacement vector A_D in component form.
(b) Express A_D in terms of its length A_D and the angle θ it makes with the
positive x axis.
Fig. 9.6 The ferry track A, B, C, D, showing the x displacement, the y displacement, and the angle θ of A_D.
θ = arctan(Y/X) = arctan 0.952 = 43.6°.
Self-test 9.1
Three vectors in the plane are given by
A_B = (10, 20), B_C = (5, 20), C_D = (14, –5).
Find the displacement vector A_D. What are the magnitude and direction
of A_D?
Fig. 9.7 The point P : (2, 3, 1) in axes Ox, Oy, Oz.
There was a choice of two possible directions for Oz, as shown in Fig. 9.8.
These two sets of axes cannot be superposed no matter how we turn them about:
they are mirror images of each other, like a right shoe and a left shoe. The axes
shown in Fig. 9.8a are called right-handed axes (left-handed axes, Fig. 9.8b, are
seldom used).
Fig. 9.8 (a) Right-handed axes. (b) Left-handed axes.
1. Components and magnitude. Figure 9.9 shows a vector placed in a set of axes.
Its initial point is P : (xP, yP, zP) and its end-point is Q : (xQ, yQ, zQ). We denote it
either by P_Q, where the bar stresses the direction, P to Q; or (more often) by a
single letter, say
a (in heavy print) or a (underlined when handwritten).
The components of a in the x, y, and z directions respectively are a1, a2, and a3,
where
a1 = xQ − xP, a2 = yQ − yP, a3 = zQ − zP. (9.2)
We write
P_Q, or a = (a1, a2, a3). (9.3)
Fig. 9.10 (a) and (b) illustrate equality of vectors. (c) The vectors a and −a have opposite senses.
which has the same length as a and the opposite direction (Fig. 9.10c).
Fig. 9.11 The triangle rule: (a) a + b; (b) a − b.
Fig. 9.12 The parallelogram rule: (a) a + b; (b) a − b.
which is illustrated in Fig. 9.11b using the triangle rule and in Fig. 9.12b using the
parallelogram rule.
Also,
a − a = (a1, a2, a3) + (−a1, −a2, −a3) = (0, 0, 0).
This is the zero vector, denoted by 0 (or 0, if handwritten).
These are like the rules of ordinary algebra. A more complicated example is
(a + b) − (c + d ) = (b − c) − (d − a).
Plane vectors are useful in dynamical applications such as for projectiles where
motion takes place in a fixed vertical plane.
Example 9.4 M is the midpoint of the side AB of the triangle PAB (Fig. 9.13).
Put P_A = a and P_B = b. (a) Express the vector P_M in terms of a and b.
(b) Deduce that the diagonals of a parallelogram bisect each other.
(You can think of this in two dimensions, but it applies equally in three.)
Fig. 9.13 M is the midpoint of AB.
Fig. 9.14 N is the midpoint of PD.
Also
B_A = B_P + P_A (triangle rule; note the direction of B_P)
= −P_B + P_A (see (9.6))
= −b + a. (ii)
Example 9.5 (In two dimensions.) In the (x, y) plane, a and b are any two vectors
which are not parallel, and c is another vector. (a) Prove that c = λ a + µ b, where
λ and µ are constants. (b) Find λ and µ when a = (1, 1), b = (2, 0), and c = (3, 4).
Fig. 9.15 Vectors in a plane: c = λ a + µ b.
(a) Take any point Q. Draw a, b, and c radiating from it, and then complete the
parallelogram QBCA, as in Fig. 9.15. Then
c = Q_C = Q_A + Q_B (parallelogram rule).
But Q_A and Q_B point respectively in the directions of a and b so they are equal to
certain (unique) multiples of a and b:
Q_A = λ a and Q_B = µb,
say. Therefore
c = λ a + µb.
(b) a = (1, 1), b = (2, 0), and c = (3, 4), so from (a)
(3, 4) = λ(1, 1) + µ(2, 0).
The individual components on the two sides must match, so
3 = λ + 2µ, 4 = λ.
The solution is λ = 4, µ = −1/2, so
c = 4a − (1/2)b.
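The component matching in Example 9.5(b) is a pair of simultaneous linear equations, which can be solved mechanically. The sketch below uses Cramer's rule; the function name decompose is an assumption made for illustration, and the method is one convenient choice rather than the one used in the text.

```python
def decompose(a, b, c):
    """Return (lam, mu) with c = lam*a + mu*b for plane vectors a, b, c.

    Uses Cramer's rule on the two component equations; a and b must not
    be parallel, otherwise the determinant vanishes.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b are parallel")
    lam = (c[0] * b[1] - c[1] * b[0]) / det
    mu = (a[0] * c[1] - a[1] * c[0]) / det
    return lam, mu

lam, mu = decompose((1, 1), (2, 0), (3, 4))
print(lam, mu)
```

The check for a zero determinant corresponds to the requirement in the example that a and b are not parallel.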
Figure 9.16 shows the three vectors a, b, c in their common plane: the vectors
are then said to be coplanar, or linearly dependent (see Section 8.2). We can use
the same argument as in the previous example. (It is not actually necessary for the
vectors a, b, and c to be in the same plane and emerge from the same point to start
with: it is sufficient for them merely to be parallel to the same plane, so that we can
translate them to the positions in Fig. 9.16.) Then the argument in Example 9.5
follows.
Fig. 9.16 The vectors a, b, c radiating from the point Q in their common plane.
Self-test 9.2
ABCD is a quadrilateral with its sides represented by the vectors
A_B = a, B_C = b, A_D = c, D_C = d.
P, Q, R, S are the mid-points respectively of AB, BC, CD, DA. Show that
PQRS is a parallelogram. If M is the mid-point of AC, show that AMRS is
also a parallelogram.
9.5
(9.13)
RELATIVE VELOCITY
Example 9.6 (Figure 9.17) A river of width 0.2 km flows with uniform speed
3 km h−1 from west to east. A boat sets off from a point S on the south bank,
wishing to land at a point N on the north bank directly opposite S. It can travel
at a speed of 5 km h−1 relative to the water. In what direction should it point in
order to arrive at N by a straight line route? How long does it take?
Fig. 9.17 The river, of width 0.2 km, showing the water velocity, the direction of the bow (at angle θ), and the direction of travel.
The true path of the boat (i.e. as seen from the land, or relative to fixed axes) is not
along the direction it is pointing, because it is also being carried downstream. However,
viewed from axes which travel along with the water, it does go in the direction it is
pointing, at an apparent speed of 5 km h−1. To visualize this, imagine there is a dense
fog, so that the banks cannot be seen and the pilot is not aware of the current.
With B denoting ‘boat’ and W denoting ‘water’, put
vB = velocity of B relative to fixed axes (direction north, magnitude, or speed,
unknown);
vBW = velocity of B relative to the water W (speed 5 km h−1, in the unknown direction
it is pointing);
vW = velocity of the water W relative to fixed axes (direction east, speed 3 km h−1).
We also know from (9.13) that these are connected by
vBW = vB − vW,
or vB = vBW + vW.
This information gives Fig. 9.18.
(a) From Fig. 9.18, the boat is directed at θ = arcsin(3/5) = 36.9°.
(b) Pythagoras’s theorem gives the magnitude of vB:
|vB| = √(5² − 3²) = 4 km h−1.
Therefore the time taken is 0.2/4 = 0.05 h = 3 minutes.
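The numbers in Example 9.6 can be checked with a short calculation. This is a sketch under stated assumptions: θ is taken as the angle between the bow direction and due north (pointing upstream), which is how the velocity triangle is read here, and the variable names are introduced only for illustration.

```python
import math

river_speed = 3.0   # km/h, eastwards: |v_W|
boat_speed = 5.0    # km/h relative to the water: |v_BW|
width = 0.2         # km

# v_B = v_BW + v_W must point due north, so the east-west parts cancel:
# boat_speed * sin(theta) = river_speed.
theta = math.asin(river_speed / boat_speed)
# The northward part of v_BW is the speed over the ground, |v_B|.
ground_speed = boat_speed * math.cos(theta)
time_minutes = width / ground_speed * 60.0
print(round(math.degrees(theta), 1), round(ground_speed, 3), round(time_minutes, 3))
```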
Fig. 9.18 The velocity triangle vB = vBW + vW, with |vBW| = 5 and |vW| = 3.
Self-test 9.3
Two roads in NS and WE directions cross (by a bridge). A car A is travelling
north at 70 km h−1, and a second car B is travelling east on the other road at
50 km h−1. What is the speed of B relative to A, and what is the apparent
direction of B viewed from A?
Fig. 9.19 Position vector, r = O_P, of the point P.
Example 9.7 (Two dimensions.) A circle has radius c, and its centre C at the
point (a, b). (a) Obtain a vector equation for the circle. (b) Deduce the ordinary
cartesian equation.
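Fig. 9.20 The circle with centre C : (a, b), whose position vector is rC, and a point P : (x, y) on it with position vector r.
Fig. 9.21 The points A, B, C, with position vectors a, b, c, and a general point P of the plane, with position vector r.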
Example 9.8 Three points, A, B, and C (which do not lie in a straight line), have
position vectors a, b, and c. (a) Obtain a parametric vector equation for the
plane through the points A, B, C. (b) Deduce parametric cartesian (i.e. x, y, z)
equations for the plane in the case where the points are A : (1, 2, 1), B : (2, 2, 0),
C : (2, 1, 2). (c) Deduce the ordinary cartesian equation for this plane by
eliminating the parameters occurring in (b).
(a) Figure 9.21 shows the points A, B, C, and their position vectors. The point P : (x, y, z)
with position vector r is any point in the plane through A, B, and C. By the triangle rule
B_A = a − b, B_C = c − b, B_P = r − b.
By using the result (9.12), which relates any three coplanar vectors, we obtain
B_P = λ B_A + µ B_C,
or r − b = λ(a − b) + µ(c − b), (i)
where λ, µ are two constants which depend on the position of P. We find every point r in
the plane by letting the parameters λ , µ run through all possible values between −∞ and
+∞, so (i) is a parametric vector equation for the plane through A, B, C.
(b) Since r, a, b, c are position vectors, their components are given by the coordinates of
P, A, B, C, so eqn (i) becomes
(x, y, z) − (2, 2, 0) = λ [(1, 2, 1) − (2, 2, 0)] + µ[(2, 1, 2) − (2, 2, 0)]
= λ (−1, 0, 1) + µ(0, −1, 2).
Take the vector (2, 2, 0) over to the right-hand side, and then match the x, y, z
components separately:
x = 2 − λ, y = 2 − µ, z = λ + 2µ (ii)
where λ and µ may take any values. These are cartesian parametric equations for the plane.
(c) We obtain an x, y, z equation by eliminating λ and µ from the equations (ii). From the
first two equations we have
λ = 2 − x and µ = 2 − y.
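Substituting these into the third equation of (ii) gives z = (2 − x) + 2(2 − y), or
x + 2y + z = 6. (iii)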
If A, B, and C do not lie on a straight line, the equation of the plane through them
will always be like Example 9.8(iii):
Equation of a plane
The general equation of a plane is
ax + by + cz = d,
where a, b, c, d are constants. (9.14)
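As a check on Example 9.8, the parametric form can be sampled and tested against a cartesian equation. Eliminating λ and µ from x = 2 − λ, y = 2 − µ, z = λ + 2µ gives x + 2y + z = 6, which has the general form (9.14). The sketch below is illustrative only; the function name plane_point and the sample parameter values are assumptions.

```python
def plane_point(lam, mu, a=(1, 2, 1), b=(2, 2, 0), c=(2, 1, 2)):
    """Point r = b + lam*(a - b) + mu*(c - b) on the plane through A, B, C."""
    return tuple(b[i] + lam * (a[i] - b[i]) + mu * (c[i] - b[i]) for i in range(3))

# Sample the plane at several parameter values, including lam = mu = 0 (the point B).
samples = [plane_point(lam, mu) for lam in (-2, 0, 1, 3) for mu in (-1, 0, 1, 5)]
# Every sampled point should satisfy x + 2y + z = 6.
on_plane = all(x + 2 * y + z == 6 for (x, y, z) in samples)
print(on_plane)
```

Setting (λ, µ) to (1, 0), (0, 0), and (0, 1) returns the points A, B, and C respectively.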
Example 9.9 (Three dimensions.) Two points, A and B, have position vectors
a and b. (a) Obtain a parametric vector equation for the straight line joining A
and B. (b) Deduce parametric cartesian (i.e. x, y, z) equations for the case where
the points are A : (2, 2, −1) and B : (0, 1, −2). (c) By eliminating the parameter
between the equations in (b), find cartesian equations for this line.
(a) Figure 9.22 shows the points A and B and their position vectors a and b. The
point P : (x, y, z) with position vector r represents any point on the line joining AB.
Also,
A_B = b − a, and A_P = r − a.
A_P is some multiple, λ say, of A_B:
A_P = λ A_B,
or r − a = λ(b − a).
Fig. 9.22 The line through A and B, with position vectors a and b, and a general point P : (x, y, z) with position vector r.
Therefore
r = (1 − λ)a + λ b. (i)
This is the required parametric vector equation, with λ as the parameter. As λ increases
from −∞ to +∞, P traces out the straight line passing through A and B.
(b) Since r, a, and b are position vectors, their components are the same as the
coordinates of P, A, and B:
r = (x, y, z), a = (2, 2, −1), b = (0, 1, −2).
Substitute these into (i):
(x, y, z) = (1 − λ)(2, 2, −1) + λ(0, 1, −2) = (2 − 2λ, 2 − λ, −1 − λ).
Now match the x, y, z components on both sides:
x = 2 − 2λ, y = 2 − λ, z = −1 − λ. (ii)
These are parametric cartesian equations, in which the parameter ranges from −∞
to +∞.
(c) In order to get rid of the parameter λ in (ii), write them successively in the form
λ = (x − 2)/(−2), λ = (y − 2)/(−1), λ = (z + 1)/(−1).
Since the three fractions are equal (equal to the current value of λ) we obtain the relation
between x, y, z which holds on the line:
(x − 2)/(−2) = (y − 2)/(−1) = (z + 1)/(−1),
which simplifies to
−(1/2)x + 1 = −y + 2 = −z − 1. (iii)
The shape of the result (iii) of Example 9.9 might strike you as being peculiar.
It really consists of two simultaneous equations, representing two planes which
intersect along the required line AB. The expression cannot be reduced to a single
equation. The general case will be given in Chapter 10.
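The parametric and cartesian forms in Example 9.9 can be compared numerically. In this sketch the function names line_point and satisfies_iii are assumptions made for illustration; the chained comparison tests both of the simultaneous equations contained in (iii).

```python
def line_point(lam, a=(2, 2, -1), b=(0, 1, -2)):
    """Point r = (1 - lam)*a + lam*b on the line through A and B."""
    return tuple((1 - lam) * a[i] + lam * b[i] for i in range(3))

def satisfies_iii(p):
    """Check the two simultaneous equations -x/2 + 1 = -y + 2 = -z - 1."""
    x, y, z = p
    return -x / 2 + 1 == -y + 2 == -z - 1

print([line_point(lam) for lam in (0, 1, 2)])
print(all(satisfies_iii(line_point(lam)) for lam in range(-3, 4)))
```

With λ = 0 and λ = 1 the function returns A and B, and on the line each of the three expressions in (iii) equals the current value of λ.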
2x − 2 = y + 1 = −2z, (i)
(a) Find any one point on the line. (b) Find a parametric equation for the line.
(c) Find the coordinates of the point where the line crosses the plane
x − y + z = 0. (ii)
(a) Put, for example, x = 1. Then from (i), 2x − 2 = y + 1, so when x = 1, y = −1. Also
from (i), 2x − 2 = −2z, so z = 0. Therefore, the point (1, −1, 0) lies on the line. (Other
values of x lead to other points.)
(b) Proceeding as in (a), put x = λ, where λ may take any value. Then we find that
y = 2λ − 3 and z = −λ + 1.
Therefore a set of parametric equations is
x = λ, y = 2λ − 3, z = −λ + 1. (iii)
(c) From (ii) and (iii), at the point where the line meets the plane the value of λ must be
given by
0 = x − y + z = λ − (2λ − 3) + (−λ + 1) = −2λ + 4.
At this point λ = 2, so from (iii), the line meets the plane at
x = 2, y = 1, z = −1.
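The intersection in part (c) can be found by the same substitution in code. The sketch below assumes the parametric form x = λ, y = 2λ − 3, z = −λ + 1 and the plane x − y + z = 0 from the example; the function names and the two-sample trick for locating the root of a linear expression are illustrative choices.

```python
def point_on_line(lam):
    """Parametric form of the line: x = lam, y = 2*lam - 3, z = -lam + 1."""
    return (lam, 2 * lam - 3, -lam + 1)

def plane_residual(p):
    """Left-hand side of the plane equation x - y + z = 0."""
    x, y, z = p
    return x - y + z

# The residual is linear in lam, so two samples determine where it vanishes.
r0 = plane_residual(point_on_line(0))
r1 = plane_residual(point_on_line(1))
lam_hit = -r0 / (r1 - r0)
hit = point_on_line(lam_hit)
print(lam_hit, hit)
```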
Self-test 9.4
Find the parametric vector and cartesian equations of the plane through the
points A, B, C with position vectors respectively a = (1, –1, 2), b = (2, 0, –1),
c = (3, –1, –3). Show that the plane passes through the origin.
(They are sometimes spoken of as ‘i-hat’, and so on.) Figure 9.23 shows them as
position vectors.
Fig. 9.23 Basis vectors, î, ĵ, k̂.
Any vector can be expressed in terms of î, ĵ, and k̂. Suppose that a = (a1, a2, a3)
in component form. Then
a = (a1, 0, 0) + (0, a2, 0) + (0, 0, a3)
= a1(1, 0, 0) + a2(0, 1, 0) + a3(0, 0, 1) = a1î + a2ĵ + a3k̂.
The components become the coefficients of î, ĵ, and k̂.
Example 9.12 Obtain the unit vector F̂ pointing in the direction of the force
F = 2î − 3ĵ − 6k̂.
|F| = √[2² + (−3)² + (−6)²] = √49 = 7.
Therefore, the unit vector pointing in the same direction is
F̂ = F/|F| = (2î − 3ĵ − 6k̂)/7 = (2/7)î − (3/7)ĵ − (6/7)k̂,
or, in component form,
F̂ = (2/7, −3/7, −6/7).
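The normalization in Example 9.12 is easy to reproduce. The helper name unit below is an assumption introduced for illustration; it divides each component by the magnitude, as in the example.

```python
import math

def unit(v):
    """Return v / |v| for a nonzero vector given as a tuple of components."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / length for c in v)

F = (2, -3, -6)
F_hat = unit(F)
print(F_hat)
# The magnitude of the result should be 1, up to rounding.
print(math.sqrt(sum(c * c for c in F_hat)))
```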
Unit vectors
A unit vector is a vector of unit magnitude. The unit vector in the direction of a
is denoted by â (a-hat).
(a) If a is any vector, then
â = a/|a|.
(b) The vectors î, ĵ, k̂ (basis vectors) are the unit vectors in directions Ox, Oy,
Oz. If a = (a1, a2, a3) is any vector, then
a = a1î + a2ĵ + a3k̂.
(For two dimensions, use only î and ĵ.) (9.16)
Example 9.13 Find the point Q where the straight line joining A : (2, 3, 1) and
B : (1, 2, 2) intersects the plane x + y + z = 0.
The position vectors of A and B, in terms of î, ĵ, k̂, are a = 2î + 3ĵ + k̂ and b = î + 2ĵ + 2k̂
respectively. Let r = xî + yĵ + zk̂ be the position vector of a general point on the line AB.
Then from Example 9.9a, the parametric equation of AB is
r = (1 − λ)a + λ b = (1 − λ)(2î + 3ĵ + k̂) + λ(î + 2ĵ + 2k̂).
Problems can be worked through with the vectors given either in component form
or in î, ĵ, k̂ form, whichever is convenient.
Self-test 9.5
Find where the straight line through the points A : (1, 2, –1), B : (p, 1, 0),
(p ≠ 1), intersects the plane x + y + z = 0. Treating p as a parameter, find the
locus of the points of intersection on the plane.
Fig. 9.24 Tangent vector T.
Derivative of r(t)
r(t) = îx(t) + ĵy(t) + k̂z(t), where t is a parameter, represents a curve. The vector T
given by
T = dr/dt = î dx/dt + ĵ dy/dt + k̂ dz/dt
is a tangent to the curve, in the direction of increasing t. (9.18)
If the parameter t stands for time, then dr /dt is the definition of the velocity v(t)
of P, and dv/dt represents its vector acceleration:
Example 9.14 (Motion in the (x, y) plane.) The position vector of a point P is
given by
r(t) = îc cos ωt + ĵc sin ωt,
where c and ω are positive constants. Find (a) the velocity v(t) and the speed of
P; (b) the acceleration a(t) of P.
(a) |r(t)| = c√[cos² ωt + sin² ωt] = c, so P is moving around a circle of radius c in the (x, y)
plane.
v = dr/dt = −îcω sin ωt + ĵcω cos ωt.
The direction of v is tangential to the circle, by (9.18). By putting, say, t = 0 we obtain
v = ĵωc, and since c, ω > 0, this shows the motion to be anticlockwise.
Also, speed = |v| = cω.
(The speed is constant, but the velocity is not, because its direction is continuously
changing.)
(b) a = dv/dt = −îcω² cos ωt − ĵcω² sin ωt
= −ω²(îc cos ωt + ĵc sin ωt) = −ω²r.
The acceleration is therefore directed towards the centre of the circle (perhaps
unexpectedly).
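The results of Example 9.14 can be checked numerically. The sketch below picks hypothetical values c = 2 and ω = 3 (not taken from the text), compares the stated velocity with a central-difference derivative of r(t), and confirms that the acceleration equals −ω² times the position vector.

```python
import math

C, OMEGA = 2.0, 3.0   # hypothetical values of the positive constants c and omega

def r(t):
    return (C * math.cos(OMEGA * t), C * math.sin(OMEGA * t))

def v(t):
    return (-C * OMEGA * math.sin(OMEGA * t), C * OMEGA * math.cos(OMEGA * t))

def a(t):
    return (-C * OMEGA**2 * math.cos(OMEGA * t), -C * OMEGA**2 * math.sin(OMEGA * t))

def central_difference(f, t, h=1e-5):
    """Numerical derivative of a vector-valued function of t."""
    fp, fm = f(t + h), f(t - h)
    return tuple((p - m) / (2 * h) for p, m in zip(fp, fm))

t = 0.4
speed = math.hypot(*v(t))
print(speed, C * OMEGA)                         # the speed should equal c*omega
print(central_difference(r, t), v(t))           # dr/dt compared with v
print(a(t), tuple(-(OMEGA**2) * x for x in r(t)))  # a compared with -omega^2 r
```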
Self-test 9.6
A particle has the position vector r = cî cos ωt + (1/√2)cĵ sin ωt + (1/√2)ck̂ sin ωt in
terms of time t. Show that the particle moves on a sphere of radius c. Find the
velocity and acceleration of the particle. Show that both are constant in
magnitude, and that the acceleration is directed towards the origin.
The unit vectors êr and êθ vary in direction according to the value of θ, and are
therefore functions of θ. They are related to the basis vectors î and ĵ as in Fig. 9.26.
By the triangle rule,
êr = î cos θ + ĵ sin θ, êθ = −î sin θ + ĵ cos θ. (9.21)
Fig. 9.25 Polar unit vectors êr and êθ.
Fig. 9.26 The hypotenuse has unit length in both triangles, which determines the lengths of the
other vectors.
Now suppose that P is moving along a curved path. Then r and θ are functions
of time, t, so we can write r(t), θ(t) for its polar coordinates, and consider their
derivatives with respect to t. There is a useful dot notation for time derivatives
which saves a lot of writing – it works in the same way as the dash notation, (4.1):
Example 9.15 The polar coordinates of a point moving in a plane are r(t), θ(t),
where t is time. Find the polar components (a) of its velocity and (b) of its
acceleration.
(a) The position vector is r(t) = r(t)êr. The velocity v is dr/dt:
v(t) = dr/dt = d(rêr)/dt.
Both r and êr depend on t, so we use the product rule for differentiation:
v = ṙêr + r dêr/dt = ṙêr + rθ̇êθ (i)
by (9.25). Therefore the radial velocity component is ṙ and the transverse component
is rθ̇.
(b) The acceleration is dv/dt, given by
dv/dt = (d/dt)(ṙêr + rθ̇êθ) (from (i))
= r̈êr + ṙ dêr/dt + [d(rθ̇)/dt]êθ + rθ̇ dêθ/dt
= r̈êr + ṙθ̇êθ + (ṙθ̇ + rθ̈)êθ − rθ̇²êr.
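Collecting terms in the last line of Example 9.15 gives a radial component equal to the second time derivative of r minus r times the square of the time derivative of θ, and a transverse component equal to twice the product of the two first derivatives plus r times the second derivative of θ. The sketch below checks this numerically for a hypothetical motion r(t) = 1 + t², θ(t) = 2t (chosen only for illustration) by differencing the cartesian position twice and resolving along êr and êθ.

```python
import math

def r(t):
    return 1.0 + t * t      # hypothetical radial law

def theta(t):
    return 2.0 * t          # hypothetical angular law

def position(t):
    return (r(t) * math.cos(theta(t)), r(t) * math.sin(theta(t)))

def polar_acceleration(t):
    """(radial, transverse) components from the collected polar formula."""
    r_dot, r_ddot = 2.0 * t, 2.0        # derivatives of r(t) = 1 + t^2
    th_dot, th_ddot = 2.0, 0.0          # derivatives of theta(t) = 2t
    return (r_ddot - r(t) * th_dot**2, 2.0 * r_dot * th_dot + r(t) * th_ddot)

def numerical_acceleration(t, h=1e-4):
    """Second central difference of the cartesian position, resolved along e_r, e_theta."""
    p0, p1, p2 = position(t - h), position(t), position(t + h)
    ax = (p0[0] - 2 * p1[0] + p2[0]) / h**2
    ay = (p0[1] - 2 * p1[1] + p2[1]) / h**2
    c, s = math.cos(theta(t)), math.sin(theta(t))
    return (ax * c + ay * s, -ax * s + ay * c)

print(polar_acceleration(0.7))
print(numerical_acceleration(0.7))
```

The two printed pairs should agree to within the accuracy of the finite-difference approximation.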
Problems
9.1 Sketch the two-dimensional displacement vectors P_Q and Q_P, and state their x and y
components, when the coordinates of P and Q are as follows.
(a) P : (−2, 3), Q : (3, 0), (b) P : (3, 4), Q : (2, 1),
(c) P : (0, 1), Q : (−1, −2),
(d) P : (−1, −1), Q : (0, 0).

9.2 (a) to (h) represent two-dimensional displacement vectors expressed in terms of their x,
y components. For each one obtain the length and the angle of inclination θ to the positive
direction of the x axis in the range −180° to 180°.
(a) (3, 0), (b) (0, 2), (c) (−1, 1),
(d) (1, 1), (e) (−1, −1), (f) (−3, 4),
(g) (−3, −4), (h) (−2, 1).
(Make sure that these angles are in the right quadrant by means of a rough sketch.)

9.3 Obtain the components of the vectors a in (a) to (d), where L is the magnitude and θ the
angle made with the positive direction of the x axis (−180° < θ ≤ 180°): (a) L = 2, θ = 45°,
(b) L = 3, θ = 120°, (c) L = 3, θ = 60°, (d) L = 3, θ = −150°.

9.4 Two ships, S1 and S2, set off from the same point Q. Each follows a route given by
successive displacement vectors. In axes pointing east and north, S1 follows the path to B
via Q_A = (2, 4), and A_B = (4, 1). S2 goes to E via Q_C = (3, 3), C_D = (1, 1), and
D_E = (2, −3). Find the displacement vector B_E in component form, the distance BE,
and the final bearing of S2 seen from S1.

9.5 Find the distances between the pairs of points whose coordinates are: (a) (0, 0, 0) and
(1, 2, 3), (b) (1, 2, 3) and (3, 2, 1), (c) (1, 0, −1) and (−1, 1, 0).

9.6 State the projections on the three axes of the vector P_Q when P is the point (1, 2, 1)
and Q is (2, 3, 3).

9.7 Find 2a, 3b, and 2a − 3b when
(a) a = (1, 2, 1), b = (2, 1, 2),
(b) a = (3, 2, 3), b = (1, 1, 2),
(c) a = (6, 3, 1), b = (4, 2, 1).
How do you recognize that 2a − 3b is parallel to the (x, y) plane in (b), and parallel to the
z axis in (c)?

9.8 Sketch a diagram to show that if A, B, C are any three points, then
A_B + B_C + C_A = 0. Formulate a similar result for any number of points.
9.9 Sketch a diagram to show that if A, B, C, D are any four points, then
C_D = C_B + B_A + A_D. Formulate a similar result for any number of points.

9.10 Oxyz and QXYZ are two sets of axes with origins at O and Q respectively. QX is
parallel to Ox and has the same sense (positive direction), and similarly for QY and QZ.
The frame QXYZ is said to be a translation (a motion without rotation) of the frame Oxyz.
Suppose that O_Q = (2, −1, 3). (a) Find the coordinates of the point P in QXYZ if it has
coordinates x = 5, y = 2, z = −3 in Oxyz. (b) Find the equation of the sphere
x² + y² + z² = 1 in terms of X, Y, and Z.

9.11 ABCD is any quadrilateral in three dimensions. Prove that if P, Q, R, S are the
midpoints of AB, BC, CD, DA respectively, then PQRS is a parallelogram.

9.12 ABC is a triangle, and P, Q, R are the midpoints of the respective sides BC, CA, AB.
Prove that the medians AP, BQ, CR meet at a single point G (called the centroid of ABC;
it is the centre of mass of a uniform triangular plate).

9.13 Show that the vectors O_A = (1, 1, 2), O_B = (1, 1, 1), and O_C = (5, 5, 7) all lie in
one plane. Show that the same is true if O_A = (a, a, p), O_B = (b, b, q), O_C = (c, c, r),
where a, b, c, p, q, r may stand for any numbers. Explain this result geometrically.

9.14 A glider is moving with a velocity v = (40, 30, 10) relative to the air and is blown by
the wind which has velocity relative to the earth of w = (5, −10, 0). Find the velocity of
the glider relative to the earth.

9.15 The captain of a boat at night can tell that it is moving relative to the sea with
velocity (5, 4) km h−1, and by observation of lights on shore its true velocity is found to
be (4, 1). What is the velocity of the current?

9.16 A cyclist rides north along a straight road at 10 km h−1. The wind appears to come
from the west. If she increases her speed to 20 km h−1 then the wind appears to blow from
the north west. Determine the speed and direction of the wind.

9.17 A ship travels south with speed u and the apparent wind direction is from the east.
Another travels west with speed 2u/√3, and the apparent wind direction is from 30° east of
north. Find the true wind velocity.

9.18 r is the position vector (2, 3, 1), and a = (1, 1, 2) is a general vector. R is the position
vector defined by R = a + 2r. Find the coordinates of the terminal point of R.

9.19 Find the angle θ, where 0 ≤ θ ≤ 180°, made by the position vector r with the positive
directions of the axes Ox, Oy, Oz in the following cases:
(a) r = (1, 0, 0), (b) r = (0, 1, 1), (c) r = (0, 0, −1),
(d) r = (1, 1, 1), (e) r = (1, 1, −1).

9.20 P : (1, 1, 0), Q : (1, 1, 1), and R : (1, 2, 1) are three of the vertices of a parallelogram
with sides PQ and PR. Use vector methods to find the coordinates of (a) the fourth vertex, S,
(b) the midpoint of PS, (c) the midpoint of QR. Show that (b) and (c) have the same
coordinates (it is where the diagonals intersect).
Find the midpoints A, B, C, D of the four sides PR, RS, SQ, QP respectively. Show that
ABCD is a parallelogram.

9.21 Show that the points A : (1, 2, −1), B : (3, 3, −2), and C : (−3, 0, 1) are collinear (lie on
a straight line), by considering the vectors A_B and A_C (or any other two combinations of
A, B, and C). (a) Find which point is between the other two. (b) Find any other point on
the line. (c) Show that the points x = 2λ + 1, y = λ + 2, z = −λ − 1, where λ is a
parameter which may take any value, all lie on the line (these are parametric equations for
the line).

9.22 Two points A and B have position vectors a and b respectively. In terms of a and b
find the position vectors of the following points on the straight line passing through A
and B: (a) the midpoint C of AB; (b) a point U between A and B for which AU/UB = 1/3;
(c) a point V for which AV/VB = 1/3, but for which V does not lie between A and B.

9.23 Suppose that λ is a number such that 0 < λ < 1. Find two points, U and V, on the line
through A and B such that (a) AU/UB = λ and U is between A and B. (b) AV/BV = λ and V
is not between A and B. (c) What is the case if λ > 1?

9.24 (a) Obtain a vector parametric equation for the straight line which passes through the
point (1, 4, 2) and is parallel to the line joining the points (2, 3, 4) and (1, 2, 3). (b) As in
Example 9.9, deduce a pair of simultaneous cartesian equations for the line. (c) Obtain the
points where the line intersects the (x, y) plane and the (y, z) plane. (d) By using these two
points, obtain another pair of cartesian equations for the line.
9.25 Suppose that P has position vector r, and
r = λa + (1 − λ)b, where λ is a parameter, and A, B
9.34 Obtain a parametric vector equation for the
line which is parallel to î + 2ĵ − k̂ and which passes
through the point with position vector î + ĵ + k̂.
Find the point of intersection of this line with the
occurs at the highest and lowest points of the sphere,
and find where the maximum occurs.
The scalar product
10
CONTENTS
Given two vectors a and b, an operation can be carried out that bears some sim-
ilarity to forming their product. There are two types of ‘product’, the scalar prod-
uct or dot product, written a·b, which is the subject of this chapter, and vector
product, or cross product a × b, treated in the next chapter, and they have differ-
ent spheres of usefulness. The scalar product of two vectors is not itself a vector,
but a scalar quantity related to the angle between the two vectors. This property
extends the capacity of vector techniques to handle many geometrical questions.
In mechanics the scalar product is associated with the component of a vector
quantity, such as force, in a given direction.
Self-test 10.1
Prove that if a = −î − 2ĵ − k̂ and b = 2î − ĵ, then a · b = 0.
Fig. 10.1
If a and b are not at the same point to start with we may still refer to θ as
being the angle between them. The result (10.4b) can also be written in the form
cos θ = â · b̂, where â and b̂ are the unit vectors in the directions of a and b.
Example 10.3 Given three points A : (1, 1, 1), B: (3, 2, 3), and C : (0, −1, 1), find
the angle θ between C_A and C_B.
Put C_A = a, C_B = b. Then
a = (1, 1, 1) − (0, −1, 1) = (1, 2, 0), b = (3, 2, 3) − (0, −1, 1) = (3, 3, 2).
|a| = √[1² + 2² + 0²] = √5 and |b| = √[3² + 3² + 2²] = √22.
a·b = (1 × 3) + (2 × 3) + (0 × 2) = 9.
From (10.4),
cos θ = a·b/(|a||b|) = 9/√110 = 0.858.
Finally θ = 30.9°.
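The steps of Example 10.3 translate directly into code. The helper names sub, dot, norm, and angle_deg below are assumptions introduced for this sketch; the angle is obtained from the cosine formula used in the example.

```python
import math

def sub(p, q):
    """Displacement vector from the point q to the point p."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def angle_deg(u, v):
    """Angle between u and v in degrees, in the range 0 to 180."""
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))

A, B, C = (1, 1, 1), (3, 2, 3), (0, -1, 1)
a, b = sub(A, C), sub(B, C)     # the vectors C_A and C_B
print(a, b, dot(a, b), round(angle_deg(a, b), 1))
```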
Self-test 10.2
Show that the angle θ between the vectors (3, 2, 1) and (−2, 1, 2) is equal
to 100.2°.
Example 10.4 Show that the vectors a = (1, 2, 3) and b = (−5, 1, 1) are
perpendicular.
We have
cos θ = a·b/(|a||b|)
and a·b = (1, 2, 3)·(−5, 1, 1) = −5 + 2 + 3 = 0. Therefore θ = 90°, by (10.4).
From (10.4) the condition for two vectors to be perpendicular may be expressed
as follows:
Perpendicular vectors
If a and b are nonzero vectors, they are perpendicular if
a ·b = a1b1 + a2b2 + a3b3 = 0. (10.5)
Scalar products of î, ĵ, k̂
(a) î·î = ĵ·ĵ = k̂·k̂ = 1;
î·ĵ = ĵ·k̂ = k̂·î = 0.
Example 10.5 Find the numbers α, β, and γ which make the vectors
a = αî + ĵ + 2k̂, b = î + βĵ − k̂, c = î − ĵ + γk̂
mutually perpendicular.
We require that a·b = b·c = c·a = 0.
a·b = (αî + ĵ + 2k̂)·(î + βĵ − k̂) = α + β − 2 = 0,
b·c = (î + βĵ − k̂)·(î − ĵ + γk̂) = 1 − β − γ = 0,
c·a = (î − ĵ + γk̂)·(αî + ĵ + 2k̂) = α − 1 + 2γ = 0.
Therefore α, β, γ must satisfy
α + β = 2, (i)
−β − γ = −1, (ii)
α + 2γ = 1. (iii)
Substitute α from (i) and γ from (ii) into (iii) to give
(2 − β) + 2(1 − β) = 1,
so that β = 1. From (ii), γ = 1 − β = 0, and from (i), α = 2 − β = 1. Therefore the required
vectors are
a = î + ĵ + 2k̂, b = î + ĵ − k̂, c = î − ĵ.
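The answer to Example 10.5 can be confirmed by evaluating the three scalar products. In the sketch below the function names dot and vectors are illustrative assumptions; vectors builds a, b, c from trial values of α, β, γ so that other choices can be tested as well.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def vectors(alpha, beta, gamma):
    """The vectors a, b, c of Example 10.5 in component form."""
    return (alpha, 1, 2), (1, beta, -1), (1, -1, gamma)

a, b, c = vectors(1, 1, 0)
products = (dot(a, b), dot(b, c), dot(c, a))
print(products)     # all three vanish for mutually perpendicular vectors
```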
Self-test 10.3
Show that the vectors 3î − 2ĵ + k̂, î − 3ĵ + 5k̂, and 2î + ĵ − 4k̂ are parallel to the
sides of a certain right-angled triangle.
Fig. 10.2 (a) Change of axes in two dimensions. (b) The associated unit vectors.
Therefore
î = Î cos θ − Ĵ sin θ,
and ĵ = Î sin θ + Ĵ cos θ.
The position of P in space does not change when we change axes, so in terms of
the new axes
XÎ + YĴ = xî + yĵ
= x(Î cos θ − Ĵ sin θ) + y(Î sin θ + Ĵ cos θ)
= (x cos θ + y sin θ)Î + (−x sin θ + y cos θ)Ĵ.
Finally, by equating the coefficients of Î and Ĵ, we obtain the result (10.9a):
The inverse relation (10.9b) can be obtained by solving the equations in (10.9a) for
x and y; or by interchanging x, y and X, Y in (a) and putting (−θ ) in place of θ.
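Reading off the coefficients in the derivation above gives X = x cos θ + y sin θ and Y = −x sin θ + y cos θ. The sketch below implements that relation and its inverse; the function names and the sample values are assumptions made for illustration, and the inverse is formed exactly as described, by swapping roles and replacing θ by −θ.

```python
import math

def to_rotated_axes(x, y, theta):
    """(X, Y) from (x, y) when the axes are rotated anticlockwise by theta (radians)."""
    X = x * math.cos(theta) + y * math.sin(theta)
    Y = -x * math.sin(theta) + y * math.cos(theta)
    return X, Y

def from_rotated_axes(X, Y, theta):
    """Inverse relation: the same formula with -theta in place of theta."""
    return to_rotated_axes(X, Y, -theta)

theta = math.radians(30)
X, Y = to_rotated_axes(2.0, 1.0, theta)
x_back, y_back = from_rotated_axes(X, Y, theta)
print(X, Y)
print(x_back, y_back)
```

The distance of the point from the origin is unchanged by the rotation, which gives a quick consistency check on any computed pair (X, Y).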
Self-test 10.4
(a) The coordinates of a fixed object are x = 1, y = √ 3. What do its coordi-
nates become in axes X, Y obtained by rotating x, y anti-clockwise
through 60°?
(b) In a change of axes, the X axis is to be parallel to the vector î + ĵ. What is
the value to be given to θ in (10.9a)?
DIRECTION COSINES
Figure 10.3 shows a position vector r = O_P, where P is the point (a, b, c), so
O_P = r = (a, b, c).
The angles between r and î, ĵ, k̂ respectively (chosen for definiteness between
0° and 180°, as for θ in Section 10.2) are α, β, γ. These angles specify the direction
of r uniquely. It is convenient to use not the angles themselves, but their cosines,
which are normally indicated by l, m, n:
l = cos α, m = cos β, n = cos γ.
These are called the direction cosines of r, and also specify the direction of r
uniquely.
Fig. 10.3 The position vector r and the angles α, β, γ which it makes with the axes.
If the angles between s and Ox, Oy, Oz, are α, β, γ, respectively, in the range 0°
to 180°, then
l = cos α, m = cos β, n = cos γ
are the direction cosines of s.
(a) Any vector parallel to s with the same sense has the same direction cosines
l, m, n.
(b) l² + m² + n² = cos²α + cos²β + cos²γ = 1.
(c) v = (l, m, n) is a unit vector in the direction of s. (10.10)
Example 10.6 Obtain the direction cosines of the vector s = î + 2q − 2x. Find the
angles between s and the coordinate axes.
The components of s are (1, 2, −2), so its length is given by
s = √[1² + 2² + (−2)²] = 3.
Therefore the unit vector v has components
l = 1/3, m = 2/3, n = −2/3.
The corresponding angles in the range 0° to 180° are α = arccos(1/3) = 70.5°,
β = arccos(2/3) = 48.2°, γ = arccos(−2/3) = 131.8°.
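The steps of Example 10.6 can be sketched in a few lines of Python. This is an illustration only; the function names are hypothetical, and the vector is passed as a tuple of its components.

```python
import math

def direction_cosines(s):
    """Return (l, m, n): the components of s divided by its length."""
    length = math.sqrt(sum(c * c for c in s))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / length for c in s)

def axis_angles_deg(s):
    """Angles, in the range 0 to 180 degrees, between s and the axes."""
    return tuple(math.degrees(math.acos(c)) for c in direction_cosines(s))

s = (1, 2, -2)                       # the vector of Example 10.6
l, m, n = direction_cosines(s)
print(l, m, n)
print([round(a, 1) for a in axis_angles_deg(s)])
print(l * l + m * m + n * n)         # property (10.10b): should be 1
```

Because `math.acos` returns values between 0 and π, the angles automatically fall in the range 0° to 180° used in the text.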
Self-test 10.5
Obtain the direction cosines (a) of vectors parallel to î + q − x, (b) of the line
joining the points with coordinates (1, 1/2, 1) and (0, −1/2, 1), and the angles
they make with the axes.
By inverting our view of the two sets of axes, we can also specify the components
of î, q, x in the axes OXYZ:
î = (l1, l2, l3), q = (m1, m2, m3), x = (n1, n2, n3) (10.11b)
Fig. 10.4 (a) Change of axes in three dimensions. (b) Angles between î and the X, Y, Z axes.
The inverse relation is obtained in a similar way. Given a fixed point Q, with
coordinates (X, Y, Z) in the axes OXYZ and position vector R, then
R = (X, Y, Z) = XÎ + Yr + ZP.
In matrix form, the coordinates in the two systems are related as follows:
Î = (l1, m1, n1), r = (l2, m2, n2), P = (l3, m3, n3) are the basis vectors for axes OXYZ,
referred to axes Oxyz (the components being direction cosines). Then
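(a) \[ \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} l_1 & m_1 & n_1 \\ l_2 & m_2 & n_2 \\ l_3 & m_3 & n_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
(b) \[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \\ n_1 & n_2 & n_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}. \]
(10.13)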
The matrix of direction cosines in (b) is the inverse of the matrix in (a).
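The relation between the two matrices in (10.13) can be checked numerically. The sketch below is an illustration under stated assumptions: the matrix A is a hypothetical set of direction cosines chosen for the purpose (its rows are mutually perpendicular unit vectors), and the helper functions are our own. It confirms that the matrix in (b), which is the transpose of the matrix in (a), undoes the change of axes.

```python
def transpose(m):
    """Interchange rows and columns."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, v):
    """Multiply the matrix m by the column vector v."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

# Hypothetical matrix of type (10.13a): rows are I, J, K referred to Oxyz.
A = [[1 / 3, 2 / 3, 2 / 3],
     [-2 / 3, -1 / 3, 2 / 3],
     [2 / 3, -2 / 3, 1 / 3]]
B = transpose(A)                     # matrix of type (10.13b)

print(matmul(A, B))                  # close to the unit matrix

point = [1.0, 2.0, 2.0]              # (x, y, z)
new = apply(A, point)                # (X, Y, Z)
print(new, apply(B, new))            # second list recovers (x, y, z)
```

The same test, that the product with the transpose is the unit matrix, is a quick way of detecting a slip in a table of direction cosines.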
In the new coordinates,
3x + 3y + 3z = (X + 2Y + 2Z) + (−2X − Y + 2Z) + (2X − 2Y + Z) = X − Y + 5Z.
Example 10.9 Find the angles made with Ox, Oy, and Oz by a line with
direction ratios 2, 3, −6.
Put s = 2î + 3q − 6x: this is parallel to the line. Since |s| = 7, the corresponding unit vector v
is given by
v = (1/7)s = (2/7)î + (3/7)q − (6/7)x = î cos α + q cos β + x cos γ,
where cos α, cos β, cos γ are its direction cosines. Therefore, the inclination of the line
is specified by the angles
α = arccos(2/7) = 73.4°, β = arccos(3/7) = 64.6°, γ = arccos(−6/7) = 149°.
Example 10.10 (Two dimensions.) Find a set of direction ratios for the straight
line y = 2x + 1.
We are looking for any vector which is parallel to the line. The points
A : (0, 1) and B : (1, 3) lie on the line, so the vector s = A_B given by
s = O_B − O_A = (î + 3q ) − q = î + 2q
is parallel to the line. Therefore one set of direction ratios is given by the numbers 1, 2.
Example 10.11 (Two dimensions.) Find parametric and cartesian equations for
the straight line through the point A : (a, b), which has direction ratios p, q.
Fig. 10.5
In Fig. 10.5, A is the point with position vector a = aî + bq, and s = pî + qq = A_S is a vector
along the line. P is a general point on the line, with position vector r = xî + yq. Then
r = O_A + A_P,
and A_P is some multiple of s, say:
A_P = λ s.
Therefore
r = a + λ s, (i)
where λ is a parameter. This is a parametric vector equation for the line.
By equating corresponding components we have
x = a + λp, y = b + λ q, (ii)
and these are parametric cartesian equations.
Now eliminate the parameter between the equations (ii):
(x − a)/p = (y − b)/q.
This is a cartesian equation, which could be reduced to the standard form y = mx + c.
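The passage from the parametric form (ii) to the form y = mx + c can be sketched as follows. The function names are hypothetical, and the sketch assumes p ≠ 0, since a line parallel to the y axis has no equation of the form y = mx + c.

```python
def line_point(a, s, lam):
    """Point r = a + lam * s on the line through a with direction s."""
    return (a[0] + lam * s[0], a[1] + lam * s[1])

def slope_intercept(a, s):
    """Return (m, c) for the line through a with direction ratios s = (p, q)."""
    p, q = s
    if p == 0:
        raise ValueError("line is parallel to the y axis: x = constant")
    m = q / p
    return m, a[1] - m * a[0]

# The line of Example 10.10: through A : (0, 1) with direction ratios 1, 2.
m, c = slope_intercept((0, 1), (1, 2))
print(m, c)

# Every point given by the parametric form satisfies y = m x + c.
for lam in (-2.0, 0.5, 3.0):
    x, y = line_point((0, 1), (1, 2), lam)
    print(lam, abs(y - (m * x + c)) < 1e-12)
```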
Fig. 10.6
Now let A : (a1, a2, a3) be any point on the plane. It satisfies the equation (10.19).
Put a = a1î + a2 q + a3x. From (10.20), this means that
p·a = d.
This is like (10.18). Therefore (10.19) represents a plane, the plane passes through
the point with position vector a, and p is perpendicular to the plane:
to n.
(b) ax + by + cz = d always represents a plane.
(c) p = aî + bq + cx is perpendicular to the plane ax + by + cz = d. (10.22)
Example 10.13 Find the angle of intersection between the two planes
2x + 3y + 4z = 5 and 2x − 6y − 3z = 0.
perpendicular.
The planes are perpendicular if their normal vectors are perpendicular. Taking the
equations in order, by (10.22c) the vectors
p = 2î + 2q − x and q = 3î − 2q + 2x
are normal to the planes. Then
p ·q = 6 − 4 − 2 = 0,
so the planes are perpendicular.
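The same idea gives the angle between any two planes: it is found from the scalar product of their normals, read off from the coefficients as in (10.22c). The sketch below is illustrative and the function names are our own. Note that the angle between the normals lies between 0° and 180°; the acute angle between the planes is the smaller of this angle and its supplement.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def angle_between_normals_deg(p, q):
    """Angle in degrees (0 to 180) between the normal vectors p and q."""
    cos_theta = dot(p, q) / math.sqrt(dot(p, p) * dot(q, q))
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding
    return math.degrees(math.acos(cos_theta))

def acute_angle_between_planes_deg(p, q):
    theta = angle_between_normals_deg(p, q)
    return min(theta, 180.0 - theta)

# The perpendicular planes treated above: normals 2i + 2j - k and 3i - 2j + 2k.
print(angle_between_normals_deg((2, 2, -1), (3, -2, 2)))

# The planes of Example 10.13: 2x + 3y + 4z = 5 and 2x - 6y - 3z = 0.
print(acute_angle_between_planes_deg((2, 3, 4), (2, -6, -3)))
```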
Fig. 10.8 The distance from the origin O to a plane p·r = d.
Now let Q, position vector q, be any point, distance DQ from the plane. Move
the origin to Q, and let R denote the new general position vector measured from
Q. Since R = r − q, the new equation of the plane is p ·(R + q) = d, or p ·R = d − p · q.
Therefore d − p·q is to be put in place of d in (10.23).
Put aî + bq + cx = p. Then
(a) Distance D of the origin O from the plane:
D = |d|/|p| = |d|/√(a² + b² + c²).
(b) Distance DQ of a point Q, with position vector q, from the plane:
DQ = |p·q − d|/|p|. (10.24)
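The two formulae in (10.24) are a single rule, since the origin is the point with q = 0. A minimal sketch, with a hypothetical function name and a plane chosen only for illustration:

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def distance_to_plane(p, d, q=(0.0, 0.0, 0.0)):
    """Distance of the point q from the plane p . r = d (origin by default)."""
    return abs(dot(p, q) - d) / math.sqrt(dot(p, p))

# Illustrative plane 2x + 2y - z = 6, so p = (2, 2, -1) and |p| = 3.
p, d = (2, 2, -1), 6
print(distance_to_plane(p, d))              # distance of the origin
print(distance_to_plane(p, d, (1, 1, 1)))   # distance of the point (1, 1, 1)
```

For this plane the two distances are 2 and 1 respectively, as can be confirmed by hand from (10.24).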
Self-test 10.6
Self-test 10.7
(a) Show that the planes x + 2y + 3z = 1 and x + 2y + 3z = − 4 are parallel.
(b) State their perpendicular distances from the origin. (c) Decide whether
they lie on the same or opposite sides of the origin, and deduce the distance
between them.
Fig. 10.9
By the triangle rule
r = O_A + A_P = a + A_P.
A_P is always some multiple λ of s:
This is the cartesian equation of a straight line, since any line passes through some
point A and has some direction p, q, r. This expression is not unique, because
(a1, a2, a3) and p, q, r are not unique.
The equation (10.25) really consists of two simultaneous equations: for example,
the pair
(x − a1)/p = (y − a2)/q and (y − a2) /q = (z − a3)/r.
These are the equations of two planes, and the line is their line of intersection.
Self-test 10.8
(a) Obtain a cartesian equation for a straight line connecting the points
A : (1, 2, 1) and B : (3, −1, 2). (b) Obtain the equations of the line of
intersection of 2x + 3y − z = 1 and 3x + 2y + z = 1.
A zero force has zero effect. This gives the condition for equilibrium of a particle
under the influence of several forces F1, F2, … : that the resultant force F must be
zero, or
F = F1 + F2 + ··· = 0. (10.26)
then |F| must be zero, so F = 0. (One direction is not sufficient, since F might be
perpendicular to that direction.) This principle, of ‘resolving in two directions’, is
used frequently to solve problems. The following is a simple example.
Fig. 10.12
The arrows indicate provisional directions for the vectors R and F. The scalar quantities
R and F attached to the arrows stand for the unknown components of R and F in the
assumed directions, and these might not be positive numbers. This convention provides
a safety net, for suppose we have, say, guessed the direction of F wrongly, and that it
actually acts down the plane rather than up it. The mistake will do no harm, because F
will simply turn out to be a negative number in our answer. This is a conventional way
of lettering diagrams in mechanics.
It is easiest to resolve in the assumed directions of F and R:
in direction of F: 0 = F + mg cos(180° − 60°)
which is the same as
0 = F − mg cos 60° = F − (1/2)mg (i)
in direction of R: 0 = R + mg cos(180° − 30°)
which is the same as
0 = R − mg cos 30° = R − (√3/2)mg. (ii)
Therefore
F = (1/2)mg and R = (√3/2)mg.
(You would usually go straight for the commonsense way of writing the components
given by (i) and (ii), avoiding the cosines of large angles.)
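The resolution can be checked with a short calculation. The sketch below assumes a plane inclined at 30° to the horizontal, which is what the angles used in (i) and (ii) imply, with F acting up the plane and R along the normal; the function name, the choice of axes (x horizontal, y vertical) and the numerical values of m and g are our own.

```python
import math

def incline_forces(m, g, slope_deg=30.0):
    """Components F (up the plane) and R (along the normal) for equilibrium."""
    a = math.radians(slope_deg)
    up_plane = (math.cos(a), math.sin(a))     # assumed direction of F
    normal = (-math.sin(a), math.cos(a))      # assumed direction of R
    weight = (0.0, -m * g)
    # Resolve in each assumed direction: 0 = F + (weight . direction), etc.
    F = -(weight[0] * up_plane[0] + weight[1] * up_plane[1])
    R = -(weight[0] * normal[0] + weight[1] * normal[1])
    return F, R

F, R = incline_forces(m=2.0, g=10.0)
print(F, R)

# The resultant of the three forces should vanish, as (10.26) requires.
a = math.radians(30.0)
total_x = F * math.cos(a) - R * math.sin(a)
total_y = F * math.sin(a) + R * math.cos(a) - 2.0 * 10.0
print(abs(total_x) < 1e-9, abs(total_y) < 1e-9)
```

If the direction of F had been guessed down the plane instead, the same code would simply return a negative value, in line with the convention described above.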
In Fig. 10.13, S is a fixed point on an arc and P is any other point on it, with
position vector r = xî + yq + zx. A positive direction along the arc is indicated. P is
then determined by specifying a number s, where
|s| = arc-length SP,
and s is positive or negative according to whether P is on the positive or negative
side of S. The parameter s is a kind of coordinate for P, measured along the arc.
Indicate the dependence of r on s by writing r(s) (compare r(t) in Section 9.8).
Given a particular vector function r(s), the curve can in principle be reconstructed,
Fig. 10.13
Let Q be a point on the curve with position vector r + δr. Figure 10.13 shows
the vector P_Q = δr, where P has parameter value s and Q has parameter value
s + δs. According to (9.18), the vector dr /ds is tangential to the arc at P, and points
in the direction of increasing s. Also, in this case, when δs is small,
|δr| ≈ |δs|,
and so
|dr(s)/ds| = lim(δs→0) |δr/δs| = 1. (10.28)
Therefore, in the case when the parameter used is s, dr/ds is a unit tangent vector,
which we can write as M, pointing in the direction of increasing s.
Since M is a unit vector, M·M = 1. Therefore, by using the product rule to differentiate,
d/ds (M·M) = M·(dM/ds) + (dM/ds)·M = 0, so that M·(dM/ds) = 0.
Therefore, dM /ds is perpendicular to M.
Figure 10.14 shows a curve in the x,y plane. Draw a unit normal P_N = L to the
curve at P as in the diagrams of Fig. 10.14. As we walk along the curve in the
direction of M, the direction of L is towards the right. Since M and dM /ds are
perpendicular, dM/ds must be a certain multiple, κ say, of L:
dM/ds = κ L. (10.29)
Fig. 10.14 (a) κ > 0, the curve is concave viewed from the side of L. (b) κ < 0, the curve is convex
viewed from the side of L. (c) κ = 0, a point of inflection.
The three cases in Fig. 10.14 relate to the sign of κ. In Fig. 10.14a, the curve is
concave as viewed from the side of L, implying that if we make a small increase
in s, then δM points in the direction of L; therefore κ is positive. In Fig. 10.14b,
the curve is convex as viewed from the side of L; and in the same way it follows
that κ is negative. In the case of a point of inflection (Fig. 10.14c), κ is zero.
The number κ is called the curvature of the curve at P. The greater |κ| is, the
more sharply the curve is turning. The positive quantity ρ given by
ρ = 1/| κ |
is its radius of curvature at P. This is the radius of the circle that best fits the curve
at P. We will not prove this, but illustrate it in the following example.
Example 10.17 Obtain expressions for M and dM /ds for the case of a circle of
radius a with centre at the origin, and confirm that ρ = a (Fig. 10.15).
Fig. 10.15
dM/ds = −(1/a)L,
so κ = −1/a (consistent with a curve that is convex viewed from the side of the normal L).
Also
ρ = 1/| κ | = a,
the radius of the circle.
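The result ρ = a can also be confirmed numerically. The sketch below assumes the circle is described by r(s) = (a cos(s/a), a sin(s/a)), with s measured anticlockwise from the point S on the x axis; this parametrization and the function names are assumptions made for the illustration. The derivative dM/ds is approximated by a central difference, and only its magnitude |κ| is computed.

```python
import math

def unit_tangent(a, s):
    """dr/ds for the circle r(s) = (a cos(s/a), a sin(s/a))."""
    return (-math.sin(s / a), math.cos(s / a))

def curvature_magnitude(a, s, h=1e-5):
    """|dM/ds| estimated by a central difference with step h."""
    t1 = unit_tangent(a, s - h)
    t2 = unit_tangent(a, s + h)
    dm = ((t2[0] - t1[0]) / (2 * h), (t2[1] - t1[1]) / (2 * h))
    return math.hypot(dm[0], dm[1])

a = 2.5
kappa_abs = curvature_magnitude(a, s=1.0)
print(kappa_abs, 1.0 / kappa_abs)    # close to 1/a and to a respectively
```

The tangent has unit length for every s, which is the property (10.28) of a curve described in terms of its arc-length.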
Self-test 10.9
For a plane curve r = xî + y(x)q, y is a function of x, δs = √[(δx)² + (δy)²], and
M = dr/ds. (a) Supposing that the curve is given by y = x², write
dr/ds = (dr/dx)(dx/ds),
and show that M = (î + 2xq)/√(1 + 4x²).
(b) Obtain dM/ds, and deduce that ρ = (1/2)(1 + 4x²)^(3/2).
Problems
10.1 Obtain the scalar products of the pairs of vectors given in component form by:
(a) (2, 2, 1) and (3, 1, 2), (b) (2, −3, 2) and (−2, 3, −1), (c) (2, 2, −3) and (−1, 1, −2),
(d) (2, 3, 4) and (1, −2, 1), (e) (p − q, p + q, p) and (p + q, q, −p − q).
10.2 (Two dimensions). Obtain the scalar products of the pairs of vectors given in component
form by (a) (2, 3) and (3, 4), (b) (1, 0) and (0, 1), (c) (5, 6) and (0, −4), (d) (2, 3) and (3, −2).
10.3 Prove that |a + b|² + |a − b|² = 2(|a|² + |b|²). (Hint: see eqn (10.1c).) Sketch the vectors
a, b, a + b, a − b on one diagram in order to obtain a geometrical theorem from this result.
(There are two possible theorems, depending on what diagram you draw.)
10.4 Let a = (2, −3, 4) and b = (−1, −2, 3), or in the alternative form a = 2î − 3q + 4x and
b = −î − 2q + 3x. Evaluate a·b, (a) using the first form, (b) using the second form with (10.8).
10.5 Given that a = î + 2q − x and b = î + 3q + x, evaluate the following scalar products:
(a) a·b, (b) (a − b)·(a + b), (c) (a − b)·(a − b), (d) a·a + 2a·b + b·b, (e) (a·a)a − (b·b)b.
10.6 Find the angles, in the range 0° to 180°, between the pairs of vectors (a) î + q + x and
î + q, (b) î − q + x and î + q, (c) 2î − q + 3x and î + 3q + 2x.
10.7 (Two dimensions). Find the angle θ (0° ≤ θ ≤ 180°) between the pairs of vectors:
(a) 3î + 4q and 4î − 3q, (b) î − 2q and 2î − q, (c) î − 2q and −6î + 3q.
10.8 Find the angle between one of the edges of a cube and a diagonal line through one end.
10.9 A circular cone has its vertex at the origin and its axis in the direction of the unit vector â.
The half-angle at the vertex is α. Show that the position vector r of a general point on its surface
satisfies the equation
â·r = |r| cos α.
Obtain the cartesian equation when â = (2/7, −3/7, −6/7) and α = 60°.
10.10 A : (2, 2, −1), B : (0, 1, 1), C : (−1, 2, 0) are three points. Find the angles in the triangle ABC.
10.11 Confirm the fact that a·b = (1/4)(|a + b|² − |a − b|²). (Hint: it is easier to start with the
right-hand side.) Test the result using any two vectors. Deduce a simple geometrical theorem by
sketching a, b, a + b, a − b all on the same diagram: there are two theorems to be had, depending
on whether you think of the triangle or the parallelogram rule.
10.12 Show that the component of a vector F in the direction of another vector a is given by
F·a/|a|. Find the components of F = (8, 15, 9) in the directions of the three vectors a, b, c, where
a = (2, 3, 6), b = (0, 3, 4), and c = (2, 2, 1). Express F in the form F = λa + µb + νc, where λ, µ, ν
are constants.
10.13 Show that the vectors a = î + 3q + 4x and b = −2î + 6q − 4x are perpendicular. Obtain any
vector c = c1î + c2q + c3x which is perpendicular to a and b, and derive from it two unit vectors
(their senses will be opposite).
10.14 Let a = î + q − x and b = 2î − q + 2x. Find the angle (in the range 0° to 180°) between a and
b, and construct any vector perpendicular to a and b.
10.15 Find the value of λ such that the vectors (λ, 2, −1) and (1, 1, −3λ) are perpendicular.
10.16 Determine numbers α, β, γ which ensure that the vectors a = (α, 2, −3), b = (−1, 2β, 2), and
c = (2, 1, −3γ) are mutually perpendicular.
10.17 The points A : (1, 0, 0), B : (0, 1, 0), C : (0, 1, 1), and D : (0, y, z) are the vertices of a
tetrahedron. Find y and z such that ABD is an equilateral triangle and ‹ is a right angle.
10.18 (Change of axes in two dimensions). Oxy and OXY are two sets of right-handed axes with
the same origin O. OX is reached from Ox by an anticlockwise rotation of 45°. (a) Obtain the X, Y
coordinates of a point P whose coordinates in Oxy are (2, 2). (b) Find the values of x and y for the
point Q for which X = 1, Y = −1. (c) Find the equation of the circle (x − 1)² + y² = 1 in the axes OXY.
10.19 Find the lengths, and the direction cosines l, m, n, of the following vectors. (a) q,
(b) î + q + x, (c) î − 2q − 2x, (d) î − q + x, (e) î − q − x, (f) 2î + 3q + 6x, (g) î − 2q + 2x, (h) 3x,
(i) −3x.
10.20 (Change of axes). (a) Show that the vectors with components in Oxyz given by
X = (6, 15, 10)/19, Y = (15, −10, 6)/19, Z = (10, 6, −15)/19 are mutually perpendicular unit vectors.
(b) A sketch will show that [X, Y, Z] is a right-handed system, so it defines a new set of
right-handed axes OXYZ. Write down the change-of-axes matrices in (10.13a) and (10.13b).
(c) Find the coordinates of the point x = 1, y = 2, z = 2 in the new axes. (d) Express the equation
of the plane x + y + z = 0 in the new coordinates.
10.21 The following are sets of direction ratios p, q, r for a straight line. Obtain two possible sets
of direction cosines in each case. (a) 3, 4, 12; (b) 6, −10, 15.
10.22 A swarm of particles expands through all space. The velocity v(t) of the particle with
position vector r(t) at time t in a given set of axes is equal to f(t)r. Show that the rule is the same
when the velocity is measured relative to any given particle.
10.23 The angles made by a vector a and the positive directions of the axes Ox, Oz are 45° and
60° respectively. Find the angles that a may make with Oy.
10.24 The following are sets of direction ratios p, q, r for a straight line. Obtain two sets of
direction cosines, describing unit vectors parallel to the line, for each case. (a) 3, 4, 12;
(b) 6, −10, 15.
10.25 (a) Find any constant vector parallel to the line given parametrically by x = 1 − λ,
y = 2 + 3λ, z = 1 + λ. (Hint: see eqn (10.25).) (b) Find the equation of the plane which is
perpendicular to the line in (a) and which passes through the origin. (Hint: see eqn (10.22b).)
(c) Find the equation of the plane such that the line in (a) lies in the plane, and the plane passes
through the origin. (Hint: the new plane must be perpendicular to the plane in (b).)
10.26 Find the angle θ, in the range 0° ≤ θ ≤ 90°, between the pairs of planes given as follows:
directions of v and L respectively, so that F = Fs + Fn. Show that F = Fs + (F·L)L. (Hint: see
(10.8b).) Find Fs and Fn when F = î − 3q and the straight line is given by 2x − 3y = 1.
10.32 (Two dimensions). A mirror M stands upright on a table (sketch it as a straight line M
through the origin O in the (x, y) plane). v is a unit vector along M pointing away from O, and L is
the unit normal vector to M pointing to the left of v. (a) A ray of light in the plane, with direction
vector N, falls on the mirror and is reflected in the direction N1. By considering its vector components
through a plane screen which has the equation r·(1.1î + 1.1q + x) = 1. Q is a general point on an
object behind the screen, and its position vector is r = xî + yq + zx. Find the coordinates of the
apparent position of Q on the screen. (Hint: find the equation of the line EQ; then find where it
cuts the screen.)
10.39 An ellipse is given parametrically by r = îa cos t + qb sin t, where a and b are constants
and t is the parameter, with −π < t ≤ π (in radians). Show that δs² ≈ δx² + δy², where
s represents arc-length. Deduce that
ds/dt = (a² sin²t + b² cos²t)^(1/2).
Find the unit tangent vector, a unit normal, the curvature, and the radius of curvature at the points
where t = 0, π/4, and π/2.
represented by
r = xî + f(x)q
using x as the parameter. Show that the unit tangent vector to the curve is
M = dr/ds = (dr/dx)(dx/ds) = (î + f′q)/√(1 + f′²).
Show that the curvature κ of the curve at any point is given by
κ = −f″/(1 + f′²)^(3/2).
Find the curvature along
(a) the parabola y = x²;
(b) the cosine curve y = cos x.
11 Vector product
Vector product a × b
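(a) a × b = (a2b3 − a3b2)î − (a1b3 − a3b1)q + (a1b2 − a2b1)x,
which can be expressed as an expansion like a determinant (see (8.3)):
(b) \[ a × b = \begin{vmatrix} î & q & x \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = î \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} − q \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} + x \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}. \]
(11.1)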
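The component formula (11.1a) translates directly into code. The following sketch is an illustration; the function names are our own, vectors are held as tuples of three components, and the sample vectors are chosen arbitrarily.

```python
def cross(a, b):
    """Vector product a x b from the component formula (11.1a)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            -(a1 * b3 - a3 * b1),
            a1 * b2 - a2 * b1)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

a = (2, -1, 3)
b = (1, 4, -2)
c = cross(a, b)
print(c)
print(dot(c, a), dot(c, b))   # both zero: c is perpendicular to a and to b
print(cross(b, a))            # every component changes sign
```

For these vectors the product is (−10, 7, 9), and both scalar products are zero.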
In evaluating b × a, we exchange the a and b components in the expression (11.1a), so
the sign of each of the three bracketed terms changes. Therefore
b × a = −a × b = 10î + 11q − 3x.
(Alternatively, we interchange the last two rows in the determinant form (11.1b), which
changes its sign by Section 8.2, Rule 3.)
Algebraic properties of a × b
(a) a × b = −b × a (the vector product does not commute).
(b) a × (b + c) = a × b + a × c (distributive law).
(c) a × (λ b) = λ a × b where λ is any number.
(d) a × b = 0 if b and a are parallel: in particular, a × a = 0. (11.2)
Vector products of î, q, x
(a) î × q = x, q × x = î, x × î = q.
(b) q × î = −x, x × q = −î, î × x = − q. (11.3)
Notice that for the group in (11.3a), the cyclic order î, q, x, î, q, … is maintained,
and for the group in (11.3b) there is a different cyclic order q, î, x, q, î, … . To prove,
for example, that î × q = x, put î = (1, 0, 0) and q = (0, 1, 0) into the definition (11.1b)
(or into (11.1a) if you are not sure about determinants). Then we obtain
\[ î × q = \begin{vmatrix} î & q & x \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = 0î + 0q + 1x = x. \]
Self-test 11.1
Evaluate c = a × b, where a = 3î − q + 2x and b = î + q + x, and confirm that
c is perpendicular to both a and b. Find the magnitude of c.
Fig. 11.1 It will be shown later that the direction of p is that given in (a).
Fig. 11.2 Test for a right-handed system of vectors [a, b, c]. Viewed through the triangle, the
vertices A, B, C follow in anticlockwise order.
Fig. 11.3
It is essential to place V on the opposite side of the triangle ABC from Q, otherwise
the apparent direction of the circuit is reversed. In Fig. 11.1a, the system
[a, b, p] is right-handed, and in Fig. 11.1b, [a, b, p] is left-handed.
Returning to the cross product p = a × b, where the vectors all emerge from Q,
set up a special set of right-handed axes Qx, Qy, Qz, as in Fig. 11.3. The axes
satisfy the following conditions:
(i) Qx is in the direction of a.
(ii) Qy is in the plane of a and b, perpendicular to Qx. It is directed so that the
y component of b is positive.
(iii) The direction of Qz makes the axes right-handed.
The unit vectors are î, q, x. From the conditions (i) and (ii), with the usual
notation,
\[ p = \begin{vmatrix} î & q & x \\ a_1 & 0 & 0 \\ b_1 & b_2 & 0 \end{vmatrix} = a_1 b_2 x. \] (11.5)
Since, according to (i) and (ii), a1 and b2 are positive, p is in the direction of x, and
the test (11.4a) shows that
Therefore, Fig. 11.1a is the correct one, and Fig. 11.1b gives the direction of p
incorrectly.
Moreover (see Fig. 11.3),
b2 = |b | sin θ
(since 0° ≤ θ ≤ 180°, the sign of b2 is positive as required). Also
a1 = |a |.
Therefore, from (11.5),
p = a × b = x |a| |b| sin θ, (11.7)
Properties of p = a × b
(a) p is perpendicular to a and b, in the direction making
[a, b, p] right-handed.
(b) |p| = |a||b| sin θ, where θ is the angle in the range 0° to 180° between the
directions of a and b. (11.8)
Invariance of a × b
a × b is invariant with respect to changes from one right-handed set of axes to
another. (11.9)
Other invariants are the length and direction of a vector a, and therefore the
vector a itself: its components are different in different axes, but the physical vector
we are talking about does not change. The scalar product, a·b = a1b1 + a2b2 + a3b3,
is also invariant; that is to say, it has the same numerical value in any right-handed
axes: this value is equal to |a| | b | cos θ and so does not change.
Example 11.2 Let a = Q_A and b = Q_B be two vectors from Q, representing
two sides of a parallelogram. Show that the area of the parallelogram is equal
to |a × b|.
Complete the parallelogram as shown in Fig. 11.4. Construct a perpendicular BN onto
QA. Then
Area QACB = base QA × height BN = | a| | b| sin θ = | a × b| (by (11.8)).
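The result of Example 11.2 can be checked on a simple case by computing the area in two ways: from |a × b| and from |a||b| sin θ, with θ found from the scalar product. The vectors below are chosen only for illustration and the helper functions are our own.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

a = (3.0, 0.0, 0.0)     # base of length 3 along the x axis
b = (1.0, 2.0, 0.0)     # the height above the base is 2

area_from_cross = norm(cross(a, b))
cos_theta = dot(a, b) / (norm(a) * norm(b))
area_from_angle = norm(a) * norm(b) * math.sqrt(1.0 - cos_theta ** 2)
print(area_from_cross, area_from_angle)
```

Both values agree with base × height = 3 × 2 = 6.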
Example 11.3 Two planes have normals n1 and n2 respectively, and pass
through a point A with position vector a. Obtain a vector parametric equation
for their line of intersection.
Figure 11.5 shows the two planes and their line of intersection LM, which contains
the point A. The point P : (x, y, z) with position vector r is a general point on the line.
Let p be any vector parallel to LM. Then A_P is always a multiple of p, so
r − a = λp (i)
where λ is a parameter.
Fig. 11.5
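A vector p parallel to the line of intersection must be perpendicular to both normals, so one natural choice, assumed in the sketch below, is p = n1 × n2. The normals and the point A are hypothetical values chosen for illustration; the check confirms that every point r = a + λp lies in both planes, that is, n1·(r − a) = 0 and n2·(r − a) = 0.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

n1 = (1, 2, 3)          # hypothetical normal to the first plane
n2 = (1, -1, 0)         # hypothetical normal to the second plane
a = (1, 1, 1)           # hypothetical common point A

p = cross(n1, n2)       # assumed direction of the line LM
print(p)

for lam in (-2, 0, 1, 5):
    r = tuple(ai + lam * pi for ai, pi in zip(a, p))
    r_minus_a = tuple(ri - ai for ri, ai in zip(r, a))
    print(lam, dot(n1, r_minus_a), dot(n2, r_minus_a))
```

Any nonzero multiple of n1 × n2 would serve equally well as p; the choice only rescales the parameter λ.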
Self-test 11.2
Let a = 3î − 4 q and b = î + q + x. Confirm that the value of θ given by (11.7) is
compatible with the value of cos θ derived from the dot product a·b.
Fig. 11.6
height AN = QA cos θ = |a | cos θ.
Therefore
volume = | a||b × c | cos θ = |a ·(b × c)|
from (11.2).
Volume of a parallelepiped
If the adjacent sides at a vertex Q are Q_A = a, Q_B = b, Q_C = c, then
volume = | a·(b × c)|. (11.11)
Vectors are said to be coplanar if, when drawn from the same point, they lie in
the same plane. The condition for this is:
Coplanar vectors
Three nonzero vectors a, b, c at the same point are coplanar if, and only if,
a·(b × c) = 0, (that is, for a, b, c linearly dependent – see Section 8.2). (11.12)
(If they are not at the same point, then this is the condition that they should be
parallel to a common plane.) The result follows from (11.11): the volume of the
corresponding parallelepiped is zero.
Example 11.4 Show that the points A : (1, 2, 2), B : (3, 4, 5), C : (−1, 0, −1) lie on
a plane through the origin.
Suppose the three points A, B, C have position vectors a, b, c. To show a, b, c are
coplanar, evaluate a ·(b × c):
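a·(b × c) = (1, 2, 2)·[(3, 4, 5) × (−1, 0, −1)]
\[ = \begin{vmatrix} 1 & 2 & 2 \\ 3 & 4 & 5 \\ −1 & 0 & −1 \end{vmatrix} = 1 \begin{vmatrix} 4 & 5 \\ 0 & −1 \end{vmatrix} − 2 \begin{vmatrix} 3 & 5 \\ −1 & −1 \end{vmatrix} + 2 \begin{vmatrix} 3 & 4 \\ −1 & 0 \end{vmatrix} \]
= −4 − 2 × 2 + 2 × 4 = 0.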
Therefore, A, B, C, and O are all in the same plane, so the points A, B, C are on a plane
through the origin.
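The test (11.12) and the volume formula (11.11) both rest on the single quantity a·(b × c). A minimal sketch follows, with hypothetical function names; it repeats the calculation of Example 11.4 and, for contrast, evaluates the volume of a unit cube.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

def parallelepiped_volume(a, b, c):
    return abs(triple(a, b, c))

# Position vectors of A, B, C in Example 11.4.
print(triple((1, 2, 2), (3, 4, 5), (-1, 0, -1)))        # 0, so coplanar

# Edges of a unit cube along the axes.
print(parallelepiped_volume((1, 0, 0), (0, 1, 0), (0, 0, 1)))
```

With non-integer components the product will rarely be exactly zero, so in practice the coplanarity test would compare its magnitude with a small tolerance.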
Fig. 11.7
In Fig. 11.7, Q_P = R, and θ is the angle between F and R, with 0° ≤ θ ≤ 180°. Then
d = | R | sin θ,
so
M = | F||R | sin θ. (11.13)
Example 11.5 A force F = î − q + 2x acts at P : (1, 2, 1). Find its vector moment
M about the point Q : (2, 1, 1).
In these axes the position vectors of P and Q are
p = î + 2q + x, q = 2î + q + x,
so
R = Q_P = p − q = −î + q.
The moment M is given by
\[ M = R × F = \begin{vmatrix} î & q & x \\ −1 & 1 & 0 \\ 1 & −1 & 2 \end{vmatrix} = 2î + 2q. \]
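The calculation in Example 11.5 follows a fixed pattern: form R = p − q, then take the vector product with F. A short sketch, in which the function name `moment` is our own:

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def moment(force, p, q=(0, 0, 0)):
    """Vector moment R x F about the point q of a force acting at p."""
    r = tuple(pi - qi for pi, qi in zip(p, q))
    return cross(r, force)

# Example 11.5: F = (1, -1, 2) acting at P : (1, 2, 1), moment about Q : (2, 1, 1).
print(moment((1, -1, 2), (1, 2, 1), (2, 1, 1)))   # (2, 2, 0)
```

The order of the factors matters: F × R would give the moment with its sign reversed.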
Example 11.6 A force F = î − q (force units) acts at P : (1, 2, 0). Find its vector
moment about the origin O.
O, P, and F all lie in the (x, y) plane, so the physical problem is two-dimensional. The
Oz axis points towards you, out of the page, in Fig. 11.8.
The vector moment M is given by
\[ M = R × F = \begin{vmatrix} î & q & x \\ 1 & 2 & 0 \\ 1 & −1 & 0 \end{vmatrix} = −3x. \]
Thus M is parallel to Oz and its z component is −3. Figure 11.8 shows that the negative sign
corresponds to F having a clockwise influence on a wheel turning about the point O.
\[ M = R × F = \begin{vmatrix} î & q & x \\ a & b & 0 \\ F_1 & F_2 & 0 \end{vmatrix} = (F_2 a − F_1 b)x. \]
This situation is also (physically) two-dimensional; the z direction in Fig. 11.9 would only
be needed in order to display M. The expression illustrates the separate clockwise and
anticlockwise contributions respectively of F1 and F2.
Fig. 11.10 Moment of F about an axis parallel to s: v·(R′ × F) = v·(R × F).
Fig. 11.11
The expression (11.16) corresponds to what we should expect about the turning
effect of F about the given axis. There is no contribution from F3 because F3 x is
parallel to the axis of rotation, and F1 is zero in these axes. What remains is F2 q,
which is perpendicular to the axis of rotation Q′z, and d is the perpendicular
distance of F from it.
For this reason the scalar quantity M = v ·(R × F) is called the moment of
F about an axis of rotation AA′, as in Fig. 11.10. Dropping the dashed quantities,
the unit vector v is the direction AA′, and R = Q_P, where Q is any point on AA′ and
P any point on the line of action of F (M being independent of the choice of these
Self-test 11.3
Let ω be a constant vector, ω = ω î, and r = xî + yq + zx the position vector of
a point. Show that r × ω = (−xy + qz)ω, whose directions (projected on to the
y,z plane) are tangential to a family of circles, radii √(y2 + z2), traversed in the
anticlockwise direction.
Fig. 11.12
Remember that x × î = q, x × q = − î, and x × x = 0. In these axes,
w = a × (b × c)
= a3x × [(b2c3 − b3c2)î − (b1c3 − b3c1)q + (b1c2 − b2c1)x]
= a3(b2c3 − b3c2)q + a3(b1c3 − b3c1)î
= a3c3(b1î + b2 q ) − a3b3(c1î + c2 q).
The third components of b and c (b3x and c3x) are missing in the brackets: to
make them appear, add to the right-hand side the term
Self-test 11.4
Express (a) a × (b × c) and (b) c × (b × a) in the form (11.18), for a = î + q + x,
b = î + 2q + x, c = 3î + q + x.
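The calculation in the special axes above expresses a × (b × c) through the scalar products a·c and a·b; in general axes this is the standard identity a × (b × c) = (a·c)b − (a·b)c. The following sketch checks the identity numerically for arbitrarily chosen vectors (not those of the self-test); the function names are our own.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def triple_vector_product(a, b, c):
    """a x (b x c), evaluated directly."""
    return cross(a, cross(b, c))

def expansion(a, b, c):
    """(a . c) b - (a . b) c, the expanded form."""
    ac, ab = dot(a, c), dot(a, b)
    return tuple(ac * bi - ab * ci for bi, ci in zip(b, c))

a, b, c = (1, 2, 3), (-1, 0, 2), (4, 1, -2)
print(triple_vector_product(a, b, c))
print(expansion(a, b, c))

# The brackets matter: (a x b) x c is in general a different vector.
print(cross(cross(a, b), c))
```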
Problems
11.1 In component form let a = (1, −2, 2), b = (3, perpendicular to the vectors b and c, and passes
−1, −1), and c = (−1, 0, −1). Evaluate the following: through the point with position vector a.
(a) a × b (b) b × a (b) Obtain the equation to the line when
(c) a × a (d) a·(b × c) a = î + 2q + x, b = î − q, and c = q + x.
(e) c ·(a × b) (f) b ·(a × c)
(g) (a × b)·b (h) a × (a × b) 11.4 Show that the vector a × u, where
(i) (c × b) × a. a = (a1, a2, a3) and u is any vector, is parallel
to the plane a1x + a2 y + a3z = d. Obtain two
11.2 Given two planes, r · n1 = d1, r · n2 = d2, show vectors parallel to the plane 2x − 3y − z = 1.
that the plane through the origin perpendicular to
their line of intersection is given by r ·(n1 × n2) = 0. 11.5 Under what conditions will a × b = 0?
11.3 (a) Use the vector product to obtain a vector 11.6 Show that the vectors a = 2î + 3q + 6x and
parametric equation for the straight line which is b = 6î + 2q − 3x are perpendicular. Find a vector c
257
which is perpendicular to b and c and such that as Cramer’s rule (see Section 12.1), for solving any
[a, b, c] is a right-handed set. three simultaneous equations provided D ≠ 0.
PROBLEMS
11.7 (a) The vertices of a triangle are A, B, C, with 11.12 (a) Show that if three vectors a, b, c are
position vectors a, b, c. Show that the area of the non-coplanar and v is any vector, then constants
triangle ABC is given by --12 |b × c + c × a + a × b|. X, Y, Z can be found such that v = Xb × c + Yc × a
(Hint: see Example 11.2.) (b) A second triangle has + Za × b. (Hint: start by forming a· v from this
vertices at a + λ(b − c), b, c, where λ is a scalar. expression.) (b) Find X, Y, Z if v = 2î + q − 2x,
Show that the areas of the two triangles are a = î − q, b = î + 2q, c = q − 2x.
the same. What simple geometrical result does
the equality exhibit? (c) Find the area of the 11.13 The equations r = a + λ u and r = b + µ v,
triangle whose vertices are at î − 2q − x, î − q + 2x, where λ and µ are parameters, represent two
î + 2q − x. skew lines L1 and L2 (straight lines which do not
intersect). (a) Write down a vector w which is
11.8 A, B, C are three points which do not lie on perpendicular to both L1 and L2. (b) Show that
a straight line, and D is another point. Put A_B = b, values of λ, µ, and ν can be found so that
A_C = c, and A_D = d. Show that the distance of D (a + λ u) + ν w = (b + µ v),
from the plane passing through A, B, C is equal to
|d·(b × c)|/|b × c|.
11.9 Show that, if QA, QB, QC are adjacent edges
of a rectangular parallelepiped with coordinates
Q : (x0, y0, z0), A : (x1, y1, z1),
B : (x2, y2, z2), C : (x3, y3, z3),
then its volume is given by the modulus of the
determinant
\[
\begin{vmatrix}
x_1 - x_0 & x_2 - x_0 & x_3 - x_0 \\
y_1 - y_0 & y_2 - y_0 & y_3 - y_0 \\
z_1 - z_0 & z_2 - z_0 & z_3 - z_0
\end{vmatrix}.
\]
11.10 (Oblique coordinates). (a) Let a, b, c be
three non-coplanar vectors, and v be any vector.
Show that v can be expressed as
v = Xa + Yb + Zc,
where X, Y, Z are constants given by
X = v·(b × c)/D,
Y = v·(c × a)/D,
Z = v·(a × b)/D,
where
D = a·(b × c).
(Hint: start by forming, say, v·(a × b). Equation
(11.10d) gets rid of two terms.) (b) Check the
formulae for the case a = (1, 1, 0), b = (0, 1, 1),
c = (1, 0, 1), v = (1, 1, 1), by solving the three
equations obtained by splitting the vector
equation into components.
11.11 (Cramer’s rule). In Problem 11.10, write
the vector equation v = Xa + Yb + Zc in the form
of three simultaneous equations involving the
components of a, b, c. Now write the formulae
for X, Y, Z in determinant form. This is known
and explain why this implies that there actually
exists a straight line L3 which joins L1 and L2 and is
perpendicular to both. (c) For the case when a = −î,
u = k̂, b = î − ĵ, v = î + ĵ + k̂, find the values of λ, µ, ν.
Deduce the points where L3 meets L1 and L2. Find
an equation for L3, and the perpendicular distance
between L1 and L2.
11.14 Find the vector moments M of the given
forces F acting at the points P as specified. Make
sketches, indicating the direction of M.
(a) F = (2, 0, 0) at P : (0, 3, 0). Find M about the
origin O.
(b) F = (2, 0, 0) at P : (0, 3, 0). Find M about
Q : (0, 0, 3).
(c) F = (2, 0, 0) at P : (0, −3, 0). Find M about
Q : (0, 0, 3).
11.15 A force F of magnitude 4 acts at the
point (1, −1, 2) in the direction of î − 2ĵ − 2k̂.
Find the vector moment M of F, (a) about the
origin, (b) about the point (−2, 1, 2). (c) Find its
component about the y axis, taken in the direction
of ĵ (i.e. v = ĵ, in the text).
11.16 Find the moment M about the axes
specified, where the force is F = (2, 0, 0) acting at
P : (0, 3, 0). (Note that the sense of the axis needs
to be specified. If the sense is reversed, then the
sign of v·(R × F) changes.) (a) The z axis, taken
in the positive direction. (b) The z axis, taken
in the negative direction. (c) The x axis, in the
positive direction. (d) The y axis, in the positive
direction. (e) The axis through the origin,
direction v = (1/√3, 1/√3, 1/√3).
11.17 Find the magnitude of the moment M of
the force F = (1, 1, 2) acting at P : (2, −3, 1), about
the axis AB, where A : (2, 3, 2) and B : (1, 1, 1).
Verify directly that the component of F in
moment M about the axis is a maximum when v
is perpendicular to the plane containing r and F.
(Hint: remember a·b = |a||b| cos θ in the usual
notation.) Under what conditions is |M| a
minimum, and what is its value?
11.19 A rigid lamina in the (x, y) plane rotates at
ω radians per second about the z axis in axes Oxyz,
in the manner of a wheel on an axle. (a) Show that
if r and θ are the polar coordinates of any point P,
then the velocity of P is given by v = −îω r sin θ +
ĵω r cos θ.
(b) Show that v may be written v = ω × r, where
ω = ω k̂ (ω is called the angular velocity vector in
two dimensions).
(c) Choose any point Q which travels round with
the lamina, and let QXYZ be another set of axes
which remain parallel to Oxyz. Show that, viewed
relative to QXYZ, any point P has velocity V given
by V = ω × R, where R is its position vector in
QXYZ.
Find the matrix S such that v = Sω. Show that
|v|² = ωᵀSᵀSω, and that
in either of the forms v = pc + qd or v = ma + nb.
Justify this expectation geometrically, then obtain
the constants by using eqn (11.18).
11.23 Prove that
a × (b × c) + b × (c × a) + c × (a × b) = 0.
11.24 (a) Find a vector which is perpendicular to
n and in the plane of n and b, where n and b are
any two vectors. (b) Show that the straight line
r = b + µ n × [(a − b) × n], where µ is a parameter,
passes through the point with position vector b,
and meets the straight line given parametrically
by r = a + λ n in a right angle.
11.25 You are given two planes, r · n1 = d1, r · n2 = d2.
Show that the point on their line of intersection
that is closest to the origin has the position vector
α(n1 × n2) × n1 + β(n1 × n2) × n2
where α and β are certain constants. Obtain a
formula for the constants.
You are likely to have encountered a method for solving two simultaneous linear
equations for two unknowns, x and y, as in the case of the equations
2x + 3y = −1, x − 2y = 3.
To solve these equations, use the method of elimination. To eliminate y, multiply
the first equation by 2 and the second by 3, and add the results. The y terms
cancel, and we are left with 7x = 7, so x = 1. By substituting this value back in, say,
the first equation, we have 2 + 3y = −1, so y = −1. You might have been led to expect
that a similar process will always lead to a single, definite or unique solution for
x and y, no matter what pair of equations is presented. But in fact a surprising
variety of eccentricities can occur.
Suppose the equations are
x + y = 2, 2x + 2y = 1.
These two equations are contradictory, or incompatible – there is no solution –
we cannot reconcile the two statements, x + y = 2 and 2(x + y) = 1. But if the two
equations happen to be equivalent, meaning effectively identical, say x + y = 2 and
2x + 2y = 4, they reduce to the single equation x + y = 2 for the two unknowns,
x and y. Therefore there is an infinity of solutions, in this case x = c, y = 2 − c for
every value of c.
Now suppose the numbers on the right-hand sides of a pair of equations are
both zero, 2x + 3y = 0, x − 2y = 0 say (the pair is then said to be homogeneous).
Such a pair is always compatible, but there are still two possibilities. If the
equations are equivalent, as in the pair x + y = 0, 2x + 2y = 0, there is really only
one equation for the two unknowns, and they have an infinity of solutions (in
this case given by x = c, y = −c for every value of c). If they are not equivalent, for
LINEAR ALGEBRAIC EQUATIONS
Consider the general system with two equations and two unknowns:
a11x1 + a12 x2 = d1,
a21x1 + a22 x2 = d2.
Elimination leads to the solution
\[
x_1 = \frac{d_1 a_{22} - d_2 a_{12}}{a_{11}a_{22} - a_{12}a_{21}}, \qquad
x_2 = \frac{d_2 a_{11} - d_1 a_{21}}{a_{11}a_{22} - a_{12}a_{21}},
\]
provided that the denominator a11a22 − a12a21 is not zero. From Section 8.1 in
the chapter on determinants, these ratios can be recognized as ratios of
determinants:
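\[
x_1 = \frac{\begin{vmatrix} d_1 & a_{12} \\ d_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}, \qquad
x_2 = \frac{\begin{vmatrix} a_{11} & d_1 \\ a_{21} & d_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}.
\]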
This formula is known as Cramer’s rule. As we shall see later, if the denominator
is zero then the two equations can have no solutions or an infinity of solutions.
(These possibilities occur when the vectors [a11, a12] and [a21, a22] are linearly
dependent – see Section 8.2.)
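The two-equation formula can be tried out numerically. The following is a minimal sketch, assuming integer coefficients and using exact fractions; the function name cramer_2x2 is illustrative and not part of the text. It is applied to the introductory pair 2x + 3y = −1, x − 2y = 3.

```python
from fractions import Fraction

def cramer_2x2(a11, a12, a21, a22, d1, d2):
    """Solve a11*x1 + a12*x2 = d1, a21*x1 + a22*x2 = d2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21          # the common denominator
    if det == 0:
        # Zero denominator: either no solution or an infinity of solutions.
        raise ValueError("determinant is zero")
    x1 = Fraction(d1 * a22 - d2 * a12, det)
    x2 = Fraction(d2 * a11 - d1 * a21, det)
    return x1, x2

# Introductory example: 2x + 3y = -1, x - 2y = 3
solution = cramer_2x2(2, 3, 1, -2, -1, 3)
print(solution)
```

The result agrees with the values x = 1, y = −1 found by elimination.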
Elimination can be applied to equations with more unknowns as the following
example illustrates.
Eliminate x1 between (i) and (ii), and between (ii) and (iii). Thus 2(i) − (ii) gives
−5x2 + 3x3 = −7, (iv)
whilst (ii) − 2(iii) gives
−5x2 − 5x3 = −15. (v)
CRAMER’S RULE
Now eliminate x2 between (iv) and (v) by subtraction:
8x3 = 8, or x3 = 1.
Rather than eliminate again to find the other unknowns, we can substitute back x3 = 1
in (iv), say, so that
−5x2 + 3 = −7, or x2 = 2.
Finally substitute x2 = 2 and x3 = 1 into (i):
x1 − 4 + 1 = −4, or x1 = −1.
Hence the full solution is x1 = −1, x2 = 2, x3 = 1.
For more unknowns and equations, this approach becomes increasingly laborious
and prone to errors.
A matrix equation
Ax = d (12.1)
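\[
A = \begin{bmatrix} 1 & 2 & 1 \\ -2 & 3 & -1 \\ 1 & 4 & -2 \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad
d = \begin{bmatrix} 1 \\ -7 \\ -7 \end{bmatrix}
\]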
is
x1 + 2x2 + x3 = 1, (12.2a)
Consider now the case in which A is an arbitrary square matrix. If the inverse
of A exists, then multiplication of (12.1) on the left by A−1 leads to the solution
vector
\[
x = A^{-1}d = \frac{\operatorname{adj} A}{\det A}\, d,
\]
using the formula for the inverse given in Equation (8.5c): adj A is the adjoint of
A. Let n = 3; then, for our standard matrix,
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix},
\]
we have
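where C11, C12, … are the cofactors of a11, a12, … (see Section 8.1). Thus,
comparison of elements in the vectors leads to
\[
x_1 = \frac{1}{\det A}(C_{11}d_1 + C_{21}d_2 + C_{31}d_3)
    = \frac{1}{\det A}\begin{vmatrix} d_1 & a_{12} & a_{13} \\ d_2 & a_{22} & a_{23} \\ d_3 & a_{32} & a_{33} \end{vmatrix},
\]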
x1 + 2x2 + x3 = 1,
Step 2. We now proceed to eliminate x2 from (12.3c) using a multiple −2/7 of the
new row 2. Hence
x1 + 2x2 + x3 = 1,
7x2 + x3 = −5,
−(23/7)x3 = −46/7 (r3′ = r3 − (2/7)r2).
Step 3. Using Rule (i) above, reduce the coefficients of x2 and x3 in the second
and third equations above to 1:
x1 + 2x2 + x3 = 1,
x2 + (1/7)x3 = −5/7 (r2′ = (1/7)r2),
x3 = 2 (r3′ = −(7/23)r3).
Step 4. Starting from the third equation, we can now solve the equations by
back substitution. Since x3 = 2, the second equation then gives
x2 = −5/7 − (1/7)x3 = −5/7 − (1/7) × 2 = −1,
and from the first equation,
x1 = 1 − 2x2 − x3 = 1 + 2 − 2 = 1.
Thus the solution is
x1 = 1, x2 = −1, x3 = 2.
The method is known as Gaussian elimination.
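The elimination and back-substitution steps can be sketched in code. This is an illustrative implementation only, assuming a square system with a unique solution and exact rational arithmetic; the name gauss_solve is not from the text. A row interchange is made whenever a zero appears in the pivot position.

```python
from fractions import Fraction

def gauss_solve(aug):
    """Solve a square system given as an augmented matrix.

    Forward elimination to echelon form, then back substitution.
    Assumes the system has a unique solution.
    """
    n = len(aug)
    m = [[Fraction(v) for v in row] for row in aug]
    for col in range(n):
        # First row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Clear the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    # Back substitution, starting from the last equation.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

# The system solved in Steps 1 to 4 above.
x = gauss_solve([[1, 2, 1, 1], [-2, 3, -1, -7], [1, 4, -2, -7]])
print(x)
```

The printed values are x1 = 1, x2 = −1, x3 = 2, in agreement with the hand calculation.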
In fact, we need not write down the equations for x1, x2, x3 at each stage, since
all the information in (12.3) is given by the 3 × 4 matrix
\[
\begin{bmatrix} 1 & 2 & 1 & 1 \\ -2 & 3 & -1 & -7 \\ 1 & 4 & -2 & -7 \end{bmatrix},
\]
which is known as the augmented matrix for the system of equations: the fourth
column consists of the constants on the right-hand sides of (12.3a,b,c). The
elementary operations referred to previously correspond to elementary row operations
on the matrix. We can reproduce the steps above by the following more compact
procedure:
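\[
\begin{aligned}
\begin{bmatrix} \underline{1} & 2 & 1 & 1 \\ -2 & 3 & -1 & -7 \\ 1 & 4 & -2 & -7 \end{bmatrix}
&\to \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & \underline{7} & 1 & -5 \\ 0 & 2 & -3 & -8 \end{bmatrix}
&& (r_2' = r_2 + 2r_1,\ r_3' = r_3 - r_1) \\
&\to \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 7 & 1 & -5 \\ 0 & 0 & -\tfrac{23}{7} & -\tfrac{46}{7} \end{bmatrix}
&& (r_3' = r_3 - \tfrac{2}{7} r_2) \\
&\to \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & \tfrac{1}{7} & -\tfrac{5}{7} \\ 0 & 0 & 1 & 2 \end{bmatrix}
&& (r_2' = \tfrac{1}{7} r_2,\ r_3' = -\tfrac{7}{23} r_3),
\end{aligned}
\]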
where the arrow ‘→’ means ‘is transformed into’. The final matrix is said to be in
echelon form: that is, it has zeros below the diagonal elements starting from the
top left. We can now solve the equations by back substitution as before.
The elements underlined are known as pivots and they must be nonzero.
They are used to clear the elements in the column below them. If any pivot
turns out to be zero as the method progresses, then that equation or row is
replaced by the first row below which has a nonzero coefficient in the column.
If there are no further nonzero coefficients, then the pivot moves across to the
next column.
It is now possible to complete the Gaussian elimination by using further row
operations on the echelon matrix rather than back substitution. Thus, continuing
from the echelon form above
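\[
\begin{aligned}
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & \tfrac{1}{7} & -\tfrac{5}{7} \\ 0 & 0 & \underline{1} & 2 \end{bmatrix}
&\to \begin{bmatrix} 1 & 2 & 0 & -1 \\ 0 & \underline{1} & 0 & -1 \\ 0 & 0 & 1 & 2 \end{bmatrix}
&& (r_1' = r_1 - r_3,\ r_2' = r_2 - \tfrac{1}{7} r_3) \\
&\to \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 2 \end{bmatrix}
&& (r_1' = r_1 - 2r_2),
\end{aligned}
\]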
where the pivots are underlined again. The final matrix now represents the solution
set x1 = 1, x2 = −1, x3 = 2.
Example 12.2 Using Gaussian elimination and back substitution, solve the set
of equations
x1 + x2 + 2x3 = 4,
2x1 + 2x2 + x3 − x4 = −1,
x2 + x 3 + x 4 = 6,
x2 − x3 + 2x4 = 5.
We first perform the pivotal row operations on the augmented matrix as follows:
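\[
\begin{aligned}
\begin{bmatrix} 1 & 1 & 2 & 0 & 4 \\ 2 & 2 & 1 & -1 & -1 \\ 0 & 1 & 1 & 1 & 6 \\ 0 & 1 & -1 & 2 & 5 \end{bmatrix}
&\to \begin{bmatrix} 1 & 1 & 2 & 0 & 4 \\ 0 & 0 & -3 & -1 & -9 \\ 0 & 1 & 1 & 1 & 6 \\ 0 & 1 & -1 & 2 & 5 \end{bmatrix}
&& (r_2' = r_2 - 2r_1) \\
&\to \begin{bmatrix} 1 & 1 & 2 & 0 & 4 \\ 0 & 1 & 1 & 1 & 6 \\ 0 & 0 & -3 & -1 & -9 \\ 0 & 0 & -2 & 1 & -1 \end{bmatrix}
&& (r_4' = r_4 - r_2) \\
&\to \begin{bmatrix} 1 & 1 & 2 & 0 & 4 \\ 0 & 1 & 1 & 1 & 6 \\ 0 & 0 & -3 & -1 & -9 \\ 0 & 0 & 0 & \tfrac{5}{3} & 5 \end{bmatrix}
&& (r_4' = r_4 - \tfrac{2}{3} r_3) \\
&\to \begin{bmatrix} 1 & 1 & 2 & 0 & 4 \\ 0 & 1 & 1 & 1 & 6 \\ 0 & 0 & 1 & \tfrac{1}{3} & 3 \\ 0 & 0 & 0 & 1 & 3 \end{bmatrix}
&& (r_3' = -\tfrac{1}{3} r_3,\ r_4' = \tfrac{3}{5} r_4).
\end{aligned}
\]
(Note the row change r2 ↔ r3 at step 2 because of the zero pivot.) Back substitution now gives
x4 = 3, x3 = 3 − (1/3)x4 = 2, x2 = 6 − x3 − x4 = 1, x1 = 4 − x2 − 2x3 = −1.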
Self-test 12.1
Using elementary row operations, solve the equations
x1 + 2x2 − x3 = 1
2x1 − x2 + 3x3 = −3
−x1 + x2 − x3 = −1.
Matrix inversion
Use elementary row operations to transform A into the identity I, and use the
same operations to transform I into A−1. (12.5)
Suppose that we require the inverse of
\[
A = \begin{bmatrix} 0 & 1 & 0 & 2 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 2 & 0 \end{bmatrix}.
\]
We reduce A to I4 and perform the same row operations on I4. Thus, we can write
down the steps in parallel as follows:
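\[
\begin{aligned}
[A \mid I_4] &= \left[\begin{array}{cccc|cccc}
0 & 1 & 0 & 2 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 2 & 0 & 0 & 0 & 0 & 1 \end{array}\right] \\
&\to \left[\begin{array}{cccc|cccc}
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 2 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 2 & 0 & 0 & 0 & 0 & 1 \end{array}\right] && (r_1 \leftrightarrow r_2) \\
&\to \left[\begin{array}{cccc|cccc}
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 2 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 1 \end{array}\right] && (r_4' = r_4 - r_1) \\
&\to \left[\begin{array}{cccc|cccc}
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & -1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 1 \end{array}\right] && (r_3' = r_3 - r_2) \\
&\to \left[\begin{array}{cccc|cccc}
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 1 \\
0 & 0 & 0 & -1 & -1 & 0 & 1 & 0 \end{array}\right] && (r_3 \leftrightarrow r_4) \\
&\to \left[\begin{array}{cccc|cccc}
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 & -1 & 0 \end{array}\right] && (r_4' = -r_4) \\
&\to \left[\begin{array}{cccc|cccc}
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & -1 & 0 & 2 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 & -1 & 0 \end{array}\right] && (r_2' = r_2 - 2r_4) \\
&\to \left[\begin{array}{cccc|cccc}
1 & 0 & 0 & 0 & 0 & 2 & 0 & -1 \\
0 & 1 & 0 & 0 & -1 & 0 & 2 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 & -1 & 0 \end{array}\right] && (r_1' = r_1 - r_3) \\
&= [I_4 \mid A^{-1}].
\end{aligned}
\]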
We conclude that
\[
A^{-1} = \begin{bmatrix} 0 & 2 & 0 & -1 \\ -1 & 0 & 2 & 0 \\ 0 & -1 & 0 & 1 \\ 1 & 0 & -1 & 0 \end{bmatrix}.
\]
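Rule (12.5) translates directly into a short routine. The sketch below is an illustration under the assumption of exact rational entries; the name invert is not from the text. It carries A and the identity side by side, scales each pivot to 1, and clears the rest of the pivot column, so the order of the row operations differs slightly from the hand calculation although the result is the same.

```python
from fractions import Fraction

def invert(a):
    """Invert a square matrix by reducing [A | I] to [I | A^-1]."""
    n = len(a)
    # Augmented block [A | I] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]          # row interchange
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]         # make the pivot equal to 1
        for r in range(n):
            if r != col and m[r][col] != 0:          # clear the rest of the column
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

A = [[0, 1, 0, 2], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 2, 0]]
A_inv = invert(A)
for row in A_inv:
    print([int(v) for v in row])
```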
Self-test 12.2
Using Gaussian elimination, find the inverse of
\[
A = \begin{bmatrix} 1 & 2 & -1 & 0 \\ 0 & -1 & 3 & 2 \\ -1 & 1 & -1 & 0 \\ 1 & -4 & 2 & -1 \end{bmatrix}.
\]
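\[
\begin{vmatrix} 1 & 1 & -1 \\ 3 & -1 & 3 \\ 1 & -1 & 2 \end{vmatrix}
= \begin{vmatrix} 1 & 1 & -1 \\ 4 & 0 & 2 \\ 2 & 0 & 1 \end{vmatrix}
\quad (r_2' = r_2 + r_1,\ r_3' = r_3 + r_1)
\quad = -\begin{vmatrix} 4 & 2 \\ 2 & 1 \end{vmatrix} = 0.
\]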
Thus Cramer’s rule will fail, although there still may be solutions. We can determine
whether solutions exist more readily by using Gaussian elimination. In this case
\[
\begin{aligned}
\begin{bmatrix} 1 & 1 & -1 & 3 \\ 3 & -1 & 3 & 5 \\ 1 & -1 & 2 & 2 \end{bmatrix}
&\to \begin{bmatrix} 1 & 1 & -1 & 3 \\ 0 & -4 & 6 & -4 \\ 0 & -2 & 3 & -1 \end{bmatrix}
&& (r_2' = r_2 - 3r_1,\ r_3' = r_3 - r_1) \\
&\to \begin{bmatrix} 1 & 1 & -1 & 3 \\ 0 & -4 & 6 & -4 \\ 0 & 0 & 0 & 1 \end{bmatrix}
&& (r_3' = r_3 - \tfrac{1}{2} r_2),
\end{aligned}
\]
which is the echelon form for this set of equations. However, row 3 reads 0 = 1, which
cannot be satisfied. Hence these equations have no solutions.
On the other hand, consider the following set:
x + y − z = 1,
3x − y + 3z = 5,
x − y + 2z = 2
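(this is the previous set with one change to the first equation). Gaussian elimination
now gives
\[
\begin{aligned}
\begin{bmatrix} 1 & 1 & -1 & 1 \\ 3 & -1 & 3 & 5 \\ 1 & -1 & 2 & 2 \end{bmatrix}
&\to \begin{bmatrix} 1 & 1 & -1 & 1 \\ 0 & -4 & 6 & 2 \\ 0 & -2 & 3 & 1 \end{bmatrix}
&& (r_2' = r_2 - 3r_1,\ r_3' = r_3 - r_1) \\
&\to \begin{bmatrix} 1 & 1 & -1 & 1 \\ 0 & -4 & 6 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}
&& (r_3' = r_3 - \tfrac{1}{2} r_2).
\end{aligned}
\]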
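\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} \tfrac{3}{2} - \tfrac{1}{2}\lambda \\ -\tfrac{1}{2} + \tfrac{3}{2}\lambda \\ \lambda \end{bmatrix}
\]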
for any value of λ . It can be seen in this case that there exists an infinite number of
solutions, a different one for each different value of λ .
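The three possible outcomes can be read off mechanically from the echelon form. The following sketch is illustrative only (the name classify and the returned labels are assumptions, not the text's notation): a row reading 0 = nonzero signals inconsistency, and otherwise the number of pivots is compared with the number of unknowns.

```python
from fractions import Fraction

def classify(aug):
    """Return 'unique', 'none' or 'infinite' for an augmented matrix."""
    m = [[Fraction(v) for v in row] for row in aug]
    rows, unknowns = len(m), len(m[0]) - 1
    pivots = 0
    for col in range(unknowns):
        pivot = next((r for r in range(pivots, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue                     # no pivot here: move across one column
        m[pivots], m[pivot] = m[pivot], m[pivots]
        for r in range(pivots + 1, rows):
            factor = m[r][col] / m[pivots][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[pivots])]
        pivots += 1
    # A row of the form 0 = nonzero cannot be satisfied.
    if any(all(v == 0 for v in row[:-1]) and row[-1] != 0 for row in m):
        return "none"
    return "unique" if pivots == unknowns else "infinite"

first = classify([[1, 1, -1, 3], [3, -1, 3, 5], [1, -1, 2, 2]])
second = classify([[1, 1, -1, 1], [3, -1, 3, 5], [1, -1, 2, 2]])
print(first, second)
```

The two calls correspond to the inconsistent set and to the modified set with the one-parameter family of solutions.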
Geometrically, in three dimensions, it can be seen why equations can have a
unique solution, no solution, or an infinite set of solutions. Any equation such as
ax + by + cz = d
represents a plane in ℝ³. Three equations represent three planes, and we need only
visualize how they might intersect or not. The coordinates of any point of
intersection of the planes are a solution of the equations. The diagrams in Fig. 12.1
show how three planes can intersect in a single point, no point, or a line of points.
Fig. 12.1 (a) Exactly one solution at P. (b) Infinite number of solutions, lying on the straight line
MN. (c) and (d) are examples of cases with no solutions.
Example 12.3 Determine the complete sets of values for a and b which make
the equations
x − 2y + 3z = 2,
2x − y + 2z = 3,
x + y + az = b
have (i) a unique solution, (ii) no solutions, (iii) an infinite set of solutions.
Reduce the augmented matrix to echelon form using pivots to clear each column successively:
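\[
\begin{aligned}
\begin{bmatrix} 1 & -2 & 3 & 2 \\ 2 & -1 & 2 & 3 \\ 1 & 1 & a & b \end{bmatrix}
&\to \begin{bmatrix} 1 & -2 & 3 & 2 \\ 0 & 3 & -4 & -1 \\ 0 & 3 & a - 3 & b - 2 \end{bmatrix}
&& (r_2' = r_2 - 2r_1,\ r_3' = r_3 - r_1) \\
&\to \begin{bmatrix} 1 & -2 & 3 & 2 \\ 0 & 3 & -4 & -1 \\ 0 & 0 & a + 1 & b - 1 \end{bmatrix}
&& (r_3' = r_3 - r_2).
\end{aligned}
\]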
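\[
\begin{aligned}
\begin{bmatrix} 1 & 1 & -1 & 1 \\ 3 & -1 & 3 & 5 \\ 1 & -1 & 2 & 2 \\ 1 & 0 & 1 & 3 \end{bmatrix}
&\to \begin{bmatrix} 1 & 1 & -1 & 1 \\ 0 & -4 & 6 & 2 \\ 0 & -2 & 3 & 1 \\ 0 & -1 & 2 & 2 \end{bmatrix}
&& (r_2' = r_2 - 3r_1,\ r_3' = r_3 - r_1,\ r_4' = r_4 - r_1) \\
&\to \begin{bmatrix} 1 & 1 & -1 & 1 \\ 0 & -4 & 6 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \tfrac{1}{2} & \tfrac{3}{2} \end{bmatrix}
&& (r_3' = r_3 - \tfrac{1}{2} r_2,\ r_4' = r_4 - \tfrac{1}{4} r_2) \\
&\to \begin{bmatrix} 1 & 1 & -1 & 1 \\ 0 & -4 & 6 & 2 \\ 0 & 0 & \tfrac{1}{2} & \tfrac{3}{2} \\ 0 & 0 & 0 & 0 \end{bmatrix}
&& (r_3 \leftrightarrow r_4).
\end{aligned}
\]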
Row 4 is consistent, while row 3 implies z = 3. Then y and x can be found by back
substitution in rows 2 and 1. Confirm that y = 4 and x = 0.
On the other hand, there may be more variables than equations, as in the
following example. (Inconsistency is still possible.)
⎡ 1 1 12 1 1⎤
→ ⎢0 −3 2 −3 3⎥
⎢0 0 0 0 −2⎥ (r = r − 2 r ).
⎣ ⎦ 3′ 3 2
Self-test 12.3
Determine the complete set of values for a and b which make the equations
x − 2y + 3z = 2,
Example 12.6 Find the value of a for which the following equations have
non-trivial solutions:
x + y + z = 0,
x + 2y = 0,
x − 3y + az = 0.
Proceed in the usual way using Gaussian reduction. Thus
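\[
\begin{aligned}
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 0 \\ 1 & -3 & a & 0 \end{bmatrix}
&\to \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & -4 & a - 1 & 0 \end{bmatrix}
&& (r_2' = r_2 - r_1,\ r_3' = r_3 - r_1) \\
&\to \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & a - 5 & 0 \end{bmatrix}
&& (r_3' = r_3 + 4r_2).
\end{aligned}
\]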
Non-trivial solutions exist if, and only if, a = 5. Put z = c, any number. Then by back
substitution we obtain
y = z = c, x = −y − z = −2c
for any c. There is therefore an infinite number of solutions if a = 5.
Example 12.7 Find all conditions on the constants a, b, and c in order that
x + y + z = 0,
ax + by + cz = 0,
a²x + b²y + c²z = 0
should have non-trivial solutions. Find the solutions in the cases
(a) a = 1, b = 1, c = 2; (b) a = 1, b = 1, c = 1.
This system of equations will have non-trivial solutions for x, y, and z if, and only if,
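\[
D = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = 0.
\]
Thus
\[
\begin{aligned}
D &= \begin{vmatrix} 1 & 0 & 0 \\ a & b - a & c - a \\ a^2 & b^2 - a^2 & c^2 - a^2 \end{vmatrix}
&& (c_2' = c_2 - c_1,\ c_3' = c_3 - c_1) \\
&= (b - a)(c - a)\begin{vmatrix} 1 & 0 & 0 \\ a & 1 & 1 \\ a^2 & b + a & c + a \end{vmatrix} \\
&= (b - a)(c - a)(c + a - b - a) \\
&= (b - c)(c - a)(a - b).
\end{aligned}
\]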
Hence, non-trivial solutions exist if b = c, c = a, or a = b.
(a) (a = 1, b = 1, c = 2) The equations become
x + y + z = 0,
x + y + 2z = 0,
x + y + 4z = 0.
The augmented matrix is
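\[
\begin{aligned}
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 2 & 0 \\ 1 & 1 & 4 & 0 \end{bmatrix}
&\to \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 3 & 0 \end{bmatrix}
&& (r_2' = r_2 - r_1,\ r_3' = r_3 - r_1) \\
&\to \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
&& (r_3' = r_3 - 3r_2).
\end{aligned}
\]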
Row 2 implies z = 0, while row 1 implies x = −y. Let y = λ, say. Then the solution set is
\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} -\lambda \\ \lambda \\ 0 \end{bmatrix}
= \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}\lambda,
\]
for any λ.
(b) (a = 1, b = 1, c = 1) Applying Gaussian elimination, we find that
⎡1 1 1 0⎤ ⎡1 1 1 0⎤
\[
x_2 = \tfrac{1}{4}(x_1 - x_3 - 8), \tag{12.8}
\]
\[
x_3 = \tfrac{1}{5}(-2x_1 - x_2 - 14), \tag{12.9}
\]
where x1, x2, and x3 are now the subjects of the three equations.
To start the iteration choose initial values for x2 and x3, say $x_2^{(0)} = 0$ and $x_3^{(0)} = 0$,
without thinking about the equations. Calculate $x_1^{(1)}$ from eqn (12.7) as
\[
x_1^{(1)} = \tfrac{1}{3}(-x_2^{(0)} - x_3^{(0)} - 1). \tag{12.10}
\]
i        0    1         2         3         4         5         6         7
x1^(i)   –    −0.3333   1.1110    1.0570    1.0030    0.9984    0.9997    1.0000
x2^(i)   0    −2.0830   −1.1600   −0.9825   −0.9926   −0.9997   −1.0000   −1.0000
x3^(i)   0    −2.2500   −3.0130   −3.0260   −3.0030   −2.9990   −3.0000   −3.0000
All solutions are quoted to 4 decimal places. It can be seen that the exact solu-
tion, which is x1 = 1, x2 = −1, x3 = −3, can be achieved to this accuracy after seven
steps for this example. This is known as the Gauss–Seidel scheme for numerical
solution of linear equations.
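The iteration is easy to reproduce. The following is a minimal sketch for this particular system only, assuming the rearranged equations (12.8), (12.9) and the first equation solved for x1 as in (12.10); the function name gauss_seidel is illustrative. Each newly computed value is used immediately in the next equation.

```python
def gauss_seidel(steps):
    """Gauss-Seidel iteration for the three rearranged equations."""
    x2, x3 = 0.0, 0.0                        # starting values x2^(0), x3^(0)
    history = []
    for _ in range(steps):
        x1 = (-x2 - x3 - 1) / 3              # first equation solved for x1
        x2 = (x1 - x3 - 8) / 4               # eqn (12.8), using the new x1
        x3 = (-2 * x1 - x2 - 14) / 5         # eqn (12.9), using the new x1 and x2
        history.append((x1, x2, x3))
    return history

for i, (x1, x2, x3) in enumerate(gauss_seidel(7), start=1):
    print(i, round(x1, 4), round(x2, 4), round(x3, 4))
```

The printed iterates can be compared column by column with the table, which quotes slightly coarser rounding.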
An alternative method without updating is given by
\[
x_1^{(1)} = \tfrac{1}{3}(-x_2^{(0)} - x_3^{(0)} - 1),
\]
\[
\begin{bmatrix} 3 & 1 & 1 \\ -1 & 4 & 1 \\ 2 & 1 & 5 \end{bmatrix}.
\]
Each of the diagonal elements dominates the remaining elements in that row, since
3 > 1 + 1 = 2, 4 > |−1| + 1 = 2, 5 > 2 + 1 = 3.
This property of the system of equations is known as diagonal dominance. If the
matrix is not diagonally dominant, then the scheme may or may not converge.
Usually a few steps will indicate whether this is likely to be the case.
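The dominance test can be stated compactly in code. This is a small sketch under the assumption that dominance means strict inequality in every row, as in the three comparisons above; the function name is illustrative. The second matrix is the coefficient matrix of Problem 12.24.

```python
def is_diagonally_dominant(a):
    """True if each |a_ii| exceeds the sum of |a_ij| over the rest of row i."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )

dominant = is_diagonally_dominant([[3, 1, 1], [-1, 4, 1], [2, 1, 5]])
not_dominant = is_diagonally_dominant([[6, -1, 1], [3, 2, 1], [1, -1, 4]])
print(dominant, not_dominant)
```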
The schemes for both these methods can be expressed in matrix form as
follows. Let the system of equations be
Ax = d,
where A = [aij] is an n × n matrix. Let
A = AL + D + AU,
where AL, D, and AU are respectively the lower triangular, diagonal, and upper
triangular matrices given by
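\[
A_L = \begin{bmatrix} 0 & 0 & \dots & 0 \\ a_{21} & 0 & \dots & 0 \\ a_{31} & a_{32} & \dots & 0 \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & 0 \end{bmatrix}, \quad
D = \begin{bmatrix} a_{11} & 0 & \dots & 0 \\ 0 & a_{22} & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & a_{nn} \end{bmatrix},
\]
\[
A_U = \begin{bmatrix} 0 & a_{12} & a_{13} & \dots & a_{1n} \\ 0 & 0 & a_{23} & \dots & a_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \dots & 0 \end{bmatrix}.
\]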
Problems
12.1 (Section 12.1). Solve the following systems of
linear equations using Cramer’s rule:
(a) x1 + x3 = 1,
x2 − x3 = 3,
2x1 + x2 = −1;
(b) x1 + 7x2 + x3 = 1,
x2 − x3 = 3,
2x1 + x2 + 10x3 = −1;
(c) x1 + 5x2 − x3 = 1,
−3x1 + x2 − x3 = 1,
3x1 + x2 + x3 = −3;
(d) x1 + x2 + x3 = 1,
ax1 + bx2 + cx3 = d,
a²x1 + b²x2 + c²x3 = d²;
(e) x1 + 5x2 + 2x4 = 1,
− 3x2 − x4 = 1,
3x2 + x3 + x4 = 1,
2x2 + x3 + x4 = 2.
12.2 (Section 12.1). The currents i1, i2, i3 (in amps)
flow in parts of a circuit which contains a variable
resistor of resistance R (in ohms). The equations
for the currents are given by
4i1 − i2 − i3 = 12,
−i1 − Ri2 = 24,
i1 + 5i3 = −12,
in terms of the voltages on the right-hand side. For
design reasons, the current i3 should be 2 amps.
How many ohms should the resistance R be?
276
12.3 (Section 12.4). Show that the following sets 12.9 (Section 12.3). Find the inverses of the
of equations are inconsistent. following matrices.
LINEAR ALGEBRAIC EQUATIONS
(a) x1 + 2x2 + x3 = 3, ⎡ 6 −3 6 ⎤ ⎡ 1 −1 2⎤
x1 − 3x2 + 2x3 = 4, (a) ⎢ 3 6 6 ⎥ ; (b) ⎢ 1 2 1⎥ ;
5x1 + 5x2 + 6x3 = 1; ⎢−12 −3 6 ⎥ ⎢− 4 −1 2⎥
⎣ ⎦ ⎣ ⎦
(b) x1 + x2 + x3 = 2,
⎡1 1 0 0 0 0⎤
x1 + x3 + 2x4 = 3, ⎢0 0⎥
⎡ 2 −1 2 0⎤ 1 0 0 0
x1 + x2 + x4 = 4, ⎢ 1 0 −1 2⎥ ⎢0 0 1 0 0 0⎥
(c) ⎢ (d) ⎢ ⎥;
2⎥
;
− x2 + 2x3 = 2; 0 0 −1 ⎢0 0 0 1 0 0⎥
⎢ ⎥
(c) x1 + x2 = 1, ⎣ −1 0 1 0⎦ ⎢0 0 0 0 1 1⎥
⎢⎣0 0 0 0 0 1⎥⎦
x2 + x 3 = 1,
x3 + x4 = 1, ⎡1 0 0 0 0⎤
x4 + x5 = 1, ⎢1 1 0 0 0⎥
x1 + 3x2 + 5x3 + 7x4 + 4x5 = 1. (e) ⎢1 1 1 0 0⎥ .
⎢ ⎥
⎢1 1 1 1 0⎥
2x + y − λz = µ
may have (a) just one solution, (b) no solutions,
(c) an infinite set of solutions.
12.15 (Section 12.3). For each of the sets of
equations below, set up the augmented matrix and,
using elementary row operations, decide on the
consistency of the equations. If they are consistent,
obtain all solutions in each case.
(a) x + y + z = 3,
3x + 5y + z = −1,
x + 2y = 0;
(b) y + z = 1,
x + y + 2z = 3,
x − y = 1;
(c) x + 2y + z = 4,
x + y = −1,
3x + 4y − z = 12.
12.16 (Section 12.5). Find all solutions of the
determinant equation
\[
\begin{vmatrix} 1 - k & 2 & -1 \\ 2 & 1 - k & -1 \\ -1 & -1 & 2 - k \end{vmatrix} = 0.
\]
4x1 − x2 + kx3 + 3x4 = 0,
4x1 − x2 + 3x3 + kx4 = 0
have non-trivial solutions?
12.19 (Section 12.4). Show that the equations
x1 + 2x2 + 3x3 = 4,
2x1 + 3x2 + 8x3 − x4 = 20,
2x1 + 5x2 + 4x3 + x4 = 5
are inconsistent.
12.20 (Section 12.3). Find the inverses of
\[
\begin{bmatrix} 1 & \lambda & 0 \\ 0 & 1 & \lambda \\ 0 & 0 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 1 & 0 & 0 \\ \mu & 1 & 0 \\ 0 & \mu & 1 \end{bmatrix}.
\]
Hence find the inverse of
\[
\begin{bmatrix} 1 + \lambda\mu & \lambda & 0 \\ \mu & 1 + \lambda\mu & \lambda \\ 0 & \mu & 1 \end{bmatrix}.
\]
Find the inverse of
\[
\begin{bmatrix} 13 & 3 & 0 \\ 4 & 13 & 3 \\ 0 & 4 & 1 \end{bmatrix}.
\]
12.21 (Section 12.5). Express the determinant
12.24 (Section 12.6). Show that one row of the
matrix of coefficients fails to be dominant in the
system
6x1 − x2 + x3 = 2,
3x1 + 2x2 + x3 = 1,
x1 − x2 + 4x3 = 5.
x2 = (1/4)(x1 − x3 − 8),
x3 = (1/5)(−2x1 − x2 − 14),
using the Jacobi method. How many steps are
required to achieve the same accuracy as that in
the table in Section 12.6, that is to 5 significant
figures?
13 Eigenvalues and eigenvectors
The problem of finding a formula for powers of a square matrix A requires the
construction of matrices which transform A into a diagonal matrix. This process
involves the determination of the eigenvalues and eigenvectors of A. Eigenvalues
have many other applications wherever linear equations occur, particularly in
systems of linear differential equations.
This can only be satisfied if λ takes certain values. These are called the eigenvalues
of the matrix A, and the equation they satisfy (eqn (13.1)) is called the characteristic
equation of A. The characteristic equation is a polynomial equation, of
degree n in λ. We usually list the eigenvalues as λ1, λ2, and so on.
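\[
A = \begin{bmatrix} 1 & 3 \\ 2 & 2 \end{bmatrix}.
\]
The eigenvalues of A are given by the determinant equation
\[
\det(A - \lambda I_2) = \begin{vmatrix} 1 - \lambda & 3 \\ 2 & 2 - \lambda \end{vmatrix} = 0,
\]
which can be expanded into
(1 − λ)(2 − λ) − 6 = 0, or λ² − 3λ − 4 = 0.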
This factorizes into (λ − 4)(λ + 1) = 0: hence the eigenvalues are λ 1 = −1, λ 2 = 4.
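\[
A = \begin{bmatrix} 2 & -2 \\ 1 & 4 \end{bmatrix}.
\]
In this case
\[
\det(A - \lambda I_2) = \begin{vmatrix} 2 - \lambda & -2 \\ 1 & 4 - \lambda \end{vmatrix}
= (2 - \lambda)(4 - \lambda) + 2 = \lambda^2 - 6\lambda + 10 = 0,
\]
and the quadratic equation has the complex roots
\[
\lambda = \tfrac{1}{2}\left[6 \pm \sqrt{36 - 40}\right] = 3 \pm i.
\]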
Thus real matrices can have complex eigenvalues, which will occur in pairs of complex
conjugates.
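For a 2 × 2 matrix the characteristic equation is the quadratic λ² − (trace)λ + det = 0, so both of the preceding examples can be checked with one short function. This is a sketch only, with an illustrative function name; complex square roots are taken so that the real and complex cases are handled alike.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of det([[a, b], [c, d]] - lambda*I) = 0."""
    trace, det = a + d, a * d - b * c
    # Characteristic equation: lambda^2 - trace*lambda + det = 0.
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

real_pair = eigenvalues_2x2(1, 3, 2, 2)       # matrix with rows (1, 3), (2, 2)
complex_pair = eigenvalues_2x2(2, -2, 1, 4)   # matrix with rows (2, -2), (1, 4)
print(real_pair, complex_pair)
```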
= (4 − λ)(−1 − λ)(1 − λ) = 0,
if λ = 4 or ±1. Hence the eigenvalues are
λ 1 = 4, λ 2 = 1, λ 3 = −1.
Eigenvalues
The eigenvalues of the n × n square matrix A are the solutions λ of the
determinant equation
det(A − λIn) = 0. (13.2)
13.2 Eigenvectors
Associated with each eigenvalue λ of A, there will be non-trivial solutions of the
equation (A − λIn)x = 0.
These are called the eigenvectors of A corresponding to the eigenvalue λ, and
are generally denoted in this text by s. Thus, if λ is an eigenvalue of A, then there
will exist a corresponding eigenvector s ≠ 0, which is a non-trivial solution of
(A − λIn)s = 0.
The solutions of this set of linear equations can be found by Gaussian elimination.
\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix}.
\]
The eigenvalues of A are λ 1 = 4, λ 2 = 1, λ 3 = −1 (see Example 13.3). Let the corresponding
eigenvectors be
\[
s_i = \begin{bmatrix} a_i \\ b_i \\ c_i \end{bmatrix} \quad (i = 1, 2, 3).
\]
In each case, we need to solve (A − λ iI3)si = 0. If λ 1 = 4, then
−3a1 + 2b1 + c1 = 0,
2a1 − 3b1 + c1 = 0,
a1 + b1 − 2c1 = 0.
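\[
\begin{aligned}
\begin{bmatrix} -3 & 2 & 1 & 0 \\ 2 & -3 & 1 & 0 \\ 1 & 1 & -2 & 0 \end{bmatrix}
&\to \begin{bmatrix} -3 & 2 & 1 & 0 \\ 0 & -\tfrac{5}{3} & \tfrac{5}{3} & 0 \\ 0 & \tfrac{5}{3} & -\tfrac{5}{3} & 0 \end{bmatrix}
&& (r_2' = r_2 + \tfrac{2}{3} r_1,\ r_3' = r_3 + \tfrac{1}{3} r_1) \\
&\to \begin{bmatrix} -3 & 2 & 1 & 0 \\ 0 & -\tfrac{5}{3} & \tfrac{5}{3} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
&& (r_3' = r_3 + r_2).
\end{aligned}
\]
By back substitution, if c1 = α, then b1 = c1 = α, and a1 = (1/3)(2b1 + c1) = α. Thus, with
α = 1, an eigenvector is
\[
s_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
\]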
The other eigenvectors corresponding to λ 1 are simply multiples of s1. Using the same
procedure shows that the two eigenvectors corresponding respectively to λ 2 and λ 3 can
be chosen to be
\[
s_2 = \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix}, \qquad
s_3 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}.
\]
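A quick way to check any proposed eigenvector is to confirm that As = λs. The sketch below does this for the three pairs found in this example; the helper name mat_vec is illustrative.

```python
def mat_vec(a, v):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

A = [[1, 2, 1], [2, 1, 1], [1, 1, 2]]
pairs = [(4, [1, 1, 1]), (1, [-1, -1, 2]), (-1, [1, -1, 0])]

# For each (eigenvalue, eigenvector) pair, compare A s with lambda s.
checks = [mat_vec(A, s) == [lam * x for x in s] for lam, s in pairs]
print(checks)
```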
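In this example,
\[
\begin{aligned}
\det(A - \lambda I_3) &= \begin{vmatrix} 1 - \lambda & 2 & -1 \\ 1 & 2 - \lambda & -1 \\ 2 & 2 & -1 - \lambda \end{vmatrix} \\
&= \begin{vmatrix} -\lambda & \lambda & 0 \\ 1 & 2 - \lambda & -1 \\ 2 & 2 & -1 - \lambda \end{vmatrix} && (r_1' = r_1 - r_2) \\
&= \begin{vmatrix} -\lambda & 0 & 0 \\ 1 & 3 - \lambda & -1 \\ 2 & 4 & -1 - \lambda \end{vmatrix} && (c_2' = c_2 + c_1) \\
&= -\lambda[(3 - \lambda)(-1 - \lambda) + 4] \\
&= -\lambda(\lambda - 1)^2.
\end{aligned}
\]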
This particular matrix has an eigenvalue 0 and a repeated eigenvalue 1. How does this
affect the eigenvectors? Let the eigenvectors be, for λ 1 = 0 and λ 2 = 1,
\[
s_i = \begin{bmatrix} a_i \\ b_i \\ c_i \end{bmatrix} \quad (i = 1, 2).
\]
For λ 1 = 0,
a1 + 2b1 − c1 = 0,
a1 + 2b1 − c1 = 0,
2a1 + 2b1 − c1 = 0.
Hence a1 = 0, b1 = α, c1 = 2α, for any α. An eigenvector is
\[
s_1 = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}.
\]
For λ 2 = 1,
2b2 − c2 = 0,
a2 + b2 − c2 = 0,
2a2 + 2b2 − 2c2 = 0.
If we let b2 = β, then c2 = 2β and a2 = c2 − b2 = β. Hence we can associate with λ2 = 1 the
eigenvector
\[
s_2 = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix},
\]
by putting β = 1. There are only two independent eigenvectors in this example.
Note that if A has a zero eigenvalue, then A must be a singular matrix since
det A = 0. And conversely, if A is singular, then A has at least one zero eigenvalue.
The matrix in Example 13.6 has two eigenvalues (one repeated) and two
eigenvectors. The meaning of this reduced eigenvector set will be illustrated in the
context of coordinate transformations in Section 13.4. As the next example
illustrates, a matrix can have a repeated eigenvalue but still retain a full set of
independent eigenvectors.
\[
s_i = \begin{bmatrix} a_i \\ b_i \\ c_i \end{bmatrix} \quad (i = 1, 2).
\]
For λ 1 = 2,
a1 − c1 = 0,
−b1 = 0,
2a1 − 2c1 = 0.
We can let b1 = 0, c1 = α, a1 = α. Hence we can choose
\[
s_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}.
\]
For λ 2 = 1,
2a2 − c2 = 0,
0 = 0,
2a2 − c2 = 0.
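If a2 = β, then c2 = 2β but b2 can then take any value γ, say. Hence, the eigenvector set is
\[
s_2 = \begin{bmatrix} \beta \\ \gamma \\ 2\beta \end{bmatrix}
= \beta \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + \gamma \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},
\]
that is, it contains two parameters β and γ. The choices of β = 1 with γ = 0, and β = 0 with
γ = 1, say, give two independent eigenvectors
\[
\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}.
\]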
Eigenvectors
The eigenvectors of a square matrix A are the non-trivial solutions sr of the
homogeneous equations
(A − λr In)sr = 0, for each eigenvalue λr. (13.3)
Self-test 13.1
Find the eigenvalues and eigenvectors of
\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & -1 & 1 \\ 0 & -2 & 1 \end{bmatrix}.
\]
\[
s_1 = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}, \quad
s_2 = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix},
\]
then s1 and s2 belong to Vm, and so does α s1 + β s2 for any constants α and β.
An important set of vectors in Vm is the set of base vectors
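\[
e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \dots, \quad
e_m = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}.
\]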
\[
s_1 = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix} = a_1 e_1 + a_2 e_2 + \cdots + a_m e_m.
\]
The set of vectors {e1, e2, … , em} is said, therefore, to form a basis of Vm. None
of the vectors e1, e2, … , em can be expressed as a linear combination of the others,
so that they are said to be linearly independent. A set of n column vectors s1, s2, … , sn
is said to be linearly dependent if there exist constants α1, α2, … , αn, not all
zero, such that
α1s1 + α2s2 + ··· + αnsn = 0.
If the above equation holds only when α1 = α2 = ··· = αn = 0, then the vectors
are linearly independent. It can be proved that any set of m linearly independent
vectors forms a basis of the vector space Vm.
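For three vectors in V3 the definition can be tested with a determinant: writing the vectors as the columns of a matrix, the homogeneous equations for α1, α2, α3 have only the zero solution exactly when the determinant is nonzero. The sketch below assumes this criterion; the function names and the second (dependent) set of vectors are illustrative choices, not taken from the text.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def independent(s1, s2, s3):
    # Place the vectors as the columns of a matrix; a nonzero determinant
    # means a1*s1 + a2*s2 + a3*s3 = 0 forces a1 = a2 = a3 = 0.
    columns = [list(row) for row in zip(s1, s2, s3)]
    return det3(columns) != 0

# The eigenvectors (1, 1, 1), (-1, -1, 2), (1, -1, 0) of an earlier example.
eigen_set = independent([1, 1, 1], [-1, -1, 2], [1, -1, 0])
# Hypothetical dependent set: the second vector is twice the first.
dependent_set = independent([1, 2, 3], [2, 4, 6], [0, 1, 1])
print(eigen_set, dependent_set)
```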
By (12.6) the only solution is x = y = z = 0. The vectors are therefore linearly independent
and can form a basis.
Self-test 13.2
For what values of k do the vectors (1, 2, k)T, (1, 2, –1)T, (k, 2, –1)T form a
basis in three dimensions?
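\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix}
\]
which has the eigenvalues λ1 = 4, λ2 = 1, λ3 = −1 and eigenvectors
\[
s_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
s_2 = \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix}, \quad
s_3 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}.
\]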
Construct a matrix C which has these eigenvectors as its columns:
\[
C = [s_1\ s_2\ s_3] = \begin{bmatrix} 1 & -1 & 1 \\ 1 & -1 & -1 \\ 1 & 2 & 0 \end{bmatrix}.
\]
The columns are independent, so C is nonsingular. Then
AC = A[s1 s2 s3] = [As1 As2 As3] = [λ 1 s1 λ 2 s2 λ 3 s3],
the last equality holding since the eigenvector si is defined as a nonzero solution
of Asi = λisi. Let D be the diagonal matrix of eigenvalues, namely
\[
D = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}.
\]
Then
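\[
\begin{aligned}
AC &= [\lambda_1 s_1\ \ \lambda_2 s_2\ \ \lambda_3 s_3] = [s_1\lambda_1\ \ s_2\lambda_2\ \ s_3\lambda_3] \\
&= \begin{bmatrix} \lambda_1 & -\lambda_2 & \lambda_3 \\ \lambda_1 & -\lambda_2 & -\lambda_3 \\ \lambda_1 & 2\lambda_2 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & -1 & 1 \\ 1 & -1 & -1 \\ 1 & 2 & 0 \end{bmatrix}
\begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} = CD.
\end{aligned}
\]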
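\[
C^{-1}AC = \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3} \\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3} \\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & -1 & 1 \\ 1 & -1 & -1 \\ 1 & 2 & 0 \end{bmatrix}
= \begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} = D.
\]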
It might appear at first sight that there is not a unique answer for D since the
eigenvectors are not uniquely defined. However, a different selection of eigen-
From Example 13.7, we see that A has the eigenvalues λ 1 = 2 and λ 2 = λ 3 = 1. However,
we can associate two linearly independent eigenvectors with the repeated eigenvalue.
Thus, we can define C by
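\[
C = [s_1\ s_2\ s_3] = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 2 & 0 \end{bmatrix}.
\]
Its inverse is
\[
C^{-1} = \begin{bmatrix} 2 & 0 & -1 \\ -1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]
Finally it can be verified that
\[
C^{-1}AC = \begin{bmatrix} 2 & 0 & -1 \\ -1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 3 & 0 & -1 \\ 0 & 1 & 0 \\ 2 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 2 & 0 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = D.
\]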
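Its inverse is
\[
C^{-1} = \frac{1}{\det C}\begin{bmatrix} 1 & 1 + i \\ -1 & -1 + i \end{bmatrix}
= \frac{1}{2i}\begin{bmatrix} 1 & 1 + i \\ -1 & -1 + i \end{bmatrix}.
\]
Finally, check that
\[
C^{-1}AC = \frac{1}{2i}\begin{bmatrix} 1 & 1 + i \\ -1 & -1 + i \end{bmatrix}
\begin{bmatrix} 2 & -2 \\ 1 & 4 \end{bmatrix}
\begin{bmatrix} -1 + i & -1 - i \\ 1 & 1 \end{bmatrix}
= \begin{bmatrix} 3 + i & 0 \\ 0 & 3 - i \end{bmatrix}.
\]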
Diagonalizing a matrix
To diagonalize a matrix A:
(i) find the eigenvalues of A;
(ii) find n linearly independent eigenvectors sn of A (if they exist);
(iii) construct the matrix C of eigenvectors;
(iv) calculate the inverse C −1 of C;
(v) compute C −1 AC. (13.4)
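Steps (iii) to (v) can be checked by direct multiplication. The following sketch uses the matrix A with eigenvalues 4, 1, −1 and the eigenvector matrix C and inverse quoted for it earlier in this section; exact fractions avoid rounding, and the helper name mat_mul is illustrative.

```python
from fractions import Fraction as F

def mat_mul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

A = [[1, 2, 1], [2, 1, 1], [1, 1, 2]]
C = [[1, -1, 1], [1, -1, -1], [1, 2, 0]]         # columns are s1, s2, s3
C_inv = [[F(1, 3), F(1, 3), F(1, 3)],
         [F(-1, 6), F(-1, 6), F(1, 3)],
         [F(1, 2), F(-1, 2), F(0)]]

# Step (v): C^-1 A C should be the diagonal matrix of eigenvalues.
D = mat_mul(mat_mul(C_inv, A), C)
print([[int(v) for v in row] for row in D])
```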
Not all matrices can be diagonalized in this way. In Example 13.6 where
\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 1 & 2 & -1 \\ 2 & 2 & -1 \end{bmatrix},
\]
we can associate only two linearly independent eigenvectors with the eigenvalue 0
and the repeated eigenvalue 1, and no diagonalizing matrix C can be
constructed.
Self-test 13.3
Find the eigenvalues and eigenvectors of
\[
A = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 2 & -1 \\ 2 & 2 & -1 \end{bmatrix}.
\]
Construct a matrix C such that C −1AC = D, where D is a diagonal matrix.
What are the elements in D?
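\[
D = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix},
\]
then
\[
D^2 = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}
\begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}
= \begin{bmatrix} \lambda_1^2 & 0 & 0 \\ 0 & \lambda_2^2 & 0 \\ 0 & 0 & \lambda_3^2 \end{bmatrix},
\]
and, in general,
\[
D^n = \begin{bmatrix} \lambda_1^n & 0 & 0 \\ 0 & \lambda_2^n & 0 \\ 0 & 0 & \lambda_3^n \end{bmatrix}.
\]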
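\[
\begin{aligned}
A^n = CD^nC^{-1}
&= \begin{bmatrix} 1 & -1 & 1 \\ 1 & -1 & -1 \\ 1 & 2 & 0 \end{bmatrix}
\begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}^n
\begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3} \\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3} \\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix} \\
&= \begin{bmatrix} 1 & -1 & 1 \\ 1 & -1 & -1 \\ 1 & 2 & 0 \end{bmatrix}
\begin{bmatrix} 4^n & 0 & 0 \\ 0 & 1^n & 0 \\ 0 & 0 & (-1)^n \end{bmatrix}
\begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3} \\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3} \\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix} \\
&= \begin{bmatrix} 4^n & -1 & (-1)^n \\ 4^n & -1 & -(-1)^n \\ 4^n & 2 & 0 \end{bmatrix}
\begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3} \\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3} \\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix} \\
&= \frac{4^n}{3}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
+ \frac{1}{6}\begin{bmatrix} 1 & 1 & -2 \\ 1 & 1 & -2 \\ -2 & -2 & 4 \end{bmatrix}
+ \frac{(-1)^n}{2}\begin{bmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\end{aligned}
\]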
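The closed formula for the nth power can be compared with repeated multiplication. The sketch below does this for n = 0, 1, …, 5, using the matrix A with eigenvalues 4, 1, −1 and the three constant matrices appearing in the final line of the result; all function and variable names are illustrative.

```python
from fractions import Fraction as F

def mat_mul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

A = [[1, 2, 1], [2, 1, 1], [1, 1, 2]]
J = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]          # multiplies 4^n / 3
K = [[1, 1, -2], [1, 1, -2], [-2, -2, 4]]      # multiplies 1 / 6
L = [[1, -1, 0], [-1, 1, 0], [0, 0, 0]]        # multiplies (-1)^n / 2

def power_by_formula(n):
    return [[F(4**n, 3) * J[i][j] + F(1, 6) * K[i][j] + F((-1)**n, 2) * L[i][j]
             for j in range(3)] for i in range(3)]

def power_by_multiplication(n):
    result = [[int(i == j) for j in range(3)] for i in range(3)]   # identity
    for _ in range(n):
        result = mat_mul(result, A)
    return result

agree = all(power_by_formula(n) == power_by_multiplication(n) for n in range(6))
print(agree)
```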
Hence
(1 − α − λ)(1 − β − λ) − αβ = 0,
or
λ2 − λ(2 − α − β ) + 1 − α − β = 0.
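The roots are λ1 = 1, λ2 = 1 − α − β = p, say. Choose the corresponding eigenvectors
\[
s_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad s_2 = \begin{bmatrix} -\alpha \\ \beta \end{bmatrix}.
\]
Let
\[
C = [s_1\ s_2] = \begin{bmatrix} 1 & -\alpha \\ 1 & \beta \end{bmatrix}.
\]
Its inverse is given by
\[
C^{-1} = \frac{1}{\alpha + \beta}\begin{bmatrix} \beta & \alpha \\ -1 & 1 \end{bmatrix}.
\]
Thus
\[
\begin{aligned}
P^n = CD^nC^{-1}
&= \frac{1}{\alpha + \beta}\begin{bmatrix} 1 & -\alpha \\ 1 & \beta \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & p^n \end{bmatrix}
\begin{bmatrix} \beta & \alpha \\ -1 & 1 \end{bmatrix} \\
&= \frac{1}{\alpha + \beta}\begin{bmatrix} 1 & -\alpha p^n \\ 1 & \beta p^n \end{bmatrix}
\begin{bmatrix} \beta & \alpha \\ -1 & 1 \end{bmatrix} \\
&= \frac{1}{\alpha + \beta}\begin{bmatrix} \beta + \alpha p^n & \alpha - \alpha p^n \\ \beta - \beta p^n & \alpha + \beta p^n \end{bmatrix} \\
&= \frac{1}{\alpha + \beta}\begin{bmatrix} \beta & \alpha \\ \beta & \alpha \end{bmatrix}
+ \frac{p^n}{\alpha + \beta}\begin{bmatrix} \alpha & -\alpha \\ -\beta & \beta \end{bmatrix}.
\end{aligned}
\]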
Self-test 13.4
Using the results from Self-test 13.3, find a formula for An where
\[
A = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 2 & -1 \\ 2 & 2 & -1 \end{bmatrix}.
\]
\[
\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}
\begin{bmatrix} 1 & 4 & 0 \\ 4 & 1 & 3 \\ 0 & 3 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \tag{13.6}
\]
in which A is symmetric. Any quadratic form may be written using a symmetric
A although non-symmetric representations are possible. For example, in the above
we may put
\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 8 & 1 & 2 \\ 0 & 4 & 1 \end{bmatrix}.
\]
However, the symmetric form is adopted throughout this section.
Let us find the eigenvalues of the symmetric matrix in (13.6) in the usual way
by solving
\[
\begin{vmatrix} 1 - \lambda & 4 & 0 \\ 4 & 1 - \lambda & 3 \\ 0 & 3 & 1 - \lambda \end{vmatrix} = 0.
\]
Hence
(1 − λ)[(1 − λ)2 − 9] − 4· 4· (1 − λ) = 0
or
(1 − λ)[(1 − λ)2 − 25] = 0.
It follows that the eigenvalues are λ1 = 1, λ2 = −4, λ3 = 6. It can be shown by the
methods previously explained that corresponding eigenvectors are
$$s_1 = \begin{bmatrix} 3 \\ 0 \\ -4 \end{bmatrix}, \quad s_2 = \begin{bmatrix} -4 \\ 5 \\ -3 \end{bmatrix}, \quad s_3 = \begin{bmatrix} 4 \\ 5 \\ 3 \end{bmatrix}.$$
If a and b are two column vectors, and
aTb = 0,
then a and b are said to be orthogonal. If we examine the eigenvectors s1, s2, and s3
above, then it is easy to see that
$$s_1^T s_2 = [3 \ \ 0 \ \ -4] \begin{bmatrix} -4 \\ 5 \\ -3 \end{bmatrix} = -12 + 0 + 12 = 0,$$
and similarly that $s_2^T s_3 = 0$ and $s_3^T s_1 = 0$. Thus the three eigenvectors are mutually orthogonal: regarded as ordinary vectors in the sense of Chapter 9, they are mutually perpendicular.
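Both claims, that s1, s2, s3 are eigenvectors for λ = 1, −4, 6 and that they are mutually orthogonal, can be confirmed with integer arithmetic. The following is a small sketch; the helper names are illustrative.

```python
# Symmetric matrix of the quadratic form (13.6) and its stated eigenpairs.
A = [[1, 4, 0], [4, 1, 3], [0, 3, 1]]
EIGENPAIRS = [(1, [3, 0, -4]), (-4, [-4, 5, -3]), (6, [4, 5, 3])]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def dot(u, v):
    """Scalar product u^T v."""
    return sum(a * b for a, b in zip(u, v))

def is_eigenpair(lam, s):
    """True when A s = lam s."""
    return matvec(A, s) == [lam * c for c in s]

def mutually_orthogonal(vectors):
    """True when every pair of distinct vectors has zero scalar product."""
    return all(dot(vectors[i], vectors[j]) == 0
               for i in range(len(vectors))
               for j in range(i + 1, len(vectors)))

all_eigenpairs_ok = all(is_eigenpair(lam, s) for lam, s in EIGENPAIRS)
orthogonal_ok = mutually_orthogonal([s for _, s in EIGENPAIRS])
```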
It will be shown that this property of the eigenvectors follows from the symmetry of
the matrix of the quadratic form. However, we first show that the eigenvalues of a
symmetric matrix must be real numbers.
Theorem 13.1 If A is a symmetric real matrix, then its eigenvalues are real.
Proof. Suppose that λ = α + iβ is an eigenvalue. Since the left-hand side of the equation
det(A − λIn) = 0 is a real polynomial in λ, it must also have an eigenvalue $\bar{\lambda}$ = α − iβ.
Let s and $\bar{s}$ be the eigenvectors corresponding to λ and its conjugate $\bar{\lambda}$. Thus
$$As = \lambda s, \quad A\bar{s} = \bar{\lambda}\bar{s}. \tag{13.7}$$
Since A is symmetric, it follows that $(A\bar{s})^T = \bar{s}^T A^T = \bar{s}^T A$, and we can replace (13.7) by
$$As = \lambda s, \quad \bar{s}^T A = \bar{\lambda}\bar{s}^T. \tag{13.8}$$
Multiply the first equation in (13.8) on the left by $\bar{s}^T$, and the second equation on the
right by s. Thus
$$\bar{s}^T A s = \lambda \bar{s}^T s, \quad \bar{s}^T A s = \bar{\lambda} \bar{s}^T s.$$
Elimination of $\bar{s}^T A s$ leads to
$$(\lambda - \bar{\lambda})\bar{s}^T s = 0. \tag{13.9}$$
To show that $\bar{s}^T s \neq 0$, put $s = (a_1, \ldots, a_n)^T$. Then, since $\bar{a}_n a_n = |a_n|^2$,
$$\bar{s}^T s = [\bar{a}_1 \ \bar{a}_2 \ \ldots \ \bar{a}_n] \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2 > 0.$$
From (13.9), it follows that λ = $\bar{\lambda}$, or α + iβ = α − iβ, from which we conclude that β = 0.
Therefore λ is real.
Self-test 13.5
Find the eigenvalues and eigenvectors of
$$A = \begin{bmatrix} 3 & -3 & 2 \\ -3 & -6 & 3 \\ 2 & 3 & 3 \end{bmatrix}.$$
Confirm that the eigenvectors are mutually orthogonal.
13.7 Positive-definite matrices
A quadratic form $x^TAx$ is said to be positive-definite if $x^TAx > 0$ for all x ≠ 0. If this
is true, we simply describe the matrix A as positive-definite.
Remember that any quadratic form can be written as $x^TAx$ where A is
symmetric.
Consider the particular case in which A is a 3 × 3 symmetric matrix. Let λ1, λ2,
λ3 be its eigenvalues, with corresponding eigenvectors s1, s2, s3 which are chosen
so that they are all unit vectors, that is $s_1^T s_1 = s_2^T s_2 = s_3^T s_3 = 1$.
As we saw in Section 13.4, we can diagonalize A by using the matrix
C = [s1 s2 s3],
so that
$$C^{-1}AC = D = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}.$$
For a symmetric matrix, the eigenvectors are orthogonal (Theorem 13.2). Hence
$$s_1^T C = s_1^T [s_1 \ s_2 \ s_3] = [s_1^T s_1 \ \ s_1^T s_2 \ \ s_1^T s_3] = [1 \ 0 \ 0],$$
since s1 is a unit vector. In a similar way,
$$s_2^T C = [0 \ 1 \ 0], \quad s_3^T C = [0 \ 0 \ 1].$$
Hence, if we construct a matrix with $s_1^T$, $s_2^T$, $s_3^T$ as its rows, then
$$C^T C = \begin{bmatrix} s_1^T \\ s_2^T \\ s_3^T \end{bmatrix} C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I_3.$$
Thus
$$C^T = \begin{bmatrix} s_1^T \\ s_2^T \\ s_3^T \end{bmatrix}$$
is the inverse of C, that is $C^T = C^{-1}$. Square matrices with this property are said to
be orthogonal matrices.
Suppose that we now define a transformation by x = CX, where C is an
orthogonal matrix. Then, in terms of X, the quadratic form becomes
$$x^TAx = (CX)^TACX = X^TC^TACX = X^TDX = \lambda_1 X_1^2 + \lambda_2 X_2^2 + \lambda_3 X_3^2.$$
It follows from this result, for 3 × 3 matrices, and similarly for higher order, that
a quadratic form is positive-definite if and only if all the eigenvalues of its symmetric matrix A are positive.
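The reduction to a sum of squares can be checked on the matrix of (13.6), whose eigenvalues 1, −4, 6 and eigenvectors were found in Section 13.6. The sketch below assumes those values; the test point x = (1, 2, 3) is an arbitrary choice, and the new coordinates are computed as X = Cᵀx with unit eigenvectors as the columns of C.

```python
import math

A = [[1, 4, 0], [4, 1, 3], [0, 3, 1]]
EIGENVALUES = [1, -4, 6]
EIGENVECTORS = [[3, 0, -4], [-4, 5, -3], [4, 5, 3]]  # not yet normalised

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def quadratic_form(x):
    """Evaluate x^T A x directly."""
    return sum(x[i] * A[i][j] * x[j] for i in range(3) for j in range(3))

def diagonal_form(x):
    """Evaluate lambda_1 X_1^2 + lambda_2 X_2^2 + lambda_3 X_3^2."""
    total = 0.0
    for lam, s in zip(EIGENVALUES, EIGENVECTORS):
        X_i = dot(s, x) / math.sqrt(dot(s, s))  # component of X = C^T x
        total += lam * X_i ** 2
    return total

def is_positive_definite(eigenvalues):
    """The criterion stated above: every eigenvalue must be positive."""
    return all(lam > 0 for lam in eigenvalues)

x = [1, 2, 3]
direct = quadratic_form(x)
via_eigenvalues = diagonal_form(x)
along_s2 = quadratic_form(EIGENVECTORS[1])
```

Because one eigenvalue is negative, this particular form is not positive-definite: along the eigenvector s2 the form takes the negative value λ2 s2ᵀs2 = −4 × 50.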
follows that the quadratic form is positive-definite. The corresponding eigenvectors are
The relation between the coordinates (x, y, z) and (X, Y, Z) of a point fixed in
space in the transformation
x = CX = [s1 s2 s3]X,
where the eigenvectors s1, s2, s3 are orthogonal unit vectors, can be seen as follows.
Put X = 1, Y = 0, Z = 0, which is a point on the X axis. Since
⎡1⎤
X = ⎢0⎥ ,
⎢0⎥
⎣ ⎦
it follows that the corresponding point in the x frame is x = s1. In other words
the elements (a1, b1, c1) of s1 are the coordinates in the x space of the point
A1 : (1, 0, 0) in the X space. Similarly, the elements of s2 and s3 are respectively the
coordinates of A2 : (0, 1, 0) and A3 : (0, 0, 1) in the X space (see Fig. 13.1).
Fig. 13.1 Orthogonal mapping between axes. The coordinates (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) are measured in the x space.
We know that the eigenvectors are mutually orthogonal, that is $s_i^T s_j = 0$ (i ≠ j).
We want to show that this implies that the new axes OXYZ are also mutually
perpendicular. Consider the triangle OA1A2: we want to show that the angle A1OA2 is a right
angle, so that the triangle is subject to Pythagoras's theorem:
$$A_1A_2^2 - OA_1^2 - OA_2^2 = (a_1 - a_2)^2 + (b_1 - b_2)^2 + (c_1 - c_2)^2 - (a_1^2 + b_1^2 + c_1^2) - (a_2^2 + b_2^2 + c_2^2)$$
$$= -2(a_1a_2 + b_1b_2 + c_1c_2) = -2s_1^T s_2 = 0,$$
since the eigenvectors are unit vectors and orthogonal. Hence, by Pythagoras's
theorem, the angle A1OA2 is a right angle. Similarly, the other angles A2OA3 and A3OA1
are right angles. Hence the new axes are mutually perpendicular. It can be shown
that det C = ±1. If det C = 1, then the X coordinates can be obtained from the x
coordinates by a rotation about the origin O. If det C = −1, then a reflection and
rotation are required.
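As an illustration, the orthogonality of C and the value of det C can be computed for the eigenvectors found in Section 13.6. This is a sketch only: the choice of that matrix, the ordering of the columns and the helper names are assumptions made for the example.

```python
import math

EIGENVECTORS = [[3, 0, -4], [-4, 5, -3], [4, 5, 3]]

def unit(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# C has the unit eigenvectors as its columns.
C = transpose([unit(s) for s in EIGENVECTORS])
CT_C = matmul(transpose(C), C)
det_C = det3(C)
```

With this column ordering det C comes out as +1, so the change of axes is a pure rotation; interchanging two columns would give −1 instead.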
Self-test 13.6
Let
$$A = \tfrac12 \begin{bmatrix} 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}.$$
Show that A is an orthogonal matrix. Show also that the columns form an
orthonormal basis. Is this a general property of orthogonal matrices?
13.8
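$$V = \tfrac12 kx^2 + \tfrac12 ky^2 - kyx + \tfrac12 kx^2 + \tfrac12 ky^2 = kx^2 - kxy + ky^2 = \tfrac12 \mathbf{x}^T K \mathbf{x},$$
where
$$\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \quad K = \begin{bmatrix} 2k & -k \\ -k & 2k \end{bmatrix}.$$
$$\begin{vmatrix} 2k - \lambda & -k \\ -k & 2k - \lambda \end{vmatrix} = 0, \quad \text{or} \quad (2k - \lambda)^2 - k^2 = 0.$$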
Hence, the eigenvalues are λ1 = k and λ2 = 3k, which are both positive, implying
that the potential energy is a positive-definite quadratic form. This is not
surprising, since we might expect the potential energy to take a minimum value
in equilibrium. Corresponding eigenvectors are
$$s_1 = \frac{1}{\sqrt2} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad s_2 = \frac{1}{\sqrt2} \begin{bmatrix} 1 \\ -1 \end{bmatrix},$$
$$C = [s_1 \ s_2] = \frac{1}{\sqrt2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$
$$V = \tfrac12 \mathbf{X}^T \begin{bmatrix} k & 0 \\ 0 & 3k \end{bmatrix} \mathbf{X} = \tfrac12 (kX^2 + 3kY^2).$$
(X, Y) are known as the normal coordinates of the system, and are related to
x and y by
$$T_3 - T_2 = m\ddot{y}, \tag{13.11}$$
where $\ddot{x}$ and $\ddot{y}$ stand for d²x/dt² and d²y/dt² respectively. The tension in a spring
is k times the extension, by Hooke's law, where k is the stiffness of the spring.
Thus
$$T_1 = kx, \quad T_2 = k(y - x), \quad T_3 = -ky.$$
Substitution into (13.10) and (13.11) yields
$$-2kx + ky = m\ddot{x}, \tag{13.12}$$
In matrix form, these equations can be combined into the vector equation
$$\ddot{\mathbf{x}} + A\mathbf{x} = 0,$$
where
$$\ddot{\mathbf{x}} = \begin{bmatrix} \ddot{x} \\ \ddot{y} \end{bmatrix}, \quad A = \begin{bmatrix} 2k/m & -k/m \\ -k/m & 2k/m \end{bmatrix}.$$
where
$$D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} = \begin{bmatrix} k/m & 0 \\ 0 & 3k/m \end{bmatrix}.$$
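The eigenvalues k and 3k of K, and the diagonal form of the potential energy, can be checked numerically. The following sketch makes two assumptions that are not spelled out in the surviving text: the normal coordinates are taken as X = Cᵀx, that is X = (x + y)/√2 and Y = (x − y)/√2, and the sample values of k, x and y are arbitrary.

```python
import math

def apply_K(k, v):
    """Return K v for K = [[2k, -k], [-k, 2k]]."""
    return [2 * k * v[0] - k * v[1], -k * v[0] + 2 * k * v[1]]

def potential_xy(k, x, y):
    """V = k x^2 - k x y + k y^2 in the original coordinates."""
    return k * x * x - k * x * y + k * y * y

def potential_normal(k, x, y):
    """V = (k X^2 + 3k Y^2)/2 in the assumed normal coordinates X = C^T x."""
    X = (x + y) / math.sqrt(2)
    Y = (x - y) / math.sqrt(2)
    return 0.5 * (k * X * X + 3 * k * Y * Y)

k = 2.0
mode_1 = apply_K(k, [1, 1])     # expected to equal k * [1, 1]
mode_2 = apply_K(k, [1, -1])    # expected to equal 3k * [1, -1]
v_direct = potential_xy(k, 0.3, -0.7)
v_normal = potential_normal(k, 0.3, -0.7)
```

The two eigenvectors correspond to the masses moving together (eigenvalue k) and in opposition (eigenvalue 3k).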
PROBLEMS
13.1 (Sections 13.1, 2). Find the eigenvalues and eigenvectors of the following matrices:
(a) $\begin{bmatrix} 2 & 3 \\ 4 & 6 \end{bmatrix}$; (b) $\begin{bmatrix} 6 & 3 \\ 2 & 7 \end{bmatrix}$; (c) $\begin{bmatrix} 2 & 1 \\ 4 & 6 \end{bmatrix}$;
(d) $\begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix}$; (e) $\begin{bmatrix} 1 & 2 \\ 14 & 5 \end{bmatrix}$; (f) $\begin{bmatrix} 2 & -2 \\ 4 & 6 \end{bmatrix}$.

13.2 (Section 13.1). Show that the eigenvalues of the symmetric matrix
$$A = \begin{bmatrix} a & b \\ b & c \end{bmatrix},$$
where a, b, and c are real numbers, are real.

13.3 (Section 13.1). Find the eigenvalues of
$$A = \begin{bmatrix} 6 & 3 \\ 2 & 7 \end{bmatrix}$$
(see Problem 13.1b). Find the inverse of A and find its eigenvalues. What relationship, would you guess, exists between the eigenvalues of A and those of $A^{-1}$? Find the eigenvalues of $A^2$. How do they relate to those of A?

13.4 (Sections 13.1, 2). Find the eigenvalues and eigenvectors of
(a) $\begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \\ 2 & 1 & 1 \end{bmatrix}$; (b) $\begin{bmatrix} 2 & 1 & 2 \\ 1 & 2 & 2 \\ 2 & 1 & 2 \end{bmatrix}$;
(c) $\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 2 & -1 \end{bmatrix}$; (d) $\begin{bmatrix} 6 & 5 & 5 \\ 5 & 6 & 5 \\ 5 & 5 & 6 \end{bmatrix}$.

13.5 (Sections 13.1, 2). Find the eigenvalues and eigenvectors of
$$\begin{bmatrix} 1 & 2 & 0 & 0 \\ 3 & 2 & 0 & 0 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 1 & 3 \end{bmatrix}.$$

13.6 (Sections 13.1, 2). Show that
$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 2 & 5 \end{bmatrix}$$
has a repeated eigenvalue. Find the corresponding eigenvectors. How many linearly independent eigenvectors are there?

13.7 (Sections 13.1, 2). Show that the matrix
$$A = \begin{bmatrix} -1 & -1 & a + 1 \\ a + 1 & -a & -1 \\ -a & a + 1 & -a \end{bmatrix}$$
has a zero eigenvalue. For design reasons, a second eigenvalue must be 3. For what values of a does this occur? Find the third eigenvalue in each case.

13.8 (Sections 13.1, 2). A matrix is said to be idempotent if $A^2 = A$. Explain why all eigenvalues of A must be either 0 or 1. Show that
$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 6 \\ 0 & -1 & -2 \end{bmatrix}$$
is idempotent. Find the eigenvalues and eigenvectors of A and $A^2$ and confirm the above result.

13.9 (Sections 13.1, 2). Let
$$A = \tfrac12 \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}.$$
Show that $A^2 = I_4$. Explain why the eigenvalues of A must be either 1 or −1. Can A be diagonalized?

13.10 Find the eigenvalues λ1, λ2, λ3 of
$$A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix}.$$
The trace of a square matrix is the sum of the elements in the leading diagonal. Thus if B = [bij] is an n × n matrix, then
trace B = b11 + b22 + ··· + bnn.
Confirm for A above that trace A = λ1 + λ2 + λ3. Also verify that det A = λ1λ2λ3.

13.11 (Section 13.3). Show that the vectors
$$s_1 = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \quad s_2 = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}, \quad s_3 = \begin{bmatrix} 4 \\ 3 \\ 5 \end{bmatrix}$$
are linearly dependent.
13.12 (Section 13.6). Let
$$A = \begin{bmatrix} -4 & 1 & -2 \\ 2 & -2 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$
Find the eigenvalues of A and a set of corresponding eigenvectors. Hence construct a matrix C which makes $C^{-1}AC$ a diagonal matrix.

13.13 Find a matrix C which diagonalizes the matrix
$$A = \begin{bmatrix} 1 & 8 \\ 2 & 1 \end{bmatrix}.$$

13.14 (Section 13.5). Find a matrix C which diagonalizes
$$A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 2 & -1 \end{bmatrix}.$$
Verify that $C^{-1}AC = D$, where D is the diagonal matrix of eigenvalues.

13.15 Using
$$C^{-1}AC = D$$
for a matrix A which has n linearly independent eigenvectors, show that
$$\det A = \lambda_1 \lambda_2 \ldots \lambda_n,$$
where λ1, λ2, … , λn are the eigenvalues of A. (Hint: use the result det AB = det A det B for square matrices.)

13.16 (Section 11.5). Find the eigenvalues and eigenvectors of the row-stochastic matrix
$$A = \begin{bmatrix} \tfrac14 & \tfrac12 & \tfrac14 \\ \tfrac12 & \tfrac14 & \tfrac14 \\ \tfrac14 & \tfrac14 & \tfrac12 \end{bmatrix}.$$
Find a formula for $A^n$. How does $A^n$ behave as n → ∞?

13.17 Show that
$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}$$
is an orthogonal matrix. Describe the mapping defined by
X = Ax.
Which set of points remains unaffected by the mapping?

13.18 (Section 13.7). Show that
$$A = \tfrac12 \begin{bmatrix} 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$$
is an orthogonal matrix.

13.19 Show that, in the transformation
$$\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix},$$
the angle between the two sets of axes is α. What do the axes of x and y become in the (X, Y) plane?

13.20 Show that the nonzero eigenvalues of the skew-symmetric matrix
$$A = \begin{bmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{bmatrix}$$
are imaginary for a, b, c real.

13.21 Let
$$A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix}.$$
Show that
$$\det(A - \lambda I_3) = -\lambda^3 + 4\lambda^2 + \lambda - 4.$$
Verify that
$$-A^3 + 4A^2 + A - 4I_3 = 0.$$
In other words, the matrix A satisfies its own characteristic equation. This is known as the Cayley–Hamilton theorem, and holds generally for square matrices. Use the result to find the inverse matrix $A^{-1}$.

13.22 Find the eigenvalues and eigenvectors of
$$A = \begin{bmatrix} 5 & -1 & -3 & 3 \\ -1 & 5 & 3 & -3 \\ -3 & 3 & 5 & -1 \\ 3 & -3 & -1 & 5 \end{bmatrix}.$$
Construct a matrix C such that $C^{-1}AC$ is the diagonal matrix of eigenvalues. Write down det A.

13.23 (Section 13.6). Express the following quadratic forms in the form $x^TAx$, where A is a 3 × 3 symmetric matrix:
(a) $x_1^2 + x_2^2 + x_3^2 + 4x_1x_2 - 4x_1x_3 + 4x_2x_3$;
(b) $x_1x_2 - x_1x_3 + x_2x_3$.
Find eigenvalues of A in each case, and find also a matrix C which transforms each into the form
$$\lambda_1 X_1^2 + \lambda_2 X_2^2 + \lambda_3 X_3^2.$$

13.24 (Section 13.6). Which of each of the following quadratic forms is positive-definite?
(a) $4x_1^2 + x_2^2 - 4x_1x_2$;
(b) $x_1^2 + x_2^2 + 2x_3^2 + 2x_2x_3 + 2x_3x_1 + 4x_1x_2$;
(c) $6x_1^2 + 2x_2^2 - x_3x_1$.

If
$$A = \begin{bmatrix} 1 & 3 \\ 2 & 2 \end{bmatrix}$$
(see Examples 13.1 and 13.4), find a formula for $A^m$ and the sum
$$S = \sum_{m=1}^{n} A^m.$$
not zero whilst every (r + 1)th-order minor is zero. Find the ranks of the following matrices:
(a) $\begin{bmatrix} 1 & 2 & 3 \\ 3 & 4 & 5 \\ 6 & 7 & 8 \end{bmatrix}$; (b) $\begin{bmatrix} 3 & 2 & 1 \\ 1 & 2 & 3 \\ 2 & 1 & 3 \end{bmatrix}$; (c) $\begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 3 & 4 & 5 \\ 2 & 3 & 5 & 1 \end{bmatrix}$.

13.31 (See Problem 13.30.) The method for checking the rank of a matrix by calculating minors can be a lengthy procedure. An alternative approach uses the elementary row operations of Section 12.2. Given a matrix A, row operations are applied to reduce A to echelon form, from which it is easier to test its rank. This is justified since it can be proved (but not here) that elementary row operations do not change the rank of a matrix. Express the matrix in Problem 13.30c in echelon form and check its rank.

13.32 Consider again Examples 13.6 and 13.7. Let the matrices in these examples be defined, respectively, by
$$A_1 = \begin{bmatrix} \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots \\ 2 & 2 & -1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots \\ 2 & 0 & 0 \end{bmatrix}.$$
Both these matrices have a repeated eigenvalue. Find the ranks of the matrices $\lambda I_3 - A_1$ and $\lambda I_3 - A_2$ for all the eigenvalues. Confirm that if λ is the repeated eigenvalue of $A_2$, then the rank of $\lambda I_3 - A_2$ is 1, and there are two eigenvectors associated with this eigenvalue: the vector space defined by this eigenvalue has dimension 2. (In general, for any eigenvalue λ, the rank of $\lambda I_n - A$ indicates the dimension of the vector space associated with λ: if the root is r-fold and the rank of $\lambda I_n - A$ is s, where s must satisfy n − 1 ≥ s ≥ n − r, then the dimension of the vector space of λ is n − s. If the eigenvalue is unique, then the vector space has dimension 1; in other words, there is just one eigenvector associated with the eigenvalue, and if r = 2, there could be one or two eigenvectors depending on the rank, and so on.)
Find the eigenvalues of
$$A = \begin{bmatrix} 2 & 1 & 0 & 0 \\ 0 & -1 & 0 & 2 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 2 & 1 \end{bmatrix}.$$
Find the rank of A. What are the dimensions of the vector spaces associated with each eigenvalue?
Part 3
Integration and differential equations
14 Antidifferentiation and area
CONTENTS
F(x) = sin x.
Since cos x is the derivative of sin x, we say that sin x is an antiderivative of cos x
(we say an antiderivative because it is not the only one; for example, sin x + 1 is
also an antiderivative).
The antidifferentiation question in Problem B can be expressed in various
ways; for example,
(a) What must be differentiated to get cos x?
(b) What curves have slope equal to cos x at every point?
(c) Find y as a function of x if dy /dx = cos x.
Finding antiderivatives is the opposite or inverse process to that of finding
derivatives.
The following examples show that a function f(x) has an infinite number of
Fig. 14.1 (curves labelled C > 0, C = 0, and C < 0, cut by the vertical line PQR)
Some of these solutions are shown in Fig. 14.1. Different choices for C just shift the
graph bodily up or down parallel to itself. Therefore, at any particular value of x, such
as is represented by the vertical line PQR, the slopes are all the same, independently of
the value of C.
Evidently the same thing will happen whatever function we start with: if we
find one solution, we can add constants to obtain more.
Example 14.2 Find a collection of antiderivatives of sin 2x.
We want y such that dy/dx = sin 2x. If we differentiate a cosine we get something
involving a sine, so first of all test whether y = cos 2x is close to being an antiderivative
of sin 2x. We find that dy /dx = −2 sin 2x. This contains an unwanted factor (−2). It can
be eliminated by choosing instead
$$y = -\tfrac12 \cos 2x,$$
for then we have $dy/dx = -\tfrac12(-2\sin 2x) = \sin 2x$, which is right. Therefore, one
antiderivative is $-\tfrac12 \cos 2x$, and the rest are of the form
$$y = -\tfrac12 \cos 2x + C \quad (C \text{ is any constant}).$$
Example 14.3 Solve the equation $dy/dx = e^{-3x}$ (that is to say, find a collection of
antiderivatives of $e^{-3x}$).
Try $y = e^{-3x}$; then $dy/dx = -3e^{-3x}$. To avoid the unwanted factor (−3) we should have
taken
$$y = \frac{1}{(-3)} e^{-3x} = -\tfrac13 e^{-3x}.$$
From this we construct an infinite collection of antiderivatives:
$$-\tfrac13 e^{-3x} + C \quad (C \text{ any constant}).$$
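A proposed antiderivative can always be checked by differentiating it, and the check can also be done numerically. The sketch below is an illustration only: it uses a central-difference approximation to the derivative, and the step size, sample points and function names are arbitrary choices. It tests the results of Examples 14.2 and 14.3, with an arbitrary constant C added to show that C does not affect the slope.

```python
import math

def numerical_derivative(F, x, h=1e-5):
    """Central-difference approximation to dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

def max_error(F, f, points=(-1.0, 0.0, 0.5, 2.0)):
    """Largest gap between the numerical slope of F and the target f."""
    return max(abs(numerical_derivative(F, x) - f(x)) for x in points)

C = 7.0  # any constant: it disappears on differentiation

# Example 14.2: -(1/2) cos 2x + C should differentiate to sin 2x.
error_sin = max_error(lambda x: -0.5 * math.cos(2 * x) + C,
                      lambda x: math.sin(2 * x))

# Example 14.3: -(1/3) e^{-3x} + C should differentiate to e^{-3x}.
error_exp = max_error(lambda x: -math.exp(-3 * x) / 3 + C,
                      lambda x: math.exp(-3 * x))

# The first guess y = cos 2x is out by the factor -2 noted in the text.
error_wrong_guess = max_error(lambda x: math.cos(2 * x),
                              lambda x: math.sin(2 * x))
```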
Antiderivatives of f(x)
A function F(x) is called an antiderivative of f(x) if
$$\frac{d}{dx} F(x) = f(x).$$
If F(x) is any particular antiderivative of f(x), then all the antiderivatives are
given by
$$F(x) + C,$$
where C can be any constant. (Therefore, any two antiderivatives differ by a
constant.) (14.1)
We firstly have to find any y which fits the equation $dy/dx = x^3$. Differentiation reduces
a power of x by unity, so try $y = x^4$:
$$dy/dx = 4x^3.$$
The factor 4 is unwanted; we needed $\tfrac14 x^4$ to give $x^3$. Therefore all antiderivatives are
given by
$$y = \tfrac14 x^4 + C,$$
where C is any constant.
Sums of terms and constant multipliers are treated in the same way as in
differentiation: the multipliers stay as multipliers and each term is treated
separately, as in the next example.
The following two examples show the importance in practice of including the
constant C.
Example 14.7 Find the equation of the curve which passes through the
point (π, −1) and whose slope is given by dy/dx = sin 2x.
Since the required y is an antiderivative of sin 2x, the equation of the curve must take
The technique used in the previous example can be used for functions like
$(ax + b)^n$, $e^{ax+b}$, cos(ax + b), and sin(ax + b). However, it would not work in this
simple way for a function such as $(2x^2 - 3)^2$ or $\sin(2x^2 - 3)$: the antiderivative of
$(2x^2 - 3)^2$ is not equal to $\tfrac13(2x^2 - 3)^3$, because $x^2$ is present rather than x (try it, using
the chain rule).
Self-test 14.1
A point is at x = 0 on the x axis at time t = 0, and then moves along the x axis
with velocity v = 2 sin(3t) + 4. Find the displacement of the point as a function
of time.
F(x)          f(x) = (d/dx) F(x)
sin ax        a cos ax
$e^{ax}$      $a e^{ax}$
By interchanging the columns and modifying the headings, we get two entries in
a possible table of antiderivatives:
f(x)          F(x)
a cos ax      sin ax
$a e^{ax}$    $e^{ax}$
However, these entries are not yet in the form we should like them. For
example, for the first entry we would prefer to have cos ax in the left column,
instead of a cos ax. Therefore divide both entries by the constant a, remember to
introduce the arbitrary constant C to register all the antiderivatives, and we have
a more convenient table:
cos ax        $\frac{1}{a} \sin ax + C$
$e^{ax}$      $\frac{1}{a} e^{ax} + C$
By such means the short table (14.2) is produced. To verify any entry, differentiate
the function in the right-hand column; the result should be the entry on the
left. The letter C stands for ‘any constant’ or ‘an arbitrary constant’.
Given function Antiderivatives
f(x) F(x)
Notice particularly the two starred entries. The formula * covers most cases, but it
does not produce antiderivatives of the function x−1 (i.e. of 1 /x). Here m = −1, so
the entry on the right becomes infinite and therefore meaningless. Therefore the
antiderivatives of x−1 must be given by some different formula, and this is shown
under **. All we have to do is to verify the formula ** as in the following example.
(The modulus or absolute value notation | x| is explained in Section 1.1.)
Example 14.9 Confirm that the antiderivatives of x−1 (i.e. 1/x) are given by
ln x + C if x is positive and by ln(−x) + C if x is negative, and that ln | x | + C
covers both cases.
(Remember that ln x does not have a meaning if x is negative or zero.) All we have to do
to verify the correctness of the formulae is to differentiate the proposed antiderivatives.
Since (d/dx) ln x = x−1, the result is right when x is positive.
Suppose now that x is negative. Then −x is positive, so ln(−x) has a meaning. Using
the chain rule (3.3) with u = −x,
$$\frac{d}{dx} \ln(-x) = \frac{1}{-x}(-1) = \frac{1}{x},$$
so the second result is confirmed.
But (see Section 1.1) |x| = x if x > 0 and |x| = −x when x < 0, so ln |x| is an
antiderivative whether x is positive or negative.
$$\frac{dy}{dx} = \frac{2}{2x - 3}, \quad \text{or} \quad 2(2x - 3)^{-1}.$$
The unwanted factor 2 will not appear if we try again with $y = \tfrac12 \ln(2x - 3)$, 2x − 3 > 0.
Also 2x − 3 might be negative, so we introduce a modulus sign (compare Example 14.9).
Finally we have
$$y = \tfrac12 \ln|2x - 3| + C.$$
Self-test 14.2
Using Table (14.2) construct the antiderivatives of
(a) $e^{3x} + 2 \sin 2x$; (b) $3/x^2$; (c) 4x–1.
Self-test 14.3
Can you guess and justify the antiderivatives of
(a) $x e^{-x^2}$; (b) $e^{\sin x} \cos x$?
is naturally called ‘the geometrical area between the curve and the x axis’.
Fig. 14.2 (areas A1 (+), A2 (−), A3 (+), …, AN (−) between the curve and the x axis from x = a to x = b)
We require a different quantity, A , called the signed area between the curve and
the x axis. This is defined as in Fig 14.2 by
A = A1 − A2 + A3 − ··· − AN . (14.4)
A (a) = 0.
Therefore, from (14.6)
A (a) = 0 = F(a) + k,
or k = −F(a), (14.7)
a known quantity, since we selected the antiderivative F(x) of f(x) ourselves. The
required signed area A between a and b is given by
A = A (b) = F(b) − F(a),
by putting x = b into (14.6), with (14.7) as the value of k.
In practice we naturally use the simplest antiderivative F(x), in which the C in the
table is zero. But any nonzero choice of C will cancel out and disappear, since it
will be present in both F(a) and F(b).
Square-bracket notation
$[F(x)]_a^b$ stands for F(b) − F(a). (14.9)
Example 14.12 Find (a) the signed area, and (b) the geometrical area, between
y = sin x and the x axis from x = 0 to x = 2π.
(a) f(x) = sin x, so F(x) = −cos x is an antiderivative. From (14.8) and (14.9), with a = 0
and b = 2π, the signed area A is given by
$$A = [-\cos x]_0^{2\pi} = -[\cos x]_0^{2\pi} = -(\cos 2\pi - \cos 0) = 0,$$
as is expected from Fig. 14.4: the positive and negative sections cancel.
(b) The geometrical area A can be obtained by splitting the range into a positive section
0 to π, and a negative section from π to 2π (see Fig. 14.4). The negatively signed section
π to 2π must have its sign reversed in order to give the geometrical area:
A = [geometrical area of 1st loop] + [geometrical area of 2nd loop]
= [signed area of 1st loop] − [signed area of 2nd loop].
This is equal to
$$[F(x)]_0^{\pi} - [F(x)]_{\pi}^{2\pi} = [-\cos x]_0^{\pi} - [-\cos x]_{\pi}^{2\pi} = 2 - (-2) = 4.$$
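The distinction between signed and geometrical area in this example can be reproduced in a few lines. The sketch below evaluates both from the antiderivative F(x) = −cos x, and, as an independent check that is not part of the text's method, also sums |sin x| over many thin strips; the strip count is an arbitrary choice.

```python
import math

def F(x):
    """An antiderivative of sin x."""
    return -math.cos(x)

# Signed area over the whole range: F(2*pi) - F(0).
signed_area = F(2 * math.pi) - F(0.0)

# Geometrical area: signed area of the first loop minus that of the second.
first_loop = F(math.pi) - F(0.0)
second_loop = F(2 * math.pi) - F(math.pi)
geometrical_area = first_loop - second_loop

def midpoint_sum(g, a, b, n=20000):
    """Sum of n thin rectangles of width (b - a)/n, sampled at strip midpoints."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Independent check: integrate |sin x| directly.
numeric_geometrical = midpoint_sum(lambda x: abs(math.sin(x)), 0.0, 2 * math.pi)
```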
Self-test 14.4
Find the signed area and the geometrical area between the x axis and
$y = e^x - 1$ for −1 ≤ x ≤ 1.
In Fig. 14.5, F(x) is shown as a broken line. Equation (14.8) then produces an
area equal to F(2) − F(0) = 3. This is obviously correct, being equal to the sum of
the rectangular areas in Fig. 14.5.
Fig. 14.5
Problems
Note: In case you have already met the term ‘indefinite integral’, the term ‘antiderivative’ has the same meaning.

14.1 Obtain all the antiderivatives of the following functions, and check their correctness by differentiating your results.
(a) $x^5$; $3x^4$; $2x^3$; $\tfrac13 x^2$; $6x$; f(x) = 3; f(x) = 0.
(b) $-\tfrac12 x^{-3}$; $2x^{-2}$; $3x^{-1}$ when x > 0 (if in doubt, see (14.2)).
(c) $x^{3/2}$; $x^{1/2}$; $x^{-1/2}$; $x^{4/3}$; $x^{-1/3}$.
(d) $1/x^2$ (write as $x^{-2}$); $1/x^4$; $1/x$ when x < 0 (see (14.2)).
(e) $\sqrt{x}$ (= $x^{1/2}$); $1/\sqrt{x}$; $1/x^{3/2}$.
(f) $3x$; $\tfrac12 x^2$; $1/(3x^2)$; $3/(4x^4)$.
(g) $e^x$; $e^{-x}$; $5e^{2x}$; $e^{-x/2}$; $3e^{-2x}$.
(h) cos x; cos 3x; sin x; sin 3x.
(i) $1 - 3x$; $1 + 2x - 3x^2$; $3x^4 - 4x^2 + 5$.
(j) $x(x + 1)$ (expand by removing the brackets); $(1 + 2x)(1 - 2x)$; $(x + 1)^2$; $(1 + x)(1 - 1/x)$; $x^2(x + x^2)$.
(k) $(x + 1)/x$ (turn it into the sum of two terms); $(2\sqrt{x} - 1)/\sqrt{x}$ (put $\sqrt{x} = x^{1/2}$ and $1/\sqrt{x} = x^{-1/2}$, then simplify as the sum of two terms); $(x + 1)^2/x^3$.
(l) $e^x + e^{-x}$; $2e^{2x} - 3e^{3x}$; $e^{x/2}(1 + e^{-x/2})$; $1/e^{2x}$ (= $e^{-2x}$); $(e^{2x} - e^{-2x})/e^{2x}$.
(m) $2\cos 2x$; $3\sin \tfrac12 x - 4\cos \tfrac13 x$; $2 + \sin 2x$.

14.2 Find all the antiderivatives of the following by trial and error, as explained in the text. Confirm your answers by differentiation.
(a) $(x + 1)^3$ (start by trying $(x + 1)^4$); $(3x + 1)^3$; $(3x - 8)^3$.
(b) $(1 - x)^4$; $(8 - 3x)^{1/2}$; $(1 - x)^{1/3}$.
(c) $(2x + 1)^{-2}$; $(1 - x)^{-1/2}$; $2/(3x + 1)^3$; $1/[4(1 - x)^{1/4}]$.
(d) 2 cos(3x − 2) (try first sin(3x − 2)); 3 sin(1 − x); 2 sin(2 − 3x).

14.3 (See Example 14.10.) Find the antiderivatives of the following.
(a) 1/(x + 1); 1/(x − 1); 3/(3x − 2); 2/(5x − 4).
(b) 1/(1 − x); 1/(4 − 5x).
(c) x/(x + 1) (it can be written as 1 − 1/(x + 1)).
(d) (x + 1)/(x − 1) (compare (c)).

14.4 Use the identities $\cos^2 A = \tfrac12(1 + \cos 2A)$, $\sin^2 A = \tfrac12(1 - \cos 2A)$, and $\sin A \cos A = \tfrac12 \sin 2A$ to get rid of the squares and products in the following expressions, and in that way obtain the antiderivatives.
(a) $\cos^2 x$; $\sin^2 x$; $\sin x \cos x$.
(b) $3\cos^2 2x$; $\sin^2 3x$; $\sin 2x \cos 2x$.
(c) $\cos^4 x$ (you will have to use the identities twice).

14.5 (a) Show that $(d/dx)(x e^x) = e^x + x e^x$. By rearranging the terms, show that the antiderivatives of $x e^x$ are $e^x(x - 1) + C$ (use the fact that $e^x$ can be written as $(d/dx) e^x$). Confirm the result by differentiation.
(b) Differentiate $x^2 e^x$. By rearranging the terms and using the result in (a), find the antiderivatives of $x^2 e^x$.

14.6 Use the result (14.8) to obtain the signed areas between the given graphs and the x axis. By roughly sketching the graphs of the functions for which you obtain zero, explain this fact.
(a) y = x, 0 ≤ x ≤ 2;
(b) y = x, −1 ≤ x ≤ 1;
(c) $y = -x^2$, 0 ≤ x ≤ 1;
(d) y = cos x, −π ≤ x ≤ π;
(e) y = cos x − 1, 0 ≤ x ≤ 2π;
(f) $y = x^{-1}$, −2 ≤ x ≤ −1 (note that x is negative in this range);
(g) y = sin 3x, 0 ≤ x ≤ $\tfrac23\pi$;
(h) y = 1/(1 − x), 2 ≤ x ≤ 3 (note: 1 − x is negative over this range, so make sure you understand Example 14.10; alternatively, write 1/(1 − x) = −1/(x − 1)).

14.7 Obtain the geometric area between the graph and the x axis in each of the following cases. It is necessary to treat each positive or negative section separately.
(a) y = −3, 0 ≤ x ≤ 1 (this is negative all the way);
(b) $y = x^3$, −1 ≤ x ≤ 1;
(c) $y = 4 - x^2$, −1 ≤ x ≤ 3;
(d) y = cos x, 0 ≤ x ≤ 2π.

14.8 Find the most general function which satisfies the following equations.
(Note: $\frac{d^2x}{dt^2} = \frac{d}{dt}\frac{dx}{dt}$, $\frac{d^3x}{dt^3} = \frac{d}{dt}\frac{d^2x}{dt^2}$, etc. Work in several steps, finding the next lowest derivative in each step.)
(a) $\frac{d^2x}{dt^2} = 0$; (b) $\frac{d^2x}{dt^2} = t$; (c) $\frac{d^2x}{dt^2} = \sin t$;
(d) $\frac{d^3x}{dt^3} = 0$; (e) $\frac{d^3x}{dt^3} = \cos t$;
(f) $\frac{d^2x}{dt^2} = g$ (g is a constant);
(g) $\frac{d^4y}{dx^4} = w_0$ ($w_0$ is constant; this relates to the displacements y(x) of a bending beam).
15 The definite and indefinite integral
CONTENTS
In Chapter 14 we showed that the signed area under the graph of y = f(x) was
related to the antiderivative F(x). Calculating areas is only of limited practical
interest in itself, but the existence of a universal connection between antiderivatives
and signed areas allows us to adapt this idea in the form of an area analogy
that is applicable to a wide variety of problems.
This approach leads into the definition of an integral of a given function f as
the limit of a sum of infinitesimal terms $\sum_a^b f(x)\,\delta x$. This is equal to the area under
the graph of f(x), and we have shown how to evaluate this as F(b) − F(a), where
[a, b] is the range and F is any (continuous) antiderivative of f (see eqn (14.8) and
Section 14.4). There are innumerable physical and other problems leading to summations
of this type, to which the analogy applies. The standard integral notation
facilitates manipulations.
Fig. 15.1 ((a) the graph of f(x); (b) the area element δA, bounded by PQRS, magnified)
$$A = \sum_{x=a}^{x=b} \delta A.$$
The typical area element is shown magnified in Fig. 15.1b. When δx is small, the
signed area δA is approximately equal to the signed area of the rectangle PQRS,
so that
$$A = \sum_{x=a}^{x=b} \delta A \approx \sum_{x=a}^{x=b} f(x_p)\,\delta x$$
Fig. 15.2 (the range from a to b divided at the points x0, x1, …, xN into strips of width h)
Then the area of the nth approximating rectangle in (15.1) is $f(x_n)h$ for n = 0 to
N − 1, and the approximating sum in (15.1) becomes
$$A \approx hf(a) + hf(a + h) + \cdots + hf(a + (N - 1)h) = h \sum_{n=0}^{N-1} f(a + nh). \tag{15.2}$$
When we take larger and larger N, and smaller and smaller h correspondingly,
we expect that the approximation will approach the exact value. The following
example illustrates this for the very simple case of the signed area associated with
a straight line. The algorithm (15.2) is very easy to program on a computer for
any function f(x). It is called the rectangle rule.
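A minimal sketch of the rectangle rule follows. The function name `rectangle_rule` and the choice of test case are illustrative: the straight line y = x from x = −1 to x = 2 is used because its exact signed area, F(2) − F(−1) with F(x) = ½x², is 1.5.

```python
def rectangle_rule(f, a, b, N):
    """Approximate the signed area under f from a to b by
    h * sum of f(a + n h) for n = 0, ..., N - 1, with h = (b - a)/N."""
    h = (b - a) / N
    return h * sum(f(a + n * h) for n in range(N))

def line(x):
    """The straight line y = x."""
    return x

# Approximations for increasing numbers of strips.
approximations = {N: rectangle_rule(line, -1.0, 2.0, N)
                  for N in (10, 100, 1000)}
```

For this line the sum can be worked out by hand as 1.5 − 4.5/N, so the error shrinks only in proportion to 1/N, which is why convergence is slow.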
The approximations are approaching 1.5, though very slowly. We shall see in
Section 16.3 how to improve such calculations.
Fig. 15.3 (the line y = x from x = −1 to x = 2; the section below the x axis is marked −ve)
Self-test 15.1
Apply the sum (15.2) to $y = 1 - x^2$ with a = 0, b = 1 and N = 5. Draw a sketch
showing the curve and the approximating strips. Calculate the area, and
compare this with the exact value obtained from the antiderivative.
(a) When b > a, $\int_a^b f(x)\,dx$ stands for $\lim_{\delta x \to 0} \sum_{x=a}^{x=b} f(x)\,\delta x$.
$$\int_a^b f(x)\,dx = [F(x)]_a^b.$$
By (15.3b), this is still true when b < a. Therefore we have the important general
result, sometimes called ‘the basic theorem of integral calculus’.
Integral/antiderivative connection
(We have used a lot of intuition about area to arrive at (15.4); the full justification,
due to Riemann, is far beyond the scope of this book.) This relation will enable
us to evaluate the summations produced in problems that have no original connection
with area, provided we can obtain an antiderivative F that is continuous
over the interval [a, b]. Further applications of this area analogy are illustrated in
Section 15.8.
Notice that in a definite integral any letter can be used for the variable of integration,
because the letter itself disappears in the course of evaluation; for example,
$$\int_0^1 x\,dx = [\tfrac12 x^2]_0^1 = [\tfrac12 x^2]_{x=0}^{x=1} = \tfrac12; \quad \int_0^1 t\,dt = [\tfrac12 t^2]_0^1 = [\tfrac12 t^2]_{t=0}^{t=1} = \tfrac12;$$
and so on.
Consequently, the letter used is called a dummy variable. We should not choose a
letter already being used for something else.
x dx; e
b
(a) 2
(b) xt
dt (x ≠ 0).
a x
$$\int f(x)\,dx,$$
with no limits of integration specified, is called an indefinite integral of f(x), and has
exactly the same meaning as the word ‘antiderivative’ that we have used up until
now, and which we have denoted by F(x) (see (14.1)).
$$\int x^2\,dx = \tfrac13 x^3 + C, \quad \int e^{2t}\,dt = \tfrac12 e^{2t} + C, \quad \int \cos u\,du = \sin u + C$$
and so on, where C is a constant. In some problems we shall assign or discover a
definite value for C; in others we might want to keep C as an arbitrary constant in
order to express every possible antiderivative (hence, ‘indefinite’ integral).
Example 15.2 Find the signed area A associated with the graph y = 3 e2x from
x = 1 to x = 3 using the new notation.
We shall need an antiderivative F(x) (i.e. an indefinite integral) of $3e^{2x}$. Using the
notation (15.5), we may write
$$F(x) = \int 3e^{2x}\,dx = \tfrac32 e^{2x}$$
(for this purpose any antiderivative will do, so we have put C = 0). Then, from (15.4),
$$A = \int_1^3 3e^{2x}\,dx = [\tfrac32 e^{2x}]_1^3 = \tfrac32 [e^{2x}]_1^3 = \tfrac32 (e^6 - e^2).$$
$$\int \frac{df(x)}{dx}\,dx = f(x) + A,$$
where A is a constant. Next, write ∫ f(u) du = F(u), say, and suppose that the variable
of integration u is a function of another variable x. Consider the derivative of
F{u(x)} with respect to x:
$$\frac{dF(u)}{dx} = \frac{du}{dx}\frac{dF(u)}{du} \quad \text{(by the chain rule (3.3))}$$
$$= \frac{du}{dx} f(u) \quad \text{(by the definition of } F(u)).$$
Therefore F{u(x)} = ∫ f(u) du is an antiderivative of f{u(x)}{du/dx}; that is,
$$F\{u(x)\} = \int f\{u(x)\} \frac{du}{dx}\,dx = \int f(u)\,du + B,$$
where B is a constant. We now have
(a) $\int \frac{df(x)}{dx}\,dx = f(x) + A$, with A a constant.
(b) $\int f\{u(x)\} \frac{du}{dx}\,dx = \int f(u)\,du + B$, with B a constant, and any function u(x).
(15.6)
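Result (15.6b) can be illustrated numerically by applying it between limits. The example below is an assumption chosen for the purpose, not one taken from the text: with f(u) = cos u and u(x) = x², the integrand f{u(x)} du/dx is 2x cos(x²), and an antiderivative is F{u(x)} = sin(x²). The left-hand side is estimated by summing thin strips and compared with the change in sin u between u(0) and u(1).

```python
import math

def midpoint_sum(g, a, b, n=20000):
    """Sum of n thin rectangles of width (b - a)/n, sampled at strip midpoints."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def u(x):
    """The inner function u(x) = x^2."""
    return x * x

def du_dx(x):
    return 2 * x

# Left-hand side: integral of f{u(x)} du/dx over 0 <= x <= 1, with f = cos.
lhs = midpoint_sum(lambda x: math.cos(u(x)) * du_dx(x), 0.0, 1.0)

# Right-hand side: [sin u] between u(0) = 0 and u(1) = 1.
rhs = math.sin(u(1.0)) - math.sin(u(0.0))
```

The two values agree to within the accuracy of the strip sum, which is what (15.6b) predicts once the arbitrary constant has cancelled.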
Self-test 15.3
Example 15.3 A small object P is pushed steadily along the x axis from x = 0
to x = 1, against a resistive force f(x) = x2. Find the work done against the
resistance.
Divide the range x = 0 to 1 into a large number of short steps of length δx. In general,
if the resistive force is constant the work done over a distance is (force) × (distance
moved). Although the force on P is not constant, over a short distance δx the work δW
done by the applied force is given approximately by
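$$\delta W \approx f(x)\,\delta x = x^2\,\delta x.$$
The total work W is given by
$$W = \sum_{x=0}^{x=1} \delta W \approx \sum_{x=0}^{x=1} x^2\,\delta x.$$
Letting δx → 0, we obtain
$$W = \lim_{\delta x \to 0} \sum_{x=0}^{x=1} x^2\,\delta x, \tag{15.7}$$
and this is equal to the area under the curve $y = x^2$ between x = 0 and x = 1.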
then such a sum can always be interpreted as representing, by eqn (15.4), the
signed area of y = f(x) between x = a and x = b. We shall call this observation the
area analogy. Its usefulness goes further than the connection of integrals with
antiderivatives, as illustrated in Example 15.6 and Section 15.8.
Example 15.6 Suppose that, in Example 15.5, the rainfall rate is given by
$r(t) = t^{1/2} e^{-t}$ (cm per day). Obtain the total rainfall R between t = 0 and 10 days.
We shall show in Chapter 16 that we are not tied to eqn (15.2) for calculating
signed area, but can find far better computing formulae.
Self-test 15.4
The mean value of a periodic function f(t) of period T is given by
$$m(T) = \frac{1}{T}\int_0^T f(t)\,dt.$$
If f(t) = a cos ωt (T = 2π/ω, a > 0), then m(T) = 0. If f(t) is an alternating
current, for example, then its mean value indicates nothing about the current.
Instead, a measure of its magnitude is the root mean square (rms) defined by
$$\mathrm{rms}[f(t)] = \left[\frac{1}{T}\int_0^T |f(t)|^2\,dt\right]^{1/2}.$$
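15.6 Improper integrals
Example 15.7 Evaluate $\displaystyle\int_0^\infty e^{-2x}\,dx$.
Putting $\int e^{-2x}\,dx = -\tfrac12 e^{-2x}$, we have
$$\int_0^\infty e^{-2x}\,dx = -\tfrac12\left[e^{-2x}\right]_0^\infty = -\tfrac12(0 - 1) = \tfrac12.$$
$$\int_1^\infty x^{-2}\,dx = \left[-x^{-1}\right]_1^\infty = [0 - (-1)] = 1.$$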
In Examples 15.7 and 15.8 we have (see Fig. 15.4) two cases of an infinitely
long figure which encloses a finite area. This does not always happen, even if the
integrand goes to zero when x → ∞.
[Fig. 15.4: graphs of $y = e^{-2x}$ and $y = 1/x^2$.]
We have
$$\int_1^\infty x^{-1}\,dx = [\ln x]_1^\infty.$$
The integral $\int_a^\infty f(x)\,dx$ is defined by the limit $\lim_{X\to\infty}\int_a^X f(x)\,dx$ if the limit exists:
such integrals are also called infinite integrals.
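The definition by a limit as X → ∞ can be explored by tabulating the integral up to a finite upper limit X. The sketch below uses the antiderivatives quoted in the text for $e^{-2x}$, $x^{-2}$ and $x^{-1}$; the function names are illustrative assumptions. The first two columns settle down to 1/2 and 1, while the third, ln X, keeps growing.

```python
import math


def exp_integral(X):
    # Integral of e^(-2x) from 0 to X, using the antiderivative -e^(-2x)/2.
    return 0.5 * (1.0 - math.exp(-2.0 * X))


def inverse_square_integral(X):
    # Integral of x^(-2) from 1 to X, using the antiderivative -1/x.
    return 1.0 - 1.0 / X


def reciprocal_integral(X):
    # Integral of x^(-1) from 1 to X, using the antiderivative ln x.
    return math.log(X)


for X in (10.0, 1.0e3, 1.0e6):
    print(X, exp_integral(X), inverse_square_integral(X), reciprocal_integral(X))
```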
The case when the integrand becomes infinite at some point in its range has
similar features:
Example 15.10 Consider (a) $\displaystyle\int_0^1 x^{-1/2}\,dx$; (b) $\displaystyle\int_0^1 x^{-1}\,dx$.
(a)
$$\int_0^1 x^{-1/2}\,dx = 2\left[x^{1/2}\right]_0^1 = 2[1 - 0] = 2.$$
Therefore the integral gives no problem; it is again a case of an infinitely extended figure
(extended in the y direction this time) containing a finite area.
(b) On the other hand,
$$\int_0^1 x^{-1}\,dx = [\ln x]_0^1,$$
There are improper integrals which do not work out for a different reason:
$$\int_0^\infty \cos x\,dx.$$
(a) We have
$$\int_0^X \cos x\,dx = [\sin x]_0^X = \sin X.$$
Improper integrals which give a definite finite result are said to converge. If not,
they are said to diverge.
Self-test 15.5
Using (15.6), evaluate
$$\int_0^\infty x\,e^{-x^2}\,dx.$$
15.7
a new type of integral
To differentiate or integrate a function containing the ‘imaginary’ element i,
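$$\int e^{cx}\,dx = \frac{1}{c}\,e^{cx} + C. \tag{15.10}$$
$$U = \int e^{ax}\cos bx\,dx \quad\text{and}\quad V = \int e^{ax}\sin bx\,dx.$$
$$U + iV = \int e^{ax}\cos bx\,dx + i\int e^{ax}\sin bx\,dx$$
$$= \int e^{(a+ib)x}\,dx = \frac{1}{a+ib}\,e^{(a+ib)x} \quad\text{(from (15.10))}$$
$$= \frac{a-ib}{a^2+b^2}\,e^{ax}(\cos bx + i\sin bx)$$
$$= \frac{1}{a^2+b^2}\,e^{ax}\left[(a\cos bx + b\sin bx) + i(-b\cos bx + a\sin bx)\right].$$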
Equate this last expression to U + iV: the real and imaginary parts must separately
be equal; so, after introducing the arbitrary real constant C, we have
(a) $\displaystyle\int e^{ax}\cos bx\,dx = \frac{1}{a^2+b^2}\,e^{ax}(a\cos bx + b\sin bx) + C,$
(b) $\displaystyle\int e^{ax}\sin bx\,dx = \frac{1}{a^2+b^2}\,e^{ax}(-b\cos bx + a\sin bx) + C.$
(15.11)
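The results (15.11) can be checked by differentiating the right-hand sides and comparing with the integrands. The sketch below does this numerically with a central difference; the helper names `U`, `V` and `derivative`, and the sample values of a, b and x, are assumptions chosen only for illustration.

```python
import math


def U(x, a, b):
    # Right-hand side of (15.11a) with C = 0.
    return math.exp(a * x) * (a * math.cos(b * x) + b * math.sin(b * x)) / (a * a + b * b)


def V(x, a, b):
    # Right-hand side of (15.11b) with C = 0.
    return math.exp(a * x) * (-b * math.cos(b * x) + a * math.sin(b * x)) / (a * a + b * b)


def derivative(F, x, h=1e-5):
    # Central-difference estimate of dF/dx.
    return (F(x + h) - F(x - h)) / (2.0 * h)


a, b, x = 0.7, 2.0, 0.3
dU = derivative(lambda t: U(t, a, b), x)
dV = derivative(lambda t: V(t, a, b), x)
print(dU, math.exp(a * x) * math.cos(b * x))  # the two numbers should agree closely
print(dV, math.exp(a * x) * math.sin(b * x))  # likewise
```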
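The integrals can be expressed more simply in terms of a phase angle. Put
$$\frac{a}{(a^2+b^2)^{1/2}} = \cos\phi \quad\text{and}\quad \frac{b}{(a^2+b^2)^{1/2}} = -\sin\phi$$
into (15.11a), and
$$\frac{-b}{(a^2+b^2)^{1/2}} = \cos\theta \quad\text{and}\quad \frac{a}{(a^2+b^2)^{1/2}} = -\sin\theta$$
into (15.11b). Then $\theta = \phi - \tfrac12\pi$, and we obtain
(a) $\displaystyle\int e^{ax}\cos bx\,dx = \frac{1}{(a^2+b^2)^{1/2}}\,e^{ax}\cos(bx + \phi) + C,$
(b) $\displaystyle\int e^{ax}\sin bx\,dx = \frac{1}{(a^2+b^2)^{1/2}}\,e^{ax}\cos(bx + \phi - \tfrac12\pi) + C.$
(15.12)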
Equation (15.11a) or (15.12a) can be used directly with a = −1, b = 2. However, we will
go through the working from first principles, but express the argument differently.
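Remember that
$$e^{\alpha+i\beta} = e^{\alpha}e^{i\beta} = e^{\alpha}(\cos\beta + i\sin\beta);$$
then we have $e^{-x}\cos 2x = \mathrm{Re}\,e^{(-1+2i)x}$. Therefore
$$I = \int_0^\infty e^{-x}\cos 2x\,dx = \mathrm{Re}\int_0^\infty e^{(-1+2i)x}\,dx = \mathrm{Re}\left[\frac{1}{-1+2i}\,e^{(-1+2i)x}\right]_0^\infty$$
$$= \mathrm{Re}\left(0 - \frac{1}{-1+2i}\right) = \mathrm{Re}\,\frac{1+2i}{5} = \tfrac15.$$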
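The same calculation can be followed with complex arithmetic on a computer. This is a minimal sketch using Python's standard `cmath` module; the finite upper limit `X = 50` stands in for infinity (an assumption that is harmless here because $e^{-50}$ is negligible), and the name `antiderivative` is illustrative.

```python
import cmath

c = complex(-1.0, 2.0)  # the exponent coefficient -1 + 2i


def antiderivative(x):
    # (1/c) e^(cx), as in (15.10) with C = 0.
    return cmath.exp(c * x) / c


X = 50.0  # finite stand-in for the infinite upper limit
value = (antiderivative(X) - antiderivative(0.0)).real
print(value)  # close to 1/5
```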
Self-test 15.6
Evaluate
$$\int_0^\infty e^{-x}\cos bx\,dx.$$
A signed area can be represented as a definite integral as in (15.3). Conversely, any
definite integral $\int_a^b f(x)\,dx$, whatever it represents, can be interpreted as representing
the signed area of the graph y = f(x) between a and b (15.1). The connection
with area means that we have a picture of an integral which can often give useful
information without the need to find an indefinite integral, which might in any case
be impossible. One example of this is the simple numerical method described in
Section 15.2. We restate the connection, which we called the area analogy.
The following section illustrates the use of the principle (15.13): it will also be
referred to in later chapters.
15.9 Symmetric integrals
[Fig. 15.5: (a) $x = t^3$; $\int_{-1}^{1} t^3\,dt = 0$. (b) $x = \sin t$; $\int_{-\pi}^{\pi} \sin t\,dt = 0$. (c) $x = t + \sin 2t$; $\int_{-\frac12\pi}^{\frac12\pi}(t + \sin 2t)\,dt = 0$.]
The ranges of integration on the two sides of the origin are equal, and because
of the special symmetry the positive and negative contributions cancel out. Such
functions are called odd functions, or functions odd about the origin. They have
the following property (see Section 1.4):
Symmetrical integrals over functions which are odd about the origin
$$\int_{-c}^{c} f(t)\,dt = 0.$$
Another useful class are even functions, which are symmetrical about the y axis:
[Fig. 15.6: graphs of the even functions (a) $x = \cos t$, (b) $x = t^4$, (c) $x = 1/(1 + t^2)$.]
Example 15.13
(a) $\displaystyle\int_{-\frac12\pi}^{\frac12\pi} t^4\sin 3t\,dt$; (b) $\displaystyle\int_{-\pi}^{\pi} t^5\cos 3t\,\cos\tfrac12 t\,dt$; (c) $\displaystyle\int_{-1}^{1}(e^{2t} - e^{-2t})\,dt$.
(a) The function $t^4$ is even and sin 3t is odd, so the integrand is odd. Since the range is
symmetrical about the origin, the integral is zero.
(b) $t^5$ is odd, cos 3t is even, and $\cos\tfrac12 t$ is even, so the integrand is odd and the integral
is zero as in (a).
(c) $e^{2t} - e^{-2t}$ is odd (put −t in place of t in the function: it just changes its sign).
Therefore the integral is zero.
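The cancellation for odd integrands can be confirmed numerically. The sketch below tests oddness at a few sample points and then applies a midpoint sum over a symmetric range. The helper names and the particular half-widths c (taken to be ½π, π and 1) are assumptions made for illustration.

```python
import math


def midpoint_integral(f, a, b, n=20000):
    # Midpoint-rule estimate of the integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


def is_odd(f, samples=(0.1, 0.7, 1.3)):
    # Checks f(-t) = -f(t) at a few sample points (a spot check, not a proof).
    return all(abs(f(-t) + f(t)) < 1e-12 for t in samples)


# Each entry: (integrand, half-width c of the symmetric range [-c, c]).
cases = [
    (lambda t: t ** 4 * math.sin(3 * t), math.pi / 2),
    (lambda t: t ** 5 * math.cos(3 * t) * math.cos(t / 2), math.pi),
    (lambda t: math.exp(2 * t) - math.exp(-2 * t), 1.0),
]

results = [(is_odd(f), midpoint_integral(f, -c, c)) for f, c in cases]
for odd, value in results:
    print(odd, value)  # each value should be zero to within rounding error
```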
If we integrate an even function between ±c, the graph shows that we get equal
contributions from both sides of the origin, which gives the following result.
$$\int_{-c}^{c} f(t)\,dt = 2\int_0^{c} f(t)\,dt.$$
These ideas may also be useful if there is special symmetry about some point
other than the origin.
2π
[Fig. 15.7: graphs of (a) $x = \cos t$ and (b) $x = \cos^3 t$ for 0 ≤ t ≤ 2π, with the positive and negative regions marked.]
In the graph of x = cos t (Fig. 15.7) the parts OBA and DBC are congruent, and similarly
for the other pair of divisions; in fact all four divisions are congruent. For the graph of
$x = \cos^3 t$ the shape is changed, but the four pieces remain congruent and retain their
original sign. The resulting cancellation gives zero for the integral.
Self-test 15.7
Without evaluating the integrals explain why
1
–3 π
2
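$$\frac{dI(x)}{dx} = \frac{d}{dx}\int_c^x f(t)\,dt = \frac{dF(x)}{dx} = f(x).$$
$$\frac{d}{dx}\int_x^c f(t)\,dt = -f(x).$$
$$K(x) = \int_{u(x)}^{v(x)} f(t)\,dt = F(v(x)) - F(u(x)).$$
(a) $\displaystyle\frac{d}{dx}\int_{u(x)}^{v(x)} f(t)\,dt = f(v(x))\,\frac{dv(x)}{dx} - f(u(x))\,\frac{du(x)}{dx}.$
(b) (Special cases):
$$\frac{d}{dx}\int_c^x f(t)\,dt = f(x) \quad\text{and}\quad \frac{d}{dx}\int_x^c f(t)\,dt = -f(x).$$
(15.20)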
(The results (15.20b) are simply (15.20a) in the respective cases v(x) = x, u(x) = c,
or v(x) = c, u(x) = x.) It is worth noticing that (15.20) does not require you to
integrate anything!
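The rule for differentiating an integral with variable limits can be compared with a brute-force calculation. In the sketch below the integrand $e^t\cos t$ and the limits u(x) = x, v(x) = x² are chosen purely for illustration; the integral is estimated by composite Simpson's rule (a standard quadrature formula, swapped in here as a numerical stand-in for exact integration) and then differentiated by a central difference.

```python
import math


def f(t):
    return math.exp(t) * math.cos(t)


def simpson(g, a, b, n=2000):
    # Composite Simpson estimate of the integral of g over [a, b]; n must be even.
    h = (b - a) / n
    odd = sum(g(a + k * h) for k in range(1, n, 2))
    even = sum(g(a + k * h) for k in range(2, n, 2))
    return (g(a) + 4 * odd + 2 * even + g(b)) * h / 3


def K(x):
    # K(x) = integral of f(t) dt from u(x) = x to v(x) = x^2.
    return simpson(f, x, x * x)


def rule_derivative(x):
    # f(v) dv/dx - f(u) du/dx with u = x (du/dx = 1) and v = x^2 (dv/dx = 2x).
    return f(x * x) * 2 * x - f(x) * 1.0


def numerical_derivative(x, h=1e-5):
    # Central-difference estimate of dK/dx.
    return (K(x + h) - K(x - h)) / (2 * h)


print(rule_derivative(1.5), numerical_derivative(1.5))  # should agree closely
```

Note that `rule_derivative` involves no integration at all, which is the point made in the text.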
But −1 ≤ sin c ≤ 1 no matter what value of c we take, so we could never make the
integral equal to sin x + 1000.
Example 15.17 The function shown in Fig. 15.8a is described by f(x) = 1 when
0 ≤ x ≤ 1, f(x) = −1 when 1 < x ≤ 2, and f(x) = 0 when x > 2. Sketch a graph of
the function
$$I(x) = \int_0^x f(t)\,dt.$$
Range 0 ≤ x ≤ 1.
$$I(x) = \int_0^x 1\,dt = x. \tag{i}$$
Range 1 ≤ x ≤ 2.
$$I(x) = \int_0^x f(t)\,dt = \int_0^1 f(t)\,dt + \int_1^x f(t)\,dt = 1 + \int_1^x (-1)\,dt$$
(after using (i) at x = 1)
$$= 1 - (x - 1) = 2 - x. \tag{ii}$$
Range x ≥ 2.
$$I(x) = \int_0^2 f(t)\,dt + \int_2^x f(t)\,dt = 0 + \int_2^x 0\,dt = 0, \tag{iii}$$
where we used the value of (ii) at x = 2. The resulting graph of I(x) is shown in
Fig. 15.8b.
[Fig. 15.8: (a) y = f(x); (b) y = I(x).]
Self-test 15.8
Find
$$I(x) = \frac{d}{dx}\left[\int_x^{x^2} e^t\cos t\,dt\right]$$
by (a) evaluating the integral and differentiating the result, and (b) using
(15.20d).
Problems
by (15.4).
(a) y = x3, −1 x 2; (b) y = x5, −1 x 1;
(c) y = sin x, −π x 0; (d) y = e−2x, 0 x 1. (k) (−x) dx when x is negative (you will have to
–12
dx;
1 1 2
1
(x + 1) dx;
1 1
2x (a) x 3 dx; (b) x 2 dx; (c)
(a) x dx;
2
(b) 2
(c) e dx;
−1 −1 0
(d) sin x dx; (e) (cos x − 2 sin 2x) dx; (d) x dx; (e) (1 − 3x + 2x ) dx;
4 1
–12 2
0 −1
339
(x x
2 2
values of T, and deduce the value of the mean
(f) −3
+ x−2) dx; (g) −2
dx;
over 0 t ∞; if you put T = ∞ into the
PROBLEMS
1 1
−1 integral directly, it turns out to be infinite, or
(h) x −1
dx (take care: the x values are negative); ‘diverges’ (see Section 15.6), so no conclusion
−2 can be drawn from this approach).
−1 (i) f(t) = t −1, 1 t ∞ (it is necessary to follow
(i) (−x) dx (see the remark in Problem 15.2k);
−2
–12 the procedure in the previous question, for the
same reason).
−3x
(j) 15.7 Use the even/odd properties of the integrands
0 0 (see Section 15.9) to prove the following results.
15.4 Evaluate the following integrals, using the t3
notation of (15.4). (b) dt = 0;
−1
(1 + t 4 )
(x − 1)(x + 1) dx;
1 1
x(x2 + x + 1) dx; π 2π
1
(a) (b) t cos t
0 −1 (c) dt = 0; (d) t 2 sin(t 3 ) dt = 0.
−π
1 + t2 − 12 π
(d)
2
x+x 2
dx; 3
0 x 1 15.8 (Computational: see Section 15.2 and Examples
15.1 and 15.6.) Write a simple program based on
2 4
t dt; (f)
t(t + 1) √u − 1 the algorithm (15.2) to evaluate a definite integral
(e) d u;
1
2 u ∫ ab f(x)dx. Assume that you have a subroutine for
1 1
0 −1
evaluating f(x), and that you input a, b, and N (the
(g) (h)
dw x number of subdivisions); also either a permissible
; dx ;
2w + 3
−1
x−1 −2
error E, or a parameter M which determines the
π number of iterations. If you use E, the process
(i) cos 3t dt (cos A = --(1 + cos 2A)).
2 2 1
2 might be written to print out when two successive
0 iterations are within E of each other. Check the
correctness of the program by using a function
15.5 Evaluate the following infinite integrals.
∞ ∞ ∞
such as x2 as integrand.
(a) e−3t dt; (b) x;
dx
e − 2 v dv; (c)
1
3
Estimate the values of the following integrals.
2 π
x
et
(a) t2 dt; (b) sin5t dt; (c) dt; Q(t) for t 0, where
0 0 0 1+t
I(u) du.
t
ex √(x+1) Q(t) =
(d) t ln t dt;
0
(e)
√x
sin(t2) dt. 0
| x − 1|
⎪ ⎪ dx
(b) f (x) = ⎨2 − x if 1 x 32 ⎬ ; consider positive 2 /3
.
⎪⎩0 if x 2
3
⎪⎭ x only. 0
15.12 An ‘RL’ circuit has a constant current I0 result will be the sum of two improper integrals
flowing, produced by a constant applied voltage. on 0 x 1 and 1 x 2.
16 Applications involving the integral as a sum
Example 16.1 The tension T in an elastic string is given by T = 0.01x (kg m s$^{-2}$),
where x is the extension beyond the natural length. Find the work done on the
string to stretch it 2 metres beyond its natural length.
[Fig. 16.1: a string stretched beyond its natural length, showing the extension x, an increment δx and the tension T.]
Example 16.2 A car runs from rest to rest in 1 hour, its velocity v being given by
v = 200t(1 − t) (in kilometres per hour). The rate of fuel consumption, f (in litres
per kilometre), is related to the velocity by $f = 10^{-4}v^2$. Find (a) the distance
travelled and (b) the amount of fuel used.
(a) In time δt it travels a distance δx, where
δx ≈ v δt.
The total displacement x (which is equal to the distance travelled since v is always
positive) is therefore
$$x = \lim_{\delta t \to 0}\sum_{t=0}^{t=1} v\,\delta t = \int_0^1 v\,dt = \int_0^1 200t(1 - t)\,dt = 200\int_0^1 (t - t^2)\,dt$$
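Both parts of Example 16.2 are sums over short time steps, so they can be estimated directly. The sketch below assumes that the fuel used in time δt is f δx ≈ f v δt (litres per kilometre times kilometres), and uses a simple trapezium-type sum; the function names are illustrative.

```python
def v(t):
    # Velocity in km/h at time t (hours), from Example 16.2.
    return 200.0 * t * (1.0 - t)


def fuel_rate(speed):
    # Fuel consumption in litres per kilometre: f = 10^(-4) v^2.
    return 1.0e-4 * speed ** 2


def integrate(g, a, b, n=1000):
    # Trapezium-type estimate of the integral of g over [a, b].
    h = (b - a) / n
    interior = sum(g(a + k * h) for k in range(1, n))
    return h * (0.5 * g(a) + interior + 0.5 * g(b))


distance = integrate(v, 0.0, 1.0)                          # sum of v * dt
fuel = integrate(lambda t: fuel_rate(v(t)) * v(t), 0.0, 1.0)  # sum of f * v * dt
print(distance, fuel)
```

For comparison, exact integration of the two polynomials gives 100/3 km and 40/7 litres.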
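[Fig. 16.2: (a) Incremental volume for a cone; (b) x, y section of the cone.]
$$V = \lim_{\delta x \to 0}\sum_{x=0}^{x=h} \pi y^2\,\delta x = \int_0^h \pi y^2\,dx = \frac{\pi r^2}{h^2}\int_0^h x^2\,dx = \frac{\pi r^2}{h^2}\left[\tfrac13 x^3\right]_0^h$$
$$= \frac{\pi r^2}{h^2}\,\frac{h^3}{3} = \tfrac13\pi r^2 h.$$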
The x, y section of the cone is shown in Fig. 16.2b. If δs is arc-length
on the line y = (r/h)x, then the surface area of the ‘band’ is approximately
$2\pi y\,\delta s = 2\pi y[(\delta x)^2 + (\delta y)^2]^{1/2} = 2\pi f(x)[1 + f'(x)^2]^{1/2}\,\delta x$. Hence the surface area is
$$S = \lim_{\delta x \to 0}\sum_{x=0}^{x=h} 2\pi y\left[1 + (dy/dx)^2\right]^{1/2}\delta x,$$
$$= 2\pi\int_0^h \frac{rx}{h}\left[1 + \left(\frac{r}{h}\right)^2\right]^{1/2} dx,$$
$$= \pi r\left[r^2 + h^2\right]^{1/2}.$$
The volume and surface area of any solid of revolution between x = a and
x = b, formed by rotating a profile y = f(x) around the x axis, can be found in
exactly the same way:
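$$\text{the volume } V = \int_a^b \pi y^2\,dx, \qquad \text{the surface area } S = \int_a^b 2\pi y\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} dx.$$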
Example 16.4 Find the geometrical area enclosed between the curves $y = 2x^2 - 1$
and $y = x^2$.
This problem is complicated if we have to think all the time about the difference
between signed and geometrical area as in Chapter 15.
Here it will be done in a different way. Divide the interval −1 ≤ x ≤ 1 into short steps
of length δx and consider the area elements indicated in Fig. 16.3. They are nearly
rectangular, and the geometrical (positive) area δA of each is given by
$$\delta A \approx |x^2 - (2x^2 - 1)|\,\delta x = (-x^2 + 1)\,\delta x$$
(we may drop the modulus signs since $-x^2 + 1 \ge 0$ in the given range).
The total geometrical area A is therefore given by
$$A = \lim_{\delta x \to 0}\sum_{x=-1}^{x=1}(-x^2 + 1)\,\delta x = \int_{-1}^{1}(-x^2 + 1)\,dx$$
$$= \left[-\tfrac13 x^3 + x\right]_{-1}^{1} = \left(-\tfrac13 + 1\right) - \left(\tfrac13 - 1\right) = \tfrac43.$$
[Fig. 16.3: the region between $y = x^2$ and $y = 2x^2 - 1$ for −1 ≤ x ≤ 1, with a representative strip of width δx and area δA.]
[Fig. 16.4: a sector element δA of a polar curve between θ = α and θ = β, with radius r and angle δθ.]
Self-test 16.1
A surface of revolution is formed by rotating the curve $y = 1 + x^2$ about the
x axis between x = 0 and x = 1. Find the volume of the region created.
16.2
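complete circle of radius r and area $\pi r^2$:
$$\delta A \approx \frac{\delta\theta}{2\pi}\,\pi r^2 = \tfrac12 r^2\,\delta\theta,$$
where r = f(θ).
$$A = \tfrac12\int_\alpha^\beta r^2\,d\theta. \tag{16.2}$$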
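Example 16.5 Find the area of the loop of the curve r = 3 sin 2θ in the first
quadrant.
[Fig. 16.5: the loop of r = 3 sin 2θ in the first quadrant, with a representative point P at polar coordinates (r, θ).]
For the loop shown in Fig. 16.5, the range of θ is 0 ≤ θ ≤ ½π (it is generally helpful to
sketch polar curves before proceeding with the integration). Thus in (16.2)
f(θ) = 3 sin 2θ, α = 0, β = ½π.
The area is therefore given by
$$A = \tfrac12\int_0^{\frac12\pi}(3\sin 2\theta)^2\,d\theta = \tfrac92\int_0^{\frac12\pi}\sin^2 2\theta\,d\theta.$$
Since $\sin^2 2\theta = \tfrac12(1 - \cos 4\theta)$,
$$A = \tfrac94\int_0^{\frac12\pi}(1 - \cos 4\theta)\,d\theta = \tfrac94\left[\theta - \tfrac14\sin 4\theta\right]_0^{\frac12\pi} = \tfrac98\pi.$$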
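The polar area formula (16.2) is itself the limit of a sum of thin sectors of area ½r² δθ, so it can be estimated directly. This is a minimal sketch using midpoint values of θ; the name `polar_area` is an illustrative assumption. For r = 3 sin 2θ on 0 ≤ θ ≤ ½π the result should be close to 9π/8.

```python
import math


def polar_area(r, alpha, beta, n=10000):
    # Sum of sector areas (1/2) r(theta)^2 * dtheta, using midpoint values of theta.
    dtheta = (beta - alpha) / n
    total = sum(r(alpha + (k + 0.5) * dtheta) ** 2 for k in range(n))
    return 0.5 * total * dtheta


loop_area = polar_area(lambda theta: 3.0 * math.sin(2.0 * theta), 0.0, math.pi / 2)
print(loop_area, 9.0 * math.pi / 8.0)
```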
Self-test 16.2
Sketch the curve defined by r = cos 3θ (−⅙π ≤ θ ≤ ⅙π). Find the area enclosed
by the loop.
In Examples 15.1 and 15.4, we illustrated the use of the area analogy (15.13) using
as the area approximation the sum in (15.1), which had been introduced only for
the purpose of establishing the principle. It only gives close approximations if we
use very small step lengths; but, now that the area analogy is established, we can
look for approximation methods that will be more efficient.
An improved area approximation is shown in Fig. 16.6, where the curve y = f(x)
is ‘fitted’ by a polygonal curve. The approximation to the area of each strip indi-
vidually is obviously better in general than we would get from a rectangle. Divide
the interval x = a to x = b into N steps. We shall denote the length of each step by
h (instead of δx, because h is conventional in numerical analysis). Then
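[Fig. 16.6: the curve y = f(x) between A (x = a) and B (x = b), fitted by a polygonal curve through the points at $x_0 = a, x_1, x_2, \ldots, x_{N-1}, x_N = b$, with step length h.]
$$h = \frac{b - a}{N}.$$
Number the N + 1 points of division 0, 1, 2, … , N: the x values are
$x_n = a + nh$, and we write $y_n = f(x_n)$. The strip between $x_{n-1}$ and $x_n$ is
approximated by a trapezium of area $\tfrac12 h(y_{n-1} + y_n)$, so the total area is approximately
$$\tfrac12 h(y_0 + y_1) + \tfrac12 h(y_1 + y_2) + \cdots + \tfrac12 h(y_{N-1} + y_N)$$
$$= \frac{b - a}{N}\left[\tfrac12 y_0 + (y_1 + y_2 + \cdots + y_{N-1}) + \tfrac12 y_N\right].$$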
This is called the trapezium rule.
Trapezium rule
$$\int_a^b f(x)\,dx \approx \frac{b - a}{N}\left[\tfrac12 y_0 + (y_1 + y_2 + \cdots + y_{N-1}) + \tfrac12 y_N\right]. \tag{16.4}$$
In the following example, we compare the trapezium rule (16.4) with the rectangle
rule (15.2), which we can recast for comparison as
$$\int_a^b f(x)\,dx \approx \frac{b - a}{N}\,(y_0 + y_1 + \cdots + y_{N-1}).$$
Example 16.6 Compare the efficiency of the trapezium rule (16.4) with the
rectangle rule (15.2) for approximating to $\int_0^1 e^{-x}\,dx$.
We set out the results in the following table.
N                 10          100         1000
h = (b − a)/N     0.1         0.01        0.001
Rectangle rule    0.66        0.635       0.6324
Trapezium rule    0.632 647   0.632 125   0.632 120
The exact value is 0.632 120 5… . For three-decimal accuracy, the rectangle rule requires
about 1000 divisions and the trapezium rule only about 12. There are many formulae
that are far more efficient than even the trapezium rule, one of the best of these, for
combining simplicity with accuracy, being Simpson’s rule (see Problem 16.21). You
should look at books on numerical analysis for others.
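The comparison between the two rules is easy to reproduce. The sketch below implements the rectangle rule and the trapezium rule as stated above and applies them to $e^{-x}$ on [0, 1]; the function names are illustrative, and the printed figures depend on rounding, so they are best compared with the exact value 1 − e⁻¹.

```python
import math


def rectangle_rule(f, a, b, n):
    # (b - a)/N * (y_0 + y_1 + ... + y_{N-1})
    h = (b - a) / n
    return h * sum(f(a + k * h) for k in range(n))


def trapezium_rule(f, a, b, n):
    # (b - a)/N * [y_0/2 + (y_1 + ... + y_{N-1}) + y_N/2]
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


def integrand(x):
    return math.exp(-x)


exact = 1.0 - math.exp(-1.0)
for n in (10, 100, 1000):
    print(n, rectangle_rule(integrand, 0.0, 1.0, n), trapezium_rule(integrand, 0.0, 1.0, n), exact)
```

The rectangle-rule error falls roughly in proportion to h, while the trapezium-rule error falls roughly in proportion to h², which is why far fewer subdivisions are needed.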
[Fig. 16.7: particles at points $(x_n, y_n)$ and the centre of mass G at (X, Y), with the lines AB and CD through G.]
the equations
$$\sum_{n=1}^{N} m_n(x_n - X) = 0, \qquad \sum_{n=1}^{N} m_n(y_n - Y) = 0. \tag{16.5}$$
$$\sum_{n=1}^{N} m_n x_n - X\sum_{n=1}^{N} m_n = 0, \qquad \sum_{n=1}^{N} m_n y_n - Y\sum_{n=1}^{N} m_n = 0.$$
Let $\sum_{n=1}^{N} m_n = M$, the total mass; then these equations give
$$X = \frac{1}{M}\sum_{n=1}^{N} m_n x_n, \qquad Y = \frac{1}{M}\sum_{n=1}^{N} m_n y_n.$$
If instead of a number of particles there is a solid plate, then this too has a
balancing point. Assume that the plate is uniform so that its mass per unit area,
µ (Greek mu), is the same everywhere on it.
We also assume that the shape of the plate is such that no vertical or horizontal
line cuts across the boundary more than twice: once going in and again going out.
If the shape does not have this property, then the process as explained here has to
be modified.
Suppose that the centre of mass G is at (X, Y). Divide the area into narrow
vertical strips of width δx (Fig. 16.8a). Let the total length, or height, of a
representative strip as shown be V(x). Then its geometrical area δA is nearly equal
to V δx, and its mass δm is nearly µV δx. Therefore the moment about a vertical
axis AB through G : (X, Y) is approximately given by (x − X) δm ≈ (x − X) µV δx.
[Fig. 16.8: (a) a vertical strip of height V(x) and area δA, with the axis AB through G; (b) a horizontal strip between y and y + δy, the plate lying between y = c and y = d.]
The sum of all the elementary moments must be zero, since G is the mass centre.
So, in the limit as δx tends to 0, we have
$$\lim_{\delta x \to 0}\sum_{x=a}^{x=b}(x - X)V(x)\mu\,\delta x = 0,$$
where x = a and x = b represent the extreme left and right limits of the plate. Since
µ and X are constants, this is the same as
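$$\mu\lim_{\delta x \to 0}\sum_{x=a}^{x=b} xV(x)\,\delta x = \mu X\lim_{\delta x \to 0}\sum_{x=a}^{x=b} V(x)\,\delta x = \mu AX,$$
where A is the area of the plate, equal to $\lim_{\delta x \to 0}\sum_{x=a}^{x=b} V(x)\,\delta x$. Cancelling µ, we obtain
$$X = \frac{1}{A}\lim_{\delta x \to 0}\sum_{x=a}^{x=b} xV(x)\,\delta x = \frac{1}{A}\int_a^b xV(x)\,dx.$$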
Similarly, by dividing the y axis into steps δy, and considering the moments
of horizontal strips of length H(y) (see Fig. 16.8b) about a horizontal axis CD
through G, we obtain
$$Y = \frac{1}{A}\int_c^d yH(y)\,dy,$$
where y = c and y = d are the extreme lower and upper limits of the plate.
In these expressions, all reference to mass has gone ( µ is no longer present).
Therefore the centre of mass of a uniform plate is also called the centroid of the
figure representing the plate, and it depends only on its shape and size.
In fact the moments about every line through G are zero, not simply the
moments about AB and CD parallel to the x and y axes that we used to find G.
$$X = \frac{1}{A}\int_a^b xV(x)\,dx; \qquad Y = \frac{1}{A}\int_c^d yH(y)\,dy,$$
where A is the area. Here, respectively, V(x) and H(y) are the lengths of the
vertical and horizontal strips, and x = a, b (resp. y = c, d) are the extreme
horizontal (resp. vertical) boundaries of the figure. (16.6)
Example 16.7 Find the position of the centroid or centre of mass of an isosceles
triangle of height h and base b.
[Fig. 16.9: the triangle with its vertex at O and its axis of symmetry along the x axis, with sides $y = \pm\frac{b}{2h}x$ and base vertices $(h, \pm\tfrac12 b)$.]
Choose axes which make the job as simple as possible. In this case, use the axes shown
in Fig. 16.9.
From the symmetry of the isosceles triangle about the x axis, the centroid must lie on
this axis, so Y = 0 without any calculations.
The sides have equations
$$y = \pm\frac{b}{2h}\,x;$$
therefore the length of the strip at x is given by
$$V(x) = \frac{b}{h}\,x.$$
Also the area A of the triangle is given by
$$A = \tfrac12 bh.$$
Therefore by (16.6),
$$X = \frac{2}{bh}\int_0^h x\left(\frac{b}{h}\,x\right)dx = \frac{2}{h^2}\int_0^h x^2\,dx = \tfrac23 h.$$
[Fig. 16.10 and Fig. 16.11: figure content not recoverable.]
added, as if they were particles, and in the limit we obtain a definite integral. It is
important to select axes and suitably shaped area elements to make a particular
problem manageable.
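[Fig. 16.12: the triangular plate ABC with C at x = h, sides BC: $y = -\frac{b}{2h}x + \tfrac12 b$ and AC: $y = \frac{b}{2h}x - \tfrac12 b$, and a representative vertical strip of length V(x), width δx and area δA at x.]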
The axes and the representative strip at x are shown in Fig. 16.12. The equations
of BC and AC are
$$y = \pm\left(-\frac{b}{2h}\,x + \tfrac12 b\right)$$
respectively, so the length V(x) to be assigned to the strip is
$$V(x) = -\frac{b}{h}\,x + b,$$
and the area δA is approximated by
$$\delta A \approx \left(-\frac{b}{h}\,x + b\right)\delta x.$$
Since the plate is uniform, the mass per unit area is (total mass)/(area), or $M/\tfrac12 bh$,
so the mass element δm is approximated by
$$\delta m \approx \frac{M}{\tfrac12 bh}\left(-\frac{b}{h}\,x + b\right)\delta x = \frac{2M}{h}\left(1 - \frac{x}{h}\right)\delta x.$$
Therefore the moment of inertia I is given by
$$I = \lim_{\delta x \to 0}\sum_{x=0}^{x=h} x^2\,\delta m = \lim_{\delta x \to 0}\sum_{x=0}^{x=h} x^2\,\frac{2M}{h}\left(1 - \frac{x}{h}\right)\delta x$$
$$= \frac{2M}{h}\int_0^h x^2\left(1 - \frac{x}{h}\right)dx = \frac{2M}{h}\int_0^h\left(x^2 - \frac{1}{h}\,x^3\right)dx$$
$$= \frac{2M}{h}\left[\tfrac13 x^3 - \frac{1}{4h}\,x^4\right]_0^h = \frac{2M}{h}\,\frac{h^3}{12} = \tfrac16 Mh^2.$$
Example 16.10 Find the moment of inertia of a circular disc of radius R and
mass M about an axis through its centre and perpendicular to the plane of
the disc.
[Fig. 16.13: a disc of radius R with a ring-shaped element δA between radii r and r + δr, and the axis through the centre.]
The usual (x, y) coordinates are not natural to this problem. In Fig. 16.13, the polar
coordinate r ranges from 0 to R. Break this range into ring-shaped steps as shown, the
representative ring or annulus having inner radius r and thickness δr. These constitute
the area elements δA.
We have δA ≈ 2πr δr, and the mass per unit area is $M/\pi R^2$, so that the mass of the ring
δm is approximately
$$\delta m \approx \frac{M}{\pi R^2}\,2\pi r\,\delta r.$$
The moment of inertia of the ring must be equal to that of a suitable distribution of
closely spaced particles along its circumference. The contribution of each of these
imaginary particles to the moment of inertia of the ring is equal to its mass times $r^2$.
Since r is constant on the ring, its moment of inertia δI is equal to the total mass of
the ring times $r^2$:
$$\delta I \approx r^2\,\delta m \approx \frac{2M}{R^2}\,r^3\,\delta r.$$
Finally
$$I = \lim_{\delta r \to 0}\sum_{r=0}^{r=R}\frac{2M}{R^2}\,r^3\,\delta r = \frac{2M}{R^2}\int_0^R r^3\,dr = \frac{2M}{R^2}\cdot\tfrac14 R^4 = \tfrac12 MR^2.$$
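Moments of inertia of this kind are sums of (distance)² × (mass element), so the closed forms ½MR² for the disc and ⅙Mh² for the triangular plate can be checked by summing over thin elements. The sketch below uses a midpoint sum; the function names and the sample values of M, R and h are assumptions made for illustration.

```python
import math


def midpoint_integral(g, a, b, n=10000):
    # Midpoint-rule estimate of the integral of g over [a, b].
    step = (b - a) / n
    return sum(g(a + (k + 0.5) * step) for k in range(n)) * step


def disc_moment_of_inertia(M, R):
    # Sum of r^2 * dm over rings, with dm = (M / (pi R^2)) * 2 pi r dr.
    return midpoint_integral(lambda r: r ** 2 * (M / (math.pi * R ** 2)) * 2 * math.pi * r, 0.0, R)


def triangle_moment_of_inertia(M, h):
    # Sum of x^2 * dm over vertical strips, with dm = (2M/h)(1 - x/h) dx.
    return midpoint_integral(lambda x: x ** 2 * (2 * M / h) * (1 - x / h), 0.0, h)


print(disc_moment_of_inertia(2.0, 3.0), 0.5 * 2.0 * 3.0 ** 2)
print(triangle_moment_of_inertia(2.0, 3.0), 2.0 * 3.0 ** 2 / 6.0)
```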
Self-test 16.3
A plane area is bounded by the parabola $y = 1 - x^2$ and the axis y = 0.
Express the ordinate Y of the centroid as the ratio of two integrals.
Problems
(Units are kilogram, metre, second (SI units) where (c) y = x(1 − x), 0 x 1;
they are unstated.) (d) y = sin x, 0 x π;
(e) y = x 3, −1 x 1 (the fact that x3 is negative
16.1 The resistance R of a compression spring over part of its range does not have to be taken
is given by R = 100x + 1000x2, where x is the into account: the volume elements are always
displacement from its natural length. Find the positive, unlike area elements);
work done in compressing it through a distance (f) y = x(1 − x), 0 x 2 (see the note in (e));
of 0.01. (g) y = x−1, 1 x ∞ (contrast Example 15.9,
for area);
(h) y = x–, 0 x 1.
1
16.2 The velocity v of a point moving along the 4
A into a wall at the end A. Sum the moments about current, that is in a period 2π/ω. Does it make any
A of elements of length δx, form a definite integral, difference at what instant you regard the period as
and so find the moment supporting the beam at A. starting? (To carry out the integration, you will
need the identity cos2A = --12 (1 + cos 2A).)
16.8 A ‘beam’ in the shape of a circular spindle
made of material of density 500 is fixed to a vertical 16.12 Find the geometric area enclosed between
wall at the end A with its axis of symmetry the curves y = −x and y = x(x − 1) on the interval
horizontal. Its cross-sectional area (perpendicular 0 x 2, by considering vertical strips between
to its axis) is 4 × 10 − 4(1 + 0.4x2), where x is the curves of width δx.
measured from A. Its length is 1. Find the moment
at A required to support it under gravity. 16.13 Find the geometric area enclosed between
Suppose that the data are the same, except that the curves y = −x and y = x3 between x = −1 and
the cross-section is square, or possibly irregular x = 1 by considering vertical strips of width δx
in shape. Does this affect the answer? Suppose connecting the curves. (Be careful about signs:
that the axis is bent, but that x still measures these curves cross.)
the perpendicular distance from the wall: is the
calculation affected? 16.14 For the angular ranges specified, sketch the
curves given in polar coordinates below and find
16.9 A narrow tube of length 10 cm and cross- the sectorial areas.
section 0.1 cm2 contains a chemical solution, with (a) r = θ, 0 θ 2π (a spiral arc);
concentration c(x) = 0.04 e − 4 x g cm−3, where x is
1
(b) r = 2 cos θ, − --12 π θ --12 π (a circle);
the distance from one end. Find the total mass of (c) r = eθ /2π, 0 θ π (spiral arc);
solute in the tube. (d) r = sin 2θ, 0 θ --12 π.
(Remember the identities cos2A = --12 (1 + cos 2A),
16
16.10 The water clock in Fig. 16.14 has depth sin2A = --12 (1 − cos 2A).)
0.5 m, and its profile is given by r(h) = 0.39h–,
1
4
where r(h) is the radius at height h from the 16.15 The end of a water trough is a rectangle
outlet in the bottom. The size of the outlet hole of height H and width L. Find the total force and
is such as to drain the water at a rate given by moment on the end when the trough is full. (The
dV pressure, meaning the force per unit area acting
= − 0.003h 2 m 3 h −1,
1
perpendicularly on any surface, at depth y is
dt
ρgy, where ρ is density and g the gravitational
where V is the volume of water remaining. Show constant.)
that the water level falls at a uniform rate, and find
how long it runs. (Consider the change δh in level 16.16 Determine the position of the centre of mass
which occurs in a short time δt.) of a symmetrical cone of circular cross-section
which has height H and base radius R.
or impossible to evaluate directly. Estimate them the total length s of the curve is given by
PROBLEMS
by using the trapezium rule (16.4). (Since you 2
1
b
⎡ ⎛ dy ⎞ ⎤
2
e
2
1
−x 2 Compute the lengths of the following curves.
(a) sin 2 x dx; (b) dx;
0 0
(Try the trapezium rule, Simpson’s rule of Problem
16.21, and an integrating routine from a software
2
2
ex dx sin x package if you know how to use it: the interest lies
(c) ; (d) dx.
1 1 + x3 1
x in comparing them.)
(a) y = sin x, 0 x 1;
(b) y = x2, 0 x 2;
16.21 The following is called Simpson’s rule for
(c) y = e x, −1 x 1;
numerical integration. It results from splitting (d) y = (1 − x 2 ) 2 , −1 x 1 (a semicircle, so it can
1
y dx ≈
b−a δs = [(δx)2 + (δy)2 ] 2 = [r 2(δθ )2 + (δr)2 ] 2 ,
1 1
$$\int (3x - 2)^3\,dx.$$
We carried out this integration in Example 14.8 by starting with a guess that the
result will resemble (3x − 2)4. We now describe a method less dependent on trial
and error.
We shall take up a clue suggested by the chain rule procedure (Section 3.3). Put
3x − 2 = u. (17.1)
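Then the integral becomes
$$\int u^3\,dx.$$
Since du/dx = 3, we have $dx = \tfrac13\,du$, so
$$\int (3x - 2)^3\,dx = \int u^3\left(\tfrac13\,du\right) = \tfrac13\int u^3\,du = \tfrac{1}{12}u^4 + C.$$
In terms of x,
$$\int (3x - 2)^3\,dx = \tfrac{1}{12}(3x - 2)^4 + C,$$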
and this is correct. In checking its correctness by differentiation, we use the chain
rule with u = 3x − 2, and find we are simply reversing the order of the operations
that we just went through.
Example 17.1 Use a substitution to obtain $\displaystyle\int\frac{dx}{2x - 1}$.
Try
u = 2x − 1.
We shall need to express dx in terms of u. Since du/dx = 2, we have
$$dx = \tfrac12\,du.$$
The integral therefore becomes, in terms of u,
$$\int\frac{dx}{2x - 1} = \int\frac{\left(\tfrac12\,du\right)}{u} = \tfrac12\ln|u| + C = \tfrac12\ln|2x - 1| + C.$$
Example 17.2 Evaluate $\displaystyle\int\sin(3x + 2)\,dx$.
Put
u = 3x + 2,
then du/dx = 3, so du = 3 dx, or $dx = \tfrac13\,du$. The integral becomes
$$\int\sin(3x + 2)\,dx = \int\sin u\cdot\left(\tfrac13\,du\right) = \left(-\tfrac13\cos u\right) + C = -\tfrac13\cos(3x + 2) + C.$$
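As the text notes, a result obtained by substitution is checked by differentiation. The sketch below does the check numerically for the three integrals worked above, comparing a central-difference derivative of each proposed antiderivative with the original integrand. The helper names and the sample points are illustrative assumptions (the points avoid x = ½, where 1/(2x − 1) is undefined).

```python
import math


def derivative(F, x, h=1e-5):
    # Central-difference estimate of dF/dx.
    return (F(x + h) - F(x - h)) / (2.0 * h)


# Pairs (proposed antiderivative with C = 0, original integrand).
checks = [
    (lambda x: (3 * x - 2) ** 4 / 12.0, lambda x: (3 * x - 2) ** 3),
    (lambda x: 0.5 * math.log(abs(2 * x - 1)), lambda x: 1.0 / (2 * x - 1)),
    (lambda x: -math.cos(3 * x + 2) / 3.0, lambda x: math.sin(3 * x + 2)),
]


def max_error(points=(0.8, 1.1, 1.7)):
    # Largest mismatch between dF/dx and the integrand over all checks and points.
    return max(abs(derivative(F, x) - f(x)) for F, f in checks for x in points)


print(max_error())  # should be very small
```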
The essence of the matter is that the change of variable or substitution led to a
simpler integral than the one we started with. In general, for integrals of this type,
Type $\displaystyle\int f(ax + b)\,dx$
Put u = ax + b; then $\dfrac{du}{dx} = a$, or $dx = \dfrac{1}{a}\,du$. The integral transforms to
$$\frac{1}{a}\int f(u)\,du. \tag{17.2}$$
It is worth while to try this substitution in more general cases, even if it is not
obvious that a simplification will take place.
Example 17.3 Evaluate $\displaystyle\int x(2x - 1)^3\,dx$.
This is not quite of the form (17.2) because of the presence of the loose x. Nevertheless,
put
u = 2x − 1,
with the object of simplifying at least the most complicated part. Then
du = 2 dx, or $dx = \tfrac12\,du$.
We also need to express x in terms of u, using u = 2x − 1:
$$x = \tfrac12(u + 1).$$
Now we have
$$\int x(2x - 1)^3\,dx = \int\tfrac12(u + 1)u^3\left(\tfrac12\,du\right) = \tfrac14\int(u^4 + u^3)\,du = \tfrac{1}{20}u^5 + \tfrac{1}{16}u^4 + C$$
$$= \tfrac{1}{20}(2x - 1)^5 + \tfrac{1}{16}(2x - 1)^4 + C.$$
Do not miss the possibility of making a substitution in simple cases. For example:
Use a substitution to obtain $\displaystyle\int\sin^2(3x + 4)\,dx$.
Try putting
u = $x^2$,
with the objective of simplifying the unfamiliar-looking term $e^{x^2}$. It is then necessary to
deal with x and dx in the integral. We have
$$\frac{du}{dx} = 2x,$$
which we can write as du = 2x dx, or
$$x\,dx = \tfrac12\,du.$$
In this way we have translated the whole group (x dx) into terms of u, instead of having
to deal separately with x and dx. Therefore
$$\int x\,e^{x^2}\,dx = \int e^{x^2}(x\,dx) = \int e^{u}\left(\tfrac12\,du\right) = \tfrac12\int e^{u}\,du = \tfrac12 e^{u} + C = \tfrac12 e^{x^2} + C.$$
$$\int\frac{1}{3x^2 + 2}\,(x\,dx).$$
The integrand contains a function of x2 and the combination x dx which appeared in
Example 17.4. This suggests putting u = x2 to give a simpler integral. However, we can
do even better than this.
Put
u = 3x2 + 2.
Then du/dx = 6x, so that
$$x\,dx = \tfrac16\,du.$$
Therefore
$$\int\frac{1}{3x^2 + 2}\,(x\,dx) = \int\frac{1}{u}\left(\tfrac16\,du\right) = \tfrac16\int\frac{du}{u} = \tfrac16\ln|u| + C = \tfrac16\ln(3x^2 + 2) + C,$$
where C is an arbitrary constant. The modulus sign in the logarithm was discarded
because 3x2 + 2 is always positive.
The general result is as follows:
Put $u = ax^2 + b$; then $x\,dx = \dfrac{1}{2a}\,du$, so
$$I = \frac{1}{2a}\int f(u)\,du. \tag{17.3}$$
Self-test 17.2
Evaluate $\displaystyle\int x\sin(x^2)\,dx$.
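Example 17.6 Evaluate $\displaystyle\int\sin^3 x\cos x\,dx$.
In this case m = 1 and n = 3. Aim to simplify the worst term $\sin^3 x$ by putting
u = sin x.
Then $\sin^3 x$ becomes $u^3$, and we must deal with cos x dx. As always, begin with
du/dx = cos x. Therefore
du = cos x dx,
so, by good fortune, cos x dx appears in one piece. Then we have
$$\int\sin^3 x\cos x\,dx = \int u^3\,du = \tfrac14 u^4 + C = \tfrac14\sin^4 x + C.$$
Example 17.7 Evaluate $\displaystyle\int\tan x\,dx$.
We have
$$\int\tan x\,dx = \int\frac{\sin x}{\cos x}\,dx.$$
Put u = cos x; then du = −sin x dx,
so, apart from the sign, we have exactly the combination required for the rest of the
integrand. Then
$$\int\tan x\,dx = -\int\frac{du}{u} = -\ln|u| + C = -\ln|\cos x| + C.$$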
This technique can be used for products $\cos^m ax\,\sin^n ax$, when either m or n
(or both) are odd numbers, either positive or negative, and for certain other cases
as well.
Example 17.8 Evaluate $\displaystyle\int\cos^3 x\,dx$. (This is the case m = 3, n = 0.)
Write $\cos^3 x\,dx = \cos^2 x\cdot(\cos x\,dx)$, and put
u = sin x
(not cos x as possibly expected). Then du/dx = cos x, so that
du = cos x dx.
The remaining part of the integrand is $\cos^2 x$, and we can transform this by writing
$$\cos^2 x = 1 - \sin^2 x = 1 - u^2.$$
Then we have
$$\int\cos^3 x\,dx = \int(1 - u^2)\,du = u - \tfrac13 u^3 + C = \sin x - \tfrac13\sin^3 x + C,$$
where C is arbitrary.
You should try also the substitution u = cos x. It leads to an integral in terms of u that
is correct but worse than the original.
Example 17.9 Evaluate $I = \displaystyle\int\cos^3 2x\,\sin^3 2x\,dx$.
(Here m = 3, n = 3.) The technique requires us to decompose the term whose power is
odd. Here both powers are odd, so either will do. We shall split the integrand like this:
$$I = \int\cos^3 2x\,\sin^2 2x\,(\sin 2x\,dx).$$
Put u = cos 2x so that
$$\sin 2x\,dx = -\tfrac12\,du.$$
Since $\sin^2 2x = 1 - \cos^2 2x$, the integral becomes
$$I = \int u^3(1 - u^2)\left(-\tfrac12\,du\right) = -\tfrac12\int(u^3 - u^5)\,du = -\tfrac18 u^4 + \tfrac{1}{12}u^6 + C$$
$$= -\tfrac18\cos^4 2x + \tfrac{1}{12}\cos^6 2x + C,$$
with C arbitrary.
The general rule is as follows.
negative integer
$$I = \int\cos^{m-1} ax\,\sin^n ax\,(\cos ax\,dx);$$
then u = sin ax, $\cos ax\,dx = \dfrac{1}{a}\,du$, and $\cos^2 ax = 1 - \sin^2 ax$.
(c) If n and m are both odd, use either (a) or (b). (17.4)
Self-test 17.3
Evaluate
$$I_1 = \int\sin x\cos^3 x\,dx \quad\text{and}\quad I_2 = \int\sin(3x + 2)\cos^3(3x + 2)\,dx.$$
$$I = \int_0^{\frac12\pi}\cos^3 x\,dx.$$
(a) (By finding an indefinite integral in terms of x.) As in Example 17.8, put u = sin x,
du = cos x dx;
$$\int\cos^3 x\,dx = \int(1 - u^2)\,du = u - \tfrac13 u^3 = \sin x - \tfrac13\sin^3 x,$$
(b) (Working with u throughout.) Put u = sin x into I. In order to express the limits of
integration in terms of u, note that u = 0 when x = 0, and u = 1 when x = ½π. Then
(writing the limits so as to make them more explicit)
$$\int_{x=0}^{x=\frac12\pi}\cos^3 x\,dx = \int_{u=0}^{u=1}(1 - u^2)\,du = \left[u - \tfrac13 u^3\right]_{u=0}^{u=1} = \left(1 - \tfrac13\right) - 0 = \tfrac23.$$
In Example 17.10b it would have been wrong to write the integral in the form
$$\int_0^{\frac12\pi}(1 - u^2)\,du.$$
This would imply that we were going to put u equal to 0 and ½π after integrating.
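The effect of changing (or failing to change) the limits can be seen numerically. The sketch below estimates the original integral in x, the transformed integral in u with the correct limits 0 and 1, and the transformed integrand with the unchanged limits 0 and ½π; the helper name `midpoint_integral` is an illustrative assumption.

```python
import math


def midpoint_integral(g, a, b, n=10000):
    # Midpoint-rule estimate of the integral of g over [a, b].
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h


# Original integral in x, from x = 0 to x = pi/2.
in_x = midpoint_integral(lambda x: math.cos(x) ** 3, 0.0, math.pi / 2)

# After u = sin x, with the limits converted to u = 0 and u = 1.
in_u = midpoint_integral(lambda u: 1.0 - u * u, 0.0, 1.0)

# The mistake described in the text: new integrand, old limits.
wrong = midpoint_integral(lambda u: 1.0 - u * u, 0.0, math.pi / 2)

print(in_x, in_u, wrong)  # the first two agree; the third does not
```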
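Example 17.11 Find the centroid (centre of mass) of the uniform semicircular
plate shown in Fig. 17.1.
[Fig. 17.1: the semicircular plate bounded by $x^2 + y^2 = R^2$ and the y axis, showing a vertical strip of height V(x) and width δx.]
The symmetry shows that the centroid G lies on the x axis. From (16.6), the
x coordinate of G is given by
$$X = \frac{1}{\tfrac12\pi R^2}\int_0^R V(x)\,x\,dx.$$
Since $V(x) = 2(R^2 - x^2)^{1/2}$,
$$X = \frac{4}{\pi R^2}\int_0^R (R^2 - x^2)^{1/2}\,x\,dx.$$
This is an integral of the type of (17.3). To simplify it, put $u = R^2 - x^2$, so that du/dx = −2x
and $x\,dx = -\tfrac12\,du$. Also, $u = R^2$ when x = 0, and u = 0 when x = R. Therefore
$$X = \frac{4}{\pi R^2}\int_{R^2}^{0} u^{1/2}\left(-\tfrac12\,du\right) = -\frac{2}{\pi R^2}\,\frac{2}{3}\left[u^{3/2}\right]_{R^2}^{0} = -\frac{4}{3\pi R^2}\left[0 - R^3\right] = \frac{4}{3\pi}\,R.$$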
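The centroid formula (16.6) can be applied numerically to the semicircular plate, which gives an independent check on the value 4R/(3π). The sketch below assumes the strip length $V(x) = 2(R^2 - x^2)^{1/2}$ and uses a midpoint sum (the midpoints avoid x = R, where the strip length has an infinite slope); the function names are illustrative.

```python
import math


def midpoint_integral(g, a, b, n=100000):
    # Midpoint-rule estimate of the integral of g over [a, b].
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h


def semicircle_centroid_x(R):
    # X = (1/A) * integral of x V(x) dx, with V(x) = 2 sqrt(R^2 - x^2), A = pi R^2 / 2.
    area = 0.5 * math.pi * R * R
    moment = midpoint_integral(lambda x: x * 2.0 * math.sqrt(R * R - x * x), 0.0, R)
    return moment / area


print(semicircle_centroid_x(2.0), 4.0 * 2.0 / (3.0 * math.pi))
```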
Self-test 17.4
Evaluate
$$I_1 = \int_0^1 \frac{dx}{1 + x} \quad\text{and}\quad I_2 = \int_1^2 \frac{\ln x}{x}\,dx.$$
Example 17.12 Find a substitution to evaluate $\int \frac{dx}{(1 - x^2)^{1/2}}$.
Try to simplify $(1 - x^2)^{1/2}$ first, hoping that $dx$ will work out conveniently. To do this try
$$x = \sin u; \qquad (17.5)$$
then $(1 - x^2)^{1/2} = (1 - \sin^2 u)^{1/2} = \cos u$. Also $dx/du = \cos u$, so
$$dx = \cos u\,du.$$
Therefore
$$\int \frac{dx}{(1 - x^2)^{1/2}} = \int \frac{\cos u\,du}{\cos u} = u + C = \arcsin x + C.$$
You might try putting u = 1 − x2 instead: the resulting integral is different from, but no
better than, the original.
Example 17.13 From Example 17.12, we know that $\int \frac{dx}{(1 - x^2)^{1/2}} = \arcsin x + C$.
Use this result to obtain $\int \frac{dx}{(4 - x^2)^{1/2}}$.
Aim to convert $(4 - x^2)^{1/2}$ into something like $(1 - u^2)^{1/2}$, so as to be able to use the
given result.
$$(4 - x^2)^{1/2} = 2(1 - \tfrac{1}{4}x^2)^{1/2} = 2[1 - (\tfrac{1}{2}x)^2]^{1/2},$$
so put $u = \tfrac{1}{2}x$; then $dx = 2\,du$ and
$$\int \frac{dx}{(4 - x^2)^{1/2}} = \int \frac{2\,du}{2(1 - u^2)^{1/2}} = \int \frac{du}{(1 - u^2)^{1/2}} = \arcsin u + C = \arcsin \tfrac{1}{2}x + C.$$
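17.5 OCCASIONAL SUBSTITUTIONS
Example 17.14 From Appendix E, $\int \frac{dx}{1 + x^2} = \arctan x + C$.
Use this result to evaluate $\int \frac{dx}{1 + 9x^2}$.
Put $u = 3x$, so that $dx = \tfrac{1}{3}\,du$; then
$$\int \frac{dx}{1 + 9x^2} = \tfrac{1}{3}\int \frac{du}{1 + u^2} = \tfrac{1}{3}\arctan u + C = \tfrac{1}{3}\arctan 3x + C.$$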
If the required integral does not seem to be similar to one that is already
known, then one has in effect to guess a suitable substitution:
We can simplify the logarithm (at the risk of extra complexity elsewhere) by putting
$$x = e^u$$
so that $\ln x = \ln e^u = u$. Since $dx/du = e^u$, we have $dx = e^u\,du$. Therefore
$$\int \frac{\ln^3 x}{x}\,dx = \int \frac{u^3}{e^u}\,e^u\,du = \int u^3\,du = \tfrac{1}{4}u^4 + C = \tfrac{1}{4}(\ln x)^4 + C.$$
The general shape of the integrand often suggests a substitution that is sure to
simplify it. Suppose we notice that f(x) can be written in the special form
$$f(x) = c\,g(u)\frac{du}{dx},$$
in which $u$ is some function of $x$ and $c$ a constant. Then by eqn (15.6b) we have
$$I = \int f(x)\,dx = c\int g(u)\frac{du}{dx}\,dx = c\int g(u)\,du,$$
$$I = \int (x^4 + 1)^7 x^3\,dx.$$
We could evaluate this by expanding $(x^4 + 1)^7$ by the binomial theorem, eqn (1.44).
However it is far simpler to notice that $x^3 = \tfrac{1}{4}\,d(x^4 + 1)/dx$, so put $x^4 + 1 = u(x)$,
changing the variable to $u$:
$$I = \int u^7 \cdot \tfrac{1}{4}\,du = \tfrac{1}{4}\cdot\tfrac{1}{8}u^8 + C = \tfrac{1}{32}(x^4 + 1)^8 + C,$$
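A result obtained by substitution can be checked numerically. The sketch below is illustrative only: the helper name `simpson`, the interval [0, 1] and the number of subintervals are assumptions, not part of the text. It compares a Simpson's-rule estimate of the definite integral of $(x^4 + 1)^7 x^3$ with the change in the antiderivative $\tfrac{1}{32}(x^4 + 1)^8$, whose derivative is the integrand.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x):
    return (x**4 + 1) ** 7 * x**3


def antiderivative(x):
    # From u = x**4 + 1: (1/4) * u**8 / 8 = u**8 / 32
    return (x**4 + 1) ** 8 / 32


numeric = simpson(integrand, 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)
print(numeric, exact)
```

The two printed values should agree closely; a disagreement would point to a slip in the constant factor.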
Example 17.16
Evaluate $\int (x^{1/2} + 1)^{1/3}\,x^{-1/2}\,dx$.
The important thing is to spot that $(d/dx)(x^{1/2} + 1)$ is like the remaining factor, $x^{-1/2}$. This
suggests that $u = x^{1/2} + 1$ is the right substitution. Specifically, put $u = x^{1/2} + 1$; then
$$\frac{du}{dx} = \tfrac{1}{2}x^{-1/2} \quad\text{and so}\quad x^{-1/2}\,dx = 2\,du.$$
The integral becomes
$$2\int u^{1/3}\,du = \tfrac{3}{2}u^{4/3} + C = \tfrac{3}{2}(x^{1/2} + 1)^{4/3} + C.$$
Some further special substitutions together with illustrative integrals are listed in
Problem 17.23.
Self-test 17.5
Evaluate
(a) $\int_0^4 \frac{dx}{1 + \sqrt{x}}$
(use the substitution $x = u^2$);
(b) √(xdx− x )
0
2
(use the substitution x = sin2u).
Therefore
$$\int \frac{dx}{x^2 - 1} = \tfrac{1}{2}\int \frac{dx}{x - 1} - \tfrac{1}{2}\int \frac{dx}{x + 1} = \tfrac{1}{2}\ln|x - 1| - \tfrac{1}{2}\ln|x + 1| + C.$$
C and B are arbitrary, on any range that excludes the points x = ±1.
$$\frac{cx + d}{px^2 + qx + r}$$
in which the equation px2 + qx + r = 0 has no real roots (i.e. the denominator has no
real factors). The following example shows how to evaluate them by ‘completing
the square’ in the denominator.
The quadratic form x2 + 4x + 8 has no real factors. ‘Completing the square’ in the
denominator consists of writing x2 + 4x + 8 in the form (x + a)2 + b. The first two
terms, x2 + 4x, can be written
x2 + 4x =(x + 2)2 − 4,
so
x2 + 4x + 8 = (x + 2)2 − 4 + 8 = (x + 2)2 + 4.
The integral becomes
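$$I = \int \frac{(x + 1)\,dx}{(x + 2)^2 + 4} = \frac{1}{4}\int \frac{(x + 1)\,dx}{[\tfrac{1}{2}(x + 2)]^2 + 1}.$$
Put $u = \tfrac{1}{2}(x + 2)$, so that $x + 1 = 2u - 1$ and $dx = 2\,du$; then
$$I = \int \frac{u\,du}{u^2 + 1} - \frac{1}{2}\int \frac{du}{u^2 + 1}.$$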
To evaluate the first integral, use the substitution v = u2 + 1, as in Section 17.2; the
second is a standard integral. We obtain
$$I = \tfrac{1}{2}\ln(x^2 + 4x + 8) - \tfrac{1}{2}\arctan \tfrac{1}{2}(x + 2) + C.$$
Self-test 17.6
$$\int \frac{x^2\,dx}{(x + 3)^2(x + 1)}.$$
so
$$\int \frac{d}{dx}(uv)\,dx = \int u\frac{dv}{dx}\,dx + \int v\frac{du}{dx}\,dx + B, \qquad (17.6)$$
where B is a constant. Look at the integral on the left. It means ‘an antiderivative
of (d /dx)[u(x)v(x)]’. But, from the definition (14.1), u(x)v(x) is an antiderivative.
Therefore (17.6) becomes
$$uv = \int u\frac{dv}{dx}\,dx + \int v\frac{du}{dx}\,dx + B.$$
Now rearrange the terms to obtain
$$\int u\frac{dv}{dx}\,dx = uv - \int v\frac{du}{dx}\,dx - B.$$
This is the formula for integration by parts. Replacing −B by C:
Integration by parts
$$\int u\frac{dv}{dx}\,dx = uv - \int v\frac{du}{dx}\,dx + C, \qquad (17.7)$$
where $C$ is some constant.
It is not at first obvious how this complicated result could be of any use, but
the point of it is that the right-hand integral might be simpler than the one
on the left. The process was once called ‘partial integration’, because the $uv$
part is already integrated out. (For the dangerous effect of missing out $C$, see
Problem 17.19.)
Example 17.19
Evaluate $\int x\,e^x\,dx$ by integrating by parts.
First observe that the integrand consists of the product of two factors, $x$ and $e^x$, both of
which we can integrate and differentiate any number of times. We relate this fact to
(17.7) by identifying them with $u$ and $dv/dx$ respectively: put
$$u = x \quad\text{and}\quad \frac{dv}{dx} = e^x. \qquad \text{(i)}$$
Then
$$\frac{du}{dx} = 1 \quad\text{and}\quad v = \int e^x\,dx = e^x, \qquad \text{(ii)}$$
where we have chosen v to be the simplest antiderivative of ex. Nothing would ultimately
be changed by introducing an arbitrary constant C into v: any antiderivative will do (see
Problem 17.18).
Fill in the right-hand side of (17.7) by picking out u, v, du/dx from (i) and (ii), and
introduce the constant C:
$$\int x\,e^x\,dx = x\,e^x - \int (e^x)(1)\,dx + C = x\,e^x - \int e^x\,dx + C = x\,e^x - e^x + C,$$
where C is arbitrary.
Example 17.20
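Evaluate $\int x\cos 2x\,dx$.
Put $u = x$ and $dv/dx = \cos 2x$; then
$$\frac{du}{dx} = 1, \quad v = \int \cos 2x\,dx = \tfrac{1}{2}\sin 2x.$$
Substituting these functions into the right-hand side of (17.7):
$$\int x\cos 2x\,dx = x(\tfrac{1}{2}\sin 2x) - \int (\tfrac{1}{2}\sin 2x)(1)\,dx + C = \tfrac{1}{2}x\sin 2x + \tfrac{1}{4}\cos 2x + C.$$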
Example 17.21 For $\int x\,e^x\,dx$ (see Example 17.19), try the effect of assigning
$x$ and $e^x$ to $u$ and $dv/dx$ the ‘wrong way round’.
In Example 17.19, we successfully put $u = x$ and $dv/dx = e^x$. Now try instead
$$u = e^x, \quad \frac{dv}{dx} = x,$$
then
$$\frac{du}{dx} = e^x, \quad v = \int x\,dx = \tfrac{1}{2}x^2.$$
The integration-by-parts formula becomes
$$\int x\,e^x\,dx = e^x(\tfrac{1}{2}x^2) - \int (\tfrac{1}{2}x^2)\,e^x\,dx + C = \tfrac{1}{2}x^2 e^x - \tfrac{1}{2}\int x^2 e^x\,dx + C,$$
which is a true result, but the transformed integral is worse than the original.
Sometimes it is not immediately obvious that the method can be made to work,
as in the following.
Example 17.22
Evaluate $\int \ln x\,dx$.
The integrals of other inverse functions, such as $\arcsin x$ and $\arctan x$, respond
to the same technique (see Problem 17.11).
Example 17.23
Evaluate $\int x^2\sin x\,dx$.
then
$$\frac{du}{dx} = 2x, \quad v = -\cos x.$$
Self-test 17.7
Using integration by parts, evaluate $\int x^n\ln x\,dx$ $(n \neq -1)$. What does the
integral equal if $n = -1$?
$$\int_a^b u\frac{dv}{dx}\,dx,$$
$$\int_a^b u\frac{dv}{dx}\,dx = \left[uv - \int v\frac{du}{dx}\,dx\right]_a^b$$
$$\int_a^b u\frac{dv}{dx}\,dx = [uv]_a^b - \int_a^b v\frac{du}{dx}\,dx. \qquad (17.8)$$
This can sometimes considerably simplify the working, especially if more than
one integration by parts is needed.
$$\int_0^{\pi/2} x^2\sin x\,dx = \left[x^2(-\cos x)\right]_0^{\pi/2} - \int_0^{\pi/2} (-\cos x)(2x)\,dx = 2\int_0^{\pi/2} x\cos x\,dx,$$
because the bracketed term is zero; we did not have to wait to the end of the calculation
to see it go. To evaluate the remaining integral, integrate by parts again, putting u = x and
dv/dx = cos x; we have
$$\frac{du}{dx} = 1, \quad v = \sin x.$$
Use (17.8) again:
$$2\int_0^{\pi/2} x\cos x\,dx = 2\left([x\sin x]_0^{\pi/2} - \int_0^{\pi/2} \sin x\,dx\right) = 2\left(\tfrac{1}{2}\pi + [\cos x]_0^{\pi/2}\right) = 2\left[\tfrac{1}{2}\pi + (0 - 1)\right] = \pi - 2.$$
The following result is important for Chapter 24, and involves the use of (17.8):
$$\int_0^\infty e^{-t}t^N\,dt = N! \quad\text{when } N = 0, 1, 2, 3, \ldots.$$
(0! is defined to be 1.) (17.9)
$$\int_0^\infty e^{-t}t^k\,dt = F(k),$$
to indicate the integral’s dependence on the parameter $k$; for example, $\int_0^\infty e^{-t}t^3\,dt$ is
denoted by $F(3)$. Notice in particular that
$$F(0) = \int_0^\infty e^{-t}\,dt = [-e^{-t}]_0^\infty = 1. \qquad (17.10)$$
$$F(k) = \int_0^\infty e^{-t}t^k\,dt = [t^k(-e^{-t})]_0^\infty - \int_0^\infty (kt^{k-1})(-e^{-t})\,dt = k\int_0^\infty e^{-t}t^{k-1}\,dt$$
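The reduction $F(k) = kF(k-1)$ with $F(0) = 1$ can be tried out directly. The following is a minimal sketch, assuming that truncating the infinite range at $t = 60$ is adequate for small $k$; the function names `F_numeric` and `F_recursive` are illustrative and do not come from the text.

```python
import math


def F_numeric(k, upper=60.0, n=60000):
    """Simpson's-rule estimate of the integral of e^(-t) t^k over [0, upper]."""
    h = upper / n

    def f(t):
        return math.exp(-t) * t**k

    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


def F_recursive(k):
    """Apply F(k) = k F(k-1) repeatedly, starting from F(0) = 1."""
    return 1 if k == 0 else k * F_recursive(k - 1)


for N in range(6):
    print(N, F_recursive(N), round(F_numeric(N), 6))
```

For each $N$ the recursive value and the numerical estimate should both agree with $N!$.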
$$\int_0^\infty x\,e^{-\alpha x}\,dx$$
$$I(\alpha) = \int_0^\infty x\,e^{-\alpha x}\,dx = -\int_0^\infty \frac{d}{d\alpha}(e^{-\alpha x})\,dx,$$
the derivative being with respect to $\alpha$ (not to $x$, which is treated like a constant for
the purpose of the differentiation). It can be shown, as in Section 28.8, that the
operator $d/d\alpha$ can be taken outside the integral sign, so that we have
$$I(\alpha) = \int_0^\infty x\,e^{-\alpha x}\,dx = -\frac{d}{d\alpha}\int_0^\infty e^{-\alpha x}\,dx = -\frac{d}{d\alpha}\left(\frac{1}{\alpha}\right) = \frac{1}{\alpha^2}.$$
In cases when we can foresee that the original integrand can be written in the form
$\frac{d}{d\alpha}$ of something that we can integrate with respect to $x$,
this procedure enables the original integral to be worked out. The following
two examples further illustrate the procedure; the method can also be used for
indefinite integrals.
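$$\int x^\alpha \ln x\,dx = I(\alpha).$$
$$\frac{d}{d\alpha}(x^\alpha) = \frac{d}{d\alpha}(e^{\alpha\ln x}) = e^{\alpha\ln x}\ln x = x^\alpha\ln x.$$
Therefore
$$I(\alpha) = \int \frac{d}{d\alpha}(x^\alpha)\,dx = \frac{d}{d\alpha}\int x^\alpha\,dx = \frac{d}{d\alpha}\left(\frac{1}{\alpha + 1}x^{\alpha+1}\right) = -\frac{1}{(\alpha + 1)^2}x^{\alpha+1} + \frac{1}{\alpha + 1}x^{\alpha+1}\ln x,$$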
apart from a constant of integration.
$$\int_0^\infty \frac{dx}{(x^2 + 1)^2} = \tfrac{1}{4}\pi.$$
There is no parameter in the integral, so we shall introduce one and put α = 1 at the end.
Define I(α) by
$$I(\alpha) = \int_0^\infty \frac{dx}{(x^2 + \alpha^2)^2}. \qquad \text{(i)}$$
Observe that
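$$\frac{d}{d\alpha}\left(\frac{1}{x^2 + \alpha^2}\right) = -2\alpha\,\frac{1}{(x^2 + \alpha^2)^2} \quad\text{or}\quad \frac{1}{(x^2 + \alpha^2)^2} = -\frac{1}{2\alpha}\frac{d}{d\alpha}\left(\frac{1}{x^2 + \alpha^2}\right).$$
Then
$$I(\alpha) = -\frac{1}{2\alpha}\int_0^\infty \frac{d}{d\alpha}\left(\frac{1}{x^2 + \alpha^2}\right)dx = -\frac{1}{2\alpha}\frac{d}{d\alpha}\int_0^\infty \frac{dx}{x^2 + \alpha^2}. \qquad \text{(ii)}$$
But
$$\int_0^\infty \frac{dx}{x^2 + \alpha^2} = \left[\frac{1}{\alpha}\arctan\left(\frac{x}{\alpha}\right)\right]_0^\infty = \frac{\pi}{2\alpha}. \qquad \text{(iii)}$$
Put (iii) into (ii); we obtain
$$I(\alpha) = -\frac{1}{2\alpha}\frac{d}{d\alpha}\left(\frac{\pi}{2\alpha}\right) = \frac{\pi}{4\alpha^3}. \qquad \text{(iv)}$$
By putting $\alpha = 1$ into (iv) we obtain from (i)
$$\int_0^\infty \frac{dx}{(x^2 + 1)^2} = \frac{\pi}{4},$$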
as requested (though (iv) is a more general result).
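A closed form found by differentiating under the integral sign is easy to test numerically. The sketch below is illustrative, with assumed names `I_numeric` and `I_formula` and an assumed truncation of the infinite range at $x = 200$. It compares a Simpson's-rule estimate of $\int_0^\infty dx/(x^2 + \alpha^2)^2$ with the value $\pi/(4\alpha^3)$, taking $\alpha > 0$.

```python
import math


def I_numeric(alpha, upper=200.0, n=100000):
    """Simpson's-rule estimate of the integral of 1/(x^2 + alpha^2)^2 over [0, upper]."""
    h = upper / n

    def f(x):
        return 1.0 / (x * x + alpha * alpha) ** 2

    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


def I_formula(alpha):
    """Closed form pi / (4 alpha^3), valid for alpha > 0."""
    return math.pi / (4 * alpha**3)


for alpha in (1.0, 2.0):
    print(alpha, I_numeric(alpha), I_formula(alpha))
```

The neglected tail beyond the cut-off is roughly $1/(3 \times 200^3)$, so agreement to about seven decimal places is all that should be expected from these settings.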
Self-test 17.8
Evaluate $\int x^\alpha(\ln x)^2\,dx$.
Problems
(1 −
1 1
4
3 2 3
(g) (h) tan t dt;
0 0 17.9 Obtain ∫ f(x) dx for each of the following f(x),
6π
1
2π
1 noting that they take the form cg(u) du/dx (see the
(i) 1
12 π
cot 3w dw; (j) 0
sin u cos u du; remark at the end of Section 17.5), so that (a), for
example, will respond to the substitution u = x3 − 1.
π 1
π
(a) x2(x3 − 1)5; (b) (x − 1)( x2 −2x + 3)−1;
(d) x 2 (3x 2 + 2) 2 ;
2 1 3 1
2
(k) –12
(sin v) cos v dv; (l) cos3θ dθ ; (c) 1/(x ln x);
(e) (ex − e−x)/(ex + e−x); (f ) 1 /x 2 (x 2 + 1);
1 1
17
0 − 12 π
(g) x2 /(x3 + 1).
π 2 π /ω
1 1
2
cos t dt;
2
(c) sin 2 2t dt; (d) 21 17.11 Use integration by parts (see Example
2
0 0 17.22), writing the integrand as f(x)(1), to obtain
π π ∫ f(x) dx for each of the following f(x).
(e) sin23t cos 3t dt; (f ) cos u du. 4 (a) ln2x; (b) arcsin x;
−π 0 (c) arccos x; (d) arctan x.
17.7 Use the substitutions suggested to evaluate 17.12 To evaluate ∫ f(x) dx for the following f(x),
∫ f(x) dx for the following f(x). (In several of integrate by parts twice; then look closely at your
the questions the identity 1 + tan2A = 1/cos2A is result. (If it does not work out you have probably
needed. You may also have to refer to the table, made a mistake with a sign.) Compare your
Appendix E.) results with (15.11).
(a) ln x/x (put x = eu); (a) ex sin x; (b) e−x sin x; (c) e−x cos x.
2 12
(b) x(1 − x ) (try (i) u = 1 − x2, (ii) x = sin u);
(c) 1/(ex + e−x)1 (put u = ex); 17.13 (Integration by parts: definite integrals,
(d) 1/ (1 − x 2 ) 2 (try (i) x = sin u, (ii) x = cos u; why Section 17.8). Evaluate the following.
do the results seem to be different?); π
2π
1
1
2
0
∞ F(k) = sinkx dx,
(e) e cos x dx (integrate by parts twice);
−x 0
1
π 1
π
0 and use it to evaluate ∫ 02 sin4x dx and ∫ 02 sin5x dx.
1 0
dx
for c 0 by F(c). Deduce the properties
arccos x dx; arctan x dx;
1 1
(h) (i) 1 x
−1 0
(a) to (d) below. F(c) is obviously equal to ln c,
but do not use any of the known properties of the
ln x dx.
2
2 2
0
cos4x dx and
0
cos5x dx.
= x−1x − (−x−2)x dx = 1 + x −1 dx.
(ln x) dx (k 0).
2
I(α) = x2e−α x dx,
(j)
2
(x + 1) dx 2
1
, u=x− ;
using the technique of differentiating under the x √(x + 7x + 1)
1
4
x 2
(k) u = 1 + √x.
0
17.23 (Some additional special substitutions).
Evaluate the following integrals starting with the 17.24 If p(x) is a polynomial of degree n,
substitution suggested (further substitutions may show that
be required: the table of integrals in Appendix E
may also be helpful):
e p(x) dx = +e (−1)
x
[p(x) − p′(x) + p″(x) − ···
p (x)] + C.
x
n (n)
(a) dx
x √(x 2 − a 2 )
, x = a/u; Hence evaluate
e (x − 2x + x − 2) dx.
1
dx x 3 2
(b) , x = a/u;
17
x √(a 2 − x 2 ) 0
e p(x) dx?−x
(e) dx
3 + 5 cos x
, u = tan 12 x; 0
Fig. 18.1 (RL circuit: voltage source $E(t)$, switch, resistance $R$, inductance $L$.)
Problems in science and engineering are often most easily formulated in terms
of differential equations. Suppose for example that in the RL circuit of Fig. 18.1
the switch is closed at time t = 0, and that subsequently the voltage applied is E(t).
Then the current x(t) is found by solving the differential equation
$$L\frac{dx}{dt} + Rx = E(t).$$
Here we have collected all the terms that involve x (including dx /dt) on the left
side and have put the term that does not involve x, namely E(t), on the right. This
is the conventional arrangement. The term independent of x which comes on the
right is then called the forcing term, the reason being obvious in this case, since
E(t) drives the circuit.
The differential equation with the same left-hand side, but with a zero forcing
term on the right, plays a key role in obtaining solutions of the original equation.
Such equations are called unforced differential equations, or sometimes homogeneous
equations, and are the subject of this chapter. Also, for the present, we
shall further restrict ourselves to linear equations with constant coefficients,
which have the form:
18.1
simplest instance of all is
Example 18.1 For the differential equation $dx/dt + 2x = 0$, verify that (a) $x = e^{2t}$
is not a solution, (b) $x = 2e^{-2t}$ is a solution.
(a) Test $x = e^{2t}$. Then $dx/dt = 2e^{2t}$ and so
$$\frac{dx}{dt} + 2x = 2e^{2t} + 2e^{2t} = 4e^{2t}.$$
This is not zero, so $e^{2t}$ is not a solution.
(b) Test $x = 2e^{-2t}$. Then $dx/dt = -4e^{-2t}$ and so
$$\frac{dx}{dt} + 2x = -4e^{-2t} + 4e^{-2t} = 0.$$
The zero value is what the equation requires, so $2e^{-2t}$ is a solution.
Incidentally, we can confirm in the same way that $x = A\,e^{-2t}$, where $A$ is any constant,
is always a solution. We have
$$\frac{dx}{dt} + 2x = -2A\,e^{-2t} + 2A\,e^{-2t} = 0,$$
as it should be. This is the infinity of solutions we were expecting.
Example 18.2 Verify that the following functions are solutions of the
second-order equation $d^2x/dt^2 + 4x = 0$: (a) $x = \cos 2t$, (b) $x = \sin 2t$,
(c) $x = A\cos 2t + B\sin 2t$, where $A$ and $B$ are any constants.
Note that ‘verify’ means ‘try out’: you are not expected to show how the
solutions were obtained.
(a) If $x = \cos 2t$, then $dx/dt = -2\sin 2t$, and $d^2x/dt^2 = -4\cos 2t$. Therefore
$$\frac{d^2x}{dt^2} + 4x = -4\cos 2t + 4\cos 2t = 0$$
as required.
(b) Similarly, if $x = \sin 2t$, then
$$\frac{d^2x}{dt^2} + 4x = -4\sin 2t + 4\sin 2t = 0.$$
(c) Confirmation is straightforward, but the underlying reason why the previous
solutions can be combined into a new solution in this way is made clearer by
organizing the calculation as follows.
UNFORCED LINEAR DIFFERENTIAL EQUATIONS WITH CONSTANT COEFFICIENTS
$$\frac{d^2x}{dt^2} + 4x = \frac{d^2}{dt^2}(A\cos 2t + B\sin 2t) + 4(A\cos 2t + B\sin 2t)$$
$$= A\left(\frac{d^2}{dt^2}\cos 2t + 4\cos 2t\right) + B\left(\frac{d^2}{dt^2}\sin 2t + 4\sin 2t\right),$$
by rearranging the terms. We already know that the two bracketed expressions are zero,
so the whole expression is zero as required.
The separation of $d^2x/dt^2 + 4x$ into an ‘A’ part and a ‘B’ part in this way is possible
only because the equation is linear.
where A and m are unknown constants which we shall try to adjust to fit the
equation. From (18.3),
$$\frac{dx}{dt} + cx = Am\,e^{mt} + cA\,e^{mt} = A(m + c)\,e^{mt}.$$
This quantity must be zero for all values of $t$ in order to fit the differential equation
(18.2). Ignoring the possibility $A = 0$, which gives us the so-called trivial
18.2
We will rework the theory. Look for solutions of the form $x = A\,e^{mt}$:
(Figure: solution curves $x = A\,e^{4t}$ plotted against $t$ for a range of values of $A$, including $A = 0$.)
Each value of A gives a different curve, and these solution curves fill the whole plane.
Also the curves do not cross, so there is one and only one curve through every point. This
corresponds to the fact that the slope dx/dt has one and only one value at every point,
namely the value prescribed by the differential equation dx/dt = 4x taken at the point.
This is all strong evidence that we have found all the solutions. More is said about the
graphical way of understanding differential equations in Chapters 22 and 23.
Example 18.4 Find all the solutions of $3\frac{dx}{dt} + 2x = 0$.
We could carry out the full calculation as in the previous example. However, if instead
we want to quote the formula, (18.4), we must first write the equation in the form
$$\frac{dx}{dt} + \tfrac{2}{3}x = 0.$$
Therefore $c = \tfrac{2}{3}$ (not 2), and the general solution is
$$x = A\,e^{-\frac{2}{3}t}, \quad\text{with } A \text{ any constant.}$$
It is worth while to memorize the formula (18.4).
In practical cases we do not usually need all the solutions, but only the one
which satisfies some further condition of the problem. Frequently the condition
supplied describes the condition prevailing at the start of the action, or at some
other time, as in the following.
Example 18.5 Find the solution of $\frac{dx}{dt} - 4x = 0$ for which $x = 2$ when $t = 1$.
Other ways of saying this are ‘find the solution curve which passes through the
point (1, 2)’, or ‘find a solution x(t) so that x(1) = 2’.
From Example 18.3, all the possible solutions are given by
$$x = A\,e^{4t}.$$
Since $x = 2$ when $t = 1$, we must have $2 = A\,e^4$. Therefore
$$A = 2e^{-4}$$
and the single solution picked out is
$$x = (2e^{-4})\,e^{4t} = 2e^{4(t-1)}.$$
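The same two steps (write down $x = A\,e^{-ct}$, then fix $A$ from the given condition) can be expressed as a short routine. This is a minimal sketch; the function name `solve_first_order_ivp` and its argument names are illustrative assumptions, and the equation is taken in the form $dx/dt + cx = 0$.

```python
import math


def solve_first_order_ivp(c, t0, x0):
    """Return the function x(t) satisfying dx/dt + c*x = 0 and x(t0) = x0."""
    A = x0 * math.exp(c * t0)  # from the condition x0 = A * exp(-c * t0)
    return lambda t: A * math.exp(-c * t)


# The problem dx/dt - 4x = 0 with x = 2 when t = 1 has c = -4.
x = solve_first_order_ivp(c=-4.0, t0=1.0, x0=2.0)
print(x(1.0))                            # the prescribed value at t = 1
print(x(1.5), 2 * math.exp(4 * 0.5))     # compare with 2 e^{4(t-1)}
```

Note the sign convention: an equation written as $dx/dt - 4x = 0$ must be read as $c = -4$ before the routine is used.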
Self-test 18.1
18.3
$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = m^2 e^{mt} + m\,e^{mt} - 2e^{mt} = e^{mt}(m^2 + m - 2).$$
which is called the characteristic equation. Being quadratic, it may have two real
solutions, exactly one real solution, or two complex solutions, depending on the
coefficients. Consider the real cases first:
$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = 0;$$
solutions $m_1$ and $m_2$ of $m^2 + bm + c = 0$ real and different.
Basis of solutions: $e^{m_1 t}$, $e^{m_2 t}$.
Example 18.7 Find the general solution of $2\frac{d^2x}{dt^2} - \frac{dx}{dt} - x = 0$.
To correspond with the standard form, (18.7), we should have to write the equation
in the form $d^2x/dt^2 - \tfrac{1}{2}\,dx/dt - \tfrac{1}{2}x = 0$, but there is no need to do this if we directly
test for solutions of the form $x = e^{mt}$. The characteristic equation then takes the form
$2m^2 - m - 1 = 0$, or $(2m + 1)(m - 1) = 0$, so that $m_1 = -\tfrac{1}{2}$, $m_2 = 1$. Therefore the basis
for the general solution is the solution pair $(e^{-\frac{1}{2}t}, e^t)$, and the general solution is
$$x(t) = A\,e^{-\frac{1}{2}t} + B\,e^t, \quad A \text{ and } B \text{ arbitrary.}$$
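Finding and classifying the roots of a characteristic equation is a mechanical step, sketched below. The function names `characteristic_roots` and `classify` are illustrative assumptions; the quadratic is taken as $am^2 + bm + c = 0$ with $a \neq 0$, so that an equation such as $2x'' - x' - x = 0$ can be entered without first dividing through.

```python
import cmath


def characteristic_roots(a, b, c):
    """Roots of a*m**2 + b*m + c = 0 (a != 0), returned as complex numbers."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)


def classify(a, b, c):
    """Describe the roots according to the sign of the discriminant."""
    d = b * b - 4 * a * c
    if d > 0:
        return "real and different"
    if d == 0:
        return "coincident"
    return "complex"


m1, m2 = characteristic_roots(2, -1, -1)
print(m1, m2, classify(2, -1, -1))
print(classify(1, 4, 4), classify(1, -4, 13))
```

With exact integer coefficients the test `d == 0` is safe; for coefficients held as floating-point numbers a small tolerance would be needed instead.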
We might therefore think there will be no end to it: if $t\,e^{m_0 t}$ is a solution, then
why not $t^2 e^{m_0 t}$, or some function of great complication? However, it can be proved
that every second-order linear differential equation has exactly two linearly
independent solutions (i.e. they are not just constant multiples of each other);
also that these form a basis of solutions: we do not need any others to construct
the most general solution. Formally:
Example 18.8 Find the general solution of $\frac{d^2x}{dt^2} + 4\frac{dx}{dt} + 4x = 0$.
The characteristic equation, formed by substituting $x(t) = e^{mt}$, is
The second solution always takes a form similar to (i) in Example 18.8:
constants). (18.10)
An alternative way to justify the second solution $t\,e^{-2t}$ in Example 18.8 is to try
$x = f(t)\,e^{-2t}$. Then it can be shown that
$$\frac{d^2x}{dt^2} + 4\frac{dx}{dt} + 4x = f''(t)\,e^{-2t},$$
which is zero for all $t$ if $f''(t) = 0$. Hence $f(t) = A + Bt$, so that $x(t) = (A + Bt)\,e^{-2t}$.
Self-test 18.2
are genuine solutions of the differential equation. They are complex functions,
so we call (18.11) a complex basis for solutions of the differential equation. If
we are interested in complex as well as real solutions, then we can allow the arbitrary
constants $A$ and $B$ to be complex as well, in an all-inclusive general complex
solution
$$x(t) = A\,e^{(\alpha + i\beta)t} + B\,e^{(\alpha - i\beta)t}.$$
Suppose, however, that we want the general solution to consist only of real
functions. Then a basis for real solutions can be got from (18.11) in the following
way. By (6.8)
$$e^{(\alpha + i\beta)t} = e^{\alpha t}e^{i\beta t} = e^{\alpha t}\cos\beta t + i\,e^{\alpha t}\sin\beta t.$$
This function solves the differential equation, so its real and imaginary parts
separately must also solve it. Therefore
where $A$ and $B$ are arbitrary (but real, of course). The second complex solution,
$e^{(\alpha - i\beta)t}$, has the basis $(e^{\alpha t}\cos\beta t, -e^{\alpha t}\sin\beta t)$, which leads to the same family of real
solutions, so we get nothing new by considering it.
Equation (18.12) can be written in a different form. Using the identity (1.18),
we have
$$A\cos\beta t + B\sin\beta t = C\cos(\beta t + \phi),$$
where $C$ and $\phi$ are constants related to $A$ and $B$. Therefore (18.12) can be written
$$x(t) = C\,e^{\alpha t}\cos(\beta t + \phi).$$
Since $A$ and $B$ are arbitrary, so are $C$ and $\phi$.
18.4
$$\frac{d^2x}{dt^2} + 4x = 0.$$
lowing result:
Fig. 18.3 Graph of $x(t) = 4e^{-0.2t}\cos(2t - 1)$, plotted against $t$ together with the curve $x = 4e^{-0.2t}$.
The damped unforced linear oscillator is the simplest linear model of an oscillating
mechanical or electrical system which has a small amount of friction or some
other form of energy-loss mechanism (see Chapter 20 for a full discussion). In a
customary notation the equation is
$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega^2 x = 0.$$
The term $2k\,dx/dt$ expresses the energy-absorbing property. Assume
$$k^2 < \omega^2.$$
The characteristic equation is $m^2 + 2km + \omega^2 = 0$, so that
$$m = -k \pm (k^2 - \omega^2)^{1/2} = -k \pm i(\omega^2 - k^2)^{1/2},$$
18.5
$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega^2 x = 0 \quad\text{where } k^2 < \omega^2$$
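With roots $m = -k \pm i(\omega^2 - k^2)^{1/2}$, the real solutions take the form $x(t) = C\,e^{-kt}\cos(\beta t + \phi)$ with $\beta = (\omega^2 - k^2)^{1/2}$. The sketch below checks this by substituting into the equation, using finite differences for the derivatives. The function names `damped_solution` and `residual`, the step size and the sample parameter values are all illustrative assumptions.

```python
import math


def damped_solution(k, omega, C=1.0, phi=0.0):
    """x(t) = C e^{-kt} cos(beta t + phi), beta = sqrt(omega^2 - k^2); needs k^2 < omega^2."""
    beta = math.sqrt(omega**2 - k**2)
    return lambda t: C * math.exp(-k * t) * math.cos(beta * t + phi)


def residual(x, k, omega, t, h=1e-4):
    """Finite-difference estimate of x'' + 2k x' + omega^2 x at time t."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / h**2
    return d2 + 2 * k * d1 + omega**2 * x(t)


x = damped_solution(k=0.2, omega=2.0, C=4.0, phi=-1.0)
for t in (0.0, 1.0, 5.0):
    print(t, residual(x, 0.2, 2.0, t))
```

The printed residuals are not exactly zero because the derivatives are approximated, but they should be very small compared with the size of $x$ itself.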
Self-test 18.3
Find the general solution of
$$\frac{d^2x}{dt^2} - 4\frac{dx}{dt} + 13x = 0.$$
Fig. 18.4 (a) An infinite number of curves pass through each point. (b) Selection of a solution,
given P and the slope at P.
To pick out a particular solution, we need to determine the two arbitrary constants.
Two pieces of information are necessary. These may consist of two initial
conditions, conditions which define the state of the system at some starting time $t_0$:
the values of $x(t)$ and the slope $dx/dt$ at $t = t_0$ are given (see Fig. 18.4b). For example,
the equation $d^2x/dt^2 + \omega_0^2 x = 0$ describes the oscillations of a particle on a spring;
the initial conditions tell us its position and velocity (i.e. its state) when it starts
off. We then have an initial-value problem:
Initial-value problem
(i) Equation: $\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = 0$.
(ii) Initial conditions:
$$x = x_0 \quad\text{and}\quad \frac{dx}{dt} = x_1 \quad\text{at } t = t_0,$$
which may be expressed alternatively as
$$x(t_0) = x_0, \quad x'(t_0) = x_1,$$
where $x_0$ and $x_1$ are given. (18.16)
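When the characteristic equation has real and different roots $m_1$, $m_2$, the two initial conditions give two linear equations for the arbitrary constants. The following is a minimal sketch restricted to that case; the function name `solve_ivp_real_distinct` is an illustrative assumption, and the constants are attached to $e^{m(t - t_0)}$ rather than $e^{mt}$ purely to keep the algebra short.

```python
import math


def solve_ivp_real_distinct(b, c, t0, x0, x1):
    """Solve x'' + b x' + c x = 0 with x(t0) = x0, x'(t0) = x1 (real, different roots only)."""
    disc = b * b - 4 * c
    if disc <= 0:
        raise ValueError("this sketch handles only real and different roots")
    m1 = (-b + math.sqrt(disc)) / 2
    m2 = (-b - math.sqrt(disc)) / 2
    # Writing x = p e^{m1 (t - t0)} + q e^{m2 (t - t0)}, the conditions read
    #   p + q = x0   and   m1 p + m2 q = x1.
    p = (x1 - m2 * x0) / (m1 - m2)
    q = (m1 * x0 - x1) / (m1 - m2)
    return lambda t: p * math.exp(m1 * (t - t0)) + q * math.exp(m2 * (t - t0))


# Illustration: x'' + x' - 2x = 0 with x(0) = 0, x'(0) = 2.
x = solve_ivp_real_distinct(b=1.0, c=-2.0, t0=0.0, x0=0.0, x1=2.0)
print(x(0.0), x(1.0))
```

For this illustration the roots are 1 and $-2$, and the routine reproduces $x(t) = \tfrac{2}{3}(e^t - e^{-2t})$.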
m3 + am2 + bm + c = 0.
If this equation has three distinct solutions m1, m2, m3, then
$$x = A\,e^{m_1 t} + B\,e^{m_2 t} + C\,e^{m_3 t}.$$
PROBLEMS
(For the ‘dash’ notation x′(t) = dx /dt etc., see (4.1).) 18.6 (Götterdämmerung). Once upon a time,
rabbits in Elysium reached maturity instantly and
18.1 Say which of the following equations are bred with a birthrate of 20 rabbits per year per
linear, unforced, with constant coefficients couple. No rabbit ever died. At the start of the
(i.e. can be rearranged to conform with (18.1a)). experiment Zeus released 50 male and 50 female
(a) x′ = 3t; (b) x′ = 12 x; (c) x′ + tx = 0; rabbits.
(d) 3x′ − 2x2 = 0; (e) x′ − x = 0; (f) x′ = 0; By treating the number of rabbits as a
x′ dy 1 1 dy continuously varying quantity and considering
(g) 2 = 3; (h) + 2 y = 1; (i) = 2;
x dx y dx the number born in a short time δt, construct a
differential equation and then an initial-value
dI v′ + v + v2
( j) L + RI = 0; (k) = 1. problem for R(t), the rabbit population. Find
dt v′ − v + v2 how many rabbits there were at the end of
Year 4.
18.2 Write down all the solutions of the
Appalled by this result and assisted by Pluto,
following equations. Check one or two of them
Zeus launched another similar experiment, in
by substitution into the differential equation.
which any rabbit was allowed to live for one year
(a) x′ + 5x = 0; (b) x′ − 12 x = 0;
only. Construct the differential equation for the
(c) x′ − x = 0; (d) x′ + 3x = 0;
population. Did this alleviate the situation
(e) 3x′ + 4x = 0; (f) x′ = 2x; (g) x′ = 3x;
appreciably?
(h) x′/x = −3; (i) (x′ + 1)/(x + 1) = 1.
18.7 Obtain all solutions of the following
18.3 Solve the following initial-value problems.
equations. (The characteristic equations all
(a) x′ + 2x = 0, x = 3 when t = 0;
have real roots, not necessarily distinct.)
(b) 3x′ − x = 0, x = 1 when t = 1;
(a) x″ − 3x′ + 2x = 0; (b) x″ + x′ − 2x = 0;
(c) y′ − 2y = 0, y = 2 when x = − 3;
(c) x″ − x = 0; (d) x″ − 4x = 0;
(d) x′ + x = 0, x(−1) = 10;
(e) 3x″ − 14 x = 0; (f) x″ − 9x = 0;
(e) 2y′ − 3y = 0, y(0) = 1;
(g) x″ + 2x′ − x = 0; (h) x″ − 2x′ − 2x = 0;
(f) Find the curve whose slope at any point (x, y)
(i) 2x″ + 2x′ − x = 0; (j) 3x″ − x′ − 2x = 0;
is equal to 5y, and which passes through the
(k) x″ + 4x′ + 4x = 0; (l) x″ + 6x′ + 9x = 0;
point (1, −2).
(m) 4x″ + 4x′ + x = 0; (n) x″ = 0.
18.4 Suppose that the generator in Fig. 18.1 is
short-circuited and cut out at a moment when the 18.8 Verify that, when the characteristic equation
current in the circuit is I0. Find an expression for corresponding to x″ + bx′ + cx = 0 has coincident
the current subsequently. Show that the ratio L/R roots, m1 = m2 = m0, say, then the function x(t) = t
provides a measure of the time it takes for the em0t provides a second solution for the basis of the
current to die away. general solution. (For coincident roots, b2 = 4c.)
18.5 A radioactive element disintegrates at 18.9 Solve the following initial-value problems.
a rate proportional to the amount of the (a) x″ − 4x = 0, x(0) = 1, x′(0) = 0;
original element still remaining. Show that if (b) x″ + x′ − 2x = 0, x(0) = 0, x′(0) = 2;
A(t) represents the activity of the element at (c) y″ − 4y′ + 4y = 0, y(0) = 0, y′(0) = −1;
time t, then (d) y″ + 2y′ + y = 0, y(1) = 0, y′(1) = 1;
(e) x″ − 9x = 0, x(1) = 1, x′(1) = 1;
dA x″ − 4x′ = 0, x(1) = 1, x′(1) = 0.
+ kA = 0, (f)
dt
where k is a positive constant. 18.10 Obtain all solutions of the following
(a) Solve the initial-value problem for A if equations. (The roots of the characteristic
A = A0 (given) at time t = 0. equations are complex.)
(b) The time taken for the activity to drop to (a) x″ + x = 0; (b) x″ + 9x = 0;
half of the starting value is called the half-life (c) x″ + 14 x = 0; (d) x″ + ω 20 x = 0;
period. For uranium-232, it is found that 17.5% (e) x″ + 2x′ + 2x = 0; (f) y″ − 2y′ + 2y = 0;
has decayed after 20 years. Show that its half-life (g) y″ + y′ + y = 0; (h) 2x″ + 2x′ + x = 0;
period is about 72 years. (i) 3x″ + 4x′ + 2x = 0; (j) 3x″ − 4x′ + 2x = 0.
18.11 Solve the following initial-value 18.15 Consider the third-order differential
problems. equation
The previous chapter treated differential equations that are linear, have constant
coefficients, and are homogeneous (meaning unforced, or having a zero right-hand
side). It is now shown how such equations are involved in obtaining the
general solution of linear equations which have a non-zero forcing term when this
term is a linear combination of polynomials, exponentials $e^{ax}$, and trigonometric
functions $\sin bx$ and $\cos bx$. Additionally, more general results about linear
equations are stated. Section 19.5 describes the integrating factor procedure for
first-order linear equations having non-constant coefficients as well as a non-zero
forcing term; this method therefore has very general application.
$$\frac{dx}{dt} + cx = f(t), \qquad \frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t).$$
The function f(t) is called the forcing term; it represents physically the external
input to the physical system that the equation describes, and the system will
respond with an output x(t) which depends on the input f(t).
If f(t) is an exponential function K eα t, a sine or cosine function K sin β t or
K cos β t, or a polynomial, then we can find an individual particular solution
by trial.
$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 3e^{2t}.$$
Try for a solution containing the same exponential as that on the right-hand side of the
equation:
$$x(t) = p\,e^{2t},$$
where $p$ is some constant, not an arbitrary constant, but one whose value we shall
settle by substitution: only one value will do. Then
$$\frac{dx}{dt} = 2p\,e^{2t}, \quad\text{and}\quad \frac{d^2x}{dt^2} = 4p\,e^{2t},$$
so that
$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 4p\,e^{2t} + 2p\,e^{2t} - 2p\,e^{2t} = e^{2t}(4 + 2 - 2)p = 4p\,e^{2t}.$$
This must equal the given right-hand side, $3e^{2t}$, for all values of $t$, which is only possible
if $4p = 3$, or
$$p = \tfrac{3}{4}.$$
Example 19.2 Find a particular solution of $\frac{d^2x}{dt^2} + 4x = 2\cos 3t$.
Guess that there might be a solution of the form
$$x = p\cos 3t.$$
Then
$$\frac{dx}{dt} = -3p\sin 3t, \quad\text{and}\quad \frac{d^2x}{dt^2} = -9p\cos 3t,$$
so that
$$\frac{d^2x}{dt^2} + 4x = -9p\cos 3t + 4p\cos 3t = -5p\cos 3t.$$
This must be the same as the right-hand side of the equation in order for the guessed
function to be a solution, so
$$-5p\cos 3t = 2\cos 3t.$$
Therefore $p = -\tfrac{2}{5}$, and the required solution is
$$x(t) = -\tfrac{2}{5}\cos 3t.$$
In most cases when the right-hand side is a sine or cosine it will not work so
simply, as is illustrated in the following example.
$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 2\cos 3t.$$
Example 19.4 Find a particular solution of $\frac{d^2x}{dt^2} - 2\frac{dx}{dt} + 4x = 3$.
Test whether there is a constant solution
$$x(t) = p \quad\text{(a constant).}$$
By substituting this in the differential equation, we get
$$0 + 0 + 4p = 3,$$
so that $p = \tfrac{3}{4}$, and the particular solution is just $x(t) = \tfrac{3}{4}$, which is obvious after it has
been worked out.
FORCED LINEAR DIFFERENTIAL EQUATIONS
Example 19.5 Find a solution of $\frac{d^2x}{dt^2} - \frac{dx}{dt} + x = 3 + 2t^2$.
Try a solution of the form $x(t) = p + qt + rt^2$, where $p$, $q$, and $r$ are constants. It is
normally necessary to try a polynomial of the same degree as the forcing term, which
in this case has degree 2, and to include all the lower-degree terms in the trial solution.
Since
$$\frac{dx}{dt} = q + 2rt, \quad\text{and}\quad \frac{d^2x}{dt^2} = 2r,$$
we must have
$$2r - (q + 2rt) + (p + qt + rt^2) = 3 + 2t^2.$$
Match up the coefficients of the three powers of $t$; we find that
$$r = 2, \quad -2r + q = 0, \quad 2r - q + p = 3.$$
The equations are easy to solve and lead to the solution
$$x(t) = 3 + 4t + 2t^2.$$
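Matching coefficients for a quadratic forcing term always leads to the same triangular set of equations, so it can be written once and reused. The sketch below assumes the equation has the form $x'' + bx' + cx = k_0 + k_1 t + k_2 t^2$ with $c \neq 0$; the function name `quadratic_trial` and the argument names are illustrative.

```python
def quadratic_trial(b, c, k0, k1, k2):
    """Coefficients (p, q, r) of x = p + q t + r t^2 for x'' + b x' + c x = k0 + k1 t + k2 t^2.

    Assumes c != 0.  Matching powers of t gives
        t^2:  c r              = k2
        t^1:  2 b r + c q      = k1
        t^0:  2 r + b q + c p  = k0
    """
    r = k2 / c
    q = (k1 - 2 * b * r) / c
    p = (k0 - 2 * r - b * q) / c
    return p, q, r


# x'' - x' + x = 3 + 2 t^2
print(quadratic_trial(b=-1, c=1, k0=3, k1=0, k2=2))
# x'' - 2 x' + 4 x = 3 (a constant forcing term)
print(quadratic_trial(b=-2, c=4, k0=3, k1=0, k2=0))
```

The case $c = 0$ is excluded because a polynomial of the same degree as the forcing term is then no longer sufficient.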
These methods apply equally to first-order (and higher order) equations having
constant coefficients. Summarizing for first- and second-orders:
Particular solutions of
$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t) \quad\text{and}\quad \frac{dx}{dt} + cx = f(t)$$
(a) $f(t) = K e^{\alpha t}$: try a solution $x(t) = p\,e^{\alpha t}$.
(b) $f(t) = K\cos\beta t$ or $K\sin\beta t$: try a solution
$x(t) = p\cos\beta t + q\sin\beta t$.
(c) $f(t)$ is a polynomial of degree $N$: try a polynomial of the same degree,
with all its terms present. (19.1)
There are exceptional cases where these substitutions have to be modified. For
example, $d^2x/dt^2 = t$ has a polynomial solution of degree 3, not degree 1. These
cases are treated in Section 19.3.
If the forcing term on the right-hand side consists of the sum of several constituent
terms, then obtain a particular solution for each one, and add them, as in
the following example.
19.2
For $x_1(t)$, try for a constant solution $x_1(t) = p$: it is found that $p = \tfrac{1}{4}$, so $x_1(t) = \tfrac{1}{4}$.
For $x_2(t)$, try $x_2(t) = q\,e^{-t}$ (following the method of Example 19.1). The substitution
gives $q\,e^{-t} + 4q\,e^{-t} = e^{-t}$, so that $q = \tfrac{1}{5}$, and the solution is $x_2(t) = \tfrac{1}{5}e^{-t}$.
The method just described is another consequence of the linearity of the class
of equations considered. It is also called the superposition principle, and applies
to linear equations of all orders.
Example 19.7 Obtain a particular solution of $\frac{dx}{dt} + x = 3\cos 2t$.
Remembering Example 19.3, we expect the solution will have to contain both cosine
and sine terms, so try
$$x(t) = p\cos 2t + q\sin 2t.$$
The substitution gives $(p + 2q)\cos 2t + (-2p + q)\sin 2t = 3\cos 2t$, so that $p = \tfrac{3}{5}$, $q = \tfrac{6}{5}$,
and the solution is
$$x(t) = \tfrac{3}{5}\cos 2t + \tfrac{6}{5}\sin 2t.$$
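For a first-order equation $dx/dt + cx = K\cos\beta t$ the same substitution gives $cp + \beta q = K$ and $cq - \beta p = 0$, which can be solved once and for all. This is a minimal sketch; the function name `cosine_trial` is an illustrative assumption, and $c$ and $\beta$ are assumed not both zero.

```python
def cosine_trial(c, K, beta):
    """Return (p, q) so that x = p cos(beta t) + q sin(beta t) solves dx/dt + c x = K cos(beta t).

    Substituting the trial form and matching cosine and sine terms gives
        c p + beta q = K,    c q - beta p = 0.
    """
    denom = c * c + beta * beta
    return K * c / denom, K * beta / denom


p, q = cosine_trial(c=1.0, K=3.0, beta=2.0)
print(p, q)
```

For $c = 1$, $K = 3$, $\beta = 2$ this returns $p = 3/5$ and $q = 6/5$ (up to rounding).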
Self-test 19.1
Find particular solutions of
$$\frac{d^2x}{dt^2} - \frac{dx}{dt} + 3x = f(t)$$
in the cases (a) $f(t) = 2e^{-2t}$; (b) $f(t) = \sin 2t$; (c) $f(t) = 3t^3$.
differential equation $\dfrac{d^2X}{dt^2} + \dfrac{dX}{dt} + X = 3e^{2it}$.
Look for a solution of the form $X(t) = P e^{2it}$. To find $P$, substitute this expression into the
left-hand side of the differential equation:
$$(2i)^2 P e^{2it} + (2i)P e^{2it} + P e^{2it} = P(-4 + 2i + 1)e^{2it} = P(-3 + 2i)e^{2it}.$$
This must be the same as the right-hand side of the equation, $3e^{2it}$, for all values of $t$.
Therefore, $P(-3 + 2i) = 3$, so
$$P = \frac{3}{-3 + 2i} = \frac{3(-3 - 2i)}{(-3)^2 + 2^2} = -\tfrac{3}{13}(3 + 2i).$$
Therefore $X(t) = -\tfrac{3}{13}(3 + 2i)e^{2it}$ is a particular solution. When expanded, it becomes
$$X(t) = -\tfrac{3}{13}(3 + 2i)(\cos 2t + i\sin 2t) = -\tfrac{3}{13}(3\cos 2t - 2\sin 2t) + i\left[-\tfrac{3}{13}(2\cos 2t + 3\sin 2t)\right].$$
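The complex arithmetic above is easy to check numerically. The following is a minimal sketch using Python's built-in complex numbers; the function names `X` and `x_real` are chosen here for illustration only.

```python
import cmath
import math

P = 3 / (-3 + 2j)                      # solves P(-3 + 2i) = 3

def X(t):
    """Complex particular solution X(t) = P e^{2it}."""
    return P * cmath.exp(2j * t)

def x_real(t):
    """Real part of X(t) as expanded in the text."""
    return -(3 / 13) * (3 * math.cos(2 * t) - 2 * math.sin(2 * t))

t = 0.7                                # arbitrary sample time
print(P)                               # compare with -(3/13)(3 + 2i)
print(X(t).real, x_real(t))            # the two values should agree
```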
$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = a\cos\beta t, \qquad (19.3)$$
where $b$, $c$, $a$, $\beta$ are all real. We know that
$$\cos\beta t = \mathrm{Re}\,e^{i\beta t}$$
(see (6.8)). Therefore, if we can find a particular solution X(t) of the complex
equation (19.2), its real part will solve the corresponding real equation (19.3).
For x(t), we require only the real part of this expression:
x(t) = Re X(t)
In the case when the right-hand side of the equation has the form a sin ω t, the
calculation is the same, but the imaginary part of the complex solution must be
extracted instead of the real part. The following example demonstrates also how
right-hand sides of the form
$$a e^{\alpha t}\cos\beta t, \qquad a e^{\alpha t}\sin\beta t$$
can be handled in the same way.
Example 19.10 Find a solution of $\dfrac{d^2x}{dt^2} + x = e^{-2t}\sin 3t$.
Use the fact that
$$e^{-2t}\sin 3t = \mathrm{Im}(e^{-2t}e^{3it}) = \mathrm{Im}\,e^{(-2+3i)t}.$$
Therefore, consider the corresponding complex equation
$$\frac{d^2X}{dt^2} + X = e^{(-2+3i)t}.$$
To find a solution, try the form
$$X(t) = P e^{(-2+3i)t}.$$
We find in the usual way that
$$(-2 + 3i)^2 P e^{(-2+3i)t} + P e^{(-2+3i)t} = e^{(-2+3i)t}$$
for all values of $t$. Therefore
$$P = \frac{1}{(-2 + 3i)^2 + 1} = \frac{-1}{4(1 + 3i)} = -\tfrac{1}{40}(1 - 3i)$$
and
$$X(t) = -\tfrac{1}{40}(1 - 3i)e^{(-2+3i)t}.$$
If we take the imaginary part of $X(t)$, we obtain a solution of the original equation:
$$x(t) = \mathrm{Im}\left[-\tfrac{1}{40}(1 - 3i)(e^{-2t}e^{3it})\right] = -\tfrac{1}{40}e^{-2t}(-3\cos 3t + \sin 3t).$$
The same result could be obtained by substituting
x(t) = p e−2t cos 3t + q e−2t sin 3t,
but this would be a very laborious and error-prone process.
The method is particularly advantageous when the coefficients are general
constants. The following equation will be important in Chapter 20.
FORCED LINEAR DIFFERENTIAL EQUATIONS
and
$$\phi = \arg[(\omega_0^2 - \omega^2) - i(2k\omega)],$$
where $\phi$ is the polar angle of the point $((\omega_0^2 - \omega^2), -2k\omega)$ on an Argand diagram.
Particular solution of $\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = f(t)$
(a) $f(t) = a\cos\beta t$ or $a\sin\beta t$.
Put $X(t) = P e^{i\beta t}$ to solve
$X'' + bX' + cX = a e^{i\beta t}$.
Then $x(t) = \mathrm{Re}\,X(t)$ or $\mathrm{Im}\,X(t)$, corresponding to $\cos\beta t$ or $\sin\beta t$ respectively.
(b) $f(t) = a e^{\alpha t}\cos\beta t$ or $a e^{\alpha t}\sin\beta t$.
Solve $X'' + bX' + cX = a e^{(\alpha + i\beta)t}$, and continue as in (a). (19.5)
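Rule (19.5) translates directly into a short routine. The sketch below is one possible way to organize it; the names `complex_amplitude` and `particular` are hypothetical, and the exceptional case in which $\alpha + i\beta$ is a root of the characteristic equation is deliberately left unhandled (Python raises `ZeroDivisionError` there).

```python
import cmath
import math

def complex_amplitude(b, c, a, alpha, beta):
    """P such that X = P e^{(alpha + i beta)t} solves X'' + bX' + cX = a e^{(alpha + i beta)t}."""
    s = complex(alpha, beta)
    return a / (s * s + b * s + c)     # fails if s is a characteristic root

def particular(b, c, a, alpha, beta, kind="cos"):
    """Return x(t) = Re X(t) for a cosine forcing term, Im X(t) for a sine one."""
    P = complex_amplitude(b, c, a, alpha, beta)
    s = complex(alpha, beta)

    def x(t):
        X = P * cmath.exp(s * t)
        return X.real if kind == "cos" else X.imag

    return x

# Example 19.10: x'' + x = e^{-2t} sin 3t
P = complex_amplitude(b=0, c=1, a=1, alpha=-2, beta=3)
x = particular(b=0, c=1, a=1, alpha=-2, beta=3, kind="sin")
print(P)          # compare with -(1/40)(1 - 3i)
print(x(0.4))     # value of the particular solution at t = 0.4
```

Setting `alpha=0` recovers case (a) of the rule.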
Self-test 19.2
Using a complex method, find particular solutions of
$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - x = f(t)$$
in the cases (a) $f(t) = e^{-t}\cos t$; (b) $f(t) = e^{-t}\sin t$.
19.3
There are exceptional cases for each of the three rules (19.1), when the suggested
substitution does not give any result because the trial function delivers zero when
Example 19.12 Find a particular solution of $\dfrac{d^2x}{dt^2} + 9x = 5\sin 3t$.
Here, $x = p\cos 3t$ and $x = q\sin 3t$ both give $d^2x/dt^2 + 9x = 0$, so the standard solution
form does not work. From (19.6), with $\beta = 3$ and $a = 5$, the required solution is
$$x(t) = -\frac{5}{2\times 3}\,t\cos 3t = -\tfrac56\,t\cos 3t.$$
This solution is sketched in Fig. 19.1. Unlike the ordinary sine- and cosine-type solutions,
it grows indefinitely. Such solutions have an important physical significance described
in Chapter 20.
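As a check on Example 19.12, the solution can be substituted back into the equation. The working below is a verification added for convenience, using only the product rule; dots denote derivatives with respect to $t$.

```latex
\begin{aligned}
x &= -\tfrac56\,t\cos 3t,\\
\dot x &= -\tfrac56\cos 3t + \tfrac52\,t\sin 3t,\\
\ddot x &= \tfrac52\sin 3t + \tfrac52\sin 3t + \tfrac{15}{2}\,t\cos 3t
         = 5\sin 3t + \tfrac{15}{2}\,t\cos 3t,\\
\ddot x + 9x &= 5\sin 3t + \tfrac{15}{2}\,t\cos 3t - \tfrac{15}{2}\,t\cos 3t = 5\sin 3t.
\end{aligned}
```

The terms in $t\cos 3t$ cancel, leaving exactly the forcing term.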
There are other exceptional cases that are not so frequently encountered; some
examples are given among the problems at the end of the chapter.
Fig. 19.1 The solution $x = -\tfrac56\,t\cos 3t$ plotted against $t$, together with the lines $x = \pm\tfrac56\,t$.
Self-test 19.3
Solve the characteristic equation of
$$\frac{d^2x}{dt^2} - 2\frac{dx}{dt} - 3x = 0.$$
Explain why a particular solution of
$$\frac{d^2x}{dt^2} - 2\frac{dx}{dt} - 3x = 2e^{-t}$$
is a special case. Find a particular solution.
From earlier experience, we should expect other solutions. In order to find some,
consider what happens when we substitute various functions x(t) in the expression
$$\frac{d^2}{dt^2}x(t) - x(t). \qquad (19.9)$$
For example, when we put $x(t) = \cos t$ (the ‘particular solution’ mentioned above),
we obtain
$$\frac{d^2}{dt^2}\cos t - \cos t = -2\cos t,$$
as demanded by (19.7).
Suppose now that we can find another function, x(t) = xc(t) say, which produces
Example 19.13 Find the general solution of $\dfrac{d^2x}{dt^2} + 4x = 3\cos 5t$.
Particular solution xp(t). Looking forward into the calculation, it can be seen that the
solution needs no sin 5t term. Therefore try
xp(t) = p cos 5t,
where p is a constant. Then substitution into the equation requires
$$p(-25\cos 5t) + 4p\cos 5t = 3\cos 5t$$
for all $t$, so $p = -\tfrac17$. Therefore
$$x_p(t) = -\tfrac17\cos 5t.$$
Complementary functions xc(t). We require the solutions xc(t) of the corresponding
unforced equation $d^2x_c/dt^2 + 4x_c = 0$. Try for solutions of the form $x_c(t) = p e^{mt}$. The
substitution produces the characteristic equation $m^2 + 4 = 0$. Therefore $m = \pm 2i$, so
a pair of solutions
$$(e^{2it}, e^{-2it})$$
constitutes a complex basis. To get a real basis, choose either one, say $e^{2it}$, and find its
Corresponding to the other term, we need a solution, $x_{p2}(t)$ say, of $d^2x_{p2}/dt^2 + 4x_{p2} =
2\cos 2t$. We should normally expect a solution of the form $p\cos 2t + q\sin 2t$. However,
looking at the complementary functions we found, this function is already a
Self-test 19.4
Obtain the general solution of
$$\frac{d^2x}{dt^2} - 2\frac{dx}{dt} - 3x = 2e^{-t}$$
(see Self-test 19.3), which satisfies $x = 0$, $\dfrac{dx}{dt} = 1$ at $t = 0$.
$$\frac{d}{dt}(\text{something}),$$
then the equation would be easy to solve. This cannot be done, but we can instead
do the next best thing. This is to obtain a certain function I(t), called an integrating
factor, such that
$$I(t)\left(\frac{dx}{dt} + g(t)x\right) = \frac{d}{dt}[I(t)x] \qquad (19.14)$$
identically (i.e. for every function x(t) and for all values of t).
The following example shows the meaning of this idea and the way it is used.
Example 19.15 (a) Show that $I(t) = e^t$ is an integrating factor for the expression
$dx/dt + x$. (b) Use it to find the general solution of the equation $dx/dt + x = e^{2t}$.
(a) We shall confirm (19.14), that
$$e^t\left(\frac{dx}{dt} + x\right) = \frac{d}{dt}(e^t x). \qquad \text{(i)}$$
Work from the right-hand side of (i). Differentiate the product $e^t x$:
$$\frac{d}{dt}(e^t x) = e^t\frac{dx}{dt} + e^t x = e^t\left(\frac{dx}{dt} + x\right),$$
which is the same as the left-hand side of (i), so $e^t$ is an integrating factor.
(b) Multiply both sides of the differential equation by $e^t$:
$$e^t\left(\frac{dx}{dt} + x\right) = e^t e^{2t} = e^{3t}.$$
Because of the result in (a), we can write this as
$$\frac{d}{dt}(e^t x) = e^{3t}.$$
Therefore
$$e^t x = \int e^{3t}\,dt = \tfrac13 e^{3t} + A \quad (A \text{ arbitrary}),$$
or
$$x = \tfrac13 e^{2t} + A e^{-t}.$$
$$I(t)\left(\frac{dx}{dt} + g(t)x\right) = \frac{d}{dt}[I(t)x].$$
$$\frac{1}{I(t)}\frac{dI(t)}{dt} = g(t) \quad\text{or}\quad \frac{d}{dt}\ln I(t) = g(t).$$
$\ln I(t) = \int g(t)\,dt$, so $I(t) = e^{\int g(t)\,dt}$.
(In the case of Example 19.15, we had $g(t) = 1$, and the present formula gives
$I(t) = e^{\int dt} = e^{t+C}$; the choice $C = 0$ gives the integrating factor suggested – any other
choice would do.)
then $I(t)x(t) = \int I(t)f(t)\,dt + C$, giving $x(t)$. (19.16)
Example 19.16 Find the general solution of $\dfrac{dx}{dt} - \dfrac{1}{t}x = t^3$.
Firstly, consider the range $t > 0$. Here $g(t) = -1/t$. Then $\int g(t)\,dt = C - \ln t$, so that
$$I(t) = e^{-\ln t} = t^{-1},$$
where we have chosen $C = 0$ for convenience. Multiply both sides by $I(t) = t^{-1}$:
$$t^{-1}\left(\frac{dx}{dt} - \frac{1}{t}x\right) = t^{-1}t^3 = t^2.$$
By (19.15), this can be written
$$\frac{d}{dt}(t^{-1}x) = t^2.$$
Therefore
$$t^{-1}x = \int t^2\,dt = \tfrac13 t^3 + C,$$
so that
$$x(t) = \tfrac13 t^4 + Ct.$$
The solution obviously falls into the shape
particular solution + complementary function,
and is clearly the general solution for all t. We should have gained nothing by considering
negative t, or by adding an arbitrary constant, when working out ∫ (1/t) dt: we only need
any integrating factor, not all possible ones.
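The claim that $x(t) = \tfrac13 t^4 + Ct$ satisfies the equation for all $t \neq 0$, negative values included, can be spot-checked numerically. The sketch below is an illustration only: the derivative is approximated by a central difference, so the residual is small but not exactly zero, and the step size `h` is an arbitrary choice.

```python
def x(t, C):
    """General solution found in Example 19.16."""
    return t**4 / 3 + C * t

def residual(t, C, h=1e-5):
    """dx/dt - x/t - t^3, with dx/dt estimated by a central difference."""
    dxdt = (x(t + h, C) - x(t - h, C)) / (2 * h)
    return dxdt - x(t, C) / t - t**3

# Sample several constants C and both signs of t.
for C in (-2.0, 0.0, 1.5):
    for t in (-1.3, 0.5, 2.0):
        print(C, t, residual(t, C))
```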
Notice particularly that, in the examples, we did not need to calculate or check
the truth of a statement like
$$t^{-1}\left(\frac{dx}{dt} - \frac{1}{t}x\right) = \frac{d}{dt}(t^{-1}x).$$
We already know that $t^{-1}$ is an integrating factor, and this is the very property that
an integrating factor is designed to possess.
Be prepared to recognize this type of equation in disguised form, or when
different letters are involved; for example,
$$\frac{dy}{dx} = \frac{x + y}{x + 1}$$
is the same as
$$\frac{dy}{dx} - \frac{1}{x + 1}y = \frac{x}{x + 1}.$$
Self-test 19.5
Find the integrating factor of
$$\frac{dx}{dt} - x\sin t = e^{-t-\cos t}\sin t,$$
and obtain the general solution of the equation.
Problems

19.1 Find a particular solution of each of the following equations by trial as in Section 19.1.
(a) $x' + x = 3e^{2t}$;
(b) $x' - 3x = t^3 + 1$;
(c) $2x' + 3x = t + 3e^t$;
(d) $x'' + x = 3e^{2t}$;
(e) $x'' - \tfrac14 x = 2e^t + 3e^{-t}$;
(f) $x'' - 2x' + x = 3$;
(g) $x'' + 4x' - x = 3t^2 - t$;
(h) $x'' - x = 2\cos t$;
(i) $2x'' + 3x = 2\sin 3t$;
(j) $2x'' + x' = \sin t - \cos t$;
(k) $x'' + 2x' + x = \cos 2t$;
(l) $\dfrac{d^2y}{dx^2} - y = 1 - 3e^{2x}$;
(m) $\dfrac{d^2y}{dx^2} - \dfrac{dy}{dx} + 2y = 3\sin 2x$.

19.2 Use the method of Section 19.2 to find a particular solution of the following.
(a) $x'' - x = 3\cos 2t$;
(b) $x'' + x = 2\sin 3t$;
(c) $x'' + 2x' + x = 3\sin t$;
(d) $x'' - x' - x = 3\cos t$;
(e) $2x'' + x' + 2x = 2\cos 2t$;
(f) $3x'' + 2x' + x = 2\sin 2t$;
(g) $x'' - 4x = e^{-t}\cos t$ (note: $e^{-t}\cos t = \mathrm{Re}\,e^{(-1+i)t}$);
(h) $x'' - 4x = 3e^t\sin 2t$ (note: $e^t\sin 2t = \mathrm{Im}\,e^{(1+2i)t}$);
(i) Show that a solution of $x'' + x' + 4x = 5\cos 3t$ is $(5/\sqrt{34})\cos(3t + \phi)$, where $\phi = \arctan\tfrac35$.

19.3 The following differential equations are examples of the exceptional cases treated in Section 19.3. Find a particular solution in each case.
(a) $x'' + x = 3\cos t$;
(b) $x'' + 4x = 3\sin 2t$;
(c) $x'' + 4x = 1 + 3\cos 2t$;
(d) $\dfrac{d^2y}{dx^2} + 9y = 2\sin 3x$;
(e) $\dfrac{d^2y}{dx^2} - 2\dfrac{dy}{dx} + 2y = e^x\cos x$.

19.4 The following are exceptional cases of types not described in Section 19.3. Find a particular solution for each.
(a) $x'' - x = e^t$; try a solution of the form $pte^t$.
(b) $x'' - 2x' + x = e^t$; try a solution of the form $pt^2e^t$. (In this case, both $e^t$ and $te^t$ are complementary functions, so the form in (a) will not work.)
(c) Consider the simple differential equation $d^2x/dt^2 = t$. A first try with the form $pt + q$ suggested by (19.1) does not lead to a result. Try polynomials of higher degree than 1.
(d) $\dfrac{d^2y}{dx^2} + \dfrac{dy}{dx} = x$. The absence of a term in $y$ causes the second-degree trial function $px^2 + qx + r$ to fail. Try a third-degree polynomial instead.
(e) $x'' - 2x' + 2x = e^t\cos t$. Try $te^t(p\cos t + q\sin t)$, or modify the complex-number approach of Section 19.2 to obtain a particular solution.
(f) First-order equations also have exceptional cases. Consider the equation $\dfrac{dy}{dx} - y = e^x$. (If you have read as far as Section 19.5, you can also handle it by using an integrating factor.)

19.5 Find the general solution of the following equations.
(a) $x'' + 9x = 3e^{2t}$;
(b) $x'' - 4x = 2e^{-t}$;
(c) $4x'' - x = 1 + 3\cos 2t$;
(d) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + 2y = 3$;
(e) $x'' - 2x' + 2x = 3\sin 2t$;
(f) $4x'' - 2x' - 2x = 3t^2$;
(g) $x'' + x' = 2 - 3e^{-t}\cos t$;
(h) $2x'' + x' - x = \tfrac12 t + 3e^{-t}$;
(i) $\dfrac{d^2y}{dx^2} + y = 1 + 2e^{3x} + x^2$;
(j) $\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + y = 3\cos 2x + \sin 2x$;
(k) $\dfrac{d^2y}{dx^2} + 4\dfrac{dy}{dx} + 5y = e^{-x}\sin x$.

19.6 Use an integrating factor (Section 19.5) to find the general solution of the following equations.
(a) $x' - 3x = 0$;
(b) $x' + 2x = 3$;
(c) $x' - 2tx = t$;
(d) $x' - t^{-1}x = t + te^{-t}$;
(e) $x' - t^{-1}x = t - 1$;
(f) $tx' - 2x + 3 = 0$;
(g) $\dfrac{dy}{dx} + \dfrac{1}{x + 1}y = \sin x$ (you will need to use integration by parts to perform the integration);
(h) $3\dfrac{dy}{dx} + \dfrac{1}{x}y = x$;
(i) $(x - 1)\dfrac{dy}{dx} - y = (x - 1)^2$;
(j) $x' - \dfrac{1}{t}x = \ln t$;
(k) $tx' - x = 1 + t$;
(l) $\dfrac{dy}{dx} = \dfrac{x + y}{x + 1}$;

is
$$y(x) = \frac{1}{x}\int x f(x)\,dx + \frac{C}{x},$$
where $C$ is any constant. Find the solution of the equation
$$\frac{dy}{dx} + \frac{1}{x}y = \ln x$$
for $x > 0$, for which $y = 0$ when $x = 1$.

19.8 (a) Use an integrating factor to show that the general solution of
$$\frac{dy}{dx} + y = f(x)$$

19.9 (Newton cooling). An object is heated or cooled above or below the ambient air temperature $T_0$. Under certain physical assumptions, the body temperature $T$ satisfies the equation
$$dT/dt = -k(T - T_0),$$
where $k$ is a positive constant. Find the general solution of the equation.
The body is at 100°C in an atmosphere at 40°C. After 3 minutes, its temperature is 85°C. Find the value of $k$, and determine when the body will reach 60°C.
20 Harmonic functions and the harmonic oscillator
The two previous chapters mainly describe formal techniques for solving linear
differential equations, rather than their applications. The present chapter pre-
sents qualitative ideas and terminology used for practical phenomena governed,
exactly or approximately, by linear differential equations whose solutions have
harmonic behaviour; that is to say, vibrations or waves occur that are sine or
cosine functions having various physical properties – wavelength, amplitude, phase
velocity, and so on. Such a system is called a harmonic oscillator. Section 20.9
treats travelling waves such as sound waves, and Sections 20.10–20.12 some more
advanced phenomena: beats, arising when two harmonic signals are superposed;
dispersion and group velocity, where the velocity is frequency-dependent; the
Doppler effect, arising when a sound or optical source or observer is moving; and
diffraction of superposed waves in space.
which can always be done without changing the function values generated by the
expression (20.1). We say then that the function (20.1) is in standard form.
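Fig. 20.1 The function $x(t) = C\cos(\omega t + \phi)$, showing the amplitude $C$, the period $2\pi/\omega$, and the shift $\phi/\omega$.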
The features of the function x(t) = C cos(ω t + φ) are shown in Fig. 20.1. Assume
that the expression is in standard form (20.2). The graph swings between ±C, and
C is its amplitude. It is periodic (see Section 1.6), repeating itself at intervals of
length 2π /ω, which is its minimum period. The number of complete oscillations
per unit time is the frequency (e.g. in cycles per second, or hertz units), and
Frequency = (period)$^{-1}$ = $\omega/2\pi$. (20.3)
The parameter ω is angular frequency, often shortened merely to ‘frequency’. The
parameter φ is the phase or phase angle. As explained above, φ /ω represents the
distance that the graph x = C cos ω t has to be shifted to coincide with (20.1).
Frequently the independent variable represents length x instead of time t, as
Fig. 20.2 The same function plotted against $\tau = \omega t$, for $-2\pi \le \tau \le 2\pi$; the period in $\tau$ is $2\pi$.
Self-test 20.1
Express $x(t) = -2\sin(3t - \tfrac13\pi)$ in standard form.
$x(t) = C_1\cos(\omega t + \phi_1)$ and $y(t) = C_2\cos(\omega t + \phi_2)$ are harmonic functions in
standard form with the same circular frequency $\omega$. If $\phi_1 > \phi_2$, then $x$ is said to
lead $y$, or $y$ is said to lag $x$, by an angle $\phi_1 - \phi_2$. (20.4)
Fig. 20.3 –––– $v = v_0\cos\tau$; ------ $i = (v_0/\omega L)\cos(\tau - \tfrac12\pi)$. The period inspected is symmetrical about the chosen feature at B.
The curves in Fig. 20.3 are plotted against the variable τ = ω t and represent
$$v = v_0\cos\tau \quad\text{and}\quad i = \frac{v_0}{\omega L}\cos(\tau - \tfrac12\pi).$$
Choose one of the curves, say the v curve, and select a prominent feature, say the
maximum at B. Now search an interval within ±π of B (that is to say, within half a
period on either side of B) for the corresponding feature of i. This is the maximum
of i at A.
Now, as we move from left to right (time increasing) through the interval, B appears
before A – that is to say, at a quarter period ($\tfrac12\pi$) earlier than A. This will be true for any
feature of v within its own symmetrical corresponding interval of ±π. It is equivalent to
saying that, when the two variables to be compared are in standard form, the one with
the greater phase leads, and the other lags, by the phase difference (taken positively).
In Example 20.2 it is essential to limit the search to the prescribed single period.
Otherwise we could argue (see Fig. 20.3) that because, say, C appears before B,
therefore i leads v, which is contrary to the definition. (Any basic period of length
2π will give the same priority.) Also, notice that if one entity leads another it does
not in the least imply that the first is to be taken as the cause of the second.
Suppose that two oscillations, having the same amplitude and frequency, differ
in phase by π so they are displaced by half a period. If the oscillations are added
together, there is total cancellation, as shown in Fig. 20.4. The following example
shows what happens when the phase difference is less extreme.
Self-test 20.2
Two waves are represented by C cos(ω t + φ1) and C cos(ω t + φ2). The two
waves are superimposed. What is the amplitude of the resulting harmonic
wave? For what values of the phases does cancellation occur?
Fig. 20.5 (a) Mass–spring system. The arrows indicate the actual direction of the forces when
F(t), sx, and K dx/dt take positive values. (b) Schematic representation: the spring and the
frictional element must be in parallel.
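$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = V(t), \quad\text{or}\quad \frac{d^2Q}{dt^2} + \frac{R}{L}\frac{dQ}{dt} + \frac{1}{LC}Q = \frac{1}{L}V(t). \qquad (20.7)$$
Again, this is an equation of the type (20.6), with
$$x = Q, \quad b = \frac{R}{L}, \quad c = \frac{1}{LC}, \quad f(t) = \frac{1}{L}V(t).$$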
These two physical systems serve as models of the differential equation (20.6).
They are also models of each other, for by choosing the same values of b and c
and the same forcing term f(t) the circuit would serve as a precise analogue
of the piston and mimic its behaviour exactly. A vast number of systems share
the governing equation (20.6), at least approximately. Such a system is called a
linear oscillator.
Fig. 20.6 Electrical circuit with elements labelled $C$ and $R$ and applied voltage $V(t)$.
20.4
Suppose that in the piston system there is no external force acting, so that F(t) = 0
for all t. We shall choose a conventional notation that simplifies the algebra a little.
We have already worked out this problem (see (18.15)). The solutions of (20.8)
subject to (20.9) are given by
$$x(t) = C e^{-kt}\cos[(\omega_0^2 - k^2)^{1/2}t + \phi], \qquad (20.10)$$
where C and φ are arbitrary. These are called the free oscillations or natural oscilla-
tions of the system represented by the equation.
If the friction, or so-called damping, is zero then k = 0 and the equation for the
free oscillations becomes
$$\frac{d^2x}{dt^2} + \omega_0^2 x = 0 \qquad (20.11)$$
with solutions
$$x(t) = C\cos(\omega_0 t + \phi) \qquad (20.12)$$
oscillations of (20.12) are caused to die away through the factor $e^{-kt}$ in (20.10).
The general effect is shown in Fig. 20.7. We say that the oscillation decays
exponentially down to zero, when all the initial energy is used up on friction. This
is weak damping and the oscillation is said to be underdamped.
Fig. 20.7 Graphs of $x$ against $t$, panels (a) and (b).
If $k^2 > \omega_0^2$, then there is a comparatively large amount of friction, and the form
of the solution is different from (20.10) (see Fig. 20.7b). There are no oscillations;
the x(t) curve dies away without crossing the t axis more than once, as in a dead-
beat electrical instrument or shock absorber. This is the case of heavy damping or
an overdamped oscillation.
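The distinction between weak and heavy damping depends only on the sign of $k^2 - \omega_0^2$. The sketch below classifies the free equation, taking it to have the form $x'' + 2kx' + \omega_0^2 x = 0$ (an assumption consistent with (20.13) with the forcing term removed). The function names, and the label "critically damped" for the borderline case $k = \omega_0$, are introduced here for illustration.

```python
import cmath

def damping_regime(k, omega0):
    """Classify x'' + 2k x' + omega0^2 x = 0 for k >= 0, omega0 > 0."""
    if k * k < omega0 * omega0:
        return "underdamped"           # weak damping: decaying oscillation
    if k * k > omega0 * omega0:
        return "overdamped"            # heavy damping: no oscillation
    return "critically damped"         # borderline case k = omega0

def characteristic_roots(k, omega0):
    """Roots m of the characteristic equation m^2 + 2k m + omega0^2 = 0."""
    d = cmath.sqrt(k * k - omega0 * omega0)
    return (-k + d, -k - d)

print(damping_regime(4, 100), characteristic_roots(4, 100))
print(damping_regime(5, 3), characteristic_roots(5, 3))
```

Complex roots $-k \pm i(\omega_0^2 - k^2)^{1/2}$ correspond to the oscillatory form (20.10); real roots give a sum of two decaying exponentials.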
Self-test 20.3
Let $\omega_0 = k$ in eqn (20.8), so that $x$ satisfies
$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + k^2x = 0.$$
Solve the equation, and discuss how the output behaves.
$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = K\cos\omega t, \qquad (20.13)$$
and suppose as before that the friction (or resistance) is ‘small’:
$$k^2 < \omega_0^2.$$
The mass in the piston system is now subject to competing stimuli. Left to itself
it would oscillate as in (20.10) with circular frequency $(\omega_0^2 - k^2)^{1/2}$, and finally
come to rest. However, the forcing term is trying to make it oscillate with a different
circular frequency ω. The result is described by the general solution of (20.13).
This is equal to the sum of a particular solution (already worked out in (19.4))
and the complementary functions which are the free oscillations given in (18.15):
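$$x(t) = \frac{K}{[(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2]^{1/2}}\cos(\omega t + \Phi) + C e^{-kt}\cos[(\omega_0^2 - k^2)^{1/2}t + \phi],$$
in which $\Phi$ is the polar angle of the point $(\omega_0^2 - \omega^2, -2k\omega)$, and $C$ and $\phi$ are
arbitrary. (20.14)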
The structure of (20.14) is very important: the general features are summarized
in (20.15) below.
(A) The forced oscillation (first term of (20.14)) coexists with a free oscillation
(second term). The free oscillation proceeds as if no forcing term were present.
On account of (D) the free oscillation is called a transient oscillation, and may
show itself, for example, by a brief irregularity in the voltage or current upon
switching an electrical apparatus:
Example 20.4 The circuit shown in Fig. 20.8 is initially quiescent and
uncharged. Find the charge Q(t) on the capacitor after switching the circuit on.
Fig. 20.8 Circuit with $L = 10^{-3}$, $R = 8\times 10^{-3}$, $C = 10^{-1}$, and applied voltage $E(t) = 2\cos 90t$.
We shall rework the problem from first principles. The equation is (see (20.7))
$$10^{-3}\frac{d^2Q}{dt^2} + 8\times 10^{-3}\frac{dQ}{dt} + 10Q = 2\cos 90t,$$
or
$$\frac{d^2Q}{dt^2} + 8\frac{dQ}{dt} + 10^4 Q = 2\times 10^3\cos 90t.$$
Complementary functions $Q_c$ (natural oscillation). The characteristic equation is
$m^2 + 8m + 10^4 = 0$, so that $m = -4 \pm 99.92i$ and the complementary functions are
$Q_c = B e^{-4t}\cos(99.92t + \phi)$, where $B$ and $\phi$ are arbitrary.
Particular solution $Q_p$ (forced oscillation). Look for a solution to the corresponding
complex equation
$$\frac{d^2X}{dt^2} + 8\frac{dX}{dt} + 10^4 X = 2\times 10^3 e^{90it},$$
and take its real part. By trying a solution of the form $X(t) = P e^{90it}$ we obtain
$P = 0.9205 - 0.3488i$. In polar coordinates this becomes $P = 0.9843e^{-0.3622i}$.
The corresponding complex solution, in polar coordinates, is
Fig. 20.9 (a) Forced oscillation, $Q_p = 0.9843\cos(90t - 0.3622)$. (b) Transient, $Q_c = 0.9851e^{-4t}\cos(99.92t + 2.777)$. (c) Total oscillation, $Q = Q_p + Q_c$. All three are plotted against $90t$.
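The numbers quoted in Example 20.4 can be reproduced with Python's complex arithmetic. This is a sketch for checking purposes: `Q` below simply adds the forced and transient terms with the rounded constants from the figure caption, so the initial charge comes out close to zero rather than exactly zero.

```python
import cmath
import math

# Substituting X = P e^{90it} gives P[(90i)^2 + 8(90i) + 10^4] = 2 x 10^3.
P = 2e3 / ((90j) ** 2 + 8 * 90j + 1e4)
amplitude, phase = cmath.polar(P)      # modulus and argument of P

def Q(t):
    """Total charge: forced oscillation plus transient (rounded constants)."""
    Qp = 0.9843 * math.cos(90 * t - 0.3622)
    Qc = 0.9851 * math.exp(-4 * t) * math.cos(99.92 * t + 2.777)
    return Qp + Qc

print(P, amplitude, phase)
print(Q(0.0))                          # near zero: the circuit starts uncharged
```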
20.6 Resonance
Return to eqn (20.14) for the linear oscillator and its solutions and examine the
forced oscillation, which is all that is left after the transient has died away. Its
amplitude, $A$ say, is given by
$$A = \frac{K}{[(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2]^{1/2}}.$$
Different values for the forcing frequency ω will produce different amplitudes;
some values of ω will be more effective than others in generating a large
amplitude.
Regard ω 0 and k as representing the fixed characteristics of some kind of system,
and consider an experiment in which we try to excite it with a controllable input
K cos ω t, keeping K constant but trying various values of ω . The amplitude A
will be greatest when $(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2 = g(\omega)$, say, is a minimum with respect
to the variable $\omega$. It is found, by solving $dg/d\omega = 0$ (see Problem 20.15), that the
minimum occurs when
$$\omega^2 = \omega_0^2 - 2k^2, \qquad (k^2 < \tfrac12\omega_0^2).$$
When $\omega^2$ takes this value the amplitude $A$ will take its greatest possible value for
the given $K$ and $\omega_0$, given by
$$A = \frac{K}{2k(\omega_0^2 - k^2)^{1/2}}.$$
Figure 20.10 shows schematically how the amplitude A, and also the phase Φ in
(20.14), vary with forcing frequency ω. Different curves are obtained according
to the amount of friction or damping (or resistance in the case of a circuit) in the
system, measured by the size of k; as the damping decreases, the maximum
increases. When the condition for a maximum is satisfied, the system is said to be
in a state of resonance.
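The resonance condition can be explored numerically. The following sketch evaluates the forced amplitude and compares its value at $\omega^2 = \omega_0^2 - 2k^2$ with the closed-form maximum; the function names and the sample values of $K$, $k$, $\omega_0$ are arbitrary choices made for illustration.

```python
import math

def forced_amplitude(K, k, omega0, omega):
    """A = K / [(omega0^2 - omega^2)^2 + 4 k^2 omega^2]^(1/2)."""
    return K / math.sqrt((omega0**2 - omega**2) ** 2 + 4 * k**2 * omega**2)

def resonant_frequency(k, omega0):
    """Forcing frequency giving the greatest amplitude; needs k^2 < omega0^2 / 2."""
    if 2 * k * k >= omega0 * omega0:
        raise ValueError("no maximum for omega > 0: need k^2 < omega0^2 / 2")
    return math.sqrt(omega0**2 - 2 * k**2)

def peak_amplitude(K, k, omega0):
    """Greatest amplitude K / [2k (omega0^2 - k^2)^(1/2)]."""
    return K / (2 * k * math.sqrt(omega0**2 - k**2))

K, k, omega0 = 1.0, 0.3, 2.0           # sample values only
omega_r = resonant_frequency(k, omega0)
print(omega_r, forced_amplitude(K, k, omega0, omega_r), peak_amplitude(K, k, omega0))
```

Reducing `k` in this sketch moves `omega_r` towards `omega0` and raises the peak, in line with the behaviour described for decreasing damping.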
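Fig. 20.10 (a) Amplitude $A$ against forcing frequency $\omega$; (b) phase $\Phi$ against $\omega$, lying between $0$ and $-\pi$. Curves are shown for $k = 0$, $k$ small and $k$ large.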
Resonating system
$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = K\cos\omega t.$$
Forced amplitude $A = \dfrac{K}{[(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2]^{1/2}}$.
A physical feeling for the buildup of a large amplitude can be obtained by think-
ing of a child being pushed on a swing by two people, one on either side of the
swing. The method is to push the swing the way it wants to go, and not to work
against it. This is best done by pushing it, forward and backward alternately,
when it is at the bottom of its path. The driving frequency is then the same as the
natural frequency of the swing. The driving cycle is a quarter of a period out of
phase with the swing’s cycle, because the force is a maximum when the displacement
is a minimum. In terms of (20.16), $k$ is assumed small, so that $\omega^2 = \omega_0^2$ very
nearly, and the phase difference $\Phi$ is nearly $\tfrac12\pi$ or a quarter of a period, the forcing
term leading the response by this amount.
Suppose next that there is zero friction,
$$k = 0,$$
so that
$$\frac{d^2x}{dt^2} + \omega_0^2 x = K\cos\omega t, \qquad (20.17)$$
and from (20.16) the forced amplitude $A$ is
$$A = \frac{K}{\omega_0^2 - \omega^2}. \qquad (20.18)$$
The natural frequency of this system is exactly ω 0. When ω (the forcing frequency)
gets close to ω 0, the amplitude A can become very large, approaching infinity as
ω approaches ω 0: see Fig. 20.10a.
When $\omega = \omega_0$ the equation becomes
$$\frac{d^2x}{dt^2} + \omega_0^2 x = K\cos\omega_0 t, \qquad (20.19)$$
and apparently $A = \infty$. This result cannot be said to describe a steady solution
of (20.19), but must be reconcilable with (20.19) in some way. In fact it is the
‘exceptional case’ of eqn (19.6a), and has a solution
$$x(t) = \frac{K}{2\omega_0}\,t\sin\omega_0 t. \qquad (20.20)$$
This particular solution conveniently satisfies the initial conditions
$$x(0) = 0, \quad x'(0) = 0, \qquad (20.21)$$
that is to say, the conditions for initial quiescence. It therefore represents a system
Self-test 20.4
Using (20.14) and (20.16), find the phase at the resonant frequency of a
forced linear oscillator.
This equation is nonlinear since sin θ is not of the form aθ + b, so the methods
of Chapter 19 do not apply to it. However, the Taylor series for sin θ begins
$$\sin\theta = \theta - \tfrac16\theta^3 + \cdots$$
(θ in radians), so provided that θ remains small enough we can approximate
sin θ by
sin θ ≈ θ.
The error is about 10% when θ = 45°, and 0.1% at 5°. Put this into (20.22); we
obtain the approximate linearized equation
$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\theta = 0. \qquad (20.23)$$
The general solution is $\theta(t) = C\cos[(g/l)^{1/2}t + \phi]$ (see Section 20.1). The values
of C and φ will depend on how it was set going; the initial conditions amount
to prescribing the position and angular velocity at t = 0. However, C must be small
for the approximation to be justified.
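The quoted sizes of the error in $\sin\theta \approx \theta$ are easy to reproduce. The sketch below measures the error relative to $\sin\theta$; that choice of denominator, and the function name, are assumptions made here, since the text does not say which relative error it has in mind.

```python
import math

def small_angle_error(degrees):
    """Relative error (theta - sin theta) / sin theta, with theta given in degrees."""
    theta = math.radians(degrees)
    return (theta - math.sin(theta)) / math.sin(theta)

for deg in (45, 5):
    print(deg, f"{100 * small_angle_error(deg):.3f}%")
```

The error is dominated by the neglected term $\tfrac16\theta^3$, so it falls roughly as the square of the angle.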
Exactly linear equations are uncommon in applications. Most frequently they
occur as the result of a simplifying approximation such as we carried out for the
pendulum. Usually some function in the equation is linearized at the expense of
a restriction on the dependent variable.
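[Figure: a particle of mass $m$ on a string under tension $T$, displaced by $x(t)$; each half of the string has length $\tfrac12 L$.]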
$AC \approx \tfrac12 L$
with an error of something like $2x^2/L^2$, and the approximation to the restoring force
$$2\pi\left[\frac{2s}{m}\left(1 - \frac{l}{L}\right)\right]^{-1/2}.$$
It is interesting to consider the case when the string is unstretched in the equilibrium
position, so that $L = l$. In this case $l(l^2 + 4x^2)^{-1/2} \approx 1 - (2/l^2)x^2$, using the binomial expansion.
To lowest order for small $|x|$, the equation of motion is the nonlinear equation
$m\dfrac{d^2x}{dt^2} = -\dfrac{4s}{l^2}x^3$. This case is discussed in Example 23.2.
$$u(t, z) = A\cos\left[2\pi\left(\frac{t}{T}\right) + \phi\right]\cos\left[2\pi\left(\frac{z}{\lambda}\right) + \alpha\right], \qquad (20.24)$$
Fig. 20.13 A standing wave on a stretched string. The nodes N and antinodes M remain fixed in
position.
where $u$ is the displacement, $z$ the coordinate along the string, $\lambda$ the wavelength,
$T$ the period of the oscillation, $A > 0$ the amplitude, and $\phi$ and $\alpha$ arbitrary phase
angles. Sines can be turned into cosines by increasing the phase by $\tfrac12\pi$, so that
(20.24) covers sines as well as cosines. We can express (20.24) in terms of angular
frequency $\omega$ and wave number $k$ by putting
$$T = \frac{2\pi}{\omega}, \qquad \lambda = \frac{2\pi}{k}; \qquad (20.25a)$$
the frequency $f$ in cycles per second ($f$ hertz) is given by
$$f = \frac{1}{T} = \frac{\omega}{2\pi}. \qquad (20.25b)$$
Therefore
$$u(t, z) = A\cos(\omega t + \phi)\cos(kz + \alpha). \qquad (20.26)$$
The nodes N occur where $\cos(kz + \alpha) = 0$, that is, where $z = [(n + \tfrac12)\pi - \alpha]/k$.
Fig. 20.14 A travelling wave $A\cos(\omega t - kz + \phi)$, of wavelength $\lambda = 2\pi/k$, moving with phase velocity $v = \omega/k$; the profile is shown at time $t$ and a moment later.
The shape of $u(t, z)$ at any fixed moment $t = t_0$ takes the form
$$u(t, z) = A\cos(-kz + B),$$
$$z = \frac{\omega t}{k} + \frac{\phi}{k}.$$
From (20.25a,b) and (20.26), the velocity v of the wave along the z axis is therefore
given by
$$v = \frac{\omega}{k} = \frac{\lambda}{T} = \lambda f. \qquad (20.28)$$
The velocity of a sinusoidal wave is called the phase velocity: it is the velocity of a
point for which the phase maintains a constant value; for example, following an
antinode as above, or a node. A more direct way of justifying the equation v = λ f
is as follows. The number of waves crossing the fixed point P per second is equal
to the frequency f. Therefore, the length of the wave train crossing P per unit time
is f × wavelength, which is the velocity v.
The wavelength λ and the frequency f cannot be assigned independently in a
physical problem (and the same applies to ω and k) since they are connected by
(20.28), and the velocity of propagation v is determined by the physical medium
(even if it varies with the frequency).
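The relations (20.25a,b) and (20.28) can be collected into one small helper. This is a sketch only; the function name, the dictionary keys and the sample values of $\omega$ and $k$ are chosen here for illustration.

```python
import math

def wave_parameters(omega, k):
    """Period, wavelength, frequency and phase velocity from omega and k."""
    T = 2 * math.pi / omega            # period, (20.25a)
    wavelength = 2 * math.pi / k       # wavelength, (20.25a)
    f = 1 / T                          # frequency in hertz, (20.25b)
    v = omega / k                      # phase velocity, (20.28)
    return {"T": T, "wavelength": wavelength, "f": f, "v": v}

w = wave_parameters(omega=6.0, k=2.0)  # sample values only
print(w)
print(w["wavelength"] * w["f"], w["wavelength"] / w["T"])  # both equal v
```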
in which A, ω, k, φ are constants. Although this looks similar to (20.27), its mean-
ing is different; eqn (20.29) defines u at every point in the x, y, z space. The values
of u are independent of x and y; that is to say, the value of u over any fixed plane
perpendicular to the z axis, such as ∑ in Fig. 20.15, is uniform at any particular
moment, though this value varies with time t. The waves are therefore called
plane waves.
The z axis is one ray of the three-dimensional wave; along it the situation is
the same as in Case (ii) above. The value of u over ∑′, moving with velocity v, is
equal to the value on the z axis, so any plane ∑′ (Fig. 20.15) that follows a given
constant value of u must move to the right with speed v = ω /k = λ /T. To sum up
these results:
Fig. 20.15 A plane travelling wave. The disturbance u is uniform on the fixed plane ∑. It is
uniform and also constant over ∑′, which moves with velocity v.
(20.30)
To prove the result (20.31a), see Fig. 20.16. P : (x, y, z) is a representative point.
∑ is the plane which passes through P, and is perpendicular to the unit vector v.
OQ is perpendicular to ∑ at Q, and passes through the origin O. Extend OQ to
form a new coordinate axis Oz′: this is parallel to v and in the same sense. Then
(see (10.14))
v ·r = |v||r | cos θ = 1 × OP cos θ = OQ = z′,
where z′ is the Oz′ coordinate of Q (and P). The formula becomes
u(t, r) = A cos(ω t − kv·r + φ) = A cos(ω t − kz′ + φ).
This expression is the same as (20.30), but refers to an axis Oz′ parallel to v, in
place of the axis Oz. It therefore represents a plane travelling wave of the type
(20.30) propagated in the direction of v.
Self-test 20.5
Two travelling waves x = A cos(ω t − kz + φ1), x = A cos(ω t + kz + φ2) are
superimposed. What type is the resulting wave? Where are its nodes?
with
u1(t) = A cos ω 1t, u2(t) = A cos ω 2t. (20.32b)
$$T = \frac{2\pi p}{\omega_1} \quad \left(\text{which also} = \frac{2\pi q}{\omega_2}\right). \qquad (20.33)$$
Evidently T may be very much larger than the periods 2π/ω1 and 2π /ω 2 of the
individual components because of the possibly large size of the factors p and q
(see Problem 20.22).
Express the difference between the component frequencies in (20.32) by ∆ω :
ω 2 = ω 1 + ∆ω , (20.34)
If $\Delta\omega$ is fairly small compared with $\tfrac12(\omega_1 + \omega_2)$, then $B(t)$ is a slowly varying function
compared with $\cos\tfrac12(\omega_1 + \omega_2)t$, and (20.35) takes on the appearance of Fig. 20.17c.
Figure 20.17 shows the components $u_1(t)$ and $u_2(t)$, and the composite function
$u(t)$, together with the functions $\pm B(t)$, which form the profile of a stream of wave
packets made up of faster oscillations. These wave packets are called beats. The
beats arise from a kind of interference: where u1(t) and u2(t) are nearly in phase
(see Fig. 20.17b) they reinforce each other so that u(t) is large; where they are
opposed u(t) is small (see Fig. 20.17b). Despite appearances, the beats will not in
general contain an exact number of complete cycles of u(t): in this case the period
of u(t) is about 31 beats long (see Problem 20.22b). The period and frequency of
the beats (as distinct from the amplitude function B(t)) are defined to be equal to
the period and frequency of the wave packets; therefore
Beat period
$T_B = \tfrac12 \times$ period of $B(t) = 2\pi/(\Delta\omega)$.
Beat frequency
$F_B = 2 \times$ frequency of $B(t) = \Delta\omega/(2\pi)$. (20.37)
If the wave concerned is a sound wave, the tone that is detected by the ear cor-
responds to the pulse or beat frequency (that is, the frequency of B(t)) rather than
to that of the underlying frequencies f1 and f2. In cases where f1 and f2 are large
compared with the frequency fB of the beats, the underlying rapid oscillation is
sometimes referred to as a carrier wave, and the beats correspond to a signal.
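The beat structure can be checked numerically for the values used in Fig. 20.17 ($A = 1$, $\omega_1 = 10$, $\omega_2 = 13.1$). The sketch below assumes the product form $u(t) = B(t)\cos\tfrac12(\omega_1 + \omega_2)t$ with $B(t) = 2A\cos\tfrac12\Delta\omega t$, which follows from the standard sum-to-product identity; the variable names are chosen here for illustration.

```python
import math
from fractions import Fraction

A, w1, w2 = 1.0, 10.0, 13.1
dw = w2 - w1                           # difference of the component frequencies

def u(t):
    """Superposition of the two components."""
    return A * math.cos(w1 * t) + A * math.cos(w2 * t)

def B(t):
    """Slowly varying amplitude function."""
    return 2 * A * math.cos(0.5 * dw * t)

def carrier(t):
    """Fast oscillation at the mean frequency."""
    return math.cos(0.5 * (w1 + w2) * t)

beat_period = 2 * math.pi / dw         # T_B from (20.37)

ratio = Fraction(100, 131)             # omega1/omega2 = p/q in lowest terms
p, q = ratio.numerator, ratio.denominator
T = 2 * math.pi * p / w1               # period of u(t), as in (20.33)

print(u(0.37), B(0.37) * carrier(0.37))
print(T / beat_period)                 # number of beats in one period of u
```

The last line gives the count of beats per full period of $u(t)$, which is the comparison made in the text.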
Fig. 20.17 Here $A = 1$, $\omega_1 = 10$, $\omega_2 = 13.1$. (a) $u_1 = A\cos\omega_1 t$, with $u_2 = A\cos\omega_2 t$. (b) Phase reinforcement of $u_1$, $u_2$ near points A. (c) $u_1 + u_2$ and $\pm B(t) = \pm 2A\cos\tfrac12\Delta\omega t$.
Equilibrium positions
x y
S1 S2 S3
A P Q B
For a mechanical example of the occurrence of beats, see Fig. 20.18. Unit particles
P and Q are connected to fixed points A, B through springs S1, S2, S3 of natural
lengths l, with l 13 AB; the stiffness of each of the springs S1 and S3 is K, and that
of the connecting spring S2 is k. The displacements of P, Q from equilibrium are
x(t), y(t) respectively. The particles oscillate along the line AB, and their equations
of motion are
ẍ + (K + k)x = ky, ÿ + (K + k)y = kx. (20.38)
A related mechanical example was also considered in Section 13.8 in the chapter
on eigenvalues.
It can be checked by substitution that a particular pair of solutions is given by
x(t) = cos(t√K) + cos[t√(K + 2k)], (20.39a)
and
By(t) = −2 sin ½[√(K + 2k) − √K]t
= 2 cos{½[√(K + 2k) − √K]t + ½π}, (20.40b)
≈ k/√K 1
(using the first term of the binomial theorem (Section 1.18 or 5.4) to approximate
to √[1 + (2k/K)]). Therefore ∆ω ≈ k /√K in (20.36). The displacement x(t) of
particle P has beat period TB given by (20.37):
TB = 2π/∆ω ≈ 2π√K/k.
Beats with the same period also occur in y(t), but are out of phase with those
of x(t) by half a beat period. A fixed stock of free mechanical energy is handed
back and forth between P and Q: when one is vibrating vigorously, the other
has only a small amplitude, and P and Q alternate in this respect. The same
phenomenon occurs when two pendulum bobs are coupled by a weak spring.
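The substitution check and the beat-period estimate can be tried numerically. The sketch below is illustrative only: the values K = 100 and k = 2 are hypothetical, and the companion solution y(t) = cos(t√K) − cos[t√(K + 2k)] is an assumption made here, since only x(t) is displayed above.

```python
import math

def residuals(K, k, t):
    """Left side minus right side of each equation in (20.38) at time t."""
    w1, w2 = math.sqrt(K), math.sqrt(K + 2 * k)
    x = math.cos(w1 * t) + math.cos(w2 * t)
    y = math.cos(w1 * t) - math.cos(w2 * t)   # assumed companion solution
    # Second derivatives, written down analytically from the cosines.
    xdd = -w1**2 * math.cos(w1 * t) - w2**2 * math.cos(w2 * t)
    ydd = -w1**2 * math.cos(w1 * t) + w2**2 * math.cos(w2 * t)
    return xdd + (K + k) * x - k * y, ydd + (K + k) * y - k * x

def beat_periods(K, k):
    """Exact beat period 2*pi/dw and the estimate 2*pi*sqrt(K)/k."""
    dw = math.sqrt(K + 2 * k) - math.sqrt(K)
    return 2 * math.pi / dw, 2 * math.pi * math.sqrt(K) / k

K, k = 100.0, 2.0   # hypothetical values with k much smaller than K
print("residuals of (20.38):", residuals(K, k, 0.7))
print("beat period, exact and approximate:", beat_periods(K, k))
```

Both residuals should vanish to rounding error, and the two beat periods should agree more closely as k/K is made smaller.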
where
u1(t, z) = A cos(ω 1t − k1z), u2(t, z) = A cos(ω 2t − k2 z). (20.41b)
[Fig. 20.19: the wave u(t0, z) and its envelope ±B(t0, z), plotted against z at a fixed time t0.]
In this section we visualize graphs of u plotted against z as in Fig. 20.19, for dif-
ferent values of time t.
Suppose firstly that the phase velocity v is constant for all sinusoidal waves. Put
ω 2 = ω 1 + ∆ω, k2 = k1 + ∆k. (20.43)
This wave has the same form as the oscillation discussed in the previous section
(eqn (20.35)), except that we have t − (z /v) in place of t. Plotted against z, as in
Fig. 20.19, the wave travels unchanged at speed v. There is a carrier wave with
wavelength λC = 2π/[½(k1 + k2)], multiplied by a beat function B(t, z):
B(t, z) = 2A cos(½t∆ω − ½z∆k).
In terms of k the beat wavelength λB is
λB = ½(wavelength of B(t, z)) = ½[2π/(½∆k)] = 2π/∆k
and the beat frequency fB is
fB = 2(frequency of B(t, z)) = 2[(½∆ω)/(2π)] = ∆ω/(2π).
The beats travel with the phase velocity v:
beat velocity = ½∆ω/(½∆k) = v∆k/∆k = v.
This is to be expected, since the components u1 and u2 have equal velocity v, and
therefore remain in a constant phase relationship, reinforcing and cancelling each
other over segments that remain in step as the waves travel. The theory of beats in
travelling waves is related to frequency modulation in radio transmission.
There are media and special situations where the velocity of a travelling sinusoidal
wave varies with its frequency (or, equivalently, with its wavelength). Such waves
are called dispersive waves: light waves are dispersive, leading to their spectral
decomposition upon entering a refractive medium.
In a dispersive medium, one component wave will overtake the other; therefore
u1 and u2 will not maintain a constant phase relationship as the wave travels, and
the velocity associated with the beats is affected. Suppose that two dispersive
waves, u1(t, z) and u2(t, z), have different angular frequencies ω 1 and ω 2, and
phase velocities v1 and v2 ≠ v1 (whose values will depend on ω 1 and ω 2). Refer
back to (20.44): if ω 2 and ω 1 are fairly close then distinct beats occur (in both
time t and space z). Their profile is determined by the curves ±B(t, z), where
B(t, z) = 2A cos(½t∆ω − ½z∆k), (20.45)
A graph of v(k) against k would contain all we need to know about the behaviour
of v to enable wave interactions of any degree of complexity to be computed. (We
could instead work with ω rather than k as the independent variable, or λ, or
f (see Problem 20.18), but in any case only one parameter is needed to specify
the variation of v.)
We shall relate this observation to the group velocity problem just discussed,
for cases where the wave number ∆k of the beats is small compared with the wave
number k of the carrier wave. In this case the beats will be very distinct. From
(20.45), (20.46), and (20.47)
vg = ∆ω/∆k = (ω2 − ω1)/(k2 − k1) = [k2v(k2) − k1v(k1)]/(k2 − k1). (20.48)
The form of this equation suggests that we can approximate to vg by an expres-
sion involving the derivative of kv(k). Suppose that
∆k = k2 − k1 → 0, or k2 → k1.
For simplicity, write k in place of k1; then from (20.47)
∆ω/∆k → d(kv(k))/dk = v(k) + k dv(k)/dk.
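As a rough numerical illustration of this limit, the sketch below compares the finite-difference ratio in (20.48) with a central-difference estimate of d(kv(k))/dk. The dispersion law v(k) = √(g/k), the usual deep-water gravity-wave law, is an assumption brought in only for the example and is not taken from the text; for it the group velocity is half the phase velocity.

```python
import math

def group_velocity(v, k, dk=1e-6):
    """Central-difference estimate of d(k*v(k))/dk at wave number k."""
    return ((k + dk) * v(k + dk) - (k - dk) * v(k - dk)) / (2 * dk)

def beat_velocity(v, k1, k2):
    """Finite-difference ratio of (20.48): (k2*v(k2) - k1*v(k1)) / (k2 - k1)."""
    return (k2 * v(k2) - k1 * v(k1)) / (k2 - k1)

G = 9.81  # gravitational acceleration, used only by the illustrative law below

def v_deep_water(k):
    """Illustrative dispersion law (an assumption, not from the text)."""
    return math.sqrt(G / k)

k = 2.0
print("phase velocity      :", v_deep_water(k))
print("group velocity      :", group_velocity(v_deep_water, k))
print("beats, k2 = k + 0.01:", beat_velocity(v_deep_water, k, k + 0.01))
```

For a non-dispersive medium (v constant) the same functions simply return v, in agreement with the previous section.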
[Fig. 20.20: at t = t0 the emitter is at E, and the observer and a wave front are at P; at t1 = t0 + ∆t the emitter has moved a distance u∆t to E′ and the wave front that was at P has moved a distance v∆t to Q.]
reduces the wavelength λ. The frequency f of the emitter and the phase velocity v
(relative to the medium, which we take to be stationary) are fixed, so that the
frequency of arrival of waves at any fixed point P must be greater than f.
To examine the effect quantitatively, see Fig. 20.20. In the following, u may
be positive, as shown, or negative (corresponding to E moving oppositely to the
wave direction). Put
EP = L, t1 − t0 = ∆t.
Then at time t1, E has moved to E′ and the wave front to Q, so that
E′Q = L − u∆t + v∆ t = L + (v − u)∆ t.
Let the wavelength be λ. Then EP contains L/λ wavelengths, and E′Q contains
E′Q/λ = [L + (v − u)∆t]/λ wavelengths.
In the interval ∆t, f∆t new waves have been generated, so that
[L + (v − u)∆t]/λ − L/λ = f∆t, or (v − u)∆t/λ = f∆t.
Therefore
λ = (v − u)/f. (20.50)
The fixed receiver at P records, say, fP waves per second, and these are travelling at
the normal phase velocity v, so from (20.28) and (20.50),
fP = v/λ = vf/(v − u) = f/[1 − (u/v)]. (20.51)
If u > 0 (E moving in the direction of v) then fP > f. If u < 0 (E moving oppositely to
the direction of v) then fP < f.
The effect is observable if a vehicle with a siren speeds past an observer at
a point P: as it passes there is a sudden lowering of the pitch. If the speed of the
vehicle is u, the frequency drop ∆fP is given by
∆fP = f/[1 − (u/v)] − f/[1 + (u/v)], (20.52)
which is approximately equal to 2uf /v if u/v is small. The so-called ‘red shift’ in
astronomy, by which the velocity of a receding galaxy can be observed from
the change towards longer wavelengths (lower frequencies) in its spectrum, is
explained on the same lines.
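Formulae (20.51) and (20.52) are easy to evaluate directly. The following sketch uses hypothetical values (a 440 Hz source moving at 20 m/s, with sound speed 340 m/s) and compares the exact drop with the small-u/v estimate 2uf/v.

```python
def received_frequency(f, u, v):
    """Eqn (20.51): frequency recorded at a fixed point.

    u > 0 means the emitter moves in the direction of the wave,
    u < 0 means it moves the opposite way; the formula needs |u| < v.
    """
    if abs(u) >= v:
        raise ValueError("the formula assumes |u| < v")
    return f / (1 - u / v)

def pitch_drop(f, u, v):
    """Eqn (20.52): drop in frequency as the source passes the observer."""
    return received_frequency(f, u, v) - received_frequency(f, -u, v)

f, u, v = 440.0, 20.0, 340.0   # hypothetical values, SI units
print("approaching:", received_frequency(f, u, v))
print("receding   :", received_frequency(f, -u, v))
print("drop       :", pitch_drop(f, u, v), " estimate 2uf/v:", 2 * u * f / v)
```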
Problems
20.1 Express the following in standard amplitude– (a) State the time constant for Qc in Example 20.4.
phase form C cos(ω t + φ), with C 0 and (b) Describe how T provides a measure of the
−π φ π. rate of exponential decay of x(t), rather like the
(a) 3 cos(3t + --32 π); (b) 3 cos(ω t − 3π); half-life period of a radioactive substance.
(c) 2 sin 3t; (d) 3 sin(2t + --12 π);
(e) −3 cos(2t − --12 π); (f ) −4 cos(2t + --14 π); 20.7 (Heavy damping). Find the general solution
(g) −sin t; (h) 3 cos 2t + 4 sin 2t; of the equation
(i) cos 2t + cos(2t − π); x″ + 2kx′ + ω 2x = 0
(j) cos(2t − --32 π) − cos(2t + --32 π).
when k2 ω 2. Describe the general character of
20.2 State whether x leads or lags y in the
the solutions, contrasting them with the case
following cases, and by how much. when k2 ω 2.
(a) x = 4 cos 3t, y = 3 cos(3t − 12 π).
20.8 Solve the equation
(b) x = 2 cos(2 t + 41 π), y = 3 cos(2 t + 92 π).
(c) x = −3 cos 2t, y = 4 cos 2t. x″ + 10x′ + 24x = 0
(d) x = cos 3t, y = sin 3t. subject to the initial conditions x(0) = −3, x′(0) = 20.
(e) x = 2 cos 3t, y = cos(3t − 94 π). Show that the solution curve crosses the t axis only
once, at the point t = ln 2.
20.3 Obtain the free oscillations of the following
in the form C cos(ω t + φ). State (i) the natural 20.9 (‘Critical damping’). Find the general
frequency if the damping coefficient is put to zero; solution of the equation
(ii) the frequency that actually occurs in the cosine
x″ + 2kx′ + ω 2x = 0
term of the solution; (iii) the number of complete
cycles needed for the amplitude to drop to 0.1 for the case when k2 = ω 2.
of its value at t = 0.
(a) x″ + 20x′ + (2.5 × 105)x = 0. 20.10 The following equation could represent the
(b) x″ + 0.5x′ + 4x = 0. damped vertical motion of a mass supported by a
(c) x″ + 0.15x′ + 3x = 0. spring and subjected to an external periodic force:
(d) x″ + x′ + 20x = 0. x″ + x′ + 36x = 10 cos ω t, for t 0,
the system being in equilibrium under no force
20.4 Express A cos ω t + B sin(ω t + 41 π) in the for t 0.
standard form C cos(ω t + φ ) when (a) A = 3–, B = 1;
1
2
(a) Find the period of the free (damped)
(b) A = 3–, B = −1; (c) A = −3–, B = 1;
1 1
2 2
oscillations. Show that any free oscillations
(d) A = −3–, B = −1.
1
2
stimulated at startup are reduced by a factor
of about 14 after five periods of oscillation.
20.5 (a) Show that the maxima and minima of (b) Obtain expressions in terms of ω for the
x(t) = C e−kt cos(ω t + φ) occur at times TN given by amplitude and phase of the forced oscillation.
⎛k⎞ (c) Find the condition for resonance.
ω TN + φ = − arctan ⎜ ⎟ + Nπ, (d) Plot curves of amplitude and phase against
⎝ω⎠
ω for a range 4 ω 8.
where N is any integer.
(b) Show that the values of x(t) at these points
20.11 A particle rolls to and fro under gravity at
are given by
the bottom of a parabolic cylinder having vertical
(−1)N ω C e −kTN cross-section y = ax2. There is negligible friction.
x(TN ) = .
(ω 2 + k 2 ) 2
1
The equation of motion in terms of horizontal
displacement x is then
20.6 Consider an expression of the form x″ + 2ax(g + 2ax′2)/(1 + 4a2x2) = 0.
x(t) = e−t/Tg(t), Show that for small oscillations the period is
2– π /(ag)– .
1 1
where T is a constant, and g(t) itself does not have 2 2
of amplitude or phase. What is the nature of the combined wave?
(b) Consider separately the effects of a change in phase and a change in amplitude upon reflection.
20.25 Two superposed plane waves, u1 and u2, travel in the z direction through a dispersive medium in which the phase velocity v is regarded as a function of wavelength λ. They have the same amplitude A and phase angle zero, but different wave numbers (and consequently different angular frequencies ω). Show that vg = v − λ dv/dλ, where vg is the group velocity. (Hint: start with vg = ∆ω/∆k.)
(1 + ∆f/f)(1 + ∆λ/λ) = 1 + ∆v/v
and
(1 + ∆k/k)(1 + ∆v/v) = 1 + ∆ω/ω.
(These relations are exact, but show that when small values are being considered the natural variables to use are ∆f/f etc.)
20.27 A fire truck speeds along a highway at 100 km h⁻¹, sounding its siren at a frequency of 350 cycles per second. Obtain the drop in pitch noticed by an observer standing on the sidewalk as it goes past.
21 Steady forced oscillations: phasors, impedance, transfer functions
21.1 Phasors
Let x(t) represent any variable in the circuit, such as the current in a particular
branch. If the frequency of the applied voltage is ω /2π, then all these possible
variables x(t) share the same frequency ω /2π once the transients have died away,
though in general the phases and amplitudes of different variables are different.
Here we adopt the standardized amplitude/phase form of (20.2), assuming that
x(t) = c cos(ωt + φ), with c > 0 and −π < φ ≤ π. (21.1)
Phasor of a harmonic oscillation
The phasor of x(t) = c cos(ωt + φ) is the complex number X = c e^(iφ). (21.2)
X = 3∠−¼π = 3∠−45°.
The two numbers displayed are the polar coordinates of the point which represents
the phasor on an Argand diagram, in this case the point (3/√2, −3/√2)
corresponding to
X = 3 cos(−45°) + i3 sin(−45°) = 3/√2 − i3/√2.
It is often convenient to express a phasor in the form a + ib rather than in the polar
form c e^(iφ).
The phase (−(13/12)π) is out of the standard range (21.1), so add 2π to it, leaving X unchanged.
We obtain, in the standard form, X = √2 e^((11/12)πi), so
x(t) = √2 cos(10⁴t + (11/12)π).
Example 21.3 Let x(t) = √3 cos ω t − sin ω t. Find the corresponding phasor.
Therefore
X = √3 − e^(−½πi) = √3 − (−i) = 2 e^((1/6)πi) or 2∠30°.
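The same phasor can be obtained numerically. The sketch below rests on the identity a cos ωt + b sin ωt = Re[(a − ib) e^(iωt)], so that the phasor is a − ib; the function name `phasor` is simply a label chosen for this illustration.

```python
import cmath
import math

def phasor(a, b):
    """Phasor X of a*cos(wt) + b*sin(wt), defined by x(t) = Re(X e^{iwt})."""
    return complex(a, -b)

# Example 21.3: x(t) = sqrt(3) cos(wt) - sin(wt)
X = phasor(math.sqrt(3), -1.0)
c, phi = cmath.polar(X)          # amplitude c and phase phi in radians
print("X =", X)
print("c =", c, " phi in degrees =", math.degrees(phi))

# Spot check in the time domain at an arbitrary instant (w and t are arbitrary).
w, t = 5.0, 0.3
direct = math.sqrt(3) * math.cos(w * t) - math.sin(w * t)
print(direct, (X * cmath.exp(1j * w * t)).real)
```

`cmath.polar` returns the phase in the range −π to π, which matches the standard form (21.1) except at the end point φ = −π.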
Self-test 21.1
Find the phasor of x(t) = −2 cos(3t − ½π).
Differentiation and integration give important results. If x(t) = Re(X e^(iωt)), where
X is the phasor, then dx/dt = Re(iωX e^(iωt)), so that the phasor of dx/dt is iωX.
Differentiate again, and a further factor iω is introduced, so that the phasor
of d²x/dt² is (iω)²X, and so on. For ∫x(t) dt, we find in the same way that the
phasor is X/(iω). The additive arbitrary constant in ∫e^(iωt) dt has been put to zero
because, in normal use, all the variables that occur oscillate sinusoidally.
Variable:   x              dx/dt    d²x/dt²    ∫x dt
Phasor:     X = c e^(iφ)   iωX      −ω²X       X/(iω)     (21.4)
(a) L d²q/dt² + R dq/dt + q/C, (b) L di/dt + (1/C)∫i dt, in terms of the phasors
Q of q(t) and I of i(t). (L, R, and C are circuit constants, and the prevailing
frequency is ω.)
(a) From (21.4) and the addition principle (21.3), the phasor is
L(iω)²Q + R(iω)Q + (1/C)Q = [(1/C) − Lω² + iRω]Q.
(b) The phasor is
L(iω)I + [1/(iωC)]I = i[Lω − 1/(Cω)]I.
Self-test 21.2
Using phasors, find the steady-state solution of
d²x/dt² + 4 dx/dt + 2504x = 103 cos 100t.
Example 21.6 Let u(t) = 2 cos 10t, v(t) = cos(10t − ½π), and
Fig. 21.1 (a) Argand diagram showing U, V, W. (b) The sum U + V + W = OP.
polar-coordinate notation they are U = 2∠0°, V = 1∠−90°, W = 3∠45°. They are shown
as position vectors in Fig. 21.1a, and in Fig. 21.1b they are strung together as usual for
addition. The vector OP can be measured off from the diagram, or calculated using
the dimensions shown. We have
|OP| = [(3/√2 + 2)² + (3/√2 − 1)²]^(1/2) = 4.27,
φ = arctan[(3/√2 − 1)/(3/√2 + 2)] = 0.266 (radians).
Therefore p(t) = 4.27 cos(10t + 0.266).
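The vector addition of the diagram can also be carried out with complex arithmetic. This is a minimal sketch using the three phasors quoted above in polar form; the helper name `polar_phasor` is an illustrative choice.

```python
import cmath
import math

def polar_phasor(c, degrees):
    """Complex number with modulus c and argument given in degrees."""
    return cmath.rect(c, math.radians(degrees))

U = polar_phasor(2, 0)
V = polar_phasor(1, -90)
W = polar_phasor(3, 45)

P = U + V + W                       # phasor of the sum u + v + w
amplitude, phase = cmath.polar(P)   # phase in radians
print("P =", P)
print("amplitude =", amplitude, " phase =", phase)
```

The real and imaginary parts of P are the horizontal and vertical dimensions marked on the diagram, 3/√2 + 2 and 3/√2 − 1.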
Self-test 21.3
If u(t) = cos(5t − ¼π), v(t) = 2 cos(5t + ½π) and w(t) = 3 cos(5t + ¼π), find
φ(t) = u(t) + v(t) + w(t) by means of a phasor diagram.
A similar table can be constructed if the voltage rather than current is pre-
scribed. The entries can be read from the table above; for example, if the phasor
of the voltage applied to an inductor is V, the phasor of the resulting current is
V/(iωL).
Discussion of circuits in terms of phasors is said to take place in the frequency
domain, rather than the time domain associated with the differential equations
of the circuits.
Each of the three cases in the table can be written in the form
V = ZI,
where Z is either R, iωL, or (iω C)−1. The quantity Z is called the complex
impedance of these elements. There is a plain analogy with Ohm’s law for direct
current through a resistance. We have
Complex impedance Z
Resistor    Z = R
Inductor    Z = iωL
Capacitor   Z = 1/(iωC).
(21.6)
Example 21.7 Show that the complex impedance Z of two elements in series,
whose complex impedances are Z1 and Z2, is given by
Z = Z1 + Z2.
[Fig. 21.2: impedances Z1 and Z2 in series carrying the current i, with voltage drops v1 and v2 and total voltage drop v.]
Suppose that the impedance of the unit is Z; we mean by this that, if V is the phasor
of the voltage drop across the unit and I is the phasor of the current through it
(see Fig. 21.2), then
V = ZI.
From Fig. 21.2, v = v1 + v2; therefore, by (21.3), the corresponding phasors satisfy
V = V1 + V2.
But i, and therefore I, is the same for Z1 and Z2, so
V1 = Z1I, V2 = Z2I.
Therefore V = Z1I + Z2I = ZI, or
Z = Z1 + Z2.
If the two impedances are in parallel, the analogy with Ohm’s law again exists:
Example 21.8 Show that the complex impedance Z of any two elements Z1 and
Z2 in parallel is given by 1/Z = 1/Z1 + 1/Z2.
Fig. 21.3 Two impedances in parallel and their combined impedance Z = Z1Z2/(Z1 + Z2).
From Fig. 21.3, i = i1 + i2; so, by (21.3), I = I1 + I2. The voltage drop is the same for both
branches, so
I1 = V/Z1, I2 = V/Z2, I = V/Z.
Therefore I = (1/Z1 + 1/Z2)V = (1/Z)V, from which the result follows.
It is easy to extend these two results to encompass more elements, and therefore
we have the following general result.
(a) Impedances Z1, Z2, … , in series:
Z = Z1 + Z2 + ··· .
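Because complex impedances combine like resistances, the series rule above and the parallel rule of Example 21.8 translate directly into complex arithmetic. The sketch below is one possible arrangement; the function names and the component values are hypothetical.

```python
def series(*impedances):
    """Combined impedance of elements in series: Z = Z1 + Z2 + ..."""
    return sum(impedances)

def parallel(*impedances):
    """Combined impedance of elements in parallel: 1/Z = 1/Z1 + 1/Z2 + ..."""
    return 1 / sum(1 / z for z in impedances)

def inductor(L, w):
    """Complex impedance iwL of an inductor, from (21.6)."""
    return 1j * w * L

def capacitor(C, w):
    """Complex impedance 1/(iwC) of a capacitor, from (21.6)."""
    return 1 / (1j * w * C)

# Hypothetical values: a resistor in parallel with a capacitor, then an
# inductor in series with that pair.
R, L, C, w = 50.0, 0.2, 1e-4, 100.0
Z_rc = parallel(R, capacitor(C, w))
Z_total = series(Z_rc, inductor(L, w))
print("R parallel C:", Z_rc)
print("whole unit  :", Z_total)
```

A resistor is represented simply by the real number R, so no separate helper is needed for it.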
The analogy with resistive circuits, evident from these formulae, goes much
further. The general rules which govern voltages and currents in a passive linear
circuit are Kirchhoff’s laws: (i) that the algebraic sum of the voltages around any
closed circuit is zero; (ii) that the resultant current entering any junction is zero.
There is also a linear voltage/current relation for each branch. In terms of phasors
and complex impedances for a circuit in a state of steady harmonic oscillation,
these conditions become the following.
Kirchhoff’s laws
Around any closed circuit, ∑ V = 0.
At any junction, ∑ I = 0.
On any branch, V = ZI.
(21.8)
These rules have the same form as the rules for resistive direct-current circuits,
with V, I, and Z appearing in them in place of v, i, and R. It follows that general
rules applicable to DC circuits may be borrowed for the purpose of the circuits
we have been considering. Such rules are the Wheatstone bridge rules, Thévenin’s
theorem, and the structure of equivalent circuits. However, the restriction to
steady harmonic oscillation must be remembered: many circuits can be made
to ‘balance’ like a Wheatstone bridge for steady oscillations, but not for more
general disturbances.
Example 21.9 Find the steady alternating current in the circuit shown in
Fig. 21.4.
[Fig. 21.4: circuit driven by v(t) = v0 cos ωt, containing the elements R, C, and L.]
The unit comprising R and C consists of two complex impedances in parallel, R and
(iω C)−1. If Z is the combined impedance, then
1/Z = 1/R + 1/(iωC)^(−1),
which gives
Z = R/(1 + iωRC).
Z is in series with the other impedance, iωL, so the impedance of the circuit is given by
Z = R/(1 + iωRC) + iωL = [R(1 − ω²LC) + iωL]/(1 + iωRC).
Since I = V/Z, and V = v0, we obtain
I = v0(1 + iωRC)/[R(1 − ω²LC) + iωL] = {v0(1 + ω²R²C²)^(1/2)/[R²(1 − ω²LC)² + ω²L²]^(1/2)} e^(i(φ1 − φ2)),
where
tan φ1 = ωRC, tan φ2 = ωL/[R(1 − ω²LC)].
Finally,
i(t) = {v0(1 + ω²R²C²)^(1/2)/[R²(1 − ω²LC)² + ω²L²]^(1/2)} cos(ωt + φ1 − φ2).
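The closed-form amplitude and phase of I can be checked against a direct complex evaluation of V/Z. In this sketch the component values are hypothetical, and the angles φ1 and φ2 are computed with `atan2` so that the correct quadrant is chosen when 1 − ω²LC is negative.

```python
import cmath
import math

def current_phasor(v0, R, L, C, w):
    """I = V/Z with Z = R/(1 + iwRC) + iwL, as in Example 21.9."""
    Z = R / (1 + 1j * w * R * C) + 1j * w * L
    return v0 / Z

def current_closed_form(v0, R, L, C, w):
    """Amplitude and phase (phi1 - phi2) from the closed-form expression for I."""
    amp = v0 * math.sqrt(1 + (w * R * C) ** 2) / math.sqrt(
        (R * (1 - w * w * L * C)) ** 2 + (w * L) ** 2)
    phi1 = math.atan2(w * R * C, 1.0)
    phi2 = math.atan2(w * L, R * (1 - w * w * L * C))
    return amp, phi1 - phi2

v0, R, L, C, w = 10.0, 100.0, 0.5, 1e-5, 300.0   # hypothetical values
I = current_phasor(v0, R, L, C, w)
amp, phase = current_closed_form(v0, R, L, C, w)
print("direct     :", abs(I), cmath.phase(I))
print("closed form:", amp, phase)
```

The steady current is then amp × cos(ωt + phase).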
Example 21.10 Find the steady alternating current entering the circuit shown
in Fig. 21.5.
[Fig. 21.5: source v(t) = v0 cos ωt driving a current i into the circuit at M; the branch MNPQ is in parallel with the capacitor C on MQ.]
The phasor of the voltage source is V = v0. By (21.6) the impedance of MNPQ is
R + iω L, and that of MQ is 1/iω C. These are in parallel, so by (21.7) the impedance
Z of the circuit viewed between M and Q is given by
1/Z = 1/(R + iωL) + 1/[1/(iωC)].
Therefore
I = V/Z = v0[1/(R + iωL) + iωC].
The simplest way to get an expression for i(t) is to treat the two terms in the brackets
on the right separately (though this does not give the answer in standard form). We obtain
i(t) = [v0/(R² + ω²L²)^(1/2)] cos[ωt − arctan(ωL/R)] + v0ωC cos(ωt + ½π).
Example 21.11 (Balanced bridge circuit.) (a) For Fig. 21.6a, show that
(i) if i(t) = 0, then Z1/Z2 = Z3/Z4, (ii) if Z1/Z2 = Z3/Z4, then i(t) = 0. (b) Check
that i(t) = 0 in the circuit of Fig. 21.6b.
[Fig. 21.6: (a) bridge circuit with impedances Z1, Z2, Z3, Z4 and current i in the bridging branch; (b) a particular bridge whose elements all have unit value.]
(a) The analogy (21.8) between resistive and general circuits for steady harmonic
oscillations enables us to borrow ordinary Wheatstone-bridge theory, substituting
current and voltage phasors and complex impedances for the usual constant currents,
voltages, and resistances. We can therefore say immediately that the circuit is balanced
(i(t) = 0) if, and only if,
Z1 /Z2 = Z3 /Z4.
(b) Z1 consists of a capacitor and resistor in parallel; so, by (21.6) and (21.7),
1/Z1 = 1/(iω)^(−1) + 1/1 or Z1 = 1/(1 + iω).
Also
Z2 = iω, Z3 = 1/(iω), Z4 = 1 + iω.
Therefore
Z1/Z2 = 1/[iω(1 + iω)] and Z3/Z4 = 1/[iω(1 + iω)];
so, from (a), i(t) = 0 and the bridge is balanced.
[Fig. 21.7: circuit containing R, C, L, and R2, with input voltage v1(t) and output voltage v2(t) across R2.]
|G12| = c2/c1,
which is the ratio of the peak voltages, or amplitudes, of v2(t) and v1(t). The argu-
ment (polar angle) of G12 is the phase difference between them. If instead we are
interested in the current i2(t) through R2 produced by v1(t), then we need the ratio
Z12 = V1 /I2,
where I2 is the phasor of i2(t). This quantity, a voltage divided by a current, is
called a transfer impedance. Alternatively, we could consider the ratio
Y21 = I2 /V1,
in which Y21 is called a transfer admittance (whose parallel is conductance in
DC theory).
In general, the ratio of an output (such as a current in a selected branch) to
an input (such as a voltage driving a network) is called a transfer function in
the frequency domain. A different class of transfer functions is discussed in
Chapter 25, on Laplace transforms.
Example 21.12 Find the transfer impedance Z12 = V1 /I2 for the circuit of Fig. 21.8
when the prevailing angular frequency ω is 200.
[Fig. 21.8: circuit ABCDE driven by v1(t) = 10 cos 200t, with element values 2, 0.01, 0.005, and 3, and with the currents i1, i1 − i2, and i2 marked.]
The currents indicated take account of Kirchhoff’s second rule (21.8), that the sum of
the currents entering a junction is zero. The first law expressed in terms of the phasors
(see (21.8)), that the sum of the voltage drops round closed circuits is zero, gives for
the circuits ABCDEA and BCDEB respectively:
2I1 + [1/(200 × 0.01i) + 3]I2 = V1,
and
[1/(200 × 0.01i) + 3]I2 − (200 × 0.005i)(I2 − I1) = 0.
The methods described in this chapter were invented in the late nineteenth cen-
tury to assist engineers working with alternating current to interpret and make
calculations on their circuits. So long as only steady harmonic oscillations had to
be considered, there was no need to solve differential equations: only algebraic
equations are involved and these are much simpler to manipulate. Since that time,
the methods have been extensively developed so as to permit computer calculation
for circuits of any degree of complexity, using matrix algebra, graph theory, and
other sophisticated techniques. In Section 24.16, another method for algebrizing
circuit equations is described, using Laplace transforms.
u(t) = Re[A1 e^(i(ωt+φ1)) + A2 e^(i(ωt+φ2))] = Re[e^(iωt)(A1 e^(iφ1) + A2 e^(iφ2))]. (21.9b)
and then
u(t) = Re[U e^(iωt)]. (21.11b)
Figure 21.9 shows a phasor diagram illustrating vectorial addition of the phasors.
The real and imaginary parts of U, needed to evaluate (21.11b), are equal to the
components of the vector resultant shown. (Clearly, the greater the number of
wave components, the greater the advantage of using phasors.)
Fig. 21.9 Phasor diagram for u(t), eqn (21.9a).
but different phases. Propagation is in the direction of the z axis. The variable t
is to be thought of as a stopwatch time; we shall suppose the watch is switched
on at the moment t = 0 when the origin of z is at a wave maximum. We lose no
generality by this, and it simplifies the algebra. The composite wave is given (see
Appendix B(d)) by
u(t, x, y, z) = u1 + u2 = A0 cos(ωt − kz) + A0 cos(ωt − kz + φ)
= 2A0 cos ½φ cos(ωt − kz + ½φ). (21.12)
Then U1, U2, called the complex amplitudes of u1 and u2, behave like phasors.
Since u = u1 + u2,
u = Re[(U1 + U2) e^(iωt)] = Re[U e^(iωt)],
where U is the complex amplitude of u.
Figure 21.10 shows the phasor diagram for obtaining U on the plane z = 0, from
[Fig. 21.10: phasor diagrams (a) and (b), showing the resultant U formed from the phasors A0 e^(−ikz0) and A0 e^(i(−kz0+φ)).]
time average of u² = (ω/2π) ∫₀^(2π/ω) A² cos²(ωt − kz + φ) dt
= (ω/4π) ∫₀^(2π/ω) A²[1 + cos 2(ωt − kz + φ)] dt = ½A².
Therefore
intensity I = KA2, (21.14)
where K is a constant for the medium. In the case of optics, I is directly related to
the brightness of an image on a screen. (We adopt standard practice by describing
a light beam by means of a scalar wave.)
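The time average that leads to (21.14) can be confirmed by a simple numerical quadrature over one period. The sketch below uses the midpoint rule; the amplitude, frequency, and phase are arbitrary illustrative values.

```python
import math

def time_average_u2(A, w, k=0.0, z=0.0, phi=0.0, n=1000):
    """Midpoint-rule average of [A cos(wt - kz + phi)]^2 over one period 2*pi/w."""
    period = 2 * math.pi / w
    dt = period / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt                     # midpoint of the i-th subinterval
        total += (A * math.cos(w * t - k * z + phi)) ** 2 * dt
    return total / period

A, w = 3.0, 5.0                                # arbitrary illustrative values
print(time_average_u2(A, w, phi=0.4), 0.5 * A * A)
```

The result does not depend on z or φ, which is why the intensity depends on the amplitude alone.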
Now suppose that u is expressed in the form u = Re[U eiω t]. Then
U = U1 + U2,
where
Fig. 21.11 Two uniform light beams, in direction Oz and v, interfere on arrival at a screen.
Let
u1(t, y, z) = A0 cos(ω t − kz), (21.16a)
from (20.31a). The wave arriving at the screen z = 0 is then the resultant of the
two waves
u1(t, y, 0) = A0 cos ω t, u2(t, y, 0) = A0 cos(ω t − ky sin γ ). (21.17)
By using the identity for the sum of two cosines (Appendix B(d)), we obtain
u1 + u2 = 2A0 cos(½ky sin γ) cos(ωt − ½ky sin γ). (21.18)
This represents a pattern of oscillatory disturbance on the screen having amplitude
2A0 cos(½ky sin γ) and phase (−½ky sin γ), where both depend on the vertical
coordinate y. It arises as the result of interference between the incoming waves
where they meet on the screen.
Fig. 21.12 Phasor diagram for interference of two beams on a screen. U1 = OP, U2 = PQ, U = OQ.
[Fig. 21.13: circuits (a) to (d) built from the elements R, L, and C, and a bridge of impedances Z1, Z2, Z3, Z4.]
21.7 A voltage v = 2 cos ωt is applied across each of the circuits in Problem 21.6. Find the amplitude and phase of the current passing through the branches.
21.8 In Fig. 21.14, numerical values are given to the complex impedances (the standard units are ohms, although the quantities may be complex). A voltage with phasor V0 is applied. Obtain the phasor V1 as indicated, the corresponding voltage gain V1/V0, and the transfer impedance V0/I1.
[Fig. 21.14: circuits (a) and (b), with applied voltage phasor V0, output phasor V1, and current phasor I1; the impedance values shown are 3, 3i, −i in (a) and 1, 1, −2i, −i in (b).]
21.9 Sketch phasor diagrams for the following cases and in each case calculate (or measure) to obtain the sum:
(a) cos 10t + 2 cos(10t + 0.3);
(b) cos 10t + 2 sin(10t + 10.2);
(c) cos 10t + 3 cos(10t − 0.2);
(d) sin 20t − 3 cos(20t + 0.75);
(e) 2 cos(50t + 0.4) + sin(50t + 0.3) − 3 cos(50t − 0.5).
21.10 Use phasor diagrams for the following problems.
(a) Given axes x, y, z, obtain the general form for a plane sound wave u(t, x, y, z), travelling in the direction that makes equal (acute) angles with the positive directions of the three axes. Investigate the form of the wave on a screen placed in the (x, y) plane.
(b) The wave in (a) is crossed by another identical plane wave, which travels in the direction of the z axis. Obtain the interference pattern on the plane z = 0.
22 Graphical, numerical, and other aspects of first-order equations
Chapters 19 –21 were largely concerned with the theory and applications of
linear differential equations having constant coefficients. We cannot expect that
every physical situation will be described, even approximately, by equations
having this form – the field of differential equations is naturally far more varied.
In this chapter we mainly consider first-order equations which are either non-
linear, or have a nonconstant coefficient. We first show some simple graphical
and numerical methods that can give a good general picture of the solutions, and
indicate how a solution will develop starting from a given initial value. (For refine-
ments of such methods, consult books on numerical analysis: see, for example,
Boyce and DiPrima (1997).)
In cases where analytic solutions (that is, solutions expressible as more or less
explicit formulae) are obtainable the solution methods are different for each type
of equation. An arbitrary constant still arises, but it is embedded in the general
solution, with no parallel to the simple structure of particular and complement-
ary functions that we have seen for linear equations. In Sections 22.3 –22.5 we
show a few frequently occurring types: separable equations are particularly com-
mon. There exist several reference books containing vast collections of special
results and methods: see, for example, Zwillinger (1992).
22.1
where f(x, y) is unrestricted. If f(x, y) happens to take the form g(x) + h(x)y, the
Fig. 22.1 Lineal-element diagram indicating solution curves for dy/dx = xy.
Rather than placing the direction indicators at grid points as in Fig. 22.1, it is
often easier to look for curves, called isoclines, along which the slope is constant,
as in the following example.
Example 22.1 Sketch the solution curves of dy/dx = x − y.
Here dy/dx takes constant values K on the isoclines x − y = K, or y = x − K. For example,
dy/dx = 0 on y = x, dy/dx = 1 on y = x − 1, and so on. If we draw the line y = x − K, then
the indicators along it are all parallel, with slope K, so it is easier to draw a large
number of them. Figure 22.2 is constructed in this way. (This equation is in fact linear
with constant coefficients, its solutions being y = x − 1 + C e^(−x).)
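Both statements in this example can be checked numerically: that the slope is K everywhere on the isocline y = x − K, and that y = x − 1 + C e^(−x) satisfies the equation. The sketch below uses a central difference for the derivative; the sample values of x, C, and K are arbitrary.

```python
import math

def f(x, y):
    """Right-hand side of dy/dx = x - y."""
    return x - y

def solution(x, C):
    """The family of solutions y = x - 1 + C e^{-x} quoted in Example 22.1."""
    return x - 1 + C * math.exp(-x)

def slope_of_solution(x, C, h=1e-6):
    """Central-difference estimate of dy/dx along a solution curve."""
    return (solution(x + h, C) - solution(x - h, C)) / (2 * h)

# On the isocline y = x - K the direction indicators all have slope K.
K = 0.5
for x in (0.0, 1.0, 2.0):
    print("isocline point", (x, x - K), "slope", f(x, x - K))

# Along a solution curve, the numerical slope should match f(x, y).
for C in (-1.0, 0.0, 2.0):
    x = 1.2
    print("C =", C, slope_of_solution(x, C), f(x, solution(x, C)))
```

The case C = 0 gives the straight-line solution y = x − 1, which coincides with the isocline K = 1.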
Fig. 22.2 Solution curves of dy/dx = x − y in the first quadrant. Dashed lines: isoclines, values of K indicated; solid lines: solution curves.
Fig. 22.3 Pattern of solution curves of dy/dx = x² + y² in the first quadrant. Dashed lines: isoclines; solid lines: solution curves.
Example 22.2 Sketch the solution curves of dy/dx = x² + y².
The isoclines are the circles x² + y² = K (see Fig. 22.3), on each of which the slope is
equal to K (which must be a positive number here).
Example 22.3 Sketch the solution curves of dy/dx = −x/y.
[Fig. 22.4: isoclines through the origin O, including those for K = 1 and K = −1, together with solution curves.]
The isoclines having slope K are the radial straight lines −x/y = K (see Fig. 22.4), or
y = −(1/K)x.
K
Thus, for example, if K = −1, the corresponding isocline is y = x, and solutions cut this
straight line with slope −1 as shown in the figure. On y = 0, the slope K must be infinite
so the direction indicators are vertical.
The method illustrates why there is always an infinite number of solutions:
there will be a single solution curve through every point where f(x, y) has a
definite value. The type of exception that might arise is illustrated by the case
f(x, y) = (xy)^(1/2), which only has a meaning when x and y have the same sign; there
Self-test 22.1
Sketch the isoclines of the differential equation dy/dx = x2 − y2. Using the
isoclines as a guide, sketch solution curves of the equation.
Fig. 22.5 Step-by-step use of direction indicators along a particular solution curve.
Fig. 22.6 Three steps in the numerical solution of dy/dx = f(x, y), starting at the point P0 : (x0, y0).
Therefore
for P1: x1 = x0 + h, y1 = y0 + hf(x0, y0);
for P2: x2 = x1 + h, y2 = y1 + hf(x1, y1);
and, in general, with n = 1, 2, 3, … , in turn,
for Pn: xn = xn−1 + h, yn = yn−1 + hf(xn−1, yn−1).
We expect that the points will be close to the solution curve. This is the Euler
method for approximating to the solution of the differential equation.
[Flow diagram for the Euler method: input a, b, h; set x = a, y = b; write x, y; replace y by y + f(x, y)h and x by x + h; repeat.]
Example 22.4 Use the Euler method to obtain a solution of the initial-value
problem
dy/dx = xy², with y = 1 at x = 0,
between x = 0 and x = 1. Compare the result with the exact solution
y = (1 − ½x²)^(−1) when steps of h = 0.2, 0.1, 0.01, and 0.001 are adopted.
From (22.2), the first few terms are given by:
x1 = h, y1 = 1;
x2 = 2h, y2 = 1 + h2;
x3 = 3h, y3 = (1 + h2) + 2h2(1 + h2)2;
and so on.
The following results are obtained; the entries give y between x = 0 and x = 1 with
various values of step lengths h.
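A table of this kind can be generated with a few lines of code. The following is a minimal sketch of the Euler recurrence applied to Example 22.4; the function names are illustrative, and the printed error is the exact value at x = 1 minus the Euler value.

```python
def euler(f, x0, y0, h, n_steps):
    """Euler method: y_n = y_{n-1} + h f(x_{n-1}, y_{n-1}), x_n = x_{n-1} + h."""
    x, y = x0, y0
    points = [(x, y)]
    for _ in range(n_steps):
        y = y + h * f(x, y)      # slope evaluated at the old point (x, y)
        x = x + h
        points.append((x, y))
    return points

def f(x, y):
    """Right-hand side of dy/dx = x y^2."""
    return x * y * y

def exact(x):
    """Exact solution y = (1 - x^2/2)^(-1) with y(0) = 1."""
    return 1.0 / (1.0 - 0.5 * x * x)

for h in (0.2, 0.1, 0.01, 0.001):
    n = round(1.0 / h)                       # number of steps from x = 0 to x = 1
    x_end, y_end = euler(f, 0.0, 1.0, h, n)[-1]
    print(f"h = {h:<6} y(1) by Euler = {y_end:.6f}  error = {exact(1.0) - y_end:.6f}")
```

The first two steps reproduce y1 = 1 and y2 = 1 + h² from the example, and the error at x = 1 shrinks roughly in proportion to h.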
Euler’s method is very simple; but it is usually good enough to provide reason-
able accuracy over a finite range, provided that small enough intervals are used.
The simplest way of checking accuracy is to experiment with successively smaller
intervals h, noting when further reduction in h does not change the values of y
obtained at the number of decimal places required. Several problems on these
lines are given at the end of the chapter.
There exist, however, far more sophisticated algorithms which will give great
accuracy over long ranges without having to use minute values of h (which can
introduce problems of its own). The computer programs for such methods can be
found in libraries of computer routines. For example, the software Mathematica
has a program for the numerical solution and plotting of initial-value problems:
see the projects in Chapter 42. The theoretical side of the subject is called numer-
ical analysis; mathematical theory makes it possible, for example, to estimate the
size of interval required without carrying out trials.
The equation
$\frac{dy}{dx} = \frac{y^2}{x^2}$
is nonlinear (note the $y^2$ term), and none of the theory of Chapters 18 and 19 can be
adapted to solve it. Write it in the form
$\frac{dy}{y^2} = \frac{dx}{x^2}.$
On the left only y appears, and on the right only x appears. The form looks like
an invitation to integrate both sides:
$\int \frac{dy}{y^2} = \int \frac{dx}{x^2} + C,$
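Carrying out the two integrations leads to the family of curves quoted in the caption of Fig. 22.8. The intermediate steps below are a worked sketch, with the arbitrary constant C left in the position that reproduces that caption.

```latex
\int \frac{dy}{y^{2}} = \int \frac{dx}{x^{2}} + C
\;\Longrightarrow\; -\frac{1}{y} = -\frac{1}{x} + C
\;\Longrightarrow\; \frac{1}{y} = \frac{1 - Cx}{x}
\;\Longrightarrow\; y = \frac{x}{1 - Cx}.
```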
[Fig. 22.8 Solution curves y = x/(1 − Cx) for $dy/dx = y^2/x^2$. Values of C indicated on the curves.]
$\frac{dy}{dx} = g(x)h(y),$
where the right-hand side is the product of two terms, one a function of x only,
and the other a function of y only. (Alternatively, you might see it more easily as
an equation which can be put into the form
Y(y) dy = X(x) dx,
dx dx x dx
are of the right type.
Separation of variables
Equation type: $\frac{dy}{dx} = g(x)h(y)$.
Separate the terms: $\frac{dy}{h(y)} = g(x)\,dx$.
Integrate: $\int \frac{dy}{h(y)} = \int g(x)\,dx + C$, so that y is expressed as a function of x
(usually an implicit function). C may take a range of values. (22.3)
Example 22.5 Find solutions of the equation $y\,\frac{dy}{dx} = \cos x$.
This can be written y dy = cos x dx. By integrating both sides, we obtain $\tfrac{1}{2}y^2 = \sin x + C$,
giving $y = \pm 2^{1/2}(\sin x + C)^{1/2}$, where C is only to a certain extent arbitrary. It cannot be
completely arbitrary; for example, if C = −100, then sin x + C will always be negative
(because −1 ≤ sin x ≤ 1), so the square root never has a real value. We must have
C > −1 to get any real solution. If −1 < C < 1 there are regularly-spaced intervals on
which sin x + C ≥ 0, giving the oval curves in Fig. 22.9. If C > 1 then sin x + C > 0 for
all x, giving the wavy phase paths.
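A quick numerical check of Example 22.5 is sketched below: it evaluates one branch of $y = \pm 2^{1/2}(\sin x + C)^{1/2}$ and confirms that y dy/dx − cos x is close to zero. The function names and the central-difference estimate of dy/dx are assumptions made for illustration.

```python
import math


def y_branch(x, C, sign=1.0):
    """Upper (sign=+1) or lower (sign=-1) branch y = ±sqrt(2 (sin x + C)).

    Raises ValueError where sin x + C < 0, i.e. where no real solution exists.
    """
    return sign * math.sqrt(2.0 * (math.sin(x) + C))


def residual(x, C, sign=1.0, dx=1e-6):
    """y*dy/dx - cos x, with dy/dx estimated by a central difference."""
    dydx = (y_branch(x + dx, C, sign) - y_branch(x - dx, C, sign)) / (2.0 * dx)
    return y_branch(x, C, sign) * dydx - math.cos(x)


if __name__ == "__main__":
    print(residual(0.7, 2.0))        # C > 1: a wavy curve defined for all x
    print(residual(1.0, 0.5, -1.0))  # -1 < C < 1: lower half of an oval
```

Calling `y_branch` with C = −100 raises an error for every x, which is the computational counterpart of the remark that C cannot be completely arbitrary.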
[Fig. 22.9 Solution curves $y = \pm 2^{1/2}(\sin x + C)^{1/2}$ for the equation y(dy/dx) = cos x.]
Example 22.6 Find solutions of $\frac{dy}{dx} = \frac{y(x + 1)}{x(y + 1)}$.
After separating, we have
$\int \frac{1 + y}{y}\,dy = \int \frac{1 + x}{x}\,dx + C,$
or
$\int \left(1 + \frac{1}{y}\right) dy = \int \left(1 + \frac{1}{x}\right) dx + C.$
Example 22.7 Find solutions of $\frac{dy}{dx} = 2y^{1/2}$.
[Fig. 22.10 (a) $y = (x + C)^2$ for various C. (b) The solutions of the differential equation, consisting
only of the right-hand branches of the parabolas. (Note: y(x) = 0 is also a valid solution.)]
This represents a family of parabolas, as shown in Fig. 22.10a. But it cannot be right:
the curves cross at every point, although dy/dx has only one value at any point. In fact,
since $y^{1/2} \ge 0$, only the positive value of y′(x) is legitimate, and this gives the right-hand
Self-test 22.2
22.4
Show that the equation
Given either one of them, we can immediately construct the other, so we shall
regard such pairs of expressions as being simply different ways of writing the
same thing. In effect this is what we did when carrying out the separation-of-
variables process for differential equations in Section 22.3, and we are leading up
to a generalization of this method.
In general, a differential expression or differential form has the shape
P(x, y) dx + Q(x, y) dy, (22.8)
where P(x, y) and Q(x, y) are two functions of x and y. In (22.5), we had P(x, y) = 2x
and Q(x, y) = 0; in (22.7), we had P(x, y) = y and Q(x, y) = x. The symbols on the
left of (22.5) and (22.7), $d(x^2)$ and d(xy), are called the differentials of $x^2$ and xy
respectively.
The table (22.9) (below) gives a list of useful identities written in the usual form
and the differential form for comparison.

Usual form                                                                                    Differential form
$\frac{d}{dx}(C) = 0$ (C constant)                                                            $dC = 0$
$\frac{d}{dx}(x^2) = 2x$                                                                      $d(x^2) = 2x\,dx$
$\frac{d}{dx}(y^2) = 2y\frac{dy}{dx}$                                                         $d(y^2) = 2y\,dy$
$\frac{d}{dx}(xy) = x\frac{dy}{dx} + y$                                                       $d(xy) = y\,dx + x\,dy$
$\frac{d}{dx}\left(\frac{y}{x}\right) = \frac{1}{x^2}\left(x\frac{dy}{dx} - y\right)$         $d\left(\frac{y}{x}\right) = -\frac{1}{x^2}(y\,dx - x\,dy)$
$\frac{d}{dx}\left(\frac{x}{y}\right) = \frac{1}{y^2}\left(y - x\frac{dy}{dx}\right)$         $d\left(\frac{x}{y}\right) = \frac{1}{y^2}(y\,dx - x\,dy)$
$\frac{d}{dx}\left(\ln\frac{y}{x}\right) = \frac{1}{xy}\left(x\frac{dy}{dx} - y\right)$       $d\left(\ln\frac{y}{x}\right) = -\frac{1}{xy}(y\,dx - x\,dy)$
(22.9)
22.4
dy
= 0,
Example 22.9 Find solutions of the equation $\frac{dy}{dx} = -\frac{x^3}{y^2}$.
In differential form, this becomes
$x^3\,dx + y^2\,dy = 0.$
But
$x^3\,dx + y^2\,dy = d(\tfrac{1}{4}x^4) + d(\tfrac{1}{3}y^3) = d(\tfrac{1}{4}x^4 + \tfrac{1}{3}y^3).$
This will be zero if
$\tfrac{1}{4}x^4 + \tfrac{1}{3}y^3 = C,$
where C is, in this case, any constant. The equation is in fact separable; you should
compare this with the process in Section 22.3.
Example 22.10 Find solutions of $\frac{dy}{dx} = \frac{x - y}{x + y}$.
In differential form, this becomes
$0 = (x - y)\,dx - (x + y)\,dy = x\,dx - y\,dx - x\,dy - y\,dy.$
Try to rearrange it so that recognizable forms appear:
$0 = x\,dx - y\,dy - (y\,dx + x\,dy) = d(\tfrac{1}{2}x^2) - d(\tfrac{1}{2}y^2) - d(xy) = d(\tfrac{1}{2}x^2 - \tfrac{1}{2}y^2 - xy).$
This differential will be zero as required if
$\tfrac{1}{2}x^2 - \tfrac{1}{2}y^2 - xy = C,$
where C is the ‘variable constant’, or parameter, which will generate a whole family of
solutions.
In the previous example, the terms were rearranged in a search for a group like
y dx + x dy that would simplify, in that case, to d(xy). If a differential form can be
expressed identically (that is to say, for all y(x)) in the form of a single differential
P(x, y) dx + Q(x, y) dy ≡ dF(x, y), (22.10)
It can be proved that every differential form has an integrating factor, but only
occasionally is it easy to see one. Examples 22.11 and 22.12 show cases that are
amenable.
Example 22.11 Find a family of solutions of $x\,\frac{dy}{dx} = y$.
(This is a linear equation and it is also separable, so we have two other methods for
solving it.) In differential form:
y dx − x dy = 0,
and we cannot do anything with the left-hand side as it stands. However, the remark
above suggests we multiply by the integrating factor $1/x^2$, obtaining
$0 = (1/x^2)(y\,dx - x\,dy) = d(-y/x).$
Therefore
−y/x = C, or y = −Cx,
are the solutions, as is easily confirmed.
There are other possibilities; for example (see (22.9)), we might divide by $y^2$ or xy.
In the end these lead to the same set of solutions.
Note that, in Example 22.11, the equation is linear:
$\frac{dy}{dx} + \left(-\frac{1}{x}\right) y = 0,$
(where we choose C = 0 for simplicity), which is different from, though related to,
the ones which work for the differential form y dx − x dy above.
Example 22.12 Find a set of solutions of $x\,\frac{dy}{dx} = y + y^2 x$.
Equivalently, $y\,dx - x\,dy + y^2 x\,dx = 0$. The first two terms cannot be written as dF(x, y).
The table (22.9) offers three integrating factors, $x^{-2}$, $y^{-2}$, and $(xy)^{-1}$, to choose from.
It is, however, also necessary to be able to manage the remaining term, $y^2 x\,dx$, after
multiplying by the integrating factor, so we choose $y^{-2}$, which gives
$0 = (1/y^2)(y\,dx - x\,dy) + x\,dx = d(x/y) + d(\tfrac{1}{2}x^2) = d(x/y + \tfrac{1}{2}x^2).$
Therefore $x/y + \tfrac{1}{2}x^2 = C$, or $y = x/(C - \tfrac{1}{2}x^2)$, are solutions.
Self-test 22.3
2
d2 y ⎛ d y ⎞
$-\frac{dw}{w^2} = dx$, or $\frac{1}{w} = x + A$,
where A is constant, so that
$w = \frac{1}{x + A}.$
For the second stage, remember that w = dy/dx, so we have
$\frac{dy}{dx} = \frac{1}{x + A}.$
Therefore
y = ln | x + A| + B,
where A and B are constants which we see, in retrospect, may be chosen
entirely arbitrarily.
Equation of the form $\frac{dy}{dx} = f\left(\frac{y}{x}\right)$
Change to a new dependent variable v by
v = y/x
and solve the resulting separable equation.
(To make the change write y = xv, so that
dy/dx = x dv/dx + v.) (22.14)
Example 22.14 Find solutions of $\frac{dy}{dx} = \frac{3y - x}{3x - y}$.
This equation can be written in the form
$\frac{dy}{dx} = \frac{3y/x - 1}{3 - y/x},$
which has the form f(y/x), so change the dependent variable from y to v = y/x. To obtain
dy/dx in terms of v, write y(x) = xv(x). Then dy/dx = x dv/dx + v, and in terms of v the
equation becomes
$x\frac{dv}{dx} + v = \frac{3v - 1}{3 - v}$, or $x\frac{dv}{dx} = \frac{v^2 - 1}{3 - v}.$
Separating the variables,
$\frac{dx}{x} = \frac{3 - v}{v^2 - 1}\,dv = \left(\frac{1}{v - 1} - \frac{2}{v + 1}\right) dv.$
Therefore $\ln |(v - 1)/(v + 1)^2| = \ln |x| + C$, where C is an arbitrary constant. After
returning to y and simplifying, we have
$(y - x)/(y + x)^2 = c,$ (22.15)
where $c = \pm e^{C}$. The solution curves are shown in Fig. 22.11, plotted by working directly
from the differential equation and using a numerical method (see Section 22.2).
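One way to gain confidence in (22.15) is to integrate the differential equation numerically and watch the quantity $(y - x)/(y + x)^2$ along the computed curve. The sketch below does this with a classical fourth-order Runge–Kutta step rather than the Euler method of Section 22.2 (a substitution made here for accuracy); the starting point (1, 0) and all function names are illustrative assumptions.

```python
def f(x, y):
    """Right-hand side of dy/dx = (3y - x)/(3x - y)."""
    return (3.0 * y - x) / (3.0 * x - y)


def rk4_step(x, y, h):
    """Advance y by one classical Runge-Kutta step of length h."""
    k1 = f(x, y)
    k2 = f(x + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(x + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def invariant(x, y):
    """The combination (y - x)/(y + x)^2, constant on each solution by (22.15)."""
    return (y - x) / (y + x) ** 2


def integrate(x0, y0, x_end, n_steps):
    """Return y(x_end) for the solution through (x0, y0)."""
    h = (x_end - x0) / n_steps
    x, y = x0, y0
    for n in range(n_steps):
        y = rk4_step(x, y, h)
        x = x0 + (n + 1) * h
    return y


if __name__ == "__main__":
    y_end = integrate(1.0, 0.0, 2.0, 200)
    print(invariant(1.0, 0.0), invariant(2.0, y_end))
```

The two printed values should agree to many decimal places, since both points lie on the curve with c = −1.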
[Fig. 22.11 Solution curves for $dy/dx = (3y - x)/(3x - y)$.]
$2x = K^{-1} \int v^2\,d(v^2) = \tfrac{1}{2}K^{-1}(v^2)^2 + C,$
or
$v^4 = 4Kx - 2CK.$
The initial condition x(0) = v(0) = 0 then gives
$v(x) = (4Kx)^{1/4}.$
x(t) is obtainable by solving the separable equation $v = dx/dt = (4Kx)^{1/4}$.
Self-test 22.4
Find the general solution of
$2\frac{dy}{dx} = \frac{y}{x} + \frac{y^2}{x^2}.$
Problems
In these problems, y′ means dy/dx.

22.1 Sketch a lineal-element diagram for the solution curves of each of the following.
(a) y′ = −y; (b) y′ = x − y;
(c) y′ = x/y; (d) y′ = xy;
(e) y′ = −y/x; (f) y′ = y/x;

22.2 (Computational). Use the Euler method to compute approximate solutions to the following
initial-value problems. Try various values of the step h. Compare the results with the exact
solutions provided.
(a) $y' = -\tfrac{1}{2}y$ with y = 1 at x = 0, over the range 0 ≤ x ≤ 2. (The exact solution
is $y = e^{-x/2}$.)
y = sin x.) solutions as in Example 22.7. Look out for
solutions you might have lost in the process:
22.3 (Computational). Use the Euler method to these are usually suggested by the sketch.
calculate a few representative solution curves in the dy dy 1
= 2y 2 ; = xy 2 ;
1
dy 2xy
22.4 (Separation of variables). Obtain solutions of (g) + = 0 (this is also a linear equation);
dx x2 − 1
the following equations. dy
(a) y′ = x /y; (b) y′ = 2x /y; (h) (1 − sin y) + cos x = 0;
dx
(c) y′ = x /(y + 2); (d) y′ = (x + 3) /(y + 2);
(e) y′ = x 2 /y 2; (f) y′ = −x 2 /y 2; dy
(i) (1 + 3 e3y ) = 2 e 2x − 1;
(g) y′ = y 2
/x 2
; (h) y′ = −y 2 /x 2; dx
(i) 2xy′ = y 2; (j) yy′ + x = 1; dy
(j) (ex + y + 1) + (ex + y − 1) = 0;
dx dx
(k) = 3t 2x3;
dt dy 1 + cos x sin y
(k) = .
(l) (sin x)
dx
= t; d x 1 − sin x cos y
dt
dy 22.8 (Differential method). Solve the following
(m) ex + y = 1; equations. Some of these need an integrating
dx
dy factor (see eqn (22.12)) such as the ones
(n) (1 + x 2 ) + (1 + y 2 ) = 0, with y(0) = −1. suggested.
dx
dy y y − 2x
(a) = (check also for solutions of
22.5 Show that the solution of the initial-value d x x x − 2y the for y = mx);
problem dy y(1 − x 2 )
(b) = (divide by x 2 );
dy
=− ,
x d x x(1 + x 2 )
dx y dy y2
(c) = 2 ;
where y = 1 when x = 2, is obtained from the equation dx y − 1
x y dy y(y − 1)
2
u du = − v dv.
1
(d)
dx
= 2
y −x
(divide by y 2 );
dy y(x 2 + y 2 − y)
Generalize this technique to apply to the initial- (e) = (divide by x 2 y 2 );
dx x(x 2 + y 2 )
value problem
dy y x3 − y
dy (f) = (show that this reduces to
= g(x)h(y) d x x x3 + y
dx x y d(x/y) = y d(xy); now put u = xy
3 2
H
(c) Assume that (in mks units) K = 4, m = 80,
y
α = 1.2, g = 10, and that the mass is dropped from
rest. Use Euler’s method (22.2) to obtain v 2, and (x, y)
hence v, over a sufficient distance to compare
Bank
Bank
dy/dx.)

22.16 (Computational). As in Problem 22.15, but construct a differential equation for a stream
having a parabolic distribution of velocity, greatest in the middle and zero at the banks, of the form
v(x) = ax(H − x).
Put in plausible values for V, v, H, and a, and compute the path.

22.17 A mouse M enters a room at O and rushes to its hole at H with speed v, pursued by the cat
C, who starts from B at the same moment as the mouse appears (see Fig. 22.13). The cat runs with
dr/dt = v cos θ − V and dθ/dt = −(v sin θ)/r,
where r and θ are polar coordinates for the cat relative to the (moving) mouse. Construct a
differential equation for r in terms of θ, and solve it (it is really only a question of
integration).
[Fig. 22.13]

22.18 A satellite of mass m takes off vertically with speed V at time t = 0 from the surface of a planet
of radius a. Assume that it is only influenced by the gravitational pull of the planet. If r is the distance
of the satellite from the centre of the planet at time t, then, by Newton’s law of gravitation, its
equation of motion is
$m\frac{d^2r}{dt^2} = -\frac{\gamma M m}{r^2},$
where M is the mass of the planet and γ is the gravitational constant. Using the identity
$\frac{d^2r}{dt^2} = v\frac{dv}{dr}$, where $v = \frac{dr}{dt}$,
solve the first-order differential equation in v and r to obtain
$\tfrac{1}{2}(v^2 - V^2) = \gamma M\left(\frac{1}{r} - \frac{1}{a}\right).$
Confirm that the escape velocity (i.e. the velocity above which the satellite will not return to the
planet) from the surface of the planet is √(2γM/a).
23 Nonlinear differential equations and the phase plane
However many methods may be invented for solving differential equations, there
will always remain equations beyond their scope. But this does not mean that
nothing can be done with them. The important van der Pol equation, which
models a type of electrical oscillator,
$\frac{d^2x}{dt^2} + c(x^2 - 1)\frac{dx}{dt} + x = 0,$
where c > 0, cannot be solved explicitly. However, there are still comparatively
simple methods which enable us to demonstrate its really important feature,
which is that every solution, no matter how the device is started off, settles down
into the same regular periodic oscillation.
Techniques enabling such conclusions to be drawn without actually solving the
equation are called qualitative methods. This chapter outlines a way of looking
at differential equations which is at the basis of many of these techniques.
Qualitative methods do not consist of a collection of fixed results, and tend to be
exploratory. Therefore computation is important. In the final section, a simple
computing method is described which is easy to program but is effective enough
to analyse realistic physical and biological models.
We shall take t (time) as the independent variable. For derivatives with respect
to time, we use the conventional dot notation (just like the dash notation (4.1)):
$\dot{x} = \frac{dx}{dt}, \quad \ddot{x} = \frac{d^2x}{dt^2}.$
Let the independent variable be t and the dependent variable x. We shall only
discuss equations which can be written in the form
(there is a special reason for using the symbol y0 here), we expect that the initial
conditions will select exactly one solution.
Suppose that the equation represents an electrical system, and that a graph
of x against t for t ≥ t0 can be plotted automatically. If we find the clock in the
plotter has been wrongly set, then it will not make the graph unusable; only its
starting time t0 will be wrong. Similarly, if we do one experiment starting at
t = 8.00 h and repeat it at t = 13.00 h, the graphs plotted will be the same shape
although the starting times are different (see Fig. 23.1). Intuition suggests that
for autonomous equations, namely those in which t does not occur independ-
ently, it will not be a local clock time t0 assigned to startup that counts, but the
‘stopwatch’ time elapsed from startup, t − t0.
[Fig. 23.1]
This intuition is correct. The mathematical reason is that a change of time scale
from t to t − t0 does not change the form of the differential equation, so the same
phenomena follow. Put
T = t − t0, and write x(t) = X(T ).
Then $dX/dT = dx/dt$ and $d^2X/dT^2 = d^2x/dt^2$. Also t = t0 becomes T = 0, so that the
new initial-value problem is
$\ddot{X} = Q(X, \dot{X}), \quad X(0) = x_0, \quad \dot{X}(0) = y_0.$ (23.2)
The equation is unchanged, but the starting time is assigned the value zero.
Suppose (23.2) is solved in terms of T. Restore t and x(t) by putting T = t − t0. The
solution x(t) of (23.1) is then a function only of t − t0, so it depends only on the
time elapsed from startup.
$\dot{x}(t_0) = y_0.$
The general solution is x(t) = A cos ωt + B sin ωt (see (18.14)). The process of finding A
and B from the equations obtained by substituting the expressions for x(t) and $\dot{x}(t)$ into
the initial conditions is quite complicated (try it). Instead, put
T = t − t0, x(t) = X(T).
Then
$\ddot{X} + \omega^2 X = 0, \quad X(0) = x_0, \quad \dot{X}(0) = y_0.$
The solution of this system is simple:
$X(T) = x_0 \cos \omega T + \omega^{-1} y_0 \sin \omega T.$
Put T = t − t0; then the required solution is
$x(t) = x_0 \cos \omega(t - t_0) + \omega^{-1} y_0 \sin \omega(t - t_0),$
that is to say, x is a function only of the elapsed time t − t0.
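The dependence on elapsed time alone is easy to confirm numerically. The sketch below codes the solution just obtained, together with its derivative, and compares two runs started at the clock times 8.00 and 13.00 used earlier in this section; the numerical values of x0, y0 and ω are arbitrary assumptions.

```python
import math


def x_solution(t, t0, x0, y0, omega):
    """x(t) = x0 cos w(t - t0) + (y0/w) sin w(t - t0)."""
    s = t - t0
    return x0 * math.cos(omega * s) + (y0 / omega) * math.sin(omega * s)


def xdot_solution(t, t0, x0, y0, omega):
    """Time derivative of x_solution."""
    s = t - t0
    return -omega * x0 * math.sin(omega * s) + y0 * math.cos(omega * s)


if __name__ == "__main__":
    x0, y0, omega = 0.3, -1.2, 2.0   # illustrative values only
    elapsed = 1.3
    print(x_solution(8.0 + elapsed, 8.0, x0, y0, omega))
    print(x_solution(13.0 + elapsed, 13.0, x0, y0, omega))
```

Both runs print the same value (up to rounding), because only t − t0 enters the formula.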
This problem could arise in connection with a mass oscillating on a spring. The
initial conditions imply that the position and velocity are prescribed at the start,
t = t0. If asked what the system was doing at t = t0, specification of the position and
velocity seems to constitute an adequate description of its state. It is in fact a
perfect description, since it is exactly what is required to determine the whole
future of the system. It is therefore reasonable to call the pair of numbers
(x0, y0)
the state of the system at t0.
Subsequently the system moves smoothly through a succession of states: x and
$\dot{x}$ will vary in time. Catch the system at any moment t1; then the state $(x(t_1), \dot{x}(t_1))$
serves as fresh initial conditions for all the subsequent motion, but there can
never be any conflict with what was predicted from the original initial conditions.
It is the succession of states which is the subject of this chapter; the precise time
that the states occur takes a secondary place.
To track the succession of states $(x(t), \dot{x}(t))$ for the initial-value problem (23.4),
we could in principle begin by finding its solution. The solution (Example 23.1) is
$x(t) = x_0 \cos \omega t + \omega^{-1} y_0 \sin \omega t,$
so
$\dot{x}(t) = -\omega x_0 \sin \omega t + y_0 \cos \omega t.$
In effect, these equations specify the states, x(t) and $\dot{x}(t)$, parametrically with
$y\,dy = -\omega^2 x\,dx,$
or
$\omega^2 x^2 + y^2 = C,$ (23.6b)
[Fig. 23.2 (axes x and $y = \dot{x}$; the initial point A : (x0, y0) is marked)]
solution. Equation (23.6a) alone does not tell us which way to travel along the curve;
for this purpose, we momentarily resurrect t. The arrows indicate the directions
that correspond to time going forwards rather than backwards. We defined y by
$\frac{dx}{dt} = \dot{x} = y.$ (23.7)
If we are in the upper half plane, then y > 0, so dx/dt is positive. Therefore x(t) is
increasing, and the directive arrow points from left to right. By a similar argument
with y < 0 we find that in the lower half plane the arrow points from right
to left. We must follow the arrow. Supplied with arrows, the state curves are called
phase paths, or trajectories, or orbits for the differential equation $\ddot{x} + \omega^2 x = 0$.
Starting from A : (x0, y0), follow the phase path. In going round, we can pick out
a new feature in passing: that $\dot{x}$ is zero when x is at a minimum or maximum, and
vice versa. Eventually we get back to A, renewing the initial state. Continue to
follow the path around, duplicating the first circuit; the succession of states is
repeated time after time.
This repetition does not itself establish that this is a truly periodic process.
When we meet A again at the end of the first circuit, it is at a later time, t1 say, so
the initial conditions for the original equation are to this extent changed. Even
though the system must follow the same path, perhaps it takes twice as long to
go round the second time. However, from the discussion in Section 23.1, the time
to complete any circuit, or to go repeatedly between any two fixed points on the
circuit, is invariable because the equation is autonomous. This argument does not
depend on what equation we started with, so we can say in general: any closed
phase path represents a periodic oscillation.
Finally notice in Fig. 23.2 the bullet at the origin. This point represents a true
solution, namely
x(t) = 0, y(t) = 0
for all t (corresponding to C = 0 in (23.6b)). It is a special case of an equilibrium
point, meaning, in general, a constant solution
$x(t) = k, \quad \dot{x}(t) = 0,$
where k is a constant. Equilibrium points are of great importance in phase dia-
grams. An equilibrium point surrounded by closed curves is called a centre. It
represents periodic oscillations about equilibrium. (Note, however, that oscillations
of different amplitudes do not usually have the same period).
Since we chose a simple case, we have not discovered anything we did not know
already, so consider the following example.
Example 23.2 Sketch an $(x, \dot{x})$ phase plane for the equation $\ddot{x} + cx^3 = 0$,
where c > 0.
This represents small lateral oscillations x(t) of a mass attached to the middle of an
elastic string that is fixed at the ends and is unextended when x = 0. It is the same as
Example 20.5, with l = L, and c = 4s/ml. We can regard explicit solutions as being
unobtainable.
Put $y = \dot{x}$. Then $\ddot{x} = \dot{y}$, and we have two first-order equations, together equivalent to
the original equation:
$\dot{x} = y, \quad \dot{y} = -cx^3,$
$y\,dy = -cx^3\,dx,$
where A is arbitrary. For any A > 0, the phase path consists of two curves, one
for y > 0 and one for y < 0. On both curves y = 0 where $x = \pm A^{1/4}$. The curves join
smoothly at these two points so that the phase paths are closed curves. The family
of phase paths is shown in Fig. 23.3. The origin is an equilibrium point since x = 0,
y = 0 is a solution.
[Fig. 23.3]
The phase diagram of Fig. 23.3 consists entirely of closed curves; so every solu-
tion of the differential equation is a periodic oscillation. (However, we cannot say
that they all have the same period: in fact they do not.) This phase diagram has
therefore revealed an important fact about an equation that we could not solve.
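The remark that the periods differ can be checked by computation. The sketch below integrates $\dot{x} = y$, $\dot{y} = -cx^3$ with a fourth-order Runge–Kutta step (chosen here for accuracy; it is not a method described in this section) and times half a circuit, from x = amplitude to x = −amplitude. The value c = 1, the step length and the function names are assumptions.

```python
def rk4_step(x, y, h, c):
    """One classical Runge-Kutta step for x' = y, y' = -c*x**3."""
    def deriv(x, y):
        return y, -c * x ** 3

    k1x, k1y = deriv(x, y)
    k2x, k2y = deriv(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = deriv(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = deriv(x + h * k3x, y + h * k3y)
    x_new = x + h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def period(amplitude, c=1.0, h=1e-3):
    """Period of the closed phase path through (amplitude, 0), amplitude > 0."""
    x, y, t = amplitude, 0.0, 0.0
    while True:
        x_new, y_new = rk4_step(x, y, h, c)
        if y < 0.0 <= y_new:
            # y returns to zero at x = -amplitude: half a circuit is complete.
            fraction = -y / (y_new - y)      # linear interpolation inside the step
            return 2.0 * (t + fraction * h)
        x, y, t = x_new, y_new, t + h


if __name__ == "__main__":
    for amplitude in (0.5, 1.0, 2.0):
        print(amplitude, period(amplitude))
```

For this equation the computed period halves when the amplitude doubles, in contrast to the linear oscillator of Fig. 23.2, whose period is the same for every path.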
Self-test 23.1
Write down and solve the equation for the phase paths of $\ddot{x} - x^3 = 0$. Sketch
the phase diagram.
equations; stability
For the moment, we shall stay with equations that are familiar from Chapter 18.
$y\,dy = \omega^2 x\,dx,$
or
$y^2 - \omega^2 x^2 = A,$
where A is arbitrary. This represents the family of hyperbolas in Fig. 23.4, having
asymptotes y = ±ω x. The directions of the arrows follow the rule in Section 23.2:
left to right in the upper half plane.
[Fig. 23.4]
On the other hand a centre, exemplified in Figs 23.2 and 23.3, would often be
called a stable equilibrium point. If equilibrium is disturbed by a small amount,
[Fig. 23.6 A stable node; the straight-line paths $y = -\tfrac{1}{2}x$ and y = −3x are marked.]
In Fig. 23.6, the origin is called a stable node. All solutions fall straight into the
origin without any oscillations: the system is deadbeat. Notice the structure of
the node. There are two straight line solutions to the equation
$\frac{dy}{dx} = \frac{-\tfrac{7}{2}y - \tfrac{3}{2}x}{y},$
which can be found by trying for solutions of the form y = mx. Then
$dy/dx = m = (-\tfrac{7}{2}m - \tfrac{3}{2})/m$, or $2m^2 + 7m + 3 = 0$. Therefore $m = -\tfrac{1}{2}$ or m = −3, and the two
linear solutions are $y = -\tfrac{1}{2}x$, y = −3x. They divide the plane into four sectors which
contain curved phase paths. Each of the curves has the property that it is tangential
to $y = -\tfrac{1}{2}x$ at the origin, and parallel to y = −3x at infinity. This behaviour is
characteristic of nodes arising from linear equations, and the mutual tangency at
the origin is common to all nodes, even those arising from nonlinear equations.
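The search for straight-line paths y = mx reduces to a quadratic in m, so it is easily automated. The sketch below assumes the slightly more general phase-path equation dy/dx = (−ky − cx)/y, for which substituting y = mx gives $m^2 + km + c = 0$; the case in the text is k = 7/2, c = 3/2. The function name is hypothetical.

```python
import math


def straight_line_slopes(k, c):
    """Real slopes m of straight-line paths y = m x for dy/dx = (-k y - c x)/y.

    Substituting y = m x gives m**2 + k*m + c = 0. An empty list is returned
    when the quadratic has no real roots (no straight-line paths).
    """
    disc = k * k - 4.0 * c
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted([(-k - root) / 2.0, (-k + root) / 2.0])


if __name__ == "__main__":
    print(straight_line_slopes(3.5, 1.5))   # the node of Fig. 23.6
```

For k = 7/2 and c = 3/2 the two roots are the slopes −3 and −1/2 found above.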
The technique for second-order differential equations can be summed up as
follows.
Self-test 23.2
Discuss the possible phase diagrams of the linear equation $\ddot{x} + \dot{x} + cx = 0$ for
nonzero values of the constant c.
Fig. 23.7
If
x = 0, ±2π, ±4π, … , (23.10b)
the pendulum is hanging vertically from its pivot in equilibrium. These values of x
all represent the same observed state, though on the phase plane they correspond
to different points. Similarly the values
x = ±π, ±3π, … (23.10c)
small, isochronous oscillations. We have already solved the same problem for the
phase plane in Section 23.2 and we found a centre: the family of ellipses shown in
Fig. 23.2:
$\omega^2 x^2 + y^2 = C,$
where C is an arbitrary non-negative constant. This family will be repeated (for
small C) around x = ±2π, ±4π, … in a progressively developing phase diagram;
see Fig. 23.8.
[Fig. 23.8 Phase paths near the equilibrium points for the pendulum equation $\ddot{x} + \omega^2 \sin x = 0$.]
Next, consider the case when the pendulum stands vertically: we choose as
representative of (23.10c) the case x = π. To find what happens when the state is
slightly displaced from the point (π, 0), put
x = π + X,
where the new variable X is going to be small. Then
where A is (to an extent) arbitrary. Since cos x has period 2π, the repetitious nature
of Fig. 23.8 is explained. Notice that $(\cos x - A)^{1/2}$ is real only when cos x ≥ A.
THE GENERAL PHASE PLANE
[Figure: phase diagram for the pendulum equation, x axis marked −π, O, π, 2π, 3π. Caption (beginning missing): … (for A < −1) represent a whirling motion. The separatrices correspond to A = −1. There are
centres at x = 0, ±2π, … , and saddles at x = ±π, ±3π, … .]
Therefore A ≤ 1. With that limitation, there are two main ranges of A which
give significantly different patterns of (x, y) curves: −1 < A < 1 and A < −1. The
centres correspond to A = 1, and the special curves joining the saddles, called the
separatrices, correspond to A = −1. Notice the regular whirling motions which
occur if $y = \dot{x}$ is large enough.
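The classification by the value of A can be summarized in a small helper. The sketch assumes that the phase paths have the form $y = \pm\sqrt{2}\,\omega(\cos x - A)^{1/2}$, which is what integrating $y\,dy = -\omega^2 \sin x\,dx$ gives with the constant written as $-\omega^2 A$; the function names and return strings are illustrative only.

```python
import math


def classify_pendulum_path(A):
    """Describe the phase path labelled by the constant A."""
    if A > 1.0:
        return "no real path"
    if A == 1.0:
        return "centre (equilibrium point)"
    if A > -1.0:
        return "closed path (oscillation)"
    if A == -1.0:
        return "separatrix"
    return "whirling motion"


def y_upper(x, A, omega=1.0):
    """Upper branch y = sqrt(2) * omega * sqrt(cos x - A); needs cos x >= A."""
    return math.sqrt(2.0) * omega * math.sqrt(math.cos(x) - A)


if __name__ == "__main__":
    for A in (1.5, 1.0, 0.0, -1.0, -2.0):
        print(A, classify_pendulum_path(A))
```

On the whirling paths (A < −1) the quantity cos x − A never vanishes, so y keeps one sign and x increases or decreases without limit.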
or
$a \ln y - by + c \ln x - dx = C, \quad (x > 0,\ y > 0).$ (23.15)
Equation (23.15) represents the closed curves shown in Fig. 23.10. It is possible to
determine in advance that they are closed: the reason will be given shortly.
The direction arrows on the figure do not obey the rule (23.8b) for the $(x, \dot{x})$
phase plane. Each case has to be treated separately. The principle is easy: we have
to find the direction at a single point, and the directions elsewhere are settled
by continuity of direction: we expect adjacent curves to have the same direction.
We might take the point M : (0, m) in Fig. 23.10. At this point, the second
equation of (23.13) gives $\dot{y} = -cm < 0$, so y is decreasing at M. Once this direction
is settled, the directions on the other curves follow by continuity.
[Fig. 23.10 (axes x, y; the equilibrium point E : (c/d, a/b) is marked)]
There is a centre at the equilibrium point E : (c/d, a/b). If the rabbit/fox
populations take the values at E, the equations predict that the state will be
permanent. A bad season for grass, or a disease amongst the foxes, will put the
population state somewhere else, and thereafter the populations will undergo
periodic oscillations. If foxes feast and thrive, rabbits languish, eventually starving
the foxes; therefore rabbits prosper again; and so on.
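The closed paths can be checked numerically by following one solution and monitoring the left-hand side of (23.15). The system (23.13) is not reproduced in this passage, so the sketch assumes the form $\dot{x} = ax - bxy$, $\dot{y} = -cy + dxy$, which is consistent with $\dot{y} = -cm$ at (0, m), with the equilibrium point (c/d, a/b), and with (23.15). The parameter values, the Runge–Kutta integrator and the function names are likewise assumptions.

```python
import math


def rhs(x, y, a, b, c, d):
    """Assumed predator-prey system: x' = a x - b x y, y' = -c y + d x y."""
    return a * x - b * x * y, -c * y + d * x * y


def rk4_step(x, y, h, params):
    """One classical Runge-Kutta step of length h."""
    k1x, k1y = rhs(x, y, *params)
    k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y, *params)
    k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y, *params)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y, *params)
    return (x + h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)


def invariant(x, y, a, b, c, d):
    """Left-hand side of (23.15): a ln y - b y + c ln x - d x."""
    return a * math.log(y) - b * y + c * math.log(x) - d * x


def simulate(x0, y0, params, h, n_steps):
    """Return the state after n_steps steps of length h."""
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = rk4_step(x, y, h, params)
    return x, y


if __name__ == "__main__":
    params = (1.0, 0.5, 0.75, 0.25)          # illustrative a, b, c, d
    x, y = simulate(4.0, 2.0, params, 0.001, 20000)
    print(invariant(4.0, 2.0, *params), invariant(x, y, *params))
```

With these values the equilibrium point is E : (3, 2), and the two printed numbers agree closely, as they must if the computed point stays on one of the closed curves.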
To understand this you might need to look forward at Section 28.1. In three
dimensions, x, y, z, the surface z = f(x) + g(y) is bowl-shaped, with a minimum
or maximum at (α, β ). The paths are closed curves cut out by intersection with
the horizontal planes z = C. The functions c ln x − dx and a ln y − by have maxima
at x = c/d, y = a/b.
For a general system
$\dot{x} = P(x, y), \quad \dot{y} = Q(x, y),$
the equilibrium points are where
P(x, y) = Q(x, y) = 0,
and might therefore appear anywhere in the phase plane, not just on the x axis
as with the $(x, \dot{x})$ plane. On Q(x, y) = 0, $\dot{y} = 0$ so that phase paths cut this curve
parallel to the x axis. Similarly on P(x, y) = 0, $\dot{x} = 0$ so that paths cut this
curve parallel to the y axis. Between these curves the slopes of the paths will be
either positive or negative depending on the sign of Q(x, y)/P(x, y). The curves
Q(x, y)/P(x, y) = constant are known as isoclines (as in Section 22.1), that is,
curves along which the slopes of the phase paths are constant. The following
statements recall the main features encountered in this section.
Self-test 23.3
NONLINEAR DIFFERENTIAL EQUATIONS AND THE PHASE PLANE
We shall obtain a linear approximation to P(x, y) and Q(x, y) valid near (k, l). Put
x = k + X, y = l + Y, (23.19)
where we suppose that X and Y are small. Then, because of (23.18), the approxima-
tions will take the form
$\dot{X} = P(x, y) \approx aX + bY, \quad \dot{Y} = Q(x, y) \approx cX + dY$ (23.20)
a microscope or seen over an immense field, so we are not restricted to small x
and y.
APPROXIMATE LINEARIZATION
In applying (23.22), do not be too ready to decide that the original equations
have a centre just because the linearized ones do: the small difference from the
linear approximation may be all that is necessary to change a centre into a spiral.
[Fig. 23.11 (the points (1, 1) and (−1, −1) are marked)]
Self-test 23.4
Investigate the linear approximations of
$\dot{x} = y(x^2 - 1), \quad \dot{y} = -x(y^2 - 1)$
at its equilibrium points (see Self-test 23.3). Sketch the phase diagram.
where
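$X(t) = \begin{pmatrix} X(t) \\ Y(t) \end{pmatrix}, \quad \dot{X}(t) = \begin{pmatrix} \dot{X}(t) \\ \dot{Y}(t) \end{pmatrix}, \quad A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$ (23.24)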
to secure that the origin is the unique equilibrium point (otherwise there is a line
consisting of an infinite number of equilibrium points).
We look for a basis of solutions having the form
$X_1(t) = U_1 e^{\lambda_1 t}, \quad X_2(t) = U_2 e^{\lambda_2 t},$ (23.26)
in which U1 and U2 are linearly independent constant vectors. Every solution X(t)
of (23.23) will be the sum of multiples of the basic solutions (23.26).
To determine λ1, λ2, U1 and U2, substitute the form
$X(t) = U e^{\lambda t}$
into (23.23). After cancelling the common factor $e^{\lambda t}$, we obtain λU = AU, or
(A − λI)U = 0 (23.27)
where I is the identity matrix. The solutions λ are the eigenvalues of A, and U1
and U2 are a pair of linearly independent eigenvectors. Equation (23.27) has nonzero
solutions for U if, and only if,
$\det(A - \lambda I) = \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = 0,$
which is the quadratic equation
$\lambda^2 - (a + d)\lambda + (ad - bc) = 0.$
Put
$p = a + d, \quad q = ad - bc, \quad \Delta = p^2 - 4q.$ (23.28)
then all X(t) → 0 as t → ∞ and X(t) → ∞ as t → −∞. These cases correspond
respectively to unstable and stable nodes (Fig. 23.6 is an example of a node).
LIMIT CYCLES
If the eigenvalues are of opposite sign, say λ1 < 0 and λ2 > 0, then all X(t) → ∞
with the exception of the two straight line paths corresponding to C2 = 0, which
enter the origin as t → ∞. There are also two that emerge from the origin, corresponding
to C1 = 0. This behaviour defines a saddle point (see Fig. 23.4).
Now suppose that Δ < 0. Then $\Delta^{1/2}$ is pure imaginary, and λ1 and λ2 are complex
conjugates:
$\lambda_1 = \alpha + i\beta, \quad \lambda_2 = \alpha - i\beta,$ (23.31)
say, where α and β are real. Also the eigenvectors U1 and U2 are complex conjugates
(or may be so chosen), and since we require the most general real solution
we choose $C_2 = \bar{C}_1$. Express these parameters in the form
$C_1 = |C_1| e^{i\gamma}, \quad C_2 = |C_1| e^{-i\gamma},$
$U_1 = \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} |u| e^{i\rho} \\ |v| e^{i\sigma} \end{pmatrix}, \quad U_2 = \begin{pmatrix} \bar{u} \\ \bar{v} \end{pmatrix} = \begin{pmatrix} |u| e^{-i\rho} \\ |v| e^{-i\sigma} \end{pmatrix}.$ (23.32)
The general (real) solution for the system becomes
$X(t) = C e^{\alpha t} |u| \cos(\gamma + \rho + \beta t),$
$Y(t) = C e^{\alpha t} |v| \cos(\gamma + \sigma + \beta t),$ (23.33)
in which C and γ are arbitrary. In the X, Y plane this represents two simultaneous
harmonic motions having the same circular frequency β and different phases,
γ + ρ and γ + σ, and amplitudes modulated by the factor $e^{\alpha t}$.
Consider first the case α = 0 in (23.31), so that
$X(t) = C |u| \cos(\gamma + \rho + \beta t),$
$Y(t) = C |v| \cos(\gamma + \sigma + \beta t).$ (23.34)
X(t) and Y(t) are periodic, with period 2π/β, and therefore (X(t), Y(t)) represents a
closed path in the X, Y phase plane, which must surround the origin, since the origin
is the only equilibrium point. By varying C we generate a family of geometrically
similar curves. Therefore the origin is a centre (see, for example, Fig. 23.3: it can
be shown that these paths are central ellipses inclined to the axes in general).
In eqn (23.33) for α ≠ 0, the factor $e^{\alpha t}$ modulates the amplitude of (23.34). The
closed paths (23.34) are no longer closed, but expand along every cycle if α > 0,
and contract, approaching the origin, if α < 0. We therefore have a family of spirals
(see, for example, Fig. 23.5), and the origin is therefore called a spiral point.
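The cases discussed above can be collected into a short classification routine based on p, q and Δ from (23.28). This is a sketch rather than a complete taxonomy: the treatment of the borderline case Δ = 0 as a node, and the wording of the returned labels, are assumptions made here.

```python
def classify_equilibrium(a, b, c, d):
    """Classify the origin of X' = aX + bY, Y' = cX + dY using (23.28)."""
    p = a + d                    # sum of the eigenvalues
    q = a * d - b * c            # product of the eigenvalues
    delta = p * p - 4.0 * q      # discriminant of the eigenvalue equation
    if q == 0:
        return "degenerate (origin not the unique equilibrium point)"
    if q < 0:
        return "saddle"                          # real eigenvalues, opposite signs
    if delta >= 0:
        # Real eigenvalues of the same sign; delta == 0 is treated as a node here.
        return "stable node" if p < 0 else "unstable node"
    if p == 0:
        return "centre"                          # purely imaginary eigenvalues
    return "stable spiral" if p < 0 else "unstable spiral"


if __name__ == "__main__":
    print(classify_equilibrium(0.0, 1.0, -1.5, -3.5))   # the system behind Fig. 23.6
    print(classify_equilibrium(0.0, 1.0, 1.0, 0.0))
    print(classify_equilibrium(0.0, 1.0, -1.0, 0.0))
```

The first call corresponds to $\dot{x} = y$, $\dot{y} = -\tfrac{3}{2}x - \tfrac{7}{2}y$, the system whose phase-path equation is dy/dx = (−(7/2)y − (3/2)x)/y.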
F + (x2 + B2 − 1)B + x = 0.
Put
B = y, D = (1 − x2 − y2)y − x. (i)
It is possible to express the phase paths in polar coordinates r, θ :
r 2 = x2 + y2 and tan θ = y/x.
Differentiate these equations with respect to t:
rK = xB + yD,
I/cos θ = (xD − yB)/x2.
2
(ii)
Substitute (i) into (ii): remember that B = y, and put x = r cos θ and y = r sin θ as necessary.
Then r(t) and θ(t) are found to satisfy
K = −r(r 2 − 1) sin2θ, (iii)
A particular solution of (iii) and (iv) is r = 1, with $\dot{\theta} = -1$. This indicates a path consisting
of the circle r = 1, followed around in the clockwise direction with unit angular velocity.
Also, from (iii),
$$\dot{r}\;\begin{cases} \ge 0 & \text{if } r < 1, \\ \le 0 & \text{if } r > 1, \end{cases} \tag{v}$$
so the circle is approached from points inside by means of expanding spirals, and from
points outside by contracting spirals. The phase diagram is shown in Fig. 23.12.
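The approach to the circle r = 1 can be checked numerically. The following is a minimal sketch, not part of the original example: it integrates system (i) with a classical fourth-order Runge-Kutta step (a stand-in chosen here for accuracy; the function names, step length and end time are all assumptions) and reports the final radius for one start inside and one outside the circle.

```python
import math

def field(x, y):
    """Right-hand sides of system (i): x' = y, y' = (1 - x^2 - y^2) y - x."""
    return y, (1.0 - x * x - y * y) * y - x

def rk4_step(x, y, h):
    """Advance (x, y) by one classical Runge-Kutta step of length h."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = field(x + h * k3x, y + h * k3y)
    x_new = x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y_new = y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x_new, y_new

def final_radius(x0, y0, t_end=40.0, h=0.01):
    """Radius r = sqrt(x^2 + y^2) reached at t = t_end from (x0, y0)."""
    x, y = x0, y0
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h)
    return math.hypot(x, y)

if __name__ == "__main__":
    print(final_radius(0.1, 0.0))  # start inside the circle
    print(final_radius(3.0, 0.0))  # start outside the circle
```

Both radii should settle close to 1, in line with the sign of $\dot{r}$ in (v).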
Fig. 23.12 (phase diagram in the x, y plane)
If we start from any initial conditions except for the equilibrium point (0, 0), the
system settles down gradually to the regular oscillation represented by the circle. This
behaviour has a physical explanation. The ‘coefficient’ $x^2 + \dot{x}^2 - 1$, although variable,
serves the purpose of a damping coefficient. Outside the circle, when $x^2 + y^2 - 1 > 0$
(remember $y = \dot{x}$), energy is lost and the paths tend to drift inwards. When $x^2 + y^2 - 1 < 0$
there is negative damping; energy is being supplied, so the amplitude of paths within
the circle increases. For points on the circle x2 + y2 − 1 = 0 the damping is zero, so the
motion is harmonic (the solutions are x = cos(t + φ ), with φ any constant), consistent
with the circular path.
Fig. 23.13 (a) Limit cycle for $\ddot{x} + 10(x^2 - 1)\dot{x} + x = 0$. (b) The solution x(t) corresponding to
the limit cycle.
Self-test 23.5
Using polar coordinates, show that $\ddot{x} + (x^2 + \dot{x}^2 - 1)\dot{x} + x = 0$ has a limit cycle
whose path is given by r = 1. Is it stable?
Since, approximately,
$$x_{n+1} - x_n = h\,\dot{x}(t_n) = h\,P(x_n, y_n),$$
and similarly for $y_{n+1} - y_n$, the rule for getting from $P_n$ to $P_{n+1}$ is as follows.
This process gives rise to rather unevenly spaced points on a phase path, widely
spaced when P and Q are large, and very closely spaced near an equilibrium point,
where P and Q are inevitably small. However, they have the advantage that, if
necessary, regular time indications can be marked on the path while it is being
computed.
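The stepping relation $x_{n+1} - x_n = hP(x_n, y_n)$, with the matching one for y, can be sketched in a few lines. This is an illustration only: the name `euler_path` and the test system (the centre $\dot{x} = y$, $\dot{y} = -4x$) are choices made here, not taken from the text.

```python
def euler_path(P, Q, x0, y0, h, n_steps):
    """Phase-path points from x_{n+1} = x_n + h P(x_n, y_n), y_{n+1} = y_n + h Q(x_n, y_n)."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n_steps):
        # Both updates use the values at step n (simultaneous assignment).
        x, y = x + h * P(x, y), y + h * Q(x, y)
        points.append((x, y))
    return points

def energy(point):
    """The quantity 4x^2 + y^2, constant on the exact paths of x' = y, y' = -4x."""
    x, y = point
    return 4.0 * x * x + y * y

if __name__ == "__main__":
    h, n = 0.001, 3142  # roughly one period, pi, of the exact motion
    path = euler_path(lambda x, y: y, lambda x, y: -4.0 * x, 1.0, 0.0, h, n)
    print(energy(path[-1]) / energy(path[0]))
```

For this particular system each step multiplies $4x^2 + y^2$ by exactly $1 + 4h^2$, so the computed path spirals slowly outwards instead of closing; a smaller h reduces the drift.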
If evenly spaced points are wanted, the parameter can be changed from time t
to arc-length s. We have $\delta s^2 = \delta x^2 + \delta y^2$; so
$$\frac{ds}{dt} = \left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right]^{1/2} = (P^2 + Q^2)^{1/2}.$$
Therefore
$$\frac{dx}{ds} = \frac{dx}{dt}\Big/\frac{ds}{dt} = \frac{P}{(P^2 + Q^2)^{1/2}} \quad \text{and} \quad \frac{dy}{ds} = \frac{Q}{(P^2 + Q^2)^{1/2}}$$
are equivalent equations for the path, in terms of arc-length s. This gives the
following method.
To compute the paths of the system $\dot{x} = P(x, y)$, $\dot{y} = Q(x, y)$, at evenly
spaced points
Apply (23.36) to the equivalent system
$$\frac{dx}{ds} = L(x, y), \qquad \frac{dy}{ds} = M(x, y),$$
where
$$L = P/(P^2 + Q^2)^{1/2}, \qquad M = Q/(P^2 + Q^2)^{1/2}.$$
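A possible rendering of the arc-length method, assuming the same one-step update as before but applied to L and M; the function name and the choice to stop at an equilibrium point (where P = Q = 0 and L, M are undefined) are assumptions of this sketch.

```python
import math

def arc_length_path(P, Q, x0, y0, ds, n_steps):
    """Step along a phase path in equal increments ds of arc length."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n_steps):
        p, q = P(x, y), Q(x, y)
        speed = math.hypot(p, q)  # (P^2 + Q^2)^(1/2)
        if speed == 0.0:
            break  # equilibrium point: L and M are undefined here
        # L = P/speed and M = Q/speed form a unit vector along the path.
        x, y = x + ds * p / speed, y + ds * q / speed
        points.append((x, y))
    return points

if __name__ == "__main__":
    # The saddle x' = y, y' = x, started away from the origin.
    path = arc_length_path(lambda x, y: y, lambda x, y: x, 1.0, 0.5, 0.05, 50)
    print(len(path), path[-1])
```

Because (L, M) has unit length, successive points are a distance ds apart however large or small P and Q are.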
PROBLEMS
Many of the problems involve computation. The method of Section 23.8 is sufficient, but a
high-accuracy computer library routine would allow the use of much larger values of h, and
therefore be more efficient.

23.1 (Computation). Practise computing a phase diagram in the following cases. Information is
given for checking, but imagine that you do not have it. The equilibrium points are at (0, 0). Take
different starting points; you may have to work backwards as well as forwards, by changing the
sign of h.
(a) $\dot{x} = y$, $\dot{y} = -4x$. (A centre, $4x^2 + y^2 = C$. If your paths do not nearly close, try a smaller
interval h.)
(b) $\dot{x} = y$, $\dot{y} = x$. (A saddle, $x^2 - y^2 = C$. Find the asymptotes by trying y = mx in the equations:
they are y = ±x. It is difficult to make sense of the diagram without this information.)
(c) $\dot{x} = y$, $\dot{y} = -2x - 3y$. (Stable node. Find the two solutions y = −x, y = −2x, which are radial
straight lines as in (b). These represent four paths since they are interrupted by the origin.)
(d) $\dot{x} = y$, $\dot{y} = -3x - y$. (Stable spiral.)
(e) $\dot{x} = y$, $\dot{y} = -2x + y$. (Unstable spiral.)
(f) Recompute (a), marking off a time scale on each of the paths, showing intervals in t of
around 0.3.
(g) Recompute (b) with a time scale as in (f).
(h) $\dot{x} = y$, $\dot{y} = -2y$. A different type: what is the second-order equation that it comes from?

23.2 Sketch the phase paths for the following equations by first solving for them: form dy/dx
and separate the variables.
(a) $\dot{x} = y$, $\dot{y} = x$; (b) $\dot{x} = x$, $\dot{y} = y$;
(c) $\dot{x} = -y$, $\dot{y} = x$; (d) $\dot{x} = -x$, $\dot{y} = y$;
(e) $\dot{x} = 2y$, $\dot{y} = x$; (f) $\dot{x} = -2y$, $\dot{y} = x$.

23.3 Solve the following by using the energy transformation $d^2x/dt^2 = \tfrac{1}{2}\,d(\dot{x}^2)/dx$ (Example
20.15), and sketch the $(x, \dot{x})$ phase diagrams.
(a) $\ddot{x} = e^x$;
(b) $\ddot{x} + \dot{x}^2 + x = 0$ (the transformed equation is linear in $y^2$);
(c) $\ddot{x} - 8x\dot{x} = 0$;
(d) $\ddot{x} = e^x - e^{-x}$ (the Poisson–Boltzmann equation).

23.4 Classify the equilibrium point (0, 0) for each of the following linear equations by using (23.22).
Sketch the phase diagram: in cases where it is appropriate you should first obtain the radial
straight paths y = mx by substitution. State which are unstable.
(a) $\dot{x} = x - 5y$, $\dot{y} = x - y$;
(b) $\dot{x} = x + y$, $\dot{y} = x - 2y$;
(c) $\dot{x} = -4x + 2y$, $\dot{y} = 3x - 2y$;
(d) $\dot{x} = x + 2y$, $\dot{y} = 2x + 2y$;
(e) $\dot{x} = 4x - 2y$, $\dot{y} = 3x - y$;
(f) $\dot{x} = 2x + 3y$, $\dot{y} = -3x - 3y$.

23.5 For the equations given: find any equilibrium points; obtain a linear approximation at each
equilibrium point by the method of Section 23.6; classify it from (23.22) (finding the straight line
paths in the case of nodes and saddles); and put the sketches on a phase diagram. Guess how the
diagram away from the equilibrium points is filled in (isoclines, Section 17.1, might help here). Then
turn to Problem 23.6.
(a) $\dot{x} = x - y$, $\dot{y} = x + y - 2xy$;
(b) $\dot{x} = 1 - xy$, $\dot{y} = (x - 1)(y + 1)$;
(c) $\dot{x} = x - y$, $\dot{y} = x^2 - 1$;
(d) $\ddot{x} + x - x^3 = 0$ (with $\dot{x} = y$);
(e) $\dot{x} = 4x - 2xy$, $\dot{y} = -2y + xy$, for $x \ge 0$ and $y \ge 0$ (foxes and rabbits, Example 23.6: classify (0, 0)
as if x and y could be negative).

23.6 (Computational). Check some of the phase diagrams you sketched in Problem 23.5 by
computing representative phase paths. Look out for separatrices, which end at equilibrium points.

23.7 Sketch possible phase diagrams from the information given. If a phase path ends in mid
air, or if you have a closed curve without an equilibrium point inside, then there is something
wrong. There are often several possibilities: for example, a path might either join two equilibrium
points or split, forming two branches going to infinity. Suppose that the only equilibrium
points at a finite distance are those given in the following cases.
(a) centre at (0, 0), saddle at (1, 0);
(b) centre at (0, 0), saddles at (±1, 0);
(c) unstable node at (0, 0), stable node at (1, 0);
(d) centres at (±1, 0).

23.8 (Computational). Obtain a phase diagram for the following (in some of these the linear
approximation at the equilibrium point is zero, so it gives no information):
(a) $\ddot{x} + |\dot{x}|\dot{x} + x = 0$; (b) $\ddot{x} + |\dot{x}|\dot{x} + x^3 = 0$;
(c) $\ddot{x} = x^4 - x^2$; (d) $\dot{x} = 2xy$, $\dot{y} = y^2 - x^2$;
(e) $\dot{x} = 2xy$, $\dot{y} = x^2 - y^2$;
(f) $\ddot{x} + \dot{x}(x^2 + \dot{x}^2) + x = 0$ (notice that the origin is a spiral, although the linear approximation has
a centre; see the remark following (23.22)).

23.9 From the Taylor series (5.4b), $\sin x \approx x - \tfrac{1}{6}x^3$ for small x, so the pendulum equation (23.9) is
approximated by $\ddot{x} + \omega^2(x - \tfrac{1}{6}x^3) = 0$ (the Duffing equation). Sketch or compute the phase diagram,
and comment on the differences from Fig. 23.9, for the exact equation.

23.10 (Computational). For a modified form of the predator–prey problem (compare Example
23.6), in a special case, the equations are
$$\dot{x} = 4x - 2xy - x^2, \qquad \dot{y} = -2y + xy - 2y^2.$$
The additional terms in $x^2$ and $y^2$ are meant to account for competition for resources
among rabbits and among foxes. Use a linear approximation at the equilibrium points in order
to classify them, then compute the phase diagram.

23.11 A model for H(t) hosts supporting P(t) dangerous parasites is $\dot{H} = (a - bP)H$,
$\dot{P} = (c - dP/H)P$, where a, b, c, d are positive. Analyse the system in the (H, P) plane.

23.12 Figure 23.14 represents a spring of stiffness s and natural length l, pivoted at A at a height
h above a smooth wire CD. At B is a bead m, attached to the spring and sliding on the wire.
The equation of motion is
$$\ddot{x} + \frac{s}{m}\left(1 - \frac{l}{(h^2 + x^2)^{1/2}}\right)x = 0.$$
Classify the equilibrium points when l > h, l = h, and l < h.

23.15 (Computational). Construct a phase diagram for the following equations. (They
each contain a limit cycle.)
(a) $\ddot{x} + \tfrac{1}{2}(x^2 + \dot{x}^2 - 1)\dot{x} + x = 0$;
(b) $\ddot{x} + \tfrac{1}{5}(x^2 - 1)\dot{x} + x = 0$;
(c) $\ddot{x} + \tfrac{1}{5}(\tfrac{1}{3}\dot{x}^2 - 1)\dot{x} + x = 0$;
(d) $\ddot{x} + 5(x^2 - 1)\dot{x} + x = 0$.

23.16 As in Problem 23.7, sketch phase diagrams for the general (x, y) phase plane compatible
with the following information. The equilibrium points and limit cycles specified are the only
ones allowed.
(a) (0, 0) is a spiral and $x^2 + y^2 = 1$ is a stable limit cycle.
(b) (0, 0) is a spiral, $x^2 + y^2 = 1$ a stable limit cycle, and $x^2 + y^2 = 4$ another limit cycle.
(c) (±1, 0) are saddles, (0, 0) is a centre, and $x^2 + y^2 = 4$ is a stable limit cycle.
(d) (±1, 0) are centres, (0, 0) is a saddle, and $x^2 + y^2 = 4$ is a stable limit cycle.
(e) (0, 0) is a centre; the only closed path with $x^2 + y^2 > 1$ is the stable limit cycle $x^2 + y^2 = 4$.

23.17 Show that, in polar coordinates, the system
$$\dot{x} = -y + x(1 - x^2 - y^2), \qquad \dot{y} = x + y(1 - x^2 - y^2)$$
becomes
$$\dot{r} = r(1 - r^2), \qquad \dot{\theta} = 1.$$
By investigating the sign of $\dot{r}$, explain why the system has just one limit cycle, which is stable.
Sketch the phase diagram.

23.18 Find the locations of all the equilibrium points of
$$\dot{x} = (x^2 + y^2 - 1)y, \qquad \dot{y} = -(x^2 + y^2 - 1)x.$$
Explain why the circle $x^2 + y^2 = 1$ does not represent periodic motion.
$$\int_0^\infty e^{-st} f(t)\,dt = F(s) \tag{24.1}$$
(for s large enough to ensure convergence) is called the Laplace transform of f(t):
the integral transforms f(t) into another function F(s).
THE LAPLACE TRANSFORM
$$F(s) = \int_0^\infty e^{-st} e^{2t}\,dt = \int_0^\infty e^{-(s-2)t}\,dt = \left[-\frac{1}{s-2}\,e^{-(s-2)t}\right]_0^\infty$$
$$= -\frac{1}{s-2}\,(e^{-\infty} - e^{0}) = -\frac{1}{s-2}\,(0 - 1) = \frac{1}{s-2}.$$
This result is true only if s > 2; otherwise the integral is infinite. We shall always
assume that s is large enough to ensure that the integrals we encounter remain
finite, or converge (see Section 15.6).
We also use the symbol $\mathcal{L}$ to stand for the ‘Laplace transform of’. We have just
proved that
$$F(s) = \mathcal{L}\{e^{2t}\} = \frac{1}{s-2}.$$
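$$\mathcal{L}\{t^n\} = \int_0^\infty e^{-st}\,t^n\,dt.$$
With the substitution u = st,
$$\mathcal{L}\{t^n\} = \int_0^\infty e^{-u}\left(\frac{u}{s}\right)^n \frac{du}{s} = \frac{1}{s^{n+1}}\int_0^\infty e^{-u}\,u^n\,du = \frac{n!}{s^{n+1}}$$
for n = 0, 1, 2, … (from the standard integral, (17.9) for the factorial). Note that 0!
is to be interpreted as being equal to 1.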
$$\mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}}, \qquad n = 0, 1, 2, \dots, \quad t > 0. \tag{24.2}$$
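The two results so far, $\mathcal{L}\{e^{2t}\} = 1/(s-2)$ and $\mathcal{L}\{t^n\} = n!/s^{n+1}$, can be checked by evaluating the defining integral numerically. The sketch below is an illustration only: it truncates the infinite range at an assumed `t_max` and uses Simpson's rule, neither of which comes from the text, and it is meaningful only when s is large enough for the integrand to have decayed by `t_max`.

```python
import math

def laplace_numeric(f, s, t_max=40.0, n=4000):
    """Simpson's-rule estimate of the integral of exp(-s t) f(t) over [0, t_max].

    n must be even. The tail beyond t_max is ignored (an approximation).
    """
    h = t_max / n
    total = f(0.0) + math.exp(-s * t_max) * f(t_max)
    for k in range(1, n):
        t = k * h
        weight = 4 if k % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3.0

if __name__ == "__main__":
    # L{e^{2t}} at s = 3, to compare with 1/(s - 2)
    print(laplace_numeric(lambda t: math.exp(2.0 * t), 3.0), 1.0 / (3.0 - 2.0))
    # L{t^3} at s = 2, to compare with 3!/s^4
    print(laplace_numeric(lambda t: t ** 3, 2.0), math.factorial(3) / 2.0 ** 4)
```

For s ≤ 2 the first integrand grows with t, and the estimate simply increases with `t_max`, reflecting the divergence noted above.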
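$$\mathcal{L}\{e^{\pm t}\} = \int_0^\infty e^{-st} e^{\pm t}\,dt = \int_0^\infty e^{-(s \mp 1)t}\,dt = -\frac{1}{s \mp 1}\left[e^{-(s \mp 1)t}\right]_0^\infty.$$
The quantities $s \mp 1$ are both positive if we take s > 1, in which case
$$\mathcal{L}\{e^{\pm t}\} = -\frac{1}{s \mp 1}\,(0 - 1) = \frac{1}{s \mp 1}.$$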
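Exponential function, t > 0
$$\mathcal{L}\{e^{t}\} = \frac{1}{s-1}, \qquad \mathcal{L}\{e^{-t}\} = \frac{1}{s+1}. \tag{24.3}$$
Trigonometric functions, t > 0
$$\mathcal{L}\{\cos t\} = \frac{s}{s^2+1}, \qquad \mathcal{L}\{\sin t\} = \frac{1}{s^2+1}. \tag{24.4}$$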
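Since $\cos t + i\sin t = e^{it}$, both of these can be verified at the same time by working
out $\mathcal{L}\{e^{it}\}$ and then separating the real and imaginary parts:
$$\mathcal{L}\{e^{it}\} = \int_0^\infty e^{-st} e^{it}\,dt = \int_0^\infty e^{-(s-i)t}\,dt = -\frac{1}{s-i}\left[e^{-(s-i)t}\right]_0^\infty = -\frac{1}{s-i}\,(0 - 1)$$
(since s is positive)
$$= \frac{1}{s-i} = \frac{s+i}{s^2+1}.$$
Therefore, as in (24.4),
$$\int_0^\infty e^{-st}\cos t\,dt = \operatorname{Re}\int_0^\infty e^{-st} e^{it}\,dt = \frac{s}{s^2+1},$$
$$\int_0^\infty e^{-st}\sin t\,dt = \operatorname{Im}\int_0^\infty e^{-st} e^{it}\,dt = \frac{1}{s^2+1}.$$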
Self-test 24.1
Find, using the definition (24.1), the Laplace transform of (a) sin 2t,
(b) $\tfrac{1}{2}\sin 2t + e^{2t}$ (for s > 2). Show that $\mathcal{L}\{H(t - 1)\} = e^{-s}/s$, where H is the
unit function (1.13).
Scale rule
If $\mathcal{L}\{f(t)\} = F(s)$, and k > 0, then for t > 0
$$\mathcal{L}\{f(kt)\} = \frac{1}{k}\,F\!\left(\frac{s}{k}\right). \tag{24.5}$$
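The proof is as follows.
$$\mathcal{L}\{f(kt)\} = \int_0^\infty e^{-st} f(kt)\,dt.$$
Substituting u = kt, so that dt = du/k,
$$\mathcal{L}\{f(kt)\} = \int_0^\infty e^{-s(u/k)} f(u)\left(\frac{du}{k}\right) = \frac{1}{k}\int_0^\infty e^{-(s/k)u} f(u)\,du = \frac{1}{k}\,F\!\left(\frac{s}{k}\right),$$
since $F(s) = \int_0^\infty e^{-su} f(u)\,du$.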
The following are special cases of the scale rule.
This is true also if k is negative, since it is equal to $\int_0^\infty e^{-st}\cos kt\,dt$ (see (24.1)).
(c) is similar to (b).
$$\mathcal{L}\{e^{kt} f(t)\} = \int_0^\infty e^{-st} e^{kt} f(t)\,dt = \int_0^\infty e^{-(s-k)t} f(t)\,dt.$$
But $\int_0^\infty e^{-st} f(t)\,dt = F(s)$, which is supposed to be known, and here we have s − k in
place of s. Therefore
$$\mathcal{L}\{e^{kt} f(t)\} = F(s - k).$$
The shift rule is so called because the transform function F(s) is ‘shifted’ a distance
k along the s axis by the presence of the factor ekt.
There is a rule similar to (24.7) by which we can find the Laplace transform of
t n f(t) when the transform of f(t) is known:
Multiplication by t n, n = 1, 2, …
If $\mathcal{L}\{f(t)\} = F(s)$, and n is a positive integer, then for t > 0
$$\mathcal{L}\{t^n f(t)\} = (-1)^n\,\frac{d^n F(s)}{ds^n}. \tag{24.8}$$
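The simplest way to prove this is to start with the right-hand side. Since
$$\frac{dF(s)}{ds} = \frac{d}{ds}\int_0^\infty e^{-st} f(t)\,dt = \int_0^\infty \frac{d(e^{-st})}{ds}\,f(t)\,dt = \int_0^\infty (-t\,e^{-st})\,f(t)\,dt$$
$$= -\int_0^\infty e^{-st}\,(t f(t))\,dt = -\mathcal{L}\{t f(t)\},$$
the rule holds for n = 1, and repeated differentiation gives the general case.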
then, by (24.8),
$$\mathcal{L}\{t\cos 3t\} = -\frac{d}{ds}\,\frac{s}{s^2+9} = -\frac{9 - s^2}{(s^2+9)^2} = \frac{s^2 - 9}{(s^2+9)^2}.$$
For t > 0,
$$\mathcal{L}\{t\cos kt\} = \frac{s^2 - k^2}{(s^2 + k^2)^2}, \qquad \mathcal{L}\{t\sin kt\} = \frac{2ks}{(s^2 + k^2)^2}. \tag{24.9}$$
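The formulae (24.9) can be spot-checked from (24.8) with n = 1 by differentiating the transforms of cos kt and sin kt numerically. The sketch below uses a central difference with an assumed step `h`; the function names are illustrative, and $\mathcal{L}\{\sin kt\} = k/(s^2+k^2)$ is the standard result.

```python
def F_cos(s, k):
    """Transform of cos(kt): s / (s^2 + k^2)."""
    return s / (s * s + k * k)

def F_sin(s, k):
    """Transform of sin(kt): k / (s^2 + k^2)."""
    return k / (s * s + k * k)

def minus_derivative(F, s, k, h=1e-5):
    """Central-difference estimate of -dF/ds, i.e. rule (24.8) with n = 1."""
    return -(F(s + h, k) - F(s - h, k)) / (2.0 * h)

def t_cos_formula(s, k):
    return (s * s - k * k) / (s * s + k * k) ** 2

def t_sin_formula(s, k):
    return 2.0 * k * s / (s * s + k * k) ** 2

if __name__ == "__main__":
    s, k = 2.0, 3.0
    print(minus_derivative(F_cos, s, k), t_cos_formula(s, k))
    print(minus_derivative(F_sin, s, k), t_sin_formula(s, k))
```

Each printed pair should agree to several decimal places; the small residual is the truncation error of the difference quotient.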
Example 24.7 Find $\mathcal{L}\{t^3 e^{-3t}\}$ (a) by using the shift rule, (b) by using (24.8),
(c) by working directly from the definition of the Laplace transform.
(a) From (24.2),
$$\mathcal{L}\{t^3\} = \frac{6}{s^4}.$$
Therefore, using the shift rule (24.7a) with k = −3,
$$\mathcal{L}\{e^{-3t}\,t^3\} = \frac{6}{(s+3)^4}.$$
(b) From (24.6) with k = −3
$$\mathcal{L}\{e^{-3t}\} = \frac{1}{s+3}.$$
Self-test 24.2
Find $\mathcal{L}\{f(t)\}$ where f(t) for t > 0 is given by (a) $e^{t} t^3$; (b) $e^{-t/2}(H(t) - H(t - 1))$
and H is the unit function (1.13).
Self-test 24.3
Given that $\mathcal{L}\{1\} = 1/s$, show that (24.8), used repeatedly, yields the transforms
of successive powers of t. Obtain the transform of $t^3 e^{-kt}$ by applying the
rule (24.8), given that $\mathcal{L}\{e^{-kt}\} = 1/(s + k)$.
$$t^n \ (n = 0, 1, \dots) \;\leftrightarrow\; \frac{n!}{s^{n+1}} \tag{24.10}$$
A much fuller table which also includes the various rules can be found in
Appendix F. Remember that everything we do with Laplace transforms refers to
t > 0 only: the defining integral (24.1) calls only on values of t > 0.
Partial fractions are often useful for inverting transforms as follows:
Example 24.8 Given the transform 1/[s(s + 1)], find the inverse transform.
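In partial fractions,
$$\frac{1}{s(s+1)} = \frac{1}{s} - \frac{1}{s+1}.$$
From the table above,
$$\frac{1}{s} \leftrightarrow 1 \quad \text{and} \quad \frac{1}{s+1} \leftrightarrow e^{-t},$$
so that
$$\frac{1}{s(s+1)} \leftrightarrow 1 - e^{-t}.$$
From (24.2),
$$\frac{1}{s} \leftrightarrow 1.$$
From the table (24.10),
$$\frac{s}{s^2+4} \leftrightarrow \cos 2t, \qquad \frac{1}{s^2+4} \leftrightarrow \tfrac{1}{2}\sin 2t.$$
Therefore
$$\frac{s+1}{s(s^2+4)} \leftrightarrow \tfrac{1}{4} - \tfrac{1}{4}\cos 2t + \tfrac{1}{2}\sin 2t.$$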
The quadratic denominator does not have real factors, so partial fractions are not
available. Instead we complete the square:
s2 + 2s + 2 = (s + 1)2 − 1 + 2 = (s + 1)2 + 1.
We aim to write the whole expression in terms of s + 1 so that we can apply the shift
rule (24.7). So put also
3s + 2 = 3(s + 1) − 3 + 2 = 3(s + 1) − 1,
and the transform becomes
$$\frac{3(s+1) - 1}{(s+1)^2 + 1}.$$
If we had s instead of s + 1, we could invert the transform:
$$\frac{3s - 1}{s^2+1} = \frac{3s}{s^2+1} - \frac{1}{s^2+1} \leftrightarrow 3\cos t - \sin t.$$
Therefore, by the shift rule with k = −1,
$$\frac{3(s+1) - 1}{(s+1)^2 + 1} \leftrightarrow e^{-t}(3\cos t - \sin t).$$
Self-test 24.4
Show that
$$\frac{1}{(s-1)(s^2+1)} \leftrightarrow \tfrac{1}{2}\,(e^{t} - \cos t - \sin t)$$
by using partial fractions.
Suppose that L{f(t)} = F(s). Then the Laplace transforms of df(t)/dt, d2f(t)/dt2, …
can be expressed in terms of F(s).
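$$\mathcal{L}\left\{\frac{df(t)}{dt}\right\} = \int_0^\infty e^{-st}\,\frac{df(t)}{dt}\,dt.$$
Integrate the right-hand side by parts. Using the notation of Section 17.7, put
$$u = e^{-st}, \qquad \frac{dv}{dt} = \frac{df(t)}{dt},$$
so that
$$\frac{du}{dt} = -s\,e^{-st}, \qquad v = f(t).$$
Then
$$\begin{aligned} \mathcal{L}\left\{\frac{df(t)}{dt}\right\} &= \int_0^\infty e^{-st}\,\frac{df(t)}{dt}\,dt \\ &= \left[e^{-st} f(t)\right]_0^\infty - \int_0^\infty (-s\,e^{-st})\,f(t)\,dt \\ &= 0 - e^{0} f(0) + s\int_0^\infty e^{-st} f(t)\,dt \\ &= -f(0) + s\,\mathcal{L}\{f(t)\}. \end{aligned}$$
$$\mathcal{L}\left\{\frac{df(t)}{dt}\right\} = sF(s) - f(0). \tag{24.11}$$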
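$$\begin{aligned} \mathcal{L}\left\{\frac{d^2x}{dt^2} + 2\frac{dx}{dt} + 3x\right\} &= \mathcal{L}\left\{\frac{d^2x}{dt^2}\right\} + 2\,\mathcal{L}\left\{\frac{dx}{dt}\right\} + 3\,\mathcal{L}\{x\} \\ &= s^2X - s\,x(0) - x'(0) + 2[sX - x(0)] + 3X \\ &= s^2X - 4s - 5 + 2(sX - 4) + 3X \\ &= (s^2 + 2s + 3)X - 4s - 13. \end{aligned}$$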
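The bookkeeping in this calculation follows one pattern for any expression $a\,x'' + b\,x' + c\,x$ with constant coefficients: the transform is $(as^2 + bs + c)X$ minus a first-degree polynomial fixed by the initial values. A small sketch of that pattern follows; the function name and the tuple layout of the result are choices made here for illustration.

```python
def transform_second_order(a, b, c, x0, v0):
    """Transform of a x'' + b x' + c x, given x(0) = x0 and x'(0) = v0.

    Uses L{x'} = sX - x(0) and L{x''} = s^2 X - s x(0) - x'(0).
    Returns (coefficients of X as (s^2, s, 1), remaining terms as (s, 1)).
    """
    x_coeffs = (a, b, c)
    # a(-s x0 - v0) + b(-x0) = (-a x0) s + (-(a v0 + b x0))
    remainder = (-a * x0, -(a * v0 + b * x0))
    return x_coeffs, remainder

if __name__ == "__main__":
    # x'' + 2x' + 3x with x(0) = 4, x'(0) = 5, the values used above
    print(transform_second_order(1, 2, 3, 4, 5))
```

For the worked case the result reads $(s^2 + 2s + 3)X - 4s - 13$, as obtained above.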
Self-test 24.5
Obtain the Laplace transform of the expression
$$\frac{d^2x}{dt^2} - 2\frac{dx}{dt} - 6x,$$
where x(0) = 2 and x′(0) = −1.
The results (24.12) enable initial-value problems for linear differential equations
having constant coefficients to be solved.
$$\frac{dx}{dt} + 2x = e^{-t}$$
It can be seen that the terms involving f(0), f ′(0), … in (24.12), far from being
merely a nuisance, are exactly what is required to translate a differential equation
together with initial conditions into a simpler problem in ordinary algebra. We
do not have to match up arbitrary constants with the initial conditions; these
conditions are built into the transformed equations.
In many physical situations, we want to know what happens when an inactive
or quiescent system is ‘switched on’. In such cases, we have zero initial conditions
at some time $t_0 \ge 0$. For a system described by a second-order differential equation,
the variable and its first derivative are initially set to zero.
$$= \frac{1}{4}\,\frac{1}{s} - \frac{1}{4}\left(\frac{s+1}{(s+1)^2 + 3} + \frac{1}{(s+1)^2 + 3}\right).$$
$$\frac{dx}{dt} = x - y, \qquad \frac{dy}{dt} = x + y,$$
Self-test 24.6
Solve the equation
$$\frac{d^2x}{dt^2} - 3\frac{dx}{dt} + 2x = e^{3t}$$
with initial conditions x(0) = 1, x′(0) = 2.
Fig. 24.1 (a) x = H(t), (b) x = H(t − c), (c) x = H(t − d) − H(t − c),
(d) x = t[H(t) − H(t − 1)], (e) $x = e^{t}[H(t - 1) - H(t - 2)]$.
It is shown again in Fig. 24.1a. Figures 24.1b–e show how it can be used to
describe various step functions and switching functions.
For example, the composition of the three segments of Fig. 24.1e is specified by:
$$e^{t}[H(t - 1) - H(t - 2)] = \begin{cases} e^{t}(0 - 0) = 0 & \text{if } t < 1, \\ e^{t}(1 - 0) = e^{t} & \text{if } 1 < t < 2, \\ e^{t}(1 - 1) = 0 & \text{if } t > 2. \end{cases}$$
The various combination rules such as the shift rule (24.7) work for H(t) in the
same way as for smooth functions f(t), as is shown in the following examples.
$$\mathcal{L}\{f(t)\} = \int_0^\infty e^{-st}\,e^{t}[H(t - 1) - H(t - 2)]\,dt$$
$$= \int_1^2 e^{-(s-1)t}\,dt = -\frac{1}{s-1}\left[e^{-(s-1)t}\right]_1^2 = -\frac{1}{s-1}\left(e^{-2(s-1)} - e^{-(s-1)}\right).$$
Alternatively, we could use the shift rule (24.7), though it has no particular advantage.
Example 24.17 Find the Laplace transform of the square wave function shown
in Fig. 24.2.
Fig. 24.2 (square wave plotted against t, with t marked at 1, 2, 3, 4)
By considering the segments one at a time and using Fig. 24.1c, we have
x(t) = [H(t) − H(t − 1)] − [H(t − 1) − H(t − 2)] + [H(t − 2) − H(t − 3)] − ··· ,
= H(t) − 2H(t − 1) + 2H(t − 2) − 2H(t − 3) + ··· .
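From (24.14)
$$H(t - n) \leftrightarrow \frac{e^{-ns}}{s}.$$
Therefore
$$\mathcal{L}\{x(t)\} = \frac{1}{s} - \frac{2}{s}\left(e^{-s} - e^{-2s} + e^{-3s} - \cdots\right).$$
The brackets contain an infinite geometric series with first term $e^{-s}$ and common
ratio $-e^{-s}$ (see (1.37)). Therefore
$$\mathcal{L}\{x(t)\} = \frac{1}{s} - \frac{2}{s}\,\frac{e^{-s}}{1 + e^{-s}} = \frac{1 - e^{-s}}{s(1 + e^{-s})}.$$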
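As a quick numerical check of the summation, the partial sums of the series can be compared with the closed form. This is a sketch with illustrative function names; the number of terms retained is an arbitrary choice.

```python
import math

def square_wave_series(s, n_terms):
    """Partial sum of 1/s - (2/s)(e^{-s} - e^{-2s} + e^{-3s} - ...)."""
    total = 1.0 / s
    for n in range(1, n_terms + 1):
        # The n-th bracketed term is (-1)^(n+1) e^{-ns}; the prefactor is -2/s.
        total += (-1) ** n * 2.0 * math.exp(-n * s) / s
    return total

def square_wave_closed(s):
    """Closed form (1 - e^{-s}) / (s (1 + e^{-s}))."""
    return (1.0 - math.exp(-s)) / (s * (1.0 + math.exp(-s)))

if __name__ == "__main__":
    for s in (0.5, 1.0, 3.0):
        print(s, square_wave_series(s, 200), square_wave_closed(s))
```

The series converges for every s > 0, since the common ratio $-e^{-s}$ then has modulus less than 1, but more terms are needed as s approaches zero.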
Suppose that we have a function g(t) which has a meaning for all positive t,
such as $g(t) = e^{-t}$. Its Laplace transform is $G(s) = \int_0^\infty e^{-st} g(t)\,dt$. All values of g(t) for
t positive are called on to contribute to this integral, but none of its values for
negative t are called upon (Fig. 24.3a).
Now translate the function a distance c (positive) to the right as in Fig. 24.3b.
The new graph represents g(t − c)H(t − c). It brings with it a section NA which
originally corresponded to negative values of t. We cannot expect that the
Laplace transform of this new function g(t − c) can be expressed in terms of G(s),
because none of these t values played any part in the calculation of G(s).
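Fig. 24.3 (a) Graph of g(t)H(t). (b) Graph of g(t − c)H(t − c).
Therefore we cut out the section NA and consider not g(t − c), but g(t − c)H(t − c),
which is shaded in Fig. 24.3b. It is congruent to the shaded part of Fig. 24.3a.
Then
$$\mathcal{L}\{g(t - c)H(t - c)\} = \int_0^\infty e^{-st}\,g(t - c)H(t - c)\,dt = \int_c^\infty e^{-st}\,g(t - c)\,dt.$$
Putting u = t − c, this becomes
$$\int_0^\infty e^{-s(u + c)}\,g(u)\,du = e^{-cs}\int_0^\infty e^{-su}\,g(u)\,du = e^{-cs}\,G(s).$$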
This is the second shift rule, or the delay rule, so called because g(t − c)H(t − c)
does not start until t = c.
Delay rule
If $G(s) \leftrightarrow g(t)$ and c > 0, then
$$e^{-cs}\,G(s) \leftrightarrow g(t - c)H(t - c). \tag{24.15}$$
$$\frac{dx}{dt} + 2x = f(t)$$
Fig. 24.4 (graph of y = f(t))
$$= \frac{1}{s+1}\left(e^{-(s+1)} - e^{-2(s+1)}\right) = F(s),$$
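say. The transformed equation is then
$$sX + 2X = F(s), \quad \text{or} \quad X = F(s)/(s + 2).$$
Therefore
$$\begin{aligned} X &= \frac{1}{(s+1)(s+2)}\left(e^{-(s+1)} - e^{-2(s+1)}\right) = \left(e^{-(s+1)} - e^{-2(s+1)}\right)\left(\frac{1}{s+1} - \frac{1}{s+2}\right) \\ &= e^{-1}e^{-s}\left(\frac{1}{s+1} - \frac{1}{s+2}\right) - e^{-2}e^{-2s}\left(\frac{1}{s+1} - \frac{1}{s+2}\right). \end{aligned}$$
Apply the delay rule with c = 1 and c = 2, noting that
$$\frac{1}{s+1} - \frac{1}{s+2} \leftrightarrow e^{-t} - e^{-2t}.$$
We obtain
$$\begin{aligned} x(t) &= e^{-1}\left(e^{-(t-1)} - e^{-2(t-1)}\right)H(t - 1) - e^{-2}\left(e^{-(t-2)} - e^{-2(t-2)}\right)H(t - 2) \\ &= \left(e^{-t} - e^{1-2t}\right)H(t - 1) - \left(e^{-t} - e^{2-2t}\right)H(t - 2). \end{aligned}$$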
Both terms are zero before ‘switch-on’ at t = 1. Between t = 1 and 2, only the first
term contributes. For t > 2 both terms are present, the second causing ‘switching off’.
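The solution can be tested directly against the differential equation. The sketch below assumes that the forcing is $f(t) = e^{-t}[H(t-1) - H(t-2)]$, which is the function whose transform equals the F(s) quoted above; that identification, the value taken for H at zero, and the finite-difference step are all assumptions of the illustration.

```python
import math

def H(t):
    """Unit function; the value at t = 0 is a convention and is not used below."""
    return 1.0 if t >= 0.0 else 0.0

def x(t):
    """Solution obtained from the delay rule."""
    return ((math.exp(-t) - math.exp(1.0 - 2.0 * t)) * H(t - 1.0)
            - (math.exp(-t) - math.exp(2.0 - 2.0 * t)) * H(t - 2.0))

def f(t):
    """Assumed forcing: e^{-t} switched on at t = 1 and off at t = 2."""
    return math.exp(-t) * (H(t - 1.0) - H(t - 2.0))

def residual(t, h=1e-5):
    """dx/dt + 2x - f(t), with dx/dt estimated by a central difference."""
    dxdt = (x(t + h) - x(t - h)) / (2.0 * h)
    return dxdt + 2.0 * x(t) - f(t)

if __name__ == "__main__":
    for t in (0.5, 1.5, 2.5, 4.0):
        print(t, x(t), residual(t))
```

The residual should be negligible at points away from the switching times t = 1 and t = 2, where x(t) is continuous but its derivative jumps.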
Self-test 24.7
Use the delay rule to find the inverse of the Laplace transform
$$\frac{e^{-s}}{s} - \frac{e^{-2s}}{2s^3}.$$
Sketch the inverse function for t > 0.
Assume that L{f(t)} = F(s). Let g(t) = f(t)/t (assuming that f(t)/t can be defined at
t = 0). Take the Laplace transform of both sides of f(t) = tg(t):
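$$\mathcal{L}\{f(t)\} = \mathcal{L}\{t\,g(t)\} = -\frac{d}{ds}\,\mathcal{L}\{g(t)\} = -G'(s) \quad \text{(say, by (24.8))}.$$
Hence F(s) = −G′(s). This separable equation has a solution which can be
expressed in the form
$$\mathcal{L}\left\{\frac{f(t)}{t}\right\} = \int_s^\infty F(u)\,du. \tag{24.16}$$
$$\mathcal{L}\left\{\frac{\sin t}{t}\right\} = \int_s^\infty \frac{du}{u^2 + 1} = \left[\arctan u\right]_s^\infty = \tfrac{1}{2}\pi - \arctan s = \arctan(1/s).$$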
The division rule has to be applied with care. For example, the function (cos t)/t will
not have a transform, since (cos t)/t → ∞ as t → 0, unlike (sin t)/t, which has the limit 1
as t → 0.
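The result $\mathcal{L}\{(\sin t)/t\} = \arctan(1/s)$ can be confirmed by integrating the definition numerically. The sketch below truncates the range at an assumed `t_max` and uses Simpson's rule (both choices made here), taking the value 1 for (sin t)/t at t = 0.

```python
import math

def laplace_sin_over_t(s, t_max=40.0, n=8000):
    """Simpson's-rule estimate of the transform of sin(t)/t (n must be even)."""
    def integrand(t):
        ratio = 1.0 if t == 0.0 else math.sin(t) / t  # limiting value 1 at t = 0
        return math.exp(-s * t) * ratio

    h = t_max / n
    total = integrand(0.0) + integrand(t_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0

if __name__ == "__main__":
    for s in (1.0, 2.0):
        print(s, laplace_sin_over_t(s), math.atan(1.0 / s))
```

The same approach fails for (cos t)/t, whose integrand is unbounded at t = 0, which is the difficulty described above.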
Self-test 24.8
Find the Laplace transform of $(e^{-t} - 1)/t$.
PROBLEMS
The dot notation, $\dot{x} = dx/dt$, $\ddot{x} = d^2x/dt^2$, etc., is used in some of the questions.

24.1 Write down $\mathcal{L}\{x(t)\}$, where x(t) is as follows.
(a) $e^{t}$; (b) $4e^{-t}$; (c) $3e^{t} - e^{-t}$;
(d) $3t^2 - 1$; (e) $\tfrac{1}{2}t^3 + 2t^2 - 3$; (f) $3 + 2t^4$;
(g) 3 sin t − cos t; (h) 2(cos t − sin t);
(i) $1 + \dfrac{1}{1!}t + \dfrac{1}{2!}t^2 + \cdots + \dfrac{1}{n!}t^n$ (you get a geometric series; see Section 1.16).

24.2 (Scale rule). Find $\mathcal{L}\{x(t)\}$ for the following cases of x(t).
(a) $e^{3t}$; (b) $1 - 2e^{-2t}$;
(c) sin ωt; (d) cos ωt;
(e) 3 cos 2t − 2 sin 2t;
(f) $\cos^2 t$ (express it in terms of cos 2t);
(g) $\sin^2 t$ (see (f)).

24.3 (See Section 24.3.) Find $\mathcal{L}\{x(t)\}$ in the following cases of x(t).
(a) $t^2 e^{t}$ (easiest to start with $t^2$);
(b) $t e^{-2t}$; (c) $t^2 e^{-t}$;
(d) $e^{2t}\cos t$; (e) $e^{-t}\sin t$;
(f) $e^{t}\sin 3t$; (g) $e^{-2t}\sin 3t$;
(h) $e^{-3t}\cos 2t$; (i) t cos 3t;
(j) t sin 3t; (k) $t^2 \sin t$;
(l) $t^4 e^{-t}$ (compare the three methods: (i) start with $t^4$ and use the shift rule, (ii) start with $e^{-t}$
and use (24.8), (iii) work directly from the definition (24.1)).

24.4 Obtain the Laplace transform for t sin kt by differentiating that of cos kt with respect to k.

24.5 Invert the following Laplace transforms.
(a) $1/s^2$; (b) $1/s$;
(c) $3/(2s)$; (d) $3/s^5$;
(e) $1/(s - 3)$; (f) $1/(s + 4)$;
(g) $3/(2s - 1)$; (h) $2/(2 - 3s)$;
(i) $1/[s(s - 1)]$; (j) $1/(s^2 + s - 1)$;
(k) $s/(s^2 - 1)$; (l) $(2s - 1)/(s^2 - 1)$;
(m) $s/(s^2 + 1)$; (n) $1/(s^2 + 4)$;
(o) $(2s - 1)/(s^2 + 4)$; (p) $(2s - 1)/[s(s - 1)]$;
(q) $(s^2 - 1)/[s(s - 1)(s + 2)(s + 3)]$;
(r) $s/[(s - 1)(s^2 + 1)]$; (s) $1/(s - 1)^3$;
(t) $(2s + 1)/(s^2 - 2s + 2)$; (u) $s/[(s^2 + 1)(s^2 + 4)]$.

24.6 Find the Laplace transform of the following expressions involving x(t), where $\mathcal{L}\{x(t)\} = X(s)$.
(a) $\dot{x}(t)$, where x(0) = 6;
(b) $\dot{x}(t)$, where x(0) = 0;
(c) $\ddot{x}(t)$, where x(0) = 3, $\dot{x}(0) = 5$;
(d) $\ddot{x}(t)$, where x(0) = 0, $\dot{x}(0) = 0$;
(e) $2\ddot{x} + 3\dot{x} - 2x$, where x(0) = 5, $\dot{x}(0) = -2$;
(f) $3\ddot{x} - 5\dot{x} + x - 1$, where x(0) = 0, $\dot{x}(0) = 0$.

24.7 Use the Laplace transform to solve the following initial-value problems.
(a) $\ddot{x} + 3\dot{x} + 2x = 0$, x(0) = 0, $\dot{x}(0) = 1$;
(b) $\ddot{x} + \dot{x} - 2x = 0$, x(0) = 3, $\dot{x}(0) = 0$;
(c) $\ddot{x} + 4\dot{x} = 0$, $x(0) = x_0$, $\dot{x}(0) = y_0$;
(d) $\ddot{x} + \omega^2 x = 0$, x(0) = c, $\dot{x}(0) = 0$;
(e) $\ddot{x} + 2\dot{x} + 5x = 0$, x(0) = 3, $\dot{x}(0) = -3$;
(f) $d^4y/dx^4 - y = 0$, y(0) = 1, y′(0) = 0, y″(0) = 0, y‴(0) = 0 (use x instead of t as the variable in
the Laplace transform).

24.8 Use the Laplace transform to solve the following initial-value problems.
(a) $\ddot{x} = 1 + t + e^{t}$, x(0) = 0, $\dot{x}(0) = 0$;
(b) $\ddot{x} + x = 3$, x(0) = 0, $\dot{x}(0) = 1$;
(c) $\ddot{x} + 2\dot{x} + 2x = 3$, x(0) = 1, $\dot{x}(0) = 0$;
(d) $\ddot{x} - x = e^{2t}$, x(0) = 0, $\dot{x}(0) = 1$;
(e) $\ddot{x} - x = t e^{t}$, x(0) = 1, $\dot{x}(0) = 1$;
(f) $\ddot{x} - 4x = 1 - e^{2t}$, x(0) = 1, $\dot{x}(0) = -1$;
(g) $\ddot{x} - 4x = e^{2t} + e^{-2t}$, x(0) = 0, $\dot{x}(0) = 0$;
(h) $\ddot{x} + \omega^2 x = C\cos\omega t$, $x(0) = x_0$, $\dot{x}(0) = y_0$;
(i) $\dddot{x} - 2\ddot{x} - \dot{x} + 2x = e^{-2t}$, x(0) = 0, $\dot{x}(0) = 0$, $\ddot{x}(0) = 2$ (look out for factors in the
denominator of X(s)).

24.9 Solve the following simultaneous first-order differential equations, for the given initial values.
(a) $\dot{x} = x - y$, $\dot{y} = x + y$, x(0) = 1, y(0) = 0;
(b) $\dot{x} = 2x + 4y + e^{4t}$, $\dot{y} = x + 2y$, x(0) = 1, y(0) = 0;
(c) $\dot{x} = x - 4y$, $\dot{y} = x + 2y$, x(0) = 2, y(0) = 1.

24.10 Find the general solution of the following by putting x(0) = A, $\dot{x}(0) = B$, where A and B are arbitrary.
(a) $\ddot{x} + x = e^{t}$; (b) $\ddot{x} - x = 3$; (c) $\ddot{x} - 2\dot{x} + x = e^{t}$.

24.11 Find the general solution of $d^4y/dx^4 - y = e^{x}$, by putting y(0) = A, y′(0) = B, y″(0) = C, y‴(0) = D,
where A, B, C, D are arbitrary. (Let the variable in the Laplace transform (24.1) be x instead of t.)

24.12 This is a system of first-order equations for $x_0(t), x_1(t), \dots, x_n(t)$:
$$\dot{x}_0 = -\beta x_0, \qquad \dot{x}_r = \beta(x_{r-1} - x_r), \qquad x_0(0) = 1, \quad x_r(0) = 0$$
for r = 1, 2, … , n. Solve them by using the Laplace transform, showing that
$$x_r = \frac{1}{r!}\,(\beta t)^r e^{-\beta t}.$$

24.13 Use the delay rule (24.15) to obtain the Laplace transform of $e^{-t}(t - 2)\cos(t - 2)H(t - 2)$.
24.14 Find the functions which give rise to the following Laplace transforms:
(b) $\ddot{x} - 4x = f(t)$, where f(t) = 1 for 0 < t < 1,
Most of the applications described in this chapter are drawn from electronics,
using terms such as signal, input, output, impulse, feedback, and so on. Such
terminology is also adopted in describing analogous behaviour of systems of all
sorts, from mechanics to biology. For linear systems, central mathematical concepts
are the delta or impulse function (Section 25.2), convolution (Section 25.5),
and the treatment of discrete systems (Section 25.8). The z-transform is closely
related to the Laplace transform, and simplifies the algebra of discrete systems
to some extent.
Notice that as in Chapter 21 on phasors, in Sections 25.3 and 25.4 on impedance
and transfer functions we consider only situations in which any transients
(exponentially decreasing terms in the solutions, which arise from the initial conditions)
have already died away. The currents, voltages, etc. are then sine/cosine
oscillations all having the prevailing frequency but with various amplitudes and
phases.
LAPLACE AND Z TRANSFORMS: APPLICATIONS
Division rule
If $G(s) \leftrightarrow g(t)$, then
$$\frac{1}{s}\,G(s) \leftrightarrow \int_0^t g(\tau)\,d\tau. \tag{25.1}$$
To prove this, put (1/s)G(s) = F(s); then we must express f(t) in terms of g(t).
Rewrite the relation between F(s) and G(s) in the form
sF(s) = G(s).
From (24.11) we know that, in general,
$$\frac{df}{dt} \leftrightarrow sF(s), \quad \text{provided that } f(0) = 0;$$
so then we have df/dt ↔ G(s). This is equivalent to the initial-value problem
df/dt = g(t), with f(0) = 0. By integration we obtain
$$f(t) = \int_0^t g(\tau)\,d\tau.$$
Example 25.1 Find f(t) when F(s) = 1/[s(s2 + 1)], (a) by using partial fractions,
(b) by using (25.1).
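(a) $$\frac{1}{s(s^2+1)} = \frac{1}{s} - \frac{s}{s^2+1} \leftrightarrow 1 - \cos t.$$
(b) In the notation of (25.1), put
$$G(s) = \frac{1}{s^2+1} \leftrightarrow \sin t.$$
Therefore
$$F(s) = \frac{1}{s(s^2+1)} = \frac{1}{s}\,\frac{1}{s^2+1} \leftrightarrow \int_0^t \sin\tau\,d\tau = 1 - \cos t.$$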
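$$v(t) = \frac{1}{C}\int_0^t i(\tau)\,d\tau.$$
Fig. 25.1 (capacitor C carrying current i(t))
Therefore, according to (25.1), the relation between the Laplace transforms of
v(t) and i(t) is
$$V(s) = \frac{1}{Cs}\,I(s). \tag{25.2}$$
$$v(t) = \frac{1}{C}\left(\int_0^t i(\tau)\,d\tau + q_0\right).$$
Fig. 25.2 (circuit with source $v(t) = v_0\cos\omega t$, capacitor C, and current i(t))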
Such an equation is called an integral equation for i(t). The Laplace transform of the
equation is
$$\frac{v_0 s}{s^2 + \omega^2} = RI(s) + \frac{1}{Cs}\,I(s),$$
so
$$I(s) = \frac{v_0}{R}\,\frac{s^2}{(s + 1/(RC))(s^2 + \omega^2)} = \frac{v_0}{R}\,\frac{1}{1 + (RC\omega)^2}\left(\frac{(RC\omega)^2 s}{s^2 + \omega^2} - \frac{RC\omega^2}{s^2 + \omega^2} + \frac{1}{s + 1/(RC)}\right)$$
after splitting into partial fractions. Therefore, for t > 0,
$$i(t) = \frac{v_0}{R}\,\frac{1}{1 + (RC\omega)^2}\left[(RC\omega)^2\cos\omega t - RC\omega\sin\omega t + e^{-t/RC}\right].$$
The first two terms represent a steady forced oscillation and the final term is a transient.
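The partial-fraction step is easy to get wrong by hand, so a numerical comparison of the two forms of I(s) is a useful safeguard. In the sketch below the function names and the sample values of R, C, ω and $v_0$ are arbitrary choices made for illustration.

```python
import math

def I_compact(s, R, C, w, v0):
    """I(s) = (v0/R) s^2 / ((s + 1/(RC)) (s^2 + w^2))."""
    return (v0 / R) * s * s / ((s + 1.0 / (R * C)) * (s * s + w * w))

def I_partial_fractions(s, R, C, w, v0):
    """The same transform after splitting into partial fractions."""
    a = R * C * w
    bracket = (a * a * s / (s * s + w * w)
               - R * C * w * w / (s * s + w * w)
               + 1.0 / (s + 1.0 / (R * C)))
    return (v0 / R) / (1.0 + a * a) * bracket

def current(t, R, C, w, v0):
    """i(t) for t > 0, from inverting the three terms separately."""
    a = R * C * w
    return (v0 / R) / (1.0 + a * a) * (
        a * a * math.cos(w * t) - a * math.sin(w * t) + math.exp(-t / (R * C)))

if __name__ == "__main__":
    R, C, w, v0 = 2.0, 0.5, 3.0, 1.5  # sample values only
    for s in (0.3, 1.0, 4.0):
        print(I_compact(s, R, C, w, v0), I_partial_fractions(s, R, C, w, v0))
    print(current(0.0, R, C, w, v0), v0 / R)
```

The last line compares the limiting value of i(t) as t → 0 with $v_0/R$, which is what the formula gives when the cosine and exponential terms are combined at t = 0.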
Figure 25.3 shows the graph of a function which is zero everywhere except for a
tall, narrow rectangle with width ε and height 1/ε, so that the area under the
graph is equal to 1. Imagine that ε is a very small number, as small as we wish.
This very tall and very narrow picture is a simplified version of the impulse function
or delta function, usually denoted by δ(t). It is used in problems involving sudden
and brief events, to represent (say) impulsive force between two bodies in collision;
voltage from a lightning strike; or, if the variable is position rather than time, a
point force.
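Fig. 25.3 (graph of x = δ(t): a rectangle of width ε and height 1/ε, starting at t = 0)
Fig. 25.4 (graphs of x = δ(t − c), a rectangle of height 1/ε between c and c + ε, and of x = f(t), for t between a and b)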
where c lies between a and b. The integrand is zero except between c and c + ε ;
over this very narrow interval, f(t) hardly changes from the value f(c). Therefore
(as closely as we wish)
$$\int_a^b f(t)\,\delta(t - c)\,dt \approx \int_c^{c+\varepsilon} f(c)\,\varepsilon^{-1}\,dt = \frac{f(c)}{\varepsilon}\int_c^{c+\varepsilon} dt = f(c).$$
If c does not lie between a and b, then the integral is zero. The delta function is
sometimes called a sifting function because of this property.
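The sifting property can be illustrated with the rectangular model of the delta function: integrate f(t) against a rectangle of width ε and height 1/ε placed at c, and watch the result approach f(c) as ε shrinks. The function name and the use of a midpoint rule are choices made for this sketch.

```python
import math

def sift(f, c, eps, n=1000):
    """Integral of f(t) * (1/eps) over [c, c + eps], by the midpoint rule.

    This models the integral of f(t) delta(t - c) with a rectangle of width eps.
    """
    h = eps / n
    total = sum(f(c + (k + 0.5) * h) for k in range(n))
    return total * h / eps

if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, sift(math.cos, 0.3, eps), math.cos(0.3))
```

The printed values are averages of f over [c, c + ε], so the discrepancy from f(c) shrinks roughly in proportion to ε.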
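$$\int_a^b f(t)\,\delta(t - c)\,dt = \begin{cases} f(c) & \text{if } a < c < b, \\ 0 & \text{otherwise.} \end{cases}$$
$$\mathcal{L}\{\delta(t - c)\} = \int_0^\infty e^{-st}\,\delta(t - c)\,dt = e^{-cs},$$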
Example 25.4 Find the current resulting from an impulsive voltage Iv δ(t)
applied to the circuit of Fig. 25.5, the current being zero before application of
the voltage. (The physical dimensions of Iv are [emf × time]: see Appendix I.)
The equation for the current is $L\,di/dt + Ri = I_v\,\delta(t)$. After transformation, with i(0) = 0,
it becomes
$$LsI(s) + RI(s) = I_v.$$
Therefore
$$I(s) = \frac{I_v}{L(s + R/L)} \leftrightarrow i(t) = \frac{I_v}{L}\,e^{-Rt/L}.$$
Fig. 25.5 (circuit with inductance L, current i(t), and applied voltage $v(t) = I_v\,\delta(t)$)
The great, though brief, applied voltage gives only a finite current because of the
counter-emf generated by the coil.
The delta function can be regarded formally as the derivative of the unit function
H(t). As in Fig. 25.6a, smooth out the transition of H(t), from zero to one,
as t passes through the origin, by means of a sloping straight line segment. The
derivative of this function is equal to zero outside the transition interval (0, ε) and
to $\varepsilon^{-1}$ inside it (Fig. 25.6b).
Fig. 25.6 (a) The smoothed unit function, with slope $\varepsilon^{-1}$ over (0, ε). (b) Its derivative, of height 1/ε over (0, ε).
$$s\left(\frac{1}{s}\right) - H(0) = 1,$$
The right-hand side approaches 1/s as ε → 0 (use l’Hôpital’s rule of Section 5.8).
Self-test 25.1
An impulsive voltage Kδ(t) is applied at t = 0 to a circuit consisting of an
inductance L and a capacitor C in series. Obtain the current i(t), assuming
the circuit is initially quiescent.
                  Resistor         Inductor             Capacitor
Time domain:      v(t) = Ri(t)     v(t) = L di(t)/dt    v(t) = (1/C) ∫_0^t i(τ) dτ
s domain:         V(s) = RI(s)     V(s) = LsI(s)        V(s) = I(s)/(Cs)
Impedance Z(s):   R                Ls                   1/(Cs)
(25.8)
Table (25.8) should be compared with the table (21.5) for the case of steady
forced oscillations of frequency ω /(2π). The impedances Z(s) in the s plane are
analogous to the complex impedances R, iω L, and 1/(iω C) of (21.6) for the
steady case. One can pass from one to the other by substituting iω for s, or −is for
ω. However, the s forms allow arbitrary inputs to the circuit to be considered.
Impedances combine in series and parallel in the same way as do complex
impedances (see (21.7)) in the frequency domain, but it is to be remembered that
they refer to zero initial conditions only.
Impedances in series
$$Z = Z_1 + Z_2 + \cdots.$$
Impedances in parallel
$$\frac{1}{Z} = \frac{1}{Z_1} + \frac{1}{Z_2} + \cdots. \tag{25.9}$$
Example 25.5 The circuit shown in Fig. 25.7a is initially quiescent, with zero
charge on the capacitor. The constant voltage v0 is switched on at t = 1 and off
at t = 2. Find the current i(t).
Fig. 25.7 (a) Circuit with R = 3, C = 1/12, L = 4, and applied voltage $v_0$. (b) The s-domain form, with
$Z_1 = R/(1 + RCs)$, $Z_2 = Ls$, and applied voltage V(s).
The corresponding s-domain impedances are shown in Fig. 25.7b, in which the elements
R and C are grouped. They are in parallel, so (25.8) and (25.9) give
$$\frac{1}{Z_1} = \frac{1}{R} + \frac{1}{(Cs)^{-1}} = \frac{1}{3} + \frac{s}{12} = \frac{s+4}{12}.$$
Hence
$$Z_1 = \frac{12}{s+4},$$
and also $Z_2 = Ls = 4s$. Then Z for the whole circuit is given by
$$Z = 4s + \frac{12}{s+4} = \frac{4(s+1)(s+3)}{s+4}.$$
Therefore
$$I(s) = \frac{s+4}{4(s+1)(s+3)}\,V(s).$$
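The series and parallel rules (25.9) translate directly into two small helper functions, which can then be used to confirm the impedance found above. The helper names are illustrative; the component values are those of the example (R = 3, C = 1/12, L = 4).

```python
def series(*impedances):
    """Z = Z1 + Z2 + ... for impedances in series."""
    return sum(impedances)

def parallel(*impedances):
    """1/Z = 1/Z1 + 1/Z2 + ... for impedances in parallel."""
    return 1.0 / sum(1.0 / z for z in impedances)

R, C, L = 3.0, 1.0 / 12.0, 4.0

def Z_total(s):
    """R and C in parallel, in series with L, evaluated at a given s."""
    return series(L * s, parallel(R, 1.0 / (C * s)))

def Z_closed_form(s):
    """4(s + 1)(s + 3)/(s + 4), as derived in the example."""
    return 4.0 * (s + 1.0) * (s + 3.0) / (s + 4.0)

if __name__ == "__main__":
    for s in (0.5, 2.0, 1.0 + 2.0j):
        print(s, Z_total(s), Z_closed_form(s))
```

Python's complex numbers let the same functions be evaluated at s = iω, which corresponds to the steady-oscillation impedances mentioned earlier.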
Taking into account switch-on at t = 1 and switch-off at t = 2,
$$v(t) = v_0\,[H(t - 1) - H(t - 2)],$$
so
$$V(s) = v_0\left(\frac{1}{s}\,e^{-s} - \frac{1}{s}\,e^{-2s}\right).$$
Fig. 25.8 (network with driving voltage or current f(t), input p(t) across PP′, and output q(t) across QQ′)
The circuit is initially quiescent. Suppose there are N branches, and N voltages
v1(t), v2(t), … , vN(t), with transforms V1(s), V2(s), … , VN(s), to be determined. An
external potential difference (voltage) f(t) with transform F(s) is applied across
any two points in the network. Apply the s-transform version of Kirchhoff’s
equations (eqn 21.8) to obtain N equations sufficient to determine the Vn(s). Each
equation takes one of only two possible forms:
either a1V1 + a2V2 + … + aNVN = 0
or b1V1 + b2V2 + … + bNVN = F,
whose coefficients are functions of s. Therefore the transforms are, after solving
the N linear equations, proportional to F in the form:
Vn(s) = Gn(s)F(s), (n = 1, 2, … , N).
The coefficients Gn(s) depend on the circuit parameters, and are called voltage-to-
voltage transfer functions determining the voltage induced in any branch by the
sudden establishment of the given applied voltage F(s). It is only a single further
step to derive voltage-to-current and current-to-current transforms between
particular pairs of branches. We can put the results in the following form:
Example 25.6 Find the transfer function GPQ(s) from the voltage transform
P(s) across R, regarded as the input, and the voltage transform Q(s) across C,
regarded as the output, in Fig. 25.9a.
Let the current i(t) be as indicated. The impedances of the various groups are shown
in Fig. 25.9b; these are in fact transfer functions from current to voltage for each unit.
In terms of the transforms,
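\[ P(s) = RI(s), \qquad Q(s) = \frac{1}{Cs}\,I(s). \]
Therefore
\[ G_{PQ}(s) = \frac{Q(s)}{P(s)} = \frac{1}{RCs}. \]
Fig. 25.9 (a) The circuit, with elements L, r, R, C, current i(t), and voltages v(t), p(t), q(t). (b) The corresponding s-domain impedances, with I(s), V(s), P(s), Q(s).
Thus
\[ Q(s) = \frac{1}{RC}\,\frac{1}{s}\,P(s), \]
so that, since division by s corresponds to integration from 0 to t,
\[ q(t) = \frac{1}{RC}\int_0^t p(\tau)\,d\tau. \]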
Suppose now that we have a circuit such as the one in Fig. 25.10a, called
Circuit A, where p(t) is the input voltage and q(t) the output voltage. Figure 25.10b
schematizes the arrangement and specifies the transfer function G(s) = Q(s)/P(s)
between p and q.
Fig. 25.10 (a) Circuit A, with resistance RA and inductance LA, input p(t) and output q(t). (b) Circuit A in the s domain. (c) The block representation, with GA = Q/P = LAs/(RA + LAs).
Fig. 25.11 Circuits A and B, with GA = LAs/(RA + LAs) and GB = LBs/(RB + LBs), shown as a chain taking P(s) to Q(s) and then to R(s).
circuits will be changed, and the changes will not compensate each other. In special
circumstances, however, the circuits may behave almost independently, or can be
made to do so by means of technical arrangements such as feedback.
Example 25.7 The two circuits A and B shown in Fig. 25.12 are connected to
form a composite circuit C. Show that
G(s) ≈ GA(s)GB(s)
(where the G(s) are the transfer functions for the voltages shown) if 1/R is much
smaller than 1/r + 1/r1.
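Fig. 25.12 Circuit A (resistances r1 and r; voltages V1 and VA), circuit B (elements R, L, C; voltages V2 and VB), and the composite circuit C (currents I and I1; voltages V and VC).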
For Circuit C, by following the voltage drops around closed subcircuits as usual, we get
V = r1I + r(I − I1),
Fig. 25.13 Block representation with GA = r/(r1 + r) between V1 and VA, and GB = (1/(Cs))/(R + Ls + 1/(Cs)) leading to VB.
Example 25.8 Figure 25.14 shows a chain of three systems, which act
independently upon their inputs according to the transfer functions GA(s),
GB(s), and GC(s) indicated in the boxes. Find the transfer function G(s)
between F(s) and FC(s). Find fC(t) when f(t) = H(t), for zero initial conditions.
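Fig. 25.14 A chain of three systems with transfer functions GA(s) = 1/s, GB(s) = 1/(s + 1), and GC(s) = 1/(s + 2), taking F(s) to FA(s), FB(s), and FC(s) in turn.
We have
\[ \frac{F_C}{F} = \frac{F_C}{F_B}\,\frac{F_B}{F_A}\,\frac{F_A}{F} = \frac{1}{s+2}\,\frac{1}{s+1}\,\frac{1}{s}, \]
so
\[ G(s) = \frac{1}{s(s+1)(s+2)}. \]
Now let f(t) = H(t); then F(s) = 1/s. Therefore
\[ F_C(s) = G(s)F(s) = \frac{1}{s(s+1)(s+2)}\,\frac{1}{s} = \frac{1}{s^2(s+1)(s+2)}. \]
In partial fractions,
\[ F_C(s) = -\frac{3}{4}\,\frac{1}{s} + \frac{1}{2}\,\frac{1}{s^2} + \frac{1}{s+1} - \frac{1}{4}\,\frac{1}{s+2}. \]
Therefore
\[ f_C(t) = -\tfrac{3}{4} + \tfrac{1}{2}t + e^{-t} - \tfrac{1}{4}e^{-2t} \quad\text{for } t \ge 0. \]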
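The partial-fraction step can be checked with exact rational arithmetic. This is a small sketch, assuming only the expressions of Example 25.8; the function names are illustrative.

```python
from fractions import Fraction
import math


def transform(s):
    """F_C(s) = 1 / (s^2 (s + 1)(s + 2))."""
    return 1 / (s**2 * (s + 1) * (s + 2))


def partial_fractions(s):
    """The partial-fraction form quoted in the example."""
    return (-Fraction(3, 4) / s + Fraction(1, 2) / s**2
            + 1 / (s + 1) - Fraction(1, 4) / (s + 2))


def f_c(t):
    """Term-by-term inverse of the partial fractions, valid for t >= 0."""
    return -0.75 + 0.5 * t + math.exp(-t) - 0.25 * math.exp(-2 * t)


# Exact comparison at a few rational points away from the poles 0, -1, -2.
test_points = [Fraction(1), Fraction(3, 2), Fraction(-1, 3), Fraction(5)]
identity_holds = all(transform(s) == partial_fractions(s) for s in test_points)
print(identity_holds, f_c(0.0))
```

The output starts from zero at t = 0, as expected for a quiescent chain driven by a unit step.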
Finally we illustrate the relation between transfer functions in the s domain and
complex transfer functions in the ω domain (Section 21.5).
Example 25.10 The transfer function between an input F(s) and an output
X(s) is 1/(s² + 1). Find the amplitude and phase of the steady forced oscillation
produced by an input f(t) = 3 sin 2t.
As pointed out in Section 25.3, the complex impedance is simply the s domain impedance
with iω substituted for s. The same is true for any transfer function. In the ω domain
representation, with ω = 2, the input and output will be represented by phasors
F(ω) = 3e^{−½πi} and X(ω), where
\[ X = \frac{1}{(2i)^2 + 1}\,F = -\tfrac{1}{3}\cdot 3e^{-\frac{1}{2}\pi i} = e^{\frac{1}{2}\pi i}. \]
The amplitude is the modulus of X, which is 1, and the phase is ½π.
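The phasor calculation can be reproduced with complex arithmetic. The sketch below assumes the convention that a signal is the real part of (phasor) × e^{iωt}, so that 3 sin 2t corresponds to the phasor 3e^{−½πi}; the variable names are illustrative.

```python
import cmath
import math


def transfer(s):
    """Transfer function 1/(s^2 + 1) of Example 25.10."""
    return 1 / (s**2 + 1)


omega = 2.0
# Phasor of the input 3 sin 2t = Re(F e^{2it}) under the stated convention.
F = 3 * cmath.exp(-0.5j * math.pi)
# Substitute s = i*omega in the transfer function and multiply.
X = transfer(1j * omega) * F

amplitude = abs(X)
phase = cmath.phase(X)
print(amplitude, phase)
```

Under this convention the steady output is cos(2t + ½π) = −sin 2t.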
25.5
The following result enables us to interpret Laplace transforms that take the form
of a product of two functions.
\[ f(t) = \int_0^t g(t-\tau)\,h(\tau)\,d\tau \]
The integral is called the convolution of g(t) and h(t). This result will be proved
in Chapter 32, Example 32.12, by using double integration. For the present we
shall verify that it is true in some special cases.
\[ = \int_0^t e^{-t}e^{-\tau}\,d\tau = e^{-t}\int_0^t e^{-\tau}\,d\tau \tag{i} \]
Notice very carefully the distinction between t and τ in the integrals (25.11): τ is
the variable of integration. The variable t is a constant so far as the integration
process is concerned; so, for example, in eqn (i), Example 25.11, we took e−t
outside the integral sign.
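A numerical check makes the roles of t and τ concrete: t is fixed while τ runs from 0 to t. The sketch below assumes the pair g(t) = e^{−t}, h(t) = e^{−2t}, whose transforms multiply to 1/((s + 1)(s + 2)), with inverse e^{−t} − e^{−2t}; the function `convolve` and the trapezium rule are illustrative choices, not the text's method.

```python
import math


def convolve(g, h, t, steps=2000):
    """Trapezium-rule estimate of the integral of g(t - tau) h(tau), tau from 0 to t."""
    if t == 0:
        return 0.0
    d = t / steps
    # End-point contributions (tau = 0 and tau = t) carry weight one half.
    total = 0.5 * (g(t) * h(0.0) + g(0.0) * h(t))
    for k in range(1, steps):
        tau = k * d          # tau is the variable of integration; t stays fixed
        total += g(t - tau) * h(tau)
    return total * d


def g(t):
    return math.exp(-t)


def h(t):
    return math.exp(-2 * t)


def exact(t):
    """Inverse transform of 1/((s + 1)(s + 2))."""
    return math.exp(-t) - math.exp(-2 * t)


errors = [abs(convolve(g, h, t) - exact(t)) for t in (0.5, 1.0, 3.0)]
print(max(errors))
```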
as expected.
0 0
which is the integral required, merely using u instead of τ for the variable of integration.
Self-test 25.2
25.6
Use the convolution theorem (25.11) to obtain the solution for the unknown
function x(t) in the ‘integral equation’
x(t) = I −1
v x*(t − τ )f(τ ) dτ,
0
or I −1
v x*(τ )f(t − τ ) dτ.
0
This type of result applies to the other circuit variables such as voltages and
charges, and to mechanical systems governed by linear differential equations.
In terms of general outputs and inputs:
Output x(t) from an input f(t) to a quiescent linear system, in terms of the
or
\[ = I^{-1}e^{-t}\int_0^t e^{\tau}\sin\tau\,d\tau - I^{-1}\int_0^t \sin(2t-2\tau)\sin\tau\,d\tau \]
\[ = I^{-1}e^{-t}\int_0^t e^{\tau}\sin\tau\,d\tau - \tfrac{1}{2}I^{-1}\int_0^t [\cos(2t-3\tau) - \cos(2t-\tau)]\,d\tau, \]
Fig. 25.15 (a) A representative function y = g(α) of elapsed time, or age, α. (b) The input y = f(τ), the value f(τ1), and the curve y = f(τ1)g(τ − τ1), which takes the value f(τ1)g(t − τ1) at time t.
Now choose any moment τ1 between 0 and t: there was a force f(τ1) applied at this
moment, and its contribution to x at time t > τ1 is
g(t − τ1)f(τ1) δτ.
The factor g(t − τ1) takes into account the time elapsed between the cause and its
effect – in some problems it would be appropriate to call t − τ1 the ‘age’ of f(τ1) at
the moment t of observation, and g an ageing factor. Depending on the type of
problem, this factor might weaken or amplify the contribution of f(τ1) to the
integral as time t passes. The elapsed time is increased if either we take an earlier τ1,
or delay the time of observation by increasing t. Figure 25.15a shows a representative
function g(α), where α stands for ‘age’, and Fig. 25.15b illustrates its effect on
the influence of f at time τ1 on x at a later time t.
where
x(t) ↔ X(s) and y(t) ↔ Y(s),
subject to the condition of quiescence at t = 0. Thus G(s) completely describes the
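effect of the circuit. By the convolution theorem (25.11), the relation (25.15) is equivalent to
\[ y(t) = \int_0^t x(\tau)\,g(t-\tau)\,d\tau \quad\text{or}\quad \int_0^t x(t-\tau)\,g(\tau)\,d\tau. \tag{25.16} \]
Fig. 25.16 (a) A smooth signal x(t) sampled at t = 0, T, 2T, 3T, 4T. (b) The corresponding sequence of spikes.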
In other words, the interpretation of g(t) is that it is equal to the output from a
unit delta-function input at t = 0. This repeats the result (25.14).
So far in the chapter we have only considered circuits made up from the tradi-
tional elements, resistances, capacitances, and inductances, but there exists a far
greater variety of basic units. We shall not describe the circuits which contain
these new features, but only specify their properties.
Figure 25.16a shows a smooth signal x(t) starting at t = 0. Imagine that this
serves as the input to a circuit that picks out the values of x(t) at times t = 0, T, 2T,
3T, … , samples them over very short time intervals, and ignores the values of x(t)
in between, treating them as if they were zero. This process is indicated by the
shaded strips in Fig. 25.16a. The device registers a sequence of values
{x(0), x(T), x(2T), x(3T), … },
called a sample of x(t) at equal intervals T. In an actual instrument the output
will consist of a succession of ‘spikes’ as in Fig. 25.16b. These can be thought of
as brief puffs of energy generated by the circuit, which are equal in ‘content’ to
the sequence of values above, so it is plausible to represent the sample, y(t) say, by
\[ y(t) = \sum_{k=0}^{K} x(kT)\,\delta(t - kT) \tag{25.18} \]
(where K may be infinite). Such a function is called discrete. The circuit works like
the first stage of an analogue-to-digital converter.
Suppose next that we have a circuit which processes discrete inputs of interval T,
and produces discrete outputs of interval T. Such circuits may amplify, or filter, or
delay, or modify the input in a variety of ways. We then have a completely discrete
system. The input x(t) and output y(t), and their Laplace transforms X(s) and Y(s),
take the form
\[ x(t) = \sum_{n=0}^{N} x_n\,\delta(t - nT), \qquad X(s) = \sum_{n=0}^{N} x_n e^{-nTs}, \tag{25.19} \]
\[ y(t) = \sum_{k=0}^{K} y_k\,\delta(t - kT), \qquad Y(s) = \sum_{k=0}^{K} y_k e^{-kTs}, \tag{25.20} \]
where xn and yk are constants, and N and K may be infinite. We may alternatively
express x(t) and y(t) in the form
x(t) = {x0, x1, x2, … , xN}, or simply as {xn};
y(t) = {y0, y1, y2, … , yK}, or as {yk}.
Thus, {n + 3} stands for {3, 4, 5, … }. In a case such as {1, 2, 0, 0, 0, 0, … } we may
further shorten it to {1, 2}.
Assume next that there exists a transfer function G(s) so that Y(s) = G(s)X(s).
Let g(t) ↔ G(s); then from (25.17) g(t) is equal to the output resulting from the
unit impulsive input
x*(t) = δ(t)
(or x*(t) = {1} or {1, 0, 0, 0, … } in the sequence form). The device generates only
discrete outputs. Then, g(t), which is equal to the response to x*(t), must also
have a discrete form:
\[ g(t) = \sum_{m=0}^{M} g_m\,\delta(t - mT), \quad\text{so}\quad G(s) = \sum_{m=0}^{M} g_m e^{-mTs}. \tag{25.21} \]
Fig. 25.17 (a) arbitrary discrete input x(t) and output y(t). (b) input x(t) = δ(t), and delayed
output g(t − τ ) δ(t − τ ) ↔ G(s).
(a) If the input is x(t), then the output is y(t) = x(t − T). Therefore, if x(t) = δ(t),
the output is δ(t − T) as in Fig. 25.17b, and by (25.17), we must have g(t) = δ(t − T),
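so the transfer function is G(s) = e^{−Ts}.
(b) To check that this transfer function really works for a general discrete input, put
\[ x(t) = \sum_{n=0}^{\infty} x_n\,\delta(t - nT). \]
We then have X(s) = Σ xn e^{−nTs}, so that
\[ Y(s) = e^{-Ts}X(s) = \sum_{n=0}^{\infty} x_n e^{-(n+1)Ts}, \]
which is the transform of Σ xn δ(t − (n + 1)T) = x(t − T), as required.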
In the general case when the transfer function takes the form {g0, g1, … , gM},
an input x(t) = δ(t), represented by x(t) = {1}, generates a string of impulses
\[ \sum_{m=0}^{M} g_m\,\delta(t - mT) \]
at the times mT, m = 0 to M. We shall look at the system’s
Self-test 25.3
The transfer function g(t) of a discrete system with time interval T is given by
Then we may write for the transform of a typical discrete input x(t):
\[ X(s) = \sum_{n=0}^{N} x_n e^{-nTs} \equiv \sum_{n=0}^{N} \frac{x_n}{z^n}. \]
is called the z transform of x(t). Given x(t) we can write down the z transform.
Conversely, given a suitable function X(z), we can expand it by Taylor’s theorem
for large z in powers of z^{−1} in order to obtain the sequence of coefficients {x0,
x1, x2, … } in (25.23), which defines x(t). This sequence is called the inverse
transform of X(z).
Suppose that {xn} is supplied as input to a discrete linear system. The z trans-
form of the output y(t) = {y0, y1, y2, … }, say, is
\[ Y(z) = y_0 + \frac{y_1}{z} + \frac{y_2}{z^2} + \cdots. \tag{25.24} \]
We already know from (25.21) that if the circuit is linear it has a transfer function
G(s) taking the form of a similar sequence of impulsive terms. Therefore g(t) has a
z transform:
\[ G(z) = g_0 + \frac{g_1}{z} + \frac{g_2}{z^2} + \cdots. \tag{25.25} \]
Finally, from (25.15) (since all we have done is to write a shorthand for e^{Ts}), the z
transforms of output and input are related by
\[ Y(z) = G(z)X(z), \tag{25.26} \]
which is simply the product of two polynomials in powers of z^{−1}.
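Multiplying two such polynomials in z^{−1} amounts to a discrete convolution of the coefficient sequences. The following is a minimal sketch for finite sequences; the function name `multiply_sequences` is illustrative, and the sample sequences are those of Example 25.20 and Self-test 25.4.

```python
def multiply_sequences(g, x):
    """Coefficients of G(z)X(z) in powers of 1/z, for finite non-empty sequences.

    g[m] multiplies z**(-m) and x[n] multiplies z**(-n), so the product term
    g[m] * x[n] contributes to the coefficient of z**(-(m + n)).
    """
    y = [0] * (len(g) + len(x) - 1)
    for m, gm in enumerate(g):
        for n, xn in enumerate(x):
            y[m + n] += gm * xn
    return y


# Transfer sequence {1, 1} acting on the input {1, 1}.
print(multiply_sequences([1, 1], [1, 1]))
# Transfer sequence {1, 0, -1} acting on the input {1, 1}.
print(multiply_sequences([1, 0, -1], [1, 1]))
```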
We have lost sight of T in these expressions, but we can always recover it by
returning to time-domain or s-domain formulae by putting z = e^{Ts}. To summarize:
Example 25.17 Obtain the z transform of the discrete signal x(t) defined by the
\[ \text{(a)}\quad X(z) = 1 + \frac{0}{z} + \frac{0}{z^2} + \cdots = 1. \]
\[ \text{(b)}\quad X(z) = 1 + \frac{1}{z} + \frac{1}{z^2} + \cdots. \]
This is an infinite geometric series with common ratio z^{−1} (it converges only if |z| > 1,
but do not worry about this). From eqn (5.4a):
\[ X(z) = \frac{1}{1 - z^{-1}} = \frac{z}{z - 1}. \]
\[ \text{(c)}\quad X(z) = 1 + \frac{1}{z^2} + \frac{1}{z^4} + \cdots. \]
The common ratio is z^{−2}, so by Section 5.4
\[ X(z) = \frac{1}{1 - z^{-2}} = \frac{z^2}{z^2 - 1}. \]
Example 25.18
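We see that
\[ X(z) = 1 + \frac{2}{z} + \frac{3}{z^2} + \frac{4}{z^3} + \cdots. \]
To sum this series, multiply it by 1/z:
\[ \frac{1}{z}X(z) = \frac{1}{z} + \frac{2}{z^2} + \frac{3}{z^3} + \cdots. \]
Subtract the second expression from the first:
\[ \left(1 - \frac{1}{z}\right)X(z) = 1 + \frac{1}{z} + \frac{1}{z^2} + \cdots = \frac{z}{z - 1} \]
(as in the previous Example). Therefore
\[ X(z) = \frac{z}{z - 1}\Big/\left(1 - \frac{1}{z}\right) = \frac{z^2}{(z - 1)^2}. \]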
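These closed forms can be spot-checked by summing the series numerically at a point with |z| > 1. The sketch below assumes the sequences {1, 1, 1, …}, {1, 0, 1, 0, …}, and {1, 2, 3, …}; the choice z = 2 and the name `partial_sum` are illustrative.

```python
def partial_sum(coefficient, z, terms=200):
    """Sum of coefficient(n) / z**n for n = 0 .. terms - 1."""
    return sum(coefficient(n) / z**n for n in range(terms))


z = 2.0  # any point with |z| > 1 lies inside the region of convergence

ones = partial_sum(lambda n: 1, z)                        # {1, 1, 1, ...}
alternate = partial_sum(lambda n: 1 if n % 2 == 0 else 0, z)  # {1, 0, 1, 0, ...}
ramp = partial_sum(lambda n: n + 1, z)                    # {1, 2, 3, ...}

print(ones, z / (z - 1))
print(alternate, z**2 / (z**2 - 1))
print(ramp, z**2 / (z - 1) ** 2)
```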
Example 25.19 (a) Obtain the inverse z transform of the function X(z) = z /(z − 2).
(b) Deduce the time function x(t) which it represents.
(a) We need to find the coefficients in the infinite series form for X(z):
\[ X(z) = x_0 + \frac{x_1}{z} + \frac{x_2}{z^2} + \cdots. \]
This is a Taylor expansion of X(z) in powers of 1/z for large z (see Section 5.6). To obtain
it, we start by expressing X(z) in terms of 1/z:
\[ X(z) = \frac{z}{z - 2} = 1\Big/\left(1 - \frac{2}{z}\right) = \left(1 - \frac{2}{z}\right)^{-1}. \]
The binomial expansion (5.4f), with α = −1 and x = −2/z, gives
\[ X(z) = 1 + \frac{2}{z} + \frac{2^2}{z^2} + \frac{2^3}{z^3} + \cdots. \]
Therefore the sequence of coefficients (i.e. the inverse) is {1, 2, 2², 2³, … }.
(b) The corresponding time function x(t) is therefore
x(t) = δ(t) + 2δ(t − T) + 2²δ(t − 2T) + 2³δ(t − 3T) + ··· .
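For a rational X(z) the same coefficients can be generated mechanically by dividing power series in w = 1/z. This is a sketch of that alternative to the binomial expansion, assuming numerator and denominator are supplied as coefficient lists in ascending powers of w; the name `inverse_z` is illustrative.

```python
def inverse_z(num, den, terms):
    """First `terms` coefficients x0, x1, ... of num(w)/den(w), with w = 1/z.

    num and den hold coefficients in ascending powers of w, and den[0] must
    be nonzero. The recurrence follows from matching powers of w in
    den(w) * (x0 + x1 w + x2 w^2 + ...) = num(w).
    """
    coeffs = []
    for n in range(terms):
        acc = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            acc -= den[k] * coeffs[n - k]
        coeffs.append(acc / den[0])
    return coeffs


# z/(z - 2) = 1/(1 - 2w)
print(inverse_z([1], [1, -2], 5))
# z^2/(z - 1)^2 = 1/(1 - 2w + w^2)
print(inverse_z([1], [1, -2, 1], 4))
```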
Example 25.20 The response of a discrete system to the input x(t) = δ(t) + δ(t − T)
is found to be y(t) = δ(t) + 2δ(t − T) + δ(t − 2T). Find (a) the z transfer function
G(z), (b) the Laplace transfer function G(s), (c) the response to a unit impulse δ(t).
(a) Put
\[ X(z) = 1 + \frac{1}{z}, \qquad Y(z) = 1 + \frac{2}{z} + \frac{1}{z^2}, \qquad G(z) = g_0 + \frac{g_1}{z} + \frac{g_2}{z^2} + \cdots \]
(for all we know at this stage, there might be an infinite number of terms in G(z)). Since Y(z) = G(z)X(z),
\[ G(z) = Y(z)/X(z) = \left(1 + \frac{2}{z} + \frac{1}{z^2}\right)\Big/\left(1 + \frac{1}{z}\right) = \left(1 + \frac{1}{z}\right)^2\Big/\left(1 + \frac{1}{z}\right) = 1 + \frac{1}{z}. \]
(b) Restore s by putting z = e^{Ts}, where T is the spacing interval:
G(s) = 1 + e^{−Ts}.
(c) The impulse response is the inverse transform, g(t), of G(s):
g(t) = δ(t) + δ(t − T),
which can be obtained also from (a).
Self-test 25.4
The response of a discrete system to the input x(t) = δ(t) + δ(t − T) is found
to be
δ(t) + δ(t − T) − δ(t − 2T) − δ(t − 3T).
By using z transforms show that the transfer function is g(t) = δ(t) − δ(t − 2T).
Fig. 25.18 Input functions δ(t) and δ(t − T) with the corresponding output functions g(t) and g(t − T) for system A, and the input and output functions for A followed by B.
25.10 Behaviour of z transforms in the complex plane
particular, they should not increase as time goes on. Their increase or decrease is
described by the rate of increase or decrease of the coefficients in the series
\[ G(z) = g_0 + \frac{g_1}{z} + \frac{g_2}{z^2} + \cdots. \tag{25.30} \]
We shall illustrate how information about this question can be obtained by exam-
ining the behaviour of G(z) when it is given in closed form, and the variable z is
allowed to be complex.
We limit consideration to cases where G(z) is a rational function of z:
\[ G(z) = \frac{a_M z^M + a_{M-1}z^{M-1} + \cdots + a_0}{b_N z^N + b_{N-1}z^{N-1} + \cdots + b_0}. \tag{25.31} \]
We shall assume that M N. Suppose that the am and bn are all real numbers, and
that the N solutions of the equation
\[ b_N z^N + b_{N-1}z^{N-1} + \cdots + b_0 = 0 \tag{25.32} \]
are
where c may be complex: if so, then C might be complex as well. This term is the
source of a part of the discrete output signal g(t) produced by an input x(t) = δ(t),
and we shall see whether it generates an increasing or a decreasing output.
Suppose firstly that we find a pole at z = c in (25.35), where c is a real number.
Then C is also real, and
\[ \frac{C}{z - c} = \frac{C}{z}\left(1 - \frac{c}{z}\right)^{-1} = \frac{C}{z} + \frac{Cc}{z^2} + \frac{Cc^2}{z^3} + \cdots. \]
In the time domain this corresponds to the sequence
{C, Cc, Cc², … }.
If |c| > 1 the terms are increasing in magnitude, and the system is said to be
unstable. If |c| < 1 they are decreasing in magnitude. The rate of increase or
decrease is actually exponential, because
\[ |Cc^n| = |C|\,e^{n\ln|c|}. \]
If c = ±1, then the output time sequence is nondecreasing, and unstable:
{C, ±C, C, ±C, … }.
Next, suppose that c is complex. Then there is another pole at z = c̄, the complex
conjugate of c, with coefficient C̄. Taking these together, we obtain a pair of complex
conjugate terms, generating real coefficients:
\[ \frac{C}{z - c} + \frac{\bar{C}}{z - \bar{c}} = 2\,\mathrm{Re}\,\frac{C}{z - c} = 2\,\mathrm{Re}\left(\frac{C}{z}\,\frac{1}{1 - cz^{-1}}\right) = 2\,\mathrm{Re}\left(\frac{C}{z} + \frac{Cc}{z^2} + \frac{Cc^2}{z^3} + \cdots\right) \]
= {2 Re (C), 2 Re (Cc), … }. (25.36)
Evidently the magnitude (modulus) of the coefficients follows the same rule as
before.
Each of the terms (25.34) in G(z) contributes to g(t) in a similar way. Therefore,
the response y(t) to a delta function input x(t) = δ(t) depends upon the poles cn of
From (25.38)
Therefore, for n = 0, 1, 2, … ,
In Fig. 25.20 we show an Argand diagram with the unit circle |z| = 1 indicated.
This is used as a design tool to obtain a qualitative idea of how a proposed circuit
will behave, and to modify its properties. We can find the poles (the points where
G(z) is infinite), and place them on the diagram. Poles within the circle promise
transients which die away; if there is a pole outside, then a stimulus applied to the
circuit will produce ever-increasing output, so the system will be unstable. Poles
lying on the circle |z| = 1 produce transients which do not approach zero or infinity
in magnitude. If the values associated with the circuit elements can be adjusted so
that all the poles lie inside the unit circle, then we shall have a circuit for which all
disturbances die away with time.
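The pole test is easy to automate when the denominator of G(z) is quadratic. The sketch below uses hypothetical denominators chosen for illustration (they are not taken from the examples in the text), and the names `quadratic_poles` and `classify` are assumptions.

```python
import cmath


def quadratic_poles(a, b, c):
    """Roots of a z^2 + b z + c = 0 (a nonzero), possibly complex."""
    root = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))


def classify(poles, tol=1e-12):
    """Compare the largest pole modulus with the unit circle |z| = 1."""
    largest = max(abs(p) for p in poles)
    if largest < 1 - tol:
        return "all poles inside: transients die away"
    if largest > 1 + tol:
        return "a pole outside: unstable"
    return "a pole on the circle: transients neither grow nor decay"


# Hypothetical denominators, for illustration only.
print(classify(quadratic_poles(2, -2, 1)))   # poles 0.5 +/- 0.5i
print(classify(quadratic_poles(1, -3, 2)))   # poles 2 and 1
print(classify(quadratic_poles(1, 0, 1)))    # poles +/- i
```

The modulus of the largest pole also gives the rate: a transient associated with a pole c varies like |c|^n.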
Fig. 25.19 Discrete transient of C/(z − c). Suppose that C = 0.5 and c = 0.8e^{2.9i}. Then |C| = 0.5,
|c| = 0.8, φ = 0, ω = 2.9. The curve y = (0.8)^t cos(2.9t) and the impulsive response to δ(t) are shown.
Fig. 25.20 The unit circle |z| = 1 in the z plane, and several poles z = c1, z = c2, z = c3 of a transfer
function G(z). One of the poles is outside the circle so the circuit is unstable and a transient
associated with this pole will grow exponentially.
Self-test 25.5
A discrete system has the transfer function given (in terms of z = eTs) by
G(z) = z/(16z² − 16z + 5).
Use the result (25.37) to determine the stability of the system.
25.11
or
yn+2 = 2yn+1 − yn + xn,
Notice that we had to prescribe y0: it was not given by the difference equation,
and we could have assigned any value to it. It resembles the initial condition of a
first-order differential equation.
where a, b, c are the coefficients of yn+2, yn+1, and yn respectively. The denominator
alone determines the growth of transients, so there is really no need to work right
through the problem if all we want is information about the stability. In fact, if
y0 = y1 = 0, which would be a natural condition, the circuit has a transfer function
equal to 1/(az² + bz + c), so the situation is exactly the same as in the previous
section. Similar considerations apply to linear difference equations of any order.
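The link between the denominator az² + bz + c and the growth of transients can be seen by iterating a second-order difference equation directly. This is a sketch assuming the form a·y[n+2] + b·y[n+1] + c·y[n] = x[n] with a unit impulse input and y0 = y1 = 0; the function name `simulate` and the second set of coefficients are illustrative.

```python
def simulate(a, b, c, x, y0=0.0, y1=0.0):
    """Iterate a*y[n+2] + b*y[n+1] + c*y[n] = x[n] from the values y0, y1."""
    y = [y0, y1]
    for n, xn in enumerate(x):
        y.append((xn - b * y[n + 1] - c * y[n]) / a)
    return y


impulse = [1.0] + [0.0] * 39

# y[n+2] = 2 y[n+1] - y[n] + x[n]: denominator z^2 - 2z + 1, double pole at z = 1.
growing = simulate(1.0, -2.0, 1.0, impulse)

# Hypothetical stable case: denominator z^2 - z + 0.5, poles 0.5 +/- 0.5i.
decaying = simulate(1.0, -1.0, 0.5, impulse)

print(growing[:7])
print(abs(decaying[-1]))
```

With the double pole on the unit circle the response to a single impulse grows without bound, while the poles of modulus about 0.71 give a transient that dies away.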
A table of z-transforms can be found in Råde and Westergren (1995).
Problems
25.1 Invert the transforms (a) 1/[s(s² + 1)], (b) 1/[s²(s² + 1)], (c) 1/[s³(s² + 1)], by using (25.1).
25.2 The equation for the current i(t) in an RLC circuit for zero initial charge is
\[ L\frac{di}{dt} + Ri + \frac{1}{C}\int_0^t i(\tau)\,d\tau = v(t). \]
(a) Solve this equation when L = 2, R = 3, C = 1/3, v(t) = 3 cos t in conveniently scaled units, for
zero initial current and charge.
(b) Adapt the equation to the case when v(t) = 0 and there is an initial charge q0 on the
capacitor, and solve it, given that i(0) = 0.
(c) The circuit in (a) is quiescent with zero charge; then, at t = t0, a voltage of 300 units acts in it
for 0.01 time units. Approximate the applied voltage by a suitable impulse function, and
solve the equation for i(t).
25.3 The displacement x(t) of a mass on a spring with velocity damping and external force f(t)
per unit mass reduces to the conventional form
\[ \ddot{x} + 2k\dot{x} + \omega^2 x = f(t). \]
The initial conditions are x(0) = 1, ẋ(0) = 1. An impulse I is applied at t = t0.
Find the solution for t > 0 for k² < ω².
25.4 A light plank of length l rests across a crevasse, and sags under the weight of a
mountaineer of mass M standing at the centre. The displacement u(x), where x is measured from one
end, is determined in general by K d⁴u/dx⁴ = f(x), where K is constant and f(x) is force per unit length
along the plank. The boundary conditions, which say that the plank merely rests on its ends, are
25.7 Evaluate the convolution integral
t
t
C=2
(b) x(t) = 1 + x(τ )(t − τ ) dτ;
0
Fig. 25.21
t
x**(τ )f (t − τ ) dτ,
R=3 C=2
d
x(t) =
V1(s) V2(s) dt 0
L=1
where x**(t) represents the response of a quiescent
‘black box’ to a unit-function input H(t), and x(t)
R=5
is its response from quiescence to an input f(t).
Suppose that the transform X**(s) of the unit-
(b) R=2 L=2
function response is given by 1/(s − 1)(s + 2) in a
I(s) particular case. Obtain the response from zero
C=2 initial conditions to an input H(t) sin ω t.
V1(s) R=3 V2(s)
\[ p(t) = p_0 e^{-\gamma t} + b\int_0^t p(\tau)\,e^{-\beta(t-\tau)}\,d\tau, \]
and solve the equation.
25.13 A simple harmonic oscillator with displacement x is subject to a constant force F0
for 0 < t < t0, and allowed to oscillate freely for t > t0. If H(t) is the Heaviside function its
equation of motion is
\[ m\ddot{x} + kx = F_0[H(t) - H(t - t_0)]. \]
If the system starts from rest in equilibrium, show that the Laplace transform is
\[ \mathcal{L}\{x(t)\} = \frac{F_0(1 - e^{-st_0})}{m\,s(s^2 + \omega^2)}, \]
where ω = √(k/m). Show that, for 0 < t < t0, the solution is
\[ x(t) = \frac{F_0}{k}(1 - \cos\omega t), \]
and find the solution for t > t0.
25.14 An equation of the form
\[ \frac{dx(t)}{dt} = x(t - 1) + t, \]
which relates the derivative at time t to the value of the function at an earlier time, is an example
of a differential delay equation. If x(t) = 0 for t ≤ 0, show that the Laplace transform of the
solution is
\[ \mathcal{L}\{x(t)\} = \frac{1}{s^2(s - e^{-s})} = \frac{1}{s^3(1 - e^{-s}/s)}. \]
Expand 1/(1 − e^{−s}/s) in powers of e^{−s}/s using a binomial expansion, and show that
\[ x(t) = \sum_{n=0}^{\lfloor t\rfloor}\frac{(t - n)^{n+2}}{(n + 2)!}, \]
25.16 The differential equation
\[ \frac{d^2x}{dt^2} + t\frac{dx}{dt} - x = 0 \]
does not have constant coefficients: the coefficient of dx/dt is t. Using the results (24.8) and (24.12),
show that the transform of the differential equation subject to the conditions x(0) = 0
and x′(0) = 1 satisfies the first-order equation
\[ -s\frac{dX(s)}{ds} + (s^2 - 2)X(s) = 1. \]
Verify that X(s) = 1/s² satisfies this equation, and hence obtain the required solution of the original
equation.
25.17 Using the method outlined in Problem 25.16, solve the following variable-coefficient
equations using Laplace transforms:
(a) tx″(t) + (1 − t)x′(t) − x(t) = 0, x(0) = x′(0) = 1;
(b) x″(t) + tx′(t) − 2x(t) = 2, x(0) = x′(0) = 0;
(c) tx″(t) − x′(t) + tx(t) = sin t, x(0) = 1, x′(0) = 0.
25.18 (Discrete systems, Section 25.8). The following signals are expressed in the sequence
forms. Write the explicit form of x(t) and its Laplace transform in each case.
(a) {1, 2, 1, 0, 0, 0, … }.
(b) {0, 1, 2, 3, … }.
(c) {3}.
(d) {(−2)^n}.
(e) {0, 0, 3}.
25.19 The transfer functions g(t) in the time domain, and inputs x(t), are given below. Obtain
the outputs y(t) in each case.
(a) g(t) = {1, 1}, x(t) = {1, 1}.
(b) g(t) = {1, 1/2, 1/2², … }, x(t) = {1, 1}.
(c) g(t) = {1, −1, 1, −1, … }, x(t) = {0, 2, 2}.
25.20 Obtain the output y(t), when the transfer function is G(s) = 1/(1 − (1/3)e^{−Ts}) and the Laplace
transform of the input is X(s) = e^{−Ts} + 2e^{−2Ts}.
(Hint: expand G(s) in the form of an appropriate infinite series in powers of e^{−sT}.)
25.21 Obtain the z transforms corresponding to the various specifications that follow:
(a) x(t) = δ(t − T) + 2δ(t − 2T) − δ(t − 3T).
(b) x(t) = {1, −1, 1, −1, … }.
(c) x(t) = {1/2^n}.
(d) X(s) = e^{−Ts}/(1 − e^{−2Ts}).
25.22 The following functions are sampled at interval T. Obtain the z transform of the (discrete)
sampled functions (H(t) is the unit function (1.13)).
(a) tH(t).
(b) e^{−t}H(t).
25.23 Obtain the z transforms of the transfer functions, G(z), of various discrete, linear, systems
which have been tested for the particular input x(t) and output y(t) as specified:
(a) x(t) = {1, 1}, y(t) = {1, −1}, and find the sequence for g(t).
(b) x(t) = {1, 0, 0, 3}, y(t) = {1, 1}.
(c) x(t) = {1, −1}, y(t) = {1, 1}.
(d) x(t) = {1, 1, 1, … }, y(t) = {1, 0, −1, 0, 1, 0, −1, 0, 1, … }.
(e) x(t) = {1, 0, 1, 0, … }, y(t) = {1, 0, −1, 0, 1, 0, −1, 0, 1, … }.
25.24 Prove that if the z transform of the discrete function given by {x0, x1, x2, … } is X(z), then the
discrete transform of y(t) = {x0, x1 e^{−CT}, x2 e^{−2CT}, … } is X(e^{CT}z).
25.25 (a) Prove that if the z transform of the discrete function x(t) defined by {x0, x1, x2, … }
is X(z), then the transform of {0, x0, x1, … } is (1/z)X(z).
(b) Deduce that the transform of {0, 0, … , 0, x0, x1, … } (starting with N zeros) is (1/z)^N X(z).
(This is a time-delay rule for z transforms.)
25.26 Prove that if the z transform of {x0, x1, x2, … } is X(z), then the transform
of {xN, xN+1, xN+2, … } is
\[ z^N X(z) - z^N x_0 - z^{N-1}x_1 - \cdots - z\,x_{N-1}. \]
(This resembles the differentiation rule for Laplace transforms, (24.12). Start the process
with N = 1, then N = 2 etc., until the sequence becomes clear.)
25.27 The following represent transfer functions for discrete systems, G(z). Find the poles, mark
them on an Argand diagram as in Fig. 25.20, and state whether the systems are stable or not. Obtain
the rate of growth or decay of their transients.
(a) (z + 1)/(z² − 4). (b) (z² − z)/(4z² − 1).
(c) 1/(4z² + 1). (d) (z³ + 1)/(2z⁴ + 5z² + 2).
25.28 {xn} and {yn} represent inputs and outputs to discrete systems governed by the difference
equations shown. Use z transforms to obtain the transforms Y(z) in terms of X(z) and the initial
values y0 and y1. State whether the systems are stable or not.
(a) 4yn+2 − yn = xn; y0 = 1, y1 = 2.
(b) yn+2 − 3yn+1 + 2yn = 2xn; y0 = 0, y1 = 1.
(c) 2yn+2 + yn+1 + yn = xn+1 − xn; y0 = 0, y1 = 1.
(d) 2yn+2 + 3yn+1 − yn = xn; y0 = 1, y1 = 1.
26 Fourier series
If a note on a piano is played, firstly by pressing the key and then by plucking the
string, the sounds produced are very different although the pitch or fundamental
frequency heard is the same in both cases. The note produced by an instrument is
not a pure tone or sinusoidal wave; it is a richer sound which contains other fre-
quencies. These occur in different proportions when the same note is stimulated
in different ways, or is sounded on different instruments.
A trained ear can detect some detail in these differences; the extra components
can be distinguished and their pitch recognized, or they can be isolated by using
resonators. The extra component frequencies of a note are all higher than the
fundamental frequency, and related to it in a simple manner. If the fundamental
frequency is f, then the harmonics present have frequencies
f, 2f, 3f, 4f, 5f, …,
the strength of the harmonics dropping off to zero as their frequency increases.
When these components are added, a profile for the composite wave is obtained.
A particular note was found to have components as shown:
Order of harmonic: 1 2 3 4 5 …
Frequency: f 2f 3f 4f 5f …
Relative amplitude: 1.0 0.9 0.3 0.3 0.1 …
The shape and amplitude of the component harmonic waves, and of the com-
posite wave, are shown in Fig. 26.1.
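A composite wave of this kind can be assembled by adding the harmonics with the tabulated relative amplitudes. The sketch below assumes pure sine components with zero phase, which the table does not specify, and truncates after the fifth harmonic; the names `AMPLITUDES` and `composite` are illustrative.

```python
import math

# Relative amplitudes of harmonics 1..5 from the table (higher ones omitted).
AMPLITUDES = [1.0, 0.9, 0.3, 0.3, 0.1]


def composite(t, f=1.0, amplitudes=AMPLITUDES):
    """Sum of sine harmonics at frequencies f, 2f, 3f, ... (zero phases assumed)."""
    return sum(a * math.sin(2 * math.pi * (k + 1) * f * t)
               for k, a in enumerate(amplitudes))


# The sum repeats with the period 1/f of the fundamental.
print(composite(0.123), composite(1.123))
```

Whatever the amplitudes, every component repeats after one period of the fundamental, so the composite wave has the same period.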
Fig. 26.1 The component harmonic waves, including those of frequency 2f, 3f, and 5f, and the compound sound, plotted against time (ms).
Fig. 26.2 A periodic function P(t) with period T = 2π/ω.
sine and cosine terms are needed, because if we involve only sines or only cosines,
the sum will have a symmetry, odd or even (Section 15.9), which P(t) might not
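have. Then we expect that, for suitable values of the constants an and bn,
\[ P(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty}(a_n\cos n\omega t + b_n\sin n\omega t). \tag{26.1} \]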
Equation (26.1) is a Fourier series for P(t), and the constants a0; a1, b1; a2, b2; …
are its Fourier coefficients. It will be shown how to determine the coefficients in
Section 26.4: the factor ½ in the constant term ½a0 is introduced to simplify the
working.
We have spoken in terms of the one-period range t = −π/ω to t = π/ω, but every
term on the right of (26.1) is periodic with the same period T = 2π/ω as P(t).
Therefore the series will describe P(t) for every value of t, not merely for t in the
interval between ±π /ω.
\[ \int_0^T P(t)\,dt \quad\text{and}\quad \int_{t_0}^{t_0+T} P(t)\,dt, \]
each of which is taken over a one-period interval of P(t). The figure shows that the
integrals are equal by virtue of the area analogy (15.13). The two shaded areas in
Figs 26.3a, b are assembled from identical segments which are simply added up
in a different order.
Alternatively, differentiation with respect to t0 gives
\[ \frac{d}{dt_0}\left[\int_{t_0}^{t_0+T} P(t)\,dt\right] = P(t_0 + T) - P(t_0) = 0, \]
using (15.20) and the periodicity of P(t). Hence the integral is independent of t0.
Fig. 26.3 Illustrating the area analogy for (a) ∫_0^T P(t) dt and (b) ∫_{t0}^{t0+T} P(t) dt, where P(t) has period T.
The integral over any one-period interval of a function P(t) having period T,
\[ \int_{t_0}^{t_0+T} P(t)\,dt, \]
since the range −½π to ½π also covers a period π. But the integrand is an odd function
about the origin, so that the value of the last version is zero (Section 15.9).
The following special results can be proved by using the trigonometric identi-
ties in Appendix B which convert products to sums.
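(a) If m and n are positive integers with m ≠ n,
\[ \int_{-\pi/\omega}^{\pi/\omega}\cos n\omega t\,\cos m\omega t\,dt = 0, \]
\[ \int_{-\pi/\omega}^{\pi/\omega}\sin n\omega t\,\sin m\omega t\,dt = 0, \]
\[ \int_{-\pi/\omega}^{\pi/\omega}\cos n\omega t\,\sin m\omega t\,dt = 0. \]
(b) For n = 1, 2, …
\[ \int_{-\pi/\omega}^{\pi/\omega}\cos^2 n\omega t\,dt = \int_{-\pi/\omega}^{\pi/\omega}\sin^2 n\omega t\,dt = \pi/\omega. \]
For n = 0, we obtain
\[ \int_{-\pi/\omega}^{\pi/\omega} dt = 2\pi/\omega \quad\text{and}\quad \int_{-\pi/\omega}^{\pi/\omega} 0\,dt = 0. \]
(c) The range −π/ω to π/ω may be replaced by any interval of length 2π/ω.
(26.3)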
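The results (26.3) can be confirmed numerically for particular values. The following is a sketch, assuming ω = 3 and a few sample integers m and n chosen for illustration; the midpoint-rule helper `integrate` is not from the text.

```python
import math


def integrate(func, a, b, steps=20000):
    """Midpoint-rule estimate of the integral of func from a to b."""
    d = (b - a) / steps
    return sum(func(a + (k + 0.5) * d) for k in range(steps)) * d


omega = 3.0
lo, hi = -math.pi / omega, math.pi / omega

# Products with different integers n = 2, m = 5 integrate to zero.
cos_cos = integrate(lambda t: math.cos(2 * omega * t) * math.cos(5 * omega * t), lo, hi)
sin_sin = integrate(lambda t: math.sin(2 * omega * t) * math.sin(5 * omega * t), lo, hi)
cos_sin = integrate(lambda t: math.cos(2 * omega * t) * math.sin(5 * omega * t), lo, hi)

# Squares integrate to pi/omega (here n = 4).
cos_squared = integrate(lambda t: math.cos(4 * omega * t) ** 2, lo, hi)
sin_squared = integrate(lambda t: math.sin(4 * omega * t) ** 2, lo, hi)

print(cos_cos, sin_sin, cos_sin, cos_squared, math.pi / omega)
```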
Integrate both sides of this equation over the period between −π /ω and π/ω :
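\[ \int_{-\pi/\omega}^{\pi/\omega} P(t)\cos N\omega t\,dt = \tfrac{1}{2}a_0\int_{-\pi/\omega}^{\pi/\omega}\cos N\omega t\,dt + \sum_{n=1}^{\infty}\left(a_n\int_{-\pi/\omega}^{\pi/\omega}\cos n\omega t\,\cos N\omega t\,dt + b_n\int_{-\pi/\omega}^{\pi/\omega}\sin n\omega t\,\cos N\omega t\,dt\right). \tag{26.5} \]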
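\[ \int_{-\pi/\omega}^{\pi/\omega} P(t)\,dt = \tfrac{1}{2}a_0\int_{-\pi/\omega}^{\pi/\omega} dt = \frac{\pi}{\omega}a_0. \]
Therefore
\[ a_0 = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\,dt, \tag{26.6} \]
\[ \int_{-\pi/\omega}^{\pi/\omega} P(t)\cos N\omega t\,dt = a_N\int_{-\pi/\omega}^{\pi/\omega}\cos^2 N\omega t\,dt = \frac{\pi}{\omega}a_N. \]
Therefore, for N = 1, 2, 3, … ,
\[ a_N = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\cos N\omega t\,dt. \tag{26.7} \]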
By comparing (26.7) with (26.6), it can be seen that a0 and a1, a2, … are all given
by the same formula. That is why the constant term in (26.4) is written as ½a0
instead of a0.
\[ b_N = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\sin N\omega t\,dt. \tag{26.8} \]
Since P(t) is a known function, the integrals in (26.6), (26.7), and (26.8) can be
evaluated to give all the coefficients in the Fourier series (26.4).
In the following summary, the letter n is used in place of N to simplify the form of
the results.
Fourier coefficients:
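\[ a_n = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\cos n\omega t\,dt \quad (n = 0, 1, 2, \ldots), \]
\[ b_n = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\sin n\omega t\,dt \quad (n = 1, 2, \ldots) \]
(in place of the range of integration −π/ω to π/ω, any other one-period interval
may be used). (26.9)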
It can be seen also that since
\[ \tfrac{1}{2}a_0 = \frac{\omega}{2\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\,dt = \frac{1}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T} P(t)\,dt, \]
the average value of P(t) over a one-period interval is equal to the constant
term ½a0. (26.10)
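Notice the case of period 2π, which often occurs. In such cases ω = 2π/T = 1:
\[ P(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty}(a_n\cos nt + b_n\sin nt), \]
where
\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} P(t)\cos nt\,dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} P(t)\sin nt\,dt. \]
(The integrals may be taken over any one-period interval instead of [−π, π].)
(26.11)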
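When the integrals in (26.11) are awkward to do by hand, they can be estimated numerically. The sketch below assumes period 2π and uses a midpoint rule on [−π, π]; the function name `fourier_coefficients` is illustrative, and P(t) = |t| is used as a sample because its coefficients are known in closed form (a0 = π, an = −4/(πn²) for odd n, zero for even n, and bn = 0).

```python
import math


def fourier_coefficients(P, n_max, steps=20000):
    """Estimate a_0..a_n_max and b_1..b_n_max from (26.11) by the midpoint rule.

    b[0] is a placeholder (there is no b_0 term in the series).
    """
    d = 2 * math.pi / steps
    ts = [-math.pi + (k + 0.5) * d for k in range(steps)]
    a = [sum(P(t) * math.cos(n * t) for t in ts) * d / math.pi
         for n in range(n_max + 1)]
    b = [0.0] + [sum(P(t) * math.sin(n * t) for t in ts) * d / math.pi
                 for n in range(1, n_max + 1)]
    return a, b


a, b = fourier_coefficients(abs, 3)
print(a)
print(b)
```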
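\[ \int t^k\cos n\omega t\,dt \quad (k, \text{ a positive integer}) \]
the integrals can be obtained by repeated integration by parts. Define the following
indefinite integrals:
\[ F_1(t) = \int f(t)\,dt, \qquad F_2(t) = \int F_1(t)\,dt, \qquad F_3(t) = \int F_2(t)\,dt, \ \ldots. \]
\[ \int P(t)f(t)\,dt = P(t)F_1(t) - P'(t)F_2(t) + P''(t)F_3(t) - \cdots \]
With f(t) = cos nωt:
\[ F_1(t) = \frac{1}{n\omega}\sin n\omega t, \qquad F_2(t) = -\frac{1}{n^2\omega^2}\cos n\omega t, \qquad F_3(t) = -\frac{1}{n^3\omega^3}\sin n\omega t, \ \ldots \]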
Self-test 26.1
If P(t) is 2π-periodic, and P(t) = t² for 0 < t < 2π, use the Kronecker formula
to find the Fourier series of P(t).
Example 26.2 Find the Fourier series of the function P(t) shown in Fig. 26.4.
Fig. 26.4 The function P(t), with peak value π.
The period is 2π, so that ω = 2π/2π = 1. Choosing the interval −π to π as the basis of the
calculation yields
\[ P(t) = \begin{cases} -t & \text{if } -\pi \le t \le 0, \\ \phantom{-}t & \text{if } 0 \le t \le \pi. \end{cases} \]
The coefficients can be obtained from (26.11):
Coefficients bn. P(t) is an even function about the origin (see Section 15.9), and sin nt is
odd; therefore P(t) sin nt is odd. Hence the integrals defining bn are all zero:
bn = 0 (n = 1, 2, … ). (i)
Coefficients an. Since P(t) is even and cos nt is even, P(t) cos nt is even; so (26.11) gives
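\[ a_n = \frac{2}{\pi}\int_0^{\pi} P(t)\cos nt\,dt = \frac{2}{\pi}\int_0^{\pi} t\cos nt\,dt = \frac{2}{\pi}\left(\left[\frac{t\sin nt}{n}\right]_0^{\pi} - \int_0^{\pi}\frac{\sin nt}{n}\,dt\right), \tag{ii} \]
for n = 1, 2, … . The bracketed term is zero, and the remaining integral gives
an = (2/(πn²))(cos nπ − 1).
Therefore
\[ a_n = \begin{cases} 0 & \text{if } n \text{ is even}, \\ -4/(\pi n^2) & \text{if } n \text{ is odd}. \end{cases} \tag{iii} \]
For n = 0,
\[ a_0 = \frac{2}{\pi}\int_0^{\pi} t\,dt = \pi. \tag{iv} \]
Collect the coefficients from (i), (iii), and (iv) and put them back into the Fourier series:
\[ P(t) = \tfrac{1}{2}\pi - \frac{4}{\pi}\left(\frac{\cos t}{1^2} + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right). \]
In Fig. 26.5, we show how P(t) gradually takes shape as we take more and more
terms of the Fourier series in Example 26.2. Here
\[ P(t) = \tfrac{1}{2}\pi - \frac{4}{\pi}\left(\frac{\cos t}{1^2} + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right) \]
= 1.571 − 1.273 cos t − 0.141 cos 3t − 0.051 cos 5t − … .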
Fig. 26.5 (a) 1.571; (b) 1.571 − 1.273 cos t; (c) 1.571 – 1.273 cos t − 0.141 cos 3t;
(d) 1.571 − 1.273 cos t − 0.141 cos 3t − 0.051 cos 5t.
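The convergence pictured in Fig. 26.5 can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the names triangle_wave and partial_sum are illustrative, and the grid spacing of 0.01 is an arbitrary choice. It evaluates the partial sums of the series of Example 26.2 and prints, for several truncation levels, the largest deviation from P(t) on a grid over one period.

```python
import math

def triangle_wave(t):
    """P(t) of Example 26.2: |t| on [-pi, pi], repeated with period 2*pi."""
    t = (t + math.pi) % (2 * math.pi) - math.pi
    return abs(t)

def partial_sum(t, n_harmonics):
    """pi/2 - (4/pi) * sum of cos(n t)/n^2 over odd n <= n_harmonics."""
    total = math.pi / 2
    for n in range(1, n_harmonics + 1, 2):
        total -= (4 / math.pi) * math.cos(n * t) / n**2
    return total

# Largest deviation from P(t) on a grid over one period, for several truncations.
for n_max in (1, 3, 5, 51):
    worst = max(abs(partial_sum(k * 0.01, n_max) - triangle_wave(k * 0.01))
                for k in range(-314, 315))
    print(n_max, round(worst, 4))
```

The deviation is largest near the corners of the graph (t = 0 and t = ±π), which is where the partial sums in Fig. 26.5 are visibly slowest to settle.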
Example 26.3 Find the Fourier series for the function shown in Fig. 26.6.
The period is T = 2π, so that ω = 1 and the Fourier series is
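\[ P(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n\cos nt + b_n\sin nt). \]
Fig. 26.6
\[ P(t) = \begin{cases} t & (0 \le t \le \pi), \\ 0 & (\pi < t \le 2\pi). \end{cases} \]
\[ a_n = \frac{1}{\pi}\int_0^{2\pi} P(t)\cos nt\,dt = \frac{1}{\pi}\int_0^{\pi} t\cos nt\,dt. \]
\[ a_0 = \frac{1}{\pi}\int_0^{2\pi} P(t)\,dt = \frac{1}{\pi}\left[\tfrac{1}{2}t^2\right]_0^{\pi} = \tfrac{1}{2}\pi. \]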
Coefficient bn.
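\[ b_n = \frac{1}{\pi}\int_0^{2\pi} P(t)\sin nt\,dt = \frac{1}{\pi}\int_0^{\pi} t\sin nt\,dt = \frac{1}{\pi}\left[ -\frac{t\cos nt}{n} + \frac{1}{n^2}\sin nt \right]_0^{\pi} \]
\[ \phantom{b_n} = \frac{1}{\pi}\left[ -\frac{1}{n}(\pi\cos n\pi - 0) + \frac{1}{n^2}(0 - 0) \right] = -\frac{(-1)^n}{n}. \]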
The series is difficult to write out if the cosine and sine terms are kept together. By
separating them, we obtain
\[ P(t) = \tfrac{1}{4}\pi - \frac{2}{\pi}\left( \cos t + \frac{1}{3^2}\cos 3t + \frac{1}{5^2}\cos 5t + \cdots \right) + \left( \sin t - \tfrac{1}{2}\sin 2t + \tfrac{1}{3}\sin 3t - \cdots \right). \]
In Example 26.3, the function P(t) jumps from π to zero at the points
t = … , −π, π, 3π, … .
To see what values are generated by the series at such points put, say, t = π into the
series we obtained. All the cosine terms become (−1) and all the sine terms are
zero, so that at t = π the series delivers
\[ \tfrac{1}{4}\pi + \frac{2}{\pi}\left( 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots \right). \]
A few minutes with a calculator make it clear that this series for P(π) cannot
add up to π, and plainly it does not give zero either. In fact its sum is $\tfrac{1}{2}\pi$, half-way
between these values. The general rule is as follows.
The sum of a Fourier series at a jump is equal to the average of the two function
values on either side. This is written as
\[ \tfrac{1}{2}[x(t_0^-) + x(t_0^+)]. \qquad (26.12) \]
Figure 26.7 shows how the function is fitted by the series when the six terms up
to cos 3t and sin 3t are taken.
Fig. 26.7 The value $\tfrac{1}{2}\pi$ given by the series is marked at the jumps.
In (26.12), x(t 0−) and x(t 0+) are the left- and right-hand limits at the jump. As can
be seen in Fig. 26.7, x(π −) = π and x(π +) = 0 at the discontinuity at t = π.
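The rule (26.12) can be illustrated numerically for the series of Example 26.3. The sketch below is an illustration only (the function name series_26_3 and the truncation at 2000 terms are assumptions made here, not taken from the text). It sums the series at the jump t = π and at the point of continuity t = ½π; in both cases the value approached should be ½π, in the first case as the average of π and 0, and in the second because P(½π) = ½π.

```python
import math

def series_26_3(t, n_terms):
    """Partial sum of the Fourier series obtained in Example 26.3."""
    total = math.pi / 4
    for n in range(1, n_terms + 1):
        if n % 2 == 1:
            # cosine terms: -(2/pi) cos(n t)/n^2 for odd n only
            total -= (2 / math.pi) * math.cos(n * t) / n**2
        # sine terms: (-1)^(n+1) sin(n t)/n for every n
        total += (-1) ** (n + 1) * math.sin(n * t) / n
    return total

print("at the jump t = pi:  ", series_26_3(math.pi, 2000))
print("at t = pi/2:         ", series_26_3(math.pi / 2, 2000))
print("pi/2 for comparison: ", math.pi / 2)
```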
Self-test 26.2
Using Example 26.2, what is the sum of the series $1 + \dfrac{1}{3^2} + \dfrac{1}{5^2} + \cdots$?
26.5
USE OF SYMMETRY: SINE AND COSINE SERIES
Example 26.4 Obtain the Fourier series for the switching function P(t) shown
in Fig. 26.8.
The period T is 2, so that ω = π. Choose the basic interval to be t = −1 to 1. On this
interval,
\[ P(t) = \begin{cases} -1 & \text{for } -1 < t < 0, \\ \phantom{-}1 & \text{for } 0 < t < 1. \end{cases} \]
Since P(t) is odd about the origin,
a0 = a1 = a2 = ··· = 0.
For the bn, from (26.9), since the integrands are even functions,
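\[ b_n = \int_{-1}^{1} P(t)\sin n\pi t\,dt = 2\int_0^1 \sin n\pi t\,dt. \]
[Fig. 26.8: the switching function P(t) of Example 26.4. Fig. 26.9: the switching function P(t) of Example 26.5.]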
Example 26.5 Obtain the Fourier series for the switching function P(t) shown
in Fig. 26.9.
The period is 2, so that ω = π. Choose [−1, 1] as the representative interval; then
\[ P(t) = \begin{cases} 1 & \text{if } -\tfrac{1}{2} \le t \le \tfrac{1}{2}, \\ 0 & \text{elsewhere on the interval.} \end{cases} \]
Since P(t) is an even function, b1 = b2 = b3 = ··· = 0. The coefficients an are given by
\[ a_n = \int_{-1}^{1} P(t)\cos n\pi t\,dt = \int_{-1/2}^{1/2} \cos n\pi t\,dt = \frac{1}{n\pi}\left[\sin n\pi t\right]_{-1/2}^{1/2} = \frac{2}{n\pi}\sin\tfrac{1}{2}n\pi. \]
As we have seen before, a0 gives trouble since this formula is meaningless when n = 0.
We have, in fact,
\[ a_0 = \int_{-1/2}^{1/2} 1\,dt = 1. \]
Then
a0 = 1, a1 = 2/π, a2 = 0, a3 = −2/(3π), a4 = 0, …,
so that the odd-order coefficients alternate in sign.
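As a check on the coefficients of Example 26.5, the integrals can be estimated numerically and compared with the closed form (2/nπ) sin ½nπ, with a0 = 1 handled separately. This is a sketch under stated assumptions: the names top_hat, a_numeric and a_formula are invented for the illustration, and the midpoint rule with 20 000 steps is simply a convenient choice.

```python
import math

def top_hat(t):
    """One period of P(t) in Example 26.5, on the interval [-1, 1]."""
    return 1.0 if abs(t) <= 0.5 else 0.0

def a_numeric(n, steps=20000):
    """Midpoint-rule estimate of a_n = integral over [-1, 1] of P(t) cos(n pi t)."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        t = -1.0 + (k + 0.5) * h
        total += top_hat(t) * math.cos(n * math.pi * t)
    return total * h

def a_formula(n):
    """Closed form from Example 26.5; n = 0 is the separate case a_0 = 1."""
    return 1.0 if n == 0 else 2.0 / (n * math.pi) * math.sin(n * math.pi / 2)

for n in range(6):
    print(n, round(a_numeric(n), 6), round(a_formula(n), 6))
```

The printed pairs should agree closely, with the even-order coefficients (n ≥ 2) vanishing and the odd-order ones alternating in sign.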
[Fig. 26.10: (a) f(t); (b) the odd extension fs(t); (c) the even extension fc(t).]
26.6
In Fig. 26.10b we have extended the non-periodic function f(t) = t on 0 ≤ t ≤ π
to an artificial function fs(t) which has period 2π and is an odd function. Being
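Example 26.6 Obtain a Fourier sine series for f(t) = t on the interval 0 ≤ t ≤ π.
Extend f(t) on 0 ≤ t ≤ π as an odd function fs(t) with period 2π (not π) as shown in
Fig. 26.10b. Then ω = 1 in (26.9). Choose the interval −π to π as basic. Then since fs(t)
is odd, we know in advance from (26.13) that
\[ f_s(t) = \sum_{n=1}^{\infty} b_n \sin nt, \]
where
\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f_s(t)\sin nt\,dt \]
\[ \phantom{b_n} = \frac{2}{\pi}\int_0^{\pi} f_s(t)\sin nt\,dt \quad (\text{since } f_s(t) \text{ is odd; see (15.17)}) \]
\[ \phantom{b_n} = \frac{2}{\pi}\int_0^{\pi} f(t)\sin nt\,dt \quad (\text{since } f_s(t) \text{ agrees with } f(t) \text{ on } 0 \le t \le \pi) \]
\[ \phantom{b_n} = \frac{2}{\pi}\int_0^{\pi} t\sin nt\,dt = \frac{2}{\pi}\left[ -\frac{1}{n}t\cos nt + \frac{1}{n^2}\sin nt \right]_0^{\pi} = \frac{2}{\pi}\left( -\frac{\pi}{n}\cos n\pi \right) = \frac{2}{n}(-1)^{n+1}. \]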
where
\[ b_n = \frac{1}{t_0}\int_{-t_0}^{t_0} f_s(t)\sin\frac{n\pi t}{t_0}\,dt = \frac{2}{t_0}\int_0^{t_0} f(t)\sin\frac{n\pi t}{t_0}\,dt, \]
Self-test 26.3
Obtain the sine series expansion $\sum_{n=1}^{\infty} b_n \sin nt$ which represents cos t over the
restricted interval 0 < t < π.
26.7
Suppose that P(t) is a periodic function with period T. The Fourier series has
the form
then the series for P(t − t0), whose graph is the same shape as P(t) but moved to the
right a distance t0, is
\[ P(t - t_0) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} c_n \cos[n\omega(t - t_0) + \phi_n]. \]
Fig. 26.12 (See Example 26.5.) (a) $P(t) = \tfrac{1}{2} + (2/\pi)(\cos \pi t - \tfrac{1}{3}\cos 3\pi t + \cdots)$.
(b) Spectral components $\tfrac{1}{2}$, 2/π, 2/(3π), 2/(5π), … .
The cn remain the same, and only the phase angle changes. Therefore it is only the
shape of P(t) which determines its spectrum, not its clock-timing. For this reason
[Fig. 26.13: P(t). Fig. 26.14: Q(t).]
Example 26.7 Find the Fourier expansion of the function Q(t) shown in Fig. 26.14.
This is the same as Fig. 26.13 except that the vertical dimension is reduced by a factor
1/π. Therefore, from (26.16),
\[ Q(t) = \tfrac{1}{2} - \frac{4}{\pi^2}\left( \cos t + \frac{1}{3^2}\cos 3t + \cdots \right). \]
Example 26.8 Find the Fourier expansion of the function Q(t) shown in Fig. 26.15.
Here the t scale is changed by a factor π. We obtain, from (26.16),
\[ Q(t) = \tfrac{1}{2}\pi - \frac{4}{\pi}\left( \cos \pi t + \frac{1}{3^2}\cos 3\pi t + \frac{1}{5^2}\cos 5\pi t + \cdots \right). \]
It is necessary to be careful here: it is not t/π but πt in the new series. Check the period:
it is equal to 2, which is correct.
Self-test 26.4
It was shown in Example 26.6 that
\[ t = 2\sum_{n=1}^{\infty} \frac{(-1)^{n+1}\sin nt}{n} \qquad (0 \le t < \pi). \]
By integrating both sides of the equation over an interval (0, τ), obtain the
Fourier cosine series for τ² (0 ≤ τ ≤ π).
(b) Coefficients
\[ b_n = 2f_0\int_{\text{Period}} x_P(t)\sin 2\pi n f_0 t\,dt. \qquad (26.17) \]
We shall now show that (26.17) may be reorganized into another shape, as
follows:
(b) Coefficients Xn
\[ X_n = f_0\int_{\text{Period}} x_P(t)\,e^{-i2\pi n f_0 t}\,dt. \qquad (26.18) \]
The coefficients are in general complex even if xP(t) is real, and the series runs
from n = −∞ to n = ∞.
To prove (26.18) we shall work backwards from it to arrive at (26.17). Start with
(26.18a):
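\[ x_P(t) = X_0 + \sum_{n=1}^{\infty} X_n e^{i2\pi n f_0 t} + \sum_{n=-\infty}^{-1} X_n e^{i2\pi n f_0 t} \]
\[ \phantom{x_P(t)} = X_0 + \sum_{n=1}^{\infty} X_n e^{i2\pi n f_0 t} + \sum_{n=1}^{\infty} X_{-n} e^{-i2\pi n f_0 t}, \qquad (26.19) \]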
\[ X_n = f_0\int_{\text{Period}} x_P(t)[\cos 2\pi n f_0 t - i\sin 2\pi n f_0 t]\,dt = \tfrac{1}{2}(a_n - ib_n), \qquad (26.20) \]
where
\[ a_n = 2f_0\int_{\text{Period}} x_P(t)\cos 2\pi n f_0 t\,dt, \]
Therefore
a−n = an and b−n = −bn. (26.22)
It follows from (26.20) and (26.22) that when n > 0, as in the sums (26.19),
\[ X_n = \tfrac{1}{2}(a_n - ib_n), \qquad X_{-n} = \tfrac{1}{2}(a_{-n} - ib_{-n}) = \tfrac{1}{2}(a_n + ib_n), \qquad (26.23) \]
where an and bn are the same numbers as the coefficients in the original series (26.17).
Finally, (26.19) becomes
\[ x_P(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} \left[ \tfrac{1}{2}(a_n - ib_n)\,e^{i2\pi n f_0 t} + \tfrac{1}{2}(a_n + ib_n)\,e^{-i2\pi n f_0 t} \right]. \]
After using Euler’s formula (6.8) for the exponentials, and carrying out the
multiplications, the terms in which i appears cancel, and we are left with
\[ x_P(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n\cos 2\pi n f_0 t + b_n\sin 2\pi n f_0 t), \]
which is the original form (26.17a). (Since xP(t) may be complex, so may an and bn,
so we should not shorten the final calculation by taking twice the real part
of $\tfrac{1}{2}(a_n - ib_n)\,e^{i2\pi n f_0 t}$.)
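The relation (26.23) between the two-sided coefficients and the one-sided ones can be checked numerically. The sketch below is illustrative only: it takes the function of Example 26.3 (period 2π, equal to t on 0 ≤ t ≤ π and zero on π < t ≤ 2π) as a test case, and the names x_p, X, a and b are assumptions introduced for the example. The integral defining Xn is estimated by the midpoint rule and compared with ½(an ∓ ibn), using the coefficients found in Example 26.3.

```python
import cmath
import math

T = 2 * math.pi          # period of the test function
f0 = 1 / T               # fundamental frequency

def x_p(t):
    """Function of Example 26.3: t on [0, pi], zero on (pi, 2*pi), period 2*pi."""
    t = t % T
    return t if t <= math.pi else 0.0

def X(n, steps=20000):
    """Midpoint estimate of X_n = f0 * integral over one period of x_p(t) e^{-i 2 pi n f0 t}."""
    h = T / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += x_p(t) * cmath.exp(-2j * math.pi * n * f0 * t)
    return f0 * h * total

def a(n):
    """a_n from Example 26.3, valid for n >= 1."""
    return 0.0 if n % 2 == 0 else -2.0 / (math.pi * n**2)

def b(n):
    """b_n from Example 26.3, valid for n >= 1."""
    return (-1) ** (n + 1) / n

for n in range(1, 5):
    print(n, X(n), 0.5 * (a(n) - 1j * b(n)))    # X_n against (a_n - i b_n)/2
    print(-n, X(-n), 0.5 * (a(n) + 1j * b(n)))  # X_{-n} against (a_n + i b_n)/2
```

For this real test function the pairs Xn and X−n come out as complex conjugates, as (26.23) requires when an and bn are real.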
The following properties sometimes save calculation:
Example 26.10 Obtain the two-sided Fourier series for the function xP(t),
having period T, of which a single period is shown in Fig. 26.17.
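In (26.18) f0 = 1/T. Therefore
\[ X_n = \frac{1}{T}\int_{-T/2}^{T/2} x_P(t)\,e^{-i2\pi nt/T}\,dt = \frac{1}{T}\int_{-\tau/2}^{\tau/2} e^{-i2\pi nt/T}\,dt \]
\[ \phantom{X_n} = \frac{1}{T}\,\frac{T}{(-i2\pi n)}\left[ e^{-i2\pi nt/T} \right]_{-\tau/2}^{\tau/2} = \frac{-i}{2\pi n}\left( e^{i\pi n\tau/T} - e^{-i\pi n\tau/T} \right) \]
\[ \phantom{X_n} = \frac{1}{\pi n}\sin(\pi n\tau/T) \quad (\text{from (6.10)}). \]
Fig. 26.17 One period of xP(t): a pulse of unit height on −½τ ≤ t ≤ ½τ, within −½T ≤ t ≤ ½T.
Finally
\[ x_P(t) = \sum_{n=-\infty}^{\infty} \frac{1}{\pi n}\sin\frac{\pi n\tau}{T}\;e^{i2\pi nt/T}. \]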
Problems
\[ \sum_{n=1}^{\infty} \frac{n + a}{n^3 + an + 3}\sin nt, \]
where a is a design parameter in the system. Find a in order that the leading harmonics n = 1 and n = 2 have amplitudes in the ratio 2 : 1. What is the amplitude of the next harmonic?

26.8 The two 2π-periodic signals shown in Fig. 26.18 are added. Find the Fourier series of the combined signal. What value should F take in order that the leading harmonic should disappear?

[Fig. 26.18: two 2π-periodic signals shown on −π ≤ t ≤ π: (a) with values between −F and F; (b) with values between −1 and 1.]

26.9 A T-periodic function is defined by
\[ Q(t) = \tfrac{1}{4}T^2 - t^2 \quad \text{for } -\tfrac{1}{2}T \le t \le \tfrac{1}{2}T. \]
Find the Fourier series of Q(t). What is the error between the sum of the first four terms of the series and Q(t) at (a) t = 0, (b) t = ¼T?

series of f′(t).
Consider now the function g(t) = t³ defined for −π < t < π. Find the Fourier series of g(t) and g′(t). Confirm that the derivative of the Fourier series of g(t) is not the same series as the Fourier series of g′(t).
Comparing the functions f(t) and g(t), what feature of g(t) do you think causes the problem with its differentiated Fourier series?

26.12 Sketch the wave defined by
\[ P(t) = \begin{cases} 0 & (-\pi < t \le 0), \\ |\sin 2t| & (0 < t \le \pi), \end{cases} \]
extended so as to have period 2π. Find its Fourier series. (The identities
\[ \sin A\cos B = \tfrac{1}{2}[\sin(A + B) + \sin(A - B)], \]
\[ \sin A\sin B = \tfrac{1}{2}[-\cos(A + B) + \cos(A - B)], \]
will be needed.)

26.13 Show that
\[ t = 2\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}\sin nt \]
for −π < t < π. Integrate the terms from t = 0 to t = x, and rearrange them to show that
\[ x^2 = 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2} - 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2}\cos nx. \]
Now use (26.10) to establish the value of the constant term in this Fourier series.

26.14 From Problem 26.13, or by direct means, obtain the Fourier series valid for −π ≤ t ≤ π:
\[ t^2 = \tfrac{1}{3}\pi^2 + 4\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2}\cos nt. \]
half-range cosine series. Sketch the sum of the series on −∞ < t < ∞.

26.19 Express f(t) = cos ωt on 0 < t < π/ω as a half-range sine series. Sketch the sum of the series on −∞ < t < ∞.

26.20 Express f(t) = cos t on 0 < t < 2π as a half-range sine series.

26.21 Express f(t) = cos t on 0 < t < 2π as a half-range cosine series.

26.22 Express the function f(t), for 0 < t < π, (a) as a half-range sine series, (b) as a half-range cosine series:
\[ f(t) = \begin{cases} 1 & (0 < t < \tfrac{1}{2}\pi), \\ 0 & (\tfrac{1}{2}\pi < t < \pi). \end{cases} \]

26.23 The Fourier series for the function P(t), period 2π, given by
\[ P(t) = \begin{cases} -t & (-\pi \le t \le 0), \\ \phantom{-}t & (0 \le t \le \pi), \end{cases} \]
is
\[ P(t) = \tfrac{1}{2}\pi - \frac{4}{\pi}\left( \frac{\cos t}{1^2} + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots \right) \]
(see Example 26.2). Deduce from this the Fourier expansions of the following periodic functions.
(a) Q(t), period 4, where
\[ Q(t) = \begin{cases} -3t & (-2 \le t \le 0), \\ \phantom{-}3t & (0 \le t \le 2); \end{cases} \]
(b) R(t), period 2, where
\[ R(t) = \begin{cases} 1 + t & (-1 \le t \le 0), \\ 1 - t & (0 \le t \le 1). \end{cases} \]
(Sketch R(t) to understand the connection with P(t).)
(c) Check that P(t), Q(t), R(t) have similar spectra.

26.24 The Fourier series for the function P(t), period 2, given by
\[ Q(t) = \begin{cases} -a & (0 < t < \tfrac{1}{2}T), \\ \phantom{-}a & (\tfrac{1}{2}T < t < T). \end{cases} \]

26.25 Find the Fourier series of the 2π-periodic sawtooth wave defined by
\[ f(t) = t \quad (-\pi < t \le \pi). \]
Determine the forced part of the solution of the second-order differential equation
\[ \frac{d^2x}{dt^2} + \Omega^2 x = K\sin\omega t, \]
where ω ≠ ±Ω. Hence put together the periodic output of the forced system
\[ \frac{d^2x}{dt^2} + \Omega^2 x = f(t), \]
where f(t) is the sawtooth wave above. For what values of Ω does the system exhibit resonance?

26.26 The Fourier series of a function with period T is given by
\[ f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n\cos n\omega t + b_n\sin n\omega t), \]
where T = 2π/ω. Multiply both sides of the equation by f(t) and integrate between −½T and ½T, to obtain Parseval’s identity
\[ \frac{2}{T}\int_{-T/2}^{T/2} f(t)^2\,dt = \tfrac{1}{2}a_0^2 + \sum_{n=1}^{\infty} (a_n^2 + b_n^2). \]
(a) Let T = π, and
\[ f(t) = \begin{cases} -1 & (-\tfrac{1}{2}\pi < t < 0), \\ \phantom{-}1 & (0 < t < \tfrac{1}{2}\pi). \end{cases} \]
Show that
\[ \sum_{n=0}^{\infty} \frac{1}{(2n + 1)^2} = \frac{\pi^2}{8}. \]
(b) Let f(t) = t (−π < t < π) be a 2π-periodic function. Find its Fourier series (see Problem 26.1b), and deduce the corresponding Parseval identity.
26.27 The function f(t) with period T has the Fourier series
\[ f(t) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n\cos n\omega t + b_n\sin n\omega t). \]
Find the Laplace transform of the function as the sum of a series of Laplace transforms of the trigonometric terms. Hence find the Laplace transform of the 2π-periodic function defined by
\[ f(t) = \begin{cases} -t^2 & (-\pi < t \le 0), \\ \phantom{-}t^2 & (0 < t \le \pi). \end{cases} \]
(See Problem 26.1c.)

(b) Let ω = p/q and ω0 = r/s, where p, q, r, s are whole numbers such that no two of them have a common divisor other than 1. What is the period? Express x(t) as the sum of two waves with angular frequencies ω ± ω0 (these are called the sidebands). What is the Fourier cosine expansion based on this period?
(c) If you know about irrational numbers, show that $x_1(t) = \cos t\cos\sqrt{2}\,t$ never repeats itself exactly: it is not periodic.

26.29 (a) Prove that
CONTENTS
Fourier series are used to express functions defined over a finite range as a periodic
series of harmonic terms. Fourier integrals, which are the subject of this
chapter, are used to describe non-periodic functions over an infinite range. For the
infinite interval t ≥ 0 there exist cosine or sine transforms, and corresponding
inverse transforms. We shall treat these as intuitive extensions of the corresponding
Fourier series to functions having an infinite period. For functions arbitrary
over the two-sided infinite interval −∞ < t < ∞ there exist (subject to certain technical
limitations) the complex exponential transform and its inverse, which we
formulate as a combination of cosine/sine transforms.
The strict mathematical arguments necessary to prove the results go far beyond
the scope of this book, so intuitive justification is used freely. There are some
apparent restrictions. For example, convergence of the integrals (see Section 15.6,
on improper integrals) requires the functions concerned to approach zero as the
variable t approaches ±∞. This restriction would, for instance, rule out considera-
tion of periodic functions. However, in some circumstances restrictions can safely
be disregarded as illustrated in the later sections of this chapter. (There exists a
sophisticated theory of generalized functions regulating such liberties.)
27.1
Figure 27.1 shows examples of functions that are not periodic. Non-periodic
functions can still be expressed in terms of harmonic functions (sines, cosines, and
[Fig. 27.1: examples of non-periodic functions x(t).]
A full derivation of such results is too complicated to give in this book, but
representation by a continuous distribution of frequencies can be made plausible
by regarding a non-periodic function as the limit of a periodic function as it
approaches an infinite period. To illustrate this idea we shall consider a simple case.
Let pO(t) be a real-valued function for −∞ < t < ∞, periodic with period T, and
odd (i.e. pO(−t) = −pO(t)). Further, suppose that it consists of a stream of discrete,
equally spaced ‘pulses’, each of duration τ < T, and has the value zero between
them, as illustrated in Fig. 27.2. The function pO(t) can be represented by a Fourier
sine series (see Section 26.6). Up to this point we have expressed Fourier series in
terms of the circular frequency ω = (2π/T), but here we shall use the fundamental
frequency f0 = 1/T instead. Then (26.15b) becomes
\[ p_O(t) = \sum_{n=0}^{\infty} b_n \sin(2\pi n f_0 t), \qquad (27.1) \]
where
\[ b_n = 2f_0\int_{-T/2}^{T/2} p_O(t)\sin(2\pi n f_0 t)\,dt = 4f_0\int_0^{T/2} p_O(t)\sin(2\pi n f_0 t)\,dt \qquad (27.2) \]
Fig. 27.2 The odd periodic pulse train pO(t), with period T and pulse duration τ.
We shall seek a representation of the fixed single pulse present in the interval
−½T ≤ t ≤ ½T by letting T → ∞. This pushes away to infinity the periodic copies
of the central pulse, whilst leaving the central pulse unaffected. (In a physical context,
as in passing a solitary pulse through an electrical filter, we should be likely
to disregard extraneous pulses that arrive only every hour, or every month, or
every century, as the period T is taken larger and larger.)
The series (27.1) becomes increasingly intractable when T ≫ τ (meaning T ‘is
much greater than’ τ); too many terms have to be taken in order to get a reasonable
approximation to pO(t). However, we can recast the series as an integral, and
this problem disappears. Write (27.2) in the form
1
2T
bn
27
When T → ∞ so that δf → 0 the periodic copies of the central pulse are consigned
to infinity, and we are left with a solitary pulse x(t) given by
\[ x(t) = \begin{cases} p_O(t), & -\tfrac{1}{2}\tau \le t \le \tfrac{1}{2}\tau; \\ 0, & \text{elsewhere.} \end{cases} \]
At the same time (by eqn (15.9)) the sum in (27.4) approaches an infinite integral,
as does the finite integral in (27.3). We then have the symmetrical pair of relations:
where
∞
The function Xs(f) is called the Fourier sine transform of x(t), or the spectral
density or frequency distribution function corresponding to x(t), in the context
of sine transforms. All positive frequencies are represented. Notice that (27.5a)
automatically defines x(t) as an odd function if the context demands that we be
concerned with the time range −∞ < t < ∞. The normal use for the sine transform
is, however, for t ≥ 0 only.
If we start with an even periodic chain of pulses pE(t) and its Fourier cosine
series, we arrive similarly at the cosine transform pair:
27.1
Fourier cosine transform
where
∞
The equations (27.5a) and (27.6a) are also known as the inverse transforms
of Xs(f ) and Xc(f ). As with Laplace transforms, they solve the problem: ‘given a
frequency distribution, obtain the corresponding time function’.
To arrive at the sine and cosine equations we assumed that the signal x(t) consists
of a pulse of finite extent τ. However, the results are true for suitably behaved
functions having infinite extent, from t = 0 to t = ∞ (or from t = −∞ to ∞ provided
that they are appropriately odd or even functions). Thus $x(t) = e^{-t}$ for t ≥ 0 has
both a sine and a cosine transform for t ≥ 0.
As in eqn (26.12) for Fourier series, the value attributed to x(t) by (27.5) and
(27.6) is the average of its values on either side of a jump discontinuity at t = t0:
\[ x(t) = \tfrac{1}{2}[x(t_0^-) + x(t_0^+)] \qquad (27.7) \]
Example 27.1 (a) Obtain the cosine transform of the function x(t) given by
\[ x(t) = \begin{cases} 1, & 0 \le t \le 1, \\ 0, & t > 1, \end{cases} \]
(see Fig. 27.3a) and write down the inverse transform (without evaluating it).
(b) Deduce that
\[ \int_0^{\infty} \frac{\sin u}{u}\,du = \tfrac{1}{2}\pi. \]
(c) Show that the value attributed to x(1), a point of discontinuity of x(t),
conforms with eqn (27.7).
Fig. 27.3
−∞ < t < ∞, so that
\[ x(0) = 2\int_0^{\infty} \frac{\sin 2\pi f}{\pi f}\,df = 1. \qquad \text{(ii)} \]
Substitute u = 2πf, df/f = du/u, and we obtain from (ii) the standard integral
\[ \int_0^{\infty} \frac{\sin u}{u}\,du = \tfrac{1}{2}\pi. \qquad \text{(iii)} \]
(c) The point t = 1 marks a jump in value of x(t) from 1 to 0. Equation (27.7) predicts that
the integral (i) will deliver the value $x(1) = \tfrac{1}{2}(1 + 0) = \tfrac{1}{2}$. To confirm this, put t = 1 in eqn (i):
\[ 2\int_0^{\infty} \frac{\sin 2\pi f}{\pi f}\cos 2\pi f\,df = \int_0^{\infty} \frac{\sin 4\pi f}{\pi f}\,df. \]
Self-test 27.1
Find the Fourier cosine transform of $x(t) = e^{-t}$, and deduce from its inverse
that
\[ \int_0^{\infty} \frac{\cos u}{u^2 + t^2}\,du = \frac{\pi}{2t}\,e^{-t}. \]
27.2
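x(t) is any well-behaved real or complex function on −∞ < t < ∞ such that
$\int_{-\infty}^{\infty} |x(t)|\,dt$ exists. Then at points of continuity of x(t)
\[ \text{(a)}\quad x(t) = \int_{-\infty}^{\infty} X(f)\,e^{2\pi i f t}\,df, \]
where
\[ \text{(b)}\quad X(f) = \int_{-\infty}^{\infty} x(t)\,e^{-2\pi i f t}\,dt. \qquad (27.8) \]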
These formulae closely resemble eqns (26.18) for the two-sided (complex) Fourier
series, and it is possible to calculate the transition from (26.18) to the exponential
transform by the procedure described in the previous section. Alternatively, (27.8)
can be obtained from the sine and cosine transforms.
The condition that $\int_{-\infty}^{\infty} |x(t)|\,dt$ should exist (i.e. converge) appears to be rather
restrictive. For example, any function which does not tend to zero as t → ±∞ is
suspect. Besides functions like t and $e^t$, this condition would disqualify all periodic
functions such as sin ωt. The imprecise term ‘well-behaved’ in (27.8) implies
further unspecified restrictions. Here we shall only say that if $\int_{-\infty}^{\infty} |x(t)|\,dt$ exists,
the only exclusions are functions having a degree of eccentricity rarely encountered
in physical applications. Simple jump discontinuities in the value of x(t) are
allowed, and, as with Fourier series, eqn (27.8a) delivers a value at such points
equal to the average of the values of x(t) on either side of the jump. If there is a
jump at t = t0, then (as in (26.12))
\[ x(t) = \tfrac{1}{2}[x(t_0^-) + x(t_0^+)]. \qquad (27.9) \]
The scope of the Fourier transform is not paralysed by the restrictions, and the
examples in this chapter will show that the system is far more flexible than this
discussion might suggest.
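\[ X(f) = \int_{-\infty}^{\infty} x(t)\,e^{-2\pi i f t}\,dt = \int_{-\tau/2}^{\tau/2} e^{-2\pi i f t}\,dt \]
\[ \phantom{X(f)} = \frac{1}{-2\pi i f}\left[ e^{-2\pi i f t} \right]_{-\tau/2}^{\tau/2} = \frac{i}{2\pi f}\left( e^{-\pi i f\tau} - e^{\pi i f\tau} \right) \]
\[ \phantom{X(f)} = \frac{i}{2\pi f}(-2i)\sin \pi f\tau = \frac{1}{\pi f}\sin \pi f\tau. \]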
Fig. 27.4 (a) x(t); (b) X(f).
\[ x(t) = \int_{-\infty}^{\infty} X(f)\,e^{2\pi i f t}\,df = \int_{-\infty}^{\infty} \frac{\sin \pi f\tau}{\pi f}\,e^{2\pi i f t}\,df. \]
Figure 27.4b shows the frequency distribution, which in this case is a real function.
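The closed form sin(πfτ)/(πf) for the transform of the rectangular pulse can be confirmed by evaluating the defining integral numerically. The following Python sketch is an illustration under stated assumptions: the function names, the pulse width τ = 2 and the sample frequencies are all arbitrary choices made here.

```python
import cmath
import math

def transform_top_hat(f, tau, steps=20000):
    """Midpoint estimate of the integral of exp(-2 pi i f t) over [-tau/2, tau/2]."""
    h = tau / steps
    total = 0j
    for k in range(steps):
        t = -tau / 2 + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * f * t)
    return total * h

def closed_form(f, tau):
    """sin(pi f tau)/(pi f), with the limiting value tau at f = 0."""
    return tau if f == 0 else math.sin(math.pi * f * tau) / (math.pi * f)

tau = 2.0
for f in (0.0, 0.3, 1.0, 2.5):
    print(f, transform_top_hat(f, tau), closed_form(f, tau))
```

The imaginary part of the numerical result is negligible, in line with the remark that the frequency distribution is real in this case.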
\[ \text{(a)}\quad F[x(t)] = \int_{-\infty}^{\infty} x(t)\,e^{-2\pi i f t}\,dt \qquad (27.11a) \]
\[ F^{-1}[X(f)] = \int_{-\infty}^{\infty} X(f)\,e^{2\pi i f t}\,df = x(t). \qquad (27.11b) \]
27.4
one has been adopted in a particular piece of work. For example, the Fourier
cosine transform and its inverse may appear in the form
\[ G_c(\omega) = \frac{2}{\pi}\int_0^{\infty} x(t)\cos \omega t\,dt, \qquad x(t) = \int_0^{\infty} G_c(\omega)\cos \omega t\,d\omega, \qquad (27.12) \]
[Figure: the top-hat functions Π(t), on −½ ≤ t ≤ ½, and Π(t/τ), on −½τ ≤ t ≤ ½τ.]
Top-hat function
\[ \text{(a)}\quad F[\Pi(t)] = \frac{\sin \pi f}{\pi f}. \]
\[ \text{(b)}\quad F[\Pi(t/\tau)] = \frac{\sin(\pi f\tau)}{\pi f}. \qquad (27.13) \]
Fig. 27.6 sinc x = sin(πx)/(πx).
Its graph is shown in Fig. 27.6. It is an even function, and it can be shown that the
signed area under the curve is equal to unity (see, e.g. Example 27.1.ii).
\[ \text{(b)}\quad \int_{-\infty}^{\infty} \operatorname{sinc} x\,dx = 1. \qquad (27.14) \]
For the Fourier transform of sinc t, start with (27.15b). Since τ sinc τ f = F [Π(t/τ )],
it follows that
∞
Interchange the letters t and f, take the complex conjugate of the result to make
the sign in the exponential negative, and put 1 /τ in place of τ. We obtain
\[ \Pi(\tau f) = \int_{-\infty}^{\infty} \frac{1}{\tau}\operatorname{sinc}\frac{t}{\tau}\;e^{-i2\pi f t}\,dt. \]
Multiply through by τ to obtain the results:
Equations (27.15) and (27.16) illustrate a general fact: that as the duration of
a signal increases (e.g. as τ increases in (27.15b)), the effective frequency range
tends to become narrower, and conversely.
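Fig. 27.7 (a) $x(t) = e^{-\alpha t}H(t)$. (b) $X(f) = F[e^{-\alpha t}H(t)]$ (real and imaginary parts).
\[ F[x(t)] = \int_{-\infty}^{\infty} x(t)\,e^{-i2\pi f t}\,dt = \int_0^{\infty} e^{-\alpha t}e^{-i2\pi f t}\,dt \]
\[ \phantom{F[x(t)]} = \int_0^{\infty} e^{-(\alpha + i2\pi f)t}\,dt = \frac{-1}{\alpha + i2\pi f}\left[ e^{-(\alpha + i2\pi f)t} \right]_0^{\infty} = \frac{1}{\alpha + i2\pi f}. \]
Therefore
\[ F[e^{-\alpha t}H(t)] = \frac{1}{\alpha + i2\pi f}. \qquad (27.17) \]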
Since x(t) is neither even nor odd the spectral distribution is a complex function.
Its real and imaginary parts are shown in Fig. 27.7b.
Example 27.3 Find the Fourier transform of the function given by $x(t) = e^{-|t|}$
(see Fig. 27.8a)
Fig. 27.8 (a) x(t); (b) X(f).
We have
\[ X(f) = \int_{-\infty}^{\infty} e^{-|t|}e^{-i2\pi f t}\,dt = \int_{-\infty}^{0} e^{t}e^{-i2\pi f t}\,dt + \int_0^{\infty} e^{-t}e^{-i2\pi f t}\,dt \]
\[ \phantom{X(f)} = \frac{1}{1 - i2\pi f}\left[ e^{(1 - i2\pi f)t} \right]_{-\infty}^{0} + \frac{(-1)}{1 + i2\pi f}\left[ e^{-(1 + i2\pi f)t} \right]_0^{\infty} \]
\[ \phantom{X(f)} = \frac{1}{1 - i2\pi f} + \frac{1}{1 + i2\pi f} = \frac{2}{1 + 4\pi^2 f^2}. \]
This function is shown in Fig. 27.8b.
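The result of Example 27.3 lends itself to a direct numerical check. In the sketch below (illustrative only; the helper fourier_transform, the truncation of the infinite range to |t| ≤ 40 and the step count are assumptions of this example) the defining integral is approximated by the midpoint rule and compared with 2/(1 + 4π²f²).

```python
import cmath
import math

def fourier_transform(x, f, half_width=40.0, steps=80000):
    """Midpoint estimate of the integral of x(t) exp(-i 2 pi f t) over [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        t = -half_width + (k + 0.5) * h
        total += x(t) * cmath.exp(-2j * math.pi * f * t)
    return total * h

def x(t):
    """The signal of Example 27.3."""
    return math.exp(-abs(t))

def closed_form(f):
    """Transform obtained in Example 27.3."""
    return 2.0 / (1.0 + 4.0 * math.pi**2 * f**2)

for f in (0.0, 0.25, 1.0):
    print(f, fourier_transform(x, f), closed_form(f))
```

Truncating the range is harmless here because the integrand is smaller than e⁻⁴⁰ outside it; for slowly decaying signals a much wider range, or a different method, would be needed.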
Rule                     Signal x(t)              Transform X(f) = F[x(t)]
(a) Linearity            Ax1(t) + Bx2(t)          AX1(f) + BX2(f)
(b) Time scaling         x(At)                    X(f/A)/|A|
    Time reversal        x(−t)                    X(−f)
(c) Time delay           x(t − B)                 X(f) e^{−i2πBf}
(d) Frequency scaling    x(t/C)/|C|               X(Cf)
(e) Frequency shift      x(t) e^{i2πDt}           X(f − D)
(f) Modulation           x(t) cos 2πKt            [X(f − K) + X(f + K)]/2
                         x(t) sin 2πKt            [X(f − K) − X(f + K)]/(2i)
(g) Duality              X(t)                     x(−f)
(h) Differentiation      dx(t)/dt                 (i2πf) X(f)
                         d^n x(t)/dt^n            (i2πf)^n X(f)
(27.18)
27.5
From the time-scaling rule (27.18b),
F[Π(at)] = (1/|a|) sinc(f/a).
Example 27.6 Obtain the signal x(t) produced by the spectral distribution X( f )
shown in Fig. 27.9.
Fig. 27.9 X(f): two rectangular pulses of unit height, centred at f = ±2.
The two rectangular pulses are arrived at by extending the range of Π(f) by a factor 2
to give Π(f/2), then shifting this graph along the f axis a distance 2 to the left and 2 to the
right to give
\[ X(f) = \Pi(\tfrac{1}{2}\{f + 2\}) + \Pi(\tfrac{1}{2}\{f - 2\}). \]
From the frequency-scaling rule (27.18d) with C = ½,
\[ \Pi(\tfrac{1}{2}f) \leftrightarrow 2\operatorname{sinc} 2t. \]
Then, by the frequency-shift rule (27.18e), with D = ±2,
\[ X(f) \leftrightarrow (e^{i4\pi t} + e^{-i4\pi t})\cdot 2\operatorname{sinc} 2t = 4\cos 4\pi t\,\operatorname{sinc} 2t. \]
(The modulation rule (27.18f) could have been adopted for the final stage instead.)
Example 27.8 (Sidebands) The voltage signal x(t) = v(t) cos 2πf0t represents an
audiofrequency signal v(t) used to modulate a carrier wave of high frequency
f0. Suppose that F[v(t)] = V(f), where f lies in the range −fm ≤ f ≤ fm, with fm ≪ f0. Use
the modulation formula (27.18f) to illustrate the general nature of the spectral
distribution function X(f) = F[x(t)].
From (27.18f)
\[ X(f) = \tfrac{1}{2}[V(f - f_0) + V(f + f_0)]. \qquad \text{(i)} \]
The intervals (ii) and (iii) do not overlap, since fm ≪ f0. Therefore the spectral
distribution (i) falls into two separate parts on opposite sides of the origin of f, as in
Fig. 27.10. They are related to the sidebands of communication engineering. The two
parts have the same shape, since their graphs consist of the graph of V(f ) moved through
distances ± f0. (In general they would be complex, and even if they are real they will not
generally correspond to two real signals.)
Fig. 27.10 (a) Spectral distribution of v(t). (b) Spectral distribution of v(t) cos 2πf0t.
Self-test 27.2
(a) Prove that F[dx/dt] = i2πf X(f), and F[d²x/dt²] = −4π²f²X(f). (b) Given
that $F[e^{-\pi t^2}] = e^{-\pi f^2}$ (we cannot prove this result here), deduce that $F[e^{-t^2}]$ =
Fig. 27.11 A pulse of height 1/ε on −½ε ≤ t ≤ ½ε.
\[ F[\delta(t - c)] = \int_{-\infty}^{\infty} \delta(t - c)\,e^{-2\pi i f t}\,dt = e^{-2\pi i f c}. \qquad (27.19) \]
The signal giving rise to δ(f − f0) is given by the inverse transform:
∞
Similarly,
\[ \sin(2\pi f_0 t) = \frac{1}{2i}\left( e^{i2\pi f_0 t} - e^{-i2\pi f_0 t} \right), \]
so
\[ \sin(2\pi f_0 t) \leftrightarrow \frac{1}{2i}[\delta(f - f_0) - \delta(f + f_0)]. \qquad (27.20b) \]
2i
Therefore, the (real) cosine and sine functions having frequency f0 are each asso-
ciated with a pair of spectral lines, located at f = ± f0, as in Fig. 27.12.
The delta function is not at all a normal function. It belongs to a class of
mathematical entities called generalized functions. They are essential in practical
applications, since their use greatly simplifies what would otherwise be very difficult
calculations. Generalized functions play a part similar to the symbol i in complex
numbers: i is not an ordinary number, but in most ways it behaves like one.
Fig. 27.12 Spectral lines at f = ±f0 for cos 2πf0t and sin 2πf0t.
There are apparent anomalies associated with generalized functions; for example,
we have just obtained the Fourier transform of cos(2πf0t), but the normal
definition of a Fourier transform (27.8b) does not work with a periodic function,
because the integral does not approach a definite value when we apply the infinite
limits of integration. Exact justification and interpretation of these questions are
far beyond the scope of this book. You should regard relations such as (27.20) as
where
\[ \text{(b)}\quad X_n = f_0\int_{\text{Period}} x_P(t)\,e^{-i2\pi n f_0 t}\,dt \qquad (27.21) \]
The spectral frequency distribution consists of an infinite row of ‘spikes’ δ(f − nf0)
spaced at equal intervals f0. These are weighted by Xn, which are just the two-
sided Fourier series coefficients for the periodic function xP(t) given by (26.18b).
To prove the result (27.21), take the Fourier series representation (26.18a), and
use (27.20a and b) to transform the cosines and sines in the series term by term.
We obtain
\[ F[x_P(t)] = \sum_{n=-\infty}^{\infty} X_n\,\delta(f - nf_0), \]
which is (27.21a). The coefficients Xn are given by (26.18b), which is the same
as (27.21b).
where
x1(t) ↔ X1(f ), x2(t) ↔ X2(f ). (27.23)
\[ x(t) = \int_{-\infty}^{\infty} e^{i2\pi f t}X_1(f)X_2(f)\,df \qquad (27.24) \]
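in which
\[ X_1(f) = \int_{-\infty}^{\infty} e^{-i2\pi f t}x_1(t)\,dt = \int_{-\infty}^{\infty} e^{-2\pi i f u}x_1(u)\,du, \qquad (27.25) \]
\[ x(t) = \int_{-\infty}^{\infty} e^{i2\pi f t}\left( \int_{-\infty}^{\infty} e^{-i2\pi f u}x_1(u)\,du \right) X_2(f)\,df \]
\[ \phantom{x(t)} = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} e^{i2\pi f(t - u)}x_1(u)\,du \right) X_2(f)\,df \]
\[ \phantom{x(t)} = \int_{-\infty}^{\infty} x_1(u)\left( \int_{-\infty}^{\infty} e^{i2\pi f(t - u)}X_2(f)\,df \right) du \]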
after changing the order of integration (this process is justified in Section 32.1).
The interior integral is equal to the inverse of X2( f ) at time (t − u), so it is equal to
x2(t − u). Therefore
\[ x(t) = \int_{-\infty}^{\infty} x_1(u)x_2(t - u)\,du. \qquad (27.26a) \]
\[ x(t) = \int_{-\infty}^{\infty} x_1(t - u)x_2(u)\,du, \qquad (27.26b) \]
confirming that the two integrals on the right of (27.26a and b) are equal. This
enables us to invert products of spectral distributions.
The integrals
\[ \int_{-\infty}^{\infty} x_1(u)x_2(t - u)\,du \quad \text{or} \quad \int_{-\infty}^{\infty} x_1(t - u)x_2(u)\,du \qquad (27.27a) \]
x1(t) * x2(t) (or x2(t) * x1(t)) is called the convolution of x1(t) and x2(t). The result
(27.26) is the convolution theorem. In the short notation:
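Example 27.9 Obtain x(t) = x1(t) * x2(t) when x1(t) = Π(t) and x2(t) = 1/(1 + t²).
\[ x(t) = \int_{-\infty}^{\infty} x_1(u)x_2(t - u)\,du = \int_{-\infty}^{\infty} \Pi(u)\,\frac{1}{1 + (t - u)^2}\,du. \]
Since Π(u) = 0 unless −½ ≤ u ≤ ½, the limits of integration become ±½, so
\[ x(t) = \int_{-1/2}^{1/2} \Pi(u)\,\frac{1}{1 + (t - u)^2}\,du = \int_{-1/2}^{1/2} \frac{du}{1 + (t - u)^2} = \int_{t - 1/2}^{t + 1/2} \frac{dv}{1 + v^2} \qquad \text{(i)} \]
(after putting t − u = v)
\[ \phantom{x(t)} = \arctan(t + \tfrac{1}{2}) - \arctan(t - \tfrac{1}{2}) = \arctan\frac{4}{4t^2 + 3}, \]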
which can be obtained from an addition formula in Appendix B(b).
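The convolution integral of Example 27.9 can be evaluated numerically and compared with the closed form arctan[4/(4t² + 3)]. This is a minimal sketch, not the text's method: the names conv and closed, the midpoint rule and the sample values of t are choices made for the illustration.

```python
import math

def conv(t, steps=2000):
    """Midpoint estimate of the integral over -1/2 <= u <= 1/2 of 1/(1 + (t - u)^2)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = -0.5 + (k + 0.5) * h   # only the range where the top-hat is 1 contributes
        total += 1.0 / (1.0 + (t - u) ** 2)
    return total * h

def closed(t):
    """Closed form obtained in Example 27.9."""
    return math.atan(4.0 / (4.0 * t * t + 3.0))

for t in (-3.0, 0.0, 0.5, 2.0):
    print(t, conv(t), closed(t))
```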
At (i) in Example 27.9 the limits of integration were modified to take account of
the fact that the integrand is zero except over the interval −½ ≤ u ≤ ½. In many
typical cases it is quite awkward to establish the new limits. Consider, for example,
the convolution of two identical pulses Π(t):
\[ \Pi(t) * \Pi(t) = \int_{-\infty}^{\infty} \Pi(u)\Pi(t - u)\,du. \]
For different values of t, Π(t − u) occupies a different position on the u axis. For
certain ranges of t it partially overlaps Π(u) from the left, or from the right, and
for other ranges of t there is no overlap, as illustrated in Fig. 27.13.
Fig. 27.13 Π(u) and Π(t − u) for various t: no overlap, overlap from the left, overlap from the right, no overlap.
To take this into account, set up a diagram as in Fig. 27.14, with axes t and u.
The region in which Π(u)Π(t − u) is nonzero is easy to find by carrying out the
following construction.
(i) Π(u) is nonzero only if −½ ≤ u ≤ ½. The edges of this region are the
straight lines
u = −½ and u = ½.
Draw these and label them with the u values.
Fig. 27.14 The (t, u) plane, showing the lines u = −½, u = ½, u = t − ½ and u = t + ½, with the current value of t marked.
Example 27.10 (a) Show that Π(t) * Π(t) = Λ(t), where (Fig. 27.15a)
\[ \Lambda(t) = \begin{cases} 1 + t, & -1 \le t \le 0, \\ 1 - t, & 0 \le t \le 1, \\ 0, & \text{elsewhere.} \end{cases} \]
(b) Show that F[Λ(t)] = sinc² f.
(a) Put x(t) = Π(t) * Π(t), and use the diagram Fig. 27.14 as described in (iii) above.
If t < −1 or t > 1, there is no overlap, so x(t) = 0.
If −1 ≤ t ≤ 0, the limits of integration are from u = −½ to t + ½, so
\[ x(t) = \int_{-1/2}^{t + 1/2} 1 \times 1\,du = 1 + t. \]
Similarly, if 0 ≤ t ≤ 1, the limits of integration are from u = t − ½ to ½, so
\[ x(t) = \int_{t - 1/2}^{1/2} 1 \times 1\,du = 1 - t. \]
Fig. 27.15 (a) Λ(t). (b) F[Λ(t)] = sinc² f.
Triangle function
(a) Definition
\[ \Lambda(t) = \begin{cases} 1 - |t|, & -1 \le t \le 1; \\ 0, & \text{elsewhere.} \end{cases} \]
(b) Transform
\[ F[\Lambda(t)] = \operatorname{sinc}^2 f. \qquad (27.30) \]
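The identity Π(t) * Π(t) = Λ(t) can be illustrated by computing the overlap integral numerically. The sketch below is an illustration only; the names top_hat, triangle and conv are assumptions, and the crude midpoint rule used here is accurate only to about one step length because the integrand is discontinuous.

```python
def top_hat(t):
    """Pi(t): 1 for |t| < 1/2, 0 otherwise."""
    return 1.0 if abs(t) < 0.5 else 0.0

def triangle(t):
    """Lambda(t) = 1 - |t| on [-1, 1], 0 elsewhere."""
    return max(0.0, 1.0 - abs(t))

def conv(t, steps=4000):
    """Midpoint estimate of the integral of Pi(u) Pi(t - u) du; Pi(u) = 1 only on [-1/2, 1/2]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = -0.5 + (k + 0.5) * h
        total += top_hat(t - u)
    return total * h

for t in (-1.5, -0.75, -0.25, 0.0, 0.4, 0.9, 1.2):
    print(t, round(conv(t), 4), triangle(t))
```

The values of t are chosen to fall in each of the regions of Fig. 27.14: no overlap, partial overlap from either side, and full overlap at t = 0.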
Fig. 27.16 (a) $Ш_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$. (b) $F[Ш_T(t)] = f_0\sum_{n=-\infty}^{\infty} \delta(f - nf_0) = f_0\,Ш_{f_0}(f)$ (f0 = 1/T).
\[ X_n = f_0\int_{\text{Period}} e^{-i2\pi n f_0 t}\,Ш_T(t)\,dt = f_0\int_{-T/2}^{T/2} e^{-i2\pi n f_0 t}\sum_{m=-\infty}^{\infty} \delta(t - mT)\,dt = f_0, \]
by the sifting rule (27.19b), since the only delta function within the period is the
one where m = 0. Therefore, from (27.31),
\[ F[Ш_T(t)] = f_0\,Ш_{f_0}(f). \]
Example 27.11 The function x(t) is zero when t < −½T and t > ½T. Show that
the convolution $y(t) = Ш_T(t) * x(t)$ is the periodic function, having period T,
which agrees with x(t) in the range −½T ≤ t ≤ ½T.
Write
\[ Ш_T(t) * x(t) = \int_{-\infty}^{\infty} x(u)\,Ш_T(t - u)\,du \]
\[ \phantom{Ш_T(t) * x(t)} = \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} x(u)\,\delta(t - u - nT)\,du = \sum_{n=-\infty}^{\infty} x(t - nT), \]
using the sifting theorem (the critical points are where t − u − nT = 0). The term with
n = 0 reproduces x(t), which is zero outside the range −½T to ½T. The term with n = 1
slides that graph a distance T to the right, and we have a non-overlapping copy of x(t)
in the range ½T to (3/2)T, and so on. The general picture is shown in Fig. 27.17: y(t) is a
periodic copy of x(t), with period T.
Fig. 27.17 (a) x(t). (b) y(t).
Rules for Fourier transforms and a short table of Fourier transforms are listed
in Appendix G. A longer table of transforms is given by Råde and Westergren
(1995), but note they use an alternative definition of the transform (see the
comments at the end of Section 27.3).
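\[ E = \int_{-\infty}^{\infty} |x(t)|^2 \, dt. \]
Rayleigh’s theorem
\[ \int_{-\infty}^{\infty} |x(t)|^2 \, dt = \int_{-\infty}^{\infty} |X(f)|^2 \, df \]
or
\[ \int_{-\infty}^{\infty} x(t)\,\overline{x(t)} \, dt = \int_{-\infty}^{\infty} X(f)\,\overline{X(f)} \, df. \tag{27.33} \]
We have
\[ E = \int_{-\infty}^{\infty} |x(t)|^2 \, dt = \int_{-\infty}^{\infty} x(t)\,\overline{x(t)} \, dt = \int_{-\infty}^{\infty} x(t) \left( \int_{-\infty}^{\infty} \overline{X(f)}\, e^{-i2\pi f t} \, df \right) dt \]
(after expressing x(t) as the inverse transform of X(f), and taking its complex
conjugate). Now change the order of integration:
\[ E = \int_{-\infty}^{\infty} \overline{X(f)} \left( \int_{-\infty}^{\infty} x(t)\, e^{-i2\pi f t} \, dt \right) df = \int_{-\infty}^{\infty} \overline{X(f)}\, X(f) \, df = \int_{-\infty}^{\infty} |X(f)|^2 \, df. \]
Parseval’s theorem extends this result for cases when the energy depends on two
functions, x(t) and y(t), as in the case of current and voltage in circuits. It states that
\[ \int_{-\infty}^{\infty} x(t)\,\overline{y(t)} \, dt = \int_{-\infty}^{\infty} X(f)\,\overline{Y(f)} \, df \tag{27.34} \]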
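Rayleigh's theorem can be illustrated with the signal x(t) = e^{−αt}H(t), whose transform 1/(α + i2πf) is quoted in the problems at the end of the chapter. The sketch below compares the two energy integrals numerically; the value α = 2, the truncation limits, and the midpoint rule are assumptions made for illustration only.

```python
import math

def midpoint(g, a, b, n=200_000):
    """Midpoint-rule estimate of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

alpha = 2.0

# Time domain: |x(t)|^2 = exp(-2 alpha t) for t > 0 (integral truncated at t = 20).
E_time = midpoint(lambda t: math.exp(-2 * alpha * t), 0.0, 20.0)

# Frequency domain: |X(f)|^2 = 1/(alpha^2 + 4 pi^2 f^2), truncated at |f| <= F.
F = 2000.0
E_freq = midpoint(lambda f: 1.0 / (alpha ** 2 + (2 * math.pi * f) ** 2), -F, F)

print(E_time, E_freq, 1 / (2 * alpha))
```

Both estimates should be close to 1/(2α); the frequency integral falls slightly short because the slowly decaying tails beyond |f| = F are omitted.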
(Note: For the necessary background to waves and phasors see Sections 20.8
and 21.6.)
We shall illustrate a type of calculation which arises in diffraction problems
in several branches of physics. In optics it occurs in Fraunhofer diffraction by a
narrow slit, and there are similar problems in acoustics. Also there is a close con-
nection with the theory of radiating antennas. We shall present the problem in an
abstract way, since the process of tailoring it to a real situation involves additional
physical considerations.
Consider the half-space z > 0, criss-crossed by travelling waves all having the
same frequency f and wavelength λ. At every point P there is a disturbance u(t, P)
produced by superposition of all the rays passing through P, and interference
between these rays determines the resultant amplitude and phase of the oscillation
at P. Instead of using u(t, P) we shall assign a phasor, or complex amplitude
(see Section 21.6), U(P) to every point, so that
\[ u(t, P) = \operatorname{Re}\left[ U(P)\, e^{2\pi i f t} \right]. \]
We need a preliminary result. Figure 27.18 shows a ray directed along an arbitrary
axis Oz. It has constant amplitude a. The disturbance is given by
\[ u(t, z) = a \cos\left[ 2\pi\left( \frac{t}{T} - \frac{z}{\lambda} \right) + \phi \right], \tag{27.35} \]
[Fig. 27.18: the points O, Q, and P on the axis Oz.]
where T, λ, and φ are the period, wavelength, and a constant phase angle, and the
wave velocity v = λ/T is directed towards the right. Let Q and P be arbitrary fixed
points on Oz. In an obvious notation
\[ u(t, z_Q) = a \cos\left[ 2\pi\left( \frac{t}{T} - \frac{z_Q}{\lambda} \right) + \phi \right], \]
\[ u(t, z_P) = a \cos\left[ 2\pi\left( \frac{t}{T} - \frac{z_P}{\lambda} \right) + \phi \right]. \]
The corresponding phasors or complex amplitudes at Q and P are U_Q, U_P given by
\[ U_Q = a\, e^{i[\phi - 2\pi z_Q/\lambda]} = a\, e^{i\phi_Q}, \]
\[ U_P = a\, e^{i[\phi - 2\pi z_P/\lambda]} = a\, e^{i\phi_P}. \]
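The relation between the disturbance (27.35) and its phasor can be checked directly: the real part of U e^{2πit/T} should reproduce u(t, z). The following sketch uses arbitrary illustrative values for a, T, λ, φ, and the positions of Q and P; the function names are chosen here and are not part of the text.

```python
import cmath
import math

def disturbance(t, z, a, T, lam, phi):
    """u(t, z) = a cos[2 pi (t/T - z/lam) + phi], as in (27.35)."""
    return a * math.cos(2 * math.pi * (t / T - z / lam) + phi)

def phasor(z, a, lam, phi):
    """Complex amplitude a exp(i[phi - 2 pi z/lam]) at position z."""
    return a * cmath.exp(1j * (phi - 2 * math.pi * z / lam))

def from_phasor(U, t, T):
    """Recover the real disturbance as Re[U exp(2 pi i t/T)]."""
    return (U * cmath.exp(2j * math.pi * t / T)).real

a, T, lam, phi = 1.5, 0.2, 3.0, 0.4   # illustrative values only
zQ, zP, t = 1.1, 2.7, 0.37

print(disturbance(t, zP, a, T, lam, phi), from_phasor(phasor(zP, a, lam, phi), t, T))
# The phasors at P and Q differ only by the phase factor exp(-2 pi i (zP - zQ)/lam).
print(phasor(zP, a, lam, phi) / phasor(zQ, a, lam, phi))
```

The two numbers on the first printed line should agree, and the ratio on the second line has unit modulus.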
Figure 27.19 shows an infinite radiating strip in the (x, y) plane having width h,
its central line along the y axis, and infinite length −∞ < y < ∞. Each infinitesimal
element, or source, δA on the strip emits a harmonic wave spreading equally in
all directions (i.e. it generates a spherical wave). We assume firstly that the
distribution of sources on the strip is uniform: the contribution to the oscillation
from any element of the strip of area δA is αδA where α is independent of position on
the strip (one may imagine a uniform distribution of tiny, equal, hemispherical
loudspeakers). Secondly, all the sources have the same frequency and phase: they
are all oscillating in step.
[Fig. 27.19: Infinite radiating strip, width h, parallel to the y axis, radiating into z > 0; δA is a typical radiating element.]
[Fig. 27.20: Cross-section y = 0 of Fig. 27.19.]
Since the strip is infinitely long and the source distribution is independent of y,
the problem is two-dimensional: the wave fields over all cross-sections y = constant
are identical. Figure 27.20 shows the cross-section y = 0, z ≥ 0. P is a typical point
distant r from O, and OP is inclined at θ to Oz (the positive direction for θ is
clockwise here). Q is a typical elementary source at (0, x), with −½h ≤ x ≤ ½h, and
width δx. The waves arriving at P from all points on the strip SS′ interfere, and
when the quantities h/λ, r/λ, are of the appropriate magnitude, a systematic
To obtain an expression for the length QP: by the cosine rule (Appendix B(f))
\[ QP = r\left( 1 + \frac{2x\sin\theta}{r} + \frac{x^2}{r^2} \right)^{1/2} = r(1 + q)^{1/2} \]
say, where
\[ q = \frac{2x\sin\theta}{r} + \frac{x^2}{r^2}. \]
It can be shown that if h/r < √5 − 1 (which would always be so in practice), then
|q| < 1 for all x in −½h ≤ x ≤ ½h and all θ in −½π ≤ θ ≤ ½π. In that case we can use
the binomial theorem (5.4f) to approximate to (1 + q)^{1/2}. The first few terms
are given by
\[ (1 + q)^{1/2} = 1 + \tfrac12 q - \tfrac18 q^2 + \cdots. \]
Therefore
\[ r(1 + q)^{1/2} = r\left[ 1 + \frac12\left( \frac{2x\sin\theta}{r} + \frac{x^2}{r^2} \right) - \frac18\left( \frac{2x\sin\theta}{r} + \frac{x^2}{r^2} \right)^2 + \cdots \right] \]
\[ = r + x\sin\theta + r\left( \frac12\frac{x^2}{r^2}\cos^2\theta - \frac12\frac{x^3}{r^3}\sin\theta - \frac18\frac{x^4}{r^4} \right) + \cdots \]
\[ = (r + x\sin\theta) + \frac{x^2}{2r}\left( \cos^2\theta - \frac{x}{r}\sin\theta - \frac{x^2}{4r^2} + \cdots \right). \tag{27.38} \]
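The quality of the expansion (27.38) can be judged numerically by comparing the exact length QP with the first three terms r + x sin θ + (x²/2r) cos²θ. This is a rough sketch; the sample values of r, x, and θ are assumptions chosen so that x/r is small.

```python
import math

def qp_exact(r, x, theta):
    """Exact length QP = (r^2 + 2 r x sin(theta) + x^2)^(1/2)."""
    return math.sqrt(r * r + 2 * r * x * math.sin(theta) + x * x)

def qp_approx(r, x, theta):
    """Leading terms of (27.38): r + x sin(theta) + (x^2 / 2r) cos^2(theta)."""
    return r + x * math.sin(theta) + x * x * math.cos(theta) ** 2 / (2 * r)

r, theta = 100.0, 0.5   # illustrative values only
for x in (0.5, 1.0, 2.0):
    error = qp_exact(r, x, theta) - qp_approx(r, x, theta)
    print(x, qp_exact(r, x, theta), qp_approx(r, x, theta), error)
```

The error is dominated by the next term in the bracket of (27.38), of size about x³ sin θ/(2r²), so it shrinks rapidly as x/r decreases.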
Finally we have
These are natural variables for the problem; the physical outcome depends on the
number of wavelengths in h, for example. If we double the wavelength we must
double r, x, and h to preserve the same geometry. Since R, X, H are dimensionless:
if the unit of length is changed, say from metres to angstrom units, these quantit-
ies are unaffected. Equation (27.40) becomes:
Suppose that the amplitude of the source at Q is aδX in the new units. We can
allow to some extent for attenuation along the ray QP, provided that it depends
effectively only on distance R. We approximate its contribution, δ_Q U, to the
complex amplitude U_P at P by putting
\[ \delta_Q U = u(R)\, e^{-2\pi i X \sin\theta}\, \delta X, \tag{27.43} \]
\[ U_P = \lim_{\delta X \to 0} \sum \delta_Q U = u(R) \int_{-H/2}^{H/2} e^{-2\pi i X \sin\theta} \, dX = u(R) \int_{-\infty}^{\infty} \Pi(X/H)\, e^{-2\pi i X \sin\theta} \, dX, \tag{27.45} \]
phase of UP. We therefore define the angular spectrum function F(sin θ ) by casting
off u(R), so that for constant R:
\[ F(S) = \int_{-\infty}^{\infty} \Pi(X/H)\, e^{-2\pi i X S} \, dX, \tag{27.46a} \]
where
\[ S = \sin\theta. \tag{27.46b} \]
can be seen that F(S) is the Fourier transform of Π(X/H). Also, we can refer to
(27.13b) to evaluate it (with H standing in place of τ ). We obtain
\[ F(S) = \frac{\sin(\pi S H)}{\pi S} = H \operatorname{sinc}(HS). \tag{27.47} \]
In terms of the original variables x, h, λ, θ, therefore, over a circular arc r = constant,
angular distribution of amplitude ∝ sinc(h sin θ /λ). (27.48a)
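The closed form (27.47) can be compared with a direct numerical evaluation of the integral (27.46a). The sketch below is illustrative only: the strip width H = 4 wavelengths and the midpoint rule are assumptions made here.

```python
import cmath
import math

def sinc(x):
    """sinc x = sin(pi x)/(pi x), with sinc 0 = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def strip_spectrum(S, H, n=20_000):
    """Midpoint-rule estimate of the integral of exp(-2 pi i X S) over -H/2 <= X <= H/2."""
    dX = H / n
    return sum(cmath.exp(-2j * math.pi * (-H / 2 + (k + 0.5) * dX) * S)
               for k in range(n)) * dX

H = 4.0  # strip width in wavelengths (illustrative value)
for S in (0.0, 0.1, 0.25, 0.3):
    print(S, strip_spectrum(S, H), H * sinc(H * S))
```

The imaginary part of the numerical result is negligible because the source distribution is even, and the value at S = 1/H = 0.25 illustrates the first zero of the pattern at sin θ = λ/h.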
[Fig. 27.21: (a) Source distribution ∝ Π(x/h). (b) Amplitude spectrum ∝ sinc(h sin θ/λ), with zeros at sin θ = nλ/h (n = ±1, ±2, …); the first zero is at sin θ = λ/h. (c) Intensity spectrum ∝ sinc²(h sin θ/λ).]
By following exactly the same procedure, we obtain the angular spectrum
\[ F(S) = \int_{-\infty}^{\infty} E(X)\, e^{-2\pi i X S} \, dX, \tag{27.50} \]
where S = sin θ, X = x/λ, and E(X) = e(x). In principle, the source may be infinitely
extended in the ±x directions, though realistically we shall assume E(X) to be
negligible beyond a certain range of values. Equation (27.50) is again the Fourier
transform of the source distribution, and its inverse transform is given by
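\[ E(X) = \int_{-\infty}^{\infty} F(S)\, e^{2\pi i S X} \, dS. \tag{27.51} \]
\[ \int_{-\infty}^{\infty} F(S)\, e^{2\pi i S X} \, dS \approx \int_{-1}^{1} F(S)\, e^{2\pi i S X} \, dS \tag{27.52} \]
to an acceptable degree of accuracy, then we may ignore the range |S| > 1 for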
the purpose of obtaining E(X) from a given F(S). A commonly arising physical
situation that provides support for the approximation (27.52) involves radiation
fields that are strongly directional, the diffracted rays being effectively confined to
a fairly narrow range of θ. The radiation from a uniform strip (Section 27.10) is of
this character if the dimensions are right.
Appendix G(b): f t x X D B u
Current symbols: S(= sin θ ) X E F K D W
∞
= ∫−∞ f(X − W)g(W) dW).
(27.53)
We now give some examples showing the significance of these rules for radia-
tion problems. It will be assumed that the estimates (27.42) and (27.52) apply
where necessary. Notice that if the effective diffracted range of θ is small enough,
S can be identified with θ for the purpose of visualizing the diffraction patterns
that arise.
\[ E(X) = \Pi\left( \frac{X}{H} \right), \qquad F(S) = H \operatorname{sinc} HS, \]
so that A = 1/H. The breadth of the central loop of F(S) and its satellites is
inversely proportional to H.
27.12
Equation (27.53c) states that if we move the emitter bodily up the X axis by a dis-
tance D (wavelengths), then F(S) becomes F(S) e−2πiDS. This result may seem a little
curious, since it is physically obvious that the new spectrum is simply the old spec-
The zeros of F(S) due to the term sinc HS are at S = nπ/H and those due to cos
DS are at S = (n + ½)π/D, and they are interlaced. If D ≫ H (not necessarily
hugely greater, but perhaps 10 times greater) an interference of the type shown in
Fig. 27.23 is obtained for the angular spectrum F(S). The envelope is proportional
to sinc(HS). The intensity spectrum is proportional to the square of this function,
sinc²(HS) cos²(DS) (see Fig. 27.23). If D is very much greater than H, the underlying
fine-scale oscillation may be difficult to resolve instrumentally.
[Fig. 27.22: two strips, each of width h, in the plane z = 0.]
[Fig. 27.23: the angular spectrum F(S) plotted against S.]
X = x/ λ
Width H X4
X3
X2
X1
27.12
0 n
n =1 H ⎠
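We shall show that E(X) can be expressed in the form of the convolution
\[ E(X) = E_0(X)\, \Pi\left( \frac{X}{H} \right) * g(X), \tag{27.57a} \]
where
\[ g(X) = \sum_{n=1}^{N} \delta(X - X_n), \tag{27.57b} \]
and δ represents the delta function (27.19). g(X) is called the distribution function
for the array. To prove (27.57): by the definition of the convolution (27.53e),
\[ E_0(X)\, \Pi\left( \frac{X}{H} \right) * g(X) = \int_{-\infty}^{\infty} E_0(X')\, \Pi\left( \frac{X'}{H} \right) \sum_{n=1}^{N} \delta(X - X' - X_n) \, dX' \]
\[ = \sum_{n=1}^{N} \int_{-\infty}^{\infty} E_0(X')\, \Pi\left( \frac{X'}{H} \right) \delta(X - X' - X_n) \, dX' \]
\[ = \sum_{n=1}^{N} \int_{-\infty}^{\infty} E_0(X - X_n - w)\, \Pi\left( \frac{X - X_n - w}{H} \right) \delta(w) \, dw \]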
Therefore the spectrum of the array is equal to the transform of the array
distribution function, multiplied by the spectrum of the single element centred on
the origin.
Alternatively, we can obtain G(S) explicitly:
\[ G(S) = \int_{-\infty}^{\infty} \sum_{n=1}^{N} \delta(X - X_n)\, e^{-2\pi i X S} \, dX = \sum_{n=1}^{N} e^{-2\pi i X_n S}, \tag{27.59} \]
so that
\[ F(S) = F_0(S) \sum_{n=1}^{N} e^{-2\pi i X_n S}. \tag{27.60} \]
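The factorization (27.60) lends itself to a short computation. The sketch below assumes, for illustration only, that each element is a uniform strip of width H, so that F0(S) = H sinc(HS) as in (27.47); the element positions and the function names are likewise choices made here.

```python
import cmath
import math

def sinc(x):
    """sinc x = sin(pi x)/(pi x), with sinc 0 = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def element_spectrum(S, H):
    """F0(S) for a single uniform strip of width H (an assumption; see (27.47))."""
    return H * sinc(H * S)

def array_factor(S, positions):
    """G(S) = sum over n of exp(-2 pi i X_n S), as in (27.59)."""
    return sum(cmath.exp(-2j * math.pi * Xn * S) for Xn in positions)

def array_spectrum(S, H, positions):
    """F(S) = F0(S) G(S), as in (27.60)."""
    return element_spectrum(S, H) * array_factor(S, positions)

# Two elements placed symmetrically at X = -D and X = +D.
D, H = 5.0, 1.0
for S in (0.0, 0.03, 0.05):
    print(S, array_factor(S, [-D, D]), array_spectrum(S, H, [-D, D]))
```

For the symmetric pair the array factor reduces to the real quantity 2 cos(2πDS), and at S = 0 the array factor of any N elements equals N.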
Problems
27.1 Obtain the Fourier sine and cosine of the cosine transform of x(t). (Hint: split the
transforms of the function x(t) = e−t for t 0. Find range of integration into two parts, −∞ to 0
the value delivered by the inverse sine transform and 0 to ∞.)
at t = 0. (Hint: cos(2πft) + i sin(2πft) = e2πift.) (b) Use the result of Problem 27.3 that the
cosine transform of e−t is √πe−π f to find the
2 2 2
sin 2u of e−t is √πe−π f . Use this result together with the
2 2 2
du = 12 π.
u2 scaling rule (27.17b) to prove that F [e−πt ) = e−πf .
2 2
0
1
dXc /df by differentiating under the integral X(ω ) = x(t) eiω t dt,
√(2π) −∞
sign (see Sections 17.9 or 27.8).
(ii) Integrate by parts to obtain the differential ∞
1
equation x(t) = X(ω ) e −iω t dω .
√ π)
(2 −∞
dXc
= − 2 π 2 f Xc ,
df 27.8 Prove that if x(t) is an even function then
and obtain the general solution. F [x(t)] is an even function of f. Use this fact to
(iii) Use the fact (see Example 32.11) that reduce the Fourier transform pair to a real form.
∞
0
e−x dx = --12 √π
2
27.9 Prove that if x(t) is an odd, real function, then
X(f ) is a pure imaginary odd function. Show that
to provide the initial condition Xc(0) = √π for the Fourier transform pair can then be reduced to
(ii), and deduce that Xc(f ) = √πe−π f .
2 2
a pair of real equations.
27.4 (a) Show that if x(t) is an even function, then 27.10 Prove the time-scaling rule, (27.18b), and
F [x(t)] is an even function of f, and takes the form the time-delay rule, (27.18c).
27.11 By (27.15), F[Π(t)] = sinc f.
(a) Use the time-delay rule (27.18c) to obtain the transform of
\[ x(t) = \begin{cases} 1, & 0 \le t \le 1, \\ 0, & \text{elsewhere}. \end{cases} \]
(b) Confirm the result (a) by evaluating F[x(t)] directly.
(c) Use the time-delay rule and the time-scaling rule to obtain F[x(t)] where b > ½c and
\[ x(t) = \begin{cases} -1, & -b - \tfrac12 c \le t \le -b + \tfrac12 c, \\ 1, & b - \tfrac12 c \le t \le b + \tfrac12 c, \\ 0, & \text{elsewhere}. \end{cases} \]
(Hint: sketch a diagram.)

27.12 Given that F[Λ(t)] = sinc² f (proved in Example 27.10), where
\[ \Lambda(t) = \begin{cases} 1 + t, & -1 \le t \le 0, \\ 1 - t, & 0 \le t \le 1, \\ 0, & \text{elsewhere}, \end{cases} \]
obtain (a) F[Λ(2t)]; (b) F[Λ(2t − 3)].

27.13 (a) Prove the frequency-shift property, (27.18).
(b) Obtain F[x(t) e^{±i2πf0t}].
(c) From (b) deduce the modulation rules, (27.18), for F[x(t) cos 2πf0t] and F[x(t) sin 2πf0t].
(d) Obtain F[Π(½t) cos 2πf0t] and F[Π(½t) sin 2πf0t].

27.14 (a) Given that Λ(t) ↔ sinc² f, obtain F[sinc² t] either by using the duality rule (27.18), or by a direct method.
(b) Use the result (a), together with the time-delay and time-scaling rules, to find F[sinc²(at + b)].
(Λ(t) is defined by
\[ \Lambda(t) = \begin{cases} 1 + t, & -1 \le t \le 0, \\ 1 - t, & 0 \le t \le 1, \\ 0, & \text{elsewhere}. \end{cases}) \]

27.15 (a) Prove the differentiation rule (27.18).
(b) Given that e^{−|t|} ↔ 2/(1 + 4π²f²), obtain F⁻¹[if/(1 + 4π²f²)].

27.16 From the result e^{−αt}H(t) ↔ 1/(α + i2πf), use the time-reversal rule to obtain F[e^{−α|t|}], where α > 0.

(b) Obtain the moving average g_τ(t) when g(t) = Π(t), for values τ = ¼, ¾, 2, and indicate their general nature by sketches.

27.18 Prove that
\[ H(t) * \{x(t)\,H(t)\} = \int_0^t x(\tau) \, d\tau. \]

27.19 (a) Obtain x1(t) * x2(t) when x1(t) = x2(t) = e^{−t}H(t).
(b) Use your result together with the convolution theorem (27.28) to obtain the transform of a new function, t e^{−t}.
(c) Obtain F[t e^{−αt}] from (b), where α > 0.
(d) Obtain the same result as in (c) by noticing that
\[ \frac{d}{d\alpha}\left( e^{-\alpha t} \right) = -t\, e^{-\alpha t}. \]

27.20 (a) Prove that Π(t − ½) * Π(t + ½) = Λ(t). (Hint: use the convolution theorem, (27.28).)
(b) Show that Π(t − a) * Π(t − b) = Λ(t − a − b).
(c) Show that
\[ \Pi(t) * \Pi(\tfrac12 t) = \begin{cases} 0, & t \le -\tfrac32 \text{ and } t \ge \tfrac32, \\ \tfrac32 + t, & -\tfrac32 \le t \le -\tfrac12, \\ 1, & -\tfrac12 \le t \le \tfrac12, \\ \tfrac32 - t, & \tfrac12 \le t \le \tfrac32. \end{cases} \]

27.21 Show that the total energy in the signal x(t) = e^{−αt}H(t) (α > 0) is equal to 1/2α. Show that the total energy due to the frequency range −f0 ≤ f ≤ f0 is equal to
\[ \frac{1}{\pi\alpha} \arctan(2\pi f_0/\alpha). \]

27.22 Prove the result of Example 27.11 by using the convolution theorem (27.28) together with the expression (27.31) for F[Ш_T(t)], where Ш_T(t) = Σ_{n=−∞}^{∞} δ(t − nT) is the comb function.

27.23 Use the Fourier transform to obtain a particular solution of the differential equation
\[ \frac{d^2 x}{dt^2} - x = \frac{1}{1 + t^2}, \]
in the form of a convolution integral.

27.24 (a) Given that F[sinc t] = Π(f), deduce that
\[ \int_0^{\infty} \frac{\sin u}{u} \, du = \tfrac12\pi \quad \text{and} \quad \int_0^{\infty} \operatorname{sinc} u \, du = \tfrac12. \]

(a) x(t) * {Ay(t) + Bz(t)} = Ax(t) * y(t) + Bx(t) * z(t).
(b) x(t) * y(t) = y(t) * x(t).
(c) x(t) * {y(t) * z(t)} = {x(t) * y(t)} * z(t) (i.e. the brackets may be omitted).
Part 5 Multivariable calculus
28 Differentiation of functions of two variables
Quantities in nature usually depend on, or are functions of, more than one variable.
The elevation H of land above sea level depends on two map coordinates x and y;
so H is a function of the two variables x and y, and we write H(x, y). If we want to
take account of geological changes, then time t becomes a consideration, and in
that case H is a function of three variables x, y, t, and we write H(x, y, t). It is easy
to produce examples involving many variables; for example, the distance between
two points P : (x1, y1, z1) and Q : (x2, y2, z2) is a function of six variables. The state of
the economy is a function of a multitude of variables. We alternatively speak of a
function in one, two, three, … dimensions.
Suppose that a quantity z, called the dependent variable, depends on two
independent variables x and y. The dependence can often be expressed by an
explicit formula such as
z = x³ + y³, z = e^{x−2y}, z = |xy|,
and so on. To make statements which apply to all sorts of dependence we use the
notation
z = f(x, y),
or z = g(x, y) etc. The letter f on its own signifies a particular function or process:
a computer subroutine, a particular formula, or a set of rules which will generate
a single number z when two numbers x and y are fed to it in the right order.
Thus, if
f(x, y) = 2x + y²,
then

f(x, y) = x² + y².
Set up x, y, z axes; put
z = x² + y²,
and proceed as if plotting a graph. Take a large number of pairs (x, y), work out z
for each, then put the point (x, y, z) in the axes. For example, if x = 1 and y = 2, then
z = 5 and we ‘plot’ the point (1, 2, 5) as shown in Fig. 28.1a. For Fig. 28.1b, a great
number of points is supposed to have been plotted. They cover a surface shaped
like an inverted bowl.
[Fig. 28.1: (a) the point (1, 2, 5) plotted on x, y, z axes; (b) the surface z = x² + y².]
[Fig. 28.3: contour map with contours at heights 800 to 1200 and marked points A, B, C, E, F.]
Every function has a characteristic surface shape, which is the analogue in
three dimensions of the graphs used for functions of a single variable. Some other
functions are depicted in Fig. 28.2.
Another way of depicting a function is to sketch its contour map consisting of
its level curves. Figure 28.3 shows a contour map of a patch of countryside. Along
each contour the height is constant, and is indicated on the curve. The important
features of the terrain are very easy to pick out; there are peaks at A and B, a pass
at C (which is a ‘saddle’ as in Fig. 28.2d), valleys north west and south east of C
and ascents north east and south west of C. At E the contours are close together,
so the slope is steep, and at F the contours are widely spaced so the slopes are
comparatively gentle.
Consider again the function f(x, y) = x² + y² depicted in Fig. 28.1. The contour
of height c is the circle
x² + y² = c,
where c > 0, which is a circle of radius c^{1/2}, as shown in Fig. 28.4a. This can be
[Fig. 28.4: contours x² + y² = c for c = 1, 2, 3, 4, of radius 1, 1.41, 1.73, 2.]
y = c/x.
These curves are known as rectangular hyperbolas. By varying c, taking positive and
negative values, the contour map or level curves of Fig. 28.5 are obtained.
[Fig. 28.5: level curves y = c/x for c = 0, ±1, ±2, ±3, ±4.]
Suppose that z = f(x, y) represents the height above sea level of a piece of countryside.
In Fig. 28.6a, an observer stands at the point P : (x, y), facing east, in the direction
of the x axis. A short step forward takes the observer to Q : (x + δx, y), up or
down a slope. The altitude changes by an amount
δz = f(x + δx, y) − f(x, y).
[Fig. 28.6: (a) a step δx eastwards from P to Q; (b) a step δy northwards from P to Q.]
The average slope in this direction over the step length δx is δz/δx, so the slope at
P facing the observer is given by
\[ \lim_{\delta x \to 0} \frac{\delta z}{\delta x} = \lim_{\delta x \to 0} \frac{f(x + \delta x, y) - f(x, y)}{\delta x}. \]
Since the variable y is constant during the step, this is in effect an ordinary
derivative, taken with respect to x only. However, it is customary to signal that
another variable is present, which is done by using the special sign ∂ (still called
‘dee’) instead of the usual d for the derivative, writing
\[ \frac{\partial f}{\partial x} \quad \text{or} \quad \frac{\partial z}{\partial x} \]
instead of df/dx or dz/dx. This is called the partial derivative of f(x, y), or of z,
with respect to x.
If the observer faces north and takes a step δy, as in Fig. 28.6b, then we obtain
in the same way the slope ∂f/∂y or ∂z /∂y in the y direction.
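Partial derivatives
If z = f(x, y), then
\[ \frac{\partial f}{\partial x} \ \text{or} \ \frac{\partial z}{\partial x} = \lim_{\delta x \to 0} \frac{f(x + \delta x, y) - f(x, y)}{\delta x}, \]
\[ \frac{\partial f}{\partial y} \ \text{or} \ \frac{\partial z}{\partial y} = \lim_{\delta y \to 0} \frac{f(x, y + \delta y) - f(x, y)}{\delta y}. \tag{28.1} \]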
z = x²y + 2x² − 3y + 4.
For ∂z/∂x, y has the status of a constant for the purpose of the differentiation, so
\[ \frac{\partial z}{\partial x} = 2xy + 4x - 0 + 0 = 2xy + 4x. \]
At the point (1, 3), ∂z/∂x = 10.
For ∂z/∂y, x is treated as constant, so
\[ \frac{\partial z}{\partial y} = x^2 + 0 - 3 + 0 = x^2 - 3. \]
At (1, 3), ∂z/∂y = −2.
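The values just obtained can be confirmed by replacing the limits in (28.1) with small finite steps. The sketch below uses symmetric (central) differences rather than the one-sided quotient of (28.1), which is a substitution made here because it is more accurate for a given step; the step size h is also an arbitrary choice.

```python
def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to x (y held constant)."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to y (x held constant)."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def z(x, y):
    return x * x * y + 2 * x * x - 3 * y + 4

print(partial_x(z, 1.0, 3.0))  # compare with 2xy + 4x at (1, 3)
print(partial_y(z, 1.0, 3.0))  # compare with x^2 - 3 at (1, 3)
```

Each estimate varies only one argument of f, which is exactly the sense in which the other variable is treated as a constant.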
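\[ \left( \frac{\partial z}{\partial x} \right)_{(a,b)} \ \text{and} \ \left( \frac{\partial f}{\partial x} \right)_{(a,b)} \]
or
\[ \left( \frac{\partial z}{\partial y} \right)_{P} \ \text{and} \ \left( \frac{\partial f}{\partial y} \right)_{P} \]
to mean the derivatives are to be evaluated at P : (a, b). In this connection, the
following definitions are equivalent to (28.1):
\[ \left( \frac{\partial z}{\partial y} \right)_{(a,b)} \ \text{or} \ \left( \frac{\partial f}{\partial y} \right)_{(a,b)} = \lim_{y \to b} \frac{f(a, y) - f(a, b)}{y - b}. \tag{28.2} \]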
Example 28.3 Obtain (a) \( \dfrac{\partial}{\partial x}\left( \dfrac{x}{x + y} \right) \); (b) \( \dfrac{\partial}{\partial y}\left( \dfrac{1}{(x^2 + y^2)^{1/2}} \right) \).
(b) x is held constant. Use the chain rule (3.3), putting
\[ u = x^2 + y^2, \qquad z = u^{-1/2}; \]
then
\[ \frac{\partial z}{\partial y} = \frac{dz}{du} \frac{\partial u}{\partial y}. \]
(We write ∂u/∂y instead of du/dy in the chain rule because both x and y are present in u,
and x is being held constant.) Continuing, we have
\[ \frac{\partial z}{\partial y} = \left( -\tfrac12 u^{-3/2} \right)(2y) = -y(x^2 + y^2)^{-3/2}. \]
Example 28.4 The potential function V(x, t) = A e−qt sin k(x − ct) represents an
attenuating wave travelling to the right along a cable with speed c. Here A, q, k,
c are constants. Find (a) the rate of change of V with time t at any fixed point x;
(b) the ‘potential gradient’ ∂V/∂x along the wire at any moment.
(a) For ∂V/∂t, use the product rule (3.1) with u = A e−qt and v = sin k(x − ct). We treat x
as constant, so ∂v/ ∂t instead of dv/dt will be written into the product rule:
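\[ \frac{\partial V}{\partial t} = \frac{\partial (uv)}{\partial t} = u \frac{\partial v}{\partial t} + v \frac{du}{dt} \]
\[ = A e^{-qt}\left[ -kc \cos k(x - ct) \right] + \left( -qA e^{-qt} \right) \sin k(x - ct) \]
\[ = -A e^{-qt}\left[ kc \cos k(x - ct) + q \sin k(x - ct) \right]. \]
(b)
\[ \frac{\partial V}{\partial x} = \frac{\partial}{\partial x}\left[ A e^{-qt} \sin k(x - ct) \right] = kA e^{-qt} \cos k(x - ct), \]
t being treated as constant.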
It will be seen that no new rules have to be learned in order to obtain the par-
tial derivatives of given functions. In fact you have always unconsciously carried
out partial differentiation when differentiating expressions like A sin(ω t + φ),
without worrying whether A, ω, φ were really constants or just to be treated as
such while differentiating.
Self-test 28.1
Obtain the first partial derivatives of f(x, y) when f(x, y) is given by (a) cos²(xy);
(b) cos(x² − y²); (c) e^x ln(xy) (xy > 0).
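\[ \frac{\partial}{\partial x}\left( \frac{\partial z}{\partial y} \right) = \frac{\partial^2 z}{\partial x\, \partial y}, \qquad \frac{\partial}{\partial y}\left( \frac{\partial z}{\partial y} \right) = \frac{\partial^2 z}{\partial y^2}. \]
In the last example, we see that the mixed derivatives satisfy ∂²z/∂y ∂x =
∂²z/∂x ∂y. This is always true for normal functions, although the proof is difficult:
Mixed derivatives
For any function f(x, y),
\[ \frac{\partial^2 f}{\partial y\, \partial x} = \frac{\partial^2 f}{\partial x\, \partial y}. \]
In higher derivatives, the ∂x and ∂y in the denominator may be arranged
in any order. (28.3)
For example,
\[ \frac{\partial^3 f}{\partial x\, \partial y^2} = \frac{\partial^3 f}{\partial y^2\, \partial x} = \frac{\partial^3 f}{\partial y\, \partial x\, \partial y}, \]
and so on.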
The next example shows how to manage a problem in notation. Often a function
f(x, y) is used in which the variables x and y only occur in a fixed combination
u = h(x, y), so that
f(x, y) = g(u), with u = h(x, y),
where g represents a general, unspecified, function of a single variable. To obtain a
general formula for ∂f /∂x use the chain rule (3.3) (see also Example 4.2c):
\[ \frac{\partial f}{\partial x} = \frac{dg}{du} \frac{\partial u}{\partial x} = g'(u) \frac{\partial u}{\partial x} = g'[h(x, y)] \frac{\partial h}{\partial x}. \]
It is a common mistake to write ∂g/∂x instead of g′[h(x, y)] in this context,
presumably misreading the chain rule. You must work out g′(u) first, before
substituting u = h(x, y). Thus suppose that f(x, y) = g(5x − 3y); then
\[ \frac{\partial f}{\partial x} = 5g'(5x - 3y) \quad \text{and} \quad \frac{\partial f}{\partial y} = -3g'(5x - 3y). \]
Example 28.6 Prove that if z = φ(x − ct), where φ is any function, then
\[ \frac{\partial^2 z}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 z}{\partial t^2}. \]
Put z = φ(u) where u = x − ct. Then
\[ \frac{\partial z}{\partial x} = \frac{d\phi}{du} \frac{\partial u}{\partial x} = \phi'(u). \]
By the chain rule again,
\[ \frac{\partial^2 z}{\partial x^2} = \frac{\partial}{\partial x} \phi'(u) = \frac{d\phi'(u)}{du} \frac{\partial u}{\partial x} = \phi''(u). \tag{i} \]
Similarly
\[ \frac{\partial z}{\partial t} = \frac{d\phi}{du} \frac{\partial u}{\partial t} = \phi'(u)(-c), \]
so
\[ \frac{\partial^2 z}{\partial t^2} = \frac{\partial}{\partial t}\left[ -c\phi'(u) \right] = \frac{d}{du}\left[ -c\phi'(u) \right] \frac{\partial u}{\partial t} = (-c)^2 \phi''(u). \tag{ii} \]
Therefore, from (i) and (ii)
\[ \frac{\partial^2 z}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 z}{\partial t^2}. \]
The equation
\[ \frac{\partial^2 z}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 z}{\partial t^2} \]
in Example 28.6 is called the wave equation in one space dimension. It is a partial
differential equation as contrasted with the ordinary differential equations treated
earlier in the book. We have verified that φ(x − ct) is always a solution, for any
function φ. The general solution is
φ(x − ct) + ψ(x + ct),
where φ and ψ are arbitrary functions. The general solution of partial differential
equations involves arbitrary functions rather than the arbitrary constants that
occur in ordinary differential equations: even the simple equation ∂z/∂x = 0 has
the general solution z = f(y), where f(y) is an arbitrary function.
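That φ(x − ct) + ψ(x + ct) satisfies the wave equation for arbitrary φ and ψ can be spot-checked numerically. In this sketch the particular choices φ = sin, ψ(u) = e^{−u²}, the wave speed c = 2, and the finite-difference step are assumptions made for illustration.

```python
import math

c = 2.0  # wave speed (illustrative value)

def phi(u):
    return math.sin(u)

def psi(u):
    return math.exp(-u * u)

def z(x, t):
    """A sample solution of the form phi(x - ct) + psi(x + ct)."""
    return phi(x - c * t) + psi(x + c * t)

def second_difference(g, s, h=1e-3):
    """Central-difference estimate of the second derivative of g at s."""
    return (g(s + h) - 2 * g(s) + g(s - h)) / (h * h)

def wave_residual(x, t):
    """Estimate of z_xx - (1/c^2) z_tt, which should vanish for a solution."""
    z_xx = second_difference(lambda s: z(s, t), x)
    z_tt = second_difference(lambda s: z(x, s), t)
    return z_xx - z_tt / (c * c)

for point in ((0.3, 0.1), (-1.0, 0.5), (2.0, 1.0)):
    print(point, wave_residual(*point))
```

The residuals are not exactly zero because the second derivatives are only approximated, but they should be small compared with the individual derivatives.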
Self-test 28.2
If u(x, y) = x²y²/(x + y), show that
\[ x \frac{\partial^2 u}{\partial x^2} + y \frac{\partial^2 u}{\partial x\, \partial y} = 2 \frac{\partial u}{\partial x}. \]
\[ z = \left( \frac{\partial f}{\partial x} \right)_Q x + \left( \frac{\partial f}{\partial y} \right)_Q y + \left[ c - \left( \frac{\partial f}{\partial x} \right)_Q a - \left( \frac{\partial f}{\partial y} \right)_Q b \right], \]
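Example 28.7 Find the equation of the tangent plane at the point Q : (2, 1, −2)
on the sphere x² + y² + z² = 9.
Recast the equation into the form z = f(x, y), noticing that Q is on the lower half of
the sphere:
\[ z = -(9 - x^2 - y^2)^{1/2}. \]
\[ \frac{\partial f}{\partial x} = -(-2x) \cdot \tfrac12 (9 - x^2 - y^2)^{-1/2}, \quad \text{and} \quad \left( \frac{\partial z}{\partial x} \right)_{(2,1)} = 1; \]
\[ \frac{\partial f}{\partial y} = -(-2y) \cdot \tfrac12 (9 - x^2 - y^2)^{-1/2}, \quad \text{and} \quad \left( \frac{\partial z}{\partial y} \right)_{(2,1)} = \tfrac12. \]
Therefore the equation of the tangent plane at Q is
z − (−2) = 1(x − 2) + ½(y − 1),
or
z = x + ½y − 9/2.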
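The coefficients of the tangent plane in Example 28.7 can be reproduced numerically from the general formula z = (∂f/∂x)_Q x + (∂f/∂y)_Q y + constant. The sketch below estimates the two slopes by central differences; the helper name `tangent_plane` and the step size are choices made here for illustration.

```python
import math

def f(x, y):
    """Lower half of the sphere x^2 + y^2 + z^2 = 9."""
    return -math.sqrt(9 - x * x - y * y)

def tangent_plane(f, a, b, h=1e-6):
    """Return (p, q, k) such that the tangent plane at (a, b, f(a, b)) is z = p*x + q*y + k.
    The slopes p and q are central-difference estimates of the partial derivatives."""
    p = (f(a + h, b) - f(a - h, b)) / (2 * h)
    q = (f(a, b + h) - f(a, b - h)) / (2 * h)
    k = f(a, b) - p * a - q * b
    return p, q, k

p, q, k = tangent_plane(f, 2.0, 1.0)
print(p, q, k)  # compare with z = x + y/2 - 9/2
```

The triple (p, q, −1) is then a set of direction ratios for the normal to the surface at the same point.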
A straight line SQR (Fig. 28.8) is said to be normal or perpendicular to the surface
z = f(x, y) at Q if it is perpendicular to its tangent plane at Q. The equation (28.5) for
the tangent plane can be written in the form
\[ \left( \frac{\partial f}{\partial x} \right)_Q x + \left( \frac{\partial f}{\partial y} \right)_Q y + (-1)z = C, \]
where C is a constant, so (see eqn (10.22)) a triplet of direction ratios for the line
normal to the surface at Q is
\[ \left( \left( \frac{\partial f}{\partial x} \right)_Q, \ \left( \frac{\partial f}{\partial y} \right)_Q, \ -1 \right), \tag{28.6} \]
which are the coefficients of x, y, z.
[Fig. 28.8: the normal vector n at Q and the normal line SQR through Q.]
Example 28.8 Find the cartesian (x, y, z) equation of the straight line normal to
the surface x² + y² + z² = 9 at (2, 1, −2).
From Example 28.7 (which has the same data), the direction ratios in (28.6) are
1, ½, −1.
Therefore the equation of the normal line at Q is (see Section 10.9)
\[ \frac{x - 2}{1} = \frac{y - 1}{1/2} = \frac{z + 2}{-1}. \]
The triplet of direction ratios in (28.6) can be regarded as the three components
of any vector parallel to the normal line. Such a vector is still called a normal
vector at Q, and is denoted usually by n:
\[ \mathbf{n} = \left( \left( \frac{\partial z}{\partial x} \right)_Q, \ \left( \frac{\partial z}{\partial y} \right)_Q, \ -1 \right). \]
Any multiple of this vector is another normal vector, since it will be parallel to the
same line. A normal vector placed at Q is shown in Fig. 28.8.
28.5
point (2, 1, −2) on the sphere.
The data are again the same as in Example 28.7. The normal taken from (28.7) is
Self-test 28.3
Find the equations of the tangent planes to the surface z = x² + y² at the four
points (±1, ±2). Find the region in the x, y plane bounded by the tangent planes.
[Fig. 28.9: (a) A local minimum. (b) A local maximum. (c) A saddle. (d) A shoulder.]
These constitute two simultaneous equations whose solutions (x, y) are the
stationary points of f(x, y).
We shall usually describe a stationary point of f(x, y) as being ‘at P : (x, y)’ rather
than ‘at Q : (x, y, z) on z = f(x, y)’. If necessary, the corresponding value of z can
be worked out after finding (x, y).
y = −x = ±1.
Therefore there are two stationary points, (1, −1) and (−1, 1). The values of f(x, y) at
these points are
f(1, −1) = 4/3, f(−1, 1) = −4/3.
[Fig. 28.10: (a) The surface z = −y² − ½x⁴ + x² showing maxima at Q1 and Q3, and a saddle point at Q2. (b) The corresponding contour map showing closed level curves around the maxima.]
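As with functions of a single variable, the criteria for a maximum or minimum
involve higher derivatives. The following test enables maxima, minima, and other
stationary points to be distinguished in most cases, but we omit the proof, which
is difficult.
A stationary point P of f(x, y) is
(a) a saddle if
\[ \frac{\partial^2 f}{\partial x^2} \frac{\partial^2 f}{\partial y^2} - \left( \frac{\partial^2 f}{\partial x\, \partial y} \right)^2 < 0 \quad \text{at } P, \]
(b) a maximum if
\[ \frac{\partial^2 f}{\partial x^2} \frac{\partial^2 f}{\partial y^2} - \left( \frac{\partial^2 f}{\partial x\, \partial y} \right)^2 > 0 \]
with ∂²f/∂x² < 0 (or ∂²f/∂y² < 0) at P,
(c) a minimum if
\[ \frac{\partial^2 f}{\partial x^2} \frac{\partial^2 f}{\partial y^2} - \left( \frac{\partial^2 f}{\partial x\, \partial y} \right)^2 > 0 \]
with ∂²f/∂x² > 0 (or ∂²f/∂y² > 0) at P.
(d) If none of these apply, the point might be any type. (28.9)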
We can hint at the reason for the conditions in (28.9), by considering the
particular function
\[ f(x, y) = \tfrac12 a x^2 + hxy + \tfrac12 b y^2, \tag{28.10} \]
where a, h, and b are constants. It follows that the derivatives are given by
\[ \frac{\partial f}{\partial x} = ax + hy, \qquad \frac{\partial f}{\partial y} = hx + by, \]
\[ \frac{\partial^2 f}{\partial x^2} = a, \qquad \frac{\partial^2 f}{\partial x\, \partial y} = h, \qquad \frac{\partial^2 f}{\partial y^2} = b. \]
Therefore f(x, y) has a stationary value where
ax + hy = 0, hx + by = 0.
Provided ab ≠ h², the function has one stationary value at (0, 0).
Assuming that a ≠ 0, we can rewrite (28.10) in the form (completing the square):
\[ f(x, y) = \tfrac12 a \left( x + \frac{hy}{a} \right)^2 + \frac{1}{2a}\left( ab - h^2 \right) y^2. \]
Hence, for all (x, y) ≠ (0, 0)
f(x, y) > f(0, 0) = 0 if a > 0 and ab − h² > 0 (minimum);
f(x, y) < f(0, 0) = 0 if a < 0 and ab − h² > 0 (maximum).
If ab − h² < 0 then irrespective of the sign of a there are values of (x, y) for which
f(x, y) > 0, and other values for which f(x, y) < 0 for the same parameter values.
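The sign conditions used in (28.9) and illustrated by the quadratic (28.10) can be collected into a small classifier. This is a sketch only: the function name and the returned labels are choices made here, and the second derivatives must be supplied by the user. It is applied to the surface z = −y² − ½x⁴ + x² of Fig. 28.10, for which ∂²z/∂x² = −6x² + 2, ∂²z/∂y² = −2, and ∂²z/∂x∂y = 0.

```python
def classify(fxx, fyy, fxy):
    """Classify a stationary point from its second derivatives.
    Uses the sign of fxx*fyy - fxy**2 and then the sign of fxx."""
    d = fxx * fyy - fxy ** 2
    if d < 0:
        return "saddle"
    if d > 0:
        return "minimum" if fxx > 0 else "maximum"
    return "undetermined"

# The quadratic (28.10) with a = 2, b = 3, h = 1: ab - h^2 = 5 > 0 and a > 0.
print(classify(2.0, 3.0, 1.0))

# Stationary points of z = -y^2 - x^4/2 + x^2 lie at (0, 0) and (+1, 0), (-1, 0).
for x in (-1.0, 0.0, 1.0):
    print((x, 0.0), classify(-6 * x * x + 2, -2.0, 0.0))
```

The "undetermined" branch corresponds to case (d): when the discriminant vanishes the test gives no information.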
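\[ \frac{\partial^2 f}{\partial x^2} \frac{\partial^2 f}{\partial y^2} - \left( \frac{\partial^2 f}{\partial x\, \partial y} \right)^2 = -4 < 0; \]
so, by (28.9a), both points are saddles.
At (2, 2),
\[ \frac{\partial^2 f}{\partial x^2} \frac{\partial^2 f}{\partial y^2} - \left( \frac{\partial^2 f}{\partial x\, \partial y} \right)^2 = 4 > 0, \qquad \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial y^2} = 4 > 0. \]
Therefore, by (28.9c), the point is a minimum.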
Self-test 28.4
A container with no lid has a triangular base in the form of an equilateral
triangle of side-length a, with vertical sides of height h. If the surface area is
a constant A, what are the dimensions of the container of maximum volume?
[Fig. 28.11: a data point (xn, yn) and its vertical deviation en from a candidate straight line.]
We might have reason to believe that the underlying relation between x and y is a
straight line. There is no way of deducing this line with certainty, but the following
method is often used to obtain a convincing straight line fit to the points.
Suppose that there are N points altogether; call them
(x1, y1), (x2, y2), …, (xN, yN).
The general point is called (xn, yn). Figure 28.11 shows a candidate for the best-
fitting straight line,
y = ax + b,
and we have to adjust the constants a and b to obtain a good fit. The vertical
deviation en of a point (xn, yn) from the line is shown:
en = yn − (axn + b).
The criterion we shall use to determine the best straight line is to choose a and
b so that \( \sum_{n=1}^{N} e_n^2 \) is as small as possible; that is to say, we want to minimize
\[ \sum_{n=1}^{N} e_n^2 = \sum_{n=1}^{N} (y_n - a x_n - b)^2 = f(a, b) \quad \text{(say)}. \]
Therefore a and b are the variables in this problem, and everything else has fixed
values.
For a minimum, we require at least that
∂f ∂f
= = 0.
∂a ∂b
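The derivatives are given by
\[ \frac{\partial f}{\partial a} = \sum_{n=1}^{N} 2(-x_n)(y_n - a x_n - b) = 2 \sum_{n=1}^{N} (a x_n^2 + b x_n - x_n y_n), \]
\[ \frac{\partial f}{\partial b} = \sum_{n=1}^{N} (-2)(y_n - a x_n - b) = 2 \sum_{n=1}^{N} (a x_n + b - y_n). \]
Noting that \( \sum_{n=1}^{N} b = b + b + \cdots + b = Nb \), we find the conditions for a minimum
as the following pair of simultaneous equations for a and b:
\[ a \sum_{n=1}^{N} x_n^2 + b \sum_{n=1}^{N} x_n = \sum_{n=1}^{N} x_n y_n, \]
\[ a \sum_{n=1}^{N} x_n + bN = \sum_{n=1}^{N} y_n. \tag{28.11} \]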
We shall not prove that the stationary point of f(a, b) found by this method is
actually a minimum (see Problem 28.21).
Example 28.12 Find the straight line which best fits the data:
xn   0.0   1.1   3.2   3.9   7.1   8.9
yn   1.1   1.6   1.6   2.8   2.9   3.8
Here N = 6, and the coefficients in (28.11) are
\[ \sum_{n=1}^{6} x_n = 24.2, \qquad \sum_{n=1}^{6} y_n = 13.8, \]
\[ \sum_{n=1}^{6} x_n^2 = 156.28, \qquad \sum_{n=1}^{6} x_n y_n = 72.21. \]
The equations for a and b are sometimes ill-conditioned, meaning that the
solutions are very sensitive to small changes in the coefficients. It is therefore
advisable to retain all the significant figures given by the data while solving them,
despite the fact that we know they already embody the errors of measurement.
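The pair of equations (28.11) is a 2 × 2 linear system and can be solved directly. The sketch below does this for the data of Example 28.12, using the sums defined in the text; the function name and the use of Cramer's rule are choices made here for illustration, and no rounding is applied before the solution, in line with the advice above.

```python
def least_squares_line(xs, ys):
    """Solve the two simultaneous equations (28.11) for the slope a and intercept b."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of the 2 x 2 system
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

xs = [0.0, 1.1, 3.2, 3.9, 7.1, 8.9]
ys = [1.1, 1.6, 1.6, 2.8, 2.9, 3.8]
a, b = least_squares_line(xs, ys)
print(a, b)

# Sum of squared deviations for the fitted line.
print(sum((y - a * x - b) ** 2 for x, y in zip(xs, ys)))
```

A small value of the determinant `det` relative to its two terms is one symptom of the ill-conditioning mentioned above.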
Self-test 28.5
In the method of least squares (28.11), suppose that xn = n (n = 1, 2, … , N),
so that yn is measured at successive integer values of xn. Find a and b in the
straight line fit y = ax + b. (Hint: use the summations in Appendix A(f).)
e x +α.
dx
αt dt, g(x)h(x + α ) dx,
0 −∞
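We shall consider a definite integral, though the process works in the same way
for indefinite integrals. Indicate the dependence on α in the general case by
\[ I(\alpha) = \int_a^b f(t, \alpha) \, dt. \]
If I(α) is defined in this way, then
\[ \frac{dI(\alpha)}{d\alpha} = \int_a^b \frac{\partial f(t, \alpha)}{\partial \alpha} \, dt. \tag{28.12} \]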
This process is also called differentiation under the integral sign. To prove (28.12),
change α to α + δα; then I(α) changes to I(α + δα). Put
I(α + δα) − I(α) = δI(α).
Then
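\[ \frac{\delta I(\alpha)}{\delta\alpha} = \frac{I(\alpha + \delta\alpha) - I(\alpha)}{\delta\alpha} = \frac{1}{\delta\alpha} \left( \int_a^b f(t, \alpha + \delta\alpha) \, dt - \int_a^b f(t, \alpha) \, dt \right) \]
\[ = \int_a^b \frac{f(t, \alpha + \delta\alpha) - f(t, \alpha)}{\delta\alpha} \, dt. \]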
Now let δα → 0. Then δI(α)/δα becomes dI(α)/dα, and the integrand becomes
∂f(t, α)/∂α, which is the result (28.12).
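evaluate
\[ J(\alpha) = \int_0^{\infty} \frac{dt}{(t^2 + \alpha^2)^2}. \]
From Appendix E,
\[ I(\alpha) = \int_0^{\infty} \frac{dt}{t^2 + \alpha^2} = \left[ \alpha^{-1} \arctan(t/\alpha) \right]_0^{\infty} = \frac{\pi}{2\alpha}. \]
By (28.12),
\[ \frac{dI}{d\alpha} = \int_0^{\infty} \frac{\partial}{\partial\alpha} \left( \frac{1}{t^2 + \alpha^2} \right) dt = \int_0^{\infty} \frac{-2\alpha}{(t^2 + \alpha^2)^2} \, dt = \frac{d}{d\alpha} \left( \frac{\pi}{2\alpha} \right) = -\frac{\pi}{2\alpha^2}. \]
Therefore
\[ J(\alpha) = \int_0^{\infty} \frac{dt}{(t^2 + \alpha^2)^2} = \frac{\pi}{4\alpha^3}. \]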
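The result J(α) = π/(4α³) can be confirmed in two independent ways: by direct numerical integration, and by differentiating I(α) = π/(2α) numerically and using dI/dα = −2αJ(α). The sketch below does both; the value α = 1.5, the truncation of the infinite range at t = 200, and the step sizes are assumptions made for illustration.

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule estimate of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def J_numeric(alpha, T=200.0, n=200_000):
    """Direct estimate of J(alpha), truncating the infinite range at t = T."""
    return midpoint(lambda t: 1.0 / (t * t + alpha * alpha) ** 2, 0.0, T, n)

def J_formula(alpha):
    return math.pi / (4 * alpha ** 3)

def I(alpha):
    return math.pi / (2 * alpha)

def dI(alpha, h=1e-5):
    """Central-difference estimate of dI/d(alpha)."""
    return (I(alpha + h) - I(alpha - h)) / (2 * h)

alpha = 1.5
print(J_numeric(alpha), J_formula(alpha))
print(dI(alpha), -2 * alpha * J_formula(alpha))
```

The neglected tail of the integral beyond t = T is smaller than 1/(3T³), which is negligible here.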
Self-test 28.6
\[ K(\alpha) = \int_0^{\infty} \frac{dt}{(t^2 + \alpha^2)^3}. \]
Problems
28.1 Sketch contour maps of the following functions:
(a) 2x − 3y + 4; (b) −x + 2y − 1;
(c) (x − 1)(y − 1); (d) x² + ¼y² − 1;
(e) x² + 2x + y² (complete the square in x);
(f) y/x; (g) y² − x²; (h) y/x³;
(i) x³ + 4y²; (j) y/(x + y).

28.2 By sketching rough contour maps, indicate the paths of steepest ascent (the paths on which z increases most rapidly), starting at the point (1, 1):
(a) z = 2x − 3y + 4; (b) z = x − y;
(c) z = x²y²; (d) z = (x − 1)² + ¼(y − 1)².

28.3 Obtain ∂f/∂x and ∂f/∂y at the point (2, 1) for the following functions.
(a) 3x + 7y − 2; (b) −2x + 3y + 4;
(c) 2x² − 3y² − 2xy − x − y + 1;
(d) ⅛x³ + y³ − 2y − 1; (e) x⁴y² − 1;
(f) (x − 1)(y − 2); (g) 1/(xy);
(h) x/y; (i) (x − y)/(x + y); (j) 3/(x² + y²);
(k) (x² + y²)^{1/2}; (l) (2x − 3y + 2)³; (m) e^{x²+y²};

28.5 In plane polar coordinates (r, θ) in the first quadrant, r = (x² + y²)^{1/2} and x = r cos θ. Form ∂r/∂x and ∂x/∂r, and show that
\[ \frac{\partial r}{\partial x} \frac{\partial x}{\partial r} \ne 1. \]
By considering the meaning of the derivatives ∂r/∂x and ∂x/∂r near a particular point P in the manner of Fig. 28.6, show why it is not to be expected that the product should equal 1. (In the case of a single variable and ordinary derivatives, we often get true results by formally cancelling out symbols like dx, du, etc., as in the chain rule. This almost never works when more variables are present: see for example the next problem.)

28.6 (a) Let z = sin(x − y); show that
\[ \frac{\partial z}{\partial x} \bigg/ \frac{\partial z}{\partial y} = -1. \]
(b) Let z = g(x − y); show that
\[ \frac{\partial z}{\partial x} \bigg/ \frac{\partial z}{\partial y} = -1. \]

28.7 Show that, if z = g(x/y), then
∂V/ ∂x and ∂V/ ∂y, firstly in terms of x and y, then 28.9 Confirm that, if r = (x 2 + y 2 ) 2 and
1
new variables for the minimization.)
Show that z = ln r is a solution of the equation
∂ 2z ∂ 2z
+ = 0. 28.17 N points (x1, y1), (x2, y2), … , (xN, yN) are
∂x 2 ∂y 2 given in a plane, and P : (x, y) is a general point.
(This is called Laplace’s partial differential Find P so that the sum of the squares of its
equation in two dimensions.) distances from the N given points is as small
as possible.
28.10 Obtain the tangent plane and a normal
vector for the following surfaces at the points 28.18 (a) A rectangular box with a lid must hold
given. a given volume V, and have the smallest possible
(a) z = x2 + y2 at (1, 1, 2); (b) z = xy at (2, 2, 4); surface area. Show that it must be a cube. (Call the
(c) z = x /y at (2, 1, 2); lengths of two of its sides x and y.)
(d) z = (29 − x 2 − y 2 ) 2 at (3, 4, 2);
1
(b) An open-topped rectangular box must have a
(e) z = x2 + y2 − 2x − 2y at (1, 1, −2); given volume V and its surface area must be as small
(f ) z = exy at (0, 0, 1). as possible. Find its dimensions.
(c) A circular-cylindrical box must have a fixed
28.11 The two surfaces z = x2 + y2 and z = x − y + 2 volume V and minimum surface area. Find its
intersect at the point Q : (1, 1, 2). Find normal dimensions (i) if it has a lid, (ii) if it has no lid.
vectors at Q to each of the two surfaces, n 1 to the (d) A rectangular container is required to have
first and n 2 to the second. By considering the total surface area S, and a volume as large as
scalar product n 1 · n 2, find the angle between possible. Find its dimensions (i) if it has a lid,
the normals and hence the angle at which the (ii) if it does not have a lid.
surfaces cut at Q.
28.19 Find the straight line which best fits the
28.12 Find the stationary points of the following experimental data in the sense of Section 28.7:
functions, and classify them using (28.9). x 1 2 3 4 5
(a) (x − 1)(y + 2); (b) x2 + y2 − 2x + 2y; y 3.1 2.1 2.0 1.8 1.2
(c) 31 x3 − 31 y3 − x + y + 3; (d) cos x + cos y;
(e) ln(x2 + x) + ln( y2 + y); (f) ex + y −2x+2y;
2 2
28.20 The population P of a fast-breeding rodent
(g) xy + 1 /x + 1 /y; (h) x3 + y3 − 3xy + 1; was observed over a period of 12 months, and the
(i) sin x + sin y; (j) xy2 − x2y + x − y + 1; following estimates obtained:
(k) (x2 − y2) + 2xy; (l) (2 − x2 − y2)2;
t (months) 0 2 3 5 8 10 12
(m) x4 + y4 + y − x;
P (pop’n) 12 23 26 60 170 300 690
(n) x4 + y4 (this eludes the test (28.9) − the point is
obviously a minimum). Assume that the underlying growth law takes the
form (see Section 1.12)
28.13 Classify the stationary point of ax2 + 2hxy P = A ebt,
+ by2 at (0, 0) for various relations between a, b, where A and b are constants.
and h. To estimate A and b, take the logarithm of this
expression and treat y = ln P as a variable in the
28.14 Find positive numbers a, b, c so that least-squares method of Section 28.7.
(a) a + b + c = 21 and abc is a maximum.
(b) abc = 64 and a + b + c is a minimum. 28.21 For the least-squares method of
Section 28.7, use the test (28.9) to show that the
28.15 Find the absolute maximum value of
values of a and b obtained do minimize the sum
(2 − x2 − y2)2 in the ‘box’ −1 x 2, −1 y 1.
of squares. (This is, of course, rather obvious
(It will be necessary to investigate the function on
intuitively.)
the four edges of the box separately, since the
absolute maximum will not be revealed by the
conditions (28.9) if it is on the edges.) 28.22 Using Laplace transforms with respect to t,
solve the partial differential equation
28.16 Find the shortest distance between the ∂z ∂z
+x + z = 2x,
straight lines x = y = z and 2x = y = z + 2, by using a ∂t ∂x
for x 0 and t 0, where z(0, t) = 0 and 28.24 If z = f(x, y), how many nth-order
z(x, 0) = 0. partial derivatives of f(x, y) are these of the
FUNCTIONS OF TWO VARIABLES: GEOMETRY AND FORMULAE
In Section 28.4 the tangent plane at a point on the surface z = f(x, y) was defined
to be the plane that best fits the surface in the neighbourhood of the point.
Written in algebraic terms this property becomes the incremental approximation
to the surface in the neighbourhood of the point, and constitutes the best linear
approximation to f(x, y) close to the point of contact. The incremental approximation
is the origin of all applications of this topic through this and the next two
chapters. An immediate application is to the question of approximating to the
effect of making small changes δx, δy in the variables in a complicated formula
z = f(x, y), and the associated question of estimating errors in z when x and y are
subject to errors.
[Fig. 29.1: the surface z = f(x, y) and its tangent plane at Q, showing the increments δx, δy, and δz.]
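On the tangent plane at Q : (a, b), the change in z corresponding to the steps δx and δy is
\[ \delta z = \left(\frac{\partial f}{\partial x}\right)_{(a,b)} \delta x + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} \delta y. \]
Compare this with f(x, y) − f(a, b), where x = a + δx and y = b + δy. This is the exact
change in z on the surface z = f(x, y) from its value at Q. The tangent plane is the
best-fitting plane to the surface at Q, so the formula
\[ f(x, y) - f(a, b) = \delta f \approx \left(\frac{\partial f}{\partial x}\right)_{(a,b)} \delta x + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} \delta y \qquad (29.1) \]
gives the best linear approximation to the change in f near Q.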
You are more likely to remember the formula obtained by calling the general
point (x, y) instead of (a, b), and putting z in place of f. Also the approximation
will be good enough to be useful only when δx and δy are ‘small’ (how small will
depend on circumstances):
For small enough increments δx and δy:
\[ f(x+\delta x, y+\delta y) - f(x, y) \approx \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y. \qquad (29.2) \]
This will be the source of almost all our results from now on, but remember
that ∂z/∂x and ∂z /∂y in (29.2) are constants given by the explicit formulas (28.2)
and (29.1).
We see from (d) in the Example that the approximation improves percentage-
wise as δx and δy get smaller: it is not merely that the error decreases because δx,
δy, δz all go to zero together. The following Example shows the reason for this.
Example 29.2 Find the exact algebraic form of the error incurred by using (29.2)
to estimate δz at (2, 1) when z = x^2 + 3y^2 (see Example 29.1a).
Put x = 2 + δx and y = 1 + δy. Then
δz = f(2 + δx, 1 + δy) − f(2, 1) = (2 + δx)^2 + 3(1 + δy)^2 − 7
= (4 δx + 6 δy) + ((δx)^2 + 3(δy)^2).
The first two terms represent the linear approximation obtained in Example 29.1a.
The remainder is the error incurred, the part we ignore in the approximation. The
error consists only of higher powers of δx and δy, and this will always be the case.
Therefore the error is an order of magnitude smaller than the linear terms retained
in the incremental approximation (29.2).
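The behaviour described in Example 29.2 can be seen numerically. The sketch below compares the exact change in z = x^2 + 3y^2 at (2, 1) with the linear estimate 4 δx + 6 δy for steps of decreasing size; the function names and the particular step sizes are illustrative assumptions only.

```python
def f(x, y):
    return x ** 2 + 3 * y ** 2

def compare(dx, dy, x=2.0, y=1.0):
    """Return (exact change, linear estimate) for a step (dx, dy) from (x, y)."""
    exact = f(x + dx, y + dy) - f(x, y)
    # Partial derivatives of f: df/dx = 2x and df/dy = 6y
    linear = 2 * x * dx + 6 * y * dy
    return exact, linear

for scale in (0.1, 0.01, 0.001):      # hypothetical step sizes
    dx, dy = scale, 2 * scale
    exact, linear = compare(dx, dy)
    error = exact - linear            # equals dx**2 + 3*dy**2 up to rounding
    print(scale, exact, linear, error, error / exact)
```

The last column, the error as a fraction of the exact change, should shrink roughly in proportion to the step size, because the error is quadratic in the increments while the retained terms are linear.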
Self-test 29.1
The hypotenuse of a right-angled triangle with side lengths x and y is given
by z = √(x^2 + y^2). Find an approximation to δz in terms of δx and δy.
Calculate the approximate change δz at x = 3, y = 4 if δx = 0.1 and
δy = −0.1.
Therefore, approximately,
\[ \delta z = \left(-\tfrac{3}{125}\right)(0.1) + \left(-\tfrac{4}{125}\right)(-0.2) = 0.004. \]
(The exact value of δz is 0.003 91 … .)
Example 29.4 The period T of the swings of a pendulum is equal to 2π(l/g)^{1/2},
where l is its length and g the gravitational constant. Estimate the error in
calculating T if, instead of using closely correct values l = 1.015 and g = 9.812
in the formula, we use the rounded values l = 1 and g = 10.
The formula corresponding to (29.2) is
\[ \delta T \approx \frac{\partial T}{\partial l}\,\delta l + \frac{\partial T}{\partial g}\,\delta g. \]
Suppose for simplicity we decide to substitute the rounded values l = 1 and g = 10 into
the coefficients: we obtain
\[ \frac{\partial T}{\partial l} = \left(\pi l^{-1/2} g^{-1/2}\right)_{(1,10)} = 0.993, \qquad \frac{\partial T}{\partial g} = \left(-\pi l^{1/2} g^{-3/2}\right)_{(1,10)} = -0.099. \]
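The coefficients quoted for the pendulum can be reproduced, and the resulting estimate compared with the directly computed change in T, by a short calculation. This is a sketch under our own assumptions: the variable names are hypothetical, and the increments are taken as (closely correct value) − (rounded value), that is δl = 0.015 and δg = −0.188.

```python
import math

def period(l, g):
    return 2 * math.pi * math.sqrt(l / g)

l_ref, g_ref = 1.0, 10.0          # rounded values used as the reference point
l_true, g_true = 1.015, 9.812     # closely correct values from the example

# Partial derivatives of T = 2*pi*(l/g)**0.5 evaluated at the reference point
dT_dl = math.pi * l_ref ** -0.5 * g_ref ** -0.5
dT_dg = -math.pi * l_ref ** 0.5 * g_ref ** -1.5

dl, dg = l_true - l_ref, g_true - g_ref
estimate = dT_dl * dl + dT_dg * dg                        # incremental approximation
direct = period(l_true, g_true) - period(l_ref, g_ref)    # computed without approximation

print(round(dT_dl, 3), round(dT_dg, 3))
print(estimate, direct)
```

The two printed changes in T should be close but not identical, the difference being the higher-order error discussed in Example 29.2.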
In the last example, we substituted the rounded (erroneous) values into ∂T/ ∂ l
and ∂T/∂g, which led to a complication we might have avoided. However, usually
there is no choice, the exact values being unknown. Let z = f(x, y), and suppose
that we want to estimate the error in z which could arise from using measured
(i.e. approximate) values for x and y. The error ∆x in x is defined to be
∆x = (measured value of x) − (exact value of x),
and similarly for ∆y and ∆z.
Usually we only know a range of possible error, not the errors themselves. For
example, we might say that a parcel weighed 1430(±15) g, meaning that we think
it is between 1415 g and 1445 g. Therefore, the values of ∆x and ∆y are unknown,
so the exact values of x and y are unknown, and are not available to go into (29.2)
in place of (x, y). Instead, in such cases, take x, y to be convenient reference values,
at which the derivatives are evaluated. To correspond with this, the definition of
δx, δy, δz in (29.2) requires
δx, δy, δz = (true values) − (reference values).
Therefore
δx = −∆x, δy = −∆y, δz = −∆z
go into (29.2). Every term then has a negative sign, so the formula in terms of ∆x,
∆y, ∆z has the same shape as the incremental formula:
Small-error formula
If z = f(x, y), then
\[ \Delta z = \frac{\partial z}{\partial x}\,\Delta x + \frac{\partial z}{\partial y}\,\Delta y \quad \text{(approximately)}, \]
where x and y are reference values, and ∆ stands for
error = (reference value) − (exact value). (29.3)
\[ a = \frac{c \sin A}{\sin(A + B)}. \]
Suppose that c = 10 (exactly), and angles A and B are measured to 5° accuracy:
A = 45(±5)°, B = 30(±5)°. Estimate the largest possible resulting error in a.
Put
\[ a = f(A, B) = 10\,\frac{\sin A}{\sin(A + B)}, \]
where A and B are measured in radians. Then
\[ \frac{\partial a}{\partial A} = 10\,\frac{\sin(A + B)\cos A - \cos(A + B)\sin A}{\sin^2(A + B)} = 10\,\frac{\sin B}{\sin^2(A + B)}. \]
Also
\[ \frac{\partial a}{\partial B} = -10\,\frac{\cos(A + B)\sin A}{\sin^2(A + B)}. \]
Choose as reference values A = 45° and B = 30° (for this seems to be the simplest choice).
We get ∂a/∂A = 5.36 and ∂a/∂B = −1.96. The error formula (29.3) becomes
∆a = 5.36 ∆A − 1.96 ∆B
approximately, where ∆A and ∆B must be measured in radians.
The greatest possible magnitude of ∆a occurs if ∆A and ∆B happen to have
opposite signs and their greatest possible magnitudes; that is, if ∆A = −∆B = ±0.087
radians. In that case, ∆a = ±0.64. Therefore
a = f(π/4, π/6) ± 0.64 = 7.32 ± 0.64,
showing a possible error of about 8.7%.
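The numbers in this example are easy to reproduce. The following sketch evaluates the two partial derivatives at the reference angles and forms the worst-case error from (29.3); the function names are our own, and the worst case is taken, as in the text, to be the sum of the magnitudes of the two terms.

```python
import math

def side_a(A, B, c=10.0):
    return c * math.sin(A) / math.sin(A + B)

def partials(A, B, c=10.0):
    """Partial derivatives of a with respect to A and B (angles in radians)."""
    s = math.sin(A + B)
    da_dA = c * math.sin(B) / s ** 2
    da_dB = -c * math.cos(A + B) * math.sin(A) / s ** 2
    return da_dA, da_dB

A0, B0 = math.radians(45), math.radians(30)   # reference values
tol = math.radians(5)                         # 5 degrees expressed in radians

da_dA, da_dB = partials(A0, B0)
# Worst case: the two error terms reinforce each other
max_error = (abs(da_dA) + abs(da_dB)) * tol
a0 = side_a(A0, B0)

print(round(da_dA, 2), round(da_dB, 2))
print(round(a0, 2), round(max_error, 2), round(100 * max_error / a0, 1))
```

The printed percentage should be close to the 8.7% quoted above.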
∂b ∂c
(b) Since b and c are rounded numbers, all that we know about them is that
b = 3.1(±0.05), c = 2.1(±0.05),
meaning that the error might be anywhere in the range indicated. Putting the reference
values b = 3.1 and c = 2.1 into (a), we obtain
\[ \frac{\partial x}{\partial b} = 0.909, \qquad \frac{\partial x}{\partial c} = -0.909; \]
so, by (29.3),
∆x = 0.909 ∆b − 0.909 ∆c.
This takes its greatest possible magnitude when ∆b and ∆c take their maximum magnitudes
and have opposite signs: that is, when
∆b = ±0.05, ∆c = ∓0.05.
In that case ∆x = ±0.909(0.05 + 0.05) = ±0.091.
The value of x estimated from the rounded coefficients is x = −1. Although the rounding
error is only at most 2.4%, the error in the solution could be as large as ±9.1%.
Self-test 29.2
The volume V of a circular cylinder of base radius r and height h is given by
V = πr 2h. Find δV in terms of δr and δh. If r = 4 and h = 5 and r increases by
[Fig. 29.2: a short step PQ of length δs at an angle θ to the positive x axis, with components δx and δy.]
Consider the direction PQ which makes an angle θ with the positive x axis,
the direction for positive angles being anticlockwise as with polar coordinates.
Let the length PQ = δs, a short step, and let δx and δy be as shown. Then, by
(29.2), the change in elevation in this direction is given approximately by
\[ \delta z \approx \frac{\partial z}{\partial x}\,\delta x + \frac{\partial z}{\partial y}\,\delta y. \]
Divide by δs; we obtain
\[ \frac{\delta z}{\delta s} \approx \frac{\partial z}{\partial x}\frac{\delta x}{\delta s} + \frac{\partial z}{\partial y}\frac{\delta y}{\delta s} = \frac{\partial z}{\partial x}\cos\theta + \frac{\partial z}{\partial y}\sin\theta \]
from Fig. 29.2. Now let δs → 0; the approximation becomes exact, and we have
an expression for the slope in any direction. Using the notation for the directional
derivative,
\[ \lim_{\delta s \to 0} \frac{\delta z}{\delta s} = \frac{dz}{ds}, \]
we have the following formula.
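Directional derivative
At a point on the surface z = f(x, y), the slope in the direction making an angle θ
with the positive x axis is
\[ \frac{dz}{ds} = \frac{\partial z}{\partial x}\cos\theta + \frac{\partial z}{\partial y}\sin\theta. \qquad (29.4) \]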
[Fig. 29.3: the point P : (2, 3) and the direction PQ, with the angle 120° marked.]
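\[ \left(\frac{\partial z}{\partial x}\right)_{(2,3)} = (y + 2x)_{(2,3)} = 7, \qquad \left(\frac{\partial z}{\partial y}\right)_{(2,3)} = (x)_{(2,3)} = 2. \]
Also
cos(−120°) = −sin 30° = −1/2
and
sin(−120°) = −cos 30° = −(1/2)√3;
so
\[ \frac{dz}{ds} = 7\left(-\tfrac{1}{2}\right) + 2\left(-\tfrac{1}{2}\sqrt{3}\right) = -\tfrac{1}{2}(7 + 2\sqrt{3}), \]
which means that the surface is descending in the direction (−120)°.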
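The value just obtained can be checked by a direct finite-difference estimate of the slope. In the sketch below the surface is assumed to be z = xy + x^2, chosen only because it has the partial derivatives y + 2x and x used above (the statement of the example is not reproduced here, so this choice is an inference); the function names are hypothetical.

```python
import math

def z(x, y):
    # Assumed surface, consistent with dz/dx = y + 2x and dz/dy = x
    return x * y + x ** 2

def slope(x, y, theta_deg):
    """Directional derivative dz/ds = z_x cos(theta) + z_y sin(theta)."""
    theta = math.radians(theta_deg)
    return (y + 2 * x) * math.cos(theta) + x * math.sin(theta)

def slope_numeric(x, y, theta_deg, ds=1e-6):
    """Central-difference estimate of the same slope, as an independent check."""
    theta = math.radians(theta_deg)
    dx, dy = ds * math.cos(theta), ds * math.sin(theta)
    return (z(x + dx, y + dy) - z(x - dx, y - dy)) / (2 * ds)

print(slope(2, 3, -120))                 # from the directional-derivative formula
print(slope_numeric(2, 3, -120))         # from a small step along the direction
print(-0.5 * (7 + 2 * math.sqrt(3)))     # closed form quoted in the example
```

All three printed values should agree to several decimal places, and the negative sign confirms the descent.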
Example 29.8 The temperature distribution in a plate heated at the point (0, 0)
is given by T = 1/(x^2 + y^2)^{1/2}. (a) Find the temperature gradient at the point (3, 3)
in a direction of 45° to the positive x axis. (b) In polar coordinates, T = 1/r. Show
that the result (a) is the same as ∂T/∂r taken at any point on the circle r = 3√2.
(a)
\[ \left(\frac{\partial T}{\partial x}\right)_{(3,3)} = \left(-\frac{x}{(x^2+y^2)^{3/2}}\right)_{(3,3)} = -\frac{1}{18\sqrt{2}}, \]
\[ \left(\frac{\partial T}{\partial y}\right)_{(3,3)} = \left(-\frac{y}{(x^2+y^2)^{3/2}}\right)_{(3,3)} = -\frac{1}{18\sqrt{2}}. \]
Also cos θ = 1/√2 and sin θ = 1/√2. Therefore the temperature gradient at (3, 3)
in the given direction is
\[ \frac{dT}{ds} = \left(-\frac{1}{18\sqrt{2}}\right)\frac{1}{\sqrt{2}} + \left(-\frac{1}{18\sqrt{2}}\right)\frac{1}{\sqrt{2}} = -\frac{1}{18}. \]
[Fig. 29.4: the contour directions and the directions of steepest ascent and steepest descent at P; the angles marked are 60°, 150°, −30°, and −120°.]
In the last example, the directions of steepest ascent/descent at any point are
perpendicular to the directions of the contours; we shall now show that this is
true for all surfaces. On the contour map of z = f(x, y), the slope in the direction
θ at a point P : (x, y) has the form (29.4):
\[ \frac{dz}{ds} = A\cos\theta + B\sin\theta, \]
where A and B are the values of ∂z/∂x and ∂z/∂y at P. This is zero in the directions
θ1 where
tan θ1 = −A/B.
The two directions θ1 which satisfy this equation differ by π, so they indicate
smooth passage of the contour through P. The gradient dz/ds is a maximum or
minimum when
\[ \frac{d}{d\theta}\left(\frac{dz}{ds}\right) = 0, \]
or in directions θ2 where
tan θ2 = B/A,
which give the directions of steepest ascent/descent. Since
tan θ1 tan θ2 = −1,
these directions are perpendicular (see (1.9)), a fact known intuitively by any hill
walker.
Steepest ascent/descent
At each point on the map of z = f(x, y), the direction of steepest ascent/descent
is perpendicular to the contour. (29.5)
The two systems of curves, consisting of the contours and the curves which follow
directions of steepest ascent or descent, are perpendicular wherever they cross, so
Self-test 29.3
A surface is defined by z = e^{−x^2 − 2y^2}. A person walks on the surface on the line
x + y = 1 between (0, 1) and (1, 0). Find the rate of ascent/descent of the
walker at each point on the line.
29.4 IMPLICIT DIFFERENTIATION
[Fig. 29.5: the curve f(x, y) = c, with neighbouring points P and Q on it separated by the steps δx and δy.]
of x and y. Choose any point P : (x, y) on the curve (Fig. 29.5), and move along it
a short distance to Q : (x + δx, y + δy). Then dy /dx on the curve is given by
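\[ \frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x}. \]
Since P and Q both lie on the curve, δf = 0; so the incremental approximation (29.2)
gives
\[ \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y \approx 0, \]
or
\[ \frac{\delta y}{\delta x} \approx -\frac{\partial f}{\partial x} \bigg/ \frac{\partial f}{\partial y}. \]
Now let δx → 0. The ‘≈’ becomes ‘=’, and δy/δx becomes dy/dx, from which we
obtain:
\[ \frac{dy}{dx} = -\frac{\partial f}{\partial x} \bigg/ \frac{\partial f}{\partial y} \quad \text{on the curve } f(x, y) = c. \qquad (29.6) \]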
Example 29.10 Find an expression for dy/dx at a general point (x, y) on the
circle x^2 + y^2 = 4.
Here f(x, y) = x^2 + y^2, and so
\[ \frac{\partial f}{\partial x} = 2x, \qquad \frac{\partial f}{\partial y} = 2y. \]
Therefore, by (29.6),
\[ \frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y}, \]
where x^2 + y^2 = 4.
In the last Example we would have obtained exactly the same result for the
circle x^2 + y^2 = 1, or x^2 + y^2 = 100. It is the numerical values of x and y to be put in
the right-hand side which will distinguish the circle under discussion from all the
other circles. In fact the equation we obtained,
\[ \frac{dy}{dx} = -\frac{x}{y}, \]
can be thought of as a differential equation. Its solutions (obtained by the method
of Section 22.3) are x^2 + y^2 = C, which includes the given circle and all the others
as well.
Example 29.11 Find dy/dx on the curve x^3 y − x y^3 = 6 at the point (2, 1).
(You can check that the point (2, 1) is really on the curve.) Putting f(x, y) = x^3 y − x y^3,
we have
\[ \frac{\partial f}{\partial x} = 3x^2 y - y^3, \qquad \frac{\partial f}{\partial y} = x^3 - 3xy^2. \]
Therefore, at any point (x, y) on the curve,
\[ \frac{dy}{dx} = -\frac{3x^2 y - y^3}{x^3 - 3xy^2}. \]
At (2, 1), the slope is
\[ \left(\frac{dy}{dx}\right)_{(2,1)} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{11}{2}. \]
(This is not a differential equation: it is a numerical value which holds at only a
single point.)
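The slope found in Example 29.11 can be confirmed without the implicit formula by locating two nearby points on the curve numerically and taking the ratio δy/δx. The sketch below does this with a few Newton steps in y at fixed x; the helper names, the starting guess, and the step h are our own illustrative choices.

```python
def f(x, y):
    return x ** 3 * y - x * y ** 3

def dydx(x, y):
    """Slope from the implicit formula: -(df/dx)/(df/dy)."""
    return -(3 * x ** 2 * y - y ** 3) / (x ** 3 - 3 * x * y ** 2)

def y_on_curve(x, y_guess, c=6.0, steps=50):
    """Solve f(x, y) = c for y near y_guess by Newton's method (x held fixed)."""
    y = y_guess
    for _ in range(steps):
        y -= (f(x, y) - c) / (x ** 3 - 3 * x * y ** 2)
    return y

h = 1e-5   # small step in x (hypothetical choice)
numeric = (y_on_curve(2 + h, 1.0) - y_on_curve(2 - h, 1.0)) / (2 * h)

print(dydx(2.0, 1.0))   # value from the implicit formula
print(numeric)          # ratio of small changes measured along the curve
```

The two printed values should agree closely, both being near −11/2.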
The link with differential equations can be used in many ways, as in the fol-
lowing example.
y dy = x dx
or
y^2 − x^2 = B,
where B is an arbitrary constant. This is another family of hyperbolas. A small region of
the (x, y) plane is shown in Fig. 29.6.
[Fig. 29.6: the two families of curves xy = C and y^2 − x^2 = B near a point P.]
Self-test 29.4
Using the implicit formula (29.6), find dy/dx where x^2 + 2xy + 4y^2 = 4 (the
curve is an ellipse: draw a sketch of it). Find where the maximum and minimum
values occur on the ellipse.
29.5 NORMAL TO A CURVE
\[ n = \left(\left(\frac{\partial f}{\partial x}\right)_P, \left(\frac{\partial f}{\partial y}\right)_P\right). \]
Any multiple of this n is also a normal at the point. Dropping the suffix P, we have
the following result.
Example 29.13 Find several normal vectors at the point (2, 1) on the curve
x^2 + y^2 = 5.
Putting f(x, y) = x^2 + y^2, we have ∂f/∂x = 2x and ∂f/∂y = 2y; so
\[ \left(\frac{\partial f}{\partial x}\right)_{(2,1)} = 4, \qquad \left(\frac{\partial f}{\partial y}\right)_{(2,1)} = 2. \]
Therefore one vector normal to the circle at (2, 1) is
n = (4, 2)
and from this any number of other normal vectors can be constructed by taking
multiples. For example, (2, 1), (−2, −1), and (2/√5, 1/√5) are also normals, the last one
being a unit normal (one having unit length), which is often important.
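A short calculation confirms these normals. The sketch below forms n from the partial derivatives, scales it to unit length, and checks that it is perpendicular to the tangent direction (1, dy/dx) = (1, −2) obtained from dy/dx = −x/y at (2, 1). The function names are our own.

```python
import math

def normal(x, y):
    """Normal to the curve x^2 + y^2 = const at (x, y): (df/dx, df/dy)."""
    return (2 * x, 2 * y)

def unit(v):
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)

n = normal(2, 1)
n_hat = unit(n)

# Tangent direction at (2, 1): dy/dx = -x/y = -2, so (1, -2) lies along the curve
tangent = (1.0, -2.0)
dot = n[0] * tangent[0] + n[1] * tangent[1]

print(n, n_hat, dot)
```

The scalar product printed last should be zero, and the unit normal should match (2/√5, 1/√5).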
[Fig. 29.7: the curves x^2 + y^2 = 5 and x^2 − y^2 = 3 meeting at (2, 1), with the normals n1 and n2 and the angle θ between them.]
Self-test 29.5
On the ellipse x2 + 2xy + 4y2 = 4 (see Self-test 29.4), find the direction of the
normal to the ellipse at the point x = 1, y 0 on the ellipse.
\[ \frac{\partial f}{\partial x}\,\hat{\imath} + \frac{\partial f}{\partial y}\,\hat{\jmath} \quad \text{or} \quad \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \]
as a vector function. We call this vector function the gradient of f and denote it by
grad f or ∇f
(∇ is pronounced ‘del’ or ‘nabla’). We shall see that it works rather like an ordinary
derivative, but in two dimensions; hence its name.
Alternatively we can regard the symbol grad or ∇ standing alone as an operator
(compare d/dx): it operates on scalar functions f(x, y), instructing us to carry out
the operation î ∂/∂x + ĵ ∂/∂y or (∂/∂x, ∂/∂y) on f(x, y):
\[ \operatorname{grad} f(x, y) = \left(\hat{\imath}\,\frac{\partial}{\partial x} + \hat{\jmath}\,\frac{\partial}{\partial y}\right) f(x, y). \]
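Example 29.15 Let f(x, y) = x^2 + y^2. Obtain (a) the vector function grad f;
(b) the value of grad f at the point (1, 2); (c) an expression for the magnitude,
or length, of grad f at (x, y).
(a)
\[ \operatorname{grad} f = \frac{\partial f}{\partial x}\,\hat{\imath} + \frac{\partial f}{\partial y}\,\hat{\jmath} = 2x\,\hat{\imath} + 2y\,\hat{\jmath}; \]
or we can use the alternative notations, and even the operator viewpoint:
\[ \nabla f = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right)(x^2 + y^2) = (2x, 2y). \]
(b) At x = 1, y = 2, we have
grad f = (2, 4).
(c) The magnitude or length of a vector v = (a, b) is |v| = (a^2 + b^2)^{1/2}, so
\[ |\operatorname{grad} f| = (4x^2 + 4y^2)^{1/2} = 2(x^2 + y^2)^{1/2}. \]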
We can re-express some earlier results in terms of grad. For example, we may
write (29.7) immediately as follows.
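If S = Uî + Vĵ, then
\[ U\,\frac{\partial f}{\partial x} + V\,\frac{\partial f}{\partial y} = (U, V) \cdot \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) = S \cdot \operatorname{grad} f. \qquad (29.10) \]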
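Now consider the directional-derivative formula (29.4), regarding it as representing
the rate of change of f(x, y) in the direction θ:
\[ \frac{df}{ds} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta. \]
This has the form of the left-hand side of (29.10) with U = cos θ and V = sin θ, so put
î cos θ + ĵ sin θ = v,
where v is a unit vector pointing in the desired direction, and (29.10) becomes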
Directional derivative
In the direction of a unit vector v, the rate of change of f(x, y) is given by
\[ \frac{df}{ds} = v \cdot \operatorname{grad} f; \]
that is to say, df/ds is equal to the component of grad f in the direction of v.
(29.11)
Equation (29.11) can be written in a different way. If a and b are two vectors,
then the angle between them, φ, can be obtained from the identity
a · b = |a| |b| cos φ, with 0 ⩽ φ ⩽ π
(see (10.4)). If we put a = v and b = grad f, and use the fact that |v| = 1, we obtain
an alternative form of (29.11):
\[ \frac{df}{ds} = |\operatorname{grad} f| \cos\phi, \qquad (29.12) \]
where φ is the angle between v and grad f.
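To see the two forms of the directional derivative side by side, the following sketch uses f(x, y) = x^2 + y^2 at the point (1, 2) from Example 29.15. It computes v · grad f for one direction, compares it with |grad f| cos φ, and scans all whole-degree directions to find the largest rate of change. The choice of the 30° direction and the one-degree scan are arbitrary assumptions made for illustration.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + y^2
    return (2.0 * x, 2.0 * y)

def directional_derivative(x, y, theta):
    """v . grad f with v = (cos theta, sin theta)."""
    gx, gy = grad_f(x, y)
    return math.cos(theta) * gx + math.sin(theta) * gy

gx, gy = grad_f(1.0, 2.0)
magnitude = math.hypot(gx, gy)        # |grad f|
grad_angle = math.atan2(gy, gx)       # direction of grad f

theta = math.radians(30)              # sample direction (hypothetical)
lhs = directional_derivative(1.0, 2.0, theta)
rhs = magnitude * math.cos(theta - grad_angle)   # |grad f| cos(phi)

steepest = max(directional_derivative(1.0, 2.0, math.radians(d)) for d in range(360))

print(lhs, rhs)
print(steepest, magnitude)
```

The first pair should coincide, and the largest scanned rate of change should be just below |grad f|, attained when v points almost along grad f.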
Self-test 29.6
Find the rate of change of f(x, y) = x2 − 2xy2 + y3 at (−1, 1) in the direction
(1, 2).
Problems
1
29.1 Use the incremental approximation (29.1) or b2 tan A tan C
S= 2
.
(29.2) to estimate the change δz due to changes δx tan A + tan C
and δy as specified, and check the percentage error
by calculating the exact result. Suppose that nominally b = 2, A = 30°, C = 60°,
(a) z = x2 + y2 at (3, 1), δx = 0.1, δy = 0.3; but that C is found to be too large by 5%. By what
(b) z = sin xy at (0.5, 1.2), δx = 0.1, δy = −0.05; amount should A be changed so that S would be
(c) z = ex +3y at (1, 1), δx = 0.1, δy = 0.2;
2 2 restored to the correct area?
(d) z = 1/(x2 + y2) 2 at (2, 1), δx = −0.2, δy = 0.1.
1
dy
1 1 1 29.10 Find at the prescribed points on the
+ = . dx
u v f curves given.
Suppose that the measured values of u and v are (a) xy = 1 at (2, 12);
u = 0.31(±0.01), v = 0.56(±0.03); calculate the (b) x2 + y2 = 25 at (3, 4);
greatest possible error in estimating f, and the (c) 1/x − 1/y = 12 at (1, 2);
corresponding percentage error. (d) 101 x 2 + 151 y 2 = 1 at (2, 3);
(e) x3 + 2y3 = 3 at (1, 1);
29.5 A viscous liquid is forced through a tube of (f) x3y + 3x2 − y2 − 19 = 0 at (2, 1);
diameter d = 0.002 ± 0.0001m and length l = 0.1m (g) xy2 − x2y + 6 = 0 at (3, 2);
under a pressure p = 5000 ± 50Nm−2, and is found (h) x2 + y2 = 4 at (2 cos θ, 2 sin θ );
to pass fluid at a rate q = 1.66 × 10−6m3s−1. The (i) x2/a2 + y2/b2 = 1 at (a cos t, b sin t);
viscosity η is given by the formula (j) x cos y = y sin x at (π /2, 0);
π pd 4 (k) y2 − 4ax = 0 at (at2, 2at).
η= .
128 ql
Find the maximum error in the viscosity estimate. 29.11 The ideal-gas equation, for a fixed mass of
gas is PV = RT, where R is a constant. There are
29.6 One root of the equation x2 + bx + c = 0 is three variables: P is pressure, V is volume, and
x = 12 [−b + (b2 − 4c) 2 ]. Suppose that b = 20.4 and
1
T is absolute temperature. Show that
c = 95.5. Estimate the percentage error in the root ⎛ ∂V ⎞ ⎛ ∂T ⎞ ⎛ ∂T ⎞
which would arise if these were rounded to ⎜ ⎟ = −⎜ ⎟ ⎜ ⎟ .
⎝ ∂P ⎠ T ⎝ ∂P ⎠ V ⎝ ∂V ⎠ P
b = 20, c = 96.
(The notation (∂u/∂v)w means that the variable
29.7 The area S of a triangle with base b and base w is kept constant during differentiation when
angles A and C is given by u = g(v, w). Use (29.6).)
29.12 Find the cartesian equation of the tangent Then, by differentiating this equation and treating
line at a point (x1, y1) on each of the following y as a function of x, we obtain
curves. (Find dy /dx first.) dy dy
(a) x2 + y2 = a2; (b) x2/a2 + y2/b2 = 1; 2x + 2x + 2y + 2 y = 0,
dx dx
(c) a2x2 − b2y2 = c; (d) xy = 1; (e) x 3 + y 3 = 1;
2 2
29.14 Let (x, y) be any point on the curve 29.21 Obtain grad f, where f(x, y) is given by the
y3 − x3 = 1. Find an expression for dy /dx at the following. Give its components, its direction, and
point. Since this expression holds good for every its magnitude at the points specified.
point on the curve, it is a differential equation, (a) 1/(x + y) at (1, −2); (b) y/x at (2, 0);
having the given curve as one of its solution (c) y2 − 3x2 + 1 at (0, 0); (d) 1/x − 1/y at (2, 1);
curves. Verify this by solving it, and obtain the (e) 1/r, where r is the polar coordinate,
r = (x2 + y2) 2 ; confirm that the gradient
1
other solutions.
vector points in a radial direction.
29.15 (Numerical). Form the differential equation
for the following families of curves, in which c is 29.22 Use the gradient vector to obtain a unit
the parameter; then use the numerical solution vector perpendicular to the following curves
method of Section 22.2 to obtain a contour map at the points given
of the functions concerned. (a) 2x − 3y + 1 = 0 at any point;
(a) x2 + 2y2 = c, c > 0; (b) x2 + xy − y3 = c; (b) x2 + y2 = 5 at (2, 1);
x2 + y (c) x2 + y2 = r 2 at (x0, y0) on the circle;
(c) = c; (d) xy e−x = c. (d) x2/a2 + y2/b2 = 1 at (x0, y0) on the ellipse;
x + y2
(e) y = 3x2 − 2 at (2, 10).
29.16 Form the differential equation for each
system of curves, and deduce the differential 29.23 Use the property (29.9) to find the angle of
equation for the orthogonal (perpendicular) intersection of the following curves at the point
system. Solve it to obtain the orthogonal of intersection given.
system. (a) y2 − x2 = −3 and x3 − y3 = 7 at (2, 1);
(a) y2 − x2 = c; (b) y3 + x3 = c; (b) x2y − xy2 = 0 and x/y − y/x = 0 at (2, 2);
(c) y2 = cx; (d) ey − ex = c. (c) x2 + y2 + 2x − 4y + 4 = 0 and y = x2 + 2x + 2 at
(−1, 1); explain the result geometrically.
29.17 Find the curves of steepest ascent from an
arbitrary point (a, b) for each of the following 29.24 Use (29.12) to prove the results given in
functions. Section 29.3 for a general f(x, y): that (a) the
(a) 12 x 2 + y 2; (b) x3y3; (c) 12 y 2 − y − x 2. directions of most rapid increase and decrease
through a point (x, y) are perpendicular to the
29.18 Implicit differentiation of y with respect direction of the contour through the point; (b) the
to x can be carried out as follows when f(x, y) is maximum rate of increase from the point is equal
given explicitly. Consider f(x, y) = x2 + 2xy + y2 = c. to | grad f | at the point.
30 Chain rules, restricted maxima, coordinate systems
A chain rule is a rule for manipulating ‘functions of functions’. Chain rules are
analogous to the chain rule for a single variable of Sections 3.3 and 3.6, but there
are many forms available for use with two or more variables.
The first application of a chain rule (eqn (30.1)) is to the question of locating a
maximum/minimum. Other applications are described in subsequent sections.
Example 30.1 Show that both of the following parametrizations define a unit
semicircle, centred at the origin, in the upper half plane, traced anticlockwise:
(a) x = cos t, y = sin t, where t increases from 0 to π; (b) x = −u, y = (1 − u^2)^{1/2},
so the points lie on the unit circle. Also, as t increases from 0 to π, y is positive and x
decreases from 1 to −1. The path is the upper semicircle from (1, 0) to (−1, 0), described
in a single direction, as shown in Fig. 30.1a.
[Fig. 30.1: the semicircular paths from x = 1 to x = −1 for the two parametrizations.]
Given a function f(x, y) which can take values all over the (x, y) plane, the
function
g(t) = f(x(t), y(t))
picks out only the values on the path (x(t), y(t)). As we move along this path, the
value of g(t) varies, and we might be concerned with the rate at which it changes
with t. (This is generally different from the rate at which f(x, y) changes with
distance along the path, which is equal to the directional derivative (29.4), and
corresponds to using arc-length s as the parameter.)
To find df/dt, suppose that t increases from t to t + δt. Then, on the curve (x(t),
y(t)), x changes from x to x + δx and y to y + δy. Divide (29.2) (the incremental
approximation) by δt:
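\[ \frac{\delta f}{\delta t} \approx \frac{\partial f}{\partial x}\frac{\delta x}{\delta t} + \frac{\partial f}{\partial y}\frac{\delta y}{\delta t}. \]
Let δt → 0. Then ‘≈’ becomes ‘=’, δx/δt → dx/dt, and δy/δt → dy/dt, and we have the
chain rule (or total derivative):
\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}. \qquad (30.1) \]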
Example 30.2 Let f(x, y) = xy − y^2, x = t^2, y = t^3. (a) Find df/dt using the chain
rule; (b) find df/dt by substitution.
(a)
\[ \frac{\partial f}{\partial x} = y, \quad \frac{\partial f}{\partial y} = x - 2y, \quad \frac{dx}{dt} = 2t, \quad \frac{dy}{dt} = 3t^2. \]
Therefore, by (30.1),
\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = y(2t) + (x - 2y)3t^2 = 2t^4 + (t^2 - 2t^3)3t^2 = 5t^4 - 6t^5. \]
(This expression can be written in various ways in terms of x and y, for example as
5x^2 − 6xy, or 5y x^{1/2} − 6x^2 y^{1/3}. These all look very different, but they all take the same
values since x and y are connected by the fact that (x, y) lies on the given curve.)
(b) By substitution,
f(x(t), y(t)) = xy − y^2 = t^2 t^3 − (t^3)^2 = t^5 − t^6.
Therefore, as before,
\[ \frac{df}{dt} = 5t^4 - 6t^5. \]
The chain rule is more useful for obtaining general results, as in Example 30.3,
than in working out special instances such as Example 30.2.
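For readers who like to confirm such identities numerically, the sketch below evaluates df/dt for Example 30.2 in three ways: by the chain rule, from the closed form 5t^4 − 6t^5, and by a finite difference of f(x(t), y(t)). The function names and sample values of t are our own illustrative choices.

```python
def f(x, y):
    return x * y - y ** 2

def path(t):
    return t ** 2, t ** 3           # x(t), y(t)

def df_dt_chain(t):
    """Chain rule: (df/dx)(dx/dt) + (df/dy)(dy/dt)."""
    x, y = path(t)
    return y * (2 * t) + (x - 2 * y) * (3 * t ** 2)

def df_dt_closed(t):
    return 5 * t ** 4 - 6 * t ** 5

def df_dt_numeric(t, h=1e-6):
    """Central difference of g(t) = f(x(t), y(t))."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)

for t in (0.5, 1.0, 2.0):           # hypothetical sample values
    print(t, df_dt_chain(t), df_dt_closed(t), df_dt_numeric(t))
```

The three columns of derivatives should agree for each t, up to the small error of the finite difference.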
Self-test 30.1
Let z = f(x, y) = x(3y2 − x2). Show how z varies on the surface z = f(x, y) when
x = cos t, y = sin t, (0 t 2π). Find dz/dt on x = cos t, y = sin t as a function
of t using the chain rule. Where do the stationary values of z occur on this
curve on the surface?
30.2
the Lagrange multiplier
Consider the simple function
Fig. 30.2 Map showing the circular path x^2 + y^2 = 1 and the contours of z = x + y projected on to the plane z = 0, with the maximum A and the minimum B on the path marked.
Then A corresponds to the highest point; this is where we were walking uphill
but then turn downhill: it is a local maximum point on the path. If we plotted a
graph of elevation against time, this point would show up as a local maximum on
the graph.
The clue which reveals A to be a maximum is that one of the contours of x + y
is a tangent to the path at A. Those nearby contours that the path crosses are
all lower than the one through A. Similarly, at B, there is a local minimum for
the path.
This is an example of a restricted stationary-point problem, the ‘restriction’
being the condition that the only points considered are those that lie on a particu-
lar curve. A general statement of the problem is as follows.
Example 30.4 Find the maximum possible area a rectangle may have if the
perimeter is restricted to length 10 units.
Call the sides x and y. Then we require the maximum of the area A:
A = f(x, y) = xy (i)
However, although the following problem looks very similar, there turns out to
be a difficulty.
On the circle,
y^2 = 1 − x^2, −1 ⩽ x ⩽ 1. (i)
Substituting for y^2 in z = x^2 − y^2 gives
z = x^2 − (1 − x^2) = 2x^2 − 1, −1 ⩽ x ⩽ 1.
The only stationary points of this function are where
\[ \frac{d}{dx}(2x^2 - 1) = 0, \]
which is at x = 0. At x = 0, the curve equation (i) gives y = ±1, so we have found the
points A : (0, 1) and A′ : (0, −1). These are in fact minima, and they are shown on the
path in Fig. 30.3a.
However, there are plainly two maxima also, at B and B′, which are completely
missed by the process above. We could have found them (but lost A and A′) if we
had substituted for x instead of y by means of x^2 = 1 − y^2. You can see the reason
for losing B and B′ if you sketch the function 2x^2 − 1 between x = ±1. The maximum
values are at the ends, but cannot be found by differentiating; see also Example 4.8.
The restricted maximum and minimum values occur on the curve which is the
intersection of the circular cylinder x^2 + y^2 = 1 with the saddle z = x^2 − y^2 shown in
Fig. 30.3b.
Fig. 30.3 (a) Contour map of f(x, y) = x^2 − y^2, showing also the curve g(x, y) = x^2 + y^2 = 1. Here A and A′ are
minima, and B and B′ are maxima. (b) The circle x^2 + y^2 = 1 in the x,y plane, with the corresponding values
of z = x^2 − y^2 shown.
We can get over this difficulty by parametrizing the curve g(x, y) = c as in the
following example, which repeats Example 30.5.
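The parametric idea can be sketched numerically as follows. Taking the parametrization x = cos t, y = sin t of the circle (one possible choice, assumed here), the chain rule gives dz/dt along the path for z = x^2 − y^2, and every zero of dz/dt over one circuit is a restricted stationary point. The bracketing grid and bisection used to find the zeros are our own simple devices and are not the method of the text.

```python
import math

def z_on_circle(t):
    x, y = math.cos(t), math.sin(t)
    return x * x - y * y

def dz_dt(t):
    # Chain rule: (dz/dx)(dx/dt) + (dz/dy)(dy/dt) with x = cos t, y = sin t
    x, y = math.cos(t), math.sin(t)
    return 2 * x * (-y) + (-2 * y) * x

def stationary_points(n=400, start=-0.1):
    """Bracket the sign changes of dz/dt over one circuit, then bisect each one."""
    step = 2 * math.pi / n
    found = []
    for k in range(n):
        lo, hi = start + k * step, start + (k + 1) * step
        if dz_dt(lo) * dz_dt(hi) < 0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if dz_dt(lo) * dz_dt(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            t = 0.5 * (lo + hi)
            kind = "max" if z_on_circle(t) > z_on_circle(t + 0.01) else "min"
            # Adding 0.0 turns a possible -0.0 into 0.0 for tidier printing
            found.append((round(math.cos(t), 6) + 0.0, round(math.sin(t), 6) + 0.0, kind))
    return found

for point in stationary_points():
    print(point)
```

All four points appear, the two maxima on the x axis and the two minima on the y axis, so nothing is lost in the way it was when y was eliminated.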
We shall now describe the Lagrange-multiplier method for solving the restricted
stationary-value problem (30.5). This uses the parametric idea, but all reference to
a parameter is eliminated eventually so that we do not have to invent a parametriza-
tion and then go through the resulting algebra.
Think of time t as a possible parameter, and P : (x(t), y(t)) as a point moving
along the curve with velocity (dx/dt, dy /dt). We shall imagine g(x, y) = c is
expressed parametrically so that (a) the path is traced exactly once as t moves
through its range, and (b) dx/dt and dy/dt are never both zero together (if t is
Looking back, we have three unknowns: x and y (the coordinates of any sta-
tionary point Q) and λ, another constant. To determine these, there are three
equations: (30.3a, b) and (30.3c). Finally, we summarize the method.
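$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \qquad \text{(ii)}$$
$$\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0. \qquad \text{(iii)}$$
(The value of λ can usually be discarded.) (30.4)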
Notice that all reference to the parameter t has disappeared. There are many ways
of proving (30.4), but this is probably the simplest for two dimensions. The problem
is treated for three dimensions in Section 31.8.
The equations obtained are often awkward to solve. It is best to be very system-
atic, not wandering aimlessly between the equations. Be careful not to overlook
possibilities (such as that (ii) in Example 30.7 is solved by λ = 1); and check at
the end that the solutions actually fit. The values found for λ do have a special
significance in certain subjects but otherwise can be thrown away.
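As a small illustration of the three Lagrange conditions, the following Python sketch evaluates their left-hand sides for the problem of finding the stationary values of $f = x^2 - y^2$ on the circle $x^2 + y^2 = 1$. The choice of problem, the function name `lagrange_residuals`, and the hand-derived candidate list are assumptions made for this sketch; they are not part of the method itself.

```python
# Residuals of the three Lagrange conditions for the illustrative problem
# f = x^2 - y^2 restricted to g = x^2 + y^2 = 1 (an assumed example).

def lagrange_residuals(x, y, lam):
    """Left-hand sides of g - c = 0, f_x - lam*g_x = 0, f_y - lam*g_y = 0."""
    fx, fy = 2 * x, -2 * y        # partial derivatives of f = x^2 - y^2
    gx, gy = 2 * x, 2 * y         # partial derivatives of g = x^2 + y^2
    return (x * x + y * y - 1, fx - lam * gx, fy - lam * gy)

# The conditions reduce to x(1 - lam) = 0 and y(1 + lam) = 0 on the circle,
# which gives these four candidates (worked out by hand).
candidates = [(0, 1, -1), (0, -1, -1), (1, 0, 1), (-1, 0, 1)]

for x, y, lam in candidates:
    print((x, y), "lambda =", lam, "f =", x * x - y * y,
          "residuals =", lagrange_residuals(x, y, lam))
```

All four candidates satisfy the conditions: the two points on the y axis give f = −1 and the two on the x axis give f = 1, so no stationary point is lost, unlike in the substitution approach.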
Example 30.8 Find the rectangle of maximum area which can be placed
symmetrically in the ellipse x2 + 4y2 = 1 as shown in Fig. 30.4.
Fig. 30.4 The ellipse with the rectangle placed symmetrically in it; the vertex A is at (x, y).
Suppose that one of the vertices, say A, is at (x, y). We shall require that x and y be
positive, since this is sufficient, given the symmetry. The area is equal to $4xy = f(x, y)$,
while x and y are subject to $g(x, y) = x^2 + 4y^2 = 1$.
The three equations, taken in the order of (30.4), become
$$x^2 + 4y^2 = 1, \qquad \text{(i)}$$
$$2y - \lambda x = 0, \qquad \text{(ii)}$$
$$x - 2\lambda y = 0. \qquad \text{(iii)}$$
Suppose that neither x nor y is zero (that could not give a maximum). Then, from
(ii) and (iii), $\lambda = 2y/x = x/(2y)$, so that $x^2 = 4y^2$ and, since x and y are positive, $x = 2y$.
Substituting $x^2 = 4y^2$ into (i) gives $8y^2 = 1$, so
$$y = 1/(2\sqrt{2}),$$
and $x = 2y$ gives correspondingly
$$x = 1/\sqrt{2}.$$
The sides have lengths $2y = 1/\sqrt{2}$ and $2x = \sqrt{2}$, so the area is 1.
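The result can be checked numerically. The sketch below assumes the parametrization $x = \cos t$, $y = \tfrac12 \sin t$ of the ellipse $x^2 + 4y^2 = 1$ (a choice made here for illustration) and scans the first quadrant for the largest value of the area 4xy.

```python
import math

def area(t):
    """Area 4xy of the rectangle with vertex (cos t, sin(t)/2) on x^2 + 4y^2 = 1."""
    x, y = math.cos(t), 0.5 * math.sin(t)
    return 4 * x * y

# Scan 0 < t < pi/2 on a fine grid and keep the parameter giving the largest area.
n = 100000
best_t = max((k * (math.pi / 2) / n for k in range(1, n)), key=area)
best_x, best_y = math.cos(best_t), 0.5 * math.sin(best_t)
print(best_x, best_y, area(best_t))
```

The scan settles near x = 1/√2, y = 1/(2√2), with area close to 1, in agreement with the Lagrange-multiplier calculation.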
Self-test 30.2
Using the Lagrange multiplier method, find the stationary values of f(x, y)
= x2 − 3xy + y2 on the circle x2 + y2 = 2.
This situation arises when we change coordinates from (x, y) to another system.
For example, the equations
x = u cos v, y = u sin v,
represent polar coordinates, with u as the radial and v as the angular coordinate.
Now hold v constant; put
v = β,
say, and let u vary. Then
x = u cos β, y = u sin β.
Here u is the only active parameter; as it varies, (x, y) traces a radial straight line.
Suppose instead that u is held constant, say
u = α;
then, as v varies, (x, y) follows the circle of radius | α |
x = α cos v, y = α sin v.
The point where the two curves intersect can be described either by
u = α, v=β
in the new (polar) coordinates, or in the original coordinates by
x = α cos β, y = α sin β.
In general, if we have
Fig. 30.5
defines polar coordinates, u (radial) and v (angular). To sketch the curves corres-
ponding to constant u or v, we put
α = u(x, y) or β = v(x, y);
for the first quadrant. Notice that v = 0 on both x = 0 and y = 0: the connection between
(x, y) and (u, v) is not one-to-one over the whole (x, y) plane.
Fig. 30.6 Curves of constant u and constant v in the (x, y) plane.
Self-test 30.3
Sketch the curvilinear coordinates defined by the elliptic system
x = cosh u cos v, y = sinh u sin v.
30.4 ORTHOGONAL COORDINATES
Suppose that we have a (u, v) system of coordinates defined either by x = x(u, v),
y = y(u, v), or by u = u(x, y), v = v(x, y), and the curves u = α and v = β always
intersect at right angles for any constants α and β. Then the (u, v) system is said
to be an orthogonal system of coordinates. For example, polar coordinates are
orthogonal. Coordinate systems which are not orthogonal are seldom used because
of the complexity of the formulae connected with them. A test for orthogonality
is the following.
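Example 30.11 Confirm that the following coordinate systems (u, v) are
orthogonal. (a) $u = y^2 - 2x^2$, $v = x^{1/2} y$ $(x > 0)$; (b) $x = 2uv$, $y = u^2 - v^2$.
(a) Use (30.5a). We have
$$\frac{\partial u}{\partial x} = -4x, \quad \frac{\partial v}{\partial x} = \tfrac12 x^{-1/2} y, \quad \frac{\partial u}{\partial y} = 2y, \quad \frac{\partial v}{\partial y} = x^{1/2},$$
so
$$\frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = -4x\left(\tfrac12 x^{-1/2} y\right) + 2y\left(x^{1/2}\right) = 0.$$
(b) Use (30.5b); notice how this condition is differently structured from (30.5a). We
have
$$\frac{\partial x}{\partial u} = 2v, \quad \frac{\partial x}{\partial v} = 2u, \quad \frac{\partial y}{\partial u} = 2u, \quad \frac{\partial y}{\partial v} = -2v;$$
so
$$\frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v} = 2v(2u) + 2u(-2v) = 0.$$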
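Both tests can be tried numerically at a sample point. The following Python sketch estimates the partial derivatives by central differences; the helper names `partial`, `check_a` and `check_b`, the step size and the sample points are all assumptions made for illustration.

```python
import math

def partial(fn, point, i, h=1e-6):
    """Central-difference estimate of d(fn)/d(argument i) at the given point."""
    lo, hi = list(point), list(point)
    lo[i] -= h
    hi[i] += h
    return (fn(*hi) - fn(*lo)) / (2 * h)

def check_a(u, v, point):
    """u_x v_x + u_y v_y for a system given as u(x, y), v(x, y)."""
    return (partial(u, point, 0) * partial(v, point, 0)
            + partial(u, point, 1) * partial(v, point, 1))

def check_b(x, y, point):
    """x_u x_v + y_u y_v for a system given as x(u, v), y(u, v)."""
    return (partial(x, point, 0) * partial(x, point, 1)
            + partial(y, point, 0) * partial(y, point, 1))

# (a) u = y^2 - 2x^2, v = x^(1/2) y, sampled at (x, y) = (1.3, 0.7).
result_a = check_a(lambda x, y: y * y - 2 * x * x,
                   lambda x, y: math.sqrt(x) * y, (1.3, 0.7))
# (b) x = 2uv, y = u^2 - v^2, sampled at (u, v) = (0.8, -1.1).
result_b = check_b(lambda u, v: 2 * u * v,
                   lambda u, v: u * u - v * v, (0.8, -1.1))
print(result_a, result_b)   # both should be close to zero
```

A value near zero at one point does not prove orthogonality, of course; it is only a quick check on the algebra.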
Self-test 30.4
Confirm that the elliptic system x = cosh u cos v, y = sinh u sin v is orthogo-
nal (see Self-test 30.3 and Example 30.11).
and a function f(x, y): an arbitrary function of position. The function f(x, y) can
be expressed in terms of the new coordinates; for example, if
$$x = u^2 - v^2, \quad y = 2uv, \quad \text{and} \quad f(x, y) = x^2 + y^2,$$
then
$$f(x, y) = (u^2 - v^2)^2 + (2uv)^2 = (u^2 + v^2)^2$$
when evaluated at the same point.
If we put
z = f(x, y),
then the derivatives ∂z /∂u and ∂z /∂ v indicate how z, or f(x, y), changes as we
follow the curves of constant v and constant u respectively. Consider the derivative
$\partial z/\partial u$,
in which v is held constant, at v = β say. Since only u varies, we are able to adopt
the single-variable chain rule (30.1), with u instead of t. However, we must write
∂x/∂u and ∂y/∂u instead of dx/du and dy/du in order to indicate that another
variable v is present, although it is regarded as constant for the differentiation.
We obtain the following.
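If x = x(u, v), y = y(u, v), z = f(x, y), then
$$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}, \qquad \frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}. \qquad (30.6)$$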
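Example 30.12 Use the chain rule (30.6) to obtain ∂z/∂v where $x = u^2 - v^2$,
$y = 2uv$, and $z = xy$; check the result by substitution.
For the chain rule, we require
$$\frac{\partial z}{\partial x} = y, \quad \frac{\partial z}{\partial y} = x, \quad \frac{\partial x}{\partial v} = -2v, \quad \frac{\partial y}{\partial v} = 2u.$$
Then
$$\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v} = -2yv + 2xu = 2u^3 - 6uv^2.$$
To check the result, write z in terms of u and v:
$$z = xy = (u^2 - v^2)\,2uv = 2u^3 v - 2uv^3.$$
Therefore $\partial z/\partial v = 2u^3 - 6uv^2$, as before.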
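The same comparison can be made numerically. In the Python sketch below, the function names and the sample point (u, v) = (1.5, −0.4) are assumptions chosen for illustration; the three values printed should agree.

```python
def dz_dv_chain(u, v):
    """dz/dv assembled from the chain rule with z = xy, x = u^2 - v^2, y = 2uv."""
    x, y = u * u - v * v, 2 * u * v
    dz_dx, dz_dy = y, x
    dx_dv, dy_dv = -2 * v, 2 * u
    return dz_dx * dx_dv + dz_dy * dy_dv

def dz_dv_direct(u, v):
    """dz/dv from differentiating z = 2u^3 v - 2u v^3 directly."""
    return 2 * u**3 - 6 * u * v**2

def dz_dv_numeric(u, v, h=1e-6):
    """Central-difference estimate of dz/dv with u held constant."""
    z = lambda u_, v_: (u_ * u_ - v_ * v_) * (2 * u_ * v_)
    return (z(u, v + h) - z(u, v - h)) / (2 * h)

u0, v0 = 1.5, -0.4
print(dz_dv_chain(u0, v0), dz_dv_direct(u0, v0), dz_dv_numeric(u0, v0))
```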
There is clearly no advantage in using the chain rule for a simple explicit case
such as this. The use of such rules is to obtain general results as in the following
examples.
Example 30.13 Find expressions for ∂z /∂r and ∂z /∂θ when x = r cos θ,
y = r sin θ, and z is a function of position.
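To use (30.6), put (r, θ) in place of (u, v):
$$\frac{\partial z}{\partial r} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial r} = \frac{\partial z}{\partial x}\cos\theta + \frac{\partial z}{\partial y}\sin\theta,$$
$$\frac{\partial z}{\partial \theta} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial \theta} = -\frac{\partial z}{\partial x}\, r\sin\theta + \frac{\partial z}{\partial y}\, r\cos\theta.$$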
Example 30.14 Find expressions for ∂z /∂x and ∂z /∂y in terms of ∂z /∂r and
∂z/∂θ, where x = r cos θ, y = r sin θ.
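The appropriate form for chain rule (30.6) will be
$$\frac{\partial z}{\partial x} = \frac{\partial z}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial z}{\partial \theta}\frac{\partial \theta}{\partial x}, \qquad \frac{\partial z}{\partial y} = \frac{\partial z}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial z}{\partial \theta}\frac{\partial \theta}{\partial y}.$$
To find ∂r/∂x etc., use the alternative form for polar coordinates:
$$r = (x^2 + y^2)^{1/2}, \qquad \theta = \arctan(y/x);$$
then
$$\frac{\partial r}{\partial x} = \frac{x}{(x^2 + y^2)^{1/2}} = \frac{r\cos\theta}{r} = \cos\theta;$$
$$\frac{\partial \theta}{\partial x} = \frac{1}{1 + (y/x)^2}\left(-\frac{y}{x^2}\right) = -\frac{y}{x^2 + y^2} = -\frac{r\sin\theta}{r^2} = -\frac{\sin\theta}{r}.$$
Therefore
$$\frac{\partial z}{\partial x} = \cos\theta\,\frac{\partial z}{\partial r} - \frac{\sin\theta}{r}\frac{\partial z}{\partial \theta}.$$
Similarly ∂r/∂y and ∂θ/∂y can be calculated to give
$$\frac{\partial z}{\partial y} = \sin\theta\,\frac{\partial z}{\partial r} + \frac{\cos\theta}{r}\frac{\partial z}{\partial \theta}.$$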
(These can also be obtained by treating the pair of expressions for ∂z/∂r and ∂z/∂θ
obtained in Example 30.13 as if they were a pair of simultaneous equations for ∂z/∂x
and ∂z/∂y, and solving them.)
P = P(U, V).
The partial derivative notation ∂U/∂M and ∂V/∂M indicates that
U = U(M, … ) and V = V(M, … ),
at least one more variable being present: the expression does not tell us its name. The
chain rule automatically simplifies the expression to
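$$\frac{\partial P}{\partial U}\frac{\partial U}{\partial M} + \frac{\partial P}{\partial V}\frac{\partial V}{\partial M} = \frac{\partial P}{\partial M}.$$
To form, for example, ∂P/∂X, write
$$\frac{\partial P}{\partial X} = \frac{\partial P}{\Box}\,\frac{\Box}{\partial X} + \frac{\partial P}{\Box}\,\frac{\Box}{\partial X},$$
then fill in the spaces in the first term with ∂U and the second with ∂V.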
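Example 30.16 Prove that if (x, y) and (u, v) are coordinates related by
x = x(u, v) and y = y(u, v), (i)
or alternatively by
u = u(x, y) and v = v(x, y), (ii)
then
$$\begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix} \begin{bmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2.$$
In the first matrix, the relations (i) are implied, and in the second the relations (ii).
By multiplying the matrices we obtain
$$\begin{bmatrix} \dfrac{\partial x}{\partial u}\dfrac{\partial u}{\partial x} + \dfrac{\partial x}{\partial v}\dfrac{\partial v}{\partial x} & \dfrac{\partial x}{\partial u}\dfrac{\partial u}{\partial y} + \dfrac{\partial x}{\partial v}\dfrac{\partial v}{\partial y} \\[2mm] \dfrac{\partial y}{\partial u}\dfrac{\partial u}{\partial x} + \dfrac{\partial y}{\partial v}\dfrac{\partial v}{\partial x} & \dfrac{\partial y}{\partial u}\dfrac{\partial u}{\partial y} + \dfrac{\partial y}{\partial v}\dfrac{\partial v}{\partial y} \end{bmatrix}. \qquad \text{(iii)}$$
Each of these elements has the right shape for the representation of a derivative by the
chain rule (30.6), though the variable combinations occupying the various positions may
seem unusual. The matrix becomes
$$\begin{bmatrix} \dfrac{\partial x}{\partial x} & \dfrac{\partial x}{\partial y} \\[2mm] \dfrac{\partial y}{\partial x} & \dfrac{\partial y}{\partial y} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$
since x = x(u, v) = x(u(x, y), v(x, y)) = x and y = y(u, v) = y(u(x, y), v(x, y)) = y
identically.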
Self-test 30.5
If x = r cos θ, y = r sin θ, verify identity (iii) in Example 30.16.
described in Section 22.4 for functions of a single variable (the theory, however, is
somewhat difficult). Here we shall adopt ‘=’ for brevity, but retain δx etc.
Example 30.17 Find a vector normal to the curve f(x, y) = c at a point (x, y) on
the curve. (Compare Section 29.5.)
Let P be (x, y) and Q a nearby point (x + δx, y + δy) also on the curve (see Fig. 30.7). Put
z = f(x, y).
Fig. 30.7 Neighbouring points P and Q on the curve f(x, y) = c, with the increments δx and δy.
becomes smaller, of course), so ( ∂z/ ∂x, ∂z/ ∂y) is a vector in the direction of the normal,
as we found in Section 29.6.
Fig. 30.8 The curves u = α and v = β intersecting at P.
We have to show that any two curves given respectively by u = α and v = β intersect
in a right angle, as in Fig. 30.8. If u and v are allowed to vary arbitrarily, the
incremental formula gives
δx = 2v δu + 2u δv, δy = −2u δu + 2v δv. (i)
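But u does not vary on the curve u = α, so δu = 0 and (i) becomes δx = 2u δv, δy = 2v δv.
The vector $\overrightarrow{PQ}$ points nearly in the direction of the tangent at P:
$$\overrightarrow{PQ} = (\delta x, \delta y) = (2u\,\delta v, 2v\,\delta v). \qquad \text{(ii)}$$
Similarly, on the curve v = β, we have δv = 0; so
$$\delta x = 2v\,\delta u, \quad \delta y = -2u\,\delta u. \qquad \text{(iii)}$$
$\overrightarrow{PR}$ points in the direction of the tangent to v = β, and
$$\overrightarrow{PR} = (\delta x, \delta y) = (2v\,\delta u, -2u\,\delta u).$$
From (ii) and (iii), we have
$$\overrightarrow{PQ} \cdot \overrightarrow{PR} = (2u\,\delta v, 2v\,\delta v)\cdot(2v\,\delta u, -2u\,\delta u) = 4uv\,\delta u\,\delta v - 4uv\,\delta u\,\delta v = 0,$$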
so the curves intersect in a right angle.
Problems
30.1 Find a parametrization (x(t), y(t)) suitable for the following curves, specifying the range of
t required to traverse the curve exactly once, in the anticlockwise direction if the curve is closed.
(a) $x^2 + y^2 = 25$;
(b) $\tfrac14 x^2 + \tfrac19 y^2 = 1$;
(c) $xy = 4$;
(d) $x^2 - y^2 = 1$ (try using the identity $1 + \tan^2 A = 1/\cos^2 A$);
(e) $\tfrac14 x^2 - \tfrac19 y^2 = 1$;
(f) $y^2 = 4ax$;
(g) $(x - 1)^2 + (y - 2)^2 = 9$;
(h) $2x - 5y + 2 = 0$.
30.2 For each of the following cases, obtain df/dt in terms of t by means of the chain rule (30.1).
(a) $f(x, y) = x^2 + y^2$, $x(t) = t$, $y(t) = 1/t$;
(b) $f(x, y) = x^2 - y^2$, $x(t) = \cos t$, $y(t) = \sin t$;
(c) $f(x, y) = xy$, $x(t) = 2\cos t$, $y(t) = \sin t$;
(d) $f(x, y) = x \sin y$, $x(t) = 2t$, $y(t) = t^2$;
(e) $f(x, y) = 4x^2 + 9y^2$, $x(t) = \tfrac12 \cos t$, $y(t) = \tfrac13 \sin t$.
30.3 Two athletes run around concentric circular tracks of radius r and R with speeds v and V
respectively. They start on the same radial line. By using time as a parameter, find the rate of change
with time of the distance between them and interpret any stationary points.
30.4 Use the Lagrange-multiplier method to solve the following problems.
(a) Find the maximum area of a rectangle having perimeter of length 10.
(b) Find the rectangle with area 9 which has the shortest perimeter.
(c) Find the stationary points of $x^2 + 2y^2$ subject to $x^2 + y^2 = 1$.
(d) Find the largest rectangle in the first quadrant of the (x, y) plane which has two of its sides
along x = 0 and y = 0 respectively, and a vertex on the line 2x + y = 1.
(e) Find the minimum distance of the straight line x + 2y = 1 from the point (1, 1). (It is easier to
consider the square of the distance.)
(f) Find the shortest distance from the origin to the curve $x^2 + 8xy + 7y^2 = 225$.
(g) With reference to Fig. 30.4, find the rectangle in the ellipse which has the maximum perimeter.
(h) Find the stationary points of $(x - y + 1)^2$ on $y = x^2$.
(i) Show that in general there are three normals to a parabola from any given point inside it.
30.5 Find the stationary points of f(x, y) on g(x, y) = c (i) by parametrizing the given path as in
Example 30.6, (ii) by using the Lagrange-multiplier technique, in each of the following cases.
(a) $f(x, y) = x^2 + y^2$ on $g(x, y) = xy = 1$;
(b) $f(x, y) = x^2 + y^2$ on $(x - 1)^2 + y^2 = 1$;
(c) $f(x, y) = x^2 + 4y^2$ on $x^2 + y^2 = 1$;
(d) $f(x, y) = 3x - 2y$ on $x^2 - y^2 = 4$;
(e) $f(x, y) = xy$ on $g(x, y) = x^2 + y^2 = 1$ (compare this with (a)).
30.6 Show by means of sketches that, for the restricted stationary-value problem, a stationary
point can be expected at any point where the curve g(x, y) = c is tangential to a contour of f(x, y).
Use this observation to derive the Lagrange-multiplier principle. (Hint: consider the normals at
the point of tangency; or use implicit differentiation to get expressions for the directions of the curves
there.)
There are cases when a stationary point can occur although the curves are not tangential there.
Try to identify these cases by sketching various possibilities. (Hint: they correspond to λ = 0.)
30.7 A change of coordinates from (x, y) to (u, v) is specified by each of the following. Show that
the new coordinate system is orthogonal.
(a) $u = 2x + 3y$, $v = -3x + 2y$;
(b) $u = xy$, $v = x^2 - y^2$;
(c) $u = x^2 + 2y^2$, $v = y/x^2$;
(d) $u = xy^2$, $v = y^2 - 2x^2$;
(e) $u = x + 1/x + y^2/x$, $v = y - 1/y + x^2/y$;
(f) $x = 2u - v$, $y = u + 2v$;
(g) $x = u^2 - v^2$, $y = 2uv$;
(h) $x = u/(u^2 + v^2)$, $y = v/(u^2 + v^2)$;
(i) $x = u^2 - v^2$, $y = -2uv$.
30.8 Let r(t) and θ(t) be polar coordinates which are functions of a parameter t.
(a) Express dx/dt and dy/dt in terms of dr/dt, dθ/dt, r, and θ.
(b) Use (a) to obtain expressions for $d^2x/dt^2$ and $d^2y/dt^2$.
(c) Prove that
$$\cos\theta\,\frac{d^2x}{dt^2} + \sin\theta\,\frac{d^2y}{dt^2} = \frac{d^2r}{dt^2} - r\left(\frac{d\theta}{dt}\right)^2,$$
$$\cos\theta\,\frac{d^2y}{dt^2} - \sin\theta\,\frac{d^2x}{dt^2} = \frac{1}{r}\frac{d}{dt}\left(r^2\frac{d\theta}{dt}\right).$$
(These two equations express the radial and tangential components of acceleration, given on
the left, in terms of polar coordinates.)
30.9 Use the chain rule (30.6) to find ∂f/∂u and ∂f/∂v in terms of u and v in each of the
following cases.
(a) $f(x, y) = 2x - y$, $x = uv$, $y = u^2 - v^2$;
(b) $f(x, y) = y/x$, $x = u + v$, $y = u - v$;
(c) $f(x, y) = y^2$, $x = u^2 + v^2$, $y = v/u$;
(d) $f(x, y) = (x - y)/(x + y)$, $x = v$, $y = u - v$.
30.10 By using the chain rule (30.6) twice, obtain $\partial^2 f/\partial u^2$, $\partial^2 f/\partial v^2$, and $\partial^2 f/\partial u\,\partial v$ in each of the
following cases.
(a) $f(x, y) = y/x$, $x = u + v$, $y = u - v$;
(b) $f(x, y) = x^2 + y^2$, $x = uv$, $y = u^2 - v^2$;
(c) $f(x, y) = y^2$, $x = uv$, $y = v$.
30.11 Find expressions for $\partial f/\partial u$, $\partial f/\partial v$, $\partial^2 f/\partial u^2$, $\partial^2 f/\partial v^2$, and $\partial^2 f/\partial u\,\partial v$ if
$f(x, y) = g(x^2 - y^2)$, $x = u + v$, $y = u - v$.
(The expressions will involve the functions $g'(x^2 - y^2)$ etc.)
30.12 Let w = w(u, v), u = u(x, y), v = v(x, y), where u and v are related in such a way that
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$
Prove that
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \qquad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0.$$
Use the chain rule (30.6) to prove that
$$\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} = \left[\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2\right]\left[\frac{\partial^2 w}{\partial u^2} + \frac{\partial^2 w}{\partial v^2}\right].$$
30.13 Let r and θ be the usual polar coordinates, and z = f(x, y); show that:
(a) $$\left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2 = \left(\frac{\partial z}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial z}{\partial \theta}\right)^2;$$
(b) $$\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 z}{\partial r^2} + \frac{1}{r}\frac{\partial z}{\partial r} + \frac{1}{r^2}\frac{\partial^2 z}{\partial \theta^2}.$$
31 Functions of any number of variables
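Higher derivatives are defined as with functions of two variables; for example,
$$\frac{\partial^3 f}{\partial x\,\partial y\,\partial z} = \frac{\partial}{\partial x}\left(\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial z}\right)\right).$$
It follows from the result for second derivatives (Equation (28.3)) that, for smooth
functions,
$$\frac{\partial^3 f}{\partial x\,\partial y\,\partial z} = \frac{\partial^3 f}{\partial y\,\partial x\,\partial z} = \frac{\partial^3 f}{\partial z\,\partial y\,\partial x}$$
and so on: the derivatives may be taken in any order.
The incremental approximation has the same form as (29.1) and (29.2), simply
containing further terms corresponding to the extra variables:
$$\delta f \approx \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \frac{\partial f}{\partial z}\,\delta z + \cdots.$$
If we put w = f(x, y, z, … ), this can be written
$$\delta w \approx \frac{\partial w}{\partial x}\,\delta x + \frac{\partial w}{\partial y}\,\delta y + \frac{\partial w}{\partial z}\,\delta z + \cdots. \qquad (31.1)$$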
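To see how close the incremental approximation is in practice, the following Python sketch compares it with the exact change for the function w = xyz. The function and the particular increments are assumptions chosen only for illustration.

```python
def w(x, y, z):
    """Illustrative function of three variables: w = x y z."""
    return x * y * z

x, y, z = 2.0, 3.0, 4.0
dx, dy, dz = 0.01, -0.02, 0.005

# Incremental approximation: (dw/dx) dx + (dw/dy) dy + (dw/dz) dz,
# with dw/dx = yz, dw/dy = xz, dw/dz = xy.
linear = y * z * dx + x * z * dy + x * y * dz
# Exact change in w for the same increments.
exact = w(x + dx, y + dy, z + dz) - w(x, y, z)
print(linear, exact)
```

The two values differ only by terms of second and higher order in the increments, so the agreement improves as the increments shrink.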
Section 29.2:
Self-test 31.1
The area A of a triangle in which the angle between two sides of lengths a and
b is C, is given by $A = \tfrac12 ab \sin C$. The measured lengths are a = 2, b = 3 and
C = 30°. Possible errors of measurement are ±0.1 for a and b, and ±3° for C.
Find the maximum error in the worst case.
This condition implies that any one of the variables depends on, or is a function
of, all the others. For example, if the variables are x, y, z, and r, and
$$x^2 + y^2 + z^2 - r^2 = 0,$$
then
$$y = \pm(r^2 - x^2 - z^2)^{1/2}.$$
Subject to (31.4) we can therefore talk about partial derivatives such as ∂y/∂x:
we think of y as being a function of the other variables, but with all the variables
except x and y held constant.
Suppose that (x, y, z, … ) and (x + δx, y + δy, z + δz, … ) both satisfy condition
(31.4). Then δf = 0 and the incremental approximation gives
$$\frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \frac{\partial f}{\partial z}\,\delta z + \cdots \approx 0. \qquad (31.5)$$
Suppose next that all the variables except x and y are kept constant, so that δx ≠ 0
and δy ≠ 0, but δz = ··· = 0. Equation (31.5) becomes (∂f/∂x) δx + (∂f/∂y) δy ≈ 0,
so that
$$\frac{\delta y}{\delta x} \approx -\frac{\partial f/\partial x}{\partial f/\partial y}.$$
Now let δx → 0 and the equation becomes (compare (29.6))
$$\frac{\partial y}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial y}:$$
Implicit differentiation
If f(x, y, z, … ) = 0, then
$$\frac{\partial y}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial y}.$$
Any other two variables may be substituted for x and y. (31.6)
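The implicit-differentiation formula can be compared with direct differentiation in a case where y is known explicitly. The Python sketch below uses the relation $x^2 + y^2 + z^2 - r^2 = 0$ on the branch y > 0; the sample values and function names are assumptions made for illustration.

```python
import math

def dy_dx_implicit(x, y):
    """-(df/dx)/(df/dy) for f = x^2 + y^2 + z^2 - r^2, where df/dx = 2x, df/dy = 2y."""
    return -(2 * x) / (2 * y)

def y_explicit(x, z, r):
    """Positive branch y = (r^2 - x^2 - z^2)^(1/2)."""
    return math.sqrt(r * r - x * x - z * z)

x, z, r = 1.0, 2.0, 3.0
y = y_explicit(x, z, r)                 # y = 2 at this point
h = 1e-6
# Central difference in x with z and r held constant.
numeric = (y_explicit(x + h, z, r) - y_explicit(x - h, z, r)) / (2 * h)
print(dy_dx_implicit(x, y), numeric)
```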
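Example 31.2 For a fixed mass of gas, an equation of the form f(P, V, T) = 0
holds (the ‘equation of state’), where P, V, and T represent the pressure, volume,
and temperature respectively. Show that
$$\text{(a)}\ \frac{\partial P}{\partial T}\frac{\partial T}{\partial V} = -\frac{\partial P}{\partial V}, \qquad \text{(b)}\ \frac{\partial P}{\partial T}\frac{\partial T}{\partial V}\frac{\partial V}{\partial P} = -1.$$
The relation f(P, V, T) = 0 implies that any of P, V, or T is a function of the other
two variables: P = P(V, T), V = V(T, P), and T = T(P, V). If we put, say P = P(V, T)
= constant, then implicit differentiation, by (31.6), gives ∂V/∂T or ∂T/∂V in terms of
∂f/∂V and ∂f/∂T (where we are reminded of the ‘constant P’ condition by the partial
derivative signs instead of dV/dT and dT/dV). Similarly we obtain ∂P/∂T, ∂T/∂P,
∂P/∂V, and ∂V/∂P.
(a)
$$\frac{\partial P}{\partial T} = -\frac{\partial f/\partial T}{\partial f/\partial P} \quad \text{and} \quad \frac{\partial T}{\partial V} = -\frac{\partial f/\partial V}{\partial f/\partial T}$$
(from (31.6)). Therefore
$$\frac{\partial P}{\partial T}\frac{\partial T}{\partial V} = \frac{\partial f/\partial V}{\partial f/\partial P} = -\frac{\partial P}{\partial V} \quad \text{(using (31.6) again).}$$
(b) By repeating the process (a) with different variables,
$$\frac{\partial P}{\partial T}\frac{\partial T}{\partial V}\frac{\partial V}{\partial P} = \left(-\frac{\partial f/\partial T}{\partial f/\partial P}\right)\left(-\frac{\partial f/\partial V}{\partial f/\partial T}\right)\left(-\frac{\partial f/\partial P}{\partial f/\partial V}\right) = -1.$$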
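For a concrete check, suppose (purely as an assumption for this sketch) that the equation of state is f(P, V, T) = PV − kT = 0 with k constant. The Python lines below form the partial derivatives from (31.6) and evaluate both results at an arbitrary state.

```python
k = 8.314                    # constant in the assumed law P V = k T
V, T = 0.02, 300.0           # an arbitrary state
P = k * T / V                # pressure fixed by the equation of state

# Partial derivatives of f(P, V, T) = P V - k T.
f_P, f_V, f_T = V, P, -k

dP_dT = -f_T / f_P           # V held constant
dT_dV = -f_V / f_T           # P held constant
dV_dP = -f_P / f_V           # T held constant
dP_dV = -f_V / f_P           # T held constant

print(dP_dT * dT_dV, -dP_dV)      # result (a): the two values agree
print(dP_dT * dT_dV * dV_dP)      # result (b): close to -1
```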
Self-test 31.2
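Let $f(x, y, z) \equiv x^3 + yz + xy^2 + z^2 = 0$. Find the implicit derivatives
$$\left(\frac{\partial x}{\partial z}\right)_y, \quad \left(\frac{\partial z}{\partial y}\right)_x, \quad \left(\frac{\partial x}{\partial y}\right)_z.$$
Verify that
$$\left(\frac{\partial x}{\partial z}\right)_y \left(\frac{\partial z}{\partial y}\right)_x = -\left(\frac{\partial x}{\partial y}\right)_z.$$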
The chain rule for a single parameter t is obtained exactly as in the case of a single
variable: divide (31.1) by δt and take the limit, to give the following formula.
Notice that (x(t), y(t), z(t)) defines a directed path in three dimensions.
In the case of more than one parameter, the results of Section 30.1 may be
extended as follows.
on some scalar function f(x, y, z). The definition is stated for reference as
Example 31.3 Let f(x, y, z) = x2 + y2 + z2. Obtain (a) the vector function grad
f(x, y, z); (b) the value of grad f(x, y, z) at the point (1, 2, 3); (c) an expression
for the magnitude (or length) of grad f(x, y, z).
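(a)
$$\operatorname{grad} f(x, y, z) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = (2x, 2y, 2z);$$
or one can use the ‘operator’ idea and the other way of writing a vector:
$$\operatorname{grad} f = \left(\hat{\imath}\frac{\partial}{\partial x} + \hat{\jmath}\frac{\partial}{\partial y} + \hat{k}\frac{\partial}{\partial z}\right)(x^2 + y^2 + z^2) = \hat{\imath}(2x) + \hat{\jmath}(2y) + \hat{k}(2z).$$
(b) At x = 1, y = 2, z = 3,
$$\operatorname{grad} f = (2, 4, 6).$$
(c) The magnitude or length |v| of a vector v = (a, b, c) is $|v| = (a^2 + b^2 + c^2)^{1/2}$; so
$$|\operatorname{grad} f| = (4x^2 + 4y^2 + 4z^2)^{1/2} = 2(x^2 + y^2 + z^2)^{1/2}.$$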
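When partial derivatives are awkward to obtain by hand, grad f can be estimated numerically. The following Python sketch uses central differences (the helper `grad` and the step size are assumptions made here) and reproduces parts (b) and (c) for $f = x^2 + y^2 + z^2$ at (1, 2, 3).

```python
import math

def f(x, y, z):
    return x * x + y * y + z * z

def grad(fn, p, h=1e-6):
    """Central-difference estimate of the gradient of fn at the point p."""
    out = []
    for i in range(len(p)):
        lo, hi = list(p), list(p)
        lo[i] -= h
        hi[i] += h
        out.append((fn(*hi) - fn(*lo)) / (2 * h))
    return tuple(out)

g = grad(f, (1.0, 2.0, 3.0))
length = math.sqrt(sum(c * c for c in g))
print(g, length)   # close to (2, 4, 6) and 2*sqrt(14)
```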
C(x, y, z, t). A whale travels on the path x = x(t), y = y(t), z = z(t), where t
is time. Show that, along the path of the whale,
$$\frac{dC}{dt} = \frac{\partial C}{\partial t} + v \cdot \operatorname{grad} C,$$
where v is its velocity.
By the chain rule (31.7),
$$\frac{dC}{dt} = \frac{\partial C}{\partial x}\frac{dx}{dt} + \frac{\partial C}{\partial y}\frac{dy}{dt} + \frac{\partial C}{\partial z}\frac{dz}{dt} + \frac{\partial C}{\partial t},$$
after putting dt/dt = 1 into the final term. The whale’s velocity is
$$v = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right),$$
so that
$$\frac{dC}{dt} = \frac{\partial C}{\partial t} + v \cdot \operatorname{grad} C.$$
(If the whale drifted with the motion of the sea, v would represent the velocity of the
current. This case is related to the concept of material derivative in fluid mechanics.
Instead of C there is a quantity such as the density or momentum of a particular piece
of fluid, whose variation we follow as the fluid moves around.)
Self-test 31.3
If f(x, y, z) = x2 + y2 + 2z2, find grad f. At what point on the surface
x2 + y2 + 2z2 = 1 is grad f in the direction of the vector (1, 1, 1)?
Fig. 31.1 The surface g(x, y, z) = C, with the normal vector grad g(x, y, z).
(Compare (29.9), for the normal to a curve in two dimensions.) The proof is as
follows. In Fig. 31.1, P : (x, y, z) is the given point on the surface and Q : (x + δx,
y + δy, z + δz) is any nearby point on the surface. Then
g(x + δx, y + δy, z + δz) − g(x, y, z) = 0, or δg = 0.
Therefore, by the incremental formula (31.1),
$$0 = \frac{\partial g}{\partial x}\,\delta x + \frac{\partial g}{\partial y}\,\delta y + \frac{\partial g}{\partial z}\,\delta z = (\operatorname{grad} g) \cdot (\delta x, \delta y, \delta z).$$
This shows that grad g is perpendicular to the vector (δx, δy, δz). But $\overrightarrow{PQ}$ can
be chosen to point in any direction from P in the surface, so the only possibility is
that grad g is perpendicular to the surface itself at P.
We already know (from (28.7)) that a vector normal to a surface described in the
form z = f(x, y) is
$$\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, -1\right).$$
This is reconciled with (31.11) if we write its equation in the form
$$g(x, y, z) = f(x, y) - z = 0.$$
Self-test 31.4
Find the normal to the surface x2 + 2xy + xz3 at the point (1, 1, 1). Hence find
the tangent plane to the surface at (1, 1, 1).
change of f(x, y, z) in any direction. In Fig. 31.2, let P : (a, b, c) be any point.
Suppose that we require the rate of change with distance of f(x, y, z) in the
direction PR.
Fig. 31.2 The point P and the direction PR, with the increments δx, δy, and δz.
In the direction having direction cosines (cos α, cos β, cos γ):
$$\frac{df}{ds} = \frac{\partial f}{\partial x}\cos\alpha + \frac{\partial f}{\partial y}\cos\beta + \frac{\partial f}{\partial z}\cos\gamma.$$
In the two-dimensional version (29.4), the coefficients cos θ and sin θ are equal
to the two-dimensional direction cosines, $\cos\theta$ and $\cos(\tfrac12\pi - \theta)$, so (29.4) is
compatible with (31.14).
The direction cosines cos α, cos β, cos γ have the property $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$
(see Section 10.5), so they are the components of a unit vector v which points
in the desired direction. Therefore (31.14) can be written differently:
As in Section 29.6, the result (31.15) can be expressed in a third way. If a and b
are two vectors, and φ is the angle between them, then a · b = | a | | b| cos φ. Putting
v for a and grad f for b in (31.15), and using the fact that |v| = 1, we obtain the next
result.
Now take a function f(x, y, z), and a point P : (x1, y1, z1) as in Fig. 31.3. By
means of (31.16), we can explore the rate of variation of f(x, y, z) in all directions,
by pointing v in the required directions. The only thing that changes the value of
df/ds when we do this is the angle φ. It can be seen from (31.16) that: (i) if $\varphi = \tfrac12\pi$,
then df/ds = 0, which is consistent with v pointing tangentially to the surface
f(x, y, z) = fP, where fP is the value of f at P; (ii) df /ds takes its maximum value
| grad f |, when φ = 0. That is to say, grad f points in the direction of most rapid
increase of f and it is normal to the surface f(x, y, z) = fP.
It is worth noticing that, for a fixed angle φ, the unit vector v may point
anywhere along the generators of a cone having axis grad f, as shown in Fig. 31.3.
The directional derivative df/ds is the same in all these directions.
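These observations are easy to try out numerically. The Python sketch below takes grad f = (2, 4, 6), which is the gradient of $x^2 + y^2 + z^2$ at (1, 2, 3), and forms v · grad f for a few directions; the function names and the chosen directions are assumptions made for illustration.

```python
import math

grad_f = (2.0, 4.0, 6.0)     # grad of x^2 + y^2 + z^2 at the point (1, 2, 3)

def unit(v):
    """Scale v to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def directional_derivative(v):
    """df/ds in the direction of v: the scalar product of the unit vector with grad f."""
    return sum(a * b for a, b in zip(unit(v), grad_f))

along_grad = directional_derivative(grad_f)             # angle 0 with grad f
tangential = directional_derivative((2.0, -1.0, 0.0))   # perpendicular to grad f
oblique = directional_derivative((1.0, 0.0, 0.0))       # some intermediate angle
print(along_grad, tangential, oblique)
```

The first value equals | grad f |, the second is zero because the direction is tangential to the level surface, and the third lies between the two.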
Fig. 31.3 The vector grad f at P, and a unit vector v making an angle φ with it.
(x, y, z). (b) Find a unit vector v which points in the direction of most rapid
rate of increase in f(x, y, z) at the point (1, 1, 1). (c) An insect sets off from
(1, 1, 1) and flies a short distance δs in the direction given by (b). Find its
new coordinates (approximately).
(a)
$$\operatorname{grad} f(x, y, z) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = (-2x, -y, -z), \quad \text{or} \quad -2x\hat{\imath} - y\hat{\jmath} - z\hat{k}.$$
(b) grad f always points in the required direction; at the point (1, 1, 1), its components
are (−2, −1, −1). To obtain the corresponding unit vector v, divide by the length
$[(-2)^2 + (-1)^2 + (-1)^2]^{1/2} = \sqrt{6}$, obtaining $v = (-2/\sqrt{6}, -1/\sqrt{6}, -1/\sqrt{6})$.
(c) The insect moves a distance δs from the point P : (1, 1, 1) along v (see Fig. 31.4),
so its vector displacement is
$$v\,\delta s = \left(-\frac{2}{\sqrt{6}}\,\delta s, -\frac{1}{\sqrt{6}}\,\delta s, -\frac{1}{\sqrt{6}}\,\delta s\right).$$
Fig. 31.4 The displacement v δs from the point P : (1, 1, 1).
Fig. 31.5 The path through the points P0, P1, P2, …, Pn, Pn+1, made up of steps of length h.
$$v_n = [(\operatorname{grad} f)/|\operatorname{grad} f|]_{P_n} = (-2x_n, -y_n, -z_n)/(4x_n^2 + y_n^2 + z_n^2)^{1/2} = (a_n, b_n, c_n) \text{ (say)}. \qquad (31.17)$$
For the general step from $P_n$ to $P_{n+1}$, we obtain the small displacement components $\delta x_n$,
$\delta y_n$, $\delta z_n$ in the x, y, and z directions (in general these will differ from step to step):
$$(\delta x_n, \delta y_n, \delta z_n) = v_n h = (a_n h, b_n h, c_n h),$$
from (31.17). Therefore
$$(x_{n+1}, y_{n+1}, z_{n+1}) = (x_n + \delta x_n, y_n + \delta y_n, z_n + \delta z_n) = (x_n + a_n h, y_n + b_n h, z_n + c_n h). \qquad (31.18)$$
Equations (31.17) and (31.18), with the starting point $(x_0, y_0, z_0)$ given, form a step-by-
step process which is easy to computerize. The following table of the early stages was
calculated with h = 0.05; the starting point in this case is the point (1, 1, 1), where
$f(x, y, z) = 4 - x^2 - \tfrac12 y^2 - \tfrac12 z^2$ as in Example 31.5.
n xn yn zn
0 1 1 1
1 0.959 0.980 0.980
2 0.919 0.959 0.959
3 0.878 0.938 0.938
4 0.839 0.917 0.917
5 0.799 0.895 0.895
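One possible way of programming the step-by-step process of (31.17) and (31.18) is sketched below in Python. The function names `unit_gradient` and `path` are assumptions made here; the function f, the step h = 0.05 and the starting point (1, 1, 1) are those of the example.

```python
import math

def unit_gradient(x, y, z):
    """Unit vector along grad f for f = 4 - x^2 - y^2/2 - z^2/2, as in (31.17)."""
    g = (-2 * x, -y, -z)
    norm = math.sqrt(sum(c * c for c in g))
    return tuple(c / norm for c in g)

def path(start, h, steps):
    """Successive points obtained by repeating the step (31.18)."""
    points = [start]
    for _ in range(steps):
        p = points[-1]
        v = unit_gradient(*p)
        points.append(tuple(pc + h * vc for pc, vc in zip(p, v)))
    return points

points = path((1.0, 1.0, 1.0), 0.05, 5)
for n, p in enumerate(points):
    print(n, *(f"{c:.3f}" for c in p))
```

The printed rows can be compared with the table above. A smaller h follows the true path more closely at the cost of more steps.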
If a surface is defined by the equation
$$f(x, y, z) = k,$$
where k is a constant, it is called a level surface of the function f (it is the analogue
of a contour in the theory for functions of two variables). For example, the level
surfaces of
$$f(x, y, z) = \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \quad (a, b, c \text{ constants}),$$
are the ellipsoids
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = k \quad (k > 0).$$
According to (31.11), therefore, we can say in different language:
It follows that the insect in Example 31.6 crosses perpendicularly all the level
surfaces that it meets.
Self-test 31.5
If f(x, y, z) = z − x2 − y2, what are the level surfaces of the function? Find the
directional derivative at (1, 1, 3). Sketch the level surface through (1, 1, 3) and
the direction of the directional derivative.
31.8 STATIONARY POINTS
Therefore a turning point of f(x(t), y(t), z(t), … ) is encountered at Q on every
path passing through Q, and this is what we should wish to happen for the
point Q to be described as stationary.
Restricted stationary-value problems (see Section 30.2) may occur in any num-
ber of dimensions. In three dimensions, the restriction may be either to values of
f(x, y, z) on some given curve, or to values on some given surface.
To help visualize a three-dimensional situation, suppose that a fish swims
through a field of pollution of density P = f(x, y, z). At some point in the sea the
pollution is at an overall maximum, but this is of no concern to the fish if it does
not swim through it. However, it will notice highs and lows along its own path
even if there is nothing special about such points from an overall viewpoint.
These are restricted maxima and minima on the fish’s path. Suppose that the path
of the fish is expressed parametrically:
x = x(t), y = y(t), z = z(t).
Then the stationary points peculiar to the path are where df/dt = 0. By the chain
rule (31.7), these are the points where
$$\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = 0.$$
When written in terms of t, this is an equation giving the critical values of t. (We
must be careful to avoid a parametrization such that dx/dt = dy/dt = dz /dt = 0 at
some point on the path: at such a point, a non-existent stationary point would
be predicted.)
(In particular cases it might be easier to substitute x(t), y(t), z(t) directly into
f(x, y, z) for the turning points of f with respect to t.)
It is more usual for restricted stationary-value problems to be formulated in
a way that avoids parametric considerations. The restriction to a surface is the
easier case. Instead of a fish in the body of the sea, consider a crab which confines
itself to the undulating seabed described by an equation of the form
g(x, y, z) = c,
encountering there the local pollution, given throughout the sea by f(x, y, z). The
crab does not know about the rest of the sea, but as it moves around it will meet
highs and lows (and other stationary points) unconnected with possibly more
extreme pollution in the body of the sea. A stationary point will be found at a
point Q on the surface g(x, y, z) = c if
$$\frac{df}{ds} = 0 \text{ at } Q$$
in all directions v from Q which do not point into the body of the sea, but are
tangential to the surface g(x, y, z) = c.
Figure 31.6 shows such a point Q, and various tangential directions denoted by
unit vectors v pointing away from Q. From (31.15), one condition for a restricted
stationary point is
$$\frac{df}{ds} = 0 = v \cdot \operatorname{grad} f \text{ at } Q, \text{ for all such } v. \qquad (31.23)$$
Fig. 31.6 The point Q on the surface g(x, y, z) = c, with tangential unit vectors v and the vectors grad f and grad g.
In other words, grad f must be perpendicular to the surface at Q (ignoring for
the moment the chance that grad f might be zero at Q). But, by (31.11), grad g is
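always perpendicular to the surface g(x, y, z) = c; in particular at Q. Therefore grad
f and grad g, evaluated at Q, are parallel vectors; so
$$\operatorname{grad} f = \lambda \operatorname{grad} g \text{ at } Q,$$
where λ is an (unknown) constant, called a Lagrange multiplier for the problem.
By writing grad f and grad g in their components, we obtain
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \quad \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0, \quad \frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0. \qquad (31.24\text{a,b,c})$$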
We now have three equations for the four unknowns: (x, y, z) (the position of
Q) and λ. To find another equation, notice that (31.24a,b,c) would be unaffected if
we had g(x, y, z) equal to some constant other than c, so it is necessary to reassert
the particular surface:
g(x, y, z) = c. (31.24d)
The special possibility mentioned above, that (by chance) grad f = 0 at Q, is still
governed by eqn (31.24). When they are solved, we merely find that λ = 0. The case
corresponds to the unrestricted stationary-point problem (see (31.20)), where the
point found happens to lie in the specified surface.
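$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \qquad \text{(ii)}$$
$$\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0, \qquad \text{(iii)}$$
$$\frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0. \qquad \text{(iv)}$$
(31.25)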
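As an illustration of these conditions, consider finding the point of the plane x + 2y + 2z = 9 nearest the origin, taking $f = x^2 + y^2 + z^2$ and g = x + 2y + 2z. This problem is an assumption introduced here, not one of the numbered examples. The component equations give x = λ/2, y = λ, z = λ, and the surface equation then gives λ = 2. The Python sketch below checks that all four conditions hold at the resulting point.

```python
def f(x, y, z):
    return x * x + y * y + z * z

def residuals(x, y, z, lam, c=9.0):
    """Left-hand sides of g - c = 0 and the three component equations grad f - lam*grad g = 0."""
    fx, fy, fz = 2 * x, 2 * y, 2 * z     # components of grad f
    gx, gy, gz = 1.0, 2.0, 2.0           # components of grad g for g = x + 2y + 2z
    return (x + 2 * y + 2 * z - c, fx - lam * gx, fy - lam * gy, fz - lam * gz)

lam = 2.0
point = (lam / 2, lam, lam)              # (1, 2, 2), found by hand
print(point, f(*point), residuals(*point, lam))

# A neighbouring point that is also on the plane gives a larger value of f.
nearby = (1.2, 1.9, 2.0)
print(f(*nearby))
```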
$$\det\begin{bmatrix} 2 - 2\lambda & 0 & 1 \\ 0 & 2 - 2\lambda & 1 \\ 1 & 1 & 2\lambda \end{bmatrix} = 0,$$
so that $(1 - \lambda)(2\lambda^2 - 2\lambda + 1) = 0$. The only real solution is
$$\lambda = 1.$$
We shall not give the proof in full. Briefly, the situation is shown in Fig. 31.7. Q is
a stationary point on the curve of intersection and v a unit vector tangential to it
at Q. Since
$$\frac{df}{ds} = v \cdot \operatorname{grad} f = 0 \text{ at } Q,$$
grad f is perpendicular to v at Q. For the same reason as in the earlier case, grad g
and grad h are also perpendicular to v at Q. Therefore the three vectors grad f,
grad g, grad h all lie in the same plane (which is perpendicular to v), so grad f can
be expressed in terms of the other two vectors:
$$\operatorname{grad} f = \lambda \operatorname{grad} g + \mu \operatorname{grad} h,$$
where λ and µ are certain constants, the Lagrange multipliers for this problem.
Then split this equation into its components to obtain (31.26).
Fig. 31.7 The curve of intersection of the surfaces $g(x, y, z) = c_1$ and $h(x, y, z) = c_2$, with the tangent vector v and the vectors grad f, grad g, and grad h at Q.
x + y + z = 1, (ii)
2z − µ = 0. (v)
Self-test 31.6
Fig. 31.8 (a) The family of straight lines $y = \alpha - \alpha^2 x$ for various values of α. (b) The envelope E of the family.
The family does not have to consist of straight lines. Suppose that the family is
described by
f(x, y, α ) = 0.
To find the envelope (Fig. 31.9) consider two close values of the parameter, α and
α + δα, the corresponding curves of the family being
f(x, y, α ) = 0 and f(x, y, α + δα ) = 0.
The intersection point R is the point where
f(x, y, α ) = f(x, y, α + δα ) ( = 0).
Therefore, at the point R,
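$$\frac{f(x, y, \alpha + \delta\alpha) - f(x, y, \alpha)}{\delta\alpha} = 0.$$
Fig. 31.9 Neighbouring curves of the family, with parameters α and α + δα, intersecting at R.
Now let δα → 0. Then R and Q come together at P on the envelope, and this equation
becomes
$$\frac{\partial f(x, y, \alpha)}{\partial \alpha} = 0, \qquad (31.27)$$
at P. Also P lies on the curve
$$f(x, y, \alpha) = 0. \qquad (31.28)$$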
(We had not so far used the fact that f is zero rather than some other constant.) If
we eliminate α between (31.27) and (31.28), we obtain an equation in x and y
which describes the envelope.
Example 31.10 Find the envelope of the family of straight lines $y = \alpha - \alpha^2 x$,
where α is the parameter. (See Fig. 31.8.)
Let $f(x, y, \alpha) = y - \alpha + \alpha^2 x$. Then
$$\frac{\partial f}{\partial \alpha} = -1 + 2\alpha x = 0.$$
Therefore
$$\alpha = 1/(2x). \qquad \text{(i)}$$
On the envelope, also
$$y - \alpha + \alpha^2 x = 0; \qquad \text{(ii)}$$
so, from (i), $y - 1/(2x) + 1/(4x) = 0$, or
$$y = 1/(4x),$$
which is a rectangular hyperbola (see Fig. 31.8b).
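A quick numerical check confirms that each line of the family touches the curve y = 1/(4x). In the Python sketch below (the function names and the sample values of α are assumptions made for illustration), the line and the envelope are compared at the contact point x = 1/(2α) given by (i), both in value and in slope.

```python
def line(alpha, x):
    """Member of the family y = alpha - alpha^2 x."""
    return alpha - alpha * alpha * x

def envelope(x):
    """Envelope y = 1/(4x)."""
    return 1 / (4 * x)

def envelope_slope(x):
    """Derivative of 1/(4x) with respect to x."""
    return -1 / (4 * x * x)

alphas = (0.5, 1.0, 2.0, 3.0)
for alpha in alphas:
    x = 1 / (2 * alpha)                  # contact point from (i)
    print(alpha, x, line(alpha, x), envelope(x), -alpha * alpha, envelope_slope(x))
```

At each contact point the line and the envelope have the same height, α/2, and the same slope, −α², which is what tangency requires.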
Self-test 31.7
Find the envelope of the family of circles (x − α)² + y² = α (α > 0).
Problems
31.1 Write down the incremental approximation (c) (For A) The area of a triangle with sides a, b, c
for δf in the following cases. is given by
(a) f(x, y, z) = 2x + 3y2 + 4z2 − 3; A = [s(s − a)(s − b)(s − c)]–,
1
2
PROBLEMS
P
(e) x3y + zx3 = 5 at (1, 2, 3);
(x, y, z) 1 1 1
(f) + + = 1 at (2, 3, 6);
x y z
(g) (x2 + 4y2 − z2)−1 = 161 at (4, 1, 2).
z
31.13 By finding the gradient vectors, obtain the
y
angle between the following surfaces at the point
of intersection given:
(a) x2 + y2 + z2 = 9, x2 − z2 = 0 at (2, 1, 2);
(b) x2 − y2 + z2 = 1, 2x − 3y + z + 1 = 0 at (2, 2, 1);
r
O
θ
(c) x2 + y2 − z2 = 0, 3x + 4y + 5z = 50 at (3, 4, 5).
Explain the result.
f (x, y, z) = A eα ( 2x +4y +z ) 2 ,
2 2 2
x
where A and α are constants. Deduce that the
Fig. 31.10
vector (2x, 4y, z) points in the direction of grad f.
(b) Let f(x, y, z) = g[u(x, y, z)], where g and u are
two other functions. Show that
∂f ∂f sin θ ∂f ⎛ ∂u ∂u ∂u ⎞
= cos θ − and grad f = ⎜ g ′(u) , g ′(u) , g ′(u) ⎟ ,
∂x ∂r r ∂θ ⎝ ∂x ∂y ∂z ⎠
∂f ∂f cos θ ∂f and deduce that grad u points either in the same or
= sin θ + .
∂y ∂r r ∂θ in the opposite direction to grad f.
(c) The results (b) show that the differentiation 31.15 Write down expressions for the directional
operations ∂/∂x and ∂/∂r are equivalent derivative of the following at the point (x, y, z), in
respectively to the polar forms terms of a unit direction vector v.
∂ sin θ ∂ (a) x + 2y + 3z; (b) x2 − y2 − 3z;
cos θ − and
∂r r ∂θ (c) (x − 1)3 + y3 + z3.
∂ cos θ ∂
sin θ + . 31.16 Find df/ds for the following functions f,
∂r r ∂θ taken at the point (2, 3, 2) in the direction
Use this fact to confirm that v = ( 41 √2, 41 √2, 12 √3) .
∂ 2f ∂ 2f ∂ 2 f 1 ∂f 1 ∂ 2f (a) x − y + 2z; (b) xy + yz + zx;
+ = + + . (c) (xy + yz + zx)2;
∂x 2 ∂y 2 ∂r 2 r ∂r r 2 ∂θ 2
(d) x2 − y2 + 5 (in three dimensions: this represents
a vertical cylinder).
31.11 Obtain the vector function grad f for each of
the following. 31.17 The equations for two surfaces, f(x, y, z)
(a) x + y + z; = a, g(x, y, z) = b, where a and b are constants,
(b) 2x − 3y + 5z − 6; together represent their curve of intersection,
(c) x2 + y2 + z2; C. Show that the vector product grad f × grad g,
(d) x3 + 3z3 − 1 (in three dimensions); evaluated at a point on C, points in the direction
(e) x 2 − 41 y 2 + 91 z 2; of C. Use this to find a unit vector v in the direction
(f) 1/r, where r = (x 2 + y 2 + z 2 ) 2 ; confirm that the
1
of C in the following cases:
gradient vector points in the direction of the (a) 2x + 3y − z = 1, x − y − z = 0, at any common
position vector (x, y, z). point.
(b) x + y = 0, x − z = 0, at any common point.
31.12 Obtain a vector which is normal to the (c) x2 + y2 + z2 = 6, x − y + z = 0, (1, 2, 1).
following surfaces at the points specified, and (d) x2 + (y − 1)2 = 1, x2 + (y − 2)2 = 4, at x = 0,
construct a unit vector from it: y = 0, and any value of z. Explain what is
(a) x − 2y + z = 0 at any point; happening here.
(b) y2 + z2 = 2 at any point; (e) xy + yz + zx = 3, x + y + z = 3, at (1, 1, 1).
31.18 Find the stationary points of the following (e) The problem of the rectangular block
functions with respect to all the variables named of greatest volume which can be fitted
∂f ∂u y z y
= g ′(u) . = .
∂z ∂z x x z
(You only need the one-variable chain rule (3.3).)
(c) Check the correctness of the formulae (b) in the cases when (i) w = e^{x² − y² + z²}, (ii) w = sin(xy/z).
(d) Using the results (b), rewrite the chain rule (31.7) in the form appropriate to functions of the form g(u(x, y, z)).
(e) The path x = cos t, y = sin t, z = t represents a helix whose axis is the z axis. Find an expression in terms of t for df/dt on the path when f(x, y, z) = g(xy/z), where g is any function. Confirm the result for any simple case.

31.24 Often a function takes the form
f(u, v, w),
where u, v, and w are themselves functions of x, y, and z. Write down a version of the chain rule (31.8) which enables ∂f/∂x, ∂f/∂y, ∂f/∂z to be found. (In this case, x, y, and z function like parameters and u, v, and w like the principal variables.) Use this result to prove the following results:
(a) If φ = f(x − y, y − z, z − x), where f is any function, then
∂φ/∂x + ∂φ/∂y + ∂φ/∂z = 0.
Check your result with the function (x − y)(y − z)(z − x).
(b) If φ = f(y/x, z/x), where f is any function of two variables, then
x ∂φ/∂x + y ∂φ/∂y + z ∂φ/∂z = 0.

31.25 Let f(x, y, z, t) = e^{i(k1x + k2y + k3z − ωt)}, where i is the complex element (i² = −1), and k1, k2, k3, and ω are constants. Show that
∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z² = (1/c²) ∂²f/∂t²,
where c = ω/√(k1² + k2² + k3²). (This is called the wave equation in three dimensions, and f(x, y, z, t) is one of its solutions.)
Prove that g(k1x + k2y + k3z − ωt), where g is any function of a single variable, is also a solution.

31.26 Find the envelopes of the following families.
(a) y = α + α²x (parameter α);
(b) y + α²x = α (parameter α);
(c) x/α + y/(1 − α) = 1 (parameter α);
(d) x cos θ + y sin θ = 1 (parameter θ).

31.27 The cross-sectional profile of a long cylindrical mirror is the semicircle x² + y² = 1 in the right-hand half plane. Rays from the left, parallel to the x axis, fall on the mirror.
(a) Show that the equation of the ray reflected from the point (cos θ, sin θ) on the mirror is
x sin 2θ − y cos 2θ = sin θ.
(b) By regarding θ as the parameter, show that the envelope of these reflected rays is given by x² + y² = ¼(3y^{2/3} + 1). (In optics, this envelope is called the caustic of the reflected rays.)

31.28 Show that the envelope of the family of straight lines such that the length cut off between the x and y axes is L, a constant, is given by x^{2/3} + y^{2/3} = L^{2/3}. Sketch the curve (it has four segments).
32 Double integration
CONTENTS
32.1
Before explaining how they arise, we show first how a repeated integral is written
and evaluated. The following is an example of a repeated integral with constant limits:
I = ∫_0^1 ∫_0^2 (xy + y² − 1) dx dy.
There are two stages of integration; first with respect to x, then with respect to y,
this being determined by the order in which dx and dy appear under the integral
signs.
You are recommended to copy the following procedure step by step at first.
(i) Put brackets round the inner integral, which is the first to be evaluated:
I = ∫_0^1 ( ∫_0^2 (xy + y² − 1) dx ) dy.
(ii) Make it clear which variable connects with which limits of integration, by
explicitly labelling them as shown:
I = ∫_{y=0}^{1} ( ∫_{x=0}^{2} (xy + y² − 1) dx ) dy.
(iii) Evaluate the inner integral with respect to the first variable (here x), treating the other variable (y) as a constant:
∫_{x=0}^{2} (xy + y² − 1) dx = [½x²y + y²x − x]_{x=0}^{2} = 2y + 2y² − 2.
The result is the integrand of the outer integral, which is evaluated with respect to y:
I = ∫_{y=0}^{1} (2y + 2y² − 2) dy = [y² + ⅔y³ − 2y]_{y=0}^{1} = −⅓.
This eliminates the variable y, so that the final result is a definite number. If
you find you are left with an x or y in the result, then you have not followed the
process correctly.
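The procedure above can be mirrored numerically. The sketch below is an illustration only, written in Python with a simple midpoint rule (the helper name `midpoint` is an assumption of this example, not something defined in the text). The inner function integrates over x with y held constant, exactly as in step (iii), and the outer call integrates the result over y.

```python
def midpoint(func, a, b, n=200):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def integrand(x, y):
    return x * y + y**2 - 1.0

def inner(y):
    # Inner integral: x runs from 0 to 2 while y is treated as a constant
    return midpoint(lambda x: integrand(x, y), 0.0, 2.0)

# Outer integral: y runs from 0 to 1
I = midpoint(inner, 0.0, 1.0)
print(I)  # close to -1/3
```

Note that `inner(y)` depends only on y, and `I` is a plain number: the numerical analogue of the remark that no x or y should survive in the final result.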
(a) I = ∫_0^1 ∫_2^4 (xy + 1) dx dy, (b) J = ∫_2^4 ∫_0^1 (xy + 1) dy dx.
(a) As a repeated integral I = ∫_{y=0}^{1} ( ∫_{x=2}^{4} (xy + 1) dx ) dy.
The inner integral is
∫_{x=2}^{4} (xy + 1) dx = [½x²y + x]_{x=2}^{4} = 8y + 4 − (2y + 2)
= 6y + 2.
This forms the integrand of the outer integral:
I = ∫_{y=0}^{1} (6y + 2) dy = [3y² + 2y]_{y=0}^{1} = 5.
(b) Here the order of the symbols ∫_{x=2}^{4} and ∫_{y=0}^{1} has been reversed, and also the order of dx and dy. In other words, the same processes are to be carried out, but in the reverse order. The inner integral is now
∫_{y=0}^{1} (xy + 1) dy (x constant).
This is equal to
[x(½y²) + y]_{y=0}^{1} = ½x + 1.
The outer integral becomes
J = ∫_{x=2}^{4} (½x + 1) dx = 5,
the same value as in (a).
Self-test 32.1
Evaluate the repeated integrals
3 2 2 3
I= (x y + xy ) dx dy,
1 0
2 2
J= (x y + xy ) dy dx.
0 1
2 2
32.2
[Fig. 32.1: (a) the grain, with a vertical slice of thickness δy parallel to the (x, z) plane; (b) the slice ABCD shown in elevation.]
z = (1/32)x² + (1/16)y² + 2 (0 ≤ x ≤ 8; 0 ≤ y ≤ 4),
and the problem is to find the volume, V say, of the grain.
Imagine the grain divided into thin vertical plane slices, parallel to the (x, z)
plane, the thickness of a slice being δy. A typical slice is shown in Fig. 32.1a, and
the value of y is constant on its faces. It is lifted out and displayed in elevation
separately in Fig. 32.1b. The face area is given by
Area ABCD = ∫_{x=0}^{8} ((1/32)x² + (1/16)y² + 2) dx,
in which y takes the current constant value. Therefore its volume, δV say, is given by
δV ≈ ( ∫_{x=0}^{8} ((1/32)x² + (1/16)y² + 2) dx ) δy.
When we take the sum of all the elements δV and let δy → 0, we obtain in the usual way
V = ∫_{y=0}^{4} ( ∫_{x=0}^{8} ((1/32)x² + (1/16)y² + 2) dx ) dy.
The result has therefore taken the form of a repeated integral of the kind described in Section 32.1. In evaluating it, the inner integral gives the cross-sectional area of a slice on which y is held constant:
∫_{x=0}^{8} ((1/32)x² + (1/16)y² + 2) dx = [(1/96)x³ + (1/16)y²x + 2x]_{x=0}^{8} = 64/3 + ½y².
Finally
V = ∫_{y=0}^{4} (64/3 + ½y²) dy = [(64/3)y + ⅙y³]_{y=0}^{4} = 96.
It can be seen that if we had taken the slices parallel to the (y, z) plane the
process would have led to the integral
V = ∫_{x=0}^{8} ( ∫_{y=0}^{4} ((1/32)x² + (1/16)y² + 2) dy ) dx.
The integrand is the same, and the result must be the same, when the integrations
over x and y are carried out in the opposite order.
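In general a repeated integral with constant limits,
∫_c^d ∫_a^b f(x, y) dx dy,
can be evaluated in either order:
∫_c^d ∫_a^b f(x, y) dx dy = ∫_a^b ∫_c^d f(x, y) dy dx. (32.1)
I = ∫_0^1 ∫_0^2 x e^{xy} dx dy.
As it stands, the inner integral is
∫_{x=0}^{2} x e^{xy} dx,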
which, though not very difficult, does involve integration by parts. To avoid this, try the
alternative order of integration:
I = ∫_{x=0}^{2} ( ∫_{y=0}^{1} x e^{xy} dy ) dx.
The inner integral, with x being treated as a constant, is
∫_{y=0}^{1} x e^{xy} dy = [x (1/x) e^{xy}]_{y=0}^{1} = e^x − 1.
Then
I = ∫_0^2 (e^x − 1) dx = [e^x − x]_0^2 = e² − 3,
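As a hedged numerical illustration of (32.1), the following Python sketch evaluates the integral of x e^{xy} over 0 ≤ x ≤ 2, 0 ≤ y ≤ 1 in both orders with a midpoint rule and compares each with e² − 3. The helper `midpoint` and the variable names are assumptions of this example.

```python
import math

def midpoint(func, a, b, n=400):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return x * math.exp(x * y)

# Order dx dy: inner integral over x (0 to 2), outer over y (0 to 1)
I_dx_dy = midpoint(lambda y: midpoint(lambda x: f(x, y), 0.0, 2.0), 0.0, 1.0)

# Order dy dx: inner integral over y (0 to 1), outer over x (0 to 2)
I_dy_dx = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, 1.0), 0.0, 2.0)

exact = math.exp(2.0) - 3.0
print(I_dx_dy, I_dy_dx, exact)
```

Numerically the two orders cost the same; the advantage of reversing the order shows up only in the hand calculation, where it avoids integration by parts.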
Self-test 32.2
32.3
Change the order of integration and evaluate
2
Fig. 32.2 (a) Silo with triangular base OPQ; OP = 8, PQ = 4. (b) A cross-section y = OR. (c) The region of
integration, with a strip AB.
Figure 32.2b shows a typical slice ABCD lifted out and viewed in (x, z) axes in
order to obtain its face area, and Fig. 32.2c shows the base of the silo in plan view;
the slice chosen is along AB.
The slices all have different x values at their starting points, and these values
depend on y, so the limits of integration are not constant in this case. In order to
determine the range of integration of the slice at level y, it is necessary to refer to
the triangular area in Fig. 32.2c, called the region of integration for this problem.
The equation of the side OQ is
y = ½x
for 0 ≤ x ≤ 8. Since we need x in terms of y, we express this as
x = 2y,
and it is helpful to write it on OQ, as shown, together with the simpler information
required for the other limits of integration.
The face area of the slice ABCD at level y is therefore given by
∫_{x=2y}^{8} f(x, y) dx,
and the volume by
V = ∫_0^4 ∫_{2y}^{8} f(x, y) dx dy.
Notice that the limits of integration have nothing to do with the integrand
f(x, y), but depend only on the shape of the region of integration in the (x, y) plane
(in this case the triangle OPQ in Fig. 32.2c). The limits of integration are the same
no matter what the integrand.
∫_0^1 ∫_{2y}^{2} (x + y) dx dy.
∫_{x=2y}^{2} (x + y) dx = [½x² + yx]_{x=2y}^{2} = (2 + 2y) − (2y² + 2y²) = 2 + 2y − 4y².
Let
I = ∫_{y=0}^{2} ( ∫_{x=0}^{y} xy dx ) dy.
The inner integral is
∫_{x=0}^{y} xy dx = [½x²y]_{x=0}^{y} = ½y³.
Therefore
I = ∫_0^2 ½y³ dy = ⅛[y⁴]_0^2 = 2.
32.4
A sketch of the region of integration can be constructed in the following way.
The region of integration consists of the points (x, y) which simultaneously satisfy
(i) 0 ≤ y ≤ 2 and (ii) 0 ≤ x ≤ y.
[Fig. 32.3: the region of integration, the triangle bounded by x = 0, y = x, and y = 2.]
Self-test 32.3
Find the volume of the grain in the silo shown in Fig. 32.2.
∫_0^y ∫_0^2 xy dy dx.
This is obviously nonsense: the answer contains y, whereas we ought to get the
answer 2 again. In fact the new form means nothing at all.
If the region of integration is non-rectangular we have to write
∫∫ f(x, y) dy dx
and begin again, filling in the limits of integration so that we cover the same region. The interior integral is now with respect to y, so we must start with strips parallel to the y axis as shown in Fig. 32.4. Then the inner integral gives the contribution from the strip:
δx ∫_{y=x}^{2} f(x, y) dy.
[Fig. 32.4: the region of Fig. 32.3 with a vertical strip of width δx, running from y = x to y = 2.]
Adding the contributions of the strips from x = 0 to x = 2 gives
∫_0^2 ∫_0^y f(x, y) dx dy = ∫_0^2 ∫_x^2 f(x, y) dy dx.
Each case has to be considered individually in this way. The region of integration
should be sketched to ensure the correct change of limits.
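The change of limits for the triangular region of Figs 32.3 and 32.4 can be checked numerically. The Python sketch below is an illustration under the assumption f(x, y) = xy (the integrand of Example 32.4); the helper `midpoint` is a name introduced here. It sums horizontal strips (x from 0 to y) and vertical strips (y from x to 2) and shows that both give the value 2.

```python
def midpoint(func, a, b, n=200):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return x * y

# Horizontal strips: for each y in [0, 2], x runs from 0 to y
I_horizontal = midpoint(lambda y: midpoint(lambda x: f(x, y), 0.0, y), 0.0, 2.0)

# Vertical strips: for each x in [0, 2], y runs from x to 2
I_vertical = midpoint(lambda x: midpoint(lambda y: f(x, y), x, 2.0), 0.0, 2.0)

print(I_horizontal, I_vertical)  # both close to 2
```

In each case the variable limit belongs to the inner integral and involves only the outer variable, which is why the final result is a number.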
∫_0^{½} ∫_{−√(1−4y²)}^{√(1−4y²)} y dx dy.
I = ∫_{y=0}^{½} ( ∫_{x=−√(1−4y²)}^{√(1−4y²)} y dx ) dy.
The limits of integration express the boundaries of the region of integration:
x = (1 − 4y²)^{1/2}, x = −(1 − 4y²)^{1/2}, y = 0, y = ½.
These are shown in Fig. 32.5 (the curved part can be written x² + 4y² = 1: an ellipse with semi-axes equal to 1 and ½). Figure 32.5a shows how the form given is obtained, by starting with horizontal strips which end at x = ±(1 − 4y²)^{1/2}.
In Fig. 32.5b, the position with regard to vertical strips is shown, for which the
inner integral will be over y. When the order of integration is changed by this means,
we obtain
I = ∫_{x=−1}^{1} ( ∫_{y=0}^{½√(1−x²)} y dy ) dx.
[Fig. 32.5: the half-ellipse x² + 4y² ≤ 1, y ≥ 0, divided (a) into horizontal strips of width δy ending at x = ±(1 − 4y²)^{1/2}, and (b) into vertical strips.]
The inner integral is
∫_0^{½√(1−x²)} y dy = [½y²]_0^{½√(1−x²)} = ⅛(1 − x²).
Therefore
I = ⅛ ∫_{−1}^{1} (1 − x²) dx = ⅛[x − ⅓x³]_{−1}^{1} = ⅙.
It is left to you to try it in the original form; it is perfectly possible, but more
complicated.
Self-test 32.4
R is the region in the x,y plane bounded by the straight lines x = 1, y = 0
and the parabola x = y2. Evaluate
I = ∫∫R √x e^{−y√x} dx dy.
[Fig. 32.6: (a) the region of integration R; (b) a mesh on R, with a typical area element δA at P.]
Call the area covered by the lake the region of integration R for the problem.
Construct a mesh on R consisting of small area elements δA as in Fig. 32.6b: the
mesh may be quite arbitrary for the present purpose. A typical area element δA is
at P. Below δA is a depth of water we shall denote by f(P) (we shall not use f(x, y)
because cartesian coordinates might not be the ones we eventually want to use).
The volume δV in the vertical column of water below δA at the point P is approximated by
δV ≈ f(P) δA.
If we add up all the volume elements in the usual way, we obtain the total volume V. Denote this operation by
V = ∑_R δV.
It is natural to write this as some kind of integral as we did in one dimension (see Section 15.1). There are several notations; we shall write
V = ∫∫R f(P) dA,
which is to be read: the double integral of f over the region R . Unlike a repeated
integral it does not give any clue as to how to evaluate it.
As a rule, the argument that gives rise to a double integral by way of a certain summation will not have anything to do with volume; but, as a result of the summation that it represents, the signed-volume analogy referred to in Section 32.2 will always hold good:
I = ∫∫R f(P) dA
(i) I stands for lim_{δA→0} ∑_R f(P) δA; R represents a given region in a plane; and δA is a typical area element of R, taken at the point P. The summation is over all the elements δA of R.
(ii) (Signed-volume analogy) Whatever its origin, the integral is numerically
equal to the signed volume between a surface z = f(x, y) and the plane z = 0,
taken over the region R . (Where z is negative the contribution counts as
negative.) (32.2)
[Fig. 32.7: the region R of the plate, with the force δF = σ(P) δA acting on an area element δA.]
δF ≈ σ(P) δA.
Add up the contributions of all the elements covering R, and take the limit as the mesh becomes finer and finer. We obtain
F = lim_{δA→0} ∑_R σ(P) δA = ∫∫R σ(P) dA.
Example 32.7 We have shown in Example 32.6 that the resultant force F on any flat plate R subject to a (perpendicular) pressure σ(P) per unit area is given by F = ∫∫R σ(P) dA. Find the force on a rectangular plate of sides 2 and 3 units when σ = 3(r² − 2), where r is the distance from one of the corners.
This double-integral expression is perfectly general, applying to any plate, any distribution of force, and any coordinates. We have to reformulate the problem for this case. Place the rectangle as in Fig. 32.8, with the corner to which the data refer at the origin. A suitable mesh is the rectangular mesh, with δA having sides δx and δy: the area element is δA = δx δy. Also σ = 3(x² + y² − 2).
[Fig. 32.8: the rectangular plate 0 ≤ x ≤ 2, 0 ≤ y ≤ 3, with an area element δA of sides δx and δy lying in a horizontal strip.]
We can add (which ultimately means integrate) the contributions
δF ≈ 3(x2 + y2 − 2) δx δy
in any order that is convenient. Suppose we decide to add the contributions along each
horizontal strip at level y, and then to add the results from the strips. Then from the
strip at level y we obtain the contribution
( ∫_{x=0}^{2} 3(x² + y² − 2) dx ) δy,
after letting δx → 0. When we add the contributions from all the strips and let δy → 0, we have the repeated integral
F = ∫_0^3 ∫_0^2 3(x² + y² − 2) dx dy = 3 ∫_0^3 (−4/3 + 2y²) dy = 42.
If we had considered vertical strips we would have obtained the same result for F – this
would correspond to inverting the order of integration in the repeated integral.
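The summation that defines the double integral in Example 32.7 can be carried out directly. The following Python sketch is an illustration only (the function name `plate_force` and the mesh sizes are assumptions of this example): it adds the contributions σ δx δy over a rectangular mesh, sampling σ at the centre of each element.

```python
def plate_force(nx=200, ny=300):
    """Riemann sum of sigma = 3(x^2 + y^2 - 2) over 0 <= x <= 2, 0 <= y <= 3."""
    dx = 2.0 / nx
    dy = 3.0 / ny
    total = 0.0
    for j in range(ny):              # horizontal strip at level y
        y = (j + 0.5) * dy
        for i in range(nx):          # elements along the strip
            x = (i + 0.5) * dx
            sigma = 3.0 * (x * x + y * y - 2.0)
            total += sigma * dx * dy  # contribution delta F = sigma * delta A
    return total

F = plate_force()
print(F)  # close to 42
```

Swapping the two loops corresponds to summing vertical strips first, and leaves the total unchanged apart from rounding.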
The following examples show the adaptability of the notation. In each case, R
represents the region in question, with P a representative point of R and dA the
corresponding element.
(i) Area. Area of R : ∫∫R dA.
(ii) Variable surface density. Total mass of a thin flat plate, of variable mass σ(P)
per unit area: ∫∫R σ(P) dA.
(iii) Moments. Moment of (ii) about the x axis: ∫∫R yσ(P) dA.
(iv) Moments of inertia. Moment of inertia of (ii) about the y axis: ∫∫R x²σ(P) dA.
(v) Probability. A function f(x, y) is eligible to be a probability density
function for random variables X and Y over a region R if f(x, y) ≥ 0 and
∫∫R f(x, y) dA = 1. The probability that (X, Y) lies in a subregion S is then
∫∫S f(x, y) dA. (Here it is helpful to retain x and y: we are not obliged to use P
if it does not have the right associations.)
(vi) Vector resultant. A force per unit area, f(P) (stress), variable in direction and magnitude, is applied to the surface of a flat plate R. The resultant force F is given by F = ∫∫R f(P) dA.
In order to interpret or evaluate the integral, we write f in its components f = î f1 + ĵ f2 + k̂ f3: the original double integral with a vector function as integrand is really three double integrals in one.
The resultant force F · v in a fixed direction v is ∫∫R f (P) · v dA. This integrand
f ·v is not a vector. It can be rewritten in any convenient way: for example, as
| f | cos θ, where θ is the angle between f and v.
Self-test 32.5
A thin rectangular plate has uniform density ρ and side-lengths 2a and 2b.
Find the moment of inertia of the plate about an axis perpendicular to the
plate through its centre. (For a plate of general shape R , the moment of
inertia about a perpendicular axis is
∫∫R ρ(x² + y²) dx dy.)
[Fig. 32.9: (a) the region R bounded by r = a, r = b, θ = α, and θ = β, covered by a polar mesh; (b) an area element δA ≈ r δr δθ, with sides δr and r δθ.]
where P is a representative point of R , and δA for the moment permits any kind of
division of R into small area elements. We want to put everything in terms of
polar coordinates. This process must include a suitable choice of elements δA, so
that the summation, or integration, (32.3) can be carried out in an orderly way
over the δA elements – the equivalent of ‘strips’ in (x, y) coordinates.
The mesh suitable for this purpose is also shown in Fig. 32.9a, and one of the
area elements δA is shown in Fig. 32.9b. It is nearly a rectangle, with sides δr and
r δθ, so
δA ≈ r δr δθ.
The sum in (32.3) therefore becomes, in polar coordinates,
lim_{δr→0, δθ→0} ∑_R f(r, θ)(r δr δθ).
∫∫R f(P) dA = ∫_α^β ∫_a^b f(r, θ) r dr dθ. (32.4)
Example 32.8 Find the volume V between the two planes x + y + z = 4 and z = 0,
over the quadrant 0 ≤ r ≤ 1, 0 ≤ θ ≤ ½π.
Here R is the region 0 ≤ r ≤ 1, 0 ≤ θ ≤ ½π in the plane z = 0. Expressed as a double integral, the required volume V is
V = ∫∫R f(P) dA = ∫∫R z dA.
Here z is given by
z = 4 − x − y = 4 − r cos θ − r sin θ = f(r, θ).
Then by (32.4),
V = ∫_0^{½π} ∫_0^1 (4 − r cos θ − r sin θ) r dr dθ
= ∫_{θ=0}^{½π} ( ∫_{r=0}^{1} (4r − r² cos θ − r² sin θ) dr ) dθ
= ∫_0^{½π} (2 − ⅓ cos θ − ⅓ sin θ) dθ = π − ⅔.
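The role of the factor r in (32.4) can be seen by carrying out the polar summation for Example 32.8 numerically. The sketch below is a minimal Python illustration (the function name and mesh sizes are assumptions of this example); each element contributes its height z times the area r δr δθ.

```python
import math

def polar_volume(nr=200, ntheta=200):
    """Sum z * (r dr dtheta) over the quadrant 0 <= r <= 1, 0 <= theta <= pi/2."""
    dr = 1.0 / nr
    dtheta = (math.pi / 2.0) / ntheta
    total = 0.0
    for j in range(ntheta):
        theta = (j + 0.5) * dtheta
        for i in range(nr):
            r = (i + 0.5) * dr
            z = 4.0 - r * math.cos(theta) - r * math.sin(theta)
            total += z * r * dr * dtheta   # area element is r dr dtheta
    return total

V = polar_volume()
print(V, math.pi - 2.0 / 3.0)
```

Omitting the factor r in the summation would give the integral of z over a rectangle in the (r, θ) plane, which is not the required volume.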
Example 32.9 A circular disc of radius 0.1 m has a surface charge density σ = 10⁻⁶(1 + 10³ r³ sin ½θ) coulombs per square metre. Find the total charge.
The total charge Q is given by ∫∫R σ(P) dA, where the region R is the disc 0 ≤ r ≤ 0.1, 0 ≤ θ ≤ 2π (if in doubt, sketch it). Remembering that, in polar coordinates, dA = r dr dθ (or reading straight from (32.4)), we have
Q = ∫_0^{2π} ∫_0^{0.1} σ(r, θ) r dr dθ = ∫_0^{2π} ∫_0^{0.1} 10⁻⁶(1 + 10³ r³ sin ½θ) r dr dθ
= 10⁻⁶ ∫_{θ=0}^{2π} ( ∫_{r=0}^{0.1} (r + 10³ r⁴ sin ½θ) dr ) dθ
= 10⁻⁶ ∫_{θ=0}^{2π} [½r² + ⅕ × 10³ r⁵ sin ½θ]_{r=0}^{0.1} dθ
= 10⁻⁸ ∫_0^{2π} (½ + ⅕ sin ½θ) dθ = 10⁻⁸ [½θ − ⅖ cos ½θ]_0^{2π}
= 3.94 × 10⁻⁸.
Since the repeated integral has constant limits, the same result would be obtained by
integrating in the reverse order (see (32.1)).
Example 32.10 The curve r = cos θ (0 ≤ θ ≤ ¼π), together with the radii from
the origin to its ends, forms the boundary of the region R , and is shown in
Fig. 32.10. Obtain (a) the area of R , and (b) its moment about the y axis.
(a) In general, the area of a region R is ∫∫R dA. In this case, we shall add up the
contribution δA along radial sectors inclined at angle θ, one of which is shown, and
then sum the results for all these sectors to obtain the total area A. We can indicate
this, together with the range for r and θ, by writing
A = ∑_{θ=0}^{¼π} ∑_{r=0}^{cos θ} δA ≈ ∑_{θ=0}^{¼π} ∑_{r=0}^{cos θ} (r δr δθ).
When we let δr and δθ tend to zero, we have a repeated integral with a variable limit:
A = ∫_0^{¼π} ∫_0^{cos θ} r dr dθ = ∫_{θ=0}^{¼π} [½r²]_{r=0}^{cos θ} dθ = ½ ∫_0^{¼π} cos²θ dθ = (1/16)π + ⅛.
[Fig. 32.10: the region R bounded by the curve r = cos θ and the radii θ = 0 and θ = ¼π, with an area element δA at (r, θ).]
δM ≈ x δA;
so, as a double integral, the total moment M is given by
M = ∫∫R x dA.
Since x = r cos θ and dA = r dr dθ,
M = ∫_0^{¼π} ∫_0^{cos θ} r² cos θ dr dθ = (1/32)π + 1/12.
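Both results of Example 32.10 involve a variable upper limit r = cos θ in the inner integral. As a hedged check, the Python sketch below evaluates the two repeated integrals with a midpoint rule (the helper `midpoint` and the function names are assumptions of this example) and compares them with (1/16)π + ⅛ and (1/32)π + 1/12.

```python
import math

def midpoint(func, a, b, n=200):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def sector_area(theta):
    # Inner integral of r dr from r = 0 to r = cos(theta)
    return midpoint(lambda r: r, 0.0, math.cos(theta))

def sector_moment(theta):
    # Inner integral of x * r dr = r^2 cos(theta) dr from r = 0 to r = cos(theta)
    return midpoint(lambda r: r * r * math.cos(theta), 0.0, math.cos(theta))

area = midpoint(sector_area, 0.0, math.pi / 4.0)
moment = midpoint(sector_moment, 0.0, math.pi / 4.0)
print(area, math.pi / 16.0 + 1.0 / 8.0)
print(moment, math.pi / 32.0 + 1.0 / 12.0)
```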
Self-test 32.6
Find the volume of the region bounded by the paraboloid z = 1 − x2 − y2 and
the plane z = 0.
32.7
∫_a^b g(x)h(y) dx = h(y) ∫_a^b g(x) dx,
so that the repeated integral reduces to
∫_a^b g(x) dx ∫_c^d h(y) dy,
which is simply the product of two ordinary integrals. We have the following result:
Separable integrals
∫_c^d ∫_a^b g(x)h(y) dx dy = ∫_a^b g(x) dx ∫_c^d h(y) dy. (32.5)
This can sometimes speed up the working when evaluating integrals, but the following example proves an important result by applying (32.5) the other way round.
Put I = ∫_0^∞ e^{−x²} dx.
The product is I²:
I² = ∫_0^∞ e^{−x²} dx ∫_0^∞ e^{−y²} dy.
By (32.5), read from right to left, this is the double integral of e^{−(x² + y²)} over the first quadrant R of the (x, y) plane.
The area element is dA = r dr dθ, and the same region R is described in polars by
0 ≤ r < ∞, 0 ≤ θ ≤ ½π.
Then
I² = ∫_0^{½π} ∫_0^∞ e^{−r²} r dr dθ = ∫_0^{½π} dθ ∫_0^∞ e^{−r²} r dr (by (32.5) again)
= −½π × ½[e^{−r²}]_{r=0}^{∞} = ¼π.
Therefore I = ½π^{1/2}.
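The value I = ½√π can be confirmed numerically. The sketch below is a Python illustration under one stated assumption: the infinite upper limit is replaced by 8, beyond which e^{−x²} is negligibly small. The helper `midpoint` is a name introduced for this example. It evaluates I directly, and also evaluates I² from the separated polar form.

```python
import math

def midpoint(func, a, b, n=4000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

UPPER = 8.0  # assumed cut-off standing in for infinity

# Direct evaluation of I = integral of exp(-x^2) from 0 to infinity
I = midpoint(lambda x: math.exp(-x * x), 0.0, UPPER)

# Polar form: I^2 = (integral of dtheta over 0..pi/2) * (integral of exp(-r^2) r dr)
I_squared_polar = (math.pi / 2.0) * midpoint(lambda r: math.exp(-r * r) * r, 0.0, UPPER)

print(I, 0.5 * math.sqrt(math.pi))
print(I_squared_polar, math.pi / 4.0)
```

The polar radial integral has an elementary antiderivative because of the extra factor r, which is exactly what makes the change to polar coordinates effective here.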
[Fig. 32.11: the first quadrant R with a polar area element δA, of sides δr and r δθ.]
Example 32.12 Prove the convolution theorem for Laplace transforms (see
(25.11)): that is, if F(s) and G(s) are the Laplace transforms of f(t) and g(t)
respectively, then F(s)G(s) is the Laplace transform of ∫_0^t f(τ)g(t − τ) dτ.
Consider the Laplace transform P(s) of ∫_0^t f(τ)g(t − τ) dτ:
P(s) = ∫_0^∞ e^{−st} ∫_0^t f(τ)g(t − τ) dτ dt = ∫_0^∞ ∫_0^t e^{−st} f(τ)g(t − τ) dτ dt.
The region of integration is the infinite triangle in the (τ, t) plane shown in Fig. 32.12. Change the order of integration by summing vertical strips: we find that
P(s) = ∫_0^∞ ∫_τ^∞ e^{−st} f(τ)g(t − τ) dt dτ.
[Fig. 32.12: the region of integration in the (τ, t) plane, with a vertical strip.]
32.8
determinant
Consider the integral
where R is the region of integration in the (x, y) plane and the area elements
δA are small rectangles of side δx and δy as in Fig. 32.13. The shape of R might
suggest the use of another system of coordinates to evaluate I. The special case of
polar coordinates was illustrated in Section 32.6.
Suppose that new coordinates u and v are defined by the relations
x = x(u, v); y = y(u, v); (32.8)
where there is a one-to-one correspondence between (x, y) and (u, v). The
objective is to put (32.7) entirely in terms of u and v.
[Fig. 32.13: the region R with a rectangular area element δA of sides δx and δy.]
[Fig. 32.14: the coordinate curves u = uP and v = vP through the point P.]
Figure 32.14 shows a general point P at (xP, yP), or at (u = uP, v = vP) in the new
coordinates. The coordinate curves u = uP and v = vP through P are also shown.
Now let δu and δv represent positive small increments in u and v respectively.
In Fig. 32.15(a) the two curves u = uP + δu and v = vP + δv are also shown. The area element PQRS, denoted by δA′, is of the type appropriate for the new coordinates, and when δu and δv are small, PQRS is nearly a parallelogram, as shown in Fig. 32.15(b). Its area is given approximately by
δA′ = | det [ xQ − xP, xS − xP ; yQ − yP, yS − yP ] |,
where the verticals stand for the modulus of the determinant between them (see Problem 32.17, and also Example 11.2).
Fig. 32.15 (a) Area element δA′ for the u, v coordinates. (b) When δu and δv are small, PQRS is nearly a parallelogram.
The elements of the determinant are given approximately by
xQ − xP = x(uP + δu, vP) − x(uP, vP) ≈ (∂x/∂u) δu,
xS − xP = x(uP, vP + δv) − x(uP, vP) ≈ (∂x/∂v) δv,
yQ − yP = y(uP + δu, vP) − y(uP, vP) ≈ (∂y/∂u) δu,
yS − yP = y(uP, vP + δv) − y(uP, vP) ≈ (∂y/∂v) δv,
where the partial derivatives are evaluated at P. Therefore
δA′ = | det [ ∂x/∂u, ∂x/∂v ; ∂y/∂u, ∂y/∂v ] δu δv | = | det [ ∂x/∂u, ∂x/∂v ; ∂y/∂u, ∂y/∂v ] | δu δv, (32.9)
(where the vertical lines denote the modulus of the determinants) since we required
δu and δv to be positive.
The determinant which occurs in (32.9) is of wide importance. It is called the
Jacobian of the transformation (32.8), and has the notation
∂(x, y)/∂(u, v).
For brevity it is sometimes denoted simply by J(u, v).
From (32.8) and (32.9), remembering the modulus, we can therefore say:
δA′ = |∂(x, y)/∂(u, v)| δu δv (or |J(u, v)| δu δv)
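The Jacobian in (32.9) can be estimated numerically when the partial derivatives are awkward to obtain by hand. The following Python sketch is an illustration only: the function `jacobian` and its central-difference step h are assumptions of this example, not part of the text. Applied to the polar transformation x = r cos θ, y = r sin θ it should return a value close to r.

```python
import math

def jacobian(x_of, y_of, u, v, h=1e-6):
    """Estimate d(x, y)/d(u, v) at (u, v) using central differences."""
    dx_du = (x_of(u + h, v) - x_of(u - h, v)) / (2.0 * h)
    dx_dv = (x_of(u, v + h) - x_of(u, v - h)) / (2.0 * h)
    dy_du = (y_of(u + h, v) - y_of(u - h, v)) / (2.0 * h)
    dy_dv = (y_of(u, v + h) - y_of(u, v - h)) / (2.0 * h)
    # Determinant of the 2 x 2 matrix of partial derivatives
    return dx_du * dy_dv - dx_dv * dy_du

def x_polar(r, theta):
    return r * math.cos(theta)

def y_polar(r, theta):
    return r * math.sin(theta)

J = jacobian(x_polar, y_polar, 2.0, 0.7)
print(J)  # close to r = 2
```

The area element is then |J| δu δv; the modulus matters for transformations whose Jacobian is negative.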
We can now rewrite the original integral (32.7) in terms of the new coordinates
u and v:
I = ∫_a^b f(x) dx
and we change the variable by putting x = x(u), the new factor dx/du appears in the
integrand, and the limits change.
The final step is to convert (32.12) into a repeated integral in terms of u and v,
so that the integrations can be carried out.
[Fig. 32.16: the region R in the (x, y) plane.]
I = ∫∫R (x² + y²) dx dy = ∫∫S r² J(r, θ) dr dθ = ∫∫S r³ dr dθ.
[Fig. 32.17: the corresponding region S in the (r, θ) plane.]
Therefore
δA = |∂(x, y)/∂(u, v)| δu δv = 2uv δu δv.
[Fig. 32.18: the region R in the (x, y) plane, bounded by the coordinate curves u = 0, u = 1, v = 0, and v = 1.]
[Fig. 32.19: the corresponding region S in the (u, v) plane, 0 ≤ u ≤ 1, 0 ≤ v ≤ 1.]
I = ∫_0^1 ∫_0^1 8uv du dv = 8 ∫_0^1 u du ∫_0^1 v dv (since the integral is separable)
= 8 × ½ × ½ = 2.
Self-test 32.7
Sketch the region bounded by the parabolas y = x², y = 2x², x = y², x = 2y².
Using the change of variable u = y/x², v = x/y², find the area enclosed by the
parabolas. What is the area bounded by y = 2x², y = x² and x = 2y²?
Problems
DOUBLE INTEGRATION
1 2 1 1
dx dy; dx dy;
d b b d 0 0 0 0
(c) (d)
c a a c
32.4 Find the volume of the wedge-shaped object
2π
1 1
d b
2
(e) dy dx; (f) y sin xy dx dy; having one curved surface which is part of the
c a 0 0
cylinder x2 + y2 = 1, and whose flat surfaces are z = 0
32
(xy − x y) dx dy;
1 1 32.5 Reverse the order of integration in each of the
2 2
(i) following cases. It is necessary to sketch the region
0 −1
of integration and to indicate a typical strip
(xy − x y) dy dx;
1 1
2 2
corresponding to the new order of integration,
(j) as in Section 32.4.
−1 0
1 y 1 1
2π
1
2π
1 2 y+1
2 1
(m) dx dy.
x 1 √(1− y 2 )
y 1 0
(d)
0 −√(1− y 2 )
f (x, y) dx dy;
1
4 2y
32.2 Find the signed volume between the given
surfaces and the plane z = 0 over the specified
(e)
2 0
f (x, y) dx dy;
rectangular regions.
(a) z = xy, 0 x 1, 0 y 1; 1 x2
(c) z = x + y, −1 x 2, −2 y 1; 1 1−x
(d) z = −1, a x b, c y d;
(e) z = 2x − y + 3, 0 x 1, 0 y 1;
(g)
0 −1+x
f (x, y) dy dx (it becomes the sum of
two integrals);
(f) z = 1 /(x + y), 1 x 2, 0 y 1; 1 1+√(1−x 2 )
(g) z = (x + 2y − 1)2, −2 x 1, −1 y 1. (h)
−1 1−√(1−x 2 )
f (x, y) dy dx.
x
0 0
2 2
0 −y
(c) x e
0 0
2 xy
dx dy;
∞ y−2
(h) R is the half plane y 0, and f(P) = y e −(x 2 +y 2)
.
x 2y e −x y dx dy;
2 2
(d)
(Hint: separate the integral: see (32.5).)
PROBLEMS
1 −2
0 4
y
1 2
32.9 A circular hole of radius 12 a is drilled through
(e) y(1 − x 2 − y 2 )2 dx dy; a sphere of radius a in such a way that the edge of
−1 −2
the hole passes through the centre of the sphere.
2 1
x
y Let the equation of the sphere and the cylinder be
(f ) dx dy; x2 + y2 + z2 = a2 and (x − 12 a)2 + y 2 = 41 a 2. If x = r
2
+ y2
1 0
cos θ and y = r sin θ, show that the volume Vc of
∞ ∞ material removed (the section in the (x, y) plane is
1
(g) dx dy; shown in Fig. 32.20) is given by
1 0
(x + y)3
2π a cos θ
1
(h)
1
1
y(x 2 − y 2 ) dx dy;
1
2
Vc = 2
− 12 π 0
a 2 − r 2 r dr dθ .
0 y
Hence find the volume of the remaining part of the
2 y −1
2
sphere.
(i) x dx dy (the integral must be split
0 − y −1 into two parts);
y
1 1
y dx dy
(j) 1 .
0 y (x 2 − y 2 ) 2
r
32.7 The symbol ∫∫R f(P) dA represents a double θ
integral taken over the region R , and dA is the area O 1
2
a a x
element at the point P in R (see Section 32.5). In
the following cases, the region R is described, and
f(P) given in cartesian coordinates. Evaluate the
integrals.
(a) R is the rectangle with corners at (1, 1), (2, 1),
(2, 4), and (1, 4), and f(P) = x2 + y2.
Fig. 32.20
(b) R is the equilateral triangle with vertices at
(0, −1), ( 3 2 , 0), and (0, 1), and f(P) = x.
1
(f) R is the first quadrant of the plane, and f(x, y) as a repeated integral in the (u, v) plane, and
= e − 4(x +y ).
2 2
evaluate it.
(g) Show that the volume of a sphere of radius a is
3 πa
4 3. (Consider the hemisphere
32.12 Sketch the region in the (x, y) plane
0 z (a 2 − x 2 − y 2 ) 2 .) bounded by the parabolas y = x2, y = 2x2, x = y2,
1
x = 2y2. Find the Jacobian of the transformation 32.16 Find the Jacobian J(u, v) of the
given by transformation u = x2 − y2, v = 2xy. Draw a
DOUBLE INTEGRATION
32.13 Evaluate
(x + y ) dx dy.
2 2
x e
R
x+y
dA,
R
where R is the region bounded by the square 32.17 Let PQRS be a parallelogram with P at
|x| + |y| = 1. (xP, yP), Q at (xQ, yQ), and S at (xS, yS). Show
32
V=2 (x + 2) dA
R
2 32.18 The following technique can be regarded as
integrating under another integral sign with respect
of the component. to a parameter (compare Section 17.9).
(a) Noting that
32.15 For the polar transformation x = r cos θ, y = r b
e
1 −ax
sin θ, show that − xy
dy = (e − e −bx )
x
∂(x, y) a
= r.
∂(r, θ ) evaluate the integral
Show that r and θ are given by ∞
e −ax − e −bx
y dx,
r= x +y ,
2 2
tan θ = . 0
x
x
Show that where a 0 and b 0.
(b) In a similar way, evaluate
∂(r, θ ) 1 ∂(x, y)
= =1 .
∂(x, y) r ∂(r, θ ) ∞
cos ax − cos bx
In fact under fairly general conditions, the dx,
−∞
x2
Jacobian satisfies this inverse rule, which is helpful
in some cases since it can avoid the inversion of where a and b may take any values. In this
transformations (see Example 30.16). problem the result depends on the signs of a
Find ∂(u, v)/ ∂(x, y) if u = y /x2 and v = x/y2 using and b. (You may assume that
this rule, and confirm that
∞
∂(x, y)
1 sin u 1
= . du = √π.)
∂(u, v) 3u 2v 2 0 u 2
Line integrals
33
CONTENTS
Figure 33.1 charts the fortunes, or the state (x, y), of the museum over a period,
starting at state A and arriving at state B, in the form of a curve joining A and B.
Time does not register on this diagram, except that direction of development as
time increases is indicated by the arrow. The directed curve is called the path from
A to B, denoted by (AB) or AB (there may be more letters in the brackets).
Suppose that the bonus at the starting state A is IA, and at the state B it is IB. Then
the problem is to find the change in bonus over the period, I(AB):
IB − IA = I(AB). (33.2)
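Divide up the path into many short segments such as PP′ (Fig. 33.1). The increment
δI over a typical segment is given by
\[ \delta I \approx f(x, y)\,\delta x, \]
so that
\[ I(AB) \approx \sum_{(AB)} f(x, y)\,\delta x. \tag{33.3} \]
[Fig. 33.1: the path from A to B in the (x, y) plane, with x (visitors) on the horizontal axis and y (grants) on the vertical axis; δx < 0 along (AC) and δx > 0 along (CB).]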
Given a specific function f(x, y) and a specific path (AB), I(AB) could in principle
be computed by carrying out the summation (33.3) numerically, taking δx very
small, and allowing for the fact that δx is sometimes positive and sometimes
negative. We have to split up the path for this purpose: in Fig. 33.1, δx is negative
along (AC) and positive along (CB). If the path is vertical along a section, then δx
will be zero, and there will be a zero change in the bonus along this part despite
the fact that y is changing.
In imagination, let δx → 0. Then ‘≈ ’ becomes ‘=’. It is natural to write the
result as a kind of integral:
\[ I(AB) = \int_{(AB)} f(x, y)\,dx, \tag{33.4} \]
where the notation reminds us that we take values of (x, y) which lie on (AB) and
take account of the sign of δx at each point on the path.
The integral in (33.4) is called a line integral. It is not straightforwardly an
ordinary integral because the direction, left to right or right to left, at every point
must be taken into account. The director is losing money along (AC). In order
to arrive at B from A many paths are possible. In general, a line integral I(AB) will
depend on the total history, on what path has been followed between A and B, and
we say that the integral is path dependent.
[Fig. 33.2: (a) the path from A to B; (b) the path from A to C along y = x² − 3x + 3.]
(in suitable units). Suppose that the museum passes from state A : (3, 3) to state B
as in Fig. 33.2a, or to state C as in Fig. 33.2b; it thrives in Fig. 33.2a and declines
in Fig. 33.2b.
Consider the case (a). The graph AB can be expressed in principle as a function
of x, and the curve chosen for illustration is
\[ y = x^2 - 7x + 15, \]
where B is the point (5, 5). Then
\[ I(AB) = \int_{(AB)} (x + \tfrac{1}{2}y)\,dx = \int_{(AB)} [x + \tfrac{1}{2}(x^2 - 7x + 15)]\,dx. \]
But δx is positive all the way; so, regarded as the limit of the sum in (33.4), this is
just an ordinary integral. After simplifying the integrand, we have
\[ I = \int_{x=3}^{5} \tfrac{1}{2}(x^2 - 5x + 15)\,dx = 11.33. \]
\[ I(AC) = \int_{(AC)} \tfrac{1}{2}(x^2 - x + 3)\,dx, \]
where C is the point (1, 1). In this case, however, the δx in the sum (33.4) are all
negative: x is decreasing. To turn this into an ordinary integral, we have therefore
to reverse the sign:
\[ I(AC) = -I(CA) = -\int_{x=1}^{3} \tfrac{1}{2}(x^2 - x + 3)\,dx = -5.33. \tag{33.5} \]
There is a reduction of bonus for bringing the museum to the edge of ruin.
While we are still observing things, notice first that, in connection with the sign
change for negative δx on (AC) in (33.5), we can write
\[ I = -\int_{1}^{3} \tfrac{1}{2}(x^2 - x + 3)\,dx = (+)\int_{3}^{1} \tfrac{1}{2}(x^2 - x + 3)\,dx. \]
In other words, we obtain the correct result by setting the x coordinate of the
starting point as the lower limit, and that of the end-point as the upper limit,
whether x is constantly decreasing or constantly increasing along the path, and
this is a general result.
Lastly we compare the result for the parabolic path (AC) in Fig. 33.2 with a
straight path from A to C whose equation is
\[ y = x. \]
Then
\[ \int_{(AC)} (x + \tfrac{1}{2}y)\,dx = \int_{3}^{1} \tfrac{3}{2}x\,dx = \tfrac{3}{2}\left[\tfrac{1}{2}x^2\right]_{3}^{1} = -6. \]
This is different from (33.5), so we must in general expect that line integrals will
be path dependent.
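The three values found above can be checked by carrying out the defining sum numerically. The following is a minimal sketch only: the helper name `line_integral_dx`, the midpoint sampling, and the step count `n` are our own illustrative choices, not part of the text. The sign of δx is handled automatically because the step is negative when x decreases along the path.

```python
def line_integral_dx(f, y_of_x, x_start, x_end, n=20_000):
    """Approximate the sum of f(x, y) * dx along the curve y = y_of_x(x),
    travelling from x_start to x_end (hypothetical helper for illustration)."""
    dx = (x_end - x_start) / n          # negative when x decreases along the path
    total = 0.0
    for k in range(n):
        x_mid = x_start + (k + 0.5) * dx
        total += f(x_mid, y_of_x(x_mid)) * dx
    return total


def integrand(x, y):
    # The integrand x + y/2 used in the museum example.
    return x + 0.5 * y


# Path (AB): y = x^2 - 7x + 15 from x = 3 to x = 5.
I_AB = line_integral_dx(integrand, lambda x: x**2 - 7 * x + 15, 3.0, 5.0)
# Path (AC), parabolic: y = x^2 - 3x + 3 from x = 3 to x = 1.
I_AC_parabola = line_integral_dx(integrand, lambda x: x**2 - 3 * x + 3, 3.0, 1.0)
# Path (AC), straight: y = x from x = 3 to x = 1.
I_AC_straight = line_integral_dx(integrand, lambda x: x, 3.0, 1.0)

print(I_AB, I_AC_parabola, I_AC_straight)
```

The two results for (AC) differ, which is the path dependence described above.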
The following summary generalizes the special case we have discussed.
xD
(a) $\int_{(AB)} xy\,dx$, (b) $\int_{(ACB)} xy\,dx$,
(b) The path (ACB) has to be broken into two parts: (AC), on which δx < 0, and (CB),
on which δx > 0, where C = (0, 4). Then
\[ I(ACB) = \int_{(ACB)} xy\,dx = \int_{(AC)} xy\,dx + \int_{(CB)} xy\,dx. \]
[Fig. 33.3: (a) the path (AB); (b) the path (ACB), where (CB) lies along y = 4.]
\[ = \left[2x^2 - \tfrac{1}{3}x^3\right]_{2}^{0} + 2\left[x^2\right]_{0}^{4} = \tfrac{80}{3}. \]
Despite (33.6c), reduction to ordinary integrals over x is not usually the best
way to evaluate line integrals, as will be seen later on.
Self-test 33.1
$\mathcal{C}$ is the straight line directed from A : (1, 2) to B : (2, 1). Show that
(a) $\int_{\mathcal{C}} (ax + by)\,dx = \tfrac{3}{2}(a + b)$ and (b) $\int_{\mathcal{C}} (ax + by)\,dy = -\tfrac{3}{2}(a + b)$.
\[ \int_{(AB)} g(x, y)\,dy \]
\[ \int_{(AB)} g(x, y)\,dy \quad\text{means}\quad \lim_{\delta y \to 0} \sum_{(AB)} g(x, y)\,\delta y, \]
in which the sign of δy is positive on a segment along which y is increasing and
negative on a segment where y is decreasing. We can similarly consider paths and
\[ \int_{(AB)} f(x, y, z)\,dx, \qquad \int_{(AB)} g(x, y, z)\,dy, \qquad \int_{(AB)} h(x, y, z)\,dz, \]
\[ \int_{(AB)} (f\,dx + g\,dy + h\,dz). \]
(a) \[ \int_{(AB)} (f\,dx + g\,dy + h\,dz) = \lim_{\delta x, \delta y, \delta z \to 0} \sum_{(AB)} (f_P\,\delta x + g_P\,\delta y + h_P\,\delta z) \]
(b) \[ \int_{(AB)} (f\,dx + g\,dy + h\,dz) = -\int_{(BA)} (f\,dx + g\,dy + h\,dz). \]
To organize such an integral in order to take account of the signs of δx, δy, δz
is often difficult. For example, if the path (AB) consists of an ellipse inclined to the
three axes, each term must be broken into two sections, leading to six integrals
in all, to ensure constancy of sign of δx, δy, or δz along each. However, if a
parametric representation of the path is adopted, the correct interpretation is
obtained automatically.
Consider the integral with respect to x,
\[ \int_{(AB)} f(x, y, z)\,dx, \]
the original integral in place of dx. After doing the same thing with the y and z
integrals, we have the following result.
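\[ \int_{(AB)} (f\,dx + g\,dy + h\,dz) = \int_{t_A}^{t_B} \left( f\,\frac{dx}{dt} + g\,\frac{dy}{dt} + h\,\frac{dz}{dt} \right) dt \]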
in Fig. 33.4.
[Fig. 33.4: the straight path from A : (1, −1) to B : (−1, 1).]
On (AB), y = −x, so we can use x = t, y = −t, with t running from t = 1 to t = −1. This
covers (AB) once in the right direction (it is like using x as the parameter). Then
\[ x^2 + y = t^2 - t \quad\text{and}\quad \frac{dy}{dt} = -1, \]
so
\[ I = -\int_{1}^{-1} (t^2 - t)\,dt = -\left[\tfrac{1}{3}t^3 - \tfrac{1}{2}t^2\right]_{1}^{-1} = \tfrac{2}{3}. \]
It is immaterial what parametrization is used, so long as it satisfies the conditions
in (33.8). The following example compares two parametrizations.
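The claim that the parametrization is immaterial can be tried out numerically on the integral of (x² + y) dy along the straight path from (1, −1) to (−1, 1) treated above. This is a sketch under our own assumptions: the helper `line_integral_dy` and the second parametrization x = cos s, y = −cos s (0 to π) are hypothetical illustrations, chosen only because they cover the same path once in the same direction.

```python
import math


def line_integral_dy(g, x_of_t, y_of_t, t_start, t_end, n=20_000):
    """Approximate the sum of g(x, y) * dy along a parametrized path
    (hypothetical helper; midpoint sampling in the parameter t)."""
    dt = (t_end - t_start) / n
    total = 0.0
    for k in range(n):
        t0 = t_start + k * dt
        t1 = t0 + dt
        t_mid = 0.5 * (t0 + t1)
        dy = y_of_t(t1) - y_of_t(t0)   # signed step in y
        total += g(x_of_t(t_mid), y_of_t(t_mid)) * dy
    return total


def g(x, y):
    return x**2 + y


# Parametrization used in the text: x = t, y = -t, t from 1 to -1.
I_linear = line_integral_dy(g, lambda t: t, lambda t: -t, 1.0, -1.0)
# An alternative parametrization of the same directed path (our assumption).
I_cosine = line_integral_dy(g, math.cos, lambda s: -math.cos(s), 0.0, math.pi)

print(I_linear, I_cosine)
```

Both values agree with the result 2/3 obtained above to within the discretization error.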
[Fig. 33.5: the semicircular path from A to B on the unit circle.]
(b) $x = (1 - t^2)^{1/2}$, y = t; so
\[ \frac{dx}{dt} = -t(1 - t^2)^{-1/2}, \qquad \frac{dy}{dt} = 1. \]
Then
\[ I = \int_{-1}^{1} \{(1 - t^2)^{1/2} - t[-t(1 - t^2)^{-1/2}]\}\,dt = \int_{-1}^{1} \frac{dt}{(1 - t^2)^{1/2}} = [\arcsin t]_{-1}^{1} = \pi. \]
x = a cos t, y = a sin t, z = bt between t = 0 and 4π. ((AB) is a helix along the z axis.)
We have
\[ \frac{dx}{dt} = -a\sin t, \qquad \frac{dy}{dt} = a\cos t, \qquad \frac{dz}{dt} = b, \]
so the integral becomes
\[ I = \int_{0}^{4\pi} (-a^2\cos t\,\sin t + a^2\sin t\,\cos t + b^2 t)\,dt = b^2\int_{0}^{4\pi} t\,dt = 8b^2\pi^2. \]
Self-test 33.2
Obtain $\int_{\mathcal{C}} (y^2\,dx - x^2\,dy)$ where the path $\mathcal{C}$ is the parabolic arc y² = x from
[Fig. 33.6: the paths (AOB) and (AQB) from A to B.]
On (OB), x = 0; so $\int_{(OB)} x\,dy = 0$.
Therefore
\[ \int_{(AOB)} x\,dy = \int_{(AO)} x\,dy + \int_{(OB)} x\,dy = 0. \]
\[ \int_{(AQB)} x\,dy = \int_{(AQ)} x\,dy + \int_{(QB)} x\,dy = 1. \]
Self-test 33.3
The points O : (0, 0), A : (2, 0), B : (0, 1) are vertices of a rectangle with sides
parallel to the axes.
We can write the integrand in terms of a perfect differential (see Section 22.4):
y dx + x dy = d(xy).
Now express the integral in the form
\[ I = \int_{(AB)} (y\,dx + x\,dy) = \int_{(AB)} d(xy). \tag{i} \]
From (33.7), the meaning of the integral is the limit of a certain sum, which can be recast
in the form
\[ \lim_{\delta x, \delta y \to 0} \sum_{(AB)} (y\,\delta x + x\,\delta y) = \lim_{\delta(xy) \to 0} \sum_{(AB)} \delta(xy). \tag{ii} \]
As we travel along (AB), the value of (xy) starts at $x_A y_A$, where the values are taken at A,
then goes by steps δ(xy) until it attains the value $x_B y_B$. In other words,
\[ \sum_{(AB)} \delta(xy) = x_B y_B - x_A y_A, \]
\[ I = \int_{(AB)} (y\,dx + x\,dy) = \int_{(AB)} d(xy) = x_B y_B - x_A y_A \]
\[ I = \int_{(AB)} [(y + z)\,dx + (z + x)\,dy + (x + y)\,dz] \]
\[ I = \int_{(AB)} [(y + z)\,dx + (z + x)\,dy + (x + y)\,dz] = \int_{(AB)} d(yz + zx + xy) \]
\[ \int_{(AB)} (f\,dx + g\,dy + h\,dz) = \int_{(AB)} dS = S_B - S_A. \]
\[ \int_{(AB)} (f\,dx + g\,dy + h\,dz) = S_B - S_A. \tag{33.9} \]
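The result (33.9) can be illustrated numerically for the perfect differential d(yz + zx + xy). In the sketch below the end points A : (0, 0, 0) and B : (1, 1, 1), the two sample paths, and the helper `line_integral` are all our own assumptions, introduced only to compare the line integral along different paths with S_B − S_A.

```python
def line_integral(field, path, t_start, t_end, n=20_000):
    """Approximate the sum of (f dx + g dy + h dz) over n short steps of a
    parametrized path (hypothetical helper for illustration)."""
    dt = (t_end - t_start) / n
    total = 0.0
    p0 = path(t_start)
    for k in range(n):
        p1 = path(t_start + (k + 1) * dt)
        mid = path(t_start + (k + 0.5) * dt)
        f, g, h = field(*mid)
        total += f * (p1[0] - p0[0]) + g * (p1[1] - p0[1]) + h * (p1[2] - p0[2])
        p0 = p1
    return total


def field(x, y, z):
    # Coefficients of dx, dy, dz in the integrand.
    return (y + z, z + x, x + y)


def S(x, y, z):
    # The function whose differential is the integrand.
    return y * z + z * x + x * y


# Two different paths from A = (0, 0, 0) to B = (1, 1, 1).
I_straight = line_integral(field, lambda t: (t, t, t), 0.0, 1.0)
I_twisted = line_integral(field, lambda t: (t, t**2, t**3), 0.0, 1.0)
expected = S(1.0, 1.0, 1.0) - S(0.0, 0.0, 0.0)

print(I_straight, I_twisted, expected)
```

Both paths give the same value as S_B − S_A, to within the discretization error.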
Self-test 33.4
Prove that if A and B are any two fixed points, then
$\int_{\mathcal{C}} [(x^2 - yz)\,dx + (y^2 - zx)\,dy + (z^2 - xy)\,dz]$
has the same value for every path $\mathcal{C}$ connecting them.
A closed path is one that returns to its starting point, so that B has the same
coordinates as A, as in Figs 33.7a,b. We shall discuss only simple closed paths.
These do not cross over themselves, as do the curves in Fig. 33.7b.
Fig. 33.7 (a) A simple closed path. (b) Closed paths which are not simple.
It is clear from the definition (33.7) that, when A and B are the same point,
and the path is closed, their position on the curve will not affect the value of the
integral. Consequently its coordinates are not usually stated; a closed path is
indicated by a symbol such as C, and the integral is written
\[ \int_C (f\,dx + g\,dy + h\,dz). \]
(The notation $\oint_C$ is also used for line integrals around closed curves.) In three
dimensions, the direction along C is specified by extra information, such as by an
arrow on a sketch of the curve. However, in two dimensions, a convention operates:
if it is not otherwise indicated, the standard direction is anticlockwise.
where we assume a and b to be positive. We had a choice for the range of t, because
we can start at any point on the ellipse. For the choice 0 to 2π, the path starts and ends
at (a, 0). Then
\[ \frac{dx}{dt} = -a\sin t, \qquad \frac{dy}{dt} = b\cos t; \]
so
\[ I = \int_{0}^{2\pi} [a\cos t\,(b\cos t) - b\sin t\,(-a\sin t)]\,dt = ab\int_{0}^{2\pi} dt = 2\pi ab. \]
Fig. 33.8 (a) Two paths M and N between A and B. (b) Closed curve C using M and N reversed.
To prove this, see Fig. 33.8a. Here A and B are any two points, while (AMB) and
(ANB) are any two paths from A to B. If we reverse the direction of the path
(ANB) we have Fig. 33.8b, which is a closed curve C. Suppose that we know the
integral around every closed curve to be zero. Then, on C,
\[ 0 = \int_{(AMBNA)} (f\,dx + g\,dy + h\,dz) \]
\[ = \int_{(AMB)} (f\,dx + g\,dy + h\,dz) + \int_{(BNA)} (f\,dx + g\,dy + h\,dz) \]
\[ = \int_{(AMB)} (f\,dx + g\,dy + h\,dz) - \int_{(ANB)} (f\,dx + g\,dy + h\,dz). \]
[Fig. 33.9: the points A, B, P, Q and the paths (AKB), (AP), (PRQ), (QB).]
The fixed points A and B are shown on Fig. 33.9, and (P, Q) is any other pair
of points. (AKB), (AP), (QB), and (PRQ) are arbitrary paths joining the points
specified by their brackets. Since I(AB) is independent of the path joining A and B,
I(APRQB) = I(AKB),
so
I(AP) + I(PRQ) + I(QB) = I(AKB),
or
I(PRQ) = I(AKB) − I(AP) − I(QB).
But the right-hand side does not depend on which path was chosen for (PQ).
Therefore I(PQ) is independent of the path joining P and Q, which proves the result.
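\[ \oint_C (P\,dx + Q\,dy) = \iint_A \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA, \]
where dA is the area element, and the line integral direction is anticlockwise.
(33.12)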
Fig. 33.10 (a) The diagram for Green’s theorem. (b) For the integration of ∂P/∂y. (c) For the
integration of ∂Q/∂x.
Although the result is true in general, we shall prove it only for a curve like that in
Fig. 33.10a, for which lines parallel to the axes cut the curve in at most two points.
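\[ \iint_A \frac{\partial P}{\partial y}\,dA = \int_{a}^{b}\!\int_{g(x)}^{f(x)} \frac{\partial P}{\partial y}\,dy\,dx = \int_{a}^{b} [P(x, f(x)) - P(x, g(x))]\,dx \]
\[ = \int_{(AMB)} P(x, y)\,dx - \int_{(ANB)} P(x, y)\,dx \]
\[ = -\int_{(BMA)} P(x, y)\,dx - \int_{(ANB)} P(x, y)\,dx \]
\[ \iint_A \frac{\partial Q}{\partial x}\,dA = \int_{c}^{d}\!\int_{k(y)}^{h(y)} \frac{\partial Q}{\partial x}\,dx\,dy = \oint_C Q\,dy. \tag{ii} \]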
Example 33.9 Show that if C is a simple closed curve, the geometrical area it
encloses is equal to $\tfrac{1}{2}\int_C (x\,dy - y\,dx)$.
Put P = −y and Q = x in Green’s theorem (33.12):
\[ \tfrac{1}{2}\int_C (-y\,dx + x\,dy) = \tfrac{1}{2}\iint_A \left( \frac{\partial}{\partial x}(x) - \frac{\partial}{\partial y}(-y) \right) dA = \iint_A dA, \]
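For a polygon the area formula of Example 33.9 can be evaluated exactly, because on a straight edge from (x0, y0) to (x1, y1) the integral of (x dy − y dx) reduces to x0 y1 − x1 y0. The sketch below is our own illustration of this (the function name `polygon_area` and the 2000-sided polygon used to approximate an ellipse are assumptions, not part of the text).

```python
import math


def polygon_area(vertices):
    """Signed area (1/2) * closed-path integral of (x dy - y dx) for a polygon.
    Positive when the vertices are listed anticlockwise (hypothetical helper)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]   # wrap around to close the path
        total += x0 * y1 - x1 * y0       # exact edge contribution
    return 0.5 * total


# Triangle with vertices listed anticlockwise.
triangle = [(-1.0, 0.0), (2.0, 0.0), (0.0, 4.0)]
area_triangle = polygon_area(triangle)

# Reversing the direction of travel changes the sign.
area_reversed = polygon_area(list(reversed(triangle)))

# Ellipse x = a cos t, y = b sin t approximated by an inscribed polygon.
a, b = 2.0, 3.0
ellipse = [(a * math.cos(2 * math.pi * k / 2000), b * math.sin(2 * math.pi * k / 2000))
           for k in range(2000)]
area_ellipse = polygon_area(ellipse)

print(area_triangle, area_reversed, area_ellipse, math.pi * a * b)
```

The sign change on reversal reflects the anticlockwise convention for closed paths.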
To prove this, let C be any closed curve. Its interior is denoted by A. Then, by Green’s
theorem,
[Fig. 33.11: (a) the force F acting on a particle as it moves through δr from P to Q on the path (AB); (b) the components δx, δy, δz of δr.]
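The total work W(AB) done on the particle by F along the path (AB) is given by
\[ W(AB) = \sum_{(AB)} \delta W \approx \sum_{(AB)} \mathbf{F}\cdot\delta\mathbf{r}. \]
\[ W(AB) = \int_{(AB)} \mathbf{F}\cdot d\mathbf{r}. \]
This integral has an ordinary meaning when it is written in (dx, dy, dz) form by
splitting F and δr into their components (see Fig. 33.11b):
\[ \mathbf{F} = F_1\hat{\imath} + F_2\hat{\jmath} + F_3\hat{k} \quad\text{and}\quad \delta\mathbf{r} = \delta x\,\hat{\imath} + \delta y\,\hat{\jmath} + \delta z\,\hat{k}. \]
Then
\[ \mathbf{F}\cdot\delta\mathbf{r} = F_1\,\delta x + F_2\,\delta y + F_3\,\delta z, \]
and
\[ W(AB) \approx \sum_{(AB)} (F_1\,\delta x + F_2\,\delta y + F_3\,\delta z). \]
Finally, taking the limit as δx, δy, δz approach zero, we obtain the exact result.
\[ W(AB) = \int_{(AB)} \mathbf{F}\cdot d\mathbf{r} = \int_{(AB)} (F_1\,dx + F_2\,dy + F_3\,dz). \]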
Example 33.10 A field of force F is constant everywhere. Show that the work
done by F alone on a particle which moves from a fixed point A to a fixed
point B is independent of the path followed.
Put F = aî + bĵ + ck̂ where a, b, c are constants. Then, by (33.14), W(AB) is given by
\[ W(AB) = \int_{(AB)} (a\,dx + b\,dy + c\,dz) = \int_{(AB)} d(ax + by + cz) \]
independent).
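\[ \delta(r^{-1}) \approx \frac{\partial}{\partial x}(r^{-1})\,\delta x + \frac{\partial}{\partial y}(r^{-1})\,\delta y + \frac{\partial}{\partial z}(r^{-1})\,\delta z \]
\[ = -\frac{x}{(x^2 + y^2 + z^2)^{3/2}}\,\delta x - \frac{y}{(x^2 + y^2 + z^2)^{3/2}}\,\delta y - \frac{z}{(x^2 + y^2 + z^2)^{3/2}}\,\delta z \]
\[ = -(x/r^3)\,\delta x - (y/r^3)\,\delta y - (z/r^3)\,\delta z. \]
[Figure: the path (AB), the force F, and the distances r_A and r_B.]
\[ W(AB) = \int_{(AB)} (F_1\,dx + F_2\,dy + F_3\,dz) = m\gamma \int_{(AB)} \left( -\frac{x}{r^3}\,dx - \frac{y}{r^3}\,dy - \frac{z}{r^3}\,dz \right) \]
\[ = m\gamma \int_{(AB)} d(r^{-1}) \quad\text{(from (a))} \]
\[ = m\gamma\,(r_B^{-1} - r_A^{-1}). \]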
Examples 33.10 and 33.11 illustrate cases where the work done by a force
between two fixed points is independent of the path between them, but this is
not a universal state of affairs: for example, it is not the case for the force field
F = (y, −x, 0).
that constitutes the source of the field, so that in effect we would be putting the
particle into a modified situation. The case is similar with gravity: if an asteroid
enters the moon’s gravitational field, the moon will respond by moving, and the
field entered will change, if only by a little. For the purpose of defining field
intensity, we imagine that somehow such an effect is prevented from taking place.
Subject to this, we have the following definition.
Field intensity fP at P
fP is equal to the vector force that would act on a particle of unit mass (charge, etc.)
at P if the sources are assumed to be unaffected by the particle. (33.15)
Therefore, if the gravitational field intensity is GP at P, the force with which the
field acts on a particle of mass m at P is mGP. One can alternatively imagine a
particle of extremely small mass µ to be introduced as a test particle. Then fP will
be equal to µ⁻¹ times the force exerted on such a particle.
Consider the action of a field of intensity f(x, y, z) on a unit particle which is
travelling on a path (AB) (Fig. 33.13). We shall consider not the work done by f
on the particle, but the work done against the field by the particle, which has the
opposite sign. Denote this quantity generally by v. The work done against f in a
step PQ is given by
δv ≈ − f · δr. (33.16)
The total work along the path is the limit of the sum of the δv, which can be
expressed as a line integral as before:
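\[ v(AB) = -\int_{(AB)} \mathbf{f}\cdot d\mathbf{r} = -\int_{(AB)} (f_1\,dx + f_2\,dy + f_3\,dz), \]
[Fig. 33.13: the step δr from P to Q on the path (AB), with position vectors r and r + δr from O.]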
According to (33.11) it is only necessary to check path independence for any
single pair of points (A, B) within this region. Therefore:
The constant field of Example 33.10 and the gravitation field of Example 33.11
are conservative.
\[ v(AP) = -\int_{(AP)} \mathbf{f}\cdot d\mathbf{r}, \]
where P is another point in R , is independent of the path (AP), and so its value
depends only on the location (x, y, z) of P. Therefore we shall write
v(AP) = V(x, y, z) or VP, (33.18)
[Fig. 33.14: the step δx from P to Q in the field f, with reference point A.]
where f1 is the x component of f. Equating the last two results and dividing by δx,
we obtain
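\[ f_1 = -[V(x + \delta x, y, z) - V(x, y, z)]/\delta x. \]
\[ \mathbf{f} = f_1\hat{\imath} + f_2\hat{\jmath} + f_3\hat{k} = -\left( \frac{\partial V}{\partial x}\hat{\imath} + \frac{\partial V}{\partial y}\hat{\jmath} + \frac{\partial V}{\partial z}\hat{k} \right), \]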
or
f = − grad V. (33.19)
We call V a potential function for the field f, or simply a potential. The single
scalar function V(x, y, z) contains all the information necessary to define the
three scalar components of f: f1(x, y, z), f2(x, y, z), f3(x, y, z). The point A is
commonly taken to be at infinity: you might recognize the idea of ‘the work
required to bring a particle in from infinity’ in mechanics. However, if we choose
a different reference point A, it only changes V by an additive constant, and does
not, therefore, affect the truth of (33.19); we get the same f whatever location A
has. We sum up this result as follows.
\[ V_P = -\int_{(AP)} \mathbf{f}\cdot d\mathbf{r}, \]
Potential field
If there is a scalar function V such that
f = − grad V,
then f(x, y, z) is called a potential field. (33.21)
= VB − VA.
Provided that ∫(AB) dV is independent of the path from A to B, the value to be
assigned to VB − VA is unambiguous and we say that V is single valued. However,
the values of V may depend not only on the position, but also on the way in which
the position was reached (analogously to the time spent reaching a point on the
other side of a road being dependent on whether you cross directly or via the
underpass). For example, in the plane, let
V = θ,
where θ is the polar angle traversed in reaching the current position, measured
continuously from a given starting point. What do we mean by
\[ \int_{(AB)} d\theta\ ? \]
[Fig. 33.15: two paths, (ACB) and (ADB), from A to B.]
Figure 33.15 shows two paths from A to B : (ACB) goes from A to B more or
less directly, and (ADB) circles the origin completely first. The definition of the
integral is that
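\[ \int_{(AB)} d\theta = \lim_{\delta\theta \to 0} \sum_{(AB)} \delta\theta, \]
where the summation is carried out by taking small steps along the path. On
(ACB), θ passes smoothly from θ = 0 to θ = ½π, so
\[ \int_{(ACB)} dV = \int_{(ACB)} d\theta = \theta_B - \theta_A = \tfrac{1}{2}\pi. \]
\[ \int_{(ADB)} dV = \int_{(ADB)} d\theta = \theta_B - \theta_A = \tfrac{5}{2}\pi. \]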
[Fig. 33.16: the field, perpendicular to the radius vector at every point.]
Therefore the field is perpendicular to the radius vector at every point, as in Fig. 33.16. It
is easy to confirm that
f = − grad V,
where V is a (path-dependent) continuous function such that
tan V = y /x.
Thus we may take V = θ as described in the case we just discussed. (We cannot write
V = arctan y/x, because this function is discontinuous across the y axis: it would have
an infinite gradient there.) The figure makes it obvious that the field is not conservative:
more work is done if you take a unit magnetic pole against the field 50 times around
the origin in order to travel between two points than if you go directly.
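The dependence of the accumulated polar angle on the route taken can be reproduced by summing the small signed angles δθ between successive position vectors. This is only a sketch: the helper `total_angle` and the circular arcs standing in for a direct path, a path that first circles the origin, and a clockwise path are our own assumptions, with A taken at (1, 0) and B at (0, 1).

```python
import math


def total_angle(path, t_start, t_end, n=10_000):
    """Sum the signed angle steps between successive position vectors along a
    parametrized path that avoids the origin (hypothetical helper)."""
    total = 0.0
    x0, y0 = path(t_start)
    for k in range(1, n + 1):
        t = t_start + (t_end - t_start) * k / n
        x1, y1 = path(t)
        # Signed angle from (x0, y0) to (x1, y1): atan2(cross product, dot product).
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
        x0, y0 = x1, y1
    return total


def unit_circle(t):
    return (math.cos(t), math.sin(t))


# All three paths start at A = (1, 0) and end at B = (0, 1).
direct = total_angle(unit_circle, 0.0, 0.5 * math.pi)        # quarter turn anticlockwise
circling = total_angle(unit_circle, 0.0, 2.5 * math.pi)      # a full turn first
clockwise = total_angle(unit_circle, 0.0, -1.5 * math.pi)    # the other way round

print(direct, circling, clockwise)
```

The three totals differ by multiples of 2π even though the end points are the same, which is why V = θ is not single valued over the whole plane.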
The field in Example 33.12 is not conservative, but whole classes of paths are
equivalent. Suppose that, as in Fig. 33.17, we have two paths, (AMB) and (ANB),
which can be steadily deformed into each other (as if A and B were connected by
a piece of elastic) without passing over the origin. Then these two paths are equivalent.
In this case, θ starts at θA = 0; although the value of θ wanders about
on (ANB), increasing and decreasing, it still ends at the value θB = ½π, as on the
path (AMB).
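[Fig. 33.17: paths from A to B around the origin O, including (AMB) and (ALB).]
[Fig. 33.18: a region R, containing A and B, which neither contains nor surrounds the origin O.]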
However, (AMB) cannot be deformed into the third path (ALB) without
passing over the origin; by following it around, it can be seen that θB = −(3/2)π for
this path.
Suppose that we confine consideration to a ‘patch’, or region R as in Fig. 33.18,
which neither contains nor surrounds the origin O. Then, within this region, the
field behaves as if it were conservative, because any path from A to B inside the
region can be deformed into any other without crossing the origin. We could not
tell, from experiments confined to R , that the field is not conservative over the
whole plane.
Problems
(a)
(AOB)
x dx; (b)
(AOB)
y dx; (c)
(AOB)
x2 dx.
(a) x dx;
P
(b) y dx;
P −1 (1, −1)
A
−1 (1, −1)
A
(c) x dx;
P
2
(d) (x + y) dy;
P
Fig. 33.19
(e) xy dy;
P
2
(f) (x dx + y dy);
P
33.3 (Section 33.2). Evaluate the following line
(g) P
( 12 dx − y dy); (h) P
(y dx − x dy). integrals over the various paths P, which are
specified parametrically.
760
(a) xy dx; P is x = t , y = t; 0 t 1.
2 2
(g) (y dx + x dy); (h) (y dx + x dy).
LINE INTEGRALS
P (ABC) (AOC)
(b) (x dy − y dx); P is x = cos t, y = sin t; 0 t π. 33.6 (Section 33.4). The integrands given are
P perfect differentials; P represents any path
(c) (z dx − x dy + y dz); P is x = t + 1, y = t, z = 2t;
having the right direction which joins the two
given points. Evaluate
P
0 t 1. (a) (x dx + y dy + z dz); P is (−1, 1, −1) to (1, −1, 1).
(x dx + y dy + z dz); P is x = cos t, y = sin t,
P
33
2 2 2
(d)
P (b) (yz dx + zx dy + xy dz); P is (0, 0, 0) to (1, 1, 1).
z = t; 0 t 2π. P
(c) e
(e) Compare (c) when P joins the same two
(x dx + y dy + z dz); P is (0, 0, 0) to
x 2+y 2+z 2
points, (1, 0, 0) to (2, 1, 2), but x = t2 + 1,
P
y = 2t − t2, z = 2t2; 0 t 1. (1, 1, 1).
33.4 (Section 33.2). The line integral ∫(AB) f(x, y) dy,
where the path (AB) is described by the curve
(d) [(y + z) dx + (z + x) dy + (x + y) dz]; P is
P
y = k(x), can be written formally as (1, 1, 1) to (0, 1, 0).
f (x, k(x))
dk
dx
dx.
(e) [cos(xy + yz + zx)] [(y + z)dx + (z + x)dy
P
( AB)
+ (x + y) dz]; P is (1, 0, π) to (0, π, 1).
Apply this formula to ∫(AB) (x + y) dy, taken over the
parabolic path in Fig. 33.19b. Express it as the sum
of two ordinary integrals over x. (This is like using
(f) (xy dx + x y dy); P is (1, 1) to (2, 2).
P
2 2
A
x
(a) (y dx + z dy + x dz); C is x = sin t, y = cos t,
C
O 1 z = sin t; 0 t 2π.
(a) (ABC)
dx; (b) (AOC)
dy; (c) (yz dx + zx dy + xy dz); C is any closed path.
C
(c) (x dy − y dx); (d) (x dy − y dx); 33.9 Show that ∫(AB) (yx2 dx + 13 x 3 dy) is path
(ABC) (AOC) independent between any two points A and B.
Use this fact to evaluate the integral along the
(e) y dy; (f) y dy; spiral path given in polar coordinates (r, θ ) by
(ABC) (AOC) r = eθ /2π for 0 θ π.
33.10 Show that if ∫(AB) (f dx + g dy) is independent of the path (AB) for every two points A and B, then
the integral around every closed path is zero. (Hint: A and B may coincide.)

33.11 Show that if the variables are changed in a perfect differential form, it remains a perfect
differential. Illustrate this by transforming the identity y dx + x dy = d(xy) into polar coordinates.

33.12 (Green’s theorem, Section 33.6). Confirm the truth of Green’s theorem (33.12) for some very
simple cases for which you know you can work out both the line integral and the double integral
involved.

33.13 (Green’s theorem, Section 33.6). Check the correctness of the area formula, Example 33.9, by
evaluating the line integral ½ ∫C (x dy − y dx) taken around the following closed paths.
(a) The circle x² + y² = 4.
(b) The ellipse ¼x² + (1/9)y² = 1.
(c) The triangle with vertices (−1, 0), (2, 0), (0, 4).

33.14 Find the area of the star-shaped region bounded by the curve $x^{2/3} + y^{2/3} = 1$, by

33.15 The gravitation force F arising from a particle of mass M at the origin upon a particle of
mass m at a point with position vector r is given by F = −γMm r/r³. Find the work done by F on a
particle which travels in from infinity to r.

33.16 Use Green’s theorem with (33.10) to decide whether the following represent conservative fields
(in two dimensions) or not in the stated regions.
(a) (x² − y², 2xy); all x, y.
(b) (½ ln(x² + y²), arctan(y/x)); x > 0.

33.17 A force field has field intensity f(x, y, z) = yî + ĵ + xk̂. Is f conservative? Find the work
done against the field by a unit particle moving in a straight line from (0, 0, 0) to (1, 1, 1).

33.18 A force f is given by f(x, y, z) = yzî + xzĵ + xyk̂. Show that it is conservative. Find the work
done against f along the path x = cos t, y = sin t, z = sin t cos t; −½π ≤ t ≤ ½π. Are you doing this
the easiest way?

33.19 Prove that a force field f having the form $\mathbf{f} = r^{\alpha}\hat{\mathbf{r}}$, where α is any constant, r is distance from
the origin, $\hat{\mathbf{r}}$ is the unit position vector, and $\hat{\mathbf{r}} = \mathbf{r}/r$, is a conservative field. (Hint: start by putting
$r = (x^2 + y^2 + z^2)^{1/2}$, and guess something that f might be the gradient of. If you cannot guess,
then use the fact that $\operatorname{grad} F(r) = \hat{\mathbf{r}}\,(\partial F/\partial r)$.)

33.20 Generalize Problem 33.19 to a field $\mathbf{f} = \hat{\mathbf{r}}\,f(r)$. What is the potential of such a field?

33.21 Confirm that Green’s theorem still holds for boundary C of the annular region A between the
circles x² + y² = 1 and x² + y² = 4 for the line integral
What are the directions on C ?

33.22 Show that ∫C (5x⁴y dx + x⁵ dy) = 0 holds for any closed curve C for which Green’s
theorem is true.

33.23 Sketch the curve given parametrically by
x = cos t − ½ sin 2t, y = sin t; 0 ≤ t ≤ 2π.
Using Green’s theorem, find the area enclosed by the curve.
34 Vector fields: divergence and curl
Vector fields in two dimensions have already been encountered in Section 29.6.
A vector field in three dimensions extends this concept to a vector with three
components which are functions of position in space. In terms of cartesian components
a vector field F(x, y, z) will have the form
\[ \mathbf{F}(x, y, z) = F_1(x, y, z)\,\hat{\imath} + F_2(x, y, z)\,\hat{\jmath} + F_3(x, y, z)\,\hat{k}. \]
Vector fields abound in physical and engineering applications. Fluid velocity,
gravitational forces, magnetic and electric fields are examples of vector fields. In
time-varying applications the vector field and its components will also depend on
a fourth variable, namely time, but here we shall concentrate only on the space
variables.
is any suitable parameter. Then its tangent is in the direction of dr/dt (see eqn (9.18))
which must be in the same direction as F:
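\[ \frac{d\mathbf{r}}{dt} = \mu(t)\,\mathbf{F}(x, y, z) \]
where µ(t) is some scalar function of the parameter t. Hence in component form
\[ \frac{dx}{dt} = \mu(t)F_1(x, y, z), \qquad \frac{dy}{dt} = \mu(t)F_2(x, y, z), \qquad \frac{dz}{dt} = \mu(t)F_3(x, y, z). \]
Elimination of the unknown µ(t) leads to:
\[ \frac{dx}{F_1} = \frac{dy}{F_2} = \frac{dz}{F_3}. \]
\[ dx = y^2\,dy, \quad\text{or}\quad x = \tfrac{1}{3}y^3 + C_1, \tag{34.2} \]
\[ dy = z\,dz, \quad\text{or}\quad y = \tfrac{1}{2}z^2 + C_2. \tag{34.3} \]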
for any value of x. Equations (34.2) and (34.3) are two families of surfaces (both are
cylindrical) and their curves of intersections are the field lines of F.
Self-test 34.1
Find the field lines of the vector field given by F = (y, −x, x). What kind of
curves are the field lines?
Self-test 34.2
Let r = xî + yĵ + zk̂, and r = | r |. Show that div (r/r³) = 0.
\[ \iint_S f(x, y, z)\,dS \]
is the limit
\[ \lim_{\delta S \to 0} \sum_S f(x, y, z)\,\delta S. \]
This superficially resembles, but does not in fact represent a double integral of the
type (32.2) considered earlier. Select x and y as basic variables (we could take y, z
or z, x instead). Then although the surface S is defined as a function of x and y by
writing z in terms of x and y, the elements δ S are inclined to the x,y plane, so do
not have area δx δy. We allow for this and convert the expression into an ordinary
double integral as follows.
Let the projection of S on to the (x, y) plane be R , and let δA be the projection
of the element δ S. (We assume that any line parallel to the z axis cuts S in at most
one point.) The element δA could be the rectangular element having area δx δy,
in which case, for small δx and δy, δ S would be approximately a parallelogram
on the tangent plane at a point P within δ S.
The relation between δS and δA in Fig. 34.3 depends on the unit normal n̂ at P.
Consider a vertical plane through P containing the vectors n̂ and k̂ as shown in
Fig. 34.4. Let θ be the smaller angle between n̂ and k̂, that is 0 ≤ θ ≤ 180°.
Then the length of any straight line element in δS perpendicular to the plane of
n̂ and k̂ is unaltered by the projection, but all line elements in δS lying in the plane
of n̂ and k̂ are changed in length by projection by a factor |cos θ| = |n̂ · k̂|. Hence
\[ \delta A = |\hat{n}\cdot\hat{k}|\,\delta S. \]
Thus
\[ \iint_S f(x, y, z)\,dS = \iint_R \frac{f(x, y, z)}{|\hat{n}\cdot\hat{k}|}\,dA, \tag{34.5} \]
which can be used as a definition of the surface integral. To obtain the surface
area S put f(x, y, z) = 1.
[Fig. 34.3: the surface S, an element δS at P, and its projection δA in the region R of the (x, y) plane.]
[Fig. 34.4: the unit normal n̂ at P, making an angle θ with k̂.]
Example 34.3 The roof of a building has the cylindrical shape z = h − bx² over a
square floor plan given by |x| ≤ a, |y| ≤ a, where h > 2a²b (see Fig. 34.5). Find
the surface area of the roof.
The surface area is given by
\[ S = \iint_S dS. \]
In this case we use cartesian coordinates to define the element δA, which is the rectangle
with sides parallel to the axes with lengths δx and δy. Thus δA = δx δy. The integration
takes place over the square |x| ≤ a, |y| ≤ a in the (x, y) plane (Fig. 34.6). We also require
the unit normal n̂. By (28.7) the unit normal will be
[Fig. 34.5: the roof z = h − bx².]
[Fig. 34.6: the square |x| ≤ a, |y| ≤ a in the (x, y) plane, with the element δx δy.]
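\[ \hat{n} = \frac{(-2bx, 0, -1)}{\sqrt{4b^2x^2 + 1}}, \]
and by (34.5)
\[ S = \int_{-a}^{a}\!\int_{-a}^{a} \sqrt{4b^2x^2 + 1}\,dx\,dy. \]
\[ S = \int_{-a}^{a} \sqrt{4b^2x^2 + 1}\,dx \int_{-a}^{a} dy = 2a\int_{-a}^{a} \sqrt{4b^2x^2 + 1}\,dx. \]
The remaining integral can be evaluated using the substitution x = (sinh u)/(2b).
The result is
\[ S = \frac{a}{b}\left[ 2ab\sqrt{4a^2b^2 + 1} + \sinh^{-1}(2ab) \right]. \]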
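The closed form for the roof area can be compared with a direct numerical evaluation of the remaining single integral. The sketch below is illustrative only: the function names, the midpoint rule, and the sample values a = 2, b = 0.1 are our own assumptions.

```python
import math


def roof_area_numeric(a, b, n=20_000):
    """Midpoint-rule estimate of 2a * integral of sqrt(4 b^2 x^2 + 1) over -a <= x <= a."""
    dx = 2.0 * a / n
    total = 0.0
    for k in range(n):
        x_mid = -a + (k + 0.5) * dx
        total += math.sqrt(4.0 * b * b * x_mid * x_mid + 1.0) * dx
    return 2.0 * a * total


def roof_area_formula(a, b):
    """Closed form quoted in the example."""
    return (a / b) * (2.0 * a * b * math.sqrt(4.0 * a * a * b * b + 1.0)
                      + math.asinh(2.0 * a * b))


a, b = 2.0, 0.1   # sample dimensions, chosen only for illustration
numeric = roof_area_numeric(a, b)
exact = roof_area_formula(a, b)
print(numeric, exact)
```

As b becomes small the roof flattens and both values approach the floor area 4a².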
If the surface S is given by z = f(x, y), we can obtain a general cartesian formula
for the surface area. A vector in the direction of the normal at any point on the
surface is given (see Section 28.4) by
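\[ \mathbf{n} = \left( -\frac{\partial f}{\partial x},\ -\frac{\partial f}{\partial y},\ 1 \right) \]
\[ \hat{n} = \left( -\frac{\partial f}{\partial x},\ -\frac{\partial f}{\partial y},\ 1 \right) \Big/ \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}. \]
Hence
\[ \hat{k}\cdot\hat{n} = 1 \Big/ \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}, \]
\[ \iint_S dS = \iint_R \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\ dx\,dy, \]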
The parameters u and v are defined over a rectangle in the (u, v) plane.
For example, for the surface
\[ \mathbf{r} = a\cos u\,\sin v\,\hat{\imath} + a\sin u\,\sin v\,\hat{\jmath} + a\cos v\,\hat{k}, \]
we can see that
\[ |\mathbf{r}| = \sqrt{a^2\cos^2 u\,\sin^2 v + a^2\sin^2 u\,\sin^2 v + a^2\cos^2 v} \]
\[ = \sqrt{a^2(\cos^2 u + \sin^2 u)\sin^2 v + a^2\cos^2 v} \]
\[ = a\sqrt{\sin^2 v + \cos^2 v} = a, \]
which means that the position vector r traces out a sphere of radius a, centre at
the origin. We need to specify u and v to determine which part of the sphere
is defined. For the whole surface, these parameters must range over the intervals
0 ≤ u ≤ 2π and 0 ≤ v ≤ π.
More complicated surfaces can be generated in this way, and their graphical
representation has become easier using symbolic computer software (see Chapter
42, projects for this chapter). For example, the position vector r defined by
\[ \mathbf{r} = (3 + \cos v)\cos u\,\hat{\imath} + (3 + \cos v)\sin u\,\hat{\jmath} + \sin v\,\hat{k} \]
where 0 ≤ u ≤ 2π and 0 ≤ v ≤ 2π generates a torus (like the shape of a doughnut)
with its axis in the k̂ direction (Fig. 34.7a). The vase-shaped surface in Fig. 34.7b
is generated by
where a = 0.3, b = 3.5 for 0 ≤ u ≤ 2 and 0 ≤ v ≤ 2π.
[Fig. 34.7: (a) the torus; (b) the vase-shaped surface.]
Triple integrals or volume integrals can also be defined in vector calculus. By
where δV is an increment of volume and P is a point in δV (see Fig. 34.8). Its evaluation
requires it to be converted into a repeated integral with three integrations.
[Fig. 34.8: an increment of volume δV.]
[Fig. 34.9: the cartesian volume element δV = δx δy δz.]
The total mass is therefore the sum or integral of these elements within the cube. The
integral, with δx, δy, and δz parallel to the axes, sweeps out the interior of the cube if it
is integrated in the x, y, and z directions, in turn, between −a and a in each case. Hence
the mass M of the cube is
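\[ M = \int_{-a}^{a}\!\int_{-a}^{a}\!\int_{-a}^{a} [\alpha + \beta(x^2 + y^2 + z^2)]\,dx\,dy\,dz. \]
\[ M = \int_{-a}^{a}\!\int_{-a}^{a} \left[ \alpha x + \beta(\tfrac{1}{3}x^3 + xy^2 + xz^2) \right]_{x=-a}^{a} dy\,dz \]
\[ = \int_{-a}^{a}\!\int_{-a}^{a} [2\alpha a + 2\beta(\tfrac{1}{3}a^3 + ay^2 + az^2)]\,dy\,dz \]
\[ = 8a^3(\alpha + \beta a^2). \]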
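The repeated integral for the mass of the cube can be mimicked by a triple sum over small cells δx δy δz. The sketch below makes its own assumptions: the function name `cube_mass`, the midpoint sampling, the cell count, and the sample values of a, α and β are illustrative choices only.

```python
def cube_mass(a, alpha, beta, n=50):
    """Triple midpoint sum of the density alpha + beta*(x^2 + y^2 + z^2)
    over the cube -a <= x, y, z <= a, using n cells per axis."""
    h = 2.0 * a / n                      # side of each small cell
    mids = [-a + (k + 0.5) * h for k in range(n)]
    total = 0.0
    for z in mids:                       # outermost integration (z)
        for y in mids:                   # middle integration (y)
            for x in mids:               # innermost integration (x)
                total += (alpha + beta * (x * x + y * y + z * z)) * h**3
    return total


a, alpha, beta = 1.0, 1.0, 2.0           # sample values for illustration
M_numeric = cube_mass(a, alpha, beta)
M_formula = 8.0 * a**3 * (alpha + beta * a**2)
print(M_numeric, M_formula)
```

The sum slightly underestimates the exact value because the midpoint rule is not exact for the quadratic terms; refining n reduces the gap.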
Self-test 34.3
Find the surface area of the paraboloid defined by z = 1 − x² − y² for z ≥ 0.
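[Fig. 34.10: the region V enclosed by the upper surface S2 : z = g2(x, y) and the lower surface S1 : z = g1(x, y), which meet on the curve C; n̂1 and n̂2 are the outward unit normals.]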
Let S be a surface enclosing a region V, and let F be a smooth vector field defined
in V. Then
\[ \iiint_V \operatorname{div}\mathbf{F}\,dV = \iint_S \mathbf{F}\cdot\hat{n}\,dS, \]
where n̂ is the unit normal to S drawn outwards from V (the integral on the right
is called the flux of F out of S ). (34.7)
Within the restrictions imposed on S we can divide S into two surfaces, an upper
one S2 with equation, say, z = g2(x, y) and a lower one S1 with equation z = g1(x, y),
the two surfaces meeting on the curve C. We shall use the cartesian increment
δx δy δz for δV.
The divergence theorem is really the sum of three results. Suppose that F =
F1î + F2ĵ + F3k̂, and consider first
\[ \iiint_V \frac{\partial F_3}{\partial z}\,dV. \]
From the previous section, noting carefully the directions of n̂1 and n̂2, the outward
normals to S1 and S2, it follows that
\[ dx\,dy = \hat{k}\cdot\hat{n}_2\,dS_2 \ \text{on } S_2 \quad\text{but}\quad dx\,dy = -\hat{k}\cdot\hat{n}_1\,dS_1 \ \text{on } S_1, \]
since the angle between k̂ and n̂1 is obtuse. Hence
\[ = \iint_S F_3\,\hat{k}\cdot\hat{n}\,dS. \]
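[Fig. 34.11: a surface element δS in a fluid with velocity v.]
\[ = \iint_S (F_1\hat{\imath} + F_2\hat{\jmath} + F_3\hat{k})\cdot\hat{n}\,dS = \iint_S \mathbf{F}\cdot\hat{n}\,dS. \]
\[ \iint_S \mathbf{v}\cdot\hat{n}\,dS. \]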
Assuming that the fluid is incompressible, and fluid is neither being created
nor destroyed within S, it follows that
\[ \iint_S \mathbf{v}\cdot\hat{n}\,dS = 0, \]
that is, the net outflow, the flux through S, is zero. By the divergence theorem it
must be true that
\[ \iiint_V \operatorname{div}\mathbf{v}\,dV = 0 \]
[Fig. 34.12]
Self-test 34.4
C is a simple (not self-intersecting) closed curve of area A on the plane
z = h > 0 in three dimensions. A cone is formed by joining each point on C
to the origin by a straight line. Using the divergence theorem find the volume
of the cone. Deduce the volume of a regular tetrahedron and a regular
octahedron both having all sides of length a.
the first row expansion rule for determinants. This rule is analogous to the
determinant rule for the vector product given in Section 11.2. The del form is
curl F = ∇ × F.
It can be proved that curl F is a vector invariant: the physical entity it represents
does not vary under translation or rotation of the axes.
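Using (34.8),
\[ \operatorname{curl}\mathbf{F} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ e^{xyz} & x^2 + y & xz\,e^{y} \end{vmatrix} = xz\,e^{y}\,\hat{\mathbf{i}} + (xy\,e^{xyz} - z\,e^{y})\,\hat{\mathbf{j}} + (2x - xz\,e^{xyz})\,\hat{\mathbf{k}}. \]
[Fig. 34.13]
\[ \operatorname{curl}\mathbf{v} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \omega y & 0 & 0 \end{vmatrix} = -\omega\,\hat{\mathbf{k}}, \]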
which is a vector perpendicular to the x,y plane. The fluid as a whole does
not appear to rotate, but a small leaf placed on the flow will rotate in a clockwise
sense as it is carried along with the stream. For example, if it is placed so that
y > 0 for all points on the leaf, then the points furthest from the x axis will
everywhere.
A vector field which satisfies curl v = 0 is said to be irrotational. There are two
important identities for special vector fields.
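\[ \operatorname{curl}\operatorname{grad}\phi = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial\phi}{\partial x} & \dfrac{\partial\phi}{\partial y} & \dfrac{\partial\phi}{\partial z} \end{vmatrix} \]
\[ \operatorname{curl}\operatorname{grad}\phi = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ y^2 z + z + y\,e^{xy} & 2xyz + x\,e^{xy} & xy^2 + x \end{vmatrix} \]
\[ = \left(\frac{\partial}{\partial y}(xy^2 + x) - \frac{\partial}{\partial z}(2xyz + x\,e^{xy})\right)\hat{\mathbf{i}} + \left(\frac{\partial}{\partial z}(y^2 z + z + y\,e^{xy}) - \frac{\partial}{\partial x}(xy^2 + x)\right)\hat{\mathbf{j}} + \left(\frac{\partial}{\partial x}(2xyz + x\,e^{xy}) - \frac{\partial}{\partial y}(y^2 z + z + y\,e^{xy})\right)\hat{\mathbf{k}} = \mathbf{0}. \]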
\[ \phi = \int (y^2 z + z + y\,e^{xy})\,dx + f(y, z) = xy^2 z + xz + e^{xy} + f(y, z); \qquad (34.11) \]
\[ \phi = \int (2xyz + x\,e^{xy})\,dy + g(z, x) = xy^2 z + e^{xy} + g(z, x); \qquad (34.12) \]
\[ \phi = \int (xy^2 + x)\,dz + h(x, y) = xy^2 z + xz + h(x, y). \qquad (34.13) \]
Here the ‘constants of integration’ become functions of the other two variables in each
case, since partial derivatives are being integrated. Finally, φ given by (34.11), (34.12),
and (34.13) must all result in the same answer. This can be achieved by the choices
\[ f(y, z) = C, \quad g(z, x) = xz + C, \quad h(x, y) = e^{xy} + C, \]
where C is any constant. Hence
\[ \phi = xy^2 z + xz + e^{xy} + C. \]
Note that potentials of conservative fields can only be found to within an additive
constant.
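The potential found above can be checked numerically: the field should have zero curl, and the gradient of φ should reproduce the field. The following is a minimal sketch using central differences; the step size h, the sample point P and the helper names are assumptions made for illustration, and C is taken as 0.

```python
import math

def F(x, y, z):
    """The vector field of the example: components (F1, F2, F3)."""
    e = math.exp(x * y)
    return (y * y * z + z + y * e, 2 * x * y * z + x * e, x * y * y + x)

def phi(x, y, z):
    """The potential found above, with the constant C taken as 0."""
    return x * y * y * z + x * z + math.exp(x * y)

def partial(f, p, j, h=1e-5):
    """Central-difference estimate of the derivative of f in coordinate j at p."""
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def grad(f, p):
    return tuple(partial(f, p, j) for j in range(3))

def curl(field, p):
    def d(i, j):
        # Derivative of component i with respect to coordinate j.
        return partial(lambda *q: field(*q)[i], p, j)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

P = (0.3, 0.7, -1.2)          # arbitrary sample point
print(curl(F, P))             # each component should be close to 0
print(grad(phi, P), F(*P))    # the two triples should agree closely
```

A check at one point is not a proof, but it is a quick way to catch a sign or integration slip when constructing a potential.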
Self-test 34.5
Let F(x, y, z) = xy î + yz² ĵ + xyz k̂ and G(x, y, z) = y î + z ĵ + (x² + y²) k̂.
Determine curl F, curl G, and div (F × G). Verify the identity
[Fig. 34.14 Cylindrical polar coordinates: the point P : (x, y, z) = (ρ, φ, z).]
[Fig. 34.15: the point P : (ρ, φ, z), with unit vectors ê_ρ, ê_φ, ê_z, at the intersection of the cylinder ρ = constant, the vertical plane φ = constant, and the horizontal plane z = constant.]
A point P can be viewed as lying at the intersection of three surfaces (Fig. 34.15):
the cylinder ρ = a constant, the radial plane φ = a constant through the z axis, and
the horizontal plane z = a constant. These surfaces meet at right angles at every
point, and coordinate systems with this property are said to be orthogonal.
The point P can be represented by the position vector r, where
r = r(ρ, φ, z) = ρ cos φ î + ρ sin φ ĵ + z k̂.
Along the ρ-increasing line through P, φ and z are constant. The vector ∂r/∂ρ,
evaluated at P, is a tangent to this curve at P, pointing in the direction of increasing ρ.
The corresponding unit vector in this direction is
\[ \hat{\mathbf{e}}_\rho = \frac{\partial\mathbf{r}}{\partial\rho} \Big/ \left|\frac{\partial\mathbf{r}}{\partial\rho}\right| = \frac{1}{h_\rho}\frac{\partial\mathbf{r}}{\partial\rho}, \]
where
\[ h_\rho = \left|\frac{\partial\mathbf{r}}{\partial\rho}\right| = |\cos\phi\,\hat{\mathbf{i}} + \sin\phi\,\hat{\mathbf{j}}| = (\cos^2\phi + \sin^2\phi)^{1/2} = 1. \]
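where we require the components g_ρ, g_φ, g_z. Treating U as a function of ρ, φ, z, the
incremental formula gives
\[ \delta U = \frac{\partial U}{\partial\rho}\,\delta\rho + \frac{\partial U}{\partial\phi}\,\delta\phi + \frac{\partial U}{\partial z}\,\delta z. \qquad (34.16) \]
Also, from (31.15), the directional derivative of U is
\[ \frac{dU}{ds} = \mathbf{v}\cdot\operatorname{grad} U, \]
where v represents an arbitrary direction. Since |δs| = |δr|,
\[ \mathbf{v} = \frac{d\mathbf{r}}{ds} = \frac{\partial\mathbf{r}}{\partial\rho}\frac{d\rho}{ds} + \frac{\partial\mathbf{r}}{\partial\phi}\frac{d\phi}{ds} + \frac{\partial\mathbf{r}}{\partial z}\frac{dz}{ds} = h_\rho\frac{d\rho}{ds}\,\hat{\mathbf{e}}_\rho + h_\phi\frac{d\phi}{ds}\,\hat{\mathbf{e}}_\phi + h_z\frac{dz}{ds}\,\hat{\mathbf{e}}_z. \]
Therefore,
\[ \frac{dU}{ds} = h_\rho g_\rho\frac{d\rho}{ds} + h_\phi g_\phi\frac{d\phi}{ds} + h_z g_z\frac{dz}{ds}, \]
or, expressed in increments,
\[ \delta U = h_\rho g_\rho\,\delta\rho + h_\phi g_\phi\,\delta\phi + h_z g_z\,\delta z. \qquad (34.17) \]
Compare (34.16) and (34.17), which are true for arbitrary δρ, δφ, δz. In turn, set
one of δρ, δφ, δz to a nonzero value, and the other two to zero. We obtain
\[ g_\rho = \frac{1}{h_\rho}\frac{\partial U}{\partial\rho}, \quad g_\phi = \frac{1}{h_\phi}\frac{\partial U}{\partial\phi}, \quad g_z = \frac{1}{h_z}\frac{\partial U}{\partial z}, \]
so that grad U is given by
\[ \operatorname{grad} U = \frac{1}{h_\rho}\frac{\partial U}{\partial\rho}\,\hat{\mathbf{e}}_\rho + \frac{1}{h_\phi}\frac{\partial U}{\partial\phi}\,\hat{\mathbf{e}}_\phi + \frac{1}{h_z}\frac{\partial U}{\partial z}\,\hat{\mathbf{e}}_z = \frac{\partial U}{\partial\rho}\,\hat{\mathbf{e}}_\rho + \frac{1}{\rho}\frac{\partial U}{\partial\phi}\,\hat{\mathbf{e}}_\phi + \frac{\partial U}{\partial z}\,\hat{\mathbf{e}}_z. \]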
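The divergence and curl also have their cylindrical polar forms. For the vector
field F = F_ρ ê_ρ + F_φ ê_φ + F_z ê_z, these are
\[ \operatorname{div}\mathbf{F} = \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho F_\rho) + \frac{\partial}{\partial\phi}(F_\phi) + \frac{\partial}{\partial z}(\rho F_z)\right], \]
\[ \operatorname{curl}\mathbf{F} = \frac{1}{\rho}\begin{vmatrix} \hat{\mathbf{e}}_\rho & \rho\,\hat{\mathbf{e}}_\phi & \hat{\mathbf{e}}_z \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ F_\rho & \rho F_\phi & F_z \end{vmatrix}. \]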
r = r(u1, u2, u3) = x(u1, u2, u3) î + y(u1, u2, u3) ĵ + z(u1, u2, u3) k̂.
[Fig. 34.16 Orthogonal curvilinear coordinates: the surfaces u1 = constant, u2 = constant, u3 = constant meet at P, with unit vectors ê1, ê2, ê3.]
Assume that the curvilinear coordinates are orthogonal, that is the surfaces
u1 = a constant, u2 = a constant, u3 = a constant meet at right angles at every point
(Fig. 34.16). The unit vector ê1 is in the direction of the curve along which the
surfaces u2 = constant and u3 = constant meet, and it points in the direction of u1
increasing. The other unit vectors are in the directions of the intersections of the
other surface pairs as shown in Fig. 34.16.
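The scale factors and unit vectors are given by:
\[ h_1 = \left|\frac{\partial\mathbf{r}}{\partial u_1}\right|, \quad h_2 = \left|\frac{\partial\mathbf{r}}{\partial u_2}\right|, \quad h_3 = \left|\frac{\partial\mathbf{r}}{\partial u_3}\right|. \]
\[ \hat{\mathbf{e}}_1 = \frac{1}{h_1}\frac{\partial\mathbf{r}}{\partial u_1}, \quad \hat{\mathbf{e}}_2 = \frac{1}{h_2}\frac{\partial\mathbf{r}}{\partial u_2}, \quad \hat{\mathbf{e}}_3 = \frac{1}{h_3}\frac{\partial\mathbf{r}}{\partial u_3}. \]
Elements of distance δs in the u1, u2, u3 directions are respectively
h1 δu1, h2 δu2, h3 δu3. (34.18)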
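The definition of the scale factors can be tried out numerically. The sketch below estimates h1, h2, h3 by central differences, using spherical polar coordinates (x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ) as an assumed test case; the function names and the step size are hypothetical, and the comparison values 1, r and r sin θ are the standard spherical scale factors.

```python
import math

def spherical_position(r, theta, phi):
    """Cartesian components of the position vector r(u1, u2, u3) = r(r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def scale_factors(position, u, h=1e-6):
    """Estimate h_i = |dr/du_i| at the point u by central differences."""
    factors = []
    for j in range(3):
        plus, minus = list(u), list(u)
        plus[j] += h
        minus[j] -= h
        rp, rm = position(*plus), position(*minus)
        deriv = [(a - b) / (2 * h) for a, b in zip(rp, rm)]
        factors.append(math.sqrt(sum(c * c for c in deriv)))
    return tuple(factors)

u = (2.0, 0.9, 0.4)                            # sample (r, theta, phi)
print(scale_factors(spherical_position, u))    # compare with (1, r, r sin(theta))
print((1.0, u[0], u[0] * math.sin(u[1])))
```

Passing a different `position` function (for example the cylindrical one) gives the scale factors of that coordinate system in the same way.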
We simply state the formulae for grad, div, and curl in general curvilinear co-
ordinates without derivation. They are given by:
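Gradient of U
\[ \operatorname{grad} U = \frac{1}{h_1}\frac{\partial U}{\partial u_1}\,\hat{\mathbf{e}}_1 + \frac{1}{h_2}\frac{\partial U}{\partial u_2}\,\hat{\mathbf{e}}_2 + \frac{1}{h_3}\frac{\partial U}{\partial u_3}\,\hat{\mathbf{e}}_3. \qquad (34.19) \]
Divergence of F = F1 ê1 + F2 ê2 + F3 ê3
\[ \operatorname{div}\mathbf{F} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial u_1}(h_2 h_3 F_1) + \frac{\partial}{\partial u_2}(h_3 h_1 F_2) + \frac{\partial}{\partial u_3}(h_1 h_2 F_3)\right]. \qquad (34.20) \]
Curl of F = F1 ê1 + F2 ê2 + F3 ê3
\[ \operatorname{curl}\mathbf{F} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\hat{\mathbf{e}}_1 & h_2\hat{\mathbf{e}}_2 & h_3\hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial u_1} & \dfrac{\partial}{\partial u_2} & \dfrac{\partial}{\partial u_3} \\ h_1 F_1 & h_2 F_2 & h_3 F_3 \end{vmatrix}. \qquad (34.21) \]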
Derivations of these formulae are given by Riley, Hobson and Bence (1997).
34.8
z
STOKES’S THEOREM
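[Figure: spherical polar coordinates of the point P : (r, θ, φ).]
\[ h_\phi = \left|\frac{\partial\mathbf{r}}{\partial\phi}\right| = |-r\sin\theta\sin\phi\,\hat{\mathbf{i}} + r\sin\theta\cos\phi\,\hat{\mathbf{j}}| = r\sin\theta. \]
From (34.19)
\[ \operatorname{grad} U = \frac{\partial U}{\partial r}\,\hat{\mathbf{e}}_r + \frac{1}{r}\frac{\partial U}{\partial\theta}\,\hat{\mathbf{e}}_\theta + \frac{1}{r\sin\theta}\frac{\partial U}{\partial\phi}\,\hat{\mathbf{e}}_\phi. \]
By (34.20)
\[ \operatorname{div}\mathbf{F} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2 F_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,F_\theta) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}(F_\phi). \]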
Self-test 34.6
Spherical polar coordinates are given by x = r sin θ cos φ, y = r sin θ sin φ,
z = r cos θ. Using hr, hθ , hφ given in Example 34.7 and (34.21), obtain curl F.
Verify that curl F = 0 if F = 2r sin θî + r cos θq.
[Fig. 34.18 The region S_C; its boundary C with direction of C indicated (this direction corresponds to the choice of the positive side of S as the upper side of the diagram); the positive direction of the normal n̂, and the projection of the system on to the x,y plane.]
Suppose also that S_C is such that every straight line parallel to î, ĵ or k̂ cuts S_C at
most once. This is a drastic restriction on S_C; for example, if x, y or z has a local
maximum or minimum at a point on S_C the condition is not satisfied; but this is
the first step towards consideration of a general form of surface. The condition
implies that on S_C
x = X(y, z), y = Y(z, x), z = Z(x, y), (34.22)
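\[ \iint_{S_C} \hat{\mathbf{n}}\cdot\operatorname{curl}\mathbf{V}\,dS = \oint_C \mathbf{V}\cdot d\mathbf{s}, \qquad (34.24) \]
\[ [\hat{\mathbf{n}}\cdot\operatorname{curl}(u\,\hat{\mathbf{i}})]_{S_C}\,dS = \hat{\mathbf{n}}\cdot\left(\frac{\partial u}{\partial z}\,\hat{\mathbf{j}} - \frac{\partial u}{\partial y}\,\hat{\mathbf{k}}\right)_{S_C} dS = \left(\hat{\mathbf{n}}\cdot\hat{\mathbf{j}}\,\frac{\partial u}{\partial z} - \hat{\mathbf{n}}\cdot\hat{\mathbf{k}}\,\frac{\partial u}{\partial y}\right)_{S_C} dS, \qquad (34.25) \]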
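the suffix indicating that the derivatives of u are to be evaluated on S_C. Since
z = Z(x, y) on S_C,
\[ u_{S_C}(x, y, z) = u(x, y, Z(x, y)) = U(x, y) \]
say, where U is a function of two variables. (Notice that
\[ \left(\frac{\partial u(x, y, z)}{\partial x}\right)_{S_C} \neq \frac{\partial U(x, y)}{\partial x}. \]
These terms are connected through a chain rule.)
If r(x, y, z) is the position vector of a point on S_C, then
r = î x + ĵ y + k̂ Z(x, y) = R(x, y).
But (see Section 28.5), ∂R/∂y is perpendicular to n̂ at the point, so that from
(34.22)
\[ \hat{\mathbf{n}}\cdot\frac{\partial\mathbf{R}}{\partial y} = \hat{\mathbf{n}}\cdot\hat{\mathbf{j}} + \hat{\mathbf{n}}\cdot\hat{\mathbf{k}}\,\frac{\partial Z}{\partial y} = 0. \]
By substituting for n̂ · ĵ in (34.25) we obtain
\[ [\hat{\mathbf{n}}\cdot\operatorname{curl}(u\,\hat{\mathbf{i}})]_{S_C}\,dS = -\hat{\mathbf{n}}\cdot\hat{\mathbf{k}}\left(\frac{\partial u}{\partial y} + \frac{\partial u}{\partial z}\frac{\partial Z}{\partial y}\right)_{S_C} dS = -\hat{\mathbf{n}}\cdot\hat{\mathbf{k}}\,\frac{\partial U}{\partial y}\,dS \quad \text{(by the chain rule).} \]
Also n̂ · k̂ dS = dA3, where dA3 is the area of dS projected on to the x,y plane, as in
Fig. 34.8, and C projects on to C3. Therefore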
Green’s theorem in the plane (see Section 33.6) applies to this projection: when
we put Q = U(x, y) and P = 0 in eqn (33.12), we obtain from Green’s theorem
and
\[ \iint_{S_C} \hat{\mathbf{n}}\cdot\operatorname{curl}\mathbf{V}\,dS = \oint_C (u\,dx + v\,dy + w\,dz) = \oint_C \mathbf{V}\cdot d\mathbf{s}, \]
confirming eqn (34.24).
[Fig. 34.19 Notional illustration of adjacent patches S′ (boundary C′) and S″ (boundary C″) and the boundary curve C. On opposite sides of a common edge, the directions of circulation are opposed, as on AB.]
Now consider a smooth surface S_C having boundary C which does not necessarily
satisfy the geometrical limitations prescribed. It is plausible that such a
surface can be partitioned into sub-areas, each of which does satisfy the conditions
applied so far; the surface can be covered by N non-overlapping ‘patches’
of this type. S_C is then the union of the N patches, the result (34.24) applies to
each patch, with boundary C_i, separately, and by addition
\[ \iint_{S_C} \hat{\mathbf{n}}\cdot\operatorname{curl}\mathbf{V}\,dS = \sum_{i=1}^{N} \oint_{C_i} \mathbf{V}\cdot d\mathbf{s}. \qquad (34.26) \]
The directions C_i round the patches are each determined by the right-hand screw
rule.
Figure 34.19 shows two contiguous patches S′ and S″ that share a common
boundary segment AB. Along AB, ds″ = −ds′, and V is continuous across AB,
so that along AB
\[ \int_{C''} \mathbf{V}\cdot d\mathbf{s}'' = -\int_{C'} \mathbf{V}\cdot d\mathbf{s}'. \]
Therefore, when the summation (34.26) is carried out,
cancellation takes place along all the edges of adjoining patches, leaving only the
contributions from the uncompensated segments constituting the boundary C.
Therefore we have obtained a general form for Stokes’s theorem (of which
Green’s theorem can be regarded as a special case):
Note 1. The vectorial expressions in Stokes’s theorem do not depend upon the
axes x, y, z that were used in the proof: the course of the argument would have been
the same whatever right-handed axes had been used. (The expressions in (34.27) are
invariant with respect to transformations between right-handed systems of axes.)
Note 2. The flux of curl V through S_C is constant and equal to the circulation
round C for all surfaces S_C spanning a fixed curve C. (This also follows from the
Divergence Theorem (34.7) since, by (34.10), div curl V ≡ 0.)
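Note 2 can be illustrated numerically by comparing the circulation round a curve with the flux of the curl through two different surfaces spanning it. In the sketch below the field V = (−y + z) î + x ĵ + xy k̂, the unit circle in the plane z = 0, and the two spanning surfaces (the flat unit disc and the upper unit hemisphere) are all assumptions chosen for illustration; the integrals are estimated with the midpoint rule.

```python
import math

def V(x, y, z):
    """Hypothetical test field."""
    return (-y + z, x, x * y)

def curl_V(x, y, z):
    """curl V, worked out by hand from the determinant rule."""
    return (x, 1.0 - y, 2.0)

def circulation(n=400):
    """Circulation of V round the unit circle in z = 0, anticlockwise from above."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        vx, vy, _ = V(x, y, 0.0)
        total += (vx * -math.sin(t) + vy * math.cos(t)) * dt
    return total

def flux_through_disc(n=200):
    """Flux of curl V through the flat disc, normal k."""
    dr, dphi = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            phi = (j + 0.5) * dphi
            total += curl_V(r * math.cos(phi), r * math.sin(phi), 0.0)[2] * r * dr * dphi
    return total

def flux_through_hemisphere(n=200):
    """Flux of curl V through the upper unit hemisphere, outward normal."""
    dth, dphi = (math.pi / 2) / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            phi = (j + 0.5) * dphi
            # On the unit sphere the outward normal equals the position vector.
            nx = math.sin(th) * math.cos(phi)
            ny = math.sin(th) * math.sin(phi)
            nz = math.cos(th)
            cx, cy, cz = curl_V(nx, ny, nz)
            total += (cx * nx + cy * ny + cz * nz) * math.sin(th) * dth * dphi
    return total

print(circulation(), flux_through_disc(), flux_through_hemisphere())
```

All three values should agree to within the accuracy of the quadrature, consistent with the flux of curl V depending only on the bounding curve.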
Problems
34.1 Find the surface area of the spherical cap of (a) grad(UV) = U grad V + V grad U;
height h whose equation for z 0 is (b) div(UF) = (grad U) ·F + U div F;
z = √[a2 − x2 − y2] − a + h, (0 h a). (c) div(F × G) = (curl F) ·G − F·(curl G);
(d) curl curl F = grad(div F) − div grad F. By div
34.2 Evaluate the following triple integrals as grad F is meant î div(grad F1) + q div(grad F2 )
repeated integrals: + x div(grad F3 ).
1 z 2y
(e) grad(F · G) = F × curl G + G × curl F +
(a)
0 0 y
x dx dy dz; (F ·grad)G + (G ·grad)F.
(c)
0 0 − 12 √[1− y 2 − z 2]
x3 dx dy dz. This is often written as ∇2φ . The equation
∇2φ = 0
is known as Laplace’s equation.
34.3 It is intended to evaluate the integral
Show that φ = 1 / x 2 + y 2 + z 2 is a solution of
f(x, y, z) dx dy dz
V
Laplace’s equation.
as a repeated integral over the interior of the sphere 34.10 Prove that
x2 + y2 + z2 = a2 which lies in the first octant x 0, (a) div(F + G) = div F + div G;
y 0, z 0. Work out the limits of integration if (b) curl(F + G) = curl F + curl G.
the order of integration is x followed by y
followed by z. 34.11 Find the divergence of each of the following
vector fields:
34.4 Show that the volume of the tetrahedron (a) F = exyzî + ey zq + exzx;
2
1 ∂ ⎛ ∂U ⎞ 1 ∂ 2U ∂ 2U
⎜ρ ⎟+ 2 + = 0.
ρ ∂ρ ⎝ ∂ρ ⎠ ρ ∂φ 2 ∂z 2 34.20 Suppose that F is a smooth vector field
which equals the outward unit normal L on S. Use
If U = f( ρ), that is U is independent of the other the divergence theorem to show that the surface
variables, show that f satisfies the ordinary area of S is given by
differential equation
ρf ″(ρ) + f ′(ρ) = 0.
Hence show that f(ρ) = A + B ln ρ, where A and B
div F dV,
V
+
1 ∂ 2U
r sin θ ∂φ 2
22
= 0.
1
3 r ·L dS,
S
A solution with spherical symmetry is sought for where L is the outward normal to S.
34
U, that is with U = f(r). Show that f(r) = A + (B /r), Using this result verify that
where A and B are constants.
(a) the volume of the sphere enclosed by x2 + y2 +
z2 = a2 is 43 πa3;
34.18 A vector field is given by F = xy2î + xzq +
(b) the volume of a cone with vertex at the origin
xyzx. Let S be the surface of a cube bounded by
the planes x = ±1, y = ±1, z = ±1. Use the divergence and plane base of area A in the plane z = h is
1
theorem to evaluate 3 Ah.
F · L dS,
S
34.22 Let S be a closed surface surround a region
V for which the divergence theorem holds. Let F be a
where L is the outward normal to the cube. vector field which satisfies div F = 1 in a region which
contains V. Show that the volume enclosed by S is
34.19 Prove that given by the formula
L · curl F dS = 0
S
F· L dS.
S
Part 6
Discrete mathematics
Sets
35
CONTENTS
We are often interested in grouping together objects that have common charac-
teristics or features. We might be interested in the integers 1, 2, 3, 4, or in all the
integers. Such a group is called a set. The set of all points in a plane would consist
of pairs of numbers of the form (x, y), where x and y are coordinates which can
take any real values. These examples all involve numbers, but the elements of sets
can be other objects such as functions, or matrices, or Fourier series, or Laplace
transforms, etc.
35.1 Notation
A set is a collection of objects or elements. The elements in the set can be defined
by a rule or in any descriptive manner. Sets are usually denoted by capital letters
such as S, A, B, X, etc., and their elements by lowercase letters such as s, a, b, x,
etc. The elements in a set are listed between braces { … }. If the set A consists of
just two numbers 0 and 1, then we write
A = {0, 1}, or A = {1, 0}, (35.1)
the order being a matter of indifference. We say that 0 and 1 are the elements or
members of the set A, or belong to A. We write
0 ∈A, 1 ∈A,
read as ‘0 belongs to the set A’, etc. The number 2 does not belong to A, and we
write
2 ∉A,
that is ‘2 does not belong to the set A’.
The set defined by (35.1) is the binary set, which could represent the on and off
states of a system. This could be the state of a light switch, for example.
790
Sets can be either finite, having a finite number of elements, or infinite, in which
case the set contains an infinite number of elements. Thus the set given by (35.1)
SETS
Often the elements are defined by a rule rather than by a list or formula. We
write the set as
S = {x | x satisfies specified rules},
which can be translated as ‘S is the set of values of x which satisfy the stated rules’.
The rules occur after the vertical |. Thus
S = {x | x ∈ ℕ+ and 2 ≤ x ≤ 8}
is an alternative way of writing S = {2, 3, 4, 5, 6, 7, 8}. As another example,
S = {x | x ∈ ℝ and 0 ≤ x ≤ 1}
is the closed interval [0, 1], that is, all real numbers between 0 and 1 including 0
and 1.
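Rule-based definitions of this kind translate directly into code. The sketch below is an illustration only: a finite set defined by a rule is built with a Python set comprehension (the search range 1 to 100 is an arbitrary assumption), while an infinite set such as the interval [0, 1] cannot be listed and is represented here by a membership test with a hypothetical name.

```python
# S = {x | x is a positive integer and 2 <= x <= 8}, built from a rule.
# The upper search limit 100 is arbitrary; any limit of at least 8 works.
S = {x for x in range(1, 101) if 2 <= x <= 8}

def in_closed_unit_interval(x):
    """Membership rule for the infinite set {x | x real and 0 <= x <= 1}."""
    return 0 <= x <= 1

print(sorted(S))
print(in_closed_unit_interval(0.5), in_closed_unit_interval(1.5))
```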
Self-test 35.1
List in full the elements in the following sets:
(a) S1 = {x |x ∈ + and −2 x 8},
(b) S2 = {p/q | p ∈ +, q ∈ +, 1 p 3 and 2 q 4}.
35.2
A = {1, 2, 3}, B = {3, 2, 1}, C = {3, 1, 2, 1}
are all equal, that is A = B = C. The order of the elements is immaterial, and
The intersection of two sets A and B is the set A ∩ B that contains all elements
common to both A and B. It is written and defined by
A ∩ B = {x| x ∈A and x ∈B}.
Example 35.2 Find the intersection of the sets A and B in Example 35.1.
The elements in the intersection have to belong simultaneously to both intervals, that
is to the overlapping part of the intervals [0, 2] and [1, 3], which is [1, 2]. Thus
A ∩ B = {x | x ∈ ℝ and 1 ≤ x ≤ 2}.
In the definitions of A ∪ B and A ∩ B above, we can see that the logical opera-
tion ‘or’ is associated with union, while ‘and’ is associated with intersection.
If A and B have no elements in common, then A and B are said to be disjoint.
The set with no elements is called the empty set and denoted by ∅. Thus, if A
and B are disjoint, then A ∩ B = ∅. Thus if A = {1, 2, 3} and B = {4, 5, 6} then
A ∩ B = ∅.
The complement of a set A is the set of all those elements which belong to the
universal set U but do not belong to A. We denote this set by Ā (the notations
A^c and A′ are also frequently used): it will depend on the definition of U. Hence,
the complement of A is, assuming that x ∈ U,
Ā = {x | x ∉ A}.
Set operations
(a) Union: A ∪ B = {x| x ∈A or x ∈B or both}.
(b) Intersection: A ∩ B = {x| x ∈A and x ∈B}.
(c) Complement: Ā = {x | x ∉ A}.
(d) Empty set: ∅, the set with no elements.
(e) Subset: A ⊆ B means that A is a subset of B.
(f) Proper subset: A ⊂ B means that A ⊆ B but A ≠ B. (35.3)
Self-test 35.2
Find the union and intersection of A = {x | x ∈ and −1 x 2},
B = {x | x ∈ + and 1 x 4}.
35.3
U
VENN DIAGRAMS
[Fig. 35.3 (a) Union A ∪ B. (b) Intersection A ∩ B. (c) Complement Ā. (d) Proper subset A ⊂ B.]
Identity laws: A ∪ ∅ = A, A ∩ U = A.
Complementary laws:
A ∪ Ā = U, A ∩ Ā = ∅, $\overline{\overline{A}}$ = A. (35.5)
For example, Ā consists of all elements that do not belong to A, and none that do;
so there are no elements common to A and Ā. Therefore A ∩ Ā = ∅.
The difference of the sets A and B, written as A\B, consists of the set of those
elements that belong to A but do not belong to B. Thus
A\B = {x | x ∈ A and x ∉ B} or $\overline{A \cap B} \cap A$.
(The notation A − B is also used for A\B.) Figure 35.5b shows a Venn diagram
for A\B.
A B
A B
Example 35.3 Using Fig. 35.6 as the Venn diagram of two sets A and B, mark by
shading the following sets:
(a) A ∪ B̄, (b) A ∩ B̄, (c) Ā ∩ B̄, (d) Ā ∪ B̄, (e) $\overline{A \cup B}$, (f) $\overline{A \cap B}$.
Venn diagrams of the sets are shown in Fig. 35.7.
[Fig. 35.7: panels (a) to (f), one Venn diagram of A and B for each of the sets above.]
De Morgan’s laws
$\overline{A \cup B}$ = Ā ∩ B̄, $\overline{A \cap B}$ = Ā ∪ B̄. (35.6)
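De Morgan’s laws can be spot-checked with small finite sets. In the sketch below the universal set U and the sets A and B are arbitrary choices made for illustration, and `complement` is a hypothetical helper; a check on one example is of course not a proof.

```python
U = set(range(10))        # an assumed universal set {0, 1, ..., 9}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(X, universe=U):
    """Elements of the universal set that do not belong to X."""
    return universe - X

# First law: complement of the union equals intersection of the complements.
law1 = complement(A | B) == complement(A) & complement(B)
# Second law: complement of the intersection equals union of the complements.
law2 = complement(A & B) == complement(A) | complement(B)
print(law1, law2)
```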
Example 35.4 Using Fig. 35.8 as the Venn diagram of three sets A, B, and C,
shade the following sets:
(a) (A ∩ B) ∪ C, (b) (A ∩ B) ∩ C, (c) (A ∩ B) ∩ (A ∩ C),
(d) (A ∪ B) ∪ (A ∩ C).
The required sets are shown in Fig. 35.9. (Figs. 35.9b, c confirm that
(A ∩ B) ∩ C = (A ∩ B) ∩ (A ∩ C).)
[Fig. 35.8: Venn diagram of three sets A, B, and C.]
[Fig. 35.9 (a) (A ∩ B) ∪ C. (b) (A ∩ B) ∩ C. (c) (A ∩ B) ∩ (A ∩ C). (d) (A ∪ B) ∪ (A ∩ C).]
From Fig. 35.5, we can observe that A\B = A ∩ B̄. Hence
(A ∪ B) ∪ (A\B) = (A ∪ B) ∪ (A ∩ B̄)
= A ∪ (B ∪ (A ∩ B̄)) (associative law)
= A ∪ ((B ∪ A) ∩ (B ∪ B̄)) (distributive law)
= A ∪ ((B ∪ A) ∩ U) (complementary law)
= A ∪ (B ∪ A) (identity law)
= (B ∪ A) ∪ A (commutative law)
= B ∪ (A ∪ A) = B ∪ A
= A ∪ B.
Alternatively, and more intuitively, we may notice that, since A\B is a subset of A, it is
therefore also a subset of A ∪ B, and so adds nothing to A ∪ B when united with it.
P1 P2
20 814 – k 31
800 – k 902 – k
992 P3
17
U : 1000
Fig. 35.10
Of the 1000 products manufactured, 796 passed all the quality checks.
Elimination of n(A\B) and n(B\A) between (35.7) and (35.8) leads to the alternative
result
n(A ∪ B) = n(A) + n(B) − n(A ∩ B).
n(A ∪ B ∪ C) = n(A) + n(B) + n(C) + n(A ∩ B ∩ C) − n(B ∩ C)
− n(C ∩ A) − n(A ∩ B).
This result can be constructed from the Venn diagram.
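The two counting formulas above can be checked against a direct count. The following sketch uses small example sets chosen arbitrarily for illustration; the function names are hypothetical.

```python
def union_size_two(A, B):
    """n(A ∪ B) from the two-set formula."""
    return len(A) + len(B) - len(A & B)

def union_size_three(A, B, C):
    """n(A ∪ B ∪ C) from the three-set formula."""
    return (len(A) + len(B) + len(C) + len(A & B & C)
            - len(B & C) - len(C & A) - len(A & B))

A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}
C = {1, 5, 7, 8, 9}

# Compare each formula with a direct count of the union.
print(union_size_two(A, B), len(A | B))
print(union_size_three(A, B, C), len(A | B | C))
```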
Further discussion of sets and their algebra can be found in Garnier and
Taylor (1991).
Self-test 35.3
In Fig. 35.8, shade
(a) A ∩ (B ∩ C); (b) A ∪ B; (c) (A ∪ C) ∩ (B ∪ C).
Problems
35.1 (Section 35.1). List the elements in the (a) A = {x| x ∈, and −2 x 1},
following sets: B = {x |x ∈, and −1 x 2};
(a) S = {x | x ∈+ and 3 x 10}; (b) A = {x| x ∈+ and −5 x 2},
(b) S = {x | x ∈+ and −2 x 4}; B = {x | x ∈, and −5 x 2};
(c) S = {x | x ∈ and −2 x 4}; (c) A = {n |n = 1/m and m ∈ +},
(d) S = {x | x ∈+ or −, and −2 x 4}; B = {n | n = 1/m2 and m ∈+};
(e) S = {1/x |x ∈+ and 3 x 8}; (d) A = {x | x ∈ and x2 − 3x + 2 = 0},
(f ) S = {x2 |x ∈+ and | x | 3}; B = {x | x ∈ and 2x2 + x − 3 = 0};
(g) S = {x + iy | x ∈+, y ∈+, 1 x 4, (e) A = {x| x ∈ and |x | 2},
2 y 5}. B = {x | x ∈ and | x − 1 | 1}.
35.2 (Section 35.3). Show on Venn diagrams the 35.5 (Section 35.3). Construct a set formula for
following sets: the shaded sets of Fig. 35.12:
(a) A ∪ E; (b) D ∩ E;
(c) A ∩ (B ∪ C); (d) (A ∩ B) ∪ (B ∩ C); (a)
(e) A__ _[_ _B ;
A B
(f) (A\ B) ∩ C;
(g) A\ (B ∩ C);
(h) (_A\_ B_)_ _ _]_ (_ _B _\_C _).
C
35.3 (Section 35.2). Determine the union A ∪ B
of each of the following pairs of sets A and B:
(a) A = {x |x ∈ and −1 x 2},
B = {x | x ∈ and −1 x 4}; (b)
(b) A = {x |x ∈ and −1 x 0}, A B
B = {x | x ∈ and 0 x 1};
(c) A = {1, 2, 3, 4}, B = {− 4, −3, −2, −1};
(d) A = {y |y = cos x, x ∈, and 0 x --21 π},
B = {y| y = sin x, x ∈, and −--21 π x --21 π}. C
A B
b ∈B. It is written as
A × B = {(a, b) | a ∈A and b ∈B}.
If A = B, then we write A × A = A2. Let A = {1, 2}
35
CONTENTS
We are now going to present some new operations between special entities. They
have analogies with ordinary addition and multiplication, and the symbols for
them will be similar; but not the same, since we need to emphasize that these are
Boolean operations. The algebra involved is named after George Boole (1815–
64) who first developed the modern ideas of symbolic logic. Boolean algebra has
applications in logic and switching circuits.
Table 36.1
a  b  a ⊕ b      a  b  a * b      a  ā
0  0    0        0  0    0        0  1
0  1    1        0  1    0        1  0
1  0    1        1  0    0
1  1    1        1  1    1
Thus, for example
BOOLEAN ALGEBRA: LOGIC GATES AND SWITCHING FUNCTIONS
0 ⊕ 1 = 1, 1 ⊕ 1 = 1, 0 * 0 = 0, 1 * 1 = 1, $\bar{0}$ = 1, $\bar{1}$ = 0.
The elements of B are known as Boolean variables. We have restricted our set B
to one with just two elements or binary digits, because this is the main applica-
tion in circuits and computer design, but definitions can be interpreted for more
general sets. A Boolean algebra is a set with the operations ⊕, *, and ¯ defined on
it, together with the following laws on any elements a, b, c which belong to B:
Commutative laws:
a ⊕ b = b ⊕ a, a * b = b * a;
Associative laws:
a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c, a * (b * c) = (a * b) * c;
Distributive laws:
a * (b ⊕ c) = (a * b) ⊕ (a * c),
a ⊕ (b * c) = (a ⊕ b) * (a ⊕ c). (36.1)
In addition, the set must contain distinct identity elements 0 and 1 for the operations
⊕ and * respectively. For these elements we must have the identity and complement
laws:
Identity laws:
a ⊕ 0 = a, a * 1 = a
Complement laws:
a ⊕ ā = 1, a * ā = 0 (36.2)
Absorption laws:
a ⊕ (a * b) = a, a * (a ⊕ b) = a;
de Morgan’s laws: $\overline{a \oplus b}$ = ā * b̄, $\overline{a * b}$ = ā ⊕ b̄;
Identity laws:
1 ⊕ a = a ⊕ 1 = 1, 0 * a = a * 0 = 0;
Reflexive law: $\overline{\bar{a}}$ = a. (36.3)
Note that * takes precedence over ⊕ in the absence of brackets. Thus, in the first
absorption law, a ⊕ a * b means a ⊕ (a * b); in the second absorption law, the
brackets are essential.
We will prove one of the absorption laws to illustrate how proofs are ap-
Any expression made up from the elements of B and the operations ⊕, *, and ¯ is
known as a Boolean expression. For example,
a ⊕ b, a ⊕ b̄, a ⊕ ā * b,
are Boolean expressions. For the binary set, the elements 1 and 0 can represent
‘on’ or ‘off’ states in digital circuits. The basic components in a computer are
logic gates which can produce an output from inputs. All the outputs and inputs
can be in one of two states, usually either low voltage (0) or high voltage (1).
The fundamental Boolean operations of ⊕, *, and ¯ correspond to devices
known respectively as the OR gate, AND gate, and NOT gate. As with circuit com-
ponents such as resistance and inductance, each has its own symbol.
The or gate has two inputs and a single output represented by the symbol in
Fig. 36.1. The output is f = a ⊕ b. The inputs a and b can each take either of the
values 0 or 1. Hence there are four possible inputs into the device as listed in
Table 36.2. The final column f can be completed using the sum rule in Table 36.1.
Then, if a is ‘on’ (1) and b is ‘off’ (0), the output f is ‘on’ (1). Table 36.2 is known
as the truth table of the or gate.
[Fig. 36.1 The OR gate: inputs a and b, output f = a ⊕ b.]
Table 36.2 Truth table for the OR gate
a  b  f = a ⊕ b
0  0  0
0  1  1
1  0  1
1  1  1
The symbol and truth table for the AND gate are shown in Fig. 36.2 and
Table 36.3. Again the device has two inputs and the single output f = a * b, the
product of a and b.
[Fig. 36.2 The AND gate: inputs a and b, output f = a * b.]
Table 36.3 Truth table for the AND gate
a  b  f = a * b
0  0  0
0  1  0
1  0  0
1  1  1
Finally the NOT gate is shown in Fig. 36.3 with its truth table given as
Table 36.4. The not gate has a single input and a single output which is the com-
plement of its input.
[Fig. 36.3 The NOT gate.]
Table 36.4 Truth table for the NOT gate
a  f = ā
0  1
1  0
There is further jargon associated with these gates. The output a ⊕ b is known
as the disjunction of a and b, while a * b is known as the conjunction of a and b,
and ā is called the negation of a.
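The three basic gates and their truth tables can be modelled in a few lines. This is a minimal sketch, assuming the binary set {0, 1} is represented by Python integers; the function names and the `truth_table` helper are hypothetical and are introduced only for illustration.

```python
from itertools import product

def OR(a, b):
    """Boolean sum a ⊕ b (disjunction)."""
    return a | b

def AND(a, b):
    """Boolean product a * b (conjunction)."""
    return a & b

def NOT(a):
    """Complement of a (negation)."""
    return 1 - a

def truth_table(gate, n_inputs):
    """List rows (inputs..., output) for every combination of 0/1 inputs."""
    return [(*inputs, gate(*inputs)) for inputs in product((0, 1), repeat=n_inputs)]

for row in truth_table(OR, 2):
    print(row)
```

The rows are generated in the same order as the tables above: 00, 01, 10, 11.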
These devices can be connected in series and parallel to create new logic
devices, each of which will have its own truth table.
A series connection between a NOT gate and an AND gate is shown in Fig. 36.4a.
The output a * b of the AND gate becomes the input of the NOT gate, which results
in the output $\overline{a * b}$. This combined device is known as the NAND gate, and it has
its own symbolic representation shown in Fig. 36.4b. Its truth table is given in
Table 36.5.
[Fig. 36.4 The NAND gate: (a) an AND gate followed by a NOT gate; (b) the NAND symbol, with output f = $\overline{a * b}$.]
Table 36.5 Truth table for the NAND gate
a  b  f = $\overline{a * b}$
0  0  1
0  1  1
1  0  1
1  1  0
A series connection between a NOT gate and an OR gate produces the NOR gate
as shown in Fig. 36.5a. The output f is the complement of the sum of a and b. The
NOR gate also has its own contracted symbol, shown in Fig. 36.5b. It has the truth
table shown in Table 36.6.
[Fig. 36.5 The NOR gate: (a) an OR gate followed by a NOT gate; (b) the NOR symbol, with output f = $\overline{a \oplus b}$.]
Table 36.6 Truth table for the NOR gate
a  b  f = $\overline{a \oplus b}$
0  0  1
0  1  0
1  0  0
1  1  0
36.3 LOGIC NETWORKS
Self-test 36.1
The output of an AND gate (Fig. 36.2) is attached to a NOT gate (Fig. 36.3).
Construct the truth table for the system.
Example 36.2 Construct the Boolean expression for the output f of the device
shown in Fig. 36.6.
Starting from the left in Fig. 36.6, the upper and gate produces an output a * b and the
lower or gate has an output c ⊕ d. These become the inputs into the or gate on the
right. Hence the final output is
f = (a * b) ⊕ c ⊕ d.
Since there are four inputs, the output f can be determined for each of the 2⁴ = 16
possible inputs. Hence if, for example, a = 1, b = 0, c = 0, d = 1, then the output f = 1.
[Fig. 36.6: an AND gate (inputs a, b) and an OR gate (inputs c, d) feeding an OR gate with output f.]
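The 16 input combinations of this network can be enumerated mechanically. The sketch below assumes 0/1 integers for the Boolean variables and uses Python's bitwise operators for * and ⊕; the names `network` and `table` are hypothetical.

```python
from itertools import product

def network(a, b, c, d):
    """Output f = (a * b) ⊕ c ⊕ d of the network in Example 36.2."""
    return (a & b) | c | d

# Map every input combination (a, b, c, d) to the output f.
table = {inputs: network(*inputs) for inputs in product((0, 1), repeat=4)}

for inputs, f in table.items():
    print(inputs, f)
```

Looking up `table[(1, 0, 0, 1)]` reproduces the case quoted in the example.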
Example 36.3 Figure 36.7 shows a logical network with three inputs a, b, c, and
four devices. Find a Boolean expression for the output f. Write down the truth
table for the system.
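[Fig. 36.7: inputs a and b feed the AND gate P, whose output passes through the NOT gate R; inputs b and c feed the OR gate Q; the outputs of R and Q feed the device S, which has output f.]
Note that the input b is the same in both devices P and Q. The output from the AND gate
P is a * b, and the output from R is $\overline{a * b}$. The output from Q is b ⊕ c. Hence the inputs
$\overline{a * b}$ and b ⊕ c into S produce an output
f = $\overline{a * b}$ ⊕ b ⊕ c.
The truth table for this network is given in Table 36.7. Whatever the inputs, the device
is always ‘on’.
Table 36.7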
Example 36.4 Show that, using just the NOR gate, it is possible to build a logic
network to model any Boolean expression.
Given inputs a and b, we have to show that devices can be constructed using just NOR
gates with outputs of a ⊕ b, a * b, and ā. For inputs of a and b, the single NOR gate
generates an output of $\overline{a \oplus b}$. Figure 36.8 shows three devices which simulate the
required outputs.
[Fig. 36.8 The simulations are: (a) OR gate; (b) AND gate, using $\overline{a \oplus a}$ = ā, $\overline{b \oplus b}$ = b̄ and $\overline{\bar{a} \oplus \bar{b}}$ = a * b; (c) NOT gate, using $\overline{a \oplus a}$ = ā.]
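The claim of Example 36.4 can be verified exhaustively, since each gate has only a handful of input combinations. The sketch below assumes 0/1 integers for the Boolean variables; the wiring chosen for the OR simulation (a NOR gate followed by a NOR gate with both inputs tied together) is one possible construction and is stated here as an assumption, since the corresponding panel of the figure is not reproduced.

```python
from itertools import product

def NOR(a, b):
    """Complement of the Boolean sum a ⊕ b."""
    return 1 - (a | b)

def NOT_from_nor(a):
    # Both inputs tied together: the complement of (a ⊕ a) is the complement of a.
    return NOR(a, a)

def OR_from_nor(a, b):
    # Complementing the NOR output restores a ⊕ b.
    return NOR(NOR(a, b), NOR(a, b))

def AND_from_nor(a, b):
    # NOR of the two complements gives a * b, by de Morgan's laws.
    return NOR(NOR(a, a), NOR(b, b))

for a, b in product((0, 1), repeat=2):
    print(a, b, OR_from_nor(a, b), AND_from_nor(a, b), NOT_from_nor(a))
```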
Example 36.5 Design a logic network using OR, AND, and NOT gates to
reproduce the Boolean expression f = a * b̄ ⊕ a for inputs a and b.
From input b we obtain b̄ by a NOT gate. The inputs a and b̄ are then fed into an AND
gate to produce a * b̄. Finally a spur from the a input and the a * b̄ output are fed into
an OR gate as shown in Fig. 36.9.
[Fig. 36.9: input b passes through a NOT gate to give b̄; a and b̄ feed an AND gate giving a * b̄; a and a * b̄ feed an OR gate giving a * b̄ ⊕ a.]
Self-test 36.2
An AND gate with inputs a and b, and a NOT gate with input c are connected
to a NOR gate. Find a Boolean expression for the output f, and construct a
truth table for the system.
[Fig. 36.10 The exclusive-OR gate: inputs a and b, output f = a * b̄ ⊕ ā * b.]
a  b  f
0  0  0
0  1  1
1  0  1
1  1  0
for these cases. f remains zero for the remaining outputs. We obtain
f = ā * b ⊕ a * b̄, (36.4)
Table 36.1, the construction guarantees a Boolean expression for any truth table.
Applied to the truth table for the OR gate (Table 36.2), the disjunctive form gives
f = (ā * b) ⊕ (a * b̄) ⊕ (a * b),
which is evidently a more complicated version of a ⊕ b.
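The construction of the disjunctive form from a truth table is mechanical, so it can be sketched in code. In the following illustration the truth table is a dictionary from input tuples to outputs, the complement of a variable is written with a leading `~` (a notation chosen for this sketch only), and the function names are hypothetical.

```python
from itertools import product

def dnf_terms(table, names):
    """One product term for each row of the truth table whose output is 1."""
    terms = []
    for inputs, out in sorted(table.items()):
        if out == 1:
            # Use the variable itself where the input is 1, its complement where it is 0.
            terms.append(" * ".join(n if v else "~" + n for n, v in zip(names, inputs)))
    return terms

def evaluate_dnf(table, inputs):
    """Evaluate the disjunctive form: the Boolean sum of the product terms."""
    f = 0
    for row, out in table.items():
        if out == 1:
            term = 1
            for x, r in zip(inputs, row):
                term &= x if r == 1 else 1 - x
            f |= term
    return f

OR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}   # Table 36.2
print(" ⊕ ".join("(" + t + ")" for t in dnf_terms(OR_TABLE, ("a", "b"))))
```

Evaluating the resulting form on every input reproduces the original table, which is the sense in which the construction always yields a valid, if not the simplest, expression.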
The method can be applied to more complex truth tables. Table 36.9 shows an
output for three inputs. The output 1 appears in rows 2, 4, 5, 7, 8. In row 2, a = 0,
Table 36.9
a b c f
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 0
1 1 0 1
1 1 1 1
b = 0 and c = 1. Hence we introduce ā * b̄ * c which equals 1. Apply the same procedure
to rows 4, 5, 7, 8 introducing the complement for zero Boolean variables.
The disjunctive normal form for a corresponding Boolean expression is, following
the rules for products of elements and their complements,
f = ā * b̄ * c ⊕ ā * b * c ⊕ a * b̄ * c̄ ⊕ a * b * c̄ ⊕ a * b * c.
Check that f does give the required output. The disjunctive normal form always
guarantees an answer, but it is not necessarily the simplest or most efficient in
circuit architecture.
36.5 SWITCHING CIRCUITS
Self-test 36.3
Construct a Boolean expression for the truth table
a  b  $\overline{a * b}$
0 0 1
0 1 1
1 0 1
1 1 0
using the disjunctive normal form. Compare the answer with the answer of
Self-test 36.1.
Consider two switches S1 and S2 in series (Fig. 36.12). Current only flows if both
switches are closed, that is when a = 1 and b = 1, where a and b represent the
states of the switches. Hence the truth table for the series switches is as shown in
Table 36.10. Thus the state of current flow is given by f = a * b, the product of
a and b.
Similarly two switches in parallel (Fig. 36.13) correspond to the sum of a and b.
The truth table is given in Table 36.11. The final column indicates that f = a ⊕ b.
The complement of a, the state of switch S1, is another switch S2 in the circuit
which is always in the complementary state to S1, off when S1 is on and vice versa.
[Fig. 36.12 Two switches in series: S1 (state a) and S2 (state b).]
Table 36.10 Truth table for two switches in series
a  b  f
0  0  0
0  1  0
1  0  0
1  1  1
[Fig. 36.13 Two switches in parallel.]
Table 36.11 Truth table for two switches in parallel
a  b  f
0  0  0
0  1  1
1  0  1
1  1  1
Example 36.6 Find a switching function f for the system shown in Fig. 36.15.
[Fig. 36.15: switches S2 and S3 in parallel, in series with S4; this combination is in parallel with S1.]
Let a1, a2, a3, a4 represent respectively the states of each switch S1, S2, S3, S4. Since S2 and
S3 are in parallel, their output will be a2 ⊕ a3. This combined in series with a4 will give
an output of (a2 ⊕ a3) * a4. In turn, this is in parallel with S1. Hence, the final output is
(a2 ⊕ a3) * a4 ⊕ a1.
Example 36.7 A light on a staircase is controlled by two switches S1 and S2, one
at the bottom of the stairs and one at the top. Switches can be separately ‘up’ or
‘down’. If both switches are up, the light is off. Either switch changed to down
switches the light on, and any subsequent change to a switch alters the state of
the light. Design a truth table for the circuit.
The truth table is shown in Table 36.12, where the state of Si (i = 1, 2) is ai = 0 when the
switch is up (off) and ai = 1 when the switch is down (on). The light on is f = 1, and
the light off is f = 0. This truth table is the same as that for the exclusive-or gate in
Section 36.4. Hence, from (36.4), the circuit can be represented by the switching
function
f = a1 * ā2 ⊕ ā1 * a2.
The actual circuit is shown in Fig. 36.16, where S1 and S2 are one-pole two-way switches.
At S1, the state a1 represents the switch ‘up’ and its complement ā1 is the switch down.
A similar state operates at S2.
Table 36.12
S1    S2    Light  a1  a2  f
up    up    off    0   0   0
down  up    on     1   0   1
down  down  off    1   1   0
up    down  on     0   1   1
[Fig. 36.16: two one-pole two-way switches S1 (contacts a1, ā1) and S2 (contacts a2, ā2) connected to an AC supply.]
Problems
BOOLEAN ALGEBRA: LOGIC GATES AND SWITCHING FUNCTIONS
36.1 Read through Example 36.1. Now prove the other absorption law:
a * (a ⊕ b) = a.
(Example 36.1 and this result illustrate the duality principle, which states
that any theorem which can be proved in Boolean algebra implies another
theorem with * and ⊕ interchanged for the same elements.)
36.2 (Section 36.1). Prove the de Morgan result
\overline{a ⊕ b} = A * B,
by showing that (a ⊕ b) ⊕ (A * B) = 1. Explain how the duality result
(Problem 36.1) gives the other de Morgan theorem.
36.3 (Section 36.1). Let B be the Boolean algebra with the two elements 0
and 1. For arbitrary a, b ∈ B, prove the following:
(a) a * (A ⊕ b) = a * b;
(b) (a ⊕ b) * (a ⊕ B) = a;
(c) (a ⊕ b) * A * B = 0.
36.4 (Section 36.1). Using the laws of Boolean algebra for the set with two
elements 0 and 1, show that:
(a) a * b ⊕ a * B = a;
(b) a ⊕ A * B * c = a ⊕ B * c.
(a ⊕ B) * (a ⊕ C).
Construct the truth table for this Boolean expression.
36.8 (Section 36.1). Show that the following Boolean expressions are
equivalent:
(a) a ⊕ b; (b) a ⊕ b * b.
36.9 (Section 36.3). Find a Boolean expression f which corresponds to the
truth table shown in Table 36.13.
Table 36.13
a b c f
0 0 0 1
0 0 1 1
0 1 0 0
0 1 1 0
1 0 0 1
1 0 1 1
1 1 0 0
1 1 1 0
36.10 (Section 36.3). Construct Boolean expressions for the output f in the
devices shown in Figs 36.17a–d. Construct the truth tables in each case.
Fig. 36.17
36.11 Find the outputs f and g in the logic circuits shown in Fig. 36.18.
This device can represent binary addition in which g is the ‘carry’ in the
binary table shown in Table 36.14. The output g gives the ‘1’ in the ‘10’ in
the binary sum 1 + 1 = 10.
Fig. 36.18
Table 36.14
x y x+y
0 0 0
0 1 1
1 0 1
1 1 10
36.12 (Section 36.3). Reproduce the logic gate in Fig. 36.6 using just the
nor gate.
36.13 (Section 36.4). Using the disjunctive normal form, construct a Boolean
expression f for the truth tables given in Tables 36.15 and 36.16.
Table 36.15
a b f
0 0 0
0 1 1
1 0 1
1 1 1
Table 36.16
a b c f
0 0 0 1
0 0 1 0
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 0
1 1 0 1
1 1 1 0
36.14 (Section 36.3). Show that any Boolean expression can be modelled using
just a nand gate. (Hint: use a method similar to that explained in
Example 36.4.)
36.15 (Section 36.4). Find switching functions for the switching circuits
shown in Figs 36.19a,b.
Fig. 36.19
36.16 A lecture theatre has three entrances and the lighting can be
controlled from each entrance; that is, it can be switched on or off
independently. The light is ‘on’ if the output f equals 1 and ‘off’ if f = 0.
Let ai = 1 (i = 1, 2, 3) when switch i is up, and let ai = 0 (i = 1, 2, 3)
when it is down. Construct a truth table for the state of the lighting for
all states of the switches. Also specify a Boolean expression which will
control the lighting.
37 Graph theory and its applications
Fig. 37.1
37.1 Examples of graphs
Here are some practical examples of situations and objects which can be usefully
represented by graphs.
(i) Electrical circuits. Figure 37.2a shows an electrical circuit with four resistors
R1, R2, R3 and R4, an inductor L, and a voltage source V1. Each edge has just one
component, and the joins between components are the vertices (the term node is
frequently used in circuit theory) in the graph. Care has to be taken with the
definition of nodes (see Section 37.6): they are not necessarily where three or more
wires meet. This circuit has four vertices a, b, c, d, and it can be represented by the
graph in Fig. 37.2b if we are only interested in the links, not what they contain. The
presence of a line or edge between two nodes in the graph indicates that there is a
component between the nodes.
Fig. 37.2 (a) Electrical circuit with resistors R1, R2, R3, R4, inductor L1 and
voltage source v1. (b) Its graph with vertices a, b, c, d.
Figure 37.3 shows another circuit with six vertices in which the boxes indicate
electrical components. The wires joining c to f and b to e cross over each other.
In the design of printed circuits, it is useful to know whether the circuit can be
redrawn so that no wires cross. Such a graph, with no edges crossing, is known
as a planar graph. The graph in Fig. 37.2 is planar, but the graph of the circuit in
Fig. 37.3 has no planar drawing: at least two edges will cross in any plane diagram
of it. We shall discuss this notion in Section 37.8.
Fig. 37.3
(ii) Chemical molecules. Molecular diagrams look like candidates for graphs. The
molecule of ethanol can be represented by Fig. 37.4a. In its graph representation
in Fig. 37.4b, the vertices represent atoms and the edges bonds. The number of
bonds which meet at an atom is the valency of the atom. Thus carbon (C) has
valency 4, oxygen (O) valency 2, and hydrogen (H) valency 1. Generally in graphs,
the number of edges that meet at a vertex is known as the degree of the vertex.
Fig. 37.4
(iii) Road maps. Road maps and street plans are graphs with roads as edges and
junctions as vertices. However, most road networks include one-way streets.
Hence graphs need to be modified to indicate directions in which movement or
flow is permitted. Figure 37.5a shows a typical section of a street plan with some
one-way streets. We have to associate directions with the edges as shown in the
graph of the plan in Fig. 37.5b. Note that two-way streets now have two directed
edges associated with them. This is an example of a directed graph, which is also
known by the shortened term digraph.
Fig. 37.5 (a) Traffic flow in a road grid, (b) Digraph representation of the roads in (a).
(iv) Shortest paths. Figure 37.6 shows a digraph with weights associated with
each edge. The graph could represent routes between towns S and T which pass
through intermediate towns A, B, …; the weights associated with each directed
edge could stand for distances or times. This graph is shown as a digraph, but
weights could be present without directions in some cases. In this example we
might be interested in the shortest distance between the start (S) and the
finish (T).
Fig. 37.6
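A shortest distance in a weighted digraph can be computed without listing every
route. The sketch below uses Dijkstra's algorithm, a standard method that is
not described in this section. The edge weights in roads are hypothetical and
are NOT those of Fig. 37.6; the function name shortest_distance is likewise an
illustrative choice.

```python
import heapq


def shortest_distance(graph, start, finish):
    """Dijkstra's algorithm on a digraph stored as {u: {v: weight}}."""
    dist = {start: 0}
    queue = [(0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == finish:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter route to u is known
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(queue, (nd, v))
    return None  # finish cannot be reached from start


# Hypothetical weighted digraph (not the weights of Fig. 37.6).
roads = {
    "S": {"A": 3, "B": 6, "C": 5},
    "A": {"B": 2, "T": 9},
    "B": {"T": 4},
    "C": {"B": 2, "T": 7},
}
print(shortest_distance(roads, "S", "T"))  # 9, along S, A, B, T
```

The method assumes that all weights are non-negative, which is the case for
distances and times.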
Example 37.1 Find the degree of the vertices in the graph in Fig. 37.1.
Three edges meet at the vertex a. Hence deg(a) = 3. Four edges meet at b. Hence
deg(b) = 4. Similarly, deg(c) = 2 and deg(d) = 3.
A simple graph in which every vertex is joined to every other vertex by just one
edge is called a complete graph (see also Section 37.8).
Figure 37.7 shows some examples of the various graphs described above.
Fig. 37.7 (a) Connected simple graph. (b) Connected multigraph. (c) Disconnected multigraph.
(d) Regular graph of degree 3. (e) Complete graph with five vertices: deg(a) = 4.
Since every edge has a vertex at each end, it follows that the sum of all the vertex
degrees equals twice the number of edges. This is known as the handshaking
lemma.
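The relation between the degree sum and the number of edges is easy to confirm
numerically. The following sketch does so for the complete graph on five
vertices of Fig. 37.7(e); the edge-list representation and the function name
degrees are assumptions made for the illustration.

```python
from collections import Counter


def degrees(edges):
    """Vertex degrees from a list of (u, v) edges; a loop (u, u) counts twice."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


# Complete graph on five vertices: every pair of vertices is joined once.
vertices = "abcde"
k5 = [(u, v) for i, u in enumerate(vertices) for v in vertices[i + 1:]]
deg = degrees(k5)
print(deg, sum(deg.values()), 2 * len(k5))
```

Every vertex has degree 4, and the degree sum 20 is twice the number of
edges, 10.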
Self-test 37.1
List the degrees of each vertex, as an increasing sequence, for each graph
in Fig. 37.7.
Fig. 37.8
37.3 How many simple graphs are there?
11 can be identified as unlabelled graphs. The latter graphs are shown in Fig. 37.11.
Of the 11 unlabelled graphs it can be seen that six are connected and four are
regular.
For applications involving electrical circuits, the main interest is in connected
graphs. The numbers of the various categories of graphs up to n = 7 vertices are
given in Table 37.1. It can be seen from the table that the number of unlabelled
graphs is a considerable reduction on the labelled set, and that regular graphs
are comparatively rare. The counting of unlabelled graphs does not follow from a
simple formula.
Table 37.1
n 1 2 3 4 5 6 7
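For a small number of vertices the counts can be found by brute force. The
sketch below generates every labelled simple graph on n vertices, identifies
graphs that differ only by a relabelling of the vertices, and counts the
connected and regular ones among the unlabelled graphs. The function names are
illustrative, and the method is only practical for very small n because it
tries every permutation of the vertices.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from vertex 0 over an undirected edge set."""
    reached, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if u in (a, b):
                w = b if a == u else a
                if w not in reached:
                    reached.add(w)
                    stack.append(w)
    return len(reached) == n


def is_regular(n, edges):
    """True if every vertex has the same degree."""
    degree = [sum(v in e for e in edges) for v in range(n)]
    return len(set(degree)) == 1


def classify(n):
    """Count labelled, unlabelled, connected and regular simple graphs."""
    pairs = list(combinations(range(n), 2))
    seen = set()
    counts = {"labelled": 2 ** len(pairs), "unlabelled": 0,
              "connected": 0, "regular": 0}
    for mask in range(2 ** len(pairs)):
        edges = [p for i, p in enumerate(pairs) if (mask >> i) & 1]
        # Canonical form: the smallest edge list over all relabellings.
        canon = min(
            tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
            for perm in permutations(range(n))
        )
        if canon in seen:
            continue
        seen.add(canon)
        counts["unlabelled"] += 1
        counts["connected"] += is_connected(n, edges)
        counts["regular"] += is_regular(n, edges)
    return counts


print(classify(4))
```

For n = 4 this gives 11 unlabelled graphs, of which six are connected and four
are regular, in agreement with the counts quoted above.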
Self-test 37.2
List all unlabelled simple graphs with five vertices. Indicate which graphs
are connected, and which are regular. What are the degrees of the regular
graphs?
Fig. 37.12
Example 37.2 Electrical circuits are usually such that every edge of their
representative graph is part of a cycle. List all the distinct cycles in the graph
in Fig. 37.2a.
The graph of the circuit is repeated in Fig. 37.13. The complete list of cycles is:
3-edge cycles: a–b–c–a, a–b–d–a, a–d–c–a, b–d–c–b;
4-edge cycles: a–b–d–c–a, a–d–b–c–a, a–b–c–d–a.
Fig. 37.13
Fig. 37.14
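The list of cycles in Example 37.2 can be checked by a systematic search. The
seven cycles listed use every pair of the vertices a, b, c, d as an edge, so
the sketch below assumes that the graph of Fig. 37.13 joins every pair of its
four vertices. The function name simple_cycles is an illustrative choice.

```python
def simple_cycles(adj):
    """All distinct cycles of an undirected simple graph, as vertex tuples."""
    cycles = set()

    def extend(path):
        start, last = path[0], path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= 3:
                # Keep one direction only, so each cycle is recorded once.
                if path[1] < path[-1]:
                    cycles.add(tuple(path))
            elif nxt > start and nxt not in path:
                # The start is always the smallest vertex on the cycle.
                extend(path + [nxt])

    for v in sorted(adj):
        extend([v])
    return sorted(cycles, key=lambda c: (len(c), c))


vertices = "abcd"
graph = {v: [w for w in vertices if w != v] for v in vertices}
cycles = simple_cycles(graph)
for c in cycles:
    print("–".join(c + (c[0],)))
```

The search returns four 3-edge cycles and three 4-edge cycles, seven in all.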
Some graphs have special closed-path and cycle properties. A connected graph
G is said to be eulerian if there exists a closed trail that includes every edge in G.
A connected graph G is said to be hamiltonian if there exists a cycle that includes
every vertex in G. The graph in Fig. 37.13 is hamiltonian but not eulerian. One
hamiltonian cycle in its graph is a–b–d–c–a. Note that this cycle does not have to
cover every edge in the graph.
The graph in Fig. 37.14 is both eulerian and hamiltonian. An eulerian trail is
a–b–c–d–e–f–g–e–c–g–b–f–a,
and a hamiltonian cycle is
a–b–c–d–e–g–f–a.
It can be shown that a connected graph is eulerian if and only if every vertex
has even degree. This provides an easy test for the eulerian property of a graph.
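The even-degree test can be applied to the graph of Fig. 37.14. Since an
eulerian trail includes every edge exactly once, the sketch below takes the
edges of the graph to be the consecutive pairs of the trail quoted above; this
reading of the figure is an assumption, as are the variable names.

```python
from collections import Counter

# Eulerian trail quoted in the text; consecutive vertices give the edges.
trail = "a b c d e f g e c g b f a".split()
edges = [frozenset(pair) for pair in zip(trail, trail[1:])]

# Degree of each vertex, and the even-degree test for the eulerian property.
deg = Counter(v for e in edges for v in e)
is_eulerian = all(d % 2 == 0 for d in deg.values())

# Check the quoted hamiltonian cycle: closed, visits every vertex once,
# and uses only edges of the graph.
cycle = "a b c d e g f a".split()
is_hamiltonian_cycle = (
    cycle[0] == cycle[-1]
    and len(cycle) - 1 == len(deg)
    and set(cycle) == set(deg)
    and all(frozenset(p) in edges for p in zip(cycle, cycle[1:]))
)

print(dict(deg), is_eulerian, is_hamiltonian_cycle)
```

Vertices a and d have degree 2 and the remaining five vertices have degree 4,
so every degree is even.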
37.5 Trees
A connected graph which has no cycles is known as a tree. An example of a tree is
shown in Fig. 37.15. The edges in a tree are called branches.
Suppose that a graph G consists of the set V(G) of vertices and the set E(G) of
edges. Then any graph whose vertices and edges are subsets of V(G) and E(G)
respectively is called a subgraph. It is important to note that the subgraph must
be a graph whose vertices and edges come from G; and only edges that join two
vertices of the subgraph are permitted in the subset of E(G).
Figure 37.16a shows a connected graph G and Fig. 37.16b shows a spanning tree
of G. Graphs can have many different spanning trees. The set of edges that are not
part of the spanning tree (the broken edges in Fig. 37.16b) is known as the cotree
and its edges are called links.
Fig. 37.16 (a) Connected graph. (b) The same graph with a spanning tree.
Construct a tree from a vertex by adding edges. Each edge added must introduce
a new vertex, since otherwise a cycle would be created and the graph would no
longer be a tree. A tree with two vertices has one edge, a tree with three vertices
has two edges, and so on. Hence a tree with n vertices must have just n − 1
branches. It follows that a connected graph with n vertices and e edges must
have a cotree with e − n + 1 links.
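The construction just described translates directly into a procedure for
finding a spanning tree of a connected simple graph. In the sketch below an
edge is accepted as a branch only if it introduces a new vertex, and the
remaining edges form the cotree. The example graph is the 7-vertex, 12-edge
graph obtained from the eulerian trail quoted in Section 37.4 (an assumption
about Fig. 37.14), and the function name is illustrative.

```python
def spanning_tree(vertices, edges):
    """Grow a spanning tree of a connected simple graph.

    Returns (branches, links): the tree edges and the cotree edges.
    """
    reached = {vertices[0]}
    branches = []
    grew = True
    while grew:
        grew = False
        for u, v in edges:
            # Accept an edge only if exactly one end is already in the tree,
            # so that it introduces a new vertex and cannot close a cycle.
            if (u in reached) != (v in reached):
                reached.update((u, v))
                branches.append((u, v))
                grew = True
    links = [e for e in edges if e not in branches]
    return branches, links


trail = "a b c d e f g e c g b f a".split()
edges = list(zip(trail, trail[1:]))
vertices = sorted(set(trail))
branches, links = spanning_tree(vertices, edges)
print(len(vertices), len(edges), len(branches), len(links))
```

With n = 7 and e = 12 the tree has n − 1 = 6 branches and the cotree has
e − n + 1 = 6 links.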
We now introduce the cutset, by which we can disconnect a graph into two
subgraphs which together contain all the original vertices, by removing a
minimum set of edges in the graph.
Cutset
In a connected graph, a cutset is a set of edges (a) whose removal disconnects
the graph into two subgraphs and (b) of which no proper subset disconnects
the graph.
Fig. 37.17
A proper subset of the cutset is any subset of it other than the cutset itself, so
there must be no redundancy in the cutset. Thus, for example in Fig. 37.17a, the
broken line C1, which removes the edges ba, bf, and bc, defines a cutset
{ba, bf, bc}, since {b} and {a, c, d, e, f} are disconnected subgraphs. C2 in
Fig. 37.17b does not define a cutset, since the proper subset {ab, bf, bc} of its
edges already disconnects the graph.
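Both conditions in the definition of a cutset can be tested mechanically. In
the sketch below the second condition is checked by restoring the removed
edges one at a time: if every restored edge reconnects the graph, then no
proper subset of the set can disconnect it. The graph used is the one obtained
from the eulerian trail of Section 37.4, not the graph of Fig. 37.17, and the
function names are illustrative.

```python
def count_components(vertices, edges):
    """Number of connected components, using a simple union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})


def is_cutset(vertices, edges, candidate):
    """True if removing candidate leaves two parts and no edge is redundant."""
    remaining = [e for e in edges if e not in candidate]
    if count_components(vertices, remaining) != 2:
        return False
    # Restoring any single edge of a cutset must reconnect the graph.
    return all(count_components(vertices, remaining + [e]) == 1
               for e in candidate)


trail = "a b c d e f g e c g b f a".split()
edges = [frozenset(p) for p in zip(trail, trail[1:])]
vertices = sorted(set(trail))

print(is_cutset(vertices, edges, [frozenset("ab"), frozenset("af")]))
print(is_cutset(vertices, edges,
                [frozenset("ab"), frozenset("af"), frozenset("cd")]))
```

The first set isolates the vertex a and is a cutset. The second set contains
the redundant edge cd and is not.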
Self-test 37.3
(a) What are the degrees of the vertices in the spanning tree in Fig. 37.16(b)?
Design a spanning tree with vertex degrees {1, 1, 2, 2, 2}. (b) Indicate
spanning trees of the graph in Fig. 37.14 with vertex degrees (i) {1, 1, 2, 2, 2, 2, 2},
(ii) {1, 1, 1, 1, 2, 2, 4}.
Fig. 37.18 (a) Circuit with components R1 to R7, C1, C2 and V. (b) Equivalent
graph with nodes A, B, C, D, E.
The equivalent graph is shown in Fig. 37.18b: it has five nodes and 10 edges. Note
that it is a multigraph with two nodes joined by two edges and two nodes joined
by four edges.
A circuit loop in the circuit is a cycle in the graph.
Kirchhoff’s laws have already been stated in eqn (21.8), but for convenience they
are given again here in graph terms. They state (i) that the algebraic sum of the
voltages around any loop is zero, and (ii) that the algebraic sum of the currents
entering any node is zero.
In addition, for resistors we also have Ohm’s law which states that the voltage
across a resistor is directly proportional to the current flowing through it, that is
v∝i or v = Ri,
where the constant R is measured in units called ohms (Ω). Figure 37.19 shows a
circuit with two independent maintained current sources iX and iY: the symbol of
the circle enclosing an arrow represents a maintained current in the direction
of the arrow.
The corresponding six-node digraph with currents i1, i2, … , i8 in the directions
indicated is shown in Fig. 37.20. If any current turns out to be negative then its
direction will be opposite to that shown.
Fig. 37.19 Circuit with resistors R1 to R8 and current sources iX and iY.
Fig. 37.20 Digraph with nodes a to f and currents i1 to i8.
Now introduce nodal voltages va, vb, … , vf as shown in Fig. 37.21. The use
of nodal voltages means that effectively Kirchhoff’s first law is automatically
satisfied. The earthing at e makes ve = 0 and other voltages can be measured
relative to this zero ground potential.
This circuit has 13 unknowns: 8 currents and 5 nodal voltages. The problem
with circuits is the selection of the minimum number of consistent equations
from Kirchhoff’s laws and Ohm’s law sufficient to determine the unknowns.
The graph of this circuit is the same as that in Fig. 37.16a, and we shall use the
same spanning tree as shown in Fig. 37.16b. In this graph, the number of nodes n is
6, the number of edges e is 10. Hence the cotree has, from the previous section,
e − n + 1 = 10 − 6 + 1 = 5 links. Any cutset of the original graph which contains
one and only one branch of the spanning tree (the rest of the cutset consisting of
links) is known as a fundamental cutset of the circuit. Hence we can associate five
fundamental cutsets with the spanning tree in Fig. 37.16b. Five possible cutsets
C1, C2, … , C5 are shown in Fig. 37.22.
Fig. 37.21
Fig. 37.22
By repeated application of Kirchhoff’s second law to the nodes on one side of a
cutset, it follows that the algebraic sum of the currents crossing the cutset must
be zero. Hence the five cutset equations are:
C1: i1 − i3 + iX = 0, (37.1)
C2: i1 − i3 + i4 + i5 + i7 − i8 = 0, (37.2)
C3: i1 − i2 + i7 − i8 = 0, (37.3)
C4: i6 − i5 − i7 + i8 = 0, (37.4)
C5: iY − i8 = 0. (37.5)
These equations must be independent since each one contains a current from a
branch of the spanning tree which does not appear in any other equation. Further,
any non-fundamental cutset equation will be a linear combination of the five
fundamental cutset equations. The number of branches in the spanning tree
defines the number of independent equations.
We can also apply Ohm’s law to each resistor (note that current flows from
high to low potential). Thus the voltage difference across R1 is vc − vb, so that
i1 = (vc − vb)/R1. (37.6)
Similarly
i2 = (vf − vc )/R2, (37.7)
i5 = vf /R5, (37.10)
i7 = vc /R7, (37.12)
We can now substitute for the currents from (37.6) to (37.13) into (37.1) to (37.5)
resulting in five linear equations to determine the nodal voltages va , vb , vc , vd , vf in
terms of the known currents iX and iY . The remaining currents can then be
calculated from (37.6) to (37.13).
Example 37.3 Using the cutset method, find all currents and nodal voltages in
the circuit shown in Fig. 37.23.
Fig. 37.23 Circuit with R1 = 1/2 Ω, R2 = 3 Ω, R3 = 1 Ω, R4 = 2 Ω, R5 = 2 Ω,
iX = 2 A, iY = 1 A.
The circuit can be represented by a graph with five nodes (Fig. 37.24) with the currents
i1, i2, i3, i4, i5 in the directions shown.
Fig. 37.24
Fig. 37.25
A spanning tree with three links is shown in Fig. 37.25 together with cutsets C1, C2,
C3, C4. Hence Kirchhoff’s second law implies:
C1: i1 − i3 + i2 = 0, (37.14)
C2: iX − i3 + i2 = 0, (37.15)
C3: −iY + i5 − i3 + i2 = 0, (37.16)
C4: −iY + i4 + i2 = 0. (37.17)
With ve = 0, the currents in terms of the nodal voltages va, vb, vc, vd are, by Ohm’s law:
i1 = (va − vb)/R1 = 2(va − vb), (37.18)
i2 = (vc − vb)/R2 = (1/3)(vc − vb), (37.19)
i3 = (vb − vd)/R3 = vb − vd, (37.20)
i4 = (vc − vd)/R4 = (1/2)(vc − vd), (37.21)
i5 = vd/R5 = (1/2)vd. (37.22)
Eliminate the currents in (37.14) to (37.17) using (37.18) to (37.22):
2va − (10/3)vb + (1/3)vc + vd = 0, (37.23)
(4/3)vb − (1/3)vc − vd = 2, (37.24)
−(4/3)vb + (1/3)vc − (2/3)vd = 2, (37.25)
−(1/3)vb + (5/6)vc − (1/2)vd = 1. (37.26)
These are linear equations which can be solved using the methods of Chapter 12.
Computer algebra is also very useful in solving sets of equations of this type (see the
computer algebra applications for Chapter 12 in Chapter 42). The answers are
va = 5 V, vb = 4 V, vc = 4 V, vd = 2 V.
Since vc = vb, no current flows through the resistor on bc.
We can summarize the result for an earthed circuit which contains only resistors and
current sources. Suppose that the representative graph of the circuit contains n nodes and
e edges of which f contain known current sources. The circuit will have e − f unknown
currents and n − 1 unknown nodal voltages giving e − f + n − 1 unknowns in total. Its
spanning tree will have n − 1 edges which will lead to n − 1 fundamental cutset equations,
and Ohm’s law will apply to e − f resistors. Hence we shall always have a consistent set
of e − f + n − 1 equations to find the unknowns.
This result can be extended to circuits with current sources, voltage sources (batteries),
and resistors. If the representative graph has n nodes and e edges of which f contain
current sources and s maintained voltage sources, then the number of unknown currents
will be e − f and the number of unknown nodal voltages will be n − 1 − s since the nodal
voltage difference across a battery will be known. Hence the number of unknowns is
e − f + n − 1 − s which will satisfy n − 1 cutset equations and e − f − s Ohm’s laws.
37.7 Signal-flow graphs
We wish to find Q(s) in terms of P(s), G(s), and H(s), from the equations (37.27)
to (37.29). Thus, from (37.28),
Q(s) = G(s)A(s) = G(s)[P(s) − F(s)] = G(s)[P(s) − H(s)Q(s)].
Solving for Q(s),
Q(s) = [G(s)/(1 + G(s)H(s))] P(s).
This is the closed-loop transfer function. The actual signal can be obtained by
finding the inverse Laplace transform for Q(s). Hence the system is equivalent to
that shown in Fig. 37.27.
Fig. 37.27 Block-reduced diagram for Fig. 37.26.
Fig. 37.28
If the feedback reinforces the input signal it is called positive feedback.
Figure 37.28 shows a multiple-feedback control system with a positive and a
negative feedback. The output signal is given by
Q(s) = G1(s)G2(s)G3(s)P(s)/[1 − G2(s)H1(s) + G1(s)G2(s)G3(s)H2(s)], (37.30)
which can be obtained by the method of block-diagram reduction. For example,
the feedback through H1 makes the system equivalent to that shown in Fig. 37.29.
We can now combine the series devices which reduce the system to the negative-
feedback control system considered at the beginning of this section. The details
are omitted here.
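The steps of the reduction can be sketched as follows, with the argument (s)
dropped. The symbol G' for the combined forward path is introduced here for
convenience and is not notation from the text.

```latex
\begin{align*}
\text{positive-feedback loop through } H_1:\quad
  & \frac{G_2}{1 - G_2 H_1}, \\
\text{devices in series:}\quad
  & G' = G_1 \cdot \frac{G_2}{1 - G_2 H_1} \cdot G_3
       = \frac{G_1 G_2 G_3}{1 - G_2 H_1}, \\
\text{negative-feedback loop through } H_2:\quad
  & \frac{Q}{P} = \frac{G'}{1 + G' H_2}
       = \frac{G_1 G_2 G_3}{1 - G_2 H_1 + G_1 G_2 G_3 H_2}.
\end{align*}
```

The last line agrees with (37.30).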
Fig. 37.29 First stage in the block reduction of the multiple-feedback control system.
This block-reduction method can get quite complicated for a complex feedback
system. Instead of using block reduction in this way, represent the system by a
weighted digraph as shown in Fig. 37.30, where the weights are the transfer
functions, except that the edges representing the input and output are assigned
weight 1 since they carry no devices. Also the negative feedback is replaced by
−H2(s), to make sure that it reduces the input into G1(s). This is the signal-flow
graph of the system. Let the inputs into the nodes be x1, x2, x3, and x4 as shown;
then, for the positive-feedback cycle,
x3 = G2x2, x2 = G1x1 + H1x3.
(The argument (s) has now been dropped from the working.) Hence
x3 = G1G2x1/(1 − G2H1).
In other words, we can replace (a) by (b) in Fig. 37.31.
Fig. 37.30 Signal-flow graph for the multiple-feedback control system shown in Fig. 37.28.
Fig. 37.31 (a) Edges G1 (x1 to x2) and G2 (x2 to x3) with the feedback edge H1
(x3 to x2). (b) The equivalent single edge G1G2/(1 − G2H1) from x1 to x3.
There are other rules, and a complete list now follows for the replacements for
subgraphs in the graph.
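(a) Multiple edges. See Fig. 37.32. This follows since
x2 = Gx1 + Hx1 = (G + H)x1.
Fig. 37.32 Multiple edges: edges G and H from x1 to x2 are replaced by the single
edge G + H.
(b) Edges in series. See Fig. 37.33. This follows since x2 = Gx1 and
x3 = Hx2 = GHx1.
Fig. 37.33 Edges in series: edge G from x1 to x2 followed by edge H from x2 to x3
is replaced by the single edge GH from x1 to x3.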
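(c) Cycles. See Fig. 37.34. This follows since
x3 = Hx2 and x2 = Gx1 + Jx3.
Fig. 37.34 Cycles: edge G from x1 to x2 and edge H from x2 to x3, with the return
edge J from x3 to x2, are replaced by the single edge GH/(1 − HJ) from x1 to x3.
[Figure: a loop H at x2 at the end of an edge G from x1 is replaced by the single
edge G/(1 − H) from x1 to x2.]
[Figure: stems H (x2 to x3) and J (x2 to x4) following an edge G from x1 to x2 are
replaced by the edges GH from x1 to x3 and GJ from x1 to x4.]
[Figure: reduction of the signal-flow graph of Fig. 37.30 by two applications of
rule (c), through the edges G1G2/(1 − H1G2) and G1G2G3/(1 − H1G2), to the single
edge G1G2G3/(1 − H1G2 + G1G2G3H2) from P(s) to Q(s).]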
Example 37.4 Find the output–input relation in the signal-flow graph shown in
Fig. 37.38.
Applying rule (a) to the multiple edge, and rule (c) to the cycle, the graph is reduced
to Fig. 37.39. Apply the series rule to the divided edges to give Fig. 37.40. Finally the
multiple-edge and series rules give Fig. 37.41. Thus the output is given by
q = [abd/(1 − bc) + he(g + f)]p.
In the actual control system a, b, c, … will be transfer functions.
Figs 37.38 to 37.41
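The reduction in Example 37.4 can be reproduced by treating each replacement
rule as a small function. The sketch below assumes the layout suggested by the
reduction steps: an upper path a, b, d with the return edge c around b, and a
lower path h, then g and f in parallel, then e. The function names and the
numerical weights are illustrative only; exact fractions are used so that the
comparison is not affected by rounding.

```python
from fractions import Fraction as F


def parallel(g, h):
    """Rule (a): multiple edges g and h between the same two nodes."""
    return g + h


def series(g, h):
    """Series rule: edge g followed by edge h."""
    return g * h


def cycle(g, h, j):
    """Rule (c): edge g, then edge h with a return edge j around it."""
    return g * h / (1 - h * j)


def transfer(a, b, c, d, e, f, g, h):
    """Weight of the single edge from p to q after reduction."""
    upper = series(cycle(a, b, c), d)
    lower = series(series(h, parallel(g, f)), e)
    return parallel(upper, lower)


# Arbitrary rational weights chosen for the check.
a, b, c, d = F(1, 2), F(1, 3), F(1, 4), F(2)
e, f, g, h = F(3), F(1, 5), F(1, 7), F(5)
T = transfer(a, b, c, d, e, f, g, h)
print(T, T == a * b * d / (1 - b * c) + h * e * (g + f))
```

The weight found by applying the rules equals abd/(1 − bc) + he(g + f), so
that q is this weight multiplied by p.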
Self-test 37.4
Suppose that c is in the opposite direction in the signal-flow graph Fig. 37.38.
Find the new output–input relation.
37.8 Planar graphs
Fig. 37.42
The graph in Fig. 37.42 is an example of a bipartite graph in which one set
of vertices may be connected to another set of vertices, but not to vertices in the
same set. If every vertex in one set is connected by one edge to every vertex in the
other set then it is called a complete bipartite graph. If the sets have m and n
vertices respectively, then the notation Km,n denotes the complete bipartite graph.
Figure 37.42 shows the graph K3,3 and this graph is not planar. Check that the
graphs K2,2 and K2,3 are planar.
In planar graphs there is a relation between the numbers of vertices, edges, and
faces. In a plane drawing of a graph, the plane is divided into regions called faces.
One face is the region external to the graph. Figure 37.44 shows a planar graph
with five vertices and seven edges, and with four faces: A, B, C, and the external
face D.
A remarkable formula, due to Euler, links the numbers of vertices, edges, and
faces of a graph.
Theorem (Euler). Suppose that the connected graph G has a planar drawing, and
let v be the number of vertices, e the number of edges, and f the number of faces
of G. Then
v − e + f = 2.
Proof. For the graph G, define a spanning tree (see, for example, Fig. 37.45). The
spanning tree must have v vertices and v − 1 edges (see Section 37.5). It must also
have just one face. Since
v − (v − 1) + 1 = 2,
Euler’s formula holds for the spanning tree. Successively restore the other edges
of the graph. Each time an extra edge is added, a face is divided and one extra face
is added. The extra face cancels the extra edge in v − e + f, so the value 2 found
for the spanning tree is unchanged. Hence
v − e + f = 2
for the reconstructed graph G.
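The argument of the proof can be followed numerically. The sketch below starts
from a spanning tree, which has one face, and restores the remaining edges one
at a time, checking v − e + f at every stage. The function names are
illustrative; the cube, with 8 vertices, 12 edges and 6 faces, is included as
a familiar additional check that is not taken from the text.

```python
def faces(v, e):
    """Number of faces of a connected planar graph, from v - e + f = 2."""
    return 2 - v + e


def rebuild(v, e):
    """Restore edges to a spanning tree, tracking (edges, faces) at each step."""
    edges, f = v - 1, 1          # spanning tree: v - 1 edges and one face
    history = [(edges, f)]
    while edges < e:
        edges += 1               # each restored edge ...
        f += 1                   # ... divides a face, adding one more
        assert v - edges + f == 2
        history.append((edges, f))
    return history


print(faces(5, 7))       # the planar graph of Fig. 37.44 has 4 faces
print(rebuild(5, 7))
```

For v = 5 and e = 7 the tree has 4 edges and 1 face, and restoring the three
remaining edges gives 4 faces.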
Fig. 37.44 A planar graph with five vertices, seven edges, and four faces.
Fig. 37.45 A graph with a spanning tree.
The complete graph with n vertices is denoted by Kn. Since every vertex is joined
to the other n − 1 vertices, and each edge is shared by two vertices, Kn has
n(n − 1)/2 edges. The graphs of K2, K3, K4, and K5 are shown in Fig. 37.46. Of
these graphs, K2, K3, and K4 are planar, but K5 and all succeeding complete
graphs are not.
Fig. 37.46
The graphs K3,3 and K5 are the keys to tests for planarity of graphs, and hence to
whether it is possible to design, for example, a plane printed-circuit board to make
the required connections between electronic components. It was proved by
Kuratowski in 1930 that every non-planar graph contains a subgraph which is either
K3,3 or K5, or K3,3 or K5 with additional vertices on its edges.
Further discussion of graph theory with many applications can be found in the
introductory text by Wilson and Watkins (1990).
Self-test 37.5
(a) A regular dodecahedron has 12 faces (pentagons) and 30 edges. How
many vertices does it have?
(b) An icosahedron has 20 faces (triangles) and 12 vertices. How many edges
does it have?
37.9 Further applications
Braced frameworks
Consider a frame which consists of four struts in the shape of a rectangle
(Fig. 37.47a) with pin joints at each corner. Without a diagonal tie the structure will
not support a vertical load, but will collapse into a parallelogram as shown in
Fig. 37.47b. The structure can be made rigid and load bearing by the insertion of
a diagonal strut as in Fig. 37.48.
Fig. 37.47
Fig. 37.50
and c1 in the bipartite graph. No edge joins r1 and c3 since this cell is not braced.
The bipartite graph representing the framework is shown in Fig. 37.50. If the
graph is connected, then the framework is braced since the shearing of any cell or
group of cells is not then possible. The graph is connected in this case, and the
framework is braced. Can any braces be removed in such a way that the framework
is still braced? Any brace which is removed must not disconnect the graph.
If the graph contains a cycle (Section 37.4) then removing any one edge of the
cycle will not disconnect the graph. This removal rule can be applied to each cycle
in the graph. If, at the end of this process, there are no cycles remaining and the
graph remains connected, then the framework is said to have a minimum bracing.
The framework graph in Fig. 37.50 contains just one cycle, namely
r1 c1 r3 c3 r4 c6 r2 c2 r1 (see Fig. 37.49). Any edge can be removed from this cycle
leaving a minimum bracing. The removal of any further edges will disconnect the
graph.
If every cell is braced in a framework then the bipartite graph will be complete,
and the framework will be seriously overbraced. You might note that a complete
bipartite graph Km,n has mn edges but a minimum bracing for an m × n framework
has m + n − 1 edges: for example, if m = 5 and n = 6 then mn = 30 whilst
m + n − 1 = 10.
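The connectivity test for a braced framework is simple to automate. In the
sketch below a brace in row r and column c is recorded as the pair (r, c), and
the framework is taken to be braced when the bipartite graph on the rows and
columns is connected. The 2 × 3 framework and its braces are hypothetical, and
the function names are illustrative.

```python
def is_braced(m, n, braces):
    """True if the bipartite graph of an m x n framework is connected."""
    parent = list(range(m + n))      # rows are 0..m-1, columns m..m+n-1

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for r, c in braces:              # rows and columns are numbered from 1
        parent[find(r - 1)] = find(m + c - 1)
    return len({find(x) for x in range(m + n)}) == 1


def minimum_braces(m, n):
    """Number of edges in a minimum bracing of an m x n framework."""
    return m + n - 1


# Hypothetical 2 x 3 framework with four braced cells.
braces = [(1, 1), (1, 2), (2, 2), (2, 3)]
print(is_braced(2, 3, braces), len(braces) - minimum_braces(2, 3))
print(is_braced(2, 3, [(1, 1), (2, 2), (2, 3)]))
```

The four braces form a minimum bracing, since the graph is connected and has
m + n − 1 = 4 edges. Removing the brace in cell (1, 2) disconnects the graph,
so the framework is no longer braced.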
Figure 37.51 shows an unbraced 4 × 5 framework, its (disconnected) graph, and
the same framework sheared.
Fig. 37.51
and we therefore join a and b by an edge. Lanes a and c are also compatible, but
a and e are not, and so on. The graph G in Fig. 37.53 shows which lanes are
compatible, and is known as the compatibility graph.
Table 37.2
Subgraph: abcd, abfg, def, fgh
Fig. 37.54
The total waiting time for the traffic at the junction is a measure of the
efficiency of the timings and phases. Let ta, tb, tc, … be the waiting times of the
lanes so that, from Fig. 37.54, we can see that ta = (1/2)T, tb = (1/2)T,
tc = (3/4)T, etc. Hence the total waiting time WT is given by
WT = ta + tb + ··· + th
   = (1/2)T + (1/2)T + (3/4)T + (1/2)T + (3/4)T + (1/4)T + (1/2)T + (3/4)T = (9/2)T.
Can the waiting time be reduced within the time constraints by choosing either
a different set of subgraphs to cover G, or a different sequence of timings?
Figure 37.55 shows the same choice of subgraphs but with different timings.
The result is a slightly shorter waiting time of (22/5)T.
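The total waiting time can be computed from the phases and their durations.
The sketch below assumes that a lane waits for the whole cycle except the
phases in which it appears, and that the four phases abcd, abfg, def, fgh of
Fig. 37.54 each last (1/4)T; both assumptions are inferred from the waiting
times quoted above. The second set of durations is hypothetical: it is one
choice in fifths of T that gives (22/5)T and is not necessarily the timing of
Fig. 37.55.

```python
from fractions import Fraction as F


def total_waiting(phases, durations):
    """Total waiting time of all lanes, in units of the cycle time T."""
    lanes = sorted(set("".join(phases)))
    total = F(0)
    for lane in lanes:
        # Time for which the lane is moving: the phases that include it.
        moving = sum(d for p, d in zip(phases, durations) if lane in p)
        total += 1 - moving
    return total


phases = ["abcd", "abfg", "def", "fgh"]
equal = [F(1, 4)] * 4
uneven = [F(2, 5), F(1, 5), F(1, 5), F(1, 5)]   # hypothetical timing
print(total_waiting(phases, equal))    # 9/2
print(total_waiting(phases, uneven))   # 22/5
```

Giving more time to the phases that serve four lanes reduces the total,
because each unit of time in such a phase removes the wait of four lanes
rather than three.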
Problems
37.1 (Section 37.2). Write down the degree of each vertex in the graph in
Fig. 37.56.
37.4 Sketch the eight regular graphs with six vertices. How many of them are
connected?
(a) A =
⎡0 1 1 1 1⎤
⎢1 0 1 1 1⎥
⎢1 1 0 1 1⎥
⎢1 1 1 0 1⎥
⎣1 1 1 1 0⎦,
(b) A =
⎡0 2 0 0⎤
⎢2 0 1 1⎥
⎢0 1 0 1⎥
⎣0 1 1 0⎦.
37.7 Write down the adjacency matrices of the graphs in Fig. 37.7. Note that a
single loop introduces an element 1 into the appropriate position on the leading
diagonal. What characterizes the matrix of a disconnected graph?
Fig. 37.57
Fig. 37.58
Fig. 37.59
37.14 (Section 37.6). Figure 37.60 shows a circuit with an independent current
source i0. Represent the circuit by a graph. How many vertices does the graph
have?
Fig. 37.60
Fig. 37.61 (R1 = 3 Ω, R2 = 2 Ω, R3 = 1 Ω, R4 = 1 Ω, R5 = 2 Ω, R6 = 1 Ω,
iY = 2 A, iZ = 2 A)
find the remaining voltages va, vb, vd, ve.
37.16 (Section 37.6). Figures 37.62a,b show two circuits with current sources
and resistors. Use the cutset method to find the nodal voltages and currents
through the resistors.
Fig. 37.62 (labels include R3 = 3 Ω, R4 = 1 Ω, R5 = 2 Ω, R6 = 1 Ω, R7 = 1 Ω,
R8 = 1 Ω, iX = 1 A, iY = 2 A)
37.17 Complete the block-reduction method for the multi-feedback control system
shown in Fig. 37.28.
37.18 (Section 37.5). Figure 37.63 shows a positive-feedback control system. If
P(s) is the system input, find its output Q(s), and the transfer function of a
single equivalent device.
Fig. 37.63
37.19 (Section 37.7). Find the outputs in the systems shown in Figs 37.64a,b by
progressively replacing parts of the system by equivalent devices until just one
device remains. Find the transfer function of the resulting equivalent single
device.
Fig. 37.64
Fig. 37.65
37.20 (Section 37.7). Reduce each of the signal-flow graphs in Figs 37.65a,b,c,d
to an equivalent single edge, and (e) to a stem, and find the transfer function
in each case.
37.21 (Section 37.8). Label the edges, vertices, and faces of the graphs shown
in Figs 37.66a,b and verify Euler’s formula.
37.24 (Section 37.1). List all the paths between S and T in the network given in
Fig. 37.6, and hence find the shortest and longest paths. (This method of simply
listing all paths can become very extensive for larger networks: efficient
algorithms are really required to reduce the number of calculations.)
37.25 (Section 37.9). Show that the framework in Fig. 37.67 is overbraced. How
many ties can be removed to leave a minimum bracing?
overbraced with each diagonal tie as an edge in at least one cycle in the
associated bipartite graph. What is the minimum number of ties which must be
added?
CONTENTS
In many applications, functions can only take discrete values – that is, they
cannot (for various reasons) take a continuous spectrum of values. It is reasonable to
model the temperature in a room by a function which varies continuously with
time – most of the calculus in this book is concerned with such functions. On the
other hand, the population size of a country can only take integer values. As
births and deaths occur, the population size is discontinuous in time, and the
graph of population size against time will be a step function. Between births and
deaths the population number will be constant so that we are only concerned
with changes which take place at these events. In this problem jumps occur at
variable time intervals.
We can obtain discrete data from a continuous signal or function by sampling
the signal at regular time steps rather than keeping a continuous record. This is
often the situation in microprocessor-driven operations.
The progress of events is often described in the form of equations linking several
successive events: so-called difference equations. The reader may notice analogies
between the solutions of these and the solutions of differential equations.
38.1 Discrete variables
Pn = (1 + I)P_{n−1}. (38.2)
values of Pn at the integer values 1, 2, … in terms of the immediately preceding
value. Treating the variable as n, the difference in this case is 1. The notation P(n)
instead of Pn is often used to emphasize the function aspect of P but we have
chosen the more economical subscript form Pn.
It is fairly easy to solve (38.2) by repeated application of the formula starting
with (38.1). Thus
P2 = (1 + I)P1 = (1 + I)^2 P0,
P3 = (1 + I)P2 = (1 + I)^3 P0,
and so the formula
Pn = (1 + I)^n P0 (38.3)
holds at least for values of n up to 3. Suppose that (38.3) holds for n = k. Then
(38.2) implies that
P_{k+1} = (1 + I)Pk = (1 + I)^(k+1) P0.
So the same formula holds for P_{k+1}. Hence, if the result is true for k then it is
also true for k + 1. Equation (38.1) confirms that it is true for k = 1. It follows
sequentially that it is true for n = 2, n = 3, and so on. (This method of proof is
known as induction.)
Example 38.1 £1000 is invested for 5 years at the following rates: (a) 5%
annually; (b) $\tfrac{5}{12}$% calendar monthly; (c) $\tfrac{5}{365}$% daily (ignoring leap years).
Calculate the final amount in the account in each case.
In each case the formula is
$P_n = (1 + I)^nP_0,$
with P0 = 1000, but the I and n differ.
(a) This is the original problem with n = 5 and I = 0.05. Hence
$P_5 = (1 + 0.05)^5 \times 1000 = 1.05^5 \times 1000 = 1276.28$
(in £, to the nearest penny).
(b) This account has 12 compounding periods each year, giving a total of 60 over the
5 years. Hence we require
$P_{60} = \left(1 + \frac{0.05}{12}\right)^{60} \times 1000 = 1283.36.$
(c) For the daily rate, there are 365 × 5 = 1825 compounding periods. Thus we require
$P_{1825} = \left(1 + \frac{0.05}{365}\right)^{1825} \times 1000 = 1284.00.$
There is a slight gain with increasing number of compounding periods.
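The three cases of Example 38.1 can be checked numerically. The following is a minimal sketch in Python; the function names `compound` and `compound_by_recurrence` are illustrative choices, not notation from the text. It evaluates the closed form (38.3) and also steps through the recurrence (38.2) to confirm that the two agree.

```python
def compound(p0, rate, periods):
    """Closed form (38.3): P_n = (1 + I)^n * P_0."""
    return p0 * (1 + rate) ** periods


def compound_by_recurrence(p0, rate, periods):
    """Repeated application of (38.2): P_n = (1 + I) * P_{n-1}."""
    p = p0
    for _ in range(periods):
        p = (1 + rate) * p
    return p


# (label, per-period rate I, number of periods n) for Example 38.1
cases = [
    ("annual", 0.05, 5),
    ("monthly", 0.05 / 12, 60),
    ("daily", 0.05 / 365, 1825),
]
for label, rate, n in cases:
    print(label, round(compound(1000, rate, n), 2))
```

The printed amounts should reproduce the slight gain noted above as the number of compounding periods increases.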
The following financial application of a loan repayment leads to a difference
equation.
DIFFERENCE EQUATIONS
The nth payment A is made at the end of year n, after which the debt outstanding
is denoted by $u_n$. The payment A comprises:
(interest owed on $u_{n-1}$ through year n) + (a capital repayment)
Therefore
$A = Iu_{n-1} + (u_{n-1} - u_n)$ or $u_n = -A + (1 + I)u_{n-1}$ (38.4)
Example 38.2 The sum of £50 000 is borrowed over 25 years to be repaid in
equal instalments, the interest on the outstanding balance in any year being 8%.
Find the annual repayments over the term of the loan.
In the notation above, P = £50 000, I = 0.08, N = 25. Therefore the annual repayment to
the nearest penny is
$A = \frac{I(1 + I)^N P}{(1 + I)^N - 1} = \frac{0.08 \times 1.08^{25} \times 50\,000}{1.08^{25} - 1} = \pounds 4683.94$
The total repayment over 25 years is
NA = 25A = £117 098.47.
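As a check on Example 38.2, the sketch below (Python; the function names are hypothetical) computes the repayment from the formula used in the example and then steps the debt forward with the recurrence (38.4) to confirm that nothing is left owing after N payments.

```python
def annual_repayment(principal, rate, years):
    """A = I (1 + I)^N P / ((1 + I)^N - 1), as used in Example 38.2."""
    growth = (1 + rate) ** years
    return rate * growth * principal / (growth - 1)


def outstanding_debt(principal, rate, payment, years):
    """Apply (38.4), u_n = -A + (1 + I) u_{n-1}, starting from u_0 = P."""
    u = principal
    for _ in range(years):
        u = -payment + (1 + rate) * u
    return u


A = annual_repayment(50_000, 0.08, 25)
print(round(A, 2))                                   # annual repayment
print(round(25 * A, 2))                              # total repaid
print(abs(outstanding_debt(50_000, 0.08, A, 25)))    # should be close to 0
```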
integer values of x in the usual cartesian axes. The series of dots in Fig. 38.1 is a
graphical representation of the sequence.
The implied limiting value of $u_n$ as $n \to \infty$ for this particular sequence suggests
that $u_n = \tfrac{1}{2}$ is a constant solution of the difference equation (38.9), and this can be
confirmed. We can find all constant solutions by simply putting $u_n = u$ for all n.
From (38.9), the constant solutions are given by
$u = 2u(1 - u)$, or $2u^2 - u = 0$,
which implies that $u = 0$ and $u = \tfrac{1}{2}$ are solutions. These are also known as the fixed
points or equilibrium values of the difference equation.
You might notice, by trial computation, that the solutions of (38.9) vary quantitatively
with the initial value, $u_0$. If $0 < u_0 < 1$, then $u_n$ appears to approach $\tfrac{1}{2}$ as
n becomes large, but, if $u_0 > 1$ or $u_0 < 0$, then $u_n$ becomes unbounded for large n.
We shall discuss the logistic equation further in Section 38.5.
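The trial computation mentioned above can be sketched in a few lines of Python. The sketch assumes that (38.9) is $u_{n+1} = 2u_n(1 - u_n)$, which is the form implied by the fixed-point equation $u = 2u(1 - u)$; the function name `iterate` and the chosen starting values are illustrative only.

```python
def iterate(u0, steps=50):
    """Iterate u_{n+1} = 2 u_n (1 - u_n) (assumed form of (38.9))."""
    u = u0
    for _ in range(steps):
        u = 2 * u * (1 - u)
    return u


print(iterate(0.2))        # a start inside (0, 1): approaches 1/2
print(iterate(0.9))        # another start inside (0, 1)
print(iterate(1.5, 8))     # a start above 1: grows without bound
print(iterate(-0.1, 8))    # a start below 0: grows without bound
```

Only a few steps are used for the last two starts because the magnitude of the iterates roughly squares at every step.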
For the second-order difference equations, the same process gives equilibrium
values. For example, if
$u_{n+2} - 2u_{n+1} + 4u_n = 6$,
then this equation has an equilibrium value obtained by putting $u_{n+2} = u_{n+1} = u_n = u$,
so that
$u - 2u + 4u = 6$ or $u = 2$.
On the other hand, the second-order difference equation (38.7) has no equilibrium
values since
$u - 3u - 2u - n^2 = -4u - n^2$
can never be zero for constant u and all n.
Self-test 38.1
The sum of £100 000 is borrowed over a 25 year term at an annual interest
rate of 6.5%. Find the annual repayment assuming that the interest rate
[Figure: the lines $y = x$ and $y = \tfrac{1}{2}x + 1$, intersecting at (2, 2), with cobweb points $P_0$, $P_1$, $Q_1$, $P_2$.]
Fig. 38.2
For a general difference equation un+1 = f(un), the cobweb construction takes
place between the straight line y = x and the curve y = f(x).
Fig. 38.3 Cobweb for $u_{n+1} = -\tfrac{1}{2}u_n + \tfrac{1}{2}$ with $u_0 = \tfrac{3}{4}$.
Fig. 38.4 Cobweb for $u_{n+1} = -\tfrac{3}{2}u_n + \tfrac{3}{2}$ with $u_0 = \tfrac{3}{4}$.
Fig. 38.5 Cobweb for $u_{n+1} = -u_n + 1$ with $u_0 = \tfrac{3}{4}$.
(a) Plot the lines $y = x$ and $y = -\tfrac{1}{2}x + \tfrac{1}{2}$. They intersect at the fixed point $(\tfrac{1}{3}, \tfrac{1}{3})$. Starting
from $P_0$: $(\tfrac{3}{4}, 0)$, the cobweb traces $P_0P_1Q_1P_2Q_2P_3 \ldots$ in Fig. 38.3. Evidently it approaches the
fixed point as $n \to \infty$, indicating stability.
(b) The lines are $y = x$ and $y = -\tfrac{3}{2}x + \tfrac{3}{2}$. The fixed point is at $(\tfrac{3}{5}, \tfrac{3}{5})$, and the cobweb
path is $P_0P_1Q_1P_2Q_2 \ldots$ in Fig. 38.4. The path moves away from the fixed point, implying
its instability.
(c) The lines are $y = x$ and $y = -x + 1$ with fixed point $(\tfrac{1}{2}, \tfrac{1}{2})$. The path starting at $P_0$:
$(\tfrac{3}{4}, 0)$ follows the rectangle $P_1Q_1P_2Q_2$, indicating periodicity (Fig. 38.5). This is true for
any starting value except that of the fixed point itself.
Graphs of the sequences un versus n are shown in Fig. 38.6.
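The three behaviours in cases (a), (b), and (c) can also be seen by direct iteration rather than by drawing the cobweb. The sketch below is in Python; the helper `orbit` and its parameter names are illustrative. It writes each equation in the form $u_{n+1} = -ku_n + a$ with $k = \tfrac{1}{2}$, $\tfrac{3}{2}$, and 1, taking the coefficients as read from the lines quoted in (a), (b), and (c).

```python
def orbit(k, a, u0, steps):
    """Return [u_0, u_1, ..., u_steps] for u_{n+1} = -k * u_n + a."""
    us = [u0]
    for _ in range(steps):
        us.append(-k * us[-1] + a)
    return us


stable = orbit(0.5, 0.5, 0.75, 40)     # case (a): fixed point 1/3
unstable = orbit(1.5, 1.5, 0.75, 20)   # case (b): fixed point 3/5
periodic = orbit(1.0, 1.0, 0.75, 4)    # case (c): fixed point 1/2

print(stable[-1])     # close to 1/3
print(unstable[-1])   # far from 3/5
print(periodic)       # alternates between 0.75 and 0.25
```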
38.4
CONSTANT-COEFFICIENT LINEAR DIFFERENCE EQUATIONS
Stability
The first-order difference equation $u_{n+1} = -ku_n + a$ has a fixed point at
$u = a/(1 + k)$, $(k \ne -1)$. The fixed point is stable if $|k| < 1$, unstable if $|k| > 1$,
and periodic if $k = 1$.
If $k = -1$, the equation has no fixed point unless $a = 0$. (38.11)
Self-test 38.2
Consider the difference equation $u_{n+1} = f(u_n) = \tfrac{1}{2} - u_n^2$. Plot the curve $y = f(x) =
\tfrac{1}{2} - x^2$ and the straight line $y = x$. What are the coordinates of the fixed point
in the x,y plane? Given $u_1 = 0.2$, compute $u_2$, $u_3$, $u_4$, $u_5$. Draw the corresponding
cobweb. Does it indicate stability of the fixed point?
where a and b are constants and f(n) is a given function. The methods generalize in
a fairly obvious way to higher-order systems.
There are many parallels between the difference equation (38.12) and second-
order constant-coefficient equations (Chapters 18–19). The equation is said to be
homogeneous if f(n) = 0, and inhomogeneous otherwise, just as in the case of
second-order differential equations. However, this section is self-contained and
reference back is not necessary. The general solution of the inhomogeneous case
requires that of the homogeneous case: hence we start with the latter.
Homogeneous equations
We can see how to proceed by looking at the first-order constant-coefficient equation
$u_{n+1} - cu_n = 0.$ (38.13)
As can be seen from (38.2) or verified directly, the general solution of this equation
is
$u_n = Ac^n,$ (38.14)
Distinct roots
The general solution of $u_{n+2} + 2au_{n+1} + bu_n = 0$ for distinct roots $p_1$ and $p_2$ of
$p^2 + 2ap + b = 0$ is
$u_n = Ap_1^n + Bp_2^n$, for any constants A and B. (38.17)
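The distinct-roots formula (38.17) can be compared with direct iteration of the recurrence. The following Python sketch is illustrative only: the function names are hypothetical, and fitting A and B to two starting values $u_0$, $u_1$ is an added step not spelled out in the box above. It uses `cmath` so that the same code also covers complex roots.

```python
import cmath


def closed_form(a, b, u0, u1):
    """Return n -> u_n for u_{n+2} + 2a u_{n+1} + b u_n = 0 with distinct roots."""
    disc = cmath.sqrt(a * a - b)
    p1, p2 = -a + disc, -a - disc          # roots of p^2 + 2ap + b = 0
    # Fit A + B = u0 and A*p1 + B*p2 = u1.
    B = (u1 - p1 * u0) / (p2 - p1)
    A = u0 - B
    return lambda n: A * p1 ** n + B * p2 ** n


def by_recurrence(a, b, u0, u1, n):
    """Return u_n by stepping u_{n+2} = -2a u_{n+1} - b u_n."""
    prev, cur = u0, u1
    for _ in range(n):
        prev, cur = cur, -2 * a * cur - b * prev
    return prev


# u_{n+2} + 2u_{n+1} - 3u_n = 0, i.e. a = 1, b = -3, with roots 1 and -3
u = closed_form(1, -3, 1, 2)
print([round(u(n).real, 6) for n in range(6)])
print([by_recurrence(1, -3, 1, 2, n) for n in range(6)])
```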
$u_{n+2} - 2au_{n+1} + a^2u_n = 0,$
Equal roots
The general solution of $u_{n+2} - 2au_{n+1} + a^2u_n = 0$ is
$u_n = (A + Bn)a^n.$ (38.19)
Hence
$u_n = A\,2^{\frac{1}{2}n} e^{\frac{3}{4}\pi i n} + B\,2^{\frac{1}{2}n} e^{-\frac{3}{4}\pi i n}$
Complex roots, $\alpha \pm i\beta = re^{\pm i\theta}$
The general complex solution of
$u_{n+2} + 2au_{n+1} + bu_n = 0,$
where $a^2 < b$, is
$u_n = A(\alpha + i\beta)^n + B(\alpha - i\beta)^n.$
The general real solution is
$u_n = r^n(C\cos n\theta + D\sin n\theta).$ (38.20)
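The real form (38.20) can be checked against the recurrence in the same way. The sketch below (Python; the names `real_solution` and `by_recurrence` are hypothetical) takes the root $\alpha + i\beta$ with $\beta > 0$, converts it to polar form, and fits C and D to two assumed starting values. The test equation $u_{n+2} + 2u_{n+1} + 2u_n = 0$ has roots $-1 \pm i$, so $r = \sqrt{2}$ and $\theta = \tfrac{3}{4}\pi$.

```python
import cmath
import math


def real_solution(a, b, u0, u1):
    """Return n -> r^n (C cos n*theta + D sin n*theta); requires a^2 < b."""
    root = complex(-a, math.sqrt(b - a * a))   # alpha + i*beta with beta > 0
    r, theta = cmath.polar(root)
    C = u0                                     # from n = 0
    D = (u1 / r - C * math.cos(theta)) / math.sin(theta)   # from n = 1
    return lambda n: r ** n * (C * math.cos(n * theta) + D * math.sin(n * theta))


def by_recurrence(a, b, u0, u1, n):
    """Return u_n by stepping u_{n+2} = -2a u_{n+1} - b u_n."""
    prev, cur = u0, u1
    for _ in range(n):
        prev, cur = cur, -2 * a * cur - b * prev
    return prev


u = real_solution(1, 2, 1, 0)
print([round(u(n), 6) for n in range(6)])
print([by_recurrence(1, 2, 1, 0, n) for n in range(6)])
```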
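$u_{n+2} + u_n = 0.$
The characteristic equation is
$p^2 + 1 = 0,$
giving roots $p_1 = i$, $p_2 = -i$. Hence
$u_n = Ai^n + B(-i)^n.$
In polar form, $i = e^{\frac{1}{2}\pi i}$, $-i = e^{-\frac{1}{2}\pi i}$. Hence the real form of the solution is
$u_n = C\cos\tfrac{1}{2}n\pi + D\sin\tfrac{1}{2}n\pi,$
by (38.20) with $r = 1$ and $\theta = \tfrac{1}{2}\pi$.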
Inhomogeneous equations
(see (38.12)). Let $u_n = v_n + q_n$, where $v_n$ is the general solution of the corresponding
homogeneous equation. Substitute this form of $u_n$ into (38.21):
$(v_{n+2} + q_{n+2}) + 2a(v_{n+1} + q_{n+1}) + b(v_n + q_n) = f(n),$
or
$(v_{n+2} + 2av_{n+1} + bv_n) + (q_{n+2} + 2aq_{n+1} + bq_n) = f(n).$
Since $v_n$ satisfies the homogeneous equation, it follows that
$q_{n+2} + 2aq_{n+1} + bq_n = f(n),$
which means that $q_n$ must be a particular solution of the inhomogeneous equation.
As in differential equations, $v_n$ is known as the complementary function.
We construct particular solutions by appropriate choices of functions usually
containing adjustable parameters which are suggested by the form of the function
f(n). If a particular choice fails, then we reject it and try something else.
In this case we expect the choice qn = C to fail, since it must make the left-hand side of
the difference equation vanish. When this happens, we try
qn = Cn.
Table 38.1 lists some simple forcing terms f(n) with suggested forms of par-
ticular solution and alternatives containing parameters to be determined by
direct substitution.
Table 38.1
Self-test 38.3
where α is a parameter which will take various values. This nonlinear equation can
As with the cobweb for two intersecting lines for the linear difference equation
in Section 38.3, the fixed point P is locally stable if $m = 2 - \alpha > -1$, in that all
cobweb paths starting close to $(\alpha - 1)/\alpha$ approach the fixed point P as $n \to \infty$. This
Fig. 38.8 (a) Graph of y = f(f(x)) for the critical case α = 3. (b) Graph of y = f( f(x)) for α = 3.4
showing fixed points O, A, B, C. The dashed curve shows y = f(x) in both cases.
where we could have predicted the solution $x = (\alpha - 1)/\alpha$ corresponding to the
point B in Fig. 38.8b. The solutions of
$\alpha^2x^2 - \alpha(1 + \alpha)x + (1 + \alpha) = 0$ (38.23)
are
$x_1, x_2 = \frac{1}{2\alpha}\left[1 + \alpha \pm \sqrt{(\alpha + 1)(\alpha - 3)}\right]$ $(\alpha > 3)$
which determine, respectively, the coordinates of A and C.
From (38.23)
$x_1 + x_2 = (1 + \alpha)/\alpha.$ (38.24)
Also
$f(x_1) = \alpha x_1(1 - x_1) = \alpha x_1 - \alpha x_1^2$
$= \alpha x_1 - (1/\alpha)[\alpha(1 + \alpha)x_1 - 1 - \alpha]$ (using (38.23))
$= (1/\alpha)(-\alpha x_1 + 1 + \alpha) = x_2$
by eqn (38.24). Similarly $f(x_2) = x_1$.
It follows that
f(f(x1)) = f(x2) = x1 and f(f(x2)) = f(x1) = x2.
Hence if x = x1 initially then subsequently x alternates between x1 and x2 shown
by the square in Fig. 38.8b. This phenomenon is known as period doubling.
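The alternation between $x_1$ and $x_2$ is easy to confirm numerically. The sketch below is in Python, with hypothetical function names; it uses the logistic map $f(x) = \alpha x(1 - x)$ and the period-2 values $x_1, x_2 = \{1 + \alpha \pm \sqrt{(\alpha + 1)(\alpha - 3)}\}/(2\alpha)$, evaluated at $\alpha = 3.4$, the value used in Fig. 38.8b.

```python
import math


def f(x, alpha):
    """Logistic map f(x) = alpha * x * (1 - x)."""
    return alpha * x * (1 - x)


def period_two_points(alpha):
    """Return (x1, x2) for the period-2 solution; requires alpha > 3."""
    s = math.sqrt((alpha + 1) * (alpha - 3))
    return (1 + alpha - s) / (2 * alpha), (1 + alpha + s) / (2 * alpha)


alpha = 3.4
x1, x2 = period_two_points(alpha)
print(x1, x2)
print(f(x1, alpha), f(x2, alpha))   # should print x2 then x1
print(x1 + x2, (1 + alpha) / alpha) # the sum rule (38.24)
```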
The values x = x1 and x = x2 are fixed points of y = f(f(x)), and their stability is
determined by the slopes of y = f(f(x)) at the points.
The critical slopes for stability at A and C are both (−1); we now find the value
of α at which this occurs. We have
$\frac{d}{dx} f(f(x)) = \alpha^2 - 2\alpha^2x - \alpha^3(2x - 6x^2 + 4x^3)$
$= \alpha^2 - 2\alpha^2(1 + \alpha)x + 6\alpha^3x^2 - 4\alpha^3x^3.$ (38.25)
Equations (38.26) and (38.27) must have the same roots in x. In each case, make
the coefficient of $x^2$ equal to 1. The equations for comparison are
$x^2 - \frac{(\alpha + 1)}{\alpha}x + \frac{(\alpha + 1)}{\alpha^2} = 0,$
$x^2 - \frac{(\alpha + 1)}{\alpha}x + \frac{(\alpha^2 + 1)}{2\alpha^2(\alpha - 2)} = 0.$
These equations have the same roots if
$\frac{\alpha + 1}{\alpha^2} = \frac{(\alpha^2 + 1)}{2\alpha^2(\alpha - 2)},$
Logistic equation
$u_{n+1} = f(u_n) = \alpha u_n(1 - u_n).$
Fixed point for $\alpha > 0$, $x > 0$ at $x_0 = (\alpha - 1)/\alpha$.
Fixed point $x_0$ stable if
$f'(x_0) = \alpha - 2\alpha x_0 = -\alpha + 2 > -1$, that is if $\alpha < 3$.
Period-2 solution: fixed points $(\alpha > 3)$
$x_1, x_2 = \{1 + \alpha \pm \sqrt{[(\alpha + 1)(\alpha - 3)]}\}/(2\alpha).$
Period-2 solution stable if $3 < \alpha < 1 + \sqrt{6}$. (38.28)
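The stability range for the period-2 solution can be explored by evaluating the slope of $y = f(f(x))$ at $x_1$. The Python sketch below uses the expanded derivative $\alpha^2 - 2\alpha^2(1 + \alpha)x + 6\alpha^3x^2 - 4\alpha^3x^3$ from (38.25); the function names and the sample values of $\alpha$ are illustrative. A slope strictly between $-1$ and 1 indicates stability, and the slope should reach $-1$ at $\alpha = 1 + \sqrt{6}$.

```python
import math


def period_two_point(alpha):
    """Return x1, the smaller period-2 value; requires alpha > 3."""
    s = math.sqrt((alpha + 1) * (alpha - 3))
    return (1 + alpha - s) / (2 * alpha)


def slope_ff(x, alpha):
    """d/dx f(f(x)) for f(x) = alpha x (1 - x), in the form (38.25)."""
    return (alpha ** 2
            - 2 * alpha ** 2 * (1 + alpha) * x
            + 6 * alpha ** 3 * x ** 2
            - 4 * alpha ** 3 * x ** 3)


def slope_at_cycle(alpha):
    return slope_ff(period_two_point(alpha), alpha)


critical = 1 + math.sqrt(6)
for alpha in (3.2, 3.4, critical, 3.5):
    print(round(alpha, 4), round(slope_at_cycle(alpha), 6))
```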
The sequence of period-doubling bifurcations is known as the Feigenbaum
sequence, and it has certain universal aspects in that it is not just a consequence of
the logistic equation, but has common features with other difference equations
which generate period doubling.
The simplest way to view the progressively complex behaviour is through a
computer-drawn picture of the iterations of
$u_{n+1} = \alpha u_n(1 - u_n)$
for stepped increases in α starting at α = 2.8 up to α = 3.8, which covers the main area
of interest. The result is shown in Fig. 38.10. The series of single dots for each α
in $2.8 \le \alpha < 3$ indicates the fixed point, which then bifurcates into a stable 2-cycle
attractor for $3 < \alpha < 1 + \sqrt{6}$. This in turn bifurcates into a stable 4-cycle attractor
at $\alpha = 1 + \sqrt{6}$ and so on. The effect of infinite period doubling is that the solution is
ultimately non-periodic. The generally chaotic and noisy behaviour of the differ-
ence equation can clearly be seen in the large number of dots for larger values of α.
These non-periodic sets are known as strange attractors. The successive iterates
of the logistic equation wander about in a seemingly random but bounded man-
ner, and never settle into a periodic solution. However, within the chaotic band of
α values, there appear windows of periodic cycles. Problem 38.26, for example,
confirms that there is a 3-cycle around α = 3.83.
The logistic equation can be thought of as a relatively simple model example.
Many similar nonlinear difference equations also exhibit similar period-doubling
bifurcations and strange attractors.
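The data behind a picture such as Fig. 38.10 can be generated with a short loop. The following Python sketch is one possible approach, not the method used to draw the figure: the function name `attractor`, the starting value 0.5, the number of discarded transient iterations, and the rounding to six decimal places are all assumptions made for illustration. For each α it discards the transient and then records the distinct values visited.

```python
def attractor(alpha, u0=0.5, transient=2000, keep=64, digits=6):
    """Return the sorted distinct long-run iterates of u -> alpha*u*(1-u)."""
    u = u0
    for _ in range(transient):          # let the transient die away
        u = alpha * u * (1 - u)
    points = set()
    for _ in range(keep):               # record what the orbit settles on
        u = alpha * u * (1 - u)
        points.add(round(u, digits))
    return sorted(points)


for alpha in (2.9, 3.2, 3.5, 3.7):
    pts = attractor(alpha)
    print(alpha, len(pts), pts[:4])
```

Plotting every recorded point against α, for α stepped finely from 2.8 to 3.8, would give a diagram of the kind described above. A single point corresponds to the fixed point, two points to the 2-cycle, four to the 4-cycle, and a large scatter to the non-periodic regime.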
PROBLEMS
38.1 £1000 is invested over 10 years at an interest rate of 6% annually. Find the final total investment.
What should the monthly interest rate be to achieve the same final total?
38.2 The sum of £50 000 is borrowed over 25 years and the money is repaid in equal annual
instalments. The interest rate on the outstanding balance in any year is 10%. Find what the annual
repayments would be. After 5 years, the interest rate is reduced to 9%.
(a) Find the required adjustment to the annual repayments for the loan to be repaid over the
original term.
(b) If the repayments are not changed, by how much will the mortgage term be reduced?
38.3 Find the fixed points of the following difference equations:
(a) $u_{n+1} = u_n(2 - u_n)$;
(b) $u_{n+1} = u_n(1 + u_n)(2 - 3u_n)$;
(c) $u_{n+1} = \sin u_n$; (d) $u_{n+1} = \tfrac{1}{2}\sin u_n$;
(e) $u_{n+1} = e^{u_n} - 1$.
38.4 Given the initial value $u_0$ in each case, calculate the sequence of terms up to $u_5$ for each
of the following first-order difference equations:
(a) $u_{n+1} = 2u_n(3 - u_n)$, $u_0 = 1$;
(b) $u_{n+1} = 2u_n(1 - u_n)$, $u_0 = \tfrac{1}{2}$;
(c) $u_{n+1} = 3.2u_n(1 - u_n)$, $u_0 = \tfrac{1}{2}$;
(d) $u_{n+1} = 4u_n(1 - u_n)$, $u_0 = \tfrac{1}{2}$.
38.5 (Section 38.3). Sketch the cobweb solutions for the following first-order equations with the
stated initial conditions, and discuss the stability of the fixed point:
(a) $u_{n+1} = \tfrac{1}{2}u_n + \tfrac{1}{2}$, $u_0 = \tfrac{1}{2}$ and $u_0 = \tfrac{3}{2}$;
(b) $u_{n+1} = 2u_n - 2$, $u_0 = \tfrac{1}{2}$ and $u_0 = \tfrac{3}{2}$;
(c) $u_{n+1} = -u_n + 2$, $u_0 = \tfrac{1}{2}$ and $u_0 = \tfrac{3}{4}$;
(d) $u_{n+1} = -\tfrac{1}{2}u_n + \tfrac{3}{2}$, $u_0 = \tfrac{1}{2}$ and $u_0 = \tfrac{3}{2}$;
(e) $u_{n+1} = -2u_n + 3$, $u_0 = \tfrac{1}{2}$ and $u_0 = \tfrac{3}{2}$.
38.6 The function f(n) satisfies
$f(n) = f(\tfrac{1}{2}n) + 1.$
Put $n = 2^m$ and $g(m) = f(2^m)$, and show that
$g(m) = g(m - 1) + 1.$
Hence find f(n) given that f(1) = 0.
38.7 Use the method suggested in the previous problem to solve
$f(n) = f(\tfrac{1}{3}n) + \tfrac{5}{8},$
given the initial condition f(1) = 0.
38.8 (Section 38.3). Find the general solutions of the following difference equations:
(a) $u_{n+2} + 2u_{n+1} - 3u_n = 0$;
(b) $u_{n+2} - 9u_n = 0$;
(c) $u_{n+2} + 9u_n = 0$;
(d) $u_n - 4u_{n-1} + 5u_{n-2} = 0$;
(e) $u_{n+2} - 4u_{n+1} + 4u_n = 0$;
(f) $u_{n+3} - u_{n+2} + u_{n+1} - u_n = 0$;
(g) $u_{n+3} - u_n = 0$;
(h) $u_{n+3} - 3u_{n+2} + 3u_{n+1} - u_n = 0$;
(i) $u_{n+2} - u_{n+1} - u_n + u_{n-1} = 0$.
38.9 Express the solution of the initial-value problem
$u_{n+2} - 6u_{n+1} + 13u_n = 0$, $u_0 = 0$, $u_1 = 1$,
in real form.
38.10 Find the difference equation satisfied by
$u_n = A \cdot 2^n + B \cdot (-5)^n,$
for all A and B.
38.11 Obtain particular solutions of the following inhomogeneous difference equations:
(a) $u_{n+2} + 2u_{n+1} - 3u_n = f(n)$, where
(i) $f(n) = 2^n$; (ii) $f(n) = n$; (iii) $f(n) = 2$; (iv) $f(n) = (-3)^n$.
(b) $u_{n+2} + 2u_{n+1} + 2u_n = f(n)$, where
(i) $f(n) = 1$; (ii) $f(n) = n + 3$; (iii) $f(n) = \cos\tfrac{3}{4}\pi n$.
(c) $u_{n+3} - 3u_{n+2} + 3u_{n+1} + u_n = f(n)$, where
(i) $f(n) = 1$; (ii) $f(n) = n$; (iii) $f(n) = n^2$.
(d) $u_{n+2} - 6u_{n+1} + 9u_n = f(n)$, where
(i) $f(n) = 2^n$; (ii) $f(n) = 3$; (iii) $f(n) = 3^n$; (iv) $f(n) = n3^n$.
38.12 A ball bearing is dropped from a height $z = h_0$ on to a metal plate, and the coefficient of
restitution between the ball and the plate is ε, where $0 < \varepsilon < 1$. Set up a difference equation for
the maximum height reached after n impacts. Solve the equation. (Assume that a ball dropped
from a height h hits the plate with speed $v = \sqrt{2gh}$, where g is the acceleration due to gravity. The
rebound speed of the ball is εv.) Instead of being stationary, the plate now oscillates so that it is
moving upwards at a speed u (a constant) at the moment of each impact with the ball. Find the
difference equation for $h_n$. Show that the difference equation has a fixed point and interpret its
meaning.
38.13 $D_n(x)$ is the n × n determinant defined by
$D_n(x) = \begin{vmatrix} 2x & 1 & 0 & \cdots & 0 \\ 1 & 2x & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 2x \end{vmatrix}$ $(n > 2)$,
$D_2(x) = \begin{vmatrix} 2x & 1 \\ 1 & 2x \end{vmatrix}$, $D_1(x) = 2x$.
Show that
$D_n(x) = 2xD_{n-1}(x) - D_{n-2}(x).$
Solve the difference equation for $x \ne 1$ and $x = 1$.
38.14 Let $\{u_n\}$ (n = 0, 1, … ) be a sequence. The
38.17 A symmetric random walk takes place on the integer steps on the line between x = 0 and
x = N. At any position x = r $(1 \le r \le N - 1)$, the probability that the walker moves to either x = r + 1
or x = r − 1 at any stage is $\tfrac{1}{2}$. The probability $u_k$ that the walker reaches x = 0 first, given an initial
position x = k, satisfies the difference equation
$u_k = \tfrac{1}{2}u_{k-1} + \tfrac{1}{2}u_{k+1}$, $u_0 = 1$, $u_N = 0$,
for $1 \le k \le N - 1$. Find $u_k$. What is the probability that the walker reaches x = N first?
If $d_k$ is the expected number of steps in the walk before it reaches 0 or N, then $d_k$ satisfies
$d_k = \tfrac{1}{2}(1 + d_{k+1}) + \tfrac{1}{2}(1 + d_{k-1})$, $d_0 = d_N = 0$
for $1 \le k \le N - 1$. Find the expected duration of the walk.
38.22 (Section 38.5). In the logistic equation $u_{n+1} = \alpha u_n(1 - u_n)$, for what positive values
of α is the origin a stable fixed point?
38.23 (Section 38.5). Find the two stable values between which $u_n$ ultimately oscillates in the
logistic equation $u_{n+1} = 3.25u_n(1 - u_n)$.
38.24 Consider the difference equation
$u_{n+1} = \alpha(\tfrac{1}{2} - |u_n - \tfrac{1}{2}|).$
Sketch the function $y = f(x) = \alpha(\tfrac{1}{2} - |x - \tfrac{1}{2}|)$ for $\alpha = \tfrac{3}{2}$. Where are the equilibrium points of the
difference equation for $\alpha > 1$? Show that the origin is stable if $\alpha < 1$, and unstable if $\alpha > 1$.
What happens if α = 1?
Sketch the graph of y = f(f(x)) for α = 2. Show that there exists a 2-cycle and locate the periodic
values of $u_n$.
38.25 Find the fixed points of
$u_{n+1} = \alpha u_n(1 - u_n^3),$
for all α. Determine the slope of $y = f(x) = \alpha x(1 - x^3)$ at the nonzero fixed point. Confirm
that this fixed point is stable if $\alpha < \tfrac{5}{3}$ and unstable if $\alpha > \tfrac{5}{3}$. Sketch cobweb solutions
for α = 1.2, 1.4, 1.8.
38.26 By starting from $u_0 = 0.957\,417$, compute $u_1, u_2, \ldots, u_5$ for the difference equation
$u_{n+1} = \alpha u_n(1 - u_n)$, α = 3.83,
and confirm that the logistic equation appears to have a 3-cycle for this value of α.
38.27 Find the fixed points of the difference equation
$u_{n+1} = \alpha u_n(1 - u_n)^2,$
in the three cases (a) α = 9, (b) α = 4, (c) $\alpha = \tfrac{9}{4}$.
Discuss the stability of the fixed points in each case.
38.28 Show that the special logistic equation
$u_{n+1} = 4u_n(1 - u_n)$
has the solution
$u_n = \sin^2(2^nC\pi)$
where C is any constant. This general solution includes closed-form chaotic solutions. For
example, if C = 1/π, then
$u_n = \sin^2(2^n)$
which never repeats itself for n = 0, 1, 2, … .
Part 7
Probability and statistics
Probability
39
CONTENTS
For example, the standard die has six faces numbered 1, 2, 3, 4, 5, 6. After a large
number of throws, we would expect the number 1 (or any other number) to appear
on the upper face with a relative frequency of 1/6. Hence we expect that the prob-
ability of a 1 appearing is 1/6.
Many probabilities are based on data, past records, the ‘degree of belief’, the
view of individuals, and so on. Horse races are usually not repeated so that there
can be no relative frequency approach, but bookmakers and punters bet on the
basis of the previous form of the horses, the state of the course, and the pattern
of bets. Generally as the race approaches the bookmakers’ odds reflect how the
accumulation of bets has been distributed among the runners. Many outcomes
will be assigned probabilities with at least some subjective element.
Probabilities are important in measuring risk, and there can be surprising
results. From past data the earth receives a significant meteor impact every 100 years.
The probability of a particular individual being killed by such an impact is very
small but nonzero. However, the impact could be cataclysmic, which means that by
some measures the probability of being killed by a meteor impact is greater than
that arising from a plane crash. In engineering, as the reliability of components
improves, the likelihood of failure becomes more remote, but might as a con-
sequence have more serious implications if it does occur.
39.1
A1 = {5}, A2 = {1, 3, 5}, A3 = {1, 2, 3, 4}:
Example 39.1 Two coins are spun. What is the probability that at least one
head appears?
It is essential in the solution to distinguish the coins, as, say, a and b. Thus if Ha is the
event that coin a shows a head, Ta that a shows a tail, and so on, then the sample space
has four elements:
S = {(Ha, Hb ), (Ha, Tb ), (Ta , Hb ), (Ta, Tb)},
which are all equally likely. Thus
$P((H_a, H_b)) = P((H_a, T_b)) = P((T_a, H_b)) = P((T_a, T_b)) = \tfrac{1}{4}.$
The event A, that at least one head appears, is the subset
A = {(Ha, Hb ), (Ha, Tb ), (Ta, Hb )},
which contains three of the four elements. Hence at least one head occurs with
probability $P(A) = \tfrac{3}{4}$.
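The counting argument of Example 39.1 can be reproduced by listing the sample space explicitly. This is a minimal Python sketch; the variable names are illustrative, and the pair (first letter, second letter) stands for the outcomes of coins a and b.

```python
from fractions import Fraction
from itertools import product

# Sample space for two distinguishable coins: (outcome of a, outcome of b)
sample_space = list(product("HT", repeat=2))

# Event A: at least one head appears
event_a = [outcome for outcome in sample_space if "H" in outcome]

p_a = Fraction(len(event_a), len(sample_space))
print(sample_space)
print(p_a)
```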
Example 39.2 Two distinguishable dice a and b are rolled. What are the
elements of the sample space? What is the probability that the sum of the face
values of the two dice is 8? What is the probability that at least one 5 appears?
We distinguish the outcome of each die separately, so that there are 6 × 6 = 36 possible
outcomes for the pair. The sample space has 36 elements of the form (i, j) where
i and j take all integer values 1, 2, 3, 4, 5, 6, and i is the outcome of die a and j is the
outcome of b. The full list is
S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)},
and they are all equally likely. If A1 is the event that the sum of the dice is 8, then from
the list
A1 = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}
which occurs for 5 elements out of 36. Hence
$P(A_1) = \tfrac{5}{36}.$
The event that at least one 5 appears is the list
A2 = {(1, 5), (2, 5), (3, 5), (4, 5), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (6, 5)},
which has 11 elements. Hence
$P(A_2) = \tfrac{11}{36}.$
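The two probabilities in Example 39.2 can be confirmed by enumerating the 36 equally likely outcomes. In this Python sketch the names `S`, `A1`, and `A2` mirror the notation of the example, while the helper `prob` is a hypothetical convenience.

```python
from fractions import Fraction
from itertools import product

# Sample space: (i, j) with i the outcome of die a and j the outcome of die b
S = list(product(range(1, 7), repeat=2))

A1 = [(i, j) for (i, j) in S if i + j == 8]      # sum of the faces is 8
A2 = [(i, j) for (i, j) in S if 5 in (i, j)]     # at least one 5 appears


def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))


print(len(S), prob(A1), prob(A2))
```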
Self-test 39.1
Two distinguishable dice are rolled. Using the list given in Example 39.2,
what is the probability of the event A1 that the sum of the face values is 7?
What is the probability that (a) the event A2 that no 3 or 5 appears, (b) the
event A3 that no 3 and 5 appears?
A1 = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)},
A2 = {(1, 5), (2, 5), (3, 5), (4, 5), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (6, 5)}.
The event A3 has 14 elements of which two are common to both A1 and A2.
If A4 is the event that both A1 and A2 occur, then A4 is the intersection of A1 and
A2, namely
A4 = A1 ∩ A2 = {(3, 5), (5, 3)}.
Fig. 39.1 (a) The event A3 = A1 ∪ A2, (b) The event A4 = A1 ∩ A2.
Example 39.3 Suppose that A, B, C are three events in the sample space S. Write
down the sets which represent the events that: (a) A occurs, but neither B nor C
occurs; (b) A, B, and C all occur.
(a) The event that B or C occurs will be $B \cup C$. The event that neither B nor C occurs
will be the complement $\overline{B \cup C}$. The required set will be the intersection of this event
and A, namely
$A \cap \overline{(B \cup C)}.$
By de Morgan’s first law (35.6), this is equivalent to
$A \cap (\bar{B} \cap \bar{C}),$
which can be written unambiguously as $A \cap \bar{B} \cap \bar{C}$ by the associative law for
intersection (35.4).
(b) Events B and C occur in the set B ∩ C. Events A and B ∩ C occur in the event
A ∩ (B ∩ C) or A ∩ B ∩ C.
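The set identities used in Example 39.3 can be tried out with Python's built-in `set` type. The sample space and the events A, B, C below are arbitrary choices made only for illustration (scores on a single die); they do not come from the example.

```python
# Hypothetical sample space and events, chosen only to illustrate the identities
S = set(range(1, 7))
A, B, C = {1, 2, 3}, {2, 4}, {3, 6}


def complement(event):
    """Complement of an event relative to the sample space S."""
    return S - event


# (a) A occurs, but neither B nor C occurs
lhs = A & complement(B | C)                  # A with the complement of (B or C)
rhs = A & complement(B) & complement(C)      # form given by de Morgan's law
print(lhs, rhs)

# (b) A, B, and C all occur
print(A & B & C)
```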
Two events are said to be mutually exclusive if they cannot occur together in a
single trial (or experiment), which in set terms is equivalent to the two subsets of
S being disjoint: that is, having no elements in common. Consider the following
illustrative application of a single die, which is rolled and the score noted. An
event of interest in a random experiment can be specified in many ways. A player
could be interested in even or odd scores, the score 2 or not, or scores which are
factors of 6 or not. In each case the sample space is divided into two disjoint
sets or mutually exclusive events, together constituting an exhaustive (meaning
that there are no outcomes which are not in at least one event) list of outcomes.
For example, if A stands for the event of an even score, then $\bar{A}$ must represent an
odd score. Thus
$A \cap \bar{A} = \emptyset$, and $A \cup \bar{A} = S$,
Any union of events can be expressed in terms of the union of certain mutually
exclusive events. For example, the union A ∪ B of two events A and B can be
partitioned into the mutually exclusive events $A \cap \bar{B}$, $A \cap B$, and $\bar{A} \cap B$. Then
$A \cup B = (\bar{A} \cap B) \cup (A \cap B) \cup (A \cap \bar{B}).$
In another example, an event A in the sample space which also contains B can
be divided as
$A = (A \cap B) \cup (A \cap \bar{B}),$
which can be interpreted as meaning that A can occur either with B or without B.
Suppose the sample space is partitioned into the n mutually exclusive and
exhaustive events A1, A2, … , An. If A is any event, then
A = (A ∩ A1) ∪ (A ∩ A2) ∪ ··· ∪ (A ∩ An).
This means that, if A occurs, then it must occur as one, and only one, of the events
A1, A2, … , An. It might happen that A ∩ Ai = Ø for some intersections, but this
does not matter.
Example 39.4 In Example 39.2, express the sample space S and the events A1
and A2 in set terms.
The sample space is given by
S = {(i, j )|i, j = 1, 2, 3, 4, 5, 6},
which has 36 elements since the dice are distinguishable and (i, j) is distinct from (j, i).
The events A1 (the sum of the faces of two is 8) and A2 (at least one 5 appears for two
dice) can be written
A1 = {(i, j)|i + j = 8},
A2 = {(i, j)|either i = 5 or j = 5 or both}.
Axioms of probability
For every event A in a sample space S, the probability P(A) must satisfy:
(a) $0 \le P(A) \le 1$;
(b) for the empty set (or non-event) and the sample space S:
P(Ø) = 0, P(S) = 1;
(c) for n mutually exclusive events A1, A2, … , An,
P(A1 ∪ A2 ∪ ··· ∪ An) = P(A1) + P(A2) + ··· + P(An). (39.1)
The rules can be interpreted as
(a) every probability must lie between 0 and 1 inclusive;
(b) the probability of an impossible event is zero, and the probability of the
Example 39.5 Two dice are rolled. What is the probability that a total score
of 4 or 7 occurs?
Let A1 be the event of a score 4 and A2 be the event of a score 7. These cannot occur
together, so they must be mutually exclusive events. The event of a score 4 or 7 is
A1 ∪ A2. Hence by (39.1c) and the complete list of outcomes in Example 39.2,
$P(A_1 \cup A_2) = P(A_1) + P(A_2) = \tfrac{3}{36} + \tfrac{6}{36} = \tfrac{1}{4}.$
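If two events $A_1$ and $A_2$ are not mutually exclusive then they must have elements
of the sample space in common. Using partitioning, which was explained
previously in this section, $A_1$, $A_2$, and therefore the union of $A_1$ and $A_2$ can be
partitioned into unions of mutually exclusive events. Thus, since $A_1 \cap \bar{A}_2$ is ‘$A_1$ and not
$A_2$’, and $A_1 \cap A_2$ is ‘$A_1$ and $A_2$’,
$A_1 = (A_1 \cap \bar{A}_2) \cup (A_1 \cap A_2).$
Similarly
$A_2 = (\bar{A}_1 \cap A_2) \cup (A_1 \cap A_2).$
Therefore
$A_1 \cup A_2 = (A_1 \cap \bar{A}_2) \cup (\bar{A}_1 \cap A_2) \cup (A_1 \cap A_2),$
since $(A_1 \cap A_2) \cup (A_1 \cap A_2) = A_1 \cap A_2$. Hence by rule (c) in (39.1),
$P(A_1) = P(A_1 \cap \bar{A}_2) + P(A_1 \cap A_2),$ (39.2)
$P(A_2) = P(\bar{A}_1 \cap A_2) + P(A_1 \cap A_2),$ (39.3)
$P(A_1 \cup A_2) = P(A_1 \cap \bar{A}_2) + P(\bar{A}_1 \cap A_2) + P(A_1 \cap A_2).$ (39.4)
Elimination of $P(A_1 \cap \bar{A}_2)$ and $P(\bar{A}_1 \cap A_2)$ between (39.2), (39.3), and (39.4) leads to:
$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2).$ (39.5)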
Geometrically the result can be seen from Fig. 39.2 in which the intersection
A1 ∩ A2 is ‘counted twice’ in P(A1) + P(A2).
Fig. 39.2 Venn diagram of the events $A_1$ and $A_2$, with the intersection $A_1 \cap A_2$ marked.
Example 39.6 If two dice are rolled, what is the probability that either the sum
is 8 or at least one 5 appears?
As we saw in Example 39.2, if A1 is the event that the sum is 8 and A2 the event that at
least one 5 appears, then
$P(A_1) = \tfrac{5}{36}$, $P(A_2) = \tfrac{11}{36}$.
These are not mutually exclusive events because both events occur when the outcomes
are {(5, 3)} or {(3, 5)}. Therefore
A1 ∩ A2 = {(3, 5), (5, 3)},
in which case
$P(A_1 \cap A_2) = \tfrac{2}{36} = \tfrac{1}{18}.$
Hence, by (39.5),
$P(A_1 \cup A_2) = \tfrac{5}{36} + \tfrac{11}{36} - \tfrac{2}{36} = \tfrac{7}{18},$
which means that the sum is 8 or at least one 5 appears with probability $\tfrac{7}{18}$. This can be
checked by counting the occurrences of 8 or at least one 5 in the list in Example 39.2.
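The check by counting suggested above can be automated. The Python sketch below (illustrative names, with `P` as a hypothetical helper) enumerates the sample space of Example 39.2 and compares the directly counted probability of the union with the sum of the two probabilities less the probability of the intersection.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))
A1 = {o for o in S if sum(o) == 8}     # the sum is 8
A2 = {o for o in S if 5 in o}          # at least one 5 appears


def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))


direct = P(A1 | A2)                          # count the union directly
by_formula = P(A1) + P(A2) - P(A1 & A2)      # subtract the double-counted overlap
print(sorted(A1 & A2), direct, by_formula)
```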
Self-test 39.2
Using a Venn diagram show that, for three events A, B, C,
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C).
Extend formula (39.5) to the probability of three events A, B, C, namely
P(A ∪ B ∪ C).
39.3
dice leads to a sample space with $6^4 = 1296$ elements.
As we saw in the preamble to this chapter, probabilities can be obtained using
In the formulae 0! is interpreted as having the value 1, so that special values are
${}_nC_0 = 1$, ${}_nC_n = 1$.
Example 39.8 How many different five-card hands can be dealt from a standard
deck with 52 cards? What is the probability that a hand dealt at random consists
of five spades?
This is a combination problem, not a permutation one. Thus there are
${}_{52}C_5 = \frac{52!}{47!\,5!} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} = 2\,598\,960$
different hands.
The number of different hands consisting of five spades is, since there are 13 spades in
the pack,
${}_{13}C_5 = \frac{13!}{8!\,5!} = \frac{13 \cdot 12 \cdot 11 \cdot 10 \cdot 9}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} = 1287.$
To obtain the probability that a random five-card hand contains five spades we can
use the counting argument, namely that out of the 2 598 960 equally likely different
hands 1287 will have five spades. Hence, by the frequency argument
$P(\text{five-card spade hand}) = \frac{1287}{2\,598\,960} \approx 0.0005,$
which implies that about one hand in 2000 will have five spades.
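The combination counts in Example 39.8 can be checked with the standard-library function `math.comb`, which returns the number of ways of choosing k items from n without regard to order. The variable names in this short Python sketch are illustrative.

```python
from fractions import Fraction
from math import comb

hands = comb(52, 5)          # all five-card hands
spade_hands = comb(13, 5)    # hands made up of five spades

p_five_spades = Fraction(spade_hands, hands)
print(hands, spade_hands, float(p_five_spades))
```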
Example 39.9 A box contains 20 balls of which 7 are red(r), 5 are white(w),
and 8 are black(b) balls. If three balls are drawn at random, without
replacement, find the probability that
(a) two red balls and one black ball are drawn;
(b) one of each colour is drawn;
(c) one or more red balls are drawn;
(d) all are of the same colour.
The total number of three-ball selections which can be made is
N = 20C3 = 1140
for labelled balls. They are all equally likely to be drawn.
(a) The number of ways in which two red balls and one black ball can be drawn is
${}_7C_2 \times 8 = \frac{7 \cdot 6}{1 \cdot 2} \cdot 8 = 168.$
Hence
$P(\text{2r and 1b}) = \frac{168}{N} = \frac{168}{1140} = \frac{14}{95} \approx 0.15.$
(b) The number of ways in which one of each colour can be chosen is
7 × 5 × 8 = 280 from a total of 1140. Hence
$P(\text{1r and 1w and 1b}) = \frac{280}{1140} = \frac{14}{57} \approx 0.25.$
(c) The number of ways in which no red ball is drawn is ${}_{13}C_3 = 286$ from
the total of 1140. Hence the probability that a selection contains at least
one red ball is
$P(\ge 1\text{r}) = 1 - P(0\text{r}) = 1 - \frac{286}{1140} = \frac{854}{1140} = \frac{427}{570} \approx 0.75.$
(d) Since the events are mutually exclusive, using (39.1c),
$P(\text{3r or 3w or 3b}) = P(\text{3r} \cup \text{3w} \cup \text{3b}) = P(\text{3r}) + P(\text{3w}) + P(\text{3b})$
$= \frac{{}_7C_3 + {}_5C_3 + {}_8C_3}{{}_{20}C_3} = \frac{101}{1140} \approx 0.09.$
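All four parts of Example 39.9 reduce to ratios of combination counts, so they can be verified with exact fractions. In the Python sketch below the variable names are illustrative; `comb(n, k)` plays the role of ${}_nC_k$.

```python
from fractions import Fraction
from math import comb

RED, WHITE, BLACK = 7, 5, 8
N = comb(RED + WHITE + BLACK, 3)        # all three-ball selections

p_a = Fraction(comb(RED, 2) * BLACK, N)                 # two red, one black
p_b = Fraction(RED * WHITE * BLACK, N)                  # one of each colour
p_c = 1 - Fraction(comb(WHITE + BLACK, 3), N)           # at least one red
p_d = Fraction(comb(RED, 3) + comb(WHITE, 3) + comb(BLACK, 3), N)  # all one colour

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(label, p, round(float(p), 2))
```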
Self-test 39.3
In Example 39.8, suppose that six cards are dealt. What is the probability
that the hand consists of red cards only?
Fig. 39.3 (a) Both A and B occur in the shaded intersection A ∩ B. (b) P(A | B) refers to the new
universal set B.
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$ (39.6)
Example 39.10 Six cards are dealt from a well-shuffled deck of playing cards.
Given that all six cards are black, find the probability that they are all of the
same suit.
Let A and B represent the following events:
A = {the six cards are in the same suit}, B = {the cards are black}.
Thus
A ∩ B = {six black cards of the same suit}.
Therefore
$P(A \cap B) = \frac{\text{(number of combinations of six clubs or six spades)}}{\text{(number of combinations of six cards)}} = \frac{2 \cdot {}_{13}C_6}{{}_{52}C_6}.$
Also
$P(B) = \frac{\text{(number of combinations of six black cards)}}{\text{(number of combinations of six cards)}} = \frac{{}_{26}C_6}{{}_{52}C_6}.$
Hence the conditional probability that they are all of the same black suit is
$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{2 \cdot {}_{13}C_6}{{}_{52}C_6} \cdot \frac{{}_{52}C_6}{{}_{26}C_6} = 2 \cdot \frac{13!}{6!\,7!} \cdot \frac{6!\,20!}{26!} = \frac{12}{805} \approx 0.015.$
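The conditional probability in Example 39.10 can be confirmed with exact arithmetic. This Python sketch applies definition (39.6) directly; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

all_hands = comb(52, 6)

# Six black cards of the same suit: six clubs or six spades
p_same_suit_and_black = Fraction(2 * comb(13, 6), all_hands)

# Six black cards of any kind
p_black = Fraction(comb(26, 6), all_hands)

# Conditional probability (39.6): P(same suit | black)
p_conditional = p_same_suit_and_black / p_black
print(p_conditional, round(float(p_conditional), 3))
```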
(i) P(A|A) = 1.
(ii) P(A|B)P(B) = P(B|A)P(A).
Example 39.11 A production line is supplied with the same component made
by two different machines M1 and M2. It is known from samples of the outputs
that the probability that a component from M1 is not faulty is 0.91 and from
M2 is 0.85. Machine M1 supplied 60% of the components and machine M2 40%.
Components are chosen at random and tested before the next stage of
production. What is the probability that
(a) given that a component was made by M2 it is not faulty?
(b) a component is not faulty?
Let A1, A2, and B be the events
A1 = {component made by M1}, A2 = {component made by M2},
B = {component not faulty}.
From the 60%/40% supply we know that P(A1) = 0.6 and that P(A2) = 0.4. The known
failure rates in M1 and M2 give the conditional probabilities P(B| A1) = 0.91 and
P(B|A2) = 0.85.
(a) The answer is P(B|A2) = 0.85.
(b) Write the event B as (B ∩ A1 ) ∪ (B ∩ A2 ) which is still the event that the component
is not faulty. Since B ∩ A1 and B ∩ A2 are mutually exclusive, it follows that
P[(B ∩ A1) ∪ (B ∩ A2)] = P(B ∩ A1) + P(B ∩ A2)
= P(B |A1)P(A1) + P(B|A2)P(A2)
= 0.91 × 0.6 + 0.85 × 0.4 = 0.886,
using (39.6). Hence the probability of a non-faulty component is 0.89 approximately.
In solving this problem we have encountered a new law in (b) called the law of total
probability which will be discussed further in Section 39.7.
Self-test 39.4
A manufacturer buys components from three suppliers: 50% from supplier
S1, 30% from S2, and 20% from S3. A component from S1 is found to be faulty
with probability 0.05, from S2 with probability 0.07 and from S3 with probab-
ility 0.06. What is the probability that a component chosen at random is not
faulty?
In that case
P(A ∩ B) = P(A)P(B) (39.7b)
Then
$$P(A) = \tfrac{4}{52} = \tfrac{1}{13}, \quad\text{and}\quad P(B \mid A) = \tfrac{4}{52} = \tfrac{1}{13} = P(A).$$
In other words the events are independent.
$$P(A) = \tfrac{1}{13} \quad\text{but}\quad P(B \mid A) = \tfrac{3}{51} \neq P(A),$$
indicating that A and B are not independent events.
Example 39.12 Figure 39.4 shows parts of two circuits which contain electrical
components P, Q, and R placed in parallel and series. For the parallel case the
circuit fails if all three components fail, but in the series case failure occurs if just
one component fails. In some time interval the probabilities of failure of P, Q,
and R are respectively p, q, and r. What are the probabilities of circuit
breakdown in the two cases?
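[Fig. 39.4: (a) components P, Q, R in parallel, with failure probabilities p, q, r; (b) components P, Q, R in series, with failure probabilities p, q, r.]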
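The two failure rules stated in Example 39.12 can be sketched in code. This sketch assumes that the three components fail independently of one another (an assumption made here for illustration), and the numerical values of p, q, r are invented, not taken from the text.

```python
def parallel_failure(p, q, r):
    """Parallel circuit: fails only if all three components fail."""
    return p * q * r


def series_failure(p, q, r):
    """Series circuit: fails if at least one component fails,
    i.e. 1 minus the probability that none of them fails."""
    return 1 - (1 - p) * (1 - q) * (1 - r)


# Illustrative failure probabilities only (hypothetical values).
p, q, r = 0.1, 0.2, 0.3
print("parallel:", parallel_failure(p, q, r))
print("series:  ", series_failure(p, q, r))
```

The series case is computed through the complementary event (no component fails), which avoids summing over the seven ways in which at least one component can fail.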
Self-test 39.5
In a large collection of components, the probability that any component
is faulty is 0.05. Three components are chosen at random. Determine the
following probabilities: (a) the three components are all faulty; (b) only one
component is faulty; (c) at least one component is faulty; (d) at least two
components are faulty.
39.6 Total probability
[Fig. 39.5: sample space S with events A1 and A2.]
The result generalizes to the case in which S contains n mutually exclusive and
exhaustive events A1, A2, … , An. If B is an event in S, then
$$P(B) = \sum_{i=1}^{n} P(B \mid A_i)\,P(A_i).$$
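As a check on the general formula, here is a minimal sketch that applies it to the data of Example 39.11. The function name `total_probability` and its argument layout are illustrative choices, not notation from the text.

```python
def total_probability(conditionals, priors):
    """Return P(B) = sum of P(B | A_i) * P(A_i).

    `priors` holds P(A_i) for mutually exclusive and exhaustive events,
    `conditionals` holds the matching P(B | A_i).
    """
    if abs(sum(priors) - 1.0) > 1e-12:
        raise ValueError("the P(A_i) must sum to 1 for exhaustive events")
    return sum(c * a for c, a in zip(conditionals, priors))


# Example 39.11: P(B | A1) = 0.91, P(B | A2) = 0.85, P(A1) = 0.6, P(A2) = 0.4.
p_not_faulty = total_probability([0.91, 0.85], [0.6, 0.4])
print(round(p_not_faulty, 3))
```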
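$$P(A_1) = \tfrac{8}{21}, \qquad P(A_2) = \tfrac{13}{21}.$$
Also
$$P(B \mid A_1) = \tfrac{7}{20}, \qquad P(B \mid A_2) = \tfrac{8}{20}.$$
Using (39.9), since A1 and A2 are mutually exclusive
$$P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) = \tfrac{7}{20} \cdot \tfrac{8}{21} + \tfrac{8}{20} \cdot \tfrac{13}{21} = \tfrac{8}{21}.$$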
Hence the probability that the second component drawn is red is 8/21, which is the same
as P(A1). This suggests (correctly) that the probability that the second component is red
does not depend on the colour of the first component.
The first solution was selection without replacement. In the second part of the
question the components are replaced. In this case P(B) = 8/21; in other words, with or
without replacement, the probability that the second component is red is still 8/21.
Self-test 39.6
In Example 39.13, what is the probability that the third component is red
when randomly drawn from the box?
Bayes’ theorem
For mutually exclusive events A1 and A2
$$P(A_1 \mid B) = \frac{P(B \mid A_1)P(A_1)}{P(B)} = \frac{P(B \mid A_1)P(A_1)}{P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2)}. \tag{39.13}$$
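Formula (39.13) can be illustrated with the machine data of Example 39.11. The sketch below is an illustration only: the function name `bayes_two_events` is hypothetical, and the "faulty" probabilities 0.09 and 0.15 are simply the complements of the non-faulty rates 0.91 and 0.85 quoted in that example.

```python
def bayes_two_events(prior_a1, b_given_a1, prior_a2, b_given_a2):
    """Return P(A1 | B) from (39.13) for two mutually exclusive events
    A1 and A2 that together make up the sample space."""
    numerator = b_given_a1 * prior_a1
    return numerator / (numerator + b_given_a2 * prior_a2)


# B = {component not faulty}: probability it came from M1.
p_m1_given_ok = bayes_two_events(0.6, 0.91, 0.4, 0.85)

# B = {component faulty}: P(faulty | M1) = 0.09, P(faulty | M2) = 0.15.
p_m1_given_faulty = bayes_two_events(0.6, 0.09, 0.4, 0.15)

print(round(p_m1_given_ok, 4), round(p_m1_given_faulty, 4))
```

Although M1 supplies 60% of all components, it accounts for less than half of the faulty ones, because its fault rate is lower than that of M2.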
Problems
39.1 How many elements do the following sample spaces contain?
(a) the spinning of five coins;
(b) the sum of the faces of three dice;
(c) a coin and a die randomly thrown together;
(d) a dart thrown at a dartboard.
39.2 Two dice are rolled. What is the probability that the sum of the face values is 7?
What is the probability that no 5 appears? What is the probability that the score is 7 or less?
39.3 Two dice are rolled and the scores noted. Write down the elements in the sample space.
How many elements does the set have? Let A denote the event {the sum of the outcomes is 5},
and B denote the event {at least one die shows 4}. Express the sets of these events in
formula terms. List all the elements in A, B, A ∪ B, and A ∩ B.
39.4 Suppose that A, B, and C are three events of the sample space S. Write down the set
formulae for the events:
(a) only B occurs,
(b) exactly one of A, B, or C occurs.
39.5 Suppose that a sample space S includes the events A and B. Show that the number of
elements in A ∪ B can be expressed as
$$n(A \cup B) = n(A \cap B) + n(\bar{A} \cap B) + n(A \cap \bar{B})$$
(this is an alternative version of (35.7)).
Suppose two dice are rolled. Let A denote the event {the sum of the outcomes is 6} and B the
event {both dice show the same number}. List the elements in $A \cap B$, $\bar{A} \cap B$, and
$A \cap \bar{B}$, and find n(A ∪ B) using the formula above.
39.6 A card is drawn from a deck of 52 playing cards. If A is the event that an ace is drawn,
B is the event that a heart is drawn, and C is the event that a black card is drawn, explain
in terms of the cards drawn what the following events represent:
(a) $A \cap B$; (b) $A \cap C$; (c) $A \cup B$; (d) $A \cup B \cup C$; (e) $A \backslash B$;
(f) $\bar{A} \backslash \bar{B}$; (g) $\bar{A} \backslash \bar{C}$; (h) $(A \cap B) \cup C$;
(i) $(A \cap B) \cup (A \cap \bar{C})$.
39.7 Cards are drawn from a deck of 52 playing cards without replacement. What is the
probability that
(a) the first card is a king?
(b) the first two cards are kings?
(c) the first card is a king, the second and third cards are not kings, and the fourth card
is a king?
39.8 A well-shuffled deck of cards is cut twice randomly. What is the probability that two
aces are shown? (This is a problem of selection with replacement.)
39.9 Evaluate the following permutations:
(a) ${}^{5}P_3$; (b) ${}^{10}P_4$; (c) ${}^{7}P_7$; (d) ${}^{7}P_1$.
39.10 How many different three-letter ‘words’ can be made up from the letters a, b, c, d, e
with no repetition of letters?
39.11 How many five-digit numbers can be formed (numbers cannot start with 0) from
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, if
(a) numbers are selected without replacement?
(b) any number of repetitions of numbers is allowed?
(c) without replacement but such that the number must be divisible by 5?
39.12 Calculate the following combinations:
(a) ${}^{7}C_3$; (b) ${}^{99}C_{96}$; (c) ${}^{11}C_5$.
39.13 Prove that
$${}^{n-1}C_r + {}^{n-1}C_{r-1} = {}^{n}C_r.$$
39.14 Prove that
$$\text{(a)}\ \sum_{r=0}^{n} {}^{n}C_r = 2^n, \qquad \text{(b)}\ \sum_{r=0}^{n} {}^{n}C_r\, 3^r = 4^n.$$
39.15 How many different four-card hands can be dealt from a deck of 52 playing cards? How
many hands contain four cards of the same suit? What is the probability that a hand dealt
randomly contains four cards from the same suit?
39.16 In the previous question investigate how the probabilities change for n-card hands
(1 ≤ n ≤ 13) with n cards from the same suit.
39.17 A box contains 22 balls of which 7 are red, 9 are white, and 6 are black. Four balls
are drawn at random from the box without replacement. Find the probability that
(a) three red balls and one white ball are drawn;
(b) the balls are red;
(c) the balls are all of the same colour;
(d) there is at least one ball of each colour.
39.18 A production line is supplied with the same component made by two different machines
M1 and M2. It is known from samples of the outputs that the probability that a component
from M1 is not faulty is 0.89 and from M2 is 0.83. Machine M1 supplies 70% of the components
and machine M2 30%. Components are chosen at random and tested before the next stage of
production. What is the probability that
(a) given that it was made by M1, a component is not faulty?
(b) a component is not faulty?
(c) given that a component was faulty, it was manufactured by M2?
39.19 A production line is supplied with the same component made by three different machines
M1, M2, and M3. It is known from samples of the outputs that the probability that a component
from M1 is not faulty is 0.87, from M2 is 0.84, and from M3 is 0.91. Machine M1 supplies 45%
of the components, machine M2 30%, and machine M3 25%. Components are chosen at random and
tested before the next stage of production. What is the probability that
(a) a component is not faulty?
(b) given that a component was faulty, it was manufactured by M2?
(c) given that a component was faulty, it was made by M1 or M2?
39.20 Figure 39.6 shows part of a circuit with six components in a parallel and series
combination. The probabilities of failures of components are p1, p2, p3, q, r1, and r2 as
shown and are independent. What is the probability that this part of the circuit fails? If
all components have the same probability of failure of 0.98, what is the probability that
this part of the circuit fails? (Parallel and series failures are as in Example 39.12.)
[Fig. 39.6: circuit diagram with component failure probabilities p1, p2, p3, q, r1, and r2.]
39.21 It is known that in a batch of 100 microprocessors, 5 are defective.
(a) A microprocessor is chosen at random without replacement. What is the probability that
it is defective?
(b) Two are chosen at random without replacement. What is the probability that both are
defective?
(c) Two are chosen without replacement. Given that the first is defective, what is the
probability that the second is also defective?
39.22 In the UK national lottery 6 numbered balls are selected at random from 49 balls
numbered 1, 2, 3, …, 49 without replacement. Prizes are given to those who correctly select
three, four, five, or six numbers. Find the probability of winning in each case. A seventh
bonus ball is also drawn from the remaining 43 balls and further prizes are given for those
who correctly choose the bonus ball and any five of the six drawn numbers. Find the
probability of winning in this case. What is the overall probability that a lottery ticket
wins at least one prize?
39.23 A game is played in which n players each spin a coin and the outcome is examined. The
game continues until the outcome is either n − 1 heads and 1 tail, or 1 head and n − 1 tails.
The single player with the different outcome wins the coins from the other players. Show that
the probability that the game ends at a given play is $n/2^{n-1}$, and that the probability
that the game finishes at the ith play is given by the geometric distribution
$$\frac{n}{2^{n-1}} \left(1 - \frac{n}{2^{n-1}}\right)^{i-1}.$$
Find also the mean number of plays to the end of the game.
40 Random variables and probability distributions
40.1 Probability distributions
Fig. 40.1 Mapping of the random variable X from the sample space S onto the real line SX.
spins of a coin until a head appears. The list of possible outcomes is {1, 2, 3, … }
which is unbounded but countable.
Obviously many random variables can be associated with the same experiment.
In the example above where a coin is spun three times, a random variable Y, say,
could be the number of tails observed.
Generally, capital letters X, Y, … denote random variables and small letters
{x1, x2, … } sets of elements in a sample space.
xi = x1 x2 x3 …
pi = p1 p2 p3 …
For the coin spun three times in Section 40.1, the distribution would be, with x0
representing no heads, x1 one head, and so on,
xi = 0 1 2 3
pi = 1/8 3/8 3/8 1/8
Example 40.1 A box contains six components of which two are defective.
Components are selected at random without replacement until a defective
component is chosen. Find the probability distribution of the number of
components drawn from the box.
Let X be the random variable (number of components withdrawn including the
defective). Then
SX = {1, 2, 3, 4, 5} = {xi} (i = 1, 2, 3, 4, 5).
The probability
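$$p_1 = P(X = x_1) = \tfrac{2}{6} = \tfrac{1}{3},$$
since there is a 2 in 6 chance of choosing a defective on the first selection. Also
$$P(X = x_2) = \tfrac{4}{6} \cdot \tfrac{2}{5} = \tfrac{4}{15},$$
since the probability of choosing a non-defective component at the first stage is 4/6, which
leaves two defective in the remaining five. Similarly
$$P(X = x_3) = \tfrac{4}{6} \cdot \tfrac{3}{5} \cdot \tfrac{2}{4} = \tfrac{1}{5}, \qquad P(X = x_4) = \tfrac{4}{6} \cdot \tfrac{3}{5} \cdot \tfrac{2}{4} \cdot \tfrac{2}{3} = \tfrac{2}{15},$$
$$P(X = x_5) = \tfrac{4}{6} \cdot \tfrac{3}{5} \cdot \tfrac{2}{4} \cdot \tfrac{1}{3} \cdot 1 = \tfrac{1}{15}.$$
The complete distribution is
xi = 1 2 3 4 5
pi = 1/3 4/15 1/5 2/15 1/15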
[Fig. 40.2: bar chart of pi against xi.]
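The distribution in Example 40.1 can be reproduced step by step. The sketch below is one possible way to organize the calculation; the function name and its default arguments (six components, two defective) are chosen here to match the example and are otherwise illustrative.

```python
from fractions import Fraction


def draws_until_defective(total=6, defective=2):
    """Distribution of the number of draws, without replacement,
    up to and including the first defective component."""
    good = total - defective
    distribution = {}
    p_all_good_so_far = Fraction(1)  # probability that every earlier draw was good
    for k in range(1, good + 2):
        remaining = total - (k - 1)
        # kth draw is the first defective one
        distribution[k] = p_all_good_so_far * Fraction(defective, remaining)
        # otherwise the kth draw is another good component
        p_all_good_so_far *= Fraction(good - (k - 1), remaining)
    return distribution


dist = draws_until_defective()
for k, prob in dist.items():
    print(k, prob)
print("total:", sum(dist.values()))
```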
Self-test 40.1
Two dice are rolled until the sum of the face values is 7. Find the distribution
of the number of throws. (Use the table in Example 39.2: note that the
distribution is infinite.)
40.2
Suppose a series of trials are independent, and have two possible outcomes which
occur with probabilities p and (1 − p). If p is constant throughout then these are
xi = 0 1
pi = q p
Let us consider a further distribution which can arise from Bernoulli trials. A
series of independent Bernoulli trials takes place with the probability of success or
failure of any given trial given by p or q where p + q = 1. Consider the probability
distribution of i successes in a fixed number of trials n. In the notation of probability
distributions
xi = i (i = 0, 1, 2, … , n).
Here is a particular sequence:
$$\underbrace{11\ldots11}_{i\ \text{times}}\ \underbrace{00\ldots00}_{n-i\ \text{times}}.$$
This sequence, in which there are i successes followed by n − i failures, occurs with
probability
$$p^i q^{n-i},$$
since the probability of a success followed by another success is $p \times p = p^2$, and
so on. However, there are many sequences which have i successes (1) and n − i
failures (0), and the number of possible arrangements is ${}^{n}C_i$ (see Section 39.4 for
an explanation of the combination notation). Including every arrangement, the
probability of i successes in n trials is
$${}^{n}C_i\, p^i q^{n-i}.$$
This is called the binomial distribution for X, the random variable of the number
of successes in n trials.
$$\begin{array}{l|ccccc} x_i & 0 & 1 & 2 & 3 & \ldots \\ p_i & q^n & npq^{n-1} & \dfrac{n(n-1)}{2!}\,p^2 q^{n-2} & \dfrac{n(n-1)(n-2)}{3!}\,p^3 q^{n-3} & \ldots \end{array}$$
which are recognizably the first few terms in the binomial expansion of $(p + q)^n$
(see Appendix A(c)). Hence
$$\sum_{i=0}^{n} p_i = \sum_{i=0}^{n} \frac{n!\,p^i q^{n-i}}{(n-i)!\,i!} = (p + q)^n = 1,$$
since p + q = 1. This confirms that (40.2) does satisfy the key requirement for a
probability distribution. Some bar charts for the binomial distribution are shown
for n = 10 and p = 0.3, 0.5, 0.7 in Fig. 40.3.
Fig. 40.3 Binomial distribution for n = 10 and (a) p = 0.3, (b) p = 0.5, (c) p = 0.7.
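The values behind bar charts of this kind can be tabulated directly from the binomial formula. The following is a minimal sketch; `binomial_pmf` is an illustrative helper name, and the printed values are rounded for display only.

```python
from math import comb


def binomial_pmf(i, n, p):
    """Probability of exactly i successes in n independent Bernoulli trials,
    each with success probability p."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


n = 10
for p in (0.3, 0.5, 0.7):
    probs = [binomial_pmf(i, n, p) for i in range(n + 1)]
    # The probabilities for each p should add up to 1.
    print(p, [round(x, 3) for x in probs], "sum =", round(sum(probs), 12))
```

The cases p = 0.3 and p = 0.7 are mirror images of each other, since exchanging p and q exchanges i successes with n − i successes.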
Example 40.2 Three dice are rolled simultaneously. What is the probability that
two 5s appear with the third face showing a different number?
Let the random variable X be the number of 5s which appear. Then
SX = {0, 1, 2, 3}.
The outcomes from each die are independent with a 5 showing called a success and
no 5 showing a failure. The probability that a single die shows a 5 is 1/6. Hence X has a
binomial distribution with parameters n = 3 and p = 1/6. Hence, by (40.2),
$$P(X = 2) = {}^{3}C_2 \left(\tfrac{1}{6}\right)^2 \left(\tfrac{5}{6}\right) = \tfrac{15}{216} \approx 0.069,$$
which is quite small. The other probabilities are
$$P(X = 0) = \tfrac{125}{216} \approx 0.579, \qquad P(X = 1) = \tfrac{75}{216} \approx 0.347, \qquad P(X = 3) = \tfrac{1}{216} \approx 0.005.$$
Self-test 40.2
In a population of 1000 individuals, it is found that 350 are of height greater
than 1.8 m. A random sample of eight individuals is chosen. Find the probability
distribution of the number of individuals of height greater than 1.8 m.
40.3
The expected value or mean or expectation of a random variable is defined in
terms of a weighted average of outcomes: the weighting is equal to the probability
For the binomial distribution (40.2) with parameters n and p, the expected
value is (note that the distribution has n + 1 elements)
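$$\begin{aligned}
E(X) &= \sum_{i=0}^{n} {}^{n}C_i\, p^i q^{n-i}\, i \\
&= \sum_{i=1}^{n} \frac{n!\,p^i q^{n-i}}{(n-i)!\,(i-1)!} = np \sum_{i=1}^{n} \frac{(n-1)!\,p^{i-1} q^{n-i}}{(n-i)!\,(i-1)!} \\
&= np \sum_{i=0}^{n-1} \frac{(n-1)!\,p^{i} q^{n-i-1}}{(n-1-i)!\,i!} = np(p + q)^{n-1} \\
&= np,
\end{aligned}$$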
using the binomial expansion (Appendix A(c)).
Example 40.3 In Example 40.2 what is the expected value of the number of 5s
which appear when three dice are rolled?
From the definition of expected value and the results in the previous example,
$$E(X) = \sum_{i=0}^{3} P(X = i)\,i = \tfrac{125}{216} \cdot 0 + \tfrac{75}{216} \cdot 1 + \tfrac{15}{216} \cdot 2 + \tfrac{1}{216} \cdot 3 = \tfrac{108}{216} = \tfrac{1}{2}.$$
This result checks with np = 3/6 = 1/2.
can have the same mean but can have very different shapes in relation to the mean.
A measure of the spread is the difference X − E(X), the difference between the
random variable and its mean. However, its expectation is always zero since,
using (40.4)(i), (ii) above,
$$E(X - E(X)) = \sum_{i=1}^{n} (x_i - E(X))\,p_i = \sum_{i=1}^{n} x_i p_i - E(X) \sum_{i=1}^{n} p_i = E(X) - E(X) \cdot 1 = 0,$$
which is obviously not helpful as a measure of spread. Instead we choose the random
variable $(X - E(X))^2$. Its expected value is known as the variance and is denoted by
Using (40.4), note that the variance can be expressed in the form
$$\operatorname{Var}(X) = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2. \tag{40.6}$$
Example 40.4 Find the variance of the binomial distribution given by (40.2).
Using (40.2) and (40.4)(iv)
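$$\operatorname{Var}(X) = E(X^2) - \mu^2 = \sum_{i=0}^{n} i^2\, {}^{n}C_i\, p^i q^{n-i} - \mu^2 = \sum_{i=1}^{n} \frac{i \cdot n!\,p^i q^{n-i}}{(n-i)!\,(i-1)!} - \mu^2.$$
As a device for summing the series we assume that p and q are independent parameters,
and use the formula $E(X) = np(p + q)^{n-1}$ for the expected value of the binomial
distribution. Thus
$$\begin{aligned}
\sum_{i=1}^{n} \frac{i \cdot n!\,p^i q^{n-i}}{(n-i)!\,(i-1)!} - \mu^2
&= p \frac{\partial}{\partial p} \left( \sum_{i=1}^{n} \frac{n!\,p^i q^{n-i}}{(n-i)!\,(i-1)!} \right) - \mu^2 \\
&= p \frac{\partial}{\partial p} (E(X)) - \mu^2 = p \frac{\partial}{\partial p} \left( np(p + q)^{n-1} \right) - \mu^2 \\
&= p[n(p + q)^{n-1} + n(n-1)p(p + q)^{n-2}] - n^2 p^2 \\
&= pn[1 + p(n-1)] - n^2 p^2 \quad (\text{since } p + q = 1) \\
&= np(1 - p).
\end{aligned}$$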
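The results E(X) = np and Var(X) = np(1 − p) can be confirmed by summing the distribution directly. The sketch below uses exact fractions and the dice data of Example 40.2 (n = 3, p = 1/6); the helper name `binomial_moments` is illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_moments(n, p):
    """Mean and variance of the binomial distribution, computed from the
    definitions E(X) = sum of i*p_i and Var(X) = E(X^2) - mu^2."""
    q = 1 - p
    pmf = [comb(n, i) * p**i * q ** (n - i) for i in range(n + 1)]
    mean = sum(i * p_i for i, p_i in enumerate(pmf))
    variance = sum(i * i * p_i for i, p_i in enumerate(pmf)) - mean**2
    return mean, variance


n, p = 3, Fraction(1, 6)
mean, variance = binomial_moments(n, p)
print(mean, variance)          # direct summation
print(n * p, n * p * (1 - p))  # closed forms np and np(1 - p)
```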
The following rules for variances can be proved:
Rules for variances
(i) Var(X + c) = Var(X);
(ii) Var(cX) = c² Var(X);
(iii) Var(X + Y) = Var(X) + Var(Y) (if X and Y are independent). (40.8)
Self-test 40.3
Find the expected value and variance of the distribution obtained in
Self-test 40.2.
40.4 Geometric distribution
Note that
$$\sum_{i=1}^{\infty} p_i = p \sum_{i=1}^{\infty} q^{i-1} = p(1 + q + q^2 + \cdots) = \frac{p}{1 - q} = 1,$$
using the formula for the sum of a geometric series (see Section 1.16). A bar chart
of a geometric distribution with p = 0.2 is shown in Fig. 40.4.
The expected value of the random variable of the geometric distribution is
$$\mu = E(X) = \sum_{i=1}^{\infty} i p_i = \sum_{i=1}^{\infty} i p q^{i-1} = p(1 + 2q + 3q^2 + \cdots).$$
Fig. 40.4 Geometric distribution with p = 0.2.
Since 1 + 2q + 3q² + … = 1/(1 − q)² = 1/p², it follows that µ = 1/p. In a similar manner it can be shown that the variance is given by
σ² = (1 − p)/p².
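These two results can be checked numerically by truncating the infinite series. The sketch below takes the geometric distribution in the form p_i = q^(i−1) p used above; the truncation length of 2000 terms is an arbitrary choice that is more than adequate for p = 0.2.

```python
def geometric_moments(p, terms=2000):
    """Approximate mean and variance of p_i = q**(i - 1) * p (i = 1, 2, ...)
    by summing the first `terms` terms of each series."""
    q = 1 - p
    mean = sum(i * p * q ** (i - 1) for i in range(1, terms + 1))
    second_moment = sum(i * i * p * q ** (i - 1) for i in range(1, terms + 1))
    return mean, second_moment - mean**2


p = 0.2
mean, variance = geometric_moments(p)
print(mean, 1 / p)                 # series against 1/p
print(variance, (1 - p) / p**2)    # series against (1 - p)/p^2
```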
$$E(C(X)) = E(700X - 200) = 700E(X) - 200 = \frac{700}{p} - 200,$$
since X is a random variable with a geometric distribution. This expected cost is less
than £2000 if
$$\frac{700}{p} - 200 < 2000 \quad\text{or}\quad \frac{700}{p} < 2200.$$
Hence the probability must satisfy the inequality p > 7/22 ≈ 0.32.
Self-test 40.4
For the geometric distribution, prove that its variance is (1 − p)/p².
40.5 Poisson distribution
accumulate. The distribution is appropriate for data arriving in a sequential
random manner.
The Poisson distribution has mean
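$$\begin{aligned}
\mu = E(X) &= \sum_{n=0}^{\infty} \frac{n \lambda^n e^{-\lambda}}{n!} = \sum_{n=1}^{\infty} \frac{\lambda^n e^{-\lambda}}{(n-1)!} \\
&= \lambda \sum_{n=0}^{\infty} \frac{\lambda^n e^{-\lambda}}{n!} = \lambda e^{-\lambda} \sum_{n=0}^{\infty} \frac{\lambda^n}{n!} = \lambda e^{-\lambda} e^{\lambda} = \lambda.
\end{aligned}$$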
⎛ λ⎞
i
⎝ n⎠
⎢ i! ⎜1 − ⎟ ⎥
⎢⎣ ⎝ n⎠ ⎥⎦
Example 40.6 Certain processors are known to have a failure rate of 1.2%.
They are shipped in batches of 150. What is the probability that a batch has
exactly one defective processor? What is the probability that it has two?
We assume that the defects are independent. We use the binomial distribution with
probability
$$p_i(n, p) = {}^{n}C_i\, p^i q^{n-i}$$
with n = 150 and p = 0.012 (failure of the component is ‘success’ in the binomial
convention). Hence for i = 1, 2, a direct calculation gives
$$p_1(150, 0.012) = {}^{150}C_1 (0.988)^{149} (0.012)^1 = 0.297\,891,$$
$$p_2(150, 0.012) = {}^{150}C_2 (0.988)^{148} (0.012)^2 = 0.269\,549.$$
In this problem n is ‘large’, so that it is suitable for the Poisson approximation. The
parameter λ for the corresponding Poisson distribution is given by
λ = np = 150 × 0.012 = 1.8.
Hence the probability of one failure is
$$\frac{\lambda}{1!}\, e^{-\lambda} = 1.8\, e^{-1.8} = 0.297\,538,$$
and of two failures is
$$\frac{\lambda^2}{2!}\, e^{-\lambda} = \frac{(1.8)^2}{2}\, e^{-1.8} = 0.267\,784,$$
which show accuracy to 2 decimal places compared with the binomial distribution. This
is more than sufficient in many applications. The Poisson approximation avoids the
rounding errors which can occur in calculating probabilities raised to large powers.
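The comparison in Example 40.6 is easy to reproduce. The following sketch evaluates both distributions with Python's standard `math` module; the helper names `binomial_pmf` and `poisson_pmf` are illustrative.

```python
from math import comb, exp, factorial


def binomial_pmf(i, n, p):
    """Binomial probability of i successes in n trials."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def poisson_pmf(i, lam):
    """Poisson probability of i events with parameter lam."""
    return lam**i * exp(-lam) / factorial(i)


n, p = 150, 0.012
lam = n * p  # parameter of the approximating Poisson distribution
for i in (1, 2):
    b = binomial_pmf(i, n, p)
    q = poisson_pmf(i, lam)
    print(i, round(b, 6), round(q, 6), "difference:", round(abs(b - q), 6))
```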
Self-test 40.5
The Poisson distribution is given by $p_i = \lambda^i e^{-\lambda}/i!$ (i = 0, 1, 2, …) and the
binomial distribution by $b_i = n!\,p^i (1 - p)^{n-i}/[(n - i)!\,i!]$ (i = 0, 1, 2, …, n) (see
(40.2)). As shown in this section, the Poisson distribution can be used as an
approximation to the binomial distribution for large n. See how the distributions
compare numerically for n = 20, p = 0.1, λ = np = 2, for i = 0, 1, 2, …, 6.
40.7
Consider a box containing w white balls and b black balls. Suppose that n balls are
chosen at random from the box without replacement. What is the probability
that i white balls are chosen? The i balls must be chosen from w, and the n − i balls
where
$$i = \begin{cases} 0, 1, 2, \ldots, n & \text{if } n \le w, \\ 0, 1, 2, \ldots, w & \text{if } n > w. \end{cases}$$
The function pi defines the hypergeometric distribution. Its mean and variance
are given by
$$\mu = \frac{nw}{w + b}, \qquad \sigma^2 = \frac{nwb(w + b - n)}{(w + b)^2 (w + b - 1)}.$$
The same problem with replacement leads to the binomial distribution.
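A small numerical experiment helps to fix the ideas. The sketch below builds the distribution of the number of white balls drawn, as the number of favourable selections divided by the total number of selections, and compares its mean and variance with the standard closed forms nw/(w + b) and nwb(w + b − n)/[(w + b)²(w + b − 1)]. The values of w, b, and n are invented for illustration.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(i, w, b, n):
    """Probability of i white balls when n balls are drawn without
    replacement from a box holding w white and b black balls."""
    return Fraction(comb(w, i) * comb(b, n - i), comb(w + b, n))


w, b, n = 5, 7, 4  # illustrative values, not from the text
pmf = {i: hypergeometric_pmf(i, w, b, n) for i in range(min(n, w) + 1)}

mean = sum(i * p for i, p in pmf.items())
variance = sum(i * i * p for i, p in pmf.items()) - mean**2

print(sum(pmf.values()))  # the probabilities should total 1
print(mean, Fraction(n * w, w + b))
print(variance, Fraction(n * w * b * (w + b - n), (w + b) ** 2 * (w + b - 1)))
```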
x2
A graph of a possible density function f(x) against x is shown in Fig. 40.5. By (a) the
curve must never fall below the x axis, by (b) the area under the curve must be 1,
and by (c) the probability that X lies between two values x1 and x2 is the shaded
area under the graph. Unlike pi, the pdf f(x) is not itself a probability.
We can associate with the pdf a cumulative distribution function (cdf) F(x)
which is defined by
For continuous random variables, P(X = x) is zero which is unhelpful. Only the
probability of X over an interval of x such as P(X ≤ x) has a meaning. By (40.10b)
it follows that
F(x) → 1 as x → ∞,
and
x2
x1
F(x)
1
Example 40.7 Let X be the random variable of time to failure of a light bulb
measured from time t = 0. Assume that X has a pdf
$$f(t) = \begin{cases} \alpha e^{-\alpha t} & t \ge 0 \\ 0 & t < 0, \end{cases}$$
where t is measured in hours. What is the probability that the bulb has failed
at t = 10 hours? What is the probability that the light bulb fails between
t = 10 hours and t = 20 hours?
Note that f(t) is a pdf since f(t) ≥ 0 and
$$\int_{-\infty}^{\infty} f(t)\,dt = \int_{0}^{\infty} \alpha e^{-\alpha t}\,dt = [-e^{-\alpha t}]_{0}^{\infty} = 1.$$
For the first question we require
$$P(X \le 10) = \int_{0}^{10} \alpha e^{-\alpha t}\,dt = [-e^{-\alpha t}]_{0}^{10} = 1 - e^{-10\alpha}.$$
Thus the light bulb fails before 10 hours with probability $1 - e^{-10\alpha}$, and between 10 hours
and 20 hours with probability $e^{-10\alpha}(1 - e^{-10\alpha})$.
The pdf in the previous example is the exponential distribution which is frequently
used in ‘time to failure’ problems. Its cdf is given by
$$F(x) = \begin{cases} \displaystyle\int_{0}^{x} \alpha e^{-\alpha u}\,du = 1 - e^{-\alpha x}, & x \ge 0; \\ 0, & x < 0. \end{cases}$$
Note that density functions do not have to be continuous: they can include jumps.
Also if some event can only take place after a given time, say, then we put the density
function equal to zero until that time.
Self-test 40.6
In Example 40.7, assume that
$$f(t) = \begin{cases} 0 & t < 0 \\ \gamma & 0 \le t \le \beta \\ \gamma e^{-\alpha(t - \beta)} & t > \beta. \end{cases}$$
What is the relation between the parameters α, β and γ for f(t) to be a
pdf? What is the probability that a light bulb has failed by time t0 if
(a) t0 ≤ β; (b) t0 > β?
40.8
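$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$$
$$f(t) = \begin{cases} \alpha e^{-\alpha t}, & t \ge 0, \\ 0, & t < 0, \end{cases}$$
$$\begin{aligned}
\mu = \int_{-\infty}^{\infty} t f(t)\,dt &= \int_{0}^{\infty} \alpha t e^{-\alpha t}\,dt = -\int_{0}^{\infty} t \frac{d}{dt}(e^{-\alpha t})\,dt \\
&= -[t e^{-\alpha t}]_{0}^{\infty} + \int_{0}^{\infty} \frac{d(t)}{dt}\, e^{-\alpha t}\,dt \quad \text{(integrating by parts)} \\
&= 0 + \int_{0}^{\infty} e^{-\alpha t}\,dt = \frac{1}{\alpha}.
\end{aligned}$$
It can be shown similarly that
$$\sigma^2 = \frac{1}{\alpha^2}.$$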
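The exponential results can be checked numerically. The sketch below uses the cdf 1 − e^(−αx) for the light-bulb probabilities and a simple midpoint rule for the mean and variance. The value α = 0.05 per hour, the upper integration limit of 1000 hours, and the helper names are assumptions made purely for illustration.

```python
from math import exp


def exponential_cdf(x, alpha):
    """F(x) = 1 - exp(-alpha * x) for x >= 0, and 0 for x < 0."""
    return 1 - exp(-alpha * x) if x >= 0 else 0.0


def midpoint_integral(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


alpha = 0.05  # hypothetical failure rate per hour

p_fail_by_10 = exponential_cdf(10, alpha)
p_fail_10_to_20 = exponential_cdf(20, alpha) - exponential_cdf(10, alpha)

# The tail beyond t = 1000 is negligible for this alpha.
mean = midpoint_integral(lambda t: t * alpha * exp(-alpha * t), 0, 1000)
second = midpoint_integral(lambda t: t * t * alpha * exp(-alpha * t), 0, 1000)
variance = second - mean**2

print(p_fail_by_10, p_fail_10_to_20)
print(mean, 1 / alpha)
print(variance, 1 / alpha**2)
```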
Self-test 40.7
Prove that the variance of the exponential distribution with pdf
$$f(t) = \begin{cases} 0 & t < 0 \\ \alpha e^{-\alpha t} & t \ge 0 \end{cases}$$
is 1/α².
σ 2π (40.13)
$$\int_{-\infty}^{\infty} f(x)\,dx = 1, \qquad \int_{-\infty}^{\infty} x f(x)\,dx = \mu, \qquad \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \sigma^2.$$
40.9
Fig. 40.7 A normal distribution.
Fig. 40.8 The standard normal curve.
2π
It has mean zero and standard deviation 1. Any normal random variable X with
distribution N( µ, σ 2) can be ‘standardized’ by considering the random variable
Z = (X − µ)/σ. In the distribution (40.12) this is equivalent to the substitution
z = (x − µ)/σ. The standard normal curve representing N(0, 1) is shown in Fig. 40.8.
The standard deviations within 1, 2, 3 units of the mean are also shown in the
figure. If Z is the corresponding random variable then the probability that Z
lies within one standard deviation of the mean zero is the area under the curve
between −1 and 1. Thus
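$$P(-1 \le Z \le 1) = \frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-\frac{1}{2} z^2}\,dz = 0.6827,$$
$$P(-2 \le Z \le 2) = \frac{1}{\sqrt{2\pi}} \int_{-2}^{2} e^{-\frac{1}{2} z^2}\,dz = 0.9545,$$
$$P(-3 \le Z \le 3) = \frac{1}{\sqrt{2\pi}} \int_{-3}^{3} e^{-\frac{1}{2} z^2}\,dz = 0.9973.$$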
The last result implies that there is a 99.73% chance that a selected item lies within
three standard deviations of the mean for the standardized normal distribution.
The importance of the normal distribution lies in the observation that in many
measurements, which almost always involve random experimental errors, the
distribution of the errors seems to be normal.
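The three probabilities quoted above can be reproduced without tables. The sketch below expresses the standard normal cdf through the error function `math.erf` from Python's standard library; the function name `standard_normal_cdf` is an illustrative choice.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Cumulative probability P(Z <= z) for N(0, 1), written in terms of
    the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


for k in (1, 2, 3):
    # Probability that Z lies within k standard deviations of the mean.
    prob = standard_normal_cdf(k) - standard_normal_cdf(-k)
    print(k, round(prob, 4))
```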
The cdf for the standardized normal distribution N(0, 1) is
$$\Phi(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{1}{2} u^2}\,du,$$
Example 40.8 The mean height of 459 university students is 180 cm with
a standard deviation of 4.2 cm. Assuming that the heights are normally
distributed estimate the number of students who have heights greater than
200 cm, and the number who have heights between 175 cm and 185 cm.
For this sample µ = 180 and σ = 4.2. Hence the normal distribution N(180, 17.64)
is given by
$$\frac{1}{4.2\sqrt{2\pi}}\, e^{-(x - 180)^2/35.28}.$$
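The two estimates asked for in Example 40.8 can be obtained by standardizing and using the normal cdf. The following is a sketch under the example's own assumption that heights follow N(180, 17.64); the helper name `normal_cdf` and the use of `math.erf` in place of printed tables are choices made here for illustration.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X distributed as N(mu, sigma^2), by standardizing
    to z = (x - mu)/sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


students, mu, sigma = 459, 180.0, 4.2

p_above_200 = 1 - normal_cdf(200, mu, sigma)
p_175_to_185 = normal_cdf(185, mu, sigma) - normal_cdf(175, mu, sigma)

# Expected numbers of students in each category.
expected_above_200 = students * p_above_200
expected_175_to_185 = students * p_175_to_185
print(expected_above_200, expected_175_to_185)
```

A height of 200 cm is almost five standard deviations above the mean, so the expected number of such students is far below one, while roughly three-quarters of the sample is expected to lie between 175 cm and 185 cm.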
Problems
40.1 A biased coin is spun three times. The probability of a head appearing is 0.45 and of a
tail 0.55. If X is the random variable of the number of heads shown, what is the sample space
of X? What is the probability distribution of X?
Sketch a bar chart showing the probability distribution. What is the probability that X is
greater than or equal to 1, that is P(X ≥ 1)?
40.2 Explain why the sequence
$$p_j = \frac{1}{3}\left(\frac{1}{2^j} + \frac{1}{2^{j-1}}\right) \quad (j = 1, 2, 3, \ldots)$$
can be interpreted as a probability distribution. If P(X = j) = pj find P(X 6).
40.3 The probability of success in a sequence of independent Bernoulli trials is 1/3. If 12
trials take place calculate the probabilities of 0, 1, …, 12 successes. Calculate also the
mean and standard deviation of the random variable which is the number of successes.
40.4 The uniform distribution has the pdf
$$f(x) = \begin{cases} 1/(b - a) & a \le x \le b \\ 0 & \text{elsewhere.} \end{cases}$$
Sketch the graphs of the pdf and its cdf. Find the mean and standard deviation of the uniform
distribution.
40.5 Prove that the variance of the geometric distribution $p_i = (1 - p)^{i-1} p$ (i = 1, 2, …)
is (1 − p)/p².
40.6 Components join a production assembly line in sequence. The probability that a
particular component is faulty is 0.012. How many components (excluding the faulty component)
will be expected to join the assembly line before a faulty one is encountered? What is the
standard deviation of the number of components to failure?
40.7 A coin is spun until a tail is shown. What is the probability that eight heads appear
before the first tail?
40.8 In a series of Bernoulli trials the probability of success is p. Let X be the random
variable of the number of trials until r successes occur. For example, if 1 denotes success
and 0 denotes failure then, in the sequence
1001100011100100
7 successes will have occurred in 16 trials, that is X = 16 for r = 7 in this case. Show that
$$p_i = P(X = i) = \binom{i - 1}{r - 1} p^r (1 - p)^{i - r}$$
for i = r, r + 1, r + 2, … . This is the negative binomial distribution. Confirm that
$$\sum_{i=r}^{\infty} p_i = 1.$$
Show that
$$E(X) = \frac{r}{p}, \qquad \operatorname{Var}(X) = \frac{r(1 - p)}{p^2}.$$
40.9 In a milk-bottling plant bottles are filled with milk and their weights checked. If a
bottle is underweight or more than 4% overweight the production line is stopped and the
problem investigated. Assume that a bottle fails randomly with the same probability p. What
would be an appropriate distribution for this problem? On average it is found that breakdown
occurs every 1503 bottles. What is the probability that an individual bottle fails the weight
test?
40.10 Suppose that the random variable X has the exponential distribution with pdf
$$f(t) = \begin{cases} 1.5\,e^{-1.5t}, & t \ge 0, \\ 0, & t < 0. \end{cases}$$
Find the following probabilities
(a) P(0 X 1); (b) P(X 0); (c) P(X 1);
(d) P(X 1); (e) P(X 2) or P(X 1).
40.11 Calls to a freephone information line are assumed to occur so that the times between
calls are exponentially distributed with mean time of 20 minutes between calls. If X is the
random variable of the time between calls,
(a) What is the probability that there are no calls in a one-hour interval?
(b) What is the probability that there is at least one call within a 15-minute interval?
40.12 A geiger counter is an instrument for counting the number of radioactive particles
emitted by a radioactive sample which strike the instrument. In a probability model of the
counter, the random variable X, which is the number of radioactive particles detected in a
given time interval, has a Poisson distribution
$$P(X = n) = \frac{e^{-\lambda} \lambda^n}{n!} \quad (n = 0, 1, 2, \ldots)$$
where λ is a parameter which characterizes the radioactivity of the sample. Show that the
mean and variance of the probability distribution are both λ.
What is the probability that five or more hits occur in the time interval?
40.13 The random variable Z has a standardized normal distribution. Estimate the following
probabilities:
(a) P(Z 0.8); (b) P(Z 0.7);
(c) P(−0.5 ≤ Z ≤ 0.8).
40.15 It is required in an application that
$$f(t) = \begin{cases} A(a^2 - t^2) & -a \le t \le a \\ 0 & \text{elsewhere} \end{cases}$$
should be a pdf. What should the parameter A be in terms of a? Find the variance of the
distribution. What should A and a be for the distribution to have a standard deviation of 1?
40.16 The time to failure of catalytic converters in exhaust systems of cars is modelled by a
normal random variable with mean of 1200 hours. If 95% of the converters are to last at least
1000 hours without failure, what is the maximum value which the standard deviation of the
normal distribution can take?
Time interval    Number of vehicles
00:00–02:00      6
02:00–04:00      4
04:00–06:00      9
06:00–08:00      21
08:00–10:00      24
10:00–12:00      15
12:00–14:00      16
14:00–16:00      18
16:00–18:00      29
18:00–20:00      20
20:00–22:00      16
22:00–24:00      10
[Histogram: number of vehicles (vertical axis) against time in hours (horizontal axis).]
Given a set of data, the design of a histogram for the data is a matter of judgement.
In the example, the 24 hours were divided into 12 two-hour time intervals,
but we could alternatively have collected the data over 24 one-hour intervals. The intervals
are also known as cells or bins. The intervals should usually be of equal ‘width’.
Also the number of intervals should not be too large for the data. A working rule
is that the number of intervals should increase roughly like √n, where n is the number
of observations. In the example above there are 188 observed vehicles, which according
to the rule suggests about 14 intervals, close to our choice of 12.
Here is another example. A ‘snapshot’ of vehicles on a short stretch of road
is taken at the same hour on the same weekday for 59 occasions. Table 41.2 is a
frequency table, which collates the numbers of cars. The histogram is shown
in Fig. 41.2.
Table 41.2
Number of vehicles (xi)    Frequency (fi)
0                          12
1                          15
2                          13
3                          8
4                          5
5                          3
6                          2
7                          1
[Fig. 41.2: histogram of frequency against the number of vehicles on the stretch of road.]
The sample mean $\bar{X}$ of n observations {xi}, where xi occurs with frequency fi, is
defined by
$$\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i},$$
which is equivalent to the average, or mean, of the total set of observations since
there must be $\sum_{i=1}^{n} f_i$ of them. For the traffic census in Table 41.2
$$\bar{X} = \frac{(0 \times 12) + (1 \times 15) + (2 \times 13) + (3 \times 8) + (4 \times 5) + (5 \times 3) + (6 \times 2) + (7 \times 1)}{12 + 15 + 13 + 8 + 5 + 3 + 2 + 1} = \frac{119}{59} \approx 2.017.$$
The mean $\bar{X}$ of this sample will be an estimate for the true population mean. If the
samples are not classified into categories then fi = 1 and, as before,
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i,$$
the value of xi which occurs most often in a sample, and is therefore most likely to
occur in other random samples. Thus in Table 41.2, the number 1 appears most
often (15 times), so that the mode of these data is 1.
The central item in an ordered list of sample values is known as the median.
Suppose that a list of examination marks is given by
Examination marks: 31, 36, 38, 39, 45, 46, 57, 60, 65, 65, 69, 72, 75, 79
in increasing order. If the sample has an odd number of items, say 2n + 1, then the
median is the (n + 1)th item: if the number is even, say 2n, then it is defined to be
the average of the nth and the (n + 1)th numbers in the ranking. In the list of
examination marks above the median is ½(57 + 60) = 58.5. The mode is 65 but
it would be a number of no particular significance in this list, since the mode
contains only two marks.
In Table 41.2, there are 59 numbers from 0 to 7 consisting of 12 zeros, 15 ones,
etc. The median is the 30th number which will be one of the twos. Hence the
median is 2.
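The three measures above can be checked with a few lines of Python. This is only a minimal sketch: the function names (`expand`, `sample_mean`, `mode`, `median`) are illustrative choices and not part of the text, and exact fractions are used so that the values 119/59 and 58.5 appear without rounding.

```python
from fractions import Fraction

# Frequency table from Table 41.2: number of vehicles -> frequency.
TRAFFIC = {0: 12, 1: 15, 2: 13, 3: 8, 4: 5, 5: 3, 6: 2, 7: 1}

# Examination marks listed in the text (already in increasing order).
MARKS = [31, 36, 38, 39, 45, 46, 57, 60, 65, 65, 69, 72, 75, 79]


def expand(freq):
    """Turn a frequency table into the ordered list of observations."""
    return [x for x in sorted(freq) for _ in range(freq[x])]


def sample_mean(freq):
    """Frequency-weighted mean, sum(f*x) / sum(f), as an exact fraction."""
    return Fraction(sum(x * f for x, f in freq.items()), sum(freq.values()))


def mode(freq):
    """Value with the largest frequency (the first one found if tied)."""
    return max(freq, key=freq.get)


def median(values):
    """Central item of the ordered list, or the mean of the two central items."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return Fraction(ordered[mid - 1] + ordered[mid], 2)


print(sample_mean(TRAFFIC))     # 119/59
print(mode(TRAFFIC))            # 1
print(median(expand(TRAFFIC)))  # 2
print(median(MARKS))            # 117/2
```

Expanding the frequency table into a list is wasteful for large counts, but it keeps the median rule identical to the one stated in the text.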
The box plot displays graphically important features of data such as the
median, the spread, and symmetry of the data, and is particularly useful in com-
paring different data sets, as for example in the results in a series of associated
examination papers. Suppose that the examination marks in three papers are as
percentages (each in increasing order) as shown in Table 41.3. We first find the
median of the marks in each paper. Thus the medians are 59.5 for Paper 1,
61 for Paper 2, and 56 for Paper 3.
Suppose that the data contain 2n observations. Then the first quartile is the median of the n smallest observations, the second quartile is the overall median of the data, and the third quartile the median of the n largest observations. If the data contain 2n + 1 observations then the first quartile is the median of the n + 1 smallest observations and the third quartile the median of the n + 1 largest observations. (The second quartile is the overall median.) The quartiles divide the observations into four approximately equal numbers of observations.

Table 41.3
Paper 1 (16 results)    27, 40, 46, 48, 55, 55, 56, 58, 61, 63, 64, 66, 68, 69, 72, 78
Paper 2 (11 results)    30, 38, 39, 48, 58, 61, 64, 68, 69, 70, 81
Paper 3 (9 results)     26, 40, 43, 54, 56, 61, 62, 72, 74

Table 41.4
           First quartile    Median    Third quartile
Paper 1    51.5              59.5      67
Paper 2    43.5              61        68.5
Paper 3    43                56        62
For the examination marks paper by paper the quartiles are given in Table 41.4.
The difference between the third and first quartiles is a measure of the spread of
the data, and is known as the interquartile range.
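The quartile rule stated above (median of the lower and upper halves, with the central item shared when the number of observations is odd) can be sketched as follows. The function names are illustrative, and other textbooks and software libraries use different quartile conventions, so results may differ slightly from a built-in routine.

```python
def median(values):
    """Central item, or the mean of the two central items, of the sorted data."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 == 1 else (s[mid - 1] + s[mid]) / 2


def quartiles(values):
    """First quartile, median, third quartile using the rule in the text.

    For 2n observations each half has n items; for 2n + 1 observations
    each half has n + 1 items, so the central item belongs to both halves.
    """
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2
    return median(s[:half]), median(s), median(s[n - half:])


def interquartile_range(values):
    """Third quartile minus first quartile."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


# Marks from Table 41.3.
paper1 = [27, 40, 46, 48, 55, 55, 56, 58, 61, 63, 64, 66, 68, 69, 72, 78]
paper2 = [30, 38, 39, 48, 58, 61, 64, 68, 69, 70, 81]
paper3 = [26, 40, 43, 54, 56, 61, 62, 72, 74]

for name, marks in [("Paper 1", paper1), ("Paper 2", paper2), ("Paper 3", paper3)]:
    print(name, quartiles(marks), interquartile_range(marks))
```

For Papers 2 and 3 this reproduces the quartiles 43.5, 61, 68.5 and 43, 56, 62; for Paper 1 the same rule gives 51.5, 59.5, 67.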
Create a vertical scale 0–100 as shown in Fig. 41.3. For each paper, position a
box such that its upper edge is level with the third quartile on the scale, and its
lower edge is level with the first quartile. The line across the middle of the box is
the median. Extend each box by a line to the extreme marks above and below the
box. These lines are known as whiskers. Visually we can see how the average and
spread of the marks compare. A compressed box indicates poor discrimination in
the marks, and long whiskers might indicate exceptional successes or failures
(often known as outliers). Examiners may wish to take remedial action by scaling
in the light of the comparative box plots if there are candidates in common
among the papers.
[Fig. 41.3: box plots for Paper 1, Paper 2, and Paper 3 on a vertical scale 0–100, with the highest mark, third quartile, median, first quartile, and lowest mark labelled.]
Self-test 41.1
41.2
We can estimate p from each poll. Since the mean for the binomial distribution is np, it seems reasonable to estimate p as X/n. We shall denote an estimator for p by $\hat{p}$. Then
\[ E(\hat{p}) = E(X/n) = \frac{1}{n} E(X) = \frac{np}{n} = p, \tag{41.1} \]
since we are assuming a binomial distribution. Hence the expected value of the estimates gives the probability p, the mean of the sampling distribution of $\hat{p}$. Generally, if the expected value of the estimate equals the parameter being estimated, then the estimate is called unbiased; if this is not the case then the estimate is called biased.
The spread of the estimate can be found by calculating the expected value $E[(\hat{p} - p)^2]$. Then
\[
\begin{aligned}
\mathrm{Var}[\hat{p}] = \mathrm{Var}\left[\frac{X}{n}\right]
&= \frac{1}{n^2}\,\mathrm{Var}[X] \quad \text{(by (40.8(ii)))} \\
&= \frac{np(1 - p)}{n^2} = \frac{p(1 - p)}{n},
\end{aligned}
\tag{41.2}
\]
for the binomial distribution. As we might expect, the variance of the sample means decreases with increasing sample size.
Given the one sample at the beginning of this section, the estimate for p is $\hat{p} = 0.56$. The estimated variance (see 40.5) of this single sample, replacing p by $\hat{p}$, is
\[ \hat{\sigma}^2 = \mathrm{Var}[\hat{p}] = \frac{\hat{p}(1 - \hat{p})}{n} = \frac{0.56 \times 0.44}{500} \approx 0.0005. \]
The corresponding standard deviation of $\hat{p}$ is $\hat{\sigma} \approx 0.022$.
We can turn the result round and ask the question: what sample size should we choose to achieve a given $\hat{\sigma}$? Rearranging $\hat{\sigma}^2 = p(1 - p)/n$, the sample size will be
\[ n = \frac{p(1 - p)}{\hat{\sigma}^2}. \]
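As a numerical check of (41.2) and of the sample-size question, the following sketch evaluates the estimate, its variance, and its standard deviation for the poll quoted in the text. The count of 280 favourable responses is an assumption chosen to give the quoted estimate 0.56 with n = 500, and the target standard deviation 0.01 is purely illustrative.

```python
import math


def proportion_estimate(successes, n):
    """Return (p_hat, estimated variance of p_hat, estimated standard deviation)."""
    p_hat = successes / n
    var_hat = p_hat * (1 - p_hat) / n
    return p_hat, var_hat, math.sqrt(var_hat)


def sample_size_for(p, sigma_target):
    """Smallest n with p(1 - p)/n <= sigma_target**2.

    The small offset guards against floating-point round-off when the
    exact quotient happens to be a whole number.
    """
    return math.ceil(p * (1 - p) / sigma_target ** 2 - 1e-9)


# Assumed counts: 280 out of 500 gives the estimate 0.56 used in the text.
p_hat, var_hat, sd_hat = proportion_estimate(280, 500)
print(p_hat)              # 0.56
print(round(var_hat, 6))  # 0.000493
print(round(sd_hat, 3))   # 0.022

# Illustrative target: a standard deviation of 0.01 for the same p.
print(sample_size_for(0.56, 0.01))
```

Halving the standard deviation requires four times the sample size, because the variance is inversely proportional to n.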
If $x_1, x_2, \ldots, x_n$ are values obtained in a particular sample then its mean is
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i. \]
What is the relation between the sample mean and the mean of the population?
The expected value of $\bar{X}$ is
\[ E(\bar{X}) = E\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\mu = \mu. \]
As we might expect, the expected value of the sample mean is the same as the
mean of the population.
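The variance of the sample mean is, by (40.6),
\[ \mathrm{Var}(\bar{X}) = \mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}, \tag{41.4} \]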
where σ 2 is the unknown variance of the population. Its standard deviation σ /√n
is known as the standard error of the sample mean.
We also need an estimate for the variance $\sigma^2$ of the population. We might choose
\[ T^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n}, \]
which is the variance of the sample, but is it unbiased? In other words does its expected value equal $\sigma^2$? The following algebra supplies the answer:
\[ E(T^2) = E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right] = E\left[ \frac{1}{n} \sum_{i=1}^{n} [(X_i - \mu) - (\bar{X} - \mu)]^2 \right] = \cdots \]

For large samples the difference between $T^2$ and $S^2$ is small but it can be significant for small sample sizes. The estimator is often known simply as the sample variance.
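The size of the difference for small samples can be seen exactly in a simple case. The sketch below enumerates every possible sample of n throws of a fair die and computes the expected value of the estimator with divisor n (written T² above) and of the estimator with divisor n − 1. It is assumed here that S² denotes the latter; its definition does not appear in this passage. The function name and the choice of a die are illustrative.

```python
from fractions import Fraction
from itertools import product


def expected_variance_estimators(n, faces=(1, 2, 3, 4, 5, 6)):
    """Exact expected values of the divisor-n and divisor-(n - 1) estimators.

    Every ordered sample of size n from the equally likely faces is visited,
    so no random simulation is involved. Requires n >= 2.
    """
    prob = Fraction(1, len(faces) ** n)
    e_div_n = Fraction(0)
    e_div_n_minus_1 = Fraction(0)
    for sample in product(faces, repeat=n):
        mean = Fraction(sum(sample), n)
        squares = sum((x - mean) ** 2 for x in sample)
        e_div_n += prob * squares / n
        e_div_n_minus_1 += prob * squares / (n - 1)
    return e_div_n, e_div_n_minus_1


# Population variance of one throw of a fair die: E(X^2) - E(X)^2 = 35/12.
SIGMA2 = Fraction(91, 6) - Fraction(7, 2) ** 2

for n in (2, 3, 4):
    biased, unbiased = expected_variance_estimators(n)
    print(n, biased, unbiased, biased / SIGMA2)
```

In each case the divisor-n estimator has expected value (n − 1)σ²/n, while the divisor-(n − 1) estimator has expected value σ² = 35/12.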
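\[ \lim_{n \to \infty} P\left( \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma \sqrt{n}} \le x \right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2} u^2}\, du. \tag{41.6} \]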
This result can be illustrated in the case of the throwing of n dice in which the
frequencies of average scores are kept. The probabilities can easily be computed
(see Project 41.4 in Chapter 42) for small values of n. For example, if n = 2, then
the possible average scores and the probabilities with which they occur are given
in Table 41.5 and Fig. 41.4a. Graphs for n = 2, 4, 6 computed using a program to
generate the bar charts are shown in Fig. 41.4. The bounding curve begins to show
for n = 6 the familiar shape of the normal distribution.
Table 41.5
Average score    1     3/2   2     5/2   3     7/2   4     9/2   5     11/2  6
Probabilities    1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Fig. 41.4 Probabilities versus average scores for rolling two, four and six dice: (a) n = 2, (b) n = 4, (c) n = 6.
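The probabilities behind the bar charts can be generated by repeated convolution of the distribution of a single die. The following is a sketch of one way to do this (the function name is illustrative, and it is not the program referred to in the text); exact fractions are used so that the entries for n = 2 can be compared directly with Table 41.5.

```python
from collections import Counter
from fractions import Fraction


def average_score_distribution(n):
    """Map each possible average score of n fair dice to its exact probability."""
    ways = Counter({0: 1})  # number of ways of reaching each running total
    for _ in range(n):
        updated = Counter()
        for total, count in ways.items():
            for face in range(1, 7):
                updated[total + face] += count
        ways = updated
    outcomes = 6 ** n
    return {Fraction(total, n): Fraction(count, outcomes)
            for total, count in sorted(ways.items())}


for n in (2, 4, 6):
    dist = average_score_distribution(n)
    peak = max(dist, key=dist.get)
    print(n, len(dist), peak, dist[peak])
```

For every n the most likely average is 7/2, and the probabilities cluster more tightly around it as n increases, which is the behaviour the central limit theorem describes.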
Example 41.1 A die is rolled 6000 times. The number T of times face 1 appears is counted. Find $m_1$ and $m_2$ in $P(m_1 \le T \le m_2)$ in order that T should lie within one standard deviation of its mean value 1000.

Let X be the random variable that a 1 appears face up on the die. Then E(X) = 1000. Its variance is given by
\[ \sigma^2 = \mathrm{Var}(X) = E(X^2) - E(X)^2 = 5000/6. \]
By the central limit theorem
\[ P\left( k_1 \le \frac{T - 6000 \cdot \frac{1}{6}}{\sqrt{6000 \cdot \frac{5}{6^2}}} \le k_2 \right) \approx \frac{1}{\sqrt{2\pi}} \int_{k_1}^{k_2} e^{-\frac{1}{2} u^2}\, du, \]
or
\[ P\left( k_1 \sqrt{6000 \cdot \tfrac{5}{6^2}} + 1000 \le T \le k_2 \sqrt{6000 \cdot \tfrac{5}{6^2}} + 1000 \right) \approx \frac{1}{\sqrt{2\pi}} \int_{k_1}^{k_2} e^{-\frac{1}{2} u^2}\, du. \]
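The numbers in Example 41.1 can be evaluated directly. The sketch below assumes the natural reading k1 = −1 and k2 = 1 for "within one standard deviation"; the concluding step of the example is not reproduced in this passage, and the function names are illustrative. The normal integral is evaluated with the error function from the standard library.

```python
import math


def one_sd_interval(n, p=1 / 6):
    """Mean minus and plus one standard deviation for a count with n trials."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean - sd, mean + sd


def normal_probability(k1, k2):
    """Integral of the standard normal density from k1 to k2."""
    def cdf(k):
        return 0.5 * (1 + math.erf(k / math.sqrt(2)))
    return cdf(k2) - cdf(k1)


m1, m2 = one_sd_interval(6000)
print(round(m1, 2), round(m2, 2))            # about 971.13 and 1028.87
print(round(normal_probability(-1, 1), 4))   # about 0.6827
```

So T lies roughly between 971 and 1029 with probability of about 0.68 under the normal approximation.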
Self-test 41.2
Suppose in Example 41.1, the die is rolled 12 000 times. Find the new $m_1$ and $m_2$ in $P(m_1 \le T \le m_2)$ in order that T should be within one standard deviation of its mean of 2000. If n is the number of times the die is rolled, how does $m_2 - m_1$ behave in terms of n?
41.5 Regression
Suppose that we have a set of data in which one quantity is measured in relation
to another quantity. For example, the fuel consumption of a car will vary with
the speed of the car, or the weight of an individual will vary with the height of the
person. We may wish to speculate as to what the relationship is between two (or
more) quantities.
Suppose that a sample of measurements is taken, for example fuel consump-
tion (y) for different speeds (x) of a car. This leads to the paired data (x1, y1),
(x2, y2), … , (xn, yn), in which one or both variables may contain random errors.
We can obtain an idea of the likely relationship between x and y by plotting the
coordinates (xi, yi) as points in rectangular cartesian coordinates, giving what
is known as a scatter diagram. Some examples are shown in Fig. 41.5. If we fit a
curve to the data shown in the scatter diagrams, then we might guess a straight
line fit to the data in Fig. 41.5a, and a curve in Fig. 41.5b, whereas in Fig. 41.5c,
which shows data centred around a point, we might feel that no relationship
exists between the variables. Often in scientific experiments the relationship
between the variables can be inferred from some underlying theory although
parameters may be unknown. For example, it might be known that the formula
relating x and y is linear so that we need to find the best straight line fit to the data.
For others we might need to guess the likely shape of the curve from the scatter of
the data as in Fig. 41.5b.
In some data sets there can be errors in both measurements. In others, one
variable known as the controlled or independent variable x is specified (measure-
ments could be made at specified times which are known accurately) and y, which
will contain random errors, is known as the response or dependent variable.
[Fig. 41.5: scatter diagrams (a), (b), and (c).]
affected by other factors (ambient temperature, engine tuning, etc.). On the other
hand in the height/weight data the measurements could be accurate, although the
weight could vary over time. There is unlikely to be a ‘formula’ relating height
and weight (there may be other parameters involved) but nevertheless it is useful
to have a working relation between the two for life tables used by insurance com-
panies. The process of estimating the response variable from a set of controlled
variables is known as regression.
If the hypothesis is that the data follow a straight line relationship then the
model is known as a linear regression model. This regression model assumes that
the random variable Y of the data {yi} is given by
Y = ax + b + ε,
where a and b are unknown parameters and ε is a random error with mean 0 and
unknown variance σ 2. Note that the variance of Y is
Var(Y) = Var(ax + b + ε) = Var(ax + b) + Var(ε) = Var(ε) = σ 2.
With x as a controlled variable, the vertical deviation of the point (xi, yi) from
the line is
ei = yi − (axi + b).
We use the method of least squares for the sum of the squares of the deviations, which requires the minimum of
\[ f(a, b) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a x_i - b)^2 \]
(see Section 28.6 for a full derivation of a and b). The minimum occurs where ∂f/∂a = ∂f/∂b = 0 and, as in (28.11), the best straight line fit is given by the solution of
\[ a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i, \]
\[ a \sum_{i=1}^{n} x_i + bn = \sum_{i=1}^{n} y_i. \]
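The solutions of these equations are the least-squares estimates for a and b, and using the notation for estimators we shall distinguish them by $\hat{a}$ and $\hat{b}$:

Least-squares estimates:
\[ \hat{b} = \bar{y} - \hat{a}\bar{x}, \qquad \hat{a} = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}, \]
where
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i. \tag{41.7} \]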
The least-squares regression estimator $\hat{y}$ is given by
\[ \hat{y} = \hat{a} x + \hat{b}, \]
and this can be used to estimate y for other values of x. It also defines the equation of the regression line of y on x through the data. The regression line of x on y, which generally will be a different line, can be found similarly.
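The least-squares estimates in (41.7) translate directly into code. The following is a minimal sketch with illustrative function names; the data are the six points given in the Chapter 28 project on least-squares fitting, used here only as a convenient example.

```python
def least_squares_line(xs, ys):
    """Return (a_hat, b_hat) for the regression line y = a_hat * x + b_hat."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope estimate in (41.7).
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    a_hat = sxy / sxx
    b_hat = y_bar - a_hat * x_bar
    return a_hat, b_hat


def predict(x, a_hat, b_hat):
    """Value of the regression estimator at x."""
    return a_hat * x + b_hat


xs = [0, 1, 2, 3, 4, 5]
ys = [1.1, 2, 2.9, 3.9, 4.5, 5.1]
a_hat, b_hat = least_squares_line(xs, ys)
print(round(a_hat, 4), round(b_hat, 4))     # about 0.8143 and 1.2143
print(round(predict(6, a_hat, b_hat), 4))   # estimate of y at x = 6
```

The denominator vanishes if all the x values coincide, in which case no regression line of y on x can be fitted; the sketch does not guard against that case.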
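The estimates $\hat{a}$ and $\hat{b}$ have been obtained by least squares. Are they unbiased estimators of a and b? We can decide the answer to this question by finding their expected values. Thus, noting that $Y_i$ is the random variable with value $y_i$ and that $x_i$ is a controlled variable,
\[
\begin{aligned}
E(\hat{a}) &= E\left[ \frac{\sum_{i=1}^{n} x_i Y_i - n \bar{x}\bar{Y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \right] \\
&= \frac{E\left[ \sum_{i=1}^{n} x_i (a x_i + b + \varepsilon_i) - \bar{x} \sum_{i=1}^{n} (a x_i + b + \varepsilon_i) \right]}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \\
&= \frac{\sum_{i=1}^{n} x_i (a x_i + b) - \sum_{i=1}^{n} \bar{x} (a x_i + b)}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \quad \text{(since } E(\varepsilon_i) = 0\text{)} \\
&= \frac{a \sum_{i=1}^{n} x_i^2 + bn\bar{x} - na\bar{x}^2 - nb\bar{x}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} = a.
\end{aligned}
\]
Also, by (41.9) and the result for $E(\hat{a})$ above,
\[
\begin{aligned}
E(\hat{b}) = E(\bar{Y} - \hat{a}\bar{x})
&= \frac{1}{n} E\left[ \sum_{i=1}^{n} (a x_i + b + \varepsilon_i) \right] - \bar{x} E(\hat{a}) \\
&= \frac{1}{n} \sum_{i=1}^{n} (a x_i + b) - \bar{x} a = a\bar{x} + b - \bar{x} a = b.
\end{aligned}
\]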
Problems

41.1 Find the mean, median, first and third quartiles, and the interquartile range of the following two data sets:
(a) 10, 11, 11, 15, 17, 20, 25, 25, 27, 30, 38, 42, 47;
(b) 5, 12, 15, 16, 20, 29, 29, 32, 39, 44.
Draw box plots for both sets of data.

41.2 In a university degree examination with four papers each taken by 20 candidates the percentage marks are as shown in Table 41.6. Draw comparable box plots for the results.

Table 41.6

41.3 Samples of packets of crisps are weighed at the end of a manufacturing process. Packets have to contain a minimum of 25 g. The sample weights are
25.1, 25.3, 25.0, 25.7, 25.3, 25.2, 25.1, 25.5, 25.7, 25.1.
Calculate the sample mean, mode, and standard deviation.

41.4 In a continuous production process a machine cuts pipes into nominal lengths of 10 metres. The actual lengths in a production run are given in Table 41.7. Draw a histogram over (a) 10 intervals of width 0.1 metres, (b) 5 intervals of width 0.2 metres. Add a frequency polygon to both histograms.

Table 41.7
Length interval (x)    Frequency of pipes
9.5–9.6                1
9.6–9.7                4
9.7–9.8                5
9.8–9.9                12
9.9–10.0               20
10.0–10.1              21
10.1–10.2              15
10.2–10.3              11
10.3–10.4              5
10.4–10.5              2

41.5 In an experiment 127 observations are taken which can be assigned to a maximum of 36 intervals. If you wish to display the data in a histogram, what would be a suitable number of intervals to use?

41.6 A random variable X has a uniform distribution (see Problem 40.4) with pdf
\[ f(x) = \begin{cases} 1, & 1 \le x \le 2; \\ 0, & \text{otherwise.} \end{cases} \]
A random sample of size 35 is taken. Find the mean and estimate the variance of the sample. What can you say about the distribution of the sample mean?

\[ P(1460 \le T \le 1540) = \frac{1}{\sqrt{2\pi}} \int_{k_1}^{k_2} e^{-\frac{1}{2} x^2}\, dx. \]

41.9 Fuel consumption figures for standard urban cycles of a selection of cars together with their weights are given in Table 41.8. Find the least-squares estimator for a regression line of fuel consumption (c) on weight (w).

Table 41.8
Vehicle    Weight, w (kg)    Fuel consumption, c (km l^-1)
A          2100              4.96
B          1350              9.10
C          1008              12.04
D          1323              7.68
E          710               15.15
F          1215              10.98
G          1436              7.75
H          1561              8.25
I          2120              4.85
J          1975              4.64
K          1535              5.56

An unbiased estimator for the variance in linear regression is given by
\[ \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{n - 2}, \]
where $\hat{y}_i = \hat{a} x_i + \hat{b}$. Estimate the variance of the regression line.
One point is some distance from the regression line (such rogue values are known as outliers). If this particular vehicle is excluded from the data how are the regression line and the estimated variance affected?
Part 8
Projects
42 Applications projects using symbolic computing
CONTENTS
42.2 Projects
The following projects are listed by chapter. They are selected samples of prob-
lems and do not cover every topic in the book. The intention is that they can be
approached using mainly built-in Mathematica commands: very few problems
require programming in Mathematica. It is generally inadvisable to attempt
these problems by hand, since many could involve a great deal of manipulation,
although some projects are prompted by examples and problems in the relevant
chapters.
It is worth emphasizing that computer algebra systems usually generate out-
puts or answers without explanation of how the results are arrived at, unless
the programming within them is investigated. Outputs can go wrong for many
mathematical reasons. For example, a curve can oscillate too frequently for the
built-in point spacing to detect, which can result in a false graph. This can be cor-
rected by increasing the number of plot points, but the potential difficulty has to
be recognized at the formulation stage. Symbolic computation is not a substitute
for understanding mathematical techniques.
Mathematica notebooks for each project are available on the web at:
www.oxfordtextbooks.co.uk/orc/jordan_smith4e
1. Define the function
\[ f(x) = \frac{x \sin x - 1 + \cos x}{\sin 2x + 2 - 2e^x}. \]
Find $\lim_{x \to 0} f(x)$. Plot the function for $-0.5 \le x \le -0.001$ and for $0.001 \le x \le 0.5$, and check graphically that this agrees with the limit.
2. Find the derivative of
\[ f(x) = 7x^2 + 8x^3 + 9x^4 + 10x^5 + 11x^6 + 12x^7 \]
and its values f′(0.2) and f′(0.4).
3. Find the derivative of
\[ f(x) = x^4 + 2x^3 - 3x^2 - 2x + 4. \]
Find the approximate values of x where f′(x) = 0, using a numerical solution routine. Plot graphs of y = f(x) and y = f′(x) on the same axes and compare the zeros of f′(x) with the zero slopes on y = f(x).
4. Find the equation of the tangent to the curve
\[ y = x \sin 2x \]
at x = 0.7. Plot the graphs of the curve and its tangent.
5. Find the first three derivatives of
\[ f(x) = x \sin^2 x + x^2 \sin(x^2), \]
and confirm that the first nonzero higher derivative at x = 0 is $f^{(3)}(0) = 6$.
6. Plot the graphs of y = f(x), y = f′(x), and y = f″(x) for
\[ f(x) = x^2 (x^2 - 3) \]
in the interval $-2 \le x \le 2.5$. (This should confirm the results from Problem 2.19.)

Chapter 3
1. Display rules for the derivatives of the following general forms:
(a) f(x)g(x);
(b) f(x)/g(x);
(c) f(g(x));
(d) f(x)g(x)h(x);
(e) f(x)g(x)/h(x);
(f) f(h(x))/h(x).
2. Find the first derivatives of
\[ f(x) = e^{\sin^2 x} \cos x \sin x. \]
The function is periodic. What is its minimum period? Plot its graph and the graph of f′(x) over one cycle. Estimate where f(x) is stationary and then find each of the roots of f′(x) = 0 to 5 decimal places using a root-finding routine.

find dy/dx as a function of x and y.

Chapter 4
1. Display rules for the first and second derivatives with respect to x of the following general forms:
(a) $f(x^2)$;
(b) $f(\sin x)$;
(c) $f(\sin(x^2))$.
2. Find the first and second derivatives of
\[ f(x) = 0.1x^5 - 0.5x^4 + 0.2x^3 + x^2 - 0.7x + 2.2. \]
Estimate the roots of f′(x) = 0 from a graph of y = f(x). Then find the roots to 5 decimal places by a root-finding routine. Calculate f″(x) at each stationary point, and confirm the second-derivative test for stationary points. Points of inflection are given by f″(x) = 0. Find their locations on the original graph of y = f(x).
3. Plot the graph of
\[ y = \frac{x^2 - 1}{2x + 1}, \]
and its asymptotes $y = \tfrac{1}{2}x - \tfrac{1}{4}$ and $x = -\tfrac{1}{2}$ (see Fig. 4.13).
4. Plot the graph of $y = f(x) = x^5 - 2x^3 + x^2 - 3x + 1$ in the interval $-1 \le x \le 3$, and estimate the roots of f(x) = 0 in this interval. Set up a Newton routine
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \]
for calculating the roots of f(x) = 0, and find, starting at x = 0.5 and 1.6, the roots to 10 significant figures. What is the smallest number of iterations required in each case to calculate the roots to 10 significant figures?
5. Plot the graph of y = x + sin 5x in the interval $0 \le x \le 25$ using
(a) the default plotting routine,
(b) plotting with 20 plot points,
(c) plotting with 50 plot points.
Explain why the graphs are different for this type of function.

Chapter 5
1. Obtain formulae for the Taylor polynomials for the following functions centred at x = a as far as $(x - a)^3$:
(a) f(x); (b) $[f(x)]^2$;
(c) f(x)g(x); (d) $e^{f(x)}$.
State the coefficient of $(x - a)^2$ in each case.
2. Find Taylor expansions about x = 0 up to and including $x^5$ for each of the following functions:
(a) $e^x$; (b) $(x + 1) \cos x$;
(c) $\ln(1 + \sin x)$; (d) $\exp(\sin(e^x - 1))$.
3. Find the Taylor polynomials for $(\sin^2 x)/x^2$ up to and including $x^N$ for N = 2, 4, 6. Plot the graphs of the function and its Taylor polynomials for $0.001 \le x \le 2$, and compare them. At approximately what values of x do the Taylor polynomials visibly part company from the exact function?
4. Find the Taylor polynomials for ln x about x = 1 for N = 6. Construct an error function which is the difference of ln x and its Taylor polynomial. Show that, at 2.159 approximately, this error starts to exceed 0.2 as x increases. Plot this error function against x for $1 \le x \le 2.2$.

Chapter 7
1. Let
\[
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ -2 & 3 & -4 & 1 \\ 3 & 4 & 1 & 2 \\ 4 & -1 & 2 & 3 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 1 & -2 & 1 & 2 \\ -3 & 1 & -3 & 1 \\ 2 & 1 & 2 & 1 \end{bmatrix}, \quad
C = \begin{bmatrix} 3 & 1 & 2 & 1 \\ p & p & 1 & 2 \\ 1 & -2 & -3 & 2 \\ 2 & 1 & 0 & -1 \end{bmatrix}.
\]
Find and compare
(a) AB and BA; (b) A(BC) and (AB)C;
(c) $(A + B)^T$ and $A^T + B^T$; (d) $(AB)^T$ and $B^T A^T$.
2. Find the inverse of
\[ \begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{bmatrix} \]
(see Problem 7.18). Find the equation of the parabola of the form $y = a + bx + cx^2$ through the points $(-1, -2)$, $(\tfrac{1}{2}, -1)$, and $(\tfrac{5}{2}, 2)$.
3. Let
\[
A = \begin{bmatrix}
\tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{6} & \tfrac{1}{6} \\
\tfrac{1}{4} & \tfrac{1}{2} & \tfrac{1}{8} & \tfrac{1}{8} \\
\tfrac{1}{8} & \tfrac{1}{4} & \tfrac{3}{8} & \tfrac{1}{4} \\
\tfrac{1}{2} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6}
\end{bmatrix}.
\]

\[
\text{(a)} \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix}; \quad
\text{(b)} \begin{vmatrix} 1 & 1 & 1 & 1 \\ a & b & c & d \\ a^2 & b^2 & c^2 & d^2 \\ a^3 & b^3 & c^3 & d^3 \end{vmatrix}; \quad
\text{(c)} \begin{vmatrix} 1 & 1 & 1 & 1 \\ a & b & c & d \\ a^2 & b^2 & c^2 & d^2 \\ a^4 & b^4 & c^4 & d^4 \end{vmatrix}.
\]
3. Find the values of a for which
\[ \begin{vmatrix} 5 & a & -1 & 1 \\ 2 & 1 & a & 2 \\ 3 & a & 1 & 4 \\ -1 & 0 & a & 2 \end{vmatrix} \]
is zero.
Chapter 9
1. Plot the curve which has the position vector
\[ \mathbf{r} = (2 \cos t)\hat{\mathbf{i}} + (2 \sin t)\hat{\mathbf{j}} + 0.3t\,\hat{\mathbf{k}} \]
from t = 0 to t = 20. What is the curve called? The position vector represents a particle moving along the curve. Find the velocity vector $\dot{\mathbf{r}}$ and the acceleration vector $\ddot{\mathbf{r}}$ of the particle. Show that $\dot{\mathbf{r}} \cdot \ddot{\mathbf{r}} = 0$.
2. Plot the trefoil knot given parametrically by
\[ \mathbf{r} = (1 + a \cos 3t)(\cos 2t\,\hat{\mathbf{i}} + \sin 2t\,\hat{\mathbf{j}}) + a \sin 3t\,\hat{\mathbf{k}} \]
with a ∼ 0.25 and $0 \le t \le 2\pi$.

Chapter 10
1. Show that
\[
\begin{bmatrix}
-2^{-3/2} & 2^{-1/2} & 2^{-3/2}\,3^{1/2} \\
-2^{-3/2} & -2^{-1/2} & 2^{-3/2}\,3^{1/2} \\
2^{-1}\,3^{1/2} & 0 & 2^{-1}
\end{bmatrix}
\]
defines a rotation of axes. If each row defines the direction of the X, Y, Z axes in the x, y, z frame, find the equation of the plane x + 2y − 2z = 1 in the new axes.

Chapter 11
1. The area of a triangle whose vertices are the points with position vectors a, b, and c is given by the formula
\[ \tfrac{1}{2} |\mathbf{b} \times \mathbf{c} + \mathbf{c} \times \mathbf{a} + \mathbf{a} \times \mathbf{b}|. \]
Devise a program based on this formula to determine the area for general vertices. What is the area if a = (1, 0, 1), b = (2, −1, 1), and c = (1, 1, 2)? Plot a diagram showing the triangle.
2. A tetrahedron has vertices with position vectors
a = (1, −1, 2), b = (−1, 2, 3), c = (2, −1, 3), d = (1, 3, −2).
Find its surface area. Draw a three-dimensional plot showing the tetrahedron viewed from the point with position vector (2.1, −2.4, 1.5).

Chapter 12
1. Use a row-reduction routine to solve the linear equations
x + 2y − 3z = q,
2x + py + z = −1,
x − 2y − z = 4,
where p and q are two parameters. Determine for what values of p and q the equations have (a) a unique solution, (b) no solution, (c) an infinite set of solutions.
2. Use a row-reduction method to solve the linear equations
x + 2y + pz = 5,
3x + 2y + z = q,
2x − y + 4z = 7,
where p and q are two parameters. Confirm that
\[ z = \frac{63 - 5q}{11 + 7p} \quad (p \ne -\tfrac{11}{7}), \]
and discuss the nature of solutions for all values of p and q.
3. Using a row-reduction instruction, show that
\[
\begin{aligned}
x_1 + 3x_3 &= 5, \\
-x_1 + x_2 - x_3 + x_4 &= -1, \\
x_1 + 2x_2 + 11x_3 &= 4, \\
-x_1 + 2x_2 + 3x_3 + x_4 &= 3
\end{aligned}
\]
is an inconsistent set of equations.

Chapter 13
1. Find the eigenvalues and eigenvectors of
\[ A = \begin{bmatrix} -6 & 1 & 2 & 0 \\ 1 & 0 & -3 & -1 \\ 2 & 1 & -6 & 0 \\ -2 & 2 & 0 & -3 \end{bmatrix}. \]
How many linearly independent eigenvectors does A have? Find the eigenvalues of the following matrices:
(a) $A^{-1}$; (b) $A^2$; (c) A + kI.
2. Find the eigenvalues and eigenvectors of
\[ A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}. \]
Construct a matrix C of eigenvectors and confirm that
\[ A = CDC^{-1}, \]
where D is a diagonal matrix of eigenvalues. Obtain the general formula for
\[ A^n = CD^nC^{-1}. \]
3. Find the inverse and transpose of
\[ A = \tfrac{1}{3} \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{bmatrix}, \]
and verify that A is an orthogonal matrix. Find the eigenvalues of A. What expected property do they have?
⎡ 5 5 − 6 2⎤
⎢−3 13 − 6 2⎥
A=⎢
⎢
−3 7 0 2⎥
⎥
⎢⎣ 3 −15 12 2⎥⎦
.
(f) (1 dx− x ) . 3
Find the expression det(A − λ I4), and Check each answer by recovering the integrands
by differentiation.
demonstrate the Cayley–Hamilton theorem
of Problem 13.21. 3. Evaluate the following definite integrals:
√(5 +x4xdx− 4x ) ;
2 1
Chapter 14
(a) x(ln x)3 dx; (b)
1. Plot the graphs of the derivative dy /dx = sin 2x
2
1 0
and the equation of the curve through (π, −1) of 1
–2
(c) (d) ∑
1 100
3 n
x dx x
which this is the derivative (see Example 14.7). ; dx.
(1 − x )
0
2 –52
x!
0 n=0
2. Plot the graph of
4. Find
dy
= x e−x + sin x − x2 cos 2x, a
dx
for 0 x 10. Show that an antiderivative
I(a) = (ln x) dx.
1
3
Chapter 15
1. Set up a program to compute the area under the
(ln x) dx
1
3
42
(lnxx) dx
42.2
6
Apply the program also to Problem 16.20. f(a) = 2
1
3. A thin plane metal plate consists of an isosceles
for a 1. Find f(10), f(20), and f(∞). The
PROJECTS
triangle of height h and base length 2a with a
semicircle of radius a attached symmetrically by results indicate that f(a) tends to a limit very
its diameter to the base of the triangle. Find the slowly as a → ∞. Find where
location of its centroid on its axis of symmetry. (ln x)6
g(x) =
4. Set up a program to generate Simpson’s rule x2
has a maximum value, and plot the graph
b
∞
Chapter 21
(c) 3
x e −ax2
dx. 1. Draw the phasor diagram of the sum of the
1 three phasors of
\[ u(t) = 2 \cos 10t, \quad v(t) = \cos(10t - \tfrac{1}{2}\pi), \quad w(t) = 3 \cos(10t + \tfrac{1}{4}\pi) \]

over the interval $-1.5 \le x \le 1.5$, show that the solutions appear to be periodic.
2. Plot phase paths for the van der Pol equation
\[ \ddot{x} + 10(x^2 - 1)\dot{x} + x = 0 \]
showing the limit cycle. Also show the corresponding (t, x) graph of the periodic solution (the periodic solution has an initial value close to x(0) = 2, $\dot{x}(0) = 0$).

Chapter 24
Derive a program to calculate Euler’s constant. It should give γ = 0.577 215… .

Chapter 25
1. Find the Laplace transform of the solution of
\[ \ddot{x} + \omega^2 x = a\,\delta(t - 1), \quad x(0) = \dot{x}(0) = 0, \]
which has impulse input applied at time t = 1. Invert the transform and plot the output for ω = 4, a = 1 (see Example 25.3).
2. Following the previous project, solve the more Chapter 27
complicated problem with two impulses:
1. Find the Fourier transforms of the following
2F + 3B + 2x = a δ(t − π) cos t + b δ(t − 2π), functions:
x(0) = B(0) = 0. (a) the top-hat function Π(t);
(b) the one-sided exponential e−t H(t);
Plot the output for a = b = 1. (c) e−| t |;
3. Let f(t) = t 3, g(t) = cos t. Find the convolution (d) e−| t−1 |;
(e) 1 /(1 + t 2).
L{ f(t)}L{g(t)} = L 89 (c) 2;
0 (d) 2 cos( f − a).
⎧ 1 (0 t 1),
f (t) = ⎨ verify that
⎩−1 (−1 t 0).
∂2f ∂2f
Plot the graphs of f(t) and the first 12 terms of = .
its Fourier series. The graph should show the ∂y ∂x ∂x ∂y
Gibbs’ phenomenon, in which the Fourier series 4. Plot the surface given by z = cos xy over
approximation overshoots the function at −π x π, − --12 π y --12 π. Find the partial
discontinuities. You can try it with (say) 20 derivatives at (--14 π, 1) and construct the equation
terms or more, but you should include more of the tangent plane there. Finally plot the
interpolating points in these cases. surface and its tangent plane.
3. Find the Fourier coefficients of the 2π-periodic 5. Find the stationary points of
function defined by
f(x, y) = 0.3x3 + 0.2y2 − x2y − xy + 2y
f(x) = x6 − 5π2x4 + 7π 4x2 numerically by solving
on the interval −π x π. What is the sum of
∂f ∂f
the series = = 0.
∂x ∂y
∞
(−1)n+1
∑ n6
? Plot the contours on the (x, y) plane for
n=1 −3 x 3, −9 y 3.
Find the values of the second derivatives at z = x + y e−xy + xy
each stationary point and check the second over 0 x 1, −1 y 1. Interpret the
derivative tests (28.9) at each point. integral as the volume under the surface. Does
6. Find the least-squares straight line fit to the the integral contain ‘negative’ volumes under
points the surface? Plot the positive part of the surface
over the same rectangle.
(0, 1.1), (1, 2), (2, 2.9), (3, 3.9),
(4, 4.5), (5, 5.1), 2. Evaluate the repeated integral
√(a2 −y2)/a
in the (x, y) plane. Plot the data and the least- a
1. A and B are the sets of integers defined by
\[ A = \{2n + 5(-1)^n \mid n \in \mathbb{N}^+,\ 1 \le n \le 100\}, \]
\[ B = \{n^2 - n + 1 \mid n \in \mathbb{N}^+,\ 1 \le n \le 10\}. \]
Produce lists of the elements in A ∪ B and A ∩ B. How many elements do each of these sets have?
2. Let A, B, and C be the following sets:
\[ A = \{n(n - 1) \mid n \in \mathbb{N}^+,\ 2 \le n \le 100\}, \]
\[ B = \{|n^2 - 100n| \mid n \in \mathbb{N}^+,\ 1 \le n \le 160\}, \]
\[ C = \{4n \mid n \in \mathbb{N}^+,\ 1 \le n \le 2200\}. \]
Verify the first distributive law
\[ A \cap (B \cup C) = (A \cap B) \cup (A \cap C). \]
How many elements are there in the set A ∩ (B ∪ A)?

Chapter 36
1. Design programs to generate the truth tables for the or gate, the and gate, the not gate, the nand gate, and the nor gate.
2. Design a program to simulate the truth table in Example 36.3 which has the output
f = 15452 ⊕ b ⊕ c
for inputs a, b, and c.

Chapter 37
1. In the cutset method applied to the circuit in Fig. 37.23, the currents $i_1, i_2, i_3, i_4, i_5$ and the voltages $v_a, v_b, v_c, v_d$ satisfy the nine equations
\[
\begin{aligned}
& i_1 - i_3 + i_2 = 0, \qquad i_X - i_3 + i_2 = 0, \\
& -i_Y + i_5 - i_3 + i_2 = 0, \\
& -i_Y + i_4 + i_2 = 0, \\
& i_1 = (v_a - v_b)/R_1, \\
& i_2 = (v_c - v_b)/R_2, \\
& i_3 = (v_b - v_d)/R_3, \\
& i_4 = (v_c - v_d)/R_4, \\
& i_5 = v_d/R_5,
\end{aligned}
\]
where $i_X$ = 2 A, $i_Y$ = 2 A, and $R_1 = \tfrac{1}{2}$ Ω, $R_2$ = 3 Ω, $R_3$ = 1 Ω, $R_4$ = 2 Ω, $R_5$ = 2 Ω. Solve this set of linear equations for the currents and voltages.
2. Draw the labelled drawings of the bipartite graphs $K_{5,6}$ and $K_{6,6}$. Answer the following for each graph by the built-in diagnostic test.
(a) How many edges has each graph?
(b) Is the graph eulerian? If it is, list an eulerian walk.
(c) Is it hamiltonian? If it is, list a hamiltonian cycle.

for planarity, using a built-in diagnostic test.

Chapter 38
1. Rework Example 38.2 using a symbolic package for solving difference equations. Solve the mortgage difference equation
\[ Q_m - (1 + I)Q_{m-1} = -A, \]
with I = 0.08 and $Q_0 = P = 50\,000$ (in £). Given that $Q_{25} = 0$, find A. List the outstanding debt $Q_m$ each year m to the nearest £. Plot (a) the outstanding debt against years and (b) the annual interest repayments $A - IQ_m$ against years.
2. Solve the following homogeneous difference equations:
(a) $u_{n+2} - u_{n+1} - 12u_n = 0$;
(b) $u_{n+2} + 2u_{n+1} + 2u_n = 0$;
(c) $u_{n+2} + 4u_{n+1} + 4u_n = 0$;
(d) $u_{n+3} + 3u_{n+2} + 3u_{n+1} + u_n = 0$, $u_0 = 0$, $u_1 = 1$, $u_2 = -1$.
3. Solve the following inhomogeneous difference equations:
(a) $u_{n+2} - u_{n+1} - 12u_n = 2 + n + n^2$;
(b) $u_{n+2} - u_{n+1} + 4u_n = 2n$;
(c) $u_{n+3} + 3u_{n+2} + 3u_{n+1} + u_n = n^2$, $u_0 = 0$, $u_1 = 1$, $u_2 = -1$.
4. Devise a program to generate cobweb plots for the first-order difference equation
\[ u_{n+1} = -ku_n + k \]
for (a) $k = \tfrac{1}{2}$, (b) $k = \tfrac{3}{2}$, (c) k = 1, with initial value $u_0 = \tfrac{3}{4}$ in each case (see Example 38.3).
5. Display cobweb plots for the logistic difference equation
\[ u_{n+1} = \alpha u_n (1 - u_n) \]
for selected values of α. Some suggested values are:
(a) α = 2.8 to show a stable fixed point;
(b) α = 3.4: find the period-2 solution;
(c) α = 3.5: find the period-4 solution;
(d) α = 3.7: chaotic output;
(e) α = 3.83: should be able to locate a stable period-3 solution.
6. Design a program to generate the period-doubling display shown in Fig. 38.11 for the logistic equation $u_{n+1} = \alpha u_n (1 - u_n)$ for α increasing from α = 2.8 to α = 4.

Chapter 39
1. (See Example 39.8.) A box contains 40 balls of which 7 are red, 12 are white, and 21 are black. In each of the cases n = 2, 3, 4, 5, 6, 7, n balls are drawn at random from the box without

Chapter 41
Chapter 1 Chapter 3
1.1 2 x 3. 3.1 dy/dx = ex(sin x + cos x).
− ;
(a − b)2(x − b) (c) df(x )/dx) = 2xex [cos(x4) − 2x2 sin(x4)].
2 2
Chapter 5
Chapter 2 5.1 1 + 2x + 2x2 + 43 x 3 + 23 x4 + 15
4 5
x.
2.1 Tangent: y = −2x + 2; normal: y = − 4x + 19
8 ; 5.2 Required accuracy needs terms as far as x5.
3 13
intersection point ( , ). 16 8 5.3 1 − 12 x − 18 x2 +
x. 13 3
48
∞ 1
(x − π) − 2n
2.2
dV
= 4πr 2. 5.4 ∑ (−1) n
. 2
dr n=0 2n!
5.5 −2.
dy
2.3 = 70(x6 + x9).
dx
Chapter 6
2.4 (a) 2; (b) 2; (c) 3.
6.1 (a) 4 + 3i; (b) i; (c) 2i.
2.5 d(cosh x)/dx = sinh x: d(sinh x)/dx = cosh x.
6.2 z = 1 + i, Z = 1 − i, z2 = 2i, Z 2 = −2i, 2z = 2 + 2i,
r
2.6 (2r)!/x r!. zZ = 2.
π π
6.3 z = 2(cos 5π
3 + i sin 3 ), Z = 2(cos 3 + i sin 3 ),
5π
Chapter 10
2z = 4(cos 5π3 + i sin 5π3 ), z2 = 2√7(cos θ +i sin θ ),
SELF-TESTS: SELECTED ANSWERS
6.6 z = 2nπi, z = ln(2 ± √3) + 2nπi, (n = 0, ±1, ±2, … ). 10.7 (b) Perpendicular distance are 1/√14, 4/√14.
6.7 S(θ ) = cos(cos θ ) cosh(sin θ ). x− –1 y − –51 z
10.8 (b) Line is 5
= = 1.
–1
5 − –51 − –5
Chapter 7
7.1 In full the matrix is
$\begin{pmatrix} -1 & 1 & -1 \\ -2 & 4 & -8 \\ -3 & 9 & -27 \end{pmatrix}$.
7.2 $AB = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}$, $BA = \begin{pmatrix} 4 & -1 & -1 \\ -1 & -2 & 7 \\ -2 & -1 & 5 \end{pmatrix}$.
7.3 $2A + 3B$, $A^2$, $AB + BA$ are symmetric; $AB$ and $BA$ are not symmetric.
7.4 $A^4 = abcd\,I_4$, so that $A^{-1} = A^3/(abcd)$.
Chapter 8
8.1 $\det A = 2(k - 1)^2$; $k = 1$.
8.2 $D_n = (x - a)^{n-1}(x + na)$.
8.3 The adjoint and inverse are given by
$\operatorname{adj} A = \begin{pmatrix} -3 & -k-2 & -2k+2 \\ -4 & -1 & 6 \\ -1 & k+1 & -2k-1 \end{pmatrix}$,
$A^{-1} = \dfrac{1}{4k+5}\begin{pmatrix} 3 & k+2 & 2k-2 \\ 4 & 1 & -6 \\ 1 & -k-1 & 2k+1 \end{pmatrix}$.
The matrix is singular if $k = -\tfrac54$. The product $A\,\operatorname{adj}(A)$ will always be zero for a singular matrix.
Chapter 9
9.1 $\overrightarrow{AD} = (29, 35)$; direction is 0.878… rad to the x direction.
9.3 Relative speed = 86.02 km/hr; direction is 35.5° E of S.
9.4 Plane is x − 1 = λ − 2µ, y + 1 = λ, z − 2 = 3λ − 5µ.
9.5 The point of intersection is (−1, −2p/(1 − p), (1 + p)/(1 − p)); the locus is the straight line x = −1, y + z = 1.
9.6 $\ddot{\mathbf{r}} = -\omega^2 \mathbf{r}$.
Chapter 11
11.1 $|c| = \sqrt{26}$.
11.4 (a) −7î + 6ĵ + k̂.
Chapter 12
12.1 Solution is $x_1 = 2$, $x_2 = -2$, $x_3 = -3$.
12.2 $A^{-1} = \begin{pmatrix} -2 & -1 & -5 & -2 \\ 5 & 2 & 9 & 4 \\ 7 & 3 & 13 & 6 \\ -8 & -3 & -15 & -7 \end{pmatrix}$.
12.3
(a) If a ≠ −3/2, the system has the unique solution
x = (a + b)/(3 + 2a),
y = (−3 + 2b)/(3 + 2a),
z = (a + b)/(3 + 2a).
(b) If a = −3/2 and b ≠ 3/2, the system has no solutions.
(c) If a = −3/2 and b = 3/2, then the system has the set of solutions x = λ, y = −1 + 2λ, z = λ.
Chapter 13
13.1 Eigenvalues: −1, 1 − √2, 1 + √2.
Eigenvectors $(-1, 2, 2)^T$, $(-1 + (1/\sqrt2), 1/\sqrt2, 1)^T$, $(-1 - (1/\sqrt2), -1/\sqrt2, 1)^T$.
13.2 k ≠ ±1.
13.3 Eigenvalues are −2, 1, 3. The corresponding eigenvectors are
$(-2, 1, 2)^T$, $(2, -1, 1)^T$, $(3, 1, 2)^T$.
A possible matrix C is given by
$C = \begin{pmatrix} -2 & 2 & 3 \\ 1 & -1 & 1 \\ 2 & 1 & 2 \end{pmatrix}$.
13.4 17.6 (x + 3) + ln| x + 1| + ln| x + 3| + C.
9
2
−1 1
4
3
4
15.8 I(x) = 2xex cos(x2) − ex cos x. 20.4 The resonant phase occurs at the polar angle
2
16.1 Volume = $\tfrac{28}{15}\pi$.
Chapter 21
16.2 Area = $\tfrac{1}{12}\pi$.
21.1 X = 2 e –2 πi.
1
22.4 General solution is $xy^2 = C(y - x)^2$, where C is a constant.
Chapter 23
23.1 The origin is the only equilibrium point. The equation of the phase paths is $y^2 = \tfrac12 x^4 + C$, where C is a constant.
23.2 For $c < 0$, the origin is a saddle; for $0 < c < \tfrac14$ the origin is a stable node; and for $c > \tfrac14$ the origin is a spiral.
23.3 Equilibrium points are at (0, 0), (1, 1), (−1, −1), (1, −1), (−1, 1). Solutions are x = ±1, y = ±1.
23.4 The origin is a centre, the points (1, 1), (−1, −1), (1, −1), (−1, 1) are all saddle points.
Chapter 28
28.1 (a) $\partial f/\partial x = -2y\cos(xy)\sin(xy)$, $\partial f/\partial y = -2x\cos(xy)\sin(xy)$;
(b) $\partial f/\partial x = -2x\sin(x^2 - y^2)$, $\partial f/\partial y = 2y\sin(x^2 - y^2)$;
(c) $\partial f/\partial x = (xy)^x[1 + \ln(xy)]$, $\partial f/\partial y = x^2(xy)^{x-1}$.
28.3 Tangent planes are given by ±2x ± 2y − z = 2. The tangent planes intersect the x, y plane in a square.
28.4 For maximum volume $a = 2\sqrt{A/(3\sqrt3)}$.
28.5 $a = \dfrac{6[2\sum n y_n - (N + 1)\sum y_n]}{N(N^2 - 1)}$, $\quad b = \dfrac{2[-3\sum n y_n + (2N + 1)\sum y_n]}{N(N - 1)}$,
where all summations are from 1 to N.
28.6 $K(\alpha) = 3\pi/(16\alpha^5)$.
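The closed-form coefficients quoted in answer 28.5 can be tried out numerically. The sketch below assumes (the problem statement is not reproduced here) that a and b are the slope and intercept of the least-squares line y = an + b fitted to data points (n, y_n), n = 1, …, N; the function name `least_squares_line` and the sample data are illustrative only.

```python
def least_squares_line(y):
    """Slope a and intercept b of the line y = a*n + b fitted to (n, y_n), n = 1..N.

    Uses the closed-form sums quoted in answer 28.5 (assumed to refer to
    an ordinary least-squares fit with equally spaced abscissae 1, 2, ..., N).
    """
    N = len(y)
    if N < 2:
        raise ValueError("need at least two data points")
    sum_y = sum(y)
    sum_ny = sum(n * yn for n, yn in enumerate(y, start=1))
    a = 6 * (2 * sum_ny - (N + 1) * sum_y) / (N * (N**2 - 1))
    b = 2 * (-3 * sum_ny + (2 * N + 1) * sum_y) / (N * (N - 1))
    return a, b


# Data lying exactly on y = 3n + 2, so the fit should recover a = 3, b = 2.
slope, intercept = least_squares_line([5, 8, 11, 14])
print(slope, intercept)
```

Data lying exactly on a straight line make a convenient check, because the fitted coefficients must then reproduce that line.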
ds
24.6 x(t) = 12 (et + e3t).
for 0 x 1.
24.8 L{(e−t − 1)/t} = ln[s/(s + 1)].
29.4 dy/dx = −(x + y)/(x + 4y). The maximum occurs
at (−2/√3, 2/√3) and the minimum at (2 /√3, −2/√3).
Chapter 25
Chapter 26
26.1 The Fourier coefficients are $a_0 = 8\pi^2/3$, $a_n = 4/n^2$, $b_n = -4\pi/n$, (n = 1, 2, …).
26.2 $\tfrac18\pi^2$.
Chapter 27
27.1 $2/(1 + 4\pi^2 f^2)$.
29.5 The direction of the normal is
Chapter 30
30.1 $dz/dt = -3\sin t(\sin^2 t - 3\cos^2 t)$. Stationary at $t = 0, \tfrac13\pi, \tfrac23\pi, \pi, \tfrac43\pi, \tfrac53\pi$.
30.2 Stationary points are at (1, 1), (−1, −1), (1, −1), (−1, 1).
Chapter 31
31.1 Maximum error = 0.261 units for an area A = 1.5 units.
31.3 The point is (2, 2, 1)/√10. Chapter 36
Chapter 32
36.2 f = (a * b) C. The truth table is
32.1 I = J = 28.
a b c f
32.2 I = 12 π.
0 0 0 0
32.3 Volume = 152/3. 0 1 0 0
32.4 I = e−1. 0 1 1 1
1 0 0 0
32.5 The moment of inertia is 13 M(a2 + b2), where M 0 0 1 1
is the mass of the plate. 1 0 1 1
32.6 The volume is V = 12 π. 1 1 0 0
1 1 1 0
32.7 Both areas = 12
1
.
36.3 f = (A * B) (a * B) (A * b): this problem and
Chapter 33 Self-test 36.1 have the same truth table.
33.2 9/10.
Chapter 37
Chapter 34 37.1 (a) {1, 2, 2, 2, 3}; (b) {2, 2, 3, 3}; (c) {1, 1, 1, 2,
3}; (d) {3, 3, 3, 3, 3, 3}; (e) {4, 4, 4, 4, 4}.
34.1 The field lines are ellipses being the intersection
of circular cylinders and inclined planes. 37.2 21 are connected of which three are regular with
degrees 0, 2 and 4.
34.3 Surface area is 16 (5√5 − 1).
37.3 (b)(ii) A spanning tree could be the graph with
34.4 Volume = 13 Ah; volume of tetrahedron = 12
1 3
a; edges {ba, bf, bg, bc, ge, cd}.
volume of octahedron = a √2. 1 3
3
37.4 ad(b + c) + eh(g + f ).
34.5 curl F = (x − 2yz)î − yzq − xx;
curl G = (2y − 1)î − 2xq − x. 37.5 By Euler’s theorem: (a) the dodecahedron has
20 vertices; (b) the icosahedron has 30 edges.
Chapter 35
Chapter 38
35.1 (a) S1 = {1, 2, 3, 4, 5, 6, 7, 8};
(b) S2 = {14 , 13 , 12 , 23 , 34 , 1, 32 }. 38.1 At 6.5% the repayment is £8198.15; at 7% the
repayment is £8526.64.
35.2 A B = {x | x ∈R and −1 x 2, x = 3 x = 4};
A B = {1, 2}. 38.2 The fixed point is ( 12 (√3 − 1), 12 (√3 − 1)). The
iteration gives u2 = 0.460, u3 = 0.288, u4 = 0.417,
35.3 (a) Same as Fig. 35.9b; (b) same as Fig. 35.9d; u5 = 0.326 to 3 decimal places, which indicates
(c) elements which are not only in A or B or C. stability.
38.3 un = (A + Bn + 18 n2)2n.
Chapter 39 40.5 With n = 20, λ = 2 the distributions are
compared in the following table:
1.3 (b) Slope = $\tfrac13$. Intersection with axes at (2, 0), $(0, -\tfrac23)$.
1.4 (b) (y + 2)/(x + 1) = −2, so y = −2x − 4.
(d) (y − 2)/(x − 1) = 3, so y = 3x − 1.
1.7 (b) Centre (1, 0), radius 2.
(d) Centre $(\tfrac12, -\tfrac12)$, radius $\tfrac12\sqrt{11}$.
1.39 (b) 1 + 1/2 + 1/5 + 1/10 + 1/17.
1.40 (b) $\sum_{n=2}^{6} (\tfrac13)^n = (\tfrac13)^2 + (\tfrac13)^3 + \cdots + (\tfrac13)^6 = (\tfrac13)^2\,[1 + \tfrac13 + \cdots + (\tfrac13)^4]$.
Now (1.31) gives the sum in the brackets. Finally we obtain 121/729.
(e) −341/1024.
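The arithmetic in answer 1.40(b) can be reproduced exactly with rational numbers. The sketch below assumes that (1.31) is the usual finite geometric-sum formula 1 + r + … + r^(N−1) = (1 − r^N)/(1 − r); the variable names are illustrative.

```python
from fractions import Fraction

r = Fraction(1, 3)

# Direct term-by-term sum: (1/3)^2 + (1/3)^3 + ... + (1/3)^6.
direct = sum(r**n for n in range(2, 7))

# Factored form: (1/3)^2 * [1 + 1/3 + ... + (1/3)^4], with the bracket
# evaluated by the finite geometric-sum formula (assumed to be (1.31)).
bracket = (1 - r**5) / (1 - r)
factored = r**2 * bracket

print(direct, factored)
```

Both routes give the same fraction, which is the value quoted in the answer.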
1.9 (b) x = − 53 ± 1
5 √14, y = − 15 ± 15 √14. 1.44 (c) 1/99; (e) 30/11.
1.14 (b) 1. (d) −1/√2. (f ) −√3/2. 1.45 (b) 10/9; (d) 2/3.
1.17 (b) 2 cos 12 (x + y) sin 12 (x − y). 1.49 (c) 256; (d) 20; (f) 59.
1.18 In the following, n represents any integer: 1.50 (a) 72; (b) 360.
(b) 12 π + n π; (d) 16 + 13 n; (f ) 2n. 1.51 (b) 24; (d) 164.
1.19 (b) amp. = 1.5; ang. freq. = 0.2; period = 31.41; 1.54 (a) 2880; (b) 720.
phase = −0.48.
1.55 (a) 120; (b) 720; (c) 220; (d) 1000.
1.20 (b) 12 x − 32 ; (d) arcsin 12 x, 0 x 2.
(f ) arccos(arcsin x), 0 x sin 1. Chapter 2
1
(h) − 12 + (1 + 4x) 2 , x − 14 . 2.1 (b) 0.5; (e) 2; (g) 1.
1.22 (b) 13 e 2; (d)ln 13, or − 13 ln 3; (f ) 2; (h) ±√2;
1
3 2.2 (c) 6; (e) − 14 ; (g) −4.
(l) Hint: write sinh 2x = 12 (e 2x − e − 2x ) and obtain a
2.3 (c) −1/x2; (f) 4x.
quadratic equation for e2x. x = 12 ln(4 + √17).
2.4 (c) −8.
1.26 Hint: x = tanh y = (ey − e−y)/(e y + e−y). Form an
2.5 (c) 32, −32.
equation for e y and solve it.
2.8 (c) dE/dT = 4kT3.
1.28 5 cos(ω t − 0.927).
2.9 (b) 7x6 − 18x5 + 1.
1.29 C = 2, α = 1.386, f(2) = 1/8.
2.11 Use the formula for tan(A − B) in Appendix B(b).
1.30 Tidal period = 12.57 h. It floats for 9.20 h.
2.12 (b) --21 ; (d) 1; (g) 2; (i) π /180 = 0.0175.
Hint: it floats when sin 0.5t −0.666. Sketch
y = sin 0.5t and y = 0.666 and find the intersections. 2.15 (a) 2 cos x + 3 sin x.
1.33 The vertex is (−4, 7). 2.16 (b) y = 24x − 39; (d) y = e−1x.
9.1 (a) P_Q = (5, −3), Q_P = (−5, 3). 10.19 (c) (l, m, n) = ( 13 , − 32 , − 32 ).
9.7 (b) 2a = (6, 4, 6), 3b = (3, 3, 6), 2a − 3b = (3, 1, 0). 10.34 Begin by finding any two points on the line of
intersection. (The resulting form is not unique.)
9.10 (a) (3, 3, −6). (b) (X + 2)2 + (Y − 1)2 + (Z + 3)2 = 1.
⎡ 5 0 −5⎤
12.9 (b) 1 ⎢− 6 10 1⎥ . 14.2 (b)
25
⎢ ⎥ − 15 (1 − x)5 + C; − 32 (8 − 3x)− 2 + C; 32 (1 − x) 3 + C.
3 4
⎢⎣ 7 5 3⎥⎦
⎡ 1 0 0 0 0⎤ 14.3 (b) −ln | 1 − x | + C; − 15 ln| 4 − 5x | + C.
⎢−1 1 0 0 0⎥
⎢ ⎥ 14.4 (c) 38 x + 1
4 sin 2x + 1
32 sin 4x + C.
(e) ⎢ 0 −1 1 0 0⎥ .
⎢ 0 0 −1 1 0⎥ 14.5 x2 ex − 2x ex + 2 ex + C.
⎢ 0 0 0 −1 1⎥⎦
⎣ 14.6 (a) 2; (h) −ln 2.
12.12 The shadow on the z plane has vertices at the 14.7 (c) 4 − x2 0 if −1 x 2, and
points (−1, 0, 0), (−1, −2, 0), (1, 0, 0). 4 − x2 0 if 2 x 3. The geometrical area is
12.16 Non-trivial solutions if k = 1, −1, 4. |F(x)| 2−1 + |F(x)| 23,
where F(x) = 4x − 13 x 3.
12.18 Non-trivial solutions if k = −6, −1, 3, 4.
14.8 (a) At + B; (b) 16 t 3 + At + B.
12.22 x1 = 1.398, x2 = 1.090, x3 = −0.2844,
x4 = −0.3697.
Chapter 15
Chapter 13 x =1
1
⎡−3⎤ ⎡1⎤
15.1 (b) lim
δx→ 0
∑ x 5 δx =
x = −1
x 5 dx = [ 16 x 6 ]1−1 = 0.
−1
13.1 (b) Eigenvalues 4, 9. Eigenvectors ⎢ ⎥ , ⎢ ⎥ .
⎣ 2⎦ ⎣1⎦
(e) Eigenvalues 3 − 4√2, 3 + 4√2. Eigenvectors 15.2 (b) (x + 1) dx = 1
2 2
3 (x + 1) 2 + C .
3
⎡ 0⎤ ⎡1⎤ ⎡0⎤
⎢−1⎥ , ⎢0⎥ , ⎢2⎥ .
(x − 1)dx = [ x − x]
1
⎢ ⎥ ⎢ ⎥ ⎢ ⎥ 15.4 (b) 2 1
3
3 1
−1 = − 43 .
⎢⎣ 2⎥⎦ ⎣⎢0⎦⎥ ⎣⎢1⎦⎥ −1
13.7 a = −2 and a = − 72 . ∞
e − 12 v
dv = −2 [e − 2 v] 0∞ = − 2(0 − 1) = 2 .
1
15.5 (b)
13.12 The matrix C is given by
0
⎡ 7 −1 −1⎤
C = ⎢−1 −1 1⎥ .
(1 − e ) dt = T + e
T
1
⎡1 1 1⎤ (T + e −T − 1) = 1 + T −1 e −T − T −1 → 1
T
13.16 lim An = 13 ⎢1 1 1⎥ .
n→∞ ⎢ ⎥ as T → ∞.
⎢⎣1 1 1⎥⎦
13.22 Eigenvalues are 0, 4, 4, 12. 15.7 The integrands are (a) even; (b) odd; (c) odd;
(d) odd.
13.26 A3n = I3, A3n+1 = A, A3n+2 = A2.
15.9 (b) The exact result is √π/2.
− 12
sin (x + 1) − 12 x − 2 sin x.
1
πx dy = π(2y) dy = 28π/3.
2 2
1 1 (f ) 2x sin x + 4 cos 12 x + C.
1
2
(i) 12 x 2 ln x − 14 x 2 + C. ( j) xn+1 [ln x − 1/(n + 1)]/(n + 1).
mx dx =
L
∑ [x(x − 1) − (−x)] δx = x
2
lim 2
dx = 83 . 17.15 F(0) = 1
2 π, F(1) = 1, F(4) = 3
16 π, F(5) = 8
15 .
δx→ 0
x=0 0
17.16 (a) 2 (ln 2)3 − 6 (ln 2)2 + 12 ln 2 − 6.
16.13 3
2 . (b) F(0) = 2, F(1) = π, F(4) = π 4 + 12π 2 + 48,
16.14 (b) π.
F(5) = π5 + 20π3 + 120π.
16.15 In a plane perpendicular to the end, y is 17.23 (c) (a/b) arctan [(a tan x)/b] + C;
downward and x is horizontal; the origin is at the (d) ln(tan 12 x) + C;
top. Area elements are horizontal strips of width δy (g) ln|sec x + tan x| + C; (j) ln[(1 + √5)/2];
in the end face. Force = 12 ρgLH 2. Moment = 16 ρgLH 3. (k) 8(6√3 + 1)/15.
16.16 Distance of centre of mass from vertex is 34 H. 17.25 Coordinates of centroid: ( 35 h, 0).
16.17 1
12 σ a3b(σ = mass per unit area).
16.18 (a) 1
4 σBH 3; (b) 1
48 σHB3, where σ is mass per Chapter 18
unit area. 18.2 (b) x = A e 2 t; (e) x = A e − 3 t ; (i) x = A et.
1 4
16.23 8a.
18.3 (b) x = e 3 (t−1); (d) x = 10 e−(t+1).
1
1
(o) ln| 1 − x | + 1/(1 − x) + C. (b) The half-life T = ln 2 years. The information
3
k
17.2 (b) − 32 cos 12 (3t − 1) + C; (e) − 32 (− t) 2 + C; implies that e = 1 − 0.175 = 0.825, so k = 0.0096.
−20k
Therefore T = 72 years.
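The half-life figures quoted in this answer can be recomputed directly. The sketch below assumes the exponential decay law N = N0 e^(−kt) with t in years, and reads the answer as saying that 17.5% of the material decays in 20 years; both are inferred from the numbers shown, not taken from the problem statement.

```python
import math

fraction_remaining = 1 - 0.175            # 0.825 of the material is left after 20 years
k = -math.log(fraction_remaining) / 20    # from e^(-20k) = 0.825
half_life = math.log(2) / k               # T = (ln 2)/k

print(round(k, 4), round(half_life))
```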
17.3 (d) 1
2 sin(x 2 + 3) + C. (j) 1
2 ln(1 + x 2 ) + C.
17.4 (c) $\tfrac16\sin^3 2x + C$. (g) Put cot 2x = cos 2x/sin 2x, then u = sin 2x, giving $\tfrac12\ln|\sin 2x| + C$.
(j) $\tfrac13\cos^3 x - \cos x + C$.
17.5 (b) 205/32; (e) −ln 2; (h) $\tfrac12\ln 2$; (k) zero; (n) (2/ω) cos φ.
17.6 (b) $\tfrac12\pi$; (d) $\tfrac14\pi + \tfrac12$; (f) $\tfrac38\pi$.
18.6 If N(t) is the number, then $\delta N \approx 20(\tfrac12 N)\,\delta t$ so the equation is dN/dt = 10N. In the second experiment there is an average death-rate of 1 per rabbit per year, so dN/dt = 9N.
18.7 (b) $A e^{t} + B e^{-2t}$. (e) $A e^{t/2\sqrt3} + B e^{-t/2\sqrt3}$.
(l) $A e^{-3t} + Bt e^{-3t}$.
(n) A + Bt (this is an exception to (18.10)).
1
18.9 (b) --32 (et − e−2t). (b) Amplitude = 10 /[(36 − ω 2 )2 + ω 2 ] 2 ,
(d) The general solution is A e−x + Bx e−x, phase = −arctan[ω /(36 − ω 2) ].
ANSWERS TO SELECTED PROBLEMS
v ⎛ g⎞ 2
21.4 (b) 1 − 3 e
− 12 π i
+ e 2 πi = 1 + 4i = √17 eiφ , where
1
θ= sin ⎜ ⎟ t.
1
(lg) 2 ⎝ l⎠ φ = arctan 4 = 1.33.
18.14 θ = 0.0719 e−0.033t sin 0.696t. 21.6 (b) R + ω Li. (d) R /(1 + ω RCi).
f (τ )(e
t
1
25.8 (b) ω ( t− τ )
− e −ω (t− τ )) dτ .
24.2 (b) 1/s − 2/(s + 2); (e) (3s − 4) /(s + 4); 2
2ω 0
(g) --21 [1/s − s /(s2 + 4)].
25.9 (b) cosh t.
24.3 (b) 1/(s + 2)2; (d) (s − 2) /(s2 − 4s + 5);
(i) (s2 − 9) /(s2 + 9)2; (l) 24 /(s + 1)5. 25.19 (a) x(t) = δ(t) + 2δ(t − T ) + δ(t − 2T ),
1
X(s) = 1 + 2 e−sT + e−2sT.
24.5 (b) 1; (d) 18 t 4 ; (g) e ; (k) 12 e t + 12 e − t;
3
2
2t
(o) 2 cos 2t − 1
sin 2t; (s) 12 e tt 2; (u) 13 (cos t − cos 2t). 25.21 (a) z −1 + 2z −2 − z −3. (b) 1 − z −1 + z −2 − ···
2
= z /(z + 1). (c) 2z /(2z − 1). (d) z /(z2 − 1).
24.6 (e) (2s2 + 3s − 2)X(s) − 10s − 9.
25.22 (a) Tz /(z − 1)2.
24.7 (b) 2 et + e−2t; (e) 3 e−t cos 2t;
25.23 (a) (z − 1)/(z + 1), g(t) = {1, −2, 2, −2, … }.
(f ) y = 1
4 ex + 1
4 e− x + 1
2 cos x.
25.27 (a) Unstable. Poles at z = ±2, giving growth --43 2n
24.8 (b) 3 − 3 cos t + sin t.
and --41(−1)n2n. (c) Stable. Poles at z = ± --21 i, giving decay
(e) − 18 e − t + 98 e t − 14 t e t + 14 t 2 e t.
1 1 1
(i) − 76 e t − 12 e − t + 43 e 2t − 121 e −2t. cos πn.
4 2n 2
24.9 (b) x = 3
8 + 5
8 e4 t + 12 t e4 t; y = − 163 + 3
16 e4 t + 14 t e4 t.
24.10 (b) e t( 12 A + 12 B + 32 ) + e − t( 12 A − 12 B + 32 ) − 3, Chapter 26
where A and B are arbitrary. This is the same
26.1 (b) an = 0, bn = −2(−1)n/n.
as C et + D e−t − 3, where C and D are arbitrary.
2
(e) an = 0, bn = [1 + (−1)n − 2 cos( 12 nπ)].
24.13 e−2 e−2s[(s + 1)2 − 1]/[(s + 1)2 + 1]2 πn
= e−2 e−2s s(s + 2)/(s2 + 2s + 2)2.
2π 2 4
26.2 (b) bn = 0, a0 = , an = 2 (−1)n(n = 1, 2, … ).
24.14 (b) H(t) sin t − H(t − 1) cos(t − 1). 3 n
24.15 (b) ( 18 e 2t + 1 −2t
− 14 )H(t), 4(−1)n
8 e (c) bn = 0, an = − .
−( 18 e 2( t−1) + 18 e −2( t−1) − 14 )H(t − 1). π(4n 2 − 1)
(d) 12 H(t)t sin t + 12 H(t − π)(t − π) sin(t − π). 2
26.3 (a) a0 = 12 π, a2n = 0, a2n −1 = − ,
πn 2
Chapter 25 (−1)n
bn = − (n = 1, 2, … ).
25.3 Hint for working: s2 + 2ks + ω 2 has real factors
n
when k2 ω 2; so put s2 + 2ks + ω 2 = (s − α)(s − β ), 26.5 Series sum is 14 π.
1
where α, β = −k ± (k 2 − ω 2 ) 2 . Then x(t) is given by
26.8 F = 2.
(α − β )−1[(α + κ ) eα t − ( β + κ ) eβ t]H(t)
+ I(α − β )−1[eα (t−t0) − eβ(t−t0)]H(t − t0), 4β
26.10 a0 = 0, an = 0, bn = [1 − (−1)n](n = 1, 2, … ).
πn3
where κ = 1 + 2k.
∞
4
25.4 By proceeding as suggested, we obtain 26.16 (a) ∑ (2n − 1)π sin(2n − 1)πt.
n=1
u(x) = Ax + 16 Bx 3 + (Mg /6K)(x − 12 l )3 H(x − 12 l ).
∞
2 4
The conditions at x = l give A = Mgl 2/16K, 26.18 −∑ cos 2nω t.
B = −Mg /2K. This problem could be solved by π n=1 π(4n 2 − 1)
integrating the equation four times, and linking the
1 41 1
solutions over [0, --21 l] and [--21 l, l ] by the condition 26.23 (b) R(t) = – + –icosπt + — cos 3πt
2 π3 32
that u(x), u′(x), u″(x) are continuous at x = --21 l, but
this is automatically secured in the Laplace-transform 1 5
+ —2 cos 5πt + ...i.
method. 5 7
∞
1 π2
26.26 (b) ∑n
n= 0
2
=
6
. Chapter 29
29.6 −5.7%.
Chapter 27
29.7 1.67% reduction, approximately.
27.1 Xs(f ) = 4πf/(1 + 4π2f 2); Xc(f ) = 2/(1 + 4π 2f 2). 29.9 (b) −2√2; (d) zero (it is the same in all
∞ directions).
27.8 x(t) = 2
0
X(f ) cos 2πft dt where 29.10 (b) – 43 ; (e) − 12 ; (j) 1.
27.17 {1/[α + i(2πf + β )] + 1/[α + i(2πf − β )]}. 29.19 (b) 49.8° or 130.2°.
27.19 (b) 1/(1 + i2πf )2. (d) Hint: compare Problem 29.12f.
27.20 (b) The Fourier transform is sinc2( f ) e−i2π(a+b)f 29.21 (b) (0, 12 ); (d) (− 14 , 1).
↔ Λ[t − (a + b)]. 29.22 (b) (2, 1)/ √5.
29.23 (b) φ = 0.
Chapter 28
28.3 (c) 4x − 2y − 1; − 6y − 2x − 1. Chapter 30
(f ) y − 2; x − 1. (i) 2y/(x + y)2; −2x/(x + y)2.
30.2 (b) − 4 sin t cos t; (d) 2 sin(t2) + 4t2 cos(t2).
(k) x(x 2 + y 2 )− 2 ; y(x 2 + y 2 )− 2 .
1 1
are given in order: (b) 2, 4, 3. (d) 2y/x3, 0, −1/x2. dt [R2 + r 2 − 2Rr cos(φ − θ )] 2
(h) 108(3x − 4y)2, 192(3x − 4y)2, −144(3x − 4y)2. where θ = vt/r, φ = Vt /R.
(k) −r −3 + 3x2r −5, −r −3 + 3y 2r −5, 3xyr −5, where 30.4 (b) x = y = 3. (e) The coordinates of the nearest
1
r = (x2 + y2) 2 . point on the given line are ( 53 , 15 ). Distance = 2/√5.
28.10 (b) 2x + 2y − z = 4; one normal is (2, 2, −1). 30.5 (b) (0, 0), (2, 0). (A suitable parametrization is
(d) 3x + 4y + 8z = 29; one normal is (− 32 , −2, −1). x = 1 + cos t, y = sin t.)
28.11 78.9° or 101.1°. (d) (±6 /√5, ±4 /√5). (A suitable parametrization
would be x = 2/cos t, y = 2 tan t.)
28.12 (b) (1, −1), min; (d) (nπ, mπ); min if n and
m odd, max if n and m even, otherwise saddle; 30.8 (b) F = −2KI sin θ + } cos θ − I2r cos θ − Jr sin θ,
(h) (0, 0) saddle; (1, 1) minimum; (k) (0, 0), saddle. H = 2KI cos θ + } sin θ − I2r sin θ + Jr cos θ.
30.9 (c) ∂f/∂u = −2v2/u3, ∂f/∂v = 2v/u.
28.14 (a) a = b = c = 7; (b) a = b = c = 4.
30.10 (b) ∂2f/∂u2 = 12u2 − 2v2, ∂2f/∂u ∂v = −4uv,
28.15 The maximum is 9, attained at (2, ±1).
∂2f /∂v2 = −2u2 + 12v 2.
28.16 Minimum distance = √2.
30.11 It is easiest to put x2 − y2 in terms of uv. Finally,
−2 1 1 1
28.18 (b) Depth = 2 3 V 3 ; square base, side 2 3 V 3 . ∂2f /∂u2 = 16v2g″(4uv), ∂2f /∂v2 = 16u2g″(4uv),
28.23 Lowest point is z = 43 a at (0, a) and (a, a). ∂2f /∂u∂v = 4g′(4uv) + 16uvg″(4uv).
1+ y 1− y
0
1
Chapter 31 (g) f (x, y) dx dy + f (x, y) dx dy.
31.21 (b) (3, 3, 3); (e) (a/√3, b /√3, c/√3); (g) ( 13, 73 , 1). 33.16 (b) Non-conservative.
Chapter 34
34.1 π[a3 − (a − h)3]/a.
Chapter 32
34.2 (a) 1 /84; (b) 1 / 24; (c) 13 / 384.
32.1 (b) e − 2; (d) (d − c)(b − a); (i) − --31 ; (m) --21 ln 2.
34.5 2√6 + 2 sinh−1(√2).
32.2 (b) Zero. Refer to the signed-volume analogy
(30.2b). (f ) ln(27/16). 34.6 Scalar potential is exyz + cos xy + zx + C.
32.4 4
3 . 34.7 Scale factors are h1 = h2 = √(u2 + v2), h3 = uv.
1
PG1G2G3
35.1 (c) −2, −1, 0, 1, 2, 3, 4; (f ) 1, 4, 9. Q= .
1 − G2H1 + G1G2G3H2
35.3 (c) A ∪ B = {−4, −3, −2, −1, 1, 2, 3, 4}.
37.19 (a) The transfer function is
35.4 (b) A ∩ B = {x |x ∈ and −5 x 2}.+ PG1G2G3
Q= .
(d) A ∩ B = {1}. 1 + G2H1 − G1G2G3H2
35.5 (b) B\(A ∪ C); (d) (B ∩ D)\ A. G1G3
37.20 (a) .
(1 − G1G2H2 )(1 + G3H1 )
35.6 (b) S1\(S ∪ S2 ∪ ··· ∪ Sr).
G1G2G3G4 GGG
35.7 (b) [(A\A1)\B1] ∪ B2.
(d) + 5 6 7.
1 + G2G3H2 1 − H1
35.10 A2 = {(1, 1), (1, 2), (2, 1), (2, 2)}. 37.24 SAFT, length 12.
36.6 See table below. 37.29 Waiting times are 13T/3 and 4T.
38.22 0 α 1.
Chapter 37
38.23 Oscillates between 0.4953 and 0.8124.
37.3 Twenty are planar.
38.24 The periodic values of the 2-cycle are 0.4
37.4 Five are connected. and 0.8.
37.8 Six not including reversed order.
37.15 i1 = − 4
i , i = − i , i3 = − i , i4 = − i ,
1 10 11 39.2 The probability that the score is 7 or less is 7/12.
21 0 2 21 0 21 0 21 0
i5 = 1
21 0i , i6 = 4
i ,i = − i .
21 0 7
3
7 0 39.5 n(A ∪ B) = 10.
39.6 (b) Ace of clubs or ace of spades drawn; 40.6 Mean number of non-faulty components to
(d) any ace or any heart or any black card drawn; failure is 82.33; standard deviation of the number
39.20 With the same probability of failure 0.98, 40.16 Maximum value of standard deviation is 121.6.
probability that circuit fails is 0.963.
40.17 Probability that just two bulbs will be still
39.21 (b) 1 /495; (c) 4 /99. working is 0.242.
APPENDICES
by the underlined group:
n=1 1 1
n=2 1 2 1
n=3 1 3 3 1
n=4 1 4 6 4 1
and so on. Thus
(1 + x)4 = 1 + 4x + 6x 2 + 4x3 + x4.
(iii) Permutations and combinations (see Section 1.17).
${}_nP_r = \dfrac{n!}{(n - r)!}$, $\quad {}_nC_r = \dfrac{n!}{(n - r)!\,r!}$.
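As a quick illustration of these two formulas, the sketch below uses Python's standard `math.perm` and `math.comb` (available from Python 3.8). The values n = 5, r = 2 are arbitrary choices for the example, and the last line regenerates the n = 4 row of Pascal's triangle.

```python
import math

n, r = 5, 2

# Permutations: n!/(n - r)!, computed from the formula and by the library call.
permutations = math.factorial(n) // math.factorial(n - r)
print(permutations, math.perm(n, r))

# Combinations: n!/((n - r)! r!).
combinations = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))
print(combinations, math.comb(n, r))

# Binomial coefficients of (1 + x)^4, i.e. the n = 4 row of Pascal's triangle.
pascal_row = [math.comb(4, k) for k in range(5)]
print(pascal_row)
```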
(d) Factorization
a2 − b2 = (a + b)(a − b),
a3 − b3 = (a − b)(a2 + ab + b2),
a3 + b3 = (a + b)(a2 − ab + b2).
(e) Constants
e = 2.718 281 82… ,
π = 3.141 592 65… ,
1 radian = 57.295 78… °,
1° = 0.017 45… radians,
360° = 2π radians.
$\sum_{r=1}^{n} r = 1 + 2 + 3 + \cdots + n = \tfrac12 n(n + 1)$
$\sum_{r=1}^{n} r^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \tfrac16 n(n + 1)(2n + 1)$
$\sum_{r=1}^{n} r^3 = 1^3 + 2^3 + 3^3 + \cdots + n^3 = \tfrac14 n^2(n + 1)^2$.
B Trigonometric formulae
(a) Relation between trigonometric functions
sin2A + cos2A = 1,
tan A = sin A/cos A; sec A = 1/cos A; cosec A = 1 /sin A.
If sin α = c, then all the solutions of sin x = c are $x = n\pi + (-1)^n\alpha$.
If cos β = c, then all the solutions of cos x = c are x = 2nπ ± β.
If tan γ = c, then all the solutions of tan x = c are x = nπ + γ.
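These three general solutions can be spot-checked numerically. The sketch below takes n to be any integer, picks the arbitrary value c = 0.5, and uses the principal values returned by `math.asin`, `math.acos` and `math.atan` as α, β and γ; any other particular solutions would serve equally well.

```python
import math

c = 0.5
alpha = math.asin(c)   # one solution of sin x = c
beta = math.acos(c)    # one solution of cos x = c
gamma = math.atan(c)   # one solution of tan x = c

integers = range(-3, 4)
sin_solutions = [n * math.pi + (-1) ** n * alpha for n in integers]
cos_solutions = [2 * n * math.pi + sign * beta for n in integers for sign in (1, -1)]
tan_solutions = [n * math.pi + gamma for n in integers]

sin_ok = all(math.isclose(math.sin(x), c, abs_tol=1e-12) for x in sin_solutions)
cos_ok = all(math.isclose(math.cos(x), c, abs_tol=1e-12) for x in cos_solutions)
tan_ok = all(math.isclose(math.tan(x), c, abs_tol=1e-12) for x in tan_solutions)
print(sin_ok, cos_ok, tan_ok)
```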
D A table of derivatives
y                              dy/dx
c (constant)                   0
$x^n$ (n any constant)         $n x^{n-1}$
$e^{ax}$                       $a e^{ax}$
$k^x$ ($k > 0$)                $k^x \ln k$
$\ln x$ ($x > 0$)              $x^{-1}$
sin ax                         a cos ax
cos ax                         −a sin ax
tan ax                         $a/\cos^2 ax$
cot ax                         $-a/\sin^2 ax$
sec ax                         $(a \sin ax)/\cos^2 ax$
cosec ax                       $-(a \cos ax)/\sin^2 ax$
arcsin ax                      $a/(1 - a^2x^2)^{1/2}$
arccos ax                      $-a/(1 - a^2x^2)^{1/2}$
arctan ax                      $a/(1 + a^2x^2)$
sinh ax                        a cosh ax
cosh ax                        a sinh ax
tanh ax                        $a/\cosh^2 ax$
$\sinh^{-1} ax$                $a/(1 + a^2x^2)^{1/2}$
$\cosh^{-1} ax$                $a/(a^2x^2 - 1)^{1/2}$
$\tanh^{-1} ax$                $a/(1 - a^2x^2)$
$u(x)v(x)$                     $u\dfrac{dv}{dx} + v\dfrac{du}{dx}$
$\dfrac{u(x)}{v(x)}$           $\dfrac{1}{v^2}\left(v\dfrac{du}{dx} - u\dfrac{dv}{dx}\right)$
$\dfrac{1}{v(x)}$              $-\dfrac{1}{v^2}\dfrac{dv}{dx}$
$y(u(x))$                      $\dfrac{dy}{du}\dfrac{du}{dx}$
$y(v(u(x)))$                   $\dfrac{dy}{dv}\dfrac{dv}{du}\dfrac{du}{dx}$
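Entries of a derivative table can be spot-checked against a central-difference approximation. The sketch below is only a numerical sanity check, not a proof. The values a = 1.5 and x = 0.4 are arbitrary (chosen so that |ax| < 1 for arcsin), and the helper name `central_difference` is illustrative. The tabulated forms used are a/cos² ax, −a/sin² ax, (a sin ax)/cos² ax, a/(1 − a²x²)^(1/2) and a/(1 + a²x²)^(1/2).

```python
import math


def central_difference(f, x, h=1e-6):
    """Second-order finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, x = 1.5, 0.4

# name -> (function of t, tabulated derivative evaluated at x)
entries = {
    "tan ax": (lambda t: math.tan(a * t), a / math.cos(a * x) ** 2),
    "cot ax": (lambda t: 1 / math.tan(a * t), -a / math.sin(a * x) ** 2),
    "sec ax": (lambda t: 1 / math.cos(a * t), a * math.sin(a * x) / math.cos(a * x) ** 2),
    "arcsin ax": (lambda t: math.asin(a * t), a / math.sqrt(1 - a * a * x * x)),
    "arsinh ax": (lambda t: math.asinh(a * t), a / math.sqrt(1 + a * a * x * x)),
}

errors = {}
for name, (f, tabulated) in entries.items():
    numeric = central_difference(f, x)
    errors[name] = abs(numeric - tabulated)
    print(f"{name:10s} numeric = {numeric:.6f}   table = {tabulated:.6f}")
```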
f(x)                           ∫ f(x) dx   (C is an arbitrary constant.)
$x^m$ ($m \ne -1$)             $\dfrac{x^{m+1}}{m + 1} + C$
$x^{-1}$                       $\ln|x| + C$, or $\ln|Cx|$
$e^{ax}$                       $(1/a)\,e^{ax} + C$
$k^x$ ($k > 0$)                $k^x/\ln k + C$
$\ln x$ ($x > 0$)              $x \ln x - x + C$
sin ax                         −(1/a) cos ax + C
cos ax                         (1/a) sin ax + C
tan ax                         $-(1/a)\ln|\cos ax| + C$ or $-(1/a)\ln|C\cos ax|$
cot ax                         $(1/a)\ln|\sin ax| + C$ or $(1/a)\ln|C\sin ax|$
sec ax                         $-(1/2a)\ln[(1 - \sin ax)/(1 + \sin ax)] + C$
cosec ax                       $(1/2a)\ln[(1 - \cos ax)/(1 + \cos ax)] + C$
arcsin ax                      $(1/a)(1 - a^2x^2)^{1/2} + x \arcsin ax + C$
arccos ax                      $-(1/a)(1 - a^2x^2)^{1/2} + x \arccos ax + C$
arctan ax                      $-(1/a)\ln(1 + a^2x^2)^{1/2} + x \arctan ax + C$
sinh ax                        (1/a) cosh ax + C
cosh ax                        (1/a) sinh ax + C
tanh ax                        $(1/a)\ln\{\cosh ax\} + C$
$1/(x^2 + a^2)$                $(1/a)\arctan(x/a) + C$
$1/(x^2 - a^2)$                $(1/2a)\ln|(x - a)/(x + a)| + C$ or $-(1/a)\tanh^{-1}(x/a) + C$ ($|x| < a$)
$1/(a^2 - x^2)^{1/2}$          $\arcsin(x/a) + C$ (or $-\arccos(x/a) + C$)
$\int_0^\infty \dfrac{dx}{a^2 + x^2} = \dfrac{\pi}{2a}$
1– –1 π
2π
sin x dx = cos x dx = 1
2
0 0
π
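$\int_0^\infty x^n e^{-x}\,dx = n!$ (n = 0, 1, 2, …)
$\int_0^\infty e^{-ax^2}\,dx = \tfrac12\sqrt{\pi/a}$ ($a > 0$)
$\int_0^\infty e^{-ax}\cos bx\,dx = \dfrac{a}{a^2 + b^2}$ ($a > 0$)
$\int_0^\infty e^{-ax}\sin bx\,dx = \dfrac{b}{a^2 + b^2}$ ($a > 0$)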
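The last three definite integrals, taken over the range 0 to ∞, can be checked by simple numerical quadrature. The sketch below uses a hand-written composite Simpson rule and truncates the infinite range at x = 40, where the integrands are negligibly small. The values a = 1, b = 2, the cut-off and the helper name `simpson` are all illustrative choices.

```python
import math


def simpson(f, lo, hi, n=20000):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


a, b = 1.0, 2.0
upper = 40.0  # e^(-40) is about 4e-18, so truncating here changes nothing visible

gauss = simpson(lambda x: math.exp(-a * x * x), 0.0, upper)
cos_integral = simpson(lambda x: math.exp(-a * x) * math.cos(b * x), 0.0, upper)
sin_integral = simpson(lambda x: math.exp(-a * x) * math.sin(b * x), 0.0, upper)

print(gauss, 0.5 * math.sqrt(math.pi / a))
print(cos_integral, a / (a * a + b * b))
print(sin_integral, b / (a * a + b * b))
```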
Gradshteyn and Ryzhik (1994) is a useful source of hundreds of indefinite and definite
integrals.
In the following tables, n and m represent a positive integer or zero. The constants
k and c are arbitrary unless otherwise indicated.
Transforms
f(t)                           $F(s) = \int_0^\infty e^{-st} f(t)\,dt$
$t^n$                          $n!/s^{n+1}$
$e^{kt}$                       $1/(s - k)$
$t^n e^{kt}$                   $n!/(s - k)^{n+1}$
cos kt                         $s/(s^2 + k^2)$
sin kt                         $k/(s^2 + k^2)$
t cos kt                       $(s^2 - k^2)/(s^2 + k^2)^2$
t sin kt                       $2ks/(s^2 + k^2)^2$
$H(t - c)$ ($c \ge 0$)         $e^{-cs}/s$
$\delta(t - c)$ ($c \ge 0$)    $e^{-cs}$

Inverses
F(s)                           f(t)
$1/s^m$                        $t^{m-1}/(m - 1)!$
$1/(s - k)$                    $e^{kt}$
$1/(s - k)^m$                  $t^{m-1} e^{kt}/(m - 1)!$
$s/(s^2 + k^2)$                cos kt
$1/(s^2 + k^2)$                (1/k) sin kt
$(s^2 - k^2)/(s^2 + k^2)^2$    t cos kt
$s/(s^2 + k^2)^2$              (1/2k) t sin kt
$e^{-cs}/s$ ($c \ge 0$)        $H(t - c)$
$e^{-cs}$ ($c \ge 0$)          $\delta(t - c)$

1/s as an integration operator (25.1): if $F(s) \leftrightarrow f(t)$, then $\dfrac{1}{s}F(s) \leftrightarrow \int_0^t f(\tau)\,d\tau$.
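Multiplication: $x_1(t)x_2(t) \leftrightarrow \int_{-\infty}^{\infty} X_1(f - v)X_2(v)\,dv = \int_{-\infty}^{\infty} X_1(v)X_2(f - v)\,dv$
Periodic function $x_P(t)$ (period T): $x_P(t) \leftrightarrow \sum_{n=-\infty}^{\infty} X_n\,\delta(f - nf_0)$, where $f_0 = 1/T$, $X_n = f_0\int_{\text{Period}} x_P(t)\,e^{-2\pi i n f_0 t}\,dt$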
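(a) Distributions, means, and variances
(i) Discrete distributions
Distribution       Probability                                   Mean (µ)        Variance (σ²)
Binomial           $\dfrac{n!\,p^r q^{n-r}}{(n - r)!\,r!}$        $np$            $np(1 - p)$
Geometric          $(1 - p)^{r-1}p$                              $1/p$           $(1 - p)/p^2$
Poisson            $\lambda^n e^{-\lambda}/n!$                   $\lambda$       $\lambda$
Pascal             ${}_{r-1}C_{k-1}\,p^k(1 - p)^{r-k}$           $k/p$           $k(1 - p)/p^2$
Hypergeometric     ${}_wC_r\,{}_bC_{n-r}/{}_{w+b}C_n$            $nw/(w + b)$    $\dfrac{nwb(w + b - n)}{(w + b)^2(w + b - 1)}$
(ii) Continuous distributions
Distribution       Density                                       Mean (µ)        Variance (σ²)
Exponential        $\lambda e^{-\lambda x}$, $x \ge 0$; 0, $x < 0$    $1/\lambda$     $1/\lambda^2$
Uniform            $1/(b - a)$, $a \le x \le b$; 0 elsewhere     $\tfrac12(a + b)$   $\tfrac{1}{12}(b - a)^2$
Standardized normal    $\dfrac{1}{\sqrt{2\pi}}\,e^{-\frac12 x^2}$    0               1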
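The mean and variance of a discrete distribution follow from its probabilities by direct summation, which gives an exact check with rational arithmetic. The sketch below does this for a binomial and a hypergeometric example. It assumes that r counts the items drawn from the group of size w, and the parameter values and the helper name `moments` are illustrative.

```python
from fractions import Fraction
from math import comb


def moments(pmf):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(r * p for r, p in pmf.items())
    variance = sum((r - mean) ** 2 * p for r, p in pmf.items())
    return mean, variance


# Binomial with n = 6 trials and success probability p = 1/3 (q = 1 - p).
n, p = 6, Fraction(1, 3)
binomial = {r: comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)}
binomial_mean, binomial_var = moments(binomial)

# Hypergeometric: w = 5 items of one kind, b = 7 of another, 4 drawn without replacement;
# r is the number drawn from the group of size w (assumed convention).
w, b, draws = 5, 7, 4
hypergeometric = {
    r: Fraction(comb(w, r) * comb(b, draws - r), comb(w + b, draws))
    for r in range(draws + 1)
}
hyper_mean, hyper_var = moments(hypergeometric)

print(binomial_mean, binomial_var)
print(hyper_mean, hyper_var)
```

With these values the summations can be compared with np and np(1 − p) for the binomial case, and with nw/(w + b) and nwb(w + b − n)/[(w + b)²(w + b − 1)] for the hypergeometric case.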
$\Phi(x) = \dfrac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac12 t^2}\,dt$
for $0 \le x \le 3.0$ at 0.01 intervals. For $x < 0$, Φ(x) can be calculated from Φ(−x) = 1 − Φ(x).
x 0 1 2 3 4 5 6 7 8 9
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
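Individual entries of the table can be regenerated from the error function, since Φ(x) = ½[1 + erf(x/√2)]. The sketch below uses Python's standard `math.erf`; the function name `standard_normal_cdf` and the sample arguments are illustrative.

```python
import math


def standard_normal_cdf(x):
    """Phi(x) for the standardized normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# A few spot values, printed to the four decimal places used in the table.
for x in (0.0, 1.0, 1.79, 1.90, 3.0):
    print(f"{x:4.2f}  {standard_normal_cdf(x):.4f}")

# Negative arguments follow from the symmetry Phi(-x) = 1 - Phi(x).
symmetry_gap = abs(standard_normal_cdf(-1.25) - (1.0 - standard_normal_cdf(1.25)))
```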
Table giving x for specified values of Φ(x) for 0.50 ≤ Φ(x) ≤ 0.99 at 0.01 intervals
Physical quantities of different types, such as acceleration, force, momentum,
electrical potential, can be classified by expressing them as simple combinations
of certain primary dimensions such as mass, length and time. These expressions
determine how we can state the magnitude of a physical quantity – for example
any velocity can be expressed in metres per second, but never in metres per
kilogram. Five primary dimensions provide a basis sufficient for all common
purposes. Their names, the algebraic symbols denoting their dimension, and
appropriate units (the international (SI) system) are shown in the following table.
length L metre m
mass M kilogram kg
time T second s
electric current I ampere A
absolute temperature θ Kelvin K
We can now assign dimensions to any derived physical quantity, which classifies
it without indicating its magnitude. For example, the velocity at any moment of any
particle, substance, electromagnetic wave, etc., could in principle be measured as
(a distance travelled)/(time taken). Symbolically we write:
[velocity] = LT−1,
where the square brackets mean ‘the dimension of’. The form LT−1 of the right-
hand side indicates that an appropriate SI unit of measurement would be metres
per second. The following table comprises mechanical and electromagnetic
quantities, their dimensions, and conventional SI terms for certain special units
of measurement. Notice how known dimensional forms may be multiplied and
divided to obtain more complicated ones.
ously have the same physical dimensions. This often provides a useful check on a
calculation. Also, in any expression containing the sum of two or more terms, the
terms must all have the same dimensions if it is to make any physical sense. For
example, expressions equivalent to the form (energy + momentum), or (current +
voltage) can have no physical significance. However, in such cases the dimensions
of any letters used as constant factors must not be overlooked: the expression
(momentum + (k × energy)) could be meaningful provided [k] = TL−1.
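The rule that added terms must share the same dimensions is easy to mechanise. The sketch below represents a dimension by its exponents of M, L and T only (electric current and temperature are left out for brevity); the names `Dim`, `multiply` and `can_add` are illustrative and not part of any library.

```python
from collections import namedtuple

# A dimension is recorded as the exponents of mass, length and time.
Dim = namedtuple("Dim", ["M", "L", "T"])


def multiply(p, q):
    """Dimension of a product: exponents add."""
    return Dim(*(a + b for a, b in zip(p, q)))


def can_add(p, q):
    """Two quantities may be added only if their dimensions are identical."""
    return p == q


momentum = Dim(M=1, L=1, T=-1)   # M L T^-1
energy = Dim(M=1, L=2, T=-2)     # M L^2 T^-2
k = Dim(M=0, L=-1, T=1)          # [k] = T L^-1

print(can_add(momentum, energy))               # energy + momentum: inconsistent
print(can_add(momentum, multiply(k, energy)))  # momentum + k * energy: consistent
```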
The dimensions of quantities that appear as derivatives and integrals are
treated in the following way. Suppose for example that t is time ([t] = T) and x(t)
is a function representing displacement ([x] = L). Then
$\left[\dfrac{dx}{dt}\right] = LT^{-1}$, $\quad \left[\dfrac{d^2x}{dt^2}\right] = LT^{-2}$,
$\left[\int_a^b x(t)\,dt\right] = LT$, and $\left[\int_a^d t\,dt\right] = T^2$.
where g is the acceleration due to gravity, l is the length of the pendulum, and the
angle θ and sin θ are dimensionless. Physically the equation
$\dfrac{d^2\theta}{dt^2} + \dfrac{g}{l^2}\sin\theta = 0,$
with the same definition of symbols could not represent a general physical law
because the dimensions of the two terms are different.
Dimensionless analysis indicates how equations can be simplified by making
them dimensionless. In the pendulum equation above, let τ = t√(g/l). Then the
dimensionless pendulum equation becomes
$\dfrac{d^2\theta}{d\tau^2} + \sin\theta = 0$
which includes pendulums of all lengths, in any uniform gravitational field.
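The substitution can be followed term by term. The working below assumes that the pendulum equation referred to has the standard form $d^2\theta/dt^2 + (g/l)\sin\theta = 0$ (an assumption here, since that equation is not reproduced in this passage), and applies the chain rule with $\tau = t\sqrt{g/l}$:

```latex
\[
\tau = t\sqrt{\frac{g}{l}}
\quad\Longrightarrow\quad
\frac{d\theta}{dt} = \sqrt{\frac{g}{l}}\,\frac{d\theta}{d\tau},
\qquad
\frac{d^2\theta}{dt^2} = \frac{g}{l}\,\frac{d^2\theta}{d\tau^2}.
\]
\[
\frac{d^2\theta}{dt^2} + \frac{g}{l}\sin\theta = 0
\quad\Longrightarrow\quad
\frac{g}{l}\left(\frac{d^2\theta}{d\tau^2} + \sin\theta\right) = 0
\quad\Longrightarrow\quad
\frac{d^2\theta}{d\tau^2} + \sin\theta = 0.
\]
\[
\left[\frac{g}{l}\right] = \frac{LT^{-2}}{L} = T^{-2},
\qquad\text{so}\qquad
[\tau] = T \cdot T^{-1} = 1 .
\]
```

The last line confirms that τ is dimensionless, which is why g and l no longer appear in the transformed equation.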
Further reading
Abell, M.L. and Braselton, J.P. (1992) Mathematica by Example, Academic Press, San
Diego.
Blachman, N. (1992) Mathematica: A Practical Approach, Academic Press, San Diego.
Boyce, W.E. and DiPrima, R.C. (1997) Elementary Differential Equations and Boundary
Value Problems (6th edn), Wiley, New York.
Garnier, R. and Taylor, J. (1991) Discrete Mathematics for New Technology, Adam
Hilger, Bristol.
Gradshteyn, I.S. and Ryzhik, I.M. (1994) Table of Integrals, Series, and Products (5th
edn), Academic Press, San Diego.
Grimmett, G.R. and Stirzaker, D.R. (2001), Probability and Random Processes (3rd edn),
Oxford University Press.
Jordan, D.W. and Smith, P. (2007a) Nonlinear Ordinary Differential Equations (4th
edn), Oxford University Press.
Jordan, D.W. and Smith, P. (2007b) Nonlinear Ordinary Differential Equations:
Problems and Solutions, Oxford University Press.
Kaye and Laby (1995) Tables of Physical and Chemical Constants, National Physical
Laboratory (16th edn) (available online at www.kayelaby.npl.co.uk).
Montgomery, D.C. and Runger, G.C. (1994) Applied Statistics and Probability for
Engineers, Wiley, New York.
Råde, L. and Westergren, B. (1995) Mathematics Handbook for Physics and Engineering,
Studentlitteratur, Lund.
Riley, K.F., Hobson, M.P. and Bence, S.J. (1997) Mathematical Methods for Physics and
Engineering, Cambridge University Press.
Roberts, G.E. and Kaufman, H. (1966) Table of Laplace Transforms, Saunders,
Philadelphia.
Seggern, D.H. von (1990) CRC Handbook of Mathematical Curves and Surfaces, CRC
Press, Boca Raton.
Skeel, R.D. and Keiper, J.B. (1993) Elementary Numerical Computing with
Mathematica, McGraw-Hill, New York.
Whitelaw, T.A. (1983) An Introduction to Linear Algebra, Blackie, Glasgow.
Wilson, R.J. and Watkins, J.J. (1990) Graphs: An Introductory Approach, Wiley, New
York.
Wolfram, S. (1996) The Mathematica Book (3rd edn), Wolfram Media/Cambridge
University Press.
Zwillinger, D. (1992) Handbook of Differential Equations (2nd edn), Academic Press,
Boston.
Index
Pages of the main topics are given in heavy type for quick reference.
complement 801 cells (statistics) 904 complex numbers 140 –156 (see
complement laws 802 central limit theorem 911 also Argand diagram)
conjunction 804 centre (phase plane) 484, 485, argument 146
de Morgan’s laws 802 492, 497 conjugate 142, 143
disjunction 804 centre of mass 348, 350 de Moivre’s theorem 150
disjunctive normal form 808 centroid 349, 363 difference 142
distributive laws 802 chain rules 86, 91, 101, 631, 664, division 142
duality principle 812 668, 676 exponential form 148, 151
exclusive-OR-gate 808 more than one parameter 668, Euler’s formula 149
expression 803 676 imaginary part 141
EXOR gate 808 one parameter 664, 668 logarithm 142
identity laws 802 chaos 857, 861, 865, 929 modulus 141, 145
join 801 characteristic equation ordered pair 144
logic gates 803 difference equations 850 parallelogram rule 145
logic networks 805 differential equation 385–391 polar coordinates 146
logically equivalent gates matrices 279 principal value 146
812 circle 10 product 142
meet 801 area 951 quotient 142
NAND gate 805 cartesian equation 10 real part 141
negation 804 circumference 951 reciprocal 142
NOR gate 805 vector equation 207 rules for 141
NOT gate 804 circuits(electrical) 105, 380, 418, standard form 141
OR gate 804 446–453, 823–827, 838, sum 141
product 801 839, 878 compound interest 122, 842–843
reflexive law 802 balanced bridge 451 conditional probability 875–877
sum 801 cutset method 823 cone 241, 625
switches in parallel 810 LCR 418 surface area 342
switches in series 810 Laplace transform methods volume 342, 951
switching circuits 809 535 conic sections 12
switching function 810 parallel 448, 878 conjunction 804
truth table 803 RL 380 conjugate, complex 142, 143
truth table, inverse 808 series 448, 878 connected graph 817, 820
variables 802 signal flow graphs 827 conservative field 752–759, 775
box plot 906, 930 switching 809 potential 754, 775
interquartile range 907 cobweb diagram 847–849 continuity equation 772
median 907 cofactor 180 contour map 625–626, 636, 927
quartiles 906 combinations 49–51, 949 convergence
outliers 907 common ratio (geometric series) of infinite series 129
whiskers 907 43 of integrals 330
branch (graph theory) 821 compatibility convolution 541, 927 (see also
linear equations 267 Fourier transform;
complement (of a set) 781 Laplace transform;
C
complementary function (see z-transform)
Capacitor 447, 528 also difference discrete 552
complex impedance 447, 533 equations; differential Fourier transform 535–538,
phasor 446 equations) 956
cardinality (of a set) 798 difference equations 852 Laplace transform 541, 726,
cardioid 57, 355, 920 differential equations 405 927
carrier wave 432, 585, 598 complete graph 817 memory and 544
caustic 707 completing the square 11, 140, theorem 541, 535, 726
Cayley-Hamilton theorem 302 367 z-transform 490
coordinates, three-dimensional orthogonal systems of 928 dash notation 100
(see also axes) parametric equations 95, 664 definition 65
INDEX
cartesian 623 point of inflection 93, 238 dot notation 215, 480
curvilinear 672 radius of curvature of 123, 239 function of a function rule 86
cylindrical polar 777 sketching 108–114 higher order 77, 102
orthogonal systems of 675 slope 62–65 implicit 93
paraboloidal 785 tangent line 62–66 and incremental
rotation of 226–229 tangent vector 212 approximation 115
spherical polar 780 curvilinear coordinates 672 index notation 125
coordinates, two-dimensional curl 780 of inverse functions 94
(see also axes) cylindrical polars 777 logarithmic 92
cartesian 6 divergence 780 material 690
origin 6 elliptic system 674 notations 65, 100
orthogonal systems 675 gradient 780 parameter, in terms of 95
polar 28–30 paraboloidal 785 of polynomials 126
rotation 223 scale factor 780 of product 83, 101
coplanar vectors 218, 251 spherical polars 780 of quotient 85, 101
cosh function 37 cutset (graph theory) 822, 929 and rate of change 67
Taylor series 131 fundamental 824 of reciprocal 85, 101
cosine function 18 (see also cycle (graph theory) 820 second 79
trigonometric functions) cylindrical polar coordinates of sums 70
antiderivative 313 704, 777 table of derivatives 76, 91, 952
derivative 76 total 665
exponential form 150 of vectors 213
D
Taylor series 130 of f(ax + b) 90
cosine rule 58, 116, 950 damper 418 of e^x 75
cosine/sine transforms 587–590 damping 419 of ln x 76
inverse 589 critical 439 of cos x, sin x 75
at a jump 589 heavy 420 of x^n 69, 89
counting index (series) 43 weak 420 derivative, partial 627
Cramer’s rule 260, 262, 270 dash notation for derivative 100 higher 629
cross product (see vector deadbeat 420, 488 mixed 630
product) decay, radioactive 36, 393 second 629
cumulative distribution definite integral 320–338 determinants 173, 175, 179–190,
function (cdf) 895 degree (of angle) 16 922
curl 773–776 degree (of a vertex) 817, 818 2 × 2 173, 179
in curvilinear coordinates 780 delay rule (second shift rule) 522 3 × 3 175, 180
determinant formula 774 del (grad) operator 659 cofactor 180
identities 785 delta function (impulse function) cofactor, sign rule 181
curvature 238, 243 530, 599 expansion by first row 180
centre of 122 and discrete systems 546 expansion, general 185
radius of 123, 239 Fourier transform of 599 factorization 191, 922
curves and Heaviside unit function Jacobian 728
angle between intersecting 658 532 minor 303
asymptotes 11, 109, 113, 114 Laplace transform of 531 notation 179
caustic 707 de Moivre’s theorem 150 product 188
chord 62–65 de Moivre’s theorem 150 de Morgan’s laws 795, 802
convex/concave 239 derivative, directional (see suffix permutation 180
curvature of 238, 243 directional derivative) tridiagonal 192
envelope 475, 702, 928 derivative, ordinary 65 (see also zero 186
gradient 62 derivative, partial) diagonal dominance 274
length 355 and antiderivative 307 diagonalization of a matrix
normal to 238, 657, 658 chain rule 86, 91 286–289, 923
difference (sets) 794 variable coefficients, linear state of the system 482
difference equations 842–861 407 trajectory (phase path) 484
attractor 858 differential equations, linear van der Pol equation 480, 492
bifurcation 857 constant coefficient differential form 469, 679
chaos 857 379–412 for differential equation
characteristic equation 850 basis 385, 388 469–473
cobweb 847–849 characteristic equation integrating factor 472
complementary function 385–391 and line integrals 744
852 complementary function 405 perfect 472, 744
compound interest 843 damped oscillator 390 table of 470
constant coefficient 849 first-order 382 differentiation 61–80 (see also
difference 843 forced equations 395–407 derivative)
equilibrium 846 general solution 382, 386, 404, chain rule 86, 91
Feigenbaum sequence 858 405, 420 function of a function rule 86
first-order 847 harmonic forcing 399 implicit 93
fixed point 846 homogeneous (unforced) of integral with respect to
forcing term, table equations 379 parameter 640
generating function initial conditions 384, 391 of inverse functions 94
homogeneous 849–852 particular solutions 395–404 logarithmic 92
inhomogeneous 852–853 second-order, unforced partial (see also derivative,
linear, constant coefficients 384–392 partial)
logistic equation 845, second-order, forced 395–412 product rule 83, 101
854–858, 861 superposition principle 399 quotient rule 85, 101
order 845 unforced equations 379–394 reciprocal rule 85
particular solution 852–853 differential equations, nonlinear reversing 307
period-2 cycle 858 (qualitative methods) of vectors 213
period-3 cycle 858, 861 480–502 diffraction 417, 608–618
period-4 cycle 858 autonomous 481 angular spectrum 610
period doubling 856 centre 484, 494 array distribution 617
recurrence relation 843 direction of paths 488, attenuation 611
stability 847, 854 492–493 convolution 616
strange attractor 858 Duffing equation 502 interference 615
z-transform 556 equilibrium point 484, 493 pattern 610
differential-delay equation 560 Euler’s method 500 phase change on ray 609, 611
differential equations, first order initial value problem 481 radiating strip 608
379–382, 407–410, instability 486 radiation 613
460–479 limit cycle 497 radiation rules 614
Bernoulli equation 478 linearization 494 source distribution 612
change of variable 473 linearized systems, digraph (directed graph) 816
and differentials 469–473 classification of 494, 496 weighted 828
direction field 461 node 488, 494 dimensions 959–960
direction indicators 461 numerical method 499 directed graph 816
energy transformation 473 orbit (phase path) 484 directed line segment 198
Euler numerical method 463 periodic motion 484 direction cosines 225
graphical method 460 phase diagram 482 direction ratios 229, 230
integrating factor 408–410 phase paths (trajectories, directional derivative 651–654,
isoclines 462 orbits) 484, 488, 493 661, 692–696
lineal-element diagram 461 phase plane 483 discrete systems 545–558 (see
logistic 478 saddle 484, 494 also z-transform)
numerical solution 473 self-similar systems 495 impulsive input 545
separable 466–469, 474 separatrix 491 input/output 545
singular solutions 468, 475 spiral 487, 494 sampling 546
solution curves 461 stability 486, 487 signal 545
time invariant 545 eigenvalues 279–304 exp(x) (see exponential function)
transfer function 549 characteristic equation 279 expected value (mean,
Parseval’s identity 585, 608 domain 451 dependent/independent
period 2π 568 forcing 376 variables 623
period T 564 polygon (statistics) 903 depiction of 624
periodic function 563, 567 friction 418 derivatives, mixed 630
pitch 562 function 12 (see functions of one, directional derivative 652, 681
sawtooth wave 5 two and N variables) errors 648–650
sine series 572 complementary 405, 852 gradient vector 659
spectrum 577 generating 860 higher derivatives 629
switching functions 573 implicit 12 implicit differentiation
symmetry 572 functions of one variable 12–35 654–656, 666
two-sided 579–582 (see also derivative; incremental approximation 645
Fourier transforms 587–620, 956 (see also derivative; incremental approximation 645
(see also diffraction) argument 12 681
convolution 601–605 delta 530, 599 least squares method 638–640
cosine transform 586, 589 dependent/independent level curves 625
definitions 588, 589, 591 variables 12 linear approximation 646
delta function 599 discontinuous 14 maximum/minimum 635
of derivative 596 even 13 maximum/minimum,
Dirac comb 605 exponential 30, 33 restricted 667–672
duality 596 harmonic 21, 413 normal to a curve 658, 660
exponential 591, 592 Heaviside 14 normal to surface 632
of exponential function 595 hyperbolic 36, 153, 951 orthogonal systems of curves
Fourier transform pair 591 implicit 12 656
frequency distribution impulse 530 partial derivatives 627
function 588 incremental approximation saddle point 637
frequency scaling 596 115, 645, 683 stationary points, Lagrange
frequency shift 596 input/output 12 multipliers 670
fundamental frequency 587 inverse 23–25 stationary points, restricted
generalized functions 600 inverse hyperbolic 38 667
inverse transform 589, 591 inverse trigonometric 25–28 stationary points, tests for 637
jump discontinuity 591 logarithm 33 steepest ascent/descent
modulation 596 maximum/minimum 102 653–654
notations 527 mean value 339 surface 624
Parseval theorem 608 odd 13 tangent plane 632
periodic function 599 periodic 22 functions of many variables
Rayleigh’s theorem 607 point of inflection 93 683–707
rules, table of 596, 956 rational 14 chain rule 688
shah function 605 signum (sgn) 15 derivative, mixed 684
sidebands 598 stationary points 102 directional derivative 692, 693
signal energy 607 switching 810 envelope 702, 703, 707
sinc function 593, 594 translation of 13 errors 685
sine transform 587, 588 trigonometric 17–22, 25–27, gradient vector 688, 689
spectral density 588 949 (table) higher derivatives 684
table 956 unit step 14 implicit differentiation 686
time scaling 596 functions of two variables incremental approximation
top-hat function 593, 623–642, 645–683 683, 684
594 chain rule, one parameter Lagrange multipliers 699–701,
triangle function 605 664–665 706
frameworks 834–835 chain rule, two parameters level surface 696
bipartite graph 834 676–679 material derivative 690
minimum bracing 834 contour map 625 normal to surface 690
partial derivatives 683 compatibility graph 836 harmonic oscillator 413–425
restricted stationary points complete graph 817, 833 amplitude 414
645, 683 table of integrals 953–954 Lagrange multipliers 667–672,
indefinite integral 324 trapezium rule 346, 347 681, 955
table 953 variable limits 336 Laplace equation 785, 786
identity matrix 170 volume 765 Laplace transforms 505–561,
index laws 4, 948 integral equation 529, 559, 560 926–927, 955 (see also
induction 843 Volterra 559 z-transform)
inductor integrand 324 convolution theorem 541, 726,
impedance 533 integrating factor 407–410 927
phasor 446 integration 320–378 (see also cosine function 507
complex impedance 447 integral; double definition 505
inequality 5 integration) of derivatives 515
infinite series 128 change of variable 362–366 delay rule (second shift rule)
convergence 129 of inverse function 370 522
divergence 129 partial fractions 366 delta function 530
geometric 43 by parts 368–373 differential-delay equation 560
partial sums 129 reduction formulae 373 differential equations 516–519
sum 128 by substitution 356–366, 378 differential equations,
Taylor series 130 of trigonometric products variable coefficients 560
inflection, point of 93, 238 362 discrete systems 545
inner product (see scalar interference 417, 456, 615 division by s 528
product) fringes 457 division rule 524
integer floor function 560 intersection (sets) 791 of Fourier series 585
integers 4 interval 5 Heaviside unit function 519
sums of powers of 949 infinite 5 impedance, s-domain 530
integrals 320–378 (see also inverse function 23–25 impulse function 530
antiderivative; derivative of 94 impulsive input 543
integration; double integration of 370 integral equations 529, 559,
integral; line integral) reciprocal relations 23 560
and area 323, 333, 327 reflection property 24 inverse 505, 512
area, polar coordinates 345 inverse matrix 172, 190 inverses, table of 955
area analogy 327, 346 Gaussian elimination 265 multiplication by e^(kt) 510
of complex functions 331 inverse trigonometric functions multiplication by t^n 510
definite 323 25 notation 506
differentiation of (variable principal values 26 of powers, t^n 507
limits) 336 irrotational field 775 partial fractions 513
differentiation with respect isocline 461, 493 quiescent system 517
to parameter 374 iterative methods (see rules, list of 955
even function 334 approximation) s-domain 529
improper 328 scale rule 508, 955
indefinite 324 shift rules 510, 955
J
infinite 329 sifting 531
as limit of a sum 341–353 Jacobi method (for linear sine function 507
limits of integration, variable equations) 274 square wave 521
336 Jacobian (double integration) table of 513
numerical evaluation of 322, 728 and transfer function,
339, 346, 355 jump (discontinuity) 14 s-domain 535
odd function 334 and transfer function,
rectangle rule 322, 347 ω-domain 540
K
Simpson’s rule 355, 925 Volterra integral equation 559
solid of revolution 343 Kirchhoff laws 449, 824, 825 and z transform 548
square bracket notation 316 Kuratowski 833 lead and lag 415
least squares geometrical interpretation 268 determinant of 175, 190 (see
estimates 914 homogeneous 271 also determinants)
by Gaussian elimination to plane 232 oscillator, linear 419–425
of a product 174 to surface 632, 690 outcome 866
rule for 2 × 2 173 normal coordinates 299 outlier 907
rule for 3 × 3 175 normal distribution 898–900
maximum/minimum standard normal curve 899
P
local 103 standardized 899, 957
N variables 696 table 958 parabola 12
one variable 102 number line 5 paraboloidal coordinates 785
one variable, classification 104 number 3 parallelepiped
restricted 107, 670, 697, 699 complex (see complex volume (determinant) 257
(see also Lagrange numbers) volume (vector) 251
multipliers) exponent (index) 4 parallelogram
two variables 635–638 exponent rules 4 area (determinant) 734
two variables, classification 637 index laws 948 area (vector) 248
mean (expected value, infinity sign 4 parallelogram rule
expectation) 889, 897 integer 4 complex numbers 145
median 906 irrational 4 vector addition 201
mode 906 modulus 6 parameter (statistics) 903
modulus 6 powers 4 parametric equations of a curve
moment (see also force) rational 4 95, 664
about an axis 253 real 3 Parseval identity 585, 608
of force 251, 255 recurring decimal 4, 46 partial derivative 627 (see
vector 252 set notations 790 functions of N variables)
moment of inertia 348, 350–352 numerical methods (see higher 629
cone 377 approximation) mixed 630
disc 352, 377 second 629
rectangle 354 partial differentiation 623–705
O
sphere 377 partial fractions 39–42
triangle 351 Ohm’s law 824 in integration 366
moment of momentum 258 operator 66 and Laplace transforms 513
mortgage 844 ordered pair 144 rules 40
moving average 620 ordinate 6 and z-transforms 554
multigraph 817 origin of coordinates 6 partial sum 129
mutually exclusive events 869 orthogonal matrix 295–298, Pascal distribution 894, 957
923 Pascal’s triangle 52, 949
rotation of axes 296 Path, phase (see line integral;
N
orthogonal systems phase plane)
nabla (gradient) 659 of coordinates 675 path (graph theory) 820
negation (Boolean algebra) 804 of curves 654, 928 pdf (probability density
negative binomial (Pascal) oscillations function) 895
distribution 894, 957 addition 417, 454 pendulum 71, 394, 425, 489–491,
mean 894, 957 beats 431–437 648, 960
variance 894, 957 compound 431 perfect differential form 471–473
Newton cooling 412 damped 419 line integrals 744
Newton’s method 116–119 deadbeat 420, 488 period 9, 22, 414
nodal analysis (circuits) 824 forced 420 period doubling 856
node (graph theory) 815 harmonic 413, 427 periodic functions 22 (see also
node (phase plane) 488, 494 longitudinal 298 harmonic functions;
nonlinear differential equations overdamped (heavy damping) Fourier series)
(see differential 420 amplitude 22, 414
equations, nonlinear) transients 420 and Fourier series 563, 567
angular frequency 22, 415 normal vector 231 expected value 889
frequency 22, 414 tangent 632, 691 exponential 897
rectangle rule (integration) 322, scalar 219 510, 955
347 scalar function 659 shoulder (surface) 635
recurrence relation 464 (see also scalar (dot, inner) product sidebands 585, 598
difference equations) 218–240 (see also sifting function 531
recurring decimal 2, 46 vectors) signal 545, 589
red shift 438 angle between vectors 221 signal energy 607
reduction formula 373 of basis vectors 222 signal flow graphs 827–831
regression 913–915 invariance 248 block diagram 827
controlled variable 913 perpendicular vectors 222 cycle 830
least squares estimate 914 scalar triple product 249 edges in series 829
linear model 914 cyclic order 250 feedback 827, 828
line 915 scale rule (Laplace transform) loop 830
response 913 508, 955 multiple edges 829
scatter diagram 913 scatter diagram 913 stem 830
straight line fit 914 separable differential equations weighted digraph 828
unbiased estimators 916 466–469 signed area 314, 320
relaxation oscillation 499 separation of variables 466–469 signum (sgn) function 15, 920
repeated integral 709–717 (see separatrix 491 simple harmonic motion (see
double integration) sequence 43, 129, 845 harmonic oscillator)
separable 724 of partial sums 129 Simpson’s rule 355, 925
resistor series (see Fourier series; sinc function 594
impedance 533 geometric series; infinite sine function 18 (see also
phasor 447 series; Taylor series) trigonometric functions)
complex impedance 447 sets 789–800 antiderivative 312
resonance 423 associative laws 793 derivative 75
restricted stationary values 106, binary 789 exponential form 150
667, 697 cardinality 798 rectified 582
resultant of forces 235 cartesian product 800 Taylor series 130
root mean square (rms) 328 commutative law 793 sine rule 950
rotation of axes (see also axes) complement 791 sine transform (see cosine/sine
223, 226 complementary laws 794 transform)
row operations, elementary 262 de Morgan’s laws 795 singular matrix 173
difference 794 singular solutions (of differential
disjoint 791 equations) 468
S
distributive law 794 sinh function 37
saddle (phase plane) 484, 494 duality 800 Taylor series 131
saddle (surface) 625, 635–638, elements 789 sinusoid 22
927 empty 791 slope 9, 61–65, 66 (see curve;
sample 903 equality 790 functions of two
mean 905, 910 finite 790 variables; straight line)
standard error of mean 910 identity laws 794 solenoidal field 775
variance 911 infinite 790 solid of revolution 343
sample space 866 (see also intersection 791 surface area integral 343
events) number sets 790 volume integral 343
countable 866 ordered pairs 800 spectral density 588
discrete 866 proper subset 792 spectrum (Fourier series) 577
elements of 866 subset 792 speed 98
event 866 union 791 sphere
exhaustive 869 universal 791 surface area 951
partitioning of 870 Venn diagram 792 volume 951
Venn diagram 868 sgn (signum) function 15, 920 spherical polar coordinates 780
spiral (curve) 28 Stokes’s theorem 784 Taylor polynomial 125, 921, 922
Archimedean 57 straight line 8 Taylor series 124–138
integrals of products 362 normal to plane 231 surface area 767
inverse 25 normal to a surface 632, 690 surface integral 765
Taylor series 130 parallel 199 triple integral 769
truth tables 803 (Boolean parallelogram rule 201 volume integral 765, 769
algebra) perpendicular vorticity 782
for gates 803 plane equation 208, 231 vector (cross) product 244–258
inverse method 808 position 206 direction of 246
for switches 809 and relative velocity 204 invariance 248
right/left-handed system of rules 245
198 of unit vectors 245
U
row 162, 169 vector space 285
uniform distribution 901 rules of vector algebra base vectors 285
union (of sets) 791 199–202 vector triple product 255
unit step function 14 scalar (dot, inner) product velocity 67, 212, 213
units (SI) 959–960 scalar triple product 249 polar components 216
unit vector 211, 223 scalar triple product 249 polar components 216
universal set 791 straight line 209, 230, 234 relative 204
subtraction 200 Venn diagram 792, 868
sum of 181 vertex 814
V
tangent to curve 213, 238 vibrations (see oscillations)
valency 816 triangle rule 200 volume
van der Pol equation 499, 926 unit 211, 223 of cone 342, 951
variable, random (see random vector product 244 (see vector ellipsoid 353
variable) product) integral 769
variable, dependent/independent vector triple product 255 parallelepiped 251, 257
12 and velocity (see velocity) 213, of solid of revolution 343
variance 890, 897 216 table 951
variance, sample 911 vector fields 762–786 (see also tetrahedron 785
variate 793 curl; divergence;
vectors 193–258 (see also axes; gradient)
W
vector field) cylindrical polar coordinates
acceleration 213 777 walk (graph theory) 820
addition of 200 curl 773–777 walk, random 860
angle between 220 curvilinear coordinates water clock 354
basis 210 779–781 wave (see also diffraction;
column 162, 169 divergence 764 interference;
components 199 divergence theorem 770–773 oscillations)
coplanar 203, 251 field lines 762–764 antinode 427
cross product (see vector fluid flow 772, 774, 775 attenuating 629
product) flux 770 beats 431, 432, 434
and curvature 238 gradient 780 carrier 432, 585, 598
differentiation 212–214 identities 785 complex amplitude 453–454
directed line segment 198 integral curves 762 compound oscillation 431
displacement 193, 197 irrotational 775 diffraction 417
dot product (see scalar Laplace’s equation 786 dispersive 436
product) orthogonal coordinates Doppler effect 437
equality 199 (general) equation 631
gradient 659–661, 688 paraboloidal coordinates 785 frequency modulation 435
(see also gradient) scale factor 778 group velocity 436
invariance 248 spherical polar coordinates intensity 455
magnitude (length) 199 780 interference 417
modulation 432, 435 wave number 428 complex plane 522–556
node 427 wave packets 432 definition 549