
MATHEMATICAL

TECHNIQUES
An introduction for the engineering,
physical, and mathematical sciences

SECOND EDITION

D. W. JORDAN AND P. SMITH


Short contents

PART I  Elementary methods, differentiation, complex numbers
1 Standard functions and techniques 1
2 Differentiation 30
3 Further techniques for differentiation 48
4 Applications of differentiation 64
5 Taylor series and approximations 83
6 Complex numbers 96

PART II  Matrix algebra and vectors
7 Matrix algebra 112
8 Determinants 129
9 Elementary operations with vectors 142
10 The scalar product 163
11 Vector product and derivatives of vectors 183
12 Linear equations 196
13 Eigenvalues and eigenvectors 214

PART III  Integration and differential equations
14 Antidifferentiation and area 238
15 The definite and indefinite integral 248
16 Applications involving the integral as a sum 265
17 Systematic techniques for integration 277
18 Unforced linear differential equations with constant coefficients 295
19 Forced linear differential equations 310
20 Harmonic functions and the harmonic oscillator 326
21 Steady forced oscillations: phasors, impedance, transfer functions 339
22 Graphical, numerical and other aspects of first-order equations 350
23 Nonlinear differential equations and the phase plane 366

PART IV  Transforms and Fourier series
24 The Laplace transform 384
25 Applications of the Laplace transform 402
26 Fourier series and Fourier transforms 434

PART V  Multivariable calculus
27 Differentiation of functions of two variables 470
28 Functions of two variables: geometry and formulae 488
29 Chain rules, restricted maxima, coordinate systems 504
30 Functions of any number of variables 521
31 Double integration 543
32 Line integrals 564
33 Vector fields: divergence and curl 587

PART VI  Discrete mathematics
34 Sets 605
35 Boolean algebra: logic gates and switching functions 615
36 Graph theory and its applications 626
37 Difference equations 648

PART VII  Probability and statistics
38 Probability 666
39 Random variables and probability distributions 683
40 Descriptive statistics 701

PART VIII  Projects
41 Applications projects using symbolic computing 715

Answers to selected problems 736
Appendices 767
Index 779


Department of Mathematics
Keele University

Oxford Toronto Melbourne


OXFORD UNIVERSITY PRESS
Oxford University Press, Great Clarendon Street, Oxford OX2 6DP
Oxford New York
Athens Auckland Bangkok Bogota
Buenos Aires Calcutta Cape Town
Chennai Dar es Salaam Delhi Florence
Hong Kong Istanbul Karachi Kuala Lumpur
Madrid Melbourne Mexico City Mumbai
Nairobi Paris São Paulo Singapore
Taipei Tokyo Toronto Warsaw

and associated companies in
Berlin Ibadan

Oxford is a trade mark of Oxford University Press

Published in the United States
by Oxford University Press Inc., New York

© D. W. Jordan and P. Smith, 1994, 1997

First edition 1994
Reprinted 1994, 1995, 1996
Second edition 1997
Reprinted 1998

All rights reserved. No part of this publication may be reproduced, stored in a
retrieval system, or transmitted, in any form or by any means, without the prior
permission in writing of Oxford University Press. Within the UK, exceptions are
allowed in respect of any fair dealing for the purpose of research or private study, or
criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988,
or in the case of reprographic reproduction in accordance with the terms of licences
issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside
those terms and in other countries should be sent to the Rights Department, Oxford
University Press, at the address above.

This book is sold subject to the condition that it shall not, by way of trade or
otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's
prior consent in any form of binding or cover other than that in which it is published
and without a similar condition including this condition being imposed on the subsequent
purchaser.

A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data


(Data available)
ISBN 0 19 856462 7 (Hbk)
ISBN 0 19 856461 9 (Pbk)

Typeset by Keyword Typesetting Services


Printed in Great Britain by
Butler & Tanner Ltd, Frome
Preface to Second Edition
This book is a student text covering the mathematical techniques
usually taught in the early stages of science and engineering degree
courses. It also provides the groundwork of ‘methods’ needed by
first-year mathematics specialists. The requirements of such students
have influenced its content and presentation, helped in many ways
by the authors’ long and continuous experience of teaching such
courses to a variety of joint degree students at Keele University,
including those who have only minimal entry-level background in
mathematics. We have also tried to take account of the likely
background of future students, who will probably have a more
broadly based pre-university education.
In this new edition we have taken the opportunity to respond to
the correspondents and reviewers who have made suggestions for
extensions and improvements. The new material includes: the z
transform; the Fourier transform; the use of the Jacobian for general
change of variable in double integrals; vector fields, divergence,
and curl in orthogonal coordinates; additional applications of graph
theory; and three chapters on probability and descriptive statistics.
The content of the former single chapter on vector algebra has been
spread over three chapters to facilitate selection of material and ease
of learning.
The text has been divided into eight parts, each covering a
coherent theme. Following Part I, which is an introduction to
elementary methods, differentiation, and complex numbers, the
calculus sequence continues in Parts III, IV, and V, which cover
integration, differential equations, Laplace transforms, Fourier
series, Fourier transforms, functions of several variables, multiple
and line integrals, and vector fields. The algebra strand in Parts II
and VI includes matrices, vectors, linear equations, eigenvalues, sets,
Boolean algebra, graph (network) theory, and difference equations.
Part VII is an introduction to probability and descriptive statistics.
The final part, Part VIII, consists of a selection of symbolic
computing projects related, chapter by chapter, to the main text.
We have organized the book so as to enable students to use it with
the minimum of guidance. The same features will help teachers to
make selections from the book according to the length, emphasis,
and prerequisites of different courses. From Chapter 2 onwards,
every subject treated starts from scratch. For example, it is not
assumed that the reader necessarily knows any calculus, although
most science and engineering students are likely to have had some
previous encounter with the subject; if so, the text can be used for
revision as well as extension of various topics. Most of the chapters
are short (the average length is about 18 pages), and the sections
within the chapters include not more than one or two new ideas.
All the principal results are displayed in summary form in numbered
and shaded ‘boxes’. For revision purposes, or in desperate cases,
progress could be made by attending only to the boxes.
The reader is encouraged to look out for some kind of geometrical
or numerical reality to illustrate symbolic statements. Where possible
we make use of graphical justifications; science students in
particular should benefit by cultivating geometrical reasoning.
Numerical methods for equation-solving, integration, and the solution
of differential equations and systems are introduced at the
points where they can effectively illustrate the main text, rather than
being collected together in a separate chapter on numerical analysis.
Attempts to generate interest by using specialized examples from
physics, chemistry, engineering, etc., are liable to misfire with many
students, who have enough to do at first in grasping the underlying
mathematical processes, and are confused by layers of scientific
vocabulary and unfamiliar notations. We have therefore taken out
into separate chapters certain technical applications such as the
harmonic oscillator, phasors, and circuit analysis, and given these
topics a fuller treatment. Certain other applications are confined to
separate sections, so that they can be avoided if they do not suit a
particular class. Most of the applications in the main text are drawn
from common knowledge, or are such as can easily be understood.
The last chapter contains a collection of projects to introduce the
use of symbolic computation and graphics manipulation, following
the text chapter by chapter. A short first-year course at Keele
University introducing mathematics students to Mathematica† software
has proved successful with students who often have no previous
experience of computing. Symbolic computation is a useful interactive
facility for graph plotting and routine manipulations in, for
example, linear algebra, differentiation, and integration, although it
is no substitute for understanding methods and principles. Almost
all the graphs of curves and surfaces in this book have their origins
in Mathematica graphical outputs. Specific Mathematica programs
which provide solutions for these projects are available on the
web or on disk (see p. 716). Mostly they incorporate standard
Mathematica commands, and they are readily adaptable to a
variety of alternative inputs for specific functions, matrices, vectors,
determinants, Fourier series, Laplace transforms, etc.
There are nearly 500 fully-worked examples in the book, a large
number of exercises and problems including simple programming
applications, over 120 projects for symbolic computation, and
several Appendices consisting of tables of standard results for
reference, as well as answers and hints to selected problems and
projects.

† Mathematica is a registered trade mark of Wolfram Research Inc.

Confidence is half the battle in mathematics: it is very encouraging
to learn how to do something so as to be able to get it nearly right
most times. Continual practice, even to the point of repetitive drill, is
the way to achieve this. The beneficial effects of practice can be
obtained without using very complicated and difficult exercises, so
on the whole we have avoided such problems.
We should like to record our thanks to the following: to Michael
Basler, University of Jena, the translator of the German edition; to
John Bentin, for numerous suggestions; to Peter Jones, Keele
University, for helpful comments on early drafts of the chapters on
probability and statistics; to Andrew Looms, Keele University, for
setting up the website; to Anne Smith, University of Virginia, for
checking a large number of examples and problems; to Robert West,
for checking the new edition; and to colleagues and students from
Keele University for drawing attention to obscurities in the first
edition. The development, writing and organization of this text and
the associated software has been a complicated process, and we wish
to express our appreciation of the helpfulness of the staff of Oxford
University Press during the production of the book.

Keele D.W.J.
March 1997 P.S.
Contents

PART I  Elementary methods, differentiation, complex numbers

1 Standard functions and techniques
1.1 Number line, intervals 1
1.2 Coordinates in the plane 2
1.3 Straight lines and curves 3
1.4 Functions 6
1.5 Radian measure of angles 8
1.6 Trigonometric functions 9
1.7 Inverse functions 12
1.8 Exponential functions 14
1.9 The logarithmic function 15
1.10 Exponential growth and decay 17
1.11 Hyperbolic functions 18
1.12 Partial fractions 20
1.13 Summation sign; geometric series 24
Problems 26

2 Differentiation
2.1 The slope of a graph 30
2.2 The derivative: notation and definition 32
2.3 Rates of change 34
2.4 Derivative of $x^n$ ($n = 0, 1, 2, 3, \dots$) 36
2.5 Derivatives of sums: multiplication by constants 37
2.6 Three important limits 39
2.7 Derivatives of $e^x$, $\sin x$, $\cos x$, $\ln x$ 41
2.8 A basic table of derivatives 43
2.9 Higher-order derivatives 44
2.10 An interpretation of the second derivative 44
Problems 45

3 Further techniques for differentiation


3.1 The product rule 48
3.2 Quotients and reciprocals 50
3.3 The chain rule 52
3.4 Derivative of $x^n$ for any value of n 55
3.5 Functions of ax + b 56
3.6 An extension of the chain rule 56
3.7 Logarithmic differentiation 57
3.8 Implicit differentiation 58
3.9 Derivatives of inverse functions 59
3.10 Derivative as a function of a parameter 60
Problems 62
4 Applications of differentiation
4.1 Function notation for derivatives 64
4.2 Maxima and minima 66
4.3 Exceptional cases of maxima and minima 69
4.4 Sketching graphs of functions 70
4.5 Estimating small changes 75
4.6 Numerical solution of equations: Newton’s method 77
Problems 81

5 Taylor series and approximations


5.1 The index notation for derivatives of any order 83
5.2 Taylor polynomials 83
5.3 A note on infinite series 86
5.4 Infinite Taylor expansions 88
5.5 Manipulation of Taylor series 89
5.6 Approximations for large values of x 92
5.7 Taylor series about other points 92
Problems 94

6 Complex numbers
6.1 Definitions and rules 96
6.2 The Argand diagram and complex numbers 100
6.3 Complex numbers in polar coordinates 102
6.4 Complex numbers in exponential form 103
6.5 The general exponential form 105
6.6 Hyperbolic functions 107
6.7 Miscellaneous applications 108
Problems 109

PART II  Matrix algebra and vectors

7 Matrix algebra
7.1 Matrix definition and notation 112
7.2 Rules of matrix algebra 113
7.3 Special matrices 118
7.4 The inverse matrix 122
Problems 126

8 Determinants
8.1 The determinant of a square matrix 129
8.2 Properties of determinants 132
8.3 The adjoint and inverse matrices 137
Problems 139

9 Elementary operations with vectors


9.1 Displacement along an axis 142
9.2 Displacement vectors in two dimensions 144
9.3 Axes in three dimensions 146
9.4 Vectors in two and three dimensions 146
9.5 Relative velocity 150
9.6 Position vectors and vector equations 152
9.7 Unit vectors and basis vectors 155


9.8 Tangent vector, velocity, and acceleration 157
9.9 Motion in polar coordinates 158
Problems 160

10 The scalar product


10.1 The scalar product of two vectors 163
10.2 The angle between two vectors 164
10.3 Perpendicular vectors 165
10.4 Rotation of axes in two dimensions 167
10.5 Direction cosines 167
10.6 Rotation of axes in three dimensions 169
10.7 Direction ratios and coordinate geometry 171
10.8 Properties of a plane 173
10.9 General equation of a straight line 175
10.10 Forces acting at a point 176
10.11 Curvature in two dimensions 178
Problems 180

11 Vector product and derivatives of vectors


11.1 Vector product 183
11.2 Nature of the vector $\mathbf{p} = \mathbf{a} \times \mathbf{b}$ 185
11.3 The scalar triple product 187
11.4 Moment of a force 190
11.5 Vector triple product 192
Problems 193

12 Linear equations
12.1 Solution of linear equations by elimination 196
12.2 The inverse matrix by Gaussian elimination 201
12.3 Compatible and incompatible sets of equations 202
12.4 Homogeneous sets of equations 205
12.5 Gauss-Seidel iterative method of solution 208
Problems 210

13 Eigenvalues and eigenvectors


13.1 Eigenvalues of a matrix 214
13.2 Eigenvectors 215
13.3 Linear dependence 220
13.4 Diagonalization of a matrix 221
13.5 Powers of matrices 224
13.6 Quadratic forms 227
13.7 Positive-definite matrices 229
13.8 An application to a vibrating system 233
Problems 235
PART III  Integration and differential equations

14 Antidifferentiation and area
14.1 Reversing differentiation 238
14.2 Constructing a table of antiderivatives 241
14.3 Signed area generated by a graph 244
Problems 246

15 The definite and indefinite integral


15.1 Signed area as the sum of strips 248
15.2 Numerical illustration of the sum formula 249
15.3 The definite integral and area 250
15.4 The indefinite-integral notation 251
15.5 Integrals unrelated to area 252
15.6 Improper integrals 255
15.7 Integration of complex functions: a new type of integral 256
15.8 The area analogy for a definite integral 258
15.9 Using the area analogy 259
15.10 Definite integrals having variable limits 261
Problems 263

16 Applications involving the integral as a sum


16.1 Examples of integrals arising from a sum 265
16.2 Geometrical area in polar coordinates 267
16.3 The trapezium rule 268
16.4 Centre of mass, moment of inertia 270
Problems 274

17 Systematic techniques for integration


17.1 Substitution method for $\int f(ax + b)\,dx$ 277
17.2 Substitution method for $\int f(ax^2 + b)\,x\,dx$ 279
17.3 Substitution method for $\int \cos^m ax \sin^n ax\,dx$ (m or n odd) 280
17.4 Definite integrals and change of variable 282
17.5 Occasional substitutions 283
17.6 Partial fractions for integration 285
17.7 Integration by parts 286
17.8 Integration by parts: definite integrals 289
Problems 291

18 Unforced linear differential equations with


constant coefficients
18.1 Differential equations and their solutions 295
18.2 Solving first-order linear unforced equations 298
18.3 Solving second-order linear unforced equations 301
18.4 Complex roots of the characteristic equation 304
18.5 Initial conditions for second-order equations 307
Problems 308
19 Forced linear differential equations


19.1 Particular solutions for standard forcing terms 310
19.2 Harmonic forcing term by using complex solutions 314
19.3 Particular solutions: exceptional cases 317
19.4 The general solution of forced equations 318
19.5 First-order linear equations with a variable coefficient 321
Problems 324

20 Harmonic functions and the harmonic


oscillator
20.1 Harmonic oscillations 326
20.2 Phase difference: lead and lag 328
20.3 Physical models of a differential equation 329
20.4 Free oscillations of a linear oscillator 330
20.5 Forced oscillations and transients 331
20.6 Resonance 333
20.7 Nearly linear systems 335
Problems 337

21 Steady forced oscillations: phasors,


impedance, transfer functions
21.1 Phasors 339
21.2 Algebra of phasors 341
21.3 Phasor diagrams 342
21.4 Phasors and complex impedance 343
21.5 Transfer functions in the frequency domain 346
Problems 348

22 Graphical, numerical and other aspects of


first-order equations
22.1 Graphical features of first-order equations 350
22.2 The Euler method for numerical solution 351
22.3 Nonlinear equations of separable type 354
22.4 Differentials and the solution of first-order equations 356
22.5 Change of variable in a differential equation 360
Problems 363

23 Nonlinear differential equations and the


phase plane
23.1 Autonomous second-order equations 367
23.2 Constructing a phase diagram for $(x, \dot{x})$ 368
23.3 $(x, \dot{x})$ phase diagrams for other linear equations; stability 371
23.4 The pendulum equation 373
23.5 The general phase plane 375
23.6 Approximate linearization 377
23.7 Limit cycles 379
23.8 A numerical method for $\dot{x} = P$, $\dot{y} = Q$ 380
Problems 381
PART IV  Transforms and Fourier series

24 The Laplace transform
24.1 The Laplace transform 384
24.2 Laplace transforms of $t^n$, $e^{kt}$, $\sin t$, $\cos t$ 385
24.3 Scale rule; shift rule; factors $t^n$ and $e^{kt}$ 387
24.4 Inverting a Laplace transform 390
24.5 Laplace transforms of derivatives 392
24.6 Application to differential equations 394
24.7 The unit function and the delay rule 396
Problems 400

25 Applications of the Laplace transform


25.1 Division by s and integration 402
25.2 The impulse function 404
25.3 Impedance in the s domain 406
25.4 Transfer functions in the s domain 408
25.5 The convolution theorem 413
25.6 General response of a system from its impulsive response 415
25.7 Convolution integral in terms of memory 416
25.8 Discrete systems 417
25.9 The z transform 419
25.10 Behaviour of z transforms in the complex plane 424
25.11 Difference equations 428
Problems 430

26 Fourier series and Fourier transforms


26.1 The composition of vibrations 434
26.2 Fourier series for a periodic function 435
26.3 Integrals of periodic functions 436
26.4 Calculating the Fourier coefficients 438
26.5 Examples of Fourier series 440
26.6 Use of symmetry: sine and cosine series 442
26.7 Functions defined on a finite range: half-range series 444
26.8 Spectrum of a periodic function 447
26.9 Obtaining one Fourier series from another 448
26.10 The two-sided Fourier series 449
26.11 Nonperiodic functions and the Fourier transform 451
26.12 Short notations 454
26.13 Fourier transforms of some basic functions 454
26.14 Rules for manipulating transforms 456
26.15 The delta function and periodic functions 458
26.16 Convolutions 460
26.17 The shah function 464
26.18 Energy in a signal: Rayleigh’s theorem 465
Problems 466

PART V  Multivariable calculus

27 Differentiation of functions of two variables
27.1 Functions of more than one variable 470
27.2 Depiction of functions of two variables 471
27.3 Partial derivatives 473
27.4 Higher derivatives 476
27.5 Tangent plane and normal to a surface 478
27.6 Maxima, minima, and other stationary points 480


27.7 The method of least squares 483
27.8 Differentiating an integral with respect to a parameter 484
Problems 486

28 Functions of two variables: geometry and


formulae
28.1 The incremental approximation 488
28.2 Small changes and errors 490
28.3 The derivative in any direction 493
28.4 Implicit differentiation 496
28.5 Normal to a curve 498
28.6 Gradient vector in two dimensions 499
Problems 502

29 Chain rules, restricted maxima, coordinate


systems
29.1 Chain rule for a single parameter 504
29.2 Restricted maxima and minima: the Lagrange multiplier 506
29.3 Curvilinear coordinates in two dimensions 511
29.4 Orthogonal coordinates 513
29.5 The chain rule for two parameters 514
29.6 The use of differentials 517
Problems 519

30 Functions of any number of variables


30.1 The incremental approximation; errors 521
30.2 Implicit differentiation 523
30.3 Chain rules 525
30.4 The gradient vector in three dimensions 525
30.5 Normal to a surface 527
30.6 Equation of the tangent plane 528
30.7 Directional derivative in terms of gradient 529
30.8 Stationary points 532
30.9 The envelope of a family of curves 537
Problems 538

31 Double integration
31.1 Repeated integrals with constant limits 543
31.2 Examples leading to repeated integrals with constant limits 544
31.3 Repeated integrals over nonrectangular regions 546
31.4 Changing the order of integration for nonrectangular regions 548
31.5 Double integrals 550
31.6 Polar coordinates 553
31.7 Separable integrals 555
31.8 General change of variable; the Jacobian determinant 557
Problems 561
32 Line integrals
32.1 Illustrating a line integral 564
32.2 General line integrals in two and three dimensions 567
32.3 Paths parallel to the axes 570
32.4 Path independence and perfect differentials 571
32.5 Closed paths 572
32.6 Green’s theorem 574
32.7 Line integrals and work 576
32.8 Conservative fields 578
32.9 Potential for a conservative field 580
32.10 Single-valuedness of potentials 581
Problems 584

33 Vector fields: divergence and curl


33.1 Vector fields and field lines 587
33.2 Divergence of a vector field 588
33.3 Surface and volume integrals 589
33.4 The divergence theorem 593
33.5 Curl of a vector field 595
33.6 Cylindrical polar coordinates 599
33.7 Curvilinear coordinates 601
Problems 602

PART VI  Discrete mathematics

34 Sets
34.1 Notation 605
34.2 Equality, union, and intersection 606
34.3 Venn diagrams 608
Problems 612

35 Boolean algebra: logic gates and switching


functions
35.1 Laws of Boolean algebra 615
35.2 Logic gates and truth tables 617
35.3 Logic networks 619
35.4 The inverse truth-table problem 621
35.5 Switching circuits 622
Problems 623

36 Graph theory and its applications


36.1 Examples of graphs 626
36.2 Definitions and properties of graphs 627
36.3 How many simple graphs are there? 629
36.4 Paths and cycles 629
36.5 Trees 631
36.6 Electrical circuits: the cutset method 632
36.7 Signal-flow graphs 635
36.8 Planar graphs 638
36.9 Further applications 640
Problems 643
37 Difference equations
37.1 Discrete variables 648
37.2 Difference equations: general properties 650
37.3 First-order difference equations and the cobweb 652
37.4 Constant-coefficient linear difference equations 653
37.5 The logistic difference equation 658
Problems 662

PART VII  Probability and statistics

38 Probability
38.1 Introduction 666
38.2 Sample spaces, events and probability 667
38.3 Sets and probability 669
38.4 Counting and combinations 673
38.5 Conditional probability 675
38.6 Independent events 677
38.7 Total probability 678
38.8 Bayes’ theorem 679
Problems 680

39 Random variables and probability distributions


39.1 Random variables 683
39.2 Probability distributions 684
39.3 The binomial distribution 685
39.4 Expected value and variance 687
39.5 Geometric distribution 689
39.6 Poisson distribution 691
39.7 Other discrete distributions 693
39.8 Continuous random variables and distributions 694
39.9 Mean and variance of continuous random variables 695
39.10 The normal distribution 696
Problems 698

40 Descriptive statistics
40.1 Representing data 701
40.2 Random samples and sampling distributions 705
40.3 Sample mean and variance, and their estimation 707
40.4 Central limit theorem 708
40.5 Regression 710
Problems 713

PART VIII  Projects

41 Applications projects using symbolic computing
41.1 Symbolic computation 715
41.2 Projects 716

Answers to selected problems 736

Appendices
A Some algebraical rules 767
B Trigonometric formulae 769
C Areas and volumes 770
D A table of derivatives 771
E A table of integrals 772
F A table of Laplace transforms, inverses and general rules 773
G A table of Fourier transforms and general rules 774
H Probability distributions and tables 776

Index 779
PART I ELEMENTARY METHODS, DIFFERENTIATION, COMPLEX NUMBERS

1 Standard functions and techniques
Contents
1.1 Number line, intervals 1
1.2 Coordinates in the plane 2
1.3 Straight lines and curves 3
1.4 Functions 6
1.5 Radian measure of angles 8
1.6 Trigonometric functions 9
1.7 Inverse functions 12
1.8 Exponential functions 14
1.9 The logarithmic function 15
1.10 Exponential growth and decay 17
1.11 Hyperbolic functions 18
1.12 Partial fractions 20
1.13 Summation sign; geometric series 24
Problems 26

1.1 Number line, intervals


Draw a straight line containing a point O, called the origin, and
indicate a scale starting at O as in Fig. 1.1, with positive scale
markings above O and negative markings below. Imagine the
line to be infinitely long in both directions. This is called a number
line, and every real number, positive or negative, has a place on it.
We shall use x to denote a general number.

[Fig. 1.1 The number line, or x axis.]

The signs of inequality $<$, $>$, $\le$, $\ge$ have the following meanings:
$<$ ‘is less than’, $\le$ ‘is less than or is equal to’;
$>$ ‘is greater than’, $\ge$ ‘is greater than or is equal to’.
If we are given two numbers, then the one which is higher on
the number line is the greater one. Therefore $-2 > -3$, $-3 < -2$,
$3 > 0$, $-3 < 0$, and so on.
Obviously $2 < 3$. But it is also true that $2 \le 3$, because 2 is
certainly either less than or equal to 3. For similar reasons, all
these are true: $1 = 1$, $1 \le 1$, $1 \ge 1$.
A single piece, or a segment, of the number line is called an
interval. The piece of the line between $x = 2$ and $x = 3$ which
includes both the end-points $x = 2$ and $3$ can be specified by the
expression
‘the interval $2 \le x \le 3$’
which means ‘all the values of x between, and including, 2 and 3’.
The interval $2 < x < 3$ means all values of x between 2 and 3, but
excluding the end values. Infinite intervals can be expressed in two
ways; for example, the interval
$x \ge 2$ or $2 \le x < \infty$
contains all the numbers x which are greater than or equal to 2.

The ‘size’ of a number is denoted by the symbol
$$|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0, \end{cases} \tag{1.1}$$
which is called the modulus of x or mod x. Thus $|3| = 3$, $|-4| = 4$.
We can use the modulus notation to define intervals. The inequality
$|x| \le 2$ defines the same interval as $-2 \le x \le 2$; $|x - 1| \le 3$ is the
same as $-3 \le x - 1 \le 3$ or $-2 \le x \le 4$.
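
The equivalence between a modulus inequality and an interval can be checked numerically. The following Python sketch is an illustration only, not part of the original text; the helper name `in_interval` and the choice of sample points are assumptions made for the example.

```python
def in_interval(x, lo, hi):
    """Return True when lo <= x <= hi, i.e. x lies in the closed interval."""
    return lo <= x <= hi

# Sample points from -10 to 10 in steps of 0.25 (these values are exact in
# binary floating point, so the end-point comparisons are reliable).
samples = [k / 4 for k in range(-40, 41)]

# |x| <= 2 should select exactly the interval -2 <= x <= 2.
same_a = all((abs(x) <= 2) == in_interval(x, -2, 2) for x in samples)

# |x - 1| <= 3 should select exactly the interval -2 <= x <= 4.
same_b = all((abs(x - 1) <= 3) == in_interval(x, -2, 4) for x in samples)

print(same_a, same_b)
```

A check on sample points is of course not a proof; it only confirms that the two descriptions agree at the values tested, including the end-points.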
If a is a negative number, then it has no real square root:
$\sqrt{a}$ or $a^{1/2}$ means the number whose square is equal to a, and
the square of a real number cannot be negative. The same applies to
$a^{1/4}$ and other indices which have an even denominator. Suppose
that $a > 0$; then the signs $\sqrt{a}$ and $a^{1/2}$ always stand for the positive
number whose square is a. If we want the negative square root, we
must indicate it by a ‘$-$’: the two solutions of $x^2 = a$ are $\pm\sqrt{a}$.

1.2 Coordinates in the plane
The location of a point in a plane can be specified in terms of
right-handed cartesian axes, illustrated in Fig. 1.2. These are effectively
two number lines at right angles, meeting at the common
origin O. The position of a point is determined by two coordinates
(x, y), as illustrated in Fig. 1.2a. They represent, in order, its signed
distances from the y and x axes respectively, as read off on the axis
scales. (Axes are called right-handed, as is customary, if, when we
walk along the x axis in the direction of increasing x, the positive
y axis is on the left. If you look at Fig. 1.2 in a mirror, you will see
left-handed axes.)

[Fig. 1.2 Right-handed cartesian axes, x, y.]

Thus, the point A has coordinates (1.2, 1), B has coordinates
(2.3, −2), etc. We refer to particular points using the notation
A : (1.2, 1) and B : (2.3, −2). For a general point P represented by
the coordinates (x, y), as in Fig. 1.2b, x is known as the abscissa and
y the ordinate of P.
The distance of P : (x, y) from the origin is OP; by Pythagoras’
theorem,
$$OP = \sqrt{OU^2 + UP^2},$$
where U is the base of the perpendicular from P onto the x axis
(Fig. 1.2b). If we put OP = r, then
$$r = \sqrt{x^2 + y^2}. \tag{1.2}$$
Note that distances, such as OP and r, are always non-negative
numbers.
Similarly, for any two points $P_1 : (x_1, y_1)$ and $P_2 : (x_2, y_2)$ in the
plane, the distance $P_1P_2$ between them (see Fig. 1.2b) is given by
$$P_1P_2 = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}. \tag{1.3}$$
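
As a numerical illustration of (1.2) and (1.3), the short Python sketch below computes the distance of the point A : (1.2, 1) from the origin and the distance between A and B : (2.3, −2). The function name `distance` and the variable names are assumptions chosen for this example; they do not come from the text.

```python
import math

def distance(p1, p2):
    """Distance between p1 = (x1, y1) and p2 = (x2, y2), following (1.3)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

origin = (0.0, 0.0)
A = (1.2, 1.0)
B = (2.3, -2.0)

r_A = distance(origin, A)   # formula (1.2) with x = 1.2, y = 1
d_AB = distance(A, B)       # formula (1.3) for the points A and B

print(r_A, d_AB)
```

Taking one of the two points to be the origin reduces (1.3) to (1.2), which is why a single function serves for both.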

1.3 Straight lines and curves


If x and y are related by an equation, then this relation can be
represented by a curve or curves in the (x, y)-plane known as the
graph of the equation.

Example 1.1. Sketch the graph of $y = x^3$.

We decide over what interval of values of (say) x we wish to sketch the graph.
Let $-3 \le x \le 3$. Construct a table of (x, y) values as shown below:

x    -3   -2   -1    0    1    2    3
y   -27   -8   -1    0    1    8   27

We then plot the points corresponding to this set of coordinates and draw a
smooth curve through them, as shown in Fig. 1.3. The greater the number of
values of x in the interval, the greater is the reliability of the graph. It is assumed
that the curve has smooth or regular behaviour between consecutive plotted
points.
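
The table of values in Example 1.1 can be generated mechanically. The following is a minimal Python sketch, assuming integer values of x from −3 to 3 as in the example; the names `cube`, `xs`, and `ys` are illustrative only.

```python
def cube(x):
    """The function y = x^3 from Example 1.1."""
    return x ** 3

# Integer values of x in the interval -3 <= x <= 3.
xs = list(range(-3, 4))
ys = [cube(x) for x in xs]

# Print the two rows of the table.
print("x", *xs)
print("y", *ys)
```

Using a finer spacing for `xs` gives more points to plot and so, as the text notes, a more reliable sketch.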

Example 1.2. Find the equation of the straight line through the points A: (1, 2)
and B:(-1, -1).
The line is shown in Fig. 1.4. Let P: (x, y) be any point on the line. The triangles
ABR and PBQ are similar, so that

$\frac{PQ}{QB} = \frac{y + 1}{x + 1} = \frac{AR}{RB} = \frac{3}{2}$.

Therefore

$2(y + 1) = 3(x + 1)$,

or
$y = \tfrac{3}{2}x + \tfrac{1}{2}$.
This represents the equation of the straight line through the points (1,2) and
(-1,-1).

Any equation of the form


y = mx + c
has a straight line graph, and any straight line can be expressed in
this form unless it is parallel to the y axis, when its equation is
x = b.

The method of Example 1.2 can be used to find the equation of
a line passing through any two given points $A : (x_1, y_1)$ and $B : (x_2, y_2)$
(see Fig. 1.5). Let P : (x, y) be any point on the line. Then

$\frac{PQ}{AQ} = \frac{BR}{AR}$,

or

$\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}$. (1.4)

This equation can be rearranged in the form y = mx + c, where

$m = \frac{y_2 - y_1}{x_2 - x_1}, \quad c = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1}$. (1.5)

In the right-angled triangle ABR of Fig. 1.5, the angle $\alpha$ is given by

$\tan\alpha = \frac{y_2 - y_1}{x_2 - x_1}$, (1.6)

and this gives the standard measure of the slope or gradient of
the straight line. The line slopes upwards or downwards from left
to right according as $\tan\alpha$ is positive or negative respectively; the
slope is zero when $\alpha$ or $\tan\alpha$ is zero; and the larger the size (or
modulus - see (1.1)) of $\tan\alpha$, the steeper is the line. Also, according
to (1.5), if the equation is in the form y = mx + c, then

slope = $\tan\alpha = m$. (1.7)
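
The formulae (1.5) for m and c translate directly into a short calculation. The sketch below is illustrative only: the function name `line_through` is an assumption made here, and exact fractions are used so that the result for Example 1.2 can be read off without rounding.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Slope m and intercept c of the line through p1 and p2, from (1.5)."""
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2:
        # A line parallel to the y axis has the form x = b, not y = mx + c.
        raise ValueError("line is parallel to the y axis")
    m = Fraction(y2 - y1, x2 - x1)
    c = Fraction(x2 * y1 - x1 * y2, x2 - x1)
    return m, c

# The points A : (1, 2) and B : (-1, -1) of Example 1.2
m, c = line_through((1, 2), (-1, -1))
print(m, c)  # 3/2 1/2
```

The coordinates are assumed to be integers (or fractions) here; for decimal data the same formulae can be evaluated with ordinary floating-point division.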


Finally, if we require the line through $A : (x_1, y_1)$ with given slope
m, its equation is

$\frac{y - y_1}{x - x_1} = m$. (1.8)

The following result is often needed:

Perpendicular straight lines

The condition for the straight lines $y = m_1 x + c_1$ and
$y = m_2 x + c_2$ to be perpendicular is
$m_1 m_2 = -1$. (1.9)

This is proved as follows. First suppose that $m_1 m_2 = -1$; we have
to deduce that the lines are perpendicular. Translate the two lines,
so that their directions are unchanged, to meet at the origin O as
in Fig. 1.6. The angle of intersection is unaffected by translation, and
the equations become

line $L_1 : y = m_1 x$; line $L_2 : y = m_2 x$;

and $m_1 m_2 = -1$. Let A : (a, b) be any point on $L_1$. Then $b = m_1 a$
so that $m_1 = b/a$. Since $m_1 m_2 = -1$, we have $m_2 = -a/b$; therefore
the point B : (-b, a) lies on $L_2$, as shown. Since the sides containing
the right angles at P and Q have lengths a and b in each case, the
triangles OPA and BQO are congruent, and the angles POA and
QBO are corresponding angles. Put $\angle POA = \alpha$; then
$\angle POA = \angle QBO = \alpha$.
Therefore $\angle QOB = 90^\circ - \alpha$, and finally
$\angle AOB = 180^\circ - \alpha - (90^\circ - \alpha) = 90^\circ$,
as required. The converse result, that if $L_1$ and $L_2$ are at right angles,
then $m_1 m_2 = -1$, can be proved in a similar way.
A circle consists of all the points which are a constant distance
from a given point. In Fig. 1.7, a circle has radius r, and its centre
is at (a, b). The point P : (x, y) represents any point on the circle.
Equation (1.3) for the distance between two points gives
$\sqrt{(x - a)^2 + (y - b)^2} = r$.
Square this expression to get rid of the square root, and we have the
equation of a circle in its standard form:

Equation of a circle, centre (a, b) and radius r

$(x - a)^2 + (y - b)^2 = r^2$. (1.10)

Example 1.3. Find the centre and radius of the circle

$4x^2 + 4y^2 - 4x + 8y - 11 = 0$. (i)

To convert (i) to the form (1.10), rewrite it in the form

$x^2 - x + y^2 + 2y = \tfrac{11}{4}$. (ii)

Take the terms involving x and reorganize them:

$x^2 - x = (x - \tfrac{1}{2})^2 - \tfrac{1}{4}$

(this process is used in many different contexts and is called completing the
square). Treat the terms in y similarly:

$y^2 + 2y = (y + 1)^2 - 1$.

Replace the terms in (ii) by the new forms; we get

$(x - \tfrac{1}{2})^2 - \tfrac{1}{4} + (y + 1)^2 - 1 = \tfrac{11}{4}$,
or

$(x - \tfrac{1}{2})^2 + (y + 1)^2 = 4$.

Therefore the centre is at $(\tfrac{1}{2}, -1)$, and the radius is 2.
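
The completing-the-square steps of Example 1.3 can be sketched as a small routine for any equation of the form $Ax^2 + Ay^2 + Cx + Dy + E = 0$ with A not zero. The function name and argument order are assumptions made for this illustration.

```python
import math

def circle_centre_radius(A, C, D, E):
    """Centre and radius of A*x^2 + A*y^2 + C*x + D*y + E = 0 (A != 0)."""
    # Divide through by A, as in step (ii) of Example 1.3.
    c, d, e = C / A, D / A, E / A
    # x^2 + c*x = (x + c/2)^2 - (c/2)^2, and similarly for y.
    a, b = -c / 2, -d / 2
    r_squared = a * a + b * b - e
    if r_squared < 0:
        # For example x^2 + y^2 + 1 = 0 represents no points at all.
        raise ValueError("no real points satisfy the equation")
    return (a, b), math.sqrt(r_squared)

# Example 1.3: 4x^2 + 4y^2 - 4x + 8y - 11 = 0
print(circle_centre_radius(4, -4, 8, -11))  # ((0.5, -1.0), 2.0)
```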

Notice that (1.10) implies that, if we are given an equation
$Ax^2 + By^2 + Cx + Dy + E = 0$, it can only represent a circle if A = B.
(The equation might not represent anything, as with $x^2 + y^2 + 1 = 0$,
but if it does, it will be a circle.)
Figure 1.8 shows other important types of second-degree curve.

1.4 Functions
The area A of a circle depends on its radius r, and the dependence
is expressed in the formula $A = \pi r^2$. In general, suppose that the
values of a certain independent variable x, say, determine values
of a dependent variable y in such a way that if a numerical value
of x is given, a single value of y is determined. Then we say that y
is a function of x, and write for example,
y = f(x), y = g(x),
and so on, where the letters f, g, etc. can be used to distinguish
different forms of dependence, which can be thought of pictorially
in terms of different graphs. The letters f, g, and so on, standing
alone, need not be associated with a formula in the usual sense. They
can stand for any rule, programme, or calculation process which
produces a definite single value for y when we offer a number x to
it. A function can be thought of as an input-output device as in
Fig. 1.9.
Functions can also be defined implicitly by means of formulae.
For example
$x^2 + y^2 = 1$
represents a circle, centre the origin and radius 1. But if we sort out
y from the relation we obtain $y = \pm(1 - x^2)^{1/2}$, which is not a single
function, but two separate, single-valued, functions:
$y = (1 - x^2)^{1/2}$ and $y = -(1 - x^2)^{1/2}$,
representing the upper and lower semicircles respectively.

Fig. 1.8 (a) An ellipse $x^2/a^2 + y^2/b^2 = 1$. (b) A hyperbola $x^2/a^2 - y^2/b^2 = 1$. (c) A parabola $y^2 = x$.

Fig. 1.9 Input x, processor f, output y = f(x).

The following result is frequently required. Suppose that c is a
positive constant, and that we are given a function f, with graph
y = f(x). The graph y = f(x - c) is exactly the same as that of f(x),
except that it is moved, or translated, a distance c to the right along
the x axis. There is a similar result for f(x + c), the movement being
to the left. Therefore

Translation of y = f(x) along the x axis

Let c be a positive number, and f any function. Then
y = f(x - c) and y = f(x + c) (1.11)
represent translation of y = f(x) a distance c along
the x axis to the right and left respectively.

Thus $y = x^2$ and $y = (x + 2)^2$ have the same shape, but the second
is a distance 2 to the left of the first. In an expression such as f(x + c)
the part in the brackets is called the argument of f.
It is useful to have terms in which symmetry of a graph can be
described. For example, the graphs of $y = x^2$ and y = cos x are
symmetrical about the y axis; the two halves are reflections of each
other. Such functions are called even functions. On the other hand
$y = x^3$ and y = sin x are antisymmetrical about the y axis; they are
called odd functions (see also Section 15.9). The corresponding
algebraic properties are

Even and odd functions

(a) f(x) is even if f(-x) = f(x),
(b) f(x) is odd if f(-x) = -f(x). (1.12)

For example, in plotting $y = f(x) = x^3$ in Example 1.1, we did not
really have to calculate $x^3$ for negative values of x. All that was
necessary was to notice that $x^3$ is an odd function, $(-x)^3 = -(x)^3$,
and this gives the table for negative x by changing the sign of the
entries for x positive.
Some functions of practical significance have graphs that are not
entirely smooth. For example, we may wish to model a device that
is turned on at a given time, being quiescent before that time but
active afterwards. A sudden change in the state of the device can be
represented by a function that has a jump or discontinuity in its
graph at the critical moment. The basic building block for functions
with a jump is the unit step function H(t) (also known as the
Heaviside function after its inventor, and sometimes denoted by
U(t)) which we shall define, using t to represent time, by

$H(t) = \begin{cases} 0 & \text{when } t < 0, \\ 1 & \text{when } t \ge 0; \end{cases}$ (1.13)

(see Fig. 1.10a). If switch-on is required at $t = t_0$ then we can use

$H(t - t_0) = \begin{cases} 0 & \text{when } t < t_0, \\ 1 & \text{when } t \ge t_0, \end{cases}$

shown in Fig. 1.10b: it is the same graph translated to the right a
distance $t_0$, by (1.11).
Fig. 1.10 (a) Graph of H(t). (b) Graph of $H(t - t_0)$.

Example 1.4. Sketch the graph of f(t) = H(2 - t) + H(t + 1) - 1.
The function f(t) is a combination of unit functions each of which has a
discontinuity where its argument is zero. Thus f(t) has discontinuities at t = -1
(from H(t + 1)) and at t = 2 (from H(2 - t)). Note also that

$H(t + 1) = \begin{cases} 0 & \text{when } t < -1, \\ 1 & \text{when } t \ge -1, \end{cases} \qquad H(2 - t) = \begin{cases} 0 & \text{when } t > 2, \\ 1 & \text{when } t \le 2. \end{cases}$

Hence for t < -1, f(t) = 1 + 0 - 1 = 0. For $-1 \le t \le 2$, f(t) = 1 + 1 - 1 = 1.
For t > 2, f(t) = 0 + 1 - 1 = 0. The graph is shown in Fig. 1.11.
This function would switch a device on at t = -1 and switch it off at t = 2.

Fig. 1.11
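
The unit step function and the combination in Example 1.4 are easy to tabulate. This is a minimal sketch which assumes the convention H(0) = 1 used in (1.13); the sample values of t are chosen here for illustration.

```python
def H(t):
    """Unit step (Heaviside) function with H(0) = 1, as in (1.13)."""
    return 1 if t >= 0 else 0

def f(t):
    """The function of Example 1.4."""
    return H(2 - t) + H(t + 1) - 1

# Sample points on either side of the jumps at t = -1 and t = 2
for t in (-2, -1, 0, 2, 3):
    print(t, f(t))
```

The printed values are 0 before t = -1, 1 for t between -1 and 2 inclusive, and 0 afterwards, matching the sketch of Fig. 1.11.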
The odd function (Fig. 1.12a) denoted by sgn, defined by

$\operatorname{sgn} t = H(t) - H(-t) = \begin{cases} -1 & \text{when } t < 0, \\ 0 & \text{when } t = 0, \\ 1 & \text{when } t > 0; \end{cases}$

is called the signum function ('signum' means 'sign'). H(t) and
sgn t can be used along with other functions to produce a variety
of functions having discontinuities in either value or direction at
assigned points. Figure 1.12b,c shows the even functions y = t sgn(t)
and $y = \operatorname{sgn}(1 - t^2)$; the latter has discontinuities where
$1 - t^2 = 0$, that is, at $t = \pm 1$.

Fig. 1.12 (a) y = sgn(t). (b) y = t sgn(t). (c) $y = \operatorname{sgn}(1 - t^2)$.

1.5
For
still
360
desirable. The absolute unit is the radian, which represents about
57°. The special property which makes the unit valuable is its
connection with length. Figure 1.13 shows a circle of radius R with
a sector AOB containing an angle $\theta$. The length of the arc AB is
obviously proportional to R, and it is proportional to $\theta$ whatever
the angular units, so it is proportional to the product $R\theta$. One radian
is the unit of angle such that AB is numerically equal to $R\theta$.
Then if $\theta$ is measured in radians

$AB = R\theta$.

But we know that, for the whole circle,

$AB = 2\pi R$.

Therefore the radian measure of angle for the whole circle must be

$\theta = 2\pi$.

Since the whole circle measures 360 in degrees, it must be that

$360^\circ = 2\pi$ radians,
or 1 radian = $180/\pi$ degrees (57.29578...°).

The following table summarizes some useful information:

(a) $\alpha$ degrees = $\frac{\pi}{180}\alpha$ radians,
    $\beta$ radians = $\frac{180}{\pi}\beta$ degrees,
    $360^\circ = 2\pi$ rad, $180^\circ = \pi$ rad, $90^\circ = \tfrac{1}{2}\pi$ rad,
    $45^\circ = \tfrac{1}{4}\pi$ rad, $60^\circ = \tfrac{1}{3}\pi$ rad, $30^\circ = \tfrac{1}{6}\pi$ rad.
(b) The arc-length on a circle of radius R subtended
    by $\theta$ radians is $R\theta$. (1.14)
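
The conversions in (1.14) can be sketched in a few lines. The function names are illustrative assumptions; Python's standard `math.radians` and `math.degrees` perform the same conversions.

```python
import math

def deg_to_rad(alpha):
    """alpha degrees expressed in radians, as in (1.14a)."""
    return alpha * math.pi / 180

def rad_to_deg(beta):
    """beta radians expressed in degrees, as in (1.14a)."""
    return beta * 180 / math.pi

def arc_length(R, theta):
    """Arc length on a circle of radius R subtended by theta radians (1.14b)."""
    return R * theta

print(rad_to_deg(1))                   # one radian in degrees
print(deg_to_rad(60), math.pi / 3)     # 60 degrees in radians
print(arc_length(2.0, deg_to_rad(90))) # quarter circle of radius 2
```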

1.6 Trigonometric functions


We assume that the reader knows the meaning of sine, cosine, and
tangent in ordinary trigonometry. We shall extend the meaning to
incorporate angles larger than 90°, and negative angles.
In Fig. 1.14, consider any point P : (x, y), with

$\angle AOP = \theta$, $OP = r > 0$.

The angle $\theta$ is to be measured from the positive x axis, in the
anticlockwise direction if $\theta$ is positive, and in a clockwise direction
if $\theta$ is negative. The angle $\theta$ may take any value, however large:
positive, negative, or zero. The quantity r is always positive or zero.
Then the trigonometric functions are defined as follows for positive r:

Trigonometric functions for any angle

$\cos\theta = \frac{x}{r}, \quad \sin\theta = \frac{y}{r}, \quad \tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{y}{x}$
(where r > 0). (1.15)
$\sec\theta = 1/\cos\theta, \quad \operatorname{cosec}\theta = 1/\sin\theta, \quad \cot\theta = 1/\tan\theta$.

If r and $\theta$ are specified, they locate a point P in the plane just as
well as x and y do. They therefore constitute an alternative pair of
coordinates, called polar coordinates.
If on the other hand a point P is specified, as in Fig. 1.14, there
is a definite value of r associated with it, but there are many values
of $\theta$ that will do. In Fig. 1.14a, for example, the value of $\theta$ shown is
the one we are likely to adopt. But we are brought to the same point
P if we take the angular coordinate to be (in radians)

$\theta \pm 2\pi$, $\theta \pm 4\pi$, etc.,

because every increase or decrease of $2\pi$ radians merely represents a
complete revolution. This ambiguity will be dealt with in the book
as it is encountered. From Fig. 1.14:

Polar and cartesian coordinates

$x = r\cos\theta, \quad y = r\sin\theta$;
$r = (x^2 + y^2)^{1/2}, \quad \tan\theta = y/x$. (1.16)
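
A short sketch of the conversions in (1.16) follows. Note that tan θ = y/x alone does not fix the quadrant of θ; the sketch assumes the common convention of taking θ in the range -π < θ ≤ π, which is what the standard library function `math.atan2` returns. The function names are illustrative.

```python
import math

def to_polar(x, y):
    """Polar coordinates (r, theta) of the point (x, y), theta in (-pi, pi]."""
    return math.hypot(x, y), math.atan2(y, x)

def to_cartesian(r, theta):
    """Cartesian coordinates of the point with polar coordinates (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

r, theta = to_polar(-1.0, -1.0)
print(r, theta)                               # third-quadrant point
print(to_cartesian(r, theta))                 # back to about (-1, -1)
print(to_cartesian(r, theta + 2 * math.pi))   # same point after a full turn
```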

Example 1.5. Obtain (a) $\sin\tfrac{1}{3}\pi$, (b) $\tan\tfrac{1}{6}\pi$, (c) $\cos\tfrac{1}{4}\pi$. (The angles are in
radians.)
From (1.14) we obtain the values of the angles in degrees. Use the familiar
triangles in Fig. 1.15.
(a) $\sin\tfrac{1}{3}\pi = \sin 60^\circ = \sqrt{3}/2$. (b) $\tan\tfrac{1}{6}\pi = \tan 30^\circ = 1/\sqrt{3}$. (c) $\cos\tfrac{1}{4}\pi = \cos 45^\circ = 1/\sqrt{2}$.

Fig. 1.15 (a) $\sin 45^\circ = \cos 45^\circ = 1/\sqrt{2}$. (b) $\sin 60^\circ = \cos 30^\circ = \sqrt{3}/2$; $\cos 60^\circ = \sin 30^\circ = 1/2$.

Example 1.6. Obtain (a) $\cos 2\pi$, (b) $\sin\tfrac{3}{2}\pi$, (c) $\sin(-\tfrac{3}{4}\pi)$.
From Fig. 1.16, and eqn (1.15):
(a) P is (r, 0), so $\cos 2\pi = x/r = r/r = 1$;
(b) P is (0, -r), so $\sin\tfrac{3}{2}\pi = y/r = -1$;
(c) P is $(-r/\sqrt{2}, -r/\sqrt{2})$, so $\sin(-\tfrac{3}{4}\pi) = y/r = -1/\sqrt{2}$.

The graphs y = cos x and y = sin x are shown in Fig. 1.17. The
following points should be noticed.

1. y = cos x and y = sin x have identical shape, but are displaced a
distance $\tfrac{1}{2}\pi$ (radians) from each other: thus (see (1.11))

$\sin x = \cos(x - \tfrac{1}{2}\pi), \quad \cos x = \sin(x + \tfrac{1}{2}\pi)$. (1.17)

2. The curves repeat themselves after every interval of length $2\pi$.
This is evident from the definition (1.15), because $2\pi$ radians is a
complete revolution in Fig. 1.14, and therefore delivers the same
point P : (x, y). The functions cos x and sin x are therefore called
periodic functions with period (or wavelength) equal to $2\pi$ (see
also Section 20.1).
3. y = cos x is symmetrical (or even) about the y axis; y = sin x is
antisymmetrical (or odd) about the y axis.
4. They swing between $\pm 1$ in every complete period of length $2\pi$.

There are many trigonometric identities in common use; the
following are the most important (a more extensive list can be found
in Appendix B).
Trigonometric identities

(a) Sums of angles:
$\sin(A \pm B) = \sin A\cos B \pm \cos A\sin B$,
$\cos(A \pm B) = \cos A\cos B \mp \sin A\sin B$,
$\tan(A \pm B) = \frac{\tan A \pm \tan B}{1 \mp \tan A\tan B}$.
(b) Products as sums:
$\cos A\cos B = \tfrac{1}{2}[\cos(A + B) + \cos(A - B)]$;
$\cos A\sin B = \tfrac{1}{2}[\sin(A + B) - \sin(A - B)]$;
$\sin A\sin B = \tfrac{1}{2}[-\cos(A + B) + \cos(A - B)]$.
(c) $\cos^2 A = \tfrac{1}{2}(1 + \cos 2A)$;
$\sin^2 A = \tfrac{1}{2}(1 - \cos 2A)$. (1.18)

Graphs of tan x, cot x, sec x, and csc x are shown in Fig. 1.18.

Fig. 1.18 (a) y = tan x. (b) y = cot x. (c) y = sec x. (d) y = csc x.

A function of the form
$c\cos(\omega x + \phi)$,
where c, $\omega$ (Greek omega), and $\phi$ (Greek phi) are constants, is called
a harmonic function or a sinusoidal function. It includes sine
functions, by virtue of (1.17). Any function of the form
$a\cos\omega x + b\sin\omega x$
can be put into this form. Numbers c and $\phi$ can be found graphically
by plotting the point P : (a, b) in cartesian coordinates as in Fig.
1.19. Let $c = (a^2 + b^2)^{1/2} > 0$ be its distance from O, and choose $\phi$ as
minus the polar angle $\theta$: $\phi = -\theta$. Then
$a\cos\omega x + b\sin\omega x = c\cos(-\phi)\cos\omega x + c\sin(-\phi)\sin\omega x$
$= c\cos\phi\cos\omega x - c\sin\phi\sin\omega x$
$= c\cos(\omega x + \phi)$,
after using (1.18a). Therefore we have the following result.
An important identity
$a\cos\omega x + b\sin\omega x = c\cos(\omega x + \phi)$,
where the point (a, b) has polar coordinates r, $\theta$, and
c = r (> 0) and $\phi = -\theta$.
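
The identity can be checked numerically. In this sketch the amplitude c and phase angle φ are obtained from the polar coordinates of (a, b); the function name `amplitude_phase` and the sample values of a, b and ω are assumptions made for illustration.

```python
import math

def amplitude_phase(a, b):
    """Return (c, phi) with a*cos(wx) + b*sin(wx) = c*cos(wx + phi)."""
    c = math.hypot(a, b)       # c = r, the distance of (a, b) from the origin
    phi = -math.atan2(b, a)    # phi is minus the polar angle of (a, b)
    return c, phi

a, b, omega = 3.0, 4.0, 2.0
c, phi = amplitude_phase(a, b)
print(c, phi)

# Compare the two sides of the identity at a few values of x
for x in (0.0, 0.7, 2.0, -1.3):
    left = a * math.cos(omega * x) + b * math.sin(omega * x)
    right = c * math.cos(omega * x + phi)
    print(x, left, right)
```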

A typical graph of $y = c\cos(\omega x + \phi)$ is shown in Fig. 1.20. The
wavelength, or the period if the variable stands for time, is $2\pi/\omega$.
The positive number c is called the amplitude, $\omega$ is the angular
frequency in radians per second. The frequency in cycles per
second is $\omega/2\pi$, and $\phi$ is called the phase angle. More detail is
given in Chapters 18 and 19.
A general function f is said to be periodic with period p if

$f(x + p) = f(x)$

for every value of x. It then repeats itself in any interval of length
p. If p is a period, so obviously are 2p, 3p, and so on. The period
usually meant when we say a function has period p is the smallest
period. Thus in Fig. 1.18, tan x and cot x have period $\pi$.

1.7 Inverse functions


Figure 1.21a shows the graph of $y = x^3$, and x and y have the same
scales. Choose a number a. To find $a^3$, locate x = a at A and follow
the track ABC. Naturally, the point C on the y axis represents $a^3$.
Now suppose we have a number b and want to find $b^{1/3}$. Read the
graph backwards: find y = b at U on the y axis and follow the track
UVW. Then W represents $b^{1/3}$. Therefore exactly the same curve can
be used to find both cubes and cube roots of numbers:

$x = y^{1/3}$ is the same curve as $y = x^3$.

In order to find cube roots, we might prefer to have a separate
graph which can be read off in the ordinary way: from the x axis
to the y axis. We can do this by changing round the labels x and y
as in Fig. 1.21b. To turn it into a standard graph, with x horizontal
and y vertical, simply flip the figure over, around a diagonal axis as
shown in Fig. 1.21c: the appropriate axis is the 45° line y = x, since
the x and y scales are the same. The resulting graph is the graph of
the inverse function of $y = x^3$, namely $y = x^{1/3}$. In the end we have
simply reflected the graph in the line y = x to obtain the graph
of the inverse, as shown in Fig. 1.21c. (By the same token, the inverse
function for $y = x^{1/3}$ is $y = x^3$.)
Back to Fig. 1.21a, follow the sequence of operations indicated
by the path ABCBA from the point A : (a, 0). In following ABC we
find $a^3$ at C, then on CBA we find its cube root, and are back to a
again; algebraically

$(a^3)^{1/3} = a$;

and similarly

$(b^{1/3})^3 = b$

by considering the path UVWVU.


These two properties are perfectly general for functions and their
inverses: when applied successively, a function f and its inverse $f_1$
neutralize each other and return the original value. This is the
inverse function property:

$f_1(f(x)) = x, \quad f(f_1(x)) = x$. (1.20)


We can consider other powers and their inverses, such as $x^2$ and
$x^{1/2}$. However, no negative number has a real square root, and it is
only exceptionally true that negative numbers permit fractional
powers. In Fig. 1.22 we show the general character of positive
whole-number powers $x^n$ and their inverses $x^{1/n}$ for $0 \le x \le 1$.
Notice the symmetry of the inverse pairs $(x^n, x^{1/n})$ about the 45° line
$y = x^1 = x$. (The graphs of fractional powers lie between
the whole-number graphs in a regular way.)
Now consider the inverse problem for the function

y = sin x.

The inverse function would answer the question: what angle has its
sine equal to x? Evidently $-1 \le x \le 1$, or there is no such angle.
Moreover if $-1 \le x \le 1$, then there is an infinite number of angles,
not just one (for example, $\sin\tfrac{1}{6}\pi = \tfrac{1}{2}$, but $\sin\tfrac{13}{6}\pi$ and $\sin\tfrac{5}{6}\pi$ also
equal $\tfrac{1}{2}$). The universal convention in use for tables, computers and
calculators is to offer only the smallest angle in magnitude, positive
or negative. This is done by restricting the angle to the range $-\tfrac{1}{2}\pi$
to $\tfrac{1}{2}\pi$ (in radians). The resulting function inverse to y = sin x is
denoted by arcsin (the notation $\sin^{-1}$ is also widely used):

y = arcsin x, y restricted to $-\tfrac{1}{2}\pi \le y \le \tfrac{1}{2}\pi$.

Its graph is the reflection of y = sin x in the line y = x as shown in
Fig. 1.23a (the x and y scales are the same). Also displayed are the
inverse functions for the cosine and tangent, denoted by arccos x
and arctan x respectively. The angular range is restricted in a similar
way (with a different range for cos x) so as to make the inverse
functions single-valued.

Fig. 1.23 (a) y = arcsin x, $-1 \le x \le 1$, $-\tfrac{1}{2}\pi \le y \le \tfrac{1}{2}\pi$. (b) y = arccos x, $-1 \le x \le 1$, $0 \le y \le \pi$. (c) y = arctan x, any x, $-\tfrac{1}{2}\pi < y < \tfrac{1}{2}\pi$.
The various inverse functions are connected, as in the following
example.

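Example 1.7. Simplify (a) cos(arctan x), (b) arcsin(cos x).

(a) In the triangle of Fig. 1.24a, $\tan\theta = x$. Therefore $\theta = \arctan x$. The
hypotenuse has length $\sqrt{1 + x^2}$, so that $\cos\theta = 1/\sqrt{1 + x^2}$.
(b) In the triangle in Fig. 1.24b

$\cos x = \sin(\tfrac{1}{2}\pi - x)$,

so that

$\arcsin(\cos x) = \tfrac{1}{2}\pi - x$,

provided that $0 \le x \le \pi$, so that $\tfrac{1}{2}\pi - x$ lies in the restricted range of arcsin.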
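
Both simplifications in Example 1.7 can be checked numerically with the standard library inverse functions, which follow the restricted ranges described above. The sample values are chosen here for illustration; the last loop shows that result (b) holds only while $\tfrac{1}{2}\pi - x$ stays within the range of arcsin.

```python
import math

def cos_arctan(x):
    """Closed form for cos(arctan x) from Example 1.7(a)."""
    return 1 / math.sqrt(1 + x * x)

# (a) holds for any x, positive or negative
for x in (0.0, 0.5, 2.0, -3.0):
    print(x, math.cos(math.atan(x)), cos_arctan(x))

# (b) holds when 0 <= x <= pi
for x in (0.0, 1.0, 2.5, math.pi):
    print(x, math.asin(math.cos(x)), math.pi / 2 - x)

# Outside that interval the two sides differ, e.g. at x = 4
print(math.asin(math.cos(4.0)), math.pi / 2 - 4.0)
```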

1.8 Exponential functions


A class of functions not so far considered are of the form

$y = a^x$, a > 0

(we must have a > 0 since, if a is negative, the power $a^x$ only
exists for occasional values of x: consider the case $a^{1/2}$). Several
instances are shown in Fig. 1.25. If a > 1, then $a^x \to \infty$ as $x \to \infty$,
and $a^x \to 0$ as $x \to -\infty$. All pass through the point (0, 1). (Cases
where a < 1 can be considered by putting, for example, $y = (\tfrac{1}{5})^x = 1/5^x$.)
The number a is called the base for the exponential function
$a^x$. Functions with different bases are all closely related. For
example, $4^x = 2^{2x}$, and it can be confirmed by using a calculator
that, to four decimals, $2^x = 3.5^{0.5533x}$, and so on. In Example 1.8,
we show that all the curves $y = a^x$ amount to the same curve
plotted with a different x scale, so we really need only one base
to describe them all. The particular base chosen is denoted by the
letter e, which occurs in mathematics as often as does $\pi$.
The numerical value of e can be obtained by requiring that
the rate of growth of $e^x$ at any value of x is equal to $e^x$. Choose
any value of x (see Fig. 1.26), and another value x + h very close
to it. The rate of growth of $e^x$ at P is nearly equal to NQ/PN and

this becomes more accurate the smaller the value of h. Thus

$\frac{NQ}{PN} = \frac{e^{x+h} - e^x}{h} \approx e^x$.

Since $e^{x+h} = e^x e^h$, we can cancel $e^x$ and obtain the condition

$\frac{e^h - 1}{h} \approx 1$,

which is equivalent to

$e \approx (1 + h)^{1/h}$,

the approximation becoming more accurate, the smaller h is. The
following table, which was calculated and can be extended on a hand
calculator, shows how the value of e is approached as h is taken
smaller and smaller:

h                          0.1         0.01        0.001       0.0001
$(1 + h)^{1/h}$ (≈ e)      2.5937...   2.7048...   2.7169...   2.7181...

To seven decimal places, the value of e is given by

e = 2.7182818....
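
The table can be reproduced, and extended to smaller h, with a few lines of code. This is a sketch only; the name `e_estimate` is chosen here, and for very small h floating-point rounding eventually limits the accuracy of the estimate.

```python
import math

def e_estimate(h):
    """The approximation (1 + h)**(1/h) to e."""
    return (1 + h) ** (1 / h)

for h in (0.1, 0.01, 0.001, 0.0001, 0.00001):
    estimate = e_estimate(h)
    # Show the estimate and how far it is from the library value of e
    print(h, estimate, math.e - estimate)
```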

The function has a growth rate at x = 0 of $e^0 = 1$; that is to say the
graph of $y = e^x$ cuts the y axis at 45°, provided that x and y have
the same scale.
The function

$y = e^x$ or $y = \exp(x)$

is the basic exponential function. Its rate of increase is very rapid,
as indicated by the following table (to 2 significant figures).

x        -9             -6             -3             0    3             6             9
$e^x$    1.2 × 10^-4    2.5 × 10^-3    5.0 × 10^-2    1    2.0 × 10^1    4.0 × 10^2    8.1 × 10^3

When x increases by a step of 3, $e^x$ is multiplied by a factor of about
20.

1.9 The logarithmic function


The inverse function corresponding to the exponential function
$y = e^x$ is called the logarithm of x (historically the 'natural'
logarithm):

y = ln x (or sometimes $\log_e x$),

read as 'log of x'. It answers the typical question $e^? = x$, or
alternatively it solves the equation for y:

$e^y = x$.

For example, the equation

$e^y = 3$

has the solution

y = ln 3.

It can be confirmed, by using a scientific calculator, that ln 3 = 1.098...
and $e^{1.098} \approx 3$. The graphs $y = e^x$ and y = ln x are shown in Fig.
1.27, but with different x and y scales. Note that there exists no
logarithm of a negative number. The logarithm has the following
properties.

$e^{\ln x} = x$ if x > 0, $\ln e^x = x$ for any x (1.21)

(since ln x and $e^x$ are inverse functions).

ln 1 = 0, ln e = 1; (1.22)

(since $e^0 = 1$ and $e^1 = e$).

From (1.21), $e^{\ln a} = a$, $e^{\ln b} = b$, and $e^{\ln ab} = ab$. Therefore

$ab = e^{\ln ab} = e^{\ln a} e^{\ln b} = e^{\ln a + \ln b}$,

from which it follows that

$\ln ab = \ln a + \ln b, \quad \ln\frac{a}{b} = \ln a - \ln b$. (1.23)

(To obtain the second result, use the result (a/b)b = a.)
From (1.21), $e^{\ln a} = a$; so, if m is any number, then
$a^m = e^{\ln(a^m)}$.

But also $a^m = (e^{\ln a})^m = e^{m\ln a}$. Therefore $e^{\ln(a^m)} = e^{m\ln a}$, or

$\ln a^m = m\ln a$. (1.24)

Example 1.8. Prove that if a is any positive number, then $a^x = e^{x\ln a}$.

(This result shows that the curves $y = 2^x$, $y = 3^x$, etc. become identical if the
scale of the x axis is changed by a factor ln 2, ln 3, etc.)
$a^x = (e^{\ln a})^x = e^{x\ln a}$,

by the ordinary rule for powers.

Example 1.9. Find x in the equation $2^{3x} = 2(3^x)$.

If x is the solution then it is also true that
$\ln 2^{3x} = \ln 2(3^x)$,
or
$3x\ln 2 = \ln 2 + x\ln 3$
(using (1.24) and (1.23)), or
$x = \ln 2/(3\ln 2 - \ln 3)$.
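
The solution of Example 1.9 can be evaluated and substituted back into the original equation as a check. A minimal sketch:

```python
import math

# Solution of 2**(3x) = 2 * 3**x from Example 1.9
x = math.log(2) / (3 * math.log(2) - math.log(3))
print(x)

# Both sides of the original equation should agree at this x
left = 2 ** (3 * x)
right = 2 * 3 ** x
print(left, right)
```

Here `math.log` is the natural logarithm ln.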

Example 1.10. Find y in terms of x, if ln y = 3 ln x + 2.

If y is the right function of x, then it is also true that
$y = e^{\ln y} = e^{3\ln x + 2}$,
or
$y = e^{3\ln x} e^2 = (e^{\ln x})^3 e^2$,
or
$y = e^2 x^3$.

There exist logarithms using bases other than e, which have occasional
use. Their properties are analogous to those involving e.

1.10 Exponential growth and decay


Here we shall use t (for time) in place of x, and consider the
function
$y = A e^{ct}$,
where A and c are constants. These include functions such as $2^t$
which, by Example 1.8, can be expressed in the form $e^{ct}$ since
$2^t = e^{t\ln 2} \approx e^{0.6931t}$.

If c > 0, then y is said to have exponential growth. To obtain
an idea of what this entails we consider the doubling period of
$y = A e^{ct}$ (c > 0). Choose any moment of time $t_0$. At some later time
$t_0 + T$, y will have doubled, that is to say,

$A e^{c(t_0 + T)} = 2A e^{ct_0}$, or $A e^{ct_0} e^{cT} = 2A e^{ct_0}$.

After cancelling $A e^{ct_0}$, we have an equation for T:

$e^{cT} = 2$.

Therefore cT = ln 2, or

$T = \frac{1}{c}\ln 2$.

Exponential doubling principle

$y = A e^{ct}$ doubles in every interval of length $T = \frac{1}{c}\ln 2$. (1.25)
Example 1.11. The number N of scientists and engineers in the U.S.A. doubled
every 10 years between 1900 and 1935, and in 1935 they numbered about $1.5 \times 10^5$.
This suggests exponential growth $N = A e^{ct}$. Find c, and predict the number N for
1990 on the assumption that the trend continued.
Suppose that we count 1900 as t = 0. The doubling period is 10 years; so
$N = A e^{ct}$, where, by (1.25),

$c = \tfrac{1}{10}\ln 2 = 0.0693$.
Thus

$N = A e^{0.0693t}$.

In 1935, where t = 35 (years), $N = 1.5 \times 10^5$, so that

$1.5 \times 10^5 = A e^{0.0693 \times 35}$,

or A = 13265.

Therefore

$N = 13265\, e^{0.0693t}$.

In 1990 t = 90, from which it follows that

$N = 6.8 \times 10^6$.
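
The arithmetic of Example 1.11 can be reproduced as follows. This sketch keeps c unrounded, so the value of A differs slightly from the 13265 obtained in the text with c rounded to 0.0693; the prediction for 1990 agrees to two significant figures. The function name `doubling_time` is an illustrative assumption.

```python
import math

def doubling_time(c):
    """Doubling period T = (1/c) ln 2 of y = A e^{ct}, from (1.25)."""
    return math.log(2) / c

c = math.log(2) / 10          # doubling period of 10 years
A = 1.5e5 / math.exp(c * 35)  # fit N = 1.5e5 at t = 35 (the year 1935)
N_1990 = A * math.exp(c * 90) # extrapolate to t = 90 (the year 1990)

print(c)                 # about 0.0693
print(doubling_time(c))  # recovers the 10-year doubling period
print(A)                 # roughly 1.3e4
print(N_1990)            # roughly 6.8e6
```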

Exponential growth occurs when a quantity increases at a rate
proportional to the amount already accumulated. In the short term,
animal populations, epidemics, and explosions have this characteristic.
Exponential decay may also occur. If c is a positive number
and

$y = A e^{-ct} = A/e^{ct}$,

then y halves itself in every interval of length (1/c) ln 2. This occurs
in radioactive decay, the period being called the half-life period of
a radioactive substance. The half-life period provides a convenient,
memorable measure of the time it would take for the substance to
become less harmful.

1.11 Hyperbolic functions


It is often convenient to represent certain combinations of exponential
functions by separate functions. The hyperbolic cosine and
hyperbolic sine functions, denoted by cosh and sinh respectively,
are defined by the following formulae.

Hyperbolic functions

$\cosh x = \tfrac{1}{2}(e^x + e^{-x}), \quad \sinh x = \tfrac{1}{2}(e^x - e^{-x})$. (1.26)

Since

$\cosh(-x) = \tfrac{1}{2}(e^{-x} + e^{-(-x)}) = \tfrac{1}{2}(e^{-x} + e^x) = \cosh x$,

it follows that the graph of cosh x is symmetrical about the y axis.
By a similar argument, it can be shown that sinh x is antisymmetrical
about the y axis. Graphs of the two functions are shown in Fig. 1.28.
From the definitions (1.26)

$\cosh x + \sinh x = e^x, \quad \cosh x - \sinh x = e^{-x}$.

The remaining hyperbolic functions are defined in a similar
manner to their trigonometric counterparts. Thus

$\tanh x = \frac{\sinh x}{\cosh x}, \quad \coth x = \frac{\cosh x}{\sinh x}$,
$\operatorname{sech} x = \frac{1}{\cosh x}, \quad \operatorname{cosech} x = \frac{1}{\sinh x}$. (1.27)

Fig. 1.28 Graphs of the hyperbolic functions cosh x and sinh x.

Graphs of tanh x, coth x, sech x, and cosech x are shown in Fig. 1.29.
From the definitions, a number of identities follow which parallel
those for trigonometric functions but with important sign differences.
Some are derived below.

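(a) $\cosh^2 x + \sinh^2 x = \cosh 2x$,
(b) $\cosh^2 x - \sinh^2 x = 1$. (1.28)

For (a):

$\cosh^2 x + \sinh^2 x = \tfrac{1}{4}(e^{2x} + 2 + e^{-2x}) + \tfrac{1}{4}(e^{2x} - 2 + e^{-2x})$
$= \tfrac{1}{2}(e^{2x} + e^{-2x}) = \cosh 2x$.

For (b):

$\cosh^2 x - \sinh^2 x = \tfrac{1}{4}(e^{2x} + 2 + e^{-2x}) - \tfrac{1}{4}(e^{2x} - 2 + e^{-2x}) = 1$.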

To obtain the identity

$\sinh(x_1 + x_2) = \sinh x_1\cosh x_2 + \cosh x_1\sinh x_2$,

start with the right-hand side:

$\sinh x_1\cosh x_2 + \cosh x_1\sinh x_2$
$= \tfrac{1}{2}(e^{x_1} - e^{-x_1})\tfrac{1}{2}(e^{x_2} + e^{-x_2}) + \tfrac{1}{2}(e^{x_1} + e^{-x_1})\tfrac{1}{2}(e^{x_2} - e^{-x_2})$
$= \tfrac{1}{4}(e^{x_1+x_2} - e^{-x_1+x_2} + e^{x_1-x_2} - e^{-x_1-x_2})$
$\quad + \tfrac{1}{4}(e^{x_1+x_2} + e^{-x_1+x_2} - e^{x_1-x_2} - e^{-x_1-x_2})$
$= \tfrac{1}{2}(e^{x_1+x_2} - e^{-x_1-x_2}) = \sinh(x_1 + x_2)$.

To sum up similar identities:

$\sinh(x_1 \pm x_2) = \sinh x_1\cosh x_2 \pm \cosh x_1\sinh x_2$,
$\cosh(x_1 \pm x_2) = \cosh x_1\cosh x_2 \pm \sinh x_1\sinh x_2$,
$\tanh(x_1 \pm x_2) = \frac{\tanh x_1 \pm \tanh x_2}{1 \pm \tanh x_1\tanh x_2}$. (1.29)

The inverse hyperbolic functions are defined as follows:

$y = \sinh^{-1} x$ (for all x),
$y = \cosh^{-1} x$ (for $x \ge 1$),
$y = \tanh^{-1} x$ (for -1 < x < 1). (1.30)

Traditionally the index -1 is used to symbolize the inverses in
(1.30). Do not mistake it as meaning a reciprocal: $\sinh^{-1} x$ does
not mean 1/sinh x.
Inverse hyperbolic functions have alternative representations in
terms of logarithms. Since

$x = \cosh y = \tfrac{1}{2}(e^y + e^{-y})$,

it follows that y satisfies

$e^{2y} - 2x e^y + 1 = 0$,

which is a quadratic equation in $e^y$. Its solutions are

$e^y = \tfrac{1}{2}[2x \pm \sqrt{4x^2 - 4}] = x \pm \sqrt{x^2 - 1}$.

Since, by definition, $y \ge 0$, we must discount the negative sign.
Hence

$y = \cosh^{-1} x = \ln[x + \sqrt{x^2 - 1}]$.

In a similar way, it can be shown that

$\sinh^{-1} x = \ln[x + \sqrt{x^2 + 1}]$.
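
The logarithmic forms can be compared with the inverse hyperbolic functions supplied by the standard library (`math.acosh` and `math.asinh`). The helper names below are illustrative assumptions.

```python
import math

def arcosh_log(x):
    """cosh^-1 x for x >= 1, via ln[x + sqrt(x^2 - 1)]."""
    return math.log(x + math.sqrt(x * x - 1))

def arsinh_log(x):
    """sinh^-1 x for any x, via ln[x + sqrt(x^2 + 1)]."""
    return math.log(x + math.sqrt(x * x + 1))

for x in (1.0, 2.0, 10.0):
    print(x, arcosh_log(x), math.acosh(x))

for x in (-3.0, 0.0, 0.5):
    print(x, arsinh_log(x), math.asinh(x))
```

Each pair of printed values should agree to within rounding error.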

1.12 Partial fractions


We shall first make a distinction between an equation and an
identity. The word 'equation' has many uses, but for the present
we shall think of an equation as something like

$x^2 + 2 = -3x$,

since it is true only for certain particular values of x, namely -1
and -2. On the other hand,

$x^2 + 3x + 2 = (x + 1)(x + 2)$

is an identity, meaning that it is true automatically, or for all values
of x. Just for the purposes of this section we shall write $\equiv$ instead
of = when we want to draw attention to an identity.
It is easy to test the truth of the following identities by adding up
the fractions on the right.

(i) $\frac{1}{x^2 - 1} \equiv \frac{1}{2}\,\frac{1}{x - 1} - \frac{1}{2}\,\frac{1}{x + 1}$,

(ii) $\frac{x}{4x^2 - 1} \equiv \frac{1}{4}\,\frac{1}{2x - 1} + \frac{1}{4}\,\frac{1}{2x + 1}$,

(iii) $\frac{3x + 2}{x^2(x + 1)} \equiv \frac{1}{x} + \frac{2}{x^2} - \frac{1}{x + 1}$.

The terms on the right are individually simpler than the function
on the left. This break-up into simpler constituents is useful for many
purposes. In this section, we show how to break up a complicated
function into simpler terms of the type above.
A rational function is a function which takes the form

f(x) = P(x)/Q(x),

where P(x) and Q(x) are polynomials. For example,
$1/[x^2(3x - 2)]$ and $(2x^3 + 1)/(x - 1)^2$ are rational functions, but
$x^{1/2}/(x + 1)$ and $(\cos x)/(x + 1)$ are not rational functions. We shall
be concerned only with rational functions, and initially we shall
suppose that

degree of P(x) < degree of Q(x),

something like proper fractions in arithmetic. Such functions can be
broken up into partial fractions, like the examples at the beginning.
No proofs will be given here, but the reader should learn the
techniques.
It is the denominator Q(x) which determines what form
the constituent partial fractions will take. Suppose that the denominator
is broken up into factors as far as possible. For example,

$2x^4 + x^3 - 4x^2 + x - 6 = (2x - 3)(x + 2)(x^2 + 1)$,

and it cannot be factorized any further. We shall consider only
the cases where the factors are of the type:

ax + b (a simple factor), $(cx + d)^n$ (a repeated factor, order n),
$px^2 + qx + r$, with $q^2 < 4pr$ (an irreducible quadratic).

The rules affecting these are as follows:

Partial fractions for rational functions P(x)/Q(x)

(degree of P(x)) < (degree of Q(x))

Each factor of Q(x) gives rise to a partial fraction (or
partial fractions) as below. Capitals denote constants:
their values are unique.
(a) Simple factors. To each factor ax + b of Q(x), a
term K/(ax + b).
(b) Repeated simple factors. To each factor $(cx + d)^n$
of Q(x), there are n terms:
$L_1/(cx + d) + L_2/(cx + d)^2 + \cdots + L_n/(cx + d)^n$.
(c) Irreducible quadratic. To each factor $px^2 + qx + r$
of Q(x), a term $(Mx + N)/(px^2 + qx + r)$. (1.31)

P(x) is involved in these rules only to the extent that it will affect
the values of the coefficients K etc. The following examples show
how to determine the values of the coefficients.

Example 1.12. Express $x/[(x - 1)(x + 2)]$ in partial fractions.

We can use any convenient letters for the unknown coefficients of the terms. The denominator has two simple factors, $x - 1$ and $x + 2$, so (1.31a) says that the partial fractions must have the form
$$\frac{x}{(x - 1)(x + 2)} = \frac{A}{x - 1} + \frac{B}{x + 2}. \tag{i}$$
Multiply through by $(x - 1)(x + 2)$:
$$x = A(x + 2) + B(x - 1). \tag{ii}$$
The constants must be chosen so that this becomes an identity. An identity has to be true for any $x$, so if we put any value of $x$ into (ii), the result must be correct. Any two substitutions of numbers for $x$ form two simultaneous equations for the two unknown constants $A$ and $B$. For example, if we put $x = -10$ and $x = 100$ we obtain
$$-10 = -8A - 11B, \qquad 100 = 102A + 99B.$$
The numbers chosen are inconvenient, but according to (1.31) we get the same $A$ and $B$ whatever values of $x$ we use. Therefore, choose values that make the equations as simple as possible:

$x = -2$ gives $-2 = 0 - 3B$, so $B = \tfrac{2}{3}$,

$x = 1$ gives $1 = 3A + 0$, so $A = \tfrac{1}{3}$.

Therefore, from (i),
$$\frac{x}{(x - 1)(x + 2)} = \frac{1}{3(x - 1)} + \frac{2}{3(x + 2)}.$$
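The claim that any two substitutions give the same constants can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `solve_two_by_two` and `equation_at` are hypothetical, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def solve_two_by_two(a1, b1, c1, a2, b2, c2):
    """Solve a1*A + b1*B = c1 and a2*A + b2*B = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    A = Fraction(c1 * b2 - c2 * b1, det)
    B = Fraction(a1 * c2 - a2 * c1, det)
    return A, B

def equation_at(x):
    """Row of x = A(x + 2) + B(x - 1) at a chosen x: (coefficient of A, coefficient of B, right side)."""
    return x + 2, x - 1, x

# The inconvenient choice from the example: x = -10 and x = 100.
awkward = solve_two_by_two(*equation_at(-10), *equation_at(100))
# The convenient choice: x = 1 and x = -2, each of which removes one unknown.
easy = solve_two_by_two(*equation_at(1), *equation_at(-2))

print(awkward, easy)
```

Both pairs of substitutions should lead to the same values of A and B, which is the uniqueness property stated in the rules.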

Example 1.13. Express $(3x - 1)/[(2x + 1)(x - 1)^2]$ in partial fractions.

According to (1.31),
$$\frac{3x - 1}{(2x + 1)(x - 1)^2} = \frac{A}{2x + 1} + \frac{B}{x - 1} + \frac{C}{(x - 1)^2}. \tag{i}$$
Multiply by $(2x + 1)(x - 1)^2$ to give
$$3x - 1 = A(x - 1)^2 + B(2x + 1)(x - 1) + C(2x + 1). \tag{ii}$$
We need three values of $x$ to obtain the three equations for $A$, $B$, $C$. Obvious choices are $x = 1$ and $x = -\tfrac{1}{2}$. For the third, choose, say, $x = 0$. From (ii):

$x = 1$ gives $2 = 0 + 0 + 3C$, so $C = \tfrac{2}{3}$,

$x = -\tfrac{1}{2}$ gives $-\tfrac{5}{2} = \tfrac{9}{4}A + 0 + 0$, so $A = -\tfrac{10}{9}$,

$x = 0$ gives $-1 = A - B + C$, so $B = 1 + A + C = \tfrac{5}{9}$.

Finally,
$$\frac{3x - 1}{(2x + 1)(x - 1)^2} = -\frac{10}{9}\,\frac{1}{2x + 1} + \frac{5}{9}\,\frac{1}{x - 1} + \frac{2}{3}\,\frac{1}{(x - 1)^2}.$$
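A quick way to guard against arithmetic slips in an example like this is to test the cleared-of-fractions identity at values of x that were not used to find the constants. The sketch below assumes the values A = -10/9, B = 5/9, C = 2/3; the function names are illustrative only.

```python
from fractions import Fraction

# Constants as found in the worked example (assumed here, then checked).
A, B, C = Fraction(-10, 9), Fraction(5, 9), Fraction(2, 3)

def lhs(x):
    return 3 * x - 1

def rhs(x):
    return A * (x - 1) ** 2 + B * (2 * x + 1) * (x - 1) + C * (2 * x + 1)

# Both sides are polynomials of degree at most 2, so agreement at three or
# more distinct points means they agree for every x.
checks = [lhs(Fraction(x)) == rhs(Fraction(x)) for x in (-3, 0, 2, 5)]
print(checks)
```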

Example 1.14. Express $1/[x(x^2 + 1)]$ in partial fractions.

Here, $x^2 + 1$ is an irreducible quadratic; so, by (1.31),
$$\frac{1}{x(x^2 + 1)} = \frac{A}{x} + \frac{Bx + C}{x^2 + 1}.$$
Multiply by $x(x^2 + 1)$:
$$1 = A(x^2 + 1) + (Bx + C)x. \tag{i}$$
Then $x = 0$ gives $1 = A + 0$, so

$A = 1$.

There are no other very easy values of $x$ to choose. Put the value of $A$ just found into (i) and rearrange: we get $-x^2 = (Bx + C)x$, or, after cancelling a factor $x$,
$$-x = Bx + C. \tag{ii}$$
It is easiest just to notice that (ii) is satisfied for all $x$ only if

$B = -1$ and $C = 0$.

Therefore
$$\frac{1}{x(x^2 + 1)} = \frac{1}{x} - \frac{x}{x^2 + 1}.$$

If the degree of the numerator is greater than or equal to the degree of the denominator, the case is not covered by (1.31), but we can treat it as follows.

Example 1.15. Put $(x^3 + 1)/[x(x - 1)]$ into the form of a polynomial plus partial fractions.
Carry out polynomial division, until the remainder is of lower degree than the divisor:

divisor $x^2 - x$, dividend $x^3 + 1$, quotient $x + 1$:
subtract $x(x^2 - x) = x^3 - x^2$ from $x^3 + 1$, leaving $x^2 + 1$;
subtract $1(x^2 - x) = x^2 - x$ from $x^2 + 1$, leaving the remainder $x + 1$.

Therefore
$$\frac{x^3 + 1}{x(x - 1)} = x + 1 + \frac{x + 1}{x(x - 1)}.$$
The last term is of the right type for partial fractions, and finally
$$\frac{x^3 + 1}{x(x - 1)} = x + 1 - \frac{1}{x} + \frac{2}{x - 1}.$$
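The division step can be mechanised. The following sketch represents a polynomial as a list of coefficients, highest power first; this representation and the name `poly_divmod` are assumptions made for illustration, not notation from the text.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder); the remainder has lower degree than den.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den, aligned with the leading term of num.
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)  # the leading term has been cancelled
    return quotient, num

# (x^3 + 1) divided by x(x - 1) = x^2 - x
q, r = poly_divmod([1, 0, 0, 1], [1, -1, 0])
print(q, r)
```

The quotient and remainder lists can then be read off as polynomials and compared with the long division in Example 1.15.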

1.13 Summation sign; geometric series


The sign $\sum$ is a large Greek capital S, standing for ‘the sum of...’. It is used in the following way. Suppose, for example, we are provided with a string of six quantities indexed in order, say
$$u_1, u_2, u_3, \ldots, u_6.$$
This is called a sequence consisting of six terms. We can denote the general term by (say) $u_n$, where $n$ takes values from 1 to 6. Suppose we want to add them all up. Then
$$u_1 + u_2 + u_3 + u_4 + u_5 + u_6$$
is denoted by
$$\sum_{n=1}^{6} u_n,$$
which is read ‘the sum of all the $u_n$ between $n = 1$ and $n = 6$’. Similarly
$$u_2 + u_3 + u_4 + u_5 \quad\text{is written}\quad \sum_{n=2}^{5} u_n.$$
Any letter can be used as the counting index instead of $n$, provided that there is no conflict; so we could also write, for instance,
$$u_3 + u_4 + u_5 + u_6 = \sum_{i=3}^{6} u_i.$$

The letters m, n, r, i, j, k are often used.


We index a sequence according to convenience. The first index does not have to be 1. For example consider the important sequence
$$1, x, x^2, x^3, \ldots,$$
which is the same as
$$x^0, x^1, x^2, x^3, \ldots.$$
This is called, for historical reasons, a geometric sequence. Each term in turn is got from its predecessor by multiplying by the common ratio $x$. The natural way to index such a sequence is to start with $n = 0$ instead of $n = 1$. Suppose then we want the sum of the first 6 terms. It can be expressed as
$$1 + x + x^2 + x^3 + x^4 + x^5 = \sum_{n=0}^{5} x^n,$$
though we could put
$$\sum_{n=1}^{6} x^{n-1}$$
instead. Such a sum, whether or not it starts with the $x^0$ term, is called a geometric series.
We will obtain an expression for the sum $S$ of a geometric series having any value of the common ratio $x$ other than $x = 1$, and which runs from the term in $x^0$ to the term in $x^N$. Thus $S = \sum_{n=0}^{N} x^n$. Note that it contains $N + 1$ terms (i.e. not $N$ terms). Written at length:
$$S = 1 + x + x^2 + \cdots + x^{N-1} + x^N.$$
Then
$$xS = x + x^2 + x^3 + \cdots + x^N + x^{N+1}.$$
Subtract the second line from the first. All the terms cancel except for two; we obtain
$$S(1 - x) = 1 - x^{N+1}, \quad\text{so}\quad S = (1 - x^{N+1})/(1 - x).$$

Sum of a geometric series
$$\sum_{n=0}^{N} x^n = 1 + x + x^2 + \cdots + x^N = \frac{1 - x^{N+1}}{1 - x} \quad (x \neq 1). \tag{1.32}$$
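As a sanity check on (1.32), the closed form can be compared with a term-by-term sum. This is a minimal sketch using exact fractions; the function names are hypothetical and the formula is assumed to be applied only with common ratio different from 1.

```python
from fractions import Fraction

def geometric_sum(x, N):
    """Sum of x**n for n = 0..N from the closed form (1.32); requires x != 1."""
    return (1 - x ** (N + 1)) / (1 - x)

def direct_sum(x, N):
    """The same sum, added up term by term (N + 1 terms)."""
    return sum(x ** n for n in range(N + 1))

cases = [(Fraction(1, 10), 4), (Fraction(1, 2), 6), (Fraction(-1), 7), (Fraction(2), 5)]
for x, N in cases:
    print(x, N, geometric_sum(x, N), direct_sum(x, N))
```

Exact rational arithmetic is used so that the two columns can be compared for equality rather than for approximate agreement.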

Example 1.16. Find the following sums: (a) $\sum_{n=0}^{4} (0.1)^n$; (b) $\sum_{n=0}^{6} \frac{1}{2^n}$; (c) $\sum_{n=0}^{N} e^{nx}$; (d) $\sum_{n=0}^{N} (-1)^n$; (e) $\sum_{n=0}^{5} 2^n$.

(a) $x = 0.1$ and $N = 4$, so
$$S = \frac{1 - (0.1)^5}{1 - (0.1)} = \frac{0.99999}{0.9} = 1.1111$$
(as becomes obvious if you write out the terms individually).

(b) $1/2^n = (\tfrac{1}{2})^n$, so $x = \tfrac{1}{2}$ and $N = 6$. By (1.32),
$$S = \frac{1 - (\tfrac{1}{2})^7}{1 - \tfrac{1}{2}} = 2\left(1 - \frac{1}{128}\right) = \frac{127}{64}.$$
(c) $e^{nx} = (e^x)^n$, so the common ratio is $e^x$, in place of $x$ in (1.32):
$$S = \frac{1 - (e^x)^{N+1}}{1 - e^x} = \frac{1 - e^{(N+1)x}}{1 - e^x}.$$
(d) Here $x = -1$, so
$$S = \frac{1 - (-1)^{N+1}}{1 - (-1)} = \tfrac{1}{2}[1 - (-1)^{N+1}].$$
The sums of the first 1, 2, 3, 4, 5, ... terms of the sequence are successively 1, 0, 1, 0, 1, ... .
(e) $x = 2$ and $N = 5$. Therefore
$$1 + 2 + 2^2 + 2^3 + 2^4 + 2^5 = \frac{1 - 2^6}{1 - 2} = 63.$$

Example 1.17. Find an expression for the sum
$$ar^2 + ar^5 + ar^8 + ar^{11} + ar^{14}.$$
This can be written
$$ar^2(1 + r^3 + r^6 + r^9 + r^{12}).$$
The bracket contains a series of type (1.32), with common ratio $r^3$ and with the number of terms $N + 1$ equal to 5: therefore $N = 4$. We have
$$S = ar^2\,\frac{1 - (r^3)^5}{1 - (r^3)} = ar^2\,\frac{1 - r^{15}}{1 - r^3}.$$
(In terms of $\sum$, we wanted $S = \sum_{n=0}^{4} ar^{2+3n}$. It is perhaps easier to see what to do when the series is written out fully.)

Problems

1.1. (Section 1.3). Sketch graphs of the following equa¬ 1.5. Show that the following pairs of lines are mutually
tions over the intervals stated: perpendicular:
(a) y = X4, -1.5 <x < 1; (a) 2x + 3y — 2 = 0 and — 3x + 2y — 3 = 0;
(b) y = x(l — x), — 1 ^ x ^ 2; (b) y = 2x + 1 and y = 2(3 — x);
(c) y = 1 + x + x2, \x - 1| ^ 2; (c) y = 2x and y = —\x\ (d) y = +x.
(d) y = \x — 1|, -3 x < 3;
(e) y = |x| + |x - 3| + |x + 2|, — 3 ^ x ^ 4;
(f) v = IM - 11, -2 x ^ 2;
1.6. (Straight line, Section 1.3). Show that the equation
(g) y = V(x2+ 1), |x|<2. (x + >’ + 1) + oc(2x - 3y - 2) = 0,
where a is a constant, represents a straight line through
1.2. (Straight lines, Section 1.3). Find the straight lines
the intersection of x + y + l= 0 and 2x — 3y — 2 = 0.
through the following pairs of points:
Find the line joining this point with the point (1, 1).
(a) (1,1), (-1,5); (b) (0,1), (2,1); (c) (2,1), (-1, - 1).
Sketch the triangle formed by these lines. Find the
lengths of each side of the triangle. 1.7. (Circles, Section 1.3). Find the centre and radius
for each of the following circles:
1.3. (Straight line, Section 1.3). What are the slopes (a) x2 + y2 = 9; (b) (x — l)2 + y2 = 4;
of the following straight lines, and where do they cut (c) x2 + y2 — 2x — 2y — 21 = 0;
the coordinate axes? (d) 4x2 - 4x + 4y2 + 4y = 9.
(a) y = x — 1; (b) 3y = x — 2; (c) 2x + 5y = 4.
1.8. (Circles, Section 1.3). Find the equation of the circle
1.4. (Straight line. Section 1.3). Find the equations of
centred at (1, —2) with radius 3.
the following straight lines:
(a) passing through (1, 2) inclined at 45° to the x-axis;
(b) passing through (—1, —2) with slope —2; 1.9. Find the points of intersection of the following
(c) with slope 0.5 and x axis intercept x = 1; circles and lines:
(d) through (1,2) parallel to the line y = 3x — 4; (a) x2 + y2 = 8 with x = 2;
(e) through ( — 1,3) perpendicular to the line (b) x2 + y2 — 2x + 2_v — 4 = 0 and y — 2x + 1;
y — 4x — 1. (c) x2 + y2 = 1 and x + y = Jl.

1.10. The following table contains experimental data: to express the following in terms of the cos and sin
X 1.06 0.84 0.72 0.44 0.23 of i(* + y) and §(x - y):
0 0.53 0.71 (a) cos x + cos y; (b) sin x — sin y; (c) cos x — cos y.
y 0.78 1.1
The hypothesis is that the points should lie on a circle
1.18. State where the graphs of the following functions
centre at the origin in the (x, y) plane. Find the distance
cross the x-axis:
of each point from the origin. Calculate the average of
(a) sin x; (b) cos x; (c) sin \x\ (d) cos 3nx;
these values, and write down the equation of the circle
(e) cos(2x — §7t); (f) e~* sin froc.
approximation.

1.19. State the amplitude, angular frequency, period,


1.11. (Functions, Section 1.4). Draw sketches of the
and phase of the following harmonic outputs:
following functions in the (x, f) plane over the intervals
(a) 2 cos(0.2r + 3.2); (b) 1.5 sin[0.2(f — 2.4)];
indicated:
(c) 2 cos(0.2f + 0.12) -F 2 cos(0.2r — 0.39).
(a) x = H(f + 1) — H(r — 1) for —2 < t =$ 2;
(b) x = sgn(l + f) + sgn(l — t) for —2 ^ t ^ 2;
1.20. (Section 1.7). State the inverse f(x) for each of
(c) x = fH(f — 1) for 0 ^ r ^ 2;
the following functions fix) for the values of x stated:
(d) x = (f2 — l)[sgn(r + 1) + sgn(l — f)] for -2 ^
(a) 4x2, x ^ 0;
t ^ 2. (b) 2x + 3, — oo < x < oo;
(c) sin 2x, 0 ^ x ^ ^7i; (d) 2 sin x, 0 ^ x < jn;
1.12. (Functions, Section 1.4). Using Heaviside and sig- (e) cos x2, 0,/$ x ^ y/n; (f) sin(§7r cos x), 0 ^ x ^ jrr;
num functions, construct a single formula in each case
(g) x-4, x > 0; (h) x2 + x, x > —
for /(f) where

'0 (r < — 1), 1.21. Sketch a graph of the function inverse to the
function x3 — x + 1. (There is no need to try to solve
(a) /(f) = 1 (-1 < f < 2), the equation x3 — x + 1 = y for x.)
10 (t > 2);
1.22. (Sections 1.8, 1.9). Solve the following equations
0 (f < 0),
for x:
(b) m =
2t (f > 0); (a) e2x = 3; (b) In 3x = 2;
(c) In x-5 = 1; (d) 3 e3x = 1;
0 (f < 0),
(e) ex + e~* = 2 (hint: multiply through by ex first);
r (0 < t < 1), (f) eln2x = 4; (g) In e2x + 3 In e5x = 2;
(h) ln(x + 1) + ln(x — 1) = 0;
(c) /(f) = { 1 (1 < f < 2), (i) ln(x + 1) + ln(x — 1) = e;
(j) 2X = 3; (k) 32x = j;
—t + 3 (2 < f < 3),
(1) sinh 2x = 4; (m) 2 sinh x = 2 cosh x + 3.
0 (f > 3).
1.23. Express 2X as a power of e.
1.13. (Section 1.5). What are the radian measures of
the following angles: (a) 30°, (b) 100°? 1.24. (Section 1.10). Prove that 10x doubles its value
in any interval of length equal to
1.14. (Trigonometric functions, Section 1.6). Using the In 2
methods of Examples 1.5 and 1.6, obtain
In 10‘
(a) sin In- (b) sin \n\ (c) sin 7r;
(d) sin( —j7t); (e) cos ^n- (f) cos §7t.
1.25. Sketch regions in the (x, y) plane defined by the
following inequalities:
1.15. (Trigonometric functions, Section 1.6). Use
(a) (x - l)2 + y2 ^ 9;
(1.18c) to show that
(b) x ^ 0, y ^ 0 and x + y ^ 1;
(a) cos4 A = f(3 + 4 cos 2A + cos 4/1);
(b) sin4 A = |(3 - 4 cos 2A + cos 4A).

1.16. (Trigonometric functions. Section 1.6). Use (1.18) (d) x2 + y2 ^ 1 and x > 0; (e) |x| + |y| ^ 1
to express the following in terms of sin x and cos x:
(a) cos(x + frr); (b) sin(x + ^Tt); (c) sin(x - \n)\ 1.26. Prove that sinfD1 x = ln[x + -J(x2 + 1)].
(d) cos(x ± 7t); (e) sin(x ± n).
1.27. Figure 1.30 shows a cross-section of a simple
1.17. (trigonometric functions, Section 1.6). Use (1.18) model of a piston and crankshaft. The crankshaft rotates

at 4000 rpm (revolutions per minute). If AB = 2.5 and (b) the folium r = (4 sin2 0 - 1) cos 0;
BC = 5 (in cm), show that the displacement AC is given (c) the four-leaved rose r = sin 20;
by (d) the Archimedean spiral r = 0.040 (extend the in¬
terval in 0 to [0, 6tc]);
AC = 2.5[sin cot + y/(4 — cos2 cut)]
(e) the equiangular spiral r = O.le019 (extend the inter¬
(in cm), where t is measured from 0 = 0, and state co in val in 0 to [0, 6jt]).
radians per second.
1.32. Sketch the graphs of the following functions:
(a) sgn sin x; (b) sgn cos 2x; (c) H(x) sin x;
(d) sin2 x; (e) |sin x|; (f) sin|x|; (g) H(x - n) sin x.

1.33. The coordinates of three vertices of a rectangle are


given by

(-7,3), (1,-3), (4,1).

Find the coordinates of the fourth vertex. Determine


also the area of the rectangle.

1.34. State which of the following functions are peri¬


odic, and, if so, find the (minimum) period:
(a) sin 4x; (b) cos(7r + t); (c) sin t + cos 2f;
(d) sin x2; (e) e_s,nx; (f) cos2 x; (g) x sin x;
(h) |sin x|; (i) 1/(4 + sin21); (j) sin \t.

1.35. Decide which of the following functions are even,


odd, or neither even nor odd:
(a) x2 + x3; (b) x2 + 2x4; (c) x + sin x;
(d) sin x cos x; (e) e-*2; (f) ln(l + x2); (g) es,nx.
Fig. 1.30
1.36. (Partial fractions, Section 1.11). Express the follow¬
ing in partial fractions:

1.28. An oscillation takes the form


-; (b) ---;
(a)
(x - 2)(x + 3) (x + 1 )(x + 2)
x = 3 cos cut + 4 sin cut.
2x - 1 1
By finding numbers c and 4> such that (c) -; (d)-;
x(x-l) x(x + l)(x + 2)
c cos (j) = 3, c sin 0 = 4
1 1
express x as a single sinusoidal term. What are the (e) -; (0-;
x(x2 - 1) x(x + 2)2
amplitude and phase of the oscillation?
x2 x — 1
(g) —; (h) -;
1.29. The exponential function f(t) = C e-st< satisfies the (x+l)(x + 2)2 x2 — 2x — 3
conditions /(0) = 2 and /(1) = 0.5. Find the constants
1 1
C and a. What is the value /(2)? (i) —r; (j) —--.
(X-1)2 X3 T X2

1.30. A yacht, which has a draught of 2 metres, is 1.37. (Partial fractions, Section 1.11). In the following
anchored in a tidal estuary, in which the depth of problems with irreducible factors, express the functions
water around the yacht is in partial fractions:
5 4- 4.5 sin 0.5t 1 x
(a) -; (b) -;
(in metres), where the time f is measured in hours. What x(x2 + x + 1) (x — l)(x2 + 1)
is the tidal period in hours? Over how many hours in x
one period can the yacht float free of the estuary? (C) ---.
(x + l)(x2 + 2x + 6)

1.31. Draw sketches of the graphs of the following curves 1.38. Express the following in partial fractions:
given in polar coordinates, by constructing a table of
values of r for equally spaced angles (say 15° intervals): 1 x3
(a) r-i (b) -;
(a) the cardioid r = 0.5(1 + cos 0); x2(x~ + 1) (x + l)(x + 2)

(c) —-. (d) X n2"; (e) X (-1)";


(x2 - 9)(x + 1) n = 1 n= 1

(f) X [2(0.5)" + 3(0.6)"].


1.39. Write down in full all the terms in each of the
n—0
series:
1.41. Find a formula for the sum of
(a) X 27; (b) X (c) X nx"- X + X5 + x9 +-h x1 +4" +-b X41.
j= 2 n = 0 1 +« n= 1

1.42. ABC is a triangle with sides a, b, c opposite to


1.40. (Geometric series. Section 1.13). Using the geo¬ the corresponding angles. Prove the cosine rule; that
metric series formula find the sums of the following
c2 = a2 + b2 — lab cos C.
series:
(Hint: drop a perpendicular from B on to b. Look for
7 6 /l\" 5 a cos C, and then for an opportunity to use Pythagoras’
(a) X (0-5)"; (b) X (-); (c) X e“2n;
theorem.)
Differentiation

Contents
2.1 The slope of a graph
2.2 The derivative: notation and definition
2.3 Rates of change
2.4 Derivative of $x^n$ ($n = 0, 1, 2, 3, \ldots$)
2.5 Derivatives of sums: multiplication by constants
2.6 Three important limits
2.7 Derivatives of $e^x$, $\sin x$, $\cos x$, $\ln x$
2.8 A basic table of derivatives
2.9 Higher-order derivatives
2.10 An interpretation of the second derivative
Problems

2.1 The slope of a graph


Figure 2.1 shows the graph of a straight line. The $x$ and $y$ coordinates are assumed to have the same scale. Choose any two points $P : (x_1, y_1)$ and $Q : (x_2, y_2)$ which lie on the line. If we measure the angle $\alpha$ from the positive $x$ direction then
$$\tan\alpha = \frac{y_2 - y_1}{x_2 - x_1} \tag{2.1}$$
(see (1.6)). The value of $\tan\alpha$ remains the same whether $Q$ is to the right or left of $P$, since the value of the fraction on the right is unchanged. The angle $\alpha$ itself will differ in the two cases by an amount equal to $\pi$ (or 180°), but this does not affect the value of $\tan\alpha$. (If we refer to $\alpha$ itself to indicate the steepness of a line, we choose the value that lies between ±90°, but normally we only need $\tan\alpha$.) Notice that if the $x$ and $y$ scales differed, the angle $\alpha$ as depicted would not satisfy (2.1); it would be too great or too small.
The slope or gradient of a straight line is defined to be the quantity $\tan\alpha$. If the line is horizontal, $\tan\alpha$ is zero. It is positive or negative according as the line slopes upwards or downwards as we go from left to right. It increases or decreases as the inclination increases or decreases, becoming $\pm\infty$ as $\alpha \to \pm 90°$.
Consider now the slope or gradient of a curve at a point.
Figure 2.2a shows a typical curve. By the slope of the curve at
the point P we mean the slope of the tangent line to the curve
at P. We can think of the tangent line as the line joining two points
on the curve which are ‘infinitely close together’; but it is no use
making P and Q coincide, since we simply get tan a = 0/0, which
has no definite meaning. It is necessary to carry out an indirect
process.
Let $P$ be the fixed point (see Fig. 2.2b). Take any other point $Q$ on the curve and join $PQ$ by a straight line, called the chord $PQ$. If $Q$ is some distance from $P$, then the slope of $PQ$ will not be close to that of the tangent $PT$; but if we take a succession of points $Q$ closer and closer to $P$, then the slope of the chord $PQ$ can be made as close as we wish to that of $PT$. The points $Q$ that we consider are said to approach $P$. The corresponding value of the slope of $PQ$ then approaches a limit or a limiting value, and this is equal to the slope of the curve at $P$. We use the sign $\to$ to signify ‘approaches’; so we can write:

as $Q \to P$, slope of $PQ \to$ slope of the curve at $P$. (2.2)

We shall be able to obtain the exact value of the slope of $PT$, which is the same as the slope of the curve at $P$, by carrying out the approach of $Q$ to $P$ in algebraic terms. To do this, we introduce a new symbol
$$\delta x$$
(pronounced ‘delta-x’). This is a single symbol: the Greek letter $\delta$ stands for the words ‘the increment in’ or ‘the change in’ something, in this case, the increment in the value of $x$ as we move from $P$ to $Q$. For two points $P : (x_1, y_1)$ and $Q : (x_2, y_2)$, the change in $x$ on moving from $P$ to $Q$ is
$$\delta x = x_2 - x_1. \tag{2.3}$$
(The change in moving from $Q$ to $P$ is given by $\delta x = x_1 - x_2$, which has the opposite sign.) Similarly we use the symbol $\delta y$ to indicate the change in $y$ as we move from $P$ to $Q$.
In Fig. 2.3, $P : (x, y)$ is the point on a graph at which we want to find the slope (i.e. we want the slope of the tangent line $PT$). Take another point $Q$ on the curve, and suppose that it approaches $P$. The separation between $P$ and $Q$ is indicated by the sides of length $\delta x$ and $\delta y$ of the triangle $PNQ$. Then, by (2.1),
$$\text{slope of } PQ = \frac{\delta y}{\delta x}. \tag{2.4}$$
Now let $\delta x \to 0$ so that $Q \to P$. The ratio $\delta y/\delta x$ approaches a number which is equal to the value of the slope at $P$. We first show what happens numerically in a particular case.

Example 2.1. Find the slope of the curve $y = x^2$ at the point $P : (\tfrac{1}{3}, \tfrac{1}{9})$ on the curve (Fig. 2.4).
At $P$, we have $x = \tfrac{1}{3}$ and $y = \tfrac{1}{9}$. We shall make a table of values of $\delta y/\delta x$ for diminishing values of $\delta x$; that is, for points $Q$ which are approaching $P$ (from either side). We first need to express $\delta y$ in terms of $\delta x$:
$$\delta y = (y \text{ at } Q) - (y \text{ at } P) = (\tfrac{1}{3} + \delta x)^2 - (\tfrac{1}{3})^2 = (\tfrac{1}{9} + \tfrac{2}{3}\,\delta x + \delta x^2) - \tfrac{1}{9} = \tfrac{2}{3}\,\delta x + \delta x^2. \tag{2.5}$$
For $Q$ to the left of $P$:

δx       -0.1          -0.01           -0.001
δy       -0.0566...    -0.0065666...   -0.00066566...
δy/δx    0.5666...     0.65666...      0.665666...

For $Q$ to the right of $P$:

δx       0.1           0.01            0.001
δy       0.0766...     0.0067666...    0.00066766...
δy/δx    0.7666...     0.67666...      0.667666...

(The dots mean that the final 6 recurs: e.g. 0.6... = 0.66666... .)
First notice that if we put $\delta x = 0$ we obtain $\delta y/\delta x = 0/0$, which gives no information at all, so we have to look at the sequence of values for $\delta y/\delta x$ as $\delta x \to 0$. Inspection suggests that each term is formed from its predecessor in a regular way; so we can predict that:

as $\delta x \to 0$, slope of $PQ \to 0.66666\ldots = \tfrac{2}{3}$.

The exact value of the slope at $P$ is therefore $\tfrac{2}{3}$.
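Readers who want to build such a table themselves could use something like the following sketch, which evaluates the chord slope for $y = x^2$ at $x = 1/3$. The function name `chord_slope` is an illustrative assumption; exact fractions are used for the arithmetic and converted to decimals only for display.

```python
from fractions import Fraction

def chord_slope(f, x, dx):
    """Slope of the chord joining (x, f(x)) and (x + dx, f(x + dx)); dx must be nonzero."""
    return (f(x + dx) - f(x)) / dx

def square(x):
    return x * x

x0 = Fraction(1, 3)
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    for step in (-dx, dx):  # Q to the left of P, then to the right
        slope = chord_slope(square, x0, step)
        print(f"dx = {float(step):+.3f}   slope of PQ = {float(slope):.7f}")
```

The printed slopes should close in on 2/3 from both sides as the step shrinks, in line with the tables above.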

We could have worked out the slope at $P$ in this example without doing any calculation. From (2.5),
$$\frac{\delta y}{\delta x} = \frac{\tfrac{2}{3}\,\delta x + \delta x^2}{\delta x} = \tfrac{2}{3} + \delta x$$
provided that $\delta x \neq 0$. Therefore, when $\delta x \to 0$, $\delta y/\delta x \to \tfrac{2}{3}$, as before.
Finally, in exactly the same way, we can find a general formula giving the slope at any point $P : (x, y)$ on the graph of $y = x^2$. The value of $\delta y$ corresponding to a value of $\delta x$ is given by
$$\delta y = (x + \delta x)^2 - x^2 = x^2 + 2x\,\delta x + \delta x^2 - x^2 = 2x\,\delta x + \delta x^2.$$
Therefore
$$\frac{\delta y}{\delta x} = \frac{2x\,\delta x + \delta x^2}{\delta x} = 2x + \delta x.$$
Now let $\delta x \to 0$; we obtain:

the slope of $y = x^2$ at $(x, y)$ is $2x$. (2.6)

This process is a model for treating other functions: for example, you could now show in the same way that the slope of the graph of $y = x^3$ at any point is equal to $3x^2$.

2.2 The derivative: notation and definition


We shall need to find the value approached by $\delta y/\delta x$ as $\delta x \to 0$ in many different situations. There is a special notation used to signify the process:

Limit notation
Let $y = f(x)$, so that $\delta y = f(x + \delta x) - f(x)$. Then the value approached by $\delta y/\delta x$ when $\delta x \to 0$ is denoted by
$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x}. \tag{2.7}$$

Read this as ‘The limit, or the limiting value, of $\delta y/\delta x$ as $\delta x \to 0$’. (The lim sign is used in many other contexts too.)
The result of the process $\lim_{\delta x \to 0} \delta y/\delta x$, where $\delta y = f(x + \delta x) - f(x)$, is called the derivative of $y$ with respect to $x$, or the derivative of $f(x)$. The process is called differentiation. We worked out earlier that, if $y = f(x) = x^2$, then the derivative is equal to $2x$. The following notations are standard short ways of indicating a derivative:
$$\frac{dy}{dx}, \quad \frac{df(x)}{dx} \quad\text{or}\quad \frac{d}{dx} f(x) \quad\text{signify}\quad \lim_{\delta x \to 0} \frac{\delta y}{\delta x}.$$
The symbol $dy/dx$ is usually pronounced ‘dee-y by dee-x’. Notice that the letter used is an ordinary d, not $\delta$.

Derivative, slope, and tangent
(a) Let $y = f(x)$ and $\delta y = f(x + \delta x) - f(x)$. Then the derivative of $y$ with respect to $x$, signified by
$$\frac{dy}{dx}, \quad \frac{df(x)}{dx}, \quad \frac{d}{dx} f(x) \quad\text{or}\quad \frac{df}{dx},$$
means the result of taking the limit of $\delta y/\delta x$ as $\delta x \to 0$:
$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x}.$$
(b) The slope $m$ of a curve at any point $(x_0, y_0)$ is given by
$$m = \left(\frac{dy}{dx}\right)_{x = x_0},$$
where the derivative is evaluated at $x = x_0$. Therefore the equation of the tangent line at the point is
$$\frac{y - y_0}{x - x_0} = \left(\frac{dy}{dx}\right)_{x = x_0}.$$
(2.8)

Thus our earlier result for $y = x^2$ can be written in several ways:
$$\frac{dy}{dx} = 2x \quad\text{or}\quad \frac{d(x^2)}{dx} = 2x \quad\text{or}\quad \frac{d}{dx}\,x^2 = 2x.$$
Strictly speaking, $dy/dx$ should be regarded as a single shorthand symbol representing the longer expression $\lim_{\delta x \to 0} \delta y/\delta x$, and not as a ratio which can be taken to pieces. However, its great usefulness is that it often behaves just like an ordinary ratio of nonzero quantities, and we shall later see cases where this property guides us to true results and makes them easy to remember.
It is sometimes useful to think of the symbol
$$\frac{d}{dx}$$
standing alone as meaning: ‘differentiate what follows’. It is also called an operator, meaning that we operate on one function ($x^2$ say) to produce another (i.e. $2x$). Sometimes the symbol $D$ is used to stand for the operator $d/dx$. We would then write
$$Dx^2 = 2x.$$

2.3 Rates of change


The quantity $\lim_{\delta x \to 0} \delta y/\delta x$ is usually needed to solve problems which have no immediate connection with the slope of graphs: this was just introduced to give the reader a picture to hold on to. Moreover, it is not always appropriate to call the variables $x$ and $y$ if other letters arise more naturally.
For example, suppose that a car is moving along a straight road, represented by an $x$ axis, and that at time $t$ its displacement from the origin is given by
$$x = f(t).$$
We can deduce its velocity from moment to moment from this information.
Choose any moment $t$, and suppose that, between times $t$ and $t + \delta t$, the car moves from $x$ to $x + \delta x$. Then $\delta x$ must be given by
$$\delta x = f(t + \delta t) - f(t).$$
The quantities $\delta t$ and $\delta x$ could be imagined as being recorded with a stopwatch and distance meter, and the average velocity over the interval $\delta t$ would be
$$\frac{\text{distance travelled}}{\text{time taken}} = \frac{f(t + \delta t) - f(t)}{\delta t} = \frac{\delta x}{\delta t}.$$
The smaller that $\delta t$ is, the more nearly will this ratio approximate to the instantaneous velocity $v$ at time $t$. Therefore, let $\delta t \to 0$; using the notation (2.7), we obtain
$$v = \lim_{\delta t \to 0} \frac{\delta x}{\delta t},$$
or alternatively, by (2.8),
$$v = \frac{dx}{dt}.$$
We can borrow the result (2.6) to complete the calculation in one case. Suppose that
$$x = t^2.$$
Equation (2.6) says in effect that

if $y = x^2$ then $\dfrac{dy}{dx} = 2x$,

and just by changing the letters we obtain:

if $x = t^2$ then $\dfrac{dx}{dt} = 2t$.

Therefore the velocity is
$$v = 2t.$$
Another way of expressing the meaning of velocity is that velocity is the rate of change of displacement with time. Similarly, acceleration $a$ is the rate of change of velocity with time:
$$a = \frac{dv}{dt}.$$
For the case when $x = t^2$ we have
$$\delta v = 2(t + \delta t) - 2t = 2\,\delta t,$$
so
$$a = \frac{dv}{dt} = \lim_{\delta t \to 0} \frac{\delta v}{\delta t} = 2.$$

As seen in the next example, the idea of rate of change is quite


general and need not involve time.

Example 2.2. Find the rate of change of the area of a circle with respect to its radius.

Call the radius $r$ and the area $A$. The rate of change of $A$ with respect to $r$ is
$$\frac{dA}{dr} \quad\text{or}\quad \lim_{\delta r \to 0} \frac{\delta A}{\delta r}.$$
Since $A = \pi r^2$, we have
$$\delta A = \pi(r + \delta r)^2 - \pi r^2 = \pi(2r\,\delta r + \delta r^2).$$
Therefore
$$\frac{\delta A}{\delta r} = \pi[2r + \delta r].$$
Now let $\delta r \to 0$; we obtain
$$\frac{dA}{dr} = \lim_{\delta r \to 0} \frac{\delta A}{\delta r} = 2\pi r.$$
This result could have been obtained by using our previous result
$$\frac{d(t^2)}{dt} = 2t$$
with $r$ in place of $t$, and multiplying it by $\pi$. (Notice also that $2\pi r$ is the circumference: the result can be interpreted as meaning that if we increase $r$ by a small amount $\delta r$, then the area increase is nearly equal to that of a narrow strip of length $2\pi r$ and breadth $\delta r$.)

2.4 Derivative of $x^n$ ($n = 0, 1, 2, 3, \ldots$)

The following is our first general result:

(a) If $y = c$, where $c$ is a constant, then
$$\frac{dy}{dx} = 0.$$
(b) If $y = x^n$, where $n = 1, 2, 3, \ldots$, then
$$\frac{dy}{dx} = nx^{n-1}.$$
(2.9)

To prove (a): the graph of $y = c$ is a horizontal straight line; therefore its slope is zero, so $dy/dx = (d/dx)c = 0$.
To prove (b) in the most elementary way, we shall use an identity: if $n$ is a positive whole number and $a$, $b$ are any numbers,
$$a^n - b^n = (a - b)(a^{n-1} + a^{n-2}b + a^{n-3}b^2 + \cdots + b^{n-1}).$$
This can be verified by multiplying out the two brackets on the right; everything cancels except for the two terms on the left.
Follow (2.8), with $f(x) = x^n$, so that $\delta y = (x + \delta x)^n - x^n$:
$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x} = \lim_{\delta x \to 0} \frac{1}{\delta x}\{(x + \delta x)^n - x^n\}.$$
Put $x + \delta x = a$ and $x = b$ into the identity, noticing that
$$a - b = (x + \delta x) - x = \delta x.$$
Then
$$\frac{\delta y}{\delta x} = \frac{1}{\delta x}(\delta x)[(x + \delta x)^{n-1} + (x + \delta x)^{n-2}x + \cdots + x^{n-1}] = (x + \delta x)^{n-1} + (x + \delta x)^{n-2}x + \cdots + x^{n-1}$$
when $\delta x \neq 0$. Now let $\delta x \to 0$; we obtain
$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x} = x^{n-1} + x^{n-1} + \cdots + x^{n-1}.$$
There are $n$ terms on the right, each equal to $x^{n-1}$, so finally,
$$\frac{dy}{dx} = nx^{n-1}.$$
(In Section 3.4 we show that (2.9b) is in fact true for all values of $n$.)

Example 2.3. Obtain (a) the general expression for $dy/dx$ when $y = x^3$; (b) the slope of the curve $y = x^3$ at $P : (2, 8)$; (c) the angle of inclination of the tangent line to $y = x^3$ at the point $P$; (d) the equation of the tangent line through $P$; (e) the velocity $v$ and acceleration $a$ of a point with coordinate $x$, when $x = t^3$.
(a) From (2.9) with $n = 3$, we have
$$\frac{dy}{dx} = 3x^{3-1} = 3x^2.$$
(b) The slope of the curve at $P$ is equal to the value of $dy/dx$ at $x = 2$, which is 12.
(c) The slope is equal to $\tan\alpha$, where $\alpha$ is the angle of inclination, so $\tan\alpha = 12$ and $\alpha = 85.2°$.
(d) Let $(x, y)$ now represent any point on the tangent line at $(2, 8)$. The slope of the tangent is equal to 12, so from (2.1)
$$\frac{y - 8}{x - 2} = 12.$$
Therefore the equation of the tangent line is $y = 12x - 16$.
(e) From Section 2.3, $v = dx/dt = (d/dt)t^3 = 3t^2$. Also
$$a = \frac{dv}{dt} = \frac{d}{dt}(3t^2) = 3\,\frac{d}{dt}(t^2) = 6t.$$
A little thought about the process of finding $\lim_{\delta t \to 0} \delta v/\delta t$ will persuade the reader that it is right to take the constant 3 from under the differentiation sign in the last line; see also the next section.
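Parts (b) to (d) follow a fixed recipe: evaluate the derivative at the point, then use the slope and the point to fix the line. A minimal sketch of that recipe is given below; the helper name `tangent_line` and the form y = mx + c for the result are assumptions made for illustration.

```python
import math

def tangent_line(f, dfdx, x0):
    """Return (m, c) such that the tangent to y = f(x) at x = x0 is y = m*x + c."""
    m = dfdx(x0)           # slope of the curve at x0
    c = f(x0) - m * x0     # chosen so that the line passes through (x0, f(x0))
    return m, c

m, c = tangent_line(lambda x: x ** 3, lambda x: 3 * x ** 2, 2)
angle_deg = math.degrees(math.atan(m))   # inclination between -90 and 90 degrees

print(m, c, round(angle_deg, 1))
```

The derivative is passed in explicitly here, so the sketch only automates the substitution steps and not the differentiation itself.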

2.5 Derivatives of sums: multiplication by constants


The following are general rules which become obvious when the
definition (2.8) is applied to them.
Linear combinations of functions

(a) If $C$ is a constant, then
$$\frac{d}{dx}[Cf(x)] = C\,\frac{d}{dx} f(x).$$
(b)
$$\frac{d}{dx}[f(x) + g(x)] = \frac{d}{dx} f(x) + \frac{d}{dx} g(x).$$
(c) If $A, B, C, \ldots$ are constants, then
$$\frac{d}{dx}[Af(x) + Bg(x) + Ch(x) + \cdots] = A\,\frac{d}{dx} f(x) + B\,\frac{d}{dx} g(x) + C\,\frac{d}{dx} h(x) + \cdots.$$
(2.10)

The result (2.10c) follows easily by repeated use of (a) and (b). We can use this rule together with (2.9) to obtain the derivatives of polynomials, as in the following example.

Example 2.4. Obtain $dy/dx$ when (a) $y = 3x^3 - \tfrac{1}{2}x^2 + 5$; (b) $y = 3x(x^2 - 2)$.
(a) From (2.10c),
$$\frac{dy}{dx} = \frac{d}{dx}(3x^3 - \tfrac{1}{2}x^2 + 5) = 3\,\frac{d(x^3)}{dx} - \frac{1}{2}\,\frac{d(x^2)}{dx} + \frac{d(5)}{dx} = 3(3x^2) - \tfrac{1}{2}(2x) + 0 = 9x^2 - x.$$
(b) It is necessary in this case to express $y$ without brackets, i.e. as a polynomial:
$$\frac{dy}{dx} = \frac{d}{dx}[3x(x^2 - 2)] = \frac{d}{dx}(3x^3 - 6x) = 3\,\frac{d(x^3)}{dx} - 6\,\frac{dx}{dx} \quad\text{(from (2.10c))}$$
$$= 9x^2 - 6.$$
These derivatives, of course, represent the slopes of the corresponding graphs.
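Rules (2.9) and (2.10) together amount to a term-by-term procedure, which can be sketched in a few lines. The representation below (a list in which entry k is the coefficient of x to the power k) and the name `poly_derivative` are assumptions for illustration.

```python
from fractions import Fraction

def poly_derivative(coeffs):
    """Differentiate a polynomial term by term.

    coeffs[k] is the coefficient of x**k. By (2.9) and (2.10) each term
    c*x**k contributes k*c*x**(k-1); the constant term contributes nothing.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]

# (a) y = 3x^3 - (1/2)x^2 + 5, listed as coefficients of 1, x, x^2, x^3
da = poly_derivative([5, 0, Fraction(-1, 2), 3])
# (b) y = 3x(x^2 - 2) = 3x^3 - 6x, multiplied out first as in the example
db = poly_derivative([0, -6, 0, 3])

print(da, db)
```

The output lists are again coefficients of 1, x, x^2, so they can be compared directly with the results of Example 2.4.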

Example 2.5. A car travels along a straight road with varying velocity $v$ for one hour. At time $t$ hours, its displacement from the starting point $O$ is given by $x = 60t^2(3 - 2t)$ kilometres. Find expressions for (a) the velocity $v$; (b) the acceleration $a$.
(a) The velocity is the rate of change of displacement with time:
$$v = \frac{dx}{dt} = \frac{d}{dt}[60t^2(3 - 2t)] = 60\,\frac{d}{dt}(3t^2 - 2t^3) = 60\left(3\,\frac{d(t^2)}{dt} - 2\,\frac{d(t^3)}{dt}\right)$$
$$= 60[3(2t) - 2(3t^2)] = 360(t - t^2) \quad\text{(in km hr}^{-1}\text{)}.$$
(b) Acceleration is the rate of change of velocity with time:
$$a = \frac{dv}{dt}.$$
Therefore
$$a = 360(1 - 2t) \quad\text{(in km hr}^{-2}\text{)}.$$

Example 2.6. The potential energy $V$ of a pendulum of length $l$ with a bob of mass $m$ is given approximately by $V = mgl(\theta - \tfrac{1}{6}\theta^3)$ when the angle of inclination $\theta$ (radians) is small. Find the rate of change of $V$ with respect to $\theta$. (This quantity is associated with the moment exerted by gravity.)
Using the letters suggested by the question, we require
$$\frac{dV}{d\theta} = \frac{d}{d\theta}[mgl(\theta - \tfrac{1}{6}\theta^3)] = mgl\left(\frac{d\theta}{d\theta} - \frac{1}{6}\,\frac{d(\theta^3)}{d\theta}\right) = mgl(1 - \tfrac{1}{2}\theta^2).$$

2.6 Three important limits


In order to increase the repertory of functions that we can differen¬
tiate, three important limits are needed. The reader might not need
to learn the proofs, but the results (2.11), (2.13), and (2.14) are
essential, and you should try to acquire a feeling for what is
happening by examining the numerical tables given; or better by
working out tables of your own.
Instead of 8x, the letter e (greek epsilon) will be used to represent
the quantity that tends to zero. (The limit is not affected by the letter
we use.)
First consider

£^o £

If we put s = 0 we get 0/0, which is meaningless, but the approach


to a limit can be seen in the following table:

0.1 0.01 0.001

1.0517 1.0050 1.0005

(An approach to zero through negative £ values is similar.) It looks


as if the limit is equal to 1.
To prove this, recall that in Section 1.8 it was shown that the
graph y = ex intersects the y axis at 45°; that is to say, its slope there
is equal to 1. (This is the characteristic property of the base
e = 2.7128 ••• .) The same thing is true if we plot y — e£ against e,
as in Fig. 2.5. Referring to this figure:

eE - 1 _ RQ - OP _ NQ
e ~ PN ~ PN’

which represents the slope of the chord PQ. When e -> 0, the slope
of the chord PQ approaches the slope of the tangent PT, which is
Fig. 2.5 equal to 1. Therefore we have proved that
40 2.6 Mathematical techniques

lim6--1 = 1. (2.11)
£-*o £

The second limit to be considered is

.. sine
lim —,
e-*o £

e being measured in radians. The approach to the limit is shown in


the following table, which includes negative values of e:

e ±0.1 ±0.08 ±0.06 ±0.04 ±0.02


sin f
- ±0.99833 ±0.99893 ±0.99940 ±0.99973 ±0.99993
£

The limit looks as if it might equal 1.


To prove this, consider Fig. 2.6a. PN is any line segment
perpendicular to the base line AB, with Q any point to the left of
N, and we allow $\varepsilon$ to represent the angle PQN (in radians). The arc
PR is a circular arc with centre Q and radius PQ. Then

\[ \frac{PN}{PQ} = \sin \varepsilon, \quad \text{so} \quad PN = PQ \sin \varepsilon. \]

Also (radian property, Section 1.5)

\[ \text{arc } PR = PQ \times (\text{angle } PQR) = PQ\,\varepsilon. \]

Therefore

\[ \frac{\sin \varepsilon}{\varepsilon} = \frac{PN}{\text{arc } PR}. \tag{2.12} \]

Now let Q recede some distance towards the left, as illustrated in
Fig. 2.6b. The angle $\varepsilon$ decreases, and $\varepsilon \to 0$ as Q recedes to infinity.
At the same time the arc PR approaches the straight line PN,
tending ultimately to coincide with it. Therefore, when $\varepsilon \to 0$, the
length of the arc PR approaches the length of PN; so, from (2.12),

\[ \lim_{\varepsilon \to 0} \frac{\sin \varepsilon}{\varepsilon} = 1. \tag{2.13} \]

Figure 2.7 shows the graphs of $y = \ln \varepsilon$ and $y = \ln(1 + \varepsilon)$. The graph
$y = \ln \varepsilon$ (see Fig. 1.27) passes through the point (1, 0) at 45° to the
$\varepsilon$ axis. The graph $y = \ln(1 + \varepsilon)$ is the same graph moved over to the
left by a distance 1, so it passes through the origin O at 45°: that is
to say, it has slope equal to 1 at the origin. Therefore

\[ \lim_{\varepsilon \to 0} \frac{\ln(1 + \varepsilon)}{\varepsilon} = 1. \tag{2.14} \]

The reader may be glad to know that there are no more
complicated limits to be evaluated.
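
The advice to work out tables of your own can be followed with a few lines of code. The sketch below is only an illustration: the names `RATIOS` and `limit_table` and the chosen values of epsilon are assumptions made here, not part of the text. It tabulates the three ratios behind (2.11), (2.13), and (2.14) for shrinking epsilon.

```python
import math

# The three ratios whose limits as eps -> 0 are stated in (2.11), (2.13), (2.14).
RATIOS = {
    "(e^eps - 1)/eps": lambda eps: (math.exp(eps) - 1.0) / eps,
    "sin(eps)/eps": lambda eps: math.sin(eps) / eps,
    "ln(1 + eps)/eps": lambda eps: math.log(1.0 + eps) / eps,
}

def limit_table(ratio, epsilons=(0.1, 0.01, 0.001, -0.001)):
    """Return (eps, ratio(eps)) pairs; eps = 0 itself is excluded (0/0)."""
    return [(eps, ratio(eps)) for eps in epsilons]

for name, ratio in RATIOS.items():
    print(name)
    for eps, value in limit_table(ratio):
        print(f"  eps = {eps:<7g} ratio = {value:.6f}")
```

Each column should settle towards 1, from either side of zero, which is what the tables in this section suggest.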

2.7 Derivatives of $e^x$, $\sin x$, $\cos x$, $\ln x$

These follow from the definition (2.8) and the limits obtained in the
previous section.
First let

\[ y = e^x. \]

Then according to the definition (2.8),

\[ \frac{dy}{dx} = \lim_{\delta x \to 0} \frac{e^{x + \delta x} - e^x}{\delta x}
   = \lim_{\delta x \to 0} \frac{e^x e^{\delta x} - e^x}{\delta x}
   = e^x \lim_{\delta x \to 0} \frac{e^{\delta x} - 1}{\delta x}. \]

Now put $\delta x = \varepsilon$. The previous expression becomes

\[ e^x \lim_{\varepsilon \to 0} \frac{e^{\varepsilon} - 1}{\varepsilon} = e^x, \]

by (2.11). Therefore

\[ \frac{d}{dx} e^x = e^x. \tag{2.15} \]

The rate of increase of $e^x$ is therefore numerically equal to $e^x$ itself.
The simplicity of (2.15) is the reason why the number $e$ is a desirable
base for exponential functions. The result is not so simple for any
other base.
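Next, consider

\[ y = \sin x. \]

Then

\[ \frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\sin(x + \delta x) - \sin x}{\delta x}. \]

From the formula in Appendix B(d)

\[ \sin C - \sin D = 2 \sin \tfrac{1}{2}(C - D) \cos \tfrac{1}{2}(C + D) \]

for any C and D. Put $C = x + \delta x$ and $D = x$ into the identity. Then
we have

\[ \frac{dy}{dx} = \lim_{\delta x \to 0} \frac{2 \sin \tfrac{1}{2}\delta x \, \cos(x + \tfrac{1}{2}\delta x)}{\delta x}
   = \lim_{\delta x \to 0} \frac{\sin \tfrac{1}{2}\delta x}{\tfrac{1}{2}\delta x} \cos(x + \tfrac{1}{2}\delta x). \]

Putting $\tfrac{1}{2}\delta x = \varepsilon$, we have from (2.13),

\[ \frac{dy}{dx} = \lim_{\varepsilon \to 0} \frac{\sin \varepsilon}{\varepsilon} \cos(x + \varepsilon)
   = \lim_{\varepsilon \to 0} \frac{\sin \varepsilon}{\varepsilon} \, \lim_{\varepsilon \to 0} \cos(x + \varepsilon)
   = \cos x. \]

Therefore

\[ \frac{d}{dx} \sin x = \cos x. \tag{2.16} \]

By a closely similar argument, it can be shown that

\[ \frac{d}{dx} \cos x = -\sin x. \tag{2.17} \]

(Notice the minus sign which occurs here.)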


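Finally, suppose that

\[ y = \ln x. \]

Then from (2.8),

\[ \frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\ln(x + \delta x) - \ln x}{\delta x}
   = \lim_{\delta x \to 0} \frac{1}{\delta x} \ln \frac{x + \delta x}{x}
   = \lim_{\delta x \to 0} \frac{1}{\delta x} \ln\!\left(1 + \frac{\delta x}{x}\right). \]

By putting

\[ \frac{\delta x}{x} = \varepsilon, \quad \text{or} \quad \delta x = x\varepsilon, \]

the previous equation becomes

\[ \frac{dy}{dx} = \lim_{\varepsilon \to 0} \frac{1}{x} \, \frac{\ln(1 + \varepsilon)}{\varepsilon}. \]

By equation (2.14) the limit of the part containing $\varepsilon$ is 1, so we have

\[ \frac{d}{dx} \ln x = \frac{1}{x}. \tag{2.18} \]

(Remember that $x$ must be positive for $\ln x$ to have a meaning.)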
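
The four results (2.15) to (2.18) can be checked numerically by evaluating the difference quotient of the definition for smaller and smaller increments. The following is a minimal sketch; the helper name `difference_quotient`, the list `CASES`, and the sample point `x0 = 1.5` are choices made here for illustration.

```python
import math

def difference_quotient(f, x, dx):
    """Slope of the chord between x and x + dx (the quotient whose limit is dy/dx)."""
    return (f(x + dx) - f(x)) / dx

# (name, function, derivative stated in (2.15)-(2.18))
CASES = [
    ("e^x", math.exp, math.exp),
    ("sin x", math.sin, math.cos),
    ("cos x", math.cos, lambda x: -math.sin(x)),
    ("ln x", math.log, lambda x: 1.0 / x),
]

x0 = 1.5  # arbitrary sample point; it must be positive for ln x
for name, f, fprime in CASES:
    print(f"{name}: stated derivative at x0 = {fprime(x0):.6f}")
    for dx in (0.1, 0.001, 1e-5):
        print(f"  dx = {dx:<7g} quotient = {difference_quotient(f, x0, dx):.6f}")
```

As dx shrinks, each quotient should approach the stated derivative; the agreement is approximate because dx is small but not zero.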

2.8 A basic table of derivatives


We assemble the results (2.15) to (2.18) from Section 2.7, and (2.9)
for powers of $x$, in a short table of derivatives.

Derivatives of the elementary functions                          (2.19)

Function $y = f(x)$            Derivative $dy/dx$ or $df(x)/dx$
$c$ ($c$ = constant)           zero
$x^n$ ($n = 1, 2, \ldots$)     $n x^{n-1}$
$e^x$                          $e^x$
$\sin x$                       $\cos x$
$\cos x$                       $-\sin x$
$\ln x$ ($x$ positive)         $1/x$ (or $x^{-1}$)

The derivatives of more complicated functions can be obtained from
these by using the rules described in the next chapter. A more
extensive table is given in Appendix D. Remember the rule (2.10) for
addition of functions and multiplication by constants.

Example 2.8. Obtain the equation of the tangent line at the point $(\tfrac{1}{2}\pi, \pi)$ on the
graph of $y = 2x - 3 \cos x$.
At a general point on the curve,

\[ \frac{dy}{dx} = \frac{d}{dx}(2x - 3 \cos x) = 2 \frac{dx}{dx} - 3 \frac{d}{dx} \cos x \quad \text{(by (2.10))} \]

\[ = 2 - 3(-\sin x) = 2 + 3 \sin x \]

(from the table). At $(\tfrac{1}{2}\pi, \pi)$ this becomes equal to

\[ 2 + 3 \sin \tfrac{1}{2}\pi = 5, \]

and this is the slope of the tangent line at the point. The equation of the tangent
line is therefore

\[ y - \pi = 5(x - \tfrac{1}{2}\pi), \quad \text{that is,} \quad y = 5x - \tfrac{3}{2}\pi. \]

2.9 Higher-order derivatives


We may differentiate a function, and then differentiate the result.
For example:

\[ \text{if } y = x^4, \quad \text{then } \frac{dy}{dx} = 4x^3, \]

which we shall sometimes call the first derivative of $x^4$. By differentiating
again, we obtain

\[ \frac{d}{dx}\left(\frac{dy}{dx}\right) = 12x^2, \]

which we call the second derivative of $x^4$; and so on.
In general, if $y = f(x)$, we use the notation

\[ \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2 y}{dx^2} \quad \text{or} \quad \frac{d^2 f(x)}{dx^2}. \]

(Notice where the indices 2 are placed: the locations are different
above and below.) If we differentiate again, we get

\[ \frac{d}{dx}\left[\frac{d}{dx}\left(\frac{dy}{dx}\right)\right] = \frac{d}{dx}\left(\frac{d^2 y}{dx^2}\right) = \frac{d^3 y}{dx^3} \quad \text{or} \quad \frac{d^3 f(x)}{dx^3}, \]

and so on.

Example 2.9. Show that $d^4 y/dx^4 = 0$ when $y = 2x^3 + 3x^2 - 1$.

Differentiating four times, we have

\[ \frac{dy}{dx} = 6x^2 + 6x, \qquad \frac{d^2 y}{dx^2} = 12x + 6, \qquad \frac{d^3 y}{dx^3} = 12, \qquad \frac{d^4 y}{dx^4} = 0. \]

Example 2.10. Write down the sequence $y$, $dy/dx$, $d^2y/dx^2$, ..., $d^7y/dx^7$ when
$y = \sin x$.
The sequence is $\sin x$, $\cos x$, $-\sin x$, $-\cos x$, $\sin x$, $\cos x$, $-\sin x$, $-\cos x$ (and
it continues in this regular way).
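
Repeated differentiation of a polynomial, as in Example 2.9, is mechanical enough to sketch in code. The representation below (a list whose entry k is the coefficient of x^k) and the function name `differentiate` are assumptions made for this illustration; the rule applied is (2.9) together with (2.10).

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...] meaning a0 + a1*x + a2*x^2 + ...

    Each term a_k x^k becomes k a_k x^(k-1); a constant differentiates to zero.
    """
    derivative = [k * a for k, a in enumerate(coeffs)][1:]
    return derivative or [0]

# y = 2x^3 + 3x^2 - 1 from Example 2.9, lowest power first.
y = [-1, 0, 3, 2]
derivatives = [y]
for _ in range(4):
    derivatives.append(differentiate(derivatives[-1]))

for order, coeffs in enumerate(derivatives):
    print(f"order {order}: {coeffs}")
```

The successive lists correspond to 6x^2 + 6x, 12x + 6, 12, and 0, in agreement with the example.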

2.10 An interpretation of the second derivative


The second derivative has a simple interpretation. Suppose that
$y = f(x)$. From Section 2.9,

\[ \frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right). \]

$dy/dx$ represents the slope of the graph, so $d^2y/dx^2$ gives the rate
of change of the slope with respect to $x$ as we move from left to
right on the graph. Where $d^2y/dx^2$ is positive, the slope is increasing;
where it is negative, the slope is decreasing.
If $d^2y/dx^2$ is consistently positive, then the slope $dy/dx$ steadily
increases; it might even increase from negative values (downward
slope) through a zero value (tangent horizontal) to positive values
(upward slope). If $d^2y/dx^2$ is consistently negative, then the slope
steadily decreases. Figure 2.8 shows two curves upon which, respectively,
the second derivative is positive and negative all the way along.

[Fig. 2.8: panels (a) and (b), each annotated with the signs of $dy/dx$.]

Example 2.11. Sketch one period $2\pi$ of the graph of $\sin x$ and indicate the signs
of $dy/dx$ and $d^2y/dx^2$.
We have

\[ y = \sin x, \qquad \frac{dy}{dx} = \cos x, \qquad \frac{d^2 y}{dx^2} = -\sin x. \]

The signs of the derivatives are shown in Fig. 2.9a; $dy/dx$ is zero at the points
marked Z. In Fig. 2.9b, $dy/dx$ is sketched to show explicitly how it varies.

Problems

2.1. (Computational). A point P is given on each of the estimate the slope of the curve at P. (Consider points on
following curves. Choose a sequence of points Q which both sides of P.)
lie closer and closer to P on the curve, and make a table (a) y = x3 at P : (1, 1); (b) y = x* at P : (1, 1);
giving the slopes of the chords PQ. From this table, (c) y = cos x at P : (^tc, 2”*);

(d) y = ev at P : (0, 1); (e) y = e2v at P: (0, 1); (c) x + C (where C is a constant);
(f) y = x3 + x* at P : (1, 2) (compare (a) and (b)); (d) x(x — 1); (e) x2(x2 + 1) — 1;
(g) y = In x at P : (1, 0). (f) ax2 + bx + c (where a, b, c, are constants);
(g) (x - I)2-
2.2. (Sections 2.1, 2.2). Obtain dy/dx in each of the
following cases at the given point P. Do this from first 2.10. Prove that the following pairs of curves intersect
principles; that is, find 8y in terms of 8x, simplify in a right angle at the points given. (Hint: find dy/dx
5j*/5x, and let 8x -* 0 to obtain limgx^0 5y/5x, or dy/dx. at the point for each curve.)
(a) y = 3x at P : (2, 6); (b) y = 3 — 2x at P : (1, 1); (a) y = 1 + x — x2 and y = 1 — x + x2 at (1, 1);
(c) y = 3x2 at P : (1, 3); (d) y = x3 at P : (1, 1); (b) y = 2(1 — x2) and y = x — 1 at (1, 0);
(e) y = 1/x at P : (2, 4); (f) y = 3x + 2x2 at P : (1, 5); (c) y = 1 — 3X3 and y = g + jx2 at (1, 2).
(g) y = (1 + 2x)2 at P : (- 1, 1).
2.11. Find the angle between the following curves at their
2.3. (Sections 2.1, 2.2). Obtain dy/dx from first principles points of intersection. (Hint: the angle of intersection is
(see Problem 2.2) at a general point P : (x, y) on the given the angle between the tangents to the curves at the point;
curves. then consider (1.7) and the tangent formula of (1.18a) for
(a) y = 3x2; (b) y = x3; (c) y = 1/x; the difference of angles.)
(d) y = x + 4; (e) y = x + 1/x; (f) y = 2x2 — 3. (a) y = x2 and y = 1 — x2;
(b) y = 3X3 and y = x2 — 2x + 4
2.4. (Section 2.3). Let x be the displacement of a point
moving on a straight line, and let t represent the time 2.12. (See Section 2.6). Find the limits of the following
elapsed. Form a table by taking the given value of f and functions when e —> 0. (Remember: 0/0 has no definite
calculating the average velocity between f and t + 51 for meaning.)
diminishing values of 5f. Use the table to estimate the
(a)£; (b) —; (c) —;
velocity at time f. s 2e s
(a) x = 3f at f = 1; (b) x = 5r2 at t = 3;
e2e — 1 e2£ — 1 sin 2s
(c) x = 2t — 5t2 at t = 1; (d) x = 2t — 5t2 at t = 0.2.
(d) -; (e)-; (f)-;
2e e 2s
2.5. Use the formula (2.9) to find dy/dx at the given
sin 2s ln(l + e2)
points in the following cases. (g)-; (h)-;-;
(a) y = x at any point; (b) y = x3 at x = 3; 6 £

(c) y = x4 at x = 2 and at x = —2. sin s


(i) -when s is an angle measured in degrees',
2.6. From (2.9), write down the derivatives, dy/dx or s
(d/dx)/(x), for the given functions /(x). Use this informa¬ tan s sinh e e_£ — 1
tion to sketch rough graphs of /(x) (notice the sign and (j) -; (k)-; (1)-.
ESS
the magnitude of the slope of y = /(x)).
(a) y = x; (b) y = x2; (c) y = x3;
2.13. (See Section 2.7). Obtain d(cos x)/dx in the same
(d) y = x4; (e) y = x5.
way that (2.16), for sin x, was obtained.

2.7. Sketch a velocity-time graph and an acceleration¬


2.14. (See Section 2.7). (a)Differentiate e2* by follow¬
time graph for a point moving on a straight line with
ing the method leading to (2.15).
displacement x = f3. Use these to sketch a graph of
(b) Differentiate sin 2x by following the method
acceleration against distance. (See Example 2.5.)
leading to (2.16).
(c) Prove that (d/dx) e~x = -e~x by following part¬
2.8. In the following, different letters for the variables are
way the method leading to (2.15). (Hint:
used in place of the usual x and y. Write down the
derivatives in the appropriate form. (For example, if limE_0[(e“c - !)/( — £)] = 1.)
w = r3, then dw/dr = hr2.) Use this result to differentiate sinh x and cosh x (see
(a) V = 4nr3; (b) S = nd2; (1.26) for the definitions).
(c) E = kT4 (k is a constant);
(d) / = V/R (R is a constant); 2.15. Differentiate the following functions.
(e) H = RI2 (R is a constant); (a) 2 sin x — 3 cos x;
(f) V = RT/P (R and P are constant). (b) In 3x (see Section 1.9 for the properties of the logar¬
ithm);
2.9. Differentiate the following functions by using (2.10): (c) In x3 (see Section 1.9); (d) sin x — x;
(a) 3x2 — 2x + 1; (b) x7 — 3x6 + x + 1; (e) e* — 1 — x — 4x2.

2.16. Find the equations of the tangent lines in tne (e) e* — 1 — x — ^x2.
following cases.
(a) y = .x3 at (1, 1); (b) y = x4 - 2x2 + 1 at (2, 9); 2.18. Show that, if N is a positive whole number, then
(c) y = cos .x at (fit, 0); (d) y = In x at (e, 1); (dN/d.xN)xN = N\

2.19. For the curve y = x2(x2 — 3), find the ranges in


vz vz x for which (a) dy/dx is positive (so that y is increasing);
(f) y = 3e'v — 4.x at (0, 3). (b) dy/d.x is negative (so that y is decreasing); (c) d2y/dx2
is positive (so that the slope is increasing); (d) d2y/dx2
is negative (so that the slope is decreasing). Deduce the
2.17. Obtain dy/dx, d2y/dx2, d3y/dx3 in the following general shape of the curve from these facts. (Hint: if dy/dx
cases. changes sign at some point, then dy/dx must be zero at
(a) y = .x6; (b) y = 3.x2 — 2.x + 2; the point. But dy/d.x does not necessarily change sign
(c) y = .x6 — .x2; (d) y = 2 sin x — 3 cos x; where dy/dx = 0.)
3 Further techniques for differentiation

Contents
3.1 The product rule
3.2 Quotients and reciprocals
3.3 The chain rule
3.4 Derivative of $x^n$ for any value of $n$
3.5 Functions of $ax + b$
3.6 An extension of the chain rule
3.7 Logarithmic differentiation
3.8 Implicit differentiation
3.9 Derivatives of inverse functions
3.10 Derivative as a function of a parameter
Problems

The table of elementary derivatives (2.19) is not sufficient to satisfy
basic needs; for example, it does not even tell us the derivative of
$\sin 2x$. But fortunately we need not start afresh every time we meet
a new function. By using the rules of combination given in this
chapter it is possible to differentiate functions that are developments
of those given in (2.19), no matter how complicated they are.

3.1 The product rule


The derivatives of a product of several functions can be obtained
when the derivatives of its individual components are known.
Examples of such products are

\[ x^2 e^x, \qquad e^x \sin x, \qquad x e^x \cos x. \]

Suppose firstly that $y$ takes the form of a product of two functions
$u(x)$ and $v(x)$:

\[ y(x) = u(x)v(x), \]

where $y(x)$ is written to display the dependence of $y$ on $x$. We require
$dy/dx$ in terms of $u$ and $v$. Fix a value for $x$, and change it by an
amount $\delta x$ so that

$x$ becomes $x + \delta x$.

Then $u$, $v$, and $y$ all change:

$u$ becomes $u + \delta u$, $v$ becomes $v + \delta v$, and $y$ becomes $y + \delta y$,

where $u$, $v$, $y$ represent the values at $x$. Since

\[ \delta y = (u + \delta u)(v + \delta v) - uv, \]

we obtain

\[ \frac{\delta y}{\delta x} = \frac{(u + \delta u)(v + \delta v) - uv}{\delta x}
   = \frac{uv + u\,\delta v + v\,\delta u + \delta u\,\delta v - uv}{\delta x}
   = u \frac{\delta v}{\delta x} + v \frac{\delta u}{\delta x} + \delta u \frac{\delta v}{\delta x}. \]

Now let $\delta x \to 0$, so that $\delta y/\delta x$, $\delta u/\delta x$, $\delta v/\delta x$ become $dy/dx$, $du/dx$,
$dv/dx$ respectively. Also, since $\delta u \to 0$ when $\delta x \to 0$, the final term
becomes zero, and we obtain the product rule:

Product rule

If $y(x) = u(x)v(x)$, then

\[ \frac{dy}{dx} = \frac{d}{dx}(uv) = u \frac{dv}{dx} + v \frac{du}{dx}. \tag{3.1} \]
Example 3.1. Find $dy/dx$ when $y = x^2 e^x$.

Put $u = x^2$, $v = e^x$, $y = x^2 e^x = uv$. Then

\[ \frac{du}{dx} = 2x, \qquad \frac{dv}{dx} = e^x. \]

Therefore, by (3.1),

\[ \frac{dy}{dx} = u \frac{dv}{dx} + v \frac{du}{dx} = x^2(e^x) + e^x(2x) = (x^2 + 2x) e^x. \]

Example 3.2. Find $dx/dt$ when $x = e^t \cos t$.

We have to interpret (3.1) in terms of the new symbols. Put
$u = e^t$, $v = \cos t$, $x = e^t \cos t = uv$.
Then (refer if necessary to the table (2.19) with the appropriate changes of letters)

\[ \frac{du}{dt} = e^t, \qquad \frac{dv}{dt} = -\sin t. \]

Changing the symbols in (3.1) we have

\[ \frac{dx}{dt} = u \frac{dv}{dt} + v \frac{du}{dt} = e^t(-\sin t) + (\cos t) e^t = e^t(\cos t - \sin t). \]
Example 3.3. Find $dy/dx$ when $y = x e^x \sin x$.

This product has three terms, and we have to carry out the differentiation in
two stages. Write
$y = (x e^x) \sin x$
and put $u = x e^x$ and $v = \sin x$. By (3.1),

\[ \frac{dy}{dx} = x e^x \frac{d}{dx} \sin x + \sin x \frac{d}{dx}(x e^x)
   = x e^x \cos x + \sin x \frac{d}{dx}(x e^x). \tag{i} \]

To evaluate $(d/dx)(x e^x)$, use the product rule again, putting $u = x$ and $v = e^x$.
Then

\[ \frac{d}{dx}(x e^x) = x \frac{d}{dx} e^x + e^x \frac{d}{dx} x = x e^x + e^x. \tag{ii} \]

Substitute (ii) into (i):

\[ \frac{dy}{dx} = x e^x \cos x + (\sin x)(x e^x + e^x) = e^x(x \cos x + x \sin x + \sin x). \]

Another method of dealing with the product of several terms,
which is usually more convenient, is given in Section 3.7. The reader
is strongly recommended to write out all the steps completely at
first, otherwise mistakes are likely to occur.

3.2 Quotients and reciprocals


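Suppose that

\[ y(x) = \frac{u(x)}{v(x)}. \]

Proceed as for the product rule: let $x$ change to $x + \delta x$, so
that $u$ becomes $u + \delta u$, $v$ becomes $v + \delta v$, and $y$ becomes $y + \delta y$.
Then

\[ \frac{\delta y}{\delta x} = \left(\frac{u + \delta u}{v + \delta v} - \frac{u}{v}\right) \frac{1}{\delta x}
   = \frac{uv + v\,\delta u - uv - u\,\delta v}{v(v + \delta v)\,\delta x}
   = \frac{v\,\delta u - u\,\delta v}{v(v + \delta v)\,\delta x}
   = \frac{1}{v(v + \delta v)} \left(v \frac{\delta u}{\delta x} - u \frac{\delta v}{\delta x}\right). \]

Let $\delta x \to 0$; then $\delta y/\delta x$, $\delta u/\delta x$, $\delta v/\delta x$ become $dy/dx$, $du/dx$, $dv/dx$,
and $\delta v \to 0$. Therefore

\[ \frac{dy}{dx} = \frac{d}{dx}\left(\frac{u}{v}\right) = \frac{1}{v^2} \left(v \frac{du}{dx} - u \frac{dv}{dx}\right). \]

It is worth noting the special case of the reciprocal of a function. In
that case, $u(x) = 1$, so $du/dx = 0$. Finally we have

Quotient and reciprocal rules

(a) If $y(x) = \dfrac{u(x)}{v(x)}$, then

\[ \frac{dy}{dx} = \frac{d}{dx}\left(\frac{u}{v}\right) = \frac{1}{v^2} \left(v \frac{du}{dx} - u \frac{dv}{dx}\right). \tag{3.2} \]

(b) If $y(x) = \dfrac{1}{v(x)}$ (i.e. if $u(x) = 1$), then

\[ \frac{dy}{dx} = \frac{d}{dx}\left(\frac{1}{v}\right) = -\frac{1}{v^2} \frac{dv}{dx}. \]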
Example 3.4. Obtain $dy/dx$ when $y = \tan x$.

Express $y$ in the form

\[ y = \tan x = \frac{\sin x}{\cos x}. \]

Put $u = \sin x$, $v = \cos x$, $y = u/v$. Then

\[ \frac{du}{dx} = \cos x, \qquad \frac{dv}{dx} = -\sin x. \]

From (3.2),

\[ \frac{dy}{dx} = \frac{1}{v^2} \left(v \frac{du}{dx} - u \frac{dv}{dx}\right)
   = \frac{1}{\cos^2 x} [\cos x \cos x - \sin x(-\sin x)]
   = \frac{1}{\cos^2 x} (\cos^2 x + \sin^2 x) = \frac{1}{\cos^2 x}. \]

(Remember that $\cos^2 A + \sin^2 A = 1$.)

Example 3.5. Find $dy/dx$ when $y = (x + 3)/(2x^3 + 1)$.

Put $u = x + 3$, $v = 2x^3 + 1$, $y = u/v$. Then

\[ \frac{du}{dx} = 1, \qquad \frac{dv}{dx} = 6x^2. \]

By (3.2),

\[ \frac{dy}{dx} = \frac{1}{(2x^3 + 1)^2} [(2x^3 + 1)(1) - (x + 3)(6x^2)]
   = \frac{1 - 18x^2 - 4x^3}{(2x^3 + 1)^2}. \]

Example 3.6. Obtain $dy/dx$ when (a) $y = 1/x$; (b) $y = 1/x^2$.

(a) Put $v = x$ into the reciprocal rule (3.2b) (or $u = 1$ and $v = x$ into (3.2a)):

\[ \frac{dy}{dx} = -\frac{1}{v^2} \frac{dv}{dx} = -\frac{1}{x^2}. \]

(b) Put $v = x^2$ into (3.2b):

\[ \frac{dy}{dx} = -\frac{1}{v^2} \frac{dv}{dx} = -\frac{2x}{x^4} = -\frac{2}{x^3}. \]

If we had put $n = -1$ and $n = -2$ respectively into the formula (2.9),

\[ \frac{d}{dx}(x^n) = n x^{n-1}, \]

which was proved only for positive integer $n$, we would have obtained the
correct results of Example 3.6. The formula is in fact correct for all values
of $n$, as will be shown in Section 3.4.

3.3 The chain rule


The chain rule will be used continually in future chapters. It is also
called the function-of-a-function rule. Suppose $y$ can be expressed as
a function of a variable $u$, where $u$ is a function of $x$. We shall express
this by the notation

\[ y = y(u), \quad \text{where } u = u(x). \]

An example of this is

\[ y = \cos(x^3), \]

which we can rewrite in the form

\[ y = y(u) = \cos u, \quad \text{where } u = u(x) = x^3. \]

Another example is when $y = \cos^3 x$. Write it in the form $y = (\cos x)^3$,
so that

\[ y = u^3, \quad \text{where } u = \cos x. \]

The rule for such cases is the following.

The chain rule

If $y = y(u)$ where $u = u(x)$, then

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}. \tag{3.3} \]

The form of this result is easy to remember if you first write

\[ \frac{dy}{dx} = \frac{dy}{\,\cdot\,} \, \frac{\,\cdot\,}{dx}, \]

then put $du$ in place of the dots. Sometimes it is inconvenient to
use $u$; any letter not already in use can be used in place of $u$.
To prove (3.3), fix on any value of $x$. Consider a nearby value
$x + \delta x$, and denote the corresponding small changes in $u$ and $y$ by
$\delta u$ and $\delta y$. When $x$ becomes $x + \delta x$, then $u$ becomes $u + \delta u$ and $y$
becomes $y + \delta y$. Evidently

\[ \frac{\delta y}{\delta x} = \frac{\delta y}{\delta u} \frac{\delta u}{\delta x}, \]

since the terms $\delta u$ cancel. Now let $\delta x \to 0$. Then $\delta u \to 0$, and
consequently $\delta y/\delta x$, $\delta y/\delta u$, $\delta u/\delta x$ approach $dy/dx$, $dy/du$, $du/dx$
respectively. Thus we obtain

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}. \]

The following examples show how to recognize when it is
appropriate to use the chain rule. You should lay out every
application in the systematic way shown until you are used to it.
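
The systematic layout (choose u, find dy/du and du/dx, multiply) translates directly into code. The sketch below applies it to the two functions sin(x^3) and sin^3 x, which are easily confused; the helper names `chain_rule` and `central_difference` and the sample point are assumptions made here.

```python
import math

def chain_rule(dy_du, u, du_dx, x):
    """Evaluate dy/dx = (dy/du at u(x)) * (du/dx at x)."""
    return dy_du(u(x)) * du_dx(x)

def central_difference(f, x, h=1e-6):
    """Numerical slope of f at x from the chord between x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 0.9  # arbitrary sample point

# y = sin(x^3): y = sin u with u = x^3, so dy/du = cos u and du/dx = 3x^2.
slope_a = chain_rule(math.cos, lambda x: x ** 3, lambda x: 3.0 * x ** 2, x0)

# y = sin^3 x: y = u^3 with u = sin x, so dy/du = 3u^2 and du/dx = cos x.
slope_b = chain_rule(lambda u: 3.0 * u ** 2, math.sin, math.cos, x0)

print(slope_a, central_difference(lambda x: math.sin(x ** 3), x0))
print(slope_b, central_difference(lambda x: math.sin(x) ** 3, x0))
```

Each printed pair should agree closely, while the two functions themselves have quite different slopes.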

Example 3.7. (a) You are given that $(d/dx)\, e^x = e^x$ (see the table, (2.19)).
Deduce that $(d/dx)\, e^{ax} = a e^{ax}$, where $a$ is any constant. (b) Find the derivative
of $e^{-x}$. (c) Use this result to obtain the derivatives of $\sinh x$ and $\cosh x$ (see (1.26)
for the definitions of these functions).

(a) Rewrite $y = e^{ax}$ in the form

\[ y = e^u, \quad \text{where } u = ax. \]

To use the chain rule (3.3), we need $dy/du$ and $du/dx$:

\[ \frac{dy}{du} = e^u \quad \text{and} \quad \frac{du}{dx} = a. \]

The chain rule gives

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = e^u a = a e^{ax} \]

after restoring the variable $x$.

(b) For $e^{-x}$, the constant $a$ is $-1$, so

\[ \frac{d}{dx} e^{-x} = -e^{-x}. \]

(c) $\sinh x = \tfrac{1}{2}(e^x - e^{-x})$ and $\cosh x = \tfrac{1}{2}(e^x + e^{-x})$, so the result (b) gives

\[ \frac{d}{dx}(\sinh x) = \frac{d}{dx} [\tfrac{1}{2}(e^x - e^{-x})] = \tfrac{1}{2}(e^x + e^{-x}) = \cosh x, \]

\[ \frac{d}{dx}(\cosh x) = \frac{d}{dx} [\tfrac{1}{2}(e^x + e^{-x})] = \tfrac{1}{2}(e^x - e^{-x}) = \sinh x. \]

Example 3.8. Find $dy/dx$ when $y = (x^2 + 1)^{10}$.

We could expand $(x^2 + 1)^{10}$ as a polynomial by means of the binomial theorem,
but the chain rule is far simpler. Put

\[ y = u^{10}, \quad \text{where } u = x^2 + 1. \]

Then

\[ \frac{dy}{du} = 10u^9 \quad \text{and} \quad \frac{du}{dx} = 2x. \]

By the chain rule,

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = 10u^9 \cdot 2x = 20x(x^2 + 1)^9. \]

Example 3.9. Find $dy/dx$ when (a) $y = \sin(x^3)$; (b) $y = \sin^3 x$.

(a) Put $y = \sin u$, where $u = x^3$. Then

\[ \frac{dy}{du} = \cos u \quad \text{and} \quad \frac{du}{dx} = 3x^2. \]

By the chain rule,

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = (\cos u)\, 3x^2 = 3x^2 \cos(x^3). \]

(b) Put $y = u^3$, where $u = \sin x$. Then

\[ \frac{dy}{du} = 3u^2 \quad \text{and} \quad \frac{du}{dx} = \cos x. \]

By the chain rule,

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = 3u^2 \cos x = 3 \sin^2 x \cos x. \]

Example 3.10. Find $dy/dx$ when $y = 1/(x^2 + 1)$.

Put $y = 1/u$, where $u = x^2 + 1$. Then

\[ \frac{dy}{du} = -\frac{1}{u^2} \quad \text{and} \quad \frac{du}{dx} = 2x \]

(where the reciprocal rule (3.2b) was used for differentiating $1/u$).
By the chain rule,

\[ \frac{dy}{dx} = -\frac{1}{u^2} \, 2x = -\frac{2x}{(x^2 + 1)^2}. \]

Example 3.11. Find $du/dt$ when $u = a \cos k(x - ct)$ where $a$, $k$, $c$, and $x$ are
constant, $t$ and $u$ being the only variables.

We should not use $u$ for the intermediate variable in the chain rule (3.3), because
it is already in use (as the name of the dependent variable). Instead of $u$, use an
uncommitted letter such as $w$ as the intermediate variable, putting

\[ u = a \cos w, \quad \text{where } w = kx - kct. \]

The chain rule takes the form

\[ \frac{du}{dt} = \frac{du}{dw} \frac{dw}{dt}, \]

in which

\[ \frac{du}{dw} = -a \sin w \quad \text{and} \quad \frac{dw}{dt} = -kc. \]

Therefore

\[ \frac{du}{dt} = (-a \sin w)(-kc) = akc \sin k(x - ct). \]

3.4 Derivative of $x^n$ for any value of $n$

Consider the derivative of $y = x^n$, where $n$ may have any value, an
integer or not, positive or negative. The rule turns out to take the
same form as (2.9), in which $n$ was limited to positive integers:

Derivative of $x^n$
If $y = x^n$, where $n$ may take any value whatever, then

\[ \frac{dy}{dx} = n x^{n-1}. \tag{3.4} \]

To prove (3.4), we use the chain rule (3.3). Note that $x = e^{\ln x}$ (see
(1.21)), so that

\[ y = x^n = (e^{\ln x})^n = e^{n \ln x}. \]

To use the chain rule, we put this in the form

\[ y = e^u, \quad \text{where } u = n \ln x, \]

so that

\[ \frac{dy}{du} = e^u \quad \text{and} \quad \frac{du}{dx} = \frac{n}{x}. \]

Then

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = e^u \frac{n}{x} = x^n \frac{n}{x} = n x^{n-1} \]

(where we used $e^u = y = x^n$ again).
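
The proof of (3.4) can be mirrored numerically: compute the derivative by the route used in the proof (e^u times n/x, with u = n ln x) and compare it with n x^(n-1) for exponents that are not positive integers. The function name `power_derivative_via_exp` and the sample values below are assumptions chosen for this sketch; x must be positive for ln x to exist.

```python
import math

def power_derivative_via_exp(n, x):
    """Derivative of x^n for x > 0, computed as in the proof: e^u * (n / x) with u = n ln x."""
    u = n * math.log(x)
    return math.exp(u) * (n / x)

x0 = 2.0  # arbitrary positive sample point
for n in (0.5, -2.0, -0.5, math.pi):
    via_proof = power_derivative_via_exp(n, x0)
    via_formula = n * x0 ** (n - 1.0)
    print(f"n = {n:8.5f}: proof route {via_proof:.10f}, n*x^(n-1) {via_formula:.10f}")
```

The two columns should agree up to rounding error for every exponent tried.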

Example 3.1 2. Find dy/dx when (a) y = xf, (b) y — l/xf, (c) y = 1/^/x, (d)
y = l/(2xi + x).

(a) Here n = f in (3.4), so — (xi) = fxf


dx

(b) This may be written y = x i so n = — § in (3.4), and

d
(x”f) = —jx
dx

(c) $y = x^{-1/2}$, so $\dfrac{dy}{dx} = -\tfrac{1}{2} x^{-3/2}$.

(d) Write y = (2xi + x)-1. We can use the chain rule: put y = u-1, where
u = 2xf + x. Then

dy
= —u (by (3.4)), — = fx“i + 1 (by (3.4)).
dx dx

Therefore, by the chain rule (3.3),

dy dy du 2X 3 + 1
— = — — = ( — u 2)(§x f + 1) = - 3
dx du dx (2xi + xY

3.5 Functions of ax + b
A frequently occurring application of the chain rule (3.3) is in
connection with functions like $e^{ax+b}$, $\sin(ax + b)$, $(ax + b)^n$, and
in general $f(ax + b)$. The spirit of the chain rule is to say: 'If
the functions were $e^x$, $\sin x$, $x^n$, $f(x)$, then they would be easy.
Therefore, try the chain rule with $u = ax + b$.'
Suppose that, in general, we want to differentiate $y$ when

\[ y = f(ax + b), \]

and that we know how to differentiate $f(x)$. Write

\[ u = ax + b, \qquad y = f(u). \]

Then the chain rule gives

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = a \frac{df(u)}{du}, \]

in which the derivative occurring on the right is already known.
The following special cases, in which $b = 0$, should be noticed;
they constitute an extension of the table (2.19).

Function          Derivative                                     (3.5)
$e^{ax}$          $a e^{ax}$
$\sin ax$         $a \cos ax$
$\cos ax$         $-a \sin ax$

3.6 An extension of the chain rule


In Section 3.3, the chain rule (3.3) was looked upon as a way of
differentiating a function of a function, say $y(u(x))$. Sometimes we
need to consider 'a function of a function of a function of...'. These
can always be worked through by repeated applications of (3.3), but
it may be less complicated to proceed as in the following example.

Example 3.13. Obtain $dy/dx$ when $y = e^{\sin(x^2 + 1)}$.

Instead of using one intermediate variable $u$, introduce two variables, $u$ and $v$.
Put

\[ y = e^v, \qquad v = \sin u, \qquad u = x^2 + 1. \]

Then by exactly the same type of arguments as led to (3.3),

\[ \frac{dy}{dx} = \frac{dy}{dv} \frac{dv}{du} \frac{du}{dx}. \]

We have

\[ \frac{dy}{dv} = e^v, \qquad \frac{dv}{du} = \cos u, \qquad \frac{du}{dx} = 2x, \]

so

\[ \frac{dy}{dx} = (e^v \cos u)\, 2x = 2x\, e^{\sin(x^2 + 1)} \cos(x^2 + 1). \]

The result can be extended in an obvious way to any number of
intermediate variables, but it is seldom that more than two would
be needed:

Extended chain rule

Suppose that
$y = y(v)$, $v = v(u)$, and $u = u(x)$.
Then

\[ \frac{dy}{dx} = \frac{dy}{dv} \frac{dv}{du} \frac{du}{dx}. \tag{3.6} \]
3.7 Logarithmic differentiation


To differentiate a product

\[ y = u(x)v(x)w(x) \]

consisting of three terms, the product rule (3.1) can be applied
twice, as in Example 3.3. An alternative procedure, which is often
simpler, is the following.
Since $y = uvw$,

\[ \ln y = \ln(uvw) = \ln u + \ln v + \ln w. \]

By the chain rule (3.3), if $u(x)$ is any function of $x$,

\[ \frac{d}{dx} \ln u = \frac{1}{u} \frac{du}{dx}, \]

and we may have $y$, $v$, or $w$ in place of $u$. Therefore

\[ \frac{1}{y} \frac{dy}{dx} = \frac{1}{u} \frac{du}{dx} + \frac{1}{v} \frac{dv}{dx} + \frac{1}{w} \frac{dw}{dx}. \]

By multiplying through by $y = uvw$, we obtain:

Logarithmic differentiation

If $y = uvw$, then

\[ \frac{dy}{dx} = uvw \left(\frac{1}{u} \frac{du}{dx} + \frac{1}{v} \frac{dv}{dx} + \frac{1}{w} \frac{dw}{dx}\right) \tag{3.7} \]

(and so for any number of terms).

Example 3.14. Find $dy/dx$ when $y = (x^{1/2} \sin^2 x)/(x^2 + 1)$.

Put $y = uvw$, where

\[ u = x^{1/2}, \qquad v = \sin^2 x, \qquad w = (x^2 + 1)^{-1}. \]

Then

\[ \ln y = \ln(x^{1/2}) + \ln(\sin^2 x) + \ln(x^2 + 1)^{-1}
   = \tfrac{1}{2} \ln x + 2 \ln(\sin x) - \ln(x^2 + 1). \]

Notice that we did not just copy the formula (3.7) rigidly: the logarithm is useful
for getting rid of awkward powers and we might have missed this. Differentiate
this expression:

\[ \frac{1}{y} \frac{dy}{dx} = \frac{1}{2x} + \frac{2}{\sin x} \frac{d}{dx}(\sin x) - \frac{1}{x^2 + 1} \frac{d}{dx}(x^2 + 1)
   = \frac{1}{2x} + \frac{2 \cos x}{\sin x} - \frac{2x}{x^2 + 1}. \]

Multiply through by $y = (x^{1/2} \sin^2 x)/(x^2 + 1)$ to give $dy/dx$:

\[ \frac{dy}{dx} = \frac{x^{1/2} \sin^2 x}{x^2 + 1} \left(\frac{1}{2x} + \frac{2 \cos x}{\sin x} - \frac{2x}{x^2 + 1}\right). \]
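
A result obtained by logarithmic differentiation is easy to get wrong in its signs, so a numerical cross-check is worthwhile. The sketch below takes the function of Example 3.14 to be y = x^(1/2) sin^2 x / (x^2 + 1), which is the reading consistent with the 1/(2x) term in the worked answer; that reading, the helper names, and the sample point are assumptions made here.

```python
import math

def y(x):
    """Assumed function of Example 3.14: sqrt(x) * sin(x)^2 / (x^2 + 1), for x > 0."""
    return math.sqrt(x) * math.sin(x) ** 2 / (x * x + 1.0)

def dy_dx_logarithmic(x):
    """y times the derivative of ln y = (1/2) ln x + 2 ln(sin x) - ln(x^2 + 1)."""
    dlny_dx = 1.0 / (2.0 * x) + 2.0 * math.cos(x) / math.sin(x) - 2.0 * x / (x * x + 1.0)
    return y(x) * dlny_dx

def central_difference(f, x, h=1e-6):
    """Numerical slope of f at x from the chord between x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 1.1  # sample point with x > 0 and sin x not zero, so every logarithm is defined
print(dy_dx_logarithmic(x0), central_difference(y, x0))
```

The formula is only evaluated where x > 0 and sin x is non-zero, since the logarithms used in the derivation need positive arguments.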

3.8 Implicit differentiation


An equation of the form

\[ f(x, y) = c \quad \text{(a constant)} \]

represents a curve or curves; for example $x^2 + y^2 = 1$ represents a
circle, otherwise expressed by $y = y(x) = \pm(1 - x^2)^{1/2}$. This latter
relation is implicit in the first form, which is called an implicit
equation.
Suppose that the implicit equation for $y$ is $f(x, y) = c$ and its
explicit equation is $y = y(x)$; that is to say, both equations specify
the same curve. Then $f(x, y(x)) = c$ for all values of $x$: it is an
identity. Since the value of $f(x, y(x))$ remains constant, its derivative
is zero:

\[ \frac{d}{dx} f(x, y(x)) = 0, \]

for every relevant value of $x$.
Notice further that if $y$ is a function of $x$, then the chain rule (3.3),
using $y$ as the intermediate variable instead of $u$, gives results such as

\[ \frac{d}{dx} y^2 = 2y \frac{dy}{dx} \quad \text{and} \quad \frac{d}{dx} \cos y = -\sin y \frac{dy}{dx}. \]

This fact can be used in the following way to obtain an expression
for $dy/dx$, even in cases when we cannot solve the implicit equation
to obtain $y$ as a function of $x$ explicitly.

Example 3.15. Find a general expression for $dy/dx$ at any point on the curve
given by $f(x, y) = x + y + \sin x + \cos y = 1$.

So long as we stay on the curve, $f(x, y)$ does not change when $x$ changes, so
$df(x, y)/dx = 0$. Therefore

\[ 1 + \frac{dy}{dx} + \cos x - \sin y \frac{dy}{dx} = 0, \]

so finally

\[ \frac{dy}{dx} = \frac{\cos x + 1}{\sin y - 1}. \]

Such a result is not quite so good as its neatness suggests, because
we would still find it hard to say what values of $y$ are to be associated
with a particular value of $x$ in the new formula: this would in effect
involve solving the original equation for $y$ in terms of $x$.
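
The difficulty just described can be handled numerically: solve the implicit equation of Example 3.15 for y at a chosen x, then feed the pair (x, y) into the formula for dy/dx. The sketch below uses bisection, which is one possible choice and not something prescribed by the text; the bracket [-50, 50] is an assumption that is adequate only for moderate values of x.

```python
import math

def f(x, y):
    """Left-hand side of the curve f(x, y) = 1 from Example 3.15."""
    return x + y + math.sin(x) + math.cos(y)

def solve_y(x, lo=-50.0, hi=50.0):
    """Find y with f(x, y) = 1 by bisection.

    f never decreases as y increases (its rate of change in y is 1 - sin y >= 0),
    so a sign change of f - 1 across [lo, hi] brackets the solution.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(x, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def slope(x, y):
    """dy/dx from implicit differentiation: (cos x + 1) / (sin y - 1)."""
    return (math.cos(x) + 1.0) / (math.sin(y) - 1.0)

x0 = 0.5  # arbitrary sample value of x
y0 = solve_y(x0)
h = 1e-5
numeric_slope = (solve_y(x0 + h) - solve_y(x0 - h)) / (2.0 * h)
print(f"y = {y0:.6f}, formula slope = {slope(x0, y0):.6f}, numerical slope = {numeric_slope:.6f}")
```

The formula and the numerical slope of the solved curve should agree closely, which illustrates that the implicit result is usable once a point on the curve is known.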

3.9 Derivatives of inverse functions


The derivatives of functions such as $\ln x$, $\arctan x$, $\arccos x$, and
$\arcsin x$, which are the respective inverses of $e^x$, $\tan x$, $\cos x$,
and $\sin x$, can be obtained by a standard procedure. We need the
general result:

\[ \frac{dy}{dx} = 1 \Big/ \frac{dx}{dy}. \tag{3.8} \]

To illustrate what (3.8) means, take as an example the case when
$y = \ln x$. As described in Section 1.9, the two statements

\[ y = \ln x \text{ for } x > 0 \quad \text{and} \quad x = e^y \text{ for all } y \]

are different ways of saying the same thing; the two graphs
depicting the relation between $x$ and $y$ are the same graph. For
small corresponding increments $\delta x$ and $\delta y$ on this graph,

\[ \frac{\delta y}{\delta x} = 1 \Big/ \frac{\delta x}{\delta y}. \]

Since $\delta x$ and $\delta y$ approach zero together, we obtain $dy/dx = 1/(dx/dy)$,
which is (3.8).
To find $dy/dx$ when $y = \ln x$, write equivalently

\[ x = e^y. \]

Then

\[ \frac{dx}{dy} = e^y, \]

so, by (3.8),

\[ \frac{dy}{dx} = \frac{1}{e^y} = \frac{1}{x}. \]

This, of course, agrees with (2.18), where a more direct method
was used.

Example 3.16. Find $dy/dx$ when $y = \arctan x$.

If $y = \arctan x$, then $x = \tan y$. From Example 3.4, interchanging $x$ and $y$,

\[ \frac{dx}{dy} = \frac{1}{\cos^2 y}, \]

so, by (3.8),

\[ \frac{dy}{dx} = \cos^2 y. \]

To express $\cos^2 y$ in terms of $\tan y$ (i.e. in terms of $x$) draw the triangle in Fig.
3.1, in which $y$ is represented by the angle A. This is a right-angled triangle
because the sides conform with Pythagoras' theorem. Evidently $\cos^2 y = 1/(1 + \tan^2 y)$,
as can be checked by putting $\tan^2 y = (\sin^2 y)/\cos^2 y$. Therefore

\[ \frac{dy}{dx} = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + x^2}. \]

3.10 Derivative as a function of a parameter


If $x$ and $y$ are functions of a parameter, or supplementary variable,
$t$, so that

\[ x = x(t), \qquad y = y(t), \]

then the point $(x(t), y(t))$ follows a curve as $t$ varies. Suppose that
$t$ changes from $t$ to $t + \delta t$; then $x$ changes to $x + \delta x$ and $y$ to $y + \delta y$.
Obviously

\[ \frac{\delta y}{\delta x} = \frac{\delta y}{\delta t} \Big/ \frac{\delta x}{\delta t}, \]

since $\delta t$ cancels on the right-hand side. Let $\delta t \to 0$; then $\delta x \to 0$, and
we have

Differentiation in terms of a parameter

If $x = x(t)$ and $y = y(t)$, then

\[ \frac{dy}{dx} = \frac{dy}{dt} \Big/ \frac{dx}{dt}. \tag{3.9} \]
Example 3.17. A curve is given in polar coordinates by $r = \sin \theta$. Find $dy/dx$
at the point where $\theta = \tfrac{1}{8}\pi$.
We can use $\theta$ as the parameter in the following way. The universal relation
between polar and cartesian coordinates is

\[ x = r \cos \theta, \qquad y = r \sin \theta. \]

On the special curve described by $r = \sin \theta$, these equations become

\[ x = \sin \theta \cos \theta, \qquad y = \sin^2 \theta. \]

Then

\[ \frac{dx}{d\theta} = -\sin^2 \theta + \cos^2 \theta = \cos 2\theta \]

(by the product rule and the identity in Appendix B) and

\[ \frac{dy}{d\theta} = 2 \sin \theta \cos \theta = \sin 2\theta \]

(by the chain or product rule and the identity in Appendix B). Therefore

\[ \frac{dy}{dx} = \frac{dy}{d\theta} \Big/ \frac{dx}{d\theta} = \tan 2\theta. \]

At the point where $\theta = \tfrac{1}{8}\pi$,

\[ \frac{dy}{dx} = \tan \tfrac{1}{4}\pi = 1. \]

Example 3.18. The map coordinates of a moving vehicle are given by $x = -t^2$,
$y = \tfrac{1}{3} t^3$, where $t$ is time and $t > 0$. Find the direction the vehicle is facing when
$t = 2$.

From (3.9),

\[ \frac{dy}{dx} = \frac{dy}{dt} \Big/ \frac{dx}{dt} = \frac{t^2}{-2t} = -\tfrac{1}{2} t. \]

This equals $-1$ when $t = 2$. The slope of the curve is negative at this point, so
the tangent to the path slopes downwards from left to right as shown in Fig.
3.2. The actual direction in which the vehicle is moving is, however, from right
to left. It is facing north west as shown.

From information such as that given in the previous example, the
speed of a moving point can be calculated. Suppose that a point
moves so that

\[ x = x(t), \qquad y = y(t), \]

where $t$ represents time. Figure 3.3 shows the effect of changing $t$
to $t + \delta t$, where $\delta t$ is small: the point moves from P to Q, a short
distance $\delta s$ say along the curve ($\delta s$ is called an element of arc length).
Then the average speed over this short time is given by

\[ \frac{\text{arc length } PQ}{\delta t} = \frac{\delta s}{\delta t}
   \approx \frac{\text{straight distance } PQ}{\delta t}
   = \frac{(\delta x^2 + \delta y^2)^{1/2}}{\delta t}
   = \left[\left(\frac{\delta x}{\delta t}\right)^2 + \left(\frac{\delta y}{\delta t}\right)^2\right]^{1/2}. \]

Now let $\delta t \to 0$. Then $\delta x/\delta t$ and $\delta y/\delta t$ become $dx/dt$ and $dy/dt$, and
finally we have the result:

Speed of a moving point

Let $x = x(t)$ and $y = y(t)$, where $t$ is time.
The speed of the point is given by

\[ \text{speed} = \frac{ds}{dt} = \left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right]^{1/2}, \tag{3.10} \]

where $ds$ stands for an element of arc length.

Example 3.19. Find the speed of the vehicle in Example 3.18 when $t = 2$.
In general

\[ \frac{ds}{dt} = \left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right]^{1/2} = (4t^2 + t^4)^{1/2}. \]

The speed is therefore $4\sqrt{2}$ when $t = 2$. (Speed is always counted as a
non-negative number: when we want to make a distinction as to direction, the
word 'velocity' is used.)
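
For a path given parametrically, the slope (3.9) and the speed (3.10) both come from the same two numbers, dx/dt and dy/dt. The sketch below evaluates them for the vehicle of Examples 3.18 and 3.19, taking x = -t^2 and y = t^3/3; the function name `slope_and_speed` is an assumption made for this illustration.

```python
import math

def slope_and_speed(dx_dt, dy_dt, t):
    """Return (dy/dx, speed) at time t from the component rates dx/dt and dy/dt.

    The slope is (dy/dt)/(dx/dt), so dx/dt must not be zero at t.
    The speed is the square root of (dx/dt)^2 + (dy/dt)^2.
    """
    vx, vy = dx_dt(t), dy_dt(t)
    return vy / vx, math.hypot(vx, vy)

# x = -t^2 gives dx/dt = -2t; y = t^3/3 gives dy/dt = t^2.
slope, speed = slope_and_speed(lambda t: -2.0 * t, lambda t: t * t, 2.0)
print(f"slope = {slope}, speed = {speed:.6f}")
```

The negative value of dx/dt at t = 2 is what shows the vehicle to be moving from right to left, information that the slope alone does not carry.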

Problems

3.1. (Product rule, (3.1)). Obtain df(x)/dx for the follow¬ (g) (sin x + cos x)/(sin x — cos x);
ing f(x): (h) secx ( = 1/cos x); (i) cosec x (= 1/sin x);
(a) x ev; (b) x sin x; (c) x cos x; (d) ex sin x; (j) x/(3x2 - 2); (k) l/x(x3 + 1); (1) 1/ln x;
(e) x In x; (f) x2 In x; (g) eA In x; (h) x2 ex; (m) x" where n is a negative whole number
(i) sin x cos x; (j) x2x3 (this is the same as x5: show (x" = 1/x-"); (n) l/(x + 1); (o) e“x (= l/ex);
that the result is the same for both forms). (p) 1/tan x; (q)x-2lnx.

3.2. (Quotient and reciprocal rule, (3.2)). Obtain d/(x)/


dx for the following f(x): 3.3. Find the first, second, and third derivatives of
(a) cot x; (b) x/(x + 1); (c) (sin x)/x; (d) ex/x; (a) 1/(1 - x); (b) x sin x; (c) x/(x - 1);
(e) (x2 - l)/(x2 + 1); (f) (tan x)/x2; (d) f(x)g(x), where / and g are any functions.

3.4. (Chain rule, (3.3)). Obtain d/(x)/dx for the follow¬ (a) ecos2 *; (b) e“cosx2; (c) ln(cos x2); (d) (e*2 - l)4.
ing /(x). (Set out the calculation systematically, as in the
examples in Section 3.3.)
3.10. (Logarithmic differentiation, Section 3.7, is
(a) sin2 x; (b) cos2 x; (c) sin x2; (d) cos x2;
easiest). Differentiate the following.
(e) tan2 x; (f) tan x2; (g) cos(l/x);
(a) x ex sin x; (b) t e‘ cos f; (c) x4 e2x sin4 3x.
(h) e_x (compare Problem 3.2(o)); (i) (x + l)5;
(j) (x3 + l)4; (k) sin 3x; (1) cos jx;
3.11. (Implicit differentiation. Section 3.8). Proceed as
(m) tan Tx; (n) e_3x; (o) sin(2x + 1);
in Example 3.15 to obtain expressions for dy/dx in the
(p) cos(3x - 2); (q) tan(l - 2x); (r) e1/x;
following.
(s) ax (write as a power of e).
(a) Show that if x2 + y2 = 4, then dy/dx = — x/y. Check
the correctness of the expression by testing it with
3.5. (General powers of x, Section 3.4). Differentiate the y = ± (4 — x2)4. Interpret the result geometrically by
following. sketching the circle x2 + y2 = 4 and considering the
(a) x-2; (b)x_1; (c) x4; (d) x~4; (e) x4; meaning of dy/dx in terms of slope, (b) xi + y4 = 1;
(0 V*; (g) v(*3); (h) i/x; (i) i/Vx. (c) x3 + xy — y3 = 0; (d) x sin y — y sin x = 1.

3.6. Differentiate the following (the independent 3.12. The same expression for dy/dx in Problem 3.11a
variable is not always x, and more than one rule is obtained when the radius is changed; for example,
is needed). if x2 + y2 = 9, we still get dy/dx = —x/y. Is this para¬
(a) x4 sin x; (b) sin4 x; (c) (x2 + 1)“4; doxical? (Notice that even in the general case of /(x, y) =
(d) sin2(3f + 1); (e)e"'cosr; (f)e“'sinf; c, a constant, the expression for dy/dx will not depend
(g) e~2' cos 3f; (h) e-3' cos 2f; on c: think of the difference between the form of the
expression and the values it takes.)
(i) sin x cos2 x; (j) sin2 x cos x; (k)

(1) x sin3 x; (m) x cos3 x. 3.13. Find expressions for dy/dx and then d2y/dx2 if
xy — x y = 1.

3.7. Differentiate cos2 x and sin2 x, (a) by using the


3.14. Differentiate the following inverse functions, using
identities cos2 A = j(l + cos 2,4) and sin2 A —
the method of Section 3.9. The results are quite import¬
y(l — cos 2A), (b) by using the product rule, (c) by
ant, and are included in the table of derivatives, Appendix
using the chain rule.
D.
(a) arcsin x; (b) arccos x; (c) arctan x;
3.8. Confirm the correctness of the following state¬ (d) arcsinh x; (e) arccosh x; (f) arctanh x.
ments. The letters A, B, C, D, and n stand for any
constants.
d2x 3.15. (Parametricdifferentiation,Section3.10).Thecurves
(a) If x = A cos 2t + B sin 21, then-1- 4x = 0;
dr2 in the following are in polar coordinates. Find dy/dx at
d2x the point specified.
(b) If x = A cos nt + B sin nt, then-F n x = 0; (a) r = sin j9 at 0 = (b) r = 1 + sin2 6 at 0 = In.
df2
_ d2x
(c) If x = A e3! + B e 3', then-9x = 0; 3.16. Obtain dy/dx in terms of f, then re-express it in
df2 terms of x, when the path of a point is given para¬
d2x metrically by the following.
(d) If x = A e"‘ + B e then-n~x — 0;
df2 (a) x = f3, y = t2; (b) x = 2 cos t, y = 2 sin t.
(e) If x = A e"' cos t + B e”' sin t, then
d2x dx 3.17. The path of a point is given parametrically by
-+ 2— + 2x = 0;
dr2 df x = a cos t, y = b sin f.
(f) If y = A ex + B e~x + C cos x + D sin x, then Show that the point travels around the ellipse
dy4
— - y = 0.
dx4

3.9. (Chain rule (3.3); or, more easily, the extension Express dy/dx in terms of f. Suppose that t represents
(3.6)). Differentiate the following functions. time. Express the speed as a function of f.
4 Applications of differentiation

Contents
4.1 Function notation for derivatives
4.2 Maxima and minima
4.3 Exceptional cases of maxima and minima
4.4 Sketching graphs of functions
4.5 Estimating small changes
4.6 Numerical solution of equations: Newton’s method
Problems

Reminder. A basic table of derivatives is given in Appendix D at
the end of the book.

4.1 Function notation for derivatives


So far, we have used the $dy/dx$ or $(d/dx) f(x)$ notation for derivatives.
The usefulness of the $dy/dx$ notation is illustrated by the chain rule
(3.3) and by (3.7)–(3.9): it strongly suggests the truth of certain
results and makes them easy to remember. However, it is sometimes
desirable to use another notation, $f'(x)$, which means exactly the
same thing:

\[ f'(x) \text{ means the same as } \frac{d}{dx} f(x). \]

By itself the symbol $f'$ stands for the derivative function, because
it is ‘derived’ from the original function $f$. Think of $f'$ in the
following way. Choose a ‘neutral’ letter for the independent variable,
$u$ say, which is not being used for anything else at the moment, and
specify $f$ in terms of $u$. For example, suppose that

\[ f(u) = u^2 - 3u. \]

Then $f'$ stands for the function specified by

\[ f'(u) = \frac{d}{du} f(u) = 2u - 3. \]

Knowing now the form of the function $f'$ (i.e. its formula), we can
put anything we like in place of $u$, so that

\[ f'(x) = 2x - 3, \quad f'(t) = 2t - 3, \quad f'(5) = 2 \cdot 5 - 3 = 7, \]

\[ f'(x^3) = 2x^3 - 3, \quad f'(x - ct) = 2(x - ct) - 3, \]

\[ f'(g(x)) = 2g(x) - 3 \quad \text{(where $g$ is any function), and so on.} \]

The following examples show how this notation can be used.



Example 4.1. The function $f$ is defined by $f(u) = \sin u$. Obtain (a) $f(x^2)$;
(b) $(d/dx) f(x^2)$; (c) $f'(x^2)$.

(a) Put $u = x^2$; then $f(x^2) = \sin x^2$.

(b) \[ \frac{d}{dx} f(x^2) = \frac{d}{dx} \sin x^2 = 2x \cos x^2 \]

(by using the chain rule (3.3) with $u = x^2$).

(c) The first thing to do is to obtain the function $f'$:

\[ f'(u) = \frac{d}{du} f(u) = \frac{d}{du} (\sin u) = \cos u. \]

Now put $u = x^2$; then

\[ f'(x^2) = \cos x^2. \]

Notice that the result (c) is different from the result (b): $f'(x^2)$ is not
the same as $(d/dx) f(x^2)$. In (b) we first find $f(x^2)$ and then differentiate with
respect to $x$; in (c) we first find $f'(u)$ and then put $u = x^2$.
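
The distinction between (b) and (c) can be checked numerically. The following is a minimal sketch rather than part of the original text: the helper `central_difference`, its step size `h`, and the test point `x = 1.3` are illustrative assumptions, and the derivative in (b) is only approximated.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate d/dx func(x) by a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f(u):
    return math.sin(u)

def f_dash(u):
    # f'(u) = cos u, obtained by differentiating f with respect to u
    return math.cos(u)

x = 1.3  # arbitrary test point, chosen only for illustration

# (b): first form f(x^2), then differentiate with respect to x
result_b = central_difference(lambda s: f(s ** 2), x)

# (c): first form f'(u), then put u = x^2
result_c = f_dash(x ** 2)

print(result_b, 2 * x * math.cos(x ** 2))  # numerical and chain-rule values of (b)
print(result_c)                            # (c) lacks the factor 2x
```

The two results differ exactly by the factor $2x$ supplied by the chain rule.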

Example 4.2. Express (a) the product rule (3.1); (b) the quotient rule (3.2a); (c)
the chain rule (3.3); in terms of the ‘dash’ notation.

(a) Product rule

\[ \frac{d}{dx} [u(x)v(x)] = u(x)v'(x) + v(x)u'(x), \]

or simply

\[ (uv)' = uv' + vu'. \]

(b) Quotient rule

\[ \frac{d}{dx} \left[\frac{u(x)}{v(x)}\right] = \frac{v(x)u'(x) - u(x)v'(x)}{v^2(x)}. \]

(c) Chain rule

\[ \frac{d}{dx} f(u(x)) = f'(u(x))\,u'(x). \]

Example 4.3. (a) Suppose that $f$ is any function. Express $(d/dx) f(5x - 3)$ in
any terms available. (b) Verify the correctness of (a) in the special case when
$f(5x - 3) = \sin(5x - 3)$.

(a) Since the particular function $f$ is not specified, the only thing to be done is
to express $(d/dx) f(5x - 3)$ in terms of $f'$, which is also unspecified. Then, from
the chain rule (c) in Example 4.2, with $u = 5x - 3$,

\[ \frac{d}{dx} f(5x - 3) = 5 f'(5x - 3). \]

It is awkward to express the right-hand side without using the dash notation.
One alternative is to write it as

\[ 5 \left[\frac{df(u)}{du}\right]_{u = 5x - 3}. \]

(b) In this case $f(u) = \sin u$, so

\[ f'(u) = \cos u. \]

The result in (a) predicts that

\[ \frac{d}{dx} f(5x - 3) = 5 \cos(5x - 3). \]

This is the same as the result obtained by working out

\[ \frac{d}{dx} \sin(5x - 3) \]

directly by using the chain rule with $u = 5x - 3$.

The dash notation extends to higher derivatives: we put

The dash notation

\[ f'(x) = \frac{d}{dx} f(x), \quad f''(x) = \frac{d^2}{dx^2} f(x), \quad f'''(x) = \frac{d^3}{dx^3} f(x), \ \dots. \qquad (4.1) \]

If $y = f(x)$, then the notation $y'$, $y''$, $y'''$, ..., is also
used.

4.2 Maxima and minima


A prominent feature in the graph of any function

\[ y = f(x) \]

is any point at which the graph ‘turns over’. For example, in
Fig. 2.9, the graph of $y = \sin x$ turns over at $x = \tfrac{1}{2}\pi$ and $x = \tfrac{3}{2}\pi$.
These are points where the slope changes sign from positive to
negative or negative to positive. The derivative of $f(x)$ is zero (the
tangent is horizontal) at such turnover points: for example it is easy
to verify that, for $y = \sin x$,

\[ f'(\tfrac{1}{2}\pi) = 0 \quad \text{and} \quad f'(\tfrac{3}{2}\pi) = 0. \]

Therefore

\[ f'(x) = 0 \]

can be looked on as an equation whose solutions include all the
possible points at which the graph turns over.
However, graphs do not necessarily turn over at points where
$f'(x) = 0$. For example, if $y = x^3$, then $f'(x) = 3x^2$. This is zero at
$x = 0$, but the graph does not turn over at $x = 0$ (see Fig. 1.3): it
flattens instantaneously and then continues upward.
Fig. 4.1

Figure 4.1 sketches two typical cases in which the graph does turn
over, at A ($x = a$) in Fig. 4.1a and at B ($x = b$) in Fig. 4.1b. Then

\[ f'(a) = 0 \quad \text{and} \quad f'(b) = 0. \]

If $f'(x) = 0$ at a point $x = c$: that is, if

\[ f'(c) = 0, \]

then $x = c$ is called a stationary point of $f(x)$. A stationary point
such as A in Fig. 4.1a is called a maximum of the function $f(x)$.
More precisely, the function is said to have a local maximum at
$x = c$, because the value of $f(x)$ at $x = c$ is greater than its value at
any point in the immediate neighbourhood. (There may be local
maxima elsewhere that are either greater or smaller than this one.)
Similarly a point such as B in Fig. 4.1b is called a local minimum
of $f(x)$.
To distinguish between types of stationary point algebraically,
consider also the second derivative $f''(x)$ at $x = c$. Suppose
that

\[ f'(c) = 0 \quad \text{and} \quad f''(c) < 0. \]

The second derivative is

\[ f''(x) = \frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right), \]

and this is negative at $x = c$. Therefore $dy/dx$, or $f'(x)$, is
decreasing across $x = c$, and since $f'(x) = 0$ at $x = c$, $f'(x)$ must
be positive on the left of $c$ and negative on the right. Thus the
graph is of the type shown in Fig. 4.1a, and the point is a local
maximum.
If $x = c$ is a point where

\[ f'(c) = 0 \quad \text{and} \quad f''(c) > 0, \]

then $f'(x)$ is increasing, and therefore goes from negative to positive,
across $x = c$. The point is therefore a minimum, like the point B in
Fig. 4.1b.
In the special case when

\[ f'(c) = 0 \quad \text{and also} \quad f''(c) = 0 \]

there might occur a maximum (as with $y = -x^4$ at $x = 0$), or a
minimum (as with $y = x^4$ at $x = 0$), or another feature called a
stationary point of inflection (as with $y = \pm x^3$ at $x = 0$). These
cases are illustrated in Fig. 4.2. One way to classify such a point
is to examine directly the sign of $dy/dx$ on both sides of the
point.

Fig. 4.2 Cases for which $f'(0) = 0$ and $f''(0) = 0$.
(a) $y = -x^4$. (b) $y = x^4$. (c) $y = x^3$. (d) $y = -x^3$.

To summarize:

Stationary points of $f(x)$

Let $f'(c) = 0$; i.e. $x = c$ is a stationary point of $f(x)$.
Stationary points can be classified by either examining
the sign of $f'(x)$ on both sides of $x = c$ or looking at
the sign of $f''(c)$.
(a) If $f''(c) < 0$, $f(x)$ has a local maximum at $x = c$.
(b) If $f''(c) > 0$, $f(x)$ has a local minimum at $x = c$.
(c) If $f''(c) = 0$, the stationary point might be a
maximum, a minimum, or a point of inflection. Examine
the sign of $f'(x)$ on both sides of $x = c$.
(4.2)

Example 4.4. Classify the stationary points of $f(x) = x^3 - 3x$.

The stationary points are where $f'(x) = 0$; that is, where

\[ 3x^2 - 3 = 0, \quad \text{or} \quad x = \pm 1. \]

We need the signs of $f''(\pm 1)$, where

\[ f''(x) = 6x. \]

Then

\[ f''(1) = 6, \]

which is positive, so there is a minimum at $x = 1$. Also

\[ f''(-1) = -6, \]

which is negative, so there is a maximum at $x = -1$.
The values of $f(x)$ at these points are

\[ f(1) = -2, \quad f(-1) = 2; \]

so the graph has the shape shown in Fig. 4.3. Alternatively, we could simply
have checked the signs of $f'(x) = 3x^2 - 3$ on both sides of the stationary points
directly, instead of using the test (4.2).
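
The classification procedure can be sketched in code. This is an illustrative sketch only: the function name `classify_stationary_point` and the offset `eps` used to sample $f'$ on either side of the point are assumptions, and the derivatives must be supplied by hand. If $f'$ happens to vanish at a sampled neighbour, the fallback branch is not reliable.

```python
def classify_stationary_point(f_dash, f_ddash, c, eps=1e-3):
    """Classify a stationary point x = c (where f'(c) = 0)."""
    second = f_ddash(c)
    if second < 0:
        return "maximum"
    if second > 0:
        return "minimum"
    # f''(c) = 0: examine the sign of f'(x) on both sides of c instead
    left, right = f_dash(c - eps), f_dash(c + eps)
    if left > 0 > right:
        return "maximum"
    if left < 0 < right:
        return "minimum"
    return "point of inflection"

# f(x) = x^3 - 3x from Example 4.4
print(classify_stationary_point(lambda x: 3 * x ** 2 - 3, lambda x: 6 * x, 1.0))
print(classify_stationary_point(lambda x: 3 * x ** 2 - 3, lambda x: 6 * x, -1.0))

# the special cases y = x^4, y = -x^4 and y = x^3 at x = 0
print(classify_stationary_point(lambda x: 4 * x ** 3, lambda x: 12 * x ** 2, 0.0))
print(classify_stationary_point(lambda x: -4 * x ** 3, lambda x: -12 * x ** 2, 0.0))
print(classify_stationary_point(lambda x: 3 * x ** 2, lambda x: 6 * x, 0.0))
```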

Example 4.5. In the circuit shown in Fig. 4.4, $V$ is a constant voltage and $R$ and
$x$ represent two resistances: $R$ is fixed and $x$ is variable. The rate of heat generation
$y$ in resistance $x$ is equal to $I^2 x$ where $I$ is the current. Show that $y$ is a maximum
when $x = R$.

Fig. 4.4

Current equals voltage divided by total resistance, so

\[ I = \frac{V}{R + x}. \]

Therefore the rate of heat generation is

\[ y = \frac{V^2 x}{(R + x)^2} = f(x), \]

say. If there is a maximum, it will occur when $f'(x) = 0$. From the quotient rule
(3.2)

\[ f'(x) = \frac{V^2}{(R + x)^4} \left[(R + x)^2 - x \cdot 2(R + x)\right] = V^2 \frac{R - x}{(R + x)^3}. \qquad \text{(i)} \]

This is zero when $x = R$.
To show that $f(x)$ has a maximum when $x = R$ we may work out the sign
of $f''(R)$. From (i),

\[ f''(x) = \frac{V^2}{(R + x)^6} \left[(R + x)^3(-1) - (R - x) \cdot 3(R + x)^2\right] = \frac{V^2}{(R + x)^4} (-4R + 2x). \]

Therefore

\[ f''(R) = -V^2/(8R^3), \]

which is negative, so $x = R$ corresponds to a maximum of $y$.
However, it is easier to look instead at the expression (i) for $f'(x)$. When
$x < R$, we have $f'(x) > 0$, so $f(x)$ is increasing. When $x > R$, we have $f'(x) < 0$,
so $f(x)$ is decreasing. This ensures that a maximum has been obtained without
the need to differentiate again.
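
As a rough numerical cross-check of this result, the heat rate can be tabulated over a grid of resistances and the largest value picked out. The values `V = 10.0` and `R = 5.0` and the grid spacing are assumptions chosen only for illustration; a grid search locates the maximum only to within the spacing.

```python
def heat_rate(x, V=10.0, R=5.0):
    """Rate of heat generation y = V^2 x / (R + x)^2 in the variable resistance x."""
    return V ** 2 * x / (R + x) ** 2

R = 5.0
xs = [0.01 * k for k in range(1, 2001)]   # x from 0.01 to 20.00
best_x = max(xs, key=heat_rate)

print(best_x, heat_rate(best_x))          # compare with x = R and y = V^2 / (4R)
```

Substituting $x = R$ into the formula for $y$ gives $V^2/(4R)$, which is the value the search should reproduce.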

Example 4.6. $x$ and $y$ are two numbers subject to the restriction that $x + y = 1$.
Find the maximum possible value of $xy$.
There are two variables, $x$ and $y$, but we can reduce the problem to one involving
only $x$ by using the fact that $x + y = 1$, so that

\[ y = 1 - x. \qquad \text{(i)} \]

In that case,

\[ xy = x(1 - x) = x - x^2 = f(x), \]

say. Now $f(x)$ has a stationary point (a maximum, minimum, or point of
inflection) where $f'(x) = 0$, that is to say, where

\[ 1 - 2x = 0, \quad \text{or} \quad x = \tfrac{1}{2}. \]

By (4.2), this value of $x$ delivers a maximum, because $f''(x) = -2$ (for any value
of $x$) which is negative. From (i), $y = \tfrac{1}{2}$ when $x = \tfrac{1}{2}$, so the maximum value of
$xy$ is $\tfrac{1}{4}$.

4.3 Exceptional cases of maxima and minima


The method of finding local maxima and local minima by solving
$f'(x) = 0$ reveals only points where the tangent to the graph of
$y = f(x)$ is horizontal. Sometimes there is a maximum or minimum
at an end-point of an interval, even if the graph is not horizontal
there.

Example 4.7. Suppose that the values of $x$ to be considered are restricted to
lying between 0 and 1 inclusive: that is, $0 \le x \le 1$. Find the points on this interval
at which $x - x^2$ takes maximum and minimum values.

The graph of $y = f(x) = x - x^2$ between $x = 0$ and $x = 1$ is shown in Fig. 4.5.
The maximum of $f(x) = x - x^2$ which we found in Example 4.6 at $x = \tfrac{1}{2}$ can
be seen. But, understood in a commonsense way, there are minimum values at
$x = 0$ and $x = 1$, the end points of the restricted interval. These cannot be
detected by the method of differentiation. Whether we are interested in them
would depend on the demands of any practical problem from which the
question originated.

Fig. 4.5 $y = x - x^2$, $0 \le x \le 1$.

In problems of the type illustrated in Example 4.6, this situation
can arise naturally, as in the following example.

Example 4.8. Find the maximum and minimum values of $x^2 - y^2$ on the circle
$x^2 + y^2 = 1$.
It is evident that the point $(x, y)$ can only be on the circle if $x$ and $y$ both have
values between $-1$ and 1 inclusive, that is if

\[ -1 \le x \le 1 \quad \text{and} \quad -1 \le y \le 1. \qquad \text{(i)} \]

A restricted interval therefore arises naturally in the problem. On the circle
$x^2 + y^2 = 1$, we have

\[ y^2 = 1 - x^2, \qquad \text{(ii)} \]

so that

\[ x^2 - y^2 = 2x^2 - 1 = f(x), \qquad \text{(iii)} \]

say. To find the stationary points of $f(x)$ we see that $f'(x) = 4x$, which is zero
when $x = 0$. Also $f''(0) = 4 > 0$, so $x = 0$ is a local minimum of $f(x)$, whose
value is $f(0) = -1$.
However, we have overlooked something. In Fig. 4.6, we show the graph of
$f(x) = 2x^2 - 1$, within the permitted interval $-1 \le x \le 1$. The local minimum
at $x = 0$ can be seen, but there are also maxima at the end points $x = -1$ and
$x = 1$, where $f(x)$ takes the value $+1$.
Alternatively, the maxima at $x = \pm 1$ can be found by substituting for $x$
instead of $y$ at the first stage. Put $x^2 = 1 - y^2$, so that

\[ x^2 - y^2 = 1 - 2y^2 = g(y), \]

say, and solve $g'(y) = 0$: we then find a local maximum at $y = 0$, where $x = \pm 1$.
However, we also lose sight of the minima we found before. The subject is
discussed again in Section 27.2.

Fig. 4.6 $f(x) = 2x^2 - 1$, $-1 \le x \le 1$.
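
One way to avoid overlooking end-point extremes is to compare the function values at the stationary points with those at the two ends of the interval. The sketch below assumes the stationary points have already been found by solving $f'(x) = 0$; the helper name `extremes_on_interval` is hypothetical. When two candidates tie, as the end points do here, only the first is reported.

```python
def extremes_on_interval(f, stationary_points, a, b):
    """Return ((x_min, f_min), (x_max, f_max)) for f on the closed interval [a, b]."""
    # candidates: both end points plus any stationary points strictly inside
    candidates = [a, b] + [c for c in stationary_points if a < c < b]
    values = {x: f(x) for x in candidates}
    x_min = min(values, key=values.get)
    x_max = max(values, key=values.get)
    return (x_min, values[x_min]), (x_max, values[x_max])

def f(x):
    return 2 * x ** 2 - 1      # x^2 - y^2 on the circle, as in (iii)

lowest, highest = extremes_on_interval(f, [0.0], -1.0, 1.0)
print(lowest)    # the local minimum at the stationary point
print(highest)   # an end-point maximum
```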

Another possibility is that there may be points at which the graph
of $y = f(x)$ does not have a definite tangent. Then $f'(x)$ or $dy/dx$
has no meaning at such points. For example, in Fig. 4.7, there is no
tangent at the points A, B, and C. The points A and B could qualify
as local maxima, and C as a local minimum, but at A and C the
graph suddenly changes direction, and at B there is a jump in the
value of $f(x)$. These points cannot be located by solving $f'(x) = 0$,
because $f'(x)$ does not exist at A, B, and C.

Fig. 4.7

4.4 Sketching graphs of functions


To sketch a graph is to indicate its general shape so as to draw
attention to its most important features without being concerned
with accurate plotting. To do this it is necessary for the reader to
have a clear idea of the shape of the graphs of the basic functions

\[ x^a, \quad e^{ax}, \quad \sin ax, \quad \cos ax, \quad \ln x. \]



Example 4.9. Sketch the graph $y = 1 - 1/(1 + x)^2$.

This can be done in stages, as shown in Fig. 4.8. Figure 4.8c is obtained from
4.8b by using the rule (1.11) with $c = 1$; it simply involves sliding the graph
$y = -1/x^2$ one unit to the left. To get from 4.8c to 4.8d, we add 1, which moves
the graph up the $y$ axis by one unit.

In Fig. 4.8d of Example 4.9 we see that, as $x$ increases, becoming
large and positive, the value of

\[ y = 1 - \frac{1}{(1 + x)^2} \]

gets closer and closer to 1. The same is true when $x$ becomes large
and negative. This is obviously an important feature of the graph.
It can be seen to be true by thinking what happens to $y$ when we
put a large value of $x$ into the formula for $y$ (think of a very large
number: $x = 1\,000\,000$ rather than $x = 10$). Then obviously $1/(1 + x)^2$
is very small, so $y$ gets very close to 1, and the larger $x$ becomes,
the nearer $y$ is to 1. The same is true when $x$ is large and negative.
We say that, as $x$ increases, the graph approaches the line
$y = 1$, which in general terms is called an asymptote of the graph.
When $x$ approaches $-1$, the graph approaches the vertical line
$x = -1$; this is also called an asymptote. The two continuous halves
of the graph to the left and right of $x = -1$ are called branches.
Suppose that $y = f(x)$ is to be sketched. A general question to be
asked is ‘What happens to $y$ when $x$ increases towards infinity
(or decreases towards minus infinity)?’ We normally say ‘as $x$
approaches $\pm\infty$’, and as usual indicate the approach by

\[ x \to \pm\infty. \]

For example, $1/x \to 0$ when $x \to -\infty$. Also

\[ \frac{x - 1}{3x + 2} \to \frac{1}{3} \quad \text{when } x \to \infty \ (\text{or } x \to -\infty). \]

To see this, think of the effect of giving $x$ an immense value. Only
the terms $x$ and $3x$ are significant; they are said to dominate the
expression; so

\[ \frac{x - 1}{3x + 2} \approx \frac{x}{3x} = \frac{1}{3}. \]

The limit notation can be used in this context (see Section 2.2).
We can write, for example

\[ \lim_{x \to \infty} \frac{1 - 2x}{1 + x} = -2. \]

The reasoning is the same as in the earlier case: think of a very large
value of $x$.

Very often the function has no definite limit as $x \to \infty$. For
example $\lim_{x \to \infty} \sin x$ does not exist; no definite single number is
approached, since $\sin x$ simply goes up and down between $\pm 1$ for
ever. However, it is quite usual to write, say,

\[ \lim_{x \to \infty} x^2 = \infty, \]

even though $\infty$ is not a number.
Notice the following result.

\[ \lim_{x \to \infty} a x^n e^{-cx} = 0, \qquad (4.3) \]

where $a$ and $n$ are any constants, and $c$ is a positive
constant.

We shall not prove (4.3); but, to convey the feel of it, a table of
values (to two decimal places) is given for the special case of $x^3 e^{-x}$:

x             0     1     2     3     4     8     10
x^3 e^{-x}    0     0.37  1.08  1.34  1.17  0.17  0.05

Fairly large values are needed before the function settles down to
approach zero, because $x^3$ is increasing, and therefore competes with
$e^{-x}$ in the early stage. However, $e^{-x}$ will beat any power of $x$ down
to zero eventually. In the following example, we sketch the graph
of the function in the table without using the calculated values above.
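
A table of this kind is easy to regenerate. The short sketch below evaluates $x^3 e^{-x}$ at the same values of $x$, rounded to two decimal places; the function name `g` is an arbitrary choice made for illustration.

```python
import math

def g(x):
    """The special case x^3 e^(-x) of the result (4.3)."""
    return x ** 3 * math.exp(-x)

for x in [0, 1, 2, 3, 4, 8, 10]:
    print(x, round(g(x), 2))

# far beyond the maximum the exponential factor dominates completely
print(g(100))
```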

Example 4.10. Sketch the graph $y = x^3 e^{-x}$.

Do it in stages, using any easily obtained facts you can think of.
1. Are there any points where it is easy to obtain values? At $x = 0$, $y = 0$.
2. Are there any definite points where $x^3 e^{-x}$ is infinite? There are no such
points.
3. Are there any points where the graph crosses the $x$ axis? Only the point
found in (1).
4. Are there any maxima/minima?

\[ \frac{dy}{dx} = 3x^2 e^{-x} - x^3 e^{-x} = x^2 (3 - x) e^{-x}. \]

This is zero when $x = 0$ and $x = 3$, so these are stationary points.
$dy/dx$ is positive when $x < 3$ (apart from $x = 0$), and negative when $x > 3$,
so $x = 3$ is a maximum. Since $e^3 \approx 20$, at this point $y \approx 1\tfrac{1}{3}$.
Near the other stationary point $x = 0$, $dy/dx$ is positive on both sides, so
$x = 0$ is a point of inflection.
5. Behaviour as $x \to \infty$. According to (4.3), $y \to 0$ as $x \to \infty$.
6. Behaviour as $x \to -\infty$. As $x \to -\infty$, we have $x^3 \to -\infty$ and $e^{-x} \to \infty$
(think, for example, of $x = -1000$). Therefore, $x^3 e^{-x} \to -\infty$ (very rapidly).
The sketch is shown in Fig. 4.9.

Fig. 4.9

Example 4.11. Sketch the graph of $e^{-x/3} \sin 2x$ for $0 \le x \le 2\pi$.

($x$ is assumed to be in radians.) Split the expression into its two factors, $e^{-x/3}$
and $\sin 2x$. These are shown in Fig. 4.10a,b. The value of $e^{-x/3}$ drops to about
$\tfrac{1}{8}$ at $x = 2\pi$. Also, $\sin 2x$ is zero when

\[ 2x = 0, \pi, 2\pi, \dots, 4\pi, \]

or when

\[ x = 0, \tfrac{1}{2}\pi, \pi, \tfrac{3}{2}\pi, 2\pi. \]

Fig. 4.10 (a) (b) (c)

The product of the two is shown in Fig. 4.10c. The graph crosses the $x$ axis
(i.e. $y = 0$) where $\sin 2x = 0$, and nowhere else. The height of the peaks and
troughs of $e^{-x/3} \sin 2x$ are estimated by the size of the factor $e^{-x/3}$, shown as a
broken line, which multiplies the maxima and minima of $\sin 2x$. The new
maxima and minima do not occur at exactly the same points: it is left to the
reader to show that the new maxima and minima occur at values of $x$ which
satisfy the equation $\tan 2x = 6$.
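
The condition $\tan 2x = 6$ can be checked numerically. The sketch below assumes the function is $e^{-x/3} \sin 2x$, which is the reading consistent with the stated condition, and differentiates it by the product rule; the names `y` and `y_dash` are illustrative.

```python
import math

def y(x):
    return math.exp(-x / 3) * math.sin(2 * x)

def y_dash(x):
    # product rule applied to e^(-x/3) sin 2x
    return math.exp(-x / 3) * (2 * math.cos(2 * x) - math.sin(2 * x) / 3)

# tan 2x = 6 gives 2x = arctan 6 + k*pi; keep the solutions in 0 <= x <= 2*pi
stationary = [(math.atan(6) + k * math.pi) / 2 for k in range(4)]

for x in stationary:
    print(round(x, 4), round(y(x), 4), abs(y_dash(x)) < 1e-12)

# the first peak of sin 2x alone is at pi/4, slightly to the right
print(round(math.pi / 4, 4))
```

Each stationary point lies a little to the left of the corresponding peak or trough of $\sin 2x$, because the decaying factor pulls the turning points towards smaller $x$.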

It is useful to be able to distinguish between the behaviour of
functions such as those shown in Fig. 4.11a,b. The function $1/x$ of
Fig. 4.11b is infinite at $x = 0$, but the sign changes across $x = 0$. The
terminology used to describe $y = 1/x$ near $x = 0$ is
$y \to -\infty$ as $x \to 0$ from the left;
$y \to \infty$ as $x \to 0$ from the right.

Example 4.12. Sketch the graph of $y = 1/[(x - 2)(x + 1)]$.

Look out for the obvious things first. At $x = 0$, we have $y = -\tfrac{1}{2}$. The
function is infinite at $x = 2$ and $x = -1$. It does not cross the $x$ axis
anywhere.
Now consider the sign of $1/[(x - 2)(x + 1)]$. It is positive when $x < -1$ (try
e.g. $x = -3$). It is positive when $x > 2$ (try e.g. $x = 3$). It is negative when
$-1 < x < 2$, which is linked with the facts that the graph does not cross the $x$
axis and, as we already know, that $y$ is negative when $x = 0$.
We know now that

$y \to \infty$ as $x \to -1$ from the left;

$y \to -\infty$ as $x \to -1$ from the right;

$y \to -\infty$ as $x \to 2$ from the left;

$y \to \infty$ as $x \to 2$ from the right;

so Fig. 4.12 is emerging.

Fig. 4.11

Fig. 4.12 $y = 1/[(x - 2)(x + 1)]$.

We now locate precisely the obvious maximum between $x = -1$ and 2, and
make sure there are no other stationary points. By the reciprocal rule (3.2b),
we have

\[ \frac{dy}{dx} = -\frac{1}{(x - 2)^2 (x + 1)^2} \frac{d}{dx} [(x - 2)(x + 1)] = -\frac{2x - 1}{(x - 2)^2 (x + 1)^2}. \]

This is zero at $x = \tfrac{1}{2}$ and nowhere else. There is no need to use the test (4.2);
the point can only be a maximum since there are no other stationary points.
The value of $y$ there is $-\tfrac{4}{9}$.

We now return to asymptotes and show that there can be
asymptotes that slope. Consider the function

\[ y = \frac{x^2 - 1}{2x + 1} \]

when $x$ is large, positive or negative. The term $x^2 - 1$ is dominated
by $x^2$; meaning that the part $-1$ is negligible compared with $x^2$
when $x$ is large. Likewise the dominant term in $2x + 1$ is $2x$. It is
therefore obvious that

\[ y \to \pm\infty \quad \text{when } x \to \pm\infty. \]

However, we can do much better than this, because it can be seen
by polynomial division that

\[ \frac{x^2 - 1}{2x + 1} = \tfrac{1}{2}x - \tfrac{1}{4} - \frac{\tfrac{3}{4}}{2x + 1}. \]

Therefore the graph will approach the straight line

\[ y = \tfrac{1}{2}x - \tfrac{1}{4} \]

when $x$ is large. As in the earlier instances we have seen, the line
$y = \tfrac{1}{2}x - \tfrac{1}{4}$ is said to be an asymptote of the original graph. The
notation

\[ y \sim \tfrac{1}{2}x - \tfrac{1}{4} \quad \text{when } x \to \pm\infty \]

is sometimes used, meaning that the curve approaches the line
$y = \tfrac{1}{2}x - \tfrac{1}{4}$ when $x$ is large. The curve is sketched in Example 4.13.
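
The approach to a sloping asymptote can be seen numerically. The sketch below assumes the function $y = (x^2 - 1)/(2x + 1)$ and the line $y = \tfrac{1}{2}x - \tfrac{1}{4}$; polynomial division leaves the remainder term $-\tfrac{3}{4}/(2x + 1)$, which is the gap between curve and line. The sample values of $x$ are arbitrary.

```python
def y(x):
    return (x ** 2 - 1) / (2 * x + 1)

def asymptote(x):
    return 0.5 * x - 0.25

for x in [10.0, 100.0, 1000.0, -1000.0]:
    gap = y(x) - asymptote(x)
    remainder = -0.75 / (2 * x + 1)   # remainder term from the polynomial division
    print(x, gap, remainder)
```

The gap shrinks roughly in proportion to $1/x$, and it changes sign between large positive and large negative $x$, so the curve lies below the line on the right and above it on the left.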

In the same way, a function may be asymptotic to a curve
as $x \to \pm\infty$. For example, if

\[ y = \frac{1}{x} + \frac{1}{x^2} \sin x, \]

then

\[ y \sim \frac{1}{x} \quad \text{when } x \to \pm\infty. \]

Example 4.13. Sketch the graph of $y = (x^2 - 1)/(2x + 1)$.

The curve cuts the $x$ axis ($y = 0$) at $x = \pm 1$. Also $y = -1$ when $x = 0$. The
function is infinite at $x = -\tfrac{1}{2}$, so the line $x = -\tfrac{1}{2}$ is a vertical asymptote.
Also, as shown above, the straight line $y = \tfrac{1}{2}x - \tfrac{1}{4}$ is an asymptote for large
values of $x$. This is shown as the broken line in Fig. 4.13.
From the quotient rule (3.2a),

\[ \frac{dy}{dx} = \frac{(2x + 1)(2x) - (x^2 - 1) \cdot 2}{(2x + 1)^2} = \frac{2(x^2 + x + 1)}{(2x + 1)^2}. \]

This is never zero, because the equation $x^2 + x + 1 = 0$ has no real solutions.
Therefore, there are no stationary points.

Fig. 4.13

4.5 Estimating small changes

Let

\[ y = f(x). \]

Suppose that the value of $x$ changes by a small amount $\delta x$. Then $y$
will change by a small amount $\delta y$. There is a simple approximate
relation between $\delta y$ and $\delta x$ which is important for practice and
theory.
Fix a particular value of $x$: say $x = a$; the small deviation $\delta x$ will
be made from this value of $x$. The derivative at $x = a$ is $f'(a)$.
According to (2.8), to obtain $f'(a)$ we take a nearby point $x = a + \delta x$
and form the ratio

\[ \frac{\delta y}{\delta x} = \frac{f(a + \delta x) - f(a)}{\delta x} \]

and

\[ \frac{\delta y}{\delta x} \to f'(a) \quad \text{as } \delta x \to 0. \]

If $\delta x$ is small enough, $\delta y/\delta x$ will become close in value to $f'(a)$:

\[ \frac{\delta y}{\delta x} \approx f'(a), \]

so that

\[ \delta y \approx f'(a)\,\delta x. \]

This is how to obtain an approximation to the change $\delta y$ in $y$
due to a small change from $x = a$ to $x = a + \delta x$. It is easier to
remember the result in the form

\[ \delta y \approx \frac{dy}{dx}\,\delta x, \]

near a general point $x$ (which again shows the usefulness of the
$dy/dx$ notation in suggesting true results). We call this the
incremental approximation for functions of a single variable.

Incremental approximation

For a small increment $\delta x$ from $x = a$:
(a) $\delta y \approx f'(a)\,\delta x$;
(b) (Mnemonic form) $\delta y \approx \dfrac{dy}{dx}\,\delta x$.
(4.4)

Example 4.14. Let $y = x + 1/x$. Estimate the change $\delta y$ in $y$ when $x$ changes
from $x = 2$ to $x = 1.8$. Compare the estimate with the exact value of $\delta y$.

Put

\[ y = x + \frac{1}{x} = f(x). \]

Then

\[ \frac{dy}{dx} \ \text{or} \ f'(x) = 1 - \frac{1}{x^2}, \]

so that $f'(2) = 0.75$. Here $\delta x = 1.8 - 2.0 = -0.2$; so, by (4.4a),

\[ \delta y \approx 0.75 \times (-0.2) = -0.15. \]

(The exact value is given by $\delta y = (1.8 + 1/1.8) - (2.0 + 1/2.0) = -0.1444\cdots$.)
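
The comparison between estimate and exact change is simple to automate. In the sketch below the helper name `incremental_estimate` is hypothetical, and the derivative is supplied by hand rather than computed.

```python
def incremental_estimate(f_dash, a, dx):
    """Incremental approximation: delta y is roughly f'(a) * delta x."""
    return f_dash(a) * dx

def f(x):
    return x + 1 / x

def f_dash(x):
    return 1 - 1 / x ** 2

a, dx = 2.0, -0.2
estimate = incremental_estimate(f_dash, a, dx)
exact = f(a + dx) - f(a)

print(estimate, exact, estimate - exact)
```

Halving `dx` should reduce the discrepancy by roughly a factor of four, since the neglected term is proportional to $(\delta x)^2$.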

Example 4.15. The volume $V$ of a sphere of radius $r$ is given by $V = \tfrac{4}{3}\pi r^3$.
Estimate the change in volume if the radius increases from 2.0 to 2.1 metres.
We shall use the letters that the question offers, considering $\delta V$ and $\delta r$. Put

\[ V = \tfrac{4}{3}\pi r^3 = f(r). \]

Then

\[ f'(r) = \frac{dV}{dr} = 4\pi r^2, \]

so, by (4.4b),

\[ \delta V \approx 4\pi r^2\,\delta r. \qquad \text{(i)} \]

(Notice that $4\pi r^2$ is the formula for the surface area of a sphere: the change in
volume is nearly equal to the surface area times the thickness $\delta r$.) Now put $r = 2$:

\[ f'(2.0) = 16\pi \quad \text{and} \quad \delta r = 2.1 - 2.0 = 0.1. \]

Then, by (i),

\[ \delta V \approx 16\pi \times 0.1 = 5.02. \]

The exact value is $\delta V = 5.282\cdots$ (cubic metres), implying an error of 5% in
our estimate.

The number $\delta V \approx 5.02$ in Example 4.15 might not seem to
qualify as a small change. Furthermore, if we express the identical
problem in different units, say in centimetres rather than metres, the
numbers are even larger; then $\delta r$ becomes 10 (cm) and $\delta V$ is about
$5 \times 10^6$ (cm$^3$). On the other hand, if the units had been kilometres
then $\delta r$ and $\delta V$ would have looked very small indeed. But nothing
at all is changed except the units of measurement. We still get only
a 5% error in the estimate. The reason for this is that the ratio

\[ \text{Estimated } \delta V / \text{Exact } \delta V = [f'(r)\,\delta r] / [f(r + \delta r) - f(r)] \]

is dimensionless; that is to say it is unaffected by the choice of units.
There is no easy way to predict when the method will work well:
geometrically speaking, we are content to guess that the graph sticks
sufficiently closely to its tangent line at $a$ between $a$ and $a + \delta x$.
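
The independence of the ratio from the choice of units can be illustrated directly. This sketch recomputes the ratio for the sphere of Example 4.15 with the radius expressed in metres, centimetres, and kilometres; the function names are illustrative assumptions.

```python
import math

def sphere_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3

def estimate_to_exact_ratio(r, dr):
    """Ratio of the incremental estimate of delta V to the exact delta V."""
    estimated = 4 * math.pi * r ** 2 * dr
    exact = sphere_volume(r + dr) - sphere_volume(r)
    return estimated / exact

ratio_m = estimate_to_exact_ratio(2.0, 0.1)          # metres
ratio_cm = estimate_to_exact_ratio(200.0, 10.0)      # centimetres
ratio_km = estimate_to_exact_ratio(0.002, 0.0001)    # kilometres

print(ratio_m, ratio_cm, ratio_km)
```

All three ratios agree to within rounding error and sit about 5% below 1, whatever the size of the raw numbers.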

Example 4.16. The cosine rule for a triangle ABC is

\[ c^2 = a^2 + b^2 - 2ab \cos C. \]

In a triangle for which $a = 3$ and $b = 4$, estimate the change in $c$ when $C$
increases from 60° to 65°.

Put in the fixed numbers, $a = 3$ and $b = 4$; then

\[ c^2 = 25 - 24 \cos C \]

or

\[ c = (25 - 24 \cos C)^{1/2} = f(C), \]

say. By the chain rule (3.3) with $u = 25 - 24 \cos C$,

\[ f'(C) = \frac{dc}{dC} = \tfrac{1}{2} (25 - 24 \cos C)^{-1/2} (24 \sin C). \]

The quantity $\delta C$ must be measured in radians, because radian measure was
assumed in obtaining the derivatives of the sin and cos functions. So we put

\[ C = 60° = \tfrac{1}{3}\pi \text{ radians}, \quad \delta C = \tfrac{5}{180}\pi = 0.087 \text{ radians}. \]

We know that $\cos C = \tfrac{1}{2}$ and $\sin C = \tfrac{1}{2}\sqrt{3}$, so

\[ f'(\tfrac{1}{3}\pi) = 6\sqrt{3}/\sqrt{13}. \]

Therefore, by the incremental approximation (4.4),

\[ \delta c \approx (6\sqrt{3}/\sqrt{13}) \times 0.087 = 0.25. \]

(The exact change is $\delta c = 0.2489\cdots$.)

4.6 Numerical solution of equations: Newton’s method
It is often necessary to solve equations for which there is no standard
method of solution. (In fact, this is true for practically all equations.)
Simple examples are the equations

\[ x^4 + x^3 - 1 = 0 \quad \text{and} \quad e^{-x} - x = 0. \]

For such cases, there are many methods for obtaining numerical
solutions, which are applicable no matter how complicated the
equation is. We describe one of them here.
To apply the method, it is necessary first of all to obtain at least
a rough idea of the location of the solution we are seeking. There
are various ways of doing this: for example we can plot a rough
graph. Taking the first example above, the graph

\[ y = x^4 + x^3 - 1 \]

is sketched in Fig. 4.14 using only five values of $x$; namely $-1.5$,
$-1$, 0, 0.5, and 1. The solutions occur where it crosses the axis; there
seems to be one not far from $-1.3$ and one not far from 0.8.

Fig. 4.14 $y = x^4 + x^3 - 1$.

Suppose now that we have a general equation to solve:

\[ f(x) = 0; \]

and that, by drawing its graph, or by some other method, we have
established that one of its solutions is not far from the value

\[ x = x_0, \]

say. We show how to locate this solution accurately.
Figure 4.15 shows one possibility for the shape of the graph of
$y = f(x)$ close to its (unknown) solution $x = c$, say, corresponding
to the point C. (If the graph is different from this, the discussion is
much the same.) The initial estimate $x = x_0$ corresponds to $A_0$. The
point $A_0$ could be on the left of the solution C as shown, or on the
right; we are not likely to be sure: again the argument is much the
same (see Problem 4.14).

Perform, in imagination, the following steps.

1. Start at the point $A_0$, where $x = x_0$. Draw the perpendicular
$A_0B_0$, intersecting the curve at $B_0$. Construct the tangent at $B_0$, and
continue it to intersect the $x$ axis at $A_1$, where $x = x_1$. Then $A_1$ is
nearer to the solution C than $A_0$.
2. Repeat the process, starting with the improved estimate $A_1$.
We arrive at $A_2$, where $x = x_2$, which is a still better approximation.
3. Using the new estimates as starting values as they arise,
keep repeating the process to produce a sequence of approximations

\[ A_0\ (x = x_0), \quad A_1\ (x = x_1), \quad A_2\ (x = x_2), \quad A_3\ (x = x_3), \ \dots \]

and stop when the accuracy attained is satisfactory.

The steps 1-3 can be carried out algebraically:

1. Starting with $A_0$ ($x = x_0$), the equation of the tangent line at
$B_0$ is given by

\[ \frac{y - f(x_0)}{x - x_0} = f'(x_0). \]

At $A_1$ ($x = x_1$), we have $y = 0$, so

\[ \frac{-f(x_0)}{x_1 - x_0} = f'(x_0). \]

Therefore, the new approximation $A_1$ ($x = x_1$) is given by

\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \]

2. $x_1$ takes the place of $x_0$, and $x_2$ takes the place of $x_1$, so

\[ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}. \]

3. Once the $n$th approximation $x_n$ is available, the $(n + 1)$th value
$x_{n+1}$ is given by

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \]

This process, in which essentially we do exactly the same thing
over and over again, using for each step the information obtained
from the previous step, is called a step-by-step process or an
iterative process. It is summarized in the following algorithm (or
recipe), known as Newton’s method.

Newton's method for the numerical solution of $f(x) = 0$

Find a value $x = x_0$ sufficiently close to the solution
required. Then carry out the following step-by-step
process until the desired accuracy is obtained:

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \qquad (4.5) \]

for $n = 0, 1, 2, 3, \dots$, successively.
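
The recipe translates almost line for line into code. The following is a minimal sketch under stated assumptions: the function name `newton`, the stopping tolerance `tol`, and the cap `max_steps` are illustrative choices, not part of the method as stated, and the derivative must be supplied by the user. It is applied to the two equations mentioned at the start of this section.

```python
import math

def newton(f, f_dash, x0, tol=1e-10, max_steps=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n); return the list of all x_n."""
    xs = [x0]
    for _ in range(max_steps):
        x = xs[-1]
        slope = f_dash(x)
        if slope == 0:
            raise ZeroDivisionError("horizontal tangent: the step is undefined")
        x_next = x - f(x) / slope
        xs.append(x_next)
        if abs(x_next - x) < tol:   # successive values agree: stop
            break
    return xs

quartic = newton(lambda x: x ** 4 + x ** 3 - 1,
                 lambda x: 4 * x ** 3 + 3 * x ** 2, 0.8)
expo = newton(lambda x: math.exp(-x) - x,
              lambda x: -math.exp(-x) - 1, 0.5)

print([round(x, 5) for x in quartic[:4]])
print([round(x, 5) for x in expo[:4]])
```

Returning the whole sequence rather than only the final value makes it possible to inspect how quickly the iterates settle down.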

The following example works through the equation with which
we opened the section.

Example 4.17. The equation $x^4 + x^3 - 1 = 0$ has a solution near $x = 0.8$.
Find it to five-decimal accuracy.
We have

\[ x_0 = 0.8, \quad f(x) = x^4 + x^3 - 1, \quad f'(x) = 4x^3 + 3x^2. \]

Then, in (4.5),

\[ x_{n+1} = x_n - \frac{x_n^4 + x_n^3 - 1}{4x_n^3 + 3x_n^2}. \]

Starting with $x_0 = 0.8$, we obtain the following table:

n         0          1          2          3
x_n       0.80000    0.81976    0.81917    0.81917
f(x_n)    -0.07840   0.00247    0.00000    0.00000

Evidently we do not have to pursue the sequence any further.

Example 4.18. The equation $e^{-x} = x$ has a solution near to $x = 0.5$. Find the
solution accurately to five decimal places.
We have

\[ x_0 = 0.5, \quad f(x) = e^{-x} - x, \quad f'(x) = -e^{-x} - 1. \]

From (4.5),

\[ x_{n+1} = x_n - \frac{e^{-x_n} - x_n}{-e^{-x_n} - 1} = \frac{x_n + 1}{e^{x_n} + 1} \]

(the last step for simplicity of calculation). We obtain the following sequence:

x_0        x_1        x_2        x_3
0.50000    0.56631    0.56714    0.56714

The repetitive process described is easy to program for a computer
for individual cases as they arise, and then the complexity of the
equation is of no importance. The same program can be adapted to
scan a range of $x$ in order to get a provisional idea of where the
solutions are to be found. A simple program combined for safety’s
sake with inspection of the whole sequence of values output would
satisfy most requirements.
However, to write a program which will automatically, without
intervention, find the solutions for any function $f(x)$ that might be
presented to it is a very different matter. For example, we would
have to find means to be absolutely sure that none of the possible
tangents would by chance carry us an irrecoverable distance away
from the solution we are seeking (see e.g. Problem 4.15) by designing
a way of automatically recognizing and rectifying the situation if it
occurs.

Problems

4.1. (See Section 4.1 on the ‘dash' notation). The func¬ 4.8. Sketch the graphs of the following functions.
tion / is defined by f(u) = u2. Obtain the following. (a) l/(x2 + 1) (this is an even function: see (1.12)).
d (b) e~*2. (c) x/(x — 1). (d)xe_x.
(a)/'(f); (b) /'(f); (c) — /(f2); (e) x2 e~-v. (f) x3 e~x. (g) e2x — 4 ex.
df
(h) (In x)/x for x > 0 ((In x)/x —> 0 when x -* oo; this
d can be proved by putting x = e“ and letting u —► oo).
(d)/'(f*); (e) — /(fT); (f)/"(p). (i) [ln( —x)]/x for x < 0 (compare (h)).
df
(j) x In x — x for x > 0 (x In x -> 0 when x -> 0; this
can be seen by writing x = e_" and letting u -> oo).
4.2. (See Section 4.2). Find the stationary points of (k) sin 1/x (Start by finding where it crosses the axis,
the following functions and classify them as maxima, using the fact that sin u = 0 when u = 0, + 7t, ±271,
minima, or points of inflection. ... .)
(a) -Y2 — y; (b) y2 — 2y — 3; (c) y In x (y > 0); (l) (x2 — l)2 (This is an even function: see (1.12)).
(d) .re/ (e) 1/(y2 + 1); (f) y2 — 3y + 2; (m) x(x2 — l)2 (This is an odd function: see (1.12)).
(g) ex + e~v; (h) y2 + 4y + 2; (i) y — x3; (n) (sin x)/x (You will not be able to find the exact
(j) y2(y — 1); (k) sin y — cos y (in 0 < y < 2ti). positions of the maxima and minima; be content
(1) sin y cos y ( —7i < y < 7i); (m) e~x sin x; to indicate the trend. It is an even function: see
(n) e~ >x sin 2x (see Example 4.11); (1.12) . For the value approached at x = 0, see
(o) x — cos y; (p) 2 ev — \ e2v; (2.13) .)
(q) y2 e-A; (r) (In x)/x (y > 0); (s) (1 — x)3;
(t) sin3 y; (u) e~x’; (v) ev’ v; 4.9. Sketch the graphs of the following functions.
(w) y + x- *; (x) y3 e~x. (a) l/(x2 — 1) (Hint: write x2 — 1 = (x + l)(x — 1), and
then follow Example 4.12; alternatively, sketch y =
x2 — 1 and imagine taking its reciprocal).
4.3. Let v = J'(u{y)). Use two successive applications
(b) x/(x2 — 1). (c) l/x(x — 2).
of the chain rule in the form of Example 4.2c to show
(d) x3/(l — x) (Hint: see the note on curved asymp¬
that
totes following Example 4.12)
d2 (e) (x + 2)/(x — 1) (See the hint in (d)).
y = /"(«(Y))[U'(x)]2 + /'(t/(Y)X(x). (f) l/(x + 1) + l/(x + 2).
d.\-

Show that if/'(«) is always greater than zero, or always 4.10. (See Section 4.5). Find the approximate value of
less than zero, then /(«(y)) and u(x) have the same the change 8y in y due to a small change 5x in x using
stationary points. Consider, e.g. Problem 4.2v in this the incremental approximation (4.4) in the following
connection, with f(u) = e" and u(x) = x1 — x: it becomes cases. Compare the approximate and exact values of 8y.
rather obvious. (a) y = x3 when x = 2 and 8x = 0.1;
(b) y = x sin x when x = §7i and 8x = —0.2;
4.4. A rectangular piece of ground is to be marked out, (c) y = cos x when x = 5J1 and 8x = 0.1;
which must have a given area A. Find the dimen¬ (d) y = (1 + x)/( 1 — x) when x = 2 and 8x = —0.2;
sions of the plot which requires the minimum length of (e) y = tan x when x = and 8x = 0.1;
perimeter fence. (This is a ‘restricted’ problem, like (f) y = 1/(1 — x2) when x = 0.5 and 8x = +0.1.
Example 4.6. Call the sides x and y.)
4.11. (a) If the focal length of a lens is f and a viewed
object is at distance u, then the image is at distance v
4.5. A tunnel cross section is to have the shape of a
where v = uf/(u — /). Let / = 0.75 (m). Find approxi¬
rectangle surmounted by a semicircular roof. The total
mately the change in v if u changes from 1.25 to
cross-sectional area must be A, but the perimeter mini¬
1.30 (m).
mized to save building costs. Find its dimensions.
(b) In a Wheatstone bridge circuit, the out-of-balance
voltage v is given by
4.6. A circular-cylindrical oil drum is required to have a
given surface area (including its lid and base). Find v = £(P,P4 - + R2)(R3 + R4),
the proportions of the design which contain the greatest where E is the applied voltage and R,, R2, R3, R4
volume. represent the resistances in the branches. Suppose that
E = 5, Ri = 4, R2 = 2, R3 = 6, and R4 = 3, so that
4.7. Solve Problem 4.6 for the case when the lid is not the circuit is initially balanced. Obtain an approximate
included in the restriction. expression for 5v in terms of a small change 8R, in R,.

(c) In a triangle ABC with corresponding sides a, b, c, and make the construction based on Fig. 4.15 to explain
the formula a = b sin A/sin B applies. Show that 5a x why this is so.
— a cot B 5B.
(d) In a triangle ABC with corresponding sides a, b, c, 4.15. (a) Supposing y = /(x) to have a continuous graph,
the area A is given by illustrate graphically that the following principle is true:
A = [.■>(.<> — a)(s — b)(s — c)]±, If f(a) and f(b) have opposite signs, then there is at
least one solution of the equation /(x) = 0 in the range
where s = \(a + b + c). Find an approximate expression
for 6/4 in terms of 5a. (Hint: use logarithmic differentia¬ a < x < b.
tion to shorten the working.) Estimate 8/4 when a = 2, (b) (Computational) The equation ex — 3x = 0 has
b = 4, c — 5, and 8c = 0.1, with a and b remaining exactly two solutions, and they are in the range 0 < x <
constant. 2.5. Use the principle in (a) to narrow the ranges in which
they are known to lie, so as to produce starting values
4.12. (Computational). Growth on a deposit by com¬ x0 for Newton’s method. One systematic technique is to
pound interest is given by the formula C — P( 1 + r)", start with the given end points x = 0 and 2.5, then to
where P is the amount deposited, r is the compound- halve the interval repeatedly, considering the signs at the
interest annual growth rate, n is the time of deposit ends of the subdivisions.
in years or fractions of a year, and C is the accumulated
balance. Obtain approximating expressions for 5C when 4.16. (Computational), (a) Suppose that an equation
(a) r changes by a small amount 5r; (b) n changes by a /(x) = 0 is known to have exactly one solution in a
small amount 5n. (c) Consider plausible values of P, r, particular finite interval a < x < b. Write a program,
and n, and experiment with the accuracy of the formulae using the principle described in Problem 4.15, to obtain
for various values of 8r and 5n. a closer starting value x0 for Newton’s method. (Since
there is only one solution, any subdivision you find
4.13. (Newton's method. Section 4.6). It is not too across which the sign of /(x) does not change can be
difficult to make these calculations on a hand-held ignored. Arrange for the process to stop when the
calculator. Find the solutions of the following equa¬ solution is located within a small preset interval of length
tions within the broad ranges indicated, which contain £.)
exactly one solution. (b) Try this with, say, the equation x(ex — 1) = 1,
(a) x4 + 2x2 — x — 1 = 0 (range 0.5 < x < 1); whose single solution lies between 0 and 1.
(b) x4 + x* - 1 = 0 (0.5 < x < 0.75); (c) By choosing E to be very small, the process can
(c) x In x = —0.3 (0.1 < x < 0.2); by itself locate the solution to any degree of accuracy
(d) ev = 4x3 (0 < x < 1); if E is small enough. (This is called the bisection method
(e) tan x = 2x (0 < x < jn); for solving equations.) Obtain the number of iterations
(f) (e'sin x)/(l + x) = 2 (1.5 < x < 1.9). required to locate the single solution of /(x) = 0 in
0 < x < 1 to 2, 4, and 6 decimal accuracy. (The number
4.14. The equation /(x) = xe"'+ 1 = 0 is known to of iterations is the same for any such equation.)
have exactly one solution (not far from x = —0.6). (d) Solve the equation in (b) by Newton’s method
Demonstrate numerically that it is of no use to start to 2, 4, and 6 decimal accuracy, starting with x = 0.5,
off Newton’s method for this equation with a value and compare the number of iterations required with
of x greater than 1. Sketch the graph of the function, the number required by the bisection method.
Taylor series and approximations

Contents
5.1 The index notation for derivatives of any order
5.2 Taylor polynomials
5.3 A note on infinite series
5.4 Infinite Taylor expansions
5.5 Manipulation of Taylor series
5.6 Approximations for large values of x
5.7 Taylor series about other points
Problems

5.1 The index notation for derivatives of any order


We shall use yet another standard notation for derivatives in this
chapter. Since we shall have to keep track of derivatives of high
orders we modify the ‘dash’ notation of (4.1) as follows to provide
a brief form:

Index notation for derivatives

For the first, second, third, ... derivatives respectively
of $f(x)$, write
\[ f'(x) = f^{(1)}(x), \quad f''(x) = f^{(2)}(x), \quad f'''(x) = f^{(3)}(x), \ldots. \tag{5.1} \]
If $y = f(x)$, the notation $y^{(1)}, y^{(2)}, y^{(3)}, \ldots$, is also used.

Thus if $f(x) = x^3$, then $f^{(1)}(x) = 3x^2$, $f^{(2)}(x) = 6x$, and $f^{(3)}(x) = 6$.
As with the dash notation, we encounter such forms as $f^{(2)}(u) = 6u$,
$f^{(2)}(0) = 0$ and $f^{(2)}(x - c) = 6(x - c)$.

5.2 Taylor polynomials


Firstly we shall show how to obtain approximations to a given $f(x)$
for use when $x$ is a small number. Suppose, for example, that
\[ f(x) = \frac{1}{1 - x}. \]
Since $f(0) = 1$, we can be sure that
\[ \frac{1}{1 - x} \approx 1 \]
so long as $x$ is small enough. This is shown in Fig. 5.1a. It is, of
course, a poor approximation, acceptable only very close to $x = 0$.
A better approximation near $x = 0$ is given by the equation of
the tangent line at $x = 0$ (Fig. 5.1b). Since $f^{(1)}(x) = 1/(1 - x)^2$, the
slope at $x = 0$ is $f^{(1)}(0) = 1$.
The equation of the tangent line at $x = 0$ is therefore $y = 1 + x$, so
\[ \frac{1}{1 - x} \approx 1 + x \]
when $x$ is small enough.
We need a way to continue improving the approximation, $P(x)$
say, to a further stage and beyond. At present we have reached the
tangent approximation $P(x) = 1 + x$, which was chosen so that
$P(0) = f(0)$ and $P^{(1)}(0) = f^{(1)}(0)$. To obtain the next approximation
choose a $P(x)$ which also matches the second derivative at $x = 0$:
\[ P(0) = f(0), \quad P^{(1)}(0) = f^{(1)}(0), \quad P^{(2)}(0) = f^{(2)}(0). \]
This involves adding a term in $x^2$, and we can choose its coefficient
so that the extra condition is satisfied without disturbing the two
terms we have already found. Continuing with the example,
\[ f^{(2)}(x) = \frac{2}{(1 - x)^3}, \quad\text{so}\quad f^{(2)}(0) = 2. \]
It is easy to check that $P(x) = 1 + x + x^2$ satisfies the three
conditions. Therefore
\[ \frac{1}{1 - x} \approx 1 + x + x^2 \]
is an improved approximation. This represents the parabolic curve
shown in Fig. 5.1c.
We can carry out this process for any function $f(x)$, and take it
to any level of approximation we wish. Successive approximations
will consist of polynomials of increasing degree. However, we must
not expect too much of it: we cannot go too far from the origin and
still expect a good approximation.
To deal with the general case, we need the following simple result.

Derivatives of a polynomial in x at x = 0

\[ P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_N x^N \]
is a polynomial of degree $N$. Then
\[ P(0) = a_0, \quad P^{(1)}(0) = a_1, \quad P^{(2)}(0) = 2!\,a_2, \tag{5.2} \]
and in general
\[ P^{(n)}(0) = n!\,a_n \]
for $n = 1, 2, 3, \ldots, N$.

It is easy to verify (5.2) by working out the first few derivatives.


Now suppose that we wish to approximate to a general function
$f(x)$ (near $x = 0$) by means of a polynomial
\[ P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_N x^N. \]
We require that
\[ P(0) = f(0), \quad P^{(1)}(0) = f^{(1)}(0), \quad P^{(2)}(0) = f^{(2)}(0), \ \ldots. \]
According to (5.2), the coefficients are given by
\[ a_0 = P(0) = f(0), \]
\[ a_1 = \frac{1}{1!} P^{(1)}(0) = \frac{1}{1!} f^{(1)}(0), \]
\[ a_2 = \frac{1}{2!} P^{(2)}(0) = \frac{1}{2!} f^{(2)}(0), \]
\[ a_3 = \frac{1}{3!} P^{(3)}(0) = \frac{1}{3!} f^{(3)}(0), \]
and so on. By writing the coefficients $a_n$ in terms of the known values
$f^{(n)}(0)$ we obtain the Taylor polynomial approximation:

Taylor polynomial P(x) of degree N near x = 0

Let $P(x)$ be the $N$th degree polynomial
\[ f(0) + \frac{1}{1!} f^{(1)}(0)\,x + \frac{1}{2!} f^{(2)}(0)\,x^2 + \cdots + \frac{1}{N!} f^{(N)}(0)\,x^N. \tag{5.3} \]
Then for $x$ sufficiently close to zero,
\[ f(x) \approx P(x). \]

Example 5.1. Obtain a fifth-degree polynomial which approximates to $e^x$ for
values of $x$ which are not too large.
Use (5.3), putting $f(x) = e^x$. This case is simple:
\[ f(x) = f^{(1)}(x) = f^{(2)}(x) = \cdots = f^{(5)}(x) = e^x, \]
so $f(0) = f^{(1)}(0) = f^{(2)}(0) = \cdots = f^{(5)}(0) = 1$. Therefore
\[ e^x \approx 1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 + \frac{1}{5!}x^5 = P(x). \]
(If we take higher degree approximations, the terms continue according to the
same rule.) We show $e^x$ and its approximation $P(x)$ in the following table for
a few values of $x$.

x       -4        -3        -2        -1        -0.5
e^x      0.0183    0.0498    0.1353    0.3679    0.6065
P(x)    -3.533    -0.6500    0.0667    0.3666    0.6065

x        0    0.5      1        2        3        4
e^x      1    1.6487   2.7183   7.3891   20.086   54.598
P(x)     1    1.6487   2.7167   7.2667   18.400   42.867

The approximating polynomial P(x) clings to the true values for a considerable
range around the origin.
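
A table of this kind is easy to generate by machine. The sketch below evaluates the polynomial (5.3) from a list of derivative values at the origin; the helper name `taylor_poly` is an assumption introduced for illustration.

```python
import math


def taylor_poly(derivs_at_0, x):
    """Evaluate the sum of f^(n)(0) x^n / n! over the supplied derivative values."""
    return sum(d * x**n / math.factorial(n) for n, d in enumerate(derivs_at_0))


# For f(x) = e^x every derivative at x = 0 equals 1; six values give degree 5.
derivs = [1.0] * 6

for x in (-4, -3, -2, -1, -0.5, 0, 0.5, 1, 2, 3, 4):
    print(x, round(math.exp(x), 4), round(taylor_poly(derivs, x), 4))
```

Lengthening the list `derivs` raises the degree of the polynomial, which is a quick way of seeing how the agreement improves away from the origin.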

Example 5.2. (a) Obtain the Taylor polynomial approximation of any degree $N$
for the function $1/(1 - x)$ near $x = 0$. (b) Obtain an expression for the error in
the approximation.
(a) Putting $1/(1 - x) = f(x)$, the sequence of derivatives of $f(x)$ is
\[ f^{(1)}(x) = \frac{1}{(1 - x)^2}, \quad f^{(2)}(x) = \frac{2 \cdot 1}{(1 - x)^3}, \quad f^{(3)}(x) = \frac{3 \cdot 2 \cdot 1}{(1 - x)^4}, \]
and in general
\[ f^{(n)}(x) = \frac{n!}{(1 - x)^{n+1}}. \]
Therefore, referring to (5.3), the Taylor polynomial of degree $N$ is
\[ 1 + x + x^2 + x^3 + \cdots + x^N. \]
(b) The error in an estimation using this approximation is equal to
\[ P(x) - f(x) = 1 + x + \cdots + x^N - \frac{1}{1 - x}
   = \frac{(1 - x)(1 + x + x^2 + \cdots + x^N) - 1}{1 - x}
   = \frac{-x^{N+1}}{1 - x}. \]
The reader should experiment with this expression using various values of $N$
and $x$. (i) If $x$ is very small, the error involved is very small even if $N$ is only 2 or
3. (ii) If we take any fixed value of $x$ in the range $-1 < x < 1$, the error will
approach zero when we take approximations of higher and higher degree
(because, when $-1 < x < 1$, $x^{N+1}$ approaches zero as $N$ increases: try this
numerically with, say, $x = 0.9$). (iii) The approximation fails altogether if $x > 1$
or $x < -1$. The error will be large, and to increase $N$ will make it still larger
because $|x^{N+1}|$ increases when $N$ increases.
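
The suggested experiment might be carried out as follows. This is a minimal sketch; the function names and the particular values of x and N are arbitrary choices made here. It compares the error computed directly with the closed form derived in (b).

```python
def geometric_poly(x, N):
    """Taylor polynomial 1 + x + ... + x^N for 1/(1 - x)."""
    return sum(x**n for n in range(N + 1))


def error(x, N):
    """P(x) - f(x), computed directly."""
    return geometric_poly(x, N) - 1 / (1 - x)


for x in (0.1, 0.9, 1.5):
    for N in (2, 10, 50):
        closed_form = -x**(N + 1) / (1 - x)
        print(x, N, error(x, N), closed_form)
```

For x = 0.9 the error shrinks slowly as N grows, while for x = 1.5 it grows without limit, in line with remarks (ii) and (iii).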

5.3 A note on infinite series


In the previous section we did not put any limit on the degree of
the approximating polynomial, and there seems to be no reason
why we should not let the terms run on for ever: in fact, let the
degree $N$ approach infinity. If we extend the polynomial approximation
of Example 5.1 for $f(x) = e^x$, we obtain an example of a
so-called infinite series:
\[ 1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots \quad\text{or}\quad \sum_{n=0}^{\infty} \frac{x^n}{n!}. \]
It might be that by extending the approximating polynomials in this
way, approximation will become equality, so that the sum of the
series will be equal to the original function instead of being just an
approximation to it, but this is only true with reservations.

There are many types of infinite series (see, e.g. Chapter 26 on
Fourier series). Consider first what is meant by the sum of an
infinite series. When $x$ is given any particular value, the terms to
be added become simply numbers. We cannot in practice add an
infinite number of numbers: no matter how many operations we
carry out we never reach the end. However, this does not mean that
the infinite series does not add up to a definite number, only
that we cannot reach it exactly by simply piling on more and more
terms.
Consider the simpler infinite series that we get from putting
$x = 0.1$ into the Taylor polynomial for $1/(1 - x)$ (see Example 5.2),
and letting the degree increase to infinity. It is the geometric series
\[ 1 + 0.1 + 0.1^2 + 0.1^3 + 0.1^4 + \cdots. \]
This is the same as
\[ 1 + 0.1 + 0.01 + 0.001 + 0.0001 + \cdots. \]
If we record the sum of 1, 2, 3, 4, ..., terms successively, we obtain
what is called a sequence of partial sums ('partial' because we
only take a finite number of terms into account). The sequence is
\[ 1,\ 1.1,\ 1.11,\ 1.111,\ 1.1111,\ \ldots. \]
The number that is being approached is obviously 1.11111...,
which is equal to 10/9. This number is equal to the value of $1/(1 - x)$
when $x = 0.1$, so in this case the infinite series has delivered the value
required. Similarly, if we put $x = \tfrac{1}{2}$, the infinite series is
\[ 1 + \tfrac{1}{2} + (\tfrac{1}{2})^2 + (\tfrac{1}{2})^3 + (\tfrac{1}{2})^4 + \cdots = 1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \tfrac{1}{16} + \cdots. \]
For the sum of 1, 2, 3, 4, ... terms, we obtain the sequence of partial
sums
\[ 1,\ 1\tfrac{1}{2},\ 1\tfrac{3}{4},\ 1\tfrac{7}{8},\ 1\tfrac{15}{16},\ \ldots, \]
which is obviously approaching the value 2, and this is the value of
$1/(1 - x)$ when $x = \tfrac{1}{2}$. Infinite series whose partial sums approach a
definite value as we take more and more terms are said to converge
to this value, which is called the sum of the infinite series.
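
The sequences of partial sums quoted above can be reproduced directly. The following sketch uses the standard-library function `itertools.accumulate`; the helper name `partial_sums` is an illustrative assumption.

```python
from itertools import accumulate


def partial_sums(x, count):
    """First `count` partial sums of the geometric series 1 + x + x^2 + ..."""
    return list(accumulate(x**n for n in range(count)))


print(partial_sums(0.1, 5))   # approaches 10/9
print(partial_sums(0.5, 5))   # approaches 2
```

Taking more terms shows the partial sums settling ever closer to 1/(1 - x), which is what convergence means here.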
However, not all infinite series converge. For example, if we form
successively the sum of 1, 2, 3, ... terms of the infinite series
\[ 1 + 1 + 1 + \cdots, \]
then we obtain
\[ 1,\ 2,\ 3,\ 4,\ \ldots, \]
which is obviously going to infinity. The infinite series
\[ 1 - 1 + 1 - 1 + \cdots \]
has the successive partial sums
\[ 1,\ 0,\ 1,\ 0,\ 1,\ \ldots, \]
which is not going anywhere. Such series are said to diverge. The
reader might be surprised to know that the infinite series
\[ 1 + \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{4} + \cdots \]
diverges: the partial sums go to infinity. (It is worth experimenting
with this series: even using a computer you might take a while to
convince yourself that it really does diverge.)
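
The experiment might look like the sketch below, assuming the series meant is the one with terms 1/n. The function name is illustrative. The natural logarithm is printed alongside because the partial sums of this series are known to grow at about the same rate as ln n, which explains why the divergence is so slow.

```python
import math


def harmonic_partial_sum(terms):
    """Sum of 1/n for n = 1, ..., terms."""
    return sum(1 / n for n in range(1, terms + 1))


for terms in (10, 1000, 100000):
    print(terms, harmonic_partial_sum(terms), math.log(terms))
```

Multiplying the number of terms by 100 adds only about 4.6 to the sum, so no finite computation can demonstrate the divergence; it can only make it plausible.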

5.4 Infinite Taylor expansions


We return to the subject of general Taylor polynomials of the type
(5.3) when we extend the polynomial to an infinite number of terms,
so that we have an infinite series instead of a polynomial expression.
This is called a Taylor series or an infinite Taylor expansion
about the origin $x = 0$ for the function $f(x)$.
The mathematical theory of infinite series, and in particular of
Taylor series, cannot be discussed in this book. In the previous
section it is indicated that pitfalls might arise when the polynomials
are extended into infinite series. Moreover, it seems obvious, for
example, that the values of a function and its derivatives at the origin
only cannot possibly predict values elsewhere if we allow functions
to be completely arbitrary at other points.
However, the ordinary functions do follow the simple pattern
illustrated by the case of $f(x) = 1/(1 - x)$ in Example 5.2. Each
function has an individual range of values of $x$, called its interval
of validity, in which the Taylor series converges to the exact
value of $f(x)$. Elsewhere, the series must not be used for approximation.
In Problem 5.2, the reader is invited to verify the coefficients
in the series in Table (5.4).
Equality is achieved over the intervals of validity stated: the
infinite series exactly represents the original function. Notice (5.4d):
it is the binomial theorem (see Appendix A(c))
extended to arbitrary values of $a$, positive or negative (if $a$ is a
positive integer $N$, it ends at the term in $x^N$).
An important special case is the infinite geometric series with
common ratio $x$:
\[ 1 + x + x^2 + \cdots = \frac{1}{1 - x}. \]
This is obtained by putting $a = -1$ and $-x$ in place of $x$ in (5.4d).
It is valid for $-1 < x < 1$.
When a series is used to provide approximations by taking
only a finite number of terms, it is necessary to estimate how many
terms to take so as to obtain a desired degree of accuracy. It is
usually sufficient to observe the size of the terms involved, as in the
following example.

Standard Taylor expansions about x = 0                                   (5.4)

Function and expansion                                       Interval of validity

(a) $e^x = 1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \cdots$                 any $x$

(b) $\sin x = x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots$            any $x$

(c) $\cos x = 1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots$            any $x$

(d) $(1 + x)^a = 1 + ax + \frac{a(a-1)}{2!}x^2 + \frac{a(a-1)(a-2)}{3!}x^3 + \cdots$      $-1 < x < 1$

(e) $\ln(1 + x) = x - \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 - \cdots$        $-1 < x \le 1$

(If a series is cut short after the term in $x^N$, the
result is an $N$th-degree Taylor polynomial
approximation.)

Example 5.3. Find how many terms of the Taylor series for $\sin x$ are needed to
obtain three-decimal accuracy over the range $-1 < x < 1$ (in radians).
The intuitive requirement is that we should stop at the point where we can see
that taking further terms is not likely to affect the third decimal place. The
magnitude (modulus) of the terms in (5.4b) increases when the magnitude of $x$
increases, so it should be sufficient to provide an approximation good for the
largest value, $x = 1$. The magnitudes of successive terms when $x = 1$ are equal to
\[ 1,\ 0.16,\ 0.0083,\ 0.0002,\ 2 \times 10^{-6},\ \ldots. \]
It is therefore enough to retain three terms of the series; that is to say we should
retain powers of $x$ up to $x^5$. To three decimals, then,
\[ \sin x \approx x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 \quad\text{for } -1 < x < 1. \]
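
The claim can be checked numerically. The sketch below, in which the function name and the sampling grid are assumptions, finds the largest discrepancy between sin x and the three-term polynomial on a fine grid over the range.

```python
import math


def sin_poly(x, terms=3):
    """First `terms` nonzero terms of the Taylor series (5.4b) for sin x."""
    return sum((-1)**k * x**(2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))


grid = [i / 1000 for i in range(-1000, 1001)]
worst = max(abs(math.sin(x) - sin_poly(x)) for x in grid)
print(worst)
```

The largest error occurs at the ends of the range and is below 0.0005, so the third decimal place is not affected.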

5.5 Manipulation of Taylor series


We can obtain new Taylor series from the standard ones in (5.4).

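Example 5.4. Find the Taylor expansion about $x = 0$ for the function $(2 - x)^{1/2}$,
and state its range of validity.
Write
\[ (2 - x)^{1/2} = 2^{1/2}(1 - \tfrac{1}{2}x)^{1/2} = 2^{1/2}[1 + (-\tfrac{1}{2}x)]^{1/2}. \]
We can use the binomial expansion (5.4d), with $a = \tfrac{1}{2}$ and with $-\tfrac{1}{2}x$ in place
of $x$. The expansion will be valid, provided that $-1 < -\tfrac{1}{2}x < 1$, i.e. when
$-2 < x < 2$. Therefore
\[ (2 - x)^{1/2} = 2^{1/2}[1 + (-\tfrac{1}{2}x)]^{1/2} \]
\[ = 2^{1/2}\left[1 + \tfrac{1}{2}(-\tfrac{1}{2}x) + \frac{\tfrac{1}{2}(\tfrac{1}{2} - 1)}{2!}(-\tfrac{1}{2}x)^2 + \frac{\tfrac{1}{2}(\tfrac{1}{2} - 1)(\tfrac{1}{2} - 2)}{3!}(-\tfrac{1}{2}x)^3 + \cdots\right] \]
\[ = 2^{1/2}\left(1 - \tfrac{1}{4}x - \tfrac{1}{32}x^2 - \tfrac{1}{128}x^3 + \cdots\right) \]
when $-2 < x < 2$.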

To find the first few terms in the Taylor series for a composite
function /(x) such as

/(*) =
(1 +x)*’

it is usually best not to start from first principles by calculating /(0),


/(1)(0), /<2)(0) and so on, which can lead to great complication, but
to manipulate standard expansions as in the following examples.

Example 5.5. Approximate to $(\sin x/x)^2$ by a polynomial of degree 4, and
compare the approximate and exact values when $x = 0, 1, 2$.
From (5.4b),
\[ \left(\frac{\sin x}{x}\right)^2 = \left(1 - \frac{1}{3!}x^2 + \frac{1}{5!}x^4 - \cdots\right)^2 \approx 1 - \tfrac{1}{3}x^2 + \tfrac{2}{45}x^4, \]
where only terms up to $x^4$ are retained. Write the approximating polynomial
$P(x)$ as
\[ P(x) = 1 - 0.3333x^2 + 0.0444x^4 \]
to obtain the table

x               0    0.25     0.5      1.0      2.0
[sin x/x]^2     1    0.9793   0.9194   0.7081   0.2067
P(x)            1    0.9793   0.9194   0.7111   0.3772
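
The comparison can be tabulated by machine. In the sketch below the names `exact` and `approx` are illustrative, and the coefficients 1/3 and 2/45 are those obtained by squaring the truncated series for (sin x)/x and keeping terms up to the fourth power.

```python
import math


def exact(x):
    """(sin x / x)^2, with the limiting value 1 used at x = 0."""
    return 1.0 if x == 0 else (math.sin(x) / x) ** 2


def approx(x):
    """Degree-4 polynomial 1 - x^2/3 + 2x^4/45."""
    return 1 - x**2 / 3 + 2 * x**4 / 45


for x in (0, 0.25, 0.5, 1.0, 2.0):
    print(x, round(exact(x), 4), round(approx(x), 4))
```

The agreement is close for small x and deteriorates markedly by x = 2, as expected of a polynomial built from information at the origin.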

Example 5.6. Approximate to $e^{-x}/(1 + x)^{1/2}$ near $x = 0$ by a polynomial of
degree 2.
Write
\[ f(x) = e^{-x}(1 + x)^{-1/2}. \]
Use (5.4a) with $-x$ in place of $x$, and carry it to degree 2:
\[ e^{-x} \approx 1 - x + \frac{1}{2!}x^2. \]
Also, by (5.4d) (the binomial theorem) with $a = -\tfrac{1}{2}$,
\[ (1 + x)^{-1/2} \approx 1 + (-\tfrac{1}{2})x + \frac{(-\tfrac{1}{2})(-\tfrac{1}{2} - 1)}{2!}x^2. \]
Then by multiplying the two polynomials we obtain
\[ f(x) \approx 1 - \tfrac{3}{2}x + \tfrac{11}{8}x^2 \]
when $x$ is small. (Reject powers higher than 2 in the final product - they would
not be correct since we neglected such terms in the original approximations.)
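
Multiplying truncated series and rejecting the higher powers is mechanical, and can be sketched with exact fractions as follows. The function name `multiply_truncated` is an assumption; coefficient lists are written lowest power first.

```python
from fractions import Fraction as F


def multiply_truncated(p, q, degree):
    """Multiply two coefficient lists, dropping all powers above `degree`."""
    result = [F(0)] * (degree + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= degree:
                result[i + j] += a * b
    return result


exp_minus_x = [F(1), F(-1), F(1, 2)]      # 1 - x + x^2/2
inv_sqrt = [F(1), F(-1, 2), F(3, 8)]      # 1 - x/2 + 3x^2/8
product = multiply_truncated(exp_minus_x, inv_sqrt, 2)
print(product)
```

The three coefficients printed are those of the constant, x, and x squared terms of the product.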

Example 5.7. Obtain the first three nonzero terms of the Taylor expansion for
$1/\cos x$.
There are several ways of doing this problem.

Working from (5.3). The reader might try this, but it is rather arduous.

Using the power series for cos x. Write
\[ \frac{1}{\cos x} = \frac{1}{1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots}. \]
The problem is to find the first three terms in the reciprocal of the infinite series;
we then have a Taylor polynomial. Anticipate that only the even powers of $x$
will occur, as in the expansion of $\cos x$. Then we expect
\[ \frac{1}{\cos x} = \frac{1}{1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots} = b_0 + b_2 x^2 + b_4 x^4 + \cdots. \]
We have to find $b_0$, $b_2$, $b_4$. To do this, cross-multiply:
\[ 1 = \left(1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots\right)(b_0 + b_2 x^2 + b_4 x^4 + \cdots) \]
\[ = b_0 + (b_2 - \tfrac{1}{2}b_0)x^2 + (b_4 - \tfrac{1}{2}b_2 + \tfrac{1}{24}b_0)x^4 + \cdots \]
(retaining only powers up to $x^4$). Match the coefficients of powers of $x$ on both
sides, starting with the constant term; we obtain
\[ b_0 = 1 \]
and, since the coefficients of $x^2$ and $x^4$ on the left are zero,
\[ b_2 - \tfrac{1}{2}b_0 = 0 \quad\text{and}\quad b_4 - \tfrac{1}{2}b_2 + \tfrac{1}{24}b_0 = 0. \]
The last two equations can be solved successively to give
\[ b_2 = \tfrac{1}{2} \quad\text{and}\quad b_4 = \tfrac{5}{24}. \]
Finally
\[ 1/\cos x \approx 1 + \tfrac{1}{2}x^2 + \tfrac{5}{24}x^4. \]

Polynomial division. We can evaluate $1/(1 - \tfrac{1}{2}x^2 + \tfrac{1}{24}x^4 - \cdots)$ by long division,
setting it out like this, ignoring powers higher than $x^4$:

                                1 + (1/2)x^2 + (5/24)x^4
                              ---------------------------
  1 - (1/2)x^2 + (1/24)x^4 )  1
                   subtract:  1 - (1/2)x^2 + (1/24)x^4
                              ---------------------------
                                  (1/2)x^2 - (1/24)x^4
                   subtract:      (1/2)x^2 - (1/4)x^4
                              ---------------------------
                                             (5/24)x^4
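
The coefficient-matching method amounts to a recurrence: each new coefficient of the reciprocal is fixed by requiring the corresponding power of x in the product to vanish. A minimal sketch with exact fractions is given below; the function name `reciprocal_series` is an assumption, and coefficient lists are written lowest power first (including the zero coefficients of odd powers).

```python
from fractions import Fraction as F


def reciprocal_series(a, terms):
    """Coefficients b_0, ..., b_(terms-1) of 1/(a_0 + a_1 x + ...); needs a_0 != 0."""
    b = [F(1) / a[0]]
    for n in range(1, terms):
        # Coefficient of x^n in the product must be zero:
        # a_0 b_n + a_1 b_(n-1) + ... + a_n b_0 = 0
        s = sum(a[k] * b[n - k] for k in range(1, min(n, len(a) - 1) + 1))
        b.append(-s / a[0])
    return b


cos_coeffs = [F(1), F(0), F(-1, 2), F(0), F(1, 24)]   # cos x up to x^4
sec_coeffs = reciprocal_series(cos_coeffs, 5)
print(sec_coeffs)
```

Only the first five coefficients are trustworthy here, because the series for cos x was itself cut off after the term in the fourth power.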

5.6 Approximations for large values of x


When x is large, 1/x is small. This fact can sometimes be used to
obtain approximations valid when x is large, as in the following
example.

Example 5.8. Obtain a three-term approximation to $(1 + 1/x)^{1/2}$ valid when $x$ is
large enough.
Translate the binomial theorem, (5.4d), with $a = \tfrac{1}{2}$ in terms of a neutral variable,
say $u$:
\[ (1 + u)^{1/2} = 1 + \tfrac{1}{2}u + \frac{\tfrac{1}{2}(\tfrac{1}{2} - 1)}{2!}u^2 + \cdots \quad\text{when } -1 < u < 1, \]
so $(1 + u)^{1/2} \approx 1 + \tfrac{1}{2}u - \tfrac{1}{8}u^2$ when $u$ is small enough, the approximation
improving as $u$ gets smaller. Now put $u = 1/x$; we obtain
\[ \left(1 + \frac{1}{x}\right)^{1/2} \approx 1 + \frac{1}{2x} - \frac{1}{8x^2}, \]
when $x$ is large enough (positively or negatively), the approximation improving
as $x$ gets larger.

5.7 Taylor series about other points


The Taylor series about $x = 0$ for
\[ f(x) = \frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots \]
does not work when $x = 2$: we get $1 + 2 + 2^2 + \cdots$, which is
infinite. However, we can obtain a different Taylor-type series which
represents $1/(1 - x)$ near $x = 2$ by a process which amounts to
changing the origin, as in the following example.

Example 5.9. Find a Taylor-type series which represents $1/(1 - x)$ for values of
$x$ near $x = 2$.

Look for a series of this type:
\[ \frac{1}{1 - x} = b_0 + b_1(x - 2) + b_2(x - 2)^2 + \cdots, \]
because we want a series that works when $x$ is close to 2; which is to say when
$x - 2$ is small, rather than when $x$ is small as before. Therefore we need a series
consisting of powers of $x - 2$. We can bring the element $x - 2$ into view by
writing
\[ \frac{1}{1 - x} = \frac{1}{1 - (x - 2 + 2)} = -\frac{1}{1 + (x - 2)}. \]
Now expand the final term by using (5.4d) (the binomial theorem) with $a = -1$,
and $x - 2$ in place of $x$, obtaining
\[ \frac{1}{1 - x} = -1 + (x - 2) - (x - 2)^2 + (x - 2)^3 - \cdots, \]
valid if $-1 < x - 2 < 1$, that is, if
\[ 1 < x < 3. \]
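
A quick numerical check of the series in powers of x - 2 is sketched below. The function name `series_about_2` and the numbers of terms are illustrative assumptions.

```python
def series_about_2(x, terms):
    """Partial sum of -1 + (x - 2) - (x - 2)^2 + ... with `terms` terms."""
    u = x - 2
    return -sum((-u)**n for n in range(terms))


for x in (2.0, 2.5, 1.2, 3.5):
    print(x, series_about_2(x, 60), 1 / (1 - x))
```

For x between 1 and 3 the partial sums agree with 1/(1 - x); at x = 3.5, outside the interval of validity, they are wildly wrong and grow with the number of terms.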

Example 5.10. Obtain a Taylor series about the point $x = \pi$ for the function
$\cos x$.

There exists already the series (5.4c) which is valid at $x = \pi$. However, if we are
interested in approximating to $\cos x$ near $x = \pi$, an expansion in powers of
$x - \pi$ should be more economical and expressive than one consisting of powers
of $x$. We show two ways of finding the series.

(a) On the lines of Example 5.9. Write
\[ \cos x = \cos[\pi + (x - \pi)] = \cos\pi \cos(x - \pi) - \sin\pi \sin(x - \pi) = -\cos(x - \pi). \]
We can use (5.4c) to expand this, by putting $x - \pi$ in place of $x$. We obtain
\[ \cos x = -\cos(x - \pi) = -1 + \frac{1}{2!}(x - \pi)^2 - \frac{1}{4!}(x - \pi)^4 + \cdots. \]
This is valid for all values of $x$. A two-term approximation shows that $\cos x$ has
a parabolic shape near $x - \pi = 0$ or $x = \pi$, where $\cos x$ has a local minimum.

(b) Matching the value and the derivatives at $x = \pi$. The derivatives of $f(x) = \cos x$
at $x = \pi$ are given by
\[ f(\pi) = \cos\pi = -1, \quad f^{(1)}(\pi) = -\sin\pi = 0, \quad f^{(2)}(\pi) = -\cos\pi = 1, \]
and so on. The same relations hold good between the coefficients of a polynomial
in powers of $x - \pi$ and the values of its derivatives at $x = \pi$, as was stated in
(5.3) for polynomials in $x$ at $x = 0$. We simply put $x - \pi$ in place of $x$ in (5.3).
The required Taylor series is
\[ f(x) = f(\pi) + \frac{1}{1!} f^{(1)}(\pi)(x - \pi) + \frac{1}{2!} f^{(2)}(\pi)(x - \pi)^2 + \cdots, \]
which is the same as the result obtained in (a).

The general result is the following:

Taylor series about a point x = c

\[ f(x) = f(c) + \frac{1}{1!} f^{(1)}(c)(x - c) + \frac{1}{2!} f^{(2)}(c)(x - c)^2 + \cdots. \tag{5.5} \]

(The range of validity depends upon $f(x)$.)
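
The general formula can be evaluated from a list of derivative values at the chosen point. The sketch below uses the cosine about the point pi as a test case; the name `taylor_about` and the choice of eight terms are assumptions made for illustration.

```python
import math


def taylor_about(c, derivs_at_c, x):
    """Evaluate the sum of f^(n)(c) (x - c)^n / n! over the supplied derivatives."""
    return sum(d * (x - c)**n / math.factorial(n)
               for n, d in enumerate(derivs_at_c))


# Successive derivatives of cos repeat in a cycle of four: cos, -sin, -cos, sin
cycle = [math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t), math.sin]
derivs = [cycle[n % 4](math.pi) for n in range(8)]

for x in (3.0, 2.5, 4.0):
    print(x, taylor_about(math.pi, derivs, x), math.cos(x))
```

Near the expansion point a handful of terms already gives close agreement, which is the practical reason for expanding about a point close to the values of interest.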

Problems

5.1. Obtain a four-term Taylor polynomial approxi¬ 5.6. (See Section 5.5). Find the first three nonzero terms
mation valid near x = 0 for each of the following. in the Taylor series at x = 0 for the following.
Estimate the ranges of x over which three-term poly¬ (a) e~x/(l + x); (b)(l-x)*ex;
nomials will give two-decimal accuracy (you cannot (c) [ln(l - x)]2/x2.
usually tell until you have seen the next-higher term),
(a) e*x; (b) (1 + x)*; (c) (1 + x)“/
5.7. (See Example 5.7). Find the first three nonzero
(d) sin 2x; (e) cos /x; (f) ln( 1 + x);
terms in the Taylor expansions at the origin for the
(g) (1 + x2/ (hint: consider (1 -I- u)/ then put u = x2); following.
(h) ln(l + 3x) (see the hint in (g)).
(a) 1/(1 + ln[l + x]); (b) tan x; (c) 1/(1 + ex);
(d) tanh x, or (ex — e_x)/(ex + e~x). (It is less compli¬
5.2. Verify the coefficients of each of the infinite Taylor cated if you firstly reduce this to a more manageable
series shown in (5.4) (taking the ranges of validity for form);
granted); namely (e) x/sin x.
(a) ex; (b) sin x; (e) cos x;
(d) (1 + x)a; (e)ln(l+x).
5.8. (See Section 5.6). Find a three-term approximation,
valid for large enough values of x, in each of the following
5.3. For each of the following series give the Taylor
cases.
polynomial having the lowest degree which you think
will safely give four-decimal accuracy over the ranges
given.
(a) ex over — 2 ^ x ^ 2;
«(,-!)/ (b,ta(.+l):
(b) sin x over — 2 ^ x < 2; (c) x*/(l + *)/ (d) ln(l + x + x2)
(c) cos x over — 2 ^ x < 2; (e) 1/sin (x_1).
(d) (1 + x)* over — 0.5 ^ x ^ 0.5;
(e) ln(l -F x) over — 0.5 ^ x ^ 0.5.
5.9. (a) Show that 1/sin x % (1/x) + £x when x is non¬
zero but small enough.
5.4. Obtain the first two nonzero terms in the Taylor
(b) Show that, when x is large enough, (1 + x)* «
series at the origin for the following.
x* + l/2x*
(a) arcsin x; (b) arccos x; (c) arctan x;
(c) Show that, when x is large enough, (2 + x/ —
(d)e_xsinx; (e)e_xcosx.
(1 + x)* % l/2x*.
(d) Show that, when x is nonzero but small enough,
5.5. (See Section 5.5). Find three nonzero terms in the
then 1/(1 - cos x)* % (Jl/x) + (x^/2/24).
Taylor series at x = 0 for the following functions and
state the ranges of validity.
(a) 1/(1 + 3x); (b) 1/(2 — x); (c) (3 - x)+; 5.10. (a) (See Section 5.6). Write
(d) (x - 3)’; (e) ln(9 - x); (f) cos ^x;
In x = ln(l + [x - 1]),
(g) sin xi (Consider the series for sin u; then put u = xi).
This is not strictly a Taylor series - it is spoken of as ‘a and so obtain the Taylor series for In x about x = 1.
Taylor series in x*’ - but it still is useful for approxima¬ State the range of validity.
tions. (b) Obtain the Taylor series about x = for cos x,
(h) cos xi. and state the range of validity.

(c) Obtain the Taylor series about x = 1 for the (a) By considering the first three terms, rediscover
function (1 + x)-, and state the range of validity. the conditions on /"(c) which determine the type of
stationary point (see (4.2)).
5.11. Suppose that /(x) has a stationary point at x = c. (b) By considering further terms of the Taylor series,
Write down the form of its Taylor series about x = c, extend the criteria to obtain a general rule which covers
taking this into account. the case /"(c) = 0.
Complex numbers

Contents
6.1 Definitions and rules
6.2 The Argand diagram and complex numbers
6.3 Complex numbers in polar coordinates
6.4 Complex numbers in exponential form
6.5 The general exponential form
6.6 Hyperbolic functions
6.7 Miscellaneous applications
Problems

6.1 Definitions and rules


The quadratic equation
\[ x^2 - 2x + 2 = 0 \]
can be written in the form
\[ (x - 1)^2 = -2 + 1 = -1 \]
using the method known as 'completing the square'. If we solve this
equation formally by taking the square root, then
\[ x - 1 = \pm\sqrt{-1} \quad\text{or}\quad x = 1 \pm \sqrt{-1}. \]
However, there is no ordinary number whose square is $-1$. We call
$\sqrt{-1}$ an 'imaginary' number, and denote it by the symbol j. It will
be treated very like an ordinary number. However, if $j^2$ appears it
may be replaced by $-1$. Expressions involving j, such as the
solutions
\[ x = 1 \pm j \]
of the quadratic equation above, are called complex numbers. The
symbol j is common in engineering and electronics but i is used
instead in mathematics. The complex roots of the quadratic equation
can be expressed as
\[ x = 1 + j, \text{ and } x = 1 - j, \quad\text{or}\quad x = 1 \pm j. \]

The general quadratic equation

\[ ax^2 + bx + c = 0, \tag{6.1} \]

can be solved as follows. By completing the square, the left-hand side of this equation can be written as

\[ ax^2 + bx + c = a\left(x^2 + \frac{b}{a}x\right) + c = a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}. \]

The quadratic equation (6.1) becomes

\[ \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}. \]

Taking the square root,

\[ x + \frac{b}{2a} = \pm\frac{\sqrt{b^2 - 4ac}}{2a}. \]

Hence the roots of the quadratic equation (6.1) are

\[ x = -\frac{b}{2a} \pm \frac{1}{2a}\sqrt{b^2 - 4ac}. \]

The roots are distinct real numbers if $b^2 > 4ac$, equal and real if $b^2 = 4ac$, and complex numbers if $b^2 < 4ac$. In the last case they can then be written in terms of j as

\[ x = -\frac{b}{2a} \pm j\,\frac{1}{2a}\sqrt{4ac - b^2}, \]

where $\sqrt{4ac - b^2}$ is a real number.
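The root formula can be tried out numerically. The following is a minimal sketch in Python, assuming the standard `cmath` module (whose square root accepts negative arguments); the function name `quadratic_roots` is ours, and Python writes the text's j as `1j`.

```python
import cmath

def quadratic_roots(a: float, b: float, c: float) -> tuple[complex, complex]:
    """Roots of a*x**2 + b*x + c = 0 (a != 0) from the completed-square formula."""
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)  # complex when b^2 < 4ac
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

# The opening example x^2 - 2x + 2 = 0 has b^2 < 4ac, so the roots are complex.
first_root, second_root = quadratic_roots(1, -2, 2)
print(first_root, second_root)
```

The two roots printed for this equation are the pair $1 \pm j$ found above.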


A complex number in standard form is any number of the form

z = x + jy,
where x and y are real numbers. In this expression x is the real part
of z, written as x = Re z, and y is the imaginary part, written as
y = Im z (note that it is y, not jy, which is called the imaginary part.)
If y = 0, then z is a real number, and if x = 0, then z is an
imaginary number.
We need to put together rules for manipulating complex numbers: the rules are natural consequences of operations of addition, multiplication, etc. on numbers containing j. The only exception to normal algebra is that, whenever $j^2$ appears, we can substitute $-1$.

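Example 6.1. Express $j^2$, $j^3$, $j^4$, $j^5$, and $j^6$ in standard form.
The standard forms are

\[ j^2 = -1, \qquad j^3 = j^2 j = -1 \times j = -j. \]

Since $-j$ can be written $0 + (-1)j$, it follows that $\mathrm{Re}\, j^3 = 0$ and $\mathrm{Im}\, j^3 = -1$.

\[ j^4 = j^2 j^2 = (-1)(-1) = 1, \qquad j^5 = j j^4 = j, \qquad j^6 = j j^5 = j^2 = -1. \]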
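The cycle of powers can be checked with a short sketch in Python, whose built-in complex type writes the imaginary unit as `1j`; the dictionary name `powers` is ours.

```python
# Python writes the text's j as 1j.
j = 1j

# Integer powers of j repeat with period 4: j, -1, -j, 1, j, ...
powers = {n: j ** n for n in range(2, 7)}

for n, value in powers.items():
    print(f"j^{n} = {value}  (Re = {value.real}, Im = {value.imag})")
```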

Example 6.2. Express the following in standard form, and state the real and imaginary parts in each case:
(a) $(2 + j) - (3 + 3j)$; (b) $j(j + 2)$; (c) $(1 - j)(1 + 2j)$; (d) $(2 - 3j)(2 + 3j)$.

(a) $(2 + j) - (3 + 3j) = 2 + j - 3 - 3j = -1 - 2j$.
Real part $= -1$; imaginary part $= -2$.
(b) $j(j + 2) = j^2 + 2j = -1 + 2j$.
Real part $= -1$; imaginary part $= 2$.
(c) $(1 - j)(1 + 2j) = 1 + 2j - j - 2j^2 = 1 + 2j - j + 2 = 3 + j$.
Real part $= 3$; imaginary part $= 1$.
(d) $(2 - 3j)(2 + 3j) = 2^2 - (3j)^2 = 4 - 9j^2 = 4 + 9 = 13$.
Real part $= 13$; imaginary part $= 0$.

Let $z_1 = x_1 + jy_1$ and $z_2 = x_2 + jy_2$. In formal terms, the principal rules are as follows.

1. Two complex numbers $z_1$ and $z_2$ are said to be equal if and only if $x_1 = x_2$ and $y_1 = y_2$: we write $z_1 = z_2$.

2. The sum of two complex numbers $z_1$ and $z_2$ is given by

\[ z_1 + z_2 = (x_1 + jy_1) + (x_2 + jy_2) = (x_1 + x_2) + j(y_1 + y_2), \]

its real part being the sum of the real parts, and its imaginary part the sum of the imaginary parts of $z_1$ and $z_2$.

3. Similarly the difference is

\[ z_1 - z_2 = (x_1 + jy_1) - (x_2 + jy_2) = (x_1 - x_2) + j(y_1 - y_2). \]

4. The product of $z_1$ and $z_2$ is

\[ \begin{aligned}
z_1 z_2 &= (x_1 + jy_1)(x_2 + jy_2) \\
&= x_1x_2 + jy_1x_2 + x_1\,jy_2 + jy_1\,jy_2 \\
&= x_1x_2 + jy_1x_2 + jx_1y_2 + j^2y_1y_2 \\
&= x_1x_2 + jy_1x_2 + jx_1y_2 - y_1y_2 \quad (\text{since } j^2 = -1) \\
&= (x_1x_2 - y_1y_2) + j(y_1x_2 + x_1y_2).
\end{aligned} \]
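Rule 4 can be written out as a small function. This is an illustrative sketch in Python; the name `multiply` is ours, and the result is compared with Python's built-in complex multiplication.

```python
def multiply(z1: complex, z2: complex) -> complex:
    """Product by rule 4: (x1*x2 - y1*y2) + j(y1*x2 + x1*y2)."""
    x1, y1 = z1.real, z1.imag
    x2, y2 = z2.real, z2.imag
    return complex(x1 * x2 - y1 * y2, y1 * x2 + x1 * y2)

# Example 6.2(c): (1 - j)(1 + 2j)
product = multiply(1 - 1j, 1 + 2j)
print(product, product == (1 - 1j) * (1 + 2j))
```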

In order to carry out division, a special result is needed. Suppose that $z = x + jy$, where $x$ and $y$ are real. Then the number $x - jy$, where we have changed j to $-j$, is called the complex conjugate of $z$, and will be written $\bar{z}$. The product $z\bar{z}$ is given by

\[ z\bar{z} = (x + jy)(x - jy) = x^2 - (jy)^2 = x^2 + y^2, \]

which is a real, non-negative number (positive unless $z = 0$). (This was illustrated in Example 6.2d.)

Property of the conjugate

Let $z = x + jy$, with $x$, $y$ real. Then $\bar{z} = x - jy$ and

\[ z\bar{z} = x^2 + y^2. \tag{6.2} \]

5. The reciprocal of a complex number in standard form. Let $z = x + jy$, and consider

\[ \frac{1}{z} = \frac{1}{x + jy}. \]

This is not in standard form $a + jb$: to reduce it to standard form, multiply it by the factor

\[ \frac{x - jy}{x - jy}, \]

$x - jy$ being the conjugate of $x + jy$. This factor is equal to 1, and it will not affect the value of $1/z$. Hence

\[ \frac{1}{z} = \frac{1}{x + jy}\,\frac{x - jy}{x - jy} = \frac{x - jy}{x^2 + y^2} = \frac{x}{x^2 + y^2} - j\,\frac{y}{x^2 + y^2} \quad (\text{from (6.2)}) \]

in standard form. This process also enables us to reduce quotients to standard form, as in Example 6.3(c) below.

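Example 6.3. Reduce to standard form

(a) $\dfrac{1}{2 + 3j}$; (b) $\dfrac{1}{2 - 3j}$; (c) $\dfrac{1 - j}{1 + j}$; (d) $\dfrac{1}{j}$.

The standard forms are:

(a) $\dfrac{1}{2 + 3j} = \dfrac{1}{2 + 3j}\,\dfrac{2 - 3j}{2 - 3j} = \dfrac{2 - 3j}{2^2 + 3^2} = \dfrac{2}{13} - \dfrac{3}{13}j$.

(b) $\dfrac{1}{2 - 3j} = \dfrac{1}{2 - 3j}\,\dfrac{2 + 3j}{2 + 3j} = \dfrac{2 + 3j}{2^2 + 3^2} = \dfrac{2}{13} + \dfrac{3}{13}j$.

(c) $\dfrac{1 - j}{1 + j} = \dfrac{1 - j}{1 + j}\,\dfrac{1 - j}{1 - j} = \dfrac{1 - 2j + j^2}{1^2 + 1^2} = -j$.

(d) $\dfrac{1}{j} = \dfrac{1}{j}\,\dfrac{-j}{-j} = -j$.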

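The general quotient rule is

\[ \begin{aligned}
\frac{z_1}{z_2} &= \frac{(x_1 + jy_1)(x_2 - jy_2)}{(x_2 + jy_2)(x_2 - jy_2)} \\
&= \frac{(x_1x_2 + y_1y_2) + j(x_2y_1 - x_1y_2)}{x_2^2 + jy_2x_2 - jx_2y_2 + y_2^2} \\
&= \frac{(x_1x_2 + y_1y_2) + j(x_2y_1 - x_1y_2)}{x_2^2 + y_2^2}.
\end{aligned} \]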
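The quotient rule translates directly into code. The sketch below is in Python; the name `divide` is ours, and it assumes $z_2 \ne 0$ so that the denominator $x_2^2 + y_2^2$ is nonzero.

```python
def divide(z1: complex, z2: complex) -> complex:
    """Quotient z1/z2, multiplying top and bottom by the conjugate of z2."""
    x1, y1 = z1.real, z1.imag
    x2, y2 = z2.real, z2.imag
    denominator = x2 * x2 + y2 * y2  # z2 times its conjugate: real, nonzero if z2 != 0
    return complex((x1 * x2 + y1 * y2) / denominator,
                   (x2 * y1 - x1 * y2) / denominator)

# (1 - j)/(1 + j), which reduces to -j
quotient = divide(1 - 1j, 1 + 1j)
print(quotient)
```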

Example 6.4. Find the standard form of the complex numbers (a) $z_1 + z_2$, (b) $2z_1 - 3z_2$, (c) $z_1z_2$, (d) $z_1^2/z_2$, where $z_1 = -1 + 2j$ and $z_2 = 2 - 3j$.
(a) $z_1 + z_2 = (-1 + 2j) + (2 - 3j) = 1 - j$.
(b) $2z_1 - 3z_2 = 2(-1 + 2j) - 3(2 - 3j) = -2 + 4j - 6 + 9j = -8 + 13j$.
(c) $z_1z_2 = (-1 + 2j)(2 - 3j) = -2 + 4j + 3j - 6j^2 = -2 + 4j + 3j + 6 = 4 + 7j$.
(d) First

\[ z_1^2 = (-1 + 2j)(-1 + 2j) = 1 - 4j + 4j^2 = -3 - 4j. \]

Then

\[ \frac{z_1^2}{z_2} = \frac{-3 - 4j}{2 - 3j} = \frac{(-3 - 4j)(2 + 3j)}{(2 - 3j)(2 + 3j)} = \frac{-6 - 8j - 9j - 12j^2}{4 - 6j + 6j - 9j^2} = \frac{6 - 17j}{4 + 9} = \frac{6}{13} - \frac{17}{13}j. \]

The conjugate $\bar{z}$ of $z$ has the following further properties, which are simply applications of the rule: ‘to obtain the conjugate, change j to $-j$ wherever it appears’.

Properties of the conjugate

(a) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$.
(b) $\overline{z_1z_2} = \bar{z}_1\bar{z}_2$.
(c) $\overline{z_1/z_2} = \bar{z}_1/\bar{z}_2$. (6.3)
(d) $x = \mathrm{Re}\, z = \tfrac{1}{2}(z + \bar{z})$, $y = \mathrm{Im}\, z = \dfrac{1}{2j}(z - \bar{z})$.

Property (6.3c) was illustrated by Example 6.3(a),(b). Similarly, for example, the conjugate of

\[ \frac{(2 + 3j)(4 + j)(3 - 2j)}{1 - 4j} \quad\text{is}\quad \frac{(2 - 3j)(4 - j)(3 + 2j)}{1 + 4j}; \]

we do not have to work the whole thing out first and then find the conjugate. These rules become important later.

6.2 The Argand diagram and complex numbers


A complex number $z = x + jy$ can be regarded as a pair of real numbers $(x, y)$ known as an ordered pair. The pair of numbers can be interpreted as the cartesian coordinates of a point in the plane in the usual way. The complex number $z = x + jy$ has an abscissa $x$ and an ordinate $y$. In Fig. 6.1, the $x$ axis is known as the real axis and the $y$ axis is the imaginary axis. The number $z = x + jy$ is represented by the point $P : (x, y)$. A figure showing complex numbers is known as the Argand diagram of the complex numbers.
The length $OP = r = \sqrt{x^2 + y^2}$ is called the modulus of $z$ (or simply ‘mod z’) and written $|z|$.

Fig. 6.1 The Argand diagram.

Example 6.5. Obtain $|z|$ where (a) $z = 2 + 3j$; (b) $z = 2 - 3j$.

(a) $|z| = |2 + 3j| = (2^2 + 3^2)^{1/2} = \sqrt{13}$.
(b) $|z| = |2 + (-3)j| = [2^2 + (-3)^2]^{1/2} = \sqrt{13}$.
Thus $|2 + 3j| = |2 - 3j|$.

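Example 6.6. Let $z_1 = 1 - 3j$ and $z_2 = 3 - 2j$. Find
(a) $|z_1 + z_2|$; (b) $|z_1| + |z_2|$; (c) $|z_1z_2|$; (d) $|z_1||z_2|$; (e) $|z_1/z_2|$; (f) $|z_1|/|z_2|$.

(a) $z_1 + z_2 = (1 - 3j) + (3 - 2j) = 4 - 5j$ (this must be worked out in standard form first). Hence
$|z_1 + z_2| = |4 - 5j| = [4^2 + (-5)^2]^{1/2} = \sqrt{41}$.
(b) $|z_1| + |z_2| = [1^2 + (-3)^2]^{1/2} + [3^2 + (-2)^2]^{1/2} = \sqrt{10} + \sqrt{13}$.
(c) $|z_1z_2| = |(1 - 3j)(3 - 2j)| = |-3 - 11j| = [(-3)^2 + (-11)^2]^{1/2} = \sqrt{130}$.
(d) $|z_1||z_2| = |1 - 3j|\,|3 - 2j| = \sqrt{10}\sqrt{13} = \sqrt{130}$ (that is, the same as (c)).
(e) $\dfrac{z_1}{z_2} = \dfrac{1 - 3j}{3 - 2j}\,\dfrac{3 + 2j}{3 + 2j} = \dfrac{1}{13}(9 - 7j)$. Therefore
$\left|\dfrac{z_1}{z_2}\right| = \dfrac{1}{13}|9 - 7j| = \dfrac{\sqrt{130}}{13} = \dfrac{\sqrt{10}}{\sqrt{13}}$.
(f) $\dfrac{|z_1|}{|z_2|} = \dfrac{|1 - 3j|}{|3 - 2j|} = \dfrac{\sqrt{10}}{\sqrt{13}}$ (that is, the same as (e)).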

The following results hold good for the modulus; they are illustrated in Example 6.6 above.

Properties of the modulus

If $z = x + jy$ ($x$ and $y$ real), then $|z| = (x^2 + y^2)^{1/2}$ and
(a) $|\bar{z}| = |z|$, (b) $z\bar{z} = |z|^2$,
(c) $|z_1z_2| = |z_1||z_2|$, (d) $|z_1/z_2| = |z_1|/|z_2|$,
(e) the distance between two points $z_1$ and $z_2$ is $|z_1 - z_2| = |z_2 - z_1|$. (6.4)

Fig. 6.2 Parallelogram law of addition.

The identity (6.4b) follows directly from (6.2). We shall defer the
proof of (c) and (d), but the truth is illustrated in Example 6.6c,d,e,f.
A sum or difference cannot be split in this way: contrast the results of
Examples 6.6a and 6.6b.
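The contrast between products and sums can be checked numerically. A minimal sketch in Python with the numbers of Example 6.6, assuming the built-in `abs` (which returns the modulus of a complex number); the variable names are ours.

```python
import math

z1, z2 = 1 - 3j, 3 - 2j  # the numbers of Example 6.6

# abs() of a complex number is its modulus sqrt(x**2 + y**2).
products_agree = math.isclose(abs(z1 * z2), abs(z1) * abs(z2))
quotients_agree = math.isclose(abs(z1 / z2), abs(z1) / abs(z2))
sums_agree = math.isclose(abs(z1 + z2), abs(z1) + abs(z2))

print(products_agree, quotients_agree, sums_agree)
```

Only the first two comparisons hold, in line with properties (6.4c) and (6.4d); the modulus of a sum is not the sum of the moduli.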
The sum of two complex numbers can be interpreted by the parallelogram law of addition in the Argand diagram, as in Fig. 6.2. Construct a parallelogram on $OP_1$ and $OP_2$, where $P_1$ and $P_2$ correspond to the complex numbers $z_1$ and $z_2$. The corner $P$ of the parallelogram represents the sum $z_1 + z_2$. This follows from the addition rule for complex numbers. If you know anything about vectors, you will recognize that complex numbers add like vectors.
The conjugate $\bar{z} = x - jy$ is the reflection of $z$ in the $x$ axis, shown in Fig. 6.3.

Fig. 6.3 Argand diagram showing $z$ and its conjugate $\bar{z}$.

6.3 Complex numbers in polar coordinates


In Fig. 6.1, $r$ and $\theta$ obviously serve as polar coordinates, as defined in (1.16). In the context of complex numbers, $\theta$ is called the argument of $z$ and denoted by arg z. The same point $P$ can be described by the angles $\theta + 2n\pi$, where $n = \pm 1, \pm 2, \ldots$. Usually we use the principal value of the argument, denoted by Arg z (capital A), which is the smallest numerically; its limits are given by $-\pi < \theta \le \pi$, i.e.

\[ -\pi < \mathrm{Arg}\, z \le \pi. \]

The pair of equations

\[ \cos\theta = \frac{x}{r}, \qquad \sin\theta = \frac{y}{r}, \]

has exactly one solution for $\theta$ within this range.

Example 6.7. Find the moduli and principal values of the arguments of the following complex numbers: (a) $z_1 = 2j$; (b) $z_2 = -1 - j$; (c) $z_3 = -2$; (d) $z_4 = \tfrac{1}{2} + \tfrac{1}{2}j\sqrt{3}$.
The moduli are given by:
(a) $|z_1| = |2j| = 2$;
(b) $|z_2| = |-1 - j| = \sqrt{1 + 1} = \sqrt{2}$;
(c) $|z_3| = |-2| = 2$;
(d) $|z_4| = |\tfrac{1}{2} + \tfrac{1}{2}j\sqrt{3}| = \sqrt{\tfrac{1}{4} + \tfrac{3}{4}} = 1$.
A sketch of the Argand diagram for the complex numbers helps to decide their arguments. Figure 6.4 shows their locations. Thus

(a) $\mathrm{Arg}\, z_1 = \theta_1 = \tfrac{1}{2}\pi$; (b) $\mathrm{Arg}\, z_2 = \theta_2 = -\tfrac{3}{4}\pi$; (c) $\mathrm{Arg}\, z_3 = \theta_3 = \pi$; (d) $\mathrm{Arg}\, z_4 = \theta_4 = \tfrac{1}{3}\pi$.
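The moduli and principal arguments of Example 6.7 can be reproduced with a short sketch in Python, assuming the standard `cmath.polar` function, which returns the pair (modulus, phase) with the phase in radians; the dictionary names are ours.

```python
import cmath
import math

numbers = {
    "z1": 2j,
    "z2": -1 - 1j,
    "z3": -2 + 0j,
    "z4": 0.5 + 0.5j * math.sqrt(3),
}

# cmath.polar(z) returns (modulus, phase); the phase is a principal value.
polar_forms = {name: cmath.polar(z) for name, z in numbers.items()}

for name, (r, theta) in polar_forms.items():
    print(f"|{name}| = {r:.4f}, Arg {name} = {theta / math.pi:.4f} * pi")
```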

In Fig. 6.1, the coordinates $(x, y)$ and the polar coordinates $(r, \theta)$ are related by

\[ x = r\cos\theta, \qquad y = r\sin\theta. \]

Hence the complex number $z = x + jy$ can be written

\[ z = r\cos\theta + jr\sin\theta = r(\cos\theta + j\sin\theta), \tag{6.5} \]

which is the polar form of the complex number $z$. Note that $r \ge 0$.

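Example 6.8. Express $-1 + \sqrt{3}\,j$ in polar form.

Here

\[ r = \sqrt{(-1)^2 + (\sqrt{3})^2} = \sqrt{1 + 3} = 2 \]

and $\theta$ is given by (see Fig. 6.5)

\[ \cos\theta = -\tfrac{1}{2}, \qquad \sin\theta = \tfrac{1}{2}\sqrt{3}. \]

Hence from Fig. 6.5, $\theta = \tfrac{2}{3}\pi$, and

\[ -1 + \sqrt{3}\,j = 2(\cos\tfrac{2}{3}\pi + j\sin\tfrac{2}{3}\pi). \]

Fig. 6.4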

Example 6.9. Obtain (a) $|\cos\theta + j\sin\theta|$; (b) $|1/(\cos\theta + j\sin\theta)|$.
(a) $|\cos\theta + j\sin\theta| = (\cos^2\theta + \sin^2\theta)^{1/2} = 1$.
(b) $|1/(\cos\theta + j\sin\theta)| = 1/|\cos\theta + j\sin\theta|$ (by (6.4d)) $= 1$ (from (a)).

6.4 Complex numbers in exponential form


Consider the function

\[ f(\theta) = \cos\theta + j\sin\theta, \]

where $\theta$ can take any value. Its derivative with respect to $\theta$ is

\[ \frac{\mathrm{d}f(\theta)}{\mathrm{d}\theta} = -\sin\theta + j\cos\theta = j(\cos\theta + j\sin\theta) = jf(\theta). \]

Hence $f(\theta)$ satisfies a relation in which the derivative is ‘proportional’ to itself, notwithstanding that the constant of proportionality is j. As we saw in Chapter 1, a function with this property is the exponential function $k e^{j\theta}$, where $k$ is a constant. We conclude that

\[ \cos\theta + j\sin\theta = k e^{j\theta} \]

for some value of $k$. In particular this must be true for the value $\theta = 0$, from which $k = 1$. Hence, we obtain the important result

\[ e^{j\theta} = \cos\theta + j\sin\theta. \tag{6.6} \]

The conjugate formula is

\[ e^{-j\theta} = \cos\theta - j\sin\theta, \tag{6.7} \]

applying the rule of replacing j by $-j$. Hence, from (6.5), any complex number can be written in the exponential form (‘Euler’s formula’)

\[ z = r(\cos\theta + j\sin\theta) = r e^{j\theta}, \]

with its conjugate

\[ \bar{z} = r(\cos\theta - j\sin\theta) = r e^{-j\theta}. \]

Properties of $e^{j\theta}$ for real $\theta$

(a) $e^{j\theta} = \cos\theta + j\sin\theta$, (b) $|e^{j\theta}| = 1$,
(c) conjugate of $e^{j\theta}$ is $e^{-j\theta}$. (6.8)

Exponential form for a complex number

\[ z = r e^{j\theta} = r\cos\theta + jr\sin\theta, \tag{6.9} \]

where $r = |z|$ and $\theta$ is any value of arg z.
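Formula (6.6) and the exponential form (6.9) can be checked numerically. The sketch below is in Python and assumes the standard `cmath.exp`; the helper name `from_polar` and the sample angle are ours.

```python
import cmath
import math

def from_polar(r: float, theta: float) -> complex:
    """Return r*e^(j*theta), the exponential form (6.9)."""
    return r * cmath.exp(1j * theta)

theta = 0.75  # an arbitrary angle in radians
euler_lhs = cmath.exp(1j * theta)
euler_rhs = complex(math.cos(theta), math.sin(theta))

print(euler_lhs, euler_rhs)
print(from_polar(2, math.pi / 2))  # close to 2j, up to rounding error
```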

Example 6.10. Express the following in standard form (a) $2e^{\frac{1}{2}\pi j}$; (b) $3e^{-\pi j}$; (c) $e^{3\pi j}$; (d) $2e^{-\frac{1}{2}\pi j}$; (e) $3e^{\frac{1}{4}\pi j}$.
Remember that, in $re^{j\theta}$, the numbers $r$ and $\theta$ are polar coordinates. For these simple cases, we can therefore put the points straight on an Argand diagram and read off the coordinates, without needing to work out $\cos\theta$ and $\sin\theta$.
(a) $r = 2$, $\theta = \tfrac{1}{2}\pi$ (90°). Hence $2e^{\frac{1}{2}\pi j} = 2j$.
(b) $r = 3$, $\theta = -\pi$ (−180°). Hence $3e^{-\pi j} = -3$.
(c) $r = 1$, $\theta = 3\pi$. Hence $e^{3\pi j} = -1$.
(d) $r = 2$, $\theta = -\tfrac{1}{2}\pi$ (−90°). Hence $2e^{-\frac{1}{2}\pi j} = -2j$.
(e) $r = 3$, $\theta = \tfrac{1}{4}\pi$ (45°). Hence $3e^{\frac{1}{4}\pi j} = \dfrac{3}{\sqrt{2}} + j\dfrac{3}{\sqrt{2}}$.

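It follows by treating (6.6) and (6.7) as two simultaneous equations for $\sin\theta$ and $\cos\theta$, that

\[ \cos\theta = \tfrac{1}{2}(e^{j\theta} + e^{-j\theta}), \qquad \sin\theta = \frac{1}{2j}(e^{j\theta} - e^{-j\theta}). \tag{6.10} \]

Equation (6.6) will still be true if we replace the angle $\theta$ by $n\theta$, where $n$ is an integer. Hence we obtain De Moivre’s theorem:

\[ \cos n\theta + j\sin n\theta = e^{nj\theta} = (e^{j\theta})^n = (\cos\theta + j\sin\theta)^n. \]

De Moivre’s theorem

If $n$ is any integer, then

\[ (\cos\theta + j\sin\theta)^n = \cos n\theta + j\sin n\theta. \tag{6.11} \]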
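De Moivre’s theorem can be spot-checked for a few integers, including negative ones. A minimal sketch in Python; the function name `de_moivre_gap` and the sample values are ours.

```python
import math

def de_moivre_gap(theta: float, n: int) -> float:
    """Return |(cos t + j sin t)^n - (cos nt + j sin nt)|, expected to be ~0."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return abs(lhs - rhs)

gaps = [de_moivre_gap(0.3, n) for n in (-3, 0, 1, 5, 12)]
print(max(gaps))  # of the order of rounding error
```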

The complex numbers having arguments $\theta$ and $\theta + 2\pi n$ are equal for all integer values of $n$, since $2\pi$ is a complete revolution on the Argand diagram. Thus

\[ e^{j(\theta + 2n\pi)} = e^{j\theta}\, e^{j2n\pi} = e^{j\theta} \cdot 1 = e^{j\theta}. \]

Example 6.11. Express the following complex numbers in exponential form with principal arguments: (a) $j$; (b) $-5j$; (c) $-3$; (d) $4 - 4j$; (e) $3 - 4j$.

In each example, we put $r\cos\theta$ equal to the real part and $r\sin\theta$ equal to the imaginary part of the given complex number. In each case we shall find $\theta$ by plotting the point on an Argand diagram.
(a) $r\cos\theta = 0$, $r\sin\theta = 1$. Hence $r = 1$ and, in the interval $-\pi < \theta \le \pi$, we obtain $\theta = \tfrac{1}{2}\pi$. The exponential form is

\[ j = e^{\frac{1}{2}\pi j}. \]

(b) $r\cos\theta = 0$, $r\sin\theta = -5$. Hence $r = 5$ and $\theta = -\tfrac{1}{2}\pi$. The exponential form is

\[ -5j = 5e^{-\frac{1}{2}\pi j}. \]

(c) $r\cos\theta = -3$, $r\sin\theta = 0$. Hence $r = 3$ and $\theta = \pi$. The exponential form is

\[ -3 = 3e^{\pi j}. \]

(d) $r\cos\theta = 4$, $r\sin\theta = -4$. Hence $r = \sqrt{16 + 16} = 4\sqrt{2}$ while $\theta = -\tfrac{1}{4}\pi$. Hence

\[ 4 - 4j = 4\sqrt{2}\, e^{-\frac{1}{4}\pi j}. \]

(e) $r\cos\theta = 3$, $r\sin\theta = -4$. Hence $r = \sqrt{9 + 16} = 5$ while the angle $\alpha$ is the principal argument such that

\[ \cos\alpha = \tfrac{3}{5}, \qquad \sin\alpha = -\tfrac{4}{5}. \]

The exponential form is

\[ 3 - 4j = 5e^{j\alpha}, \]

where $\alpha = -53.1°$ or $-0.927$ radians.

Example 6.12. By expressing $-1 + j$ in the form $re^{j\theta}$, find $(-1 + j)^{-8}$ as a complex number in standard form.
First $r = |-1 + j| = \sqrt{2}$. From its position on an Argand diagram $\theta = 3 \times 45°$, or $\tfrac{3}{4}\pi$ in radians. Therefore

\[ -1 + j = \sqrt{2}\, e^{\frac{3}{4}\pi j}. \]

Then

\[ (-1 + j)^{-8} = (\sqrt{2}\, e^{\frac{3}{4}\pi j})^{-8} = (\sqrt{2})^{-8} e^{-6\pi j} = \tfrac{1}{16} e^{-6\pi j}. \]

On an Argand diagram the polar coordinates are $r = \tfrac{1}{16}$ and $\theta = -6\pi = -3(2\pi)$. This value of $\theta$, equivalent to 3 complete revolutions, puts us on the positive real axis again, so that

\[ (-1 + j)^{-8} = \tfrac{1}{16}. \]
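The same calculation can be followed in code. This sketch is in Python and assumes the standard `cmath.polar` and `cmath.exp`; the variable names are ours.

```python
import cmath

z = -1 + 1j
r, theta = cmath.polar(z)  # r = sqrt(2), theta = 3*pi/4

# Raise the exponential form to the power -8: r**-8 * e^(-8j*theta)
via_polar = r ** -8 * cmath.exp(-8j * theta)

# Direct evaluation for comparison
direct = z ** -8

print(via_polar, direct)  # both close to 1/16 = 0.0625
```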

6.5 The general exponential form


The advantage of the exponential form of a complex number is that
it is particularly easy to differentiate, integrate, and to combine with
other exponentials, including ordinary ones. We are not tied to the
Argand diagram when we manipulate the exponentials, so we shall
often use letters other than $r$ and $\theta$.

Example 6.13. Prove that

\[ \cos(A + B) = \cos A\cos B - \sin A\sin B, \]
\[ \sin(A + B) = \sin A\cos B + \cos A\sin B. \]

As with an ordinary exponential,

\[ e^{j(A + B)} = e^{jA}\, e^{jB}. \]

In terms of the definition (6.8), this becomes

\[ \begin{aligned}
\cos(A + B) + j\sin(A + B) &= (\cos A + j\sin A)(\cos B + j\sin B) \\
&= (\cos A\cos B - \sin A\sin B) + j(\sin A\cos B + \cos A\sin B).
\end{aligned} \]

The real and imaginary parts of the two sides of the equation must be respectively equal, and so we have the result immediately.

Example 6.14. Express $\cos 5\theta$ in terms of powers of $\cos\theta$.

Since

\[ e^{5j\theta} = (e^{j\theta})^5, \]

it follows that

\[ \cos 5\theta + j\sin 5\theta = (\cos\theta + j\sin\theta)^5. \]

Expand the right-hand side by the binomial theorem. Thus

\[ \begin{aligned}
\cos 5\theta + j\sin 5\theta &= \cos^5\theta + 5\cos^4\theta\,(j\sin\theta) + 10\cos^3\theta\,(j\sin\theta)^2 + 10\cos^2\theta\,(j\sin\theta)^3 \\
&\quad + 5\cos\theta\,(j\sin\theta)^4 + (j\sin\theta)^5 \\
&= \cos^5\theta + 5j\cos^4\theta\sin\theta - 10\cos^3\theta\sin^2\theta - 10j\cos^2\theta\sin^3\theta \\
&\quad + 5\cos\theta\sin^4\theta + j\sin^5\theta.
\end{aligned} \]

Equate real parts on both sides of the equation:

\[ \cos 5\theta = \cos^5\theta - 10\cos^3\theta\sin^2\theta + 5\cos\theta\sin^4\theta. \]

Finally replace $\sin^2\theta$ by $1 - \cos^2\theta$ and simplify:

\[ \begin{aligned}
\cos 5\theta &= \cos^5\theta - 10\cos^3\theta\,(1 - \cos^2\theta) + 5\cos\theta\,(1 - \cos^2\theta)^2 \\
&= 16\cos^5\theta - 20\cos^3\theta + 5\cos\theta.
\end{aligned} \]

(Incidentally, $\sin 5\theta$ can also be found in terms of powers of $\sin\theta$.)
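The final identity of Example 6.14 is easy to test at a few angles. A minimal sketch in Python; the function name `cos5_from_powers` and the sample angles are ours.

```python
import math

def cos5_from_powers(theta: float) -> float:
    """Evaluate 16cos^5(t) - 20cos^3(t) + 5cos(t)."""
    c = math.cos(theta)
    return 16 * c**5 - 20 * c**3 + 5 * c

sample_angles = (0.0, 0.4, 1.1, 2.5, -3.0)
worst_gap = max(abs(math.cos(5 * t) - cos5_from_powers(t)) for t in sample_angles)
print(worst_gap)  # of the order of rounding error
```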

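Example 6.15. Prove the result in (6.4d), that

\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}. \]

Put $z_1 = r_1 e^{j\theta_1}$ and $z_2 = r_2 e^{j\theta_2}$. Then

\[ \frac{z_1}{z_2} = \frac{r_1}{r_2}\,\frac{e^{j\theta_1}}{e^{j\theta_2}} = \frac{r_1}{r_2}\, e^{j(\theta_1 - \theta_2)}. \]

Therefore

\[ \left|\frac{z_1}{z_2}\right| = \frac{r_1}{r_2} = \frac{|z_1|}{|z_2|}, \]

as required.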

Consider the number $z$ where

\[ z = c\, e^{p + jq}, \]

and $p$, $q$, and $c > 0$ are real numbers. We have

\[ z = c\, e^{p + jq} = c\, e^p e^{jq} = c\, e^p(\cos q + j\sin q), \]

($q$ is assumed to be in radians). Therefore we have

The form $z = c\, e^{p + jq}$ ($p$, $q$, $c$ real, with $c > 0$)

(a) $|z| = c\, e^p$, (b) $\arg z = q + 2n\pi$,
(c) $\mathrm{Re}\, z = c\, e^p\cos q$, $\mathrm{Im}\, z = c\, e^p\sin q$. (6.12)

In science and engineering, complex exponentials of the type in (6.12) are often used to describe oscillations of various kinds. Instead of $c\, e^{p + jq}$, the kind of symbols that occur may look like

\[ A\, e^{(-k + j\omega)t} \quad\text{or}\quad c\, e^{(\alpha + j\beta)t}, \]

in which $t$ represents time. Recast $c\, e^{(\alpha + j\beta)t}$ by writing it as $c\, e^{\alpha t + j\beta t}$, and we have

The form $c\, e^{(\alpha + j\beta)t}$ ($\alpha$, $\beta$, $c$, $t$ real)

(a) $c\, e^{\alpha t}\cos\beta t = \mathrm{Re}\; c\, e^{(\alpha + j\beta)t}$,
(b) $c\, e^{\alpha t}\sin\beta t = \mathrm{Im}\; c\, e^{(\alpha + j\beta)t}$. (6.13)

Example 6.16. The damped vibration of a piece of machinery is described by $x = 0.01\, e^{-0.02t}\cos 15t$. Write this in the form $x = \mathrm{Re}\; c\, e^{(\alpha + j\beta)t}$.
We have

\[ x = 0.01\, e^{-0.02t}\cos 15t = 0.01\, e^{-0.02t}\,\mathrm{Re}\, e^{j15t} = \mathrm{Re}(0.01\, e^{-0.02t} e^{j15t}) = \mathrm{Re}(0.01\, e^{(-0.02 + 15j)t}). \]
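The two ways of writing the damped vibration of Example 6.16 can be compared numerically. This sketch is in Python, assuming the standard `cmath.exp`; the function names and sample times are ours.

```python
import cmath
import math

def x_direct(t: float) -> float:
    """The real form 0.01 e^(-0.02 t) cos 15t."""
    return 0.01 * math.exp(-0.02 * t) * math.cos(15 * t)

def x_complex(t: float) -> float:
    """The complex form Re(0.01 e^((-0.02 + 15j) t))."""
    return (0.01 * cmath.exp((-0.02 + 15j) * t)).real

sample_times = (0.0, 0.1, 1.0, 7.3)
largest_gap = max(abs(x_direct(t) - x_complex(t)) for t in sample_times)
print(largest_gap)  # of the order of rounding error
```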

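Example 6.17. The current $i(t)$ in a branch of a circuit is given by

\[ i(t) = c\, e^{-kt}\sin(\omega t + \phi). \]

Write this in the form of (a) the imaginary part of a complex function; (b) the real part of a complex function.

(a) $i(t) = \mathrm{Im}(c\, e^{-kt} e^{j(\omega t + \phi)}) = \mathrm{Im}(c\, e^{(-k + j\omega)t + j\phi})$.

(b) Note that, if $z = x + jy$, then

\[ y = \mathrm{Im}\, z = \mathrm{Re}(-jz). \]

Therefore, since $-j = e^{-\frac{1}{2}\pi j}$,

\[ i(t) = \mathrm{Re}(-j\, c\, e^{(-k + j\omega)t + j\phi}) = \mathrm{Re}(c\, e^{-\frac{1}{2}\pi j} e^{(-k + j\omega)t + j\phi}) = \mathrm{Re}(c\, e^{(-k + j\omega)t + j(\phi - \frac{1}{2}\pi)}). \]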

6.6 Hyperbolic functions


The hyperbolic functions cosh and sinh are related to the trigonometric functions cos and sin. The hyperbolic functions were defined in Section 1.11 by

\[ \cosh x = \tfrac{1}{2}(e^x + e^{-x}), \qquad \sinh x = \tfrac{1}{2}(e^x - e^{-x}). \]

It follows that

\[ \cosh jx = \tfrac{1}{2}(e^{jx} + e^{-jx}) = \cos x, \]
\[ \sinh jx = \tfrac{1}{2}(e^{jx} - e^{-jx}) = j\sin x, \]

by (6.10). Similarly

\[ \cos jx = \tfrac{1}{2}(e^{j^2x} + e^{-j^2x}) = \tfrac{1}{2}(e^{-x} + e^x) = \cosh x, \]
\[ \sin jx = \frac{1}{2j}(e^{j^2x} - e^{-j^2x}) = -\tfrac{1}{2}j(e^{-x} - e^x) = j\sinh x. \]
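These four relations can be confirmed at a sample point. A minimal sketch in Python, assuming the complex-argument functions of the standard `cmath` module; the dictionary `checks` and the value of `x` are ours.

```python
import cmath
import math

x = 0.8  # an arbitrary real number

# Each entry pairs the left-hand side with the right-hand side of one relation.
checks = {
    "cosh(jx) = cos x": (cmath.cosh(1j * x), complex(math.cos(x), 0.0)),
    "sinh(jx) = j sin x": (cmath.sinh(1j * x), complex(0.0, math.sin(x))),
    "cos(jx) = cosh x": (cmath.cos(1j * x), complex(math.cosh(x), 0.0)),
    "sin(jx) = j sinh x": (cmath.sin(1j * x), complex(0.0, math.sinh(x))),
}

for label, (lhs, rhs) in checks.items():
    print(label, abs(lhs - rhs) < 1e-12)
```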



Example 6.18. Solve the equation $\cosh z = -1$.

Since, for real $z$, we have $\cosh z \ge 1$, we expect the equation to have complex roots. In exponential form,

\[ \tfrac{1}{2}(e^z + e^{-z}) = -1, \]

or

\[ e^{2z} + 2e^z + 1 = 0, \]

or

\[ (e^z + 1)^2 = 0. \]

Hence

\[ e^z = -1 = e^{(2n + 1)\pi j} \quad (n = 0, \pm 1, \pm 2, \ldots). \]

Here we have considered all the representations of the number $-1$, in the form $e^{(2n + 1)\pi j}$. It is important when finding all the roots of an equation to include all possible arguments, not just the principal one. By matching the exponents, the roots are

\[ z = (2n + 1)\pi j \quad (n = 0, \pm 1, \pm 2, \ldots). \]

6.7 Miscellaneous applications


The polar form of complex numbers can be used to solve polynomial
equations as in the following example.

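Example 6.19. Find all roots of $z^5 = 4 - 4j$.

We first express $4 - 4j$ in polar form $\rho e^{j\alpha}$. Thus

\[ \rho\cos\alpha = 4, \qquad \rho\sin\alpha = -4, \]

from which it follows that $\rho = \sqrt{32}$ and $\alpha = -\tfrac{1}{4}\pi + 2n\pi$, using an Argand diagram. Hence, in polar form,

\[ 4 - 4j = 4\sqrt{2}\, e^{-\frac{1}{4}\pi j + 2n\pi j} = 2^{5/2} e^{j\pi(-\frac{1}{4} + 2n)} \quad (n = 0, \pm 1, \pm 2, \ldots). \]

Let $z = r e^{j\theta}$. Then

\[ r^5 e^{5j\theta} = 2^{5/2} e^{j\pi(-\frac{1}{4} + 2n)} \]

or

\[ r e^{j\theta} = \sqrt{2}\, e^{\frac{1}{5}j\pi(-\frac{1}{4} + 2n)} \quad (n = 0, \pm 1, \pm 2, \ldots). \]

Comparing the two sides gives

\[ r = \sqrt{2}, \qquad \theta = \tfrac{1}{20}(-1 + 8n)\pi. \]

Five successive values of $n$ give distinct roots: other values of $n$ merely duplicate existing roots. In full, taking $n = 0, 1, 2, 3, 4$, the roots are

\[ \sqrt{2}\, e^{-\frac{1}{20}\pi j}, \quad \sqrt{2}\, e^{\frac{7}{20}\pi j}, \quad \sqrt{2}\, e^{\frac{3}{4}\pi j}, \quad \sqrt{2}\, e^{\frac{23}{20}\pi j}, \quad \sqrt{2}\, e^{\frac{31}{20}\pi j}. \]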
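The method of Example 6.19 can be sketched in code: take the fifth root of the modulus and divide each admissible argument by 5. The example is in Python, assuming the standard `cmath.polar` and `cmath.exp`; the variable names and the choice $n = 0, \ldots, 4$ are ours.

```python
import cmath
import math

target = 4 - 4j
rho, alpha = cmath.polar(target)  # rho = sqrt(32), alpha = -pi/4

# Fifth roots: modulus rho**(1/5), arguments (alpha + 2*n*pi)/5 for n = 0..4.
roots = [rho ** (1 / 5) * cmath.exp(1j * (alpha + 2 * math.pi * n) / 5)
         for n in range(5)]

for z in roots:
    print(z, z ** 5)  # each fifth power is close to 4 - 4j
```

Any other five successive values of $n$ would generate the same set of roots.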

The exponential form (6.8a) can be used to express $\cos n\theta$ and $\sin n\theta$ in terms of powers of $\cos\theta$ and $\sin\theta$ respectively, and conversely, to express $\cos^n\theta$ and $\sin^n\theta$ in terms of cosines and sines of multiple angles.

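Example 6.20. Expand $\cos^6\theta$ in terms of multiple angles.

Let $z = \cos\theta + j\sin\theta$. By de Moivre’s theorem, with $n$ an integer,

\[ z^n = \cos n\theta + j\sin n\theta, \qquad \frac{1}{z^n} = \cos n\theta - j\sin n\theta. \]

By adding these two results, it follows that

\[ \cos n\theta = \frac{1}{2}\left(z^n + \frac{1}{z^n}\right). \tag{6.14} \]

Hence

\[ \begin{aligned}
(2\cos\theta)^6 &= \left(z + \frac{1}{z}\right)^6 \\
&= z^6 + 6z^4 + 15z^2 + 20 + \frac{15}{z^2} + \frac{6}{z^4} + \frac{1}{z^6} \\
&= \left(z^6 + \frac{1}{z^6}\right) + 6\left(z^4 + \frac{1}{z^4}\right) + 15\left(z^2 + \frac{1}{z^2}\right) + 20 \\
&= 2\cos 6\theta + 12\cos 4\theta + 30\cos 2\theta + 20,
\end{aligned} \]

by repeated use of (6.14). Finally

\[ \cos^6\theta = \tfrac{1}{32}\cos 6\theta + \tfrac{3}{16}\cos 4\theta + \tfrac{15}{32}\cos 2\theta + \tfrac{5}{16}. \]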
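The multiple-angle expansion of $\cos^6\theta$ can be verified at a few angles. A minimal sketch in Python; the function name `cos6_multiple_angles` and the sample angles are ours.

```python
import math

def cos6_multiple_angles(theta: float) -> float:
    """Evaluate (cos 6t + 6 cos 4t + 15 cos 2t + 10)/32."""
    return (math.cos(6 * theta) + 6 * math.cos(4 * theta)
            + 15 * math.cos(2 * theta) + 10) / 32

sample_angles = (0.0, 0.3, 1.2, 2.0, -0.9)
expansion_gap = max(abs(math.cos(t) ** 6 - cos6_multiple_angles(t))
                    for t in sample_angles)
print(expansion_gap)  # of the order of rounding error
```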

We can also use the polar form to sum certain series as in the
following example.

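Example 6.21. Find the sum of the series

\[ f(\theta) = 1 + \cos\theta + \frac{\cos 2\theta}{2!} + \frac{\cos 3\theta}{3!} + \cdots. \]

In summation notation, the series can be written

\[ f(\theta) = \sum_{n=0}^{\infty} \frac{\cos n\theta}{n!}. \]

Since $\cos n\theta = \mathrm{Re}\, e^{nj\theta}$, consider the series

\[ S(\theta) = 1 + e^{j\theta} + \frac{e^{2j\theta}}{2!} + \frac{e^{3j\theta}}{3!} + \cdots, \]

the real part of whose sum is the required sum $f(\theta)$. Thus

\[ \begin{aligned}
S(\theta) &= 1 + e^{j\theta} + \frac{(e^{j\theta})^2}{2!} + \frac{(e^{j\theta})^3}{3!} + \cdots \\
&= \exp(e^{j\theta}) = e^{\cos\theta + j\sin\theta} \\
&= e^{\cos\theta}[\cos(\sin\theta) + j\sin(\sin\theta)],
\end{aligned} \]

using the formula (5.4a) for the power series of the exponential function. Thus

\[ f(\theta) = \mathrm{Re}\, S(\theta) = e^{\cos\theta}\cos(\sin\theta). \]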
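The closed form can be compared with partial sums of the series. A sketch in Python; the function names, the number of terms (30) and the sample angles are our own choices.

```python
import math

def partial_sum(theta: float, terms: int = 30) -> float:
    """Sum of cos(n*theta)/n! for n = 0, 1, ..., terms - 1."""
    return sum(math.cos(n * theta) / math.factorial(n) for n in range(terms))

def closed_form(theta: float) -> float:
    """The sum found in Example 6.21: e^(cos t) * cos(sin t)."""
    return math.exp(math.cos(theta)) * math.cos(math.sin(theta))

sample_angles = (0.0, 0.5, 2.0, -1.3)
series_gap = max(abs(partial_sum(t) - closed_form(t)) for t in sample_angles)
print(series_gap)  # small: the factorials make the series converge rapidly
```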

Problems

6.1. (Section 6.1). Find the roots of the following quadratic equations:
(a) $x^2 + 2x + 5 = 0$; (b) $x^2 - 6x + 10 = 0$; (c) $x^2 + 2jx + 3 = 0$.

6.2. (Section 6.1). Find all the complex roots of $x^4 + 3x^2 - 4 = 0$.

6.3. (Section 6.1). Express the following complex numbers in standard form:
(a) $(1 - j) + (3 + 4j)$; (b) $2(3 - j) + 3(-1 - j)$; (c) $3(-1 + j) - 4(2 - 3j)$; (d) $3(1 + j)(2 - j)$; (e) $\dfrac{2 + j}{3 - j}$; (f) $\dfrac{(2 + j)(7 + 5j)}{3 - j}$; (g) $(-1 + 2j)^2$; (h) $(-1 + 2j)^2 + \dfrac{1}{(-1 + 2j)^2}$; (i) $(1 + j)^5$.

6.4. (Section 6.1). Find the boundary curve in the $(p, q)$-plane which separates the $(p, q)$ values giving real roots from those for the complex roots in the quadratic equation

\[ x^2 + px + q = 0, \]

where $p$ and $q$ are real parameters. Of the real roots, where in the $(p, q)$ plane do the roots which are both negative lie?

6.5. (Section 6.1). Let $z_1 = 3 - j$ and $z_2 = 1 + 2j$. Find, in standard form, the complex numbers
(a) Zj + z2; (b) z{z2; (c) zjz2, (d) zjz\.

6.6. (Section 6.1). Let $z_1 = 2 + 3j$ and $z_2 = -2 + j$. Find the following complex conjugates:
(a) z| + z2; (b) zxz2, (c) zx/z2, (d) zjz2.

6.7. Let $z = 1 + j$. Find the following complex numbers in standard form and plot their corresponding points in the Argand diagram:
(a) z; (b) z2; (c) z2; (d) 1/z; (e) z/z.

6.8. (Section 6.5). Three complex numbers are given by z,=2e1+j, z2 = 3e“j, and z3 = ^e_1+2j. Express the following complex numbers in standard form:
(a) z, + z2 + z3; (b)z,z2 + z3; (c) z,z2z3; (d) z,z2/z3; (e) z2z2 — 2z2.

6.9. (Section 6.3). Find the modulus and principal argument of each of the following complex numbers:
(a) $z_1 = -2 + 2j$; (b) $z_2 = 4 - 4\sqrt{3}\,j$; (c) $z_3 = -5j$; (d) $z_4 = -3$; (e) $z_5 = 3 + 4j$.

6.10. (Section 6.2). Let $z = x + jy$. Express each of the following equations in the complex variable $z$ in real form in terms of $x$ and $y$. Sketch, and identify in each case, the corresponding curve in the Argand diagram:
(a) $z\bar{z} = 1$; (b) $\mathrm{Im}\, z = 2$; (c) $|z - a| = 1$ where $a$ is a complex number; (d) $(z - \bar{z})^2 = -8(z + \bar{z})$; (e) $|z - 1| + |z + 1| = 4$; (f) Arg z = \n (see Section 6.3); (g) $|z| = \arg z$.

6.11. (Section 6.4). Express the following complex numbers in exponential form with principal arguments:
(a) $-1 + j$; (b) $-2$; (c) $-3j$; (d) $7 - 7j\sqrt{3}$; (e) $(1 - j)(1 + j\sqrt{3})$; (f) —— 4 ; (g) $e^{2 + j}$; (h) $(1 + j)e^{2j}$; (i) $(1 - j\sqrt{3})^9$; (j) 2 — 2j.

6.12. (Section 6.3). Using Euler’s formula (6.8) for $e^{\pm j\theta}$, obtain the trigonometric identities for $\cos(\theta_1 \pm \theta_2)$ and $\sin(\theta_1 \pm \theta_2)$.

6.13. (Section 6.2). Using the parallelogram rule, sketch the locations in the Argand diagram, for general complex numbers $z_1$ and $z_2$, the following points: z, + z2, zj + z2, Zj - z2, Zy + z2, zj - z2.

6.14. Let $f(\theta) = \cos\theta + j\sin\theta$. Verify that

\[ \frac{\mathrm{d}^2 f(\theta)}{\mathrm{d}\theta^2} = -f(\theta). \]

Show that it is still true if

\[ f(\theta) = a\cos\theta + b\sin\theta, \]

where $a$ and $b$ are arbitrary complex numbers.

6.15. Prove that $\tan ja = j\tanh a$, where $a$ is a real number.

6.16. Find all the complex roots of the following equations:
(a) $\cosh z = 1$; (b) $\sinh z = 1$; (c) $e^z = -1$; (d) $\cos z = \sqrt{2}$.

6.17. The logarithm of a complex number $z = re^{j\theta}$ is defined by

\[ \log z = \ln r + j\theta, \]

which will be a multivalued function because of the term $j\theta$. The principal value of the logarithm is denoted by Log z (note the capital letter L), and defined by

\[ \mathrm{Log}\, z = \ln r + j\theta, \]

where $-\pi < \theta \le \pi$.
(a) Find the principal value $\mathrm{Log}(1 + j\sqrt{3})$, and indicate its location on the Argand diagram.
(b) Find all roots of the equation $\log z = \pi j$.
(c) Express Log(ej) in standard form.
(d) Show that $e^{\log z} = z$.

6.18. If $z \ne 0$ and $c$ are complex numbers, then $z^c$ is defined by

\[ z^c = e^{c\ln z} \]

(See Problem 6.17).
(a) Express $2^j$ in standard form.
(b) Find the principal value of $j^j$.
(c) Find all complex roots of $z^j = -1$.
6.19. (Section 6.4). Find all complex roots of $z^5 = -1$, and sketch their locations on the Argand diagram.

6.20. (Section 6.5). Find the modulus, argument, real and imaginary parts of each of the following complex numbers:
(a) $2e^{3 + 2j}$; (b) $4e^{j}$; (c) 5ecosFt + jsini’'; (d) $e^{1 + j}$.

6.21. (Section 6.5). An oscillation in a system is given by $x = 0.04\, e^{-0.01t}\sin 12t$. Write this in the form

\[ x = \mathrm{Re}(c\, e^{(\alpha + j\beta)t}). \]

6.22. (Section 6.5). The current in a branch of a circuit is given by

\[ i(t) = c\, e^{-0.05t}\sin(0.4t + 0.5). \]

Write this in the form of the real part of a complex function.

6.23. A function $f(z)$, where $z = x + jy$, is known as a function of complex variable $z$. Find the real and imaginary parts of the following functions in terms of $x$ and $y$:
(a) $z^2$; (b) $z + 2z^2 + 3z^3$; (c) $\sin z$; (d) $\cos z$; (e) $e^z\cos z$; (f) ez\

6.24. Let $w = f(z)$, where $z = x + jy$ and $w = u + jv$ are complex variables. If $f(z) = z^2$, find $u$ and $v$ in terms of $x$ and $y$. The relation represents a mapping between two Argand diagrams. What curves do the hyperbolas $x^2 - y^2 = 1$ and $xy = 1$ map into the $(u, v)$ plane?

6.25. Show that the mapping

\[ w = z + \frac{c}{z}, \]

where $z = x + jy$ and $w = u + jv$ and $c$ is a real number, maps the circle $|z| = 1$ in the $z$ plane into an ellipse in the $w$ plane; and find its equation.

6.26. (Section 6.7). Show that

\[ \cos^6\theta = \tfrac{1}{32}(\cos 6\theta + 6\cos 4\theta + 15\cos 2\theta + 10), \]

and find $\sin^6\theta$.

6.27. The damped oscillation of a vibrating block is given by

\[ x = \mathrm{Re}\, z, \qquad z = e^{(-0.2 + 0.5j)t}, \]

in terms of the time $t$. Find $x$, and determine the values of $t$ where $x$ is zero. Find the velocity of the block
(a) as $\mathrm{d}x/\mathrm{d}t$;
(b) as $\mathrm{Re}\; \mathrm{d}z/\mathrm{d}t$;
and confirm that the answers are the same.

6.28. Given that $2 + j$ is a solution of the equation

\[ z^4 - 2z^3 - z^2 + 2z + 10 = 0, \]

find the other solutions.

6.29. Find the sum of the series
(a) $1 - \sin\theta + \dfrac{\sin 2\theta}{2!} - \dfrac{\sin 3\theta}{3!} + \cdots$;
(b) $1 + 2\cos\theta + \dfrac{2^2\cos 2\theta}{2!} + \dfrac{2^3\cos 3\theta}{3!} + \cdots$.
PART II MATRIX ALGEBRA AND
VECTORS

Matrix algebra
Contents
7.1 Matrix definition and notation 112
7.2 Rules of matrix algebra 113
7.3 Special matrices 118
7.4 The inverse matrix 122
Problems 126

7.1 Matrix definition and notation


In many applications in physics and engineering it is useful to be able to represent and manipulate data in tabular or array form. An array which obeys certain algebraic rules of operation is known as a matrix. Capital letters are usually used to denote matrices. Thus

\[ A = \begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & -4 \end{bmatrix} \]

is a matrix with 2 rows and 3 columns. The individual terms are known as elements: the element in the second row and third column is $-4$. This matrix is said to be of order $2 \times 3$, or a $2 \times 3$ matrix. A general $m \times n$ matrix, one with $m$ rows and $n$ columns, can be represented by the notation

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = [a_{ij}], \quad (1 \le i \le m,\ 1 \le j \le n), \]

where $a_{ij}$ is the element in the $i$th row and $j$th column of $A$; or by

\[ A = [a_{ij} : i = 1, \ldots, m;\ j = 1, \ldots, n], \]

or simply

\[ A = [a_{ij}], \]

for brevity, if it is clear in context that the matrix is $m \times n$.
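The row-and-column notation maps naturally onto a list of rows. The following is one possible representation, sketched in Python; the helper names `order` and `element` are ours, and `element` uses the text's 1-based numbering.

```python
# A 2 x 3 matrix stored as a list of rows (one possible representation).
A = [[1, 2, -1],
     [0, 3, -4]]

def order(matrix: list[list[float]]) -> tuple[int, int]:
    """Return (rows, columns) of a rectangular list-of-rows matrix."""
    return len(matrix), len(matrix[0])

def element(matrix: list[list[float]], i: int, j: int) -> float:
    """Return a_ij, with i and j counted from 1 as in the text."""
    return matrix[i - 1][j - 1]

print(order(A), element(A, 2, 3))
```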

A $1 \times 1$ matrix is simply a number: for example, $[-5] = -5$. Matrices which have either one row or column are known as vectors. Thus

\[ \begin{bmatrix} 1.3 & 2.9 & 4.6 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} -1.1 \\ 6.5 \\ -2.0 \end{bmatrix} \]

are respectively row and column vectors.

A matrix in which the number of rows equals the number of columns is called a square matrix: if $m \ne n$ then the matrix is said to be rectangular.

7.2 Rules of matrix algebra


We need to define consistent algebraic rules for manipulating
matrices, such concepts as addition, multiplication, etc. As we shall
see, these rules have their origins in the representation of linear
equations and linear transformations, but for the present we simply
state them as a list of rules.

1. Equality. Two matrices can only be equated if they are of the same order, that is, if they each have the same number of rows and the same number of columns. They are then said to be equal if the corresponding elements are equal. Thus if

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} e & f \\ g & h \end{bmatrix} \]

then $A = B$ if and only if $a = e$, $b = f$, $c = g$, and $d = h$. In general, if $A = [a_{ij}]$ and $B = [b_{ij}]$ are both $m \times n$ matrices, then $A = B$ if and only if $a_{ij} = b_{ij}$ for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

Example 7.1. Solve the equation $A = B$ when

\[ A = \begin{bmatrix} x & 1 & 2 \\ 0 & x^2 - y & 3 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 2 & 3 \end{bmatrix}. \]

Since $A$ and $B$ must have the same elements, it follows that $x = 1$ and $x^2 - y = 2$. Hence, $y = x^2 - 2 = -1$. The solution is $x = 1$, $y = -1$.

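2. Multiplication by a constant. Let $k$ be a constant or scalar. By the product $kA$ we mean the matrix in which every element of $A$ is multiplied by $k$. Thus, if

\[ A = \begin{bmatrix} 2.0 & 1.5 & 3.1 \\ -1.2 & 3.0 & -4.6 \end{bmatrix} \]

and $k = 10$, then

\[ kA = \begin{bmatrix} 10 \times 2.0 & 10 \times 1.5 & 10 \times 3.1 \\ 10 \times -1.2 & 10 \times 3.0 & 10 \times -4.6 \end{bmatrix} = \begin{bmatrix} 20 & 15 & 31 \\ -12 & 30 & -46 \end{bmatrix}. \]

Equally, we can ‘factorize’ a matrix. Thus

\[ A = \begin{bmatrix} 5 & 25 & -30 \\ 10 & 15 & -5 \end{bmatrix} = 5\begin{bmatrix} 1 & 5 & -6 \\ 2 & 3 & -1 \end{bmatrix}. \]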
3. Zero matrix. Any matrix in which every element is zero is called a zero or null matrix. If $A$ is a zero matrix, we can simply write $A = 0$. (A better notation is $0_{m \times n}$ to denote the zero matrix with $m$ rows and $n$ columns, but it is not widely used.)

4. Matrix sums and differences. The sum of two matrices $A$ and $B$ is defined if $A$ and $B$ are of the same order, in which case $A + B$ is defined as the matrix $C$ whose elements are the sums of the corresponding elements in $A$ and $B$. We write $C = A + B$. Thus, if $A = [a_{ij}]$ and $B = [b_{ij}]$ are both $m \times n$ matrices, then

\[ C = A + B = [a_{ij} + b_{ij}]. \]

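Example 7.2. If

\[ A = \begin{bmatrix} 1 & 3 \\ 2 & 2 \\ 3 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} -4 & -6 \\ -5 & -5 \\ -6 & -4 \end{bmatrix}, \]

then find $A + B$, $B + A$, and $A + 2B$.

We have

\[ A + B = \begin{bmatrix} 1 - 4 & 3 - 6 \\ 2 - 5 & 2 - 5 \\ 3 - 6 & 1 - 4 \end{bmatrix} = \begin{bmatrix} -3 & -3 \\ -3 & -3 \\ -3 & -3 \end{bmatrix} = -3\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \]

(by Rule 2).

Also

\[ B + A = \begin{bmatrix} -4 + 1 & -6 + 3 \\ -5 + 2 & -5 + 2 \\ -6 + 3 & -4 + 1 \end{bmatrix} = \begin{bmatrix} -3 & -3 \\ -3 & -3 \\ -3 & -3 \end{bmatrix} = A + B. \]

Further

\[ A + 2B = \begin{bmatrix} 1 & 3 \\ 2 & 2 \\ 3 & 1 \end{bmatrix} + 2\begin{bmatrix} -4 & -6 \\ -5 & -5 \\ -6 & -4 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 2 & 2 \\ 3 & 1 \end{bmatrix} + \begin{bmatrix} -8 & -12 \\ -10 & -10 \\ -12 & -8 \end{bmatrix} \ (\text{by Rule 2}) \ = \begin{bmatrix} -7 & -9 \\ -8 & -8 \\ -9 & -7 \end{bmatrix}. \]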
As the second sum suggests, the commutative property of the real numbers, namely $a_{ij} + b_{ij} = b_{ij} + a_{ij}$, implies the commutative property of matrix addition, that is,

\[ A + B = B + A. \]

The difference of two matrices is written as $A - B$ which is interpreted as $A + (-1)B$, using Rule 2 for the multiplication of $B$ by the number $-1$, and then Rule 4 for the sum of $A$ and $(-1)B$. In practice, we simply take the difference of corresponding elements.

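Example 7.3. Find $A - B$ and $2A - 3B$, if

\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & -3 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & -2 & -3 \\ 1 & 0 & -1 \end{bmatrix}. \]

We have

\[ A - B = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & -3 \end{bmatrix} - \begin{bmatrix} 2 & -2 & -3 \\ 1 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 1 - 2 & -1 + 2 & 2 + 3 \\ 0 - 1 & 2 - 0 & -3 + 1 \end{bmatrix} = \begin{bmatrix} -1 & 1 & 5 \\ -1 & 2 & -2 \end{bmatrix}. \]

Also

\[ 2A - 3B = \begin{bmatrix} 2 & -2 & 4 \\ 0 & 4 & -6 \end{bmatrix} - \begin{bmatrix} 6 & -6 & -9 \\ 3 & 0 & -3 \end{bmatrix} = \begin{bmatrix} -4 & 4 & 13 \\ -3 & 4 & -3 \end{bmatrix}. \]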

The rules of arithmetic as applied to the elements of matrices lead to the following results for matrices for which addition can be defined:

(a) $A + (B + C) = (A + B) + C$ (associative law of addition).

In other words, the order of addition of matrices is immaterial.

(b) $k(A + B) = kA + kB$, $(k + l)A = kA + lA$.

5. Matrix multiplication. We now need to define the concept of the product of two matrices. Not all matrices can be multiplied: they must have the right shape, or be conformable for multiplication to be defined. The product of $A$ and $B$, in this order, is written as $AB$ (no product sign is used), but it is only defined if the number of columns in $A$ equals the number of rows in $B$. Let us look at the case where $A$ is a $1 \times 3$ matrix, that is, a row vector, and $B$ is a $3 \times 1$ matrix, that is, a column vector, given by

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix}, \qquad B = \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}. \]

The product $AB$ is defined as the $1 \times 1$ matrix $C$ given by

\[ AB = \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix}\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix} = [a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31}] = C. \tag{7.1} \]

Here, the single remaining element is the sum of the products of corresponding elements from the row in $A$ and the column in $B$. Thus the product of a $1 \times 3$ matrix and a $3 \times 1$ matrix is a $1 \times 1$ matrix (or number). This is known as a row-on-column operation.
Suppose now that $A$ is a $2 \times 3$ matrix and that $B$ is a $3 \times 2$ matrix which are given by

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}, \qquad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}. \]

The product $AB$ is now a $2 \times 2$ matrix $C$ given by

\[ AB = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} \\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32} \end{bmatrix} = C. \tag{7.2} \]

Note that each row in $A$ ‘operates’ on each column in $B$ giving four elements in the $2 \times 2$ matrix $C$.

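Example 7.4. Find $AB$ if

\[ A = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & -3 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 3 \\ 1 & -1 \\ -2 & 4 \end{bmatrix}. \]

We have

\[ \begin{aligned}
AB &= \begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & -3 \end{bmatrix}\begin{bmatrix} 0 & 3 \\ 1 & -1 \\ -2 & 4 \end{bmatrix} \\
&= \begin{bmatrix} 1 \times 0 + (-1) \times 1 + 0 \times (-2) & 1 \times 3 + (-1) \times (-1) + 0 \times 4 \\ 2 \times 0 + 1 \times 1 + (-3) \times (-2) & 2 \times 3 + 1 \times (-1) + (-3) \times 4 \end{bmatrix} \\
&= \begin{bmatrix} -1 & 4 \\ 7 & -7 \end{bmatrix}.
\end{aligned} \]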

We can use a summation notation (Section 1.13) to condense the expanded sums of products which occur in matrices. The sum of a string of numbers, say, $c_1 + c_2 + c_3$ can be expressed as

\[ \sum_{i=1}^{3} c_i, \]

where i runs through all the integers between the lower limit on i under the $\sum$ symbol, and the upper limit above. Thus, for example,

\[ \sum_{i=3}^{8} h_i = h_3 + h_4 + h_5 + h_6 + h_7 + h_8. \]

We can also use the summation notation with the double-suffix notation, as in

\[ \sum_{i=1}^{4} b_{i6} = b_{16} + b_{26} + b_{36} + b_{46}. \]

The product given by (7.1) can be written

\[ AB = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} \end{bmatrix} = \left[ \sum_{j=1}^{3} a_{1j}b_{j1} \right] = \sum_{j=1}^{3} a_{1j}b_{j1}. \]

Similarly the elements in the square matrix (7.2) can be expressed as

\[ AB = \left[ \sum_{k=1}^{3} a_{ik}b_{kj} \right], \qquad i, j = 1, 2. \]
The summation formulae give a clue to the general expression for the product of an m × n matrix A and an n × p matrix B. Remember that the number of columns in A must always equal the number of rows in B for the product to be defined. Thus, the row-on-column definition of the product is the m × p matrix

\[ AB = \left[ \sum_{k=1}^{n} a_{ik}b_{kj} \right], \qquad i = 1, \ldots, m; \; j = 1, \ldots, p. \]

Multiplication rule
The element in the ith row and jth column of the product consists of the row-on-column product of the ith row in A and the jth column in B.
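
The multiplication rule translates almost word for word into a short program. The following is only a minimal sketch in plain Python: the function name `matmul` and the storage of a matrix as a list of rows are our own assumptions for illustration, not part of the text. The data are the matrices whose product is worked out in Example 7.4.

```python
def matmul(A, B):
    """Row-on-column product of an m x n matrix A and an n x p matrix B.

    Matrices are stored as lists of rows (an assumption of this sketch).
    """
    n = len(B)                      # number of rows in B
    p = len(B[0])                   # number of columns in B
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    # Element (i, j) is the sum over k of a_ik * b_kj.
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(p)]
            for row in A]


A = [[1, -1, 0],
     [2, 1, -3]]
B = [[0, 3],
     [1, -1],
     [-2, 4]]
C = matmul(A, B)
print(C)   # [[-1, 4], [7, -7]]
```

The check on the shapes mirrors the requirement that the product is only defined when the matrices are conformable.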

Example 7.5. If A is a 5 × 4 matrix, B is a 4 × 5 matrix, and C is a 6 × 4 matrix, which of the following products are defined: AB, BA, AC, CB, (AB)C, (CB)A?

AB is a 5 x 5 matrix.
BA is a 4 x 4 matrix.
AC is not defined since A has 4 columns and C has 6 rows.
CB is a 6 x 5 matrix.
(AB)C is not defined since AB is a 5 x 5 and C is a 6 x 4 matrix.
CB is a 6 x 5 matrix; hence (CB)A is a 6 x 4 matrix.

One conclusion which can be inferred from the previous example is that matrix multiplication does not commute; that is, in general, AB ≠ BA. As the previous example indicates, one or both products may not be defined; when both are defined, AB and BA may be of different order; and, even when both are defined and of the same order, AB is generally not equal to BA. So we must be careful about the order of multiplication. In the product AB, we say that A is multiplied on the right by B, or that B is multiplied on the left by A. The expressions 'A postmultiplied by B' and 'B premultiplied by A' are also used. Statements such as 'A is multiplied by B' can be ambiguous without carefully stating how the product occurs.

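Example 7.6. If

\[ A = \begin{bmatrix} 1 & -1 & 0 \\ 3 & -2 & -1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix}, \]

calculate AB and BA.

We have

\[ AB = \begin{bmatrix} 1 & -1 & 0 \\ 3 & -2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

and

\[ BA = \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 & 0 \\ 3 & -2 & -1 \end{bmatrix} = \begin{bmatrix} 7 & -5 & -2 \\ 7 & -5 & -2 \\ 7 & -5 & -2 \end{bmatrix}. \]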

This example illustrates the point that AB can be a zero matrix without either A or B or BA being zero. Also, as a consequence, $A(B - C) = 0$ does not necessarily imply $B = C$.
We state the following results concerning sums and products, but proofs are omitted:

(a) $A(B + C) = AB + AC$ (distributive law of addition),

(b) $A(BC) = (AB)C$ (associative law of multiplication),
provided that the products are defined.
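
The failure of commutativity is easy to observe numerically. The sketch below is illustrative only: the helper `matmul` is a hypothetical name of our own, and the two matrices are a pair with the property illustrated in Example 7.6, namely that AB is a zero matrix while BA is not.

```python
def matmul(A, B):
    """Row-on-column product of two conformable matrices (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(row[k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))] for row in A]


A = [[1, -1, 0],
     [3, -2, -1]]          # 2 x 3
B = [[1, 2],
     [1, 2],
     [1, 2]]               # 3 x 2

AB = matmul(A, B)          # 2 x 2
BA = matmul(B, A)          # 3 x 3
print(AB)                  # [[0, 0], [0, 0]]
print(BA)                  # [[7, -5, -2], [7, -5, -2], [7, -5, -2]]
```

Here the two products do not even have the same order, and AB vanishes although neither factor is a zero matrix.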

7.3 Special matrices


We define and give properties of several special matrices. Some
properties apply to rectangular matrices: others are specific to
square matrices.

The transpose of any matrix is one in which the rows and columns are interchanged. Thus the first row becomes the first column, the second row the second column, and so on. We denote the transpose of A by $A^T$. Hence, if

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \quad \text{then} \quad A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{bmatrix}. \]

The 3 × 2 matrix A becomes the 2 × 3 matrix $A^T$.

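Example 7.7. Find the transposes of A, B, $A + B^T$, and AB, where

\[ A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ -1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 3 & -1 & 0 \\ 1 & 2 & -2 \end{bmatrix}. \]

Confirm that $(AB)^T = B^T A^T$.

We see that

\[ A^T = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 1 & 1 \end{bmatrix}, \qquad B^T = \begin{bmatrix} 3 & 1 \\ -1 & 2 \\ 0 & -2 \end{bmatrix}, \]

\[ (A + B^T)^T = \left( \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ -1 & 1 \end{bmatrix} + \begin{bmatrix} 3 & 1 \\ -1 & 2 \\ 0 & -2 \end{bmatrix} \right)^T = \begin{bmatrix} 4 & 3 \\ -1 & 3 \\ -1 & -1 \end{bmatrix}^T = \begin{bmatrix} 4 & -1 & -1 \\ 3 & 3 & -1 \end{bmatrix} = A^T + (B^T)^T = A^T + B, \]

\[ (AB)^T = \left( \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 3 & -1 & 0 \\ 1 & 2 & -2 \end{bmatrix} \right)^T = \begin{bmatrix} 5 & 3 & -4 \\ 1 & 2 & -2 \\ -2 & 3 & -2 \end{bmatrix}^T = \begin{bmatrix} 5 & 1 & -2 \\ 3 & 2 & 3 \\ -4 & -2 & -2 \end{bmatrix}, \]

\[ B^T A^T = \begin{bmatrix} 3 & 1 \\ -1 & 2 \\ 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 \\ 2 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 1 & -2 \\ 3 & 2 & 3 \\ -4 & -2 & -2 \end{bmatrix}. \]

Hence $(AB)^T = B^T A^T$.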

1. Properties of the transpose. Provided that the sum A + B and product AB are defined for two matrices A and B, the last example points to the following two results concerning transposes:
(a) $(A + B)^T = A^T + B^T$;
(b) $(AB)^T = B^T A^T$.
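
Result (b), with its reversal of order, can be checked numerically. The following is a sketch under our own assumptions: the helper names `transpose` and `matmul` are hypothetical, matrices are stored as lists of rows, and the data are a 3 × 2 matrix A and a 2 × 3 matrix B of the kind used in Example 7.7.

```python
def transpose(M):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Row-on-column product of two conformable matrices."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(row[k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))] for row in A]


A = [[1, 2],
     [0, 1],
     [-1, 1]]
B = [[3, -1, 0],
     [1, 2, -2]]

lhs = transpose(matmul(A, B))              # (AB)^T
rhs = matmul(transpose(B), transpose(A))   # B^T A^T
print(lhs)           # [[5, 1, -2], [3, 2, 3], [-4, -2, -2]]
print(lhs == rhs)    # True
```

Note that the product $A^T B^T$ in the original order would here be a 2 × 2 matrix, so it could not possibly equal the 3 × 3 matrix $(AB)^T$.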

2. Symmetric matrices. A square matrix is said to be symmetric if $A = A^T$. Since rows and columns are interchanged in the transpose, this is equivalent to $a_{ij} = a_{ji}$ for all elements if $A = [a_{ij}]$. Symmetric matrices are easy to recognize since their elements are reflected in the leading diagonal, the diagonal string of elements from the top left to the bottom right of the matrix. Thus

\[ A = \begin{bmatrix} 1 & 3 & -2 \\ 3 & 2 & 4 \\ -2 & 4 & -1 \end{bmatrix} \]

is a 3 × 3 symmetric matrix.
A square matrix A for which $A = -A^T$ is said to be skew-symmetric. Note that, if A is any square matrix, then $A + A^T$ is symmetric and $A - A^T$ is skew-symmetric. The elements along the leading diagonal of a skew-symmetric matrix must all be zero. Thus

\[ \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & -3 \\ -2 & 3 & 0 \end{bmatrix} \]

is skew-symmetric.
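
The remark that $A + A^T$ is symmetric and $A - A^T$ is skew-symmetric for any square A can be tested on a sample matrix. This is a sketch only: the matrix A below is an arbitrary illustration of our own, and the helper names are hypothetical.

```python
def transpose(M):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def is_symmetric(M):
    """True if M equals its transpose."""
    return M == transpose(M)


def is_skew_symmetric(M):
    """True if M equals minus its transpose."""
    return M == [[-x for x in row] for row in transpose(M)]


A = [[1, 2, 0],
     [4, 3, -1],
     [6, 5, 2]]            # an arbitrary square matrix (illustrative)
At = transpose(A)

S = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, At)]   # A + A^T
K = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, At)]   # A - A^T

print(S, is_symmetric(S))        # [[2, 6, 6], [6, 6, 4], [6, 4, 4]] True
print(K, is_skew_symmetric(K))   # [[0, -2, -6], [2, 0, -6], [6, 6, 0]] True
```

Since $S + K = 2A$, halving the two matrices expresses A as the sum of a symmetric and a skew-symmetric matrix.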

3. Row and column vectors. As we defined them in Section 7.1, a row vector is a matrix with one row, and a column vector is one with one column. For vectors, we usually use bold-faced small letters and write, for example,

\[ \mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}, \qquad \mathbf{b}^T = \begin{bmatrix} b_1 & b_2 & \ldots & b_n \end{bmatrix}. \]

The transpose of a row vector is a column vector and vice versa. If A is an m × n matrix, then $A\mathbf{a}$ is a column vector with m rows.

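Example 7.8. If

\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 3 & 1 & -4 \\ -1 & 2 & 1 \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \qquad \mathbf{d} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \]

find the set of equations for x, y, z represented by $A\mathbf{x} = \mathbf{d}$.
The matrix equation in full is

\[ \begin{bmatrix} 1 & -1 & 2 \\ 3 & 1 & -4 \\ -1 & 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \]

that is,

\[ \begin{bmatrix} x - y + 2z \\ 3x + y - 4z \\ -x + 2y + z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}. \]

The set of linear equations for x, y, z is

\[ x - y + 2z = 2, \]

\[ 3x + y - 4z = 1, \]

\[ -x + 2y + z = -1. \]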

We shall say more about the solutions of linear equations


in Chapter 8.

4. Diagonal matrices. A square matrix all of whose elements off the leading diagonal are zero is called a diagonal matrix. Thus, if $A = [a_{ij}]$ is an n × n matrix, then A is diagonal if $a_{ij} = 0$ for all $i \neq j$. Hence

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]

is an example of a 3 × 3 diagonal matrix.

A diagonal matrix is obviously symmetric. If A and B are diagonal matrices of the same order then A + B and AB are also both diagonal.

5. Identity matrix. The diagonal matrix with all diagonal elements 1 is called the identity or unit matrix $I_n$. Hence, the 3 × 3 identity is

\[ I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{7.4} \]

(If there is no confusion likely to arise, $I_n$ or $I_3$ are simply replaced by the universal symbol I.) The reason for the definition becomes clear if we multiply a 3 × 3 matrix by $I_3$. If A is a general 3 × 3 matrix, then

\[ A I_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = A. \]

Similarly $I_3 A = A$.
A need not be square: provided that the products are defined, AI = A and IA = A for the appropriate identity matrix in each case.

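6. Powers of matrices. If A is a square matrix of order n × n, then we write AA as $A^2$, $AA^2$ as $A^3$, and so on.

If A is diagonal, as in

\[ A = \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}, \]

then

\[ A^2 = \begin{bmatrix} d_1^2 & 0 & 0 \\ 0 & d_2^2 & 0 \\ 0 & 0 & d_3^2 \end{bmatrix}, \qquad A^3 = \begin{bmatrix} d_1^3 & 0 & 0 \\ 0 & d_2^3 & 0 \\ 0 & 0 & d_3^3 \end{bmatrix}, \quad \text{etc.} \]

In particular $I_m^n = I_m$ for all positive integers n.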

7.4 The inverse matrix


If A and B are square matrices, each of order n × n, which satisfy the equations
\[ AB = BA = I_n, \]
then B is called the inverse of A. We say the inverse because, if a matrix B exists with this property, then it is uniquely determined by A (although we shall not prove this here). We write $B = A^{-1}$ (not $B = 1/A$). Since the definition is 'symmetric', it follows that A is the inverse of B, that is, $A = B^{-1}$. The inverse matrix defines 'division' for matrices, but analogies with numbers must not be taken too far. It is a particularly useful operation since it enables us to manipulate matrix equations. Thus, if AB = C, and the inverse of B exists, then we can solve the equation and find A as $A = CB^{-1}$.
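How do we find the inverse? Does it always exist? Let us look first at the case in which A is a 2 × 2 matrix, and consider the equation
\[ A\mathbf{x} = \mathbf{d}, \]
where

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad \mathbf{d} = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}. \]

Thus

\[ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}, \]

or

\[ a_{11}x_1 + a_{12}x_2 = d_1, \tag{7.5} \]

\[ a_{21}x_1 + a_{22}x_2 = d_2. \tag{7.6} \]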

Eliminate $x_2$ by multiplying (7.5) by $a_{22}$, (7.6) by $a_{12}$, and subtracting the two equations, so that

\[ (a_{11}a_{22} - a_{21}a_{12})x_1 = a_{22}d_1 - a_{12}d_2. \]

Similarly, elimination of $x_1$ leads to

\[ -(a_{11}a_{22} - a_{21}a_{12})x_2 = a_{21}d_1 - a_{11}d_2. \]

Provided that $a_{11}a_{22} - a_{21}a_{12} \neq 0$, it follows that

\[ x_1 = \frac{a_{22}d_1 - a_{12}d_2}{a_{11}a_{22} - a_{21}a_{12}}, \qquad x_2 = \frac{-a_{21}d_1 + a_{11}d_2}{a_{11}a_{22} - a_{21}a_{12}}. \]

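We can now express this pair of equations in matrix form:

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \frac{1}{a_{11}a_{22} - a_{21}a_{12}} \begin{bmatrix} a_{22}d_1 - a_{12}d_2 \\ -a_{21}d_1 + a_{11}d_2 \end{bmatrix} = C\mathbf{d}, \]

where

\[ C = \frac{1}{\det A} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix} \quad \text{with} \quad \det A = a_{11}a_{22} - a_{21}a_{12}. \]

If $A\mathbf{x} = \mathbf{d}$ is multiplied on the left by the inverse $A^{-1}$, then

\[ A^{-1}A\mathbf{x} = I_2\mathbf{x} = \mathbf{x} = A^{-1}\mathbf{d}. \]

Hence $A^{-1}$ can be identified with the matrix C. In other words,

\[ A^{-1} = C = \frac{1}{\det A} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}. \tag{7.7} \]

It is worth remembering the rule for 2 × 2 matrices by which $A^{-1}$ can be constructed from A.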

Rule for 2 × 2 inverse

The diagonal elements $a_{11}$ and $a_{22}$ are interchanged, the signs are changed for the other two elements, and the matrix is divided by det A. (7.8)
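
The rule is short enough to code directly. The following is a minimal sketch, assuming integer entries and using Python's `fractions.Fraction` to keep the arithmetic exact; the function name `inverse_2x2` is our own choice for illustration. The sample matrix has determinant 7.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix with integer entries, by the rule above.

    The diagonal elements are interchanged, the signs of the other two
    elements are changed, and every element is divided by det A.
    """
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a21 * a12
    if det == 0:
        raise ValueError("matrix is singular: det A = 0")
    return [[Fraction(a22, det), Fraction(-a12, det)],
            [Fraction(-a21, det), Fraction(a11, det)]]


A = [[1, 3],
     [-1, 4]]
A_inv = inverse_2x2(A)
print(A_inv)
# [[Fraction(4, 7), Fraction(-3, 7)], [Fraction(1, 7), Fraction(1, 7)]]

# Check that A times its inverse is the identity matrix I_2.
product = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(product == [[1, 0], [0, 1]])   # True
```

The explicit test for a zero determinant reflects the proviso under which the formula was derived.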

The number $\det A = a_{11}a_{22} - a_{21}a_{12}$ is known as the determinant of A. It may also be written directly in terms of the corresponding matrix as

\[ \det A = \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \]

or, more briefly, as

\[ \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}. \]

The determinant is a function of the corresponding matrix, but is a number, not an array. If det A = 0, then the matrix has no inverse. It is then said to be singular: if A has an inverse, then A is said to be nonsingular. We shall say more about determinants in the next chapter.
Example 7.9. Decide whether A, where

\[ A = \begin{bmatrix} 1 & 3 \\ -1 & 4 \end{bmatrix}, \]

is singular or not. If it is nonsingular, find its inverse.

Here $a_{11} = 1$, $a_{12} = 3$, $a_{21} = -1$, and $a_{22} = 4$. Hence

\[ \det A = \begin{vmatrix} 1 & 3 \\ -1 & 4 \end{vmatrix} = 1 \times 4 - (-1) \times 3 = 4 + 3 = 7. \]

Since det A ≠ 0, A is nonsingular. Its inverse is, by the rule above,

\[ A^{-1} = \frac{1}{7} \begin{bmatrix} 4 & -3 \\ 1 & 1 \end{bmatrix}. \]

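Example 7.10. If

\[ A = \begin{bmatrix} 1 & 3 \\ -1 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix}, \]

find $A^{-1}$, $B^{-1}$, and $(AB)^{-1}$.

Always check the determinants first. Here

\[ \det A = 4 - (-3) = 7, \qquad \det B = -1 - 2 = -3. \]

Hence

\[ A^{-1} = \frac{1}{7} \begin{bmatrix} 4 & -3 \\ 1 & 1 \end{bmatrix}, \qquad B^{-1} = -\frac{1}{3} \begin{bmatrix} -1 & -2 \\ -1 & 1 \end{bmatrix}. \]

Also

\[ AB = \begin{bmatrix} 1 & 3 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 3 & -6 \end{bmatrix}. \]

Thus, $\det(AB) = -24 + 3 = -21$, and

\[ (AB)^{-1} = -\frac{1}{21} \begin{bmatrix} -6 & 1 \\ -3 & 4 \end{bmatrix}. \]

Note that

\[ B^{-1}A^{-1} = -\frac{1}{21} \begin{bmatrix} -1 & -2 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ 1 & 1 \end{bmatrix} = -\frac{1}{21} \begin{bmatrix} -6 & 1 \\ -3 & 4 \end{bmatrix} = (AB)^{-1}. \]

This last result suggests the following correct rule for the inverse of the product of two square matrices, namely

\[ (AB)^{-1} = B^{-1}A^{-1}. \]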

For the inverse of a 3 × 3 matrix we can adopt the same approach as for the 2 × 2 case by eliminating pairs from $x_1, x_2, x_3$ from the set of equations $A\mathbf{x} = \mathbf{d}$ or

\[ \left. \begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= d_1, \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= d_2, \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= d_3. \end{aligned} \right\} \tag{7.9} \]

The result is
\[ \mathbf{x} = A^{-1}\mathbf{d}, \]
where

\[ A^{-1} = \frac{1}{\det A} \begin{bmatrix} a_{22}a_{33} - a_{32}a_{23} & -(a_{12}a_{33} - a_{32}a_{13}) & a_{12}a_{23} - a_{22}a_{13} \\ -(a_{21}a_{33} - a_{31}a_{23}) & a_{11}a_{33} - a_{31}a_{13} & -(a_{11}a_{23} - a_{21}a_{13}) \\ a_{21}a_{32} - a_{31}a_{22} & -(a_{11}a_{32} - a_{31}a_{12}) & a_{11}a_{22} - a_{21}a_{12} \end{bmatrix} \tag{7.10} \]

with

\[ \det A = a_{11}(a_{22}a_{33} - a_{32}a_{23}) - a_{12}(a_{21}a_{33} - a_{31}a_{23}) + a_{13}(a_{21}a_{32} - a_{31}a_{22}), \tag{7.11} \]

provided that det A ≠ 0. Again det A is known as the determinant of A and denoted by

\[ \det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}. \]

Equation (7.10) gives the inverse matrix, as can be verified by calculation of the products $AA^{-1}$ and $A^{-1}A$. Even for 3 × 3 matrices, the formula for the inverse is quite complicated. If det A = 0, then the matrix is singular.
Determinants, which have arisen in the context of inverse matrices, have important properties particularly with regard to their evaluation. They will be discussed in more detail in the next chapter.

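Example 7.11. Verify by direct multiplication that

\[ A = \begin{bmatrix} 0 & 1 & 1 \\ -1 & 1 & 1 \\ 1 & -1 & 1 \end{bmatrix} \]

has the inverse

\[ B = \frac{1}{2} \begin{bmatrix} 2 & -2 & 0 \\ 2 & -1 & -1 \\ 0 & 1 & 1 \end{bmatrix}. \]

Check the matrix product BA:

\[ BA = \frac{1}{2} \begin{bmatrix} 2 & -2 & 0 \\ 2 & -1 & -1 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 1 \\ -1 & 1 & 1 \\ 1 & -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]

Hence $B = A^{-1}$.
Note that we need only verify that either $BA = I_3$ or $AB = I_3$, not both. If $BA = I_3$, then $AB = I_3$, and vice versa.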

Example 7.12. Using formula (7.10) find $A^{-1}$, where

\[ A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & -1 & 5 \\ -1 & -1 & 2 \end{bmatrix}. \]

We first find det A:

\[ \det A = 2 \times [(-1) \times 2 - (-1) \times 5] - 1 \times [1 \times 2 - (-1) \times 5] + 0 \times [1 \times (-1) - (-1) \times (-1)] = 2 \times 3 - 1 \times 7 = -1. \]

Thus, from (7.10),

\[ A^{-1} = \frac{1}{-1} \begin{bmatrix} (-1) \times 2 - (-1) \times 5 & -[1 \times 2 - (-1) \times 0] & 1 \times 5 - (-1) \times 0 \\ -[1 \times 2 - (-1) \times 5] & 2 \times 2 - (-1) \times 0 & -(2 \times 5 - 1 \times 0) \\ 1 \times (-1) - (-1) \times (-1) & -[2 \times (-1) - (-1) \times 1] & 2 \times (-1) - 1 \times 1 \end{bmatrix} \]

\[ = -\begin{bmatrix} 3 & -2 & 5 \\ -7 & 4 & -10 \\ -2 & 1 & -3 \end{bmatrix} = \begin{bmatrix} -3 & 2 & -5 \\ 7 & -4 & 10 \\ 2 & -1 & 3 \end{bmatrix}. \]

This is the 'formula' method of finding the inverse of a 3 × 3 matrix, but it is not an efficient procedure numerically. There are better methods using row operations which will be explained in Chapter 12.

Problems

7.1. (Section 7.1). The matrix A = [afj.] is given by 2 1


1 2 3 C = -1 1
-1 0 1 0 1
A =
2 -2 4 verify the distributive law A(B + C) = AB + AC for
the three matrices.
1 5 -3
7.5. (Section 7.2). Let
Identify the elements a13, and a31.
-1 0
-1 2
7.2. (Section 7.2). Solve the equation A — B, where A = B = 1 2
2 3
1 -2 1 X 3 -1
A = 3 1 y—x 1 1
_ -1 2_ -1 2 2
for x and y. Verify the associative law A{BC) = (AB)C for these
matrices.
7.3. (Section 7.2). Given that
7.6. (Section 7.2). Let
1 2 -3" 2 -1
, B = 4 2 -2 -1
-1 0 4^ 4 1 A = B =
2 1 4 2
find the matrices A + B, A — B, and 2A — 3B.
Show that AB = 0, but that BA # 0.

7.4. (Section 7.2). Given that 7.7. (Section 7.3). Let


1 0 2 1 3
1 3 0
A = 2 1 , A = 1 -12
2 1 1
_-l -1_ _ —2 1 1_

i
Find a matrix C such that A + C is the identity matrix 4
I3. Deduce that AC = CA. Find AC, and hence the 1
B = ' 4
matrix A2 + C2.
l
' 2j

7.8. (Section 7.3). A general n x n matrix is given by


find the products AB and BA, and confirm that B is
A = [fly]. the inverse of A.

Show that A + A7 is a symmetric matrix, and that


A — At is skew-symmetric. 7.13. Let
Express the matrix 2 0 1
2 1 3 A = 2-2 2
A = -2 0 1 0 4 1
3 1 2
Find the powers A2 and A3, and verify that
as the sum of a symmetric matrix and a skew-symmetric
A3 - A2 - 12A = -I2I3.
matrix.
Hence find the inverse matrix A~' by multiplying the
7.9. (Section 7.3). Let equation on both sides by A~l.

1 3
7.14. (Section 7.4). Using the rule for inverses of 2 x 2
-1 2 matrices, write down the inverses of:
0 1^ 1 2 3
(a) (b)
Write down AJ, and find the products AAr and ATA. -1 -7 11

1
O
1 O'

1
7.10. (Section 7.3). If (c) * (d)
— — 8 0

_0 —2_
1 -1 2 X
f —99 1001
A = 0 1 , X = y ’ (e)
97 98_
2 -3 7

2 7.15. (Section 7.4) The sparsely filled matrix A is given by

d= 0 0 10 0
-1 0 0 10
A =
write down the set of equations defined by Ax = d. 10 0 0
Confirm that the same set of equations is given by 0 0 0 1
xTAr = d\
Thinking about the row-on-column rule for matrix
7.11. (Section 7.3). Let multiplication, can you guess the columns in the inverse
matrix A-1? How would this rule generalize to the
1 0 0
matrix
A = a -1 0
0 a 0 0
b c 1
0 0 b 0
A = 9
Find A2. For what relation between a, b, and c is
c 0 0 0
A2 = I3? In this case, what is the inverse matrix of A?
What is the inverse matrix of A2"-1 (n a positive 0 0 0 d
integer)?
7.16. (Section 7.4). Write down the set of equations
7.12. If given by Ax = d, where

2 0 1 0 1 1 X

A = 1 -2 2 , X = y
2-2 2
1_

_ 1 0 1_ z
O

1
1

6 they, and what implications have they for the given


points in the plane? Find a, b, and c in terms of the
d= 3
given data. Find the equation of the parabola through
-9 the points (— 2, 0), (1, —2), (3, 4).

Find A 1 and calculate the product A 'd. What is


7.19. The elements in a 3x3 matrix A = [a,;] are
the solution of the equation?
given by the rule
7.17. (Section 7.4). If A and B are both n x n matrices au = (-jY - ij-
with A nonsingular, show that
Write down the matrix A. Calculate det A and the
(A ~ 'BA)2 = A~'B2A. inverse of A.
1 2 1 2
A = and B = Calculate
-1 1_ -1 0 7.20. If
a'b4a. 2 1 3
A = 1 -12
7.18. (Section 7.4). For interpolation purposes for
given data, it is required that the parabola y = a + bx + 1 2 1
cx2 should pass through the three points with coordin¬ show that A3 — 2A2 — 9,4 = 0, but that A2 — 2A —
ates (.v,, >’,), (x2, y2), and (x3, y3) in the (x, y) plane. 913 ^=0. Does the inverse of A exist?
Show that the matrix equation for the constants a, b, and
c can be written as
7.21. An nth-order square matrix A satisfies A2 = A
1 x, x, a V’l and A A I„. Show that
1 x2 x\ b = y2 (a) det A = 0;
c (b) (i„ + A)"1 =i„-U;
_ 1 -V3 *3. _ y 3-
(c) (l„ + A)m = I„ + (2m — 1) A for any positive integer
Verify that the inverse of the 3x3 matrix on the left is m.
AjXj •VjAl AI AS

(A , V, )(.V, A,) (Aj - V,)(.Y, - A,) (A, - Aj)(AS - Aj) *i Li X2 yi


Let Ax = m2 = . Calculate
A. + AS
A'2 + -Vj Aj + A‘|
-JT _ —>’2 X2_
(.V2 - A", )(.Yj - .V,) (-Aj - A-,)(.Y, - A,) (A | - A_,)(AS - Aj) Ax A- A2, AxA2, A2Ax, Axl. Compare your results with
I I z, -I- z2, z(z2, and l/zx, where z, = xx + jy2 and z2 = x2 +
( V, - A’| )(A'j - A’, ) (A., - A,)(.Y, - AS) (A, - A,)(A, - Aj) jy2 are complex numbers (see Chapter 6). Consider the
possibility of developing further parallels, such as to |z|
provided that certain conditions are met. What are and e'.
Determinants

Contents
8.1 The determinant of a square matrix 129
8.2 Properties of determinants 132
8.3 The adjoint and inverse matrices 137
Problems 139

8.1 The determinant of a square matrix


As we saw in (7.8) (Section 7.4), certain combinations of elements
from a square matrix appear as the denominator in the construction
of the inverse matrix. If this number, called the determinant of the
matrix, turns out to be zero, then the matrix is singular and no
inverse exists. Here we look at the definition of the determinant of
a matrix and its properties. Special emphasis will be placed on the
2x2 and 3x3 determinants which suggest generalizations to
higher-order cases.
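Given the matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \]

then the determinant of A is denoted and defined by

\[ \det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{21}a_{12}. \tag{8.1} \]

(The notation $|A|$ is also used extensively for the determinant.) For the 3 × 3 matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \]

its determinant is (see (7.11)) defined as

\[ \det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} - a_{11}a_{32}a_{23} - a_{12}a_{21}a_{33} + a_{12}a_{31}a_{23} + a_{13}a_{21}a_{32} - a_{13}a_{31}a_{22}. \tag{8.2} \]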
In (8.2) there are six terms, each of which is the product of three elements, each from a different row and column. In other words, there are never two elements in any term from the same row or column. It can be seen that there must be just 3 × 2 × 1 = 6 terms of this form, because any one of the three elements can be chosen from row 1, either of the two remaining eligible elements from row 2, and then just one element from row 3.
Each term is prefixed by either +1 or −1. This is decided according to the following rule. Write each term in the form

\[ a_{1j_1}a_{2j_2}a_{3j_3}, \]

in which the first suffixes are in consecutive increasing order. Examine the second suffix permutation $j_1 j_2 j_3$. The permutation is said to be even (odd) if it has an even (odd) number of inversions. An inversion occurs whenever a larger integer precedes a smaller one. Thus the permutation 132 is odd, since 3 precedes 2, but 312 is even since there are two inversions because 3 precedes 1 and 2. If the number of inversions is even, then a + sign is attached; if the number is odd, then a − sign is attached. This rule can be extended to a determinant of any order.
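
The permutation rule can be turned into a program that works for a determinant of any order, although, as with hand calculation, it is not an efficient method. The sketch below is illustrative only: the function names are our own, columns are numbered from 0 rather than 1 (which does not affect the count of inversions), and the sample matrix is an arbitrary 3 × 3 example.

```python
from itertools import permutations


def inversions(perm):
    """Count the pairs in which a larger integer precedes a smaller one."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])


def det_by_permutations(A):
    """Sum of signed products, one element from each row and each column."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):      # perm plays the role of j1 j2 ... jn
        term = -1 if inversions(perm) % 2 else 1
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total


print(inversions((1, 3, 2)))   # 1, so 132 is odd
print(inversions((3, 1, 2)))   # 2, so 312 is even

A = [[2, 1, 0],
     [1, -1, 5],
     [-1, -1, 2]]
print(det_by_permutations(A))  # -1
```

For an n × n matrix the loop runs over n! permutations, which is why this expansion says more about structure than about practical evaluation.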
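While this expansion of the determinant says something about the structure of the determinant, it is not really a practical rule for evaluating determinants. Returning to (8.2), we can rewrite det A as

\[ \det A = a_{11}(a_{22}a_{33} - a_{32}a_{23}) - a_{12}(a_{21}a_{33} - a_{31}a_{23}) + a_{13}(a_{21}a_{32} - a_{31}a_{22}). \]

The terms in parentheses are themselves 2 × 2 determinants. Thus

\[ \det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}. \tag{8.3} \]

This expression is called an expansion by the top row. The term associated with $a_{11}$, namely

\[ C_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \]

is known as the cofactor of $a_{11}$. The cofactor of an element of A is obtained by deleting the row and column through the element and writing down the determinant of the elements of the remaining 2 × 2 submatrix with a + or − sign attached. The cofactors of $a_{12}$ and $a_{13}$ are

\[ C_{12} = -\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}, \qquad C_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}, \]

where the signs attached should be noted. In the same way the cofactors of the elements in the second and third rows are defined as follows:

\[ C_{21} = -\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}, \qquad C_{22} = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}, \qquad C_{23} = -\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix}, \]

\[ C_{31} = \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}, \qquad C_{32} = -\begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}, \qquad C_{33} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}. \]

The signs associated with the cofactors alternate, starting with a + at the top left as we move across or down from the top left-hand corner as shown:

\[ \begin{matrix} + & - & + \\ - & + & - \\ + & - & + \end{matrix} \]

Alternatively the sign associated with $C_{ij}$ is + if i + j is even and − if i + j is odd.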

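Example 8.1. Let

\[ A = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 3 & -2 \\ 1 & -1 & 1 \end{bmatrix}. \]

Evaluate det A by expanding by the first row. Find the cofactors $C_{13}$, $C_{23}$, $C_{33}$ of the elements in the third column. Calculate

\[ a_{13}C_{13} + a_{23}C_{23} + a_{33}C_{33}, \]

and verify that it also equals det A.

By (8.3),

\[ \det A = 1 \times \begin{vmatrix} 3 & -2 \\ -1 & 1 \end{vmatrix} - (-1) \times \begin{vmatrix} 2 & -2 \\ 1 & 1 \end{vmatrix} + 0 \times \begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = (3 - 2) + (2 + 2) = 5. \]

The cofactors are (with due regard to the sign convention)

\[ C_{13} = \begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = -5, \qquad C_{23} = -\begin{vmatrix} 1 & -1 \\ 1 & -1 \end{vmatrix} = 0, \qquad C_{33} = \begin{vmatrix} 1 & -1 \\ 2 & 3 \end{vmatrix} = 5. \]

Hence expansion by the third column gives

\[ a_{13}C_{13} + a_{23}C_{23} + a_{33}C_{33} = 0 \times (-5) + (-2) \times 0 + 1 \times 5 = 5, \]

which is the same as det A. (See also Rule 4 below.)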

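Example 8.2. Evaluate the determinant

\[ \det A = \begin{vmatrix} 1 & 2 & k \\ 2 & -1 & 3 \\ -1 & 4 & -2 \end{vmatrix} \]

for any k. Find the value of k for which the determinant is zero.

Expanding by the first row gives

\[ \det A = 1 \times \begin{vmatrix} -1 & 3 \\ 4 & -2 \end{vmatrix} - 2 \times \begin{vmatrix} 2 & 3 \\ -1 & -2 \end{vmatrix} + k \times \begin{vmatrix} 2 & -1 \\ -1 & 4 \end{vmatrix} \]

\[ = 1 \times (2 - 12) - 2 \times (-4 + 3) + k \times (8 - 1) = -10 + 2 + 7k = -8 + 7k. \]

Hence det A = 0 if $k = \frac{8}{7}$.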

The notion of cofactors generalizes to higher-order determinants.


The alternating-sign rule applies from the top left-hand corner. For
example, a 4 x 4 determinant has 16 cofactors, each of which is a
3x3 determinant.
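
Because each cofactor is itself a determinant of lower order, expansion by the top row lends itself to a recursive program. The following is a sketch only, with hypothetical function names `minor` and `det` of our own; it is practical just for small matrices, since the number of terms grows factorially with the order. The two sample matrices are illustrative.

```python
def minor(A, i, j):
    """Submatrix of A obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by expansion along the top row, with alternating signs."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))


A3 = [[1, -1, 0],
      [2, 3, -2],
      [1, -1, 1]]
print(det(A3))   # 5

A4 = [[1, 2, 1, 2],
      [0, 2, 0, 0],
      [-1, 3, 0, 4],
      [-1, 2, 0, -1]]
print(det(A4))   # -10
```

With 0-based indices the sign $(-1)^j$ on the top row reproduces the pattern +, −, +, … starting from the top left-hand corner.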

8.2 Properties of determinants


We list here some properties of determinants. Many of them are
useful in evaluating determinants. We shall not aim for complete
generality but illustrate the rules mainly in the 3x3 case. However,
the rules have obvious generalizations to higher orders.

1. $\det A^T = \det A$, where $A^T$ is the transpose of A (see Section 7.3).

The determinants of a square matrix and its transpose are equal since

\[ \det A^T = \begin{vmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{21}a_{12}a_{33} + a_{21}a_{13}a_{32} + a_{31}a_{12}a_{23} - a_{31}a_{13}a_{22}, \]

and all terms in this expansion can be identified with those in (8.2). Hence $\det A^T = \det A$.

Example 8.3. Evaluate

\[ \det A = \begin{vmatrix} 1 & 28 & -29 \\ 0 & 1 & -4 \\ 0 & -2 & 5 \end{vmatrix}. \]

Since the determinant has two zeros in the first column, it is advantageous to use Rule 1. The determinant of the transpose of A is given by

\[ \det A^T = \begin{vmatrix} 1 & 0 & 0 \\ 28 & 1 & -2 \\ -29 & -4 & 5 \end{vmatrix}, \]

which now has two zeros in the first row. Hence the expansion by the top row becomes particularly easy:

\[ \det A^T = 1 \times \begin{vmatrix} 1 & -2 \\ -4 & 5 \end{vmatrix} = 5 - 8 = -3. \]

2. If every element of any single row or column of the matrix A is multiplied by a scalar k, then the determinant of this matrix is k det A.
(Note: this rule is different from Rule 2, Section 7.2, for matrices.)
This is a self-evident result, since just one element from every row and column appears in every term. Thus, by (8.2), if every element of the second row in A is multiplied by k, then

\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ ka_{21} & ka_{22} & ka_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}ka_{22}a_{33} - a_{11}ka_{23}a_{32} - a_{12}ka_{21}a_{33} + a_{12}ka_{23}a_{31} + a_{13}ka_{21}a_{32} - a_{13}ka_{22}a_{31} = k \det A. \]

By putting k = 0 in this result, note that any determinant must have zero value if all the elements of any row or column are zeros.

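Example 8.4. Evaluate the determinant

\[ \Delta = \begin{vmatrix} -1 & 99 & 1 \\ 2 & 33 & -2 \\ 3 & 55 & 1 \end{vmatrix}. \]

Since the second column obviously has a factor of 11, we can remove this factor from the second column before expansion. Thus, by Rule 2,

\[ \Delta = 11 \times \begin{vmatrix} -1 & 9 & 1 \\ 2 & 3 & -2 \\ 3 & 5 & 1 \end{vmatrix} = 11 \times \left\{ (-1) \times \begin{vmatrix} 3 & -2 \\ 5 & 1 \end{vmatrix} - 9 \times \begin{vmatrix} 2 & -2 \\ 3 & 1 \end{vmatrix} + 1 \times \begin{vmatrix} 2 & 3 \\ 3 & 5 \end{vmatrix} \right\} \]

\[ = 11 \times [-(3 + 10) - 9 \times (2 + 6) + (10 - 9)] = 11 \times (-13 - 72 + 1) = -924. \]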

3. If B is obtained from A by interchanging two rows (or columns) then det B = −det A.

Suppose, for example, that rows 1 and 3 are interchanged, so that

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \qquad B = \begin{bmatrix} a_{31} & a_{32} & a_{33} \\ a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \end{bmatrix}. \]

Then, by analogy with (8.2), the expansion of B by its first row is given by

\[ \det B = a_{31} \begin{vmatrix} a_{22} & a_{23} \\ a_{12} & a_{13} \end{vmatrix} - a_{32} \begin{vmatrix} a_{21} & a_{23} \\ a_{11} & a_{13} \end{vmatrix} + a_{33} \begin{vmatrix} a_{21} & a_{22} \\ a_{11} & a_{12} \end{vmatrix} \]

\[ = a_{31}a_{22}a_{13} - a_{31}a_{23}a_{12} - a_{32}a_{21}a_{13} + a_{32}a_{23}a_{11} + a_{33}a_{21}a_{12} - a_{33}a_{22}a_{11}. \]

These are the same terms as those present in (8.2) except that the sign of every term is changed. Therefore in this case
\[ \det B = -\det A. \]
The same is true whichever row or column pairs are exchanged. The rule applies to a determinant of any order.

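Example 8.5. Evaluate the determinant

\[ \Delta = \begin{vmatrix} 1 & 2 & 1 & 2 \\ 0 & 2 & 0 & 0 \\ -1 & 3 & 0 & 4 \\ -1 & 2 & 0 & -1 \end{vmatrix}. \]

There are several ways of approaching the evaluation of this determinant since the second row and third column each have three zeros. It is obviously advantageous to have as many zeros as possible in the top row. With this in view, interchange rows 1 and 2 using Rule 3:

\[ \Delta = -\begin{vmatrix} 0 & 2 & 0 & 0 \\ 1 & 2 & 1 & 2 \\ -1 & 3 & 0 & 4 \\ -1 & 2 & 0 & -1 \end{vmatrix}. \]

Expanding by row 1, remembering the sign rule for cofactors:

\[ \Delta = 2 \times \begin{vmatrix} 1 & 1 & 2 \\ -1 & 0 & 4 \\ -1 & 0 & -1 \end{vmatrix}. \]

Now successively use Rule 1 and interchange rows with columns, and then Rule 3 and interchange the new rows 1 and 2:

\[ \Delta = 2 \times \begin{vmatrix} 1 & -1 & -1 \\ 1 & 0 & 0 \\ 2 & 4 & -1 \end{vmatrix} = (-2) \times \begin{vmatrix} 1 & 0 & 0 \\ 1 & -1 & -1 \\ 2 & 4 & -1 \end{vmatrix} = (-2) \times \begin{vmatrix} -1 & -1 \\ 4 & -1 \end{vmatrix} = (-2) \times (1 + 4) = -10. \]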

4. Expansion by any row or column.

From (8.2), by grouping the terms differently, we can write, for example,

\[ \det A = a_{31}(a_{12}a_{23} - a_{13}a_{22}) - a_{32}(a_{11}a_{23} - a_{13}a_{21}) + a_{33}(a_{11}a_{22} - a_{12}a_{21}) \]

\[ = a_{31} \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix} - a_{32} \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} + a_{33} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \]

\[ = a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33}. \]

Here the elements $a_{31}$, $a_{32}$, $a_{33}$ constitute the third row, and we call this the expansion of det A by the third row.
It can be shown that the expansion can be written down similarly using any row or column. Thus

\[ \det A = a_{12}C_{12} + a_{22}C_{22} + a_{32}C_{32} \]

is an expansion by the second column.

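Example 8.6. Evaluate det A, where

\[ A = \begin{bmatrix} 1 & 3 & 0 & 1 \\ 1 & 0 & 0 & 2 \\ -1 & 2 & 2 & 4 \\ 2 & 1 & 0 & -1 \end{bmatrix}. \]

Since column 3 contains three zeros, expand by this column. The cofactor of the element in row 3, column 3, is the 3 × 3 determinant obtained from A by deleting the third row and third column in A. It is associated with a + sign. Hence

\[ \det A = 2 \begin{vmatrix} 1 & 3 & 1 \\ 1 & 0 & 2 \\ 2 & 1 & -1 \end{vmatrix} \quad \text{(remember the sign rule)} \]

\[ = 2 \left\{ -1 \times \begin{vmatrix} 3 & 1 \\ 1 & -1 \end{vmatrix} - 2 \times \begin{vmatrix} 1 & 3 \\ 2 & 1 \end{vmatrix} \right\} \quad \text{(expanding by row 2)} \]

\[ = 2(4 + 10) = 28. \]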

5. If two rows (or columns) of A are identical, then det A = 0. This is a direct consequence of Rule 3. Interchange the two identical rows (columns). The determinant looks the same, but its value is now −det A. Hence det A = −det A, which implies that det A = 0.
Also, as a consequence of Rule 2, it follows that, if the corresponding elements of two rows (columns) are in the same ratio, then the value of the determinant is zero. Thus, for example,

\[ \begin{vmatrix} 99 & 18 & 63 \\ 11 & 2 & 7 \\ -2 & 3 & 4 \end{vmatrix} = 9 \begin{vmatrix} 11 & 2 & 7 \\ 11 & 2 & 7 \\ -2 & 3 & 4 \end{vmatrix} \quad \text{(by Rule 2)} \]

\[ = 0 \quad \text{(by Rule 5).} \]

6. If the matrix B is constructed from A by adding k times one row (or column) to another row (column) then det B = det A: in other words, any number of such operations on rows and on columns has no effect on the value of det A.
For our standard matrix A, consider the matrix B which is obtained from A by adding k times the elements in the first row to the elements in the third row. Thus

\[ \det B = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} + ka_{11} & a_{32} + ka_{12} & a_{33} + ka_{13} \end{vmatrix} \]

\[ = (a_{31} + ka_{11})C_{31} + (a_{32} + ka_{12})C_{32} + (a_{33} + ka_{13})C_{33} \quad \text{(expanding by row 3)} \]

\[ = a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} + k(a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33}) \]

\[ = \det A + k \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \end{vmatrix} = \det A, \]

since the second determinant vanishes by Rule 5.

Note that

\[ a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} = 0; \]

that is, in its general form, the sum of the products of the elements of one row (or column) and the cofactors of the elements of another row (column) is zero. This follows since the left-hand side must arise from a matrix with two identical rows (columns).
Rule 6 is a particularly useful rule for simplifying the elements in a determinant before expansion and evaluation. We illustrate a number of these points in the next example.

Example 8.7. Evaluate

\[ \Delta = \begin{vmatrix} 2 & 99 & -99 \\ 999 & 1000 & 1001 \\ 1000 & 1001 & 998 \end{vmatrix}. \]

Usually we use the rules (particularly 6) either to introduce zeros into the matrix or to reduce the size of elements as far as possible. It is important to list the operations in order to make the sequence of operations intelligible. For this purpose we identify the current rows by $r_1, r_2, \ldots$, and the current columns by $c_1, c_2, \ldots$. Denote the new rows and columns which have been changed by $r_1', r_2', \ldots$ and $c_1', c_2', \ldots$. There are many ways of approaching the evaluation of Δ. A first step in this example could be to add column 3 ($c_3$) to column 2 ($c_2$) since this produces a zero at the top of column 2. This operation is represented by $c_2' = c_2 + c_3$, and we list the operations on the right-hand side as we proceed. The second operation is to subtract the new row 3 from the new row 2. A decision is taken at each step in the light of the new matrix. By Rule 6, these operations do not affect the value of Δ. Hence

\[ \Delta = \begin{vmatrix} 2 & 99 & -99 \\ 999 & 1000 & 1001 \\ 1000 & 1001 & 998 \end{vmatrix} \]

\[ = \begin{vmatrix} 2 & 0 & -99 \\ 999 & 2001 & 1001 \\ 1000 & 1999 & 998 \end{vmatrix} \qquad (c_2' = c_2 + c_3) \]

\[ = \begin{vmatrix} 2 & 0 & -99 \\ -1 & 2 & 3 \\ 1000 & 1999 & 998 \end{vmatrix} \qquad (r_2' = r_2 - r_3) \]

\[ = \begin{vmatrix} 2 & 0 & -99 \\ -2 & 2 & 2 \\ 0.5 & 1999 & -1.5 \end{vmatrix} \qquad (c_1' = c_1 - \tfrac{1}{2}c_2, \; c_3' = c_3 - \tfrac{1}{2}c_2) \]

\[ = \begin{vmatrix} 2 & 0 & -93 \\ -2 & 2 & -4 \\ 0.5 & 1999 & 0 \end{vmatrix} \qquad (c_3' = c_3 + 3c_1) \]

\[ = 2(4 \times 1999) - 93[(-2 \times 1999) - 1] = 387\,899. \]

Note that while $r_2' = r_2 + kr_3$ does not affect the value of the determinant, $r_2' = kr_2 + r_3$ will change its value by a factor k.
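
Rules 3 and 6 together suggest a systematic procedure: use row operations of the form $r_i' = r_i - kr_j$ to produce zeros below the leading diagonal, changing the sign whenever two rows have to be interchanged. The value is then the product of the diagonal elements, as follows by expanding repeatedly by the first column. The sketch below is our own illustration of this idea, not the hand method of Example 8.7; the function name is hypothetical, and `fractions.Fraction` is used so that the arithmetic is exact.

```python
from fractions import Fraction


def det_by_row_operations(A):
    """Determinant via row operations r_i' = r_i - k r_j (Rule 6),
    with a sign change for each interchange of rows (Rule 3)."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # no usable pivot: the determinant is zero
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign                # Rule 3
        for r in range(c + 1, n):
            k = M[r][c] / M[c][c]
            M[r] = [x - k * y for x, y in zip(M[r], M[c])]   # Rule 6
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]               # product of the diagonal elements
    return result


D = [[2, 99, -99],
     [999, 1000, 1001],
     [1000, 1001, 998]]
print(det_by_row_operations(D))   # 387899
```

Unlike cofactor expansion, the amount of arithmetic here grows only like the cube of the order, which is one reason row operations are preferred in numerical work.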

8.3 The adjoint and inverse matrices


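We can now rewrite the formula for the inverse given in Section 7.4, using cofactors. The transposed matrix of cofactors given by

\[ \operatorname{adj} A = \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix} \tag{8.4} \]

is known as the adjoint of A. Hence the inverse matrix of A given by eqn (7.10) becomes

\[ A^{-1} = \frac{\operatorname{adj} A}{\det A}, \]

in terms of the adjoint and determinant of A.

We can alternatively confirm by direct matrix multiplication that adj A/det A is the inverse of A. Thus

\[ A \, \frac{\operatorname{adj} A}{\det A} = \frac{1}{\det A} \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix} \]

\[ = \frac{1}{\det A} \begin{bmatrix} a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} & a_{11}C_{21} + a_{12}C_{22} + a_{13}C_{23} & a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} \\ a_{21}C_{11} + a_{22}C_{12} + a_{23}C_{13} & a_{21}C_{21} + a_{22}C_{22} + a_{23}C_{23} & a_{21}C_{31} + a_{22}C_{32} + a_{23}C_{33} \\ a_{31}C_{11} + a_{32}C_{12} + a_{33}C_{13} & a_{31}C_{21} + a_{32}C_{22} + a_{33}C_{23} & a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} \end{bmatrix} \]

\[ = \frac{1}{\det A} \begin{bmatrix} \det A & 0 & 0 \\ 0 & \det A & 0 \\ 0 & 0 & \det A \end{bmatrix} = I_3, \quad \text{using (8.4).} \]

This confirmation uses the results that the sum of the products of the elements of one row and their cofactors is the value of the determinant whilst the sum of the products of one row and the cofactors of another row is zero.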
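
The construction of the inverse from the adjoint can be sketched in a few lines for matrices with integer entries. The function names below (`minor`, `det`, `adjoint`, `inverse`) are our own hypothetical choices, indices are 0-based, and the sample matrix is an arbitrary nonsingular 3 × 3 example with determinant −4.

```python
from fractions import Fraction


def minor(A, i, j):
    """Submatrix of A with row i and column j deleted (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by expansion along the top row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))


def adjoint(A):
    """Transposed matrix of cofactors: entry (i, j) is the cofactor C_ji."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]


def inverse(A):
    """adj A divided by det A, with exact fractions (integer entries assumed)."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular: det A = 0")
    return [[Fraction(c, d) for c in row] for row in adjoint(A)]


A = [[1, 2, -1],
     [0, 1, -1],
     [1, -1, -2]]
print(det(A))       # -4
print(adjoint(A))   # [[-3, 5, -1], [-1, -1, 1], [-1, 3, 1]]

A_inv = inverse(A)
identity = [[sum(A[i][k] * A_inv[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]
print(identity == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])   # True
```

The transposition is carried out in `adjoint` by taking the minor at position (j, i) when filling entry (i, j).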

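Example 8.8. Find the inverse of

\[ A = \begin{bmatrix} 1 & 2 & -1 \\ 0 & 1 & -1 \\ 1 & -1 & -2 \end{bmatrix}. \]

We evaluate det A first. Thus

\[ \det A = 1 \times (-2 - 1) - 2 \times (0 + 1) - 1 \times (0 - 1) = -4. \]

The cofactors are

\[ C_{11} = -3, \quad C_{12} = -1, \quad C_{13} = -1, \]

\[ C_{21} = 5, \quad C_{22} = -1, \quad C_{23} = 3, \]

\[ C_{31} = -1, \quad C_{32} = 1, \quad C_{33} = 1. \]

Hence

\[ A^{-1} = -\frac{1}{4} \begin{bmatrix} -3 & 5 & -1 \\ -1 & -1 & 1 \\ -1 & 3 & 1 \end{bmatrix}. \]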

The definition of the adjoint generalizes to matrices of higher


order. However, the adjoint of a 4 x 4 contains sixteen 3x3
determinants, which is about the limit of hand calculations unless
the determinant is sparsely filled with nonzero elements or can be
reduced to such a determinant. Such computations become a fertile
source of errors. There are computer packages available which will
quickly perform the arithmetic operations for determinants of
reasonable size.
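Determinant, adjoint, and inverse for 3 × 3 matrices (8.5)

(a) (determinant of A)

\[ \det A = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}; \]

(b) (adjoint of A)

\[ \operatorname{adj} A = \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix}; \]

(c) (inverse of A)

\[ A^{-1} = \frac{\operatorname{adj} A}{\det A}. \]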

Problems

8.1. Evaluate the following determinants. 2 10 0 0

1 0 1 12 10 0
1 2
0 1 0 (f) 0 12 10
(a) (b)
1 3
1 0 1
0 0 12 1

0 0 0 12
1-1 2

(c) 3 1-1 8.2. Without evaluating the following determinants,


explain why they are all zero:
2 1-1

2 3 4 -l 2 3
2 1 0 -1
(a) 4 6 8 ; (b) 3 1 -2
0 0 2 0
(d) •> 1 -1 2 -2 -3 -1
3 -1 2 1

0 1 -1 1 a b C

(c)
0 10 0 0
a—b b—c c—a
10 0 0 0

(e) 0 0 0 0 1 1 1 1

0 0 10 0 (d) 3 0 0

0 0 0 10 5 0 0

8.3. Given that the equation


X y 1
a b c
hi 1 = 0
A = b c a

a2 b2 1
c a b
represents the equation of the straight line
what is the value of through the points (a,, br) and (a2, b2) in the (x, y) plane.
a3 ab ac2
If Jfj and X2 are the cofactors of x and y, what
is the slope of this line in terms of the cofactors? Using
ab c ac this method, find the equation of the straight line
through the points:
ac a be
(a) (1,-1) and (2, 3); (b) (- 1, 0) and (4. - 1).
in terms of A?
8.8. Find the value of a which makes the determinant
8.4. Simplify first and then evaluate the following deter¬ 1 1-1
minants:
I a 2
99 100
II 2

(a) 98 102
equal to zero.
-1 2
8.9. Explain why
77 84
x 2 -2
(b) 75 87
2 x 3 =0
1 -2
X — 1 X

2 -1 1
will be at most a cubic equation in x, but that
(c) 99 98 55 1 1 2
200 197 3 x 2 = 0

87 84 83 81 X 1 X

will be at most a quadratic equation in x. Solve both


77 76 77 75
(d) equations, and find all roots including any complex ones.
54 53 52 54
8.10. Show that
-43 -44 -46 -4
nn+t>n <^i2 + ^12 ai3 + bi3

8.5. Explain why the determinant

1 i 1 a3 1 a32 a 33

A = a b c 5 a[2 «13 t>n bi2 ^13

a2 b2 c2 = «2 1 a22 a2i + a2i a22 «23

has factors b — c, c - a. and a — b. Express the value of a3i °12 «33 a3i a32 <*33
A as the product of factors.
8.11. The determinant

8.6. Factorize the determinant 1 1 + i a\2 + b12 al3 + ^13

1 1 1 21 + b2i a22 + b22 a23 + b23

A = a b c 31 + b3i a32 T b32 «33 + b33


is required as the sum of determinants each of which has
a3 b3 c3
just as or bs in columns. How many determinants are
there in the sum? If the determinant is n x n, how many
8.7. Explain, using one of the rules for determinants, why determinants would there be in the sum?

8.12. Show that Show also that


1 a, - h\ «i + 1 i> 1 det A2 = (det A)2.
1 U 2 — b2 a2 + b2 = 2 1 a2 b2 (b) Write down AT and find its determinant det AT.
Confirm that
1 - hi a3 + b3 1 «3 h3
det At = det A.
8.13. Let Dn be the n x n tridiagonal determinant defined
(c) Find A~l and detT-1, assuming that det A ^ 0.
by
Confirm that
i 1 0 0
det A~l = 1/det A.
1 2 1 0
(d) Show that
D„ = 1 2 det adj A = det A.
1 (These formulae for 2 x 2 matrices suggest general¬
izations for n x n matrices. Thus, for two n x n matrices
0 0 2 A and 6,
Show that
det AB = det A det B,
Dn = 2Dn_x -D„_2.
det A" = (det A)n,
If <2„ = Dn — D„_,, deduce that
det At = det A,
Qn = Qn-l ='-=e3= I-
det A ~1 = 1/det A,
Show that Dn = n + 1.
det adj A = (det A)n~1.
8.14. Find all values of x for which
We shall not attempt to prove these formulae here.)
x a b c
8.16. If
a x b c
2 -l" ’ 1 2 -l"
a b x c
A = 0 1 2 , B = 0 3 1
a b c x
is zero. . 1 3 -1- .2 l 3_
calculate det A, det B, det AB, AJ, det AT, adj A,
8.15. Let A and B be the two 2x2 matrices det adj A, A-1 and detT-1. Confirm the results con¬
an Ol2 bn jectured at the end of the previous problem.
A = , B =
_«21 @22— .b 2 i b 22—
8.17. The elements in a 3x3 matrix A - [au] are
Write down det A and det B.
given by the formula
(a) Find the product AB and its determinant det AB.
Confirm that a.-ji = «/' + (- 1)'2J (i, j = 1, 2, 3).

det AB = det A det B. Show that det A = 0 for all real a.


9 Elementary operations with vectors

Contents
9.1 Displacement along an axis
9.2 Displacement vectors in two dimensions
9.3 Axes in three dimensions
9.4 Vectors in two and three dimensions
9.5 Relative velocity
9.6 Position vectors and vector equations
9.7 Unit vectors and basis vectors
9.8 Tangent vector, velocity, and acceleration
Problems

9.1 Displacement along an axis


Figure 9.1 shows an x axis with origin at O and a scale indicated.
The positive direction for x is from left to right. Two points are
indicated, P at x = xP = -2 and Q at x = xQ = 1.5. The distance
between two points is always expressed as a positive number, so in
this case

distance from P to Q, or from Q to P = PQ or QP = 2 + 1.5 = 3.5 units.

[Fig. 9.1: the x axis, with P at x = -2, the origin O, and Q at x = 1.5.]

If we are told that xP = -2, and that the distance between P and


Q is 3.5 units, this does not tell us where Q is: xQ might be either
1.5 or —5.5. We need a way to express, as a single piece of
information, both the distance PQ and whether Q lies to the right
or left of P.
This is done by attaching a plus or minus sign to the distance.
We use plus if Q as viewed from P is in the positive direction of
the x axis (to the right in this case), and minus if Q is in the
negative direction (to the left in this case). This quantity is called
the displacement of Q relative to P, or the displacement of Q
from P, and is defined in terms of xQ and xP by

displacement of Q from P = xQ — xP.


In this case the displacement of Q from P is equal to 1.5 - (-2) = 3.5.
This is positive, showing that Q is to the right of P. By the same
rule,

displacement of P from Q = xP - xQ = (-2) - 1.5 = -3.5.

The minus sign indicates that P is to the left of Q.

Example 9.1. A pedestrian wanders up and down the high street, which extends
east and west. Starting at the bus stop, she strolls 80 m east, 25 m west, 50 m east,
then races 100 m west, at which point the returning bus drives off. Where was she,
relative to the bus stop, at this time?

[Fig. 9.2: the four stages of the walk (80 m east, 25 m west, 50 m east, 100 m west), ending 5 m from the bus stop.]

There is no difficulty about this question: Fig. 9.2 shows that she ends up east
of the bus stop with 5 more metres to go. Notice how natural it is to count one
direction as positive and the other as negative. We shall formalize this,
because we can get useful illustrations about handling displacements from this
problem.

[Fig. 9.3: an east-pointing x axis (in 10 m steps) with origin O and the points B, F, D, C, E marked in that order from west to east.]

In Fig. 9.3 we have drawn an east-pointing axis x. The origin is at O (it will
make no difference where it is) and the bus stop is at B. The direction changes
at C, D, E, and the end-point is F.
We want to find the displacement of F from B. This is defined by xF — xB.
Write it in the form:
xF- xB = (xF - xE) + (xE - xD) + (xD - xc) + (xc - xB)

which is identically true because xE, xD, and xc cancel out. The quantities in
the brackets are relative displacements: for example, (xD — xc) represents the
displacement of D relative to C.
The data of the problem consists of these displacements; all we have to do
is to get the signs right. For example, since we chose the positive direction to
be east, and the movement from C to D is west, the displacement of D from C
is —25. By substituting all the information we obtain
xF - xB = 80 + (-25) + 50 + (-100) = 5.

Since this is positive, she ends up 5 m east of the bus stop.



We did not need to know the actual coordinates of any of the points B
to F; the position of the origin O makes no difference to relative displacements.
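
The bookkeeping in Example 9.1 can be sketched in a few lines of Python. This is only an illustration under the example's sign convention (east positive, west negative); the function names `net_displacement` and `positions` are not from the text. It also shows that moving the origin, or reordering the stages, leaves the final displacement unchanged.

```python
def net_displacement(legs):
    """Sum of signed displacements along the axis (east positive)."""
    return sum(legs)

def positions(start, legs):
    """Coordinates visited, beginning at x = start."""
    points = [start]
    for d in legs:
        points.append(points[-1] + d)
    return points

# Stages of the walk in metres: B to C, C to D, D to E, E to F.
legs = [80, -25, 50, -100]

print(net_displacement(legs))          # 5
print(positions(0, legs))              # [0, 80, 55, 105, 5]

# Moving the origin changes every coordinate but not xF - xB.
for x_B in (0, 30, -120):
    pts = positions(x_B, legs)
    print(pts[-1] - pts[0])            # 5 each time

# Reordering the stages does not change the result either.
print(net_displacement(sorted(legs)))  # 5
```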

Relative displacement along a line


Definition: Given an axis Ox, and points P and Q,

Displacement of Q from P = xQ — xP

= —(displacement of P from Q).

(a) The value of xQ - xP is unaffected by changing


the origin of x. (9.1)
(b) Addition of displacements:

XD - XA = (XD - Xc) + (XC - XB) + (XB - XA),

identically (there may be any number of intermediate


points).
(c) The order in which the displacements take place
does not affect the final displacement.

9.2 Displacement vectors in two dimensions


We shall extend the idea of displacement into two dimensions.
Suppose that a ferry boat stationed at a port A is instructed to
proceed in three stages to a destination D. The instructions are:

(a) Go 50 km east;
(b) continue 20 km north;
(c) continue 20 km north west to D.

The navigator can plot the route as in Fig. 9.4. The axes point
east and north for convenience, and the origin O may be anywhere
we please. The three stages are drawn to scale, with their directions
indicated by the arrowheads.
Each instruction prescribes a displacement in two dimensions
relative to the initial point of each stage. A notation for the
displacements is

$$\overline{AB},\quad \overline{BC},\quad \overline{CD}.$$

The bar emphasizes that they have particular directions (e.g., A to


B). The arrows are called displacement vectors.
Any displacement vector can be described by a pair of numbers,
called its components. The two components are the displacements
in the x and y directions which take place during the two-dimensional
displacement. For example, Fig. 9.5 shows the displacement vector
corresponding to the instruction:

‘Proceed 1 km southeast from P’.

[Fig. 9.5: the displacement vector from P to Q.]

The same point Q is arrived at by saying:

‘Proceed to a point 1/√2 km east and (-1/√2) km north of P’.

The numbers 1/√2 and (-1/√2) are the components of the
displacement vector PQ.
In general, if PQ represents a displacement vector, and P and Q
have coordinates (xP, yP) and (xQ, yQ), then

component of PQ in the x direction = xQ — xP,

component of PQ in the y direction = yQ — yp.

We may then write PQ in component form:

$$\overline{PQ} = (x_Q - x_P,\ y_Q - y_P).$$

Suppose we have a chain of successive displacements. In the


ferry boat problem the chain consists of AB, BC, CD. The final
displacement, of D relative to A, is AD. In component form

AD = (xD - xA,yD - yA).

The final components can be broken up into the successive stages:

$$x_D - x_A = (x_D - x_C) + (x_C - x_B) + (x_B - x_A),$$

$$y_D - y_A = (y_D - y_C) + (y_C - y_B) + (y_B - y_A).$$

Since CD = (xD — xc, yD — yc), and so on, it is reasonable to write

AD = CD + BC + AB.

The components are ordinary numbers, so we can change the


order in which displacements are added, and write instead

AD = AB + BC + CD,
or reorder them in any other way: the boat will still arrive at the
same point. This is true for any number of displacements.

Example 9.2. Figure 9.6 shows the track of the ferry boat again. (a) Find the
displacement vector AD in component form. (b) Express AD in terms of its length
AD and the angle θ it makes with the positive x axis.

[Fig. 9.6: the track A, B, C, D of the ferry boat, showing the x displacement, the y displacement, and the angle θ.]

(a) In component form,

$$\overline{AB} = (50, 0),\quad \overline{BC} = (0, 20),\quad \overline{CD} = (-20/\sqrt{2},\ 20/\sqrt{2})$$

(in km units), and suppose that

$$\overline{AD} = (X, Y).$$

Then, adding the individual x and y components,

$$X = 50 + 0 - 20/\sqrt{2} = 50 - 20/\sqrt{2},$$

and

$$Y = 0 + 20 + 20/\sqrt{2} = 20 + 20/\sqrt{2}.$$

(b) From Fig. 9.6,

$$\text{length } AD = \sqrt{X^2 + Y^2} = \left[(50 - 20/\sqrt{2})^2 + (20 + 20/\sqrt{2})^2\right]^{1/2} = 49.5 \text{ (km)},$$

$$\theta = \arctan\frac{Y}{X} = \arctan 0.952 = 43.6^\circ.$$
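
The arithmetic of Example 9.2 can be reproduced with a short Python sketch. The function names `resultant` and `polar` are illustrative only; the stage components are those given in part (a), and `math.atan2` is used so that the angle lands in the correct quadrant for any signs of X and Y.

```python
import math

def resultant(stages):
    """Add displacement vectors given as (x, y) component pairs."""
    x = sum(dx for dx, _ in stages)
    y = sum(dy for _, dy in stages)
    return x, y

def polar(x, y):
    """Length and angle (degrees, measured from the positive x axis)."""
    return math.hypot(x, y), math.degrees(math.atan2(y, x))

s = 20 / math.sqrt(2)                          # each component of the 20 km north-west stage
stages = [(50.0, 0.0), (0.0, 20.0), (-s, s)]   # AB, BC, CD in km

X, Y = resultant(stages)
length, theta = polar(X, Y)
print(round(length, 1), round(theta, 1))       # 49.5 43.6
```

Shuffling the list `stages` gives the same X and Y, which is the reordering property stated in Section 9.2.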

9.3 Axes in three dimensions


From now on we shall consider both two and three-dimensional
situations. To locate points in a plane, two axes are needed. For
three-dimensional space, introduce a third axis Oz perpendicular to
the other two and drawn through the origin O in the direction shown
in Fig. 9.7. These axes are indicated briefly by Oxyz. The position
of any point P is then specified by a triplet of coordinates, (x, y, z),
determined by reference to the three axes Ox, Oy, Oz. For the point
P in Fig. 9.7, x = 2, y = 3, and z = 1, and we indicate P by writing
P: (2, 3, 1).
There was a choice of two possible directions for Oz, as shown
in Fig. 9.8. These two sets of axes cannot be superposed no matter
how we turn them about: they are mirror images of each other, like
a right shoe and a left shoe. The axes shown in Fig. 9.8a are called
right-handed axes (left-handed axes, Fig. 9.8b, are seldom used).

9.4 Vectors in two and three dimensions


A displacement vector is a case of a physical quantity which has a
magnitude and a direction, and which follows a certain set of rules
similar to those in ordinary algebra. Velocity, acceleration, and force
are other examples. Such a quantity is called a vector quantity,
and can be depicted in terms of directed line segments similar to
those we used in Section 9.2 for displacements. The rules which
follow apply to directed line segments. We shall illustrate later how
the rules also apply to other vectors, such as forces.
1. Components and magnitude. Figure 9.9 shows a vector placed in
a set of axes. Its initial point is P: (xP, yP, zP) and its end-point
is Q: (xQ, yQ, zQ). We denote it either by $\overline{PQ}$, where the bar stresses
the direction, P to Q; or (more often) by a single letter, say

$\mathbf{a}$ (in heavy print) or $\underline{a}$ (underlined when handwritten).

The components of a in the x, y, and z directions respectively
are a1, a2, and a3, where

$$a_1 = x_Q - x_P,\quad a_2 = y_Q - y_P,\quad a_3 = z_Q - z_P. \qquad (9.2)$$

We write

$$\overline{PQ},\ \text{or}\ \mathbf{a} = (a_1, a_2, a_3). \qquad (9.3)$$

[Fig. 9.9: A typical vector PQ, or a.]

The length or magnitude of a is denoted by PQ (no bar) or
QP, or by $|\mathbf{a}|$ or $|\overline{PQ}|$. By Pythagoras’s theorem the length PQ is
given by

$$\sqrt{(x_Q - x_P)^2 + (y_Q - y_P)^2 + (z_Q - z_P)^2},$$

so

$$PQ\ \text{or}\ |\mathbf{a}|\ \text{or}\ |\overline{PQ}| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \qquad (9.4)$$

This is never negative, and it is zero only when all three components are zero.

2. Equality of two vectors. We say that a = b if their components
are equal:

$$a_1 = b_1,\quad a_2 = b_2,\quad a_3 = b_3.$$

This is equivalent to saying that a = b if they have the same
magnitude and direction. Instead of saying ‘in the same direction’,
we may say ‘parallel and with the same sense’.
The vectors shown in Fig. 9.10a are all called equal although they
are in different places. Figure 9.10b shows four vectors in the form
of a parallelogram, but only two letters, a and b, are needed to label
it. Figure 9.10c shows two vectors which are parallel but have
opposite senses, or directions.

[Fig. 9.10: (a) and (b) illustrate equality of vectors. (c) The vectors a and -a have opposite senses.]

3. Multiplication by a positive or negative number. If k is a real
number, then

$$k\mathbf{a} = (ka_1, ka_2, ka_3). \qquad (9.5)$$

Therefore ka is |k| times as long as a. If k is positive then ka is in
the same direction as a, and if k is negative it is in the direction
opposite to a.
The vector (-a) means the same as (-1)a:

$$-\mathbf{a} = (-a_1, -a_2, -a_3), \qquad (9.6)$$

which has the same length as a and the opposite direction.

4. Addition and subtraction

$$\mathbf{a} + \mathbf{b} = (a_1, a_2, a_3) + (b_1, b_2, b_3) = (a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3), \qquad (9.7)$$

so the sum of two vectors is obtained by adding the corresponding
components (and similarly for any number of vectors).
This is equivalent geometrically to the triangle rule, illustrated
in Fig. 9.11a. Choose any point A as the starting point, then draw
a followed by b, as if they were successive displacements from A.
The definition says that

$$\mathbf{a} + \mathbf{b} = \overline{AB} + \overline{BC} = \overline{AC}, \qquad (9.8)$$

where AC is the third side of the triangle.

Sometimes the parallelogram rule, illustrated in Fig. 9.12a, is
more convenient. In this case, draw the vectors a and b out from
the same point P, then complete the parallelogram PACB. The
diagonal vector PC is equal to a + b, because

$$\overline{AC} = \overline{PB} = \mathbf{b}$$

and so by the triangle rule applied to PAC

$$\mathbf{a} + \mathbf{b} = \overline{PA} + \overline{AC} = \overline{PC}.$$

For subtraction,

$$\mathbf{a} - \mathbf{b} = \mathbf{a} + (-\mathbf{b}), \qquad (9.9)$$

which is illustrated in Fig. 9.11b using the triangle rule and in Fig.
9.12b using the parallelogram rule.
Also,

$$\mathbf{a} - \mathbf{a} = (a_1, a_2, a_3) + (-a_1, -a_2, -a_3) = (0, 0, 0).$$

This is the zero vector, denoted by $\mathbf{0}$ (or $\underline{0}$, if handwritten).

5. Brackets and rearrangement of sums of vectors. Addition involves


only the addition of the x, y, and z components separately. Since
the components are ordinary numbers, we may change the order
in which they are added; for two vectors we have

a + b = b + a. (9.10)
We can also use brackets in the usual way:

a + (b + c) = (a + b) + c. (9.11)

These are like the rules of ordinary algebra. A more complicated


example is

(a + b) — (c + d) = (b — c) — (d — a).

6. Vectors in two dimensions. All of the foregoing definitions and
properties apply equally to vectors in two dimensions. All that is
necessary is to delete the z component. Thus, in two dimensions, if
a = (a1, a2) and b = (b1, b2), then

$$\mathbf{a} + \mathbf{b} = (a_1 + b_1,\ a_2 + b_2).$$
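
Rules 3 to 6 translate directly into component-wise operations on tuples. The sketch below is illustrative only: the function names and the four sample vectors are assumptions chosen for the check, not taken from the text. Integer components are used so that the comparisons are exact.

```python
import math

def add(a, b):
    """Component-wise sum, as in (9.7); works in two or three dimensions."""
    return tuple(x + y for x, y in zip(a, b))

def scale(k, a):
    """Multiplication by a number, as in (9.5)."""
    return tuple(k * x for x in a)

def sub(a, b):
    """a - b defined as a + (-b), as in (9.9)."""
    return add(a, scale(-1, b))

def magnitude(a):
    """Length of a, as in (9.4)."""
    return math.sqrt(sum(x * x for x in a))

# Hypothetical sample vectors.
a, b, c, d = (1, 2, 3), (4, -1, 0), (2, 2, 2), (0, 5, -1)

print(add(a, b) == add(b, a))                                   # True, rule (9.10)
print(add(a, add(b, c)) == add(add(a, b), c))                   # True, rule (9.11)
print(sub(add(a, b), add(c, d)) == sub(sub(b, c), sub(d, a)))   # True
print(sub(a, a))                                                # (0, 0, 0)
print(add((1, 2), (3, 4)))                                      # (4, 6)
```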

Example 9.3. Find |a - b| when a = (a1, a2) and b = (b1, b2).

$$\mathbf{a} - \mathbf{b} = (a_1 - b_1,\ a_2 - b_2).$$

$$|\mathbf{a} - \mathbf{b}| = \text{magnitude of } \mathbf{a} - \mathbf{b} = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}.$$

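Example 9.4. M is the midpoint of the side AB of the triangle PAB (Fig. 9.13).
Put PA = a and PB = b. (a) Express the vector PM in terms of a and b. (b)
Deduce that the diagonals of a parallelogram bisect each other. (You can think
of this in two dimensions, but it applies equally in three.)

[Fig. 9.13: M is the midpoint of AB.]

(a)

$$\overline{PM} = \overline{PB} + \overline{BM} \quad \text{(triangle rule)}$$

$$= \overline{PB} + \tfrac{1}{2}\overline{BA} \quad \text{(M is the midpoint)}$$

$$= \mathbf{b} + \tfrac{1}{2}\overline{BA}. \qquad \text{(i)}$$

Also

$$\overline{BA} = \overline{BP} + \overline{PA} \quad \text{(triangle rule; note the direction of BP)}$$

$$= -\overline{PB} + \overline{PA} \quad \text{(see (9.6))}$$

$$= -\mathbf{b} + \mathbf{a}. \qquad \text{(ii)}$$

Substitute for BA in (i):

$$\overline{PM} = \mathbf{b} + \tfrac{1}{2}(-\mathbf{b} + \mathbf{a}) = \tfrac{1}{2}(\mathbf{a} + \mathbf{b}) \quad \text{(after rearrangement).}$$

(b) In Fig. 9.14 we have added a fourth vertex D to form a parallelogram. The
point N is the midpoint of PD. Then

$$\overline{PN} = \tfrac{1}{2}\overline{PD} = \tfrac{1}{2}(\mathbf{a} + \mathbf{b}) \quad \text{(parallelogram rule).}$$

Therefore, from the result in (a),

$$\overline{PN} = \overline{PM},$$

so the mid-points of PD and BA coincide.

[Fig. 9.14: N is the midpoint of PD.]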

Example 9.5. (In two dimensions.) In the (x, y)-plane, a and b are two vectors
which are not parallel, and c is another vector. (a) Prove that c = λa + μb, where
λ and μ are constants. (b) Find λ and μ when a = (1, 1), b = (2, 0), and c = (3, 4).
(a) Take any point Q. Draw a, b, and c radiating from it, and then complete
the parallelogram QBCA, as in Fig. 9.15. Then

$$\mathbf{c} = \overline{QC} = \overline{QA} + \overline{QB} \quad \text{(parallelogram rule).}$$

But QA and QB point respectively in the directions of a and b, so they are equal
to certain (unique) multiples of a and b:

$$\overline{QA} = \lambda\mathbf{a} \quad \text{and} \quad \overline{QB} = \mu\mathbf{b},$$

say. Therefore

$$\mathbf{c} = \lambda\mathbf{a} + \mu\mathbf{b}.$$

[Fig. 9.15: Vectors in a plane: c = λa + μb.]

(b) a = (1, 1), b = (2, 0), and c = (3, 4), so from (a)

$$(3, 4) = \lambda(1, 1) + \mu(2, 0).$$

The individual components on the two sides must match, so

$$3 = \lambda + 2\mu \quad \text{and} \quad 4 = \lambda + 0.$$

The solution is λ = 4 and μ = -1/2, so

$$\mathbf{c} = 4\mathbf{a} - \tfrac{1}{2}\mathbf{b}.$$
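
Matching components in c = λa + μb gives two linear equations in λ and μ, which can be solved with 2 x 2 determinants. The Python sketch below is an illustration only; the function name `decompose` and the choice of Cramer's rule are assumptions, not the text's method. It is applied to the vectors stated in part (b).

```python
from fractions import Fraction

def decompose(a, b, c):
    """Return (lam, mu) with c = lam*a + mu*b for plane vectors a, b, c.

    The two component equations
        a[0]*lam + b[0]*mu = c[0]
        a[1]*lam + b[1]*mu = c[1]
    are solved by Cramer's rule using 2x2 determinants.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b are parallel (or zero): no unique solution")
    lam = Fraction(c[0] * b[1] - c[1] * b[0], det)
    mu = Fraction(a[0] * c[1] - a[1] * c[0], det)
    return lam, mu

a, b, c = (1, 1), (2, 0), (3, 4)
lam, mu = decompose(a, b, c)
print(lam, mu)   # 4 -1/2

# Check by rebuilding c from the two multiples.
print(tuple(lam * x + mu * y for x, y in zip(a, b)) == c)   # True
```

The determinant test makes the "not parallel" condition of the example explicit: it is exactly the case in which λ and μ are unique.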

The result in Example 9.5 (a) is important, and it extends to three


dimensions as follows:

Relation between three coplanar vectors

a and b are two non-parallel, nonzero vectors with the


same initial point Q, and c is any other vector at Q,
in the same plane as a and b. Then

$$\mathbf{c} = \lambda\mathbf{a} + \mu\mathbf{b}, \qquad (9.12)$$

where λ and μ are certain (unique) constants.

Figure 9.16 shows the three vectors a, b, c in their common plane.


We can use the same argument as in the previous example. (It is
not actually necessary for the vectors a, b, and c to be in the same
plane and emerge from the same point to start with: it is sufficient
for them merely to be parallel to the same plane, so that we can
translate them to the positions in Fig. 9.16.) Then the argument in
Example 9.5 follows.

9.5 Relative velocity


In this Section we shall assume that all the velocities are constant.
Velocity has magnitude and direction, so we can depict it by a
directed line segment whose length is proportional to the speed
(always a positive number), and which points in the right direction.
But to decide whether velocity can be treated as a vector (i.e.,
whether it obeys the rules in Section 9.4) we need to say what
addition of velocities is to mean physically.
Typically, addition of velocities is concerned with combining
relative velocities. For example, if an escalator is moving at 0.5 m s⁻¹
relative to the wall, and a passenger is walking up at 1 m s⁻¹ relative
to the escalator, then the actual velocity of the passenger is
1 + 0.5 = 1.5 m s⁻¹ relative to the wall.
Since relative velocities are relative displacements per unit time,
velocity vectors obey the same rules as displacement vectors. Take
a set of axes Ox, Oy, Oz which are to be regarded as fixed axes.
They might be fixed relative to the earth’s surface, or relative to the

directions of distant stars. Let

vP = velocity of a point P relative to the fixed axes,

vQ = velocity of a point Q relative to the fixed axes,

and vQP = velocity of Q relative to P.

Then the velocity vQP of a point Q as observed from P, in terms of
the velocities vQ and vP observed from the fixed axes, is given by

velocity of Q relative to P = velocity of Q - velocity of P,

or

$$\mathbf{v}_{QP} = \mathbf{v}_Q - \mathbf{v}_P. \qquad (9.13)$$

Example 9.6. (Figure 9.17) A river of width 0.2 km flows with uniform speed
3 km hr⁻¹ from west to east. A boat sets off from a point S on the south bank,
wishing to land at a point N on the north bank directly opposite S. It can travel
at a speed of 5 km hr⁻¹ relative to the water. In what direction should it point
in order to arrive at N by a straight line route? How long does it take?

[Fig. 9.17: the river, with S on the south bank and N directly opposite on the north bank.]

The true path of the boat (i.e., as seen from the bank, or relative to fixed axes)
is not along the direction it is pointing, because it is also being carried
downstream. However, viewed from axes which travel along with the water, it
does go in the direction it is pointing, at an apparent speed of 5 km hr⁻¹. To
visualize this, imagine there is a dense fog, so that the banks cannot be seen and
the pilot is not aware of the current.
With B denoting ‘boat’ and W denoting ‘water’, put

vB = velocity of B relative to fixed axes (direction north,
magnitude, or speed, unknown);

vBW = velocity of B relative to the water W (speed 5 km hr⁻¹,
in the unknown direction it is pointing);

vW = velocity of the water W relative to fixed axes
(direction east, speed 3 km hr⁻¹).

We also know from (9.13) that these are connected by

$$\mathbf{v}_{BW} = \mathbf{v}_B - \mathbf{v}_W,$$

or

$$\mathbf{v}_B = \mathbf{v}_{BW} + \mathbf{v}_W.$$

This information gives Fig. 9.18.

(a) From Fig. 9.18, the boat is directed at θ = arcsin(3/5) = 36.9° to the line SN,
pointing upstream (west of north), so that its westward component cancels the current.

(b) Pythagoras’s theorem gives the magnitude of vB:

$$|\mathbf{v}_B| = \sqrt{5^2 - 3^2} = 4 \text{ km hr}^{-1}.$$

Therefore the time taken is 0.2/4 = 0.05 hr = 3 minutes.
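
The triangle of velocities in Example 9.6 can be put into a short Python sketch. The function name `crossing` and its argument names are illustrative; the only physics used is vB = vBW + vW with vB pointing straight across the river.

```python
import math

def crossing(width_km, water_speed, boat_speed):
    """Heading and crossing time for a boat that must travel straight across.

    Returns (angle in degrees between the heading and the straight-across
    line, tilted upstream; speed over the ground in km/hr; time in minutes).
    """
    if boat_speed <= water_speed:
        raise ValueError("the boat cannot hold a straight-across course")
    angle = math.degrees(math.asin(water_speed / boat_speed))
    ground_speed = math.sqrt(boat_speed ** 2 - water_speed ** 2)
    minutes = 60 * width_km / ground_speed
    return angle, ground_speed, minutes

angle, speed, minutes = crossing(0.2, 3.0, 5.0)
print(round(angle, 1), round(speed, 3), round(minutes, 3))   # 36.9 4.0 3.0
```

The guard clause reflects a limit of the method: if the current is at least as fast as the boat, the triangle of velocities cannot be closed.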

9.6 Position vectors and vector equations


In Fig. 9.19 P is the point with coordinates (2, 3, 1). The vector OP,
or r, which has its initial point at the origin of coordinates, O, is
called the position vector of P. The components of r, or OP, are
then equal to the coordinates of P, so

$$\overline{OP} = \mathbf{r} = (2, 3, 1)$$

in this case. Position vectors are often distinguished from ordinary
vectors by using the letter r. Apart from their being attached to the
origin, the rules for position vectors are the same as for ordinary
vectors.

[Fig. 9.19: Position vector, r = OP, of the point P.]
This device enables us to specify, for example, the point at which
a force acts, without mixing up vectors with coordinates in the same
calculation. It also allows us to do coordinate geometry in vector
terms, by obtaining vector equations describing curves and sur¬
faces in terms of the position vector r = (x, y, z).

Example 9.7. (Two dimensions.) A circle has radius c, and its centre C at the
point (a, b). (a) Obtain a vector equation for the circle, (b) Deduce the ordinary
cartesian equation.
(a) The circle is shown in Fig. 9.20. P is any point (x, y) on its circumference,
so its position vector in component form is

r = (x, y).

The centre C has position vector rc, where

rc = (a, b).

Also, CP = r - rc. The length of CP must be constant and equal to c, so

$$|\mathbf{r} - \mathbf{r}_c| = c. \qquad \text{(i)}$$

This is the vector equation required.
(b) To turn (i) into x, y form write r and rc in component form:

$$\mathbf{r} - \mathbf{r}_c = (x, y) - (a, b) = (x - a,\ y - b).$$

The length of this vector is given by

$$|\mathbf{r} - \mathbf{r}_c| = \sqrt{(x - a)^2 + (y - b)^2}.$$

Therefore, after squaring both sides in (i), we get

$$(x - a)^2 + (y - b)^2 = c^2,$$

which is the usual form for the equation of a circle. (This is not an efficient way
of obtaining it, of course. We are simply checking that (i) makes sense.)

Example 9.8. Three points, A, B, and C (which do not lie in a straight line), have
position vectors a, b, and c. (a) Obtain a parametric vector equation for the plane
through the points A, B, C. (b) Deduce parametric cartesian (i.e., x, y, z) equations
for the plane in the case where the points are A: (1, 2, 1), B: (2, 2, 0), C: (2, 1, 2).
(c) Deduce the ordinary cartesian equation for this plane by eliminating the
parameters occurring in (b).
(a) Figure 9.21 shows the points A, B, C, and their position vectors. The point
P: (x, y, z) with position vector r is any point in the plane through A, B, and
C. By the triangle rule

$$\overline{BA} = \mathbf{a} - \mathbf{b},\quad \overline{BC} = \mathbf{c} - \mathbf{b},\quad \overline{BP} = \mathbf{r} - \mathbf{b}.$$

By using the result (9.12), which relates any three coplanar vectors, we obtain

$$\overline{BP} = \lambda\overline{BA} + \mu\overline{BC},$$

or

$$\mathbf{r} - \mathbf{b} = \lambda(\mathbf{a} - \mathbf{b}) + \mu(\mathbf{c} - \mathbf{b}), \qquad \text{(i)}$$

where λ, μ are two constants which depend on the position of P. We find every
point r in the plane by letting the parameters λ, μ run through all possible values
between -∞ and +∞, so (i) is a parametric vector equation for the plane
through A, B, C.

[Fig. 9.21: the points A, B, C, P and their position vectors.]

(b) Since r, a, b, c are position vectors, their components are given by the
coordinates of P, A, B, C, so equation (i) becomes

$$(x, y, z) - (2, 2, 0) = \lambda\{(1, 2, 1) - (2, 2, 0)\} + \mu\{(2, 1, 2) - (2, 2, 0)\} = \lambda(-1, 0, 1) + \mu(0, -1, 2).$$

Take the vector (2, 2, 0) over to the right-hand side, and then match the x, y, z
components separately:

$$x = 2 - \lambda,\quad y = 2 - \mu,\quad z = \lambda + 2\mu, \qquad \text{(ii)}$$

where λ and μ may take any values. These are cartesian parametric equations
for the plane.
(c) We obtain an x, y, z equation by eliminating λ and μ from the equations
(ii). From the first two equations we have

$$\lambda = 2 - x \quad \text{and} \quad \mu = 2 - y.$$

Substitute these into the third equation of (ii):

$$z = (2 - x) + 2(2 - y),$$

which is the same as

$$x + 2y + z = 6. \qquad \text{(iii)}$$
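
The result of Example 9.8 can be checked by generating points from the parametric form r = b + λ(a - b) + μ(c - b) and testing them against the cartesian equation. The Python sketch below is illustrative; the function name `plane_point` is an assumption, and integer parameter values are used so the test is exact.

```python
def plane_point(a, b, c, lam, mu):
    """Point r = b + lam*(a - b) + mu*(c - b) of the plane through A, B, C."""
    return tuple(bi + lam * (ai - bi) + mu * (ci - bi)
                 for ai, bi, ci in zip(a, b, c))

A, B, C = (1, 2, 1), (2, 2, 0), (2, 1, 2)

# lam = mu = 0 gives B; lam = 1 gives A; mu = 1 gives C.
print(plane_point(A, B, C, 0, 0))   # (2, 2, 0)
print(plane_point(A, B, C, 1, 0))   # (1, 2, 1)
print(plane_point(A, B, C, 0, 1))   # (2, 1, 2)

# Every generated point should satisfy x + 2y + z = 6.
ok = all(
    x + 2 * y + z == 6
    for lam in range(-3, 4)
    for mu in range(-3, 4)
    for (x, y, z) in [plane_point(A, B, C, lam, mu)]
)
print(ok)   # True
```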

If A, B, and C do not lie on a straight line, the equation of the plane


through them will always be like Example 9.8 (iii):

Equation of a plane

The general equation of a plane is

ax + by + cz = d, (9.14)

where a, b, c, d, are constants.



Example 9.9. (Three dimensions) Two points, A and B, have position vectors a
and b. (a) Obtain a parametric vector equation for the straight line joining A and B.
(b) Deduce parametric cartesian (i.e., x, y, z) equations for the case where the points
are A: (2, 2, -1) and B: (0, 1, -2). (c) By eliminating the parameter between the
equations in (b), find cartesian equations for this line.
(a) Figure 9.22 shows the points A and B and their position vectors a and b.
The point P: (x, y, z) with position vector r represents any point on the line
joining AB. Also,

$$\overline{AB} = \mathbf{b} - \mathbf{a}, \quad \text{and} \quad \overline{AP} = \mathbf{r} - \mathbf{a}.$$

AP is some multiple, λ say, of AB:

$$\overline{AP} = \lambda\overline{AB},$$

or

$$\mathbf{r} - \mathbf{a} = \lambda(\mathbf{b} - \mathbf{a}).$$

Therefore

$$\mathbf{r} = (1 - \lambda)\mathbf{a} + \lambda\mathbf{b}. \qquad \text{(i)}$$

This is the required parametric vector equation, with λ as the parameter. As λ
increases from -∞ to +∞, P traces out the straight line passing through A
and B.
(b) Since r, a, and b are position vectors, their components are the same as
the coordinates of P, A, and B:

$$\mathbf{r} = (x, y, z),\quad \mathbf{a} = (2, 2, -1),\quad \mathbf{b} = (0, 1, -2).$$

Substitute these into (i):

$$(x, y, z) = (1 - \lambda)(2, 2, -1) + \lambda(0, 1, -2) = (2 - 2\lambda,\ 2 - \lambda,\ -1 - \lambda).$$

Now match the x, y, z components on both sides:

$$x = 2 - 2\lambda,\quad y = 2 - \lambda,\quad z = -1 - \lambda. \qquad \text{(ii)}$$

These are parametric cartesian equations, in which the parameter ranges from
-∞ to +∞.
(c) In order to get rid of the parameter λ in (ii), write them successively in the
form

$$\lambda = \frac{x - 2}{-2},\quad \lambda = \frac{y - 2}{-1},\quad \lambda = \frac{z + 1}{-1}.$$

Since the three fractions are equal (equal to the current value of λ) we obtain
the relation between x, y, z which holds on the line:

$$\frac{x - 2}{-2} = \frac{y - 2}{-1} = \frac{z + 1}{-1},$$

which simplifies to

$$-\tfrac{1}{2}x + 1 = -y + 2 = -z - 1. \qquad \text{(iii)}$$

The shape of the result (iii) of Example 9.9 might strike you as being
peculiar. It really consists of two simultaneous equations, represent¬
ing two planes which intersect along the required line AB. The
expression cannot be reduced to a single equation. The general case
will be given in Chapter 10.

Example 9.10. Given the straight line

2x — 2 = y + 1 = — 2z, (i)

(a) Find any one point on the line, (b) Find a parametric equation for the line.
(c) Find the coordinates of the point where the line crosses the plane

x — y + z = 0. (ii)

(a) Put, for example, x = 1. Then from (i), 2x - 2 = y + 1, so when x = 1,
y = -1. Also from (i), 2x - 2 = -2z, so z = 0. Therefore, the point (1, -1, 0)
lies on the line. (Other values of x lead to other points.)
(b) Proceeding as in (a), put x = λ, where λ may take any value. Then we
find that

$$y = 2\lambda - 3 \quad \text{and} \quad z = -\lambda + 1.$$

Therefore a set of parametric equations is

$$x = \lambda,\quad y = 2\lambda - 3,\quad z = -\lambda + 1. \qquad \text{(iii)}$$

(c) From (ii) and (iii), at the point where the line meets the plane the value of
λ must be given by

$$0 = x - y + z = \lambda - (2\lambda - 3) + (-\lambda + 1) = 4 - 2\lambda.$$

At this point λ = 2, so from (iii) again, the line meets the plane at

$$x = 2,\quad y = 1,\quad z = -1.$$

(Alternatively, solve the equations (i) and (ii) simultaneously.)

9.7 Unit vectors and basis vectors


A vector of unit magnitude is called a unit vector. For example,
a = (-1/3, 2/3, 2/3) is a unit vector since

$$|\mathbf{a}| = \sqrt{(-\tfrac{1}{3})^2 + (\tfrac{2}{3})^2 + (\tfrac{2}{3})^2} = 1.$$

The vector (1, 0, 0) is a unit vector; it points in the direction of
the x axis, since if it is drawn as a position vector it would join the
origin to the point 1 unit along the x axis. Similarly, (0, 1, 0) and
(0, 0, 1) are unit vectors in the y and z directions respectively. These
vectors have the special symbols î, ĵ, and k̂, and are called basis
vectors for the given coordinates:

$$\hat{\imath} = (1, 0, 0),\quad \hat{\jmath} = (0, 1, 0),\quad \hat{k} = (0, 0, 1). \qquad (9.14)$$

(They are sometimes spoken of as ‘i-hat’, and so on.) Figure 9.23
shows them as position vectors.
Any vector can be expressed in terms of î, ĵ, and k̂. Suppose that
a = (a1, a2, a3) in component form. Then

$$\mathbf{a} = (a_1, 0, 0) + (0, a_2, 0) + (0, 0, a_3)$$

$$= a_1(1, 0, 0) + a_2(0, 1, 0) + a_3(0, 0, 1)$$

$$= a_1\hat{\imath} + a_2\hat{\jmath} + a_3\hat{k}. \qquad (9.15)$$

[Fig. 9.23: Basis vectors î, ĵ, k̂.]

The components become the coefficients of î, ĵ, and k̂.

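Example 9.11. Let a = 2î + 3ĵ - k̂ and b = î - 3k̂. Express the vector x in
the equation 3a + 2x = b in terms of î, ĵ, k̂.
In the usual way, we find that

$$\mathbf{x} = \tfrac{1}{2}(\mathbf{b} - 3\mathbf{a}) = \tfrac{1}{2}\mathbf{b} - \tfrac{3}{2}\mathbf{a}$$

$$= \tfrac{1}{2}(\hat{\imath} - 3\hat{k}) - \tfrac{3}{2}(2\hat{\imath} + 3\hat{\jmath} - \hat{k})$$

$$= -\tfrac{5}{2}\hat{\imath} - \tfrac{9}{2}\hat{\jmath}.$$

The components of x are therefore (-5/2, -9/2, 0).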

If a is any vector, then the vector â (called ‘a-hat’),

$$\hat{\mathbf{a}} = \mathbf{a}/|\mathbf{a}|,$$

obtained by dividing a by its own length (or magnitude), is a unit
vector in the direction of a (we can say ‘the direction of a is â’).

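Example 9.12. Obtain the unit vector pointing in the direction of the force
F = 2î - 3ĵ - 6k̂.

$$|\mathbf{F}| = \sqrt{2^2 + (-3)^2 + (-6)^2} = \sqrt{49} = 7.$$

Therefore, the unit vector pointing in the same direction is

$$\hat{\mathbf{F}} = (2\hat{\imath} - 3\hat{\jmath} - 6\hat{k})/7 = \tfrac{2}{7}\hat{\imath} - \tfrac{3}{7}\hat{\jmath} - \tfrac{6}{7}\hat{k},$$

or, in component form,

$$\hat{\mathbf{F}} = (\tfrac{2}{7},\ -\tfrac{3}{7},\ -\tfrac{6}{7}).$$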
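
Normalizing a vector is a one-line computation once the magnitude is known. The Python sketch below (the function names `magnitude` and `unit` are illustrative) repeats Example 9.12 and confirms that the result has unit length. The zero vector is rejected because it has no direction.

```python
import math

def magnitude(v):
    """Length of v, the square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))

def unit(v):
    """Unit vector v / |v| in the direction of v."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / m for x in v)

F = (2, -3, -6)
print(magnitude(F))                       # 7.0
F_hat = unit(F)
print(F_hat)                              # components 2/7, -3/7, -6/7 as decimals
print(math.isclose(magnitude(F_hat), 1))  # True
```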

Unit vectors
A unit vector is a vector of unit magnitude. The unit
vector in the direction of a is denoted by â (a-hat).
(a) If a is any vector, then

$$\hat{\mathbf{a}} = \mathbf{a}/|\mathbf{a}|. \qquad (9.16)$$

(b) The vectors î, ĵ, k̂ (basis vectors) are the unit
vectors in directions Ox, Oy, Oz. If a = (a1, a2, a3)
is any vector, then

$$\mathbf{a} = a_1\hat{\imath} + a_2\hat{\jmath} + a_3\hat{k}.$$

(For two dimensions, use only î and ĵ.)

Example 9.13. Find the point Q where the straight line joining A: (2, 3, 1) and
B: (1, 2, 2) intersects the plane x + y + z = 0.
The position vectors of A and B, in terms of i, j, k, are $\mathbf{a} = 2\mathbf{i} + 3\mathbf{j} + \mathbf{k}$ and
$\mathbf{b} = \mathbf{i} + 2\mathbf{j} + 2\mathbf{k}$ respectively. Let $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ be the position vector of a
general point on the line AB. Then from Example 9.9(a), the parametric equation
of AB is

$\mathbf{r} = (1 - \lambda)\mathbf{a} + \lambda\mathbf{b} = (1 - \lambda)(2\mathbf{i} + 3\mathbf{j} + \mathbf{k}) + \lambda(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}).$

After collecting terms in i, j, and k on the right, this becomes

$x\mathbf{i} + y\mathbf{j} + z\mathbf{k} = (2 - \lambda)\mathbf{i} + (3 - \lambda)\mathbf{j} + (1 + \lambda)\mathbf{k}. \qquad \text{(i)}$

Match the coefficients of i, j, k on either side of (i); then

$x = 2 - \lambda, \quad y = 3 - \lambda, \quad z = 1 + \lambda. \qquad \text{(ii)}$

The intersection point Q is on the plane x + y + z = 0, so

$(2 - \lambda) + (3 - \lambda) + (1 + \lambda) = 0.$

Therefore $\lambda = 6$ at Q. Put this value back into (ii):

$x = -4, \quad y = -3, \quad z = 7.$

The position vector of Q is therefore

$-4\mathbf{i} - 3\mathbf{j} + 7\mathbf{k}.$
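
The steps of Example 9.13 can be sketched in Python as follows. This is an illustration under stated assumptions rather than a method from the text: the helper name `line_plane_intersection` is hypothetical, the plane is written as normal · r = d, and integer (or rational) components are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def line_plane_intersection(a, b, normal, d):
    """Point r = (1 - lam) a + lam b that satisfies normal . r = d.

    Returns (lam, point). Raises if the line is parallel to the plane.
    Assumes integer or Fraction components.
    """
    direction = [bi - ai for ai, bi in zip(a, b)]          # b - a
    denom = sum(n * s for n, s in zip(normal, direction))  # normal . (b - a)
    if denom == 0:
        raise ValueError("line is parallel to the plane")
    lam = Fraction(d - sum(n * ai for n, ai in zip(normal, a)), denom)
    point = tuple(ai + lam * s for ai, s in zip(a, direction))
    return lam, point

# Line through A: (2, 3, 1) and B: (1, 2, 2); plane x + y + z = 0.
lam, Q = line_plane_intersection((2, 3, 1), (1, 2, 2), (1, 1, 1), 0)
print(lam, Q)
```

Writing the line as a + λ(b − a) is the same parametrization as (1 − λ)a + λb, so the value of λ returned matches the one in the example.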

Problems can be worked through with the vectors given either in


component form or in i, j, k form, whichever is convenient.

9.8 Tangent vector, velocity, and acceleration


Suppose that the coordinates of a point P depend on a parameter
t (which might stand for time). Then we can write

$\mathbf{r}(t) = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k}.$

As t runs from the value t = a to t = b, where b > a, P follows a
curve from A to B, as in Fig. 9.24.
Consider two points, P and Q, close together on the curve, where
the parameter values are $t$ and $t + \delta t$ respectively. The corresponding
position vectors are $\mathbf{r}(t)$ and $\mathbf{r}(t + \delta t)$. By the triangle rule,

$\overrightarrow{PQ} = \mathbf{r}(t + \delta t) - \mathbf{r}(t) = \delta\mathbf{r}.$

Now consider the vector T defined by

$\mathbf{T} = \lim_{\delta t \to 0} \{\mathbf{r}(t + \delta t) - \mathbf{r}(t)\}/\delta t = \lim_{\delta t \to 0} \delta\mathbf{r}/\delta t.$

This is like an ordinary derivative, so we denote this vector by

$\mathbf{T} = d\mathbf{r}(t)/dt. \qquad (9.17)$

Notice also that this is equivalent to

$\mathbf{T} = d\mathbf{r}/dt = \mathbf{i}\,dx/dt + \mathbf{j}\,dy/dt + \mathbf{k}\,dz/dt,$

since i, j, k are constant.

As $\delta t$ approaches zero, $\delta\mathbf{r}$, and therefore $\delta\mathbf{r}/\delta t$, become more and
more nearly tangential to the curve. Therefore T is a tangent
vector to the curve at P. To decide which way T points, consider
the case when $\delta t$ is positive. Then $\delta\mathbf{r}$ points in the direction of
increasing t; so the tangent vector T must also point in this
direction.

Derivative of r(t)

$\mathbf{r}(t) = \mathbf{i}x(t) + \mathbf{j}y(t) + \mathbf{k}z(t)$, where t is a parameter,
represents a curve. The vector T given by
$\mathbf{T} = d\mathbf{r}/dt = \mathbf{i}\,dx/dt + \mathbf{j}\,dy/dt + \mathbf{k}\,dz/dt \qquad (9.18)$
is a tangent to the curve, in the direction of increasing t.

If the parameter t stands for time, then $d\mathbf{r}/dt$ is the definition of
the velocity $\mathbf{v}$ of P, and $d\mathbf{v}/dt$ represents its vector acceleration:

Velocity and acceleration vectors

If a point P has position vector $\mathbf{r}(t)$, then
velocity $\mathbf{v}(t) = d\mathbf{r}/dt$, $\qquad (9.19)$
acceleration $\mathbf{a}(t) = d\mathbf{v}/dt$ or $d^2\mathbf{r}/dt^2$.
Also the speed $= |\mathbf{v}(t)|$.

Notice that velocity and acceleration are not generally parallel.

Example 9.14. (Motion in the (x, y)-plane.) The position vector of a point P is
given by

$\mathbf{r}(t) = \mathbf{i}c\cos\omega t + \mathbf{j}c\sin\omega t,$

where $c$ and $\omega$ are positive constants. Find (a) the velocity $\mathbf{v}(t)$ and the speed
of P; (b) the acceleration $\mathbf{a}(t)$ of P.
(a) $|\mathbf{r}(t)| = c\sqrt{\cos^2\omega t + \sin^2\omega t} = c$, so P is moving around a circle of radius
c in the (x, y)-plane.

$\mathbf{v} = d\mathbf{r}/dt = -\mathbf{i}c\omega\sin\omega t + \mathbf{j}c\omega\cos\omega t.$

The direction of v is tangential to the circle, by (9.18). By putting, e.g., t = 0 we
obtain $\mathbf{v} = \mathbf{j}\omega c$, and since $c, \omega > 0$, this shows the motion to be anticlockwise.
Also, speed $= |\mathbf{v}| = c\omega$.
(The speed is constant, but the velocity is not, because its direction is changing.)
(b) $\mathbf{a} = d\mathbf{v}/dt = -\mathbf{i}c\omega^2\cos\omega t - \mathbf{j}c\omega^2\sin\omega t = -\omega^2(\mathbf{i}c\cos\omega t + \mathbf{j}c\sin\omega t)$

$= -\omega^2\mathbf{r}.$

The acceleration is therefore directed towards the centre of the circle (perhaps
unexpectedly).
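
The results of Example 9.14 can be checked numerically by replacing the limit in (9.17) with a small central difference. The sketch below is only an illustration: the values c = 2 and ω = 3, the time t = 0.4, the step size, and the helper name `central_difference` are all assumptions chosen for the check.

```python
import math

C, OMEGA = 2.0, 3.0   # illustrative values of c and omega (assumed)

def r(t):
    """Position vector of Example 9.14 as an (x, y) tuple."""
    return (C * math.cos(OMEGA * t), C * math.sin(OMEGA * t))

def v_exact(t):
    """Velocity obtained by differentiating r(t) term by term."""
    return (-C * OMEGA * math.sin(OMEGA * t), C * OMEGA * math.cos(OMEGA * t))

def central_difference(f, t, h=1e-5):
    """Approximate df/dt componentwise by (f(t + h) - f(t - h)) / (2h)."""
    ahead, behind = f(t + h), f(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(ahead, behind))

t = 0.4
v_num = central_difference(r, t)            # numerical velocity
a_num = central_difference(v_exact, t)      # numerical acceleration
speed = math.hypot(*v_num)

print(v_num, v_exact(t))                                # should nearly agree
print(speed, C * OMEGA)                                 # speed is c * omega
print(a_num, tuple(-OMEGA ** 2 * x for x in r(t)))      # a = -omega^2 r
```

The agreement is only approximate, because a finite step h replaces the limit δt → 0.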

9.9 Motion in polar coordinates


Suppose that two-dimensional polar coordinates $r$, $\theta$ are appropriate
to the geometry of an application. Figure 9.25 shows a point
P: (x, y) and its polar coordinates $r$, $\theta$. There are also two unit
vectors $\mathbf{e}_r$ and $\mathbf{e}_\theta$ associated with P, $\mathbf{e}_r$ in the direction of $\theta$ constant
with $r$ increasing, and $\mathbf{e}_\theta$ in the direction of $r$ constant with $\theta$
increasing. The position vector of P is $\mathbf{r}$, given by

$\overrightarrow{OP} = \mathbf{r} = r\mathbf{e}_r. \qquad (9.20)$

Fig. 9.25 Polar unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\theta$.

The unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\theta$ vary in direction according to the value
of $\theta$, and are therefore functions of $\theta$. They are related to the basis
vectors i and j as in Fig. 9.26. By the triangle rule,
$\mathbf{e}_r = \mathbf{i}\cos\theta + \mathbf{j}\sin\theta, \quad \mathbf{e}_\theta = -\mathbf{i}\sin\theta + \mathbf{j}\cos\theta. \qquad (9.21)$
We shall need their derivatives with respect to $\theta$:
$d\mathbf{e}_r/d\theta = -\mathbf{i}\sin\theta + \mathbf{j}\cos\theta = \mathbf{e}_\theta \qquad (9.22)$
and $d\mathbf{e}_\theta/d\theta = -\mathbf{i}\cos\theta - \mathbf{j}\sin\theta = -\mathbf{e}_r. \qquad (9.23)$
Now suppose that P is moving along a curved path. Then $r$ and
$\theta$ are functions of time, t, so we can write $r(t)$, $\theta(t)$ for its polar
coordinates, and consider their derivatives with respect to t. There
is a useful dot notation for time derivatives which saves a lot
of writing; it works in the same way as the dash notation, (4.1):

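Dot notation for time derivatives

If $x(t)$ represents a function of $t$, then $\dot{x}$ stands for
$dx/dt$, $\ddot{x}$ stands for $d^2x/dt^2$, and so on. $\qquad (9.24)$

Fig. 9.26

By using the chain rule, and writing $\dot\theta$ for $d\theta/dt$, we obtain from
(9.22) and (9.23) the time variation of $\mathbf{e}_r$ and $\mathbf{e}_\theta$:
$d\mathbf{e}_r/dt = \dot\theta\,\mathbf{e}_\theta$ and $d\mathbf{e}_\theta/dt = -\dot\theta\,\mathbf{e}_r. \qquad (9.25)$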
This result is used in the following example.

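Example 9.15. The polar coordinates of a point moving in a plane are $r(t)$, $\theta(t)$,
where t is time. Find the polar components (a) of its velocity and (b) of its
acceleration.
(a) The position vector is $\mathbf{r}(t) = r(t)\mathbf{e}_r$. The velocity v is $d\mathbf{r}/dt$:
$\mathbf{v}(t) = d\mathbf{r}/dt = d(r\mathbf{e}_r)/dt.$
Both $r$ and $\mathbf{e}_r$ depend on $t$, so we use the product rule for differentiation:
$\mathbf{v} = \dot{r}\mathbf{e}_r + r\,d\mathbf{e}_r/dt = \dot{r}\mathbf{e}_r + r\dot\theta\,\mathbf{e}_\theta \qquad \text{(i)}$

by (9.25). Therefore the radial velocity component is $\dot{r}$ and the transverse
component is $r\dot\theta$.
(b) The acceleration is $d\mathbf{v}/dt$, given by
$d\mathbf{v}/dt = (d/dt)(\dot{r}\mathbf{e}_r + r\dot\theta\,\mathbf{e}_\theta)$ (from (i))

$= \ddot{r}\mathbf{e}_r + \dot{r}\dfrac{d\mathbf{e}_r}{dt} + \dfrac{d(r\dot\theta)}{dt}\mathbf{e}_\theta + r\dot\theta\dfrac{d\mathbf{e}_\theta}{dt}$

$= \ddot{r}\mathbf{e}_r + \dot{r}\dot\theta\,\mathbf{e}_\theta + (\dot{r}\dot\theta + r\ddot\theta)\mathbf{e}_\theta - r\dot\theta^2\mathbf{e}_r$

$= (\ddot{r} - r\dot\theta^2)\mathbf{e}_r + (r\ddot\theta + 2\dot{r}\dot\theta)\mathbf{e}_\theta.$

Therefore the radial component of acceleration is $\ddot{r} - r\dot\theta^2$, and the transverse
component is $r\ddot\theta + 2\dot{r}\dot\theta$.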
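
As a rough numerical cross-check of the acceleration components found in Example 9.15, the sketch below compares the polar formulas with a second central difference of the Cartesian position. The test path r = 1 + t², θ = 2t, the sample time t = 0.7, and all function names are assumptions made for this illustration.

```python
import math

def polar_acceleration(r, r_dot, r_ddot, theta_dot, theta_ddot):
    """Radial and transverse acceleration components from Example 9.15(b)."""
    radial = r_ddot - r * theta_dot ** 2
    transverse = r * theta_ddot + 2 * r_dot * theta_dot
    return radial, transverse

def position(t):
    """Cartesian position for the assumed path r = 1 + t^2, theta = 2t."""
    r, theta = 1 + t * t, 2 * t
    return (r * math.cos(theta), r * math.sin(theta))

def cartesian_acceleration(t, h=1e-4):
    """Second central difference of the Cartesian position."""
    before, here, after = position(t - h), position(t), position(t + h)
    return tuple((p - 2 * q + s) / h ** 2 for p, q, s in zip(before, here, after))

def compare(t):
    """Project the numerical acceleration on e_r and e_theta; also apply the formulas."""
    theta = 2 * t
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-math.sin(theta), math.cos(theta))
    ax, ay = cartesian_acceleration(t)
    numeric = (ax * e_r[0] + ay * e_r[1], ax * e_theta[0] + ay * e_theta[1])
    # For this path: r' = 2t, r'' = 2, theta' = 2, theta'' = 0.
    formula = polar_acceleration(1 + t * t, 2 * t, 2.0, 2.0, 0.0)
    return numeric, formula

numeric, formula = compare(0.7)
print(numeric)   # from finite differences
print(formula)   # from the formulas of Example 9.15
```

The two printed pairs should agree closely but not exactly, since the finite difference is an approximation.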

Problems

9.1. Sketch the two-dimensional displacement vectors $\overrightarrow{PQ}$ and $\overrightarrow{QP}$, and state their x and y components, when the coordinates of P and Q are as follows.
(a) P: (−2, 3), Q: (3, 0), (b) P: (3, 4), Q: (2, 1), (c) P: (0, 1), Q: (−1, −2), (d) P: (−1, −1), Q: (0, 0).

9.2. (a) to (h) represent two-dimensional displacement vectors expressed in terms of their x, y components. For each one obtain the length, and the angle of inclination $\theta$ to the positive direction of the x axis in the range −180° to 180°.
(a) (3, 0), (b) (0, 2), (c) (−1, 1), (d) (1, 1), (e) (−1, −1), (f) (−3, 4), (g) (−3, −4), (h) (−2, 1).
(Make sure that these angles are in the right quadrant by means of a rough sketch.)

9.3. Obtain the components of the vectors a in (a) to (d), where L is the magnitude and $\theta$ the angle made with the positive direction of the x axis ($-180° < \theta \le 180°$): (a) L = 2, $\theta$ = 45°, (b) L = 3, $\theta$ = 120°, (c) L = 3, $\theta$ = 60°, (d) L = 3, $\theta$ = −150°.

9.4. Two ships, $S_1$ and $S_2$, set off from the same point O. Each follows a route given by successive displacement vectors. In axes pointing east and north, $S_1$ follows the path to B via $\overrightarrow{OA}$ = (2, 4), and $\overrightarrow{AB}$ = (4, 1). $S_2$ goes to E via $\overrightarrow{OC}$ = (3, 3), $\overrightarrow{CD}$ = (1, 1) and $\overrightarrow{DE}$ = (2, −3). Find the displacement vector $\overrightarrow{BE}$ in component form, the distance BE, and the final bearing of $S_2$ seen from $S_1$.

9.5. Find the distances between the pairs of points whose coordinates are: (a) (0, 0, 0) and (1, 2, 3), (b) (1, 2, 3) and (3, 2, 1), (c) (1, 0, −1) and (−1, 1, 0).

9.6. State the projections on the three axes of the vector $\overrightarrow{PQ}$ when P is the point (1, 2, 1) and Q is (2, 3, 3).

9.7. Find 2a, 3b, and 2a − 3b when
(a) a = (1, 2, 1), b = (2, 1, 2),
(b) a = (3, 2, 3), b = (1, 1, 2),
(c) a = (6, 3, 1), b = (4, 2, 1).
How do you recognize that 2a − 3b is parallel to the (x, y)-plane in (b), and parallel to the z axis in (c)?

9.8. Sketch a diagram to show that if A, B, C are any three points, then $\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CA} = \mathbf{0}$. Formulate a similar result for any number of points.

9.9. Sketch a diagram to show that if A, B, C, D are any four points, then $\overrightarrow{CD} = \overrightarrow{CB} + \overrightarrow{BA} + \overrightarrow{AD}$. Formulate a similar result for any number of points.

9.10. Oxyz and QXYZ are two sets of axes with origins at O and Q respectively. QX is parallel to Ox and has the same sense (positive direction), and similarly for QY and QZ. The frame QXYZ is said to be a translation (a motion without rotation) of the frame Oxyz.
Suppose that $\overrightarrow{OQ}$ = (2, −1, 3). (a) Find the coordinates of the point P in QXYZ if it has coordinates x = 5, y = 2, z = −3 in Oxyz. (b) Find the equation of the sphere $x^2 + y^2 + z^2 = 1$ in terms of X, Y, and Z.

9.11. ABCD is any quadrilateral in three dimensions. Prove that if P, Q, R, S are the mid-points of AB, BC, CD, DA respectively, then PQRS is a parallelogram.

9.12. ABC is a triangle, and P, Q, R are the mid-points of the respective sides BC, CA, AB. Prove that the medians AP, BQ, CR meet at a single point G (called the centroid of ABC; it is the centre of mass of a uniform triangular plate).

9.13. Show that the vectors $\overrightarrow{OA}$ = (1, 1, 2), $\overrightarrow{OB}$ = (1, 1, 1), and $\overrightarrow{OC}$ = (5, 5, 7) all lie in one plane. Show that the same is true if $\overrightarrow{OA}$ = (a, a, p), $\overrightarrow{OB}$ = (b, b, q), $\overrightarrow{OC}$ = (c, c, r), where a, b, c, p, q, r may stand for any numbers. Explain this result geometrically.

9.14. A glider is moving with a velocity v = (40, 30, 10) relative to the air and is blown by the wind which has velocity relative to the earth of w = (5, −10, 0). Find the velocity of the glider relative to the earth.

9.15. The captain of a boat at night can tell that it is moving relative to the sea with velocity (5, 4) km hr$^{-1}$, and by observation of lights on shore its true velocity is found to be (4, 1). What is the velocity of the current?

9.16. A cyclist rides north along a straight road at 10 km hr$^{-1}$. The wind appears to come from the west. If she increases her speed to 20 km hr$^{-1}$ then the wind appears to blow from the north-west. Determine the speed and direction of the wind.

9.17. A ship travels south with speed u and the apparent wind direction is from the east. Another travels west with speed $2u/\sqrt{3}$, and the apparent wind direction is from 30° east of north. Find the true wind velocity.

9.18. r is the position vector (2, 3, 1), and a = (1, 1, 2) is a general vector. R is the position vector defined by R = a + 2r. Find the coordinates of the terminal point of R.

9.19. Find the angle $\theta$, where $0 \le \theta \le 180°$, made by the position vector r with the positive directions of the axes Ox, Oy, Oz in the following cases: (a) r = (1, 0, 0), (b) r = (0, 1, 1), (c) r = (0, 0, −1), (d) r = (1, 1, 1), (e) r = (1, 1, −1).

9.20. P: (1, 1, 0), Q: (1, 1, 1), and R: (1, 2, 1) are three of the vertices of a parallelogram with sides PQ and PR. Use vector methods to find the coordinates of (a) the fourth vertex, S, (b) the mid-point of PS, (c) the mid-point of QR. Show that (b) and (c) have the same coordinates (it is where the diagonals intersect).
Find the mid-points A, B, C, D of the four sides PR, RS, SQ, QP respectively. Show that ABCD is a parallelogram.

9.21. Show that the points A: (1, 2, −1), B: (3, 3, −2), and C: (−3, 0, 1) are collinear (lie on a straight line), by considering the vectors $\overrightarrow{AB}$ and $\overrightarrow{AC}$ (or any other two combinations of A, B, and C). (a) Find which point is between the other two. (b) Find any other point on the line. (c) Show that the points $x = 2\lambda + 1$, $y = \lambda + 2$, $z = -\lambda - 1$, where $\lambda$ is a parameter which may take any value, all lie on the line (these are parametric equations for the line).

9.22. Two points A and B have position vectors a and b respectively. In terms of a and b find the position vectors of the following points on the straight line passing through A and B: (a) the mid-point C of AB; (b) a point U between A and B for which AU/UB = 1/3; (c) a point V for which AV/VB = 1/3, but for which V does not lie between A and B.

9.23. Suppose that $\lambda$ is a number such that $0 < \lambda < 1$. Find two points, U and V, on the line through A and B such that (a) AU/UB = $\lambda$ and U is between A and B, (b) AV/BV = $\lambda$ and V is not between A and B. (c) What is the case if $\lambda > 1$?

9.24. (a) Obtain a vector parametric equation for the straight line which passes through the point (1, 4, 2) and is parallel to the line joining the points (2, 3, 4) and (1, 2, 3). (b) As in Example 9.9, deduce a pair of simultaneous cartesian equations for the line. (c) Obtain the points where the line intersects the (x, y)-plane and the (y, z)-plane. (d) By using these two points, obtain another pair of cartesian equations for the line.

9.25. Suppose that P has position vector r, and $\mathbf{r} = \lambda\mathbf{a} + (1 - \lambda)\mathbf{b}$, where $\lambda$ is a parameter, and A, B are points with a, b as position vectors. Show that P describes a straight line. Indicate on a diagram the relative positions of A, B, P when $\lambda < 0$, $0 < \lambda < 1$, and $\lambda > 1$.

9.26. Find the cartesian equation of the planes passing through the following points: (a) (1, 0, 1), (0, 1, 0), (0, 0, 1), (b) (0, 0, 0), (1, 2, −1), (2, 2, 2).

9.27. Find the shortest distance from the origin of the line given in vector parametric form by r = a + tb, where a = (1, 2, 3), b = (1, 1, 1), and t is the parameter. (Hint: use a calculus method, with t as the independent variable.)

9.28. For each of the following cases find a unit vector which has the same direction as a, and a unit vector which has the opposite direction. (a) a = (3, 4, 3), (b) a = 2i + 3j + 6k, (c) a = (−1, −1, 2), (d) a = i − 2j + k, (e) a = 3i − 6j + 3k.

9.29. Express in terms of i, j, k the vectors whose initial and terminal points are respectively given by the following position vectors: (a) i + j + k and −2i + 3j + 5k, (b) i + 2j − k and 3i − j − 2k. Find the length of the vector in each case.

9.30. Show that the line joining the points with position vectors i − j + 2k and 2i − 2j − 3k intersects the z axis.

9.31. A set of two-dimensional position vectors is given by r = ai + bj, where $|a| + |b| < 1$. Describe the shape of the region which includes all the points with these position vectors.

9.32. A set of position vectors is given by r = ai + bj + ck, where $|a| + |b| + |c| < 1$. Describe the shape of the region which includes all the points with these position vectors.

9.33. Suppose that a weightless framework supports N particles, which have masses $m_i$ and are located at points with position vectors $\mathbf{r}_i$, where i = 1, 2, 3, ..., N. You may assume that the centre of mass is at the point with position vector $\bar{\mathbf{r}}$, where
$\bar{\mathbf{r}} = \sum_{i=1}^{N} m_i\mathbf{r}_i \Big/ \sum_{i=1}^{N} m_i.$
Find the centre of mass of three particles of masses 1 kg, 2 kg, and 3 kg at the points i + j + 2k, −2i + 3j − 5k and 3i + 2k.

9.34. Obtain a parametric vector equation for the line which is parallel to i + 2j − k and which passes through the point with position vector i + j + k. Find the point of intersection of this line with the plane x − y + z = −2.

9.35. An aircraft flying with constant speed V is circling horizontally at height H above an airfield which lies in the (x, y)-plane. Its motion is in the clockwise direction when viewed from below. The centre of its circular path is at Pi + Hk, and at time t = 0 it is at the point (P + R)i + Hk. Find the position vector for the aircraft at time t.

9.36. a and b are two position vectors. Find in terms of a and b a position vector which bisects the angle between them.

9.37. An aircraft A is flying along a path given by the position vector 0.41/ + 148r/ + 0.99/r, where t is the time in hours, and distance is in km. Another aircraft, B, takes off from an airfield at the origin O at time t = 0 and follows the path given by the position vector 100ti + 250tj + 250tk. (a) Show that A and B are moving along straight lines at constant speeds, and find the speeds. (b) Show that a near miss between A and B will occur, and find the time that this happens.

9.38. Two moving points A and B have position vectors $\mathbf{r}_A(t) = x_A(t)\mathbf{i} + y_A(t)\mathbf{j} + z_A(t)\mathbf{k}$ and $\mathbf{r}_B(t) = x_B(t)\mathbf{i} + y_B(t)\mathbf{j} + z_B(t)\mathbf{k}$ respectively, which depend on the time t. (a) Show that the velocity of B relative to A is $d\mathbf{r}_B(t)/dt - d\mathbf{r}_A(t)/dt$. (b) Suppose that the two points are A: $(t, -t^2, t)$ and B: $(t^3, 2t^2, 1 + 3t)$. Find the velocity of B relative to A and the velocity of A relative to B. (c) Find the time t at which the relative speed is a minimum.

9.39. A particle describes an elliptical plane path with position vector $\mathbf{r} = \mathbf{i}a\cos\omega t + \mathbf{j}b\sin\omega t$, where t is time and $\omega$, a, b are constants. Show that the acceleration is always directed towards the centre.

9.40. The position vector of a particle is given in polar coordinates by $r = \sec t$, $\theta = t$. Sketch the path for $0 < t < \tfrac{1}{2}\pi$. Find the radial and transverse components of acceleration.

9.41. The position vector of a particle P is given by

$\mathbf{r} = \mathbf{i}a\cos\omega t\sin\nu t + \mathbf{j}a\sin\omega t\sin\nu t + \mathbf{k}a\cos\nu t,$

where a, $\omega$, $\nu$ are constants and t is time. Show that P moves on a sphere of radius a. Find the velocity of the particle and show that its magnitude is $a(\nu^2 + \omega^2\sin^2\nu t)^{1/2}$. Deduce that the minimum speed occurs at the highest and lowest points of the sphere, and find where the maximum occurs.
The scalar product

Contents
10.1 The scalar product of two vectors 163
10.2 The angle between two vectors 164
10.3 Perpendicular vectors 165
10.4 Rotation of axes in two dimensions 167
10.5 Direction cosines 167
10.6 Rotation of axes in three dimensions 169
10.7 Direction ratios and coordinate geometry 171
10.8 Properties of a plane 173
10.9 General equation of a straight line 175
10.10 Forces acting at a point 176
10.11 Curvature in two dimensions 178
Problems 180

10.1 The scalar product of two vectors


Suppose that in component form

$\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{b} = (b_1, b_2, b_3).$

The dot product or scalar product of a and b is denoted by a dot
and is defined by

$\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3.$

(It is necessary to write the dot, because there is also another form
of product, called the vector product.) The dot product is not a
vector, but an ordinary number, or a scalar quantity. Some simple
properties are:

Scalar or dot product

Definition: Let $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{b} = (b_1, b_2, b_3)$. Then
$\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3. \qquad (10.1)$
(a) $\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a}$ (commutative property)
(b) $\mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c}$ (distributive property)
(c) Connection with the magnitude $|\mathbf{a}|$:
$\mathbf{a}\cdot\mathbf{a} = a_1^2 + a_2^2 + a_3^2 = |\mathbf{a}|^2.$
(For two dimensions, omit the third component.)

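Example 10.1. Find $(\mathbf{a} - \mathbf{b})\cdot(\mathbf{a} + \mathbf{b})$ when a = (−1, 0, 1) and b = (2, 3, 2).

$\mathbf{a} - \mathbf{b} = (-3, -3, -1)$ and $\mathbf{a} + \mathbf{b} = (1, 3, 3).$

Therefore

$(\mathbf{a} - \mathbf{b})\cdot(\mathbf{a} + \mathbf{b}) = (-3 \times 1) + (-3 \times 3) + (-1 \times 3) = -15.$

Example 10.2. Prove that $(\mathbf{a} - \mathbf{b})\cdot(\mathbf{a} + \mathbf{b}) = |\mathbf{a}|^2 - |\mathbf{b}|^2$.

Use the rules in (10.1) to proceed as in ordinary algebra:

$(\mathbf{a} - \mathbf{b})\cdot(\mathbf{a} + \mathbf{b}) = (\mathbf{a} - \mathbf{b})\cdot\mathbf{a} + (\mathbf{a} - \mathbf{b})\cdot\mathbf{b} = \mathbf{a}\cdot\mathbf{a} - \mathbf{b}\cdot\mathbf{a} + \mathbf{a}\cdot\mathbf{b} - \mathbf{b}\cdot\mathbf{b}$

$= \mathbf{a}\cdot\mathbf{a} - \mathbf{b}\cdot\mathbf{b} = |\mathbf{a}|^2 - |\mathbf{b}|^2$ (by (10.1c)).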
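
The definition of the scalar product translates directly into code. The following Python sketch is illustrative only (the helper names `dot`, `add`, and `sub` are assumptions); it evaluates Example 10.1 and checks the identity of Example 10.2 for the same pair of vectors.

```python
def dot(a, b):
    """Scalar product a1*b1 + a2*b2 + ... of two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    """Componentwise sum a + b."""
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

a, b = (-1, 0, 1), (2, 3, 2)             # the vectors of Example 10.1
lhs = dot(sub(a, b), add(a, b))          # (a - b) . (a + b)
rhs = dot(a, a) - dot(b, b)              # |a|^2 - |b|^2
print(lhs, rhs)                          # the two values should be equal
```

A single numerical case does not prove the identity, of course; the algebraic argument in Example 10.2 does that.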

10.2 The angle between two vectors


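In Fig. 10.1 we show two vectors, a and b, in three dimensions. Their
initial points coincide at P. By the angle $\theta$ between a and b we mean
the angle $\theta$ in the plane of a and b as shown: the angle chosen is
the one which is in the range 0° to 180° (i.e., we refer to the
internal angle, and do not use negative angles).
By the triangle rule

$\overrightarrow{BA} = \mathbf{a} - \mathbf{b},$

and the lengths of the sides of the triangle ABP are given by

$|PA| = |\mathbf{a}|, \quad |PB| = |\mathbf{b}|, \quad |BA| = |\mathbf{a} - \mathbf{b}|.$

The cosine rule (Appendix B) says that

$BA^2 = PA^2 + PB^2 - 2\,PA \cdot PB\cos\theta,$

or $|\mathbf{a} - \mathbf{b}|^2 = |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2|\mathbf{a}||\mathbf{b}|\cos\theta. \qquad (10.2)$

But from (10.1c), putting a − b in place of a,

$|\mathbf{a} - \mathbf{b}|^2 = (\mathbf{a} - \mathbf{b})\cdot(\mathbf{a} - \mathbf{b}) = \mathbf{a}\cdot\mathbf{a} + \mathbf{b}\cdot\mathbf{b} - 2\mathbf{a}\cdot\mathbf{b},$

or $|\mathbf{a} - \mathbf{b}|^2 = |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2\mathbf{a}\cdot\mathbf{b}. \qquad (10.3)$

By comparing (10.2) with (10.3) we obtain

$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\theta,$

or $\cos\theta = \mathbf{a}\cdot\mathbf{b}/(|\mathbf{a}||\mathbf{b}|).$

Angle between two vectors

Let $\theta$, $0° \le \theta \le 180°$, be the angle between the directions of a and b. Then
(a) $\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\theta. \qquad (10.4)$
(b) $\cos\theta = \mathbf{a}\cdot\mathbf{b}/(|\mathbf{a}||\mathbf{b}|)$,
or $\theta = \arccos\{\mathbf{a}\cdot\mathbf{b}/(|\mathbf{a}||\mathbf{b}|)\}$ (a calculator gives this angle
uniquely in the range 0° to 180°).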

If a and b are not at the same point to start with we may still
refer to $\theta$ as being the angle between them. The result (10.4b) can
also be written in the form $\cos\theta = \hat{\mathbf{a}}\cdot\hat{\mathbf{b}}$, where $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$ are the unit
vectors in the directions of a and b.

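Example 10.3. Given three points A: (1, 1, 1), B: (3, 2, 3), and C: (0, −1, 1),
find the angle $\theta$ between $\overrightarrow{CA}$ and $\overrightarrow{CB}$.
Put $\overrightarrow{CA} = \mathbf{a}$, $\overrightarrow{CB} = \mathbf{b}$.

Then $\mathbf{a} = (1, 1, 1) - (0, -1, 1) = (1, 2, 0),$

$\mathbf{b} = (3, 2, 3) - (0, -1, 1) = (3, 3, 2).$

$|\mathbf{a}| = \sqrt{1^2 + 2^2 + 0^2} = \sqrt{5}$ and $|\mathbf{b}| = \sqrt{3^2 + 3^2 + 2^2} = \sqrt{22}.$

$\mathbf{a}\cdot\mathbf{b} = (1 \times 3) + (2 \times 3) + (0 \times 2) = 9.$

From (10.4),

$\cos\theta = \dfrac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}||\mathbf{b}|} = \dfrac{9}{\sqrt{5}\sqrt{22}} = 0.858.$

Finally $\theta = 30.9°$.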
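
Formula (10.4b) can be sketched as a small Python function. The names below are illustrative assumptions; the clamp to the interval [−1, 1] is a defensive measure against floating-point rounding and is not part of the mathematical formula.

```python
import math

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle in degrees (0 to 180) between nonzero vectors a and b, from (10.4b)."""
    norm_a, norm_b = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("the angle is undefined for a zero vector")
    cos_theta = dot(a, b) / (norm_a * norm_b)
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding
    return math.degrees(math.acos(cos_theta))

# Points of Example 10.3.
A, B, C = (1, 1, 1), (3, 2, 3), (0, -1, 1)
CA = tuple(p - q for p, q in zip(A, C))
CB = tuple(p - q for p, q in zip(B, C))
theta = angle_between(CA, CB)
print(CA, CB, round(theta, 1))
```

Like a calculator's arccos key, `math.acos` returns the unique angle in the range 0° to 180°.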

10.3 Perpendicular vectors


Cases when vectors are perpendicular or orthogonal are particularly
important. The condition is that $\cos\theta = 0$.

Example 10.4. Show that the vectors a = (1, 2, 3) and b = (−5, 1, 1) are
perpendicular.
Neither a nor b is the zero vector, and

$\mathbf{a}\cdot\mathbf{b} = (1, 2, 3)\cdot(-5, 1, 1) = -5 + 2 + 3 = 0.$

Therefore $\theta = 90°$, by (10.4).

From (10.4) the condition for two vectors to be perpendicular
may be expressed as follows:

Perpendicular vectors
If a and b are nonzero vectors, they are perpendicular if

$\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3 = 0. \qquad (10.5)$

The basis vectors i, j, k are perpendicular, so

$\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = 0. \qquad (10.6)$



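Also, they have unit magnitude, so by (10.1c),

$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1. \qquad (10.7)$

Suppose that $\mathbf{a} = (a_1, a_2, a_3)$ in component form. Then

$\mathbf{i}\cdot\mathbf{a} = \mathbf{i}\cdot(a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) = a_1$

from (10.6) and (10.7). The component $a_1$ is therefore picked out by
scalar multiplication by i. Similarly,

$\mathbf{j}\cdot\mathbf{a} = a_2, \quad \mathbf{k}\cdot\mathbf{a} = a_3.$

We can therefore write any vector in the form

$\mathbf{a} = (\mathbf{i}\cdot\mathbf{a})\mathbf{i} + (\mathbf{j}\cdot\mathbf{a})\mathbf{j} + (\mathbf{k}\cdot\mathbf{a})\mathbf{k}.$

(Remember that $\mathbf{i}\cdot\mathbf{a}$, $\mathbf{j}\cdot\mathbf{a}$, and $\mathbf{k}\cdot\mathbf{a}$ are ordinary numbers.)

Scalar products of i, j, k
(a) $\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1$;
$\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = 0$.
(b) The components of any vector a are given by
$a_1 = \mathbf{i}\cdot\mathbf{a}, \quad a_2 = \mathbf{j}\cdot\mathbf{a}, \quad a_3 = \mathbf{k}\cdot\mathbf{a}. \qquad (10.8)$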
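Example 10.5. Find the numbers $\alpha$, $\beta$, and $\gamma$ which make the vectors

$\mathbf{a} = \alpha\mathbf{i} + \mathbf{j} + 2\mathbf{k}, \quad \mathbf{b} = \mathbf{i} + \beta\mathbf{j} - \mathbf{k}, \quad \mathbf{c} = \mathbf{i} - \mathbf{j} + \gamma\mathbf{k}$

mutually perpendicular.
We require that $\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{c} = \mathbf{c}\cdot\mathbf{a} = 0$.

$\mathbf{a}\cdot\mathbf{b} = (\alpha\mathbf{i} + \mathbf{j} + 2\mathbf{k})\cdot(\mathbf{i} + \beta\mathbf{j} - \mathbf{k}) = \alpha + \beta - 2 = 0,$

$\mathbf{b}\cdot\mathbf{c} = (\mathbf{i} + \beta\mathbf{j} - \mathbf{k})\cdot(\mathbf{i} - \mathbf{j} + \gamma\mathbf{k}) = 1 - \beta - \gamma = 0,$

$\mathbf{c}\cdot\mathbf{a} = (\mathbf{i} - \mathbf{j} + \gamma\mathbf{k})\cdot(\alpha\mathbf{i} + \mathbf{j} + 2\mathbf{k}) = \alpha - 1 + 2\gamma = 0.$

Therefore $\alpha$, $\beta$, $\gamma$ must satisfy

$\alpha + \beta = 2, \qquad \text{(i)}$

$-\beta - \gamma = -1, \qquad \text{(ii)}$

$\alpha + 2\gamma = 1. \qquad \text{(iii)}$

Substitute $\alpha$ from (i) and $\gamma$ from (ii) into (iii) to give

$(2 - \beta) + 2(1 - \beta) = 1,$

so that $\beta = 1$. From (ii), $\gamma = 1 - \beta = 0$, and from (i), $\alpha = 2 - \beta = 1$. Therefore
the required vectors are

$\mathbf{a} = \mathbf{i} + \mathbf{j} + 2\mathbf{k}, \quad \mathbf{b} = \mathbf{i} + \mathbf{j} - \mathbf{k}, \quad \mathbf{c} = \mathbf{i} - \mathbf{j}.$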
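
The answer to Example 10.5 can be verified by testing condition (10.5) on every pair of vectors. This is a small Python sketch with an assumed helper name, `mutually_perpendicular`, and an assumed tolerance argument for use with floating-point components.

```python
from itertools import combinations

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def mutually_perpendicular(vectors, tol=1e-12):
    """True if every pair of distinct vectors has zero scalar product."""
    return all(abs(dot(u, v)) <= tol for u, v in combinations(vectors, 2))

alpha, beta, gamma = 1, 1, 0          # values found in Example 10.5
a = (alpha, 1, 2)
b = (1, beta, -1)
c = (1, -1, gamma)
print(mutually_perpendicular([a, b, c]))
```

The check confirms the result but does not replace the solution of equations (i) to (iii), which is what shows the values are the only ones possible.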

10.4 Rotation of axes in two dimensions


In Fig. 10.2a, P is a point which has coordinates (x, y) in the axes
Ox, Oy. OX, OY is another set of axes, rotated relative to the first
set by an angle $\theta$. The positive direction for $\theta$ is anticlockwise, and
$\theta$ may lie in the range ±180° so as to cover all possibilities, like a
polar angle. The unit basis vectors in the axes OX, OY are I and J
respectively. The problem is to find the coordinates (X, Y) of P in
the new axes.
We can express i and j in terms of I and J. From Fig. 10.2b, their
components in the X, Y axes are

$\mathbf{i} = (\cos\theta, -\sin\theta), \quad \mathbf{j} = (\sin\theta, \cos\theta).$

Therefore

$\mathbf{i} = \mathbf{I}\cos\theta - \mathbf{J}\sin\theta,$

and $\mathbf{j} = \mathbf{I}\sin\theta + \mathbf{J}\cos\theta.$

Fig. 10.2 (a) Change of axes in 2 dimensions. (b) The associated unit vectors.

The position of P in space does not change when we change the
axes, so in terms of the new axes

$X\mathbf{I} + Y\mathbf{J} = x\mathbf{i} + y\mathbf{j}$

$= x(\mathbf{I}\cos\theta - \mathbf{J}\sin\theta) + y(\mathbf{I}\sin\theta + \mathbf{J}\cos\theta)$

$= (x\cos\theta + y\sin\theta)\mathbf{I} + (-x\sin\theta + y\cos\theta)\mathbf{J}.$

Finally, by equating the coefficients of I and J, we obtain the result
(10.9a):

Rotation of axes in two dimensions

Given axes inclined at $\theta$ as in Fig. 10.2a, the coordinates
x, y and X, Y are related by
(a) $X = x\cos\theta + y\sin\theta$,
$Y = -x\sin\theta + y\cos\theta$.
(b) $x = X\cos\theta - Y\sin\theta$,
$y = X\sin\theta + Y\cos\theta. \qquad (10.9)$

The inverse relation (10.9b) can be obtained by solving the equations
in (10.9a) for x and y; or by interchanging x, y and X, Y in (a) and
putting $(-\theta)$ in place of $\theta$.
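
The pair of relations (10.9a) and (10.9b) can be sketched in Python as below. The function names and the sample point and angle are assumptions for illustration; angles are taken in radians, as is usual for `math.cos` and `math.sin`.

```python
import math

def to_rotated_axes(x, y, theta):
    """(10.9a): coordinates (X, Y) in axes rotated anticlockwise by theta (radians)."""
    X = x * math.cos(theta) + y * math.sin(theta)
    Y = -x * math.sin(theta) + y * math.cos(theta)
    return X, Y

def from_rotated_axes(X, Y, theta):
    """(10.9b): recover (x, y) from coordinates (X, Y) in the rotated axes."""
    x = X * math.cos(theta) - Y * math.sin(theta)
    y = X * math.sin(theta) + Y * math.cos(theta)
    return x, y

theta = math.radians(30)                 # assumed sample rotation
X, Y = to_rotated_axes(2.0, 1.0, theta)  # assumed sample point (2, 1)
print(X, Y)
print(from_rotated_axes(X, Y, theta))    # should return (2, 1) up to rounding
```

Applying (a) and then (b) returns the original coordinates, which is one way to see that the two relations are inverses of each other.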

10.5 Direction cosines


Figure 10.3 shows a position vector $\mathbf{r} = \overrightarrow{OP}$, where P is the point
(a, b, c), so

$\overrightarrow{OP} = \mathbf{r} = (a, b, c).$

Fig. 10.3 Angles made by OP with the axes.

The angles between r and i, j, k respectively (chosen for definiteness
between 0° and 180°, as for $\theta$ in Section 10.2) are $\alpha$, $\beta$, $\gamma$. These
angles specify the direction of r uniquely. It is convenient to use not
the angles themselves, but their cosines, which are normally indicated
by l, m, n:

$l = \cos\alpha, \quad m = \cos\beta, \quad n = \cos\gamma.$

These are called the direction cosines of r, and also specify the
direction of r uniquely.
Referring to Fig. 10.3,

$|\mathbf{r}| = \sqrt{a^2 + b^2 + c^2}.$

Also

$l = \cos\alpha = a/|\mathbf{r}|, \quad m = \cos\beta = b/|\mathbf{r}|, \quad n = \cos\gamma = c/|\mathbf{r}|.$

Therefore

$l^2 + m^2 + n^2 = \cos^2\alpha + \cos^2\beta + \cos^2\gamma$

$= (a^2 + b^2 + c^2)/|\mathbf{r}|^2 = 1.$

The vector given in component form by

$(\cos\alpha, \cos\beta, \cos\gamma) = (l, m, n)$

is therefore the unit vector which specifies the direction of r.

Now let s be a vector having any magnitude and location, but
pointing in the same direction as r. Then s and r have the same
inclinations $\alpha$, $\beta$, $\gamma$ to the axes, and $\cos\alpha$, $\cos\beta$, $\cos\gamma$, the direction
cosines of s, are the same. To summarize:

Direction cosines l, m, n of any vector s

If the angles between s and Ox, Oy, Oz are $\alpha$, $\beta$, $\gamma$
respectively, in the range 0° to 180°, then
$l = \cos\alpha, \quad m = \cos\beta, \quad n = \cos\gamma \qquad (10.10)$
are the direction cosines of s.
(a) Any vector parallel to s with the same sense has
the same direction cosines l, m, n.
(b) $l^2 + m^2 + n^2 = \cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$.
(c) $\hat{\mathbf{s}} = (l, m, n)$ is a unit vector in the direction of s.

Example 10.6. Obtain the direction cosines of the vector s = i + 2j − 2k. Find
the angles between s and the coordinate axes.
The components of s are (1, 2, −2), so its length is given by

$|\mathbf{s}| = \sqrt{1^2 + 2^2 + (-2)^2} = 3.$

Therefore

$l = \tfrac{1}{3}, \quad m = \tfrac{2}{3}, \quad n = -\tfrac{2}{3}.$

The corresponding angles in the range 0° to 180° are $\alpha = \arccos\tfrac{1}{3} = 70.5°$,
$\beta = \arccos\tfrac{2}{3} = 48.2°$, $\gamma = \arccos(-\tfrac{2}{3}) = 131.8°$.
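
The calculation in Example 10.6 can be reproduced with a short Python sketch. The function names are illustrative assumptions, and the clamp on the cosine is only a safeguard against rounding.

```python
import math

def direction_cosines(s):
    """Return (l, m, n) = s/|s| for a nonzero vector s."""
    length = math.sqrt(sum(c * c for c in s))
    if length == 0:
        raise ValueError("the zero vector has no direction cosines")
    return tuple(c / length for c in s)

def axis_angles_degrees(s):
    """Angles (alpha, beta, gamma) in degrees, each in the range 0 to 180."""
    return tuple(math.degrees(math.acos(max(-1.0, min(1.0, c))))
                 for c in direction_cosines(s))

s = (1, 2, -2)                              # the vector of Example 10.6
l, m, n = direction_cosines(s)
print(l, m, n)
print(l * l + m * m + n * n)                # property (b): should be 1
print([round(angle, 1) for angle in axis_angles_degrees(s)])
```

The rounded angles should match those quoted in the example.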

10.6 Rotation of axes in three dimensions


Figure 10.4a shows two sets of axes Oxyz and OXYZ, with the same
origin O. The basis vectors are respectively i, j, k, and I, J, K. We
shall show how to change from one set of axes to the other, as we
did in Section 10.4 in two dimensions.
The three components of any unit vector are equal to its three
direction cosines. I, J, and K are unit vectors, so in the axes Oxyz
let them be given in terms of their direction cosines by:

$\mathbf{I} = (l_1, m_1, n_1), \quad \mathbf{J} = (l_2, m_2, n_2), \quad \mathbf{K} = (l_3, m_3, n_3). \qquad (10.11a)$

By inverting our view of the two sets of axes, we can also specify
the components of i, j, k in the axes OXYZ:

$\mathbf{i} = (l_1, l_2, l_3), \quad \mathbf{j} = (m_1, m_2, m_3), \quad \mathbf{k} = (n_1, n_2, n_3) \qquad (10.11b)$

(this is illustrated for the case of i in Fig. 10.4b).

Fig. 10.4 (a) Change of axes in 3 dimensions. (b) Angles between i and the X, Y, Z axes.

Next, suppose that a fixed point P has position vector

$\mathbf{r} = (x, y, z) = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$

in the axes Oxyz. We need to find the components of r in the axes
OXYZ. By substituting

$\mathbf{i} = l_1\mathbf{I} + l_2\mathbf{J} + l_3\mathbf{K}$, etc.

from (10.11b) into r, we obtain

$\mathbf{r} = (l_1x + m_1y + n_1z)\mathbf{I} + (l_2x + m_2y + n_2z)\mathbf{J} + (l_3x + m_3y + n_3z)\mathbf{K}.$

The OXYZ coordinates of the point P are therefore

$(l_1x + m_1y + n_1z, \; l_2x + m_2y + n_2z, \; l_3x + m_3y + n_3z). \qquad (10.12a)$

The inverse relation is obtained in a similar way. Given a fixed
point Q, with coordinates (X, Y, Z) in the axes OXYZ and position
vector R, then

$\mathbf{R} = (X, Y, Z) = X\mathbf{I} + Y\mathbf{J} + Z\mathbf{K}.$

Now use (10.11a) to show that R is given in the axes Oxyz by

$\mathbf{R} = (l_1X + l_2Y + l_3Z)\mathbf{i} + (m_1X + m_2Y + m_3Z)\mathbf{j} + (n_1X + n_2Y + n_3Z)\mathbf{k}.$

The coordinates of Q in Oxyz are therefore

$(l_1X + l_2Y + l_3Z, \; m_1X + m_2Y + m_3Z, \; n_1X + n_2Y + n_3Z). \qquad (10.12b)$

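In matrix form, the coordinates in the two systems are related as
follows:

Rotation of axes; three dimensions

$\mathbf{I} = (l_1, m_1, n_1)$, $\mathbf{J} = (l_2, m_2, n_2)$, $\mathbf{K} = (l_3, m_3, n_3)$ are the
basis vectors for axes OXYZ, referred to axes Oxyz
(the components being direction cosines). Then

(a) $\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} l_1 & m_1 & n_1 \\ l_2 & m_2 & n_2 \\ l_3 & m_3 & n_3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \qquad (10.13)$

(b) $\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \\ n_1 & n_2 & n_3 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}$

The matrix of direction cosines in (b) is the inverse of the matrix
in (a).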

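Example 10.7. In axes Ox, Oy, Oz, $\mathbf{I} = (\tfrac{1}{3}, -\tfrac{2}{3}, \tfrac{2}{3})$, $\mathbf{J} = (\tfrac{2}{3}, -\tfrac{1}{3}, -\tfrac{2}{3})$, $\mathbf{K} = (\tfrac{2}{3}, \tfrac{2}{3}, \tfrac{1}{3})$
are perpendicular unit vectors which are basis vectors for a new set of
axes. Find the new coordinates of the point P: (−3, −3, 3).
From (10.13), the new coordinates are

$X = \tfrac{1}{3}(-3) - \tfrac{2}{3}(-3) + \tfrac{2}{3}(3) = 3,$

$Y = \tfrac{2}{3}(-3) - \tfrac{1}{3}(-3) - \tfrac{2}{3}(3) = -3,$

$Z = \tfrac{2}{3}(-3) + \tfrac{2}{3}(-3) + \tfrac{1}{3}(3) = -3.$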

Example 10.8. (a) Confirm that the matrices

$\begin{bmatrix} l_1 & m_1 & n_1 \\ l_2 & m_2 & n_2 \\ l_3 & m_3 & n_3 \end{bmatrix}$ and $\begin{bmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \\ n_1 & n_2 & n_3 \end{bmatrix}$

are inverse matrices, where $\mathbf{I} = (l_1, m_1, n_1)$, $\mathbf{J} = (l_2, m_2, n_2)$ and $\mathbf{K} = (l_3, m_3, n_3)$
are mutually perpendicular unit vectors.
(b) Find the equation of the plane 3x + 3y + 3z = 1 in the new axes, using
the basis vectors given in Example 10.7.

(a) Multiply the two matrices. The diagonal elements are

$l_1^2 + m_1^2 + n_1^2, \quad l_2^2 + m_2^2 + n_2^2, \quad l_3^2 + m_3^2 + n_3^2,$

all of which are equal to unity since I, J, K are unit vectors. The other elements
have the typical form

$l_1l_2 + m_1m_2 + n_1n_2 = (l_1, m_1, n_1)\cdot(l_2, m_2, n_2).$

All of these are zero because I, J, K are mutually perpendicular. Therefore, the
product is the unit matrix.

(b) In this case we need x, y, z in terms of X, Y, Z. The equations
corresponding to (10.13b) are

$x = l_1X + l_2Y + l_3Z = \tfrac{1}{3}X + \tfrac{2}{3}Y + \tfrac{2}{3}Z,$

$y = m_1X + m_2Y + m_3Z = -\tfrac{2}{3}X - \tfrac{1}{3}Y + \tfrac{2}{3}Z,$

$z = n_1X + n_2Y + n_3Z = \tfrac{2}{3}X - \tfrac{2}{3}Y + \tfrac{1}{3}Z.$

In the new coordinates,

$3x + 3y + 3z = (X + 2Y + 2Z) + (-2X - Y + 2Z) + (2X - 2Y + Z)$

$= X - Y + 5Z.$

Therefore, the plane has the new equation

$X - Y + 5Z = 1.$
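
Examples 10.7 and 10.8 can be checked with exact arithmetic. The Python sketch below is an illustration under assumptions: the helper names are hypothetical, matrices are stored as lists of rows, and the plane is transformed by applying matrix (10.13a) to its coefficient vector (3, 3, 3), which relies on the scalar product being unchanged by a rotation of axes.

```python
from fractions import Fraction as Fr

# Rows are the new basis vectors I, J, K of Example 10.7, referred to Oxyz.
M = [
    [Fr(1, 3), Fr(-2, 3), Fr(2, 3)],
    [Fr(2, 3), Fr(-1, 3), Fr(-2, 3)],
    [Fr(2, 3), Fr(2, 3), Fr(1, 3)],
]

def mat_vec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    """Transpose of A: the matrix of (10.13b) when A is the matrix of (10.13a)."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

new_P = mat_vec(M, (-3, -3, 3))                 # new coordinates of P (Example 10.7)
is_inverse = mat_mul(M, transpose(M)) == identity
plane = mat_vec(M, (3, 3, 3))                   # coefficients of X, Y, Z in the plane
print(new_P, is_inverse, plane)
```

Using `Fraction` avoids rounding, so the product of the two matrices is exactly the unit matrix.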

10.7 Direction ratios and coordinate geometry


In ordinary three-dimensional coordinate geometry the inclination
of a straight line is specified without distinguishing between the
two possible directions along the line. The method used (in vector
terms) is equivalent to specifying the three components of any vector
s that is parallel to the line. The length of s, and its direction forwards
or backwards along the line, are immaterial. If

$\mathbf{s} = p\mathbf{i} + q\mathbf{j} + r\mathbf{k} = (p, q, r)$

is parallel to the line, then the triplet of numbers p, q, r is called a
set of direction ratios for the line. Alternatively, if AB is any
segment of the straight line, then the projections of AB on to Ox,
Oy, Oz form a set of direction ratios for the line.
Any multiple of p, q, r, say $\lambda p$, $\lambda q$, $\lambda r$, is also a set of direction
ratios for the line, because it corresponds to a parallel vector
$\mathbf{s}_1 = \lambda p\mathbf{i} + \lambda q\mathbf{j} + \lambda r\mathbf{k}$. For example, if s = 2i + 3j + 6k is parallel to
a given line, then 2, 3, 6 and 6, 9, 18 are both sets of direction ratios
for the line. So are −2, −3, −6, corresponding to the vector (−1)s,
although it points in the opposite direction.
By putting $\lambda = \pm\tfrac{1}{7}$ we obtain the direction ratios $\pm\tfrac{2}{7}$, $\pm\tfrac{3}{7}$, $\pm\tfrac{6}{7}$
corresponding to the unit vectors $\pm\hat{\mathbf{s}}$. These are also direction
cosines for $\pm\hat{\mathbf{s}}$, from which the angles made with the directions of
the axes can be obtained.

Example 10.9. Find the angles made with Ox, Oy, and Oz by a line with
direction ratios 2, 3, −6.
Put s = 2i + 3j − 6k: this is parallel to the line. Since |s| = 7, the corresponding
unit vector $\hat{\mathbf{s}}$ is given by

$\hat{\mathbf{s}} = \tfrac{1}{7}\mathbf{s} = \tfrac{2}{7}\mathbf{i} + \tfrac{3}{7}\mathbf{j} - \tfrac{6}{7}\mathbf{k} = \mathbf{i}\cos\alpha + \mathbf{j}\cos\beta + \mathbf{k}\cos\gamma,$

where $\cos\alpha$, $\cos\beta$, $\cos\gamma$ are its direction cosines. Therefore, the inclination of
the line is specified by the angles

$\alpha = \arccos\tfrac{2}{7} = 73.4°, \quad \beta = \arccos\tfrac{3}{7} = 64.6°, \quad \gamma = \arccos(-\tfrac{6}{7}) = 149°.$



Example 10.10. (Two dimensions) Find a set of direction ratios for the straight
line y = 2x + 1.
We are looking for any vector which is parallel to the line. The points A: (0, 1)
and B: (1, 3) lie on the line, so the vector $\mathbf{s} = \overrightarrow{AB}$ given by

$\mathbf{s} = \overrightarrow{OB} - \overrightarrow{OA} = (\mathbf{i} + 3\mathbf{j}) - \mathbf{j} = \mathbf{i} + 2\mathbf{j}$

is parallel to the line. Therefore one set of direction ratios is given by the numbers
1, 2.

Example 10.11. (Two dimensions) Find parametric and cartesian equations
for the straight line through the point A: (a, b), which has direction ratios p, q.
In Fig. 10.5, A is the point with position vector $\mathbf{a} = a\mathbf{i} + b\mathbf{j}$. P
is a general point on the line with position vector $\mathbf{r} = x\mathbf{i} + y\mathbf{j}$, and $\mathbf{s} = p\mathbf{i} + q\mathbf{j} = \overrightarrow{AS}$.

$\mathbf{r} = \overrightarrow{OA} + \overrightarrow{AP},$

and $\overrightarrow{AP}$ is some multiple of s, say:

$\overrightarrow{AP} = \lambda\mathbf{s}.$

Therefore

$\mathbf{r} = \mathbf{a} + \lambda\mathbf{s}, \qquad \text{(i)}$

where $\lambda$ is a parameter. This is a parametric vector equation for the line.

By equating corresponding components we have

$x = a + \lambda p, \quad y = b + \lambda q, \qquad \text{(ii)}$

and these are parametric cartesian equations.

Now eliminate the parameter between equations (ii):

$(x - a)/p = (y - b)/q.$

This is a cartesian equation, which could be reduced to the standard form

$y = mx + c.$
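
The link between the parametric form (i) and the cartesian form can be illustrated in Python using the line of Example 10.10. This is a sketch with assumed helper names; the cartesian equation is tested in the cross-multiplied form q(x − a) = p(y − b), an equivalent rearrangement chosen here so that the check still works when p or q is zero.

```python
def point_on_line(a, s, lam):
    """Position vector r = a + lam * s of a point on the line."""
    return tuple(ai + lam * si for ai, si in zip(a, s))

def satisfies_cartesian(point, a, s):
    """Check q (x - a) = p (y - b), the cross-multiplied cartesian equation."""
    (x, y), (a0, b0), (p, q) = point, a, s
    return q * (x - a0) == p * (y - b0)

A = (0, 1)        # a point on the line y = 2x + 1 (Example 10.10)
s = (1, 2)        # direction ratios found in Example 10.10
points = [point_on_line(A, s, lam) for lam in (-2, 0, 1, 3)]

print(points)
print(all(satisfies_cartesian(P, A, s) for P in points))
print(all(y == 2 * x + 1 for x, y in points))
```

Each value of the parameter gives one point of the line, and every such point satisfies both forms of the cartesian equation.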

Direction ratios of a straight line

Definition: if $p\mathbf{i} + q\mathbf{j} + r\mathbf{k}$ is parallel to the line, then
p, q, r (or any multiple $\lambda p$, $\lambda q$, $\lambda r$) is a set of direction
ratios for the line.

(a) The angles $\alpha$, $\beta$, $\gamma$ made with Ox, Oy, Oz are
obtained from the equations

$\cos\alpha = p/k, \quad \cos\beta = q/k, \quad \cos\gamma = r/k, \qquad (10.14)$

where $k = \sqrt{p^2 + q^2 + r^2}$.

(For two dimensions, suppress the third component.)

10.8 Properties of a plane


Figure 10.6 shows a plane which passes through a given point
A: (a1, a2, a3), and is perpendicular to a line having direction ratios
p, q, r. We shall obtain equations for the plane.
The position vector of A is a given by

a = OA = a1i + a2j + a3k. (10.15)

From (10.14) the vector n given by

n = pi + qj + rk (10.16)

is parallel to the line, so n is also perpendicular to the plane (n is
called a normal to the plane). P: (x, y, z) represents an arbitrary
point on the plane, with position vector r given by

r = OP = xi + yj + zk. (10.17)

By the triangle rule, AP = r − a, and n must be perpendicular to
AP; therefore

n·(r − a) = 0,

or n·r = n·a. (10.18)

This is a vector equation for the plane.

By substituting for n, r, and a from (10.15), (10.16), and (10.17) we obtain
the cartesian equation

px + qy + rz = pa1 + qa2 + ra3.

Now suppose we start with an equation in the form

ax + by + cz = d, (10.19)

such as 2x + ly — 5z = 3. We shall show how it can be written in


the form (10.18). Put

r = xi + yj + zk, and p = ai + bj + ck.

Then (10.19) can be written in the form

p·r = d. (10.20)

Now let A: (a1, a2, a3) be any point we like that satisfies the equation
(10.19), and put a = a1i + a2j + a3k. From (10.20), this means that

d = p·a.

Therefore (10.19) can be written

p·r = p·a. (10.21)

This is like (10.18). Therefore (10.19) represents a plane, the plane
passes through the point with position vector a, and p is perpendicular
to the plane:

Vector equation of a plane


(a) A vector equation for a plane through the point
a perpendicular to a vector n is n·r = n·a.
(b) ax + by + cz = d always represents a plane. (10.22)
(c) p = ai + bj + ck is perpendicular to the plane
ax + by + cz = d.

Example 10.12. Show that the plane 3x − 2z = 1 is parallel to the y axis.

By (10.22c) the vector p = 3i − 2k is perpendicular to the plane. Also

j·p = j·(3i − 2k) = 3j·i − 2j·k = 0,

so p is perpendicular to j. Therefore the plane is parallel to j.

Example 10.13. Find the angle of intersection between the two planes 2x + 3y +
4z = 5 and 2x − 6y − 3z = 0.
From (10.22c), the vector p = 2i + 3j + 4k is a normal to the first plane (i.e., it
is perpendicular to it) and q = 2i − 6j − 3k is a normal to the second plane.
From Fig. 10.7, one of the angles between the two planes is equal to the
standard angle θ (with 0° ≤ θ ≤ 180°) between the two normals, p and q. By
(10.4a)

p·q = |p||q| cos θ,

or −26 = √29 × 7 cos θ.

Therefore cos θ = −26/(7√29), so θ = 133.6°.
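
As a numerical check on Example 10.13, the angle between two planes can be computed from their normals. This is a minimal sketch; the helper names dot and angle_between_normals are assumptions made for illustration.

```python
import math

def dot(u, v):
    """Scalar product of two vectors given as component tuples."""
    return sum(x * y for x, y in zip(u, v))

def angle_between_normals(p, q):
    """Angle in degrees, in the range 0 to 180, between the normals p and q."""
    cos_theta = dot(p, q) / math.sqrt(dot(p, p) * dot(q, q))
    # Guard against rounding pushing the cosine just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

# Normals of 2x + 3y + 4z = 5 and 2x - 6y - 3z = 0.
print(round(angle_between_normals((2, 3, 4), (2, -6, -3)), 1))
```

The same function returns 90° for a pair of perpendicular planes, since the scalar product of their normals is then zero.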

Example 10.14. Show that the planes 2x + 2y − z = 10 and 3x − 2y + 2z = 0
are perpendicular.

The planes are perpendicular if their normal vectors are perpendicular. Taking
the equations in order, by (10.22c) the vectors

p = 2i + 2j − k and q = 3i − 2j + 2k

are normal to the planes. Then

p·q = 6 − 4 − 2 = 0,

so the planes are perpendicular.

Fig. 10.7 The angle between the planes, θ, is equal to the angle
between the normals p and q.

To find the distance D of the plane ax + by + cz = d from the
origin, consider Fig. 10.8. Drop a perpendicular ON from the origin
O to the plane at N. The equation of the plane may be written

p·r = d

where p = ai + bj + ck. Since N is a point on the plane,

p·ON = d,

or, after dividing by |p|,

p̂·ON = d/|p|,

where p̂ = p/|p| is the corresponding unit vector. Also ON = ±Dp̂
depending on its sense, so

±Dp̂·p̂ = d/|p|.

But p̂·p̂ = 1, and by taking the modulus we find D:

D = |d|/|p| = |d|/√(a² + b² + c²). (10.23)

Fig. 10.8 The distance from the origin O to a plane.

Now let Q, position vector q, be any point, distance DQ from the
plane. Move the origin to Q, and let R denote the new general
position vector measured from Q. Since R = r − q, the new equation
of the plane is p·(R + q) = d, or p·R = d − p·q. Therefore d − p·q
is to be put in place of d in (10.23).

Distance of a point from a plane ax + by + cz = d

Put ai + bj + ck = p. Then
(a) Distance D of O from the plane:

D = |d|/|p| = |d|/√(a² + b² + c²). (10.24)

(b) Distance DQ of a point Q, position vector q:

DQ = |p·q − d|/|p|.
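
The two results in (10.24) translate directly into a short routine. The sketch below is an illustration under the assumption that a, b, c are not all zero; the name distance_to_plane is hypothetical.

```python
import math

def distance_to_plane(a, b, c, d, q=(0.0, 0.0, 0.0)):
    """Distance of the point q from the plane ax + by + cz = d.
    With the default q this is the distance of the origin, |d|/|p|."""
    p_dot_q = a * q[0] + b * q[1] + c * q[2]
    return abs(p_dot_q - d) / math.sqrt(a * a + b * b + c * c)

# Distance of the origin from 2x + y - z = 2.
print(distance_to_plane(2, 1, -1, 2))
# Distance of (1, 1, 2) from x + 2y - 4z + 3 = 0, written as x + 2y - 4z = -3.
print(distance_to_plane(1, 2, -4, -3, (1, 1, 2)))
```

Note that an equation given as ax + by + cz + e = 0 must first be rearranged so that d = −e.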

10.9 General equation of a straight line

Figure 10.9 shows the straight line through a point A: (a1, a2, a3)
with position vector a. Its inclination is specified by direction ratios
p, q, r. The vector

s = pi + qj + rk

is parallel to the line by (10.14), and is shown with its initial point
at A, so that it lies along the line. P: (x, y, z) is any point on the
line, and has position vector r.
By the triangle rule

r = OA + AP = a + AP.

AP is always some multiple λ of s:

AP = λs,

where λ is a parameter which may take any value, so finally

r = a + λs

is a vector parametric equation for the line.

In components this becomes

x = a1 + λp, y = a2 + λq, z = a3 + λr,

and these are parametric cartesian equations for the line.

Fig. 10.9



Provided that none of p, q, r is zero the parameter λ can be
eliminated by rearranging the equations:

(x − a1)/p = (y − a2)/q = (z − a3)/r. (10.25)

This is the cartesian equation of a straight line, since any line passes
through some point A and has some direction p, q, r. This
expression is not unique, because (a1, a2, a3) and p, q, r are not
unique.
The equation (10.25) really consists of two simultaneous equations:
for example the pair

(x − a1)/p = (y − a2)/q and (y − a2)/q = (z − a3)/r.

These are the equations of two planes, and the line is their line of
intersection.
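
The link between the parametric form r = a + λs and the cartesian form (10.25) can be illustrated as follows. This is a sketch only: the function names and the sample point and direction ratios are assumptions chosen for the illustration.

```python
def point_on_line(a, s, lam):
    """Point r = a + lam*s on the line through a with direction ratios s."""
    return tuple(ai + lam * si for ai, si in zip(a, s))

def on_line_cartesian(point, a, s, tol=1e-9):
    """Test (x - a1)/p = (y - a2)/q = (z - a3)/r, form (10.25).
    Requires that none of the direction ratios is zero."""
    if any(si == 0 for si in s):
        raise ValueError("form (10.25) needs nonzero p, q, r")
    ratios = [(xi - ai) / si for xi, ai, si in zip(point, a, s)]
    # All three ratios equal the parameter value when the point is on the line.
    return max(ratios) - min(ratios) <= tol

a, s = (1, 2, 1), (-1, 3, 1)   # hypothetical point and direction ratios
print(on_line_cartesian(point_on_line(a, s, 2.5), a, s))
print(on_line_cartesian((0, 0, 0), a, s))
```

When a point lies on the line, the common value of the three ratios is the parameter λ itself.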

10.10 Forces acting at a point


The magnitude and direction of a force acting at a point in a body
can be depicted by a directed line segment. This pictorial possibility
does not automatically mean that forces behave like vectors: it must
be established that the rules for combining vectors listed in Section
10.3 parallel the experimental facts of mechanics.
The analogy is a little different according to the physical situation.
For the simplest case, Fig. 10.10 shows several forces, F1, F2, ...,
acting on the same point P. P might be a single particle, or a single
point fixed in a large body. It is ultimately an experimental fact that
the forces have the same physical effect as a single force F, called
the resultant of the forces shown, which acts at the same point and
is obtained by vector addition:

F = F1 + F2 + ···. (10.25)

A zero force has zero effect. Together with (10.25), this gives the
condition for equilibrium of a particle under the influence of several
forces F1, F2, ..., that the resultant force F must be zero, or

F = F1 + F2 + ··· = 0. (10.26)

Fig. 10.10 Forces acting at a point P.

The magnitude of a force (expressed in units such as newtons)
is denoted by |F|, and is proportional to the length of the arrow
that represents it. The component of a force in an arbitrary direction
is illustrated in Fig. 10.11. Suppose the direction is indicated by the
unit vector ŝ. Then

component of F in direction ŝ = F·ŝ = |F| cos θ, (10.27)

where θ is conveniently given a value between 0° and 180°. Notice
that if θ is between 90° and 180°, then the component is negative.
This agrees with the definition of vector components in the i, j, k
directions that we used before. The process of obtaining a component
of F in a certain direction is often spoken of as resolving F in
that direction.

Example 10.15. Find the component of the force F = 3i + j + k in the
direction of the vector s = 2i + 3j + 6k.
The unit direction vector ŝ is given by

ŝ = s/|s| = (2i + 3j + 6k)/7 = (2/7)i + (3/7)j + (6/7)k.

The component of F in this direction is given by

F·ŝ = (3i + j + k)·((2/7)i + (3/7)j + (6/7)k) = 15/7.
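
Resolving a force in a given direction is a one-line calculation once the direction has been normalized. The sketch below reproduces Example 10.15; the function name component is an assumption made here.

```python
import math

def component(F, s):
    """Component of F in the direction of s, i.e. F·s/|s| as in (10.27)."""
    f_dot_s = sum(f * x for f, x in zip(F, s))
    return f_dot_s / math.sqrt(sum(x * x for x in s))

# F = 3i + j + k resolved along s = 2i + 3j + 6k.
print(component((3, 1, 1), (2, 3, 6)))
```

Reversing the sense of s changes the sign of the result, in agreement with the remark that the component is negative when θ lies between 90° and 180°.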

In two dimensions, if the components of a force F in any two
non-parallel directions are zero, then F must be zero (and conversely,
of course). For suppose the angles made with the two directions are
θ and φ, and they do not differ by 0° or 180°. If

|F| cos θ = 0 and |F| cos φ = 0,

then |F| must be zero, so F = 0. (One direction is not sufficient, since
F might be perpendicular to that direction.) This principle, of
‘resolving in two directions’, is used frequently to solve problems.
The following is a simple example.

Example 10.16. Figure 10.12 represents a particle P at rest on a rough
inclined plane of inclination 30°. The forces acting on P are the force of gravity
downwards of magnitude mg where g is the acceleration due to gravity, the normal
(perpendicular) reaction of the plane R, and the frictional force F. Find R and F.
The arrows indicate provisional directions for the vectors R and F. The scalar
quantities R and F attached to the arrows stand for the unknown components of
R and F in the assumed directions and these might not be positive numbers. This
convention provides a safety net, for suppose we have, say, guessed the direction
of F wrongly, and that it actually acts down the plane rather than up it. The
mistake will do no harm, because F will simply turn out to be a negative number
in our answer. This is a conventional way of lettering diagrams in mechanics.
It is easiest to resolve in the assumed directions of F and R:

In direction of F: 0 = F + mg cos (180° − 60°)

which is the same as

0 = F − mg cos 60° = F − (1/2)mg. (i)

In direction of R: 0 = R + mg cos (180° − 30°)

which is the same as

0 = R − mg cos 30° = R − (√3/2)mg. (ii)

Therefore

F = (1/2)mg and R = (√3/2)mg.

(You would usually go straight for the commonsense way of writing the
components given by (i) and (ii), avoiding the cosines of large angles.)
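
The resolution in Example 10.16 can also be carried out with explicit component vectors. The sketch below assumes, purely for illustration, horizontal and vertical axes with the plane rising to the right; the function name and the numerical values of m and g are hypothetical.

```python
import math

def resolve_on_incline(m, g, incline_deg):
    """Return (F, R): friction up the plane and normal reaction for a
    particle of mass m at rest on a plane inclined at incline_deg."""
    th = math.radians(incline_deg)
    up_plane = (math.cos(th), math.sin(th))    # assumed direction of F
    normal = (-math.sin(th), math.cos(th))     # assumed direction of R
    weight = (0.0, -m * g)                     # gravity acts vertically downwards
    # Equilibrium: the total component in each direction is zero.
    F = -(weight[0] * up_plane[0] + weight[1] * up_plane[1])
    R = -(weight[0] * normal[0] + weight[1] * normal[1])
    return F, R

F, R = resolve_on_incline(2.0, 9.81, 30.0)
print(F, R)
```

For an inclination of 30° the two values should agree with (1/2)mg and (√3/2)mg respectively.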

10.11 Curvature in two dimensions


In Fig. 10.13a, S is a fixed point on an arc and P is any other point
on it, with position vector r = xi + yj + zk. A positive direction
along the arc is indicated. P is then determined by specifying a
number s, where

|s| = arclength SP,

and s is positive or negative according to whether P is on the positive
or negative side of S. The parameter s is a kind of coordinate for P,
measured along the arc. Indicate the dependence of r on s by writing
r(s) (compare r(t) in Section 9.8). Given a particular function r(s),
the curve can in principle be reconstructed, although it is usually a
complicated matter.
Figure 10.13b shows the vector PQ = δr, where P has parameter
value s and Q has parameter value s + δs. According to (9.18), the
vector dr/ds is tangential to the arc at P, and points in the direction
of increasing s. Also, in this case, when δs is small,

|δr| ≈ |δs|,

approximately, and so

|dr(s)/ds| = lim (δs → 0) |δr/δs| = 1. (10.28)

Therefore, in the case when the parameter used is s, dr/ds is a unit
tangent vector, which we can write as t̂, pointing in the direction
of increasing s.

Fig. 10.13 The approach to a tangent at P.

Since t̂ is a unit vector,

t̂·t̂ = 1.

Therefore, by using the product rule to differentiate,

t̂·dt̂/ds + (dt̂/ds)·t̂ = 0,

or 2(t̂·dt̂/ds) = 0.

Therefore, dt̂/ds is perpendicular to t̂.


Draw a unit normal PN = n̂ to the curve at P as in the diagrams
of Fig. 10.14. As we walk along the curve in the direction of t̂, the
direction of n̂ is towards the right. Since t̂ and dt̂/ds are perpendicular,
dt̂/ds must be a certain multiple, κ say, of n̂:

dt̂/ds = κn̂. (10.29)

Fig. 10.14 (a) κ > 0, the curve is concave viewed from the side of n̂.
(b) κ < 0, the curve is convex viewed from the side of n̂. (c) κ = 0, a
point of inflection.

The three cases in Fig. 10.14 relate to the sign of κ. In Fig. 10.14a,
the curve is concave as viewed from the side of n̂, implying that
if we make a small increase in s, then dt̂ points in the direction of
n̂; therefore κ is positive. In Fig. 10.14b, the curve is convex as
viewed from the side of n̂; and in the same way it follows that
κ is negative. In the case of a point of inflection (Fig. 10.14c), κ is
zero.
The number κ is called the curvature of the curve at P. The
greater is |κ|, the more sharply the curve is turning. The positive
quantity ρ given by

ρ = 1/|κ|,

is its radius of curvature at P. This is the radius of the circle that
best fits the curve at P. We will not prove this, but illustrate it in
the following example.

Example 10.17. Obtain expressions for t̂ and dt̂/ds for the case of a circle of
radius a with centre at the origin, and confirm that ρ = a (Fig. 10.15).
Measure s from the point S; then

s = aθ.

The unit tangent vector t̂ has components (−sin θ, cos θ), so

t̂ = −i sin θ + j cos θ.

To differentiate with respect to s use the chain rule with dθ/ds = 1/a (because
ds/dθ = a):

dt̂/ds = (1/a)(−i cos θ − j sin θ)

(observe that t̂·dt̂/ds = 0). The unit normal n̂ in the right-hand direction has
components (cos θ, sin θ), so

n̂ = i cos θ + j sin θ.

Evidently

dt̂/ds = −(1/a)n̂,

so κ = −1/a (consistent with a curve that is convex viewed from the side of the
normal n̂). Also

ρ = 1/|κ| = a,

the radius of the circle.
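
The result κ = −1/a can be confirmed numerically by differentiating the unit tangent with a central difference and taking the component along the right-hand normal, as in (10.29). This is a sketch under stated assumptions: the function names and the step length h are choices made here, not part of the text.

```python
import math

def unit_tangent(a, s):
    """Unit tangent of the circle of radius a at arc length s (theta = s/a)."""
    theta = s / a
    return (-math.sin(theta), math.cos(theta))

def curvature_on_circle(a, s, h=1e-5):
    """Estimate kappa from dt/ds = kappa * n using a central difference."""
    t_plus, t_minus = unit_tangent(a, s + h), unit_tangent(a, s - h)
    dt_ds = ((t_plus[0] - t_minus[0]) / (2 * h),
             (t_plus[1] - t_minus[1]) / (2 * h))
    t = unit_tangent(a, s)
    n = (t[1], -t[0])   # normal to the right of t: here (cos theta, sin theta)
    return dt_ds[0] * n[0] + dt_ds[1] * n[1]

kappa = curvature_on_circle(2.0, 0.7)
print(kappa, 1 / abs(kappa))
```

For a circle of radius 2 the estimate should be close to −0.5, giving a radius of curvature close to 2.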



Problems

10.1. Obtain the scalar products of the pairs of vectors given in component form by: (a) (2, 2, 1) and (3, 1, 2), (b) (2, −3, 2) and (−2, 3, −1), (c) (2, 2, −3) and (−1, 1, −2), (d) (2, 3, 4) and (1, −2, 1), (e) (p − q, p + q, p) and (p + q, q, −p − q).

10.2. (Two dimensions) Obtain the scalar products of the pairs of vectors given in component form by (a) (2, 3) and (3, 4), (b) (1, 0) and (0, 1), (c) (5, 6) and (0, −4), (d) (2, 3) and (3, −2).

10.3. Prove that |a + b|² + |a − b|² = 2(|a|² + |b|²). (Hint: see equation (10.1c).) Sketch the vectors a, b, a + b, a − b on one diagram in order to obtain a geometrical theorem from this result. (There are two possible theorems, depending on what diagram you draw.)

10.4. Let a = (2, −3, 4) and b = (−1, −2, 3), or in the alternative form a = 2i − 3j + 4k and b = −i − 2j + 3k. Evaluate a·b, (a) using the first form, (b) using the second form with (10.8).

10.5. Given that a = i + 2j − k and b = i + 3j + k, evaluate the following scalar products: (a) a·b, (b) (a − b)·(a + b), (c) (a − b)·(a − b), (d) a·a + 2a·b + b·b, (e) (a·a)a − (b·b)b.

10.6. Find the angles, in the range 0° to 180°, between the pairs of vectors (a) i + j + k and i + j, (b) i − j + k and i + j, (c) 2i − j + 3k and i + 3j + 2k.

10.7. (Two dimensions) Find the angle θ (0° ≤ θ ≤ 180°) between the pairs of vectors: (a) 3i + 4j and 4i − 3j, (b) i − 2j and 2i − j, (c) i − 2j and −6i + 3j.

10.8. Find the angle between one of the edges of a cube and a diagonal line through one end.

10.9. A circular cone has its vertex at the origin and its axis in the direction of the unit vector â. The half-angle at the vertex is α. Show that the position vector r of a general point on its surface satisfies the equation

â·r = |r| cos α.

Obtain the cartesian equation when â = (2/7, −3/7, −6/7) and α = 60°.

10.10. A: (2, 2, −1), B: (0, 1, 1), C: (−1, 2, 0) are three points. Find the angles in the triangle ABC.

10.11. Confirm the fact that a·b = ¼(|a + b|² − |a − b|²). (Hint: it is easier to start with the right-hand side.) Test the result using any two vectors. Deduce a simple geometrical theorem by sketching a, b, a + b, a − b all on the same diagram: there are two theorems to be had, depending on whether you think of the triangle or the parallelogram rule.

10.12. Show that the component of a vector F in the direction of another vector a is given by F·a/|a|. Find the components of F = (8, 15, 9) in the directions of the three vectors a, b, c, where a = (2, 3, 6), b = (0, 3, 4), and c = (2, 2, 1). Express F in the form F = λa + μb + νc, where λ, μ, ν are constants.

10.13. Show that the vectors a = i + 3j + 4k and b = −2i + 6j − 4k are perpendicular. Obtain any vector c = c1i + c2j + c3k which is perpendicular to a and b, and derive from it two unit vectors (their senses will be opposite).

10.14. Let a = i + j − k and b = 2i − j + 2k. Find the angle (in the range 0° to 180°) between a and b, and construct any vector perpendicular to a and b.

10.15. Find the value of λ such that the vectors (λ, 2, −1) and (1, 1, −3λ) are perpendicular.

10.16. Determine numbers α, β, γ which ensure that the vectors a = (α, 2, −3), b = (−1, 2β, 2) and c = (2, 1, −3γ) are mutually perpendicular.

10.17. The points A: (1, 0, 0), B: (0, 1, 0), C: (0, 1, 1), and D: (0, y, z) are the vertices of a tetrahedron. Find y and z such that ABD is an equilateral triangle and BCD is a right angle.

10.18. (Change of axes in two dimensions.) Oxy and OXY are two sets of right-handed axes with the same origin O. OX is reached from Ox by an anticlockwise rotation 45°. (a) Obtain the X, Y coordinates of a point P whose coordinates in Oxy are (2, 2). (b) Find the values of x and y for the point Q for which X = 1, Y = −1. (c) Find the equation of the circle (x − 1)² + y² = 1 in the axes OXY.

10.19. Find the lengths, and the direction cosines l, m, n, of the following vectors. (a) j, (b) i + j + k, (c) i − 2j − 2k, (d) i − j + k, (e) i − j − k, (f) 2i + 3j + 6k, (g) i − 2j − 2k, (h) 3k, (i) −3k.

10.20. (Change of axes.) (a) Show that the vectors with components in Oxyz given by X = (6, 15, 10)/19, Y = (15, −10, 6)/19, Z = (10, 6, −15)/19 are mutually perpendicular unit vectors. (b) A sketch will show that [X, Y, Z] is a right-handed system, so it defines a new set of right-handed axes OXYZ. Write down the change-of-axes matrices in (10.13a) and (10.13b). (c) Find the coordinates of the point x = 1, y = 2, z = 2 in the new axes. (d) Express the equation of the plane x + y + z = 0 in the new coordinates.

10.21. The following are sets of direction ratios p, q, r for a straight line. Obtain two possible sets of direction cosines in each case. (a) 3, 4, 12; (b) 6, −10, 15.

10.22. A swarm of particles expands through all space. The velocity v(r) of the particle with position vector r(t) at time t in a given set of axes is equal to f(t)r. Show that the rule is the same when the velocity is measured relative to any given particle.

10.23. The angles made by a vector a and the positive directions of the axes Ox, Oz are 45° and 30° respectively. Find the angles that a may make with Oy.

10.24. The following are sets of direction ratios p, q, r for a straight line. Obtain two sets of direction cosines, describing unit vectors parallel to the line, for each case. (a) 3, 4, 12; (b) 6, −10, 15.

10.25. (a) Find any constant vector parallel to the line given parametrically by x = 1 − λ, y = 2 + 3λ, z = 1 + λ. (Hint: see eqn (10.25).) (b) Find the equation of the plane which is perpendicular to the line in (a) and which passes through the origin. (Hint: see eqn (10.22b).) (c) Find the equation of the plane such that the line in (a) lies in the plane, and the plane passes through the origin. (Hint: the new plane must be perpendicular to the plane in (b).)

10.26. Find the angle θ, in the range 0° ≤ θ ≤ 90°, between the pairs of planes given as follows: (a) 2x − 3y + z = 2 and x − y = 0, (b) x + y + z = 0 and z = 0. (Hint: consider the normals.)

10.27. The vector equations of two planes are a·r = u and b·r = v, where a and b are constant vectors and u and v are constants. What is the vector relation between a and b for the planes to be perpendicular? (Hint: see 10.22(b).) Obtain any plane perpendicular to x + y + z = 0.

10.28. (a) Show that the planes ax + by + cz = d, where a, b, c are fixed and d may take any value, are all parallel. (b) Show that the straight line through O and perpendicular to the plane 2x + y − z = 2 has the parametric equation r = λ(2, 1, −1), where λ is a parameter. (c) For (b), find the point at which the line intersects the plane, and deduce its length (this is the distance of the plane from the origin). (d) Find the distance between the plane in (b) and the plane 2x + y − z = 1.

10.29. Let p·r = d, where p = (a, b, c) and r = (x, y, z), be a plane, and Q be a point with position vector q. Show that the distance of Q from the plane is equal to |p·q − d|/|p|. Deduce the distance of the point (1, 1, 2) from the plane x + 2y − 4z + 3 = 0.

10.30. A: (0, −1, 3), B: (1, 0, 3), and C: (0, 0, 5) are three points. Let P1 be the plane through A, perpendicular to −j + k, and P2 be the plane through A, B, and C. (a) Find equations for P1 and P2. (b) Obtain the angle between P1 and P2. (c) Determine the perpendicular distance from the origin O to P1. (d) Show that the line of intersection, L, of P1 and P2 meets the line OD, where D is the point (1, 4, −4). (e) Determine the point of intersection of L with OD.

10.31. (Two dimensions) A straight line has a unit normal n̂, and s points along the line. Let Fs and Fn be the component vectors of a vector F in the directions of s and n̂ respectively, so that F = Fs + Fn. Show that Fs = F − (F·n̂)n̂. (Hint: see (10.8c).) Find Fs and Fn when F = i − 3j and the straight line is given by 2x − 3y = 1.

10.32. (Two dimensions) A mirror M1 stands upright on a table (sketch it as a straight line M1 through the origin O in the (x, y)-plane). ŝ is a unit vector along M1 pointing away from O, and n̂ is the unit normal vector to M1 pointing to the left of ŝ. (a) A ray of light in the plane, with direction vector u, falls on the mirror and is reflected in the direction u1. By considering its vector components in the directions of ŝ and n̂ show that

u1 = −u + 2(u·ŝ)ŝ.

(b) Find u1 when u = −(i + j)/√2, and the mirror lies along the line y = 0. (c) Suppose there are two mirrors, M1 and M2, forming a wedge of angle 60° in the sector x > 0, y > 0, M1 being along y = 0 and M2 along y/x = √3. A ray enters the wedge in the direction u = −i cos θ − j sin θ. Use the result in (a) to find the direction of the twice-reflected ray.

10.33. (a) Find any two points on the line of intersection of the planes x + y + z = 2 and 2x + y − 2z = 1 (e.g., one point is obtained by starting with x = 0). (b) Obtain a parametric vector equation for the line of intersection. (c) Deduce cartesian equations of the form (10.25) for the straight line. (Notice that the equations in (a), taken together, already define the line in cartesian form, but the form (10.25) is more informative since it contains the direction ratios.)

10.34. Obtain, in parametric form, the line of intersection of the planes 2x + 3y − z = 1 and x + y + z = 0. Deduce the standard form (10.25).

10.35. Find direction ratios for the line of intersection of the planes 2x + 3y − 2z = 1 and x − 3y + 2z = 2. (Notice that the line cannot be represented in the form (10.25).)

10.36. Three points are given by A: (−1, −2, 1), B: (−1, −2, 0), and C: (−1, 0, 3). Let P1 be the plane through B with normal vector n1 = j + k and P2 the plane through C with normal vector n2 = 2i − j + 3k. Show that the line AC is perpendicular to P1.

10.37. Given two planes, a1x + b1y + c1z = d1 and a2x + b2y + c2z = d2, show that any solution p, q, r of the simultaneous equations

a1p + b1q + c1r = 0, a2p + b2q + c2r = 0

is a set of direction ratios for the line of intersection.

10.38. (Perspective drawing) An observer’s eye E is at the point i + j + k, and views objects through a plane screen which has the equation r·(1.1i + 1.1j + k) = 1. Q is a general point on an object behind the screen, and its position vector is r = xi + yj + zk. Find the coordinates of the apparent position of Q on the screen. (Hint: find the equation of the line EQ; then find where it cuts the screen.)

10.39. An ellipse is given parametrically by r = ia cos t + jb sin t, where a and b are constants and t is the parameter, with −π < t ≤ π (in radians). Show that δs² ≈ δx² + δy², where s represents arc length. Deduce that

ds/dt = (a² sin² t + b² cos² t)^(1/2).

Find the unit tangent vector, a unit normal, the curvature and the radius of curvature at the points where t = 0, ¼π, and ½π.

10.40. A plane curve has the equation y = f(x), and the position vector r of a point on the curve can be represented by

r = xi + f(x)j

using x as the parameter. Show that the unit tangent vector to the curve is

t̂ = dr/ds = (dr/dx)(dx/ds) = (i + f′j)/√(1 + f′²).

Show that the curvature κ of the curve at any point is given by

κ = −f″/(1 + f′²)^(3/2).

Find the curvature along
(a) the parabola y = x²;
(b) the cosine curve y = cos x.

11 Vector product and derivatives of vectors

Contents
11.1 Vector product
11.2 Nature of the vector p = a × b
11.3 The scalar triple product
11.4 Moment of a force
11.5 Vector triple product
Problems

11.1 Vector product


A second form of product finds applications in problems about
moments, angular velocity, and in other circumstances that involve
rotation.
The vector product, or cross product, is denoted by a bold
multiplication sign as in a × b, or a caret sign as in a ∧ b. Its
definition is:

Vector product a × b
(a) a × b = (a2b3 − a3b2)i + (a3b1 − a1b3)j + (a1b2 − a2b1)k,

which can be written as a determinant:

            | i   j   k  |
(b) a × b = | a1  a2  a3 |                                   (11.1)
            | b1  b2  b3 |

          = i | a2  a3 | + j | a3  a1 | + k | a1  a2 |
              | b2  b3 |     | b3  b1 |     | b1  b2 |

Example 11.1. Find the vector products a × b and b × a, where a = 2i − j +
3k and b = −i + 2j + 4k.
From (11.1a),

a × b = [{(−1) × 4} − (3 × 2)]i + [{3 × (−1)} − (2 × 4)]j
      + [(2 × 2) − {(−1) × (−1)}]k

      = −10i − 11j + 3k.

In evaluating b × a, we exchange the a and b components in the expression
(11.1a), so the sign of each of the three bracketed terms changes. Therefore

b × a = −a × b = 10i + 11j − 3k.

(Correspondingly we interchange the last two rows in the determinant form
(11.1b), which changes its sign by Section 8.2, Rule 3.)
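
The component definition (11.1a) is easily turned into a small function, which can be used to check Example 11.1. The sketch below is illustrative; the name cross is an assumption, and vectors are represented simply as tuples of three components.

```python
def cross(a, b):
    """Vector product a x b from the component definition (11.1a)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

a = (2, -1, 3)    # a = 2i - j + 3k
b = (-1, 2, 4)    # b = -i + 2j + 4k
print(cross(a, b))   # components of a x b
print(cross(b, a))   # components of b x a, the negative of a x b
```

Interchanging the arguments changes the sign of every component, which is the non-commutative behaviour noted in the example.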

Algebraic manipulations are governed by the following rules:

Algebraic properties of a × b
(a) a × b = −b × a (the vector product does not
commute).
(b) a × (b + c) = a × b + a × c (distributive law). (11.2)
(c) a × (λb) = λa × b where λ is any number.
(d) a × b = 0 if b and a are parallel: in particular,
a × a = 0.

These are proved as follows:

(a) If a and b are interchanged, then the three brackets in (11.1a)
change sign.
(b) Put b1 + c1, b2 + c2, b3 + c3 in place of b1, b2, b3, in (11.1a),
and separate the groups of terms involving b and c. This is also a
property of determinants.
(c) This follows immediately from (11.1a): λ is a factor throughout.
(d) a and b are parallel so b = λa, where λ is some number.
Therefore a × b = λa × a (from (11.2c)). If we now put b1 = a1 etc.
into (11.1a), we obtain a × a = 0; so a × b = 0.
The unit vectors i, j, k are simply related by the cross product:

Vector products of i, j, k
(a) i × j = k, j × k = i, k × i = j.
(b) j × i = −k, k × j = −i, i × k = −j. (11.3)

Notice that for the group in (11.3a), the cyclic order i, j, k, i, j, ...
is maintained, and for the group in (11.3b) there is a different cyclic
order j, i, k, j, i, .... To prove, for example, that i × j = k, put
i = (1, 0, 0) and j = (0, 1, 0) into the definition (11.1b) (or into
(11.1a) if you are not sure about determinants). Then we obtain

        | i  j  k |
i × j = | 1  0  0 | = 0i + 0j + 1k = k.
        | 0  1  0 |

The group (11.3b) follows by the change-of-order rule, (11.2a).



11.2 Nature of the vector p = a × b

Firstly we show that p = a × b is perpendicular to both a and b. This
implies that if we move a and b to emerge from a common point Q
(see Fig. 11.1a) then p, or a × b, is perpendicular to the plane
containing a and b.
Using the definition of a × b,

a·p = a·(a × b)

    = (a1i + a2j + a3k)·{(a2b3 − a3b2)i + (a3b1 − a1b3)j
      + (a1b2 − a2b1)k}

    = a1(a2b3 − a3b2) + a2(a3b1 − a1b3) + a3(a1b2 − a2b1)

    = 0.

Therefore, by (10.5), p is perpendicular to a. Similarly, p is perpendicular
to b.
However, so far as we can tell from this argument, p might point
in either of two directions, as suggested by the diagrams in Fig. 11.1a
and b. We want to distinguish between them, and the distinction is
similar to the distinction between right- and left-handed axes
(compare Fig. 9.8). One way to recognize a right-handed system
follows:

Fig. 11.1 It will be shown later that the direction of p is that
given in (a).
Test for a right-handed system
(See Fig. 11.2.) Place a = QA, b = QB, c = QC, at a
common point Q. View Q from any point V on the
opposite side of the triangle ABC from Q. Then
(a) [a, b, c], in that order, is a right-handed system if
the direction of the circuit A to B to C is seen from
V as anticlockwise. Otherwise, [a, b, c] is left-handed. (11.4)
(b) If [a, b, c] is right-handed, then (maintaining the
cyclical order), [b, c, a] and [c, a, b] are right-handed.
The others are left-handed.
(c) A set of axes is right-handed if [i, j, k] is right-handed.

It is essential to place V on the opposite side of the triangle ABC
from Q, otherwise the apparent direction of the circuit is reversed.
In Fig. 11.1a, the system [a, b, p] is right-handed, and in Fig. 11.1b,
[a, b, p] is left-handed.

Fig. 11.2 Test for a right-handed system of vectors [a, b, c].
Viewed through the triangle, the vertices A, B, C follow in
anticlockwise order.

Returning to the cross-product p = a × b, where the vectors all
emerge from Q, set up a special set of right-handed axes Qx, Qy, Qz,
as in Fig. 11.3. The axes satisfy the following conditions:

(i) Qx is in the direction of a.


(ii) Qy is in the plane of a and b, perpendicular to Qx. It is
directed so that the y component of b is positive.
(iii) The direction of Qz makes the axes right-handed.

The unit vectors are i, j, k. From the conditions (i) and (ii), with
the usual notation,

            | i   j   k |
p = a × b = | a1  0   0 | = a1b2k. (11.5)
            | b1  b2  0 |

Since, according to (i) and (ii), a1 and b2 are positive, p is in the
direction of k, and the test (11.4a) shows that

[a, b, p] is a right-handed system. (11.6)

Therefore, Fig. 11.1a is the correct one, and Fig. 11.1b gives the
direction of p incorrectly.
Moreover (see Fig. 11.3),

b2 = |b| sin θ,

(since 0° ≤ θ ≤ 180°, the sign of b2 is positive as required).
Also

a1 = |a|.

Therefore, from (11.5),

p = a × b = k|a||b| sin θ, (11.7)

which specifies p in a simple way.

Properties of p = a × b
(a) p is perpendicular to a and b, in the direction
making [a, b, p] right-handed. (11.8)
(b) |p| = |a||b| sin θ, where θ is the angle in the range
0° to 180° between the directions of a and b.
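
Both properties in (11.8) can be tested numerically for any particular pair of vectors. In the sketch below the helper names and the sample vectors are assumptions chosen for illustration; the angle θ is obtained independently from the scalar product.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def cross_product_checks(a, b):
    """Return (a·p, b·p, |p|, |a||b| sin(theta)) for p = a x b."""
    p = cross(a, b)
    norm_a, norm_b = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    theta = math.acos(dot(a, b) / (norm_a * norm_b))   # angle in 0 to pi
    return dot(a, p), dot(b, p), math.sqrt(dot(p, p)), norm_a * norm_b * math.sin(theta)

print(cross_product_checks((1.0, 2.0, 2.0), (3.0, 0.0, -4.0)))
```

The first two values should vanish (perpendicularity) and the last two should agree to within rounding error.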

The properties of a × b in (11.8) depend only on the magnitude
and direction of a and b. We are bound to find the same results
whatever axes we use to obtain them: the axes we actually used were
chosen only to simplify the algebra. Therefore, we have shown that
the cross product is invariant with respect to changes in axes
(provided that we confine ourselves to right-handed axes; left-handed
axes would produce (−p)).

Invariance of a × b
a × b is invariant with respect to changes from one
right-handed set of axes to another. (11.9)

Other invariants are the length and direction of a vector a, and
therefore the vector a itself: its components are different in different
axes, but the physical vector we are talking about does not change.
The scalar product, a·b = a1b1 + a2b2 + a3b3 is also invariant; that
is to say, it has the same numerical value in any axes: this value is
equal to |a||b| cos θ and so does not change.

Example 11.2. Let a = QA and b = QB be two vectors from Q, representing
two sides of a parallelogram. Show that the area of the parallelogram is equal to
|a × b|.
Complete the parallelogram as shown in Fig. 11.4. Construct a perpendicular
BN on to QA. Then

Area QACB = base QA × height BN

= |a||b| sin θ = |a × b|, (by (11.8)).
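
The area formula of Example 11.2 can be tried on a parallelogram whose base and height are obvious by inspection. This is a sketch only; the function names and the sample vectors are assumptions.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def parallelogram_area(a, b):
    """Area of the parallelogram with adjacent sides a and b, |a x b|."""
    p = cross(a, b)
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)

# Base of length 3 along Ox, height 2: the area should be 6.
print(parallelogram_area((3, 0, 0), (1, 2, 0)))
```

If a and b are parallel the parallelogram collapses to a line segment and the computed area is zero, in agreement with (11.2d).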

Fig. 11.4 The area of a Example 11.3. Two planes have normals n1 and n2 respectively, and pass
parallelogram. through a point A with position vector a. Obtain a vector parametric equation
for their line of intersection.
O
Figure 11.5 shows the two planes and their line of intersection LM, which
contains the point A. P: (x, y, z) with position vector r is a general point on the
line.
Let p be any vector parallel to LM. Then AP is always a multiple of p, so

r − a = λp (i)

where λ is a parameter.
We may choose p to be given by

p = n1 x n2; (ii)

it is perpendicular to n1 and n2 by (11.8a), so it is parallel to LM. Therefore, from (i) and (ii),

r = a + λ n1 x n2 (iii)

is a parametric vector equation for the line.

Fig. 11.5
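The construction in Example 11.3 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the helper names `cross`, `dot`, and `line_point`, and the sample normals and point, are assumptions chosen only for illustration.

```python
def cross(u, v):
    """Vector product u x v for 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def line_point(a, n1, n2, lam):
    """Point r = a + lam * (n1 x n2) on the line of intersection."""
    p = cross(n1, n2)
    return tuple(ai + lam * pi for ai, pi in zip(a, p))


# Hypothetical data: two planes through the point a with normals n1, n2.
n1 = (1, 0, 0)
n2 = (0, 1, 0)
a = (1, 2, 3)

r = line_point(a, n1, n2, 2.5)
# r - a must be perpendicular to both normals, so r lies in both planes.
offset = tuple(ri - ai for ri, ai in zip(r, a))
print(r, dot(offset, n1), dot(offset, n2))
```

Any other value of the parameter gives another point of the same line, since the offset r − a stays parallel to n1 x n2.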

11.3 The scalar triple product


The scalar quantity a-(b x c) is called a scalar triple product. It
has the following properties:
Properties of the triple scalar product

\[
\text{(a)}\quad a\cdot(b\times c) =
\begin{vmatrix} a_1 & a_2 & a_3\\ b_1 & b_2 & b_3\\ c_1 & c_2 & c_3 \end{vmatrix}
\]

(b) a·(b x c) = b·(c x a) = c·(a x b).
(c) a·(c x b) = b·(a x c) = c·(b x a) = −a·(b x c). (11.10)
(d) If any two vectors are equal or parallel, a·(b x c) = 0.
(e) It is invariant for right-handed axes.

The proofs are as follows:

(a) Put a = (a1, a2, a3) and so on. Then

a·(b x c) = (a1, a2, a3)·(b2c3 − b3c2, b3c1 − b1c3, b1c2 − b2c1)

= a1(b2c3 − b3c2) + a2(b3c1 − b1c3) + a3(b1c2 − b2c1)

\[
= \begin{vmatrix} a_1 & a_2 & a_3\\ b_1 & b_2 & b_3\\ c_1 & c_2 & c_3 \end{vmatrix}.
\]

(b) In

a·(b x c), b·(c x a), c·(a x b),

the cyclic order a, b, c, a, b, ... is maintained. The determinants for b·(c x a) and c·(a x b) are each obtained from a·(b x c) by means of two row interchanges, and this leaves the determinant unaltered (see Section 8.2, Rule 3). Therefore they are all equal.
(c) Compare these three products with those in (b), and recall that b x c = −c x b etc.
(d) If b is parallel to c, then b x c = 0 from (11.2d). If a is parallel to b or c, then use the same argument on one of the equivalent permutations in (11.10b).
(e) Its value remains the same in any right-handed axes because the cross and dot products have this property (see (11.9)).
The brackets in the triple scalar product are not strictly necessary
and are often omitted, because the alternative bracketing (a-b) x c
would be meaningless.
A parallelepiped is the three-dimensional analogue of a parallelogram and its volume can be expressed as a scalar triple product. Figure 11.6 shows the parallelepiped which has the vectors QA = a, QB = b, QC = c as three adjacent sides.

Fig. 11.6

Drop a perpendicular AN on to the plane QBEC. Then

Volume = area QBEC x height AN.

But from Example 11.2, since QBEC is a parallelogram,

area QBEC = |b x c|.

Since b x c is perpendicular to the plane of b and c, AN and b x c are parallel, so

height AN = QA cos θ = |a| cos θ,

where θ is the angle between a and b x c. Therefore

volume = |a||b x c| cos θ

= |a·(b x c)|

from (10.2).

Volume of a parallelepiped
If the adjacent sides at a vertex Q are QA = a, QB = b,
QC = c, then (11.11)

volume = |a·(b x c)|.

Vectors are said to be coplanar if, when drawn from the same
point, they lie in the same plane. The condition for this is

Coplanar vectors

Three nonzero vectors a, b, c at the same point are coplanar if, and only if, (11.12)
a·(b x c) = 0.

(If they are not at the same point, then this is the condition that
they should be parallel to a common plane.) The result follows from
(11.11): the volume of the corresponding parallelepiped is zero.

Example 11.5. Show that the points A: (1, 2, 2), B: (3, 4, 5), C: (−1, 0, −1) lie on a plane through the origin.
Suppose the three points A, B, C have position vectors a, b, c. To show a, b, c are coplanar, evaluate a·(b x c):

\[
a\cdot(b\times c) = (1, 2, 2)\cdot\{(3, 4, 5)\times(-1, 0, -1)\}
= \begin{vmatrix} 1 & 2 & 2\\ 3 & 4 & 5\\ -1 & 0 & -1 \end{vmatrix}
\]
\[
= 1\begin{vmatrix} 4 & 5\\ 0 & -1 \end{vmatrix}
- 2\begin{vmatrix} 3 & 5\\ -1 & -1 \end{vmatrix}
+ 2\begin{vmatrix} 3 & 4\\ -1 & 0 \end{vmatrix}
= -4 - 2\times 2 + 2\times 4 = 0.
\]

Therefore, A, B, C, and O are all in the same plane, so the points A, B, C are on a plane through the origin.
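As a numerical check on (11.10) to (11.12), the scalar triple product can be evaluated directly. The sketch below is illustrative only; the function names `cross`, `dot`, and `triple` are assumptions, and the vectors are those of the example above together with a unit cube.

```python
def cross(u, v):
    """Vector product u x v for 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))


a, b, c = (1, 2, 2), (3, 4, 5), (-1, 0, -1)

# Zero triple product: a, b, c are coplanar (condition (11.12)).
coplanar_value = triple(a, b, c)

# Unit cube: the volume |a . (b x c)| of (11.11) is 1.
cube_volume = abs(triple((1, 0, 0), (0, 1, 0), (0, 0, 1)))

# Cyclic property (11.10b) and sign change (11.10c) for arbitrary vectors.
u, v, w = (1, -2, 2), (3, -1, -1), (-1, 0, -1)
cyclic_ok = triple(u, v, w) == triple(v, w, u) == triple(w, u, v)
sign_ok = triple(u, w, v) == -triple(u, v, w)
print(coplanar_value, cube_volume, cyclic_ok, sign_ok)
```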

11.4 Moment of a force


Suppose that, in three dimensions, a force F is acting at a point P in a body (Fig. 11.7) and Q is any point. Then the magnitude M of the moment or torque about the point Q which is exerted by F is defined to be

M = |F|d, (11.12)

where d is the length of the perpendicular QN from Q on to the line of action of F. In Fig. 11.7, QP = R, and θ is the angle between F and R, with 0° ≤ θ ≤ 180°. Then

d = |R| sin θ,
so
M = |F||R| sin θ. (11.13)

Fig. 11.7

This equation suggests a connection with the vector R x F.


Define the vector moment M about Q of F acting at P by

M = R x F (11.14)

(note that R comes first in the product). Then by (11.8b),

|M| = |R x F| = |R||F| sin θ,

which is the same as (11.13). We call M the vector moment about the point Q of F acting at P. M is perpendicular to the plane of R and F, in the direction making [R, F, M] right-handed.

Example 11.6. A force F = i − j + 2k acts at P: (1, 2, 1). Find its vector moment M about the point Q: (2, 1, 1).
In these axes the position vectors of P and Q are

p = i + 2j + k, q = 2i + j + k,
so
R = QP = p − q = −i + j.

The moment M is given by

\[
M = R\times F = \begin{vmatrix} i & j & k\\ -1 & 1 & 0\\ 1 & -1 & 2 \end{vmatrix} = 2i + 2j.
\]
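The vector moment M = R x F is easy to evaluate by machine. The following short Python sketch is an illustration only; the function names `cross` and `vector_moment` are assumptions, not notation from the text. It reproduces the result of Example 11.6.

```python
def cross(u, v):
    """Vector product u x v for 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def vector_moment(q, p, force):
    """Moment about the point q of a force acting at the point p: (p - q) x F."""
    r = tuple(pi - qi for pi, qi in zip(p, q))
    return cross(r, force)


# Example 11.6: F = i - j + 2k at P: (1, 2, 1), moment about Q: (2, 1, 1).
m_example = vector_moment((2, 1, 1), (1, 2, 1), (1, -1, 2))
print(m_example)
```

Note that the order of the factors matters: reversing it changes the sign of every component.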

Example 11.7. A force F = i − j (force units) acts at P: (1, 2, 0). Find its vector moment about the origin O.
O, P, and F all lie in the (x, y)-plane, so the physical problem is two-dimensional. The Oz axis points towards you, out of the page, in Fig. 11.8.
The vector moment M is given by

\[
M = R\times F = \begin{vmatrix} i & j & k\\ 1 & 2 & 0\\ 1 & -1 & 0 \end{vmatrix} = -3k.
\]

Thus M is parallel to Oz and its z component is −3. Figure 11.8 shows that the negative sign corresponds to F having a clockwise influence on a wheel turning about the point O.

Fig. 11.8

Example 11.8. (Generalizes Example 11.7.) A force F = F1 i + F2 j acts at P: (a, b, 0). Find its vector moment about the origin.
We have

\[
M = R\times F = \begin{vmatrix} i & j & k\\ a & b & 0\\ F_1 & F_2 & 0 \end{vmatrix} = (F_2 a - F_1 b)k.
\]

This situation is also (physically) two-dimensional; the z direction in Fig. 11.9 would only be needed in order to display M. The expression illustrates the separate clockwise and anticlockwise contributions respectively of F1 and F2.

The scalar triple product

ŝ·M = ŝ·(R x F) (11.14)

represents the component of M in the direction of a unit vector ŝ. Its physical significance concerns torque or moment about an axis (rather than about a point), as follows.
Figure 11.10 shows an axis of rotation AA' (in three dimensions) passing through a point Q and parallel to a unit vector ŝ. A force F acts at P. Q' is any other point on AA', and P' is any point on the line of action of F.

Fig. 11.10 Moment of F about an axis parallel to ŝ: ŝ·(R' x F) = ŝ·(R x F).

Put

QP = R and Q'P' = R'.

Then

ŝ·(R' x F) = ŝ·{(Q'Q + R + PP') x F}.

But QQ' is parallel to ŝ, and PP' is parallel to F, so by (11.10d) these make no contribution to the triple scalar product, and we obtain

ŝ·(R' x F) = ŝ·(R x F). (11.15)

Thus any point on the axis and any point on the line of action of F may be put into the triple scalar product without affecting its value.
The freedom given by (11.15) allows us to choose Q' and P' such that Q'P' = R' is perpendicular both to ŝ and to the line of action of F, as in Fig. 11.11. Put

|R'| = |Q'P'| = d;

this is the distance between the two skew lines.

Fig. 11.11
Next, construct a set of coordinate axes at origin Q'. Let Q'x be in the direction of R', Q'z in the direction of ŝ, and Q'y perpendicular to Q'x and Q'z in the direction necessary for the axes to be right-handed. The unit vectors are i, j, k.
Express F in terms of its components in directions i, j, k. The i component is zero since F is perpendicular to Q'x, so

F = F2 j + F3 k.

Put ŝ·M = M, say. Then

M = ŝ·(R' x F) = F·(ŝ x R') (by (11.10b)).

Also

ŝ x R' = (|ŝ||R'| sin 90°) j

= |R'| j = d j.

Therefore

M = F·(d j) = (F2 j + F3 k)·(d j)
= F2 d. (11.16)
The expression (11.16) corresponds to what we should expect about the turning effect of F about the given axis. There is no contribution from F3 because F3 k is parallel to the axis of rotation, and F1 is zero in these axes. What remains is F2 j, which is perpendicular to the axis of rotation Q'z, and d is the perpendicular distance of F from it.
For this reason the scalar quantity M = ŝ·(R x F) is called the moment of F about an axis of rotation AA', as in Fig. 11.10. Dropping the dashed quantities, the unit vector ŝ is the direction of AA', and R = QP, where Q is any point on AA' and P any point on the line of action of F (M being independent of the choice of these points, by (11.15)).
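The independence stated in (11.15) can be illustrated numerically. In the sketch below (an illustration, not part of the text) the function name `moment_about_axis` and all the sample points are assumptions: the axis is the z axis, and the force F = (2, 0, 0) acts along a line through (0, 3, 0).

```python
def cross(u, v):
    """Vector product u x v for 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def moment_about_axis(s_hat, q, p, force):
    """Moment s_hat . ((p - q) x F) about the axis through q with unit direction s_hat."""
    r = tuple(pi - qi for pi, qi in zip(p, q))
    return dot(s_hat, cross(r, force))


s_hat = (0, 0, 1)   # unit vector along the axis (the z axis here)
F = (2, 0, 0)       # force

# Q at the origin, P at (0, 3, 0).
m1 = moment_about_axis(s_hat, (0, 0, 0), (0, 3, 0), F)

# Another point Q' on the axis and another point P' = P + 4F on the line of action.
m2 = moment_about_axis(s_hat, (0, 0, 5), (8, 3, 0), F)
print(m1, m2)
```

Both calls return the same value, as (11.15) predicts, because the displacements along the axis and along the line of action drop out of the triple product.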

11.5 Vector triple product

The vector

w = a x (b x c) (11.17)

is called a vector triple product. The vector w is perpendicular to b x c; but b x c is perpendicular to b and c, so b, c, and w are parallel to the same plane. Therefore (see (9.12)), it must be possible to express w in the simple form w = λb + μc. The required relation is:

Vector triple product

a x (b x c) = (a·c)b − (a·b)c. (11.18)

To prove (11.18), translate a, b, c to a common point Q, and set up axes Qx, Qy, Qz as in Fig. 11.12, such that Qz is in the direction of a. Then

a = a3 k,

so a·b = a3b3 and a·c = a3c3. (11.19)

Remember that k x i = j, k x j = −i and k x k = 0. In these axes,

w = a x (b x c)

= a3 k x {(b2c3 − b3c2)i + (b3c1 − b1c3)j + (b1c2 − b2c1)k}

= a3(b2c3 − b3c2)j − a3(b3c1 − b1c3)i

= a3c3(b1 i + b2 j) − a3b3(c1 i + c2 j).

Fig. 11.12

The third components of b and c (b3 k and c3 k) are missing in the brackets: to make them appear, add to the right-hand side the term

a3b3c3 k − a3b3c3 k (which = 0).

After bringing them into the brackets we get

w = a3c3(b1 i + b2 j + b3 k) − a3b3(c1 i + c2 j + c3 k)

= (a·c)b − (a·b)c.

Example 11.8. Find a x (b x c) when a = i + j, b = 2i − j, and c = i + j + k.

a·b = (i + j)·(2i − j) = 2 − 1 = 1;

a·c = (i + j)·(i + j + k) = 1 + 1 = 2.
Therefore
a x (b x c) = (a·c)b − (a·b)c = 2b − c

= 2(2i − j) − (i + j + k) = 3i − 3j − k.
(The product could also be worked out directly.)
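Working the product out directly is a useful check on (11.18). The following Python sketch is illustrative; the helper names are assumptions. It computes a x (b x c) both ways for the vectors of the example.

```python
def cross(u, v):
    """Vector product u x v for 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def triple_direct(a, b, c):
    """a x (b x c) evaluated as two successive vector products."""
    return cross(a, cross(b, c))


def triple_formula(a, b, c):
    """(a . c) b - (a . b) c, the right-hand side of (11.18)."""
    ac, ab = dot(a, c), dot(a, b)
    return tuple(ac * bi - ab * ci for bi, ci in zip(b, c))


a, b, c = (1, 1, 0), (2, -1, 0), (1, 1, 1)
direct = triple_direct(a, b, c)
by_formula = triple_formula(a, b, c)
print(direct, by_formula)
```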

Problems

11.1. In component form let a = (1, −2, 2), b = (3, −1, −1), and c = (−1, 0, −1). Evaluate the following:
(a) a x b (b) b x a (c) a x a
(d) a·(b x c) (e) c·(a x b) (f) b·(a x c)
(g) (a x b)·b (h) a x (a x b) (i) (c x b) x a.

11.2. Given two planes, r·n1 = d1, r·n2 = d2; show that the plane through the origin perpendicular to their line of intersection is given by r·(n1 x n2) = 0.

11.3. (a) Use the vector product to obtain a vector parametric equation for the straight line which is perpendicular to the vectors b and c, and passes through the point with position vector a. (b) Obtain the equation to the line when a = i + 2j + k, b = i − j and c = j + k.

11.4. Show that the vector a x u, where a = (a1, a2, a3) and u is any vector, is parallel to the plane a1x + a2y + a3z = d. Obtain two vectors parallel to the plane 2x − 3y − z = 1.

11.5. Under what conditions will a x b = 0?

11.6. Show that the vectors a = 2i + 3j + 6k and b = 6i + 2j − 3k are perpendicular. Find a vector c which is perpendicular to a and b and such that [a, b, c] is a right-handed set.
11.7. (a) The vertices of a triangle are A, B, C, with position vectors a, b, c. Show that the area of the triangle ABC is given by ½|b x c + c x a + a x b|. (Hint: see Example 11.2.) (b) A second triangle has vertices at a + λ(b − c), b, c, where λ is a scalar. Show that the areas of the two triangles are the same. What simple geometrical result does the equality exhibit? (c) Find the area of the triangle whose vertices are at i − 2j − k, i − j + 2k, i + 2j − k.

11.8. A, B, C are three points which do not lie on a straight line, and D is another point. Put AB = b, AC = c, and AD = d. Show that the distance of D from the plane passing through A, B, C is equal to |d·(b x c)|/|b x c|.

11.9. Show that, if QA, QB, QC are adjacent edges of a rectangular parallelepiped with coordinates

Q: (x0, y0, z0), A: (x1, y1, z1), B: (x2, y2, z2), C: (x3, y3, z3),

then its volume is given by the modulus of the determinant

\[
\begin{vmatrix}
x_1 - x_0 & x_2 - x_0 & x_3 - x_0\\
y_1 - y_0 & y_2 - y_0 & y_3 - y_0\\
z_1 - z_0 & z_2 - z_0 & z_3 - z_0
\end{vmatrix}.
\]

11.10. (Oblique coordinates) (a) Let a, b, c be three non-coplanar vectors, and v be any vector. Show that v can be expressed as

v = Xa + Yb + Zc,

where X, Y, Z are constants given by

X = v·(b x c)/D,

Y = v·(c x a)/D,

Z = v·(a x b)/D,

where

D = a·(b x c).

(Hint: start by forming, say, v·(a x b). Equation (11.10d) gets rid of two terms.) (b) Check the formulae for the case a = (1, 1, 0), b = (1, 1, 0), c = (1, 0, 1), v = (1, 1, 1), by solving the three equations obtained by splitting the vector equation into components.

11.11. (Cramer's rule.) In Problem 11.10, write the vector equation v = Xa + Yb + Zc in the form of three simultaneous equations involving the components of a, b, c. Now write the formulae for X, Y, Z in determinant form. This is known as Cramer's rule (see Section 12.1), for solving any three simultaneous equations provided D ≠ 0.

11.12. (a) Show that if three vectors a, b, c are non-coplanar and v is any vector, then constants X, Y, Z can be found such that v = X b x c + Y c x a + Z a x b. (Hint: start by forming a·v from this expression.) (b) Find X, Y, Z if v = 2i + j − 2k, a = i − j, b = i + 2j, c = j − 2k.

11.13. The equations r = a + λu and r = b + μv, where λ and μ are parameters, represent two skew lines L1 and L2 (straight lines which do not intersect). (a) Write down a vector w which is perpendicular to both L1 and L2. (b) Show that values of λ, μ, and ν can be found so that

(a + λu) + νw = (b + μv),

and explain why this implies that there actually exists a straight line L3, which joins L1 and L2 and is perpendicular to both. (c) For the case when a = −u = k, b = i − j, v = i + j + k, find the values of λ, μ, ν. Deduce the points where L3 meets L1 and L2. Find an equation for L3, and the perpendicular distance between L1 and L2.

11.14. Find the vector moments M of the given forces F acting at the points P as specified. Make sketches, indicating the direction of M.
(a) F = (2, 0, 0) at P: (0, 3, 0). Find M about the origin O.
(b) F = (2, 0, 0) at P: (0, 3, 0). Find M about Q: (0, 0, 3).
(c) F = (2, 0, 0) at P: (0, −3, 0). Find M about Q: (0, 0, 3).

11.15. A force F of magnitude 4 acts at the point (1, −1, 2) in the direction of i − 2j − 2k. Find the vector moment M of F, (a) about the origin, (b) about the point (−2, 1, 2). (c) Find its component about the y axis, taken in the direction of j (i.e., ŝ = j, in the text).

11.16. Find the moment M about the axes specified, where the force is F = (2, 0, 0) acting at P: (0, 3, 0). (Note that the sense of the axis needs to be specified. If the sense is reversed, then the sign of ŝ·(R x F) changes.) (a) The z axis, taken in the positive direction. (b) The z axis, taken in the negative direction. (c) The x axis, in the positive direction. (d) The y axis, in the positive direction. (e) The axis through the origin, direction ŝ = (1/√3, 1/√3, 1/√3).

11.17. Find the magnitude of the moment M of the force F = (1, 1, 2) acting at P: (2, −3, 1), about the axis AB, where A: (2, 3, 2) and B: (1, 1, 1). Verify directly that the component of F in the direction of AB makes no contribution. Show that the component of F along any line joining P to the axis AB makes no contribution.

11.18. A fixed force F acts at a fixed point P with position vector r. An axis passes through the origin, but it can be adjusted so as to take any direction ŝ. Show that the
magnitude |M| of the moment M about the axis is a maximum when ŝ is perpendicular to the plane containing r and F. (Hint: remember a·b = |a||b| cos θ in the usual notation.) Under what conditions is |M| a minimum, and what is its value?

11.19. A rigid lamina in the (x, y)-plane rotates at ω radians per second about the z axis in axes Oxyz, in the manner of a wheel on an axle. (a) Show that if r and θ are the polar coordinates of any point P, then the velocity of P is given by v = −iωr sin θ + jωr cos θ.
(b) Show that v may be written v = ω x r, where ω = ωk (ω is called the angular velocity vector in two dimensions).
(c) Choose any point Q which travels round with the lamina, and let QXYZ be another set of axes which remain parallel to Oxyz. Show that, viewed relative to QXYZ, any point P has velocity V given by V = ω x R, where R is its position vector in QXYZ.

11.20. A point of a rigid body is fixed at the origin of coordinates O. It rotates about O with angular velocity ω (i.e., at any instant the body is rotating at a rate |ω| radians sec⁻¹ about the line in the instantaneous direction of the vector ω).
Explain why every point of the body is moving perpendicularly to ω. Show that the velocity v of any point is given by v = ω x r. (Hint: compare Problem 11.19.)
Find the matrix S such that v = Sω. Show that |v|² = ωᵀSᵀSω, and that

\[
S^{T}S = \begin{pmatrix}
y^2 + z^2 & -xy & -zx\\
-xy & x^2 + z^2 & -yz\\
-zx & -yz & x^2 + y^2
\end{pmatrix}.
\]

11.21. Supposing that a and b emerge from the same point, show geometrically that a x (a x b) and b x (a x b) are in the plane of a and b.

11.22. If v = (a x b) x (c x d), then v can be written in either of the forms v = pc + qd or v = ma + nb. Justify this expectation geometrically, then obtain the constants by using eqn (11.18).

11.23. Prove that

a x (b x c) + b x (c x a) + c x (a x b) = 0.

11.24. (a) Find a vector which is perpendicular to n and in the plane of n and b, where n and b are any two vectors. (b) Show that the straight line r = b + μ n x [(a − b) x n], where μ is a parameter, passes through the point with position vector b, and meets the straight line given parametrically by r = a + λn in a right angle.

11.25. You are given two planes, r·n1 = d1, r·n2 = d2. Show that the point on their line of intersection that is closest to the origin has the position vector

α(n1 x n2) x n1 + β(n1 x n2) x n2

where α and β are certain constants. Obtain a formula for the constants.

11.26. A particle P of mass m and position vector r(t) moves with velocity v(t) under the action of a single force F. A point Q at q(t) has velocity u(t). The moment of momentum (or angular momentum) H(t), of P about Q is defined by

H = (r − q) x (mv).

Show that dH/dt = (r − q) x F − mu x v. Deduce that if u(t) = 0, then dH/dt = M, where M is the moment of F about the point Q.
Linear equations

Contents
12.1 Solution of linear equations by elimination
12.2 The inverse matrix by Gaussian elimination
12.3 Compatible and incompatible sets of equations
12.4 Homogeneous sets of equations
12.5 Gauss-Seidel iterative method of solution
Problems

12.1 Solution of linear equations by elimination


A matrix equation

Ax = d, (12.1)

defines a set of linear equations (we referred to them previously in connection with the inverse matrix in Section 7.4). In general, A will be an m x n matrix, while x and d are column vectors (x with n elements, d with m). Usually, but not always, we are interested in the case where the number of unknowns in the equations equals the number of equations. In other words, there is a surplus neither of unknowns nor of equations. In this case we have an n x n or square matrix, and this is the normal situation in applications. For example, the set of equations defined by

\[
\begin{bmatrix} 1 & 2 & 1\\ -2 & 3 & -1\\ 1 & 4 & -2 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}
= \begin{bmatrix} 1\\ -7\\ -7 \end{bmatrix}
\]

is

x1 + 2x2 + x3 = 1, (12.2a)

−2x1 + 3x2 − x3 = −7, (12.2b)

x1 + 4x2 − 2x3 = −7. (12.2c)

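Consider now the case in which A is an arbitrary square matrix. If the inverse of A exists, then multiplication of (12.1) on the left by A⁻¹ leads to the solution vector

\[
x = A^{-1}d = \frac{\operatorname{adj}A}{\det A}\,d,
\]

using the formula for the inverse given in Section 7.4. Let n = 3; then, for our standard matrix

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{bmatrix},
\]

we have

\[
x = \begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}
= \frac{\operatorname{adj}A}{\det A}\begin{bmatrix} d_1\\ d_2\\ d_3 \end{bmatrix}
= \frac{1}{\det A}\begin{bmatrix}
C_{11}d_1 + C_{21}d_2 + C_{31}d_3\\
C_{12}d_1 + C_{22}d_2 + C_{32}d_3\\
C_{13}d_1 + C_{23}d_2 + C_{33}d_3
\end{bmatrix},
\]

where C11, C12, ... are the cofactors of a11, a12, ... (see Section 8.1). Thus, comparison of elements in the vectors leads to

\[
x_1 = \frac{1}{\det A}(C_{11}d_1 + C_{21}d_2 + C_{31}d_3)
= \frac{1}{\det A}\begin{vmatrix} d_1 & a_{12} & a_{13}\\ d_2 & a_{22} & a_{23}\\ d_3 & a_{32} & a_{33} \end{vmatrix},
\]
\[
x_2 = \frac{1}{\det A}\begin{vmatrix} a_{11} & d_1 & a_{13}\\ a_{21} & d_2 & a_{23}\\ a_{31} & d_3 & a_{33} \end{vmatrix},
\qquad
x_3 = \frac{1}{\det A}\begin{vmatrix} a_{11} & a_{12} & d_1\\ a_{21} & a_{22} & d_2\\ a_{31} & a_{32} & d_3 \end{vmatrix}.
\]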
This representation of the solution is known as Cramer's rule. It is systematic in that, for x1, the determinant in the numerator has the first column of A replaced by d, for x2, the second column is replaced by d, and so on. The generalization to n linear equations in n unknowns is fairly clear from this formula. It is a useful theoretical result, but not generally a recommended method of solving more than four equations in four unknowns. High-order determinant evaluation is complicated.
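For three equations the rule is short enough to code directly. The sketch below is an illustration in Python, not the book's own procedure; the function names `det3` and `cramer3` are assumptions. It applies the column-replacement rule to the system (12.2), using exact fractions.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(A, d):
    """Solve A x = d for a 3 x 3 matrix A by Cramer's rule."""
    D = det3(A)
    if D == 0:
        raise ValueError("det A = 0: Cramer's rule does not apply")
    solution = []
    for col in range(3):
        # Replace column `col` of A by the right-hand side d.
        Ak = [[d[r] if c == col else A[r][c] for c in range(3)]
              for r in range(3)]
        solution.append(Fraction(det3(Ak), D))
    return solution


A = [[1, 2, 1], [-2, 3, -1], [1, 4, -2]]
d = [1, -7, -7]
x = cramer3(A, d)
print(det3(A), x)
```

Each unknown costs one extra determinant, which is why the method scales badly beyond a few equations.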
Short of using computer software, the simplest method of solving equations involves systematic elimination. Consider the three equations

x1 + 2x2 + x3 = 1, (12.3a)

−2x1 + 3x2 − x3 = −7, (12.3b)

x1 + 4x2 − 2x3 = −7. (12.3c)

We can perform three elementary row operations on linear equations which do not affect the solution. They are

Elementary row operations

(i) any equation can be multiplied by a nonzero constant,
(ii) any two equations can be interchanged,
(iii) any equation can be replaced by the sum of itself and any multiple of another equation.

Step 1. Eliminate x1 from (12.3b,c) by subtracting multiples of (12.3a) from (12.3b) and (12.3c):

x1 + 2x2 + x3 = 1,
7x2 + x3 = −5 (r2' = r2 + 2r1),
2x2 − 3x3 = −8 (r3' = r3 − r1).

The required operations between the equations are listed on the right.

Step 2. We now proceed to eliminate x2 from the third equation using a multiple of the new row 2. Hence

x1 + 2x2 + x3 = 1,
7x2 + x3 = −5,
−(23/7)x3 = −46/7 (r3' = r3 − (2/7)r2).

Step 3. Using rule (i) above, reduce the coefficients of x2 and x3 in the second and third equations above to 1:

x1 + 2x2 + x3 = 1,
x2 + (1/7)x3 = −5/7 (r2' = (1/7)r2),
x3 = 2 (r3' = −(7/23)r3).

Step 4. Starting from the third equation, we can now solve the equations by back substitution. Since x3 = 2, from the second equation,

x2 = −5/7 − (1/7)x3 = −5/7 − (1/7) x 2 = −1,

from the first equation,

x1 = 1 − 2x2 − x3 = 1 + 2 − 2 = 1.

Thus the solution is

x1 = 1, x2 = −1, x3 = 2.

The method is known as Gaussian elimination.
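Steps 1 to 4 translate directly into a short program. The following Python sketch is one possible arrangement under stated assumptions: the function name `gauss_solve` is hypothetical, exact fractions are used so that the arithmetic matches hand calculation, and a zero pivot is handled by interchanging with the first lower row that has a nonzero entry in that column.

```python
from fractions import Fraction


def gauss_solve(A, d):
    """Solve the square system A x = d by elimination and back substitution."""
    n = len(A)
    # Augmented matrix with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, d)]
    for k in range(n):
        # Find a row at or below k with a nonzero entry in column k.
        p = next((r for r in range(k, n) if M[r][k] != 0), None)
        if p is None:
            raise ValueError("no unique solution")
        M[k], M[p] = M[p], M[k]
        # Clear the column below the pivot.
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            M[r] = [mr - factor * mk for mr, mk in zip(M[r], M[k])]
    # Back substitution, starting from the last equation.
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        known = sum(M[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (M[k][n] - known) / M[k][k]
    return x


solution = gauss_solve([[1, 2, 1], [-2, 3, -1], [1, 4, -2]], [1, -7, -7])
print(solution)
```

The same function copes with a system whose second pivot is initially zero, because of the row interchange.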


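In fact, we need not write down the equations for x1, x2, x3 at each stage, since all the information in (12.3) is given by the 3 x 4 matrix

\[
\begin{bmatrix} 1 & 2 & 1 & 1\\ -2 & 3 & -1 & -7\\ 1 & 4 & -2 & -7 \end{bmatrix}
\]

which is known as the augmented matrix for the system of equations. The elementary operations referred to previously become elementary row operations on the matrix. We can reproduce the steps above by the following more compact procedure:

\[
\begin{bmatrix} \underline{1} & 2 & 1 & 1\\ -2 & 3 & -1 & -7\\ 1 & 4 & -2 & -7 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 1 & 1\\ 0 & \underline{7} & 1 & -5\\ 0 & 2 & -3 & -8 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 + 2r_1\\ r_3' = r_3 - r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 2 & 1 & 1\\ 0 & 7 & 1 & -5\\ 0 & 0 & -\tfrac{23}{7} & -\tfrac{46}{7} \end{bmatrix}
\quad (r_3' = r_3 - \tfrac{2}{7}r_2)
\]
\[
\rightarrow
\begin{bmatrix} 1 & 2 & 1 & 1\\ 0 & 1 & \tfrac{1}{7} & -\tfrac{5}{7}\\ 0 & 0 & 1 & 2 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = \tfrac{1}{7}r_2\\ r_3' = -\tfrac{7}{23}r_3 \end{pmatrix}
\]

where the arrow means 'is transformed into'. The final matrix is said to be in echelon form, that is, it has zeros below the diagonal elements starting from the top left. We can now solve the equations by back substitution as before.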
The elements underlined are known as pivots and they must be
nonzero. They are used to clear the elements in the column below
them. If any pivot turns out to be zero as the method progresses,
then that equation or row is replaced by the first row below which
has a nonzero coefficient in the column. If there are no further
nonzero coefficients, then the pivot moves across to the next column.
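It is possible to complete the Gaussian elimination by using further row operations on the echelon matrix. Thus, continuing from the echelon form above

\[
\begin{bmatrix} 1 & 2 & 1 & 1\\ 0 & 1 & \tfrac{1}{7} & -\tfrac{5}{7}\\ 0 & 0 & \mathbf{1} & 2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 0 & -1\\ 0 & \mathbf{1} & 0 & -1\\ 0 & 0 & 1 & 2 \end{bmatrix}
\quad
\begin{pmatrix} r_1' = r_1 - r_3\\ r_2' = r_2 - \tfrac{1}{7}r_3 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 0 & 0 & 1\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 2 \end{bmatrix}
\quad (r_1' = r_1 - 2r_2),
\]

where the pivots are bold again. The final matrix now represents the solution set x1 = 1, x2 = −1, x3 = 2.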

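Example 12.1. Using Gaussian elimination and back substitution, solve the set of equations

x1 + x2 + 2x3 = 4,

2x1 + 2x2 + x3 − x4 = −1,

x2 + x3 + x4 = 6,

x2 − x3 + 2x4 = 5.

We first perform the pivotal row operations on the augmented matrix as follows:

\[
\begin{bmatrix} 1 & 1 & 2 & 0 & 4\\ 2 & 2 & 1 & -1 & -1\\ 0 & 1 & 1 & 1 & 6\\ 0 & 1 & -1 & 2 & 5 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 2 & 0 & 4\\ 0 & 0 & -3 & -1 & -9\\ 0 & 1 & 1 & 1 & 6\\ 0 & 1 & -1 & 2 & 5 \end{bmatrix}
\quad (r_2' = r_2 - 2r_1)
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 2 & 0 & 4\\ 0 & 1 & 1 & 1 & 6\\ 0 & 0 & -3 & -1 & -9\\ 0 & 1 & -1 & 2 & 5 \end{bmatrix}
\quad (r_2 \leftrightarrow r_3)
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 2 & 0 & 4\\ 0 & 1 & 1 & 1 & 6\\ 0 & 0 & -3 & -1 & -9\\ 0 & 0 & -2 & 1 & -1 \end{bmatrix}
\quad (r_4' = r_4 - r_2)
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 2 & 0 & 4\\ 0 & 1 & 1 & 1 & 6\\ 0 & 0 & -3 & -1 & -9\\ 0 & 0 & 0 & \tfrac{5}{3} & 5 \end{bmatrix}
\quad (r_4' = r_4 - \tfrac{2}{3}r_3)
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 2 & 0 & 4\\ 0 & 1 & 1 & 1 & 6\\ 0 & 0 & 1 & \tfrac{1}{3} & 3\\ 0 & 0 & 0 & 1 & 3 \end{bmatrix}
\quad
\begin{pmatrix} r_3' = -\tfrac{1}{3}r_3\\ r_4' = \tfrac{3}{5}r_4 \end{pmatrix}
\]

(Note the row change r2 ↔ r3 because of the zero pivot.) Back substitution now gives

x4 = 3, x3 = 3 − (1/3)x4 = 2, x2 = 6 − x3 − x4 = 1,

x1 = 4 − x2 − 2x3 = −1.

The solution can now be written as

\[
\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{bmatrix} = \begin{bmatrix} -1\\ 1\\ 2\\ 3 \end{bmatrix}.
\]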

12.2 The inverse matrix by Gaussian elimination

We can also use Gaussian elimination to find the inverse matrix. In solving the set of equations Ax = d, we obtain the solution x = A⁻¹d. In other words:

Use elementary row operations to transform A into the identity I, and use the same operations to transform I into A⁻¹. (12.5)
Suppose that we require the inverse of


"0102 “

10 10
A =
0 10 1
_ 1 0 2 0_
We reduce A to I4 and perform the same row operations on I4.
Thus, we can write down the steps in parallel as follows:
0 10 2 1 0 0 o"

10 1 0 0 1 0 0
h =
0 10 1 0 0 1 0

1 0 2 0_ 0 0 0 1 _

10 1 o' 0 1 0 o"
(ri r2)
0 10 2 1 0 0 0

0 10 1 0 0 1 0

10 2 0_ 0 0 0 1 _

10 1 0 “ 0 1 0 o-

0 1 0 2 1 0 0 0

0 10 1 0 0 1 0

0 0 1 0_ (U = r4 - r,) 0-10 1 _

10 1 o" 0 1 0 o"

0 10 2 1 0 p 0

0 0 0 -1 (r3 = r3 - r2) -1 0 1 0

0 0 1 0_ 0-10 1_

10 1 o" 0 1 0 o"

0 10 2 1 0 0 0

0 0 1 0 0-10 1
(r3 «-»■ U)
0 0 0 -1 0 I 0
202 1 2.2 Mathematical techniques

" 1 0 1 o’ 0 1 0 0

0 1 0 2 1 0 0 0
—>
0 0 1 0 0-1 0 1

_0 0 0 1_. _1 0 -1 0_

”1 0 1 o" 0 1 0 0

0 1 0 0 -1 0 2 0

0 0 1 0 0-1 0 1

_0 0 0 1. 1 0 —1 0_

” 1 0 0 o" 0 2 0 - 1~

0 1 0 0 -10 2 0

0 0 1 0 0-101

_0 0 0 1 1 0-1 0_

= u = A'1.

We conclude that
0 2 0 -1
-10 2 0
A-1
0-101
1 0-1 0_
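The parallel reduction in (12.5) can be sketched in code by carrying A and the identity side by side. The Python below is illustrative only: the function name `inverse` is an assumption, exact fractions are used, and the order of the row operations differs slightly from the hand calculation (each column is cleared above and below its pivot as soon as the pivot is found), although the result is the same.

```python
from fractions import Fraction


def inverse(A):
    """Invert a square matrix by row-reducing the block [A | I] to [I | A^-1]."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        # Interchange rows if the pivot position holds a zero.
        p = next((r for r in range(k, n) if M[r][k] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        M[k], M[p] = M[p], M[k]
        # Scale the pivot row so that the pivot becomes 1.
        pivot = M[k][k]
        M[k] = [v / pivot for v in M[k]]
        # Clear the rest of column k, above and below the pivot.
        for r in range(n):
            if r != k and M[r][k] != 0:
                factor = M[r][k]
                M[r] = [vr - factor * vk for vr, vk in zip(M[r], M[k])]
    return [row[n:] for row in M]


A = [[0, 1, 0, 2], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 2, 0]]
A_inv = inverse(A)

# Check by multiplying back: A times A_inv should be the identity.
product = [[sum(A[i][k] * A_inv[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
print(A_inv)
```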

12.3 Compatible and incompatible sets of equations

Not all sets of equations have solutions. For example, x + y = 1 and x + y = 2 have no solution. Consider the set of equations

x + y − z = 3,

3x − y + 3z = 5,

x − y + 2z = 2.

We can sense that there might be a problem by first evaluating the determinant of the coefficients of x, y, and z. Thus

\[
\begin{vmatrix} 1 & 1 & -1\\ 3 & -1 & 3\\ 1 & -1 & 2 \end{vmatrix}
= \begin{vmatrix} 1 & 1 & -1\\ 4 & 0 & 2\\ 2 & 0 & 1 \end{vmatrix}
= -\begin{vmatrix} 4 & 2\\ 2 & 1 \end{vmatrix} = 0.
\]

Thus Cramer's rule will fail, although there still may be solutions. We can determine whether solutions exist more readily by using Gaussian elimination. In this case the application of row operations on the augmented matrix leads to

\[
\begin{bmatrix} 1 & 1 & -1 & 3\\ 3 & -1 & 3 & 5\\ 1 & -1 & 2 & 2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & -1 & 3\\ 0 & -4 & 6 & -4\\ 0 & -2 & 3 & -1 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - 3r_1\\ r_3' = r_3 - r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & -1 & 3\\ 0 & -4 & 6 & -4\\ 0 & 0 & 0 & 1 \end{bmatrix}
\quad (r_3' = r_3 - \tfrac{1}{2}r_2),
\]

which is the echelon form for this set of equations. However, row 3 is inconsistent since 0 ≠ 1. Hence these equations can have no solutions.
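On the other hand, consider the following set:

x + y − z = 1,

3x − y + 3z = 5,

x − y + 2z = 2,

(this is the previous set with one change to the first equation). Gaussian elimination now gives

\[
\begin{bmatrix} 1 & 1 & -1 & 1\\ 3 & -1 & 3 & 5\\ 1 & -1 & 2 & 2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & -1 & 1\\ 0 & -4 & 6 & 2\\ 0 & -2 & 3 & 1 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - 3r_1\\ r_3' = r_3 - r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & -1 & 1\\ 0 & -4 & 6 & 2\\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad (r_3' = r_3 - \tfrac{1}{2}r_2).
\]

Row 3 is now consistent, and row 2 is −4y + 6z = 2. Hence

y = −(1/4)(2 − 6z) = −1/2 + (3/2)z

and, from row 1,

x = 1 − y + z = 3/2 − (1/2)z.

Thus z can take any value, say λ, so the full solution set is

\[
\begin{bmatrix} x\\ y\\ z \end{bmatrix}
= \begin{bmatrix} \tfrac{3}{2} - \tfrac{1}{2}\lambda\\ -\tfrac{1}{2} + \tfrac{3}{2}\lambda\\ \lambda \end{bmatrix}
\]

for any value of λ. It can be seen in this case that there exists an infinite number of solutions, a different one for each different value of λ.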
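The three possible outcomes (a unique solution, no solution, infinitely many solutions) can be read off mechanically from the echelon form. The following Python sketch is an illustration under stated assumptions: the function name `classify` and its string return values are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def classify(augmented, n_unknowns):
    """Reduce an augmented matrix to echelon form and report the kind of solution set."""
    M = [[Fraction(v) for v in row] for row in augmented]
    n_rows = len(M)
    rank = 0
    for col in range(n_unknowns):
        # Look for a usable pivot in this column; if none, move to the next column.
        p = next((r for r in range(rank, n_rows) if M[r][col] != 0), None)
        if p is None:
            continue
        M[rank], M[p] = M[p], M[rank]
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [vr - factor * vk for vr, vk in zip(M[r], M[rank])]
        rank += 1
    # A row reading 0 = nonzero makes the system inconsistent.
    for row in M:
        if all(v == 0 for v in row[:n_unknowns]) and row[n_unknowns] != 0:
            return "none"
    return "unique" if rank == n_unknowns else "infinite"


first_set = [[1, 1, -1, 3], [3, -1, 3, 5], [1, -1, 2, 2]]
second_set = [[1, 1, -1, 1], [3, -1, 3, 5], [1, -1, 2, 2]]
print(classify(first_set, 3), classify(second_set, 3))
```

Because the number of rows is not assumed equal to the number of unknowns, the same function can be applied to systems with more equations than unknowns, or fewer.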
Geometrically, in three dimensions, it can be seen why equations can have a unique solution, no solution, or an infinite set of solutions. Any equation such as

ax + by + cz = d

represents a plane in R3. Three equations represent three planes, and we need only visualize how they might intersect or not. The coordinates of any point of intersection of the planes give a solution of the equations. The three diagrams in Fig. 12.1 show how three planes can intersect in a single point, no point, or a line of points.

Fig. 12.1 (a) Unique solution, (b) no solution, (c) line of solutions.

Example 12.2. Determine the complete sets of values for a and b which make the equations

x − 2y + 3z = 2,

2x − y + 2z = 3,

x + y + az = b,

have (i) a unique solution, (ii) no solutions, (iii) an infinite set of solutions.
Reduce the augmented matrix to echelon form using pivots to clear each column successively:

\[
\begin{bmatrix} 1 & -2 & 3 & 2\\ 2 & -1 & 2 & 3\\ 1 & 1 & a & b \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & -2 & 3 & 2\\ 0 & 3 & -4 & -1\\ 0 & 3 & a - 3 & b - 2 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - 2r_1\\ r_3' = r_3 - r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & -2 & 3 & 2\\ 0 & 3 & -4 & -1\\ 0 & 0 & a + 1 & b - 1 \end{bmatrix}
\quad (r_3' = r_3 - r_2).
\]

We can now interpret the echelon matrix.
(i) If a ≠ −1, then z has the unique solution

z = (b − 1)/(a + 1).

Also y and x can be found by back substitution.
(ii) If a = −1, and b ≠ 1 then row 3 will lead to an inconsistency. Hence there are no solutions of the equations.
(iii) If a = −1, and b = 1 then row 3 implies z = λ for any number λ. Also x and y can be found by back substitution.

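This example illustrates the advantage of Gaussian elimination over the formula-based method of Cramer's rule. The Gaussian method can still be used if the number of equations differs from the number of unknowns. Consider the following example.

Example 12.3. Investigate all solutions of

x + y − z = 1,

3x − y + 3z = 5,

x − y + 2z = 2,

x + z = 3.

The augmented matrix is

\[
\begin{bmatrix} 1 & 1 & -1 & 1\\ 3 & -1 & 3 & 5\\ 1 & -1 & 2 & 2\\ 1 & 0 & 1 & 3 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & -1 & 1\\ 0 & -4 & 6 & 2\\ 0 & -2 & 3 & 1\\ 0 & -1 & 2 & 2 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - 3r_1\\ r_3' = r_3 - r_1\\ r_4' = r_4 - r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & -1 & 1\\ 0 & -4 & 6 & 2\\ 0 & 0 & 0 & 0\\ 0 & 0 & \tfrac{1}{2} & \tfrac{3}{2} \end{bmatrix}
\quad
\begin{pmatrix} r_3' = r_3 - \tfrac{1}{2}r_2\\ r_4' = r_4 - \tfrac{1}{4}r_2 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & -1 & 1\\ 0 & -4 & 6 & 2\\ 0 & 0 & \tfrac{1}{2} & \tfrac{3}{2}\\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad (r_3 \leftrightarrow r_4).
\]

Row 4 is consistent, while row 3 implies z = 3. Then y and x can be found by back substitution in rows 2 and 1.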

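On the other hand, there may be more variables than equations, as in the following example.

Example 12.4. Show that the following equations are inconsistent:

x1 + x2 + x3 + 2x4 = 1,

x1 − 2x2 + 3x3 − x4 = 4,

3x1 − 3x2 + 7x3 = 7.

Proceed as before, and successively reduce the augmented matrix by pivots. Thus

\[
\begin{bmatrix} 1 & 1 & 1 & 2 & 1\\ 1 & -2 & 3 & -1 & 4\\ 3 & -3 & 7 & 0 & 7 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 2 & 1\\ 0 & -3 & 2 & -3 & 3\\ 0 & -6 & 4 & -6 & 4 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - r_1\\ r_3' = r_3 - 3r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 2 & 1\\ 0 & -3 & 2 & -3 & 3\\ 0 & 0 & 0 & 0 & -2 \end{bmatrix}
\quad (r_3' = r_3 - 2r_2).
\]

Since 0 ≠ −2, row 3 indicates an inconsistency, so the equations are incompatible.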

12.4 Homogeneous sets of equations

Any set of equations Ax = 0 is known as a homogeneous set; it is a set of linear equations with zero right-hand sides. Clearly, the equations always have the so-called trivial solution x = 0, but there may exist nontrivial solutions. What are the conditions for their existence? Consider the following example.

Example 12.5. Find the value of a for which the following equations have nontrivial solutions:

x + y + z = 0,
x + 2y = 0,
x − 3y + az = 0.

Proceed in the usual way using Gaussian reduction. Thus

\[
\begin{bmatrix} 1 & 1 & 1 & 0\\ 1 & 2 & 0 & 0\\ 1 & -3 & a & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0\\ 0 & 1 & -1 & 0\\ 0 & -4 & a - 1 & 0 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - r_1\\ r_3' = r_3 - r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0\\ 0 & 1 & -1 & 0\\ 0 & 0 & a - 5 & 0 \end{bmatrix}
\quad (r_3' = r_3 + 4r_2).
\]

Hence z can only be nonzero if a = 5. Hence, nontrivial solutions exist if, and only if, a = 5. Back substituting, we find that the solution is

z = λ, y = z = λ, x = −y − z = −2λ,

for any λ.

If A is a square matrix, then Cramer's rule (Section 12.1) implies that

x det A = 0.

Hence x can be a nonzero column vector only if det A = 0, which is a test for nontrivial solutions for systems in which the number of variables is the same as the number of equations.

Homogeneous equations Ax = 0, where A is square.
If det A = 0, there is an infinite number of nontrivial solutions. If det A ≠ 0, the only solution is x = 0. (12.6)
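The determinant test can be tried on a homogeneous system whose third equation contains a parameter a, namely x + y + z = 0, x + 2y = 0, x − 3y + az = 0. The Python sketch below is illustrative; the function names `det3` and `coefficient_matrix` are assumptions. It confirms that the determinant vanishes only at a = 5 and that (−2λ, λ, λ) then satisfies all three equations.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def coefficient_matrix(a):
    """Coefficients of x + y + z = 0, x + 2y = 0, x - 3y + a z = 0."""
    return [[1, 1, 1], [1, 2, 0], [1, -3, a]]


# The determinant equals a - 5, so it vanishes only when a = 5.
dets = {a: det3(coefficient_matrix(a)) for a in range(3, 8)}

# For a = 5 the vector (-2, 1, 1) times any lam solves the equations.
lam = 3
x = (-2 * lam, lam, lam)
residuals = [sum(c * xi for c, xi in zip(row, x))
             for row in coefficient_matrix(5)]
print(dets, residuals)
```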

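Example 12.6. Find all conditions on the constants a, b, and c in order that

x + y + z = 0,

ax + by + cz = 0,

a²x + b²y + c²z = 0,

should have nontrivial solutions. Find the solutions in the cases (a) a = 1, b = 1, c = 2; (b) a = 1, b = 1, c = 1.

This system of equations will have nontrivial solutions for x, y, and z if, and only if, the determinant of the coefficients is zero. Now

\[
\begin{vmatrix} 1 & 1 & 1\\ a & b & c\\ a^2 & b^2 & c^2 \end{vmatrix}
= \begin{vmatrix} 1 & 0 & 0\\ a & b - a & c - a\\ a^2 & b^2 - a^2 & c^2 - a^2 \end{vmatrix}
= (b - a)(c - a)\begin{vmatrix} 1 & 0 & 0\\ a & 1 & 1\\ a^2 & b + a & c + a \end{vmatrix}
\]
\[
= (b - a)(c - a)(c + a - b - a)
= (b - c)(c - a)(a - b).
\]

Hence, nontrivial solutions exist if b = c, c = a, or a = b.

(a) (a = 1, b = 1, c = 2) The equations become

x + y + z = 0,

x + y + 2z = 0,

x + y + 4z = 0.

The augmented matrix is

\[
\begin{bmatrix} 1 & 1 & 1 & 0\\ 1 & 1 & 2 & 0\\ 1 & 1 & 4 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 3 & 0 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - r_1\\ r_3' = r_3 - r_1 \end{pmatrix}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad (r_3' = r_3 - 3r_2).
\]

Row 2 implies z = 0, while row 1 implies x = −y. Let y = λ, say. Then the solution set is

\[
\begin{bmatrix} x\\ y\\ z \end{bmatrix}
= \begin{bmatrix} -\lambda\\ \lambda\\ 0 \end{bmatrix}
= \lambda\begin{bmatrix} -1\\ 1\\ 0 \end{bmatrix}
\]

for any λ.
(b) (a = 1, b = 1, c = 1) Applying Gaussian elimination, we find that

\[
\begin{bmatrix} 1 & 1 & 1 & 0\\ 1 & 1 & 1 & 0\\ 1 & 1 & 1 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad
\begin{pmatrix} r_2' = r_2 - r_1\\ r_3' = r_3 - r_1 \end{pmatrix}.
\]

Hence, we are left with row 1, which implies

x + y + z = 0.

Let z = λ and y = μ. Then x = −λ − μ. Hence the solution is

\[
\begin{bmatrix} x\\ y\\ z \end{bmatrix}
= \begin{bmatrix} -\lambda - \mu\\ \mu\\ \lambda \end{bmatrix}
= \lambda\begin{bmatrix} -1\\ 0\\ 1 \end{bmatrix}
+ \mu\begin{bmatrix} -1\\ 1\\ 0 \end{bmatrix}
\]

for any λ and any μ. Note that this is a two-parameter solution set.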

12.5 Gauss-Seidel iterative method of solution


The method of Gaussian elimination, described in Section 12.1, is
not a practical approach, by hand, for a large system with perhaps
30 equations in 30 unknowns. Whatever method is employed, such systems
will have to be solved by computer. But decisions about what scheme
may be the best for a given set of equations are not always easy.
There are many direct and iterative methods in addition to the
row-operation method described in Section 12.1. Two such methods will
be briefly explained here.
Consider the equations

\[
\begin{aligned}
3x_1 + x_2 + x_3 &= -1,\\
-x_1 + 4x_2 + x_3 &= -8,\\
2x_1 + x_2 + 5x_3 &= -14.
\end{aligned}
\]

Write the equations as

\[ x_1 = \tfrac{1}{3}(-x_2 - x_3 - 1), \tag{12.7} \]
\[ x_2 = \tfrac{1}{4}(x_1 - x_3 - 8), \tag{12.8} \]
\[ x_3 = \tfrac{1}{5}(-2x_1 - x_2 - 14), \tag{12.9} \]

where $x_1$, $x_2$, and $x_3$ are now the subjects of the three equations.
To start the iteration choose initial values for $x_2$ and $x_3$, say,
$x_2^{(0)} = 0$ and $x_3^{(0)} = 0$, without thinking about the equations.
Calculate $x_1^{(1)}$ from equation (12.7) as

\[ x_1^{(1)} = \tfrac{1}{3}(-x_2^{(0)} - x_3^{(0)} - 1). \tag{12.10} \]

Use $x_1^{(1)}$ in (12.8) with $x_3^{(0)}$. Thus

\[ x_2^{(1)} = \tfrac{1}{4}(x_1^{(1)} - x_3^{(0)} - 8). \tag{12.11} \]

Finally, use the updates $x_1^{(1)}$ and $x_2^{(1)}$ in (12.9) to find $x_3^{(1)}$:

\[ x_3^{(1)} = \tfrac{1}{5}(-2x_1^{(1)} - x_2^{(1)} - 14). \tag{12.12} \]

Hence we have calculated a new approximate solution given by $x_1^{(1)}$,
$x_2^{(1)}$, $x_3^{(1)}$. Now repeat the calculations starting with $x_2^{(1)}$ and
$x_3^{(1)}$ as the new initial values to obtain $x_1^{(2)}$, $x_2^{(2)}$, $x_3^{(2)}$.
The output from these iterations is shown in the following table.

  i              0        1        2        3        4        5        6        7
  x_1^(i)   (none)  -0.3333   1.1110   1.0570   1.0030   0.9984   0.9997   1.0000
  x_2^(i)        0  -2.0830  -1.1600  -0.9825  -0.9926  -0.9997  -1.0000  -1.0000
  x_3^(i)        0  -2.2500  -3.0130  -3.0260  -3.0030  -2.9990  -3.0000  -3.0000

All solutions are quoted to four decimal places. It can be seen
that the exact solution, which is $x_1 = 1$, $x_2 = -1$, $x_3 = -3$, can be
achieved to this accuracy after seven steps for this example. This is
known as the Gauss-Seidel scheme for numerical solution of
linear equations.
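
The iteration (12.10)-(12.12) is easily programmed. The following Python sketch is one possible arrangement, not a prescribed implementation; the function name `gauss_seidel` and the choice to store every iterate in a list are assumptions made for illustration.

```python
def gauss_seidel(a, d, x0, steps):
    """Run `steps` Gauss-Seidel sweeps on a x = d, starting from x0.

    Each component is overwritten as soon as it is computed, so later rows
    in the same sweep already use the updated values.
    """
    n = len(d)
    x = list(x0)
    history = [tuple(x)]
    for _ in range(steps):
        for i in range(n):
            others = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (d[i] - others) / a[i][i]
        history.append(tuple(x))
    return history

A = [[3.0, 1.0, 1.0], [-1.0, 4.0, 1.0], [2.0, 1.0, 5.0]]
d = [-1.0, -8.0, -14.0]
# The starting value of x1 is never used; 0.0 is only a placeholder.
history = gauss_seidel(A, d, [0.0, 0.0, 0.0], steps=7)
for step, x in enumerate(history):
    print(step, [round(v, 4) for v in x])
```

The printed rows can be compared with the columns of the table above; small differences in the last digit are to be expected, since the table entries are rounded.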
An alternative method without updating, given by

\[ x_1^{(1)} = \tfrac{1}{3}(-x_2^{(0)} - x_3^{(0)} - 1), \]
\[ x_2^{(1)} = \tfrac{1}{4}(x_1^{(0)} - x_3^{(0)} - 8), \]
\[ x_3^{(1)} = \tfrac{1}{5}(-2x_1^{(0)} - x_2^{(0)} - 14), \]

is known as Jacobi's method. However, convergence to the exact
solution is slower with this scheme.
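
For comparison, a Jacobi sweep might be sketched as follows. The names `jacobi_step` and `steps_to_converge`, the tolerance `5e-5`, and the zero starting vector are illustrative assumptions rather than anything fixed by the method.

```python
def jacobi_step(a, d, x):
    """One Jacobi sweep: every new component uses only the old vector x."""
    n = len(d)
    return [
        (d[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
        for i in range(n)
    ]

def steps_to_converge(a, d, x0, exact, tol=5e-5, max_steps=200):
    """Count Jacobi sweeps until every component is within tol of `exact`."""
    x = list(x0)
    for step in range(1, max_steps + 1):
        x = jacobi_step(a, d, x)
        if all(abs(xi - ei) < tol for xi, ei in zip(x, exact)):
            return step, x
    return None, x

A = [[3.0, 1.0, 1.0], [-1.0, 4.0, 1.0], [2.0, 1.0, 5.0]]
d = [-1.0, -8.0, -14.0]
steps, x = steps_to_converge(A, d, [0.0, 0.0, 0.0], exact=[1.0, -1.0, -3.0])
print(steps, [round(v, 4) for v in x])
```

Because the whole old vector is kept until the sweep is finished, the new vector is built in one pass instead of being overwritten component by component.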
These methods do not always converge to the solution. For
example, the Gauss-Seidel scheme applied to the system (12.2) fails
since the iterates continue to increase in size. It can be shown that
the Gauss-Seidel scheme converges if the magnitude of each leading
diagonal element exceeds the sum of the magnitudes of the remaining
elements in the same row of the matrix of coefficients. This is the
case for the system given by (12.8a, b, c). Here the matrix of
coefficients is

\[ \begin{bmatrix} 3 & 1 & 1\\ -1 & 4 & 1\\ 2 & 1 & 5 \end{bmatrix}. \]

Each of the diagonal elements dominates the remaining elements in
that row, since

\[ 3 > 1 + 1 = 2, \quad 4 > |-1| + 1 = 2, \quad 5 > 2 + 1 = 3. \]

This property of the system of equations is known as diagonal
dominance. If the matrix is not diagonally dominant, then the
scheme may or may not converge. Usually a few steps will indicate
whether this is likely to be the case.
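
The row-by-row test for diagonal dominance can be written in a few lines. This is a sketch only; the function name is an assumption, and the second matrix is chosen purely to illustrate a failing case.

```python
def is_strictly_diagonally_dominant(a):
    """True if |a[i][i]| exceeds the sum of |a[i][j]|, j != i, in every row."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )

dominant = [[3, 1, 1], [-1, 4, 1], [2, 1, 5]]
not_dominant = [[1, -2, 1], [1, -1, -1], [2, 3, -4]]  # illustrative only
print(is_strictly_diagonally_dominant(dominant))
print(is_strictly_diagonally_dominant(not_dominant))
```

A negative result does not by itself mean that the iteration diverges; as noted above, the scheme may or may not converge in that case.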
The schemes for both these methods can be expressed in matrix
form as follows. Let the system of equations be

\[ Ax = d, \]

where $A = [a_{ij}]$ is an $n \times n$ matrix. Let

\[ A = A_L + D + A_U, \]

where $A_L$, $D$, and $A_U$ are respectively the lower triangular, diagonal,
and upper triangular matrices given by

\[
A_L = \begin{bmatrix}
0 & 0 & \cdots & 0\\
a_{21} & 0 & \cdots & 0\\
a_{31} & a_{32} & \cdots & 0\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & 0
\end{bmatrix},
\quad
D = \begin{bmatrix}
a_{11} & 0 & \cdots & 0\\
0 & a_{22} & \cdots & 0\\
\vdots & \vdots & & \vdots\\
0 & 0 & \cdots & a_{nn}
\end{bmatrix},
\]
\[
A_U = \begin{bmatrix}
0 & a_{12} & a_{13} & \cdots & a_{1n}\\
0 & 0 & a_{23} & \cdots & a_{2n}\\
\vdots & \vdots & \vdots & & \vdots\\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.
\]

The matrix equation becomes

\[ A_L x + Dx + A_U x = d. \]

It is easy to find $x$ from an equation of the form $Dx = \cdots$, since $D$
is a diagonal matrix with a simple inverse. This is the matrix which
is updated by Jacobi's method. Assuming that
$x^{(0)} = [x_1^{(0)}, \ldots, x_n^{(0)}]^T$
is the given initial estimate, the approximate solution at step $r$ will
be computed from

\[ Dx^{(r)} = -A_U x^{(r-1)} - A_L x^{(r-1)} + d. \]

On the other hand, in the Gauss-Seidel scheme, we take advantage
of the observation that $x_1^{(r)}, x_2^{(r)}, \ldots$ are successively computed from
rows 1, 2, .... Hence they can be used in the rows that follow. Thus
the Gauss-Seidel iterations are given by

\[ Dx^{(r)} = -A_L x^{(r)} - A_U x^{(r-1)} + d. \]

Problems

12.1. (Section 12.1). Solve the following systems of linear (e) x, + 5x2 +2x4=1,
equations using Cramer’s rule:
- 3x2 - x4 = 1,
(a) x, + x3 = 1,
3x2 + x3 + x4 = 1,
X2 ~ x3 = 3,
2x2 + x3 + x4 = 2.
2*! + x2 = — 1;
(b) -x, + 7x2 + x3 = 1, 12.2. (Section 12.1). The currents i,, i2, i3 (in amps) flow
in parts of a circuit which contains a variable resistor of
*2 - *3 = 3,
resistance R (in ohms). The equations for the currents
2.x, + x2 + 10.x3 = — 1; are given by
(c) x, + 5x2-x3 = 1, - i2 - i3 = 12,
— 3.x, + x2 — x3 = 1, — i, + Ri2 = 24,
3.x, + -x2 + .x3 = — 3; h +5i3= -12,
(d) -x, + ,x2 + -X 3 = 1,
in terms of the voltages on the right-hand side. For
ax, + bx2 + cx3 = d,
design reasons, the current i3 should be 2 amps. How
a2x, + b2x2 + c2x3 = d2\ many ohms should the resistance R be?

12.3. (Section 12.3) Show that the following sets of 12.9. (Section 12.2). Find the inverses of the following
equations are inconsistent. matrices.
(a) x1 + 2x2 + x3 = 3,
6-3 6
x, - 3x2 + 2x3 = 4,
(a) 3 6 6;
5x, + 5x2 + 6x3 = 1;
(b) X, + x2 + x3
_ —12 -3 6_
— 2,
X, + X3 + 2x4 = 3, 1 -1 2"

*l+x2 + II (b) 1 2 1;
- *2 + 2x3 = 2; _ -4 -1 2_
(c) Xj + X2 = 1,
"2-1 20"
X2 + *3 = 1,
1 0-12
*3 + x4 = 1, (c)
x4 + x5 -1, 0 0-12
x, + 3x2 + 5x3 + 7x4 + 4x5 = 1. _-1 0 1 0_
" 1 1 0 0 0 0"
12.4. (Section 12.3). Determine the complete set of values
for a and b that make the equations 0 1 0 0 0 0
x + y — z = 2,
0 0 1 0 0 0
2x + 3y + z = 3, (d)
0 0 0 1 0 0
5x + 7 y + az = b,
have (i) a unique solution, (ii) no solutions, (iii) an 0 0 0 0 1 1
infinite set of solutions. _0 0 0 0 0 1_

12.5. (Section 12.3). Investigate all solutions of the system ”10000”

x- y + 2z = 1, 110 0 0
x + y + 3z = 2,
(e) 1110 0.
x + 2 y — z = 3,
11110
x — 2y + 6z = 0.
11111
12.6. (Section 12.3). Show that the following equations
are inconsistent. 12.10. Show that
xt + x2 + x3 — x4 = 10, ”10101
*1 - X2 - X3 = 1.
0 10 10
4x! — 2x2 — 2x3 — x4 = 5.
10 10 1
12.7. (Section 12.1). Solve the following equations by
0 10 10
Gaussian elimination.
x2 + 2x3 — x4 = 11, 10 10 1

X] + x2 + -*3+ -X4 — 1, is a singular matrix.


2x, + x2 — x3 + 4x4 = 0,
12.11. (Section 12.1). The four planes
Xj — x2 + x3 — 2x4 = 2.
6x — 3_y — z = — 3,
12.8. (Section 12.3). Find the value of a for which the
2x — y + 5z = 15,
linear equations
ax — y + 2z = 1, y + z = 1,

x + 2y — az = 2, 2x + y — z = 1,
4x + y — 2z = 2, are the faces of a tetrahedron. Find the coordinates
have no solutions. of all its vertices.

12.12. A light source is situated at the point P:(3,2,2). 12.17. (Section 12.4). Show that
A triangle has the points A:(l, 1, 1), B:(1,0, 1), a2 +1 ab ca
C:(2, 1, 1) as vertices. Find the coordinates of the vertices
of the shadow of the triangle on the coordinate planes ah b2 -1-1 be
x = 0, y = 0, and z = 0.
ca be c2 +1

12.13. The parabola For what values of t do the equations


(1 + t)x + 2y + 3z = 0,
y = a + fix + yx2
2x + (4 + t)y + 6z = 0,
is required which passes through the three points (xl, y,),
(x2, y2X and (x3, y3). When solutions exist, find a, /?, and 3x -I- 6y + (9 + t)z = 0,
y, and discuss the cases where there are no solutions. have nontrivial solutions? Find all solutions in each
case.
12.14. Find all values of the constants A and /i in
order that the equations 12.18. (Section 12.4). For what values of k do the
equations
x + y + z = 4,
kxx + 4x2 — x3 4- 3x4 = 0,
x — y 4- z = 2,
4xj 4- kx2 — x3 + 3x4 = 0,
2x + y — Xz = n,
4xj — x2 + kx3 + 3x4 = 0,
may have (a) just one solution, (b) no solutions, (c)
4x, — x2 -t- 3x3 + kx4 = 0,
an infinite set of solutions.
have nontrivial solutions?
12.15. (Section 12.2). For each of the sets of equations
below, set up the augmented matrix and, using ele¬ 12.19. (Section 12.3). Show that the equations
mentary row operations, decide on the consistency of the Xj + 2x2 + 3x3 = 4,
equations. If they are consistent, obtain all solutions in
2xj + 3x2 + 8x3 — x4 = 20,
each case.
2xy + 5x2 + 4x3 + x4 = 5,
(a) x + y + z = 3,
are inconsistent.
3x + 5 y + z = — 1,

x + 2y = 0; 12.20. (Section 12.2). Find the inverses of


" 1 A 0" ' 1 0 o"
(b) y + z = 1,

x + y + 2z = 3,
0 1 A and P 1 0

x y =1; _0 0 1_ _0 F 1_

(c) x + 2y + z = 4, Hence find the inverse of


1 T 2/1 A 0
x+ y = -1,
/r 1 + Xfi A
3x + 4y — z = 12.
0 n 1_
12.16. (Section 12.4). Find all solutions of the deter¬ Find the inverse of
minant equation
' 13 3 0"
1 -k 2 -1
4 13 3 .
2 1-k -1 = 0.
0 4 1_
-1 -1 2 — k

What are the values of k for which the following set 10.21. (Section 12.4). Express the determinant
of equations has nontrivial solutions? 1 1 1
(1 — k)x + 2y — z = 0, a2 b2 c2

2x + (1 — k)y — z = 0, a(b + c) b(c + a) c(a + b)


—x — y + (2 — k)z = 0. as the product of factors.

Obtain the values of a, fi, and c for which nontrivial x, — 2x2 + x3 = 4,


solutions of
Xj - x2 - x3 = 1,
x 4- a2y + a(b + c)z = 0, 2x, + 3x2 — 4x3 = 4,
x + b2y + b(c + a)z = 0, (the matrix in this case is not diagonally dominant.)
x + c2y + c(a + b)z = 0,
12.24. (Section 12.5). Show that one row of the matrix
exist. Find the complete solution in the case of coefficients fails to be dominant in the system
a + b = — c.
6,Xj — x2 + x3 = 2,
3xj + 2x2 + x3 = 1,
12.22. (Section 12.5). Using the Gauss-Seidel iterative
scheme solve the system of equations: X] — x2 + 4x3 = 5.

3xj + x2 + x3 = 5, However, confirm that the Gauss-Seidel scheme delivers


a solution accurate to four significant figures after ten
6x2 — 2x3 4- 3x4 = 6,
iterations starting at (0, 0, 1).
*i + 4x3 - 2x4 = 1,
12.25. For comparison purposes with the Gauss-Seidel
x2 + 2x3 — 4x4 = 2.
method, solve the equations (12.7)—(12.9), namely
Show that the iterations converge to a solution accurate
*1 = - *3 - 1),
to four significant figures within eleven steps, starting
*2 = 4O1 - *3 - 8)>
from (0, 1,0,0). Confirm also that the matrix of coeffi¬
cients is diagonally dominant. x3 = i( —2x, x2 14),
using the Jacobi method. How many steps are required
12.23. (Section 12.5). Show that the Gauss-Seidel scheme to achieve the same accuracy as that in the table in
fails for the system Section 12.5, i.e. to five significant figures?
13 Eigenvalues and eigenvectors

Contents
13.1 Eigenvalues of a matrix
13.2 Eigenvectors
13.3 Linear dependence
13.4 Diagonalization of a matrix
13.5 Powers of matrices
13.6 Quadratic forms
13.7 Positive-definite matrices
13.8 An application to a vibrating system
Problems

13.1 Eigenvalues of a matrix


With any square matrix $A$, we can associate a set of homogeneous
linear equations $Ax = 0$. As we saw in Section 12.4 of the previous
chapter, such a set of equations will only have a nontrivial solution
set if $\det A = 0$. Consider now the $n \times n$ set of equations

\[ Ax = \lambda x, \quad \text{or} \quad (A - \lambda I_n)x = 0, \tag{13.1} \]

where $\lambda$ is a parameter, and $I_n$ is the unit matrix (see Section 7.3(5)).
In order for these equations to have nontrivial solutions, we must
have

\[ \det(A - \lambda I_n) = 0. \]

This can only be satisfied if $\lambda$ takes certain values. These are called
the eigenvalues of the matrix $A$, and the determinant equation they satisfy
is called the characteristic equation of $A$. The characteristic
equation is a polynomial equation, of degree $n$ in $\lambda$. We usually list
the eigenvalues as $\lambda_1$, $\lambda_2$, and so on.

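Example 13.1. Find the eigenvalues of

\[ A = \begin{bmatrix} 1 & 3\\ 2 & 2 \end{bmatrix}. \]

The eigenvalues of $A$ are given by the determinant equation

\[ \det(A - \lambda I_2) = \begin{vmatrix} 1 - \lambda & 3\\ 2 & 2 - \lambda \end{vmatrix} = 0, \]

which can be expanded into

\[ (1 - \lambda)(2 - \lambda) - 6 = 0, \quad \text{or} \quad \lambda^2 - 3\lambda - 4 = 0. \]

This factorizes into $(\lambda - 4)(\lambda + 1) = 0$: hence the eigenvalues are $\lambda_1 = -1$,
$\lambda_2 = 4$.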

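Example 13.2. Find the eigenvalues of

\[ A = \begin{bmatrix} 2 & -2\\ 1 & 4 \end{bmatrix}. \]

In this case

\[
\det(A - \lambda I_2) = \begin{vmatrix} 2 - \lambda & -2\\ 1 & 4 - \lambda \end{vmatrix}
= (2 - \lambda)(4 - \lambda) + 2 = \lambda^2 - 6\lambda + 10 = 0,
\]

and the quadratic equation has the roots

\[ \lambda = \tfrac{1}{2}[6 \pm \sqrt{36 - 40}] = 3 \pm j. \]

Thus real matrices can have complex eigenvalues.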

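Example 13.3. Find the eigenvalues of

\[ A = \begin{bmatrix} 1 & 2 & 1\\ 2 & 1 & 1\\ 1 & 1 & 2 \end{bmatrix}. \]

Here

\[
\det(A - \lambda I_3)
= \begin{vmatrix} 1 - \lambda & 2 & 1\\ 2 & 1 - \lambda & 1\\ 1 & 1 & 2 - \lambda \end{vmatrix}
= \begin{vmatrix} 4 - \lambda & 4 - \lambda & 4 - \lambda\\ 2 & 1 - \lambda & 1\\ 1 & 1 & 2 - \lambda \end{vmatrix}
\quad (r_1' = r_1 + r_2 + r_3)
\]
\[
= (4 - \lambda)\begin{vmatrix} 1 & 1 & 1\\ 2 & 1 - \lambda & 1\\ 1 & 1 & 2 - \lambda \end{vmatrix}
= (4 - \lambda)\begin{vmatrix} 1 & 0 & 0\\ 2 & -1 - \lambda & -1\\ 1 & 0 & 1 - \lambda \end{vmatrix}
\]
\[ = (4 - \lambda)(-1 - \lambda)(1 - \lambda) = 0, \]

if $\lambda = 4$ or $\pm 1$. Hence the eigenvalues are
$\lambda_1 = 4$, $\lambda_2 = 1$, $\lambda_3 = -1$.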

Eigenvalues
The eigenvalues of the $n \times n$ square matrix $A$ are the
solutions $\lambda$ of the determinant equation

\[ \det(A - \lambda I_n) = 0. \tag{13.2} \]
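
For a 2 x 2 matrix the characteristic equation is the quadratic $\lambda^2 - (\text{trace})\lambda + \det A = 0$, so its roots can be computed directly. The sketch below, with the assumed helper name `eigenvalues_2x2`, uses Python's `cmath` module so that complex roots such as those of Example 13.2 are handled as well as real ones.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of lambda^2 - (trace) lambda + (determinant) = 0 for a 2x2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

print(eigenvalues_2x2([[1, 3], [2, 2]]))    # matrix of Example 13.1
print(eigenvalues_2x2([[2, -2], [1, 4]]))   # matrix of Example 13.2
```

The results are returned as Python complex numbers; a real eigenvalue simply has zero imaginary part.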

13.2 Eigenvectors

Associated with each eigenvalue $\lambda$ of $A$, there will be an infinite
number of nontrivial solutions of the equation $(A - \lambda I_n)x = 0$.
These are called the eigenvectors of $A$ corresponding to the
eigenvalue $\lambda$, and are generally denoted in this text by $s$. Thus, if $\lambda$ is an
eigenvalue of $A$, then there will exist a corresponding eigenvector
$s \neq 0$ of

\[ (A - \lambda I_n)s = 0. \]

The solutions of this set of linear equations can be found by
Gaussian elimination.

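Example 13.4. Find the eigenvectors of

\[ A = \begin{bmatrix} 1 & 3\\ 2 & 2 \end{bmatrix}. \]

From Example 13.1, the eigenvalues are $\lambda_1 = 4$ and $\lambda_2 = -1$. Let the
corresponding eigenvectors be

\[ s_1 = \begin{bmatrix} a_1\\ b_1 \end{bmatrix}, \quad s_2 = \begin{bmatrix} a_2\\ b_2 \end{bmatrix}. \]

Thus $(A - \lambda_1 I_2)s_1 = 0$ becomes

\[
\begin{bmatrix} 1 - 4 & 3\\ 2 & 2 - 4 \end{bmatrix}
\begin{bmatrix} a_1\\ b_1 \end{bmatrix}
= \begin{bmatrix} 0\\ 0 \end{bmatrix},
\quad \text{or} \quad
\begin{cases} -3a_1 + 3b_1 = 0,\\ 2a_1 - 2b_1 = 0. \end{cases}
\]

Solution is easy in this case, and the solutions can be expressed as $a_1 = b_1 = \alpha$
for any $\alpha$. If we put $\alpha = 1$, then an eigenvector is

\[ s_1 = \begin{bmatrix} 1\\ 1 \end{bmatrix}. \]

Any nonzero value of $\alpha$ will give an eigenvector; we usually choose a convenient
value for the parameter to give one solution. The others are multiples of this.
Similarly $(A - \lambda_2 I_2)s_2 = 0$ implies

\[
\begin{bmatrix} 1 + 1 & 3\\ 2 & 2 + 1 \end{bmatrix}
\begin{bmatrix} a_2\\ b_2 \end{bmatrix} = 0,
\quad \text{or} \quad
\begin{cases} 2a_2 + 3b_2 = 0,\\ 2a_2 + 3b_2 = 0. \end{cases}
\]

The eigenvectors for this case are

\[ s_2 = \begin{bmatrix} \beta\\ -\tfrac{2}{3}\beta \end{bmatrix} \]

for any nonzero $\beta$. As before, we choose a particular value of $\beta$ which makes
the eigenvector specific and simple. In this case we could put $\beta = 3$ to give the
eigenvector

\[ s_2 = \begin{bmatrix} 3\\ -2 \end{bmatrix}. \]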

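Example 13.5. Find the eigenvectors of

\[ A = \begin{bmatrix} 1 & 2 & 1\\ 2 & 1 & 1\\ 1 & 1 & 2 \end{bmatrix}. \]

The eigenvalues of $A$ are $\lambda_1 = 4$, $\lambda_2 = 1$, $\lambda_3 = -1$ (see Example 13.3). Let the
corresponding eigenvectors be

\[ s_i = \begin{bmatrix} a_i\\ b_i\\ c_i \end{bmatrix} \quad (i = 1, 2, 3). \]

In each case, we need to solve $(A - \lambda_i I_3)s_i = 0$. If $\lambda_1 = 4$, then

\[
\begin{aligned}
-3a_1 + 2b_1 + c_1 &= 0,\\
2a_1 - 3b_1 + c_1 &= 0,\\
a_1 + b_1 - 2c_1 &= 0.
\end{aligned}
\]

Gaussian elimination leads to

\[
\left[\begin{array}{ccc|c} -3 & 2 & 1 & 0\\ 2 & -3 & 1 & 0\\ 1 & 1 & -2 & 0 \end{array}\right]
\to
\left[\begin{array}{ccc|c} -3 & 2 & 1 & 0\\ 0 & -\tfrac{5}{3} & \tfrac{5}{3} & 0\\ 0 & \tfrac{5}{3} & -\tfrac{5}{3} & 0 \end{array}\right]
\quad (r_2' = r_2 + \tfrac{2}{3}r_1,\ r_3' = r_3 + \tfrac{1}{3}r_1)
\]
\[
\to
\left[\begin{array}{ccc|c} -3 & 2 & 1 & 0\\ 0 & -\tfrac{5}{3} & \tfrac{5}{3} & 0\\ 0 & 0 & 0 & 0 \end{array}\right]
\quad (r_3' = r_3 + r_2).
\]

By back substitution, $c_1 = \alpha$, $b_1 = c_1 = \alpha$, $a_1 = \tfrac{1}{3}(2b_1 + c_1) = \alpha$. Thus, with
$\alpha = 1$, an eigenvector is

\[ s_1 = \begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}. \]

The other eigenvectors corresponding to $\lambda_1$ are simply multiples of $s_1$. Using
the same procedure shows that the two eigenvectors corresponding respectively
to $\lambda_2$ and $\lambda_3$ can be chosen to be

\[ s_2 = \begin{bmatrix} -1\\ -1\\ 2 \end{bmatrix}, \quad s_3 = \begin{bmatrix} 1\\ -1\\ 0 \end{bmatrix}. \]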

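Example 13.6. Find the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} 1 & 2 & -1\\ 1 & 2 & -1\\ 2 & 2 & -1 \end{bmatrix}. \]

In this example,

\[
\det(A - \lambda I_3)
= \begin{vmatrix} 1 - \lambda & 2 & -1\\ 1 & 2 - \lambda & -1\\ 2 & 2 & -1 - \lambda \end{vmatrix}
= \begin{vmatrix} -\lambda & \lambda & 0\\ 1 & 2 - \lambda & -1\\ 2 & 2 & -1 - \lambda \end{vmatrix}
\quad (r_1' = r_1 - r_2)
\]
\[
= \begin{vmatrix} -\lambda & 0 & 0\\ 1 & 3 - \lambda & -1\\ 2 & 4 & -1 - \lambda \end{vmatrix}
\quad (c_2' = c_2 + c_1)
\]
\[ = -\lambda[(3 - \lambda)(-1 - \lambda) + 4] \]
\[ = -\lambda(\lambda - 1)^2. \]

This particular matrix has an eigenvalue 0 and a repeated eigenvalue 1. How
does this affect the eigenvectors? Let the eigenvectors be, for $\lambda_1 = 0$ and $\lambda_2 = 1$,

\[ s_i = \begin{bmatrix} a_i\\ b_i\\ c_i \end{bmatrix} \quad (i = 1, 2). \]

For $\lambda_1 = 0$,

\[
\begin{aligned}
a_1 + 2b_1 - c_1 &= 0,\\
a_1 + 2b_1 - c_1 &= 0,\\
2a_1 + 2b_1 - c_1 &= 0.
\end{aligned}
\]

Hence $a_1 = 0$, $b_1 = \alpha$, $c_1 = 2\alpha$, for any $\alpha$. An eigenvector is

\[ s_1 = \begin{bmatrix} 0\\ 1\\ 2 \end{bmatrix}. \]

For $\lambda_2 = 1$,

\[
\begin{aligned}
2b_2 - c_2 &= 0,\\
a_2 + b_2 - c_2 &= 0,\\
2a_2 + 2b_2 - 2c_2 &= 0.
\end{aligned}
\]

If we let $b_2 = \beta$, then $c_2 = 2\beta$ and $a_2 = c_2 - b_2 = \beta$. Hence we can associate
with $\lambda_2 = 1$ the eigenvector

\[ s_2 = \begin{bmatrix} 1\\ 1\\ 2 \end{bmatrix} \]

by putting $\beta = 1$. There are only two independent eigenvectors in this example.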

Note that if $A$ has a zero eigenvalue, then $A$ must be a singular
matrix since $\det A = 0$. And conversely, if $A$ is singular, then $A$ has
at least one zero eigenvalue.
The matrix in Example 13.6 has two eigenvalues (one repeated)
and two eigenvectors. The meaning of this reduced eigenvector set
will be illustrated in the context of coordinate transformations in
Section 13.4. As the next example illustrates, a matrix can have a
repeated eigenvalue but still retain a full set of independent
eigenvectors.
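
Example 13.7. Find the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} 3 & 0 & -1\\ 0 & 1 & 0\\ 2 & 0 & 0 \end{bmatrix}. \]

Thus

\[
\det(A - \lambda I_3)
= \begin{vmatrix} 3 - \lambda & 0 & -1\\ 0 & 1 - \lambda & 0\\ 2 & 0 & -\lambda \end{vmatrix}
\]
\[ = (3 - \lambda)(1 - \lambda)(-\lambda) + (-1)(-2)(1 - \lambda) \]
\[ = (1 - \lambda)[-3\lambda + \lambda^2 + 2] \]
\[ = -(\lambda - 2)(\lambda - 1)^2. \]

Let $\lambda_1 = 2$ and $\lambda_2 = 1$ with corresponding eigenvectors

\[ s_i = \begin{bmatrix} a_i\\ b_i\\ c_i \end{bmatrix} \quad (i = 1, 2). \]

For $\lambda_1 = 2$,

\[
\begin{aligned}
a_1 - c_1 &= 0,\\
-b_1 &= 0,\\
2a_1 - 2c_1 &= 0.
\end{aligned}
\]

We can let $b_1 = 0$, $c_1 = \alpha$, $a_1 = \alpha$. Hence we can choose

\[ s_1 = \begin{bmatrix} 1\\ 0\\ 1 \end{bmatrix}. \]

For $\lambda_2 = 1$,

\[
\begin{aligned}
2a_2 - c_2 &= 0,\\
0 &= 0,\\
2a_2 - c_2 &= 0.
\end{aligned}
\]

If $a_2 = \beta$, then $c_2 = 2\beta$ but $b_2$ can then take any value $\gamma$, say. Hence, the
eigenvector set is

\[
s_2 = \begin{bmatrix} \beta\\ \gamma\\ 2\beta \end{bmatrix}
= \beta \begin{bmatrix} 1\\ 0\\ 2 \end{bmatrix} + \gamma \begin{bmatrix} 0\\ 1\\ 0 \end{bmatrix},
\]

that is, it contains two parameters $\beta$ and $\gamma$. The choices of $\beta = 1$ with $\gamma = 0$, and
$\beta = 0$ with $\gamma = 1$, say, give two independent eigenvectors

\[ \begin{bmatrix} 1\\ 0\\ 2 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0\\ 1\\ 0 \end{bmatrix}. \]

Unlike in the previous example, we can associate three independent
eigenvectors with this matrix even though the matrix has only two
distinct eigenvalues. We shall take up this point again in connection with
the diagonalization of matrices.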

Eigenvectors
The eigenvectors of a square matrix $A$ are the nontrivial
solutions $s$ of the homogeneous equations

\[ (A - \lambda_r I_n)s = 0, \tag{13.3} \]

for each eigenvalue $\lambda_r$.
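
A proposed eigenvalue and eigenvector can always be checked by substituting into $As = \lambda s$. The following Python sketch does this for the three pairs found in Example 13.5; the helper names `mat_vec` and `is_eigenpair` are assumptions introduced for illustration.

```python
def mat_vec(m, v):
    """Product of a square matrix (list of rows) with a column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def is_eigenpair(m, lam, s):
    """True if s is nonzero and m s equals lam s exactly (integer data)."""
    return any(s) and mat_vec(m, s) == [lam * si for si in s]

A = [[1, 2, 1], [2, 1, 1], [1, 1, 2]]
pairs = [(4, [1, 1, 1]), (1, [-1, -1, 2]), (-1, [1, -1, 0])]
for lam, s in pairs:
    print(lam, s, is_eigenpair(A, lam, s))
```

The comparison is exact only because all entries are integers; with floating-point data a small tolerance would be needed instead of `==`.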

13.3 Linear dependence

It is useful in mathematics to gather, in a collection or set,
elements which have common features. For example, we might
consider the set of all integers, the set of all fractions, or the set of
all real numbers. In a similar way, we can gather all $m \times n$ matrices.
They all obey certain rules, and are said to form a vector space. We
shall not consider the general case here, but restrict ourselves to the
set of all $m \times 1$ column vectors: this set is called an $m$-dimensional
vector space $V_m$. These vectors obey the rules of matrix algebra.
Thus if

\[
s_1 = \begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_m \end{bmatrix},
\quad
s_2 = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{bmatrix},
\]

then $s_1$ and $s_2$ belong to $V_m$, and so does $\alpha s_1 + \beta s_2$ for any constants
$\alpha$ and $\beta$.
An important set of vectors in $V_m$ is the set of base vectors

\[
e_1 = \begin{bmatrix} 1\\ 0\\ \vdots\\ 0 \end{bmatrix},
\quad
e_2 = \begin{bmatrix} 0\\ 1\\ \vdots\\ 0 \end{bmatrix},
\quad \ldots, \quad
e_m = \begin{bmatrix} 0\\ 0\\ \vdots\\ 1 \end{bmatrix}.
\]

Any vector in $V_m$ can be expressed as a linear combination of these
vectors. Thus

\[
s_1 = \begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_m \end{bmatrix}
= a_1 e_1 + a_2 e_2 + \cdots + a_m e_m.
\]

The set of vectors $\{e_1, e_2, \ldots, e_m\}$ is said, therefore, to form a basis
of $V_m$. None of the vectors $e_1, e_2, \ldots, e_m$ can be expressed as a linear
combination of the others, so that they are said to be linearly
independent. A set of $n$ column vectors $s_1, s_2, \ldots, s_n$ is said to be
linearly dependent if there exist constants $\alpha_1, \alpha_2, \ldots, \alpha_n$, not all
zero, such that

\[ \alpha_1 s_1 + \alpha_2 s_2 + \cdots + \alpha_n s_n = 0. \]

If the above equation holds only when $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$, then
the vectors are linearly independent. It can be proved that any set
of $m$ linearly independent vectors can form a basis of the vector
space $V_m$.

Example 13.8. Show that the column vectors

\[ a_1 = (1, 1, 0)^T, \quad a_2 = (1, 0, 1)^T, \quad a_3 = (0, 1, 1)^T \]

form a basis in three dimensions.

We must test whether

\[ x a_1 + y a_2 + z a_3 = 0 \]

has non-zero solutions for $x$, $y$, $z$. The equations in full are

\[
\begin{aligned}
x + y &= 0,\\
x + z &= 0,\\
y + z &= 0.
\end{aligned}
\]

The determinant of the coefficients is

\[ D = \begin{vmatrix} 1 & 1 & 0\\ 1 & 0 & 1\\ 0 & 1 & 1 \end{vmatrix} = -2 \neq 0. \]

By (12.6) the only solution is $x = y = z = 0$. The vectors are therefore linearly
independent and can form a basis.

By a similar argument it can be shown that

\[ b_1 = (1, 1, 0)^T, \quad b_2 = (1, 0, -1)^T, \quad b_3 = (0, 1, 1)^T \]

are linearly dependent and therefore cannot form a basis.
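
The determinant test used in this example can be sketched in Python as follows. The function names `det3`, `columns_to_matrix`, and `is_basis_3d` are assumptions made for illustration; the test applies only to three vectors in three dimensions.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def columns_to_matrix(vectors):
    """Matrix whose columns are the three given vectors."""
    return [[v[i] for v in vectors] for i in range(3)]

def is_basis_3d(vectors):
    """Three vectors form a basis exactly when the determinant is nonzero."""
    return det3(columns_to_matrix(vectors)) != 0

first_set = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
second_set = [(1, 1, 0), (1, 0, -1), (0, 1, 1)]
print(det3(columns_to_matrix(first_set)), is_basis_3d(first_set))
print(det3(columns_to_matrix(second_set)), is_basis_3d(second_set))
```

In the second set the dependence can also be seen directly, since the first vector minus the second equals the third.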

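13.4 Diagonalization of a matrix

We will take a constructive approach to this problem for a $3 \times 3$
matrix. Consider the matrix of Examples 13.3 and 13.5, namely

\[ A = \begin{bmatrix} 1 & 2 & 1\\ 2 & 1 & 1\\ 1 & 1 & 2 \end{bmatrix}, \]

which has the eigenvalues $\lambda_1 = 4$, $\lambda_2 = 1$, $\lambda_3 = -1$ and eigenvectors

\[
s_1 = \begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix},
\quad
s_2 = \begin{bmatrix} -1\\ -1\\ 2 \end{bmatrix},
\quad
s_3 = \begin{bmatrix} 1\\ -1\\ 0 \end{bmatrix}.
\]

Construct a matrix $C$ which has these eigenvectors as its columns:

\[ C = [s_1\ s_2\ s_3] = \begin{bmatrix} 1 & -1 & 1\\ 1 & -1 & -1\\ 1 & 2 & 0 \end{bmatrix}. \]

Form the product

\[ AC = A[s_1\ s_2\ s_3] = [As_1\ As_2\ As_3] = [\lambda_1 s_1\ \lambda_2 s_2\ \lambda_3 s_3], \]

the last equality holding since the eigenvector $s_i$ is defined as a
nonzero solution of $As_i = \lambda_i s_i$. Hence

\[ AC = [s_1\ s_2\ s_3]D = CD, \]

where

\[ D = \begin{bmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3 \end{bmatrix}, \]

that is, $D$ is a diagonal matrix with eigenvalue elements. If we
premultiply this equation by $C^{-1}$, then

\[ C^{-1}AC = C^{-1}CD = I_3 D = D. \]

Effectively the matrix $C$ has diagonalized the matrix $A$. In the
example,

\[
C^{-1} = \begin{bmatrix} 1 & -1 & 1\\ 1 & -1 & -1\\ 1 & 2 & 0 \end{bmatrix}^{-1}
= \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3}\\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3}\\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix}.
\]

Finally, it can be checked that

\[
C^{-1}AC =
\begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3}\\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3}\\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 1\\ 2 & 1 & 1\\ 1 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & -1 & 1\\ 1 & -1 & -1\\ 1 & 2 & 0 \end{bmatrix}
= \begin{bmatrix} 4 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & -1 \end{bmatrix}.
\]

It might appear at first sight that there is not a unique answer
for $D$ since $C$ is not uniquely defined. However, if $C$ is replaced by
$kC$, with $k \neq 0$, then $(kC)^{-1} = k^{-1}C^{-1}$, from which it still follows
that

\[ (kC)^{-1}A(kC) = \frac{1}{k}C^{-1}ACk = C^{-1}AC = D, \]

since the $k$ always cancels out. Individual columns in $C$ can also be
multiplied by different factors, depending on the choice of eigenvector,
without changing the outcome.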

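Example 13.9. Use the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} 3 & 0 & -1\\ 0 & 1 & 0\\ 2 & 0 & 0 \end{bmatrix} \]

obtained in Example 13.7 to construct a transformation which diagonalizes $A$,
and verify that the diagonalized matrix is

\[ D = \begin{bmatrix} 2 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}. \]

From Example 13.7, we see that $A$ has the eigenvalues $\lambda_1 = 2$ and $\lambda_2 = \lambda_3 = 1$.
However, we can associate two linearly independent eigenvectors with the
repeated eigenvalue. Thus, we can define $C$ by

\[ C = [s_1\ s_2\ s_3] = \begin{bmatrix} 1 & 1 & 0\\ 0 & 0 & 1\\ 1 & 2 & 0 \end{bmatrix}. \]

Its inverse is

\[ C^{-1} = \begin{bmatrix} 2 & 0 & -1\\ -1 & 0 & 1\\ 0 & 1 & 0 \end{bmatrix}. \]

Finally it can be verified that

\[
C^{-1}AC =
\begin{bmatrix} 2 & 0 & -1\\ -1 & 0 & 1\\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 3 & 0 & -1\\ 0 & 1 & 0\\ 2 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 0\\ 0 & 0 & 1\\ 1 & 2 & 0 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} = D.
\]

Following the remarks just before this example,

\[ C = [2s_1\ 3s_2\ {-s_3}] = \begin{bmatrix} 2 & 3 & 0\\ 0 & 0 & -1\\ 2 & 6 & 0 \end{bmatrix} \]

would equally well be an acceptable matrix in the diagonalization.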

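Example 13.10. Find a transformation which diagonalizes the matrix

\[ A = \begin{bmatrix} 2 & -2\\ 1 & 4 \end{bmatrix}. \]

From Example 13.2, the eigenvalues are $\lambda_1 = 3 + j$, $\lambda_2 = 3 - j$. Corresponding
eigenvectors are

\[ s_1 = \begin{bmatrix} -1 + j\\ 1 \end{bmatrix}, \quad s_2 = \begin{bmatrix} -1 - j\\ 1 \end{bmatrix}. \]

The eigenvalues and eigenvectors are complex-valued but this does not affect
the method. The matrix $C$ becomes

\[ C = [s_1\ s_2] = \begin{bmatrix} -1 + j & -1 - j\\ 1 & 1 \end{bmatrix}. \]

Its inverse is

\[
C^{-1} = \frac{1}{\det C}\begin{bmatrix} 1 & 1 + j\\ -1 & -1 + j \end{bmatrix}
= \frac{1}{2j}\begin{bmatrix} 1 & 1 + j\\ -1 & -1 + j \end{bmatrix}.
\]

Finally check that

\[
C^{-1}AC = \frac{1}{2j}
\begin{bmatrix} 1 & 1 + j\\ -1 & -1 + j \end{bmatrix}
\begin{bmatrix} 2 & -2\\ 1 & 4 \end{bmatrix}
\begin{bmatrix} -1 + j & -1 - j\\ 1 & 1 \end{bmatrix}
= \begin{bmatrix} 3 + j & 0\\ 0 & 3 - j \end{bmatrix}.
\]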

Diagonalizing a matrix
To diagonalize a matrix $A$:
(i) find the eigenvalues of $A$;
(ii) find $n$ linearly independent eigenvectors $s_1, \ldots, s_n$ of $A$
(if they exist);
(iii) construct the matrix $C$ of eigenvectors;
(iv) calculate the inverse $C^{-1}$ of $C$;
(v) compute $C^{-1}AC$.        (13.4)
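
Steps (iii) to (v) can be carried out exactly with rational arithmetic. The sketch below assumes the eigenvectors are already known (here those of Examples 13.3 and 13.5) and uses Python's `fractions` module; the helper names `mat_mul` and `inverse` are illustrative, and the Gauss-Jordan inverse is just one convenient way of performing step (iv).

```python
from fractions import Fraction

def mat_mul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def inverse(m):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

A = [[1, 2, 1], [2, 1, 1], [1, 1, 2]]
C = [[1, -1, 1], [1, -1, -1], [1, 2, 0]]   # columns are s1, s2, s3
C_inv = inverse(C)
D = mat_mul(mat_mul(C_inv, A), C)
for row in D:
    print([str(v) for v in row])
```

If the columns of `C` were not linearly independent, `inverse` would raise an error, which corresponds to the case where no diagonalizing matrix exists.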

Not all matrices can be diagonalized in this way. In Example 13.6
where

\[ A = \begin{bmatrix} 1 & 2 & -1\\ 1 & 2 & -1\\ 2 & 2 & -1 \end{bmatrix}, \]

we can associate only two linearly independent eigenvectors with
the eigenvalue 0 and the repeated eigenvalue 1, and no diagonalizing
matrix $C$ can be constructed.

13.5 Powers of matrices

The transformation $C$ of the previous section can be used to obtain
a formula for calculating powers of square matrices. This follows
since it is a simple matter to find powers of diagonal matrices. Thus if

\[ D = \begin{bmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3 \end{bmatrix}, \]

then

\[
D^2 = \begin{bmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3 \end{bmatrix}
\begin{bmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3 \end{bmatrix}
= \begin{bmatrix} \lambda_1^2 & 0 & 0\\ 0 & \lambda_2^2 & 0\\ 0 & 0 & \lambda_3^2 \end{bmatrix},
\]

and, in general,

\[ D^n = \begin{bmatrix} \lambda_1^n & 0 & 0\\ 0 & \lambda_2^n & 0\\ 0 & 0 & \lambda_3^n \end{bmatrix}. \]

In the previous section we showed that, if a $3 \times 3$ matrix $A$ has
three linearly independent eigenvectors, then we can find a matrix
$C$ such that

\[ AC = CD, \]

where $D$ is a diagonal matrix, its elements consisting of the
eigenvalues of $A$. Thus by multiplying on the right by $C^{-1}$ we find
that

\[ A = CDC^{-1}. \]

Hence

\[ A^2 = CDC^{-1}CDC^{-1} = CDI_3DC^{-1} = CD^2C^{-1}, \]

since $C^{-1}C = I_3$. Continuing this process, we find that

\[ A^3 = A^2A = CD^2C^{-1}CDC^{-1} = CD^3C^{-1}, \]

and, in general,

\[ A^n = CD^nC^{-1}. \]

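Example 13.11. Find a formula for $A^n$, where

\[ A = \begin{bmatrix} 1 & 2 & 1\\ 2 & 1 & 1\\ 1 & 1 & 2 \end{bmatrix}. \]

(See Examples 13.3 and 13.5 and Section 13.4.)
The eigenvalues of $A$ are $\lambda_1 = 4$, $\lambda_2 = 1$, $\lambda_3 = -1$; and the diagonalizing
transformation, with its inverse, is

\[
C = \begin{bmatrix} 1 & -1 & 1\\ 1 & -1 & -1\\ 1 & 2 & 0 \end{bmatrix},
\quad
C^{-1} = \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3}\\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3}\\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix}.
\]

Hence

\[
A^n = CD^nC^{-1} =
\begin{bmatrix} 1 & -1 & 1\\ 1 & -1 & -1\\ 1 & 2 & 0 \end{bmatrix}
\begin{bmatrix} 4^n & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & (-1)^n \end{bmatrix}
\begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3}\\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3}\\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix}
\]
\[
= \begin{bmatrix} 4^n & -1 & (-1)^n\\ 4^n & -1 & -(-1)^n\\ 4^n & 2 & 0 \end{bmatrix}
\begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{3}\\ -\tfrac{1}{6} & -\tfrac{1}{6} & \tfrac{1}{3}\\ \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix}
\]
\[
= \frac{4^n}{3}\begin{bmatrix} 1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1 \end{bmatrix}
+ \frac{1}{6}\begin{bmatrix} 1 & 1 & -2\\ 1 & 1 & -2\\ -2 & -2 & 4 \end{bmatrix}
+ \frac{(-1)^n}{2}\begin{bmatrix} 1 & -1 & 0\\ -1 & 1 & 0\\ 0 & 0 & 0 \end{bmatrix}.
\]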

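Example 13.12. Let

\[ P = \begin{bmatrix} 1 - \alpha & \alpha\\ \beta & 1 - \beta \end{bmatrix}, \]

where $0 < \alpha, \beta < 1$. Find $P^n$ and $\lim_{n \to \infty} P^n$.
The matrix $P$ is an example of a row-stochastic matrix, that is, all elements
are non-negative and the sum of the elements in each row is 1. The eigenvalues
of $P$ are given by

\[ \begin{vmatrix} 1 - \alpha - \lambda & \alpha\\ \beta & 1 - \beta - \lambda \end{vmatrix} = 0. \]

Hence

\[ (1 - \alpha - \lambda)(1 - \beta - \lambda) - \alpha\beta = 0, \]

or

\[ \lambda^2 - \lambda(2 - \alpha - \beta) + 1 - \alpha - \beta = 0. \]

The roots are $\lambda_1 = 1$, $\lambda_2 = 1 - \alpha - \beta = p$, say. Choose the corresponding
eigenvectors

\[ s_1 = \begin{bmatrix} 1\\ 1 \end{bmatrix}, \quad s_2 = \begin{bmatrix} -\alpha\\ \beta \end{bmatrix}. \]

Let

\[ C = [s_1\ s_2] = \begin{bmatrix} 1 & -\alpha\\ 1 & \beta \end{bmatrix}. \]

Its inverse is given by

\[ C^{-1} = \frac{1}{\alpha + \beta}\begin{bmatrix} \beta & \alpha\\ -1 & 1 \end{bmatrix}. \]

Thus

\[
P^n = CD^nC^{-1} =
\begin{bmatrix} 1 & -\alpha\\ 1 & \beta \end{bmatrix}
\begin{bmatrix} 1 & 0\\ 0 & p^n \end{bmatrix}
\begin{bmatrix} \beta & \alpha\\ -1 & 1 \end{bmatrix}\frac{1}{\alpha + \beta}
\]
\[
= \frac{1}{\alpha + \beta}
\begin{bmatrix} 1 & -\alpha p^n\\ 1 & \beta p^n \end{bmatrix}
\begin{bmatrix} \beta & \alpha\\ -1 & 1 \end{bmatrix}
\]
\[
= \frac{1}{\alpha + \beta}
\begin{bmatrix} \beta + \alpha p^n & \alpha - \alpha p^n\\ \beta - \beta p^n & \alpha + \beta p^n \end{bmatrix}
\]
\[
= \frac{1}{\alpha + \beta}\begin{bmatrix} \beta & \alpha\\ \beta & \alpha \end{bmatrix}
+ \frac{p^n}{\alpha + \beta}\begin{bmatrix} \alpha & -\alpha\\ -\beta & \beta \end{bmatrix}.
\]

Since $0 < \alpha < 1$ and $0 < \beta < 1$, it follows that

\[ p = 1 - \alpha - \beta < 1 \quad \text{and} \quad p = 1 - \alpha - \beta > 1 - 1 - 1 = -1, \]

that is, $|p| < 1$. As $n \to \infty$, then $p^n \to 0$ and

\[ P^n \to \frac{1}{\alpha + \beta}\begin{bmatrix} \beta & \alpha\\ \beta & \alpha \end{bmatrix}. \]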
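
The closed form for $P^n$ can be compared with direct repeated multiplication. In the Python sketch below the values $\alpha = 3/10$ and $\beta = 1/5$ are arbitrary illustrative choices, and the function names are assumptions; exact fractions are used so that the two results can be compared with `==`.

```python
from fractions import Fraction

def mat_mul(p, q):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_by_multiplication(m, n):
    """m^n by multiplying the identity matrix by m, n times."""
    result = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for _ in range(n):
        result = mat_mul(result, m)
    return result

def power_closed_form(alpha, beta, n):
    """P^n = (1/(a+b)) ([[b, a], [b, a]] + p^n [[a, -a], [-b, b]]), p = 1 - a - b."""
    p = 1 - alpha - beta
    s = alpha + beta
    return [[(beta + alpha * p**n) / s, (alpha - alpha * p**n) / s],
            [(beta - beta * p**n) / s, (alpha + beta * p**n) / s]]

alpha, beta = Fraction(3, 10), Fraction(1, 5)   # illustrative values only
P = [[1 - alpha, alpha], [beta, 1 - beta]]
print(power_closed_form(alpha, beta, 6) == power_by_multiplication(P, 6))
print([float(v) for v in power_closed_form(alpha, beta, 60)[0]])
```

For these values $p = 1/2$, so the second term dies away quickly and each row of $P^n$ approaches $(\beta, \alpha)/(\alpha + \beta)$.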

Powers of a square matrix
To find the power $A^n$ of a diagonalizable matrix $A$:
(i) find the eigenvalues and eigenvectors of $A$;
(ii) construct a matrix $C$ of eigenvectors such that $C^{-1}AC = D$,
where $D$ is the diagonal matrix of eigenvalues;
(iii) the required answer is
$A^n = CD^nC^{-1}$.        (13.5)
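
The procedure in (13.5) might be sketched as follows for the matrix of Example 13.11, with $C$ and $C^{-1}$ taken from that example rather than computed. The function names are assumptions for illustration, and repeated multiplication is included only as a reference value.

```python
from fractions import Fraction

def mat_mul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

A = [[1, 2, 1], [2, 1, 1], [1, 1, 2]]
eigenvalues = [4, 1, -1]
C = [[1, -1, 1], [1, -1, -1], [1, 2, 0]]
C_inv = [[Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
         [Fraction(-1, 6), Fraction(-1, 6), Fraction(1, 3)],
         [Fraction(1, 2), Fraction(-1, 2), Fraction(0)]]

def power_by_diagonalization(n):
    """A^n = C D^n C^{-1}; D^n raises each diagonal eigenvalue to the power n."""
    d_n = [[eigenvalues[i] ** n if i == j else 0 for j in range(3)]
           for i in range(3)]
    return mat_mul(mat_mul(C, d_n), C_inv)

def power_by_multiplication(n):
    """Reference value: multiply the identity by A, n times."""
    result = [[int(i == j) for j in range(3)] for i in range(3)]
    for _ in range(n):
        result = mat_mul(result, A)
    return result

print(power_by_diagonalization(5) == power_by_multiplication(5))
```

The cost of the diagonalized form does not grow with the number of matrix products as $n$ increases, since only the three scalars on the diagonal are raised to the power $n$.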

13.6 Quadratic forms

Suppose that $x = [x_1, x_2, \ldots, x_n]^T$, an $n$-dimensional column vector
with elements $x_1, x_2, \ldots, x_n$. Any polynomial function of these
elements in which every term is of degree two in them is known as
a quadratic form. Thus, if $n = 3$, then

\[ x_1^2 + 8x_1x_2 + x_2^2 + 6x_2x_3 + x_3^2 \]

is an example of a quadratic form. Quadratic forms can always be
expressed as a matrix product of the form

\[ x^TAx. \]

The example above can be written as

\[
[x_1\ x_2\ x_3]
\begin{bmatrix} 1 & 4 & 0\\ 4 & 1 & 3\\ 0 & 3 & 1 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}.
\tag{13.6}
\]

In this representation, $A$ is required to be a symmetric matrix.
Nonsymmetric representations are possible with, for example,

\[ A = \begin{bmatrix} 1 & 0 & 0\\ 8 & 1 & 2\\ 0 & 4 & 1 \end{bmatrix}, \]

but the symmetric form is standard.


Let us find the eigenvalues of the symmetric matrix in (13.6) in
the usual way by solving

\[ \begin{vmatrix} 1 - \lambda & 4 & 0\\ 4 & 1 - \lambda & 3\\ 0 & 3 & 1 - \lambda \end{vmatrix} = 0. \]

Hence

\[ (1 - \lambda)[(1 - \lambda)^2 - 9] - 4 \cdot 4(1 - \lambda) = 0 \]

or

\[ (1 - \lambda)[(1 - \lambda)^2 - 25] = 0. \]

It follows that the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = -4$, $\lambda_3 = 6$. It can
be shown by the methods previously explained that corresponding
eigenvectors are

\[
s_1 = \begin{bmatrix} 3\\ 0\\ -4 \end{bmatrix},
\quad
s_2 = \begin{bmatrix} -4\\ 5\\ -3 \end{bmatrix},
\quad
s_3 = \begin{bmatrix} 4\\ 5\\ 3 \end{bmatrix}.
\]

If $a$ and $b$ are two column vectors, and

\[ a^Tb = 0, \]

then $a$ and $b$ are said to be orthogonal. If we examine the eigenvectors
$s_1$, $s_2$, and $s_3$ above, then it is easy to see that

\[ s_1^Ts_2 = [3\ \ 0\ \ {-4}]\begin{bmatrix} -4\\ 5\\ -3 \end{bmatrix} = -12 + 0 + 12 = 0, \]

and similarly that $s_2^Ts_3 = 0$ and $s_3^Ts_1 = 0$. Thus the three eigenvectors
are mutually orthogonal: regarded as ordinary vectors in the sense
of Chapter 9, they are mutually perpendicular.
It will be shown that this property of the eigenvectors follows from
the symmetry of the matrix of the quadratic form. However, we
first show that the eigenvalues of a real symmetric matrix must be real
numbers.

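Theorem 13.1. If $A$ is a symmetric real matrix, then its eigenvalues are real.

Proof. Suppose that $\lambda = \alpha + j\beta$ is an eigenvalue. Since the left-hand side of the
equation $\det(A - \lambda I_n) = 0$ is a real polynomial in $\lambda$, it must also have an
eigenvalue $\bar{\lambda} = \alpha - j\beta$. Let $s$ and $\bar{s}$ be the eigenvectors corresponding to $\lambda$
and its conjugate $\bar{\lambda}$. Thus

\[ As = \lambda s, \quad A\bar{s} = \bar{\lambda}\bar{s}. \tag{13.7} \]

Since $A$ is symmetric, it follows that $(A\bar{s})^T = \bar{s}^TA^T = \bar{s}^TA$, and we can
replace (13.7) by

\[ As = \lambda s, \quad \bar{s}^TA = \bar{\lambda}\bar{s}^T. \tag{13.8} \]

Multiply the first equation in (13.8) on the left by $\bar{s}^T$, and the second
equation on the right by $s$. Thus

\[ \bar{s}^TAs = \lambda\bar{s}^Ts, \quad \bar{s}^TAs = \bar{\lambda}\bar{s}^Ts. \]

Elimination of $\bar{s}^TAs$ leads to

\[ (\lambda - \bar{\lambda})\bar{s}^Ts = 0. \tag{13.9} \]

To show that $\bar{s}^Ts \neq 0$, put $s^T = (a_1, \ldots, a_n)$. Then, since $\bar{a}_r a_r = |a_r|^2$,

\[
\bar{s}^Ts = [\bar{a}_1\ \bar{a}_2\ \cdots\ \bar{a}_n]
\begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix}
= |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2 > 0.
\]

From (13.9), it follows that $\lambda = \bar{\lambda}$ or $\alpha + j\beta = \alpha - j\beta$, from which we conclude
that $\beta = 0$. Therefore $\lambda$ is real.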

Theorem 13.2. If $A$ is a symmetric matrix, then the eigenvectors associated
with two distinct eigenvalues are orthogonal.

Proof. Let $\lambda_1$ and $\lambda_2$ be the distinct eigenvalues, and $s_1$ and $s_2$ their corresponding
eigenvectors. Then

\[ As_1 = \lambda_1 s_1, \quad As_2 = \lambda_2 s_2. \]

Transpose the second equation so that the equations become

\[ As_1 = \lambda_1 s_1, \quad s_2^TA = \lambda_2 s_2^T, \]

since $A$ is symmetrical. Multiply the first equation by $s_2^T$ on the left, and the
second equation by $s_1$ on the right. Hence

\[ s_2^TAs_1 = \lambda_1 s_2^Ts_1, \quad s_2^TAs_1 = \lambda_2 s_2^Ts_1. \]

Eliminate $s_2^TAs_1$ between these equations, leaving

\[ \lambda_1 s_2^Ts_1 = \lambda_2 s_2^Ts_1, \quad \text{or} \quad (\lambda_1 - \lambda_2)s_2^Ts_1 = 0. \]

Since $\lambda_1 \neq \lambda_2$ it follows that $s_2^Ts_1 = 0$; namely $s_1$ and $s_2$ are orthogonal.
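
Theorem 13.2 can be illustrated numerically with the symmetric matrix of the quadratic form in Section 13.6, whose eigenvalues are 1, -4, and 6. In the Python sketch below the eigenvectors listed are ones that satisfy $As = \lambda s$ for that matrix, and the helper names `dot` and `mat_vec` are assumptions for illustration.

```python
def dot(u, v):
    """Scalar product u^T v of two column vectors."""
    return sum(a * b for a, b in zip(u, v))

def mat_vec(m, v):
    """Product of a square matrix (list of rows) with a column vector."""
    return [dot(row, v) for row in m]

A = [[1, 4, 0], [4, 1, 3], [0, 3, 1]]          # symmetric
eigenpairs = [(1, [3, 0, -4]), (-4, [-4, 5, -3]), (6, [4, 5, 3])]

# Each pair satisfies A s = lambda s ...
for lam, s in eigenpairs:
    print(lam, mat_vec(A, s) == [lam * si for si in s])

# ... and eigenvectors belonging to different eigenvalues are orthogonal.
vectors = [s for _, s in eigenpairs]
for i in range(3):
    for j in range(i + 1, 3):
        print(i + 1, j + 1, dot(vectors[i], vectors[j]))
```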

13.7 Positive-definite matrices

A quadratic form $x^TAx$ is said to be positive-definite if $x^TAx > 0$
for all $x \neq 0$. If this is true, we simply describe the matrix $A$ as
positive-definite. Consider the particular case in which $A$ is a
$3 \times 3$ matrix. Let $\lambda_1$, $\lambda_2$, $\lambda_3$ be its eigenvalues, with corresponding
eigenvectors $s_1$, $s_2$, $s_3$ which are chosen so that they are all unit
vectors, that is, $s_1^Ts_1 = s_2^Ts_2 = s_3^Ts_3 = 1$.
Any quadratic form can be written as $x^TAx$ where $A$ is symmetric.
As we saw in Section 13.4, we can diagonalize $A$ by using the matrix

\[ C = [s_1\ s_2\ s_3], \]

so that

\[ C^{-1}AC = D = \begin{bmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3 \end{bmatrix}. \]

For a symmetric matrix, the eigenvectors are orthogonal (Theorem
13.2). Hence

\[ s_1^TC = s_1^T[s_1\ s_2\ s_3] \]
\[ = [s_1^Ts_1\ \ s_1^Ts_2\ \ s_1^Ts_3] \]
\[ = [1\ \ 0\ \ 0], \]

since $s_1$ is a unit vector. In a similar way,

\[ s_2^TC = [0\ \ 1\ \ 0], \quad s_3^TC = [0\ \ 0\ \ 1]. \]

Hence, if we construct a matrix with $s_1^T$, $s_2^T$, $s_3^T$ as its rows, then

\[
C^TC = \begin{bmatrix} s_1^T\\ s_2^T\\ s_3^T \end{bmatrix} C
= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}.
\]

In other words, the transpose of $C$ is the inverse of $C$, that is,
$C^T = C^{-1}$. Square matrices with this
property are said to be orthogonal matrices.
Suppose that we now define a transformation by $x = CX$, where
$C$ is an orthogonal matrix. Then, in terms of $X$, the quadratic form
becomes

\[ x^TAx = (CX)^TACX \]
\[ = X^TC^TACX \]
\[ = X^TDX = \lambda_1X_1^2 + \lambda_2X_2^2 + \lambda_3X_3^2. \]

It follows from this result, for $3 \times 3$ matrices, and by implication
for higher order, that a quadratic form $x^TAx$ is positive-definite if and
only if all the eigenvalues of $A$ are positive.

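Example 13.13. Find an orthogonal matrix $C$ which transforms the quadratic
form $x^TAx$ where

\[ A = \begin{bmatrix} 3 & -1 & 0\\ -1 & 3 & 0\\ 0 & 0 & 1 \end{bmatrix} \]

into a diagonal quadratic form $X^TDX$.

The eigenvalues of $A$ are given by $\det(A - \lambda I_3) = 0$, where

\[
\det(A - \lambda I_3)
= \begin{vmatrix} 3 - \lambda & -1 & 0\\ -1 & 3 - \lambda & 0\\ 0 & 0 & 1 - \lambda \end{vmatrix}
\]
\[ = ((3 - \lambda)^2 - 1)(1 - \lambda) \]
\[ = (\lambda - 2)(\lambda - 4)(1 - \lambda). \]

Hence the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = 2$, $\lambda_3 = 4$. Since all the eigenvalues are
positive, it follows that the quadratic form is positive-definite. The corresponding
eigenvectors are

\[
s_1 = \begin{bmatrix} 0\\ 0\\ 1 \end{bmatrix},
\quad
s_2 = \begin{bmatrix} 1/\sqrt{2}\\ 1/\sqrt{2}\\ 0 \end{bmatrix},
\quad
s_3 = \begin{bmatrix} -1/\sqrt{2}\\ 1/\sqrt{2}\\ 0 \end{bmatrix}.
\]

Hence the required orthogonal matrix $C$ is

\[
C = [s_1\ s_2\ s_3]
= \begin{bmatrix} 0 & 1/\sqrt{2} & -1/\sqrt{2}\\ 0 & 1/\sqrt{2} & 1/\sqrt{2}\\ 1 & 0 & 0 \end{bmatrix}.
\]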
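
The result of this example can be checked numerically by forming $C^TC$ and $C^TAC$. The Python sketch below uses floating-point arithmetic because of the factor $1/\sqrt{2}$, so agreement is tested to within a small tolerance; the helper names and the tolerance `1e-12` are assumptions for illustration.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def mat_mul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def close(p, q, tol=1e-12):
    """True if two matrices agree entry by entry to within tol."""
    return all(abs(a - b) < tol for rp, rq in zip(p, q) for a, b in zip(rp, rq))

r = 1 / math.sqrt(2)
A = [[3, -1, 0], [-1, 3, 0], [0, 0, 1]]
C = [[0, r, -r], [0, r, r], [1, 0, 0]]       # columns are s1, s2, s3
Ct = transpose(C)

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
D = mat_mul(mat_mul(Ct, A), C)
print(close(mat_mul(Ct, C), identity))       # C is orthogonal
print([[round(v, 10) for v in row] for row in D])
```

Using the transpose in place of the inverse is what makes an orthogonal $C$ convenient here: no matrix inversion is needed.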

The relation between the coordinates $(x, y, z)$ and $(X, Y, Z)$ of a
point fixed in space in the transformation

\[ x = CX = [s_1\ s_2\ s_3]X, \]

where the eigenvectors $s_1$, $s_2$, $s_3$ are orthogonal unit vectors, can be
seen as follows. Put $X = 1$, $Y = 0$, $Z = 0$, which is a point on the
$X$ axis. Since

\[ [s_1\ s_2\ s_3]\begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix} = s_1, \]

it follows that the corresponding point in the $x$ frame is $x = s_1$. In
other words the elements $(a_1, b_1, c_1)$ of $s_1$ are the coordinates in the
$x$ space of the point $A_1$: (1, 0, 0) in the $X$ space. Similarly, the
elements of $s_2$ and $s_3$ are respectively the coordinates of $A_2$: (0, 1, 0)
and $A_3$: (0, 0, 1) in the $X$ space (see Fig. 13.1).

Fig. 13.1 Orthogonal mapping between axes. The coordinates $(a_1, b_1, c_1)$,
$(a_2, b_2, c_2)$, $(a_3, b_3, c_3)$ are measured in the $x$ space.

We know that the eigenvectors are mutually orthogonal, that is
$s_i^Ts_j = 0$ ($i \neq j$). We want to show that this implies that the new
axes $OXYZ$ are also mutually perpendicular. Consider the triangle
$OA_1A_2$: we want to show that $A_1OA_2$ is a right angle, so that the
triangle is subject to Pythagoras's theorem:

\[ A_1A_2^2 - OA_1^2 - OA_2^2 = (a_1 - a_2)^2 + (b_1 - b_2)^2 + (c_1 - c_2)^2 \]
\[ \qquad - (a_1^2 + b_1^2 + c_1^2) - (a_2^2 + b_2^2 + c_2^2) \]
\[ = -2(a_1a_2 + b_1b_2 + c_1c_2) = -2s_1^Ts_2 = 0, \]

since the eigenvectors are unit vectors and orthogonal. Hence, by the
theorem of Pythagoras, $A_1OA_2$ is a right angle. Similarly, the other
angles $A_2OA_3$ and $A_3OA_1$ are right angles. Hence the new axes are
mutually perpendicular. It can be shown that $\det C = \pm 1$. If
$\det C = 1$, then the $X$ coordinates can be obtained from the $x$
coordinates by a rotation about the origin $O$. If $\det C = -1$, then a
reflection and rotation are required.

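Example 13.14. Show that

\[ C = \frac{1}{3} \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{bmatrix} \]

is an orthogonal matrix. If x = CX, what does the point x = 1, y = 2, z = -1 map into in the (X, Y, Z) coordinates?

In this example,

\[ s_1 = \frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, \quad s_2 = \frac{1}{3} \begin{bmatrix} 2 \\ 1 \\ -2 \end{bmatrix}, \quad s_3 = \frac{1}{3} \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix}. \]

Clearly

\[ s_1^T s_1 = \tfrac{1}{9} [1 \;\; 2 \;\; 2] \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = 1. \]

Similarly $s_2^T s_2 = 1$ and $s_3^T s_3 = 1$. Also

\[ s_1^T s_2 = \tfrac{1}{9} [1 \;\; 2 \;\; 2] \begin{bmatrix} 2 \\ 1 \\ -2 \end{bmatrix} = \tfrac{1}{9} [1 \times 2 + 2 \times 1 + 2 \times (-2)] = 0. \]

Similarly, $s_1^T s_3 = 0$ and $s_2^T s_3 = 0$. We need to invert the transformation so that

\[ X = C^{-1}x = C^T x = \frac{1}{3} \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 1 \times 1 + 2 \times 2 + 2 \times (-1) \\ 2 \times 1 + 1 \times 2 + (-2) \times (-1) \\ 2 \times 1 + (-2) \times 2 + 1 \times (-1) \end{bmatrix} = [1 \;\; 2 \;\; -1]^T. \]

Hence X = 1, Y = 2, Z = -1.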
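The arithmetic of Example 13.14 can be repeated exactly with rational numbers. This is a small sketch using Python's `fractions` module; the helper functions are illustrative names, not part of the text.

```python
from fractions import Fraction

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

third = Fraction(1, 3)
C = [[third * v for v in row] for row in [[1, 2, 2], [2, 1, -2], [2, -2, 1]]]
Ct = transpose(C)

# C is orthogonal exactly when C^T C is the identity matrix.
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
is_orthogonal = matmul(Ct, C) == identity

# Invert x = C X using C^{-1} = C^T, for the point (x, y, z) = (1, 2, -1).
x = [[Fraction(1)], [Fraction(2)], [Fraction(-1)]]
X = matmul(Ct, x)
print(is_orthogonal, [int(v[0]) for v in X])   # True [1, 2, -1]
```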
13.8 An application to a vibrating system


Positive-definite matrices occur frequently in applications. For
example, consider the system consisting of two particles of equal
mass m and three equal springs stretched in a straight line between
two supports as shown in Fig. 13.2. Suppose that, in equilibrium,
the springs are unstretched, each of length a. The mechanical system
vibrates longitudinally so that the displacements of the particles are
x and y as shown.
If a spring is stretched or compressed from equilibrium by a length x, then its potential energy stored is $\tfrac{1}{2}kx^2$, where k is a constant known as the stiffness of the spring, which measures its reaction to being stretched or compressed. The total potential energy of the system is

\[ V = \tfrac{1}{2}kx^2 + \tfrac{1}{2}k(y - x)^2 + \tfrac{1}{2}ky^2. \]

Fig. 13.2 Longitudinal oscillations.

Note that the extension of the middle spring is y - x. Thus

\[ V = \tfrac{1}{2}kx^2 + \tfrac{1}{2}ky^2 - kyx + \tfrac{1}{2}kx^2 + \tfrac{1}{2}ky^2 = kx^2 - kxy + ky^2 = \tfrac{1}{2}\mathbf{x}^T K \mathbf{x}, \]

where

\[ \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad K = \begin{bmatrix} 2k & -k \\ -k & 2k \end{bmatrix}. \]

The eigenvalues of K are given by $\det(K - \lambda I_2) = 0$, that is,

\[ \begin{vmatrix} 2k - \lambda & -k \\ -k & 2k - \lambda \end{vmatrix} = 0, \quad \text{or} \quad (2k - \lambda)^2 - k^2 = 0. \]

Hence, the eigenvalues are $\lambda_1 = k$ and $\lambda_2 = 3k$, which are both positive, implying that the potential energy is a positive-definite quadratic form. This is not surprising, since we might expect the potential energy to take a minimum value in equilibrium. The corresponding eigenvectors are

\[ s_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad s_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \]

normalized as unit vectors. The matrix of eigenvectors, C, is given by

\[ C = [s_1 \;\; s_2] = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}. \]

The transformation $\mathbf{x} = C\mathbf{X}$ introduces the coordinates $\mathbf{X}^T = (X, Y)$ in which

\[ V = \tfrac{1}{2}(kX^2 + 3kY^2). \]

These are known as the normal coordinates of the system, and are related to x and y by

\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} (X + Y)/\sqrt{2} \\ (X - Y)/\sqrt{2} \end{bmatrix}. \]

Normal coordinates are often more convenient coordinates to use.
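As a quick check that the two expressions for the potential energy agree, the following sketch evaluates V in both coordinate systems at a few sample points. The stiffness value and the sample points are arbitrary choices made for illustration.

```python
import math

def potential_xy(x, y, k):
    """Potential energy V = kx^2 - kxy + ky^2 in the original coordinates."""
    return k * x * x - k * x * y + k * y * y

def potential_normal(X, Y, k):
    """Potential energy V = (k X^2 + 3k Y^2)/2 in the normal coordinates."""
    return 0.5 * (k * X * X + 3 * k * Y * Y)

def to_xy(X, Y):
    """Apply x = C X with C = (1/sqrt 2) [[1, 1], [1, -1]]."""
    return (X + Y) / math.sqrt(2), (X - Y) / math.sqrt(2)

k = 2.5  # arbitrary stiffness chosen for the check
for X, Y in [(1.0, 0.0), (0.0, 1.0), (0.3, -1.2), (-2.0, 0.7)]:
    x, y = to_xy(X, Y)
    # The two printed values should agree up to rounding.
    print(potential_xy(x, y, k), potential_normal(X, Y, k))
```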
For the same problem, we can also derive the equations of motion of each particle. If $T_1$, $T_2$, and $T_3$ are the tensions in each of the springs, then applying Newton's law (force equals mass times acceleration) for each particle gives the differential equations

\[ T_2 - T_1 = m\ddot{x}, \]    (13.10)
\[ T_3 - T_2 = m\ddot{y}, \]    (13.11)

where $\ddot{x}$ and $\ddot{y}$ stand for $d^2x/dt^2$ and $d^2y/dt^2$ respectively. The tension in a spring is k times the extension, by Hooke's law, where k is the stiffness of the spring. Thus

\[ T_1 = kx, \quad T_2 = k(y - x), \quad T_3 = -ky. \]

Substitution into (13.10) and (13.11) yields

\[ -2kx + ky = m\ddot{x}, \]    (13.12)
\[ kx - 2ky = m\ddot{y}. \]    (13.13)

In matrix form, these equations can be combined into the vector equation

\[ \ddot{\mathbf{x}} + A\mathbf{x} = 0, \]

where

\[ \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad A = \begin{bmatrix} 2k/m & -k/m \\ -k/m & 2k/m \end{bmatrix}. \]

If we use the normal coordinates, then $\mathbf{x} = C\mathbf{X}$ implies

\[ C\ddot{\mathbf{X}} + AC\mathbf{X} = 0. \]

Multiply on the left by $C^{-1} = C^T$:

\[ \ddot{\mathbf{X}} + C^T A C \mathbf{X} = 0, \]

or

\[ \ddot{\mathbf{X}} + D\mathbf{X} = 0, \]    (13.14)

where

\[ D = \begin{bmatrix} k/m & 0 \\ 0 & 3k/m \end{bmatrix}. \]

Equation (13.14) now separates into the two differential equations

\[ \ddot{X} + (k/m)X = 0, \]
\[ \ddot{Y} + 3(k/m)Y = 0, \]

which, unlike (13.12) and (13.13), are no longer simultaneous equations, but uncouple into two equations which can be solved separately and independently for X and Y. We say more about the solution of differential equations in Chapter 18.

Problems

13.1. (Sections 13.1, 2). Find the eigenvalues and eigen¬ 1 0 0


vectors of the following matrices:
A = 0 2 2
2 1
(a) ; (b) (c) 0 2 5
4 6 2 1_ 4 6_T
has a repeated eigenvalue. Find the corresponding eigen¬
"1 r 1 2 ~2 -2 vectors. How many linearly independent eigenvectors are
(d) ; (e) ; (0
4 5 14 5 4 6 there?

13.2. (Section 13.1). Show that the eigenvalues of the 13.7. (Sections 13.1, 2). Show that the matrix
symmetric matrix —1 —1 CL T 1

A = fl 4 1 —a -1

—a a + 1 —a
where a, b, and c are real numbers, are real. has a zero eigenvalue. For design reasons, a second
eigenvalue must be 3. For what values of a does this
13.3. (Section 13.1). Find the eigenvalues of occur? Find the third eigenvalue in each case.

6 3
A = 13.8. (Sections 13.1, 2). A matrix is said to be idem-
2 7
potent if A2 = A. Explain why all eigenvalues of A
(see Problem 13.1b). Find the inverse of A and find its must be either 0 or 1. Show that
eigenvalues. What relationship, would you guess, exists
1 0 0
between the eigenvalues of A and those of A '? Find the
eigenvalues of A2. How do they relate to those of A1 A = 0 3 6
0-1-2
13.4. (Sections 13.1, 2). Find the eigenvalues and eigen¬
is idempotent. Find the eigenvalues and eigenvectors of
vectors of
A and A2 and confirm the above result.
"l 1 2~ ~2 1 2~

(a) 1 2 1 (b) 1 2 2 13.9. (Sections, 13.1, 2). Let

_2 1 1_ 2 1 2 ”1111“

”2 0 0" "6 5 5 1 1-1-1


A = 2
(c) 0 2 2 ; (d) 5 6 5 1-1 1-1

0 2 -1 5 5 6 1-1-1 1
Show that A2 = I4. Explain why the eigenvalues of A
13.5. (Sections 13.1, 2). Find the eigenvalues and
must be either 1 or —1. Can A be diagonalized?
eigenvectors of

“ 1 2 0 o' 13.10. Find the eigenvt

3 2 0 0 "l 2 1 ’

0 0 3 1 A = 2 1 1

1 1 2
_0 0 1 3_
The trace of a square matrix is the sum of the elements
13.6. (Sections 13.1, 2). Show that in the leading diagonal. Thus if B = is an n x n

matrix, then 13.17. Show that

trace B = h,, + b22 + • ■ ■ 4- b„„. ' 1 0 0

Confirm for .4 above that trace A = /.t + k2 + /.3. Also A = 0 cos a — sin a
verify that det ,4 =
_0 sin a cos a
13.11. (Section 13.3). Show that the vectors is an orthogonal matrix. Describe the mapping defined

1_

1-
I
by
" 1 ’
V — A i
i
= 3

05

II
H = , 52 -l

rn
Which set of points remains unaffected by the map¬

1_
_1

kyi
1
ping?
1

1
are linearly dependent.
13.18. (Section 13.7). Show that
13.12. (Section 13.5). Let
“1 -1 1 - 1”
'-4 1-2
1-1-1 1
.4 = 1 A =\
1 1-1-1
0 1 0

Find the eigenvalues of .4 and a set of corresponding _ 1 1 1 1 _


eigenvectors. Hence construct a matrix C which makes is an orthogonal matrix.
C~'.4C a diagonal matrix.
13.19. Show that, in the transformation
13.13. Find a matrix C which diagonalizes the matrix
x“ cos a — sin ol X
T1 81
.4 = Y _sin a cos a_ _y_
2 1
the angle between the two sets of axes is a. What do the
13.14. (Section 13.5). Find a matrix C which diag¬ axes of .x and y become in the (A, F) plane?
onalizes
13.20. Show that the nonzero eigenvalues of the skew-
2 0
symmetric matrix
.4 = 0 2 0 a b
0 2
A = —a 0 c
Verify that C ',4C = D. where D is the diagonal matrix
of eigenvalues. —b —c 0
are imaginary for a, b, c real
13.15. Using the diagonalization result
13.21. Let
C'AC = D

for a matrix .4 which has n linearly independent eigen¬ "l 2 f


vectors. show that A = 2 11
det .4 =
_ 1 1 2
where ./.„ are the eigenvalues of .4. (Hint:
Show that
use the result det .46 = det .4 det 6 for square mat¬
rices.) det(A - ;.I3) = -A3 + 4k2 + 2-4.
Verify that
13.16. (Section 11.5). Find the eigenvalues and eigen¬
vectors of the row-stochastic matrix — A3 + 4A2 + A - 4I3 = 0.
~i i i” In other words, the matrix A satisfies its own charac¬
4 2 4
teristic equation. This is known as the Cayley- Hamilton
theorem, and holds generally for square matrices. Use
ill the result to find the inverse matrix A~*.
- 4 4 2 _

Find a formula for .4". How does A behave as n -*■ oo? 13.22. Find the eigenvalues and eigenvectors of

5 - 1 -3 3~ 13.26. Let
-1 5 3 -3 ”o 1 o"
A =
—3 3 5 -1 A = 0 0 1
3 -3 -1 5_
1 0 0
Calculate 7 2 and A3. F
Construct a matrix C such that C 'AC is the diagonal Show that A has two complex eigenvalues, and
matrix of eigenvalues. Write down det A. find the corresponding eigenvectors. Construct a
matrix C which diagonalizes A and find a formula for
13.23. (Section 13.6). Express the following quadratic A". Compare this result with the ad hoc method above.
forms in the form xT.4x, where A is a 3 x 3 symmetric
matrix: 13.27. Let A be a square matrix, and let S represent the
(a) a? + x2 + .Y3 + 4.y1.y2 — 4.y,.y3 + 4x2x3; sum of the powers of A from A up to A":
(b) ;y,.y2 - .y,.y3 + x2x3. S = A + A2 + A3 + ■ ■ ■ + An.
Find eigenvalues of A in each case, and find also
By multiplying the equation by A and subtraction,
a matrix C which transforms each into the form 21Xj +
show that
}.2x\ + /.3xi
S = A(1 + An)(I - A)'1,
and state any cases for which this method fails.
13.24. (Section 13.6). Which of each of the following
If
quadratic forms is positive-definite?
(a) 4a'i + x2 — 4.y,.y2;
(b) x2 + x2 + 2a3 + 2.y2.y3 + 2a'3a-, + 4.y1x2;
(c) 6xj + 2xf — A3A,.
(see Examples 13.1 and 13.4), find a formula for Am and
the sum
13.25. (Section 13.8). Consider three particles, each of
mass in, and four equal springs stretched in a straight S= t A"
m = 1
line between fixed supports distance 4a apart by four
springs each with unstretched lengths a (as in Fig.
13.28. Let
13.2, but with three particles). Consider longitudinal
oscillations of the systems and let a, y, z be the extensions 1 2 1
of the springs, assuming Hooke's law with stiffness k for A = 2 1 1
the tension in each spring, show that a, y, : satisfy the
differential equations 1 1 2
Find the eigenvalues of A and confirm that
k( — 2a + y) = mx,
~2~ " -l" " -3~
k(x — 2y -f z) — my,
*i = 2 , S2 = -1 . *3 = 3
k(y — 2 z) = mz. 2 2 0

Express the equations in the matrix form are eigenvectors of A. Construct the matrix C and verify
that
x + Ax = 0.
D = C~' AC,

Find the eigenvalues and eigenvectors of A. Construct where D is the diagonal matrix of eigenvalues. (This is a
a matrix C such that C7AC is diagonal. Obtain differen¬ reworking of the problem at the beginning of Example
tial equations for the normal coordinates X, Y, Z. 13.5, but with different eigenvectors.)
INTEGRATION AND
DIFFERENTIAL EQUATIONS

Antidifferentiation
and area
Contents
14.1 Reversing differentiation
14.2 Constructing a table of antiderivatives
14.3 Signed area generated by a graph
Problems

14.1 Reversing differentiation


Compare the following two problems:

Problem A: $\dfrac{d}{dx} \sin x = f(x)$; what is f(x)?

Problem B: $\dfrac{d}{dx} F(x) = \cos x$; what is F(x)?

For Problem A we know already that

f(x) = cos x.

This provides one answer to Problem B, which is solved by

F(x) = sin x.

Since cos x is the derivative of sin x, we say that sin x is an


antiderivative of cos x (we say an antiderivative because it is not the
only one; for example, sin x + 1 is also an antiderivative).
The antidifferentiation question in Problem B can be expressed
in various ways; for example

(a) What must be differentiated to get cos x?


(b) What curves have slope equal to cos x at every point?
(c) Find y as a function of x if dy/dx = cos x.
Finding antiderivatives is the opposite or inverse process to that of finding derivatives.
The following examples show that a function f(x) has an infinite number of antiderivatives: there is an infinite number of functions whose derivatives are f(x). However, they are all very simple variants on a single function.

Example 14.1. Find y as a function of x if dy/dx = 2x.

One solution is $y = x^2$, because its derivative is 2x. But the derivatives of $x^2 + 3$, or of $x^2$ with any other constant added or subtracted, are also equal to 2x. In fact

\[ y = x^2 + C \]

is an antiderivative of 2x for any constant C.


Some of these solutions are shown in Fig. 14.1. Different choices for C just
shift the graph bodily up or down parallel to itself. Therefore, at any particular
value of x, such as is represented by the vertical line PQR, the slopes are all
the same, independently of the value of C.

Evidently the same thing will happen whatever function we start


with: if we find one solution, we can add constants to obtain more.

Example 14.2. Find a collection of antiderivatives of sin 2x.

We want y such that dy/dx = sin 2x. If we differentiate a cosine we get something involving a sine, so first of all test whether y = cos 2x is close to being an antiderivative of sin 2x. We find that dy/dx = -2 sin 2x. This contains an unwanted factor (-2). It can be eliminated by choosing instead

\[ y = \frac{1}{-2} \cos 2x = -\tfrac{1}{2} \cos 2x, \]

for then we have $dy/dx = -\tfrac{1}{2}(-2 \sin 2x) = \sin 2x$, which is right. Therefore, one antiderivative is $-\tfrac{1}{2} \cos 2x$, and the rest are of the form

\[ y = -\tfrac{1}{2} \cos 2x + C \quad (C \text{ is any constant}). \]

Example 14.3. Solve the equation $dy/dx = e^{-3x}$ (that is to say, find a collection of antiderivatives of $e^{-3x}$).

Try $y = e^{-3x}$; then $dy/dx = -3e^{-3x}$. To avoid the unwanted factor (-3) we should have taken

\[ y = \frac{1}{(-3)} e^{-3x} = -\tfrac{1}{3} e^{-3x}. \]

From this we construct an infinite collection of antiderivatives:

\[ -\tfrac{1}{3} e^{-3x} + C \quad (C \text{ any constant}). \]

It can be proved that the above process, of finding a particular antiderivative of a function and adding constants, generates all possible antiderivatives for that function.

Antiderivatives of f(x)

A function F(x) is called an antiderivative of f(x) if

\[ \frac{d}{dx} F(x) = f(x). \]

If F(x) is any particular antiderivative of f(x), then all the antiderivatives are given by

\[ F(x) + C, \]    (14.1)

where C can be any constant. (Therefore, any two antiderivatives differ by a constant.)

An antiderivative of a function is also more usually called an


indefinite integral of the function, and the process of getting it is
called integration. If you know the term already, it is perfectly safe
to use it. We shall change over to it in Chapter 15.

Example 14.4. Find all the antiderivatives of $x^3$.

We firstly have to find any y which fits the equation $dy/dx = x^3$. Differentiation reduces a power of x by unity, so try $y = x^4$:

\[ dy/dx = 4x^3. \]

The factor 4 is unwanted; we needed $\tfrac{1}{4}x^4$ to give $x^3$. Therefore all antiderivatives are given by

\[ y = \tfrac{1}{4}x^4 + C, \]

where C is any constant.

Sums of terms and constant multipliers are treated in the same


way as in differentiation: the multipliers stay as multipliers and each
term is treated separately, as in the next example.

Example T4.5. Obtain all the antiderivatives of 2 e~3x — ^x3 + 2.


From the previous examples, one antiderivative of $e^{-3x}$ is $-\tfrac{1}{3}e^{-3x}$, and one for $x^3$ is $\tfrac{1}{4}x^4$. Also, one antiderivative of 2 is obviously 2x. Therefore one antiderivative of the given expression is
2(-ie-3*)-|(±x4) + 2x,
and all its antiderivatives are of the form
-fe-3*-|x4 + 2x + C,
where C is any constant.

The following two examples show the importance in practice of


including the constant C.

Example 14.6. A point is at x = 2 on the x axis at time t = 0, then moves with velocity $v = t - t^2$. Find where it is at time t = 3.

Velocity is the rate at which displacement changes with time: v = dx/dt. In this case

\[ v = dx/dt = t - t^2. \]

Therefore x is some antiderivative of $t - t^2$. All of its antiderivatives are included in

\[ x = \tfrac{1}{2}t^2 - \tfrac{1}{3}t^3 + C, \]

where C is any constant.

To find what value C must take in this case, we obviously have to take the starting point into consideration: x = 2 when t = 0. To obtain the value of C, substitute these values into our expression:

\[ 2 = 0 - 0 + C. \]

Therefore C = 2, so the position at any time is given by

\[ x = \tfrac{1}{2}t^2 - \tfrac{1}{3}t^3 + 2. \]

Finally, when t = 3, we have $x = \tfrac{9}{2} - 9 + 2 = -\tfrac{5}{2}$.
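The role of the constant C in Example 14.6 can be illustrated with exact arithmetic. This is a small sketch; the function name `position` and its default argument are illustrative choices.

```python
from fractions import Fraction

def position(t, x0=Fraction(2)):
    """x(t) = t^2/2 - t^3/3 + C, with C fixed by the starting condition x(0) = x0."""
    t = Fraction(t)
    return t**2 / 2 - t**3 / 3 + x0

print(position(0), position(3))   # 2 -5/2
```

Changing `x0` shifts every position by the same amount, which is exactly the effect of choosing a different constant C.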

Example 14.7. Find the equation of the curve which passes through the point $(\pi, -1)$ and whose slope is given by dy/dx = sin 2x.

Since the required y is an antiderivative of sin 2x, the equation of the curve must take the form

\[ y = -\tfrac{1}{2} \cos 2x + C, \]

where C is some (not 'any') constant. Since also we know that the curve passes through the point $x = \pi$, y = -1, we must require

\[ -1 = -\tfrac{1}{2} \cos 2\pi + C = -\tfrac{1}{2} + C, \]

so $C = -\tfrac{1}{2}$. Finally the required curve is

\[ y = -\tfrac{1}{2} \cos 2x - \tfrac{1}{2}. \]

Example 14.8. Obtain the antiderivatives of $(3x - 2)^3$.

As in the earlier examples, we try to guess the structure of y, given that $dy/dx = (3x - 2)^3$. There is not much to go on, so try an analogy with $x^3$; it would lead us to try something like $y = (3x - 2)^4$. To check this, differentiate using the chain rule with u = 3x - 2 and $y = u^4$:

\[ \frac{dy}{dx} = 4(3x - 2)^3 \cdot 3 = 12(3x - 2)^3. \]

The factor 12 is unwanted; we really needed $y = \tfrac{1}{12}(3x - 2)^4$. Therefore all the antiderivatives are given by $y = \tfrac{1}{12}(3x - 2)^4 + C$.

The technique used in the previous example can be used for functions like $(ax + b)^n$, $e^{ax+b}$, cos(ax + b), and sin(ax + b). However, it would not work in this simple way for a function such as $(2x^2 - 3)^2$ or $\sin(2x^2 - 3)$: the antiderivative of $(2x^2 - 3)^2$ is not equal to $\tfrac{1}{3}(2x^2 - 3)^3$, because $x^2$ is present rather than x (try it, using the Chain Rule).
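A numerical derivative gives a quick way to test a proposed antiderivative. The sketch below uses a central-difference estimate (a standard technique, not one introduced in the text) to confirm the result of Example 14.8 and to show that a guess of the form (2x^2 - 3)^3/3 fails for (2x^2 - 3)^2, whereas expanding the square and treating each term separately succeeds. All function names are illustrative.

```python
def derivative(F, x, h=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

def f(x):
    return (2 * x**2 - 3) ** 2

def term_by_term(x):
    # (2x^2 - 3)^2 = 4x^4 - 12x^2 + 9, antidifferentiated one term at a time.
    return 4 * x**5 / 5 - 4 * x**3 + 9 * x

def naive_guess(x):
    # The tempting but incorrect guess obtained by analogy with u^2.
    return (2 * x**2 - 3) ** 3 / 3

def linear_case(x):
    # Antiderivative of (3x - 2)^3 from Example 14.8.
    return (3 * x - 2) ** 4 / 12

x = 1.3
print(derivative(linear_case, x), (3 * x - 2) ** 3)   # agree
print(derivative(term_by_term, x), f(x))              # agree
print(derivative(naive_guess, x), f(x))               # differ by the factor 4x
```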

14.2 Constructing a table of antiderivatives


Since antidifferentiation is the inverse of differentiation, any table of derivatives can be read backwards in order to provide antiderivatives. Suppose that two typical entries in a table of derivatives are as follows:

Given function F(x)        Derivative f(x) = (d/dx) F(x)

sin ax                     a cos ax
e^{ax}                     a e^{ax}

By interchanging the columns and modifying the headings, we get two entries in a possible table of antiderivatives:

Given function f(x)        One antiderivative F(x)

a cos ax                   sin ax
a e^{ax}                   e^{ax}

However, these entries are not yet in the form we should like them. For example, for the first entry we would prefer to have cos ax in the left column, instead of a cos ax. Therefore divide both entries by the constant a, remember to introduce the arbitrary constant C to register all the antiderivatives, and we have a more convenient table:

Given function f(x)        Antiderivatives F(x)

cos ax                     (1/a) sin ax + C
e^{ax}                     (1/a) e^{ax} + C

By such means the short table (14.2) is produced. To verify any


entry, differentiate the function in the right-hand column; the result
should be the entry on the left. The letter C stands for ‘any constant’
or ‘an arbitrary constant’.
A short table of antiderivatives    (14.2)

Given function f(x)             Antiderivatives F(x)

a (constant)                    ax + C
*  x^m (unless m = -1)          x^{m+1}/(m + 1) + C
** x^{-1}, i.e. 1/x             ln x + C if x > 0; ln(-x) + C if x < 0;
                                or ln |x| + C (x ≠ 0)
e^{ax}                          (1/a) e^{ax} + C
cos ax                          (1/a) sin ax + C
sin ax                          -(1/a) cos ax + C

Notice particularly the two starred entries. The formula * covers most cases, but it does not produce antiderivatives of the function $x^{-1}$ (i.e. of 1/x). Here m = -1, so the entry on the right becomes infinite and therefore meaningless. Therefore the antiderivatives of $x^{-1}$ must be given by some different formula, and this is shown under **. All we have to do is to verify the formula ** as in the following example. (The modulus or absolute value notation |x| is explained in Section 1.1.)

Example 14.9. Confirm that the antiderivatives of $x^{-1}$ (i.e. 1/x) are given by ln x + C if x is positive and by ln(-x) + C if x is negative, and that ln |x| + C covers both cases.

(Remember that ln x does not have a meaning if x is negative or zero.) All we have to do to verify the correctness of the formulae is to differentiate the proposed antiderivatives. Since $(d/dx) \ln x = x^{-1}$, the result is right when x is positive.

Suppose now that x is negative. Then -x is positive, so ln(-x) has a meaning. Using the chain rule (3.3) with u = -x,

\[ \frac{d}{dx} \ln(-x) = \frac{1}{-x}(-1) = \frac{1}{x}, \]

so the second result is confirmed.

But (see Section 1.1) |x| = x if x > 0 and |x| = -x when x < 0, so ln |x| is an antiderivative whether x is positive or negative.

Example 14.10. Find the antiderivatives of $(2x - 3)^{-1}$.

The power m = -1 is the double-starred case in the table (14.2), so try y = ln(2x - 3), supposing initially that 2x - 3 > 0. Then

\[ \frac{dy}{dx} = \frac{2}{2x - 3}. \]

The unwanted factor 2 will not appear if we try again with $y = \tfrac{1}{2} \ln(2x - 3)$. Also 2x - 3 might be negative, so we introduce a modulus sign. Finally we have

\[ y = \tfrac{1}{2} \ln |2x - 3| + C. \]
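The modulus sign matters on the side of x = 3/2 where 2x - 3 is negative. As a rough numerical check (a sketch using a central-difference derivative, with sample points chosen arbitrarily), the proposed antiderivative can be differentiated on both sides of x = 3/2 and compared with 1/(2x - 3).

```python
import math

def F(x):
    """Proposed antiderivative of 1/(2x - 3); defined for x != 3/2."""
    return 0.5 * math.log(abs(2 * x - 3))

def f(x):
    return 1 / (2 * x - 3)

def derivative(G, x, h=1e-6):
    """Central-difference estimate of G'(x)."""
    return (G(x + h) - G(x - h)) / (2 * h)

for x in (-2.0, 1.0, 2.0, 5.0):   # points on both sides of x = 3/2
    print(x, derivative(F, x), f(x))
```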

14.3 Signed area generated by a graph


Figure 14.2 shows the graph of a function y = f(x) between x = a and x = b, in which we assume that the x and y scales are the same. Divide the range as shown into N sections so that in any section y is either positive only, or negative only.
Let $A_1, A_2, \ldots$ denote the geometrical areas of these segments, and A the sum of these. Geometrical area is always positive, so $A_1, A_2, \ldots$ are all positive numbers. Then

\[ A = A_1 + A_2 + A_3 + \cdots + A_N \]    (14.3)

is naturally called 'the geometrical area between the curve and the x axis'.
We require a different quantity, $\mathcal{A}$, called the signed area between the curve and the x axis. This is defined by

\[ \mathcal{A} = A_1 - A_2 + A_3 - \cdots - A_N. \]    (14.4)

In forming $\mathcal{A}$, we use the rule: If y is positive, the contribution takes a positive sign; if y is negative, the contribution takes a negative sign. This quantity has a far more useful range of applications than has geometrical area. For example, suppose that a point is moving on a straight line; then the signed displacement from its starting-point is equal to the signed area of its velocity-time graph.
We show how to calculate the signed area $\mathcal{A}$ of the graph of y = f(x) between two given points, x = a and x = b (Fig. 14.3a). Let $\mathcal{A}(x)$ represent the signed area between a and a variable point with coordinate x (Fig. 14.3a). Increase x by a small step $\delta x$; the signed area from a to $x + \delta x$ is $\mathcal{A}(x + \delta x)$. The change in signed area, $\delta\mathcal{A} = \mathcal{A}(x + \delta x) - \mathcal{A}(x)$ (positive or negative), is equal to the signed area of PQRS in Fig. 14.3a and b. This is very nearly equal to the signed area of the rectangle PQRN in Fig. 14.3b (in this case the required sign is negative) so

\[ \delta\mathcal{A} \approx f(x)\,\delta x, \]

which automatically takes the right sign. Therefore

\[ \frac{\delta\mathcal{A}}{\delta x} \approx f(x). \]

Now let $\delta x \to 0$; '≈' becomes '=', and $\delta\mathcal{A}/\delta x$ becomes $d\mathcal{A}/dx$, so that

\[ \frac{d\mathcal{A}}{dx} = f(x). \]    (14.5)

From (14.5) $\mathcal{A}(x)$ must be one of the antiderivatives of f(x). To find which one, choose any particular antiderivative and call it F(x). Then $\mathcal{A}(x)$ can differ from F(x) only by a constant, k say, so

\[ \mathcal{A}(x) = F(x) + k. \]    (14.6)

To determine the value of k, use the fact that $\mathcal{A}(x) = 0$ at x = a, because the starting point is then the same as the end point; that is to say,

\[ \mathcal{A}(a) = 0. \]

Therefore, from (14.6),

\[ \mathcal{A}(a) = 0 = F(a) + k, \]

or

\[ k = -F(a), \]    (14.7)

a known quantity, since we selected the antiderivative F(x) of f(x) ourselves. The required area $\mathcal{A}$ between a and b is given by

\[ \mathcal{A} = \mathcal{A}(b) = F(b) - F(a), \]

by putting x = b into (14.6), with (14.7) as the value of k.

The signed area $\mathcal{A}$ of f(x) between x = a and b

\[ \mathcal{A} = F(b) - F(a), \]    (14.8)

where F(x) is any antiderivative of f(x).

In practice we naturally use the simplest antiderivative, in which


the C in the table is zero. Any nonzero choice of C will cancel out
and disappear, since it will be present in both F(a) and F(b).

Example 14.10. The signed area of $y = x^2$ from x = -1 to x = 2.

(This happens to be the same as the geometrical area, because y is never negative.) Here a = -1 and b = 2. Also, the simplest antiderivative of $x^2$ is

\[ F(x) = \tfrac{1}{3}x^3. \]

Therefore, from (14.8),

\[ \mathcal{A} = F(b) - F(a) = \tfrac{1}{3}(2)^3 - \tfrac{1}{3}(-1)^3 = 3. \]

There is a special notation, the square-bracket notation, which we shall use generally from now onward.

Square-bracket notation

\[ [F(x)]_a^b \text{ stands for } F(b) - F(a). \]    (14.9)

Example 14.11. Find (a) the signed area, and (b) the geometrical area, between y = sin x and the x axis from x = 0 to $x = 2\pi$.

(a) f(x) = sin x, so F(x) = -cos x is an antiderivative. From (14.8) and (14.9), with a = 0 and $b = 2\pi$, the signed area $\mathcal{A}$ is given by

\[ \mathcal{A} = [-\cos x]_0^{2\pi} = -[\cos x]_0^{2\pi} = -(\cos 2\pi - \cos 0) = 0, \]

as is expected from Fig. 14.4: the positive and negative sections cancel.

(b) The geometrical area A can be obtained by splitting the range into a positive section 0 to $\pi$, and a negative section from $\pi$ to $2\pi$ (see Fig. 14.4). The negatively-signed section $\pi$ to $2\pi$ must have its sign reversed in order to give the geometrical area:

A = [geometrical area of 1st loop] + [geometrical area of 2nd loop]
  = [signed area of 1st loop] - [signed area of 2nd loop].

This is equal to

\[ [F(x)]_0^{\pi} - [F(x)]_{\pi}^{2\pi} = [-\cos x]_0^{\pi} - [-\cos x]_{\pi}^{2\pi} \]
\[ = (-\cos \pi + \cos 0) - (-\cos 2\pi + \cos \pi) \]
\[ = (1 + 1) - (-1 + (-1)) = 2 + 2 = 4. \]
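The distinction between signed and geometrical area in Example 14.11 can be reproduced with a few lines of code. This sketch writes the square-bracket notation as a function; the name `bracket` is an illustrative choice.

```python
import math

def bracket(F, a, b):
    """Square-bracket notation: [F(x)] between a and b, i.e. F(b) - F(a)."""
    return F(b) - F(a)

def F(x):
    return -math.cos(x)   # an antiderivative of sin x

# Signed area over the whole range 0 to 2*pi: the two loops cancel.
signed = bracket(F, 0, 2 * math.pi)

# Geometrical area: reverse the sign of the negative loop (pi to 2*pi).
geometrical = bracket(F, 0, math.pi) - bracket(F, math.pi, 2 * math.pi)

print(signed, geometrical)   # approximately 0 and 4
```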

Problems

Note: In case you have already met the term ‘indefinite (j) x(x + l)(expand by removing the brackets);
integral’, the term ‘antiderivative’ has the same mean¬ (1 + 2x)(l - 2x); (x + l)2; (1 + x)(l - 1/x);
ing. x2(x + x2).
(k) (x + 1 )/x (turn it into the sum of two terms);
14.1. Obtain all the antiderivatives of the following (2^/x - l)/y/x (put jx = xi and l/^/x = x"4 then
functions, and check their correctness by differentiating simplify as the sum of two terms);
your results. (x + l)2/x3.
(a) x5; 3x4; 2x3; ix2; 6x; J\x) - 3; /(x) = 0. (l) e-v + e‘-v; 2e2x - 3e3jr; e"v(l + e"**);
(b) —Tv 3; 2x 2; 3x 1 when x > 0 (if in doubt, see l/e2jc( = e~2-x); (e2- - e~2*)/e2*.
(14.2)). (m) 2 cos 2x; 3 sin \x — 4 cos jx; 2 + sin 2x.
/ v 4 4 -i ± -i
(c) .x2; x2; x 2; x3; x 3.
(d) 1/x2 (write as x“2); 1/x4; 1/x when x<0
(see (14.2)). 14.2. Find all the antiderivatives of the following by trial
(e) 7-x(=xi); l/Jx; 1/xl and error, as explained in the text. Confirm your answers
(f) 3x; ix2; l/3x2; 3/4x* by differentiation.
(g) ev; e"-v; 5e2-v; e”^-v; de-2-". (a) (x + 1 )3 (start by trying (x + 1 )4); (3x + 1 )3; (3x - 8)3.
(h) cos x; cos 3x; sin x; sin 3x; (b) (1 - x)4; (8 - 3x)*; (1 - x)*.
(i) 1 - 3x; 1 + 2x - 3x2; 3x4 - 4x2 + 5. (c) (2x + l)”2; (1 - x)“i; 2/(3x + l)3; 1/4(1 - x)[

(d) 2 cos(3x — 2) (try first sin(3x — 2)); 3 sin(l - x); (f) y — x_ — 2 x < — 1 (note that x is negative in
2 sin(2 — 3x). this range);
(g) y = sin 3x, 0 x sc § n;
14.3. (See Example 14.10). Find the antiderivatives of the (h) y = 1/(1 — x), 2 ^ x ^ 3 (note: 1 — x is negative
following. over this range, so make sure you understand
(a) 1 /(x + 1); l/(x - 1); 3/(3x - 2); 2/(5x - 4). Example 14.10; alternatively, write
(b) 1/(1 - x); 1/(4 - 5x). 1/(1 — x)-l/(x — 1)).
(c) x/(x + 1) (it can be written as 1 — l/(x + 1));
(d) (x + l)/(x — 1) (compare (c)).
14.7. Obtain the geometric area between the graph and
14.4. Use the identities cos2 A = ^(1 4- cos 2A), sin2 A = the x axis in each of the following cases. It is necessary
|(1 — cos 2A), and sin A cos A = \ sin 2 A to get rid of to treat each positive or negative section separately.
the squares and products in the following expressions, (a) y = — 3, 0 ^ x ^ 1 (this is negative all the way);
and in that way obtain the antiderivatives. (b) y = x3, — 1 < x < 1;
(a) cos2 x; sin2 x; sin x cos x. (c) y = 4 — x2, — 1 < x ^ 3;
(b) 3 cos2 2x; sin2 3x; sin 2x cos 2x. (d) y = cos x, 0 ^ x < 2k.

(c) cos4 x (you will have to use the identities twice).

14.5. (a) Show that (d/dx)(x ex) = ex + x ex. By 14.8. Find the most general function which satisfies the
rearranging the terms, show that the antiderivatives of d2x d dx d3x d d2x
x ev are eA'(x — 1) + C (use the fact that ex can be written following equations. (Note: —- =-, —- =--
df2 df dr dr3 df df2
as (d/dx) ev). Confirm the result by differentiation.
etc. Work in several steps, finding the next lowest
(b) Differentiate x2 ex. By rearranging the terms and
derivative in each step.)
using the result in (a), find the antiderivatives of x2 ex. d2x d2x d2x
(a) — = 0; (b) — = f; (c) — = sin f;
14.6. Use the result (14.8) to obtain the signed areas dt2 df2 df2
between the given graphs and the x axis. By roughly d3x d3
(d) -= 0; (e) -= cos t
sketching the graphs of the functions for which you dr3 df3
obtain zero, explain this fact.
(a) y = x, 0 ^ x ^ 2; (f) -= g (g is a constant);
df2
(b) y = x, — 1 ^ x s$ 1;
d4y
(c) y = — x2, 0 sg x ^ 1; (g) —- = vvq (w0 is constant; this relates to the displace-
(d) y = cos x, — n^x^n; dx4
(e) y = cos x — 1, 0 < x ^ 2k; ments y(x) of a bending beam.).
The definite and
indefinite integral
Contents
15.1 Signed area as the sum of strips
15.2 Numerical illustration of the sum formula
15.3 The definite integral and area
15.4 The indefinite-integral notation
15.5 Integrals unrelated to area
15.6 Improper integrals
15.7 Integration of complex functions: a new type of integral
15.8 The area analogy for a definite integral
15.9 Using the area analogy
15.10 Definite integrals having variable limits
Problems

15.1 Signed area as the sum of strips


Consider signed area from another point of view. Figure 15.1a
represents the graph of a function y = f(x) between x = a and x = b.
Since we are going to talk about area, assume that the x and y
scales are the same. Divide the interval a to b into N small equal
steps each of width

\[ \delta x = \frac{b - a}{N}. \]

To any step PQ there is a signed area element PQRS which we call $\delta\mathcal{A}$. The total signed area $\mathcal{A}$ is equal to the sum of all these:

\[ \mathcal{A} = \sum_{x=a}^{x=b} \delta\mathcal{A}, \]

which signifies 'the sum of all the elements $\delta\mathcal{A}$ between a and b'.

The typical area element PQRS is shown magnified in Fig. 15.1b. When $\delta x$ is small, the signed area $\delta\mathcal{A}$ is nearly that of the shaded rectangle PQNS. Therefore, for $\delta x$ small, we have

\[ \mathcal{A} = \sum_{x=a}^{x=b} \delta\mathcal{A} \approx \sum_{x=a}^{x=b} f(x)\,\delta x. \]

When $\delta x \to 0$ (with N increasing correspondingly), the approximation approaches perfection and we have

Signed area as a sum

Signed area $\mathcal{A}$ of y = f(x), x = a to b:

\[ \mathcal{A} = \lim_{\delta x \to 0} \sum_{x=a}^{x=b} f(x)\,\delta x. \]    (15.1)

15.2 Numerical illustration of the sum formula

We shall specify the sum in (15.1) in more detail, with the idea of obtaining a specific algorithm for actually calculating such a sum on a computer. In Fig. 15.2 we show the graph y = f(x). There are N equal subdivisions: we shall call the length of a subdivision h rather than $\delta x$ as in Fig. 15.1, since this is conventional when making numerical calculations:

\[ h = \frac{b - a}{N}. \]

The relevant points of subdivision are labelled $x_0$ to $x_{N-1}$:

\[ x_0 = a, \quad x_1 = a + h, \quad x_2 = a + 2h, \quad \ldots, \quad x_{N-1} = a + (N - 1)h, \]

which can be expressed as

\[ x_n = a + nh \quad \text{for } n = 0, 1, 2, \ldots, N - 1. \]

(The point $x_N = b$ is not wanted for a sum of type (15.1).) Then the area of the nth approximating rectangle in (15.1) is $f(x_n)h$, and the approximating sum in (15.1) becomes

\[ \mathcal{A} \approx f(a)h + f(a + h)h + \cdots + f(a + (N - 1)h)h = h \sum_{n=0}^{N-1} f(x_n), \]

where $x_n = a + nh$, with n = 0, 1, 2, ..., N - 1.

Computation of approximating sums (rectangle rule)

If y = f(x) with range x = a to b, the signed-area approximation with N subdivisions is

\[ \mathcal{A} \approx h \sum_{n=0}^{N-1} f(x_n), \]    (15.2)

where h = (b - a)/N and $x_n = a + nh$.

When we take larger and larger N, and smaller and smaller h


correspondingly, we expect that the approximation will approach
the exact value. The following example illustrates this for the very
simple case of the signed area associated with a straight line. The
algorithm (15.2) is very easy to program on a computer for any
function f(x). It is called the rectangle rule.

Example 15.1. Calculate the sum in (15.2) when $f(x) = x$, $a = -1$, $b = 2$, for $N = 30, 300, 3000, \ldots$, showing how the results approach the exact value 1.5.
The graph $y = x$ is a straight line, from which it is very easy to see that the signed area is exactly 1.5. The computed results are as follows ($b - a = 3$, and so $h = 3/N$):

N                30       300      3000      30000
h                0.1      0.01     0.001     0.0001
A (approx.)      1.35     1.485    1.4985    1.49985

The approximations are approaching 1.5, though very slowly: for this straight line the sum (15.2) works out to exactly $1.5 - 4.5/N$. We shall see in Section 16.3 how to improve such calculations.
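The rectangle rule (15.2) can be sketched in a few lines of Python. This is only an illustration of the algorithm described above, not a program from the text; the function name `rectangle_rule` is our own choice.

```python
def rectangle_rule(f, a, b, n):
    """Approximate the signed area of y = f(x) from a to b using (15.2)."""
    h = (b - a) / n  # step length h = (b - a)/N
    # Sum f at the left-hand points x_k = a + k*h, k = 0, ..., N-1.
    # The point x_N = b is deliberately not used.
    return h * sum(f(a + k * h) for k in range(n))


# The setting of Example 15.1: f(x) = x, a = -1, b = 2.
for n in (30, 300, 3000, 30000):
    print(n, 3.0 / n, rectangle_rule(lambda x: x, -1.0, 2.0, n))
```

Each tenfold increase in N reduces the error by a factor of about ten, which is the slow convergence remarked on in the example.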
Fig. 15.3

15.3 The definite integral and area


The expression $\lim_{\delta x \to 0} \sum_{x=a}^{x=b} f(x)\,\delta x$ of (15.1), which is equal to the signed area, has a very important brief notation:

Definite-integral notation

\[ \lim_{\delta x \to 0} \sum_{x=a}^{x=b} f(x)\,\delta x \quad \text{is denoted by} \quad \int_a^b f(x)\,dx. \tag{15.3} \]

(Historically, a large letter S for ‘sum’ used to be printed instead of $\int$; the sign is really just an extended letter S.) The expression $\int_a^b f(x)\,dx$ is called a definite integral, to be read: ‘the integral of $f(x)\,dx$ from $a$ to $b$’. Here $f(x)$ is the integrand, or the function to be integrated. The letter $x$ is the name of the variable of integration, while $a$ is the lower limit and $b$ the upper limit for the integration process.
We already found a way to obtain the signed area by using an
antiderivative: see (14.8). In the new notation, (14.8) is expressed as
follows.

Signed area expressed as a definite integral

The signed area $\mathcal{A}$ of $f(x)$ between $x = a$ and $b$ is given by

\[ \int_a^b f(x)\,dx = [F(x)]_a^b = F(b) - F(a), \tag{15.4} \]

where $F(x)$ is any antiderivative of $f(x)$.

Notice that in a definite integral any letter can be used for the variable of integration, because the letter itself disappears in the course of evaluation; for example

\[ \int_0^1 x\,dx = [\tfrac12 x^2]_0^1 = [\tfrac12 x^2]_{x=0}^{x=1} = \tfrac12; \]

\[ \int_0^1 t\,dt = [\tfrac12 t^2]_0^1 = [\tfrac12 t^2]_{t=0}^{t=1} = \tfrac12; \quad \text{and so on.} \]

15.4 The indefinite-integral notation

The symbol

\[ \int f(x)\,dx, \]

with no limits of integration specified, is called an indefinite integral of $f(x)$, and has exactly the same meaning as the word ‘antiderivative’ that we have used up until now, and which we have denoted by $F(x)$.

Indefinite-integral notation

$\int f(x)\,dx$, with no limits specified, stands for any antiderivative of $f(x)$. (15.5)

The expression $\int_a^b f(x)\,dx$ is called a definite integral because it takes a definite value: it represents a specified signed area and there is no arbitrary constant on the right. However, an indefinite integral $\int f(x)\,dx$ does not stand for a number; it represents an antiderivative, which is a function. This function is to be written in terms of the current variable of integration, so the name of the variable is usually significant. Also there will be a disposable, or arbitrary, constant, in the usual way. For example

\[ \int x^2\,dx = \tfrac13 x^3 + C, \qquad \int e^{2t}\,dt = \tfrac12 e^{2t} + C, \qquad \int \cos u\,du = \sin u + C \]

and so on, where $C$ is a constant. In some problems we shall assign or discover a definite value for $C$; in others we might want to keep $C$ as an arbitrary constant in order to express every possible antiderivative (hence, ‘indefinite’ integral).

Example 15.2. Find the signed area $\mathcal{A}$ associated with the graph $y = 3e^{2x}$ from $x = 1$ to $x = 3$ using the new notation.
We shall need an antiderivative $F(x)$ (i.e. an indefinite integral) of $3e^{2x}$. Using the notation (15.5), we may write

\[ F(x) = \int 3e^{2x}\,dx = \tfrac32 e^{2x} \]

(for this purpose any antiderivative will do, so we have put $C = 0$). Then, from (15.4),

\[ \mathcal{A} = \int_1^3 3e^{2x}\,dx = [\tfrac32 e^{2x}]_1^3 = \tfrac32 (e^6 - e^2). \]

In the last example, we might as well have written

\[ \int_1^3 3e^{2x}\,dx = \left[ \int 3e^{2x}\,dx \right]_1^3 \]

in the first place, without ever introducing $F(x)$. If we do this, we get another version of (15.4) which it is often convenient to use:

Signed area, using the notation for definite and indefinite integrals

The signed area of $f(x)$ from $a$ to $b$ is

\[ \int_a^b f(x)\,dx = \left[ \int f(x)\,dx \right]_a^b, \tag{15.6} \]

where $\int f(x)\,dx$ is any indefinite integral (antiderivative) of $f(x)$.

15.5 Integrals unrelated to area


Integrals arise constantly in applications, but only seldom is there any direct connection with area. The following example starts by giving information that seems to have nothing to do with area. However, we show that the problem can be thought of in terms of an area, and therefore can be solved in terms of a definite integral.

Example 15.3. A small object P is pushed steadily along the $x$ axis from $x = 0$ to $x = 1$, against a resistive force $f(x) = x^2$. Find the work done against the resistance.

Divide the range $x = 0$ to 1 into a large number of short steps of length $\delta x$. In general, if the resistive force is constant the work done over a distance is (force) × (distance moved). Although the force on P is not constant, over a short distance $\delta x$ the work $\delta W$ done by the applied force is given approximately by

\[ \delta W \approx f(x)\,\delta x = x^2\,\delta x. \]

The total work $W$ is given by

\[ W = \sum_{x=0}^{x=1} \delta W \approx \sum_{x=0}^{x=1} x^2\,\delta x. \]

Letting $\delta x \to 0$, we obtain exactly

\[ W = \lim_{\delta x \to 0} \sum_{x=0}^{x=1} x^2\,\delta x. \tag{15.7} \]

But this expression matches equation (15.1): it represents the signed area of the curve $y = x^2$ between $x = 0$ and 1. Consequently we can say immediately that

\[ W = \int_0^1 x^2\,dx = [\tfrac13 x^3]_0^1 = \tfrac13. \tag{15.8} \]

The reader should think very carefully about the step from (15.7) to (15.8), because it can be generalized to apply to any similar problem. Suppose that there arises, in any context whatever, a sum of the type

\[ \lim_{\delta x \to 0} \sum_{x=a}^{x=b} f(x)\,\delta x. \]

Then such a sum can always be interpreted as representing a certain signed area (namely the signed area of $y = f(x)$ between $x = a$ and $x = b$) so it can always be represented by the definite integral $\int_a^b f(x)\,dx$. We do not have to repeat this argument every time we encounter such a sum; from now on, we call on the general statement:

The limit of a sum represented by a definite integral

In all cases

\[ \lim_{\delta x \to 0} \sum_{x=a}^{x=b} f(x)\,\delta x = \int_a^b f(x)\,dx. \tag{15.9} \]

The integral is then evaluated using (15.4) or (15.6).

The variable occurring need not be denoted by $x$, as the following examples demonstrate.

Example 15.4. An object is driven along a straight line with velocity $v(t) = e^{t/2}$ between times $t = 0$ and $t = 2$. There is a resistive force $g(v) = 3v^2$, where $v$ is velocity. Find the total work done against the resistance.
In a short interval between times $t$ and $t + \delta t$, the distance travelled, $\delta x$, is given approximately by

\[ \delta x \approx v(t)\,\delta t = e^{t/2}\,\delta t. \]

The work $\delta W$ done in this time interval is approximated by

\[ \delta W \approx g(v)\,\delta x = 3v^2\,\delta x \approx 3v^2 (e^{t/2}\,\delta t) = 3e^{t} (e^{t/2}\,\delta t) = 3e^{3t/2}\,\delta t. \]

Therefore the total work $W$ required is given by

\[ W = \lim_{\delta t \to 0} \sum_{t=0}^{t=2} 3e^{3t/2}\,\delta t = \int_0^2 3e^{3t/2}\,dt = 2[e^{3t/2}]_0^2 = 2(e^3 - 1). \]

Example 15.5. During a rainy period extending from $t = 0$ to $t = 10$ days, the rainfall rate $r$ from moment to moment, in units of centimetres per day, is found to be $r(t) = \tfrac35 t - \tfrac{3}{50} t^2$. Find the total depth of rainfall, $R$, for the period.
Take a short time interval from $t$ to $t + \delta t$ (expressed as a fraction of a day). During this period, the rainfall $\delta R$ is given approximately by

\[ \delta R \approx r(t)\,\delta t = (\tfrac35 t - \tfrac{3}{50} t^2)\,\delta t \]

(‘approximately’ because the rate of fall $r$ varies a little even through a short time). The total rainfall from $t = 0$ to 10 days is equal to the sum of all the contributions as the steps $\delta t$ tend to zero (while becoming proportionately more numerous):

\[ R = \lim_{\delta t \to 0} \sum_{t=0}^{t=10} (\tfrac35 t - \tfrac{3}{50} t^2)\,\delta t = \int_0^{10} (\tfrac35 t - \tfrac{3}{50} t^2)\,dt \quad \text{(by (15.9))} \]

\[ = [\tfrac35 (\tfrac12 t^2) - \tfrac{3}{50} (\tfrac13 t^3)]_0^{10} = \tfrac{3}{10}(10)^2 - \tfrac{1}{50}(10)^3 = 10 \text{ (cm)}. \]

Example 15.6. Suppose that, in Example 15.5, the rainfall rate is given by $r(t) = t^{1/2} e^{-t}$ (cm per day). Obtain the total rainfall $R$ between $t = 0$ and 10 days.
Proceeding as before, the total rainfall is given by

\[ R = \int_0^{10} t^{1/2} e^{-t}\,dt. \]

We cannot find an indefinite integral to enable $R$ to be evaluated. However, we know from (15.9) that $R$ must be equal to the area under the $(r, t)$ graph, which can be computed numerically by using the numerical method of equation (15.2). Divide the range $t = 0$ to 10 into $N$ strips (so that $\delta t = 10/N$), then the approximation corresponding to (15.2) becomes

\[ R \approx \frac{10}{N} \sum_{n=0}^{N-1} t_n^{1/2} e^{-t_n}, \quad \text{where } t_n = n \left( \frac{10}{N} \right). \]

The following computed values show how the exact result is approached when we take $N$ larger and larger:

N        5         10        100       1000
δt       2.00      1.00      0.10      0.01
R        0.4701    0.7070    0.8796    0.8859

The exact answer is 0.88607.

We shall show in Chapter 16 that we are not tied to equation (15.2) for calculating signed area, but can find far better computing formulae.
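The computation in Example 15.6 can be reproduced with a short Python sketch. It assumes that the rainfall rate is $r(t) = t^{1/2} e^{-t}$, which is consistent with the tabulated values and with the exact answer quoted; the function name `rainfall_sum` is ours.

```python
import math


def rainfall_sum(n, t_end=10.0):
    """Rectangle-rule estimate (15.2) of the integral of sqrt(t)*exp(-t) from 0 to t_end."""
    dt = t_end / n  # strip width, delta t = 10/N
    # Left-hand points t_k = k*dt for k = 0, ..., N-1.
    return dt * sum(math.sqrt(k * dt) * math.exp(-k * dt) for k in range(n))


for n in (5, 10, 100, 1000):
    print(n, 10.0 / n, round(rainfall_sum(n), 4))
```

The slow approach to 0.88607 is typical of the rectangle rule, and here it is made worse by the infinite slope of $t^{1/2}$ at $t = 0$.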

15.6 Improper integrals


If a definite integral has an infinite range, or the integrand becomes
infinite at some point in its range, the integral is said to be improper.
Usually these present no particular problem.

Example 15.7. Evaluate $\int_0^\infty e^{-2x}\,dx$.

Putting $\int e^{-2x}\,dx = -\tfrac12 e^{-2x}$, we have

\[ \int_0^\infty e^{-2x}\,dx = -\tfrac12 [e^{-2x}]_0^\infty = -\tfrac12 (0 - 1) = \tfrac12. \]

Example 15.8. Evaluate

\[ \int_1^\infty x^{-2}\,dx = [-x^{-1}]_1^\infty = [0 - (-1)] = 1. \]

Fig. 15.4

In Examples 15.7 and 15.8 we have (see Fig. 15.4) two cases of an infinitely long figure which encloses a finite area. This cannot always happen, even if the integrand goes to zero when $x \to \infty$.

Example 15.9. Consider $\int_1^\infty \dfrac{dx}{x}$.

We have

\[ \int_1^\infty x^{-1}\,dx = [\ln x]_1^\infty. \]

The logarithm becomes infinite as $x$ becomes infinite, so the integral is meaningless. The function $x^{-1}$ does not tend to zero fast enough to keep the area finite as we extend the range to infinity.
The case when the integrand becomes infinite at some point in its range has similar features:

Example 15.10. Consider (a) $\int_0^1 x^{-1/2}\,dx$; (b) $\int_0^1 x^{-1}\,dx$.

Notice that $x^{-1/2}$ and $x^{-1}$ are infinite at $x = 0$.

(a) $\int_0^1 x^{-1/2}\,dx = 2[x^{1/2}]_0^1 = 2[1 - 0] = 2$.

Therefore the integral gives no problem; it is again a case of an infinitely extended figure (extended in the $y$ direction this time) containing a finite area.
(b) On the other hand,

\[ \int_0^1 x^{-1}\,dx = [\ln x]_0^1, \]

and the integral this time is infinite, because $\ln 0$ is $(-\infty)$.

There are improper integrals which do not work out for a different reason:

Example 15.11. Consider the integrals

(a) $\int_0^X \cos x\,dx$, (b) $\int_0^\infty \cos x\,dx$.

(a) We have

\[ \int_0^X \cos x\,dx = [\sin x]_0^X = \sin X. \]

So long as $X$ is finite, there is therefore no problem.

(b) However, for $\int_0^\infty \cos x\,dx$ we would, straightforwardly, have a term $\sin \infty$ to interpret. The only sensible meaning that we could attach to $\sin \infty$ is that it stands for $\lim_{X \to \infty} \sin X$. But $\sin X$ has no definite limit as $X \to \infty$; it goes up and down between $\pm 1$ for ever.

Improper integrals which give a definite finite result are said to converge. If not, they are said to diverge.
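Convergence and divergence can be seen numerically by replacing the infinite limit by a large finite value and watching what happens as that value grows. The sketch below uses the antiderivatives found in Examples 15.8, 15.9 and 15.11; the function name `partial_integrals` and the sample upper limits are our own choices.

```python
import math


def partial_integrals(upper):
    """Three integrals with the infinite upper limit replaced by `upper`."""
    return {
        "x^-2 from 1": 1.0 - 1.0 / upper,  # [-1/x] between 1 and upper
        "x^-1 from 1": math.log(upper),    # [ln x] between 1 and upper
        "cos x from 0": math.sin(upper),   # [sin x] between 0 and upper
    }


for upper in (10.0, 1.0e3, 1.0e6):
    print(upper, partial_integrals(upper))
```

The first column settles down towards 1 (convergence), the second keeps growing, and the third wanders between $-1$ and $+1$ without settling (two different kinds of divergence).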

15.7 Integration of complex functions: a new type of integral

To differentiate or integrate a function containing the ‘imaginary’ element j, simply treat j like an ordinary real constant. Thus, for example,

\[ \frac{d}{dx} e^{jx} = j e^{jx} \]

and

\[ \int e^{jx}\,dx = \frac{1}{j} e^{jx} + C = -j e^{jx} + C, \]

where $C$ is an arbitrary constant (which in this context we would allow to be itself a complex number). Suppose that $a$ and $b$ are real numbers, and that

\[ c = a + jb. \]

Then

\[ \frac{d}{dx} e^{cx} = c e^{cx}, \]

and

\[ \int e^{cx}\,dx = \frac{1}{c} e^{cx} + C. \tag{15.10} \]

We use (15.10) with $c = a + jb$ to work out two integrals $U$ and $V$ which frequently occur in practice:

\[ U = \int e^{ax} \cos bx\,dx \quad \text{and} \quad V = \int e^{ax} \sin bx\,dx. \]

Omit the arbitrary constant $C$ for the moment. Observe that

\[ U + jV = \int e^{ax} \cos bx\,dx + j \int e^{ax} \sin bx\,dx \]

\[ = \int e^{ax} (\cos bx + j \sin bx)\,dx = \int e^{ax} e^{jbx}\,dx = \int e^{(a+jb)x}\,dx \]

\[ = \frac{e^{(a+jb)x}}{a + jb} \quad \text{(from (15.10))} \]

\[ = \frac{a - jb}{a^2 + b^2}\, e^{ax} (\cos bx + j \sin bx) \]

\[ = \frac{e^{ax}}{a^2 + b^2} [(a \cos bx + b \sin bx) + j(-b \cos bx + a \sin bx)]. \]

Equate this last expression to $U + jV$: the real and imaginary parts must separately be equal; so, after introducing the arbitrary constant, we have

\[ \text{(a)} \quad \int e^{ax} \cos bx\,dx = \frac{1}{a^2 + b^2}\, e^{ax} (a \cos bx + b \sin bx) + C, \]

\[ \text{(b)} \quad \int e^{ax} \sin bx\,dx = \frac{1}{a^2 + b^2}\, e^{ax} (-b \cos bx + a \sin bx) + C. \tag{15.11} \]
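The integrals can be expressed more simply in terms of a phase angle. Put

\[ \frac{a}{(a^2 + b^2)^{1/2}} = \cos\varphi \quad \text{and} \quad \frac{b}{(a^2 + b^2)^{1/2}} = -\sin\varphi \]

into (15.11a), and

\[ \frac{-b}{(a^2 + b^2)^{1/2}} = \cos\theta \quad \text{and} \quad \frac{a}{(a^2 + b^2)^{1/2}} = -\sin\theta \]

into (15.11b). Notice also that $\varphi$ and $\theta$ must therefore be related by

\[ \theta - \varphi = -\tfrac12 \pi. \]

Then (15.11) becomes

\[ \text{(a)} \quad \int e^{ax} \cos bx\,dx = \frac{1}{(a^2 + b^2)^{1/2}}\, e^{ax} \cos(bx + \varphi) + C, \]

\[ \text{(b)} \quad \int e^{ax} \sin bx\,dx = \frac{1}{(a^2 + b^2)^{1/2}}\, e^{ax} \cos(bx + \varphi - \tfrac12 \pi) + C, \tag{15.12} \]

where $\cos\varphi = a/(a^2 + b^2)^{1/2}$, $\sin\varphi = -b/(a^2 + b^2)^{1/2}$.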

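Example 15.12. Evaluate $I = \int_0^\infty e^{-x} \cos 2x\,dx$.

Equation (15.11a) or (15.12a) can be used directly with $a = -1$, $b = 2$. However, we will go through the working from first principles, but express the argument differently. Remember that

\[ e^{\alpha + j\beta} = e^{\alpha} e^{j\beta} = e^{\alpha} (\cos\beta + j \sin\beta); \]

then we have $e^{-x} \cos 2x = \mathrm{Re}\, e^{(-1+2j)x}$. Therefore

\[ I = \int_0^\infty e^{-x} \cos 2x\,dx = \mathrm{Re} \int_0^\infty e^{(-1+2j)x}\,dx = \mathrm{Re} \left[ \frac{e^{(-1+2j)x}}{-1 + 2j} \right]_0^\infty \]

\[ = \mathrm{Re} \left( 0 - \frac{1}{-1 + 2j} \right) = \mathrm{Re}\, \frac{1 + 2j}{5} = \tfrac15. \]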
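Python has complex numbers built in (and writes the imaginary unit as `j`, as this chapter does), so the method of this section can be checked directly. The sketch below evaluates the antiderivative (15.10) with $c = a + jb$, compares its real and imaginary parts with (15.11), and then repeats Example 15.12, assuming the range of integration is from 0 to infinity. The function names are ours.

```python
import cmath
import math


def exp_trig_integrals(a, b, x):
    """(U, V) as the real and imaginary parts of e^{cx}/c, with C = 0."""
    c = complex(a, b)
    w = cmath.exp(c * x) / c  # antiderivative of e^{cx}, from (15.10)
    return w.real, w.imag


def formula_15_11(a, b, x):
    """(U, V) from the real formulae (15.11a) and (15.11b), with C = 0."""
    scale = math.exp(a * x) / (a * a + b * b)
    u = scale * (a * math.cos(b * x) + b * math.sin(b * x))
    v = scale * (-b * math.cos(b * x) + a * math.sin(b * x))
    return u, v


print(exp_trig_integrals(-1.0, 2.0, 0.7))
print(formula_15_11(-1.0, 2.0, 0.7))

# Example 15.12: a = -1, b = 2. The upper limit contributes 0 because
# e^{-x} tends to 0, so I = 0 - U(0).
u_at_zero, _ = exp_trig_integrals(-1.0, 2.0, 0.0)
example_15_12 = 0.0 - u_at_zero
print(example_15_12)
```

The two printed pairs should agree to rounding error, which is a useful check on the signs in (15.11).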

15.8 The area analogy for a definite integral

A signed area can be represented as a definite integral as in (15.4). Conversely, any definite integral $\int_a^b f(x)\,dx$, whatever it represents, can be interpreted as representing the signed area of the graph $y = f(x)$ between $a$ and $b$. The connection with area means that we have a picture of an integral which can often give useful information without the need to evaluate the integral, which might in any case be impossible. One example of this is the simple numerical method described in Section 15.2. We restate the connection, calling it the area analogy.

The area analogy

The definite integral $\int_a^b f(x)\,dx$ always represents the signed area of the graph $y = f(x)$ from $x = a$ to $b$. (15.13)

The following section illustrates the use of the principle (15.13): it will also be referred to in later chapters.

15.9 Using the area analogy


In this section we shall use $t$ instead of $x$, and $x$ instead of $y$, so that we consider functions of the form

\[ x = f(t). \]

The reason is that time $t$ is commonly the physical variable in contexts where these techniques are found useful.
Sometimes, by using the area analogy (15.13), the graph of the integrand of $\int_a^b f(t)\,dt$ makes it obvious that the value of a definite integral is zero. Figure 15.5 shows some simple cases.
The ranges of the integrals on the two sides of the origin are equal, and because of the special symmetry the positive and negative contributions cancel out. Such functions are called odd functions, or functions odd about the origin. They have the following property (see Section 1.4):

Odd functions f(t)

Odd functions satisfy the condition $f(-t) = -f(t)$. (15.14)

Some basic odd functions are

$t, t^3, t^5, \ldots$, and their reciprocals;

$\sin at, \sin^3 at, \ldots$, and $\tan at, \tan^3 at, \ldots$, where $a$ is constant.

Symmetrical integrals over functions which are odd about the origin

If $f(-t) = -f(t)$ then $\int_{-c}^{c} f(t)\,dt = 0$. (15.15)

Fig. 15.5 (a) $x = t^3$; $\int_{-1}^{1} t^3\,dt = 0$. (b) $x = \sin t$; $\int_{-\pi}^{\pi} \sin t\,dt = 0$. (c) $x = t + \sin 2t$; the integral of $(t + \sin 2t)$ over a range symmetrical about the origin is 0.
Another useful class are even functions, which are symmetrical about the $x$ axis:

Even functions f(t)

$f(t)$ is even if $f(-t) = f(t)$. (15.16)

Some basic even functions are

$t^2, t^4, t^6, \ldots$, and their reciprocals;

$\cos at$ and $\cos^n at$ ($a$ is a constant);

also even powers of any odd function, such as $\tan^6 at$. It is also useful to realize that

(odd function) × (even function) = (odd function),

(odd function) × (odd function) = (even function).

Some even functions are shown in Fig. 15.6.

Example 15.13. Show that the following integrals, each taken over a range symmetrical about the origin, are zero:

(a) $\int t^4 \sin 3t\,dt$; (b) $\int t^5 \cos 3t \cos \tfrac12 t\,dt$; (c) $\int_{-1}^{1} (e^{2t} - e^{-2t})\,dt$.

(a) The function $t^4$ is even and $\sin 3t$ is odd, so the integrand is odd. Since the range is symmetrical about the origin, the integral is zero.
(b) $t^5$ is odd, $\cos 3t$ is even, and $\cos \tfrac12 t$ is even, so the integrand is odd and the integral is zero as in (a).
(c) $e^{2t} - e^{-2t}$ is odd (put $-t$ in place of $t$ in the function: it just changes its sign). Therefore the integral is zero.

If we integrate an even function between $\pm c$, the graph shows that we get equal contributions from both sides of the origin, which gives the following result.

Symmetrical integrals of even functions

If $f(t)$ is even, then $\int_{-c}^{c} f(t)\,dt = 2 \int_0^c f(t)\,dt$. (15.17)

These ideas may also be useful if there is special symmetry about some point other than the origin.

Example 15.14. Show that $\int_0^{2\pi} \cos^3 t\,dt = 0$.

In the graph of $x = \cos t$ (Fig. 15.7) the parts OBA and DBC are congruent, and similarly for the other pair of divisions; in fact all four divisions are congruent. For the graph of $x = \cos^3 t$ the shape is changed, but the four pieces remain congruent and retain their original sign. The resulting cancellation gives zero for the integral.
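The cancellations described in this section can be checked numerically. The sketch below uses a midpoint variant of the sum (15.2), which is our own choice: sampling at the middle of each strip keeps the sample points symmetrical about the origin, so the cancellation for odd integrands is visible to rounding error. The limits $\pm 1$ are assumed purely for illustration.

```python
import math


def midpoint_sum(f, a, b, n=2000):
    """Sum of f at strip midpoints times the strip width (a variant of (15.2))."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


# Odd integrands over a range symmetrical about the origin: expect zero.
odd_a = midpoint_sum(lambda t: t**4 * math.sin(3 * t), -1.0, 1.0)
odd_c = midpoint_sum(lambda t: math.exp(2 * t) - math.exp(-2 * t), -1.0, 1.0)

# Example 15.14: symmetry about points other than the origin.
cos_cubed = midpoint_sum(lambda t: math.cos(t) ** 3, 0.0, 2 * math.pi)

# (15.17) for the even function t^2: full range equals twice the half range.
even_full = midpoint_sum(lambda t: t * t, -1.0, 1.0, n=2000)
even_half = midpoint_sum(lambda t: t * t, 0.0, 1.0, n=1000)

print(odd_a, odd_c, cos_cubed)
print(even_full, 2 * even_half)
```

No antiderivative is needed for any of these checks, which is the point of the area analogy.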

15.10 Definite integrals having variable limits


Integrals of the following type occur rather frequently in applications:

\[ I(x) = \int_c^x f(t)\,dt, \tag{15.18} \]

where $c$ is a constant. Although $I(x)$ depends on $x$, this is still a definite integral because it has limits of integration; no arbitrary constant occurs. Notice that we avoid using $x$ as the variable of integration when a limit of integration involves $x$: the same letter would be serving two totally different purposes. Therefore we have changed the variable of integration to $t$.
Suppose that $F(t)$ represents any particular indefinite integral, or antiderivative, of $f(t)$. Then, as always,

\[ I(x) = [F(t)]_c^x = F(x) - F(c). \tag{15.19} \]

Since $F(c)$ is constant,

\[ \frac{d}{dx} \int_c^x f(t)\,dt = \frac{dI(x)}{dx} = \frac{dF(x)}{dx} = f(x). \]

Similarly we can obtain

\[ \frac{d}{dx} \int_x^c f(t)\,dt = -f(x). \]

Now consider the more complicated case

\[ I(x) = \int_{u(x)}^{v(x)} f(t)\,dt = F(v(x)) - F(u(x)). \]

By using the chain rule, we have

\[ \frac{d}{dx} F(v(x)) = \frac{dF}{dv} \frac{dv}{dx} = f(v(x)) \frac{dv}{dx}, \]

with a similar result for $F(u(x))$, and finally we have the results

Differentiation of integrals

\[ \text{(a)} \quad \frac{d}{dx} \int_{u(x)}^{v(x)} f(t)\,dt = f(v(x)) \frac{dv(x)}{dx} - f(u(x)) \frac{du(x)}{dx}. \]

(b) (Special cases):

\[ \frac{d}{dx} \int_c^x f(t)\,dt = f(x) \quad \text{and} \quad \frac{d}{dx} \int_x^c f(t)\,dt = -f(x). \tag{15.20} \]
(The results (15.20b) are simply (15.20a) in the respective cases $v(x) = x$, $u(x) = c$, or $v(x) = c$, $u(x) = x$.) It is worth noticing that (15.20) does not require you to integrate anything!

Example 15.15. Obtain $dI/dx$ when $I(x) = \int_{x^2}^{3x^2} e^t\,dt$.

Here $f(t) = e^t$, $u(x) = x^2$, $v(x) = 3x^2$.

Therefore

\[ \frac{dI}{dx} = e^{3x^2}\, 6x - e^{x^2}\, 2x = 2x (3e^{3x^2} - e^{x^2}). \]
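The result of Example 15.15 can be checked against a direct numerical derivative. In the sketch below, `I` evaluates the integral using the antiderivative $e^t$ and (15.4), while `dI_dx` uses (15.20a); the test point $x = 0.5$ and the step `eps` are arbitrary choices of ours.

```python
import math


def I(x):
    """Integral of e^t from x^2 to 3x^2, evaluated as [e^t] between the limits."""
    return math.exp(3 * x * x) - math.exp(x * x)


def dI_dx(x):
    """(15.20a) with f(t) = e^t, u(x) = x^2, v(x) = 3x^2."""
    return math.exp(3 * x * x) * 6 * x - math.exp(x * x) * 2 * x


x, eps = 0.5, 1e-6
central_difference = (I(x + eps) - I(x - eps)) / (2 * eps)
print(dI_dx(x), central_difference)
```

The two printed numbers should agree to several decimal places. Note that `dI_dx` never integrates anything, exactly as remarked after (15.20).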

Equation (15.19) shows that $I(x) = \int_c^x f(t)\,dt$ is an antiderivative of $f(x)$. It might be thought that, by choosing various values for $c$, we could reproduce all the antiderivatives $F(x)$ corresponding to $f(x)$. However, this expectation is only sometimes correct.

Example 15.16. Let $f(x) = \cos x$, with antiderivatives $\sin x + C$, where $C$ is an arbitrary constant. Demonstrate that for the integral $\int_c^x f(t)\,dt$ it is not possible to find a value of $c$ which will reproduce the antiderivative $\sin x + 1000$.
We have

\[ \int_c^x \cos t\,dt = [\sin t]_c^x = \sin x - \sin c. \]

But $-1 \le \sin c \le 1$ no matter what value of $c$ we take, so we could never make the integral equal to $\sin x + 1000$.

Example 15.17. The function shown in Fig. 15.8a is described by $f(x) = 1$ when $0 \le x < 1$, $f(x) = -1$ when $1 < x < 2$, and $f(x) = 0$ when $x > 2$. Sketch a graph of the function

\[ I(x) = \int_0^x f(t)\,dt. \]

Range $0 < x < 1$.

\[ I(x) = \int_0^x 1\,dt = x. \tag{i} \]

Range $1 \le x < 2$. In this range, $f(t)$ is described by a different expression, $-1$ instead of 1, so we must split up the integral:

\[ I(x) = \int_0^x f(t)\,dt = \int_0^1 f(t)\,dt + \int_1^x f(t)\,dt \]

\[ = 1 + \int_1^x (-1)\,dt \quad \text{(after using (i) at } x = 1\text{)} \]

\[ = 1 - (x - 1) = 2 - x. \tag{ii} \]

Range $x > 2$. Split the integral again:

\[ I(x) = \int_0^2 f(t)\,dt + \int_2^x f(t)\,dt = 0 + \int_2^x 0\,dt = 0, \tag{iii} \]

where we used the value of (ii) at $x = 2$. The resulting graph of $I(x)$ is shown in Fig. 15.8b.

Problems

2 x + x2
15.1. Sketch each of the following curves; then express (c) x(x2 — 1) dx; (d) dx;
o A"
the signed areas under them firstly as the sums of
strips, as in (15.1), and secondly as definite integrals, ,’2 f(f + 1) f4 y/u - 1
as in (15.3); and finally evaluate them by (15.4). (e) | —— dt; (f) J du;
fT
(a) y = a-3, - 1 < x < 2; (b) y = a5, - 1 ^ a ^ 1;
•o dw -i
(c) y = sin a, — n ^ a ^ 0; (d) y = e~2x, 0 < a ^ 1.
(g) (h) dx;
_ ! 2w + 3
15.2. Evaluate the following indefinite integrals (remem¬
ber the arbitrary constant). (i) cos2 3f dt (cos2 A = y(l + cos 2A)).

(a) x* dx-; (b) (a + 1/ dx; (c) e** dx;


15.5. Evaluate the following infinite integrals.

(d) sin x dx; (e) (cos a — 2 sin 2a) dx; dx


(a) dt; (b) e du; (c)
»
i
r r
(f) r*df; (g) cos 2u dw; (h) 3e“*ydy; dx f1 ds f dt
J (d) -/ (e) (0
* o (2x + 3)2 J 0 ** J , (f - 1)*

(i) (1 + 3r2 - 2t) dr; (j) (1+4 cos 4w) dw;


(g) e 21 sin 3t dt (see Section 15.7);

(k) (— x)l dx when x is negative (you will have to


(h) e cos 21 dt (see Section 15.7).
experiment to find a valid antiderivative).

15.6. The mean or average value of /(t) over an interval


15.3. Evaluate the following definite integrals. ] fT
0 ^ f ^ T is the quantity — /(f) dt. Find the mean
T
(a) a3 dx; (b) a2 dx; (c) dx;
values of the following over the intervals given.
-1
(a) /(f) = t, 0 *5 t < 1; (b) /(f) = f, — 1 sS t 1;
t*-* ri (c) /(f) = sin f, 0 ^ f 7r;
(d) a* dx; (e) (1 — 3x + 2x2) dx;
(d) /(t) = sin t, 0 5$ t ^ 27t; (e) /(t) = t-2, 1 ^ f ^ T\
o
(f) /(f) = e~‘ cos f, 0 < t ^ 2rc;
ei (g) /(f) = e_2! sin f, 0 < f < oo;
(0 (x 3 + x -) dx; (g) x 2 dx;
(h) /(f) = 1 — e(work out the mean value over 0 ^
1
f ^ T for several increasing values of T, and deduce the
r -1 value of the mean over 0 ^ f < oo; if you put T = oo into
(h) x 1 dx (take care: the x values are negative);
the integral directly, it turns out to be infinite, or
‘diverges’ (see Section 15.6), so no conclusion can be
drawn from this approach).
(i) (— a)- dx (see the remark in Problem 15.2k);
(i) /(r) = f1, 1 < f < co (it is necessary to follow the
procedure in the previous question, for the same reason).
p 1 Pin
-i.x
e~ asin
3'v dx; (k) 4x dx;
(j) 15.7. Use the even/odd properties of the integrands
0 do
(see Section 15.9) to prove the following results.
f 2n
(1) sin lx dx; (m) cos yx dx.
(a) sin4 f df = 2 sin4 f df;
- 71
’1 r
15.4. Evaluate the following integrals, using the nota¬
(b) df = 0;
tion of (15.4). -i (1 + f4)

'n f cos f
x(x2 + x + 1) dx; (b) (x — l)(x + 1) dx; (c) df = 0; (d) t2 sin(f3) df = 0.
(a)
1 + t2

15.8. (Computational: See Section 15.2 and Examples 15.10. (Section 15.10). Find d//dx where I(x) is given
15.1 and 15.6). Write a simple program based on the by the following integrals.
rb
algorithm (15.2) to evaluate a definite integral fix)dx-
a (a) t2 dt; (b) sin5f dt; (c) dt;
Assume that you have a subroutine for evaluating fix), o 1 +1
and that you input a, b, and N (the number of subdivi¬
fV(J[+1>
sions); also either a permissible error E, or a parameter (d) f In t dt; (e) sin(f2) dt.
M which determines the number of iterations. If you use
E, the process might be written to print out when two
successive iterations are within E of each other. Check 15.11. (See Example 15.17: it is necessary to split up
the correctness of the program by using a function such
as x1 as integrand. the integrals.) Obtain /(f) dt where /(x) is defined
Estimate the values of the following integrals.
by the following.

(a) dx; (b) sin x2 dx; (c) cos e x dx (0 if x < - 1

(a) fix) = x if — 1 < x < 1 ; consider positive and

15.9. (Computational) (a) Convince yourself that \


0 if x > 1
e~'x'2 < e~x when x > 1. Use the area analogy (15.13)
negative values of x.
to show that, if b > 1. then dx < : dx.
fx if 0 ^ x < 1
Deduce that, if £ is a positive number and E < 1, then
(b) f(x) = < 2 — x if 1 < x < f ); consider positive x
e -v" dx < E for b > — In E.
0 if x > §
(b) Use the program written for Problem 15.8 to only.

evaluate the improper integral dx to within two


15.12. An ‘RL’ circuit has a constant current I0 flowing,
decimal places, in the following way. You have to stop produced by a constant applied voltage. A switch cuts
the integral somewhere: the program cannot deal with
off the voltage and closes the circuit again at time
h = x. Take a permissible error E = 0.001, say, to leave
f = t0 > 0. For t > f0, the current is given by /(f) =
some leeway. Referring to (a), choose b> — In £, and
I0 Obtain expressions for Q(t) for t ^ 0, where
rb
compute the integral e * dx. The part of the original
6(0 = /(u) dt/.
integrand between b and infinity will then be negligible. o
Applications involving the integral as a sum

Contents
16.1 Examples of integrals arising from a sum
16.2 Geometrical area in polar coordinates
16.3 The trapezium rule
16.4 Centre of mass, moment of inertia
Problems

Reminder: The short table (12.2) provides all the indefinite integrals
(antiderivatives) required for this chapter.

16.1 Examples of integrals arising from a sum


The examples which follow show typical cases where integrals arise
from sums of the type (15.9).

Example 16.1. The tension $T$ in an elastic string is given by $T = 0.01x$ (kg m s$^{-2}$), where $x$ is the extension beyond the natural length. Find the work done on the string to stretch it 2 metres beyond its natural length.

Fig. 16.1 (labels: natural length, extension, tension $T$, string)

To stretch it from extension $x$ to $x + \delta x$, the work $\delta W$ required is approximated by

\[ \delta W \approx \text{force} \times \text{distance} = T\,\delta x = 0.01x\,\delta x. \]

The total work $W$ is approximated by

\[ W = \sum_{x=0}^{x=2} \delta W \approx \sum_{x=0}^{x=2} 0.01x\,\delta x. \]

Now let $\delta x \to 0$. Then we obtain

\[ W = \lim_{\delta x \to 0} \sum_{x=0}^{x=2} 0.01x\,\delta x = \int_0^2 0.01x\,dx \quad \text{(from (15.9))} \]

\[ = 0.01 [\tfrac12 x^2]_0^2 = 0.02 \ (\text{kg m}^2\,\text{s}^{-2}). \]

Example 16.2. A car runs from rest to rest in 1 hour, its velocity $v$ being given by $v = 200t(1 - t)$ (in kilometres per hour). The rate of fuel consumption, $f$ (in litres per kilometre), is related to the velocity by $f = 10^{-4} v^2$. Find (a) the distance travelled and (b) the fuel used.

(a) In time $\delta t$ it travels a distance $\delta x$, where

\[ \delta x \approx v\,\delta t. \]

The total displacement $x$ (which is equal to the distance travelled since $v$ is always positive) is therefore

\[ x = \lim_{\delta t \to 0} \sum_{t=0}^{t=1} v\,\delta t = \int_0^1 v\,dt = \int_0^1 200t(1 - t)\,dt = 200 \int_0^1 (t - t^2)\,dt \]

\[ = 200 [\tfrac12 t^2 - \tfrac13 t^3]_0^1 = 33\tfrac13 \text{ (km)}. \]

(b) In distance $\delta x$, it uses an amount of fuel $\delta F$ approximated by

\[ \delta F \approx f\,\delta x \approx f v\,\delta t = 10^{-4} v^2 (v\,\delta t) = 800 t^3 (1 - t)^3\,\delta t. \]

The total fuel used, $F$, is given by

\[ F = \lim_{\delta t \to 0} \sum_{t=0}^{t=1} 800 t^3 (1 - t)^3\,\delta t = \int_0^1 800 t^3 (1 - t)^3\,dt \]

\[ = 800 \int_0^1 (t^3 - 3t^4 + 3t^5 - t^6)\,dt \]

\[ = 800 [\tfrac14 t^4 - \tfrac35 t^5 + \tfrac12 t^6 - \tfrac17 t^7]_0^1 = 5.71 \text{ (litres)}. \]
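Both answers in Example 16.2 are limits of sums of the type (15.9), so they can be checked by forming the sums directly with a small step. The sketch below reuses the rectangle rule (15.2); the helper names and the choice of 10000 steps are ours.

```python
def rectangle_rule(f, a, b, n):
    """Rectangle-rule sum (15.2): h times the sum of f at the left-hand points."""
    h = (b - a) / n
    return h * sum(f(a + k * h) for k in range(n))


def velocity(t):
    """Velocity in km per hour at time t hours."""
    return 200.0 * t * (1.0 - t)


def fuel_rate(v):
    """Fuel consumption in litres per km at velocity v."""
    return 1.0e-4 * v * v


# (a) distance: sum of v * dt.  (b) fuel: sum of f * dx = f * v * dt.
distance = rectangle_rule(velocity, 0.0, 1.0, 10000)
fuel = rectangle_rule(lambda t: fuel_rate(velocity(t)) * velocity(t), 0.0, 1.0, 10000)
print(distance, fuel)
```

Because the car starts and finishes at rest, the integrands vanish at both ends of the range, and the rectangle rule is unusually accurate here.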

Example 16.3. The straight line $y = (r/h)x$, between $x = 0$ and $x = h$, is rotated around the $x$ axis to sweep out a solid cone of height $h$ and circular base radius $r$. Obtain an expression for its volume.
Divide the interval OH into a large number of equal small steps $\delta x$ (Fig. 16.2). Consider the step PQ between $x$ and $x + \delta x$. This identifies a thin slice of the cone, like a slice of bread. Its volume $\delta V$ is nearly that of a cylinder of radius $y$ and thickness $\delta x$, so

\[ \delta V \approx \pi y^2\,\delta x. \]

The total volume is obtained by adding all the $\delta V$ and then letting the slices tend to zero thickness (at the same time becoming proportionately more numerous):

\[ V = \lim_{\delta x \to 0} \sum_{x=0}^{x=h} \pi y^2\,\delta x = \int_0^h \pi y^2\,dx \]

\[ = \frac{\pi r^2}{h^2} \int_0^h x^2\,dx = \frac{\pi r^2}{h^2} [\tfrac13 x^3]_0^h = \frac{\pi r^2}{h^2} \frac{h^3}{3} = \tfrac13 \pi r^2 h. \]

The volume of any solid of revolution between $x = a$ and $x = b$, formed by rotating a profile $y = f(x)$ around the $x$ axis, can be found in exactly the same way:

Volume of a solid of revolution around the x axis

For a profile $y = f(x)$, $a \le x \le b$, the volume

\[ V = \int_a^b \pi y^2\,dx. \tag{16.1} \]

Example 16.4. Find the geometrical area enclosed between the curves $y = 2x^2 - 1$ and $y = x^2$.

This problem is complicated if we have to think all the time about the difference between signed and geometrical area as in Chapter 15.
Here it will be done in a different way. Divide the interval $-1 \le x \le 1$ into short steps of length $\delta x$ and consider the area elements indicated in Fig. 16.3. They are nearly rectangular, and the geometrical (positive) area $\delta A$ of each is given by

\[ \delta A \approx |x^2 - (2x^2 - 1)|\,\delta x = (-x^2 + 1)\,\delta x \]

(we may drop the modulus signs since $-x^2 + 1 \ge 0$ in the given range).
The total geometrical area $A$ is therefore given by

\[ A = \lim_{\delta x \to 0} \sum_{x=-1}^{x=1} (-x^2 + 1)\,\delta x = \int_{-1}^{1} (-x^2 + 1)\,dx \]

\[ = [-\tfrac13 x^3 + x]_{-1}^{1} = (-\tfrac13 + 1) - (\tfrac13 - 1) = \tfrac43. \]

Fig. 16.3

16.2 Geometrical area in polar coordinates


In Fig. 16.4, AB represents part of a curve which is described in polar coordinates by

\[ r = f(\theta), \quad \alpha \le \theta \le \beta. \]

Form a new, nonrectangular type of area element $\delta A$ by dividing the $\theta$ range, $\theta = \alpha$ to $\theta = \beta$, into small angular steps $\delta\theta$, expressed in radians. (We use $A$ rather than $\mathcal{A}$ because, in polar coordinates, we always regard $r$ as being positive, and we shall count the area elements as positive.) A typical area element has the shape OPQ. When $\delta\theta$ is small, OPQ has very nearly the same area as a narrow circular sector of radius $r$ and angle $\delta\theta$ radians. Its area is therefore a fraction $\delta\theta/2\pi$ of a complete circle of radius $r$ and area $\pi r^2$:

\[ \delta A \approx \frac{\delta\theta}{2\pi} \pi r^2 = \tfrac12 r^2\,\delta\theta. \]

The total area is obtained by adding all the elements and letting $\delta\theta$ tend to zero:

\[ A = \lim_{\delta\theta \to 0} \sum_{\theta=\alpha}^{\theta=\beta} \tfrac12 r^2\,\delta\theta = \int_\alpha^\beta \tfrac12 r^2\,d\theta, \]

where $r = f(\theta)$.

Area of a sector in polar coordinates

For a sector $r = f(\theta)$, with $\alpha \le \theta \le \beta$,

\[ A = \tfrac12 \int_\alpha^\beta r^2\,d\theta. \tag{16.2} \]

Example 16.5. Find the area of the loop of the curve $r = 3 \sin 2\theta$ in the first quadrant.
For the loop shown in Fig. 16.5, the range of $\theta$ is $0 \le \theta \le \tfrac12 \pi$. Thus in (16.2)

\[ f(\theta) = 3 \sin 2\theta, \quad \alpha = 0, \quad \beta = \tfrac12 \pi. \]

The area is therefore given by

\[ A = \tfrac12 \int_0^{\pi/2} (3 \sin 2\theta)^2\,d\theta = \tfrac92 \int_0^{\pi/2} \sin^2 2\theta\,d\theta. \]

But, for any angle $B$, $\sin^2 B = \tfrac12 (1 - \cos 2B)$; so

\[ A = \tfrac94 \int_0^{\pi/2} (1 - \cos 4\theta)\,d\theta = \tfrac94 [\theta - \tfrac14 \sin 4\theta]_0^{\pi/2} = \tfrac98 \pi. \]
16.3 The trapezium rule

Practical problems often give rise to integrals which the investigator cannot evaluate or find in a dictionary of integrals. Indeed, sometimes integrals which are very simple-looking cannot, in principle, be expressed in terms of ordinary ‘formulae’ at all. However, numerical approximations to definite integrals can usually be obtained to any required degree of accuracy by using numerical methods in conjunction with a computer. We will mention some very simple methods that call directly on the area analogy (15.13), which we repeat here:

The area analogy

The definite integral $\int_a^b f(x)\,dx$ is equal to the signed area between $y = f(x)$ and the $x$ axis from $x = a$ to $x = b$. (16.3)

In Examples 15.1 and 15.6, we illustrated the use of the area analogy (15.13) using as the area approximation the sum in (15.1), which had been introduced only for the purpose of establishing the principle. It only gives close approximations if we use very small step lengths; but, now that the area analogy is established, we can look for approximation methods that will be more efficient.
An improved area approximation is shown in Fig. 16.6, where the curve $y = f(x)$ is ‘fitted’ by a polygonal curve. The approximation to the area of each strip individually is obviously better in general than we would get from a rectangle. Divide the interval $x = a$ to $x = b$ into $N$ steps. We shall denote the length of each step by $h$ (instead of $\delta x$, because $h$ is conventional in numerical analysis). Then

\[ h = \frac{b - a}{N}. \]
16.3 Applications involving the integral as a sum 269

Number the N + 1 points of division 0, 1,2N: the x values are

x0 ( = a), Xj, x2, xN_u xN { = b)

and the 3; values y0, yu y2,.. .,yN_l,yN.


Each of the approximating area elements is a trapezium. The
signed area 8An of the nth area element is given by

~ i(v„-i + yjh = h—a(yn_1 + yn).

The total area J4 is approximated by the sum of these:

* b — a
a ~ X -TT)-(y«-i + y«)
„=1 in

b — a
[(To + yi) + (yi + y2) + • • • + (y^-i + yjv)]
~2TsT
b — a
[£yo + (y 1 + y2 + • • ■ + yjv-1) + 2yjv] •
N

This is called the trapezium rule.

Trapezium rule

$$\int_a^b f(x)\,dx \approx \frac{b - a}{N}\left[\tfrac{1}{2}y_0 + (y_1 + y_2 + \cdots + y_{N-1}) + \tfrac{1}{2}y_N\right]. \qquad (16.4)$$

The interval is divided into N equal steps:
$x_0\ (= a), x_1, \ldots, x_N\ (= b)$ are the division points;
and $y_n = f(x_n)$ (n = 0, 1, 2, ..., N).

In the following example, we compare the trapezium rule (16.4) with
the rectangle rule (15.2), which we can recast for comparison as

$$\int_a^b f(x)\,dx \approx \frac{b - a}{N}(y_0 + y_1 + \cdots + y_{N-1}).$$

Example 16.6. Compare the efficiency of the trapezium rule (16.4) with the
rectangle rule (15.2) for approximating to $\int_0^1 e^{-x}\,dx$.

We set out the results in the following table.

N 10 100 1000
h = (b — a)/N 0.1 0.01 0.001
Rectangle rule 0.66 0.635 0.6324
Trapezium rule 0.632 657 0.632 125 0.632 120

The exact value is 0.632 120 5 .... For three-decimal accuracy, the rectangle rule
requires about 1000 divisions and the trapezium rule only about 12. There are
many formulae which are far more efficient than even the trapezium rule - one
of the best of these, for combining simplicity with accuracy, being Simpson’s
rule (see Problem 16.21). The reader should look at books on numerical
analysis for others.

16.4 Centre of mass, moment of inertia


Suppose that there are N particles attached to a weightless plane
sheet (Fig. 16.7), the nth particle being at $P : (x_n, y_n)$ and having
mass $m_n$, where n = 1, 2, ..., N.
Let $G : (\bar{x}, \bar{y})$ be the centre of mass. It is the balancing point of
the assembly; the point such that the total moment of the particles
about any axis through G is zero. Consider in particular the axes
CGD and AGB, parallel to the y and x axes and passing through
G. Then

$$\sum_{n=1}^{N} m_n(x_n - \bar{x}) = 0, \qquad \sum_{n=1}^{N} m_n(y_n - \bar{y}) = 0. \qquad (16.5)$$

These can be written

$$\sum_{n=1}^{N} m_n x_n - \bar{x}\sum_{n=1}^{N} m_n = 0, \qquad \sum_{n=1}^{N} m_n y_n - \bar{y}\sum_{n=1}^{N} m_n = 0.$$

Let $\sum_{n=1}^{N} m_n = M$, the total mass; then these equations give

$$\bar{x} = \frac{1}{M}\sum_{n=1}^{N} m_n x_n, \qquad \bar{y} = \frac{1}{M}\sum_{n=1}^{N} m_n y_n.$$
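
The two formulae translate directly into a few lines of code. This is a minimal sketch: the function name and the particle data are made up for illustration and do not come from the text.

```python
def centre_of_mass(particles):
    """particles: list of (mass, x, y) tuples. Returns (x_bar, y_bar)."""
    total_mass = sum(m for m, _, _ in particles)
    x_bar = sum(m * x for m, x, _ in particles) / total_mass
    y_bar = sum(m * y for m, _, y in particles) / total_mass
    return x_bar, y_bar


# Hypothetical assembly of three particles: (mass, x, y).
particles = [(2.0, 0.0, 0.0), (1.0, 3.0, 0.0), (1.0, 0.0, 4.0)]
x_bar, y_bar = centre_of_mass(particles)

# The total moments about the axes through G should vanish, as in (16.5).
moment_x = sum(m * (x - x_bar) for m, x, _ in particles)
moment_y = sum(m * (y - y_bar) for m, _, y in particles)
print(x_bar, y_bar, moment_x, moment_y)
```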

If instead of a number of particles there is a solid plate, then this
too has a balancing point. Assume that the plate is uniform so that
its mass per unit area, $\mu$ (Greek mu), is the same everywhere on it.
We also assume that the shape of the plate is such that no vertical
or horizontal line cuts across the boundary more than twice: once
going in and again going out. If the shape does not have this
property, then the process as explained here has to be modified.
Suppose that the centre of mass G is at $(\bar{x}, \bar{y})$. Divide the
area into narrow vertical strips of width $\delta x$ (Fig. 16.8a). Let the
total length, or height, of a representative strip as shown be V(x).
Then its geometrical area $\delta A$ is nearly equal to $V\,\delta x$, and its
mass $\delta m$ is nearly $\mu V\,\delta x$. Therefore the moment about a vertical
axis AB through $G : (\bar{x}, \bar{y})$ is approximately given by
$(x - \bar{x})\,\delta m \approx (x - \bar{x})\mu V\,\delta x$. The sum of all the elementary
moments must be zero, since G is the mass centre. So, in the limit as $\delta x$ tends to 0, we have

$$\lim_{\delta x \to 0} \sum_{x=a}^{x=b} (x - \bar{x})V(x)\mu\,\delta x = 0,$$

where x = a and x = b represent the extreme left and right limits of
the plate. Since $\mu$ and $\bar{x}$ are constants, this is the same as

$$\mu \lim_{\delta x \to 0} \sum_{x=a}^{x=b} xV(x)\,\delta x = \mu\bar{x} \lim_{\delta x \to 0} \sum_{x=a}^{x=b} V(x)\,\delta x = \mu A\bar{x},$$

where A is the area of the plate, equal to $\lim_{\delta x \to 0} \sum_{x=a}^{x=b} V(x)\,\delta x$.

Cancelling $\mu$, we obtain

$$\bar{x} = \frac{1}{A} \lim_{\delta x \to 0} \sum_{x=a}^{x=b} xV(x)\,\delta x = \frac{1}{A}\int_a^b xV(x)\,dx.$$

Similarly, by dividing the y axis into steps $\delta y$, and considering the
moments of horizontal strips of length H(y) (see Fig. 16.8b) about
a horizontal axis CD through G, we obtain

$$\bar{y} = \frac{1}{A}\int_c^d yH(y)\,dy,$$

where y = c and y = d are the extreme lower and upper limits of
the plate.
In these expressions, all reference to mass has gone ($\mu$ is no longer
present). Therefore the centre of mass of a uniform plate is also
called the centroid of the figure representing the plate, and it
depends only on its shape and size.
In fact the moments about every line through G are zero, not
simply the moments about AB and CD parallel to the x and y axes
that we used to find G.

Centre of mass of a uniform convex plate, or
centroid of a convex area, $G : (\bar{x}, \bar{y})$

$$\bar{x} = \frac{1}{A}\int_a^b xV(x)\,dx; \qquad \bar{y} = \frac{1}{A}\int_c^d yH(y)\,dy, \qquad (16.6)$$

where A is the area; here, respectively, V(x) and H(y)
are the lengths of the vertical and horizontal strips,
and x = a, b (resp. y = c, d) are the extreme horizontal
(resp. vertical) boundaries of the figure.

Example 16.7. Find the position of the centroid or centre of mass of an
isosceles triangle of height h and base b.
Choose axes which make the job as simple as possible. In this case, use the axes
shown in Fig. 16.9.
From the symmetry of the isosceles triangle about the x axis, the centroid must
lie on this axis, so $\bar{y} = 0$ without any calculations.
The sides have equations

$$y = \pm\frac{b}{2h}x;$$

therefore the length of the strip at x is given by

$$V(x) = \frac{b}{h}x.$$

Also the area A is given by $A = \tfrac{1}{2}bh$.

Therefore by (16.6),

$$\bar{x} = \frac{2}{bh}\int_0^h x \cdot \frac{b}{h}x\,dx = \frac{2}{h^2}\int_0^h x^2\,dx = \tfrac{2}{3}h.$$

(In these coordinates, $\bar{x}$ is independent of the base b.)
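
Formula (16.6) also lends itself to numerical evaluation when V(x) is awkward to integrate by hand. The sketch below is illustrative only: the helper names and the values chosen for the height and base are assumptions. It applies the trapezium rule (16.4) to both integrals and compares the result with $\tfrac{2}{3}h$ for the triangle.

```python
def trapezium_rule(f, a, b, n=1000):
    """Trapezium rule (16.4) with n equal steps."""
    step = (b - a) / n
    interior = sum(f(a + k * step) for k in range(1, n))
    return step * (0.5 * f(a) + interior + 0.5 * f(b))


def centroid_x(strip_length, a, b, n=1000):
    """x_bar = (1/A) * integral of x V(x) dx, where A = integral of V(x) dx."""
    area = trapezium_rule(strip_length, a, b, n)
    moment = trapezium_rule(lambda x: x * strip_length(x), a, b, n)
    return moment / area


# Arbitrary illustrative dimensions for the isosceles triangle.
height, base = 3.0, 2.0
x_bar = centroid_x(lambda x: (base / height) * x, 0.0, height)
print(x_bar, 2.0 * height / 3.0)
```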

The moment of inertia is important for problems in mechanics
involving rotation: it plays a part similar to that of mass in
nonrotational problems. The moment of inertia of a single particle
of mass m about any axis AB is defined to be $md^2$, where d is its
perpendicular distance from AB (see Fig. 16.10). For the moment
of inertia of an assemblage of particles, the individual contributions
are added. For a solid plate the contributions of small area elements
are likewise added, as if they were particles, and in the limit we
obtain a definite integral. It is important to select axes and suitably-
shaped area elements to make a particular problem manageable.

Example 16.8. Find the moment of inertia I of a uniform rectangular plate
ABCD about the edge AB when AB = 2, BC = 6, and the mass per unit area is 2.
Set up axes parallel to the sides, and area elements which are vertical strips
of height 2 and width $\delta x$, as shown in Fig. 16.11. The axis of rotation is the y
axis. The mass $\delta m$ of each strip is given by

$$\delta m = (\text{surface density}) \times (\text{area}) = 2\,\delta A = 2 \times 2 \times \delta x = 4\,\delta x.$$

The moment of inertia of the strip distance x from the y axis is therefore

$$x^2\,\delta m = 4x^2\,\delta x.$$

The total moment of inertia I is given by

$$I = \lim_{\delta x \to 0} \sum_{x=0}^{x=6} 4x^2\,\delta x = 4\int_0^6 x^2\,dx = 4\left[\tfrac{1}{3}x^3\right]_0^6 = 288.$$

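Example 16.9. Find an expression for the moment of inertia I of an isosceles
triangle ABC about its base AB, when AB = b, its height is h, and its mass is M.
The axes and the representative strip at x are shown in Fig. 16.12. From the
equations of BC and AC, the length V(x) to be assigned to the strip is

$$V(x) = -\frac{b}{h}x + b,$$

and the area $\delta A$ is approximated by

$$\delta A \approx \left(-\frac{b}{h}x + b\right)\delta x.$$

Since the plate is uniform, the mass per unit area is (total mass)/(area), or $M/(\tfrac{1}{2}bh)$,
so the mass element $\delta m$ is approximated by

$$\delta m \approx \frac{M}{\tfrac{1}{2}bh}\left(-\frac{b}{h}x + b\right)\delta x = \frac{2M}{h}\left(1 - \frac{x}{h}\right)\delta x.$$

Therefore the moment of inertia I is given by

$$I = \lim_{\delta x \to 0} \sum_{x=0}^{x=h} x^2\,\delta m = \lim_{\delta x \to 0} \sum_{x=0}^{x=h} x^2\,\frac{2M}{h}\left(1 - \frac{x}{h}\right)\delta x
= \frac{2M}{h}\int_0^h x^2\left(1 - \frac{x}{h}\right)dx = \frac{2M}{h}\int_0^h \left(x^2 - \frac{x^3}{h}\right)dx$$

$$= \frac{2M}{h}\left[\tfrac{1}{3}x^3 - \frac{1}{4h}x^4\right]_0^h = \frac{2M}{h}\cdot\frac{h^3}{12} = \tfrac{1}{6}Mh^2.$$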

Example 16.10. Find the moment of inertia of a circular disc of radius R and
mass M about an axis through its centre and perpendicular to the plane of the
disc.
The usual (x, y) coordinates are not natural to this problem. In Fig. 16.13, the
polar coordinate r ranges from 0 to R. Break this range into ring-shaped steps
as shown, the representative ring or annulus having inner radius r and thickness
$\delta r$. These constitute the area elements $\delta A$.
We have $\delta A \approx 2\pi r\,\delta r$, and the mass per unit area is $M/\pi R^2$, so that the mass
of the ring $\delta m$ is approximately

$$\delta m \approx \frac{M}{\pi R^2}\,2\pi r\,\delta r.$$

The moment of inertia of the ring must be equal to that of a suitable
distribution of closely-spaced particles along its circumference. The contribution
of each of these imaginary particles to the moment of inertia of the ring is equal
to its mass times $r^2$. Since r is constant on the ring, its moment of inertia $\delta I$ is
equal to the total mass of the ring times $r^2$:

$$\delta I \approx r^2\,\delta m \approx \frac{2M}{R^2}\,r^3\,\delta r.$$

Finally

$$I = \lim_{\delta r \to 0} \sum_{r=0}^{r=R} \frac{2M}{R^2}\,r^3\,\delta r = \frac{2M}{R^2}\int_0^R r^3\,dr = \frac{2M}{R^2}\cdot\tfrac{1}{4}R^4 = \tfrac{1}{2}MR^2.$$
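
The 'integral as a sum' argument of Example 16.10 can be imitated directly by adding up the contributions of a large but finite number of rings. The following is a sketch under stated assumptions: the function name and the values of M and R are illustrative, and each ring is represented by its mid-radius rather than its inner radius, a small departure from the text that makes the finite sum converge faster.

```python
import math


def disc_moment_of_inertia(mass, radius, n_rings=1000):
    """Sum r^2 * dm over thin rings; an approximation to M R^2 / 2."""
    density = mass / (math.pi * radius ** 2)   # mass per unit area
    dr = radius / n_rings
    total = 0.0
    for k in range(n_rings):
        r = (k + 0.5) * dr                      # mid-radius of the kth ring
        dm = density * 2.0 * math.pi * r * dr   # mass of the ring
        total += r ** 2 * dm                    # its moment of inertia
    return total


M, R = 2.0, 3.0   # arbitrary illustrative values
print(disc_moment_of_inertia(M, R), 0.5 * M * R ** 2)
```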

Problems

(Units are kilogram, metre, second (SI units) where they are unstated).

16.1. The resistance R of a compression spring is given
by $R = 100x + 1000x^2$, where x is the displacement from
its natural length. Find the work done in compressing it
through a distance of 0.01.

16.2. The velocity v of a point moving along the x axis
is v = 20 - 10t, where t is the time. The displacement $\delta x$
taking place in a short time $\delta t$ is approximated by
$\delta x \approx v\,\delta t$. Express the displacement which takes place
between t = 2 and t = 4 as a definite integral, and
evaluate it. What is its x coordinate at t = 4 if it was at
x = 3 when t = 2?

16.3. Each of the following curves is the profile of a solid
of revolution which has the x axis as its central axis. Find
the volume in each case (see (16.1); you should briefly
go through the whole argument until you understand it,
not simply quote the formula):
(a) $y = e^{-x}$, $0 \le x \le 1$; (b) $y = 1/x$, $1 < x < 2$;
(c) $y = x(1 - x)$, $0 \le x \le 1$;
(d) $y = \sin x$, $0 \le x \le \pi$; (e) $y = x^3$, $-1 < x < 1$ (the
fact that $x^3$ is negative over part of its range does not
have to be taken into account: the volume elements
are always positive, unlike area elements);
(f) $y = x(1 - x)$, $0 \le x \le 2$ (see the note in (e));
(g) $y = x^{-1}$, $1 \le x < \infty$ (contrast Example 15.9, for
area); (h) $y = x^2$, $0 \le x \le 1$.

16.4. Show that the volume of a sphere of radius R is
$\tfrac{4}{3}\pi R^3$. (A sphere is a solid of revolution.)

16.5. (a) Find the volume of the ellipsoid obtained by
rotating the elliptical profile $x^2/a^2 + y^2/b^2 = 1$ about
the x axis.
(b) If the x and y scales of the profile ellipse in (a)
are contracted or expanded by suitable factors, it becomes
a unit circle. Deduce from this fact the formula for the
volume of the ellipsoid of revolution.

16.6. The curve y = \x between y = 1 and y = 2 is
rotated about the y axis to profile a vertical spindle, or
truncated cone. Find its volume.

16.7. A uniform beam AB of length L has mass m per
unit length. It is cemented horizontally at A into a wall
at the end A. Sum the moments about A of elements of
length $\delta x$, form a definite integral, and so find the
moment supporting the beam at A.

16.8. A 'beam' in the shape of a circular spindle made
of material of density 500 is fixed to a vertical wall at
the end A with its axis of symmetry horizontal. Its
cross-sectional area (perpendicular to its axis) is
$4 \times 10^{-4}(1 + 0.4x^2)$, where x is measured from A. Its length
is 1. Find the moment at A required to support it under
gravity.
Suppose that the data are the same, except that the
cross-section is square, or possibly irregular in shape.
Does this affect the answer? Suppose that the axis is bent,
but that x still measures the perpendicular distance from
the wall: is the calculation affected?

16.9. A narrow tube of length 10 cm and cross-section
0.1 cm$^2$ contains a chemical solution, with concentration
$c(x) = 0.04\,e^{-4x}$ gm cm$^{-3}$, where x is the distance from
one end. Find the total mass of solute in the tube.

16.10. The water clock in Fig. 16.14 has depth 0.5 m,
and its profile is given by $r(h) = 0.39h^{1/4}$, where r(h) is the
radius at height h from the outlet in the bottom. The size
of the outlet hole is such as to drain the water at a rate
given by

$$\frac{dV}{dt} = -0.003h^{1/2}\ \text{m}^3\ \text{hr}^{-1},$$

where V is the volume of water remaining. Show that
the water level falls at a uniform rate, and find how long
it runs. (Consider the change $\delta h$ in level which occurs in
a short time $\delta t$.)

16.11. An alternating current $i = i_0 \cos \omega t$ flows
through a resistor R. The instantaneous rate of heat
generation is $Ri^2$ heat units per unit time. Find the heat
generated in a complete cycle of the current, that is, in
a period $2\pi/\omega$. Does it make any difference at what
instant you regard the period as starting? (To carry out
the integration, you will need the identity
$\cos^2 A = \tfrac{1}{2}(1 + \cos 2A)$.)

16.12. Find the geometric area enclosed between the
curves y = -x and y = x(x - 1) on the interval $0 \le x \le 2$,
by considering vertical strips between the curves of width
$\delta x$.

16.13. Find the geometric area enclosed between the
curves y = -x and $y = x^3$ between x = -1 and x = 1
by considering vertical strips of width $\delta x$ connecting
the curves. (Be careful about signs: these curves cross.)

16.14. For the angular ranges specified, sketch the curves
given in polar coordinates below and find the sectorial
areas.
(a) $r = \theta$, $0 \le \theta < 2\pi$ (a spiral arc);
(b) $r = 2\cos\theta$, $-\tfrac{1}{2}\pi \le \theta \le \tfrac{1}{2}\pi$ (a circle);
(c) $r = e^{\theta/2\pi}$, $0 \le \theta \le \pi$ (spiral arc);
(d) $r = \sin 2\theta$, $0 \le \theta \le \tfrac{1}{2}\pi$.
(Remember the identities $\cos^2 A = \tfrac{1}{2}(1 + \cos 2A)$,
$\sin^2 A = \tfrac{1}{2}(1 - \cos 2A)$.)

16.15. The end of a water trough is a rectangle of height
H and width L. Find the total force and moment on the
end when the trough is full. (The pressure, meaning the
force per unit area acting perpendicularly on any surface,
at depth y is $\rho g y$, where $\rho$ is density and g the gravitation
constant.)

16.16. Determine the position of the centre of mass of a
symmetrical cone of circular cross-section which has
height H and base radius R.

16.17. Find the moment of inertia of a rectangle, having
sides a and b, about an axis through its centre, parallel
to the sides of length b.

16.18. Obtain the moment of inertia of an isosceles
triangle of height H and base B about an axis through
its vertex which is (a) parallel to the base, and (b)
perpendicular to the base.

16.19. Use the trapezium rule (16.4) to evaluate the
following integrals to 1% accuracy. (The exact value can
be obtained by evaluating the integrals in the usual way.)
(a) : dx; (b) sin x dx; (c) cos x dx.

16.20. The following integrals are either difficult or
impossible to evaluate directly. Estimate them by using
the trapezium rule (16.4). (Since you cannot know the
exact answer in advance, you can proceed by running
the program using increasingly fine divisions until you
get no change in some predetermined number of decimal
places.)
(a) sini x dx; (b) dx; (c) '2 ex dx / x; (d) 2 sin x / x dx.

16.21. The following is called Simpson's rule for numerical
integration. It results from splitting the points of
division into successive groups of three, then exactly
fitting the corresponding groups of points on the graph
by second-degree polynomials. For this purpose, N must
be an even number:

$$\int_a^b y\,dx \approx \frac{b - a}{3N}(y_0 + 4y_1 + 2y_2 + 4y_3 + 2y_4 + \cdots + 4y_{N-1} + y_N).$$

Show that $\int e^{-x^2}\,dx$ is given correctly to four decimal
places by using only four subdivisions. Compare the
trapezium rule and the rectangle rule.

16.22. Consider the curve y = f(x) for $a \le x \le b$. Show
that the arc length $\delta s$ associated with a short step $\delta x$ is
given by $\delta s \approx (\delta x^2 + \delta y^2)^{1/2}$. Deduce that the total
length s of the curve is given by

$$s = \int_a^b \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} dx.$$

This type of integral is usually impossible to evaluate
explicitly, but can be done numerically. Compute the
lengths of the following curves. (Try the trapezium
rule, Simpson's rule of Problem 16.21, and an integrating
routine from a software package if you know how to use
it: the interest lies in comparing them.)
(a) $y = \sin x$, $0 \le x \le 1$;
(b) $y = x^2$, $0 < x < 2$;
(c) $y = e^x$, $-1 < x < 1$;
(d) $y = (1 - x^2)^{1/2}$, $0 \le x \le 1$ (a semi-circle, so it can be
done directly).
17 Systematic techniques for integration

Contents
17.1 Substitution method for $\int f(ax + b)\,dx$
17.2 Substitution method for $\int f(ax^2 + b)\,x\,dx$
17.3 Substitution method for $\int \cos^m ax\,\sin^n ax\,dx$ (m or n odd)
17.4 Definite integrals and change of variable
17.5 Occasional substitutions
17.6 Partial fractions for integration
17.7 Integration by parts
17.8 Integration by parts: definite integrals
Problems

17.1 Substitution method for $\int f(ax + b)\,dx$


Consider the indefinite integral

$$\int (3x - 2)^3\,dx.$$

We carried out this integration in Example 14.8 by starting with a
guess that the result will resemble $(3x - 2)^4$. We now describe a
method less dependent on trial and error.
We shall take up a clue suggested by the chain rule procedure
(Section 3.3). Put

$$3x - 2 = u. \qquad (17.1)$$

Then the integral becomes

$$\int u^3\,dx.$$

Unfortunately this is not equal to $\tfrac{1}{4}u^4 + C$, because dx, not du, is
present: the variable of integration is still x. Thinking in terms of
an integral as a sum, $\delta x$ is not the same size as $\delta u$; in fact from (17.1)
$\delta u = 3\,\delta x$, which suggests what to do with the new integral.
From (17.1), du/dx = 3, which we write as

$$dx = \tfrac{1}{3}\,du.$$

Put this into the integral, and it works through straightforwardly:

$$\int (3x - 2)^3\,dx = \int u^3\left(\tfrac{1}{3}\,du\right) = \tfrac{1}{3}\int u^3\,du = \tfrac{1}{12}u^4 + C.$$

Now use (17.1) to change back to x:

$$\int (3x - 2)^3\,dx = \tfrac{1}{12}(3x - 2)^4 + C,$$

and this is correct. In checking its correctness by differentiation, we
use the chain rule with u = 3x - 2, and find we are simply reversing
the order of the operations that we just went through.

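Example 17.1. Use a substitution to obtain $\displaystyle\int \frac{dx}{2x - 1}$.

Try
u = 2x - 1.
We shall need to express dx in terms of u. Since du/dx = 2, we have

$$dx = \tfrac{1}{2}\,du.$$

The integral therefore becomes, in terms of u,

$$\int \frac{dx}{2x - 1} = \int \frac{\tfrac{1}{2}\,du}{u} = \tfrac{1}{2}\ln|u| + C = \tfrac{1}{2}\ln|2x - 1| + C.$$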

Example 17.2. Evaluate $\int \sin(3x + 2)\,dx$.

Put
u = 3x + 2,
then du/dx = 3, so du = 3 dx, or $dx = \tfrac{1}{3}\,du$. The integral becomes

$$\int \sin(3x + 2)\,dx = \int \sin u \cdot \left(\tfrac{1}{3}\,du\right) = -\tfrac{1}{3}\cos u + C = -\tfrac{1}{3}\cos(3x + 2) + C.$$

The essence of the matter is that the change of variable or
substitution led to a simpler integral than the one we started with.
In general, for integrals of this type, we have the following result.

Type $\int f(ax + b)\,dx$

Put u = ax + b; then $\dfrac{du}{dx} = a$, or $dx = \dfrac{1}{a}\,du$. The
integral transforms to $\dfrac{1}{a}\int f(u)\,du$. (17.2)
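
As a further illustration of (17.2), with an integrand chosen here for illustration rather than taken from the text, let a = 4, b = 7 and $f(u) = u^{1/2}$. Putting u = 4x + 7, so that $dx = \tfrac{1}{4}\,du$, gives

$$\int (4x + 7)^{1/2}\,dx = \tfrac{1}{4}\int u^{1/2}\,du = \tfrac{1}{4}\cdot\tfrac{2}{3}u^{3/2} + C = \tfrac{1}{6}(4x + 7)^{3/2} + C.$$

Differentiating the result by the chain rule gives $\tfrac{1}{6}\cdot\tfrac{3}{2}(4x + 7)^{1/2}\cdot 4 = (4x + 7)^{1/2}$, which is the original integrand.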

It is worth while to try this substitution in more general cases,
even if it is not obvious that a simplification will take place.

Example 17.3. Evaluate $\int x(2x - 1)^3\,dx$.

This is not quite of the form (17.2) because of the presence of the loose x.
Nevertheless, put

u = 2x - 1,

with the object of simplifying at least the most complicated part. Then

$$du = 2\,dx, \quad\text{or}\quad dx = \tfrac{1}{2}\,du.$$

We also need to express x in terms of u, using u = 2x - 1:

$$x = \tfrac{1}{2}(u + 1).$$

Now we have

$$\int x(2x - 1)^3\,dx = \int \tfrac{1}{2}(u + 1)u^3\left(\tfrac{1}{2}\,du\right) = \tfrac{1}{4}\int (u^4 + u^3)\,du = \tfrac{1}{20}u^5 + \tfrac{1}{16}u^4 + C$$

$$= \tfrac{1}{20}(2x - 1)^5 + \tfrac{1}{16}(2x - 1)^4 + C.$$

Do not miss the possibility of making a substitution in simple
cases; for example:

$\int e^{-3x}\,dx$: put u = -3x, $dx = -\tfrac{1}{3}\,du$;

$\int \sin 3x\,dx$: put u = 3x, $dx = \tfrac{1}{3}\,du$;

$\displaystyle\int \frac{1 + x}{1 - x}\,dx$: put u = 1 - x, dx = -du.

17.2 Substitution method for $\int f(ax^2 + b)\,x\,dx$

Example 17.4. Evaluate $\int x\,e^{x^2}\,dx$.

Try putting
$u = x^2$,
with the objective of simplifying the unfamiliar-looking term $e^{x^2}$. It is then
necessary to deal with x and dx in the integral. We have

$$\frac{du}{dx} = 2x,$$

which we can write as du = 2x dx, or

$$x\,dx = \tfrac{1}{2}\,du.$$

In this way we have translated the whole group (x dx) into terms of u, instead
of having to deal separately with x and dx. Therefore

$$\int x\,e^{x^2}\,dx = \int e^{x^2}(x\,dx) = \int e^u\left(\tfrac{1}{2}\,du\right) = \tfrac{1}{2}\int e^u\,du = \tfrac{1}{2}e^u + C = \tfrac{1}{2}e^{x^2} + C,$$

where C is an arbitrary constant. The correctness of the result can be checked
by differentiating it.

Example 17.5. Evaluate $\displaystyle\int \frac{x\,dx}{3x^2 + 2}$.

Notice that the integral can be written in the form

$$\int \frac{1}{3x^2 + 2}\,(x\,dx).$$

The integrand contains a function of $x^2$ and the combination x dx which
appeared in Example 17.4. This suggests putting $u = x^2$ to give a simpler
integral. However, we can do even better than this.
Put
$u = 3x^2 + 2$.
Then du/dx = 6x, so that
$x\,dx = \tfrac{1}{6}\,du$.
Therefore

$$\int \frac{1}{3x^2 + 2}\,(x\,dx) = \int \frac{1}{u}\left(\tfrac{1}{6}\,du\right) = \tfrac{1}{6}\int \frac{du}{u} = \tfrac{1}{6}\ln|u| + C = \tfrac{1}{6}\ln(3x^2 + 2) + C,$$

where C is an arbitrary constant. The modulus sign in the logarithm can be
discarded because $3x^2 + 2$ is always positive.

The general result is as follows:

Integrals of type $I = \int x f(ax^2 + b)\,dx$

Put $u = ax^2 + b$; then $x\,dx = \dfrac{1}{2a}\,du$; so

$$I = \frac{1}{2a}\int f(u)\,du. \qquad (17.3)$$
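
An antiderivative obtained by substitution can always be checked by differentiating it, and the check can be done numerically as well as by hand. The following is a minimal sketch of such a check for the result of Example 17.5; the function names, step size and sample points are illustrative assumptions.

```python
import math


def derivative(F, x, step=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + step) - F(x - step)) / (2.0 * step)


def integrand(x):
    return x / (3.0 * x ** 2 + 2.0)


def antiderivative(x):
    # Result of Example 17.5, taking C = 0.
    return math.log(3.0 * x ** 2 + 2.0) / 6.0


# Largest mismatch between F'(x) and the integrand at a few sample points.
sample_points = (-2.0, 0.0, 0.5, 3.0)
worst = max(abs(derivative(antiderivative, x) - integrand(x)) for x in sample_points)
print("largest discrepancy:", worst)
```

A discrepancy at the level of rounding error supports the result; a discrepancy of ordinary size would signal a slip such as a wrong factor 1/(2a).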

17.3 Substitution method for $\int \cos^m ax\,\sin^n ax\,dx$ (m or n odd)

Example 17.6. Evaluate $\int \sin^3 x \cos x\,dx$.

Aim to simplify the worst term by putting
u = sin x.
Then $\sin^3 x$ becomes $u^3$, and we must deal with cos x dx. As always, begin with
du/dx = cos x. Therefore
du = cos x dx,
so, by good fortune, cos x dx appears in one piece. Then we have

$$\int \sin^3 x \cdot (\cos x\,dx) = \int u^3\,du = \tfrac{1}{4}u^4 + C = \tfrac{1}{4}\sin^4 x + C,$$

with C arbitrary. (Check by differentiating.)

Example 17.7. Evaluate $\int \tan x\,dx$.

We have

$$\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = \int \frac{1}{\cos x}\,(\sin x\,dx).$$

This time, put
u = cos x,
so that du/dx = -sin x. From this we obtain
du = -sin x dx,
so, apart from the sign, we have exactly the combination required for the rest
of the integrand. Then

$$\int \frac{1}{\cos x}\,(\sin x\,dx) = \int \frac{1}{u}\,(-du) = -\ln|u| + C = -\ln|\cos x| + C,$$

where C is arbitrary. This often appears as $\ln|\sec x| + C$ in tables of integrals.

This technique can be used for products $\cos^m ax\,\sin^n ax$, when
either m or n (or both) are odd numbers, either positive or negative,
and for certain other cases as well.

Example 17.8. Evaluate $\int \cos^3 x\,dx$. (This is the case m = 3, n = 0.)

Write $\cos^3 x\,dx = \cos^2 x \cdot (\cos x\,dx)$, and put
u = sin x
(not cos x as possibly expected). Then du/dx = cos x, so that
du = cos x dx.
The remaining part of the integrand is $\cos^2 x$, and we can transform this by
writing
$\cos^2 x = 1 - \sin^2 x = 1 - u^2$.
Then we have

$$\int \cos^2 x \cdot (\cos x\,dx) = \int (1 - u^2)\,du = u - \tfrac{1}{3}u^3 + C = \sin x - \tfrac{1}{3}\sin^3 x + C,$$

where C is arbitrary.
The reader should try also the substitution u = cos x. It leads to an integral
in terms of u that is correct but worse than the original.

Example 17.9. Evaluate $I = \int \cos^3 2x\,\sin^3 2x\,dx$.

(Here m = 3, n = 3.) The technique requires us to decompose the term whose
power is odd. Here both powers are odd, so either will do. We shall split the
integrand like this:

$$I = \int \cos^3 2x\,\sin^2 2x\,(\sin 2x\,dx).$$

Put u = cos 2x so that

$$\sin 2x\,dx = -\tfrac{1}{2}\,du.$$

Since $\sin^2 2x = 1 - \cos^2 2x$, the integral becomes

$$I = \int u^3(1 - u^2)\left(-\tfrac{1}{2}\,du\right) = -\tfrac{1}{2}\int (u^3 - u^5)\,du = -\tfrac{1}{8}u^4 + \tfrac{1}{12}u^6 + C$$

$$= -\tfrac{1}{8}\cos^4 2x + \tfrac{1}{12}\cos^6 2x + C,$$

with C arbitrary.

The general rule is as follows.

Integrals of type $I = \int \cos^m ax\,\sin^n ax\,dx$ (17.4)

(a) If n is odd, put $I = \int \cos^m ax\,\sin^{n-1} ax\,(\sin ax\,dx)$;
then u = cos ax, $\sin ax\,dx = -\dfrac{1}{a}\,du$, and
$\sin^2 ax = 1 - \cos^2 ax$.

(b) If m is odd, write
$I = \int \cos^{m-1} ax\,\sin^n ax\,(\cos ax\,dx)$;
then u = sin ax, $\cos ax\,dx = \dfrac{1}{a}\,du$, and
$\cos^2 ax = 1 - \sin^2 ax$.

(c) If n and m are both odd, use either (a) or (b).
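
As a further illustration of rule (b), using an integrand chosen here for illustration rather than taken from the text, consider the case a = 1, m = 3, n = 2. Writing the integrand as $\cos^2 x\,\sin^2 x\,(\cos x\,dx)$ and putting u = sin x, so that cos x dx = du and $\cos^2 x = 1 - u^2$, gives

$$\int \cos^3 x\,\sin^2 x\,dx = \int (1 - u^2)u^2\,du = \tfrac{1}{3}u^3 - \tfrac{1}{5}u^5 + C = \tfrac{1}{3}\sin^3 x - \tfrac{1}{5}\sin^5 x + C.$$

Differentiating the result gives $\sin^2 x\cos x - \sin^4 x\cos x = \sin^2 x\cos x\,(1 - \sin^2 x) = \cos^3 x\,\sin^2 x$, as required.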

17.4 Definite integrals and change of variable


For the previous examples involving indefinite integrals, we changed
the variable to u, carried out the integration, and then expressed
the result back in terms of x. For a definite integral, it is often
more convenient to express the limits of integration in terms
of u, as well as the integrand, and in that way work with u right up
to the end. In the following example, both procedures are illustrated.

Example 17.10. Evaluate $I = \int_0^{\frac{1}{2}\pi} \cos^3 x\,dx$ in two ways.

(a) (First finding an indefinite integral in terms of x.) As in Example 17.8,
put u = sin x, du = cos x dx;

$$\int \cos^3 x\,dx = \int (1 - u^2)\,du = u - \tfrac{1}{3}u^3 = \sin x - \tfrac{1}{3}\sin^3 x,$$

taking the simplest case with C = 0. Then

$$I = \left[\sin x - \tfrac{1}{3}\sin^3 x\right]_0^{\frac{1}{2}\pi} = (1 - \tfrac{1}{3}) - 0 = \tfrac{2}{3}.$$

(b) (Working with u throughout.) Put u = sin x into I. In order to express
the limits of integration in terms of u, note that u = 0 when x = 0, and u = 1
when $x = \tfrac{1}{2}\pi$. Then (writing the limits so as to make them more explicit)

$$\int_{x=0}^{x=\frac{1}{2}\pi} \cos^3 x\,dx = \int_{u=0}^{u=1} (1 - u^2)\,du = \left[u - \tfrac{1}{3}u^3\right]_{u=0}^{u=1} = \tfrac{2}{3}.$$

In Example 17.10b it would have been wrong to write the integral
in the form

$$\int_0^{\frac{1}{2}\pi} (1 - u^2)\,du.$$

This would imply that we were going to put u equal to 0 and $\tfrac{1}{2}\pi$
after integrating.
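
The point about the limits can be confirmed numerically. The sketch below is illustrative only: the helper name and the number of steps are assumptions. It evaluates the original integral in x, the transformed integral with the correct u limits, and the transformed integrand with the x limits wrongly retained, using the trapezium rule (16.4).

```python
import math


def trapezium_rule(f, a, b, n=2000):
    """Trapezium rule (16.4) with n equal steps."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


# Original integral in x, from 0 to pi/2.
in_x = trapezium_rule(lambda x: math.cos(x) ** 3, 0.0, math.pi / 2.0)

# After u = sin x, with the limits converted to u = 0 and u = 1.
in_u = trapezium_rule(lambda u: 1.0 - u ** 2, 0.0, 1.0)

# The mistake warned against: u-integrand with the x limits left in place.
wrong = trapezium_rule(lambda u: 1.0 - u ** 2, 0.0, math.pi / 2.0)

print(in_x, in_u, wrong)
```

The first two values agree with 2/3 to several decimal places; the third is a different number altogether.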

Example 17.11. Find the centroid (centre of mass) of the uniform semicircular
plate shown in Fig. 17.1.
The symmetry shows that the centroid G lies on the x axis. From (16.6), the x
coordinate of G is given by

$$\bar{x} = \frac{1}{\tfrac{1}{2}\pi R^2}\int_0^R V(x)\,x\,dx.$$

Since $x^2 + y^2 = R^2$, we have $V(x) = 2(R^2 - x^2)^{1/2}$, so that

$$\bar{x} = \frac{4}{\pi R^2}\int_0^R (R^2 - x^2)^{1/2}\,x\,dx.$$

This is an integral of the type of (17.3). To simplify it, put $u = R^2 - x^2$, so that
du/dx = -2x and $x\,dx = -\tfrac{1}{2}\,du$. Also, $u = R^2$ when x = 0, and u = 0 when
x = R. Therefore

$$\bar{x} = \frac{4}{\pi R^2}\int_{R^2}^{0} u^{1/2}\left(-\tfrac{1}{2}\,du\right) = -\frac{2}{\pi R^2}\left[\tfrac{2}{3}u^{3/2}\right]_{R^2}^{0} = -\frac{4}{3\pi R^2}\left[0 - R^3\right] = \frac{4}{3\pi}R.$$

17.5 Occasional substitutions


Finding an advantageous new variable u is often a process of trial
and error. Frequently the possible usefulness of a substitution is
more easy to see in the form
x = f(u)
rather than u as a function of x as in the previous work.

Example 17.12. Find a substitution to evaluate $\displaystyle\int \frac{dx}{(1 - x^2)^{1/2}}$.

Try to simplify $(1 - x^2)^{1/2}$ first, hoping that dx will work out conveniently. To
do this try

$$x = \sin u; \qquad (17.5)$$

then $(1 - x^2)^{1/2} = (1 - \sin^2 u)^{1/2} = \cos u$. Also dx/du = cos u, so
dx = cos u du.
Therefore

$$\int \frac{dx}{(1 - x^2)^{1/2}} = \int \frac{\cos u\,du}{\cos u} = \int du = u + C = \arcsin x + C,$$

a result which can be confirmed from the table of derivatives in Appendix D.
You might try putting $u = 1 - x^2$ instead: the resulting integral is different from,
but no better than, the original.

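Example 17.13. From Example 17.12, we know that $\displaystyle\int \frac{dx}{(1 - x^2)^{1/2}} = \arcsin x + C$.
Use this result to obtain $\displaystyle\int \frac{dx}{(4 - x^2)^{1/2}}$.

Aim to convert $(4 - x^2)^{1/2}$ into something like $(1 - u^2)^{1/2}$, so as to be able to use
the result in the table.

$$(4 - x^2)^{1/2} = 2(1 - \tfrac{1}{4}x^2)^{1/2} = 2\left[1 - (\tfrac{1}{2}x)^2\right]^{1/2},$$

and make the substitution
$u = \tfrac{1}{2}x$.
Then $du/dx = \tfrac{1}{2}$, so that dx = 2 du. Therefore

$$\int \frac{dx}{(4 - x^2)^{1/2}} = \int \frac{2\,du}{2(1 - u^2)^{1/2}} = \int \frac{du}{(1 - u^2)^{1/2}} = \arcsin u + C = \arcsin \tfrac{1}{2}x + C.$$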

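Example 17.14. From Appendix E, $\displaystyle\int \frac{dx}{1 + x^2} = \arctan x + C$. Use this result
to evaluate $\displaystyle\int \frac{dx}{1 + 9x^2}$.

We want to transform $1 + 9x^2$ to a form close to $1 + u^2$, so put
u = 3x,
so that $dx = \tfrac{1}{3}\,du$. Then

$$\int \frac{dx}{1 + 9x^2} = \int \frac{\tfrac{1}{3}\,du}{1 + u^2} = \tfrac{1}{3}\arctan u + C = \tfrac{1}{3}\arctan 3x + C.$$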
1 + 9x2 1 + u2

If the required integral does not seem to be similar to one that


is already known, then one has in effect to guess a suitable
substitution:

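Example 17.15. Evaluate $\displaystyle\int \frac{\ln^3 x}{x}\,dx$.

We can simplify the logarithm (at the risk of extra complexity elsewhere) by
putting
$x = e^u$
so that $\ln x = \ln e^u = u$. Since $dx/du = e^u$, we have $dx = e^u\,du$. Therefore

$$\int \frac{\ln^3 x}{x}\,dx = \int \frac{u^3}{e^u}\,e^u\,du = \int u^3\,du = \tfrac{1}{4}u^4 + C = \tfrac{1}{4}\ln^4 x + C.$$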

The general shape of the integrand sometimes suggests a substitution
that is sure to simplify it. Suppose we notice that f(x) takes
the special form

$$f(x) = c\,g(u)\,\frac{du}{dx},$$

where c is a constant and u is a function of x. For example,

$$(x^4 + 1)^7 x^3 = \tfrac{1}{4}u^7\,\frac{du}{dx},$$

where, in this case, $c = \tfrac{1}{4}$, $u(x) = x^4 + 1$, and $g(u) = u^7$. In the
general case,

$$\int f(x)\,dx = c\int g(u)\,\frac{du}{dx}\,dx = c\int g(u)\,du.$$

(Any f(x) can in principle be written in this form: the question is
only whether it is easy to see how it breaks up.) Having observed
the form of u(x), the substitution should be made in the usual way.

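Example 17.16. Evaluate $\int (x^{1/2} + 1)^{1/2}\,x^{-1/2}\,dx$.

The important thing is to spot that $(d/dx)(x^{1/2} + 1)$ is like the remaining factor,
$x^{-1/2}$. This suggests that $u = x^{1/2} + 1$ is the right substitution. Specifically, put
$u = x^{1/2} + 1$; then

$$\frac{du}{dx} = \tfrac{1}{2}x^{-1/2}, \quad\text{and so}\quad x^{-1/2}\,dx = 2\,du.$$

The integral becomes

$$2\int u^{1/2}\,du = 2\cdot\tfrac{2}{3}u^{3/2} + C = \tfrac{4}{3}(x^{1/2} + 1)^{3/2} + C.$$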

17.6 Partial fractions for integration


In Section 1.10, it was shown how a rational function P(x)/Q(x),
where P(x) and Q(x) are polynomials, P(x) is of lower degree than
Q(x), and Q(x) factorizes into real factors, can be expressed as the
sum of simpler partial fractions. This provides a method for
integrating rational functions.

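Example 17.17. Evaluate $\displaystyle\int \frac{dx}{x^2 - 1}$.

By the methods of Section 1.10, we find that

$$\frac{1}{x^2 - 1} = \frac{1}{(x - 1)(x + 1)} = \frac{1}{2}\,\frac{1}{x - 1} - \frac{1}{2}\,\frac{1}{x + 1}.$$

Therefore

$$\int \frac{dx}{x^2 - 1} = \tfrac{1}{2}\int \frac{dx}{x - 1} - \tfrac{1}{2}\int \frac{dx}{x + 1} = \tfrac{1}{2}\ln|x - 1| - \tfrac{1}{2}\ln|x + 1| + C.$$

Other equivalent forms are $\tfrac{1}{2}\ln\left|\dfrac{x - 1}{x + 1}\right| + C$, $\ln\left|\dfrac{x - 1}{x + 1}\right|^{1/2} + C$ and $\tfrac{1}{2}\ln\left|B\,\dfrac{x - 1}{x + 1}\right|$,
where C and B are arbitrary.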

As a result of expanding in partial fractions we may encounter
integrands of the type

$$\frac{cx + d}{px^2 + qx + r}$$

in which the equation $px^2 + qx + r = 0$ has no real roots (that is,
the denominator has no real factors). The following example shows
how to evaluate them by 'completing the square' in the denominator.

Example 17.18. Evaluate $I = \displaystyle\int \frac{(x + 1)\,dx}{x^2 + 4x + 8}$.

The quadratic form $x^2 + 4x + 8$ has no real factors. 'Completing the square'
in the denominator consists of writing $x^2 + 4x + 8$ in the form $(x + a)^2 + b$.
The first two terms, $x^2 + 4x$, can be written

$$x^2 + 4x = (x + 2)^2 - 4,$$

so

$$x^2 + 4x + 8 = (x + 2)^2 - 4 + 8 = (x + 2)^2 + 4.$$

The integral becomes

$$I = \int \frac{(x + 1)\,dx}{(x + 2)^2 + 4} = \frac{1}{4}\int \frac{(x + 1)\,dx}{\left[\tfrac{1}{2}(x + 2)\right]^2 + 1}.$$

Now put $u = \tfrac{1}{2}(x + 2)$, or x = 2u - 2, from which

dx = 2 du.

Then

$$I = \tfrac{1}{2}\int \frac{2u - 1}{u^2 + 1}\,du = \int \frac{u}{u^2 + 1}\,du - \tfrac{1}{2}\int \frac{1}{u^2 + 1}\,du.$$

To evaluate the first integral, use the substitution $v = u^2 + 1$, as in Section 17.2;
the second is a standard integral. We obtain

$$I = \tfrac{1}{2}\ln(x^2 + 4x + 8) - \tfrac{1}{2}\arctan \tfrac{1}{2}(x + 2) + C.$$
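
A result with this many steps is worth checking. One way, sketched below, is to compare a numerical value of the definite integral with the difference of the antiderivative at the two limits. The limits 0 and 2, the helper names and the number of steps are illustrative assumptions; the numerical method is Simpson's rule as stated in Problem 16.21.

```python
import math


def simpson_rule(f, a, b, n=100):
    """Simpson's rule with n equal steps (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # Weights alternate 4, 2, 4, ..., 4 on the interior points.
        total += (4 if k % 2 else 2) * f(a + k * h)
    return h * total / 3.0


def integrand(x):
    return (x + 1.0) / (x ** 2 + 4.0 * x + 8.0)


def antiderivative(x):
    # Result of Example 17.18, taking C = 0.
    return 0.5 * math.log(x ** 2 + 4.0 * x + 8.0) - 0.5 * math.atan(0.5 * (x + 2.0))


numeric = simpson_rule(integrand, 0.0, 2.0)
closed_form = antiderivative(2.0) - antiderivative(0.0)
print(numeric, closed_form)
```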

17.7 Integration by parts


This method is totally unrelated to the techniques we have so far
described, and can be used to integrate special types of product.
It is needed very frequently for obtaining fundamental general
results.
Suppose that we are given any u(x) and v(x). Then, by the product
rule,

$$\frac{d}{dx}(uv) = u\frac{dv}{dx} + v\frac{du}{dx}.$$

Since both sides are equal, their indefinite integrals can only differ
by a constant, so

$$\int \frac{d}{dx}(uv)\,dx = \int u\frac{dv}{dx}\,dx + \int v\frac{du}{dx}\,dx + C. \qquad (17.6)$$

Look at the integral on the left. It means 'an antiderivative of
(d/dx)[u(x)v(x)]'. But, from the definition (14.1), u(x)v(x) is an
antiderivative. Therefore (17.6) becomes

$$uv = \int u\frac{dv}{dx}\,dx + \int v\frac{du}{dx}\,dx + C.$$

Now rearrange the terms to obtain

$$\int u\frac{dv}{dx}\,dx = uv - \int v\frac{du}{dx}\,dx + C.$$

This is the formula for integration by parts.

Integration by parts

$$\int u\frac{dv}{dx}\,dx = uv - \int v\frac{du}{dx}\,dx + C, \qquad (17.7)$$

where C is an arbitrary constant.

It is not at first obvious how this complicated result could be of
any use, but the point of it is that the right-hand integral might be
simpler than the one on the left. The process was once called 'partial
integration', because the uv part is already integrated out. (For the
effect of missing out C, see Problem 17.19.)

Example 17.19. Evaluate $\int x\,e^x\,dx$ by integrating by parts.

First observe that the integrand consists of the product of two factors, x and $e^x$,
both of which we can integrate and differentiate any number of times. We relate
this fact to (17.7) by identifying them with u and dv/dx respectively: put

$$u = x \quad\text{and}\quad \frac{dv}{dx} = e^x. \qquad \text{(i)}$$

Then

$$\frac{du}{dx} = 1 \quad\text{and}\quad v = e^x, \qquad \text{(ii)}$$

where we have chosen v to be the simplest antiderivative of $e^x$. Nothing would
ultimately be changed by introducing an arbitrary constant C into v: any
antiderivative will do (see Problem 17.18).
Fill in the right-hand side of (17.7) by picking out u, v, du/dx from (i) and
(ii), and introduce the constant C:

$$\int x\,e^x\,dx = x\,e^x - \int (e^x)(1)\,dx + C = x\,e^x - \int e^x\,dx + C = x\,e^x - e^x + C,$$

where C is arbitrary.

We obtained a simplification because we chose x, rather than $e^x$,
to be assigned to u. Since du/dx is simpler than u, it seemed possible
that the right side of (17.7) might be simpler than the left. (To see
what happens when we put $u = e^x$, dv/dx = x, see Example 17.21.)
As in Example 17.19, you should always write out stages (i) and
(ii) in full, and do the subsequent working in full, or you will make
mistakes.

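Example 17.20. Evaluate $\int x\cos 2x\,dx$.

Put u = x, dv/dx = cos 2x. Then

$$\frac{du}{dx} = 1, \qquad v = \int \cos 2x\,dx = \tfrac{1}{2}\sin 2x.$$

Substituting these functions into the right-hand side of (17.7):

$$\int x\cos 2x\,dx = x\left(\tfrac{1}{2}\sin 2x\right) - \int \left(\tfrac{1}{2}\sin 2x\right)(1)\,dx + C$$

$$= \tfrac{1}{2}x\sin 2x - \tfrac{1}{2}\left(-\tfrac{1}{2}\cos 2x\right) + C = \tfrac{1}{2}x\sin 2x + \tfrac{1}{4}\cos 2x + C.$$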

Example 17.21. For $\int x\,e^x\,dx$ (see Example 17.19), try the effect of assigning
x and $e^x$ to u and dv/dx the 'wrong way round'.
In Example 17.19, we successfully put u = x and $dv/dx = e^x$. Now try instead

$$u = e^x, \qquad \frac{dv}{dx} = x,$$

then

$$\frac{du}{dx} = e^x, \qquad v = \int x\,dx = \tfrac{1}{2}x^2.$$

The integration-by-parts formula becomes

$$\int x\,e^x\,dx = e^x\left(\tfrac{1}{2}x^2\right) - \int \left(\tfrac{1}{2}x^2\right)e^x\,dx + C = \tfrac{1}{2}x^2 e^x - \tfrac{1}{2}\int x^2 e^x\,dx + C,$$

which is a true result, but the transformed integral is worse than the original.

Sometimes it is not immediately obvious that the method can be


made to work, as in the following.

Example 17.22. Evaluate $\int \ln x \, dx$.

Write $\ln x = (\ln x)(1)$, so that the integral becomes

$$\int (\ln x)(1) \, dx.$$

We can now put $u = \ln x$ and $dv/dx = 1$, so that

$$\frac{du}{dx} = \frac{1}{x}, \qquad v = x.$$

Then

$$\int (\ln x)(1) \, dx = (\ln x)(x) - \int (x)\,\frac{1}{x} \, dx + C$$

$$= x \ln x - \int dx + C = x \ln x - x + C,$$

where $C$ is an arbitrary constant.

The integrals of other inverse functions, such as arcsin x and


arctan x, respond to the same technique.

Example 17.23. Evaluate $\int x^2 \sin x \, dx$.

It is necessary in this problem to integrate by parts twice. Put

$$u = x^2, \qquad \frac{dv}{dx} = \sin x;$$

then

$$\frac{du}{dx} = 2x, \qquad v = -\cos x.$$

From (17.7),

$$\int x^2 \sin x \, dx = -x^2 \cos x + 2 \int x \cos x \, dx + C.$$

Integrate the integral on the right by parts; put

$$u = x, \qquad \frac{dv}{dx} = \cos x,$$

so that $du/dx = 1$ and $v = \sin x$. From (17.7), we obtain finally

$$\int x^2 \sin x \, dx = -x^2 \cos x + 2x \sin x + 2 \cos x + C.$$
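With two integrations by parts, a sign slip is easy to make, so it is worth differentiating the answer. The following sketch (Python standard library; the function names and the sample points are illustrative assumptions) approximates the derivative of the result of Example 17.23 by a central difference and compares it with the integrand x² sin x.

```python
import math

def F(x):
    # Antiderivative found in Example 17.23, with C = 0
    return -x**2 * math.cos(x) + 2 * x * math.sin(x) + 2 * math.cos(x)

def integrand(x):
    return x**2 * math.sin(x)

def central_difference(f, x, h=1e-5):
    """Second-order accurate estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Largest mismatch between dF/dx and the integrand over a few sample points
max_error = max(abs(central_difference(F, x) - integrand(x))
                for x in [0.0, 0.5, 1.0, 2.0, 3.0])
print(max_error)  # should be very small
```

A mismatch that does not shrink as `h` is reduced would point to an algebraic error in the integration.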

17.8 Integration by parts: definite integrals


The integration-by-parts formula (17.7) expresses a relation between
indefinite integrals, or antiderivatives. Suppose that we have a
definite integral of the form

$$\int_a^b u \frac{dv}{dx} \, dx,$$

which we expect to integrate by parts. Then, from (17.7),

$$\int_a^b u \frac{dv}{dx} \, dx = \left[ uv - \int v \frac{du}{dx} \, dx \right]_a^b.$$

The operation $[\,\cdots]_a^b$ applies to the two terms separately, so we have:

Integration by parts (definite integrals)

$$\int_a^b u \frac{dv}{dx} \, dx = [uv]_a^b - \int_a^b v \frac{du}{dx} \, dx. \tag{17.8}$$

This can sometimes considerably simplify the working, especially if
more than one integration by parts is needed.

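Example 17.24. Evaluate $\int_0^{\frac12\pi} x^2 \sin x \, dx$.

As in Example 17.23, put $u = x^2$ and $dv/dx = \sin x$. Then

$$\frac{du}{dx} = 2x, \qquad v = -\cos x.$$

From (17.8),

$$\int_0^{\frac12\pi} x^2 \sin x \, dx = [x^2(-\cos x)]_0^{\frac12\pi} - \int_0^{\frac12\pi} (-\cos x)(2x) \, dx$$

$$= 2 \int_0^{\frac12\pi} x \cos x \, dx,$$

because the bracketed term is zero; we did not have to wait to the end of the
calculation to see it go. To evaluate the remaining integral, integrate by parts
again, putting $u = x$ and $dv/dx = \cos x$; we have

$$\frac{du}{dx} = 1, \qquad v = \sin x.$$

Use (17.8) again:

$$2 \int_0^{\frac12\pi} x \cos x \, dx = 2 \left( [x \sin x]_0^{\frac12\pi} - \int_0^{\frac12\pi} \sin x \, dx \right)$$

$$= 2\left(\tfrac12\pi + [\cos x]_0^{\frac12\pi}\right) = 2[\tfrac12\pi + (0 - 1)]$$

$$= \pi - 2.$$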

The following result is important for Chapter 24, and involves the
use of (17.8):

$$\int_0^\infty e^{-t} t^N \, dt = N! \quad \text{when } N = 0, 1, 2, 3, \ldots \quad (0! \text{ is defined to be } 1). \tag{17.9}$$

Here $N!$ stands for the factorial:

$$N! = N(N - 1)(N - 2) \cdots 3 \cdot 2 \cdot 1.$$

The symbol $0!$, which is apparently arbitrarily given the value 1,
does not fit this pattern; it should be regarded at this stage as being
just a useful convention. The related gamma function $\Gamma(N) = (N - 1)!$
is used in statistics in Section 40.6.
To prove (17.9), let $k$ represent any of the numbers 0, 1, 2, ...,
and write

$$\int_0^\infty e^{-t} t^k \, dt = F(k),$$

to indicate the integral's dependence on the parameter $k$; for example
$\int_0^\infty e^{-t} t^3 \, dt$ is denoted by $F(3)$. Notice in particular that

$$F(0) = \int_0^\infty e^{-t} \, dt = [-e^{-t}]_0^\infty = 1. \tag{17.10}$$

For $k = 1, 2, \ldots$, integrate by parts. Put $u = t^k$ and $dv/dt = e^{-t}$; then

$$F(k) = \int_0^\infty e^{-t} t^k \, dt = [t^k(-e^{-t})]_0^\infty - \int_0^\infty (k t^{k-1})(-e^{-t}) \, dt$$

$$= k \int_0^\infty e^{-t} t^{k-1} \, dt$$

(the bracket is zero because $k \ge 1$). The integral is $F(k - 1)$, so

$$F(k) = kF(k - 1) \quad \text{for } k = 1, 2, \ldots. \tag{17.11}$$

By integrating by parts, we have reduced the degree of $t$ by unity.
We could evaluate $F(N)$, where $N$ is given, by integrating by parts
again and again until we reach $F(0)$, given as 1 by (17.10). But we
do not have to integrate by parts any more: equation (17.11) does
it for us. Put $k = 0, 1, 2, 3, \ldots$ successively: we obtain

$F(0) = 1$ (by (17.10)),

$F(1) = 1F(0)$ (by (17.11)) $= 1 \cdot 1 = 1!$,

$F(2) = 2F(1)$ (by (17.11)) $= 2(1!) = 2!$,

$F(3) = 3F(2)$ (by (17.11)) $= 3(2!) = 3!$,

and so on (each line uses the result of the previous line). So, if we
are given $N$, we shall reach $F(N)$ after $N$ lines, and find that

$$F(N) = N!.$$

The argument above can be expressed in a different way. Using
(17.11) repeatedly we have

$$F(N) = NF(N - 1) = N(N - 1)F(N - 2) = \cdots,$$

until we arrive at $F(0)$, which is 1, and we are left with $N!$ on the right.
Equation (17.11) is an example of a reduction formula, by which
an integral can be systematically reduced, one step at a time,
to progressively simpler integrals. (See Problems 17.14, 17.15,
17.16.)
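The way a reduction formula replaces repeated integration by a simple loop can be sketched in a few lines of Python. In this illustration (standard library only; the function names, the truncation of the infinite range at t = 60, and the Simpson step count are all assumptions made for the sketch) `F_by_reduction` applies F(k) = kF(k − 1) starting from F(0) = 1, and `F_by_quadrature` estimates the integral in (17.9) directly for comparison.

```python
import math

def F_by_reduction(n):
    """F(n) from F(k) = k F(k-1), starting at F(0) = 1; see (17.10) and (17.11)."""
    value = 1.0                 # F(0)
    for k in range(1, n + 1):
        value *= k              # F(k) = k * F(k-1)
    return value

def F_by_quadrature(k, upper=60.0, n=6000):
    """Simpson estimate of the integral of e^(-t) t^k over [0, upper].

    The infinite upper limit is replaced by `upper`; the neglected tail is
    negligible for the small values of k used here.
    """
    h = upper / n
    f = lambda t: math.exp(-t) * t**k
    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

for N in range(6):
    print(N, F_by_reduction(N), F_by_quadrature(N), math.factorial(N))
```

The loop in `F_by_reduction` is exactly the chain F(N) = N F(N − 1) = N(N − 1)F(N − 2) = ... read from the bottom up.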

Problems

17.1. (Section 17.1). Obtain f(x) dx when the /(x) (g) (j + 2x)n\ (h) x(x - l)4; (i) (1 - xf\
are as follows. (j) (2x — 3)“* for x > (k) l/(3x + 2)2;
(a) sin 3x; (b) cos 4x; (c) e (1) l/d - x)4; (m) 1/(1 + x); (n) l/(2x + 3);
(d)(l+x)10; (e) (1 — x)9; (0 (3 — 2x)5; (o) x/(l — x)2; (p) (1 + x)/(l - x);

(q) x/(x — 1)* for x > 1; (r) cos(l — 2x); 17.6. Use the identities cos2 A = ^(1 + cos 2A),
(s) sin(2x — 3). sin2 A = j(l — cos 2A), or sin A cos A = \ sin 2A to eval¬
uate the following.
fit Cn
17.2. (Section 17.1). Evaluate the following indefinite sin2 r df; (b) cos2 f df;
(a)
integrals. o Jo
in 'in

(a) (21 — 5)5 df; (b) j sin f(3f — 1) df; (c) sin2 2f df; (d) cos2 If df;

(c) dw; (d) e 3rdr; cos4 u du.


(e) sin2 3f cos 3f df; (f)
(2w+ l)2
s ds
(e) ( — r)b df if t < 0; (f)
(1 -s) 3’ 17.7. Use the substitutions suggested to evaluate f(x)
» %J
cos (cuf — 4>) df. dx for the following /(x). (In several of the questions the
(g)
identity 1 + tan2 A = 1/cos2 A is needed. You may also
have to refer to the table. Appendix E.)
(a) In x/x (put x = eu);
17.3. (Section 17.2). Obtain /(x) dx when the /(x) (b) x(l — x2)1 (try (i) u = 1 — x2, (ii) x = sin u);
are as follows. J (c) l/(ex + e“x) (put u = ex);
(a) xe"*!; (b) x sin x2; (c)xcosx2; (d) 1/(1 — x2)* (try (i) x = sin u, (ii) x = cos u\ why
(d) x cos(x2 + 3); (e) x cos(l — 3x2); do the results seem to be different?);
(f) x(x2 - l)4; (g) x(3x2 + 4)3; (e) tan2 x (put u = tan x);
(h) x/(l + 2x2); (i) x3(l - x2)3 (note: x3 = xx2); (f) l/x2(l + x2) (put x = 1/m, followed by another pro¬
(j) x/( 1 + x2); (k) x/(3x2 - 2). cess);
(g) 1/(1 + x2) (put x = tan u);
(h) 1/cos2 x (put u = tan x);
17.4. (Section 17.3). Find f(x) dx where the /(x) are
(i) —-(put f = u2); (j) — sin - (put f = 1/m);
as follows. U(1 + t) f2 f
(a) sin x cos x; (b) sin2 x cos x; (c) sin2 2x cos 2x; (k) (1 — x2)* (put x = sin u);
(d) cos2 x sin x; (e) cos2 3x sin 3x; (f) sin3 x cos x; (l) 1/(1 + x2)* (put x = tan m).
(g) cot 2x; (h) tan |x; (i) (sin3 x)/cos x;
(j) sin3 x ( = sin2 x sin x);
(k) tan3 x (compare (j)); (1) cos3 x (compare (j)). 17.8. Use partial fractions to evaluate J/(x)dx for

the following /(x).


17.5. Evaluate the following definite integrals by using (a) l/(x2 - 4); (b) l/x(x + 2);
any necessary substitutions.
(c) l/x2(x - 1); (d)x/(2x+ l)(x+ 1);
(e) (x + l)/(4x2 — 9); (f) l/x(x2 + 1);
(a) (1 + x)7 dx; (b) (1 - ix)7 dx (g) x/(2x2 + 3x + 1); (h) l/x2(2x + 1);
(i) 1/cos x (first put m = sin x);

1 x dx (j) 1/sin x (first put u = cos x).

(c) x(l — x2)3 dx; (d)


o 2x + 3
2 dx f4 dx 17.9. Obtain f(x) dx for each of the following /(x),
(e) (note: x < — 1); (f)
U -3 1 + X 2 - 3x noting that they take the form cg(u) du/dx (see the
ri remark at the end of Section 17.5), so that (a), for
(g) x3(l — x2)3 dx; (h) tanfdf; example, will respond to the substitution u = x3 — 1.
(a) x2(x3 - l)5; (b) (x - l)(x2 - 2x + 3)_1;
ri n (c) l/(x In2 x); (d) x^x1 + 2)4;
(i) cot 3w dw; (j) sin u cos u du; (e) (ex - e_x)/(ex + e“x); (f) l/x*(x* + 1);
(g) x2/(x3+ 1).

(k) (sin v)'! cos v di>; (1) cos3 0 d0\


17.10. Use integration by parts (Section 17.7) to obtain
-in

±n riK/l° /(x) dx for each of the following /(x).


(m) I sin 2f df; (n) cos(cuf + 0) df.
i 0 J - \n!o) (a) xe"x; (b) x e3x; (c)xe_3x;

(d) x cos x; (e) x sin x; (f) x cos \x\


(g) x sin 2x; (h) x(l - x)10; (i) x In x; 17.15. Denote cos'1 x dx by F(k) when k = 0,1, 2,
(j) x" In x, n / —1; (k) (In x)/x (the method might
. . Integrate by parts to show that
seem to have failed; but look again).
k - 1
F(k) =-F(k - 2) for k = 2, 3,..
17.11. Use integration by parts (see Example 17.22),

writing the integrand as /(x)( 1), to obtain f(x) dx for Evaluate F(0) and F( 1). Use the reduction formula
repeatedly, together with F(0) and F(l), to evaluate
each of the following /(x).
(a) In2 x; (b)arcsinx; (c)arccosx; cos4 x dx and cos5 x dx.
(d) arctan x.

17.16. Follow the lines of Problems 17.14 and 17.15


17.12. To evaluate /(x) dx for the following f(x), to obtain the following reduction formulae, and to
integrate the special cases given. The letter k is an
integrate by parts twice; then look closely at your result.
integer as specified for each case.
(If it does not work out you have probably made a
(a) Let
mistake with a sign.) Compare your results with (15.11).
*2
(a) ex sin x; (b)e_ vsinx (c)e~xcosx.
F(/c) = (In x)* dx (k ^ 0).

17.13. (Integration by parts; definite integrals. Section Show that F(/c) = 2(^2)* — kF(k — 1) for k ^ 1, and
17.8). Evaluate the following. *2

evaluate (In x)3 dx.


in
(a) I x cos x dx; (b) x cos 2x dx;
'o (b) Let F(k) = xk sin x dx (k 0). Integrate by

parts twice to show that


(c) x2 cos x dx;
F(k) = nk — k(k — \)F(k — 2)
-X
(d) e~x sin x dx (integrate by parts twice); for k 2; 2. Evaluate x4 sin x dx and x5 sin x dx.
o
(c) Obtain a reduction formula for
(e) e * cos x dx (integrate by parts twice);
o F(k) = sin* x dx,
-i
f2 In x dx in
(f) (g) arcsin x dx;
and use it to evaluate sin4 x dx and sin5 x dx.
o
-i -i

(h) arccos x dx; (i) arctan x dx;


17.17. (Change of variable etc.). Denote the integral
-1 Jo
’c dx
'2 for c > 0 by F(c). Deduce the properties (a) to
(j) In x dx. Ji *
Ji (d) below. F(c) is obviously equal to In c, but do not
use any of the known properties of the logarithm; pretend
ri that this is the first time you have ever seen the integral.
17.14. (Compare (17.9)). Denote xk ex dx by F(/c) for (a) F(a~l) = —F{a) if a > 0. (Hint: put c = a~x in
Jo the definition; then change the variable to u where
k — 0,1, 2,_Integrate by parts to obtain the reduction u = x~ F)
formula (b) F(ab) = F(a) + F(b) if a and b > 0.
(c) F(a/b) — F(a) — F{b), where a and b > 0.
F(k) = e - kF(k - 1),
(d) F(a") = nF(a) if a > 0 and n has any value.
(provided that k is positive). By applying it four times,
show that 17.18. (Integration by parts). It is stated in Example
15.19 that, in obtaining v from dn/dx, we may take
*i

15e + 24F(0) = 9e —24. any antiderivative (so naturally we always take the
F(4) = x4 e* dx =
simplest one, with C = 0 in the tables). Confirm that
Jo

this is true for Example 17.19, in which u = x and (b) solid uniform sphere, mass m, radius a, about a
dii/d.x = ev, by choosing v(x) = ex + A instead. diameter: I = \ma2\
Prove that the truth of (15.7) is always unaffected (c) thin spherical shell, mass m, radius a, about a dia¬
by the choice of antiderivative for u(x). meter: / = § ma2;
(d) thin rectangle, mass m, side lengths 2a and 2b, about
17.19. (Integration by parts: an apparent paradox). Con¬ a diagonal: / = 5m(a2 + b2)\
sider the following calculation. (e) solid uniform cone, mass m, base radius a, height h,
about its axis: / = ^ma2.
x - i dx x ‘(l)dx
17.21 Assume that
(~x 2)x dx = 1 + dx.

e al cos bt dt = A e cos bt + B e sin bt + C,


Therefore 0 = 1. How is this to be resolved?

17.20 Verify the following moments of inertia / about where A, B, and C are constants. By differentiating this
the axis stated (Section 16.4): expression and matching both sides, obtain the constants
(a) thin circular disc, mass m, radius a, about a diameter: A and B in terms of a and b. Compare your result with
I = \ma2\ eqn. (15.11).
18 Unforced linear differential equations with constant coefficients

Contents
18.1 Differential equations and their solutions
18.2 Solving first-order linear unforced equations
18.3 Solving second-order linear unforced equations
18.4 Complex roots of the characteristic equation
18.5 Initial conditions for second-order equations
Problems

18.1 Differential equations and their solutions


Suppose that we have a problem in which a quantity $x$ that we are
studying depends on the time $t$; that is to say, $x$ is a function of $t$,
which we will write as $x(t)$. From the physics and geometry of the
problem we can often obtain an indirect relation between $x$ and
$t$, called an equation for $x$. The equation might be an ordinary
algebraic equation such as $x^2 + 2xt = 1$, but it might contain $dx/dt$
or $d^2x/dt^2$, as in the equation $d^2x/dt^2 = g$ for a falling body, where
$g$ is the gravitational acceleration. This is a simple example of a
differential equation, and we can solve it by the methods of earlier
chapters (compare Problem 14.8f).
The equation

$$\frac{dx}{dt} = 3x$$

is also a differential equation, but we do not yet know how to find
an explicit solution for $x$ in terms of $t$. Obviously not just anything
will do; if for instance we try $x = t^2$ it does not work, because then
$dx/dt = 2t$, but $3x = 3t^2$, and these are quite different.
A clue is given by interpreting the equation: it says that a quantity
$x$ always grows at a rate proportional to the amount of $x$ already
present. This is a property of the exponential function (see Section
1.10), so we might try exponential functions of $t$. In fact,

$$x = e^{3t}$$

solves the equation, because then $dx/dt = 3e^{3t}$, and this is equal to
$3x$, as required. However, it is not the only solution, because

$$x = A e^{3t},$$

where $A$ is any constant, also solves the equation.

In general, a differential equation for $x$ as a function of $t$ is
an equation involving at least the first derivative $dx/dt$ as well as,
possibly, $x$ and $t$ separately. Some examples are

$$\frac{dx}{dt} + 2xt = 1, \qquad \frac{d^2x}{dt^2} + \frac{dx}{dt} + x = 0, \qquad \frac{d^3x}{dt^3} = \frac{x^2}{t^2}.$$

In such equations, $t$ is called the independent variable and $x$
the dependent variable. An equation is called first-order, second-order,
and so on, according to the order of the highest derivative in it: $dx/dt$,
$d^2x/dt^2$, and so on.
Problems in science and engineering are often most easily formulated
in terms of differential equations. Suppose for example that
in the RL circuit of Fig. 18.1 the switch is closed at time $t = 0$, and
that subsequently the voltage applied is $E(t)$. Then the current $x(t)$
is found by solving the differential equation

$$L \frac{dx}{dt} + Rx = E(t).$$

[Fig. 18.1: RL circuit with resistance R and inductance L.]

Here we have collected all the terms that involve $x$ (including $dx/dt$)
on the left side and have put the term that does not involve $x$,
namely $E(t)$, on the right. This is the conventional arrangement.
The term independent of $x$ which comes on the right is then called
the forcing term, the reason being obvious in this case, since $E(t)$
drives the circuit.
The differential equation with the same left-hand side, but with
a zero forcing term on the right, plays a key role in obtaining
solutions of the original equation. Such equations are called
unforced differential equations, or sometimes homogeneous equations,
and are the subject of this chapter. Also, for the present, we shall
further restrict ourselves to linear equations with constant coefficients,
which have the form:

Linear unforced differential equations with constant coefficients (18.1)

(a) First-order:
$$\frac{dx}{dt} + cx = 0 \quad (c \text{ constant}).$$
(b) Second-order:
$$\frac{d^2x}{dt^2} + b \frac{dx}{dt} + cx = 0 \quad (b, c \text{ constants}).$$

These are called linear because there are no squares, products, etc.
involving $x$ and its derivatives. Such equations have comparatively
simple characteristics. The simplest instance of all is

$$\frac{dx}{dt} = 0.$$

It has solutions $x = A$, where $A$ is any constant. There is therefore
an infinity of solutions, and we must expect this to be true in more
general cases too.
A solution of a differential equation is any function $x(t)$ which
fits, or satisfies, the equation. This is illustrated in the next two
examples.

Example 18.1. For the differential equation $dx/dt + 2x = 0$, verify that (a)
$x = e^{2t}$ is not a solution, (b) $x = 2e^{-2t}$ is a solution.
(a) Test $x = e^{2t}$. Then $dx/dt = 2e^{2t}$ and so

$$\frac{dx}{dt} + 2x = 2e^{2t} + 2e^{2t} = 4e^{2t}.$$

This is not zero, so $e^{2t}$ is not a solution.

(b) Test $x = 2e^{-2t}$. Then $dx/dt = -4e^{-2t}$ and so

$$\frac{dx}{dt} + 2x = -4e^{-2t} + 4e^{-2t} = 0.$$

The zero value is what the equation requires, so $2e^{-2t}$ is a solution.

Incidentally, we can confirm in the same way that $x = A e^{-2t}$, where $A$ is
any constant, is always a solution. We have

$$\frac{dx}{dt} + 2x = -2A e^{-2t} + 2A e^{-2t} = 0,$$

as it should be. This is the infinity of solutions we were expecting.
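The 'try it out' test of Example 18.1 can also be carried out numerically. In the sketch below (Python standard library; the names `residual`, `candidate_a`, `candidate_b` and the step size are illustrative assumptions) the derivative is approximated by a central difference, and the left-hand side dx/dt + 2x is evaluated for both candidate functions.

```python
import math

def residual(x, t, h=1e-6):
    """Approximate value of dx/dt + 2x at time t, using a central difference."""
    dxdt = (x(t + h) - x(t - h)) / (2 * h)
    return dxdt + 2 * x(t)

candidate_a = lambda t: math.exp(2 * t)        # part (a): expected to fail
candidate_b = lambda t: 2 * math.exp(-2 * t)   # part (b): expected to succeed

for t in (0.0, 0.5, 1.0):
    # First residual is close to 4 e^(2t); second is close to zero
    print(t, residual(candidate_a, t), residual(candidate_b, t))
```

A residual that is zero (to within rounding) at every sampled time is consistent with the function being a solution; a single clearly non-zero residual is enough to rule it out.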

Example 18.2. Verify that the following functions are solutions of the second-order
equation $d^2x/dt^2 + 4x = 0$: (a) $x = \cos 2t$, (b) $x = \sin 2t$, (c) $x = A \cos 2t +
B \sin 2t$, where $A$ and $B$ are any constants.
Note that 'verify' means 'try out': you are not expected to show how the
solutions were obtained.
(a) If $x = \cos 2t$, then $dx/dt = -2 \sin 2t$, and $d^2x/dt^2 = -4 \cos 2t$. Therefore

$$\frac{d^2x}{dt^2} + 4x = -4 \cos 2t + 4 \cos 2t = 0$$

as required.

(b) Similarly, if $x = \sin 2t$, then

$$\frac{d^2x}{dt^2} + 4x = -4 \sin 2t + 4 \sin 2t = 0.$$

(c) Confirmation is straightforward, but the underlying reason why the
previous solutions can be combined into a new solution in this way is made
clearer by organizing the calculation as follows.

$$\frac{d^2x}{dt^2} + 4x = \frac{d^2}{dt^2}(A \cos 2t + B \sin 2t) + 4(A \cos 2t + B \sin 2t)$$

$$= A\left(\frac{d^2}{dt^2} \cos 2t + 4 \cos 2t\right) + B\left(\frac{d^2}{dt^2} \sin 2t + 4 \sin 2t\right),$$

by rearranging the terms. We already know that the two bracketed expressions
are zero, so the whole expression is zero as required.
The separation of $d^2x/dt^2 + 4x$ into an 'A' part and a 'B' part in this way
is possible only because the equation is linear.

18.2 Solving first-order linear unforced equations

Consider the equation

$$\frac{dx}{dt} + cx = 0 \quad (c \text{ a fixed constant}). \tag{18.2}$$

If we write it in the form

$$\frac{dx}{dt} = (-c)x,$$

it can be seen to describe the variation of a quantity $x(t)$ which
decays (if $c$ is positive) or grows (if $c$ is negative) at a rate
proportional to the amount of $x$ already present. From Section 1.10,
we know that exponential functions have this property. We shall
therefore test for solutions of the form

$$x(t) = A e^{mt} \tag{18.3}$$

where $A$ and $m$ are unknown constants which we shall try to adjust
to fit the equation. From (18.3),

$$\frac{dx}{dt} + cx = Am e^{mt} + cA e^{mt} = A(m + c) e^{mt}.$$

This quantity must be zero for all values of $t$ in order to fit the
differential equation (18.2). Ignoring the possibility $A = 0$, which
gives us the so-called trivial solution $x(t) = 0$, we must have

$$m = -c,$$

and in that case it does not matter what value is given to $A$. We have
therefore found a collection of solutions $x(t) = A e^{-ct}$, where $A$ is an
arbitrary constant. It can be proved that there are no other solutions,
and so we call the solutions we have found the general solution of
the equation.

The general solution of

$$\frac{dx}{dt} + cx = 0,$$

where $c$ is a given constant, is

$$x = A e^{-ct}, \tag{18.4}$$

where $A$ is any constant.

Example 18.3. Find the general solution of $dx/dt - 4x = 0$.

We will rework the theory. Look for solutions of the form $x = A e^{mt}$:

$$\frac{dx}{dt} - 4x = Am e^{mt} - 4A e^{mt} = A e^{mt}(m - 4).$$

This is zero for all time if $m = 4$, whatever the value of $A$. Therefore the general
solution (which includes the trivial one mentioned above) is

$$x = A e^{4t}, \quad \text{with } A \text{ an arbitrary constant.}$$

Figure 18.2 depicts several of these solutions, corresponding to various values
of the arbitrary constant $A$.
Each value of $A$ gives a different curve, and these solution curves fill the whole
plane. Also the curves do not cross, so there is one and only one curve through
every point. This corresponds to the fact that the slope $dx/dt$ has one and only
one value at every point, namely the value prescribed by the differential equation
$dx/dt = 4x$ taken at the point. This is all strong evidence that we have found
all the solutions. More is said about the graphical way of understanding
differential equations in Chapters 22 and 23.

[Fig. 18.2: Solution curves; the values of A are indicated on the curves.]

Example 18.4. Find all the solutions of $3 \dfrac{dx}{dt} + 2x = 0$.

We could carry out the full calculation as in the previous example. However, if
instead we want to quote the formula, (18.4), we must first write the equation
in the form

$$\frac{dx}{dt} + \tfrac23 x = 0.$$

Therefore $c = \tfrac23$ (not 2), and the general solution is

$$x = A e^{-\frac23 t}, \quad \text{with } A \text{ any constant.}$$

It is worthwhile to memorize the formula (18.4).

In practical cases we do not usually need all the solutions, but
only the one which satisfies some further condition of the problem.
Frequently the condition supplied describes the condition prevailing
at the start of the action, or at some other time, as in the
following.

Example 18.5. Find the solution of $\dfrac{dx}{dt} - 4x = 0$ for which $x = 2$ when $t = 1$.

Other ways of saying this are 'find the solution curve which passes through the
point (1, 2)', or 'find a solution $x(t)$ so that $x(1) = 2$'.
From Example 18.3, all the possible solutions are given by

$$x = A e^{4t}.$$

Since $x = 2$ when $t = 1$, we must have $2 = A e^4$. Therefore

$$A = 2e^{-4}$$

and the single solution picked out is

$$x = (2e^{-4}) e^{4t} = 2e^{4(t-1)}.$$
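The two steps used in Example 18.5 (quote the general solution (18.4), then fix A from the given point) can be packaged as a small function. This is only a sketch in Python's standard library; the function name `solve_first_order_ivp` and its argument names are illustrative assumptions.

```python
import math

def solve_first_order_ivp(c, t0, x0):
    """Return the function x(t) solving dx/dt + c x = 0 with x(t0) = x0.

    The general solution (18.4) is x = A e^(-c t); the condition
    x0 = A e^(-c t0) gives A = x0 e^(c t0).
    """
    A = x0 * math.exp(c * t0)
    return lambda t: A * math.exp(-c * t)

# Example 18.5: dx/dt - 4x = 0, so c = -4, with x = 2 when t = 1
x = solve_first_order_ivp(c=-4.0, t0=1.0, x0=2.0)
print(x(1.0))                               # reproduces the given value 2
print(x(1.5), 2 * math.exp(4 * (1.5 - 1)))  # agrees with 2 e^(4(t-1))
```

Note the sign convention: the equation must first be written as dx/dt + cx = 0, so that dx/dt − 4x = 0 corresponds to c = −4.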

An extra condition of this type is called an initial condition. It
describes the state of the system at a given time. The differential
equation together with its initial condition is called an initial-value
problem.

Initial-value problem, first-order equation (18.5)

(a) Differential equation: $\dfrac{dx}{dt} + cx = 0$.

(b) Initial condition: $x = x_0$ at $t = t_0$
(or $x(t_0) = x_0$), with $x_0$ and $t_0$ specified.

18.3 Solving second-order linear unforced equations

For second-order differential equations of the type (18.1), we use a
similar technique.

Example 18.6. Find some solutions of the equation

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 0.$$

We will look first for absolutely basic solutions. Test whether there are any
solutions of the form $x(t) = e^{mt}$, where $m$ is constant. Because $dx/dt = m e^{mt}$ and
$d^2x/dt^2 = m^2 e^{mt}$, we have

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = m^2 e^{mt} + m e^{mt} - 2 e^{mt} = e^{mt}(m^2 + m - 2).$$

This is zero for all time if $m^2 + m - 2 = 0$, that is, if

$$m = 1 \text{ or } -2.$$

This gives us two solutions, namely

$$x(t) = e^{t}, \quad \text{and} \quad x(t) = e^{-2t}.$$

From this basis, we can obtain more solutions. Guided by Example 18.2c, we
show that also

$$x(t) = A e^{t} + B e^{-2t},$$

where $A$ and $B$ are arbitrary constants, is a solution. By substituting into the
equation and sorting the terms into those with coefficient $A$ and those with
coefficient $B$, we obtain

$$A\left(\frac{d^2}{dt^2} e^{t} + \frac{d}{dt} e^{t} - 2e^{t}\right) + B\left(\frac{d^2}{dt^2} e^{-2t} + \frac{d}{dt} e^{-2t} - 2e^{-2t}\right) = 0,$$

because $e^{t}$ and $e^{-2t}$ are known already to be solutions, so both of the
bracketed expressions are zero.

This is the principle, but consider now the general case

$$\frac{d^2x}{dt^2} + b \frac{dx}{dt} + cx = 0.$$

Look for solutions of the form $x = e^{mt}$. Then

$$\frac{d^2x}{dt^2} + b \frac{dx}{dt} + cx = e^{mt}(m^2 + bm + c).$$

This will be zero for all $t$, as required by the differential equation, if

$$m^2 + bm + c = 0, \tag{18.6}$$

which is called the characteristic equation. Being quadratic, it may
have two real solutions, exactly one real solution, or two complex
solutions, depending on the coefficients. Consider the real cases first:

Roots $m_1$ and $m_2$ of the characteristic equation real and different

In this case,

$$x(t) = e^{m_1 t} \quad \text{and} \quad x(t) = e^{m_2 t}$$

are solutions of the differential equation, and from these we can
construct a whole family of solutions

$$x(t) = A e^{m_1 t} + B e^{m_2 t}$$

where $A$ and $B$ are arbitrary. It can be proved that there are no more
solutions: this gives the general solution. The pair of functions
$(e^{m_1 t}, e^{m_2 t})$ is called a basis for the general solution.

$\dfrac{d^2x}{dt^2} + b \dfrac{dx}{dt} + cx = 0$; roots $m_1$ and $m_2$ of
$m^2 + bm + c = 0$ real and different (18.7)

Basis of solutions: $e^{m_1 t}$, $e^{m_2 t}$.

General solution: $A e^{m_1 t} + B e^{m_2 t}$ ($A$, $B$ arbitrary).

Example 18.7. Find the general solution of $2 \dfrac{d^2x}{dt^2} - \dfrac{dx}{dt} - x = 0$.

To correspond with the standard form, (18.7), we should have to write the
equation in the form $d^2x/dt^2 - \tfrac12\, dx/dt - \tfrac12 x = 0$, but there is no need to do
this if we directly test for solutions of the form $x = e^{mt}$. The characteristic
equation then takes the form $2m^2 - m - 1 = 0$, or $(2m + 1)(m - 1) = 0$, so that
$m_1 = -\tfrac12$, $m_2 = 1$. Therefore the basis for the general solution is the solution pair
$(e^{-\frac12 t}, e^{t})$, and the general solution is

$$x(t) = A e^{-\frac12 t} + B e^{t}, \quad A \text{ and } B \text{ arbitrary.}$$

The roots $m_1$ and $m_2$ of the characteristic equation are equal

Suppose that $m_1 = m_2 = m_0$, say. We have then only one function
for our basis instead of two, and we might expect the general solution
to be $A e^{m_0 t}$. However, all we know is that there is essentially only
one solution of the form $e^{mt}$ (ignoring simple multiples of $e^{mt}$), but we
shall see in the next example that there is also a solution which is
not of this form, namely

$$x(t) = t e^{m_0 t}. \tag{18.8}$$

We might therefore think there will be no end to it: if $t e^{m_0 t}$
is a solution, then why not $t^2 e^{m_0 t}$, or some function of great
complication? However, it can be proved that every second-order
linear differential equation has exactly two linearly independent
solutions (i.e. they are not just multiples of each other); also that
these form a basis of solutions: we do not need any others to
construct the most general solution. Formally:

Basis and general solution of $\dfrac{d^2x}{dt^2} + b \dfrac{dx}{dt} + cx = 0$ (18.9)

(a) There exist two linearly independent solutions.
(b) If $u(t)$ and $v(t)$ are any two linearly independent
solutions, these form a basis for the general solution;
that is to say, the general solution is given by

$$x(t) = Au(t) + Bv(t),$$

where $A$ and $B$ are arbitrary constants.

Example 18.8. Find the general solution of $\dfrac{d^2x}{dt^2} + 4 \dfrac{dx}{dt} + 4x = 0$.

The characteristic equation, formed by substituting $x(t) = e^{mt}$, is

$$m^2 + 4m + 4 = (m + 2)^2 = 0,$$

and the only value of $m$ that we find is $m = -2$. It corresponds to the basic
solution $e^{-2t}$.
The theorem (18.9) guarantees there is another independent solution, and it
does not matter how we find it. Test the truth of (18.8), which proposes an
independent solution having the form

$$x(t) = t e^{-2t}.$$

Then

$$\frac{dx}{dt} = (1 - 2t) e^{-2t},$$

and

$$\frac{d^2x}{dt^2} = (-4 + 4t) e^{-2t}.$$

Therefore

$$\frac{d^2x}{dt^2} + 4 \frac{dx}{dt} + 4x = [(-4 + 4t) + 4(1 - 2t) + 4t] e^{-2t},$$

which is zero, so $x(t) = t e^{-2t}$ is a second solution, and it is independent of the
first. By (18.9), the solution basis is therefore

$$(e^{-2t}, \; t e^{-2t}),$$

and the general solution is

$$x(t) = A e^{-2t} + Bt e^{-2t}, \quad A \text{ and } B \text{ arbitrary.}$$
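The question raised earlier, 'why not t² e^{m₀t}?', can be explored numerically for the equation of Example 18.8. In this sketch (Python standard library; the function names, the step size and the sample time t = 1 are illustrative assumptions) the left-hand side x'' + 4x' + 4x is approximated by central differences for t e^{−2t} and for t² e^{−2t}.

```python
import math

def residual(x, t, h=1e-4):
    """Approximate x'' + 4x' + 4x at time t using central differences."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return d2 + 4 * d1 + 4 * x(t)

second_solution = lambda t: t * math.exp(-2 * t)      # t e^(-2t), from (18.8)
not_a_solution = lambda t: t**2 * math.exp(-2 * t)    # t^2 e^(-2t)

print(residual(second_solution, 1.0))   # close to zero
print(residual(not_a_solution, 1.0))    # close to 2 e^(-2), not zero
```

Direct differentiation confirms the second figure: for x = t² e^{−2t} the left-hand side works out to 2e^{−2t}, so this function fails to satisfy the equation.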

The second solution always takes the same form (see Problem
18.8):

Characteristic equation: coincident roots (18.10)

If $\dfrac{d^2x}{dt^2} + b \dfrac{dx}{dt} + cx = 0$, in which $b^2 - 4c = 0$ (for
coincident roots), and $m_0$ is the single solution of the
characteristic equation $m^2 + bm + c = 0$, then the solution
basis is $(e^{m_0 t}, t e^{m_0 t})$ and the general solution is
$x(t) = A e^{m_0 t} + Bt e^{m_0 t}$ ($A$ and $B$ arbitrary constants).

18.4 Complex roots of the characteristic equation

If $b^2 < 4c$, the roots $m_1$ and $m_2$ of the characteristic equation
$m^2 + bm + c = 0$ for the differential equation $d^2x/dt^2 + b\, dx/dt +
cx = 0$ are complex. Since they are roots of a quadratic equation,
they must be complex conjugate, so put

$$m_1 = \alpha + j\beta, \qquad m_2 = \alpha - j\beta,$$

where $\alpha$ and $\beta$ are real numbers. The corresponding functions

$$e^{(\alpha + j\beta)t} \quad \text{and} \quad e^{(\alpha - j\beta)t} \tag{18.11}$$

are genuine solutions of the differential equation. They are complex
functions, so we call (18.11) a complex basis for solutions of the
differential equation. If we are interested in complex as well as real
solutions, then we can allow the arbitrary constants $A$ and $B$ to be
complex as well, in an all-inclusive general complex solution

$$x(t) = A e^{(\alpha + j\beta)t} + B e^{(\alpha - j\beta)t}.$$

Suppose, however, that we want the general solution to consist
only of real functions. Then a basis for real solutions can be got
from (18.11) in the following way. By (6.8)

$$e^{(\alpha + j\beta)t} = e^{\alpha t} e^{j\beta t} = e^{\alpha t} \cos \beta t + j e^{\alpha t} \sin \beta t.$$

This function solves the differential equation, so its real and
imaginary parts separately must also solve it. Therefore

$$(e^{\alpha t} \cos \beta t, \; e^{\alpha t} \sin \beta t)$$

is a real basis for the general (real) solution

$$x(t) = A e^{\alpha t} \cos \beta t + B e^{\alpha t} \sin \beta t, \tag{18.12}$$

where $A$ and $B$ are arbitrary (but real, of course). The second
complex solution, $e^{(\alpha - j\beta)t}$, has the same basis, so we get nothing new
by considering it.
Equation (18.12) can be written in a different form. Using the
identity (1.19), we have

$$A \cos \beta t + B \sin \beta t = C \cos(\beta t + \phi),$$

where $C$ and $\phi$ are constants related to $A$ and $B$. Therefore (18.12)
can be written

$$x(t) = C e^{\alpha t} \cos(\beta t + \phi).$$

Since $A$ and $B$ are arbitrary, so are $C$ and $\phi$.
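The passage from the (A, B) form to the (C, φ) form can be made explicit. Expanding C cos(βt + φ) = C cos φ cos βt − C sin φ sin βt and matching coefficients gives A = C cos φ and B = −C sin φ. The sketch below (Python standard library; the function name `amplitude_phase` is an illustrative assumption, and the text itself only states that C and φ are related to A and B through identity (1.19)) computes one valid choice of C and φ and checks the identity at a few angles.

```python
import math

def amplitude_phase(A, B):
    """Return (C, phi) with A cos(theta) + B sin(theta) = C cos(theta + phi).

    Matching coefficients gives A = C cos(phi) and B = -C sin(phi),
    so C = sqrt(A^2 + B^2) and phi = atan2(-B, A).
    """
    C = math.hypot(A, B)
    phi = math.atan2(-B, A)
    return C, phi

A, B = 1.0, 1.0
C, phi = amplitude_phase(A, B)
print(C, phi)  # for A = B = 1 this gives sqrt(2) and -pi/4

for theta in (0.0, 0.7, 2.0, 5.0):
    lhs = A * math.cos(theta) + B * math.sin(theta)
    rhs = C * math.cos(theta + phi)
    print(theta, lhs, rhs)  # the two columns should agree
```

The choice C ≥ 0 with φ in (−π, π] is a convention; (−C, φ + π) describes the same function.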

Example 18.9. Find the general solution of $\dfrac{d^2x}{dt^2} + 4x = 0$.

The characteristic equation is $m^2 + 4 = 0$. Its solutions are $m = \pm 2j$. Therefore
the complex solution basis is $(e^{2jt}, e^{-2jt})$. But

$$e^{2jt} = \cos 2t + j \sin 2t,$$

and the real and imaginary parts give a basis for the real solutions:

$$(\cos 2t, \; \sin 2t).$$

Therefore the general solution is

$$x(t) = A \cos 2t + B \sin 2t \quad (A, B \text{ arbitrary}).$$

Example 18.10. Find the general solution of $\dfrac{d^2x}{dt^2} + 2 \dfrac{dx}{dt} + 2x = 0$.

Setting $x = e^{mt}$ gives the characteristic equation $m^2 + 2m + 2 = 0$, so that
$m = -1 \pm j$. Therefore

$$(e^{(-1 + j)t}, \; e^{(-1 - j)t})$$

is a basis for complex solutions. But

$$e^{(-1 + j)t} = e^{-t}(\cos t + j \sin t),$$

whose real and imaginary parts are

$$e^{-t} \cos t, \quad e^{-t} \sin t.$$

These form the basis for the real solutions. The general solution is

$$x(t) = A e^{-t} \cos t + B e^{-t} \sin t.$$

If we chose instead to take the real and imaginary parts of $e^{(-1 - j)t}$, we would
obtain $(e^{-t} \cos t, -e^{-t} \sin t)$ as a basis. The minus sign will be absorbed into
the arbitrary constant $B$: no new solutions appear.

The general solution method can be summed up as follows:

$\dfrac{d^2x}{dt^2} + b \dfrac{dx}{dt} + cx = 0$, when $m^2 + bm + c = 0$ has
complex roots $m_1, m_2 = \alpha \pm j\beta$ (i.e. $b^2 < 4c$) (18.13)

Complex basis: $e^{(\alpha + j\beta)t}$, $e^{(\alpha - j\beta)t}$.
Real basis: $e^{\alpha t} \cos \beta t$, $e^{\alpha t} \sin \beta t$.
General solution:
(a) $x(t) = A e^{\alpha t} \cos \beta t + B e^{\alpha t} \sin \beta t$
($A$ and $B$ arbitrary);
or
(b) $x(t) = C e^{\alpha t} \cos(\beta t + \phi)$ ($C$ and $\phi$ arbitrary).
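The three cases for the characteristic equation m² + bm + c = 0 (real and different roots, coincident roots, complex roots) depend only on the sign of b² − 4c, which a short function can report. The sketch below uses the Python standard library; the name `classify_roots` and the returned labels are illustrative assumptions.

```python
import math

def classify_roots(b, c):
    """Classify the roots of m^2 + b m + c = 0 and return (label, (m1, m2)).

    The exact test disc == 0 is only reliable when b and c are such that
    b*b - 4*c is computed without rounding error (for example small integers).
    """
    disc = b * b - 4 * c
    if disc > 0:
        root = math.sqrt(disc)
        return "real distinct", ((-b + root) / 2, (-b - root) / 2)
    if disc == 0:
        return "coincident", (-b / 2, -b / 2)
    alpha = -b / 2                 # real part of the roots
    beta = math.sqrt(-disc) / 2    # imaginary part of the roots
    return "complex", (complex(alpha, beta), complex(alpha, -beta))

print(classify_roots(1, -2))  # Example 18.6: roots 1 and -2
print(classify_roots(4, 4))   # Example 18.8: coincident root -2
print(classify_roots(2, 2))   # Example 18.10: roots -1 + j and -1 - j
```

Python writes the imaginary unit as `j`, which happens to match the notation used in this chapter. For the complex case, the real and imaginary parts of either root give the α and β that appear in the real basis of (18.13).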

A very important case is when $b = 0$ and $c > 0$, illustrated by
Example 18.9. In that case, $\alpha = 0$. In conventional notation, putting
$c = \omega^2$, we obtain the following result:

Characteristic equation: $m^2 + \omega^2 = 0$; $m_1, m_2 = \pm j\omega$. (18.14)
Complex basis: $e^{j\omega t}$, $e^{-j\omega t}$.
Real basis: $\cos \omega t$, $\sin \omega t$.
General solution: (a) $x(t) = A \cos \omega t + B \sin \omega t$,
or
(b) $x(t) = C \cos(\omega t + \phi)$.

In the special case (18.14), the alternative solution form

$$x(t) = C \cos(\omega t + \phi)$$

shows that the solutions oscillate regularly, swinging above and
below the $t$ axis to an extent governed by the amplitude $C$. In the
general case (18.13),

$$x(t) = C e^{\alpha t} \cos(\beta t + \phi),$$

the solutions oscillate, but the amplitude is governed by the factor
$C e^{\alpha t}$. If $\alpha$ is positive, the oscillation constantly grows; if $\alpha$ is negative,
it dies away to zero. This is fully discussed in Chapter 20, but Fig.
18.3 shows a particular case.
The damped unforced linear oscillator is the simplest linear
model of an oscillating mechanical or electrical system which has a
small amount of friction or some other form of energy-loss mechanism
(see Chapter 20 for a full discussion). In a customary notation the
equation is

$$\frac{d^2x}{dt^2} + 2k \frac{dx}{dt} + \omega^2 x = 0.$$

[Fig. 18.3: Graph of $x(t) = 4e^{-0.2t} \cos(2t - 1)$.]

The term $2k\, dx/dt$ expresses the energy-absorbing property. Assume

$$k^2 < \omega^2.$$

The characteristic equation is $m^2 + 2km + \omega^2 = 0$, so that

$$m = -k \pm (k^2 - \omega^2)^{\frac12} = -k \pm j(\omega^2 - k^2)^{\frac12},$$

since $k^2 < \omega^2$. From (18.13), $\alpha = -k$ and $\beta = (\omega^2 - k^2)^{\frac12}$, so finally:

$\dfrac{d^2x}{dt^2} + 2k \dfrac{dx}{dt} + \omega^2 x = 0$ where $k^2 < \omega^2$ (18.15)

General solution:

(a) $x(t) = A e^{-kt} \cos(\omega^2 - k^2)^{\frac12} t + B e^{-kt} \sin(\omega^2 - k^2)^{\frac12} t$
($A$ and $B$ arbitrary constants); or

(b) $x(t) = C e^{-kt} \cos[(\omega^2 - k^2)^{\frac12} t + \phi]$
($C$ and $\phi$ arbitrary).
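Form (a) of the damped-oscillator solution can be written as a function of k, ω, A and B and tested against the differential equation. The following is a sketch in the Python standard library; the names `damped_solution` and `oscillator_residual`, the sample parameter values, and the finite-difference step are illustrative assumptions.

```python
import math

def damped_solution(k, omega, A, B):
    """Return x(t) = e^(-kt) (A cos(beta t) + B sin(beta t)), beta = sqrt(omega^2 - k^2)."""
    if k * k >= omega * omega:
        raise ValueError("this form of the solution requires k^2 < omega^2")
    beta = math.sqrt(omega * omega - k * k)
    def x(t):
        return math.exp(-k * t) * (A * math.cos(beta * t) + B * math.sin(beta * t))
    return x

def oscillator_residual(x, k, omega, t, h=1e-4):
    """Approximate x'' + 2k x' + omega^2 x at time t using central differences."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return d2 + 2 * k * d1 + omega * omega * x(t)

x = damped_solution(k=0.2, omega=2.0, A=1.0, B=0.5)
for t in (0.0, 1.0, 3.0):
    print(t, x(t), oscillator_residual(x, 0.2, 2.0, t))  # residual close to zero
```

The guard on k² < ω² reflects the assumption made in the text; when it fails, the roots of the characteristic equation are real and the earlier real-root formulas apply instead.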

1 8.5 Initial conditions for second-order equations


The general solution of a second-order differential equation involves
two arbitrary constants, and the solutions are therefore an order of
magnitude more numerous than in the first-order case. Unlike the
first-order case, the solution curves may cross - in fact, there is an
infinite number of solution curves through any point on the (x, t)
plane, as indicated in Fig. 18.4a.
To pick out a particular solution, we need to determine the two arbitrary constants. Two pieces of information are necessary. These may consist of two initial conditions, which define the state of the system at some starting time $t_0$: the values of $x(t)$ and the slope $dx/dt$ at $t = t_0$ are given (see Fig. 18.4b). For example, the equation $d^2x/dt^2 + \omega_0^2 x = 0$ describes the oscillations of a particle on a spring; the initial conditions tell us its position and velocity (i.e. its state) when it starts off. We then have an initial-value problem:

Initial-value problem

(i) Equation: $\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = 0$.

(ii) Initial conditions:

$$x = x_0 \text{ and } \frac{dx}{dt} = x_1 \text{ at } t = t_0, \qquad (18.16)$$

which may be expressed alternatively as

$$x(t_0) = x_0, \quad x'(t_0) = x_1,$$

where $x_0$ and $x_1$ are given.

Fig. 18.4 (a) An infinite number of curves pass through each point. (b) Selection of a solution given P and the slope at P.



Example 18.11. Find the solution of $d^2x/dt^2 + 4x = 0$ for which $x = 1$ and $dx/dt = 2$ at $t = 0$ (that is, $x(0) = 1$, $x'(0) = 2$).

First we need all the solutions. From Example 18.9, these are $x(t) = A\cos 2t + B\sin 2t$, where $A$ and $B$ may take any values. Since $x = 1$ at $t = 0$,

$$1 = A + 0, \text{ so } A = 1.$$

For the other condition, we first need $x'(t)$ in general:

$$x'(t) = -2A\sin 2t + 2B\cos 2t.$$

At $t = 0$, we are given that $x'(t) = 2$, so the last equation becomes

$$2 = 0 + 2B, \text{ or } B = 1.$$

The required solution is therefore $x(t) = \cos 2t + \sin 2t$.
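The step from initial conditions to the constants A and B in Example 18.11 can be sketched in Python for the equation x'' + ω²x = 0 with a general starting time. The closed-form expressions below come from solving the two linear equations x(t0) = x0 and x'(t0) = x1 for A and B; the function name and argument names are assumptions made for this illustration.

```python
import math

def fit_constants(omega, x0, x1, t0=0.0):
    """Constants A, B in x(t) = A cos(omega t) + B sin(omega t)
    such that x(t0) = x0 and x'(t0) = x1."""
    c, s = math.cos(omega * t0), math.sin(omega * t0)
    A = x0 * c - (x1 / omega) * s
    B = x0 * s + (x1 / omega) * c
    return A, B

# Example 18.11: x'' + 4x = 0, x(0) = 1, x'(0) = 2, so omega = 2.
A, B = fit_constants(omega=2.0, x0=1.0, x1=2.0)
print(A, B)
```

For the data of Example 18.11 this reproduces A = 1 and B = 1.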

Problems

(For the 'dash' notation $x'(t) = dx/dt$ etc., see (4.1).)

18.1. Say which of the following equations are linear, unforced, with constant coefficients, i.e. can be rearranged to conform with (18.1a).
(a) $x' = 3t$; (b) $x' = [\ldots]x$; (c) $x' + tx = 0$;
(d) $3x' - 2x^2 = 0$; (e) $x' - x = 0$; (f) $x' = 0$;
(g) $\dfrac{x'}{x^2} = 3$; (h) $\dfrac{dy}{dx} + [\ldots]y = 1$; (i) $\dfrac{1}{y}\dfrac{dy}{dx} = 2$;
(j) $L\dfrac{dI}{dt} + RI = 0$; (k) $\dfrac{v' + v + [\ldots]}{v' - v + [\ldots]} = 1$.

18.2. Write down all the solutions of the following equations. Check one or two of them by substitution into the differential equation.
(a) $x' + 5x = 0$; (b) $x' - [\ldots]x = 0$;
(c) $x' - x = 0$; (d) $x' + 3x = 0$;
(e) $3x' + 4x = 0$; (f) $x' = 2x$; (g) $x' = 3x$;
(h) $x'/x = -3$; (i) $(x' + 1)/(x + 1) = 1$.

18.3. Solve the following initial-value problems.
(a) $x' + 2x = 0$, $x = 3$ when $t = 0$;
(b) $3x' - x = 0$, $x = 1$ when $t = 1$;
(c) $y' - 2y = 0$, $y = 2$ when $x = -3$;
(d) $x' + x = 0$, $x(-1) = 10$;
(e) $2y' - 3y = 0$, $y(0) = 1$;
(f) Find the curve whose slope at any point $(x, y)$ is equal to $5x$, and which passes through the point $(1, -2)$.

18.4. Suppose that the generator in Fig. 18.1 is short-circuited and cut out at a moment when the current in the circuit is $I_0$. Find an expression for the current subsequently. Show that the ratio $L/R$ provides a measure of the time it takes for the current to die away.

18.5. A radioactive element disintegrates at a rate proportional to the amount of the original element still remaining. Show that if $A(t)$ represents the activity of the element at time $t$, then

$$\frac{dA}{dt} + kA = 0,$$

where $k$ is a positive constant.
(a) Solve the initial-value problem for $A$ if $A = A_0$ (given) at time $t = 0$.
(b) The time taken for the activity to drop to half of the starting value is called the half-life period. For Uranium 232, it is found that 17.5% has decayed after 20 years. Show that its half-life period is about 72 years.

18.6. (Gotterdammerung) Once upon a time, rabbits in Elysium reached maturity instantly and bred with a birthrate of 20 rabbits per year per couple. No rabbit ever died. At the start of the experiment Zeus released 50 male and 50 female rabbits.
By treating the number of rabbits as a continuously varying quantity and considering the number born in a short time $\delta t$, construct a differential equation and then an initial-value problem for $R(t)$, the rabbit population. Find how many rabbits there were at the end of Year 4.
Appalled by this result and assisted by Pluto, Zeus launched another similar experiment, in which any rabbit was allowed to live for one year only. Construct the differential equation for the population. Did this alleviate the situation appreciably?

18.7. Obtain all solutions of the following equations. (The characteristic equations all have real roots, not necessarily distinct.)
(a) $x'' - 3x' + 2x = 0$; (b) $x'' + x' - 2x = 0$;
(c) $x'' - x = 0$; (d) $x'' - 4x = 0$;
(e) $3x'' - [\ldots]x = 0$; (f) $x'' - 9x = 0$;
(g) $x'' + 2x' - x = 0$; (h) $x'' - 2x' - 2x = 0$;
(i) $2x'' + 2x' - x = 0$; (j) $3x'' - x' - 2x = 0$;
(k) $x'' + 4x' + 4x = 0$; (l) $x'' + 6x' + 9x = 0$;
(m) $4x'' + 4x' + x = 0$; (n) $x'' = 0$.

18.8. Verify that, when the characteristic equation corresponding to $x'' + bx' + cx = 0$ has coincident roots, $m_1 = m_2 = m_0$, say, then the function $x(t) = t\,e^{m_0 t}$ provides a second solution for the basis of the general solution. (For coincident roots, $b^2 = 4c$.)

18.9. Solve the following initial-value problems.
(a) $x'' - 4x = 0$, $x(0) = 1$, $x'(0) = 0$;
(b) $x'' + x' - 2x = 0$, $x(0) = 0$, $x'(0) = 2$;
(c) $y'' - 4y' + 4y = 0$, $y(0) = 0$, $y'(0) = -1$;
(d) $y'' + 2y' + y = 0$, $y(1) = 0$, $y'(1) = 1$;
(e) $x'' - 9x = 0$, $x(1) = 1$, $x'(1) = 1$;
(f) $x'' - 4x' = 0$, $x(1) = 1$, $x'(1) = 0$.

18.10. Obtain all solutions of the following equations. (The roots of the characteristic equations are complex.)
(a) $x'' + x = 0$; (b) $x'' + 9x = 0$;
(c) $x'' + [\ldots]x = 0$; (d) $x'' + \omega_0^2 x = 0$;
(e) $x'' + 2x' + 2x = 0$; (f) $y'' - 2y' + 2y = 0$;
(g) $y'' + y' + y = 0$; (h) $2x'' + 2x' + x = 0$;
(i) $3x'' + 4x' + 2x = 0$; (j) $3x'' - 4x' + 2x = 0$.

18.11. Solve the following initial-value problems.
(a) $x'' + x = 0$, $x(0) = 0$, $x'(0) = 1$;
(b) $x'' + 4x = 0$, $x(0) = 1$, $x'(0) = 0$;
(c) $x'' + \omega_0^2 x = 0$, $x(0) = a$, $x'(0) = b$;
(d) $x'' + 2kx' + x = 0$, $x(0) = 0$, $x'(0) = b$ for the cases $k^2 > 1$, $k^2 < 1$ and $k^2 = 1$.
(Use the $A$, $B$ form: finding the constants $C$ and $\phi$ in (18.14b) for an initial-value problem can be comparatively difficult.)

18.12. The approximate equation for small swings of a pendulum is

$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\theta = 0,$$

where $\theta$ is the inclination from the vertical (in radians), $l$ is the length and $g$ the gravitational acceleration. The pendulum is held still at an angle $\alpha$, and is then passively released. Find the subsequent motion.

18.13. The pendulum in Problem 18.12 is hanging at rest; then the bob is given a small velocity $v$ in the direction of $\theta$ increasing. Find the subsequent motion.

18.14. If there is a little friction in the pendulum of Problem 18.12, the equation of motion takes the form

$$\frac{d^2\theta}{dt^2} + K\frac{d\theta}{dt} + \frac{g}{l}\theta = 0,$$

where $K$ is an additional positive constant which takes account of the friction (assumed to be proportional to the angular velocity). In a particular case (SI units), $g = 9.7$, $l = 20$, $K = 0.066$. The pendulum is at rest at first, hanging freely. It is then pushed so as to give the bob a velocity of 1 metre per second. Find the subsequent motion.

18.15. Consider the third-order differential equation

$$\frac{d^3y}{dx^3} - y = 0.$$

Proceed by analogy with the method of Section 18.3: by substituting $y = e^{mx}$, and obtaining a characteristic equation for $m$ (a cubic equation), find three distinct basic solutions of this type. By introducing arbitrary constants $A$, $B$, $C$, find as wide a variety of solutions as you can (in fact, this is the general solution).

18.16. By proceeding as in Problem 18.15, find a wide variety of solutions of the equation [equation missing in source].

18.17. By proceeding with the equation

$$\frac{d^4y}{dx^4} - y = 0$$

as in Problem 18.15, obtain the collection of solutions

$$y(x) = A e^{x} + B e^{-x} + C\cos x + D\sin x,$$

where $A$, $B$, $C$, $D$ are arbitrary constants.

18.18. A tapered concrete column of height $H$ metres is to support a statue of mass $M$ (i.e. weight $Mg$ force units, where $g$ is the gravitational acceleration) at the top. Pressure (force per unit area) may not exceed $P$. Show that the most economical construction for the column is for its cross-sectional area $A(y)$, where $y$ is distance above the ground, to satisfy the equation

$$A(y) = \frac{Mg}{P} + \frac{\rho g}{P}\int_y^H A(u)\,du,$$

where $\rho$ is the density of concrete. By differentiating this expression (see Section 15.10), obtain a differential equation for $A(y)$, and an initial condition for the equation, and solve it.
19 Forced linear differential equations

Contents
19.1 Particular solutions for standard forcing terms
19.2 Harmonic forcing term by using complex solutions
19.3 Particular solutions: exceptional cases
19.4 The general solutions of forced equations
19.5 First-order linear equations with a variable coefficient
Problems

19.1 Particular solutions for standard forcing terms


Consider the equation

$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t).$$

The function $f(t)$ is called the forcing term; physically, it represents the external input to the system that the equation describes, and the system will respond with an output $x(t)$ which depends on the input $f(t)$.
If $f(t)$ is an exponential function $K e^{\alpha t}$, a sine or cosine function $K\sin\beta t$ or $K\cos\beta t$, or a polynomial, then we can find an individual particular solution by trial.

Example 19.1. Find a particular solution of

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 3e^{2t}.$$

Try for a solution containing the same exponential as that on the right-hand side of the equation:

$$x(t) = p\,e^{2t},$$

where $p$ is some constant - not an arbitrary constant, but one whose value we shall settle by substitution: only one value will do. Then

$$\frac{dx}{dt} = 2p\,e^{2t}, \quad \frac{d^2x}{dt^2} = 4p\,e^{2t},$$

so that

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 4p\,e^{2t} + 2p\,e^{2t} - 2p\,e^{2t} = e^{2t}(4 + 2 - 2)p = 4p\,e^{2t}.$$

This must equal the given right-hand side, $3e^{2t}$, for all values of $t$, which is only possible if $4p = 3$, or

$$p = \tfrac{3}{4}.$$

Therefore one particular solution is

$$x(t) = \tfrac{3}{4}e^{2t}.$$

There are many other solutions, as we shall see in Section 19.4, but they are all based on this particular solution.
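The substitution in Example 19.1 amounts to dividing the forcing amplitude by the characteristic polynomial evaluated at the exponent. A minimal Python sketch of that rule for x'' + bx' + cx = K e^{αt} follows; the function name is an assumption made for illustration, and the guard covers the exceptional case in which e^{αt} already solves the unforced equation.

```python
def exponential_trial(b, c, K, alpha):
    """Return p such that x(t) = p e^{alpha t} solves x'' + b x' + c x = K e^{alpha t}."""
    denom = alpha * alpha + b * alpha + c  # characteristic polynomial at m = alpha
    if denom == 0:
        raise ValueError("e^{alpha t} solves the unforced equation; this trial form fails")
    return K / denom

# Example 19.1: x'' + x' - 2x = 3 e^{2t}.
p = exponential_trial(b=1, c=-2, K=3, alpha=2)
print(p)
```

With the data of Example 19.1 the denominator is 4 + 2 - 2 = 4, which gives p = 3/4 as found above.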

Example 19.2. Find a particular solution of

$$\frac{d^2x}{dt^2} + 4x = 2\cos 3t.$$

Guess that there might be a solution of the form

$$x = p\cos 3t.$$

Then

$$\frac{dx}{dt} = -3p\sin 3t, \quad\text{and}\quad \frac{d^2x}{dt^2} = -9p\cos 3t,$$

so that

$$\frac{d^2x}{dt^2} + 4x = -9p\cos 3t + 4p\cos 3t = -5p\cos 3t.$$

This must be the same as the right-hand side of the equation in order for the guessed function to be a solution, so

$$-5p\cos 3t = 2\cos 3t.$$

Therefore $p = -\tfrac{2}{5}$, and the required solution is

$$x(t) = -\tfrac{2}{5}\cos 3t.$$

In most cases when the right-hand side is a sine or cosine it will not work so simply, as is illustrated in the following example.

Example 19.3. Find a particular solution of

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 2\cos 3t.$$

The form $p\cos 3t$ cannot be made to fit this equation because the $dx/dt$ term on the left produces a $\sin 3t$ term, making the left and right sides impossible to match for all $t$. Try instead

$$x = p\cos 3t + q\sin 3t.$$

Then

$$\frac{dx}{dt} = -3p\sin 3t + 3q\cos 3t,$$

and

$$\frac{d^2x}{dt^2} = -9p\cos 3t - 9q\sin 3t.$$

Therefore

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = (-9p + 3q - 2p)\cos 3t + (-9q - 3p - 2q)\sin 3t$$

$$= (-11p + 3q)\cos 3t + (-3p - 11q)\sin 3t$$

$$= 2\cos 3t$$

(the last expression being the right-hand side of the equation) for all $t$. The only way to satisfy this condition is to require both

$$-11p + 3q = 2, \quad -3p - 11q = 0.$$

The solution of these two simultaneous equations for $p$ and $q$ is

$$p = -\tfrac{11}{65}, \quad q = \tfrac{3}{65}.$$

Therefore a particular solution is

$$x = -\tfrac{11}{65}\cos 3t + \tfrac{3}{65}\sin 3t.$$

It will be necessary in nearly all cases to take both sine and cosine
terms into account.
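The matching of cosine and sine coefficients in Example 19.3 always leads to a pair of simultaneous equations of the same shape, so it can be sketched once for x'' + bx' + cx = K cos βt. In the Python sketch below, the abbreviations u = c - β² and v = bβ and the function name are assumptions introduced for illustration; exact fractions are used so that the result can be compared directly with the worked example (integer inputs are assumed).

```python
from fractions import Fraction

def harmonic_trial(b, c, K, beta):
    """Coefficients (p, q) of x = p cos(beta t) + q sin(beta t)
    solving x'' + b x' + c x = K cos(beta t), for integer inputs.

    Matching coefficients gives  u p + v q = K  and  -v p + u q = 0,
    with u = c - beta^2 and v = b beta.
    """
    u = c - beta * beta
    v = b * beta
    d = u * u + v * v
    if d == 0:
        raise ValueError("exceptional case: the trial form solves the unforced equation")
    return Fraction(K * u, d), Fraction(K * v, d)

# Example 19.3: x'' + x' - 2x = 2 cos 3t.
p, q = harmonic_trial(b=1, c=-2, K=2, beta=3)
print(p, q)
```

For Example 19.3 this gives u = -11 and v = 3, and hence the same values of p and q as the hand calculation.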

The case when f(t) is a constant often occurs.

Example 19.4. Find a particular solution of $\dfrac{d^2x}{dt^2} - 2\dfrac{dx}{dt} + 4x = 3$.

Test whether there is a constant solution

$$x(t) = p \quad\text{(a constant)}.$$

By substituting this in the differential equation, we get

$$0 + 0 + 4p = 3,$$

so that $p = \tfrac{3}{4}$, and the particular solution is just $x(t) = \tfrac{3}{4}$, which is obvious after it has been worked out.

If the right-hand side is a polynomial (the sum of one or more powers of $t$), then the solution will be a polynomial.

Example 19.5. Find a solution of $\dfrac{d^2x}{dt^2} - \dfrac{dx}{dt} + x = 3 + 2t^2$.

Try a solution of the form $x(t) = p + qt + rt^2$, where $p$, $q$, and $r$ are constants. It is normally necessary to try a polynomial of the same degree as the forcing term, which in this case has degree 2, and to include all the lower-degree terms in the trial solution.
Since

$$\frac{dx}{dt} = q + 2rt \quad\text{and}\quad \frac{d^2x}{dt^2} = 2r,$$

we must have

$$2r - (q + 2rt) + (p + qt + rt^2) = 3 + 2t^2.$$

Match up the coefficients of the three powers of $t$; we find that

$$r = 2, \quad -2r + q = 0, \quad 2r - q + p = 3.$$

The equations are easy to solve and lead to the solution

$$x(t) = 3 + 4t + 2t^2.$$
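The coefficient matching of Example 19.5 can be carried out from the highest power downwards, because each equation introduces only one new unknown. The following Python sketch does this for x'' + bx' + cx equal to a polynomial, under the assumption c ≠ 0 (otherwise we are in an exceptional case). The function name and the list convention (entry n holds the coefficient of tⁿ) are illustrative choices.

```python
from fractions import Fraction

def polynomial_trial(b, c, forcing):
    """Coefficients of a polynomial solution of x'' + b x' + c x = sum forcing[n] t^n.

    Matching the coefficient of t^n gives
        c p[n] + b (n+1) p[n+1] + (n+2)(n+1) p[n+2] = forcing[n],
    which is solved for p[n] starting from the top degree.
    """
    if c == 0:
        raise ValueError("exceptional case: a same-degree trial polynomial fails when c = 0")
    N = len(forcing) - 1
    p = [Fraction(0)] * (N + 3)  # two extra zeros stand for p[N+1] and p[N+2]
    for n in range(N, -1, -1):
        rhs = forcing[n] - b * (n + 1) * p[n + 1] - (n + 2) * (n + 1) * p[n + 2]
        p[n] = Fraction(rhs) / c
    return p[: N + 1]

# Example 19.5: x'' - x' + x = 3 + 2 t^2.
coeffs = polynomial_trial(b=-1, c=1, forcing=[3, 0, 2])
print(coeffs)
```

For Example 19.5 the coefficients come out as 3, 4 and 2, corresponding to x(t) = 3 + 4t + 2t².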

Particular solutions of

$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t) \quad\text{and}\quad \frac{dx}{dt} + cx = f(t)$$

(a) $f(t) = K e^{\alpha t}$: try a solution $x(t) = p\,e^{\alpha t}$.
(b) $f(t) = K\cos\beta t$ or $K\sin\beta t$: try a solution $x(t) = p\cos\beta t + q\sin\beta t$.
(c) $f(t)$ is a polynomial of degree $N$: try a polynomial of the same degree, with all its terms present.
(19.1)

There are exceptional cases where these substitutions have to be modified. For example, $d^2x/dt^2 = t$ has a polynomial solution of degree 3, not degree 1. These cases are treated in Section 19.3.
If the forcing term on the right-hand side consists of the sum of
several constituent terms, then obtain a particular solution for each
one, and add them, as in the following example.

Example 19.6. Obtain a particular solution of

$$\frac{d^2x}{dt^2} + 4x = 1 + e^{-t}.$$

Solve $\dfrac{d^2x_1}{dt^2} + 4x_1 = 1$ for $x_1(t)$ and $\dfrac{d^2x_2}{dt^2} + 4x_2 = e^{-t}$ for $x_2(t)$; then

$$x(t) = x_1(t) + x_2(t)$$

will be a particular solution of the original equation.

For $x_1(t)$, try for a constant solution $x_1(t) = p$: it is found that $p = \tfrac{1}{4}$, so $x_1(t) = \tfrac{1}{4}$.
For $x_2(t)$, try $x_2(t) = q\,e^{-t}$ (following the method of Example 19.1). The substitution gives $q\,e^{-t} + 4q\,e^{-t} = e^{-t}$, so that $q = \tfrac{1}{5}$, and the solution is $x_2(t) = \tfrac{1}{5}e^{-t}$.
Therefore, a particular solution $x(t)$ of the original equation is

$$x(t) = x_1(t) + x_2(t) = \tfrac{1}{4} + \tfrac{1}{5}e^{-t}.$$

The method just described is another consequence of the linearity of the class of equations considered. It is also called the superposition principle.
These methods apply equally to first-order equations.

Example 19.7. Obtain a particular solution of $\dfrac{dx}{dt} + x = 3\cos 2t$.

Remembering Example 19.3, we expect the solution will have to contain both cosine and sine terms, so try

$$x(t) = p\cos 2t + q\sin 2t.$$

The substitution gives $(p + 2q)\cos 2t + (-2p + q)\sin 2t = 3\cos 2t$, so that $p = \tfrac{3}{5}$, $q = \tfrac{6}{5}$, and the solution is

$$x(t) = \tfrac{3}{5}\cos 2t + \tfrac{6}{5}\sin 2t.$$



19.2 Harmonic forcing term by using complex solutions

In Example 19.3, we solved a second-order equation with the term $\cos 3t$ on the right by choosing constants $p$ and $q$ so that the expression $p\cos 3t + q\sin 3t$ would fit the equation. We shall explain another important method for obtaining solutions, which derives the required real solutions from complex solutions of a related equation.
First of all, consider the general differential equation

$$\frac{d^2X}{dt^2} + b\frac{dX}{dt} + cX = a\,e^{j\beta t}, \qquad (19.2)$$

where $b$, $c$, $a$, and $\beta$ are real constants, and $j$ is the complex element. Since the forcing term is an exponential, we shall test for a particular solution of (19.2) having the form

$$X(t) = P\,e^{j\beta t}$$

as in Example 19.1 (but this time we must expect that $P$ will be a complex constant).

Example 19.8. Find a particular (complex) solution of the complex differential equation

$$\frac{d^2X}{dt^2} + \frac{dX}{dt} + X = 3e^{2jt}.$$

Look for a solution of the form $X(t) = P\,e^{2jt}$. To find $P$, substitute this expression into the left-hand side of the differential equation:

$$(2j)^2 P\,e^{2jt} + (2j)P\,e^{2jt} + P\,e^{2jt} = P(-4 + 2j + 1)e^{2jt} = P(-3 + 2j)e^{2jt}.$$

This must be the same as the right-hand side of the equation, $3e^{2jt}$, for all values of $t$. Therefore, $P(-3 + 2j) = 3$, so

$$P = \frac{3}{-3 + 2j} = \frac{3(-3 - 2j)}{(-3)^2 + 2^2} = -\tfrac{3}{13}(3 + 2j).$$

Therefore $X(t) = -\tfrac{3}{13}(3 + 2j)e^{2jt}$ is a particular solution. When expanded, it becomes

$$X(t) = -\tfrac{3}{13}(3 + 2j)(\cos 2t + j\sin 2t)$$

$$= -\tfrac{3}{13}(3\cos 2t - 2\sin 2t) + j\{-\tfrac{3}{13}(2\cos 2t + 3\sin 2t)\}.$$

Consider next the real equation for $x(t)$:

$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = a\cos\beta t, \qquad (19.3)$$

where $b$, $c$, $a$, $\beta$ are all real. We know that

$$\cos\beta t = \mathrm{Re}\,e^{j\beta t}$$

(see (6.8)). Therefore, if we can find a particular solution $X(t)$ of the complex equation (19.2), its real part will solve the corresponding real equation (19.3).

Example 19.9. Find a particular solution of the equation

$$\frac{d^2x}{dt^2} + \frac{dx}{dt} - 2x = 2\cos 3t.$$

This is the same problem as Example 19.3, reworked so that the methods can be compared. Since $\cos 3t = \mathrm{Re}\,e^{3jt}$, the corresponding complex equation for $X(t)$ is

$$\frac{d^2X}{dt^2} + \frac{dX}{dt} - 2X = 2e^{3jt}.$$

To find a particular solution of this new equation, try $X(t) = P\,e^{3jt}$:

$$\frac{d^2X}{dt^2} + \frac{dX}{dt} - 2X = 9j^2 P\,e^{3jt} + 3jP\,e^{3jt} - 2P\,e^{3jt} = (9j^2 + 3j - 2)P\,e^{3jt} = (-11 + 3j)P\,e^{3jt}.$$

This must equal $2e^{3jt}$ for all values of $t$, so

$$P = \frac{2}{-11 + 3j} = \frac{2(-11 - 3j)}{(-11)^2 + 3^2} = -\left(\tfrac{11}{65} + \tfrac{3}{65}j\right).$$

Therefore we have a complex solution of the complex equation:

$$X(t) = -\left(\tfrac{11}{65} + \tfrac{3}{65}j\right)e^{3jt} = -\left(\tfrac{11}{65} + \tfrac{3}{65}j\right)(\cos 3t + j\sin 3t).$$

For $x(t)$, we require only the real part of this expression:

$$x(t) = \mathrm{Re}\,X(t) = -\tfrac{11}{65}\cos 3t + \tfrac{3}{65}\sin 3t,$$

which is what we obtained in Example 19.3 for the same problem.
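The complex method of Example 19.9 translates almost directly into code, since Python has built-in complex numbers (written with the suffix j, as in the text). The sketch below computes the complex constant P for X'' + bX' + cX = a e^{jβt} and reads off the real coefficients; the function name is an assumption made for illustration.

```python
import cmath
import math

def complex_amplitude(b, c, a, beta):
    """Return P such that X(t) = P e^{j beta t} solves X'' + b X' + c X = a e^{j beta t}."""
    s = 1j * beta
    denom = s * s + b * s + c
    if denom == 0:
        raise ValueError("exceptional case: e^{j beta t} solves the unforced equation")
    return a / denom

# Example 19.9: x'' + x' - 2x = 2 cos 3t.
P = complex_amplitude(b=1, c=-2, a=2, beta=3)

# Re(P e^{j beta t}) = Re(P) cos(beta t) - Im(P) sin(beta t)
p, q = P.real, -P.imag

def x(t):
    """Real particular solution x(t) = Re(P e^{3jt})."""
    return (P * cmath.exp(3j * t)).real

print(p, q)
```

The values of p and q agree (to rounding) with -11/65 and 3/65 from Examples 19.3 and 19.9. For a sine forcing term one would take the imaginary part instead.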

In the case when the right-hand side of the equation has the form $a\sin\omega t$, the calculation is the same, but the imaginary part of the complex solution must be extracted instead of the real part. The following example demonstrates also how right-hand sides of the form

$$a\,e^{\alpha t}\cos\beta t, \quad a\,e^{\alpha t}\sin\beta t$$

can be handled in the same way.

Example 19.10. Find a solution of $\dfrac{d^2x}{dt^2} + x = e^{-2t}\sin 3t$.

Use the fact that

$$e^{-2t}\sin 3t = \mathrm{Im}\,(e^{-2t}e^{3jt}) = \mathrm{Im}\,e^{(-2+3j)t}.$$

Therefore, consider the corresponding complex equation

$$\frac{d^2X}{dt^2} + X = e^{(-2+3j)t}.$$

To find a solution, try the form

$$X(t) = P\,e^{(-2+3j)t}.$$

We find in the usual way that

$$(-2 + 3j)^2 P\,e^{(-2+3j)t} + P\,e^{(-2+3j)t} = e^{(-2+3j)t}$$

for all values of $t$. Therefore

$$P = \frac{1}{(-2 + 3j)^2 + 1} = \frac{-1}{4(1 + 3j)} = -\tfrac{1}{40}(1 - 3j)$$

and

$$X(t) = -\tfrac{1}{40}(1 - 3j)e^{(-2+3j)t}.$$

If we take the imaginary part of $X(t)$, we obtain a solution of the original equation:

$$x(t) = \mathrm{Im}\,[-\tfrac{1}{40}(1 - 3j)(e^{-2t}e^{3jt})] = -\tfrac{1}{40}e^{-2t}(-3\cos 3t + \sin 3t).$$

The same result could be obtained by substituting

$$x(t) = p\,e^{-2t}\cos 3t + q\,e^{-2t}\sin 3t,$$

but this would be a very laborious and error-prone process.

The method is particularly advantageous when the coefficients are general constants. The following equation will be important in Chapter 20.

Example 19.11. Find a particular solution of

$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = a\cos\omega t, \quad\text{where } a > 0.$$

Since $\cos\omega t = \mathrm{Re}\,e^{j\omega t}$, first find a solution of

$$\frac{d^2X}{dt^2} + 2k\frac{dX}{dt} + \omega_0^2 X = a\,e^{j\omega t}.$$

By substituting $X(t) = P\,e^{j\omega t}$, we find that

$$P = a/[(\omega_0^2 - \omega^2) + j(2k\omega)].$$

It is easier if we put $P$ into polar coordinates: $P = |P|\,e^{j\phi}$, where

$$|P| = |a/[(\omega_0^2 - \omega^2) + j(2k\omega)]| = a/[(\omega_0^2 - \omega^2)^2 + (2k\omega)^2]^{1/2}$$

and

$$\phi = \arg[(\omega_0^2 - \omega^2) - j(2k\omega)],$$

since $a > 0$. Then we obtain

$$x(t) = \mathrm{Re}\,(P\,e^{j\omega t}) = \frac{a\cos(\omega t + \phi)}{[(\omega_0^2 - \omega^2)^2 + (2k\omega)^2]^{1/2}}, \qquad (19.4)$$

where $\phi$ is the polar angle of the point $((\omega_0^2 - \omega^2), -2k\omega)$ on an Argand diagram.
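The amplitude and phase in (19.4) can be obtained numerically by converting the complex constant P of Example 19.11 to polar form. The Python sketch below does this with the standard-library function cmath.polar; the function name and the sample values of k, ω₀, a and ω are assumptions chosen only for illustration.

```python
import cmath
import math

def steady_response(k, omega0, a, omega):
    """Amplitude |P| and phase phi of x(t) = |P| cos(omega t + phi) for
    x'' + 2k x' + omega0^2 x = a cos(omega t), with a > 0."""
    denom = complex(omega0 ** 2 - omega ** 2, 2 * k * omega)
    if denom == 0:
        raise ValueError("undamped resonance: no solution of this form")
    return cmath.polar(a / denom)  # returns (modulus, argument)

# Sample values, not taken from the text.
amp, phi = steady_response(k=0.5, omega0=2.0, a=1.0, omega=1.0)
print(amp, phi)
```

Here the denominator is 3 + j, so the amplitude is 1/√10 and φ is the polar angle of the point (3, -1), in agreement with (19.4).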

Particular solution of $\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = f(t)$

(a) $f(t) = a\cos\beta t$ or $a\sin\beta t$.
Put $X(t) = P\,e^{j\beta t}$ to solve

$$X'' + bX' + cX = a\,e^{j\beta t}.$$

Then $x(t) = \mathrm{Re}\,X(t)$ or $\mathrm{Im}\,X(t)$, corresponding to $\cos\beta t$ or $\sin\beta t$ respectively.
(b) $f(t) = a\,e^{\alpha t}\cos\beta t$ or $a\,e^{\alpha t}\sin\beta t$.
Solve $X'' + bX' + cX = a\,e^{(\alpha + j\beta)t}$, and continue as in (a).
(19.5)

19.3 Particular solutions: exceptional cases


There are exceptional cases for each of the three rules (19.1), when
the suggested substitution does not give any result because the trial
function delivers zero when it is substituted into the left-hand side.
This means (as with the similar exceptional case of a single solution
of the characteristic equation, (18.10)) that the trial solution must
have a different form.
The most important exception is the case of the equation

$$\frac{d^2x}{dt^2} + \beta^2 x = a\cos\beta t \quad(\text{or } a\sin\beta t).$$

Note that $\beta$ occurs on both sides of the equation. This is a special case of Example 19.11 in which $k = 0$ and $\omega^2 = \omega_0^2 = \beta^2$. The rule in (19.1) suggests substituting

$$x(t) = p\cos\beta t + q\sin\beta t,$$

and choosing $p$ and $q$ so that the two sides match. But we already know that this is a solution of the unforced equation $d^2x/dt^2 + \beta^2 x = 0$, and the inevitable zero that we get on making the substitution cannot be matched to $a\cos\beta t$ on the right.
In this case, the solution is quite different. The following results can be confirmed by direct substitution.

Particular solutions: two exceptional cases

(a) $\dfrac{d^2x}{dt^2} + \beta^2 x = a\cos\beta t$: solution

$$x(t) = \frac{a}{2\beta}\,t\sin\beta t.$$

(b) $\dfrac{d^2x}{dt^2} + \beta^2 x = a\sin\beta t$: solution

$$x(t) = -\frac{a}{2\beta}\,t\cos\beta t.$$

(19.6)

Example 19.12. Find a particular solution of $\dfrac{d^2x}{dt^2} + 9x = 5\sin 3t$.

Here, $x = p\cos 3t$ and $x = q\sin 3t$ both give $d^2x/dt^2 + 9x = 0$, so the standard solution form does not work. From (19.6), with $\beta = 3$ and $a = 5$, the required solution is

$$x(t) = -\frac{5}{2\times 3}\,t\cos 3t = -\tfrac{5}{6}\,t\cos 3t.$$

This solution is sketched in Fig. 19.1. Unlike the ordinary sine and cosine type solutions, it grows indefinitely. Such solutions have an important physical significance described in Chapter 20.
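The text notes that the results (19.6) can be confirmed by direct substitution. A numerical version of that check for Example 19.12 might look as follows in Python: the second derivative is approximated by a central difference, and the residual x'' + 9x - 5 sin 3t is evaluated at a few times. The function names, the step size and the sample times are assumptions made for this sketch.

```python
import math

def resonant_solution(a, beta, forcing):
    """Particular solution from (19.6) of x'' + beta^2 x = a cos(beta t) or a sin(beta t)."""
    if forcing == "cos":
        return lambda t: a / (2 * beta) * t * math.sin(beta * t)
    if forcing == "sin":
        return lambda t: -a / (2 * beta) * t * math.cos(beta * t)
    raise ValueError("forcing must be 'cos' or 'sin'")

def second_derivative(x, t, h=1e-4):
    """Central-difference approximation to x''(t)."""
    return (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)

# Example 19.12: x'' + 9x = 5 sin 3t, so a = 5 and beta = 3.
x = resonant_solution(a=5, beta=3, forcing="sin")
errors = [abs(second_derivative(x, t) + 9 * x(t) - 5 * math.sin(3 * t))
          for t in (0.0, 0.5, 1.0, 2.0, 4.0)]
print(max(errors))  # small, limited only by the finite-difference approximation
```

The factor t in the solution is what makes the amplitude grow without bound, as described above.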
Fig. 19.1 Graph of $x = -\tfrac{5}{6}\,t\cos 3t$.

There are other exceptional cases that are not so frequently encountered; some examples are given among the problems at the end of the chapter.

19.4 The general solution of forced equations


Consider the equation

$$\frac{d^2x}{dt^2} - x = -2\cos t. \qquad (19.7)$$

A particular solution, $x_p(t)$, say, is

$$x_p(t) = \cos t. \qquad (19.8)$$

From earlier experience, we should expect other solutions. In order to find some, consider what happens when we substitute various functions $x(t)$ in the expression

$$\frac{d^2x}{dt^2} - x. \qquad (19.9)$$

For example, when we put $x(t) = \cos t$, we obtain

$$\frac{d^2}{dt^2}\cos t - \cos t = -2\cos t,$$

as demanded by (19.7).
Suppose now that we can find another function, $x(t) = x_c(t)$ say, which produces zero out of (19.9). For example, $x(t) = x_c(t) = e^{t}$ makes $d^2x/dt^2 - x$ equal to zero. It is then obvious that if we put

$$x(t) = x_p(t) + x_c(t) = \cos t + e^{t}$$

into (19.9), we again obtain $-2\cos t$ on the right: that is to say, we have found another solution of (19.7).

But we already know, from Chapter 18, all the functions $x_c(t)$ that give zero when they are put into (19.9): they are the solutions of the equation

$$\frac{d^2x}{dt^2} - x = 0, \qquad (19.10)$$

and are given by

$$x(t) = x_c(t) = A e^{t} + B e^{-t},$$

where $A$ and $B$ are any constants. Therefore

$$x(t) = \cos t + A e^{t} + B e^{-t} \qquad (19.11)$$

is always a solution of (19.7). The differential equation (19.10) is called the unforced equation corresponding to the original equation (19.7), and its solutions $x_c(t)$ are called the complementary functions of the problem (they complement or extend the particular solution of (19.7) that we obtained).
To show that we have obtained all possible solutions of (19.7), take the particular solution $\cos t$ that we obtained, and suppose that $x_p(t)$ is any other solution of (19.7). Evidently the function $x(t) = x_p(t) - \cos t$ satisfies (19.10), so $x(t)$ must be a complementary function. Therefore

$$x_p(t) = \cos t + (\text{a complementary function}),$$

so $x_p(t)$ must be one of the solutions already expressed by (19.11). Therefore (19.11) is the general solution of (19.7). Exactly the same argument would have applied in the general case:

General solution of $\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = f(t)$

(i) Obtain any particular solution, $x_p(t)$.
(ii) Obtain all the solutions $A x_{c1}(t) + B x_{c2}(t)$ of the corresponding unforced equation

$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = 0$$

(the complementary functions).
The sum of these gives the general solution:

$$x(t) = x_p(t) + A x_{c1}(t) + B x_{c2}(t).$$

(19.12)

The theory and method is exactly the same for linear equations of
the first order, and of any order, whether the coefficients are constant
or not.

Example 19.13. Find the general solution of $\dfrac{d^2x}{dt^2} + 4x = 3\cos 5t$.

Particular solution $x_p(t)$. Looking forward into the calculation, it can be seen that the solution needs no $\sin 5t$ term. Therefore try

$$x_p(t) = p\cos 5t.$$

The substitution into the equation gives

$$p(-25\cos 5t) + 4p\cos 5t = 3\cos 5t$$

for all $t$, so $p = -\tfrac{1}{7}$. Therefore

$$x_p(t) = -\tfrac{1}{7}\cos 5t.$$

Complementary functions $x_c(t)$. We require the solutions $x_c(t)$ of the corresponding unforced equation $d^2x_c/dt^2 + 4x_c = 0$. Try for solutions of the form $x_c(t) = e^{mt}$. The substitution produces the characteristic equation $m^2 + 4 = 0$. Therefore $m = \pm 2j$, so a pair of solutions

$$(e^{2jt}, e^{-2jt})$$

constitutes a complex basis. To get a real basis, choose either one, say $e^{2jt}$, and find its real and imaginary parts. These are

$$\cos 2t, \quad \sin 2t,$$

and this is the required real basis. Therefore, all the complementary functions are given by

$$x_c(t) = A\cos 2t + B\sin 2t \quad (A \text{ and } B \text{ arbitrary constants}).$$

General solution. This is the sum of the two:

$$x(t) = -\tfrac{1}{7}\cos 5t + A\cos 2t + B\sin 2t.$$

As explained in Section 19.3, the straightforward trial method for a particular solution fails if the forcing term on the right is already a complementary function, so it can be a useful tactic to look at the complementary functions first. The following example contains this feature, and is also an initial-value problem.

Example 19.14. (a) Obtain the general solution of

$$\frac{d^2x}{dt^2} + 4x = 3 + 2\cos 2t.$$

(b) Find the particular solution for which $x = 0$ and $dx/dt = 0$ when $t = 0$.

(a) Complementary functions $x_c(t)$. These are the solutions of $d^2x_c/dt^2 + 4x_c = 0$. We found them in Example 19.13: they are

$$x_c(t) = A\cos 2t + B\sin 2t \quad\text{with } A \text{ and } B \text{ arbitrary}.$$

Particular solution $x_p(t)$. There are two terms on the right, so find a particular solution for each term separately, and add them.
For a solution, $x_{p1}(t)$ say, of $d^2x_{p1}/dt^2 + 4x_{p1} = 3$, we can obviously take

$$x_{p1}(t) = \tfrac{3}{4}.$$

Corresponding to the other term, we need a solution, $x_{p2}(t)$ say, of $\dfrac{d^2x_{p2}}{dt^2} + 4x_{p2} = 2\cos 2t$. We should normally expect a solution of the form $p\cos 2t + q\sin 2t$. However, looking at the complementary functions we found, this function is already a complementary function; so we have the exceptional case (19.6), which gives

$$x_{p2}(t) = \tfrac{1}{2}\,t\sin 2t.$$

Therefore a particular solution of the original equation is

$$x_p(t) = x_{p1}(t) + x_{p2}(t) = \tfrac{3}{4} + \tfrac{1}{2}\,t\sin 2t.$$

General solution.

$$x(t) = A\cos 2t + B\sin 2t + \tfrac{3}{4} + \tfrac{1}{2}\,t\sin 2t.$$

(b) Initial-value problem. We require also $dx/dt$:

$$\frac{dx}{dt} = -2A\sin 2t + 2B\cos 2t + \tfrac{1}{2}\sin 2t + t\cos 2t.$$

The initial conditions prescribe $x(0) = 0$, or

$$A + \tfrac{3}{4} = 0,$$

and $x'(0) = 0$, or

$$2B = 0.$$

Therefore $A = -\tfrac{3}{4}$ and $B = 0$, so the required particular solution is

$$x(t) = -\tfrac{3}{4}\cos 2t + \tfrac{3}{4} + \tfrac{1}{2}\,t\sin 2t.$$
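Part (b) of Example 19.14 follows a pattern that works for any forced equation of this kind: once a particular solution is known, the initial conditions are applied to the general solution, and the constants A and B absorb whatever the particular solution does not already supply at the starting time. A short Python sketch for this example follows; the function names are assumptions introduced for illustration.

```python
import math

def x_p(t):
    """Particular solution of x'' + 4x = 3 + 2 cos 2t found in Example 19.14."""
    return 0.75 + 0.5 * t * math.sin(2 * t)

def dx_p(t):
    """Derivative of the particular solution."""
    return 0.5 * math.sin(2 * t) + t * math.cos(2 * t)

def fit_constants(x0, v0):
    """A, B in x = A cos 2t + B sin 2t + x_p(t) with x(0) = x0 and x'(0) = v0."""
    A = x0 - x_p(0.0)            # from x(0)  = A + x_p(0)
    B = (v0 - dx_p(0.0)) / 2.0   # from x'(0) = 2B + x_p'(0)
    return A, B

A, B = fit_constants(0.0, 0.0)

def x(t):
    """Solution of the initial-value problem."""
    return A * math.cos(2 * t) + B * math.sin(2 * t) + x_p(t)

print(A, B)
```

This reproduces A = -3/4 and B = 0, and x(0) = 0 as required.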

19.5 First-order linear equations with a variable coefficient

So far, the coefficient $c$ in the equation $dx/dt + cx = f(t)$ has been a constant. We shall now suppose $c$ to be variable; call it $g(t)$:

$$\frac{dx}{dt} + g(t)x = f(t). \qquad (19.13)$$

The equation is of linear type (no squares, products, etc. between terms involving $x$ are present), and the idea of obtaining a general solution by adding complementary functions to any particular solution still holds good. However, it is nearly impossible to guess suitable trial functions so we need a new approach to finding solutions.
If we could express the left-hand side $dx/dt + g(t)x$ of the equation as

$$\frac{d}{dt}(\text{something}),$$

then the equation would be easy to solve. This cannot be done, but we can instead do the next best thing. This is to obtain a certain function $I(t)$, called an integrating factor, such that

$$I(t)\left(\frac{dx}{dt} + g(t)x\right) = \frac{d}{dt}[I(t)x] \qquad (19.14)$$

identically (that is, for every function $x(t)$ and for all values of $t$).
The following example shows the meaning of this idea and the way it is used.

Example 19.15. (a) Show that $I(t) = e^{t}$ is an integrating factor for the expression $dx/dt + x$. (b) Use it to find the general solution of the equation $dx/dt + x = e^{2t}$.

(a) We have to confirm (19.14), that

$$e^{t}\left(\frac{dx}{dt} + x\right) = \frac{d}{dt}(e^{t}x). \qquad \text{(i)}$$

Work from the right-hand side of (i). Differentiate the product $e^{t}x$:

$$\frac{d}{dt}(e^{t}x) = e^{t}\frac{dx}{dt} + e^{t}x = e^{t}\left(\frac{dx}{dt} + x\right),$$

which is the same as the left-hand side of (i), so $e^{t}$ is an integrating factor.
(b) Multiply both sides of the differential equation by $e^{t}$:

$$e^{t}\left(\frac{dx}{dt} + x\right) = e^{t}e^{2t} = e^{3t}.$$

Because of the result in (a), we can write this as

$$\frac{d}{dt}(e^{t}x) = e^{3t}.$$

Therefore

$$e^{t}x = \int e^{3t}\,dt = \tfrac{1}{3}e^{3t} + A \quad (A \text{ arbitrary}),$$

or

$$x = \tfrac{1}{3}e^{2t} + A e^{-t}.$$

To find a general expression for an integrating factor, refer back to the definition (19.14); the integrating factor $I(t)$ is chosen so that

$$I(t)\left(\frac{dx}{dt} + g(t)x\right) = \frac{d}{dt}[I(t)x].$$

This is the same as

$$I(t)\frac{dx}{dt} + I(t)g(t)x = I(t)\frac{dx}{dt} + x\frac{dI(t)}{dt},$$

or

$$I(t)g(t) = \frac{dI(t)}{dt}$$

(after cancelling $I(t)\,dx/dt$, and dividing through by $x$). This can be written

$$\frac{1}{I(t)}\frac{dI(t)}{dt} = g(t),$$

or

$$\frac{d\ln I(t)}{dt} = g(t).$$

Therefore

$$\ln I(t) = \int g(t)\,dt,$$

or

$$I(t) = e^{\int g(t)\,dt}.$$

(In the case of Example 19.15, we had $g(t) = 1$, and the present formula gives $I(t) = e^{\int dt} = e^{t + C}$; the choice $C = 0$ gives the integrating factor suggested - any other choice would do.)

Integrating factor for the equation

$$\frac{dx}{dt} + g(t)x = f(t).$$

Put $I(t) = e^{\int g(t)\,dt}$;

then $I(t)\left(\dfrac{dx}{dt} + g(t)x\right) = \dfrac{d}{dt}[I(t)x]$.
(19.15)

Solution of $\dfrac{dx}{dt} + g(t)x = f(t)$.

Multiply both sides by $I(t)$ (see (19.15)): the equation becomes

$$\frac{d}{dt}[I(t)x(t)] = I(t)f(t);$$

then $I(t)x(t) = \int I(t)f(t)\,dt + C$, giving $x(t)$.
(19.16)

Example 19.16. Find the general solution of $\dfrac{dx}{dt} - \dfrac{1}{t}x = t^3$, for $t > 0$.

Here $g(t) = -1/t$. Then $\int g(t)\,dt = C - \ln t$ (we need only consider $t$ positive), so that

$$I(t) = e^{-\ln t} = t^{-1},$$

where we have chosen $C = 0$ for convenience. Multiply both sides by $I(t) = t^{-1}$:

$$t^{-1}\left(\frac{dx}{dt} - \frac{1}{t}x\right) = t^2.$$

By (19.15), this can be written

$$\frac{d}{dt}(t^{-1}x) = t^2.$$

Therefore

$$t^{-1}x = \int t^2\,dt = \tfrac{1}{3}t^3 + C,$$

so that

$$x(t) = \tfrac{1}{3}t^4 + Ct.$$

The solution obviously falls into the shape

(particular solution) + (complementary function).

We should have gained nothing by considering negative $t$, or by adding an arbitrary constant, when working out $\int \dfrac{1}{t}\,dt$: we only need any integrating factor, not all possible ones.
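When the integrals in (19.16) cannot be done by hand, the same recipe can be carried out numerically. The Python sketch below is one possible arrangement, not a method given in the text: it normalizes the integrating factor so that I(t0) = 1, accumulates both integrals with the trapezoidal rule, and is checked against the closed form of Example 19.16. The function name, the starting data x(1) = 1 and the number of steps are assumptions made for illustration.

```python
import math

def solve_linear_first_order(g, f, t0, x0, t, n=2000):
    """Approximate x(t) for x' + g(t) x = f(t), x(t0) = x0, using
    I(s) = exp(integral of g from t0 to s) and
    x(t) = (x0 + integral of I f from t0 to t) / I(t),
    with both integrals accumulated by the trapezoidal rule."""
    h = (t - t0) / n
    G = 0.0                 # running integral of g, so that I = exp(G)
    J = 0.0                 # running integral of I * f
    prev_g = g(t0)
    prev_If = f(t0)         # I(t0) = 1
    for i in range(1, n + 1):
        s = t0 + i * h
        gs = g(s)
        G += 0.5 * h * (prev_g + gs)
        If = math.exp(G) * f(s)
        J += 0.5 * h * (prev_If + If)
        prev_g, prev_If = gs, If
    return (x0 + J) / math.exp(G)

# Example 19.16: x' - x/t = t^3 with the assumed starting value x(1) = 1.
approx = solve_linear_first_order(lambda t: -1.0 / t, lambda t: t ** 3, 1.0, 1.0, 2.0)

# Closed form x = t^4/3 + C t, where x(1) = 1 gives C = 2/3.
exact = 2.0 ** 4 / 3 + (2.0 / 3) * 2.0
print(approx, exact)
```

The two values agree closely; the small difference is the error of the trapezoidal rule and decreases as n is increased.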

Notice particularly that, in the examples, we did not need to calculate or check the truth of a statement like

$$\frac{d}{dt}(t^{-1}x) = t^{-1}\left(\frac{dx}{dt} - \frac{1}{t}x\right).$$

We already know that $t^{-1}$ is an integrating factor, and this is the very property that an integrating factor is designed to possess.
Be prepared to recognize this type of equation in disguised form, or when different letters are involved; for example,

$$\frac{dy}{dx} = \frac{x + y}{x + 1}$$

is the same as

$$\frac{dy}{dx} - \frac{1}{x + 1}\,y = \frac{x}{x + 1}.$$

Problems

19.1. Find a particular solution of each of the following equations by trial as in Section 19.1.
(a) $x' + x = 3e^{2t}$; (b) $x' - 3x = t^3 + 1$;
(c) $2x' + 3x = t + 3e^{t}$; (d) $x'' + x = 3e^{2t}$;
(e) $x'' - [\ldots]x = 2e^{t} + 3e^{-t}$; (f) $x'' - 2x' + x = 3$;
(g) $x'' + 4x' - x = 3t^2 - t$;
(h) $x'' - x = 2\cos t$; (i) $2x'' + 3x = 2\sin 3t$;
(j) $2x'' + x' = \sin t - \cos t$;
(k) $x'' + 2x' + x = \cos 2t$;
(l) $\dfrac{d^2y}{dx^2} - y = 1 - 3e^{2x}$;
(m) $\dfrac{d^2y}{dx^2}\ [\ldots]\ \dfrac{dy}{dx} + 2y = 3\sin 2x$.

19.2. Use the method of Section 19.2 to find a particular solution of the following.
(a) $x'' - x = 3\cos 2t$; (b) $x'' + x = 2\sin 3t$;
(c) $x'' + 2x' + x = 3\sin t$; (d) $x'' - x' - x = 3\cos t$;
(e) $2x'' + x' + 2x = 2\cos 2t$;
(f) $3x'' + 2x' + x = 2\sin 2t$;
(g) $x'' - 4x = e^{-t}\cos t$ (note: $e^{-t}\cos t = \mathrm{Re}\,e^{(-1+j)t}$);
(h) $x'' - 4x = 3e^{t}\sin 2t$ (note: $e^{t}\sin 2t = \mathrm{Im}\,e^{(1+2j)t}$);
(i) Show that a solution of $x'' + x' + 4x = 5\cos 3t$ is $(5/\sqrt{34})\cos(3t + \phi)$, where $\phi = \arctan [\ldots]$.

19.3. The following differential equations are examples of the exceptional cases treated in Section 19.3. Find a particular solution in each case.
(a) $x'' + x = 3\cos t$; (b) $x'' + 4x = 3\sin 2t$;
(c) $x'' + 4x = 1 + 3\cos 2t$; (d) $\dfrac{d^2y}{dx^2} + 9y = 2\sin 3x$;
(e) $\dfrac{d^2y}{dx^2} - 2\dfrac{dy}{dx} + 2y = e^{x}\cos x$.

19.4. The following are exceptional cases of types not described in Section 19.3. Find a particular solution for each.
(a) $x'' - x = e^{t}$; try a solution of the form $pt\,e^{t}$.
(b) $x'' - 2x' + x = e^{t}$; try a solution of the form $pt^2 e^{t}$.

(In this case, both e' and t e' are complementary functions, 1
so the form in (a) will not work.) (j) x'-x = In f; (k) £x' — x = 1 + t;
t
(c) Consider the simple differential equation d2.x/df2 = f.
dy x + y
A first try with the form pt + q suggested by (19.1) does (/) — =-(m) x' + x cos t = cos t;
not lead to a result. Try polynomials of higher degree dx x -F 1
than 1. dy 1 —y
(n) x — =-(o) (1 — f2)x' -p tx = t.
d2y dv d.x 1 —x
(d) -—— H—1 = x. The absence of a term in v causes
d.x“ dx
the second-degree trial function px2 + qx + r to fail. 19.7. Show that the general solution of
Try a third-degree of polynomial instead.
dy 1
(e) x" — 2x' + lx = e' cos f. Try f e‘(p cos t + q sin f), or — + - y = f(x)
modify the complex-number approach of Section 19.2 to d.x x
obtain a particular solution. is given by
(f) First-order equations also have exceptional cases.
1 f C
dy y(x) = - xf (x) dx H—,
Consider the equation-y = e\ (If you have read x J x
dx
as far as Section 19.5, you can also handle it by using where C is any constant. Find the solution of the
an integrating factor.) equation
dy 1
19.5. Find the general solution of the following equa¬ -F - y = In x
tions. dx x
(a) x" + 9.x = 3 e2'; (b) x" - 4x = for x > 0, for which y = 0 when x = 1.
(c) 4.x" — x = 1 + 3 cos 21;
d2_y dv 19.8. (a) Use an integrating factor to show that the
(d) —-- + 2 — + 2y = 3;
dx d.x general solution of
(e) x" — 2x' + 2x = 3 sin 21; dy
(f) 4x" — 2.x' — 2x = 312. — + V = /(*)
dx
(g) x" -p x' = 2 — 3 e~' cos f;
(h) 2x" T x' — x — + 3e is

d 2y , y(x) ex/(x) dx + C e"


(i) —- -P y = 1 + 2 e3x + x2;
dx2
where C is an arbitrary constant.
d 2y d_y . (b) Show that the particular solution for which y = y0
(j) -F 2-F y = 3 cos 2x -F sin 2x;
dx2 d.x when x = 0 is given by

d 2y dy
(k) -F 4 — + 5y = e x sin x. y(x) = y0 e x + e e“/(n) du.
d.x2 dx o

19.6. Use an integrating factor (Section 19.6) to find 19.9. (Newton cooling). An object is heated or cooled
the general solution of the following equations. above or below the ambient air temperature T0. Under
(a) x' — 3.x = 0; (b) x' -F 2x = 3; certain physical assumptions, the body temperature T
(c) x' - 2tx = t; (d) x' — t_1x = t + te"1; satisfies the equation
(e) x' — t lx — t — 1; (f) tx' — 2x + 3 = 0; dT/dr = -k(T-T0),
dy 1 where k is a positive constant. Find the general solution
(g) — -|-y = sin x (you will need to use mte-
dx x -P 1 of the equation.
gration by parts to perform the integration); The body is at 100° C in an atmosphere at 40° C.
After 3 minutes, its temperature is 85° C. Find the
dv 1 . dy ; 2 value of k, and determine when the body will reach
(h) 3 — -F - y = x; (i) (x - 1) --y = (x - 1) ;
dx x dx 60° C.
Harmonic functions
and the harmonic
oscillator
Contents
20.1 Harmonic oscillations 326
20.2 Phase difference: lead and lag 328
20.3 Physical models of a differential equation 329
20.4 Free oscillations of a linear oscillator 330
20.5 Forced oscillations and transients 331
20.6 Resonance 333
20.7 Nearly linear systems 335
Problems 337

20.1 Harmonic oscillations


Consider the equation

$$\frac{d^2x}{dt^2} + \omega^2 x = 0,$$

where we assume $\omega > 0$. Its solutions (see (18.14b)) are

$$x(t) = C\cos(\omega t + \phi), \tag{20.1}$$

where $C$ and $\phi$ are any constants. We can write

$$x(t) = C\cos(\omega t + \phi) = C\cos\omega(t + \phi/\omega).$$

Therefore the graph of (20.1) is merely the graph $x = C\cos\omega t$
shifted, or translated, a distance $\phi/\omega$ along the $t$ axis; to the left if
$\phi$ is positive, to the right if $\phi$ is negative. Sine functions are included
in the collection (20.1), because $\sin\omega t = \cos(\omega t - \tfrac12\pi)$. These
functions are spoken of generally as harmonic functions.
In applications it is usual to adjust $\phi$ so that

$$C > 0 \quad\text{and}\quad -\pi < \phi \le \pi, \tag{20.2}$$

which can always be done without changing the function values
described by the expression (20.1). We say then that the function
(20.1) is in standard form.

Example 20.1. Express $-2\cos(3t - \tfrac52\pi)$ in standard form.

Note that $\cos(A + \pi) = -\cos A$. Therefore

$$-2\cos(3t - \tfrac52\pi) = 2\cos(3t - \tfrac52\pi + \pi) = 2\cos(3t - \tfrac32\pi).$$

We now have positive $C$, but $\phi$ is still out of range according to (20.2). To bring it
within range increase it by $2\pi$, which alters nothing:

$$2\cos(3t - \tfrac32\pi) = 2\cos(3t - \tfrac32\pi + 2\pi) = 2\cos(3t + \tfrac12\pi),$$

which is now in standard form.
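
The reduction to standard form (20.2) can be sketched in Python. This is a minimal illustration, not part of the text: the function name `standard_form` is an assumption, and the sketch supposes a nonzero amplitude. It uses $-\cos A = \cos(A + \pi)$ to make the amplitude positive and then shifts the phase by a multiple of $2\pi$.

```python
import math

def standard_form(C, phi):
    """Return (C, phi) with C > 0 and -pi < phi <= pi that describes
    the same function C*cos(w*t + phi). Assumes C != 0."""
    if C < 0:
        # -cos(A) = cos(A + pi): absorb the sign into the phase
        C, phi = -C, phi + math.pi
    # shift phi by a multiple of 2*pi into the half-open range (-pi, pi]
    phi = math.pi - ((math.pi - phi) % (2 * math.pi))
    return C, phi

# illustrative values (not from the text): -3 cos(w t + pi/4)
C, phi = standard_form(-3.0, 0.25 * math.pi)
print(C, phi / math.pi)
```

For these illustrative inputs the amplitude becomes 3 and the phase becomes $-\tfrac34\pi$, up to rounding.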

Fig. 20.1 $x = C\cos(\omega t + \phi)$; $C > 0$, $-\pi < \phi \le \pi$.

The features of the function $x(t) = C\cos(\omega t + \phi)$ are shown in
Fig. 20.1. Assume that the expression is in standard form (20.2). The
graph swings between $\pm C$, and $C$ is its amplitude. It is periodic (see
Section 1.6), repeating itself at intervals of length $2\pi/\omega$, which is its
minimum period. The number of complete oscillations per unit time
is the frequency (e.g. in cycles per second, or hertz), and

$$\text{Frequency} = (\text{period})^{-1} = \omega/2\pi. \tag{20.3}$$

The parameter $\omega$ is the angular frequency, often shortened merely to
‘frequency’. The parameter $\phi$ is the phase or phase angle. As
explained above, $\phi/\omega$ represents the distance that the graph
$x = C\cos\omega t$ has to be shifted to coincide with (20.1).
Frequently the independent variable represents length $x$ instead
of time $t$, as in a form such as $y = C\cos(\omega x + \phi)$. Then $2\pi/\omega$ is
called the wavelength rather than the ‘period’, and $\omega/2\pi$ the wave number
rather than the ‘frequency’.
Graphs of harmonic functions are often displayed by plotting
$x$ against the dimensionless variable

$$\tau = \omega t$$

($\tau$ is the Greek letter ‘tau’) rather than against $t$. Thus $\tau$ will be the
name of the new time-like axis, so that

$$x = C\cos(\tau + \phi).$$

The $x$, $\tau$ graph is drawn in Fig. 20.2.

Fig. 20.2 $\tau$-period $2\pi$.


The new graph has $\tau$-period $2\pi$: it repeats itself when $\tau$ increases
by $2\pi$. It is the same as the graph of $x = C\cos\tau$ displaced through
an interval in $\tau$ of length $\phi$. Expressed in terms of the period or
wavelength in Fig. 20.2, it is clear that $\phi = \pi$ represents a displacement
of half a wavelength, $\phi = \tfrac12\pi$ represents a displacement of a
quarter of a wavelength, and so on.

20.2 Phase difference: lead and lag


Suppose that two oscillations have the same angular frequency $\omega$,
but are out of step because they have a different phase:

$$x_1(t) = C_1\cos(\omega t + \phi_1), \qquad x_2(t) = C_2\cos(\omega t + \phi_2).$$

Then they are said to be out of phase by an angle $\phi_2 - \phi_1$ or
$\phi_1 - \phi_2$. More specifically, the following terminology is widely used
in science and engineering applications:

Phase difference: lead and lag

$x(t) = c_1\cos(\omega t + \phi_1)$ and $y(t) = c_2\cos(\omega t + \phi_2)$
are harmonic functions in standard form with the
same circular frequency $\omega$. If $\phi_1 > \phi_2$, then $x$ is said
to lead $y$, or $y$ is said to lag $x$, by an angle $\phi_1 - \phi_2$. (20.4)

The reason for these terms is illustrated in Example 20.2.

Example 20.2. If a voltage $v = v_0\cos\omega t$ is applied to a coil having
self-inductance $L$, the resulting current is $i = \dfrac{v_0}{\omega L}\cos(\omega t - \tfrac12\pi)$, so that $v$ leads $i$, or $i$ lags
$v$, by $\tfrac12\pi$ (or 90°). Illustrate the sense of the terms graphically for this case.

The curves in Fig. 20.3 are plotted against the variable $\tau = \omega t$ and represent

$$v = v_0\cos\tau \quad\text{and}\quad i = \frac{v_0}{\omega L}\cos(\tau - \tfrac12\pi).$$

Choose one of the curves, say the $v$ curve, and select a prominent feature, say
the maximum at B. Now search an interval within $\pm\pi$ of B (that is to say,
within half a period on either side of B) for the corresponding feature of $i$. This
is the maximum of $i$ at A.
Now, as we move from left to right (time increasing) through the interval,
B appears before A; that is to say, at a quarter period ($\tfrac12\pi$) earlier than A. This
will be true for any feature of $v$ within its own symmetrical corresponding interval
$\pm\pi$. It is equivalent to saying that, when the two variables to be compared
are in standard form, the one with the greater phase leads, and the other lags,
by the phase difference (taken positively).

Fig. 20.3 The curves $v = v_0\cos\tau$ and $i = (v_0/\omega L)\cos(\tau - \tfrac12\pi)$, of period $2\pi$.
The period inspected is symmetrical about the chosen feature at B.

In Example 20.2 it is essential to limit the search to the prescribed


single period. Otherwise we could argue (see Fig. 20.3) that because,
say, C appears before B, therefore i leads v, which is contrary to the
definition. Also, notice that if one entity leads another it does not
in the least imply that the first is to be taken as the cause of the
second.

Suppose that two oscillations, having the same amplitude and
frequency, differ in phase by $\pi$ so they are displaced by half a period.
If the oscillations are added together, there is total cancellation, as
shown in Fig. 20.4. The following example shows what happens
when the phase difference is less extreme.

Example 20.3. Two waves described by $C\cos\omega t$ and $C\cos(\omega t + \phi)$ are
superimposed (added). Show that the result is a harmonic wave of the same
frequency, and show how the amplitude varies as $\phi$ varies between $\pm\pi$.

From Appendix B the sum can be written

$$C[\cos\omega t + \cos(\omega t + \phi)] = 2C\cos\tfrac12\phi\,\cos(\omega t + \tfrac12\phi).$$

This is a harmonic oscillation with angular frequency $\omega$, phase $\tfrac12\phi$, and amplitude
$2C\cos\tfrac12\phi$. As $\phi$ goes from $-\pi$ through zero to $\pi$, the amplitude goes from zero
(cancellation) through the value $2C$ and back to zero.

Fig. 20.4 The curves $x = C\cos(\omega t + \phi)$ and $x = C\cos(\omega t + \phi \pm \pi)$.

This type of superposition is of importance in describing interference
and diffraction phenomena. If the amplitudes of the components
are not the same, a similar calculation applies (see Problem
20.4 at the end of this chapter).
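
The amplitude and phase of the superposed wave can be checked numerically. The sketch below is an illustration only: it does not use the Appendix B identity but instead adds the complex amplitudes of the two waves (the phasor idea introduced in Chapter 21), and the function name `superpose` is an assumption.

```python
import cmath
import math

def superpose(C, phi):
    """Amplitude and phase of C*cos(w*t) + C*cos(w*t + phi),
    found by adding the complex amplitudes C and C*e^{j*phi}."""
    total = C + C * cmath.exp(1j * phi)
    return abs(total), cmath.phase(total)

# compare with 2C cos(phi/2) and phi/2 for a sample phase difference
C, phi = 1.5, 1.0
amplitude, phase = superpose(C, phi)
print(amplitude, 2 * C * math.cos(phi / 2))
print(phase, phi / 2)
```

At $\phi = \pm\pi$ the computed amplitude is zero to within rounding error, which is the total cancellation of Fig. 20.4.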

20.3 Physical models of a differential equation


Figure 20.5a shows a piston of mass $m$ running in a cylinder,
controlled by a spring which obeys Hooke’s law (a linear spring)
and has stiffness $s$, acted on by an external force $F(t)$. The displacement
of the piston from its equilibrium position is $x(t)$. Assume also
that there is a frictional resistance proportional to the velocity:

$$(\text{frictional resistance}) = K\frac{dx}{dt}, \qquad (K > 0).$$

The equation of motion, force equals mass times acceleration,
becomes

$$F(t) - K\frac{dx}{dt} - sx = m\frac{d^2x}{dt^2},$$

or

$$\frac{d^2x}{dt^2} + \frac{K}{m}\frac{dx}{dt} + \frac{s}{m}x = \frac{F(t)}{m}. \tag{20.5}$$

This equation is of the type discussed in Chapter 19:

$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t), \tag{20.6}$$

in which

$$b = \frac{K}{m}, \qquad c = \frac{s}{m}, \qquad\text{and}\qquad f(t) = \frac{F(t)}{m}.$$

Fig. 20.5 (a) Mass-spring system. The arrows indicate the actual direction of the forces
when $F(t)$, $sx$, and $K\,dx/dt$ take positive values. (b) Schematic representation: the spring and
the frictional element must be in parallel.

Figure 20.6 represents an LCR circuit driven by a voltage source
of zero impedance, $V(t)$. If $Q$ is the charge on the capacitor, then

$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = V(t),$$

or

$$\frac{d^2Q}{dt^2} + \frac{R}{L}\frac{dQ}{dt} + \frac{1}{LC}Q = \frac{1}{L}V(t). \tag{20.7}$$

Again, this is an equation of the type (20.6), with

$$x = Q, \qquad b = \frac{R}{L}, \qquad c = \frac{1}{LC}, \qquad f(t) = \frac{1}{L}V(t).$$

These two physical systems serve as models of the differential
equation (20.6). They are also models of each other, for by choosing
the same values of $b$ and $c$ and the same forcing term $f(t)$ the
circuit would serve as a precise analogue of the piston and mimic
its behaviour exactly. A vast number of systems share the governing
equation (20.6), at least approximately. Such a system is called a
linear oscillator.

20.4 Free oscillations of a linear oscillator


Suppose that in the piston system there is no external force acting,
so that $F(t) = 0$ for all $t$. We shall choose a conventional notation
that simplifies the algebra a little. Equation (20.6) will be written

$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = 0 \tag{20.8}$$

(in which we have put $K/m = 2k$, $s/m = \omega_0^2$, $F(t) = 0$). This equation
describes the free oscillations of the mass-spring system.
The parameter $k$ is a measure of the amount of friction in the
system. We shall consider the case when $k$ is ‘small’. This is not very
meaningful because $k$ is not dimensionless, so we could change our
units so as to make it as large as we wished. The only thing that
makes sense is to compare it with another parameter having the
same dimensions. We specify that

$$k^2 < \omega_0^2. \tag{20.9}$$

We have already worked out this problem (see (18.15)). The
solutions of (20.8) subject to (20.9) are given by

$$x(t) = C\,e^{-kt}\cos[(\omega_0^2 - k^2)^{1/2}t + \phi], \tag{20.10}$$

where $C$ and $\phi$ are arbitrary. These are called the free oscillations
or natural oscillations of the system represented by the equation.
If the friction, or so-called damping, is zero then $k = 0$ and the
equation for the free oscillations becomes

$$\frac{d^2x}{dt^2} + \omega_0^2 x = 0 \tag{20.11}$$

with solutions

$$x(t) = C\cos(\omega_0 t + \phi), \tag{20.12}$$

which are harmonic functions with circular frequency $\omega_0$.



The friction, or damping, changes (20.12) into (20.10). The
frequency is changed from $\omega_0$ to $(\omega_0^2 - k^2)^{1/2}$, which is a small change
if $k$ is small, and the regular oscillations of (20.12) are caused to die
away through the factor $e^{-kt}$ in (20.10). The general effect is shown
in Fig. 20.7. We say that the oscillation decays exponentially
down to zero, when all the initial energy is used up on friction.
If $k^2 > \omega_0^2$, then there is a comparatively large amount of friction,
and the form of the solution is different from (20.10) (see Fig. 20.7b).
There are no oscillations; the $x(t)$ curve dies away without crossing
the $t$ axis more than once, as in a dead-beat electrical instrument or
shock absorber.

20.5 Forced oscillations and transients


Return to the equation (20.5) for the mass-spring system with a
nonzero external force $F(t)$ acting. As before, put $K/m = 2k$, $s/m = \omega_0^2$,
and $F(t)/m = f(t)$, so that we get

$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = f(t),$$

which is a forced equation of the type considered in Chapter 19.
We shall consider only the case when $f(t) = K\cos\omega t$:

$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = K\cos\omega t, \tag{20.13}$$

and suppose as before that the friction (or resistance) is ‘small’:
$k^2 < \omega_0^2$.

Fig. 20.7

The mass in the piston system is now subject to competing
stimuli. Left to itself it would oscillate as in (20.10) with circular
frequency $(\omega_0^2 - k^2)^{1/2}$ and finally come to rest. However, the forcing
term is trying to make it oscillate with a different circular frequency
$\omega$. The result is described by the general solution of (20.13). This is
equal to the sum of a particular solution (already worked out in
(19.4)) and the complementary functions, which are the free oscillations
given in (18.15):

General solution of the forced linear oscillator equation ($k^2 < \omega_0^2$)

$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = K\cos\omega t.$$

$$x(t) = \frac{K}{[(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2]^{1/2}}\cos(\omega t + \Phi) + C\,e^{-kt}\cos[(\omega_0^2 - k^2)^{1/2}t + \phi], \tag{20.14}$$

in which $\Phi$ is the polar angle of the point
$(\omega_0^2 - \omega^2, -2k\omega)$, and $C$ and $\phi$ are arbitrary.

The structure of (20.14) is very important: the general features are
summarized in (20.15) below.

Forced oscillations of a linear oscillator

(A) The forced oscillation (first term of (20.14)) coexists
with a free oscillation (second term). The free
oscillation proceeds as if no forcing term were present.
(B) The term representing the forced oscillation is
harmonic, with the same frequency as the forcing
term, but a different phase and amplitude. The term
is invariable; initial conditions can have no effect on
it since it contains no adjustable constants.
(C) The free oscillation term adjusts to any initial
conditions by means of the constants $C$ and $\phi$.
(D) If $k$ is positive, that is to say if there is any friction
(or resistance in the case of a circuit), the free oscillations
die away to zero due to the factor $e^{-kt}$. Therefore
all solutions ultimately settle into the same steady
oscillation, independently of the initial conditions. (20.15)

On account of (D) the free oscillation is called a transient
oscillation, and may show itself, for example, by a brief irregularity
in the voltage or current upon switching an electrical apparatus.

Example 20.4. The circuit shown is initially quiescent and uncharged. Find
the charge $Q(t)$ on the capacitor after switching the circuit on.

We shall rework the problem from first principles. The equation is

$$10^{-3}\frac{d^2Q}{dt^2} + 8\times10^{-3}\frac{dQ}{dt} + 10\,Q = 2\cos 90t,$$

or

$$\frac{d^2Q}{dt^2} + 8\frac{dQ}{dt} + 10^4 Q = 2\times10^3\cos 90t.$$

Complementary functions $Q_c$ (natural oscillation). The characteristic equation is
$m^2 + 8m + 10^4 = 0$, so that $m = -4 \pm 99.92\mathrm{j}$ and the complementary functions
are $Q_c = B\,e^{-4t}\cos(99.92t + \phi)$, where $B$ and $\phi$ are arbitrary.

Particular solution $Q_p$ (forced oscillation). Look for a solution to the corresponding
complex equation

$$\frac{d^2X}{dt^2} + 8\frac{dX}{dt} + 10^4 X = 2\times10^3\,e^{90\mathrm{j}t},$$

and take its real part. By trying a solution of the form $X(t) = P\,e^{90\mathrm{j}t}$ we obtain
$P = 0.9205 - 0.3488\mathrm{j}$. In polar coordinates this becomes $P = 0.9843\,e^{-0.3622\mathrm{j}}$.
The corresponding complex solution, in polar coordinates, is

$$X(t) = 0.9843\,e^{(90t - 0.3622)\mathrm{j}}.$$

Therefore the particular solution is

$$Q_p(t) = \operatorname{Re} X(t) = 0.9843\cos(90t - 0.362).$$

The general solution. This is

$$Q(t) = 0.984\cos(90t - 0.362) + B\,e^{-4t}\cos(99.92t + \phi).$$

Fig. 20.9 (a) Forced oscillation, $Q_p = 0.984\cos(90t - 0.362)$.
(b) Transient, $Q_c = 0.985\,e^{-4t}\cos(99.92t + 2.777)$.
(c) Total oscillation, $Q = Q_p + Q_c$. The horizontal axis is $90t$.

Initial conditions. At $t = 0$, $Q$ and $dQ/dt$ are zero. After obtaining $dQ/dt$ and
substituting $t = 0$ into $Q$ and $dQ/dt$, we obtain the equations

$$B\cos\phi = -0.9204,$$
$$4B\cos\phi + 99.92B\sin\phi = 31.389.$$

The solution is $B = 0.9851$, $\phi = 2.777$, so $Q(t)$ is given by

$$Q(t) = 0.984\cos(90t - 0.362) + 0.985\,e^{-4t}\cos(99.92t + 2.777).$$

Figure 20.9 shows the individual contributions of the two terms.
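
The arithmetic of Example 20.4 can be reproduced with Python's complex numbers. This is a sketch under the assumptions of the example (the equation $Q'' + 8Q' + 10^4Q = 2\times10^3\cos 90t$ with $Q(0) = Q'(0) = 0$); the function and variable names are illustrative and not part of the text.

```python
import cmath
import math

def forced_and_transient(two_k, w0_sq, K, w):
    """For x'' + two_k x' + w0_sq x = K cos(w t), x(0) = x'(0) = 0, return
    (P, w_free, B, phi) where the solution is
    Re(P e^{jwt}) + B e^{-kt} cos(w_free t + phi), with k = two_k / 2."""
    k = two_k / 2
    # forced oscillation: substitute X = P e^{jwt} into the complex equation
    P = K / (w0_sq - w**2 + 1j * two_k * w)
    # frequency of the free (damped) oscillation; requires k^2 < w0_sq
    w_free = math.sqrt(w0_sq - k**2)
    # x(0) = 0:   Re P + B cos(phi) = 0
    b_cos = -P.real
    # x'(0) = 0:  -w Im P - k B cos(phi) - w_free B sin(phi) = 0
    b_sin = (-w * P.imag - k * b_cos) / w_free
    return P, w_free, math.hypot(b_cos, b_sin), math.atan2(b_sin, b_cos)

P, w_free, B, phi = forced_and_transient(8.0, 1.0e4, 2.0e3, 90.0)
print(abs(P), cmath.phase(P))   # amplitude and phase of the forced term
print(w_free, B, phi)           # transient frequency and constants
```

The printed values should agree with the rounded figures quoted in the example.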

20.6 Resonance
Return to equation (20.14) for the linear oscillator and its solutions
and examine the forced oscillation, which is all that is left after the
transient has died away. Its amplitude, $A$ say, is given by

$$A = \frac{K}{[(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2]^{1/2}}.$$

Different values for the forcing frequency $\omega$ will produce different
amplitudes; some values of $\omega$ will be more effective than others in
generating a large amplitude.
Regard $\omega_0$ and $k$ as representing the fixed characteristics of some
kind of system, and consider an experiment in which we try to excite
it with a controllable input $K\cos\omega t$, keeping $K$ constant but trying
various values of $\omega$. The amplitude $A$ will be greatest when
$(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2 = g(\omega)$, say, is a minimum with respect to the
variable $\omega$. It is found by solving $dg/d\omega = 0$ (see Problem 20.15)
that the minimum occurs when

$$\omega^2 = \omega_0^2 - 2k^2, \qquad (k^2 < \tfrac12\omega_0^2).$$

When $\omega^2$ takes this value the amplitude $A$ will take its greatest
possible value for the given $K$ and $\omega_0$, given by

$$A = \frac{K}{2k(\omega_0^2 - k^2)^{1/2}}.$$

Figure 20.10 shows schematically how the amplitude $A$, and also
the phase $\Phi$ in (20.14), vary with forcing frequency $\omega$. Different
curves are obtained according to the amount of friction or damping
(or resistance in the case of a circuit) in the system, measured by
the size of $k$; as the damping decreases, the maximum increases.
When the condition for a maximum is satisfied, the system is said
to be in a state of resonance.

Resonating system

$$\frac{d^2x}{dt^2} + 2k\frac{dx}{dt} + \omega_0^2 x = K\cos\omega t.$$

Forced amplitude $A = \dfrac{K}{[(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2]^{1/2}}$.

Resonant frequency $\omega^2 = \omega_0^2 - 2k^2$.

Resonance amplitude $\dfrac{K}{2k(\omega_0^2 - k^2)^{1/2}}$. (20.16)
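
The resonance formulas in (20.16) can be checked numerically. The sketch below is illustrative only; the function names are assumptions, and the sample parameter values ($K = 10$, $k = 0.5$, $\omega_0 = 6$) are those of the equation $x'' + x' + 36x = 10\cos\omega t$.

```python
import math

def forced_amplitude(K, k, w0, w):
    """Amplitude of the forced oscillation of x'' + 2k x' + w0^2 x = K cos(w t)."""
    return K / math.sqrt((w0**2 - w**2)**2 + 4 * k**2 * w**2)

def resonance(K, k, w0):
    """Resonant angular frequency and resonance amplitude; needs 2k^2 < w0^2."""
    if 2 * k**2 >= w0**2:
        raise ValueError("no resonant maximum unless k^2 < w0^2 / 2")
    w_res = math.sqrt(w0**2 - 2 * k**2)
    a_res = K / (2 * k * math.sqrt(w0**2 - k**2))
    return w_res, a_res

w_res, a_res = resonance(10.0, 0.5, 6.0)
print(w_res, a_res)
# the general amplitude formula, evaluated at w_res, gives the same value
print(forced_amplitude(10.0, 0.5, 6.0, w_res))
```

Evaluating `forced_amplitude` slightly above or below `w_res` gives a smaller value, as expected at a maximum.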

A physical feeling for the buildup of a large amplitude can be
obtained by thinking of a child being pushed on a swing by two
people, one on either side of the swing. The method is to push the
swing the way it wants to go, and not to work against it. This is
best done by pushing it, forward and backward alternately, when it
is at the bottom of its path. The driving frequency is then the same as
the natural frequency of the swing. The driving cycle is a quarter
of a period out of phase with the swing’s cycle, because the force is a
maximum when the displacement is a minimum. In terms of (20.16),
$k$ is assumed small, so that $\omega^2 = \omega_0^2$ very nearly, and the phase
difference $|\Phi|$ is nearly $\tfrac12\pi$, or a quarter of a period, the forcing term
leading the response by this amount.

Suppose next that there is zero friction;

$$k = 0,$$

so that

$$\frac{d^2x}{dt^2} + \omega_0^2 x = K\cos\omega t, \tag{20.17}$$

and from (20.16) the forced amplitude $A$ is

$$A = \frac{K}{|\omega_0^2 - \omega^2|}. \tag{20.18}$$

The natural frequency of this system is exactly $\omega_0$. When $\omega$ (the
forcing frequency) gets close to $\omega_0$, the amplitude $A$ can become very
large, approaching infinity as $\omega$ approaches $\omega_0$: see Fig. 20.10(a).
When $\omega = \omega_0$ the equation becomes

$$\frac{d^2x}{dt^2} + \omega_0^2 x = K\cos\omega_0 t, \tag{20.19}$$

and apparently $A = \infty$. This result cannot be said to describe a
steady solution of (20.19), but must be reconcilable with (20.19) in
some way. In fact it is the ‘exceptional case’ of equation (19.6(a)),
and has a solution

$$x(t) = \frac{K}{2\omega_0}\,t\sin\omega_0 t. \tag{20.20}$$

This particular solution conveniently satisfies the initial conditions

$$x(0) = 0, \qquad x'(0) = 0, \tag{20.21}$$

that is to say, the conditions for initial quiescence. It therefore
represents a system without friction and in a state of resonance,
which starts up from rest. Its oscillations grow steadily to infinity
due to the factor $t$ in (20.20). The equation does not have any
solutions corresponding to steady forced oscillations, such as we
found earlier in systems having even a small amount of friction.
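
As a check on the resonant solution, a trial function of the form $x = a\,t\sin\omega_0 t$ can be substituted directly into the undamped equation with forcing term $K\cos\omega_0 t$. The constant $a$ is introduced here only for this verification:

$$\begin{aligned}
x &= a\,t\sin\omega_0 t, \\
x' &= a\sin\omega_0 t + a\omega_0 t\cos\omega_0 t, \\
x'' &= 2a\omega_0\cos\omega_0 t - a\omega_0^2\,t\sin\omega_0 t, \\
x'' + \omega_0^2 x &= 2a\omega_0\cos\omega_0 t .
\end{aligned}$$

Matching the right-hand side to $K\cos\omega_0 t$ gives $a = K/(2\omega_0)$, and the first two lines show that $x(0) = 0$ and $x'(0) = 0$.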

20.7 Nearly linear systems


Consider the pendulum of Fig. 20.11. It consists of a weightless rod
of length $l$, pivoted at the top and carrying a point mass $m$ at the
lower end. It makes an angle $\theta(t)$ with the vertical. The equation of
motion is

$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\sin\theta = 0, \tag{20.22}$$

where $g$ is the acceleration due to gravity.

Fig. 20.11

This equation is nonlinear since $\sin\theta$ is not of the form $a\theta + b$,
so the methods of Chapter 19 do not apply to it. However, the
Taylor series for $\sin\theta$ begins:

$$\sin\theta = \theta - \tfrac16\theta^3 + \cdots,$$

($\theta$ in radians), so provided that $\theta$ remains small enough we can
approximate $\sin\theta$ by

$$\sin\theta \approx \theta.$$

The error is about 10% when $\theta = 45°$, and 0.1% at 5°. Put this into
(20.22); we obtain the approximate linearized equation

$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\theta = 0. \tag{20.23}$$

The general solution is $\theta(t) = C\cos[(g/l)^{1/2}t + \phi]$. The values of $C$
and $\phi$ will depend on how the pendulum was set going; the initial conditions
amount to prescribing the position and angular velocity at $t = 0$.
However, $C$ must be small for the approximation to be justified.
Exactly linear equations are uncommon. Most frequently they
occur as the result of a simplifying approximation such as we carried
out for the pendulum. Usually some function in the equation is
linearized at the expense of a restriction on the dependent variable.
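
The size of the error in the small-angle approximation is easy to check numerically. The following sketch is illustrative only; the function name is an assumption, and the error is measured relative to the true value of $\sin\theta$.

```python
import math

def small_angle_relative_error(theta_degrees):
    """Relative error made by replacing sin(theta) with theta (in radians)."""
    theta = math.radians(theta_degrees)
    return (theta - math.sin(theta)) / math.sin(theta)

for angle in (45.0, 5.0):
    # print the error as a percentage
    print(angle, 100 * small_angle_relative_error(angle))
```

The printed percentages are of the order of 10% at 45° and 0.1% at 5°, in line with the rough figures quoted for the pendulum.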

Example 20.5. A mass $m$ is fixed at the midpoint of a piece of elastic having
natural length $l$ and stiffness $s$. The elastic is stretched between two points a
distance $L > l$ apart. Find the period of small lateral vibrations.

(See Fig. 20.12.) The extension $e$ of the branch AC is $AC - \tfrac12 l$, so the tension $T$
in either branch is

$$T = se = s(AC - \tfrac12 l).$$

The total restoring force $F$ is $2T\sin\theta$, and

$$\sin\theta = NC/AC = x/AC,$$

so

$$F = 2s(AC - \tfrac12 l)\,x/AC = 2s(1 - l/2AC)\,x.$$

The equation of motion is

$$m\frac{d^2x}{dt^2} = -F,$$

so we must put $AC$ in terms of $x$. Now

$$AC = (\tfrac14 L^2 + x^2)^{1/2},$$

so the equation will be nonlinear. However, if the oscillations which we expect
are of small amplitude compared with $L$, we can put

$$AC \approx \tfrac12 L$$

with an error of something like $2x^2/L^2$, and the approximation to the restoring
force becomes

$$F \approx 2s(1 - l/L)\,x.$$

Fig. 20.12 A and B are fixed a distance $L$ apart where $L > l$;
the mass $m$ at C is displaced from equilibrium at N by a distance $x(t)$.

The equation of motion becomes approximately

$$m\frac{d^2x}{dt^2} = -2s(1 - l/L)\,x,$$

or

$$\frac{d^2x}{dt^2} + \frac{2s}{m}(1 - l/L)\,x = 0.$$

This is the linearized equation, good for small amplitudes. It has solutions

$$x(t) = C\cos\left\{\left[\frac{2s}{m}(1 - l/L)\right]^{1/2} t + \phi\right\},$$

where $C$ and $\phi$ are arbitrary. The approximate period is

$$2\pi\left[\frac{m}{2s(1 - l/L)}\right]^{1/2}.$$

It is interesting to consider the case when the string is unstretched in the
equilibrium position, so that $L = l$. The resulting equation cannot be
straightforwardly linearized.
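
The period of the small lateral vibrations follows from the linearized equation $x'' + (2s/m)(1 - l/L)\,x = 0$, whose angular frequency is $[(2s/m)(1 - l/L)]^{1/2}$. A minimal Python sketch is given below; the function name and the sample values are illustrative assumptions, not data from the text.

```python
import math

def small_vibration_period(m, s, l, L):
    """Approximate period of small lateral vibrations of a mass m at the
    midpoint of an elastic of natural length l and stiffness s, stretched
    between points a distance L > l apart (linearized equation only)."""
    if L <= l:
        raise ValueError("the linearization needs a stretched elastic, L > l")
    omega = math.sqrt((2 * s / m) * (1 - l / L))
    return 2 * math.pi / omega

# hypothetical values in SI units: 0.1 kg, 50 N/m, 1.0 m stretched to 1.25 m
print(small_vibration_period(0.1, 50.0, 1.0, 1.25))
```

As $L$ approaches $l$ the computed period grows without bound, which reflects the remark that the unstretched case cannot be linearized in this way.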

Problems

20.1. Express the following in standard amplitude-phase given by


form C cos(<of + </>), with C > 0 and — n< <f> ^ n.
(-1 )Na)Ce~kTN
(a) 3 cos(3f + §7r); (b) 3 cos(cof — 3zr); x(Tn)
(co2 + k2)*
(c) 2 sin 3r; (d) 3 sin(2f +
(e) —3 cos(2f — \n)\ (f) —4 cos(2f + ^7r);
20.6. Consider an expression of the form
(g) — sin f; (h) 3 cos 2f + 4 sin 2f;
(i) cos 2r + cos(2t — n); x(t) = e”'/r£/(t),
(j) cos(21 — §7r) — cos(2f + §7i). where T is a constant, and g(t) itself does not have
any term in it like e±fc! (for example, g(t) might be a
20.2. State whether x leads or lags y in the following constant, or cos t, or even f3, but it must not be, for
cases, and by how much. example, e~2'cos f). Then T is called the time constant
(a) x = 4 cos 3r, y = 3 cos(3f — ^n). of/(f).
(b) x = 2 cos(21 + ^7r), y = 3 cos(2; 4- §71). (a) State the time constant for Qc in Example 20.4.
(c) x — — 3 cos 2t, y = 4 cos 21. (b) Describe how T provides a measure of the rate
(d) x = cos 3r, y = sin 31. of exponential decay of x(r), rather like the half-life
(e) x = 2 cos 31, y = cos(3f — In). period of a radioactive substance.

20.3. Obtain the free oscillations of the following in the 20.7. (Heavy damping.) Find the general solution of
form C cos(ojf + 0). State (i) the natural frequency if the the equation
damping coefficient is put to zero; (ii) the frequency
x" + 2 kx' + cox = 0
that actually occurs in the cosine term of the solution;
(iii) the number of complete cycles needed for the when k2 > co2. Describe the general character of the
amplitude to drop to 0.1 of its value at t = 0. solutions, contrasting them with the case when k2 < co2.
(a) x" + 20x' + (2.5 x 105)x = 0.
(b) x" + 0.5x' + 4x = 0. 20.8. Solve the equation
(c) x" + 0.15x' + 3x = 0.
x" + 10x' + 24x = 0
(d) x" + x' + 20x = 0.
subject to the initial conditions x(0) = — 3, x'(0) = 20.
20.4. Express A cos u)t + B sin(cuf + 571) in the standard Show that the solution curve crosses the f axis only
form C cos(ojf + 0) when (a) A = 3*, B = 1; (b) A = 3*, once, at the point t = In 2.
B= —1; (c) A = -3* B=U(d)A = -3* B = -1.
20.9. (‘Critical damping’.) Find the general solution
20.5. (a) Show that the maxima and minima of x(f) = of the equation
C e~k‘ cos(cof + 4>) occur at times TN given by
x" + 2 kx' + of x = 0
k
o)Tn + /= — arctan —I- Nn, for the case when k2 = co2.
co

where N is any integer. 20.10. The following equation could represent the
(b) Show that the values of x(f) at these points are damped vertical motion of a mass supported by a spring

and subjected to an external periodic force: librium' means that

x" + x' 4- 36x = 10 cos cot, for t > 0, x(f) = constant


is a solution of the equation.)
the system being in equilibrium under no force for (b) Call the equilibrium positions x = a and x = b.
t sS 0.
To investigate the state of affairs near x = a, put
(a) Find the period of the free (damped) oscilla¬
tions. Show that any free oscillations stimulated at x = a + u
startup are reduced by a factor of about 14 after five into the equation, so as to obtain an equation for u(f),
periods of oscillation. which is the distance from a. Then do the same thing
(b) Obtain expressions in terms of co for the ampli¬ near x = b by putting
tude and phase of the forced oscillation. x = b + v,
(c) Find the condition for resonance.
(d) Plot curves of amplitude and phase against co where v is distance from b. Tidy the equations as far
for a range 4 < co < 8. as possible.
(c) Suppose that u in one case, and v in the other,
20.11. A particle rolls to and fro under gravity at the are small, and linearize the equations in each case.
bottom of a parabolic cylinder having vertical cross- (d) Show that in one case small oscillations take
section y = ax2. There is negligible friction. The equa¬ place, but that in the other the displacement tends to
tion of motion in terms of horizontal displacement x increase. (One is called a stable equilibrium state, the
is then other unstable.)

x" + 2 ax(g + 2ax,2)/(l + 4 a2x2) = 0. 20.14. A particle moves in a plane under a central
attractive force y/rx per unit mass, where r and 9 are its
Show that for small oscillations the period is 2^n/{ag)^.
plane polar coordinates relative to an origin in the
attracting body. Its equation of motion can be ex¬
20.12. A particle is balanced at the topmost point, pressed in the form
x = y = 0, of an inverted parabolic cylinder whose shape
is described by y = —ax2, y being measured vertically
upward. Its equation of motion is
where u = r_1 and H is its (constant) angular mo¬
x" -I- 2ax(2ax'2 — g)/( 1 + 4a2x2) = 0.
mentum per unit mass.
By linearizing the equation show that, if the particle Show that the equation has a constant solution u = u0,
is slightly disturbed, it starts to move away from its which is equivalent to a circular orbit. Does it stay close
initial position (0, 0) at an increasing rate. (This condi¬ to this orbit if its position u is slightly changed from u0,
tion is called unstable equilibrium.) while H keeps its original value? (Hint: put u = u0 + x
and linearize the equation for small x. You may assume
20.13. The equation for the displacement x(f) of an that for small values of x/u (see (5.4d)).
electrical circuit fixed on springs and influenced by a
current-carrying conductor is (u0 + xY~2 % uS"2(l + (a - 2)— ^

x* + 4[x - 2/(3 - x)] = 0.


20.15. Given the expression for the forced amplitude
(a) Show that there are two positions x at which the A in equation (20.16), deduce the expressions for the
circuit could theoretically be in equilibrium. ('Equi¬ resonant frequency and resonant amplitude.
Steady forced
oscillations: phasors,
impedance, transfer
functions
Contents
21.1 Phasors 339
21.2 Algebra of phasors 341
21.3 Phasor diagrams 342
21.4 Phasors and complex impedance 343
21.5 Transfer functions in the frequency domain 346
Problems 348

21.1 Phasors
We shall consider circuits driven by an applied harmonically
alternating voltage, with resistances placed so that any free oscillations
set up by switching on the circuit die away, leaving only
a periodic forced oscillation, as described in Section 20.5. It is only
this remaining, steadily-oscillating state that is discussed here.
Let $x(t)$ represent any variable in the circuit, such as the current
in a particular branch. If the frequency of the applied voltage is
$\omega/2\pi$, then all these possible variables $x(t)$ share the same frequency
$\omega/2\pi$ once the transients have died away, though in general the
phases and amplitudes of different variables are different. Here we
adopt the standardized amplitude/phase form of (20.2), assuming
that

$$x(t) = c\cos(\omega t + \phi), \quad\text{with } c > 0 \text{ and } -\pi < \phi \le \pi. \tag{21.1}$$

We can write $x(t)$ in a complex form instead:

$$x(t) = \operatorname{Re}(c\,e^{\mathrm{j}(\omega t + \phi)}) = \operatorname{Re}(c\,e^{\mathrm{j}\phi}\cdot e^{\mathrm{j}\omega t}).$$

The complex coefficient $c\,e^{\mathrm{j}\phi}$ that multiplies $e^{\mathrm{j}\omega t}$ is called the phasor
corresponding to $x(t)$, and it is independent of time $t$. Every
variable $x(t)$ will have its own phasor, but the factor $e^{\mathrm{j}\omega t}$ is the
same for each one. In a circuit, $\phi$ and $c$ usually depend on $\omega$, so the
values of the corresponding phasors will depend on $\omega$.
Corresponding to each variable denoted by a lowercase letter, we
use a bold capital letter to denote the phasor. This style is traditional,
and emphasizes that phasors, being complex numbers, can be treated
as vectors in the Argand diagram.

Phasor of a harmonic oscillation

The phasor of $x(t) = c\cos(\omega t + \phi)$ is the complex
number $\mathbf{X} = c\,e^{\mathrm{j}\phi}$. (21.2)

In engineering applications, phasors $c\,e^{\mathrm{j}\phi}$ are often written in the
form $c\angle\phi$, and $\phi$ may be expressed in degrees. Thus, if $\mathbf{X} = 3\,e^{-\frac14\pi\mathrm{j}}$,
we can write

$$\mathbf{X} = 3\angle{-\tfrac14\pi} = 3\angle{-45°}.$$

The two numbers displayed are the polar coordinates of the point
which represents the phasor on an Argand diagram, in this case the
point $(3/\sqrt2, -3/\sqrt2)$ corresponding to

$$\mathbf{X} = 3\cos(-45°) + \mathrm{j}\,3\sin(-45°) = \frac{3}{\sqrt2}(1 - \mathrm{j}).$$

It is often convenient to express a phasor in the form $a + \mathrm{j}b$ rather
than in the polar form $c\,e^{\mathrm{j}\phi}$.

Example 21.1. Find the phasor of $x(t) = -3\cos(2t + \tfrac12\pi)$.

In standard form (21.1), $x(t) = 3\cos(2t - \tfrac12\pi)$. The phasor $\mathbf{X}$ is therefore given
by $\mathbf{X} = 3\,e^{-\frac12\pi\mathrm{j}}$ or $3\angle{-90°}$.

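Example 21.2. Given that the prevailing angular frequency is $\omega = 10^4$, find
the functions $x(t)$ having the following phasors. (a) $\mathbf{X} = 1/(-1 + \mathrm{j})$, (b) $\mathbf{X} =
(1 - \sqrt3\,\mathrm{j})/(-1 + \mathrm{j})$.

(a) Put $\mathbf{X}$ into polar form:

$$\mathbf{X} = \frac{1}{-1 + \mathrm{j}} = \frac{-1 - \mathrm{j}}{(-1)^2 + 1^2} = -\tfrac12 - \tfrac12\mathrm{j} = \frac{1}{\sqrt2}\,e^{-\frac34\pi\mathrm{j}}.$$

Therefore

$$x(t) = \frac{1}{\sqrt2}\cos(10^4 t - \tfrac34\pi).$$

(b) $1 - \sqrt3\,\mathrm{j} = 2\,e^{-\frac13\pi\mathrm{j}}$ (as can be seen by putting the point $1 - \sqrt3\,\mathrm{j}$ on an
Argand diagram). Therefore, using (a),

$$\mathbf{X} = (2\,e^{-\frac13\pi\mathrm{j}})\left(\frac{1}{\sqrt2}\,e^{-\frac34\pi\mathrm{j}}\right) = \sqrt2\,e^{-\frac{13}{12}\pi\mathrm{j}}.$$

The phase ($-\tfrac{13}{12}\pi$) is out of the standard range (21.1), so add $2\pi$ to it, leaving
$\mathbf{X}$ unchanged. We obtain $\mathbf{X} = \sqrt2\,e^{\frac{11}{12}\pi\mathrm{j}}$, so

$$x(t) = \sqrt2\cos(10^4 t + \tfrac{11}{12}\pi).$$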

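Example 21.3. Let $x(t) = \sqrt{3}\cos\omega t - \sin\omega t$. Find the corresponding phasor.
Take the terms separately:
$$\sqrt{3}\cos\omega t = \mathrm{Re}\bigl(\sqrt{3}\,e^{j\omega t}\bigr);$$
$$\sin\omega t = \cos(\omega t - \tfrac{1}{2}\pi) = \mathrm{Re}\bigl(e^{-\frac{1}{2}\pi j}\,e^{j\omega t}\bigr).$$
Combining them, we obtain
$$x(t) = \mathrm{Re}\bigl[(\sqrt{3} - e^{-\frac{1}{2}\pi j})\,e^{j\omega t}\bigr].$$
Therefore
$$X = \sqrt{3} - e^{-\frac{1}{2}\pi j} = \sqrt{3} - (-j) = \sqrt{3} + j = 2\,e^{\frac{1}{6}\pi j} \quad\text{or}\quad 2\angle 30^\circ.$$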

21.2 Algebra of phasors


As seen in Example 21.3, when oscillations associated with the same
value of $\omega$ combine by addition, so do their phasors. Suppose, for
instance, that $u(t)$ and $v(t)$ have the same angular frequency $\omega$, and
that their phasors are $U$ and $V$. Then $u(t) = \mathrm{Re}(U e^{j\omega t})$ and $v(t) = \mathrm{Re}(V e^{j\omega t})$, so
$$u(t) + v(t) = \mathrm{Re}\bigl[(U + V)\,e^{j\omega t}\bigr],$$
whose phasor is $U + V$. The addition holds similarly if there are
more terms present.

Addition principle for phasors

If $u(t), v(t), \ldots$ have a common frequency, and
$z(t) = u(t) + v(t) + \cdots$, then $Z = U + V + \cdots$, where
$Z, U, V, \ldots$ are the corresponding phasors. (21.3)

Differentiation and integration give important results. If $x(t) = \mathrm{Re}(X e^{j\omega t})$,
where $X$ is the phasor, then $dx/dt = \mathrm{Re}(j\omega X e^{j\omega t})$, so that
the phasor of $dx/dt$ is $j\omega X$. Differentiate again, and a further factor
$j\omega$ is introduced, so that the phasor of $d^2x/dt^2$ is $(j\omega)^2 X$, and so on.
For $\int x(t)\,dt$, we find in the same way that the phasor is $X/j\omega$. Here
the additive arbitrary constant has been put to zero because, in
normal use, all the variables that occur oscillate.

Phasors of derivatives and integrals

Variable:   $x$                  $\dfrac{dx}{dt}$   $\dfrac{d^2x}{dt^2}$   $\int x\,dt$
Phasor:     $X = c\,e^{j\phi}$   $j\omega X$        $-\omega^2 X$          $\dfrac{1}{j\omega}X$        (21.4)
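The rule that differentiation multiplies the phasor by $j\omega$ can be checked numerically. The following is a rough sketch only: the frequency `omega`, the phasor `X`, and the sample time `t0` are arbitrary assumed values, and the derivative is approximated by a central difference.

```python
import cmath

omega = 5.0                 # assumed angular frequency
X = cmath.rect(2.0, 0.3)    # assumed phasor: amplitude 2, phase 0.3 rad

def signal(phasor, t):
    """Real oscillation Re(phasor * e^{j*omega*t})."""
    return (phasor * cmath.exp(1j * omega * t)).real

def derivative_numeric(t, dt=1e-6):
    """Central-difference estimate of dx/dt for the oscillation with phasor X."""
    return (signal(X, t + dt) - signal(X, t - dt)) / (2 * dt)

t0 = 0.7
from_phasor = signal(1j * omega * X, t0)   # oscillation whose phasor is j*omega*X
from_difference = derivative_numeric(t0)

print(from_phasor, from_difference)
```

The two printed values should agree to within the accuracy of the finite-difference estimate.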

Example 21.4. Obtain the phasor of the expressions
$$\text{(a) } L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C}, \qquad \text{(b) } L\frac{di}{dt} + \frac{1}{C}\int i\,dt;$$
in terms of the phasors $Q$ of $q(t)$ and $I$ of $i(t)$. ($L$, $R$, and $C$ are constants, and the prevailing frequency is $\omega$.)

(a) From (21.4) and the addition principle (21.3), the phasor is
$$L(j\omega)^2 Q + R(j\omega)Q + (1/C)Q = \bigl[(1/C - L\omega^2) + jR\omega\bigr]Q.$$
(b) The phasor is
$$L(j\omega)I + (1/Cj\omega)I = j(L\omega - 1/C\omega)I.$$

Example 21.5. Find the steady-state solution of
$$\frac{d^2x}{dt^2} + 8\frac{dx}{dt} + 10^4 x = 2\cdot 10^3\cos 90t.$$
This is equivalent to the circuit equation in Example 20.5, with $x(t)$ in place of
$q(t)$. The prevailing value of $\omega$ is 90. Let $X$ be the phasor of $x(t)$. The phasor of
the right-hand side is $2\cdot 10^3$, so by using (21.4) we obtain
$$[(90j)^2 + 8(90j) + 10^4]X = 2\cdot 10^3,$$
or
$$(1900 + 720j)X = 2\cdot 10^3,$$
from which $X$ can be found:
$$X = \frac{2\cdot 10^3}{1900 + 720j} = \frac{1}{0.95 + 0.36j} = \frac{1}{1.0159\,e^{0.3622j}} = 0.984\,e^{-0.362j}.$$
Therefore
$$x(t) = \mathrm{Re}\bigl[0.984\,e^{-0.362j}\,e^{90jt}\bigr] = 0.984\cos(90t - 0.362),$$
as we found in Example 20.5 for the forced oscillation.
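The arithmetic of Example 21.5 is easy to reproduce with complex numbers in Python. This is a minimal sketch; the variable names are illustrative and not taken from the text.

```python
import cmath

omega = 90.0          # prevailing angular frequency
rhs_phasor = 2.0e3    # phasor of the forcing term 2*10^3 cos 90t

# Each derivative contributes a factor j*omega, by (21.4).
coefficient = (1j * omega) ** 2 + 8 * (1j * omega) + 1.0e4

X = rhs_phasor / coefficient
amplitude, phase = abs(X), cmath.phase(X)

print("coefficient:", coefficient)
print("amplitude:", round(amplitude, 3), "phase:", round(phase, 3))
```

The steady-state solution is then `amplitude * cos(90 t + phase)`.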

21.3 Phasor diagrams


Complex numbers can be represented by vectors in an Argand
diagram (see Section 6.2), and they are added in the same way as
the corresponding vectors. Phasors are just complex numbers, so
they add like vectors too. This fact can be used to show pictorially
how a number of superposed oscillations which are not in phase
with each other contribute to the sum. The diagrams concerned are
called phasor diagrams.

Example 21.6. Let $u(t) = 2\cos 10t$, $v(t) = \cos(10t - \tfrac{1}{2}\pi)$, and $w(t) = 3\cos(10t + \tfrac{1}{4}\pi)$.
Find $p(t) = u(t) + v(t) + w(t)$ by means of a phasor diagram.
The phasors corresponding to $u$, $v$, and $w$ are $U = 2$, $V = e^{-\frac{1}{2}\pi j}$, and $W = 3\,e^{\frac{1}{4}\pi j}$.
In the polar-coordinate notation they are $U = 2\angle 0^\circ$, $V = 1\angle{-90^\circ}$, $W = 3\angle 45^\circ$.
They are shown as position vectors in Fig. 21.1a, and in Fig. 21.1b they are
strung together as usual for addition. The vector $OP$ can be measured
off from the diagram, or calculated using the dimensions shown. We have
$$|OP| = \bigl[(3/\sqrt{2} + 2)^2 + (3/\sqrt{2} - 1)^2\bigr]^{1/2} = 4.27,$$
$$\phi = \arctan\frac{3/\sqrt{2} - 1}{3/\sqrt{2} + 2} = 0.266 \text{ (radians)}.$$
Therefore $p(t) = 4.27\cos(10t + 0.266)$.

Fig. 21.1 (a) Argand diagram showing $U$, $V$, $W$. (b) The sum $U + V + W = OP$.
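The vector addition in Example 21.6 can also be carried out directly with complex arithmetic instead of a diagram. The following sketch uses only the standard library; the variable names are illustrative.

```python
import cmath
import math

# Phasors of u, v, w from Example 21.6 (amplitude, phase in radians).
U = cmath.rect(2.0, 0.0)
V = cmath.rect(1.0, -math.pi / 2)
W = cmath.rect(3.0, math.pi / 4)

P = U + V + W                       # addition principle (21.3)
amplitude = abs(P)
phase = cmath.phase(P)

print("real and imaginary parts:", P.real, P.imag)
print("amplitude:", amplitude, "phase (radians):", phase)
```

The real and imaginary parts of `P` are the horizontal and vertical dimensions of the triangle used to calculate $|OP|$ and the angle.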

21.4 Phasors and complex impedance


In the following table, an electric current
$$i(t) = c\cos(\omega t + \phi),$$
with phasor
$$I = c\,e^{j\phi},$$
is caused to pass through a resistor, an inductor, and a capacitor,
separately. The resulting voltage drop $v(t)$ associated with each is
shown, together with its phasor $V$. It is the unique steadily oscillating
state that is being described by the phasors.

                      Resistor                 Inductor                 Capacitor

Voltage drop:         $v = Ri$                 $v = L\dfrac{di}{dt}$    $v = \dfrac{1}{C}\int i\,dt$
Voltage phasor $V$:   $Rc\,e^{j\phi} = RI$     $j\omega L I$            $\dfrac{1}{j\omega C}I$
Voltage phase:        $\phi$                   $\phi + \tfrac{1}{2}\pi$ $\phi - \tfrac{1}{2}\pi$
                      (in phase)               ($v$ leads $i$)          ($v$ lags $i$)

A similar table can be constructed if the voltage rather than
current is prescribed. The entries can be read from the table above;
for example, if the phasor of the voltage applied to an inductor is
$V$, the phasor of the resulting current is $V/j\omega L$.
Discussion of circuits in terms of phasors is said to take place in
the frequency domain, rather than the time domain associated
with the differential equations of the circuits.
Each of the three cases in the table can be written in the form
$$V = ZI,$$
where $Z$ is either $R$, $j\omega L$, or $(j\omega C)^{-1}$. The quantity $Z$ is called the
complex impedance of these elements. There is a plain analogy
with Ohm’s law for direct current through a resistance. We have

Complex impedance $Z$

Resistor    $Z = R$
Inductor    $Z = j\omega L$        (21.6)
Capacitor   $Z = \dfrac{1}{j\omega C}$.

By stringing elements of this type together in series and parallel,
we can form composite units. For elements in series, the combined unit
has a complex impedance which is the sum of the complex impedances of the
individual elements:

Example 21.7. Show that the complex impedance $Z$ of two elements in series,
whose complex impedances are $Z_1$ and $Z_2$, is given by
$$Z = Z_1 + Z_2.$$
Suppose that the impedance of the unit is $Z$; we mean by this that, if $V$ is the
phasor of the voltage drop across the unit and $I$ is the phasor of the current
through it (see Fig. 21.2), then
$$V = ZI.$$
From Fig. 21.2, $v = v_1 + v_2$, therefore, by (21.3), the corresponding phasors
satisfy
$$V = V_1 + V_2.$$
But $i$, and therefore $I$, is the same for $Z_1$ and $Z_2$, so
$$V_1 = Z_1 I, \qquad V_2 = Z_2 I.$$
Therefore $V = Z_1 I + Z_2 I = ZI$, or
$$Z = Z_1 + Z_2.$$

Fig. 21.2

If the two impedances are in parallel, the analogy with Ohm's law
again exists:

Example 21.8. Show that the complex impedance $Z$ of any two elements $Z_1$
and $Z_2$ in parallel is given by
$$\frac{1}{Z} = \frac{1}{Z_1} + \frac{1}{Z_2}.$$
From Fig. 21.3, $i = i_1 + i_2$; so, by (21.3), $I = I_1 + I_2$. The voltage drop is the
same for both branches, so
$$I_1 = V/Z_1, \qquad I_2 = V/Z_2, \qquad I = V/Z.$$
Therefore $I = (1/Z_1 + 1/Z_2)V = (1/Z)V$, from which the result follows.

Fig. 21.3 Two impedances in parallel and their combined impedance.

It is easy to extend these two results to encompass more elements,
and therefore we have the following general result.

Complex impedance $Z$ of series and parallel circuits

(a) Impedances $Z_1, Z_2, \ldots$, in series:
$$Z = Z_1 + Z_2 + \cdots \tag{21.7}$$
(b) Impedances $Z_1, Z_2, \ldots$, in parallel:
$$\frac{1}{Z} = \frac{1}{Z_1} + \frac{1}{Z_2} + \cdots$$
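The element impedances (21.6) and the combination rules (21.7) translate directly into a few helper functions. This is a sketch under stated assumptions: the function names and the component values below are hypothetical, chosen only to illustrate an inductor in series with a parallel resistor-capacitor pair.

```python
def z_resistor(R):
    """Complex impedance of a resistor."""
    return complex(R, 0.0)

def z_inductor(L, omega):
    """Complex impedance j*omega*L of an inductor."""
    return 1j * omega * L

def z_capacitor(C, omega):
    """Complex impedance 1/(j*omega*C) of a capacitor."""
    return 1 / (1j * omega * C)

def series(*impedances):
    """Impedances in series add."""
    return sum(impedances)

def parallel(*impedances):
    """For impedances in parallel, the reciprocals add."""
    return 1 / sum(1 / z for z in impedances)

# Assumed sample values (not from the text).
omega, R, L, C = 100.0, 50.0, 0.2, 1.0e-4

z_rc = parallel(z_resistor(R), z_capacitor(C, omega))
z_total = series(z_inductor(L, omega), z_rc)

print("R parallel C:", z_rc)
print("L in series with (R parallel C):", z_total)
```

Because the rules have the same form as those for resistors, the helpers can be nested to describe any series-parallel arrangement.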

The analogy with resistive circuits, evident from these formulae,
goes much further. The general rules which govern voltages and
currents in a passive linear circuit are Kirchhoff’s laws: (i) that
the algebraic sum of the voltages around any closed circuit is zero;
(ii) that the resultant current entering any junction is zero. There
is also a linear voltage/current relation for each branch. In terms of
phasors and complex impedances for a circuit in a state of steady
harmonic oscillation, these conditions become the following.

Around any closed circuit, $\sum V = 0$.
At any junction, $\sum I = 0$.        (21.8)
On any branch, $V = ZI$.

These rules have the same form as the rules for resistive direct-current
circuits, with $V$, $I$, and $Z$ appearing in them in place of $v$, $i$,
and $R$. It follows that general rules applicable to DC circuits
may be borrowed for the purpose of the circuits we have been
considering. Such rules are the Wheatstone bridge rules, Thévenin’s
theorem, and the structure of equivalent circuits. However, the
restriction to steady harmonic oscillation must be remembered:
many circuits can be made to ‘balance’ like a Wheatstone bridge
for steady oscillations, but not for more general disturbances.

Example 21.9. Find the steady AC current in the circuit shown in Fig. 21.4.

The unit comprising $R$ and $C$ consists of two complex impedances in parallel,
$R$ and $(j\omega C)^{-1}$. If $Z$ is the combined impedance, then
$$\frac{1}{Z} = \frac{1}{R} + \frac{1}{(j\omega C)^{-1}},$$
which gives
$$Z = \frac{R}{1 + j\omega RC}.$$
This is in series with the other impedance, $j\omega L$, so the impedance of the circuit is
given by
$$Z = \frac{R}{1 + j\omega RC} + j\omega L = \frac{R(1 - \omega^2 LC) + j\omega L}{1 + j\omega RC}.$$
Since $I = V/Z$, and $V = v_0$, we obtain
$$I = \frac{v_0(1 + j\omega RC)}{R(1 - \omega^2 LC) + j\omega L} = \frac{v_0(1 + \omega^2R^2C^2)^{1/2}\,e^{j(\phi_1 - \phi_2)}}{[R^2(1 - \omega^2LC)^2 + \omega^2L^2]^{1/2}},$$
where
$$\phi_1 = \arctan\omega RC, \qquad \phi_2 = \arctan\frac{\omega L}{R(1 - \omega^2 LC)}.$$
Finally,
$$i(t) = \mathrm{Re}(I e^{j\omega t}) = \frac{v_0(1 + \omega^2R^2C^2)^{1/2}}{[R^2(1 - \omega^2LC)^2 + \omega^2L^2]^{1/2}}\cos(\omega t + \phi_1 - \phi_2).$$

Example 21.10. Find the steady AC current entering the circuit shown in
Fig. 21.5.

The phasor of the voltage source is $V = v_0$. By (21.6) the impedance of MNPQ
is $R + j\omega L$, and that of MQ is $1/j\omega C$. These are in parallel, so by (21.7) the
impedance $Z$ of the circuit viewed between M and Q is given by
$$\frac{1}{Z} = \frac{1}{R + j\omega L} + \frac{1}{1/j\omega C}.$$
Therefore
$$I = \frac{V}{Z} = v_0\Bigl(\frac{1}{R + j\omega L} + j\omega C\Bigr).$$
The simplest way to get an expression for $i(t)$ is to treat the two terms in the
parentheses on the right separately (though this does not give the answer in
standard form). We obtain
$$i(t) = \frac{v_0}{(R^2 + \omega^2L^2)^{1/2}}\cos\Bigl(\omega t - \arctan\frac{\omega L}{R}\Bigr) + v_0\omega C\cos(\omega t + \tfrac{1}{2}\pi).$$

Example 21.11. (Balanced bridge circuit.) (a) For Fig. 21.6a, show that (i) if
$i(t) = 0$, then $Z_1/Z_2 = Z_3/Z_4$, (ii) if $Z_1/Z_2 = Z_3/Z_4$, then $i(t) = 0$. (b) Check that
$i(t) = 0$ in the circuit of Fig. 21.6b.
(a) The analogy (21.8) between resistive and general circuits for steady harmonic
oscillations enables us to borrow ordinary Wheatstone-bridge theory, substituting
current and voltage phasors and complex impedances for the usual
constant currents, voltages, and resistances. We can therefore say immediately
that the circuit is balanced ($i(t) = 0$) if, and only if,
$$Z_1/Z_2 = Z_3/Z_4.$$
(b) $Z_1$ consists of a capacitor and resistor in parallel; so, by (21.6) and (21.7),
$$\frac{1}{Z_1} = \frac{1}{(j\omega)^{-1}} + \frac{1}{1} \quad\text{or}\quad Z_1 = \frac{1}{1 + j\omega}.$$
Also
$$Z_2 = j\omega, \qquad Z_3 = \frac{1}{j\omega}, \qquad Z_4 = 1 + j\omega.$$
Therefore
$$\frac{Z_1}{Z_2} = \frac{1}{j\omega(1 + j\omega)}, \qquad \frac{Z_3}{Z_4} = \frac{1}{j\omega(1 + j\omega)},$$
so, from (a), $i(t) = 0$ and the bridge is balanced.

Fig. 21.6 (a) Bridge circuit. (b) Bridge circuit with applied voltage $v = v_0\cos\omega t$.

21.5 Transfer functions in the frequency domain


Consider the circuit of Fig. 21.7, in which the applied voltage is $v_1(t)$,
with phasor $V_1 = c_1 e^{j\phi_1}$. Suppose that the voltage drop $v_2(t)$ across
$R$ has a phasor $V_2 = c_2 e^{j\phi_2}$.
Consider the ratio of these two phasors, denoting it by $G_{12}$ ($G$
standing for voltage gain):
$$G_{12} = \frac{V_2}{V_1} = \frac{c_2 e^{j\phi_2}}{c_1 e^{j\phi_1}} = \frac{c_2}{c_1}\,e^{j(\phi_2 - \phi_1)}.$$
Then
$$|G_{12}| = \frac{c_2}{c_1},$$
which is the ratio of the peak voltages, or amplitudes, of $v_2(t)$ and
$v_1(t)$. The argument (polar angle) of $G_{12}$ is the phase difference
between them. If instead we are interested in the current $i_2(t)$ through
$R_2$ produced by $v_1(t)$, then we need the ratio
$$Z_{12} = V_1/I_2,$$
where $I_2$ is the phasor of $i_2(t)$. This quantity, a voltage divided by
a current, is called a transfer impedance. Alternatively, we could
consider the ratio
$$Y_{21} = I_2/V_1,$$
in which $Y_{21}$ is called a transfer admittance (whose parallel is
conductance in DC theory).
In general, the ratio of an output (such as a current in a selected
branch) to an input (such as a voltage driving a network) is called a
transfer function in the frequency domain. A different class of
transfer functions is discussed in Chapter 25, on Laplace transforms.

Fig. 21.7

Example 21.12. Find the transfer impedance $Z_{12} = V_1/I_2$ for the circuit of
Fig. 21.8 when the prevailing angular frequency $\omega$ is 200.
The currents indicated take account of Kirchhoff’s second rule (21.8), that the
sum of the currents entering a junction is zero. The first law expressed in terms
of the phasors (see (21.8)), that the sum of the voltage drops round closed circuits
is zero, gives for the circuits ABCDEA and BCDEB respectively:
$$2I_1 + \Bigl(\frac{1}{200 \times 0.01j} + 3\Bigr)I_2 = V_1,$$
and
$$\Bigl(\frac{1}{200 \times 0.01j} + 3\Bigr)I_2 - (200 \times 0.005j)(I_2 - I_1) = 0.$$
After simplification, these become
$$2I_1 + (3 - \tfrac{1}{2}j)I_2 = V_1,$$
$$jI_1 + (3 - \tfrac{3}{2}j)I_2 = 0.$$
The solution for $I_2$ is
$$I_2 = \frac{V_1}{6 + \tfrac{11}{2}j}.$$
The transfer function required is
$$Z_{12} = V_1/I_2 = 6 + \tfrac{11}{2}j = 8.14\,e^{0.74j}.$$
The amplitude of $i_2(t)$ is given by
$$|I_2| = |V_1|/|Z_{12}| = 10/8.14 = 1.23.$$
Its phase is:
$$(\text{phase of } V_1) - (\text{phase of } Z_{12}) = 0 - 0.74 = -0.74.$$
The current leads the voltage by this (negative) amount; that is, it lags the voltage by 0.74 radians.
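The two phasor equations of Example 21.12 form a linear system with complex coefficients, which can be solved by Cramer's rule. The sketch below assumes a source amplitude of 10 with zero phase, as used in the amplitude calculation of the example; the coefficient names are illustrative.

```python
import cmath

V1 = 10.0                            # source phasor: amplitude 10, phase 0

z_cap_res = 3.0 + 1 / (200 * 0.01j)  # 3 + 1/(j*omega*C) with omega*C = 2
z_ind = 200 * 0.005j                 # j*omega*L with omega*L = 1

# Loop equations in the form a11*I1 + a12*I2 = V1 and a21*I1 + a22*I2 = 0.
a11, a12 = 2.0, z_cap_res
a21, a22 = z_ind, z_cap_res - z_ind

det = a11 * a22 - a12 * a21
I2 = (a11 * 0.0 - a21 * V1) / det    # Cramer's rule for the second unknown
Z12 = V1 / I2

print("Z12 =", Z12)
print("|Z12| =", abs(Z12), "arg Z12 =", cmath.phase(Z12))
print("|I2| =", abs(I2), "phase of I2 =", cmath.phase(I2))
```

For larger networks the same equations would normally be assembled as a matrix and passed to a linear solver rather than solved by hand.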

The methods described in this chapter were invented in the late
nineteenth century to assist engineers working with alternating
current to interpret and make calculations on their circuits. So long
as only steady harmonic oscillations had to be considered, there was
no need to solve differential equations: only algebraic equations are
involved and these are much simpler to manipulate. Since that time,
the methods have been extensively developed so as to permit
computer calculation for circuits of any degree of complexity, using
matrix algebra, graph theory, and other sophisticated techniques. In
Section 24.16, another method for algebrizing circuit equations is
described, using Laplace transforms.

Problems

21.1. Write down the phasors $X$ corresponding to the
oscillations $x(t)$ given below, in polar and $a + jb$ form.
(a) 2 cos(10t + in); (b) −2 cos(10t + \n);
(c) 3 sin ωt; (d) −4 sin(3t − ^tt).

21.2. Write the following phasors $X$ in polar form, and
give the corresponding oscillations $x(t)$ when the angular
frequency is $\omega$.
(a) $1 - j$; (b) $2j$; (c) $-3j$; (d) $-\sqrt{2} - \sqrt{2}\,j$;
(e) $-2\sqrt{3} - 2j$; (f) $-1 - \sqrt{3}\,j$; (g) $1/(1 - 2j)$;
(h) $j/(1 - 2j)$; (i) $(2 + 3j)/(2 - 3j)$; (j) $1/3j + 2j$.

21.3. Write down the phasors corresponding to the
following oscillations. State the amplitude and phase.
(a) cos 2t + cos(2t − ^tt); (b) cos 3t − sin 3t;
(c) sin 3t + 2 cos 3t.

21.4. Either algebraically, or by calculation or measurement
based on a phasor diagram, give the phasors of
the following functions.
(a) −cos 2t + cos(2t + ^7r) + cos(2t −
(b) cos 1760t − 3 cos(1760t − \n) + cos(1760t +

21.5. Show that the point on an Argand diagram corresponding
to $c\,e^{j(\omega t + \phi)}$ moves on a circle, centre the
origin and radius $c$, with constant angular velocity $\omega$,
and that its projection on the $x$ axis is given by $x = c\cos(\omega t + \phi)$.
(A phasor diagram such as Fig. 21.1 is a
snapshot of conditions at $t = 0$ in such a representation:
the whole diagram rotates unaltered.)

21.6. Obtain the complex impedance of the following
circuit branches.

Fig. 21.9 (circuit branches (a) to (l), composed of elements $R$, $L$, and $C$)

21.7. A voltage $v = 2\cos\omega t$ is applied to each of the
terminals in Problem 21.6. Find the amplitude and
phase of the current passing through the circuit.

21.8. In Fig. 21.10, numerical values are given to the
complex impedances (the units are usually ohms, although
the quantities may be complex). A voltage with phasor
$V_0$ is applied. Obtain the phasor $V_1$ as indicated, the
corresponding voltage gain $V_1/V_0$, and the transfer
impedance $V_0/I_1$.

Fig. 21.10 (circuits (a) to (c))
22 Graphical, numerical and other aspects of first-order equations
Contents
22.1 Graphical features of first-order equations
22.2 The Euler method for numerical solution
22.3 Nonlinear equations of separable type
22.4 Differentials and the solution of first-order equations
22.5 Change of variable in a differential equation
Problems

22.1 Graphical features of first-order equations


In this section we shall use $x$ for the independent variable (instead
of $t$), and $y$ as the dependent variable (instead of $x$), and consider
differential equations of the form
$$\frac{dy}{dx} = f(x, y), \tag{22.1}$$
where $f(x, y)$ is unrestricted. If $f(x, y)$ happens to take the form
$g(x) + h(x)y$, the equation is linear and can be handled by the
method of Section 19.5; otherwise none of the methods so far discussed
will work.
However, we can always obtain a rough picture of the solution
curves by using a simple fact. Choose any point $x = a$, $y = b$, on
the $(x, y)$ plane. Then equation (22.1) says that the slope of the
solution curve which passes through $(a, b)$ must be equal to $f(a, b)$,
and this has a definite numerical value that we can work out. So
take a large number of points $(a, b)$ on the $(x, y)$ plane. For each of
them, work out $f(a, b)$, and draw through the point a short line
whose slope is equal to $f(a, b)$, as is done in Fig. 22.1a for the special
case $f(x, y) = xy$. These are called direction indicators. Given enough
of these, it is possible to draw a family of curves which follow their
directions smoothly, as in Fig. 22.1b. Each of the curves represents
a solution of (22.1), because its slope, or derivative, is correctly
reproduced at each point on it. The picture is called a lineal-element
diagram or a direction field. The technique can in principle
be used for first-order equations however complicated they may be.
Rather than drawing the direction indicators at grid points as in
Fig. 22.1, it is often easier to look for curves, called isoclines, along
which the slope is constant, as in the following example.

Fig. 22.1 Lineal-element diagram indicating solution curves for $dy/dx = xy$. (a) Lineal element through $P : (a, b)$.
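The construction of direction indicators on a grid can be sketched as follows. This is an illustration only: the helper name `direction_indicator`, the grid spacing, and the segment length are arbitrary assumptions, and the endpoints produced would still have to be passed to a plotting routine to draw the diagram.

```python
import math

def f(x, y):
    """Right-hand side of dy/dx = f(x, y); here the special case f = x*y."""
    return x * y

def direction_indicator(x, y, length=0.2):
    """Endpoints of a short segment centred on (x, y) with slope f(x, y)."""
    slope = f(x, y)
    dx = 0.5 * length / math.sqrt(1.0 + slope * slope)
    dy = slope * dx
    return (x - dx, y - dy), (x + dx, y + dy)

# Grid of points (a, b) covering the square -1 <= x, y <= 1.
grid = [(0.5 * i, 0.5 * j) for i in range(-2, 3) for j in range(-2, 3)]
segments = [direction_indicator(a, b) for a, b in grid]

for (a, b), (start, end) in zip(grid[:3], segments[:3]):
    print((a, b), "->", start, end)
```

Every segment has the same length, so that only its direction carries information, as in a hand-drawn lineal-element diagram.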

Example 22.1. Sketch the solution curves of $\dfrac{dy}{dx} = x - y$.

Here $dy/dx$ takes constant values $K$ on the isoclines $x - y = K$, or $y = x - K$.
For example, $dy/dx = 0$ on $y = x$, $dy/dx = 1$ on $y = x - 1$, and so on. If we
draw the line $y = x - K$, then the indicators along it are all parallel, with slope
$K$, so it is easier to draw a large number of them. Figure 22.2 is constructed in
this way. (This equation is in fact linear with constant coefficients, its solutions
being $y = x - 1 + C e^{-x}$.)

Fig. 22.2 Solution curves of $dy/dx = x - y$ in the first quadrant, showing the isoclines (values of $K$ indicated) and the solution curves.

Example 22.2. Sketch the solution curves of $\dfrac{dy}{dx} = x^2 + y^2$.

The isoclines are the circles $x^2 + y^2 = K$ (see Fig. 22.3), on each of which the
slope is equal to $K$ (which must be a positive number here).

Closed curves can occur, as in the following example.

Example 22.3. Sketch the solution curves of $\dfrac{dy}{dx} = -\dfrac{x}{y}$.

The isoclines are the radial straight lines $-x/y = K$ (see Fig. 22.4), or
$$y = -\frac{1}{K}x,$$
on which the slopes are equal to $K$. On $y = 0$, the slope $K$ must be infinite so
the direction indicators are vertical.

Fig. 22.3 Pattern of solution curves of $dy/dx = x^2 + y^2$ in the first quadrant, showing the isoclines and the solution curves.

The method illustrates why there is always an infinite number of
solutions: there will be a single solution curve through every point
where $f(x, y)$ has a definite value. The type of exception that might
arise is illustrated by the case $f(x, y) = (xy)^{1/2}$, which only has a
meaning when $x$ and $y$ have the same sign; there are solutions
in only the first and third quadrants of the $(x, y)$ plane. Again, there
can be points from which several solution curves emanate: this
occurs at the origin in Example 22.3, where $f(x, y)$ takes the
indeterminate form 0/0; nothing can be taken for granted at such a
point, but elsewhere the curves do not intersect.
To prescribe a point $(a, b)$ through which a curve must pass is
equivalent to imposing an initial condition on the solution: the
corresponding initial condition would read ‘Find the solution for
which $y = b$ when $x = a$’. Therefore, it can be seen that, even when
an equation is not linear, an initial condition of this sort will give
exactly one solution; points where $f(a, b)$ is indeterminate being
excepted.

22.2 The Euler method for numerical solution


Fig. 22.4 Solution curves of $dy/dx = -x/y$, showing the isoclines and the solution curves.

For the equation
$$\frac{dy}{dx} = f(x, y),$$
consider an adaptation of the graphical method described in the
previous section. Start at any point $P_0 : (x_0, y_0)$, and draw an
indicator with slope $f(x_0, y_0)$ from $P_0$ to $P_1 : (x_1, y_1)$ a short distance
away (see Fig. 22.5). Then $P_1$ will lie close to the solution curve
through $P_0$. Do the same thing starting with $P_1$, and so on,
continuing as far as is necessary. It is also possible to proceed
backwards from $P_0$. Provided that the steps are small enough, it
seems likely that $P_0, P_1, P_2, \ldots$, will be close to the solution curve
through $P_0$, so we have an approximate solution to the initial-value
problem: Find the solution of $dy/dx = f(x, y)$ for which $y = y_0$ when
$x = x_0$.

Fig. 22.5 Step-by-step use of direction indicators along a particular solution curve.

Obviously, to draw the solution curve in this way is not really
practicable; but we need not actually draw it, because the same
process can be carried out numerically as follows.
As shown in Fig. 22.6, choose a small, constant step length $h$ in
$x$ for going from point to point: $P_0 \to P_1 \to P_2 \to \cdots$, where the
vectors $P_0P_1, P_1P_2, P_2P_3, \ldots$ point in the direction of the indicators
at their respective starting points $P_0, P_1, P_2, \ldots$. Corresponding to
the $x$ steps, call the $y$ steps $k_1, k_2, k_3, \ldots$:
$$y_1 = y_0 + k_1, \quad y_2 = y_1 + k_2, \quad y_3 = y_2 + k_3, \ldots,$$
where
$$k_1 = h y'(x_0) = h f(x_0, y_0), \quad k_2 = h y'(x_1) = h f(x_1, y_1), \ldots.$$
Therefore:

for $P_1$: $x_1 = x_0 + h$, $y_1 = y_0 + h f(x_0, y_0)$;

for $P_2$: $x_2 = x_1 + h$, $y_2 = y_1 + h f(x_1, y_1)$;

and, in general, with $n = 1, 2, 3, \ldots$, in turn,

for $P_n$: $x_n = x_{n-1} + h$, $y_n = y_{n-1} + h f(x_{n-1}, y_{n-1})$.

Fig. 22.6 Three steps in the numerical solution of $dy/dx = f(x, y)$, starting at the point $P : (x_0, y_0)$.

We expect that the points will be close to the solution curve. This
is the Euler method for approximating to the solution of the
differential equation.

Euler method for initial-value problems

Differential equation: $\dfrac{dy}{dx} = f(x, y)$.
Initial condition: $y = b$ when $x = a$.        (22.2)
Approximate solution: Put $x_0 = a$, $y_0 = b$; then
$x_n = x_{n-1} + h$, $y_n = y_{n-1} + f(x_{n-1}, y_{n-1})h$
for $n = 1, 2, \ldots$ successively.

This recipe, or algorithm, describes a step-by-step repetitive
process, or iteration; essentially the same thing has to be done over
and over again. The procedure which produces a new $(x, y)$ from
the preceding $(x, y)$ is called a recurrence relation (compare Newton’s
method for solving equations in Chapter 4). Such a process is easy
to program on a computer, and Fig. 22.7 is the skeleton of a flow
diagram. The program should contain a method for stopping itself
when $x$ has gone far enough; also, since small intervals $h$ are usually
necessary, it is useful to include a means of recording only the results
for preset values of $x$ to avoid voluminous output.

Fig. 22.7 Flow diagram for the initial-value problem.
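The recurrence (22.2) might be programmed along the following lines. This is a minimal sketch rather than a definitive implementation: the function name `euler` and its arguments are illustrative, and the stopping rule is simply a fixed number of steps. The equation used in the demonstration is $dy/dx = xy^2$ with $y = 1$ at $x = 0$.

```python
def euler(f, a, b, h, n_steps):
    """Apply the Euler recurrence n_steps times, starting from (a, b).

    Returns the list of points (x_n, y_n), including the starting point.
    """
    x, y = a, b
    points = [(x, y)]
    for _ in range(n_steps):
        y = y + h * f(x, y)   # y_n = y_{n-1} + h f(x_{n-1}, y_{n-1})
        x = x + h             # x_n = x_{n-1} + h
        points.append((x, y))
    return points

# Five steps of length 0.2 take the solution from x = 0 to x = 1.
points = euler(lambda x, y: x * y**2, 0.0, 1.0, 0.2, 5)

for x, y in points:
    print(f"{x:.1f}  {y:.4f}")
```

Note that `y` is updated before `x`, so that the slope is evaluated at the old point, as the recurrence requires.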

Example 22.4. Use the Euler method to obtain a solution of the initial-value
problem
$$\frac{dy}{dx} = xy^2, \quad\text{with } y = 1 \text{ at } x = 0,$$
between $x = 0$ and $x = 1$. Compare the result with the exact solution $y = (1 - \tfrac{1}{2}x^2)^{-1}$
when steps of $h = 0.2$, 0.1, 0.01, and 0.001 are adopted.
The following results are obtained.

x            0        0.2      0.4      0.6      0.8      1.0

Exact y      1.0000   1.0204   1.0870   1.2195   1.4706   2.0000
h = 0.2      1.0000   1.0000   1.0400   1.1265   1.2788   1.5405
h = 0.1      1.0000   1.0100   1.0623   1.1687   1.3601   1.7129
h = 0.01     1.0000   1.0193   1.0843   1.2139   1.4576   1.9618
h = 0.001    1.0000   1.0203   1.0867   1.2189   1.4693   1.9960

Euler’s method is very simple; but it is usually good enough to


provide reasonable accuracy over a finite range, provided that small
enough intervals are used. The simplest way of checking accuracy
is to experiment with successively smaller intervals h, noting when
further reduction in h does not change the values of y obtained at
the number of decimal places required. Several problems on these
lines are given at the end of the chapter.
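The accuracy check described above, repeating the calculation with smaller and smaller intervals until the answer settles down, can be automated. The sketch below halves $h$ until two successive results agree to a chosen tolerance; the names `euler_final` and `refine`, the starting number of steps, and the tolerance are all assumptions made for illustration.

```python
def euler_final(f, a, b, x_end, n_steps):
    """Value of y at x_end from n_steps Euler steps starting at (a, b)."""
    h = (x_end - a) / n_steps
    x, y = a, b
    for _ in range(n_steps):
        y += h * f(x, y)
        x += h
    return y

def refine(f, a, b, x_end, tol, n_steps=10, max_doublings=20):
    """Halve h repeatedly until successive results differ by less than tol."""
    previous = euler_final(f, a, b, x_end, n_steps)
    for _ in range(max_doublings):
        n_steps *= 2
        current = euler_final(f, a, b, x_end, n_steps)
        if abs(current - previous) < tol:
            return current, n_steps
        previous = current
    raise RuntimeError("results did not settle within max_doublings")

value, steps = refine(lambda x, y: x * y**2, 0.0, 1.0, 1.0, tol=1e-3)
print("y(1) is approximately", value, "using", steps, "steps")
```

Agreement between successive results is a practical indication of accuracy, not a guarantee: the remaining error can be comparable with the tolerance itself.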
There exist, however, far more sophisticated algorithms which
will give great accuracy over long ranges without having to use
minute values of h (which can introduce problems of its own). The
computer programs for such methods can be found in libraries of
computer routines. The theoretical side of the subject is called
numerical analysis; mathematical theory makes it possible, for
example, to estimate the size of interval required without carrying
out trials.

22.3 Nonlinear equations of separable type


The equation
$$\frac{dy}{dx} = \frac{y^2}{x^2}$$
is nonlinear (note the $y^2$ term), and none of the theory of Chapters
18 and 19 can be adapted to solve it. Write it in the form
$$\frac{dy}{y^2} = \frac{dx}{x^2}.$$
On the left only $y$ appears, and on the right only $x$ appears. The
form looks like an invitation to integrate both sides:
$$\int\frac{dy}{y^2} = \int\frac{dx}{x^2} + C,$$
so $-1/y = -1/x + C$. Therefore
$$y = \frac{x}{1 - Cx},$$
where $C$ is arbitrary. The reader should consider checking that these
really are solutions by substituting into the equation. Notice that
there is no sign of the complementary functions and particular
solutions found for linear equations: an arbitrary constant $C$ does
occur, but it is imbedded deep in the expression. The solution curves
are shown in Fig. 22.8: each curve has its individual asymptotes,
namely the lines $y = -C^{-1}$, $x = C^{-1}$.

Fig. 22.8 Solution curves $y = x/(1 - Cx)$ for $dy/dx = y^2/x^2$. Values of $C$ indicated on the curves.

The method is called separation of variables. It can be applied
to equations which are separable, that is to say, ones that can be
arranged in the form
$$\frac{dy}{dx} = g(x)h(y);$$
where the right-hand side is the product of two terms, one a function
of $x$ only, and the other a function of $y$ only. Alternatively, you
might see it more easily as an equation which can be put into the
form
$$Y(y)\,dy = X(x)\,dx,$$
$X(x)$ and $Y(y)$ being functions respectively of $x$ only and $y$ only.
For example the equations

dy i. + dy y dy
- = x2y2, — = e sin y,-= cos(y )
dx dx x dx

are of the right type.


Separation of variables

Equation type: $\dfrac{dy}{dx} = g(x)h(y)$.
Separate the terms: $\dfrac{dy}{h(y)} = g(x)\,dx$.        (22.3)
Integrate: $\displaystyle\int\frac{dy}{h(y)} = \int g(x)\,dx + C$, so that $y$ is
expressed as function of $x$ (usually an implicit function).
$C$ may take a range of values.
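A formula obtained by separation of variables can always be checked by substituting it back into the differential equation. The sketch below does this numerically for the family $y = x/(1 - Cx)$ and the equation $dy/dx = y^2/x^2$ treated at the start of this section; the sample values of $C$ and $x$ are arbitrary, chosen to avoid the asymptotes $x = 1/C$, and the derivative is estimated by a central difference.

```python
def y_solution(x, C):
    """Candidate solution y = x / (1 - C x)."""
    return x / (1.0 - C * x)

def residual(x, C, dx=1e-6):
    """dy/dx - y^2/x^2 for the candidate solution (zero for a true solution)."""
    dydx = (y_solution(x + dx, C) - y_solution(x - dx, C)) / (2.0 * dx)
    return dydx - y_solution(x, C) ** 2 / x ** 2

residuals = [residual(x, C)
             for C in (-1.0, 0.5, 2.0)
             for x in (0.3, 1.2, 3.0)]

print(max(abs(r) for r in residuals))
```

The largest residual is at the level of the finite-difference error, which supports the claim that every member of the family is a solution.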

Example 22.5. Find solutions of the equation $y\dfrac{dy}{dx} = \cos x$.

This can be written $y\,dy = \cos x\,dx$. By integrating both sides, we obtain
$\tfrac{1}{2}y^2 = \sin x + C$, giving $y = \pm 2^{1/2}(\sin x + C)^{1/2}$, where $C$ is only to a certain extent
arbitrary. It cannot be completely arbitrary; for example, if $C = -100$, then
$\sin x + C$ will always be negative (because $-1 \le \sin x \le 1$), so the square root
never has a real value. We must have $C > -1$ to get any real solution. If $C$ lies
between $-1$ and $+1$, there will be intervals in $x$ which make $\sin x + C$
negative; this fact leads to the oval curves in Fig. 22.9: you should experiment
with a value of $C$ in this range from this point of view. Since $\sin x$ has period
$2\pi$, so has the solution picture.

Fig. 22.9 Solution curves for the equation $y\,dy/dx = \cos x$, which are the functions $y = \pm 2^{1/2}(\sin x + C)^{1/2}$.

Example 22.6. Find solutions of $\dfrac{dy}{dx} = \dfrac{y(x + 1)}{x(y + 1)}$.

After separating, we have
$$\int\frac{1 + y}{y}\,dy = \int\frac{1 + x}{x}\,dx + C,$$
or
$$\int\Bigl(\frac{1}{y} + 1\Bigr)dy = \int\Bigl(\frac{1}{x} + 1\Bigr)dx + C.$$
Therefore $y + \ln|y| = x + \ln|x| + C$, or $y\,e^{y} = Ax\,e^{x}$, where $A$ is arbitrary.
We cannot further reduce this ‘solution’ to express $y$ explicitly in terms of
$x$. The ‘answer’ is nearly as obscure as the original equation. More intelligible
information about the solutions could be obtained by using the graphical or
numerical methods of Sections 22.1 and 22.2.

The separation-of-variables technique requires initiative, even in
the simplest cases, as can be seen from the following example.

Example 22.7. Find solutions of $\dfrac{dy}{dx} = 2y^{1/2}$.

After separating, we have $\tfrac{1}{2}y^{-1/2}\,dy = dx$, or
$$y^{1/2} = x + C. \tag{22.4}$$
To express $y$ in terms of $x$, square both sides. We get
$$y = (x + C)^2. \tag{22.5}$$
This represents a family of parabolas, as shown in Fig. 22.10a. But it cannot be
right: the curves cross at every point, although $dy/dx$ has only one value at any
point. In fact, since $y^{1/2} \ge 0$, only the positive value of $y'(x)$ is legitimate, and this
gives the right-hand branches, shown in Fig. 22.10b.
We also lost a solution, namely $y(x) = 0$. This is connected with the fact
that, for this solution, we in effect divided by zero when we first separated the
equation.

The production of pseudosolutions and the nonappearance of


certain singular solutions in the final formula, as in Example 22.7,
is a problem which constantly arises in nonlinear differential
equations.

22.4 Differentials and the solution of first-order equations

Fig. 22.10 (a) $y = (x + C)^2$ for various $C$. (b) The solutions of the differential equation, consisting only of the right-hand branches of the parabolas. (Note: $y(x) = 0$ is also a valid solution.)

In this section, it is important to distinguish between an identity
such as $d(y^2)/dx = 2y\,dy/dx$, which is true for any $y(x)$ (it is just a
special case of the chain rule), and an equation such as $dy/dx = xy$,
which will only be true for special functions $y(x)$.
Take an identity such as
$$\frac{d}{dx}x^2 = 2x, \tag{22.6a}$$
and consider another way of writing it:
$$d(x^2) = 2x\,dx. \tag{22.6b}$$
It is as if we formally multiply (22.6a) by $dx$ to obtain (22.6b).
Conversely, if we divide (22.6b) by $dx$, we recover (22.6a). We
have already used this process to help to change the variable in an
integral in Section 17.1.
Now consider a more complicated identity, obtained from the
product rule for differentiation ($y(x)$ represents any function of $x$):
$$\frac{d}{dx}(xy) = y + x\frac{dy}{dx}. \tag{22.7a}$$
The parallel expression of the same identity, obtained as before, is
$$d(xy) = y\,dx + x\,dy. \tag{22.7b}$$
Given either one of them, we can immediately construct the other,
so we shall regard such pairs of expressions as being simply different
ways of writing the same thing. In effect this is what we did when
carrying out the separation-of-variables process for differential
equations in Section 22.3, and we are leading up to a generalization
of this method.
In general, a differential expression or differential form has the
shape
$$P(x, y)\,dx + Q(x, y)\,dy, \tag{22.8}$$
where $P(x, y)$ and $Q(x, y)$ are two functions of $x$ and $y$. In (22.6b),
we had $P(x, y) = 2x$ and $Q(x, y) = 0$; in (22.7b), we had $P(x, y) = y$
and $Q(x, y) = x$. The symbols on the left of (22.6b) and (22.7b), $d(x^2)$
and $d(xy)$, are called the differentials of $x^2$ and $xy$ respectively.
The table (22.9) (below) gives a list of useful identities written
in the usual form and the differential form for comparison.

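Standard form                                                                      Differential form

$\dfrac{d}{dx}(C) = 0$ ($C$ constant)                                              $dC = 0$
$\dfrac{d}{dx}(x^2) = 2x$                                                          $d(x^2) = 2x\,dx$
$\dfrac{d}{dx}(y^2) = 2y\dfrac{dy}{dx}$                                            $d(y^2) = 2y\,dy$
$\dfrac{d}{dx}(xy) = x\dfrac{dy}{dx} + y$                                          $d(xy) = y\,dx + x\,dy$        (22.9)
$\dfrac{d}{dx}\Bigl(\dfrac{y}{x}\Bigr) = \dfrac{1}{x^2}\Bigl(x\dfrac{dy}{dx} - y\Bigr)$        $d\Bigl(\dfrac{y}{x}\Bigr) = -\dfrac{1}{x^2}(y\,dx - x\,dy)$
$\dfrac{d}{dx}\Bigl(\dfrac{x}{y}\Bigr) = \dfrac{1}{y^2}\Bigl(y - x\dfrac{dy}{dx}\Bigr)$        $d\Bigl(\dfrac{x}{y}\Bigr) = \dfrac{1}{y^2}(y\,dx - x\,dy)$
$\dfrac{d}{dx}\ln\Bigl(\dfrac{y}{x}\Bigr) = \dfrac{1}{xy}\Bigl(x\dfrac{dy}{dx} - y\Bigr)$      $d\Bigl(\ln\dfrac{y}{x}\Bigr) = -\dfrac{1}{xy}(y\,dx - x\,dy)$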

Differential forms can be manipulated. For example:

(i) $2x\,dx - y\,dy = d(x^2) - d(\tfrac{1}{2}y^2) = d(x^2 - \tfrac{1}{2}y^2)$.
(ii) $(x + y)\,dx + x\,dy = x\,dx + (y\,dx + x\,dy) = d(\tfrac{1}{2}x^2) + d(xy) = d(\tfrac{1}{2}x^2 + xy)$.
(iii) If $u(x)$ and $v(x)$ are two functions, then
$$d(uv) = u\,dv + v\,du;$$
which is the product rule for derivatives in differential form. These
results are all identities; i.e. true for all functions $y(x)$, $u(x)$, $v(x)$.

Example 22.8. Put $d(x^3 + x\sin y)$ into the form P dx + Q dy.

We have

$$d(x^3 + x\sin y) = d(x^3) + d(x\sin y).$$

For the first term, $(d/dx)x^3 = 3x^2$, and in differential form this becomes
$d(x^3) = 3x^2\,dx$. For the second term, we can use the product rule in the form
(iii) above (or write it in standard form first):

$$d(x\sin y) = \sin y\,dx + x\,d(\sin y),$$

followed by the chain rule, $d(\sin y) = \cos y\,dy$:

$$d(x\sin y) = \sin y\,dx + x\cos y\,dy.$$

Finally,

$$d(x^3 + x\sin y) = (3x^2 + \sin y)\,dx + (x\cos y)\,dy,$$

so that, in (22.8), $P(x, y) = 3x^2 + \sin y$ and $Q(x, y) = x\cos y$.
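
The result of Example 22.8 can be checked numerically. The following Python sketch is an illustration only: it takes a trial function y(x) = x^2 (an assumption made purely for the test, since the identity should hold for any y(x)) and compares a central-difference estimate of (d/dx)(x^3 + x sin y) with P + Q dy/dx. The names F, P, Q, y_trial and the step h are hypothetical.

```python
import math

def F(x, y):
    return x**3 + x * math.sin(y)

def P(x, y):
    return 3 * x**2 + math.sin(y)

def Q(x, y):
    return x * math.cos(y)

def y_trial(x):
    # Hypothetical trial function; any differentiable y(x) should work.
    return x**2

def dy_trial(x):
    return 2 * x

def total_derivative(x, h=1e-5):
    """Central-difference estimate of d/dx F(x, y(x))."""
    return (F(x + h, y_trial(x + h)) - F(x - h, y_trial(x - h))) / (2 * h)

def form_value(x):
    """P + Q dy/dx evaluated along y = y_trial(x)."""
    y = y_trial(x)
    return P(x, y) + Q(x, y) * dy_trial(x)

for x in (0.3, 1.0, 1.7):
    print(x, total_derivative(x), form_value(x))
```

The two printed columns should agree to within the accuracy of the difference quotient.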

First-order differential equations can be written alternatively as
differential forms. The simplest is the equation

$$\frac{dy}{dx} = 0,$$

which has solutions y(x) = C, where C is any constant. In differential
form, the equation becomes

$$dy = 0,$$

and the solutions are compatible with the first entry in the table
above: dC = 0 when C is a constant.

Example 22.9. Find solutions of the equation $\dfrac{dy}{dx} = -\dfrac{x^3}{y^2}$.

In differential form, this becomes

$$x^3\,dx + y^2\,dy = 0.$$

But

$$x^3\,dx + y^2\,dy = d(\tfrac14 x^4) + d(\tfrac13 y^3) = d(\tfrac14 x^4 + \tfrac13 y^3).$$

This will be zero if

$$\tfrac14 x^4 + \tfrac13 y^3 = C,$$

where C is, in this case, any constant. The equation is in fact separable; you
should compare this with the process in Section 22.3.
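
As a check on Example 22.9, the sketch below integrates dy/dx = -x^3/y^2 numerically and confirms that x^4/4 + y^3/3 stays constant along the computed solution. It uses a standard fourth-order Runge-Kutta step rather than a method taken from this chapter; the starting point (0, 2) and the step count are assumed values chosen for illustration.

```python
def f(x, y):
    # Right-hand side of dy/dx = -x**3 / y**2
    return -x**3 / y**2

def rk4(f, x0, y0, x_end, n):
    """Classical RK4 from x0 to x_end in n equal steps; returns y(x_end)."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def F(x, y):
    # The function whose differential is x^3 dx + y^2 dy
    return x**4 / 4 + y**3 / 3

x0, y0 = 0.0, 2.0            # assumed starting point
y1 = rk4(f, x0, y0, 1.0, 200)
print(F(x0, y0), F(1.0, y1))
```

The two printed values of F should agree closely, which is the statement F(x, y) = C for this solution curve.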

Example 22.10. Find solutions of $\dfrac{dy}{dx} = \dfrac{x - y}{x + y}$.

In differential form, this becomes

$$0 = (x - y)\,dx - (x + y)\,dy = x\,dx - y\,dx - x\,dy - y\,dy.$$

Try to rearrange it so that recognizable forms appear:

$$0 = x\,dx - y\,dy - (y\,dx + x\,dy) = d(\tfrac12 x^2) - d(\tfrac12 y^2) - d(xy) = d(\tfrac12 x^2 - \tfrac12 y^2 - xy).$$

This differential will be zero as required if

$$\tfrac12 x^2 - \tfrac12 y^2 - xy = C,$$

where C is the ‘variable constant’, or parameter, which will generate a whole
family of solutions.

In the previous example, the terms were rearranged in a search
for a group like y dx + x dy that would simplify, in that case to
d(xy). If a differential form can be expressed identically (that is to
say, for all y(x)) in the form of a single differential:

P(x, y) dx + Q(x, y) dy = dF(x, y), (22.10)


where F(x, y) is a fixed function of x and y, then it is called a
perfect differential form, or a perfect differential. Usually this is
impossible. For example, consider the differential form y dx. It can
be proved that there does not exist any fixed function F(x, y) such
that

y(x) dx = dF(x, y(x)) for every y(x)

(try looking for one).


To solve differential equations by this method, we search for a
perfect differential in order to be able to conclude with the steps

‘dF(x, y) = 0; therefore F(x, y) = C’

as in the examples. If a perfect differential is not already present, we
might be able to produce one by multiplying through by a suitable
function, called an integrating factor for the expression. For example,
y dx - x dy is not a perfect differential; but, from (22.9),

$$\frac{1}{x^2}(y\,dx - x\,dy) = d\left(-\frac{y}{x}\right),$$

so the new expression is a perfect differential.

Perfect differential forms

Let y be an arbitrary function of x. Then P(x, y) dx + Q(x, y) dy
is a perfect differential if it can be written as
$$P(x, y)\,dx + Q(x, y)\,dy \equiv dF(x, y), \tag{22.11}$$
where F(x, y) is a fixed function of x and y.

Integrating factor for differential forms

A function I(x, y) is an integrating factor for the differential
form P dx + Q dy if I(P dx + Q dy) is a perfect differential.    (22.12)

It can be proved that every differential form has an integrating
factor, but only occasionally is it easy to see one.

Example 22.11. Find a family of solutions of $x\dfrac{dy}{dx} = y$.

(This is a linear equation, which is also separable, so we have two other
methods for solving it.) In differential form:

$$y\,dx - x\,dy = 0,$$

and we cannot do anything with the left-hand side as it stands. However, the
remark above suggests we multiply by $1/x^2$, obtaining

$$0 = (1/x^2)(y\,dx - x\,dy) = d(-y/x).$$

Therefore

$$-y/x = C, \quad\text{or}\quad y = -Cx,$$

are the solutions, as is easily confirmed.

There are other possibilities; for example (see (22.9)) we might divide by $y^2$
or xy. In the end these lead to the same set of solutions.

Note that, in Example 22.11, the equation is linear:

$$\frac{dy}{dx} - \frac{1}{x}\,y = 0,$$

and so we can alternatively use the method of Section 19.5. An
‘integrating factor’ I(x) is also used there: it is

$$I(x) = e^{-\int x^{-1}\,dx} = e^{-\ln x + C} = 1/x$$

(where we choose C = 0 for simplicity), which is different from,
though related to, the ones which work for the differential form
y dx - x dy above.

Example 22.12. Find a set of solutions of $x\dfrac{dy}{dx} = y + y^2x$.

Equivalently, $y\,dx - x\,dy + y^2x\,dx = 0$. The first two terms cannot be written
as dF(x, y). The table (22.9) offers three integrating factors, $x^{-2}$, $y^{-2}$, and
$(xy)^{-1}$, to choose from. It is, however, also necessary to be able to manage the
remaining term, $y^2x\,dx$, after multiplying by the integrating factor, so we choose
$y^{-2}$, which gives

$$0 = (1/y^2)(y\,dx - x\,dy) + x\,dx = d(x/y) + d(\tfrac12 x^2) = d(x/y + \tfrac12 x^2).$$

Therefore $x/y + \tfrac12 x^2 = C$, or $y = x/(C - \tfrac12 x^2)$, are solutions.
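
The family found in Example 22.12 can be tested by substituting it back into the equation. The sketch below does this numerically: it estimates dy/dx for y = x/(C - x^2/2) by a central difference and evaluates the residual x dy/dx - (y + y^2 x). The value C = 3 and the sample points are assumptions chosen so that the denominator C - x^2/2 stays away from zero.

```python
def y_sol(x, C):
    # Candidate solution from Example 22.12
    return x / (C - 0.5 * x**2)

def residual(x, C, h=1e-5):
    """x*dy/dx - (y + y**2 * x), with dy/dx from a central difference."""
    dydx = (y_sol(x + h, C) - y_sol(x - h, C)) / (2 * h)
    y = y_sol(x, C)
    return x * dydx - (y + y**2 * x)

C = 3.0  # assumed value of the arbitrary constant
for x in (0.5, 1.0, 2.0):
    print(x, residual(x, C))
```

Residuals close to zero indicate that the formula satisfies the differential equation at the sampled points; this is a spot check, not a proof.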

22.5 Change of variable in a differential equation


Occasionally we can find a change of variable, or substitution, which
will simplify a differential equation. Some general types are given
here for illustration.

Equations not involving y

If a differential equation contains
$$x,\ dy/dx,\ d^2y/dx^2,\ \ldots, \tag{22.13}$$
but not y, substitute the new dependent variable
w = dy/dx in place of y, producing an equation for w
of lower order.

To see how this works, consider the following example.

Example 22.13. Find solutions of $\dfrac{d^2y}{dx^2} + \left(\dfrac{dy}{dx}\right)^2 = 0$.

The variable y is not independently present, so put

$$w = \frac{dy}{dx}.$$

Then $d^2y/dx^2 = dw/dx$, so that the equation becomes

$$\frac{dw}{dx} + w^2 = 0.$$

This is a separable equation, and the method of Section 22.3 gives

$$-\int\frac{dw}{w^2} = \int dx, \quad\text{or}\quad \frac{1}{w} = x + A,$$

where A is constant, so that

$$w = \frac{1}{x + A}.$$

For the second stage, remember that w = dy/dx, so we have

$$\frac{dy}{dx} = \frac{1}{x + A}.$$

Therefore

$$y = \ln|x + A| + B,$$

where A and B are constants which we see, in retrospect, may be chosen entirely
arbitrarily.
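
The two-stage structure of Example 22.13 (first solve for w, then for y) can be mirrored numerically. The sketch below integrates the pair dw/dx = -w^2, dy/dx = w from x = 0 with a standard Runge-Kutta step and compares the result with w = 1/(x + A) and y = ln|x + A| + B. The constants A = 1 and B = 0.5, the end point and the step count are assumed values for illustration.

```python
import math

def integrate_two_stage(w0, y0, x_end, n):
    """RK4 for the system dw/dx = -w**2, dy/dx = w, starting at x = 0."""
    h = x_end / n
    w, y = w0, y0

    def rhs(w, y):
        return -w * w, w

    for _ in range(n):
        k1w, k1y = rhs(w, y)
        k2w, k2y = rhs(w + h * k1w / 2, y + h * k1y / 2)
        k3w, k3y = rhs(w + h * k2w / 2, y + h * k2y / 2)
        k4w, k4y = rhs(w + h * k3w, y + h * k3y)
        w += h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return w, y

A, B = 1.0, 0.5                      # assumed constants
w_start = 1.0 / A                    # w(0) = 1/(0 + A)
y_start = math.log(abs(A)) + B       # y(0) = ln|A| + B
w_num, y_num = integrate_two_stage(w_start, y_start, 2.0, 400)
print(w_num, 1.0 / (2.0 + A))
print(y_num, math.log(abs(2.0 + A)) + B)
```

Each printed pair should agree closely, showing that the initial values of w and y fix A and B respectively.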

Sometimes it is possible to change the dependent variable y into
something else to obtain a more manageable equation:

Equation of the form $\dfrac{dy}{dx} = f\left(\dfrac{y}{x}\right)$.
Change to a new dependent variable v by
v = y/x                                                      (22.14)
and solve the resulting separable equation.
(To make the change write y = xv, so that
dy/dx = x dv/dx + v.)

Example 22.14. Find solutions of $\dfrac{dy}{dx} = \dfrac{3y - x}{3x - y}$.

This equation can be written in the form

$$\frac{dy}{dx} = \frac{3y/x - 1}{3 - y/x},$$

which has the form f(y/x), so change the dependent variable from y to v = y/x.
To obtain dy/dx in terms of v, write y(x) = xv(x). Then dy/dx = x dv/dx + v,
and in terms of v the equation becomes

$$x\frac{dv}{dx} + v = \frac{3v - 1}{3 - v}, \quad\text{or}\quad x\frac{dv}{dx} = \frac{v^2 - 1}{3 - v}.$$

This new equation is separable (Section 22.3) (a separable equation will always
be obtained at this stage). Following (22.3):

$$\int\frac{3 - v}{v^2 - 1}\,dv = \int\frac{dx}{x}, \quad\text{or}\quad \int\left(\frac{1}{v - 1} - \frac{2}{v + 1}\right)dv = \int\frac{dx}{x}.$$

Therefore $\ln|(v - 1)/(v + 1)^2| = \ln|x| + C$, where C is an arbitrary constant.
After returning to y and simplifying, we have

$$(y - x)/(y + x)^2 = c, \tag{22.15}$$

where $c = \pm e^C$. The solution curves are shown in Fig. 22.11, plotted by working
directly from the differential equation and using a numerical method (see Section
22.2).

Fig. 22.11 Solution curves for dy/dx = (3y - x)/(3x - y). Notice the special solutions y = ±x.

The methods of separation of variables, differentials, substitutions,
etc. used to solve nonlinear equations are rather hazardous. In
Example 22.14 there are two solutions which do not appear in
(22.15), represented by the straight lines y = ±x in Fig. 22.11 and
corresponding to the limiting cases C → ±∞. Therefore (22.15) is
not a truly general solution. These extra singular solutions can be
found by trying the form y = mx in the equation and solving the
resulting quadratic for m.
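
For the equation of Example 22.14 the trial form y = mx can be worked through in a few lines. With y = mx we have dy/dx = m, and the right-hand side depends only on y/x = m, so

$$m = \frac{3m - 1}{3 - m} \quad\Longrightarrow\quad 3m - m^2 = 3m - 1 \quad\Longrightarrow\quad m^2 = 1 \quad\Longrightarrow\quad m = \pm 1.$$

The two roots give the straight-line solutions y = x and y = -x.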
Singular solutions are important in the theory of vibrations,
population problems, and other nonlinear fields. In general, they
represent limiting cases of the ordinary solutions; in this case, y = x
is an envelope of the ordinary solutions: that is, a curve that is
tangential to each ordinary solution. This was the case with the
singular solution y(x) = 0 of Example 22.7.
The following example illustrates another way of transforming a
differential equation. In a mechanical context, the transformation
of the derivative $d^2x/dt^2$ involved is called the energy transformation.

Example 22.15. The acceleration of a vehicle is constrained by its velocity:
$d^2x/dt^2 = Kv^{-2}$, where v = dx/dt and K is a constant. Find the velocity as a
function of distance if x(0) = 0 and v(0) = 0.

Transform the acceleration by the chain rule, using x as the intermediate
variable:

$$\frac{d^2x}{dt^2} = \frac{dv}{dt} = \frac{dv}{dx}\frac{dx}{dt} = v\frac{dv}{dx} = \frac12\frac{d(v^2)}{dx}.$$

The differential equation now relates v to x:

$$\frac12\frac{d(v^2)}{dx} = Kv^{-2}.$$

We could put $v^2 = u$ and separate, but for the sake of adventurousness we shall
leave $v^2$ as it is and turn the equation upside down:

$$2\frac{dx}{d(v^2)} = K^{-1}v^2.$$

Now integrate both sides, courageously using $v^2$ as the variable:

$$2x = K^{-1}\int v^2\,d(v^2) = \tfrac12 K^{-1}(v^2)^2 + C,$$

or

$$v^4 = 4Kx - 2CK.$$

The initial condition x(0) = v(0) = 0 then gives

$$v(x) = (4Kx)^{1/4}.$$
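
The more cautious route mentioned in Example 22.15, putting u = v^2, gives du/dx = 2K/u, and this can be integrated numerically as a check on v(x) = (4Kx)^(1/4). The sketch below is an illustration under assumptions: K = 2 is an arbitrary value, and the integration starts at x = 0.1 on the exact curve rather than at x = 0, because du/dx is unbounded where u = 0.

```python
import math

K = 2.0  # assumed value of the constant K, chosen only for illustration

def exact_v(x):
    # Closed form from Example 22.15
    return (4.0 * K * x) ** 0.25

def integrate_u(x_start, x_end, n):
    """RK4 for du/dx = 2K/u, where u = v**2, starting on the exact curve."""
    h = (x_end - x_start) / n
    u = exact_v(x_start) ** 2

    def f(u):
        return 2.0 * K / u

    for _ in range(n):
        k1 = f(u)
        k2 = f(u + h * k1 / 2)
        k3 = f(u + h * k2 / 2)
        k4 = f(u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return u

v_num = math.sqrt(integrate_u(0.1, 1.0, 2000))
print(v_num, exact_v(1.0))
```

The two printed speeds at x = 1 should agree closely.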

Problems

In these problems, y' means dy/dx. to use smaller intervals h over some sections than over
others.
22.1. Sketch a lineal-element diagram for the solution
(a) y' = y(x + l)/x(y + 1). This refers to Example
22.6, with the ‘solutions’ ye}’ = Axex, where A is an
curves of each of the following.
arbitrary constant.
(a) >•' = -y; (b) y' = x - y; (c) y' = x/y; (b) y' = 2y*. This refers to Example 22.7.
(d) y' = xy; (e) y' = -y/x; (f) y' = y/x; (c) y' = (y/x)* has solution curves in the first and
1 third quadrants. Sketch a lineal element diagram to
(g) y' = (x - l).v; (h)/ = —-2; obtain the broad pattern, then compute a few repre¬
x- + y
sentative curves. (The general solution is

(i) y' = i . 2 i (j) y' = (i - y2)*; |y| = (M* - C)* for |x|* > C. )
x + y — 1
(k) y' = (y/x)T Make sure that not all your curves 22.4. (Separation of variables). Obtain solutions of the
lie in the first quadrant. following equations.
(a) y' = x/y; (b) y' = 2x/y;
(c) y' = x/(y + 2); (d) y' = (x + 3)/(y + 2);
22.2. (Computational). Use the Euler method to com¬
(e) y' = x2/y2; (0 y' = -x2/y2\
pute approximate solutions to the following initial-
(g )y' = y2/x2\ (h )y'=—y2/x2;
value problems. Try various values of the step h. Compare
(i) 2xy' = y2; (j) yy' + x = 1;
the results with the exact solutions provided.
dx , . dx
(a) / = -iy with y = 1 at x = 0, over the range 0 ^ (k) — = 3f2x3; (1) (sin x) — = t;
x < 2. (The exact solution is y = e~*x.) dt df
(b) y' = — x/y with y = -1 at x = -1, over the range
- 1 <Cx< 1. (The exact solution is y= — (2 — x2)U (m) ex+J’ — = 1; (n) (1 + x2) — + (1 + y2) = 0,
Try to extend your results forward and backward dx dx
(using negative h) to the range — 2 < x 2.) with y(0) = — 1.
(C) y' = (l - y2)*, with y = 0 at x = 0, over the range
22.5. Show that the solution of the initial-value prob¬
0 x ^ jn. (The exact solution is y = sin x.)
lem

22.3. (Computational). Use the Euler method to cal¬ dy x


culate a few representative solution curves in the following dx y
cases. Each curve will have a different initial condition. where y = 1 when x = 2, is obtained from the equation
You should follow each curve forwards, and probably U
backwards as well (negative h), sufficiently far to get a u du = — v dm
clear idea of how it is behaving. It might be advantageous Ji

Generalize this technique to apply to the initial- dv v2 ,


(c) -- = — - (divide by y2);
value problem dx y - 1

dy
- = g(x)h(y) (d) d- = -— (divide by y2);
dx dx y2 - x
where y — b when x = a. dy = y(x2 + y2 - y)
(e) (divide by x2y2);
dx x(x2 + y2)
22.6. Solve the following equations and sketch the solu¬
tion curves. Take care to avoid spurious solutions as in dy y x3 — y
(0 (show that this reduces to
Example 22.7. Look out for solutions you might have dx x x3 + y
lost in the process: these are usually suggested by the
sketch. x3y2 d(x/y) = y d(xy);
dv , , dy
(a) x = 2yT; (b) — = xy% now put u = xy and v = x/y).
dx dx

22.9. The 'logistic equation’ dP/dt = aP — bP2, where


(c) j- = (1 - y2)+; (d) x = (1 - y2)1.
dx dx a > 0 and b > 0, represents the growth of a population
P(t) > 0 in which unrestricted growth is prevented by
the term — bP2 representing pressure on the means of
22.7. (Differential method). Obtain a family of solutions
subsistence. Solve the equation, sketch the solution
of the following equations. (Usually they must be left in
curves, and show that in all circumstances P(l) —>■ a/b
implicit form.)
as t tends to infinity.
d y 2x — y
(a) — =-- (check whether there is also a solution
dx x T 2y
22.10. (Computational). A population P(t) of protozoa
of the form y = rax);
is assumed to increase according to the equation dP/dt =
. dy y
(b) / = V—^ aP — bP2, where a and b are constants. Starting with 10
dx y —x
protozoa, they are observed to increase by 150% per day
dy x2 - y dy 2x - y
(c) — =-; (d) =--; while the numbers are still low, and to reach a fairly
dx x + y dx x — 2y steady level (dP/dt = 0) of 25,000 after a few weeks. Find
an approximation to a and b.
dy x — 2xy , dy 3x2
(e) - = —-4 Use a numerical method to compute the popula¬
dx x —y ( } dx ~ 3?~+~l' tion curve for the first 10 days.
dy 2xy Compare the curve obtained from the law dP/dt =
(g) — + —^— = 0 (this is also a linear equation); aP - bP4.
dx x — 1

dy 22.11. (See Example 22.15). (a) A falling body of mass


(h) (1 — sin y) — + cos x = 0;
dx m is subject to air resistance equal to Kv®, where v is
its speed of fall: its equation of motion is then
(i) (1 + 3 e3>') — = 2 e2* — 1;
dx
dt2 ' m \df/
(j) (e-t+r+ 1) ^ + (ex+)' - 1) = 0;
dx where g is the gravitational acceleration and x repre¬
sents its position measured vertically downwards. With¬
dy 1 + cos x sin y out solving the equation, show that the limiting speed of
(k)
dx 1 -f sin x cos y fall is equal to (mg/K)1/a.
(b) Substitute v = dx/dt to obtain an equation for
it of the form
22.8. (Differential method). Some of these need an inte¬
grating factor (see equation (22.12)) such as the ones dfy2)
suggested.
dt
dy y y — 2x
(a) — = — --(Check also for solutions of the form
dx x x - 2y (c) Assume that (in mks units) K = 4, m = 80, a = 1.2,
y = rax) g = 10, and that the mass is dropped from rest. Use
Euler’s method (22.2) to obtain t>2, and hence v, over a
dy EU -X2) 2
(b) —— (divide by x ); sufficient distance to compare with the limiting speed of
dx x(l+x2) fall.

22.12. (See Section 17.5). Solve the following (implicitly)
by putting y = xw.
(a) $dy/dx = (x^2 - xy + y^2)/xy$;
(b) $dy/dx + (x^2 + y^2)/xy = 0$;
(c) $dy/dx + (x - y)/(3x + y) = 0$;
(d) $dy/dx = 2xy/(3x^2 - 4y^2)$;
(e) $dy/dx + 2(2x^2 + y^2)/xy = 0$.

22.13. Show that, if the substitution w = yl~n is made


in the equation y' + g(x)y = h(x)yn (called the Bernoulli
equation), we obtain the linear equation w' +
(1 - n)g(x)w = (1 - n)h(x). Fig. 22.12
Use this result to solve (a) y' + y = y4,
(b) y’ + y = y“A values v = 1 m s ', f=4 ms"1, H = 30 m, compute
the path of the boat. (As you approach close to A, you
22.14. The equation will encounter a problem with dy/dx.)

d2y/Ax2 + (b/x) dy/dx + (c/x2)y = 0 22.16. (Computational). As in Problem 22.15, but con¬
is called equidimensional (the d.x2, x dx, and x2 in struct a differential equation for a stream having a
the denominators are considered to have the same parabolic distribution of velocity, greatest in the middle
dimensions). It is a linear equation with zero on the and zero at the banks, of the form
right, so we expect a general solution of the form v(x) = ax(H — x).
4yi(x) + By2(x), where A and B are arbitrary. Show Put in plausible values for V, v, H, and a, and
how a basis of solutions (y,(x), y2(x)) can be obtained compute the path.
in two ways:
(a) Look for solutions having the form y = xM, 22.17. A mouse M enters a room at O and rushes to
where M is an unknown constant. Note that M might its hole at H with speed v, pursued by the cat C, who
be complex: in that case, to obtain real solutions, use
^„a + jP _ g(a + ]fi) In x _ In * ^j/J In *

= xa[cos()3 In x) + j sin(y5 In x)],

as in Section 16.5.
(b) Change the independent variable to f, where
f = In x, or x = e'. The new equation has constant
coefficients.
(c) Use either method to find the solutions of the
equations

(i) $d^2y/dx^2 - (2/x)\,dy/dx + (2/x^2)y = 0$;
(ii) $d^2y/dx^2 - (1/x)\,dy/dx + (1/x^2)y = 0$;
(iii) $d^2y/dx^2 + (3/x)\,dy/dx + (2/x^2)y = 0$.

Fig. 22.13
22.15. (Computational). A boat enters a river at O, and
tries to reach the point A on the other bank, directly
starts from B at the same moment as the mouse
opposite O and distant H from O, by keeping its nose
pointed towards A, at an angle 6 from OA (see Fig. appears (see Fig. 22.13). The cat runs with speed
22.12). The speed of the boat in still water is V, and the V > v, always directly towards the mouse. Show that
uniform stream speed is v < V. dr/dr = v cos 0 — V and d0/dr = — (v sin 0)/r,
Show that, when the boat is at (x, y),
where r and 0 are polar coordinates for the cat relative
dx/dt = V cos 6, Ay/At = v — F sin 9 to the (moving) mouse. Construct a differential equa¬
where t is time. By dividing one equation by the other tion for r in terms of 9, and solve it (it is really only
find a differential equation for y in terms of x. Given the a question of integration).
Nonlinear differential equations and the phase plane

Contents
23.1 Autonomous second-order equations
23.2 Constructing a phase diagram for $(x, \dot{x})$
23.3 $(x, \dot{x})$ phase diagrams for other linear equations; stability
23.4 The pendulum equation
23.5 The general phase plane
23.6 Approximate linearization
23.7 Limit cycles
23.8 A numerical method for $\dot{x} = P$, $\dot{y} = Q$
Problems

However many methods may be invented for solving differential
equations, there will always remain equations beyond their scope.
But this does not mean that nothing can be done with them. The
important van der Pol equation, which models a type of electrical
oscillator:

$$\frac{d^2x}{dt^2} + c(x^2 - 1)\frac{dx}{dt} + x = 0,$$

where c > 0, cannot be solved explicitly. However, there are still
comparatively simple methods which enable us to demonstrate its
really important feature, which is that every solution, no matter how
the device is started off, settles down into the same regular periodic
oscillation.
Techniques enabling such conclusions to be drawn without
actually solving the equation are called qualitative methods. This
chapter outlines a way of looking at differential equations which is
at the basis of many of these techniques. Qualitative methods do
not consist of a collection of fixed results, and tend to be exploratory.
Therefore computation is important. In the final section, a simple
computing method is described which is easy to program but is
effective enough to analyse realistic physical and biological models.
We shall take t (time) as the independent variable. For derivatives
with respect to time, we use the conventional dot notation (just
like the dash notation (4.1)):

$$\dot{x} = \frac{dx}{dt}, \qquad \ddot{x} = \frac{d^2x}{dt^2}.$$

23.1 Autonomous second-order equations


Let the independent variable be t and the dependent variable x. We
shall only discuss equations which can be written in the form
$$\ddot{x}(t) = Q(x(t), \dot{x}(t)),$$
in which t does not appear independently under Q on the right-hand
side. Such equations are called autonomous. For example, the
equation $\ddot{x} - x\dot{x} + 1 = 0$ is autonomous, but $t\ddot{x} - x\dot{x} + 1 = 0$ is not
autonomous. If startup conditions are supplied so as to specify an
initial-value problem:
$$\ddot{x} = Q(x, \dot{x}), \quad x(t_0) = x_0, \quad \dot{x}(t_0) = y_0 \tag{23.1}$$
(there is a special reason for using the symbol $y_0$ here), we expect
that the initial conditions will select exactly one solution.
Suppose that the equation represents an electrical system, and
that a graph of x against t for $t \ge t_0$ can be plotted automatically.
If we find the clock in the plotter has been wrongly set, then it will
not make the graph unusable; only its starting time $t_0$ will be wrong.
Similarly, if we do one experiment starting at t = 8.00 hr and repeat
it at t = 13.00 hr, the graphs plotted will be the same shape although
the starting times are different (see Fig. 23.1). Intuition suggests that
for autonomous equations, namely those in which t does not occur
independently, it will not be an arbitrary clock time $t_0$ assigned to
startup that counts, but the time elapsed from startup, $t - t_0$.

Fig. 23.1 Two experiments, with a device described by an autonomous
equation, which start at different times.

This intuition is correct. The mathematical reason is that a change
of time scale from t to $t - t_0$ does not change the form of the
differential equation, so the same phenomena follow. Put
$$T = t - t_0, \quad\text{and write}\quad x(t) = X(T).$$
Then $dX/dT = dx/dt$ and $d^2X/dT^2 = d^2x/dt^2$. Also $t = t_0$ becomes
T = 0, so that the new initial-value problem is
$$\ddot{X} = Q(X, \dot{X}), \quad X(0) = x_0, \quad \dot{X}(0) = y_0. \tag{23.2}$$
The equation is unchanged, but the starting time is assigned the
value zero. Suppose (23.2) is solved in terms of T. Restore t by
putting $T = t - t_0$. The solution x(t) of (23.1) is then a function only
of $t - t_0$, so it depends only on the time elapsed from startup.

Example 23.1. Solve the initial-value problem $\ddot{x} + \omega^2 x = 0$, with $x(t_0) = x_0$
and $\dot{x}(t_0) = y_0$.

The general solution is $x(t) = A\cos\omega t + B\sin\omega t$. The process of finding A and
B from the equations obtained by substituting the expressions for x(t) and $\dot{x}(t)$
into the initial conditions is quite complicated (try it). Instead, put
$$T = t - t_0, \quad x(t) = X(T).$$
Then
$$\ddot{X} + \omega^2 X = 0, \quad X(0) = x_0, \quad \dot{X}(0) = y_0.$$
The solution of this system is simple:
$$X(T) = x_0\cos\omega T + \omega^{-1}y_0\sin\omega T.$$
Put $T = t - t_0$; then the required solution is
$$x(t) = x_0\cos\omega(t - t_0) + \omega^{-1}y_0\sin\omega(t - t_0),$$
that is to say, x is a function only of the elapsed time $t - t_0$.
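
The dependence on elapsed time alone can be seen directly by coding the solution of Example 23.1. In the sketch below the function names and the numerical values (x0 = 0.5, y0 = -0.2, frequency 2) are assumptions for illustration; the two start times 8.00 and 13.00 echo the two experiments described earlier.

```python
import math

def x_elapsed(t, t0, x0, y0, omega):
    """Position x(t) for x'' + omega**2 x = 0 with x(t0) = x0, x'(t0) = y0."""
    T = t - t0  # only the elapsed time enters the formula
    return x0 * math.cos(omega * T) + (y0 / omega) * math.sin(omega * T)

def v_elapsed(t, t0, x0, y0, omega):
    """Velocity x'(t), obtained by differentiating x_elapsed with respect to t."""
    T = t - t0
    return -omega * x0 * math.sin(omega * T) + y0 * math.cos(omega * T)

# Two runs with the same initial state but different clock times at startup.
run_a = x_elapsed(8.0 + 1.3, 8.0, 0.5, -0.2, 2.0)
run_b = x_elapsed(13.0 + 1.3, 13.0, 0.5, -0.2, 2.0)
print(run_a, run_b)
```

Both runs are sampled 1.3 time units after startup, so the printed positions should coincide up to rounding.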

Autonomous second-order equations


The solution of the initial-value problem
$$\ddot{x} = Q(x, \dot{x}) \tag{23.3}$$
with $x(t_0) = x_0$ and $\dot{x}(t_0) = y_0$ is a function only of elapsed
time $t - t_0$.

23.2 Constructing a phase diagram for $(x, \dot{x})$


Consider again the initial-value problem of Example 23.1:
$$\ddot{x} + \omega^2 x = 0, \quad x(t_0) = x_0, \quad \dot{x}(t_0) = y_0. \tag{23.4}$$
This problem could arise in connection with a mass oscillating on
a spring. The initial conditions imply that the position and velocity
are prescribed at the start, $t = t_0$. If asked what the system was doing
at $t = t_0$, specification of the position and velocity seems to constitute
an adequate description of its state. It is in fact a perfect description,
since it is exactly what is required to determine the whole future of
the system. It is therefore reasonable to call the pair of numbers

$$(x_0, y_0)$$

the state of the system at $t_0$.
Subsequently the system moves smoothly through a succession of
states: x and $\dot{x}$ will vary in time. Catch the system at any moment
$t_1$; then the state $(x(t_1), \dot{x}(t_1))$ serves as fresh initial conditions for
all the subsequent motion, but there can never be any conflict with
what was predicted from the original initial conditions. It is the
succession of states which is the subject of this chapter; the precise
time that the states occur takes a secondary place.
To track the succession of states $(x(t), \dot{x}(t))$ for the initial-value
problem (23.4), we could in principle begin by finding its solution.
The solution (Example 23.1) is
$$x(t) = x_0\cos\omega t + \omega^{-1}y_0\sin\omega t,$$
so
$$\dot{x}(t) = -\omega x_0\sin\omega t + y_0\cos\omega t.$$

In effect, these equations specify the states parametrically (with
parameter t). However, the expressions do not clearly reveal the
association of x with $\dot{x}$, which is what we actually observe from
moment to moment. Moreover, we need a method for equations we
cannot solve.
We shall take a different route to the states $(x, \dot{x})$ which does not
require that we solve the differential equation. Write

$$\dot{x} = y. \tag{23.5a}$$

Since $\ddot{x} = -\omega^2 x$, and $\ddot{x} = (d/dt)\dot{x}$, we have

$$\dot{y} = -\omega^2 x. \tag{23.5b}$$

These two simultaneous first-order differential equations are equivalent
to the second-order differential equation (23.4). Divide (23.5b)
by (23.5a), and use (3.8):

$$\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} = -\frac{\omega^2 x}{y}. \tag{23.6a}$$

Time has disappeared from the problem, and we now have a single
first-order equation connecting x and y (i.e. connecting x and $\dot{x}$).
Separate the variables in (23.6a), and we obtain

$$y\,dy = -\omega^2 x\,dx,$$

or
$$\omega^2 x^2 + y^2 = C, \tag{23.6b}$$
where C is a positive, but otherwise arbitrary, constant.
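
The relation (23.6b) says that the quantity formed from the squares of x and y stays fixed as the state moves. The sketch below checks this by stepping the pair of first-order equations forward with a standard Runge-Kutta step and recording how far that quantity drifts. The frequency 2, the step 0.01 and the starting state (1, 0.5) are assumed values; the function names are hypothetical.

```python
def rk4_step(x, y, h, omega):
    """One RK4 step for the pair x' = y, y' = -omega**2 * x."""
    def rhs(x, y):
        return y, -omega**2 * x

    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + h * k1x / 2, y + h * k1y / 2)
    k3x, k3y = rhs(x + h * k2x / 2, y + h * k2y / 2)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

def ellipse_constant(x, y, omega):
    # Left-hand side of (23.6b)
    return omega**2 * x**2 + y**2

omega, h = 2.0, 0.01          # assumed frequency and time step
x, y = 1.0, 0.5               # assumed initial state
c_start = ellipse_constant(x, y, omega)
max_drift = 0.0
for _ in range(1000):
    x, y = rk4_step(x, y, h, omega)
    max_drift = max(max_drift, abs(ellipse_constant(x, y, omega) - c_start))
print(c_start, max_drift)
```

A very small drift indicates that the computed states stay on a single ellipse of the family.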
The motive for introducing y as the symbol for $\dot{x}$ now becomes
clear. Set up a pair of axes x and y as in Fig. 23.2. This framework
is called the $(x, \dot{x})$ phase plane. The solution (23.6b) represents the
family of ellipses displayed in the figure, and is called an $(x, \dot{x})$ phase
diagram for the differential equation $\ddot{x} + \omega^2 x = 0$.

Fig. 23.2 $(x, \dot{x})$ phase diagram for $\ddot{x} + \omega^2 x = 0$, displaying $y = \dot{x}$ against x.

Any point on the diagram, say A : $(x_0, y_0)$, represents an initial
state. If we follow the curve passing through A, we obtain the
sequence of states for the corresponding solution. Equation (23.6a)
alone does not tell us which way to travel along the curve; for this
purpose, we momentarily resurrect t. The arrows indicate the
directions that correspond to time going forwards rather than
backwards. We defined y by

$$y = \dot{x} = \frac{dx}{dt}.$$

If we are in the upper half plane, then y > 0, so dx/dt is positive.
Therefore x(t) is increasing, and the directive arrow points from
left to right. By a similar argument with y < 0 we find that in the
lower half plane the arrow points from right to left. We must
follow the arrow. Supplied with arrows, the state curves are called
phase paths, or trajectories, or orbits for the differential equation
$\ddot{x} + \omega^2 x = 0$.
Starting from A : (x0, y0), follow the phase path. In going round,
we can pick out a new feature in passing: that $\dot{x}$ is zero when
x is at a minimum or maximum, and vice versa. Eventually we get
back to A, renewing the initial state. Continue to follow the path
around, duplicating the first circuit; the succession of states is
repeated time after time.
This repetition does not itself establish that this is a truly periodic
process. When we meet A again at the end of the first circuit, it is
at a later time, $t_1$ say, so the initial conditions for the original
equation are to this extent changed. Even though the system must
follow the same path, perhaps it takes twice as long to go round
the second time. However, from the discussion in Section 23.1, the
time to complete any circuit, or to go repeatedly between any two
fixed points on the circuit, is invariable because the equation is
autonomous. This argument does not depend on what equation we
started with, so we can say in general: any closed phase path
represents a periodic oscillation.
Finally notice in Fig. 23.2 the bullet at the origin. This point
represents a true solution, namely

x(t) = 0, y(t) = 0.

It is a special case of an equilibrium point, meaning a constant
solution

$$x(t) = k, \quad \dot{x}(t) = 0,$$

where k is a constant. Equilibrium points are of great importance
in phase diagrams. A zero solution does not mean the same as no
solution. If the constant k is zero, this solution is called ‘trivial’; but
this does not lessen its significance. An equilibrium point surrounded
by closed curves is called a centre. It represents periodic oscillations
about equilibrium. (Note, however, that oscillations of different
amplitudes do not usually have the same period.)
Since we chose a simple case, we have not discovered anything
we did not know already, so consider the following example.

Example 23.2. Sketch an $(x, \dot{x})$ phase plane for the equation $\ddot{x} + cx^3 = 0$,
where c > 0.

This represents small lateral oscillations x(t) of a mass attached to the middle
of an elastic string that is fixed at the ends and is unextended when x = 0. It is
the same as Example 20.5, with l = L, and c = 4s/ml. We can regard explicit
solutions as being unobtainable.
Put $y = \dot{x}$. Then $\ddot{x} = \dot{y}$, and we have two first-order equations, together
equivalent to the original equation:

$$\dot{x} = y, \quad \dot{y} = -cx^3.$$

Therefore

$$\frac{dy}{dx} = \frac{dy}{dt}\bigg/\frac{dx}{dt} = -\frac{cx^3}{y}.$$

By separating the variables, we obtain

$$y\,dy = -cx^3\,dx,$$

so that $\tfrac12 y^2 = -\tfrac14 cx^4 + C$, where C is an arbitrary constant. Therefore

$$y = \pm(c/2)^{1/2}(A - x^4)^{1/2},$$

where A is arbitrary. This is the equation for the phase paths shown in Fig. 23.3.

The phase diagram of Fig. 23.3 consists entirely of closed curves;
so, by exactly the same reasoning as in Example 23.1, we can deduce
that every solution of the differential equation is a periodic oscillation.
(However, we cannot say that they all have the same period: in fact
they do not.) This phase diagram has therefore revealed an important
fact about an equation that we could not solve.
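
The remark that the periods differ can be explored numerically. The following sketch estimates the period of the oscillation of Example 23.2 for two amplitudes by stepping the pair of first-order equations from the state (amplitude, 0) until the velocity y returns to zero from below, which marks half a circuit. The value c = 1, the step size and the function name are assumptions; the amplitude must be positive for the crossing test to mark a half circuit.

```python
def period(amplitude, c=1.0, h=1e-3):
    """Estimate the period of x'' + c x**3 = 0 started from (amplitude, 0).

    Requires amplitude > 0. Uses RK4 steps and linear interpolation of the
    instant at which y = x' climbs back through zero (half a circuit).
    """
    def rhs(x, y):
        return y, -c * x**3

    x, y, t = amplitude, 0.0, 0.0
    while True:
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + h * k1x / 2, y + h * k1y / 2)
        k3x, k3y = rhs(x + h * k2x / 2, y + h * k2y / 2)
        k4x, k4y = rhs(x + h * k3x, y + h * k3y)
        x_new = x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y_new = y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        t += h
        if y < 0.0 and y_new >= 0.0:
            # y crossed zero upward between t - h and t: half a circuit done
            t_cross = t - h * y_new / (y_new - y)
            return 2.0 * t_cross
        x, y = x_new, y_new

period_small = period(1.0)
period_large = period(2.0)
print(period_small, period_large)
```

For this equation the larger oscillation should come out with the shorter period, in contrast with the linear oscillator of Example 23.1, whose period is the same for every amplitude.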

23.3 $(x, \dot{x})$ phase diagrams for other linear equations; stability
For the moment, we shall stay with equations that are familiar from
Chapter 16.

Example 23.3. Construct the phase diagram for $\ddot{x} - \omega^2 x = 0$.

Put $\dot{x} = y$; then $\dot{y} = \omega^2 x$. To eliminate t, form

$$\frac{\dot{y}}{\dot{x}} = \frac{dy}{dx} = \frac{\omega^2 x}{y}.$$

Separate the variables:

$$y\,dy = \omega^2 x\,dx,$$

or
$$y^2 - \omega^2 x^2 = A,$$
where A is arbitrary. This represents the family of hyperbolas in Fig. 23.4, having
asymptotes $y = \pm\omega x$. The directions of the arrows follow the rule in Section
23.2: left to right in the upper half plane.

Differential equations usually describe the behaviour of some
circuit, machine, ecosystem, or something else in the real world, and
our concern is with interpreting the phase diagrams in such a way
as to bring to light features of practical importance. One important
question is that of stability of equilibrium in a system. In our
phase diagrams, the question turns into that of the stability of an
equilibrium point, such as the origin in Examples 23.1 to 23.3.
In practice, systems are always subject to small external disturbances
and internal fluctuations. For a system to work, it is important
that small causes give small effects, and that the effects do not
commence to grow catastrophically. In that case, a system is said to
be stable with respect to small disturbances; otherwise it is unstable.
The precise criterion for tolerable behaviour will depend on what
we require from the particular system.
An equilibrium point surrounded by a structure of curves resembling
those shown in Fig. 23.4 is called a saddle. It could hardly
be classed as anything but an unstable equilibrium point. Apart
from two special directions, if equilibrium is disturbed - even by a
hairsbreadth - the system will find itself on one of the hyperbolas,
and so it will be swept further and further away from equilibrium.
On the other hand a centre, exemplified in Figs 23.2 and 23.3,
would often be called a stable equilibrium point. If equilibrium
is disturbed by a small amount, the system does not go wild; it
simply oscillates around its equilibrium position. However, a vehicle
which behaved like that would be regarded as very unstable. Subject
to a continual battering, it would vibrate objectionably, and the
vibrations would never die away; therefore the stronger sort of
stability illustrated in the following two examples is preferable.

Example 23.4. Construct a phase diagram for x + + x = 0.


The equivalent pair of first-order equations is
x = y, y = -\y - x. (•)
Therefore

dx y

This is hard to solve. To produce a phase diagram, we could solve the original
equation for x(t) by the methods of Chapter 18, and then obtain $y = \dot{x}(t)$. These
are parametric equations for (x, y) curves. However, this would not be in the
spirit of this chapter, because almost never are we able to solve the original
equation. Instead, the Euler method of Section 22.2 can be used to obtain
solution curves for (ii). (As we shall see later, it is easier to work from (i) using
Section 23.8.) We obtain the pattern of spiral curves surrounding the origin
shown in Fig. 23.5.

Example 23.4 is a case of a linear oscillator with small damping,


discussed in Section 18.4. The phase paths show that, from any
starting point, the origin is approached via a sequence of diminishing
spirals. Therefore any initial disturbance from equilibrium dies away.
The equilibrium point is called a stable spiral. For the equation
x — + x = 0,
the pattern of curves gives an outgoing unstable spiral.

Example 23.5. Construct a phase diagram for $2\ddot{x} + 7\dot{x} + 3x = 0$.

This is a case of heavy damping. We obtain
\[ \dot{x} = y, \qquad \dot{y} = -\tfrac{7}{2}y - \tfrac{3}{2}x, \]
or
\[ \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{-\tfrac{7}{2}y - \tfrac{3}{2}x}{y}. \]
The phase paths, calculated numerically, are as shown in Fig. 23.6.

In Fig. 23.6, the origin is called a stable node. All solutions fall
straight into the origin without any oscillations: the system is
deadbeat. Notice the structure of the node. There are two straight-line
solutions to the equation
\[ \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{-\tfrac{7}{2}y - \tfrac{3}{2}x}{y}, \]
which can be found by trying for solutions of the form y = mx.
Then $\mathrm{d}y/\mathrm{d}x = m = (-\tfrac{7}{2}m - \tfrac{3}{2})/m$, or $2m^2 + 7m + 3 = 0$. Therefore
$m = -\tfrac{1}{2}$ or $m = -3$, and the two linear solutions are $y = -\tfrac{1}{2}x$,
$y = -3x$. They divide the plane into four sectors which contain
curved phase paths. Each of the curves has the property that it is
tangential to $y = -\tfrac{1}{2}x$ at the origin, and parallel to $y = -3x$ at
infinity. This behaviour is characteristic of nodes arising from linear
equations, and the mutual tangency at the origin is common to all
nodes, even those arising from nonlinear equations.
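The search for straight-line paths y = mx can be checked numerically. The following is a minimal sketch for Example 23.5 only; the function names `straight_line_slopes` and `slope_on_path` are illustrative and do not come from the text.

```python
import math

def straight_line_slopes(p, q, r):
    """Return the real roots m of p*m**2 + q*m + r = 0, smallest first."""
    disc = q * q - 4 * p * r
    if disc < 0:
        return []  # no real roots, so no straight-line paths
    root = math.sqrt(disc)
    return sorted([(-q - root) / (2 * p), (-q + root) / (2 * p)])

def slope_on_path(x, y):
    """dy/dx for Example 23.5: (-(7/2) y - (3/2) x) / y."""
    return (-3.5 * y - 1.5 * x) / y

# Slopes of the two straight-line paths, from 2 m^2 + 7 m + 3 = 0.
slopes = straight_line_slopes(2, 7, 3)
for m in slopes:
    # On the line y = m x the slope of the path must equal m itself.
    print(m, slope_on_path(1.0, m))
```

Each printed pair should agree, which confirms that a path starting on one of these lines stays on it.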
The technique for second-order differential equations can be
summed up as follows.

$(x, \dot{x})$ phase plane for $\ddot{x} = Q(x, \dot{x})$    (23.8)

(a) Phase path equations: $\dot{x} = y$, $\dot{y} = Q(x, y)$.
(b) The direction of a phase path (x, y) is left to right
if y > 0, and right to left if y < 0.
(c) Equilibrium points (constant solutions) are at (x, 0),
where x is any solution of Q(x, 0) = 0.
(d) Alternative equation: $\mathrm{d}y/\mathrm{d}x = Q(x, y)/y$. (Shows
that phase paths meet only at equilibrium points or
other points where Q(x, y)/y is undefined, and that
paths cross the x axis at right angles.)

23.4 The pendulum equation


The equation for a pendulum (Fig. 23.7) consisting of a light rod
AB of length l freely pivoted at A and carrying a mass at B is
\[ \ddot{x} + \omega^2 \sin x = 0, \]
where x is the angle of inclination and $\omega^2 = g/l$; here g is the
gravitational acceleration. This equation can be solved, but only
with difficulty, and by using recondite functions. From (23.7), the
equations for the phase paths are
\[ \dot{x} = y, \qquad \dot{y} = -\omega^2 \sin x. \tag{23.9} \]
The equilibrium points solve the equation sin x = 0, so
\[ x = 0, \pm\pi, \pm 2\pi, \ldots, \quad \text{with } y = 0. \tag{23.10a} \]
If
\[ x = 0, \pm 2\pi, \pm 4\pi, \ldots, \tag{23.10b} \]
the pendulum is hanging vertically from its pivot in equilibrium.
These values of x all represent the same observed state, though on
the phase plane they correspond to different points. Similarly the
values
\[ x = \pm\pi, \pm 3\pi, \ldots, \tag{23.10c} \]
represent a state which is not usually thought of in connection with
a pendulum: the pendulum rod is perched vertically upwards (and
insecurely) on its pivot A.
Consider x = 0 as being representative of the freely-hanging state,
the equilibrium points (23.10b). See what happens when the displacement
from x = 0 is small, but the pendulum is no longer in
equilibrium. We can then put
\[ \sin x \approx x. \]
The original equation becomes $\ddot{x} + \omega^2 x = 0$ (approximately), with
solutions $x = C\cos(\omega t + \phi)$, where C (small) and $\phi$ are arbitrary.
This is the familiar condition of small, isochronous oscillations. We
have already solved the same problem for the phase plane in Section
23.2 and we found a centre: the family of ellipses shown in Fig. 23.2:
\[ \omega^2 x^2 + y^2 = C, \]
where C is an arbitrary non-negative constant. This family will be
repeated (for small C) around $x = \pm 2\pi, \pm 4\pi, \ldots$ in a progressively
developing phase diagram; see Fig. 23.8 below.

Next, consider the case when the pendulum stands vertically: we
choose as representative of (23.10c) the case $x = \pi$. To find what
happens when the state is slightly displaced from the point $(\pi, 0)$, put
\[ x = \pi + X, \]
where the new variable X is going to be small. Then
\[ \sin x = \sin(\pi + X) = \sin\pi\cos X + \cos\pi\sin X = -\sin X \approx -X. \]
From (23.8), with $-X$ instead of sin x, the (approximate) equation for
the displacement X from the equilibrium point is
\[ \ddot{X} - \omega^2 X = 0. \]
From Example 23.3, this is equivalent to a saddle point on the phase
plane at X = 0 (i.e. at $x = \pi$), and all the other equilibrium points
in (23.10c) will have the identical structure, which implies instability.
The state of affairs around the equilibrium points is shown in Fig.
23.8. The rest of the phase diagram could be computed from (23.9),
but in this case it is not difficult to sketch it in its entirety as
in Fig. 23.9. From (23.9),
\[ \frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{\omega^2 \sin x}{y}. \]

By separating the variables, we obtain the equations of the paths,
which can be written in the form
\[ y = \pm\sqrt{2}\,\omega(\cos x - A)^{1/2}, \tag{23.11} \]
where A is (to an extent) arbitrary. Since cos x has period $2\pi$, the
repetitious nature of Fig. 23.8 is explained. Notice that $(\cos x - A)^{1/2}$
is real only when $\cos x \ge A$. Therefore $A \le 1$. With that limitation,
there are two main ranges of A which give significantly different
patterns of (x, y) curves: $-1 \le A \le 1$ and $A < -1$. The centres
correspond to A = 1, and the special curves joining the saddles,
called the separatrices, correspond to $A = -1$. Notice the regular
whirling motions which occur if $y = \dot{x}$ is large enough.

Fig. 23.9 Phase diagram $(x, \dot{x})$ for the pendulum equation
$\ddot{x} + \omega^2 \sin x = 0$, given by $y = \pm\sqrt{2}\,\omega(\cos x - A)^{1/2}$ with
$A \le 1$. The figure extends with period $2\pi$. The undulating
curves (for $A < -1$) represent a whirling motion. The
separatrices correspond to $A = -1$. There are centres at
$x = 0, \pm 2\pi, \ldots$, and saddles at $x = \pm\pi, \pm 3\pi, \ldots$.
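The role of the constant A in (23.11) can be illustrated with a short computation. This is a sketch under stated assumptions: the value $\omega = 1$, the function names, and the use of a fourth-order Runge-Kutta step (rather than any method from the text) are all choices made here for illustration.

```python
import math

OMEGA = 1.0  # assumed value of omega, chosen only for illustration

def path_constant(x, y, omega=OMEGA):
    """The constant A in y**2 = 2*omega**2*(cos x - A), from (23.11)."""
    return math.cos(x) - y * y / (2.0 * omega ** 2)

def motion_type(x, y, omega=OMEGA, tol=1e-12):
    """Classify the path through (x, y) by the value of A."""
    a = path_constant(x, y, omega)
    if abs(a + 1.0) <= tol:
        return "separatrix"
    return "oscillation" if a > -1.0 else "whirling"

def rk4_step(x, y, h, omega=OMEGA):
    """One Runge-Kutta step for x' = y, y' = -omega**2 sin x."""
    def f(x, y):
        return y, -omega ** 2 * math.sin(x)
    k1 = f(x, y)
    k2 = f(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = f(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = f(x + h * k3[0], y + h * k3[1])
    return (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

# A should stay (very nearly) constant along a computed path.
x, y = 0.5, 0.0
a_start = path_constant(x, y)
for _ in range(1000):
    x, y = rk4_step(x, y, 0.01)
drift = abs(path_constant(x, y) - a_start)
print(motion_type(0.0, 1.0), motion_type(0.0, 2.0), motion_type(0.0, 2.5), drift)
```

With $\omega = 1$, a launch speed y = 2 from x = 0 gives $A = -1$ exactly, so it lies on a separatrix; smaller speeds oscillate and larger ones whirl.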

23.5 The general phase plane


There exists a great field of problems which, right from the start,
take the form of simultaneous first-order differential equations:

\[ \dot{x} = P(x, y), \qquad \dot{y} = Q(x, y). \tag{23.12} \]

Example 23.6. A community of foxes and rabbits lives in uneasy harmony on
an island. The rabbit population is x(t), and they eat grass. The fox population is
y(t); they eat rabbits. Construct a differential equation model for the population
variation.

In a short time $\delta t$ there is a rabbit population increase $ax\,\delta t$ (a > 0) due to
births and natural deaths, and a decrease $bxy\,\delta t$ due to meetings with foxes,
the frequency of which we suppose to be jointly proportional to the population
densities of these animals. The net change in time $\delta t$ is therefore
\[ \delta x = ax\,\delta t - bxy\,\delta t. \]
Divide by $\delta t$ and let $\delta t \to 0$; we obtain
\[ \frac{\mathrm{d}x}{\mathrm{d}t} = ax - bxy. \tag{i} \]
For the foxes, assume that a shortage of prey causes a death rate c from
starvation, offset by a fecundity factor $d\,xy$ among those who get something to
eat. Then, in time $\delta t$,
\[ \delta y = -cy\,\delta t + d\,xy\,\delta t. \]
Divide by $\delta t$ and let $\delta t \to 0$:
\[ \frac{\mathrm{d}y}{\mathrm{d}t} = -cy + d\,xy. \tag{ii} \]
(i) and (ii) form a simultaneous (nonlinear) first-order system:
\[ \dot{x} = ax - bxy, \qquad \dot{y} = -cy + d\,xy, \tag{23.13} \]
with a, b, c, d > 0.
We shall now show how important characteristics of the solutions
x(t) and y(t) of the equations (23.13) resulting from this example
can be revealed on a general phase plane by plotting y against x.
(Notice that $\dot{x}$ is no longer equal to y, so this is not the same as the
$(x, \dot{x})$ phase plane that we had before.) The pair of values (x, y) will
be called a state. Although we do not prove it, the values of x and y
at a particular time $t_0$ constitute initial conditions determining the solution
for all subsequent time $t > t_0$, and because the equations are
autonomous, the solutions are functions only of $t - t_0$.
Firstly, look for any constant solutions (x = constant, y = constant).
These must satisfy the differential equations: that is to say
\[ 0 = x(a - by) \quad \text{and} \quad 0 = -y(c - dx). \]
Therefore (x, y) = (0, 0) and (x, y) = (c/d, a/b) are the constant
solutions: on the (x, y) phase plane, they are the equilibrium points.
As before, divide the two equations, obtaining
\[ \frac{\mathrm{d}y}{\mathrm{d}t} \Big/ \frac{\mathrm{d}x}{\mathrm{d}t} = \frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{(c - dx)y}{(a - by)x}. \tag{23.14} \]
This is a separable equation; after separation it becomes
\[ \left(\frac{a}{y} - b\right)\mathrm{d}y = -\left(\frac{c}{x} - d\right)\mathrm{d}x, \]
or
\[ a \ln y - by + c \ln x - dx = C. \tag{23.15} \]

Equation (23.15) represents the closed curves shown in Fig. 23.10.


It is possible to see in advance that they are closed: the reason will be
given shortly.
The direction arrows on the figure do not obey the rule (23.8b)
for the (x, x) phase plane. Each case has to be treated separately.
The principle is easy: we have to find the direction at a single
point, and the directions elsewhere are settled by continuity of
direction - we expect adjacent curves to have the same direction.
We might take the point M : (0, m) in Fig. 23.10. At this point, the
second equation of (23.13) gives y = —cm < 0, so y is decreasing at
M. Once this direction is settled, the directions on the other curves
follow by continuity.
There is a centre at the equilibrium point E : (c/d, a/b). If the
rabbit/fox populations take the values at E, the equations predict
that the state will be permanent. A bad season for grass, or a disease
amongst the foxes, will put the population state somewhere else, and
thereafter the populations will undergo periodic oscillations. If foxes
feast and thrive, rabbits languish, eventually starving the foxes;
therefore rabbits prosper again; and so on.
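A computed path of (23.13) should stay on one of the curves (23.15), so the left-hand side of (23.15) can serve as a check on a numerical solution. The sketch below assumes the parameter values a = 1, b = 0.5, c = 1, d = 0.25 (not from the text) and uses a fourth-order Runge-Kutta step as a stand-in for whatever path-plotting program is available.

```python
import math

# Assumed illustrative values of a, b, c, d; the centre is at (c/d, a/b) = (4, 2).
A, B, C, D = 1.0, 0.5, 1.0, 0.25

def rates(x, y):
    """Right-hand sides of (23.13)."""
    return A * x - B * x * y, -C * y + D * x * y

def invariant(x, y):
    """Left-hand side of (23.15); constant along each phase path."""
    return A * math.log(y) - B * y + C * math.log(x) - D * x

def rk4_step(x, y, h):
    """One Runge-Kutta step for the system defined by rates()."""
    k1 = rates(x, y)
    k2 = rates(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = rates(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = rates(x + h * k3[0], y + h * k3[1])
    return (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

x, y = 3.0, 1.5            # a state displaced from the centre
start = invariant(x, y)
drift = 0.0
for _ in range(2000):      # roughly three circuits of the closed path
    x, y = rk4_step(x, y, 0.01)
    drift = max(drift, abs(invariant(x, y) - start))
print("largest change in a ln y - by + c ln x - dx:", drift)
```

A small drift indicates that the computed points lie on a single closed curve of the family, consistent with the periodic oscillations described above.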
The equilibrium point at O is unstable. If rabbits are introduced
into a desert island paradise the population increases indefinitely,
following the x axis arrow. If foxes are introduced to control the
rabbits, a great periodic cycle is set up which goes on for as long as
nothing else changes. Clearly the model is imperfect; but, provided
that we have a program which will plot general phase paths, the
complexity of the model is really a matter of no importance.
Earlier we said that the implicit equation (23.15) for the phase
paths gives closed curves. It is useful to be able to recognize this
feature.

Condition that f(x) + g(y) = C (C arbitrary) represents a centre

If f(x) has a minimum at $x = \alpha$, and g(y) has a
minimum at $y = \beta$, then there is an equilibrium point
at $(\alpha, \beta)$ which is locally a centre (can also substitute
'maximum' for 'minimum' in both places).    (23.16)

To prove this you need to look forward at Section 27.1. In three
dimensions, x, y, z, the surface z = f(x) + g(y) is bowl-shaped, with
a minimum or maximum at $(\alpha, \beta)$. The paths are the curves cut out
by intersection with the horizontal planes z = C. The functions
$c \ln x - dx$ and $a \ln y - by$ have maxima at x = c/d, y = a/b.
Finally, for a general system
\[ \dot{x} = P(x, y), \qquad \dot{y} = Q(x, y), \]
the equilibrium points are where
\[ P(x, y) = Q(x, y) = 0, \]
and might therefore appear anywhere in the phase plane, not just
on the x axis as with the $(x, \dot{x})$ plane. The following statements recall
the main features encountered in this section.

General phase plane    (23.17)

(a) Phase-path equations: $\dot{x} = P(x, y)$, $\dot{y} = Q(x, y)$.
(b) Equilibrium points: the solutions of P(x, y) = Q(x, y) = 0.
(c) Phase-path direction: find the direction at one
point and use continuity for other paths.
(d) Alternative equation: $\mathrm{d}y/\mathrm{d}x = Q(x, y)/P(x, y)$.

23.6 Approximate linearization


In connection with the pendulum problem of Section 23.4, we were
able to analyse equilibrium points by using a linear approximation
valid near the points. In the general case, suppose that $\dot{x} = P(x, y)$,
$\dot{y} = Q(x, y)$, and that (k, l) is an equilibrium point:
\[ P(k, l) = Q(k, l) = 0. \tag{23.18} \]
We shall obtain a linear approximation to P(x, y) and Q(x, y) valid
near (k, l). Put
\[ x = k + X, \qquad y = l + Y, \tag{23.19} \]
where we suppose that X and Y are small. Then, because of (23.18),
the approximations will take the form
\[ P(x, y) = aX + bY, \qquad Q(x, y) = cX + dY, \tag{23.20} \]
(with no constant term present).

Equations (23.20) should provide information about the phase
paths near the equilibrium point (x, y) = (k, l), which has become
(X, Y) = (0, 0), and tell us at least whether it is stable or
unstable. This is often true, but not always: if, to take an extreme
case, a = b = c = d = 0, then we would hardly want to rely on it.
The single equation corresponding to (23.17d) which connects X
and Y in the approximation is
\[ \frac{\mathrm{d}Y}{\mathrm{d}X} = \frac{cX + dY}{aX + bY}. \tag{23.21} \]
The algebra involved in classifying this equation with respect to the
coefficients is very complicated, and we merely summarize the
results, omitting some special cases.

Equilibrium point (0, 0) of the linear system
$\dot{X} = aX + bY$, $\dot{Y} = cX + dY$    (23.22)

Put $p = a + d$, $q = ad - bc$, $\Delta = p^2 - 4q$.
(a) If q > 0 and $\Delta > 0$: a node (p < 0, stable; p > 0, unstable).
(b) If q > 0 and $\Delta < 0$: a spiral (p < 0, stable; p > 0, unstable).
(c) If q < 0: a saddle.
(d) If p = 0 and q > 0: a centre.
(e) Path directions: investigate one point.

Since these equations are linear, the phase diagram centred on


(0,0) is self-similar: the pattern of paths is the same if viewed
centrally through a microscope or seen over an immense field, so
we are not restricted to small x and y.
In applying (23.22), do not be too ready to decide that the
original equations have a centre just because the linear ones do: the
small difference on changing back from the linear approximation
may be all that is necessary to change a centre into a spiral.
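The rules (23.22) translate directly into a small decision procedure. The following is a sketch, not a full treatment: the function name `classify` is chosen here for illustration, exact comparison with zero is used for simplicity, and the special cases omitted in (23.22) are reported as unclassified. As the caution above notes, a result of "centre" for the linear approximation does not settle the nature of the original nonlinear system.

```python
def classify(a, b, c, d):
    """Classify (0, 0) for X' = aX + bY, Y' = cX + dY using (23.22)."""
    p = a + d
    q = a * d - b * c
    delta = p * p - 4 * q
    if q < 0:
        return "saddle"
    if q > 0 and p == 0:
        return "centre"              # tested before 'spiral', since delta < 0 here
    if q > 0 and delta > 0:
        return ("stable" if p < 0 else "unstable") + " node"
    if q > 0 and delta < 0:
        return ("stable" if p < 0 else "unstable") + " spiral"
    return "special case not covered by (23.22)"

# Example 23.5 written as x' = y, y' = -(3/2)x - (7/2)y.
print(classify(0.0, 1.0, -1.5, -3.5))
# Simple harmonic motion x' = y, y' = -x.
print(classify(0.0, 1.0, -1.0, 0.0))
# A case with q < 0.
print(classify(1.0, -1.0, -1.0, -1.0))
```

The three calls correspond to a stable node, a centre and a saddle respectively.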
Example 23.7. Classify the equilibrium points of the system
\[ \dot{x} = x - y, \qquad \dot{y} = 1 - xy. \]
The equilibrium points are where $x - y = 0$, $1 - xy = 0$; that is, at (1, 1) and
$(-1, -1)$.

Near (1, 1). Put $x = 1 + X$, $y = 1 + Y$. Then $x - y = X - Y$, and
\[ 1 - xy = 1 - (1 + X)(1 + Y) \approx -X - Y \]
for X and Y small. Therefore, in (23.21),
\[ a = 1, \quad b = -1, \quad c = -1, \quad d = -1; \]
so $p = 0$, $q = -2$, $\Delta = 8$. According to (23.22), this is a saddle point (which is
an unstable equilibrium point).

Near $(-1, -1)$. Put $x = -1 + X$, $y = -1 + Y$; then we obtain
\[ x - y = X - Y, \qquad 1 - xy \approx X + Y, \]
so that $a = 1$, $b = -1$, $c = 1$, $d = 1$. Therefore $q = 2 > 0$, $p = 2 > 0$, $\Delta = -4 < 0$;
so, by (23.22), the point is an unstable spiral. The phase diagram is shown in
Fig. 23.11.

23.7 Limit cycles


Spirals, centres, etc. occur for both linear and nonlinear systems,
but a limit cycle is a feature only of nonlinear systems. When it
occurs it usually represents the most important phenomenon in the
phase plane. The following example includes a limit cycle.

Example 23.8. Sketch a phase diagram for
\[ \ddot{x} + (x^2 + \dot{x}^2 - 1)\dot{x} + x = 0. \]
Put
\[ \dot{x} = y, \qquad \dot{y} = (1 - x^2 - y^2)y - x. \tag{i} \]
It is possible to express the phase paths in polar coordinates $r, \theta$:
\[ r^2 = x^2 + y^2 \quad \text{and} \quad \tan\theta = y/x. \]
Differentiate these equations with respect to t:
\[ r\dot{r} = x\dot{x} + y\dot{y}, \qquad \dot{\theta}/\cos^2\theta = (x\dot{y} - y\dot{x})/x^2. \tag{ii} \]
Substitute (i) into (ii): remember that $\dot{x} = y$, and put $x = r\cos\theta$ and $y = r\sin\theta$
as necessary. Then r(t) and $\theta(t)$ are found to satisfy
\[ \dot{r} = -r(r^2 - 1)\sin^2\theta, \tag{iii} \]
\[ \dot{\theta} = -1 - (r^2 - 1)\sin\theta\cos\theta. \tag{iv} \]
A particular solution of (iii) and (iv) is r = 1, with $\dot{\theta} = -1$. This indicates a
path consisting of the circle r = 1, followed around in the clockwise direction
with unit angular velocity.
Also, from (iii),
\[ \dot{r} > 0 \ \text{if } 0 < r < 1, \qquad \dot{r} < 0 \ \text{if } r > 1, \]
so the circle is approached from points inside by means of expanding spirals,
and from points outside by contracting spirals. The phase diagram is shown in
Fig. 23.12.
If we start from any initial conditions except for the equilibrium point (0, 0),
the system settles down gradually to the regular oscillation represented by the
circle. This behaviour has a physical explanation. The 'coefficient' $x^2 + \dot{x}^2 - 1$,
although variable, serves the purpose of a damping coefficient. Outside the circle,
when $x^2 + y^2 - 1 > 0$ (remember $y = \dot{x}$), energy is lost and the paths tend
to drift inwards. When $x^2 + y^2 - 1 < 0$ there is negative damping; energy is
being supplied, so the amplitude of paths within the circle increases. For points
on the circle $x^2 + y^2 - 1 = 0$ the damping is zero, so the motion is harmonic
(the solutions are $x = \cos(t + \phi)$, with $\phi$ any constant), consistent with the
circular path.
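The approach to the circle r = 1 from both sides can be seen in a short numerical experiment. This is a sketch only: the step length, the number of steps, the starting points and the use of a fourth-order Runge-Kutta step are assumptions made for illustration.

```python
import math

def rates(x, y):
    """Right-hand sides of (i) in Example 23.8."""
    return y, (1.0 - x * x - y * y) * y - x

def rk4_step(x, y, h):
    """One Runge-Kutta step for the system defined by rates()."""
    k1 = rates(x, y)
    k2 = rates(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = rates(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = rates(x + h * k3[0], y + h * k3[1])
    return (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def final_radius(x, y, h=0.01, steps=4000):
    """Follow the path from (x, y) up to t = h*steps; return r at the end."""
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
    return math.hypot(x, y)

inner = final_radius(0.1, 0.0)   # starts well inside the circle
outer = final_radius(2.0, 0.0)   # starts well outside the circle
print(inner, outer)
```

Both final radii should be very close to 1, whichever side of the circle the path starts on.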

The circular path r = 1 in Example 23.8 is an example of a limit


cycle, which is defined generally as an isolated closed phase path. If
the paths approach it spirally (in a broad sense) from both sides, it
is called a stable limit cycle. It then represents a stable oscillation:
if we disturb, or perturb, the oscillation by a small amount, it simply
creeps back into the original oscillation. If the paths on one or both
sides point away from the limit cycle, it is called unstable and is
unlikely ever to be observed in practice.
To show how strange a limit cycle can be, we return to the van
der Pol equation for the special case $\ddot{x} + 10(x^2 - 1)\dot{x} + x = 0$.
Figure 23.13 shows its limit cycle together with the solution
represented by the limit cycle.

Fig. 23.13 (a) Limit cycle for $\ddot{x} + 10(x^2 - 1)\dot{x} + x = 0$.
(b) The solution x(t) corresponding to the limit cycle.

23.8 A numerical method for $\dot{x} = P$, $\dot{y} = Q$


We shall show a numerical method, related to Euler's method of
Section 22.2, for plotting phase paths of the system
\[ \dot{x}(t) = P(x(t), y(t)), \qquad \dot{y}(t) = Q(x(t), y(t)). \tag{23.23} \]
Essentially we use t as a parameter in a step-by-step solution. Start
from an initial point $P_0 : (x_0, y_0)$ at time $t_0$. The choice of $t_0$ does not
affect the path constructed because the equations are autonomous.
Take short time steps of length h. Then we proceed from point to
point in the diagram:
\[ P_0 : (x_0, y_0) \to P_1 : (x_1, y_1) \to P_2 : (x_2, y_2) \to \cdots. \]
Since, approximately,
\[ x_{n+1} - x_n = h\dot{x}(t_n) = hP(x_n, y_n), \]
and similarly for $y_{n+1} - y_n$, the rule for getting from $P_n$ to $P_{n+1}$ is
as follows.

Euler's method for $\dot{x} = P(x, y)$, $\dot{y} = Q(x, y)$    (23.24)

\[ x_{n+1} = x_n + hP(x_n, y_n), \qquad y_{n+1} = y_n + hQ(x_n, y_n). \]
Compute for n = 0, 1, 2, ... successively.
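The rule (23.24) takes only a few lines to program. The following is a minimal sketch; the function name `euler_path` and the test system $\dot{x} = y$, $\dot{y} = -x$ (a centre) are chosen here for illustration.

```python
def euler_path(P, Q, x0, y0, h, n_steps):
    """Points P_0, P_1, ..., P_n generated by Euler's method (23.24)."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n_steps):
        # Both right-hand sides use the old (x_n, y_n) before either is updated.
        x, y = x + h * P(x, y), y + h * Q(x, y)
        points.append((x, y))
    return points

# Test system: x' = y, y' = -x, whose exact paths are circles.
h, n = 0.1, 10
points = euler_path(lambda x, y: y, lambda x, y: -x, 1.0, 0.0, h, n)
x_end, y_end = points[-1]
r_squared = x_end ** 2 + y_end ** 2
print(points[1], r_squared, (1 + h * h) ** n)
```

For this particular system each Euler step multiplies $x^2 + y^2$ by exactly $1 + h^2$, so the computed path spirals slowly outwards instead of closing. This is one reason to try a smaller step when a computed centre fails to close.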
This process gives rise to rather unevenly spaced points on a
phase path: widely spaced when P and Q are large, and very closely
spaced near an equilibrium point, where P and Q are inevitably
small. However, they have the advantage that, if necessary, regular
time indications can be marked on the path while it is being
computed.

If evenly spaced points are wanted, the parameter can be changed
from time t to arc length s. We have $\delta s^2 = \delta x^2 + \delta y^2$; so
\[ \left(\frac{\mathrm{d}s}{\mathrm{d}t}\right)^2 = \left(\frac{\mathrm{d}x}{\mathrm{d}t}\right)^2 + \left(\frac{\mathrm{d}y}{\mathrm{d}t}\right)^2 = P^2 + Q^2. \]
Therefore
\[ \frac{\mathrm{d}x}{\mathrm{d}s} = \frac{\mathrm{d}x}{\mathrm{d}t} \Big/ \frac{\mathrm{d}s}{\mathrm{d}t} = \frac{P}{(P^2 + Q^2)^{1/2}} \quad \text{and} \quad \frac{\mathrm{d}y}{\mathrm{d}s} = \frac{Q}{(P^2 + Q^2)^{1/2}} \]
are equivalent equations for the path, in terms of arc length s. This
gives the following method.

To compute the paths of the system $\dot{x} = P(x, y)$,
$\dot{y} = Q(x, y)$ at evenly spaced points    (23.25)

Apply (23.24) to the equivalent system
\[ \frac{\mathrm{d}x}{\mathrm{d}s} = \tilde{P}(x, y), \qquad \frac{\mathrm{d}y}{\mathrm{d}s} = \tilde{Q}(x, y), \]
where
\[ \tilde{P} = P/(P^2 + Q^2)^{1/2}, \qquad \tilde{Q} = Q/(P^2 + Q^2)^{1/2}. \]
The step length h is the distance along the path.
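A sketch of the arc-length version (23.25) follows. The function name `euler_path_arclength`, the starting point and the step length are assumptions for illustration; the system used is the node of Example 23.5, and the stop at an exact equilibrium is a safeguard added here, not part of the stated method.

```python
import math

def euler_path_arclength(P, Q, x0, y0, h, n_steps):
    """Euler's method applied to dx/ds = P/(P^2+Q^2)^(1/2), dy/ds = Q/(P^2+Q^2)^(1/2)."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n_steps):
        p, q = P(x, y), Q(x, y)
        speed = math.hypot(p, q)
        if speed == 0.0:
            break  # exactly at an equilibrium point: the direction is undefined
        x, y = x + h * p / speed, y + h * q / speed
        points.append((x, y))
    return points

# Example 23.5 as a first-order system: x' = y, y' = -(7/2) y - (3/2) x.
h = 0.05
points = euler_path_arclength(lambda x, y: y,
                              lambda x, y: -3.5 * y - 1.5 * x,
                              1.0, 1.0, h, 30)
gaps = [math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])]
print(min(gaps), max(gaps))
```

Every gap between successive points equals h (up to rounding), however fast or slowly the state moves in time, at the cost of losing the time markings that the time-stepped version provides.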

Problems

««'-/ prvdwens* ir/v to— prvtc-'-;on. The are . = 4 jc It is difficult to make sense of the diagram
-mtr*+K*e yf 13.5 a os/t e h»gh- ■without this information.)
cy" 5yer/ ^OiAd slew the (e) x = >, ,v = — 2x — 3y. (Stable node. Find the
y< mgc 1 vf erii *t»r«fore S>e two solutions > = —x, v = 2x, which are radial straight
—.y % sr^itie'’4. lines as in ■ These represent /our paths since the}
are interrupted by the origin.)
23. U Cwyxaiinii fbaaree roaynw^ a phase diagram tdt x = >. > = — 3x — >'. (Stable spiraL)
- -a vv -'i ca - ' - ; ' 1*. a- t • . .. ■. (ej x = y, y = — 2x — y. (Unstable spiraL)
'1,4 jsEuagir* '.a*. you be ". 'A-t A. The ev jhhor.um (f) Recompute (a), marking off a time scale on
• a t'-r'. : ■ -a' 'i- - y.- each of the paths, showing interv als in f of around 0.3.
ms '.j »•* a. * v*y • a'': a. *• t ti 'onward*. by (g) Recompute (b) with a time scale as in (f).
thfc' K 'i .£T. ' (h) x = y. y = — 2y. A different type: w hat is the
<sM ^ * J5./ * -4x (A cw«*e,4rz 4 f2 = C If yoor second-order equation that it comes from?
pari» do m( atari? dose, tty a Mdkr rirml i)
% jf * / * x t taotfe. x' ■— y2 = C. Find the
a. mp.uvsi t. r.j *./ ' 'j* ev-i ivr.s ’he} 23.2. Sketch the phase paths for the following equa-

tions by first solving for them: form dy/dx and separate (b) centre at (0, 0), saddles at (+ 1, 0);
the variables. (c) unstable node at (0, 0), stable node at (1, 0);
(a) x = y, y = x; (b) x = x, y = y; (d) centres at (±1,0).
(c) x = —y, y = x; (d) x = —x, y = y;
(e) x = 2y, y = x; (f) x = -2y, y = x. 23.8. (Computational). Obtain a phase diagram for the
following (in some of these the linear approximation
23.3. Solve the following by using the energy trans¬ point is zero, so it gives no information);
formation d2x/dr2 = \ d(x2)/dx (Example 20.15), and (a) x + |x|x + x = 0; (b) x + |x|x + x3 = 0;
sketch the (x, x) phase diagrams. (c) x = x4 — x2; (d) x = 2xy, y = y2 — x2;
(a) x = ev; (e) x = 2xy, y = x2 — y2;
(b) x + x2 + x = 0 the transformed equation is linear (/) x ± x(x2 + x2) + x = 0 (notice that the origin is a
in y2); spiral, although the linear approximation has a centre
(c) x - 8xx = 0; - see the remark following (23.22)).
(d) x = ex — e~x (the Poisson-Boltzmann equation).

23.9. From the Taylor series (5.4b), sin x % x — £x3


23.4. Classify the equilibrium point (0, 0) for each of
for small x, so the pendulum equation (23.8) is approxi¬
the following linear equations by using (23.22). Sketch
mated by x + co2(x — gx3) = 0 (the Duffing equation).
the phase diagram: in cases where it is appropriate
Sketch or compute the phase diagram, and comment on
you should first obtain the radial straight paths y = mx
the differences from Fig. 23.9, for the exact equation.
by substitution. State which are unstable.
(a) x = x — 5y, y = x — y;
(b) x = x + y, y = x — 2y; 23.10. (Computational). For a modified form of the
(c) x = — 4x + 2y, y = 3x — 2y; predator-prey problem (compare Example 23.6), in a
(d) x = x + 2y, y = 2x + 2y; special case, the equations are
(e) x = 4x — 2y, y = 3x — y;
(f) x = 2x + 3y, y = — 3x - 3y.
x = 4x — 2xy — x2, y = —2y + xy — 2y2.

The additional terms in x2 and y2 are meant to account


23.5. For the equations given: find any equilibrium for competition for resources among rabbits and among
points; obtain a linear approximation at each equi¬ foxes. Use a linear approximation at the equilibrium
librium point by the method of Section 23.6; classify points in order to classify them, then compute the phase
it from (23.22) (finding the straight line paths in the diagram.
case of nodes and saddles); and put the sketches on a
phase diagram. Guess how the diagram away from the
23.11. A model for H(t) hosts supporting P(r) dan¬
equilibrium points is filled in (isoclines. Section 17.1,
gerous parasites is H = (a — bP)H, P = (c — dP/H)P,
might help here). Then turn to Problem 23.6.
where a, b, c, d, are positive. Analyse the system in
(a) x = x — y, y = x + y — 2xy;
the (H, P) plane.
(b) x = 1 - xy, y = (x - l)(y + 1);
(c) x = x — y, y = x2 — 1;
(d) x + x — x3 = 0 (with x = y); 23.12. Figure 23.14 represents a spring of stiffness s
(e) x = 4x — 2xy, y = — 2y + xy, for x > 0 and y > 0 and natural length /, pivoted at A at a height h above
(foxes and rabbits. Example 23.6: classify (0, 0) as if a smooth wire CD. At B is a mass m, attached to the
x and y could be negative). spring and sliding on the wire. The equation of motion is

/
23.6. (Computational). Check some of the phase dia¬
grams you sketched in Problem 23.5 by computing (h2 + x2)*
representative phase paths. Look out for separatrices,
which end at equilibrium points.
A

23.7. Sketch possible phase diagrams Pom the infor¬


mation given. If a phase path ends in mid air, or if you
have a closed curve without an equilibrium point inside,
then there is something wrong. There are often several
possibilities; for example, a path might either join two
equilibrium points or split, forming two branches going
to infinity. Suppose that the only equilibrium points at
a finite distance are those given in the following cases,
(a) centre at (0, 0), saddle at (1, 0);
Fig. 23.14

Classify the equilibrium points when l < h, l = h, and of


I > h.
x = (x2 + y2 — l)y, y = —(x2 4- y2 — l)x.

23.13. (Computational). Solve Problem 23.12 modified Explain why the circle x2 + y2 = 1 does not represent
so that there is friction between the bead and the wire periodic motion.
equal to kx. Classify the equilibrium points and construct
the phase diagrams. 23.19. Verify that the differential equation

23.14. (Computational). Construct phase diagrams for x 4- ( 1 — x2-)x± a>2x = 0,


the equation x ± kx — x + x2 = 0. Consider various values V co2J
of k. has the particular solution x = cos co(t — t0) for any
f0. What is the corresponding phase path in the (x, y = x)
23.15. (Computational). Construct a phase diagram for plane? Put further details on a sketch of the phase
the following equations. (They each contain a limit diagram.
cycle.)
(a) x + |(x2 4- x1 — l)x + x = 0;
23.20. Locate all equilibrium points of the system
(b) x + j(x2 — l)x + x = 0;
(c) x 4- sd-x2 — l)x 4- x = 0; x = (x2-l)y, y = (y2 — l)x,
(d) x + 5(x2 — l)x + x = 0. and sketch its phase diagram.

23.16. As in Problem 23.7, sketch phase diagrams for 23.21. The linear system given by (23.22), namely
the general (x, y) phase plane compatible with the follow¬
ing information. The equilibrium points and limit cycles x = ax ± by, y — cx + dy,
specified are the only ones allowed. can be expressed in matrix form as
(a) (0, 0) is a spiral and x2 4- y2 = 1 is a stable
x = Ax,
limit cycle.
(b) (0, 0) is a spiral, x2 + y2 = 1 a stable limit cycle, where
and x2 + y2 = 4 another limit cycle.
X a b

_1
(c) (±1,0) are saddles, (0,0) is a centre, and x2 + X = , A =
y2 = 4 is a stable limit cycle.

1
(d) (±1,0) are centres, (0, 0) is a saddle, and x2 4- Try an exponential solution in the form
y1 = 4 a stable limit cycle.
X = C 6^*,
(e) (0,0) is a centre; the only closed path with
x2 4- y1 > 1 is the stable limit cycle x2 + y2 = 4. where C is a constant column vector, and show that
X must satisfy
23.17. Show that, in polar coordinates, the system det(A — Xl2) = 0.
x = -y ± x( 1 - x2 - y2), In other words, the solutions for X are the eigenvalues
y = x + y{ 1 - x2 - y2) of A (see Chapter 13). If Xt and X2 are distinct eigen¬
values, show that the general solution is
becomes
x = Cj e2lt + C7 eX2t.
r = r( \ — r2), 0=1.
By investigating the sign of r, explain why the system What is the solution if Xx = A2?
has just one limit cycle, which is stable. Sketch the Write down the roots of the quadratic equation,
phase diagram. and discuss how the possible cases (e.g. real roots,
imaginary roots, complex roots, etc.) fit in with the
23.18. Find the locations of all the equilibrium points centre, saddle, node, and spiral, as classified in (23.22).
PART IV Transforms and Fourier series

24 The Laplace transform

Contents
24.1 The Laplace transform
24.2 Laplace transforms of $t^n$, $e^{\pm t}$, $\sin t$, $\cos t$
24.3 Scale rule; shift rule; factors $t^n$ and $e^{kt}$
24.4 Inverting a Laplace transform
24.5 Laplace transforms of derivatives
24.6 Application to differential equations
24.7 The unit function and the delay rule
Problems

24.1 The Laplace transform


Suppose that f(t) is a specified function, and that s is a real positive
parameter (that is to say, a supplementary variable). Then the integral
\[ \int_0^\infty e^{-st} f(t)\,\mathrm{d}t = F(s) \]
is called the Laplace transform of f(t): the integral transforms f(t)
into another function F(s).
For example, suppose that $f(t) = e^{2t}$. Then
\[ F(s) = \int_0^\infty e^{-st} e^{2t}\,\mathrm{d}t = \int_0^\infty e^{-(s-2)t}\,\mathrm{d}t
 = -\frac{1}{s-2}\left[e^{-(s-2)t}\right]_0^\infty
 = -\frac{1}{s-2}\,(e^{-\infty} - e^{0}) = -\frac{1}{s-2}\,(0 - 1) = \frac{1}{s-2}. \]
This result is true only if s > 2; otherwise the integral is infinite. We
shall always assume that s is large enough to ensure that the integrals
we encounter remain finite, or converge (see Section 15.6).
We also use the symbol $\mathcal{L}$ to stand for the 'Laplace transform of'.
We have just proved that
\[ F(s) = \mathcal{L}\{e^{2t}\} = \frac{1}{s-2}. \]
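The result can be checked by evaluating the defining integral numerically. The sketch below is an illustration under stated assumptions: the infinite range is cut off at an arbitrary `t_max`, Simpson's rule is used for the quadrature, and the name `laplace_numeric` is invented here. It is only meaningful for values of s for which the integral converges (here s > 2).

```python
import math

def laplace_numeric(f, s, t_max=40.0, n=4000):
    """Approximate the integral of e^(-s t) f(t) over [0, t_max] by Simpson's rule."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = t_max / n
    total = f(0.0) + math.exp(-s * t_max) * f(t_max)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3.0

def f(t):
    return math.exp(2.0 * t)

for s in (3.0, 4.0, 6.0):
    # Numerical value of the integral beside the closed form 1/(s - 2).
    print(s, laplace_numeric(f, s), 1.0 / (s - 2.0))
```

The two columns agree closely for each s > 2; for s close to 2 the integrand decays slowly and a larger `t_max` would be needed.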
Laplace transform of f(t)

\[ \mathcal{L}\{f(t)\} = F(s) = \int_0^\infty e^{-st} f(t)\,\mathrm{d}t. \tag{24.1} \]

Another, very useful, notation is to indicate a transformed function
by a tilde sign: $\mathcal{L}\{f(t)\} = \tilde{f}(s)$, $\mathcal{L}\{x(t)\} = \tilde{x}(s)$, and so on.
The letter p is often used for the parameter instead of s, especially
in mainly theoretical texts.

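24.2 Laplace transforms of $t^n$, $e^{\pm t}$, $\sin t$, $\cos t$


(a) Positive, whole-number powers $t^n$, n = 0, 1, 2, ....
\[ \mathcal{L}\{t^n\} = \int_0^\infty e^{-st}\, t^n\,\mathrm{d}t. \]
Simplify the integral by substituting u = st, so that
\[ t = \frac{1}{s}\,u \quad \text{and} \quad \mathrm{d}t = \frac{1}{s}\,\mathrm{d}u. \]
Since s is positive, the limits of integration t = 0 and $\infty$ correspond
to u = 0 and $\infty$ respectively. Therefore
\[ \mathcal{L}\{t^n\} = \int_0^\infty e^{-u}\left(\frac{u}{s}\right)^n \frac{\mathrm{d}u}{s}
 = \frac{1}{s^{n+1}} \int_0^\infty e^{-u} u^n\,\mathrm{d}u = \frac{n!}{s^{n+1}} \]
for n = 0, 1, 2, ... (from the standard integral, (17.9)). Note that 0!
is to be interpreted as being equal to 1.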

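Laplace transform of powers

\[ \mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}} \quad \text{for } n = 0, 1, 2, \ldots. \tag{24.2} \]
Special cases:
\[ \mathcal{L}\{1\} = \frac{1}{s}, \quad \mathcal{L}\{t\} = \frac{1}{s^2}, \quad \mathcal{L}\{t^2\} = \frac{2!}{s^3}, \quad \mathcal{L}\{t^3\} = \frac{3!}{s^4}. \]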

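Example 24.1. Find the Laplace transform F(s) of f(t) when
$f(t) = 1 - t + \dfrac{1}{2!}t^2 - \dfrac{1}{3!}t^3$.

Composite expressions are dealt with in the following way.
\[ F(s) = \mathcal{L}\{1\} - \mathcal{L}\{t\} + \frac{1}{2!}\mathcal{L}\{t^2\} - \frac{1}{3!}\mathcal{L}\{t^3\} \]
(which follows from the fact that each $\mathcal{L}\{\cdots\}$ stands for the integral (24.1))
\[ = \frac{1}{s} - \frac{1}{s^2} + \frac{1}{2!}\,\frac{2!}{s^3} - \frac{1}{3!}\,\frac{3!}{s^4}
 = \frac{1}{s} - \frac{1}{s^2} + \frac{1}{s^3} - \frac{1}{s^4}. \]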
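(b) Exponential $e^{\pm t}$.
\[ \mathcal{L}\{e^{\pm t}\} = \int_0^\infty e^{-st} e^{\pm t}\,\mathrm{d}t = \int_0^\infty e^{-(s \mp 1)t}\,\mathrm{d}t
 = -\frac{1}{s \mp 1}\left[e^{-(s \mp 1)t}\right]_0^\infty. \]
$s \mp 1$ are both positive if we take s > 1, in which case
\[ \mathcal{L}\{e^{\pm t}\} = -\frac{1}{s \mp 1}\,(0 - 1) = \frac{1}{s \mp 1}. \]

\[ \mathcal{L}\{e^{t}\} = \frac{1}{s - 1}, \qquad \mathcal{L}\{e^{-t}\} = \frac{1}{s + 1}. \tag{24.3} \]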

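(c) Sine and cosine.

\[ \mathcal{L}\{\cos t\} = \frac{s}{s^2 + 1}, \qquad \mathcal{L}\{\sin t\} = \frac{1}{s^2 + 1}. \tag{24.4} \]

Since $\cos t + \mathrm{j}\sin t = e^{\mathrm{j}t}$, both of these can be verified at the same
time by working out $\mathcal{L}\{e^{\mathrm{j}t}\}$ and then separating the real and
imaginary parts:
\[ \mathcal{L}\{e^{\mathrm{j}t}\} = \int_0^\infty e^{-st} e^{\mathrm{j}t}\,\mathrm{d}t = \int_0^\infty e^{-(s - \mathrm{j})t}\,\mathrm{d}t
 = -\frac{1}{s - \mathrm{j}}\left[e^{-(s - \mathrm{j})t}\right]_0^\infty = -\frac{1}{s - \mathrm{j}}\,(0 - 1) \]
(since s is positive)
\[ = \frac{1}{s - \mathrm{j}} = \frac{s + \mathrm{j}}{s^2 + 1}. \]
Therefore, as in (24.4),
\[ \int_0^\infty e^{-st}\cos t\,\mathrm{d}t = \mathrm{Re}\int_0^\infty e^{-st} e^{\mathrm{j}t}\,\mathrm{d}t = \frac{s}{s^2 + 1}, \]
\[ \int_0^\infty e^{-st}\sin t\,\mathrm{d}t = \mathrm{Im}\int_0^\infty e^{-st} e^{\mathrm{j}t}\,\mathrm{d}t = \frac{1}{s^2 + 1}. \]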

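Example 24.2. Find the Laplace transform of $3t^2 + 2e^{-t} - 5\cos t$.

\[ \mathcal{L}\{3t^2 + 2e^{-t} - 5\cos t\} = 3\mathcal{L}\{t^2\} + 2\mathcal{L}\{e^{-t}\} - 5\mathcal{L}\{\cos t\} \]
\[ = 3\,\frac{2!}{s^3} + 2\,\frac{1}{s + 1} - 5\,\frac{s}{s^2 + 1}
 = \frac{6}{s^3} + \frac{2}{s + 1} - \frac{5s}{s^2 + 1}. \]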
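The term-by-term treatment in Example 24.2 can be confirmed by comparing the closed form with a direct numerical evaluation of (24.1). This is a sketch: the cut-off `t_max`, the use of Simpson's rule and the function names are assumptions made for illustration.

```python
import math

def laplace_numeric(f, s, t_max=60.0, n=12000):
    """Approximate the integral of e^(-s t) f(t) over [0, t_max] by Simpson's rule."""
    h = t_max / n  # n is even, as Simpson's rule requires
    total = f(0.0) + math.exp(-s * t_max) * f(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3.0

def f(t):
    """The function of Example 24.2."""
    return 3.0 * t ** 2 + 2.0 * math.exp(-t) - 5.0 * math.cos(t)

def F(s):
    """The transform obtained term by term from (24.2), (24.3) and (24.4)."""
    return 6.0 / s ** 3 + 2.0 / (s + 1.0) - 5.0 * s / (s ** 2 + 1.0)

for s in (1.0, 2.0, 3.0):
    print(s, laplace_numeric(f, s), F(s))
```

The agreement at several values of s is a useful check against sign slips when a long expression is transformed term by term.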

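24.3 Scale rule; shift rule; factors $t^n$ and $e^{kt}$


The following rules make it easy to derive more complicated
transforms from the basic ones of Section 24.2.

Scale rule
If $\mathcal{L}\{f(t)\} = F(s)$, and k > 0, then
\[ \mathcal{L}\{f(kt)\} = \frac{1}{k}\,F\!\left(\frac{s}{k}\right). \tag{24.5} \]

The proof is as follows.
\[ \mathcal{L}\{f(kt)\} = \int_0^\infty e^{-st} f(kt)\,\mathrm{d}t. \]
Change the variable by putting u = kt, so that t = u/k and $\mathrm{d}t = \mathrm{d}u/k$.
The limits of integration t = 0 and $\infty$ go into u = 0 and $\infty$
respectively because k > 0. Therefore
\[ \mathcal{L}\{f(kt)\} = \int_0^\infty e^{-s(u/k)} f(u)\,\frac{1}{k}\,\mathrm{d}u
 = \frac{1}{k}\int_0^\infty e^{-(s/k)u} f(u)\,\mathrm{d}u = \frac{1}{k}\,F\!\left(\frac{s}{k}\right), \]
since $F(s) = \int_0^\infty e^{-su} f(u)\,\mathrm{d}u$.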
The following are special cases.

If k is any constant, positive or negative, then
\[ \text{(a)}\ \ \mathcal{L}\{e^{kt}\} = \frac{1}{s - k}, \qquad
 \text{(b)}\ \ \mathcal{L}\{\cos kt\} = \frac{s}{s^2 + k^2}, \qquad
 \text{(c)}\ \ \mathcal{L}\{\sin kt\} = \frac{k}{s^2 + k^2}. \tag{24.6} \]

These are proved as follows.

(a) Suppose that m is a positive number; then combining (24.3)
with the scale rule (24.5) gives
\[ \mathcal{L}\{e^{\pm mt}\} = \frac{1}{m}\,\frac{1}{s/m \mp 1} = \frac{1}{s \mp m}. \]
The result (24.6a) therefore holds good for both positive and
negative k.

(b) From (24.4),
\[ \mathcal{L}\{\cos t\} = \frac{s}{s^2 + 1}. \]
Therefore, by the scale rule (24.5), if k > 0,
\[ \mathcal{L}\{\cos kt\} = \frac{1}{k}\,\frac{s/k}{(s/k)^2 + 1} = \frac{s}{s^2 + k^2}. \]
This is true also if k is negative, since it is equal to $\int_0^\infty e^{-st}\cos kt\,\mathrm{d}t$
(see (24.1)), and cos kt is unchanged when the sign of k is reversed.

(c) is similar to (b).

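Example 24.3. Find the Laplace transform of $\cos(3t + \tfrac{1}{4}\pi)$.

\[ \cos(3t + \tfrac{1}{4}\pi) = \cos\tfrac{1}{4}\pi\,\cos 3t - \sin\tfrac{1}{4}\pi\,\sin 3t = (\cos 3t - \sin 3t)/\sqrt{2}. \]
Therefore by (24.6)
\[ \mathcal{L}\{\cos(3t + \tfrac{1}{4}\pi)\} = \frac{1}{\sqrt{2}}\left(\frac{s}{s^2 + 9} - \frac{3}{s^2 + 9}\right) = \frac{1}{\sqrt{2}}\,\frac{s - 3}{s^2 + 9}. \]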

Suppose that we know the Laplace transform F(s) of a function
f(t) already. Then the Laplace transform of $e^{kt} f(t)$ can immediately
be written down:
\[ \mathcal{L}\{e^{kt} f(t)\} = \int_0^\infty e^{-st} e^{kt} f(t)\,\mathrm{d}t = \int_0^\infty e^{-(s-k)t} f(t)\,\mathrm{d}t. \]
But $\int_0^\infty e^{-st} f(t)\,\mathrm{d}t = F(s)$, which is supposed to be known, and here
we have $s - k$ in place of s. Therefore
\[ \mathcal{L}\{e^{kt} f(t)\} = F(s - k). \]

Shift rule (multiplication by $e^{kt}$)

If $\mathcal{L}\{f(t)\} = F(s)$ and k is any constant, then
\[ \mathcal{L}\{e^{kt} f(t)\} = F(s - k). \tag{24.7} \]

The shift rule is so called because the transform function F(s) is
'shifted' a distance k along the s axis by the presence of the factor $e^{kt}$.

Example 24.4. Find $\mathcal{L}\{e^{-3t}\sin 2t\}$.

From (24.6),

\[ \mathcal{L}\{\sin 2t\} = \frac{2}{s^2+4}. \]

By the shift rule (24.7) with $k = -3$, we deduce that

\[ \mathcal{L}\{e^{-3t}\sin 2t\} = \frac{2}{(s+3)^2+4} = \frac{2}{s^2+6s+13}. \]
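As a sanity check on the shift rule, the result of Example 24.4 can be compared with a direct numerical evaluation of the defining integral. This is a sketch under stated assumptions: the integral is truncated at a finite upper limit and evaluated by Simpson's rule, and the helper `laplace` and the sample value `s = 1` are illustrative choices only.

```python
import math

def laplace(f, s, upper=40.0, n=40000):
    """Simpson's-rule estimate of the integral of exp(-s*t)*f(t) over [0, upper]."""
    h = upper / n                      # n must be even
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

s = 1.0
# Left-hand side: transform of exp(-3t) sin 2t computed from the definition.
direct = laplace(lambda t: math.exp(-3.0 * t) * math.sin(2.0 * t), s)
# Right-hand side: F(s - k) with F(s) = 2/(s^2 + 4) and k = -3.
shifted = 2.0 / ((s + 3.0) ** 2 + 4.0)
print(direct, shifted)
```

At $s = 1$ the shifted formula gives $2/20 = 0.1$, and the quadrature should reproduce this closely.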

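Example 24.5. Find $\mathcal{L}\{t^3 e^{4t}\}$.

From (24.2), $\mathcal{L}\{t^3\} = 3!/s^4$. The shift rule with $k = 4$ gives

\[ \mathcal{L}\{t^3 e^{4t}\} = \frac{3!}{(s-4)^4} = \frac{6}{(s-4)^4}. \]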

There is a rule similar to (24.7) by which we can find the Laplace
transform of $t^n f(t)$ when the transform of $f(t)$ is known:

Multiplication by $t^n$

If $\mathcal{L}\{f(t)\} = F(s)$, then

\[ \mathcal{L}\{t^n f(t)\} = (-1)^n\,\frac{d^n F(s)}{ds^n}. \tag{24.8} \]

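The simplest way to prove this is to start with the right-hand side.
Since

\[ \int_0^\infty e^{-st} f(t)\,dt = F(s), \]

then

\[
\begin{aligned}
\frac{dF(s)}{ds} &= \frac{d}{ds}\int_0^\infty e^{-st} f(t)\,dt
 = \int_0^\infty \frac{d(e^{-st})}{ds}\,f(t)\,dt \\
&= \int_0^\infty (-t\,e^{-st}) f(t)\,dt = -\int_0^\infty e^{-st}\,\bigl(t f(t)\bigr)\,dt.
\end{aligned}
\]

Every time we differentiate, another factor t and another multiplication
by $-1$ appears, which takes us to (24.8).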

Example 24.6. Find $\mathcal{L}\{t\cos 3t\}$.

Since, by (24.6b),

\[ F(s) = \mathcal{L}\{\cos 3t\} = \frac{s}{s^2+9}, \]

then, by (24.8),

\[ \mathcal{L}\{t\cos 3t\} = -\frac{d}{ds}\,\frac{s}{s^2+9} = -\frac{9-s^2}{(s^2+9)^2} = \frac{s^2-9}{(s^2+9)^2}. \]

Note the two following special cases, which occur frequently.

\[ \mathcal{L}\{t\cos kt\} = \frac{s^2-k^2}{(s^2+k^2)^2}, \qquad
   \mathcal{L}\{t\sin kt\} = \frac{2ks}{(s^2+k^2)^2}. \tag{24.9} \]
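The two special cases (24.9) follow from (24.8) with $n = 1$, that is, from $-dF/ds$. A quick way to guard against sign slips is to compare them with a finite-difference derivative. The sketch below assumes a central difference with a small step `h`; the sample values of `s` and `k` are arbitrary, and the function names are illustrative.

```python
def F_cos(s, k):
    """Transform of cos kt, from (24.6b)."""
    return s / (s * s + k * k)

def F_sin(s, k):
    """Transform of sin kt, from (24.6c)."""
    return k / (s * s + k * k)

def minus_derivative(F, s, k, h=1e-5):
    """Central-difference estimate of -dF/ds, i.e. rule (24.8) with n = 1."""
    return -(F(s + h, k) - F(s - h, k)) / (2.0 * h)

s, k = 2.0, 3.0
t_cos_formula = (s * s - k * k) / (s * s + k * k) ** 2   # (24.9), first result
t_sin_formula = 2.0 * k * s / (s * s + k * k) ** 2       # (24.9), second result

t_cos_numeric = minus_derivative(F_cos, s, k)
t_sin_numeric = minus_derivative(F_sin, s, k)
print(t_cos_formula, t_cos_numeric)
print(t_sin_formula, t_sin_numeric)
```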

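Example 24.7. Find $\mathcal{L}\{t^3 e^{-3t}\}$ (a) by using the shift rule, (b) by using (24.8),
(c) by working directly from the definition of the Laplace transform.

(a) From (24.2),

\[ \mathcal{L}\{t^3\} = \frac{3!}{s^4} = \frac{6}{s^4}. \]

Therefore, using the shift rule (24.7) with $k = -3$,

\[ \mathcal{L}\{e^{-3t}t^3\} = \frac{6}{(s+3)^4}. \]

(b) From (24.6) with $k = -3$,

\[ \mathcal{L}\{e^{-3t}\} = \frac{1}{s+3}. \]

From (24.8) with $n = 3$,

\[ \mathcal{L}\{t^3 e^{-3t}\} = (-1)^3\,\frac{d^3}{ds^3}\,\frac{1}{s+3}
   = (-1)^3\,\frac{(-1)(-2)(-3)}{(s+3)^4} = \frac{6}{(s+3)^4}. \]

(c) From the definition, (24.1),

\[ \mathcal{L}\{t^3 e^{-3t}\} = \int_0^\infty e^{-st}\,t^3 e^{-3t}\,dt = \int_0^\infty e^{-(s+3)t}\,t^3\,dt. \]

From (24.2), with $s + 3$ in place of $s$, this is equal to

\[ \frac{3!}{(s+3)^4}. \]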

24.4 Inverting a Laplace transform


Given a function $f(t)$, we obtain its transform $F(s)$ by using the
definition (24.1). Alternatively, if a function $F(s)$ is presented, then
we can try to recover the function $f(t)$ from which $F(s)$ is obtained;
$f(t)$ is called the original of $F(s)$. This second question is the
inverse problem for the Laplace transform: to find '?' in the
equation

\[ \mathcal{L}\{?\} = F(s). \]

We shall assume here that there is only one answer to this problem.
The process of finding $f(t)$ from $F(s)$ is called inversion of $F(s)$.
The notation

\[ f(t) \leftrightarrow F(s) \]

is useful because it underlines the two-way correspondence
between $f(t)$ and $F(s)$.
We can open up a ‘dictionary’ for this purpose, as we did for
derivatives and integrals. The most important results we have so far
are given in the table (24.10) below.

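$f(t)$ for $t > 0$                          $F(s)$

$t^n$  $(n = 0, 1, \ldots)$                 $\dfrac{n!}{s^{n+1}}$

$\dfrac{t^{m-1}}{(m-1)!}$  $(m = 1, 2, \ldots)$   $\dfrac{1}{s^m}$

$e^{kt}$  (any k)                           $\dfrac{1}{s-k}$                (24.10)

$\cos kt$  (any k)                          $\dfrac{s}{s^2+k^2}$

$\sin kt$  (any k)                          $\dfrac{k}{s^2+k^2}$

$\dfrac{1}{k}\sin kt$  (any $k \ne 0$)      $\dfrac{1}{s^2+k^2}$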

A much fuller table which also includes the various rules can be
found in Appendix F. Remember that everything we do with
Laplace transforms refers to $t \ge 0$ only: the defining integral (24.1)
calls only for values of $t \ge 0$.
Partial fractions are often useful for inverting transforms.

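Example 24.8. Given the transform $1/s(s+1)$, find the original.

In partial fractions,

\[ \frac{1}{s(s+1)} = \frac{1}{s} - \frac{1}{s+1}. \]

From the table above,

\[ \frac{1}{s} \leftrightarrow 1 \quad\text{and}\quad \frac{1}{s+1} \leftrightarrow e^{-t}, \]

so that

\[ \frac{1}{s(s+1)} \leftrightarrow 1 - e^{-t}. \]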

Example 24.9. Invert the Laplace transform

\[ \frac{s+1}{s(s^2+4)}. \]

The partial-fraction rules require the form

\[ \frac{s+1}{s(s^2+4)} = \frac{A}{s} + \frac{Bs+C}{s^2+4}. \]

When the constants are determined by the method of Section 1.12, we find that
$A = \tfrac14$, $B = -\tfrac14$, $C = 1$, so that

\[ \frac{s+1}{s(s^2+4)} = \frac{\tfrac14}{s} + \frac{-\tfrac14 s + 1}{s^2+4}
   = \frac14\,\frac{1}{s} - \frac14\,\frac{s}{s^2+4} + \frac{1}{s^2+4}. \]

From (24.2),

\[ \frac{1}{s} \leftrightarrow 1. \]

From the Table (24.10),

\[ \frac{s}{s^2+4} \leftrightarrow \cos 2t, \qquad \frac{1}{s^2+4} \leftrightarrow \tfrac12\sin 2t. \]

Therefore

\[ \frac{s+1}{s(s^2+4)} \leftrightarrow \tfrac14 - \tfrac14\cos 2t + \tfrac12\sin 2t. \]
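A partial-fraction decomposition is easy to get wrong by a sign or a factor, so it is worth checking the constants of Example 24.9 before inverting. The sketch below uses exact rational arithmetic from Python's standard `fractions` module; the sample points are arbitrary, and the function names are illustrative only.

```python
from fractions import Fraction

# Constants A, B, C of Example 24.9.
A, B, C = Fraction(1, 4), Fraction(-1, 4), Fraction(1)

def original(s):
    """The transform (s + 1) / (s (s^2 + 4))."""
    return (s + 1) / (s * (s * s + 4))

def decomposed(s):
    """The partial-fraction form A/s + (B s + C)/(s^2 + 4)."""
    return A / s + (B * s + C) / (s * s + 4)

# Two rational functions with the same denominator and numerators of
# degree at most 2 agree everywhere if they agree at three points;
# nine points are used here for good measure.
samples = [Fraction(n, 3) for n in range(1, 10)]
all_equal = all(original(s) == decomposed(s) for s in samples)
print(all_equal)
```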

Example 24.10. Invert the Laplace transform

\[ \frac{3s+2}{s^2+2s+2}. \]

The quadratic denominator does not have real factors, so partial fractions are
not available. Instead we complete the square:

\[ s^2 + 2s + 2 = (s+1)^2 - 1 + 2 = (s+1)^2 + 1. \]

We aim to write the whole expression in terms of $s + 1$ so that we can apply
the shift rule (24.7). So put also

\[ 3s + 2 = 3(s+1) - 3 + 2 = 3(s+1) - 1, \]

and the transform becomes

\[ \frac{3(s+1) - 1}{(s+1)^2 + 1}. \]

If we had $s$ instead of $s + 1$, we could invert the transform:

\[ \frac{3s-1}{s^2+1} = \frac{3s}{s^2+1} - \frac{1}{s^2+1} \leftrightarrow 3\cos t - \sin t. \]

Therefore, by the shift rule with $k = -1$,

\[ \frac{3(s+1) - 1}{(s+1)^2 + 1} \leftrightarrow e^{-t}(3\cos t - \sin t). \]

24.5 Laplace transforms of derivatives


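Suppose that $\mathcal{L}\{f(t)\} = F(s)$. Then the Laplace transforms of $df(t)/dt$,
$d^2f(t)/dt^2, \ldots$ can be expressed in terms of $F(s)$.

From the definition,

\[ \mathcal{L}\left\{\frac{df(t)}{dt}\right\} = \int_0^\infty e^{-st}\,\frac{df(t)}{dt}\,dt. \]

Integrate the right-hand side by parts. Using the notation of Section
17.7, put

\[ u = e^{-st}, \qquad \frac{dv}{dt} = \frac{df(t)}{dt}, \]

so that

\[ \frac{du}{dt} = -s\,e^{-st}, \qquad v = f(t). \]

Then

\[
\begin{aligned}
\mathcal{L}\left\{\frac{df(t)}{dt}\right\} &= \int_0^\infty e^{-st}\,\frac{df(t)}{dt}\,dt \\
&= \bigl[e^{-st} f(t)\bigr]_0^\infty - \int_0^\infty (-s\,e^{-st}) f(t)\,dt \\
&= 0 - e^{0} f(0) + s\int_0^\infty e^{-st} f(t)\,dt \\
&= -f(0) + s\,\mathcal{L}\{f(t)\}.
\end{aligned}
\]

In other words, if $\mathcal{L}\{f(t)\} = F(s)$, then

\[ \mathcal{L}\left\{\frac{df(t)}{dt}\right\} = sF(s) - f(0). \tag{24.11} \]

(Note that it is $f(0)$, not $F(0)$, that arises here.)
We can use (24.11) again and again to obtain successively

\[ \mathcal{L}\left\{\frac{d^2f(t)}{dt^2}\right\} = \mathcal{L}\left\{\frac{d}{dt}\,\frac{df(t)}{dt}\right\} \]

and higher derivatives, from which we obtain the following rule.

Laplace transform of derivatives

If $\mathcal{L}\{f(t)\} = F(s)$, then

\[
\begin{aligned}
\mathcal{L}\left\{\frac{df(t)}{dt}\right\} &= sF(s) - f(0), \\
\mathcal{L}\left\{\frac{d^2f(t)}{dt^2}\right\} &= s^2F(s) - sf(0) - f'(0), \\
\mathcal{L}\left\{\frac{d^3f(t)}{dt^3}\right\} &= s^3F(s) - s^2f(0) - sf'(0) - f''(0),
\end{aligned}
\tag{24.12}
\]

and so on.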
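
Example 24.11. Obtain the transform of the expression

\[ \frac{d^2x}{dt^2} + 2\frac{dx}{dt} + 3x, \]

when $x = 4$ and $dx/dt = 5$ at $t = 0$.

Put $\mathcal{L}\{x(t)\} = X(s)$. Then

\[
\begin{aligned}
\mathcal{L}\left\{\frac{d^2x}{dt^2} + 2\frac{dx}{dt} + 3x\right\}
&= \mathcal{L}\left\{\frac{d^2x}{dt^2}\right\} + 2\mathcal{L}\left\{\frac{dx}{dt}\right\} + 3\mathcal{L}\{x\} \\
&= s^2X - sx(0) - x'(0) + 2[sX - x(0)] + 3X \\
&= s^2X - 4s - 5 + 2(sX - 4) + 3X \\
&= (s^2 + 2s + 3)X - 4s - 13.
\end{aligned}
\]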

24.6 Application to differential equations


The results (24.12) enable initial-value problems for differential
equations to be solved.

Example 24.12. Find the solution of

\[ \frac{dx}{dt} + 2x = e^{-t} \]

for which $x = 3$ when $t = 0$.

Since

\[ \frac{dx}{dt} + 2x = e^{-t}, \]

it is also true that

\[ \mathcal{L}\left\{\frac{dx}{dt}\right\} + 2\mathcal{L}\{x\} = \mathcal{L}\{e^{-t}\}. \]

Write

\[ \mathcal{L}\{x(t)\} = X(s). \]

By (24.12) the transformed equation becomes

\[ sX - 3 + 2X = \frac{1}{s+1} \]

(where we put $x(0) = 3$ as specified by the initial condition). The transform $X(s)$
of $x(t)$ is therefore given by

\[ X(s) = \frac{3s+4}{(s+1)(s+2)} = \frac{1}{s+1} + \frac{2}{s+2}. \]

Therefore

\[ x(t) = e^{-t} + 2e^{-2t}, \]

which is the required solution.
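The solution of Example 24.12 can be cross-checked without transforms by integrating the differential equation numerically from the initial condition. The sketch below assumes a classical fourth-order Runge-Kutta scheme with a fixed step; the function names, the step count, and the end time are illustrative choices.

```python
import math

def rk4(rhs, x0, t0, t1, steps):
    """Classical fourth-order Runge-Kutta for dx/dt = rhs(t, x)."""
    h = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h * k1 / 2)
        k3 = rhs(t + h / 2, x + h * k2 / 2)
        k4 = rhs(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def rhs(t, x):
    """dx/dt = e^{-t} - 2x, the equation of Example 24.12 solved for dx/dt."""
    return math.exp(-t) - 2.0 * x

def x_closed(t):
    """Solution obtained by inverting X(s) = 1/(s+1) + 2/(s+2)."""
    return math.exp(-t) + 2.0 * math.exp(-2.0 * t)

t_end = 2.0
numeric = rk4(rhs, 3.0, 0.0, t_end, 2000)   # start from x(0) = 3
print(numeric, x_closed(t_end))
```

Note that the initial condition enters the numerical method as a starting value, whereas in the transform method it enters through the term $-x(0)$ in (24.12).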

It can be seen that the terms involving $f(0), f'(0), \ldots$ in (24.12),
far from being merely a nuisance, are exactly what is required to
translate a differential equation together with initial conditions into
a simpler problem in ordinary algebra. We do not have to match
up arbitrary constants with the initial conditions; these conditions
are built into the transformed equations.
In many physical situations, we want to know what happens
when an inactive or quiescent system is 'switched on'. In such cases,
we have zero initial conditions at $t = 0$. For a system described by
a second-order differential equation, we assume by this that the
variable and its first derivative are initially set to zero.

Example 24.13. A system is described by the equation

\[ \frac{d^2x}{dt^2} + 2\frac{dx}{dt} + 4x = 1. \]

It is initially quiescent and is then switched on. Find the subsequent time
variation of x.

We have $x(0) = x'(0) = 0$. Let

\[ x(t) \leftrightarrow X(s). \]

Then the equation transforms to

\[ s^2X + 2sX + 4X = \frac{1}{s} \]

(notice the $1/s$) so that

\[ X = \frac{1}{s(s^2+2s+4)} = \frac14\,\frac{1}{s} - \frac14\,\frac{s+2}{s^2+2s+4}. \]

The quadratic has no real factors; therefore the second term is rewritten in the
manner of Example 24.10:

\[ X = \frac14\,\frac{1}{s} - \frac14\,\frac{(s+1)+1}{(s+1)^2+3}
     = \frac14\,\frac{1}{s} - \frac14\left(\frac{s+1}{(s+1)^2+3} + \frac{1}{(s+1)^2+3}\right). \]

To invert the last two terms: from (24.10)

\[ \frac{s}{s^2+3} \leftrightarrow \cos\sqrt3\,t, \qquad \frac{1}{s^2+3} \leftrightarrow \frac{1}{\sqrt3}\sin\sqrt3\,t. \]

By using the shift rule (24.7) with $k = -1$, we obtain

\[ \frac{s+1}{(s+1)^2+3} \leftrightarrow e^{-t}\cos\sqrt3\,t, \qquad
   \frac{1}{(s+1)^2+3} \leftrightarrow \frac{1}{\sqrt3}\,e^{-t}\sin\sqrt3\,t. \]

Therefore

\[ x(t) = \frac14 - \frac14\left(e^{-t}\cos\sqrt3\,t + \frac{1}{\sqrt3}\,e^{-t}\sin\sqrt3\,t\right). \]
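For the switched-on system of Example 24.13, a direct numerical integration from the quiescent state gives an independent check on the inverted transform. The sketch assumes the second-order equation is rewritten as a first-order system in $(x, v)$ with $v = dx/dt$ and integrated by a fixed-step Runge-Kutta scheme; the closed form coded below is the one obtained from table (24.10) and the shift rule, including the factor $1/\sqrt3$ on the sine term. All names are illustrative.

```python
import math

def rk4_system(rhs, y0, t_end, steps):
    """Fixed-step fourth-order Runge-Kutta for a first-order system, from t = 0."""
    h = t_end / steps
    t, y = 0.0, list(y0)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, [yi + h * ki / 2 for yi, ki in zip(y, k1)])
        k3 = rhs(t + h / 2, [yi + h * ki / 2 for yi, ki in zip(y, k2)])
        k4 = rhs(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h * (a + 2 * b + 2 * c + d) / 6
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def rhs(t, y):
    """x'' + 2x' + 4x = 1 written as x' = v, v' = 1 - 2v - 4x."""
    x, v = y
    return [v, 1.0 - 2.0 * v - 4.0 * x]

def x_closed(t):
    """x(t) = 1/4 - (1/4) e^{-t} (cos(sqrt3 t) + sin(sqrt3 t)/sqrt3)."""
    r = math.sqrt(3.0)
    return 0.25 - 0.25 * math.exp(-t) * (math.cos(r * t) + math.sin(r * t) / r)

t_end = 3.0
x_num, v_num = rk4_system(rhs, [0.0, 0.0], t_end, 3000)   # x(0) = x'(0) = 0
print(x_num, x_closed(t_end))
```

As $t$ grows, the exponential terms die away and $x(t)$ settles at $\tfrac14$, the constant solution of $4x = 1$.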

Example 24.14. Solve the equation

\[ \frac{d^2x}{dt^2} + \omega_0^2 x = a\cos\omega_0 t, \]

with $x(0) = x'(0) = 0$.
If we put $\mathcal{L}\{x(t)\} = X(s)$, then the equation transforms into

\[ s^2X + \omega_0^2 X = \frac{as}{s^2+\omega_0^2}, \]

so that

\[ X = \frac{as}{(s^2+\omega_0^2)^2}. \]

We can read off the inverse from (24.9) with $k = \omega_0$:

\[ x(t) = \frac{a}{2\omega_0}\,t\sin\omega_0 t. \]

This equation is one of the exceptional resonant types discussed in Section 19.3.
The advantage of using the Laplace transform is easy to see.

Example 24.15. Solve the simultaneous equations

\[ \frac{dx}{dt} = x - y, \qquad \frac{dy}{dt} = x + y, \]

with the initial conditions $x(0) = 1$, $y(0) = 0$.

Let $\mathcal{L}\{x(t)\} = X(s)$ and $\mathcal{L}\{y(t)\} = Y(s)$. Then the transformed equations,
including initial conditions, are

\[ sX - 1 = X - Y, \qquad sY = X + Y. \]

Therefore

\[ (1-s)X - Y = -1, \]
\[ X + (1-s)Y = 0. \]

By solving these equations, we obtain

\[ X = \frac{s-1}{s^2-2s+2}, \qquad Y = \frac{1}{s^2-2s+2}. \]

The denominators, $s^2 - 2s + 2$, have no real factors, so use the method of
Example 24.10 to rewrite these expressions as

\[ X = \frac{s-1}{(s-1)^2+1}, \qquad Y = \frac{1}{(s-1)^2+1}, \]

so that the shift rule (24.7) can be used to invert them. By (24.10),

\[ \frac{s}{s^2+1} \leftrightarrow \cos t, \qquad \frac{1}{s^2+1} \leftrightarrow \sin t. \]

Therefore, by the shift rule with $k = 1$,

\[ x(t) = e^{t}\cos t, \qquad y(t) = e^{t}\sin t. \]

24.7 The unit function and the delay rule


The Heaviside unit function $H(t)$ (or $U(t)$) was introduced in
Section 1.4. Here is a reminder of its definition:

Unit function $H(t)$

\[ H(t) = \begin{cases} 0 & \text{when } t < 0, \\ 1 & \text{when } t \ge 0. \end{cases} \tag{24.13} \]

It is shown again in Fig. 24.1a. Figures 24.1b-e show how it can be
used to describe various step functions and switching functions.
For example, the composition of the three segments of Fig. 24.1e
is specified by:

\[ e^{t}[H(t-1) - H(t-2)] = \begin{cases}
e^{t}(0 - 0) = 0 & \text{if } t < 1, \\
e^{t}(1 - 0) = e^{t} & \text{if } 1 \le t < 2, \\
e^{t}(1 - 1) = 0 & \text{if } t \ge 2.
\end{cases} \]
Related Laplace transforms are given as follows.

Laplace transform for the unit function

\[ \mathcal{L}\{H(t)\} = \frac{1}{s}, \qquad \mathcal{L}\{H(t-c)\} = \frac{e^{-cs}}{s} \quad (c \text{ positive}). \tag{24.14} \]

The various combination rules such as the shift rule (24.7) work for
$H(t)$ in the same way as for smooth functions $f(t)$.

Example 24.16. Find $\mathcal{L}\{f(t)\}$ when $f(t) = e^{t}(H(t-1) - H(t-2))$.

This is the function shown in Fig. 24.1e. Then, from the definition,

\[
\begin{aligned}
\mathcal{L}\{f(t)\} &= \int_0^\infty e^{-st}e^{t}[H(t-1) - H(t-2)]\,dt
 = \int_1^2 e^{-(s-1)t}\,dt \\
&= -\frac{1}{s-1}\bigl[e^{-(s-1)t}\bigr]_1^2
 = -\frac{1}{s-1}\bigl(e^{-2(s-1)} - e^{-(s-1)}\bigr).
\end{aligned}
\]

Alternatively, we could use the shift rule (24.7), though it has no particular
advantage.

Fig. 24.1 (a) $x = H(t)$. (b) $x = H(t-c)$, (c) $x = H(t-d) - H(t-c)$,
(d) $x = t[H(t) - H(t-1)]$, (e) $x = e^{t}[H(t-1) - H(t-2)]$.

Example 24.17. Find the Laplace transform of the function shown in Fig. 24.2.

By considering the segments one at a time and using Fig. 24.1c, we have

\[
\begin{aligned}
x(t) &= [H(t) - H(t-1)] - [H(t-1) - H(t-2)] + [H(t-2) - H(t-3)] - \cdots \\
&= H(t) - 2H(t-1) + 2H(t-2) - 2H(t-3) + \cdots.
\end{aligned}
\]

From (24.14)

\[ H(t-n) \leftrightarrow \frac{e^{-ns}}{s}. \]

Therefore

\[ \mathcal{L}\{x(t)\} = \frac{1}{s} - \frac{2}{s}\bigl(e^{-s} - e^{-2s} + e^{-3s} - \cdots\bigr). \]

[Fig. 24.2]

The brackets contain an infinite geometric series with first term $e^{-s}$ and common
ratio $-e^{-s}$. Therefore

\[ \mathcal{L}\{x(t)\} = \frac{1}{s} - \frac{2}{s}\,\frac{e^{-s}}{1+e^{-s}} = \frac{1-e^{-s}}{s(1+e^{-s})}. \]
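The summation of the geometric series in Example 24.17 can be illustrated by comparing a long partial sum with the closed form. This is only a sketch: the value of `s` and the number of terms are arbitrary, and the function names are illustrative. The closed form is also equal to $(1/s)\tanh(s/2)$, which the code uses as a second check.

```python
import math

def partial_sum(s, terms):
    """1/s - (2/s)(e^{-s} - e^{-2s} + e^{-3s} - ...), truncated after `terms` terms."""
    series = sum((-1) ** (n + 1) * math.exp(-n * s) for n in range(1, terms + 1))
    return 1.0 / s - 2.0 * series / s

def closed_form(s):
    """(1 - e^{-s}) / (s (1 + e^{-s})), the summed result."""
    return (1.0 - math.exp(-s)) / (s * (1.0 + math.exp(-s)))

s = 0.7
approx = partial_sum(s, 200)
exact = closed_form(s)
print(approx, exact, math.tanh(s / 2.0) / s)
```

The series converges for every $s > 0$ because the common ratio $-e^{-s}$ then has modulus less than 1.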

Suppose that we have a function $g(t)$ which has a meaning for
all positive t, such as $g(t) = e^{-t}$. Its Laplace transform is
$G(s) = \int_0^\infty e^{-st} g(t)\,dt$. All values of $g(t)$ for t positive are called on
to contribute to this integral, but none of its values for negative t
are called upon (Fig. 24.3a).
Now translate the function a distance c (positive) to the right as
in Fig. 24.3b. The new graph represents $g(t-c)$. It brings with it a
section NA which originally corresponded to negative values of t.
We cannot expect that the Laplace transform of this new function
$g(t-c)$ can be expressed in terms of $G(s)$, because none of these t
values played any part in the calculation of $G(s)$.
Therefore we cut out the section NA by considering not $g(t-c)$,
but $g(t-c)H(t-c)$, which is shaded in Fig. 24.3b, and is congruent
to the shaded part of Fig. 24.3a.
Then

\[ \mathcal{L}\{g(t-c)H(t-c)\} = \int_0^\infty e^{-st} g(t-c)H(t-c)\,dt = \int_c^\infty e^{-st} g(t-c)\,dt. \]

Fig. 24.3 (a) Graph of $g(t)$. (b) Graph of $g(t-c)H(t-c)$.

Put $t - c = u$, so that $t = u + c$ and $dt = du$. The integral becomes

\[ \int_0^\infty e^{-s(u+c)} g(u)\,du = e^{-sc}\int_0^\infty e^{-su} g(u)\,du = e^{-sc}\,G(s). \]

This is the second shift rule, or the delay rule; so called because
$g(t-c)H(t-c)$ does not start until $t = c$.

Delay rule

If $G(s) \leftrightarrow g(t)$ and $c > 0$, then

\[ e^{-cs}\,G(s) \leftrightarrow g(t-c)H(t-c). \tag{24.15} \]

It is most often useful in inverting a Laplace transform.

Example 24.18. Find the Laplace inverse transform of $e^{-2s}/s^2$.

Put $G(s) = 1/s^2$. Then

\[ G(s) = \frac{1}{s^2} \leftrightarrow g(t) = t. \]

By the delay rule,

\[ \frac{e^{-2s}}{s^2} = e^{-2s}\,G(s) \leftrightarrow (t-2)H(t-2), \]

a function which suddenly takes off from zero at $t = 2$.



Example 24.19. Find the Laplace inverse transform of

\[ \frac{e^{-2(s+1)}}{(s+1)(s+2)}. \]

Put

\[ G(s) = \frac{1}{(s+1)(s+2)} = \frac{1}{s+1} - \frac{1}{s+2} \leftrightarrow g(t) = e^{-t} - e^{-2t}. \]

We require the inverse transform of $e^{-2(s+1)}G(s)$. By the delay rule with $c = 2$,
this is given by

\[ e^{-2(s+1)}G(s) = e^{-2}e^{-2s}G(s)
   \leftrightarrow e^{-2}\bigl(e^{-(t-2)} - e^{-2(t-2)}\bigr)H(t-2) = \bigl(e^{-t} - e^{-2t+2}\bigr)H(t-2). \]

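Example 24.20. Solve the differential equation

\[ \frac{dx}{dt} + 2x = f(t) \]

with $x(0) = 0$, where (Fig. 24.4)

\[ f(t) = \begin{cases} 0 & \text{when } t < 1, \\ e^{-t} & \text{when } 1 \le t \le 2, \\ 0 & \text{when } t > 2. \end{cases} \]

Let $\mathcal{L}\{x(t)\} = X(s)$. We need

\[ \mathcal{L}\{f(t)\} = \int_1^2 e^{-st}e^{-t}\,dt = \int_1^2 e^{-(s+1)t}\,dt
   = \frac{1}{s+1}\bigl(e^{-(s+1)} - e^{-2(s+1)}\bigr) = F(s). \]

The transformed equation is then

\[ sX + 2X = F(s), \quad\text{or}\quad X = F(s)/(s+2). \]

Therefore

\[
\begin{aligned}
X &= \bigl(e^{-(s+1)} - e^{-2(s+1)}\bigr)\frac{1}{(s+1)(s+2)} \\
&= \bigl(e^{-(s+1)} - e^{-2(s+1)}\bigr)\left(\frac{1}{s+1} - \frac{1}{s+2}\right) \\
&= e^{-1}e^{-s}\left(\frac{1}{s+1} - \frac{1}{s+2}\right) - e^{-2}e^{-2s}\left(\frac{1}{s+1} - \frac{1}{s+2}\right).
\end{aligned}
\]

Apply the delay rule with $c = 1$ and $c = 2$, noting that

\[ \frac{1}{s+1} - \frac{1}{s+2} \leftrightarrow e^{-t} - e^{-2t}. \]

We obtain

\[
\begin{aligned}
x(t) &= e^{-1}\bigl(e^{-(t-1)} - e^{-2(t-1)}\bigr)H(t-1) - e^{-2}\bigl(e^{-(t-2)} - e^{-2(t-2)}\bigr)H(t-2) \\
&= \bigl(e^{-t} - e^{1-2t}\bigr)H(t-1) - \bigl(e^{-t} - e^{2-2t}\bigr)H(t-2).
\end{aligned}
\]
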

Both terms are zero before ‘switch-on’ at t = 1. Between t = 1 and 2, only the
first term contributes. For t > 2 both terms are present, the second being
stimulated by ‘switching off’.
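The switched forcing of Example 24.20 can also be handled numerically, one interval at a time, which gives a check on the delay-rule answer. The sketch below assumes a fixed-step Runge-Kutta integrator and integrates separately over $0 \le t \le 1$, $1 \le t \le 2$ and $t \ge 2$ so that no step straddles a switching time; the function names are illustrative only.

```python
import math

def rk4(rhs, x0, t0, t1, steps):
    """Classical fourth-order Runge-Kutta for dx/dt = rhs(t, x)."""
    h = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h * k1 / 2)
        k3 = rhs(t + h / 2, x + h * k2 / 2)
        k4 = rhs(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def forced(t, x):
    """dx/dt = e^{-t} - 2x, valid while the forcing is on (1 <= t <= 2)."""
    return math.exp(-t) - 2.0 * x

def unforced(t, x):
    """dx/dt = -2x, valid while the forcing is off."""
    return -2.0 * x

def H(t):
    """Unit function (24.13)."""
    return 1.0 if t >= 0 else 0.0

def x_formula(t):
    """Solution obtained from the delay rule."""
    first = (math.exp(-t) - math.exp(1.0 - 2.0 * t)) * H(t - 1.0)
    second = (math.exp(-t) - math.exp(2.0 - 2.0 * t)) * H(t - 2.0)
    return first - second

x_at_1 = rk4(unforced, 0.0, 0.0, 1.0, 1000)      # quiescent before switch-on
x_mid = rk4(forced, x_at_1, 1.0, 1.5, 500)       # halfway through the forcing
x_at_2 = rk4(forced, x_mid, 1.5, 2.0, 500)
x_at_3 = rk4(unforced, x_at_2, 2.0, 3.0, 1000)   # free decay after switch-off
print(x_mid, x_formula(1.5))
print(x_at_3, x_formula(3.0))
```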

Problems

The dot notation, x = dx/dt, x = d2x/dt2, etc. is used (e) 2x + 3x — 2x, where x(0) = 5, x(0) = — 2;
in some of the questions. (f) 3x — 5x + x — 1, where x(0) = 0, x(0) = 0.

24.1. Write down L{x(t)}, where x(t) is as follows, 24.7. Use the Laplace transform to solve the following
(a) e'; (b)4e_I; (c)3e'-e"'; initial-value problems.
(d) 3f2 - 1; (e) it3 + 2f2 - 3; (f) 3 + 214; (a) x + 3x + 2x = 0, x(0) = 0, x(0) = 1;
(g) 3 sin t — cos f; (h) 2(cos t — sin f); (b) x + x — 2x = 0, x(0) = 3, x(0) = 0;
. , 1 1 , 1 (c) x + 4x = 0, x(0) = x0, x(0) = y0;
(i) 1 H— t H— t + ■ ■ ■ -i— t" (you get a geometric (d) x + co2x = 0, x(0) = c, x(0) = 0.
1! 2! n!
(e) x + 2x + 5x = 0, x(0) = 3, x(0) = — 3;
series; see Section 1.13).
(f) d4y/dx4 — y = 0, y(0) = 1, y'(0) = 0, y"(0) = 0,
y"'(0) = 0 (use x instead of t as the variable in the
24.2. (Scale rule). Find L{x(t)} for the following cases Laplace transform).
of x(t).
(a) e3'; (b) 1 — 2 e~2'; (c) sin cot;
24.8. Use the Laplace transform to solve the following
(d) cos cor; (e) 3 cos 2t — 2 sin 2t;
initial-value problems.
(f) cos2 t (express it in terms of cos 21);
(a) x = 1 + t + e', x(0) = 0, x(0) = 0;
(g) sin2 t (see (f)).
(b) x + x = 3, x(0) = 0, x(0) = 1;
(c) x + 2x -I- 2x = 3, x(0) = 1, x(0) = 0;
24.3. (See Section 24.3). Find L{x(f)} in the follow¬ (d) x — x = e2'. x(0) = 0, x(0) = 1;
ing cases of x(t). (e) x — x = t e'; x(0) = 1, x(0) = 1;
(a) t2 e' (easiest to start with f2); (b) t e~2'; (f) x — 4x = 1 — e2', x(0) = 1, x(0) = — 1;
(c) f2e“'; (d)e2'cosf; (e)e“'sinf; (g) x - 4x = e2' + e“2', x(0) = 0, x(0) = 0;
(f) e' sin 3f; (g) e~2' sin 3f; (h) e~3' cos 2r; (h) x + co2x = C cos cot, x(0) = x0, x(0) = y0;
(i) fcos3f; (j) r sin 3r; (k) t2 sin f; (i) x — 2x - x + 2x = e’2', x(0) = 0, x(0) = 0,
(1) t4e-' (compare the three methods: (i) start with f4 x(0) = 2 (look out for factors in the denominator of
and use the shift rule, (ii) start with e-' and use (24.8), X(s)).
(iii) work directly from the definition (24.1)).
24.9. Solve the following simultaneous first-order
24.4. Obtain the Laplace transform for t sin kt by differ¬ differential equations, for the given initial values.
entiating that of cos kt with respect to k. (a) x = x — y, j> = x + y, x(0) = 1, y(0) = 0;
(b) x = 2x + 4y + e4', y = x + 2y, x(0) = 1,
y(0) = 0;
24.5. Invert the following Laplace transforms.
(c) x = x - 4y, y = x + 2y, x(0) = 2, y(0) = 1.
(a) 1/s2; (b) 1/s; (c) 3/2s; (d) 3/s5; (e) l/(s - 3);
(f) l/(s + 4); (g) 3/(2s - 1); (h) 2/(2 - 3s);
(i)l/s(s-l); (j) l/(s2 + s - 1); (k) s/(s2 — 1); 24.10. Find the general solution of the following by
(1) (2s - l)/(s2 - 1); (m) s/(s2 + 1); (n) l/(s2 + 4); putting x(0) = A, x(0) = B, where A and B are arbitrary,
(o) (2s - l)/(s2 + 4); (p) (2s - l)/s(s - 1); (a) x + x = e'; (b) x - x = 3; (c) x - 2x + x = e.
(q) (s2 - l)/s(s - l)(s + 2)(s + 3);
(r) s/(s - l)(s2 + 1); (s) l/(s - l)3; 24.11. Find the general solution of d4y/dx4 — y = e',
(t) (2s + l)/(s2 - 2s + 2); (u) s/(s2 + l)(s2 + 4). by putting y(0) = A, y'(0) = B, /(0) = C, y"'(0) = D,
where A, B, C, D are arbitrary. (Let the variable in
the Laplace transform (24.1) be x instead of f.)
24.6. Find the Laplace transform of the following ex¬
pressions involving x(t), where L{x(t)} = 3((s).
(a) x(t), where x(0) = 6; (b) x(t), where x(0) = 0; 24.12. This is a system of first-order equations for x0(t),
(c) x(t), where x(0) = 3, x(0) = 5; x,(f), ...,x„(t):
(d) x(f), where x(0) = 0, x(0) = 0;
*0 = fi(xr-1 xr)

for r=l, 2,Solve them by using the Laplace (a) x + x = fit), where

transform, showing that xr = — (/fry e~p'. 1 for 0 < t ^ 1,


rl m=
(0 for r > 1 .
(b) x — 4x = f(t), where
24.13. Use the delay rule (24.15) to obtain the Laplace
transform of e~'(f — 2) cos(r — 2)H(f — 2). \\ for 0 < f < 1,
m=
(.0 for f > 1.

24.14. Find the functions which give rise to the following (c) x — 4x = /(r), where
Laplace transforms: t for 0 < t < 1,
(a) e-27(s + 3); (b) (1 — se~s)/(s2 + 1);
(c) e-27(s — 4); (d) se~s/(s + l)(s + 2); fit) = \ 2 - t for 1 < t ^ 2,
(e) e"7(s- l)(s2 — 2s + 2).
to for t > 2.
(d) x + x = /(r), where
24.15. Solve the following differential equations
assuming that the initial state is, of quiescence x(0) = fcos t for 0 < t ^ 7t,
fit) =
x(0) = 0: [O for t > n.
Applications of the
Laplace transform
Contents
25.1 Division by s and integration 402
25.2 The impulse function 404
25.3 Impedance in the s domain 406
25.4 Transfer functions in the s domain 408
25.5 The convolution theorem 413
25.6 General response of a system from its impulsive response 415
25.7 Convolution integral in terms of memory 416
25.8 Discrete systems 417
25.9 The z transform 419
25.10 Behaviour of z transforms in the complex plane 424
25.11 Difference equations 428
Problems 430

25.1 Division by s and integration


Multiplication by s is associated with differentiation (see (24.12)).
Division by s is associated with integration, as follows.

Division rule

If $G(s) \leftrightarrow g(t)$, then

\[ \frac{1}{s}\,G(s) \leftrightarrow \int_0^t g(\tau)\,d\tau. \tag{25.1} \]

To prove this, put $(1/s)G(s) = F(s)$; then we must express $f(t)$
in terms of $g(t)$. Rewrite the relation between $F(s)$ and $G(s)$ in the
form

\[ sF(s) = G(s). \]

But from (24.12) we know that, in general,

\[ \frac{df}{dt} \leftrightarrow sF(s), \quad\text{provided that } f(0) = 0; \]

so then we have $df/dt \leftrightarrow G(s)$. This is equivalent to the initial-value
problem $df/dt = g(t)$, with $f(0) = 0$. The solution is

\[ f(t) = \int_0^t g(\tau)\,d\tau. \]

Example 25.1. Find $f(t)$ when $F(s) = 1/s(s^2+1)$, (a) by using partial fractions,
(b) by using (25.1).

(a)
\[ \frac{1}{s(s^2+1)} = \frac{1}{s} - \frac{s}{s^2+1} \leftrightarrow 1 - \cos t. \]

(b) In the notation of (25.1), put

\[ G(s) = \frac{1}{s^2+1} \leftrightarrow \sin t. \]

Therefore

\[ F(s) = \frac{1}{s(s^2+1)} = \frac{1}{s}\,\frac{1}{s^2+1} \leftrightarrow \int_0^t \sin\tau\,d\tau = 1 - \cos t. \]

Figure 25.1 shows a capacitor, of capacitance C, being charged
by a current $i(t)$, the voltage drop across the plates being $v(t)$.
Assume that the capacitor is uncharged at $t = 0$; then, at a later
time t,

\[ v(t) = \frac{1}{C}\int_0^t i(\tau)\,d\tau. \]

[Fig. 25.1]

Therefore, according to (25.1), the relation between the Laplace
transforms of $v(t)$ and $i(t)$ is

\[ V(s) = \frac{1}{Cs}\,I(s). \tag{25.2} \]

We say that (25.2) describes the situation in the s domain, as we
spoke of description in the frequency, or $\omega$, domain in Section 21.3.
If the capacitor has an initial charge $q_0$, then

\[ v(t) = \frac{1}{C}\left[\int_0^t i(\tau)\,d\tau + q_0\right]. \]

Since $q_0 \leftrightarrow s^{-1}q_0$, this transforms into

\[ V(s) = \frac{1}{C}\,\frac{1}{s}\,[I(s) + q_0]. \tag{25.3} \]

We shall not be concerned with this case.

Example 25.2. The circuit shown is switched on at time $t = 0$. It is initially
quiescent, and there is zero charge on the capacitor. Find the current for $t > 0$.

[Fig. 25.2]

The circuit equation is

\[ v_0\cos\omega t = Ri(t) + \frac{1}{C}\int_0^t i(\tau)\,d\tau. \]

Such an equation is called an integral equation for $i(t)$. The Laplace transform
of the equation is

\[ \frac{v_0 s}{s^2+\omega^2} = RI(s) + \frac{1}{Cs}\,I(s), \]

so

\[
\begin{aligned}
I(s) &= \frac{v_0}{R}\,\frac{s^2}{(s + 1/RC)(s^2+\omega^2)} \\
&= \frac{v_0}{R}\,\frac{1}{1+(RC\omega)^2}\left(\frac{(RC\omega)^2 s}{s^2+\omega^2} - \frac{RC\omega^2}{s^2+\omega^2} + \frac{1}{s + 1/RC}\right)
\end{aligned}
\]

after splitting into partial fractions. Therefore, for $t > 0$,

\[ i(t) = \frac{v_0}{R}\,\frac{1}{1+(RC\omega)^2}\bigl[(RC\omega)^2\cos\omega t - RC\omega\sin\omega t + e^{-t/RC}\bigr]. \]

The first two terms represent a steady forced oscillation and the final term is a
transient.

25.2 The impulse function


Figure 25.3 shows the graph of a function which is zero everywhere
except for a tall, narrow rectangle with width $\varepsilon$ and height $1/\varepsilon$, so
that the area under the graph is equal to 1. Imagine that $\varepsilon$ is a very
small number, as small as we wish. This very tall and very narrow
picture is a simplified version of the impulse function or delta
function, usually denoted by $\delta(t)$. It is used in problems involving
sudden and brief events, to represent (say) impulsive force between
two bodies in collision; voltage from a lightning strike; or, if the
variable is position rather than time, a point force.

[Fig. 25.3]

The impulse or delta function $\delta(t)$

Informal definition: $\delta(t) = 1/\varepsilon$ for $0 < t < \varepsilon$, and
$\delta(t) = 0$ elsewhere.                                        (25.4)
In Figure 25.4, $\delta(t)$ is moved to the right so as to be at $t = c$; the
vertical strip therefore represents $\delta(t-c)$. An ordinary function $f(t)$
crosses it at C. Consider the integral

\[ \int_a^b f(t)\,\delta(t-c)\,dt, \]

where c lies between a and b. The integrand is zero except between
c and $c + \varepsilon$; over this very narrow interval, $f(t)$ hardly changes from
the value $f(c)$. Therefore (as closely as we wish)

\[ \int_a^b f(t)\,\delta(t-c)\,dt = \int_c^{c+\varepsilon} f(c)\,\varepsilon^{-1}\,dt = f(c). \]

If c does not lie between a and b, then the integral is zero. The delta
function is sometimes called a sifting function because of this
property.

Sifting property of $\delta(t)$

\[ \int_a^b f(t)\,\delta(t-c)\,dt = \begin{cases} f(c) & \text{if } a \le c < b, \\ 0 & \text{otherwise.} \end{cases} \tag{25.5} \]
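The phrase 'as closely as we wish' can be made concrete by replacing $\delta(t-c)$ with the rectangle of width $\varepsilon$ and height $1/\varepsilon$ and letting $\varepsilon$ shrink. The sketch below assumes $f(t) = \cos t$ and $c = 1$ purely for illustration, and evaluates the integral over the rectangle by the midpoint rule; the function name is hypothetical.

```python
import math

def rectangle_sift(f, c, eps, n=1000):
    """Integral of f(t) * (1/eps) over c < t < c + eps, by the midpoint rule.

    This stands in for the integral of f(t) * delta(t - c) when the delta
    function is replaced by a rectangle of width eps and height 1/eps.
    """
    h = eps / n
    return sum(f(c + (i + 0.5) * h) for i in range(n)) * h / eps

c = 1.0
widths = (1e-1, 1e-2, 1e-3, 1e-4)
errors = [abs(rectangle_sift(math.cos, c, eps) - math.cos(c)) for eps in widths]
for eps, err in zip(widths, errors):
    print(eps, err)
```

The error shrinks roughly in proportion to $\varepsilon$, which is the sense in which the rectangle 'sifts out' the value $f(c)$.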

We can obtain the Laplace transform of $\delta(t)$ from (25.5):

Laplace transform of $\delta(t-c)$

\[ \mathcal{L}\{\delta(t-c)\} = \int_0^\infty e^{-st}\,\delta(t-c)\,dt = e^{-cs} \tag{25.6} \]

for $c \ge 0$. In particular, $\mathcal{L}\{\delta(t)\} = 1$.

Example 25.3. The equation $d^2x/dt^2 + \omega^2 x = f(t)$ represents the displacement
x of a particle of unit mass on a spring of stiffness $\omega^2$ with external force $f(t)$.
Find the motion for $t > 0$ if the particle is subjected to an impulse $I\,\delta(t-1)$ at
time $t = 1$, assuming equilibrium at $t = 0$. (I has dimensions [force × time].)

The equation is $d^2x/dt^2 + \omega^2 x = I\,\delta(t-1)$. Its transform is

\[ s^2X + \omega^2 X = I\,e^{-s}, \]

where $x(t) \leftrightarrow X(s)$. Therefore

\[ X(s) = \frac{I\,e^{-s}}{s^2+\omega^2}. \]

We know that

\[ \frac{1}{s^2+\omega^2} \leftrightarrow \frac{1}{\omega}\sin\omega t; \]

so, by the delay rule (24.15), we have

\[ X(s) = \frac{I}{s^2+\omega^2}\,e^{-s} \leftrightarrow x(t) = \frac{I}{\omega}\sin\omega(t-1)\,H(t-1), \]

where H stands for the unit function (24.13). There is no motion until $t = 1$,
when the impulse sets up free oscillations $(I/\omega)\sin\omega(t-1)$.

Example 25.4. Find the current resulting from an impulsive voltage $I_v\,\delta(t)$
applied to the circuit of Fig. 25.5, the current being zero before application of the
voltage. (The dimensions of $I_v$ are [emf × time].)

The equation for the current is $L\,di/dt + Ri = I_v\,\delta(t)$. After transformation, with
$i(0) = 0$, it becomes

\[ LsI(s) + RI(s) = I_v. \]

Therefore

\[ I(s) = \frac{I_v}{L(s + R/L)} \leftrightarrow i(t) = \frac{I_v}{L}\,e^{-Rt/L}. \]

The great, though brief, applied voltage gives only a finite current because of
the counter-emf generated by the coil.

The delta function can be regarded formally as the derivative of
the unit function $H(t)$. As in Fig. 25.6a, smooth out the transition
of $H(t)$, from zero to one, as t passes through the origin, by means
of a sloping straight-line segment. The derivative of this function is
equal to zero outside the transition interval $(0, \varepsilon)$ and equal to $1/\varepsilon$
inside it; this specifies $\delta(t)$ as in (25.4).

Connection between $H(t)$ and $\delta(t)$

\[ \frac{dH(t)}{dt} = \delta(t). \tag{25.7} \]

[Fig. 25.6]

This only conforms with the Laplace-transform derivative rule
(24.12),

\[ \mathcal{L}\left\{\frac{dH(t)}{dt}\right\} = s\,\mathcal{L}\{H(t)\} - H(0) = 1, \]

if we rather arbitrarily interpret $H(0)$ as being zero. It should be
understood that certain weaknesses result from treating the impulse
function very informally; the real justification for its use is an
elaborate mathematical subject called distribution theory.

25.3 Impedance in the s domain


In Table (25.8) below three basic circuit elements are shown,
together with their voltage-drop-current relations and the Laplace
transforms of these relations, on the assumption that, at $t = 0$, the
current through the inductor and the charge on the capacitor are zero.
The expression 's domain' refers to transformed quantities.

                   Resistor            Inductor                      Capacitor

Time domain:       $v(t) = Ri(t)$      $v(t) = L\,\dfrac{di(t)}{dt}$   $v(t) = \dfrac{1}{C}\displaystyle\int_0^t i(\tau)\,d\tau$      (25.8)
s domain:          $V(s) = RI(s)$      $V(s) = LsI(s)$               $V(s) = (1/Cs)I(s)$
Impedance $Z(s)$:  $R$                 $Ls$                          $1/Cs$

Table (25.8) should be compared with the Table (25.5) for the case
of steady forced oscillations of frequency $\omega/2\pi$. The impedances
$Z(s)$ in the s plane are analogous to the complex impedances $R$,
$j\omega L$, and $1/j\omega C$ of (21.6) for the steady case. One can pass from
one to the other by substituting $j\omega$ for $s$, or $-js$ for $\omega$. However,
the s forms allow arbitrary inputs to the circuit to be considered.

Impedances combine in series and parallel in the same way as do
complex impedances (see (21.7)) in the frequency domain, but it is
to be remembered that they refer to zero initial conditions only.

Combination of impedances $Z(s)$ in the s domain
for zero initial state

Impedances in series

\[ Z = Z_1 + Z_2 + \cdots. \]

Impedances in parallel

\[ \frac{1}{Z} = \frac{1}{Z_1} + \frac{1}{Z_2} + \cdots. \tag{25.9} \]

Example 25.5. The circuit shown in Fig. 25.7a is initially quiescent, with zero
charge on the capacitor. The constant voltage $v_0$ is switched on at $t = 1$ and off
at $t = 2$. Find the current $i(t)$.

The corresponding s domain impedances are shown in Fig. 25.7b, in which the
elements R and C are grouped. They are in parallel, so (25.8) and (25.9) give

\[ \frac{1}{Z_1} = \frac{1}{R} + \frac{1}{(Cs)^{-1}} = \frac13 + \frac{s}{12} = \frac{s+4}{12}. \]

Hence

\[ Z_1 = \frac{12}{s+4}, \]

and also $Z_2 = Ls = 4s$. Then Z for the whole circuit is given by

\[ Z = Z_1 + Z_2 = 4s + \frac{12}{s+4} = \frac{4(s+1)(s+3)}{s+4}. \]

Therefore

\[ I(s) = \frac{s+4}{4(s+1)(s+3)}\,V(s). \]

Taking into account switch-on at $t = 1$ and switch-off at $t = 2$,

\[ v(t) = v_0[H(t-1) - H(t-2)], \]

so

\[ V(s) = v_0\left(\frac{1}{s}\,e^{-s} - \frac{1}{s}\,e^{-2s}\right). \]

[Fig. 25.7]

Therefore

\[
\begin{aligned}
I(s) &= \frac{v_0(s+4)}{4s(s+1)(s+3)}\,\bigl(e^{-s} - e^{-2s}\bigr) \\
&= v_0\left(\frac{1/3}{s} - \frac{3/8}{s+1} + \frac{1/24}{s+3}\right)\bigl(e^{-s} - e^{-2s}\bigr).
\end{aligned}
\]

The bracketed factor transforms back to

\[ v_0\bigl(\tfrac13 - \tfrac38 e^{-t} + \tfrac1{24} e^{-3t}\bigr), \]

and, by using the delay rule (24.15) to deal with the exponentials,

\[
\begin{aligned}
i(t) = {} & v_0\bigl(\tfrac13 - \tfrac38 e^{-(t-1)} + \tfrac1{24} e^{-3(t-1)}\bigr)H(t-1) \\
& - v_0\bigl(\tfrac13 - \tfrac38 e^{-(t-2)} + \tfrac1{24} e^{-3(t-2)}\bigr)H(t-2).
\end{aligned}
\]

Nothing happens until the system is switched on at $t = 1$, when the first term
(only) is activated. At $t = 2$, when it is switched off, the second term comes in
also; some current persists but it dies away to zero.
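The three partial-fraction coefficients in Example 25.5 can be recomputed with the 'cover-up' rule for simple poles: the coefficient of $1/(s-p)$ is the rest of the expression evaluated at $s = p$. The sketch below uses exact rational arithmetic from the standard `fractions` module; the function names are illustrative, and the cover-up rule is used here as a convenient check rather than as the method of Section 1.12.

```python
from fractions import Fraction

# Poles of (s + 4) / (4 s (s + 1)(s + 3)).
poles = [Fraction(0), Fraction(-1), Fraction(-3)]

def numerator(s):
    return s + 4

def cover_up(pole):
    """Coefficient of 1/(s - pole): evaluate everything except that factor at the pole."""
    denom = Fraction(4)
    for other in poles:
        if other != pole:
            denom *= (pole - other)
    return numerator(pole) / denom

coeffs = [cover_up(p) for p in poles]
print(coeffs)          # coefficients of 1/s, 1/(s+1), 1/(s+3)
print(sum(coeffs))     # value of the bracketed time function at t = 0
```

The coefficients add up to zero, so the bracketed time function vanishes at $t = 0$; after the delay this means the current starts from zero at switch-on, as expected for a circuit with an inductor in series.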

It must be emphasized that such a problem is considerably


complicated when the initial conditions are not zero. For example,
the expression (25.3) for an initially charged capacitor is not in the
form of a voltage-impedance-current relationship. In such cases, it
is necessary to start with the differential equations for individual
branches.

25.4 Transfer functions in the s domain


The impedance Z(s) which directly connects the current in a unit
with the voltage drop across the same unit is a special case of a
more general idea: to relate any two currents or voltages which occur
in the network.
We suppose as before that we have a passive circuit consisting of
linear resistors, capacitors, and inductors, and a single source of
voltage which drives the circuit. Figure 25.8 represents such a
network. We denote the driving voltage by f(t), because much of
what we say can be taken over into mechanical and other systems.
The unknown voltages and currents we call the variables.

Suppose that there are N currents and voltages to be determined,
which we call $x_1(t), x_2(t), \ldots, x_N(t)$, with transforms $X_1(s), X_2(s),
\ldots, X_N(s)$. A voltage $f(t)$, with transform $F(s)$, is applied somewhere
in the network. Provided that the circuit is initially quiescent (zero
initial conditions), each of the s domain equations for the currents
and voltages takes one of only two possible forms:

either  $a_1X_1 + a_2X_2 + \cdots + a_NX_N = 0$

or      $b_1X_1 + b_2X_2 + \cdots + b_NX_N = F$,

where the coefficients are functions of s. Therefore the transforms
$X_1, X_2, \ldots, X_N$ are all proportional to F:

\[ X_n(s) = G_n(s)F(s), \]

for $n = 1, 2, \ldots, N$. The $G_n$ are functions which depend only on
the circuit constants and not on the applied voltage. They are
called transfer functions. Nominate an arbitrary variable $p(t)$ in
any branch as the input, and another variable $q(t)$ elsewhere as the
output. The corresponding $G_n$ are denoted by $G_P$ and $G_Q$. The
driving voltage $f(t)$ may serve as an input if we wish. Assuming that
we start with zero currents and charges, which implies zero initial
conditions, the transforms $P(s)$ and $Q(s)$ of $p(t)$ and $q(t)$ are
related by

\[ Q(s)/P(s) = G_QF/G_PF = G_{PQ}(s), \]

say, where $G_{PQ}(s)$ is called the transfer function from p to q.

Transfer function $G_{PQ}(s)$ (zero initial conditions)

Let $p(t)$ (input) and $q(t)$ (output) be the voltage or
current in any two branches. Then

\[ Q(s)/P(s) = G_{PQ}(s), \]

where $G_{PQ}(s)$ is the transfer function from p to q. $G_{PQ}$
depends only on the circuit parameters.

Transfer functions which connect different types of variable are given
various names and conventional symbols in literature on systems.
For example, in the s domain, voltage ÷ current is impedance;
current ÷ voltage is admittance; voltage ÷ voltage is voltage gain,
and so on.

Example 25.6. Find the transfer function $G(s)$ from the voltage transform $P(s)$
over R, regarded as the input, and the voltage transform $Q(s)$ over C, regarded as
the output, in Fig. 25.9a.

Let the current $i(t)$ be as indicated. The impedances of the various groups are
shown in Fig. 25.9b; these are in fact transfer functions between current and
voltage for each unit. In terms of the transforms,

\[ P(s) = RI(s), \qquad Q(s) = \frac{1}{Cs}\,I(s). \]

Therefore

\[ G(s) = \frac{Q(s)}{P(s)} = \frac{1}{RCs}. \]

[Fig. 25.9]

Thus

\[ Q(s) = \frac{1}{RCs}\,P(s), \]

and so $q(t) = \dfrac{1}{RC}\displaystyle\int_0^t p(\tau)\,d\tau$, as expected.

Suppose now that we have a circuit such as the one in Fig. 25.10a, called Circuit A, where $p(t)$ is the input voltage and $q(t)$ the output voltage. Figure 25.10b schematizes the arrangement and specifies the transfer function $G(s) = Q(s)/P(s)$ between $p$ and $q$.

We could also symbolize the dependence of $q$ on $p$ by the scheme in Fig. 25.10c. However, this figure suggests the beginnings of some kind of series arrangement: it looks as if we could attach another circuit to the original one without altering the transfer function, and so get an easy calculation for the combined circuit. This is not true in general, but sometimes it is a useful approximation.

[Fig. 25.10: (a) circuit A; (b) circuit A, $s$ domain; (c) block scheme from $P(s)$ to $Q(s)$.]

To illustrate this question, we will append to Circuit A another Circuit B. It is shown in Fig. 25.11a, together with its $s$ domain representation and its transfer function. In Fig. 25.11b, A and B are connected across MN; here $p$, $q$, $r$, and their transforms $P$, $Q$, $R$ represent the actual voltages across the terminals indicated.

The question is, do the transfer functions written in the boxes still correctly give $Q(s)$ in terms of $P(s)$, and then $R(s)$ in terms of $Q(s)$?

[Fig. 25.11: (a) circuit B and circuit B in the $s$ domain; (b) circuit A and circuit B connected across MN.]

If an appreciable amount of current passes between Circuits A and B after attachment, then $Q(s)$ must change, so the true transfer functions of both of the circuits will be changed, and the changes will not compensate each other. In special circumstances, however, the circuits may behave almost independently, or can be made to do so by means of technical arrangements such as feedback.

Example 25.7. The two circuits A and B shown in Fig. 25.12 are connected to form a composite circuit C. Show that

$$G_C(s) \approx G_A(s)G_B(s)$$

(where the $G(s)$ are the transfer functions for the voltages shown) if $1/R$ is much smaller than $1/r + 1/r_1$.

[Fig. 25.12: circuit A, circuit B, circuit C.]

For Circuit A alone:

$$G_A(s) = \frac{V_A(s)}{V(s)} = \frac{r}{r + r_1}.$$

For Circuit B alone:

$$G_B(s) = \frac{V_B(s)}{V_2(s)} = \frac{\text{impedance of } C}{\text{total impedance}} = \frac{1}{Cs}\,\frac{1}{R + Ls + 1/Cs}.$$

Therefore

$$G_A(s)G_B(s) = \frac{1}{Cs}\,\frac{r}{r + r_1}\,\frac{1}{R + Ls + 1/Cs}.$$

For Circuit C, by following the voltage drops around closed subcircuits as usual, we get

$$V = r_1 I + r(I - I_1),$$

$$0 = (R + Ls + 1/Cs)I_1 - r(I - I_1),$$

from which

$$I_1 = \frac{Vr}{(r + r_1)(r + R + Ls + 1/Cs) - r^2},$$

which represents the current ‘leaking’ between A and B. Therefore

$$G_C(s) = \frac{V_C(s)}{V(s)} = \frac{1}{Cs}\,\frac{r}{(r + r_1)(r + R + Ls + 1/Cs) - r^2},$$

which we have to compare with $G_A(s)G_B(s)$ above. Since $(r + r_1)r - r^2 = rr_1$, we can rewrite $G_C(s)$ in the form

$$G_C(s) = \frac{1}{Cs}\,\frac{r}{r + r_1}\,\frac{1}{R + Ls + 1/Cs + rr_1/(r + r_1)}.$$

It can be seen that $G_C(s) \approx G_A(s)G_B(s)$ if $rr_1/(r + r_1)$ is much smaller than $R$. But

$$\frac{rr_1}{r + r_1} = \frac{1}{1/r + 1/r_1},$$

and this is much smaller than $R$ if $1/r + 1/r_1$ is much greater than $1/R$. The relation between the circuits could be represented in this case approximately by Fig. 25.13, as if they processed the voltage signals independently.
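The size of the loading effect in Example 25.7 can be explored numerically. The following is only a sketch: the component values and the sample point $s = 2j$ are illustrative assumptions, not data from the example, and the function names are hypothetical.

```python
# Sketch: compare the exact transfer function of the composite circuit C
# with the product G_A * G_B. All numerical values are illustrative assumptions.

def g_product(s, r, r1, R, L, C):
    """G_A(s) * G_B(s): the two circuits treated as independent."""
    Z = R + L * s + 1 / (C * s)
    return (1 / (C * s)) * (r / (r + r1)) / Z

def g_composite(s, r, r1, R, L, C):
    """G_C(s) obtained from the loop equations of the connected circuit."""
    Z = R + L * s + 1 / (C * s)
    return (1 / (C * s)) * r / ((r + r1) * (r + Z) - r**2)

def relative_error(s, r, r1, R, L, C):
    exact = g_composite(s, r, r1, R, L, C)
    return abs(g_product(s, r, r1, R, L, C) - exact) / abs(exact)

s = 2j  # sample point s = j*omega with omega = 2 (assumed)
# 1/R much smaller than 1/r + 1/r1: the product is a good approximation.
weak_loading = relative_error(s, r=1.0, r1=1.0, R=1000.0, L=1.0, C=1.0)
# 1/R comparable with 1/r + 1/r1: the product is a poor approximation.
strong_loading = relative_error(s, r=1.0, r1=1.0, R=1.0, L=1.0, C=1.0)
print(weak_loading, strong_loading)
```

With these assumed values the relative error is small in the first case and substantial in the second, in line with the condition stated in the example.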

[Fig. 25.13: block diagram of A, with transfer function $r/(r + r_1)$, followed by B, with transfer function $(1/Cs) \cdot 1/(R + Ls + 1/Cs)$.]

Example 25.8. Figure 25.14 shows a chain of three systems, which act independently upon their inputs according to the transfer functions $G_A(s)$, $G_B(s)$, and $G_C(s)$ indicated in the boxes. Find the transfer function $G(s)$ between $F(s)$ and $F_C(s)$. Find $f_C(t)$ when $f(t) = H(t)$, for zero initial conditions.

[Fig. 25.14: device A, device B, device C connected in a chain.]

We have

$$\frac{F_C}{F} = \frac{F_C}{F_B}\,\frac{F_B}{F_A}\,\frac{F_A}{F} = \frac{1}{s + 2}\,\frac{1}{s + 1}\,\frac{1}{s},$$

so

$$G(s) = \frac{1}{s(s + 1)(s + 2)}.$$

Now let $f(t) = H(t)$; then $F(s) = 1/s$. Therefore

$$F_C(s) = G(s)F(s) = \frac{1}{s(s + 1)(s + 2)}\,\frac{1}{s} = \frac{1}{s^2(s + 1)(s + 2)}.$$

In partial fractions,

$$F_C(s) = -\frac{3}{4}\,\frac{1}{s} + \frac{1}{2}\,\frac{1}{s^2} + \frac{1}{s + 1} - \frac{1}{4}\,\frac{1}{s + 2}.$$

Therefore $f_C(t) = -\frac{3}{4} + \frac{1}{2}t + e^{-t} - \frac{1}{4}e^{-2t}$ for $t > 0$.
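The partial fractions in Example 25.8 can be checked with exact rational arithmetic. This is a minimal sketch; the function names are hypothetical, and agreement at a few sample values of $s$ is a spot check rather than a proof.

```python
from fractions import Fraction
import math

def fc_transform(s):
    """F_C(s) = 1 / (s^2 (s + 1)(s + 2))."""
    return Fraction(1) / (s**2 * (s + 1) * (s + 2))

def fc_partial_fractions(s):
    """The partial-fraction form quoted in the example."""
    return (Fraction(-3, 4) / s + Fraction(1, 2) / s**2
            + Fraction(1) / (s + 1) - Fraction(1, 4) / (s + 2))

# Exact comparison at several rational sample points.
checks = [fc_transform(Fraction(k)) == fc_partial_fractions(Fraction(k))
          for k in (1, 2, 3, 5, 7)]

def f_c(t):
    """Time-domain output f_C(t) for t > 0."""
    return -0.75 + 0.5 * t + math.exp(-t) - 0.25 * math.exp(-2 * t)

print(all(checks), f_c(0.0))
```

The value $f_C(0) = 0$ is consistent with the zero initial conditions assumed in the example.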

It is possible to get an idea of important features of an output without going through the whole calculation:

Example 25.9. In a particular system, the output $X(s)$ in the $s$ domain is related to the input $F(s)$ by $X(s) = G(s)F(s)$. Find the general character of $x(t)$ if $f(t) = \cos 2t$, $G(s) = s/(s + 1)(s^2 + 4s + 5)$.

Since $f(t) \leftrightarrow s/(s^2 + 4)$, we have

$$X(s) = \frac{s}{(s + 1)(s^2 + 4s + 5)}\,\frac{s}{s^2 + 4}.$$

If we expanded this in partial fractions, we should have terms of the types

$$\frac{1}{s + 1}, \quad \frac{s}{(s + 2)^2 + 1}, \quad\text{and}\quad \frac{1}{(s + 2)^2 + 1} \quad\text{(from } G\text{)},$$

and

$$\frac{s}{s^2 + 4} \quad\text{and}\quad \frac{1}{s^2 + 4} \quad\text{(from } F\text{)}.$$

Therefore, in terms of time, we should obtain terms like

$e^{-t}$, $e^{-2t}\sin t$, and $e^{-2t}\cos t$ from $G$ (which are transients),

$\cos 2t$ and $\sin 2t$ from $F$ (a forced oscillation).

Finally we illustrate the relation between transfer functions in the $s$ domain and complex transfer functions in the $\omega$ domain (Section 21.5).

Example 25.10. The transfer function between an input $F(s)$ and an output $X(s)$ is $1/(s^2 + 1)$. Find the amplitude and phase of the steady forced oscillation produced by an input $f(t) = 3\sin 2t$.

As pointed out in Section 25.3, the complex impedance is simply the $s$ domain impedance with $j\omega$ substituted for $s$. The same is true for any transfer function. In the $\omega$ domain representation, the input and output will be represented by phasors $F(\omega) = 3e^{-\frac{1}{2}\pi j}$ and $X(\omega)$, corresponding to circular frequency $\omega = 2$ in this case. Then

$$X(\omega) = \frac{1}{(2j)^2 + 1}\,3e^{-\frac{1}{2}\pi j} = -e^{-\frac{1}{2}\pi j} = e^{\frac{1}{2}\pi j}.$$

The amplitude is the modulus of $X$, which is 1, and the phase is $\frac{1}{2}\pi$.
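The phasor calculation of Example 25.10 can be reproduced with complex arithmetic. This is a sketch under the convention that $3\sin 2t = 3\cos(2t - \frac{1}{2}\pi)$ is represented by the phasor $3e^{-\frac{1}{2}\pi j}$; the function name `transfer` is hypothetical.

```python
import cmath
import math

def transfer(s):
    """Transfer function 1 / (s^2 + 1) from the example."""
    return 1 / (s**2 + 1)

omega = 2.0
F = 3 * cmath.exp(-0.5j * math.pi)   # phasor of 3 sin 2t = 3 cos(2t - pi/2)
X = transfer(1j * omega) * F         # substitute s = j*omega

amplitude = abs(X)
phase = cmath.phase(X)
print(amplitude, phase)
```

The output phasor corresponds to $x(t) = \cos(2t + \frac{1}{2}\pi) = -\sin 2t$, which can be confirmed by substituting into $\ddot{x} + x = 3\sin 2t$.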

25.5 The convolution theorem

The following result enables us to interpret Laplace transforms which take the form of a product of two functions.

Convolution theorem

Suppose $F(s) = G(s)H(s)$, and

$$G(s) \leftrightarrow g(t), \qquad H(s) \leftrightarrow h(t).$$

Then

$$f(t) = \int_0^t g(t - \tau)h(\tau)\,d\tau \qquad (25.11)$$

(which is the same as $\int_0^t h(t - \tau)g(\tau)\,d\tau$).

This result will be proved in Chapter 31, Example 31.12. For the
present we shall verify that it is true in some special cases.

Example 25.11. Find the inverse Laplace transform of

$$F(s) = \frac{1}{(s + 1)(s + 2)}.$$

Put $F(s) = G(s)H(s)$, where

$$G(s) = \frac{1}{s + 1}, \qquad H(s) = \frac{1}{s + 2};$$

then $g(t) = e^{-t}$ and $h(t) = e^{-2t}$. Equation (25.11) gives

$$F(s) \leftrightarrow f(t) = \int_0^t e^{-(t - \tau)}e^{-2\tau}\,d\tau = e^{-t}\int_0^t e^{-\tau}\,d\tau \quad\text{(note this step carefully)}$$

$$= e^{-t}(-e^{-t} + 1) = e^{-t} - e^{-2t}.$$

This result can be confirmed by using partial fractions instead:

$$\frac{1}{(s + 1)(s + 2)} = \frac{1}{s + 1} - \frac{1}{s + 2}.$$
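The convolution integral of Example 25.11 can also be estimated numerically. The sketch below uses a midpoint rule; the helper name `convolve_at`, the step count, and the sample time $t = 1.5$ are assumptions made for illustration.

```python
import math

def convolve_at(g, h, t, steps=2000):
    """Midpoint-rule estimate of the integral of g(t - tau) h(tau), tau from 0 to t."""
    d = t / steps
    return sum(g(t - (k + 0.5) * d) * h((k + 0.5) * d) for k in range(steps)) * d

g = lambda t: math.exp(-t)        # inverse transform of 1/(s + 1)
h = lambda t: math.exp(-2 * t)    # inverse transform of 1/(s + 2)

t = 1.5
numeric = convolve_at(g, h, t)
exact = math.exp(-t) - math.exp(-2 * t)   # result obtained in the example
print(numeric, exact)
```

Inside the sum, $t$ is held fixed while $\tau$ runs over the midpoints, which mirrors the roles of the two variables in (25.11).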

Notice very carefully the distinction between $t$ and $\tau$ in the integrals (25.11): $\tau$ is the variable of integration. The variable $t$ is a constant so far as the integration process is concerned; so, for example, at one point we took $e^{-t}$ outside the integral sign.

Example 25.12. Find the inverse transform of $1/s(s^2 + 1)$.

In Example 25.1, we showed in two different ways that

$$\frac{1}{s(s^2 + 1)} \leftrightarrow 1 - \cos t.$$

To confirm that (25.11) gives the same result, put

$$G(s) = \frac{1}{s^2 + 1} \quad\text{and}\quad H(s) = \frac{1}{s},$$

say. Then (for $t > 0$) $g(t) = \sin t$ and $h(t) = 1$, so $g(t - \tau) = \sin(t - \tau)$ and $h(\tau) = 1$. Therefore, by the convolution theorem,

$$F(s) \leftrightarrow \int_0^t \sin(t - \tau) \cdot 1\,d\tau = [\cos(t - \tau)]_{\tau=0}^{t} = 1 - \cos t,$$

as expected.

Example 25.13. (See (25.11)). Confirm directly that

$$\int_0^t g(t - \tau)h(\tau)\,d\tau = \int_0^t h(t - \tau)g(\tau)\,d\tau.$$

In the first integral, change the variable, putting

$$u = t - \tau.$$

Then (remember $t$ is to be treated like a constant) $du = -d\tau$. Therefore

$$\int_0^t g(t - \tau)h(\tau)\,d\tau = \int_{u=t}^{0} g(u)h(t - u)(-du) = \int_0^t h(t - u)g(u)\,du,$$

which is the integral required, merely using $u$ instead of $\tau$ for the variable of integration.

Example 25.14. Find an expression for the inverse transform of

$$F(s) = \frac{H(s)}{s + 1}$$

in terms of $h(t)$, the inverse transform of $H(s)$.

Use the convolution theorem, (25.11), putting $G(s) = 1/(s + 1)$. Then

$$g(t) = e^{-t}.$$

We therefore obtain from (25.11)

$$f(t) = \int_0^t e^{-(t - \tau)}h(\tau)\,d\tau$$

or its alternative form

$$f(t) = \int_0^t e^{-\tau}h(t - \tau)\,d\tau.$$

25.6 General response of a system from its impulsive response

We shall take an electrical network as our example, though what we say applies to linear mechanical systems as well. Suppose that it is activated by an applied voltage $f(t)$ (regarded as the input). Focus on any particular one of the currents or voltages in the circuit, and call it $x(t)$ (the output). The transfer function between input and output will be called $G(s)$. We have then

$$X(s) = G(s)F(s). \qquad (25.12)$$

Suppose that we conduct an experiment in which we excite the circuit by means of a voltage impulse $I\delta(t)$, and record the result (the dimensions of $I$ are [emf × time]: see (25.1)). Then

$$f(t) = I\delta(t), \quad\text{so that}\quad F(s) = I.$$

The current resulting from this special voltage (an impulsive input) will be called $x^*(t)$, with transform $X^*(s)$. Now put $F(s) = I$ and $X^*$ for $X$ into (25.12), and it becomes

$$X^*(s) = IG(s). \qquad (25.13)$$

Such an experiment would therefore give us the corresponding transfer function $G(s)$ directly (we could even arrange for $I$ to equal unity). Thus, even if the circuit is a ‘black box’ with its details unknown, we still know from (25.13) what to put into (25.12) for the case when $f(t)$ is any function at all:

$$X(s) = I^{-1}X^*(s)F(s).$$

Therefore, by the convolution theorem (25.11),

$$x(t) = I^{-1}\int_0^t x^*(t - \tau)f(\tau)\,d\tau, \quad\text{or}\quad I^{-1}\int_0^t x^*(\tau)f(t - \tau)\,d\tau.$$

This type of result applies to the other circuit variables such as voltages and charges, and to mechanical systems governed by linear differential equations. In terms of general outputs and inputs:

Output $x(t)$ from an input $f(t)$ to a quiescent linear system, in terms of the output $x^*(t)$ from an impulsive input $I\delta(t)$

$$x(t) = I^{-1}\int_0^t x^*(t - \tau)f(\tau)\,d\tau, \qquad (25.14)$$

or

$$x(t) = I^{-1}\int_0^t x^*(\tau)f(t - \tau)\,d\tau.$$

Example 25.15. The displacement $x^*(t)$ caused by an impulse $I\delta(t)$ applied to a certain mechanical linear system at rest is found to be $x^*(t) = e^{-t} - \sin 2t$. Find the displacement $x(t)$ corresponding to an applied force $f(t) = \sin t$ starting at $t = 0$.

We have

$$x(t) = I^{-1}\int_0^t [e^{-(t - \tau)} - \sin 2(t - \tau)]\sin\tau\,d\tau \quad\text{(from (25.14))}$$

$$= I^{-1}e^{-t}\int_0^t e^{\tau}\sin\tau\,d\tau - I^{-1}\int_0^t \sin(2t - 2\tau)\sin\tau\,d\tau$$

$$= I^{-1}e^{-t}\int_0^t e^{\tau}\sin\tau\,d\tau - \tfrac{1}{2}I^{-1}\int_0^t [\cos(2t - 3\tau) - \cos(2t - \tau)]\,d\tau,$$

by using the identity (1.18b). In the end, we find

$$x(t) = \tfrac{1}{2}I^{-1}e^{-t} + \tfrac{1}{3}I^{-1}\sin 2t - \tfrac{1}{6}I^{-1}(3\cos t + \sin t)$$

for $t > 0$. The first term is a transient, and the second an induced free oscillation, while the third term represents the forced oscillation.
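The closed form in Example 25.15 can be compared with a direct numerical evaluation of (25.14). This is a sketch only: the impulse strength is taken as $I = 1$ by assumption, and the function names and step count are hypothetical.

```python
import math

def response(t, impulse=1.0, steps=20000):
    """Midpoint-rule estimate of (25.14) with x*(t) = exp(-t) - sin 2t, f(t) = sin t."""
    d = t / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * d
        x_star = math.exp(-(t - tau)) - math.sin(2 * (t - tau))
        total += x_star * math.sin(tau) * d
    return total / impulse

def closed_form(t, impulse=1.0):
    """Transient + induced free oscillation + forced oscillation."""
    return (0.5 * math.exp(-t) + math.sin(2 * t) / 3
            - (3 * math.cos(t) + math.sin(t)) / 6) / impulse

for t in (0.5, 2.0, 5.0):
    print(t, response(t), closed_form(t))
```

The two columns agree to within the accuracy of the quadrature, and both vanish at $t = 0$, as expected for a system starting from rest.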

25.7 Convolution integral in terms of memory

An integral of the type

$$x(t) = \int_0^t g(t - \tau)f(\tau)\,d\tau,$$

such as arose in the convolution theorem (25.11), is called a convolution integral. Typically, $f$ acts as some kind of ‘cause’, such as a driving force or voltage, and $x(t)$ stands for a certain ‘effect’ produced.

Choose a time $t$ for observation; then divide the interval $\tau = 0$ to $\tau = t$ into a large number of equal time steps $\delta\tau$. We have

$$x(t) = \int_0^t g(t - \tau)f(\tau)\,d\tau \approx \sum_{\tau=0}^{t} g(t - \tau)f(\tau)\,\delta\tau.$$

Now choose any moment $\tau_1$ between 0 and $t$: there was a force $f(\tau_1)$ applied at this moment, and its contribution to $x$ at time $t > \tau_1$ is

$$g(t - \tau_1)f(\tau_1)\,\delta\tau.$$

The factor $g(t - \tau_1)$ takes into account the time elapsed between the cause and its effect; in some problems it would be appropriate to call $t - \tau_1$ the ‘age’ of $f(\tau_1)$ at the moment $t$ of observation, and $g$ an ageing factor. Depending on the type of problem, this factor might weaken or amplify the contribution of $f(\tau_1)$ to the integral as time $t$ passes. The elapsed time is increased if either we take an earlier $\tau_1$, or delay the time of observation by increasing $t$. Figure 25.15a shows a representative function $g(a)$, where $a$ stands for ‘age’, and Fig. 25.15b illustrates its effect on the influence of $f$ at time $\tau_1$ on $x$ at a later time $t$.

25.8 Discrete systems

Suppose we have a system or processor, which we shall generally think of as an electrical circuit. All the time functions used are zero for $t < 0$. The input will be denoted by $x(t)$ and the output by $y(t)$, and either may be referred to as a signal. The system is said to be linear and time invariant if there is a fixed transfer function $G(s)$ such that for all inputs and at all times $t \ge 0$ the input/output relation has the form

$$Y(s) = G(s)X(s) \qquad (25.15)$$

where

$$x(t) \leftrightarrow X(s) \quad\text{and}\quad y(t) \leftrightarrow Y(s).$$

Thus $G(s)$ completely describes the effect of the circuit. By (25.11), the convolution theorem, (25.15) is equivalent to

$$y(t) = \int_0^t x(\tau)g(t - \tau)\,d\tau \quad\text{or}\quad \int_0^t x(t - \tau)g(\tau)\,d\tau \qquad (25.16)$$

where $g(t) \leftrightarrow G(s)$.

For the impulsive input $x(t) = x^*(t)$, where $x^*(t) = \delta(t)$, we obtain $X^*(s) = 1$, so by (25.15), $Y^*(s) = G(s)$, or

$$g(t) = y^*(t), \quad\text{where}\quad g(t) \leftrightarrow G(s). \qquad (25.17)$$

In other words, the interpretation of $g(t)$ is that it is equal to the output from a unit delta-function input at $t = 0$. This repeats the result (25.14).

So far in the chapter we have only considered circuits made up from the traditional elements, resistances, capacitances, and inductances, but there exists a far greater variety of basic units. We shall not describe the circuits which contain these, but only specify their properties by specifying their transfer functions.
Figure 25.16a shows a smooth signal $x(t)$ starting at $t = 0$. Imagine that this serves as the input to a circuit that picks out the values of $x(t)$ at times $t = 0, T, 2T, 3T, \ldots$, samples them over very short time intervals, and ignores the values of $x(t)$ in between, treating them as if they were zero. This process is indicated by the shaded strips in Fig. 25.16a. The device registers a sequence of values

$$\{x(0), x(T), x(2T), x(3T), \ldots\},$$

called a sample of $x(t)$ at equal intervals $T$. In an actual instrument the output will consist of a succession of ‘spikes’ as in Fig. 25.16b. These can be thought of as brief puffs of energy generated by the circuit, which are equal in ‘content’ to the sequence of values above, so it is plausible to represent the sample, $y(t)$ say, by

$$y(t) = \sum_{k=0}^{K} x(kT)\delta(t - kT), \qquad (25.18)$$

(where $K$ may be infinite). Such a function is called discrete. The circuit works like the first stage of an analogue-to-digital converter.

[Fig. 25.16: (a) the smooth signal and the sampled strips; (b) the output spikes.]
Suppose next that we have a circuit which processes discrete inputs of interval $T$, and produces discrete outputs of interval $T$. Such circuits may amplify, or filter, or delay, or modify the input in a variety of ways. We then have a completely discrete system. The input $x(t)$ and output $y(t)$, and their Laplace transforms $X(s)$, $Y(s)$ take the form

$$x(t) = \sum_{n=0}^{N} x_n\delta(t - nT), \qquad X(s) = \sum_{n=0}^{N} x_n e^{-nTs}, \qquad (25.19)$$

$$y(t) = \sum_{k=0}^{K} y_k\delta(t - kT), \qquad Y(s) = \sum_{k=0}^{K} y_k e^{-kTs}, \qquad (25.20)$$

where $x_n$ and $y_k$ are constants, and $N$ and $K$ may be infinite. We may alternatively express $x(t)$ and $y(t)$ in the form:

$x(t) = \{x_0, x_1, x_2, \ldots, x_N\}$, or simply as $\{x_n\}$;

$y(t) = \{y_0, y_1, \ldots, y_K\}$, or as $\{y_k\}$.

Thus, $\{n + 3\}$ stands for $\{3, 4, 5, \ldots\}$. In a case such as $\{1, 2, 0, 0, 0, \ldots\}$ we may shorten it to $\{1, 2\}$.

Assume next that there exists a transfer function $G(s)$ so that $Y(s) = G(s)X(s)$. Let $g(t) \leftrightarrow G(s)$; then $g(t)$ is equal to the output resulting from the unit impulsive input

$$x^*(t) = \delta(t)$$

(or $x^*(t) = \{1\}$ or $\{1, 0, 0, 0, \ldots\}$ in the sequence form). Since the device generates only discrete outputs, $g(t)$, which is equal to the response to $x^*(t)$, must also have a discrete form:

$$g(t) = \sum_{m=0}^{M} g_m\delta(t - mT), \quad\text{so}\quad G(s) = \sum_{m=0}^{M} g_m e^{-mTs}. \qquad (25.21)$$

Example 25.16. A discrete circuit delays any incoming signal by an interval $T$ (see Fig. 25.17a). (a) Obtain a transfer function $G(s)$ by considering the response to a delta function input. (b) Confirm that this transfer function delays an arbitrary discrete signal by an interval $T$, and that therefore the circuit is linear.

[Fig. 25.17: (a) the delayed output $y(t)$; (b) the impulse $\delta(t - T)$.]

(a) If the input is $x(t)$, then the output is $y(t) = x(t - T)$. Therefore, if $x(t) = \delta(t)$, the output is $\delta(t - T)$ as in Fig. 25.17b, and by (25.17), we must have $g(t) = \delta(t - T)$, so the transfer function is $G(s) = e^{-Ts}$.

(b) To check that this transfer function really works for a general discrete input, put $x(t) = \sum_{n=0}^{N} x_n\delta(t - nT)$. We then have

$$Y(s) = X(s)G(s) = X(s)e^{-Ts}.$$

By the delay rule (24.25),

$$y(t) = x(t - T)H(t - T),$$

where $H(t)$ is the unit function. Therefore $x(t)$ is delayed by an interval $T$.

In the general case when the transfer function takes the form $\{g_0, g_1, g_2, \ldots\}$, the input $x(t) = \delta(t)$, represented by $x(t) = \{1\}$, generates a string of impulses $\sum_{m=0}^{M} g_m\delta(t - mT)$, delayed by intervals $mT$, $m = 0$ to $M$. We shall look at this case in the next sections.

25.9 The z transform

In the previous section the only functions of $s$ that appear are exponentials of the form $e^{-nTs}$, representing $\delta(t - nT)$, where $n$ is a positive integer or zero. They may be written

$$e^{-nTs} = (e^{Ts})^{-n} = 1/(e^{Ts})^n.$$

The algebra connected with discrete systems is simplified by introducing a new variable $z$, defined by

$$z = e^{Ts}. \qquad (25.22)$$

Then we may write, for example

$$X(s) = \sum_{n=0}^{N} x_n e^{-nTs} = \sum_{n=0}^{N} \frac{x_n}{z^n}.$$

We shall reformulate the previous results in terms of $z$. Suppose we have a discrete signal $x(t) = \{x_0, x_1, x_2, \ldots\}$ consisting of equally-spaced impulses or samples with interval $T$, and $x(t) = 0$ for $t < 0$. Then the function $X(z)$ given by

$$X(z) = x_0 + \frac{x_1}{z} + \frac{x_2}{z^2} + \frac{x_3}{z^3} + \cdots \qquad (25.23)$$

is called the z transform of $x(t)$. Given $x(t)$ we can write down the z transform. On the other hand, given a function $X(z)$, we can expand it by Taylor’s theorem in powers of $z^{-1}$ in order to obtain the sequence of coefficients $\{x_0, x_1, x_2, \ldots\}$ in (25.23), which defines $x(t)$. This sequence is called the inverse transform of $X(z)$.

Suppose that $\{x_n\}$ is supplied as input to a discrete linear system. The z transform of the output $y(t) = \{y_0, y_1, y_2, \ldots\}$ is

$$Y(z) = y_0 + \frac{y_1}{z} + \frac{y_2}{z^2} + \cdots. \qquad (25.24)$$

We already know from (25.21) that if the circuit is linear it has a transfer function $G(s)$ which represents a similar sequence of impulsive terms. Therefore $g(t)$ has a z transform:

$$G(z) = g_0 + \frac{g_1}{z} + \frac{g_2}{z^2} + \cdots. \qquad (25.25)$$

Finally (since all we have done is to write a shorthand for $e^{Ts}$), the z transforms of output and input are related by

$$Y(z) = G(z)X(z) \qquad (25.26)$$

which is simply the product of two polynomials in $z^{-1}$.

We have lost sight of $T$ in these expressions, but we can always recover it by returning to time-domain or $s$-domain formulae by putting $z = e^{Ts}$. To summarize:
The z transform of a discrete signal

(a) If $x(t) = \{x_0, x_1, x_2, \ldots\}$, its z transform is

$$X(z) = x_0 + \frac{x_1}{z} + \frac{x_2}{z^2} + \cdots. \qquad (25.27)$$

(b) The sequence $\{x_0, x_1, x_2, \ldots\}$ is the inverse transform of $X(z)$.
(c) $z$ is related to the Laplace transform by $z = e^{Ts}$.

The transfer function in terms of z

(a) The transfer function takes the form

$$G(z) = g_0 + \frac{g_1}{z} + \frac{g_2}{z^2} + \cdots. \qquad (25.28)$$

(b) The input/output relation is

$$Y(z) = G(z)X(z).$$

(c) The inverse of $G(z)$ is the response to an input $\delta(t)$, and has the form $\{g_0, g_1, g_2, \ldots\}$.

Example 25.17. Obtain the z transform of the discrete signal $x(t)$ defined by the sequences (a) $\{1\}$; (b) $x_n = 1$ for $n \ge 0$; (c) $x_n = 1$ if $n$ is even, $x_n = 0$ if $n$ is odd.

(a) $X(z) = 1 + \dfrac{0}{z} + \dfrac{0}{z^2} + \cdots = 1.$

(b) $X(z) = 1 + \dfrac{1}{z} + \dfrac{1}{z^2} + \cdots.$

This is an infinite geometric series with common ratio $z^{-1}$ (it converges only if $|z| > 1$, but do not worry about this). From Section (5.4):

$$X(z) = \frac{1}{1 - z^{-1}} = \frac{z}{z - 1}.$$

(c) $X(z) = 1 + \dfrac{1}{z^2} + \dfrac{1}{z^4} + \cdots.$

The common ratio is $z^{-2}$, so by Section (5.4)

$$X(z) = \frac{1}{1 - z^{-2}} = \frac{z^2}{z^2 - 1}.$$
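The closed forms in Example 25.17 can be compared with truncated series. This is a sketch: the helper name `z_transform_partial`, the test point $z = 1.5 + 0.5j$ (chosen with $|z| > 1$ so that the series converge), and the number of terms are assumptions.

```python
def z_transform_partial(sequence, z):
    """Partial sum x_0 + x_1/z + x_2/z^2 + ... over the given finite sequence."""
    return sum(x / z**n for n, x in enumerate(sequence))

z = 1.5 + 0.5j          # any point with |z| > 1 will do
n_terms = 200

ones = [1] * n_terms                                          # case (b)
even_only = [1 if n % 2 == 0 else 0 for n in range(n_terms)]  # case (c)

err_b = abs(z_transform_partial(ones, z) - z / (z - 1))
err_c = abs(z_transform_partial(even_only, z) - z**2 / (z**2 - 1))
print(err_b, err_c)
```

Choosing a point with $|z| < 1$ instead makes the partial sums grow without limit, which illustrates the convergence remark in part (b).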

Example 25.18. Obtain the z transform of $x(t) = \{1, 2, 3, \ldots\}$, or $\{n + 1\}$.

We see that

$$X(z) = 1 + \frac{2}{z} + \frac{3}{z^2} + \frac{4}{z^3} + \cdots.$$

To sum this series, multiply it by $1/z$:

$$\frac{1}{z}X(z) = \frac{1}{z} + \frac{2}{z^2} + \frac{3}{z^3} + \cdots.$$

Subtract the second expression from the first:

$$\left(1 - \frac{1}{z}\right)X(z) = 1 + \frac{1}{z} + \frac{1}{z^2} + \cdots = \frac{z}{z - 1}$$

(as in the previous example). Therefore

$$X(z) = \frac{z}{(1 - 1/z)(z - 1)} = \frac{z^2}{(z - 1)^2}.$$

Example 25.19. (a) Obtain the inverse z transform of the function $X(z) = z/(z - 2)$. (b) Deduce the time function $x(t)$ which it represents.

(a) We need to find the coefficients in the infinite series form for $X(z)$:

$$X(z) = x_0 + \frac{x_1}{z} + \frac{x_2}{z^2} + \cdots.$$

This is a Taylor expansion of $X(z)$ in powers of $1/z$ for large $z$ (see Section 5.6). To obtain it, we start by expressing $X(z)$ in terms of $1/z$:

$$X(z) = \frac{z}{z - 2} = \left(1 - \frac{2}{z}\right)^{-1}.$$

The binomial expansion (5.4d), with $a = -1$ and $x = -2/z$, gives

$$X(z) = 1 + \frac{2}{z} + \frac{2^2}{z^2} + \frac{2^3}{z^3} + \cdots.$$

Therefore the sequence of coefficients (i.e., the inverse) is $\{1, 2, 2^2, 2^3, \ldots\}$.

(b) The corresponding time-function $x(t)$ is therefore

$$x(t) = \delta(t) + 2\delta(t - T) + 2^2\delta(t - 2T) + 2^3\delta(t - 3T) + \cdots.$$
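For a rational $X(z)$, the expansion in powers of $1/z$ can be generated term by term by long division. The sketch below is one way to do this, not the book's method; the function name `inverse_z` and its argument convention are assumptions made for illustration.

```python
from fractions import Fraction

def inverse_z(num, den, n_terms):
    """First n_terms of the inverse z transform of num(z)/den(z).

    num and den list the coefficients of descending powers of z and must
    have the same length (pad num with leading zeros), with den[0] != 0.
    """
    num = [Fraction(a) for a in num]
    den = [Fraction(b) for b in den]
    x = []
    for n in range(n_terms):
        # Match the coefficient of z^(-n) in den(z) * X(z) = num(z).
        acc = num[n] if n < len(num) else Fraction(0)
        for k in range(1, min(n, len(den) - 1) + 1):
            acc -= den[k] * x[n - k]
        x.append(acc / den[0])
    return x

print(inverse_z([1, 0], [1, -2], 5))        # X(z) = z / (z - 2)
print(inverse_z([1, 0, 0], [1, -2, 1], 4))  # X(z) = z^2 / (z - 1)^2
```

The first call reproduces the powers of 2 found above by the binomial expansion; the second recovers the sequence $\{n + 1\}$ whose transform is $z^2/(z - 1)^2$.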

Example 25.20. The response of a discrete system to the input $x(t) = \delta(t) + \delta(t - T)$ is found to be $y(t) = \delta(t) + 2\delta(t - T) + \delta(t - 2T)$. Find (a) the transfer function $G(z)$, (b) the transfer function $G(s)$, (c) the response to a unit impulse $\delta(t)$.

(a) Put $X(z) = 1 + \dfrac{1}{z}$, $Y(z) = 1 + \dfrac{2}{z} + \dfrac{1}{z^2}$, and $G(z) = g_0 + \dfrac{g_1}{z} + \dfrac{g_2}{z^2} + \cdots$ (for all we know at this stage, there might be an infinite number of terms in $G(z)$). Since $Y(z) = G(z)X(z)$,

$$G(z) = \frac{Y(z)}{X(z)} = \frac{1 + 2/z + 1/z^2}{1 + 1/z} = \frac{(1 + 1/z)^2}{1 + 1/z} = 1 + \frac{1}{z}.$$

(b) Restore $s$ by putting $z = e^{Ts}$, where $T$ is the spacing interval:

$$G(s) = 1 + e^{-Ts}.$$

(c) The impulse response is the inverse transform, $g(t)$, of $G(s)$:

$$g(t) = \delta(t) + \delta(t - T),$$

which can be obtained also from (a).
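The transfer-function coefficients in Example 25.20 can also be found one at a time from the input and output sequences, by solving $y_n = \sum_r g_r x_{n-r}$ for $g_n$. This is a sketch; the function name `deconvolve` is hypothetical, and it assumes $x_0 \ne 0$.

```python
from fractions import Fraction

def deconvolve(y, x, n_terms):
    """Solve y = g * x (discrete convolution) for the first n_terms of g.

    y and x are finite sequences, padded with zeros beyond their ends.
    Assumes x[0] != 0.
    """
    g = []
    for n in range(n_terms):
        acc = Fraction(y[n]) if n < len(y) else Fraction(0)
        for r in range(n):
            if n - r < len(x):
                acc -= g[r] * x[n - r]
        g.append(acc / x[0])
    return g

# Input {1, 1} and output {1, 2, 1}, as in the example.
g = deconvolve(y=[1, 2, 1], x=[1, 1], n_terms=5)
print(g)
```

The coefficients beyond the second are zero, so the series for $G(z)$ terminates.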

Example 25.21. A smooth signal $x(t)$ is sampled at intervals $T$ to produce the discrete signal $\{x(0), x(T), x(2T), \ldots\}$. Obtain the z transform when (a) $x(t) = \cos\omega t$; (b) $x(t) = \sin\omega t$.

The sample sequences in (a) and (b) are respectively $\{1, \cos\omega T, \cos 2\omega T, \ldots\}$ and $\{0, \sin\omega T, \sin 2\omega T, \ldots\}$. We can deal with both at the same time by remembering that $\cos n\omega T$ and $\sin n\omega T$ are respectively equal to the real and imaginary parts of $e^{jn\omega T}$. Therefore, consider the sequence resulting from the complex input sequence

$$\{1, e^{j\omega T}, e^{2j\omega T}, \ldots\},$$

which has the z transform

$$1 + \frac{e^{j\omega T}}{z} + \frac{e^{2j\omega T}}{z^2} + \cdots = 1 + (e^{j\omega T}z^{-1}) + (e^{j\omega T}z^{-1})^2 + \cdots.$$

This is an infinite geometric series with common ratio $e^{j\omega T}z^{-1}$, so its sum is equal to

$$1/(1 - e^{j\omega T}z^{-1}) = z/(z - e^{j\omega T}).$$

The complex conjugate of the denominator is $z - e^{-j\omega T}$, so write

$$\frac{z}{z - e^{j\omega T}} = \frac{z}{z - e^{j\omega T}}\,\frac{z - e^{-j\omega T}}{z - e^{-j\omega T}} = \frac{z(z - e^{-j\omega T})}{z^2 - z(e^{j\omega T} + e^{-j\omega T}) + 1} = \frac{z(z - e^{-j\omega T})}{z^2 - 2z\cos\omega T + 1}.$$

The transforms of the sampled $\cos\omega t$ and $\sin\omega t$ are the real and imaginary parts respectively of this expression, so:

$$\text{transform of } \{\cos n\omega T\} = \frac{z(z - \cos\omega T)}{z^2 - 2z\cos\omega T + 1};$$

$$\text{transform of } \{\sin n\omega T\} = \frac{z\sin\omega T}{z^2 - 2z\cos\omega T + 1}.$$
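The two closed forms of Example 25.21 can be checked against truncated series at a real test point. This is a sketch; the values $\omega = 2$, $T = 0.3$, $z = 1.7$ and the helper name `sampled_transform` are assumptions chosen for illustration.

```python
import math

def sampled_transform(f, T, z, n_terms=400):
    """Partial sum of the z transform of the samples f(0), f(T), f(2T), ..."""
    return sum(f(n * T) / z**n for n in range(n_terms))

omega, T, z = 2.0, 0.3, 1.7   # assumed values; z real with |z| > 1
c, s = math.cos(omega * T), math.sin(omega * T)
den = z * z - 2 * z * c + 1

cos_closed = z * (z - c) / den
sin_closed = z * s / den

cos_series = sampled_transform(lambda t: math.cos(omega * t), T, z)
sin_series = sampled_transform(lambda t: math.sin(omega * t), T, z)
print(cos_series - cos_closed, sin_series - sin_closed)
```

Taking $z$ real is what allows the real and imaginary parts of the complex sum to be separated as in the example.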

Finally, we note the discrete form of the Convolution Theorem, (25.11), expressed in terms of $z$. For a discrete linear system there exists a transfer function $G(z)$ such that input and output are related by $Y(z) = G(z)X(z)$, or

$$y_0 + \frac{y_1}{z} + \frac{y_2}{z^2} + \cdots = \left(g_0 + \frac{g_1}{z} + \frac{g_2}{z^2} + \cdots\right)\left(x_0 + \frac{x_1}{z} + \frac{x_2}{z^2} + \cdots\right).$$

By matching the coefficients of inverse powers of $z$ on both sides we obtain:

Discrete form of the convolution theorem

If $Y(z) = G(z)X(z)$, then

$$y_0 = g_0x_0,$$

$$y_1 = g_1x_0 + g_0x_1, \qquad (25.29)$$

$$y_2 = g_2x_0 + g_1x_1 + g_0x_2,$$

and so on. In general,

$$y_n = \sum_{r=0}^{n} g_r x_{n-r}.$$

The structure of these formulae resembles that of a convolution integral, with $r$ in place of $\tau$ and $n$ in place of $t$ in (25.11).
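The sum in (25.29) is the rule for multiplying two polynomials in $z^{-1}$, and for finite sequences it can be sketched in a few lines. The function name `discrete_convolution` is hypothetical.

```python
def discrete_convolution(g, x):
    """Return y with y_n = sum over r of g_r * x_(n-r), for finite sequences g and x."""
    y = [0] * (len(g) + len(x) - 1)
    for r, g_r in enumerate(g):
        for m, x_m in enumerate(x):
            y[r + m] += g_r * x_m   # the term g_r x_m contributes to y_(r+m)
    return y

# Transfer function {1, 1} acting on the input {1, 1}.
print(discrete_convolution([1, 1], [1, 1]))
# A pure delay, g = {0, 1}, shifts the input by one interval.
print(discrete_convolution([0, 1], [3, 4, 5]))
```

The first call corresponds to a system with $g = \{1, 1\}$ driven by $\delta(t) + \delta(t - T)$; the second shows that the sequence $\{0, 1\}$, the transform $z^{-1}$, acts as a delay of one interval.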

25.10 Behaviour of z transforms in the complex plane

Suppose that a string of impulses represented by $x(t) = \{x_0, x_1, x_2, \ldots\}$ is fed into a discrete processor. When the first impulse arrives at $t = 0$, it triggers the circuit to produce a scaled copy of the transfer function $G(z)$ in the time domain, a string of impulses given by $x_0g(t) = \{x_0g_0, x_0g_1, x_0g_2, \ldots\}$. The second impulse is felt at $t = T$, and $G(z)$ forms another scaled copy of itself, $x_1g(t - T)$, starting at $t = T$, and so on. These sequences overlap: the second one starts before the first has ended, and the output sequence consists of the sum of all the effects which are still present at $T, 2T, 3T, \ldots$. This is illustrated in Fig. 25.18.

If $G(z)$ has an infinite number of terms, the effect of any input term will be present for ever after. This extension into the distant future of the influence of an individual piece of input resembles the presence of transients in systems governed by differential equations. Very long-term effects are usually unwanted; in particular, they should not increase as time goes on. Their increase or decrease is described by the rate of increase or decrease of the coefficients in the series

$$G(z) = g_0 + \frac{g_1}{z} + \frac{g_2}{z^2} + \cdots. \qquad (25.30)$$
[Fig. 25.18: input functions $\delta(t)$, $\delta(t - T)$, and $x(t) = \delta(t) + \delta(t - T)$, with the corresponding output functions $g(t)$, $g(t - T)$, and $y(t) = g(t) + g(t - T)$.]

We shall illustrate how information about this question can be obtained by examining the behaviour of $G(z)$ when it is given in closed form, and the variable $z$ is allowed to be complex.

We limit consideration to cases where $G(z)$ is a rational function of $z$:

$$G(z) = \frac{a_Mz^M + a_{M-1}z^{M-1} + \cdots + a_0}{b_Nz^N + b_{N-1}z^{N-1} + \cdots + b_0}. \qquad (25.31)$$

Assume that $M \le N$. (If $M > N$, then $G(z)$ cannot be expressed as a series in inverse powers of $z$.) Suppose that the $a_m$ and $b_n$ are all real numbers, and that the $N$ solutions of the equation

$$b_Nz^N + b_{N-1}z^{N-1} + \cdots + b_0 = 0 \qquad (25.32)$$

are

$$z = z_1, z_2, z_3, \ldots, z_N.$$

For simplicity, we shall assume that these numbers are all different. The denominator of (25.31) then has $N$ different factors of the form $(z - z_n)$, for $n = 1$ to $N$, so

$$G(z) = \frac{a_Mz^M + a_{M-1}z^{M-1} + \cdots + a_0}{b_N(z - z_1)(z - z_2)\cdots(z - z_N)}. \qquad (25.33)$$

Notice that $G(z)$ is infinite at the points $z_1, z_2, \ldots, z_N$. These points are called the poles of $G(z)$. Some of them may be complex numbers. If so, they occur in pairs: if $z_n$ is a solution of (25.32), then so is its complex conjugate $\bar{z}_n$.
Equation (25.33) may now be expressed as the sum of partial fractions (now in general complex) just as in Section 1.12:

$$G(z) = C_0 + \frac{C_1}{z - z_1} + \frac{C_2}{z - z_2} + \cdots + \frac{C_N}{z - z_N} \qquad (25.34)$$

where $C_0$ to $C_N$ are constants ($C_0$ only occurs if $M = N$). A typical term has the form

$$\frac{C}{z - c} \qquad (25.35)$$

where $c$ may be complex: if so, then $C$ might be complex as well. This term is the source of a part of the discrete output signal $g(t)$ produced by an input $x(t) = \delta(t)$, and we shall see whether it generates an increasing or a decreasing output.
Suppose firstly that we find a pole at $z = c$ in (25.35), where $c$ is a real number. Then $C$ is also real, and

$$\frac{C}{z - c} = \frac{C}{z}\left(1 - \frac{c}{z}\right)^{-1} = \frac{C}{z} + \frac{Cc}{z^2} + \frac{Cc^2}{z^3} + \cdots.$$

In the time domain this corresponds to the sequence

$$\{0, C, Cc, Cc^2, \ldots\}.$$

If $|c| > 1$ the terms are increasing in magnitude, and the system is said to be unstable. If $|c| < 1$ they are decreasing in magnitude. The rate of increase or decrease is actually exponential, because

$$|Cc^n| = |C|e^{n\ln|c|}.$$

If $c = \pm 1$, then the output time sequence is

$$\{0, C, \pm C, C, \pm C, \ldots\}.$$

Next, suppose that $c$ is complex. Then there is another pole at $z = \bar{c}$. Taking these together, we obtain a pair of complex conjugate terms

$$\frac{C}{z - c} + \frac{\bar{C}}{z - \bar{c}} = \frac{2\,\mathrm{Re}\,C}{z} + \frac{2\,\mathrm{Re}\,Cc}{z^2} + \frac{2\,\mathrm{Re}\,Cc^2}{z^3} + \cdots. \qquad (25.36)$$

Evidently the magnitude (modulus) of the coefficients follows the same rule as before: if $|c| > 1$ the coefficients increase, and if $|c| < 1$ they approach zero.

We can interpret the complex poles more closely. Put

$$C = |C|e^{j\phi}. \qquad (25.37)$$

Also, introduce constants $\sigma$ and $\omega$ defined by

$$\sigma = \ln|c|, \qquad \omega = \arg c,$$

so that

$$c = e^{\sigma}e^{j\omega},$$

and

$$c^n = e^{n\sigma}e^{jn\omega}. \qquad (25.38)$$

From (25.37) and (25.38)

$$Cc^n = |C|e^{n\sigma}e^{j(n\omega + \phi)}.$$

Therefore, for $n = 0, 1, 2, \ldots$,

$$2\,\mathrm{Re}\,Cc^n = 2|C|e^{n\sigma}\cos(n\omega + \phi).$$

Put this into the time sequence

$$\{0, 2\,\mathrm{Re}\,C, 2\,\mathrm{Re}\,Cc, 2\,\mathrm{Re}\,Cc^2, \ldots\}$$

obtained from (25.36): it becomes

$$\{0, 2|C|\cos\phi, 2|C|e^{\sigma}\cos(\omega + \phi), 2|C|e^{2\sigma}\cos(2\omega + \phi), 2|C|e^{3\sigma}\cos(3\omega + \phi), \ldots\}. \qquad (25.39)$$
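The identity $2\,\mathrm{Re}\,Cc^n = 2|C|e^{n\sigma}\cos(n\omega + \phi)$ can be checked numerically. The sketch below takes $C = 0.5$ and $c = 0.8e^{2.9j}$ as sample values, and the list names are hypothetical.

```python
import cmath
import math

C = 0.5
c = 0.8 * cmath.exp(2.9j)

sigma = math.log(abs(c))      # sigma = ln|c|
omega = cmath.phase(c)        # omega = arg c
phi = cmath.phase(C)          # phi = arg C (zero here, since C is real and positive)

# Coefficients computed directly and from the modulus/argument form.
direct = [2 * (C * c**n).real for n in range(8)]
formula = [2 * abs(C) * math.exp(n * sigma) * math.cos(n * omega + phi)
           for n in range(8)]

for a, b in zip(direct, formula):
    print(round(a, 6), round(b, 6))
```

Since $|c| = 0.8 < 1$, $\sigma$ is negative and the printed values oscillate with decaying magnitude.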

This sequence (apart from the first term) would be obtained by sampling at unit time intervals from the smooth function

$$2|C|e^{\sigma t}\cos(\omega t + \phi)H(t), \qquad (25.40a)$$

so a picture of the progress of the discrete transient can be obtained as in Fig. 25.19. Alternatively, this is equivalent to samples at interval $T$ taken from

$$2|C|e^{\sigma t/T}\cos(\omega t/T + \phi)H(t). \qquad (25.40b)$$

[Fig. 25.19: Discrete transient of $C/(z - c)$. Suppose that $C = 0.5$ and $c = 0.8e^{2.9j}$. Then $\omega = 2.9$ and $\sigma = -0.223$. The curve (25.40a), $y = e^{-0.223t}\cos(2.9t)$, is shown. The terms of the response to $\delta(t)$ in (25.39), except the first, are shown.]

In Fig. 25.20 we show an Argand diagram with the unit circle $|z| = 1$ indicated. This is used as a design tool to obtain a qualitative idea of how a proposed circuit will behave, and to modify its properties. We can find the poles (the points where $G(z)$ is infinite), and place them on the diagram. Poles within the circle promise transients which die away; if there is a pole outside, then a stimulus applied to the circuit will produce ever-increasing output, so the system will be unstable. Poles lying on the circle $|z| = 1$ produce transients which do not approach zero or infinity in magnitude.

[Fig. 25.20: The unit circle $|z| = 1$, and several poles of a transfer function $G(z)$. One of the poles is outside the circle so the circuit is unstable and a transient associated with this pole will grow exponentially.]

25.11 Difference equations

Systems can be constructed whose output $y_{n+1}$ at time $t = (n + 1)T$ depends not only on the input up to that time, but also on the preceding outputs. This is achieved by delay elements which pick up each $y_n$ at time $nT$, store it for a time $T$, then feed it back into the system so as to modify $y_{n+1}$ in some way. A chain of delay elements can reach back further into the history of the outputs. In this way $y_{n+1}$ may be related to the current input and earlier outputs by equations such as

Fn+1 — yn + -X«>
or
■Vn + 2 +j yn + xn,

for $n = 0, 1, 2, \ldots$. Such equations are called difference equations or recurrence relations. The equations above are called linear difference equations, because the terms in $y$ only appear linearly. The circuits producing such equations are not necessarily linear in the sense we have used so far: that they possess a transfer function. Difference equations are treated more fully in Chapter 37; for the present we shall outline a connection with z transforms.

Example 25.22. Obtain the sequence $\{y_n\}$, where

$$y_{n+1} = y_n + x_n,$$

given that

$$y_0 = 3 \quad\text{and}\quad \{x_n\} = \{1, 2, 3, \ldots\}.$$

This is easily done by simply counting.

For $n = 0$: $y_1 = y_0 + x_0 = 3 + 1$.
For $n = 1$: $y_2 = y_1 + x_1 = (3 + 1) + 2$.
For $n = 2$: $y_3 = y_2 + x_2 = (3 + 1 + 2) + 3$, and so on. Evidently,

$$y_n = 3 + (1 + 2 + 3 + \cdots + n) = 3 + \tfrac{1}{2}n(n + 1)$$

by using a well-known formula (see Appendix A(f)).
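The counting procedure of Example 25.22 translates directly into a loop. This is a minimal sketch; the function name `iterate_first_order` is hypothetical.

```python
def iterate_first_order(y0, x, n_terms):
    """Generate y_0, ..., y_(n_terms - 1) from y_(n+1) = y_n + x_n, with x(n) = x_n."""
    y = [y0]
    for n in range(n_terms - 1):
        y.append(y[n] + x(n))
    return y

# y_0 = 3 and x_n = n + 1, i.e. {x_n} = {1, 2, 3, ...}.
y = iterate_first_order(3, lambda n: n + 1, 8)
closed = [3 + n * (n + 1) // 2 for n in range(8)]
print(y)
print(closed)
```

Changing the first argument shows how the whole sequence depends on the prescribed starting value $y_0$.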



Notice that we had to prescribe $y_0$: it was not given by the difference equation, and we could have assigned any value to it. It resembles the initial condition of a first-order differential equation.

Example 25.23. Use z transforms to determine the stability of a feedback circuit which processes the digital signal $\{x_n\}$ according to the difference equation

$$y_{n+2} = 3y_{n+1} - 2y_n + x_n,$$

where $y_0$, $y_1$, and the sequence $\{x_n\}$ are given.

It is usual to collect together the $y$ terms on the left-hand side:

$$y_{n+2} - 3y_{n+1} + 2y_n = x_n \qquad \text{(i)}$$

as with a differential equation. The sequences and their transforms are given by

$$\{x_n\} = \{x_0, x_1, x_2, \ldots\}, \qquad X(z) = x_0 + x_1z^{-1} + x_2z^{-2} + \cdots;$$

$$\{y_n\} = \{y_0, y_1, y_2, \ldots\}, \qquad Y(z) = y_0 + y_1z^{-1} + y_2z^{-2} + \cdots;$$

$$\{y_{n+1}\} = \{y_1, y_2, y_3, \ldots\}, \qquad Y_1(z) = y_1 + y_2z^{-1} + y_3z^{-2} + \cdots;$$

$$\{y_{n+2}\} = \{y_2, y_3, y_4, \ldots\}, \qquad Y_2(z) = y_2 + y_3z^{-1} + y_4z^{-2} + \cdots.$$

By simply looking at the $z$ series it can be seen that

$$Y_1(z) = zY(z) - zy_0$$

and

$$Y_2(z) = zY_1(z) - zy_1 = z^2Y(z) - z^2y_0 - zy_1.$$

Therefore the z transforms of the sequences obeying the relation (i) are connected by

$$(z^2Y(z) - z^2y_0 - zy_1) - 3(zY(z) - zy_0) + 2Y(z) = X(z),$$

or

$$(z^2 - 3z + 2)Y(z) - (z^2 - 3z)y_0 - zy_1 = X(z).$$

From this equation we obtain

$$Y(z) = \frac{X(z) + (z^2 - 3z)y_0 + zy_1}{z^2 - 3z + 2} = \frac{X(z) + (z^2 - 3z)y_0 + zy_1}{(z - 1)(z - 2)}. \qquad \text{(ii)}$$

The denominator is independent of $y_0$, $y_1$, and $\{x_n\}$, which are arbitrary. $Y(z)$ has a pole at $z = 2$, so the results of the previous section predict that unless $y_0$, $y_1$, and $\{x_n\}$ are specially chosen, the output will grow exponentially.
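The stability conclusion of Example 25.23 can be illustrated by locating the poles and by iterating the difference equation. This is a sketch: the function names are hypothetical, and the starting values $y_0 = y_1 = 0$ with a unit impulse input are assumptions chosen for illustration.

```python
import cmath

def poles_of_quadratic(a, b, c):
    """Roots of a z^2 + b z + c = 0, the poles contributed by the denominator."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))

def simulate(y0, y1, x, n_terms):
    """Iterate y_(n+2) = 3 y_(n+1) - 2 y_n + x_n for n_terms values of y."""
    y = [y0, y1]
    for n in range(n_terms - 2):
        y.append(3 * y[n + 1] - 2 * y[n] + x[n])
    return y

poles = poles_of_quadratic(1, -3, 2)
unstable = any(abs(p) > 1 for p in poles)

# Quiescent start and a unit impulse input {1, 0, 0, ...} (assumed test case).
y = simulate(0, 0, [1] + [0] * 20, 12)
print(poles, unstable)
print(y)
```

The output roughly doubles at each step, which is the exponential growth associated with the pole at $z = 2$.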

In Example 25.23, equation (ii), the denominator has the form

$$az^2 + bz + c$$

where $a$, $b$, $c$ are the coefficients of $y_{n+2}$, $y_{n+1}$, and $y_n$ respectively. The denominator alone determines the growth of transients, so there is really no need to work right through the problem if all we want is information about the stability. In fact, if $y_0 = y_1 = 0$, which would be a natural condition, the circuit has a transfer function equal to $1/(az^2 + bz + c)$, so the situation is exactly the same as in the previous section. Similar considerations apply to linear differential equations of any order.

Problems

25.1. Invert the transforms (a) $1/s(s^2 + 1)$, (b) $1/s^2(s^2 + 1)$, (c) $1/s^3(s^2 + 1)$, by using (25.1).

25.2. The equation for the current $i(t)$ in an RLC circuit for zero initial charge is

$$L\frac{di}{dt} + Ri + \frac{1}{C}\int_0^t i(\tau)\,d\tau = v(t).$$

(a) Solve this equation when $L = 2$, $R = 3$, $C = 3$, $v(t) = 3\cos t$ in conveniently scaled units, for zero initial current and charge.
(b) Adapt the equation to the case when $v(t) = 0$ and there is an initial charge $q_0$ on the capacitor, and solve it, given that $i(0) = 0$.
(c) The circuit in (a) is quiescent with zero charge; then, at $t = t_0$, a voltage of 300 units acts in it for 0.01 time units. Approximate the applied voltage by a suitable impulse function, and solve the equation for $i(t)$.

25.3. The equation for the displacement $x(t)$ of a mass on a spring with velocity damping and external force $f(t)$ per unit mass reduces to the conventional form $\ddot{x} + 2k\dot{x} + \omega^2x = f(t)$. The initial conditions are $x(0) = 1$, $\dot{x}(0) = 1$. An impulse $I$ is applied at $t = t_0$. Find the solution for $t > 0$ for $k^2 > \omega^2$.

25.4. A light plank of length $l$ rests across a crevasse, and sags under the weight of a mountaineer of mass $M$ standing at the centre. The displacement $u(x)$, where $x$ is measured from one end, is determined in general by $K\,d^4u/dx^4 = f(x)$, where $K$ is constant and $f(x)$ is force per unit length along the plank. The boundary conditions, which say that the plank merely rests on its ends, are

$$u(0) = u''(0) = u(l) = u''(l) = 0.$$

Treat the mountaineer as a point force and solve the problem using Laplace transforms. (Hint: Two conditions are prescribed at $x = 0$, but four are needed: call the missing ones $A$, $B$. Find $A$ and $B$ by requiring $u(l) = u''(l) = 0$.)

25.5. Find the impedances of the circuits shown.

[Circuit diagrams for Problem 25.5: (a) $R = 2$, $C = 2$, $L = 3$; (b) $L = 2$, $C = 3$, $R = 2$.]

25.6. Find the transfer functions $V_2(s)/V_1(s)$ and $V_2(s)/I(s)$ in the following circuits.

[Circuit diagram for Problem 25.6(a), including $R = 5$.] (continued)
Applications of the Laplace transform 431

(b) [Circuit diagram, Fig. 25.22, not reproduced. Labels recovered
from the figure: R = 2, L = 2, C = 2, R = 3; input $V_1(s)$, current
$I(s)$, output $V_2(s)$.]

25.7. Evaluate the convolution integral

$$\int_0^t g(\tau)h(t - \tau)\,d\tau, \quad\text{or}\quad \int_0^t h(\tau)g(t - \tau)\,d\tau,$$

in the following cases. Sometimes it might be easier to invert the
corresponding Laplace transform (25.11).
(a) $g(t) = e^t$, $h(t) = 1$; (b) $g(t) = 1$, $h(t) = 1$;
(c) $g(t) = e^t$, $h(t) = e^t$; (d) $g(t) = e^{-t}$, $h(t) = t$;
(e) $g(t) = t$, $h(t) = \sin t$; (f) $g(t) = \cos t$, $h(t) = t$;
(g) $g(t) = \sin 3t$, $h(t) = e^{-2t}$;
(h) $g(t) = \sin t$, $h(t) = \sin t$;
(i) $g(t) = t^4$, $h(t) = \sin t$; (j) $g(t) = t^n$, $h(t) = t^m$.

25.8. Use the convolution theorem (25.11) to obtain an expression, in
the form of an integral, for a particular solution of the following
equations.
(a) $\dfrac{d^2x}{dt^2} + \omega^2 x = f(t)$; (b) $\dfrac{d^2x}{dt^2} - \omega^2 x = f(t)$.

25.9. Use the convolution theorem (25.11) to find a solution x(t) of
the following Volterra-type integral equations.
(a) $\int_0^t x(\tau)(t - \tau)\,d\tau = t^4$.
(b) $x(t) = 1 + \int_0^t x(\tau)(t - \tau)\,d\tau$;
(c) $x(t) = \sin t + \int_0^t x(\tau)\cos(t - \tau)\,d\tau$.

25.10. By following a similar argument to that leading up to (25.14),
show that

$$x(t) = \frac{d}{dt}\int_0^t x_{**}(\tau) f(t - \tau)\,d\tau,$$

where $x_{**}(t)$ represents the response of a quiescent 'black box' to a
unit-function input H(t), and x(t) is its response from quiescence to
an input f(t).
Suppose that the transform $X_{**}(s)$ of the unit-function response is
given by $1/(s - 1)(s + 2)$ in a particular case. Obtain the response
from zero initial conditions to an input $H(t)\sin\omega t$.

25.11. (a) A student learning a language aims to memorize 50 new words
a day, starting at t = 0. She is successful in this but, after a time
lapse u, remembers only a fraction $e^{-0.01u}$ of those learned at any
time. Express the number N(t) of words still in her vocabulary in
terms of a convolution integral, and evaluate it.
(b) The student decides to increase the number of words available by
attempting 50 + 0.1t words per day. Find the new N(t), assuming the
same initial success and the same rate of forgetting.

25.12. A population p(t) for t > 0 develops as follows. The $p_0$
individuals in existence at t = 0 die out on average via a factor
$e^{-\gamma t}$, so that at time t only about $p_0 e^{-\gamma t}$ are still in
existence. For the rest, take any time $\tau < t$. The number born
between $\tau$ and $\tau + \delta\tau$ is $b\,p(\tau)\,\delta\tau$, where b is the
birthrate; these individuals die out through a factor
$e^{-\beta(t-\tau)}$, where $\beta < \gamma$ and $t - \tau$ is the time elapsed from
birth. Show that

$$p(t) = p_0 e^{-\gamma t} + b\int_0^t p(\tau)\,e^{-\beta(t-\tau)}\,d\tau,$$

and solve the equation.

25.13. A simple harmonic oscillator with displacement x is subject to
a constant force $F_0$ for $0 < t < t_0$, and allowed to oscillate freely
for $t > t_0$. Its equation of motion is

$$m\ddot{x} + kx = F_0[H(t) - H(t - t_0)].$$

If the system starts from rest in equilibrium, show that the Laplace
transform is

$$\mathcal{L}\{x(t)\} = \frac{F_0}{m}\,\frac{1 - e^{-st_0}}{s(s^2 + \omega^2)},$$

where $\omega = \sqrt{k/m}$. Show that, for $0 < t < t_0$, the solution is

$$x(t) = \frac{F_0}{k}(1 - \cos\omega t),$$

and find the solution for $t > t_0$.

25.14. An equation of the form

$$\frac{dx(t)}{dt} = -x(t - 1) + t,$$

which relates the derivative at time t to the value of the function at
an earlier time is an example of a differential delay equation. If
x(t) = 0 for $t \le 0$, show that the Laplace transform of the solution is

$$\mathcal{L}\{x(t)\} = \frac{1}{s^2(s + e^{-s})} = \frac{1}{s^3(1 + e^{-s}/s)}.$$

Expand $1/(1 + e^{-s}/s)$ in powers of $e^{-s}/s$ using a binomial
expansion, and show that

$$x(t) = \sum_{n=0}^{\lfloor t\rfloor} \frac{(-1)^n (t - n)^{n+2}}{(n + 2)!},$$

where $\lfloor t\rfloor$ is the integer floor function (the largest integer less
than or equal to t: for example, $\lfloor 2.3\rfloor = 2$, $\lfloor 3\rfloor = 3$, and
$\lfloor -2.3\rfloor = -3$).

25.15. The equation

$$2\int_0^t \cos(t - u)\,x(u)\,du = x(t) - t,$$

is an example of an integral equation. Note that the integral is of
convolution type, which means that the Laplace transform of the
equation is

$$2\mathcal{L}\{\cos t\}X(s) = X(s) - \frac{1}{s^2}.$$

Show that the solution is

$$x(t) = 2(t - 1)e^t + t + 2.$$

25.16. The differential equation

$$\frac{d^2x}{dt^2} + t\frac{dx}{dt} - x = 0,$$

does not have constant coefficients: the coefficient of dx/dt is t.
Using the results (24.8) and (24.12), show that the transform of the
differential equation subject to the conditions x(0) = 0 and x'(0) = 1
satisfies the first-order equation

$$-s\frac{dX(s)}{ds} + (s^2 - 2)X(s) = 1.$$

Verify that $X(s) = 1/s^2$ satisfies this equation, and hence obtain the
required solution of the original equation.

25.17. Using the method outlined in Problem 25.16, solve the following
variable-coefficient equations using Laplace transforms:
(a) $t x''(t) + (1 - t)x'(t) - x(t) = 0$, $x(0) = x'(0) = 1$;
(b) $x''(t) + t x'(t) - 2x(t) = 2$, $x(0) = x'(0) = 0$;
(c) $t x''(t) - x'(t) + t x(t) = \sin t$, $x(0) = 1$, $x'(0) = 0$.

25.18. (Discrete systems, Section 25.8.) The following signals are
expressed in the sequence forms (25.30). Write the explicit form of
x(t) and its Laplace transform (25.19) in each case.
(a) {1, 2, 1, 0, 0, 0, ...}. (b) {0, 1, 2, 3, ...}. (c) {3}.
(d) $\{(-2)^n\}$. (e) {0, 0, 3}.

25.19. The transfer functions g(t) in the time domain, and inputs
x(t), are given below. Obtain the outputs y(t) in each case.
(a) g(t) = {1, 1}, x(t) = {1, 1}.
(b) $g(t) = \{1, 1/2, 1/2^2, \ldots\}$, x(t) = {1, 1}.
(c) g(t) = {1, -1, 1, -1, ...}, x(t) = {0, 2, 2}.

25.20. Obtain the output y(t), when the transfer function is
$G(s) = 1/(1 - \tfrac12 e^{-Ts})$ and the Laplace transform of the input is
$X(s) = e^{-Ts} + 2e^{-2Ts}$. [Hint: expand G(s) in the form of an
appropriate infinite series in powers of $e^{-sT}$.]

25.21. Obtain the z transforms corresponding to the various
specifications that follow:
(a) $x(t) = \delta(t - T) + 2\delta(t - 2T) - \delta(t - 3T)$.
(b) x(t) = {1, -1, 1, -1, ...}. (c) $x(t) = \{1/2^n\}$.
(d) $X(s) = e^{-Ts}/(1 - e^{-2Ts})$.

25.22. The following functions are sampled at interval T. Obtain the
z transform of the (discrete) sampled functions (H(t) is the unit
function (1.13)).
(a) $tH(t)$.
(b) $e^{-t}H(t)$.
(c) $\cos\omega t\,H(t)$, when $T = \pi/2\omega$.
(d) $\sin\omega t$, when $T = \pi/2\omega$.

25.23. Obtain the z transforms of the transfer functions,
$\mathcal{G}(z)$, of various discrete, linear, systems which have been tested
for the particular input x(t) and output y(t) as specified:
(a) x(t) = {1, 1}, y(t) = {1, -1}, and find the sequence for g(t).
(b) x(t) = {1, 0, 0, 3}, y(t) = {1, 1}.
(c) x(t) = {1, -1}, y(t) = {1, 1}.
(d) x(t) = {1, 1, 1, ...}, y(t) = {1, 0, -1, 0, 1, 0, -1, 0, 1, ...}.
(e) x(t) = {1, 0, 1, 0, ...}, y(t) = {1, 0, -1, 0, 1, 0, -1, 0, 1, ...}.

25.24. Prove that if the z transform of the discrete function given by
$\{x_0, x_1, x_2, \ldots\}$ is X(z), then the discrete transform of
$y(t) = \{x_0, x_1 e^{-cT}, x_2 e^{-2cT}, \ldots\}$ is $X(e^{cT}z)$.

25.25. (a) Prove that if the z transform of the discrete function x(t)
defined by $\{x_0, x_1, x_2, \ldots\}$ is X(z), then the transform of
$\{0, x_0, x_1, \ldots\}$ is $(1/z)X(z)$.
(b) Deduce that the transform of $\{0, 0, \ldots, 0, x_0, x_1, \ldots\}$
(starting with N zeros) is $(1/z)^N X(z)$. (This is a time-delay rule
for z transforms.)

25.26. Prove that if the z transform of $\{x_0, x_1, x_2, \ldots\}$ is X(z),
then the transform of $\{x_N, x_{N+1}, x_{N+2}, \ldots\}$ is

$$z^N X(z) - z^N x_0 - z^{N-1} x_1 - \cdots - z\,x_{N-1}.$$

(This resembles the differentiation rule for Laplace transforms,
(24.12). Start the process with N = 1, then N = 2 etc., until the
sequence becomes clear.)

25.27. The following represent transfer functions for discrete
systems, $\mathcal{G}(z)$. Find the poles, mark them on an Argand diagram as
in Fig. 25.20 and state whether the systems are stable or not. Obtain
the rate of growth or
decay of their transients.
(a) $(z + 1)/(z^2 - 4)$.
(b) $(z^2 - z)/(4z^2 - 1)$.
(c) $1/(4z^2 + 1)$.
(d) $(z^3 + 1)/(2z^4 + 5z^2 + 2)$.

25.28. $\{x_n\}$ and $\{y_n\}$ represent inputs and outputs to discrete
systems governed by the difference equations shown. Use z transforms
to obtain the transforms $\mathcal{Y}(z)$ in terms of X(z) and the initial
values $y_0$ and $y_1$. State whether the systems are stable or not.
(a) $4y_{n+2} - y_n = x_n$; $y_0 = 1$, $y_1 = 2$.
(b) $y_{n+2} - 3y_{n+1} + 2y_n = 2x_n$; $y_0 = 0$, $y_1 = 1$.
(c) $2y_{n+2} + y_{n+1} + y_n = x_{n+1} - x_n$; $y_0 = 0$, $y_1 = 1$.
(d) $2y_{n+2} + 3y_{n+1} - y_n = x_n$; $y_0 = 1$, $y_1 = 1$.

Fourier series and
Fourier transforms
Contents
26.1 The composition of vibrations
26.2 Fourier series for a periodic function
26.3 Integrals of periodic functions
26.4 Calculating the Fourier coefficients
26.5 Examples of Fourier series
26.6 Use of symmetry: sine and cosine series
26.7 Functions defined on a finite range: half-range series
26.8 Spectrum of a periodic function
26.9 Obtaining one Fourier series from another
26.10 The two-sided Fourier series
26.11 Nonperiodic functions and the Fourier transform
26.12 Short notations
26.13 Fourier transforms of some basic functions
26.14 Rules for manipulating transforms
26.15 The delta function and periodic functions
26.16 Convolutions
26.17 The shah function
26.18 Energy in a signal: Rayleigh's theorem
Problems

26.1 The composition of vibrations


If a note on a piano is played, firstly by pressing the key and then
by plucking the string, the sounds produced are very different
although the pitch or fundamental frequency heard is the same
in both cases. The note produced by an instrument is not a pure
tone or sinusoidal wave; it is a richer sound which contains other
frequencies. These occur in different proportions when the same note
is stimulated in different ways, or is sounded on different instruments.
A trained ear can detect some detail in these differences; the extra
components can be distinguished and their pitch recognized, or
they can be isolated by using resonators. The extra component
frequencies of a note are all higher than the fundamental frequency,
and related to it in a simple manner. If the fundamental frequency
is f then the harmonics present have frequencies
f, 2f, 3f, 4f, 5f, ...,
the strength of the harmonics dropping off to zero as the frequency
increases.
When these components are added, a profile for the composite
wave is obtained. A particular note was found to have components
as shown:

Fundamental tone (250 Hz)

Order of harmonic:    1     2     3     4     5     ...
Frequency:            f     2f    3f    4f    5f    ...
Relative amplitude:   1.0   0.9   0.3   0.3   0.1

The shape and amplitude of the component harmonic waves, and
of the composite wave, are shown in Fig. 26.1.
By means of an electronic synthesizer the proportions in which
harmonics occur can be controlled and a great variety of sound
quality generated, from flute to drum. Given any particular
fundamental frequency f it is plausible that we could generate a sound
wave of any preassigned quality (that is to say, of any shape) by
adjusting the balance of the harmonics. This possibility is essentially
what the theory of Fourier series is about, though in a wider context
than that of sound waves.

Fig. 26.1 Component waves of frequencies f, 2f, 3f, 4f, 5f, and the
compound sound.

26.2 Fourier series for a periodic function

The following symbols are used in connection with periodic functions
(see Section 20.1):

f = frequency (cycles/time; if time is in seconds, the unit is
the Hertz);
$\omega$ = angular frequency (radians/time): $\omega = 2\pi f = 2\pi/T$;
T = period or wavelength: $T = 1/f = 2\pi/\omega$.

A typical periodic function P(t) with period T is shown in Fig.
26.2. Any full-period interval may be chosen for discussion; suppose
it is the interval between $t = -\pi/\omega$ and $t = \pi/\omega$.

Fig. 26.2 A periodic function P(t) with period $T = 2\pi/\omega$.

We shall express P(t) over this interval, denoted by $[-\pi/\omega, \pi/\omega]$,
as the sum of harmonic (sinusoidal) curves having frequencies
f, 2f, 3f, ..., where f = 1/T; or, equivalently, angular frequencies
$\omega, 2\omega, 3\omega, \ldots$, where $\omega = 2\pi/T$. A constant term is also needed,
since the average value of P(t) will not generally be zero. Both sine
and cosine terms are needed, because if we involve only sines or
only cosines, the sum will have a symmetry, odd or even (Section
15.9), which P(t) might not have. Then we expect that

Standard form for a Fourier series of period T

$$P(t) = \tfrac12 a_0 + (a_1\cos\omega t + b_1\sin\omega t) + (a_2\cos 2\omega t + b_2\sin 2\omega t) + \cdots = \tfrac12 a_0 + \sum_{n=1}^{\infty}(a_n\cos n\omega t + b_n\sin n\omega t), \tag{26.1}$$

where $\omega = 2\pi/T$.

Equation (26.1) is a Fourier series for P(t), and the constants
$a_0$; $a_1, b_1$; $a_2, b_2$; ... are its Fourier coefficients. It will be shown
how to determine the coefficients in Section 26.4: the factor $\tfrac12$ in the
constant term $\tfrac12 a_0$ is introduced to simplify the working.
We have spoken in terms of the one-period range $t = -\pi/\omega$ to
$t = \pi/\omega$, but every term on the right of (26.1) is periodic with the
same period $T = 2\pi/\omega$ as P(t). Therefore the series will describe
P(t) for every value of t, not merely for t in the interval between
$\pm\pi/\omega$.

26.3 Integrals of periodic functions


We prove two results needed for the next section, in which the values
of the coefficients in (26.1) are determined. Figure 26.3 represents a
periodic function P(t) with period T. Choose any value of t, say
$t = t_0$, and compare the two integrals

$$\int_0^T P(t)\,dt \quad\text{and}\quad \int_{t_0}^{t_0+T} P(t)\,dt,$$

each of which is taken over a one-period interval of P(t). The figure
shows that the integrals are equal by virtue of the area analogy
(15.13). The two shaded areas in Fig. 26.3a, b are assembled from
identical elements which are simply added up in a different order.

Fig. 26.3 Illustrating the area analogy for (a) $\int_0^T P(t)\,dt$ and
(b) $\int_{t_0}^{t_0+T} P(t)\,dt$, where P(t) has period T.


The integral over any one-period interval of a function
P(t) having period T,

$$\int_{t_0}^{t_0+T} P(t)\,dt, \tag{26.2}$$

does not depend on $t_0$.

Example 26.1. Show that $\int_0^{\pi} \sin 2t \cos^2 t\,dt = 0$.

The period of $\cos^2 t$ is $\pi$, because

$$\cos^2 t = \tfrac12(1 + \cos 2t)$$

and the period of $\cos 2t$ is $\pi$. Since $\sin 2t$ also has period $\pi$, so
does the integrand. By (26.2),

$$\int_0^{\pi} \sin 2t \cos^2 t\,dt = \int_{-\frac12\pi}^{\frac12\pi} \sin 2t \cos^2 t\,dt,$$

since the range $-\tfrac12\pi$ to $\tfrac12\pi$ also covers a period $\pi$. But the
integrand is an odd function about the origin, so that the value of
the last version is zero (Section 15.9).

The following special results can be proved by using the
trigonometric identities in Appendix B which convert products to sums.

Trigonometric integrals over a one-period interval

(a) For n and m = 0, 1, 2, ... with $n \ne m$,

$$\int_{-\pi/\omega}^{\pi/\omega} \cos n\omega t \cos m\omega t\,dt = 0,$$
$$\int_{-\pi/\omega}^{\pi/\omega} \sin n\omega t \sin m\omega t\,dt = 0,$$
$$\int_{-\pi/\omega}^{\pi/\omega} \cos n\omega t \sin m\omega t\,dt = 0. \tag{26.3}$$

(b) For n = 1, 2, ...

$$\int_{-\pi/\omega}^{\pi/\omega} \cos^2 n\omega t\,dt = \int_{-\pi/\omega}^{\pi/\omega} \sin^2 n\omega t\,dt = \pi/\omega.$$

For n = 0, we obtain

$$\int_{-\pi/\omega}^{\pi/\omega} dt = 2\pi/\omega \quad\text{and}\quad \int_{-\pi/\omega}^{\pi/\omega} 0\,dt = 0.$$

(c) The range $-\pi/\omega$ to $\pi/\omega$ may be replaced by any
interval of length $2\pi/\omega$.
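The results (26.3) are easy to spot-check numerically. The sketch below is not part of the derivation: the value of $\omega$, the choice n = 2, m = 5, and the helper name integrate are all illustrative assumptions, and the integrals are approximated by a simple midpoint rule.

```python
import math

def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

omega = 3.0                                   # arbitrary illustrative value
lo, hi = -math.pi / omega, math.pi / omega    # one full period

# Case (a) with n = 2, m = 5: each integral should be close to zero.
cos_cos = integrate(lambda t: math.cos(2 * omega * t) * math.cos(5 * omega * t), lo, hi)
sin_sin = integrate(lambda t: math.sin(2 * omega * t) * math.sin(5 * omega * t), lo, hi)
cos_sin = integrate(lambda t: math.cos(2 * omega * t) * math.sin(5 * omega * t), lo, hi)

# Case (b) with n = 4: the integral should be close to pi/omega.
cos_sq = integrate(lambda t: math.cos(4 * omega * t) ** 2, lo, hi)

print(cos_cos, sin_sin, cos_sin, cos_sq, math.pi / omega)
```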

26.4 Calculating the Fourier coefficients


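From (26.1), we expect that any periodic function P(t) having period
$2\pi/\omega$ can be expressed in the form of a Fourier series

$$P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos n\omega t + b_n \sin n\omega t). \tag{26.4}$$

To find a particular coefficient $a_N$, multiply both sides of (26.4) by
$\cos N\omega t$:

$$P(t)\cos N\omega t = \tfrac12 a_0 \cos N\omega t + \sum_{n=1}^{\infty} (a_n \cos n\omega t \cos N\omega t + b_n \sin n\omega t \cos N\omega t).$$

Integrate both sides of this equation between $-\pi/\omega$ and $\pi/\omega$:

$$\int_{-\pi/\omega}^{\pi/\omega} P(t)\cos N\omega t\,dt = \tfrac12 a_0 \int_{-\pi/\omega}^{\pi/\omega} \cos N\omega t\,dt + \sum_{n=1}^{\infty}\left(a_n \int_{-\pi/\omega}^{\pi/\omega} \cos n\omega t \cos N\omega t\,dt + b_n \int_{-\pi/\omega}^{\pi/\omega} \sin n\omega t \cos N\omega t\,dt\right). \tag{26.5}$$
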
Firstly consider the case N = 0. According to (26.3a), all terms
under the summation sign in (26.5) are zero; so, after putting
$\cos N\omega t = \cos 0 = 1$, we are left with

$$\int_{-\pi/\omega}^{\pi/\omega} P(t)\,dt = \tfrac12 a_0 \int_{-\pi/\omega}^{\pi/\omega} dt = \frac{\pi}{\omega}\,a_0.$$

Therefore

$$a_0 = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\,dt. \tag{26.6}$$

Now suppose that $N \ne 0$. By (26.3), all the integrals on the right
of (26.5) are zero except the single one that involves $a_N$, so (26.5)
reduces to

$$\int_{-\pi/\omega}^{\pi/\omega} P(t)\cos N\omega t\,dt = a_N \int_{-\pi/\omega}^{\pi/\omega} \cos^2 N\omega t\,dt = \frac{\pi}{\omega}\,a_N.$$

Therefore, for N = 1, 2, 3, ...,

$$a_N = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\cos N\omega t\,dt. \tag{26.7}$$

By comparing (26.7) with (26.6), it can be seen that $a_0$ and $a_1, a_2, \ldots$
are all given by the same formula. That is why the constant term in
(26.4) is written as $\tfrac12 a_0$ instead of $a_0$.
To find $b_N$ for N = 1, 2, ..., multiply (26.4) by $\sin N\omega t$ and
integrate. In a similar way to that described above, we find that, for
N = 1, 2, 3, ...,

$$b_N = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\sin N\omega t\,dt. \tag{26.8}$$

Since P(t) is a known function, the integrals in (26.6), (26.7), and
(26.8) can be evaluated to give all the coefficients in the Fourier
series (26.4).
In the following summary, the letter n is used in place of N to
simplify the form of the results.

Fourier series for periodic functions

Function: P(t), period $T = 2\pi/\omega$.

Fourier series:

$$P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos n\omega t + b_n \sin n\omega t),$$

Fourier coefficients:

$$a_n = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\cos n\omega t\,dt \quad (n = 0, 1, 2, \ldots),$$
$$b_n = \frac{\omega}{\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\sin n\omega t\,dt \quad (n = 1, 2, \ldots), \tag{26.9}$$

(in place of the range of integration, $-\pi/\omega$ to $\pi/\omega$,
any other one-period interval may be used).
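When the integrals in (26.9) cannot conveniently be done by hand, they can be approximated numerically. The following Python sketch shows one way this might be done with a midpoint rule; the function name fourier_coefficients and the test function are hypothetical choices made for illustration, not part of the text.

```python
import math

def fourier_coefficients(P, period, n_max, steps=20000):
    """Approximate a_0..a_n_max and b_1..b_n_max of (26.9) by the midpoint rule."""
    omega = 2 * math.pi / period
    h = period / steps
    # Midpoints of a one-period interval from -T/2 to T/2.
    ts = [-period / 2 + (k + 0.5) * h for k in range(steps)]
    a, b = [], [0.0]          # b[0] is a placeholder so that b[n] matches b_n
    for n in range(n_max + 1):
        a.append((omega / math.pi) * h * sum(P(t) * math.cos(n * omega * t) for t in ts))
        if n >= 1:
            b.append((omega / math.pi) * h * sum(P(t) * math.sin(n * omega * t) for t in ts))
    return a, b

# Illustrative test function (an assumption): P(t) = 1 + 3 cos(wt) - 2 sin(2wt),
# for which a_0 = 2 (constant term a_0/2 = 1), a_1 = 3, b_2 = -2, all others zero.
T = 4.0
w = 2 * math.pi / T
a, b = fourier_coefficients(lambda t: 1 + 3 * math.cos(w * t) - 2 * math.sin(2 * w * t), T, 3)
print(a, b)
```

Note that the computed a[0] is twice the constant term of the series, in line with the factor $\tfrac12$ in (26.4).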

It can be seen also that since

$$\tfrac12 a_0 = \frac{\omega}{2\pi}\int_{-\pi/\omega}^{\pi/\omega} P(t)\,dt = \frac{1}{T}\int_{-\frac12 T}^{\frac12 T} P(t)\,dt,$$

the following is true:

Average value of P(t)

The average value of P(t) over a one-period interval
is equal to the constant term $\tfrac12 a_0$. (26.10)

Notice the case of period $2\pi$, which often occurs. In such cases
$\omega = 2\pi/T = 1$:

Fourier series for functions P(t) with period $2\pi$

$$P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nt + b_n \sin nt),$$

where

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} P(t)\cos nt\,dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} P(t)\sin nt\,dt. \tag{26.11}$$

(The integrals may be taken over any one-period
interval instead of $[-\pi, \pi]$.)


26.5 Examples of Fourier series


The actual calculation of Fourier coefficients requires attention to
detail, especially in respect of a0.

Example 26.2. Find the Fourier series of the function P(t) shown in Fig. 26.4.

Fig. 26.4

The period is $2\pi$, so that $\omega = 2\pi/2\pi = 1$. Choosing the interval
$-\pi$ to $\pi$ as the basis of the calculation yields

$$P(t) = \begin{cases} -t & \text{if } -\pi \le t \le 0, \\ t & \text{if } 0 \le t \le \pi. \end{cases}$$

The coefficients can be obtained from (26.11):

Coefficients $b_n$. P(t) is an even function about the origin (see Section 15.9), and
sin nt is odd; therefore P(t) sin nt is odd. Hence the integrals defining $b_n$ are all
zero:

$$b_n = 0 \quad (n = 1, 2, \ldots). \tag{i}$$

Coefficients $a_n$. Since P(t) is even and cos nt is even, P(t) cos nt is even; so (26.11)
gives

$$a_n = \frac{2}{\pi}\int_0^{\pi} P(t)\cos nt\,dt = \frac{2}{\pi}\int_0^{\pi} t\cos nt\,dt = \frac{2}{\pi}\left(\left[\frac{t\sin nt}{n}\right]_0^{\pi} - \int_0^{\pi}\frac{\sin nt}{n}\,dt\right), \tag{ii}$$

after integrating by parts.
At this point it is seen that n = 0 is a case requiring separate treatment
(basically because $\int \cos nt\,dt \ne n^{-1}\sin nt + C$ when n = 0). Postponing the
question of n = 0, suppose firstly that n = 1, 2, .... The formula becomes

$$a_n = \frac{2}{\pi n^2}\bigl[\cos nt\bigr]_0^{\pi} = \frac{2}{\pi}\,\frac{(-1)^n - 1}{n^2}.$$

Therefore

$$a_n = \begin{cases} 0 & \text{if } n \text{ is even}, \\ -4/\pi n^2 & \text{if } n \text{ is odd}. \end{cases} \tag{iii}$$

We still have to find $a_0$, which is given by (26.11) as

$$a_0 = \frac{2}{\pi}\int_0^{\pi} t\,dt = \pi. \tag{iv}$$

Collect the coefficients from (i), (iii), and (iv) and put them back into the Fourier
series:

$$P(t) = \tfrac12\pi - \frac{4}{\pi}\left(\frac{\cos t}{1^2} + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right).$$

In Fig. 26.5, we show how P(t) is gradually shaped as we take more
and more terms of the Fourier series in Example 26.2. Here

$$P(t) = \tfrac12\pi - \frac{4}{\pi}\left(\frac{\cos t}{1^2} + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right) = 1.571 - 1.273\cos t - 0.141\cos 3t - 0.051\cos 5t - \cdots.$$
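The decimal coefficients quoted for Example 26.2 can be reproduced directly from the closed form (iii). The short Python sketch below does this and also evaluates a partial sum at the illustrative point t = 1, where P(t) = 1; the function names and the choice of 201 harmonics are assumptions made only for the demonstration.

```python
import math

def a_n(n):
    """Closed-form coefficient from Example 26.2, valid for n >= 1."""
    return 2 * ((-1) ** n - 1) / (math.pi * n * n)

def partial_sum(t, n_max):
    """Constant term pi/2 plus the cosine terms up to and including cos(n_max * t)."""
    return math.pi / 2 + sum(a_n(n) * math.cos(n * t) for n in range(1, n_max + 1))

constant = round(math.pi / 2, 3)
coeffs = [round(a_n(n), 3) for n in (1, 3, 5)]
approx = partial_sum(1.0, 201)      # compare with P(1) = 1

print(constant, coeffs, approx)
```

Because the coefficients fall off like $1/n^2$, a few hundred terms already agree with P(t) to about two decimal places at this point.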

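Example 26.3. Find the Fourier series for the function shown in Fig. 26.6.

Fig. 26.5 (a) 1.571; (b) 1.571 - 1.273 cos t; (c) 1.571 - 1.273 cos t
- 0.141 cos 3t; (d) 1.571 - 1.273 cos t - 0.141 cos 3t - 0.051 cos 5t.

Fig. 26.6

The period is $T = 2\pi$, so that $\omega = 1$ and the Fourier series is

$$P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nt + b_n \sin nt).$$

It makes no difference to the ease of calculation whether $-\pi$ to $\pi$ or 0 to $2\pi$ is
chosen as the basic interval. We will take 0 to $2\pi$ to remind the reader of the
possibility. Then

$$P(t) = \begin{cases} t & (0 < t < \pi), \\ 0 & (\pi < t \le 2\pi). \end{cases}$$

Coefficient $a_n$. From (26.11),

$$a_n = \frac{1}{\pi}\int_0^{2\pi} P(t)\cos nt\,dt = \frac{1}{\pi}\int_0^{\pi} t\cos nt\,dt.$$

Warned by Example 26.2, we deal first with the case n = 1, 2, 3, ...:

$$a_n = \frac{1}{\pi}\left[\frac{1}{n}\,t\sin nt + \frac{1}{n^2}\cos nt\right]_0^{\pi} = \frac{1}{\pi}\left[\left(0 + \frac{1}{n^2}\cos n\pi\right) - \left(0 + \frac{1}{n^2}\right)\right] = \frac{1}{\pi n^2}\bigl[(-1)^n - 1\bigr].$$

The sequence has every even-order term zero:

$$a_1 = -\frac{2}{\pi}, \quad a_2 = 0, \quad a_3 = -\frac{2}{\pi 3^2}, \quad a_4 = 0, \quad a_5 = -\frac{2}{\pi 5^2}, \ldots$$

The case n = 0 is again special:

$$a_0 = \frac{1}{\pi}\int_0^{2\pi} P(t)\,dt = \frac{1}{\pi}\int_0^{\pi} t\,dt = \frac{1}{\pi}\bigl[\tfrac12 t^2\bigr]_0^{\pi} = \tfrac12\pi.$$

Coefficient $b_n$.

$$b_n = \frac{1}{\pi}\int_0^{2\pi} P(t)\sin nt\,dt = \frac{1}{\pi}\int_0^{\pi} t\sin nt\,dt = \frac{1}{\pi}\left[-\frac{t\cos nt}{n} + \frac{1}{n^2}\sin nt\right]_0^{\pi} = -\frac{1}{\pi n}(\pi\cos n\pi - 0) + \frac{1}{\pi n^2}(0 - 0) = \frac{(-1)^{n+1}}{n}.$$

The series is difficult to write if the cosine and sine terms are kept together. By
separating them, we obtain

$$P(t) = \tfrac14\pi - \frac{2}{\pi}\left(\cos t + \frac{1}{3^2}\cos 3t + \frac{1}{5^2}\cos 5t + \cdots\right) + \left(\sin t - \tfrac12\sin 2t + \tfrac13\sin 3t - \cdots\right).$$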

In Example 26.3, the function P(t) jumps from $\pi$ to zero at the
points

$$t = \ldots, -\pi, \pi, 3\pi, \ldots.$$

To see what values are generated by the series at such points put,
say, $t = \pi$ into the series we obtained. All the cosine terms become
(-1) and all the sine terms are zero, so that at $t = \pi$ the series
delivers

$$P(\pi) = \tfrac14\pi + \frac{2}{\pi}\left(1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots\right).$$

A few minutes with a calculator makes it clear that this series for
$P(\pi)$ cannot add up to $\pi$, and plainly it does not give zero either.
In fact its sum is $\tfrac12\pi$, half way between these values. The general
rule is as follows.

Fourier series at a jump in value of a function

The sum of a Fourier series at a jump is equal to the
average of the two function values on either side. (26.12)

Fig. 26.7 shows how the function is fitted by the series when the
six terms up to cos 3t and sin 3t are taken.
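The behaviour at the jump can be checked by summing the series of Example 26.3 at $t = \pi$. The Python sketch below uses the coefficients derived in that example, $a_n = [(-1)^n - 1]/\pi n^2$ and $b_n = (-1)^{n+1}/n$ with constant term $\tfrac14\pi$; the function name and the number of harmonics are illustrative assumptions.

```python
import math

def series_at(t, n_max):
    """Partial sum of the Example 26.3 series up to harmonic n_max."""
    total = math.pi / 4                       # constant term a_0 / 2
    for n in range(1, n_max + 1):
        a = ((-1) ** n - 1) / (math.pi * n * n)
        b = (-1) ** (n + 1) / n
        total += a * math.cos(n * t) + b * math.sin(n * t)
    return total

at_jump = series_at(math.pi, 2001)
print(at_jump, math.pi / 2)
```

The partial sum settles near $\tfrac12\pi$, the average of the values $\pi$ and 0 on either side of the jump, as (26.12) predicts.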

Fig. 26.7


26.6 Use of symmetry: sine and cosine series


In general, the Fourier series for a periodic function will contain
both sine and cosine terms. However, the following results hold.

Even and odd functions P(t)

(a) If P(t) is even about the origin, then
$b_1 = b_2 = \cdots = 0$.
(b) If P(t) is odd about the origin, then
$a_0 = a_1 = a_2 = \cdots = 0$. (26.13)

These results follow from (26.9) and (26.11), because $P(t)\sin n\omega t$
is odd if P(t) is even, and $P(t)\cos n\omega t$ is odd if P(t) is odd.

Example 26.4. Obtain the Fourier series for the switching function P(t)
shown in Fig. 26.8.

Fig. 26.8

The period T is 2, so that $\omega = \pi$. Choose the basic interval to be t = -1 to 1.
On this interval,

$$P(t) = \begin{cases} -1 & \text{for } -1 < t < 0, \\ 1 & \text{for } 0 < t < 1. \end{cases}$$

Since P(t) is odd about the origin,

$$a_0 = a_1 = a_2 = \cdots = 0.$$

For the $b_n$, from (26.9), since the integrands are even functions,

$$b_n = \int_{-1}^{1} P(t)\sin n\pi t\,dt = 2\int_0^1 \sin n\pi t\,dt = -\frac{2}{n\pi}\bigl[\cos n\pi t\bigr]_0^1 = -\frac{2}{n\pi}\bigl[(-1)^n - 1\bigr]$$

for n = 1, 2, .... The sequence $b_n$ is therefore

$$b_1 = \frac{4}{\pi}, \quad b_2 = 0, \quad b_3 = \frac{4}{\pi}\cdot\frac13, \quad b_4 = 0, \ldots,$$

and the Fourier series is

$$P(t) = \frac{4}{\pi}\left(\sin\pi t + \tfrac13\sin 3\pi t + \tfrac15\sin 5\pi t + \cdots\right) = \frac{4}{\pi}\sum_{r=1}^{\infty}\frac{\sin(2r - 1)\pi t}{2r - 1}.$$

Example 26.5. Obtain the Fourier series for the switching function P(t)
shown in Fig. 26.9.

Fig. 26.9

The period is 2, so that $\omega = \pi$. Choose [-1, 1] as the representative interval;
then

$$P(t) = \begin{cases} 1 & \text{if } -\tfrac12 < t < \tfrac12, \\ 0 & \text{elsewhere on the interval}. \end{cases}$$

Since P(t) is an even function, $b_1 = b_2 = b_3 = \cdots = 0$. The coefficients $a_n$ are
given by

$$a_n = \int_{-1}^{1} P(t)\cos n\pi t\,dt = \int_{-\frac12}^{\frac12} \cos n\pi t\,dt = \frac{1}{n\pi}\bigl[\sin n\pi t\bigr]_{-\frac12}^{\frac12} = \frac{2}{n\pi}\sin\tfrac12 n\pi.$$

As we have seen before, $a_0$ gives trouble since this formula is meaningless when
n = 0. We have, in fact,

$$a_0 = \int_{-\frac12}^{\frac12} 1\,dt = 1.$$

Then

$$a_0 = 1, \quad a_1 = 2/\pi, \quad a_2 = 0, \quad a_3 = -2/3\pi, \quad a_4 = 0, \ldots,$$

so that the odd-order coefficients alternate in sign.
Finally the series is

$$P(t) = \tfrac12 + \frac{2}{\pi}\left(\cos\pi t - \tfrac13\cos 3\pi t + \tfrac15\cos 5\pi t - \cdots\right) = \tfrac12 - \frac{2}{\pi}\sum_{r=1}^{\infty}\frac{(-1)^r\cos(2r - 1)\pi t}{2r - 1}.$$

26.7 Functions defined on a finite range: half-range series

It is often necessary to obtain a Fourier-type series for a function
which is of interest only over some finite interval, and whose natural
extension, if any, is not necessarily periodic. The Fourier series is
invariably periodic, so that it cannot fit a nonperiodic function
everywhere. For example, consider the problem of finding a Fourier
series which will fit f(t), where

f(t) = t between t = 0 and $\pi$,

when our only concern is whether the series fits f(t) between 0
and $\pi$, the behaviour of the series elsewhere being a matter of
indifference.
Figure 26.10 illustrates a technique for producing such series. We
hold on to the given function inside the interval of interest, but
extend it by means of an artificial function which is periodic. This
extended function will have a Fourier series of its own, and it
will agree with f(t) on the interval 0 to $\pi$.
In Fig. 26.10b we have extended the nonperiodic function f(t) = t
on $0 \le t \le \pi$ to an artificial function $f_s(t)$ which has period $2\pi$ and
is an odd function. Being odd, it has a Fourier series consisting of
sine terms only, and this will correctly reproduce f(t) on $0 \le t \le \pi$.
Alternatively, Fig. 26.10c shows how to get a series of cosine terms
by an even extension $f_c(t)$, keeping $f_c(t) = f(t)$ on $0 \le t \le \pi$.
Again, Fig. 26.10d shows a fairly arbitrary extension of period $3\pi$,
which will have a Fourier series containing both sine and cosine
terms. Obviously there is an infinite number of possibilities, the most
important being the so-called half-range sine and cosine series,
corresponding to odd and even extensions respectively.

Fig. 26.10 (a) f(t) = t on $0 \le t \le \pi$, with natural nonperiodic
extension. (b) $f_s(t) = t$ on $0 \le t \le \pi$, has period $2\pi$ and is an odd
function. (c) $f_c(t) = t$ on $0 \le t \le \pi$, has period $2\pi$ and is an even
function. (d) $f_a(t) = t$ on $0 \le t \le \pi$, and has an arbitrary extension
of period $3\pi$.

Example 26.6. Obtain a Fourier sine series for f(t) = t on the interval
$0 \le t \le \pi$.

Extend f(t) on $0 \le t \le \pi$ as an odd function $f_s(t)$ with period $2\pi$ (not $\pi$) as
shown in Fig. 26.10b. Then $\omega = 1$ in (26.9). Choose the interval $-\pi$ to $\pi$ as
basic. Then since $f_s(t)$ is odd, we know in advance from (26.13) that

$$f_s(t) = \sum_{n=1}^{\infty} b_n \sin nt$$

(sine terms only), where

$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f_s(t)\sin nt\,dt$$
$$= \frac{2}{\pi}\int_0^{\pi} f_s(t)\sin nt\,dt \quad \text{(since } f_s(t) \text{ is odd; see (15.17))}$$
$$= \frac{2}{\pi}\int_0^{\pi} f(t)\sin nt\,dt \quad \text{(since } f_s(t) \text{ agrees with } f(t) \text{ on } 0 \le t \le \pi)$$
$$= \frac{2}{\pi}\int_0^{\pi} t\sin nt\,dt = \frac{2}{\pi}\left[-\frac{1}{n}\,t\cos nt + \frac{1}{n^2}\sin nt\right]_0^{\pi} = -\frac{2}{n}\cos n\pi = \frac{2(-1)^{n+1}}{n}.$$

Therefore the required series is

$$2\left(\sin t - \tfrac12\sin 2t + \tfrac13\sin 3t - \cdots\right) = 2\sum_{n=1}^{\infty}\frac{(-1)^{n+1}\sin nt}{n},$$

and this is equal to t on $0 \le t < \pi$, but nowhere else. (By (26.12), the value
delivered by the series at $t = \pi$ is zero.)

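(a) Half-range cosine series for $0 \le t \le \pi$

$$f(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} a_n \cos nt, \qquad a_n = \frac{2}{\pi}\int_0^{\pi} f(t)\cos nt\,dt.$$

(b) Half-range sine series for $0 \le t \le \pi$

$$f(t) = \sum_{n=1}^{\infty} b_n \sin nt, \qquad b_n = \frac{2}{\pi}\int_0^{\pi} f(t)\sin nt\,dt. \tag{26.14}$$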

Suppose that, more generally, a sine series representing f(t) for
$0 \le t \le t_0$ is required (Fig. 26.11). Extend f(t) to an odd function
$f_s(t)$ having period $2t_0$.

Fig. 26.11 f(t), $0 \le t \le t_0$; odd extension, $f_s(t)$, period $2t_0$.

Then, in equation (26.9),

$$\omega = 2\pi/2t_0 = \pi/t_0,$$

and, since $f_s(t)$ is odd,

$$f_s(t) = \sum_{n=1}^{\infty} b_n \sin(n\pi t/t_0),$$

where

$$b_n = \frac{1}{t_0}\int_{-t_0}^{t_0} f_s(t)\sin\frac{n\pi t}{t_0}\,dt = \frac{2}{t_0}\int_0^{t_0} f(t)\sin\frac{n\pi t}{t_0}\,dt,$$

since $f_s(t) = f(t)$ on the interval 0 to $t_0$.
If the extension is carried out so as to produce an even periodic
function, a similar calculation leads to a cosine expansion.

(a) Half-range cosine series for $0 \le t \le t_0$

$$f(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} a_n \cos\frac{n\pi t}{t_0}, \qquad a_n = \frac{2}{t_0}\int_0^{t_0} f(t)\cos\frac{n\pi t}{t_0}\,dt.$$

(b) Half-range sine series for $0 \le t \le t_0$

$$f(t) = \sum_{n=1}^{\infty} b_n \sin\frac{n\pi t}{t_0}, \qquad b_n = \frac{2}{t_0}\int_0^{t_0} f(t)\sin\frac{n\pi t}{t_0}\,dt. \tag{26.15}$$

26.8 Spectrum of a periodic function


Suppose that P(t) is a periodic function with period T. The Fourier
series has the form

$$P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos n\omega t + b_n \sin n\omega t)$$

where $\omega = 2\pi/T$. By the identity (1.19),

$$a_n \cos n\omega t + b_n \sin n\omega t = c_n \cos(n\omega t + \phi_n),$$

where $\phi_n$ is a phase angle, and

$$c_n = \sqrt{a_n^2 + b_n^2} \quad (n = 1, 2, \ldots),$$

which is the (positive) amplitude or strength of the nth term. For
completeness, we include

$$c_0 = \tfrac12 |a_0|.$$

The sequence $c_0, c_1, c_2, \ldots$ is called the spectrum of P(t). If the
series consists only of cosine (or sine) terms, then correspondingly
$c_n = |a_n|$ (or $|b_n|$) - the spectral components are always
positive or zero.
The spectrum can be displayed as if it were a physical spectrum.
Figure 26.12 shows the spectrum of the function worked out in
Example 26.5. The property which makes the spectrum a useful
concept is that the spectrum is independent of the time origin
of t, although the Fourier series itself is not. If

$$P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty} c_n \cos(n\omega t + \phi_n),$$

then the series for $P(t - t_0)$, whose graph is the same shape as P(t)
but moved to the right a distance $t_0$, is

$$P(t - t_0) = \tfrac12 a_0 + \sum_{n=1}^{\infty} c_n \cos[n\omega(t - t_0) + \phi_n].$$

The $c_n$ remain the same, and only the phase angle changes. Therefore
it is only the shape of P(t) which determines its spectrum, not its
clock-timing. For this reason the spectral or harmonic composition
of a piano note is always the same, independently of what time of
day the note is played.

Fig. 26.12 See Example 26.5. (a)
$P(t) = \tfrac12 + (2/\pi)(\cos\pi t - \tfrac13\cos 3\pi t + \tfrac15\cos 5\pi t - \cdots)$.
(b) The spectral components are $\tfrac12$, $2/\pi$, 0, $2/3\pi$, 0, $2/5\pi$.
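The claim that a time shift leaves the spectrum unchanged can be illustrated numerically. The Python sketch below computes $c_n = \sqrt{a_n^2 + b_n^2}$ for the pulse of Example 26.5 and for a copy shifted by 0.3; the shift, the step count, and the function names spectrum and pulse are illustrative assumptions, and the integrals are approximated by a midpoint rule.

```python
import math

def spectrum(P, period, n_max, steps=20000):
    """Amplitudes c_1..c_n_max, with c_n = sqrt(a_n**2 + b_n**2), by the midpoint rule."""
    omega = 2 * math.pi / period
    h = period / steps
    ts = [(k + 0.5) * h for k in range(steps)]     # one full period, 0 to T
    out = []
    for n in range(1, n_max + 1):
        a = (2 / period) * h * sum(P(t) * math.cos(n * omega * t) for t in ts)
        b = (2 / period) * h * sum(P(t) * math.sin(n * omega * t) for t in ts)
        out.append(math.hypot(a, b))
    return out

def pulse(t):
    """Example 26.5: period 2, equal to 1 for |t| < 1/2 within each period, else 0."""
    u = (t + 1.0) % 2.0 - 1.0                      # reduce t to the interval [-1, 1)
    return 1.0 if abs(u) < 0.5 else 0.0

original = spectrum(pulse, 2.0, 5)
shifted = spectrum(lambda t: pulse(t - 0.3), 2.0, 5)
print(original)
print(shifted)
```

The two lists agree to within the accuracy of the integration, and the first entries are close to $2/\pi$, 0, $2/3\pi$, even though the individual $a_n$ and $b_n$ of the shifted pulse differ from those of the original.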

It is important to realize that the spectrum refers only to a
complete periodic function, and not to an isolated segment such as
those discussed in Section 26.7. For the functions shown in Fig.
26.10, there correspond different Fourier series which have different
spectra.

26.9 Obtaining one Fourier series from another


There exist ‘dictionaries’ of Fourier series, but the entries cannot
match exactly all the functions required in practice. If the broad
shape of the dictionary entry is the same as that of the function
whose series is needed, then scaling or translation along the t axis,
or along the axis of P(t), might be all that is required. The transition
can require more than one stage.
The examples which follow are based on the standard form shown
in Fig. 26.13, which was expressed as a Fourier series in Example 26.2:

. 4 (cos t cos 31 cos 51


(26.16)

Fig. 26.13
Example 26.7. Find the Fourier expansion of the function Q(t) shown in Fig.
26.14.

Q(t) This is the same as Fig. 26.13 except that the vertical dimension is reduced by
a factor 1 /n. Therefore, from (26.16),
1 - A *
, 4 ( 1
Q{t) = i-cos t 4-cos 3f + ■ • •
-2n -n On 2n 3n t 7t2 V 32

Fig. 26.14 Example 26.8. Find the Fourier expansion of the function Q(t) shown in Fig.
26.15.
Here the t scale is changed by a factor n. We obtain, from (26.16),
Q(t)
Q(t) = \n-(cos nt + — cos 3nt + — cos 5nt 4-).
n\ 3" 52 /

It is necessary to be careful here: it is not t/n but nt in the new series.


Check the period: it is equal to 2, which is correct.

Example 26.9. Find the Fourier expansion of $Q(t)$ in Fig. 26.16.

The graph of $P(t)$ in Fig. 26.13 has been shifted a distance $\tfrac12\pi$ to the left (see Fig. 26.16). Therefore

$$Q(t) = P(t + \tfrac12\pi) = \tfrac12\pi - \frac{4}{\pi}\left(\cos(t + \tfrac12\pi) + \frac{1}{3^2}\cos 3(t + \tfrac12\pi) + \frac{1}{5^2}\cos 5(t + \tfrac12\pi) + \cdots\right).$$

Now $\cos n(t + \tfrac12\pi) = \cos nt\,\cos\tfrac12 n\pi - \sin nt\,\sin\tfrac12 n\pi$. As $n$ goes through the sequence 1, 3, 5, 7, ..., $\cos\tfrac12 n\pi = 0$ and $\sin\tfrac12 n\pi$ becomes the alternating sequence 1, -1, 1, -1, .... Therefore

$$Q(t) = \tfrac12\pi + \frac{4}{\pi}\left(\sin t - \frac{1}{3^2}\sin 3t + \frac{1}{5^2}\sin 5t - \cdots\right).$$

Fig. 26.16
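The scaling and shifting in Examples 26.8 and 26.9 can be checked numerically. The sketch below is only an illustration: it assumes that the standard form of Fig. 26.13 is $P(t) = |t|$ on $-\pi < t \le \pi$ (the function whose series is (26.16)), and the function names are invented for this check.

```python
import math

def p_series(t, terms=2000):
    """Partial sum of (26.16) using the first `terms` odd harmonics."""
    s = sum(math.cos(n * t) / n**2 for n in range(1, 2 * terms, 2))
    return math.pi / 2 - (4 / math.pi) * s

def p_exact(t):
    """Assumed standard form: |t| on (-pi, pi], repeated with period 2*pi."""
    t = (t + math.pi) % (2 * math.pi) - math.pi
    return abs(t)

def q_shifted_series(t, terms=2000):
    """Sine form of P(t + pi/2) obtained in Example 26.9."""
    s = sum((-1) ** k * math.sin((2 * k + 1) * t) / (2 * k + 1) ** 2
            for k in range(terms))
    return math.pi / 2 + (4 / math.pi) * s

for t in (0.3, 1.0, 2.5):
    # Example 26.8 replaces t by pi*t; Example 26.9 replaces t by t + pi/2.
    print(f"t = {t}")
    print("  series P(t)        ", round(p_series(t), 4), " exact", round(p_exact(t), 4))
    print("  series P(t + pi/2) ", round(q_shifted_series(t), 4),
          " exact", round(p_exact(t + math.pi / 2), 4))
```

The truncated sums agree with the shifted function only to about three decimal places, since the coefficients fall off like $1/n^2$.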

26.10 The two-sided Fourier series


Equations (26.9) define the Fourier series in terms of circular frequency $\omega$, where $\omega = 2\pi/T$, and $T$ is the period. For the rest of the chapter we shall instead use the fundamental frequency $f_0$ (complete cycles per unit time), since it will simplify the subsequent development of Fourier transforms. We then have

$$T = \frac{1}{f_0} \quad\text{and}\quad \omega = 2\pi f_0.$$

In terms of $f_0$, (26.9) becomes

Fourier series in terms of frequency $f_0$

$x_P(t)$ a real or complex function with period $T = 1/f_0$.

(a) Fourier series

$$x_P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty}\left(a_n \cos 2\pi n f_0 t + b_n \sin 2\pi n f_0 t\right).$$

(b) Coefficients

$$a_n = 2f_0\int_{\text{Period}} x_P(t)\cos 2\pi n f_0 t\,dt, \qquad b_n = 2f_0\int_{\text{Period}} x_P(t)\sin 2\pi n f_0 t\,dt. \tag{26.17}$$

We shall now show that (26.17) may be reorganized into another shape, as follows:

The two-sided Fourier series

$x_P(t)$ a real or complex function with period $T = 1/f_0$.

(a) Two-sided series

$$x_P(t) = \sum_{n=-\infty}^{\infty} X_n\, e^{j2\pi n f_0 t}.$$

(b) Coefficients $X_n$

$$X_n = f_0\int_{\text{Period}} x_P(t)\, e^{-j2\pi n f_0 t}\,dt. \tag{26.18}$$

The coefficients are in general complex even if $x_P(t)$ is real, and the series runs from $n = -\infty$ to $n = \infty$.
To prove (26.18) we shall work backwards from it to arrive at (26.17). Start with (26.18a):

$$x_P(t) = X_0 + \sum_{n=1}^{\infty} X_n\, e^{j2\pi n f_0 t} + \sum_{n=-\infty}^{-1} X_n\, e^{j2\pi n f_0 t}
= X_0 + \sum_{n=1}^{\infty} X_n\, e^{j2\pi n f_0 t} + \sum_{n=1}^{\infty} X_{-n}\, e^{-j2\pi n f_0 t}, \tag{26.19}$$

changing the counting index $n$ to $(-n)$ in the final term.


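From (26.18b), for $n$ positive and negative,

$$X_n = f_0\int_{\text{Period}} x_P(t)\{\cos 2\pi n f_0 t - j\sin 2\pi n f_0 t\}\,dt = \tfrac12(a_n - jb_n), \tag{26.20}$$

where

$$a_n = 2f_0\int_{\text{Period}} x_P(t)\cos 2\pi n f_0 t\,dt \quad\text{and}\quad b_n = 2f_0\int_{\text{Period}} x_P(t)\sin 2\pi n f_0 t\,dt. \tag{26.21}$$

Therefore

$$a_{-n} = a_n \quad\text{and}\quad b_{-n} = -b_n. \tag{26.22}$$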

It follows from (26.20) and (26.22) that when $n > 0$, as in the sums (26.19),

$$X_n = \tfrac12(a_n - jb_n), \qquad X_{-n} = \tfrac12(a_{-n} - jb_{-n}) = \tfrac12(a_n + jb_n), \tag{26.23}$$

where $a_n$ and $b_n$ are the same numbers as the coefficients in the original series (26.17).
Finally, (26.19) becomes

$$x_P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty}\left\{\tfrac12(a_n - jb_n)\, e^{j2\pi n f_0 t} + \tfrac12(a_n + jb_n)\, e^{-j2\pi n f_0 t}\right\}.$$

After using Euler's formula (6.8) for the exponentials, and carrying out the multiplications, the terms in which $j$ appears cancel, and we are left with

$$x_P(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty}\left(a_n \cos 2\pi n f_0 t + b_n \sin 2\pi n f_0 t\right),$$

which is the original form (26.17a). (Since $x_P(t)$ may be complex, so may $a_n$ and $b_n$, so we should not shorten the final calculation by taking twice the real part of $\tfrac12(a_n - jb_n)\, e^{j2\pi n f_0 t}$.)
The following properties sometimes save calculation:

Properties of $X_n$ in the two-sided Fourier series

(a) $X_n = \tfrac12(a_n - jb_n)$, and $X_{-n} = \tfrac12(a_n + jb_n)$, where $a_n$ and $b_n$ are derived as in (26.17).
(b) If $x_P(t)$ is real, then $X_{-n} = \overline{X_n}$ (the complex conjugate of $X_n$).
(c) If $x_P(t)$ is real and even, $X_n$ is real and $X_{-n} = X_n$.
(d) If $x_P(t)$ is real and odd, $X_n$ is pure imaginary and $X_{-n} = -X_n$. (26.24)
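The relations (26.20) and (26.24b) are easy to check by numerical integration. The following sketch is an illustration only: the test signal `x_p`, the period `T = 2` and the midpoint-rule helper `integrate` are arbitrary choices made for the check, not part of the text.

```python
import cmath
import math

T = 2.0          # period, chosen arbitrarily for the check
F0 = 1.0 / T     # fundamental frequency f0

def x_p(t):
    """Hypothetical real test signal on one period 0 <= t < T."""
    return t * t + math.sin(3 * t)

def integrate(g, a, b, steps=4000):
    """Midpoint rule; works for real- or complex-valued integrands."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def X(n):
    """Two-sided coefficient from (26.18b)."""
    return F0 * integrate(lambda t: x_p(t) * cmath.exp(-2j * math.pi * n * F0 * t), 0.0, T)

def a(n):
    """Cosine coefficient from (26.17b)."""
    return 2 * F0 * integrate(lambda t: x_p(t) * math.cos(2 * math.pi * n * F0 * t), 0.0, T)

def b(n):
    """Sine coefficient from (26.17b)."""
    return 2 * F0 * integrate(lambda t: x_p(t) * math.sin(2 * math.pi * n * F0 * t), 0.0, T)

for n in (1, 2, 5):
    print(n, X(n), 0.5 * (a(n) - 1j * b(n)), X(-n))
```

For each $n$ the first two printed values coincide, and the third is the complex conjugate of the first, as (26.24b) predicts for a real signal.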

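Example 26.10. Obtain the two-sided Fourier series for the function $x_P(t)$, having period $T$, of which a single period is shown in Fig. 26.17.

Fig. 26.17

In (26.18) $f_0 = 1/T$. Therefore (the pulse has unit height on $-\tfrac12\tau < t < \tfrac12\tau$ and is zero over the rest of the period)

$$X_n = \frac{1}{T}\int_{-\frac12 T}^{\frac12 T} x_P(t)\, e^{-j2\pi n t/T}\,dt = \frac{1}{T}\int_{-\frac12\tau}^{\frac12\tau} e^{-j2\pi n t/T}\,dt
= \frac{1}{T}\,\frac{T}{(-j2\pi n)}\left[e^{-j2\pi n t/T}\right]_{-\frac12\tau}^{\frac12\tau}$$

$$= \frac{1}{j2\pi n}\left(e^{j\pi n\tau/T} - e^{-j\pi n\tau/T}\right) = \frac{1}{\pi n}\sin(\pi n\tau/T) \quad \text{(from (6.10))}.$$

Finally

$$x_P(t) = \sum_{n=-\infty}^{\infty} \frac{1}{\pi n}\sin\left(\frac{\pi n\tau}{T}\right) e^{j2\pi n t/T}.$$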
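As a rough numerical check of Example 26.10, the sketch below compares the closed form for $X_n$ with a direct midpoint-rule evaluation of (26.18b), and sums the two-sided series at two sample points. The values `T = 4` and `TAU = 1` are arbitrary choices, and the $n = 0$ coefficient is taken as the limiting value $\tau/T$ of the formula.

```python
import cmath
import math

T, TAU = 4.0, 1.0   # period and pulse width, chosen only for illustration

def X_formula(n):
    """X_n = sin(pi n tau / T) / (pi n); the n = 0 term is the limit tau / T."""
    if n == 0:
        return TAU / T
    return math.sin(math.pi * n * TAU / T) / (math.pi * n)

def X_numeric(n, steps=2000):
    """Midpoint-rule value of (1/T) * integral of exp(-j 2 pi n t / T) over the pulse."""
    h = TAU / steps
    total = sum(cmath.exp(-2j * math.pi * n * (-TAU / 2 + (k + 0.5) * h) / T)
                for k in range(steps))
    return total * h / T

def partial_sum(t, N=2000):
    """Two-sided series truncated to -N <= n <= N."""
    return sum(X_formula(n) * cmath.exp(2j * math.pi * n * t / T)
               for n in range(-N, N + 1))

print(X_numeric(3), X_formula(3))
print(partial_sum(0.0))   # inside the pulse: close to 1
print(partial_sum(1.5))   # outside the pulse: close to 0
```

Convergence of the partial sums is slow (roughly like $1/N$) because the pulse is discontinuous.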

26.11 Nonperiodic functions and the Fourier transform

Figure 26.18 shows examples of functions $x(t)$ which are not periodic. Nonperiodic functions can still be expressed in terms of harmonic (sine and cosine) functions, but instead of a Fourier series with a discrete spectrum of frequencies $nf_0$, there is an infinite integral involving a continuum of frequencies. A nonperiodic $x(t)$ may be expressed in the form

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\,df.$$

Fig. 26.18 Three nonperiodic functions.

This is the analogue of (26.18a), the two-sided Fourier series. Harmonic elements are present having the form $e^{j2\pi f t}$, but discrete harmonics are not singled out; there is a contribution at every frequency $f$. The ‘weight’ of the contribution of frequency $f$ is measured by the function $X(f)$, which is called the spectral distribution, or spectral density function, or the Fourier transform of $x(t)$. It is given by

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt,$$

and this is very similar to (26.18b). Taken together these two formulae are called a Fourier transform pair. The type of integral involved (there are many versions of the result) is called a Fourier integral.

The Fourier transform pair

(a) $x(t)$ in terms of its spectral distribution $X(f)$:

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\,df.$$

(b) $X(f)$ is the Fourier transform of $x(t)$:

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt. \tag{26.25}$$

We cannot prove this result here, but some idea of the process
involved can be obtained by comparing the following, nonperiodic,
example with its periodic counterpart, Example 26.10.

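Example 26.11. (Compare Example 26.10.) Find the spectral distribution $X(f)$ of the function $x(t)$ defined by

$$x(t) = \begin{cases} 1, & -\tfrac12\tau < t < \tfrac12\tau, \\ 0, & \text{elsewhere.} \end{cases}$$

Figure 26.19a shows $x(t)$. The spectral distribution function, or Fourier transform, of $x(t)$ is given by

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt = \int_{-\frac12\tau}^{\frac12\tau} e^{-j2\pi f t}\,dt
= \frac{1}{(-j2\pi f)}\left[e^{-j2\pi f t}\right]_{-\frac12\tau}^{\frac12\tau}$$

$$= \frac{j}{2\pi f}\left[e^{-j\pi f\tau} - e^{j\pi f\tau}\right] = \frac{j}{2\pi f}(-2j)\sin \pi f\tau = \frac{1}{\pi f}\sin \pi f\tau.$$

By (26.25a), $x(t)$ is made up from the harmonics $e^{j2\pi f t}$ by

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\,df = \int_{-\infty}^{\infty} \frac{\sin \pi f\tau}{\pi f}\, e^{j2\pi f t}\,df.$$

Figure 26.19b shows the Fourier transform $X(f)$, which in this case is real.

Fig. 26.19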
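The result of Example 26.11 can be confirmed by evaluating the Fourier integral numerically. This is a minimal sketch using the midpoint rule; the pulse width `tau = 1.5` and the sample frequencies are arbitrary choices, and the value at $f = 0$ is taken as the limit $\tau$.

```python
import cmath
import math

def X_numeric(f, tau, steps=20000):
    """Midpoint-rule value of the integral of exp(-j 2 pi f t) over -tau/2 < t < tau/2."""
    h = tau / steps
    return h * sum(cmath.exp(-2j * math.pi * f * (-tau / 2 + (k + 0.5) * h))
                   for k in range(steps))

def X_closed(f, tau):
    """sin(pi f tau) / (pi f), with the limiting value tau at f = 0."""
    if f == 0:
        return tau
    return math.sin(math.pi * f * tau) / (math.pi * f)

tau = 1.5
for f in (0.0, 0.3, 1.7):
    print(f, X_numeric(f, tau), X_closed(f, tau))
```

The imaginary part of the numerical value is zero to rounding error, in line with the remark that $X(f)$ is real for this even pulse.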

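Now compare the results of Examples 26.10 and 26.11:

(A) Periodic case (Example 26.10)

$$x_P(t) = \sum_{n=-\infty}^{\infty} \frac{1}{\pi n}\sin\left(\frac{\pi n\tau}{T}\right) e^{j2\pi n t/T},$$

which may be written

$$x_P(t) = \sum_{n=-\infty}^{\infty} \frac{\sin(\pi n\tau/T)}{\pi n/T}\, e^{j2\pi n t/T}\,\frac{1}{T}
= \sum_{n=-\infty}^{\infty} \frac{\sin \pi n\tau f_0}{\pi n f_0}\, e^{j2\pi n f_0 t}\, f_0, \tag{26.26}$$

where $f_0 = 1/T$ is the fundamental frequency.

(B) Nonperiodic case (Example 26.11)

$$x(t) = \int_{-\infty}^{\infty} \frac{\sin \pi f\tau}{\pi f}\, e^{j2\pi f t}\,df. \tag{26.27}$$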

Some similarity between the two results is obvious. The transition between the two cases can be obtained by letting $T \to \infty$ in (26.26), so that the periodic repetitions of the central pulse are pushed away to infinity. Then the spacing between the spectral lines, $1/T$, which equals $f_0$, shrinks to zero. From this point of view, substitute into the periodic result (26.26) the symbols

$$f_0 = \delta f \quad\text{and}\quad nf_0 = f.$$

Then (26.26a) becomes

$$\sum \frac{\sin \pi f\tau}{\pi f}\, e^{j2\pi f t}\,\delta f.$$

Finally, take the limit as $f_0 = \delta f \to 0$. According to (13.9), the limit of the sum is equal to the integral

$$\int_{-\infty}^{\infty} \frac{\sin \pi f\tau}{\pi f}\, e^{j2\pi f t}\,df,$$

and this is the same as the result (26.27) for the nonperiodic case.
The following properties follow from the definitions (26.25b):

Properties of the Fourier transform $X(f)$

Suppose $x(t)$ is a real function.

(a) $X(-f) = \overline{X(f)}$ (the complex conjugate of $X(f)$).
(b) If $x(t)$ is even, $X(f)$ is real and even.
(c) If $x(t)$ is odd, $X(f)$ is pure imaginary, and odd. (26.28)

There are several definitions of a Fourier transform pair which are different from (26.25), and are in common use. Also, in special cases such as those in (26.28b and c), the relations (26.25) can be expressed in terms of real integrals. It is necessary when reading about the subject to determine what particular form is being used.

26.12 Short notations

(a) $\mathcal{F}[x(t)]$ (square brackets) denotes the Fourier transform of $x(t)$ (eqn (26.25b)):

$$\mathcal{F}[x(t)] = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt.$$

(b) $\mathcal{F}^{-1}[X(f)]$ denotes the inverse Fourier transform of a spectral distribution $X(f)$. It gives us back the originating function $x(t)$ by (26.25a):

$$x(t) = \mathcal{F}^{-1}[X(f)] = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\,df.$$

(c) The symbol $\leftrightarrow$ is often used to indicate the connection between $x(t)$ and $X(f)$:

$$x(t) \leftrightarrow X(f).$$

26.13 Fourier transforms of some basic functions

[Note: a more complete list of transforms is given in Appendix G.]

(a) The top-hat function $\Pi(t)$

The functions $\Pi(t)$ and $\Pi(t/\tau)$ are shown in Fig. 26.20. The transforms were found in Example 26.11:

Top-hat function

(a) $\mathcal{F}[\Pi(t)] = \dfrac{\sin \pi f}{\pi f}$.
(b) $\mathcal{F}[\Pi(t/\tau)] = \dfrac{\sin \pi f\tau}{\pi f}$. (26.29)

Fig. 26.20 (a) $\Pi(t)$, (b) $\Pi(t/\tau)$ (width $\tau$).

(b) The function sinc

The functions on the right of (26.29) are related to a standard function defined by

$$\operatorname{sinc} x = \frac{\sin \pi x}{\pi x}.$$

The graph is shown in Fig. 26.21. It is an even function, and it can be shown that the signed area under the curve is equal to unity.

Fig. 26.21 $\operatorname{sinc} x = \sin(\pi x)/(\pi x)$.

The function sinc $x$

(a) $\operatorname{sinc} x = \sin(\pi x)/(\pi x)$.
(b) $\displaystyle\int_{-\infty}^{\infty} \operatorname{sinc} x\,dx = 1$. (26.30)

The transform of the top-hat function (26.29) becomes

Fourier transform of $\Pi(t)$

(a) $\mathcal{F}[\Pi(t)] = \operatorname{sinc} f$.
(b) $\mathcal{F}[\Pi(t/\tau)] = \tau \operatorname{sinc} \tau f$. (26.31)

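To find the Fourier transform of $\operatorname{sinc} t$, start with (26.31b). Since $\tau \operatorname{sinc} \tau f = \mathcal{F}[\Pi(t/\tau)]$, it follows that

$$\Pi(t/\tau) = \mathcal{F}^{-1}[\tau \operatorname{sinc} \tau f] = \int_{-\infty}^{\infty} \tau \operatorname{sinc} \tau f\; e^{j2\pi f t}\,df.$$

Interchange the letters $t$ and $f$, take the complex conjugate of the result to make the sign in the exponential negative, and put $1/\tau$ in place of $\tau$. We obtain

$$\Pi(\tau f) = \int_{-\infty}^{\infty} \frac{1}{\tau}\operatorname{sinc}\frac{t}{\tau}\; e^{-j2\pi f t}\,dt.$$

Multiply through by $\tau$ to obtain the results:

Fourier transform of sinc $t$

(a) $\mathcal{F}[\operatorname{sinc} t] = \Pi(f)$.
(b) $\mathcal{F}[\operatorname{sinc}(t/\tau)] = \tau\,\Pi(\tau f)$. (26.32)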

Equations (26.31) and (26.32) illustrate a general fact: as the duration of a signal increases (e.g., as $\tau$ increases in (26.31b)), the effective frequency range tends to become narrower, and conversely.
(c) A one-sided exponential

Consider the function in Fig. 26.22a defined by

$$x(t) = e^{-at}H(t),$$

where $H(t)$ is the unit function (1.13) and $a$ is positive.

$$\mathcal{F}[x(t)] = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt = \int_{0}^{\infty} e^{-at}\, e^{-j2\pi f t}\,dt
= \int_{0}^{\infty} e^{-(a + j2\pi f)t}\,dt = \left[\frac{-e^{-(a + j2\pi f)t}}{a + j2\pi f}\right]_{0}^{\infty} = \frac{1}{a + j2\pi f}.$$

Therefore

$$\mathcal{F}[e^{-at}H(t)] = \frac{1}{a + j2\pi f}. \tag{26.33}$$

Since $x(t)$ is neither even nor odd the spectral distribution is a complex function. Its real and imaginary parts are shown in Fig. 26.22b.

Fig. 26.22 (a) $x(t) = e^{-at}H(t)$. (b) $X(f) = \mathcal{F}[e^{-at}H(t)]$.

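Example 26.19. Find the Fourier transform of the function given by $x(t) = e^{-|t|}$ (see Fig. 26.23a).

We have

$$X(f) = \int_{-\infty}^{\infty} e^{-|t|}\, e^{-j2\pi f t}\,dt = \int_{-\infty}^{0} e^{t}\, e^{-j2\pi f t}\,dt + \int_{0}^{\infty} e^{-t}\, e^{-j2\pi f t}\,dt$$

$$= \frac{1}{1 - j2\pi f}\left[e^{(1 - j2\pi f)t}\right]_{-\infty}^{0} + \frac{(-1)}{1 + j2\pi f}\left[e^{-(1 + j2\pi f)t}\right]_{0}^{\infty}
= \frac{1}{1 - j2\pi f} + \frac{1}{1 + j2\pi f} = \frac{2}{1 + 4\pi^2 f^2}.$$

This function is shown in Fig. 26.23b.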
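A direct numerical evaluation of the Fourier integral gives a quick check on Example 26.19. The sketch below truncates the infinite range to $|t| \le 30$ (where $e^{-|t|}$ is negligible) and uses the midpoint rule; both the truncation and the step count are assumptions made for this illustration.

```python
import cmath
import math

def X_numeric(f, half_width=30.0, steps=60000):
    """Midpoint-rule approximation to the transform of exp(-|t|) on |t| <= half_width."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        t = -half_width + (k + 0.5) * h
        total += math.exp(-abs(t)) * cmath.exp(-2j * math.pi * f * t)
    return total * h

def X_closed(f):
    """Closed form 2 / (1 + 4 pi^2 f^2) from Example 26.19."""
    return 2.0 / (1.0 + 4.0 * math.pi**2 * f**2)

for f in (0.0, 0.1, 0.5):
    print(f, X_numeric(f).real, X_closed(f))
```

Because $e^{-|t|}$ is real and even, the imaginary part of the numerical transform vanishes to rounding error, as property (26.28b) requires.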

26.14 Rules for manipulating transforms

The following rules enable transforms to be obtained from known ones. The constants $A$, $B$, $C$, $D$, $K$ are assumed to be real, but the signals $x(t)$ may be real or complex.

Rule                       Signal $x(t)$                     Transform $X(f) = \mathcal{F}[x(t)]$
(a) Linearity              $Ax_1(t) + Bx_2(t)$               $AX_1(f) + BX_2(f)$
(b) Time scaling           $x(At)$                           $X(f/A)/|A|$
    Time-reversal          $x(-t)$                           $X(-f)$
(c) Time delay             $x(t - B)$                        $X(f)\, e^{-j2\pi B f}$
(d) Frequency scaling      $x(t/C)/|C|$                      $X(Cf)$
(e) Frequency shift        $x(t)\, e^{j2\pi D t}$            $X(f - D)$
(f) Modulation             $x(t)\cos 2\pi K t$               $\{X(f - K) + X(f + K)\}/2$
                           $x(t)\sin 2\pi K t$               $\{X(f - K) - X(f + K)\}/(2j)$
(g) Duality                $X(t)$                            $x(-f)$
(h) Differentiation        $dx(t)/dt$                        $(j2\pi f)X(f)$
                           $d^n x(t)/dt^n$                   $(j2\pi f)^n X(f)$
                                                                                      (26.34)

The proofs are left to the problems. Most of them are obtained by writing down the appropriate Fourier integral or its inverse and then making a simple change of variable. The following examples illustrate how these results are used.

Example 26.20. Given that $\mathcal{F}[\Pi(t)] = \operatorname{sinc} f$, obtain $\mathcal{F}[\Pi(t/\tau)]$.

Use the time-scaling rule (26.34b) with $A = 1/\tau$:

$$\Pi(t/\tau) \leftrightarrow \tau \operatorname{sinc}(f\tau).$$

Example 26.21. Given that $\mathcal{F}[\Pi(t)] = \operatorname{sinc} f$, obtain $\mathcal{F}[\Pi(at + d)]$.

We have

$$\Pi(at + d) = \Pi(a\{t + d/a\}).$$

From the time-scaling rule (26.34b),

$$\mathcal{F}[\Pi(at)] = (1/|a|)\operatorname{sinc}(f/a).$$

Then, using the time-delay rule (26.34c), with $B = -d/a$,

$$\mathcal{F}[\Pi(a\{t + d/a\})] = (1/|a|)\operatorname{sinc}(f/a)\, e^{j2\pi d f/a}.$$

Example 26.22. Obtain the signal $x(t)$ produced by the spectral distribution $X(f)$ shown in Fig. 26.24.

Fig. 26.24

The two rectangular pulses are arrived at by extending the range of $\Pi(f)$ by a factor 2 to give $\Pi(f/2)$, then shifting this graph along the $f$ axis a distance 2 to the left and 2 to the right to give

$$X(f) = \Pi(\tfrac12\{f + 2\}) + \Pi(\tfrac12\{f - 2\}).$$

From the frequency-scaling rule (26.34d) with $C = \tfrac12$,

$$\Pi(\tfrac12 f) \leftrightarrow 2\operatorname{sinc} 2t.$$

Then, by the frequency-shift rule (26.34e), with $D = \pm 2$,

$$X(f) \leftrightarrow (e^{j4\pi t} + e^{-j4\pi t})\cdot 2\operatorname{sinc} 2t = 4\cos 4\pi t\,\operatorname{sinc} 2t.$$

(The modulation rule (26.34f) could have been adopted for the final stage instead.)
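The outcome of Example 26.22 can be tested by carrying out the inverse transform (26.25a) numerically. This sketch assumes, as the example's construction implies, that $X(f)$ has unit height on $1 < |f| < 3$ and is zero elsewhere; the helper names are invented for the check.

```python
import cmath
import math

def sinc(x):
    """sinc x = sin(pi x) / (pi x), with sinc 0 = 1, as in (26.30)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def x_numeric(t, steps=8000):
    """Midpoint-rule inverse transform over the two bands 1 < |f| < 3."""
    h = 2.0 / steps
    total = 0j
    for k in range(steps):
        f = 1.0 + (k + 0.5) * h
        # contribution of the band at +f and of its mirror image at -f
        total += cmath.exp(2j * math.pi * f * t) + cmath.exp(-2j * math.pi * f * t)
    return total * h

def x_closed(t):
    """Result obtained from the scaling and shift rules."""
    return 4.0 * math.cos(4.0 * math.pi * t) * sinc(2.0 * t)

for t in (0.0, 0.13, 0.8):
    print(t, x_numeric(t).real, x_closed(t))
```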

Example 26.23. Let

$$x(t) = \begin{cases} -e^{t}, & t < 0, \\ e^{-t}, & t > 0. \end{cases} \tag{i}$$

Given that

$$x(t) \leftrightarrow -4\pi j f/(1 + 4\pi^2 f^2),$$

use the duality theorem (26.34g) to deduce the Fourier transform of $t/(1 + t^2)$.

Put

$$X(f) = -4\pi j f/(1 + 4\pi^2 f^2).$$

By the frequency-scaling rule (26.34d), with $C = 1/2\pi$,

$$X\left(\frac{f}{2\pi}\right) = -2jf/(1 + f^2) \leftrightarrow 2\pi x(2\pi t).$$

Divide by $(-2j)$ and rename the functions obtained $Y(f)$ and $y(t)$:

$$f/(1 + f^2) = Y(f) \leftrightarrow j\pi x(2\pi t) = y(t). \tag{ii}$$

$x(t)$ is given, so we know the inverse transform of

$$Y(f) = f/(1 + f^2).$$

We need the transform of the identical time function given by

$$Y(t) = t/(1 + t^2).$$

The duality theorem (26.34g) tells how time may be exchanged for frequency in a given function. Applying it to $Y(f)$ in (ii) we have

$$Y(t) \leftrightarrow y(-f) = j\pi x(-2\pi f),$$

so we must substitute $(-2\pi f)$ for $t$ into (i) (including the inequalities $t < 0$ and $t > 0$), giving:

$$t/(1 + t^2) \leftrightarrow \begin{cases} j\pi\, e^{2\pi f}, & f < 0, \\ -j\pi\, e^{-2\pi f}, & f > 0. \end{cases}$$

Example 26.24. (Sidebands) The voltage signal $x(t) = v(t)\cos 2\pi f_0 t$ represents an audiofrequency signal $v(t)$ used to modulate a carrier wave of high frequency $f_0$. Suppose that $\mathcal{F}[v(t)] = V(f)$, where $f$ lies in the range $-f_m < f < f_m < f_0$. Use the modulation formula (26.34f) to illustrate the general nature of the spectral distribution function $X(f) = \mathcal{F}[x(t)]$.

From (26.34f)

$$X(f) = \tfrac12\{V(f - f_0) + V(f + f_0)\}. \tag{i}$$

$V(f - f_0)$ is zero unless

$$-f_m < f - f_0 < f_m,$$

that is, unless

$$f_0 - f_m < f < f_0 + f_m, \tag{ii}$$

and similarly $V(f + f_0)$ is zero unless

$$-f_0 - f_m < f < -f_0 + f_m. \tag{iii}$$

The intervals (ii) and (iii) do not overlap, since $f_m < f_0$. Therefore the spectral distribution (i) falls into two separate parts on opposite sides of the origin of $f$, as in Fig. 26.25. They are related to the sidebands of communication engineering. The two parts have the same shape, since their graphs consist of the graph of $V(f)$ moved through distances $\pm f_0$. (In general they would be complex, and even if they are real they will not generally correspond to two real signals.)

Fig. 26.25 (a) Spectral distribution of $v(t)$. (b) Spectral distribution of $v(t)\cos 2\pi f_0 t$.
26.15 The delta function and periodic functions

The delta function, or impulse function, is described in Section 25.5. The Fourier transform of $\delta(t - a)$ is given by:

$$\mathcal{F}[\delta(t - a)] = \int_{-\infty}^{\infty} \delta(t - a)\, e^{-j2\pi f t}\,dt = e^{-j2\pi f a} \tag{26.35a}$$

by the sifting property, (25.6). The special case $a = 0$ gives

$$\mathcal{F}[\delta(t)] = 1. \tag{26.35b}$$
Next, consider a spectral distribution function consisting of a very narrow band of frequencies $f$, of width $\varepsilon$, about the value $f_0$ (see Fig. 26.26), so that the range of $f$ is

$$f_0 - \tfrac12\varepsilon < f < f_0 + \tfrac12\varepsilon.$$

To maintain a measurable signal, we make the amplitude large enough for the area under the graph to be equal to (say) unity. As in Section 25.5, we may represent this spectral distribution function by $\delta(f - f_0)$. The signal giving rise to $\delta(f - f_0)$ is given by the inverse transform:

$$\mathcal{F}^{-1}[\delta(f - f_0)] = \int_{-\infty}^{\infty} \delta(f - f_0)\, e^{j2\pi f t}\,df = e^{j2\pi f_0 t}.$$

Fig. 26.26 The approach to $\delta(f - f_0)$.

Therefore

$$e^{j2\pi f_0 t} \leftrightarrow \delta(f - f_0),$$

and similarly

$$e^{-j2\pi f_0 t} \leftrightarrow \delta(f + f_0).$$

These are complex signals. But

$$\cos 2\pi f_0 t = \tfrac12\left(e^{j2\pi f_0 t} + e^{-j2\pi f_0 t}\right),$$

so

$$\cos 2\pi f_0 t \leftrightarrow \tfrac12\{\delta(f - f_0) + \delta(f + f_0)\}. \tag{26.36a}$$

Similarly,

$$\sin 2\pi f_0 t = \frac{1}{2j}\left(e^{j2\pi f_0 t} - e^{-j2\pi f_0 t}\right),$$

so

$$\sin 2\pi f_0 t \leftrightarrow \frac{1}{2j}\{\delta(f - f_0) - \delta(f + f_0)\}. \tag{26.36b}$$

Therefore, the (real) cosine and sine functions having frequency $f_0$ are each associated with a pair of spectral lines, located at $f = \pm f_0$, as in Fig. 26.27.
Fig. 26.27

The delta function is not at all a normal function. It belongs to a class of mathematical entities called generalized functions. They are essential in practical applications, since their use greatly simplifies what would otherwise be very difficult calculations. Generalized functions play a part similar to the symbol $j$ in algebra: $j$ is not an ordinary number, but in most ways it behaves like one.

There are apparent anomalies associated with generalized functions; for example we have just obtained the Fourier transform of $\cos 2\pi f_0 t$, but the normal definition of a Fourier transform (26.25b) does not work with a periodic function, because the integral does not have a definite value when we apply the infinite limits of integration. Exact justification and interpretation of these questions is far beyond the scope of this book. You should regard relations such as (26.36) as being usually safe, and call on them as if you were using a dictionary, as did the original inventors of these methods.
In this sense we can obtain the Fourier transform of a general periodic function. The result is:

Transform of a periodic function

$x_P(t)$ is periodic with period $T$, and $f_0 = 1/T$.

(a) $\displaystyle \mathcal{F}[x_P(t)] = \sum_{n=-\infty}^{\infty} X_n\,\delta(f - nf_0)$,

where

(b) $\displaystyle X_n = f_0\int_{\text{Period}} x_P(t)\, e^{-j2\pi n f_0 t}\,dt$. (26.37)

The spectral frequency distribution consists of an infinite row of ‘spikes’ $\delta(f - nf_0)$ spaced at equal intervals $f_0$. These are weighted by $X_n$, which are just the two-sided Fourier series coefficients for the periodic function $x_P(t)$ given by (26.18b).
To prove the result (26.37), take the Fourier series representation (26.18a), and use (26.36a and b) to transform the cosines and sines in the series term-by-term. We obtain

$$\mathcal{F}[x_P(t)] = \sum_{n=-\infty}^{\infty} X_n\,\delta(f - nf_0),$$

which is (26.37a). The coefficients $X_n$ are given by (26.18b), which is the same as (26.37b).

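26.16 Convolutions

Suppose that we have a spectral distribution $X(f)$ which can be written as the product of two simpler functions $X_1(f)$ and $X_2(f)$, whose inverses we know:

$$X(f) = X_1(f)X_2(f), \tag{26.38}$$

where

$$x_1(t) \leftrightarrow X_1(f), \qquad x_2(t) \leftrightarrow X_2(f). \tag{26.39}$$

The inverse transform of $X(f)$ is given by

$$x(t) = \int_{-\infty}^{\infty} e^{j2\pi f t}\, X_1(f)X_2(f)\,df, \tag{26.40}$$

in which

$$X_1(f) = \int_{-\infty}^{\infty} e^{-j2\pi f t}\, x_1(t)\,dt. \tag{26.41}$$

Substitute (26.41) into (26.40), but change $t$ to $u$, since $t$ is already in use in (26.40):

$$x(t) = \int_{-\infty}^{\infty} e^{j2\pi f t}\left\{\int_{-\infty}^{\infty} e^{-j2\pi f u}\, x_1(u)\,du\right\} X_2(f)\,df
= \int_{-\infty}^{\infty}\left\{\int_{-\infty}^{\infty} e^{j2\pi f(t - u)}\, x_1(u)\,du\right\} X_2(f)\,df$$

$$= \int_{-\infty}^{\infty} x_1(u)\left\{\int_{-\infty}^{\infty} e^{j2\pi f(t - u)}\, X_2(f)\,df\right\} du$$

after changing the order of integration. The interior integral is equal to the inverse of $X_2(f)$ at time $(t - u)$, so it is equal to $x_2(t - u)$. Therefore

$$x(t) = \int_{-\infty}^{\infty} x_1(u)\,x_2(t - u)\,du. \tag{26.42a}$$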

If we had started by substituting for $X_2(f)$, we should obviously have arrived at

$$x(t) = \int_{-\infty}^{\infty} x_1(t - u)\,x_2(u)\,du, \tag{26.42b}$$

so the two integrals on the right of (26.42a and b) are equal. This enables us to invert products of spectral distributions.
The integrals

$$\int_{-\infty}^{\infty} x_1(u)\,x_2(t - u)\,du, \quad\text{or}\quad \int_{-\infty}^{\infty} x_1(t - u)\,x_2(u)\,du, \tag{26.43a}$$

(which are equal) are often written in the short form

$$x_1(t) * x_2(t), \quad\text{or}\quad x_2(t) * x_1(t). \tag{26.43b}$$

$x_1(t) * x_2(t)$ (or $x_2(t) * x_1(t)$) is called the convolution of $x_1(t)$ and $x_2(t)$. The result (26.42) is the convolution theorem. In the short notation:

Convolution theorem for Fourier transforms

Let $x_1(t) \leftrightarrow X_1(f)$ and $x_2(t) \leftrightarrow X_2(f)$. Then

$$X_1(f)X_2(f) \leftrightarrow x_1(t) * x_2(t), \tag{26.44}$$

where $x_1(t) * x_2(t)$ is given by (26.43).

In the convolution integrals (26.43a) the variable of integration is $u$, and $t$ is to be treated like a constant parameter.

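Example 26.25. Obtain $x(t) = x_1(t) * x_2(t)$ when $x_1(t) = \Pi(t)$ and $x_2(t) = 1/(1 + t^2)$.

Write the convolution in the form

$$x(t) = \int_{-\infty}^{\infty} x_1(u)\,x_2(t - u)\,du = \int_{-\infty}^{\infty} \Pi(u)\,\frac{1}{1 + (t - u)^2}\,du.$$

Since $\Pi(u) = 0$ unless $-\tfrac12 < u < \tfrac12$, the limits of integration become $\pm\tfrac12$, so

$$x(t) = \int_{-\infty}^{\infty} \Pi(u)\,\frac{1}{1 + (t - u)^2}\,du = \int_{-\frac12}^{\frac12} \frac{1}{1 + (t - u)^2}\,du \tag{i}$$

$$= \int_{t - \frac12}^{t + \frac12} \frac{dv}{1 + v^2} \quad \text{(after putting } t - u = v\text{)}$$

$$= \arctan(t + \tfrac12) - \arctan(t - \tfrac12) = \arctan\frac{4}{4t^2 + 3}.$$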
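The closed form in Example 26.25 can be compared with a direct numerical evaluation of the convolution integral over $-\tfrac12 < u < \tfrac12$. The following sketch uses the midpoint rule; the function names and sample values of $t$ are chosen only for this check.

```python
import math

def x_numeric(t, steps=4000):
    """Midpoint-rule value of the integral of 1 / (1 + (t - u)^2) over -1/2 < u < 1/2."""
    h = 1.0 / steps
    return h * sum(1.0 / (1.0 + (t - (-0.5 + (k + 0.5) * h)) ** 2)
                   for k in range(steps))

def x_difference(t):
    """arctan(t + 1/2) - arctan(t - 1/2)."""
    return math.atan(t + 0.5) - math.atan(t - 0.5)

def x_closed(t):
    """arctan(4 / (4 t^2 + 3)), the final form given in the example."""
    return math.atan(4.0 / (4.0 * t * t + 3.0))

for t in (0.0, 0.7, -2.0):
    print(t, x_numeric(t), x_difference(t), x_closed(t))
```

The two closed forms agree because the difference of the arctangents lies between 0 and $\tfrac12\pi$ and has tangent $1/(t^2 + \tfrac34)$.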

At (i) in Example 26.25 the limits of integration were modified to take account of the fact that the integrand is zero except over the interval $-\tfrac12 < u < \tfrac12$. In many typical cases it is quite difficult to establish the new limits. Consider, for example, the convolution of two identical pulses $\Pi(t)$:

$$x(t) = \Pi(t) * \Pi(t) = \int_{-\infty}^{\infty} \Pi(u)\,\Pi(t - u)\,du. \tag{26.45}$$

For different values of $t$, $\Pi(t - u)$ occupies a different position on the $u$ axis. For certain ranges of $t$ it partially overlaps $\Pi(u)$ from the left; or from the right; and for other ranges of $t$ there is no overlap, as illustrated in Fig. 26.28.

Fig. 26.28 $\Pi(u)$ and $\Pi(t - u)$ for various $t$: no overlap, overlap from left, overlap from right, no overlap.

To take this into account, set up a diagram as in Fig. 26.29, with axes $t$ and $u$. The region in which $\Pi(u)\,\Pi(t - u)$ is nonzero is easy to find by carrying out the following construction.
(i) $\Pi(u)$ is nonzero only if $-\tfrac12 \le u \le \tfrac12$. The edges of this region are the straight lines

$$u = -\tfrac12 \quad\text{and}\quad u = \tfrac12.$$

Draw these and label them with the $u$ values.
(ii) $\Pi(t - u)$ is nonzero only if $-\tfrac12 \le t - u \le \tfrac12$, or if

$$u - \tfrac12 \le t \le u + \tfrac12.$$

In terms of $u$, the edges of this region are therefore

$$u = t + \tfrac12 \quad\text{and}\quad u = t - \tfrac12.$$

Draw these lines and label them.
(iii) The parallelogram enclosed by the four lines contains the $u$, $t$ values for which the integrand (26.45) is nonzero. Next, draw a vertical line representing the current value of $t$ as in Fig. 26.29. The effective limits of integration are represented by the points where the $t$ line intersects the sides, and the $u$ values are already written on these sides. (The limits of integration are therefore different functions of $t$ for values of $t$ on either side of a vertex.)

Fig. 26.29 (the vertical line marks the current value of $t$)

Other ways of interpreting convolution integrals will be found elsewhere, but this is by far the simplest way for working them out. It can be adapted for use whatever the nonzero intervals for the two functions may be. In practice it is used as follows:

Example 26.26. (a) Show that $\Pi(t) * \Pi(t) = \Lambda(t)$, where (Fig. 26.30a)

$$\Lambda(t) = \begin{cases} 1 + t, & -1 < t \le 0, \\ 1 - t, & 0 < t < 1, \\ 0, & \text{elsewhere.} \end{cases}$$

(b) Show that $\mathcal{F}[\Lambda(t)] = \operatorname{sinc}^2 f$.

(a) Put $x(t) = \Pi(t) * \Pi(t)$, and use the diagram Fig. 26.29 as described in (iii) above.
If $t < -1$ or $t > 1$, there is no overlap, so $x(t) = 0$.
If $-1 < t \le 0$, the limits of integration are from $u = -\tfrac12$ to $t + \tfrac12$, so

$$x(t) = \int_{-\frac12}^{t + \frac12} 1\cdot 1\,du = 1 + t.$$

If $0 < t < 1$, the limits of integration are from $u = t - \tfrac12$ to $\tfrac12$, so

$$x(t) = \int_{t - \frac12}^{\frac12} 1\cdot 1\,du = 1 - t.$$

Therefore the convolution is equal to $\Lambda(t)$.
(b) From the convolution theorem, (26.44),

$$\mathcal{F}[\Pi(t) * \Pi(t)] = \mathcal{F}[\Pi(t)]\,\mathcal{F}[\Pi(t)] = \{\mathcal{F}[\Pi(t)]\}^2,$$

and

$$\mathcal{F}[\Pi(t)] = \operatorname{sinc} f$$

by (26.31). Therefore (see Fig. 26.30b)

$$\mathcal{F}[\Lambda(t)] = \operatorname{sinc}^2 f.$$
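The construction of Fig. 26.29 amounts to intersecting the two intervals $-\tfrac12 \le u \le \tfrac12$ and $t - \tfrac12 \le u \le t + \tfrac12$. The sketch below, whose function names are invented for illustration, uses that rule to evaluate $\Pi(t) * \Pi(t)$, and then checks part (b) of Example 26.26 by transforming $\Lambda(t)$ numerically.

```python
import cmath
import math

def overlap_limits(t):
    """Effective limits of integration in u, or None if the pulses do not overlap."""
    lower = max(-0.5, t - 0.5)
    upper = min(0.5, t + 0.5)
    return (lower, upper) if upper > lower else None

def conv_pi_pi(t):
    """Pi(t) * Pi(t): the integrand is 1.1 between the effective limits."""
    limits = overlap_limits(t)
    return 0.0 if limits is None else limits[1] - limits[0]

def tri(t):
    """The triangle function of Example 26.26."""
    return max(0.0, 1.0 - abs(t))

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def tri_transform(f, steps=20000):
    """Midpoint-rule Fourier transform of the triangle over -1 < t < 1."""
    h = 2.0 / steps
    return h * sum(tri(-1.0 + (k + 0.5) * h) * cmath.exp(-2j * math.pi * f * (-1.0 + (k + 0.5) * h))
                   for k in range(steps))

for t in (-1.5, -0.4, 0.0, 0.25, 0.9, 2.0):
    print(t, overlap_limits(t), conv_pi_pi(t), tri(t))
for f in (0.0, 0.3, 1.2):
    print(f, tri_transform(f).real, sinc(f) ** 2)
```

The same interval-intersection rule adapts to pulses with other nonzero ranges by changing the four constants in `overlap_limits`.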

26.17 The shah function

The generalized function $\mathrm{III}_T(t)$ (pronounced ‘shah’), otherwise called the Dirac comb, is defined by

$$\mathrm{III}_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT) \tag{26.46}$$

(Fig. 26.31a). It is an even function consisting of an infinite string of equal ‘spikes’ (the delta functions) spaced at a constant interval $T$, one of them being at $t = 0$. Since it is periodic, its Fourier transform is given by (26.37a):

$$\mathcal{F}[\mathrm{III}_T(t)] = \sum_{n=-\infty}^{\infty} X_n\,\delta(f - nf_0), \tag{26.47}$$

where $f_0 = 1/T$, and, from (26.37b),

$$X_n = f_0\int_{\text{Period}} e^{-j2\pi n f_0 t}\,\mathrm{III}_T(t)\,dt
= f_0\sum_{m=-\infty}^{\infty}\int_{-\frac12 T}^{\frac12 T} e^{-j2\pi n f_0 t}\,\delta(t - mT)\,dt = f_0,$$

by the sifting rule (26.35a), since the only delta function within the period is the one where $m = 0$. Therefore, from (26.47),

$$\mathcal{F}[\mathrm{III}_T(t)] = f_0\,\mathrm{III}_{f_0}(f).$$

It is shown in Fig. 26.31b.

Fig. 26.31 (a) $\mathrm{III}_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$. (b) $\mathcal{F}[\mathrm{III}_T(t)] = f_0\,\mathrm{III}_{f_0}(f) = f_0\sum_{n=-\infty}^{\infty} \delta(f - nf_0)$, where $f_0 = 1/T$.

The shah function

(a) $\displaystyle \mathrm{III}_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$.
(b) $\mathcal{F}[\mathrm{III}_T(t)] = f_0\,\mathrm{III}_{f_0}(f)$, where $f_0 = 1/T$. (26.48)

Example 26.27. The function $x(t)$ is zero when $t < -\tfrac12 T$ and $t > \tfrac12 T$. Show that the convolution $y(t) = \mathrm{III}_T(t) * x(t)$ is the periodic function, having period $T$, which agrees with $x(t)$ in the range $-\tfrac12 T < t < \tfrac12 T$.

Write

$$\mathrm{III}_T(t) * x(t) = \int_{-\infty}^{\infty} x(u)\,\mathrm{III}_T(t - u)\,du
= \sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty} x(u)\,\delta(t - u - nT)\,du = \sum_{n=-\infty}^{\infty} x(t - nT),$$

using the sifting theorem (the critical points are where $t - u - nT = 0$). The term with $n = 0$ reproduces $x(t)$, which is zero outside the range $-\tfrac12 T$ to $\tfrac12 T$. The term with $n = 1$ slides that graph a distance $T$ to the right, and we have a non-overlapping copy of $x(t)$ in the range $\tfrac12 T$ to $\tfrac32 T$, and so on. The general picture is shown in Fig. 26.32: $y(t)$ is a periodic copy of $x(t)$, with period $T$.

Fig. 26.32 (a) $x(t)$ (nonperiodic). (b) $\mathrm{III}_T(t) * x(t)$.

26.18 Energy in a signal: Rayleigh’s theorem

The total energy $E$ carried by a signal, from $t = -\infty$ to $t = \infty$, is

$$E = \int_{-\infty}^{\infty} |x(t)|^2\,dt.$$

This can be expressed in terms of the spectral distribution $X(f)$ as follows:

Rayleigh’s theorem

$$\int_{-\infty}^{\infty} |x(t)|^2\,dt = \int_{-\infty}^{\infty} |X(f)|^2\,df$$

or

$$\int_{-\infty}^{\infty} x(t)\,\overline{x(t)}\,dt = \int_{-\infty}^{\infty} X(f)\,\overline{X(f)}\,df. \tag{26.49}$$

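We have

$$E = \int_{-\infty}^{\infty} |x(t)|^2\,dt = \int_{-\infty}^{\infty} x(t)\,\overline{x(t)}\,dt
= \int_{-\infty}^{\infty} x(t)\left\{\int_{-\infty}^{\infty} \overline{X(f)}\, e^{-j2\pi f t}\,df\right\} dt,$$

(after expressing $x(t)$ as the inverse transform of $X(f)$, and taking its complex conjugate). Now change the order of integration:

$$E = \int_{-\infty}^{\infty} \overline{X(f)}\left\{\int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt\right\} df
= \int_{-\infty}^{\infty} \overline{X(f)}\,X(f)\,df.$$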
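Rayleigh's theorem can be illustrated with the pair from Example 26.19, $e^{-|t|} \leftrightarrow 2/(1 + 4\pi^2 f^2)$, for which both energy integrals equal 1. The sketch below evaluates them by the midpoint rule; the truncation of the infinite ranges ($|t| \le 20$, $|f| \le 50$) and the helper name `midpoint` are assumptions made for the check.

```python
import math

def midpoint(g, a, b, steps):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def x(t):
    return math.exp(-abs(t))

def X(f):
    return 2.0 / (1.0 + 4.0 * math.pi**2 * f * f)

# Energy computed in the time domain and in the frequency domain.
energy_time = midpoint(lambda t: x(t) ** 2, -20.0, 20.0, 40000)
energy_freq = midpoint(lambda f: X(f) ** 2, -50.0, 50.0, 100000)

print(energy_time, energy_freq)
```

Here $X(f)$ is real, so $|X(f)|^2$ is simply its square; for a complex spectrum `abs(X(f)) ** 2` would be needed instead.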
Parseval’s theorem extends this result for cases when the energy depends on two functions, $x(t)$ and $y(t)$, as in the case of current and voltage in circuits. It states that

$$\int_{-\infty}^{\infty} x(t)\,\overline{y(t)}\,dt = \int_{-\infty}^{\infty} X(f)\,\overline{Y(f)}\,df. \tag{26.50}$$

Problems

26.1. Draw a sketch of the following odd $2\pi$-periodic functions defined for $-\pi < t \le \pi$, and find a general formula for their Fourier coefficients:

(a) $f(t) = \begin{cases} -1 & (-\pi < t < 0), \\ 1 & (0 \le t \le \pi); \end{cases}$

(b) $f(t) = t \quad (-\pi < t < \pi)$;

(c) $f(t) = \begin{cases} -t^2 & (-\pi < t < 0), \\ t^2 & (0 \le t \le \pi); \end{cases}$

(d) $f(t) = \begin{cases} e^{-t} - 1 & (-\pi < t < 0), \\ -(e^{t} - 1) & (0 \le t < \pi); \end{cases}$

(e) $f(t) = \begin{cases} 1 & (-\pi < t < -\tfrac12\pi), \\ -1 & (-\tfrac12\pi < t < 0), \\ 1 & (0 < t \le \tfrac12\pi), \\ -1 & (\tfrac12\pi < t \le \pi); \end{cases}$

(f) $f(t) = \sin \tfrac12 t$.

26.2. Draw a sketch of the following even $2\pi$-periodic functions defined for $-\pi < t \le \pi$, and find a general formula for their Fourier coefficients:

(a) $f(t) = \begin{cases} -1 & (-\pi < t \le -\tfrac12\pi), \\ 1 & (-\tfrac12\pi < t \le \tfrac12\pi), \\ -1 & (\tfrac12\pi < t \le \pi); \end{cases}$

(b) $f(t) = t^2$; (c) $f(t) = \cos \tfrac12 t$.

26.3. Draw a sketch of the following $2\pi$-periodic functions defined for $-\pi < t < \pi$, and find a general formula for their Fourier coefficients:

(a) $f(t) = \begin{cases} 0 & (-\pi < t < 0), \\ t & (0 \le t < \pi); \end{cases}$

(b) $f(t) = \begin{cases} t + \pi & (-\pi < t < 0), \\ t & (0 < t \le \pi). \end{cases}$

26.4. A half-rectified sine wave is given by the $2\pi$-periodic function

$$f(t) = \begin{cases} 0 & (-\pi < t \le 0), \\ \sin t & (0 < t \le \pi). \end{cases}$$

Find the Fourier series of $f(t)$.

26.5. Show that the Fourier series of the $2\pi$-periodic function

$$f(t) = \begin{cases} 0 & (-\pi < t \le 0), \\ 1 & (0 < t \le \pi), \end{cases}$$

is

$$\tfrac12 + \frac{2}{\pi}\left(\sin t + \tfrac13\sin 3t + \tfrac15\sin 5t + \cdots\right).$$

What value does the Fourier series take at $t = 0$? By choosing a particular value of $t$, find the sum of the series

$$1 - \tfrac13 + \tfrac15 - \tfrac17 + \cdots.$$

26.6. A signal $F\sin t$ with amplitude $F > 0$ is fully rectified into $F|\sin t|$. Find the Fourier series of the rectified signal. What is the amplitude of its first harmonic?

26.7. A Fourier series is given by

$$\sum \frac{n + a}{cm + 3}\sin nt,$$

where $a$ is a design parameter in the system. Find $a$ in order that the leading harmonics $n = 1$ and $n = 2$ have amplitudes in the ratio 2:1. What is the amplitude of the next harmonic?

26.8. The two $2\pi$-periodic signals shown in Fig. 26.33 are added. Find the Fourier series of the combined signal. What value should $F$ take in order that the leading harmonic should disappear?

Fig. 26.33

26.9. A $T$-periodic function is defined by

$$Q(t) = \tfrac14 T^2 - t^2 \quad\text{for } -\tfrac12 T \le t \le \tfrac12 T.$$

Find the Fourier series of $Q(t)$. What is the error between the sum of the first four terms of the series and $Q(t)$ at (a) $t = 0$, (b) $t = \tfrac12 T$?

26.10. A $2\pi$-periodic function is defined by

$$f(t) = \begin{cases} \beta t(\pi - t) & (0 < t \le \pi), \\ \beta t(\pi + t) & (-\pi < t \le 0). \end{cases}$$

Find the Fourier series for $f(t)$. What is the ratio of the amplitudes of the third and first harmonics? Compare the values of $f(t)$ and the Fourier series up to and including the coefficient $b_3$ at $t = \tfrac12\pi$.

26.11. Find the Fourier series of the $2\pi$-periodic function defined by $f(t) = t(\pi^2 - t^2)$ for $-\pi < t \le \pi$. Find the derivative of $f(t)$ for $-\pi < t \le \pi$ and find its Fourier series. Confirm that the derivative of the Fourier series of $f(t)$ obtained by term-by-term differentiation is the same series as the Fourier series of $f'(t)$.
Consider now the function $g(t) = t^3$ defined for $-\pi < t < \pi$. Find the Fourier series of $g(t)$ and $g'(t)$. Confirm that the derivative of the Fourier series of $g(t)$ is not the same series as the Fourier series of $g'(t)$.
Comparing the functions $f(t)$ and $g(t)$, what feature of $g(t)$ do you think causes the problem with its differentiated Fourier series?

26.12. Sketch the rectified sine wave defined by

$$P(t) = \begin{cases} 0 & (-\pi < t < 0), \\ |\sin 2t| & (0 < t < \pi), \end{cases}$$

extended so as to have period $2\pi$. Find its Fourier series. (The identities

$$\sin A\cos B = \tfrac12[\sin(A + B) + \sin(A - B)],$$
$$\sin A\sin B = \tfrac12[-\cos(A + B) + \cos(A - B)],$$

will be needed.)

26.13. Show that

$$t = 2\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}\sin nt$$

for $-\pi < t < \pi$. Integrate the terms from $t = 0$ to $t = x$, and rearrange them to show that

$$x^2 = 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2} - 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2}\cos nx.$$

Now use (26.10) to establish the value of the constant term in this Fourier series.

26.14. From Problem 26.13, or by direct means, obtain the Fourier series valid for $-\pi \le t \le \pi$:

$$t^2 = \tfrac13\pi^2 + 4\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2}\cos nt.$$

By integrating all the terms in this expression from $t = 0$ to $t = x$, obtain a Fourier series for $x^3 - \pi^2 x$. (It is always valid to integrate a Fourier series in this way in order to obtain a new one, but differentiation of the terms does not always lead to a valid series.)

26.15. Obtain the Fourier series of the function having period $T$ which is defined for $-\tfrac12 T \le t < \tfrac12 T$ by

$$P(t) = \begin{cases} -2t & (-\tfrac12 T < t < 0), \\ 2t & (0 \le t < \tfrac12 T). \end{cases}$$

Display its spectrum as in Fig. 26.12.

26.16. The function $f(t)$ is defined on the interval $0 < t \le 1$ by $f(t) = 1$. Express $f(t)$ as a half-range Fourier series on $0 < t \le 1$, (a) as a sine series, (b) as a cosine series. Sketch the sum of the series on $-\infty < t < \infty$ in both cases.

26.17. The function $f(t)$ is defined on $0 < t < 1$ by $f(t) = t$. Express $f(t)$ as a half-range Fourier series on $0 \le t < 1$, (a) as a cosine series, (b) as a sine series.

26.18. Express $f(t) = \sin \omega t$ on $0 \le t < \pi/\omega$ as a half-range cosine series. Sketch the sum of the series on $-\infty < t < \infty$.

26.19. Express $f(t) = \cos \omega t$ on $0 \le t < \pi/\omega$ as a half-range sine series. Sketch the sum of the series on $-\infty < t < \infty$.

26.20. Express $f(t) = \cos t$ on $0 < t \le 2\pi$ as a half-range sine series.

26.21. Express $f(t) = \cos t$ on $0 \le t \le 2\pi$ as a half-range cosine series.

26.22. Express the function $f(t)$, for $0 \le t \le \pi$, (a) as a half-range sine series, (b) as a half-range cosine series:

$$f(t) = \begin{cases} 1 & (0 \le t < \tfrac12\pi), \\ 0 & (\tfrac12\pi \le t \le \pi). \end{cases}$$

26.23. The Fourier series for the function $P(t)$, period $2\pi$, given by

$$P(t) = \begin{cases} -t & (-\pi < t \le 0), \\ t & (0 \le t \le \pi), \end{cases}$$

is

$$P(t) = \tfrac12\pi - \frac{4}{\pi}\left(\frac{\cos t}{1^2} + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right)$$

(see Example 26.2). Deduce from this the Fourier expansions of the following periodic functions.

(a) $Q(t)$, period 4, where

$$Q(t) = \begin{cases} -3t & (-2 < t < 0), \\ 3t & (0 \le t \le 2); \end{cases}$$

(b) $R(t)$, period 2, where

$$R(t) = \begin{cases} 1 + t & (-1 \le t \le 0), \\ 1 - t & (0 \le t < 1). \end{cases}$$

(Sketch $R(t)$ to understand the connection with $P(t)$.)
(c) Check that $P(t)$, $Q(t)$, $R(t)$ have similar spectra.

26.24. The Fourier series for the function $P(t)$, period 2, given by

$$P(t) = \begin{cases} & (-1 \le t < 0), \\ & (0 < t < 1), \end{cases}$$

for one period, is

$$P(t) = \frac{2}{\pi}\left(\sin \pi t + \tfrac13\sin 3\pi t + \tfrac15\sin 5\pi t + \cdots\right)$$

(see Example 26.4).
Deduce the expansion of the function $Q(t)$, period $T$, defined on one period by

$$Q(t) = \begin{cases} & (0 \le t < \tfrac12 T), \\ & (\tfrac12 T \le t < T). \end{cases}$$

26.25. Find the Fourier series of the $2\pi$-periodic sawtooth wave defined by

$$f(t) = t \quad (-\pi \le t \le \pi).$$

Determine the forced part of the solution of the second-order differential equation

$$\frac{d^2x}{dt^2} + \Omega^2 x = K\sin \omega t,$$

where $\omega \ne \pm\Omega$. Hence put together the periodic output of the forced system

$$\frac{d^2x}{dt^2} + \Omega^2 x = f(t),$$

where $f(t)$ is the sawtooth wave above. For what values of $\Omega$ does the system exhibit resonance?

26.26. The Fourier series of a function with period $T$ is given by

$$f(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty}(a_n\cos n\omega t + b_n\sin n\omega t),$$

where $T = 2\pi/\omega$. Multiply both sides of the equation by $f(t)$ and integrate between $-\tfrac12 T$ and $\tfrac12 T$, to obtain Parseval’s identity

$$\frac{2}{T}\int_{-\frac12 T}^{\frac12 T} f(t)^2\,dt = \tfrac12 a_0^2 + \sum_{n=1}^{\infty}(a_n^2 + b_n^2).$$

(a) Let $T = \pi$, and

$$f(t) = \begin{cases} & (-\tfrac12\pi < t \le 0), \\ & (0 < t \le \tfrac12\pi). \end{cases}$$

Show that

$$\sum_{n=0}^{\infty} \frac{1}{(2n + 1)^2} = \frac{\pi^2}{8}.$$

(b) Let $f(t) = t$ $(-\pi < t < \pi)$ be a $2\pi$-periodic function. Find its Fourier series (see Problem 26.1b), and deduce the corresponding Parseval identity.

26.27. The function $f(t)$ with period $T$ has the Fourier series

$$f(t) = \tfrac12 a_0 + \sum_{n=1}^{\infty}(a_n\cos n\omega t + b_n\sin n\omega t).$$

Find the Laplace transform of the function as the sum of a series of Laplace transforms of the trigonometric terms. Hence find the Laplace transform of the $2\pi$-periodic function defined by

$$f(t) = \begin{cases} -t^2 & (-\pi < t < 0), \\ t^2 & (0 < t \le \pi). \end{cases}$$

(See Problem 26.1c.)

26.28. A radio wave described by

$$x(t) = a\cos \omega t\cos \omega_0 t,$$

where $\omega_0$ is very much greater than $\omega$, represents a carrier wave subject to amplitude modulation by the comparatively slowly-varying term $\cos \omega t$, which represents a musical note. Roughly sketch the general character of $x(t)$.
(a) If $\omega = 500$ and $\omega_0 = 100\,001$ (notice the 1 at the end), what is the period of $x(t)$?
(b) Let $\omega = p/q$ and $\omega_0 = r/s$, where $p$, $q$, $r$, $s$ are whole numbers such that no two of them have a common divisor other than 1. What is the period? Express $x(t)$ as the sum of two waves with angular frequencies $\omega \pm \omega_0$ (these are called the sidebands). What is the Fourier cosine expansion based on this period?
(c) If you know about irrational numbers, show that $x(t) = \cos t\cos \sqrt{2}\,t$ never repeats itself exactly: it is not periodic.

26.29. (a) Prove that

$$\int_{-\frac12 T}^{\frac12 T} e^{j2\pi n f_0 t}\, e^{-j2\pi m f_0 t}\,dt = \begin{cases} T, & m = n, \\ 0, & m \ne n. \end{cases}$$

(b) Confirm that the expansion (26.18) is valid by multiplying both sides of (26.18a) by $e^{-j2\pi N f_0 t}$ and integrating the result over one period.

26.30. Obtain the two-sided Fourier series for the sawtooth function $x_P(t)$ defined by $x_P(t) = t/T$ for $0 < t < T$, together with its periodic extension of period $T$.

26.31. Prove that if x(f) is an even function then 7~[x(t)] (b) Given that e |!| <-> 2/(1 + An2/2), obtain
is an even function of /. Use this fact to reduce the J-1D//(l + 47t2/2)].
Fourier transform pair to a real form. [Hint: split the
ranges of integration into two parts; — oo to 0 and 0 to 26.39. From the result e_"H(f) <-> l/(a + j2n:/), use the
co.] time-reversal rule (26.34b) to obtain i7”[e_a|,)], where
a > 0.
26.32. Prove that if x(f) is an odd function, then X(f)
26.40. (a) Obtain
is a pure-imaginary odd function. Show that the Fourier
transform pair can then be reduced to a pair of real J\Q~at cos fit H(t)] and jF[e_<" sin fit H(£)],
equations.
where ot > 0. (Hint: look at the table of simplifying rules
26.33. Prove the time-scaling rule, (26.34b) and the (26.34) before trying to tackle these directly.)
time-delay rule, (26.34c). (b) Obtain ^[e-”' cos (2nf0t + (/>)], where a >^0.

26.34. By (26.31), iF[n(f)] = sine /. (a) Use the time- 26.41. Prove that
delay rule (26.34c) to obtain the transform of
H(f) * (x(f) H(t)} x(t) dr.
0 < t < 1, o
x(t) =
elsewhere.
26.42. (a) Obtain x1(t)*x2(.t) when x1(£) = x2(t) = e_1 H(t).
(b) Confirm the result (a) by evaluating ^7~[x(£)] directly. (b) Use your result together with the convolution theorem
(c) Use the time-delay rule and the time-scaling rule (26.44) to obtain the transform of a new function, t e~'.
(26.34b) to obtain iF[x(r)] where b > 2C and (c) Obtain J[t e_a<] from (b), where a > 0.
{—1, —b — ^c<t<—b + jc,
(d) Obtain the same result as in (c) by noticing that
d
1, b — jc < t < b + fc, — (e-0") = -f e“a<.
dx
0, elsewhere.
[Hint: sketch a diagram.] 26.43. (a) Prove that
(d) Use the result obtained in Problem 26.32 to confirm n(t-i)*n(t + i) = /i(t- i).
your answer.
[Hint: use the convolution theorem, (26.44).]
26.35. Given that J[/l(t)] = sine2 / (proved in Example (b) Show that
(26.26)), where n(t — a) * n(f — b) = A(t — a — b).
(l+t, -1 < t < 0, (c) Show that
A(t) = < 1 - t, 0 < t < 1, 0, t < — | and t > §,

H<N
rO|<N

[0, elsewhere,
V
V

2 + t,
1
1

n(t)*n(io
obtain (a) J[A{2t/\\ (b) J[4l(2t — 3)]. 1, -2 < t < i

26.36. (a) Prove the frequency-shift property, (26.34e). 2 _ t J<t< l


2
(b) Obtain J[x(r) e±j2,t/o1].
(c) From (b) deduce the modulation rules, (26.34f), for 26.44. Show that the total energy in the signal x(t) =

j[x(t) cos 2nf0t\ and J{x{t) sin 2nf0t].


e-<“ H(t). (a > 0) is equal to l/2a. Show that the total
(d) Obtain 7[n(ft) cos 2nf0t] and sin 2nf0t']. energy due to the frequency range — f0<f<fo is

equal to — arctan(2tr/0/a).
26.37. (a) Given that A(t) <-> sine2 /, obtain J[sinc21] 7ta
either by using the duality rule (26.34g), or by a direct
method. 26.45. Prove the result of Example 26.27 by using the
(b) Use the result (a), together with the time-shift and convolution theorem (26.44) together with the expression
time-scaling rules, to find ?\A{at + fi)]. (A(t) is defined (26.47) for J[HIT(t)].
by
26.46. Use the Fourier transform to obtain a particular
(l + t, -1 < t < 1, solution of the differential equation
A(t) = <1 - t, 0 < t < 1, d2x 1
[ 0, elsewhere.) dt2 X_l + £2’

26.38. (a) Prove the differentiation rule (26.34h). in the form of a convolution integral.
PART V

Differentiation of functions of two variables

Contents
27.1 Functions of more than one variable
27.2 Depiction of functions of two variables
27.3 Partial derivatives
27.4 Higher derivatives
27.5 Tangent plane and normal to a surface
27.6 Maxima, minima, and other stationary points
27.7 The method of least squares
27.8 Differentiating an integral with respect to a parameter
Problems

27.1 Functions of more than one variable


Quantities in nature usually depend on, or are functions of, more
than one variable. The elevation H of land above sea level depends
on two map coordinates x and y; so H is a function of the two
variables x and y, and we write H(x, y). If we want to take account
of geological changes, then time t becomes a consideration, and in
that case H is a function of three variables x, y, t, and we write
H(x, y, t). It is easy to produce examples involving many variables;
for example the distance between two points $P : (x_1, y_1, z_1)$ and
$Q : (x_2, y_2, z_2)$ is a function of six variables. The state of the economy
is a function of a multitude of variables. We alternatively speak of
a function in 1, 2, 3,... dimensions.
Suppose that a quantity z, called the dependent variable, depends
on two independent variables x and y. The dependence can often be
expressed by an explicit formula such as
$$z = x^3 + y^3, \qquad z = e^{x-2y}, \qquad z = |xy|,$$
and so on. To make statements which apply to all sorts of functions
we use the notation
$$z = f(x, y),$$
or $z = g(x, y)$, etc. The letter $f$ on its own signifies a function or
process: a computer subroutine, a particular formula, or a set of
rules which will generate a single number z when two numbers x
and y are fed to it in the right order.
Thus, if
$$f(x, y) = 2x + y^2,$$
then
$$f(3, -2) = (2 \times 3) + (-2)^2 = 10,$$
$$f(-2, 3) = [2 \times (-2)] + 3^2 = 5,$$
$$f(a, b) = 2a + b^2, \qquad f(u^2, v) = 2u^2 + v^2,$$
$$f(-x, x) = -2x + x^2, \qquad f(y, x) = 2y + x^2.$$
Notice the last one particularly: it is different from $f(x, y)$.
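
The remark that f is a process taking two numbers in a definite order can be illustrated with a short sketch. The Python function name `f` below simply mirrors the text's notation; nothing else is assumed.

```python
def f(x, y):
    """The example function f(x, y) = 2x + y^2 from the text."""
    return 2 * x + y ** 2

# The order of the arguments matters: swapping them changes the result.
print(f(3, -2))   # (2 * 3) + (-2)**2
print(f(-2, 3))   # 2 * (-2) + 3**2
```

Calling `f(y, x)` for general values evaluates 2y + x^2, which is why it differs from f(x, y).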

27.2 Depiction of functions of two variables


Consider the particular function
$$f(x, y) = x^2 + y^2.$$
Set up x, y, z axes; put
$$z = x^2 + y^2,$$
and proceed as if plotting a graph. Take a large number of pairs
(x, y), work out z for each, then put the point (x, y, z) in the axes.
For example, if x = 1 and y = 2, then z = 5 and we ‘plot’ the point
(1, 2, 5) as shown in Fig. 27.1a. For Fig. 27.1b, a great number of
points is supposed to have been plotted. They cover a surface shaped
like a bowl.

Fig. 27.1 Depicting the function $f(x, y) = x^2 + y^2$.

Every function has a characteristic surface shape, which is the
analogue in three dimensions of the graphs used for functions of a
single variable. Some other functions are depicted in Fig. 27.2.
Fig. 27.2 (a) The plane $z = 2x + 4y - 2$. (b) The hemisphere $z = (4 - x^2 - y^2)^{1/2}$. (d) The saddle $z = x^2 - y^2$.

Another way of depicting a function is to sketch its contour map
or its level curves. Fig. 27.3 shows a contour map of a patch of
countryside. Along each contour the height is constant, and is
indicated on the curve. The important features of the terrain are
very easy to pick out; there are peaks at A and B, a pass at C (which
is a ‘saddle’ as in Fig. 27.2d), valleys north-west and south-east
of C and ascents north-east and south-west of C. At E the contours
are close together, so the slope is steep, and at F the contours are
widely spaced so the slope is comparatively gentle.
Consider again the function $f(x, y) = x^2 + y^2$ depicted in Fig.
27.1. The contour of height c is the circle
$$x^2 + y^2 = c,$$
where $c > 0$, which is a circle of radius $c^{1/2}$, as shown in
Fig. 27.4a. This can be visualized as in Fig. 27.4b, as a horizontal
slice of the surface $z = x^2 + y^2$ at height c, projected on to the x, y
plane.

Fig. 27.3

Fig. 27.4 (a) The contour of height c. (b) Horizontal slice at height c.
Example 27.1. Sketch the contours of the function $xy$.

Fig. 27.5

The contour of height c is given by the equation
$$z = xy = c,$$
or
$$y = c/x.$$
By varying c, taking positive and negative values, the contour map or level
curves of Fig. 27.5 are obtained.

27.3 Partial derivatives


Suppose that z = f(x, y) represents the height above sea level of a
piece of countryside.
In Fig. 27.6a, an observer stands at the point P : (x, y), facing east,
in the direction of the x axis. A short step forward takes him to
$Q : (x + \delta x, y)$, up or down a slope. The altitude changes by an amount
$$\delta z = f(x + \delta x, y) - f(x, y).$$
The average slope in this direction over the step length $\delta x$ is $\delta z/\delta x$,
so the slope at P facing the observer is given by
$$\lim_{\delta x \to 0} \frac{\delta z}{\delta x} = \lim_{\delta x \to 0} \frac{f(x + \delta x, y) - f(x, y)}{\delta x}.$$

Fig. 27.6 (a) Step in the x direction. (b) Step in the y direction.

Since the variable y is constant during the step, this is in effect
an ordinary derivative, taken with respect to x only. However,
it is customary to signal that another variable is present, which is
done by using the special sign $\partial$ (still called ‘dee’) instead of the
usual d for the derivative; writing
$$\frac{\partial f}{\partial x} \quad\text{or}\quad \frac{\partial z}{\partial x}$$
instead of df/dx or dz/dx. This is called the partial derivative of
f(x, y), or of z, with respect to x.
If the observer faces north and takes a step $\delta y$, as in Fig. 27.6b,
then we obtain in the same way the slope $\partial f/\partial y$ or $\partial z/\partial y$ in the y
direction.

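Partial derivatives
If $z = f(x, y)$, then
$$\frac{\partial f}{\partial x} \ \text{or}\ \frac{\partial z}{\partial x} = \lim_{\delta x \to 0} \frac{f(x + \delta x, y) - f(x, y)}{\delta x},$$
$$\frac{\partial f}{\partial y} \ \text{or}\ \frac{\partial z}{\partial y} = \lim_{\delta y \to 0} \frac{f(x, y + \delta y) - f(x, y)}{\delta y}. \qquad (27.1)$$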

Example 27.2. Find $\partial z/\partial x$ and $\partial z/\partial y$ at the point x = 1, y = 3 when
$$z = x^2 y + 2x^2 - 3y + 4.$$
For $\partial z/\partial x$, y has the status of a constant for the purpose of the differentiation, so
$$\frac{\partial z}{\partial x} = 2xy + 4x - 0 + 0 = 2xy + 4x.$$
At the point (1, 3), $\partial z/\partial x = 10$.

For $\partial z/\partial y$, x is treated as constant, so
$$\frac{\partial z}{\partial y} = x^2 + 0 - 3 + 0 = x^2 - 3.$$
At (1, 3), $\partial z/\partial y = -2$.
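
The limit definition (27.1) suggests a quick numerical check of Example 27.2. The sketch below is only an illustration: it uses a symmetric (central) difference rather than the one-sided quotient in (27.1), and the step size `h` and the helper names `partial_x` and `partial_y` are assumptions made for this example.

```python
def z(x, y):
    """The function of Example 27.2."""
    return x**2 * y + 2 * x**2 - 3 * y + 4

def partial_x(f, x, y, h=1e-6):
    # Vary x only; y is held constant.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # Vary y only; x is held constant.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

print(partial_x(z, 1.0, 3.0))  # should be close to 10
print(partial_y(z, 1.0, 3.0))  # should be close to -2
```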

We often need to indicate the particular point at which a
derivative is to be evaluated, like the point (1, 3) in the previous
example. There are many notations in use for this purpose. We use
$$\left(\frac{\partial f}{\partial x}\right)_P \quad\text{or}\quad \left(\frac{\partial f}{\partial x}\right)_{(a,b)}$$
to mean the derivatives are to be evaluated at P : (a, b). In this
connection, the following definitions are equivalent to (27.1):

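Partial derivatives at (a, b)
$$\left(\frac{\partial f}{\partial x}\right)_{(a,b)} \ \text{or}\ \left(\frac{\partial z}{\partial x}\right)_{(a,b)} = \lim_{x \to a} \frac{f(x, b) - f(a, b)}{x - a},$$
$$\left(\frac{\partial f}{\partial y}\right)_{(a,b)} \ \text{or}\ \left(\frac{\partial z}{\partial y}\right)_{(a,b)} = \lim_{y \to b} \frac{f(a, y) - f(a, b)}{y - b}. \qquad (27.2)$$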
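Example 27.3. Obtain (a) $\dfrac{\partial}{\partial x}\left(\dfrac{x}{x + y}\right)$; (b) $\dfrac{\partial}{\partial y}\left(\dfrac{1}{(x^2 + y^2)^{1/2}}\right)$.

(a) We hold y constant and use the quotient rule (3.2):
$$\frac{\partial}{\partial x}\left(\frac{x}{x + y}\right) = \left[(x + y)\frac{\partial x}{\partial x} - x\frac{\partial}{\partial x}(x + y)\right] \bigg/ (x + y)^2 = \frac{y}{(x + y)^2},$$
since $\partial x/\partial x = 1$, and y is constant.

(b) x is held constant. Use the chain rule (3.3), putting
$$u = x^2 + y^2, \qquad z = u^{-1/2};$$
then
$$\frac{\partial z}{\partial y} = \frac{dz}{du}\frac{\partial u}{\partial y}.$$
(We write $\partial u/\partial y$ instead of $du/dy$ in the chain rule because both x and y are
present in u, and x is being held constant.) Continuing, we have
$$\frac{\partial z}{\partial y} = \left(-\tfrac{1}{2}u^{-3/2}\right)(2y) = -y(x^2 + y^2)^{-3/2}.$$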

Example 27.4. The potential function $V(x, t) = A e^{-qt} \sin k(x - ct)$ represents
an attenuating wave travelling to the right along a cable with speed c. Here
A, q, k, c are constants. Find (a) the rate of change of V with time t at any fixed
point x; (b) the ‘potential gradient’ $\partial V/\partial x$ along the wire at any moment.

(a) For $\partial V/\partial t$, use the product rule (3.1) with $u = A e^{-qt}$ and $v = \sin k(x - ct)$.
We treat x as constant, so $\partial v/\partial t$ instead of $dv/dt$ will be written into the product
rule:
$$\begin{aligned}
\frac{\partial V}{\partial t} = \frac{\partial (uv)}{\partial t} &= u\frac{\partial v}{\partial t} + v\frac{du}{dt} \\
&= A e^{-qt}[-kc \cos k(x - ct)] + (-qA e^{-qt}) \sin k(x - ct) \\
&= -A e^{-qt}[kc \cos k(x - ct) + q \sin k(x - ct)].
\end{aligned}$$
(b)
$$\frac{\partial V}{\partial x} = A e^{-qt} \frac{\partial}{\partial x} \sin k(x - ct) = kA e^{-qt} \cos k(x - ct),$$
t being treated as constant.


It will be seen that no new rules have to be learned in order
to obtain the partial derivatives of given functions. In fact the reader
has always unconsciously carried out partial differentiation when
differentiating expressions like $A \sin(\omega t + \phi)$, without worrying
whether $A$, $\omega$, $\phi$ were really constants or just to be treated as such
while differentiating.

27.4 Higher derivatives


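Having differentiated a function, we might want to differentiate it
again. Thus, if $z = f(x, y)$, we can form $\dfrac{\partial z}{\partial x}$ and then $\dfrac{\partial}{\partial x}\left(\dfrac{\partial z}{\partial x}\right)$ or
$\dfrac{\partial}{\partial y}\left(\dfrac{\partial z}{\partial x}\right)$, thus forming second derivatives, or derivatives higher
than the second. There are four second derivatives, written as follows:
$$\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial x^2}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial y\,\partial x},$$
$$\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial x\,\partial y}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial y^2}.$$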

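Example 27.5. Obtain the four second derivatives when
$$z = x^3 y + xy^2 + x + y^2 + 1.$$
The first derivatives are
$$\frac{\partial z}{\partial x} = 3x^2 y + y^2 + 1, \qquad \frac{\partial z}{\partial y} = x^3 + 2xy + 2y.$$
Therefore
$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) = 6xy,$$
$$\frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = 3x^2 + 2y,$$
$$\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = 3x^2 + 2y,$$
$$\frac{\partial^2 z}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = 2x + 2.$$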

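In the last example, we see that $\partial^2 z/\partial y\,\partial x = \partial^2 z/\partial x\,\partial y$. This is
always true for normal functions, although the proof is difficult:

For any function f(x, y),
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}. \qquad (27.3)$$
In higher derivatives, the $\partial x$ and $\partial y$ in the denominator
may be arranged in any order.

For example
$$\frac{\partial^3 f}{\partial x\,\partial y^2} = \frac{\partial^3 f}{\partial y^2\,\partial x} = \frac{\partial^3 f}{\partial y\,\partial x\,\partial y},$$
and so on.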
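
The equality of the mixed derivatives in (27.3) can be checked numerically for the function of Example 27.5. This is a rough sketch rather than a proof: the nested central differences, the step `h`, the helper names `d_dx` and `d_dy`, and the test point are all choices made for this illustration.

```python
def z(x, y):
    """The function of Example 27.5."""
    return x**3 * y + x * y**2 + x + y**2 + 1

def d_dx(f, h=1e-4):
    # Returns a new function approximating the partial derivative of f in x.
    return lambda x, y: (f(x + h, y) - f(x - h, y)) / (2 * h)

def d_dy(f, h=1e-4):
    # Returns a new function approximating the partial derivative of f in y.
    return lambda x, y: (f(x, y + h) - f(x, y - h)) / (2 * h)

z_yx = d_dy(d_dx(z))   # differentiate with respect to x first, then y
z_xy = d_dx(d_dy(z))   # differentiate with respect to y first, then x

x0, y0 = 1.5, -0.5     # an arbitrary test point
exact = 3 * x0**2 + 2 * y0   # the formula found in Example 27.5
print(z_yx(x0, y0), z_xy(x0, y0), exact)
```

Both numerical values should agree with each other and with 3x^2 + 2y to within the accuracy of the difference scheme.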
The next example shows how to manage a problem in notation.
Often a function f(x, y) is used in which the variables x and y only
occur in a fixed combination u = h(x, y), so that
$$f(x, y) = g(u), \quad\text{with}\quad u = h(x, y),$$
where g represents a general, unspecified, function of a single
variable. To obtain a general formula for $\partial f/\partial x$ use the chain rule
(3.3) (see also Example 4.2c):
$$\frac{\partial f}{\partial x} = \frac{dg}{du}\frac{\partial u}{\partial x} = g'(u)\frac{\partial u}{\partial x} = g'(h(x, y))\frac{\partial h}{\partial x}.$$

(You must not write dg/dx at this point.)


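Thus suppose that $f(x, y) = g(5x - 3y)$; then
$$\frac{\partial f}{\partial x} = 5g'(5x - 3y).$$

If $z = g(u)$, where $u = h(x, y)$, then
$$\frac{\partial z}{\partial x} = \frac{dg}{du}\frac{\partial u}{\partial x} = g'(h(x, y))\frac{\partial h}{\partial x},$$
$$\frac{\partial z}{\partial y} = \frac{dg}{du}\frac{\partial u}{\partial y} = g'(h(x, y))\frac{\partial h}{\partial y}. \qquad (27.4)$$

It is necessary to work out $g'(u)$ first, before substituting $u = h(x, y)$.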

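Example 27.6. Prove that if $z = \phi(x - ct)$, where $\phi$ is any function, then
$$\frac{\partial^2 z}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 z}{\partial t^2}.$$
Put $z = \phi(u)$ where $u = x - ct$. Then
$$\frac{\partial z}{\partial x} = \frac{d\phi}{du}\frac{\partial u}{\partial x} = \phi'(u).$$
By the chain rule again,
$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial}{\partial x}\phi'(u) = \frac{d\phi'(u)}{du}\frac{\partial u}{\partial x} = \phi''(u).$$
Similarly
$$\frac{\partial z}{\partial t} = \frac{d\phi}{du}\frac{\partial u}{\partial t} = \phi'(u)(-c),$$
so
$$\frac{\partial^2 z}{\partial t^2} = \frac{\partial}{\partial t}[-c\phi'(u)] = \frac{d}{du}[-c\phi'(u)]\frac{\partial u}{\partial t} = (-c)^2\phi''(u).$$
Therefore
$$\frac{\partial^2 z}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 z}{\partial t^2}.$$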

The equation
$$\frac{\partial^2 z}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 z}{\partial t^2}$$
in Example 27.6 is called the wave equation in one space dimension.
It is a partial differential equation as contrasted with the ordinary
differential equations treated earlier in the book. We have verified
that $\phi(x - ct)$ is always a solution, for any function $\phi$. The general
solution is
$$\phi(x - ct) + \psi(x + ct),$$
where $\phi$ and $\psi$ are arbitrary functions. The general solution of
partial differential equations involves arbitrary functions rather than
the arbitrary constants which occur in ordinary differential equations:
even the simple equation $\partial z/\partial x = 0$ has the general solution $z = f(y)$,
where f(y) is an arbitrary function.
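
As an informal check that a sum of two arbitrary travelling profiles satisfies the wave equation, the sketch below compares the two second derivatives by finite differences. The particular profiles `phi` and `psi`, the wave speed, the step size and the test point are all assumptions chosen only for illustration.

```python
import math

c = 2.0  # wave speed (an arbitrary choice for the check)

def phi(u):
    # Any smooth function of one variable will do here.
    return math.exp(-u * u)

def psi(u):
    return math.sin(u)

def z(x, t):
    # A right-moving profile plus a left-moving profile.
    return phi(x - c * t) + psi(x + c * t)

def second_x(f, x, t, h=1e-3):
    return (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h**2

def second_t(f, x, t, h=1e-3):
    return (f(x, t + h) - 2 * f(x, t) + f(x, t - h)) / h**2

x0, t0 = 0.3, 0.1
lhs = second_x(z, x0, t0)          # approximates the second x derivative
rhs = second_t(z, x0, t0) / c**2   # approximates (1/c^2) times the second t derivative
print(lhs, rhs)                    # the two values should agree closely
```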

27.5 Tangent plane and normal to a surface


The tangent plane to a surface z = f(x, y) at a point Q on the surface
plays the same role as the tangent line to a curve for functions of a
single variable. The tangent plane is the plane that fits the surface
near Q better than any other possible plane, as when a penny is
pressed against a teapot at a particular point.
Suppose that the tangent plane at Q (Fig. 27.7) has the equation

z = Ax + By + C.

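There are three constants to be determined, so we need three
conditions to settle the values. Three conditions it is reasonable to
expect the tangent plane to satisfy are

(i) It must pass through Q; so $c = Aa + Bb + C$.
(ii) In the x direction at Q, the slope A of the plane must be equal
to the slope of the surface; so $A = \left(\dfrac{\partial f}{\partial x}\right)_Q$.
(iii) In the y direction at Q the slope B of the plane must be equal
to the slope of the surface; so $B = \left(\dfrac{\partial f}{\partial y}\right)_Q$.

Fig. 27.7 The tangent plane to $z = f(x, y)$ at $Q : (a, b, c)$, where $c = f(a, b)$.

Therefore the required equation is
$$z = \left(\frac{\partial f}{\partial x}\right)_Q x + \left(\frac{\partial f}{\partial y}\right)_Q y + c - a\left(\frac{\partial f}{\partial x}\right)_Q - b\left(\frac{\partial f}{\partial y}\right)_Q,$$
or, more tidily,
$$z - c = \left(\frac{\partial f}{\partial x}\right)_{(a,b)}(x - a) + \left(\frac{\partial f}{\partial y}\right)_{(a,b)}(y - b).$$

Tangent plane at Q : (a, b, c) on the surface $z = f(x, y)$
$$z - c = \left(\frac{\partial f}{\partial x}\right)_{(a,b)}(x - a) + \left(\frac{\partial f}{\partial y}\right)_{(a,b)}(y - b). \qquad (27.5)$$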

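Example 27.7. Find the equation of the tangent plane at the point Q : (2, 1, -2)
on the sphere $x^2 + y^2 + z^2 = 9$.

Recast the equation into the form $z = f(x, y)$, noticing that Q is on the lower
half of the sphere:
$$z = -(9 - x^2 - y^2)^{1/2}.$$
Then the chain rule gives:
$$\frac{\partial z}{\partial x} = -(-2x)\tfrac{1}{2}(9 - x^2 - y^2)^{-1/2}, \quad\text{and}\quad \left(\frac{\partial z}{\partial x}\right)_{(2,1)} = 1;$$
$$\frac{\partial z}{\partial y} = -(-2y)\tfrac{1}{2}(9 - x^2 - y^2)^{-1/2}, \quad\text{and}\quad \left(\frac{\partial z}{\partial y}\right)_{(2,1)} = \tfrac{1}{2}.$$
Therefore the equation of the tangent plane at Q is
$$z - (-2) = 1(x - 2) + \tfrac{1}{2}(y - 1),$$
or
$$z = x + \tfrac{1}{2}y - \tfrac{9}{2}.$$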
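
The recipe in (27.5) can be tried numerically on the data of Example 27.7. In this sketch the slopes are estimated by central differences instead of being found analytically; the helper name `tangent_plane` and the step `h` are assumptions made for the illustration.

```python
import math

def f(x, y):
    # Lower half of the sphere x^2 + y^2 + z^2 = 9, as in Example 27.7.
    return -math.sqrt(9 - x * x - y * y)

def tangent_plane(f, a, b, h=1e-6):
    """Return (A, B, C) so that the tangent plane at (a, b) is z = A*x + B*y + C."""
    A = (f(a + h, b) - f(a - h, b)) / (2 * h)   # slope in the x direction
    B = (f(a, b + h) - f(a, b - h)) / (2 * h)   # slope in the y direction
    C = f(a, b) - A * a - B * b                 # plane passes through (a, b, f(a, b))
    return A, B, C

A, B, C = tangent_plane(f, 2.0, 1.0)
print(A, B, C)  # should be close to 1, 0.5 and -4.5
```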

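A straight line SQR (Fig. 27.8) is said to be normal or perpendicular
to the surface $z = f(x, y)$ at Q if it is perpendicular to its
tangent plane at Q. The equation (27.5) for the tangent plane can be
written in the form
$$\left(\frac{\partial f}{\partial x}\right)_{(a,b)} x + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} y + (-1)z = C,$$
where C is a constant, so (see equation (10.22)) a triplet of direction
ratios for the line normal to the surface at Q is
$$\left(\left(\frac{\partial f}{\partial x}\right)_{(a,b)},\ \left(\frac{\partial f}{\partial y}\right)_{(a,b)},\ -1\right). \qquad (27.6)$$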

Example 27.8. Find the cartesian (x, y, z) equation of the straight line normal
to the surface $x^2 + y^2 + z^2 = 9$ at (2, 1, -2).

From Example 27.7 (which has the same data), the direction ratios in (27.6) are
$$1, \ \tfrac{1}{2}, \ -1.$$
Therefore the equation of the normal line at Q is
$$\frac{x - 2}{1} = \frac{y - 1}{\tfrac{1}{2}} = \frac{z + 2}{-1}.$$

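The triplet of direction ratios in (27.6) can be regarded as the
three components of a vector parallel to the normal line. Such a
vector is called a normal vector at Q, and is denoted usually by $\mathbf{n}$:
$$\mathbf{n} = \left(\left(\frac{\partial f}{\partial x}\right)_{(a,b)},\ \left(\frac{\partial f}{\partial y}\right)_{(a,b)},\ -1\right).$$
Any multiple of this vector is another normal vector, since it will be
parallel to the same line. A normal vector placed at Q is shown in
Fig. 27.8.

Normal vector $\mathbf{n}$ at Q : (a, b, c), where $c = f(a, b)$,
on the surface $z = f(x, y)$
$$\mathbf{n} = \left(\left(\frac{\partial f}{\partial x}\right)_{(a,b)},\ \left(\frac{\partial f}{\partial y}\right)_{(a,b)},\ -1\right), \qquad (27.7)$$
or any multiple of this vector. Its components are
direction ratios of the normal line at Q.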

Example 27.9. Find several vectors normal to the sphere $x^2 + y^2 + z^2 = 9$ at
the point (2, 1, -2) on the sphere.
The data are again the same as in Example 27.7. The normal taken from (27.7)
is $(1, \tfrac{1}{2}, -1)$. Another is $(-1, -\tfrac{1}{2}, 1)$, pointing in the opposite direction, while
$(\tfrac{2}{3}, \tfrac{1}{3}, -\tfrac{2}{3})$ is a unit vector which is a normal.

27.6 Maxima, minima, and other stationary points

For a function of a single variable, a local maximum or minimum
or a point of inflection occurs where the tangent line to the graph of
the function is horizontal. For a function f(x, y) of two variables,
there are similar possibilities at points where the tangent plane
is horizontal. Such points, or rather their (x, y) coordinates, are
called stationary points of f(x, y), because as we pass through
them the function is momentarily neither increasing nor decreasing.
Sometimes a stationary point is a local minimum or maximum as
illustrated in Fig. 27.9a, b.
The condition for the tangent plane at Q on z = f(x, y) to be
horizontal is that the normal n at Q should be vertical, or parallel
to the z axis. Therefore the x and y components of n in (27.7) must
be zero:
$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} = 0.$$

Fig. 27.9 (a) A local minimum. (b) A local maximum. (c) A saddle. (d) A shoulder.

These constitute two simultaneous equations whose solutions (x, y)
are the stationary points of f(x, y).

Stationary points of f(x, y)
are at the solutions (x, y) of
$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} = 0. \qquad (27.8)$$

We shall usually describe a stationary point of f(x, y) as being
‘at P : (x, y)’ rather than ‘at Q : (x, y, z) on z = f(x, y)’. If necessary,
the corresponding value of z can be worked out after finding (x, y).

Example 27.10. Find the stationary points of
$$f(x, y) = \tfrac{1}{3}x^3 - xy^2 - 2y,$$
and the value of f(x, y) there.

Since $\partial f/\partial x = x^2 - y^2$ and $\partial f/\partial y = -2xy - 2$, stationary points occur where
$$x^2 - y^2 = 0, \qquad xy + 1 = 0.$$
The first equation is equivalent to $y = \pm x$. Consider these alternatives
separately:
If $y = x$, the second equation becomes $x^2 + 1 = 0$, which has no solution.
Therefore reject $y = x$.
If $y = -x$, the second equation becomes $-x^2 + 1 = 0$, which has solutions
$x = \pm 1$. Corresponding to these we have
$$y = -x = \mp 1.$$
Therefore there are two stationary points, (1, -1) and (-1, 1). The values
of f(x, y) at these points are
$$f(1, -1) = \tfrac{4}{3}, \qquad f(-1, 1) = -\tfrac{4}{3}.$$

A stationary point at (a, b) is a local maximum if f(a, b) is greater
than f(x, y) at all points in its immediate locality; it is a local
minimum if the words ‘less than’ are substituted for ‘greater
than’. On a contour map, a maximum or minimum shows its
presence by being surrounded by closed contours as in Fig. 27.10b.

Fig. 27.10 (a) Local maxima at $Q_1$ and $Q_3$ and a saddle at $Q_2$. (b) The contour map shows closed curves around the maxima.

As with functions of a single variable, the test for a maximum or
minimum involves second derivatives. The following test enables
maxima, minima and other stationary points to be distinguished in
most cases, but we omit the proof, which is difficult.

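Test for the character of a stationary point P : (a, b) of f(x, y)

Suppose that $\partial f/\partial x = \partial f/\partial y = 0$ at P. Then P is

(a) a saddle if $\dfrac{\partial^2 f}{\partial x^2}\dfrac{\partial^2 f}{\partial y^2} - \left(\dfrac{\partial^2 f}{\partial x\,\partial y}\right)^2 < 0$ at P,

(b) a maximum if $\dfrac{\partial^2 f}{\partial x^2}\dfrac{\partial^2 f}{\partial y^2} - \left(\dfrac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0$ with $\dfrac{\partial^2 f}{\partial x^2} < 0$ $\left(\text{or } \dfrac{\partial^2 f}{\partial y^2} < 0\right)$ at P,

(c) a minimum if $\dfrac{\partial^2 f}{\partial x^2}\dfrac{\partial^2 f}{\partial y^2} - \left(\dfrac{\partial^2 f}{\partial x\,\partial y}\right)^2 > 0$ with $\dfrac{\partial^2 f}{\partial x^2} > 0$ $\left(\text{or } \dfrac{\partial^2 f}{\partial y^2} > 0\right)$ at P. (27.9)

(d) If none of these apply, the point might be any type.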

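Example 27.11. Find and classify the stationary points of
$$f(x, y) = \tfrac{1}{3}x^3 + \tfrac{1}{3}y^3 - x^2 - y^2.$$
The stationary points are the solutions of $\partial f/\partial x = 0$, $\partial f/\partial y = 0$, or
$$x^2 - 2x = 0, \qquad y^2 - 2y = 0.$$
From the first, we obtain x = 0 or x = 2. From the second, y = 0 or y = 2.
Therefore there are stationary points at (0, 0), (0, 2), (2, 0), (2, 2). To test them,
we need the second derivatives at a general point:
$$\frac{\partial^2 f}{\partial x^2} = 2x - 2, \qquad \frac{\partial^2 f}{\partial y^2} = 2y - 2, \qquad \frac{\partial^2 f}{\partial x\,\partial y} = 0.$$
At (0, 0), these become respectively -2, -2, 0. Then
$$\frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 = 4 > 0, \qquad \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial y^2} = -2 < 0.$$
Since the conditions of (27.9b) apply, the point is a maximum.

At (0, 2) and (2, 0),
$$\frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 = -4 < 0;$$
so, by (27.9a), both points are saddles.

At (2, 2),
$$\frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 = 4 > 0, \qquad \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial y^2} = 2 > 0.$$
Therefore, by (27.9c), the point is a minimum.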
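
The test (27.9) is mechanical enough to write down as a short routine. The sketch below applies it to the four stationary points of Example 27.11; the function names `classify` and `second_derivatives` are hypothetical, and the second derivatives are those worked out in the example.

```python
def classify(fxx, fyy, fxy):
    """Apply test (27.9), given the second derivatives at a stationary point."""
    d = fxx * fyy - fxy ** 2
    if d < 0:
        return "saddle"
    if d > 0 and fxx < 0:
        return "maximum"
    if d > 0 and fxx > 0:
        return "minimum"
    return "undetermined"   # case (d): the test gives no information

def second_derivatives(x, y):
    # For f(x, y) = x^3/3 + y^3/3 - x^2 - y^2: returns (fxx, fyy, fxy).
    return 2 * x - 2, 2 * y - 2, 0

for point in [(0, 0), (0, 2), (2, 0), (2, 2)]:
    print(point, classify(*second_derivatives(*point)))
```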



27.7 The method of least squares


Suppose that a succession of experiments is performed in which we
vary one quantity x, such as voltage applied to a circuit, and measure
the corresponding value of another variable y, say the resulting
current. The values recorded for one or both of the variables might
be subject to random errors of measurement; on a graph of the
results, this will show up as scatter among the points, as in Fig. 27.11.
We might have reason to believe that the underlying relation
between x and y is a straight line. There is no way of deducing this
line with certainty, but the following method is often used to obtain
a convincing straight-line fit to the points.
Suppose that there are N points altogether; call them
$$(x_1, y_1), \ (x_2, y_2), \ \ldots, \ (x_N, y_N).$$
The general point is called $(x_n, y_n)$. Figure 27.11 shows a candidate
for the best-fitting straight line
$$y = ax + b,$$
and we have to adjust the constants a and b to obtain a good fit.
The vertical deviation $e_n$ of a point $(x_n, y_n)$ from the line is shown:
$$e_n = y_n - (ax_n + b).$$
The criterion we shall use to determine the best straight line is to
choose a and b so that $\sum_{n=1}^{N} e_n^2$ is as small as possible; that is to say,
we want to minimize
$$\sum_{n=1}^{N} (y_n - ax_n - b)^2 = f(a, b) \quad\text{(say)}.$$
Therefore a and b are the variables in this problem, and everything
else has fixed values.
For a minimum, we require at least that
$$\frac{\partial f}{\partial a} = \frac{\partial f}{\partial b} = 0.$$
The derivatives are given by
$$\frac{\partial f}{\partial a} = \sum_{n=1}^{N} 2(-x_n)(y_n - ax_n - b) = 2\sum_{n=1}^{N} (ax_n^2 + bx_n - x_n y_n),$$
$$\frac{\partial f}{\partial b} = \sum_{n=1}^{N} (-2)(y_n - ax_n - b) = 2\sum_{n=1}^{N} (ax_n + b - y_n).$$
Noting that $\sum_{n=1}^{N} b = b + b + \cdots + b = Nb$, we find the conditions
for a minimum as the following pair of simultaneous equations for
a and b:
Method of least squares

To fit a straight line $y = ax + b$ to the N points
$(x_n, y_n)$ $(n = 1, 2, \ldots, N)$,
find a and b by solving the simultaneous equations
$$a\sum_{n=1}^{N} x_n^2 + b\sum_{n=1}^{N} x_n = \sum_{n=1}^{N} x_n y_n,$$
$$a\sum_{n=1}^{N} x_n + bN = \sum_{n=1}^{N} y_n. \qquad (27.10)$$

We shall not prove that the stationary point of f(a, b) found by
this method is actually a minimum (see Problem 27.21).

Example 27.12. Find the straight line which best fits the data:

x_n   0.0   1.1   3.2   3.9   7.1   8.9
y_n   1.1   1.6   1.6   2.8   2.9   3.8

Here N = 6, and the coefficients in (27.10) are
$$\sum_{n=1}^{6} x_n = 24.2, \qquad \sum_{n=1}^{6} y_n = 13.8,$$
$$\sum_{n=1}^{6} x_n^2 = 156.28, \qquad \sum_{n=1}^{6} x_n y_n = 72.21.$$
The equations for a and b therefore become
$$156.28a + 24.2b = 72.21,$$
$$24.2a + 6b = 13.8.$$
By solving these we find that a = 0.28, b = 1.16, so the required line is
$$y = 0.28x + 1.16.$$
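
The equations (27.10) form a 2 x 2 linear system, so the fit can be computed in a few lines. The sketch below solves that system by Cramer's rule for the data of Example 27.12; the function name `fit_line` is hypothetical, and no safeguard against an ill-conditioned system is included.

```python
def fit_line(xs, ys):
    """Solve the normal equations (27.10) for the line y = a*x + b."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = sxx * n - sx * sx          # determinant of the 2x2 system
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

xs = [0.0, 1.1, 3.2, 3.9, 7.1, 8.9]
ys = [1.1, 1.6, 1.6, 2.8, 2.9, 3.8]
a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2))      # compare with a = 0.28, b = 1.16 in the example
```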

The equations for a and b are sometimes ill-conditioned, meaning
that the solutions are very sensitive to small changes in the
coefficients. It is therefore advisable to retain all the significant
figures given by the data while solving them, despite the fact that
we know they already embody the errors of measurement.

27.8 Differentiating an integral with respect to a parameter

Suppose that we have an integral whose integrand contains a
parameter $\alpha$ as well as the variable of integration - for example,
dx
e°" df, g(x)h(x + a) dx.
Jo 4 x + a
We shall consider a definite integral, though the process works in
the same way for indefinite integrals. Indicate the dependence on $\alpha$
in the general case by
$$I(\alpha) = \int_a^b f(t, \alpha)\,dt.$$
Then $dI(\alpha)/d\alpha$ can be obtained by the following rule:

Differentiating an integral with respect to a parameter

If $\int_a^b f(t, \alpha)\,dt = I(\alpha)$, then
$$\frac{dI(\alpha)}{d\alpha} = \int_a^b \frac{\partial f(t, \alpha)}{\partial \alpha}\,dt. \qquad (27.11)$$

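This process is also called differentiation under the integral sign. To
prove (27.11), change $\alpha$ to $\alpha + \delta\alpha$; then $I(\alpha)$ changes to $I(\alpha + \delta\alpha)$. Put
$$I(\alpha + \delta\alpha) - I(\alpha) = \delta I(\alpha).$$
Then
$$\begin{aligned}
\frac{\delta I(\alpha)}{\delta\alpha} &= \frac{I(\alpha + \delta\alpha) - I(\alpha)}{\delta\alpha} \\
&= \frac{1}{\delta\alpha}\left[\int_a^b f(t, \alpha + \delta\alpha)\,dt - \int_a^b f(t, \alpha)\,dt\right] \\
&= \int_a^b \frac{f(t, \alpha + \delta\alpha) - f(t, \alpha)}{\delta\alpha}\,dt.
\end{aligned}$$
Now let $\delta\alpha \to 0$. Then $\delta I(\alpha)/\delta\alpha$ becomes $dI(\alpha)/d\alpha$, and the integrand
becomes $\partial f(t, \alpha)/\partial\alpha$, which is the result (27.11).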

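Example 27.13. Evaluate $I(\alpha) = \displaystyle\int_0^\infty \frac{dt}{t^2 + \alpha^2}$, where $\alpha > 0$, and use (27.11) to
evaluate $J(\alpha) = \displaystyle\int_0^\infty \frac{dt}{(t^2 + \alpha^2)^2}$.

From Appendix E,
$$I(\alpha) = \int_0^\infty \frac{dt}{t^2 + \alpha^2} = \left[\alpha^{-1} \arctan(t/\alpha)\right]_0^\infty = \frac{\pi}{2\alpha}.$$
By (27.11),
$$\frac{dI}{d\alpha} = \int_0^\infty \frac{\partial}{\partial\alpha}\left(\frac{1}{t^2 + \alpha^2}\right) dt = \frac{d}{d\alpha}\left(\frac{\pi}{2\alpha}\right),$$
or
$$\int_0^\infty \frac{-2\alpha}{(t^2 + \alpha^2)^2}\,dt = -\frac{\pi}{2\alpha^2}.$$
Therefore
$$J(\alpha) = \int_0^\infty \frac{dt}{(t^2 + \alpha^2)^2} = \frac{\pi}{4\alpha^3}.$$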
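
The result of Example 27.13 can be checked numerically. The sketch below is an illustration under stated assumptions: the infinite range is handled by the substitution t = tan(theta), the integral is approximated by the midpoint rule, and the value of the parameter, the number of strips and the helper name `integrate_0_to_inf` are all arbitrary choices.

```python
import math

def integrate_0_to_inf(g, n=20000):
    """Midpoint rule after the substitution t = tan(theta), 0 < theta < pi/2."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        t = math.tan(theta)
        total += g(t) / math.cos(theta) ** 2   # dt = sec^2(theta) d(theta)
    return total * h

alpha = 1.5  # an arbitrary positive value of the parameter

def I(a):
    return integrate_0_to_inf(lambda t: 1.0 / (t * t + a * a))

J = integrate_0_to_inf(lambda t: 1.0 / (t * t + alpha * alpha) ** 2)

d = 1e-4
dI_dalpha = (I(alpha + d) - I(alpha - d)) / (2 * d)   # numerical dI/d(alpha)

print(J, math.pi / (4 * alpha ** 3))   # J compared with pi / (4 alpha^3)
print(dI_dalpha, -2 * alpha * J)       # dI/d(alpha) compared with -2 alpha J
```

The second line of output compares the derivative of I with the integral of the differentiated integrand, which is the content of (27.11) for this example.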

Problems
Bz Bz
27.1. Sketch contour maps of the following functions: (b) Let z = g(x — y); show that — / — •= — 1.
BxI By
(a) 2x - 3y + 4; (b) —x + 2y — 1; '
(c) (x - l)(y - 1); (d) x2 + jy2 - 1;
(e) x2 + 2x + y2 (complete the square in x); 27.7. Show that, if z = g(x/y), then
(f) y/x; (g) y2 - x2; (h) y/x3; Bz Bz
x-h y — = 0,
(i) x3 + 4y2; (j) y/(x + y). dx By
and check the result in the case z = sin x/y.
27.2. By sketching rough contour maps, indicate the
paths of steepest ascent (the paths on which z in¬
B2f B2f B2f s2f
creases most rapidly), starting at the point (1,1): 27.8. Find—, in each of the following
(a) z = 2x — 3y + 4; (b) z = x — y; dx2 By2 By Bx Bx By
(c) z = x2y2; '(d) z = (x - l)2 + \(y - l)2. cases (note (27.3)).
(a) ax + by + c; (b) x2 + 2y2 + 3xy — x + 1;
27.3. Obtain df/Bx and Bf/By at the point (2, 1) for (c)sin(x-y); (d) y/x; (e) e2x+3>';
the following functions. (f) 1/x + 1/y; (g) sin 3x + cos 2y; (h) (3x — 4y)4;
(a) 3x + ly — 2; (b) — 2x + 3y + 4; (i) 1/(x + y); (j) In xy; (k) l/(x2 + y2)A
(c) 2x2 — 3y2 — 2xy — x — y + 1;
(d) ^x3 + y3 — 2y — 1; (e) x4y2 - 1; 27.9. Confirm that, if r = (x2 + y2)* and z = In r, then
(f) (x - l)(y - 2); (g) 1 /(xy); Bz x B2z 1 2x2
x —y 3 — = — and — =-.
(h) x/y; (i)-; (j) —- dx r2 Bx2 r2 r4
x + y x- + y Show that z = In r is a solution of the equation
(k) (x2 + y2)*; (1) (2x - 3y + 2)3;
d2z d2z
(m) ex2+>'2; (n) cos(x2 — y2); -+ — = 0.
(o) sin(x/y); (p) arctan(y/x). dx2 By2
(This is called Laplace’s partial differential equation in
27.4. (a) Let z = g(ax + by), where a and b are con¬
two dimensions.)
stants. Express Bz/Bx and Bz/By in terms of g\ax 4- by)
(which means g'(u) when u is subsequently put equal
to ax + by). Check your result for the cases when 27.10. Obtain the tangent plane and a normal vector for
g(u) = cos u and g(u) = e“.
the following surfaces at the points given.
(b) Let z = #(sin xy). Express Bz/Bx and Bz/By in terms (a) z = x2 + y2 at (1, 1, 2); (b) z = xy at (2, 2, 4);
of x, y, and </'(sin xy). Check the result by differ¬ (c) z = x/y at (2, 1,2);
entiating es,nxy directly. (d) z = (29 - x2 - y2)± at (3, 4, 2);
(c) A certain physical quantity V is a function only (e) z = x2 + y2 — 2x — 2y at (1, 1, —2);
of the radial coordinate r in plane polar coordinates: (f) z = exy at (0, 0, 1).
V = g(r), where r = (x2 + y2)C Express BV/Bx and BV/By,
firstly in terms of x and y, then in terms of r and 0. 27.11. The two surfaces z = x2 + y2 andz = x — y +
2intersect at the point Q : (1, 1, 2). Find normal vectors
27.5. In plane polar coordinates (r, 0) in the first quadrant, at Q to each of the two surfaces, nx to the first and
r = (x2 + y2)* and x = r cos 0. Form Br/Bx and Bx/Br, /i2to the second. By considering the scalar product
and show that nx -n2,find the angle between the normals and hence the
angle at which the surfaces cut at Q.
Br Bx
-±\.
Bx Br 27.12. Find the stationary points of the following func-
By considering the meaning of the derivatives Br/Bx ions, and classify them using (27.9).
and Bx/Br near a particular point P in the manner of Fig. (a) (x - l)(y + 2); (b) x2 + y2 - 2x + 2y;
27.6. show why it is not to be expected that the product (c) i*3 - ik3 - * + y + 3;
should equal 1. (In the case of a single variable and (d) cos x + cos y; (e) ln(x2 + x) -I- ln(y2 + y);
(f) s*2+y1-2x+2y. (g) XJ, + 1/JC + l/y;
ordinary derivatives, we often get true results by formally
cancelling out symbols like dx, du, etc., as in the chain (h) x3 + y3 — 3xy +1; (i) sin x + sin y;
rule. This almost never works when more variables are (j) xy2 - x2y + x - y + 1;
present: see for example the next problem.) (k) (x2 - y2) + 2xy; (1) (2 - x2 - y2)2;
(m) x4 + y4 + y — x;
„ . Bz I Bz (n) x4 -I- y4 (this eludes the test (27.9) - the point is
27.6. (a) Let z = sm(x — y); show that —/ — = — 1. obviously a minimum).
BxI By

27.13 Classify the stationary point of ax2 + 2hxy + by2 (d) A rectangular container is required to have total
at (0, 0) for various relations between a, b, and h. surface area S, and a volume as large as possible. Find
its dimensions (i) if it has a lid, (ii) if it does not
27.14. Find positive numbers a, b, c so that have a lid.
(a) a + b + c = 21 and abc is a maximum.
(b) abc = 64 and a + b + c is a minimum. 27.19. Find the straight line which best fits the
experimental data in the sense of Section 27.7:
27.15. Find the absolute maximum value of x 1 2 3 4 5
(2 — x2 — y2)2 in the ‘box’ -1 < x sg 2, — 1 ^ y < 1. (It y 3.1 2.1 2.0 1.8 1.2.
will be necessary to investigate the function on the four
edges of the box separately, since the absolute maximum 27.20. The population P of a fast-breeding rodent was
will not be revealed by the conditions (27.9) if it is on observed over a period of 12 months, and the following
the edges.) estimates obtained:
t (months) 0 2 3 5 8 10 12
27.16. Find the shortest distance between the straight P (pop’n) 12 23 26 60 170 300 690.
lines x = y = z and 2x = y = z + 2, by using a simple
Assume that the underlying growth law takes the form
parametrization of each line. (Use different letters for the
(see Section 1.10)
two parameters: these will be the new variables for the
minimization.) P = Aeb‘,
where A and b are constants.
27.17. N points (xl5 jq), (x2, y2),(xN, yN) are given in To estimate A and b, take the logarithm of this
a plane, and P : (x, y) is a general point. Find P so that expression and treat y = In P as a variable in the least-
the sum of the squares of its distances from the N given squares method of Section 27.7.
points is as small as possible.
27.21. For the least-squares method of Section 27.7,
27.18. (a) A rectangular box with a lid must hold a given use the test (27.9) to show that the values of a and b
volume V, and have the smallest possible surface area. obtained do minimize the sum of squares. (This is, of
Show that it must be a cube. (Call the lengths of two of course, rather obvious intuitively.)
its sides x and y.)
(b) An open-topped rectangular box must have a 27.22. Using Laplace transforms with respect to t, solve
the partial differential equation
given volume V and its surface area must be as small as
possible. Find its dimensions. dz dz
(c) A circular-cylindrical box must have a fixed -I- x -— + z = 2x,
dt dx
volume V and minimum surface area. Find its dimen¬
sions (i) if it has a lid, (ii) if it has no lid. for x > 0 and t > 0, where z(0, t) = 0 and z(x, 0) = 0.
28 Functions of two variables: geometry and formulae

Contents
28.1 The incremental approximation
28.2 Small changes and errors
28.3 The derivative in any direction
28.4 Implicit differentiation
28.5 Normal to a curve
28.6 Gradient vector in two dimensions
Problems

28.1 The incremental approximation


It was explained in Section 25.5 that the tangent plane at a point
is the plane that best fits a surface at and around the point. The
formula for the tangent plane to a surface z = f(x, y) at Q : (a, b, c),
where c = f(a, b), is

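\[ z - c = \left(\frac{\partial f}{\partial x}\right)_{(a,b)} (x - a) + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} (y - b) \]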

(see (25.5)). We will set up new axes with origin at Q, parallel to the
old ones, and call them $\delta x$, $\delta y$, $\delta z$ (see Fig. 28.1), anticipating that
we shall be concerned with small distances from Q. Then

\[ \delta x = x - a, \qquad \delta y = y - b, \qquad \delta z = z - c. \]

In the new coordinates, the equation of the tangent plane is

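\[ \delta z = \left(\frac{\partial f}{\partial x}\right)_{(a,b)} \delta x + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} \delta y. \]

Now consider the quantity $\delta f$, where

\[ \delta f = f(x, y) - f(a, b). \]

This is the change in z on the surface z = f(x, y) from its value at
Q. The tangent plane is the best-fitting plane to the surface at Q, so
the formula

\[ f(x, y) - f(a, b) = \delta f \approx \left(\frac{\partial f}{\partial x}\right)_{(a,b)} \delta x + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} \delta y \]

must give the best-fitting linear approximation to $\delta f$ near x = a,
y = b: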

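Best linear approximation to f(x, y) near (a, b).

\[ \delta f \approx \left(\frac{\partial f}{\partial x}\right)_{(a,b)} \delta x + \left(\frac{\partial f}{\partial y}\right)_{(a,b)} \delta y, \tag{28.1} \]

where $\delta f = f(x, y) - f(a, b)$, $\delta x = x - a$, $\delta y = y - b$.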

The reader is more likely to remember the formula obtained by
calling the general point (x, y) instead of (a, b), and putting z in
place of f. Also the approximation will be good enough to be useful
only when $\delta x$ and $\delta y$ are 'small' (how small will depend on
circumstances):

Incremental approximation for f(x, y) (mnemonic version)
For small enough increments $\delta x$ and $\delta y$:

\[ f(x + \delta x, y + \delta y) - f(x, y) \approx \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y. \tag{28.2} \]

If we put z = f(x, y), this can be written

\[ \delta z \approx \frac{\partial z}{\partial x}\,\delta x + \frac{\partial z}{\partial y}\,\delta y \quad \text{(approximately)}. \]

This will be the source of almost all our results from now on.

Example 28.1. Let $z = x^2 + 3y^2$. Find an approximation to $\delta z$ in terms of $\delta x$
and $\delta y$ near the points (a) x = 2, y = 1; (b) x = 3, y = 2; (c) x = 0, y = 0. (d)
Find the exact value of $\delta z$ in case (a) and compare it with the approximate values
in the three cases when $\delta x$ and $\delta y$ both take the values 0.1, 0.01, and 0.001.

In general, $\partial z/\partial x = 2x$ and $\partial z/\partial y = 6y$.

(a) At (2, 1), $\partial z/\partial x = 4$ and $\partial z/\partial y = 6$. Therefore, from (28.2),
\[ \delta z = 4\,\delta x + 6\,\delta y \quad \text{approximately}. \]
(b) At (3, 2), $\partial z/\partial x = 6$ and $\partial z/\partial y = 12$; so
\[ \delta z = 6\,\delta x + 12\,\delta y \quad \text{approximately}. \]
(c) At (0, 0), $\partial z/\partial x = \partial z/\partial y = 0$; so the formula predicts
\[ \delta z = 0 \quad \text{approximately}. \]
The reason is that (0, 0) is a stationary point, so z hardly changes when we
move a short distance from (0, 0).
(d) From (a), the approximation near (2, 1) when $\delta x = \delta y = 0.1$ is
\[ \delta z = (4 \times 0.1) + (6 \times 0.1) = 1.0. \]
The exact value is given by
\[ \delta z = f(2.1, 1.1) - f(2, 1) = 1.04, \]
so the error in estimating $\delta z$ is $-4\%$. If $\delta x = \delta y = 0.01$, the error is $-0.4\%$; if
$\delta x = \delta y = 0.001$, it is $-0.04\%$.
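The comparison in part (d) can be reproduced with a short calculation. The following is a minimal sketch, not part of the original text. The function names are illustrative, and the percentage error is assumed to mean (estimate minus exact) divided by exact, which is the convention that reproduces the quoted figures.

```python
def f(x, y):
    """Surface from Example 28.1: z = x^2 + 3*y^2."""
    return x**2 + 3 * y**2


def linear_estimate(x, y, dx, dy):
    """Incremental approximation (28.2), using dz/dx = 2x and dz/dy = 6y."""
    return 2 * x * dx + 6 * y * dy


def percent_error(x, y, dx, dy):
    """Error of the linear estimate relative to the exact change, in percent."""
    exact = f(x + dx, y + dy) - f(x, y)
    return 100 * (linear_estimate(x, y, dx, dy) - exact) / exact


# Case (d): equal steps of shrinking size at the point (2, 1).
for step in (0.1, 0.01, 0.001):
    print(f"step={step}: error = {percent_error(2, 1, step, step):.3f}%")
```

Each tenfold reduction of the step reduces the percentage error roughly tenfold as well.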

We see from (d) in the example that the approximation improves
percentagewise as $\delta x$ and $\delta y$ get smaller: it is not merely that
the error decreases because $\delta x$, $\delta y$, $\delta z$ all go to zero together. The
following example shows the reason for this.

Example 28.2. Find the exact algebraic form of the error incurred by using
(28.2) to estimate $\delta z$ at (2, 1) when $z = x^2 + 3y^2$ (see Example 28.1(a)).

Put $x = 2 + \delta x$ and $y = 1 + \delta y$. Then

\[ \delta z = f(2 + \delta x, 1 + \delta y) - f(2, 1) \]
\[ = (2 + \delta x)^2 + 3(1 + \delta y)^2 - 7 \]
\[ = (4\,\delta x + 6\,\delta y) + (\delta x^2 + 3\,\delta y^2). \]

The first two terms represent the linear approximation obtained in
Example 28.1(a). The remainder is the error incurred: the part we
ignore in the approximation. The error consists only of higher
powers of $\delta x$ and $\delta y$, and this will always be the case. Therefore
the error is an order of magnitude smaller than the linear terms
retained in the incremental approximation (28.2).

28.2 Small changes and errors


The incremental approximation (28.1) or (28.2) can be used to
estimate the effect of making small changes in the values of
variables in a formula.

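Example 28.3. Estimate the change in the value of

\[ z = \frac{1}{(x^2 + y^2)^{1/2}} \]

when (x, y) changes from (3, 4) to (3.1, 3.8).

Using (28.2), put (x, y) = (3, 4), $\delta x = 0.1$, $\delta y = -0.2$. We require

\[ \left(\frac{\partial z}{\partial x}\right)_{(3,4)} = \left[-x(x^2 + y^2)^{-3/2}\right]_{(3,4)} = -\tfrac{3}{125}, \]

\[ \left(\frac{\partial z}{\partial y}\right)_{(3,4)} = \left[-y(x^2 + y^2)^{-3/2}\right]_{(3,4)} = -\tfrac{4}{125}. \]

Therefore, approximately,

\[ \delta z = \left(-\tfrac{3}{125}\right)(0.1) + \left(-\tfrac{4}{125}\right)(-0.2) = 0.004. \]

(The exact value of $\delta z$ is 0.00391 ....)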
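When the partial derivatives are tedious to work out by hand, the same estimate can be formed numerically. The sketch below is an illustration only. It replaces the exact derivatives by central differences, with an assumed step size `h`, and applies (28.2) to the function of Example 28.3. The helper names are hypothetical.

```python
import math


def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of df/dx (h is an assumed step size)."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of df/dy."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def incremental_estimate(f, x, y, dx, dy):
    """Right-hand side of (28.2) with numerically estimated partial derivatives."""
    return partial_x(f, x, y) * dx + partial_y(f, x, y) * dy


def z(x, y):
    """Function of Example 28.3: 1 / (x^2 + y^2)^(1/2)."""
    return 1 / math.hypot(x, y)


estimate = incremental_estimate(z, 3, 4, 0.1, -0.2)
exact = z(3.1, 3.8) - z(3, 4)
print(estimate, exact)
```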

Example 28.4. The period T of the swings of a pendulum is equal to $2\pi(l/g)^{1/2}$,
where l is the length and g the gravitational constant. Estimate the error in
calculating T if, instead of using closely correct values l = 1.015 and g = 9.812 in
the formula, we use the rounded values l = 1 and g = 10.

The formula corresponding to (28.2) is

\[ \delta T \approx \frac{\partial T}{\partial l}\,\delta l + \frac{\partial T}{\partial g}\,\delta g. \]

Suppose for simplicity we decide to substitute the rounded values l = 1 and
g = 10 into the coefficients: we obtain

\[ \frac{\partial T}{\partial l} = \left(\frac{\pi}{l^{1/2} g^{1/2}}\right)_{(1,10)} = 0.993, \]

\[ \frac{\partial T}{\partial g} = \left(-\frac{\pi l^{1/2}}{g^{3/2}}\right)_{(1,10)} = -0.099. \]

Equation (28.2) then requires that we put

\[ \delta l = \text{(true value)} - \text{(rounded value)} = 0.015, \]

\[ \delta g = \text{(true value)} - \text{(rounded value)} = -0.188. \]

Then

\[ \delta T = \text{(true value)} - \text{(rounded value)} \approx (0.993)(0.015) + (-0.099)(-0.188) = 0.0335. \]

But this is not the error; for that we need

\[ \text{(error)} = \text{(rounded value)} - \text{(true value)} = -\delta T, \]

so the error is about $-0.0335$. (The exact error is $-0.0339$ ....)
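The sign bookkeeping in this example is easy to get wrong, so a numerical check is useful. The sketch below is illustrative only, with assumed variable names. It evaluates the coefficients at the rounded values, forms $\delta T$ as true minus rounded, and then reverses the sign to obtain the error. Because it keeps the unrounded coefficients, the estimate differs slightly in the last digit from the hand calculation.

```python
import math


def period(l, g):
    """Pendulum period T = 2*pi*(l/g)^(1/2)."""
    return 2 * math.pi * math.sqrt(l / g)


def dT_dl(l, g):
    """Partial derivative of T with respect to l."""
    return math.pi / math.sqrt(l * g)


def dT_dg(l, g):
    """Partial derivative of T with respect to g."""
    return -math.pi * math.sqrt(l) / g**1.5


l_round, g_round = 1.0, 10.0
l_true, g_true = 1.015, 9.812

# Increments as (28.2) requires: true value minus rounded value.
dl, dg = l_true - l_round, g_true - g_round
dT = dT_dl(l_round, g_round) * dl + dT_dg(l_round, g_round) * dg

estimated_error = -dT  # error = rounded value - true value
exact_error = period(l_round, g_round) - period(l_true, g_true)
print(estimated_error, exact_error)
```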

In the last example, we substituted the rounded (erroneous)
values into $\partial T/\partial l$ and $\partial T/\partial g$, which led to a complication we might
have avoided. However, usually there is no choice, the exact values
being unknown. Let z = f(x, y), and suppose that we want to
estimate the error in z which could arise from using measured (i.e.
approximate) values for x and y. The error $\Delta x$ in x is

\[ \Delta x = \text{(measured value of } x) - \text{(exact value of } x), \]

and similarly for $\Delta y$ and $\Delta z$.


Usually we only know a range of possible error, not the errors
themselves. For example we might say that a parcel weighed
1430(±15) g, meaning that we think it is between 1415 and 1445 g.
Therefore, the values of $\Delta x$ and $\Delta y$ are unknown, so the exact
values of x and y are unknown, and are not available to go into
(28.2) in place of (x, y). Suppose we put instead

x and y = (measured values).

To correspond with this, the definition of $\delta x$, $\delta y$, $\delta z$ in (28.2) requires

\[ \delta x,\ \delta y,\ \delta z = \text{(true values)} - \text{(measured values)}. \]

Therefore

\[ \delta x = -\Delta x, \qquad \delta y = -\Delta y, \qquad \delta z = -\Delta z \]

go into (28.2). Every term has then a negative sign, so the formula
in terms of $\Delta x$, $\Delta y$, $\Delta z$ has the same shape as the incremental formula:

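Small-error formula

If z = f(x, y), then

\[ \Delta z = \frac{\partial z}{\partial x}\,\Delta x + \frac{\partial z}{\partial y}\,\Delta y \quad \text{(approximately)}, \tag{28.3} \]

where x and y are measured values, and $\Delta$ stands for
error = (measured value) - (exact value).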

This is used in the following way.

Example 28.5. In a triangle ABC, the side BC has length a given by

\[ a = \frac{c \sin A}{\sin(A + B)}. \]

Suppose that c = 10 (exactly), and angles A and B are measured to 5° accuracy:
A = 45(±5)°, B = 30(±5)°. Estimate the largest possible resulting error in a.

Put

\[ a = f(A, B) = 10\,\frac{\sin A}{\sin(A + B)}. \]

Then

\[ \frac{\partial a}{\partial A} = 10\,\frac{\sin(A + B)\cos A - \cos(A + B)\sin A}{\sin^2(A + B)} = 10\,\frac{\sin B}{\sin^2(A + B)}. \]

Similarly

\[ \frac{\partial a}{\partial B} = -10\,\frac{\sin A \cos(A + B)}{\sin^2(A + B)}. \]

When taken at the measured values A = 45° and B = 30° (these are the only
values available), we get $\partial a/\partial A = 5.36$ and $\partial a/\partial B = -1.96$. The error formula
(28.3) becomes

\[ \Delta a = 5.36\,\Delta A - 1.96\,\Delta B \]

approximately, where $\Delta A$ and $\Delta B$ must be measured in radians.
The greatest possible magnitude occurs if $\Delta A$ and $\Delta B$ happen to have
opposite signs and their greatest possible magnitudes; that is, if $\Delta A = -\Delta B = \pm 0.087$
radians. In that case, $\Delta a = \pm 0.64$. Therefore

\[ a = 7.32(\pm 0.64), \]

showing a possible error of about 9%.

Example 28.6. One solution of the equation $x^2 + bx + c = 0$ is
$x = \tfrac{1}{2}[-b + (b^2 - 4c)^{1/2}]$. (a) Find an approximate expression for the error $\Delta x$ arising from
small errors $\Delta b$ and $\Delta c$ in b and c. (b) Estimate the maximum possible error in
the solution x if b and c are rounded to one decimal to give $b \approx 3.1$, $c \approx 2.1$.

(a) We have $\Delta x = (\partial x/\partial b)\,\Delta b + (\partial x/\partial c)\,\Delta c$, in which we must put

\[ \frac{\partial x}{\partial b} = \tfrac{1}{2}\left[-1 + b(b^2 - 4c)^{-1/2}\right], \qquad \frac{\partial x}{\partial c} = -(b^2 - 4c)^{-1/2}. \]

(b) Since b and c are rounded numbers, all that we know about them is that

\[ b = 3.1(\pm 0.05), \qquad c = 2.1(\pm 0.05), \]

meaning that the error might be anywhere in the range indicated. Putting the
face values b = 3.1 and c = 2.1 into (a), we obtain

\[ \frac{\partial x}{\partial b} = 0.909, \qquad \frac{\partial x}{\partial c} = -0.909; \]

so, by (28.3),

\[ \Delta x = 0.909\,\Delta b - 0.909\,\Delta c. \]

This takes its greatest possible magnitude when $\Delta b$ and $\Delta c$ take their maximum
values and have opposite sign: that is, when

\[ \Delta b = \pm 0.05, \qquad \Delta c = \mp 0.05. \]

In that case $\Delta x = \pm 0.909(0.05 + 0.05) = \pm 0.091$.

The value of x estimated from the rounded coefficients is $x = -1$. Although
the rounding error is only at most 2.5%, the error in the solution could be as
large as ±8.3%.
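The worst-case use of the small-error formula (28.3) can be sketched in a few lines. This is an illustration under stated assumptions, not the book's own procedure: the function names are hypothetical, and the worst case is taken by adding the magnitudes of the two terms, which is what choosing opposite signs for the errors in b and c amounts to here.

```python
import math


def root(b, c):
    """The solution x = (1/2) * (-b + (b^2 - 4c)^(1/2)) of x^2 + b*x + c = 0."""
    return 0.5 * (-b + math.sqrt(b * b - 4 * c))


def dx_db(b, c):
    """Partial derivative of the root with respect to b."""
    return 0.5 * (-1 + b / math.sqrt(b * b - 4 * c))


def dx_dc(b, c):
    """Partial derivative of the root with respect to c."""
    return -1 / math.sqrt(b * b - 4 * c)


def max_error(b, c, tol_b, tol_c):
    """Worst case of (28.3): each error takes the sign that makes its term positive."""
    return abs(dx_db(b, c)) * tol_b + abs(dx_dc(b, c)) * tol_c


b, c = 3.1, 2.1
print(root(b, c), max_error(b, c, 0.05, 0.05))
```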

28.3 The derivative in any direction


The plane in Fig. 28.2 is a map of a surface z = f(x, y) with all detail
omitted. At P : (x, y) we see a slope $\partial z/\partial x$ if we look east, a slope
$\partial z/\partial y$ looking north, and other slopes in other directions. We can
find the slopes in other directions in terms of $\partial z/\partial x$ and $\partial z/\partial y$. It
might seem that we could make the intermediate slopes equal to
anything we liked, but if the surface at P is smooth enough to have
a tangent plane, this is not so. In effect, the slopes we see are the
slopes of the tangent plane in the various directions.
Consider the direction PQ which makes an angle $\theta$ with the
positive x axis, the direction for positive angles being anticlockwise
as with polar coordinates. Let the length PQ = $\delta s$, a short step, and
let $\delta x$ and $\delta y$ be as shown. Then, by (28.2), the change in elevation
in this direction is given approximately by

\[ \delta z \approx \frac{\partial z}{\partial x}\,\delta x + \frac{\partial z}{\partial y}\,\delta y. \]

Divide by $\delta s$; we obtain

\[ \frac{\delta z}{\delta s} \approx \frac{\partial z}{\partial x}\frac{\delta x}{\delta s} + \frac{\partial z}{\partial y}\frac{\delta y}{\delta s} = \frac{\partial z}{\partial x}\cos\theta + \frac{\partial z}{\partial y}\sin\theta \]

from Fig. 28.2. Now let $\delta s \to 0$; the approximation becomes exact,
and we have an expression for the slope in any direction. Using the
notation for the directional derivative,

\[ \lim_{\delta s \to 0} \frac{\delta z}{\delta s} = \frac{dz}{ds}, \]

we have the following formula.



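Directional derivative
The slope of z = f(x, y) at P in direction $\theta$:

\[ \frac{dz}{ds} = \left(\frac{\partial z}{\partial x}\right)_P \cos\theta + \left(\frac{\partial z}{\partial y}\right)_P \sin\theta. \tag{28.4} \]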

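Example 28.7. Find the slope of the surface $z = xy + x^2$ at P : (2, 3) in the
direction $-120^\circ$.

The direction is shown on Fig. 28.3.

\[ \left(\frac{\partial z}{\partial x}\right)_{(2,3)} = (y + 2x)_{(2,3)} = 7, \]

\[ \left(\frac{\partial z}{\partial y}\right)_{(2,3)} = (x)_{(2,3)} = 2. \]

Also

\[ \cos(-120^\circ) = -\sin 30^\circ = -\tfrac{1}{2} \]

and

\[ \sin(-120^\circ) = -\cos 30^\circ = -\tfrac{1}{2}\sqrt{3}; \]

so

\[ \frac{dz}{ds} = 7\left(-\tfrac{1}{2}\right) + 2\left(-\tfrac{1}{2}\sqrt{3}\right) = -\tfrac{1}{2}(7 + 2\sqrt{3}). \]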
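Formula (28.4) translates directly into a small function. The sketch below is illustrative, with a hypothetical function name. It takes the two partial derivatives at the point and the direction in degrees, and is applied to the data of Example 28.7.

```python
import math


def directional_derivative(dz_dx, dz_dy, theta_degrees):
    """Formula (28.4): slope in the direction making angle theta with the x axis."""
    theta = math.radians(theta_degrees)
    return dz_dx * math.cos(theta) + dz_dy * math.sin(theta)


# Example 28.7: z = x*y + x^2 at P = (2, 3), so dz/dx = y + 2x and dz/dy = x.
x, y = 2, 3
slope = directional_derivative(y + 2 * x, x, -120)
print(slope)  # should agree with -(7 + 2*sqrt(3))/2
```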

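Example 28.8. The temperature distribution in a plate heated at the point (0, 0)
is given by $T = 1/(x^2 + y^2)^{1/2}$. (a) Find the temperature gradient at the point (3, 3)
in a direction of 45° to the positive x axis. (b) In polar coordinates, T = 1/r. Show
that the result (a) is the same as dT/dr taken at any point where $r = (3^2 + 3^2)^{1/2}$.

(a)
\[ \left(\frac{\partial T}{\partial x}\right)_{(3,3)} = \left(-\frac{x}{(x^2 + y^2)^{3/2}}\right)_{(3,3)} = -\frac{1}{18\sqrt{2}}, \]

\[ \left(\frac{\partial T}{\partial y}\right)_{(3,3)} = \left(-\frac{y}{(x^2 + y^2)^{3/2}}\right)_{(3,3)} = -\frac{1}{18\sqrt{2}}. \]

Also $\cos\theta = 1/\sqrt{2}$ and $\sin\theta = 1/\sqrt{2}$. Therefore the temperature gradient at (3, 3)
in the given direction is

\[ \frac{dT}{ds} = \left(-\frac{1}{18\sqrt{2}}\right)\frac{1}{\sqrt{2}} + \left(-\frac{1}{18\sqrt{2}}\right)\frac{1}{\sqrt{2}} = -\frac{1}{18}. \]

(b) T = 1/r, so $dT/dr = -1/r^2$. At the given point, $r = 3\sqrt{2}$, so $dT/dr = -1/18$ and the result is
the same.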

Example 28.9. At any point on the plane $z = \sqrt{3}\,x - y + 4$, find (a) an
expression for the slope dz/ds in every direction, (b) the directions in which
dz/ds = 0, (c) the directions in which dz/ds is a maximum and a minimum.

(a) $\partial z/\partial x = \sqrt{3}$ and $\partial z/\partial y = -1$; and these are the same at every point. By
(28.4),

\[ \frac{dz}{ds} = \sqrt{3}\cos\theta - \sin\theta. \]

(b) dz/ds = 0 where $\sqrt{3}\cos\theta - \sin\theta = 0$, or $\tan\theta = \sqrt{3}$. Therefore $\theta = 60^\circ$
or $\theta = -120^\circ$. These directions are opposed: see Fig. 28.4. They give the
direction of the contour through any point.

(c) dz/ds is a maximum (direction of steepest ascent), or a minimum (steepest
descent), in directions such that

\[ \frac{d}{d\theta}\left(\frac{dz}{ds}\right) = 0, \]

or $-\sqrt{3}\sin\theta - \cos\theta = 0$, or $\tan\theta = -1/\sqrt{3}$. Therefore $\theta = -30^\circ$ or $\theta = 150^\circ$,
these directions being directly opposed: see Fig. 28.4. By considering the sign of

\[ \frac{d^2}{d\theta^2}\left(\frac{dz}{ds}\right), \]

or just by thinking about it, it can be seen that the directions of steepest ascent
and descent are as shown.

[Fig. 28.4: the directions $60^\circ$ and $-120^\circ$ of the contour, and the direction $-30^\circ$, marked at a point of the map.]

In the last example, the directions of steepest ascent/descent at
any point are perpendicular to the directions of the contours; we
shall now show that this is true for all surfaces. On the contour map
of z = f(x, y), the slope in the direction $\theta$ at a point P : (x, y) has
the form (28.4):

\[ \frac{dz}{ds} = A\cos\theta + B\sin\theta, \]

where A and B are the values of $\partial z/\partial x$ and $\partial z/\partial y$ at P. This is zero
in the directions $\theta_1$, where

\[ \tan\theta_1 = -A/B. \]

The two directions $\theta_1$ which satisfy this equation differ by $\pi$, so they
indicate smooth passage of the contour through P. The gradient
dz/ds is a maximum or minimum when

\[ \frac{d}{d\theta}\left(\frac{dz}{ds}\right) = -A\sin\theta + B\cos\theta = 0, \]

or in directions $\theta_2$ where

\[ \tan\theta_2 = B/A, \]

which give the directions of steepest ascent/descent. Since

\[ \tan\theta_1 \tan\theta_2 = -1, \]

these directions are perpendicular, a fact known intuitively by any
hill walker.

At each point on the map of z = f(x, y), the direction of steepest
ascent/descent is perpendicular to the contour. (28.5)

The two systems of curves, consisting of the contours and the


curves which follow directions of steepest ascent or descent, are

perpendicular wherever they cross, so they are called orthogonal


systems of curves (see Section 29.4).
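The directions discussed in this section can be computed from the two slopes A and B at a point. The following is a minimal sketch with hypothetical function names. It uses `atan2` to select, among the two solutions of tan θ = B/A, the one for which the slope A cos θ + B sin θ is positive (steepest ascent), and takes the contour directions at right angles to it. The data are those of the plane z = √3 x − y + 4.

```python
import math


def steepest_ascent(A, B):
    """Return (direction in degrees, slope) of steepest ascent for dz/ds = A cos(t) + B sin(t)."""
    # atan2 picks the solution of tan(theta) = B/A lying in the quadrant of (A, B),
    # where the slope A cos(theta) + B sin(theta) equals +sqrt(A^2 + B^2).
    theta = math.degrees(math.atan2(B, A))
    return theta, math.hypot(A, B)


def contour_directions(A, B):
    """The two opposed directions (in degrees) in which dz/ds = 0."""
    theta, _ = steepest_ascent(A, B)
    return theta + 90, theta - 90


# Plane z = sqrt(3)*x - y + 4: dz/dx = sqrt(3), dz/dy = -1 at every point.
A, B = math.sqrt(3), -1.0
print(steepest_ascent(A, B))     # direction and size of the steepest slope
print(contour_directions(A, B))  # level directions
```

The steepest descent direction is the ascent direction turned through 180°.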

28.4 Implicit differentiation


An equation of the type

\[ f(x, y) = c, \]

where c is a constant, describes a curve or curves in the (x, y) plane,
since we can imagine solving it to obtain y as a function of x. For
example, $x^2 + y^2 = 4$ represents the two semicircles $y = \pm(4 - x^2)^{1/2}$.
Another interpretation is that the equation f(x, y) = c describes the
contour z = c of a surface z = f(x, y), projected into the (x, y) plane,
as in Fig. 27.4b.
Although it is usually impossible in practice to solve for y in
terms of x, it is always possible to obtain an expression for the slope
dy/dx of the curve in terms of x and y. Choose any point P : (x, y)
on the curve (Fig. 28.5), and move along it a short distance to
Q : $(x + \delta x, y + \delta y)$. Then dy/dx on the curve is given by

\[ \frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x}. \]

Since P and Q both lie on the curve, $\delta f = 0$; so the incremental
approximation (28.2) gives

\[ \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y \approx 0, \]

or

\[ \frac{\delta y}{\delta x} \approx -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}. \]

Now let $\delta x \to 0$. The '$\approx$' becomes '=', and $\delta y/\delta x$ becomes dy/dx,
from which we obtain:

The implicit-differentiation formula

The slope of f(x, y) = c at any point (x, y) on the curve is given by

\[ \frac{dy}{dx} = -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}. \tag{28.6} \]

The process is called implicit differentiation because f(x, y) = c
gives y in terms of x only 'implicitly', not explicitly.

Example 28.10. Find an expression for dy/dx at a general point (x, y) on the
circle $x^2 + y^2 = 4$.

Here $f(x, y) = x^2 + y^2$ (with c = 4), and so

\[ \frac{\partial f}{\partial x} = 2x, \qquad \frac{\partial f}{\partial y} = 2y. \]

Therefore, by (28.6),

\[ \frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y} \]

(provided that (x, y) is actually a point on the given circle).

In the last example, we would have obtained exactly the same
result for the circle $x^2 + y^2 = 1$, or $x^2 + y^2 = 100$. It is the numerical
values of x and y to be put in the right-hand side which will
distinguish the circle under discussion from all the other circles. In
fact the equation we obtained,

\[ \frac{dy}{dx} = -\frac{x}{y}, \]

can be thought of as a differential equation. Its solutions (obtained
by the method of Section 22.3) are $x^2 + y^2 = C$, which includes the
given circle and all the others as well.

Example 28.11. Find dy/dx on the curve $x^3y - xy^3 = 6$ at the point (2, 1).

(You can check that the point (2, 1) is really on the curve.) Putting $f(x, y) = x^3y - xy^3$, we have

\[ \frac{\partial f}{\partial x} = 3x^2y - y^3, \qquad \frac{\partial f}{\partial y} = x^3 - 3xy^2. \]

Therefore, at any point (x, y) on the curve,

\[ \frac{dy}{dx} = -\frac{3x^2y - y^3}{x^3 - 3xy^2}. \]

At (2, 1), the slope is

\[ \left(\frac{dy}{dx}\right)_{(2,1)} = -\frac{11}{2}. \]

(This is not a differential equation: it is a numerical value which holds
at only a single point.)
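The implicit-differentiation formula (28.6) can be checked numerically without ever solving for y in closed form. The sketch below is an illustration only. It evaluates (28.6) for the curve $x^3y - xy^3 = 6$ at (2, 1), where it gives $-11/2$, and compares this with a finite-difference slope obtained by solving for y on either side of x = 2 with Newton's method. Newton's method, the step `h`, and all function names are assumptions made for this check, not part of the text.

```python
def f(x, y):
    """Left-hand side of the curve x^3*y - x*y^3 = 6."""
    return x**3 * y - x * y**3


def f_x(x, y):
    """Partial derivative df/dx."""
    return 3 * x**2 * y - y**3


def f_y(x, y):
    """Partial derivative df/dy."""
    return x**3 - 3 * x * y**2


def implicit_slope(x, y):
    """Formula (28.6): dy/dx = -(df/dx)/(df/dy) at a point on f(x, y) = c."""
    return -f_x(x, y) / f_y(x, y)


def y_on_curve(x, c, y_guess, iterations=20):
    """Solve f(x, y) = c for y by Newton's method, starting from y_guess."""
    y = y_guess
    for _ in range(iterations):
        y -= (f(x, y) - c) / f_y(x, y)
    return y


h = 1e-5  # assumed small step for the numerical check
numeric_slope = (y_on_curve(2 + h, 6, 1.0) - y_on_curve(2 - h, 6, 1.0)) / (2 * h)
print(implicit_slope(2, 1), numeric_slope)
```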

The link with differential equations can be used in many ways, as


in the following example.

Example 28.12. Find the family of curves which is orthogonal (perpendicular)
to the curves xy = C.

The curves xy = C are the contours of the function f(x, y) = xy, and the new
family will be the curves of steepest ascent/descent on the contour map of xy.
The differential equation of the family xy = C is, from the implicit-differentiation
formula (28.6),

\[ \frac{dy}{dx} = -\frac{y}{x}. \]

Wherever the new curves intersect with these they must cut at a right angle, so
the product of their slopes at any intersection must be equal to $-1$. Therefore
the new family must have the differential equation

\[ \frac{dy}{dx} = \frac{x}{y}, \]

because $(-y/x)_P (x/y)_P = -1$ at any point P. This equation can be solved by
separating the variables (Section 22.3), which gives

\[ y^2 - x^2 = B, \]

where B is an arbitrary constant. This is another family of hyperbolas. A small
region of the x, y plane is shown in Fig. 28.6.
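The orthogonal family can also be traced numerically, which is useful when the differential equation cannot be solved by hand. The sketch below is an illustration only. It integrates dy/dx = x/y with a classical fourth-order Runge-Kutta step (a standard method chosen for this sketch, not one taken from the text) and checks that $y^2 - x^2$ stays constant along the computed curve. The starting point (1, 2) and the number of steps are arbitrary assumptions.

```python
def slope(x, y):
    """Differential equation of the orthogonal family: dy/dx = x/y."""
    return x / y


def rk4(x, y, x_end, steps=100):
    """Integrate dy/dx = slope(x, y) from x to x_end with classical Runge-Kutta steps."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = slope(x, y)
        k2 = slope(x + h / 2, y + h * k1 / 2)
        k3 = slope(x + h / 2, y + h * k2 / 2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return x, y


x0, y0 = 1.0, 2.0  # arbitrary starting point (an assumption for the demonstration)
x1, y1 = rk4(x0, y0, 2.0)
print(y0**2 - x0**2, y1**2 - x1**2)  # both should be close to the same constant B
```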

28.5 Normal to a curve


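[Fig. 28.6: curves of the family $y^2 - x^2 = B$.]

The slope at any point P on the curve f(x, y) = c is equal to

\[ -\left(\frac{\partial f}{\partial x}\right)_P \Big/ \left(\frac{\partial f}{\partial y}\right)_P \quad \text{(by (28.6))}. \]

We shall obtain a vector n perpendicular or normal to the curve
at P. A straight line through P perpendicular to the curve must have
slope

\[ \left(\frac{\partial f}{\partial y}\right)_P \Big/ \left(\frac{\partial f}{\partial x}\right)_P \]

because the product of the slopes must be equal to $-1$. A vector
with components (a, b) has slope b/a, so one normal vector n is

\[ n = \left( \left(\frac{\partial f}{\partial x}\right)_P, \left(\frac{\partial f}{\partial y}\right)_P \right). \]

Any multiple of this n is also a normal at the point. Dropping the
suffix P, we have the following result.

Normal vector n at the point (x, y) on the curve f(x, y) = c

\[ n = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) = i\,\frac{\partial f}{\partial x} + j\,\frac{\partial f}{\partial y}. \tag{28.7} \]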

Example 28.13. Find several normal vectors at the point (2, 1) on the curve
$x^2 + y^2 = 5$.

Putting $f(x, y) = x^2 + y^2$, we have $\partial f/\partial x = 2x$ and $\partial f/\partial y = 2y$; so

\[ \left(\frac{\partial f}{\partial x}\right)_{(2,1)} = 4, \qquad \left(\frac{\partial f}{\partial y}\right)_{(2,1)} = 2. \]

Therefore one vector normal to the circle at (2, 1) is

\[ n = (4, 2) \]

and from this any number of other normal vectors can be constructed by taking
multiples. For example, (2, 1), $(-2, -1)$ and $(2/\sqrt{5}, 1/\sqrt{5})$ are also normals, the last
one being a unit normal (one having unit length), which is often important.

Example 28.14. Find the angle of intersection between the curves
$x^2 + y^2 = 5$ and $x^2 - y^2 = 3$ at the point (2, 1).

In the last example, we showed that $n_1 = (4, 2)$ is normal to $x^2 + y^2 = 5$ at
(2, 1). Similarly the vector $n_2 = (4, -2)$ is normal to $x^2 - y^2 = 3$ at the point.
From Fig. 28.7, it can be seen that the acute angle $\theta$ between the normals
is equal to one of the angles between the curves (the other is $\pi - \theta$). From
(9.5),

\[ \cos\theta = \frac{n_1 \cdot n_2}{|n_1|\,|n_2|} = \frac{(4, 2)\cdot(4, -2)}{\sqrt{20}\,\sqrt{20}} = \frac{3}{5}, \]

so $\theta = 53.1^\circ$.
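The calculation of an intersection angle from two normal vectors can be sketched as follows. This is an illustrative snippet with hypothetical function names. The normals are taken as (∂f/∂x, ∂f/∂y) for each curve, and the absolute value of the scalar product is used so that the acute angle is always returned.

```python
import math


def angle_between(n1, n2):
    """Acute angle in degrees between two normal vectors, from the scalar product."""
    dot = n1[0] * n2[0] + n1[1] * n2[1]
    cos_theta = abs(dot) / (math.hypot(*n1) * math.hypot(*n2))
    return math.degrees(math.acos(cos_theta))


def normal_circle(x, y):
    """Normal (df/dx, df/dy) for f = x^2 + y^2."""
    return (2 * x, 2 * y)


def normal_hyperbola(x, y):
    """Normal (df/dx, df/dy) for f = x^2 - y^2."""
    return (2 * x, -2 * y)


# Curves x^2 + y^2 = 5 and x^2 - y^2 = 3 meeting at (2, 1).
print(angle_between(normal_circle(2, 1), normal_hyperbola(2, 1)))
```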

28.6 Gradient vector in two dimensions


It is familiar that the value of a quantity such as pressure or
temperature depends on, or is a function of, position (x, y). These
are scalar functions: the values they take up are ordinary numbers.
There are also vector quantities that depend on position. Figure
28.8 shows some streamlines for a fluid flowing over a long cylinder
(assuming that the flow is always in the plane of the paper). The
velocity v is a vector which varies from point to point, so we can
write v = v(x, y). Gravitational, magnetic, and electric fields are
other instances of vector functions of position or vector fields.
Associated with any scalar function, there is an important vector
function which arises as follows.
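We repeatedly produce formulae involving the pair of elements
$\partial f/\partial x$ and $\partial f/\partial y$ in combinations of the form $U\,\partial f/\partial x + V\,\partial f/\partial y$,
where U and V are constants or functions; for example as in (28.7),
(28.2), and (28.4). We can manipulate this pair as a unit by regarding

\[ i\,\frac{\partial f}{\partial x} + j\,\frac{\partial f}{\partial y} = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \]

as a vector function. We call this vector function the gradient of
f and denote it by

\[ \operatorname{grad} f \quad \text{or} \quad \nabla f \]

($\nabla$ is pronounced 'del' or 'nabla'). We shall see that it works rather
like an ordinary derivative, but in two dimensions; hence its name.
Alternatively we can regard the symbol grad or $\nabla$ standing alone
as an operator (compare d/dx): it operates on scalar functions
f(x, y), instructing us to carry out the operation $i\,\partial/\partial x + j\,\partial/\partial y$ or
$(\partial/\partial x, \partial/\partial y)$ on f(x, y):

\[ \operatorname{grad} f(x, y) = \left(i\,\frac{\partial}{\partial x} + j\,\frac{\partial}{\partial y}\right) f(x, y). \]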

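Gradient in two dimensions

Given a scalar function f(x, y), grad f or $\nabla f$ stands for

\[ i\,\frac{\partial f}{\partial x} + j\,\frac{\partial f}{\partial y} \quad \text{or} \quad \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right). \tag{28.8} \]

Alternatively, grad or $\nabla$ stands for the operator

\[ i\,\frac{\partial}{\partial x} + j\,\frac{\partial}{\partial y} \quad \text{or} \quad \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right). \]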

Example 28.15. Let $f(x, y) = x^2 + y^2$. Obtain (a) the vector function grad f;
(b) the value of grad f at the point (1, 2); (c) an expression for the magnitude, or
length, of grad f at (x, y).

(a)
\[ \operatorname{grad} f = i\,\frac{\partial f}{\partial x} + j\,\frac{\partial f}{\partial y} = 2x\,i + 2y\,j; \]
or we can use the alternative notations, and even the operator viewpoint:

\[ \nabla f = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right)(x^2 + y^2) = (2x, 2y). \]

(b) At x = 1, y = 2, we have
\[ \operatorname{grad} f = (2, 4). \]
(c) The magnitude or length of a vector v = (a, b) is $|v| = (a^2 + b^2)^{1/2}$, so
\[ |\operatorname{grad} f| = [(2x)^2 + (2y)^2]^{1/2} = 2(x^2 + y^2)^{1/2}. \]

We can re-express some earlier results in terms of grad. For
example, we may write (28.7) immediately as follows.

A normal vector n at the point (x, y) on the curve f(x, y) = c is

\[ n = \operatorname{grad} f(x, y). \tag{28.9} \]

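As we remarked earlier, expressions occurring in physical theory
frequently take the form

\[ U\,\frac{\partial f}{\partial x} + V\,\frac{\partial f}{\partial y}, \]

where U and V may be constants, or various functions. Then we
can write such expressions as a scalar ('dot') product by inventing
a new vector function S = Ui + Vj:

If S = Ui + Vj, then

\[ U\,\frac{\partial f}{\partial x} + V\,\frac{\partial f}{\partial y} = (U, V)\cdot\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) = S \cdot \operatorname{grad} f. \tag{28.10} \]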

Now consider the directional-derivative formula (28.4), regarding
it as representing the rate of change of f(x, y) in the direction $\theta$:

\[ \frac{df}{ds} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta. \]

To recast this in the form of (28.10), we require the vector

\[ i\cos\theta + j\sin\theta. \]

This is a unit vector (i.e. it has length unity) because

\[ (\cos^2\theta + \sin^2\theta)^{1/2} = 1; \]

so put

\[ i\cos\theta + j\sin\theta = s, \]

where s is a unit vector pointing in the desired direction, and (28.10)
becomes

Directional derivative

In the direction of a unit vector s, the rate of change
of f(x, y) is given by

\[ \frac{df}{ds} = s \cdot \operatorname{grad} f; \tag{28.11} \]

that is to say, df/ds is equal to the component of
grad f in the direction of s.

Equation (28.11) can be written in a different way. If a and b are
two vectors, then the angle between them, $\phi$, can be obtained from
the identity

\[ a \cdot b = |a|\,|b| \cos\phi \]

(see (10.4)). If we put a = s and b = grad f, and use the fact that
|s| = 1, we obtain an alternative form of (28.11).

Directional derivative (alternative form)

\[ \frac{df}{ds} = |\operatorname{grad} f| \cos\phi, \tag{28.12} \]

where $\phi$ is the angle between grad f and the required
direction.

By using (28.12), the perpendicularity of the directions of steepest
ascent and the contours of f(x, y), proved in Section 28.3, can be
recovered.
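The two forms of the directional derivative lend themselves to a numerical experiment. The sketch below is illustrative only, with hypothetical function names and an assumed step `h`. It estimates grad f by central differences for $f = x^2 + y^2$ at (1, 2), forms df/ds as s · grad f, and scans whole-degree directions to confirm that no direction gives a rate of change larger than |grad f|.

```python
import math


def grad(f, x, y, h=1e-6):
    """Central-difference estimate of grad f = (df/dx, df/dy); h is an assumed step."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))


def directional_derivative(f, x, y, theta_degrees):
    """Formula (28.11): df/ds = s . grad f with s = (cos theta, sin theta)."""
    theta = math.radians(theta_degrees)
    gx, gy = grad(f, x, y)
    return math.cos(theta) * gx + math.sin(theta) * gy


def f(x, y):
    """Sample scalar function x^2 + y^2."""
    return x**2 + y**2


gx, gy = grad(f, 1, 2)
size = math.hypot(gx, gy)
# Scan whole-degree directions; by (28.12) none should exceed |grad f|.
best = max(directional_derivative(f, 1, 2, t) for t in range(360))
print((gx, gy), size, best)
```

The largest scanned value falls just short of |grad f| only because the direction of grad f is not a whole number of degrees.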

Problems

28.1. Use the incremental approximation (28.1) or (28.2) Suppose that nominally b = 2, A = 30°, C = 60°, but
to estimate the change 5z due to changes 8x and 8y as that C is found to be too large by 5%. By what amount
specified, and check the percentage error by calculating should A be changed so that S would be restored to
the exact result. the correct area?
(a) z = ,\-2 + y2 at (3,1), 8x = 0.1, 8_v = 0.3;
(b) z = sin xy at (0.5, 1.2), 8x = 0.1, 8y = —0.05; 28.8. A certain type of experiment to measure surface
(c) z = ev2 + 3v2 at (1,1), Sx = 0.1, Sy = 0.2; tension S requires the formula S = ahr3/p2, where a
(d) z = l/(x2 + y2)1 at (2,1), 8x = —0.2, 8y = 0.1. is a constant and h, r, and p are measured quantities.
Take the logarithm of the formula to find the fractional
28.2. Given z = x2 — y2 and two points P: (1.0, 2.1) change in 8S/S in S, in terms of simultaneous fractional
and Q : (1.1, 2.0), (a) estimate the change in z in going changes in h, r, and p.
from P to <2; (b) estimate the change in going from
Q to P; (c) explain in general terms why the second 28.9. Find the directional derivative df/ds of each of
estimate is not precisely the negative of the first. the following functions according to the data. Also,
for the given point, find the directions of the contour
28.3. (See Example 28.2). Obtain the exact algebraical and the direction of steepest ascent.
form of the error incurred in 8/ where (a) /(x, y) = x2 -f y2 at (1, 2), direction 9 = 30°;
(b) /(x, y) = x2y2 at (2,1), direction 9 = —45°;
5/ = /(x + Sx, y + by) - /(x, y),
(c) /(x, y) = x2y - xy2 + 2 at (-1,1),
by using the approximation 8/ % df * + — 8y direction 9 = 120°;
— ox
fix fiy (d) /(x, y) = sin xy at (^, k), direction 9 = —90°;
(e) /(x, y) = cos(x2 — y) at (0, — n), direction 0 = 0;
(a) for /(x, y) = xy near the point (2,1);
(f) /(x, y) = ex-y at (1,1), direction 0 = —45°.
(b) for f(x, y) = x/y near the point (2,1).

28.4. The relation between the object distance u, the 28.10. Find — at the prescribed points on the curves
image distance v, and the focal length / of a thin lens is dx
1 1 _ 1 given.
(a) xy = 1 at (2, |); (b) x2 + y2 = 25 at (3, 4);
u + v~f'
(c) 1/x - 1/y = i at (1, 2);
Suppose that the measured values of u and v are (d) to*2 + -hy2 = 1 at (2, 3);
u — 0.31( +0.01), v = 0.56( + 0.03); calculate the greatest (e) x3 + 2y3 = 3 at (1,1);
possible error in estimating f and the corresponding (f) x3y + 3x2 - y2 - 19 = 0 at (2,1);
percentage error. (g) xy2 - x2y + 6 = 0 at (3, 2);
(h) x2 + y2 = 4 at (2 cos 0, 2 sin 0);
28.5. A viscous liquid is forced through a tube of diameter (i) x2/a1 + y2/b2 = 1 at (a cos t, b sin f);
d = 10(±0.05 x 10-3) and length / = 0.1 under a press¬ (j) x cos y = y sin x at (k/2, 0);
ure p = 10(±5 x 104), and is found to pass fluid at a (k) y2 — 4nx = 0 at (at2, 2at).
rate ;■ = 0.625 x 10~9 per unit time. The viscosity p is
given by the formula 28.11. The ideal-gas equation is PV = RT, where R is a
K pci4 constant. There are three variables; P is pressure, V is
I] =-• volume, and T is absolute temperature, for a fixed mass
128 vl of gas. Show that
Find the maximum error in the viscosity estimate.

28.6. One root of the equation x2 + fix + c = 0 is x =


\[-h + (b2 — 4c)i], Suppose that b = 20.4 and c = 95.5. (The notation (du/dv)w means that the variable w is kept
Estimate the percentage error in the root which would constant during differentiation when u = g(v, w). Use
arise if these were rounded to b = 20, c = 96. (28.6).)

28.7. The area S of a triangle with base b and base angles 28.12. Find the cartesian equation of the tangent line
A and C is given by at a point (x,, yj on each of the following curves. (Find
dy/dx first.)
\b2 tan A tan C
S = --. (a) x2 + y2 = a2; (b) x2/a2 + y2/b2 = 1;
tan A + tan C (c) a2x2 — b2y2 = c;

(d) xy = 1; (e)x* + y*=l; obtain


(f) ax2 + 2hxy + by2 + 2gx + 2fy 4- c = 0.
dy dy
2a + 2x — + 2y + 2y — = 0,
dA dx
28.13. Suppose that the curves f(x, y) = a and g(x, y) = /?
intersect at right angles at a point (a, b). Find dy/dx at from which dy/dx can be found. Check that (28.6)
the point for each curve and deduce that, at (a, b), gives the same result.

of eg of eg
— — + — — = 0.
28.19. Find normal vectors to the curves below, and
ex ex ey ey find the angle between them at the intersection given.
(a) xy = 2, a2 — y2 = —3, at intersection (1, 2).
Use this result to confirm that, in the following cases,
(b) y = a3, a2 + \y2 = 36, at intersection (2, 8).
the two systems of curves are orthogonal (i.e. they always
(c) a2 + xy + y2 = 3, x + y = 2, at intersection (1,1);
intersect at right angles). Here a and jl are the parameters
interpret your result geometrically.
for the two systems - by varying them we obtain all the
(d) ax2 + 2hxy + by2 + c = 0 and
curves for the systems.
(a) a-2 + y2 = a, y/x = /?; ax0x + h( x0 + x)(y0 + y) + by0y + c = 0,
(b) a2 - y2 = a, .xy = /?; at any point (x0, y0) which lies on the first curve.
(c) y3 — a3 = a, 1/y + 1/a = /i;
(d) (a2 + y2)/a = a, (a2 + y2)/y = [}. 28.20. Find d2y/dx2 on the following curves.
(a) x4 — y4 = 1; (b) xy = 1; (c) xy exy = 1.
28.14. Let (a, y) be any point on the curve y3 — a3 = 1.
Find an expression for dy/dA at the point. Since this 28.21. Obtain grad f where /(a, y) is given by the
expression holds good for every point on the curve, it is following. Give its components, its direction, and its
a differential equation, having the given curve as one of magnitude at the points specified.
its solution curves. Verify this by solving it, and obtain (a) l/(x + y) at (1, -2); (b) y/x at (2, 0);
the other solutions. (c) y2 — 3a2 + 1 at (0, 0); (d) 1/x — 1/y at (2,1);
(e) 1 /r, where r is the polar coordinate, r = (a2 + y2)*;
28.15. (Numerical). Form the differential equation for confirm that the gradient vector points in a radial
the following families of curves, in which c is the direction.
parameter; then use the numerical solution method of
Section 20.2 to obtain a contour map of the functions 28.22. Use the gradient vector to obtain a unit vector
concerned. perpendicular to the following curves at the points
(a) a2 + 2y2 = c, c > 0; (b) a2 + xy — y3 = c; given
a2 + y
(a) 2x — 3y + 1 = 0 at any point;
(c) -- = c; (d) Ay e * = c. (b) a2 + y2 = 5 at (2,1);
x + y- (c) a2 + y2 = r2 at (x0, y0) on the circle;
(d) x2/a2 + y2/b2 = 1 at (x0, y0) on the ellipse;
28.16. Form the differential equation for each system (e) y = 3a2 — 2 at (2,10).
of curves, and deduce the differential equation for the
orthogonal (perpendicular) system. Solve it to obtain 28.23. Use the property (28.9) to find the angle of
the orthogonal system. intersection of the following curves at the point of
(a) y2 — a2 = c; (b) y3 + a3 = c; intersection given.
(c) y2 = ca; (d) ey — ex = c. (a) y2 — a2 = —3 and a3 — y3 = 7 at (2,1);
(b) x2y — xy2 = 0 and x/y — y/x = 0 at (2, 2);
28.17. Find the curves of steepest ascent from an arbitrary (c) a2 + y2 + 2x — 4y + 4 = 0 and y = a2 + 2x + 2 at
point (a, b) for each of the following functions. (—1,1); explain the result geometrically.
(a) ix2 + y2; (b) A3y3; (c) iy2 - y - a2.
28.18. Implicit differentiation of y with respect to x can
be carried out as follows when f(x, y) is given explicitly.
Consider f(x, y) = x² + 2xy + y² = c. Then, by differentiating
this equation and treating y as a function of x, we

28.24. Use (28.12) to prove the results given in Section
28.3 for a general f(x, y): that (a) the directions of
most rapid increase and decrease through a point (x, y)
are perpendicular to the direction of the contour through
the point; (b) the maximum rate of increase from the
point is equal to |grad f| at the point.
29 Chain rules, restricted maxima, coordinate systems
Contents
29.1 Chain rule for a single parameter 504
29.2 Restricted maxima and minima: the Lagrange multiplier 506
29.3 Curvilinear coordinates in two dimensions 511
29.4 Orthogonal coordinates 513
29.5 The chain rule for two parameters 514
29.6 The use of differentials 517
Problems 519

29.1 Chain rule for a single parameter


Suppose that x and y depend on, or are functions of, another
variable t (say) which we call the parameter. It might represent
time, for example. We shall write

x = x(t), y = y(t).
As t varies, the point (x, y) follows a curve of some sort which is
said to be defined parametrically. The curve also has a characteristic
direction, which is the direction in which the curve is described as t
increases, and is indicated by an arrow. We then have a directed
path.
Example 29.1. Show that both of the following parametrizations define a unit
semicircle, centred at the origin, in the upper half-plane, traced anticlockwise: (a)
x = cos t, y = sin t, where t increases from 0 to π; (b) x = −u, y = (1 − u²)^(1/2),
where u increases from −1 to 1.

(a) The shape of the curve is obtainable by eliminating t:

x² + y² = cos² t + sin² t = 1;

so the points lie on the unit circle. Also, as t increases from 0 to π, y is non-negative
and x decreases from 1 to −1. The path is the upper semicircle from (1, 0) to
(−1, 0), described in a single direction, as shown in Fig. 29.1a.
(b) x² + y² = (−u)² + (1 − u²) = 1. As u increases from −1 to 1, y remains
non-negative while x decreases from 1 to −1. The path is as in (a): see Fig. 29.1b.

Given a function f(x, y) which can take values all over the x, y
plane, the function

g(t) = f(x(t), y(t))


picks out only the values on the path (x(t), y(t)). As we move along
this path, the function value varies, and we might be concerned with
the rate at which it changes with t. (This is generally different from
the rate at which f(x, y) changes with distance along the path, which
is equal to the directional derivative (28.4), and corresponds to using
arclength s as the parameter.)

[Fig. 29.1: the directed paths (a) and (b) of Example 29.1]
To find df/dt, suppose that t increases from t to t + δt. Then, on
the curve (x(t), y(t)), x changes from x to x + δx and y to y + δy.
Divide (28.2), the incremental approximation, by δt:

\[ \frac{\delta f}{\delta t} \approx \frac{\partial f}{\partial x}\frac{\delta x}{\delta t} + \frac{\partial f}{\partial y}\frac{\delta y}{\delta t}. \]

Let δt → 0. Then ≈ becomes =, δx/δt → dx/dt and δy/δt → dy/dt, and we
have the chain rule (or total derivative):

Chain rule for one parameter

Given f(x, y), x = x(t) and y = y(t),

\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \tag{29.1} \]

(and similarly with z in place of f if we write z = f(x, y)).

This expression is like the chain rule (3.3) for functions of a single
variable with an extra term in it for the variable y. Partial derivative
rather than ordinary derivative signs are then written as necessary.

Example 29.2. Let f(x, y) = xy — y2, x = t2, y = t3. (a) Find df/dt using the
chain rule; (b) find df/dt by substitution.

(a) We have

\[ \frac{\partial f}{\partial x} = y, \qquad \frac{\partial f}{\partial y} = x - 2y, \qquad \frac{dx}{dt} = 2t, \qquad \frac{dy}{dt} = 3t^2. \]

Therefore, by (29.1),

\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = y(2t) + (x - 2y)3t^2 = 2t^4 + (t^2 - 2t^3)3t^2 = 5t^4 - 6t^5. \]

(This expression can be written in various ways in terms of x and y, for example
as 5x² − 6xy, or 5yx^(1/2) − 6x²y^(1/3). These all look very different, but they all take
the same values since x and y are connected by the fact that (x, y) lies on the
given curve.)
(b) By substitution,

f(x(t), y(t)) = xy − y² = t²t³ − (t³)² = t⁵ − t⁶.

Therefore, as before, df/dt = 5t⁴ − 6t⁵.
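The agreement between the two routes in Example 29.2 can also be checked numerically. The following is only a sketch: the function names and the step size h are illustrative choices, not part of the text. It evaluates the right-hand side of (29.1) with the hand-computed partial derivatives and compares it with a central-difference estimate of the derivative of f(x(t), y(t)).

```python
# Numerical check of the one-parameter chain rule (29.1) for Example 29.2.
# All names here are illustrative; h is an assumed finite-difference step.

def f(x, y):
    return x * y - y ** 2

def x_of(t):
    return t ** 2

def y_of(t):
    return t ** 3

def chain_rule_rate(t):
    """df/dt from (29.1), using the partial derivatives found by hand."""
    x, y = x_of(t), y_of(t)
    df_dx = y            # partial derivative of f with respect to x
    df_dy = x - 2 * y    # partial derivative of f with respect to y
    dx_dt = 2 * t
    dy_dt = 3 * t ** 2
    return df_dx * dx_dt + df_dy * dy_dt

def finite_difference_rate(t, h=1e-6):
    """Central-difference estimate of d/dt of f(x(t), y(t))."""
    def g(s):
        return f(x_of(s), y_of(s))
    return (g(t + h) - g(t - h)) / (2 * h)

for t in (0.5, 1.0, 2.0):
    closed_form = 5 * t ** 4 - 6 * t ** 5
    print(t, chain_rule_rate(t), finite_difference_rate(t), closed_form)
```

The three printed values in each row should agree to within the accuracy of the finite-difference estimate.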

Example 29.3. Prove the implicit-differentiation formula (26.6) by using the
chain rule with x treated as the parameter.

If f(x, y) = c, then there is a solution y = y(x) for which

f(x, y(x)) = c

is automatically true for every value of x involved (that is, it is an identity).


Therefore

\[ \frac{d}{dx} f(x, y(x)) = 0. \]

Comparing this with (29.1), the chain rule, we have x in place of t for the
parameter. In terms of the chain rule, we therefore have

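\[ 0 = \frac{\partial f}{\partial x}\frac{dx}{dx} + \frac{\partial f}{\partial y}\frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}. \]

From this we recover the implicit-derivative formula (valid where ∂f/∂y ≠ 0)

\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}. \]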

The chain rule is more useful for obtaining general results, as in
Example 29.3, than in working out special instances such as Example
29.2.

29.2 Restricted maxima and minima: the Lagrange multiplier

Consider the simple function

f(x, y) = x + y.

This has no maxima, minima, or other stationary points, since
z = x + y represents an inclined plane. However, if we travel around
the plane on a particular path, we are likely to encounter high points
and low points, and points where we are momentarily travelling on
the level. Suppose that we walk on the circular path x² + y² = 1,
shown on a map in Fig. 29.2.
Then A is the highest point; this is where we have been walking uphill
but then turn downhill: it is a local maximum point on the
path. If we plotted a graph of elevation against time, this point
would show up as a local maximum on the graph.
The clue which reveals A to be a maximum is that one of the
contours of x + y is a tangent to the path at A. Those nearby contours
that the path crosses are all lower than the one through A. Similarly,
at B, there is a local minimum for the path.
This is an example of a restricted stationary-point problem,
the ‘restriction’ being the condition that the only points considered
are those that lie on a particular curve. A general statement of the
problem is as follows.

Restricted stationary-point problem

Find the stationary points of f(x, y) subject to the condition g(x, y) = c.   (29.2)

Very simple problems of this type can be solved by an elementary
method, as in the following example.

Example 29.4. Find the maximum possible area a rectangle may have if the
perimeter is restricted to length 10 units.
Call the sides x and y. Then we require the maximum of the area A:

A = f(x, y) = xy (i)

subject to the restriction on the perimeter P

P = g(x, y) = 2x + 2y = 10. (ii)

From the perimeter equation, we have y = 5 − x; so the area can be expressed
in terms of x only:

A = x(5 − x).

This has a turning point where dA/dx = 0, or

5 − 2x = 0,

that is, at x = 5/2. The perimeter equation (ii) gives correspondingly y = 5/2, so the
desired shape is a square of area 25/4.

However, although the following problem looks very similar,
there turns out to be a difficulty.

Example 29.5. Find the maxima/minima of z = f(x, y) = x² − y² on the circle
g(x, y) = x² + y² = 1.
On the given curve,
y² = 1 − x². (i)
The values taken by z = x² − y² on this curve are given in terms of x by
z = x² − (1 − x²) = 2x² − 1.
The stationary points of this function are where

\[ \frac{d}{dx}(2x^2 - 1) = 0, \]

which is at x = 0. At x = 0, the curve equation (i) gives y = ±1, so we have
found the points A : (0, 1) and A′ : (0, −1). These are in fact minima, and they
are shown on the path in Fig. 29.3a.

Fig. 29.3 (a) Contour map of x² − y², showing also the curve x² + y² = 1. Here A and
A′ are minima, and B and B′ are maxima. (b) The path x² + y² = 1 in the (x, y) plane,
with the corresponding values of z = x² − y² shown.

However, there are plainly two maxima also, at B and B′, which are
completely missed by the process above. We could have found them (but lost
A and A′) if we had substituted for x instead of y by means of x² = 1 − y². You
can see the reason for losing B and B′ if you sketch the function 2x² − 1 between
x = ±1. The maxima are at the ends, but cannot be found by differentiating; see
also Example 4.8.

We can get over this difficulty by parametrizing the curve
g(x, y) = c as in the following example, which repeats Example 29.5.

Example 29.6. Find the stationary points of x² − y² on the curve x² + y² = 1.

Put

x = cos t, y = sin t, 0 ≤ t < 2π;

then the circle C: x² + y² = 1 is traced once, anticlockwise, starting and ending
at (1, 0). On C,

f(x, y) = cos² t − sin² t.

As we go along the path C, stationary points are encountered where

\[ 0 = \frac{d}{dt} f(x(t), y(t)) = 2\cos t(-\sin t) - 2\sin t\cos t = -4\sin t\cos t = -2\sin 2t. \]

The solutions of this equation in the range 0 ≤ t < 2π are t = 0, π/2, π, 3π/2,
which correspond to the points (1, 0), (0, 1), (−1, 0), (0, −1). Therefore this
approach successfully finds all the stationary points on the path, which the
method of Example 29.5 failed to do.
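The parametric search of Example 29.6 can be imitated numerically. The sketch below is an illustration only: the grid size, the half-step offset of the grid, and the bisection tolerance are assumed choices. It evaluates df/dt by the chain rule (29.1), looks for sign changes of df/dt on a grid in t, and refines each sign change by bisection.

```python
import math

def dg_dt(t):
    """d/dt of f(x(t), y(t)) for f = x^2 - y^2 on x = cos t, y = sin t, via (29.1)."""
    x, y = math.cos(t), math.sin(t)
    df_dx, df_dy = 2 * x, -2 * y
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    return df_dx * dx_dt + df_dy * dy_dt

def bisect(func, a, b, tol=1e-12):
    """Refine a sign change of func on [a, b] by repeated halving."""
    fa = func(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = func(m)
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

def stationary_points(n=360):
    """Points on the unit circle where df/dt changes sign, for one circuit in t."""
    step = 2 * math.pi / n
    points = []
    for k in range(n):
        # The grid is shifted by half a step so that no grid point is itself a root.
        a = (k - 0.5) * step
        b = a + step
        if dg_dt(a) * dg_dt(b) < 0:
            t = bisect(dg_dt, a, b)
            # Adding 0.0 turns a rounded -0.0 into 0.0 for tidier printing.
            points.append((round(math.cos(t), 6) + 0.0, round(math.sin(t), 6) + 0.0))
    return points

print(stationary_points())
```

A sign-change search of this kind would miss a stationary point at which df/dt touches zero without changing sign; that does not happen in this example.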

We shall now describe the Lagrange-multiplier method for solving
the restricted stationary-value problem (29.2). This uses the parametric
idea, but all reference to a parameter is eliminated eventually
so that we do not have to invent a parametrization and then go
through the resulting algebra.
By thinking of time as a possible parameter t, and P : (x(t), y(t))
as a point moving along the curve with velocity (dx/dt, dy/dt), it
can be seen that g(x, y) = c can be expressed parametrically so that
(a) the path is traced exactly once as t moves through its range,
and (b) dx/dt and dy/dt are never both zero together (if t is
time, this means that the moving point P never pauses).
Then as P : (x(t), y(t)) moves along g(x, y) = c, the points Q
where df/dt = 0 are the stationary points of f(x(t), y(t)). Therefore,
by the chain rule (29.1), at Q

\[ \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = 0. \]

To get rid of dx/dt and dy/dt, which are special to the particular
parametrization chosen, we need another equation. On the curve,
g(x, y) has a constant value c, so dg/dt = 0 at every point including
Q; so, by the chain rule,

\[ \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt} = 0. \]

These last two equations can be regarded as a pair of homogeneous
algebraic equations for dx/dt and dy/dt. By condition (b) they have a
solution in which dx/dt and dy/dt are not both zero. From Section 10.4, the
equations have a nontrivial solution if and only if the determinant
of the coefficients is zero, so at the (unknown) point Q

\[ \frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial g}{\partial x} = 0. \]

This can be written alternatively in the form

\[ \frac{\partial f}{\partial x}\Big/\frac{\partial g}{\partial x} = \lambda, \qquad \frac{\partial f}{\partial y}\Big/\frac{\partial g}{\partial y} = \lambda, \tag{29.3a, b} \]

where λ ('lambda') is a new unknown constant, called the Lagrange
multiplier for the problem. We have lost some information here,
because the condition dg/dt = 0 does not distinguish between one
value of c and another, so we have to reassert the condition

g(x, y) = c. (29.3c)

Looking back, we have three unknowns: x and y (the coordinates
of any stationary point) and λ, another constant. To determine these,
there are three equations: (29.3a, b) and (29.3c). Finally, we summarize
the method.

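Lagrange-multiplier method for the restricted stationary-value problem

To find the stationary points of f(x, y) subject to
g(x, y) = c, solve the following equations for x, y, λ:

g(x, y) = c, (i)

\[ \frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \quad \text{(ii)} \]

\[ \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0. \quad \text{(iii)} \tag{29.4} \]

(The value of λ can usually be discarded.)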

Notice that all reference to the parameter t has disappeared. There
are many ways of proving (29.4), but this is probably the simplest
for two dimensions. The problem is treated for three dimensions in
Section 30.8.

Example 29.7. Find the stationary points of x² − y² on the curve x² + y² = 1
(compare Examples 29.5 and 29.6).

In (29.4), f(x, y) = x² − y² and g(x, y) = x² + y² = 1. The equations to be
solved, in the order of (29.4), become

x² + y² = 1, (i)

2x − λ(2x) = 0 or (1 − λ)x = 0, (ii)

−2y − λ(2y) = 0 or (1 + λ)y = 0. (iii)

From (ii), either λ = 1 or x = 0. Taking these possibilities in order:

If λ = 1, then (iii) gives y = 0; consequently (i) gives x = ±1.

Therefore we have found the points (1, 0) and (−1, 0) which we called B′ and
B in Example 29.5.

If x = 0, then (i) gives y = ±1 (and (iii) is then satisfied by λ = −1). We have
therefore found the points (0, 1) and (0, −1) which we called A and A′ in Example 29.5.
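The final advice to check that the solutions actually fit can be automated. The sketch below is illustrative only: the helper names and the finite-difference step are assumptions. It evaluates the three left-hand sides of (29.4) for a proposed triple (x, y, λ), with the partial derivatives estimated by central differences, and applies this to the four points of Example 29.7.

```python
def partial(func, x, y, var, h=1e-6):
    """Central-difference estimate of a first partial derivative of func(x, y)."""
    if var == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

def lagrange_residuals(f, g, c, x, y, lam):
    """Left-hand sides of (29.4)(i)-(iii), rearranged so that a solution gives zeros."""
    return (
        g(x, y) - c,
        partial(f, x, y, "x") - lam * partial(g, x, y, "x"),
        partial(f, x, y, "y") - lam * partial(g, x, y, "y"),
    )

def f(x, y):
    return x ** 2 - y ** 2

def g(x, y):
    return x ** 2 + y ** 2

# (x, y, lambda) triples found in Example 29.7
candidates = [(1, 0, 1), (-1, 0, 1), (0, 1, -1), (0, -1, -1)]
for x, y, lam in candidates:
    print((x, y, lam), lagrange_residuals(f, g, 1, x, y, lam))
```

Each printed triple of residuals should be zero to within rounding error; a point on the circle that is not stationary, such as (0.6, 0.8), leaves a clearly nonzero residual whatever value of λ is tried.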

The equations obtained are often awkward to solve. It is best to
be very systematic, not wandering aimlessly between the equations.
Be careful not to overlook possibilities (such as that (ii) in Example
29.7 is solved by λ = 1); and check at the end that the solutions
actually fit. The values found for λ do have a special significance in
certain subjects but otherwise can be thrown away.

Example 29.8. Find the rectangle of maximum area which can be placed
symmetrically in the ellipse x2 + 4y2 = 1 as shown in Fig. 29.4.

Suppose that one of the vertices, say A, is at (x, y). We shall require that x and
y be positive, since this is sufficient for the geometrical condition. The area is
equal to 4xy = f(x, y), while x and y are subject to g(x, y) = x² + 4y² = 1.
The three equations, taken in the order of (29.4), become

x² + 4y² = 1, (i)

2y − λx = 0, (ii)

x − 2λy = 0. (iii)

Suppose that neither x nor y is zero (that could not give a maximum). Then,
from (ii) and (iii),

λ = 2y/x = x/2y, (iv)

so x = ±2y. However, these must have the same sign for positive area, and we
postulated that x and y should be positive. Therefore

x = 2y > 0; (v)

so, from (iv) again,

λ = 1. (vi)

Use (v) to substitute for x in (i): we get 8y² = 1, or (rejecting negative values
of y)

y = 1/(2√2),

and (v) gives correspondingly

x = 1/√2.

The sides have length 2y = 1/√2 and 2x = √2, so the area is 1.

[Fig. 29.4: rectangle placed symmetrically in the ellipse x² + 4y² = 1]
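The result of Example 29.8 can be cross-checked by brute force, in the spirit of the parametric method of Example 29.6. The sketch below assumes the parametrization x = cos t, y = ½ sin t of the first-quadrant arc of the ellipse (an illustrative choice, as are the function names and the grid size) and simply searches a grid in t for the largest area 4xy.

```python
import math

def area(t):
    """Area 4xy of the symmetric rectangle with vertex (cos t, 0.5 sin t) on the ellipse."""
    x = math.cos(t)
    y = 0.5 * math.sin(t)
    return 4 * x * y

def best_rectangle(n=10000):
    """Grid search over 0 < t < pi/2; returns (x, y, area) at the best grid point."""
    grid = (k * (math.pi / 2) / n for k in range(1, n))
    best_t = max(grid, key=area)
    return math.cos(best_t), 0.5 * math.sin(best_t), area(best_t)

x, y, a = best_rectangle()
print(x, y, a)
print(1 / math.sqrt(2), 1 / (2 * math.sqrt(2)))  # values found by the multiplier method
```

A grid search gives no proof, but it is a cheap guard against algebraic slips when solving (29.4).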

29.3 Curvilinear coordinates in two dimensions

Suppose that x and y are functions of two parameters, u and v.
To indicate this, write

x = x(u, v), y = y(u, v).

This situation arises when we change coordinates from (x, y) to
another system. For example, the equations

x = u cos v, y = u sin v,

represent polar coordinates, with u as the radial and v as the angular
coordinate. Now hold v constant; put

v = β,

say, and let u vary. Then

x = u cos β, y = u sin β.

Here u is the only active parameter; as it varies, (x, y) traces a radial
straight line. Suppose instead that u is held constant, say

u = α;

then, as v varies, (x, y) follows the circle

x = α cos v, y = α sin v.

The point where the two curves intersect can be described either by

u = α, v = β

in the new (polar) coordinates, or in the original coordinates by

x = α cos β, y = α sin β.

In general, if we have

x = x(u, v), y = y(u, v),

and vary u and v together in an arbitrary way, then the corresponding
points (x, y) will completely cover some area in the x, y plane. If,
however, we put u = α and vary v, then put v = β and vary u, we
obtain two curves in parametric form:

(x(α, v), y(α, v)) and (x(u, β), y(u, β)).

By choosing different values for α and β, we produce a net consisting
of two independent systems of curves. This can serve as a new
coordinate system.

Example 29.9. Sketch the coordinate system defined by

x = u + v, y = u − v.

Put u = α (constant) and vary v in the equations

x = α + v, y = α − v.

Eliminating the active parameter v between the two equations:

y = −x + 2α,

which is a straight line. By taking different values of α, we obtain a system of
parallel straight lines as in Fig. 29.5a.
Put v = β and vary u:

x = u + β, y = u − β.

Therefore

y = x − 2β,

which gives another family of parallel straight lines, obtained by taking various
values for the constant β, as in Fig. 29.5b.
The two families happen to be at right angles. Taken together, as in Fig.
29.5c, they form a left-handed system of cartesian coordinates (u, v) with origin
at x = 0, y = 0.

[Fig. 29.5: (a) the lines u = α; (b) the lines v = β; (c) the combined (u, v) system]

New coordinates (u, v) can also be defined in the form

u = u(x, y), v = v(x, y).

For example, the system

u = (x² + y²)^(1/2), v = arctan(y/x)

defines polar coordinates, u (radial) and v (angular). To sketch the
curves corresponding to constant u or v, we put

α = u(x, y) or β = v(x, y);

each of these gives the corresponding curve implicitly.

Example 29.10. Sketch the coordinate system (u, v) described by

u = y² − 2x², v = x^(1/2) y.

The curve u = α is obtained in terms of x and y by solving

α = y² − 2x².

These curves (for various values of α) are in fact recognizable without solving,
being a system of hyperbolas with asymptotes

y = ±√2 x.

The curves v = β are given by x^(1/2) y = β, so y = β/x^(1/2). The system is sketched
in Fig. 29.6 for the first quadrant. Notice that v = 0 on both x = 0 and y = 0:
the connection between (x, y) and (u, v) is not one-to-one over the whole (x, y)
plane.

29.4 Orthogonal coordinates

Suppose that we have a (u, v) system of coordinates defined either
by x = x(u, v), y = y(u, v), or by u = u(x, y), v = v(x, y), and the
curves u = α and v = β always intersect at right angles for any
constants α and β. Then the (u, v) system is said to be an orthogonal
system of coordinates. For example, polar coordinates are orthogonal.
Coordinate systems which are not orthogonal are seldom
used because of the complexity of the formulae connected with them.
A test for orthogonality is the following.

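Conditions for an orthogonal system of coordinates   (29.5)

The (u, v) system is orthogonal if

either (a) u = u(x, y), v = v(x, y), and

\[ \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = 0; \]

or (b) x = x(u, v), y = y(u, v), and

\[ \frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v} = 0. \]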

We prove this result as follows.

(a) Consider the curve from each family which passes through
(x, y). According to (28.7), normal vectors to the two curves at (x, y)
are n₁ = (∂u/∂x, ∂u/∂y) and n₂ = (∂v/∂x, ∂v/∂y) respectively. The
curves meet in a right angle if their normals do so, and the condition
for this is n₁·n₂ = 0, which is equivalent to the condition given in
(29.5a).
(b) Consider the curves u = α and v = β which pass through
a point P which has new coordinates (α, β). Their parametric
equations are

x = x(α, v), y = y(α, v), for the curve u = α,

x = x(u, β), y = y(u, β), for the curve v = β.

Their slopes at P are respectively given by

\[ \left(\frac{dy}{dx}\right)_P = \left(\frac{\partial y}{\partial v}\Big/\frac{\partial x}{\partial v}\right)_P \quad\text{and}\quad \left(\frac{dy}{dx}\right)_P = \left(\frac{\partial y}{\partial u}\Big/\frac{\partial x}{\partial u}\right)_P. \]

The condition for the curves to be perpendicular is that the product
of the slopes should equal −1, and this is equivalent to the result in (29.5b).

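Example 29.11. Confirm that the following coordinate systems (u, v) are
orthogonal: (a) u = y² − 2x², v = x^(1/2) y; (b) x = 2uv, y = u² − v².
For (a), use (29.5a). We have

\[ \frac{\partial u}{\partial x} = -4x, \quad \frac{\partial v}{\partial x} = \tfrac{1}{2}x^{-1/2}y, \quad \frac{\partial u}{\partial y} = 2y, \quad \frac{\partial v}{\partial y} = x^{1/2}; \]

so

\[ \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = -4x\left(\tfrac{1}{2}x^{-1/2}y\right) + 2y\left(x^{1/2}\right) = 0. \]

For (b), use (29.5b); notice how this condition is differently structured from
(29.5a). We have

\[ \frac{\partial x}{\partial u} = 2v, \quad \frac{\partial x}{\partial v} = 2u, \quad \frac{\partial y}{\partial u} = 2u, \quad \frac{\partial y}{\partial v} = -2v; \]

so

\[ \frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v} = 2v(2u) + 2u(-2v) = 0. \]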
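Both forms of the test (29.5) are easy to try numerically at a sample point before attempting the algebra. The following sketch is illustrative only: the function names, the sample points, and the finite-difference step are assumptions, and the third system (x = u + v, y = v) is a made-up non-orthogonal example included for contrast.

```python
import math

def grad(func, a, b, h=1e-6):
    """Central-difference estimates of the two first partial derivatives of func(a, b)."""
    return (
        (func(a + h, b) - func(a - h, b)) / (2 * h),
        (func(a, b + h) - func(a, b - h)) / (2 * h),
    )

def orthogonality_a(u, v, x, y):
    """Left-hand side of (29.5a) for u = u(x, y), v = v(x, y), evaluated at (x, y)."""
    ux, uy = grad(u, x, y)
    vx, vy = grad(v, x, y)
    return ux * vx + uy * vy

def orthogonality_b(xf, yf, u, v):
    """Left-hand side of (29.5b) for x = x(u, v), y = y(u, v), evaluated at (u, v)."""
    xu, xv = grad(xf, u, v)
    yu, yv = grad(yf, u, v)
    return xu * xv + yu * yv

# Example 29.11(a): u = y^2 - 2x^2, v = x^(1/2) y
print(orthogonality_a(lambda x, y: y ** 2 - 2 * x ** 2,
                      lambda x, y: math.sqrt(x) * y, 1.5, 0.7))
# Example 29.11(b): x = 2uv, y = u^2 - v^2
print(orthogonality_b(lambda u, v: 2 * u * v,
                      lambda u, v: u ** 2 - v ** 2, 0.8, -1.3))
# Hypothetical non-orthogonal system: x = u + v, y = v
print(orthogonality_b(lambda u, v: u + v, lambda u, v: v, 0.8, -1.3))
```

Values close to zero at several sample points suggest orthogonality but do not prove it; the algebraic check in Example 29.11 is still needed.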

29.5 The chain rule for two parameters

Suppose that we have a new set of coordinates defined by

x = x(u, v), y = y(u, v),

and a function f(x, y): an arbitrary function of position. The
function f(x, y) can be expressed in terms of the new coordinates;
for example if

x = u² − v², y = 2uv, and f(x, y) = x² + y²,

then

f(x, y) = (u² − v²)² + (2uv)² = (u² + v²)²

when evaluated at the same point.


If we put

z = f(x, y),

then the derivatives ∂z/∂u and ∂z/∂v indicate how z, or f(x, y),
changes as we move around in the new coordinates. Consider the
derivative

∂z/∂u,

in which v is held constant, at v = β say. Since only u varies, we are
able to adopt the single-parameter chain rule (29.1), with u instead of
t. However, we must write ∂x/∂u and ∂y/∂u instead of dx/du and
dy/du in order to indicate that another variable v is present,
although it is regarded as constant for the differentiation. We obtain
the following.

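Chain rule for two parameters

If x = x(u, v), y = y(u, v), z = f(x, y), then

\[ \frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}, \qquad \frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}. \tag{29.6} \]

(Or f may be written instead of z.)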

Example 29.12. Use the chain rule (29.6) to obtain ∂z/∂v where x = u² − v²,
y = 2uv, and z = xy; check the result by substitution.

For the chain rule, we require

\[ \frac{\partial z}{\partial x} = y, \quad \frac{\partial z}{\partial y} = x, \quad \frac{\partial x}{\partial v} = -2v, \quad \frac{\partial y}{\partial v} = 2u. \]

Then

\[ \frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v} = -2yv + 2xu = 2u^3 - 6uv^2. \]

To check the result, write z in terms of u and v:

z = xy = (u² − v²)2uv = 2u³v − 2uv³.

Therefore ∂z/∂v = 2u³ − 6uv², as before.
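A numerical version of the check in Example 29.12 is sketched below. The names and the step size are illustrative assumptions. The chain-rule value from (29.6) is compared with a central-difference estimate in which u is held fixed and only v is varied, which is exactly what the partial derivative with respect to v means.

```python
def x_of(u, v):
    return u ** 2 - v ** 2

def y_of(u, v):
    return 2 * u * v

def z_of(x, y):
    return x * y

def dz_dv_chain(u, v):
    """Second formula of (29.6), with the partial derivatives from Example 29.12."""
    x, y = x_of(u, v), y_of(u, v)
    dz_dx, dz_dy = y, x
    dx_dv, dy_dv = -2 * v, 2 * u
    return dz_dx * dx_dv + dz_dy * dy_dv

def dz_dv_numeric(u, v, h=1e-6):
    """Central difference in v with u held constant."""
    def z_at(s):
        return z_of(x_of(u, s), y_of(u, s))
    return (z_at(v + h) - z_at(v - h)) / (2 * h)

u, v = 1.2, 0.5
print(dz_dv_chain(u, v), dz_dv_numeric(u, v), 2 * u ** 3 - 6 * u * v ** 2)
```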

There is clearly no advantage in using the chain rule for a simple
explicit case such as this. The use of such rules is to obtain general
results as in the following examples.

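Example 29.13. Find expressions for ∂z/∂r and ∂z/∂θ when x = r cos θ,
y = r sin θ, and z is a function of position.

To use (29.6), put (r, θ) in place of (u, v):

\[ \frac{\partial z}{\partial r} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial r} = \cos\theta\,\frac{\partial z}{\partial x} + \sin\theta\,\frac{\partial z}{\partial y}, \]

\[ \frac{\partial z}{\partial \theta} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial \theta} = -r\sin\theta\,\frac{\partial z}{\partial x} + r\cos\theta\,\frac{\partial z}{\partial y}. \]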

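Example 29.14. Find expressions for ∂z/∂x and ∂z/∂y in terms of ∂z/∂r
and ∂z/∂θ, where x = r cos θ, y = r sin θ.

The appropriate form for chain rule (29.6) will be

\[ \frac{\partial z}{\partial x} = \frac{\partial z}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial z}{\partial \theta}\frac{\partial \theta}{\partial x}, \qquad \frac{\partial z}{\partial y} = \frac{\partial z}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial z}{\partial \theta}\frac{\partial \theta}{\partial y}. \]

To find ∂r/∂x etc., use the alternative form for polar coordinates:

r = (x² + y²)^(1/2), θ = arctan(y/x);

then

\[ \frac{\partial r}{\partial x} = \frac{x}{(x^2 + y^2)^{1/2}} = \frac{r\cos\theta}{r} = \cos\theta; \]

\[ \frac{\partial \theta}{\partial x} = \frac{1}{1 + (y/x)^2}\left(-\frac{y}{x^2}\right) = -\frac{y}{x^2 + y^2} = -\frac{r\sin\theta}{r^2} = -\frac{\sin\theta}{r}. \]

Therefore

\[ \frac{\partial z}{\partial x} = \cos\theta\,\frac{\partial z}{\partial r} - \frac{\sin\theta}{r}\frac{\partial z}{\partial \theta}. \]

Similarly ∂r/∂y and ∂θ/∂y can be calculated to give

\[ \frac{\partial z}{\partial y} = \sin\theta\,\frac{\partial z}{\partial r} + \frac{\cos\theta}{r}\frac{\partial z}{\partial \theta}. \]

(These can also be obtained by treating the pair of expressions for ∂z/∂r and
∂z/∂θ obtained in Example 29.13 as if they were a pair of simultaneous equations
for ∂z/∂x and ∂z/∂y, and solving them.)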
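The two formulas of Example 29.14 can be tested on any convenient function. In the sketch below the test function z = x²y + y, the sample point, and the step size are arbitrary illustrative choices. The derivatives with respect to r and θ are estimated numerically from the polar form of z, converted to cartesian derivatives by the formulas above, and compared with the directly computed values 2xy and x² + 1.

```python
import math

def z(x, y):
    # Hypothetical test function, chosen only for illustration.
    return x ** 2 * y + y

def z_polar(r, theta):
    """The same function expressed through x = r cos(theta), y = r sin(theta)."""
    return z(r * math.cos(theta), r * math.sin(theta))

def cartesian_partials_from_polar(r, theta, h=1e-6):
    """dz/dx and dz/dy obtained from dz/dr and dz/dtheta via Example 29.14."""
    dz_dr = (z_polar(r + h, theta) - z_polar(r - h, theta)) / (2 * h)
    dz_dtheta = (z_polar(r, theta + h) - z_polar(r, theta - h)) / (2 * h)
    c, s = math.cos(theta), math.sin(theta)
    dz_dx = c * dz_dr - (s / r) * dz_dtheta
    dz_dy = s * dz_dr + (c / r) * dz_dtheta
    return dz_dx, dz_dy

r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)
print(cartesian_partials_from_polar(r, theta))
print((2 * x * y, x ** 2 + 1))  # direct partial derivatives of x^2 y + y
```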

Example 29.15. Supposing that no further information is provided, simplify
the expression

\[ \frac{\partial P}{\partial U}\frac{\partial U}{\partial M} + \frac{\partial P}{\partial V}\frac{\partial V}{\partial M}. \]

We may understand from the notation that

P = P(U, V).

The partial derivative notation ∂U/∂M and ∂V/∂M indicates that

U = U(M, ...) and V = V(M, ...),

at least one more variable being present: the expression does not tell us its name.
The chain rule automatically simplifies the expression to

\[ \frac{\partial P}{\partial U}\frac{\partial U}{\partial M} + \frac{\partial P}{\partial V}\frac{\partial V}{\partial M} = \frac{\partial P}{\partial M}. \]

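Notice how the expressions in (29.6) are formed. Suppose that

P = P(U, V), Q = Q(U, V), U = U(X, Y), V = V(X, Y).

To form for example ∂P/∂X, write

\[ \frac{\partial P}{\partial X} = \frac{\partial P}{\partial\,\square}\frac{\partial\,\square}{\partial X} + \frac{\partial P}{\partial\,\square}\frac{\partial\,\square}{\partial X}, \]

then fill in the spaces in the first term with U and the second with V.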
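Example 29.16. Prove that if (x, y) and (u, v) are coordinates related by

x = x(u, v) and y = y(u, v), (i)

or alternatively by

u = u(x, y) and v = v(x, y), (ii)

then

\[ \begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix} \begin{bmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{bmatrix} \]

is equal to the unit matrix I₂.

In the first matrix, the relations (i) are implied, and in the second the relations
(ii). By multiplying the matrices we obtain

\[ \begin{bmatrix} \dfrac{\partial x}{\partial u}\dfrac{\partial u}{\partial x} + \dfrac{\partial x}{\partial v}\dfrac{\partial v}{\partial x} & \dfrac{\partial x}{\partial u}\dfrac{\partial u}{\partial y} + \dfrac{\partial x}{\partial v}\dfrac{\partial v}{\partial y} \\[2mm] \dfrac{\partial y}{\partial u}\dfrac{\partial u}{\partial x} + \dfrac{\partial y}{\partial v}\dfrac{\partial v}{\partial x} & \dfrac{\partial y}{\partial u}\dfrac{\partial u}{\partial y} + \dfrac{\partial y}{\partial v}\dfrac{\partial v}{\partial y} \end{bmatrix}. \]

Each of these elements has the right shape for the representation of a derivative
by the chain rule (29.6), though the variable combinations occupying the various
positions may seem unusual. The matrix becomes

\[ \begin{bmatrix} \dfrac{\partial x}{\partial x} & \dfrac{\partial x}{\partial y} \\[2mm] \dfrac{\partial y}{\partial x} & \dfrac{\partial y}{\partial y} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]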
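The matrix identity of Example 29.16 can be seen at work for polar coordinates. The sketch below is illustrative: the helper names, the sample point, and the step size are assumptions, and the inverse relations use math.hypot and math.atan2 for u = (x² + y²)^(1/2) and v = arctan(y/x) in the first quadrant. Both matrices of partial derivatives are estimated by central differences and then multiplied.

```python
import math

def jacobian(f1, f2, a, b, h=1e-6):
    """2x2 matrix of first partial derivatives of (f1, f2) with respect to (a, b)."""
    return [
        [(f1(a + h, b) - f1(a - h, b)) / (2 * h), (f1(a, b + h) - f1(a, b - h)) / (2 * h)],
        [(f2(a + h, b) - f2(a - h, b)) / (2 * h), (f2(a, b + h) - f2(a, b - h)) / (2 * h)],
    ]

def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

# Relations (i): x = u cos v, y = u sin v (polar coordinates)
def x_of(u, v):
    return u * math.cos(v)

def y_of(u, v):
    return u * math.sin(v)

# Relations (ii): the inverse functions
def u_of(x, y):
    return math.hypot(x, y)

def v_of(x, y):
    return math.atan2(y, x)

u0, v0 = 2.0, 0.6
x0, y0 = x_of(u0, v0), y_of(u0, v0)
first = jacobian(x_of, y_of, u0, v0)    # derivatives of x, y with respect to u, v
second = jacobian(u_of, v_of, x0, y0)   # derivatives of u, v with respect to x, y
print(matmul2(first, second))
```

The printed product should be the unit matrix to within the accuracy of the finite differences.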

29.6 The use of differentials

Problems are sometimes made easier by working directly with the
incremental approximation (28.2): if z = f(x, y), then

\[ \delta z \approx \frac{\partial z}{\partial x}\,\delta x + \frac{\partial z}{\partial y}\,\delta y. \]

This can be more fruitful than searching for a chain rule or other
formula which will work. It is customary in certain applications,
particularly in thermodynamics, to write this formula in the form

\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy, \]

in which ≈ becomes '=' and dx, dy, dz are put in place of δx, δy,
δz. Such expressions can be manipulated in the same way as the
differential forms described in Section 22.4 for functions of a single
variable (the theory, however, is somewhat difficult). Here we shall
adopt '=' for brevity, but retain δx etc.

Example 29.17. Find a vector normal to the curve f(x, y) = c at a point (x, y)
on the curve. (Compare Section 28.5.)

Let P be (x, y) and Q a nearby point (x + δx, y + δy) also on the curve (see
Fig. 29.7). Put

z = f(x, y).

Then, since z is constant on the curve (it equals c),

\[ \delta z = 0 = \frac{\partial z}{\partial x}\,\delta x + \frac{\partial z}{\partial y}\,\delta y, \]

where the derivatives are evaluated at P. This can be written

\[ \left(\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}\right)\cdot(\delta x, \delta y) = 0. \]

But (δx, δy) = PQ is in the direction of the tangent at P (more and more nearly
as PQ becomes smaller, of course), so (∂z/∂x, ∂z/∂y) is a vector in the direction
of the normal, as we found in Section 28.7.

Example 29.18. Show that the coordinate system (u, v) defined by

x = 2uv and y = v² − u²

is orthogonal.

We have to show that any two curves given respectively by u = α and v = β
intersect in a right angle, as in Fig. 29.8. If u and v are allowed to vary arbitrarily,
the incremental formula gives

δx = 2v δu + 2u δv, δy = −2u δu + 2v δv. (i)

But u does not vary on the curve u = α, so δu = 0 and (i) becomes δx = 2u δv,
δy = 2v δv.
The vector PQ points nearly in the direction of the tangent at P:

PQ = (δx, δy) = (2u δv, 2v δv). (ii)

Similarly, on the curve v = β, we have δv = 0; so

δx = 2v δu, δy = −2u δu. (iii)

PR points in the direction of the tangent to v = β, and

PR = (δx, δy) = (2v δu, −2u δu).

From (ii) and (iii), we have

PQ · PR = (2u δv, 2v δv) · (2v δu, −2u δu)

= 4uv δu δv − 4uv δu δv = 0,

so the curves intersect in a right angle.

[Fig. 29.8: the curves u = α and v = β meeting at P, with nearby points Q and R]

Problems

29.1. Find a parametrization (x(t), y(t)) suitable for
the following curves, specifying the range of t required
to traverse the curve exactly once, in the anticlockwise
direction if the curve is closed.
(a) x² + y² = 25; (b) ^x² + \y² = 1; (c) xy = 4;
(d) x² − y² = 1 (try using the identity
1 + tan² A = 1/cos² A);
(e) ^x² − gy² = 1; (f) y² = 4ax;
(g) (x − 1)² + (y − 2)² = 9; (h) 2x − 5y + 2 = 0.

29.2. For each of the following cases, obtain df/dt in
terms of t by means of the chain rule (29.1).
(a) f(x, y) = x² + y², x(t) = t, y(t) = 1/t;
(b) f(x, y) = x² − y², x(t) = cos t, y(t) = sin t;
(c) f(x, y) = xy, x(t) = 2 cos t, y(t) = sin t;
(d) f(x, y) = x sin y, x(t) = It, y(t) = t²;
(e) f(x, y) = 4x² + 9y², x(t) = j cos t, y(t) = j sin t.

29.3. Two athletes run around concentric circular tracks
of radius r and R with speeds v and V respectively. They
start on the same radial line. By using time as a
parameter, find the rate of change with time of the
distance between them.

29.4. Use the Lagrange-multiplier method to solve the
following problems.
(a) Find the maximum area of a rectangle having
perimeter of length 10.
(b) Find the rectangle with area 9 which has the
shortest perimeter.
(c) Find the stationary points of x² + 2y² subject to
x² + y² = 1.
(d) Find the largest rectangle in the first quadrant
of the (x, y) plane which has two of its sides along x = 0
and y = 0 respectively, and a vertex on the line 2x + y = 1.
(e) Find the minimum distance of the straight line
x + 2y = 1 from the point (1, 1). (It is easier to consider
the square of the distance.)
(f) Find the shortest distance from the origin to the
curve x² + 8xy + 7y² = 225.
(g) With reference to Fig. 29.4, find the rectangle
in the ellipse which has the minimum perimeter.
(h) Find the stationary points of (x − y + 1)² on
y = x².
(i) Show that in general there are three normals to a
parabola from any given point inside it.

29.5. Find the stationary points of f(x, y) on
g(x, y) = c (i) by parametrizing the given path as in
Example 29.6, (ii) by using the Lagrange-multiplier
technique, in each of the following cases.
(a) f(x, y) = x² + y² on g(x, y) = xy = 1;
(b) f(x, y) = x² + y² on (x − 1)² + y² = 1;
(c) f(x, y) = x² + 4y² on x² + y² = 1;
(d) f(x, y) = 3x − 2y on x² − y² = 4;
(e) f(x, y) = xy on g(x, y) = x² + y² = 1 (compare this
with (a)).

29.6. Show by means of sketches that, for the restricted
stationary-value problem, a stationary point can be
expected at any point where the curve g(x, y) = c is
tangential to a contour of f(x, y).
Use this observation to derive the Lagrange-multiplier
principle. (Hint: consider the normals at the
point of tangency; or use implicit differentiation to get
expressions for the directions of the curves there.)
There are cases when a stationary point can occur
although the curves are not tangential there. Try to
identify these cases by sketching various possibilities.
(Hint: they correspond to λ = 0.)

29.7. A change of coordinates from (x, y) to (u, v) is
specified by each of the following. Show that the new
coordinate system is orthogonal.
(a) u = 2x + 3y, v = −3x + 2y;
(b) u = xy, v = x² − y²;
(c) u = x² + 2y², v = y/x²;
(d) u = xy², v = y² − 2x²;
(e) u = x + 1/x + y²/x, v = y − 1/y + x²/y;
(f) x = 2u − v, y = u + 2v;
(g) x = u² − v², y = 2uv;
(h) x = u/(u² + v²), y = v/(u² + v²);
(i) x = u² − v², y = −2uv.

29.8. Let r(f) and 0(t) be polar coordinates which are (c) f(x, y) = y2, x = uv, y = v.
functions of a parameter t.
(a) Express dx/df and dy/dr in terms of dr/df, dO/dt, 29.11. Find expressions for df/du, df/dv, d2f/du2, d2f/
r, and 9. dv2, and d2f/du dv if
(b) Use (a) to obtain expressions for d2x/dt2 and f(x, y) = g(x2 - y2), x = u + v, y = u-v.
d2y/dt2.
(The expressions will involve the functions g'(x2 — y2)
(c) Prove that
etc.)
n d2* ■ n d2y d2r
cos 9 —- + sin 6 —-
d t2 dr2 dr2 29.12. Let w = w(u, v), u = u(x, y), v = v(x, y), where u
and v are related in such a way that
d2p . d2x
cos 9 —- — sin U —— du dv du dv
dr2 dr2
dx dy ’ dy dx
(These two equations express the radial and tangential
Prove that
components of acceleration, given on the left, in terms
of polar coordinates.) d2u d2u d2v d2v ^
dx2 dy2 ’ dx2 dy2
..
29 9 Use the chain rule (29.6) to find df/du and df/dv
Use the chain rule (29.6) to prove that
in terms of u and v in each of the following cases.
(a) f(x, y) = 2x — y, x = uv, y = u2 — v2;
d2w d2w du\2 2 dv[A2" d2w d2w

(b) f(x, y) = y/x, x = u + v, y = u - v; dx2 dy2 LW + dy) _ _dx2 dy2_


(c) J'(x, y) = y2, x = u2 + v2, y = v/u;
(d) f(x, y) = (x - y)/(x + y), x = v, y = u - v. 29.13. Let r and 6 be the usual polar coordinates, and
z = /(x, y); show that:
29.10. By using the chain rule (29.6) twice, obtain d2f/ dzV
du2, d2f/dv2, and d2f/du 8v in each of the following
cases.
'•>©'* d~y) "

(a) /(x, y) = y/x, x = u + v, y = u - v, /t x csz C^Z d2z 1 dz 1 d2z


(b) /(x, y) = x2 + y2, x = uv, y = u2 — v2; dx dy dr2 r dr r2 d62
30 Functions of any number of variables
Contents
30.1 The incremental approximation; errors 521
30.2 Implicit differentiation 523
30.3 Chain rules 525
30.4 The gradient vector in three dimensions 525
30.5 Normal to a surface 527
30.6 Equation of the tangent plane 528
30.7 Directional derivative in terms of gradient 529
30.8 Stationary points 532
30.9 The envelope of a family of curves 537
Problems 538

30.1 The incremental approximation; errors

For functions of three and more variables, simple pictorial representations
are not available. Nevertheless many of the important
formulae follow the pattern of the two-variable case, simply containing
more terms of the same type. This follows from the incremental
approximation (28.1), extended to three and more variables.
Suppose that f(x, y, z, ...) is any function of N (≥ 3) variables.
The partial derivatives, ∂f/∂x, ∂f/∂y, ∂f/∂z, ..., have the same
meaning as they did in Chapter 27: during differentiation, all the
variables except the named one are treated as constants.
Higher derivatives are defined as with functions of two variables;
for example

\[ \frac{\partial^3 f}{\partial x\,\partial y\,\partial z} = \frac{\partial}{\partial x}\left(\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial z}\right)\right). \]

It follows from the result for second derivatives (Section 27.4) that

\[ \frac{\partial^3 f}{\partial x\,\partial y\,\partial z} = \frac{\partial^3 f}{\partial y\,\partial x\,\partial z} = \frac{\partial^3 f}{\partial z\,\partial y\,\partial x} \]

and so on: the derivatives may be taken in any order.


The incremental approximation has the same form as (28.1)
and (28.2), simply containing further terms corresponding to the
extra variables:

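Incremental approximation for f(x, y, z, ...)

For small enough increments $\delta x, \delta y, \delta z, \ldots$:

$$\delta f = f(x + \delta x, y + \delta y, z + \delta z, \ldots) - f(x, y, z, \ldots) \approx \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \frac{\partial f}{\partial z}\,\delta z + \cdots. \qquad (30.1)$$

If we put $w = f(x, y, z, \ldots)$, this can be written

$$\delta w \approx \frac{\partial w}{\partial x}\,\delta x + \frac{\partial w}{\partial y}\,\delta y + \frac{\partial w}{\partial z}\,\delta z + \cdots.$$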

To prove (30.1), the idea of a tangent plane is not available, so
we must go directly for the linear approximation to the function.
Put $w = f(x, y, z, \ldots)$, and consider a fixed 'point' $P : (x, y, z, \ldots)$
and another nearby point $Q : (x + \delta x, y + \delta y, z + \delta z, \ldots)$. Then $w$
changes to $w + \delta w$. We assume that the relation between $\delta w$ and
$\delta x, \delta y, \delta z, \ldots$ is close to linear for small $\delta x, \delta y, \delta z, \ldots$; that is to say,

$$\delta w = A\,\delta x + B\,\delta y + C\,\delta z + \cdots + \varepsilon, \qquad (30.2)$$

where $A, B, \ldots$ are certain constants and the error $\varepsilon$ is of a lower
order of magnitude than the $\delta$-quantities (compare Example 28.2 for
two variables).
In order firstly to find $A$, vary only $x$, so that

$$\delta x \ne 0, \qquad \delta y = \delta z = \cdots = 0.$$

Put these into (30.2) and divide by $\delta x$, giving

$$\frac{\delta w}{\delta x} = A + \frac{\varepsilon}{\delta x}.$$

Now let $\delta x \to 0$. Then $\delta w/\delta x \to \partial w/\partial x$, and (since $\varepsilon$ is of lower order
of magnitude than $\delta x$) $\varepsilon/\delta x \to 0$. Therefore

$$\frac{\partial w}{\partial x} = A.$$

Similarly $\partial w/\partial y = B$, and so on, which gives the result (30.1).
The incremental approximation (30.1) can be used to estimate
errors as in Section 28.2.

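Small-error formula

If $w = f(x, y, z, \ldots)$, then (approximately)

$$\Delta w \approx \frac{\partial w}{\partial x}\,\Delta x + \frac{\partial w}{\partial y}\,\Delta y + \frac{\partial w}{\partial z}\,\Delta z + \cdots, \qquad (30.3)$$

where $x, y, z, \ldots$ stand for the measured values and $\Delta$ stands for
error = (measured value) − (exact value).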

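Example 30.1. In a triangle ABC, $\cos C = (a^2 + b^2 - c^2)/2ab$. In a particular
case, the measured side lengths are a = 3, b = 4, and c = 5.5 units. Possible errors
of measurement lie between ±0.1 units. Find the error in estimating C in the worst
case.

For ease of differentiation put

$$w = \cos C = \frac{a}{2b} + \frac{b}{2a} - \frac{c^2}{2ab}.$$

Then

$$\Delta(\cos C) = \Delta w \approx \frac{\partial w}{\partial a}\,\Delta a + \frac{\partial w}{\partial b}\,\Delta b + \frac{\partial w}{\partial c}\,\Delta c,$$

where

$$\frac{\partial w}{\partial a} = \frac{1}{2b} - \frac{b}{2a^2} + \frac{c^2}{2a^2 b},$$

and similarly for the other two derivatives. From the measurements,

$$\frac{\partial w}{\partial a} = 0.323, \qquad \frac{\partial w}{\partial b} = 0.388, \qquad \frac{\partial w}{\partial c} = -0.458.$$

Therefore $\Delta(\cos C) \approx 0.323\,\Delta a + 0.388\,\Delta b - 0.458\,\Delta c$.

To obtain $\Delta C$ from $\Delta(\cos C)$:

$$\Delta(\cos C) \approx \left(\frac{d}{dC}\cos C\right)\Delta C = (-\sin C)\,\Delta C,$$

where C is in radians. The value of C estimated from the cosine rule using the
measured values is 1.791 radians, and sin 1.791 = 0.976. Therefore

$$\Delta C \approx -0.331\,\Delta a - 0.398\,\Delta b + 0.469\,\Delta c.$$

The magnitude of $\Delta C$ is a maximum if by chance the errors are

$$\Delta a = \mp 0.1, \qquad \Delta b = \mp 0.1, \qquad \Delta c = \pm 0.1,$$

and then

$$\Delta C \approx \pm 0.120,$$

which is about a 7% error.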
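
The worst-case estimate can be checked numerically. The following is a minimal sketch, assuming the standard cosine rule $\cos C = (a^2 + b^2 - c^2)/(2ab)$ and central-difference estimates of the partial derivatives; the helper names `angle_C` and `partials` are illustrative and do not come from the text.

```python
import math

def angle_C(a, b, c):
    """Angle C (radians) opposite side c, from the standard cosine rule."""
    return math.acos((a * a + b * b - c * c) / (2 * a * b))

def partials(f, point, h=1e-6):
    """Central-difference estimates of the partial derivatives of f at point."""
    estimates = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        estimates.append((f(*up) - f(*down)) / (2 * h))
    return estimates

measured = (3.0, 4.0, 5.5)   # measured a, b, c
tolerance = 0.1              # each side may be in error by up to +/-0.1

dC = partials(angle_C, measured)   # dC/da, dC/db, dC/dc
# Small-error formula (30.3): the worst case adds the magnitudes of all terms.
worst_case = sum(abs(g) * tolerance for g in dC)

print([round(g, 3) for g in dC], round(worst_case, 3))
```

Summing absolute values reflects the unlucky situation in which every error has the sign that pushes C the same way.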

30.2 Implicit differentiation


There is an analogy with the implicit-differentiation formula (28.6).
Suppose that

$$f(x, y, z, \ldots) = 0. \qquad (30.4)$$

This condition implies that any one of the variables depends on, or
is a function of, all the others. For example, if the variables are x, y, z,
and r, and

$$x^2 + y^2 + z^2 - r^2 = 0,$$

then

$$y = \pm (r^2 - x^2 - z^2)^{1/2}.$$

Subject to (30.4) we can therefore talk about partial derivatives such
as $\partial y/\partial x$: we think of y as being a function of the other variables,
but with all the variables except x and y held constant.
Suppose that $(x, y, z, \ldots)$ and $(x + \delta x, y + \delta y, z + \delta z, \ldots)$ both
satisfy condition (30.4). Then $\delta f = 0$ and the incremental approximation
gives

$$\frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \frac{\partial f}{\partial z}\,\delta z + \cdots \approx 0. \qquad (30.5)$$

Suppose next that all the variables except x and y are kept constant,
so that $\delta x \ne 0$ and $\delta y \ne 0$, but $\delta z = \cdots = 0$. Equation (30.5)
becomes $(\partial f/\partial x)\,\delta x + (\partial f/\partial y)\,\delta y \approx 0$, so that

$$\frac{\delta y}{\delta x} \approx -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}.$$

Now let $\delta x \to 0$ and the equation becomes

$$\frac{\partial y}{\partial x} = -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}.$$

Implicit differentiation

If $f(x, y, z, \ldots) = 0$, then

$$\frac{\partial y}{\partial x} = -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}. \qquad (30.6)$$

Any other two variables may be substituted for x
and y.
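
As an illustration, formula (30.6) can be compared with direct differentiation for the relation $x^2 + y^2 + z^2 - r^2 = 0$ used earlier in this section. This is only a sketch: the sample point (x, z, r) = (1, 2, 3), the positive root for y, and the finite-difference helper `partial` are assumptions made for the check.

```python
import math

def f(x, y, z, r):
    """Left-hand side of the relation f(x, y, z, r) = 0."""
    return x * x + y * y + z * z - r * r

def partial(func, args, index, h=1e-6):
    """Central-difference estimate of d(func)/d(args[index])."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def y_explicit(x, z, r):
    """Positive root y = +(r^2 - x^2 - z^2)^(1/2)."""
    return math.sqrt(r * r - x * x - z * z)

x, z, r = 1.0, 2.0, 3.0
y = y_explicit(x, z, r)          # y = 2 at this point

# (30.6): dy/dx = -(df/dx)/(df/dy), with z and r held constant
implicit = -partial(f, (x, y, z, r), 0) / partial(f, (x, y, z, r), 1)
# Direct differentiation of the explicit solution with respect to x
direct = partial(y_explicit, (x, z, r), 0)

print(implicit, direct)          # both close to -x/y = -0.5
```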

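Example 30.2. For a fixed mass of gas, an equation of the form f(P, V, T) = 0
holds (the 'equation of state'), where P, V, and T represent the pressure, volume,
and temperature respectively. Show that

$$\text{(a)}\ \frac{\partial P}{\partial T}\frac{\partial T}{\partial V} = -\frac{\partial P}{\partial V}, \qquad \text{(b)}\ \frac{\partial P}{\partial T}\frac{\partial T}{\partial V}\frac{\partial V}{\partial P} = -1.$$

The relation f(P, V, T) = 0 implies that any of P, V, or T is a function of
the other two variables: P = P(V, T), V = V(T, P), and T = T(P, V). If we put,
say, P = P(V, T) = constant, then implicit differentiation, by (30.6), gives $\partial V/\partial T$
or $\partial T/\partial V$ in terms of $\partial f/\partial V$ and $\partial f/\partial T$ (where we are reminded of the 'constant
P' condition by the partial derivative signs instead of dV/dT and dT/dV).
Similarly we obtain $\partial P/\partial T$, $\partial T/\partial P$, $\partial P/\partial V$, and $\partial V/\partial P$.

(a) We have

$$\frac{\partial P}{\partial T} = -\frac{\partial f}{\partial T} \Big/ \frac{\partial f}{\partial P} \qquad \text{and} \qquad \frac{\partial T}{\partial V} = -\frac{\partial f}{\partial V} \Big/ \frac{\partial f}{\partial T}$$

(from (30.6)). Therefore

$$\frac{\partial P}{\partial T}\frac{\partial T}{\partial V} = \frac{\partial f}{\partial V} \Big/ \frac{\partial f}{\partial P} = -\frac{\partial P}{\partial V}$$

(using (30.6) again).

(b) By repeating the process (a) with different variables,

$$\frac{\partial P}{\partial T}\frac{\partial T}{\partial V}\frac{\partial V}{\partial P} = -1.$$

There are many more similar formulae obtainable by permuting
P, T, V: these identities are important in the theory of thermodynamics.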

30.3 Chain rules


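The chain rule for a single parameter t is obtained exactly as
in the case of a single variable: divide (30.1) by $\delta t$ and take the limit
as $\delta t \to 0$, to give the following formula.

Chain rule for one parameter

Given $f(x, y, z, \ldots)$, where $x = x(t)$, $y = y(t)$, $z = z(t)$, ...:

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} + \cdots \qquad (30.7)$$

(or with w in place of f if $w = f(x, y, z, \ldots)$).

Notice that $(x(t), y(t), z(t))$ defines a directed path in three dimensions.
In the case of more than one parameter, the results of Section
29.1 may be extended as follows.

Chain rule for more than one parameter

For a function $f(x, y, z, \ldots)$, where $x, y, z, \ldots$ are
functions of parameters $u, v, \ldots$, we have

$$\frac{\partial f}{\partial u} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial u} + \cdots,$$
$$\frac{\partial f}{\partial v} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial v} + \cdots, \qquad (30.8)$$

and so for any other parameters. (If $w = f(x, y, z, \ldots)$
then w may be written in place of f.)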
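
A quick numerical check of the one-parameter rule (30.7) is sketched below. The function $f = x^2 + yz$ and the helical path $(\cos t, \sin t, t)$ are illustrative choices, not taken from the text; the chain-rule value is compared with a finite-difference derivative of f along the path.

```python
import math

def f(x, y, z):
    return x * x + y * z

def grad_f(x, y, z):
    """Partial derivatives (df/dx, df/dy, df/dz) of f = x^2 + y z."""
    return (2 * x, z, y)

def path(t):
    """An illustrative directed path: a helix."""
    return (math.cos(t), math.sin(t), t)

def path_velocity(t):
    """(dx/dt, dy/dt, dz/dt) along the helix."""
    return (-math.sin(t), math.cos(t), 1.0)

def chain_rule_rate(t):
    """df/dt from (30.7): the sum of (partial derivative) x (rate of change)."""
    return sum(g * v for g, v in zip(grad_f(*path(t)), path_velocity(t)))

def direct_rate(t, h=1e-6):
    """df/dt by differencing the composite function f(x(t), y(t), z(t))."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)

t0 = 0.7
print(chain_rule_rate(t0), direct_rate(t0))
```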

30.4 The gradient vector in three dimensions


The gradient vector function, introduced for two dimensions in
Section 28.6, extends to any number of dimensions, though we shall
restrict consideration to three variables in this section.
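In the equations we have obtained, such as (30.1) and (30.8), there
repeatedly occurs the triplet of elements $\partial f/\partial x$, $\partial f/\partial y$, $\partial f/\partial z$, added
together with various multipliers. We can manipulate this group as
a unit by regarding

$$\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) \quad \text{or} \quad \mathbf{i}\frac{\partial f}{\partial x} + \mathbf{j}\frac{\partial f}{\partial y} + \mathbf{k}\frac{\partial f}{\partial z}$$

as a vector function (the gradient of f, now in three dimensions)
and denote it by

$$\operatorname{grad} f \quad \text{or} \quad \nabla f,$$

as before. As in Section 28.6, we can also think of grad or $\nabla$ standing
alone as an operator: an instruction to carry out the process

$$\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z} \quad \text{or} \quad \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$$

on some scalar function f(x, y, z). The definition is stated for
reference as follows.

Gradient vector function (three dimensions)

For a scalar function f(x, y, z):

$$\operatorname{grad} f \ \text{or}\ \nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = \mathbf{i}\frac{\partial f}{\partial x} + \mathbf{j}\frac{\partial f}{\partial y} + \mathbf{k}\frac{\partial f}{\partial z}. \qquad (30.9)$$

Alternatively, grad or $\nabla$ stands for the operator

$$\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \quad \text{or} \quad \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}.$$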

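Example 30.3. Let $f(x, y, z) = x^2 + y^2 + z^2$. Obtain (a) the vector function
grad f(x, y, z); (b) the value of grad f(x, y, z) at the point (1, 2, 3); (c) an expression
for the magnitude (or length) of grad f(x, y, z).

(a) $\operatorname{grad} f(x, y, z) = \left(\dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z}\right) = (2x, 2y, 2z)$;
or one can use the 'operator' idea and the other way of writing a vector:

$$\left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)(x^2 + y^2 + z^2) = \mathbf{i}(2x) + \mathbf{j}(2y) + \mathbf{k}(2z).$$

(b) At x = 1, y = 2, z = 3,

$$\operatorname{grad} f = (2, 4, 6).$$

(c) The magnitude or length $|\mathbf{v}|$ of a vector $\mathbf{v} = (a, b, c)$ is $|\mathbf{v}| = (a^2 + b^2 + c^2)^{1/2}$,
so

$$|\operatorname{grad} f| = [(2x)^2 + (2y)^2 + (2z)^2]^{1/2} = 2(x^2 + y^2 + z^2)^{1/2}.$$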

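Expressions which occur in the theory frequently take the form

$$U\frac{\partial f}{\partial x} + V\frac{\partial f}{\partial y} + W\frac{\partial f}{\partial z}, \qquad (30.10)$$

where U, V, and W may be constants, or various functions. If we put

$$\mathbf{i}U + \mathbf{j}V + \mathbf{k}W = \mathbf{S},$$

where $\mathbf{S}$ is another vector (compare Section 28.6), then we can write
(30.10) in the form

$$U\frac{\partial f}{\partial x} + V\frac{\partial f}{\partial y} + W\frac{\partial f}{\partial z} = (U, V, W)\cdot\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = \mathbf{S}\cdot\operatorname{grad} f,$$

as in the following example.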

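Example 30.4. Suppose that the concentration of plankton in the sea is
C(x, y, z, t). A whale travels on the path x = x(t), y = y(t), z = z(t), where t is
time. Show that, on the path of the whale,

$$\frac{dC}{dt} = \frac{\partial C}{\partial t} + \mathbf{v}\cdot\operatorname{grad} C,$$

where $\mathbf{v}$ is its velocity.

By the chain rule (30.7),

$$\frac{dC}{dt} = \frac{\partial C}{\partial x}\frac{dx}{dt} + \frac{\partial C}{\partial y}\frac{dy}{dt} + \frac{\partial C}{\partial z}\frac{dz}{dt} + \frac{\partial C}{\partial t}$$

after putting dt/dt = 1 into the final term. The whale's velocity is

$$\mathbf{v} = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right),$$

so that

$$\frac{dC}{dt} = \frac{\partial C}{\partial t} + \mathbf{v}\cdot\operatorname{grad} C.$$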

(If the whale drifted with the motion of the sea, v would represent the velocity
of the current. This case is related to the concept of material derivative in fluid
mechanics. Instead of C there is a quantity such as the density or momentum
of a particular piece of fluid, whose variation we follow as the fluid moves
around.)

30.5 Normal to a surface


An equation of the form
g(x, y,z) = k
represents a surface in three dimensions, because we can imagine
‘solving’ the equation for z in order to obtain equivalent equation(s):
z = f(x, y).
Thus, if $x^2 + y^2 + z^2 = 1$, then $z = \pm(1 - x^2 - y^2)^{1/2}$. The normal
to the surface can be expressed neatly as follows.

Normal (perpendicular) to a surface

Let P be any point on a surface g(x, y, z) = k. Then
grad g, evaluated at P, is normal to the surface at P. (30.11)

(Compare (28.9), for the normal to a curve in two dimensions.) The
proof is as follows. In Fig. 30.1, P : (x, y, z) is the given point on the
surface and $Q : (x + \delta x, y + \delta y, z + \delta z)$ is any nearby point on the
surface. Then

$$g(x + \delta x, y + \delta y, z + \delta z) - g(x, y, z) = 0,$$

or

$$\delta g = 0.$$

Therefore, by the incremental formula (30.1),

$$0 \approx \frac{\partial g}{\partial x}\,\delta x + \frac{\partial g}{\partial y}\,\delta y + \frac{\partial g}{\partial z}\,\delta z = (\operatorname{grad} g)\cdot(\delta x, \delta y, \delta z).$$

This shows that grad g is perpendicular to the vector $(\delta x, \delta y, \delta z)$.
But PQ can be chosen to point in any direction from P in the surface,
so the only possibility is that grad g is perpendicular to the surface
itself at P.
We already know (from (27.7)) that a vector normal to a
surface described in the form z = f(x, y) is

$$\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, -1\right).$$

This is reconciled with (30.11) if we write its equation in the form

$$g(x, y, z) = f(x, y) - z = 0.$$

30.6 Equation of the tangent plane


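Suppose that a surface is specified in the form g(x, y, z) = k, and
that the point P : (a, b, c) is on the surface. The conditions that the
tangent plane must satisfy are (a) it contains the point P; and
(b) it is perpendicular to the normal vector grad g(x, y, z) evaluated
at P, as in (30.11). These conditions are satisfied by the equation

$$(x - a)\left(\frac{\partial g}{\partial x}\right)_P + (y - b)\left(\frac{\partial g}{\partial y}\right)_P + (z - c)\left(\frac{\partial g}{\partial z}\right)_P = 0. \qquad (30.12)$$

It can be seen that the expression is zero when x = a, y = b, z = c.
Also the coefficient vector

$$\left(\left(\frac{\partial g}{\partial x}\right)_P, \left(\frac{\partial g}{\partial y}\right)_P, \left(\frac{\partial g}{\partial z}\right)_P\right) = [\operatorname{grad} g]_P$$

is perpendicular to the plane. Therefore we may state the equation
of the plane as follows.

Tangent plane to the surface g(x, y, z) = k at P : (a, b, c)

$$(x - a)\left(\frac{\partial g}{\partial x}\right)_P + (y - b)\left(\frac{\partial g}{\partial y}\right)_P + (z - c)\left(\frac{\partial g}{\partial z}\right)_P = 0. \qquad (30.13)$$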

30.7 Directional derivative in terms of gradient


The vector grad f contains the necessary information to calculate
the rate of change of f(x, y, z) in any direction. In Fig. 30.2, let
P : (a, b, c) be any point. Suppose that we require the rate of change
with distance of f(x, y, z) in the direction PR.
Choose a nearby point $Q : (a + \delta x, b + \delta y, c + \delta z)$ on PR, and
put

$$PQ = \delta s = (\delta x^2 + \delta y^2 + \delta z^2)^{1/2}$$

(where $\delta s$ is a standard symbol for a small element of distance).
Then

$$\frac{\delta x}{\delta s} = \cos\alpha, \qquad \frac{\delta y}{\delta s} = \cos\beta, \qquad \frac{\delta z}{\delta s} = \cos\gamma,$$

where $\cos\alpha$, $\cos\beta$, $\cos\gamma$ are the direction cosines of PQ (Section
10.5). Now divide the incremental approximation (30.1) through by
$\delta s$ and take the limit as $\delta s \to 0$. We obtain an expression for the
rate of change of f(x, y, z) with distance in any direction:

Directional derivative in three dimensions

In the direction having direction cosines $(\cos\alpha, \cos\beta, \cos\gamma)$:

$$\frac{df}{ds} = \frac{\partial f}{\partial x}\cos\alpha + \frac{\partial f}{\partial y}\cos\beta + \frac{\partial f}{\partial z}\cos\gamma. \qquad (30.14)$$

In the two-dimensional version (28.4), the coefficients $\cos\theta$ and $\sin\theta$
are equal to the two-dimensional direction cosines, $\cos\theta$ and
$\cos(\tfrac12\pi - \theta)$, so (28.4) is compatible with (30.14).

The direction cosines $\cos\alpha$, $\cos\beta$, $\cos\gamma$ have the property
$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$ (see Section 10.5), so they are the components
of a unit vector $\hat{\mathbf{s}}$ which points in the desired direction. Therefore
(30.14) can be written differently:

Directional derivative in three dimensions in terms of the gradient

In the direction of the unit vector $\hat{\mathbf{s}}$,

$$\frac{df}{ds} = \hat{\mathbf{s}}\cdot\operatorname{grad} f, \qquad (30.15)$$

which is the component of grad f in direction $\hat{\mathbf{s}}$.

As in Section 28.6, the result (30.15) can be expressed in a third
way. If $\mathbf{a}$ and $\mathbf{b}$ are two vectors, and $\phi$ is the angle between them,
then $\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\phi$. Putting $\hat{\mathbf{s}}$ for $\mathbf{a}$ and grad f for $\mathbf{b}$ in (30.15),
and using the fact that $|\hat{\mathbf{s}}| = 1$, we obtain the next result.

Directional derivative in three dimensions

$$\frac{df}{ds} = |\operatorname{grad} f|\cos\phi, \qquad (30.16)$$

where $\phi$ is the angle between grad f and the unit
direction vector $\hat{\mathbf{s}}$.

Now take a function f(x, y, z), and a point $P : (x_1, y_1, z_1)$ as in
Fig. 30.3. By means of (30.16), we can explore the rate of variation
of f(x, y, z) in all directions, by pointing $\hat{\mathbf{s}}$ in the required directions.
The only thing that changes when we do this is the angle $\phi$. It can
be seen from (30.16) that (i) if $\phi = \tfrac12\pi$, then df/ds = 0, which is
consistent with $\hat{\mathbf{s}}$ pointing tangentially to the surface $f(x, y, z) = f_P$,
where $f_P$ is the value of f at P; (ii) df/ds takes its maximum
value |grad f| when $\phi = 0$. That is to say, grad f points in the
direction of most rapid increase of f; i.e. it is normal to the surface
$f(x, y, z) = f_P$.
It is worth noticing that, for a fixed angle $\phi$, the unit vector $\hat{\mathbf{s}}$
may point anywhere along the generators of a cone having axis
grad f, as shown in Fig. 30.3. The directional derivative df/ds is the
same in all these directions.
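
The two observations (i) and (ii) can be tried out numerically. The sketch below assumes the function $f = x^2 + y^2 + z^2$ of Example 30.3 at the point (1, 2, 3), where grad f = (2, 4, 6); the helper names `unit` and `directional_derivative` are illustrative.

```python
import math

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = x^2 + y^2 + z^2 (Example 30.3)."""
    return (2 * x, 2 * y, 2 * z)

def unit(v):
    """Unit vector in the direction of v."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def directional_derivative(grad, direction):
    """df/ds = s . grad f, formula (30.15), with s the unit vector along direction."""
    s = unit(direction)
    return sum(g * c for g, c in zip(grad, s))

g = grad_f(1.0, 2.0, 3.0)                                 # (2, 4, 6)

steepest = directional_derivative(g, g)                   # phi = 0: equals |grad f|
tangential = directional_derivative(g, (2.0, -1.0, 0.0))  # phi = pi/2: equals 0
along_x = directional_derivative(g, (1.0, 0.0, 0.0))      # equals df/dx = 2

print(steepest, tangential, along_x)
```

The direction (2, −1, 0) was picked because its scalar product with (2, 4, 6) vanishes, so it is tangential to the level surface through the point.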

Example 30.5. Let $f(x, y, z) = 4 - x^2 - \tfrac12 y^2 - \tfrac12 z^2$ represent the atmospheric
concentration of a chemical which attracts insects. (a) Write down grad f at
(x, y, z). (b) Find a unit vector $\hat{\mathbf{s}}$ which points in the direction of most rapid rate
of increase in f(x, y, z) at the point (1, 1, 1). (c) An insect sets off from (1, 1, 1)
and flies a short distance $\delta s$ in the direction given by (b). Find its new coordinates
(approximately).

(a) $\operatorname{grad} f(x, y, z) = \left(\dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z}\right) = (-2x, -y, -z)$, or $-2x\mathbf{i} - y\mathbf{j} - z\mathbf{k}$.

(b) By (ii) above, grad f always points in the required direction; at the point
(1, 1, 1), its components are (−2, −1, −1). To obtain the corresponding unit
vector $\hat{\mathbf{s}}$, divide by the length $[(-2)^2 + (-1)^2 + (-1)^2]^{1/2} = \sqrt6$, obtaining
$\hat{\mathbf{s}} = (-2/\sqrt6, -1/\sqrt6, -1/\sqrt6)$.
(c) The insect moves a distance $\delta s$ from the point P : (1, 1, 1) along $\hat{\mathbf{s}}$
(see Fig. 30.4), so its vector displacement PQ is

$$\hat{\mathbf{s}}\,\delta s = \left(-\frac{2}{\sqrt6}\,\delta s, -\frac{1}{\sqrt6}\,\delta s, -\frac{1}{\sqrt6}\,\delta s\right).$$

The components of this vector are the x, y, z displacements

$$\delta x = -\frac{2}{\sqrt6}\,\delta s, \qquad \delta y = -\frac{1}{\sqrt6}\,\delta s, \qquad \delta z = -\frac{1}{\sqrt6}\,\delta s.$$

The new coordinates are therefore

$$x = 1 - \frac{2}{\sqrt6}\,\delta s, \qquad y = 1 - \frac{1}{\sqrt6}\,\delta s, \qquad z = 1 - \frac{1}{\sqrt6}\,\delta s.$$
Example 30.6. Using Example 30.5(c) as a model, give a systematic method,
suitable for computation, for approximating to the path of an insect which always
flies in the direction of most rapidly increasing concentration.

Figure 30.5 shows notionally the path of such an insect. It starts at $P_0 : (x_0, y_0, z_0)$,
and the path consists of short steps of equal length h (instead of $\delta s$, for the
purpose of programming). The progression is $P_0, P_1, P_2, \ldots, P_n, P_{n+1}, \ldots$, with
coordinates numbered $(x_n, y_n, z_n)$ for n = 0, 1, 2, .... At each $P_n$, the insect moves
in the unit-vector direction $\hat{\mathbf{s}}_n$, where $\hat{\mathbf{s}}_n$ is parallel to grad f(x, y, z) and, as in
Example 30.5(c),

$$\hat{\mathbf{s}}_n = [(\operatorname{grad} f)/|\operatorname{grad} f|]_{P_n} = (-2x_n, -y_n, -z_n)/(4x_n^2 + y_n^2 + z_n^2)^{1/2} = (a_n, b_n, c_n) \ \text{(say)}. \qquad (30.17)$$

For the general step from $P_n$ to $P_{n+1}$, we obtain the small displacement
components $\delta x_n$, $\delta y_n$, $\delta z_n$ in the x, y, and z directions (in general these will differ
from step to step):

$$(\delta x_n, \delta y_n, \delta z_n) = \hat{\mathbf{s}}_n h = (a_n h, b_n h, c_n h),$$

from (30.17). Therefore

$$(x_{n+1}, y_{n+1}, z_{n+1}) = (x_n + \delta x_n, y_n + \delta y_n, z_n + \delta z_n) = (x_n + a_n h, y_n + b_n h, z_n + c_n h). \qquad (30.18)$$

Equations (30.17)-(30.18), with the starting point $(x_0, y_0, z_0)$ given, form a
step-by-step process which is easy to computerize. The following table of the
early stages was calculated with h = 0.05; the starting point in this case is the
point (1, 1, 1), where $f(x, y, z) = 4 - x^2 - \tfrac12 y^2 - \tfrac12 z^2$ as in Example 30.5.

n    x_n     y_n     z_n
0    1       1       1
1    0.959   0.980   0.980
2    0.919   0.959   0.959
3    0.878   0.938   0.938
4    0.839   0.917   0.917
5    0.799   0.895   0.895
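
The step-by-step process (30.17)-(30.18) can be sketched in a few lines. This is one possible implementation under the stated data (start (1, 1, 1), h = 0.05, and the function of Example 30.5); the function name `ascent_path` is illustrative.

```python
import math

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = 4 - x^2 - y^2/2 - z^2/2 (Example 30.5)."""
    return (-2 * x, -y, -z)

def ascent_path(start, h, steps):
    """Repeat (30.17)-(30.18): move a distance h along grad f / |grad f|."""
    points = [tuple(start)]
    for _ in range(steps):
        p = points[-1]
        g = grad_f(*p)
        norm = math.sqrt(sum(c * c for c in g))
        if norm == 0.0:      # grad f = 0: a stationary point, no direction defined
            break
        points.append(tuple(pc + h * gc / norm for pc, gc in zip(p, g)))
    return points

path = ascent_path((1.0, 1.0, 1.0), h=0.05, steps=5)
for n, (x, y, z) in enumerate(path):
    print(n, round(x, 3), round(y, 3), round(z, 3))
```

A fixed step length h keeps the method simple; near the maximum of f at the origin the path would begin to overshoot, so in practice h might be reduced as |grad f| becomes small.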


If a surface is defined by the equation

$$f(x, y, z) = c,$$

where c is a constant, it is called a level surface of the function
f (it is the analogy of a contour in the theory for functions of two
variables). According to (30.11), therefore, we can say in different
language:

Normal to a level surface of f(x, y, z)

grad f, evaluated at a point P, is perpendicular to the
level surface of f(x, y, z) through P. (30.19)

It follows that the insect in Example 30.6 crosses perpendicularly
all the level surfaces that it meets.

30.8 Stationary points


Stationary points (which include maxima and minima) are more
difficult to discuss in more than two dimensions, since we no longer
have the horizontal tangent plane to refer to. We should expect a
stationary point of $f(x, y, z, \ldots)$ to occur at any point Q where

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial f}{\partial z} = \cdots = 0, \qquad (30.20)$$

since all our previous formulae have been merely extended versions
of the two-dimensional case. To show that this criterion is the right
one, choose any path through Q, and suppose that we describe it
parametrically by

$$x = x(t), \quad y = y(t), \quad z = z(t), \ \ldots.$$

Then, if (30.20) holds at Q, the chain rule (30.7) together with (30.20)
gives

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \cdots = 0.$$

Therefore a turning point of $f(x(t), y(t), z(t), \ldots)$ is encountered at
Q on every path passing through Q, and this is what we should wish
to happen for the point Q to be described as stationary.

Stationary points of f(x, y, z, ...)

The stationary points are the solutions (x, y, z, ...) of
the equations

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial f}{\partial z} = \cdots = 0. \qquad (30.21)$$

Example 30.7. Find the stationary points of the function

$$f(x, y, z) = x^2 + y^2 + z^2 - xy - 2yz - zx - z.$$

The conditions (30.21) become

$$\partial f/\partial x = 2x - y - z = 0,$$
$$\partial f/\partial y = -x + 2y - 2z = 0,$$
$$\partial f/\partial z = -x - 2y + 2z - 1 = 0.$$

These are solved by $x = -\tfrac12$, $y = -\tfrac58$, $z = -\tfrac38$.
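
The three linear equations of Example 30.7 can be solved exactly as a check. The sketch below uses Cramer's rule with exact fractions; this is only one convenient way of solving a 3 × 3 system, and the helper names `det3` and `solve3` are illustrative.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(A, rhs):
    """Solve A x = rhs by Cramer's rule, returning exact fractions."""
    D = det3(A)
    if D == 0:
        raise ValueError("singular system: no unique stationary point")
    solution = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = rhs[r]      # replace one column by the right-hand side
        solution.append(Fraction(det3(M), D))
    return solution

# 2x - y - z = 0,  -x + 2y - 2z = 0,  -x - 2y + 2z = 1
A = [[2, -1, -1], [-1, 2, -2], [-1, -2, 2]]
rhs = [0, 0, 1]
x, y, z = solve3(A, rhs)
print(x, y, z)                      # -1/2 -5/8 -3/8
```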

Restricted stationary-value problems (see Section 29.2) may occur
in any number of dimensions. In three dimensions, the restriction
may either be to values of f(x, y, z) on some given curve, or to values
on some given surface.
To help visualize a three-dimensional situation, suppose that a
fish swims through a field of pollution of density P = f(x, y, z). At
some point in the sea the pollution is at an overall maximum, but
this is of no concern to the fish if it does not swim through it.
However, it will notice highs and lows along its own path even if
there is nothing special about such points from an overall viewpoint.
These are restricted maxima and minima on the fish's path. Suppose
that the path of the fish is expressed parametrically:

$$x = x(t), \quad y = y(t), \quad z = z(t).$$

Then the stationary points peculiar to the path are where df/dt = 0.
By the chain rule (30.7), these are the points where

$$\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = 0.$$

When written in terms of t, this is an equation giving the critical
values of t. (We must be careful to avoid a parametrization such
that dx/dt = dy/dt = dz/dt = 0 at some point on the path: at such
a point, a nonexistent stationary point would be predicted.)

Stationary points of f(x, y, z) on the path (x(t), y(t), z(t))

The stationary points are the solutions of

$$\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = 0. \qquad (30.22)$$

(In particular cases it might be easier to substitute x(t), y(t), z(t)
directly into f(x, y, z) for the turning points of f with respect to t.)
It is more usual for restricted stationary-value problems to be
formulated in a way that avoids parametric considerations. The
restriction to a surface is the easier case. Instead of a fish in the
body of the sea, consider a crab which confines itself to the
undulating seabed described by an equation of the form

$$g(x, y, z) = c,$$

encountering there the local pollution, given throughout the sea by
f(x, y, z). The crab does not know about the rest of the sea, but as
it moves around it will meet highs and lows (and other stationary
points) unconnected with possibly more extreme pollution in the
body of the sea. A stationary point will be found at a point Q on
the surface g(x, y, z) = c if

$$\frac{df}{ds} = 0 \quad \text{at } Q$$

in all directions $\hat{\mathbf{s}}$ from Q which do not point into the body of the
sea, but are tangential to the surface g(x, y, z) = c.
Figure 30.6 shows such a point Q, and various tangential
directions denoted by unit vectors $\hat{\mathbf{s}}$ pointing away from Q. From
(30.15), the required condition is

$$\frac{df}{ds} = 0 = \hat{\mathbf{s}}\cdot\operatorname{grad} f \quad \text{at } Q, \text{ for all such } \hat{\mathbf{s}}. \qquad (30.23)$$

In other words, grad f must be perpendicular to the surface at Q
(ignoring for the moment the chance that grad f might be zero at
Q). But, by (30.11), grad g is always perpendicular to the surface
g(x, y, z) = c, in particular at Q. Therefore grad f and grad g,
evaluated at Q, are parallel vectors; so

$$\operatorname{grad} f = \lambda \operatorname{grad} g \quad \text{at } Q,$$

where $\lambda$ is an (unknown) constant, called a Lagrange multiplier
for the problem. By writing grad f and grad g in their components,
we obtain

$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0, \qquad \frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0. \qquad (30.24\text{a,b,c})$$

We now have three equations for the four unknowns: (x, y, z) (the
position of Q) and $\lambda$. To find another equation, notice that (30.24a,b,c)
would be unaffected if we had g(x, y, z) equal to some constant other
than c, so it is necessary to reassert the particular surface:

$$g(x, y, z) = c. \qquad (30.24\text{d})$$

The very special case mentioned above, that (30.23) is satisfied
alternatively if (by chance) grad f = 0 at Q, is still governed by the
same equations. When they are solved, we should merely find that
$\lambda = 0$. (We should not usually realize this in advance.) The case
corresponds to the unrestricted stationary-point problem (see
(30.20)), where the point found happens to lie in the specified surface.
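
Restricted stationary-point problem: stationary points of f(x, y, z)
subject to g(x, y, z) = c

Solve for x, y, z, $\lambda$ the equations (30.25):

$$g = c, \qquad \text{(i)}$$
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0, \qquad \text{(ii)}$$
$$\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0, \qquad \text{(iii)}$$
$$\frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} = 0. \qquad \text{(iv)}$$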

Example 30.8. Find the stationary points of $x^2 + y^2 + yz + zx$ on the
hyperboloid $x^2 + y^2 - z^2 = 1$.

The four equations (30.25) are

$$x^2 + y^2 - z^2 = 1, \qquad \text{(i)}$$
$$2x + z - 2\lambda x = 0, \ \text{or}\ (2 - 2\lambda)x + z = 0; \qquad \text{(ii)}$$
$$2y + z - 2\lambda y = 0, \ \text{or}\ (2 - 2\lambda)y + z = 0; \qquad \text{(iii)}$$
$$y + x + 2\lambda z = 0, \ \text{or}\ x + y + 2\lambda z = 0. \qquad \text{(iv)}$$

Equations (ii), (iii), and (iv) constitute a set of homogeneous linear algebraic
equations for x, y, z. The only possibilities are either that x = y = z = 0, which
is excluded since these values do not satisfy (i), or that the determinant of the
coefficients is zero:

$$\det\begin{pmatrix} 2 - 2\lambda & 0 & 1 \\ 0 & 2 - 2\lambda & 1 \\ 1 & 1 & 2\lambda \end{pmatrix} = 0,$$

so that $(1 - \lambda)(2\lambda^2 - 2\lambda + 1) = 0$. The only real solution is

$$\lambda = 1.$$

The equations then become

$$z = 0, \quad z = 0, \quad x + y + 2z = 0,$$

or

$$z = 0, \quad y = -x. \qquad \text{(v)}$$

Substitute for y in terms of x into (i). Then

$$2x^2 = 1, \quad \text{or} \quad x = \pm 1/\sqrt2.$$

Therefore, using (v) again gives the stationary points

$$(\pm 1/\sqrt2, \mp 1/\sqrt2, 0).$$
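
A direct substitution check of this example is sketched below. It assumes the multiplier value λ = 1 and the two candidate points with y = −x, z = 0 and x = ±1/√2, and simply evaluates the residuals of the four equations (30.25); the function names are illustrative.

```python
import math

def grad_f(x, y, z):
    """Gradient of f = x^2 + y^2 + yz + zx."""
    return (2 * x + z, 2 * y + z, x + y)

def g(x, y, z):
    """Constraint function: the hyperboloid is g = 1."""
    return x * x + y * y - z * z

def grad_g(x, y, z):
    return (2 * x, 2 * y, -2 * z)

def residuals(point, lam):
    """Left-hand sides of (30.25): g - 1 and the components of grad f - lam grad g."""
    gf = grad_f(*point)
    gg = grad_g(*point)
    return [g(*point) - 1.0] + [a - lam * b for a, b in zip(gf, gg)]

r = 1 / math.sqrt(2)
candidates = [(r, -r, 0.0), (-r, r, 0.0)]

# Largest residual over both candidate points; it should vanish to rounding error.
worst = max(abs(v) for p in candidates for v in residuals(p, 1.0))
print(worst)
```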

For the corresponding problem of finding the stationary points
of f(x, y, z) on a specified curve, as for the fish problem discussed
at the beginning of the section, the curve will be assumed to be
specified by the intersection of two surfaces:

$$g(x, y, z) = c_1, \qquad h(x, y, z) = c_2.$$

The method of solution involves two Lagrange multipliers:

Restricted stationary-point problem: stationary points of f(x, y, z)
subject to $g(x, y, z) = c_1$ and $h(x, y, z) = c_2$

Solve for x, y, z, $\lambda$, $\mu$ the equations (30.26):

$$g = c_1, \qquad h = c_2, \qquad \text{(i), (ii)}$$
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} - \mu\frac{\partial h}{\partial x} = 0, \qquad \text{(iii)}$$
$$\frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} - \mu\frac{\partial h}{\partial y} = 0, \qquad \text{(iv)}$$
$$\frac{\partial f}{\partial z} - \lambda\frac{\partial g}{\partial z} - \mu\frac{\partial h}{\partial z} = 0. \qquad \text{(v)}$$

We shall not give the proof in full. Briefly, the situation is shown in
Fig. 30.7. Q is a stationary point on the curve of intersection and $\hat{\mathbf{s}}$
a unit vector tangential to it at Q. Since

$$\frac{df}{ds} = \hat{\mathbf{s}}\cdot\operatorname{grad} f = 0 \quad \text{at } Q,$$

grad f is perpendicular to $\hat{\mathbf{s}}$ at Q. For the same reason as in the
earlier case, grad g and grad h are also perpendicular to $\hat{\mathbf{s}}$ at Q.
Therefore the three vectors grad f, grad g, grad h all lie in the same
plane (which is perpendicular to $\hat{\mathbf{s}}$), so grad f can be expressed in
terms of the other two vectors:

$$\operatorname{grad} f = \lambda \operatorname{grad} g + \mu \operatorname{grad} h,$$

where $\lambda$ and $\mu$ are certain constants, the Lagrange multipliers for
this problem. Then split this equation into its components to obtain
(30.26).

Example 30.9. Find the stationary points of $x^2 + y^2 + z^2$ on the curve of
intersection of the vertical cylinder $x^2 + y^2 = 1$ with the plane x + y + z = 1.
(This is an inclined ellipse.)

Here $f(x, y, z) = x^2 + y^2 + z^2$, $g(x, y, z) = x^2 + y^2$, and $h(x, y, z) = x + y + z$.
The equations to be solved become

$$x^2 + y^2 = 1, \qquad \text{(i)}$$
$$x + y + z = 1, \qquad \text{(ii)}$$
$$2x - 2\lambda x - \mu = 0, \ \text{or}\ 2x(1 - \lambda) = \mu, \qquad \text{(iii)}$$
$$2y - 2\lambda y - \mu = 0, \ \text{or}\ 2y(1 - \lambda) = \mu, \qquad \text{(iv)}$$
$$2z - \mu = 0. \qquad \text{(v)}$$

From (iii) and (iv), either (a) $\lambda = 1$, so that $\mu = 0$, or (b) $\lambda \ne 1$, so that x = y.
We consider these possibilities in order.
(a) The case $\lambda = 1$, $\mu = 0$. (We cannot deduce anything about x and y from
(iii) and (iv) if this is true.) From (v) we obtain z = 0, so (i) and (ii) become

$$x^2 + y^2 = 1, \qquad x + y = 1.$$

The solutions are x = 0, y = 1, and x = 1, y = 0. Then we have found two
solutions:

$$(0, 1, 0) \quad \text{and} \quad (1, 0, 0).$$

(b) The case $\lambda \ne 1$, x = y. From (i), $x = \pm 1/\sqrt2$, $y = \pm 1/\sqrt2$. Equation (ii)
then gives $z = 1 - x - y = 1 \mp \sqrt2$. Thus we have two more solutions:

$$(1/\sqrt2, 1/\sqrt2, 1 - \sqrt2) \quad \text{and} \quad (-1/\sqrt2, -1/\sqrt2, 1 + \sqrt2).$$

For a restricted stationary-point problem in N variables, there may
be up to N − 1 restricting equations, or constraints, with the same
number of Lagrange multipliers. The equations to be solved then
follow the pattern of (30.25) and (30.26).
The identification of a maximum or minimum is usually of most
interest. The general question is difficult, but sometimes it is fairly
obvious. For instance, in the previous example the values of f are
restricted to a closed curve, so the values of f obtained make
it clear that the points (a) give minima of f, and points (b) give
maxima.

30.9 The envelope of a family of curves


Figure 30.8a shows the straight lines

$$y = \alpha - \alpha^2 x,$$

for several values of $\alpha$, which we call the parameter of the family
of straight lines. The 'boundary' of the family is starting to form
itself into a curve E, which is sketched in Fig. 30.8b. The reason
why the curve E is sharply defined is because all the straight lines
are tangential to it, and therefore reinforce it along its length. The
curve is called the envelope of the family $y = \alpha - \alpha^2 x$, where $\alpha$
is the parameter of the family.
The family does not have to consist of straight lines. Suppose that
the family is described by

$$f(x, y, \alpha) = 0.$$

To find the envelope (Fig. 30.9) consider two close values of the
parameter, $\alpha$ and $\alpha + \delta\alpha$, the corresponding curves of the family
being

$$f(x, y, \alpha) = 0 \quad \text{and} \quad f(x, y, \alpha + \delta\alpha) = 0.$$

The intersection point R is the point where

$$f(x, y, \alpha) = f(x, y, \alpha + \delta\alpha) \ (= 0).$$

Therefore, at the point R,

$$\frac{f(x, y, \alpha + \delta\alpha) - f(x, y, \alpha)}{\delta\alpha} = 0.$$

Now let $\delta\alpha \to 0$. Then R and Q come together at P on the envelope,
and this equation becomes

$$\frac{\partial f(x, y, \alpha)}{\partial\alpha} = 0, \qquad (30.27)$$

at P. Also P lies on the curve

$$f(x, y, \alpha) = 0. \qquad (30.28)$$

(We had not so far used the fact that f is zero rather than some other
constant.) If we eliminate $\alpha$ between (30.27) and (30.28), we obtain
an equation in x and y which describes the envelope.

Envelope of the family of curves $f(x, y, \alpha) = 0$, where $\alpha$ is a parameter

The result of eliminating $\alpha$ between the equations

$$f(x, y, \alpha) = 0 \quad \text{and} \quad \frac{\partial f}{\partial\alpha} = 0 \qquad (30.29)$$

contains the envelope.

(The solution might also include the track of other peculiarities.)

Example 30.10. Find the envelope of the family of straight lines $y = \alpha - \alpha^2 x$,
where $\alpha$ is the parameter. (See Fig. 30.8.)

Let $f(x, y, \alpha) = y - \alpha + \alpha^2 x$. Then

$$\frac{\partial f}{\partial\alpha} = -1 + 2\alpha x = 0.$$

Therefore

$$\alpha = 1/(2x). \qquad \text{(i)}$$

On the envelope, also

$$y - \alpha + \alpha^2 x = 0; \qquad \text{(ii)}$$

so, from (i), $y - 1/(2x) + 1/(4x) = 0$, or

$$y = 1/(4x),$$

which is a rectangular hyperbola (see Fig. 30.8b).
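
The tangency property that defines the envelope can be checked numerically for this family. The sketch below assumes the result y = 1/(4x) and tests, at a few sample values of x, that the member of the family with α = 1/(2x) both meets the hyperbola and has the same slope there; the sample points and function names are illustrative.

```python
def line_y(alpha, x):
    """Member of the family y = alpha - alpha^2 x."""
    return alpha - alpha * alpha * x

def envelope_y(x):
    """Candidate envelope y = 1/(4x)."""
    return 1.0 / (4.0 * x)

def envelope_slope(x):
    """Derivative of 1/(4x) with respect to x."""
    return -1.0 / (4.0 * x * x)

def check_tangency(x0):
    """Gap in height and in slope between the line alpha = 1/(2 x0) and the envelope."""
    alpha = 1.0 / (2.0 * x0)
    height_gap = line_y(alpha, x0) - envelope_y(x0)
    slope_gap = (-alpha * alpha) - envelope_slope(x0)   # the line has slope -alpha^2
    return height_gap, slope_gap

checks = [check_tangency(x0) for x0 in (0.25, 0.5, 1.0, 2.0)]
print(checks)
```

Both gaps vanish at every sample point, which is what it means for each line to touch the curve E rather than cross it at an angle.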

Problems

30.1. Write down the incremental approximation for 30.2. The distance d between two points {x„y„z,)
5/ in the following cases. and (x2,_y2,z2) in a plane is given by d2 = (x, — x2)2 +
(a) /(x, y, z) = 2x + 3y2 + 4z2 - 3; (y, — y2)2 + {z, — z2)2. Find approximately the change
(b) f{x, y, t) = (x2 + y2)_ie“'; from (1, 1,2), (1,2, 1) to (1.1, 0.9, 1.8), (0.9, 2.1, 1.1).
(c) f{r, 0,t) = e~'r cos 0;
(d) J\x, y, z, t) = x2 + y2 + z2 - t2;
30.3. Rx, R2, R2, R4 are resistances in a circuit whose
(e) fix,, y„ x2, y2) = (xj - x2)2 + (y, - yf)2\
overall resistance is R, arranged so that
(f) f(x,y,z, t)=il/r) e <-v2+F>/i Compare with the expres¬
sion for 8g when g{r, t) = (1/r) e^r2/l in polar coordinates. 1/R = 1/R4 + (Rj + R2)/iR\R2 + R2R2 + R2R,).

Find an expression for 5R in terms of 8-Rj, 8R2, 8R3, 8x, 8y, 8z at the points prescribed. Without solving for
and 8R4. z, deduce dz/dx and dz/dy at the points.
Suppose that initially Rl = 3, R2 = 10, R3 = 5, and (a) 2x — 3y ± 4z = 1 at points satisfying the condi¬
R4 = 10, and that R{ becomes 3.2 and R2 becomes 9.8. tion;
Estimate the change in R3 necessary if R is to remain (b) x2 + y2 + z2 = 14 at (1, 2, —3);
unaltered. (c) 4x3 + / + 9z3 - xyz2 = 13 at (1, 1, 1);
(d) x2 — z2 = 9 at x = 5, y = y0, z = 4. (this is a hyper¬
30.4. The equation 2x3 — 3x — 45 = 0 has a solution bolic cylinder).
x = 3. Find an approximate solution to the equation
2.lx3 - 2.9x - 47 = 0. 30.9. (a) Compare the result of using the chain rule
(30.7) with that of direct substitution in order to find
30.5. Estimate the maximum possible error and the df/dt when f(x, y, z) = xy/z and x = t, y = 4t, z — It.
corresponding percentage error in w for the following (b) The same parametrization as in (a), but with the
cases. function f(x, y, z) = sin(xy/z).
(a) w = yz + zx + xy, x = 2 (±0.1), y = 3 (±0.2), (c) Obtain an expression for df/dt on the path in
z = 1 (±0.1). (a) when f(x, y, z) = g(xy/z), g being any function, and
(b) w = (x — y)(y — z)(z — x), x=l(±0.1), y = confirm that it works with case (b). (Hint: express
2( ±0.1), z = 3( ±0.1). the result in terms of g'.)
(c) w = (x + y + z — f)_1, where it is known only that
x = 1.2, y = 2.9, z = 1.9, and f = 2.1 after rounding to
30.10. Cylindrical coordinates r, 8, z are shown in Fig.
one decimal place. Compare with the exact maximum
30.10. They are related to x, y, z by x = r cos 6, y =
and percentage errors.
r sin 9, z = z.

30.6. Estimate the maximum error and the maximum


percentage error for the following.
(a) (For c) c2 = a2 + b2 — lab cos A (the ‘cosine rule’
for a triangle ABC). Here a = 2 (±0.1),b = 4 (±0.1),
A = 135° (±2°). [Note A(c2) % 2c Ac.]
(b) (For d)

d2 = (xj - x2)2 + (j>x - y2f + (z! - z2)2,

where the measured values (xl5 y1; zf) = (1, 2, 1) and


(x2, y2, z2) = (2, 1, 1) have been rounded to one signi¬
ficant figure. [Note A(d2) % 2d Ad.']
(c) (For A) The area of a triangle with sides a, b, c
is given by

A = [s(s — a)(s — b)(s — c)]±

where s = 2(a + b + c). Consider the case when a = 2,


b = 4, c = 3, all with possible errors as large as ±0.1.
[You can substitute s directly into the formula for A
or A1, but it is easier algebraically to obtain two
simultaneous equations, with numerical coefficients, in¬
volving A A, As, A a, A b, Ac.]

30.7. (Section 24.2). If f(x, y, z,w) = c (a constant), then (a) Given f(x,y,z), use the chain rule (30.8) with r,
any of the four variables is a function of the other 9, z as the parameters to express df/dr, df/dO, df/dz
three. Use (30.6) to show that in terms of df/dx, df/dy, df/dz.
ox dy ox dy dx (b) Regarding (a) as a pair of equations for df/dx
(a)-= 1; (b)-= - —.
dy dx dy dz dz and df/dy, show that
dx dy dz dw sin 9 df
df adf
(c) Simplify-• and
dy dz dw dx dx dr
Test the truth of the results in the cases: df . a8f cos 6 df
(i) x + 2y + 3z + 4w = 5; (ii) xy2z3w = 1.
dy dr r 50'

30.8. Assume that the following relations define z implicitly (c) The results (b) show that the differentiation opera-
as a function of x and y. Write down the relation between tions d/dx and d/dr are equivalent respectively to the
540 Mathematical techniques

polar forms 30.16. Find df/ds for the following functions /, taken
at the point (2, 3, 2) in the direction § = i^j2, ^2,
. d sin 0 d . c cos 0 d
cos 0-— and sin (/-1— . W3).
dr r dO dr r dO (a) x - y + 2z; (b) xy + yz + zx;
(c) (xy + yz + zx)2;
Use this fact to confirm that
(d) x2 — y2 + 5 (in three dimensions: this represents
dl + dl = Bl + lJl + Ldl a vertical cylinder).
dx2 dy2 dr2 r dr r2 cO2
30.17. The equations for two surfaces, f(x, y, z) = a,
30.11. Obtain the vector function grad / for each of g(x, y, z) = b, where a and b are constants, together
the following. represent their curve of intersection, C. Show that the
(a) .v + v + z; (b) 2x — 3y + 5z — 6; vector product grad / x grad g, evaluated at a point on
(c) x2+y2 + z2; (d) x3 + 3z3—1 (in three dimensions); C, points in the direction of C. Use this to find a unit
(e) x2 — \y2 + gz2; vector § in the direction of C in the following cases;
(f) 1 jr, where r = (x2 + y2+z2)/ confirm that the gradient (a) 2x + 3y— z=l, x — y—z = 0, at any common point.
vector points in the direction of the position vector (b) x + y = 0, x — z = 0, at any common point.
(x, y, z). (c) x2 + y2 + z2 = 6, x — y + z = 0, at (1, 2, 1).
(d) x2 + (y - l)2 = 1, x2 + (y - 2)2 = 4, at x = 0, y = 0,
30.12. Obtain a vector which is normal to the follow¬ and any value of z. Explain what is happening here.
ing surfaces at the points specified, and construct a (e) xy + yz + zx = 3, x + y + z = 3, at (1, 1, 1).
unit vector from it:
(a) x — 2y + z = 0 at any point; 30.18. Find the stationary points of the following func¬
(b) y2 + z2 = 2 at any point; tions with respect to all the variables named in /:
(c) x~ + y~ + z~ = 9 at (2, 1, —2); (a) /(x, y, z) = x2 + y2 + z2;
(d) ±x2 + h'2 + TSZ2 = 3 at (2,3,4); (b) f{x, y, z) = x3 - 3x + y3 - 3yz + 2z2;
(e) x3v + zx3 = 5 at (1, 2, 3); (c) /(x, y, z) = xy A- yz A- zx A- y — z;
11 1 (d) f(x, y, z) = x/z + y/x + z/y;
(f) — + - H— = 1 at (2, 3, 6); (e) fix, y, z, /.) = (x + y + z) - 2(x2 + y2 + z2 - 1);
x y z
(f) f(x, y, z) = x4 + y4 + z4 - 2(x - y + z)2.
(g) (x2 + 4y2 — z2)-1 = jg at (4,1,2).

30.19. Find the stationary points of x2 A- y2 A- z2 on


30.13. By finding the gradient vectors, obtain the angle
the path
between the following surfaces at the point of intersection
given: x = cos r, y = sin t, z = sin \t,
(a) x2 + y2 + z2 = 9, x2 - z2 = 0 at (2, 1, 2); where 0 < f < An.
(b) x2 - y2 + z2 = 1, 2x - 3y + z + 1 = 0 at (2,2,1);
(c) x2 + y2 - z2 = 0, 3x + 4y - 5z = 50 at (3, 4, 5).
30.20. At the point (x, y, z) in the air, an insecticide
Explain the result.
maintains a concentration

30.14. (a) Find grad / for s = C exp{ —a[2(x — l)2 + 4y2 + z2]},

where C is a constant. An insect is trying to escape


by following the path of most rapid decrease in concentra¬
where A and a are constants. Deduce that the vector tion.
(2x, 4y, z) points in the direction of grad /. (a) Show that, when it is at (x, y, z), its direction is
(b) Let fix, y, z) = g[u(x, y, z)], where g and u are two that of the vector (2(x — 1), 4y, z).
other functions. Show that (b) In a short interval of time 51, it moves to (x + 8x,
, . f du du du\ y + Sy, z + 8z). Show that, approximately,
grad J = \g (u)—,g’(u) — ,g\u) — ,
\ cx cy dz J 8x Sy 8z

and deduce that grad u points either in the same or 2(x - 1) _ 4y ~ 7'
in the opposite direction to grad /. (c) By letting 8f -*• 0, show that its path is described
by the two differential equations
30.15. Write down expressions for the directional deriva¬
dz z dz z
tive of the following at the point (x, y, z), in terms of a
unit direction vector s. dx 2(x — 1) ’ dy 4y
(a) x + 2y -I- 3z; (b) x2 — y2 — 3z; (Such simultaneous equations would often be written
(c) (x - l)3 + y3 + z3. dx/2(x — 1) = dy/4y = dz/z.)
Functions of any number of variables 541

(d) Show that the general solution of these equations, x2 + y2 = 1 when


which expresses a path in space, can be written as
0(x, y) = xl 1 + ——-
z = Ay* = B(x - 1)*,
' x2 + y2
where A and B are arbitrary constants.
(e) Assuming that the insect starts at (0, 1, 1), find its 30.23. Often a function /(x, y, z) takes the form of ‘a
path. function of a function’: w = f(x, y, z) = g(u(x, y, z)). (An
example is
30.21. Use the Lagrange-multiplier technique of w = /(x, y, z) = sin xyz : u = xyz, w = sin a.)
(30.25)—(30.26) to solve the following restricted sta¬
(a) Write down several examples of functions which
tionary-point (SP) problems.
can be regarded in this way.
(a) SPs of .x + y + z subject to l/x + 1 /y + 1/z = 1.
(b) Show that
(b) SPs of xyz subject to l/x + 1 /y + 1/z = 1.
(c) SPs of x2 + y2 + z2 subject to ax + by + cz = 1. df du df du
— = <:/(«)—, — = g'(u)—,
(d) SPs of xj’ -b yz + zx subject to xyz = 1. (This corre¬ ox ox dy dy
sponds to finding the rectangular block of given volume
(You only need the one-variable chain rule (3.3).)
which has the smallest surface area.)
(c) Check the correctness of the,formulae (b) in the
(e) The problem of the rectangular block of greatest
cases when (i) w = eJc2-,'2+z:!, (ii) w = sin(xy/z).
volume which can be fitted into an ellipsoid leads to
(d) Using the results (b), rewrite the chain rule (30.7)
the problem: find the SPs of xyz subject to x2/a2 +
in the form appropriate to functions of the form g(u(x, y, z)).
y /b~ + z~/c- = 1.
(e) The path x = cos f, y = sin t, z = t represents a helix
(f) SPs of x2 + 4y2 + z2 on the intersection of the
whose axis is the z axis. Find an expression in terms of
two planes x — y — 2z = 0 and z = 1.
t for df/dt on the path when /(x, y, z) = g(xy/z), where
(g) SPs of x2 — y2 — z2 on the straight line
g is any function. Confirm the result for any simple case.
x — 1 _y — 2_z — 2
30.24. Often a function takes the form
2 -1 3
w = f(u, v, vv),
(h) SPs of xyz subject to xy + yz + zx = 1. (Compare
(d).) where u, v, and w are themselves functions of x, y, and
(i) SPs of x — y — 2z on the intersection of z = 1 with z. Write down a version of the chain rule (24.8) which
x2 + 4y2 -b z2 = 6. (Compare (f).) enables df/dx, of/by, df/dz to be found. (In this case,
x, y, and z function like parameters and u, v, and vv
30.22. (Numerical), (a) Write a program to carry out like the principal variables.) Use this result to prove the
the numerical scheme suggested in Example 30.6 in following results:
two dimensions in order to obtain the curves along (a) If w = f(x — y, y — z, z — x), where / is any func¬
which a function fix, y) increases most rapidly. It may tion, then
also be possible for you to display the curves on the 5w dw dw
— + — + — = 0.
screen. *■< ^
ox oy cz
(b) Use (a) to obtain a numerical solution of the following
Check your result with the function
problems. Try a succession of decreasing step lengths h
in order to ensure plotting accuracy. (x - y)(y - z)(z - x).
(c) The altitude H of a part of a hill is given in km by (b) If w = /(y/x, z/x), where / is any function of two
variables, then
H = 0.5 - x2 - Ay2.
dw dw dw
Start e.g. with (x, y) = (2, 2), and go to the summit. x-by-bz — = 0.
(For comparison, the exact solution to the problem ox oy oz
is y = ^x4; the summit is at the origin.) Try it out with the function x/y + y/z + z/x, noting that
(d) Plot the track of most rapid descent from the point
(3, 2) in the case where H = \ -b x2 - y2. (To descend,
use negative h. The shape is a saddle: viewed from the
origin, H increases east and west and decreases north
and south.) 30.25. Let fix, y, z, t) = ej('tlX+'[2,’+'[3Z_“"), where j is the
(e) In certain types of fluid flow in two dimensions, complex element (j2 = — 1), and kit k2, k3, and to are
the velocity vector v(x, y) is equal to the gradient of constants. Show that
a single scalar potential function tp(x, y, z): v = grad tp.
A streamline through any point is in the direction of 8l + dl + 8l = Ldl
v. Plot some streamlines on and outside of the circle dx2 by1 dz2 c2 dt2 ’

where c = to/{kx + k2 + k3). (This is called the wave 30.27. The cross-sectional profile of a long cylindrical
equation in three dimensions, and /(x, y, z, t) is one of its mirror is the semicircle x2 + y2 = 1 in the right-hand half
solutions.) plane. Rays from the left, parallel to the x axis, fall on
Prove that g(kxx + k2y + k3z — tot), where g is any the mirror.
function of a single variable, is also a solution. (a) Show that the equation of the ray reflected from
the point (cos 0, sin 0) on the mirror is x sin 20 —
30.26. Find the envelopes of the following families. y cos 20 = sin 0.
(a) y = xx + or.x (parameter a); (b) By regarding 0 as the parameter, show that the
(b) y + orx = a (parameter a); envelope of these reflected rays is given by x2 + y2 =
-X y j(3y^ + 1). (In optics, this envelope is called the caustic
(c) - +-= 1 (parameter a); of the reflected rays.)
a 1 — a
(d) .x cos 0 + y sin 0 = 1 (parameter 0).
31 Double integration

Contents
31.1 Repeated integrals with constant limits 543
31.2 Examples leading to repeated integrals with constant limits 544
31.3 Repeated integrals over nonrectangular regions 546
31.4 Changing the order of integration for nonrectangular regions 548
31.5 Double integrals 550
31.6 Polar coordinates 555
31.7 Separable integrals 555
31.8 General change of variable; the Jacobian determinant 557
Problems 561

31.1 Repeated integrals with constant limits


Before explaining how they arise, we show first how a repeated
integral is written and evaluated. The following is an example of
a repeated integral with constant limits;
\[ I = \int_0^1 \int_0^2 (xy + y^2 - 1)\,dx\,dy. \]

There are two stages of integration: first with respect to x, then
with respect to y, this being determined by the order in which dx
and dy appear under the integral signs.
The reader is recommended to copy the following procedure
step-by-step at first.
(i) Put brackets round the inner integral, which is the first to
be evaluated:

\[ I = \int_0^1 \left( \int_0^2 (xy + y^2 - 1)\,dx \right) dy. \]

(ii) Make it clear which variable connects with which limits of
integration, by explicitly labelling them as shown:

\[ I = \int_{y=0}^{1} \left( \int_{x=0}^{2} (xy + y^2 - 1)\,dx \right) dy. \]

(iii) Evaluate the inner integral with respect to the first variable
(here x), treating the other variable (y) as a constant:
\[ \int_{x=0}^{2} (xy + y^2 - 1)\,dx = \left[ \tfrac{1}{2}x^2 y + y^2 x - x \right]_{x=0}^{2} = 2y + 2y^2 - 2. \]

This process eliminates the variable x.



(iv) Use the result of (iii) as the integrand of the outer integral:

\[ \int_{y=0}^{1} (2y + 2y^2 - 2)\,dy = \left[ y^2 + \tfrac{2}{3}y^3 - 2y \right]_{y=0}^{1} = -\tfrac{1}{3}. \]

This eliminates the variable y, so that the final result is a definite
number. If you find you are left with an x or y in the result, then
you have not followed the process correctly.
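
The two stages can also be mimicked numerically. The following is only a sketch, assuming a simple midpoint rule; the function name `repeated_integral` and the choice of 200 subdivisions are illustrative and do not come from the text. The inner loop plays the part of step (iii), with y held constant, and the outer loop plays the part of step (iv).

```python
def repeated_integral(f, x_limits, y_limits, n=200):
    """Midpoint-rule estimate of the repeated integral of f(x, y) dx dy.

    The inner sum runs over x with y held constant; the outer sum
    then runs over y. Both pairs of limits are constants.
    """
    (a, b), (c, d) = x_limits, y_limits
    hx = (b - a) / n
    hy = (d - c) / n
    total = 0.0
    for j in range(n):
        y = c + (j + 0.5) * hy            # y is fixed during the inner sum
        inner = 0.0
        for i in range(n):
            x = a + (i + 0.5) * hx
            inner += f(x, y) * hx         # inner integral, with respect to x
        total += inner * hy               # outer integral, with respect to y
    return total


# The integral of this section: x from 0 to 2, then y from 0 to 1.
value = repeated_integral(lambda x, y: x * y + y**2 - 1, (0.0, 2.0), (0.0, 1.0))
print(value)  # close to the exact value -1/3
```

The result is a plain number, with no x or y left in it, as the procedure requires.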

Example 31.1. Evaluate the repeated integrals

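\[ \text{(a)}\quad I = \int_0^1 \int_2^4 (xy + 1)\,dx\,dy, \qquad \text{(b)}\quad J = \int_2^4 \int_0^1 (xy + 1)\,dy\,dx. \]

(a) \[ I = \int_{y=0}^{1} \left( \int_{x=2}^{4} (xy + 1)\,dx \right) dy. \]

The inner integral becomes

\[ \int_{x=2}^{4} (xy + 1)\,dx = \left[ \tfrac{1}{2}x^2 y + x \right]_{x=2}^{4} = 8y + 4 - (2y + 2) = 6y + 2. \]

This forms the integrand of the outer integral:

\[ I = \int_{y=0}^{1} (6y + 2)\,dy = \left[ 3y^2 + 2y \right]_{y=0}^{1} = 5. \]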

(b) Here the order of the symbols $\int_{x=2}^{4}$ and $\int_{y=0}^{1}$ has been reversed, and also
the order of dx and dy. In other words, the same processes are to be carried
out, but in the reverse order. The details, however, look different.
We have

\[ J = \int_{x=2}^{4} \left( \int_{y=0}^{1} (xy + 1)\,dy \right) dx. \]

The inner integral is with respect to y, and we treat x as a constant:

\[ \int_{y=0}^{1} (xy + 1)\,dy \quad (x \text{ constant}). \]

This is equal to

\[ \left[ x\left(\tfrac{1}{2}y^2\right) + y \right]_{y=0}^{1} = \tfrac{1}{2}x + 1. \]

The outer integral becomes

\[ J = \int_{x=2}^{4} \left( \tfrac{1}{2}x + 1 \right) dx = 5, \]

which is the same as I in (a).

In this example, it makes no difference whether we integrate with
respect to x or y first. Later we show that this is always true when
the repeated integral has constant limits.

31.2 Examples leading to repeated integrals with constant limits

Figure 31.1a represents a heap of grain in a rectangular silo of length
8 m and breadth 4 m. The top surface of the grain is curved, with
the equation

\[ z = \tfrac{1}{32}x^2 + \tfrac{1}{16}y^2 + 2 \qquad (0 \le x \le 8;\ 0 \le y \le 4), \]


and the problem is to find the volume, V say, of the grain.
Imagine the grain divided into thin vertical plane slices, parallel
to the (x, z) plane, the thickness of a slice being δy. A typical slice
is shown in Fig. 31.1a, and the value of y is constant on its faces. It
is lifted out and displayed in elevation separately in Fig. 31.1b. The
face area is given by

\[ \text{Area } ABCD = \int_{x=0}^{8} \left( \tfrac{1}{32}x^2 + \tfrac{1}{16}y^2 + 2 \right) dx, \]

in which y takes the current constant value. Therefore its volume,
δV say, is given by

\[ \delta V \approx \left( \int_{x=0}^{8} \left( \tfrac{1}{32}x^2 + \tfrac{1}{16}y^2 + 2 \right) dx \right) \delta y. \]

When we take the sum of all the elements δV and let δy → 0, we
obtain in the usual way

\[ V = \int_{y=0}^{4} \left( \int_{x=0}^{8} \left( \tfrac{1}{32}x^2 + \tfrac{1}{16}y^2 + 2 \right) dx \right) dy. \]

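The result has therefore taken the form of a repeated integral of
the kind described in Section 31.1. In evaluating it, the inner integral
gives the cross-sectional area of a slice on which y is held constant:

\[ \int_{x=0}^{8} \left( \tfrac{1}{32}x^2 + \tfrac{1}{16}y^2 + 2 \right) dx = \left[ \tfrac{1}{96}x^3 + \tfrac{1}{16}y^2 x + 2x \right]_{x=0}^{8} = \tfrac{64}{3} + \tfrac{1}{2}y^2. \]

Finally

\[ V = \int_{y=0}^{4} \left( \tfrac{64}{3} + \tfrac{1}{2}y^2 \right) dy = \left[ \tfrac{64}{3}y + \tfrac{1}{6}y^3 \right]_{y=0}^{4} = 96. \]
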

It can be seen that, if we had taken the slices parallel to the (y, z)
plane, the process would have led to the integral

\[ V = \int_0^8 \int_0^4 \left( \tfrac{1}{32}x^2 + \tfrac{1}{16}y^2 + 2 \right) dy\,dx. \]

The integrand is the same, and the result must be the same, when
the integrations over x and y are carried out in the opposite order.
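
The slicing argument can be checked numerically. The sketch below assumes a midpoint rule with 400 subdivisions; the helper names (`midpoint`, `depth`, `slice_area_const_y`, `slice_area_const_x`) are illustrative only. It computes the face area of each slice first and then sums the slices, once for slices on which y is constant and once for slices on which x is constant.

```python
def midpoint(g, a, b, n=400):
    """Midpoint-rule approximation to the integral of g from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def depth(x, y):
    # Top surface of the grain: z = x^2/32 + y^2/16 + 2.
    return x**2 / 32 + y**2 / 16 + 2


def slice_area_const_y(y):
    # Face area of a slice parallel to the (x, z) plane (integrate over x).
    return midpoint(lambda x: depth(x, y), 0.0, 8.0)


def slice_area_const_x(x):
    # Face area of a slice parallel to the (y, z) plane (integrate over y).
    return midpoint(lambda y: depth(x, y), 0.0, 4.0)


volume_y_slices = midpoint(slice_area_const_y, 0.0, 4.0)
volume_x_slices = midpoint(slice_area_const_x, 0.0, 8.0)
print(volume_y_slices, volume_x_slices)  # both close to 96
```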
In general, any repeated integral with constant limits,
\[ \int_c^d \int_a^b f(x, y)\,dx\,dy, \]

can be interpreted when f(x, y) is positive as the volume of material
in a box standing on the rectangle specified by a ≤ x ≤ b, c ≤ y ≤ d,
when the material in it has depth f(x, y). If f(x, y) is negative over
any part of this area, then it will obviously make a negative
contribution to the integral. This signed-volume analogy is closely
similar to the signed-area analogy (15.13). We can use the idea to
say that the order of integration for repeated integrals having
constant limits is immaterial:

Changing order of integration in a repeated integral with constant limits

\[ \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_a^b \int_c^d f(x, y)\,dy\,dx. \tag{31.1} \]

There is frequently an advantage to be had from changing the
order of integration in this way.

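Example 31.2. Evaluate $I = \int_0^1 \int_0^2 x\,e^{xy}\,dx\,dy$.

The inner integral is

\[ \int_{x=0}^{2} x\,e^{xy}\,dx, \]

which, though not very difficult, does involve integration by parts. To avoid
this, try the alternative order of integration:

\[ I = \int_{x=0}^{2} \left( \int_{y=0}^{1} x\,e^{xy}\,dy \right) dx. \]

The inner integral, with x being treated as a constant, is

\[ \int_{y=0}^{1} x\,e^{xy}\,dy = x \left[ \frac{1}{x}\,e^{xy} \right]_{y=0}^{1} = e^{x} - 1. \]

Then

\[ I = \int_0^2 (e^{x} - 1)\,dx = \left[ e^{x} - x \right]_0^2 = e^2 - 3, \]

which is much simpler.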
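
As a numerical check that the two orders of integration agree here, the following sketch evaluates the repeated integral both ways with a nested midpoint rule. The helper `midpoint` and the variable names are illustrative, not part of the text.

```python
import math


def midpoint(g, a, b, n=400):
    """Midpoint-rule approximation to the integral of g from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def integrand(x, y):
    return x * math.exp(x * y)


# Original order: x first (0 to 2), then y (0 to 1).
dx_then_dy = midpoint(lambda y: midpoint(lambda x: integrand(x, y), 0.0, 2.0), 0.0, 1.0)
# Reversed order: y first (0 to 1), then x (0 to 2).
dy_then_dx = midpoint(lambda x: midpoint(lambda y: integrand(x, y), 0.0, 1.0), 0.0, 2.0)

exact = math.exp(2) - 3
print(dx_then_dy, dy_then_dx, exact)  # all three agree to about three decimals
```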

31.3 Repeated integrals over non-rectangular regions

Suppose now that the base of the silo, loaded with grain, has the
triangular shape OPQ shown in Fig. 31.2; to take a definite instance,
we could consider its depth f(x, y) to be the same as before:
$f(x, y) = \tfrac{1}{32}x^2 + \tfrac{1}{16}y^2 + 2$; but our discussion will hold for a general
function f. Again we measure the volume V by summing the
volumes of slices parallel to the x axis, having thickness δy.
Figure 31.2b shows a typical slice lifted out and viewed in (x, z)
axes in order to obtain its face area, and Fig. 31.2c shows the base
of the silo in plan view; the slice chosen is along AB.


The slices all have different x values at their starting points, and
these values depend on y, so the limits of integration are not
constant in this case. In order to determine the range of integration
of the slice at level y, it is necessary to refer to the triangular area
in Fig. 31.2c, called the region of integration for this problem.
The equation of the side OQ is

\[ y = \tfrac{1}{2}x \]

for 0 ≤ x ≤ 8. Since we need x in terms of y, we express this as

\[ x = 2y, \]

and it is helpful to write it on OQ, as shown, together with the
simpler information required for the other limits of integration.
The face area of the slice ABCD at level y is therefore given by

\[ \text{area } ABCD = \int_{x=2y}^{8} f(x, y)\,dx. \]

Its volume δV is equal to (area ABCD × δy), so

\[ \delta V \approx \left( \int_{x=2y}^{8} f(x, y)\,dx \right) \delta y, \]

and finally the whole volume V is

\[ V = \int_0^4 \int_{2y}^{8} f(x, y)\,dx\,dy. \]

Notice that the limits of integration have nothing to do with
the integrand f(x, y), but depend only on the shape of the region of
integration in the (x, y) plane (in this case the triangle OPQ in Fig.
31.2c). The limits of integration are the same no matter what the
integrand.

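Example 31.3. Evaluate the integral $I = \int_0^1 \int_{2y}^{2} (x + y)\,dx\,dy$.

Write the integral

\[ I = \int_{y=0}^{1} \left( \int_{x=2y}^{2} (x + y)\,dx \right) dy. \]

The inner integral is

\[ \int_{x=2y}^{2} (x + y)\,dx = \left[ \tfrac{1}{2}x^2 + yx \right]_{x=2y}^{2} = (2 + 2y) - (2y^2 + 2y^2) = 2 + 2y - 4y^2. \]

(Follow the calculation carefully.) Then

\[ I = \int_{y=0}^{1} (2 + 2y - 4y^2)\,dy = \left[ 2y + y^2 - \tfrac{4}{3}y^3 \right]_{y=0}^{1} = \tfrac{5}{3}. \]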
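
When the inner limits depend on y, a numerical scheme has to recompute the range of x for every strip. The sketch below assumes a midpoint rule; `integrate_dx_dy` and its argument names are illustrative. It is applied to the integral of Example 31.3 and, as a second check, to the area of the triangular silo base OPQ (x from 2y to 8, y from 0 to 4), which should equal (1/2)(8)(4) = 16.

```python
def integrate_dx_dy(f, y_lo, y_hi, x_lo, x_hi, n=400):
    """Midpoint estimate of the repeated integral of f(x, y) dx dy.

    x_lo and x_hi are functions of y, so each horizontal strip
    gets its own range of x.
    """
    hy = (y_hi - y_lo) / n
    total = 0.0
    for j in range(n):
        y = y_lo + (j + 0.5) * hy
        a, b = x_lo(y), x_hi(y)           # limits for the strip at level y
        hx = (b - a) / n
        strip = hx * sum(f(a + (i + 0.5) * hx, y) for i in range(n))
        total += strip * hy
    return total


# Example 31.3: x from 2y to 2, then y from 0 to 1.
example_31_3 = integrate_dx_dy(lambda x, y: x + y, 0.0, 1.0,
                               lambda y: 2 * y, lambda y: 2.0)

# Area of the triangle OPQ: integrand 1, x from 2y to 8, y from 0 to 4.
triangle_area = integrate_dx_dy(lambda x, y: 1.0, 0.0, 4.0,
                                lambda y: 2 * y, lambda y: 8.0)

print(example_31_3, triangle_area)  # close to 5/3 and 16
```

Only the limit functions change between the two calls, which reflects the remark that the limits depend on the region and not on the integrand.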

Example 31.4. Evaluate the integral $I = \int_0^2 \int_0^y xy\,dx\,dy$, and sketch the
region of integration.

Let

\[ I = \int_{y=0}^{2} \left( \int_{x=0}^{y} xy\,dx \right) dy. \]

The inner integral is

\[ \int_{x=0}^{y} xy\,dx = \left[ \tfrac{1}{2}x^2 y \right]_{x=0}^{y} = \tfrac{1}{2}y^3. \]

Therefore

\[ I = \tfrac{1}{2} \int_0^2 y^3\,dy = \tfrac{1}{8} \left[ y^4 \right]_0^2 = 2. \]

A sketch of the region of integration can be constructed in the following
way. The region of integration consists of the points (x, y) which simultaneously
satisfy

(i) 0 ≤ y ≤ 2 and (ii) 0 ≤ x ≤ y.

One way of finding these is to sketch the boundaries of the required region.
These are the lines

y = 0, y = 2, and x = 0, x = y,

and they are shown on Fig. 31.3a. The region consists of any points which lie
between both pairs of boundaries (i) and (ii). This is the triangle shown in Fig.
31.3b.

31.4 Changing the order of integration for nonrectangular regions

If the region of integration is a rectangle with sides parallel to the
axes, then (31.1) states that changing the order of integration simply
involves performing the same operations in the opposite order. But
if we do the same thing with the previous example, we get

\[ \int_0^y \int_0^2 xy\,dy\,dx. \]

This is obviously nonsense: the answer contains y, whereas we ought
to get the answer 2 again. In fact the new form means nothing at all.
If the region of integration is nonrectangular we have to write

\[ \int_{\cdots}^{\cdots} \int_{\cdots}^{\cdots} f(x, y)\,dy\,dx \]

and begin again, filling in the limits of integration so that we cover
the same region. The interior integral is now with respect to y, so
we must start with strips parallel to the y axis as shown in Fig. 31.4.
Then the inner integral is

\[ \int_{y=x}^{2} f(x, y)\,dy. \]

The outer integral involves all x between 0 and 2, so finally we have

\[ \int_0^2 \int_0^y f(x, y)\,dx\,dy = \int_0^2 \int_x^2 f(x, y)\,dy\,dx. \]

Each case has to be considered individually in this way.
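
The equality of the two forms can be tested numerically for the integrand xy of Example 31.4. This is a sketch only, assuming a midpoint rule; the function `repeated` and its argument order (inner variable first) are choices made for the illustration.

```python
def repeated(f, outer_lo, outer_hi, inner_lo, inner_hi, n=400):
    """Midpoint-rule value of a repeated integral.

    f(inner, outer) is integrated first over the inner variable, between
    the limits inner_lo(outer) and inner_hi(outer), then over the outer one.
    """
    h_out = (outer_hi - outer_lo) / n
    total = 0.0
    for j in range(n):
        outer = outer_lo + (j + 0.5) * h_out
        lo, hi = inner_lo(outer), inner_hi(outer)
        h_in = (hi - lo) / n
        strip = h_in * sum(f(lo + (i + 0.5) * h_in, outer) for i in range(n))
        total += strip * h_out
    return total


# Horizontal strips: x from 0 to y, then y from 0 to 2.
dx_dy = repeated(lambda x, y: x * y, 0.0, 2.0, lambda y: 0.0, lambda y: y)

# Vertical strips over the same triangle: y from x to 2, then x from 0 to 2.
dy_dx = repeated(lambda y, x: x * y, 0.0, 2.0, lambda x: x, lambda x: 2.0)

print(dx_dy, dy_dx)  # both close to 2
```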

Example 31.5. Change the order of integration in the repeated integral

\[ \int_0^{1/2} \int_{-\sqrt{1 - 4y^2}}^{\sqrt{1 - 4y^2}} y\,dx\,dy \]

and so evaluate the integral.

Write the integral in the form

\[ I = \int_{y=0}^{1/2} \left( \int_{x=-\sqrt{1 - 4y^2}}^{\sqrt{1 - 4y^2}} y\,dx \right) dy. \]

The limits of integration express the boundaries of the region of integration:

\[ x = (1 - 4y^2)^{1/2}, \quad x = -(1 - 4y^2)^{1/2}, \quad y = 0, \quad y = \tfrac{1}{2}. \]

These are shown in Fig. 31.5 (the curved part can be written x² + 4y² = 1: an
ellipse with semi-axes equal to 1 and ½). Figure 31.5a shows how the form given
is obtained, by starting with horizontal strips which end at x = ±(1 − 4y²)^{1/2}.
In Fig. 31.5b, the position with regard to vertical strips is shown,
for which the inner integral will be over y. When the order of integration is
changed by this means, we obtain

\[ I = \int_{x=-1}^{1} \int_{y=0}^{\frac{1}{2}\sqrt{1 - x^2}} y\,dy\,dx. \]

The inner integral is now

\[ \int_0^{\frac{1}{2}\sqrt{1 - x^2}} y\,dy = \left[ \tfrac{1}{2}y^2 \right]_0^{\frac{1}{2}\sqrt{1 - x^2}} = \tfrac{1}{8}(1 - x^2). \]

Therefore

\[ I = \tfrac{1}{8} \int_{-1}^{1} (1 - x^2)\,dx = \tfrac{1}{8} \left[ x - \tfrac{1}{3}x^3 \right]_{-1}^{1} = \tfrac{1}{6}. \]

It is left to the reader to try it in the original form; it is perfectly possible,
but more complicated.

31.5 Double integrals

The repeated-integral notation is very informative and self-contained:
all the information needed is contained in the integral.
It even suggests the coordinates to be used, and gives explicitly the
boundary of the region of integration. However, problems do not
always fall easily into this form.
Suppose we have a lake of any shape, as shown in Fig. 31.6a,
whose depth is f(x, y): notional contours are suggested. We want
to find the volume of water in the lake.
Call the area covered by the lake the region of integration ℛ
for the problem. Construct a mesh on ℛ consisting of small area
elements δA as in Fig. 31.6b: the mesh may be quite arbitrary for the
present purpose. A typical area element δA is at P. Below δA is a
depth of water we shall denote by f(P) (we shall not use f(x, y)
because cartesian coordinates might not be the ones we eventually
want to use). The volume δV in the vertical column of water below
δA at the point P is approximated by

\[ \delta V \approx f(P)\,\delta A. \]

If we add up all the volume elements in the usual way, we obtain
the total volume V. Denote this operation by

\[ V = \sum_{\mathcal{R}} \delta V, \]

thus indicating a certain region of integration ℛ which can be
obtained by reference to the diagram, or might be specified separately
in words or in some other way. Now let all the area elements δA
tend to zero, while becoming more numerous in order to cover ℛ.
We obtain

\[ V = \sum_{\mathcal{R}} \delta V = \lim_{\delta A \to 0} \sum_{\mathcal{R}} f(P)\,\delta A. \]

It is natural to write this as some kind of integral as we did in one
dimension (see Section 15.1). There are several notations; we shall
write

\[ V = \iint_{\mathcal{R}} f(P)\,dA, \]

which is to be read: the double integral of f over the region
ℛ. Unlike a repeated integral it does not give any clue as to how to
evaluate it.
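
The defining sum can be imitated directly on a computer without first choosing limits of integration. The sketch below makes several assumptions that are not in the text: the mesh is a square grid laid over a bounding box, a cell is counted when its centre lies inside the region, and the 'lake' is a hypothetical one occupying the unit disc with depth 1 − x² − y², whose exact volume is π/2. The names `double_integral`, `in_lake` and `lake_depth` are illustrative.

```python
import math


def double_integral(f, inside, bbox, n=400):
    """Approximate the double integral of f over a region by summing f(P) dA.

    `inside(x, y)` says whether the point P = (x, y) belongs to the region;
    `bbox` = ((x0, x1), (y0, y1)) is a rectangle containing the region.
    """
    (x0, x1), (y0, y1) = bbox
    hx = (x1 - x0) / n
    hy = (y1 - y0) / n
    dA = hx * hy                          # area of one mesh element
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            if inside(x, y):              # only elements belonging to the region
                total += f(x, y) * dA
    return total


def in_lake(x, y):
    # Hypothetical region: the unit disc.
    return x * x + y * y <= 1.0


def lake_depth(x, y):
    # Hypothetical depth, falling to zero at the shore.
    return 1.0 - x * x - y * y


lake_volume = double_integral(lake_depth, in_lake, ((-1.0, 1.0), (-1.0, 1.0)))
print(lake_volume, math.pi / 2)  # the two values are close
```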
As a rule, the argument that gives rise to a double integral by
way of a certain summation will not have anything to do with
volume; but, as a result of the summation that it represents, the
signed-volume analogy referred to in Section 31.2 will always hold
good:

Double integral $I = \iint_{\mathcal{R}} f(P)\,dA$ and the signed-volume analogy (31.2)

(i) I stands for $\lim_{\delta A \to 0} \sum_{\mathcal{R}} f(P)\,\delta A$; ℛ represents a given
region in a plane; and δA is a typical area element of ℛ
taken at the point P. The summation is over all
the elements δA of ℛ.
(ii) (Signed-volume analogy) Whatever its origin, the
integral is numerically equal to the signed volume
between a surface z = f(x, y) and the plane z = 0,
taken over the region ℛ. (Where z is negative the
contribution counts as negative.)

Example 31.6. A flat plate occupying a region ℛ is acted on at every point
P on the plate by a variable normal stress σ(P) per unit area. Express the total
(resultant) force F on the plate as a double integral.

The position is as in Fig. 31.7: ℛ is the plate. The force δF acting on a typical
area element δA at P is given by

\[ \delta F \approx \sigma(P)\,\delta A. \]

Add up the contributions of all the elements covering ℛ, and take the limit as
the mesh becomes finer and finer. We obtain

\[ F = \lim_{\delta A \to 0} \sum_{\mathcal{R}} \sigma(P)\,\delta A = \iint_{\mathcal{R}} \sigma(P)\,dA. \]

Confronted with a double integral, one has to decide on what
coordinate system to use (say cartesian or polar coordinates) so as
to turn it into a repeated integral. In the following example, cartesian
coordinates (x, y) are appropriate.

Example 31.7. We have shown in Example 31.6 that the resultant force F on
any flat plate ℛ subject to a normal distribution of stress σ(P) is given by
$F = \iint_{\mathcal{R}} \sigma(P)\,dA$. Find the force on a rectangular plate of sides 2 and 3 units
when σ = 3(r² − 2), where r is the distance from one of the corners.

This double-integral expression is perfectly general, applying to any plate, any
distribution of force and any coordinates. We have to reformulate the problem
for this case. Place the rectangle as in Fig. 31.8, with the corner to which the data
refers at the origin. A suitable mesh is the rectangular mesh, with δA having
sides δx and δy: the area element is δA = δx δy. Also σ = 3(x² + y² − 2).
We can add (which ultimately means integrate) the contributions

\[ \delta F \approx 3(x^2 + y^2 - 2)\,\delta x\,\delta y \]

in any order that is convenient. Suppose we decide to add the contributions
along each horizontal strip at level y, and then to add the results from the strips.
Then from the strip at level y we obtain the contribution

\[ \left( \int_0^2 3(x^2 + y^2 - 2)\,dx \right) \delta y \]

after letting δx → 0. When we add the contributions from all the strips and let
δy → 0, we have the repeated integral

\[ F = \int_0^3 \int_0^2 3(x^2 + y^2 - 2)\,dx\,dy = \int_0^3 (-4 + 6y^2)\,dy = 42. \]


The following examples show the adaptability of the notation. In
each case, ℛ represents the region in question, with P a representative
point of ℛ and dA the corresponding element.

(i) Area. Area of ℛ: $\iint_{\mathcal{R}} dA$.

(ii) Variable surface density. Total mass of a thin flat plate, of
variable mass σ(P) per unit area: $\iint_{\mathcal{R}} \sigma(P)\,dA$.

(iii) Moments. Moment of (ii) about the x axis: $\iint_{\mathcal{R}} y\,\sigma(P)\,dA$.

(iv) Moments of inertia. Moment of inertia of (ii) about the y axis:
$\iint_{\mathcal{R}} x^2 \sigma(P)\,dA$.

(v) Probability. A function f(x, y) is eligible to be a probability
density function for random variables X and Y over a region ℛ if
f(x, y) ≥ 0 and $\iint_{\mathcal{R}} f(x, y)\,dA = 1$.
The probability that (X, Y) lies in a subregion S is then
$\iint_{S} f(x, y)\,dA$. (Here it is helpful to retain x and y: we are not
obliged to use P if it does not have the right associations.)

(vi) Vector resultant. A force per unit area, f(P) (stress), variable
in direction and magnitude, is applied to the surface of a flat plate
ℛ. The resultant force F is given by $F = \iint_{\mathcal{R}} f(P)\,dA$.
In order to interpret or evaluate the integral, we write f in its
components $f = i f_1 + j f_2 + k f_3$: the original double integral with
a vector function as integrand is really three double integrals in one.
The resultant force $F \cdot \hat{s}$ in a fixed direction $\hat{s}$ is $\iint_{\mathcal{R}} f \cdot \hat{s}\,dA$.
This integrand $f \cdot \hat{s}$ is not a vector. It can be rewritten in any
convenient way: for example as |f| cos θ, where θ is the angle
between f and $\hat{s}$.


31.6 Polar coordinates


If the boundary of ℛ in a double integral is circular, or if the
boundary is a circular sector, it might be easiest to work it out using
polar coordinates. However, when we do this, the integrand changes
in an important way.
Figure 31.9a shows an annular sector ℛ whose boundaries are
specified by r = a, r = b, θ = α, θ = β, where r and θ are polar
coordinates. We want to evaluate

\[ \iint_{\mathcal{R}} f(P)\,dA = \lim_{\delta A \to 0} \sum_{\mathcal{R}} f(P)\,\delta A, \tag{31.3} \]

where P is a representative point of ℛ, and δA for the moment
permits any kind of division of ℛ into small area elements. We want
to put everything in terms of polar coordinates. This process must
include a suitable choice of elements δA, so that the summation, or
integration, (31.3) can be carried out in an orderly way over the δA
elements, the equivalent of 'strips' in (x, y) coordinates.
The mesh suitable for this purpose is also shown in Fig. 31.9a,
and one of the area elements δA is shown in Fig. 31.9b. It is nearly
a rectangle, with sides δr and r δθ, so

\[ \delta A \approx r\,\delta r\,\delta\theta. \]

The sum in (31.3) therefore becomes, in polar coordinates,

\[ \lim_{\substack{\delta r \to 0 \\ \delta\theta \to 0}} \sum_{\mathcal{R}} f(r, \theta)\,(r\,\delta r\,\delta\theta). \]

The sum of the elements along the radial line θ is

\[ \left( \int_{r=a}^{b} f(r, \theta)\,r\,dr \right) \delta\theta \]

after letting δr → 0. Now add up the contributions from all these
narrow sectors, ranging from θ = α to θ = β, and let δθ → 0. We
obtain the repeated integral

\[ \int_{\theta=\alpha}^{\beta} \int_{r=a}^{b} f(r, \theta)\,r\,dr\,d\theta. \]

Notice that this contains an extra element r in the integrand.

Double integral in polar coordinates, when the region of integration ℛ is an annular sector

If the sector ℛ is the region a ≤ r ≤ b, α ≤ θ ≤ β, then

\[ \iint_{\mathcal{R}} f(P)\,dA = \int_{\alpha}^{\beta} \int_a^b f(r, \theta)\,r\,dr\,d\theta. \tag{31.4} \]


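Example 31.8. Find the volume V between the two planes x + y + z = 4 and
z = 0, over the quadrant 0 ≤ r ≤ 1, 0 ≤ θ ≤ ½π.

Here ℛ is the region 0 ≤ r ≤ 1, 0 ≤ θ ≤ ½π in the plane z = 0. Expressed as a
double integral, the required volume V is

\[ V = \iint_{\mathcal{R}} f(P)\,dA = \iint_{\mathcal{R}} z\,dA. \]

Here z is given by

\[ z = 4 - x - y = 4 - r\cos\theta - r\sin\theta = f(r, \theta). \]

Then by (31.4),

\[ \begin{aligned}
V &= \int_0^{\pi/2} \int_0^1 (4 - r\cos\theta - r\sin\theta)\,r\,dr\,d\theta \\
  &= \int_{\theta=0}^{\pi/2} \left( \int_{r=0}^{1} (4r - r^2\cos\theta - r^2\sin\theta)\,dr \right) d\theta \\
  &= \int_0^{\pi/2} \left[ 2r^2 - \tfrac{1}{3}r^3\cos\theta - \tfrac{1}{3}r^3\sin\theta \right]_{r=0}^{1} d\theta \\
  &= \int_0^{\pi/2} \left( 2 - \tfrac{1}{3}\cos\theta - \tfrac{1}{3}\sin\theta \right) d\theta = \pi - \tfrac{2}{3}.
\end{aligned} \]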
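
A numerical version of (31.4) makes the role of the extra factor r explicit. The sketch below assumes a midpoint rule on a polar mesh; the function name `polar_double_integral` is illustrative. It is applied to the volume of Example 31.8.

```python
import math


def polar_double_integral(f, r_limits, theta_limits, n=400):
    """Midpoint estimate of the double integral of f(r, theta) over an
    annular sector, using the polar area element r dr dtheta."""
    (a, b), (alpha, beta) = r_limits, theta_limits
    hr = (b - a) / n
    ht = (beta - alpha) / n
    total = 0.0
    for j in range(n):
        theta = alpha + (j + 0.5) * ht
        for i in range(n):
            r = a + (i + 0.5) * hr
            total += f(r, theta) * r * hr * ht   # note the extra factor r
    return total


def height(r, theta):
    # z = 4 - x - y written in polar coordinates.
    return 4 - r * math.cos(theta) - r * math.sin(theta)


volume = polar_double_integral(height, (0.0, 1.0), (0.0, math.pi / 2))
print(volume, math.pi - 2 / 3)  # the two values are close
```

Leaving out the factor r in the summation would give a different number, because the mesh elements further from the origin would then be weighted as if they were as small as those near it.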

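Example 31.9. A circular disc of radius 0.1 m has a surface charge density
σ = 10⁻⁶(1 + 10³ r³ sin ½θ) coulomb m⁻². Find the total charge.

The total charge Q is given by $\iint_{\mathcal{R}} \sigma(P)\,dA$, where the region ℛ is the disc
0 ≤ r ≤ 0.1, 0 ≤ θ ≤ 2π (if in doubt, sketch it). Remembering that, in polar
coordinates, δA = r δr δθ (or reading straight from (31.4)), we have

\[ \begin{aligned}
Q &= \int_0^{2\pi} \int_0^{0.1} \sigma(r, \theta)\,r\,dr\,d\theta \\
  &= \int_0^{2\pi} \int_0^{0.1} 10^{-6} \left( 1 + 10^3 r^3 \sin\tfrac{1}{2}\theta \right) r\,dr\,d\theta \\
  &= 10^{-6} \int_{\theta=0}^{2\pi} \left( \int_{r=0}^{0.1} \left( r + 10^3 r^4 \sin\tfrac{1}{2}\theta \right) dr \right) d\theta \\
  &= 10^{-6} \int_{\theta=0}^{2\pi} \left[ \tfrac{1}{2}r^2 + \tfrac{1}{5}\,10^3 r^5 \sin\tfrac{1}{2}\theta \right]_{r=0}^{0.1} d\theta \\
  &= 10^{-8} \int_0^{2\pi} \left( \tfrac{1}{2} + \tfrac{1}{5}\sin\tfrac{1}{2}\theta \right) d\theta = 10^{-8} \left[ \tfrac{1}{2}\theta - \tfrac{2}{5}\cos\tfrac{1}{2}\theta \right]_0^{2\pi} \\
  &\approx 3.94 \times 10^{-8}.
\end{aligned} \]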

Since the repeated integral has constant limits, the same result would be
obtained by integrating in the reverse order (see (31.1)).

Example 31.10. The curve r = cos θ (0 ≤ θ ≤ ¼π), together with the radii
from the origin to its ends, forms the boundary of the region ℛ and is shown in
Fig. 31.10. Obtain (a) the area of ℛ and (b) its moment about the y axis.

(a) In general, the area of a region ℛ is $\iint_{\mathcal{R}} dA$. In this case, we shall add up
the contributions δA along radial sectors inclined at angle θ, one of which is
shown, and then sum the results for all these sectors to obtain the total area A.
We can indicate this, together with the range for r and θ, by writing

\[ A = \sum_{\theta=0}^{\theta=\pi/4}\ \sum_{r=0}^{r=\cos\theta} \delta A \approx \sum_{\theta=0}^{\theta=\pi/4}\ \sum_{r=0}^{r=\cos\theta} (r\,\delta r\,\delta\theta). \]

When we let δr and δθ tend to zero, we have a repeated integral with a variable
limit:

\[ A = \int_0^{\pi/4} \int_0^{\cos\theta} r\,dr\,d\theta = \int_0^{\pi/4} \left[ \tfrac{1}{2}r^2 \right]_0^{\cos\theta} d\theta = \tfrac{1}{2} \int_0^{\pi/4} \cos^2\theta\,d\theta = \tfrac{1}{16}\pi + \tfrac{1}{8}. \]

(b) The moment δM, about the y axis, of an area element δA is

\[ \delta M \approx x\,\delta A; \]

so, as a double integral, the total moment M is given by

\[ M = \iint_{\mathcal{R}} x\,dA. \]

In the same way as in (a), but with an extra factor

\[ x = r\cos\theta, \]

we have

\[ M = \int_0^{\pi/4} \int_0^{\cos\theta} r^2 \cos\theta\,dr\,d\theta = \tfrac{1}{32}\pi + \tfrac{1}{12}. \]
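
Both results can be checked with a polar summation in which the upper limit of r depends on θ. This is a sketch under the assumption of a midpoint rule; the name `polar_integral` and its signature are illustrative.

```python
import math


def polar_integral(f, theta_lo, theta_hi, r_hi, n=400):
    """Midpoint estimate of the integral of f(r, theta) r dr dtheta,
    with 0 <= r <= r_hi(theta) and theta_lo <= theta <= theta_hi."""
    ht = (theta_hi - theta_lo) / n
    total = 0.0
    for j in range(n):
        theta = theta_lo + (j + 0.5) * ht
        hr = r_hi(theta) / n              # the radial range varies with theta
        for i in range(n):
            r = (i + 0.5) * hr
            total += f(r, theta) * r * hr * ht
    return total


# (a) Area: integrand 1 over 0 <= r <= cos(theta), 0 <= theta <= pi/4.
area = polar_integral(lambda r, t: 1.0, 0.0, math.pi / 4, math.cos)

# (b) Moment about the y axis: extra factor x = r cos(theta).
moment = polar_integral(lambda r, t: r * math.cos(t), 0.0, math.pi / 4, math.cos)

print(area, math.pi / 16 + 1 / 8)
print(moment, math.pi / 32 + 1 / 12)
```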

31.7 Separable integrals

Suppose that we have a repeated integral I with constant limits,
whose integrand f(x, y) is the product of a function of x only with
a function of y only:

\[ I = \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_c^d \int_a^b g(x)h(y)\,dx\,dy. \]

It is called a separable integral because of the following property.
The inner integral is

\[ \int_a^b g(x)h(y)\,dx = h(y) \int_a^b g(x)\,dx, \]

since y is held constant. Therefore

\[ I = \int_c^d h(y) \left( \int_a^b g(x)\,dx \right) dy. \]

The integral $\int_a^b g(x)\,dx$, once worked out, is just a constant, so we
can take it out from under the y integral, obtaining

\[ I = \int_a^b g(x)\,dx \int_c^d h(y)\,dy, \]

which is simply the product of two ordinary integrals. We have the
following result:

Separable integrals

\[ \int_c^d \int_a^b g(x)h(y)\,dx\,dy = \int_a^b g(x)\,dx \int_c^d h(y)\,dy. \tag{31.5} \]

This can sometimes speed up the working when evaluating integrals,
but the following example proves an important result by applying
(31.5) the other way round.

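Example 31.11. Prove that $\int_0^\infty e^{-x^2}\,dx = \tfrac12\pi^{1/2}$.

Put
$$I = \int_0^\infty e^{-x^2}\,dx.$$
The name given to the variable of integration in a definite integral is a matter of indifference, so we can equally well put
$$I = \int_0^\infty e^{-y^2}\,dy.$$
The product is $I^2$:
$$I^2 = \int_0^\infty e^{-x^2}\,dx \int_0^\infty e^{-y^2}\,dy.$$
By (31.5), this can be written as a repeated integral
$$I^2 = \int_0^\infty\!\int_0^\infty e^{-x^2} e^{-y^2}\,dx\,dy, \tag{31.6}$$
because this repeated integral is separable.

Regard $x$ and $y$ as cartesian coordinates. The region of integration $\mathcal{R}$ is the whole of the first quadrant ($0 \le x < \infty$; $0 \le y < \infty$) in Fig. 31.11. Now change to polar coordinates, putting
$$e^{-x^2} e^{-y^2} = e^{-(x^2+y^2)} \quad\text{and}\quad x^2 + y^2 = r^2.$$
The area element is $dA = r\,dr\,d\theta$, and the same region is described in polars by
$$0 \le r < \infty, \qquad 0 \le \theta \le \tfrac12\pi.$$
Then
$$I^2 = \int_0^{\pi/2}\!\int_0^\infty e^{-r^2} r\,dr\,d\theta = \int_0^{\pi/2} d\theta \int_0^\infty e^{-r^2} r\,dr \quad\text{(by (31.5) again)}$$
$$= \tfrac12\pi\left[-\tfrac12 e^{-r^2}\right]_{r=0}^{\infty} = \tfrac14\pi.$$
Therefore $I = \tfrac12\pi^{1/2}$.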
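As a numerical sanity check on this result, the following sketch approximates both the original integral and the polar form. Cutting the infinite range off at 10 and using a midpoint rule are assumptions made for the illustration; the neglected tail is of order $e^{-100}$.

```python
import math

def midpoint(func, lo, hi, n):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (k + 0.5) * step) for k in range(n))

CUTOFF = 10.0  # assumed stand-in for the infinite upper limit

# Direct approximation to I, the integral of exp(-x^2) over x >= 0.
approx = midpoint(lambda x: math.exp(-x * x), 0.0, CUTOFF, 2000)
exact = 0.5 * math.sqrt(math.pi)

# Polar form of I^2: (pi/2) times the integral of exp(-r^2) r over r >= 0.
radial = midpoint(lambda r: math.exp(-r * r) * r, 0.0, CUTOFF, 2000)
I_squared = (math.pi / 2) * radial

print(approx, exact)              # both close to 0.8862
print(I_squared, math.pi / 4)     # both close to 0.7854
```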

Example 31.12. Prove the convolution theorem (see (25.11), Laplace transforms): that is, if $F(s)$ and $G(s)$ are the Laplace transforms of $f(t)$ and $g(t)$ respectively, then $F(s)G(s)$ is the Laplace transform of $\int_0^t f(\tau)g(t-\tau)\,d\tau$.

Consider the Laplace transform $L(s)$ of $\int_0^t f(\tau)g(t-\tau)\,d\tau$:
$$L(s) = \int_0^\infty e^{-st}\int_0^t f(\tau)g(t-\tau)\,d\tau\,dt = \int_0^\infty\!\int_0^t e^{-st} f(\tau)g(t-\tau)\,d\tau\,dt.$$
The region of integration is the triangle in the $(t, \tau)$ plane shown in Figure 31.12. Change the order of integration by summing vertical strips: we find that
$$L(s) = \int_0^\infty\!\int_\tau^\infty e^{-st} f(\tau)g(t-\tau)\,dt\,d\tau.$$
Now change the variable in the inner integral from $t$ to $u$, where
$$u = t - \tau,$$
remembering that $\tau$ is constant in the inner integral. We obtain
$$L(s) = \int_0^\infty\!\int_0^\infty g(u)f(\tau)\,e^{-s(u+\tau)}\,du\,d\tau = \int_0^\infty\!\int_0^\infty e^{-su}e^{-s\tau}g(u)f(\tau)\,du\,d\tau$$
$$= \int_0^\infty e^{-su}g(u)\,du \int_0^\infty e^{-s\tau}f(\tau)\,d\tau \quad\text{(since the integral is separable)}$$
$$= F(s)G(s).$$
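The theorem can be tried out numerically on a simple pair of functions. In this sketch $f(t) = e^{-t}$ and $g(t) = e^{-2t}$ are assumed examples (their transforms are $1/(s+1)$ and $1/(s+2)$), the infinite range is cut off at an assumed value `t_max`, and all integrals are replaced by midpoint sums.

```python
import math

def midpoint(func, lo, hi, n):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (k + 0.5) * step) for k in range(n))

# Assumed example pair: f(t) = exp(-t), g(t) = exp(-2t).
f = lambda t: math.exp(-t)
g = lambda t: math.exp(-2.0 * t)

def convolution(t):
    """Integral of f(tau) g(t - tau) over 0 <= tau <= t."""
    return midpoint(lambda tau: f(tau) * g(t - tau), 0.0, t, 100)

def laplace_of_convolution(s, t_max=40.0):
    """Laplace transform of the convolution, truncated at t_max."""
    return midpoint(lambda t: math.exp(-s * t) * convolution(t), 0.0, t_max, 1000)

s = 1.0
lhs = laplace_of_convolution(s)               # L(s), computed directly
rhs = (1.0 / (s + 1.0)) * (1.0 / (s + 2.0))   # F(s) G(s) for the assumed pair
print(lhs, rhs)  # both close to 1/6
```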

31.8 General change of variable; the Jacobian determinant

Consider the integral
$$I = \iint_{\mathcal{R}} f(x, y)\,dx\,dy \tag{31.7}$$
where $\mathcal{R}$ is the region of integration in the $(x, y)$-plane and the area elements $\delta A$ are small rectangles of side $\delta x$ and $\delta y$ as in Fig. 31.13. The shape of $\mathcal{R}$ might suggest the use of another system of coordinates to evaluate $I$. The special case of polar coordinates was illustrated in Section 31.6.
Suppose that new coordinates $u$ and $v$ are defined by the relations
$$x = x(u, v); \qquad y = y(u, v); \tag{31.8}$$
where there is a one-to-one correspondence between $(x, y)$ and $(u, v)$. The objective is to put (31.7) entirely in terms of $u$ and $v$.
Figure 31.14 shows a general point $P$ at $(x_P, y_P)$, or at $(u = u_P, v = v_P)$ in the new coordinates. The coordinate curves $u = u_P$ and $v = v_P$ through $P$ are also shown.
Now let $\delta u$ and $\delta v$ represent positive small increments in $u$ and $v$ respectively. In Fig. 31.15(a) the two curves $u = u_P + \delta u$ and $v = v_P + \delta v$ are also shown. The area element $PQRS$, denoted by $\delta A'$, is of the type appropriate for the new coordinates, and when $\delta u$ and $\delta v$ are small, $PQRS$ is nearly a parallelogram, as indicated in Fig. 31.15(b).
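Fig. 31.15 (a) Area element $\delta A'$ for the $u, v$ coordinates. (b) When $\delta u$ and $\delta v$ are small, $PQRS$ is nearly a parallelogram.

The area of the parallelogram $PQRS$ is given by
$$\delta A' = \left|\det\begin{pmatrix} x_Q - x_P & x_S - x_P \\ y_Q - y_P & y_S - y_P \end{pmatrix}\right|,$$
where the verticals stand for the modulus of the determinant between them (see Problem 31.17).
The elements of the determinant are given approximately by
$$x_Q - x_P = x(u_P + \delta u, v_P) - x(u_P, v_P) \approx \frac{\partial x}{\partial u}\,\delta u,$$
$$x_S - x_P = x(u_P, v_P + \delta v) - x(u_P, v_P) \approx \frac{\partial x}{\partial v}\,\delta v,$$
$$y_Q - y_P = y(u_P + \delta u, v_P) - y(u_P, v_P) \approx \frac{\partial y}{\partial u}\,\delta u,$$
$$y_S - y_P = y(u_P, v_P + \delta v) - y(u_P, v_P) \approx \frac{\partial y}{\partial v}\,\delta v,$$
where the partial derivatives are evaluated at $P$. Therefore
$$\delta A' = \left|\det\begin{pmatrix} \dfrac{\partial x}{\partial u}\,\delta u & \dfrac{\partial x}{\partial v}\,\delta v \\[2mm] \dfrac{\partial y}{\partial u}\,\delta u & \dfrac{\partial y}{\partial v}\,\delta v \end{pmatrix}\right| = \left|\det\begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix}\right| \delta u\,\delta v, \tag{31.9}$$
since we required $\delta u$ and $\delta v$ to be positive.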


The determinant which occurs in (31.9) is of wide importance. It is called the Jacobian of the transformation (31.8), and has the notation
$$\frac{\partial(x, y)}{\partial(u, v)}.$$
For brevity it is sometimes denoted simply by $J$.



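The Jacobian determinant of the transformation $x = x(u, v)$, $y = y(u, v)$
$$\frac{\partial(x, y)}{\partial(u, v)} = \det\begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} \tag{31.10}$$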

From (31.8) and (31.9), remembering the modulus, we can therefore say:

Area $\delta A'$ of an element at $P$ with sides in the directions of the $u, v$ coordinate curves
$$\delta A' = \left|\frac{\partial(x, y)}{\partial(u, v)}\right| \delta u\,\delta v \quad (\text{or } |J|\,\delta u\,\delta v) \tag{31.11}$$
where $\delta u$, $\delta v$ are positive, and the Jacobian determinant is evaluated at $P$.

We can now rewrite the original integral (31.7) in terms of the new coordinates $u$ and $v$:

To express a double integral in new coordinates
If $x = x(u, v)$, $y = y(u, v)$, then
$$\iint_{\mathcal{R}} f(x, y)\,dx\,dy = \iint_{S} f\{x(u, v), y(u, v)\}\left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv, \tag{31.12}$$
where $S$ is the region $\mathcal{R}$ transformed to the cartesian $(u, v)$-plane.

The effect of making the change of variable is to change the integrand to something different. This is not surprising; a similar thing happens in the one-dimensional case. If
$$I = \int f(x)\,dx$$
and we change the variable by putting $x = x(u)$, the new factor $\dfrac{dx}{du}$ appears in the integrand.

The final step is to convert (31.12) into a repeated integral in terms of $u$ and $v$, so that the integrations can be carried out.

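Example 31.13. Transform the integral $I = \iint_{\mathcal{R}} (x^2 + y^2)\,dx\,dy$ into polar coordinates $r, \theta$, where $\mathcal{R}$ is the region shown in Fig. 31.16.

Here $r$ and $\theta$ stand in place of $u$ and $v$, and
$$x = r\cos\theta, \qquad y = r\sin\theta,$$
$$\frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial x}{\partial\theta} = -r\sin\theta, \quad \frac{\partial y}{\partial r} = \sin\theta, \quad \frac{\partial y}{\partial\theta} = r\cos\theta.$$
Therefore,
$$\frac{\partial(x, y)}{\partial(r, \theta)} = \det\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = r\cos^2\theta + r\sin^2\theta = r.$$
This is already positive, so the new area elements are given by
$$\delta A' = r\,\delta r\,\delta\theta,$$
as we found in Example 31.11. Also
$$f(x, y) = x^2 + y^2 = r^2.$$
Finally
$$I = \iint_S r^2\,(r\,dr\,d\theta) = \iint_S r^3\,dr\,d\theta.$$
This is to be read straightforwardly as a fresh double integral in variables called $r$ and $\theta$, with rectangular area elements $\delta r\,\delta\theta$. When we draw the diagram to find the shape of $S$, $r$ and $\theta$ are to be treated as cartesian coordinates in axes labelled $r$ and $\theta$ (see Fig. 31.17). In this frame, $S$ is bounded by the straight lines $r = 1$, $r = 2$, $\theta = 0$, $\theta = \tfrac12\pi$.
Expressed as a repeated integral, using (say) strips parallel to the $r$ axis,
$$I = \int_0^{\pi/2}\!\int_1^2 r^3\,dr\,d\theta = \int_1^2 r^3\,dr \int_0^{\pi/2} d\theta \quad\text{(since the integral is separable)}$$
$$= \left[\tfrac14 r^4\right]_1^2\,[\theta]_0^{\pi/2} = \tfrac{15}{8}\pi.$$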
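The change-of-variable formula (31.12) can also be tried numerically. The sketch below estimates the Jacobian (31.10) by central differences and then sums $f\,|J|\,\delta u\,\delta v$ over a rectangle $S$; the helper names `jacobian` and `transformed_integral`, the step `eps`, and the grid size are assumptions made for this illustration. It is applied to the polar transformation of Example 31.13.

```python
import math

def jacobian(x_of, y_of, u, v, eps=1e-6):
    """Central-difference estimate of the determinant (31.10) at the point (u, v)."""
    dx_du = (x_of(u + eps, v) - x_of(u - eps, v)) / (2 * eps)
    dx_dv = (x_of(u, v + eps) - x_of(u, v - eps)) / (2 * eps)
    dy_du = (y_of(u + eps, v) - y_of(u - eps, v)) / (2 * eps)
    dy_dv = (y_of(u, v + eps) - y_of(u, v - eps)) / (2 * eps)
    return dx_du * dy_dv - dx_dv * dy_du

def transformed_integral(f, x_of, y_of, u_lo, u_hi, v_lo, v_hi, n=100):
    """Midpoint-rule version of the right-hand side of (31.12) over a rectangle S."""
    du = (u_hi - u_lo) / n
    dv = (v_hi - v_lo) / n
    total = 0.0
    for i in range(n):
        u = u_lo + (i + 0.5) * du
        for j in range(n):
            v = v_lo + (j + 0.5) * dv
            weight = abs(jacobian(x_of, y_of, u, v))  # the modulus |J|
            total += f(x_of(u, v), y_of(u, v)) * weight * du * dv
    return total

# Polar coordinates, with (r, theta) playing the part of (u, v).
x_polar = lambda r, theta: r * math.cos(theta)
y_polar = lambda r, theta: r * math.sin(theta)
f = lambda x, y: x * x + y * y

J_sample = jacobian(x_polar, y_polar, 1.5, 0.7)   # should be close to r = 1.5
I_approx = transformed_integral(f, x_polar, y_polar, 1.0, 2.0, 0.0, math.pi / 2)
print(J_sample, I_approx, 15 * math.pi / 8)
```

Because the Jacobian is estimated numerically, the same two functions can be reused for other transformations by supplying different `x_of` and `y_of`.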

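Example 31.14. Evaluate $I = \iint_{\mathcal{R}} (y^2 - x^2)\,dx\,dy$ over the square region $\mathcal{R}$ in Fig. 31.18 by changing the variables to $u, v$, where $x = v - u$, $y = v + u$.

Figure 31.18 shows the $x, y$ and the $u, v$ equations of the sides of $\mathcal{R}$. We have
$$\frac{\partial(x, y)}{\partial(u, v)} = \det\begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} = \det\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix} = -2.$$
Therefore
$$\delta A' = \left|\frac{\partial(x, y)}{\partial(u, v)}\right| \delta u\,\delta v = 2\,\delta u\,\delta v.$$
In terms of $u$ and $v$, $y^2 - x^2 = 4uv$, so
$$I = \iint_S 4uv\,(2\,du\,dv) = 8\iint_S uv\,du\,dv.$$
The corresponding region $S$ in the $(u, v)$-plane is shown in Fig. 31.19. Therefore
$$I = 8\int_0^1\!\int_0^1 uv\,du\,dv = 8\int_0^1 u\,du \int_0^1 v\,dv \quad\text{(since the integral is separable)}$$
$$= 8 \times \tfrac12 \times \tfrac12 = 2.$$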

Problems

31.1. Evaluate the following repeated integrals with con¬ (c) z = x + y, — 1 < x ^ 2, — 2 < y < 1;
stant limits. (d) z = — 1, a < x ^ b, c ^ y < d;
(e) z = 2x — y + 3, 0 < x ^ 1, 0 < y < 1;
xy2 d.x d v; (b) y txy d.x dy; (f) z = l/(x + y), l^x^2, 0 < y ^ 1;
o J 1 Jo Jo (g) z = (x + 2y - l)2, —2 ^ x < 1, - 1 < y ^ 1.
~d

(c) d.\- dy; (d) d.x dy;


C \J
31.3. In the following problems, the region of integration
'd l is not rectangular. In each case, sketch the region of
(e) dy d.x; (f) y sin xy dx dy; integration and indicate a typical strip for the inner
c
integral.
*1 T
(a) d.x dy; (b) x2y dx dy;
(g) .x2 d.x dy\ (h) .x2 dy d.x;
1 0 Jo Jo
-i
i r i rx
(.xy2 — x2y) d.x dy; (c) x2y dy d.x (compare (b));
(i)
%) 0 %, i Jo Jo
r i 'i i i
(j) (.xy2 — .x2y) dy d.x; (d) (x + y2)2 d.x dy; (e) y d.x dy;
i Jo 0 J —y

(k) (.X +^y2 + l)2 d.x dy; y2 sin xy dx dy;


(0
J0 J ~iy
r±.
2 p-i* p iVu-r2)
(l) cos(.x + y) dy d.x;
(g) x2dydx; (h) x dx dy;
o
I'
0 J0 J0 J0
d.x dy.
T fVu-x2)
(m)
i Jo y (>)
o Jo
x dy dx.

31.2. Find the signed volume between the given surfaces


and the plane z = 0 over the specified rectangular regions. 31.4. Find the volume of the wedge-shaped object having
(a) z = .xy, 0 ^ x < 1, O^y^l; one curved surface which is part of the cylinder x2 +
(b) z = xy, —l^x^l, 0^y<l (explain the y2 = 1, and whose flat surfaces are z = 0 and the plane
result); z = 2y. (Consider the simple wedge in z ^ 0 only.)

31.5. Reverse the order of integration in each of the is the area element at the point P in (see Section
following cases. It is necessary to sketch the region of 31.5). In the following cases, the region % is described,
integration and to indicate a typical strip correspond¬ and /(F) given in cartesian coordinates. Evaluate the
ing to the new order of integration, as in Section 31.4. integrals.
(a) is the rectangle with corners at (1, 1), (2, 1),
(a) /(x, y)dxdy; (b) f(x, y) dx dy; (2, 4), and (1,4), and /(F) = x2 + y2.
0 J0 (b) J{_ is the equilateral triangle with vertices at
(0, — 1), (3-, 0), and (0, 1), and f(P) = x.
(c) /(.v, y) d.v dy;
(c) is the circle of radius 2, centred at the origin,
\ 0 -v-l and /(F) = y2.
(d) /(x, y) dx dy;
o J -v (1 -.v2) 31.8. As in Problem 31.7, but polar coordinates are
*4 fi.v to be used for the evaluation (see Section 31.6). Remember
(e) /(x, y) dx dy; (f) /(x, y) dy dx; the change in the area element; see (31.4).
0 J -v3 (a) is the disc x2 + y2 < 1, and /(F) = x2 + y2.
(b) is the disc x2 + y2 < 1, and /(F) = y2.
(g) /(x, y) dy dx (it becomes the sum of two
(c) is the area whose boundary consists of the
Jo J - 1 +.Y
integrals); x axis between x = 0 and 2, the y axis between y = 0
1 + v (x~ - 1 I and 2, and a quarter of the circle x2 + y2 = 4. Also
(h) /(x, y) dy dx. fix, y) = xy.
1 - N (.v3 - 1 ) (d) 3^ is the sector 1 ^r^2, OsSd^irt, and /(F) = xy.
(e) is the disc x2 + y2 ^ 4, and /(x, y) = arctan
31.6. Change the order of integration in the following, y/x.
and hence evaluate them. It is necessary to sketch the (f) 3^, is the first quadrant of the plane, and /(x, y) =
e-4(*2+.v2)
region of integration 1{ and to indicate the strip corre¬
sponding to the new inner integral. (g) Show that the volume of a sphere of radius a
is yna3. (Consider the hemisphere
(a) x sin xy dx dy;
0 z s$ (a2 — x2 — y2)/)
r 2<v -1) (h) is the half plane y 0, and /(F) = y e-(x2+>,2).
(b) x2 dx dy; [Hint: separate the integral: see (31.5).]
1 %
31.9. A circular hole of radius \a is drilled through a
(c) x2 evv dx dy;
sphere of radius a in such a way that the edge of the hole
passes through the centre of the sphere. Let the equation
(d) x2y e ■v2-''2 dx dy; of the sphere and the cylinder be x2 -I- y2 + z2 = a2 and
o (x — ja)2 + y2 = \a2. If x = r cos 0 and y = r sin 0, show
pi r2
that the volume Vc of material removed (the section
(e) y(l — x2 — y2)2 dx dy;
in the (x, y)-plane is shown in Figure 31.20) is given by

(0 dx dy; Vc = 2 j j ^Ja2 — r2 r dr dd.


o x~ + r J “tTT J 0
Pr r
(g) dx dy; Hence find the volume of the remaining part of the
0 (x + y)3 sphere.
pi pi
(h) y(x2 - y2)1 dx dy;
o
(* ■>
(i) x2 dx dy (the integral must be
->•-1
split into two parts);
'1 f 1 v dx dy
<j> CA-771 ‘
Jo Jr (x~ — )’ r
'r
31.7. The symbol f(P)dA represents a
JJA
double integral taken over the region 3C and d A

31.10. Find the Jacobian


V= 2 (x2 + 2) dA
OX dx
d(x, y) _ du Tv of the component.

d(u, v) dy dy 31.15. For the polar transformation x = r cos 9, y =


du cv r sin 9, show that

5(x, y) _ r
of the following transformations:
(a) x = u2 — v2, y = mu; d(r, 9)
(b) x = u — v, y = 2u;
Show that r and 9 are given by
(c) m = 2x — y, v = x + 2y;
(d) x = u — e~v, y = u — er. i- V
r = y/x2 + y2, tan 9 = -.
x
31.11. Find the Jacobian of the transformation x = u/v,
y = mu. Let ^ be the region bounded by y = 2x, y = x, Show that
xy = 1, xy = 8. Express d(r, 9) _ 1 _ ld(x, y)
d(x,y) r I 8(r, 9)

In fact under fairly general conditions, the Jacobian


satisfies this inverse rule, which is helpful in some cases
as a repeated integral in the (m, u)-plane, and evaluate it. since it can avoid the inversion of transformations (see
Example 29.16).
31.12. Sketch the region in the (x, y)-plane bounded by Find the <3(m, v)/d(x, y) if u = y/x2 and u = x/y2 using
this rule, and confirm that
the parabolas y = x2, y = 2x2, x = y2, x = 2y2. Find the
Jacobian of the transformation given by d(x, y) _ 1
d(u, v) 3m2u2
y x
u = —, v = —.
7
x- y 31.16. Find the Jacobian J(u, u) of the transformation
m = x2 — y2, u = 2xy. Draw a sketch of the region ^ in
Hence find the area bounded by the parabolas. the (x, y)-plane is bounded by the curves x2 — y2 = 1,
x2 — y2 = 4, xy = 2, xy = 4. By using the change of
31.13. Evaluate variable from (x, y) to (u, u), evaluate

(x2 + y2) dx dy.


xex+>'dA,
JJ
«. * 'J\

where f^is the region bounded by the square |x| + |y| = 1. 31.17. Let PQRS be a parallelogram with P at (xP,yP),
Q at (xQ, yQ), and S at (xs,ys). Show that its area is
given by the modulus of the determinant
31.14. A plastic component is cut from a solid plastic
rod which is cylindrical with cross-section bounded by Xn - Xp

the rhombus y = 1 — jx, y = — 1 — lx, y = l + }x, det


y = — 1 + jx in the (x, y)-plane, with the rod in the z
L yQ - yP
direction. The ends of the component are shaped into [Hint: simplify the problem by placing P at the origin of
the surfaces z = x2 + 2 and z = — (x2 -F 2). Find the volume coordinates.] t
Line integrals

Contents
32.1 Illustrating a line integral
32.2 General line integrals in two and three dimensions
32.3 Paths parallel to the axes
32.4 Path independence and perfect differentials
32.5 Closed paths
32.6 Green’s theorem
32.7 Line integrals and work
32.8 Conservative fields
32.9 Potential for a conservative field
32.10 Single-valuedness of potentials
Problems

32.1 Illustrating a line integral


Consider the following scenario. The success of a museum is measured by two variables. These are the monthly income $x$ from visitors, and the monthly income $y$ from grants and donations. The variation is smoothed out so that they form a continuous record. The exhibitions director receives a variable monthly bonus $I$ which rewards success and penalizes failure in promoting attendance: when attendance changes by a small amount $\delta x$, positive or negative, there is a change in bonus (up or down) of $\delta I$, where
$$\delta I \approx f(x, y)\,\delta x. \tag{32.1}$$
Figure 32.1 charts the fortunes, or the state $(x, y)$ of the museum over a period, starting at state $A$ and arriving at state $B$, in the form of a curve joining $A$ and $B$. Time does not register on this diagram, except that direction of development as time increases is indicated by the arrow. The directed curve is called the path from $A$ to $B$, denoted by $(AB)$ (there may be more letters in the brackets).
Suppose that the bonus at the starting state $A$ is $I_A$, and at the state $B$ it is $I_B$. Then the problem is to find the change in bonus over the period, $I_{(AB)}$:
$$I_B - I_A = I_{(AB)}. \tag{32.2}$$
Divide up the path into many short segments such as $PP'$ (Fig. 32.1). The increment $\delta I$ over a typical segment is given by
$$(\delta I)_{\text{on } PP'} \approx f(x, y)\,\delta x.$$
Add the contributions of all the segments to obtain $I_{(AB)}$:
$$I_{(AB)} \approx \sum_{(AB)} f(x, y)\,\delta x. \tag{32.3}$$
Given a specific function $f(x, y)$ and a specific path $(AB)$, $I_{(AB)}$ could in principle be computed by carrying out the summation (32.3) numerically, taking $\delta x$ very small, and allowing for the fact that $\delta x$ is sometimes positive and sometimes negative. We have to split up the path for this purpose: in Fig. 32.1, $\delta x$ is negative along $(AC)$ and positive along $(CB)$. If the path is vertical along a section, then $\delta x$ will be zero, and there will be a zero change in the bonus along this part despite the fact that $y$ is changing.
In imagination, let $\delta x \to 0$. Then ‘$\approx$’ becomes ‘$=$’. It is natural to write the result as a kind of integral:
$$I_{(AB)} = \lim_{\delta x \to 0} \sum_{(AB)} f(x, y)\,\delta x = \int_{(AB)} f(x, y)\,dx, \tag{32.4}$$
where the notation reminds us that we take values of $(x, y)$ which lie on $(AB)$ and take account of the sign of $\delta x$ at each point on the path.
The integral in (32.4) is called a line integral. It is not straightforwardly an ordinary integral because the direction, left to right or right to left, at every point must be taken into account. The director is losing money along $(AC)$. In order to arrive at $B$ from $A$ many paths are possible. In general, a line integral $I_{(AB)}$ will depend on the total history, on what path has been followed between $A$ and $B$, and we say that the integral is path dependent.
To show how to interpret (32.4) in terms of ordinary integrals, suppose that the bonus function $f(x, y)$ is given by
$$f(x, y) = x + \tfrac12 y \quad\text{so that}\quad \delta I = (x + \tfrac12 y)\,\delta x$$
(in suitable units). Suppose that the museum starts off at state $A$ in Figs 32.2a,b; it thrives in Fig. 32.2a and declines in Fig. 32.2b.
Consider the case (a). The graph $AB$ can be expressed in principle as a function of $x$, and the curve chosen for illustration is
$$y = x^2 - 7x + 15.$$
Then
$$I_{(AB)} = \int_{(AB)} (x + \tfrac12 y)\,dx = \int_{(AB)} [x + \tfrac12(x^2 - 7x + 15)]\,dx.$$
Now $\delta x$ is positive all the way; so, regarded as the limit of the sum in (32.4), this is just an ordinary integral. After simplifying the integrand, we have
$$I_{(AB)} = \int_{x=3}^{5} \tfrac12(x^2 - 5x + 15)\,dx = 11.33.$$
For the case (b), the equation of the curve from $A$ to $C$ is
$$y = x^2 - 3x + 3,$$

so that $I_{(AC)}$ becomes
$$I_{(AC)} = \int_{(AC)} \tfrac12(x^2 - x + 3)\,dx.$$
In this case, however, the $\delta x$ in the sum (32.4) are all negative: $x$ is decreasing. To turn this into an ordinary integral, we have therefore to reverse the sign:
$$I_{(AC)} = -I_{(CA)} = -\int_{x=1}^{3} \tfrac12(x^2 - x + 3)\,dx = -5.33. \tag{32.5}$$
There is a reduction of bonus for bringing the museum to the edge of ruin.
While we are still observing things, notice first that, in connection with the sign change for negative $\delta x$ on $(AC)$ in (32.5), we can write
$$I_{(AC)} = -\int_1^3 \tfrac12(x^2 - x + 3)\,dx = (+)\int_3^1 \tfrac12(x^2 - x + 3)\,dx.$$
In other words, we obtain the correct result by setting the $x$ coordinate of the starting point as the lower limit, and that of the end point as the upper limit, whether $x$ is constantly decreasing or constantly increasing along the path, and this is a general result.
Lastly we compare the result for the parabolic path $(AC)$ in Fig. 32.2 with a straight path from $A$ to $C$ whose equation is
$$y = x.$$
Then
$$\int_{(AC)} (x + \tfrac12 y)\,dx = \int_3^1 \tfrac32 x\,dx = \tfrac32\left[\tfrac12 x^2\right]_3^1 = -6.$$
This is different from (32.5), so we must in general expect that line integrals will be path dependent.
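The defining sum (32.3) can be carried out directly on a computer, and doing so reproduces the two values just found. In this sketch the function name `line_sum_dx`, the number of segments, and the use of $x$ itself as the quantity stepped along the path are assumptions made for the illustration; the end points $x = 3$ at $A$ and $x = 1$ at $C$ are the limits appearing in (32.5).

```python
def line_sum_dx(f, path, x_start, x_end, n=20000):
    """Approximate the sum (32.3): f(x, y) times delta_x over n short segments.

    `path` maps x to the point (x, y) on the curve; delta_x keeps its own sign,
    so a path traversed with x decreasing gives negative increments.
    """
    total = 0.0
    x_prev, y_prev = path(x_start)
    for k in range(1, n + 1):
        x, y = path(x_start + (x_end - x_start) * k / n)
        # evaluate f at the middle of the segment joining the two points
        total += f(0.5 * (x + x_prev), 0.5 * (y + y_prev)) * (x - x_prev)
        x_prev, y_prev = x, y
    return total

f = lambda x, y: x + 0.5 * y                  # the bonus function of the text
parabola = lambda x: (x, x * x - 3 * x + 3)   # curved path from A to C
straight = lambda x: (x, x)                   # straight path from A to C

I_parabola = line_sum_dx(f, parabola, 3.0, 1.0)
I_straight = line_sum_dx(f, straight, 3.0, 1.0)
print(I_parabola, I_straight)  # close to -5.33 and -6 respectively
```

No sign reversal has to be put in by hand: the negative increments in $x$ supply it automatically.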
The following summary generalizes the special case we have discussed.

The line integral $I_{(AB)} = \int_{(AB)} f(x, y)\,dx$
(a) Definition: $I_{(AB)} = \lim_{\delta x \to 0} \sum_{(AB)} f(x, y)\,\delta x$, where $(x, y)$ takes values on the path $(AB)$ from $A$ to $B$.
(b) $I_{(AB)}$ is path dependent, and $I_{(BA)} = -I_{(AB)}$.
(c) If $\delta x$ has constant sign on the path $y = y(x)$ from $C : (x_C, y_C)$ to $D : (x_D, y_D)$, then
$$\int_{(CD)} f(x, y)\,dx = \int_{x_C}^{x_D} f(x, y(x))\,dx. \tag{32.6}$$

Example 32.1. Evaluate the two line integrals
$$\text{(a)}\ \int_{(AB)} xy\,dx, \qquad \text{(b)}\ \int_{(ACB)} xy\,dx,$$
shown in Figs 32.3a,b respectively.

(a) On $(AB)$, $y = x$ and $\delta x > 0$. Here $A = (2, 2)$ and $B = (4, 4)$; so, by (32.6),
$$I_{(AB)} = \int_{(AB)} xy\,dx = \int_2^4 x^2\,dx = \left[\tfrac13 x^3\right]_2^4 = \tfrac{56}{3}.$$
(b) The path $(ACB)$ has to be broken into two parts: $(AC)$, on which $\delta x < 0$, and $(CB)$, on which $\delta x > 0$, where $C = (0, 4)$. Then
$$I_{(ACB)} = \int_{(ACB)} xy\,dx = \int_{(AC)} xy\,dx + \int_{(CB)} xy\,dx.$$
On $(AC)$, $y = 4 - x$; on $(CB)$, $y = 4$. Therefore
$$I_{(ACB)} = \int_{(AC)} x(4 - x)\,dx + \int_{(CB)} 4x\,dx = \int_2^0 (4x - x^2)\,dx + \int_0^4 4x\,dx$$
$$= \left[2x^2 - \tfrac13 x^3\right]_2^0 + 2\left[x^2\right]_0^4 = \tfrac{80}{3}.$$
Despite (32.6c), reduction to ordinary integrals over $x$ is not usually the best way to evaluate line integrals, as will be seen later on.

32.2 General line integrals in two and three dimensions

Line integrals of the type
$$\int_{(AB)} g(x, y)\,dy$$
are to be understood in a similar way:
$$\int_{(AB)} g(x, y)\,dy \quad\text{means}\quad \lim_{\delta y \to 0} \sum_{(AB)} g(x, y)\,\delta y,$$
in which the sign of $\delta y$ is positive on a segment along which $y$ is increasing and negative on a segment where $y$ is decreasing. We can similarly consider paths and functions in three dimensions:
$$\int_{(AB)} f(x, y, z)\,dx, \qquad \int_{(AB)} g(x, y, z)\,dy, \qquad \int_{(AB)} h(x, y, z)\,dz,$$
and string these types together to obtain a general integral
$$\int_{(AB)} (f\,dx + g\,dy + h\,dz).$$
In the following definition, $(AB)$ is a directed path in three dimensions with representative point $P : (x, y, z)$, and $f$, $g$, $h$ are any three functions, their values at $P$ being denoted by $f_P$, $g_P$, $h_P$:

General line integrals
(a) $$\int_{(AB)} (f\,dx + g\,dy + h\,dz) = \lim_{\delta x, \delta y, \delta z \to 0} \sum_{(AB)} (f_P\,\delta x + g_P\,\delta y + h_P\,\delta z) \tag{32.7}$$
($\delta x$, $\delta y$, $\delta z$ are positive (negative) where $x$, $y$, $z$ are increasing (decreasing)).
(b) $$\int_{(AB)} (f\,dx + g\,dy + h\,dz) = -\int_{(BA)} (f\,dx + g\,dy + h\,dz).$$
(c) For two dimensions, suppress the variable $z$.
(d) The above integrals generally depend on the path between $A$ and $B$.

To organize such an integral in order to take account of the signs of $\delta x$, $\delta y$, $\delta z$ is often difficult. For example, if the path $(AB)$ consists of an ellipse inclined to the three axes, each term must be broken into two sections, leading to six integrals in all, to ensure constancy of sign of $\delta x$, $\delta y$, or $\delta z$ along each. However, if a parametric representation of the path is adopted, the correct interpretation is obtained automatically.
Consider the integral with respect to $x$,
$$\int_{(AB)} f(x, y, z)\,dx,$$
where $(AB)$ is parametrized as
$$x = x(t), \qquad y = y(t), \qquad z = z(t),$$
so that, as the parameter $t$ increases or decreases from $t_A$ at $A$ to $t_B$ at $B$, the path is traced exactly once in the right direction. Then, in the short interval from $t$ to $t + \delta t$, the change $\delta x$ in $x$ is approximated by
$$\delta x \approx \frac{dx}{dt}\,\delta t,$$
and $\delta x$ automatically has the right sign. Now put $(dx/dt)\,\delta t$ into the defining sum in place of $\delta x$; correspondingly $(dx/dt)\,dt$ will go into the original integral in place of $dx$. After doing the same thing with the $y$ and $z$ integrals, we have the following result.

Parametric evaluation of line integrals
(a) If $x = x(t)$, $y = y(t)$, $z = z(t)$, and $P : (x, y, z)$ covers $(AB)$ exactly once in the correct direction as $t$ increases or decreases from $t_A$ to $t_B$, then
$$\int_{(AB)} (f\,dx + g\,dy + h\,dz) = \int_{t_A}^{t_B} \left(f\frac{dx}{dt} + g\frac{dy}{dt} + h\frac{dz}{dt}\right) dt. \tag{32.8}$$
(b) For the two-dimensional case, omit $z$.

Example 32.2. Evaluate $I = \int_{(AB)} (x^2 + y)\,dy$, where $(AB)$ is the path shown in Fig. 32.4.

On $(AB)$, $y = -x$, so we can use $x = t$, $y = -t$, with $t$ running from $t = 1$ to $t = -1$. This covers $(AB)$ once in the right direction (it is like using $x$ as the parameter). Then
$$x^2 + y = t^2 - t \quad\text{and}\quad \frac{dy}{dt} = -1,$$
so
$$I = \int_1^{-1} (t^2 - t)(-1)\,dt = \int_{-1}^{1} (t^2 - t)\,dt = \left[\tfrac13 t^3 - \tfrac12 t^2\right]_{-1}^{1} = \tfrac23.$$
It is immaterial what parametrization is used, so long as it satisfies the conditions in (32.8). The following example compares two parametrizations.

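Example 32.3. Evaluate $I = \int_{(AB)} (x\,dy - y\,dx)$, where $(AB)$ is the semicircle shown in Fig. 32.5, by means of two parametrizations:
(a) $x = \cos t$, $y = \sin t$ for $-\tfrac12\pi \le t \le \tfrac12\pi$; (b) $x = (1 - t^2)^{1/2}$, $y = t$, for $-1 \le t \le 1$.

(a) $x = \cos t$, $y = \sin t$; so
$$\frac{dx}{dt} = -\sin t, \qquad \frac{dy}{dt} = \cos t.$$
Then
$$I = \int_{-\pi/2}^{\pi/2} [\cos t\cos t - \sin t(-\sin t)]\,dt = \int_{-\pi/2}^{\pi/2} [\cos^2 t + \sin^2 t]\,dt = \int_{-\pi/2}^{\pi/2} dt = \pi.$$
(b) $x = (1 - t^2)^{1/2}$, $y = t$; so
$$\frac{dx}{dt} = -t(1 - t^2)^{-1/2}, \qquad \frac{dy}{dt} = 1.$$
Then
$$I = \int_{-1}^{1} \left\{(1 - t^2)^{1/2} - t\left[-t(1 - t^2)^{-1/2}\right]\right\} dt = \int_{-1}^{1} \frac{dt}{(1 - t^2)^{1/2}} = [\arcsin t]_{-1}^{1} = \pi.$$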

Example 32.4. Evaluate $I = \int_{(AB)} (x\,dx + y\,dy + z\,dz)$, where $(AB)$ is the path $x = a\cos t$, $y = a\sin t$, $z = bt$ between $t = 0$ and $4\pi$. ($(AB)$ is a helix along the $z$ axis.)

We have
$$\frac{dx}{dt} = -a\sin t, \qquad \frac{dy}{dt} = a\cos t, \qquad \frac{dz}{dt} = b,$$
so the integral becomes
$$I = \int_0^{4\pi} (-a^2\cos t\sin t + a^2\sin t\cos t + b^2 t)\,dt = b^2\int_0^{4\pi} t\,dt = 8b^2\pi^2.$$
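The parametric rule (32.8) lends itself to a short numerical sketch. Here the path is sampled at equally spaced values of the parameter and the defining sum (32.7) is formed directly, with the functions evaluated at the middle of each chord. The name `line_integral`, the number of steps, and the sample values $a = 2$, $b = 3$ for the helix are assumptions made for the illustration.

```python
import math

def line_integral(field, path, t_a, t_b, n=20000):
    """Approximate the sum in (32.7) along a parametrized path.

    `field` returns the tuple (f, g) or (f, g, h) at a point, and `path` maps the
    parameter t to a point. Increments carry their own signs, as in (32.8).
    """
    total = 0.0
    prev = path(t_a)
    for k in range(1, n + 1):
        cur = path(t_a + (t_b - t_a) * k / n)
        mid = [0.5 * (p + q) for p, q in zip(prev, cur)]
        total += sum(c * (q - p) for c, p, q in zip(field(*mid), prev, cur))
        prev = cur
    return total

# Example 32.3: x dy - y dx, so f = -y (with dx) and g = x (with dy).
rotation = lambda x, y: (-y, x)
I_circle_a = line_integral(rotation, lambda t: (math.cos(t), math.sin(t)),
                           -math.pi / 2, math.pi / 2)
I_circle_b = line_integral(rotation, lambda t: (math.sqrt(max(0.0, 1 - t * t)), t),
                           -1.0, 1.0)

# Example 32.4: x dx + y dy + z dz along the helix, with assumed a = 2, b = 3.
a, b = 2.0, 3.0
I_helix = line_integral(lambda x, y, z: (x, y, z),
                        lambda t: (a * math.cos(t), a * math.sin(t), b * t),
                        0.0, 4 * math.pi)

print(I_circle_a, I_circle_b, math.pi)
print(I_helix, 8 * b ** 2 * math.pi ** 2)
```

Both parametrizations of the semicircle give values close to $\pi$, in line with the remark that the choice of parametrization is immaterial.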

32.3 Paths parallel to the axes

Sometimes it is necessary to evaluate line integrals along line segments which are parallel to the axes. In such cases the easiest approach is a direct one, as shown in the following example.

Example 32.5. (See Fig. 32.6.) Evaluate the line integral $\int_{(AB)} x\,dy$ (a) over the path $(AOB)$; (b) over the path $(AQB)$.

In this method, we refer to the sums in the definition (32.7).

(a) On $(AO)$, $\delta y = 0$ since $y$ is constant; so $\int_{(AO)} x\,dy = 0$.
On $(OB)$, $x = 0$; so $\int_{(OB)} x\,dy = 0$.
Therefore
$$\int_{(AOB)} x\,dy = \int_{(AO)} x\,dy + \int_{(OB)} x\,dy = 0.$$
(b) On $(AQ)$, $x = 1$; so $\int_{(AQ)} x\,dy = \int_{(AQ)} 1\,dy = 1$.
On $(QB)$, $y$ is constant; so $\delta y = 0$, and $\int_{(QB)} x\,dy = 0$. Therefore
$$\int_{(AQB)} x\,dy = \int_{(AQ)} x\,dy + \int_{(QB)} x\,dy = 1.$$

32.4 Path independence and perfect differentials

Despite the fact that the value of a line integral taken between two given points usually depends on the path chosen, there are many cases of physical importance for which the value is independent of the path: all paths between $A$ and $B$ lead to the same value. To show that such cases can exist, the following two examples show integrands for which the integral is independent of the path.

Example 32.6. (In two dimensions). Show that $I = \int_{(AB)} (y\,dx + x\,dy)$ is independent of the path chosen from $A$ to $B$.

We can write the integrand in terms of a perfect differential (see Section 20.4):
$$y\,dx + x\,dy = d(xy).$$
Now express the integral in the form
$$I = \int_{(AB)} (y\,dx + x\,dy) = \int_{(AB)} d(xy). \tag{i}$$
From (32.7), the meaning of the integral is the limit of a certain sum, which can be recast in the form
$$\lim_{\delta x, \delta y \to 0} \sum_{(AB)} (y\,\delta x + x\,\delta y) = \lim_{\delta(xy) \to 0} \sum_{(AB)} \delta(xy). \tag{ii}$$
As we travel along $(AB)$, the value of $(xy)$ starts at $x_A y_A$, where the values are taken at $A$, then goes by steps $\delta(xy)$ until it attains the value $x_B y_B$. In other words,
$$\sum_{(AB)} \delta(xy) = x_B y_B - x_A y_A,$$
which is independent of what path connects $A$ to $B$.

Example 32.7. Prove that
$$I = \int_{(AB)} [(y + z)\,dx + (z + x)\,dy + (x + y)\,dz]$$
is independent of the path from $A$ to $B$.

We can use the idea of differentials in exactly the same way: it is suggested by the incremental approximation (30.1). If we put
$$f(x, y, z) = yz + zx + xy,$$
the incremental approximation gives
$$\delta f \approx (y + z)\,\delta x + (z + x)\,\delta y + (x + y)\,\delta z,$$
which parallels the corresponding statement involving differentials:
$$d(yz + zx + xy) = (y + z)\,dx + (z + x)\,dy + (x + y)\,dz.$$
In the same way as in Example 32.6, we have
$$I = \int_{(AB)} [(y + z)\,dx + (z + x)\,dy + (x + y)\,dz] = \int_{(AB)} d(yz + zx + xy)$$
$$= (yz + zx + xy)_B - (yz + zx + xy)_A,$$
the suffices $A$ and $B$ meaning the values of the brackets at the end points $A$ and $B$. This is independent of the path.

In general, suppose that we can recognize the functions $f$, $g$, and $h$ in the differential form
$$f(x, y, z)\,dx + g(x, y, z)\,dy + h(x, y, z)\,dz$$
as being expressible in terms of a single function $S(x, y, z)$ in the following way:
$$f = \frac{\partial S}{\partial x}, \qquad g = \frac{\partial S}{\partial y}, \qquad h = \frac{\partial S}{\partial z}.$$
Then we can write $f\,dx + g\,dy + h\,dz$ as a perfect differential
$$f\,dx + g\,dy + h\,dz = \frac{\partial S}{\partial x}\,dx + \frac{\partial S}{\partial y}\,dy + \frac{\partial S}{\partial z}\,dz = dS.$$
This can be substituted into our integrals to yield
$$\int_{(AB)} (f\,dx + g\,dy + h\,dz) = \int_{(AB)} dS = S_B - S_A.$$
Provided that there is no ambiguity in the values to be assigned to $S_B$ and $S_A$ (this possibility is discussed in Section 32.10 below), this provides a way of evaluating the integral and possibly demonstrating path independence.

Evaluation of integrals over perfect differentials
If $f\,dx + g\,dy + h\,dz = dS$, then
$$\int_{(AB)} (f\,dx + g\,dy + h\,dz) = S_B - S_A. \tag{32.9}$$

32.5 Closed paths

A closed path is one that returns to its starting point, so that $B$ has the same coordinates as $A$, as in Figs 32.7a,b. We shall discuss only simple closed paths. These do not cross over themselves, as do the curves in Fig. 32.7b.

Fig. 32.7 (a) A simple closed path. (b) Closed paths which are not simple.

It is clear from the definition (32.7) that, when $A$ and $B$ are the same point, and the path is closed, their position on the curve will not affect the value of the integral. Consequently its coordinates are not usually stated; a closed path is indicated by a symbol such as $C$, and the integral is written
$$\int_C (f\,dx + g\,dy + h\,dz).$$
In three dimensions, the direction along $C$ is specified by extra information, such as by an arrow on a sketch of the curve. However, in two dimensions, a convention operates: if it is not otherwise indicated, the standard direction is anticlockwise.

Example 32.8. Evaluate $I = \int_C (x\,dy - y\,dx)$, where $C$ is the ellipse $x^2/a^2 + y^2/b^2 = 1$, described in the standard direction.

The ellipse can be parametrized for the anticlockwise direction by
$$x = a\cos t \quad\text{and}\quad y = b\sin t \qquad (0 \le t \le 2\pi),$$
where we assume $a$ and $b$ to be positive. We had a choice for the range of $t$, because we can start at any point on the ellipse. For the choice 0 to $2\pi$, the path starts and ends at $(a, 0)$. Then
$$\frac{dx}{dt} = -a\sin t, \qquad \frac{dy}{dt} = b\cos t;$$
so
$$I = \int_0^{2\pi} [a\cos t\,(b\cos t) - b\sin t\,(-a\sin t)]\,dt = ab\int_0^{2\pi} dt = 2\pi ab.$$

The following result is sometimes useful:

A criterion for general path independence
If $\int_C (f\,dx + g\,dy + h\,dz) = 0$ for every closed curve $C$, then $\int_{(AB)} (f\,dx + g\,dy + h\,dz)$ is path independent for every $A$ and $B$. (32.10)

To prove this, see Fig. 32.8a. Here $A$ and $B$ are any two points, while $(AMB)$ and $(ANB)$ are any two paths from $A$ to $B$. If we reverse the direction of the path $(ANB)$ we have Fig. 32.8b, which is a closed curve $C$. Suppose that we know the integral around every closed curve to be zero. Then
$$0 = \int_{(AMBNA)} (f\,dx + g\,dy + h\,dz)$$
$$= \int_{(AMB)} (f\,dx + g\,dy + h\,dz) + \int_{(BNA)} (f\,dx + g\,dy + h\,dz)$$
$$= \int_{(AMB)} (f\,dx + g\,dy + h\,dz) - \int_{(ANB)} (f\,dx + g\,dy + h\,dz).$$
Therefore the integrals along $(AMB)$ and $(ANB)$ are equal.

Path independence for any two particular points is sufficient to ensure path independence between all pairs of points:

Path independence between given points
Let $A$ and $B$ be any two given points. Then, if $I_{(AB)}$ is path independent, $I_{(PQ)}$ is path independent for every pair of points $P$ and $Q$. (32.11)

The fixed points $A$ and $B$ are shown in Figure 32.9, and $(P, Q)$ is any other pair of points. $(AKB)$, $(AP)$, $(QB)$, and $(PRQ)$ are arbitrary paths joining the points specified by their brackets. Since $I_{(AB)}$ is independent of the path joining $A$ and $B$,
$$I_{(APRQB)} = I_{(AKB)},$$
so
$$I_{(AP)} + I_{(PRQ)} + I_{(QB)} = I_{(AKB)},$$
or
$$I_{(PRQ)} = I_{(AKB)} - I_{(AP)} - I_{(QB)}.$$
But the right-hand side does not depend on which path was chosen for $(PQ)$. Therefore $I_{(PQ)}$ is independent of the path joining $P$ and $Q$, which proves the result.

32.6 Green’s theorem

The following theorem connects a two-dimensional line integral around a closed curve $C$ with a certain double integral over the region $\mathcal{A}$ enclosed by $C$, as shown in Fig. 32.10. The functions $P(x, y)$ and $Q(x, y)$ which occur are assumed to be ‘smooth’ in the region considered. ‘Smoothness’ has a technical meaning, that all the first derivatives of $P$ and $Q$ are continuous, but it will be enough for us to say that $P$ and $Q$ must have no jumps or infinities on $C$ and its interior $\mathcal{A}$ taken together. The theorem is:

Green’s theorem in a plane
$C$ is a simple closed path containing a region $\mathcal{A}$; $P(x, y)$ and $Q(x, y)$ are smooth functions. Then
$$\int_C (P\,dx + Q\,dy) = \iint_{\mathcal{A}} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA, \tag{32.12}$$
where $dA$ is the area element, and the line integral direction is anticlockwise.

Although the result is true in general, we shall prove it only for a curve like that in Fig. 32.10a, for which lines parallel to the axes cut the curve in at most two points.

Fig. 32.10 (a) The diagram for Green’s theorem. (b) For the integration of $\partial P/\partial y$. (c) For the integration of $\partial Q/\partial x$.
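Consider $\iint_{\mathcal{A}} \dfrac{\partial P}{\partial y}\,dA$. We shall integrate it by vertical strips as in Fig. 32.10b. Suppose that the top part $AMB$ of $C$, between $x = a$ and $b$, has the equation $y = f(x)$, and the lower part $ANB$ is $y = g(x)$. Then
$$\iint_{\mathcal{A}} \frac{\partial P}{\partial y}\,dA = \int_a^b\!\int_{g(x)}^{f(x)} \frac{\partial P}{\partial y}\,dy\,dx = \int_a^b [P(x, f(x)) - P(x, g(x))]\,dx$$
$$= \int_{(AMB)} P(x, y)\,dx - \int_{(ANB)} P(x, y)\,dx = -\int_{(BMA)} P(x, y)\,dx - \int_{(ANB)} P(x, y)\,dx$$
$$= -\int_C P(x, y)\,dx. \tag{i}$$
Similarly, but by using horizontal strips as in Fig. 32.10c, with $x$ running from $x = k(y)$ to $x = h(y)$ on each strip,
$$\iint_{\mathcal{A}} \frac{\partial Q}{\partial x}\,dA = \int_{y_{\min}}^{y_{\max}}\!\int_{k(y)}^{h(y)} \frac{\partial Q}{\partial x}\,dx\,dy = \int_C Q\,dy. \tag{ii}$$
By subtracting (i) from (ii) we obtain the result required.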

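Example 32.9. Show that if $C$ is a simple closed curve, the geometrical area it encloses is equal to $\tfrac12\int_C (x\,dy - y\,dx)$.

Put $P = -y$ and $Q = x$ in Green’s theorem:
$$\tfrac12\int_C (-y\,dx + x\,dy) = \tfrac12\iint_{\mathcal{A}} \left(\frac{\partial}{\partial x}(x) - \frac{\partial}{\partial y}(-y)\right) dA = \iint_{\mathcal{A}} dA,$$
which is the geometrical area enclosed.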
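For a closed path made of straight edges, the area formula of Example 32.9 can be evaluated exactly edge by edge: along the edge from $(x_0, y_0)$ to $(x_1, y_1)$ the integral of $x\,dy - y\,dx$ is $x_0 y_1 - x_1 y_0$. The sketch below uses this fact; the function name `enclosed_area` and the test shapes are assumptions made for the illustration. An ellipse is approximated by a many-sided polygon, which ties in with Example 32.8.

```python
import math

def enclosed_area(vertices):
    """Half the line integral of (x dy - y dx) around a closed polygonal path.

    The vertices are taken in order; the path closes from the last vertex back
    to the first. Anticlockwise order gives a positive result.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0   # exact contribution of one straight edge
    return 0.5 * total

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]   # anticlockwise
area_square = enclosed_area(square)
area_reversed = enclosed_area(square[::-1])                   # clockwise

# Polygon inscribed in the ellipse x^2/a^2 + y^2/b^2 = 1, with assumed a = 3, b = 2.
a, b, n = 3.0, 2.0, 2000
ellipse = [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
           for k in range(n)]
area_ellipse = enclosed_area(ellipse)

print(area_square, area_reversed)        # 4.0 and -4.0
print(area_ellipse, math.pi * a * b)     # both close to 18.85
```

The change of sign for the clockwise square reflects the convention that the standard direction is anticlockwise.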

Green’s theorem (32.12) enables us to produce a criterion by which path independence can be recognized:

A condition for path independence
If $\partial P/\partial y = \partial Q/\partial x$, then for any points $A$ and $B$, $\int_{(AB)} (P\,dx + Q\,dy)$ is independent of the path $(AB)$. (32.13)

To prove this, let $C$ be any closed curve. Its interior is denoted by $\mathcal{A}$. Then, by Green’s theorem,
$$\int_C (P\,dx + Q\,dy) = \iint_{\mathcal{A}} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA = 0.$$
Therefore, by (32.10), we have path independence.

32.7 Line integrals and work


A particle follows a certain path (AB) in three-dimensional space
under the action of various forces and its own inertia. Consider one
of the forces, F(x, y, z), which might be the contribution of a force
field such as gravity, or a point force such as friction or the tension
in a string. F is a vector, and it does not necessarily point along
the path of the particle.
The path can be parametrized by using t, the time, as the
parameter, and its position specified briefly by its position vector
r(t):

$$(x(t), y(t), z(t)) = \mathbf{r}(t)$$

as in Fig. 32.11a. In a time interval δt, the particle moves from P
to Q, and r(t) changes to r(t + δt), the change being denoted by δr.
Figure 32.11a also shows the force F acting on the particle when it
is at P. During the interval δt, the work δW done by F alone on the
particle is approximated by

$$\delta W \approx (\text{component of } \mathbf{F} \text{ in direction } PQ) \times (\text{distance } PQ) = (|\mathbf{F}| \cos\theta)\,|\delta\mathbf{r}| = \mathbf{F}\cdot\delta\mathbf{r}.$$

Fig. 32.11

The total work $W_{(AB)}$ done on the particle by F along the path
(AB) is given by

$$W_{(AB)} \approx \sum_{(AB)} \delta W \approx \sum_{(AB)} \mathbf{F}\cdot\delta\mathbf{r}.$$

When the step length goes to zero the sum can be written as an
integral:

$$W_{(AB)} = \int_{(AB)} \mathbf{F}\cdot d\mathbf{r}.$$

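This integral has an ordinary meaning when it is written in
(dx, dy, dz) form by splitting F and δr into their components (see
Fig. 32.11b):

$$\mathbf{F} = F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k} \quad\text{and}\quad \delta\mathbf{r} = \delta x\,\mathbf{i} + \delta y\,\mathbf{j} + \delta z\,\mathbf{k}.$$

Then

$$\mathbf{F}\cdot\delta\mathbf{r} = F_1\,\delta x + F_2\,\delta y + F_3\,\delta z,$$

and

$$W_{(AB)} \approx \sum_{(AB)} (F_1\,\delta x + F_2\,\delta y + F_3\,\delta z).$$

Finally, taking the limit as δx, δy, δz approach zero, we obtain the
exact result.

Work done by a force along a path (AB)

$$W_{(AB)} = \int_{(AB)} \mathbf{F}\cdot d\mathbf{r} = \int_{(AB)} (F_1\,dx + F_2\,dy + F_3\,dz). \qquad (32.14)$$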

(For two dimensions, suppress z.)

Example 32.10. A field of force F is constant everywhere. Show that the work
done by F alone on a particle which moves from a fixed point A to a fixed point
B is independent of the path followed.
Put F = ai + bj + ck where a, b, c are constants. Then, by (32.14), $W_{(AB)}$ is given
by

$$W_{(AB)} = \int_{(AB)} (a\,dx + b\,dy + c\,dz) = \int_{(AB)} d(ax + by + cz) = (ax + by + cz)_B - (ax + by + cz)_A,$$

which is a quantity independent of the path followed.

Example 32.11. (a) r is a position vector in axes x, y, z, and r = |r|. Show
that, in terms of differentials,

$$d(r^{-1}) = -(x/r^3)\,dx - (y/r^3)\,dy - (z/r^3)\,dz.$$

(b) The gravitational force F of the earth acting on a particle of mass m at a
distance r from the earth's centre O is equal to $-m\gamma\mathbf{r}/r^3$ (γ constant; F is directed
towards O and has magnitude $m\gamma/r^2$). Show that the work done by F when the
particle moves between any two points A and B is equal to $m\gamma(r_B^{-1} - r_A^{-1})$ (it is
path independent).

(a) $r = (x^2 + y^2 + z^2)^{1/2}$, so by (30.1)

$$
\begin{aligned}
\delta(r^{-1}) &\approx \frac{\partial}{\partial x}(r^{-1})\,\delta x + \frac{\partial}{\partial y}(r^{-1})\,\delta y + \frac{\partial}{\partial z}(r^{-1})\,\delta z \\
 &= -\frac{x}{(x^2+y^2+z^2)^{3/2}}\,\delta x - \frac{y}{(x^2+y^2+z^2)^{3/2}}\,\delta y - \frac{z}{(x^2+y^2+z^2)^{3/2}}\,\delta z \\
 &= -(x/r^3)\,\delta x - (y/r^3)\,\delta y - (z/r^3)\,\delta z.
\end{aligned}
$$

The corresponding differential relation is

$$d(r^{-1}) = -(x/r^3)\,dx - (y/r^3)\,dy - (z/r^3)\,dz.$$

(b) Refer to Fig. 32.12 and (32.14). The components of F are
$(F_1, F_2, F_3) = (-m\gamma x/r^3,\ -m\gamma y/r^3,\ -m\gamma z/r^3)$, obtained by splitting r into its components.
Therefore the work $W_{(AB)}$ done is

$$W_{(AB)} = \int_{(AB)} (F_1\,dx + F_2\,dy + F_3\,dz) = m\gamma \int_{(AB)} d(r^{-1}) \ \text{(from (a))} = m\gamma(r_B^{-1} - r_A^{-1}).$$

Fig. 32.12

Examples 32.10 and 32.11 illustrate cases where the work done
by a force between two fixed points is independent of the path
between them, but this is not a universal state of affairs: for example,
it is not the case for the force field (y, — x, 0).
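The contrast between a path-independent and a path-dependent force can be seen numerically from (32.14). The following is a minimal sketch under stated assumptions: the function `work`, the two paths from A = (1, 0, 0) to B = (0, 1, 0), and the constant force (2, −1, 3) are all illustrative choices, not taken from the text; only the field (y, −x, 0) is the one mentioned above.

```python
import math

def work(force, path, n=20000):
    """Approximate the line integral of F . dr along path(t), 0 <= t <= 1,
    by summing F(midpoint of chord) . (chord) over n small steps."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple(0.5 * (p + q) for p, q in zip(prev, cur))
        f = force(*mid)
        total += sum(fc * (q - p) for fc, p, q in zip(f, prev, cur))
        prev = cur
    return total

# Two illustrative paths from A = (1, 0, 0) to B = (0, 1, 0).
straight = lambda t: (1.0 - t, t, 0.0)
arc = lambda t: (math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t), 0.0)

constant = lambda x, y, z: (2.0, -1.0, 3.0)  # arbitrary constant force
rotary = lambda x, y, z: (y, -x, 0.0)        # the force field (y, -x, 0)

w_const = (work(constant, straight), work(constant, arc))
w_rot = (work(rotary, straight), work(rotary, arc))
print(w_const)  # both entries should agree
print(w_rot)    # the two entries should differ
```

For the constant force both paths give a(x_B − x_A) + b(y_B − y_A) + c(z_B − z_A) = −3, as in Example 32.10. For (y, −x, 0) a direct calculation gives −1 on the straight segment and −π/2 on the quarter circle, so the work depends on the path.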

32.8 Conservative fields


We have spoken in a general way of a field of force and its action
on a particle. By a particle we mean an object small enough for
its exact shape, physical constitution, state of rotation, and so on to
be unimportant on the scale of the problem being considered; it
behaves in the way we imagine a point should behave.
However, the magnitude of the force exerted upon it by gravity,
electrostatic influence, etc. will still depend on the mass or charge
assigned to the particle. We need a way to specify the strength of
the force field itself, a field intensity, which is independent of what

particle we put into it. When the field strength is specified, we should
be able to deduce its effect on any particle.
This is not always quite straightforward, because the introduction
of a new particle into (say) an electrostatic field might change the
distribution of charge that constitutes the source of the field, so
that in effect we would be putting the particle into a modified
situation. The case is similar with gravity: if an asteroid enters the
moon’s gravitational field, the moon will respond by moving, and
the field entered will change, if only by a little. For the purpose of
defining field intensity, we imagine that somehow such an effect is
prevented from taking place. Subject to this, we have the following
definition.

Field intensity $\mathbf{f}_P$ at P

$\mathbf{f}_P$ is equal to the vector force that would act on a
particle of unit mass (charge, etc.) at P if the sources
are assumed to be unaffected by the particle. (32.15)

Therefore, if the gravitational field intensity is $\mathbf{G}_P$ at P, the force
with which the field acts on a particle of mass m at P is $m\mathbf{G}_P$. One
can alternatively imagine a particle of extremely small mass μ to be
introduced as a test particle. Then $\mathbf{f}_P$ will be equal to $\mu^{-1}$ times the
force exerted on such a particle.
Consider the action of a field of intensity f(x, y, z) on a unit
particle which is travelling on a path (AB) (Fig. 32.13). We shall
consider not the work done by f on the particle, but the work done
against the field by the particle, which has the opposite sign.
Denote this quantity generally by v. The work done against f in a
step PQ is given by

$$\delta v \approx -\mathbf{f}\cdot\delta\mathbf{r}. \qquad (32.16)$$

The total work along the path is the limit of the sum of the δv,
which can be expressed as a line integral as before:

$$v_{(AB)} = -\int_{(AB)} (f_1\,dx + f_2\,dy + f_3\,dz),$$

where $\mathbf{f} = (f_1, f_2, f_3)$.


The important case is when $v_{(AB)}$ is independent of the path,
in which case the field is said to be a conservative field. In practice
our ‘field’ will not usually consist of the whole of space; some space
will be occupied by impenetrable bodies, or we might be interested
only in the region $\mathcal{R}$ inside a metal cage. According to (32.11) it is
only necessary to check path independence for any single pair of
points (A, B) within this region. Therefore:

Conservative field in a region $\mathcal{R}$

Let A and B be two given points in $\mathcal{R}$. Then f(x, y, z)
is conservative in $\mathcal{R}$ if

$$v_{(AB)} = -\int_{(AB)} \mathbf{f}\cdot d\mathbf{r} \qquad (32.17)$$

is independent of the path in $\mathcal{R}$ from A to B. (Or equivalently
if $\oint_C \mathbf{f}\cdot d\mathbf{r}$ is zero for every closed path C in $\mathcal{R}$; see
(32.10).)

The constant field of Example 32.10 and the gravitation field of
Example 32.11 are conservative.

32.9 Potential for a conservative field


Suppose that f(x, y, z) is conservative in a region $\mathcal{R}$ and that A is
a fixed point in $\mathcal{R}$. Since f is conservative, the integral

$$v_{(AP)} = -\int_{(AP)} \mathbf{f}\cdot d\mathbf{r},$$

where P is another point in $\mathcal{R}$, is independent of the path (AP), and
so its value depends only on the location (x, y, z) of P. Therefore
we shall write

$$v_{(AP)} = V(x, y, z) \ \text{or}\ V_P, \qquad (32.18)$$

in which we have suppressed the coordinates of A since they are
constant.
In Fig. 32.14, suppose that (AP) is a fixed reference path from A
to P : (x, y, z). Let Q : (x + δx, y, z) be a point close to P, displaced
from it a distance δx in the x direction only. Choose a path (AQ)
consisting of two parts: the selected path (AP) and a straight-line
extension (PQ) from P to Q. The choice of these paths rather than
any others does not affect the values of $V_P$ and $V_Q$ since the field is
conservative. Then, from (32.18),

$$v_{(AQ)} - v_{(AP)} = V_Q - V_P = V(x + \delta x, y, z) - V(x, y, z).$$

But also, from (32.16), $\delta v \approx -\mathbf{f}\cdot\delta\mathbf{r}$ with δy = δz = 0; so

$$v_{(AQ)} - v_{(AP)} \approx -f_1\,\delta x,$$

where $f_1$ is the x component of f. Equating the last two results and
dividing by δx, we obtain

$$f_1 \approx -[V(x + \delta x, y, z) - V(x, y, z)]/\delta x.$$

When δx → 0, this becomes

$$f_1 = -\frac{\partial V}{\partial x},$$

and similarly

$$f_2 = -\frac{\partial V}{\partial y} \quad\text{and}\quad f_3 = -\frac{\partial V}{\partial z}.$$

Therefore

$$\mathbf{f} = -\left(\mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z}\right),$$

or

$$\mathbf{f} = -\operatorname{grad} V. \qquad (32.19)$$
We call V a potential function for the field f, or simply a
potential. The single scalar function V(x, y, z) contains all the
information necessary to define the three scalar components of f:
$f_1(x, y, z)$, $f_2(x, y, z)$, $f_3(x, y, z)$. The point A is commonly taken to
be at infinity: the reader might recognize the idea of ‘the work
required to bring a particle in from infinity’ in mechanics. However,
if we choose a different reference point A, it only changes V by an
additive constant, and does not, therefore, affect the truth of (32.19);
we get the same f whatever location A has. We sum up this result
as follows.

Potential V of a conservative field f

If f(x, y, z) is conservative in a region $\mathcal{R}$ then

$$\mathbf{f} = -\operatorname{grad} V$$

in $\mathcal{R}$, where V is a scalar potential function for f. Also
V is defined in the region $\mathcal{R}$ by

$$V_P = -\int_{(AP)} \mathbf{f}\cdot d\mathbf{r}, \qquad (32.20)$$

where A is a fixed point.

As an example of a potential, the gravitational field from a particle
of mass M, namely $\mathbf{f} = -M\gamma\mathbf{r}/r^3$, has the potential $V = -M\gamma/r$.
This can be checked from the working of Example 32.11. The
potential function V is equal to the work done against the field in
moving a unit particle from a fixed point A to the current point P
in cases when the field is conservative. Therefore the potential
energy at P of a particle of mass m, relative to the reference point A, is
equal to mV. Alternatively, V can be regarded as energy stored by
the field, like energy stored in a spring.
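The relation between a conservative field and its potential can be tested numerically for the gravitational example. This is a sketch only: the value Mγ = 1, the sample points P and A, the step sizes, and the helper names are assumptions made for illustration.

```python
import math

M_GAMMA = 1.0  # the product M*gamma, set to 1 purely for illustration

def field(x, y, z):
    """Gravitational field intensity f = -M*gamma * r / r^3."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-M_GAMMA * x / r3, -M_GAMMA * y / r3, -M_GAMMA * z / r3)

def potential(x, y, z):
    """Potential V = -M*gamma / r (reference point at infinity)."""
    return -M_GAMMA / math.sqrt(x * x + y * y + z * z)

def minus_grad(V, x, y, z, h=1e-5):
    """Central-difference estimate of -grad V."""
    return (
        -(V(x + h, y, z) - V(x - h, y, z)) / (2 * h),
        -(V(x, y + h, z) - V(x, y - h, z)) / (2 * h),
        -(V(x, y, z + h) - V(x, y, z - h)) / (2 * h),
    )

def work_against(f, A, B, n=20000):
    """Midpoint-rule value of -(integral of f . dr) along the straight
    segment from A to B."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        p = tuple(a + t * (b - a) for a, b in zip(A, B))
        total -= sum(fc * (b - a) / n for fc, a, b in zip(f(*p), A, B))
    return total

P = (1.0, 2.0, 2.0)  # arbitrary sample point, r = 3
A = (3.0, 0.0, 4.0)  # arbitrary finite reference point, r = 5
grad_check = minus_grad(potential, *P)
v_AP = work_against(field, A, P)
print(grad_check, field(*P))                 # -grad V against f at P
print(v_AP, potential(*P) - potential(*A))   # work against f against V_P - V_A
```

The first line compares −grad V with f at P. The second compares the work done against the field from A to P with V_P − V_A, which is how changing the reference point shifts V by an additive constant.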

32.10 Single-valuedness of potentials


There is a connection between the question of single-valuedness in
a perfect differential and the conservative property of a force field.

There exist fields f(x, y, z) which have a potential, but are not
conservative, because they do not satisfy the condition (32.17), that
$\int_{(AB)} \mathbf{f}\cdot d\mathbf{r}$ should be independent of path.

Potential field

If there is a scalar function V such that

$$\mathbf{f} = -\operatorname{grad} V, \qquad (32.21)$$

then f(x, y, z) is called a potential field.

We can test whether such a field is conservative or not:

Condition for a potential field to be conservative

If $\mathbf{f} = -\operatorname{grad} V$, and V is single-valued, then f is
conservative. In this case, the work $v_{(AB)}$ in moving a
unit particle from A to B against the field is equal to
$V_B - V_A$. (32.22)

This is proved as follows:

$$
\begin{aligned}
v_{(AB)} &= -\int_{(AB)} \mathbf{f}\cdot d\mathbf{r} = \int_{(AB)} (\operatorname{grad} V)\cdot d\mathbf{r} \\
 &= \int_{(AB)} \left(\mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z}\right)\cdot(\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz) \\
 &= \int_{(AB)} \left(\frac{\partial V}{\partial x}\,dx + \frac{\partial V}{\partial y}\,dy + \frac{\partial V}{\partial z}\,dz\right) = \int_{(AB)} dV \quad\text{(see (30.1))} \\
 &= V_B - V_A.
\end{aligned}
$$

Provided that $\int_{(AB)} dV$ is independent of the path from A to B, the
value to be assigned to $V_B - V_A$ is unambiguous and we say that V
is single-valued. However, the values of V may depend not only on
the position, but also on the way in which the position was reached
(analogously to the time spent reaching a point on the other side
of a road being dependent on whether you cross directly or via the
underpass). For example, in the plane, let

$$V = \theta,$$

where θ is the polar angle traversed in reaching the current position,
measured continuously from a given starting point. What do we
mean by

$$\int_{(AB)} d\theta\,?$$

Figure 32.15 shows two paths from A to B: (ACB) goes from A to
B more or less directly, and (ADB) circles the origin completely first.
The definition of the integral is that

$$\int_{(AB)} d\theta = \lim_{\delta\theta \to 0} \sum_{(AB)} \delta\theta,$$

where the summation is carried out by taking small steps along the
path. On (ACB), θ passes smoothly from θ = 0 to θ = ½π, so

$$\int_{(ACB)} dV = \int_{(ACB)} d\theta = \tfrac12\pi.$$

On (ADB), θ starts at $\theta_A = 0$ and increases smoothly through values
½π, π, (3/2)π, 2π, to $\theta_B = \tfrac52\pi$. Therefore

$$\int_{(ADB)} dV = \int_{(ADB)} d\theta = \theta_B - \theta_A = \tfrac52\pi.$$

Therefore V in this case is path dependent; if the potential of a
force field is given by

$$V = \theta,$$

where θ is the traversed polar angle, then the field is strictly not
conservative: various paths from A to B involve different amounts
of work by a unit particle moving in the field.

Example 32.12. Show that the two-dimensional field

$$\mathbf{f}(x, y) = \left(\frac{-y}{x^2 + y^2},\ \frac{x}{x^2 + y^2}\right)$$

is not a conservative field.

Apart from physical constants this represents the circumferential magnetic field
around a straight wire carrying a current, or the velocity field of a vortex. Put
$\mathbf{r} = x\mathbf{i} + y\mathbf{j}$; then

$$\mathbf{f}\cdot\mathbf{r} = \frac{-yx}{x^2 + y^2} + \frac{xy}{x^2 + y^2} = 0.$$

Therefore the field is perpendicular to the radius vector at every point, as in


Fig. 32.16. It is easy to confirm that
/ = -grad V,
where V is a (path-dependent) continuous function such that
tan V = y/x.
Thus we may take V = 9 as described in the case we just discussed. (We cannot
write V = arctan y/x, because this function is discontinuous across the y axis:
it would have an infinite gradient there.) The figure makes it obvious that the
field is not conservative: more work is done if you take a unit magnetic pole
against the field fifty times around the origin in order to travel between two
points than if you go directly.
Fig. 32.16
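The failure of the closed-path criterion (32.10) for the field of Example 32.12 can be confirmed numerically. The sketch below takes the components as printed in the example, (−y, x)/(x² + y²); the function names and the choice of circular paths centred on the origin are illustrative assumptions.

```python
import math

def f(x, y):
    """The field of Example 32.12 as printed: (-y, x) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circulation(radius, n=10000):
    """Midpoint-rule value of the closed line integral of f . dr taken
    anticlockwise round the circle of the given radius centred on O."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t) * h, radius * math.cos(t) * h
        fx, fy = f(x, y)
        total += fx * dx + fy * dy
    return total

print(circulation(1.0), circulation(5.0), 2.0 * math.pi)
```

On any such circle f·dr reduces to dt, so each circuit of the origin contributes 2π whatever the radius. A nonzero integral round a closed path means the field is not conservative in a region that surrounds the origin.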

The field in Example 32.12 is not conservative, but whole classes
of paths are equivalent. Suppose that, as in Fig. 32.17, we have two
paths, (AMB) and (ANB), which can be steadily deformed into each
other (as if A and B were connected by a piece of elastic) without
passing over the origin. Then these two paths are equivalent. In this
case, θ starts at $\theta_A = 0$; although the value of θ wanders about on
(ANB), increasing and decreasing, it still ends at the value $\theta_B = \tfrac12\pi$,
as on the path (AMB).
However, (AMB) cannot be deformed into the third path (ALB)
without passing over the origin; by following it around, it can be seen
that $\theta_B = -\tfrac32\pi$ for this path.

Fig. 32.17

Suppose that we confine consideration to a ‘patch’, or region $\mathcal{R}$
as in Fig. 32.18, which neither contains nor surrounds the origin O.
Then, within this region, the field behaves as if it were conservative,
because any path from A to B inside the region can be deformed
into any other without crossing the origin. We could not tell, from
experiments confined to $\mathcal{R}$, that the field is not conservative over
the whole plane.

Problems

32.1 (Section 32.1.) Evaluate the following line integrals where (AOB) is shown in Fig. 32.19a.
(a) $\int_{(AOB)} x\,dx$; (b) $\int_{(AOB)} y\,dx$; (c) $\int_{(AOB)} x^2\,dx$.

32.2. Evaluate the following integrals; Γ represents the parabolic path (AOB) on $y^2 = x$, shown in Fig. 32.19b.
(a) $\int_\Gamma x\,dx$; (b) $\int_\Gamma y\,dx$; (c) $\int_\Gamma x^2\,dx$;
(d) $\int_\Gamma (x + y)\,dy$; (e) $\int_\Gamma xy^2\,dy$;
(f) $\int_\Gamma (x\,dx + y\,dy)$; (g) $\int_\Gamma (2\,dx - y\,dy)$;
(h) $\int_\Gamma (y\,dx - x\,dy)$.

32.3. (Section 32.2.) Evaluate the following line integrals over the various paths Γ, which are specified parametrically.
(a) $\int_\Gamma xy^2\,dx$; Γ is $x = t^2$, $y = t$; $0 \le t \le 1$.
(b) $\int_\Gamma (x\,dy - y\,dx)$; Γ is $x = \cos t$, $y = \sin t$; $0 \le t \le \pi$.
(c) $\int_\Gamma (z\,dx - x\,dy + y\,dz)$; Γ is $x = t + 1$, $y = t$, $z = 2t$; $0 \le t \le 1$.
(d) $\int_\Gamma (x^2\,dx + y^2\,dy + z^2\,dz)$; Γ is $x = \cos t$, $y = \sin t$, $z = t$; $0 \le t \le 2\pi$.
(e) Compare (c) when Γ joins the same two points, (1, 0, 0) to (2, 1, 2), but $x = t^2 + 1$, $y = 2t - t^2$, $z = 2t^2$; $0 \le t \le 1$.

32.4. (Section 32.2.) The line integral $\int_{(AB)} f(x, y)\,dy$, where the path (AB) is described by the curve y = k(x), can be written formally as
$$\int_{(AB)} f(x, k(x))\,\frac{dk}{dx}\,dx.$$
Apply this formula to $\int_{(AB)} (x + y)\,dy$, taken over the parabolic path in Fig. 32.19b. Express it as the sum of two ordinary integrals over x. (This is like using x as the parameter in Section 32.1.)

32.5. (Section 30.3. The references are to Fig. 32.20.)
(a) $\int_{(ABC)} dx$; (b) $\int_{(AOC)} dy$;
(c) $\int_{(ABC)} (x\,dy - y\,dx)$; (d) $\int_{(AOC)} (x\,dy - y\,dx)$;
(e) $\int_{(ABC)} y\,dy$; (f) $\int_{(AOC)} y\,dy$;
(g) $\int_{(ABC)} (y\,dx + x\,dy)$; (h) $\int_{(AOC)} (y\,dx + x\,dy)$.

Fig. 32.20

32.6. (Section 32.4.) The integrands given are perfect differentials; Γ represents any path having the right direction which joins the two given points.
(a) $\int_\Gamma (x\,dx + y\,dy + z\,dz)$; Γ is (−1, 1, −1) to (1, −1, 1).
(b) $\int_\Gamma (yz\,dx + zx\,dy + xy\,dz)$; Γ is (0, 0, 0) to (1, 1, 1).
(c) $\int_\Gamma e^{x^2 + y^2 + z^2}(x\,dx + y\,dy + z\,dz)$; Γ is (0, 0, 0) to (1, 1, 1).
(d) $\int_\Gamma [(y + z)\,dx + (z + x)\,dy + (x + y)\,dz]$; Γ is (1, 1, 1) to (0, 1, 0).
(e) $\int_\Gamma [\cos(xy + yz + zx)] \times [(y + z)\,dx + (z + x)\,dy + (x + y)\,dz]$; Γ is (1, 0, π) to (0, π, 1).
(f) $\int_\Gamma (xy^2\,dx + x^2y\,dy)$; Γ is (1, 1) to (2, 2).

32.7. (Section 32.5.) Evaluate the following two-dimensional line integrals over the closed paths C given, the direction being anticlockwise.
(a) $\oint_C (x^2\,dy - y^2\,dx)$; C is the circle $x^2 + y^2 = 4$.
(b) $\oint_C \left(\frac{x}{y}\,dx + \frac{y}{x}\,dy\right)$; C is the ellipse $\tfrac14 x^2 + \tfrac19 y^2 = 1$; use the parametrization $x = 2\cos\theta$, $y = 3\sin\theta$.

32.8. Evaluate the following (all the paths C are closed).
(a) $\oint_C (y\,dx + z\,dy + x\,dz)$; C is $x = \sin t$, $y = \cos t$, $z = \sin t$; $0 \le t \le 2\pi$.
(b) $\int_{(ABC)} (y\,dx + z\,dy + x\,dz)$; (ABC) is the triangle A : (1, 0, 0), B : (0, 1, 0), C : (0, 0, 1).

(c) $\oint_C (yz\,dx + zx\,dy + xy\,dz)$; C is any closed path.

32.9. Show that $\int_{(AB)} (yx^2\,dx + \tfrac13 x^3\,dy)$ is path independent between any two points A and B. Use this fact to evaluate the integral along the spiral path given in polar coordinates (r, θ) by $r = e^{\theta/2\pi}$ for $0 \le \theta \le \pi$.

32.10 Show that if $\int_{(AB)} (f\,dx + g\,dy)$ is independent of the path (AB) for every two points A and B, then the integral around every closed path is zero. [Hint: A and B may coincide.]

32.11. Show that if the variables are changed in a perfect differential form, it remains a perfect differential. Illustrate this by transforming the identity $y\,dx + x\,dy = d(xy)$ into polar coordinates.

32.12. (Green’s theorem, Section 32.6.) Confirm the truth of Green’s theorem (32.12) for some very simple cases for which you know you can work out both the line integral and the double integral involved.

32.13. (Green’s theorem, Section 32.6.) Check the correctness of the area formula, Example 32.9, by evaluating the line integral $\tfrac12 \oint (x\,dy - y\,dx)$ taken around the following closed paths.
(a) The circle $x^2 + y^2 = 4$.
(b) The ellipse \x2 + ^y2 = 1.
(c) The triangle with vertices (1, 0), (2, 0), (0, 4).

32.14. Find the area of the star-shaped region bounded by the curve $x^{2/3} + y^{2/3} = 1$, by parametrizing its equation as in Example 32.9.

32.15. The gravitation force F arising from a particle of mass M at the origin upon a particle of mass m at a point with position vector r is given by $\mathbf{F} = -\gamma Mm\mathbf{r}/r^3$. Find the work done by F on a particle which travels in from infinity to r.

32.16. Use Green’s theorem with (32.10) to decide whether the following represent conservative fields (in two dimensions) or not in the stated regions.
(a) $(x^2 - y^2,\ 2xy)$; all x, y.
(b) $(\tfrac12 \ln(x^2 + y^2),\ \arctan(y/x))$; x > 0.

32.17. A force field has field intensity f(x, y, z) = yi + j + xk. Is f conservative? Find the work done against the field by a unit particle moving in a straight line from (0, 0, 0) to (1, 1, 1).

32.18. A force f is given by f(x, y, z) = yzi + xzj + xyk. Show that it is conservative. Find the work done against f along the path x = cos t, y = sin t, z = sin t cos t; — j ^ t ^ jn. Are you doing this the easiest way?

32.19. Prove that a force field f having the form $\mathbf{f} = r^{\alpha}\hat{\mathbf{r}}$, where α is any constant, r is distance from the origin, $\hat{\mathbf{r}}$ is the unit position vector and $\hat{\mathbf{r}} = \mathbf{r}/r$, is a conservative field. [Hint: start by putting $r = (x^2 + y^2 + z^2)^{1/2}$, and guess something that f might be the gradient of. If you cannot guess, then use the fact that $\operatorname{grad} F(r) = \hat{\mathbf{r}}\,(dF/dr)$.]

32.20. Generalize Problem 32.19 to a field $\mathbf{f} = \hat{\mathbf{r}} f(r)$. What is the potential of such a field?

32.21. Confirm that Green’s theorem still holds for the boundary C of the annular region $\mathcal{A}$ between the circles $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$ for the line integral
$$\oint_C [(2x - y^3)\,dx - xy\,dy].$$
What are the directions on C?

32.22. Show that $\oint_C (5x^4 y\,dx + x^5\,dy) = 0$ holds for any closed curve C for which Green’s theorem is true.

32.23. Sketch the curve given parametrically by
x = cos t — f sin 2t, y = sin t; 0 ≤ t < 2π.
Using Green’s theorem, find the area enclosed by the curve.
Vector fields: divergence and curl

Contents
33.1 Vector fields and field lines
33.2 Divergence of a vector field
33.3 Surface and volume integrals
33.4 The divergence theorem
33.5 Curl of a vector field
33.6 Cylindrical polar coordinates
33.7 Curvilinear coordinates
Problems

33.1 Vector fields and field lines


Vector fields in two dimensions have already been encountered in
Section 28.6. A vector field in three dimensions extends this notion
to a vector with three components which are functions of position
in space. In terms of cartesian components a vector field F(x, y, z)
will have the form

$$\mathbf{F}(x, y, z) = F_1(x, y, z)\mathbf{i} + F_2(x, y, z)\mathbf{j} + F_3(x, y, z)\mathbf{k}.$$

Vector fields abound in physical and engineering applications. Fluid
velocity, gravitational forces, magnetic and electric fields are examples
of vector fields. In time-varying applications the vector field and its
components will also depend on a fourth variable, namely time, but
here we shall concentrate only on the position variables.
At each point where the vector field is defined we can draw a
vector. Figure 33.1 shows a region with a sample of local vectors
drawn. Generally their magnitudes and directions will vary from
point to point.

Fig. 33.1 Vector field.

Assuming that the components of the vector field are smooth
functions, we can associate with the vector field, field lines or integral
curves, which are such that the vector field at any point is always
tangential to a field line (Fig. 33.2). (The streamlines in Fig. 28.8 are
field lines for a two-dimensional vector field.) Suppose that a
particular field line is given parametrically by the position vector
r = r(t). Then its tangent is in the direction of dr/dt (see eqn (9.18))
which must be in the same direction as F:

$$\frac{d\mathbf{r}}{dt} = \mu(t)\mathbf{F}(x, y, z),$$

where μ(t) is some scalar function of the parameter t. Hence in
component form

$$\frac{dx}{dt} = \mu(t)F_1(x, y, z), \quad \frac{dy}{dt} = \mu(t)F_2(x, y, z), \quad \frac{dz}{dt} = \mu(t)F_3(x, y, z).$$

Elimination of the unknown μ(t) leads to:

Equations for field lines

$$\frac{dx}{F_1(x, y, z)} = \frac{dy}{F_2(x, y, z)} = \frac{dz}{F_3(x, y, z)} \qquad (33.1)$$

which are essentially two simultaneous differential equations for x,
y, and z. The solution of these equations is not always easy, but here
is an example of a vector field whose field lines can be found.

Fig. 33.2 Field lines.

Example 33.1. Find the field lines of the vector field

$$\mathbf{F} = xy^2z\,\mathbf{i} + xz\,\mathbf{j} + x\,\mathbf{k}.$$

Equation (33.1) becomes

$$\frac{dx}{xy^2z} = \frac{dy}{xz} = \frac{dz}{x},$$

which is equivalent to the two differential equations

$$\frac{dx}{dy} = y^2, \qquad \frac{dy}{dz} = z,$$

which are both separable differential equations. Hence

$$\int dx = \int y^2\,dy, \quad\text{or}\quad x = \tfrac13 y^3 + C_1, \qquad (33.2)$$

and

$$\int dy = \int z\,dz, \quad\text{or}\quad y = \tfrac12 z^2 + C_2. \qquad (33.3)$$

Equations (33.2) and (33.3) are two families of surfaces (both are cylindrical) and
their curves of intersections are the field lines of F.
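The result of Example 33.1 can be checked by tracing a field line numerically and confirming that it stays on one surface from each family. The sketch below assumes μ(t) = 1 in dr/dt = μ(t)F, uses a classical Runge-Kutta step, and starts from an arbitrarily chosen point; none of these choices comes from the text.

```python
def F(x, y, z):
    """Vector field of Example 33.1."""
    return (x * y * y * z, x * z, x)

def rk4_step(p, h):
    """One classical Runge-Kutta step for dr/dt = F(r), i.e. mu(t) = 1."""
    def shifted(q, k, s):
        return tuple(qi + s * ki for qi, ki in zip(q, k))
    k1 = F(*p)
    k2 = F(*shifted(p, k1, 0.5 * h))
    k3 = F(*shifted(p, k2, 0.5 * h))
    k4 = F(*shifted(p, k3, h))
    return tuple(
        pi + h * (a + 2 * b + 2 * c + d) / 6.0
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )

def invariants(p):
    """The constants C1 = x - y^3/3 and C2 = y - z^2/2 of (33.2) and (33.3)."""
    x, y, z = p
    return (x - y ** 3 / 3.0, y - z ** 2 / 2.0)

start = (1.0, 0.5, 0.2)  # arbitrary starting point
point = start
for _ in range(500):     # follow the field line for 0 <= t <= 0.5
    point = rk4_step(point, 0.001)

print(invariants(start))
print(invariants(point))  # should match the line above to integration accuracy
```

Both quantities remain constant along the computed curve, which is what it means for the field line to be the intersection of one surface from (33.2) with one from (33.3).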

33.2 Divergence of a vector field


The divergence of a vector field F, denoted by div F, is a scalar field
defined by:

Divergence of a vector field

$$\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}. \qquad (33.4)$$

The notation ∇·F emphasizes the del operator (Section 28.6)

$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}$$

again. Here ∇·F is the ‘scalar product’ of the operator and the
vector field.

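Example 33.2. Find the divergence of

$$\mathbf{F} = \sin(xy)\,\mathbf{i} + y\cos(z)\,\mathbf{j} + xz\cos(z)\,\mathbf{k}.$$

From the definition above

$$
\begin{aligned}
\operatorname{div}\mathbf{F} &= \frac{\partial}{\partial x}(\sin(xy)) + \frac{\partial}{\partial y}(y\cos z) + \frac{\partial}{\partial z}(zx\cos z) \\
 &= y\cos(xy) + (1 + x)\cos z - zx\sin z.
\end{aligned}
$$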
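A quick numerical check of Example 33.2 compares the stated divergence with a central-difference estimate built directly from definition (33.4). The sample point and the step size h are arbitrary assumptions.

```python
import math

def F(x, y, z):
    """Vector field of Example 33.2."""
    return (math.sin(x * y), y * math.cos(z), x * z * math.cos(z))

def div_exact(x, y, z):
    """The divergence found in Example 33.2."""
    return y * math.cos(x * y) + (1.0 + x) * math.cos(z) - z * x * math.sin(z)

def div_numeric(field, x, y, z, h=1e-5):
    """Central-difference estimate of dF1/dx + dF2/dy + dF3/dz."""
    d1 = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    d2 = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    d3 = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return d1 + d2 + d3

p = (0.7, -1.2, 0.4)  # arbitrary sample point
print(div_numeric(F, *p), div_exact(*p))
```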

33.3 Surface and volume integrals


Let $\mathcal{S}$ be a surface (Figure 33.3), and let δS be an element of area
on $\mathcal{S}$. Let the projection of $\mathcal{S}$ on to the (x, y)-plane be $\mathcal{A}$ and let δA be
the projection of the element δS. (We assume that any line parallel
to the z axis cuts $\mathcal{S}$ in at most one point.) The element δA could be
the rectangular element having area δx δy, in which case, for small
δx and δy, δS would be approximately a parallelogram on the
tangent plane at a point P within δS. Any integral of the form

$$\iint_{\mathcal{S}} f(x, y, z)\,dS,$$

where f(x, y, z) could be either a vector or scalar field defined in
a region containing $\mathcal{S}$, is known as a surface integral. Note that
the element δS is always assumed to be positive in the surface
integral.
The relation between δS and δA in Fig. 33.3 depends on the unit
normal $\hat{\mathbf{n}}$ at P. Consider a vertical plane through P containing the
vectors $\hat{\mathbf{n}}$ and k as shown in Fig. 33.4. Let θ be the smaller angle
between $\hat{\mathbf{n}}$ and k, that is, 0 ≤ θ ≤ 180°.
Then any length perpendicular to this vertical plane will be the
same on both δS and δA, but lengths along δA and δS in the plane
will be in the ratio $|\cos\theta| = |\hat{\mathbf{n}}\cdot\mathbf{k}|$, where the modulus sign is used
since $\hat{\mathbf{n}}$ could be in the opposite sense to that shown in the figure.

Fig. 33.4

Hence

$$\delta A = |\hat{\mathbf{n}}\cdot\mathbf{k}|\,\delta S.$$

Thus

$$\iint_{\mathcal{S}} f(x, y, z)\,dS = \iint_{\mathcal{A}} f(x, y, z)\,\frac{dA}{|\hat{\mathbf{n}}\cdot\mathbf{k}|}, \qquad (33.5)$$

which can be used as a definition of the surface integral.

Example 33.3. The roof of a building has the cylindrical shape $z = h - bx^2$
over a square floor plan given by |x| ≤ a, |y| ≤ a, where $h > 2a^2b$ (see Fig. 33.5).
Find the surface area of the roof.

The surface area is given by

$$S = \iint_{\mathcal{S}} dS.$$

In this case we use cartesian coordinates to define the element δA, which is the
rectangle with sides parallel to the axes with lengths δx and δy. Thus δA = δx δy.
The integration takes place over the square |x| ≤ a, |y| ≤ a in the (x, y)-plane
(Fig. 33.6). We also require the unit normal $\hat{\mathbf{n}}$. By (27.7) the unit normal will be

$$\hat{\mathbf{n}} = \frac{(-2bx,\ 0,\ -1)}{\sqrt{4b^2x^2 + 1}}.$$

Hence

$$|\hat{\mathbf{n}}\cdot\mathbf{k}| = \frac{1}{\sqrt{4b^2x^2 + 1}},$$

and by (33.5)

$$S = \int_{-a}^{a}\int_{-a}^{a} \sqrt{4b^2x^2 + 1}\,dx\,dy.$$

The repeated integral is separable (Section 31.7). Hence

$$S = \int_{-a}^{a}\int_{-a}^{a} \sqrt{4b^2x^2 + 1}\,dx\,dy = 2a\int_{-a}^{a} \sqrt{4b^2x^2 + 1}\,dx.$$

The remaining integral can be evaluated using the substitution x = (sinh u)/(2b).
The result is

$$S = \frac{a}{b}\left[2ab\sqrt{4a^2b^2 + 1} + \sinh^{-1}(2ab)\right].$$
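The closed form at the end of Example 33.3 can be compared with a direct numerical evaluation of the remaining single integral. In this sketch Simpson's rule stands in for the sinh substitution, and the dimensions a = 5, b = 0.1 are illustrative values only.

```python
import math

def roof_area_closed(a, b):
    """Closed form quoted at the end of Example 33.3."""
    return (a / b) * (2 * a * b * math.sqrt(4 * a * a * b * b + 1)
                      + math.asinh(2 * a * b))

def roof_area_numeric(a, b, n=2000):
    """Simpson's rule (n even) for 2a * integral of sqrt(4 b^2 x^2 + 1)
    over -a <= x <= a."""
    g = lambda x: math.sqrt(4 * b * b * x * x + 1)
    h = 2 * a / n
    s = g(-a) + g(a)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(-a + i * h)
    return 2 * a * s * h / 3

a, b = 5.0, 0.1  # illustrative dimensions only
print(roof_area_numeric(a, b), roof_area_closed(a, b))
```

As b tends to zero the roof flattens and both expressions approach the floor area 4a², which gives a second sanity check on the formula.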

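If the surface $\mathcal{S}$ is given by z = f(x, y), we can obtain a general
cartesian formula for the surface area. A vector in the direction of
the normal at any point on the surface is given (see Section 27.5) by

$$\mathbf{n} = \left(-\frac{\partial f}{\partial x},\ -\frac{\partial f}{\partial y},\ 1\right),$$

where we have chosen n to be in the direction in which its k
component is positive. This ensures a positive value for the area. A
unit vector in the direction of n is

$$\hat{\mathbf{n}} = \left(-\frac{\partial f}{\partial x},\ -\frac{\partial f}{\partial y},\ 1\right) \bigg/ \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}.$$

Hence

$$\mathbf{k}\cdot\hat{\mathbf{n}} = 1 \bigg/ \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2},$$

and the surface area of $\mathcal{S}$ is therefore given by

$$S = \iint_{\mathcal{A}} \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\,dx\,dy,$$

where $\mathcal{A}$ is the projection of $\mathcal{S}$ onto the (x, y)-plane.

Fig. 33.6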

A surface in three dimensions is a two-dimensional object, which
means that it can be represented by a position vector which is a
function of two parameters. Remember that for a curve in three
dimensions, the position vector is a function of a single parameter.
Unlike the cartesian form z = f(x, y), parametric equations enable
the creation of much more complicated surfaces.

Parametric form of a surface

A surface can be represented by a position vector r as
a function of two parameters u and v in the form

$$\mathbf{r}(u, v) = x(u, v)\mathbf{i} + y(u, v)\mathbf{j} + z(u, v)\mathbf{k}, \qquad (33.6)$$

where a ≤ u ≤ b, c ≤ v ≤ d.

The parameters u and v are defined over a rectangle in the
(u, v)-plane.
For the surface

$$\mathbf{r} = a\cos u\sin v\,\mathbf{i} + a\sin u\sin v\,\mathbf{j} + a\cos v\,\mathbf{k},$$

we can see that

$$
\begin{aligned}
|\mathbf{r}| &= \sqrt{a^2\cos^2 u\sin^2 v + a^2\sin^2 u\sin^2 v + a^2\cos^2 v} \\
 &= \sqrt{a^2(\cos^2 u + \sin^2 u)\sin^2 v + a^2\cos^2 v} \\
 &= a\sqrt{\sin^2 v + \cos^2 v} = a,
\end{aligned}
$$

which means that the position vector r traces out a sphere of radius
a, centre at the origin. We need to specify u and v to determine
which part of the sphere is defined. For the whole surface, the
parameters must lie in the intervals 0 ≤ u ≤ 2π and 0 ≤ v ≤ π.
More complicated surfaces can be generated in this way, and their
graphical representation has become easier using symbolic computer
software (see Chapter 41, projects for this chapter). For example,
the position vector r defined by

$$\mathbf{r} = (3 + \cos v)\cos u\,\mathbf{i} + (3 + \cos v)\sin u\,\mathbf{j} + \sin v\,\mathbf{k}$$

generates a torus (like the shape of a doughnut) with its axis in the
k direction (Figure 33.7a). The vase-shaped surface in Figure 33.7b
is generated by

$$\mathbf{r} = (1 + a\sin bu)\cos v\,\mathbf{i} + (1 + a\sin bu)\sin v\,\mathbf{j} + u\,\mathbf{k},$$

where a = 0.3, b = 3.5 for 0 ≤ u ≤ 2 and 0 ≤ v < 2π.

Fig. 33.7 (a) Torus; (b) a vase.
Triple integrals or volume integrals can also be defined in vector
calculus. By analogy with the double integral, the triple integral of
a function f (either a scalar or vector field) over a three-dimensional
region $\mathcal{V}$ is

$$I = \iiint_{\mathcal{V}} f(P)\,d\mathcal{V},$$

where $\delta\mathcal{V}$ is an increment of volume and P is a point in $\delta\mathcal{V}$ (see
Figure 33.8). Its evaluation requires it to be converted into a
repeated integral with three integrations.

Fig. 33.8

Example 33.4. A cube of metal occupying |x| ≤ a, |y| ≤ a, |z| ≤ a has a radial
density distribution given by

$$\rho(x, y, z) = \alpha + \beta(x^2 + y^2 + z^2),$$

where α and β are positive constants. Find the mass of the cube.

We choose a rectangular grid with volume element $\delta\mathcal{V} = \delta x\,\delta y\,\delta z$ (Figure 33.9).
The mass of this element is approximately

$$\rho\,\delta x\,\delta y\,\delta z = [\alpha + \beta(x^2 + y^2 + z^2)]\,\delta x\,\delta y\,\delta z.$$

The total mass is therefore the sum or integral of these elements within the cube.
The integral, with δx, δy, and δz parallel to the axes, sweeps out the interior of
the cube if it is integrated in the x, y, and z directions, in turn, between −a and
a in each case. Hence the mass M of the cube is

$$M = \int_{-a}^{a}\int_{-a}^{a}\int_{-a}^{a} [\alpha + \beta(x^2 + y^2 + z^2)]\,dx\,dy\,dz.$$

Fig. 33.9

This integral can now be evaluated as a repeated integral as follows:

$$
\begin{aligned}
M &= \int_{-a}^{a}\int_{-a}^{a} \left[\alpha x + \beta(\tfrac13 x^3 + xy^2 + xz^2)\right]_{x=-a}^{a} dy\,dz \\
 &= \int_{-a}^{a}\int_{-a}^{a} \left[2\alpha a + 2\beta(\tfrac13 a^3 + ay^2 + az^2)\right] dy\,dz \\
 &= \int_{-a}^{a} \left[2\alpha ay + 2\beta(\tfrac13 a^3y + \tfrac13 ay^3 + ayz^2)\right]_{y=-a}^{a} dz \\
 &= \int_{-a}^{a} \left[4\alpha a^2 + 4\beta(\tfrac23 a^4 + a^2z^2)\right] dz \\
 &= \left[4\alpha a^2z + 4\beta(\tfrac23 a^4z + \tfrac13 a^2z^3)\right]_{z=-a}^{a} \\
 &= 8a^3(\alpha + \beta a^2).
\end{aligned}
$$
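The repeated integral in Example 33.4 can be approximated directly as a sum of ρ δx δy δz over a rectangular grid, which mirrors the construction described in the example. The grid size n and the numerical values of a, α, β below are illustrative assumptions.

```python
def mass_numeric(a, alpha, beta, n=40):
    """Midpoint-rule triple sum of rho = alpha + beta*(x^2 + y^2 + z^2)
    over the cube |x|, |y|, |z| <= a, using n cells along each axis."""
    h = 2.0 * a / n
    mids = [-a + (i + 0.5) * h for i in range(n)]
    total = 0.0
    for x in mids:
        for y in mids:
            for z in mids:
                total += (alpha + beta * (x * x + y * y + z * z)) * h ** 3
    return total

def mass_exact(a, alpha, beta):
    """Result of Example 33.4: M = 8 a^3 (alpha + beta a^2)."""
    return 8.0 * a ** 3 * (alpha + beta * a * a)

a, alpha, beta = 1.5, 2.0, 0.5  # illustrative values only
print(mass_numeric(a, alpha, beta), mass_exact(a, alpha, beta))
```

The midpoint sum slightly underestimates the mass because the density is convex in each variable; the gap shrinks in proportion to 1/n².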

33.4 The divergence theorem


The divergence theorem (due to Gauss) relates a volume integral to
a surface integral. Let $\mathcal{V}$ be a region in three dimensions which is
bounded by a smooth surface $\mathcal{S}$. We shall prove the theorem in the
restricted case in which any straight line parallel to any of the
cartesian axes cuts $\mathcal{S}$ in at most two points. It will look something
like the surface shown in Figure 33.10. The theorem is:

Divergence theorem

Let $\mathcal{S}$ be a surface enclosing a region $\mathcal{V}$, and let F be
a smooth vector field defined in $\mathcal{V}$. Then

$$\iiint_{\mathcal{V}} \operatorname{div}\mathbf{F}\,d\mathcal{V} = \iint_{\mathcal{S}} \mathbf{F}\cdot\hat{\mathbf{n}}\,dS, \qquad (33.7)$$

where $\hat{\mathbf{n}}$ is the unit normal to $\mathcal{S}$ drawn outwards from $\mathcal{V}$.

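Within the restrictions imposed on $\mathcal{S}$ we can divide $\mathcal{S}$ into two
surfaces, an upper one, $\mathcal{S}_2$ with equation, say, $z = g_2(x, y)$ and a
lower one $\mathcal{S}_1$ with equation $z = g_1(x, y)$, the two surfaces meeting
on the curve C. We shall use the cartesian increment δx δy δz for $\delta\mathcal{V}$.
The divergence theorem is really the sum of three results. Suppose
that $\mathbf{F} = F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}$, and consider first

$$I_3 = \iiint_{\mathcal{V}} \frac{\partial F_3}{\partial z}\,dx\,dy\,dz.$$

If $\mathcal{A}$ is the projection of C on to the (x, y)-plane, then as a repeated
integral,

$$
\begin{aligned}
I_3 &= \iint_{\mathcal{A}} \int_{z = g_1(x, y)}^{z = g_2(x, y)} \frac{\partial F_3}{\partial z}\,dz\,dx\,dy \\
 &= \iint_{\mathcal{A}} \left[F_3(x, y, z)\right]_{z = g_1(x, y)}^{z = g_2(x, y)} dx\,dy \\
 &= \iint_{\mathcal{A}} \left[F_3(x, y, g_2(x, y)) - F_3(x, y, g_1(x, y))\right] dx\,dy.
\end{aligned}
$$

From the previous section, noting carefully the directions of $\hat{\mathbf{n}}_1$ and
$\hat{\mathbf{n}}_2$, the outward normals to $\mathcal{S}_1$ and $\mathcal{S}_2$, it follows that

$$dx\,dy = \mathbf{k}\cdot\hat{\mathbf{n}}_2\,dS_2 \ \text{on } \mathcal{S}_2 \quad\text{but}\quad dx\,dy = -\mathbf{k}\cdot\hat{\mathbf{n}}_1\,dS_1 \ \text{on } \mathcal{S}_1,$$

since the angle between k and $\hat{\mathbf{n}}_1$ must be obtuse. Hence

$$
\begin{aligned}
I_3 &= \iint_{\mathcal{S}_2} F_3(x, y, g_2(x, y))\,\mathbf{k}\cdot\hat{\mathbf{n}}_2\,dS_2 + \iint_{\mathcal{S}_1} F_3(x, y, g_1(x, y))\,\mathbf{k}\cdot\hat{\mathbf{n}}_1\,dS_1 \\
 &= \iint_{\mathcal{S}} F_3\,\mathbf{k}\cdot\hat{\mathbf{n}}\,dS.
\end{aligned}
$$

Similarly it can be shown that

$$I_1 = \iiint_{\mathcal{V}} \frac{\partial F_1}{\partial x}\,d\mathcal{V} = \iint_{\mathcal{S}} F_1\,\mathbf{i}\cdot\hat{\mathbf{n}}\,dS,$$

$$I_2 = \iiint_{\mathcal{V}} \frac{\partial F_2}{\partial y}\,d\mathcal{V} = \iint_{\mathcal{S}} F_2\,\mathbf{j}\cdot\hat{\mathbf{n}}\,dS.$$

Addition of these results gives the divergence theorem:

$$
\begin{aligned}
I_1 + I_2 + I_3 &= \iiint_{\mathcal{V}} \left(\frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}\right) d\mathcal{V} = \iiint_{\mathcal{V}} \operatorname{div}\mathbf{F}\,d\mathcal{V} \\
 &= \iint_{\mathcal{S}} (F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k})\cdot\hat{\mathbf{n}}\,dS = \iint_{\mathcal{S}} \mathbf{F}\cdot\hat{\mathbf{n}}\,dS.
\end{aligned}
$$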
The divergence theorem tells us something about the physical


interpretation of the divergence of a vector field. In Figure 33.11 the
curves represent streamlines of the flow of an incompressible fluid,
and v is the local velocity of the fluid. Consider any fixed closed
surface $S$ drawn in the flow. Then the outflow through an element
of area $\delta S$ on the surface will be
$\mathbf{v}\cdot\hat{\mathbf{n}}\,\delta S$ per unit time. The total
outflow through $S$ will be

\[
\iint_{S} \mathbf{v}\cdot\hat{\mathbf{n}}\,dS.
\]

Assuming that fluid is neither being created nor destroyed within
$S$, it follows that

\[
\iint_{S} \mathbf{v}\cdot\hat{\mathbf{n}}\,dS = 0,
\]

that is, the net outflow is zero. By the divergence theorem it must
be true that

\[
\iiint_{\mathcal{V}} \operatorname{div}\mathbf{v}\,d\mathcal{V} = 0
\]

for every such surface $S$. Therefore
div v = 0
throughout the flow. This is known as the equation of continuity in
fluid dynamics. A vector field which satisfies div v = 0 is said to be
solenoidal. Generally the divergence of a vector field at a point P
measures the rate at which the vector field spreads out from P.
It is not difficult to generalize the divergence theorem to regions
which have corners or parts of S parallel to an axis as in Figure
33.12a, or to regions for which the two-point rule does not apply
as in Figure 33.12b. This region can be split into regions to each of
which the divergence theorem applies, and the theorem applies to
the whole region by addition. The surface integrals over the joins
cancel out.
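
The theorem can be checked numerically on a simple region. The following is a minimal sketch, not part of the original text: the field $\mathbf{F} = x^2\mathbf{i} + xy\,\mathbf{j} + yz\,\mathbf{k}$, the unit cube $0 \le x, y, z \le 1$, and all function names are assumptions chosen for illustration. For this field $\operatorname{div}\mathbf{F} = 3x + y$, whose integral over the cube is 2, so both estimates should be close to 2.

```python
# Illustrative field (an assumption, not from the text): F = x^2 i + xy j + yz k.
def F(x, y, z):
    return (x * x, x * y, y * z)


def divergence(field, x, y, z, h=1e-5):
    """Central-difference estimate of div F at the point (x, y, z)."""
    d1 = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    d2 = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    d3 = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return d1 + d2 + d3


def volume_integral(field, n=20):
    """Midpoint-rule estimate of the integral of div F over the unit cube."""
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    total = sum(divergence(field, x, y, z)
                for x in mids for y in mids for z in mids)
    return total * d ** 3


def surface_integral(field, n=20):
    """Midpoint-rule estimate of the outward flux through the six cube faces."""
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    flux = 0.0
    for u in mids:
        for v in mids:
            # Outward normals are +i and -i on x = 1 and x = 0, and so on.
            flux += field(1.0, u, v)[0] - field(0.0, u, v)[0]
            flux += field(u, 1.0, v)[1] - field(u, 0.0, v)[1]
            flux += field(u, v, 1.0)[2] - field(u, v, 0.0)[2]
    return flux * d ** 2


lhs = volume_integral(F)   # volume integral of div F
rhs = surface_integral(F)  # surface integral of F . n
print(lhs, rhs)
```

The midpoint rule is used only because it is short to write; any quadrature rule would serve for this check.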

33.5 Curl of a vector field


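The curl of a vector field

\[
\mathbf{F}(x, y, z) = F_1(x, y, z)\mathbf{i} + F_2(x, y, z)\mathbf{j} + F_3(x, y, z)\mathbf{k}
\]

is a vector field defined as follows:

Curl of a vector field

\[
\begin{aligned}
\operatorname{curl}\mathbf{F}
 &= \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\mathbf{i}
  + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\mathbf{j}
  + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\mathbf{k} \\
 &= \begin{vmatrix}
      \mathbf{i} & \mathbf{j} & \mathbf{k} \\
      \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
      F_1 & F_2 & F_3
    \end{vmatrix}.
\end{aligned}
\tag{33.8}
\]

This ‘determinant’ is a useful hybrid form which has unit vectors on
the top row, operators on the second and components on the third,
and it is evaluated using the top-row expansion rule for determinants
(this rule is analogous to the determinant rule for the vector product
given in Section 11.2). The del form is

\[
\operatorname{curl}\mathbf{F} = \nabla \times \mathbf{F}.
\]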

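Example 33.5. Find the curl of

\[
\mathbf{F} = e^{xyz}\mathbf{i} + (x^2 + y)\mathbf{j} + xz\,e^{y}\mathbf{k}.
\]

Using (33.8)

\[
\begin{aligned}
\operatorname{curl}\mathbf{F}
 &= \begin{vmatrix}
      \mathbf{i} & \mathbf{j} & \mathbf{k} \\
      \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
      e^{xyz} & x^2 + y & xz\,e^{y}
    \end{vmatrix} \\
 &= \left(\frac{\partial}{\partial y}(xz\,e^{y}) - \frac{\partial}{\partial z}(x^2 + y)\right)\mathbf{i}
  + \left(\frac{\partial}{\partial z}(e^{xyz}) - \frac{\partial}{\partial x}(xz\,e^{y})\right)\mathbf{j} \\
 &\qquad + \left(\frac{\partial}{\partial x}(x^2 + y) - \frac{\partial}{\partial y}(e^{xyz})\right)\mathbf{k} \\
 &= xz\,e^{y}\mathbf{i} + (xy\,e^{xyz} - z\,e^{y})\mathbf{j} + (2x - xz\,e^{xyz})\mathbf{k}.
\end{aligned}
\]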
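
The result of Example 33.5 can be cross-checked by estimating the partial derivatives in (33.8) with central differences. The sketch below is illustrative only: the helper names `partial`, `curl`, and `curl_exact`, the step size, and the sample point are assumptions, not part of the text.

```python
import math


def F(x, y, z):
    """The vector field of Example 33.5."""
    return (math.exp(x * y * z), x * x + y, x * z * math.exp(y))


def partial(field, comp, var, p, h=1e-5):
    """Central-difference estimate of d(field[comp])/d(p[var]) at the point p."""
    hi, lo = list(p), list(p)
    hi[var] += h
    lo[var] -= h
    return (field(*hi)[comp] - field(*lo)[comp]) / (2 * h)


def curl(field, p):
    """Numerical curl, component by component, following (33.8)."""
    return (
        partial(field, 2, 1, p) - partial(field, 1, 2, p),  # dF3/dy - dF2/dz
        partial(field, 0, 2, p) - partial(field, 2, 0, p),  # dF1/dz - dF3/dx
        partial(field, 1, 0, p) - partial(field, 0, 1, p),  # dF2/dx - dF1/dy
    )


def curl_exact(x, y, z):
    """The closed form obtained in Example 33.5."""
    e = math.exp(x * y * z)
    return (x * z * math.exp(y),
            x * y * e - z * math.exp(y),
            2 * x - x * z * e)


p = (0.3, -0.2, 0.5)  # arbitrary sample point
numeric = curl(F, p)
exact = curl_exact(*p)
print(numeric)
print(exact)
```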

We can interpret the curl of a vector field as follows. Consider a
rectilinear fluid flow with velocity

\[
\mathbf{v} = \omega y\,\mathbf{i} \qquad (\omega \text{ constant})
\]

in which the flow is in one direction only. Imagine that we are
looking down on the surface of the flow in Figure 33.13.
The divergence of $\mathbf{v}$ is zero so that the flow satisfies the equation
of continuity. Its curl is given by

\[
\operatorname{curl}\mathbf{v}
 = \begin{vmatrix}
     \mathbf{i} & \mathbf{j} & \mathbf{k} \\
     \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
     \omega y & 0 & 0
   \end{vmatrix}
 = -\omega\mathbf{k},
\]

which is a vector perpendicular to the surface. The fluid as a whole
does not appear to rotate but a small leaf placed on the flow will
rotate in a clockwise sense as it is carried along with the stream.
For example, if it is placed so that $y > 0$ for all points on the leaf,
then the points furthest from the $x$ axis will be moving faster than
those nearest. The rate of spin turns out to be

\[
\tfrac{1}{2}\operatorname{curl}\mathbf{v} = -\tfrac{1}{2}\omega\mathbf{k}
\]

everywhere.
A vector field which satisfies $\operatorname{curl}\mathbf{v} = \mathbf{0}$ is said to be irrotational.
There are two important identities for special vector fields.

A conservative field with potential $\phi$ is irrotational since

\[
\operatorname{curl}\operatorname{grad}\phi = \mathbf{0}. \tag{33.9}
\]

The curl of a vector field $\mathbf{F}$ is solenoidal since

\[
\operatorname{div}\operatorname{curl}\mathbf{F} = 0. \tag{33.10}
\]

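The verification of these results is straightforward. For the first
one

\[
\begin{aligned}
\operatorname{curl}\operatorname{grad}\phi
 &= \begin{vmatrix}
      \mathbf{i} & \mathbf{j} & \mathbf{k} \\
      \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
      \dfrac{\partial\phi}{\partial x} & \dfrac{\partial\phi}{\partial y} & \dfrac{\partial\phi}{\partial z}
    \end{vmatrix} \\
 &= \left(\frac{\partial^2\phi}{\partial y\,\partial z} - \frac{\partial^2\phi}{\partial z\,\partial y}\right)\mathbf{i}
  + \left(\frac{\partial^2\phi}{\partial z\,\partial x} - \frac{\partial^2\phi}{\partial x\,\partial z}\right)\mathbf{j}
  + \left(\frac{\partial^2\phi}{\partial x\,\partial y} - \frac{\partial^2\phi}{\partial y\,\partial x}\right)\mathbf{k} \\
 &= \mathbf{0},
\end{aligned}
\]

assuming that the scalar field is smooth enough to ensure that all the
mixed partial derivatives cancel.
For the second result

\[
\begin{aligned}
\operatorname{div}\operatorname{curl}\mathbf{F}
 &= \frac{\partial}{\partial x}\left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)
  + \frac{\partial}{\partial y}\left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)
  + \frac{\partial}{\partial z}\left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) \\
 &= \frac{\partial^2 F_3}{\partial x\,\partial y} - \frac{\partial^2 F_2}{\partial x\,\partial z}
  + \frac{\partial^2 F_1}{\partial y\,\partial z} - \frac{\partial^2 F_3}{\partial y\,\partial x}
  + \frac{\partial^2 F_2}{\partial z\,\partial x} - \frac{\partial^2 F_1}{\partial z\,\partial y} \\
 &= 0.
\end{aligned}
\]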

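Example 33.6. Show that the vector field

\[
\mathbf{F} = (y^2 z + z + y\,e^{xy})\mathbf{i} + (2xyz + x\,e^{xy})\mathbf{j} + (xy^2 + x)\mathbf{k}
\]

is conservative. Find the scalar potential of $\mathbf{F}$.

We first check that $\operatorname{curl}\mathbf{F} = \mathbf{0}$. Thus

\[
\begin{aligned}
\operatorname{curl}\mathbf{F}
 &= \begin{vmatrix}
      \mathbf{i} & \mathbf{j} & \mathbf{k} \\
      \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
      y^2 z + z + y\,e^{xy} & 2xyz + x\,e^{xy} & xy^2 + x
    \end{vmatrix} \\
 &= \left(\frac{\partial}{\partial y}(xy^2 + x) - \frac{\partial}{\partial z}(2xyz + x\,e^{xy})\right)\mathbf{i}
  + \left(\frac{\partial}{\partial z}(y^2 z + z + y\,e^{xy}) - \frac{\partial}{\partial x}(xy^2 + x)\right)\mathbf{j} \\
 &\qquad + \left(\frac{\partial}{\partial x}(2xyz + x\,e^{xy}) - \frac{\partial}{\partial y}(y^2 z + z + y\,e^{xy})\right)\mathbf{k} \\
 &= \mathbf{i}(2xy - 2xy) + \mathbf{j}[(y^2 + 1) - (y^2 + 1)] \\
 &\qquad + \mathbf{k}[(2yz + e^{xy} + xy\,e^{xy}) - (2yz + e^{xy} + xy\,e^{xy})] \\
 &= \mathbf{0}.
\end{aligned}
\]

We now need to find $\phi$ such that $\operatorname{grad}\phi = \mathbf{F}$, that is,

\[
\frac{\partial\phi}{\partial x} = y^2 z + z + y\,e^{xy}, \qquad
\frac{\partial\phi}{\partial y} = 2xyz + x\,e^{xy}, \qquad
\frac{\partial\phi}{\partial z} = xy^2 + x.
\]

Integrate the partial derivatives with respect to $x$, $y$, and $z$ to give:

\[
\phi = \int (y^2 z + z + y\,e^{xy})\,dx + f(y, z) = xy^2 z + xz + e^{xy} + f(y, z); \tag{33.11}
\]

\[
\phi = \int (2xyz + x\,e^{xy})\,dy + g(z, x) = xy^2 z + e^{xy} + g(z, x); \tag{33.12}
\]

\[
\phi = \int (xy^2 + x)\,dz + h(x, y) = xy^2 z + xz + h(x, y). \tag{33.13}
\]

Here the ‘constants of integration’ become functions of the other two variables
in each case since partial derivatives are being integrated. Finally, $\phi$ given by
(33.11), (33.12), and (33.13) must all result in the same answer. This can be
achieved by the choices

\[
f(y, z) = C, \qquad g(z, x) = xz + C, \qquad h(x, y) = e^{xy} + C,
\]

where $C$ is any constant. Hence

\[
\phi = xy^2 z + xz + e^{xy} + C.
\]

Note that potentials of conservative fields can only be found to within an
additive constant.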
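
A quick way to gain confidence in a potential found this way is to differentiate it numerically and compare with the field. The sketch below does this for Example 33.6 with $C = 0$; the helper `grad`, the step size, and the sample points are assumptions made for illustration.

```python
import math


def phi(x, y, z):
    """Scalar potential from Example 33.6, taking C = 0."""
    return x * y ** 2 * z + x * z + math.exp(x * y)


def F(x, y, z):
    """The vector field of Example 33.6."""
    e = math.exp(x * y)
    return (y ** 2 * z + z + y * e, 2 * x * y * z + x * e, x * y ** 2 + x)


def grad(f, p, h=1e-5):
    """Central-difference estimate of grad f at the point p."""
    g = []
    for k in range(3):
        hi, lo = list(p), list(p)
        hi[k] += h
        lo[k] -= h
        g.append((f(*hi) - f(*lo)) / (2 * h))
    return tuple(g)


# Arbitrary sample points; grad(phi) should agree with F at each of them.
points = [(0.5, -1.0, 2.0), (1.2, 0.3, -0.7), (0.0, 0.0, 0.0)]
max_error = max(abs(g - f)
                for p in points
                for g, f in zip(grad(phi, p), F(*p)))
print(max_error)
```

Agreement at a few points is not a proof, but a mismatch would immediately flag an error in one of the integrations (33.11) to (33.13).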

33.6 Cylindrical polar coordinates


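In many applications it is advantageous to use alternative
three-dimensional coordinate systems. Usually the geometry of the
application suggests an appropriate system, and one system which is
suitable for problems involving cylinders uses cylindrical polar
coordinates $(\rho, \phi, z)$ defined by (see Fig. 33.14):

\[
x = \rho\cos\phi, \qquad y = \rho\sin\phi, \qquad z = z. \tag{33.14}
\]

Fig. 33.14 Cylindrical polar coordinates.

A point $P$ can be viewed as lying at the intersection of three surfaces
(Fig. 33.15), the cylinder $\rho$ = a constant, the radial plane $\phi$ = a
constant through the $z$ axis, and the horizontal plane $z$ = a constant.
These surfaces meet at right angles at every point, and coordinate
systems with this property are said to be orthogonal.
The point $P$ can be represented by the position vector

\[
\mathbf{r} = \mathbf{r}(\rho, \phi, z) = \rho\cos\phi\,\mathbf{i} + \rho\sin\phi\,\mathbf{j} + z\mathbf{k}.
\]

Along the $\rho$-increasing line through $P$, $\phi$ and $z$ are constant.
The vector $\partial\mathbf{r}/\partial\rho$, evaluated at $P$, is a tangent to this curve at $P$,
pointing in the direction of increasing $\rho$. The corresponding unit
vector in this direction is

\[
\hat{\mathbf{e}}_\rho = \frac{\partial\mathbf{r}}{\partial\rho} \bigg/ \left|\frac{\partial\mathbf{r}}{\partial\rho}\right|
 = \frac{1}{h_\rho}\frac{\partial\mathbf{r}}{\partial\rho},
\]

where $h_\rho = |\partial\mathbf{r}/\partial\rho|$ is called the scale factor associated with $\rho$. In
cylindrical coordinates

\[
h_\rho = \left|\frac{\partial\mathbf{r}}{\partial\rho}\right|
 = |\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}|
 = (\cos^2\phi + \sin^2\phi)^{1/2} = 1.
\]

Similarly, $h_\phi = \rho$ and $h_z = 1$. Therefore the unit vectors in cylindrical
polars are

\[
\hat{\mathbf{e}}_\rho = \frac{\partial\mathbf{r}}{\partial\rho}, \qquad
\hat{\mathbf{e}}_\phi = \frac{1}{\rho}\frac{\partial\mathbf{r}}{\partial\phi}, \qquad
\hat{\mathbf{e}}_z = \frac{\partial\mathbf{r}}{\partial z}. \tag{33.15}
\]

All partial derivatives are to be evaluated at $P$.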


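The gradient of a scalar function $U(x, y, z)$ can be expressed in
terms of the unit vectors $\hat{\mathbf{e}}_\rho$, $\hat{\mathbf{e}}_\phi$, $\hat{\mathbf{e}}_z$. Suppose that

\[
\operatorname{grad} U = g_\rho\hat{\mathbf{e}}_\rho + g_\phi\hat{\mathbf{e}}_\phi + g_z\hat{\mathbf{e}}_z,
\]

where we require the components $g_\rho$, $g_\phi$, $g_z$. Treating $U$ as a function
of $\rho$, $\phi$, $z$, the incremental formula gives

\[
\delta U = \frac{\partial U}{\partial\rho}\,\delta\rho + \frac{\partial U}{\partial\phi}\,\delta\phi + \frac{\partial U}{\partial z}\,\delta z. \tag{33.16}
\]

Also, from (30.15), the directional derivative of $U$ is

\[
\frac{dU}{ds} = \hat{\mathbf{s}}\cdot\operatorname{grad} U,
\]

where $\hat{\mathbf{s}}$ represents an arbitrary direction. Since $|\delta s| = |\delta\mathbf{r}|$,

\[
\begin{aligned}
\hat{\mathbf{s}} = \frac{d\mathbf{r}}{ds}
 &= \frac{\partial\mathbf{r}}{\partial\rho}\frac{d\rho}{ds}
  + \frac{\partial\mathbf{r}}{\partial\phi}\frac{d\phi}{ds}
  + \frac{\partial\mathbf{r}}{\partial z}\frac{dz}{ds} \\
 &= h_\rho\frac{d\rho}{ds}\hat{\mathbf{e}}_\rho
  + h_\phi\frac{d\phi}{ds}\hat{\mathbf{e}}_\phi
  + h_z\frac{dz}{ds}\hat{\mathbf{e}}_z.
\end{aligned}
\]

Therefore,

\[
\frac{dU}{ds} = h_\rho g_\rho\frac{d\rho}{ds} + h_\phi g_\phi\frac{d\phi}{ds} + h_z g_z\frac{dz}{ds},
\]

or, expressed in increments,

\[
\delta U = h_\rho g_\rho\,\delta\rho + h_\phi g_\phi\,\delta\phi + h_z g_z\,\delta z. \tag{33.17}
\]

Compare (33.16) and (33.17), which are true for arbitrary $\delta\rho$, $\delta\phi$, $\delta z$.
In turn, put one of $\delta\rho$, $\delta\phi$, $\delta z$ to a nonzero value, and the other two
to zero. We obtain

\[
g_\rho = \frac{1}{h_\rho}\frac{\partial U}{\partial\rho}, \qquad
g_\phi = \frac{1}{h_\phi}\frac{\partial U}{\partial\phi}, \qquad
g_z = \frac{1}{h_z}\frac{\partial U}{\partial z},
\]

so that $\operatorname{grad} U$ is given by

\[
\begin{aligned}
\operatorname{grad} U
 &= \frac{1}{h_\rho}\frac{\partial U}{\partial\rho}\hat{\mathbf{e}}_\rho
  + \frac{1}{h_\phi}\frac{\partial U}{\partial\phi}\hat{\mathbf{e}}_\phi
  + \frac{1}{h_z}\frac{\partial U}{\partial z}\hat{\mathbf{e}}_z \\
 &= \frac{\partial U}{\partial\rho}\hat{\mathbf{e}}_\rho
  + \frac{1}{\rho}\frac{\partial U}{\partial\phi}\hat{\mathbf{e}}_\phi
  + \frac{\partial U}{\partial z}\hat{\mathbf{e}}_z.
\end{aligned}
\]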

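The divergence and curl also have their cylindrical polar forms.
For the vector field $\mathbf{F} = F_\rho\hat{\mathbf{e}}_\rho + F_\phi\hat{\mathbf{e}}_\phi + F_z\hat{\mathbf{e}}_z$, these are

\[
\operatorname{div}\mathbf{F} = \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho F_\rho)
 + \frac{\partial}{\partial\phi}(F_\phi)
 + \frac{\partial}{\partial z}(\rho F_z)\right]
\]

and

\[
\begin{aligned}
\operatorname{curl}\mathbf{F}
 &= \frac{1}{\rho}
    \begin{vmatrix}
      \hat{\mathbf{e}}_\rho & \rho\,\hat{\mathbf{e}}_\phi & \hat{\mathbf{e}}_z \\
      \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\
      F_\rho & \rho F_\phi & F_z
    \end{vmatrix} \\
 &= \left(\frac{1}{\rho}\frac{\partial F_z}{\partial\phi} - \frac{\partial F_\phi}{\partial z}\right)\hat{\mathbf{e}}_\rho
  + \left(\frac{\partial F_\rho}{\partial z} - \frac{\partial F_z}{\partial\rho}\right)\hat{\mathbf{e}}_\phi
  + \frac{1}{\rho}\left(\frac{\partial}{\partial\rho}(\rho F_\phi) - \frac{\partial F_\rho}{\partial\phi}\right)\hat{\mathbf{e}}_z.
\end{aligned}
\]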
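
As a consistency check on the cylindrical polar forms, they can be applied to a field whose cartesian divergence and curl are already known. The field chosen here, the position vector $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, is an illustrative choice and not part of the original text. In cylindrical polars it has components $F_\rho = \rho$, $F_\phi = 0$, $F_z = z$, and in cartesians $\operatorname{div}\mathbf{r} = 3$ and $\operatorname{curl}\mathbf{r} = \mathbf{0}$. The cylindrical formulas should reproduce both:

```latex
\[
\operatorname{div}\mathbf{r}
 = \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho\cdot\rho)
   + \frac{\partial}{\partial\phi}(0)
   + \frac{\partial}{\partial z}(\rho z)\right]
 = \frac{1}{\rho}\,(2\rho + 0 + \rho) = 3,
\]
\[
\operatorname{curl}\mathbf{r}
 = \left(\frac{1}{\rho}\frac{\partial z}{\partial\phi} - 0\right)\hat{\mathbf{e}}_\rho
 + \left(\frac{\partial\rho}{\partial z} - \frac{\partial z}{\partial\rho}\right)\hat{\mathbf{e}}_\phi
 + \frac{1}{\rho}\left(\frac{\partial}{\partial\rho}(0) - \frac{\partial\rho}{\partial\phi}\right)\hat{\mathbf{e}}_z
 = \mathbf{0}.
\]
```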

33.7 Curvilinear coordinates


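In this section we simply summarize the generalizations of the results
given in the previous section for cylindrical polar coordinates to
orthogonal curvilinear coordinates. Suppose that the position vector
$\mathbf{r}$ of a point is expressed in terms of the curvilinear coordinates
$u_1$, $u_2$, $u_3$ so that

\[
\mathbf{r} = \mathbf{r}(u_1, u_2, u_3) = x(u_1, u_2, u_3)\mathbf{i} + y(u_1, u_2, u_3)\mathbf{j} + z(u_1, u_2, u_3)\mathbf{k}.
\]

Assume that the curvilinear coordinates are orthogonal, that is,
the surfaces $u_1$ = a constant, $u_2$ = a constant, $u_3$ = a constant meet
at right angles at every point (Fig. 33.16). The unit vector $\hat{\mathbf{e}}_1$ is in
the direction of the curve along which the surfaces $u_2$ = constant
and $u_3$ = constant meet, and it points in the direction of $u_1$
increasing. The other unit vectors are in the directions of the
intersections of the other surface pairs as shown in Fig. 33.16.

Fig. 33.16 Orthogonal curvilinear coordinates.

The scale factors and unit vectors are given by:

Scale factors, unit vectors

\[
\begin{gathered}
h_1 = \left|\frac{\partial\mathbf{r}}{\partial u_1}\right|, \qquad
h_2 = \left|\frac{\partial\mathbf{r}}{\partial u_2}\right|, \qquad
h_3 = \left|\frac{\partial\mathbf{r}}{\partial u_3}\right|, \\
\hat{\mathbf{e}}_1 = \frac{1}{h_1}\frac{\partial\mathbf{r}}{\partial u_1}, \qquad
\hat{\mathbf{e}}_2 = \frac{1}{h_2}\frac{\partial\mathbf{r}}{\partial u_2}, \qquad
\hat{\mathbf{e}}_3 = \frac{1}{h_3}\frac{\partial\mathbf{r}}{\partial u_3}.
\end{gathered}
\tag{33.18}
\]

Elements of distance $\delta s$ in the $u_1$, $u_2$, $u_3$ directions are
respectively

\[
h_1\,\delta u_1, \qquad h_2\,\delta u_2, \qquad h_3\,\delta u_3.
\]

We simply state the formulas for grad, div, and curl in general
curvilinear coordinates without derivation. They are given by:

Gradient of $U$

\[
\operatorname{grad} U = \frac{1}{h_1}\frac{\partial U}{\partial u_1}\hat{\mathbf{e}}_1
 + \frac{1}{h_2}\frac{\partial U}{\partial u_2}\hat{\mathbf{e}}_2
 + \frac{1}{h_3}\frac{\partial U}{\partial u_3}\hat{\mathbf{e}}_3 \tag{33.19}
\]

Divergence of $\mathbf{F} = F_1\hat{\mathbf{e}}_1 + F_2\hat{\mathbf{e}}_2 + F_3\hat{\mathbf{e}}_3$

\[
\operatorname{div}\mathbf{F} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial u_1}(h_2 h_3 F_1)
 + \frac{\partial}{\partial u_2}(h_3 h_1 F_2)
 + \frac{\partial}{\partial u_3}(h_1 h_2 F_3)\right] \tag{33.20}
\]

Curl of $\mathbf{F} = F_1\hat{\mathbf{e}}_1 + F_2\hat{\mathbf{e}}_2 + F_3\hat{\mathbf{e}}_3$

\[
\operatorname{curl}\mathbf{F} = \frac{1}{h_1 h_2 h_3}
 \begin{vmatrix}
   h_1\hat{\mathbf{e}}_1 & h_2\hat{\mathbf{e}}_2 & h_3\hat{\mathbf{e}}_3 \\
   \dfrac{\partial}{\partial u_1} & \dfrac{\partial}{\partial u_2} & \dfrac{\partial}{\partial u_3} \\
   h_1 F_1 & h_2 F_2 & h_3 F_3
 \end{vmatrix} \tag{33.21}
\]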

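Example 33.7. In terms of $(x, y, z)$, spherical polar coordinates $(r, \theta, \phi)$ are
given by

\[
\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}
 = r\sin\theta\cos\phi\,\mathbf{i} + r\sin\theta\sin\phi\,\mathbf{j} + r\cos\theta\,\mathbf{k},
\]
\[
r \ge 0, \qquad 0 \le \theta \le \pi, \qquad 0 \le \phi < 2\pi.
\]

The coordinates are shown in Fig. 33.17: the coordinates are orthogonal with
coordinate surfaces of a sphere, $r$ = constant, a vertical plane, $\phi$ = constant and
a cone, $\theta$ = constant. Find the scale factors of these curvilinear coordinates.
Hence, obtain the gradient of the scalar field $U$, and the divergence of the vector
field $\mathbf{F} = F_r\hat{\mathbf{e}}_r + F_\theta\hat{\mathbf{e}}_\theta + F_\phi\hat{\mathbf{e}}_\phi$ in spherical polar coordinates.

Fig. 33.17 Spherical polar coordinates $(r, \theta, \phi)$.

The scale factors are

\[
\begin{aligned}
h_r &= \left|\frac{\partial\mathbf{r}}{\partial r}\right|
     = |\sin\theta\cos\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k}|
     = \sqrt{\sin^2\theta(\cos^2\phi + \sin^2\phi) + \cos^2\theta} = 1, \\
h_\theta &= \left|\frac{\partial\mathbf{r}}{\partial\theta}\right|
     = |r\cos\theta\cos\phi\,\mathbf{i} + r\cos\theta\sin\phi\,\mathbf{j} - r\sin\theta\,\mathbf{k}|
     = r\sqrt{\cos^2\theta(\cos^2\phi + \sin^2\phi) + \sin^2\theta} = r, \\
h_\phi &= \left|\frac{\partial\mathbf{r}}{\partial\phi}\right|
     = |-r\sin\theta\sin\phi\,\mathbf{i} + r\sin\theta\cos\phi\,\mathbf{j}| = r\sin\theta.
\end{aligned}
\]

From (33.19)

\[
\operatorname{grad} U = \frac{\partial U}{\partial r}\hat{\mathbf{e}}_r
 + \frac{1}{r}\frac{\partial U}{\partial\theta}\hat{\mathbf{e}}_\theta
 + \frac{1}{r\sin\theta}\frac{\partial U}{\partial\phi}\hat{\mathbf{e}}_\phi.
\]

By (33.20)

\[
\operatorname{div}\mathbf{F} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2 F_r)
 + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,F_\theta)
 + \frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}(F_\phi).
\]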

Problems

33.1. Find the surface area of the spherical cap of height 33.2. Evaluate the following triple integrals as repeated
h whose equation for z 0 is integrals:

M fx r 2y

(a) x dx dy dz;
z = j[a2 - x2 - y2] - a + h, (0<h^ a).
JO Jo J y

» : fvn -.>■’]
P1 Show that 0 = 1/V*2 + y2 + z2 is a solution of
(b) x dx dy dz;
0 J0 J0 Laplace’s equation.

f 1 f = (Vn - .v2--2]
(c) X3 dx dy dz 33.10. Prove that
• 0 J 0 J -yv n -.v2-:-]
(a) div(F + G) = div F + div G;
33.3. It is intended to evaluate the integral (b) curI(F + G) = curl F + curl G.

33.11. Find the divergence of each of the following vector


fields:
as a repeated integral over the interior of the sphere
A'2 + y2 + z2 = a2 which lies in the first octant x ^ 0, (a) F = exy: i + sy2zj + tx:k\
y ^ 0, z > 0. Work out the limits of integration if the (b) F = (xz - y)i + yzj + 2xyi;
order of integration is x followed by y followed by z. (c) F = (xz — y2)i + yzj + lx2yk.

Indicate any vector fields which are solenoidal.


33.4. Show that the volume of the tetrahedron bounded
by the coordinate planes x = 0, y = 0, z = 0 and the
plane 33.12. Find the curl of each of the following vector fields:

(a) F = exy:i + ey2'~j + exzk;


- + - + - = 1, a > 0, b > 0, c > 0,
a h c (b) F = (xz — y)i + yzj + 2xyi\
(c) F = (2xy + yz)i + (x2 + xz)j + xyk.
is 5abc.

33.13. A vector field v is both irrotational and solenoidal.


33.5. Find the area of the surface z = x2 + y for which
|.v| ^ 1 and \y\ ^ 1. Show that its scalar potential <t> satisfies Laplace’s equa¬
tion:
33.6. Show that the vector field
V20 = 0.
F=(yz exy: — y sin xy + z)i+(xz exy: — x sin xy)j

+ (xy cxy- + x)k, 33.14. If r = xi + yj + zk and r is its magnitude find

is irrotational. Find the scalar potential of F. (a) div(r2r);


(b) curl(r3r);
33.7. Paraboloidal coordinates (u, v, 0) are defined by (c) grad r3;
(d) div(r/r3);
x = uv cos 0, y = uv sin 0, z = j(u2 — v2),
(e) curl((r/r2);
where u^0, v^0 and O^0<2jt. Find the corresponding (J) div grad r3-
scale factors. Find also div Fin paraboloidal coordinates.
33.15. Prove the identity
33.8. Using the definitions of grad, div, and curl, verify
the following identities: (vgrad)r = ^ grad ir — r x curl v,

(a) grad(f/F) = U grad V + V grad U; where v = | v|.


(b) div (UF) = (grad U)-F + U div F;
(c) div(F x G) = (curl F)-G — F-(curl (7);
33.16. Show that Laplace’s equation (see Problem 33.9)
(d) curl curl F = grad(div F) — div grad F. By div grad Fis
in cylindrical polar coordinates is given by
meant / div(grad F,)+ jdiv(grad F2) + k div(grad F3);
(e) grad(F-(7) = Fx curl G+Gx curl F+(F-grad)G 1 c ( dU\ 1 d2U c2U
+ ((7grad)F. -Ip — +-+-= 0.
pdf)\ dp J p2 <502 dz2
33.9. Show that
If U = f(p), that is, U is independent of the other
d 20 d2cp d2tj) variables, show that / satisfies the ordinary differential
div grad cj) = — + ^ equation
ox oy cz

This is often written as V20. The equation pf"(p) + f'(p) = 0-


V20 = 0 Flence show that f(p) = A + B In p, where A and B are
is known as Laplace’s equation. constants.

33.17. Show that Laplace’s equation in spherical polar ence theorem to show that the surface area of 5 is given
coordinates is given by by

1 8 2dU 1 d ( . 8U
7 >■>
+ — sin 0 — div F d'U,
r or or r2 sin 9 80 \ 86
82U where V is the interior of S.
+-= 0.
r2 sin2 0 6(j)2

A solution with spherical symmetry is sought for U, 33.21. Let S be a closed surface surrounding a region V
that is with U = f(r). Show that f(r) = A + (B/r), where for which the divergence theorem holds. By using the
A and B are constants. vector field r = xi + yj + zk, show that the volume
enclosed by S is
33.18. A vector field is given by F = xy2i + xzj + xyzk.
Let S be the surface of a cube bounded by the planes
,\-= + 1, y = ±Lz= ± 1. Use the divergence theorem to
evaluate
where n is the outward normal to S.
Using this result verify that
F-h dA,
(a) the volume of the sphere enclosed by x2 + y2 + z2 = a2
is %na .
where h is the outward normal to the cube. (b) the volume of a cone with vertex at the origin and
plane base of area A in the plane z = h is jAh.
33.19. Prove that

ff 33.22. Let S be a closed surface surround a region


n-curl F dS = 0 V for which the divergence theorem holds. Let F be a
J J J»
vector field which satisfies div F — 1 in a region which
for any closed surfaced for which the divergence theorem contains V. Show that the volume enclosed by S is given
holds. by the formula

33.20. Suppose that F is a smooth vector field which


F-rtdS.
equals the outward unit normal n on S- Use the diverg- J Js
PART VI

Sets

Contents
34.1 Notation 605
34.2 Equality, union, and intersection 606
34.3 Venn diagrams 608
Problems 612

34.1 Notation
We are often interested in grouping together objects that have
common characteristics or features. We might be interested in the
integers 1, 2, 3, 4, or in all the integers. The set of all points in a
plane would consist of pairs of numbers of the form (x, y), where x
and y are coordinates which can take any real values. These
examples all involve numbers, but the elements of sets can be other
objects such as functions, or matrices, or Fourier series, or Laplace
transforms, etc.
A set is a collection of objects or elements. The elements in the
set can be defined by a rule or in any descriptive manner. Sets are
usually denoted by capital letters such as S, A, B, X, etc, and their
elements by lowercase letters such as s, a, b, x, etc. The elements in
a set are listed between braces {...}. If the set A consists of just
two numbers 0 and 1, then we write
\[
A = \{0, 1\}, \quad\text{or}\quad A = \{1, 0\}, \tag{34.1}
\]
the order being a matter of indifference. We say that 0 and 1 are
the elements or members of the set $A$, or belong to $A$. We write
\[
0 \in A, \qquad 1 \in A,
\]
read as ‘0 belongs to the set $A$’, etc. The number 2 does not belong
to $A$, and we write
\[
2 \notin A,
\]
that is, ‘2 does not belong to the set $A$’.
The set defined by (34.1) is the binary set which could represent
the on and off states of a system. This could be the state of a light
switch, for example.
Sets can be either finite, having a finite number of elements, or
infinite, in which case the set contains an infinite number of
elements. Thus the set given by (34.1) defines a finite set $A$, while
\[
S = \{1, 2, 3, \ldots\},
\]
the list of all positive integers, defines an infinite set.
Some of the more common sets have their own special symbols:

$\mathbb{R}$, the set of all real numbers
$\mathbb{C}$, the set of all complex numbers
$\mathbb{R}^+$, the set of all positive real numbers (excludes zero)
$\mathbb{Z}$, the set of all integers (positive, negative and zero)
$\mathbb{N}^+$, the set of all positive integers                      (34.2)
$\mathbb{N}^-$, the set of all negative integers
$\mathbb{Q}$, the set of all rational numbers (that is, numbers of
the form $p/q$ where $p$ and $q$ are integers and $q \ne 0$)

Often the elements are defined by a rule rather than by a list or
formula. We write the set as
\[
S = \{x \mid x \text{ satisfies specified rules}\},
\]
which can be translated as ‘$S$ is the set of values of $x$ which satisfy
the stated rules’. The rules occur after the vertical bar $\mid$. Thus
\[
S = \{x \mid x \in \mathbb{N}^+ \text{ and } 2 \le x \le 8\}
\]
is an alternative way of writing $S = \{2, 3, 4, 5, 6, 7, 8\}$. As another
example,
\[
S = \{x \mid x \in \mathbb{R} \text{ and } 0 \le x \le 1\}
\]
is the closed interval $[0, 1]$.
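
The rule-based notation has a close analogue in programming. As a rough illustration (not part of the original text), the Python sketch below builds the finite set $\{x \mid x \in \mathbb{N}^+ \text{ and } 2 \le x \le 8\}$ with a set comprehension and tests membership in the binary set of (34.1). The upper bound of 100 is an arbitrary stand-in for ‘all positive integers’, and an infinite set such as $[0, 1]$ can only be represented by its membership rule, here the hypothetical function `in_unit_interval`.

```python
# S = {x | x in N+ and 2 <= x <= 8}; range(1, 100) is an arbitrary finite
# stand-in for the positive integers.
S = {x for x in range(1, 100) if 2 <= x <= 8}

# The binary set of (34.1); the order of listing does not matter.
A = {0, 1}


def in_unit_interval(x):
    """Membership rule for the closed interval [0, 1], an infinite set."""
    return 0 <= x <= 1


print(sorted(S))
print(0 in A, 2 not in A)
print(in_unit_interval(0.5), in_unit_interval(1.5))
```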

34.2 Equality, union, and intersection


Two sets A and B are said to be equal if they contain exactly the
same elements. If this is the case, we write
A = B.
For example,
A = {1,2, 3}, B = {3,2,1}, C = {3, 1,2,1}

are all equal, that is, A = B = C. The order of the elements is


immaterial, and repeated elements are discounted.
The set of all elements of interest in a given context is known as
the universal set. It could be the set $\mathbb{R}$ (the
set of real numbers), or the set of all complex numbers, but it will
vary from application to application.
We now define how sets can be combined to create new sets. The
union of two sets $A$ and $B$ is the set of all elements that belong to
$A$, or to $B$, or to both. It is written as
\[
A \cup B = \{x \mid x \in A \text{ or } x \in B \text{ or both}\},
\]
and read as ‘$A$ union $B$’.

Example 34.1. Find the union of
\[
A = \{x \mid x \in \mathbb{R} \text{ and } 0 \le x \le 2\}
\quad\text{and}\quad
B = \{x \mid x \in \mathbb{R} \text{ and } 1 \le x \le 3\}.
\]
The elements in the union have to belong to one or other of the intervals
$0 \le x \le 2$, or $1 \le x \le 3$, or to both. The interval $0 \le x \le 3$ contains all these
numbers. Hence
\[
A \cup B = \{x \mid x \in \mathbb{R} \text{ and } 0 \le x \le 3\}.
\]

The intersection of two sets $A$ and $B$ is the set $A \cap B$ that contains
all elements common to both $A$ and $B$. It is written and defined by
\[
A \cap B = \{x \mid x \in A \text{ and } x \in B\}.
\]

Example 34.2. Find the intersection of the sets $A$ and $B$ in Example 34.1.
The elements in the intersection have to belong simultaneously to both intervals,
that is, to the overlapping part of the intervals $[0, 2]$ and $[1, 3]$, which is $[1, 2]$.
Thus
\[
A \cap B = \{x \mid x \in \mathbb{R} \text{ and } 1 \le x \le 2\}.
\]

In the definitions of $A \cup B$ and $A \cap B$ above, we can see that the
logical operation ‘or’ is associated with union, while ‘and’ is
associated with intersection.
If $A$ and $B$ have no elements in common, then $A$ and $B$ are said
to be disjoint. The set with no elements is called the empty set and
denoted by $\emptyset$. Thus, if $A$ and $B$ are disjoint, then $A \cap B = \emptyset$.
The complement of a set $A$ is the set of all those elements which
belong to the universal set $U$ but do not belong to $A$. We denote
this set by $\bar{A}$ (the notations $A^c$ and $A'$ are also frequently used).
Hence, the complement of $A$ is, assuming that $x \in U$,
\[
\bar{A} = \{x \mid x \notin A\}.
\]
We say that $A$ is a subset of $B$, expressed as $A \subseteq B$, if every
element of $A$ also belongs to the set $B$. It follows that $A \subseteq U$ if
$B \subseteq U$. If there are elements of $B$ which are not in $A$, then $A$ is called
a proper subset of $B$ and written $A \subset B$. The statement $A \subseteq B$
includes the possibility that $A = B$, while $A \subset B$ does not. If $A \subseteq B$
and $B \subseteq A$, then all elements in $A$ are contained in $B$, and vice versa;
in other words, $A = B$.
The sets of integers $\mathbb{Z}$ and rational numbers $\mathbb{Q}$ are proper subsets
of the real numbers $\mathbb{R}$, that is
\[
\mathbb{Z} \subset \mathbb{R} \quad\text{and}\quad \mathbb{Q} \subset \mathbb{R}.
\]
We can summarize the results as follows.

(a) Union: $A \cup B = \{x \mid x \in A \text{ or } x \in B \text{ or both}\}$.
(b) Intersection: $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$.
(c) Complement: $\bar{A} = \{x \mid x \notin A\}$.
(d) Empty set: $\emptyset$, the set with no elements.                      (34.3)
(e) Subset: $A \subseteq B$ means that $A$ is a subset of $B$.
(f) Proper subset: $A \subset B$ means that $A \subseteq B$
but $A \ne B$.
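
The operations summarized in (34.3) map directly onto Python's built-in `set` type, which gives a convenient way to experiment with small finite examples. The sketch below is illustrative only: the universal set `U` and the sets `P` and `Q` are assumed values (a discrete analogue of the intervals in Example 34.1), not taken from the text.

```python
# Equality: order and repetition are immaterial.
A, B, C = {1, 2, 3}, {3, 2, 1}, {3, 1, 2, 1}
all_equal = (A == B == C)

# An assumed finite universal set and two subsets of it.
U = set(range(10))
P = {0, 1, 2}
Q = {1, 2, 3}

union = P | Q                        # (a) elements in P or Q or both
intersection = P & Q                 # (b) elements in both P and Q
complement_P = U - P                 # (c) elements of U not in P
disjoint = P.isdisjoint({7, 8})      # (d) intersection is the empty set
subset = {1, 2} <= P                 # (e) subset (equality allowed)
proper_subset = {1, 2} < P           # (f) proper subset (equality excluded)

print(all_equal, union, intersection, complement_P)
print(disjoint, subset, proper_subset, P < P)
```

Note that Python has no built-in universal set, so the complement has to be taken relative to an explicitly chosen `U`.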

34.3 Venn diagrams


Useful graphical views and interpretations of sets and operations
on them can be provided by Venn diagrams. We represent sets by
regions in the plane, with the interpretation that the region stands
for those elements belonging to the given set. The diagrams are
symbolic: the set $A = \{1, 2\}$, for example, could be represented by
the circle as shown in Fig. 34.1. Usually, sets are represented by the
interiors of circles, but any closed curves can be used. In a given
context, all the sets are subsets of a certain universal set $U$, whose
nature will differ according to the context.
If the universal set is represented by a rectangle, then a subset $A$
of $U$ could be represented by a circle within the rectangle shown in
Fig. 34.2. This is a Venn diagram for $U$ and $A$. Remember that $A$
could represent an infinite number of elements, or include just one
element, or be the empty set $\emptyset$. The union, intersection,
complement, and proper subset can be represented by the Venn diagrams
shown in Fig. 34.3. The shaded regions indicate the elements defined
by the operations.

Fig. 34.1
Fig. 34.2 Venn diagram for the universal set $U$ and a set $A$.
Fig. 34.3 (a) Union $A \cup B$. (b) Intersection $A \cap B$. (c) Complement $\bar{A}$. (d) Proper subset $A \subset B$.

From the definitions of union, intersection, and complement, or
from Venn diagrams, the following laws of the algebra of sets can
be deduced:

$A \cup A = A$, $A \cap A = A$.
Commutative laws:
$A \cup B = B \cup A$, $A \cap B = B \cap A$.
Associative laws (see Fig. 34.4):
$(A \cup B) \cup C = A \cup (B \cup C)$,                                   (34.4)
$(A \cap B) \cap C = A \cap (B \cap C)$.
Distributive laws:
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.

Fig. 34.4 (a) $(A \cup B) \cup C$ or $A \cup (B \cup C)$. (b) $(A \cap B) \cap C$ or $A \cap (B \cap C)$.

Sets also satisfy the following identity and complementary laws:

Identity laws: $A \cup \emptyset = A$, $A \cap U = A$.
Complementary laws:                                                          (34.5)
$A \cup \bar{A} = U$, $A \cap \bar{A} = \emptyset$, $\bar{\bar{A}} = A$.

For example, $\bar{A}$ consists of all elements that do not belong to $A$,
and none that do; so there are no elements common to $A$ and $\bar{A}$.
Therefore $A \cap \bar{A} = \emptyset$.
The difference of the sets $A$ and $B$, written as $A \backslash B$, consists of
the set of those elements that belong to $A$ but do not belong to $B$.
Thus
\[
A \backslash B = \{x \mid x \in A \text{ and } x \notin B\} \quad\text{or}\quad A \cap \bar{B}.
\]
(The notation $A - B$ is also used for $A \backslash B$.) Figure 34.5 shows a Venn
diagram for $A \backslash B$.

Fig. 34.5 Venn diagram for the difference $A \backslash B$ (shaded).

Example 34.3. Using Fig. 34.6 as the Venn diagram of two sets A and B, mark
by shading the following sets:
(a) A u B. (b) An B, (c) A n B, (d) du B, (e) A u B, (f) An B.

Venn diagrams of the sets are shown in Fig. 34.7.

The previous example confirms de Morgan’s laws, which are
\[
\overline{A \cup B} = \bar{A} \cap \bar{B}, \qquad
\overline{A \cap B} = \bar{A} \cup \bar{B}. \tag{34.6}
\]

Fig. 34.6
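
De Morgan's laws can also be confirmed by brute force on a small universal set, which is a useful complement to shading Venn diagrams. The following sketch is an illustration under stated assumptions: the four-element universal set and the helper names `subsets` and `complement` are chosen here and do not come from the text. It checks both laws for every pair of subsets.

```python
from itertools import combinations

# A small universal set chosen for illustration; it has 2**4 = 16 subsets.
U = frozenset(range(4))


def subsets(s):
    """Return every subset of s as a frozenset."""
    items = list(s)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


def complement(a):
    """Complement relative to the universal set U."""
    return U - a


all_subsets = subsets(U)

# Check both of de Morgan's laws for every ordered pair (A, B).
de_morgan_holds = all(
    complement(a | b) == complement(a) & complement(b)
    and complement(a & b) == complement(a) | complement(b)
    for a in all_subsets
    for b in all_subsets
)
print(len(all_subsets), de_morgan_holds)
```

An exhaustive check on one finite universal set is evidence rather than a proof; the general result follows from the definitions of union, intersection, and complement.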

Fig. 34.7 Venn diagrams for the sets (a) to (f) of Example 34.3.

Example 34.4. Using Fig. 34.8 as the Venn diagram of three sets A, B, and
C, shade the following sets:

(a) (A n B) U C, (b) (A n B) n C, (c) (405)0(40 C),


(d) (4 U B) U (A n C).
The required sets are shown in Fig. 34.9.

Fig. 34.8

Fig. 34.9

Example 34.5. Show that $(A \cap B) \cup (A \cap \bar{B}) = A$.

By the distributive law,

$(A \cap B) \cup (A \cap \bar{B}) = A \cap (B \cup \bar{B})$
$= A \cap U$ (by the complementary law)
$= A$ (by the identity law).

Example 34.6. Show that $(A \cup B) \cup (A \backslash B) = A \cup B$.

From Fig. 34.5, we can observe that $A \backslash B = A \cap \bar{B}$. Hence

$(A \cup B) \cup (A \backslash B) = (A \cup B) \cup (A \cap \bar{B})$
$= A \cup (B \cup (A \cap \bar{B}))$ (associative law)
$= A \cup ((B \cup A) \cap (B \cup \bar{B}))$ (distributive law)
$= A \cup ((B \cup A) \cap U)$ (complementary law)
$= A \cup (B \cup A)$ (identity law)
$= (B \cup A) \cup A$ (commutative law)
$= B \cup (A \cup A) = B \cup A = A \cup B$.

Alternatively, and more intuitively, we may notice that, since $A \backslash B$ is a subset
of $A$, it is therefore also a subset of $A \cup B$, and so adds nothing to $A \cup B$ when
united with it.

Example 34.7. In a manufacturing process, a product passes through three
production stages and is given a quality check at all three stages, which it either
passes or fails. Let $P_i$ represent the set of products passing the quality check at
stage $i$. Draw a Venn diagram of the process. Interpret the quality failures of the
products in the sets given by $\bar{P}_1$, $P_2 \backslash (P_1 \cup P_3)$ and $(P_1 \cup P_2) \cap P_3$. What set
represents the completely satisfactory products?
A production run of 1000 occurs, of which 8 fail all stages, 20 pass only stage
$P_1$, 31 only stage $P_2$, and 17 only stage $P_3$; 814 pass stages $P_1$ and $P_2$, 902 stages
$P_2$ and $P_3$, and 800 stages $P_3$ and $P_1$. Determine the final number which pass all
quality checks.
$\bar{P}_1$ represents all products which fail the $P_1$ quality check.
$P_2 \backslash (P_1 \cup P_3)$ represents those products which pass only the $P_2$ stage.
$(P_1 \cup P_2) \cap P_3$ represents those products which are satisfactory at stage $P_3$ and
at $P_1$ or $P_2$. The set $P_1 \cap P_2 \cap P_3$ represents those products which are satisfactory
at all stages.
The numbers associated with each subset of the universal set $U$ are shown
in Fig. 34.10. Since 8 fail all quality checks, the number of elements in
$P_1 \cup P_2 \cup P_3$ is 992. In the figure, $k$ represents the number of products which
pass all the quality checks. Hence $814 - k$, for example, represents those
products which are satisfactory in stages $P_1$ and $P_2$, but fail in $P_3$. Thus
$P_1 \cup P_2 \cup P_3$ contains

\[
992 = 20 + 31 + 17 + (814 - k) + (902 - k) + (800 - k) + k
\]

products. Hence $992 = 2584 - 2k$, and so

\[
k = 796.
\]

Of the 1000 products manufactured, 796 passed all the quality checks.
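
The arithmetic of Example 34.7 can be reproduced in a few lines. The sketch below uses only the figures quoted in the example; the variable names are assumptions made for illustration. It solves for $k$ and then rebuilds the seven disjoint regions of the Venn diagram to confirm that they account for all 992 products in $P_1 \cup P_2 \cup P_3$.

```python
total, fail_all = 1000, 8
only_one = [20, 31, 17]      # pass only stage 1, only stage 2, only stage 3
pairs = [814, 902, 800]      # n(P1 & P2), n(P2 & P3), n(P3 & P1)

in_union = total - fail_all  # products passing at least one check

# in_union = sum(only_one) + sum(p - k for p in pairs) + k
#          = sum(only_one) + sum(pairs) - 2k
k = (sum(only_one) + sum(pairs) - in_union) // 2

# The seven disjoint regions: three 'only one', three 'exactly two', one 'all'.
exactly_two = [p - k for p in pairs]
regions = only_one + exactly_two + [k]

print(k, exactly_two, sum(regions))
```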

In the previous example, we are really interested in the numbers
of elements in each of the sets. For example, the number of elements
in $U$ is 1000 and the number of elements in $P_2 \backslash (P_1 \cup P_3)$, those
products which pass only stage 2, is 31. We write

\[
n(U) = 1000, \qquad n[P_2 \backslash (P_1 \cup P_3)] = 31.
\]

The number of elements in the set $S$ is $n(S)$: this number is known
as the cardinality of $S$. Many sets can have infinite cardinality.
For example, $n(\mathbb{Q})$, where $\mathbb{Q}$ is the set of rational numbers, is an
infinite number. We write $n(\mathbb{Q}) = \infty$. The empty set $\emptyset$ has no
elements: hence $n(\emptyset) = 0$.
The following results apply to finite sets. If two finite sets $A$ and
$B$ are disjoint, then they have no elements in common. It follows that
\[
n(A \cup B) = n(A) + n(B).
\]

This result applies to any number of disjoint sets. It is clear that
they must be disjoint, since otherwise elements would be counted
more than once.
This last result is also a useful method of counting elements when
combined with a Venn diagram. Consider just two sets $A$ and $B$ as
shown in Fig. 34.11. The sets representing each of the subsets in the
Venn diagram, $A \backslash B$, $A \cap B$, and $B \backslash A$, are shown in Fig. 34.11.
Since these sets are disjoint, we can obtain a formula for the
number of elements in the union of $A$ and $B$, namely

\[
n(A \cup B) = n(A \backslash B) + n(A \cap B) + n(B \backslash A). \tag{34.7}
\]

Fig. 34.11 Counting elements in the union of two sets.

For sets $A$ and $B$ separately,

\[
n(A) = n(A \backslash B) + n(A \cap B), \qquad n(B) = n(B \backslash A) + n(A \cap B). \tag{34.8}
\]

Elimination of $n(A \backslash B)$ and $n(B \backslash A)$ between (34.7) and (34.8) leads
to the alternative result

\[
n(A \cup B) = n(A) + n(B) - n(A \cap B).
\]

For three finite sets $A$, $B$, and $C$ the corresponding result is

\[
n(A \cup B \cup C) = n(A) + n(B) + n(C) + n(A \cap B \cap C)
 - n(B \cap C) - n(C \cap A) - n(A \cap B).
\]

This result can be constructed from the Venn diagram.
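
Both counting formulas are easy to test on concrete finite sets. In the sketch below the three sets are arbitrary illustrative choices (not from the text), and `n` is a small helper standing for the cardinality $n(S)$.

```python
def n(s):
    """Cardinality of a finite set."""
    return len(s)


# Arbitrary example sets chosen for illustration.
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6, 7}

# Two sets: n(A u B) = n(A) + n(B) - n(A n B).
two_lhs = n(A | B)
two_rhs = n(A) + n(B) - n(A & B)

# Three sets: add back the triple intersection after removing the pairs.
three_lhs = n(A | B | C)
three_rhs = (n(A) + n(B) + n(C) + n(A & B & C)
             - n(B & C) - n(C & A) - n(A & B))

print(two_lhs, two_rhs)
print(three_lhs, three_rhs)
```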

Problems

34.1. (Section 34.1). List the elements in the following (t)(A\B)nC; (g)4\(BnC);
sets: (h) (A\B) u (B\C).
(a) S = {x | x e N + and 3 < x ^ 10};
(b) S = {x j x e M + and - 2 ^ x ^ 4};
34.3. (Section 34.2). Determine the union A u B of
(c) S = {x\xeZ and — 2 < x < 4};
each of the following pairs of sets A and B:
(d) S = {x\xeN+, and -2 < x ^ 4};
(a) A = {x | x e R and — 1 < x < 2},
(e) S = {1/x | x e N + and 3 ^ x ^ 8};
B = {x | x g R and — 1 ^ x < 4);
(f) S - {x21 x e N + and |x| ^ 3};
(b) A = {x | x e 05 and - 1 ^ x < 0}, B = {x | x g IR
(g) 5 = {x+jy |xeN+, yeN+, 1 ^ x ^ 4, 2 < y ^ 5}.
and 0 < x < 1};
(c) A = {1, 2, 3, 4}, B = {— 4, -3, -2, -1};
34.2. (Section 34.3). Show on Venn diagrams the follow¬ (d) A = {y | y = cos x, x e IR, and 0 x ^ jn},
ing sets: B = {y \ y = sin x, x e U, and -?n x ^ £n}.
(a) A u B; (b) A n B\ (c) A n (B uC);
(d) (A n B) u (B n C); (e) A n B:
34.4. (Section 34.2). Determine the intersections An B

of the following sets: 34.6. The set S consists of products, each of which is
(a) A = {x | x e R, and — 2 ^ x ^ 1}, given n pass/fail tests, numbered 1 to n. The' set Sr
B = {x | x e R, and — 1 ^ x ^ 2}; consists of those products that pass test r. What is the
(b) A = {x|xe[\l + and -5<x^2}, B = {x | x e R, set of products that
and — 5 ^ x ^ 2}; (a) fails all tests,
(c) A = {n\n = 1/m and m e N + }, B = {n | n = 1/m2 and (b) fails only test 1,
m e l\l + }; (c) fails some tests?
(d) A = {x | x e R and x2 — 3x + 2 = 0},
B = {x | x e R and 2x2 + x — 3 = 0}; 34.7. At Keele University, all first-year students must
(e) A = {x | x e R and |x| ^ 2},
take three subjects of which at least one must be a
B = {x | x e R and |x — 11 < 1}. science subject, and at least one must be a humanities
or social science subject. Let A be the set of all first-year
34.5. (Section 34.3). Construct a set formula for the students in a given year, 4, the set of students who take
shaded sets of Fig. 34.12: exactly one science subject, B, the set of students who
take just one humanities subject, and B2 the set of those
who take two social science subjects. Draw a Venn
diagram to represent the different sets of students classi¬
fied by groups of subjects. Give set formulae for students
who take
(a) just one social science subject
(b) no humanities subject,
(c) one subject from each group.

34.8. (Section 34.3). The rules listed in (34.4) illustrate


the duality principle which states that every statement
involving sets which is true for all sets has a dual in which
u and n are interchanged, and 0 and U are inter¬
changed everywhere.
Use Venn diagrams to establish the following:
(a) (A\B) n C = (A n C)\B;
(b) A n (B u C) = {A n B) u (A n C).
What are their dual identities?

34.9. Three sets A, B, and C satisfy

A n B n C = (A n C) u (B n C).

Explain why the duality principal of Problem 34.8 does


not apply. What condition of the duality principle is
violated?

34.10. The cartesian product of two sets A and B is the


set of all ordered pairs {(a, />)}, where a e A and b e B. It
is written as

A x B = {(a, b) | a e A and be B}.


If A = B, then we write A x A = A2. Let A = {1, 2}
and B = {1,2,3}; write down all the elements in the
sets A x B, B x A, A2, and B2.

34.11. The cartesian product extends to the products


of three or more sets. Thus

A x B x C = {(a, b, c) \ a e A and b e B and c e C}.

Let A = {1, 2, 3}, B = {0, 1}, and C = {1, 2}. Write down
all the elements in

A x B x C, A2 x C, (A u B) x C, (A n B) x C.

34.12. At the end of a production process, 500 electrical
components pass through three quality checks P, Q, and R.
It is found that 38 components fail check P, 29 fail Q,
30 fail R, 7 fail P and Q, 5 fail Q and R, 8 fail R and P,
and 3 fail all checks. Determine how many components:
(a) pass all checks,
(b) fail just one check,
(c) fail just two checks.

34.13. (Section 34.4). For three finite sets A, B, and C,
show that the number of elements in the union of the
sets is given by

$n(A \cup B \cup C) = n(A) + n(B) + n(C) + n(A \cap B \cap C) - n(B \cap C) - n(C \cap A) - n(A \cap B)$.

34.14. If A and B are two finite sets, explain why, for
the cartesian product (defined in Problem 34.10 above),

$n(A \times B) = n(A)\,n(B)$.

34.15. The menu in a restaurant contains three courses:
4 starters (set A), 5 main courses (set B), and 3 sweets
(set C). Customers can choose either the full menu or,
alternatively, a main course and a sweet. In terms of
cartesian products, what is the set of all possible meals
(the answer is really a set of pairs and triples)? For
how many different orders can customers ask?

34.16. Given A = {1, 2, 3}, B = {3, 4}, and C = {2, 3, 4, 5},
find the elements in the sets $B \cup C$, $B \cap C$,
and the cartesian products $A \times B$ and $A \times C$. Verify
that

$A \times (B \cup C) = (A \times B) \cup (A \times C)$,
$A \times (B \cap C) = (A \times B) \cap (A \times C)$.

(This example suggests general results which are true for
all sets.)
Boolean algebra: logic gates and switching functions

Contents
35.1 Laws of Boolean algebra
35.2 Logic gates and truth tables
35.3 Logic networks
35.4 The inverse truth-table problem
35.5 Switching circuits
Problems

35.1 Laws of Boolean algebra


We are now going to present some new operations between special
entities. They have some analogies with ordinary addition and
multiplication, and the symbols for them will be similar—but not
the same, since we need to emphasize that these are Boolean
operations. The algebra involved is named after George Boole
(1815-64) who first developed the modern ideas of symbolic logic.
Boolean algebra has applications in logic and switching circuits.
Consider a set B which consists of just two elements 0 and 1, that
is, B = {0, 1}. We shall denote the sum of two elements a and b of
B by $a \oplus b$ (the notations $\vee$, $\cup$, and +, and the alternative
term join are also used); we denote the product of the two elements
by $a * b$ (the notations $\wedge$, $\cap$, $\times$, and $\cdot$, or simply ab, and
the alternative term meet are also in use) and the complement of
a by $\bar{a}$ ($\sim a$ and $\neg a$ are used in logic). These operations
applied to the members of B are defined to give the elements
shown in Table 35.1.

Table 35.1 Binary operations

Sum                     Product                 Complement
a   b   $a \oplus b$    a   b   $a * b$         a   $\bar{a}$
0   0   0               0   0   0               0   1
0   1   1               0   1   0               1   0
1   0   1               1   0   0
1   1   1               1   1   1

Thus, for example,

$0 \oplus 1 = 1$, $1 \oplus 1 = 1$, $0 * 0 = 0$, $1 * 1 = 1$, $\bar{0} = 1$, $\bar{1} = 0$.

The elements of B are known as Boolean variables. We have
restricted our set B to one with just two elements or binary digits,
because this is the main application in circuits and computer design,
but the definitions can be interpreted for more general sets. A Boolean
algebra is a set with the operations $\oplus$, $*$, and $\bar{\ }$ defined on it,
together with the following laws for any elements a, b, c which belong
to B:

Commutative laws:
$a \oplus b = b \oplus a$, $a * b = b * a$;
Associative laws:
$a \oplus (b \oplus c) = (a \oplus b) \oplus c$, $a * (b * c) = (a * b) * c$; (35.1)
Distributive laws:
$a * (b \oplus c) = (a * b) \oplus (a * c)$,
$a \oplus (b * c) = (a \oplus b) * (a \oplus c)$.

In addition, the set must contain distinct identity elements 0 and
1 for the operations $\oplus$ and $*$ respectively. For these elements we
must have the identity laws
$a \oplus 0 = a$, $a * 1 = a$.
Finally, the complement laws must hold:
$a \oplus \bar{a} = 1$, $a * \bar{a} = 0$.
To summarize, we can say that a Boolean algebra consists of
the collection
$(B, \oplus, *, \bar{\ }, 0, 1)$,
in other words, a set B, the binary operations $\oplus$ and $*$, the
complement $\bar{\ }$, and the identity elements 0 and 1.
In our case B = {0, 1}, the binary set, which consists simply of
the identity elements. We can check that the definitions in Table 35.1
satisfy the laws in (35.1). They are essentially the laws of set
operations with sum $\oplus$ and product $*$ replacing union $\cup$ and
intersection $\cap$, and with 1 replacing the universal set U and 0 the
empty set $\emptyset$.
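Because B has only two elements, the check that Table 35.1 satisfies the laws in (35.1) can be carried out by brute force. The following is a minimal sketch, not part of the original text: the function names bsum, bprod, and comp are illustrative choices, and the Boolean sum and product are modelled with Python's bitwise operators on the integers 0 and 1.

```python
from itertools import product

B = (0, 1)

def bsum(a, b):
    """Boolean sum of a and b, as tabulated in Table 35.1."""
    return a | b

def bprod(a, b):
    """Boolean product a * b, as tabulated in Table 35.1."""
    return a & b

def comp(a):
    """Complement of a."""
    return 1 - a

def laws_hold():
    """Check every law in (35.1), plus the identity and complement laws,
    for all choices of a, b, c in B."""
    for a, b, c in product(B, repeat=3):
        checks = [
            bsum(a, b) == bsum(b, a),                                 # commutative
            bprod(a, b) == bprod(b, a),
            bsum(a, bsum(b, c)) == bsum(bsum(a, b), c),               # associative
            bprod(a, bprod(b, c)) == bprod(bprod(a, b), c),
            bprod(a, bsum(b, c)) == bsum(bprod(a, b), bprod(a, c)),   # distributive
            bsum(a, bprod(b, c)) == bprod(bsum(a, b), bsum(a, c)),
            bsum(a, 0) == a,                                          # identity
            bprod(a, 1) == a,
            bsum(a, comp(a)) == 1,                                    # complement
            bprod(a, comp(a)) == 0,
        ]
        if not all(checks):
            return False
    return True

print(laws_hold())
```

An exhaustive check of this kind only confirms the laws for the two-element set; it is not a proof for a general Boolean algebra.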
Just as with sets, we can deduce further laws, some of which are
included in (35.2):

Absorption laws:
$a \oplus (a * b) = a$, $a * (a \oplus b) = a$;
de Morgan's laws:
$\overline{a \oplus b} = \bar{a} * \bar{b}$, $\overline{a * b} = \bar{a} \oplus \bar{b}$;
Identity laws:
$a \oplus 0 = a$, $a * 1 = a$, (35.2)
$1 \oplus a = a \oplus 1 = 1$, $0 * a = a * 0 = 0$;
Reflexive law:
$\bar{\bar{a}} = a$.

Note that $*$ takes precedence over $\oplus$ in the absence of brackets.
Thus, in the first absorption law, $a \oplus a * b$ means $a \oplus (a * b)$; in the
second absorption law, the parentheses are essential.
We will prove one of the absorption laws to illustrate how proofs
are approached in Boolean algebra.

Example 35.1. Prove that $a \oplus a * b = a$.

For all $a, b \in B$,

$a \oplus a * b = a * 1 \oplus a * b$ (identity law)
$= a * (1 \oplus b)$ (distributive law).

Now

$1 \oplus b = (1 \oplus b) * 1$ (identity law)
$= 1 * (b \oplus 1)$ (commutative laws)
$= (b \oplus \bar{b}) * (b \oplus 1)$ (complement law)
$= b \oplus \bar{b} * 1$ (distributive law)
$= b \oplus \bar{b}$ (identity law)
$= 1$ (complement law).

Finally,

$a \oplus a * b = a * 1 = a$.
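For the two-element set, the derived laws in (35.2) can also be confirmed by listing every case. The sketch below is an illustration only (the helper names and the dictionary of laws are not from the text); it complements, but does not replace, the algebraic proof in Example 35.1.

```python
from itertools import product

def bsum(a, b):
    return a | b      # Boolean sum

def bprod(a, b):
    return a & b      # Boolean product

def comp(a):
    return 1 - a      # complement

# Each entry maps a law's name to a test on one pair (a, b).
derived_laws = {
    "absorption (sum)":     lambda a, b: bsum(a, bprod(a, b)) == a,
    "absorption (product)": lambda a, b: bprod(a, bsum(a, b)) == a,
    "de Morgan (sum)":      lambda a, b: comp(bsum(a, b)) == bprod(comp(a), comp(b)),
    "de Morgan (product)":  lambda a, b: comp(bprod(a, b)) == bsum(comp(a), comp(b)),
    "double complement":    lambda a, b: comp(comp(a)) == a,
}

# A law holds if it is true for all four pairs (a, b).
results = {
    name: all(law(a, b) for a, b in product((0, 1), repeat=2))
    for name, law in derived_laws.items()
}

for name, ok in results.items():
    print(name, ok)
```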

35.2 Logic gates and truth tables


Any expression made up from the elements of B and the operations
$\oplus$, $*$, and $\bar{\ }$ is known as a Boolean expression. For example,

$a \oplus b$, $a \oplus \bar{b}$, $a \oplus a * b$,

are Boolean expressions. For the binary set, the elements 1 and 0
can represent 'on' or 'off' states in digital circuits. The basic
components in a computer are logic gates which can produce an
output from inputs. All the outputs and inputs can be in one of two
states, usually either low voltage (0) or high voltage (1).
The fundamental Boolean operations of $\oplus$, $*$, and $\bar{\ }$ correspond
to devices known respectively as the or gate, and gate, and not gate.
As with circuit components such as resistance and inductance, each
has its own symbol.
The or gate has two inputs and a single output represented by
the symbol in Fig. 35.1. The output is $f = a \oplus b$. The inputs a and
b can each take either of the values 0 or 1. Hence there are four
possible inputs into the device as listed in Table 35.2. The final
column f can be completed using the sum rule in Table 35.1. Thus, if
a is 'on' (1) and b is 'off' (0), the output f is 'on' (1). Table 35.2 is
known as the truth table of the or gate.

Fig. 35.1 The or gate.

Table 35.2 Truth table for the or gate

a   b   $f = a \oplus b$
0   0   0
0   1   1
1   0   1
1   1   1

Fig. 35.2 The and gate.

Table 35.3 Truth table for the and gate

a   b   $f = a * b$
0   0   0
0   1   0
1   0   0
1   1   1

The symbol and truth table for the and gate are shown in Fig.
35.2 and Table 35.3. Again the device has two inputs and the single
output $f = a * b$, the product of a and b.

Fig. 35.3 The not gate.

Table 35.4 Truth table for the not gate

a   $f = \bar{a}$
0   1
1   0

Finally the not gate is shown in Fig. 35.3 with its truth table
given as Table 35.4. The not gate has a single input and a single
output which is the complement of its input.
There is further jargon associated with these gates. The output
$a \oplus b$ is known as the disjunction of a and b, while $a * b$ is known
as the conjunction of a and b, and $\bar{a}$ is called the negation of a.
These devices can be connected in series and parallel to create
new logic devices, each of which will have its own truth table.
A series connection between a not gate and an and gate is shown
in Fig. 35.4a. The output $a * b$ of the and gate becomes the input of
the not gate, which results in the output $\overline{a * b}$. This device is known
as the nand gate, and it has its own symbolic representation shown
in Fig. 35.4b. Its truth table is given in Table 35.5.
A series connection between a not gate and an or gate produces
the nor gate as shown in Fig. 35.5a. The output f is the complement
of the sum of a and b, that is, $f = \overline{a \oplus b}$. The nor gate also has its
own symbol shown in Fig. 35.5b. It has the truth table shown in Table 35.6.

Table 35.5 Truth table for the nand gate

a   b   $f = \overline{a * b}$
0   0   1
0   1   1
1   0   1
1   1   0

Fig. 35.4 The nand gate.

Table 35.6 Truth table for the nor gate

a   b   $f = \overline{a \oplus b}$
0   0   1
0   1   0
1   0   0
1   1   0

Fig. 35.5 The nor gate.
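The five gates of this section can be modelled as small functions, and their truth tables generated rather than written out by hand. This is an illustrative sketch; the function names OR, AND, NOT, NAND, NOR and the helper truth_table are labels chosen here, not notation from the text.

```python
from itertools import product

def OR(a, b):
    return a | b              # f is the Boolean sum of a and b

def AND(a, b):
    return a & b              # f is the Boolean product of a and b

def NOT(a):
    return 1 - a              # f is the complement of a

def NAND(a, b):
    return NOT(AND(a, b))     # and gate followed in series by a not gate

def NOR(a, b):
    return NOT(OR(a, b))      # or gate followed in series by a not gate

def truth_table(gate):
    """Rows (a, b, f) for a two-input gate, in the order 00, 01, 10, 11."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]

for name, gate in [("or", OR), ("and", AND), ("nand", NAND), ("nor", NOR)]:
    print(name, [f for _, _, f in truth_table(gate)])
```

The output columns can be compared directly with the final columns of Tables 35.2, 35.3, 35.5, and 35.6.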


35.3 Logic networks
The five gates introduced in the previous section can be linked in
series and parallel combinations to create further logic networks.
Some examples are presented here.

Example 35.2. Construct the Boolean expression for the output f of the device
shown in Fig. 35.6.
Starting from the left in Fig. 35.6, the upper and gate produces an output $a * b$
and the lower or gate has an output $c \oplus d$. These become the inputs into the
or gate on the right. Hence the final output is

$f = a * b \oplus c \oplus d$.

Since there are four inputs, the output can be determined for each of the $2^4 = 16$
possible inputs.

Fig. 35.6

Example 35.3. Figure 35.7 shows a logical network with three inputs a,
b, c, and four devices. Find a Boolean expression for the output f. Write
down the truth table for the system.
The input b is the same in both devices P and Q. The output from the and gate
P is $a * b$, and the output from R is $\overline{a * b}$. The output from Q is $b \oplus c$. Hence
the inputs $\overline{a * b}$ and $b \oplus c$ into S produce an output

$f = \overline{a * b} \oplus b \oplus c$.

Fig. 35.7

The truth table for this network is given in Table 35.7.

Table 35.7

a   b   c   $a * b$   $\overline{a * b}$   $b \oplus c$   $\overline{a * b} \oplus (b \oplus c)$
0   0   0   0         1                    0              1
0   0   1   0         1                    1              1
0   1   0   0         1                    1              1
0   1   1   0         1                    1              1
1   0   0   0         1                    0              1
1   0   1   0         1                    1              1
1   1   0   1         0                    1              1
1   1   1   1         0                    1              1
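The truth table of a network can be produced by passing every input combination through the gates in order. The sketch below assumes the reading of Fig. 35.7 given in Example 35.3: P is an and gate on a and b, R is a not gate acting on the output of P, Q is an or gate on b and c, and S is an or gate on the outputs of R and Q. The function name network is illustrative.

```python
from itertools import product

def network(a, b, c):
    """Output of the four-device network, under the assumed gate layout."""
    p = a & b        # and gate P
    r = 1 - p        # not gate R acting on the output of P
    q = b | c        # or gate Q
    return r | q     # or gate S

# One row (a, b, c, f) for each of the 2**3 = 8 input combinations.
table = [(a, b, c, network(a, b, c)) for a, b, c in product((0, 1), repeat=3)]

for row in table:
    print(*row)
```

Under these assumptions the output is 1 for every input, so this particular network always gives the output 'on'.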

Example 35.4. Show that, using just the nor gate, it is possible to build a
logic network to model any Boolean expression.
Given inputs a and b, we have to show that devices can be constructed using
just nor gates with outputs of $a \oplus b$, $a * b$, and $\bar{a}$. For inputs of a and b, the
single nor gate generates an output of $\overline{a \oplus b}$. Figure 35.8 shows three devices
which simulate the required outputs.

Fig. 35.8 Nor-gate networks: (a) output $\overline{\overline{a \oplus b} \oplus \overline{a \oplus b}} = a \oplus b$; (b) output $a * b$; (c) output $\overline{a \oplus a} = \bar{a}$.
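The claim that the nor gate alone is sufficient can be tested by building the or, and, and not operations from it and comparing them with the direct definitions. This is a sketch of one standard construction; the wiring chosen for the and gate (a nor gate fed by the complements of a and b) is an assumption, since the detail of Fig. 35.8(b) is not reproduced here.

```python
from itertools import product

def NOR(a, b):
    """The only primitive gate used below."""
    return 1 - (a | b)

def not_from_nor(a):
    # Feeding the same input to both terminals gives the complement of a.
    return NOR(a, a)

def or_from_nor(a, b):
    # The complement of the nor output restores the sum of a and b.
    n = NOR(a, b)
    return NOR(n, n)

def and_from_nor(a, b):
    # By de Morgan's law, the nor of the two complements is the product a * b.
    return NOR(NOR(a, a), NOR(b, b))

for a, b in product((0, 1), repeat=2):
    print(a, b, or_from_nor(a, b), and_from_nor(a, b), not_from_nor(a))
```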

Example 35.5. Design a logic network using or, and, and not gates to
reproduce the Boolean expression $f = a * \bar{b} \oplus a$ for inputs a and b.

From input b we obtain $\bar{b}$ by a not gate. The inputs a and $\bar{b}$ are then fed into
an and gate to produce $a * \bar{b}$. Finally a spur from the a input and the $a * \bar{b}$
output are fed into an or gate as shown in Fig. 35.9.

Fig. 35.9

35.4 The inverse truth-table problem


In this problem we attempt to recreate a Boolean expression for a
given truth table. For example, Table 35.8 is a truth table for two
inputs a and b.

Table 35.8

a   b   f
0   0   0
0   1   1
1   0   1
1   1   0

We illustrate a method for the construction of a
Boolean expression which will generate this truth table. Pick out
the cases for which f = 1. For the case a = 0, b = 1, write down $\bar{a} * b$,
and for a = 1, b = 0 write down $a * \bar{b}$, using in the products the
complement of any zero element. Now form the sum of the products
which produce f = 1. We obtain

$f = \bar{a} * b \oplus a * \bar{b}$, (35.3)

where it can be checked that, if a = b, then f = 0, and if a and b
are not the same, then f = 1.
This particular gate is known as the exclusive-or gate, or exor
gate, and has its own symbol shown in Fig. 35.10. This form of f
obtained by the construction just described is known as the
disjunctive normal form.

Fig. 35.10 The exclusive-or gate, $f = \bar{a} * b \oplus a * \bar{b}$.

The method can be applied to more complex truth tables. Table
35.9 shows an output for three inputs. The output 1 appears in rows
2, 4, 5, 7, 8. The disjunctive normal form for a corresponding Boolean
expression is, following the rules for products of elements and their
complements,

$f = \bar{a} * \bar{b} * c \oplus \bar{a} * b * c \oplus a * \bar{b} * \bar{c} \oplus a * b * \bar{c} \oplus a * b * c$.

Table 35.9

a   b   c   f
0   0   0   0
0   0   1   1
0   1   0   0
0   1   1   1
1   0   0   1
1   0   1   0
1   1   0   1
1   1   1   1

Thus in row 2 write $\bar{a} * \bar{b} * c$, since a = b = 0 but c is 1, and so on.
Check that f does give the required output. The disjunctive normal
form always guarantees an answer, but it is not necessarily the
simplest or most efficient in circuit architecture.
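The construction of the disjunctive normal form is mechanical, so it can be written as a short routine: collect the rows with output 1, and for each one form the product in which a zero input appears complemented. The sketch below applies this to Table 35.9. The names minterms, dnf_string, and eval_dnf are illustrative, and in the printed string a complement is written as ~a and the Boolean sum as (+).

```python
from itertools import product

def minterms(outputs, n):
    """Input rows (in the order 00...0 to 11...1) whose output is 1."""
    rows = list(product((0, 1), repeat=n))
    return [row for row, f in zip(rows, outputs) if f == 1]

def dnf_string(terms, names):
    """Write each selected row as a product, complementing any zero input."""
    products = [
        "*".join(name if value == 1 else "~" + name
                 for name, value in zip(names, row))
        for row in terms
    ]
    return " (+) ".join(products) if products else "0"

def eval_dnf(terms, inputs):
    """Evaluate the sum of products at the given inputs."""
    total = 0
    for row in terms:
        prod = 1
        for want, x in zip(row, inputs):
            prod &= x if want == 1 else 1 - x   # literal or its complement
        total |= prod
    return total

table_35_9 = [0, 1, 0, 1, 1, 0, 1, 1]           # final column of Table 35.9
terms = minterms(table_35_9, 3)
print(dnf_string(terms, "abc"))

# Check that the expression reproduces the required output column.
reproduced = [eval_dnf(terms, row) for row in product((0, 1), repeat=3)]
print(reproduced == table_35_9)
```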

35.5 Switching circuits


A circuit of on-off switches can also be represented by Boolean
expressions. For example, Fig. 35.11 shows a simple on-off switch
in part of a circuit. Current flows if the switch S is in the on or
closed position (a = 1), and does not flow if the switch is in the off or
open position (a = 0). The variable a represents the state of the
switch.

Fig. 35.11 On-off switch.

Consider two switches $S_1$ and $S_2$ in series (Fig. 35.12). Current
only flows if both switches are closed, that is, when a = 1 and
b = 1, where a and b represent the states of $S_1$ and $S_2$ respectively. Hence
the truth table for the series switches is as shown in Table 35.10.
Thus the state of current flow is given by $f = a * b$, the product of
a and b.

Fig. 35.12 Two switches in series.

Table 35.10 Truth table for two switches in series

a   b   f
0   0   0
0   1   0
1   0   0
1   1   1

Fig. 35.13 Two switches in parallel.

Similarly two switches in parallel (Fig. 35.13) correspond to the
sum of a and b. The truth table is given in Table 35.11. The final
column indicates that $f = a \oplus b$.
The complement of a, the state of switch $S_1$, is another switch $S_2$
in the circuit which is always in the complementary state to $S_1$: off
when $S_1$ is on and vice versa. It can be represented symbolically by
Fig. 35.14, in which the switches $S_1$ and $S_2$ are joined by a rigid tie.
These devices are analogous to the gates of Section 35.2. For
switching circuits, the Boolean expressions are often referred to as
switching functions.

Fig. 35.14 Complement of a switch using a rigid tie.

Table 35.11 Truth table for two switches in parallel

a   b   f
0   0   0
0   1   1
1   0   1
1   1   1

Example 35.6. Find a switching function f for the system shown in Fig. 35.15.

Let $a_1$, $a_2$, $a_3$, $a_4$ represent respectively the states of the switches $S_1$, $S_2$, $S_3$, $S_4$.
Since $S_2$ and $S_3$ are in parallel, their output will be $a_2 \oplus a_3$. This combined in
series with $a_4$ will give an output of $(a_2 \oplus a_3) * a_4$. In turn, this is in parallel with
$S_1$. Hence, the final output is
$(a_2 \oplus a_3) * a_4 \oplus a_1$.
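A switching function can be evaluated for every combination of switch states to see when current flows. The sketch below does this for the function found in Example 35.6, taking series connection as the Boolean product and parallel connection as the Boolean sum; the names switching_function and closed_states are illustrative.

```python
from itertools import product

def switching_function(a1, a2, a3, a4):
    """S2 and S3 in parallel, in series with S4, all in parallel with S1."""
    return ((a2 | a3) & a4) | a1

# All switch settings (a1, a2, a3, a4) for which current flows.
closed_states = [s for s in product((0, 1), repeat=4)
                 if switching_function(*s) == 1]

print(len(closed_states), "of 16 settings allow current to flow")
```

With $S_1$ closed, current flows whatever the other switches do; with $S_1$ open, $S_4$ and at least one of $S_2$, $S_3$ must be closed.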

Example 35.7. A light on a staircase is controlled by two switches $S_1$ and $S_2$,
one at the bottom of the stairs and one at the top. Switches can be separately 'up'
or 'down'. If both switches are up, the light is off. Either switch changed to down
switches the light on, and any subsequent change to a switch alters the state of
the light. Design a truth table for the circuit.
The truth table is shown in Table 35.12, where the state of $S_i$ (i = 1, 2) is $a_i = 0$
when the switch is up (off) and $a_i = 1$ when the switch is down (on). The light
on is f = 1, and the light off is f = 0. This truth table is the same as that for
the exclusive-or gate in Section 35.4. Hence, from (35.3), the circuit can be
represented by the switching function

$f = a_1 * \bar{a}_2 \oplus \bar{a}_1 * a_2$.

The actual circuit is shown in Fig. 35.16, where $S_1$ and $S_2$ are one-pole two-way
switches. At $S_1$, the state $a_1$ represents the switch 'up' and its complement $\bar{a}_1$ is
the switch down. A similar state operates at $S_2$.

Table 35.12

Switch $S_1$   Switch $S_2$   Light   $a_1$   $a_2$   f
up             up             off     0       0       0
down           up             on      1       0       1
down           down           off     1       1       0
up             down           on      0       1       1

Fig. 35.16 Two-switch light control.
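The defining property of the staircase circuit, that changing either switch always changes the state of the light, can be checked by simulation. The sketch below uses the exclusive-or switching function of Example 35.7 with both switches initially up; the names light and toggle_sequence, and the particular sequence of switch changes, are illustrative.

```python
def light(a1, a2):
    """Exclusive-or switching function: 1 when exactly one switch is down."""
    return (a1 & (1 - a2)) | ((1 - a1) & a2)

def toggle_sequence(changes):
    """Start with both switches up (0, 0) and flip switch 0 or 1 in turn.
    Returns the state of the light initially and after each change."""
    state = [0, 0]
    history = [light(*state)]
    for s in changes:
        state[s] = 1 - state[s]          # change the chosen switch
        history.append(light(*state))
    return history

# Bottom switch, top switch, bottom switch, top switch.
print(toggle_sequence([0, 1, 0, 1]))
```

Each change of a switch alters the recorded state, so the history alternates between off (0) and on (1).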

Problems

35.1. Read through Example 35.1. Now prove the other
absorption law:

$a * (a \oplus b) = a$.

(Example 35.1 and this result illustrate the duality
principle, which states that any theorem which can
be proved in Boolean algebra implies another theorem
with $*$ and $\oplus$ interchanged for the same elements.)

35.2. (Section 35.1). Prove the de Morgan result

$\overline{a \oplus b} = \bar{a} * \bar{b}$,

by showing that $(a \oplus b) \oplus (\bar{a} * \bar{b}) = 1$. Explain how the
duality result (Problem 35.1) gives the other de Morgan
theorem.

35.3. (Section 35.1). Let B be the Boolean algebra with
the two elements 0 and 1. For arbitrary $a, b \in B$, prove
the following:
(a) $a * (\bar{a} \oplus b) = a * b$; (b) $(a \oplus b) * (a \oplus \bar{b}) = a$;
(c) $(a \oplus b) * \bar{a} * \bar{b} = 0$.

35.4. (Section 35.1). Using the laws of Boolean algebra
for the set with two elements 0 and 1, show that:
(a) $a * b \oplus a * \bar{b} = a$; (b) $a \oplus \bar{a} * b * c = a \oplus b * c$.
Use the result to obtain the truth tables in each case.

35.5. (Section 35.4). In Problem 35.4b, it is shown that

$a \oplus \bar{a} * b * c = a \oplus b * c$.

Design two sequences of gates which give the same
output for the inputs a, b, and c. The resultant gates
are said to be logically equivalent.

35.6. (Section 35.4). Design a circuit of gates to produce
the output

$(a \oplus b) * (a \oplus c)$.

Construct the truth table for this Boolean expression.

35.7. (Section 35.1). Show that the Boolean expressions
$(a \oplus b) * (a \oplus b) \oplus a$ and $a \oplus b$ are equivalent.

35.8. (Section 35.1). Show that the following Boolean
expressions are equivalent:
(a) $a \oplus b$; (b) $a \oplus b * b$.

35.9. (Section 35.3). Find a Boolean expression f which
corresponds to the truth table shown in Table 35.13.

Table 35.13

a   b   c   f
0   0   0   1
0   0   1   1
0   1   0   0
0   1   1   0
1   0   0   1
1   0   1   1
1   1   0   0
1   1   1   0

35.10. (Section 35.3). Construct Boolean expressions for
the output f in the devices shown in Fig. 35.17a-d.
Construct the truth tables in each case.

Fig. 35.17

35.11. Find the outputs f and g in the logic circuits
shown in Fig. 35.18. This device can represent binary
addition in which g is the 'carry' in the binary table
shown in Table 35.14. The output g gives the '1' in
the '10' in the binary sum 1 + 1 = 10.

Fig. 35.18

Table 35.14

x   y   x + y
0   0   0
0   1   1
1   0   1
1   1   10

35.12. (Section 35.3). Reproduce the logic gate in Fig.


35.6 using just the nor gate.

35.13. (Section 35.4). Using the disjunctive normal form,
construct a Boolean expression f for the truth tables
given in Tables 35.15 and 35.16.

Table 35.15

a   b   f
0   0   0
0   1   1
1   0   1
1   1   1

Table 35.16

a   b   c   f
0   0   0   1
0   0   1   0
0   1   0   0
0   1   1   1
1   0   0   1
1   0   1   0
1   1   0   1
1   1   1   0

35.14. (Section 35.3). Show that any Boolean expression
can be modelled using just a nand gate. (Hint: use
a method similar to that explained in Example 35.4.)

35.15. (Section 35.4). Find switching functions for the
switching circuits shown in Figs 35.19a,b.

Fig. 35.19

35.16. A lecture theatre has three entrances and the
lighting can be controlled from each entrance; that is, it
can be switched on or off independently. The light is 'on'
if the output f equals 1 and 'off' if f = 0. Let $a_i = 1$ (i =
1, 2, 3) when switch i is up, and let $a_i = 0$ (i = 1, 2, 3)
when it is down. Construct a truth table for the state of
the lighting for all states of the switches. Also specify a
Boolean expression which will control the lighting.
Graph theory and its applications

Contents
36.1 Examples of graphs
36.2 Definitions and properties of graphs
36.3 How many simple graphs are there?
36.4 Paths and cycles
36.5 Trees
36.6 Electrical circuits: the cutset method
36.7 Signal-flow graphs
36.8 Planar graphs
36.9 Further applications
Problems

36.1 Examples of graphs


A graph is a network or diagram composed of points, or nodes or
vertices, joined together by lines or edges, each of which has a
vertex at each end. Figure 36.1 shows a graph which has four vertices
{a, b, c, d} and six edges {ab, ab, ad, bd, bc, cd}. Two vertices are not
joined in this graph, namely a and c, while a and b are joined by
two edges. Generally, it is not the shape of the graph which is
important; it is usually the presence and number of edges which is
significant.

Fig. 36.1
Here are some practical examples of situations and objects which
can be usefully represented by graphs.

(i) Electrical circuits. Figure 36.2a shows an electrical circuit with
three resistors $R_1$, $R_2$, and $R_3$, an inductor L, and a voltage source
$V_1$. Each edge has just one component, and the joins between
components are the vertices (the term node is frequently used in
circuit theory) in the graph. Care has to be taken with the definition
of nodes (see Section 36.6): they are not necessarily where three or
more wires meet. This circuit has four vertices a, b, c, d, and it can
be represented by the graph in Fig. 36.2b. The presence of a line or
edge between two nodes in the graph indicates that there is a
component between the nodes.
Figure 36.3 shows another circuit with six vertices in which the
boxes indicate electrical components. The wires joining c to f and
b to e cross over each other. In the design of printed circuits, it is
useful to know whether the circuit can be redrawn so that no wires
cross. Such a graph, with no edges crossing, is known as a planar
graph. The graph in Fig. 36.2 is planar, but the graph of the circuit
in Fig. 36.3 has no planar drawing: at least two edges will cross in
any plane diagram of it.

Fig. 36.2

(ii) Chemical molecules. The molecule of ethanol can be represented


by Fig. 36.4a. In its graph representation in Fig. 36.4b, the vertices
represent atoms and the edges bonds. The number of bonds which
meet at an atom is the valency of the atom. Thus carbon (C) has
valency 4, oxygen (O) valency 2, and hydrogen (H) valency 1.
Generally in graphs, the number of edges that meet at a vertex is
known as the degree of the vertex.

(iii) Road maps. Road maps and street plans are graphs with roads
as edges and junctions as vertices. However, most road networks
include one-way streets. Hence graphs need to be modified to
indicate directions in which movement or flow is permitted. Figure
36.5a shows a typical section of a street plan with some one-way
streets. We have to associate directions with the edges as shown in
the graph of the plan in Fig. 36.5b. Note that two-way streets now
have two directed edges associated with them. This is an example of
a directed graph, which is also known by the shortened term digraph.

(iv) Shortest paths. Figure 36.6 shows a digraph with weights associated
with each edge. The graph could represent routes between towns S
and F which pass through intermediate towns A, B, ...; the weights
associated with each directed edge could stand for distances or times.
This graph is shown as a digraph, but weights could be present
without directions in some cases. We might be interested in this example
in the shortest distance between the start (S) and the finish (F).

Fig. 36.4 Ethanol molecule.

36.2 Definitions and properties of graphs


As we have seen, a graph is an object composed of vertices and
edges with one vertex at each end of every edge. An edge which
joins a vertex to itself is known as a loop. If two or more edges join
the same two vertices then they are known as multiple edges. A
graph with no loops or multiple edges is known as a simple
graph. A graph with loops and/or multiple edges is known as a
multigraph.
A graph in which every vertex can be reached from every other
vertex along a succession of edges is said to be connected. Otherwise
the graph is said to be disconnected. A connected graph is in one
piece: a disconnected graph is in two or more pieces.
The degree of a vertex x is the number of edges that meet there,
denoted by deg(x). If, in a graph G, all the vertices have the same
degree r, then G is said to be regular of degree r.
Fig. 36.5 (a) Traffic flow in a road grid. (b) Digraph representation of the roads in (a).

Example 36.1. Find the degree of the vertices in the graph in Fig. 36.1.

Three edges meet at the vertex a. Hence deg(a) = 3. Four edges meet at b. Hence
deg(b) = 4. Similarly, deg(c) = 2 and deg(d) = 3.

A simple graph in which every vertex is joined to every other


vertex by just one edge is called a complete graph.

Fig. 36.6
Figure 36.7 shows some examples of the various graphs described
above.
Since every edge has a vertex at each end, it follows that the sum
of all the vertex degrees equals twice the number of edges. This is
known as the handshaking lemma. For example, from Example
36.1,

deg(a) + deg(b) + deg(c) + deg(d) =3+4+2+3= 12,

which is twice the number of edges in the graph shown in Fig. 36.1.
There are two immediate consequences of the handshaking lemma:

(i) the sum of all the vertex degrees in a graph is an even number;
(ii) the number of vertices of odd degree is even.
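The handshaking lemma and its two consequences are easy to confirm numerically for the graph of Fig. 36.1, using the edge list given in Section 36.1 (with the edge ab listed twice because a and b are joined by two edges). The function name degrees is an illustrative choice.

```python
from collections import Counter

# Edges of the graph in Fig. 36.1; ab appears twice (a multiple edge).
edges = [("a", "b"), ("a", "b"), ("a", "d"), ("b", "d"), ("b", "c"), ("c", "d")]

def degrees(edge_list):
    """Count the edge ends meeting at each vertex (a loop would count twice)."""
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

deg = degrees(edges)
total = sum(deg.values())
odd_vertices = [v for v, d in deg.items() if d % 2 == 1]

print(deg)
print(total == 2 * len(edges))      # handshaking lemma
print(len(odd_vertices) % 2 == 0)   # consequence (ii)
```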

Fig. 36.7 (a) Connected simple graph. (b) Connected multigraph.
(c) Disconnected multigraph. (d) Regular graph of degree 3.
(e) Complete graph with five vertices: deg(a) = 4.

36.3 How many simple graphs are there?


Graphs can be described as labelled, in which case the vertices are
distinguishable as in Fig. 36.8a, or unlabelled as in Fig. 36.8b. If
we look at graphs with just three vertices, there are eight labelled
simple graphs as shown in Fig. 36.9, but there are just four distinct
unlabelled graphs as shown in Fig. 36.10. In Fig. 36.9, the labelled
graphs (2), (3), and (4) will correspond to the same unlabelled graph.
The number of labelled simple graphs with n vertices is fairly easy
to calculate. Between any two vertices, there is the possibility of an
edge. Any vertex can be joined to n - 1 other vertices. Since counting
n(n - 1) joins in this way includes every edge twice, there will be
$\tfrac{1}{2}n(n - 1)$ possible edges. Each edge
may be either present or not. Hence the number of possible
combinations of present and absent edges will be $2^{\frac{1}{2}n(n-1)}$, which is
the number of labelled graphs. Thus there must be $2^{\frac{1}{2} \cdot 4(4-1)} = 2^6 = 64$
labelled graphs with four vertices; these reduce to 11 distinct
unlabelled graphs. The latter graphs are shown in Fig. 36.11. Of the
11 unlabelled graphs it can be seen that six are connected and four
are regular.
For applications involving circuits, the main interest is in connected
graphs. The numbers of the various categories of graphs up to n = 7
vertices are given in Table 36.1. It can be seen from the table that
the number of unlabelled graphs is a considerable reduction on the
labelled set, and that regular graphs are comparatively rare. The
counting of unlabelled graphs does not follow from a simple formula.

Fig. 36.8

Table 36.1

n 1 2 3 4 5 6 7

Labelled graphs 1 2 8 64 1024 32 768 2 097 152


Unlabelled graphs 1 2 4 11 34 156 1044
Connected graphs 1 1 2 6 21 112 853
Regular graphs 1 2 2 4 3 8 6
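The first two rows of Table 36.1 can be reproduced for small n. The labelled count follows from the formula $2^{\frac{1}{2}n(n-1)}$; the unlabelled count has no such formula, so the sketch below simply generates every labelled graph and identifies those that differ only by a relabelling of the vertices. This brute-force approach, and the names labelled_count and unlabelled_count, are illustrative; the method is practical only for very small n.

```python
from itertools import combinations, permutations

def labelled_count(n):
    """Number of labelled simple graphs: each possible edge is present or absent."""
    return 2 ** (n * (n - 1) // 2)

def unlabelled_count(n):
    """Count labelled graphs up to relabelling of the vertices (small n only)."""
    vertices = range(n)
    pairs = list(combinations(vertices, 2))      # all possible edges
    seen = set()
    for mask in range(2 ** len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if (mask >> i) & 1]
        # Canonical form: the smallest sorted edge list over all relabellings.
        canon = min(
            tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
            for p in permutations(vertices)
        )
        seen.add(canon)
    return len(seen)

print([labelled_count(n) for n in range(1, 8)])
print([unlabelled_count(n) for n in range(1, 5)])
```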

36.4 Paths and cycles


Consider a graph G. Suppose we follow a succession of connected
edges between two vertices a and z, along which there may be
repeated edges and vertices. This is known as a walk between a and
z. If all the edges walked are different (that is, no edge is covered
more than once but vertices may be visited more than once), then
the walk defines what is known as a trail. A trail is said to be closed
if the first and last vertices are the same. If all the vertices on a trail
are different, except possibly the end pair, then the succession defines
a path. A closed path is known as a cycle or circuit. For example,
in Fig. 36.12, a-f-b-c-d is a path between a and d, but a-b-f-e-b-c-d
is only a trail since vertex b is passed through twice. Also,
a-b-c-d-e-f-a is an example of a cycle.

Fig. 36.9 Labelled graphs with three vertices.

Fig. 36.10 Unlabelled graphs with three vertices.

Fig. 36.11 All unlabelled graphs with four vertices.

Example 36.2. Electrical circuits are usually such that every edge of their
representative graphs is part of a cycle. List all the distinct cycles in the circuit in
Fig. 36.2(a).
The graph of the circuit is repeated in Fig. 36.13. The complete list of cycles is:
3-edge cycles: a-b-c-a, a-b-d-a, a-d-c-a, b-d-c-b;
4-edge cycles: a-b-d-c-a, a-d-b-c-a, a-b-c-d-a.
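A list of cycles such as this can be checked by a systematic search. The sketch below assumes, as the cycle list implies, that the graph of Fig. 36.13 has the six edges ab, ac, ad, bc, bd, cd; it extends paths one vertex at a time and records a cycle whenever a path of three or more vertices can return to its start. The function name simple_cycles is illustrative, and the method applies to simple graphs only.

```python
def simple_cycles(vertices, edges):
    """All distinct cycles of a simple graph, each stored as a set of edges
    so that the starting vertex and direction of travel do not matter."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cycles = set()

    def extend(path):
        start, last = path[0], path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= 3:
                closed = path[1:] + [start]
                cycles.add(frozenset(frozenset(e) for e in zip(path, closed)))
            elif nxt not in path:
                extend(path + [nxt])

    for v in vertices:
        extend([v])
    return cycles

vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]
found = simple_cycles(vertices, edges)
print(sorted(len(c) for c in found))   # number of edges in each cycle
```

The search finds four 3-edge cycles and three 4-edge cycles, in agreement with the list in Example 36.2.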
Fig. 36.12

Some graphs have special closed-path and cycle properties. A
connected graph G is said to be eulerian if there exists a closed
trail that includes every edge in G. A connected graph G is said to
be hamiltonian if there exists a cycle that includes every vertex in
G. The graph in Fig. 36.13 is hamiltonian but not eulerian. One
hamiltonian cycle in its graph is a-b-d-c-a. Note that this cycle
does not have to cover every edge in the graph.

Fig. 36.13

The graph in Fig. 36.14 is both eulerian and hamiltonian. An
eulerian trail is
a-b-c-d-e-f-g-e-c-g-b-f-a,
and a hamiltonian cycle is
a-b-c-d-e-g-f-a.
It can be shown that a connected graph is eulerian if and only if every
vertex has even degree. This provides an easy test for the eulerian
property of a graph.
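The even-degree test is simple to apply in code. In the sketch below the edges of the graph in Fig. 36.14 are read off from the eulerian trail quoted above (each consecutive pair of vertices is one edge), and the graph of Fig. 36.13 is assumed to have the six edges implied by the cycle list in Example 36.2. The function name is_eulerian is illustrative, and the test presumes the graph is connected.

```python
from collections import Counter

def is_eulerian(edges):
    """Even-degree test; assumes the graph is already known to be connected."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return all(d % 2 == 0 for d in deg.values())

# Fig. 36.14: edges taken from the eulerian trail, which covers each edge once.
trail = "a-b-c-d-e-f-g-e-c-g-b-f-a".split("-")
fig_36_14 = list(zip(trail, trail[1:]))

# Fig. 36.13: assumed to be the graph with every pair of a, b, c, d joined.
fig_36_13 = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]

print(is_eulerian(fig_36_14))
print(is_eulerian(fig_36_13))
```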

36.5 Trees
A connected graph which has no cycles is known as a tree. An
example of a tree is shown in Fig. 36.15. The edges in a tree are
called branches.
Suppose that a graph G consists of the set V(G) of vertices and
the set E(G) of edges. Then any graph whose vertices and edges are
subsets of V(G) and E(G) respectively is called a subgraph. It is
important to note that the subgraph must be a graph whose vertices
and edges come from G; and only edges that join two vertices of
the subgraph are permitted in the subset of E(G).
Suppose that G is a connected graph.

A spanning tree of G is a subgraph of G which is a tree and


includes all vertices of G.

Figure 36.16a shows a connected graph G and Figure 36.16b shows


a spanning tree of G. Graphs can have many different spanning trees.
The set of edges that are not part of the spanning tree (the dashed
edges in Figure 36.16b) is known as the cotree and its edges are
called links.
Fig. 36.16 (a) Connected graph. (b) The same graph with a spanning tree.

Consider a tree with n vertices. Construct the tree from a chosen
vertex by adding edges: each edge added must introduce a new
vertex, since otherwise a cycle would be created and the graph would
no longer be a tree. Hence a tree with n vertices must have just n - 1
branches. It follows that a connected graph with n vertices must have a cotree
with e - n + 1 links, where e is the number of edges of the graph.
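The construction described above, in which each added edge introduces a new vertex, is what a breadth-first search does. The sketch below builds one spanning tree in this way and counts the links left in the cotree. As an example it uses the graph whose edges are the consecutive pairs of the trail a-b-c-d-e-f-g-e-c-g-b-f-a quoted in Section 36.4; the function name spanning_tree is illustrative, and a different search order would give a different, equally valid, spanning tree.

```python
from collections import deque

def spanning_tree(vertices, edges):
    """Branches of one spanning tree of a connected graph, found by
    breadth-first search from the first vertex."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    root = vertices[0]
    seen = {root}
    branches = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:            # this edge introduces a new vertex
                seen.add(v)
                branches.append((u, v))
                queue.append(v)
    return branches

trail = "a-b-c-d-e-f-g-e-c-g-b-f-a".split("-")
edges = list(zip(trail, trail[1:]))
vertices = sorted(set(trail))

branches = spanning_tree(vertices, edges)
n, e = len(vertices), len(edges)
links = e - len(branches)
print(len(branches) == n - 1, links == e - n + 1)
```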
We now introduce the cutset by which we can disconnect a graph
into two subgraphs which together contain all the original vertices,
by removing a minimum set of edges in the graph.

In a connected graph, a cutset is a set of edges whose removal


disconnects the graph into two subgraphs such that any proper
subset of the cutset does not disconnect the graph.

In other words, there must be no redundancy in the cutset. Thus,
for example in Fig. 36.17a, the dotted line $C_1$, which removes the
edges ab, bf, and be, defines a cutset {ab, bf, be}; but $C_2$ in Fig.
36.17b does not define a cutset, since the subset {ab, bf, be} of edges
disconnects the graph.
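The two conditions in the definition of a cutset, that removing the edges disconnects the graph and that no proper subset does so, can be tested directly. Since the edges of Fig. 36.17 are not listed in the text, the sketch below uses a small hypothetical graph (a square a-b-c-d with the diagonal ac); the function names is_connected and is_cutset are also illustrative, and the code assumes a simple graph.

```python
from itertools import combinations

def is_connected(vertices, edges):
    """True if every vertex can be reached from the first one."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(vertices)

def is_cutset(vertices, edges, candidate):
    """Removal must disconnect the graph, and no proper subset may do so."""
    remaining = [e for e in edges if e not in candidate]
    if is_connected(vertices, remaining):
        return False
    for k in range(len(candidate)):              # proper subsets only
        for subset in combinations(candidate, k):
            if not is_connected(vertices, [e for e in edges if e not in subset]):
                return False
    return True

# Hypothetical example graph: square a-b-c-d with diagonal ac.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]

print(is_cutset(vertices, edges, [("a", "b"), ("b", "c")]))              # isolates b
print(is_cutset(vertices, edges, [("a", "b"), ("b", "c"), ("c", "d")]))  # redundant edge
```

The second candidate disconnects the graph but contains a redundant edge, so, like $C_2$ in Fig. 36.17b, it is not a cutset.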

36.6 Electrical circuits: the cutset method


In this section we give a brief description of the representation of
circuits by graphs, and show how Kirchhoff's laws can be applied
to cutsets of the resulting graphs. Figure 36.18a shows a plan of a
circuit with seven resistors, a voltage supply, and two capacitors.
This particular circuit has ten components and ten edges. Note that
A will be a vertex or node (a preferred term in circuits) but that
the joins B, C, and D are not separate nodes but can be coalesced
into a single node. The equivalent graph is shown in Fig. 36.18b:
it has five nodes and ten edges. Note that it is a multigraph with
two nodes joined by two edges and two nodes joined by four
edges.
A circuit loop in the circuit is a cycle in the graph.
Kirchhoff's laws have already been stated in Section 19.4, but
for convenience they are given again here in graph terms. They
state (i) that the algebraic sum of the voltages around any loop is
zero, and (ii) that the algebraic sum of the currents entering any node
is zero.

Fig. 36.17 A cutset of a graph.

Fig. 36.18 A circuit and its graph: (a) the circuit, (b) the equivalent graph.

In addition, for resistors we also have Ohm's law which states that
the voltage across a resistor is directly proportional to the current
flowing through it, that is,

$v \propto i$ or $v = Ri$,

where the constant R is measured in units called ohms ($\Omega$). Figure
36.19 shows a circuit with two independent maintained current
sources $i_X$ and $i_Y$: the symbol of the circle enclosing an arrow
represents a maintained current in the direction of the arrow.

Fig. 36.19

The corresponding 6-node digraph with currents $i_1, i_2, \ldots, i_8$ in
the directions indicated is shown in Fig. 36.20. If any current turns
out to be negative then its direction will be opposite to that shown.
Now introduce nodal voltages $v_a, v_b, \ldots, v_f$ as shown in Fig.
36.21. The use of nodal voltages means that effectively Kirchhoff's
first law is automatically satisfied. The earthing at e makes $v_e = 0$
and other voltages can be measured relative to this zero ground
potential.
This circuit has 13 unknowns: 8 currents and 5 nodal voltages.
The problem with circuits is the selection of the minimum number of
b i'i c
consistent equations from Kirchhoff’s laws and Ohm’s law to
determine the unknowns.
The graph of this circuit is the same as that in Fig. 36.16a, and
we shall use the same spanning tree as shown in Fig. 36.16b. In this
graph, the number of nodes n is 6, the number of edges e is 10.
Hence the cotree has, from the previous section, e — n + 1 = 10 —
6+1=5 links. Any cutset of the original graphs which contains
Fig. 36.20 one and only one branch of the spanning tree (the rest of the cutset
consisting of links) is known as a fundamental cutset of the circuit.
Hence we can associate five fundamental cutsets with the spanning
vb tree in Fig. 36.16b. Five possible cutsets Cu C2,..., C5 are shown
in Fig. 36.22.
By repeated use of Kirchhoff’s second law to the nodes on one
side of a cutset, it follows that the algebraic sum of the currents
crossing the cutset must be zero. Hence the five cutset equations are:

$$ C_1:\quad i_1 - i_3 + i_X = 0, \tag{36.1} $$

$$ C_2:\quad i_1 - i_3 + i_4 + i_5 + i_7 - i_8 = 0, \tag{36.2} $$

$$ C_3:\quad \ldots \tag{36.3} $$

$$ C_4:\quad i_6 - i_5 - i_7 + i_8 = 0, \tag{36.4} $$

$$ C_5:\quad i_Y - i_8 = 0. \tag{36.5} $$

These equations must be independent since each one contains a
current from a branch of the spanning tree which does not appear
in any other equation. Further, any nonfundamental cutset equation
will be a linear combination of the five fundamental cutset equations.
The number of branches in the spanning tree defines the number of
independent equations.

Fig. 36.22
We can also apply Ohm’s law to each resistor (note that current
flows from high to low potential). Thus the voltage difference across
$R_1$ is $v_c - v_b$, so that

$$ i_1 = (v_c - v_b)/R_1. \tag{36.6} $$

Similarly

$$ i_2 = (v_f - v_c)/R_2, \tag{36.7} $$

$$ i_3 = (v_b - v_f)/R_3, \tag{36.8} $$

$$ i_4 = (v_f - v_a)/R_4, \tag{36.9} $$

$$ i_5 = v_f/R_5, \tag{36.10} $$

$$ i_6 = (-v_a)/R_6, \tag{36.11} $$

$$ i_7 = \ldots, \tag{36.12} $$

$$ i_8 = (v_d - v_c)/R_8. \tag{36.13} $$

We can now substitute for the currents from (36.6) to (36.13) into
(36.1) to (36.5), resulting in 5 linear equations to determine the nodal
voltages $v_a, v_b, v_c, v_d, v_f$ in terms of the known currents $i_X$ and $i_Y$.
The remaining currents can then be calculated from (36.6) to (36.13).

Example 36.3. Using the cutset method, find all currents and nodal voltages
in the circuit shown in Fig. 36.23.
The circuit can be represented by a graph with 5 nodes (Fig. 36.24) with the
currents $i_1, i_2, i_3, i_4, i_5$ in the directions shown.
A spanning tree with three links is shown in Fig. 36.25 together with cutsets
$C_1, C_2, C_3, C_4$. Hence Kirchhoff’s second law implies:

C,: /, - i3 + i2 = 0, (36.14)

C2: br - i3 + b = 0, (36.15)
Fig. 36.23
C3: —iy + b — f3 + i2 — 0, (36.16)

C4: -iY + i4 + i2 = 0. (36.17)

With ve = 0, the currents in terms of the nodal voltages va, vb, vc, vd are, by
Ohm’s law:

b = (». - v„)/Ri = 2(va -vb), (36.18)

b = (b - vb)/R2 = i(b - »»), (36.19)

b = (b, - «d)/jR3 = vb - vd, (36.20)

Fig. 36.24 U = (ve ~ vd)/R4 = i(ac - ad), (36.21)

b = = io. (36.22)

Eliminate the currents in (36.14) to (36.17) using (36.18) to (36.22):

2va - fvh + ity + Vd = 0, (36.23)

3vb - 3b + vd = 2, (36.24)

-lb + 6b-ib=b (36.25)

2vd = >■ (36.26)

These are linear equations which can be solved using the methods of Chapter
12. Computer algebra is also very useful in solving sets of equations of this type
(see the computer algebra applications for Chapter 12 in Chapter 41). The
answers are

$$ v_a = 5\ \text{V}, \quad v_b = 4\ \text{V}, \quad v_c = 4\ \text{V}, \quad v_d = 2\ \text{V}. $$

Since $v_c = v_b$, no current flows through the resistor on $bc$.

Fig. 36.25 Fundamental cutsets


We can summarize the result for an earthed circuit which contains only
resistors and current sources. Suppose that the representative graph of the circuit
contains $n$ nodes and $e$ edges of which $f$ contain known current sources. The
circuit will have $e - f$ unknown currents and $n - 1$ unknown nodal voltages,
giving $e - f + n - 1$ unknowns in total. Its spanning tree will have $n - 1$ edges,
which will lead to $n - 1$ fundamental cutset equations, and Ohm’s law will apply
to $e - f$ resistors. Hence we shall always have a consistent set of $e - f + n - 1$
equations to find the unknowns.
This result can be extended to circuits with current sources, voltage sources
(batteries) and resistors. If the representative graph has $n$ nodes and $e$ edges of
which $f$ contain current sources and $s$ maintained voltage sources, then the
number of unknown currents will be $e - f$ and the number of unknown nodal
voltages will be $n - 1 - s$, since the nodal voltage difference across a battery will
be known. Hence the number of unknowns is $e - f + n - 1 - s$, which will
satisfy $n - 1$ cutset equations and $e - f - s$ Ohm’s laws.
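The counting argument above can be checked mechanically. The following is only a minimal sketch: the function name `cutset_counts` and its dictionary keys are illustrative choices, and the edge count of 7 used for Example 36.3 is an inference from its 5 nodes and three links, not a figure quoted in the text.

```python
def cutset_counts(n, e, f, s=0):
    """Count unknowns and equations for an earthed circuit.

    n: nodes, e: edges, f: edges carrying known current sources,
    s: edges carrying maintained voltage sources (batteries).
    """
    unknown_currents = e - f
    unknown_voltages = n - 1 - s      # one node is earthed
    cutset_equations = n - 1          # one per branch of the spanning tree
    ohm_equations = e - f - s         # one per resistor
    return {
        "links": e - n + 1,           # edges of the cotree
        "unknowns": unknown_currents + unknown_voltages,
        "equations": cutset_equations + ohm_equations,
    }


# Circuit of Fig. 36.19: 6 nodes, 10 edges, 2 current sources.
fig_36_19 = cutset_counts(n=6, e=10, f=2)
# Example 36.3: 5 nodes; 7 edges is inferred from "three links".
example_36_3 = cutset_counts(n=5, e=7, f=2)

print(fig_36_19)
print(example_36_3)
```

In both cases the number of equations equals the number of unknowns, which is the consistency claimed in the summary.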

36.7 Signal-flow graphs


Figure 36.26 shows a block diagram of a negative-feedback control
system. The input into the system is $P(s)$ and the output $Q(s)$. All
operations are defined by their transfer functions (see Section 25.4).
The boxes represent devices or controllers. The circle represents a
sum operator, and the sign on $F(s)$ indicates positive or
negative feedback. The output signal $Q(s)$ is fed back into the input
through $H(s)$, and it is a negative feedback which will reduce the
output. In a later problem, we shall consider a device with a positive
feedback. Thus the input into $G(s)$ is

$$ A(s) = P(s) - F(s). \tag{36.35} $$

Fig. 36.26 Negative-feedback control system.

The boxes each produce outputs given by the transfer functions

$$ Q(s) = G(s)A(s), \tag{36.36} $$

$$ F(s) = H(s)Q(s). \tag{36.37} $$

We wish to find $Q(s)$ in terms of $P(s)$, $G(s)$, and $H(s)$, from the
equations (36.35) to (36.37). Thus, from (36.36),

$$ Q(s) = G(s)A(s) = G(s)[P(s) - F(s)] = G(s)[P(s) - H(s)Q(s)]. $$

Hence the output transfer function is

$$ Q(s) = \frac{G(s)}{1 + G(s)H(s)}\,P(s). $$

This is the closed-loop transfer function. The actual signal can be
obtained by finding the inverse Laplace transform for $Q(s)$. Hence
the system is equivalent to that shown in Fig. 36.27.

Fig. 36.27 Block-reduced diagram for Fig. 36.26.
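As a numerical sanity check on the closed-loop result, the sketch below evaluates the formula at one complex value of $s$ and confirms that it satisfies (36.35) to (36.37). The particular transfer functions `G` and `H`, the input `P` and the sample point are hypothetical choices made only for illustration.

```python
def G(s):
    """Hypothetical forward device: a first-order lag."""
    return 1.0 / (s + 1.0)


def H(s):
    """Hypothetical feedback device: a constant gain."""
    return 2.0


def closed_loop(s):
    """Closed-loop transfer function G/(1 + G H) at the point s."""
    return G(s) / (1.0 + G(s) * H(s))


s = 0.5 + 1.0j            # arbitrary sample point
P = 1.0 / s               # hypothetical input (transform of a unit step)
Q = closed_loop(s) * P    # output from the closed-loop formula

# Rebuild the internal signals and check the original equations.
F = H(s) * Q              # (36.37)
A = P - F                 # (36.35)
residual = abs(Q - G(s) * A)   # should vanish by (36.36)
print(residual)
```

The residual is zero up to rounding, so the single equivalent device reproduces the behaviour of the loop at that point.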

If the feedback reinforces the input signal it is called positive
feedback. Figure 36.28 shows a multiple-feedback control system
with a positive and a negative feedback. The output signal is given by

$$ Q(s) = \frac{G_1(s)G_2(s)G_3(s)}{1 - G_2(s)H_1(s) + G_1(s)G_2(s)G_3(s)H_2(s)}\,P(s), \tag{36.38} $$

which can be obtained by the method of block-diagram reduction.
For example, the feedback through $H_1$ makes the system equivalent
to that shown in Fig. 36.29. We can now combine the series devices,
which reduces the system to the negative-feedback control system
considered at the beginning of this section. The details are omitted
here.
This block-reduction method can get quite complicated for a
complex feedback system. Instead of using block reduction in this
way, represent the system by a weighted digraph as shown in Fig.
36.30, where the weights are the transfer functions, except that the
edges representing the input and output are assigned weight 1 since
they carry no devices. Also the negative feedback is replaced by
$-H_2(s)$, to make sure that it reduces the input into $G_1(s)$. This is
the signal-flow graph of the system. Let the inputs into the nodes
be $x_1$, $x_2$, $x_3$, and $x_4$ as shown; then, for the positive feedback cycle,

$$ x_3 = G_2 x_2, \qquad x_2 = G_1 x_1 + H_1 x_3. $$

(The argument $(s)$ has now been dropped from the working.) Hence

$$ x_3 = \frac{G_1 G_2\, x_1}{1 - G_2 H_1}. $$

In other words, we can replace (a) by (b) in Fig. 36.31.

Fig. 36.28 A multiple-feedback control system.

Fig. 36.29 First stage in the block reduction of the multiple-feedback control system.

Fig. 36.30 Signal-flow graph for the multiple-feedback control system shown in Fig. 36.28.

Fig. 36.31

There are other rules, and a complete list now follows for the
replacements for subgraphs in the graph.
(a) Multiple edges. See Fig. 36.32. This follows since

$$ x_2 = Gx_1 + Hx_1 = (G + H)x_1. $$

(b) Edges in series. See Fig. 36.33. This follows since

$$ x_3 = Hx_2 = H(Gx_1) = HGx_1. $$

(c) Cycles. See Fig. 36.34. This follows since

$$ x_3 = Hx_2 \quad\text{and}\quad x_2 = Gx_1 + Jx_3, $$

so that the cycle can be replaced by a single edge of weight $GH/(1 - HJ)$.
Assume that $HJ \neq 1$; otherwise there is infinite gain.
(d) Loops. See Fig. 36.35. This follows since

$$ x_2 = Gx_1 + Hx_2 $$

with $H \neq 1$, so that $x_2 = Gx_1/(1 - H)$.
(e) Stems. See Fig. 36.36. This follows since

$$ x_2 = Gx_1, \qquad x_3 = Hx_2 = HGx_1, \qquad x_4 = Jx_2 = JGx_1. $$

Fig. 36.32 Multiple edges.

Fig. 36.33 Edges in series.

Fig. 36.34 Cycle.

Fig. 36.35 Loop.
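The replacement rules lend themselves to a small calculation. The sketch below, which is an illustration rather than the book's own procedure, encodes rules (a) to (d) as functions (the names `parallel`, `series`, `cycle` and `loop` are hypothetical) and applies them to the signal-flow graph of Fig. 36.30 with arbitrary rational weights, comparing the result with (36.38).

```python
from fractions import Fraction


def parallel(g, h):
    """Rule (a): two edges joining the same pair of nodes."""
    return g + h


def series(g, h):
    """Rule (b): edge g followed by edge h."""
    return g * h


def cycle(g, h, j):
    """Rule (c): edge g into a node, forward edge h, return edge j."""
    return g * h / (1 - h * j)


def loop(g, h):
    """Rule (d): edge g into a node carrying a loop of weight h."""
    return g / (1 - h)


# Arbitrary sample weights (hypothetical values, exact arithmetic).
G1, G2, G3 = Fraction(2), Fraction(3), Fraction(5)
H1, H2 = Fraction(1, 7), Fraction(1, 11)

inner = cycle(G1, G2, H1)          # positive feedback through H1
forward = series(inner, G3)        # then the device G3 in series
overall = cycle(1, forward, -H2)   # negative feedback enters as -H2

formula = G1 * G2 * G3 / (1 - G2 * H1 + G1 * G2 * G3 * H2)   # (36.38)
print(overall, formula)
```

Exact fractions are used so that agreement with (36.38) is not blurred by rounding.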

Apply these rules to the successive reduction of the feedback system
in Fig. 36.30. The sequence of steps in the reduction of the
signal-flow graph to a single-edge graph is shown in Fig. 36.37.
The weight of the final edge agrees with the output in equation
(36.38).
Essentially the operations in a signal-flow graph are those applied
to a weighted digraph as illustrated in the following example.

Example 36.4. Find the output-input relation in the signal-flow graph shown
in Fig. 36.38.
Applying rule (a) to the multiple edge, and rule (c) to the cycle, the graph is
reduced to Fig. 36.39. Apply the series rule (b) to the divided edges to give Fig.
36.40. Finally the multiple edge and series rules give Fig. 36.41. Thus the output
is given by

$$ q = \frac{abd}{1 - bc} + he(g + f). $$

In the actual control system $a, b, c, \ldots$ will be transfer functions.

Fig. 36.37 Successive steps in the reduction of the signal-flow graph of the control system shown in Fig. 36.28.

Fig. 36.38

Fig. 36.39

Fig. 36.40

Fig. 36.41

36.8 Planar graphs


As we remarked in Section 36.1, planar graphs are important in
circuit design since planar circuits can be manufactured as a single
board. A planar graph is a graph that can be drawn with no edges
crossing or meeting except at vertices. The standard example of a
simple application which cannot be represented by a planar graph
is the delivery of three services, water (W), gas (G), and electricity
(E), to three houses A, B, C (Fig. 36.42). This graph has no plane
drawing. The re-organization of the graph in Fig. 36.43 shows the
impossibility of this; if W and C are connected last then this edge
must cross either AE or BG.
The graph in Fig. 36.42 is an example of a bipartite graph in
which one set of vertices may be connected to another set of vertices,
but not to vertices in the same set. If every vertex in one set is
connected by one edge to every vertex in the other set then it is
called a complete bipartite graph. If the sets have $m$ and $n$ vertices
respectively, then the notation $K_{m,n}$ denotes the complete bipartite
graph. Figure 36.42 shows the graph $K_{3,3}$ and this graph is not
planar. Check that the graphs $K_{2,2}$ and $K_{2,3}$ are planar.

Fig. 36.42 Bipartite graph $K_{3,3}$.
In planar graphs there is a relation between the numbers of
vertices, edges, and faces. In a plane drawing of a graph, the plane
is divided into regions called faces. One face is the region external
to the graph. Figure 36.44 shows a planar graph with five vertices
and seven edges, and with four faces: A, B, C, and the external face D.
A remarkable formula, due to Euler, links the numbers of vertices,
edges, and faces of a graph.

Fig. 36.43

Theorem (Euler). Suppose that the connected graph $G$ has a planar
drawing, and let $v$ be the number of vertices, $e$ the number of
edges, and $f$ the number of faces of $G$. Then

$$ v - e + f = 2. $$

Proof. For the graph $G$, define a spanning tree (see, for example,
Fig. 36.45). The spanning tree must have $n$ vertices and $n - 1$ edges,
where $n = v$ (see Section 36.5). It must also have just one face. Since

$$ n - (n - 1) + 1 = 2, $$

Euler’s formula holds for the spanning tree. Successively replace the
other edges in the graph. Each time an extra edge is added, a face
is divided and one extra face is added. However, algebraically, this
cancels the additional edge in the accumulation to Euler’s formula
for the spanning tree. Hence

$$ v - e + f = 2 $$

for the reconstructed graph $G$.

Fig. 36.44 A planar graph with five vertices, seven edges, and four faces.

Fig. 36.45 A graph with a spanning tree.
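The proof is constructive, so it can be mirrored in a few lines. The sketch below is illustrative only (the function names are hypothetical): it starts from the spanning tree, with one face, and adds the remaining edges one at a time, each creating one new face, then compares the count with Euler's formula for the graph of Fig. 36.44.

```python
def faces_from_euler(v, e):
    """Number of faces predicted by v - e + f = 2."""
    return 2 - v + e


def faces_by_rebuilding(v, e):
    """Follow the proof: a spanning tree has v - 1 edges and one face;
    every further edge that is restored splits one face into two."""
    faces = 1
    for _ in range(e - (v - 1)):
        faces += 1
    return faces


# Graph of Fig. 36.44: five vertices and seven edges.
v, e = 5, 7
f = faces_by_rebuilding(v, e)
print(f, v - e + f)
```

The count of four faces matches the faces A, B, C and the external face D named in the text.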


A complete graph with $n$ vertices is a simple graph in which every
vertex is joined to every other vertex. For the $n$-vertex graph, it is
denoted by $K_n$. The graphs of $K_2$, $K_3$, $K_4$, and $K_5$ are shown in
Fig. 36.46. Of these graphs, $K_2$, $K_3$, and $K_4$ are planar, but $K_5$ and
all succeeding complete graphs are not.
The graphs $K_{3,3}$ and $K_5$ are the keys to tests for planarity of
graphs, and whether it is possible to design, for example, a plane
printed-circuit board to make the required connections between
electronic components. It was proved by Kuratowski in 1930 that
every non-planar graph contains subgraphs which are either $K_{3,3}$
or $K_5$, or $K_{3,3}$ or $K_5$ with additional vertices on their edges.

36.9 Further applications


Braced frameworks
Consider a frame which consists of four struts in the shape of a
rectangle (Fig. 36.47a) with pin joints at each corner. Without a
diagonal tie the structure will not support a vertical load, but will
collapse into a parallelogram as shown in Fig. 36.47b. The structure
can be made rigid and load bearing by the insertion of a diagonal
strut as in Fig. 36.48.
Consider now a pin-jointed framework with $m \times n$ rectangular
frames with some individual frames braced. How can we decide
whether a particular framework is braced, that is, no part of it can
be sheared? And if it is braced, how many ties could be removed to
leave a minimum bracing? The framework is similar to a vertical
section of scaffolding or a steel-framed building, although in
both cases the joins are bolted but can still need bracing to ensure
rigidity.

Fig. 36.47 Single unbraced pin-jointed frame: (a) unloaded, (b) under load.
Figure 36.49 shows a $5 \times 6$ framework with 11 braces as shown
(braces can be diagonal struts in either direction). Label the cell
rows $r_1, r_2, \ldots, r_5$ and the cell columns $c_1, c_2, \ldots, c_6$ as shown in
Fig. 36.49. The framework will be represented by a bipartite graph
(see Section 36.8) with the cell rows and columns as vertices. Arrange
them in rows as shown in Figure 36.50.
If a particular rectangular cell is braced then the identifying row
and column vertices are joined by an edge. Thus the cell $r_1c_1$ is
braced so that an edge joins $r_1$ and $c_1$ in the bipartite graph. No
edge joins $r_1$ and $c_3$ since this cell is not braced. The bipartite graph
representing the framework is shown in Figure 36.50. If the graph
is connected, then the framework is braced since the shearing of any
cell or group of cells is not then possible. The graph is connected
in this case, and the framework is braced. Can any braces be
removed in such a way that the framework is still braced? Any brace
which is removed must not disconnect the graph. If the graph
contains a cycle (Section 36.4) then any edge removed from the cycle
will not disconnect the graph. This removal rule can be applied to
each cycle in the graph. If, at the end of this process, there are no
cycles remaining and the graph remains connected, then the framework
is said to have a minimum bracing. The framework graph in
Fig. 36.50 contains just one cycle, namely $r_1c_1r_3c_3r_4c_6r_2c_2r_1$ (see Fig.
36.49). Any edge can be removed from this cycle leaving a minimum
bracing. The removal of any further edges will disconnect the graph.
If every cell is braced in a framework then the bipartite graph
will be complete, and the framework will be seriously overbraced.
You might note that a complete bipartite graph $K_{m,n}$ has $mn$ edges
but a minimum bracing for an $m \times n$ framework has $m + n - 1$
edges: for example, if $m = 5$ and $n = 6$ then $mn = 30$ whilst
$m + n - 1 = 10$.
Figure 36.51 shows an unbraced $4 \times 5$ framework, its (disconnected)
graph and the same framework sheared.

Fig. 36.48 Braced frame.

Fig. 36.49 $5 \times 6$ framework.

Fig. 36.50

Fig. 36.51 An unbraced framework.
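The bracing test reduces to a connectivity check on the bipartite graph, which a short union-find routine can carry out. This is a sketch under stated assumptions: the function names are hypothetical, rows and columns are numbered from 0, and the small $2 \times 3$ bracing patterns are invented for illustration because the actual braces of Fig. 36.49 cannot be read from the text.

```python
def is_braced(m, n, braces):
    """Return True if the bipartite row/column graph is connected.

    braces is a collection of (row, column) pairs, 0-indexed, one per
    braced cell of an m x n framework.
    """
    parent = list(range(m + n))       # rows 0..m-1, columns m..m+n-1

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    components = m + n
    for row, col in set(braces):
        a, b = find(row), find(m + col)
        if a != b:                    # this brace joins two components
            parent[a] = b
            components -= 1
    return components == 1


def surplus_braces(m, n, braces):
    """Braces beyond the minimum m + n - 1 (meaningful only if braced)."""
    return len(set(braces)) - (m + n - 1)


# Hypothetical 2 x 3 frameworks.
minimum = [(0, 0), (0, 1), (1, 1), (1, 2)]
overbraced = minimum + [(1, 0)]
unbraced = [(0, 0), (1, 1), (1, 2)]

print(is_braced(2, 3, minimum), surplus_braces(2, 3, minimum))
print(is_braced(2, 3, overbraced), surplus_braces(2, 3, overbraced))
print(is_braced(2, 3, unbraced))
```

For the $5 \times 6$ framework of the text, 11 braces with a connected graph give a surplus of $11 - 10 = 1$, which corresponds to the single cycle found in Fig. 36.50.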

Phasing of traffic signals


Figure 36.52 shows a road junction with 8 incoming lanes of traffic
and a one-way exit. Suppose that each lane can be controlled by its
own individual signal.
One solution for traffic management would be to allow each lane
to have a green signal in sequence with the remaining all on red,
but this would be inefficient since obviously several lanes of traffic
can move simultaneously without risk. How can an efficient phasing
of the signals be designed?
Label each incoming lane $a, b, c, \ldots, h$ as shown, and let these
be vertices of a graph (Fig. 36.53). Starting, say, with $a$ we decide
which traffic lanes are compatible with $a$, that is, which lanes can
also have green lights simultaneously without risk of a collision.
Thus $a$ and $b$ are compatible, and we therefore join $a$ and $b$ by an
edge. Lanes $a$ and $c$ are also compatible, but $a$ and $e$
are not, and so on. The graph $G$ in Fig. 36.53 shows which lanes are
compatible, and is known as the compatibility graph for this
junction.
We now look for complete subgraphs (Section 36.8) in $G$. An edge
is a complete subgraph ($K_2$), a triangle ($K_3$) is a complete subgraph
with three vertices, $K_4$ with four vertices, and so on. We try to use
the largest subgraphs in any covering of $G$, that is, a list of subgraphs
which includes all vertices. In $G$, $abcd$, $abdf$ and $abfg$ are $K_4$
subgraphs, and there are a large number of triangles. For example
we can cover $G$ by the set of subgraphs

$$ \{abcd,\ abfg,\ def,\ fgh\}. $$

Generally, we include as many large subgraphs as possible. In this
list it is better to use $fgh$ rather than just $gh$, which could have been
chosen since $f$ is included in other subgraphs.
Suppose that the period of the traffic signal sequence is $T$ seconds
with each lane having a green light for at least $\tfrac14 T$. There are four
different traffic flows represented by the subgraphs. Suppose that
each subgraph list of lanes has a green light for $\tfrac14 T$. The green/red
phasing sequence is shown in Table 36.1.

Fig. 36.52 Road junction.

Fig. 36.53

Table 36.1

                        Subgraph
Time            abcd     abfg     def      fgh

0 to T/4        green    red      red      red
T/4 to T/2      red      green    red      red
T/2 to 3T/4     red      red      green    red
3T/4 to T       red      red      red      green

The actual phasing lane-by-lane is shown in Fig. 36.54, where
the solid line indicates the green light for a lane. For example,
between $\tfrac14 T$ and $\tfrac12 T$, lanes $a$, $b$, $f$, $g$ are on green with the
others on red.
The total waiting time for the traffic at the junction is a measure
of the efficiency of the timings and phases. Let $t_a, t_b, t_c, \ldots$ be the
waiting times of the lanes so that, from Fig. 36.54, we can see that
$t_a = \tfrac12 T$, $t_b = \tfrac12 T$, $t_c = \tfrac34 T$, etc. Hence the total waiting time $W_T$ is
given by

$$ W_T = t_a + t_b + \cdots + t_h
      = \tfrac12 T + \tfrac12 T + \tfrac34 T + \tfrac12 T + \tfrac34 T + \tfrac14 T + \tfrac12 T + \tfrac34 T
      = \tfrac92 T. $$

Fig. 36.54 Traffic phasing.

Can the waiting time be reduced within the time constraints by either
choosing a different set of subgraphs to cover G, or a different
sequence of timings? Figure 36.55 shows the same choice of subgraphs
but with different timings. The result is a slightly shorter
waiting time of ^T.
Fig. 36.55
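The waiting-time bookkeeping is easy to automate, which helps when comparing alternative coverings or timings. The following sketch assumes the reading used above, namely four phases of $\tfrac14 T$ each for the covering $\{abcd, abfg, def, fgh\}$; the function name `waiting_times` is illustrative, and times are measured as fractions of the period $T$.

```python
from fractions import Fraction


def waiting_times(lanes, phases):
    """Red time per lane, as a fraction of the period T.

    phases is a list of (subgraph, duration) pairs, where subgraph is a
    string of lane labels that are green during that phase.
    """
    waits = {}
    for lane in lanes:
        green = sum(duration for subgraph, duration in phases
                    if lane in subgraph)
        waits[lane] = 1 - green
    return waits


quarter = Fraction(1, 4)
phases = [("abcd", quarter), ("abfg", quarter),
          ("def", quarter), ("fgh", quarter)]

waits = waiting_times("abcdefgh", phases)
total = sum(waits.values())
print(waits)
print(total)   # total waiting time in units of T
```

Changing the durations in `phases` (keeping their sum equal to 1) gives the total waiting time for any other timing of the same covering.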

Problems

36.1. (Section 36.2). Write down the degree of each vertex in the graph in Fig. 36.56.

36.2. (Section 36.2). Draw the complete graph with six vertices. How many edges does it have?

36.3. (Section 36.2). Sketch the 21 connected unlabelled graphs with five vertices. How many of them are planar?

36.4. Sketch the eight regular graphs with six vertices. How many of them are connected?

36.5. The adjacency matrix of a graph $G$ with no loops is a vertex-vertex matrix, in which the element in the $i$th row and $j$th column is 0 if vertices $i$ and $j$ are not joined by an edge, and $r$ if $i$ and $j$ are joined by $r$ edges. Thus, if we number the vertices $a, b, c, d$ as 1, 2, 3, 4 respectively, then the adjacency matrix of the graph in Fig. 36.1 is

$$ A = \begin{pmatrix} 0 & 2 & 0 & 1 \\ 2 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}. $$

Note that the leading diagonal has zeros if there are no loops. The adjacency matrix is a formula for the graph. Evaluate $A^2$. What is the interpretation of the matrix in terms of the edges of $G$?

36.6. Draw the graphs defined by the following adjacency matrices:

$$ \text{(a)}\quad A = \begin{pmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 \end{pmatrix}, \qquad
   \text{(b)}\quad A = \begin{pmatrix} 0 & 2 & 0 & 0 \\ 2 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix}. $$

36.7. Write down the adjacency matrices of the graphs in Fig. 36.7. Note that a single loop introduces an element 1 into the appropriate position on the leading diagonal. What characterizes the matrix of a disconnected graph?

36.8. (Section 36.4). How many different cycles pass through a single vertex in a complete graph with four vertices?

36.9. (Section 36.4). List all trails between vertices $a$ and $f$ in the graph shown in Fig. 36.48. Identify which trails in the list are also paths.

36.10. (Section 36.4). Is the graph in Fig. 36.48 eulerian? If it is, find an eulerian closed trail. Is it hamiltonian?

36.11. (Section 36.5). Construct a spanning tree for the graph shown in Fig. 36.57. Draw its cotree. Show that there is a spanning tree in which no vertex has degree more than two.

36.12. Figure 36.58 shows a graph with seven vertices.
(a) Decide whether the graph is eulerian.
(b) Construct a spanning tree for the graph. How many branches does the tree have?
(c) Draw a cutset which disconnects the vertices $a, b, g, f$ from the vertices $c, d, e$.

Fig. 36.58

36.13. Figure 36.59 shows a digraph. How many trails are there between $a$ and $e$? Which of them are also paths? Can you find a four-edge cycle?

Fig. 36.59

36.14. (Section 36.6). Figure 36.60 shows a circuit with an independent current source $i_0$. Represent the circuit by a graph. How many vertices does the graph have?

Fig. 36.60

36.15. (Section 36.6). A circuit is represented by the graph shown in Fig. 36.61. The current $i_0$ is from an independent source, and all other edges contain a resistor in which the current $i_1$ passes through a resistor $R_1$ and so on. Define a spanning tree for the graph. How many fundamental cutsets are required? Write down the current equations associated with each of the cutsets. If $i_0 = 2$ A, a maintained current, $v_c = 0$ (earthed) and $R_k = 1\ \Omega$ for $k = 1, 2, \ldots, 7$, find the remaining voltages $v_a, v_b, v_d, v_e$.

Fig. 36.61

36.16. (Section 36.6). Figures 36.62a,b show two circuits with current sources and resistors. Figure 36.62c shows a circuit with current sources and a constant voltage source (battery). Use the cutset method to find the nodal voltages and currents through the resistors.

Fig. 36.62 (a) (b) (c)

36.17. Complete the block-reduction method for the multi-feedback control system shown in Fig. 36.64.

36.18. (Section 36.5). Figure 36.63 shows a positive feedback control system. If $P(s)$ is the system input, find its output $Q(s)$, and the transfer function of a single equivalent device.

Fig. 36.63

36.19. (Section 36.7). Find the outputs in the systems shown in Fig. 36.64a,b by progressively replacing parts of the system by equivalent devices until just one device remains. Find the transfer function of the resulting equivalent single device.

Fig. 36.64 (a) (b)

36.20. (Section 36.7). Reduce each of the signal-flow graphs in Figs 36.65a,b,c,d to an equivalent single edge, and (e) to a stem, and find the transfer function in each case.

36.21. (Section 36.8). Label the edges, vertices, and faces of the graphs shown in Figs 36.66a,b and verify Euler’s formula.

36.22. (Section 36.8). Show that the bipartite graph $K_{2,3}$ has a planar representation.

Fig. 36.65 (a) (b) (c) (d) (e).

36.23. (Section 36.8). The complete graph $K_5$ does not have a plane drawing. What is the minimum number of edge crossings in a plane representation of the graph?

36.24. (Section 36.1). List all the paths between S and T in the network given in Fig. 36.6, and hence find the shortest and longest paths. (This method of simply listing all paths can become very extensive for larger networks: efficient algorithms are really required to reduce the number of calculations.)

36.25. (Section 36.9). Show that the framework in Fig. 36.67 is overbraced. How many ties can be removed to leave a minimum bracing?

36.26. (Section 36.9). How many ties will be needed to secure a minimum bracing for the framework shown in Fig. 36.68? Draw in a suitable set of ties for a minimum bracing.

36.27. (Section 36.9). Decide whether the frameworks

Fig. 36.69

36.28. (Section 36.9). The framework in Fig. 36.69c is required to be strengthened so that it is overbraced with each diagonal tie as an edge in at least one cycle in the associated bipartite graph. What is the minimum number of ties which must be added?

Fig. 36.70

36.29. Figure 36.70 shows a junction with 8 distinct lanes of traffic each controlled by a separate traffic signal. This is really a ‘design and solve’ problem. Here is one model: of the doubtful cases assume that lane $a$ is compatible with both $c$ and $e$, and that $e$ is compatible with $h$. Draw the compatibility graph for this junction. List all complete subgraphs with 4 and 3 vertices. If the period of the traffic signal cycle is $T$ and the subgraphs

{abef, cdg, aeh}
are chosen with each allowed green for \T, calculate
the total waiting time. Suppose that the subgraph abef
runs for \T and the others for \T each. How does this
affect the total waiting time?
Difference equations

Contents
37.1 Discrete variables 648
37.2 Difference equations: general properties 650
37.3 First-order difference equations and the cobweb 652
37.4 Constant-coefficient linear difference equations 653
37.5 The logistic difference equation 658
Problems 662

37.1 Discrete variables


In many applications, functions can only take discrete values—that
is, they cannot (for various reasons) take a continuous spectrum of
values. It is reasonable to model the temperature in a room by a
function which varies continuously with time—most of the calculus
in this book is concerned with such functions. On the other hand,
the population size of a country can only take integer values. As
births and deaths occur, the population size is discontinuous in time,
and the graph of population size against time will be a step
function. Between births and deaths the population number will be
constant so that we are only concerned with changes which take
place at these events. In this problem jumps occur at variable time
intervals.
We can obtain discrete data from a continuous signal or function
by sampling the signal at regular time steps rather than keeping a
continuous record. This is often the situation in microprocessor-
driven operations.
Let us start by considering a simple financial application which
generates discrete values. In compound interest the sum of £$P_0$ is
invested in an account to which interest accrues annually at a
compound rate of $100I\%$. If £$P_1$ is the amount in the account at the
end of the first year, then

$$ P_1 = (1 + I)P_0. \tag{37.1} $$

Let £$P_n$ be the sum after $n$ years. Then, similarly,

$$ P_n = (1 + I)P_{n-1}. \tag{37.2} $$

This is an example of a difference equation or recurrence relation. It
gives the values of $P_n$ at the integer values $0, 1, 2, \ldots$ in terms of the
immediately preceding value. Treating the variable as $n$, the difference
in this case is 1. The notation $P(n)$ instead of $P_n$ is often used to
emphasize the function aspect of $P$ but we have chosen the more
economical subscript form $P_n$.
It is fairly easy to solve (37.2) by repeated application of the
formula starting with (37.1). Thus

$$ P_2 = (1 + I)P_1 = (1 + I)^2 P_0, $$
$$ P_3 = (1 + I)P_2 = (1 + I)^3 P_0, $$

and so the formula

$$ P_n = (1 + I)^n P_0 \tag{37.3} $$

holds at least for values of $n$ up to 3. Suppose that (37.3) holds for
$n = k$. Then (37.2) implies that

$$ P_{k+1} = (1 + I)P_k = (1 + I)^{k+1} P_0. $$

So the same formula holds for $P_{k+1}$. Hence, if the result is true for
$k$ then it is also true for $k + 1$. Equation (37.1) confirms that it is
true for $k = 1$. It follows sequentially that it is true for $n = 2$, $n = 3$,
and so on. (This method of proof is known as induction.)

Example 37.1. £1000 is invested for 5 years at the following rates: (a) 5%
annually; (b) $\tfrac{5}{12}\%$ calendar monthly; (c) $\tfrac{5}{365}\%$ daily (ignoring leap years).
Calculate the final amount in the account in each case.
In each case the formula is

$$ P_n = (1 + I)^n P_0, $$

with $P_0 = 1000$, but the $I$ and $n$ differ.
(a) This is the original problem with $n = 5$ and $I = 0.05$. Hence

$$ P_5 = (1 + 0.05)^5 \times 1000 = 1.05^5 \times 1000 = 1276.28 $$

(in £, to the nearest penny).
(b) This account has 12 compounding periods each year, giving a total of 60
over the 5 years. Hence we require

$$ P_{60} = \left(1 + \frac{0.05}{12}\right)^{60} \times 1000 = 1283.36. $$

(c) For the daily rate, there are $365 \times 5 = 1825$ compounding periods. Thus
we require

$$ P_{1825} = \left(1 + \frac{0.05}{365}\right)^{1825} \times 1000 = 1284.00. $$

There is a slight gain with increasing number of compounding periods.
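The three amounts in Example 37.1 can be reproduced by iterating the recurrence (37.2) directly, which also gives a check on the closed form (37.3). This is a minimal sketch; the function name `compound` is an illustrative choice.

```python
def compound(p0, rate, periods):
    """Iterate P_n = (1 + I) P_{n-1} for the given number of periods."""
    amount = p0
    for _ in range(periods):
        amount *= 1 + rate
    return amount


annual = compound(1000, 0.05, 5)            # case (a)
monthly = compound(1000, 0.05 / 12, 60)     # case (b)
daily = compound(1000, 0.05 / 365, 1825)    # case (c)

for label, value in (("annual", annual), ("monthly", monthly), ("daily", daily)):
    print(label, round(value, 2))

# The closed form (37.3) agrees with the iteration.
closed_form = 1000 * (1 + 0.05) ** 5
```

Iterating and using the closed form give the same value up to rounding, as the induction argument predicts.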

Example 37.2. The sum of £50000 is borrowed over 25 years and repaid in
equal annual instalments, the interest on the outstanding balance in any year being
8%. Find the annual repayments. This scheme equalizes the repayments over the
term of the loan.
Let us solve the general mortgage-repayment problem. Suppose that £$P$ is
borrowed over $N$ years at an interest rate of $100I\%$ on the outstanding loan.
Let the amount outstanding after $m$ years be £$Q_m$. Thus $Q_0 = P$ and $Q_N = 0$.
Let £$A$ be the annual repayment. Then this must include the interest on the
debt still owed and capital repayment. Thus

$$ A = IQ_{m-1} + (Q_{m-1} - Q_m), $$

or

$$ Q_m - (1 + I)Q_{m-1} = -A \qquad (m = 1, 2, \ldots, N). \tag{37.4} $$

It can be done, but it is not quite so obvious now how to iterate $Q_m$ from $Q_0$.
It is sometimes helpful to look for any constant solutions of the difference
equation. In this case, let $Q_m = C$ for $m = 1, 2, \ldots, N$. Then

$$ C - (1 + I)C = -A, \quad\text{or}\quad C = A/I. $$

This is known as a fixed point or an equilibrium value of the difference
equation.
Put

$$ Q_m = C + U_m = A/I + U_m $$

into (37.4). Then

$$ (A/I + U_m) - (1 + I)(A/I + U_{m-1}) = -A. $$

Hence $U_m$ satisfies

$$ U_m - (1 + I)U_{m-1} = 0, $$

which is the difference equation (37.2) again. Thus

$$ U_m = (1 + I)^m U_0 = (1 + I)^m (Q_0 - A/I), $$

and so

$$ Q_m = A/I + U_m = A/I + (1 + I)^m (Q_0 - A/I). $$

Finally, the boundary conditions $Q_0 = P$ and $Q_N = 0$ imply

$$ 0 = A/I + (1 + I)^N (P - A/I). $$

Hence

$$ A = \frac{IP(1 + I)^N}{(1 + I)^N - 1}. $$
For the data given, $P = 50000$, $I = 0.08$, $N = 25$, and consequently

$$ A = \frac{0.08 \times 50000 \times 1.08^{25}}{1.08^{25} - 1} = 4683.94 \quad\text{(to 2 d.p.)}, $$

which represents the annual payments in £ to the nearest penny. The total
repayment over the 25 years is

$$ AN = A \times 25 = 117\,098.47 \quad\text{(to 2 d.p.)}. $$

In the first year, in £,

$$ Q_1 = A/I + (1 + I)(P - A/I) = P(1 + I) - A = 49\,316.06, $$

which indicates, as one might expect, that interest payments dominate in the
early years.
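The repayment formula and the recurrence (37.4) can be checked against each other numerically. The sketch below uses the data of Example 37.2; the function names `annual_repayment` and `outstanding` are illustrative choices.

```python
def annual_repayment(principal, rate, years):
    """A = I P (1 + I)^N / ((1 + I)^N - 1)."""
    growth = (1 + rate) ** years
    return rate * principal * growth / (growth - 1)


def outstanding(principal, rate, repayment, years):
    """Iterate Q_m = (1 + I) Q_{m-1} - A, equation (37.4)."""
    balances = [principal]
    for _ in range(years):
        balances.append((1 + rate) * balances[-1] - repayment)
    return balances


P, I, N = 50000, 0.08, 25
A = annual_repayment(P, I, N)
Q = outstanding(P, I, A, N)

print(round(A, 2))        # annual repayment
print(round(A * N, 2))    # total repaid over the term
print(round(Q[1], 2))     # balance after the first year
print(abs(Q[N]) < 1e-6)   # the loan is cleared after N years
```

Running the recurrence forward with the computed repayment brings the balance to zero after 25 years, which confirms that the boundary condition $Q_N = 0$ is met.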

37.2 Difference equations: general properties


Any equation of the form

$$ u_n = f(u_{n-1}, u_{n-2}, \ldots, u_{n-m}) \tag{37.5} $$

(where $m$ is an integer $\ge 1$) for any successive sequence of integers
$n$, which may terminate or not, is known as a difference equation.
The term discrete dynamical system is also frequently used. Thus

$$ u_n = 2u_{n-1} + 2, \tag{37.6} $$
$$ u_n = 3u_{n-1} + 2u_{n-2} + n^2, \tag{37.7} $$
$$ u_{n+1} = ku_n(1 - u_n), \tag{37.8} $$

are examples of difference equations.
The number $m$ in (37.5) is known as the order of the difference
equation: it is the difference between the largest and smallest
subscripts attached to $u$, namely

$$ n - (n - m) = m. $$

Thus (37.6) and (37.8) are first-order difference equations, while
(37.7) is second-order. The sequence of integers attached to $u$ can
be translated (that is, any integer can be added to the index) without
affecting the difference equation. The difference equation

$$ u_{n+2} = 3u_{n+1} + 2u_n + (n + 2)^2 $$

is the same as (37.7): $n$ has been replaced by $n + 2$ throughout.
Given initial conditions, the successive terms are very easy to
compute. For a first-order difference equation, we can assume that
$u_0$ is given, but it could be any term, say $u_r$, which is taken as the
initial condition. Generally, our aim is to find a sequence $\{u_n\}$ and
a formula for $u_n$ for $n \ge r$ which satisfies the difference equation.
The difference equation (37.8) (which is known as the logistic
equation) with $k = 2$ is

$$ u_{n+1} = 2u_n(1 - u_n). \tag{37.9} $$

Suppose that we put u_0 = 1/4; then the sequence

u_1 = 3/8, u_2 = 15/32, u_3 = 255/512, u_4 = 65535/131072, ...

follows by successive substitution. This sequence of numbers is
actually approaching the value 1/2 as n increases. We can sketch the
sequence by discrete values at integer values of x in the usual
cartesian axes. The series of dots in Fig. 37.1 is a graphical
representation of the sequence.
The implied limiting value of u_n as n → ∞ for this sequence
suggests that u_n = 1/2 is a solution of the difference equation (37.9),
and this can be confirmed. We can find all constant solutions by
simply putting u_n = u for all n. From (37.9), the constant solutions
are given by

u = 2u(1 - u), or 2u^2 - u = 0,

which implies that u = 0 and u = 1/2 are solutions. These are also
known as the fixed points or equilibrium values of the difference
equation.

Fig. 37.1 Iterations of the sequence u_{n+1} = 2u_n(1 - u_n) with u_0 = 1/4.
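
The successive substitution can be carried out exactly with rational arithmetic. This is a small sketch, assuming the starting value u_0 = 1/4 (the value that reproduces the fractions 3/8, 15/32, ... quoted for this sequence); the helper name iterate is illustrative.

```python
from fractions import Fraction


def iterate(f, u0, steps):
    """Return [u_0, u_1, ..., u_steps] for the map u_{n+1} = f(u_n)."""
    seq = [u0]
    for _ in range(steps):
        seq.append(f(seq[-1]))
    return seq


def logistic2(u):
    """Right-hand side of (37.9): 2u(1 - u)."""
    return 2 * u * (1 - u)


seq = iterate(logistic2, Fraction(1, 4), 4)
print(seq)

# A constant solution u must satisfy u = f(u).
print(logistic2(Fraction(0)) == 0, logistic2(Fraction(1, 2)) == Fraction(1, 2))
```

Using Fraction rather than floating point keeps every term exact, so the terms can be compared directly with the fractions in the text.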

Fixed points or equilibrium values

For any first-order difference equation

u_{n+1} = f(u_n), (37.10)

its fixed points are given by solutions of

u = f(u).

You might notice that the solutions of (37.9) vary qualitatively
with the initial conditions. If 0 < u_0 < 1, then u_n → 1/2 as n → ∞; but,
if u_0 > 1 or u_0 < 0, then u_n becomes unbounded for large n. We shall
discuss the logistic equation further in Section 37.5.
For second-order difference equations, the same process gives
equilibrium values. For example, if

u_{n+2} - 2u_{n+1} + 4u_n = 6,

then the equilibrium value is given by

u - 2u + 4u = 6, or u = 2.

On the other hand, the second-order difference equation (37.7) has
no fixed points since

u - 3u - 2u - n^2 = -4u - n^2

cannot vanish for all n for any constant u.

37.3 First-order difference equations and the cobweb

An alternative method of representing solutions of difference equations
graphically is the cobweb. Consider the first-order difference
equation

u_{n+1} = f(u_n) = (1/2)u_n + 1.

Plot the lines y = x and y = (1/2)x + 1 (Fig. 37.2). Select an initial value,
say, u_0 = 1/2, which corresponds to P_0 on the x axis. Then

u_1 = (1/2)·(1/2) + 1 = 5/4,

which we can represent by P_1 : (u_0, u_1) on the line y = (1/2)x + 1. Locate
the point Q_1 : (u_1, u_1) on y = x, which can be achieved by drawing
P_1Q_1 parallel to the x axis. Next P_2 on y = (1/2)x + 1 can be found by
drawing Q_1P_2 parallel to the y axis. Its ordinate must be u_2. Repeat
the process by drawing lines between y = x and y = (1/2)x + 1 using
the same rules.

Fig. 37.3 Cobweb for u_{n+1} = -(1/2)u_n + 1/2.
The usefulness of the method is that a graphical representation
and interpretation of the solutions can be achieved by a simple line
drawing. It is particularly helpful for finding fixed points and
assessing their stability. We can see that this difference equation
has a fixed point, which is stable since all cobwebs approach the
fixed point.

Example 37.3. Sketch a cobweb solution for

u_{n+1} = -ku_n + k,

for (a) k = 1/2, (b) k = 3/2, (c) k = 1, using the same initial value u_0 in each case.
(a) Plot the lines y = x and y = -(1/2)x + 1/2. They intersect at the fixed point (1/3, 1/3).
Starting from P_0 : (u_0, 0), the cobweb traces P_0P_1Q_1P_2Q_2P_3 ... in Fig. 37.3.
Evidently it approaches the fixed point as n → ∞, indicating stability.

(b) The lines are y = x and y = -(3/2)x + 3/2. The fixed point is at (3/5, 3/5), and
the cobweb path is P_0P_1Q_1P_2Q_2 ... in Fig. 37.4. The path moves away from the
fixed point, implying its instability.
(c) The lines are y = x and y = -x + 1 with fixed point (1/2, 1/2). The path
starting at P_0 : (u_0, 0) follows the rectangle P_1Q_1P_2Q_2, indicating periodicity (Fig.
37.5). This is true for any starting value except that of the fixed point itself.

Graphs of the sequences u_n versus n are shown in Fig. 37.6.


The stability of the fixed point of the first-order linear difference
equation can be summarized as follows.

Stability
The first-order difference equation u_{n+1} = -ku_n + a has
(a) a stable fixed point if -1 < k < 1, (37.11)
(b) a fixed point that is not stable if |k| > 1,
(c) a periodic fixed point if k = 1.

Fig. 37.5 Cobweb for u_{n+1} = -u_n + 1.
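
The three kinds of behaviour in the stability summary can be seen by direct iteration. The sketch below assumes the family u_{n+1} = -ku_n + k of Example 37.3; the particular values k = 0.5, 1.5, 1 and the starting value u_0 = 0.25 are illustrative choices made here, not values taken from the text.

```python
def orbit(k, u0, steps):
    """Iterate u_{n+1} = -k*u_n + k and return [u_0, ..., u_steps]."""
    u = [u0]
    for _ in range(steps):
        u.append(-k * u[-1] + k)
    return u


def fixed_point(k):
    """Solve u = -k*u + k for u (requires k != -1)."""
    return k / (1 + k)


for k in (0.5, 1.5, 1.0):  # illustrative values of k
    u = orbit(k, 0.25, 8)
    star = fixed_point(k)
    # Distance from the fixed point at the start and after 8 steps.
    print(k, star, abs(u[0] - star), abs(u[-1] - star))
```

The distance from the fixed point is multiplied by |k| at every step, so it shrinks for |k| < 1, grows for |k| > 1, and stays the same (a 2-cycle) for k = 1.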

37.4 Constant-coefficient linear difference equations

Any difference equation of the form

u_n + a_{n-1}u_{n-1} + ... + a_{n-m}u_{n-m} = f(n),

where the a_i (i = n - m, ..., n - 1) are constants, is a constant-coefficient
linear difference equation. We shall look in detail at the
second-order case

u_{n+2} + 2au_{n+1} + bu_n = f(n), (37.12)

where a and b are constants and f(n) is a given function. The
methods generalize in a fairly obvious way to higher-order systems.
There are many parallels between the difference equation (37.12)
and second-order constant-coefficient differential equations (Chapters 18-19).
The equation is said to be homogeneous if f(n) = 0, and inhomogeneous
otherwise, just as in the case of second-order differential
equations. However, this section is self-contained and reference back
is not necessary. The general solution of the inhomogeneous case
requires that of the homogeneous case: hence we start with the latter.

Fig. 37.6 Solutions of u_{n+1} = -k(u_n - 1) for (a) k = 1/2, (b) k = 3/2, (c) k = 1.

Homogeneous equations. We can see how to proceed by looking at
the first-order constant-coefficient equation

u_{n+1} - cu_n = 0. (37.13)

As can be seen from (37.2) or verified directly, the general solution
of this equation is

u_n = Ac^n, (37.14)

where A is any constant. Notice that we could equally well write

u_n = Ac^{n-1}, or u_n = Ac^{n+1}:

the result would be equally correct, although A would take different
values for the same initial condition. The significant property of
(37.13) and its solution (37.14) is that u_{n+1} is a constant multiple
of u_n.
With this in view, we attempt to find solutions of

u_{n+2} + 2au_{n+1} + bu_n = 0 (37.15)

in the form

u_n = p^n,

where p is a constant. Thus

u_{n+2} + 2au_{n+1} + bu_n = p^{n+2} + 2ap^{n+1} + bp^n
= (p^2 + 2ap + b)p^n
= 0,

for all n, if p = 0 or

p^2 + 2ap + b = 0. (37.16)

The case p = 0 leads to the self-evident trivial solution u_n = 0. We
are interested in solutions of (37.16), which is known as the
characteristic equation of (37.15).
There are various cases to consider. Suppose that the roots of
(37.16) are the distinct numbers p_1 and p_2. Hence u_n = p_1^n and
u_n = p_2^n are solutions of (37.15). Since this equation is homogeneous
and linear, it follows that any linear combination of p_1^n and p_2^n is
also a solution. We state this as follows.

Distinct roots

The general solution of u_{n+2} + 2au_{n+1} + bu_n = 0 for
distinct roots p_1 and p_2 of p^2 + 2ap + b = 0 is
u_n = Ap_1^n + Bp_2^n, for any constants A and B. (37.17)

Example 37.4. Find the general solution of

u_{n+2} - u_{n+1} - 6u_n = 0. (37.18)

The characteristic equation of (37.18) is

p^2 - p - 6 = 0, or (p - 3)(p + 2) = 0.

The roots are p_1 = 3, p_2 = -2. Hence the general solution is

u_n = A·3^n + B(-2)^n.

Example 37.5. Find the solution of

u_{n+2} + 2u_{n+1} - 3u_n = 0

that satisfies u_0 = 1, u_1 = 2.

The characteristic equation is

p^2 + 2p - 3 = 0, or (p + 3)(p - 1) = 0.

The roots are p_1 = -3, p_2 = 1. Hence the general solution is

u_n = A(-3)^n + B·1^n = A(-3)^n + B.

From the initial conditions,

u_0 = 1 = A + B, u_1 = 2 = -3A + B.

Hence A = -1/4 and B = 5/4. The required solution is

u_n = -(1/4)(-3)^n + 5/4.
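
The distinct-roots recipe can be checked against direct iteration of the difference equation. This is a sketch only, written for real distinct roots (a^2 > b in the notation u_{n+2} + 2au_{n+1} + bu_n = 0); the function names are illustrative.

```python
import math


def solve_distinct(a, b, u0, u1):
    """Closed-form solution of u_{n+2} + 2a*u_{n+1} + b*u_n = 0 when a*a > b."""
    disc = a * a - b
    if disc <= 0:
        raise ValueError("this sketch covers real distinct roots only")
    p1 = -a + math.sqrt(disc)
    p2 = -a - math.sqrt(disc)
    # Initial conditions: u0 = A + B and u1 = A*p1 + B*p2.
    A = (u1 - p2 * u0) / (p1 - p2)
    B = u0 - A
    return lambda n: A * p1 ** n + B * p2 ** n


def iterate_second_order(a, b, u0, u1, steps):
    """Generate terms from u_{n+2} = -2a*u_{n+1} - b*u_n."""
    u = [u0, u1]
    for _ in range(steps):
        u.append(-2 * a * u[-1] - b * u[-2])
    return u


# Example 37.5: 2a = 2 and b = -3, with u_0 = 1, u_1 = 2.
formula = solve_distinct(1, -3, 1, 2)
terms = iterate_second_order(1, -3, 1, 2, 8)
print([formula(n) for n in range(10)])
print(terms)
```

Here the roots are 1 and -3, so the code recovers the coefficients 5/4 and -1/4 found in Example 37.5.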

The characteristic equation can have equal roots, which is a
special case. Consider the difference equation

u_{n+2} - 2au_{n+1} + a^2 u_n = 0,

where a ≠ 0. Its characteristic equation is

p^2 - 2ap + a^2 = 0, or (p - a)^2 = 0,

which has the repeated root p = a. One solution is Aa^n; but we
require a second independent solution. Let u_n = v_n a^n. Then

0 = u_{n+2} - 2au_{n+1} + a^2 u_n = a^{n+2}v_{n+2} - 2a^{n+2}v_{n+1} + a^{n+2}v_n
= a^{n+2}(v_{n+2} - 2v_{n+1} + v_n).

We require therefore

v_{n+2} - 2v_{n+1} + v_n = 0,

since a ≠ 0. It can be verified that this equation has a solution v_n = n.
Hence a further independent solution is u_n = Bna^n.

Equal roots
The general solution of u_{n+2} - 2au_{n+1} + a^2 u_n = 0 is
u_n = (A + Bn)a^n. (37.19)

Roots can also be complex. Consider the difference equation

u_{n+2} + 2u_{n+1} + 2u_n = 0.

Its characteristic equation is

p^2 + 2p + 2 = 0

with roots p_1 = -1 + j, p_2 = -1 - j. The method still works.
However, the general solution becomes

u_n = A(-1 + j)^n + B(-1 - j)^n.

For a real-valued problem, the constants A and B will be complex
conjugates which ensure that u_n is real. The solution can be cast in
real form by using the polar forms (Section 6.3) of the complex
numbers. In this case

-1 ± j = √2 e^{±3πj/4}.

Hence

u_n = A 2^{n/2} e^{3nπj/4} + B 2^{n/2} e^{-3nπj/4}
= 2^{n/2}[A(cos(3nπ/4) + j sin(3nπ/4)) + B(cos(3nπ/4) - j sin(3nπ/4))]
= 2^{n/2}(C cos(3nπ/4) + D sin(3nπ/4)),

where C = A + B and D = (A - B)j.

Complex roots, α ± jβ = r e^{±θj}

The general complex solution of u_{n+2} + 2au_{n+1} + bu_n = 0,
where a^2 < b, is
u_n = A(α + jβ)^n + B(α - jβ)^n.
The general real solution is
u_n = r^n(C cos nθ + D sin nθ).
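
The real form of the solution can be compared with direct iteration. The sketch below uses the equation u_{n+2} + 2u_{n+1} + 2u_n = 0 discussed above; the initial values u_0 = 1, u_1 = 0 and the function name real_solution are assumptions made for illustration.

```python
import cmath
import math


def real_solution(a, b, u0, u1):
    """Real-form solution r**n * (C cos n*theta + D sin n*theta) for a*a < b."""
    root = complex(-a, math.sqrt(b - a * a))  # the root with positive imaginary part
    r, theta = cmath.polar(root)
    C = u0  # from n = 0
    D = (u1 - r * C * math.cos(theta)) / (r * math.sin(theta))  # from n = 1
    return lambda n: r ** n * (C * math.cos(n * theta) + D * math.sin(n * theta))


# u_{n+2} + 2u_{n+1} + 2u_n = 0 corresponds to a = 1, b = 2.
u = real_solution(1, 2, 1, 0)

# Direct iteration u_{n+2} = -2u_{n+1} - 2u_n from the same (assumed) initial values.
terms = [1, 0]
for _ in range(8):
    terms.append(-2 * terms[-1] - 2 * terms[-2])

print([round(u(n), 9) for n in range(10)])
print(terms)
```

For this equation the polar form of the root -1 + j has r = √2 and θ = 3π/4, in agreement with the worked calculation.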

Example 37.6. Obtain the general solution of

u_{n+2} + u_n = 0.

The characteristic equation is

p^2 + 1 = 0,

giving roots p_1 = j, p_2 = -j. Hence

u_n = Aj^n + B(-j)^n.

In polar form, j = e^{πj/2}, -j = e^{-πj/2}. Hence the real form of the solution is

u_n = C cos(nπ/2) + D sin(nπ/2).

Inhomogeneous equations. The general inhomogeneous equation is

u_{n+2} + 2au_{n+1} + bu_n = f(n) (37.21)

(see (37.12)). Let u_n = v_n + q_n, where v_n is the general solution of the
corresponding homogeneous equation. Substitute this form of u_n into
(37.21):

(v_{n+2} + q_{n+2}) + 2a(v_{n+1} + q_{n+1}) + b(v_n + q_n) = f(n),

or

(v_{n+2} + 2av_{n+1} + bv_n) + (q_{n+2} + 2aq_{n+1} + bq_n) = f(n).

Since v_n satisfies the homogeneous equation, it follows that

q_{n+2} + 2aq_{n+1} + bq_n = f(n),

which means that q_n must be a particular solution of the inhomogeneous
equation. As in differential equations, v_n is known as the
complementary function.

We construct particular solutions by appropriate choices of
functions usually containing adjustable parameters which are suggested
by the form of the function f(n). If a particular choice fails,
then we reject it and try something else. For example, if f(n) = k, a
constant, then we might try q_n = C. This will work provided that
the homogeneous equation does not itself have a constant solution,
a point which the next two examples illustrate.

Example 37.7. Obtain the general solution of

u_{n+2} - u_{n+1} - 6u_n = 4.

From Example 37.4, the complementary function is

v_n = 3^n A + (-2)^n B.

For the particular solution, we try q_n = C, since f(n) = 4. Then

q_{n+2} - q_{n+1} - 6q_n - 4 = C - C - 6C - 4
= -6C - 4 = 0,

if C = -2/3. Hence q_n = -2/3, and the general solution is

u_n = 3^n A + (-2)^n B - 2/3.

Note that the two unknown constants in the general solution occur in the
complementary function.

Example 37.8. Obtain the general solution of

u_{n+2} + 2u_{n+1} - 3u_n = 4.

From Example 37.5, the complementary function is

v_n = (-3)^n A + B.

In this case we expect the choice q_n = C to fail, since it must make the left-hand
side of the difference equation vanish. When this happens, we try

q_n = Cn.

Then

q_{n+2} + 2q_{n+1} - 3q_n - 4 = C(n + 2) + 2C(n + 1) - 3Cn - 4
= 2C + 2C - 4 = 4C - 4 = 0,

if C = 1. Hence the general solution is

u_n = (-3)^n A + B + n.

Table 37.1 lists some simple forcing terms f(n) with suggested
forms of particular solution and alternatives containing parameters
to be determined by direct substitution.

Table 37.1

f(n)                  Trial solution q_n

k (a constant)        C; or Cn, if C fails; or Cn^2, if C and Cn fail; etc.
k^n                   Ck^n; or Cnk^n, if Ck^n fails; etc.
n                     C_0 + C_1 n
n^p (p an integer)    C_0 + C_1 n + ... + C_p n^p (may need higher powers of n in special cases)
sin kn or cos kn      C_1 cos kn + C_2 sin kn

Example 37.9. Find the general solution of

u_{n+2} - 4u_n = n.

The characteristic equation is

p^2 - 4 = 0, or (p - 2)(p + 2) = 0.

The roots are p_1 = 2, p_2 = -2. Hence the complementary function is

v_n = 2^n A + (-2)^n B.

For the particular solution, try (choosing from Table 37.1)

q_n = C_0 + C_1 n.

Then

q_{n+2} - 4q_n - n = C_0 + C_1(n + 2) - 4C_0 - 4C_1 n - n
= (-3C_0 + 2C_1) + n(-3C_1 - 1).

The right-hand side vanishes for all n if

-3C_0 + 2C_1 = 0, -3C_1 - 1 = 0.

Hence C_1 = -1/3, C_0 = 2C_1/3 = -2/9, and the general solution is

u_n = 2^n A + (-2)^n B - 2/9 - n/3.
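
Trial particular solutions are easy to test by substitution. The following sketch checks the particular solutions found in Examples 37.7, 37.8 and 37.9 using exact fractions; the helper residual and the arbitrary constant 7 used to show a failing trial are illustrative assumptions.

```python
from fractions import Fraction


def residual(c2, c1, c0, f, q, n):
    """Return c2*q(n+2) + c1*q(n+1) + c0*q(n) - f(n); zero means q fits at n."""
    return c2 * q(n + 2) + c1 * q(n + 1) + c0 * q(n) - f(n)


ns = range(6)

# Example 37.7: u_{n+2} - u_{n+1} - 6u_n = 4 with q_n = -2/3.
ex7 = [residual(1, -1, -6, lambda n: 4, lambda n: Fraction(-2, 3), n) for n in ns]

# Example 37.8: u_{n+2} + 2u_{n+1} - 3u_n = 4.
# A constant trial (here 7, chosen arbitrarily) always leaves a residual of -4 ...
ex8_const = [residual(1, 2, -3, lambda n: 4, lambda n: Fraction(7), n) for n in ns]
# ... whereas q_n = n works.
ex8 = [residual(1, 2, -3, lambda n: 4, lambda n: Fraction(n), n) for n in ns]

# Example 37.9: u_{n+2} - 4u_n = n with q_n = -2/9 - n/3.
ex9 = [residual(1, 0, -4, lambda n: n,
                lambda n: Fraction(-2, 9) - Fraction(n, 3), n) for n in ns]

print(ex7, ex8_const, ex8, ex9)
```

The constant trial fails in Example 37.8 because a constant is already part of the complementary function there.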

37.5 The logistic difference equation

Consider again the logistic difference equation

u_{n+1} = au_n(1 - u_n), (37.22)

where a is a parameter which will take various values. This
nonlinear equation can model population growth of generations. If
u_n represents the population size of generation n and a is the
birth-rate, then we might expect the population size of the next
generation to be au_n in the absence of any inhibiting factors such
as lack of resources or overcrowding. If a > 1, then the population
model given by the first-order difference equation u_{n+1} = au_n would
imply that the population would grow to infinity, since the equation
has the solution u_n = a^n u_0. To counter this possibility, we can
introduce a feedback term -au_n^2, which will tend to reduce population
growth when the population is large.
Fixed points of the equation (37.22) occur where

u = au(1 - u),

that is, at u = 0 and u = 1 - 1/a. We can adapt the cobweb method of
Section 37.3 to this nonlinear difference equation by plotting graphs
of the parabola y = f(x) = ax(1 - x) and the straight line y = x.
Fixed points of the difference equation will occur where the line and
the parabola intersect. The values of x at these points will be given
by the solutions of

ax(1 - x) = x, or x(ax + 1 - a) = 0.

The points are x = 0 and P: x = 1 - 1/a. We shall only look at


values of a > 1, so that one fixed point is in the first quadrant, x > 0,
y > 0. A cobweb solution for the case a = 2.8 is shown in Fig. 37.7.
Notice that, for this choice of a, the fixed point P is stable; that
is, the cobweb solution approaches P. The slope of the graph of
y = ax(1 - x) at P determines the stability or instability of the
solutions. The slope at P is m = a - 2ax, where x = 1 - 1/a. Hence

m = a - 2a(1 - 1/a) = -a + 2.

As with the cobweb for two intersecting lines for the linear difference
equation in Section 37.3, the fixed point P is locally stable if
m > -1, in that all cobweb paths starting close to 1 - 1/a approach
the fixed point P as n → ∞. This corresponds to a < 3. Notice also
that, if 1 < a < 2, then the line y = x intersects the parabola
y = ax(1 - x) between the origin and its maximum value. This follows
since the maximum occurs at x = 1/2 and 0 < 1 - 1/a < 1/2 implies
1 < a < 2.

Fig. 37.7 Cobweb solution for u_{n+1} = 2.8u_n(1 - u_n) showing a
solution starting from x = u_0 approaching the fixed point at P.
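
The stability of P for a = 2.8 can be confirmed by iterating the map. This is a minimal sketch; the starting value u_0 = 0.1 and the function name logistic_orbit are assumptions made for illustration.

```python
def logistic_orbit(a, u0, steps):
    """Iterate u_{n+1} = a*u_n*(1 - u_n) and return [u_0, ..., u_steps]."""
    u = [u0]
    for _ in range(steps):
        u.append(a * u[-1] * (1 - u[-1]))
    return u


a = 2.8
P = 1 - 1 / a              # nonzero fixed point
slope = a - 2 * a * P      # slope of y = a*x*(1 - x) at P, equal to 2 - a
orbit = logistic_orbit(a, 0.1, 200)

print(P, slope, orbit[-1])
```

Because the slope at P has magnitude less than 1, iterates starting near P are drawn towards it, which is what the cobweb in Fig. 37.7 depicts.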
For a ≥ 3 the solutions become more complicated. The fixed
point at the origin is unstable: hence there is no stable fixed point
to which solutions can approach. We can obtain a clue as to what
happens if we look at the function of a function given by

y = f(f(x)) = a[ax(1 - x)][1 - ax(1 - x)]
= a^2 x(1 - x) - a^3 x^2 (1 - x)^2.

When a = 3, this curve intersects y = x at x = 0 and at P only. This
can be checked by noting that

x = a^2 x(1 - x) - a^3 x^2 (1 - x)^2

can be written as

x(27x^3 - 54x^2 + 36x - 8) = 0, or x(3x - 2)^3 = 0,

when a = 3. Graphs of the curves y = f(x) and y = f(f(x)) for
a = 3 are shown in Fig. 37.8(a). The fixed point P is at (2/3, 2/3). As a
increases, two additional fixed points develop on the line y = x.
Further graphs of the two functions y = f(x) and y = f(f(x)) for
a = 3.4 are shown in Fig. 37.8(b), together with the line y = x. There
are fixed points at O, A, B, C, of which A and C are stable. While
A and C are fixed points of this equation, they can be associated
with 2-cycles or period-2 solutions of the difference equation, as
shown by the superimposed cobweb on y = ax(1 - x) in Fig. 37.8(b).
This phenomenon is known as period doubling. It first appears when
the fixed point P on y = f(x) ceases to be stable at a = 3. The 2-cycle
then grows in ‘amplitude’ as a increases. The solution is said to
bifurcate at a = 3. This type of bifurcation, where the stable solution
becomes suddenly unstable and throws off two stable solutions on
either side, is an example of a pitchfork bifurcation, called this
because of its fork-like appearance (see Fig. 37.9).

Fig. 37.8 (a) Graph of y = f(f(x)) for the critical case a = 3.
(b) Graph of y = f(f(x)) for a = 3.4 showing fixed points O, A, B, C.
For general a, the fixed points of y = f(f(x)) occur where

x = a^2 x(1 - x) - a^3 x^2 (1 - x)^2.

To solve this equation, put 1 - x = u. Then u satisfies

1 - u = a^2(1 - u)u - a^3(1 - u)^2 u^2,

which can be written as G(u) = 0, where

G(u) = (u - 1)(a^3 u^3 - a^3 u^2 + a^2 u - 1). (37.23)

One obvious solution of G(u) = 0 is u = 1, while the cubic factor
has the solution u = 1/a as can be verified. Hence

G(u) = (u - 1)(au - 1)[a^2 u^2 + a(1 - a)u + 1].

At A and C, u satisfies

a^2 u^2 + a(1 - a)u + 1 = 0. (37.24)

Hence, at the two fixed points A and C,

x_1, x_2 = {1 + a ∓ √[(a + 1)(a - 3)]} / (2a) (a > 3),

respectively, while u = 1/a (that is, x = 1 - 1/a) at P. Since

f(x_1) = ax_1(1 - x_1)
= (1/2)[1 + a - √[(a + 1)(a - 3)]] × [1 - (1/(2a)){1 + a - √[(a + 1)(a - 3)]}]
= (1/(4a))[a + 1 - √[(a + 1)(a - 3)]][a - 1 + √[(a + 1)(a - 3)]]
= (1/(4a))[a^2 - {1 - √[(a + 1)(a - 3)]}^2]
= (1/(2a))[1 + a + √[(a + 1)(a - 3)]] = x_2,

and similarly f(x_2) = x_1. Hence period doubling occurs, and transfers
between x_1 and x_2 can take place around the square cobweb in Fig.
37.8(b).
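
The 2-cycle formula can be tested numerically. The sketch below evaluates x_1 and x_2 for a = 3.4 (the value used in Fig. 37.8(b)) and compares them with long-run iterates; the starting value 0.2 and the function names are illustrative assumptions.

```python
import math


def f(x, a):
    """Logistic map a*x*(1 - x)."""
    return a * x * (1 - x)


def two_cycle(a):
    """Period-2 values x_1 < x_2 from the quadratic (37.24); requires a > 3."""
    s = math.sqrt((a + 1) * (a - 3))
    return (1 + a - s) / (2 * a), (1 + a + s) / (2 * a)


a = 3.4
x1, x2 = two_cycle(a)

# Iterate from an arbitrary (assumed) starting value and keep the last two terms.
u = 0.2
for _ in range(1000):
    u = f(u, a)
tail = sorted([u, f(u, a)])

print(x1, x2, tail)
```

The map swaps x_1 and x_2, and for this value of a the long-run iterates settle onto the same pair.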
The fixed point A will remain stable if the absolute value of the
slope of y = f(f(x)) there is less than 1. The same condition will also
apply at C. In fact the critical slope is -1, and we will find the
value of a at which this occurs. We have

d/dx f(f(x)) = a^2 - 2a^2 x - a^3(2x - 6x^2 + 4x^3)
= a^2 - 2a^2(1 + a)x + 6a^3 x^2 - 4a^3 x^3. (37.25)

We require the value of a given by

d/dx f(f(x)) = -1,

or

4a^3 x^3 - 6a^3 x^2 + 2a^2(1 + a)x - a^2 = 1, (37.26)

when x satisfies (37.24) with u = 1 - x, which is when

a^2 x^2 - a(1 + a)x + a + 1 = 0. (37.27)

Remove the x^3 term from (37.26) by multiplying (37.27) by 4ax, and
subtracting it from (37.26). Then

-2a^2(a - 2)x^2 + 2a(a - 2)(a + 1)x - (1 + a^2) = 0. (37.28)

Equations (37.27) and (37.28) must have the same roots in x. In each
case, make the coefficient of x^2 equal to 1. The results for comparison
are

x^2 - [(a + 1)/a]x + (a + 1)/a^2 = 0,

x^2 - [(a + 1)/a]x + (a^2 + 1)/[2a^2(a - 2)] = 0.

These equations have the same roots if

(a + 1)/a^2 = (a^2 + 1)/[2a^2(a - 2)], (37.29)

or

a^2 - 2a - 5 = 0.

We are interested in values of a > 3, so that the required root of
(37.29) is a = 1 + √6 = 3.449.... In fact the slopes at A and
C both become -1 for this value of a. Thus, for

3 < a < 1 + √6,

the 2-cycle solution is stable.
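
As a cross-check on this result, and not the route taken in the text, the slope of y = f(f(x)) at the period-2 points can also be found from the chain rule. The sketch assumes only f'(x) = a(1 - 2x), f(x_1) = x_2, and the sum and product of the roots of (37.27), namely x_1 + x_2 = (a + 1)/a and x_1 x_2 = (a + 1)/a^2.

```latex
\begin{align*}
\left.\frac{d}{dx} f(f(x))\right|_{x = x_1}
  &= f'(x_2)\, f'(x_1) = a^2 (1 - 2x_1)(1 - 2x_2) \\
  &= a^2 \left[ 1 - 2(x_1 + x_2) + 4 x_1 x_2 \right] \\
  &= a^2 - 2a(a + 1) + 4(a + 1) = -a^2 + 2a + 4 .
\end{align*}
```

Setting this slope equal to -1 gives a^2 - 2a - 5 = 0 again. The expression is symmetric in x_1 and x_2, which is consistent with the slopes at A and C being equal, and it takes the value +1 at a = 3, where the 2-cycle first appears.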


At a = 1 + √6, the system bifurcates again into a 4-cycle or
period-4 solution, which corresponds to the set of stable fixed points
of y = f(f(f(f(x)))). A graph of this function for a = 3.54 is shown
in Fig. 37.10 together with the 8 fixed points. The cycle doubles
again at about a = 3.544... and so on. The intervals between the
bifurcations of the period doubling rapidly decrease, until a limit is
reached at about a = 3.570... beyond which chaos occurs. The
iterations are no longer periodic for most values of a beyond this
point, although there are some brief intervals of periodicity.

Fig. 37.10 Fixed points of y = f(f(f(f(x)))) for a = 3.54, given by the
intersection of the curve and the line y = x.

The sequence of period doubling bifurcations is known as the
Feigenbaum sequence, and it has certain universal aspects in that
it is not just a consequence of the logistic equation, but has common
features with other difference equations which generate period
doubling.
The simplest way to view the progressively complex behaviour is
through a computer-drawn picture of the iterations of

u_{n+1} = au_n(1 - u_n)

for stepped increases in a starting at a = 2.8 up to a = 3.8, which
covers the main area of interest. The result is shown in Fig. 37.11.
The series of single dots for each a in 2.8 ≤ a ≤ 3 indicates the fixed
point, which then bifurcates into a stable 2-cycle attractor for
3 < a ≤ 1 + √6. This in turn bifurcates into a stable 4-cycle attractor
at a = 1 + √6 and so on. The effect of infinite period doubling
is that the solution is ultimately nonperiodic. The generally chaotic
and noisy behaviour of the difference equation can clearly be seen
in the large number of dots for larger values of a. These nonperiodic
sets are known as strange attractors. The successive iterates of
the logistic equation wander about in a seemingly random but
bounded manner, and never settle into a periodic solution. However,
within the chaotic band of a values, there appear windows of
periodic cycles. Problem 37.26, for example, confirms that there is
a 3-cycle around a = 3.83.

Fig. 37.11 Period doubling for the logistic equation for increasing a,
followed by chaotic iterations beyond about a = 3.57.

The logistic equation can be thought of as a relatively simple
model example. Many similar nonlinear difference equations also
exhibit similar period-doubling bifurcations and strange attractors.
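
The data behind a picture such as Fig. 37.11 can be generated by discarding the early iterates and recording where the later ones settle. The sketch below is one way of doing this; the starting value, the number of transient steps, the number of recorded points and the rounding precision are all assumptions chosen for illustration, and no plotting is attempted.

```python
def attractor(a, u0=0.3, transient=2000, keep=64, digits=6):
    """Return the distinct long-run values of u_{n+1} = a*u_n*(1 - u_n)."""
    u = u0
    for _ in range(transient):       # discard the approach to the attractor
        u = a * u * (1 - u)
    points = set()
    for _ in range(keep):            # record the values visited afterwards
        u = a * u * (1 - u)
        points.add(round(u, digits))
    return sorted(points)


for a in (2.9, 3.2, 3.5, 3.7):       # sample parameter values
    pts = attractor(a)
    print(a, len(pts), pts[:4])
```

Plotting the recorded points against a for a fine grid of a values between 2.8 and 3.8 would produce a diagram of the kind described above: one point per a below 3, two beyond it, then four, and a scatter of many points in the chaotic range.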

Problems

37.1. £1000 is invested over 10 years at an interest
rate of 6% annually. Find the final total investment.
What should the monthly interest rate be to achieve the
same final total?

37.2. The sum of £50000 is borrowed over 25 years
and the money is repaid in equal annual instalments.
The interest rate on the outstanding balance in any
year is 10%. Find what the annual repayments would
be. After 5 years, the interest rate is reduced to 9%.
(a) Find the required adjustment to the annual repayments
for the loan to be repaid over the original term.
(b) If the repayments are not changed, by how much
will the mortgage term be reduced?

37.3. Find the fixed points of the following difference
equations:
(a) u_{n+1} = u_n(2 - u_n);
(b) u_{n+1} = u_n(1 + u_n)(2 - 3u_n); (c) u_{n+1} = sin u_n;
(d) u_{n+1} = i sin u_n; (e) u_{n+1} = e^{u_n} - 1.

37.4. Given the initial value u0 in each case, calculate the (c) n„ + 3 - 3un + 2 + 3u„+1 + u„ = f(n), where
sequence of terms up to u5 for each of the following (i) /(n) = 1, (ii) /(n) = n, (iii) f(n) = n2.
first-order difference equations: (d) u„ + 2 - 6u„ + 1 T 9u„ = f(n), where (i) f(n) = 2",
(a) u„ +1 = 2w„(3 - u„), u0 — 1; (ii) f(n) = 3; (iii) /(n) = 3"; (iv) f{n) = n3".
(b) un+1 = 2u„( 1 - u„), u0 = i;
(c) u„ + 1 = 3.2un(l - u„), Uq = 37.12. A ball bearing is dropped from a height z = h0
(d) u„ + 1 = 4i/„( 1 - t<„), u0 = on to a metal plate, and the coefficient of restitution
between the ball and the plate is e, where 0 < e < 1.
37.5. (Section 37.3). Sketch the cobweb solutions for the Set up a difference equation for the maximum height
following first-order equations with the stated initial reached after n impacts. Solve the equation. (Assume
conditions, and discuss the stability of the fixed point: that a ball dropped from a height h hits the plate with
(a) un+, = jun + i u0 = i and u0 = f; speed v = ^/(2gh), where g is the acceleration due to
(b) un+, = 2u„ - 2, u0 = i and u0 = §; gravity. The rebound speed of the ball is ev.) Instead
(c) u„+1 = -n„ + 2, u0 = ^ and u0 = f; of being stationary, the plate now oscillates so that it
(d) u„ + 1 = -|u„ + |, «0 = i and u0 = f; is moving upwards at a speed u (a constant) at the
(e) t/„+1 = —2u„ + 3, u0 = i and u0 = §. moment of each impact with the ball. Find the difference
equation for h„. Show that the difference equation has a
37.6. The function /(n) satisfies fixed point and interpret its meaning.

/(«) = /(i») + 1-
37.13. Dn(x) is the n x n determinant defined by
Put n = 2"' and g(m) = /(2m), and show that
g(m) = g(m - 1) + 1. 2x 1 0 • 0
Hence find /(«) given that /(1) = 0.
1 2x 1 • ■ 0
£>„(*) = (n > 2),
37.7. Use the method suggested in the previous prob¬
lem to solve 0 0 0 • 2x
f(n) = /(in) + f,
2x 1
given the initial condition /(1) = 0. D2(x) = D,(x) = 2x,
1 2x

37.8. (Section 37.3). Find the general solutions of the Show that
following difference equations:
D„(x) = 2xDn_1(x) - D„_2(x).
(a) 11 n + 2 + 2w„+ 1 - 3u„ = 0; (b) un+2 - 9un == 0:
(c) l,n + 2 + 9un = 0; (d) - 4u„_, + 5nn^2 == 0 Solve the difference equation for x ^ 1 and x = 1.
(e) l,n + 2 — 4 u„ + 1 + 4w„ = 0;
(0 Un + 3 - l<n + 2 + “n + I — «n = 0; 37.14. Let {un} (n = 0,1,...) be a sequence. The power
(g) un + 3 — = C1; series
(h) W« + 3 - 3«„+ 2 + 3n„+, — «„ = 0;
(i) 11 n + 2 — lC+ i - un -(- u„i-1 = 0. /(«„. X) = X UnXn
n=0

37.9. Express the solution of the initial-value problem is known as the generating function of the sequence.
un + 2 — 6u„+, + 13m„ = 0, u0 = 0, «i = 1> Thus, for example, if u„ = (— 1 )"/n\, then
in real form. cc (_!)"
f{un,x) = X —
37.10 Find the difference equation satisfied by n = 0 nl

u„ = A-2n + B-( —5)", which means that e“x is the generating function of
for all A and B.
The generating function of {un + 1} is

37.11. Obtain particular solutions of the following in¬ 00 i » ( _ i)« +1

homogeneous difference equations: /(u„ + 1,x)= X W" = - I -^" + 1


n=0 x „ = 0 (« + 1)!
(a) u„+2 + 2m„ + 1 - 3u„ = f(n), where
(i) j\n) = 2"; (ii) f(n) = n; (iii) f(n) = 2, 1 / * (-1)" 1
(iv) f(n) = (-3)". = - X —x" ~1 =-[/(«»>x)_ 1]-
x\n=0 n!
(b) n„ + 2 + 2«„+1 + 2un = /(n), where Consider the difference equation
(i) f(n) = 1; (ii) f(n) = n + 3;
(iii) f{n) = cos jttn. Un + 2 + Un+ i — = 0, n0 = 1, tq

where
By taking the generating function of the equation, show
that un — 2a b
, A =
1 _ -1 0_

Deduce that

Using the binomial theorem find u„. z„ = A"Zq-


Consider the case with a = 1 and b = — 8. Find the
37.15. A Fibonacci sequence is defined as a sequence eigenvalues of A and use the methods of Section 13.6
in which any term is the sum of the two preceding to find a formula for A". Hence solve the difference
terms. For the Fibonacci sequence starting with u, = 1, equation for u„ in terms of u0 and u,.
u2 = 2, find and solve the difference equation for un.

37.21. (Section 37.5). Consider the logistic equation


37.16. Solve the initial-value difference equation
un+i = ocu„( 1 - u„).
3u„ + , — 2un + l — un = 0, u i=2, u2 = 1,
Draw cobweb solutions starting at u0 = j for the cases
and show that u„ -* § as n -> oo. a = 2.7, a = 2.9, and a = 3.3. What do you infer about
the stability of the fixed point in the first quadrant?
37.17. A symmetric random walk takes place on the
integer steps on the line between x = 0 and x = N. 37.22. (Section 37.5). In the logistic equation u„+1 =
At any position x = r (1 < r < N — 1), the probability ccu„(l — «„), for what positive values of a is the origin
that the walker moves to either x = r + 1 or x = r — 1 a stable fixed point?
at any stage is j. The probability uk that the walker
reaches x = 0 first, given an initial position x = k, satisfies
37.23. (Section 37.5). Find the two stable values between
the difference equation
which u„ ultimately oscillates in the logistic equation
llk = 2llk - I T 2llk +1' llO ~ 1’ UN = 0) «„+i = 3.25«„(1 - ;/„).

for 1 < k < N — 1. Find uk. What is the probability


37.24. Consider the difference equation
that the walker reaches x = N first?
If dk is the expected number of steps in the walk “„+i = «(i - l«B - 2I)■
before it reaches 0 or N, then dk satisfies
Sketch the function y = /(x) = a(f — |x — f|) for a =
dk = i( 1 + <4+1) + 1 + <4-1), d0 = dN = 0 f. Where are the equilibrium points of the difference
equation for a > 1? Show that the origin is stable if
for 1 ^ k ^ N — 1. Find the expected duration of the
a < 1, and unstable if a > 1. What happens if a = 1?
walk.
Skeich the graph of y = /(/(x)) fur a. = 2. Show
that there exists a 2-cycle and locate the periodic values
37.18. Show that u„ = «! is a solution of the second- of un.
order difference equation

an + 2 = (n + 2 )(n + l)u„. 37.25. Find the fixed points of

By using the substitution un = v„n\, find a second inde¬ «„+i = <xu„( 1 - O,


pendent solution.
for all a. Determine the slope of y = f(x) = ax(l — x3)
at the nonzero fixed point. Confirm that this fixed
37.19. Given that point is stable if a < f and unstable if a > f. Sketch
cobweb solutions for a = 1.2, 1.4, 1.8.
.v„ = i k\
k = 1
37.26. By starting from u0 = 0.957417, compute uu
find a first-order difference equation for s„. Solve the u2, ■. ■, u5 for the difference equation
equation to find a formula for the sum s„.
«»+i = ocu„(l - u„), a = 3.83,

37.20. Show that the difference equation and confirm that the logistic equation appears to have
a 3-cycle for this value of a.
un + 2 + 2aun + l + bu„ = 0

can be expressed as 37.27. Find the fixed points of the difference equation

+1 = Azn, «n+i = ocu„(l - uj2,



in the three cases (a) a = 9, (b) a = 4, (c) a = f. Discuss
the stability of the fixed points in each case.

37.28. Show that the special logistic equation

u_{n+1} = 4u_n(1 - u_n)

has the solution

u_n = sin^2(2^n Cπ),

where C is any constant. This general solution includes
closed form chaotic solutions. For example, if C = 1/π,
then

u_n = sin^2(2^n),

which never repeats itself for n = 0, 1, 2, ....


PROBABILITY AND STATISTICS

38 Probability

Contents
38.1 Introduction
38.2 Sample spaces, events, and probability
38.3 Sets and probability
38.4 Counting and combinations
38.5 Conditional probability
38.6 Independent events
38.7 Total probability
38.8 Bayes’ theorem
Problems

38.1 Introduction
An experiment or trial is described as random if the result or outcome
of the experiment is not predictable or contains uncertainty. The
theory of probability is essential in the modelling and analysis of
random experiments. In some aspects of life we expect and often
hope that situations we meet behave in a predictable or deterministic
manner. We expect water to freeze at 0°C under normal pressure:
we expect the sun to rise at the appropriate time each day. For
important safety reasons we expect an aircraft to have predictable
characteristics in a wide range of sometimes extreme situations.
However, the weather is largely unpredictable looking more than a
week into the future. The distinction between random and deterministic
has become less ‘certain’ in more recent times. Some physical
phenomena such as the weather can be modelled by deterministic
equations but still exhibit long-term, seemingly unpredictable,
behaviour. Such systems, which display what is known as chaos (see
Section 37.5 for a model difference equation with a chaotic output),
show extreme sensitivity to small initial changes. Chaos is distinct
from random behaviour but the outcome can show very similar
manifestations.
If an experiment can be repeated a large number of times then
we can measure the frequency of a particular outcome. This only
makes sense if the conditions surrounding the experiment do not
change with time. In such an experiment the number of times in
which a particular outcome occurs may achieve some regularity. We
can measure this by calculating the relative frequency of this outcome
defined by

relative frequency = (number of occurrences of the outcome) / (total number of experiments).

After a large number of experiments, this ratio may approach a
steady value which is known as the probability of this particular
outcome.
For example, the standard die has six faces numbered 1, 2, 3, 4,
5, 6. After a large number of throws, we would expect the number
1 (or any other number) to appear on the upper face with a relative
frequency of 1/6. Hence we expect that the probability of a 1
appearing is 1/6.
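
The relative-frequency idea can be tried numerically. The following is a minimal sketch, assuming a simulated fair die; the function name `relative_frequency` and the fixed seed are illustrative choices, not part of the text.

```python
import random

def relative_frequency(face, throws, seed=0):
    """Relative frequency of `face` in `throws` simulated throws of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(throws) if rng.randint(1, 6) == face)
    return hits / throws

# The ratio should settle near 1/6 = 0.1666... as the number of throws grows.
for throws in (100, 10_000, 200_000):
    print(throws, relative_frequency(1, throws))
```

For small numbers of throws the ratio can differ noticeably from 1/6; only the long-run value is expected to be steady.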
Many probabilities are based on data, past records, the ‘degree
of belief’, the view of individuals and so on. Horse races are usually
not repeated so that there can be no relative frequency approach,
but bookmakers and punters bet on the basis of the previous form
of the horses, the state of the course and the pattern of bets.
Generally as the race approaches the bookmakers’ odds reflect how
the accumulation of bets has been distributed among the runners.
Many outcomes will be assigned probabilities with at least some
subjective element.
Probabilities are important in measuring risk, and there can be
surprising results. From past data the earth receives a significant
meteor impact every 100 years. The probability of a particular
individual being killed by such an impact is very small but nonzero.
However, the impact could be cataclysmic, which means that by
some measures the probability of being killed by a meteor impact
is greater than that arising from a plane crash. In engineering, as
the reliability of components improves, the likelihood of failure
becomes more remote, but might as a consequence have more
serious implications if it does occur.

38.2 Sample spaces, events, and probability


The first task with our random experiment is to define the list or
set of all possible outcomes which is known as the sample space. A
simple example is the single spin of a coin, in which there are two
possible outcomes with either a head or a tail showing. The
outcomes can be denoted by H (for head) and T (for tail). The
sample space for this experiment has two elements which we denote
in set terms by

\[ S = \{H, T\}. \]

(Information about sets and set notation can be found in Chapter 34.)
For the single throw of a fair die, the sample space has the six
possible outcomes, namely 1, 2, 3, 4, 5, 6. Hence its sample space is

\[ S = \{1, 2, 3, 4, 5, 6\}. \]

Some sample spaces have an infinite number of elements. Suppose
we spin a coin until a tail appears. Any number of heads could
appear before a tail. Hence the sample space is

\[ S = \{\text{0 heads}, \text{1 head}, \text{2 heads}, \text{3 heads}, \ldots\}. \]

However, the sample space is countable, that is, the elements in the
sample space can be numbered. A sample space is said to be discrete
if it contains a finite or countably infinite set of outcomes. A list of
outcomes such as {2, 4, 6, 8,...} would be countably infinite.
A collection of elements satisfying a common requirement in a
sample space is known as an event. For a die the event could be the
appearance of a particular number, say 5, an odd number outcome,
or any number less than 5. These events are respectively the sets

\[ A_1 = \{5\}, \quad A_2 = \{1, 3, 5\}, \quad A_3 = \{1, 2, 3, 4\}; \]

they are subsets of the sample space, that is, in the notation of (34.3),
$A \subseteq S$ in each case.
As we mentioned in the introduction, the probability of an event
is the relative frequency that the event takes place in a large number
of repetitions of the experiment. The probability of an event A is
denoted by P(A). For the single spin of a coin we expect heads and
tails to be equally likely to occur. Thus

\[ P(H) = \tfrac{1}{2}, \quad P(T) = \tfrac{1}{2}. \]

We can also view this in a non-experimental way. If an event can
occur in n different ways out of a total number of N possible ways,
all of which are equally likely, then the probability of the event is
n/N. For a fair coin a head can arise in one way from two equally
likely ways. Hence P(H) = 1/2.
For the die the probability that any individual number x is face
up is given by P(x) = 1/6. The probability that a number less than
5 appears will be

\[ P(A_3) = \frac{\text{number of ways in which numbers less than 5 occur}}{\text{total number of possible outcomes}} = \frac{4}{6} = \frac{2}{3}, \]

where $A_3 = \{1, 2, 3, 4\}$.

Example 38.1. Two coins are spun. What is the probability that at least one
head appears?

It is essential in the solution to distinguish the coins as, say, a and b. Thus if
$H_a$ is the event that coin a shows a head, $T_a$ that a shows a tail, and so on, then
the sample space has four elements:

\[ S = \{(H_a, H_b), (H_a, T_b), (T_a, H_b), (T_a, T_b)\}, \]

which are all equally likely. Thus

\[ P((H_a, H_b)) = P((H_a, T_b)) = P((T_a, H_b)) = P((T_a, T_b)) = \tfrac{1}{4}. \]

The event A in the problem is

\[ A = \{(H_a, H_b), (H_a, T_b), (T_a, H_b)\}, \]

which contains 3 of the 4 elements. Hence at least one head occurs with
probability P(A) = 3/4.

Example 38.2. Two dice a and b are rolled. What are the elements of the
sample space? What is the probability that the sum of the face values of the two
dice is 8? What is the probability that at least one 5 appears?
We distinguish the outcome of each die so that there are 6 × 6 = 36 possible
outcomes for the pair. The sample space has 36 elements of the form (i, j), where
i and j take all integer values 1, 2, 3, 4, 5, 6, and i is the outcome of die a and
j is the outcome of b. The full list is

\[
\begin{aligned}
S = \{ & (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), \\
       & (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), \\
       & (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), \\
       & (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), \\
       & (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), \\
       & (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6) \},
\end{aligned}
\]

and they are all equally likely. If $A_1$ is the event that the sum of the dice is 8,
then from the table,

\[ A_1 = \{(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)\}, \]

which occurs for 5 elements out of 36. Hence

\[ P(A_1) = \tfrac{5}{36}. \]

The event that at least one 5 appears is the list

\[ A_2 = \{(1, 5), (2, 5), (3, 5), (4, 5), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (6, 5)\}, \]

which has 11 elements. Hence

\[ P(A_2) = \tfrac{11}{36}. \]
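
The counting in this example can be checked by listing the sample space directly. This is a small sketch, assuming equally likely outcomes; the names `S`, `A1`, `A2` mirror the sets in the example, while `p_A1` and `p_A2` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for two distinguishable dice: all ordered pairs (i, j).
S = list(product(range(1, 7), repeat=2))

# A1: the sum of the faces is 8.  A2: at least one 5 appears.
A1 = [(i, j) for (i, j) in S if i + j == 8]
A2 = [(i, j) for (i, j) in S if i == 5 or j == 5]

# Equally likely outcomes: probability = favourable / total.
p_A1 = Fraction(len(A1), len(S))
p_A2 = Fraction(len(A2), len(S))
print(len(S), p_A1, p_A2)
```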

38.3 Sets and probability


Set notation is very helpful in representing sample spaces and events.
This section uses the properties of sets and Venn diagrams explained
in Chapter 34. Consider Example 38.2 again: this is the problem
when two dice are rolled. In set terms the sample space S can be
thought of as the universal set for this experiment. Suppose that we
are interested in the event $A_3$ in which either the sum of the two
dice is 8 (event $A_1$) or at least one 5 appears (event $A_2$), or both.
This event is the union of the subsets $A_1$ and $A_2$ of S,
represented by

\[ A_3 = A_1 \cup A_2, \]

where

\[ A_1 = \{(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)\}, \]

\[ A_2 = \{(1, 5), (2, 5), (3, 5), (4, 5), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (6, 5)\}. \]

The event $A_3$ has 14 elements, of which two are common to both
$A_1$ and $A_2$. If $A_4$ is the event that both $A_1$ and $A_2$ occur, then $A_4$
is the intersection of $A_1$ and $A_2$, namely

\[ A_4 = A_1 \cap A_2 = \{(3, 5), (5, 3)\}. \]

The two events are shown diagrammatically in Figure 38.1.

Remember that the complement of a set or event A is denoted by
$\bar{A}$, and the empty set by $\emptyset$.

Example 38.3. Suppose that A, B, C are three events in the sample space S.
Write down the sets which represent the events that: (a) A occurs, but B and C
do not; (b) A, B, and C all occur.
(a) The event that B or C occurs will be $B \cup C$. The event that neither B nor
C occurs will be the complement $\overline{B \cup C}$. The required set will be the intersection
of this event and A, namely

\[ A \cap (\overline{B \cup C}). \]

By de Morgan's first law (34.16), this is equivalent to

\[ A \cap (\bar{B} \cap \bar{C}), \]

which can be written unambiguously as $A \cap \bar{B} \cap \bar{C}$ by the associative law for
intersection (34.4).
(b) Events B and C occur in the set $B \cap C$. Events A and $B \cap C$ occur in
the event

\[ A \cap (B \cap C) \quad \text{or} \quad A \cap B \cap C. \]

Fig. 38.1 (a) The event $A_3 = A_1 \cup A_2$. (b) The event $A_4 = A_1 \cap A_2$.

Two events are said to be mutually exclusive if they cannot occur
together in a single trial, which in set terms is equivalent to two
subsets of S being disjoint, that is, having no elements in common.
Consider the following illustrative application of a single die, which
is rolled and the score noted. An event of interest in a random
experiment can be specified in many ways. A player could be
interested in even or odd scores, the score 2 or not, or scores which
are factors of 6 or not. In each case the sample space is divided into
two disjoint sets or mutually exclusive events, together providing
an exhaustive list of outcomes (meaning that there are no outcomes
which are not in at least one event). For example, if A stands
for the event of an even score, then $\bar{A}$ must represent an odd score.
Thus

\[ A \cap \bar{A} = \emptyset, \quad \text{and} \quad A \cup \bar{A} = S. \]

Similarly, if $A_i$ denotes the event of a score i where i = 1, 2, ..., 6
for the rolling of a die, then the events are mutually exclusive and
exhaustive, and the sample space will be the union of these events:

\[ S = A_1 \cup A_2 \cup \cdots \cup A_6. \]

Any union of events can be expressed in terms of the union of certain
mutually exclusive events. For example, the union $A \cup B$ of two
events A and B can be partitioned into the mutually exclusive events
$A \cap \bar{B}$, $\bar{A} \cap B$, and $A \cap B$. Then

\[ A \cup B = (A \cap \bar{B}) \cup (\bar{A} \cap B) \cup (A \cap B). \]

In another example, an event A in the sample space which also
contains B can be divided as

\[ A = (A \cap B) \cup (A \cap \bar{B}), \]

which can be interpreted as meaning that A can occur either with
B or without B.
Suppose the sample space is partitioned into the n mutually
exclusive and exhaustive events $A_1, A_2, \ldots, A_n$. If A is any event,
then

\[ A = (A \cap A_1) \cup (A \cap A_2) \cup \cdots \cup (A \cap A_n). \]

This means that, if A occurs, then it must occur as one, and only
one, of the events $A_1, A_2, \ldots, A_n$. It might happen that $A \cap A_i = \emptyset$
for some intersections, but this does not matter.

Example 38.4. In Example 38.2, express the sample space S and the events
$A_1$ and $A_2$ in set terms.
The sample space is given by

\[ S = \{(i, j) \mid i, j = 1, 2, 3, 4, 5, 6\}, \]

which has 36 elements since the dice are distinguishable and (i, j) is distinct from
(j, i). The events $A_1$ and $A_2$ can be written

\[ A_1 = \{(i, j) \mid i + j = 8\}, \]

\[ A_2 = \{(i, j) \mid \text{either } i = 5 \text{ or } j = 5 \text{ or both}\}. \]

The points we have illustrated can be summed up in a set of rules


called the axioms of probability:

Axioms of probability

For every event A in a sample space S, the probability
P(A) must satisfy:
(a) $0 \le P(A) \le 1$;
(b) for the empty set (or non-event) and the sample
space S,
\[ P(\emptyset) = 0, \quad P(S) = 1; \]
(c) for n mutually exclusive events $A_1, A_2, \ldots, A_n$,
\[ P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n). \]
(38.1)


The rules can be interpreted as

(a) every probability must lie between 0 and 1;


(b) the probability of an impossible event is zero, and the probability
of the occurrence of some element in a sample space is certain;
(c) the probability that one of a set of mutually exclusive events
occurs is the sum of the probabilities of each event.

Example 38.5. Two dice are rolled. What is the probability that a total score
of 4 or 7 occurs?
Let $A_1$ be the event of a score 4 and $A_2$ be the event of a score 7. These cannot
occur together, so they must be mutually exclusive events. Hence by (38.1c) and
the complete list of outcomes in Example 38.2,

\[ P(A_1 \cup A_2) = P(A_1) + P(A_2) = \tfrac{3}{36} + \tfrac{6}{36} = \tfrac{1}{4}. \]

If two events $A_1$ and $A_2$ are not mutually exclusive then they must
have elements of the sample space in common. Using partitioning,
which was explained previously in this section, $A_1$, $A_2$, and therefore
the union of $A_1$ and $A_2$, can be partitioned into unions of mutually
exclusive events. Thus

\[ A_1 = (A_1 \cap \bar{A}_2) \cup (A_1 \cap A_2), \]

\[ A_2 = (\bar{A}_1 \cap A_2) \cup (A_1 \cap A_2), \]

\[ A_1 \cup A_2 = (A_1 \cap \bar{A}_2) \cup (\bar{A}_1 \cap A_2) \cup (A_1 \cap A_2), \]

since $(A_1 \cap A_2) \cup (A_1 \cap A_2) = A_1 \cap A_2$. Hence by rule (c) in
(38.1),

\[ P(A_1) = P(A_1 \cap \bar{A}_2) + P(A_1 \cap A_2), \tag{38.2} \]

\[ P(A_2) = P(\bar{A}_1 \cap A_2) + P(A_1 \cap A_2), \tag{38.3} \]

\[ P(A_1 \cup A_2) = P(A_1 \cap \bar{A}_2) + P(\bar{A}_1 \cap A_2) + P(A_1 \cap A_2). \tag{38.4} \]

Elimination of $P(A_1 \cap \bar{A}_2)$ and $P(\bar{A}_1 \cap A_2)$ between (38.2), (38.3),
and (38.4) leads to:

Probability addition law

For two events which are not mutually exclusive:

\[ P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2). \tag{38.5} \]

Geometrically the result can be seen from Fig. 38.2 in which the
intersection $A_1 \cap A_2$ is ‘counted twice’ in $P(A_1) + P(A_2)$.

Example 38.6. If two dice are rolled, what is the probability that either the
sum is 8 or at least one 5 appears?
As we saw in Example 38.2, if $A_1$ is the event that the sum is 8 and $A_2$ the event
that at least one 5 appears, then

\[ P(A_1) = \tfrac{5}{36}, \quad P(A_2) = \tfrac{11}{36}. \]

These are not mutually exclusive events because both events occur when the
outcome is (5, 3) or (3, 5). Therefore

\[ A_1 \cap A_2 = \{(3, 5), (5, 3)\}, \]

in which case

\[ P(A_1 \cap A_2) = \tfrac{2}{36} = \tfrac{1}{18}. \]

Hence, by (38.5),

\[ P(A_1 \cup A_2) = \tfrac{5}{36} + \tfrac{11}{36} - \tfrac{1}{18} = \tfrac{7}{18}, \]

which means that the sum is 8 or at least one 5 appears with probability 7/18.

Fig. 38.2.
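
The addition law (38.5) can be checked against a direct count of the union. A minimal sketch, assuming equally likely outcomes for two dice; the helper `P` is an illustrative name for "number of elements in the event divided by 36".

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))          # two distinguishable dice
A1 = {(i, j) for (i, j) in S if i + j == 8}      # sum is 8
A2 = {(i, j) for (i, j) in S if 5 in (i, j)}     # at least one 5

def P(event):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(len(event), len(S))

lhs = P(A1 | A2)                      # direct count of the union
rhs = P(A1) + P(A2) - P(A1 & A2)      # addition law (38.5)
print(lhs, rhs)
```

Python's set operators `|` and `&` play the roles of union and intersection here.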

38.4 Counting and combinations


In many applications the total number of elements in a sample space
or in an event needs to be counted. Enumeration of outcomes can
become a lengthy process. For example, suppose that an experiment
consists of trials such as the spinning of k coins or the rolling of k
dice. If there are n possible outcomes for each coin or die, then the
sample space has $n^k$ possible outcomes. The rolling of four dice leads
to a sample space with $6^4 = 1296$ elements.
As we saw in Section 38.1, probabilities can be obtained using
relative frequency arguments. For the counting process which is
needed, permutation and combination formulas are often useful.
A permutation is a particular ordered selection. The notation ${}_nP_r$
means the number of ways in which r different items can be selected
from n distinct items taking due regard of the order of selection. If
items are not replaced the first item can be chosen in n ways leaving
$n - 1$ items. Hence the second item can be chosen in $n - 1$ ways.
The first two items can be chosen in $n(n - 1)$ different ways.
Continuing this process r times we obtain

\[ {}_nP_r = n(n - 1)(n - 2) \cdots (n - r + 1) = \frac{n!}{(n - r)!}. \]

Example 38.7. How many permutations of the letters a, b, c, d can be made if
two are selected each time?
In this example n = 4 and r = 2. Thus

\[ {}_4P_2 = 4 \cdot 3 = 12. \]

The full list of permutations is

{ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc}.
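
The list in Example 38.7 can be generated mechanically. A short sketch using the Python standard library (`itertools.permutations` and `math.perm`, the latter available from Python 3.8); the variable name `words` is illustrative.

```python
from itertools import permutations
from math import perm

# All ordered selections of 2 letters from a, b, c, d (no repetition).
words = ["".join(p) for p in permutations("abcd", 2)]
print(words)

# The count agrees with the formula 4P2 = 4!/(4 - 2)! = 12.
print(len(words), perm(4, 2))
```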

A combination is an unordered selection. The notation for a
combination is ${}_nC_r$, which means the number of ways in which r
different items can be selected from n items without regard to order.
Among the ${}_nP_r$ permutations there are r! which give the same
combination, because the first position can be chosen in r ways, the
second in $r - 1$ ways, and so on. Thus

\[ {}_nC_r = \frac{{}_nP_r}{r!} = \frac{n!}{(n - r)!\,r!}. \]

In the example above the items ab and ba are not distinguished in
the combination, and so on, so that 2 different letters may be chosen
from 4 different letters in

\[ {}_4C_2 = \frac{4!}{2!\,2!} = 6 \]

ways.
Note that

\[ {}_nC_r = \frac{n!}{(n - r)!\,r!} = \frac{n!}{(n - (n - r))!\,(n - r)!} = {}_nC_{n-r}. \]

It can also be written as

\[ \binom{n}{r}. \]

Notice also that the sequence ${}_nC_r$ (r = 0, 1, 2, ..., n) generates the
coefficients of the binomial series in $(a + b)^n$ (see Appendix A(c)).
Special values are

\[ {}_nC_0 = {}_nC_n = 1. \]
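
The relations between permutations, combinations and binomial coefficients can be checked numerically. This is a small sketch using `math.comb` and `math.perm` (Python 3.8 or later); the particular values of n are arbitrary illustrations.

```python
from math import comb, factorial, perm

# nCr = nPr / r!  (here n = 4, r = 2, as in the letters example)
four_choose_two = perm(4, 2) // factorial(2)

# Symmetry: nCr = nC(n-r), checked for n = 10.
symmetric = all(comb(10, r) == comb(10, 10 - r) for r in range(11))

# The sequence nCr, r = 0..n, gives the binomial coefficients of (a + b)^n.
row = [comb(5, r) for r in range(6)]
print(four_choose_two, symmetric, row)  # 6 True [1, 5, 10, 10, 5, 1]
```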

Example 38.8. How many different 5-card hands can be dealt from a standard
deck with 52 cards? What is the probability that a hand dealt at random consists
of 5 spades?
This is a combination problem, not a permutation one. Thus there are

\[ {}_{52}C_5 = \frac{52!}{47!\,5!} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} = 2\,598\,960 \]

different hands.
The number of different hands consisting of 5 spades is, since there are 13
spades in the pack,

\[ {}_{13}C_5 = \frac{13!}{8!\,5!} = \frac{13 \cdot 12 \cdot 11 \cdot 10 \cdot 9}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} = 1287. \]

To obtain the probability that a random 5-card hand contains 5 spades we
can use the counting argument, namely that out of the 2 598 960 equally likely
different hands 1287 will have 5 spades. Hence

\[ P(\text{5-card spade hand}) = \frac{1287}{2\,598\,960} \approx 0.0005, \]

which implies that about one hand in 2000 will have 5 spades.
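
The arithmetic of Example 38.8 is easy to reproduce. A minimal sketch with `math.comb`; the variable names are illustrative.

```python
from math import comb

hands = comb(52, 5)        # number of unordered 5-card hands
spade_hands = comb(13, 5)  # hands drawn entirely from the 13 spades
p = spade_hands / hands    # all hands are assumed equally likely

print(hands, spade_hands, p)
```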

Example 38.9. A box contains 20 balls of which 7 are red (r), 5 are white (w)
and 8 are black (b). If 3 balls are drawn at random, without replacement, find
the probability that
(a) two red balls and one black ball are drawn;
(b) one of each colour is drawn;
(c) one or more red balls are drawn;
(d) all are of the same colour.

The total number of 3-ball selections which can be made is

\[ N = {}_{20}C_3 = 1140 \]

for labelled balls. They are all equally likely to be drawn.
(a) The number of ways in which 2 red balls and 1 black ball can be drawn is

\[ {}_7C_2 \times 8 = \frac{7 \cdot 6}{1 \cdot 2} \cdot 8 = 168. \]

Hence

\[ P(2r \text{ and } 1b) = \frac{168}{N} = \frac{168}{1140} = \frac{14}{95} \approx 0.15. \]

(b) The number of ways in which one of each colour can be chosen is
7 × 5 × 8 = 280 from a total of 1140. Hence

\[ P(1r \text{ and } 1w \text{ and } 1b) = \frac{280}{1140} = \frac{14}{57} \approx 0.25. \]

(c) The number of ways in which no red ball is drawn is ${}_{13}C_3 = 286$ from the
total of 1140. Hence the probability that a selection contains at least one red
ball is

\[ P(\ge 1r) = 1 - P(0r) = 1 - \frac{286}{1140} = \frac{854}{1140} = \frac{427}{570}. \]

(d) Since the events are mutually exclusive, using (38.1c),

\[
\begin{aligned}
P(3r \text{ or } 3w \text{ or } 3b) &= P(3r \cup 3w \cup 3b) \\
&= P(3r) + P(3w) + P(3b) \\
&= \frac{{}_7C_3 + {}_5C_3 + {}_8C_3}{{}_{20}C_3} = \frac{101}{1140} \approx 0.09.
\end{aligned}
\]
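
The four answers in Example 38.9 can be confirmed by brute force, listing every selection of 3 labelled balls. This is a sketch under the example's assumption that all 1140 selections are equally likely; the helper `prob` and the colour labels are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Label the 20 balls 0..19 and record each ball's colour.
balls = ["r"] * 7 + ["w"] * 5 + ["b"] * 8
draws = list(combinations(range(20), 3))   # all unordered 3-ball selections

def prob(condition):
    """Fraction of selections whose sorted colour list satisfies `condition`."""
    favourable = sum(1 for d in draws if condition(sorted(balls[i] for i in d)))
    return Fraction(favourable, len(draws))

p_a = prob(lambda c: c == ["b", "r", "r"])   # two red and one black
p_b = prob(lambda c: c == ["b", "r", "w"])   # one of each colour
p_c = prob(lambda c: "r" in c)               # at least one red
p_d = prob(lambda c: len(set(c)) == 1)       # all the same colour
print(len(draws), p_a, p_b, p_c, p_d)
```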

38.5 Conditional probability


In many applications we are interested in an event A given that an
event B occurs. The probability of A, conditional that B occurs, is
written as P(A | B). A Venn diagram showing the overlapping events
is displayed in Fig. 38.3a. The probability P(A | B) refers to the
restricted set in Fig. 38.3b in which effectively the new universal set
is B. In enumeration terms we can derive

\[
P(A \mid B) = \frac{\text{number of outcomes in } A \cap B}{\text{number of outcomes in } B}
= \frac{(\text{number of outcomes in } A \cap B)/(\text{number of outcomes in } S)}{(\text{number of outcomes in } B)/(\text{number of outcomes in } S)}.
\]

Fig. 38.3 (a) Both A and B occur in the shaded intersection $A \cap B$. (b) P(A | B) refers to the new universal set B.

Hence the formal definition is, assuming that $P(B) \ne 0$,

Conditional probability of A given B

\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)}. \tag{38.6} \]

Example 38.10. Six cards are dealt from a well-shuffled deck of playing cards.
Given that all six cards are black, find the probability that they are all of the same
suit.
Let A and B represent the following events:

A = {the 6 cards are in the same suit}, B = {the 6 cards are black}.

Thus

$A \cap B$ = {6 black cards of the same suit}.

Therefore

\[ P(A \cap B) = \frac{\text{number of combinations of 6 clubs or 6 spades}}{\text{number of combinations of 6 cards}} = \frac{2 \cdot {}_{13}C_6}{{}_{52}C_6}. \]

Also

\[ P(B) = \frac{\text{number of combinations of 6 black cards}}{\text{number of combinations of 6 cards}} = \frac{{}_{26}C_6}{{}_{52}C_6}. \]

Hence the conditional probability that they are all of the same black suit is

\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{2 \cdot {}_{13}C_6}{{}_{52}C_6} \cdot \frac{{}_{52}C_6}{{}_{26}C_6}
= 2 \cdot \frac{13!}{6!\,7!} \cdot \frac{6!\,20!}{26!} = \frac{12}{805} \approx 0.015.
\]

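The value 12/805 in Example 38.10 can be reproduced with exact fractions. A minimal sketch of definition (38.6); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

total = comb(52, 6)                            # all 6-card hands
p_A_and_B = Fraction(2 * comb(13, 6), total)   # 6 clubs or 6 spades
p_B = Fraction(comb(26, 6), total)             # 6 black cards

# Definition (38.6): P(A | B) = P(A and B) / P(B).
p_A_given_B = p_A_and_B / p_B
print(p_A_given_B, float(p_A_given_B))
```

The common factor `total` cancels, which is why the restricted set B acts as the new universal set.
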
Note the following properties of conditional probabilities:

(i) $P(A \mid A) = 1$.
(ii) $P(A \mid B)P(B) = P(B \mid A)P(A)$.

The latter follows since $A \cap B = B \cap A$ and

\[ P(A \mid B) = P(A \cap B)/P(B), \quad P(B \mid A) = P(B \cap A)/P(A) \]

from definition (38.6).

Example 38.11. A production line is supplied with the same component made
by two different machines $M_1$ and $M_2$. It is known from samples of the outputs
that the probability that a component from $M_1$ is not faulty is 0.91 and from $M_2$
is 0.85. Machine $M_1$ supplies 60% of the components and machine $M_2$ 40%.
Components are chosen at random and tested before the next stage of production.
What is the probability that
(a) given that a component was made by $M_2$, it is not faulty?
(b) a component is not faulty?

Let $A_1$, $A_2$, and B be the events

$A_1$ = {component made by $M_1$}, $A_2$ = {component made by $M_2$},

B = {component not faulty}.

From the 60%/40% supply we know that $P(A_1) = 0.6$ and that $P(A_2) = 0.4$. The
known rates for $M_1$ and $M_2$ give the conditional probabilities $P(B \mid A_1) = 0.91$
and $P(B \mid A_2) = 0.85$.
(a) The answer is $P(B \mid A_2) = 0.85$.
(b) Write the event B as $(B \cap A_1) \cup (B \cap A_2)$, which is still the event that the
component is not faulty. Since $B \cap A_1$ and $B \cap A_2$ are mutually exclusive, it
follows that

\[
\begin{aligned}
P[(B \cap A_1) \cup (B \cap A_2)] &= P(B \cap A_1) + P(B \cap A_2) \\
&= P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) \\
&= 0.91 \times 0.6 + 0.85 \times 0.4 = 0.886,
\end{aligned}
\]

using (38.6). Hence the probability of a non-faulty component is 0.89 approximately.
In solving this problem we have encountered a new law in (b) called the
law of total probability which will be discussed further in Section 38.7.
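
The calculation in part (b) is a weighted sum over the machines, which can be written as a short sketch. The dictionary names `prior` and `not_faulty_given` are illustrative, and the figures are those of Example 38.11.

```python
# P(A_i): share of components supplied by each machine.
prior = {"M1": 0.6, "M2": 0.4}
# P(B | A_i): probability that a component from each machine is not faulty.
not_faulty_given = {"M1": 0.91, "M2": 0.85}

# P(B) = sum over machines of P(B | A_i) P(A_i).
p_not_faulty = sum(not_faulty_given[m] * prior[m] for m in prior)
print(round(p_not_faulty, 3))
```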

38.6 Independent events


The recognition of independence of events and data is crucial in
probability and statistics. Two events are said to be independent if
the occurrence of either event has no effect on the occurrence of the
other. In terms of conditional probability this means that two events
A and B are independent if and only if
\[ P(B \mid A) = P(B) \quad \text{or} \quad P(A \mid B) = P(A), \]

that is,

\[ P(A \cap B) = P(A)P(B) \tag{38.7} \]

by (38.6). The independence result generalizes for N independent
events $A_1, A_2, \ldots, A_N$ to

\[ P(A_1 \cap A_2 \cap \cdots \cap A_N) = P(A_1)P(A_2) \cdots P(A_N). \tag{38.8} \]

The following simple illustration shows the distinction between
dependent and independent events. Two cards are chosen at random
from a pack of 52 cards. In the first case, the first card is replaced
before the second card is chosen. The events considered are

A = {first card is an ace},

B = {second card is an ace}.

Then

\[ P(B) = \tfrac{4}{52} = \tfrac{1}{13} \quad \text{and} \quad P(B \mid A) = \tfrac{4}{52} = \tfrac{1}{13} = P(B). \]

In other words the events are independent.
On the other hand if there is no replacement, then

\[ P(B) = \tfrac{4}{52} \quad \text{but} \quad P(B \mid A) = \tfrac{3}{51} \ne P(B), \]

indicating that A and B are not independent events.
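
The two-card illustration can be checked by listing every ordered pair of cards. A sketch assuming a 52-card pack with 4 aces; the function name `conditional_and_marginal` is illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

# Cards 0..51; the first four are taken to be the aces.
deck = ["ace"] * 4 + ["other"] * 48

def conditional_and_marginal(pairs):
    """Return (P(second ace | first ace), P(second ace)) by counting pairs."""
    pairs = list(pairs)
    first_ace = [p for p in pairs if deck[p[0]] == "ace"]
    both = [p for p in first_ace if deck[p[1]] == "ace"]
    second_ace = [p for p in pairs if deck[p[1]] == "ace"]
    return Fraction(len(both), len(first_ace)), Fraction(len(second_ace), len(pairs))

# With replacement the same card may appear twice; without, it may not.
with_repl = conditional_and_marginal(product(range(52), repeat=2))
without_repl = conditional_and_marginal(permutations(range(52), 2))
print(with_repl, without_repl)
```

The two probabilities agree with replacement and differ without it, which is the distinction between independent and dependent events.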

Example 38.12. Figure 38.4 shows parts of two circuits which contain electrical
components P, Q, and R placed in parallel and series. For the parallel case
the circuit fails if all three components fail, but in the series case failure occurs if
just one component fails. In some time interval the probabilities of failure of P, Q,
and R are respectively p, q, and r. What are the probabilities of circuit breakdown
in the two cases?

Fig. 38.4 (a) Components in parallel. (b) Components in series.

Let A, B, and C be the events

A = {P fails}, B = {Q fails}, C = {R fails},

where we assume that failures are independent events.
For the parallel case failure occurs if $A \cap B \cap C$ occurs. By (38.8)

\[ P(A \cap B \cap C) = P(A)P(B)P(C) = pqr, \]

which means that the probability of failure is pqr.
For the series case failure occurs if the event $A \cup B \cup C$ occurs. Using (38.5)
twice and (38.8)

\[
\begin{aligned}
P(A \cup B \cup C) &= P(A) + P(B \cup C) - P(A \cap (B \cup C)) \\
&= p + P(B \cup C) - P(A)P(B \cup C) \\
&= p + (1 - p)(P(B) + P(C) - P(B \cap C)) \\
&= p + (1 - p)(q + r - qr) \\
&= (p + q + r) - (qr + rp + pq) + pqr,
\end{aligned}
\]

which is the probability of series failure.
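
Both results can be checked by summing over the eight fail/work states of the three components. A sketch assuming independent failures; the function names and the sample values p = 0.1, q = 0.2, r = 0.3 are illustrative.

```python
from itertools import product

def series_failure(p, q, r):
    """Formula from the example: P(A or B or C) for independent failures."""
    return (p + q + r) - (q * r + r * p + p * q) + p * q * r

def enumerate_failure(probs, circuit_fails):
    """Sum the probabilities of all component states in which the circuit fails."""
    total = 0.0
    for state in product([True, False], repeat=len(probs)):  # True = failed
        weight = 1.0
        for failed, prob in zip(state, probs):
            weight *= prob if failed else 1 - prob
        if circuit_fails(state):
            total += weight
    return total

p, q, r = 0.1, 0.2, 0.3
series = enumerate_failure((p, q, r), any)     # series: any one failure breaks it
parallel = enumerate_failure((p, q, r), all)   # parallel: all three must fail
print(series, series_failure(p, q, r), parallel, p * q * r)
```

The series result also equals 1 - (1 - p)(1 - q)(1 - r), the complement of all three components working.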

38.7 Total probability


Suppose that a sample space is partitioned (see Section 37.3) into
two events $A_1$ and $A_2$ which are mutually exclusive. In other words
$A_1 \cap A_2 = \emptyset$ and $A_1 \cup A_2 = S$. Let B be an event in S (see Fig.
38.5). The sets $B \cap A_1$ and $B \cap A_2$ are mutually exclusive so that

\[ P(B) = P(B \cap A_1) + P(B \cap A_2). \]

From the notion of conditional probability (38.6) we obtain:

The law of total probability

For mutually exclusive and exhaustive events $A_1$ and $A_2$,

\[ P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2). \tag{38.9} \]

The result generalizes to the case in which S contains n mutually
exclusive and exhaustive events $A_1, A_2, \ldots, A_n$. If B is an event in
S, then

\[ P(B) = \sum_{i=1}^{n} P(B \mid A_i)P(A_i). \]

Example 38.13. A box contains 8 red and 13 black components. A machine
draws components at random from the box and fits them into a circuit. What is the
probability that the second component is red?
Suppose now that components in the box are replaced with components of the
same colour as they are used. What is the probability that the second component
is red?
Define the events as follows:

$A_1$ = {first component is red},

$A_2$ = {first component is black},

B = {second component is red}.

Then

\[ P(A_1) = \tfrac{8}{21}, \quad P(A_2) = \tfrac{13}{21}. \]

Also

\[ P(B \mid A_1) = \tfrac{7}{20}, \quad P(B \mid A_2) = \tfrac{8}{20}. \]

Using (38.9), since $A_1$ and $A_2$ are mutually exclusive and exhaustive,

\[
\begin{aligned}
P(B) &= P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) \\
&= \tfrac{7}{20} \cdot \tfrac{8}{21} + \tfrac{8}{20} \cdot \tfrac{13}{21} = \tfrac{8}{21}.
\end{aligned}
\]

Hence the probability that the second component drawn is red is 8/21, which is
the same as $P(A_1)$. This suggests (correctly) that the probability that the second
component is red does not depend on the colour of the first component.
The first solution was selection without replacement. In the second part of
the question the components are replaced. In this case P(B) = 8/21; in other words,
with or without replacement, the probability that the second component is red
is still 8/21.
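
The first part of Example 38.13 can be reproduced with exact fractions. A minimal sketch of (38.9) for drawing without replacement; the variable names are illustrative.

```python
from fractions import Fraction

red, black = 8, 13
total = red + black

p_A1 = Fraction(red, total)             # first component red
p_A2 = Fraction(black, total)           # first component black
p_B_given_A1 = Fraction(red - 1, total - 1)   # one red already removed
p_B_given_A2 = Fraction(red, total - 1)       # all reds still in the box

# Law of total probability (38.9).
p_B = p_B_given_A1 * p_A1 + p_B_given_A2 * p_A2
print(p_B)
```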

38.8 Bayes’ theorem


Suppose that the sample space S is the union of the mutually
exclusive events $A_1$ and $A_2$. In this case $A_2 = \bar{A}_1$, and the notation
suggests the generalization which follows. Suppose that an event B
occurs. We ask the question: if B occurs, what is the probability that
$A_1$ occurs? In other words what is $P(A_1 \mid B)$?
From the rule for conditional probability (38.6) we can deduce,
since $B \cap A_1 = A_1 \cap B$, etc.,

\[ P(B \cap A_1) = P(A_1 \mid B)P(B) = P(B \mid A_1)P(A_1), \tag{38.10} \]

\[ P(B \cap A_2) = P(A_2 \mid B)P(B) = P(B \mid A_2)P(A_2). \tag{38.11} \]

From (38.9) we also have

\[ P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2). \tag{38.12} \]

Elimination of P(B) between (38.10) and (38.12) leads to:

Bayes’ theorem
For mutually exclusive and exhaustive events $A_1$ and $A_2$,

\[ P(A_1 \mid B) = \frac{P(B \mid A_1)P(A_1)}{P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2)}. \tag{38.13} \]
Example 38.14. It is known that 4% of a batch of components in a manufacturing
process are faulty. Components are tested on the production line with 90%
probability that a faulty component is detected, but it is known that in 2% of the
cases a component which is not faulty is nevertheless recorded as faulty. What is
the probability that a component which is recorded as faulty is actually faulty?
Let $A_1$ and $A_2$ be the events

$A_1$ = {component faulty},

$A_2$ = {component not faulty},

and let B be the event

B = {test indicates faulty}.

Then

\[ P(A_1) = 0.04, \quad P(A_2) = 0.96, \quad P(B \mid A_1) = 0.90, \quad P(B \mid A_2) = 0.02. \]

We require the probability that the component is faulty given that the test
recorded faulty, that is, $P(A_1 \mid B)$. By Bayes’ theorem (38.13)

\[
P(A_1 \mid B) = \frac{P(B \mid A_1)P(A_1)}{P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2)}
= \frac{0.9 \times 0.04}{0.9 \times 0.04 + 0.02 \times 0.96} \approx 0.65.
\]

If the sample space is partitioned by $\{A_i\}$ (i = 1, 2, ..., n) then the generalized
Bayes’ theorem is

\[ P(A_i \mid B) = \frac{P(B \mid A_i)P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j)P(A_j)}. \]
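
Bayes’ theorem for a partition can be written as a few lines of code and applied to the figures of Example 38.14. A sketch in which the function name `posterior` and its argument names are illustrative.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i).

    The events A_i are assumed mutually exclusive and exhaustive.
    """
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(B | A_i) P(A_i)
    evidence = sum(joint)                                 # P(B), total probability
    return [j / evidence for j in joint]

# A_1 = faulty, A_2 = not faulty; B = test indicates faulty.
post = posterior([0.04, 0.96], [0.90, 0.02])
print([round(x, 3) for x in post])
```

Although the test detects 90% of faulty components, only about 65% of components recorded as faulty are actually faulty, because non-faulty components are far more numerous.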

Problems

38.1. How many elements do the following sample spaces
contain:
(a) the spinning of 5 coins;
(b) the rolling of 3 dice;
(c) a coin and a die randomly thrown together;
(d) a dart thrown at a dartboard.

38.2. Two dice are rolled. What is the probability that
the sum of the face values is 7? What is the probability
that no 5 appears? What is the probability that the score
is 7 or less?

38.3. Two dice are rolled and the scores noted. Write
down the elements in the sample space. How many
elements does the set have? Let A denote the event {the
sum of the outcomes is 5}, and B denote the event {at
least one die shows 4}.
Express the sets of these events in formula terms.
List all the elements in A, B, $A \cup B$, and $A \cap B$.

38.4. Suppose that A, B, and C are three events of the
sample space S. Write down the set formulae for the
events:
(a) only B occurs,
(b) exactly one of A, B, or C occurs.

38.5. Suppose that a sample space S includes the events
A and B. Show that the number of elements in $A \cup B$
can be expressed as

\[ n(A \cup B) = n(A \cap \bar{B}) + n(\bar{A} \cap B) + n(A \cap B), \]

(this is an alternative version of (34.7)).
Suppose the two dice are rolled. Let A denote the
event {the sum of the outcomes is 6} and B the event
{both dice show the same number}. List the elements in
$A \cap \bar{B}$, $\bar{A} \cap B$, and $A \cap B$, and find $n(A \cup B)$ using the
formula above.

38.6. A card is drawn from a deck of 52 playing cards.
If A is the event that an ace is drawn, B is the event that
a heart is drawn and C is the event that a black card is
drawn, explain in terms of the cards drawn what the
following events represent:
(a) $A \cap B$; (b) $A \cap C$; (c) $A \cup B$;
(d) $A \cup B \cup C$; (e) A\B; (f) A\B;
(g) A\C; (h) $(A \cap B) \cup C$; (i) $(A \cap B) \cup (A \cap C)$.

38.7. Cards are drawn from a deck of 52 playing cards
without replacement. What is the probability that
(a) the first card is a king?
(b) the first two cards are kings?
(c) the first card is a king, the second and third cards
are not kings and the fourth card is a king?

38.8. A well-shuffled deck of cards is cut twice randomly.
What is the probability that two aces are shown? (This
is a problem of selection with replacement.)

38.9. Evaluate the following permutations:
(a) ${}_5P_3$; (b) ${}_{10}P_4$; (c) ${}_7P_7$; (d) lP\-.

38.10. How many different 3-letter ‘words’ can be made
up from the letters a, b, c, d, e with no repetition of letters?

38.11. How many 5-digit numbers can be formed (numbers
cannot start with 0) from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, if
(a) numbers are selected without replacement?
(b) any number of repetitions of numbers is allowed?
(c) without replacement but such that the number must
be divisible by 5?

38.12. Calculate the following combinations:
(a) ${}_7C_3$; (b) ${}_{99}C_{96}$; (c) ,lC5.

38.13. Prove that

\[ {}_{n-1}C_r + {}_{n-1}C_{r-1} = {}_nC_r. \]

38.14. Prove that

(a) \[ \sum_{r=0}^{n} {}_nC_r = 2^n, \]

(b) \[ \sum_{r=0}^{n} {}_nC_r \, 3^r = 4^n. \]

38.15. How many different 4-card hands can be dealt
from a deck of 52 playing cards? How many hands
contain 4 cards of the same suit? What is the probability
that a hand dealt randomly contains four cards from the
same suit?

38.16. In the previous question investigate how the
probabilities change for n-card hands ($1 < n \le 13$) with
n cards from the same suit.

38.17. A box contains 22 balls of which 7 are red, 9 are
white and 6 are black. Four balls are drawn at random
from the box without replacement. Find the probability
that
(a) 3 red balls and one white ball are drawn;
(b) the balls are red;
(c) the balls are all of the same colour;
(d) there is at least one ball of each colour.

38.18. A production line is supplied with the same
component made by two different machines $M_1$ and $M_2$.
It is known from samples of the outputs that the
probability that a component from $M_1$ is not faulty is
0.89 and from $M_2$ is 0.83. Machine $M_1$ supplies 70% of
the components and machine $M_2$ 30%. Components are
chosen at random and tested before the next stage of
production. What is the probability that
(a) given that it was made by $M_1$, a component is not
faulty?
(b) a component is not faulty?
(c) given that a component was faulty, that it was
manufactured by $M_2$?

38.19. A production line is supplied with the same
component made by three different machines $M_1$, $M_2$,
and $M_3$. It is known from samples of the outputs that
the probability that a component from $M_1$ is not faulty
is 0.87, from $M_2$ is 0.84, and from $M_3$ is 0.91. Machine
$M_1$ supplies 45% of the components, machine $M_2$ 30%,
and machine $M_3$ 25%. Components are chosen at random
and tested before the next stage of production. What is
the probability that
(a) a component is not faulty?
(b) given that a component was faulty, that it was
manufactured by $M_2$?
(c) given that a component was faulty, that it was made
by $M_1$ or $M_2$?

Fig. 38.6

38.20. Figure 38.6 shows part of a circuit with 6 components
in a parallel and series combination. The probabilities
of failures of components are $p_1$, $p_2$, $p_3$, q, $r_1$, and $r_2$
as shown and are independent. What is the probability
that this part of the circuit fails? If all components have
the same probability of failure of 0.98, what is the
probability that this part of the circuit fails?

38.21. It is known that in a batch of 100 microprocessors,
5 are defective.
(a) A microprocessor is chosen at random without
replacement. What is the probability that it is defective?
(b) Two are chosen at random without replacement.
What is the probability that both are defective?
(c) Two are chosen without replacement. Given that the
first is defective, what is the probability that the second
is also defective?

38.22. In the UK national lottery 6 numbered balls are
selected at random from 49 balls numbered 1, 2, 3, ..., 49
without replacement. Prizes are given to those who
correctly select 3, 4, 5, or 6 numbers. Find the probability
of winning in each case. A seventh bonus ball is also
drawn from the remaining 43 balls and further prizes are
given for those who correctly choose the bonus ball and
any 5 of the 6 drawn numbers. Find the probability of
winning in this case. What is the overall probability that
a lottery ticket wins at least one prize?

38.23. A game is played in which n players each spins a
coin and the outcome examined. The game continues
until the outcome is either $n - 1$ heads and 1 tail, or 1
head and $n - 1$ tails. The single player with the different
outcome wins the coins from the other players. Show
that the probability that the game ends at a given play
is $n/2^{n-1}$, and that the probability that the game finishes
at the ith play is given by the geometric distribution

\[ \frac{n}{2^{n-1}} \left(1 - \frac{n}{2^{n-1}}\right)^{i-1}. \]

Find also the mean number of plays to the end of the
game.
39 Random variables and probability distributions

Contents
39.1 Random variables
39.2 Probability distributions
39.3 The binomial distribution
39.4 Expected value and variance
39.5 Geometric distribution
39.6 Poisson distribution
39.7 Other discrete distributions
39.8 Continuous random variables and distributions
39.9 Mean and variance of continuous random variables
39.10 The normal distribution
Problems

39.1 Random variables


In experiments or trials in which the outcome is numerical, the
outcomes are values of what is known as a random variable. For
example, suppose that a coin is spun three times and we record the
outcomes and ask: how many heads appear? Then the answer will
be 0, 1, 2, or 3 heads. The sample space S, which lists all possible
outcomes of the trial, has 8 elements given by

S = {(HHH), (THH), (HTH), (HHT), (TTH), (THT), (HTT), (TTT)}.

The random variable X associated with the question is the number
of heads obtained. Generally, the random variable X assigns a
number to each outcome in the sample space S. This set of numbers is
denoted by $S_X$. In this example

\[ S_X = \{0, 1, 2, 3\}, \]

which is a list of the possible values of the number of heads.
The random variable X can be thought of as a function or
mapping from the sample space S to $S_X$ which, since it is a set of
real numbers, can be represented by points on a straight line. A
representation of the mapping is displayed in Fig. 39.1, in which the
element s in S is mapped by X into the value $X(s)$ on the real line
$S_X$. In the example s could be (HTT), giving $X(s) = 1$, but notice
that (THT) and (TTH) also map into $X(s) = 1$.

Fig. 39.1 Mapping of the random variable X from the sample space S
onto the real line $S_X$.
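The mapping from outcomes to values can be made concrete with a short sketch. The names `sample_space` and `number_of_heads` are illustrative choices, not notation from the text; the code simply enumerates the eight outcomes of three spins and applies X to each.

```python
from itertools import product

# Sample space S for three spins of a coin: all 8 sequences of H and T.
sample_space = ["".join(s) for s in product("HT", repeat=3)]

def number_of_heads(outcome):
    """The random variable X: maps an outcome s in S to the number X(s)."""
    return outcome.count("H")

# S_X is the set of values that X can take.
values_of_x = sorted({number_of_heads(s) for s in sample_space})

for s in sample_space:
    print(s, "->", number_of_heads(s))
print("S_X =", values_of_x)
```

Several outcomes share the same value of X, which is why X is a many-to-one mapping in general.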

In the example above, X is a discrete random variable since $X(s)$
is one of a finite set of numbers. In some cases the possible outcomes
are infinite in number but can still be counted. For example, suppose
that X is the random variable of the number of spins of a coin until
a head appears. The list of possible outcomes is {1, 2, 3,...} which
is unbounded but countable, and X is still called a discrete random
variable.
Obviously many random variables can be associated with the
same experiment. In the example above where a coin is spun three
times, a random variable Y, say, could be the number of tails
observed.

39.2 Probability distributions


Let the random variable X take the values $x_1, x_2, \ldots$ (depending
on the context it is sometimes more convenient to start with
$x_0, x_1, \ldots$), where the set of numbers can be finite or infinite.
In terms of a random variable we write probabilities as $P(X = x_i)$,
which means the probability that the random variable X takes the value
$x_i$, or we could consider $P(X < x_i)$, which is the probability that
the random variable takes values strictly less than $x_i$, and so on.
Often we denote $P(X = x_i)$ simply by the symbol $p_i$. The pairs
$(x_i, p_i)$ for $i = 1, 2, \ldots$ define the probability distribution
or probability function for the random variable X. Note that for any
probability distribution of a discrete random variable we must have:

Probability distribution $P(X = x_i) = p_i$

(i) $0 \le p_i \le 1$;
(ii) $\sum_{i=1}^{n} p_i = 1$ if X has n possible outcomes, or
$\sum_{i=1}^{\infty} p_i = 1$ if X has a countably infinite set of
outcomes.

A discrete probability distribution can be expressed in a table such as

  $x_i$ :  $x_1$   $x_2$   $x_3$   ...
  $p_i$ :  $p_1$   $p_2$   $p_3$   ...

For the coin spun three times in Section 39.1, the distribution
would be

  $x_i$ :  0     1     2     3
  $p_i$ :  1/8   3/8   3/8   1/8

since each of the outcomes in the original S is equally likely.
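As a hedged check of the table for three spins of a fair coin, the following sketch counts equally likely outcomes and converts the counts to exact fractions. The variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three spins of a fair coin.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
counts = Counter(s.count("H") for s in outcomes)

# p_i = (number of outcomes with x_i heads) / (total number of outcomes).
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
print("sum of probabilities:", sum(distribution.values()))
```

Both conditions for a probability distribution can be read off: each $p_i$ lies between 0 and 1, and the probabilities sum to 1.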



Example 39.1. A box contains 6 components of which two are defective.
Components are selected at random without replacement until a defective
component is chosen. Find the probability distribution of the number of
components drawn from the box.

Let X be the random variable (number of components withdrawn including
the defective). Then

\[ S_X = \{1, 2, 3, 4, 5\} = \{x_i\}, \quad (i = 1, 2, 3, 4, 5). \]

The probability

\[ p_1 = P(X = x_1) = \tfrac{2}{6} = \tfrac{1}{3}, \]

since there is a 2 in 6 chance of choosing a defective on the first
selection. Also

\[ P(X = x_2) = \tfrac{4}{6} \cdot \tfrac{2}{5} = \tfrac{4}{15}, \]

since the probability of choosing a non-defective component at the
first stage is $\tfrac{4}{6}$, which leaves 2 defective in the
remaining 5. Similarly

\[ P(X = x_3) = \tfrac{4}{6} \cdot \tfrac{3}{5} \cdot \tfrac{2}{4} = \tfrac{1}{5}, \qquad
   P(X = x_4) = \tfrac{4}{6} \cdot \tfrac{3}{5} \cdot \tfrac{2}{4} \cdot \tfrac{2}{3} = \tfrac{2}{15}, \]

\[ P(X = x_5) = \tfrac{4}{6} \cdot \tfrac{3}{5} \cdot \tfrac{2}{4} \cdot \tfrac{1}{3} = \tfrac{1}{15}. \]

The complete distribution is

  $x_i$ :  1     2      3     4      5
  $p_i$ :  1/3   4/15   1/5   2/15   1/15

The distribution can be represented graphically as shown in Fig. 39.2.
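The step-by-step multiplication in Example 39.1 can be sketched in code. The function name `draws_until_defective` is hypothetical; it assumes sampling without replacement and uses exact fractions so the result can be compared with the table directly.

```python
from fractions import Fraction

def draws_until_defective(total, defective):
    """Return {i: P(X = i)}, where X is the number of components drawn
    without replacement up to and including the first defective one."""
    good = total - defective
    distribution = {}
    prob_all_good_so_far = Fraction(1)
    for i in range(1, good + 2):
        remaining = total - (i - 1)
        # The i-th draw is defective, given that the first i - 1 were good.
        distribution[i] = prob_all_good_so_far * Fraction(defective, remaining)
        # Update the probability that the first i draws are all good.
        good_left = good - (i - 1)
        prob_all_good_so_far *= Fraction(good_left, remaining)
    return distribution

dist = draws_until_defective(6, 2)
for i, p in dist.items():
    print(i, p)
```

The probabilities sum to 1 because a defective component must appear by the fifth draw at the latest.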

39.3 The binomial distribution


Suppose that a series of trials are independent and have two possible
outcomes which occur with probabilities p and (1 - p). If p is
constant throughout then these are known as Bernoulli trials. A
simple example is the spinning of a coin. We could define a random
variable X which takes the value 1 if a head appears and 0 if a tail
appears. The probability of each is 1/2. The terms success and failure
are frequently used in this context, and generally Bernoulli trials
apply to populations that naturally divide into pairs of alternatives,
for example, on/off, male/female, alive/dead, etc. With 1/0
representing success/failure, a Bernoulli sequence of trials might
look like

1 0 0 0 1 1 1 1 0 0 1 0 1 1 0 0 ...

If p is the probability of success at each trial and q = 1 - p is the
probability of failure, then the probability distribution of Bernoulli
trials is

  $x_i$ :  0   1
  $p_i$ :  q   p

Let us consider a further distribution which can arise from
Bernoulli trials. A series of independent Bernoulli trials takes place
with the probability of success or failure of any given trial given by
p or q. Consider the probability distribution of i successes in a fixed
number of trials n. In the notation of probability distributions

\[ x_i = i, \quad (i = 0, 1, 2, \ldots, n). \]

Here is a particular sequence:

1 1 1 ... 1 (i times)   0 0 0 ... 0 (n - i times).

This sequence, in which there are i successes followed by n - i
failures, occurs with probability

\[ p^i q^{n-i}. \]

However there are many sequences which have i successes (1) and
n - i failures (0), and the number of possible arrangements is
${}_nC_i$ (see Section 38.4 for an explanation of the combination
notation). Including every arrangement, the probability of i successes
in n trials is

\[ {}_nC_i \, p^i q^{n-i}, \]

since the probability of a success followed by another success is
$p \times p = p^2$, and so on. This is called the binomial distribution
for X, the random variable of the number of successes in n trials.

Binomial distribution, which has the probability function

\[ P(X = x_i) = p_i = {}_nC_i \, p^i q^{n-i} = \frac{n! \, p^i q^{n-i}}{(n-i)! \, i!},
   \quad (i = 0, 1, 2, \ldots, n). \qquad (39.2) \]

The binomial distribution contains two parameters, n the number
of trials and p the probability of success.
Since q = 1 - p, the first few coefficients in the distribution are

  $x_i$ :  0       1              2                                   3
  $p_i$ :  $q^n$   $npq^{n-1}$    $\frac{n(n-1)}{2!} p^2 q^{n-2}$     $\frac{n(n-1)(n-2)}{3!} p^3 q^{n-3}$

which are recognizably the first few terms in the binomial expansion
of $(p + q)^n$ (see Appendix A(c)). Hence

\[ \sum_{i=0}^{n} p_i = \sum_{i=0}^{n} \frac{n! \, p^i q^{n-i}}{(n-i)! \, i!} = (p + q)^n = 1, \]

since p + q = 1. This confirms that (39.2) does satisfy the key
requirement for a probability distribution. Some bar charts for the
binomial distribution are shown for n = 10 and p = 0.3, 0.5, 0.7 in
Fig. 39.3.

Example 39.2. Three dice are rolled simultaneously. What is the
probability that two 5s appear with the third face showing a different
number?
Let the random variable X be the number of 5s which appear. Then

\[ S_X = \{0, 1, 2, 3\}. \]

The outcomes from each die are independent, with a 5 showing called a
success and no 5 showing a failure. The probability that a single die
shows a 5 is $\tfrac{1}{6}$. Hence X has a binomial distribution with
parameters n = 3 and $p = \tfrac{1}{6}$. Hence, by (39.2),

\[ P(X = 2) = {}_3C_2 \left(\tfrac{1}{6}\right)^2 \left(\tfrac{5}{6}\right) = \tfrac{15}{216} \approx 0.069, \]

which is quite small. The other probabilities are

\[ P(X = 0) = \tfrac{125}{216} \approx 0.579, \quad
   P(X = 1) = \tfrac{75}{216} \approx 0.347, \quad
   P(X = 3) = \tfrac{1}{216} \approx 0.005. \]

The chance of obtaining three 5s is 1 in 216.
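The binomial probabilities for three dice can be reproduced with a few lines of code. This is a minimal sketch; `binomial_pmf` is an illustrative helper name, and exact fractions are used so that the values can be compared with those quoted in Example 39.2.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(i, n, p):
    """P(X = i) for a binomial distribution with parameters n and p."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

p = Fraction(1, 6)                       # probability that one die shows a 5
probabilities = [binomial_pmf(i, 3, p) for i in range(4)]

for i, prob in enumerate(probabilities):
    print(i, prob, round(float(prob), 3))
print("sum:", sum(probabilities))
```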

39.4 Expected value and variance


The expected value or mean or expectation of a random variable is
defined in terms of a weighted average of outcomes: the weighting is
equal to the probability $p_i$ with which $x_i$ occurs. Thus if X is
a random variable which can take the values $x_1, x_2, \ldots$ with
probabilities $p_1, p_2, \ldots$ then

Expected value or mean of X is defined by

\[ E(X) = \sum_i p_i x_i, \qquad (39.3) \]

where the summation is over all i, either finite or countably infinite.
The symbol $\mu$ is often used for the expected value E(X).

Fig. 39.3 Binomial distribution for n = 10 and (a) p = 0.3,
(b) p = 0.5, (c) p = 0.7.

For the binomial distribution (39.2) with parameters n and p, the
expected value is (note that the distribution has n + 1 elements)

\[ E(X) = \sum_{i=0}^{n} i \, {}_nC_i \, p^i q^{n-i}
        = \sum_{i=1}^{n} \frac{n! \, p^i q^{n-i}}{(n-i)! \, (i-1)!}
        = np \sum_{i=1}^{n} \frac{(n-1)! \, p^{i-1} q^{n-i}}{(n-i)! \, (i-1)!} \]
\[      = np \sum_{i=0}^{n-1} \frac{(n-1)! \, p^{i} q^{n-1-i}}{(n-1-i)! \, i!}
        = np (p + q)^{n-1} = np, \]

using the binomial expansion (Appendix A(c)).

Example 39.3. In Example 39.2 what is the expected value of the number
of 5s which appear when 3 dice are rolled?
From the definition of expected value and the results in the previous
example,

\[ E(X) = \sum_{i=0}^{3} P(X = i)\, i
        = \tfrac{125}{216} \cdot 0 + \tfrac{75}{216} \cdot 1 + \tfrac{15}{216} \cdot 2 + \tfrac{1}{216} \cdot 3
        = \tfrac{108}{216} = \tfrac{1}{2}. \]

This result checks with $np = \tfrac{3}{6} = \tfrac{1}{2}$.
Random variables can be combined as in X + Y, and it is possible
to consider functions of random variables g(X). Expected values
satisfy the following theorems (which will not be proved here,
however). If c is a constant, and X and Y are random variables, and
g(X) is a function of X, then

Rules for expected values

(i) $E(cX) = cE(X)$;
(ii) $E(X + Y) = E(X) + E(Y)$;   (39.4)
(iii) $E(XY) = E(X)E(Y)$ (if X and Y are independent);
(iv) $E(g(X)) = \sum_{i=1}^{n} g(x_i) p_i$ (for a finite distribution).

Whilst the expected value of a random variable is a useful average,
it gives no idea of the spread of the distribution about the expected
value. Two distributions can have the same mean but can have very
different shapes in relation to the mean. A measure of the spread is
the difference X - E(X), the difference between the random variable
and its mean. However, its expectation is always zero since, using
(39.4) (i), (ii) above,

\[ E(X - E(X)) = \sum_i (x_i - E(X)) p_i = \sum_i x_i p_i - E(X) \sum_i p_i
             = E(X) - E(X) \cdot 1 = 0, \]

which is obviously not helpful as a measure of spread. Instead we
choose the random variable $(X - E(X))^2$. Its expected value is
known as the variance and denoted by:

Variance of a random variable

\[ \mathrm{Var}(X) = \sigma^2 = E[(X - E(X))^2] = E[(X - \mu)^2], \qquad (39.5) \]

where $\mu = E(X)$.

Using (39.4), note that the variance can be expressed in the form

\[ \mathrm{Var}(X) = E(X^2 - 2\mu X + \mu^2)
                  = E(X^2) - 2\mu E(X) + \mu^2
                  = E(X^2) - \mu^2, \qquad (39.6) \]

which is more convenient.

Since the units associated with the variance are squares, the
symbol $\sigma^2$ is frequently used for variance so that in 'linear'
terms the spread can be defined by $\sigma = \sqrt{\mathrm{Var}(X)}$.
This is known as

Standard deviation of the random variable X:

\[ \sigma = \sqrt{\mathrm{Var}(X)}. \qquad (39.7) \]

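Example 39.4. Find the variance of the binomial distribution given by
(39.2).
Using (39.2) and (39.4) (iv),

\[ \mathrm{Var}(X) = E(X^2) - \mu^2 = \sum_{i=0}^{n} i^2 \, {}_nC_i \, p^i q^{n-i} - \mu^2
                  = \sum_{i=1}^{n} \frac{i \, n! \, p^i q^{n-i}}{(n-i)! \, (i-1)!} - \mu^2. \]

As a device for summing the series we assume that p and q are
independent parameters, and use the formula $E(X) = np(p + q)^{n-1}$
for the expected value of the binomial distribution. Thus

\[ \sum_{i=1}^{n} \frac{i \, n! \, p^i q^{n-i}}{(n-i)! \, (i-1)!} - \mu^2
   = p \frac{\partial}{\partial p} \left( \sum_{i=1}^{n} \frac{n! \, p^i q^{n-i}}{(n-i)! \, (i-1)!} \right) - \mu^2 \]
\[ = p \frac{\partial}{\partial p} (E(X)) - \mu^2 \]
\[ = p \frac{\partial}{\partial p} \left( np(p + q)^{n-1} \right) - \mu^2 \]
\[ = p \left[ n(p + q)^{n-1} + n(n-1)p(p + q)^{n-2} \right] - n^2 p^2 \]
\[ = pn[1 + p(n-1)] - n^2 p^2 \quad \text{(since } p + q = 1\text{)} \]
\[ = np(1 - p). \]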

The following rules for variances will be assumed:

Rules for variances

(i) $\mathrm{Var}(X + c) = \mathrm{Var}(X)$;
(ii) $\mathrm{Var}(cX) = c^2 \, \mathrm{Var}(X)$;
(iii) $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
(if X and Y are independent).
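The definitions of mean and variance translate directly into code, which gives a quick numerical check of the binomial results np and np(1 - p). This is a sketch only; `mean_and_variance` is an illustrative helper, and the values n = 10, p = 0.3 are chosen to match one of the bar charts mentioned in Section 39.3.

```python
from math import comb

def mean_and_variance(values, probabilities):
    """Return (E(X), Var(X)) using Var(X) = E(X^2) - E(X)^2."""
    mean = sum(x * p for x, p in zip(values, probabilities))
    mean_of_square = sum(x * x * p for x, p in zip(values, probabilities))
    return mean, mean_of_square - mean**2

n, p = 10, 0.3
values = range(n + 1)
# Binomial probabilities nCi p^i q^(n-i) for i = 0, ..., n.
probabilities = [comb(n, i) * p**i * (1 - p)**(n - i) for i in values]

mean, variance = mean_and_variance(values, probabilities)
print(mean, n * p)                 # both close to 3.0
print(variance, n * p * (1 - p))   # both close to 2.1
```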

39.5 Geometric distribution


Consider again a sequence of independent Bernoulli trials, explained
in Section 39.3, with in any trial a probability p of success (1) and a
probability q = 1 - p of failure (0). Suppose that we are interested
in the number of trials up to and including the first success. Call this
random variable X, and let X = i correspond to the sequence

0 0 0 ... 0 1   (with i - 1 zeros).

The probability of i - 1 failures is $q^{i-1}$ so that

\[ P(X = i) = p_i = q^{i-1} p, \quad (i = 1, 2, \ldots). \]

Unlike the binomial distribution, this distribution has an infinite
sample space. It defines

The geometric distribution

\[ P(X = i) = p_i = (1 - p)^{i-1} p, \quad (i = 1, 2, \ldots). \qquad (39.9) \]

Note that

\[ \sum_{i=1}^{\infty} p_i = p \sum_{i=1}^{\infty} q^{i-1} = p(1 + q + q^2 + \cdots) = \frac{p}{1 - q} = 1, \]

using the formula for the sum of a geometric series (see Section 1.13).
A bar chart of a geometric distribution with p = 0.2 is shown in
Fig. 39.4.

Fig. 39.4 Geometric distribution with p = 0.2.

The expected value of the random variable of the geometric
distribution is

\[ \mu = E(X) = \sum_{i=1}^{\infty} i p_i = \sum_{i=1}^{\infty} i p q^{i-1} \]
\[ = p(1 + 2q + 3q^2 + \cdots) \]
\[ = p[(1 + q + q^2 + \cdots) + q(1 + 2q + 3q^2 + \cdots)] \]
\[ = \frac{p}{1 - q} + q\mu = 1 + q\mu. \]

Hence

\[ \mu = \frac{1}{1 - q} = \frac{1}{p}. \]

In a similar manner it can be shown that the variance is given by

\[ \sigma^2 = \frac{q}{p^2}. \]
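One possible route to the variance of the geometric distribution, offered as a sketch rather than as the derivation intended in the text, is to compute $E(X(X-1))$ by differentiating the geometric series twice with respect to q, using $\mu = 1/p$ and $1 - q = p$:

```latex
\begin{align*}
E\bigl(X(X-1)\bigr) &= \sum_{i=1}^{\infty} i(i-1)\, p\, q^{i-1}
   = pq \sum_{i=2}^{\infty} i(i-1)\, q^{i-2}
   = pq \,\frac{d^2}{dq^2}\left(\frac{1}{1-q}\right)
   = \frac{2pq}{(1-q)^3} = \frac{2q}{p^2}, \\
E(X^2) &= E\bigl(X(X-1)\bigr) + E(X) = \frac{2q}{p^2} + \frac{1}{p}, \\
\sigma^2 &= E(X^2) - \mu^2 = \frac{2q}{p^2} + \frac{1}{p} - \frac{1}{p^2}
   = \frac{2q + p - 1}{p^2} = \frac{q}{p^2}.
\end{align*}
```

The last step uses $p - 1 = -q$.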

Example 39.5. In a drug-testing programme, independent and sequential
tests are conducted. Each test costs £500. The probability of success
at each test is p. However, for each test after the first there is an
additional cost per test of £200. What should p be greater than if the
expected cost of the tests should not exceed £2000?

Let X be the random variable of the number of tests up to and including
the first success. We are actually interested in a random variable
which is a function of X, namely the cost C(X), which is given by

\[ C(X) = 500X + 200(X - 1) = 700X - 200. \]

Thus C(1) = 500, C(2) = 1200, C(3) = 1900, etc. Using (39.4), the
expected value of C(X) is

\[ E(C(X)) = E(700X - 200) = 700E(X) - 200 = \frac{700}{p} - 200, \]

since X is a random variable with a geometric distribution, for which
E(X) = 1/p. This expected cost is less than £2000 if

\[ \frac{700}{p} - 200 < 2000 \quad \text{or} \quad \frac{700}{p} < 2200. \]

Hence the probability must satisfy the inequality
$p > \tfrac{7}{22} \approx 0.32$.
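The expected cost in Example 39.5 can be checked numerically in two ways: from the closed form 700/p - 200, and by summing C(i) P(X = i) directly over a long but finite range. Both function names below are hypothetical, and the truncation at 10 000 terms is an arbitrary assumption that is more than adequate for the values of p used here.

```python
def expected_cost(p, first_test=500.0, later_test=700.0):
    """E(C(X)) = 700/p - 200 when X is geometric with success probability p."""
    expected_tests = 1.0 / p          # mean of the geometric distribution
    return first_test + later_test * (expected_tests - 1.0)

def expected_cost_by_series(p, terms=10_000):
    """The same quantity, summed directly as C(i) * P(X = i)."""
    q = 1.0 - p
    return sum((700 * i - 200) * q**(i - 1) * p for i in range(1, terms + 1))

threshold = 700 / 2200                # p = 7/22, where the cost is exactly 2000
print(round(threshold, 4), round(expected_cost(threshold), 2))
print(expected_cost(0.5), round(expected_cost_by_series(0.5), 6))
```

Any p above the threshold gives an expected cost below £2000, since 700/p decreases as p increases.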

39.6 Poisson distribution


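Let X be a random variable which can take values 0, 1, 2, ... with
probability

\[ p_n = P(X = n) = \frac{\lambda^n e^{-\lambda}}{n!}, \quad (n = 0, 1, 2, \ldots). \]

This is a probability distribution since

\[ \sum_{n=0}^{\infty} p_n = \sum_{n=0}^{\infty} \frac{\lambda^n e^{-\lambda}}{n!} = e^{-\lambda} e^{\lambda} = 1, \]

using the power series for $e^{\lambda}$ (see Eqn (5.4a)). This is known
as the Poisson distribution with parameter $\lambda$. It occurs in
problems in which discrete data accumulate, such as, for example, in
the Geiger counter which records the number of radioactive particles
which hit the instrument from a radioactive source. The distribution is
appropriate for data arriving in a sequential random manner.
The Poisson distribution has mean

\[ \mu = \sum_{n=0}^{\infty} \frac{n \lambda^n e^{-\lambda}}{n!}
       = \sum_{n=1}^{\infty} \frac{\lambda^n e^{-\lambda}}{(n-1)!}
       = \lambda e^{-\lambda} \sum_{n=0}^{\infty} \frac{\lambda^n}{n!}
       = \lambda e^{-\lambda} e^{\lambda} = \lambda. \]

Its variance is

\[ \sigma^2 = E(X^2) - \mu^2 = \sum_{n=1}^{\infty} \frac{n^2 \lambda^n e^{-\lambda}}{n!} - \lambda^2
            = e^{-\lambda} \sum_{n=1}^{\infty} \frac{n \lambda^n}{(n-1)!} - \lambda^2 \]
\[          = e^{-\lambda} \lambda \frac{d}{d\lambda} \left[ \lambda e^{\lambda} \right] - \lambda^2
            = \lambda e^{-\lambda} (e^{\lambda} + \lambda e^{\lambda}) - \lambda^2 = \lambda. \]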

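Apart from being a distribution in its own right, the Poisson
distribution is also a useful approximation to the binomial
distribution (see Section 39.3) for large n. In the binomial term
${}_nC_i \, p^i q^{n-i}$ put the parameter $p = \lambda/n$. Then

\[ {}_nC_i \, p^i q^{n-i} = \left[ \frac{n(n-1)\cdots(n-i+1)}{n^i} \, \frac{\lambda^i}{i!}
   \left( 1 - \frac{\lambda}{n} \right)^{-i} \right] \left( 1 - \frac{\lambda}{n} \right)^{n}. \]

Now let $n \to \infty$. The term in the square brackets approaches
$\lambda^i/i!$ as $n \to \infty$, whilst

\[ \left( 1 - \frac{\lambda}{n} \right)^{n} \to e^{-\lambda}. \]

(This limit can be obtained by putting $h = n/\lambda$ in the
approximation for e given in Section 1.8.) Thus

\[ {}_nC_i \, p^i q^{n-i} \to \frac{\lambda^i e^{-\lambda}}{i!} \]

as $n \to \infty$. This is a useful approximation as the following
application illustrates.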

Example 39.6. Certain processors are known to have a failure rate of 1.2%.
They are shipped in batches of 150. What is the probability that a batch has
exactly one defective processor? What is the probability that it has two?
We assume that the defects are independent. We use the binomial
distribution with probability

\[ p_i(n, p) = {}_nC_i \, p^i q^{n-i} \]

with n = 150 and p = 0.012 (failure of the component is 'success' in
the binomial convention). Hence for i = 1, 2, a direct calculation gives

\[ p_1(150, 0.012) = {}_{150}C_1 (0.988)^{149} (0.012)^1 = 0.297\,891, \]

\[ p_2(150, 0.012) = {}_{150}C_2 (0.988)^{148} (0.012)^2 = 0.269\,549. \]

In this problem n is 'large', so that it is suitable for the Poisson
approximation. The parameter $\lambda$ for the corresponding Poisson
distribution is given by

\[ \lambda = np = 150 \times 0.012 = 1.8. \]

Hence the probability of one failure is

\[ \frac{\lambda}{1!} e^{-\lambda} = 1.8 \, e^{-1.8} = 0.297\,538, \]

and of two failures is

\[ \frac{\lambda^2}{2!} e^{-\lambda} = 1.62 \, e^{-1.8} = 0.267\,784, \]

which show accuracy to 2 decimal places compared with the binomial
distribution. This is more than sufficient in many applications. The
Poisson approximation avoids the rounding errors which can occur in
calculating probabilities raised to large powers.
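The comparison in Example 39.6 is easy to reproduce. The sketch below evaluates both the exact binomial probabilities and their Poisson approximations; the helper names `binomial_pmf` and `poisson_pmf` are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(i, n, p):
    """Exact binomial probability of i successes in n trials."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def poisson_pmf(i, lam):
    """Poisson probability of i events with parameter lam."""
    return lam**i * exp(-lam) / factorial(i)

n, p = 150, 0.012
lam = n * p                              # Poisson parameter, 1.8

for i in (1, 2):
    exact = binomial_pmf(i, n, p)
    approx = poisson_pmf(i, lam)
    print(i, round(exact, 6), round(approx, 6), round(abs(exact - approx), 6))
```

The differences are of the order of 0.001, consistent with agreement to two decimal places.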

39.7 Other discrete distributions


(a) The Pascal or negative binomial distribution. This is the
distribution with function

\[ p_i = {}_{i-1}C_{k-1} \, p^k (1 - p)^{i-k}, \quad (i = k, k + 1, \ldots). \]

This distribution is an extension of the geometric distribution, and
arises from the random variable which is the number of Bernoulli
trials to achieve k successes where a success occurs with probability
p. This is sometimes known as inverse sampling since the number of
successes k is specified in advance.
Its mean and variance are

\[ \mu = \frac{k}{p}, \qquad \sigma^2 = \frac{k(1 - p)}{p^2}. \]

(b) Hypergeometric distribution. Consider a box containing w white
balls and b black balls. Suppose that n balls are chosen at random
from the box without replacement. What is the probability that i
white balls are chosen? The i white balls must be chosen from w, and
the n - i black balls from b. Hence the number of samples containing
exactly i white balls is ${}_wC_i \, {}_bC_{n-i}$. By this counting
method we obtain

\[ P(X = i) = p_i = \frac{{}_wC_i \, {}_bC_{n-i}}{{}_{w+b}C_n}, \]

where

  i = 0, 1, 2, ..., n if $n \le w$,
  i = 0, 1, 2, ..., w if $n > w$.

The function $p_i$ defines the hypergeometric distribution. Its mean
and variance are given by

\[ \mu = \frac{nw}{w + b}, \qquad \sigma^2 = \frac{nwb(w + b - n)}{(w + b)^2 (w + b - 1)}. \]

The same problem with replacement leads to the binomial distribution.
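A small numerical sketch of the hypergeometric distribution follows. The values w = 5, b = 7, n = 4 are illustrative assumptions and do not come from the text. The code computes the probabilities by the counting argument, then compares the mean and variance obtained from the definitions with the standard closed forms nw/(w + b) and nwb(w + b - n)/((w + b)^2 (w + b - 1)).

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(i, w, b, n):
    """P(X = i): i white balls in a sample of n drawn without replacement
    from w white and b black balls."""
    return Fraction(comb(w, i) * comb(b, n - i), comb(w + b, n))

w, b, n = 5, 7, 4   # illustrative values, not taken from the text
pmf = {i: hypergeometric_pmf(i, w, b, n) for i in range(0, min(n, w) + 1)}

mean = sum(i * p for i, p in pmf.items())
variance = sum(i * i * p for i, p in pmf.items()) - mean**2

print("sum:", sum(pmf.values()))
print("mean:", mean, "closed form:", Fraction(n * w, w + b))
print("variance:", variance,
      "closed form:", Fraction(n * w * b * (w + b - n), (w + b)**2 * (w + b - 1)))
```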

39.8 Continuous random variables and distributions

In many applications the discrete random variable which takes its
values from a countable list is inappropriate. For example, the
random variable X could be the time from, say, t = 0 until a light
bulb fails. Whilst it would be possible to measure failure to the
nearest hour and use a discrete random variable, it is often more
convenient and more accurate to use a continuous random variable,
which is defined for the continuous variable $t \ge 0$, and is no longer
a countable list of values.
Instead of the sequence of probabilities $\{P(X = x_i)\} = \{p_i\}$, we
define a probability density function (pdf) f(x) over
$-\infty < x < \infty$ which has the properties:

Probability density function

(a) $f(x) \ge 0$, $(-\infty < x < \infty)$;
(b) $\int_{-\infty}^{\infty} f(x) \, dx = 1$;
(c) for any $x_1$, $x_2$ such that $-\infty < x_1 < x_2 < \infty$,
\[ P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x) \, dx. \qquad (39.10) \]

The random variable X can take any value of the continuous
variable x. A graph of a density function f(x) against x is shown in
Fig. 39.5. By (a) the curve must never fall below the x axis, by (b)
the area under the curve must be 1, and by (c) the probability that
X lies between two values $x_1$ and $x_2$ is the shaded area under the
graph. Unlike $p_i$, the pdf f(x) is not itself a probability.

Fig. 39.5 Probability density function.

We can associate with the pdf a cumulative distribution function
(cdf) F(x) which is defined by

Cumulative distribution function

\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(u) \, du. \qquad (39.11) \]

It represents the probability that $X \le x$. By (39.10b) it follows that

\[ F(x) \to 1 \quad \text{as } x \to \infty, \]

and

\[ P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(u) \, du = F(x_2) - F(x_1). \]

A typical cdf, which must be a non-decreasing function, is shown in
Fig. 39.6.

Fig. 39.6 Cumulative distribution function.

Example 39.7. Let X be the random variable of time to failure of a
light bulb measured from time t = 0. Assume that X has a pdf

\[ f(t) = \begin{cases} \alpha e^{-\alpha t}, & t \ge 0, \\ 0, & t < 0, \end{cases} \]

where t is measured in hours. What is the probability that the bulb has
failed by t = 10 hours? What is the probability that the light bulb
fails between t = 10 hours and t = 20 hours?
Note that f(t) is a pdf since $f(t) \ge 0$ and

\[ \int_{-\infty}^{\infty} f(t) \, dt = \int_{0}^{\infty} \alpha e^{-\alpha t} \, dt
   = \left[ -e^{-\alpha t} \right]_0^{\infty} = 1. \]

For the first question we require

\[ P(X \le 10) = \int_{0}^{10} \alpha e^{-\alpha t} \, dt
   = \left[ -e^{-\alpha t} \right]_0^{10} = 1 - e^{-10\alpha}. \]

For the second question

\[ P(10 \le X \le 20) = \int_{10}^{20} \alpha e^{-\alpha t} \, dt
   = \left[ -e^{-\alpha t} \right]_{10}^{20}
   = e^{-10\alpha} - e^{-20\alpha} = e^{-10\alpha}(1 - e^{-10\alpha}). \]

Thus the light bulb fails before 10 hours with probability
$1 - e^{-10\alpha}$, and between 10 hours and 20 hours with probability
$e^{-10\alpha}(1 - e^{-10\alpha})$.
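For a concrete feel for these formulas, the sketch below evaluates them for one assumed value of the rate parameter. The value alpha = 0.05 per hour is purely illustrative (the example leaves the parameter general), and `exponential_cdf` is a hypothetical helper name.

```python
from math import exp

def exponential_cdf(t, alpha):
    """F(t) = P(X <= t) = 1 - exp(-alpha t) for t >= 0, and 0 for t < 0."""
    return 1.0 - exp(-alpha * t) if t >= 0 else 0.0

alpha = 0.05   # assumed failure rate per hour, for illustration only

p_failed_by_10 = exponential_cdf(10, alpha)
p_fails_between_10_and_20 = exponential_cdf(20, alpha) - exponential_cdf(10, alpha)

print(round(p_failed_by_10, 4))
print(round(p_fails_between_10_and_20, 4))
```

The second probability is computed as a difference of cdf values, which is the relation $P(x_1 \le X \le x_2) = F(x_2) - F(x_1)$ from Section 39.8.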

The pdf in the previous example is the exponential distribution,
which is frequently used in 'time to failure' problems. Its cdf is
given by

\[ F(x) = \begin{cases} \int_{0}^{x} \alpha e^{-\alpha u} \, du = 1 - e^{-\alpha x}, & x \ge 0; \\ 0, & x < 0. \end{cases} \]

Note that density functions do not have to be continuous: they can
include jumps. Also if some event can only take place after a given
time, say, then we put the density function equal to zero until that
time.

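39.9 Mean and variance of continuous random variables

By analogy with that for discrete random variables, the expected
value or mean and the variance of a continuous random variable X
with pdf f(x) are defined to be:

Mean of continuous random variable:

\[ \mu = E(X) = \int_{-\infty}^{\infty} x f(x) \, dx. \]

Variance of continuous random variable:

\[ \sigma^2 = \mathrm{Var}(X) = E((X - \mu)^2) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx. \qquad (39.12) \]

For the exponential distribution with pdf given by

\[ f(t) = \begin{cases} \alpha e^{-\alpha t}, & t \ge 0, \\ 0, & t < 0, \end{cases} \]

its mean or expected value is

\[ \mu = \int_{-\infty}^{\infty} t f(t) \, dt = \int_{0}^{\infty} \alpha t e^{-\alpha t} \, dt
       = -\int_{0}^{\infty} t \frac{d}{dt} (e^{-\alpha t}) \, dt \]
\[     = -\left[ t e^{-\alpha t} \right]_0^{\infty} + \int_{0}^{\infty} e^{-\alpha t} \, dt
         \quad \text{(integrating by parts)} \]
\[     = 0 + \int_{0}^{\infty} e^{-\alpha t} \, dt = \frac{1}{\alpha}. \]

It can be shown similarly that

\[ \sigma^2 = \frac{1}{\alpha^2}. \]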
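The variance of the exponential distribution can be obtained by one further integration by parts. The following is a sketch of one way to do it, using $\mathrm{Var}(X) = E(X^2) - \mu^2$ and the mean $\mu = 1/\alpha$ of the exponential distribution with parameter $\alpha$:

```latex
\begin{align*}
E(X^2) &= \int_0^{\infty} \alpha t^2 e^{-\alpha t}\, dt
        = -\bigl[ t^2 e^{-\alpha t} \bigr]_0^{\infty} + \int_0^{\infty} 2t\, e^{-\alpha t}\, dt
        = 0 + \frac{2}{\alpha} \int_0^{\infty} \alpha t\, e^{-\alpha t}\, dt
        = \frac{2\mu}{\alpha} = \frac{2}{\alpha^2}, \\
\sigma^2 &= E(X^2) - \mu^2 = \frac{2}{\alpha^2} - \frac{1}{\alpha^2} = \frac{1}{\alpha^2}.
\end{align*}
```

So the standard deviation of the exponential distribution equals its mean.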

39.10 The normal distribution


The normal distribution with pdf defined by

Normal distribution, $N(\mu, \sigma^2)$,

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x - \mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty, \qquad (39.13) \]

is particularly important in many applications. It has a symmetrical
bell-shaped distribution about its mean $\mu$. Note also that $\sigma$
in (39.13) is its standard deviation. A typical normal distribution is
shown in Fig. 39.7. It can be verified that

\[ \int_{-\infty}^{\infty} f(x) \, dx = 1, \qquad
   \int_{-\infty}^{\infty} x f(x) \, dx = \mu, \qquad
   \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx = \sigma^2. \]

Fig. 39.7 A normal distribution.

The normal distribution $N(\mu, \sigma^2)$ is a two-parameter
distribution with its mean and standard deviation as parameters.
The standardized normal distribution is N(0, 1) with pdf

\[ f(x) = \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2}. \]
It has mean zero and standard deviation 1. Any normal random
variable X with distribution $N(\mu, \sigma^2)$ can be 'standardized' by
considering the random variable $Z = (X - \mu)/\sigma$. In the
distribution (39.13) this is equivalent to the substitution
$z = (x - \mu)/\sigma$. Thus N(0, 1) has the density

\[ \frac{1}{\sqrt{2\pi}} \, e^{-z^2/2}. \]

The standard normal curve representing N(0, 1) is shown in Fig. 39.8.
The intervals within 1, 2, and 3 standard deviations of the mean are
also shown in the figure. If Z is the corresponding random variable then
the probability that Z lies within one standard deviation of the mean
zero is the area under the curve between -1 and 1. Thus

\[ P(-1 \le Z \le 1) = \frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-z^2/2} \, dz = 0.6827, \]

but numerical integration is required to evaluate this integral.
Tables of standard normal distributions can also be used to estimate
the answer (see Appendix G).

Fig. 39.8 The standard normal curve.

Similarly

\[ P(-2 \le Z \le 2) = \frac{1}{\sqrt{2\pi}} \int_{-2}^{2} e^{-z^2/2} \, dz = 0.9545, \]

\[ P(-3 \le Z \le 3) = \frac{1}{\sqrt{2\pi}} \int_{-3}^{3} e^{-z^2/2} \, dz = 0.9973. \]
The last result implies that there is a 99.73% chance that a selected
item lies within 3 standard deviations of the mean for the standardized
normal distribution.
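The probabilities of lying within 1, 2, and 3 standard deviations of the mean can be evaluated without tables by using the error function from Python's standard library. This is a minimal sketch; it relies on the standard identity $\Phi(z) = \tfrac{1}{2}(1 + \mathrm{erf}(z/\sqrt{2}))$ for the N(0, 1) cdf, and `standard_normal_cdf` is an illustrative name.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """P(Z <= z) for Z distributed as N(0, 1), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

for k in (1, 2, 3):
    prob = standard_normal_cdf(k) - standard_normal_cdf(-k)
    print(k, round(prob, 4))
```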
The importance of the normal distribution lies in the observation
that in many measurements, which almost always involve random
experimental errors, the distribution of the errors seems to be normal
(see Section 43.3).
The cdf for the standardized normal distribution N(0, 1) is

\[ \Phi(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2} \, du, \]

whose values can be obtained from Appendix H. A graph of $\Phi(z)$
against z is shown in Fig. 39.9: it can be used to estimate
probabilities for the normal distribution.

Fig. 39.9 Cumulative distribution function for the standardized normal
distribution.

Example 39.8. The mean height of 459 university students is 180 cm with a
standard deviation of 4.2 cm. Assuming that the heights are normally distributed
estimate the number of students who have heights greater than 200 cm, and the
number who have heights between 175 cm and 185 cm.
For this sample $\mu = 180$ and $\sigma = 4.2$. Hence the normal
distribution N(180, 17.64) is given by

\[ f(x) = \frac{1}{4.2\sqrt{2\pi}} \, e^{-(x - 180)^2/(2 \times 17.64)}. \]

We can obtain the corresponding standardized normal distribution by
putting

\[ z = \frac{x - 180}{4.2}, \]

that is, by dividing by the standard deviation 4.2 (not by the variance
17.64). If x = 200 then z = (200 - 180)/4.2 = 4.76. Hence

\[ P(Z \ge 4.76) = 1 - \Phi(4.76) \approx 10^{-6}, \]

so that essentially none of the 459 students is expected to have a
height in excess of 200 cm. Similarly if x = 175 then z = -1.19, and if
x = 185 then z = 1.19. Thus

\[ P(-1.19 \le Z \le 1.19) = \Phi(1.19) - \Phi(-1.19) = \Phi(1.19) - (1 - \Phi(1.19)) \]
\[ = 2\Phi(1.19) - 1 = 2 \times 0.883 - 1 = 0.766, \]

approximately (from tables). Hence it is expected that about
0.766 x 459 ≈ 352 students will have heights between 175 cm and
185 cm.
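As a hedged numerical check on Example 39.8, the sketch below standardizes with $z = (x - \mu)/\sigma$, using the standard deviation 4.2 as in the definition of Z in Section 39.10, and evaluates the normal cdf through the error function. The helper name `normal_cdf` is illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for X distributed as N(mean, sd^2)."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

mean, sd, students = 180.0, 4.2, 459

p_above_200 = 1.0 - normal_cdf(200.0, mean, sd)
p_175_to_185 = normal_cdf(185.0, mean, sd) - normal_cdf(175.0, mean, sd)

expected_above_200 = students * p_above_200
expected_between = students * p_175_to_185

print(p_above_200, expected_above_200)
print(round(p_175_to_185, 3), round(expected_between))
```

With this standardization a height of 200 cm lies almost five standard deviations above the mean, so the expected number of such students is negligible.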

Problems

39.1. A biased coin is spun three times. The probability of a head
appearing is 0.45 and of a tail 0.55. If X is the random variable of
the number of heads shown, what is the sample space of X? What is the
probability distribution of X? Sketch a bar chart showing the
probability distribution. What is the probability that X is greater
than or equal to 1, that is $P(X \ge 1)$?

39.2. Explain why the sequence

(j = 1, 2, 3, ...)

can be interpreted as a probability distribution. If $P(X = j) = p_j$
find P(X ^ 6).

39.3. The probability of success in a sequence of independent
Bernoulli trials is /. If 12 trials take place calculate the
probabilities of 0. 1.12 successes. Calculate also the mean and
standard deviation of the random variable which is the number of
successes.

39.4. The uniform distribution has the pdf

\[ f(x) = \begin{cases} 1/(b - a), & a < x < b, \\ 0, & \text{elsewhere.} \end{cases} \]

Sketch the graphs of the pdf and its cdf. Find the mean and standard
deviation of the uniform distribution.

39.5. Prove that the variance of the geometric distribution
$p_i = (1 - p)^{i-1} p$, $(i = 1, 2, \ldots)$ is $(1 - p)/p^2$.

39.6. Components join a production assembly line in sequence. The
probability that a particular component is faulty is 0.012. How many
components (excluding the faulty component) will be expected to join
the assembly line before a faulty one is encountered? What is the
standard deviation of the number of components to failure?

39.7. A coin is spun until a tail is shown. What is the probability
that 8 heads appear before the first tail?

39.8. In a series of Bernoulli trials the probability of success is p.
Let X be the random variable of the number of trials until r successes
occur. For example, if 1 denotes success and 0 denotes failure then, in
the sequence

1001100011100100

7 successes will have occurred in 16 trials, that is X = 16 for r = 7
in this case. Show that

\[ p_i = P(X = i) = {}_{i-1}C_{r-1} \, p^r (1 - p)^{i-r} \]

for i = r, r + 1, r + 2, .... This is the negative binomial
distribution. Confirm that

\[ \sum_{i=r}^{\infty} p_i = 1. \]

Show that

\[ E(X) = \frac{r}{p}, \qquad \mathrm{Var}(X) = \frac{r(1 - p)}{p^2}. \]

39.9. In a milk-bottling plant bottles are filled with milk and their
weights checked. If a bottle is underweight or more than 4% overweight
the production line is stopped and the problem investigated. Assume
that a bottle fails randomly with the same probability p. What would be
an appropriate distribution for this problem? On average it is found
that breakdown occurs every 1503 bottles. What is the probability that
an individual bottle fails the weight test?

39.10. Suppose that the random variable X has the exponential
distribution with pdf

\[ f(t) = \begin{cases} 1.5 \, e^{-1.5t}, & t \ge 0, \\ 0, & t < 0. \end{cases} \]

Find the following probabilities
(a) P(0<X < 1); (b) P(X < 0); (c) P(X 1);
(d) P(X ^ 1); (e) P(X < 2) or P(X < 1).

39.11. Calls to a freephone information line are assumed to occur so
that the times between calls are exponentially distributed with mean
time of 20 minutes between calls. If X is the random variable of the
time between calls,
(a) What is the probability that there are no calls in a one hour
interval?
(b) What is the probability that there is at least one call within a
15 minute interval?

39.12. A Geiger counter is an instrument for counting the number of
radioactive particles emitted by a radioactive sample which strike the
instrument. In a probability model of the counter, the random variable
X, which is the number of radioactive particles detected in a given
time interval, has a Poisson distribution

\[ P(X = n) = \frac{e^{-\lambda} \lambda^n}{n!}, \quad (n = 0, 1, 2, \ldots) \]

where $\lambda$ is a parameter which characterizes the radioactivity of
the sample. Show that the mean and variance of the probability
distribution are both $\lambda$. What is the probability that 5 or more
hits occur in the time interval?

39.13. The random variable Z has a standardized normal distribution.
Estimate the following probabilities:
(a) P(Z ^ 0.8); (b) P(Z ^ 0.7); (c) $P(-0.5 \le Z \le 0.8)$.

39.14. A particular repetitive operation on a production line has a
uniform distribution (see Problem 39.4) pdf f(t) = 0.1 for
33 < t < 43, where time is measured in seconds. What are the mean time
and variance of the operation? On average what proportion of
operations take longer than 40 seconds?

39.15. It is required in an application that

\[ f(t) = \begin{cases} A(a^2 - t^2), & -a \le t \le a, \\ 0, & \text{elsewhere} \end{cases} \]

should be a pdf. What should the parameter A be in terms of a? Find the
variance of the distribution. What should A and a be for the
distribution to have a standard deviation of 1?

39.16. The time to failure of catalytic converters in exhaust systems
of cars is modelled by a normal random variable with mean of 1200
hours. If 95% of the converters are to last at least 1000 hours without
failure, what is the maximum value which the standard deviation of the
normal distribution can take?

39.17. The random variable of the time to failure in a batch of light
bulbs is assumed to be exponentially distributed with mean time to
failure of 500 hours. What is the probability that a light bulb is
still functioning after 640 hours? A room is lit by 4 light bulbs which
are not replaced as they fail. What is the probability that just 2
bulbs will still be working at 640 hours?
40 Descriptive statistics

Contents
40.1 Representing data
40.2 Random samples and sampling distributions
40.3 Sample mean and variance, and their estimation
40.4 Central limit theorem
40.5 Regression
Problems

40.1 Representing data


Statistics is a subject concerned with the collection, analysis, and
interpretation of data. Any method which seeks to interpret the data
is a branch of statistical inference. The data set usually consists of
a random sample from some larger set called a population and may
be quite a small proportion of it. The objective is to make inferences
about the population as a whole from a small sample of it. Hence
if we want to find out what the mean salary of a population is, then
a random sample of individuals is taken, and the mean of the
resulting sample is used to estimate and make inferences about the
unknown mean salary of the population. Generally the values in a
sample are known as variates. From this process of sampling the
aim is to infer properties about the whole population: this is known
as statistical inference. Any quantity calculated from the sample is
known as a statistic, and the corresponding (usually unknown),
value in the population is known as a parameter.
Let us first look at graphical ways of representing the data. Table
40.1 shows the number of vehicles crossing an automatic census cable
on a road on a particular day. The day is split into 2 hour time slots
from midnight to midnight. We can represent the data by a
histogram, in which each two-hour slot is represented by a rectangle
whose height is the frequency or the number of vehicles in this case,
and whose width is the time interval as shown in Fig. 40.1. We can
draw a frequency polygon by joining the midpoints of the tops of
the rectangles. If there are a large number of sample categories then
the polygon could be replaced by a smooth curve fitted to the data.
Given a set of data, the design of a histogram for the data is a
matter of judgement. In the example, the 24 hours was divided into
12 two-hour time intervals, but we could alternatively have collected
the data over 24 one-hour intervals. The intervals are also known as cells or
bins. The intervals should usually be of equal ‘width’. Also the
Table 40.1

Time interval Number of vehicles

00:00-02:00 6
02:00-04:00 4
04:00-06:00 9
06:00-08:00 21
08:00-10:00 24
10:00-12:00 15
12:00-14:00 16
14:00-16:00 18
16:00-18:00 29
18:00-20:00 20
20:00-22:00 16
22:00-24:00 10

Fig. 40.1 Histogram of the data in Table 40.1.

number of intervals should not be too large for the data. A working
rule is that the number of intervals should increase roughly like $\sqrt{n}$,
where n is the number of observations. In the example above there
are 188 observed vehicles, which according to the rule suggests about
14 intervals, which is close to our choice of 12.
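The working rule can be checked numerically. The following is a minimal sketch in Python (the variable names are illustrative and not part of the text) which totals the counts of Table 40.1 and applies the square-root rule:

```python
import math

# Vehicle counts from Table 40.1 (twelve two-hour slots).
counts = [6, 4, 9, 21, 24, 15, 16, 18, 29, 20, 16, 10]

n = sum(counts)                    # total number of observations
suggested = round(math.sqrt(n))    # working rule: about sqrt(n) intervals

print(n, suggested)
```

The rule is only a rough guide, so a choice of 12 intervals rather than the suggested number is a matter of judgement.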
Here is another example. A ‘snapshot’ of vehicles on a short
stretch of road is taken at the same time on the same day of the
week for a sample of days. Table 40.2 shows a frequency table of
the number of cars. The histogram is shown in Fig. 40.2. The sample
mean $\bar{x}$ of n observations $\{x_i\}$, where $x_i$ occurs with frequency $f_i$, is
given by

\[ \bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}, \]

which is equivalent to the average, or mean, of the total set of
observations, since there must be $\sum_{i=1}^{n} f_i$ of them. For the traffic
census in Table 40.2

Table 40.2

Number of vehicles ($x_i$)    Frequency ($f_i$)

0 12
1 15
2 13
3 8
4 5
5 3
6 2
7 1

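Fig. 40.2 Histogram of Table 40.2.

\[ \bar{x} = \frac{(0 \times 12) + (1 \times 15) + (2 \times 13) + (3 \times 8) + (4 \times 5) + (5 \times 3) + (6 \times 2) + (7 \times 1)}{12 + 15 + 13 + 8 + 5 + 3 + 2 + 1} = \frac{119}{59} \approx 2.02. \]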

The mean $\bar{x}$ of this sample will be an estimate for the true
population mean. If the samples are not classified into categories
then $f_i = 1$ and, as before,

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \]

where n is now the number of samples.


There are other measures of the central characteristics of samples.
The mode is the number which occurs most often in a sample, and
is therefore most likely to occur. Thus in Table 40.2, the number 1
appears most often (15 times), so that the mode of this data is 1.
The central item in an ordered list of sample values is known
as the median. Suppose that a list of examination marks is given by:

Examination marks: 31, 36, 38, 39, 45, 46, 57, 60, 65, 65, 69, 72, 75, 79

in increasing order. If the sample has an odd number of items, say
2n + 1, then the median is the (n + 1)th item; if the number is even,
say 2n, then it is defined to be the average of the nth and the
(n + 1)th numbers in the ranking. In the list of examination
marks above the median is $\tfrac{1}{2}(57 + 60) = 58.5$. The mode is 65 but
it would be a number of no particular significance in this list, since
the mode contains only two marks.
In Table 40.2, there are 59 numbers from 0 to 7 consisting of 12
zeros, 15 ones, etc. The median is the 30th number which will be one
of the twos. Hence the median is 2.
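The three measures of central tendency for Table 40.2 can be reproduced in a few lines. This is a sketch in Python, assuming the frequency table is stored as a dictionary; the names `freq`, `expanded` and `med` are illustrative only:

```python
from statistics import median

# Frequency table from Table 40.2: value -> frequency.
freq = {0: 12, 1: 15, 2: 13, 3: 8, 4: 5, 5: 3, 6: 2, 7: 1}

total = sum(freq.values())                           # number of observations
mean = sum(x * f for x, f in freq.items()) / total   # frequency-weighted mean
mode = max(freq, key=freq.get)                       # most frequent value

# Expand the table into the full ordered list to read off the median.
expanded = [x for x, f in sorted(freq.items()) for _ in range(f)]
med = median(expanded)

print(total, round(mean, 2), mode, med)
```

Expanding the table is wasteful for large data sets, but it keeps the connection with the definition of the median as the central item of the ordered list.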
The box plot displays graphically important features of data such
as the median, the spread, and symmetry of the data, and is
particularly useful in comparing different data sets, as for example
in the results in a series of associated examination papers. Suppose
that the examination marks in three papers are as percentages (each
in increasing order) as shown in Table 40.3. We first find the median

Table 40.3

Examination marks (0-100)

Paper 1 (16 results)   27, 40, 46, 48, 55, 55, 56, 58, 61, 63, 64, 66, 68, 69, 72, 78
Paper 2 (11 results)   30, 38, 39, 48, 58, 61, 64, 68, 69, 70, 81
Paper 3 (9 results)    26, 40, 43, 54, 56, 61, 62, 72, 74

of the marks in each paper. Thus the medians are 59.5 for Paper 1,
61 for Paper 2 and 56 for Paper 3.
Suppose that the data contain 2n observations. Then the first
quartile is the median of the n smallest observations, and the third
quartile the median of the n largest observations. If the data contain
2n + 1 observations then the first quartile is the median of the n + 1
smallest observations and the third quartile the median of the n + 1
largest observations. (The second quartile is the median.) The
quartiles divide the observations into four approximately equal
numbers of observations.
For the examination marks paper by paper the quartiles are given
in Table 40.4. The difference between the third and first quartiles is

Table 40.4

First quartile Median Third quartile

Paper 1    51.5    59.5    67
Paper 2    43.5    61      68.5
Paper 3    43      56      62
Fig. 40.3 Box plots for three examination papers (Paper 1, Paper 2, Paper 3) on a vertical scale from 0 to 100.

a measure of the spread of the data, and is known as the interquartile
range.
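The quartile rule described above (the median of the n smallest or largest values for 2n observations, and of the n + 1 smallest or largest for 2n + 1 observations) can be sketched in Python as follows. The function name `quartiles` is illustrative; note that statistical libraries often use slightly different quartile conventions, so their results need not agree with this rule.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the halves rule described in the text."""
    xs = sorted(data)
    half = (len(xs) + 1) // 2   # n for 2n items, n + 1 for 2n + 1 items
    return median(xs[:half]), median(xs), median(xs[-half:])

papers = {
    "Paper 1": [27, 40, 46, 48, 55, 55, 56, 58, 61, 63, 64, 66, 68, 69, 72, 78],
    "Paper 2": [30, 38, 39, 48, 58, 61, 64, 68, 69, 70, 81],
    "Paper 3": [26, 40, 43, 54, 56, 61, 62, 72, 74],
}

for name, marks in papers.items():
    q1, q2, q3 = quartiles(marks)
    print(name, q1, q2, q3, "IQR =", q3 - q1)
```

The printed quartiles can be compared with Table 40.4, and the final column gives the interquartile range for each paper.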
Create a vertical scale 0-100 as shown in Fig. 40.3. For each
paper, position a box such that its upper edge is level with the third
quartile on the scale, and its lower edge is level with the first quartile.
The line across the middle of the box is the median. Extend each
box by a line to the extreme marks above and below the box. These
lines are known as whiskers. Visually we can see how the average
and spread of the marks compare. A compressed box indicates poor
discrimination in the marks, and long whiskers might indicate
exceptional successes or failures (often known as outliers). Examiners
may wish to take remedial action by scaling in the light of the
comparative box plots if there are candidates in common among
the papers.

40.2 Random samples and sampling distributions


One aim in statistics for a set of data is to fit a probability model
to it so that inferences can be drawn concerning the data. This
usually requires the selection of a probability distribution to model
the data, often on the basis of minimal information. Having chosen
the distribution, the parameter values of the distribution have to
be estimated from the data. The set of data is the only hard information.
Consider a simple yes/no poll of the population in the UK. The
question is asked: are you in favour of the UK joining the European
Monetary Union (EMU)? The answer must be either ‘yes’ or ‘no’:
‘don’t know’ responses are not permitted. The question could be put
to the whole population in a referendum (at considerable expense),
but if we are interested just in an opinion poll then the question
could be put, say, to a random sample of 500 individuals chosen
randomly from the population. (There is the question of how this can
be achieved but we will not dwell on this polling problem.) Suppose
that in the sample 56% say ‘yes’ and 44% say ‘no’. We could conclude
that the population is in favour but we could also ask how much weight
should be attached to this poll result. Is it far enough away from the
50% critical value, for example, for a confident prediction?
More information might be obtained if we took a number of
500-person polls from the population, and examined the distribution
of the random variable X representing the number of votes for
EMU. This distribution is known as the sampling distribution of X.
We could model the posing of the question to 500 individuals as a
series of 500 independent Bernoulli trials which has a binomial
distribution (see Section 39.3) with parameters n = 500 and an
unknown probability p for the number of ‘yes’ votes.
For a single poll with n persons, the binomial distribution (the
probability of i yes votes) is

\[ \frac{n!\, p^{i} (1 - p)^{n-i}}{(n - i)!\, i!} \qquad (i = 0, 1, 2, \ldots, n). \]

The mean of the binomial distribution, which is the mean number
of yes votes, is np.
We can estimate p from each poll. Since the mean for the binomial
distribution is np, it seems reasonable to estimate p as X/n. We shall
denote an estimator for p from a sample by $\hat{p}$. The value taken by the
estimator in a particular sample is known as an estimate. The symbol $\hat{p}$
is used for both the random variable and its value.
One test of whether we are looking at an appropriate measure of
the probability p is the behaviour of the expected value of $\hat{p}$. Thus

\[ E(\hat{p}) = E(X/n) = \frac{1}{n} E(X) = \frac{np}{n} = p, \]

since we are assuming a binomial distribution. Hence the expected
value of the estimates gives the probability p, the mean of the
sampling distribution of $\hat{p}$. Generally, if the expected value of the
estimate equals the parameter being estimated, then the estimate is
called unbiased: if this is not the case then the estimate is called biased.
The spread of the estimate can be found by calculating the
expected value $E[(\hat{p} - p)^2]$. Then

\[ E[(\hat{p} - p)^2] = E[\{\hat{p} - E(\hat{p})\}^2] = \mathrm{Var}[\hat{p}] \]

by (39.5). Hence the variance of the sampling distribution of $\hat{p}$ is
given by

\[ \mathrm{Var}[\hat{p}] = \mathrm{Var}\left[\frac{X}{n}\right] = \frac{1}{n^2}\, \mathrm{Var}[X] \quad \text{(by (39.8(h)))} \]

\[ = \frac{np(1 - p)}{n^2} = \frac{p(1 - p)}{n} \]

for the binomial distribution. As we might expect, the variance of
the sample means decreases with increasing sample size.
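Given the one sample at the beginning of this section, the estimate
for p is $\hat{p} = 0.56$. The estimated variance of this single sample,
replacing p by $\hat{p}$, is

\[ \mathrm{Var}[\hat{p}] \approx \frac{\hat{p}(1 - \hat{p})}{n} = \frac{0.56 \times 0.44}{500} \approx 0.0005. \]

The corresponding standard error for $\hat{p}$ is $\sqrt{\mathrm{Var}(\hat{p})} \approx 0.022$.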
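The arithmetic for the poll can be reproduced directly. The following Python sketch uses illustrative variable names and simply evaluates the estimated variance and standard error for the single 500-person sample:

```python
import math

n = 500         # sample size of the poll
p_hat = 0.56    # observed proportion of 'yes' answers

var_hat = p_hat * (1 - p_hat) / n   # estimated variance of the estimator
std_err = math.sqrt(var_hat)        # corresponding standard error

print(f"{var_hat:.6f} {std_err:.4f}")
```

Since the observed proportion lies 0.06 above the critical value 0.5, which is between two and three standard errors, the calculation gives some feeling for the weight that might be attached to the poll.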

40.3 Sample mean and variance, and their estimation
The general sampling problem is as follows. Suppose that we have
a sample of values $x_1, x_2, \ldots, x_n$ of random variables $X_1, X_2, \ldots, X_n$
taken from a population. The samples are random and they are
assumed to be independent of each other. We assume that the
sampling distribution can be modelled by a known probability
distribution, perhaps one of the distributions discussed in Chapter
39. Estimates, preferably unbiased, are required for the mean and
variance of the underlying distribution.
An obvious choice for the mean is simply the sample mean, which
is the average value of the sample.
The sample mean is a random variable defined by

Sample mean for sample size n

\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}. \tag{40.3} \]

If $x_1, x_2, \ldots, x_n$ are values obtained in a particular sample then its
mean is

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i. \]

What is the relation between the sample mean and the mean of
the population? The expected value of $\bar{X}$ is

\[ E(\bar{X}) = E\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\mu = \mu. \]

As we might expect, the expected value of the sample mean is the
same as the mean of the population.
The variance of the sample mean is, by (39.6),

\[ \mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n}, \tag{40.4} \]

where $\sigma^2$ is the unknown variance of the population. Its standard
deviation $\sigma/\sqrt{n}$ is known as the standard error of the sample mean.
We also need an estimate for the variance $\sigma^2$ of the population.
We might choose

\[ T^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2, \]
which is the variance of the sample, but is it unbiased? In other
words does its expected value equal $\sigma^2$? The following algebra
supplies the answer:

\[ E(T^2) = E\left[\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2\right] \]

\[ = E\left[\frac{1}{n} \sum_{i=1}^{n} \{(X_i - \mu) - (\bar{X} - \mu)\}^2\right] \]

\[ = E\left[\frac{1}{n} \sum_{i=1}^{n} \{(X_i - \mu)^2 - 2(X_i - \mu)(\bar{X} - \mu) + (\bar{X} - \mu)^2\}\right] \]

\[ = \frac{1}{n} E\left[\sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2\right] \]

\[ = \frac{1}{n} \left[n\sigma^2 - nE[(\bar{X} - \mu)^2]\right] \]

\[ = \sigma^2 - \mathrm{Var}(\bar{X}) = \sigma^2 - \frac{\sigma^2}{n} = \frac{n - 1}{n}\, \sigma^2, \]

using (40.4) in the last line. In other words the sample variance $T^2$
is not an unbiased estimator of the variance of
the sampling distribution: there is a correction factor of (n − 1)/n.
A better statistic for an unbiased estimator of the variance is:

Estimator for the sample variance

\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}. \tag{40.5} \]

For large samples the difference between $T^2$ and $S^2$ is small but it
can be significant for small sample sizes. The estimator is often
known simply as the sample variance.
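The correction factor (n − 1)/n can be confirmed exactly for a small case. The sketch below is in Python and rests on an assumption chosen purely for illustration: the population is the score on one fair die and the sample size is 3. It enumerates every equally likely sample and averages the two statistics using exact fractions.

```python
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4, 5, 6]   # illustrative population: one fair die
n = 3                         # illustrative sample size

mu = Fraction(sum(values), len(values))
sigma2 = sum((Fraction(v) - mu) ** 2 for v in values) / len(values)

samples = list(product(values, repeat=n))   # all equally likely samples

def sum_sq_dev(sample):
    """Sum of squared deviations of a sample from its own mean."""
    xbar = Fraction(sum(sample), n)
    return sum((x - xbar) ** 2 for x in sample)

# Exact expected values of the divisor-n and divisor-(n - 1) statistics.
mean_T2 = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
mean_S2 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(sigma2, mean_T2, mean_S2)
```

The average of the divisor-n statistic falls short of the population variance by the factor (n − 1)/n, whereas the divisor-(n − 1) statistic averages to the population variance itself.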

40.4 Central limit theorem


The normal distribution was introduced in Section 39.10, and
its importance in the context of random errors was hinted at there.
The pdf for a normal distribution with mean $\mu$ and variance $\sigma^2$ is
(eq. (39.10)):

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2/(2\sigma^2)}. \]

The central limit theorem (which will not be proved here) states that
if random samples are taken from a distribution with mean $\mu$ and
standard deviation $\sigma$, then the sampling distribution of the random
variable $\bar{X}$ of the sample mean will be normally distributed with
mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ as $n \to \infty$ whatever the
original distribution of the $X_i$. Analytically this can be expressed as:

Central limit theorem

\[ \lim_{n \to \infty} P\left(a \le \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma} \le b\right) = \frac{1}{\sqrt{2\pi}} \int_{a}^{b} e^{-\frac{1}{2}u^2}\, du. \tag{40.6} \]

In this result $(\bar{X} - \mu)\sqrt{n}/\sigma$ is the standardized random variable
derived from $\bar{X}$, and for large n it is normally distributed.
As we have already stated the true significance of this result is
that it is independent of the distribution of each $X_i$, which need not
be normal.
This result can be illustrated in the case of the throwing of n dice
in which the frequencies of average scores are kept. The probabilities
can be computed (see Project 40.4 in Chapter 41) easily for small
values of n. For example, if n = 2, then the possible average scores
and the probabilities with which they occur are given in Table 40.5
and Fig. 40.4a. Graphs for n = 2, 4, 6 computed using a programme
to generate the bar charts are shown in Fig. 40.4. The bounding
curve begins to show for n = 6 the familiar shape of the normal
distribution.

Table 40.5

Average score   1     3/2   2     5/2   3     7/2   4     9/2   5     11/2  6
Probability     1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
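The probabilities of the average scores can be generated by direct enumeration for small n. The following is a sketch in Python rather than the programme used for the figures; the function name `average_score_distribution` is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def average_score_distribution(n):
    """Exact probabilities of each average score when n fair dice are thrown."""
    totals = Counter(sum(faces) for faces in product(range(1, 7), repeat=n))
    outcomes = 6 ** n
    return {Fraction(total, n): Fraction(count, outcomes)
            for total, count in sorted(totals.items())}

dist2 = average_score_distribution(2)
for score, prob in dist2.items():
    print(score, prob)
```

The fractions are printed in lowest terms, so that 2/36 appears as 1/18. Calling the function with n = 4 or n = 6 gives the values needed for the corresponding bar charts, although enumeration becomes slow as n grows.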

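Fig. 40.4 Probabilities versus average scores for the rolling of 2, 4, and 6 dice: (a) n = 2; (b) n = 4; (c) n = 6.

Example 40.2. A die is rolled 6000 times. The number T of times face 1 appears
is counted. Find $m_1$ and $m_2$ in $P(m_1 < T < m_2)$ in order that T should lie within
one standard deviation of its mean value 1000.
Let X be the random variable which takes the value 1 if one appears face up on
the die, and 0 otherwise. Then $E(X) = \tfrac{1}{6}$. Its variance is given by

\[ \sigma^2 = \mathrm{Var}(X) = E(X^2) - E(X)^2 = \tfrac{1}{6} - \tfrac{1}{36} = \tfrac{5}{36}. \]

By the central limit theorem

\[ P\left(k_1 \le \frac{T/6000 - \tfrac{1}{6}}{\sigma/\sqrt{6000}} \le k_2\right) \approx \frac{1}{\sqrt{2\pi}} \int_{k_1}^{k_2} e^{-\frac{1}{2}u^2}\, du, \]

or

\[ P\left(k_1 \sigma \sqrt{6000} + 1000 \le T \le k_2 \sigma \sqrt{6000} + 1000\right) \approx \frac{1}{\sqrt{2\pi}} \int_{k_1}^{k_2} e^{-\frac{1}{2}u^2}\, du. \]

The standard deviation of T is $\sigma\sqrt{6000}$, so T lies within one standard
deviation of its mean when

\[ k_1 = -1, \qquad k_2 = 1, \]

and by the normal distribution table (Appendix G) the probability of this is
approximately 2 × 0.8413 − 1 = 0.6826.
Hence

\[ m_1 = -\frac{\sqrt{5}}{6} \sqrt{6000} + 1000 \approx 971, \qquad m_2 \approx 1029. \]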
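The interval lying one standard deviation either side of the mean of T, together with the normal probability attached to it, can be evaluated numerically. This Python sketch uses the standard library `NormalDist` class in place of the printed table; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 6000, 1 / 6
mean = n * p                      # expected number of ones
sd = math.sqrt(n * p * (1 - p))   # standard deviation of T

m1, m2 = mean - sd, mean + sd     # one standard deviation either side
prob = NormalDist().cdf(1) - NormalDist().cdf(-1)

print(round(m1), round(m2), round(prob, 4))
```

The probability here comes from the normal approximation, not from the exact binomial distribution of T, so it should be read as an estimate.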

40.5 Regression
Suppose that we have a set of data in which one quantity is measured
in relation to another quantity. For example, the fuel consumption
of a car will vary with the speed of the car, or the weight of an
individual will vary with the height of the person. We may wish to
speculate as to what the relationship is between two (or more)
quantities.
Suppose that a sample of measurements is taken (for example,
fuel consumption (y) for different speeds (x) of a car). This leads to
the paired data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, in which one or both
variables may contain random errors. We can obtain an idea of the
likely relationship between x and y by plotting the coordinates
$(x_i, y_i)$ as points in rectangular cartesian coordinates, giving what
is known as a scatter diagram. Some examples are shown in Fig.
40.5. If we fit a curve to the data shown in the scatter diagrams,
then we might guess a straight line fit to the data in Fig. 40.5a, and a
curve in Fig. 40.5b, whereas in Fig. 40.5c, which shows data centred
around a point, we might feel that no relationship exists between the
variables. Often in scientific experiments the relationship between
the variables can be inferred from some underlying theory although
parameters may be unknown. For example, it might be known that
the formula relating x and y is linear so that we need to find the
best straight line fit to the data. For others we might need to guess
the likely shape of the curve from the scatter of the data as in
Fig. 40.5b.

Fig. 40.5 Scatter diagrams.

In some data sets there can be errors in both measurements. In
others, one variable, known as the controlled or independent variable
x, is specified (measurements could be made at specified times
which are known accurately) and y, which will contain random
errors, is known as the response or dependent variable. In the fuel
efficiency tests the speed of the car could be measured accurately
(controlled variable), but the fuel consumption (response variable)
might be affected by other factors (ambient temperature, engine
tuning, etc). On the other hand in the height/weight data the
measurements could be accurate, although the weight could vary
over time. There is unlikely to be a ‘formula’ relating height and
weight (there may be other parameters involved) but nevertheless
it is useful to have a working relation between the two for life
tables used by insurance companies. The process of estimating the
response variable from a set of controlled variables is known as
regression.
If the hypothesis is that the data follow a straight line relationship
then the model is known as a linear regression model. This regression
model assumes that the random variable Y of the data $\{y_i\}$ is given
by

\[ Y = ax + b + \varepsilon, \]

where a and b are unknown parameters and $\varepsilon$ is a random
error with mean 0 and unknown variance $\sigma^2$. Note that the variance
of Y is

\[ \mathrm{Var}(Y) = \mathrm{Var}(ax + b + \varepsilon) = \mathrm{Var}(ax + b) + \mathrm{Var}(\varepsilon) = \mathrm{Var}(\varepsilon) = \sigma^2. \]

With x as a controlled variable, the vertical deviation of the point
$(x_i, y_i)$ from the line is

\[ \varepsilon_i = y_i - (a x_i + b). \]

We use the method of least squares for the sum of the squares of
the deviations, which requires the minimum of

\[ f(a, b) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - a x_i - b)^2 \]

(see Section 27.7 for a full derivation of a and b). The minimum
occurs where $\partial f/\partial a = \partial f/\partial b = 0$ and, as in (27.10), the best straight
line fit is given by the solution of

\[ a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i, \]

\[ a \sum_{i=1}^{n} x_i + bn = \sum_{i=1}^{n} y_i. \]
The solutions of these equations are the least squares estimates
for a and b, and using the notation for estimators we shall distinguish
them by $\hat{a}$ and $\hat{b}$:

Least squares estimates:

\[ \hat{b} = \bar{y} - \hat{a}\bar{x}, \qquad \hat{a} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}, \tag{40.7} \]

where

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i. \]
The least squares regression estimator $\hat{y}$ is given by

\[ \hat{y} = \hat{a}x + \hat{b}, \]

and this can be used to estimate y for other values of x. It also
defines the equation of the regression line of y on x through the data.
The regression line of x on y, which generally will be a different line,
can be found similarly.
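The least squares estimates translate directly into a short routine. The following Python sketch evaluates the two estimates from the sums in (40.7); the function name `least_squares_line` and the small data set are hypothetical and are included only to illustrate the calculation.

```python
def least_squares_line(xs, ys):
    """Return (a_hat, b_hat) for the regression line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    a_hat = sxy / sxx                 # slope estimate
    b_hat = y_bar - a_hat * x_bar     # intercept estimate
    return a_hat, b_hat

# Hypothetical illustrative data scattered about the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

a_hat, b_hat = least_squares_line(xs, ys)
print(round(a_hat, 3), round(b_hat, 3))
```

For data lying exactly on a straight line the routine returns the slope and intercept of that line. The regression line of x on y can be obtained by interchanging the two arguments.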
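The estimates $\hat{a}$ and $\hat{b}$ have been obtained by least squares. Are
they unbiased estimators of a and b? We can decide the answer to
this question by finding their expected values. Thus, noting that $Y_i$
is the random variable with value $y_i$ and that $x_i$ is a controlled
variable,

\[ E(\hat{a}) = E\left[\frac{\sum_{i=1}^{n} x_i Y_i - n\bar{x}\bar{Y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}\right] \]

\[ = \frac{E\left[\sum_{i=1}^{n} x_i (a x_i + b + \varepsilon_i) - \bar{x} \sum_{i=1}^{n} (a x_i + b + \varepsilon_i)\right]}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} \]

\[ = \frac{\sum_{i=1}^{n} x_i (a x_i + b) - \bar{x} \sum_{i=1}^{n} (a x_i + b)}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} \qquad \text{(since } E(\varepsilon_i) = 0\text{)} \]

\[ = \frac{a \sum_{i=1}^{n} x_i^2 + bn\bar{x} - na\bar{x}^2 - nb\bar{x}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = a. \]

Also, by (40.7) and the result for $E(\hat{a})$ above,

\[ E(\hat{b}) = E(\bar{Y} - \hat{a}\bar{x}) \]

\[ = \frac{1}{n} E\left[\sum_{i=1}^{n} (a x_i + b + \varepsilon_i)\right] - \bar{x} E(\hat{a}) \]

\[ = \frac{1}{n} \sum_{i=1}^{n} (a x_i + b) - \bar{x} a \]

\[ = a\bar{x} + b - \bar{x} a = b. \]

Hence $\hat{a}$ and $\hat{b}$ are unbiased estimators of a and b respectively.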


Regression lines are most easily determined and compared with
the data by using computer software. Whilst we have only discussed
regression lines, in many applications regression curves are more
appropriate, but the important point is that they must be linear in
the parameters.

Problems

40.1. Find the mean, median, first and third quartiles,
and the interquartile range of the following two data sets:
(a) 10, 11, 11, 15, 17, 20, 25, 25, 27, 30, 38, 42, 47;
(b) 5, 12, 15, 16, 20, 29, 29, 32, 39, 44.
Draw box plots for both sets of data.

40.2. In a university degree examination with four papers
each taken by 20 candidates the percentage marks are
as shown in Table 40.6. Draw comparable box plots for
the results.

Table 40.6

Examination marks (0-100)

Paper 1   24, 27, 27, 30, 40, 42, 48, 55, 58, 60, 61, 63, 64, 66, 66, 68, 69, 72, 78, 85
Paper 2   30, 35, 36, 38, 39, 40, 44, 45, 48, 51, 54, 58, 61, 64, 65, 65, 69, 70, 81, 90
Paper 3   26, 29, 30, 35, 36, 37, 46, 48, 49, 49, 50, 54, 56, 61, 69, 70, 71, 71, 72, 74
Paper 4   10, 20, 22, 34, 41, 44, 45, 45, 45, 50, 55, 55, 55, 56, 64, 65, 66, 70, 85, 91

40.3. Samples of packets of crisps are weighed at the end
of a manufacturing process. Packets have to contain a
minimum of 25 g. The sample weights are
25.1, 25.3, 25.0, 25.7, 25.3, 25.2, 25.1, 25.5, 25.7, 25.1.
Calculate the sample mean, mode, and standard deviation.

40.4. In a continuous production process a machine cuts
pipes into nominal lengths of 10 metres. The actual
lengths in a production run are given in Table 40.7.
Draw a histogram over (a) 10 intervals of width 0.1
metres, (b) 5 intervals of width 0.2 metres. Add a
frequency polygon to both histograms.

Table 40.7

Length interval      Frequency of pipes
9.5 ≤ x < 9.6        1
9.6 ≤ x < 9.7        4
9.7 ≤ x < 9.8        5
9.8 ≤ x < 9.9        12
9.9 ≤ x < 10.0       20
10.0 ≤ x < 10.1      21
10.1 ≤ x < 10.2      15
10.2 ≤ x < 10.3      11
10.3 ≤ x < 10.4      5
10.4 ≤ x < 10.5      2

40.5. In an experiment 127 observations are taken which
can be assigned to a maximum of 36 intervals. If you
wish to display the data in a histogram, what would be
a suitable number of intervals to use?

40.6. A random variable X has a uniform distribution
(see Problem 39.4) with pdf

\[ f(x) = \begin{cases} 1, & 1 < x \le 2; \\ 0, & \text{otherwise.} \end{cases} \]

A random sample of size 35 is taken. Find the mean and
estimate the variance of the sample. What can you say
about the distribution of the sample mean?

40.7. A random sample is taken from a population
which has mean $\mu$ and variance $\sigma^2$. The sample values are
9.71, 10.26, 9.80, 9.85, 9.99, 10.10, 9.79.
Estimate the sample mean and the sample variance, $\sigma^2$.

40.8. A die is thrown 9000 times, and the number of
times face one appears is recorded. If T is the random
variable for the number of ones in 9000 throws, calculate
$k_1$ and $k_2$ such that

\[ P(1460 < T < 1540) = \frac{1}{\sqrt{2\pi}} \int_{k_1}^{k_2} e^{-\frac{1}{2}x^2}\, dx. \]

40.9. Fuel consumption figures for standard urban
cycles of a selection of cars together with their weights
are given in Table 40.8. Find the least squares estimator
for a regression line of fuel consumption (c) on weight
(w).
An unbiased estimator for the variance in linear
regression is given by

\[ \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{n - 2}, \]

where $\hat{y}_i = \hat{a} x_i + \hat{b}$. Estimate the variance of the regression
line.
One point is some distance from the regression
line (such rogue values are known as outliers). If
this particular vehicle is excluded from the data
how are the regression line and the estimated variance
affected?

Table 40.8

Vehicle   Weight, w (kg)   Fuel consumption, c (km/litre)
A         2100             4.96
B         1350             9.10
C         1008             12.04
D         1323             7.68
E         710              15.15
F         1215             10.98
G         1436             7.75
H         1561             8.25
I         2120             4.85
J         1975             4.64
K         1535             5.56
PART VIII

Applications projects using symbolic computing
Contents
41.1 Symbolic computation 715
41.2 Projects 716

41.1 Symbolic computation


There have been a number of significant advances in symbolic
computation and computer algebra manipulation in recent years.
These are systems which bring together symbolic, numerical, and
graphical operations in one software package. The mathematical
methods introduced in this book are particularly appropriate
contexts in which to have a first look at such systems.
The software Mathematica† has been used extensively in the
production of the drawings of curves and surfaces, and in the
checking of examples and problems in this text. At an elementary
level, Mathematica is particularly helpful, for example, with
operations such as differentiation (including partial derivatives), the
construction of Taylor series, elementary algebraic operations
involving matrices and linear equations, elementary integration
(including repeated integrals), and difference equations; but most
topics in this book can be approached to some extent using
Mathematica. It is also useful in curve sketching in that a quick view
of the general features of a curve can be obtained, which can then
be revised and edited to produce detailed graphs as required.
It is not the purpose of this book to provide an introduction to
Mathematica. There are a number of texts which do, including the
handbook that comes with the system.
A few useful titles are listed below:

Wolfram, S. (1991). Mathematica: a system for doing mathematics
by computer (2nd edn). Addison-Wesley, Redwood City, California.
Abell, M. L., and Braselton, J. P. (1992). Mathematica by example.
Academic Press, San Diego, California.
Blachman, N. (1992). Mathematica: a practical approach. Prentice-
Hall, Englewood Cliffs, New Jersey.
Skeel, R. D., and Keiper, J. B. (1993). Elementary numerical
computing with Mathematica. McGraw-Hill, New York.

† Mathematica is a registered trade mark of Wolfram Research Inc.

41.2 Projects
The following projects are listed by chapter. They are selective
samples of problems and do not cover every topic in the book. The
intention is that they can be approached using mainly built-in
Mathematica commands: very few problems require programming
in Mathematica. It is generally inadvisable to attempt these problems
by hand, since many could involve a great deal of manipulation,
although some projects are prompted by examples and problems in
the relevant chapters. Some commands which might be used for the
projects are listed in the Answers to Selected Problems, p. 747,
together with sample programs and outputs for selected projects.
It is worth emphasizing that computer algebra systems usually
generate outputs or answers without explanation of how the outputs
were arrived at, unless the programming within them is investigated.
Outputs can go wrong for many mathematical reasons. For example,
a curve can oscillate too frequently for the built-in point spacing to
detect, which can result in a false graph. This can be corrected by
increasing the number of plot points, but the potential difficulty has
to be recognized at the formulation stage. Symbolic computation is
not a substitute for understanding mathematical techniques.
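The danger of too coarse a point spacing can be seen in a small numerical experiment. The sketch below is in Python, not Mathematica, and the function sin(20πx), the interval and the numbers of sample points are hypothetical choices made only to show the effect: with 11 equally spaced points every sample falls on a zero of the function, so a plot through them would appear flat.

```python
import math

def sample(f, a, b, points):
    """Evaluate f at equally spaced points on the interval [a, b]."""
    step = (b - a) / (points - 1)
    return [f(a + i * step) for i in range(points)]

def f(x):
    # Hypothetical rapidly oscillating function: ten full cycles on [0, 1].
    return math.sin(20 * math.pi * x)

coarse = sample(f, 0.0, 1.0, 11)    # every sample lands on a zero
fine = sample(f, 0.0, 1.0, 201)     # enough points to resolve the cycles

print(max(abs(v) for v in coarse), max(abs(v) for v in fine))
```

Plotting routines that refine the spacing adaptively reduce this risk but cannot remove it entirely, which is why the difficulty has to be recognized when the problem is formulated.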
Mathematica notebooks for each project are available on the
web at
http://www.keele.ac.uk/depts/ma/mathtech/
or on disk, both in PC versions. Mathematica version 2.2.3 at least
is required. Enquiries about the disk should be directed to
Department of Mathematics, Keele University, Keele, Staffordshire
ST5 5BG, UK. (Email: [email protected])

Chapter 1
1. Draw the graphs of $y = x^3$, $y = (x - 1)^3$, $y - 1 = x^3$, $y - 1 =
(x - 1)^3$ for $-1.5 \le x \le 2.5$. How do they differ?
2. (a) Plot the points $(n, n^2 + 1)$ for n = 1, 2, 3, 4, 5.
(b) Plot the points in (a) but with successive points joined by straight
lines.
(c) Plot $y = x^2$ between x = 0 and x = 5.
(d) Show the curves from (b) and (c) on the same graph.
3. Plot curves defined by the following relations between x and y.
(a) $x^2 + 3y^2 = 4$; $-2 \le x < 2$;
(b) $x^2 + 2y^2 - xy + 2y = 4$; $-3 < x \le 3$;
(c) $x^4 + 2y^2 - xy - 2x^2 y = 4$; $-2 < x < 3$.
4. Define the function $f(x) = x(1 - x^2)$. Plot the graphs
(a) $y = f(x)$; (b) $y = f(1 - x)$; (c) $y = f(-x)$; (d) $y = f(|x|)$,
all for $-2 \le x \le 2$.

5. Define the Heaviside function H(t) and the signum function sgn t.
Plot graphs of the following functions on $-4 \le t < 4$:
(a) H(t); (b) sgn t; (c) H(t) + H(−t); (d) sgn(sin t).

6. Plot the graphs of the curves defined by the following polar
equations:
(a) $r = \tfrac{1}{2}(1 - \cos\theta)$ for $0 < \theta \le \pi$ (cardioid).
(b) $r = (4\sin^2\theta - 1)\cos\theta$ for $0 \le \theta < 2\pi$ (folium).

7. Express

\[ \frac{1}{(x - 1)(x - 2)(x - 3)(x - 4)(x - 5)} \]

in partial fractions.

Chapter 2
1. Define the function

\[ f(x) = \frac{x \sin x - 1 + \cos x}{\sin 2x + 2 - 2e^{x}}. \]

Find $\lim_{x \to 0} f(x)$. Plot the function for $-0.5 \le x \le -0.001$ and for
$0.001 \le x \le 0.5$, and check graphically that this agrees with the
limit.

2. Find the derivative of

\[ f(x) = 7x^2 + 8x^3 + 9x^4 + 10x^5 + 11x^6 + 12x^7 \]

and its values $f'(0.2)$ and $f'(0.4)$.

3. Find the derivative of

\[ f(x) = x^4 + 2x^3 - 3x^2 - 2x + 4. \]

Find the approximate values of x where $f'(x) = 0$, using a numerical
solution routine. Plot graphs of $y = f(x)$ and $y = f'(x)$ on the same
axes and compare the zeros of $f'(x)$ with the zero slopes on $y = f(x)$.

4. Find the equation of the tangent to the curve

\[ y = x \sin 2x \]

at x = 0.7. Plot the graphs of the curve and its tangent.

5. Find the first three derivatives of

\[ f(x) = x \sin^2 x + x^2 \sin(x^2), \]

and confirm that the first nonzero higher derivative at x = 0 is
$f'''(0) = 6$.

6. Plot the graphs of $y = f(x)$, $y = f'(x)$, and $y = f''(x)$ for

\[ f(x) = x^2(x^2 - 3) \]

in the interval $-2 \le x \le 2.5$. (This should confirm the results from
Problem 2.19.)

Chapter 3
1. Display rules for the derivatives of the following general forms:
(a) f(x)g(x); (b) f(x)/g(x); (c) f(g(x)); (d) f(x)g(x)h(x);
(e) f(x)g(x)/h(x); (f) f(h(x))/h(x).

2. Find the first derivatives of


/(x) = esinxcos2* sin x.
The function is periodic. What is its minimum period? Plot its graph
and the graph of /'(x) over one cycle. Estimate where /(x) is
stationary and then find each of the roots of f\x) — 0 to 5 decimal
places using a root-finding routine.

3. If

\[ x^2 + 2y^2 - xy - 2yx^2 = 4, \]

find dy/dx as a function of x and y.

Chapter 4
1. Display rules for the first and second derivatives with respect to
x of the following general forms:
(a) $f(x^2)$; (b) $f(\sin x)$; (c) $f(\sin(x^2))$.

2. Find the first and second derivatives of

\[ f(x) = 0.1x^5 - 0.5x^4 + 0.2x^3 + x^2 - 0.7x + 2.2. \]

Estimate the roots of $f'(x) = 0$ from a graph of $y = f(x)$. Then find
the roots to five decimal places by a root-finding routine. Calculate
$f''(x)$ at each stationary point, and confirm the second-derivative
test for stationary points. Points of inflection are given by $f''(x) = 0$.
Find their locations on the original graph of $y = f(x)$.

3. Plot the graph of

\[ y = \frac{x^2 - 1}{2x + 1}, \]

and its asymptotes $y = \tfrac{1}{2}x - \tfrac{1}{4}$ and $x = -\tfrac{1}{2}$ (see Fig. 4.13).

4. Plot the graph of $y = f(x) = x^5 - 2x^3 + x^2 - 3x + 1$ in the
interval $-1 \le x \le 3$, and estimate the roots of $f(x) = 0$ in this
interval. Set up a Newton routine

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]

for calculating the roots of $f(x) = 0$, and find, starting at x = 0.5
and 1.6, the roots to ten significant figures. What is the smallest
number of iterations required in each case to calculate the roots to
ten significant figures?

5. Plot the graph of $y = x + \sin 5x$ in the interval $0 \le x \le 25$ using
(a) the default plotting routine,
(b) plotting with 20 plot points,
(c) plotting with 50 plot points.
Explain why the graphs are different for this type of function.

Chapter 5
1. Obtain formulas for the Taylor polynomials for the following
functions centred at x = a as far as (x — a)3:
(a) /(x); (b) [/(x)]2; (c) f(x)g(x); (c) e/(x). State the coeffi¬
cient of (x — a)2 in each case.

2. Find Taylor expansions about x = 0 up to and including x5 for


each of the following functions:
(a) ex; (b) (x + l)cosx; (c) ln(l +sinx); (d) exp(sin(ex — 1)).

3. Find the Taylor polynomials for (sin2 x)/x2 up to and including


xN for N = 2, 4, 6. Plot the graphs of the function and its Taylor
polynomials for 0.001 ^ x ^ 2, and compare them. At approximately
what values of x do the Taylor polynomials visibly part company
from the exact function?

4. Find the Taylor polynomial for $\ln x$ about $x = 1$ for $N = 6$.
Construct an error function which is the difference of $\ln x$ and its
Taylor polynomial. Show that, at $2.159$ approximately, this error
starts to exceed $0.2$ as $x$ increases. Plot this error function against
$x$ for $1 \le x \le 2.2$.
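
One way to locate the point where the error reaches 0.2 is sketched below, assuming the standard series $\ln x = \sum_{k\ge 1} (-1)^{k+1}(x-1)^k/k$ truncated at $k = 6$. The helper names `taylor_ln`, `error` and `crossing`, and the use of bisection, are choices made for this sketch only.

```python
import math

def taylor_ln(x, order=6):
    """Taylor polynomial of ln x about x = 1, up to (x - 1)**order."""
    u = x - 1.0
    return sum((-1) ** (k + 1) * u**k / k for k in range(1, order + 1))

def error(x):
    # Difference between ln x and its degree-6 Taylor polynomial
    return math.log(x) - taylor_ln(x)

def crossing(level=0.2, lo=1.0, hi=2.2, tol=1e-10):
    """Bisection for the x in [lo, hi] at which error(x) equals level."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if error(mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(f"error reaches 0.2 near x = {crossing():.4f}")
```

Bisection is adequate here because the derivative of the error, $(x-1)^6/x$, is positive for $x > 1$, so the error increases steadily on the interval.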

Chapter 6
1. Solve, for the complex number $a$, the equation $z = 0$ where
$$z = \frac{(2 + 3j)^4}{(1 - 5j)^3} + \frac{a - 2j}{(1 + 5j)^4}.$$

2. If $z = x + jy$, find the real and imaginary parts of $z\,e^z \cos z$.

3. Find the 13 roots of $z^{13} = 1 + j$, and plot the roots on the Argand
diagram.
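
A minimal sketch of the root calculation, using the polar form $w = re^{j\theta}$ so that the roots are $r^{1/n}e^{j(\theta + 2\pi k)/n}$ for $k = 0, \ldots, n-1$. The function name `nth_roots` is hypothetical, and Python writes the imaginary unit as `1j`.

```python
import cmath
import math

def nth_roots(w, n):
    """All n complex roots of z**n = w, from the polar form of w."""
    r, theta = cmath.polar(w)
    return [
        cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * k) / n)
        for k in range(n)
    ]

roots = nth_roots(1 + 1j, 13)
for z in roots:
    print(f"{z.real:+.6f} {z.imag:+.6f}j")
```

The printed pairs can be passed straight to any plotting routine to give the Argand diagram; the points lie on a circle of radius $2^{1/26}$.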

4. Let $z_1 = 1 - 2j$, $z_2 = 3 + j$. Plot the following points on the
Argand diagram:

Zy+Z2, Zy+Z2, Zy—Z2, Zy + Z2, Z yZ 2, Zy/z2.

5. Find |z| and Arg z, where

(1 + 2j)4 2(3 - 4j)3


z =
(1 + 3j) 1 + 4j

Chapter 7
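1. Let
$$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ -2 & 3 & -4 & 1 \\ 3 & 4 & 1 & 2 \\ 4 & -1 & 2 & 3 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 1 & -2 & 1 & 2 \\ -3 & 1 & -3 & 1 \\ 2 & 1 & 2 & 1 \end{bmatrix},$$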

"3121"

P P 12
C =
1-2-3 2

_2 1 0 -1 _

Find and compare
(a) $AB$ and $BA$; (b) $A(BC)$ and $(AB)C$; (c) $(A + B)^T$ and
$A^T + B^T$; (d) $(AB)^T$ and $B^T A^T$.

2. Find the inverse of

$$\begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{bmatrix}$$
(see Problem 7.18). Find the equation of the parabola of the form
y = a + bx + cx2 through the points (-1,-2), (i, — 1), and (f,2).

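3. Let
$$A = \begin{bmatrix}
\tfrac{1}{3} & \tfrac{1}{3} & \tfrac{1}{6} & \tfrac{1}{6} \\
\tfrac{1}{4} & \tfrac{1}{2} & \tfrac{1}{8} & \tfrac{1}{8} \\
\tfrac{1}{8} & \tfrac{1}{4} & \tfrac{3}{8} & \tfrac{1}{4} \\
\tfrac{1}{2} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6}
\end{bmatrix}.$$
Find $A^2$, $A^4$, $A^8$, $A^{16}$. How do you expect $A^n$ to behave as $n \to \infty$?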
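
The repeated squaring can be sketched with exact rational arithmetic. The entries below assume the matrix reads, row by row, (1/3, 1/3, 1/6, 1/6), (1/4, 1/2, 1/8, 1/8), (1/8, 1/4, 3/8, 1/4), (1/2, 1/6, 1/6, 1/6), so that every row sums to 1; the helper `matmul` is a hypothetical name.

```python
from fractions import Fraction as F

# Assumed entries of A; each row sums to 1
A = [
    [F(1, 3), F(1, 3), F(1, 6), F(1, 6)],
    [F(1, 4), F(1, 2), F(1, 8), F(1, 8)],
    [F(1, 8), F(1, 4), F(3, 8), F(1, 4)],
    [F(1, 2), F(1, 6), F(1, 6), F(1, 6)],
]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

power = A
for exponent in (2, 4, 8, 16):
    power = matmul(power, power)  # squaring doubles the exponent
    print(f"A^{exponent}")
    for row in power:
        print("  ", [round(float(v), 6) for v in row])

A16 = power
```

In the printed output the rows of the successive powers become steadily more alike, which suggests what to expect of $A^n$ for large $n$.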

Chapter 8
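1. Let
$$A = \begin{bmatrix} 1 & -1 & 2 & 3 \\ 3 & 1 & 0 & -3 \\ 2 & -1 & 3 & -1 \\ 2 & -1 & 2 & 4 \end{bmatrix}, \quad
B = \begin{bmatrix} 2 & 4 & -3 & 1 \\ 0 & -1 & 4 & 3 \\ -2 & -2 & 3 & 1 \\ -2 & 5 & 6 & -5 \end{bmatrix}.$$
Find $\det A$, $\det B$, $\det A^{-1}$, and $\det AB$. Confirm that
$$\det A^{-1} = 1/\det A, \quad \det A \det B = \det AB.$$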



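2. Factorize the following determinants:
$$\text{(a)}\ \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix}; \quad
\text{(b)}\ \begin{vmatrix} 1 & 1 & 1 & 1 \\ a & b & c & d \\ a^2 & b^2 & c^2 & d^2 \\ a^3 & b^3 & c^3 & d^3 \end{vmatrix}; \quad
\text{(c)}\ \begin{vmatrix} 1 & 1 & 1 & 1 \\ a & b & c & d \\ a^2 & b^2 & c^2 & d^2 \\ a^4 & b^4 & c^4 & d^4 \end{vmatrix}.$$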

3. Find the values of $a$ for which
$$\begin{vmatrix} 5 & a & -1 & 1 \\ 2 & 1 & a & 2 \\ 3 & a & 1 & 4 \\ -1 & 0 & a & 2 \end{vmatrix}$$
is zero.

Chapter 9
1. Plot the curve which has the position vector
$$\mathbf{r} = (2\cos t)\,\mathbf{i} + (2\sin t)\,\mathbf{j} + 0.3t\,\mathbf{k}$$
from $t = 0$ to $t = 20$. What is the curve called? The position vector
represents a particle moving along the curve. Find the velocity
vector $\dot{\mathbf{r}}$ and the acceleration vector $\ddot{\mathbf{r}}$ of the particle. Show that
$\dot{\mathbf{r}} \cdot \ddot{\mathbf{r}} = 0$.

2. Plot the trefoil knot given parametrically by
$$\mathbf{r} = (1 + a\cos 3t)(\cos 2t\,\mathbf{i} + \sin 2t\,\mathbf{j}) + a\sin 3t\,\mathbf{k}$$
with $a = 0.25$ and $0 \le t \le 2\pi$.

Chapter 10
1. Show that
$$\begin{bmatrix}
-2^{-3/2} & 2^{-1/2} & 2^{-3/2}\,3^{1/2} \\
-2^{-3/2} & -2^{-1/2} & 2^{-3/2}\,3^{1/2} \\
2^{-1}\,3^{1/2} & 0 & 2^{-1}
\end{bmatrix}$$
defines a rotation of axes. If each row defines the direction of the
$X$, $Y$, $Z$ axes in the $x$, $y$, $z$ frame, find the equation of the line
$x + 2y - 2z = 1$ in the new axes.

Chapter 11
1. The area of a triangle whose vertices are the points with position
vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ is given by the formula
$$\tfrac{1}{2}\,|\mathbf{b} \times \mathbf{c} + \mathbf{c} \times \mathbf{a} + \mathbf{a} \times \mathbf{b}|.$$
Devise a program based on this formula to determine the area for
general vertices. What is the area if $\mathbf{a} = (1, 0, 1)$, $\mathbf{b} = (2, -1, 1)$, and
$\mathbf{c} = (1, 1, 2)$? Plot a diagram showing the triangle.
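
The area formula translates almost directly into code. The sketch below uses plain tuples for vectors; `cross`, `add` and `triangle_area` are hypothetical helper names, not routines from any particular package.

```python
import math

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

def triangle_area(a, b, c):
    """Half the magnitude of b x c + c x a + a x b."""
    s = add(add(cross(b, c), cross(c, a)), cross(a, b))
    return 0.5 * math.sqrt(sum(x * x for x in s))

area = triangle_area((1, 0, 1), (2, -1, 1), (1, 1, 2))
print(area)
```

For the vertices given, the three vector products sum to $(-1, -1, 1)$, so the area is $\sqrt{3}/2$.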

2. A tetrahedron has vertices with position vectors


a = (1, — 1,2), b = (—1,2,3), c = (2, — 1,3), d=( 1,3,-2).
Find its surface area. Draw a three-dimensional plot showing
the tetrahedron viewed from the point with position vector
(2.1, -2.4, 1.5).

Chapter 12
1. Use a row-reduction routine to solve the linear equations
x + 2y — 3z = q,
2x + py + z = — 1,
x - 2y - z = 4,
where p and q are two parameters. Determine for what values for
p and q, (a) the equations have a unique solution, (b) no solution,
(c) an infinite set of solutions.

2. Use a row-reduction method to solve the linear equations
$$x + 2y + pz = 5,$$
$$3x + 2y + z = q,$$
$$2x - y + 4z = 7,$$
where $p$ and $q$ are two parameters. Confirm that
$$z = \frac{63 - 5q}{11 + 7p}$$
and discuss the nature of solutions for all values of $p$ and $q$.

3. Using a row-reduction instruction, show that
$$x_1 + 3x_3 = 5,$$
$$-x_1 + x_2 - x_3 + x_4 = -1,$$
$$x_1 + 2x_2 + 11x_3 = 4,$$
$$-x_1 + 2x_2 + 3x_3 + x_4 = 3,$$
is an inconsistent set of equations.

Chapter 13
1. Find the eigenvalues and eigenvectors of
$$A = \begin{bmatrix} -6 & 1 & 2 & 0 \\ 1 & 0 & -3 & -1 \\ 2 & 1 & -6 & 0 \\ -2 & 2 & 0 & -3 \end{bmatrix}.$$
How many linearly independent eigenvectors does $A$ have?
Find the eigenvalues of the following matrices:
(a) $A^{-1}$; (b) $A^2$; (c) $A + kI$.

2. Find the eigenvalues and eigenvectors of
$$A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}.$$
Construct a matrix $C$ of eigenvectors and confirm that
$$A = CDC^{-1},$$
where $D$ is a diagonal matrix of eigenvalues. Obtain the general
formula for
$$A^n = CD^nC^{-1}.$$

3. Find the inverse and transpose of
$$A = \tfrac{1}{3}\begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{bmatrix}$$
and verify that $A$ is an orthogonal matrix. Find the eigenvalues of
$A$. What expected property do they have?

4. Find the eigenvalues of
$$A = \begin{bmatrix} 5 & 5 & -6 & 2 \\ -3 & 13 & -6 & 2 \\ -3 & 7 & 0 & 2 \\ 3 & -15 & 12 & 2 \end{bmatrix}.$$
Find the expression $\det(A - \lambda I_4)$, and demonstrate the Cayley-Hamilton
theorem of Problem 11.23.

Chapter 14
1. Plot the graphs of the derivative $dy/dx = \sin 2x$ and the equation
of the curve through $(\pi, -1)$ of which this is the derivative (see
Example 12.7).

2. Plot the graph of
$$\frac{dy}{dx} = x e^{-x} + \sin x - x^2\cos 2x,$$
for $0 \le x \le 10$. Show that an antiderivative which is zero when
$x = 0$ is
$$y = 2 + \tfrac{1}{4}\left[-4(1 + x)e^{-x} - 4\cos x - 2x\cos 2x + \sin 2x - 2x^2\sin 2x\right].$$
Plot the graph of the signed area between $x = 0$ and $x = 10$.



Chapter 15
1. Set up a program to compute the area under the curve $y = f(x)$
between $x = a$ and $x = b$ using the approximation
$$h \sum_{n=0}^{N-1} f(x_n),$$
where $h = (b - a)/N$ and $x_n = a + nh$. Apply the method to the
following functions, limits and subdivision numbers:
(a) $f(x) = x^2$, $1 \le x \le 3$, $N = 200$;
(b) $f(x) = x e^{-x}$, $0 \le x \le 3$, $N = 20$;
(c) $f(x) = x^3 \sin x$, $0 \le x \le \pi$, $N = 30$;
(d) $f(x) = \cos(e^{-x})$, $0 \le x \le 1$, $N = 25$.
In cases (a), (b), and (c), compare the numerical result with the
areas obtained by integration. In these cases, how many subdivisions
are required to obtain a numerical result correct to three decimal
places? In (a), show that over 10,000 steps are required. Why is this?
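
A possible shape for the program is sketched here in Python. The name `left_sum` and the layout of the `cases` table are assumptions for the sketch; the rule itself is the sum $h\sum_{n=0}^{N-1} f(x_n)$ stated in the project.

```python
import math

def left_sum(f, a, b, N):
    """Approximate the area under f on [a, b] by h * sum of f(a + n h), n = 0..N-1."""
    h = (b - a) / N
    return h * sum(f(a + n * h) for n in range(N))

# (label, integrand, a, b, N) for the four cases listed in the project
cases = [
    ("(a)", lambda x: x**2, 1.0, 3.0, 200),
    ("(b)", lambda x: x * math.exp(-x), 0.0, 3.0, 20),
    ("(c)", lambda x: x**3 * math.sin(x), 0.0, math.pi, 30),
    ("(d)", lambda x: math.cos(math.exp(-x)), 0.0, 1.0, 25),
]

for label, f, a, b, N in cases:
    print(label, left_sum(f, a, b, N))
```

For case (a) the exact area is $26/3$, and the sum with $N = 200$ falls short of it by about $0.04$. The leading error of this rule is roughly $\tfrac{1}{2}h\,[f(b) - f(a)]$, which shrinks only in proportion to $1/N$.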

2. Use a symbolic integration program to obtain the following
indefinite integrals:
(a) $\int (\ln x)^3\,dx$; (b) $\int \sin^5 x \cos^3 x\,dx$; (c) $\int x^2 e^x \sin x\,dx$;
(d) $\int \sqrt{1 - x^2}\,dx$; (e) $\int \dfrac{dx}{x(x + 1)(x + 2)(x + 3)}$; (f) $\int \dfrac{dx}{1 - x^3}$.
Check each answer by recovering the integrands by differentiation.

3. Evaluate the following definite integrals:


x dx x3 dx
(a) x(ln x)3 dx; (b) ; (c)
o 7(5 + 4* - 4x2) o (1 - x2)1
1 100 Xn
(d) I 1 dx.
o n = 0 n!

4. Find

1(a) = (In x)3 dx.

Find the limit

lirnim.
h-o ( — In b)3
How does 1(a) behave as a oo? Does
* OO
(In x)3 dx
Ji
exist?

5. A cylindrical hole of circular cross-section and radius b is drilled


through a sphere of radius a > b, the axis of the hole passing through
the centre of the sphere. Find the volume of the remaining object.
Display a diagram of the object for some values of a and b.

Chapter 16
1. Plot the graph of the polar equation $r = \sin 5\theta$ for $0 \le \theta \le 2\pi$.
Find the area enclosed by the five 'petals' of the curve.
Show that the area of the $2n + 1$ petals of $r = \sin(2n + 1)\theta$ ($n \ge 1$)
is independent of $n$.

2. Devise a program to generate the trapezium rule:
$$\int_a^b f(x)\,dx \approx \frac{b - a}{N}\left[\tfrac{1}{2}f(a) + \bigl(f(x_1) + f(x_2) + \cdots + f(x_{N-1})\bigr) + \tfrac{1}{2}f(b)\right].$$
Apply the program to the integral
$$\int_0^2 e^{-2x} \sin^2 x\,dx,$$
and compare the result with the exact value of the integral.
Investigate how many steps are required to obtain a result accurate
to three decimal places.
Apply the program also to Problem 16.20.

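
A sketch of the trapezium rule applied to $\int_0^2 e^{-2x}\sin^2 x\,dx$ follows. The closed form used for comparison, $\tfrac{1}{8} - \tfrac{1}{4}e^{-4} - \tfrac{1}{8}e^{-4}(\sin 4 - \cos 4)$, was worked out by hand for this sketch via $\sin^2 x = \tfrac{1}{2}(1 - \cos 2x)$ and should be checked against a symbolic integrator; the names `trapezium`, `g` and `exact` are hypothetical.

```python
import math

def trapezium(f, a, b, N):
    """Trapezium rule with N equal strips on [a, b]."""
    h = (b - a) / N
    interior = sum(f(a + k * h) for k in range(1, N))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

def g(x):
    return math.exp(-2 * x) * math.sin(x) ** 2

# Hand-derived closed form (an assumption of this sketch)
exact = 1 / 8 - math.exp(-4) / 4 - math.exp(-4) * (math.sin(4) - math.cos(4)) / 8

for N in (4, 8, 16, 32, 64):
    approx = trapezium(g, 0.0, 2.0, N)
    print(N, approx, abs(approx - exact))
```

Doubling $N$ should cut the error by roughly a factor of four, which gives a quick way of estimating how many steps three-decimal accuracy needs.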
3. A thin plane metal plate consists of an isosceles triangle of height
h and base length 2a with a semicircle of radius a attached
symmetrically by its diameter to the base of the triangle. Find the
location of its centroid on its axis of symmetry.

4. Set up a program to generate Simpson's rule
$$\int_a^b f(x)\,dx \approx \frac{b - a}{3N}\left(f(a) + f(b) + 4\sum_{k=1}^{N/2} f(x_{2k-1}) + 2\sum_{k=1}^{N/2 - 1} f(x_{2k})\right),$$
where $N$ is an even number. Apply the method to $f(x) = e^{-x^2}$, with
$b = 1$, $a = 0$. Compare results with the trapezium rule above.

Chapter 17
1. Illustrate the substitution method in integration by writing a
program to integrate
$$\int \frac{x - 2}{\sqrt{5 + 4x - x^2}}\,dx$$
using the substitutions $x = u + 2$, $u = 3\sin t$. Integrate directly and
through the substitutions.

2. Integrate the following, and compare your answers with computer-


integrated ones:
x dx .., f , f 4 ,
(a) ; (b) tan x dx; (c) cos x dx;
4x2 + 1
x dx sin3 x
(d) ; (e) dx.
V(* -!) cos x

3. Computer-integrate the infinite integrals
$$I_{10} = \int_0^\infty t^{10} e^{-t}\,dt, \qquad I_{11} = \int_0^\infty t^{11} e^{-t}\,dt,$$
and confirm that $I_{11}/I_{10} = 11$.

4. Computer-integrate the following infinite integrals:
' o° P00 In x
(a) e-x sin x dx; (b) dx; (c) x 3°e ax'dx.
i

5. Evaluate the integral


(In x)6
f(a) = dx
x
for a > 1. Find /(10), /(20), and /(oo). The results indicate that f(a)
tends to a limit very slowly as a -» oo. Find where
, , (In x)6
0(*) =-r-

has a maximum value, and plot the graph y = g(x) for 1 ^ x < 100.

Chapter 18
1. Solve the differential equation $\dot{x} + x = 0$, for the initial conditions
(a) $x(0) = 0$, (b) $x(0) = 1$, (c) $x(0) = 2$ and plot the solutions on the
same axes for $0 \le t \le 2$.

2. Solve the differential equations
(a) $2\ddot{x} + 3\dot{x} + x = 0$, (b) $\ddot{x} + 2\dot{x} + 2x = 0$, (c) $\ddot{x} + 2\dot{x} + x = 0$,
each for the six sets of initial conditions:
(i) $x(0) = 0$, $\dot{x}(0) = 1$; (ii) $x(0) = 0$, $\dot{x}(0) = 2$;
(iii) $x(0) = 0$, $\dot{x}(0) = 3$; (iv) $\dot{x}(0) = 0$, $x(0) = 1$;
(v) $\dot{x}(0) = 0$, $x(0) = 2$; (vi) $\dot{x}(0) = 0$, $x(0) = 3$.
Plot all solutions on the same axes for each differential equation, for
$0 \le t \le 5$.

Chapter 19
1. Solve the differential equation $2\ddot{x} + 3\dot{x} + x = \cos t$ subject to
$x(0) = 0$, $\dot{x}(0) = 1$. Plot the solution for $0 \le t \le 50$.

2. Solve the differential equation $\ddot{x} + x = \cos t$ subject to $x(0) = 0$,
$\dot{x}(0) = 0$. Plot the solution for $0 \le t \le 20$.

Chapter 20
1. Solve the differential equation $\ddot{x} + x = 0$ subject to the initial
conditions $x(0) = 1$, $\dot{x}(0) = 0$. Also solve $\ddot{x} + \sin x = 0$, by a built-in
numerical solution method for $0 \le t \le 10$ subject to the same initial
conditions. Plot both solutions for $0 \le t \le 10$. Comparison of the
plotted solutions will indicate by how much the period decreases
when the linear approximation is used. Re-run the programs for
different amplitudes $x(0)$.

Chapter 21
1. Draw the phasor diagram of the sum of the three phasors of
u(f) = 2 cos 1 Of, v(t) = cos( 1 Of — jtc), w(f) = 3 cos( 1 Of + jtc)
(see Example 21.6).

Chapter 22
1. Draw the lineal element diagram of $dy/dx = xy$, produced by a
standard package in the square $\{0 \le x \le 1,\ 0 \le y \le 1\}$ (see Section
22.1). Compare this with the exact solution (see Section 22.1) drawn
through the points $(0, 0.2)$, $(0, 0.4)$, and $(0, 0.6)$.

2. Repeat the above process for the differential equation $dy/dx = x - y$
of Example 22.1.

3. Design a program for Euler's method (Section 22.2) for the
initial-value problem
$$\frac{dy}{dx} = xy^2, \quad y(0) = 1,$$
(see Example 22.4) with step length $h = 0.2$ and 5 steps. Run the
program for the cases $h = 0.1$ and $h = 0.01$ and compare the results.
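
Euler's method for this problem might be sketched as below. The comparison solution $y = 2/(2 - x^2)$ was obtained by separating variables for this sketch and is not quoted from the text; `euler`, `rhs` and `exact` are hypothetical names.

```python
def euler(f, x0, y0, h, steps):
    """Euler's method: y_{n+1} = y_n + h f(x_n, y_n). Returns lists of x and y values."""
    xs, ys = [x0], [y0]
    x, y = x0, y0
    for _ in range(steps):
        y = y + h * f(x, y)
        x = x + h
        xs.append(x)
        ys.append(y)
    return xs, ys

def rhs(x, y):
    return x * y * y

def exact(x):
    # Solution by separation of variables (assumed for comparison); valid for x*x < 2
    return 2.0 / (2.0 - x * x)

for h, steps in ((0.2, 5), (0.1, 10), (0.01, 100)):
    xs, ys = euler(rhs, 0.0, 1.0, h, steps)
    print(f"h = {h}: y({xs[-1]:.2f}) = {ys[-1]:.6f}, exact = {exact(xs[-1]):.6f}")
```

Each run ends at $x = 1$, so the three final values can be compared directly with one another and with the exact value there.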

4. Plot numerical solutions for
$$\frac{dy}{dx} = \frac{3y - x}{3x - y}$$
(Example 22.14 and Fig. 22.11) using built-in routines. As with many
equations of this type it is often easier to solve the equivalent
simultaneous equations
$$\frac{dx}{dt} = 3x - y, \quad \frac{dy}{dt} = 3y - x,$$
numerically for various initial values of $x(0)$ and $y(0)$.

Chapter 23
1. By splitting the differential equation $\ddot{x} + 2x^3 = 0$ into the system
$$\dot{x} = y, \quad \dot{y} = -2x^3,$$
and plotting four phase paths respectively through the four points
$(x(0), y(0)) = (0.3, 0), (0.6, 0), (0.9, 0), (1.2, 0)$
over the interval $-1.5 \le x \le 1.5$, show that the solutions appear to
be periodic.

2. Plot phase paths for the van der Pol equation
$$\ddot{x} + 10(x^2 - 1)\dot{x} + x = 0$$
showing the limit cycle. Also show the corresponding $(t, x)$ graph
of the periodic solution (the periodic solution has an initial value
close to $x(0) = 2$, $\dot{x}(0) = 0$).

Chapter 24
1. Computer algebra systems are quite efficient at finding Laplace
transforms of complicated expressions involving standard functions.
Test the system with the following transforms:

(e) L{e’/t'2}; (f) £{cosh at}.

2. Solve
$$\dot{x} + 2x = e^{-t}, \quad x(0) = 3,$$
using a Laplace-transform package, and compare the answer with
that of Example 24.12. Plot the input $e^{-t}$ and the output against $t$
for $0 \le t \le 3$.

3. Using a Laplace-transform package, solve the system
$$\ddot{x} + 2\dot{x} + x = a\cos\omega t, \quad x(0) = 0, \quad \dot{x}(0) = 0.$$
Plot the input and output functions for $a = 1$, $\omega = 1$ and $0 \le t \le 30$.
Estimate the eventual amplitude of the periodic output.

4. Find the functions whose Laplace transforms are:

Plot the functions in each case.

5. Consider the function $f(t) = \ln t$. Show that the Laplace-transform
package produces the transform
$$-\frac{1}{s}(\gamma + \ln s),$$
where $\gamma$ is Euler's constant given by

Derive a program to calculate Euler's constant. It should give
$\gamma = 0.577215\ldots$.
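
A short sketch of such a program, assuming the familiar limit definition $\gamma = \lim_{n\to\infty}\bigl(\sum_{k=1}^{n} 1/k - \ln n\bigr)$, which may differ from the formula intended in the project. The correction term $-1/(2n)$ is a standard refinement added here to speed convergence, and `euler_gamma` is a hypothetical name.

```python
import math

def euler_gamma(n=100000):
    """Estimate Euler's constant from H_n - ln n, with the -1/(2n) correction."""
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n) - 1.0 / (2 * n)

print(f"{euler_gamma():.6f}")
```

Without the correction term the estimate approaches $\gamma$ only like $1/(2n)$, so many more terms would be needed for six-figure accuracy.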

Chapter 25
1. Find the Laplace transform of the solution of
$$\ddot{x} + \omega^2 x = a\,\delta(t - 1), \quad x(0) = \dot{x}(0) = 0,$$
which has impulse input applied at time $t = 1$. Invert the transform
and plot the output for $\omega = 4$, $a = 1$ (see Example 25.3).

2. Following the previous project, solve the more complicated problem
with two impulses:
$$2\ddot{x} + 3\dot{x} + 2x = a\,\delta(t - \pi)\cos t + b\,\delta(t - 2\pi), \quad x(0) = \dot{x}(0) = 0.$$
Plot the output for $a = b = 1$.

3. Let $f(t) = t^3$, $g(t) = \cos t$. Find the convolution
$$\int_0^t f(t - u)g(u)\,du.$$
Then verify that
$$\mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\} = \mathcal{L}\left\{\int_0^t f(t - u)g(u)\,du\right\}.$$

4. A transfer function with a parameter $a$ is given by (Section 25.10)
$$G(z) = \frac{4z^3 - 8z^2 - 2z + 4}{6z^4 - 6z^3 - 2a^2z^2 + 3z^2 + 2a^2z - 2a^2}.$$
Find the locations of the poles of $G(z)$. For what values of $a$ do all
poles lie within the unit circle (indicating transient stability)? Plot
the poles on an Argand diagram for $a = 2$.

Chapter 26
1. Consider the period 2 sawtooth function defined over its fundamental
interval $-1 < t \le 1$ by $f(t) = t$. Find its general Fourier
coefficient and output its first four terms. Plot and compare the
graphs of this truncated series and the sawtooth for $-3 < t \le 3$.
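
For checking purposes, the sine coefficients of this sawtooth can be sketched as follows. Because $f(t) = t$ is odd, only sine terms appear, and integrating by parts gives $b_n = 2(-1)^{n+1}/(n\pi)$; that closed form was derived for this sketch and is compared with a midpoint-rule estimate. The names `b`, `b_numeric` and `partial_sum` are hypothetical.

```python
import math

def b(n):
    """Closed-form sine coefficient of f(t) = t on (-1, 1], period 2 (hand-derived)."""
    return 2.0 * (-1) ** (n + 1) / (n * math.pi)

def b_numeric(n, N=20000):
    """Midpoint-rule estimate of the integral of t sin(n pi t) over (-1, 1)."""
    h = 2.0 / N
    total = 0.0
    for k in range(N):
        t = -1.0 + (k + 0.5) * h
        total += t * math.sin(n * math.pi * t)
    return h * total

def partial_sum(t, terms=4):
    """Truncated Fourier series using the first `terms` sine terms."""
    return sum(b(n) * math.sin(n * math.pi * t) for n in range(1, terms + 1))

for n in range(1, 5):
    print(n, b(n), b_numeric(n))
```

Evaluating `partial_sum` on a grid over $-3 < t \le 3$ gives the data for the comparison plot.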
2. Repeat the previous problem but with the function

(0<t< 1),
f(t) =
(—1 < f < 0).
Plot the graphs of f(t) and the first 12 terms of its Fourier series.
The graph should show Gibbs' phenomenon, in which the Fourier
series approximation overshoots the function at discontinuities. You
can try it with (say) 20 terms or more, but you should include more
interpolating points in these cases.

3. Find the Fourier coefficients of the $2\pi$-periodic function defined
by
$$f(x) = x^6 - 5\pi^2 x^4 + 7\pi^4 x^2$$
on the interval $-\pi \le x < \pi$. What is the sum of the series
oo (_ 1 \n + 1

I
n =1 n
4. Find the Fourier transforms of the following functions:
(a) the top-hat function $\Pi(t)$ (Section 26.13a);
(b) the one-sided exponential $e^{-t}H(t)$ (Section 26.13c);
(c) $e^{-|t|}$ (Example 26.19);
(d) $e^{-|t-1|}$;
(e) $1/(1 + t^2)$.
Plot the graph of the transform in (e).

5. Find the functions whose Fourier transforms are


(a) e"/2;
(b) 1/(4 + /2);
(c) 2;
(d) 2 cos(/ — a).

Chapter 27
1. Plot the saddle surface $z = x^2 - y^2$ in the cylinder $x^2 + y^2 \le 1$,
using a three-dimensional parametric plot routine with parameters
$r$ and $u$ where
$$(x, y, z) = (r\cos u,\ r\sin u,\ r^2\cos 2u).$$
Also draw a contour plot of the surface in the $(x, y)$ plane on the
square $-1 \le x \le 1$, $-1 \le y \le 1$.

2. Plot the surface $z = xy(x^2 - y^2)$ in the cylinder $x^2 + y^2 \le 1$ using
the same routine as in Project 25.1 above, but with the parametric
equations
$$(x, y, z) = (r\cos u,\ r\sin u,\ \tfrac{1}{4}r^4\sin 4u).$$
How would you describe this saddle? Draw its contour plot in the
square $-1 \le x \le 1$, $-1 \le y \le 1$.

3. For the function


/(x, y) = ev2-v sin(xy) + x ln(x2 + y3),
verify that
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}.$$

4. Plot the surface given by z = cos xy over — rc x ^ re, — jk ^


y ^ jK. Find the partial derivatives at (^t, 1) and construct the
equation of the tangent plane there. Finally plot the surface and its
tangent plane.

5. Find the stationary points of
$$f(x, y) = 0.3x^3 + 0.2y^2 - x^2y - xy + 2y$$
numerically by solving
$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0.$$
Plot the contours on the $(x, y)$ plane for $-3 \le x \le 3$, $-9 \le y \le 3$.
Find the values of the second derivatives at each stationary point
and check the second derivative tests (27.9) at each point.

6. Find the least-squares straight-line fit to the points


(0,1.1), (1,2), (2,2.9), (3,3.9), (4,4.5), (5,5.1),
in the (x, y) plane. Plot the data and the least-squares straight-line
fit. If you are using a built-in routine, check your results against that
given by (27.10).
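
As an independent check on a built-in routine, the straight-line fit can be sketched from the usual normal equations, with slope $(n\sum xy - \sum x\sum y)/(n\sum x^2 - (\sum x)^2)$. Whether this matches the exact form of (27.10) is an assumption, and `least_squares_line` is a hypothetical name.

```python
def least_squares_line(points):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

data = [(0, 1.1), (1, 2), (2, 2.9), (3, 3.9), (4, 4.5), (5, 5.1)]
slope, intercept = least_squares_line(data)
print(f"y = {slope:.4f} x + {intercept:.4f}")
```

For these six points the sums are $\sum x = 15$, $\sum y = 19.5$, $\sum xy = 63$ and $\sum x^2 = 55$, giving a slope of $85.5/105$.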

Chapter 28
1. Find the family of curves orthogonal to that of

Plot both families of curves for |x| ^ 2, |y| ^ 2.

Chapter 29
1. Find where the function
$$f(x, y) = x^3 - 2xy - x + 3y^2$$
is stationary subject to the condition x2 + 2y2 = 1. Devise a program
which uses the Lagrange-multiplier method (29.4): here is a suggested
line of approach. First plot the contours of z = /(x, y) and the curve
x2 + 2y2 = 1. Locate the approximate coordinates of any point of
tangency. Then use a built-in root-finding scheme to locate the
stationary values. There should be four.

Chapter 30
1. Find the equation of the tangent plane to the surface
$$x^3y + zx + xy^2z = -3$$
at $(1, 2, -1)$.

2. Show graphically the intersection of the cylinder $x^2 + y^2 = 1$ and
the plane $x + y + z = 1$ (Example 30.9).

3. Find the envelope of the family of curves
$$y(a^2 - 1 + ax) = x$$
with parameter $a$. Plot the envelope and a sample of touching curves
in $-3 \le x \le 3$.

Chapter 31
1. By repeated integration, evaluate the integral
$$\int_{-1}^{1}\int_0^1 (x + y e^{-xy} + xy)\,dx\,dy,$$
using a symbolic routine. Plot the surface
$$z = x + y e^{-xy} + xy$$
over $0 \le x \le 1$, $-1 \le y \le 1$. Interpret the integral as the volume
under the surface. Does the integral contain 'negative' volumes
under the surface? Plot the positive part of the surface over the same
rectangle.

2. Evaluate the repeated integral
$$\int_0^a \int_{-\sqrt{a^2 - y^2}/a}^{\sqrt{a^2 - y^2}/a} x^2 y\,dx\,dy.$$
Plot the region of integration in the $(x, y)$ plane, and then check
that the integral has the same value with the order of the integration
reversed.

Chapter 32
1. Let
$$\mathbf{f}(x, y, z) = xy\,\mathbf{i} + yz\,\mathbf{j} + (z - y)x\,\mathbf{k}.$$
Find $\mathbf{f}$ as a function of $t$ on the line $x = t$, $y = t$, $z = t$. Evaluate the
line integral
$$\int \mathbf{f} \cdot d\mathbf{r}$$
on this line between $(0, 0, 0)$ and $(1, 1, 1)$.
Repeat the process with the curve $x = t^2$, $y = t^3$, $z = t^4$ and the
same end points. Plot both paths of integration.
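
A numerical cross-check on the symbolic results might look like the sketch below, which reduces each line integral to $\int_0^1 \mathbf{f}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt$ and applies composite Simpson's rule. The field assumes the reading $\mathbf{f} = (xy,\ yz,\ (z - y)x)$; `field` and `line_integral` are hypothetical names, and the path derivatives are supplied by hand.

```python
def field(x, y, z):
    # Assumed components of f: (xy, yz, (z - y)x)
    return (x * y, y * z, (z - y) * x)

def line_integral(path, dpath, N=2000):
    """Integrate f(r(t)) . r'(t) over 0 <= t <= 1 by composite Simpson (N even)."""
    h = 1.0 / N

    def integrand(t):
        F = field(*path(t))
        d = dpath(t)
        return sum(Fi * di for Fi, di in zip(F, d))

    total = integrand(0.0) + integrand(1.0)
    for k in range(1, N):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3

straight = line_integral(lambda t: (t, t, t), lambda t: (1.0, 1.0, 1.0))
curved = line_integral(
    lambda t: (t**2, t**3, t**4),
    lambda t: (2 * t, 3 * t**2, 4 * t**3),
)
print(straight, curved)
```

On the straight line the integrand reduces to $2t^2$, and on the curve to $2t^6 + 7t^9 - 4t^8$, so the two values differ: the integral depends on the path.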

Chapter 33
1. Plot the surfaces defined parametrically by the following position
vectors:
(a) $\mathbf{r} = (3 + \cos v)\cos u\,\mathbf{i} + (3 + \cos v)\sin u\,\mathbf{j} + \sin v\,\mathbf{k}$ (see Section
33.3);
(b) $\mathbf{r} = (1 + a\sin(bu))\cos v\,\mathbf{i} + (1 + a\sin(bu))\sin v\,\mathbf{j} + u\,\mathbf{k}$, where
$a = 0.3$ and $b = 3.5$ (see Section 33.3).

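2. Given that
$$\mathbf{f}(x, y, z) = e^{xyz}\,\mathbf{i} + z\cos(xy)\,\mathbf{j} + (x^2 + y^2)\,\mathbf{k},$$
find
(a) $\operatorname{div}\mathbf{f}$;
(b) $\operatorname{curl}\mathbf{f}$;
(c) $\operatorname{div}\operatorname{curl}\mathbf{f}$;
(d) $\operatorname{curl}\operatorname{curl}\mathbf{f}$ at the point $(1, 0, -1)$.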

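3. Using symbolic computation test the validity of the following
identities:
(a) $(\mathbf{F} \cdot \operatorname{grad})\mathbf{F} = \tfrac{1}{2}\operatorname{grad}(\mathbf{F} \cdot \mathbf{F}) - \mathbf{F} \times \operatorname{curl}\mathbf{F}$;
(b) $\operatorname{div}(\mathbf{F} \times \mathbf{G}) = \mathbf{G} \cdot \operatorname{curl}\mathbf{F} - \mathbf{F} \cdot \operatorname{curl}\mathbf{G}$;
(c) $\operatorname{curl}(\mathbf{F} \times \mathbf{G}) = (\mathbf{G} \cdot \operatorname{grad})\mathbf{F} - (\mathbf{F} \cdot \operatorname{grad})\mathbf{G} - \mathbf{G}\operatorname{div}\mathbf{F} + \mathbf{F}\operatorname{div}\mathbf{G}$;
(d) $\operatorname{div}(U\operatorname{grad}V - V\operatorname{grad}U) = U\nabla^2 V - V\nabla^2 U$;
(e) $\operatorname{curl}\operatorname{curl}\mathbf{F} = \operatorname{grad}\operatorname{div}\mathbf{F} - \nabla^2\mathbf{F}$.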

Chapter 34
1. $A$ and $B$ are the sets of integers defined by
$$A = \{2n + 5(-1)^n \mid n \in \mathbb{N}^+,\ 1 \le n \le 100\},$$
$$B = \{n^2 - n + 1 \mid n \in \mathbb{N}^+,\ 1 \le n \le 10\}.$$
Produce lists of the elements in $A \cup B$ and $A \cap B$. How many
elements do each of these sets have?

2. Let $A$, $B$, and $C$ be the following sets:
$$A = \{n(n - 1) \mid n \in \mathbb{N}^+,\ 2 \le n \le 100\},$$
$$B = \{|n^2 - 100n| \mid n \in \mathbb{N}^+,\ 1 \le n \le 160\},$$
$$C = \{4n \mid n \in \mathbb{N}^+,\ 1 \le n \le 2200\}.$$
Verify the first distributive law
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C).$$
How many elements are there in the set $A \cap (B \cup C)$?
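
Set comprehensions make the second project almost a transcription of its definitions. The sketch below assumes the sets read $A = \{n(n-1)\}$ for $2 \le n \le 100$, $B = \{|n^2 - 100n|\}$ for $1 \le n \le 160$ and $C = \{4n\}$ for $1 \le n \le 2200$; `lhs` and `rhs` are names chosen here for the two sides of the distributive law.

```python
# Sets from Project 2, with the ranges assumed as stated above
A = {n * (n - 1) for n in range(2, 101)}
B = {abs(n * n - 100 * n) for n in range(1, 161)}
C = {4 * n for n in range(1, 2201)}

lhs = A & (B | C)        # A intersect (B union C)
rhs = (A & B) | (A & C)  # (A intersect B) union (A intersect C)

print("distributive law holds:", lhs == rhs)
print("number of elements:", len(lhs))
print(sorted(lhs)[:10])
```

The same operators `|` and `&` give the union and intersection asked for in the first project.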

Chapter 36
1. Draw the labelled drawings of the bipartite graphs $K_{5,6}$ and $K_{6,6}$.
Answer the following for each graph by the built-in diagnostic test.
(a) How many edges has each graph?
(b) Is the graph eulerian? If it is, list an eulerian walk.
(c) Is it hamiltonian? If it is, list a hamiltonian cycle.

2. Check the complete graphs $K_n$, $2 \le n \le 7$ and the bipartite
graphs $K_{i,j}$ ($2 \le i \le 5$; $i \le j \le 6$) for planarity, using a built-in
diagnostic test.

Chapter 37
1. Rework Example 37.2 using a symbolic package for solving
difference equations. Solve the mortgage difference equation
$$Q_m - (1 + I)Q_{m-1} = -A,$$
with $I = 0.08$ and $Q_0 = P = 50000$ (in £). Given that $Q_{25} = 0$,
find $A$. List the outstanding debt $Q_m$ each year $m$ to the nearest £.
Plot (a) the outstanding debt against years and (b) the annual
interest repayments $A - IQ_m$ against years.
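
A direct numerical version of this project is sketched below. It assumes the closed form $Q_m = (1 + I)^m P - A\,[(1 + I)^m - 1]/I$ for the solution of the difference equation, from which $Q_{25} = 0$ gives $A = PI(1 + I)^{25}/[(1 + I)^{25} - 1]$; the variable names are choices made for the sketch.

```python
I = 0.08          # annual interest rate
P = 50000.0       # initial debt Q_0 in pounds
years = 25

# Repayment A that makes Q_25 = 0 (from the assumed closed-form solution)
growth = (1 + I) ** years
A = P * I * growth / (growth - 1)

# Iterate Q_m = (1 + I) Q_{m-1} - A to list the outstanding debt year by year
debt = [P]
for m in range(1, years + 1):
    debt.append((1 + I) * debt[-1] - A)

print(f"A = {A:.2f}")
for m, q in enumerate(debt):
    print(m, round(q))
```

The list `debt` supplies the data for both plots, since the quantity $A - IQ_m$ in part (b) follows from it term by term.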

2. Solve the following homogeneous difference equations:


(a) un + 2 ^m+i ' 10, (b) un + 2 "~f 2un_f_i + 2un 0,
(c) $u_{n+2} + 4u_{n+1} + 4u_n = 0$;
(d) $u_{n+3} + 3u_{n+2} + 3u_{n+1} + u_n = 0$, $u_0 = 0$, $u_1 = 1$, $u_2 = -1$.

3. Solve the following inhomogeneous difference equations:
(a) $u_{n+2} - u_{n+1} - 12u_n = 2 + n + n^2$; (b) $u_{n+2} - u_{n+1} + 4u_n = 2^n$;
(c) $u_{n+3} + 3u_{n+2} + 3u_{n+1} + u_n = n^2$, $u_0 = 0$, $u_1 = 1$, $u_2 = -1$.

4. Devise a program to generate cobweb plots for the first-order
difference equation
$$u_{n+1} = -ku_n + k$$
for (a) k = i, (b) k = f, (c) k = 1, with initial value u0 = f in each
case (see Example 37.3).

5. Display cobweb plots for the logistic difference equation
$$u_{n+1} = au_n(1 - u_n)$$
for selected values of $a$. Some suggested values are:
(a) $a = 2.8$ to show a stable fixed point;
(b) $a = 3.4$: find the period 2 solution;
(c) $a = 3.5$: find the period 4 solution;
(d) $a = 3.7$: chaotic output;
(e) $a = 3.83$: should be able to locate a stable period 3 solution.

6. Design a program to generate the period doubling display shown
in Fig. 37.11 for the logistic equation $u_{n+1} = au_n(1 - u_n)$ for $a$
increasing from $a = 2.8$ to $a = 4$.
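
Before drawing cobweb or period-doubling plots, it can help to look at the long-run iterates themselves. The sketch below discards a transient and prints the last few values for the suggested $a$; the starting value $u_0 = 0.3$, the transient length and the name `orbit` are all assumptions.

```python
def orbit(a, u0=0.3, transient=2000, keep=8):
    """Iterate u -> a u (1 - u), discard the transient, return the next `keep` values."""
    u = u0
    for _ in range(transient):
        u = a * u * (1 - u)
    tail = []
    for _ in range(keep):
        u = a * u * (1 - u)
        tail.append(u)
    return tail

for a in (2.8, 3.4, 3.5, 3.7, 3.83):
    print(a, [round(u, 4) for u in orbit(a)])
```

Collecting the tails for a fine grid of $a$ between 2.8 and 4 and plotting them against $a$ is one way to build a period-doubling display.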

Chapter 38
1. (See Example 38.8) A box contains 40 balls of which 7 are red, 12
are white and 21 are black. In each of the cases $n = 2, 3, 4, 5, 6, 7$, $n$
balls are drawn at random from the box without replacement. What
is the total number of $n$-ball selections which can be made? What is
the probability that there are $n$ ($n = 2, 3, 4, 5, 6, 7$) balls of the same
colour? Show the probabilities graphically in a bar chart.
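
The counting can be sketched with binomial coefficients: the number of selections is $\binom{40}{n}$, and the number in which all $n$ balls share a colour is $\binom{7}{n} + \binom{12}{n} + \binom{21}{n}$. The dictionary `counts` and the function name are choices made for this sketch.

```python
from math import comb

counts = {"red": 7, "white": 12, "black": 21}
total = sum(counts.values())  # 40 balls in all

def same_colour_probability(n):
    """Probability that n balls drawn without replacement are all one colour."""
    favourable = sum(comb(c, n) for c in counts.values())
    return favourable / comb(total, n)

for n in range(2, 8):
    print(n, comb(total, n), same_colour_probability(n))
```

The printed probabilities are the heights needed for the bar chart.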

Chapter 39
1. List the probabilities of the binomial distribution for n = 12 and
p = 0.7. Check that their sum is 1. Plot this discrete distribution as
a bar chart.

2. Plot graphs of the probability density function (pdf) and the
cumulative distribution function (cdf) for the standardized normal
distribution $N(0, 1)$.

3. Model a sequence of $n$ Bernoulli trials with success/failure equally
likely, in which the number of successes is recorded. You could try
$n = 50$ run 500 times and count the number of successes $i$ for
$i = 0, 1, 2, \ldots, n$. This should approximate to the binomial distribution
${}_nC_i\,p^i q^{n-i}$. Plot this distribution and compare it with the
simulation.

Chapter 40
1. Devise a program to draw comparative box plots for the
examination data given in Problem 40.2.

2. Produce a histogram and frequency polygon for the pipe length
data given in the table accompanying Problem 40.4.

3. Some randomized points $(x_i, y_i)$ are generated by the Mathematica
command
Table[{x + 0.2*Random[], x + 2 + 1.2*Random[]}, {x, 0, 6, 0.5}].
Find the regression lines of $y$ on $x$, and of $x$ on $y$ for the data.
Plot the data and both regression lines. Also find the mass centre of
the data, and add this point to the graph. Where does the mass
centre lie in relation to the regression lines?

4. Two dice are rolled and the average scores recorded. Compute
the probabilities of the possible average scores, and plot them in a
bar chart. Repeat the programme for four and six dice. Plot bar
charts in each case to illustrate the development of the normal distribution
predicted by the central limit theorem.
Answers to selected
problems
Chapter 1 1.33. The vertex is ( — 4, 7).
1.2. (a) v = —2,y + 3; (b) y = 1; (c) y = jx y.
Intersections are A: (2. 1). B: (f, {), C: (1, 1). 1.36. (b) 2/(x + 2) - 1 /(a + 1).
(d) l/2x — 1/(a + 1) + 1/2(a + 2).
AB~ly/l3, AC = 1. BC = 4^/5. (f) l/4.x - l/4(x + 2) - 1/2(a + 2)2.
(h) l/2(a — 3) + l/2(x+ 1).
1.3. (b) Slope = j. Intersection with axes at (2,0), (0, — f).
1.37. (b) l/2(.x - 1) + 1/2(a2 + 1) - a/2(a2 + 1).
1.4. (b) (y + 2)/(.\- + 1) = -2, so y = — 2x - 4.
(d) (y — 2)/(x — 1) = 3, so y = 3.x - 1. 1.38. (b) a - 3 - l/(x + 1) + 8/(a + 2).

1.6. Hint: choose a suitably. 1.39. (b) 1 + 1/2 + 1/5 + 1/10 + 1/17.

1.7. (b) Centre (1.0), radius 2.


(d) Centre (y, — j), radius y 11. 1.40. (b) f (ir = (3)2 + a/3 + • • • + ai6
n—2

1.9. (b) a- = -f ± y 14, y = -i ± y 14. = ai2{i + ^ + --- + (3)4].

1.14. (b) 1. Now (1.31) gives the sum in the brackets. Finally we
(d) -l/V'2. (f) -V3/2. obtain 121/729.
(e) -341/1024.
1.16. (b) cos a; (d) —cos a.

1.17. (b) 2 cos i(.x + y) sin j(x — y). Chapter 2


2.1. (b) 0.5; (e) 2; (g) 1.
1.18. In the following, n represents any integer:
(b) yTt + /7tu; (d) y + yn; (f) 2n. 2.2. (c) 6; (e) -0.25; (g) -4.

1.19. (b) amp. = 1.5; ang. freq. = 0.2; 2.3. (c) -l/.x2; (f)4x.
period = 31.41; phase = —0.48.
2.4. (c) -8.
1.20. (b) ix — §; (d) arcsin yx, 0 ^ a ^ 2.
(f) arccos(arcsin a), 0 ^ a ^ sin 1. 2.5. (c) 32, -32.
(h) -| + (1 + 4aA aS* -l

2.8. (c) dE/dT = 4/cT3.


1.22. (b) 3e2; (d) 3 In 3. or In 3.
(f) 2; (h) ±y'2;
2.9. (b) lx6 - 18a5 + 1.
(1) Hint: write sinh 2.x = 4(e2v — e^2v) and obtain a
quadratic equation for e2-v. x = \ ln(2 + yjll).
2.10. dy/dx = tan a, where a is the inclination angle.
1.26. Hint: x = sinh y = 2(e-v — e_ v). Form an equation
for e’’ and solve it. 2.11. Use the formula for tan(/4 — B) in Appendix B(b).

1.28. 5 cos((ut - 0.927). 2.12. (b) 2; (d) 1; (g) 2; (i) n/180 = 0.0175.

1.29. C = 2, a = 1.386, /(2) = 1/8. 2.15. (a) 2 cos a + 3 sin a.

1.30. Tidal period = 12.57 hr. It floats for 9.20 hr. 2.16. (b) y = 24a — 39; (d)y = e"1x.
Hint: It floats when sin 0.5f ^ —0.666. Sketch y = sin 0.5r
and y = 0.666 and find the intersections. 2.17. (b) 6.x - 2, 6, 0.

Chapter 3 where (for this context) the coefficients are rounded


to three decimal places. For two-decimal accuracy, we
3.1. (b) x cos .x + sin x; (f) 2.x In x + .x.
need —0.79 < x < 0.79.
3.2. (b) 1/(1 + .x)2; (f) (.x2 — 2.x sin x cos .x)/.x4 cos2 x. 5.3. (b) The terms in the expansion of sin x are of
(m) nx"~1. size |x|2"~7(2n — 1)! with n= 1,2. We need to
choose n so that this is less than 0.00005 when x = ±2.
dg d /' d2/ , - d/ dg d 2g The first value within the limits is n = 7. The polynomial
3.3. (d)/—+ 0—, g —- + 2-+ / —,
dx dx dx2 dx dx dx2 is

d3/ d2/ dgf d/d2# d3# 1 3 1


x-x3 -I- — x-x H— xy
5 1 7 1 9 1 n 1
x11 +
0 —+ 3~ — + 3“~+ / 3! 5! 7! 9! 11! 13!
dx3 dx- d.' dx dx2 dx3
5.4. (b) 771 — x.
3.4. (b) —2 cos x sin x; (e) 2 sin x/cos3 x. 5.5. (b) I + \x + gx2 + • • •, — 2 < x < 2.
(j) 12x2(x3 + l)3; (n) -3e“3x.
(s) ax = e*1"", so d(a*)/dx = (In a)ax. (h) 1-x -I— x2 — * • •, valid for all x.
2! 4!

3.5. (f)ix“2; (i) -ix“i


5.6. (b) 1 + 2* — gX2.

3.6. (f) e~'(cost — sinf); (k) 2 sin.x(cosx —sinx)/x3.


5.7. (b) tan x % (x - 5X3 + t2tjx5)(1 - W + Ax4)-1

3.9. (c) ( —2x sin x2)/cos x2. The original function only * X + |x3 + £x5.
has a meaning when cos x2 > 0.
5.8. (d) ln(l + x + x2) = ln[x2(l + 1/x + 1/x2)]
3.10. (b) e'(cos t -I- t cos t — t sin f).
= 2 In x + ln(l + 1/x -I- 1/x2).
1 1
3.11. (b) dy/dx = —y2/x2. This can be written in other
Then treat 1/x + 1/x2 as the small variable.
ways; for example put yJ = 1 — x5 from the equation of
the curve. 5.11. (b) Suppose that the first nonzero derivative is the
Nth: f{N\c) # 0. Consider whether N is even or odd, and
3.15. (b) -5.
whether f(N\c) is positive or negative.

3.16. (b) dy/dx = ±x/(2V(l - (x/2)2)).


Chapter 6
Chapter 4 6.1. (b) 3±j.
4.1. (b) 2f2; (c) 4f3.
6.3. (b) 3 - 5j; (d) 9 + 3j; (f) 1 + 6j.
4.2. (c) x = e_1 (min.); (g) x = 0 (min).
(i) x = —1/^/3 (min.), x = 1/^/3 (max.). 6.5. (d) — 23 — Aj-
(t) Points of inflection at x = me; maxima at x = (2n + ?)n;
minima at (2n — j)n.
6.6. (a) -4j; (c)-Ufj-

6.7. (a)l-j; (c) -2j.


4.5. If base = x and rectangle height = y, then A =
xy -I- g7tx2 (constant), and P = (1 + ^7t)x + 2y. Substi¬
6.8. (b) 16.233 - 0.167); (d) 88.669.
tute for y from the formula for A to express P in
terms of x only. The minimum of P is reached when 6.9. (b) \z2\ = 8; Arg z2 = |ti.
x = [2/4/(l + Iti)]1. (d) |z4| = 3; Arg z4 = n.

4.10. (b) 8y a —0.2 (exact value 0.227...). ^ y = 2\ (d) The parabola, y2 = 4x; (f) y = x
(d) 8y as —0.4 (exact value —0.5). ^ q\

4.11. (a) bv as —0.11; (d) 5A as —0.08. 6.11. (a) V2eH (d) 14e ^j; (g) e2 ej; (j) J2e ^j.

6.16. (a) 2mtj (n = 0, ±1, ±2,...); (c) (2n + l)7tj.


Chapter 5
5.1. (b) (1 + xp * 1 + ix - |x2 + Ax3. For two dec¬ 6.18. (a) cos(ln 2) + j sin(ln 2).
imal places, we need |Ax3| < 0.005, or — 0.43 < x <
0.43; (d) To four terms, 6.23. (a) x2 — y2 + 2xyj.
sin 2x as 2x — 1.333x3 + 0.267x5 + 0.025x7, (d) cos x cosh y — j sin x sinh y.

6-28. 2 — j, -1 -j, -1+j. 9.26. (a) y + z = 1. (b) 3x — 2y — z = 0.

6.29. (b) e2cosW cos(2 sin 0). 9.27. y/2.

9.28. (a) ±(3/^34, 4/^34, 3^/34). (b) ±(7,7,7).


Chapter 7
7.2. a- = -2, y = 1. 9.29. (a) - 3/ + 2j + 4k. Length J29.

- 10 -5
7.6. A =
20 10_ 9.36. r = - f a + —\ [Hint: Draw a diagram involving
2 \|«| \b\)
-5 6 16 a and A.]

7.7. .42 + C2 = -8 11 2 9.37. The minimum separation occurs when f = I2j s.

—6 -6 -7
Chapter 10
7.11. A1"-'. 10.1. (a) 10. (e) zero.

7.16. a- = - 17, y = -2, z = 8.


10.3. If your diagram is a parallelogram ABCD, the
theorem obtained is AC2 + BD1 = 2(AB2 + AD2). If
Chapter 8 you use the triangle rule the result gives the median of
8.1. (c) 1; (e) - 1. a triangle in terms of the sides.

8.4. (b) 1728; (d) -8132. 10.5. (a) 6. (b) -5.

8.6. (b — c)(c — a)(a — b)(a + b + c). 10.6. (a) 35.3°.

8.14. x = a, b, c, —a — b — c.
10.8. 54.7°.
7 1 -5
10.9. - 33x2 - 13y2 + 95z2 + 48xy - 144yz + 96z.x = 0.
8.16. del(AB) = -36, A~' = ! -2 0 2
10.10. 78.9°, 68.6", 32.5°.
11-1
10.12. F = a + 2b + 2c.
Chapter 9
10.16. a= -§, P = ly = &.
9.1. (a) TQ = (5, -3), QP = (-5,3).

9.2. (f) Length = 5, 0 = 126.9 . 10.17. x = 0, y = 0, z = 1.

10.18. (a) 2^2, 0. (b) (x - —j" + (r + -L)" = 1.

9.4. BE = (0, —4); BE = 4; bearing south. 10.19. (c) (l,m,n) = (j, -§).
9.5. (c) V6.
10.21. (a) ±(rj, o, tx)-
9.7. (b) 2a = (6, 4, 6), 3h = (3, 3, 6), 2a - 2b = (3, 1. 0).
10.26. (a) 19.1°.
9.10. (a) (3, 3, -6).
(b) (X + 2)2 + (Y - 1 )2 + (Z + 3)2 = 1. 10.29. Hint; Translate the axes to the point q and so
work from the simpler form, (10.24a).
9.16. Speed 10^/2; Direction towards north east. (Hint:
use vwe = vw — vc in components, with t?N = (u, v).) 10.30. (a) P, is -y + z = 4, P, is 2x - 2y + z = 5.
(b) 45°. (c)2V2.
9.22. (b) la + {b (c) 4a — \b. (d) and (e). The line L is given by r = 2(1, 4, —4). Show
that intersection with P, and P, occurs when 7. =
9.23. (a) (a + /.b)/( 1 + /.). (b) (a - }.b)l( 1 — /.).
(c) ^he point is on the extension of AB in the direction 10.34. Begin by finding any two points on the line of
of AB. intersection. (The resulting form is not unique.)

Chapter 11

1
7
Cl

Cl
">
"-1 +2^/2'

1
11.1. (a) (4, 7, 5). (d) -9. (h) (-24, 3, 15).
7 7

11-9. Hint: the determinant is equal to a-(b x c), where


13.4. (c) Eigenvalues ( — 2,2,3); Eigenvectors
OA = a, etc.
0 1 0
11.12. X=-i Y= -§,Z = —f.
-1 , 0 i
2
11.13. (c) /. = /(= — i v = i L3 meets Z., at (1, 1, j) and
2 0 _1_
Z.2 at (i -§, i).
7
m —2 and a = 7.
11.15. (a)(^,f,-i). (b)(^, ?•--»*). (c)fj.
[Note: the unit vector in the direction of i—2j — 2k
13.14. The matrix C is given by
is i(f-2/-2A-).]
7 -1 -1
11.16. (a) —6. (b) 6. (c) 0. (d) 0. (e) — 2y 3.
C = 1-1 1

1 0 2
Chapter 12
12.1. (c) .v, = 1, ,v2 = -1, .v3 = -5. l l 1
(e) x, = 2, x2 = -1, x3 = 2. 13.16. lim An = $ l l l
n~* oc

12.7. x, = 40, x, = 88, x3 = -68, x4 = -59. 1 1 1

12.9. (b) _ 13.22. Eigenvalues are 0, 4, 4, 12.


5 0-5
13.26. A3" = I3, A3n + 1 = A, A3n + Z = A2.
A -6 10 1 .
7 5 3_
Chapter 14
(e) 14.1. (a) lx6 + C; fx5 + C; ^x4 + C;
ix3 + C; 3x2 + C; 3x + C; C.
10 0 0 0 (g) ex + C; — e_x + C; |e2x + C; -2 e"*x + C;
-1 1 0 0 0 — 1 e_2x + C.
(k) x + lnx + C ((x + 1 )/x = 1 + x”1);
0-1 1 0 0 2x — 2x5 + C; ln|x| — 2x~1 — ix-2 + C.

0 0-1 10
14.2. (b)4-^(l-x)5 + C; -|(8-3x)-^ + C;
0 0 0 -11 !(1 -xp + C.

12.12. The shadow on the z plane has vertices at the 14.3. (b) — ln|l - x| + C; -$ ln|4 - 5x| + C.
points ( — 1,0, 0), ( — 1, —2, 0), (1, 0, 0).
14.4. (c) |x + i sin 2x + 22 sin 4x + C.
12.16. Nontrivial solutions if k = 1, — 1, 4.
14.5. $x^2 e^x - 2x e^x + 2e^x + C$.
12.18. Nontrivial solutions if k = —6, —1, 3, 4.
14.6. (a) 2; (h) $-\ln 2$.
12.22. x, = 1.398, x2 = 1.090, x3 = -0.2844,
x4 = -0.3697. 14.7. (c) 4 — x2 ^ 0 if — 1 ^ x ^ 2, and 4 — x2 ^ 0 if
2 ^ x ^ 3. The geometrical area is

Chapter 13 [F(x)]2_, - [f(x)]2,


r-3i
13.1. (b) Eigenvalues 4, 9; Eigenvectors , where F(x) = 4x — 3X3.

(e) Eigenvalues 3 - 4^/2, 3 + 4^/2; Eigenvectors 14.8. (a) At T Zl; (b) g/3 T At T B.

Chapter 15 Therefore the area is


15.1. (b)
x=2

x= i ri lim Y [x(x ~ 1) — (— x)] 8x = x2 dx = §.


lim Y x'5 §x = x5 dx = [5X6] L1 = 0. 8.v -» 0 x = 0 Jo
6x -* 0 .v = - 1 J- 1

16.13. i
15.2. (b) (x + l)2 dx = |(x + l)2 + C.
J 16.14. (b) n.

15.3. (c) dx = [x]o = 2; (i) §(22 — 1). 16.15. In a plane perpendicular to the end, y is down¬
ward and x is horizontal; the origin is at the top.
Area elements are horizontal strips of width 5y in the
15.4. (b) (a-2 - 1) dx = |>3 a] L, = -i end face. Force = \pgLH2. Moment = \pgLH3.

16.16. Distance of centre of mass from vertex is |H.


15.5. (b) e-2,'dr= -2[e“2l']j = —2(0 — 1) = 2.
16.17. ricra3b (a — mass per unit area).

15.6. (c) 2/rc; (h) (1 — e ') df = T + e~ 1. 16.18. (a) (b) 33aHB3, where a is mass per
unit area.
I
(T + e“' — 1)= 1 + T-‘ e-T - 7”1
T
Chapter 17
as T -* x.
17.1. (c) -ie~3* + C; (f) -^(3 - 2a)6 + C.

15.7. The integrands are (a) even; (b) odd; (c) odd; (j) (2x — 3p + C; (n) j ln|2x + 3| + C.
(d) odd. (o) ln|l — a| + 1/(1 — x) + C.

15.9. (b) The exact result is ^/tt/2. 17.2. (b) —3 cos 3(3? — 1 ) + C; (e) -|(-tp + C.

15.10. (e) t(x + l)“2sin(x+ 1) — \x~2 sin x. 17.3. (d) 2 sin(x2 + 3) + C.


(j) 2 ln(l + x2) + C.
15.11. (b) x ^ — 1: j (constant); — 1 ^ a ^ 1: tx2;
a ^ 1: 2 (constant).
17.4. (c) 5 sin3 2x + C.
(g) Put cot 2x = cos 2x/sin 2x, then u = sin 2x, giving
Chapter 16 7 ln|sin 2x| + C; (j) 3 cos3 x — cos x + C.
16.1. 5.3 x 10-3.
17.5. (b) 205/32; (e) —In 2; (h) j In 2.
(k) Zero; (n) (2/co) cos </>.
16.2. J (20 - 10?) d? = -20, x(4) = - 17.

17.6. (b) 371; (d) + j; (f) |jt.


16.3. (b) jit; (g) k.
17.7. (e) tan x — x + C; (f) —x“1 - arctan(x-1) + C.
16.5. (a) %nab2.
(k) ^[arcsin x + x(l — x2p] + C.

16.6. v = kx2 d y = 7i(2j;)2 Ay = 28tc/3. 17.8. (b) iln|x/(x + 2)| + C.


(d) ln|x + 1| - ^ln|2x + 1| + C.
(f) ln|x| - iln(x2 + 1) + C.
16.7. Put a = 0 at A; moment mx dx = \mL2. (i) 1 in[(l + sin x)/(l — sin x)] + C.

16.8. 1.27. 17.9. (b) | ln(x2 - 2x + 3) + C; (e) ln(e* + e~x) + C.


(0 21n(x* + 1) + C.
16.9. 0.015 g.
17.10. (b) jx e3x — ^e3* + C.
16.12. A sketch shows that x(x — 1)5= — a if 0 ^ x ^ 2. (f) 2x sin jx + 4 cos \x + C.

(i) fx2 In .v - 4x2 + C. 18.14. 0 = 0.0719 e“0 033' sin 0.696r.


(j) x"+1[ln x - 1 /(« + 1)]/(/? + 1).
(k) Hint: bring together the two terms f (In x/.x) d.x. 18.18. A = (Mg/P) e W(-V-H) p.

17.11. (a) Hint: there are two stages required; see Example
Chapter 19
15.20.
19.1. (b) -}f3- |t2-|f -
17.12. Hint: the same integral occurs on both sides (d) le2'; (i) — rs sin 3r.
but with a different factor. (k) — rs cos 2? + rs sin 21.

19.2. (d) !( —6 cos r — 3 sin f).


17.13. (b) Zero; (d) j; (h) 7i.
(0 — tt7(4 cos 2f + 11 sin 21).
(h) ^ e'(4 cos 2t + 7 sin 2f).
17.15. F(0) = \_n, F( 1) = 1. F(4) = F(5) = f5.
19.3. (b) — |f cos 2t.
17.16. (a) 2 In3 2 - 6 In2 2 + 12 In 2 - 6.
(b) F(0) = 2. F(l) = jt. F(4) = 7i4 + 12jr2 + 48, 19.4. (b) \t2 e'; (e) \t e' sin t.
F( 5) = tt5 + 20tt3 + 120rr.
19.5. (c) A e1' + B e^5' —1—^7 cos 2t.
(i) A cos x + B sin ,x + xM — 1 -f 5e3x.
Chapter 18
19.6. (c) — 3 + A e'’.
18.2. (b) -Y = A e5'; (e) .v = A e“5'; (i) .x = A e'.
(g) (sin.x — cos ,x — .x cos .x + A)/(x -I- 1).
(l) (x + 1) ln|.x + 1| + 1 + A(x + 1).
18.3. (b) .x = e^'-u; (d).x= 10e“,, + 1).
19.9. llj minutes.
18.4. 7(f) = 70 e~Rl L. I reduces to a fraction 1/n of
itself in any interval of length L In n/R.
Chapter 20
18.5. (a) A(t) = C e~k‘ (C arbitrary); (b) The half-life 20.1. (b) 3 cos (cot + 7t); (e) 3 cos(2t + \n).
(h) 5 cos(21 + 4>), cf) = — arctan 4.
F= - In 2 years. The information implies that e~20k =
20.2. (c) x leads y by n.
1 - 0.175 = 0.825. so k = 0.0096. Therefore F = 72 years.
20.3. (b) (i) 0.318 cycles/sec. (ii) 0.316 cycles/sec.
18.6. If N(t) is the number, then 5N % 20(jN)5t so (iii) About 3 cycles.
the equation is dN/dt = ION. In the second experi¬
ment there is an average death-rate of 1 per rabbit 20.4. (b) C = 7(4 - </> = arctanf —1/(^6 - 1)),
per year, so dN/dt = 9N. ( — tTT < <p < 0).

18.7. (b) A e' + Be'2'; (e) A e'/2v'3 + B e_,/2'/3. 20.7. The solutions are of exponential type.
(l) A e~3' + Bt e“3'.
(n) A + Bt (this is an exception to (16.10)). 20.8. x = e"4' - 4e“6'.

20.9. A e~kl + Bt e~k'.


18.9. (b) y(e'— e“2'); (d) (The general solution is A
e~x + Bxe~x), y = e(.x — l)e_t. 20.10. (a) Period = 1.0508.
(b) Amplitude = 10/[(36 — ur)2 + co2]5, phase =
18.10. (b) A cos 3f + B sin 3f. — arctan[co/(36 — oj2)].
(d) A cos co0t + B sin co0t. (c) Resonance: co = 5.958.
(f) e'(T cos t -I- B sin t).
(i) e^s'(4 cos |yjlt + B sin \^/2t).
Chapter 21

18.11. (c) a cos co0t + (b/oj0) sin co0t. 21.1. (b) —2ejni (2e_57tj in standard form).

21.2. (d) 2 e_J7CJ; 2 cos (cot — in).


18.12. 0 = a cos(g/iyt.
(i) eI,97j; cos(o>f + 1.97).

18.13. The initial angular velocity d0/dt is v/l; 21.3. (b) 1 - e“**J = 1 + j - V2 e4Itj.

21.4. (b) 1 — 3 e~ + eini = 1 + 4j = -J\l e^, where </> =


arctan 4 = 1.33.

21.6. (b) R + o)Ly, (d) K/( 1 + oRCj). 24.5. (b) 1; (d) gf4; (g) \ e*.
(i) R + jcuL/(l — orLC). (k) \ e' + { e"'; (o) 2 cos 2t — \ sin It.
(k) jo)RL/[R(\ - orLC) + yoL], (sjle't2; (u) i(cos t - cos 2f).

24.6. (e) (2s2 + 3s - 2)X(s) - 10s - 11.


21.7. V = Z/and V = 2. ,
(d)/ = 2(l +}coRC)/R: |/| = 2(1 + orR2C2)*/R;
24.7. (b) 2 e' + c“2'; (e) 3 e-' cos 2f.
arg / = arctan(wKC).
(f) y = 4 ex + )e^' + ) cos x.

21.8. (b) VJV0 = A(3 - 2j); E0//, = 1(5 - j). 24.8. (b) 3 - 3 cos f + sin t.
(e) -Je“‘ + Ie'-ire' + it2 e'.
(i) — 1 e‘ + \ e ' + 4 e-' — /i e
Chapter 22
22.4. (b) 2.x2 - y2 = C; _ (g) y = ,x/(l + Cx). 24.9. (b) x = | + |e4' + \t e4',
(k) .x = ±2~HC - t3)~! for f3 < C.
(n) arctany + arctan .x = in. Take the tangent of this y — — \6 + 16 c -h 4/ e .
expression and use the formula for 'tan(/4 + B); we
24.10. (b) e'(U + IB + i) + e"‘(i/4 - JB + i) - 3,
find that y = (.x + l)/(.x — 1).
where .4 and B are arbitrary. This is the same as
fe' + De"1- 3, where C and D are arbitrary.
22.6. (b) y = A(.v2 + C)2 for .x2 + C > 0. y = 0 is
also a solution; (d) Those parts of the curves y =
24.13. e”2 e“2s[(s + l)2 - l]/[(s + l)2 + l]2
sin(ln |.x| + ) for which .x and dy/dx have the same sign.
Also y = + 1 are solutions. = e-2 e '2vs(s + 2)/(s2 + 2s + 2)2

22.7. (b) y3 - 3.xy = C. 24.14. (b) H(0 sin t — H(f — 1) cos(f — 1).
(d) xy — y2 — x2 = C.
(f) y3 + y - x3 = C. 24.15. (b) (1 e2’ + g e^2‘ — j)H(f),
(h) y + cos y + sin x = C; (j) ex+y + y - x = C.
— (g e2(l_ 11 + 5 e_2<,~11 — i)H(t — 1).
(d) jH(t)f sin r + ,H(t - n)(t — n) sin(r — it).
22.8. (b) xy + y/x = C; (d) x/y + y - x = C.
(e) y/x - x/y - 1/x = C; (f) x2/2y2 + 1/xy = C.
Chapter 25
22.12. (b) x( 1 + 2y2/x2)* = C. 25.3. Hint for working: s2 + 2ks + oj2 has real factors
(d) x2 - 4y2 = Cy3' when k2 > co2\ so put s2 + 2/cs + co2 = (s — a)(s — /?),
where <x,/? = — k ± (k2 — or)1. Then x(f) is given by

Chapter 23 (ot - /?)“1 [(a + k) e3‘ — (fl + k) e^']H(f)


23.2. (b) y = Cx (this is not covered by (23.22)).
+ /(a - j8)_1[ e3t('_'0> - e^'-'0)]H(r - f0),
(d) xy = C (a saddle).
where k = 1 + 2k.
23.4. (b) Saddle (i.e. unstable), m = \( — 3 ± ^/13).
(f) Stable spiral; directions are clockwise round origin. 25.4. By proceeding as suggested, we obtain

ti(x) = Ax + 5B.X3 + (Mg/6K )(x - ^/)3H(x -1/).


23.5. (b) Equilibrium points at (1,1). (1,1) is a stable
spiral, anticlockwise about (1,1); (d) Equilibrium points The conditions at x = I give A = Mgl2/16K, B— — Mg/2K.
at (— 1, 0), (0, 0), (0,1); (0, 0) is a centre and ( — 1, 0), (1, 0) This problem could be solved by integrating the equation
are saddle points. four times, and linking the solutions over [0, j/] and
[2/, /] by the condition that u(x), u'(x), u"(x) are continu¬
ous at x = jl, but this is automatically secured in the
Chapter 24 Laplace-transform method.
24.1. (b) 4/(s + 1); (d) 6/s3 - 1/s.
(g) (3 - s)/(s2 + 1). 25.5. (b) 2s/(6s2 + s + 1).

24.2. (b) 1/s - 2/(s + 2); (e) (3s - 4)/(s2 + 4). 25.6. (b) V2/Vx = 3/(20s2 + 12s + 5);
(g) i[l/s - s/(s2 + 4)]. V2/I = 3/(4s2 + 6s + 1).

24.3. (b) l/(s + 2)2; (d) (s - 2)/(s2 - 4s + 5). 25.7. (b) f; (f) 1 — cos t; (h) j( — r cos t + sin f).
(i) (s2 - 9)/(s2 + 9)2; (1) 24/(s + l)5. (j) n\m\tn+m+1/(n + m + 1)!.

1
25.8. (b) - /(T)(ew('~r> - e“"('_r)) di. 26.10. ao = 0, an = 0, bn = [1 - (-1)"]
2w Kll
o
(n= 1,2,...).
25.9. (b) cosh f.
x 4
25.19. (a) x(f) = 5(f) + 25(f - T) + 5(f - IT), *(s) = 26.16. (a) I — - sin(2« 1) 7Tf.
„=i (2n — 1)ti
1 + 2e~sT + e~2sT.
y 2 x. 4
25.20. Y(s) = Y, [3_n^(r - (77 + l)f) + 26.18.-y —-cos 2ntot.
n=0 71 n=l 71(4/7" — 1)

2 ■ 3 _'V)(f - (7; + 2)7)


14/ 1
26.23. (b) R(f) = - 4- - cos f + — cos 3f
25.21. (a) z~1 +2z-2-2-3. (b) 1 -z"1 + z~2- 2 n\ 32
= z/(z + 1). (c) 2z/(2z - 1). (d) z/(z2 - 1).

25.22. (a) 7z/(z — l)2.


H—- cos 5f + • • •
52
25.23. (a) (z - l)/(z + 1). g(t) = [1, -2,2, -2,...}.
26.26. (b) t 4 = T-
25.27. (a) Unstable. Poles at z = ±2, giving growth |2" n = 0/7 6

and |( — 1 )"2". (c) Stable. Poles at z = ±^j, giving decay


1 1 1 26.30. z + — X - ei2nnl,T.
COS - 71/7.
2k „ = - x n
4 2" 2

26.32. x([) = 2 X(f) cos 2nft df where X(f) =


25.31. x(f) = 2 Xc(f) cos 2nft d/,

x(f) cos 27t/f dr.


-Y,.(/) = 2 .v(f) cos 2nft dr. x(f) and 3fc(/) are called
Jo 26.34. (c) 2c sine cf cos 2nbcf.
a Fourier cosine transform pair.
-X 26.35. (a) 2 sine 2 {f. (b) 2 sine2 2/ e~i6nf.
25.33. x(t) = 2 Xs(f) sin 2nft d/,
26.40. [l/{a + }(2nf + /?)} + l/{a + j(27t/ - )?)}].

26.42. (b) 1/(1 +j2ft/)2.


Xs( f) = 2 x(t) sin 2nft df. x(t) and Xs(f) are called
Jo 26.44. (b) The Fourier transform is sinc2(/) e J2,t<a+f,)7 ^
a Fourier sine transform pair.
/l[t - (a + fc)].

Chapter 26
Chapter 27
26.1. (b) a„ = 0, bn = -2(- 1)7/7-
27.3. (c) 4x — 2j/ — 1; —6y — 2x — 1.
(e) c/„ = 0, hn = — [ 1 + (- 1)" - 2 cos(^«7t)]. (f) 37 - 2; x - T (i) 2y/(x + y)2; -2x/(x + y)2.
nn
(k) x(x2 + y2yk y(x2 + y2)~T
2n2
26.2. (b) bn = 0. a0 = 5F dF
27.4. (c) — = g'{r) cos 0; — = g'(r) sin 0.
dx dy
aB = -(-l)"(/i = 1,2,...).
/?“ 27.8. d2//dx2, d2//dy2, and d2//dx dy = d2//dy dx are
4(-iy given in order: (b) 2, 4, 3. (d) 2y/x3, 0, — 1/x2.
(c) b„ = 0, a„ = -
7t(4/72 — 1) (h) 108(3x - 4y)2, 192(3x - 4y)2, - 144(3x - 4y)2.
(k) —r"3 + 3x2r-5, — r~3 + 3y2r-5, 3xyr-5, where r =
26.3. (a) c/0 = Att, x2 + y2)l

~ 0, CUn- I — 2’ 27.10. (b) 2x + 2y — z = 4; one normal is (2, 2, — 1).


nrr
(d) 3x + 4y + 8z = 29; one normal is (—§, —2, — 1).
(-1)"
&.=■-- - (77=1,2,...). 27.11. 78.9° or 101.1°.

27.12. (b) (1,-1), min; (d) (nn, mn); min if n and


26.5. Series sum is ^tt.
777odd, max if n and m even, otherwise saddle.
26.8. F = 2. (h) (0, 0) saddle; (1,1) minimum; (k) (0, 0), saddle.

27.14. (a) a = b = c = 1\ (b) a = b = c = 4. 29.8. (b)

27.15. The maximum is 9, attained at (2, ± 1). x = - 2fO sin 0 + r cos 0 - 0zr cos 0 - Or sin 0,

y = 2r0 cos 0 -I- r sin 0 — 02r sin 0 + Or cos 0.


27.16. Minimum distance = J2.

27.18. (b) Depth = 2~3V}: square base, side 23V3. 29.9. (c) df/cu = -2v2/u\ df/dv = 2v/u.

29.10. (b) d2J'/du2 = 12u2 - 2v2, d2f/du dv = -4uv, d2fl


Chapter 28 dv2 = -2 ir + 12r2.
28.1. (b) 8r = 0,0718 ... (exactly). The incremental ap¬
proximation gives 8z % 0.0784. Error = 9.1%. 29.11. It is easiest to put x2 — y2 in terms of uv. Finally,

28.3. (b) ((5.v)2 - 4 6.x- 8y - 7 5y)/4( 1 + 5y). d2f/cu2 = \6v2y"{4uv), c2f/dv2 = I6u2g"(4uv),

28.6. -5.7%. d2f/cu dv = 4g'(4uv) + \6uvg"{4uv).

28.7. 1.67 reduction, approximately.


Chapter 30
28.9. (b) —2j2\ (d) Zero (it is the same in all direc¬ 30.1. (b) 8/« -x(x2 + y2)“^e“'5x
tions).
— y(x2 + y2)"5 e-' 8y — (x2 + y2)~- e-' 5f.
28.10. (b) (e) -I; (j) 1.
(e) 8/ Je 2(x, - x2) 5x, - 2(x, - x2) 8x2 +
28.12. (b) xxx/cr + yxy/b2 = x2/a2 + y2/b2.
(f) axtx + h(ytx + x,y) + byy, + g(x + x,) + 2(Ti - y2) S>’i - 2(yx - y2) 8y2.
./(>’ + Ti) + c — 0.
30.2. -0.07.
28.16. (b) x~1 — 1 = const; (d) e* + ev = const.
30.3. It is easiest to write 5(1/R) % —5R/R2. We obtain
28.17. (b) y2 — x2 = b2 — a2. 6R x 0.198 6Rj + 0.018 8R2 + 0.334 8R3. The required

8R3 is -0.108.
28.19. (b) 49.8° or 130.2°.
(d) Hint: compare Problem 26.12f. 30.4. Put ax3 — bx — c = f(a, b, c, x) and use (30.1).

28.21. (b) (0, 2); (d) ( —i, 1).


30.5. (b) Hint: use logarithmic differentiation: 8w «
— 3 Sx + 3 8z and 8w ^ 2(±0.6). What is the signifi¬
28.22. (b) (2, l)A/5.
cance of the absence of a term in 8yl
28.23. (b) 0 = 0.
30.6. (b) Maximum |Sw| % 0.14; max. percentage error
10%.

Chapter 29
29.2. (b) —4 sin t cos f; (d) 2 sin(f2) + 4f2 cos(f2). 30.8. (b) 28x + 48y — 68z = 0. For dz/dx, put 8y = 0:
dz/dx = 3. Similarly dz/dy = §.
29.3. It is easiest to start by expressing the distance
D in terms of polar coordinates (r, 0), (R, 0) by using 30.11. (b) (2, - 3, 5); (d) (3x2, 0, 9z2).
the cosine rule (Appendix B(f)). Then (f) ( — x/r3, —y/r3, —z/r3), where r — (x2 + y2 + z2)T

dD (Rv — rV) sin(0 — 0)


30.12. (b) (0, 2y, 2z). Unit vector = (0, y/(y2 + z2)1,
dt [R2 + r2 — 2Rr cos(0 — 0)]3’ z/ (y2 + z2)b
where 0 = vt/r, 0 = Vt/R.
30.13. (b) cos 0 = 11/3^14, so 0 = 11.5° (i.e. the angle
29.4. (b) x = y = 3; (e) The coordinates of the nearest of intersection of smallest magnitude).
point on the given line are (f,;). Distance = 2/y/S.
30.15. (b) s-(2x, —2y, —3).
29.5. (b) (0,0), (2,0). (A suitable parametrization is
x = 1 + cos t, y = sin t.) 30.16. (b) (Check that s as given is a unit vector.)
(d) (±6/^/5, ±4/^/5). (A suitable parametrization would
be x = 2/cos t, y = 2 tan t.)

30.17. (b) -2i-2k. Chapter 32


32.1. (b) 1.
30.18. (b) (±1.0,0) and (±1.^).
(d) x = y = z is a line of stationary points (excluding 32.2. (b) £ (d) (f) 0.
the origin).
(e) = y = r = ±1 x 3. /. = ±!v3.
32.3. (b) n:; (d) §7t3.

30.19. Stationary at (1. 0. 0), ( — 1.0. 1). (-1.0, — 1). 32.5. (b) 2; (d) 0; (g) 0.

30.21. (b) (3, 3, 3); (e) (a/^/3, h/yj3, c/^J3). 32.6. (b) 1; (d) -3; (f)
(g)(i 3-D.
32.7. (b) 0.
30.26. (b) 4.xy =1; (d) .x2 + y2 = 1.
32.8. (b) -§.

Chapter 31 32.9. Zero.


31.1. (b) e — 2; (d) (d — c)(b - a); (i)
(m) I In 2. 32.11. Put x = x(u,v) and y = y(u,v), where u and v
dx ox
are the new coordinates. Then put d.x = — du H-dr
31.2. (b) Zero. Refer to the signed volume analogy
(29.2b); (f) ln(27/16).

31.4. f. 32.14. In.

32.16. (b) nonconservative.


31.5. (b) J\x, y) dy d.x.
o

\ <1 - .V2) Chapter 33


(d) J\x, y) dy d.x. 33.1. n[a3 — (a — h)3~\/a.
-1 *0
r-o r1 + 1 33.2. (a) 1/84; (b) 1/24; (c) 13/384.
(g) fix, y) 4x dy + fix, y) d.x dy.
* -1J 0
33.5. 2^/6 + 2sinh-‘ (^2).
1
31.6. (b) 3' T7-
33.6. Scalar potential is exyz + cos xy + z.x + C.

31.7. (b) 1.
33.7. Scale factors are = h2 — Jiu2 + u2), /z3 = uv.

31.8. (b) 33.11. (b) div F = 2z.

33.12. (b) curl F = 2xi + (x - 2y)j + k.


31.9. 2u2(4 + tu)/9.

31.10. (a) Hu1 + r2); (b) 2; (c) 1/5; (d) -2 cosh v. 33.14. (a) 3r2; (b) 0- (c) 3rr; (d) 0\ (e) 0; (f) 12r.

33.18. 8/3.
31.11. The value of the integral is 2(257 — 129v/2)/5.

31.12. Area = 41/12. Chapter 34


34.1. (c) -2, -1,0, 1,2, 3, 4; (f) 1,4,9.
31.13. 1/e.
34.3. (c) A u B = {-4, -3, -2, -1, 1,2, 3,4}.
31.14. Volume = 20.
34.4. (b) 4 n B = {x|xe N + and — 5 ^ x ^ 2}.
31.15. 1/4. (d) A n B = {1}.

34.5. (b) B\(A uC); (d) (B n D)\A. 36.19. (a) The transfer function is

34.6. (b) S:\(S uS2u u Sr).


1 + G2//, — GlG2G2,H2
34.7. (b) [{A\A1)\Bl]uB2.
GiG3
34.10. A2 = {(1, 1), (1, 2), (2, 1), (2, 2)}. ; (1 — G1G2H2)(1 + G3//j)

34.12. (b) 66.


g2g2g3g4 +g5g6g7
1 + G2G3H2 1 - H,
Chapter 35
35.6. See table below. 36.24. S/IFT, length 12.

35.10. (b) (crb)-(b ® c)\ (d )(a-b ® a-b)(c-d). 36.25. One tie.

35.15. (a) If a, represents the state of switch S,, etc., 36.27. (b) Framework is overbraced.
then the switching function is
36.28. Two ties.
(«1 © Ch) © [(^3 © fl4)'a5]-

36.29. Waiting times are 14773 and 97/2.


35.16. See the table.

Solution for 35.6 Solution for 35.16 Chapter 37

a b c (a@b)-(a®c)
37.1. £1790.85, 4.87%.
fli «2 «3 /

0 0 0 1 0 0 0 0 37.2. (b) 16.9 years.


0 0 1 0 0 0 1 1
0 1 0 0 0 1 0 1
37.3. (b) 0,&-1 ±Vl3)/6.
0 1 1 0 0 1 1 0
1 0 0 1 1 0 0 1
1 0 1 1 1 0 1 0 37.6. f(n) = (In n)/ln 2.
1 1 0 1 1 1 0 0
1 1 1 1 1 1 1 1 37.8. (b) un = A3n + B(- 3)".
(c) m„ = 3"(A cos \mz + B sin inn).
Chapter 36
36.3. Twenty are planar. 37.11. (a)(ii) un = -^n + in2.; (b)(ii) u„ = in +
(c) (iii) un = \n2.
(d) (iii) un = Tsn23n.
36.4. Five are connected.

37.13. D„( 1) = n + 1.
36.8. Six not including reversed order.

36.13. There are three different paths between a and e. 37.16. un = | — |(—-j)".

36.14. Five vertices. 37.17. dk = k(N - k).

36.15. /] = — xf/O h = ~ tt<0’ h = ~ 2t!c>’ U = rrio> 37.19. s„ = ln2( 1 + n)2.


h = ii'o’ G = rr'o' h = — 7'o-
37.22. 0 < a < 1.
36.17. The transfer function is
37.23. Oscillates between 0.4953 and 0.8124.
pg,g2g3
1 - G2Hl + g,g2g3//2' 37.24. The periodic values of the 2-cycle are 0.4 and 0.8.

Chapter 38 39.4. Mean = (a + b)/2; standard deviation = (b — a)/


38.1. (a) 32; (b) 216; (c) 12; (d) 63. (V3)-
38.2. The probability that the score is 7 or less is 7/12. 39.6. Mean number of non-faulty components to failure
is 82.33; standard deviation of the number of components
38.5. n(A u B) = 10. to failure is 82.83.

38.6. (b) Ace of clubs or ace of spades drawn; (d) any 39.7. 1/29.
ace or any heart or any black card drawn; (f) any heart
except the ace of hearts; (h) ace of hearts or any black
39.9. Probability that a bottle fails the test is 0.00067.
card.

39.10. (a) 0.777; (c) 0.223.


38.7. (b) 1/221; (b) 0.004166....

38.9. (b) 5040; (d) 7. 39.11. (b) 0.528.

39.13. (b) P(Z ^ 0.7) = 0.758.


38.11. (a) 27 216; (c) 3360.

38.12. (b) 156 849. 39.14. On average 30% of operations take longer than
40 seconds.
38.15. 270 725; 0.010 56....
39.15. Standard deviation of 1 if a = J5 and A =
38.17. (a) 9/209; (c) 16/665; (c) 683/1463. 3/(2075).

38.18. (b) 0.872; (c) 0.602. 39.16. Maximum value of standard deviation is 121.6.

38.19. (b) 0.453; (c) 0.547. 39.17. Probability that just two bulbs will be still work¬
ing is 0.242.
38.20. With the same probability of failure 0.98, prob¬
ability that circuit fails is 0.963.
Chapter 40

38.21. (b) 1/495; (c) 4/99. 40.1. (b) Mean = 24.1; median = 24.5; interquartile
range = 17.

38.22. Overall probability is approximately 1/53.7.


40.3. Sample mean = 25.3; mode = 25.1; variance =
0.0644.
38.23. Mean number of plays to the end of the game is
2"” 7/3.
40.5. About 11 intervals.
Chapter 39
39.1. P(X >\) = 0.841. 40.6. Estimated variance of the sample is 1/12.

39.2. P(X Ss 6) = 1/32. 40.8. /c, = - 1.1337; k2 = 1.1337.

39.3. Mean = 0.0769; standard deviation = 0.0887. 40.9. For full data a = —0.019 964; b = 52.998.

Chapter 41
The answers contain suggested Mathematica commands for solutions to the projects. Some full programs and
outputs are also included for selected problems. Note that there are frequently several ways of presenting
answers using Mathematica programs. Inevitably these answers and comments are brief: a working knowledge
of Mathematica using mainly built-in commands is required.

41-1.1. Use the Plot[ ] command.



41-1.2. Use Table[ ] to define the set of points and ListPlot[ ] to plot the points. The curve can be plotted
using Plot[ ] and all displayed with Show[ ].
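
The pattern described in 41-1.2 can be sketched as follows. This is only an illustration: the function Sin[x] and the eleven sample points are assumptions, not the data set in the project.

```mathematica
(* Hypothetical example: sample Sin[x] at 11 points on [0, Pi] *)
pts = Table[{x, Sin[x]}, {x, 0, Pi, Pi/10}];

(* Plot the discrete points and the continuous curve separately *)
dots = ListPlot[pts, PlotStyle -> PointSize[0.02]];
curve = Plot[Sin[x], {x, 0, Pi}];

(* Combine both graphics in a single display *)
Show[curve, dots]
```

Show[ ] takes its plot range from the first graphic, so the curve is listed first here.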

41-1.3. Here is a program and a selected output.

In[1]:=
<<Graphics`ImplicitPlot`
In[3]:=
ImplicitPlot[x^4+2y^2-x y-2 y x^2==4,{x,-4,4},
AxesLabel->{x,y},PlotRange->{{-2,2.2},{-1.5,3.2}}]

41-1.4. Use Plot[ ] commands.

41-1.5. Use Plot[ ] commands.

41-1.6. Here is a program which uses the command ParametricPlot[ ].

The polar variables are (r,t): equations are defined by r1[t] and r2[t].
r1[t_]:=.5 (1+Cos[t])

This is the graph of the cardioid.

ParametricPlot[{r1[t] Cos[t],r1[t] Sin[t]},{t,0,2 Pi},
AspectRatio->1]

This is the graph of the folium.

r2[t_]:=Cos[t]*(4*(Sin[t])^2-1)

ParametricPlot[{r2[t] Cos[t],r2[t] Sin[t]},{t,0,2 Pi},
AspectRatio->1]

41-1.7. Partial fractions can be generated by Apart[ ].
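
As a brief sketch of Apart[ ], assuming a hypothetical rational function chosen only for illustration (it is not taken from the project):

```mathematica
(* Hypothetical rational function, chosen only for illustration *)
expr = (3 x + 5)/((x + 1) (x + 2));

(* Apart splits it into partial fractions; here 2/(x + 1) + 1/(x + 2) *)
pf = Apart[expr]

(* Together recombines the pieces, which gives a quick check: the difference is 0 *)
Simplify[Together[pf] - expr]
```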

41-2.1. Limits can be obtained by the Limit[ ] command.

41-2.2. Use either D[ ] or f'[x].
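
A short sketch of the commands named in 41-2.1 and 41-2.2. The limit and the function f below are hypothetical examples, not those set in the projects.

```mathematica
(* Hypothetical examples, not the functions set in the projects *)
Limit[Sin[x]/x, x -> 0]          (* a standard limit, equal to 1 *)

f[x_] := x^3 Exp[-x]
D[f[x], x]                       (* derivative via D[ ] *)
f'[x]                            (* the same derivative via the prime notation *)
Simplify[D[f[x], x] - f'[x]]     (* the difference simplifies to 0 *)
```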



41-2.3. Numerical solutions can be found for the equation g(x) = 0 by using NSolve[g[x]==0, x] and curves plotted
by Plot[ ].

41-2.4. Define the function f[x] = x - Sin[2*x] and a tangent function,

tangent[x, a] = f'[a]*(x - a) + f[a].

41-2.5. Use D[ ].

41-2.6. Use Plot[ ].

41-3.1. Product rule given by D[f[x]*g[x], x].

41-3.2. Stationary values can be found by using FindRoot[ ].

41-3.3. Implicit differentiation can be implemented using Dt[ ] and the explicit derivative obtained by Solve[..., Dt[ ]].
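
The approach in 41-3.3 can be sketched as below. The circle used here is a hypothetical example; the project's own curve is not reproduced in these answers.

```mathematica
(* Hypothetical curve: the circle x^2 + y^2 - 25 == 0 *)
lhs = x^2 + y^2 - 25;

(* Dt[lhs, x] is the total derivative with respect to x;
   y is treated as depending on x and Dt[y, x] stands for dy/dx *)
dlhs = Dt[lhs, x];

(* Solve for the explicit derivative dy/dx, which is -x/y here *)
sol = Solve[dlhs == 0, Dt[y, x]]
```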

41-4.1. Use D[ ] command.

41-4.2. A possible program is included here.

This line defines the function.

f[x_]=0.1*x^5-0.5*x^4+0.2*x^3+x^2-0.7*x+2.2

This draws its graph between x=-2 and x=4.

Plot[f[x],{x,-2,4},AspectRatio->Automatic,PlotRange->
{-2.8,3.5},AxesLabel->{x,y}]

D[ ] determines its derivatives.

D[f[x],x]
D[f[x],{x,2}]

The following commands find the values of x at the four stationary values. Initial
estimates for the FindRoot[ ] commands are estimated from the graph above.
FindRoot[f'[x]==0,{x,-1}]
FindRoot[f'[x]==0,{x,.5}]
FindRoot[f'[x]==0,{x,1.5}]
FindRoot[f'[x]==0,{x,3}]
x1=-0.93972;x2=0.352685;x3=1.27565;x4=3.31138;

The following values of the second derivatives check the second derivative test
for each stationary point.
f''[x1]
f''[x2]
f''[x3]
f''[x4]
Plot[f''[x],{x,-2,4},AspectRatio->Automatic,
PlotRange->{-2.8,3.5}]

The following commands determine the points where f''(x) is zero, the points of
inflection on the curve.
FindRoot[f''[x]==0,{x,-0.5}]
FindRoot[f''[x]==0,{x,1}]
FindRoot[f''[x]==0,{x,2.5}]

41-4.3. Use Plot[ ]

41-4.4. Estimate the location of the root using Plot[ ], define a function

fn[x] = x - f[x]/f'[x],
and then use NestList[ ].
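
A minimal sketch of the Newton iteration described in 41-4.4. The equation, the starting value 2.0 and the name newton are assumptions made for illustration; in practice the starting value would be read off a plot.

```mathematica
(* Hypothetical equation: x^3 - 2 x - 5 == 0, with a starting guess
   of 2.0 assumed to have been read off a plot *)
f[x_] := x^3 - 2 x - 5
newton[x_] := x - f[x]/f'[x]

(* NestList applies the Newton step repeatedly and keeps every iterate *)
iterates = NestList[newton, 2.0, 6]

(* Residual at the last iterate, which should be close to zero *)
f[Last[iterates]]
```

Keeping the whole list of iterates, rather than only the last value, makes it easy to see how quickly the sequence settles.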

41-4.5. A suggested program is included here.

The graphs of y=x+sin(5x) drawn for 0<x<25 in (a) the default Plot[ ] form, (b)
Plot[ ] with 20 plot points, (c) Plot [ ] with 50 plot points. Notice the differences.
curve1=Plot[x+Sin[5*x],{x,0,25},AxesLabel->{x,y}]

curve2=Plot[x+Sin[5*x],{x,0,25},PlotPoints->20,
PlotStyle->RGBColor[0,0,1],AxesLabel->{x,y}]

curve3=Plot[x+Sin[5*x],{x,0,25},PlotPoints->50,
PlotStyle->RGBColor[1,0,0],AxesLabel->{x,y}]

Show[curve1,curve2,curve3]

The default plot misses a beat around x=21, whilst a scheme with 20 plot points
misses the oscillations completely. Plot[ ] with 50 plot points is sufficient to
detect all the oscillations of the functions. The Plot[ ] algorithm can be deceived.

41-5.1. Series[ ] defines the Taylor series up to any required number of terms. Normal[ ] defines the corresponding
Taylor polynomial.

41-5.2. Use Series[ ] and Normal[ ].

41-5.3. A possible program and output is included here.


The function is ((sin x)/x)^2.
In[1]:=
Series[((Sin[x])/x)^2,{x,0,2}]
Out[1]=
1 - x^2/3 + O[x]^3
In[2]:=
f1[x_]=Normal[%]

Out[2]=
1 - x^2/3
In[3]:=
Series[((Sin[x])/x)^2,{x,0,4}]
Out[3]=
1 - x^2/3 + (2 x^4)/45 + O[x]^5
In[4]:=
f2[x_]=Normal[%]
Out[4]=
1 - x^2/3 + (2 x^4)/45
In[5]:=
Series[((Sin[x])/x)^2,{x,0,6}]
Out[5]=
1 - x^2/3 + (2 x^4)/45 - x^6/315 + O[x]^7
In[6]:=
f3[x_]=Normal[%]

Out[6]=
1 - x^2/3 + (2 x^4)/45 - x^6/315
The actual curve is in black and the approximations in various levels of gray.
In[10]:=
Plot[{f1[x],f2[x],f3[x],(Sin[x]/x)^2},{x,0.001,3},
PlotRange->{0,1.1},PlotStyle->{{GrayLevel[.2]},
{GrayLevel[.4]},{GrayLevel[.6]},{GrayLevel[0]}},
AxesLabel->{x,""}]

41-5.4. Use Series[ ] followed by Normal[ ], FindRoot[ ] and Plot[ ].

41-6.1. Use Solve[ ].

41-6.2. Requires package Algebra`ReIm` and command ComplexExpand[ ].

41-6.3. List the roots using Table[ ] and plot them using ListPlot[ ].

41-6.4. Use ListPlot[ ].

41-6.5. Abs[ ] and Arg[] give the modulus and principal argument of a complex number.

41-7.1. Matrices are entered as lists by rows. Transpose and Inverse are given by the commands Transpose[ ] and
Inverse[ ].

41-7.2. Requires Inverse[ ].

41-7.3. Can use either products or MatrixPower[ ] to calculate powers of matrices.
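
The two routes mentioned in 41-7.3 can be compared directly. The matrix below is a hypothetical example, not one from the project.

```mathematica
(* Hypothetical 2 x 2 matrix, entered as a list of rows *)
m = {{1, 1}, {0, 1}};

(* Cube by repeated matrix products (note the dot, not *) *)
byProducts = m . m . m;

(* The same power via MatrixPower *)
byPower = MatrixPower[m, 3];

byProducts == byPower   (* True; both equal {{1, 3}, {0, 1}} *)
```

Writing m*m*m or m^3 would multiply element by element, which is why the dot or MatrixPower[ ] is needed.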



41-8.1. Determinant of a matrix is given by Det[ ].

41-8.2. Use Det[ ] followed by Factor[ ].

41-8.3. Use commands Det[ ] and Solve[ ].

41-9.1. Plot the path using ParametricPlot3D[ ].

41-9.2. Use command ParametricPlot3D[ ].

41-10.1. RotationMatrix[ ] is in the package Geometry`Rotations`.

41-11.1. Requires package LinearAlgebra`CrossProduct`. Use Cross[ ] for vector products and Polygon[ ] to
show the triangle.

41-11.2. A program is given below.


Requires package LinearAlgebra`CrossProduct`.
<<LinearAlgebra`CrossProduct`

a={1,-1,2};b={-1,2,3};c={2,-1,3};d={1,3,-2};

p1=Cross[b,c]+Cross[c,d]+Cross[d,b];
p2=Cross[c,d]+Cross[d,a]+Cross[a,c];
p3=Cross[d,a]+Cross[a,b]+Cross[b,d];
p4=Cross[a,b]+Cross[b,c]+Cross[c,a];

Uses the area formula from the previous program.

area=0.5*(Sqrt[p1.p1]+Sqrt[p2.p2]+Sqrt[p3.p3]+
Sqrt[p4.p4])

N[%]

The following steps define the triangular faces of the tetrahedron.

poly1=Polygon[{b,c,d}];
poly2=Polygon[{a,c,d}];
poly3=Polygon[{a,b,d}];
poly4=Polygon[{a,b,c}];

The next instruction displays the tetrahedron. Try different viewpoints.

Show[Graphics3D[{poly1,poly2,poly3,poly4}],
Axes->True,AxesLabel->{x,y,z},
ViewPoint->{2.100,-2.400,1.500},
AspectRatio->.8]

41-12.1. Can use LinearSolve[ ] to solve linear equations.

41-12.2. RowReduce[ ] reduces the matrix of coefficients to echelon form.

41-12.3. Use RowReduce[ ].
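
A small sketch showing LinearSolve[ ] and RowReduce[ ] side by side. The system below is a hypothetical example chosen to have the unique solution (1, 2, 3).

```mathematica
(* Hypothetical system: x + y + z = 6, 2x - y + z = 3, x + 2y - z = 2 *)
a = {{1, 1, 1}, {2, -1, 1}, {1, 2, -1}};
b = {6, 3, 2};

(* Direct solution of a.x == b *)
LinearSolve[a, b]

(* Row-reduce the augmented matrix [a | b]; when the solution is unique
   the last column of the reduced matrix holds it *)
aug = Join[a, Transpose[{b}], 2];
RowReduce[aug]
```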

41-13.1. Use commands Eigenvalues[ ] and Eigenvectors[ ].

41-13.2. Program is listed below.

a={{1,2,1},{2,1,1},{1,1,2}};

Eigenvalues[a]
f=Eigenvectors[a]

The transpose command creates the matrix C.


c=Transpose[f]

And its inverse.


Inverse[c]

d=DiagonalMatrix[{-1,1,4}]
c.MatrixPower[d,n].Inverse[c]

The next line works out the n-th power of A directly


MatrixPower[a,n]
MatrixPower[a,n]-c.MatrixPower[d,n].Inverse[c]
The last steps check that the two are the same.

41-13.3. Check Inverse[ ] - Transpose[ ].

41-13.4. Use Eigenvalues[ ], Det[ ], IdentityMatrix[ ], and MatrixPower[ ].

41-14.1. Use Plot[ ].

41-14.2. Command Integrate[ ] will perform the necessary integration.

41-15.1. A sample program is given below.

This method simply divides the interval into equal steps and the approximation
is the sum of the products of the step lengths and the function values at the step
points.
f[x_]=x^2;
a=1;b=3;h=(b-a)/100;
The next step works out the sums.
N[Sum[h*f[a+i*h],{i,0,99}]]
The next instruction evaluates the integral. These steps are repeated for each
function except for (d) which does not have an elementary integral.
N[Integrate[f[x],{x,1,3}]]
g[x_]=x*Exp[-x];
b=3;a=0;h=(b-a)/20;
N[Sum[h*g[a+i*h],{i,0,19}]]
N[Integrate[x*Exp[-x],{x,0,3}]]

p[x_]=x^3*Sin[x];
b=Pi;a=0;h=(b-a)/30;
N[Sum[h*p[a+i*h],{i,0,29}]]
N[Integrate[x^3*Sin[x],{x,0,Pi}]]

q[x_]=Cos[Exp[-x]];
b=1;a=0;h=(b-a)/25;
N[Sum[h*q[a+i*h],{i,0,24}]]
Try the sums with different values for h.

41-15.2. Essentially use Integrate[ ], although simplification is possible with commands Simplify[ ], Together[ ],
and ExpandAll[ ].

41-15.3. Use Integrate[ ].

41-15.4. Try Integrate[ ] for a suitable definite integral followed by Limit[ ].

41-15.5. Program and output shown below.



The following diagrams show the sphere and the cylindrical hole.


In[14]:=
s1=ParametricPlot3D[{Cos[t]*Cos[u],Sin[t]*Cos[u],
Sin[u]},{t,0,2*Pi},{u,-.35*Pi,.35*Pi},Boxed->False,
Axes->False,DisplayFunction->Identity]
s2=ParametricPlot3D[{0.454*Sin[t],0.454*Cos[t],u},
{t,0,2*Pi},
{u,-.891,.891},Boxed->False,Axes->False,
DisplayFunction->Identity]
Show[s1,s2,Shading->False,DisplayFunction->
$DisplayFunction]

Explain why the volume is given by the following integral.


In[17]:=
Integrate[Pi*(a^2-x^2-b^2),
{x,-Sqrt[a^2-b^2],Sqrt[a^2-b^2]}]

Out[17]=
(4 (a^2 - b^2)^(3/2) Pi)/3

41-16.1. Requires package Graphics`Graphics` and then PolarPlot[ ] and Integrate[ ].

41-16.2. Possible program shown below.

f[x_]=Sin[x]^2*Exp[-2*x];

b=2;a=0;n=10;

The following steps define the trapezium rule given by Trap1.


h=(b-a)/n;
Trap1=N[h*(0.5*f[a]+Sum[f[a+i*h],{i,1,n-1}]+0.5*f[b])]

Int=Integrate[Sin[x]^2*Exp[-2*x],{x,0,2}]//N

Trap1-Int

The last step checks the difference between the result from the trapezium rule
and the exact integral, which is possible in this example. Try different values for
the number of steps n.

41-16.3. Uses Plot[ ] and Integrate[ ].

41-16.4. Program given below.

A program for Simpson's rule.


f[x_]=Exp[x^2];
b=1;a=0;n=4;
h=(b-a)/n;
Simp=N[h*(f[a]+Sum[(3+(-1)^(i+1))*f[a+i*h],
{i,1,n-1}]+f[b])/3]

41-17.1. Program given below.

The substitutions are x=g(u) and u=j(v).


f[x_]=(x-2)/Sqrt[5+4*x-x^2];
g[u_]=u+2;
x=g[u];
j[v_]=3*Sin[v];
u=j[v];k[v]=f[g[j[v]]]D[g[j[v]],v];Simplify[%]
m[v_]=Integrate[%,v]

The following is by direct integration using the built-in lntegrate[ ].


Clear[x,u,v]
ClearAll[f]
f[x_]=(x-2)/Sqrt[5+4*x-x^2];
Integrate[f[x],x]

41-17.2. Use Integrate[ ].

41-17.3. Use Integrate[ ].

41-17.4. Use Integrate[ ].

41-17.5. Program given below.

f[x_]=Log[x]^6/x^2;

This tests the limit of f(x) as x-> infinity. It is necessary that f(x)-> 0 but not
sufficient for the integral to exist.
Limit[f[x],x->Infinity]
g[a_]=Integrate[Log[x]^6/x^2,{x,1,a}]
N[g[10]]

N[g[20]]
The following graph indicates that f(x) tends to 0 as x-> infinity.
Plot[f[x],{x,1,100}]
D[f[x],x]
NSolve[f'[x]==0,x]

41-18.1. Program and output are given below.



The solution of the equation is given by DSolve[ ].


In[18]:=
sol=DSolve[x'[t]+x[t]==0,x[t],t]

Out[18]=
{{x[t] -> C[1]/E^t}}

The solutions for the different initial values can be put in one Plot[ ] command.
In[19]:=
Plot[{sol[[1,1,2]]/.C[1]->0,sol[[1,1,2]]/.C[1]->1,
sol[[1,1,2]]/.C[1]->2},{t,0,2},AxesLabel->{t,x}];

41-18.2. Use DSolve[ ] to solve the differential equation. Plotting can be achieved by adapting the program in the
previous project.

41-19.1. Program and output shown below.


In[24]:=
sol=DSolve[{2*x''[t]+3*x'[t]+x[t]==Cos[t],
x'[0]==0,x[0]==1},x[t],t]

Out[24]=
{{x[t] -> -1/(2 E^t) + 8/(5 E^(t/2)) + (-Cos[t] + 3 Sin[t])/10}}
In[25]:=
graph=Plot[sol[[1,1,2]],
{t,0,50},AxesLabel->{t,x},
PlotStyle->{Thickness[.006]}];

The forced periodic output is quickly achieved as the transient dies out.

41-19.2. Use DSolve[ ].

41-20.1. Program given below.

This is the solution of the linearised pendulum equation for simple harmonic
motion.
sol=DSolve[{x''[t]+x[t]==0,x[0]==1,
x'[0]==0},x[t],t]

graph1=Plot[sol[[1,1,2]],{t,0,10},PlotStyle->
RGBColor[1,0,0]]

The following equation is the nonlinear pendulum equation solved numerically
for comparison. Both solutions are shown finally together.
appsol=NDSolve[{x''[t]+Sin[x[t]]==0,x[0]==1,
x'[0]==0},x,{t,0,10}]
graph2=Plot[Evaluate[x[t]/.appsol],{t,0,10}]
Show[graph1,graph2]

41-21.1. Program given below.

x1=3/Sqrt[2];y1=3/Sqrt[2];x2=2;y2=0;x3=0;y3=-1;
ArgandPhasor={Thickness[.01],Line[{{0,0},{x1,y1},{x1+x2,
y1+y2},{x1+x2+x3,y1+y2+y3}}]};
Show[Graphics[ArgandPhasor],Axes->True,AspectRatio->
Automatic]
Print[{x1+x2+x3,y1+y2+y3}//N]

41-22.1. Requires package Graphics`PlotField` and then PlotVectorField[ ] and DSolve[ ] for the solutions
of the differential equation.

41-22.2. As in 20.1.

41-22.3. Program listing given below.

f[x_,y_]=x*y^2;
h=.2;
y[0]=1;
x[n_]=n*h;
y[n_]:=y[n]=y[n-1]+h*f[x[n-1],y[n-1]];
(* y[0],...,y[5] are tabulated so that each y value is paired with its own x value *)
yvalue=Table[y[i],{i,0,5}]
points=Table[{x[i-1],yvalue[[i]]},{i,1,6}]
Euler=ListPlot[points]

Euler gives a list of points computed by Euler's method.


Clear[f,x,y]
DSolve[{y'[x]==x*y[x]^2,y[0]==1},y[x],x]

The exact solution of the differential equation is plotted as Exact.


Exact=Plot[1/(1-0.5*x^2),{x,0,1},PlotStyle->
RGBColor[1,0,0]]
Show[Euler,Exact]

41-22.4. Numerical solution of the differential equation is given by NDSolve[] and the plotting by
ParametricPlot[ ].

41-23.1. Sample program shown.

See Example 21.2. The second-order equation is replaced by two first-order
differential equations.
sol1=NDSolve[{x'[t]==y[t],
y'[t]==-2 x[t]^3,x[0]==0.3,y[0]==0},{x,y},{t,0,25}]
d1=ParametricPlot[Evaluate[{x[t],y[t]}/.sol1],{t,0,25},
AspectRatio->Automatic,PlotRange->All,DisplayFunction->
Identity]
sol2=NDSolve[{x'[t]==y[t],
y'[t]==-2 x[t]^3,x[0]==.6,y[0]==0},{x,y},{t,0,20}]
d2=ParametricPlot[Evaluate[{x[t],y[t]}/.sol2],{t,0,20},
AspectRatio->Automatic,PlotRange->All,DisplayFunction->
Identity]
sol3=NDSolve[{x'[t]==y[t],
y'[t]==-2 x[t]^3,x[0]==.9,y[0]==0},{x,y},{t,0,16}]
d3=ParametricPlot[Evaluate[{x[t],y[t]}/.sol3],{t,0,16},
AspectRatio->Automatic,PlotRange->All,DisplayFunction->
Identity]
sol4=NDSolve[{x'[t]==y[t],
y'[t]==-2 x[t]^3,x[0]==1.2,y[0]==0},{x,y},{t,0,12}]
d4=ParametricPlot[Evaluate[{x[t],y[t]}/.sol4],{t,0,12},
AspectRatio->Automatic,PlotRange->All,DisplayFunction->
Identity]

The final Show[ ] displays 4 phase paths.


Show[d1,d2,d3,d4,AxesLabel->{x,y},
DisplayFunction->$DisplayFunction]

41-23.2. See the previous program.


The next eight projects require the package Calculus`LaplaceTransform`.

41-24.1. The transforms are given by LaplaceTransform[ ], but use Integrate[ ] instead for the finite transform.

41-24.2. Inverse transforms are given by InverseLaplaceTransform[ ].
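
The two hints above might be tried out as follows. This is only a sketch: it assumes a recent Wolfram Language version in which these functions are built in (so no package is loaded), and the sample functions are hypothetical, not taken from the projects.

```mathematica
(* Assumes a recent version where these functions are built in; no package load. *)
LaplaceTransform[t*Exp[-2*t], t, s]         (* transform of a sample function *)
InverseLaplaceTransform[1/(s^2 + 4), s, t]  (* inverse of a sample transform *)
(* Finite transform of the hypothetical f(t) = t on 0 <= t <= 1: integrate directly. *)
Integrate[Exp[-s*t]*t, {t, 0, 1}]
```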

41-24.3. Sample program below.

Requires package Calculus`LaplaceTransform`.


<<Calculus`LaplaceTransform`
Solution of second-order equation.
Clear[a,w,s,t,x]

lap=LaplaceTransform[x''[t]+2*x'[t]+x[t]-
a*Cos[w*t],t,s]

lap1=lap/.{LaplaceTransform[x[t],t,s]->lapx,
x[0]->0,x'[0]->0}
lap3=Solve[lap1==0,lapx]

solution=InverseLaplaceTransform[lap3[[1,1,2]],s,t]
a=1;w=1;

Inputs and outputs are shown in the diagram.


Plot[{solution,Cos[t]},{t,0,30},AxesLabel->{t,x},
PlotRange->{-1.5,1.5},PlotPoints->50,
PlotStyle->{RGBColor[1,0,0],RGBColor[0,0,1] }]
Clear[a,w]

41-24.4. Use InverseLaplaceTransform[ ].

41-24.5. Use LaplaceTransform[ ].

41-25.1. Additionally requires package Calculus`DiracDelta`.

41-25.2. Additionally requires package Calculus`DiracDelta`.

41-25.3. Define convolution as Integrate[f[t - u]*g[u], {u, 0, t}], and then apply LaplaceTransform[ ] to the
convolution.
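
A possible check of the convolution theorem along these lines is sketched below. The functions f and g are hypothetical samples, and a recent version with LaplaceTransform[ ] built in is assumed.

```mathematica
f[t_] := Exp[-t]; g[t_] := Sin[t];            (* hypothetical sample functions *)
conv = Integrate[f[t - u]*g[u], {u, 0, t}];   (* convolution of f and g *)
lhs = LaplaceTransform[conv, t, s];
rhs = LaplaceTransform[f[t], t, s]*LaplaceTransform[g[t], t, s];
Simplify[lhs - rhs]   (* the convolution theorem predicts 0 *)
```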

41-25.4. Use Solve[ ] to find the roots of the denominator. (Generally if numerical solutions are required use
NSolve[ ].)
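
For instance, with a hypothetical cubic denominator (not the one in the project), the exact and numerical roots could be compared like this:

```mathematica
den = s^3 + 2*s^2 + 2*s + 1;   (* hypothetical denominator *)
Solve[den == 0, s]             (* exact roots *)
NSolve[den == 0, s]            (* numerical roots *)
```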

41-26.1. Uses Plot[ ], Integrate[ ], Sum[ ], and Show[ ].

41-26.2. See the previous project.

41-26.3. See the following program.

Requires package Calculus`FourierTransform`.


<<Calculus`FourierTransform`

The function is even. Hence only Fourier cosine coefficients are present.
Remove[fs]
fs=FourierCosSeriesCoefficient[x^6-5*Pi^2*x^4+
7*Pi^4*x^2,{x,-Pi,Pi},n]
rules1={Cos[Pi n_]->(-1)^n,Sin[Pi n_]->0}
fs=Simplify[fs/.rules1]
a0=FourierCosSeriesCoefficient[x^6-5*Pi^2*x^4+
7*Pi^4*x^2,{x,-Pi,Pi},0]

The sum of the series can be obtained by putting x=0 in the Fourier series: the
answer is 31 Pi^6/30240.

41-26.4. Requires the packages Calculus`FourierTransform` and Calculus`DiracDelta`. Use UnitStep[ ] for functions
with steps.

41-26.5. See previous project.

41-27.1. Program and partial output shown below.

Requires package Graphics`ParametricPlot3D`.


<<Graphics`ParametricPlot3D`

This plots the surface.


In[29]:=
ParametricPlot3D[{r*Cos[u],r*Sin[u],r^2*Cos[2*u]},
{r,0,1},{u,0,2*Pi},AxesLabel->{x,y,z},
PlotPoints->{15,25},Shading->False]

Out[29]=
-Graphics3D-
This plots the contour curves on the (x,y) plane.
ContourPlot[x^2-y^2,{x,-1,1},{y,-1,1},
ContourSmoothing->Automatic,ColorFunction->Hue,
FrameLabel->{x,y}]

41-27.2. Requires package Graphics`ParametricPlot3D` and then the commands ParametricPlot3D[ ] for the
surface, and ContourPlot[ ] for the contours of the surface.

41-27.3. Use D[] for partial derivatives.
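
A brief sketch of the D[ ] syntax for first, mixed and repeated partial derivatives; the function below is a hypothetical example, not the one in the project.

```mathematica
f[x_, y_] := x^2*Sin[y];   (* hypothetical function of two variables *)
D[f[x, y], x]              (* first partial derivative with respect to x *)
D[f[x, y], y]              (* first partial derivative with respect to y *)
D[f[x, y], x, y]           (* mixed second derivative *)
D[f[x, y], {x, 2}]         (* second derivative with respect to x *)
```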

41-27.4. Program given below.

ClearAll[f,fx,fy,x,y]

The following defines a function of two variables.


f[x_,y_]:=Sin[x*y]

surface=Plot3D[f[x,y],{x,-Pi,Pi},{y,-Pi/2,Pi/2}]
The partial derivatives follow.
D[f[x,y],x]
fx[x_,y_]:=y*Cos[x*y]
D[f[x,y],y]
fy[x_,y_]:=x*Cos[x*y]

tangentfunction=f[Pi/4,1]+fx[Pi/4,1]*(x-Pi/4)+
fy[Pi/4,1]*(y-1)

z=tangentfunction is the equation of the tangent plane.


tangent=Plot3D[tangentfunction,{x,-Pi,Pi},
{y,-Pi/2,Pi/2}]

contact={Thickness[.02],
Line[{{Pi/4,1,0},{Pi/4,1,2}}]};

Vertical line indicates the point of tangency. The surface and the tangent plane
are put on the same figure. Try other tangent planes and viewpoints.
Show[surface,tangent,Graphics3D[contact],
AxesLabel->{x,y,z},BoxRatios->{1,1,.8},
ViewPoint->{1.990,-2.400,1.750}]
41-27.5. Program listing given below.
ClearAll[f]
f[x_,y_]:=.3*x^3+.2*y^2-x^2*y-x*y+x+2*y
fx=D[f[x,y],x]
fy=D[f[x,y],y]
The stationary values are found numerically by NSolve[ ].
NSolve[{fx==0,fy==0},{x,y}]
The following gives a contour plot of the surface showing all the stationary
points.
ContourPlot[f[x,y],{x,-3,3},{y,-9,3},Contours->30,
ContourSmoothing->Automatic,
FrameLabel->{x,y},PlotPoints->20]
The next cell checks formula (25.9) for the first stationary point.
(D[f[x,y],x,x]*D[f[x,y],y,y]-D[f[x,y],x,y]^2)/.
{x->-0.620464,y->-5.58872}
D[f[x,y],x,x]/.{x->-0.620464,y->-5.58872}
D[f[x,y],y,y]/.{x->-0.620464,y->-5.58872}
(-0.620464,-5.58872) is a minimum.
Apply the formula to the other two points.
41-27.6. Program listing given below.
The command Fit[ ] does the least-squares fit for you.
Fit[{{0,1.1},{1,2},{2,2.9},{3,3.9},{4,5},{5,5.1}},
{1,x},x]
The following plots the straight line.
leastsquares=Plot[%,{x,-0.5,5.5}]
Data={{0,1.1},{1,2},{2,2.9},{3,3.9},{4,5}, {5,5.1}}
dataplot=ListPlot[Data,AxesOrigin->{0,0},
PlotRange->{-.5,7.5},PlotStyle->PointSize[0.01]]
Show[ ] displays both the data and the straight line fit to the data.
Show[dataplot,leastsquares]

41-28.1. Program listing given below.


Curves given by f(x,y)=c.
f[x_,y_]:=y*Exp[-x]
D[y[x]*Exp[-x],x]
diff=Solve[%==0,y'[x]]
diff[[1,1,2]]
orthogonal=1/diff[[1,1,2]]
Differential equation of the orthogonal trajectories.

DSolve[y'[x]==-orthogonal,y[x],x]
'Curves' are solutions of the differential equation, and 'trajectories' are their
orthogonal trajectories.
curves=ContourPlot[f[x,y],{x,-2,2},{y,-2,2},
ContourShading->False,
ContourSmoothing->Automatic,Contours->11,
PlotPoints->25]
trajectories=ContourPlot[y^2+2*x,{x,-2,2},{y,-2,2},
ContourShading->False,ContourSmoothing->Automatic,
PlotPoints->25]
Show[curves,trajectories]
41-29.1. Program listing given below.
Requires package Graphics`ImplicitPlot`.
The problem is to find where f(x,y) is stationary subject to the condition g(x,y)=0.
<<Graphics`ImplicitPlot`
ClearAll[f,g]
f[x_,y_]:=x^3-2*x*y-x+3*y^2
g[x_,y_]:=x^2+2*y^2-1
lagrange=ContourPlot[f[x,y],{x,-1.5,1.5},
{y,-1.5,1.5},
ContourShading->False,Contours->15,
ContourSmoothing->
Automatic,PlotPoints->25,AxesLabel->{x,y}]
constraint=ImplicitPlot[g[x,y]==0,{x,-1.5,1.5},
{y,-1.5,1.5},PlotStyle->RGBColor[1,0,0],
AxesLabel->{x,y}]
The following figure shows the intersection of the contours of f(x,y) and the
curve g(x,y)=0. Tangencies between them indicate approximate stationary
values which are then used to find numerical locations using a root-finding
routine. There are four stationary values.
Show[lagrange,constraint]
D[f[x,y]-p*g[x,y],x]
D[f[x,y]-p*g[x,y],y]
The approximations for x and y are taken from the figure above.
FindRoot[{-1-2*p*x+3*x*x-2*y==0,
-2*x+6*y-4*p*y==0,x^2+2*y^2==1},{x,.2},{y,-.6},
{p,1}]
FindRoot[{-1-2*p*x+3*x*x-2*y==0,
-2*x+6*y-4*p*y==0,x^2+2*y^2==1},{x,1},{y,0},{p,1},
MaxIterations->400]
FindRoot[{-1-2*p*x+3*x*x-2*y==0,
-2*x+6*y-4*p*y==0,x^2+2*y^2==1},{x,-1.3},{y,-.3},
{p,1}]
FindRoot[{-1-2*p*x+3*x*x-2*y==0,
-2*x+6*y-4*p*y==0,x^2+2*y^2==1},{x,0.5},{y,-0.8},
{p,1.4}]

41-30.1. Requires package Calculus`VectorAnalysis` and then Grad[ ].
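
In recent Wolfram Language versions Grad[ ] is built in and takes the list of variables explicitly, so a gradient can be sketched without loading a package. The scalar field below is a hypothetical example.

```mathematica
(* Assumes a recent version with Grad[ ] built in; the field is hypothetical. *)
Grad[x^2*y + y*z^3, {x, y, z}]   (* {2 x y, x^2 + z^3, 3 y z^2} *)
```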

41-30.2. Program and output displayed.

This diagram shows the intersection of the plane and the cylinder in Example
30.9.
In[1]:=
cylinder=ParametricPlot3D[{Sin[t],Cos[t],u},
{t,0,2*Pi},{u,-2,3},AxesLabel->{x,y,z},
DisplayFunction->Identity]
In[2]:=
plane=ParametricPlot3D[{u*Sin[t],u*Cos[t],
1-u*Sin[t]-u*Cos[t]},{t,0,2*Pi},{u,-2,2},
AxesLabel->{x,y,z},DisplayFunction->Identity]
In[3]:=
Show[cylinder,plane,ViewPoint->{1.929,-1.306,1.553},
DisplayFunction->$DisplayFunction,Shading->False]

41-30.3. Requires package Graphics`ImplicitPlot`.

41-31.1. Use Integrate[ ] and Plot3D[ ].

41-32.1. Program and output displayed.

In[17]:=
ClearAll[x,y,z,r,t,f]
In[18]:=
x[t_]:=t;y[t_]=t;z[t_]=t;
f[x_,y_,z_]:={x*y,y*z,x*(z-y)}
f[x[t],y[t],z[t]]
r[x_,y_,z_]:={x,y,z}

Out[20]=
  2   2
{t , t , 0}
First line integral
Integrate[f[x[t],y[t],z[t]].D[r[x[t],y[t],z[t]],t],
{t,0,1}]
Out[22]=
2
-
3
In[23]:=
ClearAll[x,y,z,f,r]
In[24]:=
x[t_]:=t^2;y[t_]=t^3;z[t_]=t^4;
f[x_,y_,z_]:={x*y,y*z,x*(z-y)}
f[x[t],y[t],z[t]]
r[x_,y_,z_]:={x,y,z}
Out[26]=
  5   7   2    3    4
{t , t , t  (-t  + t )}
Second line integral
Integrate[f[x[t],y[t],z[t]].D[r[x[t],y[t],z[t]],t],
{t,0,1}]
Out[28]=
341
---
630

The two paths of integration are shown below.


In[29]:=
ParametricPlot3D[{{t,t,t},{t^2,t^3,t^4}},{t,0,1},
AxesLabel->{x,y,z}]

41-33.1. Requires the command ParametricPlot3D[ ].

41-33.2. Use the package Calculus`VectorAnalysis`.

41-33.3. See previous project.

41-34.1. Use Intersection[ ] and Union[ ]. The command Length[ ] gives the number of elements in a set.

41-34.2. Use Table[ ] to generate the sets, and then the distributive law on Intersection[ ] and Union[ ].
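
The two hints above can be sketched together. The three sets are hypothetical examples generated with Table[ ], not the sets specified in the projects.

```mathematica
a = Table[2*n, {n, 1, 10}];   (* hypothetical sets: multiples of 2, 3 and 5 *)
b = Table[3*n, {n, 1, 10}];
c = Table[5*n, {n, 1, 6}];
Length[Intersection[a, b]]    (* number of common elements: 3, namely 6, 12, 18 *)
(* Distributive law: a n (b u c) == (a n b) u (a n c); returns True *)
Intersection[a, Union[b, c]] === Union[Intersection[a, b], Intersection[a, c]]
```

Both Union[ ] and Intersection[ ] return sorted lists without duplicates, which is why the two sides can be compared directly with ===.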

41-36.1. Requires the package DiscreteMath`Combinatorica`. The tests are EulerianQ[ ] and HamiltonianQ[ ].

41-36.2. Requires the package DiscreteMath`Combinatorica`. The test for planarity is PlanarQ[ ].

The remaining projects require the package DiscreteMath`RSolve`.

41-37.1. Use RSolve[ ] to solve the difference equation.

41-37.2. Use RSolve[ ].

41-37.3. Use RSolve[ ].
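
As an illustration of the RSolve[ ] syntax, the sketch below solves the difference equation u(n+1) = -k u(n) + k with k = 1/2 (the equation used in Project 37.4), with the starting value u(0) = 3/4 assumed here. It also assumes a recent version in which RSolve[ ] is built in.

```mathematica
(* Assumes RSolve[ ] is built in; u(0) = 3/4 is an assumed starting value. *)
RSolve[{u[n + 1] == -u[n]/2 + 1/2, u[0] == 3/4}, u[n], n]
```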

41-37.4. A sample program and output is shown below.

The difference equation is u(n+1)=-ku(n)+k with k=1/2.


In[30]:=
k:=1/2
In[31]:=
f[x_]:=-k*x+k
In[32]:=
x:=3/4
In[36]:=
cobweb1=Table[Line[{{Nest[f,x,n],Nest[f,x,n]},
{Nest[f,x,n],Nest[f,x,n+1]},
{Nest[f,x,n+1],Nest[f,x,n+1]}}],{n,4}];
In[34]:=
diff1=Plot[-k*x+k,{x,0,1},DisplayFunction->Identity]

Show[diff1,
Graphics[Line[{{x,f[x]},{f[x],f[x]}}]],
Graphics[Line[{{0,0},{0.75,0.75}}]],
Graphics[cobweb1],
AspectRatio->Automatic,
DisplayFunction->$DisplayFunction]

This solution is stable. Find other cobwebs for k=3/2 and k=1.

41-37.5. See 34.4 for the programming method.

41-37.6. Use Nest[ ] and NestList[ ] in a Do[ ] loop.
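
A small sketch of iterating a map with these commands. The map is hypothetical here (it is borrowed from Project 37.4 with k = 1/2), as is the starting value 3/4.

```mathematica
f[x_] := -x/2 + 1/2;                    (* hypothetical map, as in Project 37.4 *)
NestList[f, 3/4, 5]                     (* {3/4, 1/8, 7/16, 9/32, 23/64, 41/128} *)
Do[Print[Nest[f, 3/4, n]], {n, 0, 5}]   (* the same iterates, printed one per line *)
```

NestList[ ] returns all the iterates at once, whereas Nest[ ] inside a Do[ ] loop recomputes each one from the starting value.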

41-38.1. The BarChart[ ] command can be found in the package Graphics'Graphics'.

41-39.1. See Project 38.1.

41-39.2. Requires the package Statistics`NormalDistribution`, in which can be found PDF[ ] and CDF[ ].
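
In recent Wolfram Language versions NormalDistribution[ ], PDF[ ] and CDF[ ] are built in, so a sketch for the standardized normal distribution needs no package (this is an assumption about the reader's version):

```mathematica
dist = NormalDistribution[0, 1];   (* standardized normal distribution *)
PDF[dist, x]                       (* density as a formula in x *)
CDF[dist, 1.]                      (* numerical value of the cdf at x = 1 *)
Plot[{PDF[dist, x], CDF[dist, x]}, {x, -3, 3}]
```

The value of CDF[dist, 1.] is about 0.8413, the entry for x = 1.00 in the normal distribution table of Appendix H.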

41-39.3. Helpful to use Frequencies[ ] from the package Statistics`DataManipulation`. BarChart[ ] can be found
in Graphics`Graphics`.

41-40.1. Median[ ] can be found in the package Statistics`DescriptiveStatistics`.

41-40.2. See Project 38.1.

41-40.3. Use the package Statistics`DescriptiveStatistics`.

41-40.4. Use the packages as in Project 39.3.


Appendices

Appendix A Some algebraical rules

(a) Index laws for real numbers


(i) $a^0 = 1$.
(ii) $a^p a^q = a^{p+q}$.
(iii) $a^{-p} = 1/a^p$.
(iv) $(a^p)^q$ or $(a^q)^p = a^{pq}$ (so $a^{p/q} = (a^p)^{1/q}$ or $(a^{1/q})^p$).
For example, $a^{1/2} = \sqrt{a}$ because $(a^{1/2})^2 = a$ and $(\sqrt{a})^2 = a$.
Conventionally, $a^{1/2}$ and $\sqrt{a}$ represent the positive root when we are
talking about real numbers (for complex numbers, see Chapter 6).
It may be necessary to restrict $a$ to be positive so that $a^{p/q}$ is a real
number. For example, $(-8)^{1/2}$ or $\sqrt{-8}$ is not real: there is no real
number whose square is equal to $-8$. But $(-8)^{1/3}$ or $\sqrt[3]{-8} = -2$.

(b) Quadratic equations


(i) $ax^2 + bx + c = 0$ has the solutions

$x_1, x_2 = [-b \pm \sqrt{b^2 - 4ac}]/2a.$

(ii) In terms of $x_1$ and $x_2$, the factors are

$ax^2 + bx + c = a(x - x_1)(x - x_2).$

(iii) Sum and product of solutions:

$x_1 + x_2 = -b/a, \quad x_1 x_2 = c/a.$
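
As a quick illustration of (i) to (iii), with coefficients chosen here purely for the example ($a = 2$, $b = -3$, $c = -2$):

```latex
2x^2 - 3x - 2 = 0:\qquad x_1, x_2 = \frac{3 \pm \sqrt{9 + 16}}{4} = 2,\ -\tfrac12 ;
\qquad 2x^2 - 3x - 2 = 2(x - 2)(x + \tfrac12);
\qquad x_1 + x_2 = \tfrac32 = -\frac{b}{a},\quad x_1 x_2 = -1 = \frac{c}{a}.
```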

(c) Binomial theorem


(i) If $n$ is a positive integer (or whole number)

$(a + b)^n = a^n + n a^{n-1} b + \frac{n(n-1)}{2!} a^{n-2} b^2 + \frac{n(n-1)(n-2)}{3!} a^{n-3} b^3 + \cdots + b^n = \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^r,$

where the binomial coefficient

$\binom{n}{r} = \frac{n!}{(n-r)!\,r!}.$

There are $(n + 1)$ terms in this sum, and it is symmetrical in $a$ and $b$.

An important special case is

$(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!} x^2 + \frac{n(n-1)(n-2)}{3!} x^3 + \cdots + x^n.$

(ii) Pascal's triangle. Each entry (apart from the ones) is the sum of
two entries in the previous row - the one above, and the one above and to the
left (for example, the 6 in row n = 4 is the sum of the two 3s in row n = 3):

n = 1   1  1
n = 2   1  2  1
n = 3   1  3  3  1
n = 4   1  4  6  4  1
and so on. Thus

$(1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4.$

(iii) Permutations and combinations

${}_nP_r = \frac{n!}{(n-r)!}, \quad {}_nC_r = \frac{n!}{(n-r)!\,r!}$ (see Section 38.4).

(d) Factorization
$a^2 - b^2 = (a + b)(a - b),$
$a^3 - b^3 = (a - b)(a^2 + ab + b^2),$
$a^3 + b^3 = (a + b)(a^2 - ab + b^2).$

(e) Constants
$e = 2.71828182\ldots$, $\pi = 3.14159265\ldots$
1 radian = 57.29578...°, 1° = 0.01745... radian,
360° = $2\pi$ radians.

(f) Sums of powers of integers


$\sum_{r=1}^{n} r = 1 + 2 + 3 + \cdots + n = \tfrac12 n(n + 1)$

$\sum_{r=1}^{n} r^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \tfrac16 n(n + 1)(2n + 1)$

$\sum_{r=1}^{n} r^3 = 1^3 + 2^3 + 3^3 + \cdots + n^3 = \tfrac14 n^2(n + 1)^2.$
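
A quick numerical check of the three formulae, taking $n = 4$ as an arbitrary example:

```latex
1 + 2 + 3 + 4 = 10 = \tfrac12 \cdot 4 \cdot 5, \qquad
1 + 4 + 9 + 16 = 30 = \tfrac16 \cdot 4 \cdot 5 \cdot 9, \qquad
1 + 8 + 27 + 64 = 100 = \tfrac14 \cdot 4^2 \cdot 5^2 .
```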

Appendix B Trigonometric formulae

(a) Relation between trigonometric functions


$\sin^2 A + \cos^2 A = 1,$
$\tan A = \sin A/\cos A;\quad \sec A = 1/\cos A;\quad \operatorname{cosec} A = 1/\sin A.$

(b) Addition formulae


$\sin(A \pm B) = \sin A \cos B \pm \cos A \sin B,$
$\cos(A \pm B) = \cos A \cos B \mp \sin A \sin B,$
$\tan(A \pm B) = (\tan A \pm \tan B)/(1 \mp \tan A \tan B).$

(c) Addition formulae: special cases


$\sin 2A = 2 \sin A \cos A,$
$\cos 2A = \cos^2 A - \sin^2 A = 2\cos^2 A - 1 = 1 - 2\sin^2 A,$
$\tan 2A = 2\tan A/(1 - \tan^2 A),$
$\sin 3A = 3\sin A - 4\sin^3 A,$
$\cos 3A = 4\cos^3 A - 3\cos A.$

(d) Product formulae


$\sin A \sin B = \tfrac12[\cos(A - B) - \cos(A + B)],$
$\cos A \cos B = \tfrac12[\cos(A - B) + \cos(A + B)],$
$\sin A \cos B = \tfrac12[\sin(A - B) + \sin(A + B)].$
$\sin C + \sin D = 2 \sin\tfrac12(C + D) \cos\tfrac12(C - D),$
$\sin C - \sin D = 2 \sin\tfrac12(C - D) \cos\tfrac12(C + D),$
$\cos C + \cos D = 2 \cos\tfrac12(C + D) \cos\tfrac12(C - D),$
$\cos C - \cos D = -2 \sin\tfrac12(C + D) \sin\tfrac12(C - D).$

(e) Product formulae: special cases


$\sin^2 A = \tfrac12(1 - \cos 2A),$
$\cos^2 A = \tfrac12(1 + \cos 2A),$
$\sin^3 A = \tfrac14(3\sin A - \sin 3A),$
$\cos^3 A = \tfrac14(3\cos A + \cos 3A).$

(f) Triangle formulas


(i) $\alpha + \beta + \gamma = 180^\circ$.
(ii) Cosine rule: $a^2 = b^2 + c^2 - 2bc\cos\alpha$.
(iii) Sine rule: $\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c}$.

(g) Trigonometric equations


In the following, $n$ represents any integer (i.e. any whole number,
positive or negative); $x$ is in radians.
(i) $\sin x = 0$ and $\tan x = 0$ when $x = n\pi$; $\cos x = 0$ when
$x = \tfrac12\pi + n\pi$.
(ii) The following formulae show how to obtain all the solutions
of certain equations when one solution has been obtained (for
example, a hand calculator or a computer gives only one solution
of $\sin x = -\tfrac12$, namely $x = \arcsin(-\tfrac12) = -0.5236\ldots$).

If $\sin\alpha = c$, then all the solutions of $\sin x = c$ are

$x = n\pi + (-1)^n \alpha.$

If $\cos\beta = c$, then all the solutions of $\cos x = c$ are

$x = 2n\pi \pm \beta.$

If $\tan\gamma = c$, then all the solutions of $\tan x = c$ are

$x = n\pi + \gamma.$

The equation $\tan x = c$ occurs rather frequently: notice particularly
that, if $\gamma$ is a solution, then so are $\gamma \pm \pi$.
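
For the example mentioned in (ii), $\sin x = -\tfrac12$ with the calculator value $\alpha = -\pi/6$, the first of these formulae generates the other solutions as follows (a worked sketch for a few values of $n$):

```latex
x = n\pi + (-1)^n\left(-\tfrac{\pi}{6}\right):\qquad
n = 0:\ x = -\tfrac{\pi}{6}; \qquad
n = 1:\ x = \pi + \tfrac{\pi}{6} = \tfrac{7\pi}{6}; \qquad
n = 2:\ x = 2\pi - \tfrac{\pi}{6} = \tfrac{11\pi}{6}.
```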

(h) Hyperbolic functions


$\cosh x = \tfrac12(e^x + e^{-x});\quad \sinh x = \tfrac12(e^x - e^{-x});\quad \tanh x = \sinh x/\cosh x;$
$\operatorname{sech} x = 1/\cosh x;\quad \coth x = \cosh x/\sinh x;\quad \operatorname{cosech} x = 1/\sinh x,$
$\sinh(x \pm y) = \sinh x \cosh y \pm \cosh x \sinh y,$
$\cosh(x \pm y) = \cosh x \cosh y \pm \sinh x \sinh y,$
$\cosh^2 x - \sinh^2 x = 1,\quad \sinh 2x = 2 \sinh x \cosh x,$
$\cosh 2x = \cosh^2 x + \sinh^2 x,$
cosh jx = cos x; sinh jx = j sin x (Section 6.6).

Appendix C Areas and volumes

(a) The area of a triangle is $\tfrac12 bh$, where $b$ is the length of one side
and $h$ its height from that side.
(b) The circumference of a circle is $2\pi r$, where $r$ is its radius.
(c) The area of a circle is $\pi r^2$, where $r$ is its radius.
(d) The area of a circle sector is $\tfrac12 r^2\theta$, where $r$ is its radius and $\theta$
the angle of the sector in radians.
(e) The volume of a sphere is $\tfrac43\pi r^3$, where $r$ is its radius.
(f) The surface area of a sphere is $4\pi r^2$, where $r$ is its radius.
(g) The volume of a cone is $\tfrac13 Ah$, where $h$ is its height and $A$ the
cross-sectional area of its base.
(h) The area of an ellipse is $\pi ab$, where $a$ and $b$ are the lengths of
its semi-axes.

Appendix D A table of derivatives

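$y$                              $dy/dx$

$c$ (constant)                   $0$
$x^n$ ($n$ any constant)         $n x^{n-1}$
$e^{ax}$                         $a e^{ax}$
$k^x$ ($k > 0$)                  $k^x \ln k$
$\ln x$ ($x > 0$)                $x^{-1}$
$\sin ax$                        $a \cos ax$
$\cos ax$                        $-a \sin ax$
$\tan ax$                        $a/\cos^2 ax$
$\cot ax$                        $-a/\sin^2 ax$
$\sec ax$                        $(a \sin ax)/\cos^2 ax$
$\operatorname{cosec} ax$        $-(a \cos ax)/\sin^2 ax$
$\arcsin ax$                     $a/(1 - a^2x^2)^{1/2}$
$\arccos ax$                     $-a/(1 - a^2x^2)^{1/2}$
$\arctan ax$                     $a/(1 + a^2x^2)$
$\sinh ax$                       $a \cosh ax$
$\cosh ax$                       $a \sinh ax$
$\tanh ax$                       $a/\cosh^2 ax$
$\sinh^{-1} ax$                  $a/(1 + a^2x^2)^{1/2}$
$\cosh^{-1} ax$                  $a/(a^2x^2 - 1)^{1/2}$
$\tanh^{-1} ax$                  $a/(1 - a^2x^2)$
$u(x)v(x)$                       $u\,\frac{dv}{dx} + v\,\frac{du}{dx}$
$u(x)/v(x)$                      $\frac{1}{v^2}\left(v\,\frac{du}{dx} - u\,\frac{dv}{dx}\right)$
$1/v(x)$                         $-\frac{1}{v^2}\,\frac{dv}{dx}$
$y(u(x))$                        $\frac{dy}{du}\,\frac{du}{dx}$
$y(v(u(x)))$                     $\frac{dy}{dv}\,\frac{dv}{du}\,\frac{du}{dx}$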
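
As a short illustration of how the table entries combine, the product rule together with the entries for $x^n$ and $\sin ax$ gives, for the sample function $y = x^2 \sin 3x$ (chosen here only as an example):

```latex
\frac{d}{dx}\left(x^2 \sin 3x\right)
  = x^2 \,\frac{d}{dx}(\sin 3x) + \sin 3x \,\frac{d}{dx}(x^2)
  = 3x^2 \cos 3x + 2x \sin 3x .
```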

Appendix E A table of integrals

$f(x)$                           $\int f(x)\,dx$ ($C$ is an arbitrary constant.)

$x^m$ ($m \ne -1$)               $\frac{1}{m+1}\,x^{m+1} + C$
$x^{-1}$                         $\ln|x| + C$, or $\ln|Cx|$
$e^{ax}$                         $(1/a)\,e^{ax} + C$
$k^x$ ($k > 0$)                  $k^x/\ln k + C$
$\ln x$ ($x > 0$)                $x \ln x - x + C$
$\sin ax$                        $-(1/a) \cos ax + C$
$\cos ax$                        $(1/a) \sin ax + C$
$\tan ax$                        $-(1/a) \ln|\cos ax| + C$ or $-(1/a) \ln|C \cos ax|$
$\cot ax$                        $(1/a) \ln|\sin ax| + C$ or $(1/a) \ln|C \sin ax|$
$\sec ax$                        $-(1/2a) \ln[(1 - \sin ax)/(1 + \sin ax)] + C$
$\operatorname{cosec} ax$        $(1/2a) \ln[(1 - \cos ax)/(1 + \cos ax)] + C$
$1/(x^2 + a^2)$                  $(1/a) \arctan(x/a) + C$
$1/(x^2 - a^2)$                  $(1/2a) \ln|(x - a)/(x + a)| + C$ or $-(1/a) \tanh^{-1}(x/a) + C$
$1/(a^2 - x^2)^{1/2}$            $\arcsin(x/a) + C$ (or $-\arccos(x/a) + C$)
$1/(a^2 + x^2)^{1/2}$            $\sinh^{-1}(x/a) + C$ or $\ln[x + (x^2 + a^2)^{1/2}] + C$
$1/(x^2 - a^2)^{1/2}$            $\ln[x + (x^2 - a^2)^{1/2}] + C$
$x e^{ax}$                       $(1/a^2)(ax - 1)\,e^{ax} + C$
$x \cos ax$                      $(1/a^2)(\cos ax + ax \sin ax) + C$
$x \sin ax$                      $(1/a^2)(\sin ax - ax \cos ax) + C$
$x \ln x$                        $\tfrac12 x^2 \ln x - \tfrac14 x^2 + C$
$e^{ax} \cos bx$                 $[1/(a^2 + b^2)]\,e^{ax}(a \cos bx + b \sin bx) + C$
$e^{ax} \sin bx$                 $[1/(a^2 + b^2)]\,e^{ax}(-b \cos bx + a \sin bx) + C$
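
Any entry in the table can be checked by differentiating it. The sketch below does this in Mathematica for two of the entries, leaving a and b symbolic; the variable names entry and entry2 are introduced here only for illustration.

```mathematica
(* Differentiate a table entry and subtract the integrand; 0 confirms the entry. *)
entry = (1/(a^2 + b^2))*Exp[a*x]*(a*Cos[b*x] + b*Sin[b*x]);
Simplify[D[entry, x] - Exp[a*x]*Cos[b*x]]
entry2 = (1/a^2)*(Sin[a*x] - a*x*Cos[a*x]);
Simplify[D[entry2, x] - x*Sin[a*x]]
```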

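Appendix F A table of Laplace transforms, inverses and general rules

In the following table, n and m represent a positive integer or zero.
The constants k and c are arbitrary unless otherwise indicated.

Transforms

$f(t)$                           $F(s) = \int_0^\infty e^{-st} f(t)\,dt$

$t^n$                            $n!/s^{n+1}$
$e^{kt}$                         $1/(s - k)$
$t^n e^{kt}$                     $n!/(s - k)^{n+1}$
$\cos kt$                        $s/(s^2 + k^2)$
$\sin kt$                        $k/(s^2 + k^2)$
$t \cos kt$                      $(s^2 - k^2)/(s^2 + k^2)^2$
$t \sin kt$                      $2ks/(s^2 + k^2)^2$
$H(t - c)$ ($c > 0$)             $e^{-cs}/s$
$\delta(t - c)$ ($c > 0$)        $e^{-cs}$

Inverses

$F(s)$                           $f(t)$

$1/s^m$                          $t^{m-1}/(m - 1)!$
$1/(s - k)$                      $e^{kt}$
$1/(s - k)^m$                    $t^{m-1} e^{kt}/(m - 1)!$
$s/(s^2 + k^2)$                  $\cos kt$
$1/(s^2 + k^2)$                  $(\sin kt)/k$
$(s^2 - k^2)/(s^2 + k^2)^2$      $t \cos kt$
$s/(s^2 + k^2)^2$                $(t \sin kt)/2k$
$e^{-cs}/s$ ($c > 0$)            $H(t - c)$
$e^{-cs}$ ($c > 0$)              $\delta(t - c)$

Summary of rules. In the following rules, $F(s) \leftrightarrow f(t)$.

Scale rule (24.5): $f(kt) \leftrightarrow \frac{1}{k} F\!\left(\frac{s}{k}\right)$ and $F(ks) \leftrightarrow \frac{1}{k} f\!\left(\frac{t}{k}\right)$ $(k > 0)$.

Shift rule, or multiplication by $e^{kt}$ (24.7): If $k$ is any constant, $e^{kt} f(t) \leftrightarrow F(s - k)$.

Powers of $t$ (24.8): If $n$ is a positive integer, then $t^n f(t) \leftrightarrow (-1)^n \frac{d^n F(s)}{ds^n}$.

Derivatives (24.12): $\frac{df(t)}{dt} \leftrightarrow sF(s) - f(0)$, $\quad \frac{d^2 f(t)}{dt^2} \leftrightarrow s^2 F(s) - s f(0) - f'(0)$.

Delay rule (24.15): If $c > 0$, then $e^{-cs} F(s) \leftrightarrow f(t - c) H(t - c)$ (where $H$ is the Heaviside unit function).

$1/s$ as an integration operator (25.1): If $F(s) \leftrightarrow f(t)$, then $\frac{1}{s} F(s) \leftrightarrow \int_0^t f(\tau)\,d\tau$.

Convolution theorem (25.11): If $g(t) \leftrightarrow G(s)$ and $f(t) \leftrightarrow F(s)$, then

$F(s)G(s) \leftrightarrow \int_0^t g(t - \tau) f(\tau)\,d\tau = \int_0^t g(\tau) f(t - \tau)\,d\tau.$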


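Appendix G A table of Fourier transforms and general rules

General rules

Rule                      Signal                                                   Transform

Fourier transform pair    $x(t) = \int_{-\infty}^{\infty} X(f)\,e^{j2\pi ft}\,df$    $X(f) = \int_{-\infty}^{\infty} x(t)\,e^{-j2\pi ft}\,dt$
Linearity                 $A x_1(t) + B x_2(t)$                                    $A X_1(f) + B X_2(f)$
Time scaling              $x(At)$                                                  $|A|^{-1} X(A^{-1} f)$
Time reversal             $x(-t)$                                                  $X(-f)$
Time delay                $x(t - B)$                                               $X(f)\,e^{-j2\pi Bf}$
Frequency scaling         $|C|^{-1} x(C^{-1} t)$                                   $X(Cf)$
Frequency shift           $x(t)\,e^{j2\pi Dt}$                                     $X(f - D)$
Modulation                $x(t) \cos 2\pi Kt$                                      $\tfrac12\{X(f + K) + X(f - K)\}$
                          $x(t) \sin 2\pi Kt$                                      $\tfrac12 j\{X(f + K) - X(f - K)\}$
Differentiation           $dx(t)/dt$                                               $(j2\pi f) X(f)$
                          $d^n x(t)/dt^n$                                          $(j2\pi f)^n X(f)$
Duality                   $X(t)$                                                   $x(-f)$
Convolution               $\int_{-\infty}^{\infty} x_1(u) x_2(t - u)\,du = x_1(t) * x_2(t) = \int_{-\infty}^{\infty} x_1(t - u) x_2(u)\,du$    $X_1(f) X_2(f)$
Multiplication            $x_1(t) x_2(t)$                                          $\int_{-\infty}^{\infty} X_1(f - v) X_2(v)\,dv = \int_{-\infty}^{\infty} X_1(v) X_2(f - v)\,dv$
Periodic function         $x_P(t)$ (period $T$)                                    $\sum_{n=-\infty}^{\infty} X_n\,\delta(f - n f_0)$, where $f_0 = 1/T$, $X_n = f_0 \int_{\text{Period}} x_P(t)\,e^{-j2\pi n f_0 t}\,dt$

Short table of Fourier transforms

Signal                                                     Transform

$\Pi(t) = H(t + \tfrac12) - H(t - \tfrac12)$               $\operatorname{sinc} f$
$\operatorname{sinc} t$                                    $\Pi(f)$
$\Lambda(t) = 1 + t$ for $-1 < t < 0$; $1 - t$ for $0 < t < 1$; $0$ elsewhere    $\operatorname{sinc}^2 f$
$\operatorname{sinc}^2 t$                                  $\Lambda(f)$
$e^{-t} H(t)$                                              $1/(1 + j2\pi f)$
$t\,e^{-t} H(t)$                                           $1/(1 + j2\pi f)^2$
$e^{-|t|}$                                                 $2/(1 + 4\pi^2 f^2)$
$1/(1 + t^2)$                                              $\pi\,e^{-2\pi|f|}$
$e^{-\pi t^2}$                                             $e^{-\pi f^2}$
$\delta(t)$                                                $1$
$1$                                                        $\delta(f)$
$\cos 2\pi f_0 t$                                          $\tfrac12\{\delta(f + f_0) + \delta(f - f_0)\}$
$\sin 2\pi f_0 t$                                          $\tfrac12 j\{\delta(f + f_0) - \delta(f - f_0)\}$
$\mathrm{III}_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$, $(T > 0)$    $f_0\,\mathrm{III}_{f_0}(f)$, $(f_0 = 1/T)$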

Appendix H Probability distributions and tables

(a) Distributions, means and variances


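(i) Discrete distributions

Distribution      Probability                                        Mean ($\mu$)         Variance ($\sigma^2$)

Binomial          $\frac{n!\,p^r q^{n-r}}{(n - r)!\,r!}$             $np$                 $npq$
Geometric         $(1 - p)^{r-1} p$                                  $1/p$                $(1 - p)/p^2$
Poisson           $\lambda^n e^{-\lambda}/n!$                        $\lambda$            $\lambda$
Pascal            ${}_{r-1}C_{k-1}\,p^k (1 - p)^{r-k}$               $k/p$                $k(1 - p)/p^2$
Hypergeometric    $\frac{{}_wC_r\,{}_bC_{n-r}}{{}_{w+b}C_n}$         $\frac{nw}{w + b}$   $\frac{nwb(b + w - n)}{(w + b)^2 (w + b - 1)}$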

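(ii) Continuous distributions

Distribution           Density                                                    Mean ($\mu$)          Variance ($\sigma^2$)

Exponential            $\lambda e^{-\lambda x}$ for $x \ge 0$; $0$ for $x < 0$     $1/\lambda$           $1/\lambda^2$
Uniform                $1/(b - a)$ for $a \le x \le b$; $0$ elsewhere              $\tfrac12(a + b)$     $\tfrac{1}{12}(b - a)^2$
Standardized normal    $\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$                         $0$                   $1$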

(b) Normal distribution tables


Standardized normal distribution cdf giving the values of

$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$

for $0 \le x \le 3.0$ at 0.01 intervals. For $x < 0$, $\Phi(x)$ can be calculated
from $\Phi(-x) = 1 - \Phi(x)$.

X 0 1 2 3 4 5 6 7 8 9

0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
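
As an example of reading the table together with the symmetry rule, take $x = 1.25$ (row 1.2, column 5), a value chosen here only for illustration:

```latex
\Phi(1.25) = 0.8944, \qquad
\Phi(-1.25) = 1 - \Phi(1.25) = 0.1056, \qquad
\Phi(1.25) - \Phi(-1.25) = 2\,\Phi(1.25) - 1 = 0.7888 .
```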

Table giving $x$ for specified values of $\Phi(x)$ for $0.50 \le \Phi(x) \le 0.99$ at 0.01 intervals

$\Phi(x)$  $x$       $\Phi(x)$  $x$       $\Phi(x)$  $x$

0.50 0.0000 0.67 0.4399 0.84 0.9945
0.51 0.0251 0.68 0.4677 0.85 1.0364
0.52 0.0502 0.69 0.4959 0.86 1.0803
0.53 0.0753 0.70 0.5244 0.87 1.1264
0.54 0.1004 0.71 0.5534 0.88 1.1750
0.55 0.1257 0.72 0.5828 0.89 1.2265
0.56 0.1510 0.73 0.6128 0.90 1.2816
0.57 0.1764 0.74 0.6433 0.91 1.3408
0.58 0.2019 0.75 0.6745 0.92 1.4051
0.59 0.2275 0.76 0.7063 0.93 1.4758
0.60 0.2533 0.77 0.7388 0.94 1.5548
0.61 0.2793 0.78 0.7722 0.95 1.6449
0.62 0.3055 0.79 0.8064 0.96 1.7507
0.63 0.3319 0.80 0.8416 0.97 1.8808
0.64 0.3585 0.81 0.8779 0.98 2.0537
0.65 0.3853 0.82 0.9154 0.99 2.3263
0.66 0.4125 0.83 0.9542
Index
abscissa 2 parallelogram rule 101 EXOR gate 621
acceleration 35 phasor 340 identity laws 616
radial 159 polar coordinates 102 join 615
transverse 159 real axis 100 logic gate 617
vector 157, 158 z-transformation 426 logic networks 619
adjacency matrix 643 argument 6, 102 logically equivalent gates 624
adjoint matrix 137, 138 principal value 102 meet 615
algebra, Boolean 615 (see also Boolean asymptote 71, 74, 718 NAND gate 618
algebra) atom 627 negation 618
algorithm, numerical 79, 352 (see also attractor 662 NOR gate 618
approximation) strange 662 NOT gate 618
amplitude 12, 327 augmented matrix 199 OR gate 618
angle 7 autonomous differential equation product 615
angular 366-381 reflexive law 616
frequency 12, 327, 435 axes, cartesian 2, 146 sum 615
momentum 195 left-handed 2, 146, 185 switches in parallel 622
velocity 195 oblique 194 switches in series 622
antiderivatives 238-240 right-handed 2, 146, 185 switching circuits 622
and indefinite integral 240, 251 rotation of 167, 169, 170 switching function 622
table of 243 truth table 617
approximation (see also Taylor series) truth table, inverse 621
algorithm 79, 352 variables 616
bisection method 82 box plot 704-705
bar chart 687 interquartile range 705
Euler method 351
basis median 704
Euler method for systems 380
differential equations 301-306 quartiles 704
by Fourier series 441, 442
vectors 155 outliers 705
Gauss-Seidel method 208 Bayes’ theorem 671
incremental 76, 488, 521 whiskers 705
beam problem 274 branch
interpolation 128 Bernoulli
iterative process 79 of a curve 71
equation 365 of a tree (graph theory) 631
Jacobi method 209 trial 685
large values, for 92 binary
lineal element diagram 350, 727 operation 615
linear 488 set 605, 616
Newton’s method for roots of binomial distribution 685-686, 692 capacitor
equations 79, 718 Poisson approximation 692 complex impedance 343, 406
path of steepest ascent/descent 495 binomial theorem 89, 767 phasor 343
rectangle rule 250, 269 bins (statistics) 701 cardinality (of a set) 611
Simpson’s rule 275, 725 bipartite graph 638 infinite 611
small changes 75, 490 complete 639 cardioid 28, 717
small errors 492, 522 bisection method 82 computer program 748
step-by-step 79 block diagram 635 carrier wave 458, 468
Taylor polynomials 84-85, 719 reduction 635 caustic 542
trapezium rule 269, 725 bond 627 Cayley-Hamilton theorem 236
arccos, arcsin, arctan functions 14 Boolean algebra 615 cdf (cumulative distribution function)
area (see also integrals) absorption laws 616 694
analogy for integrals 258, 259, 268 AND gate 618 cells (statistics) 701
computation of 249, 269, 275 associative laws 616 central limit theorem (statistics) 709
as a definite integral 250 binary addition 624 centre (phase plane) 371, 377, 378
as a double integral 552 binary operation 615 centroid 271, 283
geometrical 244 Boolean expression 617, 620 chain rules 52, 57, 65, 477, 505, 514 (see
as a line integral 586 commutative laws 616 also functions of one, two and N
parallelogram 187, 563 complement 615 variables)
in polar coordinates 267 complement laws 616 more than one parameter 514, 525
signed 244-245, 248, 251 conjunction 618 one parameter 504, 525
as a sum 248 de Morgan’s laws 616 chaos 662, 665, 666, 734
of a surface 591 disjunction 618 characteristic equation (see also
table 770 disjunctive normal form 621 difference equations; differential
of a triangle 721 distributive laws 616 equations, second order;
Argand diagram 100, 719 duality principle 623 eigenvalues)
imaginary axis 101 exclusive-OR gate 621 difference equations 654

characteristic equation (cont.) volume of 266 sketching 70, 716


differential equation 301 conjunction 618 slope 30
matrices 214 conjugate, complex 98, 100 tangent line 30, 33
circle 5 connected graph 627 tangent vector 157
area 770 conservative field 578-581, 597 curvilinear coordinates 511, 599, 601
cartesian equation 152 potential 580 cutset (see also graph theory) 632
circumference 770 contour map 472, 481, 730 fundamental 633
vector equation 152 convergence cycle (graph theory) 629
circuit (graph theory) 629 of infinite series 88 cylindrical polar coordinates 539, 599,
circuits 68, 264, 296, 329, 345-348, 626 of integrals 256 600
(see also graph theory; Boolean convolution 413, 729 (see also Fourier
algebra; impedance; transfer transform; Laplace transform;
function; z-transform) z-transform)
damper 329
balanced bridge 346 discrete 424
damping 329
cutset method 632 evaluation of 462—463
critical 337
equation 329, 403, 405-407 Fourier transform 461
heavy 337
graph theory representation 632-635 Laplace transform 413
dash notation for derivative 66
nodal analysis 633 memory and 416
deadbeat 331, 372
parallel 344 theorem 413, 461, 729
decay, radioactive 18, 308
series 344 theorem proof for Laplace transform
degree (of a vertex) 627
signal flow graphs 635 556-557
delay rule (second shift rule) 398, 456
switching 622 z-transform 424
del (grad) operator 499, 589, 596
cobweb 652, 734 coordinates, two-dimensional (see also
delta function (impulse function) 404,
computer program 765 coordinates, three-dimensional;
728
cofactor 130 axes)
discrete systems 417, 418
combinations (see also permutations) cartesian 2
Fourier transform 458
673-674, 768 origin 2
and Heaviside unit function 394
common ratio (geometric series) 24, 88 orthogonal systems 513 input to circuit 415, 417
compatibility polar 10, 103, 267 Laplace transform of 405
linear equations 202 transformations 231, 558
z-transform 418
complement (of a set) 607, 670 coordinates, three-dimensional (see also de Moivre’s theorem 104
complementary function 319, 656 (see coordinates, two-dimensional; axes) de Morgan’s laws 609, 616
also difference equations; cartesian 146 derivative, directional (see directional
differential equations, second order) curvilinear 511, 601 derivative)
complete graph 627, 639 cylindrical 539, 599 derivative, ordinary 32, 33 (see also
completing the square 5, 96, 286 orthogonal systems of 513, 601 derivative, partial)
complex impedance 343-344 (see also paraboloidal 603 and antiderivative 238
impedance) spherical 602 of ax + b 56
complex numbers 96, 719 (see also cosine function 9 (see also trigonometric chain rule 52, 57, 65
Argand diagram) functions) dash notation 64, 66
argument 102 antiderivative 243 directional 494, 501, 530
conjugate 98, 100 derivative 42 dot notation 366
de Moivre’s theorem 104 exponential form 104 of e* 41
difference 98 Taylor series 89 first 43
and differential equations 304, 314 cosine rule 29, 77, 769 of function of a function 52
division 98 counting index (series) 24 higher order 44
exponential form 103, 105 Cramer’s rule 194, 197, 202, 207 implicit 58, 496
imaginary part 97 cross product (see vector product) and incremental approximation 76
logarithm 110 cumulative distribution function 694 index notation 83
modulus 100, 101 curl 595 under integral sign 484
ordered pair 100 in curvilinear coordinates 602 of inverse functions 59
parallelogram rule 101 identities 597, 603 of ln x 43
polar form 102 curves logarithmic 57
product 98 angle between 499 material 527
quotient 99 asymptotes 71, 73-74 notations 33, 34, 64, 66, 83, 159
real part 97 branch 71 parameter, in terms of 60
reciprocal 98 caustic 542 of product 48, 65
roots 96, 108, 719 chord 31 of quotient 51, 65
rules for 98-100 convex/concave 179 and rate of change 34
solution of equations 109, 110 curvature of 178, 182 of reciprocal 51
standard form 97 envelope 362, 537, 731 second 44
sum 98 gradient 30 of sin x, cos x 42
compound interest 82, 648, 649 normal to 178, 498, 500 of sums 38
computation, symbolic 716 orthogonal systems of 513 table of derivatives 43, 771
computer algebra 716-735, 747-766 parametric equations 60, 504, 532 total 505
conditional probability 675-676 point of inflection 179 of vectors 157, 159
cone 180, 428 radius of curvature of 179 of x^n 36, 55

derivative, partial 473, 474 equations, second-order; harmonic general 375-377


higher 476 oscillator; Laplace transform; initial value problem 367, 376
mixed 476 linear oscillator; phase plane; instability 371
notations 474-475 phasors) limit cycles 379
second 476 autonomous 366-381 linearization 377
determinant 123, 129-139 coefficients 296 linearized systems, classification of
2 x 2 123 definition 296 378
3 x 3 129 dependent variable 296 nearly linear 335
cofactor 130 forcing term 296 node 372, 378
cofactors, sign rule 131 homogeneous 296 numerical method 380
by computer algebra 721 independent variable 296 orbit 369
expansion by first row 130 initial condition 300, 307 path 369, 373
expansion, general 134 initial value problem 300, 307, 367 periodic motion 370
factorization 721 lineal element diagram 350 phase diagram 369
interchange rows/columns 133 linear 295-324 phase plane, defined 369
Jacobian 558 linear, matrix form 383 qualitative methods 366
rules 132-137 linear, nearly 335 saddle 371, 378
suffix permutation 130 order of 296 self-similar 378
tridiagonal 141 partial 478 separatrix 375
diagonal dominance 209 qualitative methods 367 spiral 372, 378
diagonalization of a matrix 221, 224, solution (defined) 297 stability 371
723 state of system 300, 307 trajectory 369
difference (of sets) 609 systems of (see circuits; Laplace van der Pol equation 367, 380, 727
difference equations 428, 648 transform; phase plane) differential equations, second-order (see
attractor 662 differential equations, first order also linear oscillator)
bifurcation 660 Bernoulli equation 365 autonomous 366
boundary conditions 650 change of variable 360 basis of solutions 302, 303, 304
chaos 662 computer program 756 change of variable 360
characteristic equation 654 direction field 350 characteristic equation 301
cobweb 652 direction indicators 350 characteristic equation, roots of
complementary function 656 energy transformation 362, 382 301-304
compound interest 648, 649 envelope 362 complementary function 319
definition 650 equilibrium point 370, 376 complex number methods 304, 314
difference 648 Euler numerical method 352, 727 by computer algebra 727
discrete variables 648 first-order systems 375 (see also phase computer program 756
equilibrium 650, 651 plane, Laplace transforms) damping 306, 330, 332
Feigenbaum sequence 662 general solution 299, 319, 323 equidimensional 365
first-order 652 graphical method 350 forced, general solution 318
fixed point 650, 651 initial condition 300 forced, particular solution 312
forcing term, table 657 initial value problem 300 harmonic forcing 314, 331, 332
homogeneous 653-656 integrating factor 321, 323, 359 homogeneous (unforced) 296, 303,
inhomogeneous 653, 656-658 isoclines 350 330
linear, constant coefficients 653 lineal-element diagram 350, 727 inhomogeneous (forced) 310, 323, 332
logistic equation 651, 658 linear 297 initial value problem 307
order 651 linear, unforced 296 linear 297, 332
particular solution 656 nonlinear 354, 356, 361, 366-381 linear oscillator 326-333
period-2 solution 660 numerical solution 352, 727 nearly linear 335
period doubling 660 separable 354 particular solution 310
periodicity 653 separation of variables 354 physical analogies 329
pitchfork bifurcation 660 simultaneous 396 unforced, general solution 302
recurrence relation 353, 648 singular solutions 356, 362 differentiation (see also derivative)
solution by computer algebra 733 solution curves 299, 350 chain rule 52, 57
stability 652, 653, 659 solution by differentials 356 function of a function rule 52
strange attractor 662 state 300 implicit 58, 496, 524
z-transform 428 systems of (see circuits; Laplace of integrals 261
differential transform; phase plane) of inverse functions 59
for differential equation 356 trivial solution 298 logarithmic 57
form 357 variable coefficients 321 partial 473, 474 (see also derivative
integrating factor 359 differential equations, nonlinear partial)
and line integrals 571 autonomous 367, 376 centre 371, 376, 378
perfect 359, 571 centre 371, 376, 378 quotient rule 51, 65
table of 357 centre, condition for 377 reciprocal rule 51
for two variables 517 computer program 758 of vectors 157, 159
differential delay equation 431 direction of paths 373, 376 diffraction 329
differential equations (see also Duffing equation 382 digraph (directed graph) 627
approximations; differential equilibrium point 370, 373, 376 weighted 636
equations, first-order; differential Euler’s method 380 directed graph 627

directed line segment 145 as a repeated integral 551 and variance 688
direction cosines 168, 529 repeated integral, evaluation 543 experiment 666
direction ratios 171, 172 separable type 555 Bernoulli trials 685
directional derivative 494, 501, 530 signed volume analogy 551 outcomes 667
discrete dynamical system 650 duality principle 456, 613, 623 trial 666
discrete systems 418 (see also Duffing equation 382 exponential distribution 695, 696
z-transform) dynamical system 650 exponential function 14
impulsive input 417 base 14
input/output 417 cuts y axis at 45° 14
linear 417 derivative of 41
doubling period 17
notation for sampling 418
e, numerical value 15 doubling principle 17
sampling smooth function 418
echelon form 199 growth, decay 17, 18, 308
transfer function 417, 421
edge (of graph) 626 half-life period 18, 308
transients 424
eigenvalues 214 inverse of logarithm 15
discrete variable 648
characteristic equation 214 Laplace transform of 386
disjunction 618
complex 215 limit of x^n e^{-cx} 72
disjunctive normal form 621
by computer algebra 722 Taylor series 89
displacement 34, 142, 144
computer program 752 value of e 15
displacement vector
in differential equations 383
addition 144
of a matrix 214, 228
components 144
positive 230
initial/end-points 145
repeated 218
distance 3 face (of a graph) 639
zero 218
of point from plane 175 eigenvectors 215 factorial function 290
signed 2 by computer algebra 722 feedback 635
distribution computer program 752 Feigenbaum sequence 662
sampling 706 of matrix 215, 220 Fibonacci sequence 664
distributions (pdf) (continuous) 552, 694 orthogonal 229 field (see also vector field)
cumulative 694 and quadratic forms 227 conservative 578-581
cumulative distribution function (cdf) for repeated eigenvalues 218 intensity 578
694 ellipse 6 vector 499
exponential 695, 696 area 770 fixed point 651
mean 696 parametric equations 63 stability 653
normal 696 empty set 607, 670 fluid flow 507, 594, 595, 596
standardized normal 696 energy transformation 362, 382 material derivative 527
uniform 699 envelope 362, 537 focal length 81, 502
variance 696 equilibrium force
distributions (discrete) of forces 177 components 176
and Bernoulli trials 685 point 370, 371, 373, 376 equilibrium 176, 177
binomial 685-686, 692 errors 75, 492, 522 (see also moment about axis 191
geometric 689 approximation) moment about point 190
hypergeometric 693 estimate (statistical) 705 at a point 176
mean (expected value) 687 estimator of parameter 701 resultant 176
Pascal (negative binomial) 693 biased/unbiased 706 Fourier series 434-451
Poisson 691-692, 699 sample mean 707 approximation, nature of 441, 442
variance 688 sample variance 708 average value 439, 442
divergence (of a vector field) 587, 600, standard error 708 carrier wave 458, 468
601 ethanol molecule 627 coefficients 436, 438-439
in curvilinear coordinates 600, 601, Euler’s constant 728 complex coefficients 449
602 Euler’s formula (complex numbers) 103 computer program 759
identities 603 Euler’s method (differential equations) cosine series 443
theorem 593 352, 727 of even functions 443
divergent series 88 computer program 757 extensions 445
dot product 163 (see scalar product) Euler’s theorem (graph theory) 639 finite range, on 444
double integral 544 eulerian graph 631 Fourier transform (see Fourier
area increment 559 computer test for 733 transform)
change of variable 553, 557, 559 events fundamental frequency 435
changing order, constant limits 546 exhaustive list of 670 half-range series 446, 447
changing order, nonconstant limits independent 677 harmonics 435
548 intersection 669-670 at a jump 442
examples 552 mutually exclusive 670 Laplace transform of 468
inverse of Jacobian 517, 563 subset of sample space 668 of odd functions 443
Jacobian 557, 559 union 669-670 one from another 448
nonrectangular regions 545 Venn diagram 670 Parseval’s identity 466, 468
polar coordinates 553 exp(x) 14 period 2π 439
rectangular region 544 expected value (mean) 686, 695 period T 436, 439, 449
region of integration 547, 550 rules 688 periodic function 435, 437

restricted range 444 restricted stationary points 533-537 gamma function 290
sawtooth wave 468, 729 stationary points 532 gas, equation of state 524
sine series 443 tangent plane 529 gate (logic)
spectrum 447 functions of one variable 64-65 (see also AND 618
standard form 436 derivative; differentiation) EXOR 621
symmetry, use of 442 argument of 6 NAND 618
trigonometric integrals 437 delta 404, 728 NOR 618
two-sided 449 discontinuous 7 NOT 618
and vibrations 435 differentials of 356 OR 617
Fourier transform (see also Fourier estimating small changes 75 Gauss-Seidel method (for linear
series, two-sided) equations) 208
even 7, 260
carrier wave 458 diagonal dominance 209
exponential 14
convolution integral 461 Gaussian elimination 198
harmonic 11, 326
convolution theorem 461 back substitution 198
Heaviside 7, 717
cosine and sine 459 and echelon form 199
hyperbolic 18, 107
definition 452 inverse matrix 201
implicit 6
Fourier transform pair 452 pivots 199
impulse 404
generalized functions 459 geometric distribution 689
incremental approximation 76
impulse function δ(t) 458 geometric series 24, 88
input/output 6 common ratio 24
inverse of a product 461 inverse 13
inverse transform 454 sum of 25
logarithm 15 geometrical area 244
nonperiodic functions 451
maximum/minimum 66 in polar coordinates 267
notations 454
mean value 263 gradient (straight line) 4
Parseval theorem 466
odd 7, 259 gradient vector (grad) 499, 526
periodic function 460
periodic 11 curvilinear coordinates 599-600, 601,
properties of 453
point of inflection 67 602
Rayleigh’s theorem 465
rational 21 identities 603
rules, table of 456, 774
signum (sgn) 8, 27 graphs 3 (see also curves)
sidebands 458
stationary points 68 gradient 30
sine function 455
step 648 sketching 70
spectral distribution 452
switching 622 slope 30
table of 775
translation of 6 graph theory (networks) 626
Π(t) (top-hat function) 454, 455
trigonometric 9 bipartite graph 638, 639
III(t) (shah function) 464
unit step 7 branch 631
frameworks 640
functions of two variables 471 circuit 629
bipartite graph 640
chain rule 477 circuit loop 632
minimum bracing 640
chain rule, one parameter 505 circuits, electrical 632-635
frequency 12, 327 compatibility graph 641
chain rule, two parameters 515
angular 12, 327 complete graph 627, 639
contour map 471
domain 343 connected graph 627
curves, angle between 499
forcing 333 cotree 631
curvilinear coordinates 511
polygon (statistics) 701 cutset 632
relative 667, 668 dependent, independent variables 471
depiction of 471 cutset, fundamental 633
friction 312 cycle 629
function 6 (see functions of one, two and derivatives, mixed 476
degree of a vertex 627
N variables) differentials, use of 517
digraph 627
complementary 319, 656 directional derivative 493, 501
directed graph (digraph) 627
generating 663 errors 490, 492
disconnected graph 627
implicit 6 gradient vector 500
edge 626
functions of N variables 521 (see higher derivatives 476
eulerian graph 631
separately functions of two implicit differentiation 496
Euler’s theorem 639
variables) incremental approximation 488
face 639
chain rule, more than one parameter Lagrange multiplier 506
frameworks 640 (see also frameworks)
525 least squares method 483 hamiltonian graph 631
chain rule, one parameter 525 level curves 471 handshaking lemma 628
derivative, mixed 521 linear approximation, best 488 labelled graph 629
directional derivative 529 maximum/minimum 480, 506 link 631
and envelope 538 normal to a curve 498, 500, 518 loop 627
errors 522 normal to surface 479, 527 multigraph 627
gradient vector 525, 532 orthogonal systems of curves 496 node 626
higher derivatives 522 partial derivatives 473, 474 path 629
implicit differentiation 524 restricted stationary points 506 planar graph 626
incremental approximation 522 stationary points, tests for 480- regular graph 627
Lagrange multipliers 534, 535, 536 481 signal flow graph 635 (see signal flow
level surface 532 steepest ascent/descent 495 graphs)
normal to surface 528, 532 surface 471 simple graph 627
partial derivatives 522 tangent plane 479 spanning tree 631

graph theory (networks) (cont.) incremental approximation 76, 489, 522 interval 1
subgraph 631 index notation for derivatives 83 infinite 2
traffic signal phasing 641 induction proof 649 inverse function 13
trail 629 inductor basic property 13
tree 631 complex impedance 343, 406 derivative of 59
unlabelled graph 629 phasor 342 integration of 289
vertex 626 inequality 1 reflection property 13
walk 629 infinite series 86 inverse matrix 122, 137
weights 627 convergence 87 isocline 350
gravitational field 581 divergence 88 iterative methods (see approximation)
Green’s theorem 574 geometric 88
partial sums 87
sum 87 Jacobi method (for linear equations) 209
Taylor series 88, 89 Jacobian (double integration) 558, 559
hamiltonian graph 631 inflection, point of 67, 179 inverse of 517
computer test for 765 inner product (see scalar product) jump (in function) 7
handshaking lemma 628 input 417
harmonic forcing 314
integer floor function 432
harmonic function 11, 326
integral (see also antiderivative;
standard form 326 Kirchhoff laws 344, 632
integration; double integral; line
harmonic oscillation 326 Kuratowski 639
integral)
phase diagram 369
and area 249
phasor 340
area analogy 259, 268
harmonic oscillator 306, 326-329, 367,
area, polar coordinates 267
368, 726 Lagrange multiplier 509, 510, 534-537,
of complex functions 256
computer program 757 731
definite 250, 253 computer program 762
harmonics 434
differentiation of 261
Heaviside unit function 7, 396, 717 Laplace equation 486
Laplace transform of 397 differentiation under the integral sign
Laplace transforms 384-433 (see also
histogram 701 485 discrete system; z-transform)
homogeneous algebraic equations 205, of even function 260
and circuits (see circuits)
206 (see also linear algebraic improper 255
by computer algebra 728
equations) indefinite 240, 251 convolution theorem 413, 416, 556-
homogeneous differential equations 296 infinite 255 557
(see also differential equations, integrand 250 definition 385
second order) as limit of a sum 249, 253, 265 delay rule (second shift rule) 398
Hooke’s law 234, 329 limits of integration 251 delta function 405, 728
hyperbola 6 limits of integration, variable 261 of derivatives 392
hyperbolic functions 18, 19, 107 numerical evaluation of 250, 264, 268, differential-delay equation 431
derivatives 53 275 differential equation, variable
identities 19, 20 of odd function 259 coefficients 432
inverse 20 rectangle rule 250, 269 and differential equations 394-396
trigonometric functions, relation with Simpson’s rule 275 discrete systems (see z-transform)
107 solid of revolution 266 division by s 402
hypergeometric distribution 693 square bracket notation 246 of Fourier series 468
surface 589 of Heaviside unit function 396
table of integrals 772 impedance, s-domain 406
trapezium rule 269, 725 impulse function 405
identity volume 592 impulsive input 415
algebraic 21 integral equation 403, 431, 432 integral equations 403, 431, 432
of functions 356 Volterra 431 inverse 390
impedance, complex 343-344 integrand 250 inverses, table of 773
capacitor 343, 406 integrating factor 321, 359 multiplication by ekt 388
in frequency domain 343 integration 248-291 (see also double multiplication by t 389
inductor 343, 406 integration; integral) notation 384
parallel 344 and area 249, 258 original 390
resistor 343, 406 change of variable 277-285 of a partial differential equation 487
series 344 change of variable, definite integrals partial fractions 391
in s-domain 406 of a product 413
in ω-domain 343-344 quiescent system 395
implicit differentiation 58, 496, 524 partial fractions 285 rules, list of 773
improper integral 255 by parts 286-291 s-domain 403, 406, 408
convergence 256 reduction formulae 290-291 scale rule 387
divergence 256 by substitution 277-285 shift rule 388
impulse function 404 (see also delta as a sum 249, 253, 265 sifting 404
function) of trigonometric products 282 standard functions 385-386
impulsive input 416, 417 interference 329 of switching functions 397
increment 31 intersection of sets 607, 670 table of 391, 773

tilde notation 385 oscillator) associative law 115


transfer function, s-domain 408 damped forced 316, 331-332 Cayley-Hamilton theorem 236
transfer function, cc-domain 413 damped unforced 307 by computer algebra 720
Volterra integral equation 431 deadbeat 331, 372 conformable for multiplication 115
and z-transform 420 free oscillations 330 difference 114
lead and lag 328 natural oscillation 330 elementary row operations 198
least squares properties 332 equality 113
by computer algebra 731 resonance 333 linear equations 121
estimates 711 transient 332 multiplication 115, 117
method 484 link (graph theory) 570 multiplication by a constant 113
level curve 471 In (see logarithm) multiplication on left/right 118
gradient vector normal 500 logarithm 15 postmultiplication 118
level surface 532 of complex number 110 powers 121, 723
gradient vector normal 527 derivative of 43 premultiplication 118
normal to 532 inverse of exponential 15 row-on-column operation 116
light switches (Boolean application) 623, properties 16 sum 114
625 logarithmic differentiation 57 summation notation 117
limit 31, 71 logic gates 617 (see also Boolean algebra; matrix, inverse 122, 137, 139 (see also
infinity, at 72 gate) matrix; matrix algebra)
important limits 39-41, 72 logic networks 619 (see also Boolean by Gaussian elimination 201
from left 73 algebra) of a product 124
notation 33 logistic equation 364, 651, 658, 665, 734 rule for 2 x 2 123
from right 73 loop (of graph) 627 rule for 3 x 3 125
limit cycle 379, 727 maximum/minimum
contours around 481
stability 380
local 67, 481
line integral 565 mass-centre 270, 272, 283 N variables 532
closed path 572 mass-spring system 329 one variable 67
closed path convention 572 material derivative 527 restricted 69, 506, 533-537 (see also
definition 566 matrix 112 (see eigenvalues; Lagrange multipliers)
field, conservative 578-582 eigenvectors; matrix algebra; slope changes sign 66
field intensity 579 matrix, inverse) two variables 480
Green’s theorem 574 adjacency 643 two variables, classification 482
non-conservative potential field 583 adjoint 137, 139 mean (expected value) 687
as an ordinary integral 565 augmented 199 population 701
parametric evaluation 569 characteristic equation 214 rules 688
parametric representation 568-569 determinant of 123, 125, 139 (see also sample 702
path 565 determinant) median 704
path dependence 566 diagonal 121 minimum (see maximum)
path independence 571-579 diagonalization of 221, 224 mode 703
paths parallel to axes 570 eigenvalues of 214 modulus 2, 101
perfect differential 572 eigenvectors of 215 moment (see also force)
potential 581 idempotent 235 of force 190-192
potential field 582 identity 121 of inertia 270, 272, 294, 552
in two and three dimensions 567-568 inverse 122, 137 of momentum 195
work 576-579 leading diagonal of 120 vector 191
linear algebraic equations 121, 196 lower triangular 209 mortgage 649, 733
augmented matrix 199 nonsingular 123 multigraph 627
back substitution 198 null 114 mutually exclusive events 670
compatible 202 order 113
Cramer’s rule 194, 197, 202, 207 orthogonal 228, 230
diagonal dominance 209 positive-definite 229
echelon form 199 powers of 121, 224, 227
elementary row operations 198 quadratic form 227, 230 negation (Boolean algebra) 618
Gauss-Seidel numerical method 208 rectangular 113 negative binomial (Pascal) 693
geometrical interpretation 204 row-stochastic 226, 236 networks (see graph theory)
homogeneous 205-206 singular 123, 218 Newton cooling 325
ill-conditioned 484 skew-symmetric 120 Newton’s method 79
incompatible 202 square 113 by computer algebra 718
Jacobi numerical method 209 symmetric 119 nodal analysis (circuits) 633
pivots 199 trace of 235 node (graph theory) 626
solution by computer algebra 722 transpose 118, 132 node (phase plane) 372
solution by elimination 196 unit 121 nonlinear differential equations (see
solution by Gaussian elimination 198 upper triangular 210 differential equations, nonlinear)
trivial, nontrivial solutions 206 vector 113, 120 normal
linear dependence 221 zero 114 to curve 498, 518
linear independence 221 matrix algebra (see matrix; matrix, to plane 174
linear oscillator 330 (see also harmonic inverse) to surface 480, 527

normal coordinates 234 perfect differential form 359, 571 estimated mean 703
normal distribution 696 period 11, 12, 327 mean 703
standard normal curve 697 doubling 660 population problems 18, 308, 364, 375,
standardized 696 periodic functions 11 (see also harmonic 431, 487, 658
table 776 functions; Fourier series) position vector 152
number line 1 amplitude 12, 327 derivative of 157
numbers angular frequency 12, 327 positive definite
complex 96 (see also complex and Fourier integrals 460 matrix 229
numbers) frequency 12, 327 quadratic form 230
index laws 768 integrals of 436 potential 581
modulus 2, 101 lead/lag 328 energy 39, 233, 581
rational 606 period 11, 12, 327 field 582
set notations 606 phase 12, 327 single-valued 581
square root 2 phase difference 328 product rule (differentiation) 48, 65
numerical methods (see approximation) spectrum 447 probability (see also distributions;
wavelength 11, 12, 327 events; random event; random
permutations 673, 768 (see also variable; sample space)
Ohm’s law 632 combinations) addition law 672
ordered pair 100, 613 perspective 182 axioms 671
ordinate 2 phase 12, 327 Bayes’ theorem 671
origin of coordinates 1 difference 328 conditional 675-676
original (Laplace transform) 390 phase plane 369 (see also differential counting method 673
orthogonal matrix 228, 230 equations, nonlinear) cumulative distribution function (cdf)
rotation of axes 231 general 375-377 density function (pdf) 694
orthogonal systems orbit 369 density function (pdf) 694
of coordinates 513, 599 path 369, 373 distributions (see distributions)
of curves 496 qualitative methods 366 and expected value 687-688
oscillations phasor 339-363 and experiments 667
damped 306 addition 341 notation 668, 675
forced 332 algebra of 341 and relative frequency 667, 678
harmonic 327 capacitor 343 and sample space 667
longitudinal 233 complex impedance 343-344 and set notation 669
oscillator, linear 330, 332 definition 339 total 678
outlier 705 of derivative 341
output 417 diagram 342
frequency domain 343
quadratic equation 96, 768
harmonic oscillation 340
quadratic form 227
parabola 5 inductor 343
quartiles 704
parallelogram of integrals 341
quotient rule (differentiation) 51, 65
area (determinant) 563 resistor 343
area (vector) 187 time domain 343
parallelogram rule transfer function 346
complex numbers 101 voltage gain 346 radian 8, 768
vector addition 148 pitch 434
radioactive decay 308
parallelepiped pitchfork bifurcation 660 random
volume (determinant) 194 pivot 199 event 666, 683
volume (vector) 189 planar graph 638 sample 701
parameter (statistics) 701 computer test for 765 walk 664
parametric equations of a curve 60, 504, plane random variable 683
532 cartesian equation 153, 174 continuous 694
Parseval identity 466, 468 normal vector to 174 discrete 684
partial derivative (see derivative, partial; tangent 478, 479, 529 mean (expected value) 687
functions of N variables) vector equation of 153, 174 probability distribution 684
partial fractions 20 Poisson distribution 691-692, 699 variance 688
in integration 285 approximation to binomial walk 664
and Laplace transforms 391 distribution 692 rational function 21
rules 21-22 polar coordinates 9-10 reciprocal rule (differentiation) 51
and z-transforms 425-426 and complex numbers 102 recurrence relation 353 (see also
partial sum 87 curve sketching in 717 difference equations)
Pascal distribution 692 derivative dy/dx 61 reduction formulae 290-291
Pascal’s triangle 768 in double integration 553 region of integration 547, 550
path (see differential equations, geometrical area in 267 regression 710, 711
nonlinear; line integral) motion in 158 controlled variable 710
path (graph theory) 629 pole (see z-transform) least squares estimate 711
pdf (probability density function) 694 polynomial 21 line 712
pendulum 39, 309, 335, 373-375, 490 Taylor 84-85, 719 linear model 711
computer program 757 population (statistics) 701 scatter diagram 710

unbiased estimators 712 proper subset 607 regression 710-711


repeated integral (see double integral) subset 607 sample mean 702, 707
resistor union 606-607, 670 sample variance 708
complex impedance 343, 406 universal 606, 608 sampling distribution 706
phasor 343 Venn diagram 608 standard error of mean 708
resonance 333-334 sgn function (see signum function) statistic 701
restricted stationary values 69, 506, 533- shah function 464 stem 637
537 shift rule 388 stiffness of spring 233, 329
resultant of forces 176 shoulder (surface) 480 Hooke’s law 233, 329
rotation of axes (see axes) sidebands 458 straight line 3
row operations, elementary 198 sifting function 404 cartesian form in three dimensions
signal 417 176
signal energy 465 cartesian form in two dimensions 4,
signal flow graphs 635-638 172
block diagram 635 determinant equation of 140
saddle (phase plane) 371, 378
cycle 637 direction cosines 167
saddle (surface) 472, 480, 482, 730
edges in series 637 direction ratios 171
computer program 759
feedback 635-636 gradient 4
sample 701
loop 637 parametric form 175
mean 702, 707
multiple edges 637 perpendicular 4
standard error of mean 708
stem 637 slope 4
variance 708 vector equation 175
weighted digraph 636
sample space 667 (see also events) strange attractor 662
signed area 244, 248
elements of 667 streamline 541
signum function 7, 27
exhaustive, mutually exclusive subsets subgraph 631
simple harmonic motion (see harmonic
of 667-671 subset 607
oscillator)
partitioning of 671 substitution, method of (see integration)
Simpson’s rule 275, 725
standard error of sample mean 708 summation sign 24
computer program 755
Venn diagram 670 superposition principle 313
sine function 9 (see also trigonometric
sampling distribution 706 surface 471
functions)
sawtooth wave 468, 729 area 591
antiderivative 242
scalar product 163 cone 472
derivative 42
scalar triple product 188 contour map 472, 481, 730
exponential form 104
cyclic order 188 hemisphere 472
half-rectified 466
scale rule integral 589-590
Fourier transform 586 rectified 466, 467
Taylor series 89 normal to 478, 479, 527, 532
Laplace transform 387 parametric form 591
singular matrix 123, 218
scatter diagram 710 saddle 472, 480, 482, 730
sector singular solutions (of differential
equations) 356, 362 shoulder 480
area in polar coordinates 267 stationary points 481, 532
separation of variables (differential slope 30, 33 (see also curve; functions of
two variables; straight line) stationary points classification 482
equations) 354 tangent plane 478, 479, 529
separatrix 375 change of sign at maximum/minimum
switches 622
sequence 24, 87, 651 66
light 623, 625
of partial sums 87 spectrum 447
in parallel 622
series (see Fourier series; geometric speed 62, 150
in series 622
series; infinite series; Taylor series) sphere
tied 623
sets 605-612 surface area 770
truth tables 622, 623
associative law 609 volume 770
switching circuits (Boolean algebra) 622
binary 605 spiral (curve)
truth tables 622
cardinality 611 archimedean 28
switching function 622
cartesian product 613 equiangular 28
system
commutative law 609 spiral (phase plane) 372, 378
of differential equations 378
complement 607, 670 square root 2
electrical 329
complementary law 609 standard deviation 689, 698
feedback 635
de Morgan’s laws 609 standard error 708
mechanical 329
difference 609 stationary points
in series 410-413, 635
disjoint 607 in N variables 532-534
distributive law 609 in one variable 68
duality 613 restricted 69, 506, 533-537
elements 605 in two variables 480 tangent (to a curve)
empty 607 in two variables, classification 482 equation 30, 33
finite 605 statistic 701 vector 157
identity law 609 statistics 701-714 tangent plane 478, 479, 529
infinite 605 box plot 704-705 Taylor polynomial 84-85, 719
intersection 607, 670 central limit theorem 709 Taylor series
number sets 606 estimator 701 binomial theorem 89
ordered pairs 613 population 701 composite functions 90

Taylor series (cont.) variance 688, 696 orthogonal coordinates (general) 539,
computer program 750 sample 708 599, 601, 602
variate 701 paraboloidal coordinates 603
general point, at 94
interval of validity 88 vectors (see also vector field) scale factors 599
acceleration 157-159 solenoidal 595
large x 92
polynomial approximation 85, 719 angle between 164 spherical polar coordinates 602
table 89 basis 155, 166, 184 surface integral 589-590
tetrahedron 722 bracketed sums 148 volume integral 592
time domain 343 column 113, 120 vector product
total derivative 505 components 144 of basis vectors 184
total probability 678 coplanar 150, 189 direction of 186
total waiting time 642 cross product (see vector product) and moment of force 190, 191
traffic signals and curvature 178 rules 184, 187
compatibility graph 641 differentiation 157, 159 vector space 220
phasing 641 directed line segment 145 vector triple product 192
subgraph 641 displacement 144 velocity 34, 35
total waiting time 642 displacement on an axis 142 angular 195
transfer admittance 347 dot product (see scalar product) radial 159
transfer function 346 equality 147 relative 150
frequency domain 347 equations 152 transverse 159
s-domain 408 and force 176, 190 vector 158
z-domain 421 gradient 499 (see also gradient Venn diagram 608
transfer impedance 347 vector) vertex 626
transform (see Laplace transform; z- initial/end points 145 vibrations
transform; discrete systems; Fourier invariance 187 differential equations of 234
transform) magnitude (length) 147 and Fourier series 435
transient 332 multiplication by a scalar 147 normal coordinates 234
of discrete signals 424 normal to curve 498, 500 volume
translated function 4, 6 normal to plane 174 of cone 266
transpose 118, 132 normal to a surface 478, 479, 527, 532 ellipsoid 274
trapezium rule 269, 725 notation 146 integral 592
tree (graph theory) 631 parallelepiped 189, 194
parallel 147
spanning 631 of solid of revolution 266
parallelogram rule 148
trial 666 parametric equation 153 table 770
Bernoulli 685 perpendicular 165
triangle rule (vectors) 148 position 152
triangle, vector area 194, 721 and relative velocity 150
trigonometric functions 9 right/left-handed system of 185 walk (graph theory) 629
derivatives of 42 row 113, 120 walk, random 664
exponential form 104 rules of vector algebra 146-150 water clock 274
identities 11, 109, 769 scalar product 164 wave
integrals of products 282 scalar triple product 188 attenuating wave 475
inverses 14 sense of 147 equation 478
truth tables (Boolean algebra) 617 sum of 147-148 number 327
for gates 617-619 tangent to curve 157 wavelength 11, 12, 327
inverse method 621 Wheatstone bridge 81, 345, 346
triangle rule 148
for switches 622-623 whisker 705
unit 155-156, 166
vector product (see vector product) work 253, 265, 576, 577, 579
and velocity 150, 158
uniform distribution 699 vector field 499, 587 (see also curl;
union (of sets) 606, 607, 670 divergence; gradient)
unit step function 7 conservative 597 z-transform (see also discrete systems)
universal set 606 curl 595 (see also curl) complex plane 424
cylindrical polar coordinates 539, 599, convolution theorem 424
600 definition 420
del operator 499, 589, 596 delay circuit 419
valency 627 divergence 587 (see also divergence) difference equations 428
van der Pol equation 367, 380, 727 divergence theorem 593 differentiation analogue 432
variable field lines 587-588 inverse 420
controlled (independent) 710 fluid flow 587, 594, 595, 596 and Laplace transform 420
dependent 6, 471 gradient 499, 526 (see also gradient) poles of 426
discrete random 684 identities 603 stability of discrete circuit 426
independent 6, 471 integral curves 587 time-delay rule 432
random 684 irrotational 596 transfer function 420, 421
response (dependent) 711 Laplace’s equation 603 transients, growth of 426-429
Many students beginning their engineering and science courses need a book on
mathematical methods to underpin their studies. This textbook offers an accessible and
comprehensive grounding in many of the mathematical techniques required in the early
stages of an engineering or science degree, and also for the routine methods needed by
first-year mathematics students.
Mathematical Techniques starts by revising work from pre-university level before
developing the more advanced material which students will encounter during their
undergraduate studies.
In this second edition, the content of the book has been revised and extended, and now
includes: the z transform for discrete systems; the Fourier transform; a revision of the
chapter on vectors, and a new chapter on vector fields including divergence and
curl operators; some new applications in graph theory; and three new chapters on
elementary probability, random variables, and statistics. The chapter on applications of
symbolic computing has been extended to cover these changes in the main text, and
remains a valuable separate chapter in the book.
This new edition now falls into eight parts:
♦ Elementary methods, differentiation, complex numbers
♦ Matrix algebra and vectors
♦ Integration and differential equations
♦ Transforms and Fourier series
♦ Multivariable calculus
♦ Discrete mathematics
♦ Probability and statistics
♦ Projects using symbolic computation
Chapters and sections are largely self-contained, allowing students to concentrate on the
specific methods they need to master. The book contains nearly 500 fully worked examples,
more than 2000 exercises (with selected answers), and over 120 computing projects.
The text is accessible, widely illustrated, and stands as an ideal introduction for students
in the first year of their university course.

Dominic Jordan and Peter Smith are members of the Mathematics Department at
Keele University. They both have long experience of teaching this material, at many
different levels, to a wide variety of students.
'... an introduction covering all the standard techniques...
a large and friendly format.' New Scientist
'... exactly the book that is now needed ... just the thing to become
the recommended text for freshman courses...'
The Times Higher Education Supplement
ISBN 0-19-856461-9

OXFORD UNIVERSITY PRESS
