INTUITIVE INFINITESIMAL CALCULUS

Viktor Blåsjö
© 2018 intellectualmathematics.com

CONTENTS

1 Differentiation
   1.1 Infinitesimals
   1.2 The derivative
   1.3 Derivatives of polynomials
   1.4 Derivatives of elementary functions
   1.5 Basic differentiation rules
   1.6 The chain rule
   1.7 Limits
   1.8 Reference summary

2 Applications of differentiation
   2.1 Maxima and minima
   2.2 Concavity
   2.3 Tangent lines
   2.4 Conservation laws
   2.5 Differential equations
   2.6 Direction fields
   2.7 Reference summary

3 Integration
   3.1 Integrals
   3.2 Relation between differentiation and integration
   3.3 Evaluating integrals
   3.4 Change of variables
   3.5 Integration by parts
   3.6 Partial fractions
   3.7 Reference summary

4 Applications of integration
   4.1 Volume
   4.2 Arc length
   4.3 Center of mass
   4.4 Energy and work
   4.5 Logarithms redux
   4.6 Reference summary

5 Power series
   5.1 The idea of power series
   5.2 The geometric series
   5.3 The binomial series
   5.4 Divergence of series
   5.5 Reference summary

6 Differential equations
   6.1 Separation of variables
   6.2 Statics
   6.3 Dynamics
   6.4 Second-order differential equations
   6.5 Second-order differential equations: complex case
   6.6 Phase plane analysis
   6.7 Reference summary

7 Polar and parametric curves
   7.1 Polar coordinates
   7.2 Calculus in polar coordinates
   7.3 Parametrisation
   7.4 Calculus of parametric curves
   7.5 Reference summary

8 Vectors
   8.1 Vectors
   8.2 Scalar product
   8.3 Vector product
   8.4 Geometry of vector curves
   8.5 Reference summary

9 Multivariable differential calculus
   9.1 Functions of several variables
   9.2 Tangent planes
   9.3 Unconstrained optimisation
   9.4 Gradients
   9.5 Constrained optimisation
   9.6 Multivariable chain rule
   9.7 Reference summary

10 Multivariable integral calculus
   10.1 Multiple integrals
   10.2 Polar coordinates
   10.3 Cylindrical coordinates
   10.4 Spherical coordinates
   10.5 Surface area
   10.6 Reference summary

11 Vector calculus
   11.1 Vector fields
   11.2 Divergence
   11.3 Line integrals
   11.4 Circulation
   11.5 Curl
   11.6 Electrostatics and magnetostatics
   11.7 Electrodynamics
   11.8 Reference summary

12 Further problems
   12.1 Newton’s moon test
   12.2 The rainbow
   12.3 Addiction modelling
   12.4 Estimating n!
   12.5 Wallis’s product for π
   12.6 Power series by interpolation
   12.7 Path of quickest descent
   12.8 Isoperimetric problem
   12.9 Isoperimetric problem II

13 Further topics
   13.1 Curvature
   13.2 Evolutes and involutes
   13.3 Fourier series
   13.4 Hypercomplex numbers
   13.5 Calculus of variations

A Precalculus review
   A.1 Coordinates
   A.2 Functions
   A.3 Trigonometric functions
   A.4 Logarithms
   A.5 Exponential functions
   A.6 Complex numbers
   A.7 Reference summary

B Linear algebra
   B.1 Introduction
   B.2 Matrices
   B.3 Linear transformations
   B.4 Gaussian elimination
   B.5 Inverse matrices
   B.6 Determinants
   B.7 Eigenvectors and eigenvalues
   B.8 Diagonalisation
   B.9 Data
   B.10 Reference summary

C Notation reference table
1 DIFFERENTIATION

§ 1.1. Infinitesimals

§ 1.1.1. Lecture worksheet

The basic idea of the calculus is to analyse functions by means of their behaviour on a “micro” level. Curves can be very complicated when taken as a whole but if you zoom in far enough they all look straight, and if you slice a complicated area thin enough the slices will pretty much be rectangles. Lines and rectangles are very basic to work with, so at the micro level everything is easy. Here is an example:

1.1.1. The area of a circle is equal to that of a triangle with its radius as height and circumference as base.

   (a) Explain how this follows from the figure below.

   [Figure: a circle divided into thin slices, shown equal in area to a triangle]

   (b) Explain why this is equivalent to the school formula $A = \pi r^2$.

In the context of the calculus we utilise this idea by systematically dividing the x-axis into “infinitely small” or “infinitesimal” pieces, which we call $dx$ (“d” for “difference”). Here I have drawn such a $dx$ and the associated change $dy$ in the value of the function:

[Figure: a curve with a small triangle whose legs are $dx$ and $dy$ and whose hypotenuse is $ds$]

Since $dx$ is so small, the curve may be considered to coincide with the hypotenuse $ds$ on this interval. Of course in the figure this is not quite so, but the figure is only schematic: in reality $dx$ is infinitely small, so the hypotenuse $ds$ really does coincide with the curve exactly, we must imagine.

In fact, if we extend the hypotenuse segment $ds$ we get the tangent line to the curve.

1.1.2. Explain why. Hint: a tangent line may be considered a line that cuts a curve in “two successive points,” i.e., as the limit of a secant line as the two points of intersection are brought closer and closer together:

   [Figure: a secant line approaching the tangent line]

1.1.3. Why is it called a “tangent” line? Hint: the dance “tango” and the adjective “tangible” share the same Latin root.

1.1.4. Argue that $dy/dx$ is the slope of the graph.

§ 1.1.2. Problems

1.1.5. Explain how the result of problem 1.1.1 can also be obtained by considering the area as made up of infinitesimally thin concentric rings instead of “pizza slices.”

1.1.6. † Isaac Newton’s Philosophiae Naturalis Principia Mathematica (1687) is arguably the most important scientific work of all time. The very first proposition in this work is Kepler’s law of equal areas. The law says that planets sweep out equal areas in equal times:

   [Figure: an orbit with two equal areas swept out in equal times $\Delta t$]

   Newton’s proof uses nothing but very simple infinitesimal geometry.

   [Figure: Newton’s construction with the points S, A, B, C, c and b]

   In an infinitely small period of time the planet has moved from A to B. If we let an equal amount of time pass again then the planet would continue to c if it was not for the gravity of the sun, which intervenes and deflects the planet to C. Since the time it takes for the planet to move from B to C is infinitely small, the gravitational pull has no time to change direction from its initial direction BS, thus causing cC to be parallel to BS.

   (a) Conclude the proof of the law.
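The “pizza slice” idea of problem 1.1.1 can be explored numerically. The following is only an illustrative sketch, not part of the worksheet: it cuts a circle into n equal slices, treats each slice as a triangle, and adds up the areas. The function name `sliced_circle_area` and the particular slice counts are assumptions made for this example.

```python
import math

def sliced_circle_area(r, n):
    """Total area of n triangular slices inscribed in a circle of radius r."""
    half_angle = math.pi / n
    base = 2 * r * math.sin(half_angle)   # chord across one slice
    height = r * math.cos(half_angle)     # distance from the centre to that chord
    return n * 0.5 * base * height

r = 1.0
# Area of the triangle with the circumference as base and the radius as height.
triangle_area = 0.5 * (2 * math.pi * r) * r

for n in (6, 60, 600, 6000):
    print(n, sliced_circle_area(r, n), triangle_area)
```

As the slices get thinner the total creeps up towards the area of the triangle, which is what the worksheet asks us to imagine happening “at the micro level.”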
§ 1.2. The derivative

§ 1.2.1. Lecture worksheet

The idea that $dy/dx$ is the slope of the graph of $y(x)$ is very useful. It has many faces besides the geometrical one:

• Geometrically, $dy/dx$ is the slope of the graph of $y$.

• Verbally, $dy/dx$ is the rate of change of $y$.

• Algebraically,
  \[ \frac{dy}{dx} = \frac{y(x + dx) - y(x)}{dx}. \]

• Physically, the rate of change of distance is velocity; the rate of change of velocity is acceleration.

1.2.1. Sketch the graphs of the distance covered, the speed, and the acceleration of a sprinter during a 100 meter race, and explain how your graphs agree with the above characterisations of these quantities.

Of course the rate of change of $y(x)$ is generally different for different values of $x$. We use $y'(x)$ to denote the function whose value is the rate of change of $y(x)$. We call $y'(x)$ the derivative of $y(x)$. Thus $y(x)$ is the primitive function, meaning the starting point, while $y'(x)$ is merely derived from it. For example, $y'(0) = 3$ and $y'(1) = -1$ means that the function is at first rising quite steeply but is later coming back down, albeit at a less rapid rate.

1.2.2. Express symbolically in terms of derivatives:

   (a) The volume of the arctic ice $V(t)$ is shrinking.

   (b) The population $P(t)$ grows at a rate of 10% per unit time.

§ 1.2.2. Problems

1.2.3. The length $\sigma$ is called the subtangent:

   [Figure: a curve with the infinitesimal triangle $dx$, $dy$, $ds$, the height $y$, and the subtangent $\sigma$ along the x-axis]

   (a) Express $\sigma$ in terms of $y$ and $y'$.

   (b) A famous curve has “constant subtangent”; indeed this is how it was usually referred to in the 17th century. Which curve is it?

1.2.4. The derivative $y'(x)$ represents the “instantaneous” rate of change. In the case of a moving object the derivative of its distance is the velocity “at a given instant.” Nevertheless it is useful to read the fractional interpretation $dy/dx$ of the derivative as “so many steps in $y$ for so many steps in $x$.” This is what we do for example when we speak of so-and-so many “miles per hour.” It can be confusing that concepts like “per hour” or “per step in the x-direction” occur in the description of something that is supposedly instantaneous and not at all ongoing for hours.

   The confusion can be illustrated with the following scenario. A police officer stops a car.

   OFFICER: The speed limit here is 60 km/h and you were going 80.

   DRIVER: 80 km/h? That’s impossible. I have only been driving for ten minutes.

   OFFICER: No, it doesn’t mean that you have been driving for an hour. It means that if you kept going at that speed for an hour you would cover 80 km.

   DRIVER: Certainly not. If I kept going like that I would soon smash right into that building there at the end of the street.

   (a) How can the officer better explain what a speed of 80 km/h really means?

   Imagine an electric train travelling frictionlessly on an infinite, straight railroad. The train is running its engine at various rates, speeding up and slowing down accordingly. Then at a certain point it turns off the engine. Of course the train keeps moving inertially.

   (b) Explain how this image captures both the “instantaneous” and the “per hour” aspect of a derivative in a concrete way.

1.2.5. Sometimes we consider the derivative of the derivative, or the “second derivative,” $y''$. The second derivative is also denoted $\frac{d^2y}{dx^2}$.

   (a) Argue that this notation makes algebraic sense by considering
   \[ \frac{d}{dx}\left(\frac{dy}{dx}\right) \]

   (b) Also write out the meaning of
   \[ y''(x) = \frac{y'(x + dx) - y'(x)}{dx} \]
   and show that it leads to the same result.

1.2.6. “In the fall of 1972, President Nixon announced that the rate of increase of inflation was decreasing. This was the first time a sitting president had used the third derivative to advance his case for re-election.” (Notices of the AMS, Oct. 1996, 43(10), p. 1108.)

   If Nixon was speaking of $f'''$, what is $f$?

   □ rate of change of quantity of goods purchasable by one dollar
   □ quantity of goods purchasable by one dollar
   □ quantity of goods accumulated from t=0 until now by someone spending one dollar per unit time
   □ net change in quantity of goods purchasable by one dollar from t=0 until now
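The algebraic and physical faces of the derivative described in §1.2.1 can be tried out on a computer by using a small but finite step in place of the infinitesimal dx. The sketch below is only an approximation under that assumption; the distance function `distance` (and its coefficient 4.9) is a hypothetical example chosen for illustration, and the helper name `rate_of_change` is likewise not from the text.

```python
def rate_of_change(y, x, dx=1e-4):
    """Difference quotient (y(x + dx) - y(x)) / dx with a small finite dx."""
    return (y(x + dx) - y(x)) / dx

def distance(t):
    # Hypothetical distance (in metres) covered after t seconds.
    return 4.9 * t**2

def velocity(t):
    # Rate of change of distance.
    return rate_of_change(distance, t)

def acceleration(t):
    # Rate of change of velocity, i.e. the second derivative of distance.
    return rate_of_change(velocity, t)

for t in (0.0, 1.0, 2.0):
    print(t, distance(t), velocity(t), acceleration(t))
```

The velocity changes from moment to moment while the acceleration stays essentially constant, which matches the reading of the second derivative as “the derivative of the derivative” in problem 1.2.5.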

§ 1.3. Derivatives of polynomials

§ 1.3.1. Lecture worksheet

To find the derivative of $y(x)$ we should:

• let $x$ increase by an infinitesimal amount $dx$;

• calculate the corresponding change in $y$, which is denoted $dy$;

• divide the two to obtain the rate of change $dy/dx$.

At the final stage we typically discard any remaining terms involving $dx$ on the right hand side, since $dx$ is infinitely small. We cannot, however, discard all $dx$’s from the outset. This is because even though it is infinitely small, it still has an impact when considered in relation to $dy$. This is the way any small numbers work. If we have $5 + 0.00001$ then the second term can pretty much be discarded since it is so insignificant compared to the first. However, if we have $5 + \frac{0.00001}{0.000001}$ then in fact those tiny numbers become very significant indeed, in this case even outweighing the “big” number and making the end result 15. Thus 0.00001 and 0.000001, though tiny on their own, become big when taken in ratio. In the same way, infinitesimal terms can be discarded in expressions like $5 + dx$ but not in expressions like $dy/dx$. It is therefore safest to do our discarding only at the final step of our three-step plan for finding derivatives, since that is when we are done dividing.

In the case of $y(x) = x$ this goes as follows. Suppose $x$ increases by $dx$. What is $dy$, the corresponding change in $y$? Quite clearly $dy = dx$ in this case, since the function “doesn’t do anything” to the variable, but rather merely passes it along, whence the change in output equals the change in input. The derivative, therefore, is
\[ \frac{dy}{dx} = \frac{dx}{dx} = 1 \]
That is to say, the rate of change of $y = x$ is 1; its slope is 1; it’s always heading one step up for each step over.

For the derivative of $y = x^2$ we follow the same plan. Suppose $x$ increases by $dx$. What is the corresponding $dy$? It is
\[ dy = (x + dx)^2 - x^2 = 2x\,dx + (dx)^2 \]
so
\[ \frac{dy}{dx} = \frac{2x\,dx + (dx)^2}{dx} = 2x + dx \]
Since $dx$ is so small we can throw it away. Thus the derivative is $\frac{dy}{dx} = 2x$. Note that the calculations correspond to this picture:

[Figure: a square of side $x$ enlarged by $dx$ on two sides, with labels $x$ and $dx$]

1.3.1. Find the derivative of $x^3$ and draw the corresponding picture. Thus, as its side length grows, the volume of a cube grows at a rate of ____ times its [side length, diagonal, surface area, volume, number of sides].

The pattern continues, giving the general differentiation rule $(x^n)' = nx^{n-1}$. This rule works also for non-integer exponents, as we shall see in problems 1.3.3, 1.3.4, 1.3.5, 1.5.3, 1.5.4.

1.3.2. (a) What is the formula for the volume of a sphere? (Just state it for now. We shall prove it in problem 4.1.1.)

   (b) Take its derivative with respect to the radius. What is the geometrical meaning of the result? The derivative of the volume of a sphere with respect to its radius is: [the radius, the radius squared, the surface area of the sphere, the area of the equatorial circle, the volume of the sphere, the volume of the circumscribed cube, none of the above].

   (c) Draw the corresponding picture and compare with problem 1.3.1.

   (d) The derivative of the area of a circle with respect to its radius is: [the radius, the radius squared, the area, the diameter, the area of the circumscribed square, the circumference, none of the above].

§ 1.3.2. Problems

1.3.3. Consider the function $f(x) = \sqrt{x}$. Geometrically, we can interpret $f(x)$ as the [area, side, diagonal, perimeter] of a square whose [area, side, diagonal, perimeter] is $x$. To investigate the derivative of $f(x)$, we let $x$ increase by $dx$ and look for the change $df$ in $f(x)$. We can visualise it like this:

   [Figure: a big white square bordered by shaded rectangles, with labels involving $x$ and $dx$]

   where (with everything expressed in terms of $x$ and $dx$):

   (a) area of big white square =

   (b) total shaded area =

   (c) short side of each shaded rectangle =

   (d) Therefore, $df/dx$ =
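The three-step plan of §1.3.1 can be imitated with exact fractions, using a small but finite number in place of the infinitesimal dx. This is only a sketch under that assumption; the helper names `difference_quotient` and `square` are chosen for illustration and are not from the text.

```python
from fractions import Fraction

def difference_quotient(y, x, dx):
    """Let x increase by dx, compute the change dy, then divide the two."""
    dy = y(x + dx) - y(x)
    return dy / dx

def square(x):
    return x * x

# Tiny numbers can be neglected in a sum, but not when taken in ratio.
print(5 + Fraction(1, 100000))                         # barely different from 5
print(5 + Fraction(1, 100000) / Fraction(1, 1000000))  # 15

x = Fraction(3)
for dx in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    # Each quotient equals 2x + dx exactly, so only the leftover dx shrinks.
    print(dx, difference_quotient(square, x, dx))
```

Because the arithmetic is exact, the leftover term dx is visible in every quotient, and discarding it at the very end leaves 2x, as in the calculation for $y = x^2$.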
1.3.4. Find the derivative of $1/x$ algebraically and confirm that it agrees with the rule $(x^n)' = nx^{n-1}$. Hint: after writing out $dy$, combine the terms on a common denominator.

1.3.5. A pizza is to be shared among $x$ friends. They cut it into so many equal pieces. Then one more friend shows up. The pizza now has to be divided into $x + 1$ pieces, but the cutting had already taken place. Therefore each person cuts off an $x$th piece of their slice and gives it to the newcomer.

   (a) How much smaller did each piece of pizza become?

   (b) This illustrates the fact that the derivative of $f(x) =$ ____ is $f'(x) =$ ____.

   (c) Does everyone have the same amount of pizza in the end?

1.3.6. Imagine a string wrapped around the earth’s equator. How much longer would the string need to be for it to be able to be raised one meter above the earth’s surface at all points? What if you used a beach ball in place of the earth? By considering the derivative of the circumference of a circle with respect to its radius we see that the results are [equal, proportional to $r$, proportional to $r^2$, proportional to $r^3$].

§ 1.4. Derivatives of elementary functions

§ 1.4.1. Lecture worksheet

1.4.1. Explain why investigating the derivatives of sine and cosine leads to the following figure.

   [Figure: unit circle with radii marked 1, angle $\theta$ and increment $d\theta$, legs $\sin\theta$ and $\cos\theta$, and an infinitesimal triangle with legs $d\sin\theta$ and $-d\cos\theta$]

   (a) To prove that the angles marked as equal really are equal, which of the following are useful?

   □ addition formula for cosine
   □ addition formula for sine
   □ angle sum of triangle
   □ radius of circle meets circumference at right angle
   □ product rule of differentiation
   □ Pythagorean Theorem

   (b) If degrees were used instead of radians, which aspect(s) of the figure, if any, would become invalid?

   □ parts marked 1
   □ parts marked $d\theta$
   □ parts marked $\sin(\theta)$ and $\cos(\theta)$
   □ parts marked $d\sin(\theta)$ and $-d\cos(\theta)$

   (c) Complete the ratio based on similar triangles: $\sin(\theta) / 1 =$ ____ / ____

In §A.5 we saw that exponential functions have the property that they grow in proportion to their size.

1.4.2. Formulate this in terms of derivatives.

In fact, the number $e$ we mentioned there can be defined as the number such that $(e^x)' = e^x$. In other words, $e$ is the base for the exponential function that is its own derivative.

1.4.3. To find the derivative of $\ln(x)$, write $y = \ln(x)$. Then the derivative we seek is $\frac{dy}{dx}$.

   (a) Rewrite $y = \ln(x)$ in exponential form and find $\frac{dx}{dy}$.

   (b) Invert the fraction to find $\frac{dy}{dx}$.

   (c) What is the derivative of $\ln(x)$? Of course the answer should be expressed purely in terms of $x$.

§ 1.4.2. Problems

1.4.4. Find the derivative of $y = \arcsin(x)$ by differentiating $x = \sin(y)$ and inverting the fraction in the manner of problem 1.4.3.

   Later we shall see a more geometrical way of arriving at this result in problem 4.2.4.

1.4.5. Derivative of arctangent. Recall the geometrical definitions of the tangent and arctangent functions from §A.3:

   [Figure: unit circle with radius 1, angle $\theta$ and the segment $\tan\theta$; second figure with $x$, $y = \arctan x$ and the increments $dx$ and $dy$]

   Let us find the derivative of the arctangent. In other words we are looking for $dy/dx$. In the figure I made $x$ increase by an infinitesimal amount $dx$ and marked the corresponding change in $y$. We need to find how the two are related. To do this I drew a second circle, concentric with the first but larger, which cuts off an infinitesimal triangle with $dx$ as its hypotenuse.
   (a) Show that this infinitesimal triangle is similar to the large one that has $x$ as one of its sides.

   (b) By what factor is the second circle larger than the first? (Hint: Find the hypotenuse of the triangle with $x$ in it. Remember that the first circle was a unit circle.)

   (c) Use this to express the “arc” leg of the infinitesimal triangle as a multiple of $dy$.

   (d) Find $dy/dx$ by similar triangles.

1.4.6. If $f(x) = \arctan(x) + \arctan(1/x)$, compute $f(1)$, $f(-1)$ and $f'(x)$, and explain the “paradox.”

§ 1.5. Basic differentiation rules

§ 1.5.1. Lecture worksheet

Above we found the derivatives of various standard functions. We must now consider how to find derivatives of functions that are built up by combining these functions in various ways.

Let us start with these simple rules:

   $(f + g)' = f' + g'$   (sum rule)
   $(cf)' = cf'$   (coefficient rule)

These rules are quite evident already without any calculations.

1.5.1. Justify these rules in purely verbal, “common-sense” terms.

A less obvious rule is:

   $(fg)' = f'g + fg'$   (product rule)

1.5.2. Prove the product rule by viewing $fg$ as the area of a rectangle with sides $f$ and $g$ and considering how the area changes as $x$ (the implied variable) grows by $dx$.

§ 1.5.2. Problems

1.5.3. Find the derivative of $1/x$ by letting $y = 1/x$ and differentiating $xy$.

1.5.4. Come up with a similar proof for $1/\sqrt{x}$.

1.5.5. To find the derivative of a quotient $f/g$ of two functions we can simply write it as $f \cdot \frac{1}{g}$ and apply the product rule. So there is really no need for a separate rule. However, we can save ourselves some time by working out this product rule calculation once and for all, so that we have a rule for quotients “ready to go” for future reference. Do so.

1.5.6. What is the product rule for a product of three functions?

§ 1.6. The chain rule

§ 1.6.1. Lecture worksheet

The final differentiation rule we need tells us what happens when one function is “trapped” inside another (like the links of a chain):

   $(f(g(x)))' = f'(g(x)) \cdot g'(x)$   (chain rule)

1.6.1. Argue that the chain rule can be written
   \[ \frac{d}{dx} f(g(x)) = \frac{df}{dg}\,\frac{dg}{dx} \]
   and use this to give an algebraic justification for its truth.

The chain rule is typically quite evident when its meaning is spelled out for a real-world example; in fact you would probably often use it intuitively without even thinking about it as a formal rule. Consider for example the following scenario. A scientist observes that the boundary of a polar ice cap is receding by 3 km/year. The boundary is currently 2000 km from the pole. The ice cap may be considered a circle centered at the pole.

1.6.2. How fast is the area of the ice cap shrinking? What does this have to do with the chain rule?

1.6.3. Explain how the composite function of problem A.2.3 can be used to illustrate the chain rule. Hint: it is probably best to use specific numbers for illustration purposes.

§ 1.7. Limits

§ 1.7.1. Lecture worksheet

Some people find the notions of infinitesimals and the infinitely small disagreeable. Mathematics should not be built on such mysterious and quasi-metaphysical notions, they say. No, give me good old real numbers, they say; otherwise it’s not mathematics.

This kind of conservatism can be accommodated using the notion of limits. The limit of $f(x)$ as $x$ goes to $a$, or in symbols $\lim_{x \to a} f(x)$, means the number that $f(x)$ approaches as $x$ is taken closer and closer to the given number $a$. Thus for example $\lim_{x \to \infty} 1/x = 0$ because 1 divided by something very big is virtually zero. Also $\lim_{x \to 0} 1/x = \infty$ because as $x$ becomes closer and closer to 0 (think of $x = 0.0001$ and such numbers) the function $1/x$ becomes very big and grows beyond bounds.

1.7.1. Actually we should be a bit more careful and write $\lim_{x \to 0^+} 1/x = \infty$ and $\lim_{x \to 0^-} 1/x = -\infty$. How so? Illustrate with a figure.

1.7.2. What is the visual meaning of $\lim_{x \to a} f(x) = f(a)$ in terms of the graph of $f$? In such cases we say that $f(x)$ is continuous.
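The two limits of $1/x$ mentioned in §1.7.1 can be made tangible by simply tabulating values. This is a minimal sketch, not a proof: the sample points are arbitrary choices, only positive values of x are tried, and the function name `reciprocal` is introduced just for this example.

```python
def reciprocal(x):
    return 1 / x

# x growing without bound: 1/x shrinks towards 0.
for x in (10.0, 1e3, 1e6, 1e9):
    print(x, reciprocal(x))

# x shrinking towards 0 through positive values: 1/x grows beyond bounds.
for x in (0.1, 1e-3, 1e-6, 1e-9):
    print(x, reciprocal(x))
```

A table like this only suggests what the limit is; the limit itself is a statement about what the values approach, not about any single entry.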
The principles of the calculus can be formulated in terms of limits. Then instead of speaking of an infinitesimal increment $dx$ we can speak of the limit of a finite increment $\Delta x$ as $\Delta x \to 0$. Thus for example the derivative may be more formally defined as
\[ \lim_{\Delta x \to 0} \frac{y(x + \Delta x) - y(x)}{\Delta x} \quad \text{instead of} \quad \frac{y(x + dx) - y(x)}{dx} \]

1.7.3. Write out the proof that $(x^2)' = 2x$ in both manners of expression side by side.

The limit approach has some advantages over freewheeling use of infinitesimals in certain technical contexts. However, as the above example shows, this comes at the cost of often needless pedantry. The infinitesimal manner of speaking is simpler and more suggestive, and we shall therefore stick to it throughout this book. But it will do us well to know that we can fall back on a more formal explication in the language of limits if we should run into tricky problems where the meaning of infinitesimals becomes unclear.

§ 1.7.2. Problems

1.7.4. L’Hôpital’s rule. Suppose you want to compute the limit
   \[ \lim_{x \to a} \frac{f(x)}{g(x)} \]

   (a) Explain why this would be easy if $f$ and $g$ were continuous functions and $f(a) = 8$ and $g(a) = 2$.

   If $f(a)$ and $g(a)$ are both zero, however, the situation is trickier. But consider this analogy. Suppose $f(x)$ and $g(x)$ represent the distance from the finish line of two sprinters in a race $x$ seconds after they took off.

   (b) What is the meaning of $f(a) = g(a) = 0$ in this context?

   Suppose now that one of the runners was running twice as fast as the other during the last second of the race. Imagine watching a slow-motion replay of the finish.

   (c) Argue that this makes it clear that
   \[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \]

   Thus in these kinds of “0/0” situations we can differentiate top and bottom of the fraction without altering the value of the limit. This often makes it possible to compute the limit.

1.7.5. The number $e$ can be defined as a limit. This idea arises most naturally in the context of economics. Imagine that we have a bank account with a 100% interest rate. (This is a fantasy, no doubt, but mathematically speaking it is the simplest possible interest rate, so it is the natural starting point for a “pure” mathematical theory of interest.) In its simplest form this would mean that, once a year, on December 31, the bank looked at the balance of our account (or maybe the average balance across the year) and then gave us 100% of that amount. So if we started with $1 we would have $2 on January 1, and $4 the year after that. However, why should interest calculations be based on the notion of a calendar year? There is no intrinsic reason for this. The bank could just as well give us interest payments every month or every week or whatever.

   If we simply deposit an initial amount $B_0$ and leave it to grow, the formula for our balance after $t$ years will be
   \[ B(t) = B_0 \left(1 + \frac{100}{100}\right)^{t} \]
   if the interest is added yearly,
   \[ B(t) = B_0 \left(1 + \frac{100/12}{100}\right)^{12t} \]
   if the interest is added monthly, and
   \[ B(t) = B_0 \left(1 + \frac{100/52}{100}\right)^{52t} \]
   if the interest is added weekly.

   (a) What is the balance after one year in each of these scenarios?

   (b) ⋆ Explain how these differences arise. Hint: consider the phenomenon of “interest upon interest.”

   Thus the account holder should insist on more frequent compounding. And so should the mathematician, because mathematically there is no reason why the theory of interest should be based on any particular chunks of time that are nothing but arbitrary social conventions. Better then to denote the number of compounding occasions in a year by $n$ and let $n \to \infty$ instead of fixing it at 1, 12, 52, or any other arbitrary number.

   (c) Write down the balance formula for $n$ compounding occasions per year.

   (d) What is the balance after one year in this case?

   (e) Let $n \to \infty$ in this expression. This limit is the number $e$.

   (f) In §1.4 we said that $e$ could be defined as the number such that $(e^x)' = e^x$. Explain how this is related to the economic definition.

§ 1.8. Reference summary

§ 1.8.1. Meaning of derivative

$y'$ = slope of graph of $y$
     = rate of change of $y$
     = number of units by which $y$ increases per unit increase in $x$ (at current rate)

$y'$ positive $\iff$ graph of $y$ heading upwards
$y'$ negative $\iff$ graph of $y$ heading downwards
$y' = 0$ $\iff$ graph of $y$ horizontal
magnitude of $y'$ = steepness of graph of $y$

$\Delta x$        change in $x$; difference between two $x$-values ($\Delta$ = delta = difference)
$dx$              “infinitesimal” or “infinitely small” change in $x$ ($d$ = difference)
$\Delta y$, $dy$  change in $y$ resulting from the change in $x$

§ 1.8.2. Physical meaning of derivative

\[ \frac{d}{dt}\,\text{distance} = \text{velocity} \]
\[ \frac{d}{dt}\,\text{velocity} = \text{acceleration} \]

§ 1.8.3. Derivatives of elementary functions

function    derivative     function       derivative
$x^n$       $nx^{n-1}$     $\arctan x$    $1/(1 + x^2)$
$\sin x$    $\cos x$       $\arcsin x$    $1/\sqrt{1 - x^2}$
$\cos x$    $-\sin x$      $\arccos x$    $-1/\sqrt{1 - x^2}$
$\ln x$     $1/x$          $\log_a(x)$    $1/(x \ln(a))$
$e^x$       $e^x$          $a^x$          $a^x \ln(a)$

§ 1.8.4. Derivatives of composite functions

$(f(g(x)))' = f'(g(x)) \cdot g'(x)$   chain rule
$(fg)' = f'g + fg'$                   product rule
$(f/g)' = (f'g - fg')/g^2$            quotient rule
$(f + g)' = f' + g'$                  sum rule
$(cf)' = cf'$                         coefficient rule

§ 1.8.5. Problem guide

• Differentiate a given function when the function consists of ...

– ... a number.

   Derivative is zero. (Derivative means rate of change and numbers don’t change.)

– ... a number times something.

   Coefficient rule. Keep the number in front and differentiate the something.

   $(5x^2)' = 10x$

– ... an $x$ inside a root or in the denominator of a fraction.

   Rewrite in form $x^n$ and apply differentiation rule for this form.

   function      equivalent form    derivative
   $\sqrt{x}$    $x^{1/2}$          $\frac{1}{2\sqrt{x}}$
   $1/x$         $x^{-1}$           $-\frac{1}{x^2}$
   $1/x^2$       $x^{-2}$           $-\frac{2}{x^3}$

– ... function times function.

   Product rule. Differentiate one and keep the other as it is, then vice versa, and add the results.

   function        derivative
   $x^3 \sin(x)$   $3x^2 \sin(x) + x^3 \cos(x)$
   $xe^x$          $e^x + xe^x$

– ... function divided by function.

   Quotient rule.

   \[ \left(\frac{2x}{1 + x^2}\right)' = \frac{2(1 + x^2) - 2x(2x)}{(1 + x^2)^2} = \frac{2 - 2x^2}{(1 + x^2)^2} \]

– ... function plus or minus function.

   Sum rule. Differentiate each separately and keep the sign in between.

   $(5x + 3x^3)' = 5 + 9x^2$

– ... one expression contained inside another.

   Chain rule. Differentiate the outer function, leaving the inside as it is, then multiply by the inner derivative.

   In other words: Replace inside function by a single letter $g$, and differentiate as if $g$ was the variable. Then substitute back the full expression for the inside function in place of $g$ in the result. Then multiply the result by the derivative of $g$ with respect to the true variable.

   function to differentiate   outer        inner       derivative
   $f(g(x))$                   $f$          $g$         $f'(g(x)) \cdot g'(x)$
   $\sqrt{1 - x}$              $\sqrt{g}$   $1 - x$     $\frac{1}{2\sqrt{1 - x}} \cdot (-1)$
   $\sin(x^2)$                 $\sin(g)$    $x^2$       $\cos(x^2) \cdot 2x$
   $\sin^3(x)$                 $g^3$        $\sin(x)$   $3\sin^2(x) \cdot \cos(x)$
   $(1 + 2x)^8$                $g^8$        $1 + 2x$    $8(1 + 2x)^7 \cdot 2$
   $\frac{1}{x + x^3}$         $g^{-1}$     $x + x^3$   $-\frac{1}{(x + x^3)^2} \cdot (1 + 3x^2)$
– ... an expression inside another inside another inside another ...
Apply chain rule repeatedly. The inside derivatives will spit out inside derivatives of their own. Just keep multiplying.

Differentiate $f(x) = \left(\sin(\sqrt{x})\right)^2$.
$f' = \left(\left(\sin(\sqrt{x})\right)^2\right)' = 2\sin(\sqrt{x})\left(\sin(\sqrt{x})\right)' = 2\sin(\sqrt{x})\cos(\sqrt{x})(\sqrt{x})' = 2\sin(\sqrt{x})\cos(\sqrt{x})\left(\tfrac{1}{2}x^{-1/2}\right) = \sin(\sqrt{x})\cos(\sqrt{x})/\sqrt{x}$.

– ... a very complicated algebraic expression involving roots, exponents and/or fractions.
If using the usual rules seems too daunting, try taking the logarithm of both sides of the equation, simplify with logarithm laws, and then use implicit differentiation.

• Estimate how much $y$ changes ($\Delta y$) when $x$ goes from some value $x_0$ to some other value $x_1$, given derivative of $y$.
Since $\frac{\Delta y}{\Delta x} \approx \frac{dy}{dx}$, we get $\Delta y \approx \frac{dy}{dx} \cdot \Delta x$. Fill in $\Delta x = x_1 - x_0$ and $\frac{dy}{dx} = y'(x_0)$.

• Estimate $y'(a)$ given the graph or a table of values for $y(x)$.
Focus on two $x$-values not too far apart, both in the vicinity of $x = a$. Find the difference between them. This is $\Delta x$. Determine the corresponding change $\Delta y$ in $y$ (i.e., how much $y$ changes when you go from one of your $x$’s to the other). Divide to obtain $\frac{\Delta y}{\Delta x} \approx \frac{dy}{dx}$.

• Find the derivative of $y(x)$ from first principles.
Let $x$ increase by an infinitesimal amount $dx$; calculate the corresponding change $dy = y(x + dx) - y(x)$ in $y$; divide the two to obtain the rate of change $\frac{dy}{dx}$.

§ 1.8.6. Limits

$\lim_{x \to a} f(x)$ = the number that $f(x)$ approaches as $x$ gets closer and closer to $a$

L’Hôpital’s rule: If, when trying to compute the limit
$\lim_{x \to a} \frac{f(x)}{g(x)}$,
you get $\frac{0}{0}$ or $\frac{\infty}{\infty}$ when plugging in $x = a$, you can take the derivative of both numerator and denominator without altering the value of the limit:
$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$

• Find $\lim_{x \to a} f(x)$.
Plug $a$ into $f(x)$. If the result is a number: this is the answer. If the result is $\infty$: determine whether $f(x)$ is positive or negative as $x$ approaches $a$; the answer is $+\infty$ or $-\infty$ accordingly. If the result is of the form $0/0$ or $\infty/\infty$: rewrite $f(x)$ and try again. Rewriting could involve factoring, cancelling, algebraic/trigonometric/logarithmic identities, and L’Hôpital’s rule. If $f(x)$ is oscillating (e.g. $\sin(x)$) without approaching any one value: limit does not exist.

§ 1.8.7. Examples

Differentiate:

$\cos(x^2)$
    $-2x\sin(x^2)$

$x^2 + \sin(x)$
    $2x + \cos(x)$

$5\ln(x)$
    $5 \cdot \frac{1}{x}$

$xe^{-x^2}$
    $e^{-x^2} - 2x^2e^{-x^2}$

$xe^{1/x}$
    $e^{1/x} + xe^{1/x}\left(-\frac{1}{x^2}\right) = e^{1/x}\left(1 - \frac{1}{x}\right)$

$\left(\frac{x+1}{x-1}\right)^4$
    $4\left(\frac{x+1}{x-1}\right)^3 \frac{1\cdot(x-1) - (x+1)\cdot 1}{(x-1)^2} = 4\left(\frac{x+1}{x-1}\right)^3 \frac{-2}{(x-1)^2} = -8\,\frac{(x+1)^3}{(x-1)^5}$

$xe^{x^2+3x-2}$
    $e^{x^2+3x-2} + xe^{x^2+3x-2}(2x+3) = e^{x^2+3x-2}(1 + x(2x+3)) = e^{x^2+3x-2}(2x^2 + 3x + 1)$

$2^x$
    $2^x \cdot \ln 2$

$\arctan\frac{1}{x}$
    $\frac{1}{1 + \frac{1}{x^2}}\left(-\frac{1}{x^2}\right) = -\frac{1}{x^2+1}$
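Several of the derivative formulas in the examples above can be sanity-checked with the quotient $\Delta y / \Delta x$ for a small $\Delta x$, which should come out close to the formula. The following is a minimal Python sketch of such a check, not part of the original text: the helper names `central_difference` and `relative_error`, the step size `h = 1e-6`, and the test point $x = 2$ are illustrative assumptions.

```python
import math


def central_difference(f, x, h=1e-6):
    """Estimate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# label -> (function, derivative formula as stated in the examples)
EXAMPLES = {
    "x^2 + sin(x)": (lambda x: x**2 + math.sin(x),
                     lambda x: 2 * x + math.cos(x)),
    "5 ln(x)": (lambda x: 5 * math.log(x),
                lambda x: 5 / x),
    "x e^(-x^2)": (lambda x: x * math.exp(-x**2),
                   lambda x: math.exp(-x**2) - 2 * x**2 * math.exp(-x**2)),
    "x e^(1/x)": (lambda x: x * math.exp(1 / x),
                  lambda x: math.exp(1 / x) * (1 - 1 / x)),
    "((x+1)/(x-1))^4": (lambda x: ((x + 1) / (x - 1))**4,
                        lambda x: -8 * (x + 1)**3 / (x - 1)**5),
    "x e^(x^2+3x-2)": (lambda x: x * math.exp(x**2 + 3 * x - 2),
                       lambda x: math.exp(x**2 + 3 * x - 2) * (2 * x**2 + 3 * x + 1)),
    "2^x": (lambda x: 2**x,
            lambda x: 2**x * math.log(2)),
    "arctan(1/x)": (lambda x: math.atan(1 / x),
                    lambda x: -1 / (x**2 + 1)),
}


def relative_error(f, df, x):
    """Gap between the numerical estimate and the formula, scaled by the formula's size."""
    exact = df(x)
    return abs(central_difference(f, x) - exact) / max(1.0, abs(exact))


for label, (f, df) in EXAMPLES.items():
    print(f"{label:18s} formula = {df(2.0):+.6f}   relative error = {relative_error(f, df, 2.0):.1e}")
```

A small relative error for every entry suggests the formula is right at that point; it is evidence, not a proof, since only one value of $x$ is tested.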
2 APPLICATIONS OF DIFFERENTIATION

§ 2.1. Maxima and minima

§ 2.1.1. Lecture worksheet

To find the maximum or minimum values of a function of one variable we look for $x$’s such that $f'(x) = 0$.

2.1.1. Explain why. Hint: It may be easiest to argue this point in contrapositive form: if $f' \neq 0$ then it can’t be a max or min.

2.1.2. To get a more hands-on feel for this, draw a closed curve with various squiggles in it on a piece of paper. Hold the paper up against the wall. Consider the tangent lines at the lowest and highest points. Rotate the paper in various ways and repeat.

Once found, the points where $f' = 0$ can be classified using second derivatives:

$f''$ positive $\implies$ minimum
$f''$ negative $\implies$ maximum

It is easy to remember which is which since a positive second derivative corresponds to a “happy mouth” shape and vice versa.

2.1.3. Explain why the classification works. Hint: What does $f''$ say about the slopes?

2.1.4. Prove that a square has greater area than any rectangle of the same perimeter.

According to ancient sources this principle used to be important for the purposes of division of arable land. Those ignorant of mathematics could be fooled into accepting a smaller plot by being led to believe that the value of a plot is determined by the number of paces around it. Then as now, it pays to be educated in mathematics.

2.1.5. The law of reflection says that when light reflects off a mirror, the angle of incidence $\alpha$ is equal to the angle of reflection $\beta$.

[Figure: the points $(0, a)$, $(b, c)$, and $(x, 0)$, with the angles $\alpha$ and $\beta$ marked.]

This follows from the principle that “nature does nothing in vain.” To prove this, suppose light has to travel from $(0, a)$ to $(b, c)$ via some point $(x, 0)$ on the $x$-axis.

(a) Express the distance travelled as a function of $x$.
(b) Find the derivative of this function and set it equal to zero.
(c) Show that the law of reflection follows. Hint: interpret the fractions in your equation as ratios of sides of the triangles in the figure.

§ 2.1.2. Problems

2.1.6. Newton writes in his Treatise of the Method of Fluxions and Infinite Series (1671): “When a quantity is the greatest or the least that it can be, at that moment it neither flows backwards nor forwards: for if it flows forwards or increases it was less, and will presently be greater than it is; and on the contrary if it flows backwards or decreases, then it was greater, and will presently be less than it is. Wherefore find its Fluxion . . . and suppose it to be nothing.” What result is Newton explaining? Translate his reasoning into modern calculus language.

2.1.7. The law of refraction says that when light passes from a medium where its velocity is $c_1$ to a medium where its velocity is $c_2$, the angles of incidence $\alpha$ and refraction $\beta$ are related by

$\frac{\sin(\alpha)}{\sin(\beta)} = \frac{c_1}{c_2}$
[Figure: the points $(0, a)$, $(x, 0)$, and $(b, c)$, with the angles $\alpha$ and $\beta$ marked.]

We can prove this in essentially the same way as the law of reflection (problem 2.1.5). Suppose light has to travel from $(0, a)$ to $(b, c)$ via some point $(x, 0)$ on the $x$-axis.

(a) Express the time it takes the light to travel this path as a function of $x$.
(b) Find the derivative of this function and set it equal to zero.
(c) Show that the law of refraction follows. Hint: interpret the fractions in your equation as ratios of sides of the triangles in the figure.
(d) ⋆ Place a coin at the bottom of a bowl. Place your eye so that your view of the coin is just obstructed by the edge of the bowl. Pour water into the bowl. You can now see the coin. Explain the result in terms of the law of refraction.

§ 2.2. Concavity

§ 2.2.1. Lecture worksheet

Geometrically, $f(x)$ tells us how high up we are, and $f'(x)$ whether we are heading up or down. What is the geometrical meaning of $f''(x)$? It is the derivative of $f'(x)$, so it says whether $f'(x)$ is increasing or decreasing. In terms of $f(x)$ this has to do with how the graph “bends.” We say that when $f''(x)$ is positive the graph is “concave up” and when it is negative the graph is “concave down.” A point where we switch from one to the other is called an “inflection point.”

2.2.1. What can you say about the concavity of a quadratic function?

2.2.2. Are the following true or false? Illustrate with figures. An inflection point could be:

(a) A maximum or a minimum.
(b) A point where $f' = 0$ yet no maximum or minimum occurs.
(c) A point where $f' \neq 0$.

2.2.3. (a) Argue that landing an airplane smoothly calls for a curve with an inflection point, since the flight path needs to be horizontal at the beginning and end of descent, yet sloping downwards in between.

In light of the above (and perhaps problem A.2.8a), we are inclined to try to model the descent path by a cubic polynomial $y(x) = ax^3 + bx^2 + cx + d$. Suppose descent starts at $(-L, H)$ and ends at $(0, 0)$.

(b) What is the meaning of $L$ and $H$?
(c) What can you infer about the values of the coefficients $a, b, c, d$ based on what is known about the path and its endpoint slopes?

Assume that the plane needs to maintain a constant horizontal speed $V$ throughout the descent. This is not entirely realistic; in reality planes do slow down some during descent. But it is not too far from the truth because the plane needs to keep a good speed so as to maintain flight aerodynamics and not go into free-fall mode, just like a waterskier starts to sink if he’s not being pulled fast enough.

(d) Express the vertical acceleration in terms of $L$, $H$ and $V$. Hint: find $dy/dt$ using the chain rule.
(e) This equation can be used to compute any one of the variables in terms of the other three. What would be the most realistic practical application of this?

§ 2.3. Tangent lines

§ 2.3.1. Lecture worksheet

Derivatives can be used to find the equation for a tangent line of a curve since they give the slope $m$ needed for the equation for a line $y = mx + b$. The $y$-intercept can then be determined by plugging in a known point.

2.3.1. Consider the parabola $y = x^2$.

(a) Find the equation for its tangent line in the point $(1, 1)$.
(b) Find the equation for its tangent line in a general point $(X, X^2)$.
(c) Do the same for the more general parabola $y = ax^2$.
(d) How can you characterise the $y$-intercept of these tangent lines in general, verbal terms?
(e) Use this information to give a calculus-free recipe for constructing tangent lines to parabolas.
This is in fact how tangents to parabolas were characterised over two thousand years ago by Apollonius in his classic Κωνικά, Prop. I.33.

§ 2.3.2. Problems

2.3.2. Focal property of the parabola. Parabolic reflectors concentrate all incoming rays parallel to their axis in a single point:

[Figure: a parabolic reflector with incoming parallel rays.]

This is useful for picking up as much as possible of a signal that has been weakened by travelling a long distance, such as communications from a satellite. It can also be used to focus the rays of the sun; legend has it that Archimedes set fire to enemy ships this way. The mirror also works in the converse direction: light originating at the focal point will be reflected into parallel beams. This is useful for making a limited amount of light reach as far as possible in a focussed direction, such as in the headlights of a car.

Let us prove that the focal property in fact holds. Suppose the parabola $y = x^2$ is a mirror. Consider a light ray parallel to the $y$-axis that comes in from above and strikes the parabola in the point where its slope is 1.

(a) What point is this?
(b) Use the law of reflection (problem 2.1.5) to find the slope of the ray after it has been reflected in the tangent line.
(c) What is the equation for the reflected ray?
(d) What is the $y$-intercept of the reflected ray?

Call this point $F$. The focal property says that all other rays are also reflected toward this point. To check that this is so, let $P = (X, X^2)$ be a general point on the parabola. Draw its tangent and let $T$ be its intersection with the $y$-axis.

(e) What are the coordinates of $T$? Hint: this was determined in problem 2.3.1.
(f) Find the lengths $FT$ and $FP$.
(g) Conclude the proof of the focal property of the parabola.

2.3.3. Newton’s method for finding roots of equations numerically. Tangent lines can be used to find roots of equations numerically. Suppose we are looking for the points where a certain function $y(x)$ is zero. Often we cannot determine these values directly by setting the function equal to zero, since this equation may be much too difficult to solve by hand. In such a case we can proceed as follows. Let $x_0$ be our best guess as to where the root might be. Evaluating $y(x_0)$ will probably show us that it was not quite zero as we had aimed for. However, we then compute the tangent line to the function at this point and find its $x$-intercept. If our guess was reasonably close to a root this will give us a new $x$-value, $x_1$, which is closer to the root, as the figure shows. Then we can repeat the process, getting closer and closer.

[Figure: successive approximations marked on the $x$-axis in the order $x_3$, $x_2$, $x_1$, $x_0$.]

(a) ⋆ This process only makes sense if we cannot solve $y(x) = 0$ but we can compute the tangent line in various points. Is it realistic for this to happen in practice?
(b) Given one of the $x_n$, find a formula for the next one, $x_{n+1}$. Hint: this is closely related to problem 1.2.3a.
(c) Apply this method to the equation $1/x - a = 0$. What expression for $x_{n+1}$ does this give? What is the limit of $x_n$ as $n \to \infty$?
(d) Explain briefly how this can be used to compute any quotient $A/B$ without actually dividing (i.e., without using the division algorithm you learned in school). Some computer systems use this method for division since it requires fewer steps than the usual division algorithm, and thus saves computing time.
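The procedure described in problem 2.3.3 (guess, tangent line, $x$-intercept, repeat) can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: the names `tangent_x_intercept` and `newton_iterates`, the sample equation $x^2 - 2 = 0$, the starting guess $x_0 = 2$, and the fixed number of steps are all arbitrary choices.

```python
def tangent_x_intercept(y, dy, x0):
    """Return the x-intercept of the tangent line to y at x0.

    The tangent line is t(x) = y(x0) + dy(x0) * (x - x0);
    solving t(x) = 0 gives x = x0 - y(x0) / dy(x0).
    """
    slope = dy(x0)
    if slope == 0:
        raise ZeroDivisionError("a horizontal tangent never meets the x-axis")
    return x0 - y(x0) / slope


def newton_iterates(y, dy, x0, steps=6):
    """Return the list [x0, x1, ..., x_steps] of successive guesses."""
    guesses = [x0]
    for _ in range(steps):
        guesses.append(tangent_x_intercept(y, dy, guesses[-1]))
    return guesses


# Hypothetical example: the positive root of y(x) = x^2 - 2, with y'(x) = 2x.
guesses = newton_iterates(lambda x: x**2 - 2, lambda x: 2 * x, x0=2.0)
for n, x in enumerate(guesses):
    print(f"x_{n} = {x:.15f}")
```

With this starting guess the first step gives $x_1 = 1.5$, and the later guesses settle on the positive root of $x^2 = 2$. The derivative is passed in as a separate function because the method needs the slope of the tangent line at each guess.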

[Figure for problem 2.3.2: the parabola with the point $P = (X, X^2)$ and the points $F$ and $T$ on the $y$-axis.]

§ 2.4. Conservation laws

§ 2.4.1. Lecture worksheet

Many useful physical laws are conservation laws, i.e., laws that say that some particular quantity is preserved. Mathematically speaking, proving a conservation law means proving that something has derivative zero with respect to time. In this section we shall derive a conservation law for motion in this way. As we shall see, proving that the derivative is zero follows at once from the basic laws of physics we summarised in §A.7.12. Thus this conservation law, and many others, are in a way really nothing but other basic laws in disguise (and a disguise, furthermore, that the calculus readily unmasks). Nevertheless conservation laws are so useful that physicists have made up names for the things that are preserved, such as “energy.”

Suppose we fire a projectile straight up into the air. If its velocity is great enough it will never fall back down. The minimum velocity for which this happens is called the escape velocity. Calculating it is an example of a situation where using a conservation law is very convenient. Let $m$ = mass of the projectile, $v$ = velocity of the projectile, $G$ = gravitational constant, $M$ = mass of the earth, $y$ = distance from the projectile to the center of the earth. I say that this conservation law holds:

$\frac{mv^2}{2} - G\frac{Mm}{y} = \text{constant}$

2.4.1. Check this by calculating the derivative with respect to time. Hint: recall from §A.7.12 that:

$ma = F = \text{gravity} = -G\frac{Mm}{y^2}$

(Later we shall reason our way to this conservation law in a more intuitive way, as opposed to merely checking it “after the fact,” as it were. See §4.4.)

How does this help us calculate the escape velocity? If we have just enough initial velocity to escape to $y = \infty$ we will get there with zero velocity $v = 0$.

2.4.2. Plug in these conditions to determine the value of the constant in the conservation law corresponding to this scenario.

2.4.3. Use the conservation law with this value for the constant to determine the escape velocity. Hint: when the projectile is fired, $y$ = radius of the earth.

2.4.4. ⋆ What are the physical names for the quantities in our conservation law?

§ 2.4.2. Problems

2.4.5. Commercial aircraft fly at an altitude of about 10,000 meters. With what velocity does a projectile need to be fired from the surface of the earth to reach this altitude?

2.4.6. The escape velocity for a black hole exceeds the speed of light. To what radius would the mass of the earth need to be compressed for it to become a black hole?

§ 2.5. Differential equations

§ 2.5.1. Lecture worksheet

In many real-world scenarios we want to know the value of a certain function but we know, initially, only its derivative. If I put some money in the bank I may want to know how much I will have some years from now, for example when I retire. But this is not what the bank tells me. Instead they tell me the interest rate; that is to say, the rate at which the money is growing, or the derivative. In physics it’s even worse. We may want to know for example the position of a satellite. But, like the bank, nature doesn’t tell us. We have to start with Newton’s law

Force = mass × acceleration,

and figure out the position from there. So nature tells us the second derivative (acceleration) of what we really want to know (position).

What we are given in these kinds of situations is a differential equation, i.e., not the function itself but some condition that its derivatives must fulfil. Differential equations are equations involving derivatives, such as $y' = y$. In words, this equation says: the rate of growth is equal to the current amount. So the more you have the faster it grows. Things that have this property include money and rabbits, as noted in §A.5.

A solution to a differential equation is a function that satisfies it. In our example $e^x$ is a solution, since if $y = e^x$ then $y' = e^x = y$. But there are also other solutions. You can multiply $e^x$ by any constant and it will still solve the equation: if $y = ce^x$ then $y' = ce^x = y$.

2.5.1. What is the real-world meaning of the constant $c$ in the case of money? In general, constants that occur in solutions to differential equations are determined by plugging in known initial conditions.

2.5.2. Match each differential equation with the real-world scenario it models, and explain the meaning of the variables and constants involved.

A. Growth of population with unlimited resources.
B. Growth of population with limited resources.
C. Motion of pendulum.
D. Body in free fall.
E. Predator-prey system (foxes/rabbits, shark/fish, etc.).
F. Predator-prey system with harvesting (hunting, fishing, etc.).
G. Growth of population with limited resources and harvesting.
H. Conventional two-army warfare.
I. Conventional army versus guerrilla. (Imagine the battle taking place in a jungle, with the guerrillas
hiding in the trees and bushes. The guerrillas can target enemies as in any battle, but the conventional army, since they cannot see the guerrillas, can only fire their machine guns into the jungle somewhat randomly, hoping to hit a guerrilla.)

i. $y'' = -ky$
ii. $y' = ky$
iii. $y'' = -k$
iv. $y' = ky(a - y) - b$
v. $y' = ky(a - y)$
vi. $x' = -by$, $\quad y' = -ax$
vii. $x' = ax - bxy - ex$, $\quad y' = -cy + dxy - ey$
viii. $x' = -by$, $\quad y' = -axy$
ix. $x' = ax - bxy$, $\quad y' = -cy + dxy$

2.5.3. The case of pendulum motion is more important than it might seem. We shall come back to it later, but for now we can use it to introduce an important distinction between two types of equilibria. What are the two positions in which a pendulum (with a rigid rod) can be in equilibrium? What is the qualitative difference between them? This is a useful metaphor for many less concrete situations.

This is similar to which of the following?

□ A ball rolling in a landscape of hills and valleys.
□ A ball thrown vertically into the air.
□ An elastic beam that is bent and then released.
□ A planet orbiting the sun.

2.5.4. A disease spreads in proportion to the number of encounters between infected and healthy individuals. Argue that this can be captured by a differential equation of the same form as the population growth model in problem 2.5.2, but with $a$, $y$, $k$ now corresponding to:

□ number of infected
□ infectiousness of disease
□ recovery rate
□ total population
□ number of uninfected
□ incubation time

Hence “the growth of a population is the spread of the disease of life,” so to speak. Differential equations often reveal analogies like this.

2.5.5. If we ignore air resistance the only force acting on a falling object is the constant gravitational acceleration ($g \approx 10$; see §A.7.12), which gives the differential equation $v' = -10$. Suppose you fire a gun straight up into the air. The initial velocity of the bullet is 1000 meters/second.

(a) Solve the differential equation for $v$ as a function of $t$ with the given initial condition.
(b) Find the height of the bullet as a function of time.
(c) With what velocity does the bullet strike the ground when it lands?

Of course this is unrealistic. The resistance of the air is considerable. Experience shows that it is roughly proportional to the square of the velocity. Then the differential equation is something like $v' = -10 - 0.004v|v|$.

(d) Why did I write $v|v|$ instead of $v^2$? Hint: Consider the difference between going up and coming down.
(e) With what velocity does the bullet strike the ground according to this model? Hint: Instead of solving the differential equation to find this out, use the reasonable assumption that the descending bullet will reach terminal velocity (i.e., a velocity at which it is no longer accelerating) before hitting the ground.

§ 2.5.2. Problems

2.5.6. It makes sense that Newton’s law $F = ma$ has acceleration in it, because to stand still and to move with constant velocity are physically equivalent. That is, no physical experiment can tell one state from the other. This was known to Galileo, who explained it as follows in his Dialogue Concerning the Two Chief World Systems (1632).

Shut yourself up with some friend in the main cabin below decks on some large ship, and have with you there some flies, butterflies, and other small flying animals. Have a large bowl of water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel beneath it. With the ship standing still, observe carefully how the little animals fly with equal speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into the vessel beneath; and, in throwing something to your friend, you need throw it no more strongly in one direction than another, the distances being equal; jumping with your feet together, you pass equal spaces in every direction. When you have observed all these things carefully (though doubtless when the ship is standing still everything must happen in this way), have the ship proceed with any speed you like, so long as the motion is uniform and not fluctuating this way and that. You will discover not the least change in all the effects named, nor could you tell from any of them whether the ship was moving or standing still. In jumping, you will pass on the floor the same spaces as before, nor will you make larger jumps toward the stern than toward the prow even though the ship is moving quite rapidly, despite the fact that during the time that you are in the air the floor under you will be going in a direction opposite to your jump. In throwing something to your companion, you will need no more force to get it to him whether he is in the direction of the bow or the stern, with yourself situated opposite. The droplets will fall as before into the vessel beneath without dropping toward the stern, although while the drops are
in the air the ship runs many spans. The fish in their water will swim toward the front of their bowl with no more effort than toward the back, and will go with equal ease to bait placed anywhere around the edges of the bowl. Finally the butterflies and flies will continue their flights indifferently toward every side, nor will it ever happen that they are concentrated toward the stern, as if tired out from keeping up with the course of the ship, from which they will have been separated during long intervals by keeping themselves in the air. And if smoke is made by burning some incense, it will be seen going up in the form of a little cloud, remaining still and moving no more toward one side than the other.

This means that physical laws cannot speak directly about velocity. An observer on the shore thinks the guy in the ship is moving; but the guy in the ship could claim that he is in fact standing still and that it is the guy on the shore that is moving. As we just saw, no physical experiment can settle their dispute, so they must both be considered to be equally right so far as physics is concerned. Nature does not distinguish between them, so her laws must be equally true for them both.

To illustrate this more formally, let the person on the shore be the origin of a coordinate system, and let the ship be traveling in the positive $x$-direction with constant velocity $v$. Now imagine releasing a butterfly inside the ship, in the manner described by Galileo. Suppose the butterfly moves in the $x$-direction only, and let $X(t)$ be its position in the coordinate system of an observer on the ship (i.e., taking a point inside the ship as the origin).

(a) Find the general formula for the position $x(t)$ of the butterfly in the coordinate system of the observer on the shore.
(b) Express the position, velocity, and acceleration of the butterfly in terms of both coordinate systems.
(c) What is the conclusion?

2.5.7. Rashevsky (Looking At History Through Mathematics, M.I.T. Press, 1968) proposed the following model of the increase of agnosticism on a historical timescale.

Assume that most people are receptive to common faiths while a small fraction $pN$ of the total population $N$ is naturally agnostic. Let’s say that the birth rate is $b$ and the death rate is $d$, so that $N' = (b - d)N$. The agnostic population $A$ will grow because the agnostics bring up their children to be agnostic, while a fraction $p$ of other births are naturally agnostic.

(a) Thus $A' = $ ______.

The agnostics constitute a growing fraction $y(t)$ of the population, so that $A = yN$.

(b) Take the derivative of both sides in this equation with respect to time $t$. (Note that both $N$ and $y$ are functions of $t$.)

(c) From this we see that

□ $y' = pb(1 - y)$
□ $y' = pb(y - 1)$
□ $y' = p(b - d)(1 - y)$
□ $y' = p(b - d)(y - 1)$

(d) Therefore, with the initial condition $y(0) = 1/1000$, $y(t) = $ ______.

According to Rashevsky, “If we roughly assume that for religious beliefs . . . only about one person in a thousand in early antiquity was a natural agnostic” (i.e., $y(0) = p = 1/1000$) and that “the order of magnitude of $b$ is about $10^{-2}$ individuals/year,” then “in about the last 10,000 years, the time that has elapsed since the emergence of mankind from a primitive state, we find an increase of $y$ from 1/1000 to only about 1/100,” and indeed “we actually find that all the major religions . . . still share between them practically all of humanity.” As for the future, “in 100,000 years the fraction $y$ . . . will have increased to only about 2/3.”

(e) Do we have all the information we need to verify these calculations with our formula for $y(t)$?

§ 2.6. Direction fields

§ 2.6.1. Lecture worksheet

A useful tool for understanding a differential equation is its direction field. You construct it as follows: pick a point $(x, y)$, plug these values for $x$ and $y$ into the differential equation, solve for $y'$, and draw a little line segment with this slope at the point in question. Then you repeat this for many points until you see the pattern. Let us take the population growth with unlimited resources as an example. This is the direction field for $y' = 0.01y$:

[Figure: direction field for $y' = 0.01y$; horizontal axis ticks at 100, 200, 300, 400 and vertical axis ticks at 100, 200, 300.]

For example, the slope at the point $(0, 100)$ is 1, since plugging $x = 0$ and $y = 100$ into $y' = 0.01y$ gives $y' = 1$.

A solution to the differential equation must follow the direction field lines at every point. Thus once we have drawn the direction field we can easily see what the solutions of the equation will look like. Here I have drawn the solution curves corresponding to initial populations of 2, 10, and 50:
[Figure: the direction field with solution curves; horizontal axis ticks at 100, 200, 300, 400 and vertical axis ticks at 100, 200, 300.]

We see that the “biblical” case is slow to get off the ground.

2.6.1. Below is the direction field for the differential equation $y' = $ ______. This corresponds to the differential equation of a freely falling object from problem 2.5.5 (with time on the $x$-axis and velocity on the $y$-axis).

[Figure: direction field with dashed solution curves.]

The dashed solution curves correspond to falling objects with different [air resistance/initial height/initial velocity/weight].

§ 2.6.2. Problems

2.6.2. One of the differential equations in problem 2.5.2 describes the growth of a population with limited resources and harvesting, such as the population of fish in a lake that is being fished by humans.

(a) Which one? Explain briefly the real-world meaning of the terms in the equation.
(b) Consider the case where $k = 0.01$, $a = 1000$, $b = 900$. Find the equilibrium values of $y$. What do they represent in real-world terms?
(c) Sketch a direction field for this differential equation.
(d) There is a qualitative difference between the two equilibria: what is it? Hint: What happens if you increase harvesting temporarily for one week?
(e) ⋆ One of the equilibrium values is lower than $b$. Explain why this might at first seem paradoxical in view of the real-world meaning of these values. Nevertheless the model makes sense even with these values for the constants. How can this be?
(f) Other constants remaining as above, what is the highest harvesting rate that can be maintained without eventually depleting the fish population (assuming that the initial population is large enough)?
(g) Draw the direction fields for this rate of fishing, and for a higher rate of fishing. Explain what these pictures show.

2.6.3. This problem is based on the differential equations for warfare in problem 2.5.2.

(a) In these equations, the derivatives are taken with respect to time. So $y'$ means $\frac{dy}{dt}$. But by dividing one equation by the other we can obtain a new equation involving only $\frac{dy}{dx}$ and no $t$. Do this.
(b) Draw the corresponding direction field and explain how the course of a battle is reflected in this picture. Treat both the case where the armies have equal fighting efficiency and a case where one army is stronger.
(c) Use direction fields to illustrate what happens if one army receives troop reinforcements mid-battle.
(d) What if they receive better weaponry mid-battle instead, increasing their killing efficiency? Discuss what happens in terms of direction fields.
(e) Draw the direction field for conventional-versus-guerrilla warfare, and use it to say something about the real-world differences between this and the conventional case.

§ 2.7. Reference summary

§ 2.7.1. Maxima and minima

Types of points with derivative zero:

[Figure: three curves labelled minimum, maximum, and inflection point.]

Second derivative test:

$f'(a) = 0$ and $f''(a)$ positive $\implies$ (local) minimum
$f'(a) = 0$ and $f''(a)$ negative $\implies$ (local) maximum

• Find (local) max. or min. of $f(x)$.
Set $f'(x) = 0$ and solve for $x$.
• Classify values where $f'(a) = 0$ as max., min., or neither.
$f''(a)$ positive $\implies$ min.; $f''(a)$ negative $\implies$ max. If $f''(a) = 0$, it could be max., min., or inflection point. To find out which, determine the value of $f(x)$ for values of $x$ slightly greater than and less than $a$.

Find and classify the critical points of $f(x) = 2x^2 - \ln|x|$.
$f' = 4x - 1/x$, $f'' = 4 + 1/x^2$. $f' = 0 \implies 4x^2 - 1 = 0 \implies x = \pm\frac{1}{2}$. $f''(\pm\frac{1}{2}) = 8 > 0$, so $x = \frac{1}{2}$ and $x = -\frac{1}{2}$ are both local minima.

• Find global max. or min. of $f(x)$.
Find local max. and min. as above. Check the value of $f(x)$ at each of these points. The biggest of these values is the global max. and the smallest the global min., if such exist. Investigate the values of $f(x)$ as $x$ approaches $\pm\infty$ or any point at which $f(x)$ is not defined (such as a point corresponding to division by zero). If $f(x) \to \infty$ in any of these cases, no global maximum exists. If $f(x) \to -\infty$ in any of these cases, no global minimum exists.

Find any local and global extrema of $f(z) = 10 + 4z + z^2 - \frac{2}{3}z^3$.
Stationary points occur where $f'(z) = 4 + 2z - 2z^2 = 0$, which means $z_1 = -1$ or $z_2 = 2$. The second derivative is $f''(z) = 2 - 4z$. Since $f''(-1) = 6 > 0$ we see that $z_1 = -1$ is a local minimum (with value $f(-1) = \frac{23}{3}$). Since $f''(2) = -6 < 0$ we see that $z_2 = 2$ is a local maximum (with value $f(2) = \frac{50}{3}$). Since $f(z) = 10 + 4z + z^2 - \frac{2}{3}z^3 = z^3\left(\frac{10}{z^3} + \frac{4}{z^2} + \frac{1}{z} - \frac{2}{3}\right)$, we see that $\lim_{z \to -\infty} f(z) = \infty$ and $\lim_{z \to \infty} f(z) = -\infty$. Hence the function has no global maximum or global minimum.

$f'' = 0$ and changing sign $\implies$ inflection point of $f$

• Sketch the shape of the graph of a function based on knowledge of its derivative.
Find where the derivative is zero; at these points the function “goes flat,” i.e., has a horizontal tangent.

For each interval between these points, determine the sign of the derivative. Positive or negative derivative means that the graph goes up or down respectively. Assuming that the derivative is continuous, its sign will be the same at any point within one such interval; therefore it is enough to evaluate the derivative at any one point in the interval. The sign of the derivative on these intervals can also be inferred from the second derivative, if known ($y''$ positive $\implies$ $y'$ increasing, and so on).

Also determine what happens to the function as $x$ goes to plus or minus $\infty$, either by examining the function or by determining whether the derivative is positive or negative for big values of $x$.

A sketch agreeing with these three types of information will be a good approximation of the shape of the graph.

Note that from information about the derivative alone it is not possible to know the vertical position of the graph, i.e., whether it needs to be shifted up or down. However, knowing the value of the function at any one point is enough to determine the vertical position of the graph.

§ 2.7.3. Direction fields


• Find max. or min. of f (x) when x is limited to a specific in-
terval.

Set f 0 (x) = 0 and solve for x; list the x-values that are in the • Sketch the direction field of a differential equation.
given interval. Also include the endpoints of your interval in
your list. Any max. or min. occurs at one of x-values in this Pick a point (x, y), plug these values for x and y into the dif-
list. ferential equation, solve for y 0 , and draw a little line segment
with this slope at the point in question. Repeat for many
To classify whether max., min., or neither, evaluate f (x) at points until you see the pattern.
the given points and determine which points give the great-
est and smallest values. Non-endpoints may also still be clas-
Draw the direction field for y 0 = x 2 .
sified with the second derivative test.

§ 2.7.2. Shape of graphs 3

f 00 positive =⇒ graph of f concave up


1

(slopes increasing, turning upwards)


-5 -4 -3 -2 -1 0 1 2 3 4 5

-1

00
f negative =⇒ graph of f concave down -2

(slopes decreasing, turning downwards)
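The recipe for sketching a direction field can be tried out numerically. The following is a minimal sketch rather than anything from the text itself: the helper name `slope_grid` and the particular sample points are assumptions made for illustration. It tabulates the slope $y' = x^2$ at a few grid points; each number is the slope of the little line segment one would draw there.

```python
def slope_grid(f, xs, ys):
    """Tabulate the slope y' = f(x, y) at every grid point (x, y)."""
    return {(x, y): f(x, y) for x in xs for y in ys}

# Illustrative sample points (an arbitrary choice, not taken from the text).
xs = [-2, -1, 0, 1, 2]
ys = [-1, 0, 1]

# Direction field of y' = x^2: the right-hand side ignores y.
field = slope_grid(lambda x, y: x**2, xs, ys)

# Print one row per y-value, highest row first, like a picture of the plane.
for y in reversed(ys):
    print(f"y={y:>2}: " + " ".join(f"{field[(x, y)]:>2}" for x in xs))
```

Every row comes out the same because the slope depends only on x: the segments are flat along x = 0 and steepen symmetrically to either side.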

Draw the direction field for $y' = y$.

[Figure: direction field of $y' = y$]

• Infer properties of a solution of a differential equation given its direction field.

A solution curve will follow the direction field at all points. If a point on the solution curve is specified, the solution curve can be drawn by tracing along the direction field lines from that point.

In each of the cases above, if y(x) is a solution to the given differential equation, what are the possible values y(x) can approach as x → ∞?

For $y' = x^2$: ∞. For $y' = y$: ∞, 0, or −∞.
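Tracing along the direction field from a given point can be imitated by taking many small straight steps, each in the direction the field prescribes. This is a rough sketch under stated assumptions: the function name `trace`, the step count, and the starting points are illustrative choices, and the straight-step scheme (Euler's method) is swapped in here as a stand-in for tracing by hand.

```python
def trace(f, x0, y0, x_end, steps=10_000):
    """Follow the direction field y' = f(x, y) from (x0, y0) to x = x_end."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)  # move along the little line segment at (x, y)
        x += h
    return y

# y' = y, started above, on, and below the x-axis.
up = trace(lambda x, y: y, 0.0, 1.0, 5.0)
flat = trace(lambda x, y: y, 0.0, 0.0, 5.0)
down = trace(lambda x, y: y, 0.0, -1.0, 5.0)

# y' = x^2, started at the origin.
cubic = trace(lambda x, y: x**2, 0.0, 0.0, 3.0)

print(up, flat, down, cubic)
```

The three runs for $y' = y$ head off upwards, stay at zero, and head off downwards respectively, in line with the three possible limits listed above.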
3 INTEGRATION

§ 3.1. Integrals

§ 3.1.1. Lecture worksheet

The integral $\int_a^b y\,dx$ means:

• Algebraically, the sum (hence the $\int$, which is a kind of “s”) of infinitesimal rectangles with height y and base dx:

[Figure: a thin rectangle of height y and base dx under the graph between a and b]

And thus:

• Geometrically, the integral $\int_a^b y\,dx$ is the area under the graph of y(x) from x = a to x = b. (Technically the “signed area”: area below the x-axis is negative since the height of the rectangle, y, is a negative number.)

I now wish to convince you that also:

• Physically, the integral of velocity is distance; the integral of acceleration is velocity.

3.1.1. If I drive 50 km/h for one hour and then 100 km/h for three hours, how far did I go? Show that this is the area under the graph of the speed function.

3.1.2. Objects in free fall have a constant acceleration, according to Galileo. Draw the graphs of the acceleration, velocity, and distance fallen functions of a dropped stone, and explain how your graphs agree with the above characterisations of these quantities in terms of integrals.

Finally:

• Verbally, integrals often represent some kind of net change or net effect, though this viewpoint is not always applicable.

This shall become clearer in §3.2 but we can already feel it in the above examples.

3.1.3. (a) Argue that the verbal description too applies well to problem 3.1.1.
(b) How do these descriptions play out if I then drive backwards at 50 km/h for two hours? In which two senses can “distance travelled” be interpreted now? How can you express each as an integral?

3.1.4. If σ(h) is the density of water at depth h, what does the integral of σ(h) from the surface to the bottom of the sea represent?

3.1.5. Which of the following have a common etymological root meaning with the mathematical term integral?
□ We need to integrate immigrants into society.
□ Common name for 1, 2, 3, . . .
□ Spaghetti integrale.
□ None of the above

§ 3.1.2. Problems

3.1.6. Argue that $\frac{1}{b-a}\int_a^b f(x)\,dx$ represents the average value of f(x) on the interval [a, b].

[Figure: graph of f(x) with the level “average value” marked]

Hint: this can be done very concretely by thinking of the area under the graph as so much sand, for example.

3.1.7. Argue that $\int_a^b f(x) - g(x)\,dx$ represents the area between the graphs of these two functions. Illustrate with a figure.

3.1.8. Social science: index of inequality. Class inequalities are often quantified by saying that the richest so-and-so percent have so-and-so much of all the world’s wealth. To put this in analytic form, let F(x) be the fraction of a particular resource, such as money, owned by the poorest fraction x of the population. Thus F(0.3) = 0.1 means that the poorest 30% of the population owns 10% of the resource. Gini’s index of inequality is one way to measure how evenly the resource is distributed. It is defined as the integral

$2\int_0^1 x - F(x)\,dx.$

(a) Show graphically what this integral represents in terms of the graph of F(x).
(b) Which of the following must always be true in any society? (Assume that no one can have a negative amount of the resource, and that F(x) is twice differentiable.)
□ $F(0) = 0$
□ $F(1) = 1$
□ $F'(x) \geq 0$
□ $F''(x) \geq 0$
□ $F(x) < x$
□ None of the above
(c) The maximum possible value of Gini’s index of inequality is ____ and the minimum is ____. In the latter case F(x) = ____.
(d) Which of the following represents the most unequal society?
□ $F(x) = x$
□ $F(x) = x^2$
□ $F(x) = x^3$

§ 3.2. Relation between differentiation and integration

§ 3.2.1. Lecture worksheet

The fundamental theorem of calculus says that derivatives and integrals are each other’s inverses in the following ways:

$\frac{d}{dt}\int_a^t y(x)\,dx = y(t)$   (FTC1)

$\int_a^b y'(x)\,dx = y(b) - y(a)$   (FTC2)

To prove FTC1 we proceed as with any derivative. In this case the variable is t and the function is $\int_a^t y(x)\,dx$.

$\int_a^t y(x)\,dx$ = area under y(x) from a to t = [Figure: shaded area under the graph between a and t]

so if t increases by dt then $\int_a^t y(x)\,dx$ increases by

area under y(x) from t to t + dt = [Figure: thin strip of width dt at t] = $y(t)\,dt$,

so

$\frac{d\int_a^t y(x)\,dx}{dt} = \frac{y(t)\,dt}{dt} = y(t),$

which proves FTC1.

3.2.1. What happens if we take the derivative with respect to the lower bound instead? If y(x) is a positive function, $\frac{d}{dt}\int_t^a y(x)\,dx =$ ____ because if the endpoint of integration is moved to the ____ the area decreases.

FTC2 is even easier to prove:

$\int_a^b y'\,dx = \int_a^b \frac{dy}{dx}\,dx = \int_a^b dy$
= sum of little changes in y from a to b
= net change in y from a to b
= $y(b) - y(a)$

§ 3.2.2. Problems

3.2.2. Discrete analog of the fundamental theorem of calculus. Write down a list of eight arbitrary numbers, leaving generous spaces around them. Above the gap between each pair of numbers, write down the sum of all numbers up to this point. Below the gap between each pair of numbers, write down the difference between those two numbers. Above and below the new lists, write down the sums of the difference list, and the differences of the sum list. Explain how this is related to the fundamental theorem of calculus.

3.2.3. † What happens if, in FTC1, we use a slanted line instead of a perpendicular one? In other words, what is $\frac{dA}{dx}$, with A(x) defined like this:

[Figure: area A(x) under the graph, cut off at x by a line slanted at 45°]

3.2.4. Argue that FTC1 can be obtained by differentiating FTC2.

3.2.5. Explain why

$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$

in two ways:
(a) Geometrically in terms of the sign of dx.
(b) Algebraically in terms of FTC2.
(c) Which explanation do you prefer? Why?

3.2.6. (a) If I save all the money I earn, argue that my savings balance is the integral of my salary.
(b) Explain the meaning of FTC1 and FTC2 in the context of this example.

§ 3.3. Evaluating integrals

§ 3.3.1. Lecture worksheet

FTC2 says that in order to integrate some function f(x) one has only to find an antiderivative F(x), that is, a function such that $F' = f$, because then

$\int_a^b f(x)\,dx = F(b) - F(a)$

so to evaluate the integral we just have to plug in the bounds into F(x) and take the difference between them.

To see that this is a very powerful result, consider the problem of finding the area under the parabola $y = x^2$ between x = 0 and x = 1.
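For a number to compare against, the area can also be estimated by brute force, straight from the “sum of rectangles” meaning of the integral in §3.1. This is only an illustrative sketch: the name `riemann_sum` and the choice of taking each rectangle's height at the midpoint of its base are assumptions, not something the text prescribes.

```python
def riemann_sum(f, a, b, n):
    """Add up n rectangles of base dx = (b - a) / n, with heights taken at the midpoints."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

# Area under y = x^2 between x = 0 and x = 1, with ever thinner rectangles.
for n in (10, 100, 1000):
    print(n, riemann_sum(lambda x: x**2, 0.0, 1.0, n))
```

The estimates settle down as the rectangles get thinner, but the computation gives no insight into why the area has the value it has. That is what the symbolic route supplies.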
[Figure: the region under the parabola $y = x^2$ between x = 0 and x = 1]

Without the FTC alphabet soup you wouldn’t know where to start, would you? But with this machinery the whole thing becomes reduced to a straightforward matter of manipulating symbols in a predictable way:

$\int_0^1 x^2\,dx = \left[\frac{x^3}{3}\right]_0^1 = \frac{1^3}{3} - \frac{0^3}{3} = \frac{1}{3}$

So in this case the antiderivative F(x) was $\frac{x^3}{3}$. Figuring this out amounts to doing a differentiation problem “backwards”: We are used to problems like:

The derivative of $x^2$ is ____.

But now we have to “anti-differentiate,” i.e., solve problems like:

The derivative of ____ is $x^2$.

It follows that we can integrate or anti-differentiate most standard functions by reading our tables of derivatives backwards.

As in the case of differentiation, we also need rules for how to deal with functions that are built up from standard functions combined in various ways. Two simple rules are:

$\int c f(x)\,dx = c\int f(x)\,dx$

$\int f(x) + g(x)\,dx = \int f(x)\,dx + \int g(x)\,dx$

3.3.1. Explain why these rules are quite obvious.

3.3.2. Evaluate $\int_1^2 1 + 2x^4\,dx$.

§ 3.3.2. Problems

3.3.3. In pre-revolutionary France, Georges Louis Leclerc, Comte de Buffon, spent his bourgeois leisure time tossing needles on the floor.

I suppose that in a room in which the parquet floor is simply divided by parallel joints, one throws a stick in the air and one of the players bets that the stick will not cross any of the parallels of the parquet floor, and that the other bets on the contrary that the stick will cross some of these parallels; the chances of these two players are asked for. [Buffon, Essai d’Arithmetique Morale, 1777]

Let’s say that we toss a 1-inch needle on a floor with 2-inch floor boards going east-west. The position of the needle is determined by two parameters: the distance y from the southern end of the needle to the joint to the north of it, and the angle θ the needle makes with the floorboards. Thus the possible values of y are 0 ≤ y < 2 and the possible values of θ are 0 ≤ θ < π. Call this the “possibility space.”

[Figure: a needle of length 1 at angle θ, at distance y from the joint, on floor boards of width 2]

(a) Draw a coordinate system with θ and y as the x and y coordinates respectively, and indicate the possibility space in this picture.
(b) The needle will cross a joint if and only if y < ____. (Use trigonometry.)
(c) Shade the subset of points of the possibility space satisfying this condition. The area of this shaded region is ____.
(d) The probability of the needle hitting a joint is the area of this shaded region divided by the total area of the possibility space. So the probability is ____.

3.3.4. The population P(t) of a country is growing continuously at a rate of (1 + t)% per year, where t is the number of years from today.
(a) This means that $P'(t)/P(t) =$ ____.
(b) Find $\int_a^b \frac{P'(t)}{P(t)}\,dt$ in terms of P(a) and P(b). Hint: The integrand is a “logarithmic derivative.” The integral evaluates to ____.
(c) How many percent bigger will the population be in 6 years compared to today? Hint: Work out the integral $\int_0^6 \frac{P'(t)}{P(t)}\,dt$ in two ways, once using (a) and once using (b). The population will have grown by ____%.

§ 3.4. Change of variables

§ 3.4.1. Lecture worksheet

What is the antiderivative of $2x\cos(x^2)$? By “guess and check” you can find the answer $\sin(x^2)$, which can be verified using the chain rule. But it is not always so easy to do this kind of “backwards chain rule” problem in your head. A systematic technique for such integrals and more is substitution, or change of variables. To find $\int 2x\cos(x^2)\,dx$ we could introduce the new variable $u = x^2$. Then $\frac{du}{dx} = 2x$, so $du = 2x\,dx$. So the integral rewritten in terms of u becomes $\int 2x\cos(x^2)\,dx =$
$\int \cos(x^2)(2x\,dx) = \int \cos(u)\,du$. This is easy to integrate: it’s $\sin(u) + C$, or, putting the answer back in terms of x, $\sin(x^2) + C$.

This technique always works when we need to integrate something where a function is “trapped” inside another function, and the derivative of the trapped function appears on the outside (give or take a constant). In such a situation we should choose the trapped function as our new variable u. For example:

$\int x^2 e^{x^3}\,dx$            $u = x^3$
$\int 5x\sqrt{1 - x^2}\,dx$       $u = 1 - x^2$
$\int \cos(8x + 1)\,dx$           $u = 8x + 1$
$\int (\cos x)^7 \sin x\,dx$      $u = \cos x$

3.4.1. Solve the second example. If we drop the 5x the integral becomes much harder but also much more interesting. Why? See problem 1.1.1.

3.4.2. Select all that are true:
□ $\int \frac{1}{2x}\,dx = \frac{1}{2}\int \frac{1}{x}\,dx = \frac{1}{2}\ln(x) + C$
□ Using the substitution u = 2x, $\int \frac{1}{2x}\,dx = \frac{1}{2}\int \frac{1}{u}\,du = \frac{1}{2}\ln(2x) + C$
□ Using the substitution u = 2x, $\int \frac{1}{2x}\,dx = \int \frac{1}{u}\,du = \ln(2x) + C$
Explain the “paradox.”

Let’s solve another example: $\int_0^{\pi/2} \sin^3 x\cos^3 x\,dx$. Here it is not so clear which function is the “trapped” one. It was easier when we had all sines and one cosine, or the other way around, as in the last example in the table. But this can easily be arranged, for the identity $\sin^2 x + \cos^2 x = 1$ enables us to essentially “trade two cosines for two sines,” or conversely. If we “trade in” two of the cosines in our integral we get $\int_0^{\pi/2} \sin^3 x(1 - \sin^2 x)\cos x\,dx = \int_0^{\pi/2} \sin^3 x\cos x\,dx - \int_0^{\pi/2} \sin^5 x\cos x\,dx$. In each of these integrals it is clear that the substitution should be $u = \sin x$. As always when we make a substitution we immediately take its derivative, because we are going to need it to replace the dx in the original integral. In this case $\frac{du}{dx} = \cos x$, so $du = \cos x\,dx$. We must also remember the bounds of integration. These are of course x-values, so when we rewrite the integral in terms of u we must replace them with the corresponding u-values. For this we use the basic relation between u and x, namely $u = \sin x$, to find that the lower bound is $u = \sin 0 = 0$, and the upper bound is $u = \sin \pi/2 = 1$. Altogether we get

$\int_0^{\pi/2} \sin^3 x\cos^3 x\,dx = \int_0^{\pi/2} \sin^3 x\cos x\,dx - \int_0^{\pi/2} \sin^5 x\cos x\,dx$
$= \int_0^1 u^3\,du - \int_0^1 u^5\,du$
$= \left[\frac{u^4}{4}\right]_0^1 - \left[\frac{u^6}{6}\right]_0^1 = \frac{1}{4} - \frac{1}{6} = \frac{1}{12}$

For $\int \sin^2 x\,dx$ and $\int \cos^2 x\,dx$ our trading trick doesn’t work. The following example illustrates how we can deal with such cases.

3.4.3. (a) Geometrically, $\int_0^1 \sqrt{1 - x^2}\,dx$ is the area of: ____
(b) Using the substitution $x = \cos(u)$, the integral becomes $\int_0^{\pi/2}$ ____ = ____ $du$

We can find a clever way of rewriting this integrand by expressing the dashed length in the figure below in two different ways (by the Pythagorean theorem in the left figure and in terms of sines in the right one).

[Figure: two diagrams showing the same dashed length]

(c) This gives $2\sin(u) =$ ____ and hence $\sin^2(u) =$ ____
(d) Hence the integral becomes $\big[$ ____ $\big]_0^{\pi/2} =$ ____

§ 3.5. Integration by parts

§ 3.5.1. Lecture worksheet

Like the chain rule, the product rule also has an integral counterpart:

$\int f g' = f g - \int f' g$

3.5.1. Show that this follows immediately from the product rule. (So there is really no need to memorise it as a separate formula.)

This is called integration by parts. Basically it allows us to trade one integral for another, namely $\int f g'$ for $\int f' g$. For this to be a profitable trade the integral we bought should be simpler than the one we sold, i.e., the integral should simplify if we take the derivative of one factor ($f$) and the antiderivative of the other ($g'$). So good choices for $f$ would be polynomials and logarithms, because they become simpler when differentiating, and good choices for $g'$ would be sines, cosines, and exponential functions, because they do not become worse when integrating. For example:

$\int x e^{x/2}\,dx$       $f = x$        $g' = e^{x/2}$
$\int x^3 \sin x\,dx$      $f = x^3$      $g' = \sin x$
$\int x^2 \ln x\,dx$       $f = \ln x$    $g' = x^2$

In the second case we have to integrate by parts three times to “run down” the polynomial. In the last case we are forced to
anti-differentiate $x^2$, which in itself makes the integral worse. But this is a small price to pay for getting a clear shot at the logarithm, which becomes vastly simpler when differentiating.

The rest is simple fill-in-the-blanks. The first example above for instance goes like this:

$\int \underbrace{x}_{f}\,\underbrace{e^{x/2}}_{g'}\,dx = \underbrace{x}_{f}\,\underbrace{2e^{x/2}}_{g} - \int \underbrace{1}_{f'}\,\underbrace{2e^{x/2}}_{g}\,dx$
$= 2xe^{x/2} - 4e^{x/2} + C$

3.5.2. Work out the last example in the table above.

§ 3.5.2. Problems

3.5.3. † Geometrical interpretation of integration by parts.
(a) Sketch a schematic representation of the parametric curve (f(t), g(t)). Assume that it is always heading upwards and to the right.
(b) Pick two t-values a, b and express the area under the curve between these two points as an integral.
(c) Draw and express the areas of the axis-parallel rectangles with lower-left point at the origin and upper-right point at (f(a), g(a)) and (f(b), g(b)) respectively.
(d) Explain how the integration by parts formula is geometrically evident from your figure.

§ 3.6. Partial fractions

§ 3.6.1. Lecture worksheet

Consider the problem of integrating a function with several factors in the denominator, such as

$\int \frac{1}{x(1 - x)}\,dx.$

The trick here is to split the integrand into partial fractions:

$\frac{1}{x(1 - x)} = \frac{A}{x} + \frac{B}{1 - x}.$

So I gave each factor of the denominator its own fraction, leaving the numerators as unknown constants. I say that we can in fact find numbers A and B that make this equation true. To find these numbers, multiply both sides by $x(1 - x)$ to clear the denominators. This gives us $1 = A(1 - x) + Bx$. Now, for the partial fraction decomposition to be valid this equation must be true for any value of x. So in particular it must be true if we plug in x = 1, for instance. I chose this value of x because it simplifies the equation so nicely; in fact, the equation now says 1 = B, so we have figured out one of our constants. Another clever choice of x will be the other root, x = 0, which gives 1 = A. So actually both constants are 1. Therefore

$\int \frac{1}{x(1 - x)}\,dx = \int \frac{1}{x} + \frac{1}{1 - x}\,dx = \ln|x| - \ln|1 - x| + C.$

3.6.1. Find $\int \frac{3 + x}{x^2 - 4}\,dx$.

§ 3.6.2. Problems

3.6.2. (a) One of the differential equations in problem 2.5.2 describes the growth of a population with limited resources. Which one? Explain briefly the real-world meaning of the terms in the equation.
(b) For the case of the human species inhabiting the earth, make a rough estimate as to the values of the constants in the equation. Give brief justifications.
(c) Solve the differential equation. Hint: First find the derivative of time with respect to population, integrate this expression, and then solve for population as a function of time.
(d) Use the “biblical” case of an initial population of 2 to determine the constant of integration and sketch (possibly with computer assistance) the graph for this case.
(e) Mark the present-day population on the graph. How many years after t = 0 is it?
(f) At what population size does the inflection point occur? Hint: Use the differential equation instead of the formula for the population.
(g) Complete the sentence: “When population growth stops accelerating, the population has reached ____.”

3.6.3. Chemistry: rate of reaction. Consider a chemical reaction in which one molecule of reagent A combines with one molecule of reagent B to produce one molecule of a compound X. The rate at which molecules of X are produced is proportional to the concentration of the reagents:

$\frac{dx}{dt} = k(a - x)(b - x),$

where x is the concentration of X, a, b are the initial concentrations of each reagent, in mols per unit volume, and k is a constant. Let us assume that a < b.
(a) Rewrite the equation in the form $\frac{dt}{dx} = \cdots$.
(b) Find t as a function of x. Determine the constant of integration using the fact that no molecules of the compound X are present at the beginning of the reaction.
(c) Plot the solution curve for a = 1, b = 2, k = 1. (Note that the same graph, when rotated, can be read as a graph of x as a function of t.)
(d) Usually one can speed up chemical reactions by increasing the temperature. Suppose this increases k to 2, but reduces each of the concentrations a and b by 10% due to heat expansion of the solution. Plot this new situation. Is the reaction faster than before?
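The partial fraction computation in the §3.6.1 worksheet can be sanity-checked by machine. What follows is a small sketch under stated assumptions: the function names and sample points are illustrative, and the rectangle sum is just one convenient way to get an independent value for the integral. It compares the two sides of the decomposition exactly at a few rational points, then compares the antiderivative $\ln|x| - \ln|1 - x|$ against a direct numerical integration.

```python
from fractions import Fraction
import math

def original(x):
    return 1 / (x * (1 - x))

def decomposed(x, A=1, B=1):
    # A/x + B/(1 - x) with the constants found in the worksheet
    return A / x + B / (1 - x)

# Exact comparison at a few rational sample points (avoiding the roots 0 and 1).
samples = [Fraction(1, 3), Fraction(2, 5), Fraction(-3, 2), Fraction(7, 2)]
identity_holds = all(original(x) == decomposed(x) for x in samples)

def antiderivative(x):
    return math.log(abs(x)) - math.log(abs(1 - x))

def midpoint_sum(f, a, b, n=10_000):
    """Sum of n thin rectangles with heights taken at the midpoints."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

a, b = 0.2, 0.5  # an interval on which the integrand is finite
by_ftc2 = antiderivative(b) - antiderivative(a)
by_rectangles = midpoint_sum(original, a, b)
print(identity_holds, by_ftc2, by_rectangles)
```

Agreement at a handful of points does not prove the identity, but a wrong value of A or B would show up immediately as a mismatch.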
§ 3.7. Reference summary

§ 3.7.1. Meaning and properties of integrals

$\int_a^b y\,dx$ = (signed) area under y(x) from x = a to x = b

$\int y\,dx$ = indefinite integral of y(x)
= the general anti-derivative of y(x)
(always includes constant of integration “+C”)

(Terminology: $F' = f \iff F$ = anti-derivative of $f$)

$\int c f(x)\,dx = c\int f(x)\,dx$

$\int f(x) + g(x)\,dx = \int f(x)\,dx + \int g(x)\,dx$

$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$

$\int_a^\infty f(x)\,dx = \lim_{b\to\infty}\int_a^b f(x)\,dx$

§ 3.7.2. Applied meaning of integrals

$\int_a^b$ rate of change of something $dt$ = net change in that thing

$\int_a^b$ velocity $dt$ = net distance travelled

$\int_a^b$ acceleration $dt$ = net increase in velocity

§ 3.7.3. Fundamental theorem of calculus

$\frac{d}{dt}\int_a^t y(x)\,dx = y(t)$   (FTC1)

$\int_a^b y'(x)\,dx = y(b) - y(a)$   (FTC2)

$\int_a^b f(x)\,dx = F(b) - F(a)$   (FTC2)

§ 3.7.4. Integrals of elementary functions

function                    anti-derivative
$x^n$                       $\frac{x^{n+1}}{n+1}$
$\frac{1}{x}$               $\ln|x|$
$\sin x$                    $-\cos x$
$\cos x$                    $\sin x$
$\frac{1}{1+x^2}$           $\arctan x$
$\frac{1}{\sqrt{1-x^2}}$    $\arcsin x$
$e^x$                       $e^x$
$a^x$                       $a^x/\ln(a)$

§ 3.7.5. Rules of integration

Substitution, simplest case:

$\int g'(x) f(g(x))\,dx = F(g(x))$

Integration by parts:

$\int f g' = f g - \int f' g$

Partial fractions, simplest case:

$\int \frac{f(x)}{(x - a)(x - b)} = \int \frac{A}{x - a} + \int \frac{B}{x - b}$

§ 3.7.6. Problem guide

• Integrate: a number times a function.

Move the number out in front of the integral and integrate the function. The number remains a coefficient of the answer.

$\int 5x\,dx = 5\int x\,dx = 5\frac{x^2}{2} + C$

• Integrate: a power of x.

Integrate with power rule, i.e., increase exponent by 1 and divide by the new exponent.

$\int x^3\,dx = \frac{1}{4}x^4 + C$

$\int \frac{dx}{x^3} = \int x^{-3}\,dx = \frac{x^{-3+1}}{-3+1} + C = -\frac{1}{2}x^{-2} + C = -\frac{1}{2x^2} + C$

• Integrate: an x is inside a root or in the denominator of a fraction.

Rewrite in form $x^n$ and apply integration rule for this form.

function         equivalent form    anti-derivative
$\sqrt{x}$       $x^{1/2}$          $\frac{2x^{3/2}}{3}$
$1/\sqrt{x}$     $x^{-1/2}$         $2\sqrt{x}$
$1/x^2$          $x^{-2}$           $-\frac{1}{x}$

$\int_1^4 \frac{1}{\sqrt{x}}\,dx = \int_1^4 x^{-1/2}\,dx = \left[2x^{1/2}\right]_1^4 = 2(\sqrt{4} - 1) = 2$
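The power rule examples above can be cross-checked against a direct sum of thin rectangles. This is a hedged sketch, not a method from the text: the helper names `power_rule_integral` and `midpoint_sum` and the chosen intervals are assumptions made for illustration.

```python
def midpoint_sum(f, a, b, n=100_000):
    """Sum of n thin rectangles with heights taken at the midpoints."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def power_rule_integral(n, a, b):
    """Integral of x**n from a to b: raise the exponent by 1, divide by the new exponent."""
    if n == -1:
        raise ValueError("the power rule does not apply to 1/x; its anti-derivative is ln|x|")
    return (b ** (n + 1) - a ** (n + 1)) / (n + 1)

# (exponent, lower bound, upper bound); the last case is the 1/sqrt(x) example on [1, 4].
cases = [(3, 0.0, 2.0), (-3, 1.0, 2.0), (-0.5, 1.0, 4.0)]
for n, a, b in cases:
    exact = power_rule_integral(n, a, b)
    approx = midpoint_sum(lambda x: x ** n, a, b)
    print(n, exact, approx)
```

The exponent −1 is rejected on purpose: dividing by the new exponent would mean dividing by zero, which is why 1/x has its own row in the table of elementary integrals.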
• Integrate: function plus or minus function.

Integrate each separately and keep the sign in between.

$\int \left(1 - \frac{1}{\sqrt{x}}\right)dx$

$= \int \left(1 - x^{-1/2}\right)dx = x - 2x^{1/2} + C = x - 2\sqrt{x} + C$

$\int \left(e^{-x} + \frac{1}{x^2}\right)dx$

$= \int \left(e^{-x} + x^{-2}\right)dx = -e^{-x} - \frac{1}{x} + C$

• Integrate: one expression contained inside another (including contained in denominator).

If (a constant times) the derivative of the inside function occurs on the outside: Let u = (the inside function) and solve by substitution.

• Perform a substitution (change of variables) in an integral.

Express the new variable u in terms of the old variable x or vice versa. Find $\frac{du}{dx}$ or $\frac{dx}{du}$ and use this to solve for dx. Use the resulting expression to replace the dx in the integral. Rewrite any remaining expressions involving x in the integrand in terms of u. Also translate the bounds from x-values into u-values by plugging them in for x in the formula defining u. Solve the resulting integral. If indefinite integral, rewrite the answer in terms of x by substituting back using the formula defining u.

$\int x\sin(x^2)\,dx$

Substitute $u = x^2$. Then $du/dx = 2x$, so $dx = du/2x$. Thus $\int x\sin(x^2)\,dx = \frac{1}{2}\int \sin(u)\,du = -\frac{1}{2}\cos(u) + C = -\frac{1}{2}\cos(x^2) + C$.

$\int_0^1 (x - 1)^9\,dx$

Substitute $u = x - 1$. Then $\frac{du}{dx} = 1 \implies dx = du$, so the integral becomes $\int_{-1}^0 u^9\,du = \left[\frac{u^{10}}{10}\right]_{-1}^0 = -\frac{1}{10}$.

$\int \frac{(\ln t)^{10}}{t}\,dt$

The substitution $u = \ln t$, which implies $du = \frac{1}{t}\,dt$, gives $\int \frac{(\ln t)^{10}}{t}\,dt = \int u^{10}\,du = \frac{1}{11}u^{11} + C = \frac{1}{11}(\ln t)^{11} + C$.

$\int \frac{1}{x\ln x}\,dx$

Let $u = \ln x$. Then $du = dx/x$, so $\int \frac{1}{x\ln x}\,dx = \int \frac{1}{\ln x}\,\frac{dx}{x} = \int \frac{1}{u}\,du = \ln|u| + C = \ln|\ln x| + C$.

$\int \tan x\,dx$

Let $u = \cos x$. Then $du = -\sin x\,dx$. Thus $\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = \int \frac{1}{\cos x}\sin x\,dx = \int \frac{-1}{u}\,du = -\ln|u| + C = -\ln|\cos x| + C$.

• Integrate: function times function.

Unless of the above form, try integration by parts:

First determine which of the two functions you would gain most by differentiating: this will be your $f$. The sooner a function occurs in the following list, the more inclined you should be to make it your $f$: logarithmic functions, inverse trigonometric functions, algebraic functions, trigonometric functions, exponential functions. The remaining function is your $g'$.

Now differentiate your $f$-function to find $f'$ and anti-differentiate your $g'$ to find $g$. Fill this into the integration by parts formula:

$\int f g'\,dx = \underbrace{\_\_\_\_}_{f}\,\underbrace{\_\_\_\_}_{g} - \int \underbrace{\_\_\_\_}_{f'}\,\underbrace{\_\_\_\_}_{g}\,dx$

If bounds are involved, put them everywhere:

$\int_a^b f g'\,dx = \Big[\underbrace{\_\_\_\_}_{f}\,\underbrace{\_\_\_\_}_{g}\Big]_a^b - \int_a^b \underbrace{\_\_\_\_}_{f'}\,\underbrace{\_\_\_\_}_{g}\,dx$

If a power of x is involved, repeated integration by parts is generally needed to bring it down one degree at a time.

If the integrand is a sine or cosine times an exponential function, integrate by parts twice. Denote the sought integral by I. You will find that I occurs also in your final expression. Replace that occurrence also by I, then solve for I in the resulting equation.

$\int e^{x/2} x\,dx$

$\int \underbrace{x}_{f}\,\underbrace{e^{x/2}}_{g'}\,dx = \underbrace{x}_{f}\,\underbrace{2e^{x/2}}_{g} - \int \underbrace{1}_{f'}\,\underbrace{2e^{x/2}}_{g}\,dx = 2xe^{x/2} - 4e^{x/2} + C$

$\int x\cos x\,dx$

$= x\sin x - \int \sin x\,dx = x\sin x + \cos x + C$.

$\int xe^{2x}\,dx$

Integrate by parts with $f = x$ and $g' = e^{2x}$, which gives $f' = 1$ and $g = \frac{e^{2x}}{2}$. Hence $\int xe^{2x}\,dx = x\frac{e^{2x}}{2} - \int \frac{e^{2x}}{2}\,dx = x\frac{e^{2x}}{2} - \frac{e^{2x}}{4} + C = \left(\frac{x}{2} - \frac{1}{4}\right)e^{2x} + C$.
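Any antiderivative found by substitution or by parts can be checked by differentiating it again. The sketch below does this numerically; it is illustrative only, and the helper names `derivative` and `is_antiderivative`, the step size, and the sample points are assumptions rather than anything prescribed by the text.

```python
import math

def derivative(F, x, h=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

def is_antiderivative(F, f, points, tol=1e-6):
    """True if F'(x) matches f(x) at every sample point, up to tol."""
    return all(abs(derivative(F, x) - f(x)) < tol for x in points)

points = [0.2, 0.5, 1.0]

# Integration by parts example: (x/2 - 1/4) e^{2x} as antiderivative of x e^{2x}.
ok_parts = is_antiderivative(lambda x: (x / 2 - 0.25) * math.exp(2 * x),
                             lambda x: x * math.exp(2 * x), points)

# Substitution example: -ln|cos x| as antiderivative of tan x.
ok_tan = is_antiderivative(lambda x: -math.log(abs(math.cos(x))), math.tan, points)

# A tempting wrong guess for x e^{2x}: stopping after the first term.
bad_guess = is_antiderivative(lambda x: x * math.exp(2 * x) / 2,
                              lambda x: x * math.exp(2 * x), points)

print(ok_parts, ok_tan, bad_guess)
```

The first two checks pass and the third fails, which is the point: an antiderivative that has lost a term is caught as soon as it is differentiated.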
$\int \frac{\ln x}{x^2}\,dx$

$= -\frac{1}{x}\ln x + \int \frac{1}{x^2}\,dx = -\frac{1}{x}\ln x - \frac{1}{x} + C$

$\int \sin(x)e^x\,dx$

If you try to solve this by integration by parts you will feel that you are going in circles: when you have done integration by parts twice you are back to the same integral that you started with:

$\int \sin(x)e^x\,dx = \sin(x)e^x - \cos(x)e^x - \int \sin(x)e^x\,dx.$

But this is not as pointless as it might look. Denote the integral you are looking for by I. Then:

$I = \sin(x)e^x - \cos(x)e^x - I.$

Now you can solve for I to get

$I = \frac{\sin(x)e^x - \cos(x)e^x}{2},$

plus a constant of course.

$\int_0^2 te^{-2t}\,dt$

$= \left[t\cdot\frac{e^{-2t}}{-2}\right]_0^2 - \int_0^2 \frac{e^{-2t}}{-2}\,dt = \left[t\cdot\frac{e^{-2t}}{-2}\right]_0^2 - \left[\frac{e^{-2t}}{(-2)^2}\right]_0^2 = 2\cdot\frac{e^{-4}}{-2} - 0 - \left(\frac{e^{-4}}{4} - \frac{e^0}{4}\right) = -e^{-4} - \frac{e^{-4}}{4} + \frac{1}{4} = \frac{1 - 5e^{-4}}{4}$

• Integrate: ln(x).

View as 1 · ln(x) and integrate by parts.

$\int \ln(x)\,dx$

$= x\ln x - \int x\frac{1}{x}\,dx = x\ln x - x + C$.

• Integrate: an inverse (arc) trigonometric function.

View as 1 times the function and integrate by parts.

$\int \arctan x\,dx$

$= x\arctan x - \int \frac{x}{1+x^2}\,dx = x\arctan x - \frac{1}{2}\ln(1 + x^2) + C$.

• Integrate: a product of powers of sines and/or cosines.

Only even powers occur: Rewrite using double angle formula

$\cos 2x = \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x$

The power of one of them is 1: Let u be the other function, so that the factor with power 1 supplies the du, and solve by substitution.

Other cases: Rewrite using $\sin^2 x + \cos^2 x = 1$ to the forms above.

$\int_0^{\pi/2} \sin x\cos x\,dx$

Let $u = \cos x$. Then $du = -\sin x\,dx$. Thus $\int_0^{\pi/2} \sin x\cos x\,dx = \int_1^0 -u\,du = \int_0^1 u\,du = \left[\frac{1}{2}u^2\right]_0^1 = 1/2$.

$\int \cos^2 x\,dx$

$= \int \frac{1 + \cos(2x)}{2}\,dx = \int \left(\frac{1}{2} + \frac{1}{2}\cos(2x)\right)dx = \frac{1}{2}x + \frac{1}{2}\cdot\frac{\sin(2x)}{2} + C = \frac{1}{2}x + \frac{\sin(2x)}{4} + C$.

$\int 3\sin^3 x\,dx$

$= \int 3\sin x(1 - \cos^2 x)\,dx$. Substitute $u = \cos x$. Then $\frac{du}{dx} = -\sin x \implies du = -\sin x\,dx$. Hence the integral becomes $\int -3(1 - u^2)\,du = \int -3 + 3u^2\,du = -3u + u^3 + C = -3\cos x + \cos^3 x + C$.

• Integrate: a product of sine and/or cosine factors with different coefficients of x.

Rewrite using addition formulas for sine or cosine.

• Integrate: a ratio of two polynomials.

(The method below requires that the degree of the numerator is lower than the degree of the denominator. If it is not, first bring it into such a form for example by factoring something out or performing a division of polynomials (§A.7.6).)

Factor the denominator. The original fraction is equal to a sum of “partial” fractions found as follows:

– A simple linear factor $(ax - b)$ in the denominator contributes a term $\frac{A}{ax-b}$.
– A double linear factor $(ax - b)^2$ in the denominator contributes the terms $\frac{A}{ax-b} + \frac{B}{(ax-b)^2}$.
– A triple linear factor $(ax - b)^3$ in the denominator contributes the terms $\frac{A}{ax-b} + \frac{B}{(ax-b)^2} + \frac{C}{(ax-b)^3}$.
– And so on for higher powers.
– An irreducible quadratic factor $(ax^2 + bx + c)$ in the denominator contributes a term $\frac{Ax+B}{ax^2+bx+c}$.
– A repeated irreducible quadratic factor: analogous to repeated linear factor.

Use a separate letter for the constants A, B, etc., in each term. These are some numbers yet to be determined. We find these numbers as follows. Set the original fraction equal to the sum of the partial fractions. Proceed in one of two ways:

– (Best for simpler cases.) Multiply both sides by the denominator of the original fraction. Many things cancel and we are left with an equation without fractions. Plug in the values for x that make one of the factors zero (i.e., the roots of the original denominator). Each time you plug in one of these values many terms will become zero and you will get some quite simple equations from
which you can figure out the values of the constants A, R3 p
2
B , etc. −3 9 − x dx
R3 p
Let x = 3 sin θ. Then dx = 3 cos θ dθ. Thus −3 9 − x 2 dx =
– (Better in advanced cases.) Multiply out all parentheses R π/2 p R π/2 p
and identify coefficients of like terms on left and right −π/2 9 − 9 sin2 θ(3 cos θ) dθ = −π/2 3 9 cos2 θ cos θ dθ =
R π/2 R π/2 2
R π/2 9 ¡
sides (coefficient of x n on left hand side = coefficient of −π/2 3|3 cos θ| cos θ dθ = −π/2 9 cos θ dθ = −π/2 2 1 +
x n on right hand side). ¢ π/2
i
cos(2θ) dθ = 92 θ + 12 sin(2θ) = 29 π.
¢ ¡
−π/2
We have now reduced the problem of integrating the original
fraction to the problem of integrating the sum of the partial • Integrate: an expression involving (±a 2 ± x 2 )n .
fractions. This is done term by term.
If other rules are inapplicable, a substitution strategy anal-
– A
integrates to a logarithm (substitute u = ax − b). ogous to that of the previous case may help to simplify the
ax−b
integrand.
B C
– ,
(ax−b)2 (ax−b)3
, etc., integrates using power rule (sub- • Integrate: a more complicated case not covered above which
stitute u = ax − b). involves . . .
Ax+B
– integrates to an arctangent (first complete the
ax 2 +bx+c – . . . a rational function of sin(x) and cos(x).
square in the denominator).
Unless some simplification using trigonometric identi-
R x+1
ties suggests itself, substitute u = tan(x/2). This should
(x−1)(x−3)2
dx turn the integrand into a rational function of u, which
x+1 A B C can then be integrated. (For geometric interpretation
(x−1)(x−3)2
= (x−1) + (x−3) + (x−3) 2
see problem 7.3.5.)
⇒x + 1 = A(x − 3)2 + B (x− 1)(x
 − 3) +C (x − 1)
1  – . . . a root expression.
 0 = A +B   A= 2 
⇒ 1 = −6A − 4B +C ⇒ B = − 21 Try substituting u = this root expression. If this does
1 = 9A + 3B −C C =2
   
1 1
not help, try substituting u 2 = the interior of the root
R x+1
R 2 −2 2 ln(x−1)
So (x−1)(x−3) 2 dx = (x−1) + (x−3) + (x−3)2 dx = 2 − expression
ln(x−3) 2
2 − x−3 +C . – . . . a fraction of exponential expressions.
Try to find a substitution that will rationalise the ex-
R (x+2) pression (i.e., turn it into a ratio of polynomials).
(x+3)(x+4) dx
2 • Evaluate an integral where one of the bounds is infinite, or
2 1
dx = [2 ln(x + 4) − ln(x + 3)] +C = ln (x+4)
R
= x+4 − x+3 x+3 +C on an interval where the integrand becomes infinite (i.e., has
vertical asymptote).
R2 dx Evaluate the integral with a generic bound, say a, in place
1 x 2 −3x−4
of the exceptional one, and take the limit of the answer as a
R2 1/5 1/5
= 1 x−4 − x+1 dx = 51 [ln |x − 4| − ln |x + 1|]21 = 25 ln 23 goes to the exceptional point. Split into two integrals if the
exceptional point is in the interior of the interval. The inte-
gral is convergent if the limit exists and is finite, otherwise
dx divergent.
R
x 2 +4x+5
R
R dx • Find the derivative of an integral f (x) dx with respect to t
= (x+2)2 +1
= arctan(x + 2) +C
where t . . .
p – . . . is the upper bound of integration.
• Integrate: an expression involving ±a 2 ± x 2 .
f (t ).
Use a trigonometric substitution that enables you to rewrite
– . . . is the lower bound of integration.
the expression under the root as a perfect square using the
Pythagorean identity sin2 x + cos2 x = 1 or some variant of it − f (t ).
(e.g. divided by cos2 x). In this way the root can be eliminated
– . . . occurs in an expression g (t ) for the upper bound of
which should make the function easier to integrate. For the
integration.
purposes of substituting back a trigonometric answer to the
original variable, it is useful to interpret the original root ex- Combine the above with chain rule: f (g (t ))g 0 (t ).
pression as one of the sides of a right triangle; the trigono-
– . . . occurs in an expression g (t ) for the lower bound of
metric answer can then be interpreted as a ratio in this figure
integration.
and translated accordingly.
Combine the above with chain rule: − f (g (t ))g 0 (t ).

– . . . occurs in both bounds. R ln 3 ex
0 1+e x dx
Treat each bound separately according to the above;
SubstituteR u = 1 + e x , so that e x dx = du. Then the integral
add the results. 4
becomes 2 u1 du = [ln u]42 = ln 2.
et
Rx
Differentiate F (x) = 0 t +2 dt. R 1/2 1
0 dx
F 0 (x) = e x x + 2 2+8x 2

1 1/2 1 π
dx = 41 [arctan(2 X )]1/2
R
= 2 0 1+(2 X )2 0 = 16
R x2 et
Differentiate G(x) = 0 t +2 dt.
R4p
0 ex
2
1 x ln xdx
G (x) = x 2 +2
· 2x p
3/2 ln x R4 x
By parts: = [ x 3/2 ]41 − 1 3/2 dx =
16 ln 4
3 − 28
9

R2 x
§ 3.7.7. Examples 0 (x 2 +4)1/3 dx

1 8 du 2/3 3
= [ 3u4 ]84 = 3 − 22/3
R
= 2 4 u 1/3
( 12 + x)2 ln(1 + 2x)dx
R

The substitution 1 + 2x = t gives 2dx = dt and ( 12 +


R
(2r − 1)e −2r dr
R
2
x)2 ln(1 + 2x)dx = ( 1+2x 2
= t4 ln t 12 dt =
R R
2 ) ln(1 + 2x)dx −2r R −2r
By parts: (2r − 1)e −2r dr = (2r − 1) e−2 − 2 e−2 dr = (2r −
R
1
t 2 ln t dt. Integrating by parts, t 2 ln t dt = 31 t 3 ln t −
R R
R8 1 3 1 1 3 1
R 2 1 3 1 3 1
−2r R −2r −2r −2r
1) e−2 + e −2r dr = (2r −1) e−2 + e−2 +C = (2r −1+1) e−2 +C =
3 t t dt = 3 t ln t − 3 t dt = 3 t ln t − 9 t + C = 3 (1 +
3 1 3 −r e −2r +C
2x) ln(1 + 2x) − 9 (1 + 2x) + C . The final answer is thus
1 3 1 3
24 (1 + 2x) ln(1 + 2x) − 72 (1 + 2x) +C 1 .
R x+ 21
1+x+x 2 dx
e
R2
0 x 2 (9 − x 3 )10 dx dt
The substitution t = 1+x +x 2 implies dx = 1+2x which gives
R x+ 21
The substitution 9 − x 3 = t gives −3x 2 dx = dt, and hence 1
dx = 12 e1t dt =
R
dt = (1 + 2x)dx = 2(x + 2 )dx. Hence 1+x+x 2
e
x 2 dx = − 31 dt. Moreover, t varies between 9 and 1 as x 1
R −t 1 e −t 1 1
2 e dt = 2 · −1 +C = − 2e t +C = − 1+x+x 2 +C .
R2
varies between 0 and 2. Therefore 0 x 2 (9 − x 3 )10 dx = 2e
911 −1
R 1 10 1 1 1 10
R 1 9 10
R 1 11 9
9 t (− 3 )dt = − 3 9 t dt = 3 1 t dt = 33 t ]1 = 33 .
R p3
0 arctan xdx
R1 p p R p3 p
0 x x + 3dx By parts: = [x arctan x]0 3 − x
dx = 3π
− [(1/2) ln(1 +
p p 0 1+x 2 3
R1 p R1
2
0 x x + 3dx = [x 3 (x + 3) ]x=0 − 0 23 (x + 3)3/2 dx = 23 43/2 −
3/2 x=1 x 2
)]0 3 = 3π
3 − ln 2
22
3 5 (x + 3) ]x=0 = 32 23 − 15
5/2 x=1 4
[(4)5/2 − (3)5/2 ] = 16 4
3 − 15 [32 −
p p p
−16+12 3
9 3] = 80 128 36
15 − 15 + 15 3 = 5 . R1
0 arcsin xdx
p
dx = π2 +[ 1 − x 2 ]10 = π2 −1.
R1
By parts: = [x arcsin x]10 − p x
R 2 0 1−x 2
x cos xdx
2 2
R
RIntegrate by2parts: = x sin x − 2x sin x = x sin x +2x cos x − R∞ π
dx
2 cos x = x sin x + 2x cos x − 2 sin x +C . 1 x3

x −3 dx = limt →∞ π − 2x1 2 1 = limt →∞ π − 2t12 + 21 = π2 .


R∞ ¤t

£ ¡ ¢
1
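$\int_1^5 \frac{dt}{\sqrt{3t+1}}$
Make the substitution $3t + 1 = u$. Then $\frac{du}{dt} = 3$, which gives $dt = \frac13\,du$. The upper bound of integration is $3\cdot 5 + 1 = 16$ and the lower bound $3\cdot 1 + 1 = 4$. Thus $\int_1^5 \frac{1}{\sqrt{3t+1}}\,dt = \int_4^{16} \frac{1}{\sqrt{u}}\,\frac13\,du = \frac13\int_4^{16} u^{-1/2}\,du = \frac13\left[\frac{u^{1/2}}{1/2}\right]_4^{16} = \frac23\left(\sqrt{16} - \sqrt{4}\right) = \frac43$.

$\int_{-1}^{1} \frac{1}{(1+x)^2}\,dx$
This is an improper integral because $\frac{1}{(1+x)^2}$ has an asymptote at $x = -1$. Hence we must find it using limits: $\int_{-1}^{1} \frac{1}{(1+x)^2}\,dx = \lim_{t\to -1^+}\left[-\frac{1}{1+x}\right]_t^1 = \lim_{t\to -1^+}\left(-\frac12 + \frac{1}{1+t}\right) = \infty$. In other words, the integral is divergent.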
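The two examples just above can be spot-checked numerically. The following is only a rough sketch: the helper `midpoint` is a hypothetical name introduced here (it is not part of the text), and the midpoint rule is just one convenient way to approximate an integral.

```python
import math

def midpoint(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Substitution example: integral of 1/sqrt(3t+1) from 1 to 5, claimed to be 4/3.
sub_value = midpoint(lambda t: 1 / math.sqrt(3 * t + 1), 1, 5)

# Improper example: the integral of 1/(1+x)^2 from t to 1 equals 1/(1+t) - 1/2,
# which grows without bound as t approaches -1 from the right.
tails = [1 / (1 + t) - 0.5 for t in (-0.9, -0.99, -0.999)]

print(sub_value, 4 / 3)
print(tails)
```

The first line of output should show two nearly equal numbers, while the list grows roughly tenfold at each step, which is what divergence looks like in practice.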

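$\int_1^\infty \frac{dx}{x^2+x}$
$= \int_1^\infty \left(\frac1x - \frac{1}{x+1}\right)dx = \lim_{R\to\infty}\int_1^R \left(\frac1x - \frac{1}{x+1}\right)dx = \lim_{R\to\infty}\left[\ln x - \ln(x+1)\right]_1^R = \lim_{R\to\infty}\left[\ln\frac{x}{x+1}\right]_1^R = \ln 2$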
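To see the limit in this example at work, one can evaluate the antiderivative at ever larger upper bounds. This is a minimal sketch; the function name `partial_integral` is an assumption made for illustration.

```python
import math

def partial_integral(R):
    """Exact integral of 1/(x^2 + x) from 1 to R, using ln(x/(x+1))."""
    return math.log(R / (R + 1)) - math.log(1 / 2)

for R in (10, 1_000, 1_000_000):
    print(R, partial_integral(R))
print("ln 2 =", math.log(2))
```

The printed values increase towards ln 2 as R grows, in line with the convergence claimed above.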

Determine the area enclosed by the curves $y = x^2$ and $y = -x + 1$.
Setting the $y$-values equal to find the points of intersection of the curves gives $-x + 1 = x^2$, which has the solutions $x = \frac12(-1 - \sqrt5)$ and $x = \frac12(-1 + \sqrt5)$. Between these points, the upper curve is $y = -x + 1$. Hence the area is
$\int_{\frac12(-1-\sqrt5)}^{\frac12(-1+\sqrt5)} \left((-x + 1) - x^2\right)dx = \left[-\frac{x^2}{2} + x - \frac{x^3}{3}\right]_{\frac12(-1-\sqrt5)}^{\frac12(-1+\sqrt5)} = \frac{5\sqrt5}{6}$.
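The area found in this example can be confirmed numerically. The sketch below assumes a simple midpoint rule; `midpoint` is a hypothetical helper, not something defined in the text.

```python
import math

def midpoint(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Intersection points of y = x^2 and y = -x + 1.
x1 = (-1 - math.sqrt(5)) / 2
x2 = (-1 + math.sqrt(5)) / 2

# Upper curve minus lower curve, integrated between the intersections.
area = midpoint(lambda x: (-x + 1) - x**2, x1, x2)
print(area, 5 * math.sqrt(5) / 6)
```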

4 APPLICATIONS OF INTEGRATION

§ 4.1. Volume

§ 4.1.1. Lecture worksheet

4.1.1. As the area under $y(x)$ is made up of rectangles with height $y$ and base $dx$, so is the volume of a sphere made up of cylindrical slices with thickness $dx$.
(a) Find a formula for the volume of such a slice, i.e., base area times height expressed in terms of $r$ and $x$. We assume that the $x$-axis skewers the cylinders right through the middle.
(b) Sum up the pieces (i.e., integrate your expression) to find the famous formula for the volume of a sphere.

4.1.2. Generalise your argument to find an integral expression for the volume of any solid of revolution, i.e., volume obtained when the area under a graph $y(x)$ is rotated about the $x$-axis.

4.1.3. Volume of pyramid. In a manner similar to problem 4.1.1, the integral $\int x^2\,dx$ can be interpreted as the volume of a pyramid.
(a) Explain how, by interpreting each $x^2$ as an actual geometrical square.
(b) Show how the well-known formula for the volume of a pyramid agrees with a well-known integration rule.

§ 4.1.2. Problems

4.1.4. Find an integral formula for the volume of the solid of revolution generated when area under $y(x)$ is revolved about the $y$-axis. Hint: Consider the volume to be made up of thin cylindrical shells with height $y$ and thickness $dx$. What is the circumference and volume of such a shell?

§ 4.2. Arc length

§ 4.2.1. Lecture worksheet

4.2.1. Explain what $\int_a^b ds$ represents. (The meaning of $ds$ is shown in a figure in §1.1.)

4.2.2. Show that $\int_a^b \sqrt{1 + (y'(x))^2}\,dx$ expresses the arc length of the curve $y(x)$ from $x = a$ to $x = b$. Hint: This is really nothing but the Pythagorean Theorem applied to infinitesimal triangles. Express $ds$ in terms of $dx$ and $dy$ and then factor out $dx$.

4.2.3. Find the arc length of the semi-cubical parabola $y^2 = x^3$ from $(0, 0)$ to $(1, 1)$.

4.2.4. (a) Use the geometrical definition of the inverse trigonometric functions from §A.3 to express $\arccos(x)$ as an integral.
(b) What is the derivative of $\arccos(x)$? Explain how this follows from the above.
(c) Do the same for $\arcsin(x)$.

§ 4.2.2. Problems

4.2.5. Argue that the surface area of a surface of revolution (curve $y(x)$ revolved about the $x$-axis) is $\int 2\pi y\,ds$.

4.2.6. Consider the curve $y = 1/x$ from $x = 1$ to $x = \infty$.
(a) Using the techniques of §4.3, find the volume generated when the area under this curve is rotated about the $x$-axis.
(b) Using problem 4.2.5, find the surface area of the same shape.
(c) Is the result “paradoxical”?

§ 4.3. Center of mass

§ 4.3.1. Lecture worksheet

The “law of the lever” says that a lever multiplies the effect of a force by the length of the lever arm from the fulcrum to the point where the force is applied. Thus we can lift a stone with, say, a three times smaller force than that required to lift it directly by using a lever with a three times longer arm on our side than on the stone’s side.

We can see the same principle in action on a children’s playground seesaw: one child can balance two on the other side if he sits twice as far out on the beam.

Algebraically, an equilibrium evidently corresponds to $\sum mx = 0$, where $x$ is the position along the axis (with the fulcrum as origin) and $m$ is the mass applied at that point. (We are assuming that the lever bar itself is weightless.)

We can also reverse the problem: given some masses $m_i$ and their positions $x_i$ along the axis (measured from any reference point, such as for instance the left endpoint), find where the fulcrum needs to be placed to achieve equilibrium; that is, find the point where you can balance the whole thing on the tip of your finger. This point is called the center of mass and is denoted $\bar{x}$ (“x bar”). Another way of putting it is that the original distribution of masses is equivalent to all masses being stacked at this one point $\bar{x}$: pooling your masses in this way would not alter the behaviour of the system as a whole as far as balances, levers, seesaws, etc. are concerned.

4.3.1. Find the center of mass $\bar{x}$ in terms of the $x_i$ and $m_i$ using the equilibrium equation above.
Hint: That equation was set up with the fulcrum as the origin of the coordinate system. How does it change if the origin of the coordinate system is, e.g., the left endpoint of the bar and the fulcrum is at $x$-coordinate $\bar{x}$?

4.3.2. Argue that the result can be interpreted in a natural way as a kind of average.

4.3.3. Why does it not matter where the origin of the coordinate system is located?

Suppose now that the masses are located at various points $(x_i, y_i)$ in a plane (say placed on a thin metal tray) instead of along a single axis.

4.3.4. What is the physical meaning of your expression for $\bar{x}$ in this context?

4.3.5. Find an expression for the center of mass in this case (i.e., the point at which you could balance the whole tray on the tip of your finger).

Suppose we want to find the center of mass of a figure, such as for example the area between the parabola $y = x^2 - 1$ and the $x$-axis. We can imagine this area being cut out of a sheet of metal and we want to know on which point this piece of metal could be balanced on the tip of a needle. Above we dealt only with the center of mass of a system of point-masses, but the idea is easily extended by thinking of the figure as made up of many little point-masses. In our example, we see already without calculations that $\bar{x} = 0$ for symmetry reasons, meaning that our piece of metal would balance on the edge of a knife placed along the $y$-axis. But what is $\bar{y}$? Along which horizontal line does the shape balance?

4.3.6. Find $\bar{y}$ by slicing the area into $dy$-thin horizontal strips and considering each strip a separate point mass.

We can also calculate the center of mass of a curve in an analogous way. We assume then that mass is proportional to arc. Physically we may think of a piece of metal wire, for example.

4.3.7. Express $\bar{x}$ and $\bar{y}$ as integrals with respect to arc length. Hint: each segment $ds$ of the curve (cf. §4.2) may be considered a point mass.

The idea of the center of mass $(\bar{x}, \bar{y})$ is useful and suggestive beyond its physical context. As we saw, it can be interpreted as a kind of “average position.” This is a useful notion in geometry, where $(\bar{x}, \bar{y})$ is also called the centroid to de-emphasise its physical connotations. The center of mass can also be seen as a “nutshell” summary of a complex system. For example, this map of the center of mass of the population of the United States at various points in time vividly captures a complex historical development:

§ 4.3.2. Problems

4.3.8. Above we sliced the area into horizontal strips when calculating $\bar{y}$ of a figure.
(a) Explain why this is in a way more natural than using vertical slicing.
(b) Explain how you could nevertheless compute $\bar{y}$ on the basis of vertical strips. Hint: first pool the mass of each strip at its center of mass.
(c) Give an example where vertical slices are more convenient than horizontal ones.

4.3.9. Theorem of Pappus. Show that if some plane area is rotated about the $y$-axis then the volume generated is equal to the area of the region times the distance travelled by its center of mass. Hint: compare the integral expression for $\bar{x}$ with that for rotational volume (§4.6.1).
like those of §A.7.12, so integration can be used to go the other
way and “build up” to the quantities occurring in conservation
laws starting from what is known. This is done by means of the
4.3.10. Find a similar theorem for the surface area of a solid of
concept of work, which is another way of looking at energy. We
revolution.
may define work as force times distance, or,
R more generally, the
4.3.11. Galileo thought that the shape of a necklace held up by integral of force with respect to distance, F ds.
its endpoints is a parabola. We shall prove that it is not.
4.4.1. Looking back at §2.4, explain how the two energy terms
The physical principle we need for this is: nature strives
can be obtained as work integrals given known expres-
to arrange the necklace in such a way that its center of
sions for force (§A.7.12):
gravity is as low as possible.
(a) −G Mym .
Consider the parabola y = x 2 −1 and view the portion be-
2
low the x-axis as the shape of a necklace suspended from (b) mv
R
2 . Hint: rewrite ma ds as an integral with re-
the points (−1, 0) and (1, 0). spect to v.
(a) Find the center of mass of this part of the parabola. But why should work, as defined above, be the same thing as
I say that a necklace placed along this curve and released energy? How can we arrive at this definition of work in the
would instead attain the shape first place? The rest of this section is devoted to developing our
physical intuition about this.
y = 0.314 e x/0.628 + e −x/0.628 − 1.607
¡ ¢
An object of mass m at a not-too-great height h above the sur-
face of the earth has a potential energy of mgh. This means
These two curves look like this, with the parabola drawn that we could, potentially, have it do so much work for us. You
solid: can think of for example a water wheel driven by a water fall:
this device takes advantage of the potential energy stored in the
water by virtue of its altitude, and harnesses it for some other
purpose. Thinking in terms of water wheels, it is easy to under-
stand why potential energy is proportional to mass and height.
For if the height is double, you can have the water run through
twice as many wheels on its way down, so you get twice as
much work out of it. And if the mass is double you can split it
in half and run each part through the water wheels separately,
which makes it clear that you get twice the work in this case
also. By the same argument we obtain the general relation

work = force × distance


(b) Verify that the curves have the same arc length. which may be taken as the formal definition of work, as above.
(c) Compute the center of mass of the second curve. Potential energy is energy by virtue of position; kinetic energy
is energy by virtue of velocity. Water can drive a water wheel
We shall see how to arrive at the true shape of the neck-
not only by falling from a certain height (potential energy) but
lace in problem 6.2.2.
also by rushing ahead in a stream at a certain velocity (kinetic
energy). I shall now prove to you that just as potential energy
is measured by mgh, so kinetic energy is measured by 12 mv 2 .
§ 4.4. Energy and work
First I want to make it clear that kinetic energy is “stored work.”
Imagine yourself pushing a wagon along a railway track. When
§ 4.4.1. Lecture worksheet you are done pushing and let the wagon go, all the work you
put into it is now “stored” in the wagon in the form of kinetic
In §2.4 we studied conservation laws in physics. We made the energy. We can get it back out again for example by our proto-
general point that proving such laws corresponds to showing type method of water wheels, which we could have the wagon
that the derivative of some quantity is zero. But the inquisitive set spinning as it hits them along its path. Experience shows
reader may have objected: it’s all fine and well to prove that that it takes the same amount of effort to stop the wagon as it
some given quantity is constant by differentiating it, but how did to get it moving, so it is clear that the amount of work stored
do we know what quantity to differentiate in the first place? In in the wagon is the same as that you put into it.
other words, how do we discover what the conservation law is,
When you push the wagon to get it moving you are applying a
as opposed to proving it once it has been proposed?
certain force across a certain distance. The product of the two
Since differentiation and integration are inverse processes, you is the work you do, we saw above. This, then, is a measure of
will perhaps not be surprised to learn that, just as differentia- the kinetic energy, but not a very nice one. Kinetic energy is
tion can bring a conservation law “back down” to basic laws quite clearly intrinsic to the moving wagon, so it is awkward

to characterise it in terms of the action of the worker who set 146 meters. Its interior is basically solid stone through-
it moving and however long of a run-up he used. We should out, except for a few small chambers. The stones used
much prefer to express it in terms of the mass and velocity of weigh about 2700 kg/m3 .
the wagon. But this is easily done, for we know that
(a) Calculate the total work done in erecting the pyra-
force = mass × acceleration mid. Hint: Slice the pyramid into horizontal lay-
ers and express the work required to lift the stones
and for each layer. Then integrate to get the total work
distance = average velocity × time needed for all layers.
“Distance” here means the length of your run-up before you According to the ancient Greek historian Herodotus, the
released the wagon, and “time” how long you took to complete pyramid was built in 20 years by 100 000 workers. Let’s
it. Let’s say that you push equally hard throughout, so that the check whether this seems plausible.
force, and thus acceleration, is constant.
(b) Estimate how much work a man can do in one hour.
4.4.2. Conclude from this that the kinetic energy is 12 mv 2 . Hint: Picture the man lifting weights onto a ledge
The two forms of energy that we have studied are clearly in- of height 1 meter. What weight can he lift and how
terchangeable: when an object falls it “trades in” potential en- many times can he repeat this in one hour?
ergy for kinetic, and conversely when its velocity is directed up- (c) Based on your estimate, how many man hours
wards. By means of some ramps we could turn a water fall into would have been needed to lift the stones into place
a stream and conversely, so we would quite like to know which for the pyramid?
is better for driving water wheels. But it turns out to be all the
same. The economy of nature is such that the exchange rate in (d) Does Herodotus’s claim seem plausible?
these kinds of transactions is one to one. Energy is conserved.
This agrees with experience but we can also prove it formally.
§ 4.5. Logarithms redux
4.4.3. Prove, by taking its time-derivative, that the total energy
mgh + 21 mv 2 is constant for a freely falling object.
§ 4.5.1. Lecture worksheet
Another useful way of establishing this sort of result is to prove
that if it didn’t hold one could exploit the discrepancy to build
Integrals give us a new way of looking at logarithms, which is
a perpetual-motion machine which could create energy out of
more illuminating in certain respects. In particular, this per-
nothing, which is known to be impossible or at least a point
spective enables us to understand the derivative of the loga-
on which we would be very pleasantly surprised to be proven
rithm in a more direct way than the approach we used in prob-
wrong.
lem 1.4.1.
4.4.4. Argue on such grounds that mgh + 12 mv 2 is constant.
In §A.4 we saw that the essence of logarithms is that they turn
multiplication into addition:
§ 4.4.2. Problems
log(ab) = log(a) + log(b) (L1)
4.4.5. When you are pushing the wagon to get it moving, if you
push it for twice as long, while maintaining the same and that a table of powers of some integer has this property
constant force, then you double its final speed. But the when “read backwards.” If we plot the values of such a table in
kinetic energy doesn’t double but quadruple since it is a coordinate system we get a picture like this:
proportional to v 2 . So by doubling the input effort you
got four times the output energy stored in the system: a
violation of energy conservation. Resolve the paradox.
4.4.6. Consider a lever with one lever arm just over twice as
long as the other. Attached to the shorter arm is a weight
of mass 2. You lift a weight of mass 1 and attach it to
the longer lever arm. Then this weight will sink and the
We are looking to extend this table to include all intermediate
other one will rise. Doesn’t this prove that lifting a unit
values as well. In §A.4 we did this in an algebraic fashion. Now
weight is equivalent to lifting two unit weights? By con-
we wish to do it geometrically. Thus we look at the plot and try
necting several levers you could even lift any weight of
to characterise the function that runs through these points.
mass 2n with no more effort required to lift the origi-
nal unit weight. This surely violates energy conservation. 4.5.1. (a) Argue that it seems plausible that a function run-
Resolve the paradox. ning through these points has derivative 1/x.
Rx
4.4.7. The Great Pyramid of Giza, Egypt, has a square base (b) Explain why the function f (x) = c 1t dt, where c is
with side length 230 meters, and its original height was any constant, has this derivative.

We now wish to check whether the function so defined in fact has the property (L1), as desired.
(c) For the formula $f(ab) = f(a) + f(b)$ to hold for all positive numbers $a$ and $b$, it is necessary that $f(p) = 0$ for a certain number $p$. Explain why. (Hint: Consider simple values for $a$ and $b$.) What is $p$?
(d) Therefore, what is $c$?
(e) How is the derivative of $f(ax)$ related to the derivative of $f(x)$?
(f) What does this imply about how $f(ax)$ is related to $f(x)$? Use the value that you found in (b) to further specify this relation.
(g) Show that this implies $f(ab) = f(a) + f(b)$, as desired.

§ 4.5.2. Problems

4.5.2. † Discuss what you consider to be the advantages and disadvantages of the two ways of defining logarithms presented in sections §A.4 and §4.5. Hint: aspects to consider might include how these definitions incorporate non-fractional exponents, and what is “natural” about the natural logarithm.

4.5.3. With the logarithm defined as in §4.5.1, we can define the exponential function $e^x$ as its inverse. Prove $(e^x)' = e^x$ and $e^{x+y} = e^x e^y$ starting with this definition.

4.5.4. We write $\ln|x|$ rather than $\ln x$ for the antiderivative of $1/x$. This may seem like a hassle. Of course, when the logarithm is defined as above it only exists for positive numbers, but what’s stopping us from simply extending the definition to include negative numbers as well, so that $\ln x = \ln|x|$, which would spare us the trouble of writing the absolute value bars all the time? Hint: What are some other important properties that we want the logarithm function to have?

4.5.5. Argue that $\ln|x| + C$ is not quite the most general antiderivative of $1/x$. Hint: replace $C$ with a locally constant function.

§ 4.6. Reference summary

§ 4.6.1. Geometrical applications of integration

Volume of solid of revolution generated when area under $y(x)$ is revolved about $x$-axis (seen as made up of thin disks):

$\int_a^b \pi y^2\,dx$

Find the rotational volume generated when $f(x) = \sqrt{\sin(2x)}$, $0 \le x \le \frac{\pi}{2}$, is rotated about the $x$-axis.
$= \pi\int_0^{\pi/2} \left(\sqrt{\sin(2x)}\right)^2 dx = \pi\int_0^{\pi/2} \sin(2x)\,dx = \frac{\pi}{2}\left[-\cos(2x)\right]_0^{\pi/2} = \pi$.

Determine the rotational volume generated when the curve $y(x) = (1+a)e^{-ax}$, $x \ge 0$, is rotated around the positive $x$-axis.
$V(a) = \pi\int_0^\infty (y(x))^2\,dx = \pi(1+a)^2\int_0^\infty e^{-2ax}\,dx = \pi(1+a)^2\left[\frac{-1}{2a}e^{-2ax}\right]_0^\infty = \frac{\pi(1+a)^2}{2a}$.

Find the rotational volume generated when $y = \sqrt{ax}\,e^{-ax^2}$, $x \ge 0$, is rotated about the $x$-axis.
$= \pi\int_0^\infty ax\,e^{-2ax^2}\,dx = [2ax^2 = t,\ 4ax\,dx = dt] = \frac{\pi}{4}\int_0^\infty e^{-t}\,dt = \frac{\pi}{4}\left[-e^{-t}\right]_0^\infty = \frac{\pi}{4}\lim_{T\to\infty}\left(1 - e^{-T}\right) = \frac{\pi}{4}$.

Volume of solid of revolution generated when area under $y(x)$ is revolved about $y$-axis (seen as made up of thin cylindrical shells):

$\int_a^b 2\pi x y\,dx$

Find the volume generated when the area under the curve $y = x^2$, $1 < x < 2$, is rotated about the line $x = 2$.
$= \int_1^2 2\pi(2 - x)x^2\,dx = 2\pi\int_1^2 \left(2x^2 - x^3\right)dx = 2\pi\left[\frac{2x^3}{3} - \frac{x^4}{4}\right]_1^2 = 2\pi\left(\frac{16}{3} - \frac{16}{4} - \left(\frac23 - \frac14\right)\right) = \frac{11\pi}{6}$.

Arc length of curve $y(x)$ from $x = a$ to $x = b$:

$\int_a^b ds = \int_a^b \sqrt{1 + (y')^2}\,dx$

Surface area of a surface of revolution (curve $y(x)$ revolved about the $x$-axis):

$\int_a^b 2\pi y\,ds = \int_a^b 2\pi y\sqrt{1 + (y')^2}\,dx$

Average value of $f(x)$ on given interval:

$\frac{1}{b-a}\int_a^b f(x)\,dx$

Area between $f(x)$ and $g(x)$ where $f(x)$ is the upper function (if no interval $[a, b]$ is given, intersections are the implied endpoints: find them by setting $f(x) = g(x)$):

$\int_a^b \left(f(x) - g(x)\right)dx$
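The disk and shell formulas of this summary can be tried out on two of the worked examples, namely $y = \sqrt{\sin(2x)}$ rotated about the $x$-axis and $y = x^2$, $1 < x < 2$, rotated about the line $x = 2$. This is a hedged sketch: `midpoint` is a hypothetical helper introduced only for this check, and the midpoint rule is an arbitrary choice of numerical method.

```python
import math

def midpoint(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Disks about the x-axis: integrand pi * y^2 with y^2 = sin(2x).
disk_volume = midpoint(lambda x: math.pi * math.sin(2 * x), 0, math.pi / 2)

# Shells about the line x = 2: radius (2 - x), height y = x^2.
shell_volume = midpoint(lambda x: 2 * math.pi * (2 - x) * x**2, 1, 2)

print(disk_volume, math.pi)
print(shell_volume, 11 * math.pi / 6)
```

Note that for rotation about $x = 2$ the shell radius is $2 - x$ rather than $x$; the formula is otherwise the same.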

§ 4.6.2. Center of mass

(x̄, ȳ) = centroid = center of mass (assuming uniform density).


Center of mass of region under graph of $y(x)$ from $x = a$ to $x = b$:

$\bar{x} = \frac{\int_a^b x y\,dx}{\int_a^b y\,dx} \qquad \bar{y} = \frac{\int_a^b \frac12 y^2\,dx}{\int_a^b y\,dx}$

Center of mass of region between graphs of $f(x)$, $g(x)$ from $x = a$ to $x = b$:

$\bar{x} = \frac{\int_a^b x(f - g)\,dx}{\int_a^b (f - g)\,dx} \qquad \bar{y} = \frac{\int_a^b \frac12\left(f^2 - g^2\right)dx}{\int_a^b (f - g)\,dx}$

Center of mass of curve $y(x)$ from $x = a$ to $x = b$:

$\bar{x} = \frac{\int_a^b x\,ds}{\int_a^b ds} = \frac{\int_a^b x\sqrt{1 + (y')^2}\,dx}{\int_a^b \sqrt{1 + (y')^2}\,dx}$

$\bar{y} = \frac{\int_a^b y\,ds}{\int_a^b ds} = \frac{\int_a^b y\sqrt{1 + (y')^2}\,dx}{\int_a^b \sqrt{1 + (y')^2}\,dx}$
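As an illustration of the first pair of formulas, the sketch below computes the centroid of the half-disk under $y = \sqrt{1 - x^2}$, $-1 \le x \le 1$. This region is an example chosen here, not one from the text; its centroid is known to lie at height $4/(3\pi)$. The helper `midpoint` is likewise an assumed name.

```python
import math

def midpoint(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def y(x):
    """Upper boundary of the half-disk of radius 1."""
    return math.sqrt(1 - x * x)

area = midpoint(y, -1, 1)                                  # integral of y dx
x_bar = midpoint(lambda x: x * y(x), -1, 1) / area         # integral of x*y dx over area
y_bar = midpoint(lambda x: 0.5 * y(x) ** 2, -1, 1) / area  # integral of y^2/2 dx over area

print(x_bar, y_bar, 4 / (3 * math.pi))
```

By symmetry $\bar{x}$ comes out as zero up to rounding, and $\bar{y}$ should be close to $4/(3\pi) \approx 0.424$.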

§ 4.6.3. Work
work = force × distance = $\int F\,ds$

§ 4.6.4. Integral definition of the logarithm


$\ln(x) = \int_1^x \frac{1}{t}\,dt$

[Figure: region labelled $\ln(x)$, between $1$ and $x$ on the horizontal axis.]
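The integral definition can be tested against property (L1), $\log(ab) = \log(a) + \log(b)$. The sketch below is a numerical check rather than a proof; the function name `log_integral` and the sample values of $a$ and $b$ are assumptions made for illustration.

```python
import math

def log_integral(x, n=50_000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to x (x > 0)."""
    h = (x - 1) / n
    return h * sum(1 / (1 + (i + 0.5) * h) for i in range(n))

a, b = 2.0, 3.5
lhs = log_integral(a * b)                # integral from 1 to ab
rhs = log_integral(a) + log_integral(b)  # sum of the two separate integrals

print(lhs, rhs)
print(log_integral(math.e))  # close to 1 if the definition matches ln
```

The same function also handles $0 < x < 1$, where the step is negative and the result is a negative number, as expected of a logarithm.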

Instead we build them up from standard series such as the
5 P OWER SERIES above, by algebraic manipulations like substituting, multiply-
ing, and so on, as we are used to doing for ordinary polyno-
mials. Furthermore we shall see below that many important
§ 5.1. The idea of power series series arise more naturally in other ways altogether.

The above series give us a way of estimating these functions


§ 5.1.1. Lecture worksheet
by polynomials. If we cut the series off after for instance the
second-degree term we get the best possible parabolic approx-
Functions can be expressed as power series:
imation. Here I have illustrated this for the cosine and expo-
f (x) = A + B x +C x 2 + D x 3 + · · · nential functions:

We can think of the coefficients as so many “degrees of free-


dom,” i.e., free choices we can make when picking the coeffi-
cients.
5.1.1. These “degrees of freedom” have a direct visual meaning.
(a) Argue visually that by suitable choices of the con-
stants a, b, c, you can make a parabola of the form
y = ax 2 + bx + c = A(x − B )2 + C pass through any
predetermined points.
(b) For y = bx + c the number of points I can make it
pass through is
If we include more terms of the series we will see the polyno-
(c) For y = c the number of points I can make it pass mial “hugging” the function more and more closely, like this:
through is
(d) Conclude that it makes sense that any function can
be represented by an “infinite polynomial.”
The power series for a given function f (x) can be found by
plugging zero into f , f 0 , f 00 , etc., which gives the values of A,
B , C , etc., respectively.
5.1.2. Show that this gives

$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$

$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$

$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$

5.1.3. Suppose you know the power series for sine and cosine but have no calculator at your disposal. In which of the following situations could you use the power series to resolve your problem?
□ I remember the wavy shape of the graph of the sine function, but I forget how to plot it and tell it apart from the cosine graph.
□ I remember that the sine and cosine functions are basically each other’s derivative, except there is a minus sign somewhere, and I forget where it goes.
□ I remember that $\sin(60^\circ)$ is something quite simple, but I forget the exact value.

5.1.4. By picturing the graph of $\ln(x)$, I feel that the power series for $\ln(1 + x)$ starts with a [positive/negative/zero] constant term, a [positive/negative/zero] linear term, and a [positive/negative/zero] quadratic term.

5.1.5. Argue that the power series for the sine implies that $\sin x \approx x$ when $x$ is small. (This is a useful approximation in many situations. We mentioned it already in §A.3, and we also effectively used this approximation in §6.4 when deriving the differential equation for pendulum motion.)

5.1.6. Show that applied to a general function $f(x)$ the method gives

$f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots$

One use of the idea of polynomial approximation is to tackle
difficult integrals. We quite often face integrals for which
This method of repeated differentiation for finding power se- none of our usual integration tricks work, such as the integral
2
ries is in principle always applicable. But in practice we rarely of e −x (the normal q distribution function at the heart of sta-
a 4 +(b 2 −a 2 )x 2
derive power series by the method of repeated differentiation. tistical theory) or a 4 −a 2 x 2
(the arc-length of the ellipse

$x^2/a^2 + y^2/b^2 = 1$, an important problem in astronomy since planets move in elliptical orbits). In such cases the best we can do is often to expand the function as a power series and integrate term by term, which gives us the desired integral in series form.

5.1.7. Let us evaluate $\int e^{-x^2}\,dx$ in this way.
(a) If I include only the first four non-zero terms, the integral is ____
(b) Though not an exact solution in closed form, this is still very useful. For example, I could use it to find a good approximation to $\int_0^1 e^{-x^2}\,dx$. Suppose I use only the first three non-zero terms for this. This must already be quite good because I see from the above that the next term would only affect the ____ decimal and subsequent terms are even smaller.

5.1.8. Suppose I use the first five terms of the power series for $e^x$ to approximate $e^{0.1}$, then use this result to find an approximation for $e$ by [raising the result to the power 10, taking 1 divided by the result, multiplying the result by 10, taking the ln of the result and multiplying by 10]. Alternatively, I could find an approximation for $e$ directly from the series by [plugging in $x = 0$, plugging in $x = 1$, using a geometric series, using a binomial series]. Which of the two methods will be more accurate? [the first, the second, both equal]

§ 5.1.2. Problems

5.1.9. (a) Estimate the sine of $1^\circ$ using nothing but a simple calculator that only has the operations +, −, ×, /.
(b) Check your answer using a more advanced calculator that has a sine button.
(c) Is the “more advanced” calculator really more advanced, or does it just have the algorithm of (a) on “speed dial”?

5.1.10. (a) By considering the roots of $\sin(x)/x$, argue that its power series

$\sin(x)/x = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots$

can be factored as

$\left(1 - \frac{x^2}{\pi^2}\right)\left(1 - \frac{x^2}{4\pi^2}\right)\left(1 - \frac{x^2}{9\pi^2}\right)\cdots$

by analogy with the way one factors ordinary polynomials, such as $x^2 - x - 2 = (x + 1)(x - 2)$.
(b) What is the coefficient of $x^2$ when the product is expanded?
(c) Equate this with the coefficient of $x^2$ in the ordinary power series and use the result to find a formula for the sum of the reciprocals of the squares, $\sum 1/n^2$.

§ 5.2. The geometric series

§ 5.2.1. Lecture worksheet

5.2.1. (a) What is the greatest number smaller than 1? One is inclined to suggest $a = 0.99999\ldots$, but argue against this by considering $10a - a$.
(b) Generalise your argument to find a closed formula for $1 + x + x^2 + x^3 + \cdots$

5.2.2. Explain how power series are related to the paradox of motion mentioned by Aristotle, Physics, 239b11: “[Zeno] asserts the non-existence of motion on the ground that that which is in locomotion must arrive at the half-way stage before it arrives at the goal,” and then the half-way stage of what is left, etc., ad infinitum.

5.2.3. Derive the series

$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$

by first noting that

$\ln(1 + x) = \int_1^{x+1} \frac{1}{t}\,dt = \int_0^x \frac{1}{1+u}\,du,$

then applying the geometric series, then integrating term by term.

§ 5.2.2. Problems

5.2.4. What is $0.888\ldots$? Does it “spill over” like we saw $0.999\ldots$ do in problem 5.2.1a?

5.2.5. By the fundamental theorem of calculus, the arctangent is the integral of its derivative.
(a) Use this to find a power series for the arctangent. (To make sure that you take the constant of integration into account, check that your constant term is correct using the geometrical definition of the arctangent.)
(b) Find the value of $\arctan(1)$ in two ways: by the geometrical definition, and from the power series.
(c) Equate these two expressions for $\arctan(1)$ to find an infinite series representation for $\pi$.
When Leibniz found this series he concluded that “God loves the odd integers,” as you can see in the figure below (taken from his 1682 paper).

(d) ? What does Leibniz's series have to do with a square of area 1, which is what Leibniz has drawn on the left? Hint: The use of the Greek letter $\pi$ to denote the famous circle constant is a relatively recent invention. It was never used in Leibniz's time, and certainly not by the ancient Greeks. This is because they preferred to formulate mathematical truths geometrically rather than by "formulas." So to understand Leibniz's mode of expression you should consider how to express your formula for $\pi$ in purely geometrical terms.

Leibniz's series is beautiful but it is not really very efficient for computing $\pi$. Already in 1424 al-Kashi had computed $\pi$ with 16-decimal accuracy using different methods.

(e) ? Estimate how many terms of Leibniz's series must be added together to achieve al-Kashi's accuracy.

Here is al-Kashi's result in his own notation:

That's 3.1415926535897932. Note the interesting way in which our symbols for 2 and 3 are derived from their Arabic counterparts. The Arabic symbols are perhaps a more natural way of denoting "one and then some."

5.2.6. In this problem we shall show that $e$ is not a rational number, i.e., not a ratio of two integers. We shall do this by assuming on the contrary that $e = p/q$ for some integers $p$ and $q$, and showing that this leads to a contradiction.

(a) Find a series representation of $e$ using the power series for $e^x$, and set it equal to $p/q$.

(b) Multiply both sides by $q!$. You will find that the infinite series now starts with a series of integers and then from a certain point onwards becomes a series of fractions. The first non-integer term is ____

(c) By writing down this and the next few terms, we see that, from this point onwards, the series is [< / = / >] the geometric series
$$\frac{1}{q+1} + \frac{1}{(q+1)^2} + \frac{1}{(q+1)^3} + \cdots$$
which in closed form = ____

(d) Consequently, we have shown that [$e$, $p$, $q$, $p!$, $q!$, $pq!/q$] can be written as [an integer, a negative number, a non-integer, 0] which is absurd. Therefore our initial assumption $e = p/q$ must have been false.

§ 5.3. The binomial series

§ 5.3.1. Lecture worksheet

The binomial series
$$(1 + x)^q = 1 + qx + \frac{q(q-1)}{2!}x^2 + \frac{q(q-1)(q-2)}{3!}x^3 + \cdots$$
is perhaps best thought of by analogy with the integer-exponent case. When $q$ is a positive integer the series dies after the $(q + 1)$th term, and one has the trivial results
$$(1 + x)^2 = 1 + 2x + x^2, \qquad (1 + x)^3 = 1 + 3x + 3x^2 + x^3, \qquad \text{etc.}$$
Here we can think of, say, the coefficient of $x^2$ in the last expression as follows. To get an $x^2$-term when expanding $(1 + x)^3 = (1 + x)(1 + x)(1 + x)$ we need to choose $x$'s from two of the parentheses and 1 from the third. For the first $x$ we have three choices, and for the second $x$ we have two choices, giving $2 \cdot 3 = 6$ choices in total, except that we must divide by the number of ways in which these two things we choose can be ordered internally (choosing first the third parenthesis and then the first is the same as choosing first the first and then the third), which is $2!$, thus explaining the coefficient 3. The general binomial theorem can be thought of in the same way, even though $q$ is no longer an integer. Let's try the $x^2$ coefficient again. To get an $x^2$-term when expanding $(1 + x)^q$ we need to choose $x$'s from "two of the $q$ parentheses" so to speak (whatever that is supposed to mean when $q$ is something like $-1/2$). For the first $x$ we have $q$ choices, and for the second $x$ we have $q - 1$ choices, and correcting for internal ordering the coefficient for $x^2$ should be $\frac{q(q-1)}{2!}$.

5.3.1. The binomial series for $q = 2$, $q = -1$, $q = \frac{1}{2}$ have very different standing. Match them with a suitable description:
□ becomes another famous series
□ becomes finite
□ amazingly works
□ equals a logarithm function

We typically use the binomial series for functions involving:
□ powers
□ fractions
□ roots
□ logarithms other than ln

An important application of the binomial series concerns the inverse trigonometric functions.
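The counting argument for the binomial coefficients in § 5.3.1 can be checked numerically. The following is a minimal Python sketch, not part of the original worksheet; the names `binomial_coefficient` and `binomial_partial_sum` are illustrative choices. It builds the coefficient $q(q-1)\cdots(q-k+1)/k!$ as a running product and compares partial sums of the series with a directly computed power.

```python
def binomial_coefficient(q, k):
    """Return q(q-1)...(q-k+1)/k!, the coefficient of x**k in (1+x)**q."""
    coeff = 1.0
    for j in range(k):
        # j-th factor: (q - j) "choices", divided by (j + 1) to build up k!
        coeff = coeff * (q - j) / (j + 1)
    return coeff


def binomial_partial_sum(q, x, num_terms):
    """Add up the first num_terms terms of the binomial series."""
    return sum(binomial_coefficient(q, k) * x**k for k in range(num_terms))


# Positive integer exponent: coefficients 1, 3, 3, 1 followed by zeros.
print([binomial_coefficient(3, k) for k in range(6)])

# Non-integer exponent with |x| < 1: partial sums approach (1 + x)**q.
q, x = 1 / 3, 0.2
for num_terms in (2, 4, 8):
    print(num_terms, binomial_partial_sum(q, x, num_terms), (1 + x) ** q)
```

The running product makes the "dying" of the series visible: once a factor $(q - j)$ is zero, every later coefficient is zero too, which can only happen when $q$ is a non-negative integer.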

5.3.2. (a) Recall from problem 4.2.4 how to express $\arcsin(x)$ as an integral.

(b) Find a power series representation of $\arcsin(x)$ by expanding the integrand as a binomial series and integrating term by term.

§ 5.3.2. Problems

5.3.3. The general binomial series can be bypassed in specific cases in the following way. Let's say we are looking for the power series of $\sqrt{1 + x^2}$. This means that we want a series such that
$$\sqrt{1 + x^2} = A + Bx + Cx^2 + Dx^3 + \cdots,$$
or, in other words,
$$1 + x^2 = (A + Bx + Cx^2 + Dx^3 + \cdots)(A + Bx + Cx^2 + Dx^3 + \cdots).$$
Find the series for $\sqrt{1 + x^2}$ by multiplying out the right hand side and identifying coefficients with the left hand side (constant term equal on both sides, coefficient of $x$ equal on both sides, coefficient of $x^2$ equal on both sides, etc.).

5.3.4. (a) What is the largest $r$ for which the circle with center $(0, r)$ and radius $r$ fits inside the parabola $y = x^2$? Use the power series expansion of the circle at the origin to find out. Explain how you can see from the power series that a smaller or larger $r$ would not work.

(b) Answer the same question with $y = 1 - \cos(x)$ in place of the parabola.

Leibniz called this the osculating circle, or "kissing circle" if translated literally from the Latin.

§ 5.4. Divergence of series

§ 5.4.1. Lecture worksheet

5.4.1. Argue that the geometric series
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$
is clearly nonsense for certain values of $x$.

How can this be when we "proved" this formula in problem 5.2.1b?

5.4.2. (a) To diagnose the problem, try your argument from problem 5.2.1b on the finite series $1 + x + x^2 + \cdots + x^n$.

(b) If we let $n \to \infty$, what does this tell us about when the geometric series works?

This case illustrates two points that apply generally:

• We cannot naively assume that we can always manipulate infinite series according to the same rules as finite expressions (although this works more often than not).

• We can often avoid pitfalls and be more careful in our reasoning by considering the infinite series as the limit case of a finite expression.

For the purposes of finitistic analysis, we can cut the series off and add up all the terms we have up to that point. This is called a partial sum. Often the terms of the series shrink very quickly. Then the partial sums will soon be basically equal to the whole series since the cut-off terms are so small. In such a case we say that the series converges, i.e., approaches a particular value. Convergent series are "almost like a finite expression" in this sense, so it is not surprising that they can generally be treated as such without any need to worry about falling into absurdities like those of problem 5.4.1.

When the partial sums of a series do not approach a particular value (i.e., when the cut-off part never becomes negligible no matter how far out you go) we say that the series diverges.

5.4.3. Argue that the geometric series exhibits two different "kinds" of divergence. Hint: consider $x = 2$ and $x = -1$.

Divergent series are the dangerous ones that can lead us into absurdities if we carelessly assume that they obey all ordinary rules of algebra.

5.4.4. Show for example that, given this assumption, the series for $x = -1$ that you obtained in problem 5.4.3 can easily be used to "prove" that $1 = 0$.

In the geometric series examples, the divergent cases were easy to spot in that their terms did not shrink to zero. No wonder then that we can never discard the cut-off part, since its terms are still large. Unfortunately, however, telling convergent from divergent series is not always as easy as this. It is in fact possible for the terms of a series to shrink to zero and for the series to nevertheless diverge at the same time. The following is such an example.

5.4.5. Consider the following purported proofs that
$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots = \infty$$

First proof. The second term is $\geq \frac{1}{2}$, the sum of the next two terms is $\geq \frac{1}{2}$, the sum of the next four terms is $\geq \frac{1}{2}$, the sum of the next eight terms is $\geq \frac{1}{2}$, and so on. I see this by estimating each term by the [minimum/maximum/average] of the terms in that group. The result follows.

Second proof. I can picture the sum as the areas of rectangles, where the value of the term is the height of a rectangle of base 1. By placing these rectangles in a coordinate system starting from [$(0, 0)$, $(-1, 0)$, $(1, 0)$] I see that $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots$ [< / = / >] $\int_a^\infty \frac{1}{x}\,dx$, where the $a$ corresponding to this sum is ____. I can therefore use integration to prove the result.
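As a numerical aside to the series in problem 5.4.5, here is a small Python sketch (an illustration with hypothetical helper names, and no substitute for either proof). It shows the partial sums continuing to grow even though the terms shrink to zero, and it adds up the successive blocks of 1, 2, 4, 8, ... terms that the first proof refers to.

```python
import math


def harmonic_partial_sum(n):
    """Partial sum 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def block_sum(j):
    """Sum of the 2**j terms 1/(2**j + 1) + ... + 1/2**(j + 1)."""
    start = 2**j + 1
    stop = 2 ** (j + 1)
    return math.fsum(1.0 / k for k in range(start, stop + 1))


# The partial sums grow slowly but without levelling off;
# ln(n) is printed alongside for comparison.
for n in (10, 1000, 100000):
    print(n, harmonic_partial_sum(n), math.log(n))

# Each block (1/2, then 1/3 + 1/4, then 1/5 + ... + 1/8, ...) is at least 1/2.
for j in range(6):
    print(j, block_sum(j))
```

Numerical evidence of this kind can suggest divergence but cannot establish it, since a computer only ever sees finitely many terms; that is why the worksheet asks for a proof.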

Which proof is valid? [the first/the second/both/neither]. The result shows that a series can [be computed/converge/diverge/vanish/oscillate] even though [its terms go to 0/all its terms are positive/it has only integer denominators/it has the same derivative as ln]. Can the same thing happen with a geometric series? [yes/no]

§ 5.4.2. Problems

5.4.6. Another proof that the harmonic series diverges can be given on the basis of the inequality
$$\frac{1}{n-1} + \frac{1}{n} + \frac{1}{n+1} > \frac{3}{n}$$

(a) Prove this inequality. Hint: geometrically, it reflects the fact that the function $1/x$ "flattens out" as $x$ increases.

(b) Apply this inequality to the harmonic series
$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = 1 + \left(\frac{1}{2} + \frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7}\right) + \cdots$$
and show how the divergence of the series follows from this.

5.4.7. I claimed that it is important to distinguish divergent series from convergent ones on the ground that the former can lead to absurdities if ordinary algebra is assumed to apply to them. However, I showed only that divergence of the types in problem 5.4.3 leads to absurdities, not that divergence of the type of problem 5.4.5 does so also. Derive an absurdity by careless reasoning with the latter series to establish that my point was well taken.

5.4.8. In fact, even convergent series are not entirely free of "paradoxes."

(a) Using the power series for the logarithmic function, write down a series expression for $\ln(2)$.

(b) Rearrange the order of these terms by moving some negative terms toward the beginning of the series in such a way that after every positive term comes the next two negative terms. Now subtract from every positive term the negative term that follows, and sum the resulting series.

(c) Discuss.

5.4.9. Consider the product series
$$\prod_{p \text{ prime}} \frac{1}{1 - \frac{1}{p}} = \left(\frac{1}{1 - \frac{1}{2}}\right)\left(\frac{1}{1 - \frac{1}{3}}\right)\left(\frac{1}{1 - \frac{1}{5}}\right)\left(\frac{1}{1 - \frac{1}{7}}\right)\cdots$$

(a) Expand each term as a geometric series and argue that the result of multiplying everything out must be
$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots$$
since every integer has a unique prime factorisation.

(b) Deduce from this and problem 5.4.5 that there must be infinitely many primes.

§ 5.5. Reference summary

§ 5.5.1. Standard series

General power series of $f(x)$:
$$f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots$$

General power series of $f(x)$ centered at $x = a$:
$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$$

Power series of elementary functions:
$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$$

Geometric series ($|x| < 1$):
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$

Binomial series ($|x| < 1$):
$$(1 + x)^q = 1 + qx + \frac{q(q-1)}{2!}x^2 + \frac{q(q-1)(q-2)}{3!}x^3 + \cdots$$

§ 5.5.2. Terminology

Taylor series: power series of a function centered at some point $x = a$

Maclaurin series: power series of a function centered at the origin; special case $a = 0$ of Taylor series

partial sum of a series: sum of cut off series; the terms of the series added together up to a certain point

convergent series: the partial sums of the series approach a specific value as more and more terms are added

divergent series: the partial sums of the series do not approach a specific value as more and more terms are added (instead go to $\pm\infty$ or oscillate)

alternating series: every other term positive, every other negative

$n!$: Multiply $n$ by every integer below it. $4! = 4 \cdot 3 \cdot 2 \cdot 1$

$\sum_{n=a}^{b} f(n)$: For every integer $n$ starting with $a$ and going to $b$, compute $f(n)$, and add all of the results together. ($\Sigma$ = sigma = sum.)
$$\sum_{n=0}^{\infty} (-1)^n x^n = 1 - x + x^2 - x^3 + x^4 - x^5 + \cdots$$

§ 5.5.3. Problem guide

• Find the power series for a given function.

This is typically best done using the standard series (§5.5.1) in conjunction with the below techniques. Alternatively, the general power series formula can be used, though this is usually more work.

• Find power series for $f(g(x))$ (such as $f(3x)$, $f(x^2)$, etc.) when the series for $f(x)$ is known.

Insert $g(x)$, enclosed in brackets, in place of $x$ in the series for $f(x)$.

$\sin(x^2)$
$= (x^2) - \frac{(x^2)^3}{3!} + \frac{(x^2)^5}{5!} - \frac{(x^2)^7}{7!} + \cdots$
$= x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \frac{x^{14}}{7!} + \cdots$

• Find the power series for a given function from first principles.

(Hardly ever the easiest way to find a series in practice.) Set the function equal to $A + Bx + Cx^2 + Dx^3 + \cdots$. Plug in zero to determine $A$. Take the derivative of both sides, and plug in zero again to determine $B$. Repeat.

• Multiply two series.

Multiply as ordinary polynomials. When cutting the answer off, take care that the terms you have not yet multiplied would not affect the terms before the cut-off point in your answer.

$e^x \sin x$
$= \left(1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots\right)$
$= x + x^2 + \left(\frac{1}{2!} - \frac{1}{3!}\right)x^3 + \left(\frac{1}{3!} - \frac{1}{3!}\right)x^4 + \cdots$
$= x + x^2 + \frac{x^3}{3} + \cdots$

(Note that we could not multiply the 1 by the $\frac{x^5}{5!}$ to put a $\frac{x^5}{5!}$ in the answer, since there is a $\frac{x^4}{4!}$ hiding in the dots of the $e^x$-series, which when multiplied by $x$ would affect the fifth-power term.)

• Estimate the magnitude of the error when an infinite series is replaced with one of its partial sums.

By looking at the magnitude of the next term or two, as well as considering how fast the terms are shrinking, you can make an educated guess. In an alternating series with shrinking terms, the error is always less than the magnitude of the next term.

§ 5.5.4. Examples

Write as a power series:

$e^{-3x}$
$= 1 + (-3x) + \frac{(-3x)^2}{2!} + \frac{(-3x)^3}{3!} + \frac{(-3x)^4}{4!} + \cdots = 1 - 3x + \frac{9}{2}x^2 - \frac{9}{2}x^3 + \frac{27}{8}x^4 + \cdots$

$\frac{1}{1+x^2}$
$= 1 - x^2 + x^4 - x^6 + x^8 + \cdots$

$(1 + x)^{1/3}$
$= 1 + \frac{1}{3}x + \frac{(\frac{1}{3})(-\frac{2}{3})}{2}x^2 + \frac{(\frac{1}{3})(-\frac{2}{3})(-\frac{5}{3})}{3!}x^3 + \frac{(\frac{1}{3})(-\frac{2}{3})(-\frac{5}{3})(-\frac{8}{3})}{4!}x^4 + \cdots = 1 + \frac{1}{3}x - \frac{1}{9}x^2 + \frac{5}{81}x^3 - \frac{10}{243}x^4 + \cdots$

Evaluate $\int_0^1 e^{-x^2}\,dx$ using a power series.

$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \implies e^{-x^2} = 1 - \frac{x^2}{1!} + \frac{x^4}{2!} - \frac{x^6}{3!} + \cdots = 1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \cdots$. Thus $\int_0^1 e^{-x^2}\,dx = \int_0^1 \left(1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \cdots\right)dx = \left[x - \frac{x^3}{3} + \frac{x^5}{10} - \frac{x^7}{42} + \cdots\right]_0^1 = 1 - \frac{1}{3} + \frac{1}{10} - \frac{1}{42} + \cdots$.

Approximate $f(x) = x^2 + \int_0^x \sin^2 t\,dt$ by a power series of degree 2.

Using the chain rule and the FTC we get $f'(x) = 2x + \sin^2 x$ and $f''(x) = 2 + 2\sin x\cos x$. Hence $f(0) = 0$, $f'(0) = 0$, $f''(0) = 2$. Thus the power series approximation is $f(x) \approx x^2$.
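The series evaluation of $\int_0^1 e^{-x^2}\,dx$ in § 5.5.4, together with the alternating-series error rule from the problem guide, can be tried out numerically. The Python sketch below is an illustration only; the helper names are hypothetical, and the reference value is obtained from the standard identity $\int_0^1 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(1)$, which is not used in the text itself.

```python
import math


def series_term(n):
    """n-th term of 1 - 1/3 + 1/10 - 1/42 + ..., i.e. (-1)^n / (n! (2n + 1))."""
    return (-1) ** n / (math.factorial(n) * (2 * n + 1))


def partial_sum(num_terms):
    """Sum of the first num_terms terms of the integrated series."""
    return sum(series_term(n) for n in range(num_terms))


# Reference value via the error function (an outside identity, see lead-in).
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)

for num_terms in range(1, 8):
    approx = partial_sum(num_terms)
    actual_error = abs(approx - reference)
    next_term = abs(series_term(num_terms))  # first term that was cut off
    print(num_terms, approx, actual_error, next_term)
```

In each printed row the actual error stays below the magnitude of the first omitted term, as the rule for alternating series with shrinking terms predicts.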

6 DIFFERENTIAL EQUATIONS

§ 6.1. Separation of variables

§ 6.1.1. Lecture worksheet

The simplest strategy for solving differential equations is separation of variables: move all $x$'s to one side and all $y$'s to the other, then integrate both sides. Thus if we have the equation $\frac{dy}{dx} = x^2 y$ we rewrite it as $dy/y = x^2\,dx$ and then integrate both sides to get $\ln|y| = x^3/3 + C$, or $y = \pm e^{x^3/3 + C}$.

This example alerts us to some technical points that often come up in this context. First of all you may be upset that I included the constant of integration only on the right hand side of the equation, even though I integrated both sides. But this comes to the same thing, for if I had included constants on both sides, say $C_1$ on the left and $C_2$ on the right, then I could just move them to the same side to get $C_2 - C_1$ on the right, which we might as well denote by a single letter $C$ since a constant minus a constant is just another constant. Also, we dislike having constants in the exponents; it's impractical. Therefore the standard trick in these kinds of situations is to rewrite the solution as $y = e^{x^3/3 + C} = e^{x^3/3} e^C = A e^{x^3/3}$. Again, $e^C$ is just another constant so there is no point in writing it this way. It is neater to just give it its own letter, $A$.

6.1.1. Solve the differential equation for population growth with unlimited resources (from problem 2.5.2). Note that our manner of rewriting constants makes the final constant easy to interpret in real-world terms.

6.1.2. Find the equations for warfare in problem 2.5.2. For the conventional warfare case, we shall now prove the famous military-strategic maxim "never divide your forces," or, if you prefer, "divide and conquer" (divide the enemy, that is).

(a) In these equations, the derivatives are taken with respect to time. So $y'$ means $\frac{dy}{dt}$. But by dividing one equation by the other we can obtain a new equation involving only $\frac{dy}{dx}$ and no $t$. Do this.

(b) Solve this differential equation.

(c) What is the real-world meaning of the constant of integration? Hint: Consider the cases where one army has been depleted.

Suppose you and the enemy have equal fighting efficiency, $a = b = 1$. You have 5000 soldiers and the enemy has 7000.

(d) If you took the enemy head on, how many of their soldiers would survive the battle and march on toward your capital?

(e) Suppose you managed to split the enemy into two groups of 4000 and 3000 soldiers, for example by blowing up a bridge. If you take them on one at a time, how many enemies will survive to march on your capital in this case?

(f) Explain how, more generally, the conclusion "never divide your forces" can be deduced from 6.1.2b directly.

(g) Solve the equations for guerrilla warfare. Does the same maxim apply in this case?

§ 6.1.2. Problems

6.1.3. Geometrical interpretation of separation of variables. Explain how solving $y' = y$ by separation of variables corresponds to the figure below. Areas in the same shade are equal. The point generalises to any separable differential equation.

6.1.4. Consider a differential equation with separated variables, $f(x)\,dx = g(y)\,dy$, to be solved for the initial condition $y(x_0) = y_0$. Instead of taking the indefinite integral of both sides and including a constant of integration we can take the definite integrals $\int_{x_0}^{x} f(x)\,dx = \int_{y_0}^{y} g(y)\,dy$ which gives us the solution directly, bypassing the need for the constant of integration. Explain why this works. Hint: this can be done using problem 6.1.3.

6.1.5. Forensic medicine. Newton's law of cooling says that the temperature, $H$, of a hot object decreases at a rate proportional to the difference between its temperature and that of its surroundings, $S$:
$$\frac{dH}{dt} = -k(H - S)$$

(a) ? If you stir your coffee, does it cool faster or slower? How is this reflected in Newton's law?

The body of a murder victim is found at noon in a room with a constant temperature of $20^\circ$C. At noon the temperature of the body is $35^\circ$C; two hours later the temperature of the body is $33^\circ$C.

(b) Find the temperature of the body as a function of $t$, the time in hours since it was found.

(c) Explain how you can check your work by considering the cases $t = 0$ and $t \to \infty$.

(d) When did the murder occur? Assume that the victim had the normal body temperature $37^\circ$C at the

time of the murder. Provide the answer in both exact and decimal form.

6.1.6. Since atmospheric pressure decreases when you climb a mountain, it ought to be possible to determine one's altitude simply by measuring the atmospheric pressure. In this problem we shall derive a formula that does precisely this.

For this purpose we need Boyle's law of gases, which states that pressure $p$ is proportional to density $\sigma$, i.e., $p = a\sigma$, for some constant $a$. (Background: Boyle discovered this law in 1662 using "a long glass-tube, which, by a dexterous hand and the help of a lamp, was in such a manner crooked at the bottom, that the part turned up was almost parallel to the rest of the tube." The pressure exerted on the enclosed air is the combined effect of the atmospheric pressure $p_0$ and the weight of the excess mercury (measured by $h$). The density of the enclosed air is of course readily measured by $v$. Thus, by pouring in more mercury, we can test the effect of an increase in pressure on density, which reveals Boyle's law: every unit increase in pressure causes an increase in density of $1/a$ units.)

[Figure: Boyle's bent glass tube, with labels $p_0$, $h$ (mercury) and $v$ (air).]

The atmospheric pressure $p(h)$ at any given altitude $h$ is determined by the weight of the "column of air" weighing down upon it, i.e.,
$$p(h) = \int_h^\infty \sigma(\lambda)\,d\lambda$$
where $\sigma(\lambda)$ is the density of the air at height $\lambda$. (We know that this weight is considerable because trees are made out of air.) In terms of the pressure $p_0$ at the earth's surface, this can be rewritten as
$$p(h) = p_0 - \int_A^B \sigma(\lambda)\,d\lambda$$

(a) where $A$ = ____ and $B$ = ____.

(b) Use Boyle's law to eliminate density from this equation (and replace it with pressure).

The resulting formula relates pressure to altitude, which is what we wanted. However, the formula is quite useless as a means of determining altitude since evaluating the integral experimentally would require measuring pressure at many different intervals of height.

(c) Remarkably, this problem can be alleviated by differentiating both sides. Do so! This leads to the differential equation $dp/dh$ = ____.

(d) Solve the resulting differential equation for $p$ as a function of $h$. Consider $p(0)$ in order to determine the constant of integration.

(e) Explain how you can check your work by considering the physical meaning of $p(\infty)$.

(f) Solve for $h$ as a function of $p$.

This formula gives an easy way of finding the altitude from the pressure, as sought. Note that the constants in the final formula are easily determined once and for all, so that pressure is indeed the only input that needs to be measured in the field.

(g) ? Does the formula also work below sea level?

The "column of air" part of the argument may have bothered you. Robert Hooke (Micrographia, 1665) explains it as follows: "I say Cylinder, not a piece of a cone, because, as I may elsewhere shew in the Explication of Gravity, that triplicate proportion of the shels of a Sphere, to their respective diameters, I suppose to be removed by the decrease of the power of Gravity." In other words, while the base area of a cone with its vertex at the surface of the earth is as the height squared, gravity is as the inverse height squared, meaning that the weight is equivalent to that of a cylinder with constant gravity.

6.1.7. We shall consider a model for the spread of a disease in an isolated population, such as the students at a boarding school. There are three variables: $S$ = the number of susceptibles, the people who are not yet sick but who could become sick; $I$ = the number of infected, the people who are currently sick; $R$ = the number of recovered, or removed, the people who have been sick and can no longer infect others or be reinfected.

(a) Explain why the following differential equations are a reasonable model for the spread of the disease:
$$\frac{dS}{dt} = -aSI \qquad \frac{dI}{dt} = aSI - bI \qquad \frac{dR}{dt} = bI$$

Consider a school with 1000 students. Let's say that one student develops the flu, and that one day later two more students are infected.

(a) Use the first equation above to estimate $a$ on the basis of this information.

(b) Let's say that $b = 0.5$. What is the real-world meaning of this?

(c) Sketch the direction field for this system with $S$ on the $x$-axis and $I$ on the $y$-axis.

(We really only care about $S$ and $I$ since these are the variables that determine the course of the epidemic. Once people fall into $R$ they might as well

not exist as far as the spread of the disease is concerned, so we do not need $R$ in our analysis.)

(d) Find the equation for $I$ as a function of $S$ and draw its graph for the case of one initial sick student and the others all susceptible.

(e) How many students remain uninfected at the end of the epidemic?

(f) Suppose half the students were vaccinated against the flu (and thus not part of the susceptible population). With $a$ and $b$ as above, and $I = 1$ at $t = 0$, how many students remain uninfected in this case? Indicate how this relates to your direction field.

6.1.8. The tractrix is the curve traced by a weight dragged along a horizontal surface by a string whose other end moves along a straight line:

In the physique de salon of 17th-century Paris, a pocket watch on a chain was the preferred way for gentlemen to trace this curve.

(a) The same curve can also be interpreted as the "pursuit path" of a predator [on a leash/running freely] chasing a prey that is running [in a straight line/straight away from the predator].

Let's say that the length of the string is 1. Consider it as the hypotenuse of a triangle with its other sides parallel to the axes. Draw a figure of this triangle and write in the lengths of its sides (1 for the hypotenuse, $y$ for the height, and the last side by the Pythagorean theorem).

(b) Find a differential equation for the tractrix by expressing the slope of the curve in terms of this triangle. $dy/dx$ = ____

(c) Separate the variables and make the substitution $u^2 = 1 - y^2$. Then: $dx$ = ____ $du$

(d) Factor out a $u$, split into partial fractions, and integrate. Substitute back to get $x$ as a function of $y$, and choose the constant of integration so that the asymptote (along which the free end of the string is pulled) is the $x$-axis and the point $(0, 1)$ corresponds to the vertical position. Then: $x$ = ____

§ 6.2. Statics

§ 6.2.1. Lecture worksheet

Statics is the study of physical systems in equilibrium. Things that don't move, in other words. In the problems of this section we shall see how two interesting statics problems reduce to differential equations. This comes about because they involve tangential forces, and tangents are related to derivatives.

§ 6.2.2. Problems

6.2.1. In this problem we shall find the shape of a suspension bridge cable. We assume that the weight of the roadway and cars is uniformly distributed with density $w$, so that a segment of the bridge with length $x$ has weight $wx$. This weight translates into tension in the cable. In the figure we have indicated the tension forces $T_0$ and $T$. If the cable were cut at one of these points, then $T_0$ or $T$ indicates the force we would need to apply at that point to keep the shape of the cable intact.

[Figure: cable segment with the forces $T_0$, $T$ (components $T_x$ and $T_y$) and the weight $wx$.]

Thus there are three forces acting on the cable segment: the tensions $T_0$ and $T$, and the weight $wx$. Since the cable is in equilibrium these forces must cancel out. Therefore the horizontal component $T_x$ of $T$ must equal $T_0$, and the vertical component $T_y$ of $T$ must equal $wx$.

(a) Argue that $T_y/T_x = dy/dx$.

(b) Use this to express the condition of equilibrium as a differential equation.

(c) Solve it.

6.2.2. The shape of a freely hanging chain suspended from two points is called the "catenary," from the Latin word for chain. In principle any piece of string would do, but one speaks of a chain since a chain with fine links embodies in beautifully concrete form the ideal physical assumptions that the string is non-stretchable and that its elements have complete flexibility independently of each other.
[Figure: segment of a hanging chain, with labels $T_0$ and $s$.]

We can find a differential equation for the catenary by considering the forces acting on a segment of it. These forces are: the tension forces at the endpoints, which act tangentially, and the gravitational force, which is proportional to the arc $s$ measured from the lowest point of the catenary.

(a) Deduce by an equilibrium of forces argument that the differential equation for the catenary is
$$\frac{dy}{dx} = s$$
for some appropriate choice of units. Hint: consider the vertical and horizontal components of $T$.

(b) Explain how this equation seems plausible visually, quite apart from the physical argument regarding forces.

Ultimately, we seek an expression for the catenary in terms of $x$ and $y$ only, but the differential equation involves also the variable $s$. Work around this problem as follows.

(c) Use the basic property of arc lengths $ds^2 = dx^2 + dy^2$ (this is the Pythagorean theorem applied to an infinitesimal triangle; see §1.1) to eliminate $dx$ in the differential equation.

(d) Separate the variables in the resulting expression.

(e) Solve the resulting differential equation. Take the constant of integration to be zero (this corresponds to a convenient choice of coordinate system).

(f) Solve for $s$ in the resulting expression.

(g) Substitute this expression for $s$ into the original differential equation for the catenary.

(h) Verify by differentiation that $y = (e^x + e^{-x})/2$ is a solution of the resulting differential equation.

(i) Sketch the graphs of $e^x$, $e^{-x}$, and the catenary, as well as the coordinate system axes in the same figure.

The link between the catenary and logarithms led Leibniz to suggest that measurements on an actual hanging chain could be used in place of logarithm tables for calculations. For your amusement I have included below the figure from Leibniz's 1692 paper on the "linea catenaria," as he calls it. Having solved the problem, you can easily understand what Leibniz means by "linea logarithmica."

[Figure: excerpt from Leibniz's 1692 paper on the catenary.]

§ 6.3. Dynamics

§ 6.3.1. Lecture worksheet

Dynamics is the study of physical systems involving motion. We have already noted that such systems give rise to differential equations since they are governed by Newton's law $F = ma$, in which $a$ is the second derivative of position. In principle this is the law at bottom of all problems of dynamics, but in practice it is sometimes better to circumvent a direct "brute force" attack using this law in favour of instead employing a suitable conservation principle (cf. §2.4). In the problems of this section we shall see some examples of such approaches.

§ 6.3.2. Problems

6.3.1. Problems about balls rolling frictionlessly down curved ramps are reduced to differential equations by the fact that the speed acquired is equal to the speed of an object in free fall having covered the same vertical distance. This is a simple consequence of energy conservation: no matter how the ball descends, the speed it acquires must be precisely sufficient to take it back up to its starting point, whether by the same or any other path. The speed of an object falling under constant gravitational acceleration is of course proportional to time, but to characterise the curve geometrically we do not want time to figure in our equations. Therefore we note that, since distance fallen is proportional to time squared, time is proportional to the square root of the distance fallen.

(a) Prove this by integrating the equation acceleration
= g twice (using the initial conditions corresponding to an object dropped from rest).

Thus we have speed in geometrical terms as proportional to the square root of the vertical distance covered. Stated as a differential equation, this becomes

\[ \frac{\sqrt{dx^2 + dy^2}}{dt} = a\sqrt{y}. \]

(b) The appearance of time in the above equation is an obstacle to finding a solution in purely geometrical terms as an equation in x and y. However, consider the special problem of finding a curve along which a ball descends at uniform vertical speed, so that dt = dy.

(c) Sketch a rough guess of what the solution curve will look like based on your physical intuition.

(d) Find an equation for the solution curve by solving the differential equation. Graph it and check your guess.

6.3.2. The following problem establishes a general result about beads on ramps which will be of use on multiple occasions in later problems. Consider a bead sliding down a ramp of shape y(x).

(a) Using the same reasoning as in problem 6.3.1, express the velocity ds/dt of a particle released from the top of the ramp as a function of y.

(b) The time it takes the particle to reach the bottom is $\int dt$. Using the above, rewrite this integral in terms of x and y.

6.3.3. Many second-order differential equations arise from Newton’s law F = ma. However, the actual Newton’s law is not F = ma but $F = \frac{d}{dt}(mv)$.

(a) Explain in terms of rules of differentiation why these two forms of the law are often equivalent in practice.

The saying that something relatively easy is “not rocket science” was perhaps coined by someone familiar with this distinction, because in rocket science it is in fact necessary to use the more complicated form $F = \frac{d}{dt}(mv)$. Consider a rocket in outer space with no external forces acting on it. Then mv is constant (“conservation of momentum”) since $\frac{d}{dt}(mv) = F = 0$. But the rocket can still move forward by throwing out parts of its mass in the form of exhaust products, say with velocity −b relative to the ship. Thus for any infinitesimal time period dt we have the following “before” and “after” scenarios:

           time      mass      velocity
ship       t         m         v
ship       t + dt    m + dm    v + dv
exhaust    t + dt    −dm       v − b

(b) Using conservation of momentum, show that this leads to the differential equation

\[ \frac{dv}{dm} = -\frac{b}{m} \]

Hint: Recall the principles for discarding negligible infinitesimal expressions from §1.3.

(c) Solve the differential equation. (Find v as a function of m, i.e., your answer should have the form v = ···)

You have a choice between two space ships. Ship A has a mass of m = 1 and an exhaust velocity of b = 1. Ship B also has a mass of m = 1 but has an exhaust velocity of b = 2. Ship A is cheaper so if you buy it you can afford more fuel: a mass of 2 instead of just a mass of 1 if you buy Ship B. Thus when you start your journey (from rest, v = 0) the masses of the ships would be m = 3 and m = 2 respectively for Ship A and Ship B.

(d) Use these initial conditions to determine the value of the constant in your expression for v for each ship.

(e) Which ship has a higher terminal velocity (i.e., velocity when fuel is exhausted, i.e., when m = ship’s mass = 1)?

§ 6.4. Second-order differential equations

§ 6.4.1. Lecture worksheet

Pendulum motion is the prototype for all periodic phenomena. Indeed, we shall see in this chapter that the many variants of the pendulum motion problem exhaust the better part of the theory of second-order differential equations. But our first order of business is to derive the equation for pendulum motion in its very simplest case.

[Figure: pendulum diagram with labels L, x, s, mg, mg sin θ, mg cos θ.]

We wish to find s(t), the elevation of the pendulum measured along its arc. As always we must start with Newton’s law F = ma. The force involved is the component of gravity that pulls in the direction of the tangent; this is $-gm\sin\theta$ (negative because it acts to decrease s), so F = ma says $m\ddot{s} = -gm\sin\theta$. Since we are looking for a differential equation for s(t) we want s and t to be the only variables. But θ is also variable, so we

must get rid of it. We could make it sin θ = x/L, which doesn’t seem much better since x is also variable. But here is the trick: horizontal displacement is almost equal to displacement along the arc, i.e., x ≈ s, at least for small θ. With this approximation we can get rid of all unwanted variables and obtain $m\ddot{s} = -\frac{gm}{L}s$. In other words, s(t) is a function such that when you differentiate it twice you get back the function itself times −g/L. Which functions behave like this? Well, $\sin(\sqrt{g/L}\,t)$ does, and you could also put a constant A in front and it would still work. And $B\cos(\sqrt{g/L}\,t)$ does too. So s(t) must have the form

\[ s(t) = A\sin(\sqrt{g/L}\,t) + B\cos(\sqrt{g/L}\,t) \]

for some constants A and B.

6.4.1. Select all that are true:

☐ The higher the starting point, the greater the swing time, according to the solution formula.

☐ The smaller the swings, the more accurate the solution.

☐ This gives an easy way to estimate g.

☐ The fact that the solution has precisely two undetermined constants in it corresponds to the fact that the highest derivative in the differential equation was of order two.

☐ None of the above.

Pendulum motion is the archetype of periodic or oscillatory phenomena resulting from a kind of “delayed feedback” mechanism. Gravity is always trying to pull the pendulum down to its lowest position. One might say that gravity “wants” the pendulum to reach this position. But if this is what gravity is trying to achieve, gravity acts a bit stupidly, because it always overshoots its target. The problem is that gravity controls the acceleration rather than the velocity of the pendulum: When the pendulum reaches the lowest position, gravity makes the acceleration zero, but the pendulum still has velocity, so it keeps going anyway.

An analogous situation is that of a thermostat trying to keep the temperature of a room at a fixed level by means of a warm-water radiator. The thermostat “wants” the temperature in a room to be at a desired level, say 20 °C. But it doesn’t control the rate of change of temperature, but rather the rate of change of the rate of change of temperature. This is because it controls the valve of the radiator, not the radiator temperature itself. If the thermostat wants it to be warmer, it opens the valve and lets warm water into the radiator. When the thermostat doesn’t want to increase the temperature anymore, it closes the valve. But when the valve closes there is still warm water left in the radiator, which will keep warming up the room for a while longer. So the thermostat missed its target.

Another example is what economists call the “pork cycle,” which arises due to the tension between the immediate current demand for pork and the delay of raising pigs to a suitable age for slaughter.

Using the pendulum case as a prototype, we might say:

$y$ = height of pendulum = the thing you want to control
$\dot{y}$ = speed of pendulum = change in the thing you want to control
$\ddot{y}$ = gravity on pendulum = change in the change in the thing you want to control

6.4.2. What corresponds to $y$, $\dot{y}$, and $\ddot{y}$ in the other two scenarios?

☐ hot water valve / pig births

☐ temperature of room / pigs to sell

☐ radiator temp. / adolescent pigs

To repeat the above equation schematically, Newton’s law F = ma in the case of pendulum motion is of the form

\[ -cx = \ddot{x} \]

Now we wish to introduce air resistance. This is another force, so it will go on the left (F) side of the equation. Like gravity, it too is working against the motion of the pendulum, so it has a minus sign in front of it. But unlike gravity it does not depend on position but on velocity: the faster you go the more the air “pushes back.” So the new schematic equation is

\[ -cx - b\dot{x} = \ddot{x} \]

Four qualitatively different scenarios are possible:

• The “undamped” (b = 0) pendulum keeps on swinging forever. This is the idealised case where there is no air resistance.

• The “damped” (b small) pendulum gradually swings shorter and shorter arcs. This is the realistic case of an ordinary pendulum with air resistance.

• The “overdamped” (b big) pendulum goes slowly to its lowest point without oscillating. This could be for example a pendulum submerged in thick syrup.

• The “critically damped” pendulum is right on the boundary between damped and overdamped.

Here are graphs of these four cases:

[Figure: graphs of the four cases.]

I have varied the resistance b while keeping other things the same. The critically damped case (dashed) has “just the right amount of resistance.” Therefore it is often desirable for applications such as shock-absorbing suspensions.
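The four damping scenarios can also be explored numerically. The following is a minimal Python sketch, not part of the original text: it integrates the schematic equation −cx − bẋ = ẍ with a standard fourth-order Runge-Kutta step. The function names `simulate` and `sign_changes`, the choice c = 1, the starting values, and the sample values of b are all illustrative assumptions. The value b = 2 is used for the critically damped case because, for c = 1, it is where the characteristic equation m² + bm + c = 0 (introduced later in this section) has a double root.

```python
def simulate(b, c=1.0, x0=1.0, v0=0.0, dt=0.001, t_end=20.0):
    """Integrate x'' = -c*x - b*x' with classical RK4; return the list of x values."""
    def accel(x, v):
        return -c * x - b * v

    x, v = x0, v0
    xs = [x]
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(x, v)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        xs.append(x)
    return xs


def sign_changes(xs):
    """Count how often the solution passes through the lowest position x = 0."""
    return sum(1 for p, q in zip(xs, xs[1:]) if p * q < 0)


if __name__ == "__main__":
    # Illustrative values of b (assumed, with c = 1).
    cases = [("undamped", 0.0), ("damped", 0.2),
             ("critically damped", 2.0), ("overdamped", 5.0)]
    for name, b in cases:
        xs = simulate(b)
        print(name, "crossings of x = 0:", sign_changes(xs), "final x:", xs[-1])
```

In this sketch the undamped and damped runs cross x = 0 repeatedly, while the critically damped and overdamped runs never do; comparing the latter two at the same time shows the overdamped one lagging behind.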

It is not for nothing that we ended up with an example involving springs rather than pendulums. Strictly speaking, the spring rather than the pendulum is a truer archetype for the theory of second-order differential equations. In fact, the differential equation we derived above holds exactly for springs, whereas it holds only approximately for pendulums (because of our approximation horizontal displacement ≈ arc).

6.4.3. Derive the differential equation for a spring from Hooke’s law “ut tensio, sic vis” (“as the extension, so the force,” i.e., force is proportional to the displacement of the weight from its equilibrium position).

Nevertheless I prefer the pendulum as my archetype. Its benefits include: the physics behind it is of much greater generality (F = ma and gravity versus Hooke’s law); it exemplifies both stable and unstable equilibria in their purest forms; we have greater intuition about it and encounter it more frequently in everyday life. This last point makes a difference especially when “forcing” is introduced, i.e., when the pendulum is being pushed by an external force, like a child on a swing being pushed by a parent. Our schematic equation for a pendulum left to its own devices can be rewritten as

\[ \ddot{x} + b\dot{x} + cx = 0 \]

With forcing this becomes

\[ \ddot{x} + b\dot{x} + cx = f(t) \]

So the forcing term “doesn’t care about x”; it imposes itself following its own formula regardless of what the pendulum is doing at that time. It is in fact often true for differential equations more generally that the terms involving the function and its derivatives represent the internal dynamics of the situation while other terms represent external forces artificially imposed upon it from without.

6.4.4. Go back to problem 2.5.2 and see if you can find some other such examples.

There is a systematic way of solving all second-order differential equations with constant coefficients. For the “homogenous” (i.e., non-forced) case $\ddot{x} + b\dot{x} + cx = 0$, the strategy is to first solve a related quadratic equation, the “characteristic equation” $m^2 + bm + c = 0$. So the power of m in this quadratic equation equals the order of the derivative in the differential equation. If this equation has two distinct real roots $m_1$, $m_2$, then the solution is $x(t) = Ae^{m_1 t} + Be^{m_2 t}$. If the characteristic equation has a double root $m_1 = m_2$ then the solution is $x(t) = Ae^{m_1 t} + Bte^{m_1 t}$. These are the overdamped and critically damped cases. The oscillating cases will come from characteristic equations that have no real roots at all, such as $x^2 + 1 = 0$. Such cases will be treated in the next section.

We know that we have found all solutions in this way since a general solution for a second-order problem should have two constants in it (as in problem 6.4.1). It doesn’t really matter how we found the solutions because once we have them we can verify them by simply plugging them into the differential equation. Nevertheless one may wonder where the idea of the characteristic equation came from. The heuristic behind it is that one guesses that there will be a solution of the form $e^{mt}$ and then plugs this in, giving

\[ (e^{mt})'' + b(e^{mt})' + ce^{mt} = (m^2 + bm + c)e^{mt} = 0 \]

Since $e^{mt}$ is always positive, $m^2 + bm + c$ must be zero, and thus we are led to the characteristic equation.

So much for the homogenous case. When a forcing term is present one must split the problem into two parts. First throw away the forcing term and solve the corresponding homogenous case as above. Call the solution $x_h$. Then find a particular solution to the forced equation, call it $x_p$. Adding these two solutions together solves the general equation.

6.4.5. Show that $x_h + x_p$ is a solution, and argue that there are no others.

We find the particular solution by educated guess-and-check: We hypothesise a solution of the same form as the forcing term, but with undetermined constants in front of every term, then we plug this guess into the equation to determine the constants.

6.4.6. Solve $\ddot{x} + 2\dot{x} + x = t$.

A special case not covered by this form of educated guessing is the equation $\ddot{x} + c^2 x = A\cos ct$, which has the particular solution $x(t) = \frac{A}{2c}\,t\sin ct$. Similarly $\ddot{x} + c^2 x = A\sin ct$ has the particular solution $x(t) = -\frac{A}{2c}\,t\cos ct$.

6.4.7. Interpret these special cases in terms of a child on a swing.

6.4.8. I have solved a second-order differential equation and found the homogenous solution $x_h$ and the particular solution $x_p$. Which of the following must be true?

☐ $x_h + x_p$ is a solution to the differential equation.

☐ $x_h$ is a solution to the differential equation.

☐ $x_p$ is a solution to the differential equation.

☐ $x_p + 5$ is another particular solution.

☐ $5x_p$ is another particular solution.

☐ $x_h$ contains two undetermined constants.

☐ The choice of particular solution is not unique.

☐ Once I have an initial condition such as x(0) = 2 I can give one concrete answer for the general solution.
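The special-case particular solution quoted in the lecture text for ẍ + c²x = A cos ct can be checked by direct differentiation. The following short verification is a sketch added for convenience; it uses nothing beyond the product rule.

```latex
\begin{align*}
x(t) &= \frac{A}{2c}\, t \sin ct, \\
\dot{x}(t) &= \frac{A}{2c}\left(\sin ct + ct\cos ct\right), \\
\ddot{x}(t) &= \frac{A}{2c}\left(2c\cos ct - c^2 t\sin ct\right)
             = A\cos ct - c^2 x(t),
\end{align*}
so that $\ddot{x} + c^2 x = A\cos ct$, as claimed.
```

The sine-forced case works the same way with $x(t) = -\frac{A}{2c}\,t\cos ct$.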

§ 6.5. Second-order differential equations: complex case

§ 6.5.1. Lecture worksheet

The method for solving second-order differential equations introduced above needs some tweaking to apply to cases where the roots of the characteristic equation are complex numbers (§A.6). Though this is mathematically the most complicated case, it includes the physically simplest situation of all: the simple pendulum equation introduced already in problem 2.5.2.

Suppose we want to solve $\ddot{x} + b\dot{x} + cx = 0$ and find that the characteristic equation $m^2 + bm + c = 0$ has the complex roots $m = a \pm bi$ (complex roots always come in pairs like this). If we proceed as above we would then have solutions of the form $x(t) = Ae^{(a+bi)t} + Be^{(a-bi)t}$. But of course we do not want complex numbers in our solution since we are interested in real things like pendulums. We therefore break apart the complex exponentials using problem A.6.7,

\[ e^{(a+bi)t} = e^{at}e^{bti} = e^{at}(\cos bt + i\sin bt), \]

and then just stick constants in front of every term to get the final answer

\[ x(t) = Ae^{at}\cos bt + Be^{at}\sin bt. \]

As always, we can verify our answer by plugging it back into the original equation, and we know that it is the most general solution since it has as many constants as the order of the equation. Note that we only need to break apart one of the complex exponentials to get all real solutions. The other one, $e^{(a-bi)t}$, would not add anything new since the constants will “eat up” the minus sign.

6.5.1. Solve $\ddot{x} - 2\dot{x} + 2x = 0$ given the initial conditions $x(0) = 1$ and $x'(0) = 0$.

6.5.2. What values of a and b correspond to undamped and damped pendulum motion respectively? Are there any other possibilities?

6.5.3. The standard form for a second-order differential equation is $\ddot{x} + b\dot{x} + cx = f(t)$. In terms of a pendulum, b is air resistance, c is gravity, and f(t) is forcing (i.e., an external force pushing the pendulum). By picturing this prototype example we can get a good feeling for the behaviour of such a differential equation. Use this way of thinking to associate the following differential equations with their corresponding scenario and type of solution.

Equations:

☐ $\ddot{x} + x = 0$
☐ $\ddot{x} + x = t$
☐ $\ddot{x} + 2x = \sin(t)$
☐ $\ddot{x} + x = \sin(t)$
☐ $\ddot{x} + \dot{x} - x = 0$
☐ $\ddot{x} + 0.1\dot{x} + 2x = 0$
☐ $\ddot{x} - 0.1\dot{x} + 2x = 0$

Scenarios: A = pendulum with no air resistance; B = child on swing pushed well by parent; C = child on swing given out-of-synch pushes; D = pendulum in syrup; E = air “encouragement” instead of air resistance; F = “negative gravity”; G = pendulum with slight air resistance; H = pendulum pulled in one direction.

Types of solution: a = perpetual oscillations; b = dying oscillations; c = growing oscillations; d = slow approach to equilibrium without oscillations; e = running off to infinity; f = jerky motion.

§ 6.5.2. Problems

6.5.4. † Suppose we dig a straight, frictionless tunnel between any two points of the earth’s surface, such as New York and Paris. Then, if we jump into the tunnel at one end, gravity will transport us to the other in less than 45 minutes. Let’s prove this.

The motion is of course governed by F = ma. The force in question is gravity. The gravitational pull exerted on an object with mass m inside the earth at a distance r from the center of the earth is $F = G\frac{M}{R^3}mr$, where G is the usual gravitational constant, M is the mass of the earth, R is the radius of the earth. This is so for the following reasons. First of all the mass of the earth further than r from the center will have no influence on the object since the net gravitational effect of this outer shell is zero.

(a) Argue that this is so. Hint: Consider the object as the vertex of a double cone, and argue that the two pieces of any thin outer shell that the cones cut out cancel each other in terms of their gravitational effect.

(b) Show that therefore the force of gravity acting on the body at a distance r from the center of the earth is $G\frac{M}{R^3}mr$.

[Figure: diagram of the tunnel with labels x, θ, F, r.]

This force tries to pull the object towards the center of the earth, but since the object can only move along the tunnel it is only the part of the force that pulls in the direction of the tunnel that has any effect. Let the tunnel be

the x-axis of a coordinate system with the tunnel’s midpoint as the origin.

(c) Use trigonometry to find a formula for the effective part of the force. (Note that the force is acting in the negative x-direction.)

We have thus found the part of gravity that acts in the direction of the tunnel, i.e., the actual force F in the law F = ma governing the motion of the object along the tunnel.

(d) Set this force equal to ma and solve the resulting differential equation. Assume that we enter the tunnel with zero velocity.

(e) Is x(t) periodic? What is its period? What determines the period?

§ 6.6. Phase plane analysis

§ 6.6.1. Lecture worksheet

The classification of exponential expressions that we have carried out in the previous two sections is useful beyond its application to pendulum-type differential equations. In this section we shall see that it also helps us understand the nature of equilibrium points of systems of differential equations. But let us first summarise what we have learned so far about expressions of the form $e^{(a+bi)t}$.

Exponent real and positive (a > 0, b = 0): exponential growth.

Exponent real and negative (a < 0, b = 0): exponential decay.

Exponent pure imaginary (a = 0, b ≠ 0): periodic oscillations.

Exponent complex with positive real part (a > 0, b ≠ 0): growing oscillations.

Exponent complex with negative real part (a < 0, b ≠ 0): shrinking oscillations.

These things will come up again when we study the equilibria of systems of two equations, such as those we have seen for warfare and two-species interaction. In the predator-prey system

\[ \begin{cases} \dot{x} = ax - bxy \\ \dot{y} = -cy + dxy \end{cases} \]

there are two equilibrium points which are of very different character, as one can see from the phase plane diagram:

[Figure: phase plane diagram of the predator-prey system.]

6.6.1. (a) Explain briefly the real-world meaning of this diagram.

(b) Calculate the equilibria.

(c) Find the equilibria of the system when harvesting is introduced:

\[ \begin{cases} \dot{x} = ax - bxy - ex \\ \dot{y} = -cy + dxy - ey \end{cases} \]

(d) During World War I, when overall fishing was reduced, the catches of Italian fishermen contained a larger percentage of sharks. Explain why by equations as well as commonsensically.

How does one draw the phase plane diagram for a given system of differential equations? In principle one can divide the two time-derivatives by each other; since

\[ \frac{\dot{y}}{\dot{x}} = \frac{dy/dt}{dx/dt} = \frac{dy}{dx} \]

this gives an expression for the slopes, so that we can plot a direction field as in §2.6. However, the resulting formula can be very complicated to work with, as one may well expect from such a brute-force approach that, in dividing away the time variable, disregards the underlying nature of the problem. We shall consider instead a second approach, which is more sensitive to the internal dynamics of the system.

Let us begin with the simple system

\[ \begin{cases} \dot{x} = ax + by \\ \dot{y} = cx + dy \end{cases} \]

6.6.2. Show that in general this system has only one equilibrium point, at (0, 0).

This system has solutions of the form

\[ \begin{cases} x = Ae^{m_1 t} + Be^{m_2 t} \\ y = Ce^{m_1 t} + De^{m_2 t} \end{cases} \]

as is not hard to imagine. Checking this would be tedious but straightforward. If you carried out the computations you would find that

\[ m = \frac{1}{2}\left(a + d \pm \sqrt{(a + d)^2 - 4(ad - bc)}\right) \]

or, if we use the shorthand notation $p = a + d$, $q = ad - bc$, $\Delta = p^2 - 4q$,

\[ m = \frac{1}{2}\left(p \pm \sqrt{\Delta}\right) \]

The point is that we do not need to worry about all these horrible calculations, only classify which types of exponential expressions we are dealing with. This will give us a clear enough picture of what is going on without having to worry about the numerical details.

If for example the exponents m are purely imaginary (which obviously happens if p = 0 and q > 0) then both x(t) and y(t) are periodically oscillating functions, as seen in our table above. In the phase plane this gives closed loops like those shown in the predator-prey figure above. As time ticks away, we go around and around in the same loop over and over again.

If the exponents are complex with a negative real part then x and y are oscillating towards zero, so we get an inward spiral.

And so on. Altogether we get the following classification.

• q > 0, ∆ > 0: node. (p < 0: stable node; p > 0: unstable node.)

• q > 0, ∆ < 0: spiral. (p < 0: stable spiral; p > 0: unstable spiral.)

• q < 0: saddle.

• p = 0, q > 0: centre.

[Figures: phase portraits of each type of equilibrium.]

As we see, nodes and saddles involve exceptional lines. So to be able to draw a good picture in those cases we must first know the slopes m of these lines. The characteristic property of these lines is: if we are on the line, we stay on the line. Being on the line means that y = mx (no constant term since the line passes through the equilibrium, i.e., the origin), and staying on the line means that we keep moving in the direction of the line, i.e., dy/dx = m, so we get the equation

\[ \frac{\dot{y}}{\dot{x}} = \frac{cx + dy}{ax + by} = \frac{cx + dmx}{ax + bmx} = m \]

which can be solved for m (the rightmost equality yields a quadratic equation, corresponding to the two exceptional lines).

The directional arrows are found by plugging in particular points near the equilibrium. If for example $\dot{x}$ and $\dot{y}$ are both positive at a particular point then they are both growing so we must be heading “northeast,” so we place an arrow in our diagram to that effect.

This classification was for a simple linear system with a single equilibrium at the origin, but in fact any equilibrium can be approximately reduced to this situation. This is done as follows. Take for example the predator-prey system

\[ \begin{cases} \dot{x} = 3x - xy \\ \dot{y} = -4y + 2xy \end{cases} \]

which we know has an equilibrium at (c/d, a/b) = (2, 3). First we want to make a change of variables that brings this point to the origin. We do this by making x = 2 + X and y = 3 + Y, which means that the equilibrium (2, 3) is the origin (0, 0) in the new coordinate system (X, Y). With this change of variables the system becomes

\[ \begin{cases} \dot{x} = 3x - xy = 3(2 + X) - (2 + X)(3 + Y) = -2Y - XY \\ \dot{y} = -4y + 2xy = -4(3 + Y) + 2(2 + X)(3 + Y) = 6X + 2XY \end{cases} \]

(We can check our work by the fact that the constant terms must disappear since the derivatives must vanish at the equilibrium (X, Y) = (0, 0).) But we can still not use our classification since there are nonlinear terms present (XY). However, since we are interested in the behaviour close to the equilibrium point we can discard any higher-order terms: when X and Y are very small, XY is very, very small, so it is negligible in comparison with X and Y. With this linear approximation we get

\[ \begin{cases} \dot{x} = -2Y \\ \dot{y} = 6X \end{cases} \]

Now we can classify the equilibrium using the table above. In this case, p = 0 and q = 12, so (2, 3) is a centre, i.e., the equilibrium is encircled by closed loops. To find the direction of the

loops we can consider for example a point just to the right of the origin, (X, Y) = (1, 0). There $\dot{x} = 0$ and $\dot{y} > 0$, so we are heading straight upwards from the “three o’clock” position, meaning that the loops go counterclockwise.

The other equilibrium is at (0, 0) already so we do not need to change the variables. We can simply linearise directly to get

\[ \begin{cases} \dot{x} = 3X \\ \dot{y} = -4Y \end{cases} \]

Here q = −12 so we have a saddle point. As we saw above, normally when we have a saddle point we would look for the exceptional lines by solving

\[ \frac{\dot{y}}{\dot{x}} = \frac{-4Y}{3X} = \frac{-4mX}{3X} = m \]

for m. In this case, however, the method breaks down since the solutions are the x-axis (which is not found since Y = 0 is impossible in the above equation) and the y-axis (which is not found since it has m = ∞). But it is clear enough that the axes are the exceptional lines and that along the y-axis we are crashing to zero and along the x-axis we are running off to infinity. This is also necessary given the counterclockwise orientation of the loops around the other equilibrium.

In general, once we have analysed the equilibria and drawn a local picture for each we can fill in the rest of the phase plane by making one local picture transition smoothly into the other (this is of course permitted whenever $\dot{x}$ and $\dot{y}$ are continuous).

6.6.3. Draw the phase plane diagram of

\[ \begin{cases} \dot{x} = y + xy \\ \dot{y} = 3x + xy \end{cases} \]

§ 6.6.2. Problems

6.6.4. (a) Argue that the system

\[ \begin{cases} \dot{x} = x - 2yx \\ \dot{y} = 2y - 4xy \end{cases} \]

represents two species competing for the same resources.

(b) Discuss the difference between the two species. In particular, what could be the biological reason for the difference in the coefficients for the nonlinear terms?

(c) Sketch the phase plane diagram using the approximate linearisation method.

(d) Interpret the diagram in real-world terms.

6.6.5. Consider the system dr/dt = −j, dj/dt = r, where r(t) represents Romeo’s love (positive values) or hate (negative values) for Juliet at time t, and j(t) similarly represents Juliet’s feelings toward Romeo.

(a) ______ “loves to be loved,” while ______ is intrigued by rejection.

(b) Romeo’s and Juliet’s families are enemies. This can be expressed in the initial condition (r, j) = ______ at time t = 0.

(c) What happens in the long run?

(d) “In the Spring a young man’s fancy lightly turns to thoughts of love,” says Tennyson. What differential equation concept is best invoked to capture this idea?

6.6.6. The method we have studied in this section for drawing the phase portrait of a system of two first-order differential equations can also be applied to second-order differential equations. The latter is reduced to the former by taking the derivative of the function as a new variable. Let us carry this out in the case of the pendulum equation. Recall from 6.4 that the actual pendulum equation is $\ddot{x} = -k\sin(x)$. We simplified this to $\ddot{x} = -kx$, which is accurate for small oscillations since sin(x) ≈ x when x is small. But now we wish to use the actual equation $\ddot{x} = -k\sin(x)$, which holds for oscillations of any size.

(a) Let $y = \dot{x}$ and note how the pendulum equation becomes a system of two first-order differential equations.

(b) Carry out the phase plane analysis for this system. (You will need to use the approximation sin(x) ≈ x when linearising near equilibria.)

You should obtain a picture like this:

[Figure: phase plane diagram of the pendulum.]

(c) Explain the two types of equilibria in physical terms.

(d) Draw a few phase curves and explain their physical meaning.

6.6.7. In problem 6.6.1d you probably assumed that the equilibrium values for x and y give a good indication of the average numbers of x and y also for orbits other than the equilibrium. This looks about right from the phase plane diagram, but let us confirm it by calculation.

Since the equilibrium is a centre, we know that x and y are periodic functions, say with period T.

(a) Show that $\dot{x}/x = a - by$ and integrate both sides of this equation from 0 to T. Hint: The left hand side is a “logarithmic derivative.”

(b) Use the resulting equation to find the average of y. (See problem 3.1.6.)

(c) Find the average of x similarly.
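The classification by p, q and ∆ from the lecture worksheet is mechanical enough to be written as a small function. The following Python sketch is an illustration only: the function name `classify` and the wording of the returned labels are assumptions, and borderline cases (q = 0 or ∆ = 0), which the classification in the text does not cover, are reported as such rather than guessed.

```python
def classify(a, b, c, d):
    """Classify the equilibrium at the origin of
         x' = a*x + b*y
         y' = c*x + d*y
    using p = a + d, q = a*d - b*c and delta = p^2 - 4q."""
    p = a + d
    q = a * d - b * c
    delta = p * p - 4 * q
    if q < 0:
        return "saddle"
    if q > 0 and p == 0:
        return "centre"
    if q > 0 and delta > 0:
        return ("stable" if p < 0 else "unstable") + " node"
    if q > 0 and delta < 0:
        return ("stable" if p < 0 else "unstable") + " spiral"
    return "borderline case (not covered by the classification)"


if __name__ == "__main__":
    # Linearisation of the worked predator-prey example at (2, 3): x' = -2Y, y' = 6X.
    print(classify(0, -2, 6, 0))   # centre
    # Linearisation of the same system at (0, 0): x' = 3X, y' = -4Y.
    print(classify(3, 0, 0, -4))   # saddle
```

The two calls reproduce the conclusions reached by hand in the text (p = 0, q = 12 gives a centre; q = −12 gives a saddle).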

§ 6.7. Reference summary
§ 6.7.2. Other methods for first-order differential equations

§ 6.7.1. Separation of variables


Linear differential equationR y 0 + P (x)y = Q(x). Multiply both
sides by integrating factor e P (x) dx and interpret left hand side
To solve a differential equation by separation of variables: as outcome of product rule to integrate.
dy
Write y 0 = dx . Move all x’s (including dx) to one side and all
y’s (including dy) to the other. Integrate both sides. Include a y0 − y = x
constant of integration (“+C ”) on one side. If given, plug in ini-
Since, (−1)dx = −x (with C = 0), we see that e −x is an in-
R
tial condition (specific values of y and x) to determine C . Solve
for y to find the function that solves the differential equation. tegrating factor. This gives eR−x y 0 − e −x y = xe −x and
R hence
(e −x y)0 = xe −x and e −x y = xe −x dx = −xe −x + e −x dx =
−xe −x −e −x +C . Thus the general solution is y = −x−1+C e x .
y 0 = −2x y 2
dy
The equation can be written as dx = −2x y 2 . An evident so-
y 0 + y = x, y(0) = 0
lution is y = 0. When y 6= 0, separation of variables gives
dy
= −2xdx. Integration of both sides gives 1y = −x 2 + C , Use integrating factor e x : y 0 e x + ye x = xe x , ye x = xe x dx =
R
y2
and hence y = 1
. xe x − e x + C , y = x − 1 + C e −x . The condition y(0) = 0 gives
x 2 −C
0 = −1 +C , so C = 1 and y = x − 1 + e −x .

$1 + x = 2xyy'$, $y(1) = 2$

Write $y' = dy/dx$ and use separation of variables. $1 + x = 2xyy' \implies \left(\tfrac{1}{x} + 1\right)dx = 2y\,dy \implies \ln|x| + x = y^2 + C$. Plugging in the condition $y = 2$ when $x = 1$ gives $\ln|1| + 1 = 2^2 + C \implies C = -3$. Since $y(1) = 2 > 0$ we take the positive root: $y(x) = \sqrt{3 + x + \ln|x|}$.

$y' e^{x+y} = 1$, $y(0) = 1$

$\frac{dy}{dx} = \frac{e^{-x}}{e^{y}} \Rightarrow e^{y}\,dy = e^{-x}\,dx \Rightarrow \int e^{y}\,dy = \int e^{-x}\,dx \Rightarrow e^{y} = -e^{-x} + C$. Since $y(0) = 1$, we get $C = e + 1$. Thus the solution is $e^{y} = -e^{-x} + e + 1 \iff y = \ln(-e^{-x} + e + 1)$.

$y' = y^2$, $y(0) = 1$

An evident solution is $y = 0$, but it does not satisfy the initial condition $y(0) = 1$, so it can be disregarded. When $y \neq 0$ we can divide by $y^2$, which gives $\frac{y'}{y^2} = 1$ and $\frac{dy}{y^2} = dx$. Integrating gives $-\frac{1}{y} = x + C$, and plugging in $y(0) = 1$ gives $C = -1$. Thus $y = \frac{1}{1-x}$.

$(1 + \cos x)y' = y \sin x$, $y(0) = 1$

$(1 + \cos x)y' = y \sin x \iff \frac{dy}{y} = \frac{\sin x\,dx}{1 + \cos x} \iff \int \frac{dy}{y} = \int \frac{\sin x\,dx}{1 + \cos x} \iff \ln y = -\ln(1 + \cos x) + C$. The condition $y(0) = 1$ gives $\ln 1 = -\ln 2 + C \iff 0 = -\ln 2 + C \iff C = \ln 2$. Hence $\ln y = -\ln(1 + \cos x) + \ln 2 \iff y = e^{-\ln(1+\cos x) + \ln 2} \iff y = \frac{2}{1 + \cos x}$.

$(y + 1)y' + \cos x = 0$, $y(0) = 0$

The equation separates to $(y + 1)\,dy = -\cos x\,dx \iff \tfrac{1}{2}y^2 + y = -\sin x + C$. Plugging in $y(0) = 0$ gives $\tfrac{1}{2}\cdot 0 + 0 = 0 + C \Rightarrow C = 0$. We multiply by 2 to get $y^2 + 2y = -2\sin x \iff (y + 1)^2 = 1 - 2\sin x$. The initial condition means that only the positive root is relevant, so $y = \sqrt{1 - 2\sin x} - 1$.

$y' + 2xy = x$

Since $\int 2x\,dx = x^2$ (where $C = 0$), we see that $e^{x^2}$ is an integrating factor. This gives $e^{x^2}y' + 2xe^{x^2}y = xe^{x^2}$, and hence $(e^{x^2}y)' = xe^{x^2}$ and $e^{x^2}y = \int xe^{x^2}\,dx = [x^2 = t,\ 2x\,dx = dt] = \tfrac{1}{2}\int e^{t}\,dt = \tfrac{1}{2}e^{t} + C = \tfrac{1}{2}e^{x^2} + C$. The general solution is thus $y = \tfrac{1}{2} + Ce^{-x^2}$.

$x^2 y' + y = e^{1/x}$, $y(1) = e$

$x^2 y' + y = e^{1/x} \iff y' + \frac{1}{x^2}y = \frac{1}{x^2}e^{1/x}$. Multiplying by the integrating factor $e^{-1/x}$ gives $e^{-1/x}y' + \frac{1}{x^2}e^{-1/x}y = \frac{1}{x^2} \iff D(e^{-1/x}y) = \frac{1}{x^2}$, which upon integrating yields $e^{-1/x}y = C - \frac{1}{x} \iff y = \left(C - \frac{1}{x}\right)e^{1/x}$. The boundary condition $y(1) = e$ implies $e = y(1) = (C - 1)e \Rightarrow C = 2$. Hence the solution is $y = \left(2 - \frac{1}{x}\right)e^{1/x}$.

When the above methods are not applicable, a substitution may make them so. Standard substitutions:

• If the differential equation is of the form $y' = f(x, y)$ where $f$ is invariant under scaling ($f(\lambda x, \lambda y) = f(x, y)$), substitute $u = y/x$.

• If the differential equation involves terms $ay \pm bx$, try making this $u$.

• If the differential equation is of the form $y' + P(x)y = Q(x)y^n$, substitute $u = y^{1-n}$.

To carry out the substitution: Differentiate the equation defining $u$ to find $\frac{du}{dx}$. Using this and the defining equation for $u$, eliminate all $y$'s and $dy$'s from the differential equation, so as to produce a differential equation for $u$ as a function of $x$. Solve this equation. Substitute back to express the answer in terms of the original variables.
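A solution found by separation of variables can be sanity-checked numerically. The sketch below is only an illustration, not part of the text's method: the helper name `residual` and the sample points are assumptions. It checks that $y = 2/(1 + \cos x)$ satisfies $(1 + \cos x)y' = y \sin x$ with $y(0) = 1$, using a central difference in place of the exact derivative.

```python
import math

def residual(y, f, x, h=1e-6):
    """Return y'(x) - f(x, y(x)), with y' approximated by a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return dydx - f(x, y(x))

def y_claimed(x):
    # Claimed solution of (1 + cos x) y' = y sin x with y(0) = 1.
    return 2 / (1 + math.cos(x))

def rhs(x, y):
    # The same equation solved for y': y' = y sin x / (1 + cos x).
    return y * math.sin(x) / (1 + math.cos(x))

# The initial condition should hold exactly, the equation up to discretisation error.
initial_ok = abs(y_claimed(0.0) - 1.0) < 1e-12
max_residual = max(abs(residual(y_claimed, rhs, x)) for x in [0.0, 0.5, 1.0, 2.0])
print(initial_ok, max_residual)
```

A small residual at a handful of points does not prove the formula, but a large one reliably exposes a sign or algebra slip.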

§ 6.7.3. Second-order differential equations

• Solve $\ddot{x} + b\dot{x} + cx = 0$.

Form the corresponding characteristic equation $m^2 + bm + c = 0$. The roots of this equation tell you the solution:

roots of $m^2 + bm + c = 0$      solution of $\ddot{x} + b\dot{x} + cx = 0$
distinct real $m_1, m_2$         $x(t) = Ae^{m_1 t} + Be^{m_2 t}$
double root $m$                  $x(t) = (A + Bt)e^{mt}$
complex $a \pm bi$               $x(t) = Ae^{at}\cos bt + Be^{at}\sin bt$

$y'' + 3y' - 4y = 0$

The characteristic equation $m^2 + 3m - 4 = 0$ has two real roots: $m_1 = 1$ and $m_2 = -4$. Thus $y(x) = Ae^{x} + Be^{-4x}$.

$y'' + 2y' + y = 0$

The characteristic equation $m^2 + 2m + 1 = 0$ has a double root: $m_{1,2} = -1$. Thus $y(x) = (Ax + B)e^{-x}$.

$y'' + 2y' + 2y = \sin 2x$

The characteristic equation $m^2 + 2m + 2 = 0$ of the corresponding homogeneous equation has the complex roots $m_1 = -1 + i$ and $m_2 = -1 - i$. Thus the homogeneous solution is $y_h(x) = e^{-x}(A\sin x + B\cos x)$.

• Solve $\ddot{x} + b\dot{x} + cx = f(t)$.

First consider the corresponding homogeneous equation $\ddot{x} + b\dot{x} + cx = 0$. Solve it as above. Call the solution $x_h$.

Next find a particular solution $x_p$, meaning any one function that satisfies the differential equation. Do this by first guessing that $x_p$ has the same form as $f(t)$, except with undetermined coefficients for each term. For instance:

If $f(t)$ has this form:     Try this as $x_p$:
$3e^{5t}$                    $Ce^{5t}$
$\sin t$                     $C\sin t + D\cos t$
$2\cos 3t$                   $C\sin 3t + D\cos 3t$
$4x + 2$                     $Cx + D$
$8x^2$                       $Cx^2 + Dx + E$

Special cases not covered by this method:

Differential equation               Particular solution $x_p$:
$\ddot{x} + c^2 x = A\cos ct$       $\frac{A}{2c}\,t\sin ct$
$\ddot{x} + c^2 x = A\sin ct$       $-\frac{A}{2c}\,t\cos ct$

Then determine the coefficients needed to fulfil the equation by plugging $x_p$, $x_p'$, $x_p''$ into the differential equation.

The full solution of the differential equation is the sum of the homogeneous and particular solutions: $x(t) = x_h + x_p$.

$y'' + 4y' + 4y = x + 1$

The characteristic equation $m^2 + 4m + 4 = 0$ has a double root $m_{1,2} = -2$. Hence $y_h = (C_1 x + C_2)e^{-2x}$. We seek a particular solution of the form $y_p = Ax + B$. We see that $y_p'' + 4y_p' + 4y_p = 4Ax + (4A + 4B) = x + 1$. Thus $4A = 1$ and $4A + 4B = 1$, which gives $A = \tfrac{1}{4}$, $B = 0$. Thus $y_p = \tfrac{1}{4}x$ and the general solution is $y = (C_1 x + C_2)e^{-2x} + \tfrac{1}{4}x$.

$y'' + y = x^2$, $y(0) = 0$, $y'(0) = 0$

The characteristic equation $m^2 + 1 = 0$ has the solution $m = \pm i$, hence the homogeneous solution is $y_h = C_1\sin x + C_2\cos x$. For the particular solution, assume $y_p = ax^2 + bx + c$. Then $y_p'' = 2a$. The requirement $y_p'' + y_p = x^2$ gives $2a + ax^2 + bx + c = x^2$, which has the solution $a = 1$, $b = 0$, $c = -2$. So the particular solution is $y_p = x^2 - 2$, and the full solution is $y = y_p + y_h = x^2 - 2 + C_1\sin x + C_2\cos x$. The condition $y(0) = 0$ gives $0 = -2 + C_2$, so $C_2 = 2$. Differentiation gives $y' = 2x + C_1\cos x - C_2\sin x$ and the condition $y'(0) = 0$ gives $C_1 = 0$. Hence $y = x^2 - 2 + 2\cos x$.

$y'' - 2y' - 15y = 6e^{-x}$

The characteristic equation $m^2 - 2m - 15 = 0$ has the solutions $m_1 = -3$ and $m_2 = 5$. Hence $y_h = C_1 e^{-3x} + C_2 e^{5x}$. We seek a particular solution of the form $y_p = Ae^{-x}$. Differentiation gives $y_p' = -Ae^{-x}$, $y_p'' = Ae^{-x}$. Plugging this into the equation, we get $Ae^{-x} + 2Ae^{-x} - 15Ae^{-x} = 6e^{-x} \iff -12A = 6 \iff A = -\tfrac{1}{2}$. The general solution is thus $y = y_h + y_p = C_1 e^{-3x} + C_2 e^{5x} - \tfrac{1}{2}e^{-x}$.

$y'' - 4y' + 13y = e^{-2x} + 1$

The characteristic equation $m^2 - 4m + 13 = 0$ has the roots $m_{1,2} = 2 \pm 3i$. Thus $y_h = e^{2x}(D_1\cos 3x + D_2\sin 3x)$. We seek a particular solution of the form $y_p = Ae^{-2x} + B$. This gives $y_p'' - 4y_p' + 13y_p = 4Ae^{-2x} - 4(-2Ae^{-2x}) + 13(Ae^{-2x} + B) = 25Ae^{-2x} + 13B = e^{-2x} + 1$. Hence $A = 1/25$, $B = 1/13$ and thus $y_p = \tfrac{1}{25}e^{-2x} + \tfrac{1}{13}$. The general solution is thus $y = e^{2x}(D_1\cos 3x + D_2\sin 3x) + \tfrac{1}{25}e^{-2x} + \tfrac{1}{13}$.

§ 6.7.4. Phase plane analysis

Classification of equilibrium $(0, 0)$ of

$\dot{x} = ax + by$
$\dot{y} = cx + dy$
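The table of roots for the homogeneous equation can be turned into a small lookup. This is a minimal sketch, assuming real coefficients $b$ and $c$; the function name `homogeneous_solution_form` and its return format are made up for illustration.

```python
import cmath

def homogeneous_solution_form(b, c):
    """Classify the roots of m^2 + b m + c = 0 for x'' + b x' + c x = 0.

    Returns (case, values):
      "distinct real" -> (m1, m2),  x = A e^{m1 t} + B e^{m2 t}
      "double root"   -> (m,),      x = (A + B t) e^{m t}
      "complex"       -> (a, w),    x = A e^{a t} cos(w t) + B e^{a t} sin(w t)
    """
    disc = b * b - 4 * c
    if disc > 0:
        root = disc ** 0.5
        return ("distinct real", ((-b + root) / 2, (-b - root) / 2))
    if disc == 0:
        return ("double root", (-b / 2,))
    m = (-b + cmath.sqrt(disc)) / 2  # one of the two conjugate roots
    return ("complex", (m.real, abs(m.imag)))

print(homogeneous_solution_form(3, -4))  # roots of m^2 + 3m - 4
print(homogeneous_solution_form(2, 1))   # roots of m^2 + 2m + 1
print(homogeneous_solution_form(2, 2))   # roots of m^2 + 2m + 2
```

The three calls correspond to the three homogeneous examples worked in this section. The exact comparison `disc == 0` is only reliable for coefficients like these that are represented exactly.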

in terms of $p = a + d$, $q = ad - bc$, $\Delta = p^2 - 4q$:

p    q    ∆    equilibrium
+    +    +    unstable node
−    +    +    stable node
+    +    −    unstable spiral
−    +    −    stable spiral
     −         saddle
0    +         centre

The system $\dot{x} = x - 5y$, $\dot{y} = x - y$ has an equilibrium at $(0, 0)$. Classify it.

$a = 1$, $b = -5$, $c = 1$, $d = -1$, so $p = a + d = 0$, $q = ad - bc = 4$, $\Delta = p^2 - 4q = -16$. Since $p = 0$ and $q > 0$, the equilibrium is a centre.

The system $\dot{x} = 2x + 3y$, $\dot{y} = -3x - 3y$ has an equilibrium at $(0, 0)$. Classify it.

$a = 2$, $b = 3$, $c = -3$, $d = -3 \Rightarrow p = -1$, $q = 3$, $\Delta = -11 \Rightarrow$ stable spiral.

The system $\dot{x} = x + y$, $\dot{y} = x - 2y$ has an equilibrium at $(0, 0)$. Classify it.

$a = 1$, $b = 1$, $c = 1$, $d = -2 \Rightarrow p = -1$, $q = -3$, $\Delta = 13$. Since $q < 0$, the equilibrium point is a saddle. The slopes of the exceptional lines or axes of the saddle are given by $m = \dot{y}/\dot{x} = \frac{x - 2mx}{x + mx} = \frac{1 - 2m}{1 + m}$, which gives $m^2 + 3m - 1 = 0$, or $m = \tfrac{1}{2}(-3 \pm \sqrt{13})$.

• Find and classify the equilibria of a system $\dot{x} = f(x, y)$, $\dot{y} = g(x, y)$.

Find all points $(x, y)$ where $\dot{x} = 0$ and $\dot{y} = 0$ at the same time. These are the equilibria. Classify each equilibrium in turn.

If $(0, 0)$ is an equilibrium, classify it as above. If the expressions for $\dot{x}$ and $\dot{y}$ contain nonlinear terms (such as $x^2$, $xy$, etc.), discard these before classifying.

If $(A, B)$ is an equilibrium other than $(0, 0)$, first make the change of variables $x = A + X$, $y = B + Y$. The equilibrium is now at $(0, 0)$ in the new $XY$ coordinate system. Classify it as above.

For nodes and saddles, find the slope of the exceptional lines by substituting $Y = mX$ into $\dot{Y}/\dot{X} = m$.

• Sketch the phase plane diagram of a system $\dot{x} = f(x, y)$, $\dot{y} = g(x, y)$.

Find and classify the equilibria as above. Mark the equilibria in a coordinate system. Draw phase curves in the vicinity of each equilibrium according to its classification. Extend the picture in such a way that it transitions smoothly between the local pictures.

Find and classify the equilibria of the system $\dot{x} = x - y$, $\dot{y} = x + y - 2xy$. Draw the phase plane.

Equilibria occur where $x - y = 0$, $x + y - 2xy = 0$. Eliminating $y$ gives $x(1 - x) = 0$, so equilibrium points occur at $(0, 0)$ and $(1, 1)$. First look at $(0, 0)$. Here $\dot{x} = x - y$, $\dot{y} \approx x + y$. Since $p = 2$, $q = 2$ and $\Delta = -4$, it's an unstable spiral. Next consider $(1, 1)$. Introduce new variables by $x = 1 + X$, $y = 1 + Y$. Then $\dot{X} = X - Y$, $\dot{Y} = 2 + X + Y - 2(1 + X)(1 + Y) \approx -X - Y$. Hence $a = 1$, $b = -1$, $c = -1$, $d = -1$, so that $q = -2 < 0$, so this point is a saddle (with $m = \dot{Y}/\dot{X} = \frac{-X - mX}{X - mX} = \frac{-1 - m}{1 - m} \Rightarrow m = 1 \pm \sqrt{2}$).

[Figure: phase plane of this system for $-3 \le x, y \le 3$.]

Find and classify the equilibria of the system $\dot{x} = 1 - xy$, $\dot{y} = (x - 1)(y + 1)$. Draw the phase plane.

Equilibria occur where $1 - xy = 0$, $(x - 1)(y + 1) = 0$, which is at $(1, 1)$ and $(-1, -1)$. Consider $(1, 1)$. Let $x = 1 + X$, $y = 1 + Y$. Then $\dot{X} \approx -X - Y$, $\dot{Y} = X(2 + Y) \approx 2X$. Hence $a = -1$, $b = -1$, $c = 2$, $d = 0 \Rightarrow p = -1$, $q = 2$, $\Delta = -7 \Rightarrow$ stable spiral. Consider $(-1, -1)$. Let $x = -1 + X$, $y = -1 + Y$. Then $\dot{X} \approx X + Y$, $\dot{Y} \approx -2Y$. Hence $a = 1$, $b = 1$, $c = 0$, $d = -2 \Rightarrow q = -2 < 0 \Rightarrow$ saddle (with $m = \dot{Y}/\dot{X} = \frac{-2mX}{X + mX} = \frac{-2m}{1 + m} \Rightarrow m = 0$ or $-3$).

[Figure: phase plane of this system for $-3 \le x, y \le 3$.]
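The classification table in terms of $p$, $q$ and $\Delta$ translates directly into code. The following is a minimal sketch; the function name `classify_equilibrium` is made up for illustration, and the borderline cases $q = 0$ and $\Delta = 0$, which the table does not cover, are reported as such rather than guessed.

```python
def classify_equilibrium(a, b, c, d):
    """Classify (0, 0) for x' = a x + b y, y' = c x + d y using p, q, Delta."""
    p = a + d
    q = a * d - b * c
    delta = p * p - 4 * q
    if q < 0:
        return "saddle"
    if q == 0:
        return "borderline case (not in the table)"
    if p == 0:
        return "centre"
    if delta == 0:
        return "borderline case (not in the table)"
    stability = "unstable" if p > 0 else "stable"
    kind = "node" if delta > 0 else "spiral"
    return f"{stability} {kind}"

# The three linear systems classified in this section.
print(classify_equilibrium(1, -5, 1, -1))
print(classify_equilibrium(2, 3, -3, -3))
print(classify_equilibrium(1, 1, 1, -2))
```

For a nonlinear system, the same function applies once the nonlinear terms have been discarded and the equilibrium has been shifted to the origin.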

Find and classify the equilibria of the system $\dot{x} = x - y$, $\dot{y} = x^2 - 1$. Draw the phase plane.

Equilibria occur where $x - y = 0$, $x^2 - 1 = 0$, which is at $(1, 1)$ and $(-1, -1)$. Consider $(1, 1)$. Let $x = 1 + X$, $y = 1 + Y$. Then $\dot{X} = X - Y$, $\dot{Y} \approx 2X$. Hence $a = 1$, $b = -1$, $c = 2$, $d = 0 \Rightarrow p = 1$, $q = 2$, $\Delta = -7 \Rightarrow$ unstable spiral. Consider $(-1, -1)$. Let $x = -1 + X$, $y = -1 + Y$. Then $\dot{X} = X - Y$, $\dot{Y} \approx -2X$. Hence $a = 1$, $b = -1$, $c = -2$, $d = 0 \Rightarrow p = 1$, $q = -2 \Rightarrow$ saddle (with $m = \dot{Y}/\dot{X} = \frac{-2X}{X - mX} = \frac{-2}{1 - m} \Rightarrow m = 2$ or $-1$).

[Figure: phase plane of this system for $-3 \le x, y \le 3$.]
7 POLAR AND PARAMETRIC CURVES

§ 7.1. Polar coordinates

§ 7.1.1. Lecture worksheet

We saw in problem A.1.2 that $y = \sqrt{1 - x^2}$ is the equation for a unit circle, or rather the top half of it: to get a full circle of radius $R$ we must write $y = \pm\sqrt{R^2 - x^2}$. Isn’t this formula disturbingly complicated? One has the feeling that algebra almost fails to convey the simplicity of the circle in some sense. Think about how lines and curves are drawn in geometry: ruler and compass, the simplest of tools. The compass really captures what it means to be a circle; the equation $y = \sqrt{R^2 - x^2}$ almost seems to miss the point by comparison.

However, the real problem here is that we are trying to put a round peg in a square hole. Our coordinate system of $x$’s and $y$’s is intrinsically “rectangular” in nature: we describe the positions of points by saying “so far this way and so far in the perpendicular direction.” But this is not the only way to describe the positions of points. Instead we could specify the position of a point by saying that it’s so-and-so far away in such-and-such a direction. This is called polar coordinates:

[Figure: the point $(r, \theta)$, at distance $r$ from the origin in the direction given by the angle $\theta$.]

7.1.1. Find the equation of a circle of radius $R$ in polar coordinates.

So it’s not that “circles hate algebra”; it’s just that, like people, curves become difficult when you force them to abide by some system that clashes with their nature. Curves that “orbit” around a center point like polar coordinates better and will “play nice” if you let them have their way about this.

Another such example is the Archimedean spiral $r = \theta$:

[Figure: the Archimedean spiral, with $r$ and $\theta$ marked.]

In fact, in rectangular coordinates, spirals cannot be described by a polynomial equation of any kind.

7.1.2. ? Can you think of a simple way of seeing this at a glance? Hint: cf. problems A.2.5–A.2.7.

§ 7.1.2. Problems

7.1.3. † Another important spiral is the logarithmic spiral $r = e^{k\theta}$.

(a) Prove that for this spiral a magnification is the same thing as a rotation. Hint: what do these operations mean in algebraic terms?

(b) Show that this means that the logarithmic spiral cuts all radial lines at the same angle. (This spiral is therefore central in the theory of navigation, where it corresponds to sailing at a constant compass course.)

Consider the portion of the spiral between $r = 0$ and some upper bound $r = R$. Divide it into triangles with central angle $d\theta$.

(c) Show that by flipping over every other of these triangles (so that the pointy ends of the triangles are pointing in alternating directions) they can be made to fit into one large triangle.

(d) Use this to infer the area and arc length of this portion of the spiral.

§ 7.2. Calculus in polar coordinates

§ 7.2.1. Lecture worksheet

In a figure in §1.1 we saw the differential triangle with sides $dx$, $dy$, $ds$.

7.2.1. Draw the analogous figure in the polar coordinate case.
Hint: The figure in §1.1 is showing the step $ds$ from a point $(x, y)$ on the curve to the “subsequent” point $(x + dx, y + dy)$ on the curve, as well as the decomposition of this step into the variable components $dx$ and $dy$. The polar case is quite analogous, except of course the two coordinate components are now radial and circular rather than horizontal and vertical. Also, note that an increase in $\theta$ corresponds to a greater change in position the greater $r$ is.

7.2.2. (a) Use the figure from problem 7.2.1 to express $ds$ in terms of $r$ and $\theta$ (and their differentials).

(b) Use this to express the arc length of a polar curve $r(\theta)$ as an integral in $\theta$. Hint: this is in many ways analogous to problem 4.2.2.

7.2.3. Use the figure from problem 7.2.1 to express the angle the curve $r(\theta)$ makes with the radial line in terms of $r$ and $\theta$ (and their differentials).

In §3.1 we saw how the area under the curve $y(x)$ is made up of rectangles with area $y\,dx$.

7.2.4. Draw the analogous picture for the polar coordinate case, and use it to express by means of an integral the area “swept out” by a polar curve.
Hint: Instead of rectangles the area will be made up of “pizza slices,” which for the purposes of area calculation may be considered triangular.

§ 7.2.2. Problems

7.2.5. In this problem we shall prove computationally what we saw geometrically in problem 7.1.3 regarding the logarithmic spiral $r = e^{k\theta}$.

(a) Use problem 7.2.3 to show that this curve makes equal angles with all radial lines.

Consider the portion of the spiral between $r = 0$ and some upper bound $r = R$.

(b) Use problem 7.2.2b to express the arc length of this portion.

(c) Use problem 7.2.4 to find the area of this portion.

(d) ? Reconcile your results with the geometrical result of problem 7.1.3.

7.2.6. (a) Use problem 7.2.4 to find the area of a circle.

(b) ? Discuss how this proof relates to the ones we have seen in problems 1.1.1 and 3.4.3.

§ 7.3. Parametrisation

§ 7.3.1. Lecture worksheet

A parametrisation is a way of characterising a curve “one point at a time.” For example, $(\cos t, \sin t)$ is a parametric representation of the unit circle, because as the parameter $t$ runs through all numbers, $(\cos t, \sin t)$ runs around and around the unit circle. In fact a restricted interval such as $0 \le t < 2\pi$ is enough to generate the circle.

Parametric representations of curves are often useful when the $x$ and $y$ coordinates of the curve are governed by separate laws. An important physical example is that of projectile motion. The initial firing velocity $v_0$ of a projectile can be decomposed into horizontal and vertical components $v_0\cos\alpha$ and $v_0\sin\alpha$. So if the projectile simply kept going in inertial motion its equations would be $x = tv_0\cos\alpha$, $y = tv_0\sin\alpha$. But these are not yet the correct equations. We must also take into account the constant gravitational acceleration affecting the vertical position, giving the parametric representation of the trajectory $x = tv_0\cos\alpha$, $y = tv_0\sin\alpha - gt^2/2$.

7.3.1. Combine these equations so as to eliminate $t$ and confirm that the result is the kind of curve you expected.

7.3.2. A bomb is dropped from a point directly above a target. We are located at a fortification some distance away. At the moment the bomb is dropped we fire a projectile aimed at the point from where the bomb is dropped. Show that we will hit the bomb mid-air as long as the firing velocity of our cannon is greater than a certain threshold.

The curve traced by a point on a rolling circle is called a cycloid:

[Figure: a cycloid.]

This is another example of a curve for which the essence of the generating motion is easier to capture in parametric form than through a standard Cartesian equation in $x$ and $y$.

7.3.3. Show that the cycloid is parametrised by

$x = t - \sin t$
$y = 1 - \cos t$

Hint: begin by parametrising the motion of the midpoint of the circle.

§ 7.3.2. Problems

7.3.4. As a variant of the cycloid we can let a circle roll on another circle:

[Figure: a circle rolling on another circle.]

(a) Sketch the resulting curve in the case where the two circles are of the same size.

(b) This curve is called the cardioid. Why? Sound people speak of “cardioid microphones”: can you see what they mean?

(c) Show that the cardioid is parametrised by:

$x = 2\cos t - \cos 2t$
$y = 2\sin t - \sin 2t$

7.3.5. Consider a line through $(-1, 0)$ with slope $t$. This line cuts the unit circle $x^2 + y^2 = 1$ in one more point besides $(-1, 0)$. Find the coordinates of this point in terms of $t$.

7.3.6. Use problem 7.3.5 to explain why the substitution $u = \tan(\theta/2)$ turns a rational function of $\sin(\theta)$ and $\cos(\theta)$ into a rational function of $u$. (This is useful for integration purposes, as noted in §3.7.6.)
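The projectile parametrisation of §7.3.1 can be explored by plugging in values of $t$. The sketch below tabulates a few points of the trajectory; the function name `projectile_position`, the sample values $v_0 = 20$ and $\alpha = 45^\circ$, and the value $g = 9.81$ are illustrative assumptions, not taken from the text.

```python
import math

def projectile_position(t, v0, alpha, g=9.81):
    """Point (x, y) at time t: x = t v0 cos(alpha), y = t v0 sin(alpha) - g t^2 / 2."""
    x = t * v0 * math.cos(alpha)
    y = t * v0 * math.sin(alpha) - g * t ** 2 / 2
    return x, y

v0 = 20.0                   # firing speed (assumed sample value)
alpha = math.radians(45.0)  # firing angle (assumed sample value)
g = 9.81                    # gravitational acceleration (assumed sample value)

# y = t (v0 sin(alpha) - g t / 2) vanishes at t = 0 and at t = 2 v0 sin(alpha) / g.
flight_time = 2 * v0 * math.sin(alpha) / g

# Five equally spaced instants from launch to landing.
points = [projectile_position(k * flight_time / 4, v0, alpha, g) for k in range(5)]
for x, y in points:
    print(round(x, 3), round(y, 3))
```

Plotting such points for many values of $t$ traces out the curve one point at a time, which is exactly how a parametrisation is meant to be read.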

7.3.7. † One of the oldest recorded mathematical documents is a Babylonian clay tablet from almost four thousand years ago, way back in the Bronze Age. It consists of nothing but a long list of Pythagorean triples, i.e., integers $a$, $b$, $c$ such that $a^2 + b^2 = c^2$. Explain how problem 7.3.5 can be used to generate Pythagorean triples.

7.3.8. A thread is wound around a circular spool. You grab the end of the thread and start unwinding it while keeping it as taut as possible.

(a) Argue that the unwound piece of the string at any stage in this process is tangent to the circle at the point of contact.

(b) Sketch roughly the curve traced out by the free end of the thread as you unwind it.

(c) Find a parametric representation of this curve.
Hint: Let the spool be the unit circle and let the initial position of the free end of the thread be $(1, 0)$. As parameter, use the angle $\theta$ between the $x$-axis and the point where the unwound portion of the string touches the spool.

§ 7.4. Calculus of parametric curves

§ 7.4.1. Lecture worksheet

In this section we shall see how to deal with tangents, areas, and arc lengths of curves given parametrically. The idea is simple: start with the usual expressions, and then rewrite them in terms of the parametrisation, as outlined in §7.5.4. Let us keep using the cycloid of problem 7.3.3 as our guiding example in this investigation.

7.4.1. (a) Using the parametric representation of the cycloid, find an expression for the slope of its tangent at a given point.

(b) How can you check whether the answer seems reasonable?

7.4.2. Show that the area under one arch of the cycloid is three times the area of the rolling circle.

7.4.3. Show that the length of one arch of the cycloid is four times the diameter of the rolling circle.

§ 7.4.2. Problems

7.4.4. Show that the result of problem 7.4.1a can be interpreted geometrically as saying that the tangent is parallel to the other dotted line in the figure below.

[Figure omitted: construction with dotted lines for problem 7.4.4.]

7.4.5. In this problem we shall prove that the bob of a perfect pendulum clock swings along a cycloidal path. This means that even as the pendulum’s swings are dampened it still takes the same time to complete one full swing. In other words, a particle sliding down a cycloidal ramp (a cycloid turned upside-down compared to our previous pictures) will take the same time to reach the bottom no matter where it started. The fact that we are now considering an upside-down cycloid can easily be accounted for by simply letting the $y$-axis be directed downwards. Then $y$ represents the vertical distance fallen from the highest point of the cycloid (the start of the ramp), and our previous parametric equations carry over without change.

(a) Using the parametric representation of the cycloid, rewrite the integral of problem 6.3.2 in terms of $\theta$.

(b) What are the bounds of integration?

(c) Evaluate the integral.

(d) Repeat all of the above steps for the case where the particle is released not from $y = 0$ but some lower point $y = y_0$.

(e) Explain how this establishes what we wanted to show.

§ 7.5. Reference summary

§ 7.5.1. Polar coordinates

[Figure: the point $(r, \theta)$, at distance $r$ from the origin in the direction given by the angle $\theta$.]

• Given a point $(r, \theta)$ in polar coordinates, find its cartesian coordinates $(x, y)$.

$x = r\cos\theta$, $y = r\sin\theta$.

• Given a point $(x, y)$ in cartesian coordinates, find its polar coordinates $(r, \theta)$.

$r = \sqrt{x^2 + y^2}$. If $x$ is positive, $\theta = \arctan(y/x)$. If $x$ is negative, $\arctan(y/x)$ is the negative of the angle with the negative $x$-axis, whence $\theta = \arctan(y/x) + 180^\circ$.

§ 7.5.2. Calculus in polar coordinates

Arc length of polar curve $r(\theta)$ from $\theta = a$ to $\theta = b$:

$\int_a^b \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta$

Angle $r(\theta)$ makes with radial line:

$\arctan\left(\frac{r\,d\theta}{dr}\right)$

Area of polar curve $r(\theta)$ from $\theta = a$ to $\theta = b$:

$\int_a^b \tfrac{1}{2}r^2\,d\theta$

§ 7.5.3. Parametrisation

A parametrisation is an expression $(x(t), y(t))$ that gives one point for each $t$-value one plugs into it. As the parameter $t$ runs through all possible values, a curve is generated.

Motion of projectile fired at angle $\alpha$ with velocity $v_0$: $x = tv_0\cos\alpha$, $y = tv_0\sin\alpha - gt^2/2$.

• Understand the curve determined by a given parametrisation.

Fill in various values for $t$ and plot the resulting points. These are all points on the curve.

If you can solve for $t$ in one of the equations $x = x(t)$ and $y = y(t)$ then you can plug this expression for $t$ into the other equation to obtain the ordinary cartesian equation for the curve.

• Find the parametrisation of a given curve.

Some simple cases: If the curve is of the form $y = f(x)$, you can let $x$ be the parametrising variable; then $x = t$ and $y = f(t)$. The circle with midpoint $(a, b)$ and radius $r$ is parametrised by $x = a + r\cos t$, $y = b + r\sin t$.

In general: Try to choose the parameter $t$ to correspond to a natural characteristic of the curve, such as a certain angle, time, arc length, etc. We are free to do this since the only formal requirement for the parameter $t$ is that plugging in numbers for it generates all points on the curve.

Next we need to look at the $x$ and $y$ coordinates of a point on the curve in isolation from each other. Forget about $y$ and try to express $x$ purely as a function of $t$. Then do the same for $y$. This can often be done directly when the nature of the curve is based on inherently rectilinear principles, such as gravity and inertia.

If the curve consists of multiple interacting or compounding motions, first try to express each separately. The parametrised point can often be described as the sum of such component effects, for example as a sum of vectors.

§ 7.5.4. Calculus of parametric curves

• Find the slope of the tangent to a parametric curve $(x(t), y(t))$.

$\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$

• Find the area enclosed by a parametric curve $(x(t), y(t))$.

Start with the usual area integral $\int y\,dx$ and rewrite it as an integral purely in terms of the parameter $t$.

• Find the arc length of a parametric curve $(x(t), y(t))$.

Start with the usual arc length integral $\int ds$, where $ds^2 = dx^2 + dy^2$, and rewrite it as an integral purely in terms of the parameter $t$.
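The integral recipes of this reference summary can be checked numerically. The sketch below uses a simple midpoint rule (the helper `integrate` and the number of subintervals are assumptions of this illustration) on two curves from the chapter: the Archimedean spiral $r = \theta$, whose swept area over one turn is $\tfrac{1}{2}\int_0^{2\pi}\theta^2\,d\theta = 4\pi^3/3$, and the cycloid $x = t - \sin t$, $y = 1 - \cos t$, whose arch is stated in problems 7.4.2 and 7.4.3 to have area $3\pi$ and length $8$ for a rolling circle of radius 1.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Polar area: integral of (1/2) r^2 dtheta, here with r = theta for one turn.
spiral_area = integrate(lambda th: 0.5 * th ** 2, 0.0, 2 * math.pi)

# Cycloid x = t - sin t, y = 1 - cos t, with derivatives taken by hand.
def y(t):
    return 1 - math.cos(t)

def dx_dt(t):
    return 1 - math.cos(t)

def dy_dt(t):
    return math.sin(t)

# Slope of the tangent: dy/dx = (dy/dt) / (dx/dt), evaluated at t = pi/2.
slope = dy_dt(math.pi / 2) / dx_dt(math.pi / 2)

# Area under one arch: integral of y dx, rewritten as integral of y(t) x'(t) dt.
cycloid_area = integrate(lambda t: y(t) * dx_dt(t), 0.0, 2 * math.pi)

# Arc length of one arch: ds = sqrt(x'(t)^2 + y'(t)^2) dt.
cycloid_length = integrate(lambda t: math.hypot(dx_dt(t), dy_dt(t)), 0.0, 2 * math.pi)

print(spiral_area, 4 * math.pi ** 3 / 3)
print(slope, cycloid_area, 3 * math.pi, cycloid_length)
```

In each case the only work is rewriting the integrand in terms of the parameter; the numerical rule then stands in for the exact integration that the problems ask for.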

8 VECTORS

§ 8.1. Vectors

§ 8.1.1. Lecture worksheet

In this section we shall see that geometrical problems regarding projections and perpendicularity can be reduced to algebra in a remarkably simple way. This is done using the language of vectors. A vector is a directed line segment: it is so-and-so long and it goes one way rather than the other. We draw it as an arrow and denote it $\mathbf{v}$. We can also express it in coordinate form by putting its foot end at the origin and recording the coordinates of its endpoint (this is why the $\mathbf{v}$ is fat: it is “stuffed” with more than one number). But the vector is the same no matter where it starts, like these two $\mathbf{v}$’s:

[Figure: $\mathbf{u} = (-4, 1)$ and two copies of $\mathbf{v} = (1, 2)$ with different starting points.]

The arithmetic of vectors goes like this:

[Figure: $\mathbf{a}$, $\mathbf{b}$, $\mathbf{a} + \mathbf{b}$, $\mathbf{a} - \mathbf{b}$; $\mathbf{v}$, $2\mathbf{v}$, $-\mathbf{v}$.]

So to add vectors we put them “head to tail” to make a “vector train.” It follows that $\mathbf{a} - \mathbf{b}$ points from $\mathbf{b}$ to $\mathbf{a}$, because $\mathbf{a} - \mathbf{b}$ must be something such that when you add $\mathbf{b}$ to it you get $\mathbf{a}$.

8.1.1. What is the sum of the twelve vectors pointing from the centre of a clock to the hours?

8.1.2. Which of the following expressions correspond to the midpoint of the diagonal of the parallelogram spanned by $\mathbf{a}$ and $\mathbf{b}$?

□ $\tfrac{1}{2}(\mathbf{a} + \mathbf{b})$
□ $\mathbf{a} + \tfrac{1}{2}(\mathbf{a} - \mathbf{b})$
□ $\mathbf{b} + \tfrac{1}{2}(\mathbf{a} - \mathbf{b})$
□ $\mathbf{a} - \tfrac{1}{2}(\mathbf{a} - \mathbf{b})$
□ $\mathbf{b} + \tfrac{1}{2}(\mathbf{b} - \mathbf{a})$

In physics we often encounter quantities that have both magnitude and direction, such as force or velocity. Vectors are the natural language for describing such phenomena. This is in contrast with quantities that have magnitude only, such as temperature.

§ 8.1.2. Problems

8.1.3. You want to paddle across a river in a canoe. You can paddle twice as fast as the speed of the flow of the river. At what angle should you aim the nose of your canoe in order to go straight across the river?

8.1.4. Prove that if you join the midpoints of the sides in any quadrilateral you always get a parallelogram. Hint: Think of the quadrilateral as $\mathbf{a} + \mathbf{b} = \mathbf{c} + \mathbf{d}$, divide by 2, interpret geometrically.

8.1.5. (a) Prove both algebraically and visually that $|\mathbf{a} + \mathbf{b}| = |\mathbf{a} - \mathbf{b}|$ if and only if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular.

(b) Under what conditions is $|\mathbf{a} + \mathbf{b}|^2 = |\mathbf{a}|^2 + |\mathbf{b}|^2$?

§ 8.2. Scalar product

§ 8.2.1. Lecture worksheet

Vectors can be used to express projections in a very convenient way.

[Figure: vectors $\mathbf{a}$ and $\mathbf{b}$, with the projection of $\mathbf{a}$ onto $\mathbf{b}$ of length $|\mathbf{a}|\cos\theta$.]

The length of $\mathbf{a}$’s projection onto $\mathbf{b}$ is easily expressed trigonometrically. It is $|\mathbf{a}|\cos\theta$, where $|\mathbf{a}|$ means the length of the vector $\mathbf{a}$ (just as absolute value always means distance to the origin, or simply magnitude). The remarkable thing is that the length of the projection can also be expressed in a very simple way in terms of the coordinates of the vectors. This is codified in the scalar product

$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\theta = a_1 b_1 + a_2 b_2 + a_3 b_3$

When $\mathbf{b}$ is a unit vector (i.e., has length 1) the middle expression is precisely the length of the projection, so the formula tells us that we can find it simply by multiplying the vectors component-wise and adding the results. Nothing could be easier.

What is the reason behind this magical harmony of geometry and algebra? To see this it is useful to introduce the unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ pointing in the direction of the axes:

[Figure: the unit vectors $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$, $\mathbf{k} = (0, 0, 1)$ along the $x$-, $y$- and $z$-axes.]

Then by breaking up $\mathbf{a}$ and $\mathbf{b}$ into their coordinate components we get

$\mathbf{a} \cdot \mathbf{b} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) \cdot (b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k})$

Now, the projection properties of $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are particularly simple: any one of them projected onto itself gives 1, and projected onto each of the other two gives zero. Therefore when we multiply out the parentheses all the cross terms go away and only the “like with like” terms survive. So the result is $a_1 b_1 + a_2 b_2 + a_3 b_3$, as claimed.
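The agreement between the two forms of the scalar product can be seen on a concrete pair of vectors. The sketch below is only an illustration: the helper names `dot` and `norm` and the sample vectors are assumptions. It computes $a_1 b_1 + a_2 b_2$ directly and compares it with $|\mathbf{a}||\mathbf{b}|\cos\theta$, where $\theta$ is obtained independently from the directions of the two vectors.

```python
import math

def dot(a, b):
    """Component form of the scalar product: sum of a_i * b_i."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a vector, using a . a = |a|^2."""
    return math.sqrt(dot(a, a))

a = (3.0, 4.0)
b = (2.0, 1.0)

# Angle between the vectors, found from their directions rather than from the dot product.
theta = math.atan2(a[1], a[0]) - math.atan2(b[1], b[0])

component_form = dot(a, b)                          # 3*2 + 4*1
cosine_form = norm(a) * norm(b) * math.cos(theta)   # |a||b| cos(theta)

# Projection of a onto b: scalar product with the unit vector along b.
b_hat = tuple(x / norm(b) for x in b)
projection_length = dot(a, b_hat)

print(component_form, cosine_form, projection_length)
```

The last line illustrates the remark about unit vectors: once $\mathbf{b}$ is normalised, the component-wise product is the length of the projection, $|\mathbf{a}|\cos\theta$.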

8.2.1. Which of the following are assumptions made in the above proof?

□ $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$ for any vectors $\mathbf{u}$, $\mathbf{v}$.
□ $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$ for any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$
□ $(k\mathbf{u}) \cdot \mathbf{v} = k(\mathbf{u} \cdot \mathbf{v})$ for any vectors $\mathbf{u}$, $\mathbf{v}$, and constant $k$.
□ $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3$ for any vectors $\mathbf{u}$, $\mathbf{v}$

To complete the proof any such assumptions would have to be proved

□ geometrically (using cosine form)
□ algebraically (using coordinate form)

Our argument about $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ also highlighted two useful special cases of the scalar product: projection onto itself, and perpendicularity. The scalar product of a vector with itself gives the length squared, $\mathbf{a} \cdot \mathbf{a} = a_1^2 + a_2^2 + a_3^2 = |\mathbf{a}|^2$, and the scalar product of two vectors is zero if and only if they are perpendicular. Indeed, if a problem has right angles in it, chances are that you can solve it by scalar products. Here is an example:

8.2.2. Consider two lines, $y = ax + b$ and $y = cx + d$, that intersect in some point. The vectors ( ____ , $a$) and ( ____ , $c$) point in the direction of these lines respectively. If the lines are perpendicular, the product of their slopes is ____ .

Another example illustrating the same point is problem 8.2.10.

8.2.3. $|\mathbf{a} + \mathbf{b}| = |\mathbf{a} - \mathbf{b}|$ if and only if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular. One can see this geometrically by means of [the Pythagorean Theorem, Theorem of Thales, diagonals of parallelograms, trigonometry]. To prove it algebraically, it is useful to consider:

□ $(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b})$
□ $(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} + \mathbf{b})$
□ $(\mathbf{a} - \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b})$

§ 8.2.2. Problems

8.2.4. A molecule of methane, CH$_4$, forms a regular tetrahedron with the four hydrogen atoms at the vertices and the carbon atom at the centroid. Let the vertices of the first three hydrogen atoms be $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$.

(a) Argue that any point of the form $(a, a, a)$ is equidistant from each of these three points.

(b) Determine the coordinates of the fourth hydrogen atom using the condition that all hydrogen atoms must be equidistant.

(c) Determine the coordinates of the carbon atom by computing the “average” position of the four hydrogen atoms (i.e., add the hydrogen position vectors and divide by 4).

(d) Compute the bond angle, i.e., the angle between the lines that join the carbon atom to two of the hydrogen atoms.

8.2.5. Show by an example that $\mathbf{a} \cdot \mathbf{b} = \mathbf{a} \cdot \mathbf{c}$ does not necessarily imply $\mathbf{b} = \mathbf{c}$. In other words, you cannot “cancel” the $\mathbf{a}$.

8.2.6. Consider these two ways of setting up a coordinate system of three variables:

[Figure: two sets of $x$, $y$, $z$ axes, differing in the placement of the $x$- and $y$-axes.]

Is there a difference? Is one more natural than the other?

8.2.7. In Proclus’s commentary on Euclid’s Elements one reads:

The Epicureans are wont to ridicule this theorem, saying it is evident even to an ass and needs no proof. . . . That the present theorem is known to an ass they make out from the observation that, if straw is placed at one extremity of the sides [of a triangle], an ass in quest of provender will make his way along the one side and not by way of the two others.

Benno Artmann, in his book Euclid: The Creation of Mathematics, adds:

The Epicureans of today might as well add that one could see the proof on every campus where people completely ignorant of mathematics traverse the lawn in the manner of the ass.

What is the theorem in question? Prove it using vector methods. Hint: First prove that $|\mathbf{a} \cdot \mathbf{b}| \le |\mathbf{a}||\mathbf{b}|$, and then consider $(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} + \mathbf{b})$ and estimate it upwards.

8.2.8. (a) Prove the Pythagorean theorem by vector methods.

(b) Is this proof circular? I.e., was the Pythagorean the-
a×b
orem needed to establish the properties of vectors
that you used in your proof?

(c) Generalise to the case where the angle between the b


a θ
two “legs” is no longer a right angle. This is the so-
called “law of cosines.”
Lastly, it is evident from the law of the lever that the magnitude
8.2.9. For this problem, use only scalar products and properties
|a × b| is the area of the parallelogram spanned by a and b, i.e.,
of vectors. Do not use coordinates, x’s and y’s, trigonom-
|a||b| sin θ. In the wrench interpretation, |b| sin θ is the part of
etry, functions, theorems of Euclidean geometry, etc.
the applied force that is perpendicular to the wrench shaft, i.e.,
(a) Let A and B be two diametrically opposite points the part of the force that actually has an effect in rotating the
on a circle, and let C be an arbitrary third point on wrench.
the circle. Prove that ∠ AC B is a right angle.
b
(b) Let ABC be a triangle with a right angle at C , and let |b| sin θ
O be the midpoint of AB. Prove that |OA| = |OB| = |OC|. (Note that this result is obvious when the triangle is inscribed in a circle as in the previous problem, but here you are asked to prove it independently without making any reference to circles.)

8.2.10. Shortest (i.e., perpendicular) distance from a line to a point.

(a) Explain why a + tb generates a line as t runs through all real numbers.

(b) Let a = (0, 0) and b = (2, 1). Express the vector pointing from a + tb to the point (3, 3).

(c) Find the value of t for which this vector is perpendicular to the line.

(d) Find the shortest distance from the line to the point.

§ 8.3. Vector product

§ 8.3.1. Lecture worksheet

There is another way of “multiplying” vectors, which is written with a cross instead of a dot: a × b. This is called the vector product because the result is a vector. The meaning of the product a × b is most vividly seen in physical terms. If r is a wrench onto which I apply the force F, then r × F is the torque T, i.e., the force with which the bolt moves.

[Figure: wrench r with force F applied at angle θ; torque T = r × F.]

This interpretation makes the main properties of a × b obvious. First of all a × b is clearly perpendicular to both a and b. Also its direction can be determined from our everyday experience with how bolts and screws move:

[Figure: direction of a × b, with θ the angle between a and b.]

To find out how to compute the vector product algebraically, we split a and b into components and consider

$$\mathbf{a} \times \mathbf{b} = (a_1 \mathbf{i} + a_2 \mathbf{j} + a_3 \mathbf{k}) \times (b_1 \mathbf{i} + b_2 \mathbf{j} + b_3 \mathbf{k})$$

Let us assume for the moment that the vector product behaves much like ordinary multiplication, so that we can “multiply out” the parentheses in the usual way. Then each of the terms is easy to find since it is evident from the wrench interpretation that for example i × j = k and i × i = 0.

8.3.1. Explain these equalities.

In this way we see that the product becomes

$$\mathbf{a} \times \mathbf{b} = (a_2 b_3 - a_3 b_2)\mathbf{i} + (a_3 b_1 - a_1 b_3)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}$$

8.3.2. Check this.

This result can be written more elegantly in terms of determinants as

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$$

This is the formula to use for computing vector products in practice. Once we have arrived at this formula through this heuristic line of reasoning, we can reverse our path and take this formula as our definition of the vector product and then derive its various properties by direct computation. In particular we could then verify our assumption about the algebra of vector multiplication. It would be easier to be formally complete this way but in terms of insight it would add little. In fact we could also justify our assumptions physically.

8.3.3. Explain the following in terms of the wrench interpretation.

(a) a × (b + c) = a × b + a × c

(b) (a + b) × c = a × c + b × c

(c) (λa) × b = a × (λb) = λ(a × b)

(d) a × b = −b × a
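The component formula for a × b can be tried out numerically. The following is a minimal sketch in Python, not part of the original worksheet; the helper names `cross` and `dot` and the sample vectors are illustrative assumptions. It checks i × j = k, i × i = 0, perpendicularity, and the sign change under swapping the factors.

```python
def cross(a, b):
    """Vector product of two 3-vectors, using the component formula."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def dot(a, b):
    """Scalar product in coordinate form."""
    return sum(p * q for p, q in zip(a, b))


# Standard basis vectors.
i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j))  # i x j
print(cross(i, i))  # i x i

# Arbitrary sample vectors (an assumption, chosen only for illustration).
a, b = (2, -1, 3), (0, 4, 1)
c = cross(a, b)
print(c)
print(dot(c, a), dot(c, b))  # both scalar products vanish: c is perpendicular to a and b
print(cross(b, a))           # the negative of c
```

Defining the product by the formula and then checking its properties by direct computation is exactly the "reverse path" mentioned in the text.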

§ 8.3.2. Problems

8.3.4. Find (1, 2, 3) × (2, 1, 2) and use it to determine the area of the parallelogram spanned by these two vectors. Note that the result is very easily obtained in this way from a formula that we found using physical reasoning, whereas it would be much harder to compute directly by brute-force analytic geometry.

§ 8.4. Geometry of vector curves

§ 8.4.1. Lecture worksheet

Many geometrical properties of curves are more naturally and elegantly expressed in vector language than in terms of explicit formulae in x and y. If $\mathbf{x}(t) = (x(t), y(t))$ is the position of a moving particle, then its velocity is $\dot{\mathbf{x}}(t) = \mathbf{v}(t) = (\dot{x}(t), \dot{y}(t))$. Geometrically, $\dot{\mathbf{x}}$ is a tangent vector since it indicates the “direction of instantaneous change.”

8.4.1. Consider a particle moving counterclockwise along the unit circle at unit speed. Explain how you could determine $\dot{\mathbf{x}}$ geometrically without actually differentiating $\mathbf{x}$. Then, by considering the derivative of $\mathbf{x}$, explain how this leads to a new proof of the differentiation rules for the sine and the cosine.

8.4.2. (a) Find the acceleration vector $\mathbf{a} = \ddot{\mathbf{x}}$ for this uniform circular motion and illustrate with a figure.

(b) Interpret this result in physical terms, using Newton’s law F = ma, in the case where $\mathbf{x}$ is the motion of a planet about the sun.

8.4.3. By definition, $\mathbf{a} = \frac{\mathbf{v}(t + dt) - \mathbf{v}(t)}{dt}$ is proportional to the difference between two successive velocity vectors. Argue on this basis that a is always perpendicular to v for any fixed-speed motion.

When studying the geometry of curves we prefer to use unit-speed parametrisations of our curves, $|\dot{\mathbf{x}}| = 1$, since this gives the curve in its purest form, uncontaminated by physical considerations. For unit-speed parametrisations, as problem 8.4.3 suggests, the geometrical meaning of $|\ddot{\mathbf{x}}|$ is “how much the curve is turning,” or curvature. This is the same curvature studied in §13.1. We see that vector language is more naturally suited to the problem and simplifies a mess of a formula into the simple and intuitive $|\ddot{\mathbf{x}}|$. And this simplification is no mere game with symbols, as the following problem shows.

8.4.4. Compute the curvature of a circle using the vector method, and compare with the non-vector way of doing this (problem 13.1.3).

§ 8.4.2. Problems

8.4.5. Kepler’s area law says that planets sweep out equal areas in equal times.

[Figure: planetary orbit with the areas swept out in two equal time intervals Δt.]

This is easily proved by vector methods. Let r be the position vector of the planet with the sun as the origin.

(a) The gravitational force is directed towards the sun. Explain what this means in terms of $\ddot{\mathbf{r}}$. Hint: cf. problem 8.4.2.

(b) Explain how the area covered by the planet is measured by $|\mathbf{r} \times \dot{\mathbf{r}}|/2$.

(c) Prove the product rule for vector products: $\frac{d}{dt}(\mathbf{u} \times \mathbf{v}) = \dot{\mathbf{u}} \times \mathbf{v} + \mathbf{u} \times \dot{\mathbf{v}}$.

(d) Use this to prove that the planet covers equal areas in equal times.

§ 8.5. Reference summary

§ 8.5.1. Vectors

[Figure: vector addition a + b, subtraction a − b, and the scalar multiples 2v and −v.]

Vector addition. Algebraically: (a, b) + (c, d) = (a + c, b + d). Geometrically: arrows head to tail.

ka = k(a, b) = (ka, kb) = magnification of a by factor k.

$|\mathbf{x}| = \sqrt{x_1^2 + x_2^2 + x_3^2}$ = length of x.

a parallel to b ⇐⇒ a = kb

position vector of point P: $\overrightarrow{OP}$; vector pointing from the origin to P
unit vector: vector of length 1
orthogonal: perpendicular; making a right angle; scalar product 0
orthonormal: orthogonal and of unit length

[Figure: coordinate axes x, y, z with the unit vectors i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1).]

(2, 5, 1) = 2î + 5ĵ + k̂

§ 8.5.2. Scalar product

$$\mathbf{a} \cdot \mathbf{b} = \underbrace{|\mathbf{a}||\mathbf{b}| \cos\theta}_{\text{cosine form}} = \underbrace{a_1 b_1 + a_2 b_2 + a_3 b_3}_{\text{coordinate form}}$$

= (length of) projection of a onto b if |b| = 1

a · b = 0 ⇒ a perpendicular to b

$\mathbf{a} \cdot \mathbf{a} = |\mathbf{a}|^2$

(1, 2, 2) · (3, 0, 5) = 1 · 3 + 2 · 0 + 2 · 5 = 13

§ 8.5.3. Vector product

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = (a_2 b_3 - a_3 b_2)\mathbf{i} + (a_3 b_1 - a_1 b_3)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}$$

a × b is perpendicular to a and b

|a × b| = area of parallelogram spanned by a and b

§ 8.5.4. Geometry of vector curves

For a curve given parametrically by x(t):
$\dot{\mathbf{x}}$ = tangent vector.
$|\ddot{\mathbf{x}}|$ = |Å| = curvature.

§ 8.5.5. Problem guide

• Find a vector pointing from a to b.
b − a.

• Find unit vector pointing in same direction as a.
a/|a|.

• Express a line in vector form.
Find a position vector a for a point on the line. Using a second point on the line, or its slope or direction, find a vector b that points in the direction of (is parallel to) the line, i.e., so that a + b is a point on the line. Now a + λb is a point on the line for any number λ, so this expression gives the line in vector form.

• Find centroid (midpoint, average position) of a set of position vectors.
Add the vectors and divide by how many there are.

• Find center of mass (weighted average position) of a set of weighted position vectors.
Multiply each position vector by its mass, add all together, and divide by total mass.

• Ensure that two vectors are perpendicular.
Set scalar product (coordinate form) equal to zero.

Find a value of x such that (x, −1, 2) is perpendicular to (1, 2, 2).
(1, 2, 2) · (x, −1, 2) = x − 2 + 4 = 0 =⇒ x = −2.

Find a value of a such that the vectors $(6, 5, -\tfrac{1}{2})$ and $(\tfrac{1}{3}, a, 16)$ are perpendicular.
The vectors are perpendicular when $(6, 5, -\tfrac{1}{2}) \cdot (\tfrac{1}{3}, a, 16) = 0$. Hence $6 \cdot \tfrac{1}{3} + 5a - 16 \cdot \tfrac{1}{2} = 0$ and thus $a = \tfrac{6}{5}$.

• Determine angle between two vectors.
Solve for θ in equality between coordinate and cosine forms of the scalar product.

Determine the angle θ between the vectors (1, 5) and (3, 2).
From the scalar product identity (1, 5) · (3, 2) = |(1, 5)| · |(3, 2)| · cos θ, we obtain $\cos\theta = \frac{1 \cdot 3 + 5 \cdot 2}{\sqrt{1^2 + 5^2} \cdot \sqrt{3^2 + 2^2}} = \frac{13}{\sqrt{26}\sqrt{13}} = \frac{1}{\sqrt{2}}$. The angle is thus π/4.

Find the angle between v = (1, 0, 1) and w = (1, 2, 2).
On the one hand, v · w = (1, 0, 1) · (1, 2, 2) = 3. On the other hand, v · w = |v||w| cos θ, and $|\mathbf{v}| = \sqrt{1^2 + 0^2 + 1^2} = \sqrt{2}$, and $|\mathbf{w}| = \sqrt{1^2 + 2^2 + 2^2} = \sqrt{9} = 3$. Thus $3 = 3\sqrt{2} \cos\theta$ =⇒ $\cos\theta = 1/\sqrt{2}$ =⇒ $\theta = \tfrac{\pi}{4} = 45^\circ$.

• Find area of parallelogram.
Find two vectors a, b that span it (i.e., point from one vertex to the two adjacent ones). Area is |a × b|.

• Find area of triangle.
Consider as half a parallelogram and proceed as above.

Find the area of the triangle with vertices P = (1, 1, 1), Q = (1, 1, 0), R = (0, 0, 1).
$\tfrac{1}{2}|QP \times RP| = \tfrac{1}{2}|(-1, 1, 0)| = \tfrac{1}{2}\sqrt{2}$.

• Determine the direction of a × b.
Point the index finger of your right hand in the direction of a, with your palm facing towards b. The vector product a × b points perpendicularly upwards in the direction of your thumb. (See figure and screwdriver analogy in §8.3.1.)
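Two of the recipes in the problem guide, the angle between two vectors and the area of a triangle, can be sketched in a few lines of Python. This is an illustrative sketch only; the helper names (`dot`, `norm`, `cross`, `angle`) are assumptions made here, and the numbers are the ones used in the guide's worked examples.

```python
import math


def dot(a, b):
    """Scalar product in coordinate form."""
    return sum(p * q for p, q in zip(a, b))


def norm(a):
    """Length of a vector, from a . a = |a|^2."""
    return math.sqrt(dot(a, a))


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angle(a, b):
    """Angle from equating the cosine form and the coordinate form."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))


# Angle between v = (1, 0, 1) and w = (1, 2, 2); the guide finds pi/4.
theta = angle((1, 0, 1), (1, 2, 2))
print(theta, math.pi / 4)

# Area of the triangle P, Q, R as half a parallelogram.
P, Q, R = (1, 1, 1), (1, 1, 0), (0, 0, 1)
QP = tuple(p - q for p, q in zip(P, Q))
RP = tuple(p - r for p, r in zip(P, R))
area = norm(cross(QP, RP)) / 2
print(area, math.sqrt(2) / 2)
```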

§ 8.5.6. Examples

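Given that |u| = 2, |v| = 3, and u · v = −1, determine the length of the vector 2u − 3v.

$|2\mathbf{u} - 3\mathbf{v}|^2 = (2\mathbf{u} - 3\mathbf{v}) \cdot (2\mathbf{u} - 3\mathbf{v}) = 4\,\mathbf{u} \cdot \mathbf{u} - 6\,\mathbf{u} \cdot \mathbf{v} - 6\,\mathbf{v} \cdot \mathbf{u} + 9\,\mathbf{v} \cdot \mathbf{v} = 4|\mathbf{u}|^2 - 12\,\mathbf{u} \cdot \mathbf{v} + 9|\mathbf{v}|^2 = 4 \cdot 2^2 - 12(-1) + 9 \cdot 3^2 = 109$. The length of 2u − 3v is hence $\sqrt{109}$.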

Find the distance from the point (2, 3, −1) to the plane 2x − y + 2z = 2.
(x, y, z) = (2, 3, −1) + t(2, −1, 2) = (2 + 2t, 3 − t, −1 + 2t) is a parametric representation of the line in the direction of the normal through the given point. Plugging this into the equation for the plane gives 2 = 2(2 + 2t) − (3 − t) + 2(−1 + 2t) = 9t − 1, and hence $t = \tfrac{1}{3}$. The distance is thus $d = |t| \cdot |(2, -1, 2)| = \tfrac{1}{3} \cdot \sqrt{2^2 + (-1)^2 + 2^2} = 1$.
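The "walk along the normal" argument of this example can be written as a short routine. The sketch below is an illustration under stated assumptions: the function name `distance_to_plane` and its argument order are made up here, and the plane is written as n · x = d with n = (A, B, C).

```python
import math


def distance_to_plane(point, normal, d):
    """Distance from `point` to the plane normal . x = d.

    Starting at `point`, walk along point + t * normal until the plane
    is hit; the distance is |t| times the length of the normal.
    """
    n_dot_p = sum(n * p for n, p in zip(normal, point))
    n_dot_n = sum(n * n for n in normal)
    t = (d - n_dot_p) / n_dot_n  # parameter value at which the line meets the plane
    return abs(t) * math.sqrt(n_dot_n)


# The example above: point (2, 3, -1), plane 2x - y + 2z = 2.
print(distance_to_plane((2, 3, -1), (2, -1, 2), 2))
```

Solving for t in general gives the same t = 1/3 as plugging the line into the plane equation by hand.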

Find the equation for a plane through the points P = (1, 1, 1),
Q = (1, 1, 0), R = (0, 0, 1).
The vectors QP = (0, 0, 1) and RP = (1, 1, 0) are parallel to the
plane. A normal vector to the plane is QP × RP = (0, 0, 1) ×
(1, 1, 0) = (−1, 1, 0). The equation for the plane thus has the
form −x + y = D. Plugging in one of the points, for example
P = (1, 1, 1), into this equation, shows that D = 0. Thus the
equation for the plane is −x + y = 0.
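The same steps (two vectors in the plane, their vector product as normal, one point to fix D) can be sketched in Python. The function name `plane_through` and the returned pair (normal, D) are assumptions made for this illustration.

```python
def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product in coordinate form."""
    return sum(p * q for p, q in zip(a, b))


def plane_through(P, Q, R):
    """Return (normal, D) with normal . (x, y, z) = D for the plane through P, Q, R."""
    QP = tuple(p - q for p, q in zip(P, Q))
    RP = tuple(p - r for p, r in zip(P, R))
    normal = cross(QP, RP)  # perpendicular to two vectors lying in the plane
    D = dot(normal, P)      # plug in one of the points
    return normal, D


normal, D = plane_through((1, 1, 1), (1, 1, 0), (0, 0, 1))
print(normal, D)  # coefficients (A, B, C) and the constant D
```

As a check, the other two points should satisfy the resulting equation as well.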

9 MULTIVARIABLE DIFFERENTIAL CALCULUS

§ 9.1. Functions of several variables

§ 9.1.1. Lecture worksheet

The calculus can be extended to functions of more than one variable. Then instead of y(x) one has z(x, y). So both x and y are “input” variables now, and z is the “output.” We visualise the xy-plane as a horizontal “ground level,” and z as the height. So the function gives a specific elevation for each point on the ground. Thus the graph of z(x, y) is a surface. An ice cream cone, depicted here, is a simple example of a surface. What is the function z(x, y) corresponding to this surface? Three variables is a lot to keep in one’s mind, so to deal with this kind of question we often have to break the problem into more manageable parts. In this case we notice for example that the horizontal cross sections are circles. The equation for a circle is $x^2 + y^2 = r^2$. We want one circle for each z, and bigger ones as we go higher, so we might guess $z = x^2 + y^2$, which clearly satisfies both of these requirements. We can check our guess by taking another cross section. If we cut an ice cream cone in half right down the middle we of course get a V-shaped cross section. This corresponds to cutting for example along the zx-plane, i.e., along the plane y = 0. But if we slice our guess $z = x^2 + y^2$ in this way we see that we get the cross section $z = x^2$. This is a U rather than a V, so altogether $z = x^2 + y^2$ looks more like a bowl or a wine glass than an ice cream cone. We would rather prefer $z = |x|$ as our intersection, and this can be arranged by taking $z = \sqrt{x^2 + y^2}$.

9.1.1. This is almost an ice cream cone, but one more adjustment needs to be made. Explain what it is and how to amend the equation.

The idea of understanding a surface by its cross sections is systematised in contour plots. To produce a contour plot we slice the surface by horizontal planes. Here for example I have sliced the saddle-shaped surface z = xy at one positive and one negative z-value:

[Figure: the saddle z = xy sliced by horizontal planes (left) and its contour plot with level curves labelled −6, −2, 0, 2, 6 (right).]

Algebraically, this corresponds to plugging z = c into the equation for the surface. For the saddle this gives the hyperbolas y = c/x. These curves we then plot in the ordinary xy-plane and label them with their corresponding value for z = c, as shown on the right. You are already familiar with contour plots, no doubt, from their use in topographical maps. Here, for example, is a mountain and its corresponding contour plot:

9.1.2. (a) If you needed to scale the highest peak, how could you use the contour plot to determine the path of easiest ascent?

(b) Conversely, if you are a daredevil skier, which way should you go down?

9.1.3. (a) Draw contour plots of the parts of the cone $z = 5 - \sqrt{x^2 + y^2}$ and the unit sphere that lie above the xy-plane.

(b) If these were mountains, which would be hardest to climb?

9.1.4. Match the functions with the real-world object their graphs resemble.

□ $x^2 + y^2 = z$
□ $3(x^2 + y^2) = z$
□ $x^2 + y^2 = z^2$
□ $5(x^2 + y^2) = z^2$

□ wine glass
□ ice cream cone
□ beach ball
□ Asian-style farmer’s hat
□ pyramid
□ hour glass
□ champagne glass

A function of two variables has two partial derivatives, $f_x$ and $f_y$. They answer the questions: How fast is the height f changing when I take a small step in the x direction? And how much for a step in the y direction? When moving in one of these directions, the other variable doesn’t change. Computationally, this means that, to find the derivative of f(x, y) with respect to x you treat y as a constant and vice versa. So when differentiating with respect to x you can secretly think of any y’s in the

formulas as if they were 5’s or some such innocuous number. For example, if $f(x, y) = x^2 y$ then $f_x = 2xy$ and $f_y = x^2$. To get a feeling for the meaning of these derivatives, go back to the ice cream cone.

9.1.5. For the ice cream cone of the previous problem, compute the following partial derivatives and interpret geometrically.

(a) $z_x(1, 0)$

(b) $z_y(1, 0)$

§ 9.1.2. Problems

9.1.6. † The mixed partial derivatives $f_{xy}$ and $f_{yx}$ are equal. The derivative $f_x$ means: if I take a step in the x-direction, how much does the value of f change? So $(f_x)_y$ means: if I take a step in the y-direction, how much does the change in f per step in the x-direction change? In other words, $(f_x)_y$ is the change in f along the top arrow minus the change in f along the bottom arrow:

Interpret $(f_y)_x$ similarly and explain why $(f_x)_y - (f_y)_x = 0$.

9.1.7. † This problem illustrates the context in which partial differentiation was first conceived in the late 17th century. The creators of the calculus were not interested in multivariable calculus and partial derivatives as it is taught today, since most physical phenomena can be understood in two-dimensional form. However, examples like the one below led them to consider partial derivatives nevertheless, and show their use even for two-dimensional problems.

Consider the trajectories of projectiles fired from a cannon at varying angles:

Note that the figure is two-dimensional: the projectiles are not “coming towards you”; they are all within the same plane.

Ignoring air resistance, the trajectories are of course parabolas, as Galileo discovered. We want to calculate the dashed “safety curve.” Beyond this curve we are always safe, whereas anywhere inside this curve we can be hit. This curve can be computed using the fact that the trajectories of projectiles fired at two almost identical angles, say α and α + dα, intersect at the safety curve.

(a) ? Explain briefly how this fact is evident from the figure.

Now to put this into equations. Let the trajectory for firing angle α be f(x, y, α) = 0. That is, the equation for the curve in x and y coordinates will be some formula involving x, y and α set equal to zero (i.e., we have moved all terms to the left hand side). What we just proved above is that a point (x, y) on the safety curve satisfies both f(x, y, α) = 0 and f(x, y, α + dα) = 0.

(b) Show that this implies that $\frac{d}{d\alpha} f(x, y, \alpha) = 0$.

Therefore we can find the safety curve by combining the two equations f(x, y, α) = 0 and $\frac{d}{d\alpha} f(x, y, \alpha) = 0$ so as to eliminate α. This will give us all the points with the required property in terms of x and y only, which is what we want.

(c) Express the trajectory in parametric form in the manner of §7.3.

(d) Obtain one equation for the trajectory involving x, y, and α, but not t. (I.e., combine the equations so as to eliminate t.) This essentially gives us f(x, y, α) = 0.

(e) Find the equation for the safety curve using the method given above.
(Hint: You may want to use $\frac{1}{\cos^2\alpha} = \frac{\sin^2\alpha + \cos^2\alpha}{\cos^2\alpha} = 1 + \tan^2\alpha$ or some similar trick to relate different trigonometric expressions to each other.)

(f) Explain how you can see from the equation that the curve you found has roughly the right shape and position.

9.1.8. A function of one variable is called continuous if its graph can be drawn without lifting the pen, i.e., if it has no “gaps” or “jumps.” More technically, f(x) is continuous at a point $x_0$ if f(x) approaches $f(x_0)$ as x approaches $x_0$. For a function of two variables to be continuous, it is required that f(x, y) approaches $f(x_0, y_0)$ no matter how (i.e., in which direction or along which curve) (x, y) approaches $(x_0, y_0)$.

The famous French mathematician Cauchy claimed in the early 19th century that if a function of two variables is continuous at a point in each variable separately then it is continuous at that point. He was mistaken, however. Consider the function $z(x, y) = \frac{xy}{x^2 + y^2}$. This function is defined everywhere except at (0, 0), since division by zero is undefined. We extend it to a function defined everywhere by defining z(0, 0) to be 0.

(a) Show that this function is continuous in each variable separately (i.e., z(x, 0) and z(0, y) are continuous as one-variable functions).

(b) Show, however, that the function is discontinuous at the origin by finding another way of approaching the origin so that the z-values do not approach the same value as above.

A trickier example still is $z(x, y) = 2xy^2/(x^2 + y^4)$.

(c) Show that if we approach the origin on any straight line, z approaches zero, but z has different limits

when the origin is being approached along the two parabolas $x = \pm y^2$.

(d) Illustrate both examples with plots of the functions.

§ 9.2. Tangent planes

§ 9.2.1. Lecture worksheet

The simplest surfaces are planes. They have equations of the form Ax + By + Cz = D, as is easy to imagine since they are the three-dimensional analog of lines in two dimensions, which can be written Ax + By = C.

9.2.1. How many points does it take to determine a plane? How does this square with the argument of problem 5.1.1?

The vector (A, B, C) is perpendicular to the plane; it is a normal vector, as it is called.

9.2.2. Is the following proof correct? If not, indicate its first erroneous step.

I want to investigate whether the vector (A, B, C) is always a normal vector (i.e., perpendicular) to the plane Ax + By + Cz = D. To do this I reason as follows.

[1] If (A, B, C) is perpendicular to the plane, then its scalar product with any vector in the plane should be zero.

[2] Let (X, Y, Z) be a vector in this plane. In other words, AX + BY + CZ = D.

[3] Then the scalar product is (A, B, C) · (X, Y, Z) = AX + BY + CZ = D.

[4] Since this is not zero the vector is not perpendicular to the plane.

9.2.3. Find the shortest distance from the point (1, 1, 1) to the plane 2x − y + 3z = 1. (Hint: Walk in the direction of the normal until you hit the plane.) The shortest distance is ____ times the length of the normal vector.

Tangents are important in calculus, and in three dimensions that means tangent planes. Here I have drawn a surface and some of its tangent planes and normals:

The tangent plane to the surface f(x, y) above the point $(x_0, y_0)$ in the xy-plane is

$$z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

9.2.4. Explain why.

9.2.5. Give an expression for the normal vector in terms of derivatives.

§ 9.3. Unconstrained optimisation

§ 9.3.1. Lecture worksheet

Finding maxima and minima is much the same for functions of two variables as for the one-variable case (§2.1). At a maximum or minimum point both partial derivatives must be zero, for the same reason as in the one-variable case. And again the second derivatives help us classify these extremum points. As before, the second derivatives $f_{xx}$ and $f_{yy}$ tell us whether the function makes a “happy” or “sad” shape in that direction. This suggests the following classification:

[Figure: three surface shapes. Both negative: max. Both positive: min. One of each: saddle.]

This is almost right, except for one thing. The fact that a surface is curving upwards in both axis directions does not mean that it is turning upwards in every direction: if it dips along the diagonal it is a saddle after all.

[Figure: a surface with both second derivatives positive that is still a saddle.]

Such a “tricky saddle” would throw us off if we looked only at $f_{xx}$ and $f_{yy}$. To catch it we must study the mixed second derivative $f_{xy}$, because this basically measures “how much different the function is along the diagonal than along the axes” (as is quite clear from the reasoning in problem 9.1.6).

So the final version of our classification is this. If $f_{xx} f_{yy} - (f_{xy})^2 > 0$ then the “diagonal effect” is not strong enough to throw off the simple classification based on the axis directions, so we get either a max or a min according to the signs of $f_{xx}$ and $f_{yy}$. If $f_{xx} f_{yy} - (f_{xy})^2 < 0$ we have a saddle one way or the other (either a simple saddle if $f_{xx} f_{yy}$ alone is negative, or a tricky one if it becomes negative only after the diagonal effect has been subtracted). Just as in the one-variable case it can happen that none of these classification rules apply. In such cases we are on our own and must seek another way of understanding what is going on.
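The final classification rule can be sketched as a small function. This is only an illustration: the name `classify` and the sample functions are assumptions chosen here, and the second derivatives are worked out by hand and passed in as numbers evaluated at the critical point.

```python
def classify(fxx, fyy, fxy):
    """Classify a critical point from its second derivatives there."""
    disc = fxx * fyy - fxy ** 2  # the "diagonal effect" is subtracted here
    if disc > 0:
        # fxx and fyy necessarily share a sign in this case.
        return "min" if fxx > 0 else "max"
    if disc < 0:
        return "saddle"
    return "inconclusive"  # none of the rules apply


# Hypothetical sample functions, each with a critical point at the origin:
print(classify(2, 2, 0))    # x^2 + y^2
print(classify(-2, -2, 0))  # -x^2 - y^2
print(classify(2, 2, 4))    # x^2 + y^2 + 4xy: both positive, yet a "tricky saddle"
print(classify(0, 0, 0))    # x^4 + y^4: the test says nothing
```

The third case shows why $f_{xx}$ and $f_{yy}$ alone are not enough: both are positive, but the mixed term outweighs them.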

9.3.1. Confirm that the function z = xy that I showed you in §9.1 has a saddle at the origin.

9.3.2. What type of extremum does the function have at the origin?

□ $f(x, y) = (x + y)^2$
□ $f(x, y) = \sqrt{x^2 + y^2}$
□ $f(x, y) = -x^2$
□ $f(x, y) = \ln(1/e^{x^2 + y^2})$
□ $f(x, y) = x^2 - y^2$
□ $f(x, y) = x^2 + y^2 + 3xy$

□ monkey saddle
□ tricky saddle
□ simple saddle
□ unique maximum
□ unique minimum
□ non-unique minimum
□ non-unique maximum

§ 9.3.2. Problems

9.3.3. Museums and theatres often offer discounts to senior citizens, typically out of profit interest rather than kindness. The reason is that seniors are more sensitive to price, so discounts have a greater impact on their purchasing decisions. Suppose a theatre can sell q senior tickets and Q full tickets at prices €p and €P respectively, according to the demand functions $q(p) = 100p^{-4}$ and $Q(P) = 1000P^{-2}$.

(a) Sketch the graphs of the demand functions and explain how they reflect the differences in sensitivity to price between the two consumer groups.

(b) The theatre has operating costs of €1 per visitor. How should the theatre set the prices for full and senior tickets so as to maximise their overall profit? (Express profit as a function of p and P.)

(c) Confirm that your answer is a maximum using second-order derivatives. Explain, however, why this method is rather overkill in this instance.

9.3.4. A product to be sold in the Dutch market can be manufactured in either the Netherlands or China. The cost of producing q items is $20q^2 - 60q + 100$ in one country and $10q^2 - 40q + 90$ in the other.

(a) Plot these two functions in the same coordinate system. Which function do you think corresponds to which country? Explain.

(b) The price at which each item can be sold is a function of the total quantity produced, p = 520 − 10q. What is the real-world reason for this?

(c) Find the production levels that yield maximum profit.

9.3.5. A manufacturer produces a quantity q of a product to be sold on two markets. The prices on each market depend on the quantity sold there according to the formulas

$$p_a = 57 - 5q_a \qquad p_b = 40 - 7q_b$$

(a) Interpret the difference between the two markets in real-world terms.

(b) How should production be divided over the two markets in order to maximise profit? Does this make sense in terms of your description of the markets?

§ 9.4. Gradients

§ 9.4.1. Lecture worksheet

The derivative $f_x$ says: if you go one step over in the x-direction, then the height will change by this much. And $f_y$ does the same for a step in the y-direction. What about a step in any other direction? How much would that change the height? Easy: just break the step into its x and y components and use the derivatives for each. This gives us the directional derivative $f_x \cos\theta + f_y \sin\theta$, where θ is the direction in question given as the angle it makes with the positive x-axis. This can be more elegantly expressed in vector form. If the direction is given as a unit vector s = (cos θ, sin θ) then the directional derivative is simply s · ∇f, where ∇f, the gradient of f(x, y), is the vector made up of its partial derivatives:

$$\nabla f = (f_x, f_y)$$

Besides its use for finding directional derivatives, the gradient vector also has an interesting meaning in itself: it points in the direction of steepest ascent of f.

9.4.1. Prove this in vector terms. Hint: which direction s makes the directional derivative s · ∇f the biggest?

We can get an intuitive feeling for why the gradient is the direction of steepest ascent by the following experiment. Grab something flat, such as the paper on which this text is printed, and hold it up horizontally in front of you. Now tilt its right end upwards just a little, and then its far end upwards quite a bit. If you stood in the middle of this plane, which way should you go if you want to go up as fast as possible? Surely you want to go mostly straight ahead because you gave it the most tilt that way. But you can do even better if you deviate a bit to the right, in order to utilise that slope as well.

9.4.2. Explain how this agrees with the gradient pointing in the direction of steepest ascent.
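The claim about steepest ascent can be probed numerically before proving it. The sketch below is a brute-force experiment, not a proof. It uses the earlier example $f(x, y) = x^2 y$ with $f_x = 2xy$ and $f_y = x^2$; the base point (1, 2), the grid of 3600 directions, and the function names are assumptions made for illustration.

```python
import math


def grad_f(x, y):
    """Gradient (f_x, f_y) of f(x, y) = x**2 * y."""
    return (2 * x * y, x ** 2)


def directional_derivative(x, y, theta):
    """f_x cos(theta) + f_y sin(theta), i.e. s . grad f with s = (cos theta, sin theta)."""
    fx, fy = grad_f(x, y)
    return fx * math.cos(theta) + fy * math.sin(theta)


x0, y0 = 1.0, 2.0  # arbitrary base point
step = 2 * math.pi / 3600
thetas = [n * step for n in range(3600)]

# Direction on the grid with the largest directional derivative.
best = max(thetas, key=lambda t: directional_derivative(x0, y0, t))

# Direction in which the gradient itself points.
fx, fy = grad_f(x0, y0)
grad_angle = math.atan2(fy, fx)

print(best, grad_angle)  # the two angles agree to within one grid step
```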

9.4.3. Go back and try out the gradient on the ice cream cone from §9.1. Interpret visually.

9.4.4. If z = f(x, y) is the roof of a building, in what direction will rain water flow?

Since ∇f tells us the direction of steepest ascent, it follows that −∇f is the direction of steepest descent, and that the directions perpendicular to ∇f correspond to no ascent at all, i.e., to going sideways while staying at the same height. (Visualise this on your tilted plane.) Another way of saying this is that the gradient is perpendicular to the contour curves (since the contour curves correspond to fixed height).

A clever application of these ideas is to the problem of finding the normal to a given curve. Let’s say I want to find the normal to for example the curve $x^2 y + 4y = 5$ at the point (1, 1). Then my first step is to think of this as a level curve of the function $f(x, y) = x^2 y + 4y$. At first this may seem like a very circuitous way of going about things; after all, the original problem was a nice and simple two-dimensional problem and here I am making rather a mess of it by imagining it to be a cross section of a surface situated in three-dimensional space. But sometimes generality is simplicity, and certainly so in this case. For now I know by the very simple arguments above that $\nabla f = (f_x, f_y) = (2xy, x^2 + 4)$ is perpendicular to the curve, so I immediately see that the normal at (1, 1) is (2, 5).

9.4.5. Find normal vectors for a few points on the curve $y = 3x^2 + x + 2$. Illustrate with a sketch.

All of these things also generalise to higher dimensions. In particular, we get for free the rather powerful result that the normal to a surface f(x, y, z) = c is given by the gradient $(f_x, f_y, f_z)$.

9.4.6. Go back to §9.2.1 and re-explain in this new light what was said about normal vectors there.

9.4.7. Show by an example that the gradient method is sometimes a more convenient way of finding a normal vector than the formula for normals that follows from the tangent plane equation (problem 9.2.5).

§ 9.4.2. Problems

9.4.8. What is the geometrical meaning of |∇f|? Hint: what happens if I take a unit step in this direction?

9.4.9. † Three cities are to be connected by roads. To minimise cost and environmental footprint, we want to minimise the total length of the roads. This is done by finding a point (x, y) between the three cities such that the sum of the distances to the cities, $D = D_1 + D_2 + D_3$, is as small as possible.

Once the coordinates of the cities are given the total distance is a function of x and y; for example, if the first city is at (a, b) then $D_1 = \sqrt{(x - a)^2 + (y - b)^2}$ and so on for the other cities. So the total function to be minimised is a sum of three such root expressions. We could minimise this formula the brute force way by the method of §9.3, but the calculations would not be pretty. Here is a more clever method.

(a) Find the gradient of $D_1$ as a function of x and y. Hint: This can be done without calculations (if you have solved problem 9.4.8).

(b) Do the same for $D_2$ and $D_3$ and draw the gradients in the figure.

(c) Explain why the sum of the gradients must be 0 at the optimum point.

(d) Argue that this implies that the angles between the gradients must be 120°, and that this fact is enough to find the optimum.

(e) ? The above method breaks down in certain exceptional cases. Explain.

§ 9.5. Constrained optimisation

§ 9.5.1. Lecture worksheet

The geometry of gradients also leads immediately to a simple way of solving constrained optimisation problems (the so-called Lagrange multiplier method). Suppose for example that we want to make f as big or as small as possible while being constrained to this dashed circle:

[Figure: contour curves f = 1, f = 2, f = 3 and a dashed constraint circle.]

Clearly the extrema will occur at the points where the constraint curve precisely touches one of f’s contour curves, as it does for f = 1 and f = 3. When the constraint curve cuts right through a contour, as it does for f = 2, there is one side with bigger values and one with smaller, so it can be neither a minimum nor a maximum. This idea is captured analytically as follows. The extremum points of f(x, y) subject to the constraint g(x, y) = c are found by solving the system of equations

$$f_x = \lambda g_x$$
$$f_y = \lambda g_y$$
$$g = c$$

The first two equations say that the gradient vectors of g and f differ only by a multiple, λ. Geometrically, this means that the normals of the constraint curve and a contour of f are parallel. This happens precisely where the curves touch each other, as shown in the picture.
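The parallel-gradient condition can be observed numerically. As a sketch under stated assumptions, take the saddle f(x, y) = xy with the unit circle $g(x, y) = x^2 + y^2 = 1$ as the constraint, walk around the circle on a grid of 3600 points, and inspect the gradients at the best grid point. The grid search stands in for solving the system; it is not the Lagrange method itself, and all names below are illustrative.

```python
import math


def f(x, y):
    return x * y


def grad_f(x, y):
    return (y, x)  # (f_x, f_y)


def grad_g(x, y):
    return (2 * x, 2 * y)  # (g_x, g_y) for g = x**2 + y**2


# Sample the constraint curve g = 1 (the unit circle).
n = 3600
points = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
          for k in range(n)]
best = max(points, key=lambda p: f(*p))

fx, fy = grad_f(*best)
gx, gy = grad_g(*best)

# Two plane vectors are parallel exactly when this expression vanishes.
parallel_defect = fx * gy - fy * gx
lam = fx / gx  # the multiplier in f_x = lambda * g_x

print(best, f(*best), parallel_defect, lam)
```

At the constrained maximum the defect is zero up to rounding, so the two gradients are parallel there, with a common multiplier λ.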

For example, let’s find the maximum values of the saddle f(x, y) = xy subject to the constraint $x^2 + y^2 = 1$. So the constraint curve is the unit circle. When I round it in the xy-plane, the corresponding points on the graph of f look like this:

So it’s a roller-coaster with two maxima and two minima. In the reference summary they are computed analytically.

9.5.1. Draw the contour picture for this example.

9.5.2. Solving the above system of equations gives us a list of candidate points that must contain all maxima and minima. But how to tell which is which? Explain why the second-derivative tests of §9.3 are of no use here. What to do instead?

9.5.3. If f(x, y) is the revenue as a function of money spent on television (x) and print (y) advertisements, what is the meaning of g(x, y) = c?

(a) A realistic expression for f(x, y) is [xy, sin(xy), 1/xy, 3x − y] and a realistic expression for g(x, y) is [$\sqrt{xy}$, $(x + y)^2$, x + y, 2x + y].

(b) If c = 20, the maximum profit is ____

(c) Solving for λ in the Lagrange equations suggests that λ has the real-world interpretation:

□ profit as fraction of spending
□ spending as fraction of profit
□ extra dollars earned per extra dollar spent
□ extra dollars spent per extra dollar earned
□ payoff of TV ads relative to print ads
□ payoff of print ads relative to TV ads
□ increase in cost of ads with increasing spending

(d) In our case, λ = ____

(e) Ad spending should be [increased, decreased].

§ 9.5.2. Problems

9.5.4. A rectangular building is to be designed to minimise heat loss. The walls lose heat at a rate of 2 units/m² per day, except the south wall, which, since it receives more sun, loses only half this much heat. The floor and the roof are better insulated and lose heat at a rate of 1 unit/m² per day. The volume of the building must be exactly 1000 m³. The dimensions that minimise heat loss are: height = ____; length of south-facing wall = ____; length of east-facing wall = ____.

9.5.5. $5x^2 + 6xy + 5y^2 = 8$ is a tilted ellipse:

Find the lengths of its semi-axes (= greatest and least distance to the origin) using Lagrange multipliers. Hint: Use $g(x, y) = 5x^2 + 6xy + 5y^2 = 8$ as the constraint in a suitable optimisation problem.

9.5.6. Economics: Cobb–Douglas model. Suppose the production P of a company depends on the available labour L and the capital investment K according to the formula $P = L^\alpha K^\beta$. Assume further that the production is scalable, i.e., if L and K both grow by a certain factor then P also grows by that same factor.

(a) What does this imply about the relation between α and β? This is best seen by means of the [chain rule, product rule, laws of exponents, laws of logarithms].

(b) If α = 0.75, then β = ____.

(c) Sketch a few contours of P in a coordinate system with L and K on the axes.

(d) Explain how the equation 2L + K = 1000 can be interpreted as a budget constraint.

(e) Include the constraint line in your figure. Visually, how can you find the maximum production?

(f) Find the maximum production using Lagrange multipliers and explain how this corresponds to the visual method.

§ 9.6. Multivariable chain rule

§ 9.6.1. Lecture worksheet

Imagine standing at a point $(x_0, y_0)$ in the xy-plane underneath a surface z(x, y). Suppose you take an infinitesimal vectorial step in some direction, (dx, dy). How much does the height z change? The answer is given by the so-called total differential:

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy$$

9.6.1. Explain why. Is it important that (dx, dy) is infinitesimal? Explain the precise meaning of the terms in the formula. Explain how the notation of this formula can be confusing and show how it can be expressed differently to avoid this problem.

The same reasoning leads to the chain rule for partial derivatives, which says: Given z(x, y) and x(t) and y(t) as functions of t:

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$

9.6.2. Explain why.

A more general version of the chain rule is: Given $z(x, y)$ and $x(u, v)$ and $y(u, v)$ as functions of u and v:

$$\frac{dz}{du} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$$

9.6.3. Explain why one cannot “cancel the ∂x’s” in this formula. Express the formula in a different notation, which avoids this temptation. Is this better?

9.6.4. Consider the parabolic “bowl” $z = x^2 + y^2$. Starting at the point (1, 1) in the xy-plane, how fast is the height z changing if we move radially (straight away from the origin) or circularly (remaining at the same radius from the origin)? Use polar coordinates and the chain rule to find out.

§ 9.7. Reference summary

§ 9.7.1. Functions of two variables

A function $f(x, y)$ of two variables specifies a height $z = f(x, y)$ above each point $(x, y)$ in the plane. Geometrically, it therefore defines a surface.

[Figure: the surface z = f(x, y) above the point (x, y).]

§ 9.7.2. Partial derivatives

$$f_x = \frac{\partial f}{\partial x}$$

= partial derivative of f with respect to x
= rate of change of f as one moves in the x direction

• Find the partial derivative of a function with respect to a given variable.

Differentiate as usual with respect to this variable, while treating all other variables as constants.

Find the partial derivatives of $f(x, y) = xy^2 + \cos(x)$.

$$f_x = y^2 - \sin(x), \qquad f_y = 2xy$$

Find the partial derivatives of $f(x, y) = x e^{xy}$.

$$f_x = e^{xy} + xy\,e^{xy}, \qquad f_y = x^2 e^{xy}$$

§ 9.7.3. Planes

Equation for plane:

$$Ax + By + Cz = D, \qquad \text{normal vector} = (A, B, C)$$

The point (10, 18, 3) is in a plane with normal vector (3, −2, 4). Find the equation of the plane.

Since (3, −2, 4) is a normal vector, the equation for the plane is $3x - 2y + 4z = D$ for some constant D. Plugging in the point (10, 18, 3), we find that $D = 3\cdot 10 - 2\cdot 18 + 4\cdot 3 = 30 - 36 + 12 = 6$. The equation of the plane is hence $3x - 2y + 4z = 6$.

Tangent plane to $f(x, y)$ above the point $(x_0, y_0)$:

$$z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

Find the tangent plane to $f(x, y) = x^2 + y^2 - 1$ above the point (1, 3).

$f_x = 2x$, $f_y = 2y$, so the tangent plane is

$$\begin{aligned}
z &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
  &= f(1, 3) + f_x(1, 3)(x - 1) + f_y(1, 3)(y - 3) \\
  &= 9 + 2(x - 1) + 6(y - 3) \\
  &= 2x + 6y - 11
\end{aligned}$$

§ 9.7.4. Unconstrained optimisation

Stationary points of $f(x, y)$ occur where the partial derivatives $f_x$ and $f_y$ are both zero. Classification of stationary points:

    f_xx f_yy − (f_xy)²     f_xx and/or f_yy     type of equilibrium
    +                       +                    minimum
    +                       −                    maximum
    −                                            saddle

Find and classify the stationary points of $f(x, y) = x^3 + y^2 - 3x - 4y + 2$.

$f_x = 3x^2 - 3 = 0 \Rightarrow x = \pm 1$ and $f_y = 2y - 4 = 0 \Rightarrow y = 2$. There are thus two stationary points: (1, 2) and (−1, 2). We calculate: $f_{xx} = 6x$, $f_{yy} = 2$, $f_{xy} = 0$. At (1, 2): $f_{xx} = 6 > 0$, $f_{yy} = 2 > 0$, $f_{xx} f_{yy} - f_{xy}^2 = 12 > 0$ ⇒ minimum. At (−1, 2): $f_{xx} = -6 < 0$, $f_{yy} = 2 > 0$, $f_{xx} f_{yy} - f_{xy}^2 = -12 < 0$ ⇒ saddle.
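The classification of the two stationary points of $f(x, y) = x^3 + y^2 - 3x - 4y + 2$ can be double-checked mechanically. The following is only a minimal sketch: the helper names `classify` and `second_derivatives` are illustrative choices, and the second derivatives are entered by hand from the worked example rather than computed symbolically.

```python
def classify(fxx, fyy, fxy):
    """Second-derivative test for a stationary point of f(x, y)."""
    disc = fxx * fyy - fxy ** 2
    if disc < 0:
        return "saddle"
    if disc > 0:
        # When disc > 0, f_xx and f_yy necessarily have the same sign.
        return "minimum" if fxx > 0 else "maximum"
    return "inconclusive"


def second_derivatives(x, y):
    """f_xx, f_yy, f_xy for f(x, y) = x^3 + y^2 - 3x - 4y + 2."""
    return 6.0 * x, 2.0, 0.0


results = {p: classify(*second_derivatives(*p)) for p in [(1, 2), (-1, 2)]}
print(results)
```

The test returns "inconclusive" when the discriminant is zero, a case the table above does not cover.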

Find and classify the stationary points of $g(x, y) = \frac{x^2 + y^2}{2} - \frac{1}{xy}$.

Stationary points occur where the partial derivatives are zero. This gives $g'_x(x, y) = x + \frac{1}{x^2 y} = 0$, which simplifies to $x^3 y = -1$, and $g'_y(x, y) = y + \frac{1}{x y^2} = 0$, which simplifies to $x y^3 = -1$. Solving for y in the first equation gives $y = -\frac{1}{x^3}$, which plugged into the second gives $x\left(-\frac{1}{x^3}\right)^3 = -1 \Leftrightarrow \frac{1}{x^8} = 1 \Leftrightarrow x^8 = 1 \Leftrightarrow x = \pm 1$. Combining this with $y = -\frac{1}{x^3}$, we see that the critical points are (1, −1) and (−1, 1). To classify these points we need the second derivatives: $g''_{xx} = 1 - \frac{2}{x^3 y}$, $g''_{xy} = -\frac{1}{x^2 y^2}$, $g''_{yy} = 1 - \frac{2}{x y^3}$. For (1, −1) we get $g''_{xx} = 1 - (-2) = 3$, $g''_{xy} = -1$, $g''_{yy} = 1 - (-2) = 3$. Thus $g''_{xx} > 0$ and $g''_{xx} g''_{yy} > (g''_{xy})^2$, so the point is a minimum. For (−1, 1) we get $g''_{xx} = 1 - (-2) = 3$, $g''_{xy} = -1$, $g''_{yy} = 1 - (-2) = 3$. These are the same values as before, so this point is also a minimum.

Find and classify the stationary points of $f(x, y) = xy\,e^{x-y}$.

For ease of writing, let $E = e^{x-y}$. Note that $E > 0$.
$f_x = yE + xyE = 0 \Rightarrow y + xy = 0 \Rightarrow y = 0$ or $x = -1$.
$f_y = xE - xyE = 0 \Rightarrow x - xy = 0 \Rightarrow x = 0$ or $y = 1$.
So the stationary points are (−1, 1) and (0, 0).
The second derivatives are $f_{xx} = 2yE + xyE$, $f_{yy} = -2xE + xyE$, and $f_{xy} = E - yE + xE - xyE$.
$(x, y) = (-1, 1) \Rightarrow f_{xx} = E > 0$, $f_{yy} = E > 0$, $f_{xy} = 0$, so minimum.
$(x, y) = (0, 0) \Rightarrow f_{xx} = 0$, $f_{yy} = 0$, $f_{xy} = E \Rightarrow f_{xx} f_{yy} - (f_{xy})^2 = -E^2 < 0$, so saddle.

Find and classify the stationary points of $f(x, y) = x^3 + y^3 + 6xy + 2$.

Stationary points occur where $\frac{\partial f(x,y)}{\partial x} = \frac{\partial f(x,y)}{\partial y} = 0$, which in our case becomes $3x^2 + 6y = 0$ and $3y^2 + 6x = 0$, or, simplifying, $x^2 + 2y = 0$ and $y^2 + 2x = 0$. Solving for y in the first equation gives $y = -\frac{1}{2}x^2$, which inserted in the second gives $(-\frac{1}{2}x^2)^2 + 2x = 0$, or $x^4 + 8x = 0$. This factors into $x(x^3 + 8) = 0$. Thus $x = 0$ or $x^3 = -8$, which means $x = -2$. When $x = 0$ we get $y = 0$ and when $x = -2$ we get $y = -2$. We thus have two stationary points: $P_1 = (0, 0)$ and $P_2 = (-2, -2)$. To classify them we calculate the second derivatives: $A = \frac{\partial^2 f(x,y)}{\partial x^2} = 6x$, $C = \frac{\partial^2 f(x,y)}{\partial y^2} = 6y$, $B = \frac{\partial^2 f(x,y)}{\partial x\,\partial y} = 6$. For $P_1$ we get $AC - B^2 = -36 < 0$, so it’s a saddle. For $P_2$ we get $AC - B^2 = 144 - 36 > 0$, while $A = -12 < 0$, so this is a maximum.

§ 9.7.5. Gradients

$\nabla f = (f_x, f_y)$
= direction of steepest ascent of f
= normal to level curve of f

$|\nabla f|$ = rate of steepest ascent of f

$f(x, y) = \frac{1}{x^2 + y^2}$. You are standing at (1, 1) in the xy-plane. Which way should you go to make f grow the fastest?

$f_x = \frac{-2x}{(x^2 + y^2)^2} \Rightarrow f_x(1, 1) = -\frac{1}{2}$ and $f_y = \frac{-2y}{(x^2 + y^2)^2} \Rightarrow f_y(1, 1) = -\frac{1}{2}$. The direction of fastest increase in f is $\nabla f = (f_x, f_y) = (-\frac{1}{2}, -\frac{1}{2})$ (in words: toward the origin).

Directional derivative in direction of unit vector $\hat{s}$:

$$\hat{s} \cdot \nabla f$$

Directional derivative in direction θ:

$$f_x \cos\theta + f_y \sin\theta$$

Find the rate of change of $f(x, y) = xy^3$ when moving away from the origin from the point (1, 1).

$f_x = y^3 \Rightarrow f_x(1, 1) = 1$ and $f_y = 3xy^2 \Rightarrow f_y(1, 1) = 3$. When standing at (1, 1), going away from the origin means going in the direction $\theta = \frac{\pi}{4}$. Therefore the directional derivative is $f_x \cos\theta + f_y \sin\theta = \cos\frac{\pi}{4} + 3\sin\frac{\pi}{4} = 2\sqrt{2}$.

Find the rate of change of $f(x, y) = xy^3$ when moving in the direction $w = (-2, 1)$ from the point (1, 1).

We know from the above example that $\nabla f = (1, 3)$ at this point. Moreover, w converted to a unit vector is $\hat{w} = w/|w| = (-\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}})$. Thus the directional derivative is $\hat{w} \cdot \nabla f = (-\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}) \cdot (1, 3) = 1/\sqrt{5}$.

• Find the normal to a given curve or surface at a given point.

Write in form $f(x, y) = 0$ (curve) or $f(x, y, z) = 0$ (surface). Compute the gradient of f and evaluate it at the given point. This is the normal.

Find a normal vector to the surface $z = x^2 + y^2$ at the point (1, 1, 2).

The surface is the level surface $f = 0$ of the function $f(x, y, z) = x^2 + y^2 - z$. A normal vector is therefore $\nabla f = (2x, 2y, -1)$, which evaluated at the point in question is (2, 2, −1).

§ 9.7.6. Constrained optimisation

(Lagrange multipliers.) To extremise $f(x, y)$ subject to the constraint $g(x, y) = c$, solve

$$\begin{aligned}
f_x &= \lambda g_x \\
f_y &= \lambda g_y \\
g &= c
\end{aligned}$$

[Figure: level curves f = 1, f = 2, f = 3.]

Any local maxima or minima will be among the points (x, y) that satisfy these equations. To determine which are max., min., or neither, plug them into $f(x, y)$ and compare their values. If the constraint curve has endpoints, these are also potential max. or min. If the constraint curve is infinite, also investigate the limit behaviour of the function there (e.g., there is no global max. or min. if the function grows or shrinks without bounds in such a direction).

Find the maximum value of $f(x, y) = xy$ subject to the constraint $x^2 + y^2 = 1$.

We need to solve the system

$$\begin{aligned}
y &= 2\lambda x \\
x &= 2\lambda y \\
x^2 + y^2 &= 1
\end{aligned}$$

Putting the first equation into the second gives $x = 2\lambda(2\lambda x)$. If we divide by x we get $4\lambda^2 = 1$, or $\lambda = \pm\frac{1}{2}$. Of course, whenever we divide by x we must take care that we are not dividing by zero. But in this case x cannot be zero, since if it is then so is y, which is impossible by the last equation. Putting $\lambda = \pm\frac{1}{2}$ back into the first two equations we find that $y = \pm x$. Combining this with the last equation we get $x^2 + (\pm x)^2 = 1$, or $x = \pm\frac{1}{\sqrt{2}}$. Since $y = \pm x$, all possible solutions are $(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, $(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$, $(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, $(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$. To see which are maxima and which are minima, we plug these back into the original function $f(x, y) = xy$. The largest value of f is thus $\frac{1}{2}$ and the smallest $-\frac{1}{2}$.

Find the maximum value of $f(x, y) = 2xy$ subject to the constraint $5x + 4y = 100$.

$g = c \Leftrightarrow 5x + 4y = 100$
$f_x = \lambda g_x \Leftrightarrow 2y = 5\lambda \Rightarrow \lambda = 2y/5$
$f_y = \lambda g_y \Leftrightarrow 2x = 4\lambda \Rightarrow \lambda = x/2$

Hence $2y/5 = x/2$, which substituted into the constraint gives $x = 10$ and $y = 12.5$. Our candidate for a maximum point is thus (10, 12.5), where $f = 250$. But since the constraint curve is an infinite line we must also investigate the limit values of f as we go toward infinity along the line in either direction. The constraint condition shows that if $x \to \infty$ then $y \to -\infty$ and if $x \to -\infty$ then $y \to \infty$. In either case, $f \to -\infty$. Hence our candidate maximum is indeed the biggest f ever gets.

Determine the maxima and minima of $f(x, y) = 5x^2 - 2y^2 + 10$ on the curve $x^2 + y^2 = 1$.

$g = c \Leftrightarrow x^2 + y^2 = 1$
$f_x = \lambda g_x \Leftrightarrow 10x = 2\lambda x \Rightarrow \lambda = 5$ or $x = 0$
$f_y = \lambda g_y \Leftrightarrow -4y = 2\lambda y \Rightarrow \lambda = -2$ or $y = 0$

So the stationary points are (±1, 0) and (0, ±1). $f(\pm 1, 0) = 15$ and $f(0, \pm 1) = 8$, so (±1, 0) are maxima and (0, ±1) minima.

§ 9.7.7. Multivariable chain rule

Total differential:

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy$$

Chain rule for partial derivatives when x and y are functions of t:

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$

Chain rule for partial derivatives when x and y are functions of u and v:

$$\frac{dz}{du} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$$
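The chain rule of §9.7.7 for $z(x(t), y(t))$ can be sanity-checked numerically. The sketch below is not taken from the text: the surface $z = x^2 y$ and the path $x = \cos t$, $y = \sin t$ are hypothetical choices made purely for illustration, and the comparison value is a central finite difference.

```python
import math


def z(x, y):
    # Hypothetical surface chosen for illustration only.
    return x * x * y


def x_of(t):
    return math.cos(t)


def y_of(t):
    return math.sin(t)


def chain_rule_rate(t):
    """dz/dt = z_x * dx/dt + z_y * dy/dt, with the derivatives written out by hand."""
    x, y = x_of(t), y_of(t)
    dz_dx = 2 * x * y
    dz_dy = x * x
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return dz_dx * dx_dt + dz_dy * dy_dt


def numeric_rate(t, h=1e-6):
    """Central-difference estimate of d/dt of z(x(t), y(t))."""
    ahead = z(x_of(t + h), y_of(t + h))
    behind = z(x_of(t - h), y_of(t - h))
    return (ahead - behind) / (2 * h)


t0 = 0.7
print(chain_rule_rate(t0), numeric_rate(t0))
```

The two printed numbers should agree to several decimal places; a mismatch would point to a slip in one of the hand-computed partial derivatives.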

10 MULTIVARIABLE INTEGRAL CALCULUS

§ 10.1. Multiple integrals

§ 10.1.1. Lecture worksheet

We know that $\int y(x)\,dx$ is an area made up of thin rectangles with base dx and height y(x). Likewise, $\iint f(x, y)\,dx\,dy$ is a volume made up of thin rectangular blocks with base area dx dy and height f(x, y). This is called a double integral. To evaluate it we integrate twice: once with respect to x and once with respect to y. Geometrically, the first integration gives an expression for the cross-sectional areas of the shape, and the second integration finds the volume by endowing each of these areas with an infinitesimal thickness.

10.1.1. Consider for example the tetrahedron with vertices (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1).

(a) Argue that the “roof” of this tetrahedron is given by $z = 1 - x - y$.

Suppose I intersect the tetrahedron with a plane perpendicular to the y-axis.

(b) Argue that the cross-sectional area is $\int_0^{1-y} (1 - x - y)\,dx$.

(c) Argue that the volume of the tetrahedron is $\int_0^1 \left( \int_0^{1-y} (1 - x - y)\,dx \right) dy$.

(d) Find the volume by evaluating first the inner then the outer of these integrals.

In general terms, to compute the double integral $\iint_R f(x, y)\,dx\,dy$ over some region R, we write it as an iterated integral

$$\int_a^b \int_{x_0(y)}^{x_1(y)} f(x, y)\,dx\,dy$$

where a and b are the y-values between which the region R is contained, and $x_0(y)$ and $x_1(y)$ are the x-values between which any given cross-section of R perpendicular to the y-axis is contained. Alternatively we can invert the roles of x and y to instead express the integral as

$$\int_c^d \int_{y_0(x)}^{y_1(x)} f(x, y)\,dy\,dx$$

10.1.2. Find, in both ways, $\iint_R y\,dx\,dy$ where R is the region between $y = x$ and $y = x^2$.

Depending on the shape of R, it may be much easier to specify the bounds $x_0(y)$, $x_1(y)$ than $y_0(x)$, $y_1(x)$, or vice versa.

10.1.3. Think of an example to illustrate this.

§ 10.2. Polar coordinates

§ 10.2.1. Lecture worksheet

Polar coordinates (§7.1) are useful when evaluating double integrals since many regions of a circular or radial nature are much more naturally and easily described in polar than rectangular coordinates. When an integral in ordinary rectangular coordinates x, y is rewritten in polar coordinates r, θ it becomes

$$\iint_R f(x, y)\,dx\,dy = \int_\alpha^\beta \int_{r_0(\theta)}^{r_1(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$$

where α, β are the angular (θ) bounds of R, and $r_0(\theta)$, $r_1(\theta)$ are the radial (r) bounds for R for any given θ. The manner in which x and y have been translated we recognise from before (§7.5.1). It remains to explain where the extra r comes from in the area element expression $r\,dr\,d\theta$. We can understand this as follows. In an integral in rectangular coordinates, if we let x increase by increments dx and y by increments dy, this generates a grid of identical rectangles. But if we let r increase by increments dr and θ by increments dθ, this generates a different kind of grid in which the cells have different sizes.

[Figure: polar grid generated by increments dr and dθ.]

10.2.1. Argue that each cell can be considered a rectangle with sides dr and r dθ.

[Figure: a single grid cell with sides dr and r dθ.]

10.2.2. Find the area generated by one full revolution of the Archimedean spiral $r = \theta$ (shown in §7.1.1).
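The role of the extra factor r in the polar area element can be seen numerically. The sketch below is an illustration only: the function name `polar_integral`, the grid sizes, and the test region (a disc of radius 2, whose area should come out as $4\pi$) are assumptions introduced here, and the sum is a plain midpoint rule rather than anything from the text.

```python
import math


def polar_integral(f, r_max, n_r=400, n_theta=400):
    """Midpoint-rule estimate of the integral of f(x, y) over the disc of radius r_max.

    Each grid cell contributes f * r * dr * dtheta, i.e. the cell is treated
    as a small rectangle with sides dr and r * dtheta.
    """
    dr = r_max / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            total += f(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total


# Integrating the constant 1 should reproduce the area of the disc.
area = polar_integral(lambda x, y: 1.0, 2.0)
print(area, math.pi * 2.0 ** 2)
```

Dropping the factor `r` from the summand would instead give $2\pi \cdot 2 = 4\pi$ only by coincidence of this radius; for other radii the result would no longer match the area, which is one way to convince oneself that the factor is needed.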

§ 10.2.2. Problems

10.2.3. Consider the integral $I = \int_0^\infty e^{-x^2}\,dx$. The function $e^{-x^2}$ cannot be integrated in closed form using any of our previous integration techniques, so we cannot evaluate I by direct integration. Nevertheless we can evaluate this integral by an ingenious use of multiple integration, as we shall now see.

(a) Argue that

$$I^2 = \left( \int_0^\infty e^{-x^2}\,dx \right) \left( \int_0^\infty e^{-y^2}\,dy \right)$$

(b) Argue that this can be rewritten as

$$\int_0^\infty \int_0^\infty e^{-(x^2 + y^2)}\,dx\,dy$$

(c) Evaluate this integral using polar coordinates.

(d) What is I?

§ 10.3. Cylindrical coordinates

§ 10.3.1. Lecture worksheet

If we are integrating over a cylindrical region the bounds are complicated to express in rectangular coordinates x, y, z. It is better to use cylindrical coordinates, which means polar coordinates in the base xy-plane and the ordinary z-coordinate for the height. Similar to the polar coordinate case, the integral then becomes

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iiint_R f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz$$

[Figure: cylindrical coordinates (r, θ, z) and the volume element with sides dr, r dθ and dz.]

10.3.1. Explain this on the basis of the figure.

10.3.2. Something is stored in a cylindrical silo. It compresses under weight in such a way that the density of each layer is proportional to the height above it to the top of the stack. Find an expression for the net mass as a function of the height of the stack.

Cylindrical coordinates are useful not only for cylinders but often for any shape with an axis of symmetry.

10.3.3. Solve problem 10.3.2 when the silo has the shape of the cone $z = x^2 + y^2$.

§ 10.4. Spherical coordinates

§ 10.4.1. Lecture worksheet

Spherical coordinates characterise points in 3-dimensional space by means of one radial coordinate and two angular coordinates (analogous to longitude and latitude). Translating an integral into spherical coordinates gives:

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iiint_R f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta$$

[Figure: spherical coordinates (ρ, φ, θ) and the volume element with sides dρ, ρ dφ and ρ sin φ dθ.]

10.4.1. Explain this on the basis of the figure. Hint: To verify the lengths of the arcs, ask yourself what the radius is of the circle of which it is part.

10.4.2. Find the volume of a sphere as a triple integral in spherical coordinates.

An important application of spherical coordinates is to the gravitational theory of spherical bodies.

10.4.3. Suppose a mass M is uniformly distributed in the form of a thin spherical shell of radius R and thickness dr centered at the origin. We shall show that the gravitational force exerted by this shell on a particle of mass m located some distance a > R away at the point (a, 0, 0) is the same as that of a point-mass M located at the origin.

(a) What are the implications of this result for computing the gravitational influences of celestial bodies?

(b) What is the volume of the shell? Hint: surface area × thickness.

(c) Therefore, what is its density?

Consider an infinitesimal piece of the shell (in the manner of the above figure).

(d) What is the mass of this piece?

(e) Find the gravitational force it exerts on the mass m in terms of the distance s between them and the angle α the line connecting them makes with the x-axis.

(f) Write down a triple integral expression for the net gravitational force of the whole shell.
(g) Argue that two of the integrals can be evaluated without knowing s. Do so.

Hint: Use the law of cosines to find the relation between s and the current variable. Also use trigonometry to rewrite the integrand purely in terms of s.

(h) For the remaining integral, make a change of variables to rewrite the integral as an integral in s.

(i) Evaluate the integral.

(j) Conclude.

§ 10.4.2. Problems

10.4.4. Solve problem 10.4.3 in the case where the mass m is located inside the shell.

(This result was used in problem 6.5.4.)

§ 10.5. Surface area

§ 10.5.1. Lecture worksheet

10.5.1. Show that the area of the surface $z = f(x, y)$ above an infinitesimal rectangle of sides dx, dy is

$$\sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$$

Hint: one way of doing this is to consider the area as the parallelogram spanned by two vectors and computing it using a vector product.

10.5.2. Find the surface area of a sphere using this method.

§ 10.6. Reference summary

§ 10.6.1. Polar coordinates

[Figure: the point (r, θ) at distance r from the origin and angle θ.]

Coordinate transformation formulas:

$$x = r\cos\theta, \qquad y = r\sin\theta$$

Area scaling factor:

$$dx\,dy = r\,dr\,d\theta$$

Integral transformation formula:

$$\iint_R f(x, y)\,dx\,dy = \int_\alpha^\beta \int_{r_0(\theta)}^{r_1(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$$

where α, β are the angular (θ) bounds of R, and $r_0(\theta)$, $r_1(\theta)$ are the radial (r) bounds for R for any given θ.

§ 10.6.2. Cylindrical coordinates

[Figure: the point (r, θ, z) in cylindrical coordinates.]

Coordinate transformation formulas:

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z$$

Volume scaling factor:

$$dx\,dy\,dz = r\,dr\,d\theta\,dz$$

Integral transformation formula:

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iiint_R f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz$$

§ 10.6.3. Spherical coordinates

[Figure: the point (ρ, φ, θ) in spherical coordinates.]

Coordinate transformation formulas:

$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi$$

Volume scaling factor:

$$dx\,dy\,dz = \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta$$

Integral transformation formula:

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iiint_R f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta$$
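The spherical coordinate transformation formulas of §10.6.3 translate directly into code. The following is a minimal sketch under the convention stated there (φ measured from the z-axis, θ in the xy-plane); the function names are illustrative, and the inverse conversion assumes a point other than the origin.

```python
import math


def spherical_to_cartesian(rho, phi, theta):
    """(rho, phi, theta) -> (x, y, z), with phi measured from the z-axis."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z


def cartesian_to_spherical(x, y, z):
    """(x, y, z) -> (rho, phi, theta); assumes the point is not the origin."""
    rho = math.sqrt(x * x + y * y + z * z)
    phi = math.acos(z / rho)    # angle from the z-axis, between 0 and pi
    theta = math.atan2(y, x)    # angle in the xy-plane, between -pi and pi
    return rho, phi, theta


point = spherical_to_cartesian(2.0, math.pi / 2, math.pi / 2)
back = cartesian_to_spherical(1.0, 2.0, 2.0)
print(point)
print(back)
```

Using `atan2` rather than a plain arctangent of y/x keeps the angle θ in the correct quadrant and avoids division by zero when x = 0.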

Find the cartesian coordinates (x, y, z) of the points (r, θ, φ) = (1, 0, 0) and (r, θ, φ) = (2, π/2, π/2).

(0, 0, 1), (0, 2, 0)

Find the spherical polar coordinates (r, θ, φ) of the points (x, y, z) = (0, 1, 0) and (x, y, z) = (1, 2, 2).

(r, θ, φ) = (1, π/2, π/2), (r, θ, φ) = (3, arccos(2/3), arctan(2))

(In these two examples θ is the angle measured from the z-axis and φ the angle in the xy-plane, so the roles of θ and φ are swapped relative to the formulas of §10.6.3.)

§ 10.6.4. Surface area

Area of the surface $z = f(x, y)$ above the region R:

$$\iint_R \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$$

§ 10.6.5. Problem guide

• Evaluate a double integral $\iint_R f(x, y)\,dx\,dy$.

First decide whether the region of integration R is most easily described in terms of rectangular coordinates x, y or polar coordinates r, θ. If you choose polar coordinates, translate the given function f(x, y) into r, θ and multiply it with the area scaling factor (r), as shown above.

Next write the integral as an iterated integral. We seek to choose the order of integration in such a way that the bounds are most easily expressed. We will evaluate the integrals from the inside out, so the inner differential and the inner bounds of integration correspond to the first integration. But when writing down the bounds it is easier to work from the outside in. For the outer integral, the bounds should be numbers (constants) expressing the bounds between which the region is contained as far as the outer variable is concerned. When specifying the inner bounds of integration you may use the outer variable in your expressions for these bounds; this is necessary whenever the bounds of the region with respect to the inner variable are different for different values of the outer variable. If expressing the inner bounds becomes very complicated, this suggests that we should try the other ordering of the variables.

When the integrals have been written down completely, evaluate them one at a time, going from the inside out.

$\iint_D \frac{1}{1 + x^2}\,dx\,dy$ where D is given by $0 \le y \le x \le 1$.

$$= \int_0^1 \left( \int_0^x \frac{1}{1 + x^2}\,dy \right) dx = \int_0^1 \frac{x}{1 + x^2}\,dx = \left[ \tfrac{1}{2}\ln(1 + x^2) \right]_0^1 = \tfrac{1}{2}(\ln 2 - \ln 1) = \frac{\ln 2}{2}$$

$\iint_D e^x\,dx\,dy$, where D is the triangular region with vertices (0, 0), (1, 1), (1, 2).

The sides of the triangle are on the lines $y = x$, $y = 2x$, $x = 1$. Hence D corresponds to $x \le y \le 2x$, $0 \le x \le 1$. Thus

$$\iint_D e^x\,dx\,dy = \int_0^1 \left( \int_x^{2x} e^x\,dy \right) dx = \int_0^1 \left[ e^x y \right]_{y=x}^{y=2x} dx = \int_0^1 x e^x\,dx = \left[ x e^x \right]_0^1 - \int_0^1 e^x\,dx = e - \left[ e^x \right]_0^1 = e - e + 1 = 1.$$

$\iint_D x^2 y\,dx\,dy$ where $D = \{(x, y) : x^2 + y^2 \le 4,\ x, y \ge 0\}$.

In polar coordinates D is given by $0 \le \theta \le \frac{\pi}{2}$ and $0 \le r \le 2$. Thus

$$\iint_D x^2 y\,dx\,dy = \int_0^{\pi/2} \int_0^2 (r\cos\theta)^2 (r\sin\theta)\,r\,dr\,d\theta = \int_0^{\pi/2} \cos^2\theta \sin\theta\,d\theta \int_0^2 r^4\,dr = \left[ -\tfrac{1}{3}\cos^3\theta \right]_0^{\pi/2} \left[ \tfrac{1}{5} r^5 \right]_0^2 = (-\tfrac{1}{3})(0 - 1) \cdot \tfrac{1}{5}(32 - 0) = \frac{32}{15}.$$

$\iint_D e^{-(x^2 + y^2)}\,dx\,dy$, where $D = \{(x, y) : 0 \le y \le x,\ x^2 + y^2 \le 3\}$.

In polar coordinates D corresponds to $E = \{(r, \theta) : 0 \le r \le \sqrt{3},\ 0 \le \theta \le \pi/4\}$. Hence

$$\iint_D e^{-(x^2 + y^2)}\,dx\,dy = \iint_E r e^{-r^2}\,dr\,d\theta = \int_0^{\sqrt{3}} \left( \int_0^{\pi/4} r e^{-r^2}\,d\theta \right) dr = \frac{\pi}{4} \int_0^{\sqrt{3}} r e^{-r^2}\,dr = [\,r^2 = t,\ 2r\,dr = dt\,] = \frac{\pi}{8} \int_0^3 e^{-t}\,dt = \frac{\pi}{8} \left[ -e^{-t} \right]_0^3 = \frac{\pi}{8}(1 - e^{-3}).$$

$\iint_D x y^2\,dx\,dy$, where $D = \{(x, y) : x^2 + y^2 \le 4,\ x \ge 0\}$.

In polar coordinates the region corresponds to $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$ and $0 \le r \le 2$. Thus:

$$\iint_D x y^2\,dx\,dy = \int_{-\pi/2}^{\pi/2} \int_0^2 (r\cos\theta)(r\sin\theta)^2\,r\,dr\,d\theta = \int_{-\pi/2}^{\pi/2} \sin^2\theta \cos\theta\,d\theta \int_0^2 r^4\,dr = \left[ \tfrac{1}{3}\sin^3\theta \right]_{-\pi/2}^{\pi/2} \left[ \tfrac{1}{5} r^5 \right]_0^2 = \tfrac{1}{3}(1 - (-1)) \cdot \tfrac{1}{5}(32 - 0) = \frac{64}{15}.$$

$\iint_D \frac{x}{(x^2 + y^2)^2}\,dx\,dy$, where $D = \{(x, y) : 0 \le y \le x,\ 1 \le x^2 + y^2 \le 2\}$.

In polar coordinates D corresponds to $E = \{(r, \theta) : 1 \le r \le \sqrt{2},\ 0 \le \theta \le \pi/4\}$. Thus

$$\iint_D \frac{x}{(x^2 + y^2)^2}\,dx\,dy = \iint_E \frac{r^2 \cos\theta}{r^4}\,dr\,d\theta = \int_1^{\sqrt{2}} r^{-2}\,dr \int_0^{\pi/4} \cos\theta\,d\theta = \left[ -\frac{1}{r} \right]_1^{\sqrt{2}} \left[ \sin\theta \right]_0^{\pi/4} = \left( -\frac{1}{\sqrt{2}} + 1 \right) \cdot \frac{1}{\sqrt{2}} = \frac{1}{\sqrt{2}} - \frac{1}{2}.$$

• Evaluate a triple integral $\iiint_R f(x, y, z)\,dx\,dy\,dz$.

First decide whether the region of integration R is most easily described in terms of rectangular coordinates x, y, z, cylindrical coordinates r, θ, z (for regions with a symmetry axis, which we make the z-axis), or spherical coordinates ρ, φ, θ. If you choose non-rectangular coordinates, translate the given function f(x, y, z) into your chosen coordinates (using standard formulas x = · · · , y = · · · , z = · · · ), and multiply it with the volume scaling factor, as shown above.

Next write the integral as an iterated integral. We seek to choose the order of integration in such a way that the bounds are most easily expressed. We will evaluate the three integrals from the inside out, so the innermost differential and

the innermost bounds of integration correspond to the first integration. But when writing down the bounds it is easier to work from the outside in. For the outermost integral, the bounds should be numbers (constants) expressing the bounds between which the region is contained as far as the outermost variable is concerned. When specifying the next bounds of integration you may use the outermost variable in your expressions for these bounds; this is necessary whenever the bounds of the region with respect to the current variable are different for different values of the outermost variable. Similarly, expressions for the last bounds may contain both of the two outer variables. If expressing the inner bounds becomes very complicated, this suggests that we should try another ordering of the variables.

When the integrals have been written down completely, evaluate them one at a time, going from the inside out.

$\iiint_D f(x, y, z)\,dx\,dy\,dz$ with $f(x, y, z) = y^2 z^3$, where D is a cylinder symmetric about the z-axis, with radius a and cut off at $z = 0$ and $z = 1$.

Writing ρ, φ for the cylindrical coordinates r, θ:

$$= \int_{z=0}^{1} \int_{\phi=0}^{2\pi} \int_{\rho=0}^{a} (\rho^2 \sin^2\phi\, z^3)\,\rho\,d\rho\,d\phi\,dz = \int_{z=0}^{1} z^3\,dz \cdot \int_{\phi=0}^{2\pi} \sin^2\phi\,d\phi \cdot \int_{\rho=0}^{a} \rho^3\,d\rho = \frac{1}{4} \cdot \pi \cdot \frac{a^4}{4} = \frac{\pi a^4}{16}$$

$\iiint_D \sqrt{x^2 + y^2 + z^2}\,dx\,dy\,dz$, where D is the region $x^2 + y^2 + (z - 1)^2 \le 1$.

In spherical coordinates (here with θ measured from the z-axis and φ the angle around it), $x^2 + y^2 + (z - 1)^2 \le 1 \Leftrightarrow x^2 + y^2 + z^2 \le 2z \Leftrightarrow \rho^2 \le 2\rho\cos\theta \Leftrightarrow \rho \le 2\cos\theta$, so D corresponds to $0 \le \theta \le \pi/2$, $0 \le \phi \le 2\pi$, $0 \le \rho \le 2\cos\theta$. Over this region,

$$\iiint \rho \cdot \rho^2 \sin\theta\,d\rho\,d\theta\,d\phi = 2\pi \int_0^{\pi/2} \left( \int_0^{2\cos\theta} \rho^3\,d\rho \right) \sin\theta\,d\theta = 2\pi \int_0^{\pi/2} \left[ \tfrac{1}{4}\rho^4 \right]_{\rho=0}^{\rho=2\cos\theta} \sin\theta\,d\theta = 8\pi \int_0^{\pi/2} \cos^4\theta \sin\theta\,d\theta = 8\pi \left[ -\tfrac{1}{5}\cos^5\theta \right]_0^{\pi/2} = \frac{8}{5}\pi.$$

$\iiint_K (1 + z^2)\,dx\,dy\,dz$, where $K = \{(x, y, z) : x^2 + y^2 \le \cos z,\ -\frac{\pi}{2} \le z \le \frac{\pi}{2}\}$.

$$\begin{aligned}
&= \int_{-\pi/2}^{\pi/2} (1 + z^2) \left( \iint_{x^2 + y^2 \le \cos z} dx\,dy \right) dz = \int_{-\pi/2}^{\pi/2} (1 + z^2)\,\pi \cos z\,dz \\
&= \pi \left[ (1 + z^2)\sin z \right]_{-\pi/2}^{\pi/2} - \pi \int_{-\pi/2}^{\pi/2} 2z \sin z\,dz \\
&= \pi \left( 2 + \tfrac{1}{2}\pi^2 \right) - \pi \left[ 2z(-\cos z) \right]_{-\pi/2}^{\pi/2} + \pi \int_{-\pi/2}^{\pi/2} 2(-\cos z)\,dz \\
&= \pi \left( 2 + \tfrac{1}{2}\pi^2 \right) - 2\pi \left[ \sin z \right]_{-\pi/2}^{\pi/2} = \tfrac{1}{2}\pi^3 - 2\pi.
\end{aligned}$$
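The last worked example reduces the triple integral over K to the single integral $\int_{-\pi/2}^{\pi/2} (1 + z^2)\,\pi \cos z\,dz$, which makes it easy to cross-check the closed form $\frac{1}{2}\pi^3 - 2\pi$ numerically. The sketch below uses composite Simpson's rule; the helper name `simpson` and the number of subintervals are choices made here for illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def cross_section_integrand(z):
    # (1 + z^2) times the area pi * cos(z) of the disc x^2 + y^2 <= cos(z).
    return (1 + z * z) * math.pi * math.cos(z)


numeric = simpson(cross_section_integrand, -math.pi / 2, math.pi / 2)
closed_form = math.pi ** 3 / 2 - 2 * math.pi
print(numeric, closed_form)
```

The same helper can be pointed at the inner or outer integral of any of the other worked examples once the bounds have been written down.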

11 VECTOR CALCULUS

§ 11.1. Vector fields

§ 11.1.1. Lecture worksheet

The gravitational vector field of a heavy object, such as the sun, located at the origin looks like this:

[Figure: three-dimensional vector field pointing toward the origin.]

A two-dimensional cross-section might be clearer on paper:

[Figure: two-dimensional cross-section of the same field.]

At each point a vector shows how an object placed there would be pulled by gravity. The gravitational pull is stronger closer to the origin. Electrostatic forces work much the same way: the same picture shows how a negative charge at the origin would attract a positively charged particle.

It is useful to think of situations like these in terms of imaginary fluid flows, even though no actual fluids are involved. We can imagine a large pool of water, into which someone has stuck a pipe that ends right in the middle of the pool. If water is sucked out through the pipe then this will cause the water in the pool to flow in the manner of the picture above. (This is very much an idealisation of course; in particular we are ignoring gravity and consider only the internal pressure of the water.)

11.1.1. One very nice aspect of this analogy is that it makes it obvious “why” the force of gravity (and electrostatic attraction) diminishes as the inverse square of the distance. Explain how. (Hint: Imagine how the water would need to move to fill a hole pumped out in the middle. It may help to consider a conical section of the water.)

11.1.2. Find an explicit formula for the vector field as follows.

(a) Argue that −(x, y, z) is a vector that points from a given point (x, y, z) to the origin.

(b) Find a unit vector that points from (x, y, z) to the origin.

(c) Find a formula for the gravitational vector field F.

The origin in our example is called a sink. If water was being pumped in instead that would be a source. If we stick several pipes in our pool to make several sinks and sources then a more complicated fluid flow will arise. In fact it will be the superposition of the individual flows: that is, the net vector field is obtained by vector addition of the individual vector fields of each source and sink considered separately. Or at least an idealised fluid with this property is easy to imagine, and it is evidently what is required to correctly represent for example the combined gravitational field of the sun and Jupiter. Actual water, as it happens, is not quite so simple (because it is compressible and has a kind of internal friction), but that need not concern us since this whole business is hypothetical anyway; all this talk of fluids is meant only as a conceptual tool and a useful aid to our imagination.

The same kind of reasoning also works if we restrict ourselves to a plane. We can then imagine a thin layer of fluid, say trapped between two glass planes.

11.1.3. Adapt the argument of problem 11.1.1 to show that forces diminish linearly in this case.

The vector fields of a source and a sink in this case look like this:

[Figure: planar vector fields of a source and of a sink.]

If I pump water in at one point and suck it out at another I can find the net effect by superimposing these two pictures. It looks like this:

lines, as planets and projectiles do in the gravitational force
field. The velocity of the imaginary fluid is the acceleration (or
force, which in effect comes to the same things since F = ma)
of the particle in the force field. So the fluid analogy is just a
way of conceptualising forces, not an actual flow in which real-
world objects can ride along like boats.
Note also that, as again highlighted by this last problem, the
flow of our imaginary fluid is determined solely by its mov-
ing in the direction of lowest pressure; it does not accumulate
momentum, which would interfere with this defining property.
We can imagine our fluid as composed of a myriad little parti-
cles moving about chaotically and bouncing into each other all
over the place. There will then be a net tendency for the fluid
to move towards areas of lower pressure (since there are fewer
particles to bump into in that direction), but at the same time
11.1.4. Sketch visually how the net field is obtained by adding up
there will never be a single, coordinated mass moving in any
of the previous two.
one direction, so the issue of momentum does not arise.
11.1.5. Give an explicit formula for F in this case.
This fluid flow scenario models the way electric current be-
§ 11.2. Divergence
haves if we connect the two poles of a battery to two distinct
points of a conducting sheet of metal. The analogy between
electricity and fluid flow is so powerful that thinking in terms § 11.2.1. Lecture worksheet
of an “electric fluid” is often the most intuitive way of under-
standing electric phenomena.

In the combined field I also traced (in light gray) the “flow curves” that follow the arrows.

11.1.6. Prove that the flow curves are in fact (parts of) circles, as follows. (We assume, of course, that the source and the sink are of equal magnitude.)

[Figure: the source (+) and the sink (–), with the force vectors at a point of a flow curve and the striped angles referred to below.]

(a) Prove that the double-striped angles are equal and that their triangles are similar. Hint: the “1/r” force law is the key to similarity.

(b) Infer that the single-striped angles are equal.

(c) Infer that the flow curve is a circle. Hint: draw the midpoint of the circle and the relevant radii, and consider the relations among the base angles of the isosceles triangles that arise to prove that F is tangent to the circle.

This last problem gives us occasion to reflect further on the relationship between a force field and its fluid flow analog. The problem shows that a small piece of paper dropped into our imaginary fluid will flow along such a circular arc, not that a particle in the corresponding force field will move in this way: the particle’s momentum will cause it to deviate from the flow

When thinking of a vector field F as a fluid flow, a fundamental question is how much fluid is being generated at a given point. This is called the divergence of the field.

The divergence can easily be found in terms of the derivatives of F = (P, Q, R) as follows. Consider an infinitesimal cube, and consider first the two walls of the cube that are pierced by the x-direction. The divergence in this direction is the difference between how much is flowing in through the left wall and how much is flowing out through the right wall. The difference in flow intensity is how much P changes in between, so $\frac{\partial P}{\partial x}\,dx$. This flow intensity acts across the area dy dz of the wall, so the total excess flux generated inside the cube in this direction is $\frac{\partial P}{\partial x}\,dx\,dy\,dz$. And the same in the other directions. So the flux generated per unit volume is

$$\operatorname{div} F = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}$$

Another notation for this is ∇·F, where ∇ (“nabla”) is the formal vector $\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$.

11.2.1. Sketch the fields A = (x, 0, 0) and B = (0, x, 0). Compute their divergence. Interpret in terms of fluid flow.

For any region in space, the net flow out of it is the amount of fluid generated inside it. This evident fact is expressed in the so-called Divergence Theorem or Gauss’s Theorem:

$$\iiint \operatorname{div} F \, dV = \iint (F \cdot n)\, dS$$

The left hand side is the integral over a region in space with volume element dV, so it expresses the amount of fluid generated inside it. The right hand side is the integral across the boundary of this region with normal n and surface element dS, so it measures how much of the flow is going out of the region (i.e., in the direction of the normal).
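As a rough numerical sanity check of the Divergence Theorem, the following sketch compares its two sides on the unit cube. The sample field F = (x², xy, z), the midpoint-rule approximation and all function names are illustrative assumptions, not taken from the text.

```python
# Numerical sketch: compare the volume integral of div F with the outward
# flux of F through the boundary of the unit cube [0,1]^3.
# The sample field and all names here are illustrative assumptions.

def F(x, y, z):
    return (x * x, x * y, z)

def div_F(x, y, z, h=1e-5):
    # Central-difference approximation of dP/dx + dQ/dy + dR/dz.
    dP = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dQ = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dR = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dP + dQ + dR

def volume_integral(n=20):
    # Midpoint rule over n^3 small cubes of volume d^3.
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    return sum(div_F(x, y, z) for x in mids for y in mids for z in mids) * d**3

def boundary_flux(n=20):
    # Midpoint rule over the six faces; each face contributes F . n dS.
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    total = 0.0
    for u in mids:
        for v in mids:
            total += F(1, u, v)[0] - F(0, u, v)[0]   # faces x = 1 and x = 0
            total += F(u, 1, v)[1] - F(u, 0, v)[1]   # faces y = 1 and y = 0
            total += F(u, v, 1)[2] - F(u, v, 0)[2]   # faces z = 1 and z = 0
    return total * d**2

lhs = volume_integral()
rhs = boundary_flux()
print(lhs, rhs)
```

For this field div F = 3x + 1, so both numbers should come out close to 5/2: the fluid generated inside the cube matches the net flow through its walls.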

§ 11.3. Line integrals

§ 11.3.1. Lecture worksheet

Given a force field F, we recall from §4.4 that we can find the work it performs on an object being made to traverse a path in it by the integral $\int F \cdot ds$ taken along the path (although we are integrating along a curve the integral is nevertheless called a line integral, a very stupid name). Here ds is an infinitesimal piece of the curve, and by taking the scalar product we are projecting F onto it, i.e., we are counting only the component of the force that acts in the direction of the curve. In particular, if the force is perpendicular to the curve it has no effect at all and might as well be absent altogether. Thus we must not think that any effort is required to keep the object from deviating from the path; rather the object must be understood to be constrained to the path in a natural manner. A prototype example would be a hockey puck sliding on the surface of a frozen lake. Friction aside, gravity has no effect on the puck’s free motion along this surface, and no effort is required to keep the puck from “turning downwards.” Generalising from this example, we can imagine any line integral as a puck following a frictionless ice channel, such as a groove in the ice. The idea is that the ice channel ensures that the puck’s inertial velocity is directed along the path of motion in a “lossless” fashion, as if this had been its natural inertial motion. Under these conditions the work done by the field amounts to how much it speeds up or slows down the motion along the path.

11.3.1. Discuss some line integrals in the gravitational force field near the earth, F = (0, −mg). Explain how this relates to potential energy. Also explain the meaning of the sign of the integral.

On a more astronomical scale, the gravitational pull of an object of great mass is $-GMm/r^2$, directed radially towards it. The net work done when moving a body along any closed path in this field is zero.

11.3.2. Show that this follows from energy conservation. (Hence such fields are called conservative.)

11.3.3. Also prove the same result more directly as follows.

(a) First prove the result for a path made up of pieces that are either radial or circular with respect to the center of force, such as this:

[Figure: a closed path made up of radial and circular pieces.]

(b) Now consider an infinitesimal right triangle whose two legs are such radial and circular lines, and prove that the work along these legs is the same as the work along the hypotenuse.

(c) Conclude the proof of the desired result.

11.3.4. Infer that, in this force field, the work done in bringing an object from one point to another is independent of the path taken. Note that this holds for any conservative field.

Since distance travelled can also be expressed as velocity times time, ds = ẋ dt, another way of writing $\int F \cdot ds$ is $\int F \cdot \dot{x}\, dt$, which is a more practical form of the work integral for cases where the path x is given as a parametrised curve.

11.3.5. (a) Find an explicit formula for F for the radial gravitational force field. Hint: The vector (x, y, z) points from the origin to this point, so the vector (−x, −y, −z) points from this point to the origin. It remains only to scale it so that it has the right magnitude.

(b) Confirm by explicit calculation that the work done by the field on an object traversing the circle (a + cos t, sin t, 0) is zero. Note that the special case a = 0 is easy to deal with both physically and computationally.

Another way of seeing that a field is conservative is to think of its integrand as the derivative of something. For if $\int_0^T F \cdot \dot{x}\, dt = \int_0^T y'\, dt$ then by the fundamental theorem of calculus the integral evaluates to y(T) − y(0), which is zero if the start and end points are the same. This happens precisely when the field is a gradient field, i.e., F = ∇f for some function f(x, y, z).

11.3.6. Explain why in this case

$$\frac{d}{dt} f(x(t)) = \nabla f(x(t)) \cdot \dot{x}(t)$$

and use this to show that

$$\int_0^T F \cdot \dot{x}\, dt = f(x(T)) - f(x(0))$$

This function f has an important physical meaning as the following problem shows.

11.3.7. (a) Find an f such that F = ∇f for the radial gravitational field F.

(b) Show that f is the potential energy. By definition the potential energy is the same thing as the work obtained by letting the object fall to the center of force.

(c) An application of this: For a body orbiting the sun, if at some point in its orbit it is twice as far from the sun as at another point, what is the difference in velocity between these two points? Hint: Use energy conservation.

This meaning of f as the potential energy generalises to any conservative force field, as we can see by the following reasoning.

11.3.8. Let F be any conservative field and define f(x) as $\int_o^x F \cdot ds$, where the integral is taken along any path from the arbitrarily fixed origin o to the general point x.

(a) Explain why it is necessary for F to be conservative for this construction to make sense.

(b) Show that ∇f = F.

(c) Conclude that f is a potential energy function. What does the arbitrariness of o mean in physical terms?

Restated in purely mathematical terms, this shows that any conservative field is a gradient field.

§ 11.3.2. Problems

11.3.9. Show that the field F(x, y) = (P(x, y), Q(x, y)) is conservative if and only if $P_y = Q_x$ at every point in the plane.

§ 11.4. Circulation

§ 11.4.1. Lecture worksheet

We have seen that if F is a force field then $\int F \cdot ds$ is the work done by the field in traversing a certain path, that is, the “boost” that the field is giving us, like a wind in our backs, as we traverse the path.

A problem with this image is that we must disregard the forces that are trying to push us off the path. This is what we accomplished above with our “ice channel” idea. The analogous idea for fluid flows would be to imagine that all of the fluid except for a narrow channel is instantaneously frozen. Then the fluid in the channel will, in general, continue flowing around one way or the other depending on which direction had the greater momentum in the original flow, and this net balance of momenta is precisely what the integral $\oint F \cdot ds$ computes.

For this reason, when the work integral $\oint F \cdot ds$ is taken around a closed path (indicated by the little circle on the integral sign) it is called the circulation. To fix the sign, the convention is that we traverse the path so that the inside is on our left. The figure thus shows a negative circulation.

The suggestive idea of the ice channel is all we really need, but if you wish to think more about what actually happens to the fluid you should be able to convince yourself that, owing to its incompressibility, the fluid will settle into a circulation of uniform speed (as long as there are no sources or sinks inside the channel). So, when we freeze the rest of the fluid, the flow in the channel also alters, and therefore ceases to represent the forces in the force field interpretation of F. Nevertheless, it remains a viable image for the net work done along the channel as a whole, which was its purpose in the first place.

You may also wish to think about how to reconcile the momentum-based account of circulation with the particle-kinetic fluid model that we used at the end of §11.1 to argue against momentum effects. Hint: the narrow channel now means that the momenta are coordinated after all.

We shall now show that the circulation along a loop can be computed as the sum of the circulations around its interior pieces. To this end it is useful to consider first the circulation around an infinitesimal square, and then to take general shapes to be made up of them. So consider an infinitesimal square with its sides parallel to the axes. What is the circulation around this square? Consider first the two vertical sides. The force F = (P, Q) will be almost the same along both of these sides, namely Q evaluated there. As we walk around the square, one of these Q-forces goes with the circulation and the other against it, so their net effect is zero except for the fact that they are not quite equal: since the two sides are dx apart their Q-values differ by $\frac{\partial Q}{\partial x}\,dx$. This, then, is the net force contributing to the circulation, and since it acts across a distance of dy its net contribution is $\frac{\partial Q}{\partial x}\,dx\,dy$.

11.4.1. Continue this line of reasoning to show that the circulation around the infinitesimal square is

$$\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy$$

Hint: The signs are easily understood by recalling that our convention is to round the square counter-clockwise and considering whether an increase in P or Q helps or hinders this motion.

We now wish to consider any region as an aggregate of infinitesimal squares, which will give us Green’s Theorem:

$$\oint P\,dx + Q\,dy = \iint \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy$$

The left hand side is just the usual work or circulation integral $\oint F \cdot ds$ written out in terms of the components of F = (P, Q). The right hand side is the sum of the circulations about infinitesimal squares.

11.4.2. Complete the proof of Green’s Theorem as follows.

(a) Suppose two infinitesimal squares are joined along one side. Argue that the circulation around the new region is the sum of the circulations of each constituent square considered separately. (The picture suggests the idea that the shared edge “cancels,” but make sure that your explanation makes sense in terms of the fluid flow interpretation of circulation.)

(b) We can approximate any region very closely by infinitesimal squares, but the boundary will be “jagged.” Prove that the flow remains the same if we cut across diagonally instead of following two edges of the boundary squares.

§ 11.4.2. Problems

11.4.3. Show that the area of a region in the plane is given by the integral $\oint x\,dy$ or $-\oint y\,dx$ taken around its boundary. Hint: Cut the boundary of the figure into infinitesimal pieces and draw the rectangles x dy for each piece.

These results are often presented as a corollary of Green’s Theorem even though it is much more illuminating to understand them directly from first principles.
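The area formulas of problem 11.4.3 can be tried out numerically on a polygon, where each straight boundary edge contributes a thin strip x dy (or −y dx). The sketch below uses hypothetical helper names and a made-up sample polygon; it illustrates the formulas rather than proving them.

```python
# Sketch: area of a polygon from the boundary integrals of x dy and -y dx.
# Vertices are listed counter-clockwise (interior on the left).
# Function names and the sample polygon are illustrative assumptions.

def area_x_dy(vertices):
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        # On a straight edge x varies linearly, so the integral of x dy
        # equals the average x-value times the change in y.
        total += 0.5 * (x0 + x1) * (y1 - y0)
    return total

def area_minus_y_dx(vertices):
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        # Same idea with the roles of x and y exchanged, and a minus sign.
        total -= 0.5 * (y0 + y1) * (x1 - x0)
    return total

# A 3-by-2 rectangle with a triangular roof of height 1: area 6 + 1.5 = 7.5.
house = [(0, 0), (3, 0), (3, 2), (1.5, 3), (0, 2)]
print(area_x_dy(house), area_minus_y_dx(house))
```

Listing the vertices clockwise instead flips the sign of both sums, in line with the orientation convention for circulation.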

11.4.4. By averaging the two area expressions in the previous problem we can also write the area as $\frac{1}{2}\oint x\,dy - y\,dx$. This formula can also be interpreted in terms of determinants: Show how it is obtained from a computation of the areas of triangles with corner points (0, 0), (x, y), (x + dx, y + dy) by determinant methods.

11.4.5. Find the area of the ellipse $x^2/a^2 + y^2/b^2 = 1$.

11.4.6. Show that the Divergence Theorem reduced to two dimensions gives essentially Green’s Theorem (only with trivial modifications in signs).

§ 11.5. Curl

§ 11.5.1. Lecture worksheet

We would now like to extend our investigation of circulation into three dimensions. We do this by means of the curl of a vector field F = (P, Q, R):

$$\operatorname{curl} F = \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right) i + \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right) j + \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) k$$

This expression is more conveniently expressed in terms of the formal nabla vector as curl F = ∇ × F.

The z-component of the curl vector is the circulation around an infinitesimal square parallel to the xy-plane, just as in the previous section, and the other components are the analogous expressions for the other directions. The meaning of the curl vector, therefore, is that curl F · n measures how much a wheel with axis n would be made to rotate by the fluid. For example, if n points in the z-direction we are asking for the rotation of a wheel with this axis, which is just $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$, or the circulation parallel to the xy-plane, just as before. If n points in some oblique direction then the rotation of a wheel with this axis will be a combination of the various coordinate-axis-plane rotations taken in proportions depending on what coordinate axis n agrees more with, and this is precisely what the scalar product accomplishes.

Green’s Theorem about circulation extended to three dimensions, where it is called Stokes’ Theorem, therefore becomes

$$\oint F \cdot ds = \iint \operatorname{curl} F \cdot n \, dS$$

The curl vector can also be interpreted directly, without projecting it onto a specific direction vector, in a manner very similar to how the direction and magnitude of ∇f were found to have interesting intrinsic meaning once this vector had been most naturally introduced in terms of its scalar products (cf. §9.4).

11.5.1. Convince yourself that the direction of curl F is the axis of maximal circulation (rather like the vortical axis in a pitcher of lemonade being stirred) and that its magnitude is the intensity of rotation.

[Figure: the vector curl F and the field F circulating around it.]

§ 11.5.2. Problems

11.5.2. Prove computationally that curl ∇f = 0 and argue physically that the curl of a field is zero if and only if the field is conservative.

11.5.3. (a) Prove computationally that div curl F = 0.

(b) † Is there a physical interpretation of this result?

11.5.4. Prove that div curl F = 0 by considering the limit of Stokes’ Theorem as the loop shrinks to a point and then applying the Divergence Theorem to the resulting integral.

11.5.5. Find the curl of F = zi and interpret the result physically.

11.5.6. The fact that “you can’t go uphill both ways to school” is an instance of what vector calculus theorem?

§ 11.6. Electrostatics and magnetostatics

§ 11.6.1. Lecture worksheet

Space is permeated by two fields: the electric field E, which describes how a positively charged (static) particle would move, and the magnetic field B, which describes in which direction the north pole of a compass needle points. All electromagnetic phenomena can be characterised in terms of these fields. In particular, all information transmitted through all forms of wireless communication is encoded in these fields.

The complete theory of electromagnetism is contained in a few simple laws, analogous to Newton’s law of classical mechanics. These are the equations of Maxwell. Before stating these laws in full generality we wish to study the electrostatic and magnetostatic special cases. For electrostatics Maxwell’s equations reduce to

$$\nabla \cdot E = \rho/\varepsilon_0 \quad\text{and}\quad \nabla \times E = 0$$

where ρ is electric charge and $\varepsilon_0$ is a constant. Thus the first law is saying that electric charges produce divergence (i.e., act as sources or sinks) in the electric field, as we have already discussed above. Note that Coulomb’s inverse-square law is automatically incorporated in this more elegant divergence law (in the manner of problem 11.1.1). The second equation says that the curl is always zero, as of course we would expect since an electrostatic field is analogous to a gravitational one and therefore conservative.

For magnetostatics Maxwell’s equations reduce to

$$\nabla \times B = \mu_0 J \quad\text{and}\quad \nabla \cdot B = 0$$

where J is the (constant) flow of electric current and $\mu_0$ is a constant. So the “magnetic fluid” spins around and around, in a direction perpendicular to the current:

[Figure: the magnetic field circulating around the current J.]

The magnetic field is like a pitcher of lemonade that someone is stirring, and the direction of the current is the axis of its rotation. The divergence is zero: no magnetic fluid is generated or destroyed. This absence of divergence corresponds to the fact that there are no magnetic monopoles, i.e., magnets always come in north-south pairs, never a piece of “north only,” in contrast to the charged particles that generate divergence in the electric field. (As Humphry Davy once wrote to a woman to whom he was attracted: “You are my magnet, though you differ from a magnet in having no repulsive points.”)

When we speak of the current J in this law we must understand that we do not mean only currents from a battery or a wall socket. An ordinary bar magnet also contains a current: in fact, at an atomic level, for a material to be magnetised means that the rotations of its electrons are synchronised in such a way that they amount to a microscopic current along the surface of the material. This is why a magnet can always be replaced by a coiled wire with a current in it: the two are fundamentally the same thing. So while we have all played with ordinary magnets and might have expected them to be the most primitive objects of magnetic theory, they are in fact from a theoretical point of view a bit of an exotic curiosity; it is rather the definition of magnetic fields in terms of current that is the basic one.

[Figure: a bar magnet with poles S and N shown as equivalent (=) to a coiled wire with a current in it.]

In addition to Maxwell’s four equations describing the fields E and B one also needs a “0th equation” to understand the effect of the fields on a charged particle. This law says that the force on a particle of charge q moving at velocity v is

$$F = q(E + v \times B)$$

So the electric field is pushing things on directly as we have already seen, whereas the effect of the magnetic field is a bit more subtle. When a charge is moving we can think of its velocity vector as a wrench and the magnetic field as a force pushing the wrench. The resulting torque is the force that the particle experiences.

11.6.1. Explain how the following experimental facts are accounted for by these equations.

(a) On the table in front of me I have a bar magnet standing with its north pole pointing upwards. In the air above the magnet runs a straight wire, fixed in position. When I turn on the current in the wire, the magnet tips over. (In which direction?)

(b) If in the same arrangement the magnet is glued fixed in its position, while the wire is hanging freely, the wire is pushed to the side instead. (In which direction?)

(c) Two parallel wires with currents going the same way are attracted towards each other.

11.6.2. Electric motors harness the above principles to convert electric current to mechanical work.

(a) Explain how this is achieved by means of this arrangement:

[Figure: motor arrangement with magnet poles N and S.]

(b) Electric motors switch the direction of the current once every half turn. Explain why.

§ 11.6.2. Problems

11.6.3. Fields with curl are non-conservative, i.e., do work along closed paths. The magnetic field has curl. So why can we not solve the world’s energy crisis with nothing but a bunch of magnets?

§ 11.7. Electrodynamics

§ 11.7.1. Lecture worksheet

The full versions of Maxwell’s equations are:

$$\nabla \cdot E = \rho/\varepsilon_0$$
$$\nabla \times E = -\dot{B}$$
$$\nabla \times B = \mu_0 \left(J + \varepsilon_0 \dot{E}\right)$$
$$\nabla \cdot B = 0$$

In the previous section we studied the “static” cases where the time-derivatives were zero. Of the two new terms, the one in the second equation is the most interesting. This equation says that if the magnetic field is moving then it causes a “stir” in the electric fluid with the direction of motion being the axis of stirring. If a conducting wire is placed so that its particles are caught in the stirred-up vortex then a current is generated.

11.7.1. Using the setup of problem 11.6.1b, explain how a current can be created in the wire by moving it back and forth. Which way does the current go?

11.7.2. Draw the field Ḃ and explain how the motor in problem 11.6.2 can be “run backwards” to generate electric current from mechanical work. This is how power plants generate electricity.

§ 11.7.2. Problems

11.7.3. An alternating current is a current that oscillates from one direction to the other like a sine wave. This is the kind of current that comes out of our wall sockets.

(a) Consider the magnetic field B generated by a current in a coiled wire, and draw the field Ḃ resulting if the current is alternating.

(b) Consider a second coiled wire placed next to the first, with its axis aligned with it. Show that the alternating current in the first induces, without a direct connection, a current in the second wire.

This is used for example in electric toothbrush chargers, where an exposed electrical connector would be hazardous. Induction cooking is also based on the same principle, in conjunction with the fact that certain metals produce a lot of resistive heat when a current is induced in them.

11.7.4. Discuss the following passage from Maxwell’s paper On Faraday’s lines of force (1855).

    The student [of electrical science] must make himself familiar with a considerable body of most intricate mathematics, the mere retention of which in the memory materially interferes with further progress. The first process therefore in the effectual study of the science, must be one of simplification and reduction of the results of previous investigation to a form in which the mind can grasp them. The results of this simplification may take the form of a purely mathematical formula or of a physical hypothesis. In the first case we entirely lose sight of the phenomena to be explained; and though we may trace out the consequences of given laws, we can never obtain more extended views of the connexions of the subject. If, on the other hand, we adopt a physical hypothesis, we see the phenomena only through a medium, and are liable to that blindness to facts and rashness in assumption which a partial explanation encourages. We must therefore discover some method of investigation which allows the mind at every step to lay hold of a clear physical conception, without being committed to any theory founded on the physical science from which that conception is borrowed, so that it is neither drawn aside from the subject in pursuit of analytical subtleties, nor carried beyond the truth by a favourite hypothesis.

§ 11.8. Reference summary

§ 11.8.1. Line integrals

Work done by a field along a path:

$$\int F \cdot ds = \int F \cdot \dot{x}\, dt$$

Reversal of direction:

$$\int_{-\gamma} F \cdot dx = -\int_{\gamma} F \cdot dx$$

where −γ is the curve γ traversed backwards.

Fundamental theorem for line integrals:

$$\int_a^b \nabla f \cdot dx = f(b) - f(a)$$

where a and b are the start and end points of any curve.

§ 11.8.2. Vector field concepts

F is conservative
⇐⇒ F is a gradient field (F = ∇f)
⇐⇒ line integrals independent of path
⇐⇒ line integrals around closed paths = 0
⇐⇒ curl F = 0
⇐⇒ F irrotational

In this case −f is called the potential function. A special case is the potential energy of a gravitational field.

div F = ∇ · F = generated flux

curl F = ∇ × F = axis of maximal circulation

curl F · n = circulation around axis n

direction of curl F = axis of maximal circulation

|curl F| = intensity of maximal circulation

§ 11.8.3. Properties of curves

closed                 returns to where it started; final point = initial point
simple                 does not intersect itself
positively oriented    interior on left as traversed
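The vector field concepts of §11.8.2 can be explored numerically. The following sketch approximates div F and curl F at a point by central differences and applies them to two sample fields: a rigid rotation about the z-axis, and the gradient of f = xyz. The fields, the evaluation point and the helper names are illustrative assumptions, and a numerical check of this kind is no substitute for the computational proofs asked for in the problems.

```python
# Sketch: central-difference approximations of div F and curl F at a point.
# Field choices and helper names are illustrative assumptions.

def partial(F, comp, var, p, h=1e-5):
    # d F[comp] / d x[var] at the point p, by central differences.
    plus = list(p)
    minus = list(p)
    plus[var] += h
    minus[var] -= h
    return (F(*plus)[comp] - F(*minus)[comp]) / (2 * h)

def div(F, p):
    return sum(partial(F, i, i, p) for i in range(3))

def curl(F, p):
    # (R_y - Q_z, P_z - R_x, Q_x - P_y) for F = (P, Q, R).
    return (partial(F, 2, 1, p) - partial(F, 1, 2, p),
            partial(F, 0, 2, p) - partial(F, 2, 0, p),
            partial(F, 1, 0, p) - partial(F, 0, 1, p))

def rotation(x, y, z):
    # Rigid rotation about the z-axis.
    return (-y, x, 0.0)

def gradient_of_xyz(x, y, z):
    # Gradient of f(x, y, z) = x*y*z, hence a conservative field.
    return (y * z, x * z, x * y)

p = (0.3, -1.2, 0.7)
print(curl(rotation, p), div(rotation, p))
print(curl(gradient_of_xyz, p))
```

The rotation field should show a curl pointing along the z-axis (its axis of circulation) and no divergence, while the gradient field should show a curl that vanishes up to rounding.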

§ 11.8.4. Vector calculus theorems

Divergence Theorem (Gauss’s Theorem):

$$\iiint \operatorname{div} F \, dV = \iint (F \cdot n)\, dS$$

Green’s Theorem:

$$\oint P\,dx + Q\,dy = \iint \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy$$

Stokes’ Theorem:

$$\oint F \cdot ds = \iint \operatorname{curl} F \cdot n \, dS$$

These theorems are generally not applicable if the functions involved have discontinuities in the region in question.

§ 11.8.5. Electromagnetism

Maxwell’s equations:

$$\nabla \cdot E = \rho/\varepsilon_0$$
$$\nabla \times E = -\dot{B}$$
$$\nabla \times B = \mu_0 \left(J + \varepsilon_0 \dot{E}\right)$$
$$\nabla \cdot B = 0$$

Effect of fields on moving particle:

$$F = q(E + v \times B)$$

§ 11.8.6. Problem guide

• Evaluate 2D line integral $\int_\gamma F \cdot dx = \int_\gamma P(x, y)\,dx + Q(x, y)\,dy$, where γ is a given curve.

Is F = (P, Q) the gradient field $(f_x, f_y)$ of some function f(x, y)? If so, use the fundamental theorem for line integrals.

Are $P_y$, $Q_x$, and the interior of γ relatively simple? If so, consider using Green’s Theorem.

Parametrise γ (see §7.5.3), or, if easier, pieces of it one at a time.

Rewrite the integral purely in terms of t. The integrand can be rewritten directly by plugging in the corresponding expressions in t from the parametrisation equations. We must also differentiate the parametrisation equations to find dx, dy in terms of dt.

The integral can now be evaluated as an ordinary single-variable integral.

Example: Evaluate $\int_C xy\,dx + 2y\,dy$ on the curve $y = x^2$ from x = 0 to x = 2.

Solution: $\int_C xy\,dx + 2y\,dy = \int_0^2 (x^3 + 4x^3)\,dx = \left[\frac{5x^4}{4}\right]_0^2 = 20$.

Example: $\int_C (1 - y)\,dx + x\,dy$, where C is a curve consisting of three parts: $C_1$, the arc of $y = 1 - x^2$ from (1, 0) to (0, 1); $C_2$, the line segment from (0, 1) to (−1, 0); $C_3$, the line segment from (−1, 0) to (1, 0).

Solution: Along $C_1$ we have $y = 1 - x^2 \Rightarrow dy = -2x\,dx$, so $\int_{C_1} (1 - y)\,dx + x\,dy = \int_{C_1} x^2\,dx + x(-2x\,dx) = \int_1^0 -x^2\,dx = \left[-x^3/3\right]_1^0 = 1/3$. Along $C_2$ we have $y = 1 + x \Rightarrow dy = dx$, so $\int_{C_2} (1 - y)\,dx + x\,dy = \int_{C_2} -x\,dx + x\,dx = 0$. Along $C_3$ we have y = 0 and dy = 0, so $\int_{C_3} (1 - y)\,dx + x\,dy = \int_{C_3} dx = [x]_{-1}^{1} = 2$. Thus $\int_C (1 - y)\,dx + x\,dy = 1/3 + 0 + 2 = 7/3$.

Example: $\int_\gamma F \cdot dr$, where $F = (\sin(y - x),\ 2xy + \sin(x - y))$ and γ is the curve $y = \sqrt{x}$, 0 ≤ x ≤ 1.

Solution: Let $\gamma_1$ be the line segment y = x, 0 ≤ x ≤ 1, and let D be the region enclosed by γ and $\gamma_1$. Note that $\gamma_1 - \gamma$ is positively oriented with respect to D. Since $\partial Q/\partial x - \partial P/\partial y = 2y$, Green’s Theorem gives $\int_{\gamma_1 - \gamma} F \cdot dr = \iint_D 2y\,dx\,dy = \int_0^1 \left(\int_x^{\sqrt{x}} 2y\,dy\right) dx = \int_0^1 \left[y^2\right]_{y=x}^{y=\sqrt{x}} dx = \int_0^1 (x - x^2)\,dx = \frac{1}{6}$. A parametrisation of $\gamma_1$ is r(t) = (t, t), 0 ≤ t ≤ 1, so $\int_{\gamma_1} F \cdot dr = \int_0^1 (0, 2t^2) \cdot (1, 1)\,dt = \int_0^1 2t^2\,dt = \frac{2}{3}$. Hence $\int_\gamma F \cdot dr = \frac{2}{3} - \frac{1}{6} = \frac{1}{2}$.

Example: $\int_{\gamma_a} yz\,e^{xyz}\,dx + zx\,e^{xyz}\,dy + xy\,e^{xyz}\,dz$, where $\gamma_a$ is the curve x = cos t, y = sin t, z = t, 0 ≤ t ≤ a.

Solution: This is a gradient field with potential function $e^{xyz}$. Hence the integral is $\left[e^{x(t)y(t)z(t)}\right]_{t=0}^{t=a} = e^{a \cos a \sin a} - 1$.

Example: $\int_\gamma \frac{2x}{2x^2 + 3y^2}\,dx + \frac{3y}{2x^2 + 3y^2}\,dy$, where γ is the curve $(x(t), y(t)) = (\cos t + e^{\sin t},\ e^{\cos t})$, 0 ≤ t ≤ π.

Solution: Note that $Q'_x = P'_y$, which means that (P, Q) is a gradient field. A potential function is $U(x, y) = \frac{1}{2}\ln(2x^2 + 3y^2)$. Hence $\int_\gamma \frac{2x}{2x^2 + 3y^2}\,dx + \frac{3y}{2x^2 + 3y^2}\,dy = U(b) - U(a) = U(0, e^{-1}) - U(2, e) = \frac{1}{2}\ln\left(\frac{3}{e^2(8 + 3e^2)}\right)$.

Example: $\int_\gamma x\sin(y^2)\,dx + (x^2 y\cos(y^2) + 2x)\,dy$, where γ is the ellipse $x^2 + 4y^2 = 1$ traversed counter-clockwise.

Solution: Let $D = \{(x, y) : x^2 + 4y^2 \le 1\}$. Green’s Theorem gives $\int_\gamma x\sin(y^2)\,dx + (x^2 y\cos(y^2) + 2x)\,dy = \iint_D \left(\frac{\partial}{\partial x}(x^2 y\cos(y^2) + 2x) - \frac{\partial}{\partial y}(x\sin(y^2))\right) dx\,dy = 2\iint_D dx\,dy = 2 \cdot \pi \cdot 1 \cdot \frac{1}{2} = \pi$. (Using the formula A = πab for the area of an ellipse.)
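The parametrisation recipe of the problem guide can also be mirrored numerically. The sketch below approximates a 2D line integral along a parametrised curve with the midpoint rule and applies it to two of the worked examples above. The helper name `line_integral` and the step count are illustrative assumptions.

```python
import math

# Sketch: evaluate the line integral of P dx + Q dy along a parametrised
# curve (x(t), y(t)), a <= t <= b, with the midpoint rule.
# The helper name and step count are illustrative assumptions.

def line_integral(P, Q, x, y, a, b, n=20000):
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t0, t1 = a + i * dt, a + (i + 1) * dt
        tm = 0.5 * (t0 + t1)
        dx = x(t1) - x(t0)          # change in x over the small step
        dy = y(t1) - y(t0)          # change in y over the small step
        total += P(x(tm), y(tm)) * dx + Q(x(tm), y(tm)) * dy
    return total

# x y dx + 2 y dy along y = x^2 from x = 0 to x = 2 (worked value: 20).
first = line_integral(lambda x, y: x * y, lambda x, y: 2 * y,
                      lambda t: t, lambda t: t * t, 0.0, 2.0)

# x sin(y^2) dx + (x^2 y cos(y^2) + 2x) dy around the ellipse x^2 + 4y^2 = 1,
# counter-clockwise (worked value: pi).
second = line_integral(lambda x, y: x * math.sin(y * y),
                       lambda x, y: x * x * y * math.cos(y * y) + 2 * x,
                       lambda t: math.cos(t), lambda t: 0.5 * math.sin(t),
                       0.0, 2 * math.pi)
print(first, second)
```

A direct numerical evaluation like this is a useful cross-check on answers obtained through Green’s Theorem or a potential function, since it uses neither.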

Example: $\int_\gamma \frac{2xy\,dx - x^2\,dy}{x^4 + y^2}$, where γ is the curve $y = x^2 + 2x - 4$ from (1, −1) to (2, 4).

Solution: Since $\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y} = \frac{2x^5 - 2xy^2}{(x^4 + y^2)^2}$, we can change the path of integration to e.g. the three line segments from (1, −1) to (1, 0) ($\gamma_1$), (1, 0) to (2, 0) ($\gamma_2$), and (2, 0) to (2, 4) ($\gamma_3$). Then $\int_\gamma \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_{\gamma_1} + \int_{\gamma_2} + \int_{\gamma_3} = I_1 + I_2 + I_3$. We compute each integral separately.

$I_1 = \int_{\gamma_1} \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_{-1}^{0} \frac{-dy}{1 + y^2} = [-\arctan y]_{-1}^{0} = -\arctan 0 + \arctan(-1) = -\frac{\pi}{4}$.

$I_2 = \int_{\gamma_2} \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_1^2 0\,dx = 0$.

$I_3 = \int_{\gamma_3} \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_0^4 \frac{-4\,dy}{16 + y^2} = \{y = 4t,\ dy = 4\,dt\} = \int_0^1 \frac{-16\,dt}{16 + 16t^2} = \int_0^1 \frac{-dt}{1 + t^2} = [-\arctan t]_0^1 = -\arctan 1 + \arctan 0 = -\frac{\pi}{4}$.

Altogether: $\int_\gamma = I_1 + I_2 + I_3 = -\frac{\pi}{4} + 0 - \frac{\pi}{4} = -\frac{\pi}{2}$.

• Decide whether a given vector field F = (P, Q) is the gradient $(f_x, f_y)$ of some scalar function f(x, y).

If such an f exists you may find it as follows. Integrate P with respect to x. The constant of integration could be any function of y, say C(y). The resulting expression is our candidate f. Take its derivative with respect to y. Compare the result with Q. If C(y) can be chosen so that they match then F = (P, Q) is the gradient field of the resulting f.

If such an f does not exist, the quickest way to show this may be to check whether $P_y = Q_x$. This would have to be the case for f to exist (equality of mixed partial derivatives).

• Compute surface integral $\iint_S F \cdot n\, dS = \iint_S f\, dS$, where S is a given surface.

If div F is relatively simple, consider using the Divergence Theorem. Otherwise:

Express the surface S (or pieces of it) as the result of letting two variables range between certain bounds (polar, cylindrical, or spherical coordinates may be useful). Express the integrand f and the surface area element dS in terms of these variables. Evaluate as a double integral.

Example: Find the flow of the field $F = \mathbf{r}/r^2$, where $\mathbf{r} = (x, y, z)$ and $r = |\mathbf{r}|$, out of the region $D = \{(x, y, z) : 2 \le x^2 + y^2 + z^2 \le 3\}$.

Solution: $\operatorname{div} F = 1/r^2$. Hence the Divergence Theorem gives $\Phi = \iiint_D \frac{1}{r^2}\,dV = \int_{\sqrt{2}}^{\sqrt{3}} \left(\iint_{x^2 + y^2 + z^2 = r^2} \frac{1}{r^2}\,dS\right) dr = \int_{\sqrt{2}}^{\sqrt{3}} 4\pi\,dr = 4\pi(\sqrt{3} - \sqrt{2})$.

Example: Find the flow of the field $F = (-xy^2,\ x\sin z - y,\ zy^2)$ into the region $K = \{(x, y, z) \in \mathbb{R}^3 : 1 \le x \le 3,\ 1 \le y \le 4,\ 2 \le z \le 4\}$.

Solution: With ∂K oriented with outward-pointing normal, the flow into the region is $-\iint_{\partial K} F \cdot dS = -\iiint_K \operatorname{div} F\,dx\,dy\,dz = -\iiint_K -1\,dx\,dy\,dz = \text{Volume}(K) = 12$.

Example: $\iint_Y F \cdot N\,dS$, where $F = (x\sin y,\ x + \cos y,\ z - 1)$ and Y is the part of the ellipsoid $\{x^2 + 2y^2 + 4z^2 = 1\}$ with z ≥ 0 (normal oriented upwards).

Solution: Complete Y with the “floor” $Y_0 = \{x^2 + 2y^2 \le 1,\ z = 0\}$ (normal pointing downwards). We can then apply the Divergence Theorem to the resulting closed surface, which encloses a solid K: $\iint_Y F \cdot N\,dS + \iint_{Y_0} F \cdot N\,dS = \iiint_K \operatorname{div} F\,dx\,dy\,dz$. Two terms of this equation are readily evaluated, namely: $I_0 = \iint_{Y_0} F \cdot N\,dS = \iint_{x^2 + 2y^2 \le 1} (x\sin y,\ x + \cos y,\ -1) \cdot (0, 0, -1)\,dx\,dy = \iint_{x^2 + 2y^2 \le 1} 1\,dx\,dy = \frac{\pi}{\sqrt{2}}$ and $I_1 = \iiint_K \operatorname{div} F = \iiint_K (\sin y - \sin y + 1)\,dx\,dy\,dz = \iiint_K 1\,dx\,dy\,dz = \frac{1}{2} \cdot \frac{4\pi}{3} \cdot 1 \cdot \frac{1}{\sqrt{2}} \cdot \frac{1}{2} = \frac{\sqrt{2}\,\pi}{6}$ (using the formula $V = \frac{4\pi}{3}abc$ for the volume of an ellipsoid with semi-axes a, b, c). Hence the sought integral is $I_1 - I_0 = -\frac{\sqrt{2}\,\pi}{3}$.

• Evaluate 3D line integral $\int_\gamma F \cdot dx$, where γ is a given curve.

If curl F is relatively simple, consider using Stokes’ Theorem.

Is F a gradient field? If so, use the fundamental theorem for line integrals.

For direct computation, use parametrisation as in 2D case.

Example: Determine the work done by $F = (3x^2 + yz,\ \cos y + xz,\ 4e^z + xy)$ along a curve γ from $(0, \frac{\pi}{2}, 1)$ to (1, 0, 0).

Solution: The work done is independent of the path since curl F = (x − x, y − y, z − z) = (0, 0, 0). There is thus a potential function U such that $\left(\frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}, \frac{\partial U}{\partial z}\right) = (3x^2 + yz,\ \cos y + xz,\ 4e^z + xy)$. Hence $\frac{\partial U}{\partial x} = 3x^2 + yz \Rightarrow U = x^3 + xyz + G(y, z)$, which gives $\cos y + xz = \frac{\partial U}{\partial y} = xz + \frac{\partial G}{\partial y} \Rightarrow \frac{\partial G}{\partial y} = \cos y$, and hence $G(y, z) = \sin y + H(z)$. Thus we can write $U = x^3 + xyz + \sin y + H(z)$, which gives $4e^z + xy = \frac{\partial U}{\partial z} = xy + \frac{\partial H}{\partial z} \Rightarrow H(z) = 4e^z + C$. We are free to choose C = 0, so that $U = x^3 + xyz + \sin y + 4e^z$. The work done is thus $U(1, 0, 0) - U(0, \frac{\pi}{2}, 1) = 5 - (1 + 4e) = 4 - 4e$.
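The potential function found in the work example above can be checked numerically in two ways: its gradient should reproduce F, and the work along any particular path between the two endpoints should equal the difference in U. The sketch below uses a straight-line path, which is an arbitrary choice made for illustration, and hypothetical helper names.

```python
import math

# Sketch: check U = x^3 + x y z + sin y + 4 e^z against the field
# F = (3x^2 + yz, cos y + xz, 4e^z + xy) from the worked example.
# The straight-line path and helper names are illustrative assumptions.

def F(x, y, z):
    return (3 * x * x + y * z, math.cos(y) + x * z, 4 * math.exp(z) + x * y)

def U(x, y, z):
    return x**3 + x * y * z + math.sin(y) + 4 * math.exp(z)

def grad_U(x, y, z, h=1e-6):
    # Central-difference gradient of U.
    return ((U(x + h, y, z) - U(x - h, y, z)) / (2 * h),
            (U(x, y + h, z) - U(x, y - h, z)) / (2 * h),
            (U(x, y, z + h) - U(x, y, z - h)) / (2 * h))

def work_along_segment(start, end, n=20000):
    # Midpoint rule for the integral of F . x'(t) dt on the straight
    # segment x(t) = start + t (end - start), 0 <= t <= 1.
    step = [(e - s) / n for s, e in zip(start, end)]
    total = 0.0
    for i in range(n):
        mid = [s + (i + 0.5) * d for s, d in zip(start, step)]
        total += sum(f * d for f, d in zip(F(*mid), step))
    return total

start, end = (0.0, math.pi / 2, 1.0), (1.0, 0.0, 0.0)
work = work_along_segment(start, end)
print(work, U(*end) - U(*start), 4 - 4 * math.e)
```

The three printed numbers should agree closely, in line with the path independence used in the solution.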

12 FURTHER PROBLEMS

§ 12.1. Newton’s moon test

Prerequisites: §1.1.

The moon is kept in its orbit by the earth’s gravitational pull, or
so your high school textbook told you. How do you know that it
is really so? How do you know that the moon is not towed about
by a bunch of angels? This question doesn’t seem to arise in
today’s authoritarian classrooms, but Newton gave an excellent
answer if anyone is interested.

“That force by which the moon is held back in its orbit is that
very force which we usually call ‘gravity’,” says Newton
(Principia, Book III, Prop. IV). And his proof goes like this. Consider
the hypothetical scenario that “the moon be supposed to be
deprived of all motion and dropped, so as to descend towards
the earth.” If we knew how far the moon would fall in, say, one
second, then we could compare its fall to that of an ordinary
object such as an apple. Ignoring air resistance, the two should
fall equally far if dropped from the same height.

Of course we cannot actually drop the moon, but with the
power of infinitesimals we can deduce what would happen if
we did. Here is a picture of the moon’s orbit, with the earth in
the center:

[Figure: the moon’s orbit, a circle with center S, with the points A, B, C, D and E marked.]

Suppose the moon moves from A to B along a circle with center
S in an infinitely small interval of time. If there were no gravity
the moon would have moved along the tangent to the circle to
some point E instead of to B (BE is parallel to ASD because the
time interval is infinitely small so gravity has no time to change
direction).

12.1.1. (a) Prove that ABC is similar to ABD.

        Thus AC/AB = AB/AD, i.e.,
        (diameter of the orbit)/(arc) = (arc)/(distance fallen).

        (b) Explain how one can use this relation to find how far
        the moon falls in one second.

        (c) Carry out the calculation. (You will need to look up
        some parameters of the moon’s motion.)

        (d) Compute how far the moon would fall if dropped at
        the surface of the earth, where gravity is $60^2$ times
        stronger since the moon is 60 earth radii away.

        (e) Is the result the same as for a falling apple? Hint:
        consult §A.7.12 if your physics is rusty.

§ 12.2. The rainbow

Prerequisites: §2.1.

Rainbows are the result of light from the sun “bouncing”
through raindrops. In this problem we shall show that raindrops
tend to concentrate the rays of the sun, almost like a
magnifying glass, at one particular outgoing angle, namely
about 42°.

From this we can infer the shape and position of the rainbow
as follows. Imagine that you have the sun at your back and that
there is a wall of raindrops some distance ahead of you. Since
the sun is so far away its rays may be considered parallel. For
any given raindrop, picture the ray from the sun that hits it, and
picture the line of sight from your eye to the raindrop. Consider
the angle at which these lines meet.

12.2.1. Characterise the set of all raindrops for which this angle
        is 42°. Hint: consider the ray from the sun that passes
        through your head for reference purposes.

12.2.2. You never see rainbows at mid-day in mid-summer, no
        matter how much it rains. Why not?

Now we must understand where the number 42° comes from.
At any given contact surface between air and water, some light
is reflected and some light is refracted, in the manner
described in problems 2.1.5 and 2.1.7 respectively. The light that
constitutes the rainbow is the light that refracted into the
raindrop, reflected at the back of it, and refracted back out again
through the front. In the plane containing the rays from the
sun, the midpoint of the drop, and the observer, this looks as
follows:

[Figure: a light ray passing through a circular raindrop, with the angles α and β and the net reflection f marked.]

We are assuming that raindrops are spherical.

12.2.3. (a) Explain why all β’s in the figure are equal, and why
        the two α’s are equal.

        (b) Express the ray’s net reflection f as a function of α
        and β. Hint: it may be easier to first consider π − f,
        which is the angular difference between the incoming
        and outgoing angle, i.e., the “total amount the ray
        has turned.”

        (c) Express β in terms of α using the law of refraction.
        The ratio of the speed of light $c_1$ in air and $c_2$ in water
        is about $c_1/c_2 = 4/3$.

        (d) Express f purely in terms of α and find its derivative.

        (e) Set the derivative equal to zero and solve for α.

        (f) Explain why we should expect a concentration of
        light at this angle.

        (g) ? From an airplane it is possible to see a full circle
        rainbow. Explain.

You may be disappointed that we did all this work about
rainbows, yet said nothing about their colours. Indeed, rainbows
are arguably a light-concentration phenomenon more than a
colour phenomenon, scientifically speaking. We are often so
distracted by the beautiful colours that we fail to notice that
the rainbow is also significantly brighter than the surrounding
sky. Black-and-white photos can help us see this more clearly.
The colours of the rainbow result from the fact that light rays of
different colours have slightly different refraction angles.

§ 12.3. Addiction modelling

Prerequisites: §2.5.

McPhee (Formal Theories of Mass Behavior, 1963) offers the
following mathematical model of what he calls “the logic of
addiction.” Let C stand for consumption, scaled so that C = 1
corresponds to “normal maximum” (e.g., in the case of alcohol
consumption, a few glasses of wine or so). The change in
C depends on the stimulation s to consume, which is
proportional to the remaining room 1 − C to consume up to normal
satisfaction, and the resistance r, which is proportional to C.
Thus $C' = s(1 - C) - rC$. But drinking also has an intrinsic effect
i, which increases stimulation and weakens resistance (“one
drink leads to another”, p. 187). So $C' = (s + i)(1 - C) - (r - i)C$.

12.3.1. Extremum occurs when $C' = 0$, i.e. when C = ______.

12.3.2. A normal drinker has s=moderate, i=moderate, and
        r=moderate, giving a ______ extremum consumption.
        A typical alcoholic may have s=big, i=big, r=–small,
        giving a ______ extremum consumption.

The model also explains a different type of behaviour among
some alcoholics, illustrated by this quotation from a sociology
study:

    I don’t drink every day and I’ll go weeks without drinking. Then
    when I’m on top of the world and everything is going swell, I flop
    like a dope. What causes it I don’t know! When I really should
    get drunk is after I’ve sobered up and I’ve got all kinds of
    problems. When I start drinking is, when I don’t have any. Everything
    looks fine and rosy and everything. . . . I went in with him for no
    reason – not planned – and had a couple. The next thing I knew I
    woke up in a hotel room. (p. 210)

12.3.3. Argue that this type of alcoholic seems to have i=big and
        r=–small like a regular alcoholic, but has s=–small.

12.3.4. Explain his behaviour in terms of the differential
        equation.

McPhee concludes: “If other models confirm that this is a quite
general consequence, then it means not to keep on looking
endlessly for behavioral ‘reasons’ for the alcoholic’s
loss-of-control phenomenon. Rather, we might have to face the awful
truth that what the alcoholic has been saying for years is the
truth: no ‘reason’ is really necessary.” (p. 213)

§ 12.4. Estimating n!

Prerequisites: §5.

The factorial function n! is intractable to compute and grasp
for large n since its definition involves compounded
multiplications that quickly grow beyond bounds. Therefore it is useful
to have a closed formula approximating n!. We can find such a
formula as follows.

12.4.1. (a) Decompose ln(n!) into a sum.

        (b) Interpret this sum geometrically as a sum of the areas
        of rectangles with base 1 along the x-axis.

        (c) Estimate the area from below by an integral.

        (d) Estimate the area of the pieces left out by the
        integral approximation. Hint: align these pieces by
        sliding them horizontally to the y-axis.

        (e) Deduce an estimate for n!.

        (f) Check the accuracy of this estimation for a few
        large values of n.

A slightly better estimate may be obtained as follows. First we
note that n! can be expressed as an integral.

12.4.2. Show that
        $$\int_0^\infty x^n e^{-x}\,dx = n!$$
        for any integer n ≥ 0.

We now apply a standard trick for estimating integrals: the “full
width at half maximum” rule, which says that the area under
a graph with a peak in it may be approximated by multiplying
the height of the peak by the width of the graph at the y-value
corresponding to half the maximum value. This guesstimation
trick can be of use as a last resort when one cannot evaluate an
integral analytically, or even when the graph is only known
visually rather than as a formula, which can happen for instance
if it is the output of a physical measuring device.

12.4.3. To estimate n! using the “full width at half maximum”
        heuristic, let f(t) denote the integrand of the integral in
        problem 12.4.2.

        (a) Sketch a rough graph of f(t) from t = 0 to t = ∞.
        Hint: first consider the graphs of the factors
        separately.

        (b) Find the maximum, $f_{\max}$, of f(t) on this interval.

        We now need to find the t-values corresponding to half
        maximum, $f(t) = \tfrac{1}{2} f_{\max}$. We cannot directly solve this
        equation analytically. But we can approximate the
        solutions as follows.

        (c) Apply logarithms to both sides of this equation.

        (d) Replace the left hand side by the first two non-zero
        terms of its power series expansion about its
        maximum.

        (e) Find the desired t-values.

        (f) Conclude the estimation of n!.

        (g) Check the accuracy of this estimation for a few
        large values of n.

§ 12.5. Wallis’s product for π

Prerequisites: §5.

Wallis’s product expression for π says:

$$\frac{\pi}{2} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdot\frac{8}{7}\cdots$$

12.5.1. Argue that Wallis’s product follows from the expression in
        problem 5.1.10a. Hint: plug in a specific value for x.

12.5.2. The following is an alternative proof of Wallis’s product
        expression for π.

        (a) Evaluate $\int_0^{\pi/2} (\sin x)^{2m}\,dx$.

        (b) Evaluate $\int_0^{\pi/2} (\sin x)^{2m+1}\,dx$.

        (c) Divide the two results to find an expression for $\pi/2$.
        The expression contains the ratio of the two integrals.

        (d) What needs to be the limit of this ratio as m → ∞ for
        Wallis’s expression to follow?

        To establish this limit we need to estimate the ratio from
        above and from below.

        (e) Find one of these estimates by considering what
        multiplying by sin x does to the values of a function
        on the interval (0, π/2).

        (f) Find the other estimate by comparing
        $\int_0^{\pi/2} (\sin x)^{2m+1}\,dx$ and $\int_0^{\pi/2} (\sin x)^{2m-1}\,dx$.

        (g) Conclude the proof.

        (h) ? † Can you see why some people prefer this proof to
        that of problem 12.5.1?

§ 12.6. Power series by interpolation

Prerequisites: §5.

12.6.1. In problem 5.1.1 we argued that a first-degree polynomial
        can be made to go through two points, a second-degree
        polynomial to go through three points, and so on. Indeed,
        Newton constructed such a polynomial, namely a
        polynomial p(x) which takes the same values as a given
        function y(x) at the x-values 0, b, 2b, 3b, . . .. Here is the
        construction. First, our polynomial p(x) is supposed to have
        the same value as the given function y(x) when x = 0.
        Therefore we should start by setting p(x) = y(0). Next, we
        want p(x) to take the same value as y(x) when x = b. This
        is easily done by setting

        $$p(x) = y(0) + \frac{x}{b}\bigl(y(b) - y(0)\bigr).$$

        This polynomial obviously agrees with y(x) when x is 0 or
        b. Now we need to add a quadratic term to make it agree
        when x is 2b as well. We want the new term to contain the
        factor (x)(x − b) because then it will vanish when x is 0 or
        b, so our previous work will be preserved. If we set x = 2b
        in the piece of p(x) that we have so far we get

        $$p(2b) = y(0) + 2y(b) - 2y(0) = 2y(b) - y(0).$$

        So we want the quadratic term to have the value
        y(2b) − 2y(b) + y(0) at x = 2b.

        (a) Use this reasoning to write down a second-degree
        polynomial p(x) that agrees with y(x) when x is 0, b
        or 2b. (Keep the factor (x)(x − b) as it is, i.e., do not
        reduce the expression to the form $p(x) = A + Bx + Cx^2$.)

        In the same manner we could add a cubic term to make
        p(x) agree with y(x) at x = 3b, and so on.

        The formula becomes more transparent if we introduce
        the notation $\Delta y(x)$ for the “forward difference”
        y(x + b) − y(x), and $\Delta^2 y(x)$ for the forward difference of
        forward differences $\Delta y(x + b) - \Delta y(x)$, etc., so that

        $$\begin{aligned}
        \Delta y(0) &= y(b) - y(0) \\
        \Delta^2 y(0) &= \Delta y(b) - \Delta y(0) = y(2b) - 2y(b) + y(0) \\
        \Delta^3 y(0) &= \Delta^2 y(b) - \Delta^2 y(0) = y(3b) - 3y(2b) + 3y(b) - y(0) \\
        &\ \,\vdots
        \end{aligned}$$

        (b) Rewrite your formula for p(x) using this notation,
        and then extend it to the third power and beyond “at
        pleasure by observing the analogy of the series,” as
        Newton puts it.

        (c) Show that Taylor’s series

        $$y(x) = y(0) + y'(0)x + \frac{y''(0)}{2!}x^2 + \frac{y'''(0)}{3!}x^3 + \cdots$$

        is the limiting case of Newton’s forward-difference
        formula as b goes to 0.
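The forward differences defined in problem 12.6.1 are easy to tabulate numerically. The sketch below is only an illustration under stated assumptions: the helper `forward_differences`, the test function `y` and the step `b = 0.5` are arbitrary choices made here, not taken from the text. It builds the difference table column by column and compares the leading entries with the expanded expressions for Δ²y(0) and Δ³y(0) given above.

```python
def forward_differences(y, b, order):
    """Return [y(0), Δy(0), Δ²y(0), ..., Δ^order y(0)] for step size b."""
    column = [y(k * b) for k in range(order + 1)]   # y(0), y(b), ..., y(order*b)
    leading = [column[0]]
    for _ in range(order):
        # Each new column holds the differences of neighbouring entries.
        column = [column[i + 1] - column[i] for i in range(len(column) - 1)]
        leading.append(column[0])
    return leading

def y(x):
    """Arbitrary test function (an assumption made for this illustration)."""
    return x**3 - 2 * x + 1

b = 0.5
d = forward_differences(y, b, 3)

# The expanded forms quoted in the text.
expected_d1 = y(b) - y(0)
expected_d2 = y(2 * b) - 2 * y(b) + y(0)
expected_d3 = y(3 * b) - 3 * y(2 * b) + 3 * y(b) - y(0)

print(d)
print(expected_d1, expected_d2, expected_d3)
```

Building the table by repeated differencing rather than from the expanded formulas mirrors the recursive definition Δ²y(x) = Δy(x + b) − Δy(x), and extends to any order without writing out binomial coefficients.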
        This is indeed how Taylor himself proved his theorem in
        1715. The nowadays more common method of finding
        the series by repeated differentiation (as in problem 5.1.2)
        was used by Maclaurin in 1742.

§ 12.7. Path of quickest descent

Prerequisites: §2.1, §6.3, §7.4.

Consider a children’s playground slide, and suppose it is
covered with ice so that there is no friction. Which shape of the
slide will take you as quickly as possible from the starting point
to the endpoint? This is the brachistochrone problem, or the
problem of quickest descent. Johann Bernoulli solved this
problem in 1697 by means of the following ingenious optical
analogy.

Recall first from problem 6.3.1 that the speed of a particle
sliding down a ramp under the influence of gravity is proportional
to the square root of the vertical distance covered. Let us
therefore imagine slicing the space to be covered into thin
horizontal strips. In each strip the speed may be considered constant,
but from strip to strip it varies in the manner just outlined.
But this reminds us of problem 2.1.7, where we found how to
choose the quickest path in cases involving two mediums of
different speed.

12.7.1. Use this information to obtain a differential equation for
        the desired curve as follows.

        (a) What does problem 2.1.7 imply about the multi-layer
        case? Hint: apply the result once for every
        layer-crossing, and combine the results.

        (b) Use a limit argument to express the result in terms of
        the angle β and velocity v considered as continuous
        functions, rather than a step-by-step process.

        (c) Express the result in terms of x, y, y′, and use this to
        form the differential equation.

12.7.2. Check that an appropriately oriented cycloid satisfies the
        differential equation. (Cf. §7.3.3.)

§ 12.8. Isoperimetric problem

Prerequisites: §10, §11.3.

The isoperimetric problem asks: Among all figures with a given
perimeter L, which encloses the greatest area A? The
solution is the circle, which can also be expressed as an inequality:
$L^2 - 4\pi A \ge 0$.

We are going to give a proof of this fact inspired by problem
3.3.3: to wit, we could get information about how much area a
figure covers by considering how good it is at being hit by
needles.

But before we get to our proof we must explain two
preliminary details. First we note that a solution to the isoperimetric
problem must be “convex,” meaning that it has no cavities, so
to speak. Africa is not convex and its “convex hull,” the least
convex figure containing it, is obtained by snapping a rubber
band around it and taking that as the new figure:

[Figure: Africa and its convex hull.]

12.8.1. Explain why we can rule out non-convex figures as
        possible solutions to the isoperimetric problem.

Second, we need the concept of a “parallel curve.” Take a
convex figure and roll a circle on its boundary. The curve that is
being traced out by the midpoint of the circle is what we call a
parallel curve.

[Figure: a convex figure and its parallel curve.]

Or, if you prefer, dip the circle in paint and have it bounce
around with its midpoint trapped inside the figure. It will then
paint the parallel figure.

12.8.2. Prove that the area enclosed by the parallel curve is
        $A + Lr + r^2\pi$, where A is the area of the original figure and
        r is the radius of the circle used to construct the parallel
        curve.

        Hint 1: First prove the result for polygons, and then infer
        the general case.

        Hint 2: The new area is the old area plus the area of the
        strips plus the area of the shaded pieces.

        [Figure: a polygon with the strips and the shaded pieces.]

        This is all easy to calculate once we realise that the shaded
        pieces fit together to form a disc, because the sum of the
        angles of the shaded pieces is how much we turn when
        we walk around the figure once.

Now we are ready for our proof of the isoperimetric inequality.

12.8.3. Consider a figure of perimeter L. Now, instead of
        throwing needles at it, throw circles. And use circles of the same
        perimeter as the figure, i.e., with radius $r = \frac{L}{2\pi}$.
        We hope that this will give us information about the area
        of our figure (the more circles intersecting it, the greater
        the area), and we capture this intuition by considering
        the weighted area of the plane

        $$\iint_{\mathbb{R}^2} \#\,\text{intersections}\;dx\,dy.$$

        By this we mean that, to determine the weight of an
        infinitesimal square, we should put the midpoint of one of
        our circles there and count how many times it intersects
        the boundary of our figure.

        Only the area at most r away from the figure will be given
        nonzero weight, that is, the area inside the r outer parallel
        figure. Also, if we drop one of our circles with its midpoint
        anywhere inside this parallel figure, then it will intersect
        the boundary of our figure at least twice (there will be no
        zero-area in the middle, for the circle cannot fit inside the
        figure).

        (a) Therefore

        $$\iint_{\mathbb{R}^2} \#\,\text{intersections}\;dx\,dy
          \;\ge\; \iint_{\text{the parallel figure}} 2\,dx\,dy
          = 2\,\bigl(\underline{\qquad\qquad}\bigr).$$

        Now we will calculate the weighted area by brute force.
        Consider a point on the boundary of our figure. What
        circles intersect this point? It is of course the circles with
        midpoints r away from it.

        So consider an infinitesimal ds-segment of the boundary
        of our figure. What circles intersect this segment? This
        must then be circles with midpoints that lie in two strips
        like this:

        [Figure: the two strips of midpoints.]

        The strips are 2r high and have constant width ds, so
        the segment contributes a total amount of weighted area
        4r ds.

        (b) Summing these up,

        $$\iint_{\mathbb{R}^2} \#\,\text{intersections}\;dx\,dy
          = \oint_{\text{the boundary}} 4r\,ds = \underline{\qquad\qquad}.$$

        So now we have two expressions for the weighted area
        that we can combine.

        (c) Deduce from this the isoperimetric inequality.

§ 12.9. Isoperimetric problem II

Prerequisites: §11.

This section gives an alternative approach to the isoperimetric
problem of §12.8. It is based on the following physical
intuition. Where the earth is perfectly spherical, we can balance a
stick by putting it down perpendicular to the ground. Where
the earth is not spherical, on the side of a hill or so, putting
down a stick perpendicular to the ground will cause it to tip over.
Let’s agree that this experience convinces us that if we were
stranded on a non-spherical planet we could always find places
where putting a stick perpendicular to the ground would cause
it to tip over. So it is only for the sphere that gravity always acts
in the direction of the normal of the surface. Let’s agree also
that this still holds when the universe is flat, that is, when
planets are plane figures. To capture the mathematics of gravity, we
should think of this in terms of vector fields, and to make it
easier we consider the negative of the gravitational field: just take
the ordinary gravitational field and multiply the vectors by −1,
pretending that we are in a dual universe where gravity pushes
instead of pulls. Now, take all figures of a given area and fill them
with cement. Then they all produce the same amount of negative
gravity. This negative gravity flows out from the figure, but only
for the circle does it always flow out along the normal. For any
other figure, the negative gravity flows out askew, which we feel
is an inefficient use of perimeter. So perhaps, then, this will force
the perimeter to be greater than that of the circle of the same
area.

So how do we capture these ideas to make a proof? Well, the
amount of negative gravity being produced can be calculated
by summing what is flowing out

$$\int_{\text{the boundary}} \text{flow along the normal}\;ds.$$

We wish to show that a non-circular figure spreads the outflow
over greater perimeter. This would follow at once if we could
show that the flow along the normal from a point on a circle is
greater than the flow along the normal from any point on the
boundary of any figure of the same area. That would mean that
not only does a non-circular figure dilute the outflow at some
places but also that it cannot concentrate the flow somewhere
else. Indeed this is so, as we shall now calculate.

Fix a point and its normal:

[Figure: a point and its normal.]

We now have a given area to distribute under this point to make
the flow of negative gravity in the direction of the normal as
large as possible. How will an arbitrary infinitesimal square
contribute if we choose to include it?
The force it causes to act on the point is proportional to the
area divided by the distance (in the manner of problem 11.1.3).
Then the angle θ the force vector makes with the normal
determines what part of the force acts in the direction of the
normal: cos θ.

Suppose we have found the optimal shape of the area. Walk
straight ahead to the last infinitesimal square we included in
that direction, and do the same thing for an angle θ.

Say that the straight-ahead square is at a distance of $r_0$, and the
other square $r_\theta$.

12.9.1. (a) Argue that these two squares contribute the same
        amount of negative gravity flow in the direction of the
        normal.

        (b) Express this as an equation.

        (c) Infer that the figure is a circle.
        Hint: [Figure]
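The isoperimetric inequality L² − 4πA ≥ 0 of §12.8 can be checked numerically on a simple family of figures. The following sketch is an illustration only: the choice of regular polygons, the perimeter `L = 1.0` and the helper names are assumptions made here. It computes the "deficit" L² − 4πA for regular n-gons of fixed perimeter and for the circle of the same perimeter.

```python
import math

def regular_polygon_area(n, perimeter):
    """Area of a regular n-gon with the given perimeter."""
    side = perimeter / n
    return n * side**2 / (4 * math.tan(math.pi / n))

def isoperimetric_deficit(perimeter, area):
    """The quantity L^2 - 4*pi*A, which the inequality says is >= 0."""
    return perimeter**2 - 4 * math.pi * area

L = 1.0
sides = (3, 4, 6, 12, 100)
deficits = [isoperimetric_deficit(L, regular_polygon_area(n, L)) for n in sides]

# Circle with the same perimeter: radius r = L / (2*pi), area pi * r^2.
circle_area = math.pi * (L / (2 * math.pi))**2
circle_deficit = isoperimetric_deficit(L, circle_area)

for n, deficit in zip(sides, deficits):
    print(n, deficit)
print("circle", circle_deficit)
```

The deficit stays positive for every polygon and shrinks as the number of sides grows, while for the circle it vanishes up to rounding, in line with the circle being the extremal figure.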
13 FURTHER TOPICS

§ 13.1. Curvature

§ 13.1.1. Lecture worksheet

In this section we shall investigate a way of measuring how
curved a given curve is at a given point. We quantify this by
means of the curvature κ, which we define as the rate at which
the tangent to the curve is turning as we move along the curve.
If we let φ denote the angle the tangent makes with the x-axis,
we thus define $\kappa = d\varphi/ds$.

13.1.1. Express the curvature κ in terms of x and y.

        Hint: Recall the geometrically immediate facts that
        $\tan\varphi = \frac{dy}{dx}$ and $ds^2 = dx^2 + dy^2$.

13.1.2. What curves have curvature zero?

13.1.3. Compute the curvature of a circle of radius R.

13.1.4. Radius of curvature. Another way of characterising
        curvature is this. Imagine a curve and select a point on it. Now
        draw the normal to the curve at this point. Then pick a
        second point infinitesimally close to the first, and draw
        the normal at this point as well. Let R be the distance
        from the original point to the point where these two
        normals intersect; this is called the radius of curvature.

        (a) Using problem 13.1.3, what is the relation
        between κ and R in the case of a circle?

        (b) † Argue that this relation holds generally for any
        curve. Hint: approximate the curve locally by a
        circle.

§ 13.1.2. Problems

13.1.5. An example showing how the idea of radius of curvature
        occurs naturally in physical problems is the elastica: the
        shape of a bent elastic beam. The simplest case is that of
        a beam fixed vertically at its foot and weighed down with
        such a weight that the tangent at its endpoint is
        horizontal.

        [Figure: the bent beam in the xy-plane, with the arc elements ds, the extension dds, and the radius of curvature R marked.]

        We can obtain a differential equation for the elastica as
        follows. The outer side of the elastic beam is thought
        of as consisting of springs, while the inner side is taken
        to maintain fixed length. When the beam is bent by the
        weight, the spring in a given position extends by dds. The
        extension is proportional to the force acting on the spring
        (this is the so-called Hooke’s law of springs; cf. problem
        6.4.3). This force is found by thinking of the remainder of
        the beam as the arm of a lever, through which the weight
        acts. Since the force of the weight is vertical, the
        horizontal component of the beam is the effective lever arm.
        Thus the extension dds is proportional to the horizontal
        position x. On the other hand it is evident that the
        extension is inversely proportional to the radius of curvature R
        defined by the two normals drawn.

        (a) Obtain a differential equation (expressed in terms of
        x, y and their derivatives) for the elastica by equating
        these two expressions for the extension.

§ 13.1.3. Reference summary

Curvature: $\kappa = \dfrac{y''}{(1 + (y')^2)^{3/2}}$ = rate at which angle of tangent is
turning.

Radius of curvature: $R = 1/|\kappa|$ = radius of circle best
approximating curve at given point = distance to point of intersection
of two successive normals to the curve.

§ 13.2. Evolutes and involutes

§ 13.2.1. Lecture worksheet

In this figure, OI is the path traced by the end of a string
unwound from OE:
[Figure: the evolute OE and the involute OI.]

We say that OE is the evolute of the involute OI.

13.2.1. Argue that the length IE can be characterised in terms of
        curvature (§13.1).

This gives us a way of determining the evolute given the
involute. Suppose the involute OI is given parametrically as
(x(t), y(t)).

13.2.2. Find expressions for the coordinates (X, Y) of the point E
        in terms of x, y, ẋ, ẏ, κ, R.

To make concrete use of this we need parametric expressions
for curvature.

13.2.3. Express the radius of curvature R in terms of ẋ, ẍ, ẏ, ÿ in
        one of the following two ways.

        (a) From geometrical first principles, as in §13.1.

        (b) By expressing y′ and y″ parametrically and
        substituting into the formula found in §13.1. Hint: for y″, use
        chain and quotient rules.

13.2.4. Show that the evolute of a cycloid is another cycloid
        congruent to the first.

13.2.5. Argue that if you wish to make a pendulum bob follow a
        cycloidal path you should have it swing between cycloidal
        “cheeks.” Problem 7.4.5 shows why this is important.

§ 13.2.2. Problems

13.2.6. Evolutes and involutes can also be used to find the arc
        lengths of curves.

        (a) Argue that, in the above definition of evolute and
        involute, the arc OE = the line segment EI.

        (b) Use this to find the length of a cycloidal arc.

§ 13.3. Fourier series

§ 13.3.1. Lecture worksheet

Fourier series can be seen as a generalisation of the idea of the
scalar product. Consider on the one hand the ordinary vectors
$v_1 = (1, 1)$ and $v_2 = (1, -1)$, and on the other hand the functions
$w_1 = \sin(x)$, $w_2 = \sin(2x)$, $w_3 = \sin(3x)$, . . ., which we shall think
of as a kind of “vectors” as well. The vectors $v_1$ and $v_2$ form a
basis for $\mathbb{R}^2$, as we say, i.e., any vector in the plane can be
written in the form $av_1 + bv_2$ for some real numbers a and b. To
express some vector u in this form we go through the following
steps:

• “Normalise” the vectors $v_1$ and $v_2$ by dividing each vector
  by its length. Call the resulting unit vectors $v_1^*$ and $v_2^*$.

• Check that $v_1^*$ and $v_2^*$ are orthogonal (i.e., perpendicular)
  using scalar products.

• “Project” u onto $v_1^*$ and $v_2^*$ using scalar products.

• The fact that $v_1^*$ and $v_2^*$ are orthogonal ensures that u is
  decomposed into independent components, so adding
  up the two projections gives $u = av_1^* + bv_2^*$, as sought.

13.3.1. Carry out these steps for the vector u = (4, 1). Include a
        picture showing $v_1$, $v_2$, $v_1^*$, $v_2^*$, u, $av_1^*$ and $bv_2^*$.

We are now going to generalise this idea to “spaces” where
vectors are functions. To do so we essentially define the scalar
product of two functions f(x) and g(x) as $\int f(x)g(x)\,dx$ by
analogy with the usual scalar product $\sum f_i g_i$. More precisely,
when dealing with sine functions we are going to focus on
only one of their periods, so we define the scalar product to
be $\int_{-\pi}^{\pi} f(x)g(x)\,dx$.

13.3.2. Now let us try to carry out the same steps as above in
        this new setting.

        (a) Find the normalised vectors $w_1^*, w_2^*, w_3^*, \ldots$ (above
        you may have used Pythagoras’s theorem for this
        step, but now you will need to formulate the
        normalisation purely in terms of scalar products, using the
        usual relation between scalar products and lengths).

        (b) Check that these vectors are “orthogonal” (in the
        sense of scalar products).

        (c) Use the scalar product method to “project” $u_1 = x$
        onto $w_1^*, w_2^*, w_3^*, w_4^*, w_5^*, w_6^*$.

        This gives you real numbers $a_1, a_2, a_3, a_4, a_5, a_6$ such
        that

        $$u_1 \approx a_1 w_1^* + a_2 w_2^* + a_3 w_3^* + a_4 w_4^* + a_5 w_5^* + a_6 w_6^*$$

        In other words, you have approximated the function x by
        a sum of sine functions with various coefficients.
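The function-space version of the normalise, check, project recipe can be tried out numerically. The sketch below is one possible illustration, not the intended pen-and-paper solution: it approximates the scalar product ∫ f(x)g(x) dx over (−π, π) by the midpoint rule, and the names `inner`, `sine`, `normalise`, `basis` and the number of integration steps are assumptions made here.

```python
import math

def inner(f, g, steps=4000):
    """Scalar product of f and g: midpoint-rule estimate of the
    integral of f(x)*g(x) over the interval (-pi, pi)."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * g(x)
    return total * h

def sine(n):
    """The 'vector' w_n(x) = sin(n*x)."""
    return lambda x: math.sin(n * x)

def normalise(f):
    """Divide f by its length, where length^2 = inner(f, f)."""
    length = math.sqrt(inner(f, f))
    return lambda x: f(x) / length

# Normalised sine 'vectors' w_1*, ..., w_6*.
basis = [normalise(sine(n)) for n in range(1, 7)]

# Project u_1(x) = x onto each basis 'vector'.
u1 = lambda x: x
coeffs = [inner(u1, w) for w in basis]

def approximation(x):
    """The sum a_1 w_1*(x) + ... + a_6 w_6*(x)."""
    return sum(a * w(x) for a, w in zip(coeffs, basis))

print("length check:", inner(basis[0], basis[0]))
print("orthogonality check:", inner(basis[0], basis[1]))
for x in (0.5, 1.0, 2.0):
    print(x, approximation(x))
```

Defining the normalisation through `inner(f, f)` keeps the code parallel to the text, where lengths are to be expressed purely in terms of scalar products; the same functions can be reused with a different `u1` to experiment with other functions.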
        (d) Plot the function
        $a_1 w_1^* + a_2 w_2^* + a_3 w_3^* + a_4 w_4^* + a_5 w_5^* + a_6 w_6^*$
        and note that it somewhat approximates x on
        the interval (−π, π).

        (e) Use the same method to obtain an approximation for
        $u_2 = x/|x|$. Plot x/|x| and its approximating function
        in the same graph.

        (f) ? Why can we not obtain power series
        approximations in a similar way (i.e., by “projecting” a function
        onto $x, x^2, x^3, \ldots$ with the integral scalar product to
        find its Taylor coefficients)? Explain in terms of the
        “vectors” $x, x^2, x^3, \ldots$, and give an analogy with
        vectors in $\mathbb{R}^2$ that illustrates the problem.

This method of approximating functions by trigonometric
series has an interesting physical meaning in terms of sounds.
Functions of the form sin(nx) are “pure notes”: they describe
the vibrations of tuning forks. The fact that any function
can be approximated by such trigonometric functions thus
corresponds to the fact that any sound (say, for example,
Beethoven’s ninth symphony, including the chorus!) can be
produced by nothing but tuning forks. The numbers $a_n$ are
telling us how hard to strike each tuning fork. There is also a
converse physical meaning: a tuning fork will start vibrating
spontaneously whenever its tone is being played (“sympathetic
resonance”). The human ear is based on this principle. It
consists of many hairs that are in effect tuning forks sensitive to a
particular frequency. When a sound arrives which includes this
frequency as a component, the hair will vibrate with a strength
$a_n$ determined by the strength of that tone in the sound heard.
Thus the information sent to the brain is the coefficient $a_n$,
so you have been computing scalar products all your life, as
it were, whether you were aware of it or not.

§ 13.4. Hypercomplex numbers

§ 13.4.1. Lecture worksheet

Generalising complex numbers into higher dimensions is
problematic. Three-dimensional numbers are in a sense
impossible, as the following argument shows.

13.4.1. Consider the set of all points, seen as “hypercomplex
        numbers,” in three-dimensional space with the usual
        (vector) notions of magnitude and addition. Suppose
        there is some way of multiplying such numbers which
        satisfies the usual laws of algebra. We shall now show that
        these assumptions are contradictory.

        For the usual laws of algebra to hold there must be a
        multiplicative identity, call it 1, and since we are in three
        dimensions there must be two other numbers of unit
        length, i and j, such that these three numbers are all
        mutually perpendicular.

        (a) Show that $|1 - i^2| = 2$.

        (b) Therefore what is $i^2$? And $j^2$? Hint: What is its
        distance to 1? To 0?

        (c) Show that if |u| = 1 then the angle between v and w is
        the same as that between uv and uw. Hint: first show
        that the mapping $z \mapsto uz$ is distance-preserving.

        (d) Infer from this that ij is perpendicular to 1.

        (e) Prove that $(ij)^2$ must be both 1 and −1, a
        contradiction.

        (f) Note that the contradiction is avoided if we sacrifice
        commutativity and make ij = −ji.

So if we continue this line of thought and try to salvage what
we can from the wreckage, then we must try to figure out what
number this mysterious ij can be. So far we know only that it is
perpendicular to 1 and equal to −ji.

13.4.2. (a) Show that it is also perpendicular to i and j.

        (b) Conclude that ij must go off in a “fourth dimension.”

So let us write ij = k and consider the set of all
four-dimensional numbers a1 + bi + cj + dk. The set of all such
numbers may be called quaternions (owing to their fourness) and
denoted H (after their discoverer, Hamilton, since Q is already
taken).

13.4.3. (a) Show that H is spatially closed, i.e., that any product
        involving 1, i, j, k can be expressed as a linear
        combination of 1, i, j, k without the need for any fifth
        dimension.

        (b) Show that the “multiplication table”

        $$i^2 = j^2 = k^2 = ijk = -1$$

        contains all the information needed to reduce any
        product of numbers in H down to standard form.

Now that the arithmetic of H is well defined it would be
straightforward to go back and check that it satisfies all the
various properties we desired of hypercomplex numbers except
for the commutativity of multiplication. So by “following our
nose” from our failure with three-dimensional numbers we
arrived at the next best thing.

§ 13.4.2. Problems

13.4.4. Quaternions once rivalled vectors, and as the following
        problem shows they are in some ways almost equivalent.

        (a) Show that if you take the quaternion product
        (ai + bj + ck)(di + ej + fk) and discard all its real terms then
        the result is the same as that of the corresponding
        vector product (ai + bj + ck) × (di + ej + fk). Hint:
        brute-force calculation is not the only way of seeing
        this.

        (b) Show that the scalar part discarded above is the
        negative of the corresponding scalar product.

13.4.5. The following argument gives a simpler and independent
        proof that there can in any case not be any very simple
        formula for three-dimensional multiplication.
        Surely any multiplication worthy of its name should at the
        very least satisfy the multiplicative property of the
        magnitude: $|z_1 z_2| = |z_1||z_2|$. Written out in terms of the
        components of the numbers this says

        $$\sqrt{a^2 + b^2 + c^2}\,\sqrt{d^2 + e^2 + f^2} = \sqrt{\alpha^2 + \beta^2 + \gamma^2}$$

        or by squaring

        $$(a^2 + b^2 + c^2)(d^2 + e^2 + f^2) = \alpha^2 + \beta^2 + \gamma^2$$

        where α, β, γ are some functions of a, b, c, d, e, f.

        (a) In the case of ordinary complex numbers (i.e.,
        c = f = γ = 0), what is α and β?

        These functions produce integer output for integer input.
        Assume that this holds for α, β, γ in three dimensions as
        well.

        (b) Show that then 15 must be a sum of three squares (of
        integers).

        (c) Show that it is not.

§ 13.5. Calculus of variations

§ 13.5.1. Lecture worksheet

Consider the problem of finding a function y(t) that extremises
an integral that depends on it,

$$\int f(y, \dot y, t)\,dt.$$

y(t) is an extremum if wiggling it a little causes no change in
the value of the integral, just as, in ordinary calculus, x is an
extremum of f(x) if wiggling it a little causes no change in f. For
this purpose, split the t-axis into infinitesimal segments dt/2,
and assume that y(t) is linear on these intervals.

        Then there are the changes caused by the changes in the
        derivatives on the sides of $y_k$, call them $\dot y_k$ and $\dot y_{k+1}$.
        When $y_k$ changes by $dy_k$, $\dot y_k$ changes by $dy_k/dt$, so it
        causes the change in the integrand of

        (b)

        which translates into a change in the integral of

        (c)

        since this change only applies for half the interval dt.
        Similarly, $\dot y_{k+1}$ changes by $-dy_k/dt$ and causes the change

        (d)

        So the equation for the change being zero altogether is

        (e)

This is the equation when a single value $y_k$ is altered. In
general y may be altered in any manner, meaning that any
number of $y_i$’s may be altered. To obtain the criterion for y being
stationary in this general case we must therefore sum the
previous equation over all k. In doing so one finds that each term
$\frac{\partial f}{\partial \dot y_i}$ is counted twice, which cancels the 2 in the denominator
and leaves simply

$$\frac{\partial f}{\partial y} - \frac{d}{dt}\left(\frac{\partial f}{\partial \dot y}\right) = 0.$$

This is the Euler–Lagrange equation of the calculus of
variations.

§ 13.5.2. Problems

13.5.2. In problem 6.3.2 we found an integral expression for the
        time taken to slide down a ramp as a function of its shape
        y(x).

        (a) Use the Euler–Lagrange equation to find a
        differential equation for the path of quickest descent.

        (b) As in problem 12.7.2, check that a cycloid solves this
        differential equation.
13.5.3. If we can write Newton’s equation F = ma in the form
of the Euler–Lagrange equation we can infer an integral-
variational formulation of the basic equation of motion.
Then y(t ) and ẏ(t ) are determined by the value of y(t ) at the Indeed, this is easily done: just let f = T + U , where
break points, let’s call them y 1 , y 2 , y 3 , . . .. Let the point y k vary, T = mv 2 /2 is the kinetic energy and U is the potential
so that we increase y k by dyk . function. In other words, U is a function whose deriva-
tives give the forces, as in U = −mg y for gravity, giving
constant gravitational acceleration m ÿ = ∂U
∂y = −mg .

What happens to the integral? (a) Show that in this case the Euler–Lagrange equation
reduces to F = ma.
13.5.1. First there is the direct change caused by the change in
y k . This change causes the integrand to increase by (b) In other words, to find the trajectory of a particle we
used to solve the differential equation F = ma, but
∂f now we can instead determine it by the equivalent
µ ¶
dyk
∂y k problem: . This is the so-called principle of
least action.
on this interval dt, and thus the integral by R R
(c) Instead of T +U dt, Euler uses the integral mv ds.
(a) Argue that this is equivalent.

100
(d) ⋆ Discuss Euler’s interpretation of this result: “Because of their inertia, bodies are reluctant to move, and obey applied forces as though unwillingly; hence, external forces generate the smallest possible motion consistent with the endpoints.”

§ 13.5.3. Reference summary

Find $y$ that extremises

\[ \int f(y, \dot{y}, t)\,dt \]

⇔ Solve the differential equation

\[ \frac{\partial f}{\partial y} - \frac{d}{dt}\left(\frac{\partial f}{\partial \dot{y}}\right) = 0 \]
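To see the summary at work, here is a minimal worked sketch. The integrand $f = \sqrt{1 + \dot{y}^2}$ (arc length, so the problem is that of the shortest curve between two points) is an assumption chosen purely for illustration and is not taken from the problems above.

```latex
% Illustrative example: shortest curve between two points.
\begin{align*}
f(y, \dot{y}, t) &= \sqrt{1 + \dot{y}^2}, \\
\frac{\partial f}{\partial y} &= 0, \qquad
\frac{\partial f}{\partial \dot{y}} = \frac{\dot{y}}{\sqrt{1 + \dot{y}^2}}, \\
\frac{\partial f}{\partial y} - \frac{d}{dt}\left(\frac{\partial f}{\partial \dot{y}}\right) &= 0
\quad\Longrightarrow\quad
\frac{d}{dt}\left(\frac{\dot{y}}{\sqrt{1 + \dot{y}^2}}\right) = 0
\quad\Longrightarrow\quad
\dot{y} = \text{constant}.
\end{align*}
```

So in this example the extremal is a straight line $y = mt + b$, as one would hope.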

A PRECALCULUS REVIEW

§ A.1. Coordinates

§ A.1.1. Lecture worksheet

The position of a point is characterised analytically by its coordinates (x, y), meaning its horizontal and vertical distance from some designated origin point (0, 0):

[Figure: a point (x, y) with its horizontal distance x and vertical distance y from the origin.]

By means of this device geometry is turned into algebra, so to speak. For example, a line can be described as all the points (x, y) that satisfy the relation y = mx + b for some fixed numbers m and b.

A.1.1. Explain why. Hint: a line is characterised by the property that any horizontal step always corresponds to the same number of vertical steps.

A.1.2. What kind of figure is $y = \sqrt{1 - x^2}$? Hint: consider x and y as legs of a right triangle.

A.1.3. What kind of figure is xy = 0?

A.1.4. ⋆ Rotate the figure in the previous problem by 45°. What is its equation now? Hint: What is a formula for combining x and y that will give you zero if you’re at (5, 0) or (0, 5)? What is a formula for combining x and y that will give you zero if you’re at (5, 5)? At (5, −5)?

§ A.1.2. Problems

A.1.5. (a) The enemy has one cannon movable along a shoreline $y = -\frac{1}{4}$ and one cannon located at a fortress off the shore at the point $\left(0, \frac{1}{4}\right)$. You must sail between them. What path should you choose to minimise the danger of being hit?

(b) † The positions of the cannons in fact correspond to (and the problem is meant to illustrate) certain theoretical notions that always apply to this kind of curve. Explain.

§ A.2. Functions

§ A.2.1. Lecture worksheet

A function y(x) (“y of x”) is a rule assigning a specific output y to a specific input x. In other words, to say that y is a function of x is to say that y depends on x, or is determined by x. We can picture a function as a kind of “machine” where you stick some input in one end, turn some cranks, and receive a processed version of the input out at the other end. Often this takes the form of a formula with x’s in it, into which one can “plug” whatever value for x to find the associated output value y(x).

For instance, f(x) = 2x − 1 is a function that doubles the input and subtracts 1. So f(3) = 5, for example. It is often useful to put this kind of information in a table for overview:

x      1   2   3   4   5   6    7    8
f(x)   1   3   5   7   9   11   13   15

This table can help us see for example a second way of characterising f(x) verbally, namely as an “odd-number machine,” so to speak:

A.2.1. When x is a positive integer, ______ is the ______ odd number.

A.2.2. What is the 127th odd number?

The notion and notation of a function f(x) is powerful in its flexibility and scope. For one thing, functions can be composed, meaning that the output of one function is the input of another, which is often a useful way of describing composite operations.

A.2.3. If d(x) is the number of dollars you get for x euros, and h(x) is the number of hot dogs you can buy for x dollars, what is h(d(x))?

A.2.4. If f(x) = 2x and g(x) = x/2, explain both in formulas and purely verbally why f(g(x)) = x and g(f(x)) = x.

Visually speaking, functions are curves. The value of the function for a particular x-value corresponds to the height of the function above the x-axis at that point:

[Figure: a curve whose height above the x-axis at x is f(x).]

This is called the graph of the function.

A.2.5. Draw the graph of $f(x) = x^2$ and show visually that a line can cut the curve in at most two points.

A.2.6. Draw the graph of $f(x) = x^3$ and show visually that a line can cut the curve in at most three points.

A.2.7. Find a fourth-degree graph which a line can cut in four points.

A.2.8. (a) “Quadratic functions are U’s and cubic functions are S’s.” Discuss.

(b) Do fourth-degree curves correspond to some letter of the alphabet?
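The “machine” picture of a function, the table of values for f(x) = 2x − 1, and the idea of composition from this worksheet can be tried out on a computer. The following is only a minimal sketch; the helper names `compose`, `double` and `halve` are made up for illustration.

```python
def f(x):
    """The 'machine' from the text: double the input and subtract 1."""
    return 2 * x - 1


# Reproduce the table of values for x = 1, ..., 8.
table = [(x, f(x)) for x in range(1, 9)]
for x, fx in table:
    print(x, fx)


def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def double(x):
    return 2 * x


def halve(x):
    return x / 2


# Feeding the output of one machine into the other, in either order.
double_after_halve = compose(double, halve)
halve_after_double = compose(halve, double)
print(double_after_halve(7), halve_after_double(7))
```

Composition is written here as a function that returns a new function, which mirrors the idea that the output of one machine becomes the input of the next.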
A function is called “even” if f(−x) = f(x), and “odd” if f(−x) = −f(x). Most functions are neither one nor the other, but for those that are it is often useful to be aware of these simple rules for how minuses behave, just as in ordinary arithmetic you wouldn’t start all over again to compute 53 × (−74) if you had just computed 53 × 74, or keep minding the sign at each step when evaluating $(-2)^5$.

A.2.9. Is $f(x) = x^n$ an even or odd function?

A.2.10. What is the visual meaning of a function being even or odd?

A.2.11. Illustrate with the graphs of $x, x^2, x^3, x^4$.

A.2.12. Make up some other even and odd functions by drawing graphs with two hands using two pens, both starting at the origin, one going right and one going left.

A.2.13. Speaking of the graphs of $x, x^2, x^3, x^4$, explain what happens graphically as the exponent increases ($x^5, x^6, x^7, \ldots$). Hint: What happens to $0.9^n$ as n becomes bigger and bigger? To $1.1^n$?

$f^{-1}(x)$ means the inverse function of f(x). It’s “f backwards,” or the function that “undoes” f. In symbols this means $f(f^{-1}(x)) = x$ and $f^{-1}(f(x)) = x$: if you do one and then the other you get back what you started with. Thus, for example, if f(x) = 2x then $f^{-1}(x) = x/2$. Another way of putting it is that, for $f^{-1}$, the output of f becomes the input and the input becomes the output.

Visually, the graph of $f^{-1}(x)$ is the graph of f(x) mirrored in the line y = x, since this transformation interchanges the roles of x and y, i.e., input and output.

[Figure: the graphs of f(x) and $f^{-1}(x)$ as mirror images in the line y = x.]

A.2.14. Consider the function $f(x) = x^2$.

(a) What is $f^{-1}(x)$?

(b) Find the graph of $f^{-1}(x)$ using the rule for graphs of inverse functions. What kind of curve is it?

(c) How can you see the same thing directly from the equation for $f^{-1}(x)$?

§ A.3. Trigonometric functions

§ A.3.1. Lecture worksheet

Trigonometric functions are so called for their applications to triangles (§A.7.8), but in the context of the calculus they take on a much broader significance. In fact, they are the language in which all kinds of periodic phenomena are best described. Nature exhibits obvious periodicity in phenomena such as day and night, summer and winter, and the ebb and flow of the sea. But more important examples still are the many kinds of invisible waves that constitute a big part of our lives, including light waves, sound waves, and virtually all man-made forms of wireless communication.

The periodic nature of the trigonometric functions is obvious from their unit-circle definition, which generalises their definition in terms of triangles:

[Figure: left, a right triangle with hypotenuse 1 and angle θ, showing cos θ, sin θ and tan θ; right, the point (cos θ, sin θ) on a unit circle at angle θ.]

On the left we see, in effect, the geometrical definitions of the trigonometric functions embedded in a coordinate system. On the right we see a further step of abstraction where the cosine and sine are simply defined as the x and y coordinate respectively of a point moving along a unit circle. In this way the functions are liberated from their trigonometric origins, as it were. In particular, in this way we can define their values for any angle, including angles that are too great to ever occur in a triangle. Defined in this manner, the sine and cosine become beautiful periodic functions that look like this:

[Figure: two wave-shaped graphs, each drawn for x from −π to 2π with values between −1 and 1.]

The iconic wave shape is unmistakeable, but to draw one of these graphs correctly we must also know how to start it off correctly, say when x is 0. We can do this by recalling a very useful fact (a quick and dirty trick has been the lazy mathematician’s friend since time immemorial), namely that sin x ≈ x when x is small.

A.3.1. Explain why this is so, using the unit circle definition of the sine.

A.3.2. Therefore, which graph is which of the two above?
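The rule of thumb sin x ≈ x for small x (with x in radians) from the lecture worksheet can be checked numerically. This is only a rough sketch; the helper name `sine_relative_error` and the sample values of x are chosen here for illustration.

```python
import math


def sine_relative_error(x: float) -> float:
    """Relative error made by replacing sin(x) with x (x in radians, x != 0)."""
    return abs(x - math.sin(x)) / abs(math.sin(x))


# The approximation improves rapidly as x shrinks.
for x in (1.0, 0.5, 0.1, 0.01):
    print(f"x = {x:<5} sin(x) = {math.sin(x):.6f}  "
          f"relative error = {sine_relative_error(x):.2e}")
```

The printout only illustrates the claim; the geometric reason behind it is the subject of problem A.3.1.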
A.3.3. What are some other features of the graphs that you can confirm using the unit circle definition?

In analytical trigonometry we measure angles not in degrees but in radians. This means that an angle is measured by the corresponding arc length of a unit circle. In short, when using radians, angle is the same thing as (unit-circle) arc:

[Figure: a unit circle (radius 1) in which an angle of θ radians subtends an arc of length θ.]

In calculus we always use radians rather than degrees. The precise reason for this is seen in problem 1.4.1, but we can already appreciate that radians is the superior angle measure. After all, the notion that a full revolution corresponds to 360° is an arbitrary social construction. Basing a theory on such arbitrary starting points leads to arbitrary repercussions later, as is hardly surprising. Radian angle measure, by contrast, doesn’t introduce any artificial conventions, but rather characterises angles by means intrinsic to geometry itself.

What are the inverses of the trigonometric functions? In the manner of §A.2 we can define them abstractly and denote them $\sin^{-1}(x)$ etc., as is sometimes done. However, it is also rewarding to think about their meaning more concretely. By definition $\sin^{-1}(x)$ inverts sin(x). What does this mean in terms of the geometrical definition of sin(x)? The sine takes an angle (or rather, as we have now learned to say, arc) as its input and gives a corresponding coordinate as its output. The inverse sine, $\sin^{-1}(x)$, does the opposite: it takes the coordinate as its input and tells you what the corresponding arc is. For this reason $\sin^{-1}(x)$ is also denoted arcsin(x).

Here, then, are the geometrical meanings of the inverse trigonometric functions:

[Figure: arccos x, arcsin y and arctan t shown as arcs of a unit circle, determined by the coordinates x and y and the tangent length t.]

A.3.4. Find the value of each of the following and illustrate with a figure.

(a) sin(π/4)

(b) arccos(−1)

(c) arctan(1)

(d) arctan(∞)

A.3.5. Argue that degree and radian angle measures can be interpreted as “observer’s viewpoint” and “mover’s viewpoint” respectively.

§ A.4. Logarithms

§ A.4.1. Lecture worksheet

Logarithms were first developed in the early 17th century as a means of simplifying long calculations. Long calculations were involved for example in navigation at sea, which was of increasing importance in this era. Indeed, the first ship of slaves from Africa to America set sail only four years after the publication of the first book on logarithms.

The essence of logarithms is that they turn multiplication into addition:

log(ab) = log(a) + log(b)    (L1)

This simplifies calculations because if you have to compute by hand it is much easier to add than to multiply. In this way logarithms “doubled the lifetime of the astronomer,” it was said at the time. Not so long ago, before the advent of pocket calculators, people still learned logarithms for this purpose in school. You can still see the traces of this today when you go to a used bookstore and look at the mathematics section: usually you will find there tables of logarithms published in the first half of the 20th century.

We can rediscover logarithms for ourselves in the following way. Consider a table of powers of some integer, such as 2:

n      1   2   3   4    5    6    ···
2^n    2   4   8   16   32   64   ···

A.4.1. Explain how for example 4 × 8 can be found using this table without actually performing any multiplication.

That’s a neat trick, but it only works for numbers that happen to occur in the bottom row. We need to be able to multiply any numbers. Fortunately it is not hard to extend the idea to produce a table without such big gaps.

A.4.2. Explain how.

Thus, to produce a table of a function that has the property (L1), all we have to do, it turns out, is to make a table of values for some exponential function $f(x) = a^x$ and then read it backwards. Logarithms are simply the inverse of exponentiation.

In our table we used the base 2, but any number would have worked. We get a different logarithm for each base, but all of them have the crucial property (L1). The logarithm associated with our table would be denoted $\log_2(x)$. It shall emerge later that a certain number e = 2.71828... is the mathematician’s favourite base, and that the associated logarithm is the most “natural” of all logarithms and therefore denoted ln(x).

A.4.3. (L1) is the defining property of logarithms, and the mother of all logarithm laws. Show how:

(a) The logarithm of 1 follows from (L1) by restricting one of its values to an identity element.

(b) The logarithm of an exponential expression follows from (L1) by regarding multiplication as repeated addition.
(c) The logarithm of a quotient follows from (L1) by con- eventually decay back to nitrogen. However, this may not
sidering that / cancels × and − cancels +. happen until many years later. 14 C decays in proportion
to the amount present at such a rate that in 5730 years
only half of the isotopes originally present have decayed
§ A.5. Exponential functions back into nitrogen.

(a) Use this information to express the amount y of 14 C


§ A.5.1. Lecture worksheet as a function of time t measured in years.

The essence of exponential functions, such as f (x) = 2x , is that Living organisms continually replace their carbon, so
they describe things that grow in proportion to their size. The their 14 C levels are kept at a constant level. However,
more you have the faster it grows. when a plant or animal dies, it stops replacing its car-
bon and the amount of 14 C begins to decrease through
A.5.1. Argue that rabbit populations and money both have this
radioactive decay. Thus any dead organic material can
property.
be dated on the basis of its 14 C content. For example,
A.5.2. Verify that f (x) = 2x has this property by considering how in antiquity precious treatises were written on parchment
f (x + 1) is related to f (x). (dried animal skin).

A.5.3. I put $1000 in a savings account earning 10% interest an- (b) A parchment manuscript has 80% as much 14 C as liv-
nually. ing material. Find an equation expressing the age of
the parchment.
(a) How much money do I have after 1 year? After 2
years? After x years? (c) Find the age of the parchment.
I want to know: How long will it take for my money to
double?
(b) Write down an equation involving the required time § A.6. Complex numbers
T.
There are two ways of tackling this equation. Nowadays § A.6.1. Lecture worksheet
we can simply:
(c) Have a computer or calculator solve the equation. “Complex numbers” are an expanded universe of numbers in
which any polynomial equation has as many roots as its de-
However, we can also make some progress by hand: gree. This is achieved by generalising ordinary real numbers
(d) Use logarithms to solve for T , i.e., write the equation into one more dimension. So in this universe numbers are no
as T = . . .. longer confined to one line or axis, but rather live in a whole
plane. These numbers are points in the plane, or, if you prefer,
Did we accomplish anything this way? You may say no, they are arrows pointing from the origin to that point:
because we still need a calculator to find the numerical
value of T , and we could already do that without loga-
rithms anyway.
(e) ? Explain why this method used to make more sense
when there were no calculators.
Nevertheless solving for T in this way does serve a pur-
pose in other contexts, where solving for T may be merely
a substep in a more complex investigation.
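Steps (c) and (d) of the savings-account problem can be compared side by side on a computer. The sketch below assumes annual compounding at a positive rate, as in the problem; the function names and the choice of bisection as “the computer’s method” are illustrative assumptions, not something prescribed by the text.

```python
import math


def balance(principal: float, rate: float, years: float) -> float:
    """Balance after `years` of annual compounding at interest rate `rate`."""
    return principal * (1 + rate) ** years


def doubling_time_bisection(rate: float, tol: float = 1e-9) -> float:
    """Solve (1 + rate)**T = 2 for T numerically by bisection (needs rate > 0)."""
    lo, hi = 0.0, 1.0
    while (1 + rate) ** hi < 2:      # widen the bracket until it contains T
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (1 + rate) ** mid < 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def doubling_time_log(rate: float) -> float:
    """Solve the same equation by hand with logarithms: T = log 2 / log(1 + rate)."""
    return math.log(2) / math.log(1 + rate)


print(balance(1000, 0.10, 1), balance(1000, 0.10, 2))
print(doubling_time_bisection(0.10), doubling_time_log(0.10))
```

The two routes should agree to within the bisection tolerance. The logarithmic form has the advantage of being a formula in the rate, which is what makes it usable as a substep elsewhere.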
Exponential functions also describe things that shrink in proportion to their size, of which radioactive decay is an important example.

A.5.4. Radiocarbon dating. Carbon is an essential atom in plants and animals. Plants absorb it through carbon dioxide in the atmosphere and animals absorb it through their food. A small portion of this carbon is in the form of the isotope $^{14}$C (“carbon-14”). This isotope is radioactive, meaning that it is in an unstable state and will eventually revert into another form without external influence. This unnatural state is created by cosmic radiation, which converts nitrogen into $^{14}$C. Left to its own devices, the $^{14}$C will

Now we ask ourselves: how does one add and multiply such numbers? The goal of generalisation is to retain old things but to think of them in new ways that give them wider applicability. Indeed, the following is a strange way of looking at ordinary multiplication, which, however, has the advantage that it generalises readily to two-dimensional numbers.

A.6.1. Let |z| denote the magnitude, or distance to the origin, of a number z, and let arg(z), the “argument” of the number, denote the angle it makes with the positive x-axis (measured in radians). Then:
z             |z|     arg(z)
2
−3
2 · (−3)

z             |z|     arg(z)
−1
−4
(−1) · (−4)

Generalising from this to two-dimensional complex numbers, we get the following rules. We add by “concatenation”:

[Figure: the arrows z and w placed one after the other, giving z + w.]

And we multiply by multiplying the lengths and adding the angles:

[Figure: z, w and zw, with |zw| = |z| · |w|.]

A.6.2. A multiplication by a complex number is sometimes called an “amplitwist.” Explain the rationale behind this name.

A.6.3. Use geometric reasoning to find the following in a simple way.

(a) $(1 - i)^{10}$.

(b) For what value of a are there multiple solutions to $\left(\frac{1-i}{a}\right)^n = \left(\frac{1-i}{a}\right)^{10}$?

(c) For this value of a, what is the next n beyond 10 for which the equation holds?

A.6.4. If $z = w^2$ we say that w is a square root of z. To answer the following questions it will be useful to picture geometrically the possible square roots of some complex numbers, such as −1 and 1 + i.

(a) If z and w are both square roots of the same complex number, which of the following must be true?

☐ arg(w) = −arg(z)
☐ $w = \bar{z}$
☐ |w| = |z|
☐ w = −z
☐ wz = 1

(b) Does any complex number always have precisely two distinct square roots?

Already we are realising that complex numbers have two faces. On the one hand they are points in the plane and can be characterised by their x and y coordinates. When we think of complex numbers in this way we write them as a + bi. The i is for “imaginary.” It is a number such that $i^2 = -1$, which certainly tests the imagination.

A.6.5. Explain why i corresponds to the point (0, 1).

On the other hand complex numbers can also be usefully characterised in terms of their length r and the angle θ that they make with the x-axis. In this case we write them in the “polar form” $re^{i\theta}$.

[Figure: the number $z = a + bi = re^{i\theta}$ in the plane, with a = Re(z) = real part of z, b = Im(z) = imaginary part of z, r = |z|, and θ = arg(z) = argument of z.]

These two notations make it easy to do algebra with complex numbers. In the a + bi notation we just do ordinary algebra with the added rule $i^2 = -1$. And in the polar form $re^{i\theta}$ we are free to use the usual laws of exponents. Indeed, note that the algebraic identity $r_1 e^{i\theta_1} r_2 e^{i\theta_2} = r_1 r_2 e^{i(\theta_1 + \theta_2)}$ reflects precisely our geometrical definition of multiplication of complex numbers. This convenient fit between algebra and geometry is why this notation is chosen.

A.6.6. Evaluate $e^{i\pi}$. Write your answer in the form of a formula relating “the five most famous numbers.”

A.6.7. Show that $e^{i\theta} = \cos\theta + i\sin\theta$. This is an important bridge between the two notations.

A.6.8. (a) Solve the equation $x^2 - 2x + 2 = 0$ in complex numbers using the usual solution formula for quadratic equations

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

and the fact that $\sqrt{-1} = i$.

(b) Express the roots in both notations.

(c) For one of the roots x, draw $x^2$, −2x, and 2 in the complex plane, and note what happens when you add them together.
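The fit between the two notations described in this worksheet (“multiply the lengths and add the angles” on the one hand, ordinary algebra with $i^2 = -1$ on the other) can be tried out with Python’s built-in complex numbers. This is a small sketch only; the function name `polar_product` and the sample numbers z and w are chosen for illustration.

```python
import cmath
import math


def polar_product(z: complex, w: complex) -> complex:
    """Multiply z and w geometrically: multiply the lengths, add the angles."""
    r_z, theta_z = cmath.polar(z)    # (length, angle in radians)
    r_w, theta_w = cmath.polar(w)
    return cmath.rect(r_z * r_w, theta_z + theta_w)


z = 1 + 1j      # length sqrt(2), angle pi/4
w = 2j          # length 2, angle pi/2

print(polar_product(z, w))   # geometric rule
print(z * w)                 # ordinary algebra with i**2 = -1
```

Up to rounding error the two printed numbers agree. Note that `cmath.phase` and `cmath.polar` report angles between −π and π, so for other choices of z and w the sum of the angles may differ from the reported angle of the product by a multiple of 2π.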
Their first triumph, however, was not the quadratic equa-
§ A.6.2. Problems
tions found in textbooks today but rather cubic ones,
i.e., equations of degree 3. For cubic equations there is
A.6.9. (Requires knowledge of derivatives and preferably power
a formula analogous to the common quadratic formula,
series.) The appearance of the number e in the polar form
namely the solution of y 3 = p y + q is
of complex numbers may seem mysterious.
v s v s
(a) Study again the justification for this notation given in
u u
tq
u3
³ q ´2 ³ p ´3 3 q
u ³ q ´2 ³ p ´3
y= + − + − − .
t
the lecture and argue that e could just as well be re-
2 2 3 2 2 3
placed by, say, the number 3 so far as this justification
is concerned. (a) Apply the formula to x 3 = 15x +4. The two cube roots
(b) However, make a case for e on the basis of its distinc- that arise are in fact equal to 2+i and 2−i . Check this!
tive property (e x )0 = e x , not shared by other exponen- (b) So what solution does the formula give? Is it correct?
tial functions.
The conclusion is that even if you think answers with i ’s
(c) Also make a case for e by expanding both sides of the in them are hocus-pocus you still have to admit that com-
identity in problem A.6.7 using the series in problem plex numbers are useful for answering questions about
5.1.2. ordinary real numbers as well.
(d) ? Are these two arguments for e different or is it the
same reason in different guises?
§ A.7. Reference summary
A.6.10. Explain what is wrong in the following argument:
p p p
§ A.7.1. Basic algebra
p
−1 = −1 −1 = (−1)(−1) = 1 = 1

Exponents:
A.6.11. (a) Prove from the geometric definition of complex
numbers that a(b + c) = ab + ac.
(b) ? Are there any other important algebraic rules we 1 1
··· x −2 = x −1 = x0 = 1 x1 = x x2 = x · x ···
need to derive in order to be justified in treating x2 x
the algebraic and geometric conceptions of complex
numbers as equivalent? 1 p 1 p
3
x2 = x x3 = x ···
A.6.12. When we know complex numbers we no longer need to
memorise the trigonometric addition formulas
ax
sin(α + β) = sin(α) cos(β) + sin(β) cos(α) a x a y = a x+y = a x−y (a x ) y = a x y (ab)x = a x b x
ay
cos(α + β) = cos(α) cos(β) − sin(α) sin(β) Fractions:
because we can easily rederive them by simply multiply-
ing e i α by e i β . A C
× =
AC A C
+ =
AD + BC
B D BD B D BD
(a) Show how. (Express the product in two different ways
AC A A A C
and identify real and imaginary parts.) = = ×
BC B B B C
(b) ? Is this proof circular? That is, is the algebraic ma- X B A 1
chinery it involve already rest on the sum formulas A
=X× = A×
A B B
for trigonometric functions in some implicit way? B

A.6.13. Complete the mathematical pun: Why did my stomach Roots:


q p i
hurt after Christmas? 64 , −49, 1+i
−1
1−i , or 1+i ? p p
n
a2 = a an = a (a positive)
A.6.14. With complex numbers we can solve any quadratic equa- p
p p p
r
tion, or so the textbooks tell us. But what kind of “so- a a
ab = a b =p
lutions” are these weird things with i ’s in them any- b b
way? Indeed, the first person to publish on complex
numbers, Cardano in his 1545 treatise Ars magna, called Polynomials:
them “as subtle as they are useless,” a sentiment perhaps
shared by students today. But despite this lack of parental a 2 + 2ab + b 2 = (a + b)2
love from their father, these underdog numbers gradu-
a 2 − 2ab + b 2 = (a − b)2
ally triumphed over adversity by proving themselves use-
ful again and again in field after field. a 2 − b 2 = (a + b)(a − b)

$a^3 - b^3 = (a - b)(a^2 + ab + b^2)$

$a^3 + b^3 = (a + b)(a^2 - ab + b^2)$

Quadratic formula:

\[ ax^2 + bx + c = 0 \implies x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

§ A.7.2. Pythagorean Theorem

[Figure: right triangle with legs a, b and hypotenuse c.]

\[ a^2 + b^2 = c^2 \]

Distance between two points $(x_1, y_1)$ and $(x_2, y_2)$:

\[ \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \]

Distance between two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$:

\[ \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} \]

§ A.7.3. Lines

Equation for a line:

y = mx + b
m = slope = $\frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$ = “rise over run”
b = y-intercept = y(0)

[Figure: lines with slopes m = 3, m = 1, m = 1/2, m = 0 and m = −1.]

lines with slopes $m_1, m_2$ parallel ⇐⇒ $m_1 = m_2$
lines with slopes $m_1, m_2$ perpendicular ⇐⇒ $m_1 = -1/m_2$

• Find the equation for a line with a given slope m passing through a given point $(x_1, y_1)$.

Fill in what you know in $y - y_1 = m(x - x_1)$ and then rewrite it in the form y = mx + b.

• Find the equation for a line passing through two given points $(x_1, y_1)$, $(x_2, y_2)$.

Find the slope $m = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$, and then proceed as in the previous problem.

§ A.7.4. Circles

Circle with center at origin:

\[ x^2 + y^2 = r^2 \implies y = \pm\sqrt{r^2 - x^2} \]

Circle with center at (a, b):

\[ (x - a)^2 + (y - b)^2 = r^2 \]

area = $\pi r^2$    circumference = $2\pi r$

§ A.7.5. Parabolas

Parabola with vertical axis:

\[ y = ax^2 + bx + c = A(x - B)^2 + C \]

A = “steepness”
A positive =⇒ upward or “happy” parabola
A negative =⇒ downward or “sad” parabola
B = x-value of axis of symmetry
C = vertical shift = distance of vertex from x-axis

• Convert a quadratic function given in the form $y = x^2 + bx + c$ into the form $y = A(x - B)^2 + C$.

Rewrite $x^2 + bx$ as $x^2 + bx + \left(\frac{b}{2}\right)^2 - \left(\frac{b}{2}\right)^2 = \left(x + \frac{b}{2}\right)^2 - \left(\frac{b}{2}\right)^2$.

• Convert a quadratic function given in the form $y = ax^2 + bx + c$ into the form $y = A(x - B)^2 + C$.

Divide by a, proceed as above to obtain $y/a = (x - B)^2 + C$, then multiply by a.

§ A.7.6. Factoring

Fundamental theorem of algebra: a polynomial of degree n can be factored into n linear factors: $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = k(x - r_1)(x - r_2) \cdots (x - r_n)$, where the roots r may be complex numbers or repeat occurrences of the same number.

• Factor a second-degree polynomial, $x^2 + Bx + C$.

If the coefficients and roots are simple numbers: determine a and b such that a + b = B and ab = C. The factorisation is (x + a)(x + b).
In general: Find the roots r 1 , r 2 of x 2 + B x + C = 0, e.g. using • Given f (x) as a formula, find f (whatever given expression).
the quadratic formula. The factorisation is (x − r 1 )(x − r 2 ).
Replace all occurrences of x in the formula for f (x) with the
• Factor a third-degree polynomial (or higher). given expression enclosed in brackets.
Find one root r by trial-and-error or educated guessing. Fac- • Evaluate a composite function f (g (x)).
tor out (x−r ). This can be done systematically by polynomial
Work “inside out”: first find g (x), then plug the result into
long division (see below), or, in many simpler cases, simply
f (x).
by writing (x−r )( x2+ x+ ) and determin-
ing by inspection what numbers need to go in the blanks to • Find the graph of a function f (x) given as a formula.
make this expression equal to the original cubic expression.
Pick some value for x, compute f (x), mark the point
• Divide one polynomial, p(x), by another, q(x). (x, f (x)). Repeat for various values of x. When the pattern
is clear, connect the dots.
To find “how many times q(x) goes into p(x),” first determine
the numbers a 1 and n 1 such that a 1 x n1 times the highest- • Infer the properties of the graph of a polynomial function.
degree term of q(x) equals the highest-degree term of p(x).
If the polynomial has a factor (x − a), the graph intersects or
Write down a 1 x n1 as the first term of the answer. Next com-
touches the x-axis at x = a.
pute a 1 x n1 q(x), and subtract the result from p(x). This
leaves the remainder of the division, r 1 (x). If the degree of the polynomial is n, the graph has no more
than n − 1 turning points, and no more than n intersections
Now repeat the process with $r_1(x)$ in place of $p(x)$. Keep repeating this process until it can't go any further, i.e., until the remainder is 0 or of lower degree than $q(x)$.

If the remainder is 0, then the answer gives the result of the division, i.e., $\frac{p(x)}{q(x)} = a_1 x^{n_1} + a_2 x^{n_2} + \cdots$.

If the remainder is $r_k(x) \neq 0$, then the remaining division that could not be carried out must be added to the answer, i.e., $\frac{p(x)}{q(x)} = a_1 x^{n_1} + a_2 x^{n_2} + \cdots + \frac{r_k(x)}{q(x)}$.

\[
\begin{array}{r|l}
     & a_1 x^{n_1} + a_2 x^{n_2} + \cdots \\ \hline
q(x) & p(x) \\
     & -a_1 x^{n_1} q(x) \\ \hline
     & r_1(x) \\
     & -a_2 x^{n_2} q(x) \\ \hline
     & r_2(x) \\
     & \vdots
\end{array}
\]

§ A.7.7. Functions and graphs

A polynomial is an expression of the form $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$. The numbers $a_i$ are the coefficients of the various x-terms.

A rational function is one polynomial divided by another.

An algebraic function is a function defined by an equation built up in any manner from the operations $+$, $-$, $\times$, $\div$, $\sqrt{\ }$, $\sqrt[3]{\ }$, . . ., the variables x, y, and numbers.

A transcendental function is a function that is not algebraic.

$f(x)$ even ⇐⇒ $f(-x) = f(x)$ ⇐⇒ graph symmetric in y-axis

$f(x)$ odd ⇐⇒ $f(-x) = -f(x)$ ⇐⇒ graph symmetric in y-axis except upside-down

with any line.

If the highest-degree term is x raised to an odd power, the function goes to +∞ for big x's and −∞ for big negative x's (for a positive leading coefficient; the signs are reversed for a negative one).

If the highest-degree term is x raised to an even power, the function goes to +∞ for big x's and for big negative x's (again for a positive leading coefficient).

• Recognise how the graph of a function closely related to $f(x)$ differs from that of $f(x)$.

$f(x) + c$ moves the graph c steps up.

$-f(x)$ flips the graph upside-down (i.e., mirrors it in the x-axis).

$f(-x)$ flips the graph the other way around (i.e., mirrors it in the y-axis).

$k f(x)$ stretches the graph by a factor k in the y-direction; the bigger the k, the "steeper" or "more accentuated" the graph becomes.

$f(kx)$ compresses the graph by a factor k in the x-direction (equivalently, stretches it by a factor 1/k); the bigger the k, the more "squeezed together" the graph becomes.

$f^{-1}(x)$ interchanges x and y, i.e., reflects the graph in the line y = x.

• Find the inverse of a function given as a formula $y = f(x)$.

Solve for the input variable x, i.e., rewrite the given equation in the form x = (something with y's). The right hand side is then the desired inverse function $f^{-1}(y)$. (Its variable is now called y. If you prefer to forget what x denoted when you started the problem and simply consider $f^{-1}$ in its own right, then you can simply replace all the y's by x's and you have $f^{-1}(x)$.)

Find the inverse of $f(x) = \left(\sin(\sqrt{x})\right)^2$.

$y = \left(\sin(\sqrt{x})\right)^2 \implies \pm\sqrt{y} = \sin(\sqrt{x}) \implies \arcsin(\pm\sqrt{y}) = \sqrt{x} \implies x = (\arcsin(\pm\sqrt{y}))^2 = (\arcsin(\sqrt{y}))^2$. So $f^{-1}(x) = (\arcsin(\sqrt{x}))^2$.
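The polynomial long division procedure described before §A.7.7 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `poly_divmod` and the convention of listing coefficients from the highest degree down are assumptions made here, not notation from the text.

```python
from fractions import Fraction

def poly_divmod(p, q):
    """Divide p(x) by q(x); both are coefficient lists, highest degree first.

    Returns (quotient, remainder) as coefficient lists, highest degree first.
    An empty or all-zero remainder means the division came out exactly.
    """
    q = [Fraction(c) for c in q]
    remainder = [Fraction(c) for c in p]
    quotient = []
    # Keep going until the remainder has lower degree than q(x).
    while len(remainder) >= len(q):
        factor = remainder[0] / q[0]      # next coefficient a_i of the answer
        quotient.append(factor)
        # Subtract a_i * x^(n_i) * q(x) from the current remainder.
        for i, c in enumerate(q):
            remainder[i] -= factor * c
        remainder.pop(0)                  # the leading term has been killed
    return quotient, remainder

# (x^3 - 2x^2 - 4) / (x - 3) = x^2 + x + 3 with remainder 5
quot, rem = poly_divmod([1, -2, 0, -4], [1, -3])
```

In the notation of the text, the result reads $\frac{x^3 - 2x^2 - 4}{x - 3} = x^2 + x + 3 + \frac{5}{x - 3}$.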
§ A.7.8. Trigonometry

Geometrical definitions of trigonometric functions:

[Figure: right triangle with angle θ, its hypotenuse, and the sides opposite and adjacent to θ]

\[ \sin\theta = \frac{\text{opposite side}}{\text{hypotenuse}} \qquad \cos\theta = \frac{\text{adjacent side}}{\text{hypotenuse}} \qquad \tan\theta = \frac{\text{opposite side}}{\text{adjacent side}} = \frac{\sin\theta}{\cos\theta} \]

Reciprocal trigonometric functions:

\[ \sec x = \frac{1}{\cos x} \qquad \csc x = \frac{1}{\sin x} \qquad \cot x = \frac{1}{\tan x} \]

Pythagorean property:

\[ \sin^2\theta + \cos^2\theta = 1 \]

Symmetry properties:

\[ \sin(-\theta) = -\sin(\theta) \qquad \cos(-\theta) = \cos(\theta) \qquad \tan(-\theta) = -\tan(\theta) \]

Area of triangle (C = angle opposite side c):

\[ \frac{\text{base} \times \text{height}}{2} = \frac{1}{2}\, ab \sin C \]

Law of sines:

\[ \frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c} \]

Law of cosines:

\[ a^2 = b^2 + c^2 - 2bc \cos A \]

Compound angle formulas:

\[ \sin(x + y) = \sin x \cos y + \cos x \sin y \]
\[ \sin(x - y) = \sin x \cos y - \cos x \sin y \]
\[ \cos(x + y) = \cos x \cos y - \sin x \sin y \]
\[ \cos(x - y) = \cos x \cos y + \sin x \sin y \]

Double angle formulas:

\[ \sin 2x = 2 \sin x \cos x \]
\[ \cos 2x = \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x \]

Half angle formulas (the sign is determined by the quadrant of x/2):

\[ \sin\frac{x}{2} = \pm\sqrt{\frac{1 - \cos x}{2}} \qquad \cos\frac{x}{2} = \pm\sqrt{\frac{1 + \cos x}{2}} \]

Radian angle measure: measuring angles by the length of the corresponding arc of a unit circle. In particular, a full revolution = circumference of unit circle = 2π.

• Convert θ° into radians.

$\theta \cdot \frac{2\pi}{360}$

• Convert θ radians into degrees.

$\theta \cdot \frac{360}{2\pi}$

Trigonometric table:

degrees   radians   sin     cos     tan
0°        0         0       1       0
30°       π/6       1/2     √3/2    1/√3
45°       π/4       1/√2    1/√2    1
60°       π/3       √3/2    1/2     √3
90°       π/2       1       0       undefined

§ A.7.9. Exponential functions

$y(t) = y_0 e^{kt}$: exponential growth/decay function.

$y_0 = y(0)$ = initial amount
k = growth/decay rate constant
(positive k ⇐⇒ growth, negative k ⇐⇒ decay)

"After one unit of time, λ times the initial amount remains" ⇐⇒ k = ln(λ).

[Figure: graph of $y = e^x$]

• Given an exponential growth/decay function, determine the time at which λ times the initial amount is present. (E.g., "half life," λ = 1/2.)

Set $y(T) = \lambda y(0)$ and solve for T.

• Given the initial value $y_0$ and one data point $y(t_1) = y_1$ of an exponential growth/decay function $y(t)$, find the expression for $y(t)$.

We need $y_0$ and k in the formula $y(t) = y_0 e^{kt}$. Fill in $y_0$, which is given, right away. Using the resulting expression, write out $y(t_1) = y_1$ and solve for k.
• Given two data points, $y(t_1) = y_1$ and $y(t_2) = y_2$, of an exponential growth/decay function $y(t)$, find the expression for $y(t)$.

We need $y_0$ and k in the formula $y(t) = y_0 e^{kt}$. Use this formula to write out $y(t_1) = y_1$, and solve the resulting equation for $y_0$. Also write out $y(t_2) = y_2$, and substitute the found expression for $y_0$, and then solve for k. Plug back into the expression for $y(t)$.

§ A.7.10. Logarithms

Logarithm as inverse of exponentiation:

$\log_a x$ = the inverse of $a^x$
= the number to which a needs to be raised to give x

$\ln = \log_e$

\[ y = \ln(x) \iff x = e^y \qquad y = \log_b(x) \iff x = b^y \]

Laws of logarithms (log can be any logarithm):

\[ \log(ab) = \log a + \log b \]
\[ \log(a/b) = \log a - \log b \]
\[ \log(a^b) = b \log a \]
\[ \log 1 = 0 \]

\[ \ln(e^x) = x \qquad e^{\ln x} = x \qquad \ln e = 1 \]

[Figure: graph of $y = \ln(x)$]

• Solve for x in an equation in which x occurs in an exponent.

Isolate the exponential expression on one side of the equation (move other terms to the other side and divide away the coefficient of the exponential term, if it has one), and take logarithms of both sides.

• Solve for x in an equation in which x occurs inside a logarithm.

Isolate the logarithmic expression on one side of the equation (move other terms to the other side; if there are several logarithmic terms, combine them using laws of logarithms; divide away the coefficient of the logarithmic term, if it has one), and write $e^{\text{left hand side}} = e^{\text{right hand side}}$.

§ A.7.11. Complex numbers

[Figure: the point $z = a + bi = re^{i\theta}$ in the complex plane, with a = Re(z) = real part of z, b = Im(z) = imaginary part of z, r = |z|, and θ = arg(z) = argument of z]

\[ i^2 = -1 \qquad e^{i\theta} = \cos\theta + i \sin\theta \]

\[ (\cos x + i \sin x)^n = \cos nx + i \sin nx \]

Arithmetic of complex numbers, algebraically: as ordinary algebra with added rule $i^2 = -1$.

Arithmetic of complex numbers, geometrically:

[Figures: the sum z + w and the product zw drawn in the complex plane, with $|zw| = |z| \cdot |w|$]

$\overline{a + bi}$ = complex conjugate of $a + bi$ = $a - bi$

• Solve a quadratic equation with complex roots.

Solve with the usual quadratic formula and when the root comes out as $\sqrt{-A}$, rewrite it as $\sqrt{A}\sqrt{-1} = \sqrt{A}\, i$.

Solve $z^2 - 2z + 4 = 0$.

$z = \frac{2 \pm \sqrt{4 - 16}}{2} = \frac{2 \pm \sqrt{-12}}{2} = \frac{2 \pm 2\sqrt{-3}}{2} = 1 \pm i\sqrt{3}$.

• Simplify a fraction with a complex number $a + bi$ as its denominator.

Multiply top and bottom of the fraction by the conjugate $a - bi$ of the denominator. After simplifying, there will be no i's left in the denominator.

Express $\frac{3}{1+5i}$ in the standard form $a + bi$.

$\frac{3}{1+5i} = \frac{3(1-5i)}{(1+5i)(1-5i)} = \frac{3 - 15i}{1 + 25} = \frac{3}{26} - \frac{15}{26} i$.

Simplify $\frac{1+i}{2} + \frac{1}{1+i}$.

$\frac{1+i}{2} + \frac{1}{1+i} = \frac{1+i}{2} + \frac{1-i}{(1+i)(1-i)} = \frac{1+i}{2} + \frac{1-i}{2} = 1$.
Simplify $\frac{3+2i}{3-2i}$.

$= \frac{(3+2i)^2}{(3-2i)(3+2i)} = \frac{5 + 12i}{9 + 4} = \frac{5}{13} + \frac{12}{13} i$.

• Convert from rectangular form $a + bi$ to polar form $re^{i\theta}$ or vice versa.

Direct visualisation often suggests how to find r and θ using basic geometry. For explicit formulas, see §7.5.1.

Write in polar form ($re^{i\theta}$): $1 - i$, $\sqrt{3} + i$, $2i$.

$\sqrt{2}\, e^{-i\pi/4}$, $2e^{i\pi/6}$, $2e^{i\pi/2}$.

Write $\frac{1+i}{\sqrt{3}+i}$ in polar form.

$= \frac{\sqrt{2}\, e^{i\pi/4}}{2e^{i\pi/6}} = \frac{1}{\sqrt{2}} e^{i(\pi/4 - \pi/6)} = \frac{1}{\sqrt{2}} e^{i\pi/12}$.

Write in the form $a + bi$: $e^{3\pi i/2}$, $2e^{\pi i/6}$, $3e^{i\pi/4}$.

$-i$, $\sqrt{3} + i$, $\frac{3}{\sqrt{2}} + \frac{3}{\sqrt{2}} i$.

Write $\frac{e^{\pi i}}{1-i}$ in the standard form $a + bi$.

$e^{\pi i}$ means a complex number with r = 1 and θ = π, so $e^{\pi i} = -1$. Thus

\[ \frac{e^{\pi i}}{1-i} = \frac{-1}{1-i} = \frac{-1}{1-i} \cdot \frac{1+i}{1+i} = \frac{-1-i}{1+1} = -\tfrac{1}{2} - \tfrac{1}{2} i \]

• Simplify expressions involving powers or roots of complex numbers.

Often useful in trickier cases: convert to polar form and use laws of exponents or geometric properties.

Simplify $(1 - i)^{12}$.

For the polar form of $1 - i$ we have $r = \sqrt{1^2 + (-1)^2} = \sqrt{2}$, $\theta = -\frac{\pi}{4}$, so for $(1 - i)^{12}$ we have $r = (\sqrt{2})^{12} = 2^6 = 64$, $\theta = -\frac{\pi}{4} \cdot 12 = -3\pi$, which is equivalent to $-\pi$. Thus $(1 - i)^{12} = -64$.

Write $(1 - \sqrt{3}\, i)^{11}$ in the form $a + bi$.

Let $z = 1 - \sqrt{3}\, i$. Observe that $|z| = 2$, and $\arg z = \theta = \frac{5\pi}{3}$, so $z = 2e^{\frac{5\pi}{3} i}$. Thus $z^{11} = (2e^{\frac{5\pi}{3} i})^{11} = 2^{11} e^{\frac{55\pi}{3} i} = 2^{11} e^{(18\pi + \frac{\pi}{3}) i} = 2^{11} e^{\frac{\pi}{3} i} = 2^{11} (\cos\frac{\pi}{3} + i \sin\frac{\pi}{3}) = 2^{11} (\frac{1}{2} + i \frac{\sqrt{3}}{2}) = 2^{10} (1 + i\sqrt{3}) = 1024 + 1024\sqrt{3}\, i$.

Simplify $\left(\frac{1 + \sqrt{3}\, i}{1 + i}\right)^8$.

$= \frac{2^8 (\cos(8 \cdot 60^\circ) + i \sin(8 \cdot 60^\circ))}{(\sqrt{2})^8 (\cos(8 \cdot 45^\circ) + i \sin(8 \cdot 45^\circ))} = \frac{2^8}{2^4} \cdot \frac{\cos 120^\circ + i \sin 120^\circ}{\cos 0^\circ + i \sin 0^\circ} = 2^4 \left(-\frac{1}{2} + i \frac{\sqrt{3}}{2}\right) = -8 + 8\sqrt{3}\, i$.

Simplify $\left(\frac{1}{2} + \frac{\sqrt{3}}{2} i\right)^{666}$.

$= (1(\cos 60^\circ + i \sin 60^\circ))^{666} = 1^{666} (\cos(666 \cdot 60^\circ) + i \sin(666 \cdot 60^\circ)) = \cos(111 \cdot 360^\circ) + i \sin(111 \cdot 360^\circ) = \cos 0^\circ + i \sin 0^\circ = 1$.

§ A.7.12. Physics

Basic arithmetic of motion:

distance = velocity × time

Newton's law:

$F = ma$
Force = mass × acceleration

Law of gravity:

\[ F = \frac{GMm}{r^2} \]

F = gravitational force (N)
M, m = masses of the two objects (kg)
r = distance between the objects (m)
G = gravitational constant ≈ $6.67 \cdot 10^{-11}\ \frac{\text{m}^3}{\text{kg} \cdot \text{s}^2}$

Law of gravity close to the earth's surface:

$a = g$
acceleration = constant
$g \approx 9.8\ \frac{\text{m}}{\text{s}^2}$
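The advice in §A.7.11 to convert to polar form before taking powers can be checked numerically. The sketch below uses Python's standard `cmath` module; the helper name `power_via_polar` is an assumption introduced here for illustration. Note that `cmath.polar` returns the argument in the range (−π, π], so $1 - \sqrt{3}\, i$ comes out with θ = −π/3 rather than 5π/3, which is the same angle up to a full revolution.

```python
import cmath

def power_via_polar(z, n):
    """Raise z to the integer power n by working in polar form r*e^(i*theta)."""
    r, theta = cmath.polar(z)              # r = |z|, theta = arg(z)
    # Laws of exponents: the modulus becomes r^n, the argument n*theta.
    return cmath.rect(r ** n, n * theta)

a = power_via_polar(1 - 1j, 12)            # text: (1 - i)^12 = -64
b = power_via_polar(1 - 3 ** 0.5 * 1j, 11) # text: 1024 + 1024*sqrt(3)*i
```

Because the computation uses floating point, the results agree with the exact answers only up to rounding error, so they should be compared with a small tolerance rather than with `==`.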
B LINEAR ALGEBRA

§ B.1. Introduction

§ B.1.1. Lecture worksheet

$7x + 3y$ is a "linear" expression, as opposed to anything involving $x^2$, $xy$, or such things of higher order. $7x + 3y$ is a "linear combination" of x and y: it's so much of the one plus so much of the other. Linear relationships between quantities are the simplest kind of relation, and they occur everywhere. Innumerable scenarios are modelled by linear systems of equations such as

\[ \begin{aligned} 5x + 2y &= 3 \\ x - 4y &= 1 \end{aligned} \]

or linear transformations such as

\[ \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x - y \\ 2x + y \end{pmatrix} \]

where some "state" (x, y) of some system is being transformed into a new state that is a linear transformation of the original state.

Linear algebra gives us concepts and techniques to deal with relationships of this type. At first sight these concepts may not seem very impressive or well-motivated, but the more you encounter these types of relationships "in the wild," the more you will come to agree that linear algebra really hit the nail on the head and singled out some structurally fundamental concepts and ideas. This makes linear algebra a bit hard to teach and learn in terms of motivation. The ideas of linear algebra are not well motivated by any one particular problem. Rather, they prove their worth in that they capture patterns that recur in a vast number of diverse situations. At first sight, the concepts of linear algebra might seem like complicated ways of saying simple things. I will show you various applications, but you may well object that any one of them you could have handled with simpler means instead of using the fancy words of linear algebra with their intricate definitions. This is true. But the point is not that the notions of linear algebra help you solve any one particular problem, but that they help us highlight patterns that occur in a vast range of problems across a multitude of contexts. The "start-up cost" of learning these concepts might seem like a high price to pay for the applications you get, but it's a long-term investment if there ever was one. Linear algebra is distilled experience. Mathematicians spent centuries working with linear relationships the hard way, and linear algebra is the box of insiders' tricks that emerged as the most common and fundamental structural patterns they encountered.

§ B.2. Matrices

§ B.2.1. Lecture worksheet

A matrix is an array of numbers such as $\begin{bmatrix} 3 & 4 \\ 7 & 1 \end{bmatrix}$. There is a certain rule for multiplying such things that turns out to be very fundamental. Let's learn the rule algebraically first, and then we shall see what it all means. The rule is given in the reference summary. We see that matrices are multiplied by multiplying rows of the left matrix by columns of the right matrix. To me, multiplying matrices is a tactile experience. I run my left index finger across the row and my right one down the column, tapping the numbers that are to be multiplied together.

B.2.1. Find
\[ \begin{bmatrix} 5 & 0 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ -2 & 0 \end{bmatrix} \]
and make sure you tap along with your fingers so that matrix multiplication becomes ingrained in your muscle memory.

We shall see many examples in which matrix multiplication represents the transition from one state to the next. For example, the population dynamics of a city can be characterised by how many people move between the city center and the suburbs. Perhaps mostly young people leave the suburbs to go live in the city, while the older generation tend to remain in the suburbs. This could mean for example that 80% of the suburban population stay and 20% move in a given decade. Those who move to the city perhaps do so on a more temporary basis, for example until they have children of their own. Thus we can imagine that 50% of the city population move and 50% stay in any given decade. This information is encoded in the matrix $\begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix}$, because if $x_n$ and $y_n$ represent the two populations after n decades, then

\[ \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} \]

B.2.2. (a) We also have
\[ \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix}^n \begin{bmatrix} x_k \\ y_k \end{bmatrix} = \begin{bmatrix} x_m \\ y_m \end{bmatrix} \]
where m = ____

(b) Which is the city population? x or y?

(c) The population distribution is in equilibrium when the city population is ____% of the suburban population.

So figuring out what happens with the populations over time is just a matter of multiplying by the transition matrix so many times. Below we shall return to this example and see how its long-term behaviour can be analysed.
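The rows-times-columns rule of §B.2.1 can be sketched in Python as follows. The function name `matmul` and the starting populations of 1000 each are assumptions made purely for illustration; exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply matrices stored as lists of rows.

    Entry (i, j) of the product is row i of A "tapped" against column j of B.
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

# Transition matrix of the population example, as exact fractions.
T = [[Fraction(8, 10), Fraction(5, 10)],
     [Fraction(2, 10), Fraction(5, 10)]]

state = [[1000], [1000]]          # hypothetical (x_0, y_0) as a column vector
next_state = matmul(T, state)     # (x_1, y_1) after one decade
```

Applying `matmul(T, ...)` again to `next_state` gives the populations after two decades, and so on, which is exactly the "multiply by the transition matrix so many times" idea of the text.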
§ B.3. Linear transformations

§ B.3.1. Lecture worksheet

Matrix multiplication also has a geometrical meaning. We can think of a matrix as a function that takes as its input a point in the plane written as a column vector $\begin{bmatrix} x \\ y \end{bmatrix}$. Then for example $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ leaves $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ alone but tilts $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ since

\[ \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

So the transformation looks like this:

[Figure: the unit grid before and after applying $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, with the unit-basis vectors dashed]

I have drawn the grid and dashed vectors here to emphasise that the matrix is determined by its effect on the two unit-basis vectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$. Its effect on for example $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ is just twice its effect on $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ plus once its effect on $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ since

\[ \begin{aligned}
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix}
&= \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \left( 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) \\
&= 2 \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
&= 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}
\end{aligned} \]

To summarise in general terms: the columns of a matrix represent its effect on the unit-basis vectors, and its general effect can be extrapolated from there.

B.3.1. Match each of the matrices
\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \quad \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} \quad \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \quad \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \quad \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \]
with its geometrical description: 90° clockwise rotation, 180° rotation, reflection in x-axis, magnification, projection onto y-axis, reflection in the line y = x.

B.3.2. Use this to find matrices A and B such that $AB \neq BA$. We say that matrix multiplication is "not commutative." Can you see the etymology of this term, and its connection with everyday phrases such as "I have a long commute to work"?

B.3.3. Use the above to find non-zero matrices A, B, C such that $AB = AC$ but $B \neq C$. Explain why this shows another way in which matrix algebra is unlike ordinary algebra.

The matrix $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is called the identity matrix since it changes nothing.

B.3.4. Which of the above matrices come back to I when multiplied by itself a certain number of times? How many multiplications does it take? Confirm by both calculation and geometrical interpretation.

§ B.3.2. Problems

B.3.5. Find the matrix representation of a reflection in the line y = 2x.

§ B.4. Gaussian elimination

§ B.4.1. Lecture worksheet

When faced with a system of linear equations such as

\[ \begin{aligned} x + 2y &= 0 \\ 3x - y &= 0 \end{aligned} \]

students are often tempted to "set them equal": $x + 2y = 3x - y$.

B.4.1. Explain why this is a bad idea in that it "destroys information."

A better strategy, which works for any system of linear equations, is to subtract a multiple of the first equation from the second so as to "kill" all the x's in the second equation.

B.4.2. Solve the same system by this strategy.

B.4.3. Use the same strategy to solve the system
\[ \begin{aligned} x - 2y &= 1 \\ 3x + 4y &= 8 \end{aligned} \]

Note that when solving systems of linear equations in this way one really only plays around with the coefficients; the x and the y are merely placeholders. Matrices make for a convenient bookkeeping device in such situations. Thus the last system above can be encoded by the coefficient matrix

\[ \begin{bmatrix} 1 & -2 & 1 \\ 3 & 4 & 8 \end{bmatrix} \]

and solved as follows. First subtract 3 times the first row from the second:

\[ \begin{bmatrix} 1 & -2 & 1 \\ 0 & 10 & 5 \end{bmatrix} \]

Divide the last row by 10:

\[ \begin{bmatrix} 1 & -2 & 1 \\ 0 & 1 & \frac{1}{2} \end{bmatrix} \]

Add 2 times the last row to the first:

\[ \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & \frac{1}{2} \end{bmatrix} \]

These steps are nothing but shorthand versions of standard ways of manipulating algebraic equations, most likely the same steps you took in problem B.4.3. Translating the final matrix back into equations again, this last matrix says $x = 2$ and $y = \frac{1}{2}$, so we have our solution.
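The row operations of §B.4.1 can be replayed step by step in Python. This is a minimal sketch for a system of two equations in two unknowns, assuming that neither pivot turns out to be zero; the function name `solve_2x2` is an assumption introduced here, and exact fractions are used so that the answer $y = \frac{1}{2}$ comes out exactly.

```python
from fractions import Fraction

def solve_2x2(aug):
    """Solve a 2x2 linear system from its coefficient matrix [a b | e; c d | f].

    Assumes both pivots are non-zero (no row swaps are attempted).
    """
    r1, r2 = ([Fraction(v) for v in row] for row in aug)
    # Subtract a multiple of row 1 from row 2 to "kill" the x in row 2.
    m = r2[0] / r1[0]
    r2 = [b - m * a for a, b in zip(r1, r2)]
    # Divide row 2 by its pivot so that the y-coefficient becomes 1.
    r2 = [v / r2[1] for v in r2]
    # Subtract a multiple of row 2 from row 1 to kill the y in row 1.
    m = r1[1]
    r1 = [a - m * b for a, b in zip(r1, r2)]
    # Divide row 1 by its pivot so that the x-coefficient becomes 1.
    r1 = [v / r1[0] for v in r1]
    return r1[2], r2[2]

x, y = solve_2x2([[1, -2, 1],
                  [3, 4, 8]])
```

For the system of problem B.4.3 the intermediate rows are exactly the matrices displayed in the text, and the function returns x = 2 and y = 1/2.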
This strategy for solving linear equations is two thousand years old: it was used in ancient China, where the coefficient matrix was represented and manipulated on a counting board. One is reminded of those Mancala board games where marbles are placed in a grid of pits on a wooden board: a delightfully concrete version of a matrix, and one very well suited for this kind of calculation. Let me give you a sample problem from an ancient Chinese text for practice. See if you can feel the marbles moving as you solve it in coefficient matrix (or "counting board") form.

B.4.4. "[We are to ascend a mountain carrying a weight of 40 dan] given one superior horse, two common horses, and three inferior horses. . . . The superior horse together with one common horse, the [group of two] common horses together with one inferior horse, and the [group of three] inferior horses together with one superior horse, are all able to ascend. Problem: How much weight do the superior horse, common horse, and inferior horse each have the strength to pull?"

§ B.5. Inverse matrices

§ B.5.1. Lecture worksheet

The inverse $A^{-1}$ of a matrix A is a matrix that "undoes" A, i.e., $A^{-1}A = AA^{-1} = I$. Computationally, we can find the inverse of a matrix A in the following way. First form the double matrix $[A|I]$ consisting of the given matrix A and the identity matrix of the same dimensions written to the right of it. Now perform row manipulations just as in §B.4 to transform A into I. When we perform these operations we are focussing on the left, or A, part of our double matrix. However, we are also performing the same operations on the right half of the double matrix. Thus the I we started with there will be turned into some other matrix B. I say that this B is in fact the sought inverse $A^{-1}$.

B.5.1. Prove this by arguing that solving for $A^{-1}$ in $AA^{-1} = I$ amounts to three separate system-of-equations problems like the ones studied in §B.4, and that the method just described is simply the Gaussian elimination way of solving them all at the same time.

For the special case of a 2 × 2 matrix the inverse is:

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

B.5.2. Name two ways in which you could prove this.

§ B.5.2. Problems

B.5.3. Cryptology. A spy is encoding four-letter text messages by first translating the letters into numbers according to the table below, then forming a column vector v from these numbers, then calculating Av, where
\[ A = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \]
and then translating the resulting column vector back into a four letter word. The spy sends the message CRAR. Decode the message.

a  b  c  d  e  f  g  h  i  j  k  l  m  n
1  2  3  4  5  6  7  8  9  10 11 12 13 14

o  p  q  r  s  t  u  v  w  x  y  z
15 16 17 18 19 20 21 22 23 24 25 26

§ B.6. Determinants

§ B.6.1. Lecture worksheet

Above we saw that for a 2 × 2 matrix,

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

The expression in the denominator is called the determinant,

\[ \det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]

= ± area of parallelogram spanned by $\begin{bmatrix} a \\ c \end{bmatrix}$ and $\begin{bmatrix} b \\ d \end{bmatrix}$
= area magnification factor of the transformation A

No wonder, then, that it appears in the denominator of the inverse, since the inverse must shrink any area magnification that occurred back down again.

We can see why determinants correspond to areas by observing (from direct computation with the algebraic definition) that

\[ \begin{vmatrix} k \cdot a & b \\ k \cdot c & d \end{vmatrix} = k \begin{vmatrix} a & b \\ c & d \end{vmatrix} \quad\text{and}\quad \begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & b - a \\ c & d - c \end{vmatrix} \]

B.6.1. Thinking of determinants as areas of parallelograms, in what terms are these rules best interpreted?

☐ base times (perpendicular) height formula
☐ reshaping a parallelogram like a stack of books
☐ reshaping a parallelogram like four sticks
☐ stacking parallelograms side to side
☐ introducing a third dimension/thickness
☐ a similar but scaled parallelogram
☐ the perimeter of the parallelogram

Hint: visualise for example how the transformations
\[ \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} \to \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} \quad\text{and}\quad \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} \to \begin{vmatrix} 1 & 3 \\ 0 & 3 \end{vmatrix} \]
affect area.
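The 2 × 2 inverse formula of §B.5.1 and the determinant of §B.6.1 translate directly into code. The sketch below is illustrative only: the function names `det2` and `inverse2` and the example matrix are assumptions, not taken from the text.

```python
from fractions import Fraction

def det2(M):
    """Determinant ad - bc of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    return a * d - b * c

def inverse2(M):
    """Inverse of a 2x2 matrix via the formula (1/(ad - bc)) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = Fraction(det2(M))
    if det == 0:
        # Zero area magnification cannot be undone.
        raise ValueError("determinant is 0, so the matrix has no inverse")
    return [[d / det, -b / det],
            [-c / det, a / det]]

M = [[2, 1], [5, 3]]        # hypothetical example with determinant 1
M_inv = inverse2(M)         # [[3, -1], [-5, 2]]
```

Multiplying `M` by `M_inv` in either order gives the identity matrix, which is one quick way to check an inverse obtained by hand.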
B.6.2. Argue that it follows that determinants are areas since $\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1$ is the area of the unit square and all other parallelograms can be built up from there by the above operations (or, conversely, can be brought back down to a unit square by these operations).

B.6.3. Technically, determinants are "signed areas" since they are sometimes negative, although their magnitude always corresponds to the area. This is reflected in determinant algebra by the fact that if we switch two columns the determinant changes sign. Compute the determinants of the transformations in problem B.3.1 and argue that a negative determinant corresponds to areas being "flipped upside down."

B.6.4. † ⋆ 3 × 3 determinants are volumes of parallelepipeds, the natural generalisation of the 2 × 2 case. They are computed by breaking them into 2 × 2 determinants as shown in the reference summary. Generalise the argument of problem B.6.2 to the 3 × 3 case to justify the interpretation of a 3 × 3 determinant as a volume.

The rules for manipulating determinants that we found above can be used to simplify calculations. For example, if we are looking for the determinant

\[ \begin{vmatrix} 1 & 1 & 5 \\ 0 & 0 & 1 \\ 2 & 2 & 11 \end{vmatrix} \]

we simply subtract twice the first row and once the middle row from the last row, which then becomes a row of all zeroes. Therefore the entire determinant is zero.

B.6.5. Explain why the last sentence is clear both computationally and geometrically.

The case of a determinant being zero is often of special interest. It means that the column/row vectors are "linearly dependent," i.e., one of them can be obtained by combining the others with certain coefficients, like the last row was a combination of the previous two in our example. So in such cases there is a kind of redundancy: the last vector "doesn't add anything new." This idea will be important later.

B.6.6. Select all that are true.

☐ If u, v, w are vectors such that {u, v}, {u, w}, and {v, w} are each linearly independent sets, then {u, v, w} is a linearly independent set.
☐ If the columns of a matrix are linearly dependent, then its determinant is zero.
☐ If the rows of a matrix are linearly dependent, then its determinant is zero.
☐ Just as every real number except 0 has a multiplicative inverse, so every square matrix that has no 0 entries has an inverse.
☐ A diagonal matrix commutes with anything: that is, if D is a diagonal square matrix then AD = DA for all matrices A of the same dimensions.

§ B.7. Eigenvectors and eigenvalues

§ B.7.1. Lecture worksheet

Above we discussed a population dynamics example where the movements each decade were described by a matrix,

\[ \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix} \begin{bmatrix} x_n \\ y_n \end{bmatrix} = \begin{bmatrix} x_{n+1} \\ y_{n+1} \end{bmatrix} \]

What will happen in the long run in this situation? It seems likely that the population distribution will eventually settle at an equilibrium, so that the number of people moving one way is equal to the number of people moving in the other direction. In equations this means

\[ \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} \]

or in other words

\[ \begin{aligned} .8x + .5y &= x \\ .2x + .5y &= y \end{aligned} \]

or

\[ \begin{aligned} -.2x + .5y &= 0 \\ .2x - .5y &= 0 \end{aligned} \]

The second equation is just minus one times the first, so there is really only one equation and two unknowns. Therefore there are infinitely many solutions. In such a situation we can pick any value for one of the variables, say $x = t$, and there will always be a corresponding value for y that solves the equation, in this case $y = \frac{2}{5} t$. We can then express all solutions in vector form as $\begin{bmatrix} t \\ \frac{2}{5} t \end{bmatrix}$, or $t \begin{bmatrix} 1 \\ 2/5 \end{bmatrix}$. Or, if we prefer to write it without fractions, $t \begin{bmatrix} 5 \\ 2 \end{bmatrix}$, since any constant multiple is absorbed by the parameter t, which runs through all numbers. So we see that an equilibrium is reached when the populations are in the proportions 5 to 2, i.e., when the city population is 40% of the suburban population. The parameter t reflects the fact that we did not know the total number of people to begin with, so we know only the proportions but not the absolute numbers.

In more general terms, a matrix identity $Ax = 0$, or $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, is really a system of linear equations,

\[ \begin{aligned} ax + by &= 0 \\ cx + dy &= 0 \end{aligned} \]

Geometrically, each equation is a line. These lines are either the same (as in the above example), or they intersect in one point. So there is either one solution or infinitely many.

B.7.1. Why are parallel lines not a possibility?

The difference between these two possibilities is reflected in the determinant of A:

\[ \det A = 0 \iff \text{number of solutions} = \infty \]
\[ \det A \neq 0 \iff \text{number of solutions} = 1 \]
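The 3 × 3 determinant worked out by row operations in §B.6.1 can be double-checked by breaking it into 2 × 2 determinants. The sketch below expands along the first row, which is one standard way of doing this; it is an assumption that the reference summary uses this particular expansion, and the function names are introduced here for illustration only.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def det3(M):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return (a * det2(e, f, h, i)
            - b * det2(d, f, g, i)
            + c * det2(d, e, g, h))

example = [[1, 1, 5],
           [0, 0, 1],
           [2, 2, 11]]
value = det3(example)   # the text argues via row operations that this is zero
```

A zero result agrees with the observation in the text that the last row is twice the first row plus the middle row, so the rows are linearly dependent.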
B.7.2. Explain why this is clear in terms of both the area and linear independence interpretations of the determinant.

I would like to generalise from the population example and consider any vector that, when multiplied by a matrix A, is sent to a multiple of itself,

\[ A \begin{bmatrix} x \\ y \end{bmatrix} = \lambda \begin{bmatrix} x \\ y \end{bmatrix} \]

Such a vector is called an eigenvector of A, and the number λ is the corresponding eigenvalue. In the population example we were interested in the special case λ = 1.

B.7.3. For each of the matrices in problem B.3.1, find all eigenvectors and eigenvalues both algebraically and by geometrical reasoning.

The eigenvector equation $Ax = \lambda x$ can also be rewritten as $Ax = \lambda I x$ and then $(A - \lambda I)x = 0$. This is precisely the kind of system we studied above. So we know that there is either one solution or infinitely many, and we can find out which by computing $\det(A - \lambda I)$. Obviously there is always the trivial solution $x = 0$, but we do not count 0 as an eigenvector. So eigenvectors occur precisely when there are infinitely many solutions, i.e., when $\det(A - \lambda I) = 0$.

So if we are looking for the eigenvectors and eigenvalues of our population matrix $A = \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix}$ we begin by solving the equation

\[ \det(A - \lambda I) = \begin{vmatrix} .8 - \lambda & .5 \\ .2 & .5 - \lambda \end{vmatrix} = (.8 - \lambda)(.5 - \lambda) - 0.1 = 0 \]

The two roots are λ = 1 and λ = .3. To find the corresponding eigenvectors we plug each of these values into $(A - \lambda I)x = 0$. For λ = 1 this gives $\begin{bmatrix} -.2 & .5 \\ .2 & -.5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, so $y = \frac{2}{5} x$, so the eigenvector is $t \begin{bmatrix} 1 \\ 2/5 \end{bmatrix}$ or $t \begin{bmatrix} 5 \\ 2 \end{bmatrix}$, as we already knew. For λ = .3 we get $\begin{bmatrix} .5 & .5 \\ .2 & .2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, so $y = -x$, so the eigenvector is $t \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.

B.7.4. How can you check these answers? Do so.

§ B.7.2. Problems

B.7.5. (a) Prove that if each column of a matrix A sums to 1 then A must have 1 as an eigenvalue. Hint: Show that you can create a row of zeroes in the matrix $A - I$ by applying row operations, and then consider what this means for the system of equations $(A - I)x = 0$. For ease of writing you may assume that A is a 3 × 3 matrix.

(b) What does this result mean in terms of systems where the entries of the matrix represent mutually exclusive probabilities (as in the city population example)?

B.7.6. The population of a species is divided into three age groups: child, adolescent, adult. Let the number of individuals in each group be encoded as a column vector
\[ \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} \]
where n is the generation. To obtain the population distribution in the next generation one multiplies by a transition matrix such as
\[ A = \begin{bmatrix} 0 & 2 & 4 \\ 1/16 & 0 & 0 \\ 0 & 1/4 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 0 & 2 & 4 \\ 1/4 & 0 & 0 \\ 0 & 1/2 & 0 \end{bmatrix} \qquad C = \begin{bmatrix} 0 & 1/2 & 1/2 \\ 1/4 & 0 & 0 \\ 0 & 1/2 & 0 \end{bmatrix} \]
Match each matrix with its corresponding real-world scenario among those listed below. Also, without calculations, by reasoning about the real-world meaning of eigenvectors and eigenvalues, deduce which eigenvalues and eigenvectors from the options provided should go with each scenario. Confirm your inferences computationally (perhaps using a computer). Possible eigenvectors:
\[ v_1 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \qquad v_2 = \begin{bmatrix} 8 \\ 2 \\ 1 \end{bmatrix} \qquad v_3 = \begin{bmatrix} 16 \\ 2 \\ 1 \end{bmatrix} \]
Possible eigenvalues: 0.5, 1.

(a) Good environmental conditions: matrix ____ eigenvalue ____ eigenvector ____

(b) Presence of deadly toxins: matrix ____ eigenvalue ____ eigenvector ____

(c) Presence of toxins that diminish fertility: matrix ____ eigenvalue ____ eigenvector ____

B.7.7. Consider the following model of an expanding economy. There are three variables: steel, food and labour. The production of each good consumes a part of what was produced the year before: a new unit of steel requires .4 units of existing steel and .5 units of labour; a new unit of food requires .1 units of existing food and .7 units of labour; producing (or maintaining) a unit of labour costs .8 units of food and .1 units of steel and labour.

(a) Represent this situation by a 3 × 3 matrix A such that
\[ A \begin{bmatrix} s_1 \\ f_1 \\ l_1 \end{bmatrix} = \begin{bmatrix} s_0 \\ f_0 \\ l_0 \end{bmatrix} \]
and explain why this matrix equation correctly represents the information given above. (Note that this equation is "backwards" in the sense that it expresses the "input as a function of the output," so to speak.)

(b) It seems people spend [more/less] time on agriculture than childcare.
(c) Compute the eigenvalues and eigenvectors of the matrix (perhaps by computer). If the proportions of s : f : l are 1 : ____ : ____ then the economy is [growing/shrinking] by ____% per year.

B.7.8. Consider an economy based on oil and steel. Extracting oil costs both steel and oil: steel for drills and pipes, and oil to run the pumps. Similarly, mining for steel requires steel drills and rails and oil-driven machinery. Extracting one unit of oil costs .04 units of oil and .08 units of steel. Extracting one unit of steel costs .04 units of oil and .01 units of steel. This is encoded in the matrix
\[ A = \begin{bmatrix} .04 & .04 \\ .08 & .01 \end{bmatrix} \]
There is a yearly market demand for 100 units of oil and 20 units of steel, which we may express by the matrix
\[ D = \begin{bmatrix} 100 \\ 20 \end{bmatrix} \]
Finally, $X = \begin{bmatrix} \text{oil} \\ \text{steel} \end{bmatrix}$ is a column vector expressing the yearly production quantities.

(a) What is the real-world meaning of AX? Hint: $A \begin{bmatrix} o_n \\ s_n \end{bmatrix} = \begin{bmatrix} o_{n-1} \\ s_{n-1} \end{bmatrix}$.

(b) What is the economic sense of the equation $(I - A)X = D$?

(c) Solve this equation for X and interpret your answer in real-world terms.

B.7.9. † When you perform a Google search, the order of the results is determined using matrix algebra. The basic idea is that if a web page contains n links to other pages then it "passes on" 1/n times its own importance to each of those sites. We can think of the web as the board of a board game, on which stacks of coins placed on each site are being moved around in this manner. One "turn" of all websites passing on their importance according to this principle can be encoded in a matrix A such that if x is the column vector of the importance of the websites then Ax is the column vector of importances after the passing on has taken place. An example is shown below.

[Figure: a web of four pages, numbered 1 to 4, with links between them]

\[ A = \begin{bmatrix} 0 & 0 & 1 & 1/2 \\ 1/3 & 0 & 0 & 0 \\ 1/3 & 1/2 & 0 & 1/2 \\ 1/3 & 1/2 & 0 & 0 \end{bmatrix} \]

We can now rank the pages by supposing that each website starts with equal importance, i.e.,
\[ x_0 = \begin{bmatrix} 1/4 \\ 1/4 \\ 1/4 \\ 1/4 \end{bmatrix}, \]
and applying the matrix A many times. The process can be thought of as modelling the behaviour of a "random surfer" who clicks the links of the page he is on with equal probability. The pages with the highest rankings are those which the random surfer ends up hitting most often.

(a) Compute $A^5 x_0$, $A^{20} x_0$ and $A^{100} x_0$, and notice that the results are stabilising at particular values. These are the relative importances of these pages according to Google's algorithm.

(b) Explain how this is related to eigenvectors and eigenvalues. Hint: Compare $A^{101} x_0 \approx A^{100} x_0$ with $Ax = \lambda x$.

(c) Show how the same ranking (and relative importances) can be found by an eigenvector calculation instead of computing powers of A. (Recall that if v is an eigenvector then so is any multiple of v.)

(d) Suppose the owner of page 4 tries to boost his ranking by creating a page 5 which links to page 4; page 4 also links to page 5. Does the new page 5 help page 4's ranking?

Consider this disconnected web:

[Figure: a disconnected web of five pages, numbered 1 to 5]

(e) Compute the eigenvalues and eigenvectors of this web. Why does the ranking strategy used above not work here?

(f) ⋆ Does this problem occur for any disconnected web? Does it ever occur for a connected web? Explain using the idea of a stable state and its meaning in terms of eigenvectors and eigenvalues.

(g) To fix this and other problems the actual Google matrix is not A but $0.85A + 0.15B$, where B is a matrix with all entries 1/N (where N is the number of pages). Explain how this can be interpreted in terms of the "random surfer" mentioned above.

(h) With this modification, rank the pages of the disconnected web using the eigenvector method.

§ B.8. Diagonalisation

§ B.8.1. Lecture worksheet

In this section we shall find a clever way of figuring out the power $A^n$ of a matrix without actually having to multiply it out so many times. This is done by finding the "diagonalisation" of A. A diagonal matrix is a matrix with all zeros except along the diagonal. Diagonal matrices are very convenient because

\[ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}^n = \begin{bmatrix} a^n & 0 \\ 0 & b^n \end{bmatrix} \]
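Since diagonalisation builds on eigenvalues and eigenvectors, it may help to confirm numerically the values found for the population matrix in §B.7.1. The sketch below solves $\det(A - \lambda I) = 0$ for a 2 × 2 matrix with the quadratic formula; the function name `eigenvalues_2x2` is an assumption introduced here, and the code only handles the case of real eigenvalues.

```python
import math

def eigenvalues_2x2(M):
    """Real roots of det(M - lambda*I) = lambda^2 - (a + d)*lambda + (ad - bc)."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("this matrix has complex eigenvalues")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

def apply(M, v):
    """Multiply the 2x2 matrix M by the column vector v = (x, y)."""
    (a, b), (c, d) = M
    x, y = v
    return (a * x + b * y, c * x + d * y)

A = [[0.8, 0.5], [0.2, 0.5]]
lam1, lam2 = eigenvalues_2x2(A)   # approximately 1 and 0.3
image = apply(A, (5, 2))          # approximately (5, 2): an eigenvector for lambda = 1
```

The values are floating-point approximations, so they match λ = 1 and λ = .3 only up to rounding error.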
B.8.1. Interpret this result geometrically.

Therefore we seek a diagonal representation of A, so that we can take its powers in a convenient way. I claim that in fact

\[
M = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1} A \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}
\]

where $\mathbf{v}_1$ and $\mathbf{v}_2$ are the eigenvectors of A written as columns. This is a splendid fact, because if we solve for A in this equation we obtain

\[
A = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1} \qquad (∗)
\]

and therefore

\[
A^n = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}^n \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1}
\]

In this way we need only three matrix multiplications instead of a hundred to compute $A^{100}$.

To prove my claim I only need to compute:

\[
M \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1} A \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1} A \mathbf{v}_1
= \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1} \lambda_1 \mathbf{v}_1
= \lambda_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} \lambda_1 \\ 0 \end{bmatrix}
\]

B.8.2. Justify each step in this calculation by matching the first, second, third, and fourth equalities with a corresponding justification:
□ definition of A
□ rule for scalar product of vector with itself
□ definition of M
□ simplifying by column operations
□ matrix multiplication computation
□ reasoning backwards: what input gives this output?
□ formula for inverse matrix
□ definition of eigenvector

In the same way one finds that $M \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ \lambda_2 \end{bmatrix}$, so M must be $\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$, as claimed.

If we apply this to the population example we find that

\[
\begin{aligned}
A^n &= \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}^n \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 5 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1^n & 0 \\ 0 & 0.3^n \end{bmatrix} \begin{bmatrix} 5 & 1 \\ 2 & -1 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 5 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0.3^n \end{bmatrix} \frac{1}{-7} \begin{bmatrix} -1 & -1 \\ -2 & 5 \end{bmatrix} \\
&= \frac{1}{7} \begin{bmatrix} 5 + 2 \cdot 0.3^n & 5 - 5 \cdot 0.3^n \\ 2 - 2 \cdot 0.3^n & 2 + 5 \cdot 0.3^n \end{bmatrix}
\end{aligned}
\]

In particular,

\[
\lim_{n \to \infty} A^n \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
= \frac{1}{7} \begin{bmatrix} 5 & 5 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
= \frac{1}{7} \begin{bmatrix} 5(x_0 + y_0) \\ 2(x_0 + y_0) \end{bmatrix}
\]

So no matter what the initial numbers $x_0$ and $y_0$ are, in the long run the ratio will stabilise at five sevenths in the suburbs and two sevenths in the city. We already knew that this was the equilibrium, but now we have confirmed explicitly that we always approach this equilibrium regardless of initial condition.

B.8.3. Diagonalise one or two of the matrices from problem B.7.3. Then use the diagonalisation to find a simple expression for an arbitrary power of the matrix $A^n$ and note that the answer could easily have been predicted geometrically.

§ B.8.2. Problems

B.8.4. Explain how formula (∗) from the text can be used to generate a matrix with given eigenvectors and eigenvalues. (This is useful for teachers designing exam problems. If you make up some matrices with simple numbers in them and compute their eigenvalues and eigenvectors you will find that these are typically not simple at all, so this is not a good way of designing manageable exam problems.)

B.8.5. Re-solve problem B.3.5 by finding the eigenvectors and eigenvalues through geometric reasoning and applying formula (∗).

B.8.6. Argue that (∗) can be interpreted geometrically as follows. To apply the transformation A is the same thing as to: (1) perform a change of variables so that the eigenvectors are the new basis vectors, (2) apply the transformation in this new coordinate system, where it is simply a dilation of each of the basis vectors, (3) revert back to the original variables.

B.8.7. The Fibonacci sequence is a sequence of numbers in which every number is the sum of the two previous numbers: 1, 1, 2, 3, 5, 8, 13, 21, . . ., or in symbols $F_n = F_{n-1} + F_{n-2}$. These numbers are found in many places in nature. For example, if you turn a pine cone or a pineapple upside-down and count the number of spirals going clockwise and counter-clockwise you will find that these are two consecutive Fibonacci numbers.

Fibonacci was a 13th century Italian mathematician who used this sequence and many other computational problems to show the superiority of the Arabic numerals that we use today over the Roman numerals still used in Europe at that time. He introduced his sequence by means of the following rabbit population scenario. Suppose each pair of adult rabbits produces one pair of baby rabbits per season. Next season these baby rabbits become adults and start producing offspring of their own. Thus the number of adult rabbit pairs $F_n$ in generation n equals all the rabbit pairs from last generation $F_{n-1}$ plus the new
rabbits produced by the rabbits who are reproductively active, i.e., the rabbits who were born at least two seasons ago, $F_{n-2}$. Thus $F_n = F_{n-1} + F_{n-2}$, as above.

We shall now obtain a formula for $F_n$ that will enable us to find for example $F_{1000}$ in one step instead of the thousand steps required to write out the entire Fibonacci sequence up to this point.

(a) Find a matrix A such that

\[
\begin{bmatrix} F_{n+1} \\ F_n \end{bmatrix} = A \begin{bmatrix} F_n \\ F_{n-1} \end{bmatrix}.
\]

The eigenvalues of this matrix are $a = \frac{1+\sqrt{5}}{2}$ and $b = \frac{1-\sqrt{5}}{2}$, and the corresponding eigenvectors are $\begin{bmatrix} a \\ 1 \end{bmatrix}$ and $\begin{bmatrix} b \\ 1 \end{bmatrix}$. Diagonalise A and find a formula for $F_n$ by computing a suitable power of A multiplied with $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

(b) This leads to $F_n =$ ______ (expressed as a formula in terms of a, b and n).

§ B.9. Data

§ B.9.1. Lecture worksheet

Linear algebra is very often used to process big data. Remarkably, many useful numerical techniques for dealing with big sets of data correspond to readily visualisable geometrical ideas concerning vectors. The following is an example of this.

I’m given a vector $\mathbf{d}$ and I am looking for its projection onto a particular plane, namely the plane spanned by the vectors $\mathbf{a}_1$ and $\mathbf{a}_2$. In other words, I am trying to find some combination of these vectors, say $x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2$, that is as close as possible to $\mathbf{d}$. I can formulate this in terms of matrices by forming a matrix A that has the vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ as columns. Then my problem becomes: choose $\mathbf{x} = (x_1, x_2)$ so that $A\mathbf{x}$ is as close as possible to $\mathbf{d}$.

[Figure: the vector d, its projection Ax = x1 a1 + x2 a2 onto the column space of A = [a1 a2], and the error vector d − Ax]

The best choice of $\mathbf{x}$ is characterised by the fact that the error vector $\mathbf{d} - A\mathbf{x}$ is perpendicular to $\mathbf{a}_1$ and $\mathbf{a}_2$. In other words, $\mathbf{a}_1 \cdot (\mathbf{d} - A\mathbf{x}) = 0$ and $\mathbf{a}_2 \cdot (\mathbf{d} - A\mathbf{x}) = 0$, or in matrix terms: $A^T(\mathbf{d} - A\mathbf{x}) = 0$, which can also be written $A^T A \mathbf{x} = A^T \mathbf{d}$. The only unknown in this equation is $\mathbf{x}$. We have thus reduced the geometrical problem of finding $\mathbf{x}$ to a straightforward matter of solving a system of equations numerically.

Now consider a data problem that at first sight appears unrelated to the above but in fact turns out to be the same thing. Let’s say I want to investigate the relation between the ages of male and female actors portraying couples in Hollywood movies. Here for example is a small data set:

movie               Brad Pitt’s age (t)   age of actress playing his wife (w)
Se7en               32                    22
Mr. & Mrs. Smith    41                    30
World War Z         49                    37

Does this data fit a linear relationship w = mt + b? Not exactly, but not far from it. The equations this data would satisfy if the relationship was linear would be:

\[
\begin{aligned}
b + m t_1 &= w_1 \\
b + m t_2 &= w_2 \\
b + m t_3 &= w_3
\end{aligned}
\]

or in matrix terms:

\[
\begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ 1 & t_3 \end{bmatrix} \begin{bmatrix} b \\ m \end{bmatrix} = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}
\]

In terms of our geometric example, this corresponds to $A\mathbf{x} = \mathbf{d}$. The b and the m are the unknowns $\mathbf{x}$ that we are looking for, and the wives’ ages are the data $\mathbf{d}$ that we are trying to capture. But just as in the geometric example, it is not possible to solve this equation $A\mathbf{x} = \mathbf{d}$. But we can solve the problem to the closest possible approximation by the same trick as above, that is, solve $A^T A \mathbf{x} = A^T \mathbf{d}$ instead. In the movie case this becomes:

\[
\begin{bmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \end{bmatrix} \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ 1 & t_3 \end{bmatrix} \begin{bmatrix} b \\ m \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}
\]

It is straightforward to write this out as a system of equations and solve for the unknowns b and m. This gives b = −1350/217 ≈ −6.22 and m = 383/434 ≈ 0.88. Thus w = 0.88t − 6.22 is the best linear fit for the given data. So Brad Pitt’s movie wives were, so to speak, born a bit over six years after him and furthermore age only 0.88 years for every one of his.

I used only three age pairs in this example, but the method works for a data set of any size. The three-dimensional visualisation enabled us to see the method in an intuitive way, but once we translated it into matrix language we could just as well extend it to any number of dimensions.

§ B.9.2. Problems

B.9.1. Find an interesting data set and perform a linear fit analysis using the matrix method as above.

B.9.2. Another example of “data geometry.” The correlation coefficient ρ captures “how correlated” two variables are:
[Figure: plots of data points $(x_k, y_k)$]

Each data point $(x_k, y_k)$ is an empirical observation such as, for example, a person’s salary ($y_k$) and the number of years of education that person has ($x_k$).

(a) Which plot is the likeliest depiction of this case?

(b) Come up with plausible real-world scenarios corresponding to the other plots.

The correlation coefficient can be formally defined and calculated using the idea that the scalar product measures the “amount of agreement” between two vectors. This is done as follows. Think of the data as two vectors $(x_1, x_2, x_3, \ldots, x_n)$ and $(y_1, y_2, y_3, \ldots, y_n)$. First we have to “center the data,” meaning shift it so that the mean of each variable is zero. To this end, compute the mean values $\mu_x = (x_1 + x_2 + x_3 + \ldots + x_n)/n$ and $\mu_y = (y_1 + y_2 + y_3 + \ldots + y_n)/n$, and now work with the vectors $(x_1 - \mu_x, x_2 - \mu_x, x_3 - \mu_x, \ldots, x_n - \mu_x)$ and $(y_1 - \mu_y, y_2 - \mu_y, y_3 - \mu_y, \ldots, y_n - \mu_y)$ instead. Now compute $\cos\theta$ for the angle between these vectors. This is the correlation coefficient.

(c) Carry out these steps for the first plot above. (Introduce a scale on the axes by taking the x-coordinate of the first point to be 1.)

(d) Visualise the vectors in question in three-dimensional space, and interpret the result visually.

§ B.10. Reference summary

§ B.10.1. Basic matrix algebra

\[
ABC = (AB)C = A(BC) \qquad A(B + C) = AB + AC
\]

• Add a matrix to a matrix, A + B.

Add entry-by-entry:

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a+e & b+f \\ c+g & d+h \end{bmatrix}
\]

\[
\begin{bmatrix} 1 & 3 \\ 2 & 0 \end{bmatrix} + \begin{bmatrix} 4 & 2 \\ 2 & -1 \end{bmatrix} = \begin{bmatrix} 1+4 & 3+2 \\ 2+2 & 0-1 \end{bmatrix} = \begin{bmatrix} 5 & 5 \\ 4 & -1 \end{bmatrix}
\]

• Multiply a matrix by a number, kA.

The number multiplies onto each entry:

\[
k \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} ka & kb \\ kc & kd \end{bmatrix}
\]

\[
3 \begin{bmatrix} 1 & 3 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 3 \cdot 1 & 3 \cdot 3 \\ 3 \cdot 2 & 3 \cdot 0 \end{bmatrix} = \begin{bmatrix} 3 & 9 \\ 6 & 0 \end{bmatrix}
\]

• Multiply a matrix by a matrix, AB.

Highlight the first row of A and the first column of B. Multiply the first entry in the row by the first entry in the column, the second entry in the row by the second entry in the column, and so on, and then add the results. Write down the result as the entry in the first row and first column of the answer.

Next highlight the second column of B instead and repeat the process. Keep going until you have exhausted all possible combinations of rows of A with columns of B. Each such combination gives another entry of the answer (namely that in that row and that column).

Note that in general $AB \neq BA$: the order matters when multiplying matrices.

\[
\begin{bmatrix} 2 & 1 \\ 3 & -5 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 1 & -1 \end{bmatrix}
\]

\[
\begin{aligned}
\text{first row, first column:} \quad & 2 \cdot 0 + 1 \cdot 1 = 1 \\
\text{first row, second column:} \quad & 2 \cdot 2 + 1 \cdot (-1) = 3 \\
\text{second row, first column:} \quad & 3 \cdot 0 + (-5) \cdot 1 = -5 \\
\text{second row, second column:} \quad & 3 \cdot 2 + (-5) \cdot (-1) = 11
\end{aligned}
\]

So altogether the result is that

\[
\begin{bmatrix} 2 & 1 \\ 3 & -5 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ -5 & 11 \end{bmatrix}
\]

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{bmatrix}
\]

§ B.10.2. Linear transformations

Geometrically, matrices represent linear transformations, meaning transformations that preserve lines. Such transformations are rotations, reflections, dilations (i.e., magnifications, or scalings), linear projections, and combinations of these. Matrix multiplication always leaves the origin intact so translations (i.e., vertical or horizontal displacements) are not included.

Algebraically, this corresponds to the multiplication $A \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} X \\ Y \end{bmatrix}$. A transforms the input point (x, y) into some output point (X, Y).

The first column of A is its effect on the unit vector $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, the second on the unit vector $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.

Matrix multiplication AB represents the composition of the linear transformations A and B in the order “first apply B then apply A” (since input vectors are “plugged in on the right”).

• Find the matrix representing a given linear transformation specified in geometrical language (rotation, reflection, etc.).

Determine the effect of the transformation on the unit basis vectors. Write the results as the columns of a matrix. This is the sought transformation matrix.

Find the 2 × 2 matrix representing a reflection in the line y = −x.

\[
\begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}
\]

Find the 3 × 3 matrix representing a reflection of three-dimensional space in the plane y = z.

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\]

• Characterise geometrically the effect of a given matrix A.

Calculate A’s effect on the unit basis vectors ($A \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and so on), and picture the resulting vectors along with the original unit basis vectors. Determine by visualisation what linear transformation sends the latter vectors onto the former. This is the answer.

What geometric transformation is $\begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$?

45° counterclockwise rotation.

What geometric transformation is $\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$?

90° rotation about the z-axis, clockwise as seen from “above” (positive z position).

§ B.10.3. Gaussian elimination

Correspondence between matrices and systems of linear equations:

\[
\begin{aligned} ax + by &= c \\ dx + ey &= f \end{aligned}
\quad \leftrightarrow \quad
\left[\begin{array}{cc|c} a & b & c \\ d & e & f \end{array}\right]
\]

Gaussian elimination rules:

– You may add or subtract any row from any other row.

– You may multiply or divide any row by any nonzero number.

– You may switch places of any two rows.

The goal of Gaussian elimination is generally to turn the matrix into identity-matrix form, or at least upper-triangular form (i.e., having all 0’s in the bottom-left half below its principal diagonal).

• Solve a system of linear equations.

Translate the system into matrix form as above. Apply Gaussian elimination to turn the matrix (the part of it before the bar) into upper-triangular form. Translate the last row back into an ordinary equation. This gives you the value for the last variable. Translate the next-to-last row back into an ordinary equation, and plug in the value for the last variable. This gives you the value for the next-to-last variable. And so on.

If you reach an equation of the form 0 = a where $a \neq 0$: the system has no solutions.

If you reach an equation of the form 0 = 0: the system has infinitely many solutions. Instead of entering a specific value for the variable corresponding to this equation, set it equal to a parameter, such as t, and proceed as usual (this means that the other variables will become expressed in terms of t also). Your formulas for the values of the variables give a solution for the system of equations for any value of t you plug into it.

• Determine whether a system of linear equations has 0, 1, or infinitely many solutions.

Attempt to solve the system as above and see the rules for the number of solutions there.

\[
\begin{aligned} 3x \phantom{{}-y} &= 27 \\ 2x - y &= 0 \end{aligned}
\]

\[
\left[\begin{array}{cc|c} 3 & 0 & 27 \\ 2 & -1 & 0 \end{array}\right]
\sim \left[\begin{array}{cc|c} 1 & 0 & 9 \\ 2 & -1 & 0 \end{array}\right]
\sim \left[\begin{array}{cc|c} 1 & 0 & 9 \\ 0 & -1 & -18 \end{array}\right]
\sim \left[\begin{array}{cc|c} 1 & 0 & 9 \\ 0 & 1 & 18 \end{array}\right]
\]

Hence x = 9 and y = 18.
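The elimination and back-substitution steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the function name `solve_linear_system` is hypothetical, exact fractions are used to avoid rounding, and the code assumes a square system with exactly one solution (it raises an error otherwise rather than introducing a parameter t).

```python
from fractions import Fraction


def solve_linear_system(augmented):
    """Solve a square system given as an augmented matrix [A | b].

    Hypothetical helper: assumes a unique solution and raises ValueError otherwise.
    """
    rows = [[Fraction(entry) for entry in row] for row in augmented]
    n = len(rows)
    for col in range(n):
        # Switch rows if necessary so that the pivot entry is nonzero.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Subtract multiples of the pivot row from the rows below it.
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
    # Back substitution: last row first, plugging in the values already found.
    solution = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (rows[i][n] - known) / rows[i][i]
    return solution


# Augmented matrix of the worked example: rows (3 0 | 27) and (2 -1 | 0).
x, y = solve_linear_system([[3, 0, 27], [2, -1, 0]])
print(x, y)
```

The code reduces only to upper-triangular form and then substitutes backwards, which is the procedure described in the text; reducing all the way to the identity matrix would work equally well.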
\[
\begin{aligned} x + y &= 2 \\ 2x + 2y &= 4 \end{aligned}
\]

\[
\left[\begin{array}{cc|c} 1 & 1 & 2 \\ 2 & 2 & 4 \end{array}\right]
\sim \left[\begin{array}{cc|c} 1 & 1 & 2 \\ 0 & 0 & 0 \end{array}\right]
\]

Hence

\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 - t \\ t \end{bmatrix}
\]

is a solution for any value of t.

\[
\begin{aligned} x + y &= 2 \\ 2x + 2y &= 5 \end{aligned}
\]

\[
\left[\begin{array}{cc|c} 1 & 1 & 2 \\ 2 & 2 & 5 \end{array}\right]
\sim \left[\begin{array}{cc|c} 1 & 1 & 2 \\ 0 & 0 & 1 \end{array}\right]
\]

There are no values for x and y that make 0 = 1. Hence the system of equations has no solutions.

For what value(s) of a does the following system of equations have no solutions?

\[
\begin{aligned}
x - 2y + 3z &= 2 \\
2x - y + 2z &= 3 \\
x + y + az &= a
\end{aligned}
\]

\[
\left[\begin{array}{ccc|c} 1 & -2 & 3 & 2 \\ 2 & -1 & 2 & 3 \\ 1 & 1 & a & a \end{array}\right]
\sim \left[\begin{array}{ccc|c} 1 & -2 & 3 & 2 \\ 0 & 3 & -4 & -1 \\ 0 & 3 & a-3 & a-2 \end{array}\right]
\sim \left[\begin{array}{ccc|c} 1 & -2 & 3 & 2 \\ 0 & 3 & -4 & -1 \\ 0 & 0 & a+1 & a-1 \end{array}\right]
\]

There are no solutions when $a - 1 \neq 0$ and at the same time $a + 1 = 0$, thus when a = −1.

§ B.10.4. Special matrices

I                      Identity matrix. A matrix with 1’s on its principal diagonal and 0’s elsewhere, as in $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. Multiplying by I has no effect: AI = IA = A.

$A^T$                  Transpose of A. A with rows and columns interchanged: the rows become the columns and the columns become the rows.

\[
\begin{bmatrix} 2 & 5 & 6 \\ 1 & 3 & 4 \end{bmatrix}^T = \begin{bmatrix} 2 & 1 \\ 5 & 3 \\ 6 & 4 \end{bmatrix}
\]

$A^{-1}$               Inverse of A. A matrix such that $A A^{-1} = A^{-1} A = I$.

orthogonal matrix      A matrix whose columns (and rows) form a system of orthogonal unit vectors, or equivalently: a matrix A such that $A^{-1} = A^T$.

singular matrix        A matrix with determinant zero. A singular matrix is non-invertible.

symmetric matrix       A matrix that equals its own reflection in its principal diagonal; matrix A such that $A = A^T$.

antisymmetric matrix   Matrix A such that $A = -A^T$.

§ B.10.5. Determinants

\[
\begin{aligned}
\det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} &= ad - bc \\
&= \pm \text{ area of parallelogram spanned by } \begin{bmatrix} a \\ c \end{bmatrix} \text{ and } \begin{bmatrix} b \\ d \end{bmatrix} \\
&= \pm \text{ area scaling factor of linear transformation } A
\end{aligned}
\]

Similarly for 3 × 3 determinants, which represent the volume of the parallelepiped spanned by the column vectors, and so on.

\[
\det AB = (\det A)(\det B) \qquad \det A^{-1} = \frac{1}{\det A} \qquad \det A = \det A^T
\]

• Compute the determinant of a matrix.

For a 2 × 2 matrix, $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$.

For a bigger matrix, select any one row or column of the matrix (preferably one with many 0’s in it). For each entry in this row or column:

– Block out the row and column containing this entry, and write down the determinant of the matrix that remains.

– Write the entry itself as a coefficient in front of this determinant.

– Decide whether the entry is associated with a plus or minus sign. The top left entry (of the original matrix) is positive and every time you go one step over or one step down the sign changes. Once the sign is determined, write it in front of the entry coefficient.

The original determinant is equal to the resulting sum of smaller determinants (with their appropriate signs and coefficients). Apply the same process to these smaller determinants until you have broken them down to 2 × 2 determinants, which can be evaluated as above.
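The expansion procedure just described translates directly into a short recursive function. The following is a minimal sketch, not taken from the original text: the name `determinant` is hypothetical, the expansion is always taken along the first column for simplicity (the text allows any row or column), and the sample matrix is chosen only for illustration.

```python
def determinant(matrix):
    """Determinant by expansion along the first column (illustrative sketch)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    if n == 2:
        (a, b), (c, d) = matrix
        return a * d - b * c
    total = 0
    for i in range(n):
        entry = matrix[i][0]
        if entry == 0:
            continue  # zero entries contribute nothing, so skip their minors
        # Block out row i and the first column; what remains is the minor.
        minor = [row[1:] for k, row in enumerate(matrix) if k != i]
        # Signs alternate + - + - ... going down the first column.
        sign = -1 if i % 2 else 1
        total += sign * entry * determinant(minor)
    return total


print(determinant([[2, 1, 4], [1, 5, 1], [2, 3, 1]]))  # prints -23
```

Skipping zero entries mirrors the advice to expand along a row or column with many 0’s: each zero saves the evaluation of one smaller determinant.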
Find the determinant: $\begin{vmatrix} 2 & 1 & 4 \\ 1 & 5 & 1 \\ 2 & 3 & 1 \end{vmatrix}$

We can do this by expanding by any one row or column. Let’s pick the first column. Then each entry of this column needs to be multiplied by the 2 × 2 determinant that remains when its row and column is blocked out:

\[
2 \begin{vmatrix} 5 & 1 \\ 3 & 1 \end{vmatrix} = 4
\qquad
1 \begin{vmatrix} 1 & 4 \\ 3 & 1 \end{vmatrix} = -11
\qquad
2 \begin{vmatrix} 1 & 4 \\ 5 & 1 \end{vmatrix} = -38
\]

These results are to be added together except first some terms must be given a minus sign based on “where it came from” according to the pattern

\[
\begin{matrix} + & - & + \\ - & + & - \\ + & - & + \end{matrix}
\]

In other words, to find the sign of a given position we “count it off” in alternating plusses and minuses starting from a plus in the top left corner. In our case, therefore, the middle term must be negated. So the final answer is 4 + 11 − 38 = −23.

Find the determinant: $\begin{vmatrix} 3 & 4 & 0 & 0 \\ 1 & 2 & 0 & 0 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 4 & 2 \end{vmatrix}$

Expanding by first row:

\[
= 3 \begin{vmatrix} 2 & 0 & 0 \\ 0 & 3 & 1 \\ 0 & 4 & 2 \end{vmatrix} - 4 \begin{vmatrix} 1 & 0 & 0 \\ 0 & 3 & 1 \\ 0 & 4 & 2 \end{vmatrix}
= 3 \cdot 2 \begin{vmatrix} 3 & 1 \\ 4 & 2 \end{vmatrix} - 4 \begin{vmatrix} 3 & 1 \\ 4 & 2 \end{vmatrix}
= 6 \cdot 2 - 4 \cdot 2 = 4
\]

• Simplify a determinant before computing it.

If you add or subtract any multiple of one row to another row, the value of the determinant does not change. If you add or subtract any multiple of one column to another column, the value of the determinant does not change. The best use of this is usually to create as many 0’s as possible.

\[
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = \begin{vmatrix} a & b & c \\ d & e & f \\ g-a & h-b & i-c \end{vmatrix}
\]

\[
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = \begin{vmatrix} a & b-a & c \\ d & e-d & f \\ g & h-g & i \end{vmatrix}
\]

If all entries in a row (or column) are multiples of the same number k, then you can factor this number out. Place the k as a coefficient of the determinant and divide away this factor in all the entries of the row (or column) in question, leaving other rows (or columns) unchanged.

\[
\begin{vmatrix} k \cdot a & b & c \\ k \cdot d & e & f \\ k \cdot g & h & i \end{vmatrix} = k \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}
\]

You can switch places of two of the rows (or two of the columns), but then you must also switch the sign of the determinant (multiply by −1 in front). In this way the value of the determinant remains the same.

\[
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = - \begin{vmatrix} d & e & f \\ a & b & c \\ g & h & i \end{vmatrix}
\]

Evaluate the determinant $\begin{vmatrix} 1 & 2 & 4 & 4 \\ 0 & 2 & 9 & 1 \\ 0 & 0 & 1 & 8 \\ 2 & 4 & 8 & 9 \end{vmatrix}$

Subtract twice the first row from the last row. Then expand by the first column repeatedly.

\[
\begin{vmatrix} 1 & 2 & 4 & 4 \\ 0 & 2 & 9 & 1 \\ 0 & 0 & 1 & 8 \\ 2 & 4 & 8 & 9 \end{vmatrix}
= \begin{vmatrix} 1 & 2 & 4 & 4 \\ 0 & 2 & 9 & 1 \\ 0 & 0 & 1 & 8 \\ 0 & 0 & 0 & 1 \end{vmatrix}
= 1 \cdot \begin{vmatrix} 2 & 9 & 1 \\ 0 & 1 & 8 \\ 0 & 0 & 1 \end{vmatrix}
= 1 \cdot 2 \cdot \begin{vmatrix} 1 & 8 \\ 0 & 1 \end{vmatrix}
= 1 \cdot 2 \cdot 1 \cdot 1 = 2
\]
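The simplification rules above suggest a mechanical alternative to expansion: use row operations to create 0’s below the diagonal while keeping track of how each operation affects the determinant. The sketch below is an illustration under stated assumptions, not the book’s own method: the name `determinant_by_row_reduction` is hypothetical, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def determinant_by_row_reduction(matrix):
    """Determinant via row operations (illustrative sketch, exact arithmetic)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no usable pivot in this column: determinant is 0
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # switching two rows switches the sign
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            # Subtracting a multiple of one row from another changes nothing.
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
        # For a triangular matrix the determinant is the product of the diagonal.
        det *= rows[col][col]
    return det


example = [[1, 2, 4, 4], [0, 2, 9, 1], [0, 0, 1, 8], [2, 4, 8, 9]]
print(determinant_by_row_reduction(example))  # prints 2
```

On the 4 × 4 example the only nontrivial step is subtracting twice the first row from the last, after which the diagonal entries 1, 2, 1, 1 multiply to 2, in agreement with the hand calculation.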
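Find the determinant of the matrix

\[
A = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & a^3 & a^2 & a \\ 1 & a & 2 & 1 \end{bmatrix}
\]

Subtract the first row from the last, and then expand by the first column:

\[
\det A = \begin{vmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & a^3 & a^2 & a \\ 0 & a & 1 & 1 \end{vmatrix}
= \begin{vmatrix} 1 & 1 & 1 \\ a^3 & a^2 & a \\ a & 1 & 1 \end{vmatrix}
\]

Subtract the first row from the third:

\[
\det A = \begin{vmatrix} 1 & 1 & 1 \\ a^3 & a^2 & a \\ a - 1 & 0 & 0 \end{vmatrix}
\]

Expand by the third row:

\[
\det A = (a - 1) \begin{vmatrix} 1 & 1 \\ a^2 & a \end{vmatrix} = (a - 1)(a - a^2) = -a(a - 1)^2
\]

§ B.10.6. Matrix inverses

• Find the inverse of a given matrix A.

For a 2 × 2 matrix,

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]

For a bigger matrix, transform [A|I] into [I|B] by row manipulations; then B is the inverse of A. In greater detail this means the following. Form the augmented matrix [A|I] consisting of A and an identity matrix I of the same dimensions written next to it. Focussing on the left half of the augmented matrix, but carrying out all operations also on the right half, rewrite A into I using the Gaussian elimination rules (§B.10.3). As you turn the left half of the matrix into I, the right half will keep changing with every operation. When the left side has become I, what remains on the right half is the inverse of A, as sought.

\[
\begin{pmatrix} 1 & -1 \\ 3 & 2 \end{pmatrix}^{-1} = \frac{1}{5} \begin{pmatrix} 2 & 1 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} \frac{2}{5} & \frac{1}{5} \\ -\frac{3}{5} & \frac{1}{5} \end{pmatrix}
\]

Invert $A = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

\[
\left[\begin{array}{ccc|ccc} 0 & 1 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right]
\]

Thus

\[
A^{-1} = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Find the inverse of $A = \begin{bmatrix} 0 & 1 & 1 \\ 2 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

\[
\left[\begin{array}{ccc|ccc} 0 & 1 & 1 & 1 & 0 & 0 \\ 2 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 2 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right]
\]

\[
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & \frac{1}{2} & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & \frac{1}{2} & 0 \\ 0 & 1 & 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right]
\]

So

\[
A^{-1} = \begin{bmatrix} 0 & \frac{1}{2} & 0 \\ 1 & 0 & -1 \\ 0 & 0 & 1 \end{bmatrix}
\]

§ B.10.7. Eigenvectors and eigenvalues

Definition: If $A\mathbf{v} = \lambda \mathbf{v}$ for some number λ then $\mathbf{v}$ ($\neq 0$) is called an eigenvector of A and λ is the eigenvalue associated with it.

• Find the eigenvalues λ of a given matrix A.

Solve $\det(A - \lambda I) = 0$.

Find the eigenvalues of $A = \begin{pmatrix} 7 & 1 \\ 2 & 6 \end{pmatrix}$.

\[
0 = \begin{vmatrix} 7 - \lambda & 1 \\ 2 & 6 - \lambda \end{vmatrix} = (7 - \lambda)(6 - \lambda) - 2 = \lambda^2 - 13\lambda + 40 = (\lambda - 5)(\lambda - 8),
\]

so the eigenvalues are $\lambda_1 = 5$ and $\lambda_2 = 8$.

• Find the eigenvectors $\mathbf{v}$ of a given matrix A.

Having determined the eigenvalues λ as above, plug each of them into $(A - \lambda I)\mathbf{v} = 0$ and solve for the corresponding eigenvector.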
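For a 2 × 2 matrix both steps can be carried out by formula, since $\det(A - \lambda I) = \lambda^2 - (a + d)\lambda + (ad - bc)$ is a quadratic in λ. The following sketch is an illustration only: the function name `eigen_2x2` is hypothetical, it handles real eigenvalues only, and it works in floating point, so results for less tidy matrices are approximate.

```python
import math


def eigen_2x2(matrix):
    """Real eigenvalues and one eigenvector each for a 2x2 matrix (sketch)."""
    (a, b), (c, d) = matrix
    trace, det = a + d, a * d - b * c
    # det(A - lambda I) = lambda^2 - trace * lambda + det
    discriminant = trace * trace - 4 * det
    if discriminant < 0:
        raise ValueError("no real eigenvalues")
    root = math.sqrt(discriminant)
    results = []
    for lam in ((trace - root) / 2, (trace + root) / 2):
        # Solve (A - lam I) v = 0 using whichever row of A - lam I is nonzero.
        if b != 0 or a - lam != 0:
            v = (b, lam - a)  # perpendicular to the first row (a - lam, b)
        elif c != 0 or d - lam != 0:
            v = (lam - d, c)  # perpendicular to the second row (c, d - lam)
        else:
            v = (1, 0)  # A - lam I is the zero matrix: every vector works
        results.append((lam, v))
    return results


for lam, v in eigen_2x2([[7, 1], [2, 6]]):
    print(lam, v)
```

Any nonzero multiple of a returned vector is an equally valid eigenvector, which is why hand calculations usually write the answer with a free parameter t.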
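Find the eigenvectors and eigenvalues of $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$.

\[
\det(A - \lambda I) = \begin{vmatrix} 3 - \lambda & 1 \\ 0 & 2 - \lambda \end{vmatrix} = (3 - \lambda)(2 - \lambda) = 0
\]

Thus the two eigenvalues are λ = 2 and λ = 3. To find the corresponding eigenvectors we plug each of these values into $(A - \lambda I)\mathbf{x} = 0$. For λ = 2 this gives $\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, so x = −y, so the eigenvector is $t \begin{bmatrix} 1 \\ -1 \end{bmatrix}$. For λ = 3 we get $\begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, so y = 0 and x can be anything, so the eigenvector is $t \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.

Find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$.

$\det(A - \lambda I) = 0 \Rightarrow (3 - \lambda)(2 - \lambda) = 0 \Rightarrow \lambda_1 = 2$ and $\lambda_2 = 3$.

\[
\text{For } \lambda_1: \quad \begin{bmatrix} 3-2 & 1 \\ 0 & 2-2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\Rightarrow \left\{ \begin{aligned} x + y &= 0 \\ 0 \cdot x + 0 \cdot y &= 0 \end{aligned} \right.
\Rightarrow y = -x \Rightarrow \mathbf{v}_1 = t \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\]

\[
\text{For } \lambda_2: \quad \begin{bmatrix} 3-3 & 1 \\ 0 & 2-3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\Rightarrow \left\{ \begin{aligned} 0 \cdot x + 1 \cdot y &= 0 \\ 0 \cdot x - 1 \cdot y &= 0 \end{aligned} \right.
\Rightarrow y = 0 \Rightarrow \mathbf{v}_2 = t \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\]

§ B.10.8. Diagonalisation

\[
A = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1}
\]

where $\mathbf{v}_1$ and $\mathbf{v}_2$ are the eigenvectors of A written as columns.

• Diagonalise a given matrix A.

Find its eigenvectors and eigenvalues and enter them into the above formula.

Diagonalise $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$, that is, find a matrix C, $C^{-1}$, and a diagonal matrix D such that $A = C D C^{-1}$.

Above we found $\lambda_1 = 2$, $\lambda_2 = 3$, $\mathbf{v}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Therefore

\[
A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}
\]

• Compute a power $A^n$ of a given matrix A.

Diagonalise A. Multiplying its diagonalised expression with itself repeatedly yields

\[
A^n = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} \begin{bmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{bmatrix} \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix}^{-1}
\]

Given $A = \begin{bmatrix} -1 & 4 \\ 0 & 1 \end{bmatrix}$ find $A^{139}$ using diagonalisation.

$\det(A - \lambda I) = \lambda^2 - 1 = 0 \Rightarrow \lambda_1 = -1$, $\lambda_2 = 1$. To find the eigenvector for $\lambda_1 = -1$ we form the equation $(A - \lambda_1 I)\vec{v} = 0 \Rightarrow \begin{bmatrix} 0 & 4 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. Thus $\vec{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. By a similar calculation, $\vec{v}_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$. By diagonalisation we know that $A = SDS^{-1}$ where $S = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ and $D = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$. To find the required power we multiply $A^{139} = (SDS^{-1})^{139} = SDS^{-1} SDS^{-1} \cdots SDS^{-1} = SD^{139}S^{-1}$ which becomes

\[
A^{139} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}^{139} \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 4 \\ 0 & 1 \end{bmatrix}
\]

§ B.10.9. Examples

Give an example of a 2 × 2 matrix B such that $B^8 = I$ but $B \neq I$, $B^2 \neq I$, and $B^4 \neq I$.

Reasoning geometrically, we see that a rotation 45° either clockwise or counterclockwise has the desired properties. This corresponds to the matrices

\[
B = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}
\quad \text{or} \quad
B = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}
\]

Given $A = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 7 \\ 2 & -4 \\ 0 & 2 \end{bmatrix}$ find a matrix X such that XA = B.

Multiply both sides of the equation by $A^{-1} = \frac{1}{5} \begin{bmatrix} 4 & -3 \\ -1 & 2 \end{bmatrix}$ on the right. Since $XAA^{-1} = BA^{-1}$ simplifies to $X = BA^{-1}$ we get

\[
X = \frac{1}{5} \begin{bmatrix} 1 & 7 \\ 2 & -4 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ -1 & 2 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} -3 & 11 \\ 12 & -14 \\ -2 & 4 \end{bmatrix}
\]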
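The last example can be checked numerically with exact fractions. This is a small sketch rather than anything from the original text: the helper names `inverse_2x2` and `multiply` are hypothetical, and the inverse uses the 2 × 2 formula from §B.10.6.

```python
from fractions import Fraction


def inverse_2x2(matrix):
    """Inverse of a 2x2 matrix via the ad - bc formula (illustrative helper)."""
    (a, b), (c, d) = matrix
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def multiply(left, right):
    """Matrix product: each entry is a row of `left` times a column of `right`."""
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
            for row in left]


A = [[2, 3], [1, 4]]
B = [[1, 7], [2, -4], [0, 2]]

# X A = B, so multiply by the inverse of A on the right: X = B A^(-1).
X = multiply(B, inverse_2x2(A))
for row in X:
    print([str(entry) for entry in row])

# Multiplying back confirms the solution.
print(multiply(X, A) == B)  # prints True
```

The order matters here: multiplying by the inverse on the left would give $A^{-1}B$, which is not even defined for these dimensions.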
 
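Find all eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & -4 & 8 \\ 2 & -3 & 8 \\ 0 & 0 & 1 \end{bmatrix}$. Then find $A^{11}\vec{v}$ where $\vec{v} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}$.

\[
\det(A - \lambda I) = 0 \Rightarrow \begin{vmatrix} 3 - \lambda & -4 & 8 \\ 2 & -3 - \lambda & 8 \\ 0 & 0 & 1 - \lambda \end{vmatrix} = 0
\Rightarrow (1 - \lambda)(\lambda^2 - 1) = 0 \Rightarrow (1 - \lambda)(\lambda - 1)(\lambda + 1) = 0.
\]

The eigenvectors for $\lambda_1 = 1$ are found by

\[
\begin{bmatrix} 2 & -4 & 8 \\ 2 & -4 & 8 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

which reduces to x − 2y + 4z = 0. The solutions can be written as

\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = s \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} -4 \\ 0 \\ 1 \end{bmatrix}
\]

so these are the two eigenvectors.

For $\lambda_2 = -1$ we get

\[
\begin{bmatrix} 4 & -4 & 8 \\ 2 & -2 & 8 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

or in other words x − y + 2z = 0 and z = 0, which means that the solutions are

\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = t \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
\]

This is our eigenvector for this eigenvalue.

For the second part we diagonalise the matrix: our eigen-calculations show that $A = PDP^{-1}$ where

\[
P = \begin{bmatrix} 2 & -4 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\quad \text{and} \quad
D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\]

It follows that $A^{11} = PD^{11}P^{-1}$. But

\[
D^{11} = \begin{bmatrix} 1^{11} & 0 & 0 \\ 0 & 1^{11} & 0 \\ 0 & 0 & (-1)^{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} = D
\]

so $A^{11} = PD^{11}P^{-1} = PDP^{-1} = A$. Hence

\[
A^{11}\vec{v} = A\vec{v} = \begin{bmatrix} 3 & -4 & 8 \\ 2 & -3 & 8 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 9 \\ 8 \\ 1 \end{bmatrix}
\]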
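The conclusion of this example can be confirmed by brute force, multiplying A by itself eleven times with integer arithmetic. The sketch below is illustrative only: the helper names `multiply` and `power` are hypothetical, and the check AP = PD is used as a stand-in for $A = PDP^{-1}$ so that no inverse needs to be computed.

```python
def multiply(left, right):
    """Matrix product: each entry is a row of `left` times a column of `right`."""
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
            for row in left]


def power(matrix, exponent):
    """Repeated multiplication, starting from the identity matrix."""
    n = len(matrix)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(exponent):
        result = multiply(result, matrix)
    return result


A = [[3, -4, 8], [2, -3, 8], [0, 0, 1]]
P = [[2, -4, 1], [1, 0, 1], [0, 1, 0]]   # eigenvectors as columns
D = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # matching eigenvalues on the diagonal
v = [[3], [2], [1]]                      # column vector

print(multiply(A, P) == multiply(P, D))  # A P = P D, i.e. A = P D P^(-1)
print(power(A, 11) == A)                 # odd powers of A equal A
print(multiply(power(A, 11), v))         # prints [[9], [8], [1]]
```

Eleven multiplications are harmless here, but the diagonalised form is what makes the same question just as easy for an exponent like 139 or 1000.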
C NOTATION REFERENCE TABLE

§ C.1. Logic

$\Longrightarrow$            implies
$\Longleftrightarrow$        is equivalent to; if and only if

§ C.2. Calculus

$f'$                         derivative of f
$\frac{df}{dx}$              derivative of f
$\frac{d}{dx} f(x)$          derivative of f
$\dot{x}$                    derivative of x when x is a function of time
$\Delta x$                   change in x; difference between two x-values (∆ = delta = difference)
$dx$                         “infinitesimal” or “infinitely small” change in x (d = difference)
$f^{-1}(x)$                  inverse function of f(x)
$[F(x)]_a^b$                 F(b) − F(a)
$\to$                        goes to; approaches
$\infty$                     infinity

§ C.3. Algebra

$|x|$                        absolute value of x; “size,” distance to origin. Example: |−5| = 5.
$\pm x$                      +x or −x.
$\pm \cdots \mp$             + in the first position goes with − in the second, and vice versa.
$n!$                         Multiply n by every positive integer below it. Example: 4! = 4 · 3 · 2 · 1
$\sum_{n=a}^{b} f(n)$        For every integer n starting with a and going to b, compute f(n), and add all of the results together. (Σ = sigma = sum.)
                             Example: $\sum_{n=0}^{\infty} (-1)^n x^n = 1 - x + x^2 - x^3 + x^4 - x^5 + \cdots$
$\prod_{n=a}^{b} f(n)$       For every integer n starting with a and going to b, compute f(n), and multiply all of the results together. (Π = pi = product.)
                             Example: $\prod_{n=1}^{3} (1 - x^n) = (1 - x)(1 - x^2)(1 - x^3)$

§ C.4. Vectors

$\mathbf{v}$ or $\vec{v}$    vector
$\hat{v}$                    unit vector (length 1)
$|\mathbf{v}|$               length of vector v
$\overrightarrow{AB}$        vector pointing from point A to point B

§ C.5. Multivariable and vector calculus

$\partial x$                 infinitesimal change in x in a multivariable context
$f_x$, $\frac{\partial f}{\partial x}$    partial derivative of f with respect to x
$\left(\frac{\partial f}{\partial x}\right)_y$    Partial derivative of f with respect to x, emphasising the fact that y is considered fixed; not different in meaning from $f_x$, but useful to avoid confusion in certain contexts.
$\nabla f$                   gradient of f; $(f_x, f_y, f_z)$
$\nabla$                     formal vector $\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$

§ C.6. Sets and intervals

$(a, b)$                     the interval from a to b; all numbers between a and b
( or )                       endpoint not included
[ or ]                       endpoint included
$\cup$                       union (the aggregate, “everything combined,” “pool your resources”)
$\cap$                       intersection (what is common to both)
$\setminus$ or $-$           “set minus”; difference
$A \setminus B$ or $A - B$   A with everything from B taken out
$\in$                        is an element of
$A \subset B$                A is contained in B, is a subset of B
$A \subseteq B$              A is contained in B, and is possibly equal to it
$\mathbb{N}$                 the natural numbers (1, 2, 3, . . .)
$\mathbb{Z}$                 the whole numbers, the integers (. . . , −2, −1, 0, 1, 2, 3, . . .)
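$\mathbb{Q}$                 the rational numbers; numbers that are the ratio of two integers
$\mathbb{R}$                 the real numbers; a “whole axis”
$\mathbb{C}$                 the complex numbers
$\emptyset$                  the empty set; nothing

Examples:

(−1, 1) = [Figure: number line from −2 to 2 with this interval marked]

(−∞, −1) ∪ (0, 1] = [Figure: number line from −2 to 2 with this set marked]

$1 \in [0, 1]$ \qquad $1 \notin (0, 1)$

$[-2, 2] \cap (0, 4) = (0, 2]$

$[0, 2] - (1, \infty) = [0, 1]$

$\emptyset \subset \mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$

$2 \in \mathbb{Z}$ \qquad $\frac{3}{2} \in \mathbb{Q}$ \qquad $\pi \in \mathbb{R}$ \qquad $\pi \notin \mathbb{Q}$

§ C.7. Book-organisational

[icon]                       Calculator or computer to be used for this problem.
?                            “Bonus” material; asides that can be skipped.
†                            Harder problem.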
