Calculus PDF
CONTENTS

1 Differentiation
1.1 Infinitesimals
1.2 The derivative
1.3 Derivatives of polynomials
1.4 Derivatives of elementary functions
1.5 Basic differentiation rules
1.6 The chain rule
1.7 Limits
1.8 Reference summary

4 Applications of integration
4.1 Volume
4.2 Arc length
4.3 Center of mass
4.4 Energy and work
4.5 Logarithms redux
4.6 Reference summary

5 Power series
5.1 The idea of power series
5.2 The geometric series
5.3 The binomial series
5.4 Divergence of series
5.5 Reference summary

6 Differential equations
6.1 Separation of variables
6.2 Statics
6.3 Dynamics
6.4 Second-order differential equations
6.5 Second-order differential equations: complex case
6.6 Phase plane analysis
6.7 Reference summary

8 Vectors
8.1 Vectors
8.2 Scalar product
8.3 Vector product
8.4 Geometry of vector curves
8.5 Reference summary

9 Multivariable differential calculus
9.1 Functions of several variables
9.2 Tangent planes
9.3 Unconstrained optimisation
9.4 Gradients
9.5 Constrained optimisation
9.6 Multivariable chain rule
9.7 Reference summary

12 Further problems
12.1 Newton’s moon test
12.2 The rainbow
12.3 Addiction modelling
12.4 Estimating n!
12.5 Wallis’s product for π
12.6 Power series by interpolation
12.7 Path of quickest descent
12.8 Isoperimetric problem
12.9 Isoperimetric problem II

13 Further topics
13.1 Curvature
13.2 Evolutes and involutes
13.3 Fourier series
13.4 Hypercomplex numbers
13.5 Calculus of variations

A Precalculus review
A.1 Coordinates
A.2 Functions
A.3 Trigonometric functions
A.4 Logarithms
A.5 Exponential functions
A.6 Complex numbers
A.7 Reference summary
1 DIFFERENTIATION

§ 1.1. Infinitesimals

1.1.2. Explain why. Hint: a tangent line may be considered a line that cuts a curve in “two successive points,” i.e., as the limit of a secant line as the two points of intersection are brought closer and closer together:

1.1.5. Explain how the result of problem 1.1.1 can also be obtained by considering the area as made up of infinitesimally thin concentric rings instead of “pizza slices.”

[Figure: infinitesimal triangle with sides dx, dy, ds; points labelled b, B, S, A.]
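As a numerical companion to problem 1.1.5, the following sketch adds up the areas of many thin concentric rings. It rests on two assumptions that are not stated in the text reproduced here: that the area in question is that of a circular disc, and that a ring of inner radius r and small width dr has area close to its circumference 2πr times dr. The function name and the sample ring counts are illustrative only.

```python
import math


def disc_area_from_rings(radius, rings):
    """Approximate the area of a disc by summing thin concentric rings.

    Assumption: a ring with inner radius r and width dr is treated as a
    strip of length 2*pi*r (its inner circumference) and width dr.
    """
    dr = radius / rings
    total = 0.0
    for i in range(rings):
        r = i * dr                      # inner radius of the i-th ring
        total += 2 * math.pi * r * dr   # circumference times width
    return total


if __name__ == "__main__":
    # The sums creep up toward pi * radius**2 as the rings get thinner.
    for rings in (10, 100, 1000):
        print(rings, disc_area_from_rings(1.0, rings), math.pi)
```

The thinner the rings, the less it matters whether the inner or the outer circumference is used for each ring, which is the same reason leftover dx terms can be discarded at the end of a calculation.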
§ 1.2. The derivative

§ 1.2.1. Lecture worksheet

The idea that dy/dx is the slope of the graph of y(x) is very useful. It has many faces besides the geometrical one:

• Geometrically, dy/dx is the slope of the graph of y.

• Verbally, dy/dx is the rate of change of y.

• Algebraically,

\[ \frac{dy}{dx} = \frac{y(x + dx) - y(x)}{dx}. \]

• Physically, the rate of change of distance is velocity; the rate of change of velocity is acceleration.

1.2.1. Sketch the graphs of the distance covered, the speed, and the acceleration of a sprinter during a 100 meter race, and explain how your graphs agree with the above characterisations of these quantities.

Of course the rate of change of y(x) is generally different for different values of x. We use y'(x) to denote the function whose value is the rate of change of y(x). We call y'(x) the derivative of y(x). Thus y(x) is the primitive function, meaning the starting point, while y'(x) is merely derived from it. For example, y'(0) = 3 and y'(1) = −1 means that the function is at first rising quite steeply but is later coming back down, albeit at a less rapid rate.

1.2.2. Express symbolically in terms of derivatives:
(a) The volume of the arctic ice V(t) is shrinking.
(b) The population P(t) grows at a rate of 10% per unit time.

§ 1.2.2. Problems

[Figure: curve y with infinitesimal triangle dx, dy, ds and subtangent σ.]

(a) Express σ in terms of y and y'.
(b) A famous curve has “constant subtangent”; indeed this is how it was usually referred to in the 17th century. Which curve is it?

1.2.4. The derivative y'(x) represents the “instantaneous” rate of change. In the case of a moving object the derivative of its distance is the velocity “at a given instant.” Nevertheless it is useful to read the fractional interpretation dy/dx of the derivative as “so many steps in y for so many steps in x.” This is what we do for example when we speak of so-and-so many “miles per hour.” It can be confusing that concepts like “per hour” or “per step in the x-direction” occur in the description of something that is supposedly instantaneous and not at all ongoing for hours.

The confusion can be illustrated with the following scenario. A police officer stops a car.

OFFICER: The speed limit here is 60 km/h and you were going 80.

DRIVER: 80 km/h? That’s impossible. I have only been driving for ten minutes.

OFFICER: No, it doesn’t mean that you have been driving for an hour. It means that if you kept going at that speed for an hour you would cover 80 km.

DRIVER: Certainly not. If I kept going like that I would soon smash right into that building there at the end of the street.

(a) How can the officer better explain what a speed of 80 km/h really means?

Imagine an electric train travelling frictionlessly on an infinite, straight railroad. The train is running its engine at various rates, speeding up and slowing down accordingly. Then at a certain point it turns off the engine. Of course the train keeps moving inertially.

(b) Explain how this image captures both the “instantaneous” and the “per hour” aspect of a derivative in a concrete way.

1.2.5. Sometimes we consider the derivative of the derivative, or the “second derivative,” y''. The second derivative is also denoted d²y/dx².

(a) Argue that this notation makes algebraic sense by considering

\[ \frac{d}{dx}\left(\frac{dy}{dx}\right) \]

\[ y''(x) = \frac{y'(x + dx) - y'(x)}{dx} \]

and show that it leads to the same result.

1.2.6. “In the fall of 1972, President Nixon announced that the rate of increase of inflation was decreasing. This was the first time a sitting president had used the third derivative to advance his case for re-election.” (Notices of the AMS, Oct. 1996, 43(10), p. 1108.)

If Nixon was speaking of f''', what is f?

rate of change of quantity of goods purchasable by one dollar
quantity of goods purchasable by one dollar
quantity of goods accumulated from t=0 until now by someone spending one dollar per unit time
§ 1.3. Derivatives of polynomials

§ 1.3.1. Lecture worksheet

• calculate the corresponding change in y, which is denoted dy;

• divide the two to obtain the rate of change dy/dx.

At the final stage we typically discard any remaining terms involving dx on the right hand side, since dx is infinitely small. We cannot, however, discard all dx’s from the outset. This is because even though it is infinitely small, it still has an impact when considered in relation to dy. This is the way any small numbers work. If we have 5 + 0.00001 then the second term can pretty much be discarded since it is so insignificant compared to the first. However, if we have 5 + 0.00001/0.000001 then in fact those tiny numbers become very significant indeed, in this case even outweighing the “big” number and making the end result 15. Thus 0.00001 and 0.000001, though tiny on their own, become big when taken in ratio. In the same way, infinitesimal terms can be discarded in expressions like 5 + dx but not in expressions like dy/dx. It is therefore safest to do our discarding only at the final step of our three-step plan for finding derivatives, since that is when we are done dividing.

1.3.1. Find the derivative of x^3 and draw the corresponding picture. Thus, as its side length grows, the volume of a cube grows at a rate of ___ times its [side length, diagonal, surface area, volume, number of sides].

1.3.2. (a) What is the formula for the volume of a sphere? (Just state it for now. We shall prove it in problem 4.1.1.)

(b) Take its derivative with respect to the radius. What is the geometrical meaning of the result? The derivative of the volume of a sphere with respect to its radius is: [the radius, the radius squared, the surface area of the sphere, the area of the equatorial circle, the volume of the sphere, the volume of the circumscribed cube, none of the above].

(c) Draw the corresponding picture and compare with problem 1.3.1.

(d) The derivative of the area of a circle with respect to its radius is: [the radius, the radius squared, the area, the diameter, the area of the circumscribed square, the circumference, none of the above].

§ 1.3.2. Problems
1.3.4. Find the derivative of 1/x algebraically and confirm that it agrees with the rule (x^n)' = n x^(n−1). Hint: after writing out dy, combine the terms on a common denominator.

1.3.5. A pizza is to be shared among x friends. They cut it into so many equal pieces. Then one more friend shows up. The pizza now has to be divided into x + 1 pieces, but the cutting had already taken place. Therefore each person cuts off an xth piece of their slice and gives it to the newcomer.

(a) How much smaller did each piece of pizza become?
(b) This illustrates the fact that the derivative of f(x) = ___ is f'(x) = ___.
(c) Does everyone have the same amount of pizza in the end?

1.3.6. Imagine a string wrapped around the earth’s equator. How much longer would the string need to be for it to be able to be raised one meter above the earth’s surface at all points? What if you used a beach ball in place of the earth? By considering the derivative of the circumference of a circle with respect to its radius we see that the results are [equal, proportional to r, proportional to r^2, proportional to r^3].

[Figure: unit circle with angle θ, radii marked 1, legs sin θ and cos θ, and infinitesimal arc dθ.]

(b) If degrees were used instead of radians, which aspect(s) of the figure, if any, would become invalid?

parts marked 1
parts marked dθ
parts marked sin(θ) and cos(θ)
parts marked d sin(θ) and −d cos(θ)

(c) Complete the ratio based on similar triangles: sin(θ) / 1 = ___ / ___

In §A.5 we saw that exponential functions have the property that they grow in proportion to their size.

1.4.2. Formulate this in terms of derivatives.

In fact, the number e we mentioned there can be defined as the number such that (e^x)' = e^x. In other words, e is the base for the exponential function that is its own derivative.

1.4.3. To find the derivative of ln(x), write y = ln(x). Then the derivative we seek is dy/dx.

(a) Rewrite y = ln(x) in exponential form and find dx/dy.
(b) Invert the fraction to find dy/dx.

1.4.5. Derivative of arctangent. Recall the geometrical definitions of the tangent and arctangent functions from §A.3:

[Figure: unit circle with angle θ and tangent segment x = tan θ, with y = arctan(x) and infinitesimal changes dθ, dx, dy.]
(a) Show that this infinitesimal triangle is similar to the large one that has x as one of its sides.
(b) By what factor is the second circle larger than the first? (Hint: Find the hypotenuse of the triangle with x in it. Remember that the first circle was a unit circle.)
(c) Use this to express the “arc” leg of the infinitesimal triangle as a multiple of dy.
(d) Find dy/dx by similar triangles.

1.4.6. If f(x) = arctan(x) + arctan(1/x), compute f(1), f(−1) and f'(x), and explain the “paradox.”

1.5.2. Prove the product rule by viewing fg as the area of a rectangle with sides f and g and considering how the area changes as x (the implied variable) grows by dx.

§ 1.5.2. Problems

1.5.3. Find the derivative of 1/x by letting y = 1/x and differentiating xy.

1.5.4. Come up with a similar proof for 1/√x.

1.5.5. To find the derivative of a quotient f/g of two functions we can simply write it as f · (1/g) and apply the product rule. So there is really no need for a separate rule. However, we can save ourselves some time by working out this product rule calculation once and for all, so that we have a rule for quotients “ready to go” for future reference. Do so.

1.5.6. What is the product rule for a product of three functions?

§ 1.6. The chain rule

§ 1.6.1. Lecture worksheet

The final differentiation rule we need tells us what happens when one function is “trapped” inside another (like the links of a chain):

\[ (f(g(x)))' = f'(g(x)) \cdot g'(x) \qquad \text{(chain rule)} \]

1.6.1. Argue that the chain rule can be written

\[ \frac{d}{dx} f(g(x)) = \frac{df}{dg}\,\frac{dg}{dx} \]

§ 1.7. Limits

§ 1.7.1. Lecture worksheet

Some people find the notions of infinitesimals and the infinitely small disagreeable. Mathematics should not be built on such mysterious and quasi-metaphysical notions, they say. No, give me good old real numbers, they say; otherwise it’s not mathematics.

This kind of conservatism can be accommodated using the notion of limits. The limit of f(x) as x goes to a, or in symbols lim_{x→a} f(x), means the number that f(x) approaches as x is taken closer and closer to the given number a. Thus for example lim_{x→∞} 1/x = 0 because 1 divided by something very big is virtually zero. Also lim_{x→0} 1/x = ∞ because as x becomes closer and closer to 0 (think of x = 0.0001 and such numbers) the function 1/x becomes very big and grows beyond bounds.

1.7.1. Actually we should be a bit more careful and write lim_{x→0+} 1/x = ∞ and lim_{x→0−} 1/x = −∞. How so? Illustrate with a figure.

1.7.2. What is the visual meaning of lim_{x→a} f(x) = f(a) in terms of the graph of f? In such cases we say that f(x) is continuous.
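The behaviour of 1/x described in this section can be probed numerically by tabulating its values at inputs that grow large or that shrink toward 0 from either side. This is only a sketch: a table of values can suggest a limit but does not prove it, and the helper names and sample points below are arbitrary choices rather than anything taken from the text.

```python
def reciprocal(x):
    """The function 1/x discussed in the text."""
    return 1.0 / x


def tabulate(f, points):
    """Return (x, f(x)) pairs for the given sample points."""
    return [(x, f(x)) for x in points]


if __name__ == "__main__":
    # x growing large: the values shrink toward 0.
    print(tabulate(reciprocal, [10.0, 1e3, 1e6]))
    # x approaching 0 from the right: the values grow without bound.
    print(tabulate(reciprocal, [0.1, 1e-3, 1e-6]))
    # x approaching 0 from the left: the values are large and negative.
    print(tabulate(reciprocal, [-0.1, -1e-3, -1e-6]))
```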
The principles of the calculus can be formulated in terms of limits. Then instead of speaking of an infinitesimal increment dx we can speak of the limit of a finite increment ∆x as ∆x → 0. Thus for example the derivative may be more formally defined as

\[ \lim_{\Delta x \to 0} \frac{y(x + \Delta x) - y(x)}{\Delta x} \quad \text{instead of} \quad \frac{y(x + dx) - y(x)}{dx} \]

1.7.3. Write out the proof that (x^2)' = 2x in both manners of expression side by side.

The limit approach has some advantages over freewheeling use of infinitesimals in certain technical contexts. However, as the above example shows, this comes at the cost of often needless pedantry. The infinitesimal manner of speaking is simpler and more suggestive, and we shall therefore stick to it throughout this book. But it will do us well to know that we can fall back on a more formal explication in the language of limits if we should run into tricky problems where the meaning of infinitesimals becomes unclear.

§ 1.7.2. Problems

1.7.4. L’Hôpital’s rule. Suppose you want to compute the limit

\[ \lim_{x \to a} \frac{f(x)}{g(x)} \]

(a) Explain why this would be easy if f and g were continuous functions and f(a) = 8 and g(a) = 2.

If f(a) and g(a) are both zero, however, the situation is trickier. But consider this analogy. Suppose f(x) and g(x) represent the distance from the finish line of two sprinters in a race x seconds after they took off.

(b) What is the meaning of f(a) = g(a) = 0 in this context?

Suppose now that one of the runners was running twice as fast as the other during the last second of the race. Imagine watching a slow-motion replay of the finish.

(c) Argue that this makes it clear that

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \]

(or maybe the average balance across the year) and then gave us 100% of that amount. So if we started with $1 we would have $2 on January 1, and $4 the year after that. However, why should interest calculations be based on the notion of a calendar year? There is no intrinsic reason for this. The bank could just as well give us interest payments every month or every week or whatever. If we simply deposit an initial amount B_0 and leave it to grow, the formula for our balance after t years will be

\[ B(t) = B_0 \left(1 + \frac{100}{100}\right)^{t} \]

if the interest is added yearly,

\[ B(t) = B_0 \left(1 + \frac{100/12}{100}\right)^{12t} \]

if the interest is added monthly, and

\[ B(t) = B_0 \left(1 + \frac{100/52}{100}\right)^{52t} \]

if the interest is added weekly.

(a) What is the balance after one year in each of these scenarios?

(b) ? Explain how these differences arise. Hint: consider the phenomenon of “interest upon interest.”

Thus the account holder should insist on more frequent compounding. And so should the mathematician, because mathematically there is no reason why the theory of interest should be based on any particular chunks of time that are nothing but arbitrary social conventions. Better then to denote the number of compounding occasions in a year by n and let n → ∞ instead of fixing it at 1, 12, 52, or any other arbitrary number.

(c) Write down the balance formula for n compounding occasions per year.

(d) What is the balance after one year in this case?

(e) Let n → ∞ in this expression. This limit is the number e.

(f) In §1.4 we said that e could be defined as the number such that (e^x)' = e^x. Explain how this is related to the economic definition.
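The effect of more frequent compounding can be explored numerically. The sketch below assumes, as in the formulas above, a 100% annual interest rate split evenly over n compounding occasions per year, so that each occasion multiplies the balance by 1 + 1/n. The function name and the chosen values of n are illustrative only.

```python
import math


def balance_after_one_year(n, initial=1.0):
    """Balance after one year at 100% annual interest, compounded n times.

    Assumption: each of the n occasions adds (100/n)% of the current
    balance, i.e. multiplies it by (1 + 1/n).
    """
    return initial * (1 + 1 / n) ** n


if __name__ == "__main__":
    # Yearly, monthly, weekly, daily, and a very large n for comparison.
    for n in (1, 12, 52, 365, 10**6):
        print(n, balance_after_one_year(n))
    print("e =", math.e)
```

The printed balances increase with n but level off, which is what makes it meaningful to speak of a limiting value as n → ∞.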
y' negative ⇐⇒ graph of y heading downwards
y' = 0 ⇐⇒ graph of y horizontal
magnitude of y' = steepness of graph of y

∆x   change in x; difference between two x-values (∆ = delta = difference)
dx   “infinitesimal” or “infinitely small” change in x (d = difference)

§ 1.8.2. Physical meaning of derivative

\[ \frac{d}{dt}\,\text{distance} = \text{velocity} \]

\[ \frac{d}{dt}\,\text{velocity} = \text{acceleration} \]

Coefficient rule. Keep the number in front and differentiate the something.

(5x^2)' = 10x

– . . . an x is inside a root or in the denominator of a fraction.

Rewrite in form x^n and apply differentiation rule for this form.

function    equivalent form    derivative
√x          x^(1/2)            1/(2√x)
1/x         x^(−1)             −1/x^2

Product rule. Differentiate one and keep the other as it is, then vice versa, and add the results.

function      derivative
x^3 sin(x)    3x^2 sin(x) + x^3 cos(x)
x e^x         e^x + x e^x

function       form      g          derivative
sin^3(x)       g^3       sin(x)     3 sin^2(x) · cos(x)
(1 + 2x)^8     g^8       1 + 2x     8(1 + 2x)^7 · 2
1/(x + x^3)    g^(−1)    x + x^3    −1/(x + x^3)^2 · (1 + 3x^2)
– . . . an expression inside another inside another inside another . . .

Apply chain rule repeatedly. The inside derivatives will spit out inside derivatives of their own. Just keep multiplying.

Differentiate f(x) = (sin(√x))^2.

\[ f' = \left(\left(\sin\sqrt{x}\right)^2\right)' = 2\sin(\sqrt{x})\left(\sin\sqrt{x}\right)' = 2\sin(\sqrt{x})\cos(\sqrt{x})\left(\sqrt{x}\right)' = 2\sin(\sqrt{x})\cos(\sqrt{x})\left(\tfrac{1}{2}x^{-1/2}\right) = \frac{\sin(\sqrt{x})\cos(\sqrt{x})}{\sqrt{x}}. \]

– . . . a very complicated algebraic expression involving roots, exponents and/or fractions.

If using the usual rules seems too daunting, try taking the logarithm of both sides of the equation, simplify with logarithm laws, and then use implicit differentiation.

• Estimate how much y changes (∆y) when x goes from some value x_0 to some other value x_1, given derivative of y.

Since ∆y/∆x ≈ dy/dx, we get ∆y ≈ dy/dx · ∆x. Fill in ∆x = x_1 − x_0 and dy/dx = y'(x_0).

• Estimate y'(a) given the graph or a table of values for y(x).

Focus on two x-values not too far apart, both in the vicinity of x = a. Find the difference between them. This is ∆x. Determine the corresponding change ∆y in y (i.e., how much y changes when you go from one of your x’s to the other). Divide to obtain ∆y/∆x ≈ dy/dx.

• Find the derivative of y(x) from first principles.

Let x increase by an infinitesimal amount dx; calculate the corresponding change dy = y(x + dx) − y(x) in y; divide the two to obtain the rate of change dy/dx.

§ 1.8.6. Limits

you get 0/0 or ∞/∞ when plugging in x = a, you can take the derivative of both numerator and denominator without altering the value of the limit:

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \]

or negative as x approaches a; the answer is +∞ or −∞ accordingly. If the result is of the form 0/0 or ∞/∞: rewrite f(x) and try again. Rewriting could involve factoring, cancelling, algebraic/trigonometric/logarithmic identities, and L’Hôpital’s rule. If f(x) is oscillating (e.g. sin(x)) without approaching any one value: limit does not exist.

§ 1.8.7. Examples

Differentiate:

cos(x^2)
    −2x sin(x^2)

x^2 + sin(x)
    2x + cos(x)

5 ln(x)
    5 · (1/x)

x e^(−x^2)
    e^(−x^2) − 2x^2 e^(−x^2)

x e^(1/x)
    e^(1/x) + x e^(1/x) (−1/x^2) = e^(1/x) (1 − 1/x)

((x+1)/(x−1))^4
    4 ((x+1)/(x−1))^3 · (1·(x−1) − (x+1)·1)/(x−1)^2 = 4 ((x+1)/(x−1))^3 · (−2)/(x−1)^2 = −8 (x+1)^3/(x−1)^5

x e^(x^2+3x−2)
    e^(x^2+3x−2) + x e^(x^2+3x−2) (2x + 3) = e^(x^2+3x−2) (1 + x(2x + 3)) = e^(x^2+3x−2) (2x^2 + 3x + 1)
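Worked derivatives like those in §1.8.7 can be sanity-checked numerically by comparing each claimed derivative with a difference quotient ∆y/∆x over a small but finite ∆x. The sketch below is an assumption-laden check rather than a proof: it uses a symmetric difference quotient, a single arbitrary test point x = 2, and hypothetical helper names. The derivative of cos(x^2) is taken as −2x sin(x^2), as the chain rule gives.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (label, function, claimed derivative) for the examples of section 1.8.7.
EXAMPLES = [
    ("cos(x^2)", lambda x: math.cos(x**2),
     lambda x: -2 * x * math.sin(x**2)),
    ("x^2 + sin(x)", lambda x: x**2 + math.sin(x),
     lambda x: 2 * x + math.cos(x)),
    ("5 ln(x)", lambda x: 5 * math.log(x),
     lambda x: 5 / x),
    ("x e^(-x^2)", lambda x: x * math.exp(-x**2),
     lambda x: math.exp(-x**2) - 2 * x**2 * math.exp(-x**2)),
    ("x e^(1/x)", lambda x: x * math.exp(1 / x),
     lambda x: math.exp(1 / x) * (1 - 1 / x)),
    ("((x+1)/(x-1))^4", lambda x: ((x + 1) / (x - 1))**4,
     lambda x: -8 * (x + 1)**3 / (x - 1)**5),
    ("x e^(x^2+3x-2)", lambda x: x * math.exp(x**2 + 3 * x - 2),
     lambda x: math.exp(x**2 + 3 * x - 2) * (2 * x**2 + 3 * x + 1)),
]


def max_relative_error(x=2.0):
    """Largest relative gap between claimed and numerical derivative at x."""
    return max(
        abs(central_difference(f, x) - df(x)) / abs(df(x))
        for _, f, df in EXAMPLES
    )


if __name__ == "__main__":
    for label, f, df in EXAMPLES:
        print(label, df(2.0), central_difference(f, 2.0))
```

A small relative error at one point does not establish a formula, but a large one is a quick way to catch a dropped factor or exponent.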
2 APPLICATIONS OF DIFFERENTIATION

§ 2.1.1. Lecture worksheet

2.1.2. To get a more hands-on feel for this, draw a closed curve with various squiggles in it on a piece of paper. Hold the paper up against the wall. Consider the tangent lines at the lowest and highest points. Rotate the paper in various ways and repeat.

Once found, the points where f' = 0 can be classified using second derivatives:

f'' positive =⇒ minimum
f'' negative =⇒ maximum

§ 2.1.2. Problems

2.1.5. The law of reflection says that when light reflects off a mirror, the angle of incidence α is equal to the angle of reflection β.

[Figure: points (0, a), (x, 0) and (b, c), with angles α and β.]

(b) Find the derivative of this function and set it equal to zero.

(c) Show that the law of reflection follows. Hint: interpret the fractions in your equation as ratios of sides of the triangles in the figure.

\[ \frac{\sin(\alpha)}{\sin(\beta)} = \frac{c_1}{c_2} \]
2.2.1. What can you say about the concavity of a quadratic function?

2.2.2. Are the following true or false? Illustrate with figures. An inflection point could be:

(a) A maximum or a minimum.
(b) A point where f' = 0 yet no maximum or minimum occurs.

(c) What can you infer about the values of the coefficients a, b, c, d based on what is known about the path and its endpoint slopes?

(c) Do the same for the more general parabola y = ax^2.

(d) How can you characterise the y-intercept of these tangent lines in general, verbal terms?

(e) Use this information to give a calculus-free recipe for constructing tangent lines to parabolas.
This is in fact how tangents to parabolas were characterised over two thousand years ago by Apollonius in his classic Κωνικά, Prop. I.33.

§ 2.3.2. Problems

2.3.2. Focal property of the parabola. Parabolic reflectors concentrate all incoming rays parallel to their axis in a single point:

[Figure: parabolic reflector with incoming parallel rays meeting at the focal point.]

This is useful for picking up as much as possible of a signal that has been weakened by travelling a long distance, such as communications from a satellite. It can also be used to focus the rays of the sun; legend has it that Archimedes set fire to enemy ships this way. The mirror also works in the converse direction: light originating at the focal point will be reflected into parallel beams. This is useful for making a limited amount of light reach as far as possible in a focussed direction, such as in the headlights of a car.

(e) What are the coordinates of T? Hint: this was determined in problem 2.3.1.

(f) Find the lengths FT and FP.

(g) Conclude the proof of the focal property of the parabola.

2.3.3. Newton’s method for finding roots of equations numerically. Tangent lines can be used to find roots of equations numerically. Suppose we are looking for the points where a certain function y(x) is zero. Often we cannot determine these values directly by setting the function equal to zero, since this equation may be much too difficult to solve by hand. In such a case we can proceed as follows. Let x_0 be our best guess as to where the root might be. Evaluating y(x_0) will probably show us that it was not quite zero as we had aimed for. However, we then compute the tangent line to the function at this point and find its x-intercept. If our guess was reasonably close to a root this will give us a new x-value, x_1, which is closer to the root, as the figure shows. Then we can repeat the process, getting closer and closer.
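The tangent-line procedure of problem 2.3.3 (Newton's method) can be sketched in a few lines of code. The tangent line at (x, y(x)) has slope y'(x), so its x-intercept is x − y(x)/y'(x); repeating this step gives the successive guesses x_1, x_2, and so on. The function name, the fixed number of steps, and the sample equation x^2 − 2 = 0 are assumptions chosen for illustration and are not taken from the text.

```python
def newton(y, dy, x0, steps=8):
    """Repeatedly replace a guess by the x-intercept of the tangent line.

    y  -- the function whose root is sought
    dy -- its derivative
    x0 -- the initial guess
    """
    x = x0
    for _ in range(steps):
        slope = dy(x)
        if slope == 0:
            # A horizontal tangent never meets the x-axis.
            raise ZeroDivisionError("tangent line has no x-intercept")
        x = x - y(x) / slope   # x-intercept of the tangent at (x, y(x))
    return x


if __name__ == "__main__":
    # Illustrative example: the positive root of x^2 - 2, starting at 1.
    root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
    print(root, root**2)
```

As the text notes, the method is only reliable when the starting guess is reasonably close to a root; a poor guess can send the tangent's x-intercept far away.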
Many useful physical laws are conservation laws, i.e., laws that say that some particular quantity is preserved. Mathematically speaking, proving a conservation law means proving that something has derivative zero with respect to time. In this section we shall derive a conservation law for motion in this way. As we shall see, proving that the derivative is zero follows at once from the basic laws of physics we summarised in §A.7.12. Thus this conservation law (and many others) is in a way really nothing but other basic laws in disguise (and a disguise, furthermore, that the calculus readily unmasks). Nevertheless conservation laws are so useful that physicists have made up names for the things that are preserved, such as “energy.”

Suppose we fire a projectile straight up into the air. If its velocity is great enough it will never fall back down. The minimum velocity for which this happens is called the escape velocity. Calculating it is an example of a situation where using a conservation law is very convenient. Let m = mass of the projectile, v = velocity of the projectile, G = gravitational constant, M = mass of the earth, y = distance from the projectile to the center of the earth. I say that this conservation law holds:

\[ \frac{mv^2}{2} - G\frac{Mm}{y} = \text{constant} \]

2.4.1. Check this by calculating the derivative with respect to time. Hint: recall from §A.7.12 that:

\[ ma = F = \text{gravity} = -G\frac{Mm}{y^2} \]

(Later we shall reason our way to this conservation law in a more intuitive way, as opposed to merely checking it “after the fact,” as it were. See §4.4.)

How does this help us calculate the escape velocity? If we have just enough initial velocity to escape to y = ∞ we will get there with zero velocity v = 0.

2.4.2. Plug in these conditions to determine the value of the constant in the conservation law corresponding to this scenario.

2.4.3. Use the conservation law with this value for the constant to determine the escape velocity. Hint: when the projectile is fired, y = radius of the earth.

2.4.4. ? What are the physical names for the quantities in our conservation law?

§ 2.4.2. Problems

2.4.5. Commercial aircraft fly at an altitude of about 10,000 meters. With what velocity does a projectile need to be fired from the surface of the earth to reach this altitude?

2.4.6. The escape velocity for a black hole exceeds the speed of light. To what radius would the mass of the earth need to be compressed for it to become a black hole?

§ 2.5. Differential equations

§ 2.5.1. Lecture worksheet

In many real-world scenarios we want to know the value of a certain function but we know, initially, only its derivative. If I put some money in the bank I may want to know how much I will have some years from now, for example when I retire. But this is not what the bank tells me. Instead they tell me the interest rate; that is to say, the rate at which the money is growing, or the derivative. In physics it’s even worse. We may want to know for example the position of a satellite. But, like the bank, nature doesn’t tell us. We have to start with Newton’s law

Force = mass × acceleration,

and figure out the position from there. So nature tells us the second derivative (acceleration) of what we really want to know (position).

What we are given in these kinds of situations is a differential equation, i.e., not the function itself but some condition that its derivatives must fulfil. Differential equations are equations involving derivatives, such as y' = y. In words, this equation says: the rate of growth is equal to the current amount. So the more you have the faster it grows. Things that have this property include money and rabbits, as noted in §A.5.

A solution to a differential equation is a function that satisfies it. In our example e^x is a solution, since if y = e^x then y' = e^x = y. But there are also other solutions. You can multiply e^x by any constant and it will still solve the equation: if y = ce^x then y' = ce^x = y.

2.5.1. What is the real-world meaning of the constant c in the case of money? In general, constants that occur in solutions to differential equations are determined by plugging in known initial conditions.

2.5.2. Match each differential equation with the real-world scenario it models, and explain the meaning of the variables and constants involved.

A. Growth of population with unlimited resources.
B. Growth of population with limited resources.
C. Motion of pendulum.
D. Body in free fall.
E. Predator-prey system (foxes/rabbits, shark/fish, etc.).
F. Predator-prey system with harvesting (hunting, fishing, etc.).
G. Growth of population with limited resources and harvesting.
H. Conventional two-army warfare.
I. Conventional army versus guerrilla. (Imagine the battle taking place in a jungle, with the guerrillas
hiding in the trees and bushes. The guerrillas can target enemies as in any battle, but the conventional army, since they cannot see the guerrillas, can only fire their machine guns into the jungle somewhat randomly, hoping to hit a guerrilla.)

i. $y'' = -ky$
ii. $y' = ky$
iii. $y'' = -k$
iv. $y' = ky(a - y) - b$
v. $y' = ky(a - y)$
vi. $\begin{cases} x' = -by \\ y' = -ax \end{cases}$
vii. $\begin{cases} x' = ax - bxy - ex \\ y' = -cy + dxy - ey \end{cases}$
viii. $\begin{cases} x' = -by \\ y' = -axy \end{cases}$
ix. $\begin{cases} x' = ax - bxy \\ y' = -cy + dxy \end{cases}$

2.5.3. The case of pendulum motion is more important than it might seem. We shall come back to it later, but for now we can use it to introduce an important distinction between two types of equilibria. What are the two positions in which a pendulum (with a rigid rod) can be in equilibrium? What is the qualitative difference between them? This is a useful metaphor for many less concrete situations.

This is similar to which of the following?
A ball rolling in a landscape of hills and valleys.
A ball thrown vertically into the air.
An elastic beam that is bent and then released.
A planet orbiting the sun.

2.5.4. A disease spreads in proportion to the number of encounters between infected and healthy individuals. Argue that this can be captured by a differential equation of the same form as the population growth model in problem 2.5.2, but with a, y, k now corresponding to:
number of infected
infectiousness of disease
recovery rate
total population
number of uninfected
incubation time
Hence “the growth of a population is the spread of the disease of life,” so to speak. Differential equations often reveal analogies like this.

2.5.5. If we ignore air resistance the only force acting on a falling object is the constant gravitational acceleration (g ≈ 10; see §A.7.12), which gives the differential equation $v' = -10$. Suppose you fire a gun straight up into the air. The initial velocity of the bullet is 1000 meters/second.
(a) Solve the differential equation for v as a function of t with the given initial condition.
(b) Find the height of the bullet as a function of time.
(c) With what velocity does the bullet strike the ground when it lands?
Of course this is unrealistic. The resistance of the air is considerable. Experience shows that it is roughly proportional to the square of the velocity. Then the differential equation is something like $v' = -10 - 0.004\,v|v|$.
(d) Why did I write $v|v|$ instead of $v^2$? Hint: Consider the difference between going up and coming down.
(e) With what velocity does the bullet strike the ground according to this model? Hint: Instead of solving the differential equation to find this out, use the reasonable assumption that the descending bullet will reach terminal velocity (i.e., a velocity at which it is no longer accelerating) before hitting the ground.

§ 2.5.2. Problems

2.5.6. It makes sense that Newton’s law F = ma has acceleration in it, because to stand still and to move with constant velocity is physically equivalent. That is, no physical experiment can tell one state from the other. This was known to Galileo, who explained it as follows in his Dialogue Concerning the Two Chief World Systems (1632).

Shut yourself up with some friend in the main cabin below decks on some large ship, and have with you there some flies, butterflies, and other small flying animals. Have a large bowl of water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel beneath it. With the ship standing still, observe carefully how the little animals fly with equal speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into the vessel beneath; and, in throwing something to your friend, you need throw it no more strongly in one direction than another, the distances being equal; jumping with your feet together, you pass equal spaces in every direction. When you have observed all these things carefully (though doubtless when the ship is standing still everything must happen in this way), have the ship proceed with any speed you like, so long as the motion is uniform and not fluctuating this way and that. You will discover not the least change in all the effects named, nor could you tell from any of them whether the ship was moving or standing still. In jumping, you will pass on the floor the same spaces as before, nor will you make larger jumps toward the stern than toward the prow even though the ship is moving quite rapidly, despite the fact that during the time that you are in the air the floor under you will be going in a direction opposite to your jump. In throwing something to your companion, you will need no more force to get it to him whether he is in the direction of the bow or the stern, with yourself situated opposite. The droplets will fall as before into the vessel beneath without dropping toward the stern, although while the drops are
in the air the ship runs many spans. The fish in their water will swim toward the front of their bowl with no more effort than toward the back, and will go with equal ease to bait placed anywhere around the edges of the bowl. Finally the butterflies and flies will continue their flights indifferently toward every side, nor will it ever happen that they are concentrated toward the stern, as if tired out from keeping up with the course of the ship, from which they will have been separated during long intervals by keeping themselves in the air. And if smoke is made by burning some incense, it will be seen going up in the form of a little cloud, remaining still and moving no more toward one side than the other.

This means that physical laws cannot speak directly about velocity. An observer on the shore thinks the guy in the ship is moving; but the guy in the ship could claim that he is in fact standing still and that it is the guy on the shore that is moving. As we just saw, no physical experiment can settle their dispute, so they must both be considered to be equally right so far as physics is concerned. Nature does not distinguish between them, so her laws must be equally true for them both.

To illustrate this more formally, let the person on the shore be the origin of a coordinate system, and let the ship be traveling in the positive x-direction with constant velocity v. Now imagine releasing a butterfly inside the ship, in the manner described by Galileo. Suppose the butterfly moves in the x-direction only, and let X(t) be its position in the coordinate system of an observer on the ship (i.e., taking a point inside the ship as the origin).
(a) Find the general formula for the position x(t) of the butterfly in the coordinate system of the observer on the shore.
(b) Express the position, velocity, and acceleration of the butterfly in terms of both coordinate systems.
(c) What is the conclusion?

2.5.7. Rashevsky (Looking At History Through Mathematics, M.I.T. Press, 1968) proposed the following model of the increase of agnosticism on a historical timescale.
Assume that most people are receptive to common faiths while a small fraction pN of the total population N is naturally agnostic. Let’s say that the birth rate is b and

$y' = p(b - d)(1 - y)$
$y' = p(b - d)(y - 1)$

(d) Therefore, with the initial condition $y(0) = 1/1000$,
$y(t) =$ ______.

According to Rashevsky, “If we roughly assume that for religious beliefs . . . only about one person in a thousand in early antiquity was a natural agnostic” (i.e., $y(0) = p = 1/1000$) and that “the order of magnitude of b is about $10^{-2}$ individuals/year,” then “in about the last 10,000 years, the time that has elapsed since the emergence of mankind from a primitive state, we find an increase of y from 1/1000 to only about 1/100,” and indeed “we actually find that all the major religions . . . still share between them practically all of humanity.” As for the future, “in 100,000 years the fraction y . . . will have increased to only about 2/3.”

(e) Do we have all the information we need to verify these calculations with our formula for y(t)?

§ 2.6. Direction fields

§ 2.6.1. Lecture worksheet

A useful tool for understanding a differential equation is its direction field. You construct it as follows: pick a point (x, y), plug these values for x and y into the differential equation, solve for $y'$, and draw a little line segment with this slope at the point in question. Then you repeat this for many points until you see the pattern. Let us take the population growth with unlimited resources as an example. This is the direction field for $y' = 0.01y$:

[Figure: direction field for $y' = 0.01y$.]
[Figure: direction field; vertical axis ticks at 100, 200, 300 and horizontal axis ticks at 100, 200, 300, 400.]

(f) Other constants remaining as above, what is the highest harvesting rate that can be maintained without eventually depleting the fish population (assuming that the initial population is large enough)?
(g) Draw the direction fields for this rate of fishing, and for a higher rate of fishing. Explain what these pictures show.
• Classify values where $f'(a) = 0$ as max., min., or neither.
$f''(a)$ positive $\Rightarrow$ min.; $f''(a)$ negative $\Rightarrow$ max. If $f''(a) = 0$, it could be max., min., or inflection point. To find out which, determine the value of $f(x)$ for values of x slightly greater than and less than a.

Find and classify the critical points of $f(x) = 2x^2 - \ln|x|$.
$f' = 4x - 1/x$, $f'' = 4 + 1/x^2$. $f' = 0 \Rightarrow 4x^2 - 1 = 0 \Rightarrow x = \pm\frac{1}{2}$. $f''(\pm\frac{1}{2}) = 8 > 0$, so $x = \frac{1}{2}$ and $x = -\frac{1}{2}$ are both local minima.

• Find global max. or min. of $f(x)$.
Find local max. and min. as above. Check the value of $f(x)$ at each of these points. The biggest of these values is the global max. and the smallest the global min., if such exist. Investigate the values of $f(x)$ as x approaches $\pm\infty$ or any point at which $f(x)$ is not defined (such as a point corresponding to division by zero). If $f(x) \to \infty$ in any of these cases, no global maximum exists. If $f(x) \to -\infty$ in any of these cases, no global minimum exists.

Find any local and global extrema of $f(z) = 10 + 4z + z^2 - \frac{2}{3}z^3$.
Stationary points occur where $f'(z) = 4 + 2z - 2z^2 = 0$, which means $z_1 = -1$ or $z_2 = 2$. The second derivative is $f''(z) = 2 - 4z$. Since $f''(-1) = 6 > 0$ we see that $z_1 = -1$ is a local minimum (with value $f(-1) = \frac{23}{3}$). Since $f''(2) = -6 < 0$ we see that $z_2 = 2$ is a local maximum (with value $f(2) = \frac{50}{3}$). Since $f(z) = 10 + 4z + z^2 - \frac{2}{3}z^3 = z^3\left(\frac{10}{z^3} + \frac{4}{z^2} + \frac{1}{z} - \frac{2}{3}\right)$, we see that $\lim_{z\to-\infty} f(z) = \infty$ and $\lim_{z\to\infty} f(z) = -\infty$. Hence the function has no global maximum or global minimum.

Set $f'(x) = 0$ and solve for x; list the x-values that are in the given interval. Also include the endpoints of your interval in your list. Any max. or min. occurs at one of the x-values in this list.
To classify whether max., min., or neither, evaluate $f(x)$ at the given points and determine which points give the greatest and smallest values. Non-endpoints may also still be classified with the second derivative test.

$f''$ negative $\Rightarrow$ graph of f concave down
$f'' = 0$ and changing sign $\Rightarrow$ inflection point of f

• Sketch the shape of the graph of a function based on knowledge of its derivative.
Find where the derivative is zero; at these points the function “goes flat,” i.e., has a horizontal tangent.
For each interval between these points, determine the sign of the derivative. Positive or negative derivative means that the graph goes up or down respectively. Assuming that the derivative is continuous, its sign will be the same at any point within one such interval; therefore it is enough to evaluate the derivative at any one point in the interval. The sign of the derivative on these intervals can also be inferred from the second derivative, if known ($y''$ positive $\Rightarrow$ $y'$ increasing, and so on).
Also determine what happens to the function as x goes to plus or minus ∞, either by examining the function or by determining whether the derivative is positive or negative for big values of x.
A sketch agreeing with these three types of information will be a good approximation of the shape of the graph.
Note that from information about the derivative alone it is not possible to know the vertical position of the graph, i.e., whether it needs to be shifted up or down. However, knowing the value of the function at any one point is enough to determine the vertical position of the graph.

• Sketch the direction field of a differential equation.
Pick a point (x, y), plug these values for x and y into the differential equation, solve for $y'$, and draw a little line segment with this slope at the point in question. Repeat for many points until you see the pattern.

Draw the direction field for $y' = x^2$.
[Figure: direction field for $y' = x^2$.]
Draw the direction field for y 0 = y.
[Figure: direction field for $y' = y$; horizontal axis from −5 to 5, vertical axis from −3 to 3.]
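The recipe for a direction field (pick a point, compute the slope, draw a short segment) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the helper names `direction_field` and `segment` are made up for this example, and the grid is chosen to match the axes of the figure for $y' = y$.

```python
def direction_field(f, xs, ys):
    """Slope f(x, y) at every grid point, for the equation y' = f(x, y)."""
    return [(x, y, f(x, y)) for y in ys for x in xs]


def segment(x, y, slope, length=0.4):
    """Endpoints of a short line segment through (x, y) with the given slope."""
    dx = 0.5 * length / (1 + slope * slope) ** 0.5
    dy = slope * dx
    return (x - dx, y - dy), (x + dx, y + dy)


xs = list(range(-5, 6))   # horizontal grid (assumed, as in the figure)
ys = list(range(-3, 4))   # vertical grid (assumed, as in the figure)
field = direction_field(lambda x, y: y, xs, ys)  # the equation y' = y

# Print the slopes row by row, top row first.
for y in reversed(ys):
    print(" ".join(f"{s:+d}" for (_, yy, s) in field if yy == y))
```

Each printed row is constant, which reflects the fact that for $y' = y$ the slope depends only on the height y. The endpoints returned by `segment` could be handed to any plotting tool to draw the little line segments.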
3 INTEGRATION

§ 3.1. Integrals

§ 3.1.1. Lecture worksheet

The integral $\int_a^b y\,dx$ means:
• Algebraically, the sum (hence the $\int$, which is a kind of “s”) of infinitesimal rectangles with height y and base dx:
[Figure: graph of y between x = a and x = b with one infinitesimal rectangle of base dx.]
And thus:
• Geometrically, the integral $\int_a^b y\,dx$ is the area under the graph of y(x) from x = a to x = b. (Technically the “signed area”: area below the x-axis is negative since the height of the rectangle, y, is a negative number.)
I now wish to convince you that also:
• Physically, the integral of velocity is distance; the integral of acceleration is velocity.

3.1.1. If I drive 50 km/h for one hour and then 100 km/h for three hours, how far did I go? Show that this is the area under the graph of the speed function.
3.1.2. Objects in free fall have a constant acceleration, according to Galileo. Draw the graphs of the acceleration, velocity, and distance fallen functions of a dropped stone, and explain how your graphs agree with the above characterisations of these quantities in terms of integrals.

Finally:
• Verbally, integrals often represent some kind of net change or net effect, though this viewpoint is not always applicable.
This shall become clearer in §3.2 but we can already feel it in the above examples.

3.1.3. (a) Argue that the verbal description too applies well to problem 3.1.1.
(b) How do these descriptions play out if I then drive backwards at 50 km/h for two hours? In which two senses can “distance travelled” be interpreted now? How can you express each as an integral?

3.1.4. If σ(h) is the density of water at depth h, what does the integral of σ(h) from the surface to the bottom of the sea represent?
3.1.5. Which of the following have a common etymological root meaning with the mathematical term integral?
We need to integrate immigrants into society.
Common name for 1, 2, 3, . . .
Spaghetti integrale.
None of the above

§ 3.1.2. Problems

3.1.6. Argue that $\frac{1}{b-a}\int_a^b f(x)\,dx$ represents the average value of f(x) on the interval [a, b].
[Figure: graph with a horizontal line labelled “average value”.]
Hint: this can be done very concretely by thinking of the area under the graph as so much sand, for example.
3.1.7. Argue that $\int_a^b f(x) - g(x)\,dx$ represents the area between the graphs of these two functions. Illustrate with a figure.
3.1.8. Social science: index of inequality. Class inequalities are often quantified by saying that the richest so-and-so percent have so-and-so much of all the world’s wealth. To put this in analytic form, let F(x) be the fraction of a particular resource, such as money, owned by the poorest fraction x of the population. Thus F(0.3) = 0.1 means that the poorest 30% of the population owns 10% of the resource. Gini’s index of inequality is one way to measure how evenly the resource is distributed. It is defined as the integral
$$2\int_0^1 x - F(x)\,dx.$$
(a) Show graphically what this integral represents in terms of the graph of F(x).
(b) Which of the following must always be true in any society? (Assume that no one can have a negative amount of the resource, and that F(x) is twice differentiable.)
$F(0) = 0$
$F(1) = 1$
$F'(x) \ge 0$
$F''(x) \ge 0$
F (x) < x
§ 3.2.2. Problems
None of the above
3.2.2. Discrete analog of the fundamental theorem of calculus.
(c) The maximum possible value of Gini’s index of in-
Write down a list of eight arbitrary numbers, leaving gen-
equality is and the minimum is . In
erous spaces around them. Above the gap between each
the latter case F (x) = .
pair of numbers, write down the sum of all numbers up
(d) Which of the following represents the most unequal to this point. Below the gap between each pair of num-
society? bers, write down the difference between those two num-
bers. Above and below the new lists, write down the sums
F (x) = x
of the difference list, and the differences of the sum list.
F (x) = x 2 Explain how this is related to the fundamental theorem
of calculus.
F (x) = x 3
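For readers who want to experiment with Gini's index from problem 3.1.8, here is a minimal numerical sketch. It is not part of the original problem set: the function names `integrate` and `gini` are assumptions made for this example, and the integral is approximated by a midpoint Riemann sum rather than evaluated exactly.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint Riemann sum of f over [a, b] using n rectangles."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def gini(F):
    """Gini's index 2 * integral from 0 to 1 of (x - F(x)) dx."""
    return 2 * integrate(lambda x: x - F(x), 0.0, 1.0)


# For F(x) = x**2 the exact value is 2 * (1/2 - 1/3) = 1/3.
print(round(gini(lambda x: x ** 2), 4))
```

Any other candidate distribution F can be passed to `gini` in the same way to compare societies.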
3.2.3. † What happens if, in FTC1, we use a slanted line instead
of a perpendicular one? In other words, what is dA dx , with
§ 3.2. Relation between differentiation and integra- A(x) defined like this:
tion
t dt
so
$$\frac{d}{dt}\int_a^t y(x)\,dx = \frac{y(t)\,dt}{dt} = y(t),$$
which proves FTC1.

3.2.1. What happens if we take the derivative with respect to the lower bound instead? If y(x) is a positive function, $\frac{d}{dt}\int_t^a y(x)\,dx =$ ______ because if the endpoint of integration is moved to the ______ the area decreases.

FTC2 is even easier to prove:
$$\int_a^b y'\,dx = \int_a^b \frac{dy}{dx}\,dx = \int_a^b dy$$
= sum of little changes in y from a to b
= net change in y from a to b
= $y(b) - y(a)$

§ 3.3. Evaluating integrals

§ 3.3.1. Lecture worksheet

FTC2 says that in order to integrate some function f(x) one has only to find an antiderivative F(x), that is, a function such that $F' = f$, because then
$$\int_a^b f(x)\,dx = F(b) - F(a)$$
so to evaluate the integral we just have to plug in the bounds into F(x) and take the difference between them.

To see that this is a very powerful result, consider the problem of finding the area under the parabola $y = x^2$ between x = 0 and x = 1.
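As a rough numerical illustration of the parabola example (a sketch added here, not the worksheet's own computation), the code below compares left-endpoint Riemann sums for $y = x^2$ on [0, 1] with the value $F(1) - F(0)$ that FTC2 gives for the antiderivative $F(x) = x^3/3$. The name `riemann_sum` is an assumption of this example.

```python
def riemann_sum(f, a, b, n):
    """Sum of n rectangles of base dx, height taken at the left endpoint."""
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(n))


def f(x):
    return x ** 2


def F(x):
    return x ** 3 / 3   # an antiderivative of f, since F' = f


exact = F(1.0) - F(0.0)   # FTC2: plug in the bounds and subtract
for n in (10, 100, 1000):
    print(n, riemann_sum(f, 0.0, 1.0, n), exact)
```

The sums creep towards the FTC2 value only slowly as n grows, whereas the antiderivative gives the answer in a single step.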
Let’s say that we toss a 1-inch needle on a floor with 2-
inch floor boards going east-west. The position of the
1 needle is determined by two parameters: the distance y
from the southern end of the needle to the joint to the
north of it, and the angle θ the needle makes with the
floorboards. Thus the possible values of y are 0 ≤ y < 2
and the possible values of θ are 0 ≤ θ < π. Call this the
1
“possibility space.”
As in the case of differentiation, we also need rules for how to 3.3.4. The population P (t ) of a country is growing continuously
deal with functions that are built up from standard functions at a rate of (1 + t )% per year, where t is the number of
combined in various ways. Two simple rules are: years from today.
Z Z (a) This means that P 0 (t )/P (t ) = .
c f (x) dx = c f (x) dx R b P 0 (t )
(b) Find a P (t ) dt in terms of P (a) and P (b). Hint: The
Z Z Z integrand is a “logarithmic derivative.” The integral
f (x) + g (x) dx = f (x) dx + g (x) dx evaluates to .
3.3.1. Explain why these rules are quite obvious. (c) How many percent bigger will the population be
R2 in 6 years compared to today? Hint: Work out the
3.3.2. Evaluate 1 1 + 2x 4 dx.
R6 0
integral 0 PP (t(t)) dt in two ways, once using (a) and
once using (b). The population will have grown by
%.
§ 3.3.2. Problems
cos(x 2 )(2x dx) = cos(u) du. This is easy to integrate: it’s
R R
The following example illustrates how we can deal with such
sin(u)+C , or, putting the answer back in terms of x, sin(x 2 )+C .
cases.
R1p
This technique always works when we need to integrate some- 3.4.3. Geometrically, 0 1 − x 2 d x is the area of:
thing where a function is “trapped” inside another function,
(a)
and the derivative of the trapped function appears on the out-
side (give or take a constant). In such a situation we should R π/2substitution x = cos(u), the integral be-
(b) Using the
choose the trapped function as our new variable u. For exam- comes 0 = du
ple:
$\int x^2 e^{x^3}\,dx$ : $u = x^3$
$\int 5x\sqrt{1 - x^2}\,dx$ : $u = 1 - x^2$
$\int \cos(8x + 1)\,dx$ : $u = 8x + 1$
$\int (\cos x)^7 \sin x\,dx$ : $u = \cos x$

We can find a clever way of rewriting this integrand by expressing the dashed length in the figure below in two different ways (by the Pythagorean theorem in the left figure and in terms of sines in the right one).
anti-differentiate x 2 , which in itself makes the integral worse. 3.6.1. Find x3+x
R
2 −4 dx.
But this is a small price to pay for getting a clear shot at the
logarithm, which becomes vastly simpler when differentiating.
§ 3.6.2. Problems
The rest is simple fill-in-the-blanks. The first example above
for instance goes like this: 3.6.2. (a) One of the differential equations in problem 2.5.2
describes the growth of a population with limited
Z Z
|{z} e x/2 dx = |{z}
x |{z} x 2e x/2 x/2
1 |2e{z
| {z } − |{z} } dx resources. Which one? Explain briefly the real-
f g0 f g f0 g
world meaning of the terms in the equation.
x/2 x/2
= 2xe − 4e +C
(b) For the case of the human species inhabiting the
3.5.2. Work out the last example in the table above. earth, make a rough estimate as to the values of the
constants in the equation. Give brief justifications.
§ 3.5.2. Problems (c) Solve the differential equation. Hint: First find the
derivative of time with respect to population, inte-
3.5.3. † Geometrical interpretation of integration by parts. grate this expression, and then solve for population
as a function of time.
(a) Sketch a schematic representation of the paramet-
ric curve ( f (t ), g (t )). Assume that it is always head- (d) Use the “biblical” case of an initial population of 2
ing upwards and to the right. to determine the constant of integration and sketch
(possibly with computer assistance) the graph for
(b) Pick two t -values a, b and express the area under this case.
the curve between these two points as an integral.
(e) Mark the present-day population on the graph.
(c) Draw and express the areas of the axis-parallel rect- How many years after t = 0 is it?
angles with lower-left point at the origin and upper-
right point at ( f (a), g (a)) and ( f (b), g (b)) respec- (f) At what population size does the inflection point
tively. occur? Hint: Use the differential equation instead
of the formula for the population.
(d) Explain how the integration by parts formula is ge-
ometrically evident from your figure. (g) Complete the sentence: “When population growth
stops accelerating, the population has reached
.”
§ 3.6. Partial fractions 3.6.3. Chemistry: rate of reaction. Consider a chemical reaction
in which one molecule of reagent A combines with one
§ 3.6.1. Lecture worksheet molecule of reagent B to produce one molecule of a com-
pound X. The rate at which molecules of X are produced
Consider the problem of integrating a function with several is proportional to the concentration of the reagents:
factors in the denominator, such as dx
Z
1 = k(a − x)(b − x),
dx. dt
x(1 − x) where x is the concentration of X, a, b are the initial con-
The trick here is to split the integrand into partial fractions: centrations of each reagent, in mols per unit volume, and
1 A B k is a constant. Let us assume that a < b.
= + . dt
x(1 − x) x 1 − x (a) Rewrite the equation in the form dx = ···.
So I gave each factor of the denominator its own fraction, leav- (b) Find t as a function of x. Determine the constant of
ing the numerators as unknown constants. I say that we can in integration using the fact that no molecules of the
fact find numbers A and B that make this equation true. To find compound X are present at the beginning of the re-
these numbers, multiply both sides by x(1 − x) to clear the de- action.
nominators. This gives us 1 = A(1−x)+B x. Now, for the partial
fraction decomposition to be valid this equation must be true (c) Plot the solution curve for a = 1, b = 2, k = 1.
for any value of x. So in particular it must be true if we plug in (Note that the same graph, when rotated, can be
x = 1, for instance. I chose this value of x because it simplifies read as a graph of x as a function of t .)
the equation so nicely; in fact, the equation now says 1 = B , (d) Usually one can speed up chemical reactions by in-
so we have figured out one of our constants. Another clever creasing the temperature. Suppose this increases k
choice of x will be the other root, x = 0, which gives 1 = A. So to 2, but reduces each of the concentrations a and b
actually both constants are 1. Therefore by 10% due to heat expansion of the solution. Plot
this new situation. Is the reaction faster than before?

$$\int \frac{1}{x(1-x)}\,dx = \int \frac{1}{x} + \frac{1}{1-x}\,dx = \ln|x| - \ln|1-x| + C.$$
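The partial fraction result for $\frac{1}{x(1-x)}$ can be sanity-checked numerically. The sketch below is an illustration only: the function names and the test interval [0.2, 0.7] are arbitrary choices made here, and the integral is approximated with a midpoint rule.

```python
import math


def integrand(x):
    return 1 / (x * (1 - x))


def decomposed(x):
    # Partial fractions with A = B = 1.
    return 1 / x + 1 / (1 - x)


def antiderivative(x):
    return math.log(abs(x)) - math.log(abs(1 - x))


def midpoint_rule(f, a, b, n=10_000):
    """Midpoint Riemann sum of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


a, b = 0.2, 0.7   # an interval avoiding the roots x = 0 and x = 1
numeric = midpoint_rule(integrand, a, b)
exact = antiderivative(b) - antiderivative(a)
print(numeric, exact)
```

The two printed numbers should agree to several decimal places, and `integrand` and `decomposed` return the same value at any x other than 0 and 1.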
§ 3.7. Reference summary

$\int_a^b (\text{rate of change of something})\,dt$ = net change in that thing
$\int_a^b \text{velocity}\,dt$ = net distance travelled
$\int_a^b \text{acceleration}\,dt$ = net increase in velocity

§ 3.7.5. Rules of integration

$\int 5x\,dx = 5\int x\,dx = 5\,\frac{x^2}{2} + C$

• Integrate: a power of x.
Integrate with power rule, i.e., increase exponent by 1 and divide by the new exponent.

$\int x^3\,dx = \frac{1}{4}x^4 + C$
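The constant-multiple and power rules together are enough to integrate any polynomial term by term. The following sketch shows one way this could be mechanised; the coefficient-list representation and the name `antiderivative` are assumptions of this example, and exact fractions are used so that results such as 5/2 and 1/4 appear as in the worked examples.

```python
from fractions import Fraction


def antiderivative(coeffs):
    """coeffs[k] is the coefficient of x**k.

    Returns the coefficients of an antiderivative with constant C = 0:
    each exponent is increased by 1 and the coefficient is divided
    by the new exponent.
    """
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(coeffs)]


print(antiderivative([0, 5]))         # 5x   gives (5/2) x^2
print(antiderivative([0, 0, 0, 1]))   # x^3  gives (1/4) x^4
```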
• Integrate: function plus or minus function.
Integrate each separately and keep the sign in between.

$\int \left(1 - \frac{1}{\sqrt{x}}\right) dx$

If (a constant times) the derivative of the inside function occurs on the outside: Let u = (the inside function) and solve by substitution.

• Perform a substitution (change of variables) in an integral.
Express the new variable u in terms of the old variable x or vice versa. Find $\frac{du}{dx}$ or $\frac{dx}{du}$ and use this to solve for dx. Use the resulting expression to replace the dx in the integral. Rewrite any remaining expressions involving x in the integrand in terms of u. Also translate the bounds from x-values into u-values by plugging them in for x in the formula defining u. Solve the resulting integral. If indefinite integral, rewrite the answer in terms of x by substituting back using the formula defining u.

$\int_0^1 (x - 1)^9\,dx$
Substitute $u = x - 1$. Then $\frac{du}{dx} = 1 \Rightarrow dx = du$, so the integral becomes $\int_{-1}^{0} u^9\,du = \left[\frac{u^{10}}{10}\right]_{-1}^{0} = -\frac{1}{10}$.

$\int \frac{(\ln t)^{10}}{t}\,dt$
The substitution $u = \ln t$, which implies $du = \frac{1}{t}\,dt$, gives $\int \frac{(\ln t)^{10}}{t}\,dt = \int u^{10}\,du = \frac{1}{11}u^{11} + C = \frac{1}{11}(\ln t)^{11} + C$.

$\int \frac{1}{x \ln x}\,dx$
Let $u = \ln x$. Then $du = dx/x$, so $\int \frac{1}{x \ln x}\,dx = \int \frac{1}{\ln x}\,\frac{dx}{x} = \int \frac{1}{u}\,du = \ln|u| + C = \ln|\ln x| + C$.

$\int \tan x\,dx$
Let $u = \cos x$. Then $du = -\sin x\,dx$. Thus $\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = \int \frac{1}{\cos x}\sin x\,dx = \int \frac{-1}{u}\,du = -\ln|u| + C = -\ln|\cos x| + C$.

Now differentiate your f-function to find $f'$ and anti-differentiate your $g'$ to find g. Fill this into the integration by parts formula:
$$\int f g'\,dx = f g - \int f' g\,dx$$
If bounds are involved, put them everywhere:
$$\int_a^b f g'\,dx = [f g]_a^b - \int_a^b f' g\,dx$$
If a power of x is involved, repeated integration by parts is generally needed to bring it down one degree at a time.

$\int e^{x/2} x\,dx$
With $f = x$, $g' = e^{x/2}$, $f' = 1$, $g = 2e^{x/2}$:
$\int x\,e^{x/2}\,dx = x \cdot 2e^{x/2} - \int 1 \cdot 2e^{x/2}\,dx = 2xe^{x/2} - 4e^{x/2} + C$

$\int x \cos x\,dx$
$= x \sin x - \int \sin x\,dx = x \sin x + \cos x + C$.

$\int x e^{2x}\,dx$
Integrate by parts with $f = x$ and $g' = e^{2x}$, which gives $f' = 1$ and $g = \frac{e^{2x}}{2}$. Hence $\int x e^{2x}\,dx = x\frac{e^{2x}}{2} - \int \frac{e^{2x}}{2}\,dx = x\frac{e^{2x}}{2} - \frac{e^{2x}}{4} + C = \left(\frac{x}{2} - \frac{1}{4}\right)e^{2x} + C$.
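A quick way to check an antiderivative obtained by substitution or by parts is to differentiate it numerically and compare with the integrand. The sketch below does this for three of the results above; the helper name `derivative`, the step size, and the sample points are assumptions of this example, and the comparison is only approximate.

```python
import math


def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (description, integrand, claimed antiderivative with C = 0)
checks = [
    ("tan x", math.tan,
     lambda x: -math.log(abs(math.cos(x)))),
    ("x cos x", lambda x: x * math.cos(x),
     lambda x: x * math.sin(x) + math.cos(x)),
    ("x e^(2x)", lambda x: x * math.exp(2 * x),
     lambda x: (x / 2 - 0.25) * math.exp(2 * x)),
]

for name, f, F in checks:
    for x in (0.3, 1.0):
        print(name, x, derivative(F, x), f(x))
```

In each printed row the last two numbers should agree to many decimal places, which is what $F' = f$ requires.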
$\int \frac{\ln x}{x^2}\,dx$
$= -\frac{1}{x}\ln x + \int \frac{1}{x^2}\,dx = -\frac{1}{x}\ln x - \frac{1}{x} + C$

$\int \sin(x)e^x\,dx$
If you try to solve this by integration by parts you will feel that you are going in circles: when you have done integration by parts twice you are back to the same integral that you started with:
$$\int \sin(x)e^x\,dx = \sin(x)e^x - \cos(x)e^x - \int \sin(x)e^x\,dx.$$
Writing I for the integral $\int \sin(x)e^x\,dx$, which appears on both sides, you can now solve for I to get
$$I = \frac{\sin(x)e^x - \cos(x)e^x}{2},$$
plus a constant of integration.

• Integrate: a product of powers of sines and/or cosines.
Only even powers occur: Rewrite using double angle formula
$\cos 2x = \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x$
The power of one of them is 1: Make that function u and solve by substitution.
Other cases: Rewrite using $\sin^2 x + \cos^2 x = 1$ to the forms above.

$\int_0^{\pi/2} \sin x \cos x\,dx$
Let $u = \cos x$. Then $du = -\sin x\,dx$. Thus $\int_0^{\pi/2} \sin x \cos x\,dx = \int_1^0 -u\,du = \int_0^1 u\,du = \left[\frac{1}{2}u^2\right]_0^1 = 1/2$.

$\int \cos^2 x\,dx$
$\int \cos^2 x\,dx = \int \frac{1 + \cos(2x)}{2}\,dx = \int \left(\frac{1}{2} + \frac{1}{2}\cos(2x)\right) dx = \frac{1}{2}x + \frac{1}{2}\,\frac{\sin(2x)}{2} + C = \frac{1}{2}x + \frac{\sin(2x)}{4} + C$.

$\int 3\sin^3 x\,dx$

• Integrate: a product of sine and/or cosine factors with different coefficients of x.
Rewrite using addition formulas for sine or cosine.

Use a separate letter for the constants A, B, etc., in each term. These are some numbers yet to be determined. We find these numbers as follows. Set the original fraction equal to the sum of the partial fractions. Proceed in one of two ways:
– (Best for simpler cases.) Multiply both sides by the denominator of the original fraction. Many things cancel and we are left with an equation without fractions. Plug in the values for x that make one of the factors zero (i.e., the roots of the original denominator). Each time you plug in one of these values many terms will become zero and you will get some quite simple equations from
which you can figure out the values of the constants A, B, etc.
– (Better in advanced cases.) Multiply out all parentheses and identify coefficients of like terms on left and right sides (coefficient of $x^n$ on left hand side = coefficient of $x^n$ on right hand side).

We have now reduced the problem of integrating the original fraction to the problem of integrating the sum of the partial fractions. This is done term by term.
– $\frac{A}{ax-b}$ integrates to a logarithm (substitute $u = ax - b$).
– $\frac{B}{(ax-b)^2}$, $\frac{C}{(ax-b)^3}$, etc., integrates using power rule (substitute $u = ax - b$).
– $\frac{Ax+B}{ax^2+bx+c}$ integrates to an arctangent (first complete the square in the denominator).

$\int \frac{x+1}{(x-1)(x-3)^2}\,dx$
$\frac{x+1}{(x-1)(x-3)^2} = \frac{A}{x-1} + \frac{B}{x-3} + \frac{C}{(x-3)^2}$
$\Rightarrow x + 1 = A(x-3)^2 + B(x-1)(x-3) + C(x-1)$
$\Rightarrow \begin{cases} 0 = A + B \\ 1 = -6A - 4B + C \\ 1 = 9A + 3B - C \end{cases} \Rightarrow \begin{cases} A = \frac{1}{2} \\ B = -\frac{1}{2} \\ C = 2 \end{cases}$
So $\int \frac{x+1}{(x-1)(x-3)^2}\,dx = \int \frac{1/2}{x-1} + \frac{-1/2}{x-3} + \frac{2}{(x-3)^2}\,dx = \frac{\ln(x-1)}{2} - \frac{\ln(x-3)}{2} - \frac{2}{x-3} + C$.

$\int \frac{x+2}{(x+3)(x+4)}\,dx$
$= \int \frac{2}{x+4} - \frac{1}{x+3}\,dx = [2\ln(x+4) - \ln(x+3)] + C = \ln\frac{(x+4)^2}{x+3} + C$

$\int_1^2 \frac{dx}{x^2-3x-4}$
$= \int_1^2 \frac{1/5}{x-4} - \frac{1/5}{x+1}\,dx = \frac{1}{5}\big[\ln|x-4| - \ln|x+1|\big]_1^2 = \frac{2}{5}\ln\frac{2}{3}$

$\int \frac{dx}{x^2+4x+5}$
$= \int \frac{dx}{(x+2)^2+1} = \arctan(x+2) + C$

• Integrate: an expression involving $\sqrt{\pm a^2 \pm x^2}$.
Use a trigonometric substitution that enables you to rewrite the expression under the root as a perfect square using the Pythagorean identity $\sin^2 x + \cos^2 x = 1$ or some variant of it (e.g. divided by $\cos^2 x$). In this way the root can be eliminated which should make the function easier to integrate. For the purposes of substituting back a trigonometric answer to the original variable, it is useful to interpret the original root expression as one of the sides of a right triangle; the trigonometric answer can then be interpreted as a ratio in this figure and translated accordingly.

$\int_{-3}^{3} \sqrt{9 - x^2}\,dx$
Let $x = 3\sin\theta$. Then $dx = 3\cos\theta\,d\theta$. Thus $\int_{-3}^{3}\sqrt{9 - x^2}\,dx = \int_{-\pi/2}^{\pi/2}\sqrt{9 - 9\sin^2\theta}\,(3\cos\theta)\,d\theta = \int_{-\pi/2}^{\pi/2} 3\sqrt{9\cos^2\theta}\,\cos\theta\,d\theta = \int_{-\pi/2}^{\pi/2} 3|3\cos\theta|\cos\theta\,d\theta = \int_{-\pi/2}^{\pi/2} 9\cos^2\theta\,d\theta = \int_{-\pi/2}^{\pi/2} \frac{9}{2}\big(1 + \cos(2\theta)\big)\,d\theta = \frac{9}{2}\Big[\theta + \frac{1}{2}\sin(2\theta)\Big]_{-\pi/2}^{\pi/2} = \frac{9}{2}\pi$.

• Integrate: an expression involving $(\pm a^2 \pm x^2)^n$.
If other rules are inapplicable, a substitution strategy analogous to that of the previous case may help to simplify the integrand.

• Integrate: a more complicated case not covered above which involves . . .
– . . . a rational function of sin(x) and cos(x).
Unless some simplification using trigonometric identities suggests itself, substitute $u = \tan(x/2)$. This should turn the integrand into a rational function of u, which can then be integrated. (For geometric interpretation see problem 7.3.5.)
– . . . a root expression.
Try substituting u = this root expression. If this does not help, try substituting $u^2$ = the interior of the root expression.
– . . . a fraction of exponential expressions.
Try to find a substitution that will rationalise the expression (i.e., turn it into a ratio of polynomials).

• Evaluate an integral where one of the bounds is infinite, or on an interval where the integrand becomes infinite (i.e., has vertical asymptote).
Evaluate the integral with a generic bound, say a, in place of the exceptional one, and take the limit of the answer as a goes to the exceptional point. Split into two integrals if the exceptional point is in the interior of the interval. The integral is convergent if the limit exists and is finite, otherwise divergent.

• Find the derivative of an integral $\int f(x)\,dx$ with respect to t where t . . .
– . . . is the upper bound of integration.
$f(t)$.
– . . . is the lower bound of integration.
$-f(t)$.
– . . . occurs in an expression g(t) for the upper bound of integration.
Combine the above with chain rule: $f(g(t))\,g'(t)$.
– . . . occurs in an expression g(t) for the lower bound of integration.
Combine the above with chain rule: $-f(g(t))\,g'(t)$.
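– . . . occurs in both bounds.
Treat each bound separately according to the above; add the results.

Differentiate $F(x) = \int_0^x \frac{e^t}{t+2}\,dt$.
$F'(x) = \frac{e^x}{x+2}$

Differentiate $G(x) = \int_0^{x^2} \frac{e^t}{t+2}\,dt$.
$G'(x) = \frac{e^{x^2}}{x^2+2} \cdot 2x$

§ 3.7.7. Examples

$\int \left(\frac{1}{2} + x\right)^2 \ln(1 + 2x)\,dx$

$\int_0^{\ln 3} \frac{e^x}{1+e^x}\,dx$
Substitute $u = 1 + e^x$, so that $e^x\,dx = du$. Then the integral becomes $\int_2^4 \frac{1}{u}\,du = [\ln u]_2^4 = \ln 2$.

$\int_0^{1/2} \frac{1}{2+8x^2}\,dx$
$= \frac{1}{2}\int_0^{1/2} \frac{1}{1+(2x)^2}\,dx = \frac{1}{4}\big[\arctan(2x)\big]_0^{1/2} = \frac{\pi}{16}$

$\int_1^4 \sqrt{x}\,\ln x\,dx$
By parts: $= \left[\frac{x^{3/2}\ln x}{3/2}\right]_1^4 - \int_1^4 \frac{\sqrt{x}}{3/2}\,dx = \frac{16\ln 4}{3} - \frac{28}{9}$

$\int_0^2 \frac{x}{(x^2+4)^{1/3}}\,dx$
$= \frac{1}{2}\int_4^8 \frac{du}{u^{1/3}} = \left[\frac{3u^{2/3}}{4}\right]_4^8 = 3 - \frac{3}{2^{2/3}}$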
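The closed-form answers in these examples can be cross-checked with a simple numerical integrator. The sketch below uses Simpson's rule, which is a swapped-in numerical technique and not how the examples themselves were solved; the function name `simpson` and the number of subintervals are assumptions of this example.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# (integrand, lower bound, upper bound, closed-form answer from the text)
cases = [
    (lambda x: math.exp(x) / (1 + math.exp(x)), 0.0, math.log(3), math.log(2)),
    (lambda x: 1 / (2 + 8 * x ** 2), 0.0, 0.5, math.pi / 16),
    (lambda x: math.sqrt(x) * math.log(x), 1.0, 4.0, 16 * math.log(4) / 3 - 28 / 9),
    (lambda x: x / (x ** 2 + 4) ** (1 / 3), 0.0, 2.0, 3 - 3 / 2 ** (2 / 3)),
]

for f, a, b, closed_form in cases:
    print(simpson(f, a, b), closed_form)
```

Each printed pair should agree closely, which gives some confidence that the substitution and integration by parts steps were carried out correctly.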
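$\int_1^\infty \frac{dx}{x^2+x}$
$= \int_1^\infty \left(\frac{1}{x} - \frac{1}{x+1}\right) dx = \lim_{R\to\infty} \int_1^R \left(\frac{1}{x} - \frac{1}{x+1}\right) dx = \lim_{R\to\infty} \big[\ln x - \ln(x+1)\big]_1^R = \lim_{R\to\infty} \left[\ln\frac{x}{x+1}\right]_1^R = \ln 2$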
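The limit in this improper integral can be watched numerically. The sketch below evaluates the integral from 1 to R using the antiderivative $\ln\frac{x}{x+1}$ found above, for increasing values of R; the function name `partial_integral` and the chosen values of R are assumptions of this example.

```python
import math


def partial_integral(R):
    """Integral of 1/(x^2 + x) from 1 to R, via the antiderivative ln(x/(x+1))."""
    return math.log(R / (R + 1)) - math.log(1 / 2)


for R in (10, 100, 1000, 10 ** 6):
    print(R, partial_integral(R))
print("ln 2 =", math.log(2))
```

The values increase towards ln 2 as R grows, which is what it means for the integral to be convergent.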
4 APPLICATIONS OF INTEGRATION

§ 4.1. Volume

§ 4.1.1. Lecture worksheet

4.1.1. As the area under y(x) is made up of rectangles with height y and base dx, so is the volume of a sphere made up of cylindrical slices with thickness dx.

§ 4.1.2. Problems

4.1.4. Find an integral formula for the volume of the solid of revolution generated when the area under y(x) is revolved about the y-axis. Hint: Consider the volume to be made up of thin cylindrical shells with height y and thickness dx. What is the circumference and volume of such a shell?

§ 4.2. Arc length

§ 4.2.1. Lecture worksheet

4.2.1. Explain what $\int_a^b ds$ represents. (The meaning of ds is shown in a figure in §1.1.)
4.2.2. Show that $\int_a^b \sqrt{1 + \big(y'(x)\big)^2}\,dx$ expresses the arc length of the curve y(x) from x = a to x = b. Hint: This is really nothing but the Pythagorean Theorem applied to infinitesimal triangles. Express ds in terms of dx and dy and then factor out dx.
4.2.3. Find the arc length of the semi-cubical parabola $y^2 = x^3$ from (0, 0) to (1, 1).

§ 4.3.1. Lecture worksheet

The “law of the lever” says that a lever multiplies the effect of a force by the length of the lever arm from the fulcrum to the point where the force is applied. Thus we can lift a stone with, say, a three times smaller force than that required to lift it directly by using a lever with a three times longer arm on our side than on the stone’s side.
4.3.3. Why does it not matter where the origin of the coordinate system is located?

Suppose now that the masses are located at various points $(x_i, y_i)$ in a plane (say placed on a thin metal tray) instead of along a single axis.

4.3.4. What is the physical meaning of your expression for x̄ in this context?
4.3.5. Find an expression for the center of mass in this case (i.e., the point at which you could balance the whole tray on the tip of your finger).

Suppose we want to find the center of mass of a figure, such as for example the area between the parabola $y = x^2 - 1$ and the x-axis. We can imagine this area being cut out of a sheet of metal and we want to know on which point this piece of metal could be balanced on the tip of a needle. Above we dealt only with the center of mass of a system of point-masses, but the idea is easily extended by thinking of the figure as made up of many little point-masses. In our example, we see already without calculations that x̄ = 0 for symmetry reasons, meaning that our piece of metal would balance on the edge of a knife placed along the y-axis. But what is ȳ? Along which horizontal line does the shape balance?

§ 4.3.2. Problems

4.3.8. Above we sliced the area into horizontal strips when calculating ȳ of a figure.
  (a) Explain why this is in a way more natural than using vertical slicing.
  (b) Explain how you could nevertheless compute ȳ on the basis of vertical strips. Hint: first pool the mass of each strip at its center of mass.
  (c) Give an example where vertical slices are more convenient than horizontal ones.

4.3.9. Theorem of Pappus. Show that if some plane area is rotated about the y-axis then the volume generated is equal to the area of the region times the distance travelled by its center of mass. Hint: compare the integral expression for x̄ with that for rotational volume (§4.6.1).

4.3.10. Find a similar theorem for the surface area of a solid of revolution.

4.3.11. Galileo thought that the shape of a necklace held up by its endpoints is a parabola. We shall prove that it is not. The physical principle we need for this is: nature strives to arrange the necklace in such a way that its center of gravity is as low as possible.
  Consider the parabola $y = x^2 - 1$ and view the portion below the x-axis as the shape of a necklace suspended from the points (−1, 0) and (1, 0).
  (a) Find the center of mass of this part of the parabola.
  I say that a necklace placed along this curve and released would instead attain the shape
  $$y = 0.314 \left(e^{x/0.628} + e^{-x/0.628}\right) - 1.607$$
  These two curves look like this, with the parabola drawn solid:

like those of §A.7.12, so integration can be used to go the other way and “build up” to the quantities occurring in conservation laws starting from what is known. This is done by means of the concept of work, which is another way of looking at energy. We may define work as force times distance, or, more generally, the integral of force with respect to distance, $\int F\,ds$.

4.4.1. Looking back at §2.4, explain how the two energy terms can be obtained as work integrals given known expressions for force (§A.7.12):
  (a) $-\frac{GMm}{y}$.
  (b) $\frac{mv^2}{2}$. Hint: rewrite $\int ma\,ds$ as an integral with respect to v.

But why should work, as defined above, be the same thing as energy? How can we arrive at this definition of work in the first place? The rest of this section is devoted to developing our physical intuition about this.

An object of mass m at a not-too-great height h above the surface of the earth has a potential energy of mgh. This means that we could, potentially, have it do so much work for us. You can think of, for example, a water wheel driven by a waterfall: this device takes advantage of the potential energy stored in the water by virtue of its altitude, and harnesses it for some other purpose. Thinking in terms of water wheels, it is easy to understand why potential energy is proportional to mass and height. For if the height is double, you can have the water run through twice as many wheels on its way down, so you get twice as much work out of it. And if the mass is double you can split it in half and run each part through the water wheels separately, which makes it clear that you get twice the work in this case also. By the same argument we obtain the general relation
to characterise it in terms of the action of the worker who set it moving and however long of a run-up he used. We should much prefer to express it in terms of the mass and velocity of the wagon. But this is easily done, for we know that

force = mass × acceleration

and

distance = average velocity × time

“Distance” here means the length of your run-up before you released the wagon, and “time” how long you took to complete it. Let’s say that you push equally hard throughout, so that the force, and thus acceleration, is constant.

4.4.2. Conclude from this that the kinetic energy is $\frac{1}{2}mv^2$.

The two forms of energy that we have studied are clearly interchangeable: when an object falls it “trades in” potential energy for kinetic, and conversely when its velocity is directed upwards. By means of some ramps we could turn a waterfall into a stream and conversely, so we would quite like to know which is better for driving water wheels. But it turns out to be all the same. The economy of nature is such that the exchange rate in these kinds of transactions is one to one. Energy is conserved. This agrees with experience but we can also prove it formally.

4.4.3. Prove, by taking its time-derivative, that the total energy $mgh + \frac{1}{2}mv^2$ is constant for a freely falling object.

Another useful way of establishing this sort of result is to prove that if it didn’t hold one could exploit the discrepancy to build a perpetual-motion machine which could create energy out of nothing, which is known to be impossible or at least a point on which we would be very pleasantly surprised to be proven wrong.

4.4.4. Argue on such grounds that $mgh + \frac{1}{2}mv^2$ is constant.

§ 4.4.2. Problems

4.4.5. When you are pushing the wagon to get it moving, if you push it for twice as long, while maintaining the same constant force, then you double its final speed. But the kinetic energy doesn’t double but quadruple since it is proportional to $v^2$. So by doubling the input effort you got four times the output energy stored in the system: a violation of energy conservation. Resolve the paradox.

4.4.6. Consider a lever with one lever arm just over twice as long as the other. Attached to the shorter arm is a weight of mass 2. You lift a weight of mass 1 and attach it to the longer lever arm. Then this weight will sink and the other one will rise. Doesn’t this prove that lifting a unit weight is equivalent to lifting two unit weights? By connecting several levers you could even lift any weight of mass $2^n$ with no more effort than that required to lift the original unit weight. This surely violates energy conservation. Resolve the paradox.

4.4.7. The Great Pyramid of Giza, Egypt, has a square base with side length 230 meters, and its original height was 146 meters. Its interior is basically solid stone throughout, except for a few small chambers. The stones used weigh about 2700 kg/m³.
  (a) Calculate the total work done in erecting the pyramid. Hint: Slice the pyramid into horizontal layers and express the work required to lift the stones for each layer. Then integrate to get the total work needed for all layers.
  According to the ancient Greek historian Herodotus, the pyramid was built in 20 years by 100 000 workers. Let’s check whether this seems plausible.
  (b) Estimate how much work a man can do in one hour. Hint: Picture the man lifting weights onto a ledge of height 1 meter. What weight can he lift and how many times can he repeat this in one hour?
  (c) Based on your estimate, how many man hours would have been needed to lift the stones into place for the pyramid?
  (d) Does Herodotus’s claim seem plausible?

§ 4.5. Logarithms redux

§ 4.5.1. Lecture worksheet

Integrals give us a new way of looking at logarithms, which is more illuminating in certain respects. In particular, this perspective enables us to understand the derivative of the logarithm in a more direct way than the approach we used in problem 1.4.1.

In §A.4 we saw that the essence of logarithms is that they turn multiplication into addition:

log(ab) = log(a) + log(b)    (L1)

and that a table of powers of some integer has this property when “read backwards.” If we plot the values of such a table in a coordinate system we get a picture like this:

We are looking to extend this table to include all intermediate values as well. In §A.4 we did this in an algebraic fashion. Now we wish to do it geometrically. Thus we look at the plot and try to characterise the function that runs through these points.

4.5.1. (a) Argue that it seems plausible that a function running through these points has derivative 1/x.
  (b) Explain why the function $f(x) = \int_c^x \frac{1}{t}\,dt$, where c is any constant, has this derivative.
  We now wish to check whether the function so defined in fact has the property (L1), as desired.
  (c) For the formula f(ab) = f(a) + f(b) to hold for all positive numbers a and b, it is necessary that f(p) = 0 for a certain number p. Explain why. (Hint:

§ 4.5.2. Problems

4.5.2. † Discuss what you consider to be the advantages and disadvantages of the two ways of defining logarithms presented in sections §A.4 and §4.5. Hint: aspects to consider might include how these definitions incorporate non-fractional exponents, and what is “natural” about the natural logarithm.

4.5.3. With the logarithm defined as in §4.5.1, we can define the exponential function $e^x$ as its inverse. Prove $(e^x)' = e^x$ and $e^{x+y} = e^x e^y$ starting with this definition.

4.5.4. We write ln|x| rather than ln x for the antiderivative of 1/x. This may seem like a hassle. Of course, when the logarithm is defined as above it only exists for positive numbers, but what’s stopping us from simply extending the definition to include negative numbers as well, so that ln x = ln|x|, which would spare us the trouble of writing the absolute value bars all the time? Hint: What are some other important properties that we want the logarithm function to have?

4.5.5. Argue that ln|x| + C is not quite the most general antiderivative of 1/x. Hint: replace C with a locally constant function.

§ 4.6.1. Geometrical applications of integration

Find the rotational volume generated when $f(x) = \sqrt{\sin(2x)}$, $0 \le x \le \frac{\pi}{2}$, is rotated about the x-axis.
$= \pi \int_0^{\pi/2} \left(\sqrt{\sin(2x)}\right)^2 dx = \pi \int_0^{\pi/2} \sin(2x)\,dx = \frac{\pi}{2} [-\cos(2x)]_0^{\pi/2} = \pi.$

Volume of solid of revolution generated when area under y(x) is revolved about y-axis (seen as made up of thin cylindrical shells):
$$\int_a^b 2\pi x y\,dx$$

Find the volume generated when the area under the curve $y = x^2$, 1 < x < 2, is rotated about the line x = 2.
$= \int_1^2 2\pi (2 - x) x^2\,dx = 2\pi \int_1^2 \left(2x^2 - x^3\right) dx = 2\pi \left[\frac{2x^3}{3} - \frac{x^4}{4}\right]_1^2 = 2\pi \left(\frac{16}{3} - \frac{16}{4} - \left(\frac{2}{3} - \frac{1}{4}\right)\right) = \frac{11\pi}{6}.$

Arc length of curve y(x) from x = a to x = b:
$$\int_a^b ds = \int_a^b \sqrt{1 + (y')^2}\,dx$$

Surface area of a surface of revolution (curve y(x) revolved about the x-axis):
$$\int_a^b 2\pi y\,ds = \int_a^b 2\pi y \sqrt{1 + (y')^2}\,dx$$

Average value of f(x) on [a, b]:
$$\frac{1}{b-a} \int_a^b f(x)\,dx$$
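The shell and arc-length formulas can be tried out numerically by adding up thin slices, which is the picture behind the integrals. The sketch below is an illustration added here, not the text's own method: `midpoint_sum` is a hypothetical helper that uses a plain midpoint Riemann sum, and the arc-length curve y = x² on [1, 2] is chosen only as a convenient example.

```python
import math

def midpoint_sum(f, a, b, n=20_000):
    """Midpoint Riemann sum of f over [a, b] using n slices of equal width dx."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# Shells for the area under y = x^2, 1 < x < 2, rotated about the line x = 2:
# each shell has radius (2 - x), height x^2 and thickness dx.
shell_volume = midpoint_sum(lambda x: 2 * math.pi * (2 - x) * x**2, 1.0, 2.0)
print(shell_volume, 11 * math.pi / 6)

# Arc length of y = x^2 on the same interval, from ds = sqrt(1 + (y')^2) dx with y' = 2x.
arc_length = midpoint_sum(lambda x: math.sqrt(1 + (2 * x) ** 2), 1.0, 2.0)
print(arc_length)
```

The first line printed should show the numerical shell volume next to the closed-form value 11π/6 from the worked example.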
§ 4.6.2. Center of mass
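$$\bar{x} = \frac{\int_a^b x\,y\,dx}{\int_a^b y\,dx} \qquad \bar{y} = \frac{\int_a^b \frac{1}{2} y^2\,dx}{\int_a^b y\,dx}$$

$$\bar{x} = \frac{\int_a^b x (f - g)\,dx}{\int_a^b (f - g)\,dx} \qquad \bar{y} = \frac{\int_a^b \frac{1}{2} (f^2 - g^2)\,dx}{\int_a^b (f - g)\,dx}$$

$$\bar{x} = \frac{\int_a^b x\,ds}{\int_a^b ds} = \frac{\int_a^b x \sqrt{1 + (y')^2}\,dx}{\int_a^b \sqrt{1 + (y')^2}\,dx}$$

$$\bar{y} = \frac{\int_a^b y\,ds}{\int_a^b ds} = \frac{\int_a^b y \sqrt{1 + (y')^2}\,dx}{\int_a^b \sqrt{1 + (y')^2}\,dx}$$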
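As an illustration of the first pair of formulas (a region between a curve y(x) ≥ 0 and the x-axis), the sketch below computes the center of mass of a half-disc of radius 1, for which the known value is ȳ = 4/(3π). The half-disc example and the names `midpoint_sum` and `centroid_under_curve` are assumptions introduced here; they do not come from the text.

```python
import math

def midpoint_sum(f, a, b, n=20_000):
    """Midpoint Riemann sum of f over [a, b] using n slices of equal width."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def centroid_under_curve(y, a, b):
    """Center of mass (x_bar, y_bar) of the region between y(x) >= 0 and the x-axis."""
    area = midpoint_sum(y, a, b)
    x_bar = midpoint_sum(lambda x: x * y(x), a, b) / area
    y_bar = midpoint_sum(lambda x: 0.5 * y(x) ** 2, a, b) / area
    return x_bar, y_bar

# Upper half of the unit disc: y = sqrt(1 - x^2) on [-1, 1].
x_bar, y_bar = centroid_under_curve(lambda x: math.sqrt(1 - x * x), -1.0, 1.0)
print(x_bar, y_bar, 4 / (3 * math.pi))
```

The factor ½y² in the numerator of ȳ reflects pooling the mass of each vertical strip at its own midpoint height y/2.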
§ 4.6.3. Work
$$\text{work} = \text{force} \times \text{distance} = \int F\,ds$$
[Figure: ln(x), with the points 1 and x marked.]
5 POWER SERIES

§ 5.1. The idea of power series

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$

5.1.3. Suppose you know the power series for sine and cosine but have no calculator at your disposal. In which of the following situations could you use the power series to resolve your problem?
  – I remember the wavy shape of the graph of the sine function, but I forget how to plot it and tell it apart from the cosine graph.
  – I remember that the sine and cosine functions are basically each other’s derivative, except there is a minus sign somewhere, and I forget where it goes.
  – I remember that sin(60°) is something quite simple, but I forget the exact value.

This method of repeated differentiation for finding power series is in principle always applicable. But in practice we rarely derive power series by the method of repeated differentiation. Instead we build them up from standard series such as the above, by algebraic manipulations like substituting, multiplying, and so on, as we are used to doing for ordinary polynomials. Furthermore we shall see below that many important series arise more naturally in other ways altogether.

5.1.4. By picturing the graph of ln(x), I feel that the power series for ln(1 + x) starts with a [positive/negative/zero] constant term, a [positive/negative/zero] linear term, and a [positive/negative/zero] quadratic term.

5.1.5. Argue that the power series for the sine implies that sin x ≈ x when x is small. (This is a useful approximation in many situations. We mentioned it already in §A.3, and we also effectively used this approximation in §6.4 when deriving the differential equation for pendulum motion.)

5.1.6. Show that applied to a general function f(x) the method gives
$$f(x) = f(0) + \frac{f'(0)}{1!} x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \cdots$$

One use of the idea of polynomial approximation is to tackle difficult integrals. We quite often face integrals for which none of our usual integration tricks work, such as the integral of $e^{-x^2}$ (the normal distribution function at the heart of statistical theory) or $\sqrt{\frac{a^4 + (b^2 - a^2) x^2}{a^4 - a^2 x^2}}$ (the arc-length of the ellipse
$x^2/a^2 + y^2/b^2 = 1$, an important problem in astronomy since planets move in elliptical orbits). In such cases the best we can do is often to expand the function as a power series and integrate term by term, which gives us the desired integral in series form.

5.1.7. Let us evaluate $\int e^{-x^2}\,dx$ in this way.
  (a) If I include only the first four non-zero terms, the integral is ______
  (b) Though not an exact solution in closed form, this is still very useful. For example, I could use it to find a good approximation to $\int_0^1 e^{-x^2}\,dx$. Suppose I use only the first three non-zero terms for this. This must already be quite good because I see from the above that the next term would only affect the ______ decimal and subsequent terms are even smaller.

5.1.8. Suppose I use the first five terms of the power series for $e^x$ to approximate $e^{0.1}$, then use this result to find an approximation for e by [raising the result to the power 10, taking 1 divided by the result, multiplying the result by 10, taking the ln of the result and multiplying by 10]. Alternatively, I could find an approximation for e directly from the series by [plugging in x = 0, plugging in x = 1, using a geometric series, using a binomial series]. Which of the two methods will be more accurate? [the first, the second, both equal]

§ 5.1.2. Problems

5.1.9. (a) Estimate the sine of 1° using nothing but a simple calculator that only has the operations +, −, ×, /.
  (b) Check your answer using a more advanced calculator that has a sine button.
  (c) Is the “more advanced” calculator really more advanced, or does it just have the algorithm of (a) on “speed dial”?

5.1.10. (a) By considering the roots of sin(x)/x, argue that its power series
  $$\sin(x)/x = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots$$
  can be factored as
  $$\left(1 - \frac{x^2}{\pi^2}\right) \left(1 - \frac{x^2}{4\pi^2}\right) \left(1 - \frac{x^2}{9\pi^2}\right) \cdots$$
  by analogy with the way one factors ordinary polynomials, such as $x^2 - x - 2 = (x + 1)(x - 2)$.
  (b) What is the coefficient of $x^2$ when the product is expanded?
  (c) Equate this with the coefficient of $x^2$ in the ordinary power series and use the result to find a formula for the sum of the reciprocals of the squares, $\sum 1/n^2$.

§ 5.2. The geometric series

§ 5.2.1. Lecture worksheet

5.2.1. (a) What is the greatest number smaller than 1? One is inclined to suggest a = 0.99999 . . ., but argue against this by considering 10a − a.
  (b) Generalise your argument to find a closed formula for $1 + x + x^2 + x^3 + \cdots$

5.2.2. Explain how power series are related to the paradox of motion mentioned by Aristotle, Physics, 239b11: “[Zeno] asserts the non-existence of motion on the ground that that which is in locomotion must arrive at the half-way stage before it arrives at the goal,” and then the half-way stage of what is left, etc., ad infinitum.

5.2.3. Derive the series
  $$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$$
  by first noting that
  $$\ln(1 + x) = \int_1^{x+1} \frac{1}{t}\,dt = \int_0^x \frac{1}{1 + u}\,du,$$
  then applying the geometric series, then integrating term by term.

§ 5.2.2. Problems

5.2.4. What is 0.888 . . .? Does it “spill over” like we saw 0.999 . . . do in problem 5.2.1a?

5.2.5. By the fundamental theorem of calculus, the arctangent is the integral of its derivative.
  (a) Use this to find a power series for the arctangent. (To make sure that you take the constant of integration into account, check that your constant term is correct using the geometrical definition of the arctangent.)
  (b) Find the value of arctan(1) in two ways: by the geometrical definition, and from the power series.
  (c) Equate these two expressions for arctan(1) to find an infinite series representation for π.
  When Leibniz found this series he concluded that “God loves the odd integers,” as you can see in the figure below (taken from his 1682 paper).
  (d) ⋆ What does Leibniz’s series have to do with a square of area 1, which is what Leibniz has drawn on the left? Hint: The use of the Greek letter π to denote the famous circle constant is a relatively recent invention. It was never used in Leibniz’s time, and certainly not by the ancient Greeks. This is because they preferred to formulate mathematical truths geometrically rather than by “formulas.” So to understand Leibniz’s mode of expression you should consider how to express your formula for π in purely geometrical terms.
  Leibniz’s series is beautiful but it is not really very efficient for computing π. Already in 1424 al-Kashi had computed π with 16-decimal accuracy using different methods.
  (e) ⋆ Estimate how many terms of Leibniz’s series must be added together to achieve al-Kashi’s accuracy.
  Here is al-Kashi’s result in his own notation:
  That’s 3.1415926535897932. Note the interesting way in which our symbols for 2 and 3 are derived from their Arabic counterparts. The Arabic symbols are perhaps a more natural way of denoting “one and then some.”

5.2.6. In this problem we shall show that e is not a rational number, i.e., not a ratio of two integers. We shall do this by assuming on the contrary that e = p/q for some integers p and q, and showing that this leads to a contradiction.
  (a) Find a series representation of e using the power series for $e^x$, and set it equal to p/q.
  (b) Multiply both sides by q!. You will find that the infinite series now starts with a series of integers and then from a certain point onwards becomes a series of fractions. The first non-integer term is ______
  (c) By writing down this and the next few terms, we see that, from this point onwards, the series is [</=/>] the geometric series
  $$\frac{1}{q + 1} + \frac{1}{(q + 1)^2} + \frac{1}{(q + 1)^3} + \cdots$$
  which in closed form = ______
  (d) Consequently, we have shown that [e, p, q, p!, q!, pq!/q] can be written as [an integer, a negative number, a non-integer, 0] which is absurd. Therefore our initial assumption e = p/q must have been false.

§ 5.3.1. Lecture worksheet

The binomial series
$$(1 + x)^q = 1 + q x + \frac{q(q - 1)}{2!} x^2 + \frac{q(q - 1)(q - 2)}{3!} x^3 + \cdots$$
is perhaps best thought of by analogy with the integer-exponent case. When q is an integer the series dies after the (q + 1)th term, and one has the trivial results
$$(1 + x)^2 = 1 + 2x + x^2, \qquad (1 + x)^3 = 1 + 3x + 3x^2 + x^3, \qquad \text{etc.}$$
Here we can think of, say, the coefficient of $x^2$ in the last expression as follows. To get an $x^2$-term when expanding $(1 + x)^3 = (1 + x)(1 + x)(1 + x)$ we need to choose x’s from two of the parentheses and 1 from the third. For the first x we have three choices, and for the second x we have two choices, giving 2 · 3 = 6 choices in total, except that we must divide by the number of ways in which these two things we choose can be ordered internally (choosing first the third parenthesis and then the first is the same as choosing first the first and then the third), which is 2!, thus explaining the coefficient 3. The general binomial theorem can be thought of in the same way, even though q is no longer an integer. Let’s try the $x^2$ coefficient again. To get an $x^2$-term when expanding $(1 + x)^q$ we need to choose x’s from “two of the q parentheses” so to speak (whatever that is supposed to mean when q is something like −1/2). For the first x we have q choices, and for the second x we have q − 1 choices, and correcting for internal ordering the coefficient for $x^2$ should be $\frac{q(q-1)}{2!}$.

5.3.1. The binomial series for q = 2, q = −1, q = 1/2 have very different standing. Match them with a suitable description:
  – becomes another famous series
  – becomes finite
  – amazingly works
  – equals a logarithm function
  We typically use the binomial series for functions involving:
  – powers
  – fractions
  – roots
  – logarithms other than ln

An important application of the binomial series concerns the inverse trigonometric functions.
5.3.2. (a) Recall from problem 4.2.4 how to express arcsin(x) as an integral.
  (b) Find a power series representation of arcsin(x) by expanding the integrand as a binomial series and integrating term by term.

§ 5.3.2. Problems

5.3.3. The general binomial series can be bypassed in specific cases in the following way. Let’s say we are looking for the power series of $\sqrt{1 + x^2}$. This means that we want a series such that
  $$\sqrt{1 + x^2} = A + B x + C x^2 + D x^3 + \cdots,$$
  or, in other words,

• We cannot naively assume that we can always manipulate infinite series according to the same rules as finite expressions (although this works more often than not).

• We can often avoid pitfalls and be more careful in our reasoning by considering the infinite series as the limit case of a finite expression.

For the purposes of finitistic analysis, we can cut the series off and add up all the terms we have up to that point. This is called a partial sum. Often the terms of the series shrink very quickly. Then the partial sums will soon be basically equal to the whole series since the cut-off terms are so small. In such a case we say that the series converges, i.e., approaches a particular value. Convergent series are “almost like a finite expression” in this sense, so it is not surprising that they can generally be treated as such without any need to worry about falling into absurdities like those of problem 5.4.1.
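The idea of a partial sum is easy to try out. The following sketch is an added illustration under stated assumptions: the helper name `partial_sums` is invented here, and the two series (the one for e obtained by putting x = 1 in the series for $e^x$, and the harmonic series) are chosen only to contrast terms that shrink quickly with terms that shrink slowly.

```python
import math
from itertools import accumulate

def partial_sums(terms):
    """Running totals: the k-th entry is the sum of the first k terms."""
    return list(accumulate(terms))

# Terms 1/n! shrink very quickly, so the partial sums settle almost at once.
e_sums = partial_sums(1 / math.factorial(n) for n in range(12))

# Terms 1/n shrink slowly, and the partial sums keep drifting upward.
harmonic_sums = partial_sums(1 / n for n in range(1, 13))

for k in (3, 6, 12):
    print(k, e_sums[k - 1], harmonic_sums[k - 1])
print("e =", math.e)
```

After twelve terms the first column of sums already agrees with e to about eight decimals, while the second column shows no sign of levelling off.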
Which proof is valid? [the first/the second/both/neither]. The result shows that a series can [be computed/converge/diverge/vanish/oscillate] even though [its terms go to 0/all its terms are positive/it has only integer denominators/it has the same derivative as ln]. Can the same thing happen with a geometric series? [yes/no]

§ 5.4.2. Problems

5.4.6. Another proof that the harmonic series diverges can be given on the basis of the inequality
  $$\frac{1}{n - 1} + \frac{1}{n} + \frac{1}{n + 1} > \frac{3}{n}$$
  (a) Prove this inequality. Hint: geometrically, it reflects the fact that the function 1/x “flattens out” as x increases.
  (b) Apply this inequality to the harmonic series
  $$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = 1 + \left(\frac{1}{2} + \frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7}\right) + \cdots$$
  and show how the divergence of the series follows from this.

5.4.7. I claimed that it is important to distinguish divergent series from convergent ones on the ground that the former can lead to absurdities if ordinary algebra is assumed to apply to them. However, I showed only that divergence of the type in problem 5.4.3 leads to absurdities, not that divergence of the type of problem 5.4.5 does so also. Derive an absurdity by careless reasoning with the latter series to establish that my point was well taken.

5.4.8. In fact, even convergent series are not entirely free of “paradoxes.”
  (a) Using the power series for the logarithmic function, write down a series expression for ln(2).
  (b) Rearrange the order of these terms by moving some negative terms toward the beginning of the series in such a way that after every positive term comes the next two negative terms. Now subtract from every positive term the negative term that follows, and sum the resulting series.
  (c) Discuss.

5.4.9. Consider the product series
  $$\prod_{p \text{ prime}} \frac{1}{1 - \frac{1}{p}} = \left(\frac{1}{1 - \frac{1}{2}}\right) \left(\frac{1}{1 - \frac{1}{3}}\right) \left(\frac{1}{1 - \frac{1}{5}}\right) \left(\frac{1}{1 - \frac{1}{7}}\right) \cdots$$
  (a) Expand each term as a geometric series and argue that the result of multiplying everything out must be
  $$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots$$
  since every integer has a unique prime factorisation.
  (b) Deduce from this and problem 5.4.5 that there must be infinitely many primes.

§ 5.5. Reference summary

§ 5.5.1. Standard series

General power series of f(x):
$$f(x) = f(0) + \frac{f'(0)}{1!} x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \cdots$$

General power series of f(x) centered at x = a:
$$f(x) = f(a) + \frac{f'(a)}{1!} (x - a) + \frac{f''(a)}{2!} (x - a)^2 + \frac{f'''(a)}{3!} (x - a)^3 + \cdots$$

Power series of elementary functions:
$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$$

Geometric series (|x| < 1):
$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots$$

Binomial series (|x| < 1):
$$(1 + x)^q = 1 + q x + \frac{q(q - 1)}{2!} x^2 + \frac{q(q - 1)(q - 2)}{3!} x^3 + \cdots$$

§ 5.5.2. Terminology

Taylor series: power series of a function centered at some point x = a
Maclaurin series: power series of a function centered at the origin; special case a = 0 of Taylor series
partial sum of a series: sum of cut off series; the terms of the series added together up to a certain point
convergent series: the partial sums of the series approach a specific value as more and more terms are added
divergent series: the partial sums of the series do not approach a specific value as more and more terms are added (instead go to ±∞ or oscillate)
alternating series: every other term positive, every other negative
n!: Multiply n by every integer below it.

$\sin(x^2)$
$= (x^2) - \frac{(x^2)^3}{3!} + \frac{(x^2)^5}{5!} - \frac{(x^2)^7}{7!} + \cdots$
$= x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \frac{x^{14}}{7!} + \cdots$

• Find the power series for a given function from first principles.
(Hardly ever the easiest way to find a series in practice.) Set the function equal to $A + Bx + Cx^2 + Dx^3 + \cdots$. Plug in zero to determine A. Take the derivative of both sides, and plug in zero again to determine B. Repeat.

• Multiply two series.

$e^x \sin x$
$= \left(1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right) \left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots\right)$
$= x + x^2 + \left(\frac{1}{2!} - \frac{1}{3!}\right) x^3 + \left(\frac{1}{3!} - \frac{1}{3!}\right) x^4 + \cdots$
$= x + x^2 + \frac{x^3}{3} + \cdots$
(Note that we could not multiply the 1 by the $\frac{x^5}{5!}$ to put a $\frac{x^5}{5!}$ in the answer, since there is a $\frac{x^4}{4!}$ hiding in the dots of the $e^x$-series, which when multiplied by x would affect the fifth-power term.)

Evaluate $\int_0^1 e^{-x^2}\,dx$ using a power series.
$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \implies e^{-x^2} = 1 - \frac{x^2}{1!} + \frac{x^4}{2!} - \frac{x^6}{3!} + \cdots = 1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \cdots$. Thus $\int_0^1 e^{-x^2}\,dx = \int_0^1 \left(1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \cdots\right) dx = \left[x - \frac{x^3}{3} + \frac{x^5}{10} - \frac{x^7}{42} + \cdots\right]_0^1 = 1 - \frac{1}{3} + \frac{1}{10} - \frac{1}{42} + \cdots$.

Approximate $f(x) = x^2 + \int_0^x \sin^2 t\,dt$ by a power series of degree 2.
Using the chain rule and the FTC we get $f'(x) = 2x + \sin^2 x$ and $f''(x) = 2 + 2 \sin x \cos x$. Hence $f(0) = 0$, $f'(0) = 0$, $f''(0) = 2$. Thus the power series approximation is $f(x) \approx x^2$.
6 DIFFERENTIAL EQUATIONS

§ 6.1. Separation of variables

§ 6.1.1. Lecture worksheet

The simplest strategy for solving differential equations is separation of variables: move all x’s to one side and all y’s to the other, then integrate both sides. Thus if we have the equation $\frac{dy}{dx} = x^2 y$ we rewrite it as $dy/y = x^2\,dx$ and then integrate both sides to get $\ln|y| = x^3/3 + C$, or $y = \pm e^{x^3/3 + C}$.

This example alerts us to some technical points that often come up in this context. First of all you may be upset that I included the constant of integration only on the right hand side of the equation, even though I integrated both sides. But this comes to the same thing, for if I had included constants on both sides, say $C_1$ on the left and $C_2$ on the right, then I could just move them to the same side to get $C_2 - C_1$ on the right, which we might as well denote by a single letter C since a constant minus a constant is just another constant. Also, we dislike having constants in the exponents; it’s impractical. Therefore the standard trick in these kinds of situations is to rewrite the solution as $y = e^{x^3/3 + C} = e^{x^3/3} e^C = A e^{x^3/3}$. Again, $e^C$ is just another constant so there is no point in writing it this way. It is neater to just give it its own letter, A.

6.1.1. Solve the differential equation for population growth with unlimited resources (from problem 2.5.2). Note that our manner of rewriting constants makes the final constant easy to interpret in real-world terms.

6.1.2. Find the equations for warfare in problem 2.5.2. For the conventional warfare case, we shall now prove the famous military-strategic maxim “never divide your forces,” or, if you prefer, “divide and conquer” (divide the enemy, that is).
  (a) In these equations, the derivatives are taken with respect to time. So $y'$ means $\frac{dy}{dt}$. But by dividing one equation by the other we can obtain a new equation involving only $\frac{dy}{dx}$ and no t. Do this.
  (b) Solve this differential equation.
  (c) What is the real-world meaning of the constant of integration? Hint: Consider the cases where one army has been depleted.
  Suppose you and the enemy have equal fighting efficiency, a = b = 1. You have 5000 soldiers and the enemy has 7000.
  (d) If you took the enemy head on, how many of their soldiers would survive the battle and march on toward your capital?
  (e) Suppose you managed to split the enemy into two groups of 4000 and 3000 soldiers, for example by blowing up a bridge. If you take them on one at a time, how many enemies will survive to march on your capital in this case?
  (f) Explain how, more generally, the conclusion “never divide your forces” can be deduced from 6.1.2b directly.
  (g) Solve the equations for guerrilla warfare. Does the same maxim apply in this case?

§ 6.1.2. Problems

6.1.3. Geometrical interpretation of separation of variables. Explain how solving $y' = y$ by separation of variables corresponds to the figure below. Areas in the same shade are equal. The point generalises to any separable differential equation.

6.1.4. Consider a differential equation with separated variables, $f(x)\,dx = g(y)\,dy$, to be solved for the initial condition $y(x_0) = y_0$. Instead of taking the indefinite integral of both sides and including a constant of integration we can take the definite integrals $\int_{x_0}^{x} f(x)\,dx = \int_{y_0}^{y} g(y)\,dy$ which gives us the solution directly, bypassing the need for the constant of integration. Explain why this works. Hint: this can be done using problem 6.1.3.

6.1.5. Forensic medicine. Newton’s law of cooling says that the temperature, H, of a hot object decreases at a rate proportional to the difference between its temperature and that of its surroundings, S:
  $$\frac{dH}{dt} = -k(H - S)$$
  (a) ⋆ If you stir your coffee, does it cool faster or slower? How is this reflected in Newton’s law?
  The body of a murder victim is found at noon in a room with a constant temperature of 20°C. At noon the temperature of the body is 35°C; two hours later the temperature of the body is 33°C.
  (b) Find the temperature of the body as a function of t, the time in hours since it was found.
  (c) Explain how you can check your work by considering the cases t = 0 and t → ∞.
  (d) When did the murder occur? Assume that the victim had the normal body temperature 37°C at the
43
time of the murder. Provide the answer in both exact and decimal form.

6.1.6. Since atmospheric pressure decreases when you climb a mountain, it ought to be possible to determine one’s altitude simply by measuring the atmospheric pressure. In this problem we shall derive a formula that does precisely this.

For this purpose we need Boyle’s law of gases, which states that pressure p is proportional to density σ, i.e., p = aσ, for some constant a. (Background: Boyle discovered this law in 1662 using “a long glass-tube, which, by a dexterous hand and the help of a lamp, was in such a manner crooked at the bottom, that the part turned up was almost parallel to the rest of the tube.” The pressure exerted on the enclosed air is the combined effect of the atmospheric pressure $p_0$ and the weight of the excess mercury (measured by h). The density of the enclosed air is of course readily measured by v. Thus, by pouring in more mercury, we can test the effect of an increase in pressure on density, which reveals Boyle’s law: every unit increase in pressure causes an increase in density of 1/a units.)

[Figure: Boyle’s bent glass tube, open to the atmospheric pressure $p_0$ at the top, with the height h of the excess mercury and the length v of the enclosed air marked.]

(c) Remarkably, this problem can be alleviated by differentiating both sides. Do so! This leads to the differential equation dp/dh = ____.

(d) Solve the resulting differential equation for p as a function of h. Consider p(0) in order to determine the constant of integration.

(e) Explain how you can check your work by considering the physical meaning of p(∞).

(f) Solve for h as a function of p.

This formula gives an easy way of finding the altitude from the pressure, as sought. Note that the constants in the final formula are easily determined once and for all, so that pressure is indeed the only input that needs to be measured in the field.

(g) ? Does the formula also work below sea level?

The “column of air” part of the argument may have bothered you. Robert Hooke (Micrographia, 1665) explains it as follows: “I say Cylinder, not a piece of a cone, because, as I may elsewhere shew in the Explication of Gravity, that triplicate proportion of the shels of a Sphere, to their respective diameters, I suppose to be removed by the decrease of the power of Gravity.” In other words, while the base area of a cone with its vertex at the surface of the earth is as the height squared, gravity is as the inverse height squared, meaning that the weight is equivalent to that of a cylinder with constant gravity.
not exist as far as the spread of the disease is concerned, so we do not need R in our analysis.)

(d) Find the equation for I as a function of S and draw its graph for the case of one initial sick student and the others all susceptible.

(e) How many students remain uninfected at the end of the epidemic?

(f) Suppose half the students were vaccinated against the flu (and thus not part of the susceptible population). With a and b as above, and I = 1 at t = 0, how many students remain uninfected in this case? Indicate how this relates to your direction field.

§ 6.2. Statics

§ 6.2.1. Lecture worksheet

Statics is the study of physical systems in equilibrium. Things that don’t move, in other words. In the problems of this section we shall see how two interesting statics problems reduce to differential equations. This comes about because they involve tangential forces, and tangents are related to derivatives.

§ 6.2.2. Problems

(b) Find a differential equation for the tractrix by expressing the slope of the curve in terms of this triangle. dy/dx = ____

(c) Separate the variables and make the substitution $u^2 = 1 - y^2$. Then: dx = ____ du

(d) Factor out a u, split into partial fractions, and integrate. Substitute back to get x as a function of y, and choose the constant of integration so that the asymptote (along which the free end of the string is pulled) is the x-axis and the point (0, 1) corresponds to the vertical position. Then: x = ____

(b) Use this to express the condition of equilibrium as a differential equation.

(c) Solve it.

6.2.2. The shape of a freely hanging chain suspended from two points is called the “catenary,” from the Latin word for chain. In principle any piece of string would do, but one speaks of a chain since a chain with fine links embodies in beautifully concrete form the ideal physical assumptions that the string is non-stretchable and that its elements have complete flexibility independently of each other.
T0 as $\frac{dy}{dx} = s$

(e) Solve the resulting differential equation. Take the constant of integration to be zero (this corresponds to a convenient choice of coordinate system).

(f) Solve for s in the resulting expression.

(g) Substitute this expression for s into the original differential equation for the catenary.

(h) Verify by differentiation that $y = (e^{x} + e^{-x})/2$ is a solution of the resulting differential equation.

(i) Sketch the graphs of $e^{x}$, $e^{-x}$, and the catenary, as well as the coordinate system axes in the same figure.

The link between the catenary and logarithms led Leibniz to suggest that measurements on an actual hanging chain could be used in place of logarithm tables for calculations. For your amusement I have included below the figure from Leibniz’s 1692 paper on the “linea catenaria,” as he calls it. Having solved the problem, you can easily understand what Leibniz means by “linea logarithmica.”

[Figure: excerpt from Leibniz’s 1692 paper on the catenary, with its Latin text.]

§ 6.3.2. Problems

6.3.1. Problems about balls rolling frictionlessly down curved ramps are reduced to differential equations by the fact that the speed acquired is equal to the speed of an object in free fall having covered the same vertical distance. This is a simple consequence of energy conservation: no matter how the ball descends, the speed it acquires must be precisely sufficient to take it back up to its starting point, whether by the same or any other path. The speed of an object falling under constant gravitational acceleration is of course proportional to time, but to characterise the curve geometrically we do not want time to figure in our equations. Therefore we note that, since distance fallen is proportional to time squared, time is proportional to the square root of the distance fallen.

(a) Prove this by integrating the equation acceleration
= g twice (using the initial conditions corresponding to an object dropped from rest).

Thus we have speed in geometrical terms as proportional to the square root of the vertical distance covered. Stated as a differential equation, this becomes

$\frac{\sqrt{dx^2 + dy^2}}{dt} = a\sqrt{y}.$

(b) The appearance of time in the above equation is an obstacle to finding a solution in purely geometrical terms as an equation in x and y. However, consider the special problem of finding a curve along which a ball descends at uniform vertical speed, so that dt = dy.

(c) Sketch a rough guess of what the solution curve will look like based on your physical intuition.

(d) Find an equation for the solution curve by solving the differential equation. Graph it and check your guess.

6.3.2. The following problem establishes a general result about beads on ramps which will be of use on multiple occasions in later problems. Consider a bead sliding down a ramp of shape y(x).

(a) Using the same reasoning as in problem 6.3.1, express the velocity ds/dt of a particle released from the top of the ramp as a function of y.

(b) The time it takes the particle to reach the bottom is $\int dt$. Using the above, rewrite this integral in terms of x and y.

6.3.3. Many second-order differential equations arise from Newton’s law F = ma. However, the actual Newton’s law is not F = ma but $F = \frac{d}{dt}(mv)$.

(a) Explain in terms of rules of differentiation why these two forms of the law are often equivalent in practice.

The saying that something relatively easy is “not rocket science” was perhaps coined by someone familiar with this distinction, because in rocket science it is in fact necessary to use the more complicated form $F = \frac{d}{dt}(mv)$. Consider a rocket in outer space with no external forces acting on it. Then mv is constant (“conservation of momentum”) since $\frac{d}{dt}(mv) = F = 0$. But the rocket can still move forward by throwing out parts of its mass in the form of exhaust products, say with velocity −b relative to the ship. Thus for any infinitesimal time period dt we have the following “before” and “after” scenarios:

           time      mass      velocity
ship       t         m         v
ship       t + dt    m + dm    v + dv
exhaust    t + dt    −dm       v − b

(b) Using conservation of momentum, show that this leads to the differential equation

$\frac{dv}{dm} = -\frac{b}{m}$

Hint: Recall the principles for discarding negligible infinitesimal expressions from §1.3.

(c) Solve the differential equation. (Find v as a function of m, i.e., your answer should have the form v = ···)

You have a choice between two space ships. Ship A has a mass of m = 1 and an exhaust velocity of b = 1. Ship B also has a mass of m = 1 but has an exhaust velocity of b = 2. Ship A is cheaper so if you buy it you can afford more fuel: a mass of 2 instead of just a mass of 1 if you buy Ship B. Thus when you start your journey (from rest, v = 0) the masses of the ships would be m = 3 and m = 2 respectively for Ship A and Ship B.

(d) Use these initial conditions to determine the value of the constant in your expression for v for each ship.

(e) Which ship has a higher terminal velocity (i.e., velocity when fuel is exhausted, i.e., when m = ship’s mass = 1)?

§ 6.4. Second-order differential equations

§ 6.4.1. Lecture worksheet

Pendulum motion is the prototype for all periodic phenomena. Indeed, we shall see in this chapter that the many variants of the pendulum motion problem exhaust the better part of the theory of second-order differential equations. But our first order of business is to derive the equation for pendulum motion in its very simplest case.

[Figure: pendulum of length L with horizontal displacement x and arc displacement s; the weight mg is resolved into the components mg sin θ and mg cos θ.]

We wish to find s(t), the elevation of the pendulum measured along its arc. As always we must start with Newton’s law F = ma. The force involved is the component of gravity that pulls in the direction of the tangent; this is −gm sin θ (negative because it acts to decrease s), so F = ma says m s̈ = −gm sin θ. Since we are looking for a differential equation for s(t) we want s and t to be the only variables. But θ is also variable, so we
must get rid of it. We could make it sin θ = x/L, which doesn’t seem much better since x is also variable. But here is the trick: horizontal displacement is almost equal to displacement along the arc, i.e., x ≈ s, at least for small θ. With this approximation we can get rid of all unwanted variables and obtain $m\ddot{s} = -\frac{gm}{L}s$. In other words, s(t) is a function such that when you differentiate it twice you get back the function itself times −g/L. Which functions behave like this? Well, $\sin(\sqrt{g/L}\,t)$ does, and you could also put a constant A in front and it would still work. And $B\cos(\sqrt{g/L}\,t)$ does too. So s(t) must have the form

$s(t) = A\sin(\sqrt{g/L}\,t) + B\cos(\sqrt{g/L}\,t)$

for some constants A and B.

6.4.1. Select all that are true:

□ The higher the starting point, the greater the swing time, according to the solution formula.

□ The smaller the swings, the more accurate the solution.

□ This gives an easy way to estimate g.

□ The fact that the solution has precisely two undetermined constants in it corresponds to the fact that the highest derivative in the differential equation was of order two.

□ None of the above.

Pendulum motion is the archetype of periodic or oscillatory phenomena resulting from a kind of “delayed feedback” mechanism. Gravity is always trying to pull the pendulum down to its lowest position. One might say that gravity “wants” the pendulum to reach this position. But if this is what gravity is trying to achieve, gravity acts a bit stupidly, because it always overshoots its target. The problem is that gravity controls the acceleration rather than the velocity of the pendulum: When the pendulum reaches the lowest position, gravity makes the acceleration zero, but the pendulum still has velocity, so it keeps going anyway.

An analogous situation is that of a thermostat trying to keep the temperature of a room at a fixed level by means of a warm-water radiator. The thermostat “wants” the temperature in a room to be at a desired level, say 20°C. It does not control the rate of change of temperature, however, but rather the rate of change of the rate of change of temperature. This is because it controls the valve of the radiator, not the radiator temperature itself. If the thermostat wants it to be warmer, it opens the valve and lets warm water into the radiator. When the thermostat doesn’t want to increase the temperature anymore, it closes the valve. But when the valve closes there is still warm water left in the radiator, which will keep warming up the room for a while longer. So the thermostat missed its target.

Another example is what economists call the “pork cycle,” which arises due to the tension between the immediate current demand for pork and the delay of raising pigs to a suitable age for slaughter.

Using the pendulum case as a prototype, we might say:

y = height of pendulum
  = the thing you want to control
ẏ = speed of pendulum
  = change in the thing you want to control
ÿ = gravity on pendulum
  = change in the change in the thing you want to control

6.4.2. What corresponds to y, ẏ, and ÿ in the other two scenarios?

hot water valve / pig births
temperature of room / pigs to sell
radiator temp. / adolescent pigs

To repeat the above equation schematically, Newton’s law F = ma in the case of pendulum motion is of the form

−cx = ẍ

Now we wish to introduce air resistance. This is another force, so it will go on the left (F) side of the equation. Like gravity, it too is working against the motion of the pendulum, so it has a minus sign in front of it. But unlike gravity it does not depend on position but on velocity: the faster you go the more the air “pushes back.” So the new schematic equation is

−cx − bẋ = ẍ

Four qualitatively different scenarios are possible:

• The “undamped” (b = 0) pendulum keeps on swinging forever. This is the idealised case where there is no air resistance.

• The “damped” (b small) pendulum gradually swings shorter and shorter arcs. This is the realistic case of an ordinary pendulum with air resistance.

• The “overdamped” (b big) pendulum goes slowly to its lowest point without oscillating. This could be for example a pendulum submerged in thick syrup.

• The “critically damped” pendulum is right on the boundary between damped and overdamped.

Here are graphs of these four cases:

[Figure: graphs of the four cases.]

I have varied the resistance b while keeping other things the same. The critically damped case (dashed) has “just the right amount of resistance.” Therefore it is often desirable for applications such as shock-absorbing suspensions.
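The four scenarios can be explored numerically. The following is a minimal sketch, not part of the original worksheet: it integrates ẍ = −cx − bẋ with a standard fourth-order Runge-Kutta step, starting from x = 1 at rest, and counts how often the solution crosses zero. The values c = 1 and b = 0, 0.2, 2, 5 are illustrative choices; b = 2 is taken as the critical value on the assumption (standard for this equation) that the boundary lies at b² = 4c.

```python
def accel(x, v, b, c):
    """Right-hand side of the schematic equation: x'' = -c*x - b*x'."""
    return -c * x - b * v

def simulate(b, c=1.0, x0=1.0, v0=0.0, t_end=20.0, dt=0.001):
    """Integrate x'' = -c x - b x' with classical RK4; return the list of x values."""
    x, v = x0, v0
    xs = [x]
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(x, v, b, c)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, b, c)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, b, c)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, v + dt * k3v, b, c)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        xs.append(x)
    return xs

def zero_crossings(xs):
    """Number of sign changes, i.e. passes through the lowest position."""
    return sum(1 for p, q in zip(xs, xs[1:]) if p * q < 0)

# Illustrative resistance values (assumed, not from the text).
for label, b in [("undamped", 0.0), ("damped", 0.2),
                 ("critically damped", 2.0), ("overdamped", 5.0)]:
    xs = simulate(b)
    print(f"{label:18s} b = {b}: crossings = {zero_crossings(xs)}, final x = {xs[-1]:.4f}")
```

Oscillating cases show up as repeated zero crossings, while the overdamped case approaches the lowest position without crossing it.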
It is not for nothing that we ended up with an example involving springs rather than pendulums. Strictly speaking, the spring rather than the pendulum is a truer archetype for the theory of second-order differential equations. In fact, the differential equation we derived above holds exactly for springs, whereas it holds only approximately for pendulums (because of our approximation horizontal displacement ≈ arc).

There is a systematic way of solving all second-order differential equations with constant coefficients. For the “homogeneous” (i.e., non-forced) case ẍ + bẋ + cx = 0, the strategy is to first solve a related quadratic equation, the “characteristic equation” $m^2 + bm + c = 0$. So the power of m in this quadratic equation equals the order of the derivative in the differential equation. If this equation has two distinct real roots $m_1$, $m_2$, then the solution is $x(t) = Ae^{m_1 t} + Be^{m_2 t}$. If the characteristic equation has a double root $m_1 = m_2$ then the solution is $x(t) = Ae^{m_1 t} + Bte^{m_1 t}$. These are the overdamped and critically damped cases. The oscillating cases will come from characteristic equations that have no real roots at all, such as $m^2 + 1 = 0$. Such cases will be treated in the next section.

We know that we have found all solutions in this way since a general solution for a second-order problem should have two constants in it (as in problem 6.4.1). It doesn’t really matter how we found the solutions because once we have them we can verify them by simply plugging them into the differential equation. Nevertheless one may wonder where the idea of the characteristic equation came from. The heuristic behind it is that one guesses that there will be a solution of the form $e^{mt}$ and then plugs this in, giving

$(e^{mt})'' + b(e^{mt})' + ce^{mt} = (m^2 + bm + c)e^{mt} = 0$

□ $x_h + x_p$ is a solution to the differential equation.

□ $x_h$ is a solution to the differential equation.

□ $x_p$ is a solution to the differential equation.

□ $x_p + 5$ is another particular solution.

□ $5x_p$ is another particular solution.

□ $x_h$ contains two undetermined constants.

□ The choice of particular solution is not unique.

□ Once I have an initial condition such as x(0) = 2 I can give one concrete answer for the general solution.
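The characteristic-equation recipe for the two real cases can be sketched in a few lines. This is an illustrative helper, not the author's method verbatim: the names `general_solution` and `residual` are made up for the example, complex roots are only flagged (they are the subject of the next section), and the check plugs the solution back into the equation using finite differences.

```python
import math

def general_solution(b, c):
    """Solve m^2 + b m + c = 0 and return (kind, x) for x'' + b x' + c x = 0.

    x(t, A, B) is the general solution; it is None for complex roots.
    The exact comparison disc == 0 is only reliable for simple inputs.
    """
    disc = b * b - 4 * c
    if disc > 0:
        m1 = (-b + math.sqrt(disc)) / 2
        m2 = (-b - math.sqrt(disc)) / 2
        return "distinct real", lambda t, A, B: A * math.exp(m1 * t) + B * math.exp(m2 * t)
    if disc == 0:
        m = -b / 2
        return "double root", lambda t, A, B: (A + B * t) * math.exp(m * t)
    return "complex", None

def residual(x, b, c, t, A=1.0, B=1.0, h=1e-4):
    """Finite-difference estimate of x'' + b x' + c x at time t (should be near 0)."""
    x2 = (x(t + h, A, B) - 2 * x(t, A, B) + x(t - h, A, B)) / h**2
    x1 = (x(t + h, A, B) - x(t - h, A, B)) / (2 * h)
    return x2 + b * x1 + c * x(t, A, B)

for b, c in [(3, -4), (2, 1), (0, 1)]:
    kind, x = general_solution(b, c)
    if x is None:
        print(f"b={b}, c={c}: {kind} roots, treated in the next section")
    else:
        print(f"b={b}, c={c}: {kind}, residual at t=0.5 is {residual(x, b, c, 0.5):.1e}")
```

The residual check mirrors the remark above that a solution, however found, can always be verified by plugging it into the differential equation.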
§ 6.5. Second-order differential equations: complex case

§ 6.5.1. Lecture worksheet

The method for solving second-order differential equations introduced above needs some tweaking to apply to cases where the roots of the characteristic equation are complex numbers (§A.6). Though this is mathematically the most complicated case, it includes the physically simplest situation of all: the simple pendulum equation introduced already in problem 2.5.2.

Suppose we want to solve ẍ + bẋ + cx = 0 and find that the characteristic equation $m^2 + bm + c = 0$ has the complex roots m = a ± bi (complex roots always come in pairs like this). If we proceed as above we would then have solutions of the form $x(t) = Ae^{(a+bi)t} + Be^{(a-bi)t}$. But of course we do not want complex numbers in our solution since we are interested in real things like pendulums. We therefore break apart the complex exponentials using problem A.6.7,

$e^{(a+bi)t} = e^{at}e^{bti} = e^{at}(\cos bt + i\sin bt),$

and then just stick constants in front of every term to get the final answer

$x(t) = Ae^{at}\cos bt + Be^{at}\sin bt.$

As always, we can verify our answer by plugging it back into the original equation, and we know that it is the most general solution since it has as many constants as the order of the equation. Note that we only need to break apart one of the complex exponentials to get all real solutions. The other one, $e^{(a-bi)t}$, would not add anything new since the constants will “eat up” the minus sign.

6.5.1. Solve ẍ − 2ẋ + 2x = 0 given the initial conditions x(0) = 1 and x′(0) = 0.

6.5.2. What values of a and b correspond to undamped and damped pendulum motion respectively? Are there any other possibilities?

6.5.3. The standard form for a second-order differential equation is ẍ + bẋ + cx = f(t). In terms of a pendulum, b is air resistance, c is gravity, and f(t) is forcing (i.e., an external force pushing the pendulum). By picturing this prototype example we can get a good feeling for the behaviour of such a differential equation. Use this way of thinking to associate the following differential equations with their corresponding scenario and type of solution.

Equations:
ẍ + x = 0
ẍ + x = t
ẍ + 2x = sin(t)
ẍ + x = sin(t)
ẍ + ẋ − x = 0
ẍ + 0.1ẋ + 2x = 0
ẍ − 0.1ẋ + 2x = 0

Scenarios: A = pendulum with no air resistance; B = child on swing pushed well by parent; C = child on swing given out-of-synch pushes; D = pendulum in syrup; E = air “encouragement” instead of air resistance; F = “negative gravity”; G = pendulum with slight air resistance; H = pendulum pulled in one direction.

Types of solution: a = perpetual oscillations; b = dying oscillations; c = growing oscillations; d = slow approach to equilibrium without oscillations; e = running off to infinity; f = jerky motion.

§ 6.5.2. Problems

6.5.4. † Suppose we dig a straight, frictionless tunnel between any two points of the earth’s surface, such as New York and Paris. Then, if we jump into the tunnel at one end, gravity will transport us to the other in less than 45 minutes. Let’s prove this.

The motion is of course governed by F = ma. The force in question is gravity. The gravitational pull exerted on an object with mass m inside the earth at a distance r from the center of the earth is $F = G\frac{M}{R^3}mr$, where G is the usual gravitational constant, M is the mass of the earth, R is the radius of the earth. This is so for the following reasons. First of all the mass of the earth further than r from the center will have no influence on the object since the net gravitational effect of this outer shell is zero.

(a) Argue that this is so. Hint: Consider the object as the vertex of a double cone, and argue that the two pieces of any thin outer shell that the cones cut out cancel each other in terms of their gravitational effect.

(b) Show that therefore the force of gravity acting on the body at a distance r from center of earth is $G\frac{M}{R^3}mr$.

[Figure: tunnel through the earth, with the position x along the tunnel, the angle θ, the force F, and the distance r from the center marked.]

This force tries to pull the object towards the center of the earth, but since the object can only move along the tunnel it is only the part of the force that pulls in the direction of the tunnel that has any effect. Let the tunnel be
the x-axis of a coordinate system with the tunnel’s midpoint as the origin.

We have thus found the part of gravity that acts in the direction of the tunnel, i.e., the actual force F in the law F = ma governing the motion of the object along the tunnel.

there are two equilibrium points which are of very different character, as one can see from the phase plane diagram:

[Figure: phase plane diagram.]
or, if we use the shorthand notation p = a + d, q = ad − bc, $\Delta = p^2 - 4q$,

$m = \frac{1}{2}\left(p \pm \sqrt{\Delta}\right)$

The point is that we do not need to worry about all these horrible calculations, only classify which types of exponential expressions we are dealing with. This will give us a clear enough picture of what is going on without having to worry about the numerical details.

If for example the exponents m are purely imaginary (which obviously happens if p = 0 and q > 0) then both x(t) and y(t) are periodically oscillating functions, as seen in our table above. In the phase plane this gives closed loops like those shown in the predator-prey figure above. As time ticks away, we go around and around in the same loop over and over again.

If the exponents are complex with a negative real part then x and y are oscillating towards zero, so we get an inward spiral.

And so on. Altogether we get the following classification.

• q > 0, ∆ > 0: node. p < 0: stable node. p > 0: unstable node.

• q > 0, ∆ < 0: spiral. p < 0: stable spiral. p > 0: unstable spiral.

• q < 0: saddle.

• p = 0, q > 0: centre.

As we see, nodes and saddles involve exceptional lines. So to be able to draw a good picture in those cases we must first know the slopes m of these lines. The characteristic property of these lines is: if we are on the line, we stay on the line. Being on the line means that y = mx (no constant term since the line passes through the equilibrium, i.e., the origin), and staying on the line means that we keep moving in the direction of the line, i.e., dy/dx = m, so we get the equation

$\frac{\dot{y}}{\dot{x}} = \frac{cx + dy}{ax + by} = \frac{cx + dmx}{ax + bmx} = m$

which can be solved for m (the rightmost equality yields a quadratic equation, corresponding to the two exceptional lines).

The directional arrows are found by plugging in particular points near the equilibrium. If for example ẋ and ẏ are both positive at a particular point then they are both growing so we must be heading “northeast,” so we place an arrow in our diagram to that effect.

This classification was for a simple linear system with a single equilibrium at the origin, but in fact any equilibrium can be approximately reduced to this situation. This is done as follows. Take for example the predator-prey system

ẋ = 3x − xy
ẏ = −4y + 2xy

which we know has an equilibrium at (c/d, a/b) = (2, 3). First we want to make a change of variables that brings this point to the origin. We do this by making x = 2 + X and y = 3 + Y, which means that the equilibrium (2, 3) is the origin (0, 0) in the new coordinate system (X, Y). With this change of variables the system becomes

ẋ = 3x − xy = 3(2 + X) − (2 + X)(3 + Y) = −2Y − XY
ẏ = −4y + 2xy = −4(3 + Y) + 2(2 + X)(3 + Y) = 6X + 2XY

(We can check our work by the fact that the constant terms must disappear since the derivatives must vanish at the equilibrium (X, Y) = (0, 0).) But we can still not use our classification since there are nonlinear terms present (XY). However, since we are interested in the behaviour close to the equilibrium point we can discard any higher-order terms: when X and Y are very small, XY is very, very small, so it is negligible in comparison with X and Y. With this linear approximation we get

ẋ = −2Y
ẏ = 6X

Now we can classify the equilibrium using the table above. In this case, p = 0 and q = 12, so (2, 3) is a centre, i.e., the equilibrium is encircled by closed loops. To find the direction of the
loops we can consider for example a point just to the right of the origin, (X, Y) = (1, 0). There ẋ = 0 and ẏ > 0, so we are heading straight upwards from the “three o’clock” position, meaning that the loops go counterclockwise.

The other equilibrium is at (0, 0) already so we do not need to change the variables. We can simply linearise directly to get

ẋ = 3X
ẏ = −4Y

Here q = −12 so we have a saddle point. As we saw above, normally when we have a saddle point we would look for the exceptional lines by solving

$\frac{\dot{y}}{\dot{x}} = \frac{-4Y}{3X} = \frac{-4mX}{3X} = m$

for m. In this case, however, the method breaks down since the solutions are the x-axis (which is not found since Y = 0 is impossible in the above equation) and the y-axis (which is not found since it has m = ∞). But it is clear enough that the axes are the exceptional lines and that along the y-axis we are crashing to zero and along the x-axis we are running off to infinity. This is also necessary given the counterclockwise orientation of the loops around the other equilibrium.

In general, once we have analysed the equilibria and drawn a local picture for each we can fill in the rest of the phase plane by making one local picture transition smoothly into the other (this is of course permitted whenever ẋ and ẏ are continuous).

6.6.3. Draw the phase plane diagram of

ẋ = y + xy
ẏ = 3x + xy

§ 6.6.2. Problems

(b) Romeo’s and Juliet’s families are enemies. This can be expressed in the initial condition (r, j) = ____ at time t = 0.

(c) What happens in the long run?

(d) “In the Spring a young man’s fancy lightly turns to thoughts of love,” says Tennyson. What differential equation concept is best invoked to capture this idea?

6.6.6. The method we have studied in this section for drawing the phase portrait of a system of two first-order differential equations can also be applied to second-order differential equations. The latter is reduced to the former by taking the derivative of the function as a new variable. Let us carry this out in the case of the pendulum equation. Recall from 6.4 that the actual pendulum equation is ẍ = −k sin(x). We simplified this to ẍ = −kx, which is accurate for small oscillations since sin(x) ≈ x when x is small. But now we wish to use the actual equation ẍ = −k sin(x), which holds for oscillations of any size.

(a) Let y = ẋ and note how the pendulum equation becomes a system of two first-order differential equations.

(b) Carry out the phase plane analysis for this system. (You will need to use the approximation sin(x) ≈ x when linearising near equilibria.)

You should obtain a picture like this:

[Figure: phase plane diagram for the pendulum equation.]
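The classification of equilibria by p, q and ∆ used in this section can be written out as a small function. This is a minimal sketch with a made-up name (`classify`); it takes the coefficients of the linearised system ẋ = ax + by, ẏ = cx + dy and returns the type of equilibrium. Borderline cases (q = 0 or ∆ = 0) fall outside the classification in the text and are only reported as such.

```python
def classify(a, b, c, d):
    """Classify the equilibrium at the origin of x' = a x + b y, y' = c x + d y."""
    p = a + d            # shorthand used in the text
    q = a * d - b * c
    delta = p * p - 4 * q
    if q < 0:
        return "saddle"
    if q > 0 and p == 0:
        return "centre"
    if q > 0 and delta > 0:
        return "stable node" if p < 0 else "unstable node"
    if q > 0 and delta < 0:
        return "stable spiral" if p < 0 else "unstable spiral"
    return "borderline case (not covered by the classification)"

# Linearisations of the predator-prey system from the text:
print(classify(0, -2, 6, 0))   # near (2, 3): x' = -2Y, y' = 6X
print(classify(3, 0, 0, -4))   # near (0, 0): x' = 3X, y' = -4Y
```

For the predator-prey example this reproduces the conclusions reached by hand: a centre at (2, 3) and a saddle at (0, 0).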
§ 6.7. Reference summary

$1 + x = 2xyy'$, $y(1) = 2$

Write $y' = dy/dx$ and use separation of variables. $1 + x = 2xyy' \Rightarrow (\frac{1}{x} + 1)\,dx = 2y\,dy \Rightarrow \ln|x| + x = y^2 + C$. Plugging in the condition x = 1 when y = 2 gives $\ln|1| + 1 = 2^2 + C \Rightarrow C = -3$. Thus $y(x) = \sqrt{3 + x + \ln|x|}$ (the positive root, since y(1) = 2).

$y'e^{x+y} = 1$, $y(0) = 1$

$\frac{dy}{dx} = \frac{e^{-x}}{e^{y}} \Rightarrow e^{y}\,dy = e^{-x}\,dx \Rightarrow \int e^{y}\,dy = \int e^{-x}\,dx \Rightarrow e^{y} = -e^{-x} + C$. Since y(0) = 1, we get C = e + 1. Thus the solution is $e^{y} = -e^{-x} + e + 1 \Leftrightarrow y = \ln(-e^{-x} + e + 1)$.

$y' = y^2$, $y(0) = 1$

An evident solution is y = 0, but it does not satisfy the initial condition y(0) = 1, so it can be disregarded. When $y \neq 0$ we can divide by $y^2$, which gives $\frac{y'}{y^2} = 1$ and $\frac{dy}{y^2} = dx$. Integrating gives $-\frac{1}{y} = x + C$ and plugging in y(0) = 1 gives C = −1. Thus $y = \frac{1}{1-x}$.

$(1 + \cos x)y' = y\sin x$, $y(0) = 1$

$(1 + \cos x)y' = y\sin x \Leftrightarrow \frac{dy}{y} = \frac{\sin x\,dx}{1 + \cos x} \Leftrightarrow \int\frac{dy}{y} = \int\frac{\sin x\,dx}{1 + \cos x} \Leftrightarrow \ln y = -\ln(1 + \cos x) + C$. The condition y(0) = 1 gives $\ln 1 = -\ln 2 + C \Leftrightarrow 0 = -\ln 2 + C \Leftrightarrow C = \ln 2$. Hence $\ln y = -\ln(1 + \cos x) + \ln 2 \Leftrightarrow y = e^{-\ln(1+\cos x) + \ln 2} \Leftrightarrow y = \frac{2}{1 + \cos x}$.

$(y + 1)y' + \cos x = 0$, $y(0) = 0$

The equation separates to $(y + 1)\,dy = -\cos x\,dx \Leftrightarrow \frac{1}{2}y^2 + y = -\sin x + C$. Plugging in y(0) = 0 gives $\frac{1}{2}\cdot 0 + 0 = 0 + C \Rightarrow C = 0$. We multiply by 2 to get $y^2 + 2y = -2\sin x \Leftrightarrow (y + 1)^2 = 1 - 2\sin x$. The initial condition means that only the positive roots are relevant, so $y = \sqrt{1 - 2\sin x} - 1$.

§ 6.7.2. Other methods for first-order differential equations

$y' + 2xy = x$

Since $\int 2x\,dx = x^2$ (where C = 0), we see that $e^{x^2}$ is an integrating factor. This gives $e^{x^2}y' + 2xe^{x^2}y = xe^{x^2}$, and hence $(e^{x^2}y)' = xe^{x^2}$ and $e^{x^2}y = \int xe^{x^2}\,dx = [x^2 = t,\ 2x\,dx = dt] = \frac{1}{2}\int e^{t}\,dt = \frac{1}{2}e^{t} + C = \frac{1}{2}e^{x^2} + C$. The general solution is thus $y = \frac{1}{2} + Ce^{-x^2}$.

$x^2y' + y = e^{1/x}$, $y(1) = e$.

$x^2y' + y = e^{1/x} \Leftrightarrow y' + \frac{1}{x^2}y = \frac{1}{x^2}e^{1/x}$. Multiplying by the integrating factor $e^{-1/x}$ gives $e^{-1/x}y' + \frac{1}{x^2}e^{-1/x}y = \frac{1}{x^2} \Leftrightarrow D(e^{-1/x}y) = \frac{1}{x^2}$, which upon integrating yields $e^{-1/x}y = C - \frac{1}{x} \Leftrightarrow y = (C - \frac{1}{x})e^{1/x}$. The boundary condition y(1) = e implies $e = y(1) = (C - 1)e \Rightarrow C = 2$. Hence the solution is $y = (2 - \frac{1}{x})e^{1/x}$.

When above methods not applicable: A substitution may make it so. Standard substitutions:

• If the differential equation is of the form $y' = f(x, y)$ where f is invariant under scaling ($f(\lambda x, \lambda y) = f(x, y)$), substitute u = y/x.

• If the differential equation involves terms ay ± bx, try making this u.

• If the differential equation is of the form $y' + P(x)y = Q(x)y^n$, substitute $u = y^{1-n}$.

To carry out the substitution: Differentiate the equation defining u to find $\frac{du}{dx}$. Using this and the defining equation for u, eliminate all y’s and dy’s from the differential equation, so as to produce a differential equation for u as a function of x. Solve this equation. Substitute back to express the answer in terms of the original variables.
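Worked solutions like those above can be double-checked by plugging them back into their equations. The sketch below does this numerically for three of the examples, writing each equation as F(x, y, y′) = 0 and estimating y′ with a central difference. The helper names, sample points, and tolerance are illustrative assumptions, and a small residual is evidence of correctness rather than a proof.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Each entry: (equation as F(x, y, y') = 0, claimed solution, sample points)
EXAMPLES = {
    "y' = y^2, y(0) = 1": (
        lambda x, y, dy: dy - y**2,
        lambda x: 1 / (1 - x),
        [-0.5, 0.0, 0.5],
    ),
    "(1 + cos x) y' = y sin x, y(0) = 1": (
        lambda x, y, dy: (1 + math.cos(x)) * dy - y * math.sin(x),
        lambda x: 2 / (1 + math.cos(x)),
        [-1.0, 0.0, 1.0],
    ),
    "x^2 y' + y = e^(1/x), y(1) = e": (
        lambda x, y, dy: x**2 * dy + y - math.exp(1 / x),
        lambda x: (2 - 1 / x) * math.exp(1 / x),
        [0.5, 1.0, 2.0],
    ),
}

def max_residual(F, y, points):
    """Largest |F(x, y(x), y'(x))| over the sample points; near 0 for a true solution."""
    return max(abs(F(x, y(x), derivative(y, x))) for x in points)

for label, (F, y, points) in EXAMPLES.items():
    print(f"{label}: max residual {max_residual(F, y, points):.1e}")
```

The same pattern extends to the other examples by adding an entry with the equation, the claimed solution, and a few points inside its domain.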
§ 6.7.3. Second-order differential equations

• Solve ẍ + bẋ + cx = 0.

Form the corresponding characteristic equation $m^2 + bm + c = 0$. The roots of this equation tell you the solution:

roots of $m^2 + bm + c = 0$:    solution of ẍ + bẋ + cx = 0:
distinct real                   $x(t) = Ae^{m_1 t} + Be^{m_2 t}$
double root                     $x(t) = (A + Bt)e^{mt}$
complex a ± bi                  $x(t) = Ae^{at}\cos bt + Be^{at}\sin bt$

$y'' + 3y' - 4y = 0$

The characteristic equation $m^2 + 3m - 4 = 0$ has two real roots: $m_1 = 1$ and $m_2 = -4$. Thus $y(x) = Ae^{x} + Be^{-4x}$.

$y'' + 2y' + y = 0$

The characteristic equation $m^2 + 2m + 1 = 0$ has a double root: $m_{1,2} = -1$. Thus $y(x) = (Ax + B)e^{-x}$.

$y'' + 2y' + 2y = \sin 2x$

The characteristic equation $m^2 + 2m + 2 = 0$ has the complex roots $m_1 = -1 + i$ and $m_2 = -1 - i$. Thus $y(x) = e^{-x}(A\sin x + B\cos x)$.

The full solution of the differential equation is the sum of the homogeneous and particular solutions: $x(t) = x_h + x_p$.

$y'' + 4y' + 4y = x + 1$

The characteristic equation $m^2 + 4m + 4 = 0$ has a double root $m_{1,2} = -2$. Hence $y_h = (C_1 x + C_2)e^{-2x}$. We seek a particular solution of the form $y_p = Ax + B$. We see that $y_p'' + 4y_p' + 4y_p = 4Ax + (4A + 4B) = x + 1$. Thus 4A = 1 and 4A + 4B = 1, which gives $A = \frac{1}{4}$, B = 0. Thus $y_p = \frac{1}{4}x$ and the general solution is $y = (C_1 x + C_2)e^{-2x} + \frac{1}{4}x$.

$y'' + y = x^2$, $y(0) = 0$, $y'(0) = 0$

The characteristic equation $m^2 + 1 = 0$ has the solution m = ±i, hence the homogeneous solution is $y_h = C_1\sin x + C_2\cos x$. For the particular solution, assume $y_p = ax^2 + bx + c$. Then $y_p'' = 2a$. The requirement $y_p'' + y_p = x^2$ gives $2a + ax^2 + bx + c = x^2$, which has the solution a = 1, b = 0, c = −2. So the particular solution is $y_p = x^2 - 2$, and the full solution is $y = y_p + y_h = x^2 - 2 + C_1\sin x + C_2\cos x$. The condition y(0) = 0 gives $0 = -2 + C_2$, $C_2 = 2$. Differentiation gives $y' = 2x + C_1\cos x - C_2\sin x$ and the condition y′(0) = 0 gives $C_1 = 0$. Hence $y = x^2 - 2 + 2\cos x$.

$y'' - 2y' - 15y = 6e^{-x}$
y 00 − 4y 0 + 13y = e −2x + 1
If f (t ) has this form: Try this as x p :
The characteristic equation m 2 − 4m + 13 = 0 has the roots
3e 5t C e 5t
m 1,2 = 2±3i . Thus y h = e 2x (D 1 cos 3x +D 2 sin 3x). We seek a
sin t C sin t + D cos t particular solution of the form y p = Ae −2x + B . This gives
y p00 − 4y p0 + 13y p = 4Ae −2x − 4(−2Ae −2x ) + 13(Ae −2x + B ) =
2 cos 3t C sin 3t + D cos 3t
25Ae −2x + 13B = e −2x + 1. Hence A = 1/25, B = 1/13 and
4x + 2 Cx +D this y p = 25 1 −2x
e 1
+ 13 . The general solution is thus y =
2 1 −2x 1
8x C x2 + D x + E 2x
e (D 1 cos 3x + D 2 sin 3x) + 25 e + 13 .
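The case distinction on the roots of the characteristic equation can be expressed as a small routine. This is an illustrative sketch only: the function names `characteristic_roots` and `solution_form` are hypothetical, and the exact comparison `disc == 0` assumes small exact coefficients such as the integers used in the examples above.

```python
import math

def characteristic_roots(b, c):
    """Classify the roots of m^2 + b*m + c = 0.

    Returns (kind, values) where values is
      (m1, m2)                     for distinct real roots,
      (m,)                         for a double root,
      (real part, imaginary part)  for complex roots.
    """
    disc = b * b - 4 * c
    if disc > 0:
        root = math.sqrt(disc)
        return "distinct real", ((-b + root) / 2, (-b - root) / 2)
    if disc == 0:
        return "double root", (-b / 2,)
    return "complex", (-b / 2, math.sqrt(-disc) / 2)

def solution_form(b, c):
    """Text form of the general solution of x'' + b x' + c x = 0."""
    kind, values = characteristic_roots(b, c)
    if kind == "distinct real":
        return "x(t) = A*exp({}*t) + B*exp({}*t)".format(*values)
    if kind == "double root":
        return "x(t) = (A + B*t)*exp({}*t)".format(*values)
    return "x(t) = exp({0}*t)*(A*cos({1}*t) + B*sin({1}*t))".format(*values)

if __name__ == "__main__":
    for b, c in [(3, -4), (2, 1), (2, 2), (-4, 13)]:
        print((b, c), solution_form(b, c))
```

The four coefficient pairs in the usage loop are the homogeneous parts of examples worked above, so the printed forms can be compared directly with the text.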
in terms of $p = a + d$, $q = ad - bc$, $\Delta = p^2 - 4q$:

p    q    ∆    equilibrium
+    +    +    unstable node
−    +    +    stable node
+    +    −    unstable spiral
−    +    −    stable spiral
     −         saddle
0    +         centre

$a = 1$, $b = -5$, $c = 1$, $d = -1$, so $p = a + d = 0$ and $q = ad - bc = 4 > 0$: the equilibrium is a centre.

For nodes and saddles, find the slope of the exceptional lines by substituting $Y = mX$ into $\dot{Y}/\dot{X} = m$.

Find and classify the equilibria of the system $\dot{x} = x - y$, $\dot{y} = x + y - 2xy$. Draw the phase plane.

Equilibria occur where $x - y = 0$, $x + y - 2xy = 0$. Eliminating $y$ gives $x(1 - x) = 0$, so equilibrium points occur at $(0, 0)$ and $(1, 1)$. First look at $(0, 0)$. Here $\dot{x} = x - y$, $\dot{y} \approx x + y$. Since $p = 2$, $q = 2$ and $\Delta = -4$, it’s an unstable spiral. Next consider $(1, 1)$. Introduce new variables by $x = 1 + X$, $y = 1 + Y$. Then $\dot{X} = X - Y$, $\dot{Y} = 2 + X + Y - 2(1 + X)(1 + Y) \approx -X - Y$. Hence $a = 1$, $b = -1$, $c = -1$, $d = -1$, so that $q = -2 < 0$, so this point is a saddle (with $m = \dot{Y}/\dot{X} = \frac{-X - mX}{X - mX} = \frac{-1 - m}{1 - m} \Rightarrow m = 1 \pm \sqrt{2}$).
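The classification table can be turned into a short function. This is a sketch under the assumption that the linearised system is written as Ẋ = aX + bY, Ẏ = cX + dY, which is the convention the worked example appears to use; the function name `classify_equilibrium` is hypothetical, and borderline cases that the table does not list are reported as such.

```python
def classify_equilibrium(a, b, c, d):
    """Classify the equilibrium of X' = a X + b Y, Y' = c X + d Y
    using p = a + d, q = a d - b c and delta = p^2 - 4 q."""
    p = a + d
    q = a * d - b * c
    delta = p * p - 4 * q
    if q < 0:
        return "saddle"
    if q > 0 and p == 0:
        return "centre"
    if q > 0 and delta > 0:
        return "unstable node" if p > 0 else "stable node"
    if q > 0 and delta < 0:
        return "unstable spiral" if p > 0 else "stable spiral"
    return "borderline case (not covered by the table)"

if __name__ == "__main__":
    # Linearisations at (0, 0) and (1, 1) for x' = x - y, y' = x + y - 2xy.
    print(classify_equilibrium(1, -1, 1, 1))
    print(classify_equilibrium(1, -1, -1, -1))
```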
Find and classify the equilibria of the system ẋ = x − y, ẏ =
x 2 − 1. Draw the phase plane.
7 POLAR AND PARAMETRIC CURVES

§ 7.1. Polar coordinates

In fact, in rectangular coordinates, spirals cannot be described by a polynomial equation of any kind.

7.1.2. ? Can you think of a simple way of seeing this at a glance? Hint: cf. problems A.2.5–A.2.7.

§ 7.1.2. Problems

7.1.3. † Another important spiral is the logarithmic spiral $r = e^{k\theta}$.

7.2.3. Use the figure from problem 7.2.1 to express the angle the curve $r(\theta)$ makes with the radial line in terms of $r$ and $\theta$ (and their differentials).

In §3.1 we saw how the area under the curve $y(x)$ is made up of rectangles with area $y\,dx$.
7.2.4. Draw the analogous picture for the polar coordinate case, and use it to express by means of an integral the area “swept out” by a polar curve.
Hint: Instead of rectangles the area will be made up of “pizza slices,” which for the purposes of area calculation may be considered triangular.

§ 7.2.2. Problems

(b) ? Discuss how this proof relates to the ones we have seen in problems 1.1.1 and 3.4.3.

§ 7.3. Parametrisation

At the moment the bomb is dropped we fire a projectile aimed at the point from where the bomb is dropped. Show that we will hit the bomb mid-air as long as the firing velocity of our cannon is greater than a certain threshold.

The curve traced by a point on a rolling circle is called a cycloid:

7.3.4. As a variant of the cycloid we can let a circle roll on another circle:
7.3.7. † One of the oldest recorded mathematical documents is
a Babylonian clay tablet from almost four thousand years
ago, way back in the Bronze Age. It consists in nothing
but a long list of Pythagorean triples, i.e., integers a, b, c
such that a 2 + b 2 = c 2 . Explain how problem 7.3.5 can be
used to generate Pythagorean triples.
7.3.8. A thread is wound around a circular spool. You grab the end of the thread and start unwinding it while keeping it as taut as possible.
(a) Argue that the unwound piece of the string at any stage in this process is tangent to the circle at the point of contact.
(b) Sketch roughly the curve traced out by the free end of the thread as you unwind it.
(c) Find a parametric representation of this curve.
Hint: Let the spool be the unit circle and let the initial position of the free end of the thread be (1, 0). As parameter, use the angle θ between the x-axis and the point where the unwound portion of the string touches the spool.

§ 7.4. Calculus of parametric curves

§ 7.4.1. Lecture worksheet

In this section we shall see how to deal with tangents, areas, and arc lengths of curves given parametrically. The idea is simple: start with the usual expressions, and then rewrite them in terms of the parametrisation, as outlined in §7.5.4. Let us keep using the cycloid of problem 7.3.3 as our guiding example in this investigation.

7.4.3. Show that the length of one arch of the cycloid is four times the diameter of the rolling circle.

7.4.4. Show that the result of problem 7.4.1a can be interpreted geometrically as saying that the tangent is parallel to the other dotted line in the figure below.

7.4.5. In this problem we shall prove that the bob of a perfect pendulum clock swings along a cycloidal path. This means that even as the pendulum’s swings are damped it still takes the same time to complete one full swing. In other words, a particle sliding down a cycloidal ramp (a cycloid turned upside-down compared to our previous pictures) will take the same time to reach the bottom no matter where it started. The fact that we are now considering an upside-down cycloid can easily be accounted for by simply letting the y-axis be directed downwards. Then y represents the vertical distance fallen from the highest point of the cycloid (the start of the ramp), and our previous parametric equations carry over without change.
(a) Using the parametric representation of the cycloid, rewrite the integral of problem 6.3.2 in terms of θ.
(b) What are the bounds of integration?
(c) Evaluate the integral.
(d) Repeat all of the above steps for the case where the particle is released not from $y = 0$ but some lower point $y = y_0$.
(e) Explain how this establishes what we wanted to show.

§ 7.5. Reference summary

§ 7.5.1. Polar coordinates

• Given a point (x, y) in cartesian coordinates, find its polar coordinates (r, θ).
$r = \sqrt{x^2 + y^2}$. If x is positive, $\theta = \arctan(y/x)$. If x is negative,

§ 7.5.2. Calculus in polar coordinates

Arc length of polar curve $r(\theta)$ from $\theta = a$ to $\theta = b$:
$\int_a^b \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\, d\theta$

Angle $r(\theta)$ makes with radial line:

§ 7.5.3. Parametrisation

Fill in various values for t and plot the resulting points. These are all points on the curve.

§ 7.5.4. Calculus of parametric curves

• Find the slope of the tangent to a parametric curve $(x(t), y(t))$.
$\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$
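The recipes of this summary (tabulate points for various parameter values, take the slope as (dy/dt)/(dx/dt), integrate the speed for arc length) can be tried out on the cycloid. The sketch below assumes the standard parametrisation x = r(t − sin t), y = r(1 − cos t) for a rolling circle of radius r, which is not spelled out in this part of the text; the function names and the midpoint-rule integration are illustrative choices.

```python
import math

def cycloid(t, r=1.0):
    """Point on the cycloid traced by a circle of radius r (assumed parametrisation)."""
    return r * (t - math.sin(t)), r * (1 - math.cos(t))

def cycloid_velocity(t, r=1.0):
    """Derivatives (dx/dt, dy/dt) of the parametrisation above."""
    return r * (1 - math.cos(t)), r * math.sin(t)

def slope(t, r=1.0):
    """Slope dy/dx = (dy/dt)/(dx/dt); undefined at the cusps where dx/dt = 0."""
    dx, dy = cycloid_velocity(t, r)
    return dy / dx

def arc_length(t0, t1, r=1.0, n=10000):
    """Midpoint-rule approximation of the integral of the speed from t0 to t1."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        dx, dy = cycloid_velocity(t0 + (k + 0.5) * h, r)
        total += math.hypot(dx, dy) * h
    return total

if __name__ == "__main__":
    # A few points on the curve, as suggested in the Parametrisation summary.
    for k in range(5):
        t = k * math.pi / 2
        print(round(t, 3), cycloid(t))
    # One arch corresponds to t from 0 to 2*pi; compare with 4 * diameter = 8 r.
    print(arc_length(0.0, 2 * math.pi))
```

For r = 1 the computed length of one arch should come out close to 8, in agreement with the statement of problem 7.4.3.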
8 VECTORS

§ 8.1. Vectors

§ 8.1.1. Lecture worksheet

In this section we shall see that geometrical problems regarding projections and perpendicularity can be reduced to algebra in a remarkably simple way. This is done using the language of vectors. A vector is a directed line segment: it is so-and-so long and it goes one way rather than the other. We draw it as an arrow and denote it v. We can also express it in coordinate form by putting its foot end at the origin and recording the coordinates of its endpoint (this is why the v is fat: it is “stuffed” with more than one number). But the vector is the same no matter where it starts, like these two v’s:

[Figure: the vector u = (-4,1) and two copies of v = (1,2) drawn from different starting points.]

The arithmetic of vectors goes like this:

[Figure: vector arithmetic, with arrows labelled a, b, a + b, a − b, v and 2v.]

§ 8.1.2. Problems

8.1.3. You want to paddle across a river in a canoe. You can paddle twice as fast as the speed of the flow of the river. At what angle should you aim the nose of your canoe in order to go straight across the river?

8.1.4. Prove that if you join the midpoints of the sides in any quadrilateral you always get a parallelogram. Hint: Think of the quadrilateral as a + b = c + d, divide by 2, interpret geometrically.

8.1.5. (a) Prove both algebraically and visually that $|a + b| = |a - b|$ if and only if a and b are perpendicular.
(b) Under what conditions is $|a + b|^2 = |a|^2 + |b|^2$?

§ 8.2. Scalar product

§ 8.2.1. Lecture worksheet

Vectors can be used to express projections in a very convenient way.

[Figure: the projection of a onto b, of length |a| cos θ.]

The length of a’s projection onto b is easily expressed trigonometrically. It is $|a| \cos\theta$, where $|a|$ means the length of the vector
Now, the projection properties of i, j, k are particularly sim-
ple: any one of them projected onto itself gives 1, and pro-
jected onto each of the other two gives zero. Therefore when
we multiply out the parenthesis all the cross terms go away
and only the “like with like” terms survive. So the result is
a 1 b 1 + a 2 b 2 + a 3 b 3 , as claimed.
8.2.1. Which of the following are assumptions made in the above proof?
  u · v = v · u for any vectors u, v.
  (u + v) · w = u · w + v · w for any vectors u, v, w
  (ku) · v = k(u · v) for any vectors u, v, and constant k.
  $u \cdot v = u_1 v_1 + u_2 v_2 + u_3 v_3$ for any vectors u, v
To complete the proof any such assumptions would have to be proved
  geometrically (using cosine form)
  algebraically (using coordinate form)

Our argument about i, j, k also highlighted two useful special cases of the scalar product: projection onto itself, and perpendicularity. The scalar product of a vector with itself gives the length squared, $a \cdot a = a_1^2 + a_2^2 + a_3^2 = |a|^2$, and the scalar product of two vectors is zero if and only if they are perpendicular. Indeed, if a problem has right angles in it, chances are that you can solve it by scalar products. Here is an example:

(a) Argue that any point of the form (a, a, a) is equidistant from each of these three points.
(b) Determine the coordinates of the fourth hydrogen atom using the condition that all hydrogen atoms must be equidistant.
(c) Determine the coordinates of the carbon atom by computing the “average” position of the four hydrogen atoms (i.e., add the hydrogen position vectors and divide by 4).
(d) Compute the bond angle, i.e., the angle between the lines that join the carbon atom to two of the hydrogen atoms.

8.2.5. Show by an example that a · b = a · c does not necessarily imply b = c. In other words, you cannot “cancel” the a.

8.2.6. Consider these two ways of setting up a coordinate system of three variables:

[Figure: two three-dimensional coordinate systems, each with its z-axis labelled.]
(b) Is this proof circular? I.e., was the Pythagorean theorem needed to establish the properties of vectors that you used in your proof?
§ 8.3.2. Problems

8.3.4. Find (1, 2, 3) × (2, 1, 2) and use it to determine the area of the parallelogram spanned by these two vectors. Note that the result is very easily obtained in this way from a formula that we found using physical reasoning, whereas it would be much harder to compute directly by brute-force analytic geometry.

§ 8.4. Geometry of vector curves

§ 8.4.1. Lecture worksheet

Many geometrical properties of curves are more naturally and elegantly expressed in vector language than in terms of explicit formulae in x and y. If x(t) = (x(t), y(t)) is the position of a moving particle, then its velocity is ẋ(t) = v(t) = (ẋ(t), ẏ(t)). Geometrically, ẋ is a tangent vector since it indicates the “direction of instantaneous change.”

8.4.2. (a) Find the acceleration vector a = ẍ for this uniform circular motion and illustrate with a figure.
(b) Interpret this result in physical terms, using Newton’s law F = ma, in the case where x is the motion of a planet about the sun.

8.4.3. By definition, $a = \frac{v(t + dt) - v(t)}{dt}$ is proportional to the difference between two successive velocity vectors. Argue on this basis that a is always perpendicular to v for any fixed-speed motion.

When studying the geometry of curves we prefer to use unit-speed parametrisations of our curves, |ẋ| = 1, since this gives the curve in its purest form, uncontaminated by physical considerations. For unit-speed parametrisations, as problem 8.4.3 suggests, the geometrical meaning of |ẍ| is “how much the curve is turning,” or curvature. This is the same curvature studied in §13.1. We see that vector language is more naturally suited to the problem and simplifies a mess of a formula into the simple and intuitive |ẍ|. And this simplification is no mere game with symbols, as the following problem shows.

8.4.4. Compute the curvature of a circle using the vector method, and compare with the non-vector way of doing this (problem 13.1.3).

§ 8.4.2. Problems

8.4.5. Kepler’s area law says that planets sweep out equal areas in equal times.

[Figure: two regions swept out by the planet, each during a time interval ∆t.]

This is easily proved by vector methods. Let r be the position vector of the planet with the sun as the origin.
(a) The gravitational force is directed towards the sun. Explain what this means in terms of r̈. Hint: cf. problem 8.4.2.
(b) Explain how the area covered by the planet is measured by |r × ṙ|/2.
(c) Prove the product rule for vector products: $\frac{d}{dt}(u \times v) = \dot{u} \times v + u \times \dot{v}$.
(d) Use this to prove that the planet covers equal areas in equal times.

[Figure: vector arithmetic, with arrows labelled a, b, a − b, v, 2v and −v.]

Vector addition. Algebraically: (a, b) + (c, d) = (a + c, b + d). Geometrically: arrows head to tail.

ka = k(a, b) = (ka, kb) = magnification of a by factor k.

$|x| = \sqrt{x_1^2 + x_2^2 + x_3^2}$ = length of x.

a parallel to b ⟺ a = kb

position vector of point P:    $\overrightarrow{OP}$; vector pointing from the origin to P
unit vector:                   vector of length 1
orthogonal:                    perpendicular; making a right angle; scalar product 0
orthonormal:                   orthogonal and of unit length

[Figure: coordinate axes x, y, z with the unit vectors i = (1,0,0), j = (0,1,0), k = (0,0,1).]

(2, 5, 1) = 2î + 5ĵ + k̂
§ 8.5.2. Scalar product

a × b is perpendicular to a and b

|a × b| = area of parallelogram spanned by a and b

§ 8.5.4. Geometry of vector curves

For a curve given parametrically by x(t):
ẋ = tangent vector.
|ẍ| = |Å| = curvature.

§ 8.5.5. Problem guide

• Find a vector pointing from a to b.
b − a.
• Find the average of a set of position vectors.
Add the vectors and divide by how many there are.
• Find center of mass (weighted average position) of a set of weighted position vectors.
Multiply each position vector by its mass, add all together, and divide by total mass.
• Ensure that two vectors are perpendicular.

Determine the angle θ between the vectors (1, 5) and (3, 2).
From the scalar product identity $(1, 5) \cdot (3, 2) = |(1, 5)| \cdot |(3, 2)| \cdot \cos\theta$, we obtain $\cos\theta = \frac{1 \cdot 3 + 5 \cdot 2}{\sqrt{1^2 + 5^2} \cdot \sqrt{3^2 + 2^2}} = \frac{13}{\sqrt{26}\sqrt{13}} = \frac{1}{\sqrt{2}}$. The angle is thus π/4.

Find the angle between v = (1, 0, 1) and w = (1, 2, 2).
On the one hand, $v \cdot w = (1, 0, 1) \cdot (1, 2, 2) = 3$. On the other hand, $v \cdot w = |v||w| \cos\theta$, and $|v| = \sqrt{1^2 + 0^2 + 1^2} = \sqrt{2}$, and $|w| = \sqrt{1^2 + 2^2 + 2^2} = \sqrt{9} = 3$. Thus $3 = 3\sqrt{2} \cos\theta \Rightarrow \cos\theta = 1/\sqrt{2} \Rightarrow \theta = \frac{\pi}{4} = 45^\circ$.

• Find area of parallelogram.
• Determine the direction of a × b.
Point the index finger of your right hand in the direction of a, with your palm facing towards b. The vector product a × b points perpendicularly upwards in the direction of your thumb. (See figure and screwdriver analogy in §8.3.1.)
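Several entries of the problem guide reduce to a few lines of arithmetic. The sketch below is an illustration rather than part of the original guide: it implements the scalar-product formula for the angle between two vectors and the weighted-average recipe for the centre of mass. The function names are hypothetical.

```python
import math

def dot(u, v):
    """Scalar product of two vectors given as tuples of equal length."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Length of a vector: the square root of its scalar product with itself."""
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle in radians between u and v, from u.v = |u| |v| cos(theta)."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

def centre_of_mass(positions, masses):
    """Multiply each position by its mass, add, and divide by the total mass."""
    total = sum(masses)
    dim = len(positions[0])
    return tuple(
        sum(m * p[i] for p, m in zip(positions, masses)) / total
        for i in range(dim)
    )

if __name__ == "__main__":
    print(angle((1, 5), (3, 2)), math.pi / 4)
    print(angle((1, 0, 1), (1, 2, 2)), math.pi / 4)
    print(centre_of_mass([(0, 0), (2, 0)], [1, 3]))
```

Two vectors are perpendicular exactly when `dot(u, v)` is zero, which is the test the guide has in mind for ensuring perpendicularity.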
§ 8.5.6. Examples
Find the distance from the point (2, 3, −1) to the plane 2x −
y + 2z = 2.
(x, y, z) = (2, 3, −1)+t (2, −1, 2) = (2+2t , 3−t , −1+2t ) is a para-
metric representation of the line in the direction of the nor-
mal through the given point. Plugging this into the equation
for the plane gives 2 = 2(2 + 2t ) − (3 − t ) + 2(−1 + 2t ) = 9t − 1,
and hence $t = \frac{1}{3}$. The distance is thus $d = |t| \cdot |(2, -1, 2)| = \frac{1}{3} \cdot \sqrt{2^2 + (-1)^2 + 2^2} = 1$.
Find the equation for a plane through the points P = (1, 1, 1),
Q = (1, 1, 0), R = (0, 0, 1).
The vectors QP = (0, 0, 1) and RP = (1, 1, 0) are parallel to the
plane. A normal vector to the plane is QP × RP = (0, 0, 1) ×
(1, 1, 0) = (−1, 1, 0). The equation for the plane thus has the
form −x + y = D. Plugging in one of the points, for example
P = (1, 1, 1), into this equation, shows that D = 0. Thus the
equation for the plane is −x + y = 0.
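Both examples above follow mechanical recipes, so they are easy to reproduce in code. The sketch below mirrors the two computations: walking along the normal from a point until the plane is hit, and building a plane through three points from a vector product. The function names are hypothetical, and a plane is represented as a pair (n, D) meaning n · (x, y, z) = D.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(p, q):
    """Vector pointing from q to p."""
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    """Vector product of two three-dimensional vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_through(p, q, r):
    """Plane through three points: the normal is QP x RP, and D comes from plugging in P."""
    n = cross(sub(p, q), sub(p, r))
    return n, dot(n, p)

def distance_to_plane(point, n, d):
    """Walk along the normal: solve n . (point + t n) = d for t, then return |t| |n|."""
    t = (d - dot(n, point)) / dot(n, n)
    return abs(t) * math.sqrt(dot(n, n))

if __name__ == "__main__":
    print(plane_through((1, 1, 1), (1, 1, 0), (0, 0, 1)))
    print(distance_to_plane((2, 3, -1), (2, -1, 2), 2))
```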
9 MULTIVARIABLE DIFFERENTIAL CALCULUS

§ 9.1. Functions of several variables

§ 9.1.1. Lecture worksheet

9.1.1. This is almost an ice cream cone, but one more adjustment needs to be made. Explain what it is and how to amend the equation.

The idea of understanding a surface by its cross sections is systematised in contour plots. To produce a contour plot we slice the surface by horizontal planes. Here for example I have sliced the saddle-shaped surface z = xy at one positive and one negative z-value:

[Figure: the saddle z = xy sliced at one positive and one negative z-value, with the corresponding contour plot on the right.]

Algebraically, this corresponds to plugging z = c into the equation for the surface. For the saddle this gives the hyperbolas y = c/x. These curves we then plot in the ordinary xy-plane and label them with their corresponding value for z = c, as shown on the right. You are already familiar with contour plots, no doubt, from their use in topographical maps. Here, for example, is a mountain and its corresponding contour plot:

$5(x^2 + y^2) = z^2$

wine glass
ice cream cone
beach ball
Asian-style farmer’s hat
pyramid
hour glass
champagne glass

A function of two variables has two partial derivatives, $f_x$ and $f_y$. They answer the questions: How fast is the height f changing when I take a small step in the x direction? And how much for a step in the y direction? When moving in one of these directions, the other variable doesn’t change. Computationally, this means that, to find the derivative of f(x, y) with respect to x you treat y as a constant and vice versa. So when differentiating with respect to x you can secretly think of any y’s in the
formulas as if they were 5’s or some such innocuous number. For example, if $f(x, y) = x^2 y$ then $f_x = 2xy$ and $f_y = x^2$. To get a feeling for the meaning of these derivatives, go back to the ice cream cone.

9.1.5. For the ice cream cone of the previous problem, compute the following partial derivatives and interpret geometrically.
(a) $z_x(1, 0)$
(b) $z_y(1, 0)$

§ 9.1.2. Problems

9.1.6. † The mixed partial derivatives $f_{xy}$ and $f_{yx}$ are equal. The derivative $f_x$ means: if I take a step in the x-direction, how much does the value of f change? So $(f_x)_y$ means: if I take a step in the y-direction, how much does the change in f per step in the x-direction change? In other words, $(f_x)_y$ is the change in f along the top arrow minus the change in f along the bottom arrow:
Interpret $(f_y)_x$ similarly and explain why $(f_x)_y - (f_y)_x = 0$.

9.1.7. † This problem illustrates the context in which partial differentiation was first conceived in the late 17th century. The creators of the calculus were not interested in multivariable calculus and partial derivatives as they are taught today, since most physical phenomena can be understood in two-dimensional form. However, examples like the one below led them to consider partial derivatives nevertheless, and show their use even for two-dimensional problems.
Consider the trajectories of projectiles fired from a cannon at varying angles:
Note that the figure is two-dimensional: the projectiles are not “coming towards you”; they are all within the same plane.
Ignoring air resistance, the trajectories are of course parabolas, as Galileo discovered. We want to calculate the dashed “safety curve.” Beyond this curve we are always safe, whereas anywhere inside this curve we can be hit. This curve can be computed using the fact that the trajectories of projectiles fired at two almost identical angles, say α and α + dα, intersect at the safety curve.
(a) ? Explain briefly how this fact is evident from the figure.
Now to put this into equations. Let the trajectory for firing angle α be f(x, y, α) = 0. That is, the equation for the curve in x and y coordinates will be some formula involving x, y and α set equal to zero (i.e., we have moved all terms to the left hand side). What we just proved above is that a point (x, y) on the safety curve satisfies both f(x, y, α) = 0 and f(x, y, α + dα) = 0.
(b) Show that this implies that $\frac{d}{d\alpha} f(x, y, \alpha) = 0$.
Therefore we can find the safety curve by combining the two equations $f(x, y, \alpha) = 0$ and $\frac{d}{d\alpha} f(x, y, \alpha) = 0$ so as to eliminate α. This will give us all the points with the required property in terms of x and y only, which is what we want.
(c) Express the trajectory in parametric form in the manner of §7.3.
(d) Obtain one equation for the trajectory involving x, y, and α, but not t. (I.e., combine the equations so as to eliminate t.) This essentially gives us f(x, y, α) = 0.
(e) Find the equation for the safety curve using the method given above.
(Hint: You may want to use $\frac{1}{\cos^2\alpha} = \frac{\sin^2\alpha + \cos^2\alpha}{\cos^2\alpha} = 1 + \tan^2\alpha$ or some similar trick to relate different trigonometric expressions to each other.)
(f) Explain how you can see from the equation that the curve you found has roughly the right shape and position.

9.1.8. A function of one variable is called continuous if its graph can be drawn without lifting the pen, i.e., if it has no “gaps” or “jumps.” More technically, f(x) is continuous at a point $x_0$ if f(x) approaches $f(x_0)$ as x approaches $x_0$. For a function of two variables to be continuous, it is required that f(x, y) approaches $f(x_0, y_0)$ no matter how (i.e., in which direction or along which curve) (x, y) approaches $(x_0, y_0)$.
The famous French mathematician Cauchy claimed in the early 19th century that if a function of two variables is continuous at a point in each variable separately then it is continuous at that point. He was mistaken, however. Consider the function $z(x, y) = \frac{xy}{x^2 + y^2}$. This function is defined everywhere except at (0, 0), since division by zero is undefined. We extend it to a function defined everywhere by defining z(0, 0) to be 0.
(a) Show that this function is continuous in each variable separately (i.e., z(x, 0) and z(0, y) are continuous as one-variable functions).
(b) Show, however, that the function is discontinuous at the origin by finding another way of approaching the origin so that the z-values do not approach the same value as above.
A trickier example still is $z(x, y) = 2xy^2/(x^2 + y^4)$.
(c) Show that if we approach the origin on any straight line, z approaches zero, but z has different limits
when the origin is being approached along the two parabolas $x = \pm y^2$.
(d) Illustrate both examples with plots of the functions.

§ 9.2. Tangent planes

Tangents are important in calculus, and in three dimensions that means tangent planes. Here I have drawn a surface and some of its tangent planes and normals:

The tangent plane to the surface f(x, y) above the point $(x_0, y_0)$ in the xy-plane is
$z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$

9.2.4. Explain why.

9.2.5. Give an expression for the normal vector in terms of derivatives.

9.2.3. Find the shortest distance from the point (1, 1, 1) to the plane 2x − y + 3z = 1. (Hint: Walk in the direction of the normal until you hit the plane.) The shortest distance is ______ times the length of the normal vector

[Figure caption: both positive but still a saddle.]

Such a “tricky saddle” would throw us off if we looked only at $f_{xx}$ and $f_{yy}$. To catch it we must study the mixed second derivative $f_{xy}$, because this basically measures “how much different the function is along the diagonal than along the axes” (as is quite clear from the reasoning in problem 9.1.6).

So the final version of our classification is this. If $f_{xx} f_{yy} - (f_{xy})^2 > 0$ then the “diagonal effect” is not strong enough to throw off the simple classification based on the axes directions, so we get either a max or a min according to the signs of $f_{xx}$ and $f_{yy}$. If $f_{xx} f_{yy} - (f_{xy})^2 < 0$ we have a saddle one way or the other (either a simple saddle if $f_{xx} f_{yy}$ alone is negative, or a tricky one if it becomes negative only after the diagonal effect has been subtracted). Just as in the one-variable case it can happen that none of these classification rules apply. In such cases we are on our own and must seek another way of understanding what is going on.
9.3.1. Confirm that the function z = xy that I showed you in §9.1 has a saddle at the origin.

9.3.2. What type of extremum does the function have at the origin?
  $f(x, y) = (x + y)^2$
  $f(x, y) = \sqrt{x^2 + y^2}$
  $f(x, y) = -x^2$
  $f(x, y) = \ln(1/e^{x^2 + y^2})$

(b) The price at which each item can be sold is a function of the total quantity produced, p = 520 − 10q. What is the real-world reason for this?
(c) Find the production levels that yield maximum profit.

9.3.5. A manufacturer produces a quantity q of a product to be sold on two markets. The price on each market depends on the quantity sold there according to the formulas
$p_a = 57 - 5q_a$, $p_b = 40 - 7q_b$
9.4.3. Go back and try out the gradient on the ice cream cone from §9.1. Interpret visually.

9.4.4. If z = f(x, y) is the roof of a building, in what direction will rain water flow?

Since ∇f tells us the direction of steepest ascent, it follows that −∇f is the direction of steepest descent, and that the directions perpendicular to ∇f correspond to no ascent at all, i.e., to going sideways while staying at the same height. (Visualise this on your tilted plane.) Another way of saying this is that the gradient is perpendicular to the contour curves (since the contour curves correspond to fixed height).

A clever application of these ideas is to the problem of finding the normal to a given curve. Let’s say I want to find the normal to for example the curve $x^2 y + 4y = 5$ at the point (1, 1). Then my first step is to think of this as a level curve of the function $f(x, y) = x^2 y + 4y$. At first this may seem like a very circuitous way of going about things: after all, the original problem was a nice and simple two-dimensional problem and here I am making rather a mess of it by imagining it to be a cross section of a surface situated in three-dimensional space. But sometimes generality is simplicity, and certainly so in this case. For now I know by the very simple arguments above that $\nabla f = (f_x, f_y) = (2xy, x^2 + 4)$ is perpendicular to the curve, so I immediately see that the normal at (1, 1) is (2, 5).

9.4.5. Find normal vectors for a few points on the curve $y = 3x^2 + x + 2$. Illustrate with a sketch.

All of these things also generalise to higher dimensions. In particular, we get for free the rather powerful result that the normal to a surface f(x, y, z) = c is given by the gradient $(f_x, f_y, f_z)$.

9.4.6. Go back to §9.2.1 and re-explain in this new light what was said about normal vectors there.

9.4.7. Show by an example that the gradient method is sometimes a more convenient way of finding a normal vector than the formula for normals that follows from the tangent plane equation (problem 9.2.5).

§ 9.4.2. Problems

9.4.8. What is the geometrical meaning of |∇f|? Hint: what happens if I take a unit step in this direction?

9.4.9. † Three cities are to be connected by roads. To minimise cost and environmental footprint, we want to minimise the total length of the roads. This is done by finding a point (x, y) between the three cities such that the sum of the distances to the cities, $D = D_1 + D_2 + D_3$, is as small as possible.
Once the coordinates of the cities are given the total distance is a function of x and y; for example, if the first city is at (a, b) then $D_1 = \sqrt{(x - a)^2 + (y - b)^2}$ and so on for the other cities. So the total function to be minimised is a sum of three such root expressions. We could minimise this formula the brute force way by the method of §9.3, but the calculations would not be pretty. Here is a more clever method.
(a) Find the gradient of $D_1$ as a function of x and y. Hint: This can be done without calculations (if you have solved problem 9.4.8).
(b) Do the same for $D_2$ and $D_3$ and draw the gradients in the figure.
(c) Explain why the sum of the gradients must be 0 at the optimum point.
(d) Argue that this implies that the angles between the gradients must be 120°, and that this fact is enough to find the optimum.
(e) ? The above method breaks down in certain exceptional cases. Explain.

§ 9.5. Constrained optimisation

§ 9.5.1. Lecture worksheet

The geometry of gradients also leads immediately to a simple way of solving constrained optimisation problems (the so-called Lagrange multiplier method). Suppose for example that we want to make f as big or as small as possible while being constrained to this dashed circle:

[Figure: a dashed constraint circle together with the contour curves f = 1, f = 2 and f = 3.]

Clearly the extrema will occur at the points where the constraint curve precisely touches one of f’s contour curves, as it does for f = 1 and f = 3. When the constraint curve cuts right through a contour, as it does for f = 2, there is one side with bigger values and one with smaller, so it can be neither a minimum nor a maximum. This idea is captured analytically as follows. The extremum points of f(x, y) subject to the constraint g(x, y) = c are found by solving the system of equations
$f_x = \lambda g_x$
$f_y = \lambda g_y$
$g = c$
The first two equations say that the gradient vectors of g and f differ only by a multiple, λ. Geometrically, this means that the normals of the constraint curve and a contour of f are parallel. This happens precisely where the curves touch each other, as shown in the picture.
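Both the normal computation of §9.4 and the Lagrange conditions rest on evaluating gradients. As a hedged numerical sketch (central differences instead of exact differentiation, hypothetical function names), the following checks the claim that the gradient of f(x, y) = x²y + 4y at (1, 1) is (2, 5) and that it is perpendicular to the level curve x²y + 4y = 5 there. The explicit form y = 5/(x² + 4) of the level curve is obtained by solving the equation for y.

```python
def gradient(f, x, y, h=1e-6):
    """Central-difference approximation of (f_x, f_y) at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

def f(x, y):
    return x * x * y + 4 * y

def level_curve_y(x):
    """The level curve x^2*y + 4*y = 5 solved for y."""
    return 5 / (x * x + 4)

def tangent(x, h=1e-6):
    """Tangent direction (1, dy/dx) of the level curve at abscissa x."""
    dydx = (level_curve_y(x + h) - level_curve_y(x - h)) / (2 * h)
    return 1.0, dydx

if __name__ == "__main__":
    gx, gy = gradient(f, 1.0, 1.0)
    tx, ty = tangent(1.0)
    print("gradient:", (gx, gy))
    print("gradient . tangent:", gx * tx + gy * ty)
```

A scalar product of (approximately) zero between the gradient and the tangent confirms that the gradient is normal to the curve at that point.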
For example, let’s find the maximum values of the saddle f(x, y) = xy subject to the constraint $x^2 + y^2 = 1$. So the constraint curve is the unit circle. When I go round it in the xy-plane, the corresponding points on the graph of f look like this:

9.5.5. $5x^2 + 6xy + 5y^2 = 8$ is a tilted ellipse:
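Returning to the example of f(x, y) = xy on the unit circle, the Lagrange condition can be checked numerically. The sketch below is a brute-force illustration, not the analytic method of the text: it walks round the circle in small angular steps, keeps the point with the largest value of f, and then tests whether the gradients of f and of g(x, y) = x² + y² are parallel there (their two-dimensional cross product should vanish). The function names and the step count are assumptions.

```python
import math

def f(x, y):
    return x * y

def grad_f(x, y):
    """Gradient of f(x, y) = x*y."""
    return y, x

def grad_g(x, y):
    """Gradient of the constraint function g(x, y) = x^2 + y^2."""
    return 2 * x, 2 * y

def best_on_circle(n=3600):
    """Scan the unit circle in n equal angular steps; return (max f, x, y)."""
    best = None
    for k in range(n):
        t = 2 * math.pi * k / n
        x, y = math.cos(t), math.sin(t)
        if best is None or f(x, y) > best[0]:
            best = (f(x, y), x, y)
    return best

def parallel_defect(x, y):
    """f_x*g_y - f_y*g_x, which is zero exactly when the two gradients are parallel."""
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    return fx * gy - fy * gx

if __name__ == "__main__":
    value, x, y = best_on_circle()
    print("largest value found:", value, "at", (x, y))
    print("parallel defect there:", parallel_defect(x, y))
```

At a point where the constraint circle merely cuts through a contour of f, the defect is clearly non-zero, which is the numerical counterpart of the geometric argument above.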
9.6.2. Explain why.

A more general version of the chain rule is: Given $z(x, y)$ and $x(u, v)$ and $y(u, v)$ as functions of $u$ and $v$:

$$\frac{dz}{du} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$$

9.6.3. Explain why one cannot “cancel the $\partial x$’s” in this formula. Express the formula in a different notation, which avoids this temptation. Is this better?

9.6.4. Consider the parabolic “bowl” $z = x^2 + y^2$. Starting at the point $(1, 1)$ in the $xy$-plane, how fast is the height $z$ changing if we move radially (straight away from the origin) or circularly (remaining at the same radius from the origin)? Use polar coordinates and the chain rule to find out.

§ 9.7.3. Planes

Equation for plane:

$$Ax + By + Cz = D, \qquad \text{normal vector} = (A, B, C)$$

The point $(10, 18, 3)$ is in a plane with normal vector $(3, -2, 4)$. Find the equation of the plane.

Since $(3, -2, 4)$ is a normal vector, the equation for the plane is $3x - 2y + 4z = D$ for some constant $D$. Plugging in the point $(10, 18, 3)$, we find that $D = 3 \cdot 10 - 2 \cdot 18 + 4 \cdot 3 = 30 - 36 + 12 = 6$. The equation of the plane is hence $3x - 2y + 4z = 6$.

Tangent plane to $f(x, y)$ above the point $(x_0, y_0)$:
(x,y)
$f_x = y^2 - \sin(x)$, $f_y = 2xy$

Find the partial derivatives of $f(x, y) = xe^{xy}$.

$f_x = e^{xy} + xye^{xy}$, $f_y = x^2 e^{xy}$

$f_x = 3x^2 - 3 = 0 \Rightarrow x = \pm 1$ and $f_y = 2y - 4 = 0 \Rightarrow y = 2$. There are thus two stationary points: $(1, 2)$ and $(-1, 2)$. We calculate: $f_{xx} = 6x$, $f_{yy} = 2$, $f_{xy} = 0$. At $(1, 2)$: $f_{xx} = 6 > 0$, $f_{yy} = 2 > 0$, $f_{xx} f_{yy} - f_{xy}^2 = 12 > 0 \Rightarrow$ minimum. At $(-1, 2)$: $f_{xx} = -6 < 0$, $f_{yy} = 2 > 0$, $f_{xx} f_{yy} - f_{xy}^2 = -12 < 0 \Rightarrow$ saddle.
Find and classify the stationary points of $g(x, y) = \frac{x^2 + y^2}{2} - \frac{1}{xy}$.

Stationary points occur where the partial derivatives are zero. This gives $g'_x(x, y) = x + \frac{1}{x^2 y} = 0$, which simplifies to $x^3 y = -1$, and $g'_y(x, y) = y + \frac{1}{x y^2} = 0$, which simplifies to $x y^3 = -1$. Solving for $y$ in the first equation gives $y = -\frac{1}{x^3}$, which plugged into the second gives $x\left(-\frac{1}{x^3}\right)^3 = -1 \Leftrightarrow \frac{1}{x^8} = 1 \Leftrightarrow x^8 = 1 \Leftrightarrow x = \pm 1$. Combining this with $y = -\frac{1}{x^3}$, we see that the critical points are $(1, -1)$ and $(-1, 1)$. To classify these points we need the second derivatives: $g''_{xx} = 1 - \frac{2}{x^3 y}$, $g''_{xy} = -\frac{1}{x^2 y^2}$, $g''_{yy} = 1 - \frac{2}{x y^3}$. For $(1, -1)$ we get $g''_{xx} = 1 - (-2) = 3$, $g''_{xy} = -1$, $g''_{yy} = 1 - (-2) = 3$. Thus $g''_{xx} > 0$ and $g''_{xx} g''_{yy} > (g''_{xy})^2$, so the point is a minimum. For $(-1, 1)$ we get $g''_{xx} = 1 - (-2) = 3$, $g''_{xy} = -1$, $g''_{yy} = 1 - (-2) = 3$. These are the same values as before, so this point is also a minimum.

$f(x, y) = \frac{1}{x^2 + y^2}$. You are standing at $(1, 1)$ in the $xy$ plane. Which way should you go to make $f$ grow the fastest?

$f_x = \frac{-2x}{(x^2 + y^2)^2} \Rightarrow f_x(1, 1) = -\frac{1}{2}$ and $f_y = \frac{-2y}{(x^2 + y^2)^2} \Rightarrow f_y(1, 1) = -\frac{1}{2}$. The direction of fastest increase in $f$ is $\nabla f = (f_x, f_y) = (-\frac{1}{2}, -\frac{1}{2})$ (in words: toward the origin).

Directional derivative in direction of unit vector $\hat{s}$:

$$\hat{s} \cdot \nabla f$$

Directional derivative in direction $\theta$:

$$f_x \cos\theta + f_y \sin\theta$$
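The two directional-derivative formulas can be compared numerically. The following is a minimal sketch, assuming the function $f(x, y) = xy^3$ at the point $(1, 1)$; the helper names (`grad_f`, `directional_derivative`, `finite_difference`) are illustrative and not part of the text.

```python
import math

def f(x, y):
    return x * y**3

def grad_f(x, y):
    # Analytic partial derivatives of f(x, y) = x*y^3.
    return (y**3, 3 * x * y**2)

def directional_derivative(grad, direction):
    # Scale the direction to a unit vector, then take the scalar product.
    norm = math.hypot(direction[0], direction[1])
    return (grad[0] * direction[0] + grad[1] * direction[1]) / norm

def finite_difference(func, point, direction, h=1e-6):
    # Independent check: central difference along the unit direction.
    norm = math.hypot(direction[0], direction[1])
    ux, uy = direction[0] / norm, direction[1] / norm
    x, y = point
    return (func(x + h * ux, y + h * uy) - func(x - h * ux, y - h * uy)) / (2 * h)

g = grad_f(1.0, 1.0)

# Moving straight away from the origin, i.e. in direction (1, 1) or theta = pi/4.
away = directional_derivative(g, (1.0, 1.0))
theta = math.pi / 4
via_angle = g[0] * math.cos(theta) + g[1] * math.sin(theta)

# Moving in the direction w = (-2, 1).
along_w = directional_derivative(g, (-2.0, 1.0))

print(away, via_angle, along_w)
```

Both forms give $2\sqrt{2}$ for the radial direction, and the direction $(-2, 1)$ gives $1/\sqrt{5}$.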
Find the rate of change of $f(x, y) = xy^3$ when moving away from the origin from the point $(1, 1)$.

$f_x = y^3 \Rightarrow f_x(1, 1) = 1$ and $f_y = 3xy^2 \Rightarrow f_y(1, 1) = 3$. When standing at $(1, 1)$, going away from the origin means going in the direction $\theta = \frac{\pi}{4}$. Therefore the directional derivative is $f_x \cos\theta + f_y \sin\theta = \cos\frac{\pi}{4} + 3\sin\frac{\pi}{4} = 2\sqrt{2}$.

Find the rate of change of $f(x, y) = xy^3$ when moving in the direction $w = (-2, 1)$ from the point $(1, 1)$.

We know from the above example that $\nabla f = (1, 3)$ at this point. Moreover, $w$ converted to a unit vector is $\hat{w} = w/|w| = (-\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}})$. Thus the directional derivative is $\hat{w} \cdot \nabla f = (-\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}) \cdot (1, 3) = 1/\sqrt{5}$.

Find and classify the stationary points of $f(x, y) = xye^{x-y}$.

For ease of writing, let $E = e^{x-y}$. Note that $E > 0$.
$f_x = yE + xyE = 0 \Rightarrow y + xy = 0 \Rightarrow y = 0$ or $x = -1$.
$f_y = xE - xyE = 0 \Rightarrow x - xy = 0 \Rightarrow x = 0$ or $y = 1$.
So the stationary points are $(-1, 1)$ and $(0, 0)$.
The second derivatives are $f_{xx} = 2yE + xyE$, $f_{yy} = -2xE + xyE$, and $f_{xy} = E - yE + xE - xyE$.
$(x, y) = (-1, 1) \Rightarrow f_{xx} = E > 0$, $f_{yy} = E > 0$, $f_{xy} = 0$, so minimum.
$(x, y) = (0, 0) \Rightarrow f_{xx} = 0$, $f_{yy} = 0$, $f_{xy} = E \Rightarrow f_{xx} f_{yy} - (f_{xy})^2 = -E^2 < 0$, so saddle.

Find and classify the stationary points of $f(x, y) = x^3 + y^3 + 6xy + 2$.
Stationary points occur where $\frac{\partial f(x,y)}{\partial x} = \frac{\partial f(x,y)}{\partial y} = 0$, which in our case becomes $3x^2 + 6y = 0$ and $3y^2 + 6x = 0$, or, simplifying, $x^2 + 2y = 0$ and $y^2 + 2x = 0$. Solving for $y$ in the first equation gives $y = -\frac{1}{2}x^2$, which inserted in the second gives $(-\frac{1}{2}x^2)^2 + 2x = 0$, or $x^4 + 8x = 0$. This factors into $x(x^3 + 8) = 0$. Thus $x = 0$ or $x^3 = -8$, which means $x = -2$. When $x = 0$ we get $y = 0$ and when $x = -2$ we get $y = -2$. We thus have two stationary points: $P_1 = (0, 0)$ and $P_2 = (-2, -2)$. To classify them we calculate the second derivatives: $A = \frac{\partial^2 f(x,y)}{\partial x^2} = 6x$, $C = \frac{\partial^2 f(x,y)}{\partial y^2} = 6y$, $B = \frac{\partial^2 f(x,y)}{\partial x \partial y} = 6$. For $P_1$ we get $AC - B^2 = -36 < 0$, so it’s a saddle. For $P_2$ we get $AC - B^2 = 144 - 36 > 0$, while $A = -12 < 0$, so this is a maximum.

• Find the normal to a given curve or surface at a given point.

Write in form $f(x, y) = 0$ (curve) or $f(x, y, z) = 0$ (surface). Compute the gradient of $f$ and evaluate it at the given point. This is the normal.

Find a normal vector to the surface $z = x^2 + y^2$ at the point $(1, 1, 2)$.

The surface is the level surface $f = 0$ of the function $f(x, y, z) = x^2 + y^2 - z$. A normal vector is therefore $\nabla f = (2x, 2y, -1)$, which evaluated at the point in question is $(2, 2, -1)$.
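The second-derivative test used in the stationary-point examples can be written as a small routine. This is only a sketch under the assumption that the second partial derivatives are supplied by hand; it is applied here to $f(x, y) = x^3 + y^3 + 6xy + 2$, and the function names are illustrative.

```python
def gradient(x, y):
    # First partial derivatives of f(x, y) = x^3 + y^3 + 6xy + 2.
    return (3 * x**2 + 6 * y, 3 * y**2 + 6 * x)

def second_derivatives(x, y):
    # (f_xx, f_yy, f_xy) for the same function.
    return (6 * x, 6 * y, 6)

def classify(fxx, fyy, fxy):
    # Sign of f_xx * f_yy - f_xy^2 decides between saddle and extremum.
    disc = fxx * fyy - fxy**2
    if disc < 0:
        return "saddle"
    if disc > 0:
        return "minimum" if fxx > 0 else "maximum"
    return "inconclusive"

for point in [(0, 0), (-2, -2)]:
    # Each candidate must first make both partial derivatives vanish.
    print(point, gradient(*point), classify(*second_derivatives(*point)))
```

The case where the discriminant is exactly zero is reported as inconclusive, since the test gives no information there.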
[Figure: contour lines $f = 1$, $f = 2$, $f = 3$.]

Any local maxima or minima will be among the pairs of points $(x, y)$ that satisfy these equations. To determine which are max., min., or neither, plug them into $f(x, y)$ and compare their values. If the constraint curve has endpoints, these are also potential max. or min. If the constraint curve is infinite, also investigate the limit behaviour of the function there (e.g., there is no global max. or min. if the function grows or shrinks without bound in such a direction).

Find the maximum value of $f(x, y) = xy$ subject to the constraint $x^2 + y^2 = 1$.

We need to solve the system

$$y = 2\lambda x, \qquad x = 2\lambda y, \qquad x^2 + y^2 = 1$$

Putting the first equation into the second gives $x = 2\lambda(2\lambda x)$. If we divide by $x$ we get $4\lambda^2 = 1$, or $\lambda = \pm\frac{1}{2}$. Of course, whenever we divide by $x$ we must take care that we are not dividing by zero. But in this case $x$ cannot be zero, since if it is then so is $y$, which is impossible by the last equation. Putting $\lambda = \pm\frac{1}{2}$ back into the first two equations we find that $y = \pm x$. Combining this with the last equation we get $x^2 + (\pm x)^2 = 1$, or $x = \pm\frac{1}{\sqrt{2}}$. Since $y = \pm x$, all possible solutions are $(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, $(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$, $(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, $(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$. To see which are maxima and which are minima, we plug these back into the original function $f(x, y) = xy$. The largest value of $f$ is thus $\frac{1}{2}$ and the smallest $-\frac{1}{2}$.

$g = c \Leftrightarrow 5x + 4y = 100$
$f_x = \lambda g_x \Leftrightarrow 2y = 5\lambda \Rightarrow \lambda = 2y/5$
$f_y = \lambda g_y \Leftrightarrow 2x = 4\lambda \Rightarrow \lambda = x/2$

Hence $2y/5 = x/2$, which substituted into the constraint gives $x = 10$ and $y = 12.5$. Our candidate for a maximum point is thus $(10, 12.5)$, where $f = 250$. But since the constraint curve is an infinite line we must also investigate the limit values of $f$ as we go toward infinity along the line in either direction. The constraint condition shows that if $x \to \infty$ then $y \to -\infty$ and if $x \to -\infty$ then $y \to \infty$. In either case, $f \to -\infty$. Hence our candidate maximum is indeed the biggest $f$ ever gets.

Determine the maxima and minima of $f(x, y) = 5x^2 - 2y^2 + 10$ on the curve $x^2 + y^2 = 1$.

$g = c \Leftrightarrow x^2 + y^2 = 1$
$f_x = \lambda g_x \Leftrightarrow 10x = 2\lambda x \Rightarrow \lambda = 5$ or $x = 0$
$f_y = \lambda g_y \Leftrightarrow -4y = 2\lambda y \Rightarrow \lambda = -2$ or $y = 0$

So the stationary points are $(\pm 1, 0)$ and $(0, \pm 1)$. $f(\pm 1, 0) = 15$ and $f(0, \pm 1) = 8$, so $(\pm 1, 0)$ are maxima and $(0, \pm 1)$ minima.

§ 9.7.7. Multivariable chain rule

Total differential:

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy$$

Chain rule for partial derivatives when $x$ and $y$ are functions of $t$:

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$

Chain rule for partial derivatives when $x$ and $y$ are functions of $u$ and $v$:

$$\frac{dz}{du} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$$
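The chain rule for $z(x(t), y(t))$ from this reference summary can be sanity-checked against a direct numerical derivative. The sketch below assumes an arbitrary illustrative choice, $z = x^2 y$ with $x = \cos t$ and $y = t^2$; none of these functions come from the text.

```python
import math

def z(x, y):
    return x * x * y

def x_of(t):
    return math.cos(t)

def y_of(t):
    return t * t

def dz_dt_chain(t):
    # dz/dt = (dz/dx)(dx/dt) + (dz/dy)(dy/dt), each factor computed by hand.
    x, y = x_of(t), y_of(t)
    dz_dx = 2 * x * y
    dz_dy = x * x
    dx_dt = -math.sin(t)
    dy_dt = 2 * t
    return dz_dx * dx_dt + dz_dy * dy_dt

def dz_dt_numeric(t, h=1e-6):
    # Central difference of the composite function t -> z(x(t), y(t)).
    return (z(x_of(t + h), y_of(t + h)) - z(x_of(t - h), y_of(t - h))) / (2 * h)

t0 = 0.7
print(dz_dt_chain(t0), dz_dt_numeric(t0))
```

The two printed values should agree to several decimal places; a mismatch usually indicates a sign or factor error in one of the hand-computed partial derivatives.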
10 MULTIVARIABLE INTEGRAL CALCULUS

§ 10.1. Multiple integrals

§ 10.1.1. Lecture worksheet

We know that $\int y(x)\,dx$ is an area made up of thin rectangles with base $dx$ and height $y(x)$. Likewise, $\iint f(x, y)\,dx\,dy$ is a volume made up of thin rectangular blocks with base area $dx\,dy$ and height $f(x, y)$. This is called a double integral. To evaluate it we integrate twice: once with respect to $x$ and once with respect to $y$. Geometrically, the first integration gives an expression for the cross-sectional areas of the shape, and the second integration finds the volume by endowing each of these areas with an infinitesimal thickness.

10.1.1. Consider for example the tetrahedron with vertices $(0, 0, 0)$, $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$.

(a) Argue that the “roof” of this tetrahedron is given by $z = 1 - x - y$.

Suppose I intersect the tetrahedron with a plane perpendicular to the $y$-axis.

(b) Argue that the cross-sectional area is $\int_0^{1-y} (1 - x - y)\,dx$.

In general terms, to compute the double integral $\iint_R f(x, y)\,dx\,dy$ over some region $R$, we write it as an iterated integral

$$\int_a^b \int_{x_0(y)}^{x_1(y)} f(x, y)\,dx\,dy$$

or

$$\int_c^d \int_{y_0(x)}^{y_1(x)} f(x, y)\,dy\,dx$$

10.1.2. Find, in both ways, $\iint_R y\,dx\,dy$ where $R$ is the region between $y = x$ and $y = x^2$.

§ 10.2. Polar coordinates

§ 10.2.1. Lecture worksheet

Polar coordinates (§7.1) are useful when evaluating double integrals since many regions of a circular or radial nature are much more naturally and easily described in polar than rectangular coordinates. When an integral in ordinary rectangular coordinates $x, y$ is rewritten in polar coordinates $r, \theta$ it becomes

$$\iint_R f(x, y)\,dx\,dy = \int_\alpha^\beta \int_{r_0(\theta)}^{r_1(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$$

where $\alpha, \beta$ are the angular ($\theta$) bounds of $R$, and $r_0(\theta), r_1(\theta)$ are the radial ($r$) bounds for $R$ for any given $\theta$. The manner in which $x$ and $y$ have been translated we recognise from before (§7.5.1). It remains to explain where the extra $r$ comes from in the area element expression $r\,dr\,d\theta$. We can understand this as follows. In an integral in rectangular coordinates, if we let $x$ increase by increments $dx$ and $y$ by increments $dy$, this generates a grid of identical rectangles. But if we let $r$ increase by increments $dr$ and $\theta$ by increments $d\theta$, this generates a different kind of grid in which the cells have different sizes.

[Figure: polar grid cell with sides $dr$ and $r\,d\theta$.]
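The role of the area scaling factor $r$ in polar coordinates can be illustrated numerically. The following is a rough sketch using a midpoint rule on a polar grid; it assumes constant radial bounds for simplicity, and the function name `polar_integral` is an illustrative choice rather than anything defined in the text.

```python
import math

def polar_integral(f, r0, r1, alpha, beta, n=200):
    # Midpoint rule over a polar grid; each cell has area r * dr * dtheta.
    dr = (r1 - r0) / n
    dtheta = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        r = r0 + (i + 0.5) * dr
        for j in range(n):
            theta = alpha + (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += f(x, y) * r * dr * dtheta
    return total

# Area of the unit disk: integrate f = 1 over 0 <= r <= 1, 0 <= theta <= 2*pi.
area = polar_integral(lambda x, y: 1.0, 0.0, 1.0, 0.0, 2 * math.pi)

# Integral of x^2 + y^2 over the unit disk, which equals pi/2.
second = polar_integral(lambda x, y: x * x + y * y, 0.0, 1.0, 0.0, 2 * math.pi)

print(area, math.pi)
print(second, math.pi / 2)
```

Dropping the factor `r` from the sum would give $2\pi$ instead of $\pi$ for the disk area, which is one way to see that the cells of the polar grid are not all the same size.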
§ 10.2.2. Problems

10.2.3. Consider the integral $I = \int_0^\infty e^{-x^2}\,dx$. The function $e^{-x^2}$ cannot be integrated in closed form using any of our previous integration techniques, so we cannot evaluate $I$ by direct integration. Nevertheless we can evaluate this integral by an ingenious use of multiple integration, as we shall now see.

§ 10.4. Spherical coordinates

§ 10.4.1. Lecture worksheet

Spherical coordinates characterise points in 3-dimensional space by means of one radial coordinate and two angular coordinates (analogous to longitude and latitude). Translating an integral into spherical coordinates gives:
z dz
some distance a > R away at the point (a, 0, 0) is the same
as that of a point-mass M located at the origin.
(g) Argue that two of the integrals can be evaluated without knowing $s$. Do so.
Hint: Use the law of cosines to find the relation between $s$ and the current variable. Also use trigonometry to rewrite the integrand purely in terms of $s$.
(h) For the remaining integral, make a change of variables to rewrite the integral as an integral in $s$.
(i) Evaluate the integral.
(j) Conclude.

10.4.4. Solve problem 10.4.3 in the case where the mass $m$ is located inside the shell.

10.5.1. Show that the area of the surface $z = f(x, y)$ above an infinitesimal rectangle of sides $dx$, $dy$ is

$$\sqrt{1 + f_x^2 + f_y^2}\;dx\,dy$$

$x = r\cos\theta$
$y = r\sin\theta$

Area scaling factor:

$$dx\,dy = r\,dr\,d\theta$$

where $\alpha, \beta$ are the angular ($\theta$) bounds of $R$, and $r_0(\theta), r_1(\theta)$ are the radial ($r$) bounds for $R$ for any given $\theta$.

§ 10.6.2. Cylindrical coordinates

[Figure: the point $(r, \theta, z)$ shown relative to the $x$, $y$, $z$ axes.]

$x = r\cos\theta$

Integral transformation formula:

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iiint_R f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz$$

$$dx\,dy\,dz = \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta$$

Integral transformation formula:

$$\iiint_R f(x, y, z)\,dx\,dy\,dz =$$
Find the cartesian coordinates $(x, y, z)$ of the points $(r, \theta, \phi) = (1, 0, 0)$ and $(r, \theta, \phi) = (2, \pi/2, \pi/2)$.

$(0, 0, 1)$, $(0, 2, 0)$

Find the spherical polar coordinates $(r, \theta, \phi)$ of the points $(x, y, z) = (0, 1, 0)$ and $(x, y, z) = (1, 2, 2)$.

First decide whether the region of integration $R$ is most easily described in terms of rectangular coordinates $x, y$ or polar coordinates $r, \theta$. If you choose polar coordinates, translate the given function $f(x, y)$ into $r, \theta$ and multiply it with the area scaling factor ($r$), as shown above.

Next write the integral as an iterated integral. We seek to choose the order of integration in such a way that the bounds are most easily expressed. We will evaluate the integrals from the inside out, so the inner differential and the inner bounds of integration correspond to the first integration. But when writing down the bounds it is easier to work from the outside in. For the outer integral, the bounds should be numbers (constants) expressing the bounds between which the region is contained as far as the outer variable is concerned. When specifying the inner bounds of integration you may use the outer variable in your expressions for these bounds; this is necessary whenever the bounds of the region with respect to the inner variable are different for different values of the outer variable. If expressing the inner bounds becomes very complicated, this suggests that we should try the other ordering of the variables.

When the integrals have been written down completely, evaluate them one at a time, going from the inside out.

$\iint_D \frac{1}{1+x^2}\,dx\,dy$ where $D$ is given by $0 \le y \le x \le 1$.

$$= \int_0^1 \left( \int_0^x \frac{1}{1+x^2}\,dy \right) dx = \int_0^1 \frac{x}{1+x^2}\,dx = \left[ \tfrac{1}{2}\ln(1+x^2) \right]_0^1 = \tfrac{1}{2}(\ln 2 - \ln 1) = \frac{\ln 2}{2}$$

$\iint_D e^x\,dx\,dy$, where $D$ is the triangular region with vertices $(0, 0)$, $(1, 1)$, $(1, 2)$.

The sides of the triangle are on the lines $y = x$, $y = 2x$, $x = 1$. Hence $D$ corresponds to $x \le y \le 2x$, $0 \le x \le 1$. Thus

$$\iint_D e^x\,dx\,dy = \int_0^1 \left( \int_x^{2x} e^x\,dy \right) dx = \int_0^1 \left[ e^x y \right]_{y=x}^{y=2x} dx = \int_0^1 xe^x\,dx = \left[ xe^x \right]_0^1 - \int_0^1 e^x\,dx = e - \left[ e^x \right]_0^1 = e - e + 1 = 1.$$

$\iint_D xy^2\,dx\,dy$, where $D = \{(x, y) : x^2 + y^2 \le 4,\ x \ge 0\}$.

In polar coordinates the region corresponds to $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$ and $0 \le r \le 2$. Thus:

$$\iint_D xy^2\,dx\,dy = \int_{-\pi/2}^{\pi/2} \int_0^2 (r\cos\theta)(r\sin\theta)^2\,r\,dr\,d\theta = \int_{-\pi/2}^{\pi/2} \sin^2\theta\cos\theta\,d\theta \int_0^2 r^4\,dr = \left[ \tfrac{1}{3}\sin^3\theta \right]_{-\pi/2}^{\pi/2} \left[ \tfrac{1}{5}r^5 \right]_0^2 = \tfrac{1}{3}(1 - (-1)) \cdot \tfrac{1}{5}(32 - 0) = \frac{64}{15}.$$

$\iint_D \frac{x}{(x^2+y^2)^2}\,dx\,dy$, where $D = \{(x, y) : 0 \le y \le x,\ 1 \le x^2 + y^2 \le 2\}$.

In polar coordinates $D$ corresponds to $E = \{(r, \theta) : 1 \le r \le \sqrt{2},\ 0 \le \theta \le \pi/4\}$. Thus

$$\iint_D \frac{x}{(x^2+y^2)^2}\,dx\,dy = \iint_E \frac{r^2\cos\theta}{r^4}\,dr\,d\theta = \int_1^{\sqrt{2}} r^{-2}\,dr \int_0^{\pi/4} \cos\theta\,d\theta = \left[ -\frac{1}{r} \right]_1^{\sqrt{2}} \left[ \sin\theta \right]_0^{\pi/4} = \left( -\frac{1}{\sqrt{2}} + 1 \right) \cdot \frac{1}{\sqrt{2}} = \frac{1}{\sqrt{2}} - \frac{1}{2}.$$

• Evaluate a triple integral $\iiint_R f(x, y, z)\,dx\,dy\,dz$.

First decide whether the region of integration $R$ is most easily described in terms of rectangular coordinates $x, y, z$, cylindrical coordinates $r, \theta, z$ (for regions with a symmetry axis, which we make the $z$-axis), or spherical coordinates $\rho, \phi, \theta$. If you choose non-rectangular coordinates, translate the given function $f(x, y, z)$ into your chosen coordinates (using standard formulas $x = \cdots$, $y = \cdots$, $z = \cdots$), and multiply it with the volume scaling factor, as shown above.

Next write the integral as an iterated integral. We seek to choose the order of integration in such a way that the bounds are most easily expressed. We will evaluate the three integrals from the inside out, so the innermost differential and
the innermost bounds of integration correspond to the first
integration. But when writing down the bounds it is eas-
ier to work from the outside in. For the outermost inte-
gral, the bounds should be numbers (constants) expressing
the bounds between which the region is contained as far as
the outermost variable is concerned. When specifying the
next bounds of integration you may use the outermost vari-
able in your expressions for these bounds; this is necessary
whenever the bounds of the region with respect to the current variable are different for different values of the outermost variable. Similarly, the expressions for the last bounds may contain both of the two outer variables. If expressing the inner bounds becomes very complicated, this suggests that we
should try another ordering of the variables.
When the integrals have been written down completely, eval-
uate them one at a time, going from the inside out.
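The inside-out evaluation described above can be mirrored numerically. The sketch below handles a double integral with constant outer bounds and inner bounds that depend on the outer variable; the function name `iterated_integral` and the midpoint rule are assumptions made for illustration. It is applied to $\iint_D e^x\,dx\,dy$ over $x \le y \le 2x$, $0 \le x \le 1$, and to $\iint_D \frac{1}{1+x^2}\,dx\,dy$ over $0 \le y \le x \le 1$.

```python
import math

def iterated_integral(f, a, b, lower, upper, n=400):
    # Outer variable x runs between the constants a and b.
    # Inner variable y runs between lower(x) and upper(x).
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        y0, y1 = lower(x), upper(x)
        dy = (y1 - y0) / n
        # Inner integration first: a cross-section at this fixed x.
        inner = sum(f(x, y0 + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total

triangle = iterated_integral(
    lambda x, y: math.exp(x), 0.0, 1.0, lambda x: x, lambda x: 2 * x
)
wedge = iterated_integral(
    lambda x, y: 1.0 / (1.0 + x * x), 0.0, 1.0, lambda x: 0.0, lambda x: x
)

print(triangle)             # close to 1
print(wedge, math.log(2) / 2)
```

A triple integral follows the same pattern with one more nested loop, where the innermost bounds may depend on both outer variables.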
11 VECTOR CALCULUS

11.1.1. One very nice aspect of this analogy is that it makes it obvious “why” the force of gravity (and electrostatic attraction) diminishes as the inverse square of the distance. Explain how. (Hint: Imagine how the water would need to move to fill a hole pumped out in the middle. It may help to consider a conical section of the water.)

If I pump water in at one point and suck it out at another I can find the net effect by superimposing these two pictures. It looks like this:
11.1.4. Sketch visually how the net field is obtained by adding up the previous two.

11.1.5. Give an explicit formula for $\mathbf{F}$ in this case.

This fluid flow scenario models the way electric current behaves if we connect the two poles of a battery to two distinct points of a conducting sheet of metal. The analogy between electricity and fluid flow is so powerful that thinking in terms of an “electric fluid” is often the most intuitive way of understanding electric phenomena.

In the combined field I also traced (in light gray) the “flow curves” that follow the arrows.

11.1.6. Prove that the flow curves are in fact (parts of) circles, as follows. (We assume, of course, that the source and the sink are of equal magnitude.)

[Figure: source (+), sink (−), and the field vectors $\mathbf{F}_+$, $\mathbf{F}_-$, $\mathbf{F}$.]

(a) Prove that the double-striped angles are equal and that their triangles are similar. Hint: the “$1/r$” force law is the key to similarity.
(b) Infer that the single-striped angles are equal.
(c) Infer that the flow curve is a circle. Hint: draw the midpoint of the circle and the relevant radii, and consider the relations among the base angles of the isosceles triangles that arise to prove that $\mathbf{F}$ is tangent to the circle.

This last problem gives us occasion to reflect further on the relationship between a force field and its fluid flow analog. The problem shows that a small piece of paper dropped into our imaginary fluid will flow along such a circular arc, not that a particle in the corresponding force field will move in this way: the particle’s momentum will cause it to deviate from the flow lines, as planets and projectiles do in the gravitational force field. The velocity of the imaginary fluid is the acceleration (or force, which in effect comes to the same thing since $F = ma$) of the particle in the force field. So the fluid analogy is just a way of conceptualising forces, not an actual flow in which real-world objects can ride along like boats.

Note also that, as again highlighted by this last problem, the flow of our imaginary fluid is determined solely by its moving in the direction of lowest pressure; it does not accumulate momentum, which would interfere with this defining property. We can imagine our fluid as composed of a myriad little particles moving about chaotically and bouncing into each other all over the place. There will then be a net tendency for the fluid to move towards areas of lower pressure (since there are fewer particles to bump into in that direction), but at the same time there will never be a single, coordinated mass moving in any one direction, so the issue of momentum does not arise.

§ 11.2. Divergence

§ 11.2.1. Lecture worksheet

When thinking of a vector field $\mathbf{F}$ as a fluid flow, a fundamental question is how much fluid is being generated at a given point. This is called the divergence of the field.

The divergence can easily be found in terms of the derivatives of $\mathbf{F} = (P, Q, R)$ as follows. Consider an infinitesimal cube, and consider first the two walls of the cube that are pierced by the $x$-direction. The divergence in this direction is the difference between how much is flowing in through the left wall and how much is flowing out through the right wall. The difference in flow intensity is how much $P$ changes in between, so $\frac{\partial P}{\partial x}\,dx$. This flow intensity acts across the area $dy\,dz$ of the wall, so the total excess flux generated inside the cube in this direction is $\frac{\partial P}{\partial x}\,dx\,dy\,dz$. And the same in the other directions. So the flux generated per unit volume is

$$\operatorname{div} \mathbf{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}$$

Another notation for this is $\nabla \cdot \mathbf{F}$, where $\nabla$ (“nabla”) is the formal vector $(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z})$.

11.2.1. Sketch the fields $\mathbf{A} = (x, 0, 0)$ and $\mathbf{B} = (0, x, 0)$. Compute their divergence. Interpret in terms of fluid flow.

For any region in space, the net flow out of it is the amount of fluid generated inside it. This evident fact is expressed in the so-called Divergence Theorem or Gauss’s Theorem:

$$\iiint \operatorname{div} \mathbf{F}\,dV = \iint (\mathbf{F} \cdot \mathbf{n})\,dS$$

The left hand side is the integral over a region in space with volume element $dV$, so it expresses the amount of fluid generated inside it. The right hand side is the integral across the boundary of this region with normal $\mathbf{n}$ and surface element $dS$, so it measures how much of the flow is going out of the region (i.e., in the direction of the normal).
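The Divergence Theorem can be checked numerically on a simple region. The sketch below assumes the unit cube $[0, 1]^3$ and an illustrative field $\mathbf{F} = (x^2, xy, z)$, neither of which comes from the text; it compares the volume integral of $\operatorname{div} \mathbf{F}$ with the outward flux through the six faces, using midpoint sums and finite-difference derivatives.

```python
def F(x, y, z):
    # Illustrative field (P, Q, R); its divergence is 2x + x + 1 = 3x + 1.
    return (x * x, x * y, z)

def div_F(x, y, z, h=1e-5):
    # dP/dx + dQ/dy + dR/dz by central differences.
    dP = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dQ = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dR = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dP + dQ + dR

def volume_integral(n=20):
    # Sum of div F over small cubes of volume d^3.
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    return sum(div_F(x, y, z) for x in mids for y in mids for z in mids) * d**3

def outward_flux(n=20):
    # Sum of F . n over the six faces; n is +/- a coordinate direction.
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    total = 0.0
    for u in mids:
        for v in mids:
            total += F(1.0, u, v)[0] - F(0.0, u, v)[0]  # faces x = 1 and x = 0
            total += F(u, 1.0, v)[1] - F(u, 0.0, v)[1]  # faces y = 1 and y = 0
            total += F(u, v, 1.0)[2] - F(u, v, 0.0)[2]  # faces z = 1 and z = 0
    return total * d * d

print(volume_integral(), outward_flux())
```

For this field both sides equal $5/2$: the fluid generated inside the cube matches what leaves through its walls.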
§ 11.3. Line integrals

§ 11.3.1. Lecture worksheet

Given a force field $\mathbf{F}$, we recall from §4.4 that we can find the work it performs on an object being made to traverse a path in it by the integral $\int \mathbf{F} \cdot d\mathbf{s}$ taken along the path (although we are integrating along a curve the integral is nevertheless called a line integral, a very stupid name). Here $d\mathbf{s}$ is an infinitesimal piece of the curve, and by taking the scalar product we are projecting $\mathbf{F}$ onto it, i.e., we are counting only the component of the force that acts in the direction of the curve. In particular, if the force is perpendicular to the curve it has no effect at all and might as well be absent altogether. Thus we must not think that any effort is required to keep the object from deviating from the path; rather the object must be understood to be constrained to the path in a natural manner. A prototype example would be a hockey puck sliding on the surface of a frozen lake. Friction aside, gravity has no effect on the puck’s free motion along this surface, and no effort is required to keep the puck from “turning downwards.” Generalising from this example, we can imagine any line integral as a puck following a frictionless ice channel, such as a groove in the ice. The idea is that the ice channel ensures that the puck’s inertial velocity is directed along the path of motion in a “lossless” fashion, as if this had been its natural inertial motion. Under these conditions the work done by the field amounts to how much it speeds up or slows down the motion along the path.

11.3.1. Discuss some line integrals in the gravitational force field near the earth, $\mathbf{F} = (0, -mg)$. Explain how this relates to potential energy. Also explain the meaning of the sign of the integral.

On a more astronomical scale, the gravitational pull of an object of great mass is $-GMm/r^2$, directed radially towards it. The net work done when moving a body along any closed path in this field is zero.

11.3.2. Show that this follows from energy conservation. (Hence such fields are called conservative.)

11.3.3. Also prove the same result more directly as follows.

(a) First prove the result for a path made up of pieces that are either radial or circular with respect to the center of force, such as this:

[Figure: closed path made up of radial and circular pieces.]

(b) Now consider an infinitesimal right triangle whose two legs are such radial and circular lines, and prove that the work along these legs is the same as the work along the hypotenuse.
(c) Conclude the proof of the desired result.

11.3.4. Infer that, in this force field, the work done in bringing an object from one point to another is independent of the path taken. Note that this holds for any conservative field.

Since distance travelled can also be expressed as velocity times time, $d\mathbf{s} = \dot{\mathbf{x}}\,dt$, another way of writing $\int \mathbf{F} \cdot d\mathbf{s}$ is $\int \mathbf{F} \cdot \dot{\mathbf{x}}\,dt$, which is a more practical form of the work integral for cases where the path $\mathbf{x}$ is given as a parametrised curve.

11.3.5. (a) Find an explicit formula for $\mathbf{F}$ for the radial gravitational force field. Hint: The vector $(x, y, z)$ points from the origin to this point, so the vector $(-x, -y, -z)$ points from this point to the origin. It remains only to scale it so that it has the right magnitude.
(b) Confirm by explicit calculation that the work done by the field on an object traversing the circle $(a + \cos t, \sin t, 0)$ is zero. Note that the special case $a = 0$ is easy to deal with both physically and computationally.

Another way of seeing that a field is conservative is to think of its integrand as the derivative of something. For if $\int_0^T \mathbf{F} \cdot \dot{\mathbf{x}}\,dt = \int_0^T y'\,dt$ then by the fundamental theorem of calculus the integral evaluates to $y(T) - y(0)$, which is zero if the start and end points are the same. This happens precisely when the field is a gradient field, i.e., $\mathbf{F} = \nabla f$ for some function $f(x, y, z)$.

11.3.6. Explain why in this case

$$\frac{d}{dt} f(\mathbf{x}(t)) = \nabla f(\mathbf{x}(t)) \cdot \dot{\mathbf{x}}(t)$$

and use this to show that

$$\int_0^T \mathbf{F} \cdot \dot{\mathbf{x}}\,dt = f(\mathbf{x}(T)) - f(\mathbf{x}(0))$$

This function $f$ has an important physical meaning as the following problem shows.

11.3.7. (a) Find an $f$ such that $\mathbf{F} = \nabla f$ for the radial gravitational field $\mathbf{F}$.
(b) Show that $f$ is the potential energy. By definition the potential energy is the same thing as the work obtained by letting the object fall to the center of force.
(c) An application of this: For a body orbiting the sun, if at some point in its orbit it is twice as far from the sun as at another point, what is the difference in velocity between these two points? Hint: Use energy conservation.

This meaning of $f$ as the potential energy generalises to any conservative force field, as we can see by the following reasoning.

11.3.8. Let $\mathbf{F}$ be any conservative field and define $f(\mathbf{x})$ as $\int_{\mathbf{o}}^{\mathbf{x}} \mathbf{F} \cdot d\mathbf{s}$, where the integral is taken along any path from the arbitrarily fixed origin $\mathbf{o}$ to the general point $\mathbf{x}$.
(a) Explain why it is necessary for $\mathbf{F}$ to be conservative for this construction to make sense.
(b) Show that $\nabla f = \mathbf{F}$.
(c) Conclude that $f$ is a potential energy function. What does the arbitrariness of $\mathbf{o}$ mean in physical terms?

Restated in purely mathematical terms, this shows that any conservative field is a gradient field.

§ 11.3.2. Problems

11.3.9. Show that the field $\mathbf{F}(x, y) = (P(x, y), Q(x, y))$ is conservative if and only if $P_y = Q_x$ at every point in the plane.

§ 11.4. Circulation

§ 11.4.1. Lecture worksheet

We have seen that if $\mathbf{F}$ is a force field then $\int \mathbf{F} \cdot d\mathbf{s}$ is the work done by the field in traversing a certain path, that is, the “boost” that the field is giving us, like a wind in our backs, as we traverse the path.

A problem with this image is that we must disregard the forces that are trying to push us off the path. This is what we accomplished above with our “ice channel” idea. The analogous idea for fluid flows would be to imagine that all of the fluid except for a narrow channel is instantaneously frozen. Then the fluid in the channel will, in general, continue flowing around one way or the other depending on which direction had the greater momentum in the original flow, and this net balance of momenta is precisely what the integral $\oint \mathbf{F} \cdot d\mathbf{s}$ computes.

For this reason, when the work integral $\oint \mathbf{F} \cdot d\mathbf{s}$ is taken around a closed path (indicated by the little circle on the integral sign) it is called the circulation. To fix the sign, the convention is that we traverse the path so that the inside is on our left. The figure thus shows a negative circulation.

The suggestive idea of the ice channel is all we really need, but if you wish to think more about what actually happens to the fluid you should be able to convince yourself that, owing to its incompressibility, the fluid will settle into a circulation of uniform speed (as long as there are no sources or sinks inside the channel). So, when we freeze the rest of the fluid, the flow in the channel also alters, and therefore ceases to represent the forces in the force field interpretation of $\mathbf{F}$. Nevertheless, it remains a viable image for the net work done along the channel as a whole, which was its purpose in the first place.

You may also wish to think about how to reconcile the momentum-based account of circulation with the particle-kinetic fluid model that we used at the end of §11.1 to argue against momentum effects. Hint: the narrow channel now means that the momenta are coordinated after all.

We shall now show that the circulation along a loop can be computed as the sum of the circulations around its interior pieces. To this end it is useful to consider first the circulation around an infinitesimal square, and then to take general shapes to be made up of them. So consider an infinitesimal square with its sides parallel to the axes. What is the circulation around this square? Consider first the two vertical sides. The force $\mathbf{F} = (P, Q)$ will be almost the same along both of these sides, namely $Q$ evaluated there. As we walk around the square, one of these $Q$-forces goes with the circulation and the other against it, so their net effect is zero except for the fact that they are not quite equal: since the two sides are $dx$ apart their $Q$-values differ by $\frac{\partial Q}{\partial x}\,dx$. This, then, is the net force contributing to the circulation, and since it acts across a distance of $dy$ its net contribution is $\frac{\partial Q}{\partial x}\,dx\,dy$.

11.4.1. Continue this line of reasoning to show that the circulation around the infinitesimal square is

$$\left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy$$

Hint: The signs are easily understood by recalling that our convention is to round the square counter-clockwise and considering whether an increase in $P$ or $Q$ helps or hinders this motion.

We now wish to consider any region as an aggregate of infinitesimal squares, which will give us Green’s Theorem:

$$\oint P\,dx + Q\,dy = \iint \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy$$

The left hand side is just the usual work or circulation integral $\oint \mathbf{F} \cdot d\mathbf{s}$ written out in terms of the components of $\mathbf{F} = (P, Q)$. The right hand side is the sum of the circulations about infinitesimal squares.

11.4.2. Complete the proof of Green’s Theorem as follows.

(a) Suppose two infinitesimal squares are joined along one side. Argue that the circulation around the new region is the sum of the circulations of each constituent square considered separately. (The picture suggests the idea that the shared edge “cancels,” but make sure that your explanation makes sense in terms of the fluid flow interpretation of circulation.)
(b) We can approximate any region very closely by infinitesimal squares, but the boundary will be “jagged.” Prove that the flow remains the same if we cut across diagonally instead of following two edges of the boundary squares.

§ 11.4.2. Problems

11.4.3. Show that the area of a region in the plane is given by the integral $\oint x\,dy$ or $-\oint y\,dx$ taken around its boundary. Hint: Cut the boundary of the figure into infinitesimal pieces and draw the rectangles $x\,dy$ for each piece.

These results are often presented as a corollary of Green’s Theorem even though it is much more illuminating to understand them directly from first principles.
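Green’s Theorem lends itself to a direct numerical check: the circulation around the boundary should equal the sum of the circulations of the small squares inside. The following sketch assumes the unit square and an illustrative field $(P, Q) = (-y^2, xy)$, both chosen here for convenience rather than taken from the text.

```python
def P(x, y):
    return -y * y

def Q(x, y):
    return x * y

def circulation(n=1000):
    # Counter-clockwise around the unit square, midpoint rule on each edge.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += P(t, 0.0) * h   # bottom edge, moving in +x
        total += Q(1.0, t) * h   # right edge, moving in +y
        total -= P(t, 1.0) * h   # top edge, moving in -x
        total -= Q(0.0, t) * h   # left edge, moving in -y
    return total

def interior_sum(n=200, eps=1e-5):
    # Sum of (dQ/dx - dP/dy) dx dy over small squares filling the region.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            dQ_dx = (Q(x + eps, y) - Q(x - eps, y)) / (2 * eps)
            dP_dy = (P(x, y + eps) - P(x, y - eps)) / (2 * eps)
            total += (dQ_dx - dP_dy) * h * h
    return total

print(circulation(), interior_sum())
```

For this field both quantities come out as $3/2$. The sign pattern in `circulation` reflects the convention that the inside of the region is kept on the left.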
11.4.4. By averaging the two area expressions in the previous
problem we can also write the area as $\frac{1}{2}\oint x\,dy - y\,dx$.
This formula can also be interpreted in terms of determinants:
Show how it is obtained from a computation of the areas of
triangles with corner points (0, 0), (x, y), (x + dx, y + dy)
by determinant methods.

11.4.5. Find the area of the ellipse $x^2/a^2 + y^2/b^2 = 1$.

11.4.6. Show that the Divergence Theorem reduced to two
dimensions gives essentially Green’s Theorem (only with
trivial modifications in signs).

§ 11.5. Curl

The z-component of the curl vector is the circulation around
an infinitesimal square parallel to the xy-plane, just as in the
previous section, and the other components are the analogous
expressions for the other directions. The meaning of the curl
vector, therefore, is that curl F · n measures how much a wheel
with axis n would be made to rotate by the fluid. For example,
if n points in the z-direction we are asking for the rotation
of a wheel with this axis, which is just
$\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$,
or the circulation parallel to the xy-plane, just as before. If
n points in some oblique direction then the rotation of a wheel
with this axis will be a combination of the various
coordinate-axis-plane rotations taken in proportions depending
on what coordinate axis n agrees more with, and this is
precisely what the scalar product accomplishes.

Green’s Theorem about circulation extended to three dimensions,
where it is called Stokes’ Theorem, therefore becomes

$$\oint \mathbf{F} \cdot d\mathbf{s} = \iint \operatorname{curl} \mathbf{F} \cdot \mathbf{n}\, dS$$

The curl vector can also be interpreted directly, without
projecting it onto a specific direction vector, in a manner very
similar to how the direction and magnitude of ∇f were found
to have interesting intrinsic meaning once this vector had
been most naturally introduced in terms of its scalar products
(cf. §9.4).

11.5.1. Convince yourself that the direction of curl F is the axis
of maximal circulation (rather like the vortical axis in a
pitcher of lemonade being stirred) and that its magnitude
is the intensity of rotation.

[Figure: the vector curl F with the field F circulating around it]

§ 11.5.2. Problems

11.5.2. Prove computationally that curl ∇f = 0 and argue
physically that the curl of a field is zero if and only if the
field is conservative.

11.5.3. (a) Prove computationally that div curl F = 0.

§ 11.6. Electrostatics and magnetostatics

§ 11.6.1. Lecture worksheet

Space is permeated by two fields: the electric field E, which
describes how a positively charged (static) particle would move,
and the magnetic field B, which describes in which direction
the north pole of a compass needle points. All electromagnetic
phenomena can be characterised in terms of these fields.
In particular, all information transmitted through all forms of
wireless communication is encoded in these fields.

The complete theory of electromagnetism is contained in a few
simple laws, analogous to Newton’s law of classical mechanics.
These are the equations of Maxwell. Before stating these laws
in full generality we wish to study the electrostatic and
magnetostatic special cases. For electrostatics Maxwell’s
equations reduce to

$$\nabla \cdot \mathbf{E} = \rho/\varepsilon_0 \quad\text{and}\quad \nabla \times \mathbf{E} = 0$$

where ρ is electric charge and $\varepsilon_0$ is a constant. Thus
the first law is saying that electric charges produce divergence
(i.e., act as sources or sinks) in the electric field, as we have
already discussed above. Note that Coulomb’s inverse-square law
is automatically incorporated in this more elegant divergence
law (in the manner of problem 11.1.1). The second equation says
that the curl is always zero, as of course we would expect since
an electrostatic field is analogous to a gravitational one and
therefore conservative.

For magnetostatics Maxwell’s equations reduce to

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} \quad\text{and}\quad \nabla \cdot \mathbf{B} = 0$$
where J is the (constant) flow of electric current and µ0 is a
constant. So the “magnetic fluid” spins around and around, in a
direction perpendicular to the current:

The magnetic field is like a pitcher of lemonade that someone
is stirring, and the direction of the current is the axis of its
rotation. The divergence is zero: no magnetic fluid is generated
or destroyed. This absence of divergence corresponds to the
fact that there are no magnetic monopoles, i.e., magnets
always come in north-south pairs, never a piece of “north only,”
in contrast to the charged particles that generate divergence in
the electric field. (As Humphry Davy once wrote to a woman to
whom he was attracted: “You are my magnet, though you differ
from a magnet in having no repulsive points.”)

F = q(E + v × B)

So the electric field is pushing things on directly as we have
already seen, whereas the effect of the magnetic field is a bit
more subtle. When a charge is moving we can think of its velocity
vector as a wrench and the magnetic field as a force pushing
the wrench. The resulting torque is the force that the particle
experiences.

11.6.1. Explain how the following experimental facts are
accounted for by these equations.

(a) On the table in front of me I have a bar magnet
standing with its north pole pointing upwards. In
the air above the magnet runs a straight wire, fixed
in position. When I turn on the current in the wire,
the magnet tips over. (In which direction?)

(c) Two parallel wires with currents going the same way
are attracted towards each other.

11.6.2. Electric motors harness the above principles to convert
electric current to mechanical work.

(a) Explain how this is achieved by means of this
arrangement:

[Figure: motor arrangement with magnet poles labelled N and S]

∇·B = 0

In the previous section we studied the “static” cases where the
time-derivatives were zero. Of the two new terms, the one in
the second equation is the most interesting. This equation says
that if the magnetic field is moving then it causes a “stir” in
the electric fluid with the direction of motion being the axis of
stirring. If a conducting wire is placed so that its particles are
caught in the stirred-up vortex then a current is generated.

11.7.1. Using the setup of problem 11.6.1b, explain how a
current can be created in the wire by moving it back and
forth. Which way does the current go?
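The force law F = q(E + v × B) stated earlier in this section can be tried out numerically. The following is a minimal sketch in plain Python; the function names `cross` and `lorentz_force` and the sample field values are illustrative assumptions, not taken from the text.

```python
def cross(a, b):
    """Vector (cross) product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, E, v, B):
    """Force F = q(E + v x B) on a charge q moving with velocity v."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))


# Hypothetical sample values: no electric field, magnetic field along z,
# a positive unit charge moving along x.
E = (0.0, 0.0, 0.0)
B = (0.0, 0.0, 1.0)
v = (1.0, 0.0, 0.0)

F = lorentz_force(1.0, E, v, B)
# With E = 0 the force is purely magnetic, so it is perpendicular to v.
F_dot_v = sum(F[i] * v[i] for i in range(3))
print(F, F_dot_v)
```

With E = 0 the computed force has zero scalar product with both v and B, which is the "wrench" picture in numbers: the magnetic field never pushes along the direction of motion.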
11.7.2. Draw the field Ḃ and explain how the motor in problem
11.6.2 can be “run backwards” to generate electric current
from mechanical work. This is how power plants
generate electricity.

(b) Consider a second coiled wire placed next to the
first, with its axis aligned with it. Show that the
alternating current in the first induces, without a
direct connection, a current in a second wire.

This is used for example in electric toothbrush chargers,
where an exposed electrical connector would be hazardous.
Induction cooking is also based on the same principle, in
conjunction with the fact that certain metals produce a lot
of resistive heat when a current is induced in them.

11.7.4. Discuss the following passage from Maxwell’s paper On
Faraday’s lines of force (1855).

    The student [of electrical science] must make
    himself familiar with a considerable body of
    most intricate mathematics, the mere retention
    of which in the memory materially interferes
    with further progress. The first process
    therefore in the effectual study of the science,
    must be one of simplification and reduction of
    the results of previous investigation to a form
    in which the mind can grasp them. The results of
    this simplification may take the form of a purely
    mathematical formula or of a physical hypothesis.
    In the first case we entirely lose sight of the
    phenomena to be explained; and though we may
    trace out the consequences of given laws, we can
    never obtain more extended views of the
    connexions of the subject. If, on the other hand,
    we adopt a physical hypothesis, we see the
    phenomena only through a medium, and are liable
    to that blindness to facts and rashness in
    assumption which a partial explanation
    encourages. We must therefore discover some
    method of investigation which allows the mind at
    every step to lay hold of a clear physical
    conception, without being committed to any theory
    founded on the physical science from which that
    conception is borrowed, so that it is neither
    drawn aside from the subject in pursuit of
    analytical subtleties, nor carried beyond the
    truth by a favourite hypothesis.

§ 11.8. Reference summary

§ 11.8.1. Line integrals

where −γ is the curve γ traversed backwards.

Fundamental theorem for line integrals:

$$\int_a^b \nabla f \cdot d\mathbf{x} = f(b) - f(a)$$

where a and b are the start and end points of any curve.

§ 11.8.2. Vector field concepts

F is conservative
⇐⇒ F is a gradient field (F = ∇f)
⇐⇒ line integrals independent of path
⇐⇒ line integrals around closed paths = 0
⇐⇒ curl F = 0
⇐⇒ F irrotational

In this case −f is called the potential function. A special case is
the potential energy of a gravitational field.

div F = ∇ · F = generated flux
curl F = ∇ × F = axis of maximal circulation
curl F · n = circulation around axis n
direction of curl F = axis of maximal circulation
|curl F| = intensity of maximal circulation

§ 11.8.3. Properties of curves

closed: returns to where it started; final point = initial point
simple: does not intersect itself
positively oriented: interior on left as traversed
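The identities behind the equivalences in §11.8.2 (curl ∇f = 0, and div curl F = 0 from problem 11.5.3) can be spot-checked numerically before proving them. The sketch below uses central differences in plain Python; the step size `H`, the helper names, and the sample fields are all illustrative assumptions, and a numerical check at one point is of course not a proof.

```python
import math

H = 1e-3  # finite-difference step; an arbitrary illustrative choice


def partial(g, i, p):
    """Central-difference estimate of the i-th partial derivative of g at the point p."""
    hi, lo = list(p), list(p)
    hi[i] += H
    lo[i] -= H
    return (g(*hi) - g(*lo)) / (2 * H)


def grad(f):
    """Numerical gradient of a scalar function f(x, y, z)."""
    return lambda x, y, z: tuple(partial(f, i, (x, y, z)) for i in range(3))


def curl(F):
    """Numerical curl of a vector field F(x, y, z) -> (P, Q, R)."""
    def curl_F(x, y, z):
        # d(c, i) is the derivative of component c with respect to variable i
        d = lambda c, i: partial(lambda *q: F(*q)[c], i, (x, y, z))
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return curl_F


def div(F):
    """Numerical divergence of a vector field F(x, y, z) -> (P, Q, R)."""
    def div_F(x, y, z):
        return sum(partial(lambda *q, c=c: F(*q)[c], c, (x, y, z)) for c in range(3))
    return div_F


f = lambda x, y, z: x * x * y + y * math.sin(z)   # sample scalar field (hypothetical)
F = lambda x, y, z: (y * z, x * x, x * z + y)      # sample vector field (hypothetical)
rotation = lambda x, y, z: (-y, x, 0.0)            # rigid rotation about the z-axis

p = (0.3, -0.7, 1.1)
curl_grad = curl(grad(f))(*p)   # expected to vanish up to rounding
div_curl = div(curl(F))(*p)     # expected to vanish up to rounding
spin = curl(rotation)(*p)       # a field that is not conservative
print(curl_grad, div_curl, spin)
```

The rotation field is included for contrast: its curl points along the z-axis, the axis of the stirring, so it fails the "curl F = 0" test and cannot be a gradient field.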
§ 11.8.4. Vector calculus theorems

Divergence Theorem (Gauss’s Theorem):

$$\iiint \operatorname{div} \mathbf{F}\, dV = \iint (\mathbf{F} \cdot \mathbf{n})\, dS$$

Green’s Theorem:

$$\oint P\,dx + Q\,dy = \iint \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy$$

∇·B = 0

Effect of fields on moving particle: F = q(E + v × B)

Evaluate $\int_C xy\,dx + 2y\,dy$ on the curve $y = x^2$ from x = 0 to
x = 2.

$\int_C xy\,dx + 2y\,dy = \int_0^2 (x^3 + 4x^3)\,dx = \left[\tfrac{5x^4}{4}\right]_0^2 = 20.$

$\int_C (1 - y)\,dx + x\,dy$, where C is a curve consisting of three
parts: $C_1$, the arc of $y = 1 - x^2$ from (1, 0) to (0, 1); $C_2$,
the line segment from (0, 1) to (−1, 0); $C_3$, the line segment
from (−1, 0) to (1, 0).

Along $C_1$ we have $y = 1 - x^2 \Rightarrow dy = -2x\,dx$, so
$\int_{C_1} (1 - y)\,dx + x\,dy = \int_{C_1} x^2\,dx + x(-2x\,dx) = \int_1^0 -x^2\,dx = \left[-x^3/3\right]_1^0 = \tfrac{1}{3}$

$\int_{\gamma_a} yz\,e^{xyz}\,dx + zx\,e^{xyz}\,dy + xy\,e^{xyz}\,dz$ where $\gamma_a$ is the curve
x = cos t, y = sin t, z = t, 0 ≤ t ≤ a.
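The line integrals in these examples can be checked numerically by pulling the integrand back to the parameter and summing. The sketch below uses a midpoint rule in plain Python; `line_integral` and its arguments are hypothetical names chosen for illustration. For the helix example it relies on the observation (not stated in the text) that the integrand is the gradient of $e^{xyz}$, so the fundamental theorem for line integrals predicts the value $e^{a \sin a \cos a} - 1$; the choice a = 2 is arbitrary.

```python
import math


def line_integral(field, curve, dcurve, t0, t1, n=2000):
    """Midpoint-rule estimate of the line integral of field along curve(t), t0 <= t <= t1.

    field(point) returns the components (P, Q, ...) at that point;
    dcurve(t) returns the derivative of curve(t) with respect to t.
    """
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        F = field(curve(t))
        d = dcurve(t)
        total += sum(Fi * di for Fi, di in zip(F, d)) * h
    return total


# First example: xy dx + 2y dy along y = x^2 from x = 0 to x = 2.
parabola = line_integral(
    lambda p: (p[0] * p[1], 2 * p[1]),
    lambda t: (t, t * t),
    lambda t: (1.0, 2 * t),
    0.0, 2.0)


# Helix example with the arbitrary choice a = 2.
def helix_field(p):
    x, y, z = p
    e = math.exp(x * y * z)
    return (y * z * e, z * x * e, x * y * e)


a = 2.0
helix = line_integral(
    helix_field,
    lambda t: (math.cos(t), math.sin(t), t),
    lambda t: (-math.sin(t), math.cos(t), 1.0),
    0.0, a)
predicted = math.exp(a * math.sin(a) * math.cos(a)) - 1  # f(end) - f(start) for f = e^(xyz)
print(parabola, helix, predicted)
```

The same helper works for any parametrised curve, so the three pieces $C_1$, $C_2$, $C_3$ of the second example can be checked one at a time in the same way.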
$\int_\gamma \frac{2xy\,dx - x^2\,dy}{x^4 + y^2}$, where γ is the curve $y = x^2 + 2x - 4$ from
(1, −1) to (2, 4).

Since $\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y} = \frac{2x^5 - 2xy^2}{(x^4 + y^2)^2}$, we can change the path of
integration to e.g. the three line segments from (1, −1)
to (1, 0) ($\gamma_1$), (1, 0) to (2, 0) ($\gamma_2$), and (2, 0) to (2, 4) ($\gamma_3$).
Then $\int_\gamma \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_{\gamma_1} + \int_{\gamma_2} + \int_{\gamma_3} = I_1 + I_2 + I_3$. We
compute each integral separately.

$I_1 = \int_{\gamma_1} \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_{-1}^{0} \frac{-dy}{1 + y^2} = [-\arctan y]_{-1}^{0} = -\arctan 0 + \arctan(-1) = -\frac{\pi}{4}.$

$I_2 = \int_{\gamma_2} \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_1^2 0\,dx = 0.$

$I_3 = \int_{\gamma_3} \frac{2xy\,dx - x^2\,dy}{x^4 + y^2} = \int_0^4 \frac{-4\,dy}{16 + y^2} = \{y = 4t,\ dy = 4\,dt\} = \int_0^1 \frac{-16\,dt}{16 + 16t^2} = \int_0^1 \frac{-dt}{1 + t^2} = [-\arctan t]_0^1 = -\arctan 1 + \arctan(0) = -\frac{\pi}{4}.$

Altogether: $\int_\gamma = I_1 + I_2 + I_3 = -\frac{\pi}{4} + 0 - \frac{\pi}{4} = -\frac{\pi}{2}.$

$\iint_Y \mathbf{F} \cdot \mathbf{N}\,dS$ where $\mathbf{F} = (x \sin y,\ x + \cos y,\ z - 1)$ and Y is the
part of the ellipsoid $\{x^2 + 2y^2 + 4z^2 = 1\}$ with z ≥ 0 (normal
oriented upwards).

Complete Y with the “floor” $Y_0 = \{x^2 + 2y^2 \le 1,\ z = 0\}$
(normal pointing downwards). We can then apply the
Divergence Theorem to the resulting closed surface:
$\iint_Y \mathbf{F} \cdot \mathbf{N}\,dS + \iint_{Y_0} \mathbf{F} \cdot \mathbf{N}\,dS = \iiint_K \operatorname{div} \mathbf{F}\,dx\,dy\,dz.$
Two terms of this equation are readily evaluated, namely:

$I_0 = \iint_{Y_0} \mathbf{F} \cdot \mathbf{N}\,dS = \iint_{x^2 + 2y^2 \le 1} (x \sin y,\ x + \cos y,\ -1) \cdot (0, 0, -1)\,dx\,dy = \iint_{x^2 + 2y^2 \le 1} 1\,dx\,dy = \frac{\pi}{\sqrt{2}}$

and

$I_1 = \iiint_K \operatorname{div} \mathbf{F} = \iiint_K (\sin y - \sin y + 1)\,dx\,dy\,dz = \iiint_K 1\,dx\,dy\,dz = \frac{1}{2} \cdot \frac{4\pi}{3} \cdot 1 \cdot \frac{1}{\sqrt{2}} \cdot \frac{1}{2} = \frac{\sqrt{2}\,\pi}{6}$

(using the formula $V = \frac{4\pi}{3}abc$ for the volume of an ellipsoid
with semi-axes a, b, c). Hence the sought integral
is $I_1 - I_0 = -\frac{\sqrt{2}\,\pi}{3}$.

Find the flow of the field $\mathbf{F} = (-xy^2,\ x \sin z - y,\ zy^2)$ into the
region $K = \{(x, y, z) \in \mathbb{R}^3 : 1 \le x \le 3,\ 1 \le y \le 4,\ 2 \le z \le 4\}$.

With ∂K oriented with outward-pointing normal, the
flow into the region is
$-\iint_{\partial K} \mathbf{F} \cdot d\mathbf{S} = -\iiint_K \operatorname{div} \mathbf{F}\,dx\,dy\,dz = -\iiint_K -1\,dx\,dy\,dz = \mathrm{Volume}(K) = 12.$
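The path-independence argument in the first example on this page can be double-checked by integrating along the original parabola directly, without changing the path. The sketch below does this with Simpson's rule in plain Python; the function names and the number of subintervals are illustrative assumptions.

```python
import math


def integrand(x):
    """Pull-back of (2xy dx - x^2 dy)/(x^4 + y^2) to the parabola y = x^2 + 2x - 4."""
    y = x * x + 2 * x - 4
    dy_dx = 2 * x + 2
    return (2 * x * y - x * x * dy_dx) / (x ** 4 + y * y)


def simpson(g, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3


direct = simpson(integrand, 1.0, 2.0)   # along the parabola from (1, -1) to (2, 4)
via_segments = -math.pi / 2             # the value I1 + I2 + I3 found above
print(direct, via_segments)
```

The two numbers agree to many decimals, as they should: the denominator $x^4 + y^2$ vanishes only at the origin, and the region between the parabola and the three line segments stays away from it.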
12 FURTHER PROBLEMS

§ 12.1. Newton’s moon test

Prerequisites: §1.1.

The moon is kept in its orbit by the earth’s gravitational pull, or
so your high school textbook told you. How do you know that it
is really so? How do you know that the moon is not towed about
by a bunch of angels? This question doesn’t seem to arise in
today’s authoritarian classrooms, but Newton gave an excellent
answer if anyone is interested.

“That force by which the moon is held back in its orbit is that
very force which we usually call ‘gravity’,” says Newton
(Principia, Book III, Prop. IV). And his proof goes like this.
Consider the hypothetical scenario that “the moon be supposed to
be deprived of all motion and dropped, so as to descend towards
the earth.” If we knew how far the moon would fall in, say, one
second, then we could compare its fall to that of an ordinary
object such as an apple. Ignoring air resistance, the two should
fall equally far if dropped from the same height.

Of course we cannot actually drop the moon, but with the
power of infinitesimals we can deduce what would happen if
we did. Here is a picture of the moon’s orbit, with the earth in
the center:

[Figure: the moon’s orbit, with labelled points A, B, C, D, E and center S]

Suppose the moon moves from A to B along a circle with center
S in an infinitely small interval of time. If there were no gravity
the moon would have moved along the tangent to the circle to
some point E instead of to B (BE is parallel to ASD because the
time interval is infinitely small so gravity has no time to change
direction).

§ 12.2. The rainbow

Prerequisites: §2.1.

Rainbows are the result of light from the sun “bouncing”
through raindrops. In this problem we shall show that raindrops
tend to concentrate the rays of the sun, almost like a
magnifying glass, at one particular outgoing angle, namely
about 42°.

From this we can infer the shape and position of the rainbow
as follows. Imagine that you have the sun at your back and that
there is a wall of raindrops some distance ahead of you. Since
the sun is so far away its rays may be considered parallel. For
any given raindrop, picture the ray from the sun that hits it, and
picture the line of sight from your eye to the raindrop. Consider
the angle at which these lines meet.

12.2.1. Characterise the set of all raindrops for which this angle
is 42°. Hint: consider the ray from the sun that passes
through your head for reference purposes.

12.2.2. You never see rainbows at mid-day in mid-summer, no
matter how much it rains. Why not?

Now we must understand where the number 42° comes from.
At any given contact surface between air and water, some light
is reflected and some light is refracted, in the manner
described in problems 2.1.5 and 2.1.7 respectively. The light that
constitutes the rainbow is the light that refracted into the
raindrop, reflected at the back of it, and refracted back out again
through the front. In the plane containing the rays from the
sun, the midpoint of the drop, and the observer, this looks as
follows:

[Figure: path of a light ray through the raindrop, with angles α, β and f]
is about $c_1/c_2 = 4/3$.

(d) Express f purely in terms of α and find its derivative.

(e) Set the derivative equal to zero and solve for α.

(f) Explain why we should expect a concentration of
light at this angle.

You may be disappointed that we did all this work about
rainbows, yet said nothing about their colours. Indeed, rainbows
are arguably a light-concentration phenomenon more than a
colour phenomenon, scientifically speaking. We are often so
distracted by the beautiful colours that we fail to notice that
the rainbow is also significantly brighter than the surrounding
sky. Black-and-white photos can help us see this more clearly.
The colours of the rainbow result from the fact that light rays of
different colours have slightly different refraction angles.

§ 12.3. Addiction modelling

Prerequisites: §2.5.

McPhee (Formal Theories of Mass Behavior, 1963) offers the
following mathematical model of what he calls “the logic of
addiction.” Let C stand for consumption, scaled so that C = 1
corresponds to “normal maximum” (e.g., in the case of alcohol
consumption, a few glasses of wine or so). The change in
C depends on the stimulation s to consume, which is
proportional to the remaining room 1 − C to consume up to normal
satisfaction, and the resistance r, which is proportional to C.
Thus C′ = s(1 − C) − rC. But drinking also has an intrinsic effect
i, which increases stimulation and weakens resistance (“one
drink leads to another”, p. 187). So C′ = (s + i)(1 − C) − (r − i)C.

12.3.4. Explain his behaviour in terms of the differential
equation.

McPhee concludes: “If other models confirm that this is a quite
general consequence, then it means not to keep on looking
endlessly for behavioral ‘reasons’ for the alcoholic’s
loss-of-control phenomenon. Rather, we might have to face the awful
truth that what the alcoholic has been saying for years is the
truth: no ‘reason’ is really necessary.” (p. 213)

§ 12.4. Estimating n!

Prerequisites: §5.

The factorial function n! is intractable to compute and grasp
for large n since its definition involves compounded
multiplications that quickly grow beyond bounds. Therefore it is
useful to have a closed formula approximating n!. We can find
such a formula as follows.

12.4.1. (a) Decompose ln(n!) into a sum.

(b) Interpret this sum geometrically as a sum of the areas
of rectangles with base 1 along the x-axis.

(c) Estimate the area from below by an integral.

(d) Estimate the area of the pieces left out by the
integral approximation. Hint: align these pieces by
sliding them horizontally to the y-axis.

(e) Deduce an estimate for n!.

(f) Check the accuracy of this estimation for a few
large values of n.

A slightly better estimate may be obtained as follows. First we
note that n! can be expressed as an integral.

12.4.2. Show that
$$\int_0^\infty x^n e^{-x}\,dx = n!$$

We now need to find the t-values corresponding to half
maximum, $f(t) = \frac{1}{2} f_{\max}$. We cannot directly solve this
equation analytically. But we can approximate the
solutions as follows.

(c) Apply logarithms to both sides of this equation.

(d) Replace the left hand side by the first two non-zero
terms of its power series expansion about its
maximum.

(e) Find the desired t-values.

(f) Conclude the estimation of n!.

(g) Check the accuracy of this estimation for a few
large values of n.

§ 12.5. Wallis’s product for π

Prerequisites: §5.

Wallis’s product expression for π says:

$$\frac{\pi}{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7} \cdots$$

12.5.1. Argue that Wallis’s product follows from the expression in
problem 5.1.10a. Hint: plug in a specific value for x.

12.5.2. The following is an alternative proof of Wallis’s product
expression for π.

(a) Evaluate $\int_0^{\pi/2} (\sin x)^{2m}\,dx$.

(b) Evaluate $\int_0^{\pi/2} (\sin x)^{2m+1}\,dx$.

(c) Divide the two results to find an expression for $\frac{\pi}{2}$.

The expression contains the ratio of the two integrals.

(d) What needs to be the limit of this ratio as m → ∞ for
Wallis’s expression to follow?

To establish this limit we need to estimate the ratio from
above and from below.

(e) Find one of these estimates by considering what
multiplying by sin x does to the values of a function
on the interval (0, π/2).

(f) Find the other estimate by comparing
$\int_0^{\pi/2} (\sin x)^{2m+1}\,dx$ and $\int_0^{\pi/2} (\sin x)^{2m-1}\,dx$.

(g) Conclude the proof.

(h) ? † Can you see why some people prefer this proof to
that of problem 12.5.1?

Prerequisites: §5.

12.6.1. In problem 5.1.1 we argued that a first-degree polynomial
can be made to go through two points, a second-degree
polynomial to go through three points, and so on. Indeed,
Newton constructed such a polynomial, namely a
polynomial p(x) which takes the same values as a given
function y(x) at the x-values 0, b, 2b, 3b, . . .. Here is the
construction. First, our polynomial p(x) is supposed to have
the same value as the given function y(x) when x = 0.
Therefore we should start by setting p(x) = y(0). Next, we
want p(x) to take the same value as y(x) when x = b. This
is easily done by setting

$$p(x) = y(0) + \frac{x}{b}\bigl(y(b) - y(0)\bigr).$$

This polynomial obviously agrees with y(x) when x is 0 or
b. Now we need to add a quadratic term to make it agree
when x is 2b as well. We want the new term to contain the
factor (x)(x − b) because then it will vanish when x is 0 or
b, so our previous work will be preserved. If we set x = 2b
in the piece of p(x) that we have so far we get

p(2b) = y(0) + 2y(b) − 2y(0) = 2y(b) − y(0).

So we want the quadratic term to have the value y(2b) −
2y(b) + y(0) at x = 2b.

(a) Use this reasoning to write down a second-degree
polynomial p(x) that agrees with y(x) when x is 0, b
or 2b. (Keep the factor (x)(x − b) as it is, i.e., do not
reduce the expression to the form $p(x) = A + Bx + Cx^2$.)

In the same manner we could add a cubic term to make
p(x) agree with y(x) at x = 3b, and so on.

The formula becomes more transparent if we introduce
the notation ∆y(x) for the “forward difference” y(x + b) −
y(x), and ∆²y(x) for the forward difference of forward
differences ∆y(x + b) − ∆y(x), etc., so that

∆y(0) = y(b) − y(0)
∆²y(0) = ∆y(b) − ∆y(0) = y(2b) − 2y(b) + y(0)
∆³y(0) = ∆²y(b) − ∆²y(0) = y(3b) − 3y(2b) + 3y(b) − y(0)
⋮

(b) Rewrite your formula for p(x) using this notation,
and then extend it to the third power and beyond “at
pleasure by observing the analogy of the series,” as
Newton puts it.

(c) Show that Taylor’s series

$$y(x) = y(0) + y'(0)\,x + \frac{y''(0)}{2!}\,x^2 + \frac{y'''(0)}{3!}\,x^3 + \cdots$$

is the limiting case of Newton’s forward-difference
formula as b goes to 0.
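Newton's construction lends itself to a short numerical experiment. The sketch below builds the table of forward differences ∆ⁿy(0) and evaluates an interpolating polynomial from it. It assumes the standard general term $\Delta^n y(0)\, x(x-b)\cdots(x-(n-1)b) / (n!\, b^n)$ of the forward-difference formula, which is not quoted from the text (deriving it is the point of parts (a) and (b)); the function names are hypothetical.

```python
import math


def forward_differences(y, b, order):
    """Return [y(0), Δy(0), Δ²y(0), ...] up to the given order, with step b."""
    row = [y(k * b) for k in range(order + 1)]
    diffs = [row[0]]
    for _ in range(order):
        # each pass replaces the row by its differences of neighbours
        row = [row[k + 1] - row[k] for k in range(len(row) - 1)]
        diffs.append(row[0])
    return diffs


def newton_polynomial(y, b, order):
    """Polynomial p agreeing with y at x = 0, b, 2b, ..., order*b (assumed standard form)."""
    diffs = forward_differences(y, b, order)

    def p(x):
        total, term = 0.0, 1.0
        for n, d in enumerate(diffs):
            # here term = x (x - b) ... (x - (n-1) b) / (n! b^n)
            total += d * term
            term *= (x - n * b) / ((n + 1) * b)
        return total

    return p


b = 0.1
p = newton_polynomial(math.sin, b, 3)
errors_at_nodes = [abs(p(k * b) - math.sin(k * b)) for k in range(4)]
print(errors_at_nodes)
print(forward_differences(lambda x: x * x, 1.0, 2))
```

Shrinking `b` while keeping the order fixed makes the ratios ∆ⁿy(0)/bⁿ settle towards the derivatives of y at 0, which is the numerical face of part (c).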
This limiting argument is indeed how Taylor himself proved his
theorem in 1715. The nowadays more common method of finding
the series by repeated differentiation (as in problem 5.1.2)
was used by Maclaurin in 1742.

problem must be “convex,” meaning that it has no cavities, so
to speak. Africa is not convex and its “convex hull,” the least
convex figure containing it, is obtained by snapping a rubber
band around it and taking that as the new figure:
So now we have two expressions for the weighted area
that we can combine.
(c) Deduce from this the isoperimetric inequality.
The force it causes to act on the point is proportional to the
area divided by the distance (in the manner of problem 11.1.3).
Then the angle θ that the force vector makes with the normal
determines what part of the force acts in the direction of the
normal: cos θ.
Suppose we have found the optimal shape of the area. Walk
straight ahead to the last infinitesimal square we included in
that direction, and do the same thing for an angle θ.
13 FURTHER TOPICS
§ 13.1.2. Problems
§ 13.2. Evolutes and involutes
[Figure: the evolute OE and the involute OI]

We say that OE is the evolute of the involute OI.

13.2.1. Argue that the length IE can be characterised in terms of
curvature (§13.1).

This gives us a way of determining the evolute given the
involute. Suppose the involute OI is given parametrically as
(x(t), y(t)).

13.2.2. Find expressions for the coordinates (X, Y) of the point E
in terms of x, y, ẋ, ẏ, Å, R.

To make concrete use of this we need parametric expressions
for curvature.

13.2.3. Express the radius of curvature R in terms of ẋ, ẍ, ẏ, ÿ in
one of the following two ways.

(a) From geometrical first principles, as in §13.1.

(b) By expressing y′ and y″ parametrically and
substituting into the formula found in §13.1. Hint: for y″, use
chain and quotient rules.

13.2.4. Show that the evolute of a cycloid is another cycloid
congruent to the first.

13.2.5. Argue that if you wish to make a pendulum bob follow a
cycloidal path you should have it swing between cycloidal
“cheeks.” Problem 7.4.5 shows why this is important.

(a) Argue that, in the above definition of evolute and
involute, the arc OE = the line segment EI.

(b) Use this to find the length of a cycloidal arc.

§ 13.3. Fourier series

Fourier series can be seen as a generalisation of the idea of the
scalar product. Consider on the one hand the ordinary vectors
v1 = (1, 1) and v2 = (1, −1), and on the other hand the functions
w1 = sin(x), w2 = sin(2x), w3 = sin(3x), . . ., which we shall think
of as a kind of “vectors” as well. The vectors v1 and v2 form a
basis for R², as we say, i.e., any vector in the plane can be
written in the form av1 + bv2 for some real numbers a and b. To
express some vector u in this form we go through the following
steps:

• “Normalise” the vectors v1 and v2 by dividing each vector
  by its length. Call the resulting unit vectors v1* and v2*.

• Check that v1* and v2* are orthogonal (i.e., perpendicular)
  using scalar products.

• “Project” u onto v1* and v2* using scalar products.

• The fact that v1* and v2* are orthogonal ensures that u is
  decomposed into independent components, so adding
  up the two projections gives u = av1* + bv2*, as sought.

13.3.1. Carry out these steps for the vector u = (4, 1). Include a
picture showing v1, v2, v1*, v2*, u, av1* and bv2*.

We are now going to generalise this idea to “spaces” where
vectors are functions. To do so we essentially define the scalar
product of two functions f(x) and g(x) as $\int f(x)g(x)\,dx$ by
analogy with the usual scalar product $\sum f_i g_i$. More precisely,
when dealing with sine functions we are going to focus on
only one of their periods, so we define the scalar product to
be $\int_{-\pi}^{\pi} f(x)g(x)\,dx$.

13.3.2. Now let us try to carry out the same steps as above in
this new setting.

In other words, you have approximated the function x by
a sum of sine functions with various coefficients.
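The normalise-and-project recipe of this section, applied to functions, can be sketched in a few lines of Python. The code below approximates the scalar product $\int_{-\pi}^{\pi} f(x)g(x)\,dx$ with a midpoint rule and projects the function x onto the first six normalised sine functions. The helper names and the number of sample points are illustrative assumptions, and the length √π of sin(mx) under this scalar product is used without derivation.

```python
import math


def inner(f, g, n=20000):
    """Midpoint-rule estimate of the scalar product: integral of f(x) g(x) over (-pi, pi)."""
    h = 2 * math.pi / n
    return sum(f(-math.pi + (k + 0.5) * h) * g(-math.pi + (k + 0.5) * h)
               for k in range(n)) * h


def normalised_sine(m):
    """The unit 'vector' w_m* = sin(m x) divided by its length sqrt(pi)."""
    return lambda x: math.sin(m * x) / math.sqrt(math.pi)


def sine_approximation(u, terms):
    """Sum of the projections of u onto w_1*, ..., w_terms*, plus the coefficients a_m."""
    basis = [normalised_sine(m) for m in range(1, terms + 1)]
    coeffs = [inner(u, w) for w in basis]   # a_m = scalar product of u with w_m*
    return (lambda x: sum(a * w(x) for a, w in zip(coeffs, basis))), coeffs


approx, coeffs = sine_approximation(lambda x: x, 6)
orthogonality = inner(normalised_sine(1), normalised_sine(2))  # should be close to 0
print([round(a, 4) for a in coeffs])
print(approx(1.0), orthogonality)
```

The coefficients alternate in sign and shrink like 1/m, and the six-term sum is already within a few percent of x at x = 1, though it necessarily drops to 0 at the endpoints ±π where every sine vanishes.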
(d) Plot the function a1w1* + a2w2* + a3w3* + a4w4* + a5w5* +
a6w6* and note that it somewhat approximates x on
the interval (−π, π).

(e) Use the same method to obtain an approximation for
u2 = x/|x|. Plot x/|x| and its approximating function
in the same graph.

(f) ? Why can we not obtain power series approximations
in a similar way (i.e., by “projecting” a function
onto x, x², x³, . . . with the integral scalar product to
find its Taylor coefficients)? Explain in terms of the
“vectors” x, x², x³, . . ., and give an analogy with
vectors in R² that illustrates the problem.

This method of approximating functions by trigonometric
series has an interesting physical meaning in terms of sounds.
Functions of the form sin(nx) are “pure notes”: they describe
the vibrations of tuning forks. The fact that any function
can be approximated by such trigonometric functions thus
corresponds to the fact that any sound (say, for example,
Beethoven’s ninth symphony, including the chorus!) can be
produced by nothing but tuning forks. The numbers a_n are
telling us how hard to strike each tuning fork. There is also a
converse physical meaning: a tuning fork will start vibrating
spontaneously whenever its tone is being played (“sympathetic
resonance”). The human ear is based on this principle. It
consists of many hairs that are in effect tuning forks, each
sensitive to a particular frequency. When a sound arrives which
includes this frequency as a component, the hair will vibrate
with a strength a_n determined by the strength of that tone in
the sound heard. Thus the information sent to the brain is the
coefficient a_n, so you have been computing scalar products all
your life, as it were, whether you were aware of it or not.

§ 13.4. Hypercomplex numbers

§ 13.4.1. Lecture worksheet

Generalising complex numbers into higher dimensions is
problematic. Three-dimensional numbers are in a sense
impossible, as the following argument shows.

13.4.1. Consider the set of all points, seen as “hypercomplex
numbers,” in three-dimensional space with the usual
(vector) notions of magnitude and addition. Suppose
there is some way of multiplying such numbers which
satisfies the usual laws of algebra. We shall now show that
these assumptions are contradictory.

For the usual laws of algebra to hold there must be a
multiplicative identity, call it 1, and since we are in three
dimensions there must be two other numbers of unit
length, i and j, such that these three numbers are all
mutually perpendicular.

(a) Show that |1 − i²| = 2.

(b) Therefore what is i²? And j²? Hint: What is its
distance to 0? To 1?

(c) Show that if |u| = 1 then the angle between v and w is
the same as that between uv and uw. Hint: first show
that the mapping z ↦ uz is distance-preserving.

(d) Infer from this that ij is perpendicular to 1.

(e) Prove that (ij)² must be both 1 and −1, a
contradiction.

(f) Note that the contradiction is avoided if we sacrifice
commutativity and make ij = −ji.

So if we continue this line of thought and try to salvage what
we can from the wreckage, then we must try to figure out what
number this mysterious ij can be. So far we know only that it is
perpendicular to 1 and equal to −ji.

13.4.2. (a) Show that it is also perpendicular to i and j.

(b) Conclude that ij must go off in a “fourth dimension.”

So let us write ij = k and consider the set of all
four-dimensional numbers a1 + bi + cj + dk. The set of all such
numbers may be called quaternions (owing to their fourness) and
denoted H (after their discoverer, Hamilton, since Q is already
taken).

13.4.3. (a) Show that H is spatially closed, i.e., that any product
involving 1, i, j, k can be expressed as a linear
combination of 1, i, j, k without the need for any fifth
dimension.

(b) Show that the “multiplication table”

i² = j² = k² = ijk = −1

contains all the information needed to reduce any
product of numbers in H down to standard form.

Now that the arithmetic of H is well defined it would be
straightforward to go back and check that it satisfies all the
various properties we desired of hypercomplex numbers except
for the commutativity of multiplication. So by “following our
nose” from our failure with three-dimensional numbers we
arrived at the next best thing.

§ 13.4.2. Problems

13.4.4. Quaternions once rivalled vectors, and as the following
problem shows they are in some ways almost equivalent.

(a) Show that if you take the quaternion product (ai +
bj + ck)(di + ej + fk) and discard all its real terms then
the result is the same as that of the corresponding
vector product (ai + bj + ck) × (di + ej + fk). Hint:
brute-force calculation is not the only way of seeing
this.

(b) Show that the scalar part discarded above is the
negative of the corresponding scalar product.

13.4.5. The following argument gives a simpler and independent
proof that there can in any case not be any very simple
formula for three-dimensional multiplication.
Surely any multiplication worthy of its name should at the
very least satisfy the multiplicative property of the
magnitude: |z1 z2| = |z1||z2|. Written out in terms of the
components of the numbers this says

$$\sqrt{a^2 + b^2 + c^2}\,\sqrt{d^2 + e^2 + f^2} = \sqrt{\alpha^2 + \beta^2 + \gamma^2}$$

or by squaring

$$(a^2 + b^2 + c^2)(d^2 + e^2 + f^2) = \alpha^2 + \beta^2 + \gamma^2$$

where α, β, γ are some functions of a, b, c, d, e, f.

(a) In the case of ordinary complex numbers (i.e., c = f =
γ = 0), what are α and β?

These functions produce integer output for integer input.
Assume that this holds for α, β, γ in three dimensions as
well.

(b) Show that then 15 must be a sum of three squares (of
integers).

(c) Show that it is not.

§ 13.5. Calculus of variations

§ 13.5.1. Lecture worksheet

Consider the problem of finding a function y(t) that extremises
an integral that depends on it,

$$\int f(y, \dot{y}, t)\,dt.$$

y(t) is an extremum if wiggling it a little causes no change in
the value of the integral, just as, in ordinary calculus, x is an
extremum of f(x) if wiggling it a little causes no change in f. For
this purpose, split the t-axis into infinitesimal segments dt/2,
and assume that y(t) is linear on these intervals.

Then y(t) and ẏ(t) are determined by the value of y(t) at the
break points, let’s call them y1, y2, y3, . . .. Let the point y_k vary,
so that we increase y_k by dy_k.

What happens to the integral?

13.5.1. First there is the direct change caused by the change in
y_k. This change causes the integrand to increase by

$$\left( \frac{\partial f}{\partial y_k} \right) dy_k$$

on this interval dt, and thus the integral by

(a)

Then there are the changes caused by the changes in the
derivatives on the sides of y_k, call them ẏ_k and ẏ_{k+1}.
When y_k changes by dy_k, ẏ_k changes by dy_k/dt, so it
causes the change in the integrand of

(b)

which translates into a change in the integral of

(c)

since this change only applies for half the interval dt.
Similarly, ẏ_{k+1} changes by −dy_k/dt and causes the change

(d)

So the equation for the change being zero altogether is

(e)

This is the equation when a single value y_k is altered. In
general y may be altered in any manner, meaning that any
number of y_i’s may be altered. To obtain the criterion for y being
stationary in this general case we must therefore sum the
previous equation over all k. In doing so one finds that each term
$\frac{\partial f}{\partial \dot{y}_i}$ is counted twice, which cancels the 2 in the denominator
and leaves simply

$$\frac{\partial f}{\partial y} - \frac{d}{dt}\left( \frac{\partial f}{\partial \dot{y}} \right) = 0.$$

This is the Euler–Lagrange equation of the calculus of
variations.

§ 13.5.2. Problems

13.5.2. In problem 6.3.2 we found an integral expression for the
time taken to slide down a ramp as a function of its shape
y(x).

(a) Use the Euler–Lagrange equation to find a
differential equation for the path of quickest descent.

(b) As in problem 12.7.2, check that a cycloid solves this
differential equation.

13.5.3. If we can write Newton’s equation F = ma in the form
of the Euler–Lagrange equation we can infer an
integral-variational formulation of the basic equation of motion.
Indeed, this is easily done: just let f = T + U, where
T = mv²/2 is the kinetic energy and U is the potential
function. In other words, U is a function whose
derivatives give the forces, as in U = −mgy for gravity, giving
constant gravitational acceleration $m\ddot{y} = \frac{\partial U}{\partial y} = -mg$.

(a) Show that in this case the Euler–Lagrange equation
reduces to F = ma.

(b) In other words, to find the trajectory of a particle we
used to solve the differential equation F = ma, but
now we can instead determine it by the equivalent
problem: ________. This is the so-called principle of
least action.

(c) Instead of $\int T + U\,dt$, Euler uses the integral $\int mv\,ds$.
Argue that this is equivalent.
(d) ? Discuss Euler’s interpretation of this result: “Be-
cause of their inertia, bodies are reluctant to move,
and obey applied forces as though unwillingly;
hence, external forces generate the smallest possible
motion consistent with the endpoints.”
A PRECALCULUS REVIEW

§ A.1. Coordinates

§ A.1.1. Lecture worksheet

The position of a point is characterised analytically by its coordinates (x, y), meaning its horizontal and vertical distance from some designated origin point (0, 0):

[Figure: a point (x, y) with its coordinates x and y marked.]

By means of this device geometry is turned into algebra, so to speak. For example, a line can be described as all the points (x, y) that satisfy the relation y = mx + b for some fixed numbers m and b.

A.1.1. Explain why. Hint: a line is characterised by the property that any horizontal step always corresponds to the same number of vertical steps.

A.1.2. What kind of figure is $y = \sqrt{1 - x^2}$? Hint: consider x and y as legs of a right triangle.

A.1.3. What kind of figure is xy = 0?

A.1.4. ? Rotate the figure in the previous problem by 45°. What is its equation now? Hint: What is a formula for combining x and y that will give you zero if you're at (5, 0) or (0, 5)? What is a formula for combining x and y that will give you zero if you're at (5, 5)? At (5, −5)?

§ A.1.2. Problems

A.1.5. (a) The enemy has one cannon movable along a shoreline $y = -\frac{1}{4}$ and one cannon located at a fortress off the shore at the point $\left(0, \frac{1}{4}\right)$. You must sail between

We can picture a function as a kind of “machine” where you stick some input in one end, turn some cranks, and receive a processed version of the input out at the other end. Often this takes the form of a formula with x's in it, into which one can “plug” whatever value for x to find the associated output value y(x).

For instance, f(x) = 2x − 1 is a function that doubles the input and subtracts 1. So f(3) = 5, for example. It is often useful to put this kind of information in a table for overview:

x      1   2   3   4   5   6   7   8
f(x)   1   3   5   7   9  11  13  15

This table can help us see for example a second way of characterising f(x) verbally, namely as an “odd-number machine,” so to speak:

A.2.1. When x is a positive integer, ______ is the ______ odd number.

A.2.2. What is the 127th odd number?

The notion and notation of a function f(x) is powerful in its flexibility and scope. For one thing, functions can be composed, meaning that the output of one function is the input of another, which is often a useful way of describing composite operations.

A.2.3. If d(x) is the number of dollars you get for x euros, and h(x) is the number of hot dogs you can buy for x dollars, what is h(d(x))?

A.2.4. If f(x) = 2x and g(x) = x/2, explain both in formulas and purely verbally why f(g(x)) = x and g(f(x)) = x.

Visually speaking, functions are curves. The value of the function for a particular x-value corresponds to the height of the function above the x-axis at that point:

[Figure: graph of f(x).]
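As a small illustration of the “machine” picture of a function, the table of values for f(x) = 2x − 1 can be regenerated in a few lines of Python. This is only a sketch: the helper `compose` and the second machine `g` are illustrative assumptions and do not come from the text.

```python
def f(x):
    """The machine from the text: double the input and subtract 1."""
    return 2 * x - 1


def compose(outer, inner):
    """Return the function x -> outer(inner(x)) (hypothetical helper)."""
    return lambda x: outer(inner(x))


# Reproduce the table of values for x = 1, ..., 8.
table = [(x, f(x)) for x in range(1, 9)]
for x, fx in table:
    print(x, fx)

# Composition: the output of one machine is the input of another.
g = lambda x: x + 10           # an arbitrary second machine, for illustration
f_after_g = compose(f, g)      # x -> f(g(x)) = 2(x + 10) - 1
```

Writing composition as a function that returns a function mirrors the notation h(d(x)): the inner machine runs first and the outer machine receives its output.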
A function is called “even” if f(−x) = f(x), and “odd” if f(−x) = −f(x). Most functions are neither one nor the other, but for those that are it is often useful to be aware of these simple rules for how minuses behave, just as in ordinary arithmetic you wouldn't start all over again to compute 53 × (−74) if you had just computed 53 × 74, or keep minding the sign at each step when evaluating $(-2)^5$.

A.2.9. Is $f(x) = x^n$ an even or odd function?

A.2.10. What is the visual meaning of a function being even or odd?

A.2.11. Illustrate with the graphs of $x$, $x^2$, $x^3$, $x^4$.

A.2.12. Make up some other even and odd functions by drawing graphs with two hands using two pens, both starting at the origin, one going right and one going left.

$f^{-1}(x)$ means the inverse function of f(x). It's “f backwards,” or the function that “undoes” f. In symbols this means $f(f^{-1}(x)) = x$ and $f^{-1}(f(x)) = x$: if you do one and then the other you get back what you started with. Thus, for example, if f(x) = 2x then $f^{-1}(x) = x/2$. Another way of putting it is that, for $f^{-1}$, the output of f becomes the input and the input becomes the output.

Visually, the graph of $f^{-1}(x)$ is the graph of f(x) mirrored in the line y = x, since this transformation interchanges the roles of x and y, i.e., input and output.

[Figure: graphs of f(x) and $f^{-1}(x)$, mirrored in the line y = x.]

§ A.3. Trigonometric functions

§ A.3.1. Lecture worksheet

Trigonometric functions are so called for their applications to triangles (§A.7.8), but in the context of the calculus they take on a much broader significance. In fact, they are the language in which all kinds of periodic phenomena are best described. Nature exhibits obvious periodicity in phenomena such as day and night, summer and winter, and the ebb and flow of the sea. But more important examples still are the many kinds of invisible waves that constitute a big part of our lives, including light waves, sound waves, and virtually all man-made forms of wireless communication.

The periodic nature of the trigonometric functions is obvious from their unit-circle definition, which generalises their definition in terms of triangles:

[Figure: left, a triangle with angle θ and sides labelled 1, sin θ, cos θ, tan θ; right, a point on the unit circle at angle θ.]

On the left we see, in effect, the geometrical definitions of the trigonometric functions embedded in a coordinate system. On the right we see a further step of abstraction where the cosine and sine are simply defined as the x and y coordinate respectively of a point moving along a unit circle. In this way the functions are liberated from their trigonometric origins, as it were. In particular, in this way we can define their values for any angle, including angles that are too great to ever occur in a triangle. Defined in this manner, the sine and cosine become beautiful periodic functions that look like this:

[Figure: graphs of the sine and cosine; horizontal axis marked at −π, −π/2, π/2, π, 3π/2, 2π, vertical axis at 1 and −1.]
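The unit-circle definition can be checked numerically. The following sketch (the function name `point_on_unit_circle` and the sample value of θ are assumptions made for illustration) confirms that one extra full revolution of 2π brings the point back to the same coordinates, which is the periodicity described above.

```python
import math


def point_on_unit_circle(theta):
    """Return (cos θ, sin θ): the point reached after travelling an arc
    of length θ along the unit circle, starting from (1, 0)."""
    return (math.cos(theta), math.sin(theta))


theta = 0.7  # an arbitrary arc length in radians
x1, y1 = point_on_unit_circle(theta)
x2, y2 = point_on_unit_circle(theta + 2 * math.pi)  # one extra full turn

# After a full revolution we are back at the same point (up to rounding).
same_point = (math.isclose(x1, x2, abs_tol=1e-12)
              and math.isclose(y1, y2, abs_tol=1e-12))

# The point always lies on the unit circle: x^2 + y^2 = 1.
on_circle = math.isclose(x1 ** 2 + y1 ** 2, 1.0)
```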
A.3.3. What are some other features of the graphs that you can confirm using the unit circle definition?

[Figure: unit circle of radius 1 in which an angle of θ radians corresponds to an arc of length θ.]

In calculus we always use radians rather than degrees. The precise reason for this is seen in problem 1.4.1, but we can already appreciate that radians is the superior angle measure. After all, the notion that a full revolution corresponds to 360° is an arbitrary social construction. Basing a theory on such arbitrary starting points leads to arbitrary repercussions later, as is hardly surprising. Radian angle measure, by contrast, doesn't introduce any artificial conventions, but rather characterises angles by means intrinsic to geometry itself.

What are the inverses of the trigonometric functions? In the manner of §A.2 we can define them abstractly and denote them $\sin^{-1}(x)$ etc., as is sometimes done. However, it is also rewarding to think about their meaning more concretely. By definition $\sin^{-1}(x)$ inverts sin(x). What does this mean in terms of the geometrical definition of sin(x)? The sine takes an angle (or rather, as we have now learned to say, arc) as its input and gives a corresponding coordinate as its output. The inverse sine, $\sin^{-1}(x)$, does the opposite: it takes the coordinate as its input and tells you what the corresponding arc is. For this reason $\sin^{-1}(x)$ is also denoted arcsin(x).

Here, then, are the geometrical meanings of the inverse trigonometric functions:

[Figure: arcsin y, arccos x and arctan t shown as arcs.]

(b) arccos(−1)

(c) arctan(1)

(d) arctan(∞)

A.3.5. Argue that degree and radian angle measures can be interpreted as “observer's viewpoint” and “mover's viewpoint” respectively.

§ A.4. Logarithms

The essence of logarithms is that they turn multiplication into addition:

log(ab) = log(a) + log(b)    (L1)

This simplifies calculations because if you have to compute by hand it is much easier to add than to multiply. In this way logarithms “doubled the lifetime of the astronomer,” it was said at the time. Not so long ago, before the advent of pocket calculators, people still learned logarithms for this purpose in school. You can still see the traces of this today when you go to a used bookstore and look at the mathematics section: usually you will find there tables of logarithms published in the first half of the 20th century.

We can rediscover logarithms for ourselves in the following way. Consider a table of powers of some integer, such as 2:

n      1   2   3   4   5   6   ···
2^n    2   4   8  16  32  64   ···

A.4.1. Explain how for example 4 × 8 can be found using this table without actually performing any multiplication.

That's a neat trick, but it only works for numbers that happen to occur in the bottom row. We need to be able to multiply any numbers. Fortunately it is not hard to extend the idea to produce a table without such big gaps.

A.4.2. Explain how.

Thus, to produce a table of a function that has the property (L1), all we have to do, it turns out, is to make a table of values for some exponential function $f(x) = a^x$ and then read it backwards. Logarithms are simply the inverse of exponentiation.

In our table we used the base 2, but any number would have

A.4.3. (L1) is the defining property of logarithms, and the mother of all logarithm laws. Show how:

(a) The logarithm of 1 follows from (L1) by restricting one of its values to an identity element.

(b) The logarithm of an exponential expression follows from (L1) by regarding multiplication as repeated addition.
(c) The logarithm of a quotient follows from (L1) by considering that / cancels × and − cancels +.

§ A.5. Exponential functions

The essence of exponential functions, such as $f(x) = 2^x$, is that they describe things that grow in proportion to their size. The more you have the faster it grows.

A.5.1. Argue that rabbit populations and money both have this property.

A.5.2. Verify that $f(x) = 2^x$ has this property by considering how f(x + 1) is related to f(x).

A.5.3. I put $1000 in a savings account earning 10% interest annually.

(a) How much money do I have after 1 year? After 2 years? After x years?

I want to know: How long will it take for my money to double?

(b) Write down an equation involving the required time T.

There are two ways of tackling this equation. Nowadays we can simply:

(c) Have a computer or calculator solve the equation.

However, we can also make some progress by hand:

(d) Use logarithms to solve for T, i.e., write the equation as T = ....

Did we accomplish anything this way? You may say no, because we still need a calculator to find the numerical value of T, and we could already do that without logarithms anyway.

(e) ? Explain why this method used to make more sense when there were no calculators.

Nevertheless solving for T in this way does serve a purpose in other contexts, where solving for T may be merely a substep in a more complex investigation.

Exponential functions also describe things that shrink in proportion to their size, of which radioactive decay is an important example.

A.5.4. Radiocarbon dating. Carbon is an essential atom in plants and animals. Plants absorb it through carbon dioxide in the atmosphere and animals absorb it through their food. A small portion of this carbon is in the form of the isotope $^{14}$C (“carbon-14”). This isotope is radioactive, meaning that it is in an unstable state and will eventually revert into another form without external influence. This unnatural state is created by cosmic radiation, which converts nitrogen into $^{14}$C. Left to its own devices, the $^{14}$C will eventually decay back to nitrogen. However, this may not happen until many years later. $^{14}$C decays in proportion to the amount present at such a rate that in 5730 years only half of the isotopes originally present have decayed back into nitrogen.

Living organisms continually replace their carbon, so their $^{14}$C levels are kept at a constant level. However, when a plant or animal dies, it stops replacing its carbon and the amount of $^{14}$C begins to decrease through radioactive decay. Thus any dead organic material can be dated on the basis of its $^{14}$C content. For example, in antiquity precious treatises were written on parchment (dried animal skin).

(b) A parchment manuscript has 80% as much $^{14}$C as living material. Find an equation expressing the age of the parchment.

(c) Find the age of the parchment.

§ A.6. Complex numbers

§ A.6.1. Lecture worksheet

“Complex numbers” are an expanded universe of numbers in which any polynomial equation has as many roots as its degree. This is achieved by generalising ordinary real numbers into one more dimension. So in this universe numbers are no longer confined to one line or axis, but rather live in a whole plane. These numbers are points in the plane, or, if you prefer, they are arrows pointing from the origin to that point:

Now we ask ourselves: how does one add and multiply such numbers? The goal of generalisation is to retain old things but to think of them in new ways that give them wider applicability. Indeed, the following is a strange way of looking at ordinary multiplication, which, however, has the advantage that it generalises readily to two-dimensional numbers.

A.6.1. Let |z| denote the magnitude, or distance to the origin, of a number z, and let arg(z), the “argument” of the number, denote the angle it makes with the positive x-axis (measured in radians). Then:
z            |z|      arg(z)
2
−3
2 · (−3)

Generalising from this to two-dimensional complex numbers, we get the following rules. We add by “concatenation”:

[Figure: addition of complex numbers, showing z, w and z + w.]

And we multiply by multiplying the lengths and adding the angles:

[Figure: multiplication of complex numbers, showing z, w and zw with |zw| = |z| · |w|.]

|w| = |z|
w = −z
wz = 1

(b) Does any complex number always have precisely two distinct square roots?

On the other hand complex numbers can also be usefully characterised in terms of their length r and the angle θ that they make with the x-axis. In this case we write them in the “polar form” $re^{i\theta}$.

[Figure: a complex number z with r = |z| and θ = arg(z) = argument of z.]
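The multiplication rule (multiply the lengths, add the angles) and the polar form can be checked with Python's standard `cmath` module. This is a sketch under stated assumptions: the two sample numbers are arbitrary choices, not values from the text.

```python
import cmath
import math

z = complex(1, 2)     # arbitrary example numbers
w = complex(-3, 1)

product = z * w

# Lengths multiply ...
length_rule = math.isclose(abs(product), abs(z) * abs(w))

# ... and angles add (compared modulo a full turn of 2*pi).
angle_sum = cmath.phase(z) + cmath.phase(w)
angle_diff = (cmath.phase(product) - angle_sum) % (2 * math.pi)
angle_rule = (math.isclose(angle_diff, 0.0, abs_tol=1e-12)
              or math.isclose(angle_diff, 2 * math.pi, abs_tol=1e-12))

# The polar form r e^{i theta} rebuilds the number from length and angle.
r, theta = cmath.polar(z)
rebuilt = cmath.rect(r, theta)
```

The comparison is made modulo 2π because `cmath.phase` always reports an angle between −π and π, so the sum of two angles may differ from the reported angle of the product by one full turn.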
§ A.6.2. Problems

A.6.9. (Requires knowledge of derivatives and preferably power series.) The appearance of the number e in the polar form of complex numbers may seem mysterious.

(a) Study again the justification for this notation given in the lecture and argue that e could just as well be replaced by, say, the number 3 so far as this justification is concerned.

(b) However, make a case for e on the basis of its distinctive property $(e^x)' = e^x$, not shared by other exponential functions.

(c) Also make a case for e by expanding both sides of the identity in problem A.6.7 using the series in problem 5.1.2.

(d) ? Are these two arguments for e different or is it the same reason in different guises?

A.6.10. Explain what is wrong in the following argument:

$$-1 = \sqrt{-1}\,\sqrt{-1} = \sqrt{(-1)(-1)} = \sqrt{1} = 1$$

A.6.11. (a) Prove from the geometric definition of complex numbers that a(b + c) = ab + ac.

(b) ? Are there any other important algebraic rules we need to derive in order to be justified in treating the algebraic and geometric conceptions of complex numbers as equivalent?

A.6.12. When we know complex numbers we no longer need to memorise the trigonometric addition formulas

sin(α + β) = sin(α) cos(β) + sin(β) cos(α)
cos(α + β) = cos(α) cos(β) − sin(α) sin(β)

because we can easily rederive them by simply multiplying $e^{i\alpha}$ by $e^{i\beta}$.

(a) Show how. (Express the product in two different ways and identify real and imaginary parts.)

(b) ? Is this proof circular? That is, does the algebraic machinery it involves already rest on the sum formulas for trigonometric functions in some implicit way?

Their first triumph, however, was not the quadratic equations found in textbooks today but rather cubic ones, i.e., equations of degree 3. For cubic equations there is a formula analogous to the common quadratic formula, namely the solution of $y^3 = py + q$ is

$$y = \sqrt[3]{\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}} + \sqrt[3]{\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}}.$$

(a) Apply the formula to $x^3 = 15x + 4$. The two cube roots that arise are in fact equal to 2 + i and 2 − i. Check this!

(b) So what solution does the formula give? Is it correct?

The conclusion is that even if you think answers with i's in them are hocus-pocus you still have to admit that complex numbers are useful for answering questions about ordinary real numbers as well.

§ A.7. Reference summary

§ A.7.1. Basic algebra

Exponents:

$$\cdots \quad x^{-2} = \frac{1}{x^2} \quad x^{-1} = \frac{1}{x} \quad x^0 = 1 \quad x^1 = x \quad x^2 = x \cdot x \quad \cdots$$

$$x^{1/2} = \sqrt{x} \quad x^{1/3} = \sqrt[3]{x} \quad \cdots$$

$$a^x a^y = a^{x+y} \quad \frac{a^x}{a^y} = a^{x-y} \quad (a^x)^y = a^{xy} \quad (ab)^x = a^x b^x$$

Fractions:

$$\frac{A}{B} \times \frac{C}{D} = \frac{AC}{BD} \qquad \frac{A}{B} + \frac{C}{D} = \frac{AD + BC}{BD}$$

$$\frac{AC}{BC} = \frac{A}{B} \qquad \frac{A}{B} = \frac{A}{B} \times \frac{C}{C}$$

$$\frac{X}{\;\frac{A}{B}\;} = X \times \frac{B}{A} \qquad \frac{A}{B} = A \times \frac{1}{B}$$
$$a^3 - b^3 = (a - b)\left(a^2 + ab + b^2\right)$$

$$a^3 + b^3 = (a + b)\left(a^2 - ab + b^2\right)$$

Quadratic formula:

$$ax^2 + bx + c = 0 \implies x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

§ A.7.2. Pythagorean Theorem

[Figure: right triangle.]

Distance between two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$:

$$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$$

§ A.7.3. Lines

Equation for a line:

y = mx + b

$m$ = slope $= \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$ = “rise over run”

b = y-intercept = y(0)

[Figure: lines with slopes m = 3, m = 1, m = 1/2, m = 0, m = −1.]

• Find the equation for a line passing through two given points $(x_1, y_1)$, $(x_2, y_2)$.

Find the slope $m = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$, and then proceed as in the previous problem.

§ A.7.4. Circles

Circle with center at origin:

$$x^2 + y^2 = r^2 \implies y = \pm\sqrt{r^2 - x^2}$$

Circle with center at (a, b):

$$(x - a)^2 + (y - b)^2 = r^2$$

§ A.7.5. Parabolas

Parabola with vertical axis:

$$y = ax^2 + bx + c = A(x - B)^2 + C$$

A = “steepness”
A positive ⟹ upward or “happy” parabola
A negative ⟹ downward or “sad” parabola
B = x-value of axis of symmetry
C = vertical shift = distance of vertex from x-axis

• Convert a quadratic function given in the form $y = x^2 + bx + c$ into the form $y = A(x - B)^2 + C$.

Rewrite $x^2 + bx$ as $x^2 + bx + \left(\frac{b}{2}\right)^2 - \left(\frac{b}{2}\right)^2 = \left(x + \frac{b}{2}\right)^2 - \left(\frac{b}{2}\right)^2$.

• Convert a quadratic function given in the form $y = ax^2 + bx + c$ into the form $y = A(x - B)^2 + C$.

Divide by a, proceed as above to obtain $y/a = (x - B)^2 + C$, then multiply by a.
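The completing-the-square recipe of § A.7.5 can be sketched in Python. The function name `complete_square` and the sample quadratic are assumptions chosen for illustration; exact fractions are used so that the check is free of rounding error.

```python
from fractions import Fraction


def complete_square(a, b, c):
    """Return (A, B, C) with a*x**2 + b*x + c == A*(x - B)**2 + C.

    Follows the recipe in the text: divide by a, complete the square,
    then multiply by a again. Requires a != 0.
    """
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    A = a
    B = -b / (2 * a)             # x-value of the axis of symmetry
    C = c - b * b / (4 * a)      # height of the vertex
    return A, B, C


# Example: 2x^2 - 12x + 23 = 2(x - 3)^2 + 5
A, B, C = complete_square(2, -12, 23)
```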
§ A.7.6. Factoring
In general: Find the roots $r_1$, $r_2$ of $x^2 + Bx + C = 0$, e.g. using the quadratic formula. The factorisation is $(x - r_1)(x - r_2)$.

• Factor a third-degree polynomial (or higher).

Find one root r by trial-and-error or educated guessing. Factor out (x − r). This can be done systematically by polynomial long division (see below), or, in many simpler cases, simply by writing $(x - r)(\_\_\,x^2 + \_\_\,x + \_\_)$ and determining by inspection what numbers need to go in the blanks to make this expression equal to the original cubic expression.

• Divide one polynomial, p(x), by another, q(x).

To find “how many times q(x) goes into p(x),” first determine the numbers $a_1$ and $n_1$ such that $a_1 x^{n_1}$ times the highest-degree term of q(x) equals the highest-degree term of p(x). Write down $a_1 x^{n_1}$ as the first term of the answer. Next compute $a_1 x^{n_1} q(x)$, and subtract the result from p(x). This leaves the remainder of the division, $r_1(x)$.

Now repeat the process with $r_1(x)$ in place of p(x). Keep repeating this process until it can't go any further, i.e., until the remainder is 0 or of lower degree than q(x).

If the remainder is 0, then the answer gives the result of the division, i.e., $\frac{p(x)}{q(x)} = a_1 x^{n_1} + a_2 x^{n_2} + \cdots$.

If the remainder is $r_k(x) \neq 0$, then the remaining division that could not be carried out must be added to the answer, i.e., $\frac{p(x)}{q(x)} = a_1 x^{n_1} + a_2 x^{n_2} + \cdots + \frac{r_k(x)}{q(x)}$.

            a_1 x^{n_1} + a_2 x^{n_2} + ···
    q(x) )  p(x)
          − a_1 x^{n_1} q(x)
            r_1(x)
          − a_2 x^{n_2} q(x)
            r_2(x)
            ⋮

§ A.7.7. Functions and graphs

A polynomial is an expression of the form $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$. The numbers $a_i$ are the coefficients of the various x-terms.

A rational function is one polynomial divided by another.

An algebraic function is a function defined by an equation built up in any manner from the operations $+, -, \times, \div, \sqrt{\ }, \sqrt[3]{\ }, \ldots$, the variables x, y, and numbers.

A transcendental function is a function that is not algebraic.

f(x) even ⟺ f(−x) = f(x) ⟺ graph symmetric in y-axis

f(x) odd ⟺ f(−x) = −f(x) ⟺ graph symmetric in y-axis except upside-down

• Given f(x) as a formula, find f(whatever given expression).

Replace all occurrences of x in the formula for f(x) with the given expression enclosed in brackets.

• Evaluate a composite function f(g(x)).

Work “inside out”: first find g(x), then plug the result into f(x).

• Find the graph of a function f(x) given as a formula.

Pick some value for x, compute f(x), mark the point (x, f(x)). Repeat for various values of x. When the pattern is clear, connect the dots.

• Infer the properties of the graph of a polynomial function.

If the polynomial has a factor (x − a), the graph intersects or touches the x-axis at x = a.

If the degree of the polynomial is n, the graph has no more than n − 1 turning points, and no more than n intersections with any line.

If the highest-degree term is x raised to an odd power, the function goes to +∞ for big x's and −∞ for big negative x's.

If the highest-degree term is x raised to an even power, the function goes to +∞ for big x's and for big negative x's.

• Recognise how the graph of a function closely related to f(x) differs from that of f(x).

f(x) + c moves the graph c steps up.

−f(x) flips the graph upside-down (i.e., mirrors it in the x-axis).

f(−x) flips the graph the other way around (i.e., mirrors it in the y-axis).

k f(x) stretches the graph by a factor k in the y-direction; the bigger the k, the “steeper” or “more accentuated” the graph becomes.

f(kx) compresses the graph by a factor k in the x-direction; the bigger the k, the more “squeezed together” the graph becomes. (For 0 < k < 1 the graph is instead stretched, or “flattened out,” in the x-direction.)

$f^{-1}(x)$ interchanges x and y, i.e., reflects the graph in the line y = x.

• Find the inverse of a function given as a formula y = f(x).

Solve for the input variable x, i.e., rewrite the given equation in the form x = (something with y's). The right hand side is then the desired inverse function $f^{-1}(y)$. (Its variable is now called y. If you prefer to forget what x denoted when you started the problem and simply consider $f^{-1}$ in its own right, then you can simply replace all the y's by x's and you have $f^{-1}(x)$.)

Find the inverse of $f(x) = \left(\sin(\sqrt{x})\right)^2$.

$y = \left(\sin(\sqrt{x})\right)^2 \implies \pm\sqrt{y} = \sin(\sqrt{x}) \implies \arcsin(\pm\sqrt{y}) = \sqrt{x} \implies x = (\arcsin(\pm\sqrt{y}))^2 = (\arcsin(\sqrt{y}))^2$. So $f^{-1}(x) = (\arcsin(\sqrt{x}))^2$.
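The polynomial long-division procedure of § A.7.6 translates almost line by line into code. The sketch below is one possible rendering, not the text's own: the name `poly_divmod` and the representation of a polynomial as a list of coefficients (highest degree first) are assumptions made for illustration.

```python
from fractions import Fraction


def poly_divmod(p, q):
    """Divide polynomial p by q; return (quotient, remainder).

    Polynomials are coefficient lists, highest degree first, so
    x^3 - 2x + 1 is [1, 0, -2, 1]. The leading coefficient of q
    must be nonzero.
    """
    remainder = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    quotient = []
    # Each pass cancels the current leading term of the remainder.
    while len(remainder) >= len(q):
        factor = remainder[0] / q[0]       # coefficient a_k of the next term
        quotient.append(factor)
        for i, qc in enumerate(q):         # subtract a_k x^{n_k} q(x)
            remainder[i] -= factor * qc
        remainder.pop(0)                   # the leading term is now zero
    return quotient, remainder


# (x^3 - 6x^2 + 11x - 6) / (x - 1) = x^2 - 5x + 6, remainder 0
quotient, remainder = poly_divmod([1, -6, 11, -6], [1, -1])
```

The loop stops exactly when the remainder has lower degree than q(x), which is the stopping condition stated in the text; a nonzero remainder is returned rather than added to the answer as a fraction.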
§ A.7.8. Trigonometry

Geometrical definitions of trigonometric functions:

[Figure: right triangle with angle θ and sides labelled hypotenuse, opposite, adjacent.]

$$\sin\theta = \frac{\text{opposite side}}{\text{hypotenuse}} \qquad \cos\theta = \frac{\text{adjacent side}}{\text{hypotenuse}}$$

$$\tan\theta = \frac{\text{opposite side}}{\text{adjacent side}} = \frac{\sin\theta}{\cos\theta}$$

Reciprocal trigonometric functions:

$$\sec x = \frac{1}{\cos x} \qquad \csc x = \frac{1}{\sin x} \qquad \cot x = \frac{1}{\tan x}$$

Pythagorean property:

$$\sin^2\theta + \cos^2\theta = 1$$

Symmetry properties:

sin(−θ) = −sin(θ)
cos(−θ) = cos(θ)

Sum and difference formulas:

sin(x + y) = sin x cos y + cos x sin y
sin(x − y) = sin x cos y − cos x sin y
cos(x + y) = cos x cos y − sin x sin y
cos(x − y) = cos x cos y + sin x sin y

Double angle formulas:

sin 2x = 2 sin x cos x
$\cos 2x = \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x$

Half angle formulas (the sign depends on the quadrant of x/2):

$$\sin\frac{x}{2} = \pm\sqrt{\frac{1 - \cos x}{2}}$$

$$\cos\frac{x}{2} = \pm\sqrt{\frac{1 + \cos x}{2}}$$

Radian angle measure: measuring angles by the length of the corresponding arc of a unit circle. In particular, a full revolution = circumference of unit circle = 2π.

• Convert θ° into radians.

$\theta \cdot \frac{2\pi}{360}$

• Convert θ radians into degrees.

$\theta \cdot \frac{360}{2\pi}$

Trigonometric table:

degrees   radians   sin      cos      tan
0°        0         0        1        0
30°       π/6       1/2      √3/2     1/√3
45°       π/4       1/√2     1/√2     1
60°       π/3       √3/2     1/2      √3
90°       π/2       1        0        undefined

§ A.7.9. Exponential functions

$y(t) = y_0 e^{kt}$: exponential growth/decay function.

• Given an exponential growth/decay function, determine the time at which λ times the initial amount is present. (E.g., “half life,” λ = 1/2.)

Set y(T) = λy(0) and solve for T.

• Given the initial value $y_0$ and one data point $y(t_1) = y_1$ of an exponential growth/decay function y(t), find the expression for y(t).

We need $y_0$ and k in the formula $y(t) = y_0 e^{kt}$. Fill in $y_0$, which is given, right away. Using the resulting expression, write out $y(t_1) = y_1$ and solve for k.
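The two recipes for $y(t) = y_0 e^{kt}$ above amount to taking logarithms: solving $y_0 e^{k t_1} = y_1$ gives $k = \ln(y_1/y_0)/t_1$, and solving $y(T) = \lambda y(0)$ gives $T = \ln(\lambda)/k$. A minimal Python sketch follows; the function names and the sample data are assumptions made for illustration.

```python
import math


def rate_from_data_point(y0, t1, y1):
    """Solve y0 * exp(k * t1) = y1 for the rate k."""
    return math.log(y1 / y0) / t1


def time_to_fraction(k, lam):
    """Solve y(T) = lam * y(0) for T, where y(t) = y0 * exp(k * t)."""
    return math.log(lam) / k


# Hypothetical data: 100 units initially, 50 units left after 10 time units.
k = rate_from_data_point(100.0, 10.0, 50.0)
half_life = time_to_fraction(k, 0.5)
```

With these made-up numbers the amount halves in 10 time units, so the computed half life should come out as 10, which serves as a consistency check on both formulas.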
• Given two data points, $y(t_1) = y_1$ and $y(t_2) = y_2$, of an exponential growth/decay function y(t), find the expression for y(t).

Laws of logarithms (log can be any logarithm):

log(ab) = log a + log b
log(a/b) = log a − log b
$\log(a^b) = b \log a$
log 1 = 0

$\ln(e^x) = x \qquad e^{\ln x} = x \qquad \ln e = 1$

[Figure: graph of y = ln(x).]

Isolate the exponential expression on one side of the equation (move other terms to the other side and divide away the coefficient of the exponential term, if it has one), and take logarithms of both sides.

• Solve for x in an equation in which x occurs inside a logarithm.

§ A.7.11. Complex numbers

[Figure: addition z + w and multiplication zw of complex numbers, with |zw| = |z| · |w|.]

$\overline{a + bi}$ = complex conjugate of a + bi = a − bi

• Solve a quadratic equation with complex roots.

Solve with usual quadratic formula and when the root comes out as $\sqrt{-A}$, rewrite it as $\sqrt{A}\sqrt{-1} = \sqrt{A}\,i$.

Solve $z^2 - 2z + 4 = 0$.

$$z = \frac{2 \pm \sqrt{4 - 16}}{2} = \frac{2 \pm \sqrt{-12}}{2} = \frac{2 \pm 2\sqrt{-3}}{2} = 1 \pm i\sqrt{3}.$$

Multiply top and bottom of the fraction by the conjugate a − bi of the denominator. After simplifying, there will be no i's left in the denominator.

Express $\frac{3}{1+5i}$ in the standard form a + bi.

$$\frac{3}{1+5i} = \frac{3(1-5i)}{(1+5i)(1-5i)} = \frac{3-15i}{1+25} = \frac{3}{26} - \frac{15}{26}i.$$
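Both worked examples above can be verified with Python's built-in complex numbers. The helper names below are assumptions made for this sketch; `cmath.sqrt` is the standard-library complex square root, so a negative discriminant needs no special handling.

```python
import cmath


def quadratic_roots(a, b, c):
    """Roots of a*z**2 + b*z + c = 0 via the usual quadratic formula."""
    root = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


def divide_by_conjugate(num, den):
    """Compute num/den by multiplying top and bottom by the conjugate of den."""
    top = num * den.conjugate()
    bottom = (den * den.conjugate()).real    # |den|^2, a real number
    return complex(top.real / bottom, top.imag / bottom)


z1, z2 = quadratic_roots(1, -2, 4)                        # text: 1 ± i*sqrt(3)
w = divide_by_conjugate(complex(3, 0), complex(1, 5))     # text: 3/26 - (15/26)i
```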
Simplify $\frac{3+2i}{3-2i}$.

$$= \frac{(3+2i)^2}{(3-2i)(3+2i)} = \frac{5+12i}{9+4} = \frac{5}{13} + \frac{12}{13}i.$$

Write in the form a + bi: $e^{3\pi i/2}$, $2e^{\pi i/6}$, $3e^{i\pi/4}$.

$-i$, $\sqrt{3} + i$, $\frac{3}{\sqrt{2}} + \frac{3}{\sqrt{2}}i$.

Simplify $(1 - i)^{12}$.

For the polar form of 1 − i we have $r = \sqrt{1^2 + (-1)^2} = \sqrt{2}$, $\theta = -\frac{\pi}{4}$, so for $(1 - i)^{12}$ we have $r = (\sqrt{2})^{12} = 2^6 = 64$, $\theta = -\frac{\pi}{4} \cdot 12 = -3\pi$, which is equivalent to −π. Thus $(1 - i)^{12} = -64$.

Write $(1 - \sqrt{3}i)^{11}$ in the form a + bi.

Let $z = 1 - \sqrt{3}i$. Observe that |z| = 2, and $\arg z = \theta = \frac{5\pi}{3}$, so $z = 2e^{\frac{5\pi}{3}i}$. Thus $z^{11} = (2e^{\frac{5\pi}{3}i})^{11} = 2^{11}e^{\frac{55\pi}{3}i} = 2^{11}e^{(18\pi + \frac{\pi}{3})i} = 2^{11}e^{\frac{\pi}{3}i} = 2^{11}(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}) = 2^{11}(\frac{1}{2} + i\frac{\sqrt{3}}{2}) = 2^{10}(1 + i\sqrt{3}) = 1024 + 1024\sqrt{3}i$.

Simplify $\left(\frac{1+\sqrt{3}i}{1+i}\right)^8$.

$$= \frac{2^8(\cos(8 \cdot 60°) + i\sin(8 \cdot 60°))}{(\sqrt{2})^8(\cos(8 \cdot 45°) + i\sin(8 \cdot 45°))} = \frac{2^8}{2^4} \cdot \frac{\cos 120° + i\sin 120°}{\cos 0° + i\sin 0°} = 2^4\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right) = -8 + 8\sqrt{3}i.$$

Simplify $\left(\frac{1}{2} + \frac{\sqrt{3}}{2}i\right)^{666}$.

$= (1(\cos 60° + i\sin 60°))^{666} = 1^{666}(\cos(666 \cdot 60°) + i\sin(666 \cdot 60°)) = \cos(111 \cdot 360°) + i\sin(111 \cdot 360°) = \cos 0° + i\sin 0° = 1$.

$$F = \frac{GMm}{r^2}$$
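The power computations above all follow one pattern: convert to polar form, raise the length to the n-th power, and multiply the angle by n. A short Python sketch of that pattern (the helper name `power_via_polar` is an assumption made for illustration) can be used to confirm the stated results numerically.

```python
import cmath
import math


def power_via_polar(z, n):
    """Raise z to the integer power n by raising its length to the n-th
    power and multiplying its angle by n."""
    r, theta = cmath.polar(z)
    return cmath.rect(r ** n, n * theta)


a = power_via_polar(complex(1, -1), 12)                 # text: -64
b = power_via_polar(complex(1, -math.sqrt(3)), 11)      # text: 1024 + 1024*sqrt(3) i
c = power_via_polar(complex(1, math.sqrt(3)) / complex(1, 1), 8)   # text: -8 + 8*sqrt(3) i
```

The results are floating-point approximations, so they should be compared with the exact values up to a small tolerance rather than for exact equality.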
B LINEAR ALGEBRA

§ B.1.1. Lecture worksheet

7x + 3y is a “linear” expression, as opposed to anything involving $x^2$, $xy$, or such things of higher order. 7x + 3y is a “linear combination” of x and y: it's so much of the one plus so much of the other. Linear relationships between quantities are the simplest kind of relation, and they occur everywhere. Innumerable scenarios are modelled by linear systems of equations such as

5x + 2y = 3
x − 4y = 1

or linear transformations such as

$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x - y \\ 2x + y \end{pmatrix}$$

where some “state” (x, y) of some system is being transformed into a new state that is a linear transformation of the original state.

Linear algebra gives us concepts and techniques to deal with relationships of this type. At first sight these concepts may not seem very impressive or well-motivated, but the more you encounter these types of relationships “in the wild,” the more you will come to agree that linear algebra really hit the nail on the head and singled out some structurally fundamental concepts and ideas. This makes linear algebra a bit hard to teach and learn in terms of motivation. The ideas of linear algebra are not well motivated by any one particular problem. Rather, they prove their worth in that they capture patterns that recur in a vast number of diverse situations. At first sight, the concepts of linear algebra might seem like complicated ways of saying simple things. I will show you various applications, but you may well object that any one of them you could have handled with simpler means instead of using the fancy words of linear algebra with their intricate definitions. This is true. But the point is not that the notions of linear algebra help you solve any one particular problem, but that they help us highlight patterns that occur in a vast range of problems across a multitude of contexts. The “start-up cost” of learning these concepts might seem like a high price to pay for the applications you get, but it's a long-term investment if there ever was one. Linear algebra is distilled experience. Mathematicians spent centuries working with linear relationships the hard way, and linear algebra is the box of insiders' tricks that emerged as the most common and fundamental structural patterns they encountered.

§ B.2. Matrices

§ B.2.1. Lecture worksheet

rule for multiplying such things that turns out to be very fundamental. Let's learn the rule algebraically first, and then we shall see what it all means. The rule is given in the reference summary. We see that matrices are multiplied by multiplying rows of the left matrix by columns of the right matrix. To me, multiplying matrices is a tactile experience. I run my left index finger across the row and my right one down the column, tapping the numbers that are to be multiplied together.

B.2.1. Find

$$\begin{bmatrix} 5 & 0 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ -2 & 0 \end{bmatrix}$$

and make sure you tap along with your fingers so that matrix multiplication becomes ingrained in your muscle memory.

We shall see many examples in which matrix multiplication represents the transition from one state to the next. For example, the population dynamics of a city can be characterised by how many people move between the city center and the suburbs. Perhaps mostly young people leave the suburbs to go live in the city, while the older generation tend to remain in the suburbs. This could mean for example that 80% of the suburban population stay and 20% move in a given decade. Those who move to the city perhaps do so on a more temporary basis, for example until they have children of their own. Thus we can imagine that 50% of the city population move and 50% stay in any given decade. This information is encoded in the matrix $\begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix}$, because if $x_n$ and $y_n$ represent the two populations after n decades, then

$$\begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$$

B.2.2. (a) We also have

$$\begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix}^n \begin{bmatrix} x_k \\ y_k \end{bmatrix} = \begin{bmatrix} x_m \\ y_m \end{bmatrix}$$

where m = ______

(b) Which is the city population? x or y?

(c) The population distribution is in equilibrium when the city population is ______% of the suburban population.

So figuring out what happens with the populations over time is just a matter of multiplying by the transition matrix so many times. Below we shall return to this example and see how its long-term behaviour can be analysed.
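The “rows times columns” rule and the decade-by-decade transition can be sketched in plain Python. Everything here beyond the transition matrix itself is an assumption made for illustration: the helper names `mat_vec` and `mat_mul`, and in particular the starting populations, which are made up.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(entry * v for entry, v in zip(row, vector)) for row in matrix]


def mat_mul(left, right):
    """Rows of the left matrix times columns of the right matrix."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in left]


transition = [[0.8, 0.5],
              [0.2, 0.5]]

state0 = [1000.0, 1000.0]                # hypothetical starting populations
state1 = mat_vec(transition, state0)     # after one decade
state2 = mat_vec(transition, state1)     # after two decades

# Two decades in one step: multiply by the product of the matrix with itself.
two_step = mat_mul(transition, transition)
state2_direct = mat_vec(two_step, state0)
```

Because each column of the transition matrix sums to 1, the total population is unchanged from one decade to the next; only its distribution shifts.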
§ B.3. Linear transformations

§ B.3.1. Lecture worksheet

B.3.4. Which of the above matrices come back to I when multiplied by itself a certain number of times? How many multiplications does it take? Confirm by both calculation and geometrical interpretation.
This strategy for solving linear equations is two thousand years old: it was used in ancient China, where the coefficient matrix was represented and manipulated on a counting board. One is reminded of those Mancala board games where marbles are placed in a grid of pits on a wooden board, a delightfully concrete version of a matrix, and one very well suited for this kind of calculation. Let me give you a sample problem from an ancient Chinese text for practice. See if you can feel the marbles moving as you solve it in coefficient matrix (or “counting board”) form.

B.4.4. “[We are to ascend a mountain carrying a weight of 40 dan] given one superior horse, two common horses, and three inferior horses. . . . The superior horse together with one common horse, the [group of two] common horses together with one inferior horse, and the [group of three] inferior horses together with one superior horse, are all able to ascend. Problem: How much weight do the superior horse, common horse, and inferior horse each have the strength to pull?”

§ B.5. Inverse matrices

numbers, then calculating Av, where

$$A = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$

and then translating the resulting column vector back into a four letter word. The spy sends the message CRAR. Decode the message.

a   b   c   d   e   f   g   h   i   j   k   l   m   n
1   2   3   4   5   6   7   8   9   10  11  12  13  14

o   p   q   r   s   t   u   v   w   x   y   z
15  16  17  18  19  20  21  22  23  24  25  26

§ B.6. Determinants

§ B.6.1. Lecture worksheet

Above we saw that for a 2 × 2 matrix,

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$
B.6.2. Argue that it follows that determinants are areas since $\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1$ is the area of the unit square and all other parallelograms can be built up from there by the above operations (or, conversely, can be brought back down to a unit square by these operations).

B.6.3. Technically, determinants are “signed areas” since they are sometimes negative, although their magnitude always corresponds to the area. This is reflected in determinant algebra by the fact that if we switch two columns the determinant changes sign. Compute the determinants of the transformations in problem B.3.1 and argue that a negative determinant corresponds to areas being “flipped upside down.”

B.6.4. † ? 3 × 3 determinants are volumes of parallelepipeds, the natural generalisation of the 2 × 2 case. They are computed by breaking them into 2 × 2 determinants as shown in the reference summary. Generalise the argument of problem B.6.2 to the 3 × 3 case to justify the interpretation of a 3 × 3 determinant as a volume.

The rules for manipulating determinants that we found above can be used to simplify calculations. For example, if we are looking for the determinant
\[ \begin{vmatrix} 1 & 1 & 5 \\ 0 & 0 & 1 \\ 2 & 2 & 11 \end{vmatrix} \]
we simply subtract twice the first row and once the middle row from the last row, which then becomes a row of all zeroes. Therefore the entire determinant is zero.

B.6.5. Explain why the last sentence is clear both computationally and geometrically.

The case of a determinant being zero is often of special interest. It means that the column/row vectors are “linearly dependent,” i.e., one of them can be obtained by combining the others with certain coefficients, like the last row was a combination of the previous two in our example. So in such cases there is a kind of redundancy: the last vector “doesn’t add anything new.” This idea will be important later.

B.6.6. Select all that are true.
– If u, v, w are vectors such that {u, v}, {u, w}, and {v, w} are each linearly independent sets, then {u, v, w} is a linearly independent set.
– If the columns of a matrix are linearly dependent, then its determinant is zero.
– If the rows of a matrix are linearly dependent, then its determinant is zero.
– Just as every real number except 0 has a multiplicative inverse, so every square matrix that has no 0 entries has an inverse.
– A diagonal matrix commutes with anything: that is, if D is a diagonal square matrix then AD = DA for all matrices A of the same dimensions.

§ B.7. Eigenvectors and eigenvalues

§ B.7.1. Lecture worksheet

Above we discussed a population dynamics example where the movements each decade were described by a matrix,
\[ \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix} \begin{bmatrix} x_n \\ y_n \end{bmatrix} = \begin{bmatrix} x_{n+1} \\ y_{n+1} \end{bmatrix} \]
What will happen in the long run in this situation? It seems likely that the population distribution will eventually settle at an equilibrium, so that the number of people moving one way is equal to the number of people moving in the other direction. In equations this means
\[ \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} \]
or in other words
\[ .8x + .5y = x, \qquad .2x + .5y = y \]
or
\[ -.2x + .5y = 0, \qquad .2x - .5y = 0 \]
The second equation is just minus one times the first, so there is really only one equation and two unknowns. Therefore there are infinitely many solutions. In such a situation we can pick any value for one of the variables, say $x = t$, and there will always be a corresponding value for y that solves the equation, in this case $y = \frac{2}{5}t$. We can then express all solutions in vector form as $\begin{bmatrix} t \\ \frac{2}{5}t \end{bmatrix}$, or $t \begin{bmatrix} 1 \\ 2/5 \end{bmatrix}$. Or, if we prefer to write it without fractions, $t \begin{bmatrix} 5 \\ 2 \end{bmatrix}$, since any constant multiple is absorbed by the parameter t, which runs through all numbers. So we see that an equilibrium is reached when the populations are in the proportions 5 to 2, i.e., when the city population is 40% of the suburban population. The parameter t reflects the fact that we did not know the total number of people to begin with, so we know only the proportions but not the absolute numbers.

In more general terms, a matrix identity $Ax = 0$, or $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, is really a system of linear equations,
\[ ax + by = 0, \qquad cx + dy = 0 \]
Geometrically, each equation is a line. These lines are either the same (as in the above example), or they intersect in one point. So there is either one solution or infinitely many.

B.7.1. Why are parallel lines not a possibility?

The difference between these two possibilities is reflected in the determinant of A:
\[ \det A = 0 \iff \text{number of solutions} = \infty \]
\[ \det A \neq 0 \iff \text{number of solutions} = 1 \]
B.7.2. Explain why this is clear in terms of both the area and linear independence interpretations of the determinant.

I would like to generalise from the population example and consider any vector that, when multiplied by a matrix A, is sent to a multiple of itself,
\[ A \begin{bmatrix} x \\ y \end{bmatrix} = \lambda \begin{bmatrix} x \\ y \end{bmatrix} \]
Such a vector is called an eigenvector of A, and the number λ is the corresponding eigenvalue. In the population example we were interested in the special case λ = 1.

B.7.3. For each of the matrices in problem B.3.1, find all eigenvectors and eigenvalues both algebraically and by geometrical reasoning.

The eigenvector equation $Ax = \lambda x$ can also be rewritten as $Ax = \lambda I x$ and then $(A - \lambda I)x = 0$. This is precisely the kind of system we studied above. So we know that there is either one solution or infinitely many, and we can find out which by computing $\det(A - \lambda I)$. Obviously there is always the trivial solution x = 0, but we do not count 0 as an eigenvector. So eigenvectors occur precisely when there are infinitely many solutions, i.e., when $\det(A - \lambda I) = 0$.

So if we are looking for the eigenvectors and eigenvalues of our population matrix $A = \begin{bmatrix} .8 & .5 \\ .2 & .5 \end{bmatrix}$ we begin by solving the equation

where n is the generation. To obtain the population distribution in the next generation one multiplies by a transition matrix such as
\[ A = \begin{bmatrix} 0 & 2 & 4 \\ 1/16 & 0 & 0 \\ 0 & 1/4 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 0 & 2 & 4 \\ 1/4 & 0 & 0 \\ 0 & 1/2 & 0 \end{bmatrix} \qquad C = \begin{bmatrix} 0 & 1/2 & 1/2 \\ 1/4 & 0 & 0 \\ 0 & 1/2 & 0 \end{bmatrix} \]
Match each matrix with its corresponding real-world scenario among those listed below. Also, without calculations, by reasoning about the real-world meaning of eigenvectors and eigenvalues, deduce which eigenvalues and eigenvectors from the options provided should go with each scenario. Confirm your inferences computationally (perhaps using a computer). Possible eigenvectors:
\[ v_1 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \qquad v_2 = \begin{bmatrix} 8 \\ 2 \\ 1 \end{bmatrix} \qquad v_3 = \begin{bmatrix} 16 \\ 2 \\ 1 \end{bmatrix} \]
(c) Compute the eigenvalues and eigenvectors of the and applying the matrix A many times. The process can
matrix (perhaps by computer). If the proportions be thought of as modelling the behaviour of a “random
of s: f :l are 1 : : then the economy is surfer” who clicks the links of the page he is on with equal
[growing/shrinking] by % per year. probability. The pages with the highest rankings are those
which the random surfer ends up hitting most often.
B.7.8. Consider an economy based on oil and steel. Extract-
ing oil costs both steel and oil: steel for drills and pipes, (a) Compute A 5 x0 , A 20 x0 and A 100 x0 , and notice that the
and oil to run the pumps. Similarly, mining for steel re- results are stabilising at particular values. These are
quires steel drills and rails and oil-driven machinery. Ex- the relative importances of these pages according to
tracting one unit of oil costs .04 units of oil and .08 units Google’s algorithm.
of steel. Extracting one unit of steel costs .04 units of oil
and .01 units of steel. This is encoded in the matrix (b) Explain how this is related to eigenvectors and eigen-
· ¸ values. Hint: Compare A 101 x0 ≈ A 100 x0 with Ax = λx.
.04 .04
A=
.08 .01 (c) Show how the same ranking (and relative impor-
tances) can found by an eigenvector calculation in-
There is a yearly market demand for 100 units of oil and
stead of computing powers of A. (Recall that if v is an
20 units of steel, which we may express by the matrix
· ¸ eigenvector then so is any multiple of v.)
100
D= (d) Suppose the owner of page 4 tries to boost his rank-
20
£ oil ¤ ing by creating a page 5 which links to page 4; page 4
Finally, X = steel is a column vector expressing the also links to page 5. Does the new page 5 help page
yearly production quantities. 4’s ranking?
(a) What
£ ¤ is £ the ¤real-world meaning of AX ? Hint: Consider this disconnected web:
A osnn = osn−1
n−1
.
1 3 5
(b) What is the economic sense of the equation (I −
A)X = D?
2 4
(c) Solve this equation for X and interpret your answer
in real-world terms. (e) Compute the eigenvalues and eigenvectors of this
B.7.9. † When you perform a Google search, the order of the web. Why does the ranking strategy used above not
results is determined using matrix algebra. The basic idea work here?
is that if a web page contains n links to other pages then it (f ) ? Does this problem occur for any disconnected
“passes on” 1/n times its own importance to each of those web? Does it ever occur for a connected web? Ex-
sites. We can think of the web as the board of a board plain using the idea of a stable state and its meaning
game, on which stacks of coins placed on each site is be- in terms of eigenvectors and eigenvalues.
ing moved around in this manner. One “turn” of all web-
sites passing on their importance according to this prin- (g) To fix this and other problems the actual Google ma-
ciple can be encoded in a matrix A such that if x is the trix is not A but 0.85A + 0.15B , where B is a ma-
column vector of the importance of the websites then Ax trix with all entries 1/N (where N is the number of
is the column vector of importances after the passing on pages). Explain how this can be interpreted in terms
has taken place. An example is shown below. of the “random surfer” mentioned above.
3 4
§ B.8. Diagonalisation
0 0 1 1/2
1/3 0 0 0 § B.8.1. Lecture worksheet
A=
1/3
1/2 0 1/2
1/3 1/2 0 0 In this section we shall find a clever way of figuring out the
power A n of a matrix without actually having to multiply it out
We can now rank the pages by supposing that each web-
so many times. This is done by finding the “diagonalisation” of
site starts with equal importance, i.e.,
A. A diagonal matrix is a matrix with all zeros except along the
1/4
diagonal. Diagonal matrices are very convenient because
1/4
1/4 ,
x0 = ¸n
an
· · ¸
a 0 0
=
1/4 0 b 0 bn
B.8.1. Interpret this result geometrically.

Therefore we seek a diagonal representation of A, so that we can take its powers in a convenient way. I claim that in fact
\[ M = \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} A \begin{bmatrix} v_1 & v_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \]
where $v_1$ and $v_2$ are the eigenvectors of A written as columns. This is a splendid fact, because if we solve for A in this equation we obtain
\[ A = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} \qquad (\ast) \]
and therefore
\[ A^n = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}^n \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} \]
In this way we need only three matrix multiplications instead of a hundred to compute $A^{100}$.

To prove my claim I only need to compute:
\[ M \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} A \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} A v_1 = \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} \lambda_1 v_1 = \lambda_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ 0 \end{bmatrix} \]

B.8.2. Justify each step in this calculation by matching the first, second, third, and fourth equalities with a corresponding justification:
– definition of A
– rule for scalar product of vector with itself
– definition of M
– simplifying by column operations
– matrix multiplication computation
– reasoning backwards: what input gives this output?
– formula for inverse matrix
– definition of eigenvector

In the same way one finds that $M \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ \lambda_2 \end{bmatrix}$, so M must be $\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$, as claimed.

If we apply this to the population example we find that
\[ A^n = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}^n \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} = \begin{bmatrix} 5 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1^n & 0 \\ 0 & 0.3^n \end{bmatrix} \begin{bmatrix} 5 & 1 \\ 2 & -1 \end{bmatrix}^{-1} \]
\[ = \begin{bmatrix} 5 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0.3^n \end{bmatrix} \frac{1}{-7} \begin{bmatrix} -1 & -1 \\ -2 & 5 \end{bmatrix} = \frac{1}{7} \begin{bmatrix} 5 + 2 \cdot 0.3^n & 5 - 5 \cdot 0.3^n \\ 2 - 2 \cdot 0.3^n & 2 + 5 \cdot 0.3^n \end{bmatrix} \]
(As a check, n = 0 gives the identity matrix and n = 1 gives A itself.) In particular,
\[ \lim_{n \to \infty} A^n \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \frac{1}{7} \begin{bmatrix} 5 & 5 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \frac{1}{7} \begin{bmatrix} 5(x_0 + y_0) \\ 2(x_0 + y_0) \end{bmatrix} \]
So no matter what the initial numbers $x_0$ and $y_0$ are, in the long run the ratio will stabilise at five sevenths in the suburbs and two sevenths in the city. We already knew that this was the equilibrium, but now we have confirmed explicitly that we always approach this equilibrium regardless of initial condition.

B.8.3. Diagonalise one or two of the matrices from problem B.7.3. Then use the diagonalisation to find a simple expression for an arbitrary power of the matrix $A^n$ and note that the answer could easily have been predicted geometrically.

§ B.8.2. Problems

B.8.4. Explain how formula (∗) from the text can be used to generate a matrix with given eigenvectors and eigenvalues. (This is useful for teachers designing exam problems. If you make up some matrices with simple numbers in them and compute their eigenvalues and eigenvectors you will find that these are typically not simple at all, so this is not a good way of designing manageable exam problems.)

B.8.5. Re-solve problem B.3.5 by finding the eigenvectors and eigenvalues through geometric reasoning and applying formula (∗).

B.8.6. Argue that (∗) can be interpreted geometrically as follows. To apply the transformation A is the same thing as to: (1) perform a change of variables so that the eigenvectors are the new basis vectors, (2) apply the transformation in this new coordinate system, where it is simply a dilation of each of the basis vectors, (3) revert back to the original variables.

B.8.7. The Fibonacci sequence is a sequence of numbers in which every number is the sum of the two previous numbers: 1, 1, 2, 3, 5, 8, 13, 21, . . ., or in symbols $F_n = F_{n-1} + F_{n-2}$. These numbers are found in many places in nature. For example, if you turn a pine cone or a pineapple upside-down and count the number of spirals going clockwise and counter-clockwise you will find that these are two consecutive Fibonacci numbers.

Fibonacci was a 13th century Italian mathematician who used this sequence and many other computational problems to show the superiority of the Arabic numerals that we use today over the Roman numerals still used in Europe at that time. He introduced his sequence by means of the following rabbit population scenario. Suppose each pair of adult rabbits produces one pair of baby rabbits per season. Next season these baby rabbits become adults and start producing offspring of their own. Thus the number of adult rabbit pairs $F_n$ in generation n equals all the rabbit pairs from last generation $F_{n-1}$ plus the new
rabbits produced by the rabbits who are reproductively active, i.e., the rabbits who were born at least two seasons ago, $F_{n-2}$. Thus $F_n = F_{n-1} + F_{n-2}$, as above.

We shall now obtain a formula for $F_n$ that will enable us to find for example $F_{1000}$ in one step instead of the thousand steps required to write out the entire Fibonacci sequence up to this point.

(a) Find a matrix A such that
\[ \begin{bmatrix} F_{n+1} \\ F_n \end{bmatrix} = A \begin{bmatrix} F_n \\ F_{n-1} \end{bmatrix}. \]
The eigenvalues of this matrix are $a = \frac{1 + \sqrt{5}}{2}$ and $b = \frac{1 - \sqrt{5}}{2}$, and the corresponding eigenvectors are $\begin{bmatrix} a \\ 1 \end{bmatrix}$ and $\begin{bmatrix} b \\ 1 \end{bmatrix}$. Diagonalise A and find a formula for $F_n$ by computing a suitable power of A multiplied with $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

(b) This leads to $F_n =$ ______ (expressed as a formula in terms of a, b and n).

§ B.9. Data

§ B.9.1. Lecture worksheet

Linear algebra is very often used to process big data. Remarkably, many useful numerical techniques for dealing with big sets of data correspond to readily visualisable geometrical ideas concerning vectors. The following is an example of this.

I’m given a vector d and I am looking for its projection onto a particular plane, namely the plane spanned by the vectors $a_1$ and $a_2$. In other words, I am trying to find some combination of these vectors, say $x_1 a_1 + x_2 a_2$, that is as close as possible to d. I can formulate this in terms of matrices by forming a matrix A that has the vectors $a_1$ and $a_2$ as columns. Then my problem becomes: choose $x = (x_1, x_2)$ so that Ax is as close as possible to d.

[Figure: the vector d, its projection $Ax = x_1 a_1 + x_2 a_2$ onto the column space of $A = [a_1 \; a_2]$, and the difference d − Ax.]

The closest point is the one for which the difference d − Ax is perpendicular to the plane, that is, to both $a_1$ and $a_2$. In matrix terms this says $A^T(d - Ax) = 0$, which can also be written $A^T A x = A^T d$. The only unknown in this equation is x. We have thus reduced the geometrical problem of finding x to a straightforward matter of solving a system of equations numerically.

Now consider a data problem that at first sight appears unrelated to the above but in fact turns out to be the same thing. Let’s say I want to investigate the relation between the ages of male and female actors portraying couples in Hollywood movies. Here for example is a small data set:

movie               Brad Pitt’s age (t)    age of actress playing his wife (w)
Se7en               32                     22
Mr. & Mrs. Smith    41                     30
World War Z         49                     37

Does this data fit a linear relationship w = mt + b? Not exactly, but not far from it. The equations this data would satisfy if the relationship were linear would be:
\[ b + m t_1 = w_1, \qquad b + m t_2 = w_2, \qquad b + m t_3 = w_3 \]
or in matrix terms:
\[ \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ 1 & t_3 \end{bmatrix} \begin{bmatrix} b \\ m \end{bmatrix} = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} \]
In terms of our geometric example, this corresponds to Ax = d. The b and the m are the unknowns x that we are looking for, and the wives’ ages are the data d that we are trying to capture. But just as in the geometric example, it is not possible to solve this equation Ax = d. But we can solve the problem to the closest possible approximation by the same trick as above, that is, solve $A^T A x = A^T d$ instead. In the movie case this becomes:
\[ \begin{bmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \end{bmatrix} \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ 1 & t_3 \end{bmatrix} \begin{bmatrix} b \\ m \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} \]
It is straightforward to write this out as a system of equations and solve for the unknowns b and m. This gives b = −1350/217 ≈ −6.22 and m = 383/434 ≈ 0.88. Thus w = 0.88t − 6.22 is the best linear fit for the given data. So Brad Pitt’s movie wives were, so to speak, born a bit over six years after him and furthermore age only 0.88 years for every one of his.

I used only three age pairs in this example, but the method works for a data set of any size. The three-dimensional visualisation enabled us to see the method in an intuitive way, but once we translated it into matrix language we could just as well extend it to any number of dimensions.
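The values of b and m quoted above can be reproduced with a few lines of Python. This is a minimal sketch rather than the text's own method of presentation: it builds the normal equations for the three data points and solves the resulting 2 × 2 system exactly. The function names `normal_equations` and `solve_2x2` are assumptions introduced for illustration.

```python
from fractions import Fraction

# Data from the table: Brad Pitt's age t and the actress's age w.
t = [32, 41, 49]
w = [22, 30, 37]

def normal_equations(t_values, w_values):
    """Return (A^T A, A^T d) for the model w = b + m t, where A has rows [1, t_i]."""
    n = len(t_values)
    sum_t = sum(t_values)
    sum_tt = sum(ti * ti for ti in t_values)
    sum_w = sum(w_values)
    sum_tw = sum(ti * wi for ti, wi in zip(t_values, w_values))
    ata = [[n, sum_t], [sum_t, sum_tt]]
    atd = [sum_w, sum_tw]
    return ata, atd

def solve_2x2(matrix, rhs):
    """Solve a 2x2 system exactly using the 2x2 inverse-matrix formula."""
    (a, b), (c, d) = matrix
    det = Fraction(a * d - b * c)
    first = (d * rhs[0] - b * rhs[1]) / det
    second = (-c * rhs[0] + a * rhs[1]) / det
    return first, second

ata, atd = normal_equations(t, w)
b, m = solve_2x2(ata, atd)
print(b, m, float(b), float(m))
```

The same two functions work unchanged for a longer list of age pairs, since only the sums entering $A^T A$ and $A^T d$ change.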
§ B.9.2. Problems
The number multiplies onto each entry:
· ¸ · ¸
a b ka kb
k =
c d kc kd
· ¸ · ¸ · ¸
1 3 3·1 3·3 3 9
3 = =
2 0 3·2 3·0 6 0
· ¸ · ¸ · ¸ · ¸
1 3 4 2 1+4 3+2 5 5 § B.10.2. Linear transformations
+ = =
2 0 2 −1 2+2 0−1 4 −1
Geometrically, matrices represent linear transformations,
• Multiply a matrix by a number, k A. meaning transformations that preserve lines. Such transfor-
mations are rotations, reflections, dilations (i.e., magnifica-
tions, or scalings), linear projections, and combinations of § B.10.3. Gaussian elimination
these. Matrix multiplication always leaves the origin intact so
translations (i.e., vertical or horizontal displacements) are not Correspondence between matrices and systems of linear equa-
included. tions:
£x¤
£Algebraically, this corresponds to the multiplication A y = ax + b y = c
·
a b c
¸
X ↔
¤
Y A transforms the input point (x, y) into some output point dx +ey = f d e f
(X , Y ).
£1¤ Gaussian elimination rules:
The first column of A is£its¤ effect on the unit vector 0 , the sec-
ond on the unit vector 01 . – You may add or subtract any row from any other row.
Matrix multiplication AB represents the composition of the – You may multiply or divide any row by any number.
linear transformations A and B in the order “first apply B then
apply A” (since input vectors are “plugged in on the right”). – You may switch places of any two rows.
• Find the matrix representing a given linear transformation The goal of Gaussian elimination is generally to turn the ma-
specified in geometrical language (rotation, reflection, etc.). trix into identity-matrix form, or at least upper-triangular form
(i.e., having all 0’s in the bottom-left half below its principal di-
Determine the effect of the transformation of the unit basis agonal).
vectors. Write the results as the columns of a matrix. This is
the sought transformation matrix. • Solve a system of linear equations.
p p ¸ Attempt to solve the system as above and see the rules for the
·
1/ 2 −1/ 2 number of solutions there.
What geometric transformation is p p ?
1/ 2 1/ 2
3x = 27
45◦ counterclockwise rotation.
2x − y = 0
0 1 0 · ¸ · ¸ · ¸ · ¸
What geometric transformation is −1 0 0 ? 3 0 27 1 0 9 1 0 9 1 0 9
∼ ∼ ∼
0 0 1 2 −1 0 2 −1 0 0 −1 −18 0 1 18
90◦ rotation about the z-axis, clockwise as seen from Hence x = 9 and y = 18.
“above” (positive z position).
A −1 Inverse of A. A matrix such that A A −1 =
x + y = 2
A −1 A = I .
2x + 2y = 4
orthogonal
A matrix whose columns (and rows) form
" # " # matrix
1 1 2 1 1 2 a system of orthogonal unit vectors, or
∼ ∼
2 2 4 0 0 0 equivalently: a matrix A such that A −1 =
AT .
Hence · ¸ · ¸
x 2−t singular
= A matrix with determinant zero. A singu-
y t matrix
lar matrix is non-invertible.
is a solution for any value of t .
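The elimination rules can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the text's own procedure: it reduces an augmented matrix all the way to reduced form and then classifies the system as having one, infinitely many, or no solutions. The function names `row_reduce` and `count_solutions` are hypothetical.

```python
from fractions import Fraction

def row_reduce(augmented):
    """Reduce an augmented matrix using only the three Gaussian elimination rules."""
    rows = [[Fraction(entry) for entry in row] for row in augmented]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols - 1):  # the last column is the right-hand side
        # Look for a row at or below pivot_row with a non-zero entry in this column.
        candidates = [r for r in range(pivot_row, n_rows) if rows[r][col] != 0]
        if not candidates:
            continue
        r = candidates[0]
        rows[pivot_row], rows[r] = rows[r], rows[pivot_row]  # switch two rows
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [e / pivot for e in rows[pivot_row]]  # divide a row by a number
        for other in range(n_rows):
            if other != pivot_row and rows[other][col] != 0:
                factor = rows[other][col]
                # subtract a multiple of the pivot row from another row
                rows[other] = [e - factor * p
                               for e, p in zip(rows[other], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows

def count_solutions(augmented):
    """Classify a system as having 'none', 'one' or 'infinitely many' solutions."""
    reduced = row_reduce(augmented)
    n_unknowns = len(reduced[0]) - 1
    # A row reading 0 = c with c non-zero makes the system inconsistent.
    if any(all(e == 0 for e in row[:-1]) and row[-1] != 0 for row in reduced):
        return "none"
    rank = sum(1 for row in reduced if any(e != 0 for e in row[:-1]))
    return "one" if rank == n_unknowns else "infinitely many"

# Augmented matrix from the first worked example.
print(row_reduce([[3, 0, 27], [2, -1, 0]]))
print(count_solutions([[1, 1, 2], [2, 2, 4]]))
print(count_solutions([[1, 1, 2], [2, 2, 5]]))
```

Fractions are used instead of floats so that a row of zeroes is recognised exactly.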
symmetric
A matrix that equals its own reflection in
matrix
x + y = 2 its principal diagonal; matrix A such that
A = AT .
2x + 2y = 5
antisymmetric
Matrix A such that A = −A T .
matrix
" # " #
1 1 2 1 1 2
∼ ∼
2 2 5 0 0 1
There are no values for x and y that make 0 = 1. Hence the system of equations has no solutions.

For what value(s) of a does the following system of equations have no solutions?

§ B.10.5. Determinants

\[ \det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc = \pm \text{ area of parallelogram spanned by } \begin{bmatrix} a \\ c \end{bmatrix} \text{ and } \begin{bmatrix} b \\ d \end{bmatrix} \]
§ B.10.4. Special matrices – Block out the row and column containing this entry, and
write down the determinant of the matrix that remains.
I Identity matrix. A matrix with 1’s on its – Write the entry itself as a coefficient in front of this de-
principal
£ ¤ diagonal and 0’s elsewhere, as terminant.
in 10 01 . Multiplying by I has no effect:
AI = I A = A. – Decide whether the entry is associated with a plus or mi-
nus sign. The top left entry (of the original matrix) is
AT Transpose of A. A with rows and positive and every time you go one step over or one step
columns interchanged: the rows become down the sign changes. Once the sign is determined,
the columns and the columns become write it in front of the entry coefficient.
the rows.
The original matrix is equal to the resulting sum of smaller
matrices (with their appropriate signs and coefficients). Ap-
· ¸T 2 1 ply the same process to these smaller matrices until you have
2 5 6
= 5 3 broken them down to 2 × 2 determinants, which can be eval-
1 3 4
6 4 uated as above.
Find the determinant: $\begin{vmatrix} 2 & 1 & 4 \\ 1 & 5 & 1 \\ 2 & 3 & 1 \end{vmatrix}$

We can do this by expanding by any one row or column. Let’s pick the first column. Then each entry of this column needs to be multiplied by the 2 × 2 determinant that remains when its row and column is blocked out:
\[ 2 \begin{vmatrix} 5 & 1 \\ 3 & 1 \end{vmatrix} = 4, \qquad 1 \begin{vmatrix} 1 & 4 \\ 3 & 1 \end{vmatrix} = -11, \qquad 2 \begin{vmatrix} 1 & 4 \\ 5 & 1 \end{vmatrix} = -38 \]
The signs of the terms follow the pattern
\[ \begin{matrix} + & - & + \\ - & + & - \\ + & - & + \end{matrix} \]
In other words, to find the sign of a given position we “count it off” in alternating plusses and minuses starting from a plus in the top left corner. In our case, therefore, the middle term must be negated. So the final answer is 4 + 11 − 38 = −23.

\[ \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = \begin{vmatrix} a & b - a & c \\ d & e - d & f \\ g & h - g & i \end{vmatrix} \]

You can switch places of two of the rows (or two of the columns), but then you must also switch the sign of the determinant (multiply by −1 in front). In this way the value of the determinant remains the same.

\[ \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = - \begin{vmatrix} d & e & f \\ a & b & c \\ g & h & i \end{vmatrix} \]
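The expansion procedure described above (block out a row and column, attach the alternating sign, repeat until only 2 × 2 determinants remain) can be written as a short recursive function. This is a minimal Python sketch; the names `minor` and `det` are assumptions for illustration, and expansion is always done along the first column.

```python
def minor(matrix, row, col):
    """Block out the given row and column and return the matrix that remains."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]

def det(matrix):
    """Determinant of a square matrix by expansion along the first column."""
    if len(matrix) == 1:
        return matrix[0][0]
    if len(matrix) == 2:
        (a, b), (c, d) = matrix
        return a * d - b * c
    total = 0
    for row, r in enumerate(matrix):
        sign = -1 if row % 2 else 1  # + - + ... going down the first column
        total += sign * r[0] * det(minor(matrix, row, 0))
    return total

example = [[2, 1, 4],
           [1, 5, 1],
           [2, 3, 1]]
# Same matrix with the first two rows switched: the sign should flip.
swapped = [example[1], example[0], example[2]]
print(det(example), det(swapped))
```

Cofactor expansion is convenient for small matrices like these; for large matrices, simplifying by row operations first takes far fewer steps.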
¯ ¯
¯
¯ 3 4 0 0 ¯
¯
¯ 1 2 0 0 ¯
Find the determinant: ¯¯ ¯
¯ 0 0 3 1 ¯
¯
¯ 0 0 4 2 ¯
¯ ¯ ¯ ¯ Evaluate the determinant
¯ 2 0 0 ¯ ¯ 1 0 0 ¯
¯ ¯ ¯ ¯ ¯ ¯
Expanding by first row: = 3 ¯ 0 3 1 ¯ − 4 ¯¯ 0
¯ ¯ 3 1 ¯=
¯ ¯ 1 2 4 4 ¯
¯ ¯
¯ 0 4 2 ¯ ¯ 0 4 2 ¯ ¯ 0 2 9 1 ¯
¯ ¯
¯ ¯
¯ 3 1 ¯
¯
¯ 3 1 ¯
¯ ¯ 0 0 1 8 ¯
3 · 2 ¯¯ ¯−4¯
¯ 4 2 ¯ = 6·2−4·2 = 4
¯ ¯ ¯
¯ 2 4 8 9
4 2 ¯
¯
Subtract twice the first row from the last row. Then expand
• Simplify a determinant before computing it.
by the first column repeatedly.
Find the determinant of the matrix
0 1 0
Invert A = −1 0 0
1 0 1 0
0 1 0 0 1
1 1
A= 0 a 3 a 2
a
1 a 2 1
0 1 0 1 0 0 1 0 0 0 −1 0
−1 0 0 0 1 0 ∼ 0 1 0 1 0 0
Subtract the first row from the last, and then expand by the 0 0 1 0 0 1 0 0 1 0 0 1
first column:
Thus
¯
¯1 0
¯
1 0 ¯¯ ¯¯ ¯ 0 −1 0
¯ ¯ 13 12 1 ¯
−1
¯
¯0 1 1 1
¯ A = 1 0 0
det A = ¯¯ ¯ = ¯a a a 0 0 1
3
a 2 a ¯¯ ¯¯
¯
¯0 a
¯
¯0 a a 1 1¯
1 1¯
For a 2 × 2 matrix,
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

§ B.10.7. Eigenvectors and eigenvalues

Definition: If Av = λv for some number λ then v (≠ 0) is called an eigenvector of A and λ is the eigenvalue associated with it.
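Find the eigenvectors and eigenvalues of $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$.

\[ \det(A - \lambda I) = \begin{vmatrix} 3 - \lambda & 1 \\ 0 & 2 - \lambda \end{vmatrix} = (3 - \lambda)(2 - \lambda) = 0 \]
Thus the two eigenvalues are λ = 2 and λ = 3. To find the corresponding eigenvectors we plug each of these values into $(A - \lambda I)x = 0$. For λ = 2 this gives $\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, so x = −y, so the eigenvector is $t \begin{bmatrix} 1 \\ -1 \end{bmatrix}$. For λ = 3 we get $\begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, so y = 0 and x can be anything, so the eigenvector is $t \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.

Find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$.

$\det(A - \lambda I) = 0 \Rightarrow (3 - \lambda)(2 - \lambda) = 0 \Rightarrow \lambda_1 = 2$ and $\lambda_2 = 3$.

For $\lambda_1$:
\[ \begin{bmatrix} 3 - 2 & 1 \\ 0 & 2 - 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{cases} x + y = 0 \\ 0 \cdot x + 0 \cdot y = 0 \end{cases} \Rightarrow y = -x \Rightarrow v_1 = t \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
For $\lambda_2$:
\[ \begin{bmatrix} 3 - 3 & 1 \\ 0 & 2 - 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{cases} 0 \cdot x + 1 \cdot y = 0 \\ 0 \cdot x - 1 \cdot y = 0 \end{cases} \Rightarrow y = 0 \Rightarrow v_2 = t \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

§ B.10.8. Diagonalisation

\[ A = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} \]
where $v_1$ and $v_2$ are the eigenvectors of A written as columns.

• Diagonalise a given matrix A.

Find its eigenvectors and eigenvalues and enter them into the above formula.

Diagonalise $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$, that is, find a matrix C, $C^{-1}$, and a diagonal matrix D such that $A = CDC^{-1}$.

Above we found $\lambda_1 = 2$, $\lambda_2 = 3$, $v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $v_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Therefore
\[ A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix} \]

• Compute a power $A^n$ of a given matrix A.

Diagonalise A. Multiplying its diagonalised expression with itself repeatedly yields
\[ A^n = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{bmatrix} \begin{bmatrix} v_1 & v_2 \end{bmatrix}^{-1} \]

Given $A = \begin{bmatrix} -1 & 4 \\ 0 & 1 \end{bmatrix}$ find $A^{139}$ using diagonalisation.

$\det(A - \lambda I) = \lambda^2 - 1 = 0 \Rightarrow \lambda_1 = -1, \lambda_2 = 1$. To find the eigenvector for $\lambda_1 = -1$ we form the equation $(A - \lambda_1 I)v = 0 \Rightarrow \begin{bmatrix} 0 & 4 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. Thus $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. By a similar calculation, $v_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$. By diagonalisation we know that $A = SDS^{-1}$ where $S = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ and $D = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$. To find the required power we multiply $A^{139} = (SDS^{-1})^{139} = SDS^{-1} SDS^{-1} \cdots SDS^{-1} = SD^{139}S^{-1}$ which becomes
\[ A^{139} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}^{139} \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 4 \\ 0 & 1 \end{bmatrix} \]

§ B.10.9. Examples

Give an example of a 2 × 2 matrix B such that $B^8 = I$ but $B \neq I$, $B^2 \neq I$, and $B^4 \neq I$.

Reasoning geometrically, we see that a rotation 45° either clockwise or counterclockwise has the desired properties. This corresponds to the matrices
\[ B = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \quad \text{or} \quad B = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \]

Given $A = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 7 \\ 2 & -4 \\ 0 & 2 \end{bmatrix}$ find a matrix X such that XA = B.

Multiply both sides of the equation by $A^{-1} = \frac{1}{5} \begin{bmatrix} 4 & -3 \\ -1 & 2 \end{bmatrix}$ on the right. Since $XAA^{-1} = BA^{-1}$ simplifies to $X = BA^{-1}$ we get
\[ X = \frac{1}{5} \begin{bmatrix} 1 & 7 \\ 2 & -4 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ -1 & 2 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} -3 & 11 \\ 12 & -14 \\ -2 & 4 \end{bmatrix} \]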
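The power formula can be compared against plain repeated multiplication. The following Python sketch is an illustration under stated assumptions, not the text's own method: it takes the eigenvalues and eigenvectors as given inputs (it does not compute them) and handles only the 2 × 2 case. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix by the formula 1/(ad - bc) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def power_by_diagonalisation(eigenvalues, eigenvectors, n):
    """Compute A^n = S D^n S^-1, with the eigenvectors as the columns of S."""
    (l1, l2), (v1, v2) = eigenvalues, eigenvectors
    s = [[v1[0], v2[0]], [v1[1], v2[1]]]
    d_n = [[l1 ** n, 0], [0, l2 ** n]]
    return matmul(matmul(s, d_n), inverse_2x2(s))

def power_by_multiplication(a, n):
    """Compute A^n the slow way, by multiplying n times."""
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = matmul(result, a)
    return result

# Example with eigenvalues -1, 1 and eigenvectors (1, 0), (2, 1).
A = [[-1, 4], [0, 1]]
fast = power_by_diagonalisation((-1, 1), ((1, 0), (2, 1)), 139)
slow = power_by_multiplication(A, 139)
print(fast == slow == A)
```

The diagonalised route needs only two matrix products whatever the exponent, whereas the direct route needs one product per factor.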
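Find all eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & -4 & 8 \\ 2 & -3 & 8 \\ 0 & 0 & 1 \end{bmatrix}$. Then find $A^{11}v$ where $v = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}$.

\[ \det(A - \lambda I) = 0 \Rightarrow \begin{vmatrix} 3 - \lambda & -4 & 8 \\ 2 & -3 - \lambda & 8 \\ 0 & 0 & 1 - \lambda \end{vmatrix} = 0 \Rightarrow (1 - \lambda)(\lambda^2 - 1) = 0 \Rightarrow (1 - \lambda)(\lambda - 1)(\lambda + 1) = 0. \]
The eigenvectors for $\lambda_1 = 1$ are found by
\[ \begin{bmatrix} 2 & -4 & 8 \\ 2 & -4 & 8 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Since every eigenvalue is 1 or −1, the diagonal matrix D of eigenvalues satisfies $D^{11} = D$, so $A^{11} = PD^{11}P^{-1} = PDP^{-1} = A$. Hence
\[ A^{11}v = Av = \begin{bmatrix} 3 & -4 & 8 \\ 2 & -3 & 8 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 9 \\ 8 \\ 1 \end{bmatrix} \]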
C NOTATION REFERENCE TABLE

\[ \prod_{n=1}^{3} (1 - x^n) = (1 - x)(1 - x^2)(1 - x^3) \]
§ C.1. Logic
§ C.4. Vectors
=⇒ implies
§ C.3. Algebra
§ C.6. Sets and intervals
Q the rational numbers; numbers that are
the ratio of two integers ∅⊂N⊂Z⊂Q⊂R⊂C
1 ∈ [0, 1] 1 ∉ (0, 1)