
Analysis in Many Variables 2022-23 Michaelmas Term

Michaelmas AMV Notes


The lecture notes for this term were originally prepared by Drs. Emma Coutts and Ian Vernon and
Professor Patrick Dorey, and have been edited slightly since. If you find any typos, please let me know.
Sam Fearn

0 Introduction

1 Maps between real vector spaces
1.1 General notation
1.2 Scalar fields, vector fields and curves
1.3 Partial derivatives and the chain rule

2 The gradient of a scalar field
2.1 Differential operators and ∇
2.2 Directional derivatives
2.3 Some properties of the gradient

3 ∇ acting on vector fields
3.1 Divergence (div)
3.2 Curl
3.3 Applying ∇ twice

4 Index notation
4.1 Einstein Summation Convention
4.2 The Kronecker delta, δij
4.3 The Levi-Civita symbol, εijk
4.4 The very useful εijk formula
4.5 Applications

5 Differentiability of scalar fields
5.1 Continuity
5.2 Open sets
5.3 Differentiable maps Rⁿ → R
5.4 Continuous Differentiability
5.5 The chain rule revisited
5.6 The implicit function theorem

6 Differentiability of vector fields
6.1 Differentiable maps Rⁿ → Rⁿ
6.2 Diffeomorphisms and the inverse function theorem

7 Volume, line and surface integrals
7.1 Double integrals and Fubini's theorem
7.2 Volume integrals
7.3 Line integrals
7.4 Surface integrals I: defining a surface
7.5 Surface integrals II: evaluating the integral

8 Green's, Stokes' and divergence theorems
8.1 The three big theorems
8.2 Examples
8.3 Conservation laws and the continuity equation (Non-Examinable)
8.4 Path independence of line integrals

9 Proofs of the three big theorems (Non-Examinable)
9.1 Green's theorem in the plane
9.2 Stokes' theorem
9.3 The divergence theorem
9.4 Further examples

Admin Details
• Dr Sam Fearn, [email protected]
• Use the Topic forums (available from the Blackboard Ultra Discussions Page) to ask questions
about the course. This way other students will also be able to see any answers. You can post
anonymously if you prefer. If you think you can answer another student’s question you should try
to do so - trying to explain something to others is often the best way to learn.
• Lectures: Lectures will take place in TLC 042. These are timetabled for Monday at 1pm, Tuesday
at 10am, and Thursday at 5pm (odd teaching weeks only). Lecture notes and the recorded videos
from the previous year will be provided to go along with the lectures. You are expected to spend
time studying the lecture notes yourself as well as attending the lectures.
• All the material for this term will be available via the Ultra course. I will try to make everything
as easy to find as possible, but please also familiarise yourself with what is available.
• Tutorials: Tutorials will be held in odd numbered weeks, starting in week 3. The suggested problems
for these tutorials will be available through the ‘Michaelmas Tutorials’ block in Ultra. You should
always make sure that you have both the lecture notes and any relevant problems sheets available
in your tutorials.
• Problems Classes: These will take place in TLC 042 on Thursdays at 5pm in even numbered
teaching weeks. In these classes I will focus on working through examples rather than introducing
any new material.
• Assignments: Assignments will be set weekly, and will alternate between work to be done on
paper, and handed in via Gradescope, and e-assessments to be completed on the Maths Quiz
Server. Further details, as well as the necessary links, are available in the Assignments block on
Ultra.
• Additional Reading: The notes available in each topic, along with the lecture videos and problems
classes will contain all the material that you will be examined on in this course. If you feel that
you would like some additional reading to support the lecture notes then I recommend the book
Mathematical Methods for Physics and Engineering by Riley, Hobson and Bence. This is available
for you to read online via the ‘Reading List’, which is available as a link in Ultra.

0 Introduction
Most of this course this term is actually about vector calculus. Vector calculus is a generalisation
of the calculus you studied in Calculus I, which focussed mostly on functions of one or two variables,
to the study of functions defined from and to higher-dimensional real spaces. Although the notation we
use today was first introduced in the study of Electromagnetism (Maxwell, 1831-1879) in the 1860s
and 1870s, the language of vector calculus is now fundamental in many parts of pure and applied
mathematics, as is shown in Figure 1.

Figure 1: Not only is vector calculus fundamental for studying most topics in mathematical physics, it's
also important for the study of the geometry and topology of differentiable spaces (more specifically,
differentiable manifolds, which you'll meet in a later course), and the study of differential equations.

1 Maps between real vector spaces
1.1 General notation
• R: the set of real numbers, which we think of as points on a line.
• Rⁿ: the set of ordered n-tuples (x1, x2, . . . , xn) where each xi is real (xi ∈ R). Such an n-tuple can
be thought of as the cartesian coordinates of a point in n-dimensional space. Rn is a real vector
space (see Linear Algebra I for the full set of axioms of a real vector space), and so we refer to
elements of Rn as n-dimensional vectors. You may be used to using column vector notation for
such vectors, but here we will use row vector notation.
• Recall that the standard basis vector ei is defined as the vector with a 1 in the ith position, and a
0 in all other positions, e.g. e2 = (0, 1, 0, . . . , 0).
The standard basis vectors ei are orthonormal (orthogonal, and normalised to have length 1) with
respect to the scalar (dot) product on Rⁿ:

ei . ej = { 1, i = j
         { 0, i ≠ j  ≡ δij.

δij as just defined is called the Kronecker delta, and we will revisit this in our later section on index
notation, section 4.
• In terms of the standard basis of Rⁿ, {e1, e2, . . . , en}, the position vector x of a point in Rⁿ can
be written as

x = x1 e1 + x2 e2 + · · · + xn en = Σ_{i=1}^n xi ei = xi ei.    (1.1)

• Note: Index notation (as in the final expression in (1.1), xi ei) is a very important convention:
when we see an index (like i) repeated, we will assume that it is to be summed over from 1 to n
and leave off the "Σ_{i=1}^n". This convention is known as the Einstein Summation Convention
(ESC). Index notation is sufficiently important in vector calculus that we will have a whole section
on it later in the course.
• For low values of n, we will often write x1 , x2 , . . . as x, y, . . .

e.g. n=2: x = xe1 + ye2


n=3: x = xe1 + ye2 + ze3 .

• Important: Don't forget the vector signs (underlines): the vector x is not the same as the scalar x, and x ei (the scalar x times the basis vector ei) has a quite different meaning to x . ei (the dot product).
• Given two vectors u, v ∈ Rⁿ,

u = u1 e1 + u2 e2 + · · · + un en = Σ_{i=1}^n ui ei = ui ei
v = v1 e1 + v2 e2 + · · · + vn en = Σ_{i=1}^n vi ei = vi ei

(note the use of the Einstein summation convention again here), the scalar (dot) product between
these vectors is then

u.v = u1 v1 + u2 v2 + · · · + un vn = Σ_{i=1}^n ui vi = ui vi.

This can easily be proved by multiplying out u.v, and using the fact that the standard basis vectors
are orthonormal.

• The length or magnitude of u is found using the dot product between u and itself,

|u| = √(u.u),

and if θ is the angle between u and v, then

u.v = |u| |v| cos(θ).

|x| is sometimes denoted r (note that this is a scalar quantity, and not a vector), since it is the
radial coordinate of x in spherical polars. Similarly x is sometimes denoted r.
• Draw vectors as arrows:

Figure 2: The vector x illustrated as the position vector of a point in R3 . The length of x, r, is shown,
as are the standard basis vectors of R3 . Using Einstein Summation Convention, we can write x in terms
of the standard basis as x = xi ei
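
The dot product, norm and angle formulas above are easy to experiment with numerically. The short Python sketch below (an illustration, assuming NumPy is available; the particular vectors are arbitrary) checks that the component formula u.v = ui vi agrees with |u||v|cos(θ), and uses np.einsum, whose index-string syntax mirrors the Einstein Summation Convention.

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
v = np.array([3.0, 0.0, 4.0])

# u.v = u_i v_i (sum over the repeated index i), written ESC-style with einsum
dot_components = np.einsum('i,i->', u, v)
dot_builtin = u @ v                       # NumPy's own dot product

# |u| = sqrt(u.u), and u.v = |u||v|cos(theta)
norm_u = np.sqrt(u @ u)
norm_v = np.sqrt(v @ v)
theta = np.arccos(dot_components / (norm_u * norm_v))

print(dot_components, dot_builtin)                                   # both 11.0
print(np.isclose(dot_components, norm_u * norm_v * np.cos(theta)))   # True
```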

1.2 Scalar fields, vector fields and curves


Now we can introduce the main objects of study in this course: scalar fields, vector fields and curves.
These are all defined as maps from one real vector space to another, that is, as maps Rᵐ → Rⁿ.
• Scalar fields are real-valued functions on Rⁿ, i.e. maps

Rⁿ → R
x ↦ f(x)

e.g. for n = 3, we could have the function f defined as

f(x) = xy / tan z.
Note that here the argument of the function is underlined to show that it’s a vector quantity. f (x)
could also be written as f (x, y, z).
The functions of two variables that you studied in the Epiphany term of Calculus I were all examples
of scalar fields.
• Vector fields are vector-valued functions on Rⁿ, i.e. maps

Rⁿ → Rⁿ
x ↦ f(x).

Note that since the image of x is also a vector, we underline the function f to indicate this. In
textbooks you may also see vector fields written in bold, rather than underlined.
Example 1. Given a constant vector a ∈ Rⁿ, we might have the vector field f on Rⁿ, given by

f(x) = (a.x) x.

If we let n = 2 and take a = (1, 1), then the vector field near the origin looks as in Figure 3.

Figure 3: A plot of the vector field f (x) = (a.x)x near the origin in 2 dimensions with a = (1, 1). The
vectors are drawn at a sample of points x, as arrows.

Pay careful attention to the two different types of multiplication being used in this example. We
take the scalar product between a and x, and we can then use scalar multiplication to multiply the
vector x by the scalar (a.x).
Note: We can also write the formula "in components", i.e. using index notation, by giving a
formula for the ith component of f. So example 1 would be (using index notation and ESC):

fi = (aj xj) xi

• Curves in Rⁿ are given parametrically by specifying x as a function x(t) of some parameter, t say.
That is, a curve is a map

R → Rⁿ
t ↦ x(t).

Since the image of t is a vector quantity, we underline the function x(t) to indicate this. t itself is
a scalar quantity however, and so is not underlined.
Example 2. Given constant vectors a, b ∈ Rⁿ, the curve

x(t) = a + t b,

which can be written in components as

xi(t) = ai + t bi,

is a straight line in Rⁿ, which goes through the point a and is parallel to b.

If x(t) is differentiable, then dx/dt is tangent to the curve (if non-zero). (If you studied Dynamics I,
then you've already come across this idea in that course. There, the trajectory of a particle r(t)
was a curve in space, parameterised by time t, the velocity was the derivative of the trajectory with
respect to time, dr(t)/dt, and the acceleration was the second derivative d²r(t)/dt².)

Figure 4: The tangent to a curve x(t) is found by differentiating the curve with respect to t.

Example 3. A helix in R³ can be parameterised as

x(t) = cos(t) e1 + sin(t) e2 + t e3.

The tangent to the helix is therefore given by

dx/dt (t) = −sin(t) e1 + cos(t) e2 + e3.

Note that the standard basis vectors are constant, and hence the components of the derivative of
x(t) with respect to t are just the derivatives of the components of x(t).
Example 4. If a curve is parameterised in terms of the so-called arc-length s (we won't define this
precisely here) along the curve from a fixed point on it, then |dx/ds| = 1.

With t = s, h = δs,

|x(s + δs) − x(s)| ≈ δs
⇒ lim_{δs→0} |x(s + δs) − x(s)| / δs = 1
⇒ |dx/ds| = 1, a unit vector.
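
As a quick check of Examples 3 and 4, the sketch below (an illustration using SymPy, not a required part of the notes) differentiates the helix component-by-component to get the tangent, computes its length √2, and then reparameterises by arc length s = √2 t so that the new tangent has unit length.

```python
import sympy as sp

t, s = sp.symbols('t s', real=True)

# The helix of Example 3, as a vector of components
x = sp.Matrix([sp.cos(t), sp.sin(t), t])

# Tangent: differentiate each component with respect to t
tangent = x.diff(t)
speed = sp.sqrt(tangent.dot(tangent)).simplify()
print(tangent.T, speed)        # [-sin(t), cos(t), 1], sqrt(2)

# Arc length from t = 0 is s = sqrt(2) t, so substitute t = s/sqrt(2)
x_s = x.subs(t, s / sp.sqrt(2))
unit_tangent = x_s.diff(s)
print(sp.sqrt(unit_tangent.dot(unit_tangent)).simplify())   # 1
```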

1.3 Partial derivatives and the chain rule


In Calculus I, given a function of two variables f(x, y) : R² → R, you learned that the partial derivatives

∂f/∂x = lim_{h→0} [f(x + h, y) − f(x, y)] / h
∂f/∂y = lim_{h→0} [f(x, y + h) − f(x, y)] / h

Figure 5: A helix in R3 , with the tangent at a point shown.

Figure 6: If the curve is parameterised by its arc-length (which we haven't rigorously defined), then for
small δs, the length of the vector x(s + δs) − x(s) becomes approximately the same as the length of the
curve segment δs. In the limit that δs → 0, these become equal, and hence the length of the derivative
of the curve with respect to s (the tangent) becomes 1.

tell us the rate of change of the function f as we move parallel to the x- and y-axes respectively. If we
write x, y as x1, x2 and f(x, y) as f(x), with x = x1 e1 + x2 e2, then we can re-express the partial derivatives
using vector notation as

∂f/∂x ≡ ∂f/∂x1 = lim_{h→0} [f(x + h e1) − f(x)] / h
∂f/∂y ≡ ∂f/∂x2 = lim_{h→0} [f(x + h e2) − f(x)] / h.

This now suggests the obvious generalisation to scalar fields in n dimensions. If we let f(x) : Rⁿ → R,
then the function f has n (1st order) partial derivatives given by

∂f(x)/∂xa = lim_{h→0} [f(x + h ea) − f(x)] / h,

for a = 1, 2, . . . , n. These partial derivatives tell us about the rate of change of the function as we move
parallel to any of the n coordinate axes in n dimensions.
You also learned that the rate of change of a function f(x, y) : R² → R along a parametrically defined
curve C given as (x(t), y(t)) can be found using the chain rule. Along this curve we have F(t) ≡
f(x(t), y(t)). Note that I don't write the function of t as f(t), since f is a map from R² and hence
strictly speaking f(t) is a different function, this time a map from R. To avoid the confusion of two
different functions with the same name, we will denote the restricted function as F(t). The chain rule
then tells us that

dF(t)/dt = (dx/dt)(∂f/∂x) + (dy/dt)(∂f/∂y).

To extend this to the n-dimensional case using vector notation, we first note that in the two-dimensional
case, the curve C is given by

x(t) = x1(t) e1 + x2(t) e2,

with x1(t) = x(t) and x2(t) = y(t). Similarly, f(x, y) is f(x), so

dF(t)/dt = df(x(t))/dt = (dx1/dt)(∂f/∂x1) + (dx2/dt)(∂f/∂x2).

We can now see how this should be generalised to the case of a scalar field in n dimensions. The curve
C can be given parametrically as

x(t) = x1(t) e1 + . . . + xn(t) en.

Our scalar field is given as f(x) : Rⁿ → R, and the restriction of the scalar field to the curve C can then
be written as F(t) = f(x(t)). The chain rule then tells us that

dF(t)/dt = d/dt f(x(t)) = (dx1/dt)(∂f/∂x1) + · · · + (dxn/dt)(∂f/∂xn).

Note: The chain rule holds for differentiable functions of two variables. You defined what this meant in
the context of functions of two variables in Calculus I, and we shall revisit precisely what it means for
a scalar function in n dimensions to be differentiable in section 5. For now we simply assume that our
scalar fields are indeed differentiable.
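
The chain rule above is easy to verify symbolically for a concrete scalar field and curve. The sketch below (an illustrative check with SymPy; the field f = x1 x2 + x3² is an arbitrary choice, not one from the notes) restricts f to the helix of Example 3 and confirms that dF/dt equals Σ (dxi/dt)(∂f/∂xi).

```python
import sympy as sp

t = sp.symbols('t', real=True)
x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)

f = x1 * x2 + x3**2                              # an example scalar field on R^3
curve = [sp.cos(t), sp.sin(t), t]                # the helix from Example 3

# Left-hand side: differentiate the restriction F(t) = f(x(t)) directly
F = f.subs(dict(zip([x1, x2, x3], curve)))
lhs = sp.diff(F, t)

# Right-hand side: the chain rule, sum_i dx_i/dt * df/dx_i evaluated on the curve
rhs = sum(sp.diff(c, t) * sp.diff(f, xi).subs(dict(zip([x1, x2, x3], curve)))
          for c, xi in zip(curve, [x1, x2, x3]))

print(sp.simplify(lhs - rhs))                    # 0
```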

2 The gradient of a scalar field
2.1 Differential operators and ∇

In the previous section, we saw that we can compute the rate of change of a scalar field f(x) along
a curve C given by x(t) = xi(t) ei (ESC), using the chain rule, as

dF(t)/dt = d/dt f(x(t)) = (dx1/dt)(∂f/∂x1) + · · · + (dxn/dt)(∂f/∂xn) = (dxi/dt)(∂f/∂xi),

where F(t) is the restriction of f(x) to the curve x(t), and where again we've used ESC in the final
equality.
Since this is true for all (differentiable) scalar fields, it's often useful to write this rule in terms of the
derivative operators themselves, separate from the field F(t). In this form, the chain rule can be given
as

d/dt = (dx1/dt)(∂/∂x1) + · · · + (dxn/dt)(∂/∂xn).

This is known as a differential operator, which can be thought of as a map which takes functions to
functions using derivatives.
When using operator notation, we need to be careful about exactly what the differential operator is
acting on.
Example 5. Given two real functions f(x), g(x) : R → R, then:
• f(x) d/dx is a differential operator which can act on g(x) to give f(x) dg(x)/dx.
• d/dx f(x) is a differential operator which can act on g(x) to give d/dx (f(x)g(x)) = df(x)/dx g(x) + f(x) dg(x)/dx
by the product rule.
• If I want an operator which multiplies g(x) by df(x)/dx, then I should write the operator as (d/dx f(x)),
where the brackets make it clear that the derivative is "used up", only acting on f(x).
The derivative with respect to t along the curve C : x(t) = xi(t) ei as above, in operator form d/dt, can
then be rewritten using the scalar product as

d/dt = (dx1/dt)(∂/∂x1) + · · · + (dxn/dt)(∂/∂xn)
     = (dx1/dt e1 + · · · + dxn/dt en) . (e1 ∂/∂x1 + · · · + en ∂/∂xn)
     = dx/dt . ∇,

where

∇ = e1 ∂/∂x1 + · · · + en ∂/∂xn = ei ∂/∂xi. (ESC)

This differential operator ∇ is called 'del', or 'nabla', and is one of the most important objects in this
course. Note that since ∇ is a vector quantity, we always write it with an underline.
If f(x) : Rⁿ → R is a scalar field, then we define its gradient ("grad f") to be given by the action of ∇
on f:

∇f ≡ grad f = (∂f/∂x1) e1 + (∂f/∂x2) e2 + · · · + (∂f/∂xn) en.

The gradient of a scalar field is therefore a vector field, with components ∂f/∂xa.

Example 6. In two dimensions, with x = x e1 + y e2, let f(x) = (x² + y²)/4. Then we have

∂f/∂x = x/2,  ∂f/∂y = y/2
⇒ ∇f = ½ x e1 + ½ y e2 = ½ x.

The vector field can be drawn by arrows of length |∇f| and direction parallel to ∇f starting at a variety
of sample points:

Figure 7: Level sets of the function f(x) = (x² + y²)/4 are shown in a contour plot. A plot of the vector
field ∇f(x) is overlaid on top of this, showing the vectors pointing away from the origin, parallel to the
position vectors x. The vectors in the vector field ∇f(x) are perpendicular to the level sets of f(x).

Example 7. In three dimensions, with x = x e1 + y e2 + z e3, let a = a1 e1 + a2 e2 + a3 e3 be a constant
vector, and let f(x) = a.x − x.x. Then we have

f = a1 x + a2 y + a3 z − (x² + y² + z²)
⇒ ∂f/∂x = a1 − 2x,  ∂f/∂y = a2 − 2y,  ∂f/∂z = a3 − 2z
⇒ ∇f = e1(a1 − 2x) + e2(a2 − 2y) + e3(a3 − 2z)
     = a − 2x.

Although the picture is less easy to interpret than the 2-dimensional example, for completeness this is
included as Figure 8.
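
The gradient in Example 7 can also be checked symbolically. The sketch below (a SymPy illustration; the components of a are kept symbolic) builds f = a.x − x.x and confirms that its gradient is a − 2x.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
a1, a2, a3 = sp.symbols('a1 a2 a3', real=True)

coords = sp.Matrix([x, y, z])
a = sp.Matrix([a1, a2, a3])

f = a.dot(coords) - coords.dot(coords)           # f = a.x - x.x

grad_f = sp.Matrix([sp.diff(f, c) for c in coords])
print(sp.simplify(grad_f - (a - 2 * coords)))    # the zero vector
```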

2.2 Directional derivatives


Let C : x = x(t) be a curve in Rⁿ, and f : Rⁿ → R a scalar field. Then f(x(t)) : R → R is f restricted
to C and

d/dt f(x(t)) = dx/dt . ∇f    by the chain rule.

As we saw in subsection 1.2, dx/dt is tangent to C at x(t). If we change to a parameterisation in terms of
the arc-length s, such that the tangent dx/ds = n̂ is a unit tangent (see example 4 for a justification as to
why this is possible), then we have

df(x(s))/ds = n̂ . ∇f.    (2.1)

Now df(x(s))/ds is the rate of change of f with respect to distance (arc length) in the direction n̂. This is
called the directional derivative of f in the direction n̂ (and is sometimes written df/dn̂).

Figure 8: The contour plot of a scalar field overlaid with the gradient vector field in 3d. Parts of three
level sets can be seen as spherical shells (with what center?) and representative vectors of ∇f can be
seen as red arrows. As in example 6, the vectors of ∇f can be seen to be normal to the level sets of f.
We return to this idea in the next section.

Figure 9: The tangent to a curve C at a point p is shown alongside an example of the gradient of a scalar
field at the same point p.

Notice that:

df/ds = n̂ . ∇f = |n̂| |∇f| cos θ = |∇f| cos θ ≤ |∇f|.

Therefore |∇f| is the greatest value of the directional derivative over all possible directions n̂. This value
is achieved when θ = 0, i.e. when n̂ ∥ ∇f. Therefore ∇f points in the direction where f increases fastest.
Note that in example 6, the vectors in the vector field ∇f are normal to the curves of constant |x|, which
were the curves of constant f. This also holds more generally. In Rⁿ, suppose C lies entirely in the
level set f(x) = k for k some constant. Call this whole level set S, so C ⊂ S.

Figure 10: The curve C contained entirely within a level set of the function f. In two dimensions the
level sets are themselves curves, but in higher-dimensional spaces these level sets become surfaces and
hypersurfaces. You should think of the condition f(x) = k as a single constraint on an n-dimensional
space, so the points which satisfy this constraint form an (n − 1)-dimensional hypersurface.

So on this C, f(x(t)) = k and

0 = df/dt = dx/dt . ∇f
⇒ dx/dt ⊥ ∇f,

i.e. at all points p in a level set of f, ∇f is orthogonal to any curve through p contained in the level set.
In R³, the plane through P orthogonal to ∇f is called the tangent plane to S at P. The tangent to any
curve in S through P lies in this plane. In Calculus I you already used this to find the equation of the
tangent plane to a surface at a point.
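
The statements above (the directional derivative is n̂.∇f, it is largest in the direction of ∇f, and it vanishes along the level set) can be illustrated numerically. The following sketch (an illustration, assuming NumPy; the field is Example 6 and the sample point is arbitrary) compares n̂.∇f for a few unit directions at one point.

```python
import numpy as np

def grad_f(p):
    # Gradient of f(x, y) = (x^2 + y^2)/4 from Example 6: grad f = x/2
    return p / 2.0

p = np.array([1.0, 2.0])
g = grad_f(p)

for phi in np.linspace(0, 2 * np.pi, 8, endpoint=False):
    n_hat = np.array([np.cos(phi), np.sin(phi)])   # a unit direction
    print(f"phi = {phi:4.2f}   n.grad f = {n_hat @ g: .3f}")

# The largest possible value |grad f| is attained when n_hat is parallel to grad f,
# and n.grad f = 0 when n_hat is tangent to the level curve (perpendicular to grad f).
print("max possible:", np.linalg.norm(g))
```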

2.3 Some properties of the gradient


If f, g : Rⁿ → R are scalar fields, ψ : R → R is a function, and a & b are constants then,

(i) ∇(af + bg) = a∇f + b∇g
(ii) ∇(fg) = (∇f)g + f(∇g)
(iii) ∇ψ(f) = (∇f) dψ/df.

Note that as always with differential operators, the brackets are very important here. This is because
∇f g means ∇(fg), since derivatives (∂/∂x etc.) act on everything to the right.
Example 8. Let n = 2 and x = x e1 + y e2. If we now take ψ(f) = f² and f(x, y) = x sin y, then

ψ(f(x, y)) = x² sin² y
⇒ ∇ψ(f) = ∇(x² sin² y)
        = 2x sin² y e1 + 2x² cos y sin y e2,

by direct calculation. Or:

∇f = sin y e1 + x cos y e2
⇒ dψ/df = 2f
⇒ (∇f) dψ/df = (sin y e1 + x cos y e2) 2x sin y
⇒ ∇ψ(f) = 2x sin² y e1 + 2x² cos y sin y e2,

which shows that property (iii) holds in this case.
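
Properties (ii) and (iii) can also be verified symbolically for the functions of Example 8 before we prove them in general below. The sketch (a SymPy illustration; g = x + y is an arbitrary extra field introduced here just to test the product rule) checks both identities.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

def grad(h):
    return sp.Matrix([sp.diff(h, x), sp.diff(h, y)])

f = x * sp.sin(y)
g = x + y                          # an arbitrary second scalar field
psi = f**2                         # psi(f) = f^2, as in Example 8

# Property (ii): grad(f g) = (grad f) g + f (grad g)
print(sp.simplify(grad(f * g) - (grad(f) * g + f * grad(g))))   # zero vector

# Property (iii): grad psi(f) = (grad f) dpsi/df = (grad f) * 2f
print(sp.simplify(grad(psi) - grad(f) * 2 * f))                 # zero vector
```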


If we now let x = x1 e1 + . . . + xn en, then these properties can be shown to hold in the general case as
follows:

(i) ∇(af + bg) = e1 ∂(af + bg)/∂x1 + . . . + en ∂(af + bg)/∂xn
              = e1 (a ∂f/∂x1 + b ∂g/∂x1) + . . . + en (a ∂f/∂xn + b ∂g/∂xn)
              = a (e1 ∂f/∂x1 + . . . + en ∂f/∂xn) + b (e1 ∂g/∂x1 + . . . + en ∂g/∂xn)
              = a∇f + b∇g.

(ii) ∇(fg) = e1 ∂(fg)/∂x1 + . . . + en ∂(fg)/∂xn
           = e1 ((∂f/∂x1) g + f ∂g/∂x1) + . . . + en ((∂f/∂xn) g + f ∂g/∂xn)
           = (e1 ∂f/∂x1 + . . . + en ∂f/∂xn) g + f (e1 ∂g/∂x1 + . . . + en ∂g/∂xn)
           = (∇f)g + f∇g.

(iii) ∇ψ = e1 ∂ψ/∂x1 + · · · + en ∂ψ/∂xn
         = e1 (∂f/∂x1)(dψ/df) + · · · + en (∂f/∂xn)(dψ/df)
         = (∇f) dψ/df.

3 ∇ acting on vector fields
So far we've introduced scalar fields and vector fields, and seen how the differential operator ∇ can act
on a scalar field f(x) : Rⁿ → R to give a vector field known as the gradient of the scalar field, ∇f.
We also saw that the gradient tells us the direction in which the scalar field increases fastest. As the
title of this section suggests, we're now going to see how we can define an action of ∇ on vector fields
v(x) : Rⁿ → Rⁿ, and what these quantities can tell us about the original vector field.

3.1 Divergence (div)


Since ∇ is a vector operator, and vector fields assign a vector v(x) to each point x ∈ Rⁿ, we can take
the dot product between ∇ and v(x). This quantity ∇.v is known as the divergence of v, and is also
written as div v.
In the standard cartesian basis for Rⁿ,

v(x) = e1 v1(x) + e2 v2(x) + · · · + en vn(x)
⇒ ∇.v ≡ div v ≡ (e1 ∂/∂x1 + · · · + en ∂/∂xn) . (e1 v1(x) + · · · + en vn(x))
      = ∂v1/∂x1 + ∂v2/∂x2 + · · · + ∂vn/∂xn
      = Σ_{i=1}^n ∂vi/∂xi    (index notation)
      = ∂vi/∂xi,    (Einstein Summation Convention)

since the ea are orthonormal and constant.
Beware! In other coordinate systems the basis vectors ea might vary with x and hence the formula needs
more care. We may return to the divergence in other coordinate systems later in the term, depending
on time.
Note: ∇.v(x) is a scalar field, as can be seen in the following example.
Example 9.

v(x) = (v1(x), v2(x), v3(x)) = (x², y², z²)
∇.v = ∂x²/∂x + ∂y²/∂y + ∂z²/∂z = 2(x + y + z),

a number for each point x = (x, y, z), not a vector.
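
The divergence in Example 9 can be reproduced with a couple of lines of SymPy. This sketch (an illustration, not a required part of the notes) simply sums ∂vi/∂xi for v = (x², y², z²).

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
coords = [x, y, z]

v = [x**2, y**2, z**2]                     # the vector field of Example 9

div_v = sum(sp.diff(vi, xi) for vi, xi in zip(v, coords))
print(sp.factor(div_v))                    # 2*(x + y + z)
```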
Although we treat ∇ like a vector, note that it is a vector differential operator, and therefore is not
actually a vector in Rⁿ like x is. Therefore although the inner product on Rⁿ is symmetric, note that

∇.v ≠ v.∇,

as the left-hand side of this is a scalar field, whereas the right-hand side is the (scalar) differential operator

(x² ∂/∂x + y² ∂/∂y + z² ∂/∂z)

acting on scalar fields.
To get an intuition of what the divergence tells us about our vector field, we should think of our vector
field as if it were a fluid, where the direction of a vector at a point tells us the direction of fluid flow at that
point, and the magnitude of the vector tells us how fast the fluid is flowing at the point. The divergence
of the vector field at a point then tells us whether the point is acting like a source (corresponding to
positive divergence) or a sink (negative divergence) for the fluid. That is, whether more of the 'fluid'
is leaving the point than is entering it. This is shown for a vector field v in Figure 11, where the
divergence at the origin is positive, negative and zero respectively.

∇.v > 0    ∇.v < 0    ∇.v = 0

Figure 11: The divergence of a vector field at a point tells us whether the point acts like a source or a
sink for the vector field, if we think of the vector field as describing the flow of a fluid.

Properties of div:
Let a, b be constants, f, g be scalar fields and v, w vector fields, all in Rⁿ. Then

(i) ∇.(av + bw) = a ∇.v + b ∇.w
(ii) ∇.(f v) = (∇f).v + f ∇.v

Proof: (i) This follows from linearity of the partial derivative:

@(av1 + bw1 ) @(avn + bwn )


r.(av + bw) = + ··· +
@x1 @xn
@v1 @w1 @vn @wn
=a +b + ··· + a +b
@x1 @x1 @xn @xn
✓ ◆ ✓ ◆
@v1 @vn @w1 @wn
=a + ··· + +b + ··· +
@x1 @xn @x1 @xn
= ar.v + br.w.

(ii) First note that f v is a vector field with components (f v1 , f v2 , f v3 , . . . , f vn ), so

@f v1 @f vn
r.(f v) = + ··· +
@x1 @xn
@f @v1 @f @vn
= v1 + f + ··· + vn + f
@x1 @x1 @xn @xn
@f @f @v1 @vn
= v1 + · · · + vn + f + ··· + f
@x1 @xn @x1 @xn
= (rf ).v + f (r.v).

Once we finish section 4, we'll be able to derive this more effectively by using index notation.

Example 10. Suppose

f (x) = a.x
v(x) = a a constant
) f v = (a.x)a
= (a1 x + a2 y + a3 z)(a1 e1 + a2 e2 + a3 e3 )
@
) r.((a.x)a) = (a1 (a1 x + a2 y + a3 z))
@x
@
+ (a2 (a1 x + a2 y + a3 z))
@y
@
+ (a3 (a1 x + a2 y + a3 z))
@z
2
= a1 + a22 + a23 = kak2

by direct calculation. Or, using property (ii):

r.((a.x)a) = (ra.x).a + (a.x)r.a.

But r.a = 0, while


@ @ @
r(a.x) = (e1 + e2 + e3 )(a1 x + a2 y + a3 z)
@x @y @z
= e1 a1 + e2 a2 + e3 a3 = a
r.((a.x)a) = a.a = kak2

agreeing with the direct calculation.

3.2 Curl
In 3 dimensions, there is a second type of product one can take between vectors, and this is the vector
cross product. Recall that we define the vector product of two vectors A and B as

A × B = | e1  e2  e3 |
        | A1  A2  A3 |
        | B1  B2  B3 |
      = e1(A2 B3 − A3 B2) + e2(A3 B1 − A1 B3) + e3(A1 B2 − A2 B1).

Then, for a vector field v(x) in 3 dimensions, define the curl of v as

∇ × v(x) ≡ curl v ≡ | e1     e2     e3   |
                    | ∂/∂x   ∂/∂y   ∂/∂z |
                    | v1     v2     v3   |,

where we have to expand this always making sure that the derivatives ∂/∂xi are on the left of the vi. So
therefore we have

∇ × v(x) = e1(∂v3/∂y − ∂v2/∂z) + e2(∂v1/∂z − ∂v3/∂x) + e3(∂v2/∂x − ∂v1/∂y).

Note that the curl of a vector field is therefore a new vector field.
Example 11. If v : R³ → R³ is a vector field, and can be expressed in terms of its components as
v = (x²z, xyz, x), then

curl v = ∇ × v = | e1     e2     e3   |
                 | ∂/∂x   ∂/∂y   ∂/∂z |
                 | x²z    xyz    x    |
       = e1(∂x/∂y − ∂(xyz)/∂z) + e2(∂(x²z)/∂z − ∂x/∂x) + e3(∂(xyz)/∂x − ∂(x²z)/∂y)
       = −xy e1 + (x² − 1) e2 + yz e3

Note that since ∇ × v is a vector field, we can calculate its divergence. In the case of v being the vector
field from Example 11,

∇.(∇ × v) = ∂/∂x(−xy) + ∂/∂y(x² − 1) + ∂/∂z(yz)
          = −y + 0 + y = 0.

It turns out that this is always true so long as v has components with continuous second partial derivatives.
We'll come back to this in subsection 3.3.
The curl of a vector field v tells us how much v is 'curling around' at a point. If we imagine our vector
field v as a fluid, like when we thought about the meaning of the divergence (see subsection 3.1), the
magnitude of the curl then tells us about the rotational speed of the fluid, and the direction of the
curl tells us which axis the fluid is rotating around. This axis is determined using the so-called
'right-hand rule': if you curl the fingers of your right hand, such that your fingers represent the rotation
of the fluid, then your thumb points in the direction of the curl vector.
Example 12. Consider the vector field v with components v = (−y, x, 0). To imagine this vector field,
realise that it is independent of z, and so you can imagine the vector field in the z = 0 plane; the
vector field in any other plane of constant z is then just a translation of the vector field at z = 0. This z = 0
plane is shown in Figure 12.

Figure 12: The plane z = 0 of the vector field v = (−y, x, 0), where the horizontal axis of the image is
the x-axis, and the vertical axis is the y-axis. You should imagine the z-axis as coming straight out of
the page.

The curl of this vector field can then easily be checked to be

∇ × v = (0, 0, 2),

so it is a vector field of constant magnitude pointing in the positive z direction, as one would expect using
the right-hand rule.
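
The curl in Example 12 (and in Example 11) can be checked directly from the component formula. The sketch below (a SymPy illustration, not a required part of the notes) implements the determinant expansion given above as a small helper function.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)

def curl(v):
    # Components of curl v from the determinant expansion above
    return sp.Matrix([
        sp.diff(v[2], y) - sp.diff(v[1], z),
        sp.diff(v[0], z) - sp.diff(v[2], x),
        sp.diff(v[1], x) - sp.diff(v[0], y),
    ])

print(curl([-y, x, sp.Integer(0)]).T)        # [0, 0, 2], Example 12
print(curl([x**2 * z, x * y * z, x]).T)      # [-x*y, x**2 - 1, y*z], Example 11
```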

Properties of curl:
Let a, b be constants, u, v be vector fields, and f be a scalar field, all in R³. Then:

(i) ∇ × (au + bv) = a ∇ × u + b ∇ × v
(ii) ∇ × (f v) = (∇f) × v + f ∇ × v

Proof:
(i) follows from linearity of derivatives, as before.
Lets check (ii):
 
@ @ @ @
r ⇥ (f v) =e1 (f v3 ) (f v2 ) + e2 (f v1 ) (f v3 )
@y @z @z @x

@ @
+ e3 (f v2 ) (f v1 )
@x @y
✓ ◆ ✓ ◆
@f @v3 @f @v2
= e1 v3 + f v2 f + [. . .] + [. . .]
@y @y @z @z
✓ ◆ ✓ ◆ 
@f @f @v3 @v2
= e1 v3 v2 + e1 f + ··· + ...
@y @z @y @z
= rf ⇥ v + f r ⇥ v

This is tedious. Once we develop our index notation in section 4 we will be able to do this in a slicker
way.

3.3 Applying ∇ twice

If f : Rⁿ → R is a scalar field then ∇f is a vector field and we can take its divergence. This is

∇.(∇f) = div grad f ≡ ∇²f ≡ Δf
       = (e1 ∂/∂x1 + · · · + en ∂/∂xn) . (e1 ∂/∂x1 + · · · + en ∂/∂xn) f
       = ∂²f/∂x1² + · · · + ∂²f/∂xn²

in cartesian coordinates - note that it will look different in other coordinates if the basis vectors are not
constant (as in the case of polar coordinates for example).
Δf ≡ ∇²f is called the Laplacian of f. Since the Laplacian is the divergence of a vector field, it is a
scalar field.
Note that you actually met the Laplacian in Calculus I, when you considered Linear Partial Differential
Equations. In particular, the Laplacian is a differential operator that appears in Laplace's Equation, the
Heat Equation and the Wave Equation. More on this next term.

Example 13. For n = 2, let f = log(x² + y²) = log(x.x). Then

∂f/∂x = 2x/(x² + y²),   ∂²f/∂x² = 2/(x² + y²) − 4x²/(x² + y²)² = 2(y² − x²)/(x² + y²)²
∂f/∂y = 2y/(x² + y²),   ∂²f/∂y² = 2(x² − y²)/(x² + y²)²   (just swap x and y)
⇒ ∇²f = ∂²f/∂x² + ∂²f/∂y² = 2(y² − x²)/(x² + y²)² + 2(x² − y²)/(x² + y²)²
      = 0   if x ≠ 0

Example 14. Again, with n = 2, let f = x³ − 3xy². Then

∂²f/∂x² = ∂/∂x(3x² − 3y²) = 6x
∂²f/∂y² = ∂/∂y(−6xy) = −6x
⇒ ∇²f = 0

Notice: If we let z = x + iy, then f = Re[(x + iy)³] = Re[z³]. If you study Complex Analysis II, you will
see that this is a differentiable complex function. As such, the real and imaginary parts of the function
satisfy the Cauchy-Riemann equations. One can show (exercise) that in this case the real and imaginary
parts of the function both have Laplacian equal to 0.
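
Both Laplacian computations above can be confirmed with SymPy. This sketch (an illustration, not part of the examinable material) applies ∂²/∂x² + ∂²/∂y² to the fields of Examples 13 and 14.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

def laplacian_2d(f):
    return sp.diff(f, x, 2) + sp.diff(f, y, 2)

f13 = sp.log(x**2 + y**2)          # Example 13
f14 = x**3 - 3 * x * y**2          # Example 14 (the real part of z^3)

print(sp.simplify(laplacian_2d(f13)))   # 0 (away from the origin)
print(sp.simplify(laplacian_2d(f14)))   # 0
```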
For n = 3 (i.e. in R³) there are a couple of other natural combinations. Firstly, since the gradient of a
scalar field is a vector field, we can take its curl:

∇ × ∇f = | e1      e2      e3    |
         | ∂/∂x    ∂/∂y    ∂/∂z  |
         | ∂f/∂x   ∂f/∂y   ∂f/∂z |
       = e1(∂/∂y ∂f/∂z − ∂/∂z ∂f/∂y) + e2(∂/∂z ∂f/∂x − ∂/∂x ∂f/∂z) + e3(∂/∂x ∂f/∂y − ∂/∂y ∂f/∂x)
       = 0,

assuming the 2nd partial derivatives of f are continuous (so we can equate ∂²f/∂y∂z = ∂²f/∂z∂y etc.). Also in
R³, we can find the divergence of a curl. We did this once following example 11, but in general

∇.(∇ × v) = ∂/∂x(∂v3/∂y − ∂v2/∂z) + ∂/∂y(∂v1/∂z − ∂v3/∂x) + ∂/∂z(∂v2/∂x − ∂v1/∂y)
          = 0,

again, assuming the 2nd partial derivatives of v are continuous. Later we will redo these cases in a more
elegant way using index notation.

4 Index notation
4.1 Einstein Summation Convention
Recall that in n dimensions, the indices i, j etc. labelling the components of vectors run from 1 to n,
e.g. we write v = (v1 , v2 , . . . , vn ).
Einstein spotted that in quantities like

u.v = Σ_{i=1}^n ui vi
∇.u = Σ_{i=1}^n ∂ui/∂xi
((a.x)x)i = Σ_{j=1}^n aj xj xi,

the index to be summed appears exactly twice in a term or product of terms, while all other indices
appear only once (the reason for this is to do with invariance under rotations, or for those of you studying
Special Relativity this year, Lorentz transformations). He suggested dropping the summation sign, with
the convention that wherever an index is repeated you sum over it.
So, we write

u.v = Σi ui vi = ui vi
∇.u = Σi ∂ui/∂xi = ∂ui/∂xi
((a.x)x)i = aj xj xi.
We call the repeated indices dummy indices, and those that are not repeated are called free indices. The
dummy indices can be renamed without changing the expression, i.e.

aj xj xi = ak xk xi,

since clearly

Σ_{j=1}^n aj xj xi = Σ_{k=1}^n ak xk xi = (a.x)xi.

However, the free indices must match on both sides of an equation.


We must also be careful never to repeat an index more than twice in any single term or product of terms
in an expression. If we were to write ai xi xi , we can’t tell whether this is supposed to be the component
form of (a.x)xi, or of (x.x)ai. So if we want to write (u.v)² in index notation, we should write ui vi uj vj
and not ui vi ui vi.
Although writing an expression like this in index notation can sometimes look messy, we'll see shortly
that it can be incredibly efficient for calculations.

4.2 The Kronecker delta, δij

A very useful object for manipulating expressions in terms of their components is the Kronecker delta.
This is an object with two indices, defined by

δij ≡ { 1  i = j
     { 0  i ≠ j.

We can think of the Kronecker delta as the components of the n × n identity matrix I, and in fact this
was how you first met δij. E.g. for n = 3,

I = ( 1 0 0 )
    ( 0 1 0 )   ⇒  Iij = δij,
    ( 0 0 1 )

where Iij represents the element of I on the ith row and jth column.
The Kronecker delta appears naturally when we take partial derivatives, as we have

∂xi/∂xj = δij

Example 15. If δij is the 3-dim Kronecker delta, and A = (A1, A2, A3), simplify

1. As δts
2. δrs δst
3. δrs δsr

Answers:
1.
As δts = A1 δt1 + A2 δt2 + A3 δt3   (0 if t ≠ 3 etc.)
       = At

2.
δrs δst = δr1 δ1t + δr2 δ2t + δr3 δ3t   (0 unless r = 3, t = 3 etc.)
        = δrt

From these two examples, we can now see the formal rule:
If a δ has a dummy index, then delete the δ and replace the dummy index in the rest of the
expression by the other index on the deleted δ. E.g. As δts = At.

3.
δrs δsr = δrr   (by (2.))
        = δ11 + δ22 + δ33 = 3

4.3 The Levi-Civita symbol, εijk

If we are working in 3 dimensions, there is another device which is useful for handling expressions
involving the vector cross product, such as A × (B × C) or ∇ × (∇ × v).
Consider C = A × B. If A = A1 e1 + A2 e2 + A3 e3, and B = B1 e1 + B2 e2 + B3 e3, then

C = A × B = | e1  e2  e3 |
            | A1  A2  A3 |
            | B1  B2  B3 |
  = e1(A2 B3 − A3 B2) + e2(A3 B1 − A1 B3) + e3(A1 B2 − A2 B1).

The components of C are then given by

C1 = A2 B3 − A3 B2
C2 = A3 B1 − A1 B3
C3 = A1 B2 − A2 B1.

We can write these equations as a single equation, by introducing εijk, a set of numbers labelled by three
indices i, j and k, each of which is equal to 1, 2 or 3.
This symbol is known as the Levi-Civita symbol, εijk, and is defined by:

I. εijk = −εjik = −εikj (antisymmetric)
II. ε123 = 1

These definitions imply the following properties:

1. εijk = −εkji (i.e. also antisymmetric when swapping 1st and 3rd index).
Proof:
εijk = −εjik = +εjki = −εkji

2. εijk = 0 if any two indices have the same value.
Proof: e.g.
ε112 = −ε112 ⇒ 2ε112 = 0 ⇒ ε112 = 0

3. The only non-zero εijk therefore have i, j, k all different (by property 2.), so (ijk) is some permutation
of (123).
4. εijk = +1 if ijk is an even permutation of 123 ("even" = even # swaps)
   εijk = −1 if ijk is an odd permutation of 123 ("odd" = odd # swaps)
5. εijk = εkij = εjki (cyclic permutations).
So e.g. 1 = ε123 = ε312 = ε231, and −1 = ε132 = ε321 = ε213.

Most importantly, now the vector product C = A × B can be written as:

Ci = εijk Aj Bk.   (4.1)

Check:

C1 = Σ_{j,k=1}^3 ε1jk Aj Bk = ε1jk Aj Bk
   (must have j, k ≠ 1 and j ≠ k by property 2.,
    so either j = 2, k = 3 or j = 3, k = 2)
   = ε123 A2 B3 + ε132 A3 B2
   (but ε123 = 1 by (II) above, and ε132 = −1 by both (I) and (II))
   = A2 B3 − A3 B2

I leave it as an exercise for you to check the other two components C2 and C3.
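
The Levi-Civita symbol is easy to build as a 3×3×3 array, after which Ci = εijk Aj Bk is a single einsum call. The sketch below (a NumPy illustration, not part of the notes; the vectors are arbitrary, and indices run 0-2 rather than 1-3) checks equation (4.1) against NumPy's built-in cross product.

```python
import numpy as np

# Build eps[i,j,k] = +1 / -1 for even / odd permutations of (0,1,2), else 0
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0      # even permutations
    eps[i, k, j] = -1.0     # odd permutations

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])

C = np.einsum('ijk,j,k->i', eps, A, B)    # C_i = eps_ijk A_j B_k
print(C)                                  # [-3.  6. -3.]
print(np.cross(A, B))                     # the same
```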

4.4 The very useful εijk formula

If we want to write out the vector triple product A × (B × C), for example, we'll need to be able to put
these ε's together. Luckily there is a neat way to do it:

Σ_{k=1}^3 εijk εklm = δil δjm − δim δjl.   (†)   (4.2)

Best just to remember this formula!

Check: Let's first consider the left-hand side (LHS) of Equation (4.2).

LHS of Equation (4.2) = εij1 ε1lm + εij2 ε2lm + εij3 ε3lm   (*)

Now εij1 ε1lm = 0 unless

(i, j) = (2, 3) or (3, 2)
and (l, m) = (2, 3) or (3, 2)

⇒ εij1 ε1lm = { +1 if (l, m) = (i, j)
             { −1 if (l, m) = (j, i),

in which case the second and third terms in (*) will be zero.
A similar argument holds for the other terms in (*), εij2 ε2lm and εij3 ε3lm. Therefore the sum in (*) will be
zero, except when

i ≠ j, l ≠ m, (i, j) = (l, m) ⇒ (*) = +1
i ≠ j, l ≠ m, (i, j) = (m, l) ⇒ (*) = −1.

Let's now consider the right-hand side (RHS) of Equation (4.2).

RHS of Equation (4.2) = δil δjm − δim δjl

We have δil δjm = 1 if i = l and j = m, i.e. (i, j) = (l, m), otherwise this term is zero.
Similarly, δim δjl = 1 if i = m and j = l, i.e. (i, j) = (m, l), otherwise this term is zero.
The combination of both terms is zero if either i = j or l = m.
This implies that LHS of Equation (4.2) = RHS of Equation (4.2), and hence the formula holds.
Example 16. Show that A × (B × C) = B(A.C) − C(A.B).
To do this using index notation, we should compute the ith component of A × (B × C). We can write
this as [A × (B × C)]i. Going slowly, this gives

[A × (B × C)]i = εijk Aj (B × C)k              (j, k: dummy indices, i: free index, = 1, 2, 3)
              = εijk Aj εklm Bl Cm             (l, m: more dummy indices)
              = (δil δjm − δim δjl) Aj Bl Cm   (by useful formula (†))
              = δil δjm Aj Bl Cm − δim δjl Aj Bl Cm
              = δil Aj Bl Cj − δim Aj Bj Cm    (by the rule for δ)
              = Aj Bi Cj − Aj Bj Ci            (by the rule for δ)
              = Bi (Aj Cj) − Ci (Aj Bj)
              = Bi (A.C) − Ci (A.B)
              = [B(A.C) − C(A.B)]i,

i.e. the ith component of A × (B × C) = the ith component of B(A.C) − C(A.B). Since this is true for all
i = 1, 2, 3 the result follows.
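
Both the ε-δ formula (4.2) and the triple product identity of Example 16 can be verified numerically by brute force. This sketch (a NumPy illustration, not part of the notes; the random vectors are arbitrary) rebuilds the eps array from the previous sketch and checks both statements.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0
delta = np.eye(3)

# Check eps_ijk eps_klm = delta_il delta_jm - delta_im delta_jl for all i, j, l, m
lhs = np.einsum('ijk,klm->ijlm', eps, eps)
rhs = np.einsum('il,jm->ijlm', delta, delta) - np.einsum('im,jl->ijlm', delta, delta)
print(np.allclose(lhs, rhs))               # True

# Check A x (B x C) = B(A.C) - C(A.B) on random vectors
rng = np.random.default_rng(0)
A, B, C = rng.standard_normal((3, 3))
print(np.allclose(np.cross(A, np.cross(B, C)), B * (A @ C) - C * (A @ B)))  # True
```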

4.5 Applications
We’ve already seen in example 16, that index notation can be used to prove the vector triple product
identity, A ⇥ (B ⇥ C) = B(A.C) C(A.B).
For many vector calculus calculations we need ∇f, ∇.v and ∇ × v in index notation.
• The ith component of ∇f is simply

(∇f)i = ∂f/∂xi.   (4.3)

A common notation used to simplify this further is to write ∂/∂xi ≡ ∂i, so then we can write

(∇f)i = ∂i f.   (4.4)

• ∇.v can be thought of simply as A.v where A has "components" ∂/∂xi:

∇.v = ∂vi/∂xi,   (4.5)

and using the notation from above this can also be written as

∇.v = ∂i vi.   (4.6)

• Similarly, ∇ × v is like A × v. So if u = ∇ × v, then its ith component is:

ui = (∇ × v)i = εijk ∂vk/∂xj,   (4.7)

and using the same notation as above this can also be written as

ui = (∇ × v)i = εijk ∂j vk.   (4.8)

Here are some more examples that show how useful index notation can be for proving identities in vector
calculus.
Example 17. Find the gradient of f(x) = |x|² in Rⁿ.

f(x) = |x|² = x.x = xi xi = xj xj
⇒ (∇f)i = ∂/∂xi (xj xj)                               (important not to reuse the dummy index j)
        = (∂xj/∂xi) xj + xj (∂xj/∂xi)                 (product rule)
        = 2 δij xj = 2xi = (2x)i
⇒ ∇f = 2x

Example 18. Find the divergence of v(x) = x and u(x) = (a.x)x in R³.

vi = xi
⇒ ∇.v = ∂xi/∂xi = δii = 3

While ui = (a.x)xi = aj xj xi
⇒ ∇.u = ∂/∂xi (aj xj xi)
      = aj ((∂xj/∂xi) xi + xj (∂xi/∂xi))
      = aj (δij xi + xj δii)
      = aj (xj + 3xj) = 4 aj xj
      = 4 a.x

Example 19. Find the curl of v(x) = x and u(x) = (a.x)x in R³.

vi = xi
⇒ (∇ × v)i = εijk ∂xk/∂xj
           = εijk δjk
           = εijj
           = εi11 + εi22 + εi33
           = 0

While uk = aj xj xk = al xl xk
⇒ (∇ × u)i = εijk ∂uk/∂xj
           = εijk ∂/∂xj (al xl xk)
           = al εijk ((∂xl/∂xj) xk + xl (∂xk/∂xj))
           = al εijk (δlj xk + xl δkj)
           = al εijk δlj xk + al εijk xl δkj
           = aj εijk xk + al εijj xl
           = εijk aj xk + 0
           = (a × x)i
⇒ ∇ × u = a × x

Example 20. Show that ∇.(∇ × v) = 0 (if vk has continuous 2nd partial derivatives).

∇.(∇ × v) = ∂/∂xi (∇ × v)i
          = ∂/∂xi (εijk ∂vk/∂xj)
          = εijk ∂/∂xi ∂vk/∂xj

Now εijk is anti-symmetric when i and j swap, whereas ∂²vk/∂xi∂xj is symmetric, so the answer is zero(!).
In more detail:

∇.(∇ × v) = εijk ∂²vk/∂xi∂xj
          = εjik ∂²vk/∂xj∂xi      (swapping labelling of dummy indices i and j)
          = −εijk ∂²vk/∂xj∂xi     (anti-symmetry of εijk)
          = −εijk ∂²vk/∂xi∂xj     (symmetric derivatives, due to continuity)
          = −∇.(∇ × v)
⇒ ∇.(∇ × v) = 0

Much more elegant (and quicker!) than writing it all out.
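
The identity of Example 20 can also be confirmed for a concrete field. The sketch below (a SymPy illustration; the particular smooth field v is an arbitrary choice, not one from the notes) checks that ∇.(∇ × v) simplifies to 0.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
coords = [x, y, z]

v = [x**2 * sp.sin(y * z), sp.exp(x) * y, x * y * z**3]   # any smooth vector field

curl_v = [
    sp.diff(v[2], y) - sp.diff(v[1], z),
    sp.diff(v[0], z) - sp.diff(v[2], x),
    sp.diff(v[1], x) - sp.diff(v[0], y),
]
div_curl_v = sum(sp.diff(c, xi) for c, xi in zip(curl_v, coords))
print(sp.simplify(div_curl_v))             # 0
```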

5 Differentiability of scalar fields
We first met ∇ in Section 2, when trying to work out how a scalar field f(x) varied as we moved along
a parametric curve x(t),

df(x(t))/dt = dx(t)/dt . ∇f.

This follows from the chain rule, which as we noted at the end of Subsection 1.3, holds for differentiable
functions f. So, when are our scalar fields differentiable?
In the Epiphany term of Calculus I, you saw that a function of two variables f(x, y) was differentiable
at a point x0 = (x0, y0) ∈ D, if ∃ δ > 0 such that |x − x0| < δ ⟹ x ∈ D and

f(x, y) = f(x0, y0) + M(x − x0) + N(y − y0) + R(x, y),   (5.1)

such that

lim_{x→x0} R(x, y) / |x − x0| = 0.

You also saw that the constants M and N were given by the partial derivatives ∂f/∂x(x0, y0) and ∂f/∂y(x0, y0)
respectively, and hence Equation (5.1) can be written as

f(x, y) − f(x0, y0) = (x − x0) ∂f/∂x(x0, y0) + (y − y0) ∂f/∂y(x0, y0) + R(x, y),   (5.2)

where as before

lim_{x→x0} R(x, y) / |x − x0| = 0.

We can understand this as saying that a function of two variables f(x, y) is differentiable at a point x0,
if the function is 'flat' in a small neighbourhood of x0. That is, if f looks like a plane near x0. This is
because Equation (5.1) (or equivalently Equation (5.2)) is linear in both the change in the x-direction
and in the y-direction.
We now want to generalise this using vector notation to define when scalar fields f : Rⁿ → R are
differentiable.

Figure 13: In a small neighbourhood of the origin, the function f (x, y) = x2 + y 2 looks like the plane
z = 0.

5.1 Continuity
Note that the definition of differentiability for a function of two variables requires us to define the limit
lim_{x→x0}.
Definition 5.1. We say that a scalar field f(x) : Rⁿ → R tends to L as x tends to a, if the difference
between f and L can be made as small as we please by taking x sufficiently close to a (without being
equal to a).
We can phrase this more precisely, as

lim_{x→a} f(x) = L  if  ∀ ε > 0, ∃ δ > 0 s.t. |f(x) − L| < ε  ∀ x s.t. 0 < |x − a| < δ.

We can now define what it means for a scalar field f to be continuous in terms of this definition of the
limit.
Definition 5.2. f(x) is continuous at a if lim_{x→a} f(x) exists and is equal to f(a).
Example 21. If f(x) = x² + y², show that f(x) is continuous at the origin.
Given ε, pick δ = √ε. Then

|x| < δ
⇒ √(x² + y²) < √ε
⇒ x² + y² < ε.

So for all x with 0 < |x| < δ we have

|f(x)| = |f(x) − 0| < ε
⇒ lim_{x→0} f(x) = 0.

Since f(0) = 0 = lim_{x→0} f(x), f(x) is continuous at the origin.


Example 22. Show that f(x) = xy/(x² + y²) has no limit as x → 0, even though lim_{x→0} f(x, 0) and lim_{y→0} f(0, y)
both exist (this illustrates the difference between limits in R and R²).
For x ≠ 0, f(x, 0) = 0 and for y ≠ 0, f(0, y) = 0

⇒ lim_{x→0} f(x, 0) = 0 and lim_{y→0} f(0, y) = 0,

so we would need lim_{x→0} f(x) = 0. To prove this we would need to show that for any ε (in particular
small values), there's a δ such that

|xy / (x² + y²)| < ε   ∀ x s.t. 0 < |x| < δ.

But if y = x then

xy / (x² + y²) = x² / (x² + x²) = 1/2   for all x, (even very small!)

so the condition can't be fulfilled. Therefore, f(x) is not continuous at the origin.
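
The failure of the limit in Example 22 is easy to see numerically: approaching the origin along different straight lines gives different values. The sketch below (a NumPy illustration, not part of the notes) evaluates f along the x-axis and along the line y = x.

```python
import numpy as np

def f(x, y):
    return x * y / (x**2 + y**2)

ts = np.array([1e-1, 1e-3, 1e-6])

print(f(ts, 0 * ts))     # along y = 0: all values are 0
print(f(ts, ts))         # along y = x: all values are 0.5, however small t is
```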
However, the following results help to prove when functions are continuous:
Theorem 5.3. 1. If f and g are continuous functions at a then so are
• f + g,
• f g,
• and f/g, provided that g(a) ≠ 0.
2. Both
f(x) = constant, and f(x) = f(x1, x2, . . . , xn) = xa, a = 1, . . . , n
are continuous at all points in Rⁿ.

The proof of this theorem is left as an exercise.


Example 23. f (x, y) = x and g(x, y) = y are both continuous at the origin by part 2 of Theorem 5.3,
and so by repeated use of part 1 of the theorem, so is
x3 y 3
h(x, y) = .
1 + x2

5.2 Open sets


The definition of the limit of f as x → a (Definition 5.1) requires us to consider the set of all points x
such that 0 < |x − a| < δ for a δ > 0. To define differentiability for our scalar fields, we therefore want
to introduce the notion of an open set, where the assumption that such points exist in the domain of our
function f is satisfied.
We therefore make the following definitions, each of which may be well understood by the accompanying
figure.
Definition 5.4. An open ball with centre a ∈ Rⁿ and radius δ > 0 is the set of points

B_δ(a) = {x ∈ Rⁿ : |x − a| < δ}.

An example of an open ball of radius δ about the point a ∈ Rⁿ is shown in Figure 14.

Figure 14: An open ball of radius δ about the point a ∈ Rⁿ.

Definition 5.5. A subset S of Rⁿ is open (S is an open set) if for each point a ∈ S there is an open
ball B_δ(a) which is also in S (where δ might depend on a).
An example of an open set S is shown in Figure 15, where an open ball around a point a is shown.

Figure 15: An open set S ⊂ Rⁿ, with the open ball around a point a ∈ S shown.

Definition 5.6. A neighbourhood N of a point a ∈ Rⁿ is a subset of Rⁿ which includes an open set
containing a.
A neighbourhood N of a point a ∈ Rⁿ is shown in Figure 16, where an open set S around a is shown.

Figure 16: A neighbourhood N of a point a ∈ Rⁿ, with an open set S around a shown.

Definition 5.7. A set S ⊆ Rⁿ is closed if its complement in Rⁿ (S^c ≡ Rⁿ \ S, points not in S) is open,
e.g.

B̄_δ(a) = {x ∈ Rⁿ : |x − a| ≤ δ}

is the closed ball. The complement of the closed ball,

B̄_δ^c(a) = {x ∈ Rⁿ : |x − a| > δ}
is an open set, and so the closed ball is closed.
Example 24. • D = {(x, y) ∈ R² : x > 0} is open, as given a = (a1, a2) ∈ D, pick δ = a1/2; then
B_δ(a) is in D. This is demonstrated in Figure 17.

The open set D / An open ball in D

Figure 17: The set D = {(x, y) ∈ R² : x > 0} is an open set, as we can find an open ball around every
point in D.

• E = {(x, y) ∈ R² : x ≥ 0} is not open, as e.g. the point (0, 0) is in E, but for all δ > 0 there
are points in B_δ(0) not in E. This is demonstrated in Figure 18. The set E is a closed set, as the
complement of E, E^c = {(x, y) ∈ R² : x < 0}, is an open set.

The set E / The origin in E isn't contained in any open ball within E

Figure 18: The set E = {(x, y) ∈ R² : x ≥ 0} is not an open set, as we can find a point in E (the origin)
around which no open ball lies entirely in E. This set is a closed set, as it is the complement of an open set.

• Likewise F = {(x, y) ∈ R² : 1 < x < 3, 1 < y < 2} is open, but G = {(x, y) ∈ R² : 1 ≤ x ≤ 3, 1 <
y < 2} is not (but G is not closed either!). This is demonstrated in Figure 19.
Example 25. Every open ball B_δ(a) in Rⁿ is open. The proof of this is left as an exercise.
Definition 5.8. If U is an open subset of Rⁿ and f : U → R is a function on U, then f is said to be
continuous on U if it is continuous at each point in U.
The open set F The set G which is neither open nor closed

Figure 19: The set F = {(x, y) 2 R2 : 1 < x < 3, 1 < y < 2} is an open set, but the set G = {(x, y) 2
R2 : 1  x  3, 1 < y < 2} is neither open (since points along the left and right edges aren’t contained
in any open balls), nor closed (since the complement is the outside of the rectangle, along with the top
and bottom edges of the rectangle, and hence contains points which aren’t contained in any open ball).

5.3 Differentiable maps Rⁿ → R

We're now ready to define what it means for a scalar field f(x) : Rⁿ → R to be differentiable. Remember,
a function of two variables f(x, y) : D → R for D ⊆ R² was differentiable at a point x0 = (x0, y0) ∈ D,
if ∃ δ > 0 such that |x − x0| < δ ⟹ x ∈ D and

f(x, y) − f(x0, y0) = (x − x0) ∂f/∂x(x0, y0) + (y − y0) ∂f/∂y(x0, y0) + R(x, y),   (†)

such that

lim_{x→x0} R(x, y) / |x − x0| = 0.

We can now generalise this to Rⁿ using vector notation.


Definition 5.9. Suppose U is an open set in Rⁿ, and f : U → R is a scalar field (function) on U. f is
differentiable at a ∈ U if we can find a vector v(a) such that

f(a + h) − f(a) = h.v(a) + R(h)   (*)

with lim_{h→0} R(h)/|h| = 0.   (**)

We can regard (*) in the definition as the definition of R(h), so part (**) is the meaningful part of the
definition of differentiability for a scalar field. Note that the term h.v(a) is linear in h (which represents
the small difference between our points a and a + h). Equation (*) therefore takes the same form as
Equation (†) above, where the first two terms on the right-hand side are also linear in the differences
(x − x0) and (y − y0).
If v exists, it is equal to ∇f, since we could take h = h ei, so h = ±|h|. Then by the definition of the
gradient,

(∇f)i = ∂f/∂xi = lim_{h→0} [f(a + h ei) − f(a)] / h
      = lim_{h→0} [h ei.v(a) + R(h)] / h     (by (*))
      = lim_{h→0} [ei.v(a) + R(h)/h]
      = ei.v(a) ± lim_{h→0} R(h)/|h|
      = ei.v(a) ± 0     (by (**))
      = vi(a),

i.e. the components of ∇f are the components of v.
Warning: the components of ∇f might be well-defined at a point but this does not ensure that f is
differentiable there.
Example 26. f : R² → R,

f(x, y) = { xy/(x² + y²)   x ≠ 0
          { 0              x = 0

At the origin

∂f/∂x = lim_{h→0} [f(h, 0) − f(0, 0)]/h = lim_{h→0} [0/(h² + 0) − 0]/h = 0,

and similarly ∂f/∂y = 0 at the origin, so ∇f(0) has components (0, 0), and therefore if v exists it must be (0, 0)
as well.
Task 1: Define R. With h = (h1, h2),

f(h1, h2) − f(0, 0) = h.∇f(0) + R(h)
                    = 0 + R(h)
⇒ R(h) = h1 h2 / (h1² + h2²)

Task 2: Examine the behaviour of lim_{h→0} R(h)/|h|:

R(h)/|h| = [h1 h2 / (h1² + h2²)] / |h|
         = h1 h2 / (h1² + h2²)^(3/2)

On the line h2 = h1,

R(h)/|h| = h1² / (2h1²)^(3/2) = 1 / (2^(3/2) |h1|),

which does not tend to zero as h1 → 0, therefore lim_{h→0} R(h)/|h| does not exist, and so f is not
differentiable at the origin.
Note that for x ≠ 0,

∂f/∂x = ∂/∂x [xy/(x² + y²)] = y/(x² + y²) − 2x²y/(x² + y²)² = y(y² − x²)/(x² + y²)²,

so ∂f(0, y)/∂x = 1/y, and ∂f/∂x is not continuous at 0.
5.4 Continuous Differentiability
In fact there is a theorem that continuity of partial derivatives implies differentiability.
Theorem 5.10. Let f : U → R be a function on an open set U ⊂ Rⁿ, and suppose a ∈ U. If all partial
derivatives of f exist and are continuous in a neighbourhood of a, then f is differentiable at a.
(You can quote this theorem without proof.)
Definition 5.11. A function is continuously differentiable at a if it and all of its partial derivatives
exist and are continuous at a; it is continuously differentiable on an open set U if it and all of its partial
derivatives exist and are continuous on U.
By the theorem, if a function is continuously differentiable on an open set then it is also differentiable
there. This is important: continuous differentiability is easier to check than differentiability.
Note: functions for which partial derivatives of all orders exist are called smooth functions.
Theorem 5.12. If f, g : U → R (U open in Rⁿ) are differentiable (respectively smooth) at x = a ∈ U
then so are
• f + g
• f g
• f/g, provided g(a) ≠ 0.
You can quote these facts!
Example 27. Where is f(x, y) = y|x − 2| continuously differentiable and where is it differentiable? A
plot of this function is shown in Figure 20.

Figure 20: A plot of the function f(x, y) = y|x − 2| for 0 ≤ x ≤ 4 and −5 ≤ y ≤ 5.

Answer: It is continuously di↵erentiable where its partial derivatives are continuous. So calculate
@f @f
For x > 2 : f = y(x 2) ) = y, = x 2 (continuous)
@x @y
@f @f
For x < 2 : f= y(x 2) ) = y, = x + 2 (continuous).
@x @y

@f (2, y) f (2 + h, y) f (2, y)
For x = 2 : = lim
@x h!0 h
y|h|
= lim .
h!0 h

For y 6= 0, the limit of this for h < 0 is di↵erent to the limit for h > 0. Hence the limit does not exist,
and so the partial derivative does not exist for x = 2, y 6= 0. If y = 0, the limit exists and is equal to 0.

@f (2, y) f (2, y + h) f (2, y)


= lim
@y h!0 h
0 0
= lim = 0
h!0 h

Since @f
@x does not exist for x = 2, y 6= 0 it cannot be continuous for x = 2 (even at y = 0) so f is not
continuously di↵erentiable for x = 2 (even at (2,0)).

=) f is continuously di↵erentiable at {(x, y) 2 R2 : x 6= 2}

By the di↵erentiability theorem (Theorem 5.10), this means that f is also di↵erentiable at all points
{(x, y) 2 R2 : x 6= 2}. rf does not exist for x = 2, y 6= 0 so f not di↵erentiable there. But what about
x = (2, 0) = 2e1 ?
The partial derivatives do exist there (and are both zero) so f might be di↵erentiable there.

As rf (2e1 ) = (0, 0)
) R(h) = f (2e1 + h) f (2e1 ) h.rf (2e1 )
= h2 |h1 | 0 0
R(h) h2 |h1 |
) =
|h| |h|
|h2 ||h1 | |h| |h|
= < = |h|
|h| |h|

Now, since
R(h)
lim = lim |h| = 0,
h!0 |h| h!0

and
R(h) R(h) R(h)
  ,
|h| |h| |h|
we have that
R(h)
lim =0
h!0 |h|
by the squeezing theorem, and so f is di↵erentiable at 2e1 .

=) f is di↵erentiable at {(x, y) 2 R2 : x 6= 2} [ {(2, 0)}

5.5 The chain rule revisited
Now that we’ve defined di↵erentiability for scalar fields, we can give a proof of the chain rule for such
fields.
Theorem 5.13. If f (x) : U ! R is di↵erentiable with U an open set in Rn , and if x is a function of
@x1 @x1 @x2 @x2 @xn
u1 , u2 , . . . um such that the partial derivatives @u ,
1 @u2
, . . . @u ,
1 @u2
, . . . @u m
exist (that is all the partial
@xi
derivatives @uj for 1  i  n and 1  j  m exist), and if

F (u1 , . . . , um ) = f (x(u1 , . . . , um )),


@F @x1 @f @x2 @f @xn @f @xi @f
then = + + ··· + = (Using ESC)
@ub @ub @x1 @ub @x2 @ub @xn @ub @xi
@x
= .rf
@ub

Proof. We have

    f : U ⊂ ℝⁿ → ℝ,    x : ℝᵐ → ℝⁿ,    F : ℝᵐ → ℝ,

where F(u) = f(x(u)). Set

    x(u1, . . . , u_b, . . . , u_m) = a    and    x(u1, . . . , u_b + k, . . . , u_m) = a + h(k).

Then

    ∂F/∂u_b = lim_{k→0} [F(u1, . . . , u_b + k, . . . , u_m) − F(u1, . . . , u_b, . . . , u_m)]/k
            = lim_{k→0} [f(x(u1, . . . , u_b + k, . . . , u_m)) − f(x(u1, . . . , u_b, . . . , u_m))]/k
            = lim_{k→0} [f(a + h(k)) − f(a)]/k
            = lim_{k→0} [h(k)·∇f(a) + R(h(k))]/k
              (since f is differentiable, f(a + h) − f(a) = h·∇f(a) + R(h))
            = lim_{k→0} h(k)·∇f(a)/k + lim_{k→0} R(h(k))/k
            = (lim_{k→0} h(k)/k)·∇f(a) + lim_{k→0} R(h(k))/k,                        (a)

since ∇f(a) does not depend on k. Now

    lim_{k→0} h(k)/k = lim_{k→0} [(a + h(k)) − a]/k
                     = lim_{k→0} [x(u1, . . . , u_b + k, . . . , u_m) − x(u1, . . . , u_b, . . . , u_m)]/k
                     = ∂x/∂u_b,                                                      (b)

and

    lim_{k→0} R(h(k))/k = lim_{k→0} [R(h(k))/‖h(k)‖] · [‖h(k)‖/k].

Then since

    lim_{k→0} R(h(k))/‖h(k)‖ = 0   (by definition, since f is differentiable)

and

    lim_{k→0} ‖h(k)‖/k = ± lim_{k→0} ‖h(k)/k‖ = ±‖∂x/∂u_b‖   by (b),

we have

    lim_{k→0} R(h(k))/k = 0 · (±‖∂x/∂u_b‖) = 0.                                      (c)

Combining (a), (b) and (c) therefore gives

    ∂F/∂u_b = (∂x/∂u_b)·∇f(a) = (∂x1/∂u_b)(∂f/∂x1) + · · · + (∂xn/∂u_b)(∂f/∂xn) = (∂x_i/∂u_b)(∂f/∂x_i).
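The chain rule just proved is easy to spot-check numerically. The sketch below is an added illustration,
not part of the original notes; the choices of f and x(u) are arbitrary and it assumes numpy is available.
It compares a finite-difference estimate of ∂F/∂u1 with (∂x/∂u1)·∇f:

    import numpy as np

    def f(x):                       # f(x1, x2) = x1 * exp(x2)
        return x[0] * np.exp(x[1])

    def grad_f(x):
        return np.array([np.exp(x[1]), x[0] * np.exp(x[1])])

    def x_of_u(u):                  # x(u1, u2) = (u1*u2, u1 + u2)
        return np.array([u[0] * u[1], u[0] + u[1]])

    def dx_du1(u):                  # partial derivative of x with respect to u1
        return np.array([u[1], 1.0])

    u = np.array([0.7, -0.3])
    k = 1e-6
    direct = (f(x_of_u(u + [k, 0.0])) - f(x_of_u(u - [k, 0.0]))) / (2 * k)
    chain = dx_du1(u) @ grad_f(x_of_u(u))
    print(direct, chain)            # the two numbers agree to high accuracy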

5.6 The implicit function theorem


We first consider the implicit function theorem in the case of a scalar field on R2 .
Recall: A level set S of f : U ! R, where U is an open subset of Rn , is the set of points {x 2 U : f (x) = c}
for some constant c.
For n = 2, the level set will usually be a curve, called a level curve.
Question: When can the level curve of f : U ! R, with U open in R2 , be written in the form y = g(x)
with g a di↵erentiable function? (Remember, if g(x) is a function it must give a single value g(x) for
each value of the input x.)

Figure 21: A level curve of the function f defined on an open set U , f (x, y) = c. This level curve can
also be given in the form y = g(x) for some di↵erentiable function g.

Example 28. Let U = R2 .


• (A) f(x, y) = x² − y = c.
Here y = g(x) = x² − c. Three level curves are shown in Figure 22.
• (B) f(x, y) = x² + y² = c.
Here we have y = ±√(c − x²); even for c > 0 and x ∈ (−√c, +√c), we can't write the whole level
curves as y = g(x), but we can write bits of them. Three level curves are shown in Figure 23.
• (C) f(x, y) = (x + y)e^{xy}.
A selection of level curves of this function is shown in Figure 24. It's difficult to see where we can
write these level curves in the form y = g(x).

Figure 22: Three level curves of f(x, y) = x² − y. Each one can be written in the form y = g(x) = x² − c.

Figure 23: Three level curves of f(x, y) = x² + y². These can't be written in the form y = g(x) for any
differentiable function g, though we can write sections of the curves in this form.

Figure 24: Level curves of the function f(x, y) = (x + y)e^{xy}. Where can these be written in the form
y = g(x) for g a differentiable function of x?

Terminology: y = g(x) gives y as an explicit function of x; f(x, y) = 0 gives y as an implicit function
of x. To go from explicit to implicit, just set f(x, y) = g(x) − y.
Question: When is it possible to go the other way, from implicit to explicit?
Suppose the level curve f (x, y) = c can be written as y = g(x). Then

f (x, g(x)) = c. (I)

Differentiating this, using the chain rule (the LHS is a function of x only), gives

    d/dx f(x, g(x)) = ∂f/∂x + (dg/dx)(∂f/∂y),

but

    d/dx f(x, g(x)) = dc/dx = 0   (as c is a constant)

    ⇒  dg/dx = − [∂f(x, g(x))/∂x] / [∂f(x, g(x))/∂y].                                (II)

Note: We have a problem if ∂f(x, g(x))/∂y = 0.

Theorem 5.14. The Implicit Function Theorem states that if f(x, y) : U → ℝ is differentiable, and if
(x0, y0) is a point in U on the level curve f(x, y) = c at which ∂f/∂y ≠ 0, then a differentiable function g(x)
of x exists in a neighbourhood of x0 satisfying (I) and (II) with g(x0) = y0.
Example 29. Let's now return to the three examples we saw before.
• (A)

    f(x, y) = x² − y  ⇒  ∂f/∂y = −1 ≠ 0,

so since this is non-zero everywhere, we can find g(x).
• (B)

    f(x, y) = x² + y²  ⇒  ∂f/∂y = 2y,

so if y0 = 0 then ∂f/∂y (x0, y0) = 0. We therefore can't use the IFT to guarantee the existence
of a differentiable function g(x) describing the level curves of f in a neighbourhood of y = 0.
Looking at Figure 25, we see that no single-valued function can describe the level curves of f in a
neighbourhood of y = 0.
• (C)

    f(x, y) = (x + y)e^{xy}
    ⇒  ∂f/∂y = ∂/∂y ((x + y)e^{xy}) = e^{xy} + (x + y)x e^{xy} = (1 + x² + xy)e^{xy},

so

    ∂f/∂y = 0  ⟺  1 + x² + xy = 0  ⇒  y = −(1 + x²)/x = −(x + 1/x).

So we have trouble at points (x0, y0) with y0 = −(x0 + 1/x0), as shown in Figure 26.
Figure 25: In any neighbourhood of y = 0, we can't describe the level curve of f(x, y) = x² + y² by a
differentiable function g(x).

Figure 26: The level curves of f(x, y) = (x + y)e^{xy} can be written in the form y = g(x), for a differentiable
function g, except in a neighbourhood of points on the curve y = −(x + 1/x).

Example 30. If f(x, y) = x²/a² + y²/b², then f(x, y) = c with c > 0 is an ellipse, as shown in Figure 27.

Figure 27: The curve can be written in the form y = g(x) except in a neighbourhood of either point A or B.

f(x, y) is differentiable so, in a neighbourhood of x0, the curve can be written as y = g(x) provided
∂f/∂y (x0, y0) ≠ 0. But ∂f/∂y = 2y/b², so this can be done except at points on the x-axis. In fact

    g(x) = b√(c − x²/a²)    if y0 > 0,
or  g(x) = −b√(c − x²/a²)   if y0 < 0.

At the points A and B, ∂f/∂y = 0 and no differentiable g(x) exists. However at these points ∂f/∂x ≠ 0, so
instead of writing y = g(x) we could find an h(y) such that x = h(y).

If there is a point Q on a level curve f = c at which ∇f = 0 (this is a critical point), the value of c
is called a critical value (otherwise it is a regular value) and the level curve cannot be written either as
y = g(x) or as x = h(y) in the neighbourhood of Q (with g, h differentiable).

Example 31. If f = x³ − y² + 1, then ∇f = 3x² e1 − 2y e2, so ∇f = 0 only at the origin, and c = f(0, 0) = 1
is a critical value for f(x, y) = c. On this level curve,

    f = 1  ⟺  y² = x³, i.e. y = ±√(x³):   no unique function y = g(x) in a neighbourhood of 0;
               x = y^{2/3}:                not differentiable at 0.

Figure 28: Level curves of the function f(x, y) = x³ − y² + 1.

Implicit function theorem for surfaces


All of this generalises to higher dimensions. For example, the level sets of scalar fields in R3 (i.e. Rn
with n = 3) will generally be surfaces. We therefore have the implicit function theorem for surfaces.
Theorem 5.15. Let f(x, y, z) : U → ℝ be differentiable (U an open subset of ℝ³) and let (x0, y0, z0) ∈ U
be a point on the level set f(x, y, z) = c, so f(x0, y0, z0) = c.
If ∂f/∂z (x0, y0, z0) ≠ 0, then the equation f(x, y, z) = c implicitly defines a surface z = g(x, y) in a
neighbourhood of the point (x0, y0, z0), via

f (x, y, g(x, y)) = c, (I)

with g(x0, y0) = z0, and where

    ∂g/∂x (x0, y0) = − [∂f/∂x (x0, y0, z0)] / [∂f/∂z (x0, y0, z0)]                   (IIa)

    ∂g/∂y (x0, y0) = − [∂f/∂y (x0, y0, z0)] / [∂f/∂z (x0, y0, z0)].                  (IIb)

As in the implicit function theorem for curves, conditions (IIa) and (IIb) must hold for g(x, y) if it
exists, since

    f(x, y, g(x, y)) = c

for all (x, y) in some neighbourhood of (x0, y0). If we partially differentiate both sides with respect to x
and use the chain rule:

    0 = ∂/∂x (f(x, y, g(x, y))) = (∂x/∂x)(∂f/∂x) + (∂y/∂x)(∂f/∂y) + (∂g/∂x)(∂f/∂z)
      = ∂f/∂x + (∂g/∂x)(∂f/∂z),

as ∂x/∂x = 1, ∂y/∂x = 0 and z = g(x, y). Hence

    ∂g/∂x = − (∂f/∂x) / (∂f/∂z)

as required. The argument follows similarly for ∂g/∂y.

Recall: As we saw in Subsection 2.2, ∇f at x0 = (x0, y0, z0) is normal to the tangent plane of the
surface z = g(x, y) at (x0, y0). So the normal line is given in parametric form by

    x(t) = x0 + t∇f,

and the tangent plane is given by

    (x − x0)·∇f = 0.                                                                 (5.3)

Example 32. This example is taken from the 2011 exam, with notation slightly changed.
Question: Check if z sin(πx) = log(2z − y²) defines a function z = g(x, y) around x = 2, y = −1. Find
the tangent plane and normal line to z = g(x, y) at (2, −1).
Answer: Set f(x, y, z) = z sin(πx) − log(2z − y²); then the question is asking about the level set f = 0.
We have

    f(2, −1, z) = z sin(2π) − log(2z − 1) = −log(2z − 1),

so  f = 0  ⟺  log(2z − 1) = 0  ⟺  2z − 1 = 1  ⟺  z = 1.

The Implicit Function Theorem then states that f(x, y, z) = 0 defines a surface z = g(x, y) near
(x0, y0, z0) = (2, −1, 1) provided ∂f/∂z ≠ 0 there. But

    ∂f/∂z = sin(πx) − 2/(2z − y²),

so  ∂f/∂z (2, −1, 1) = −2/(2 − 1) = −2 ≠ 0.

The other partial derivatives are:

    ∂f/∂x = πz cos(πx)                          ⇒  ∂f/∂x (2, −1, 1) = π
    ∂f/∂y = −(−2y)/(2z − y²) = 2y/(2z − y²)     ⇒  ∂f/∂y (2, −1, 1) = −2/1 = −2,

so ∇f(2, −1, 1) = (π, −2, −2).

[Aside: so the partial derivatives of g(x, y) are (not that we need them here):

    ∂g/∂x = −(∂f/∂x)/(∂f/∂z) = π/2,      ∂g/∂y = −(∂f/∂y)/(∂f/∂z) = −1.

End aside.]
So a normal line is

    (x, y, z) = (2, −1, 1) + t(π, −2, −2)
    ⇒  x(t) = (2 + πt) e1 + (−1 − 2t) e2 + (1 − 2t) e3,

and the tangent plane is

    ((x, y, z) − (2, −1, 1))·(π, −2, −2) = 0
    ⇒  (x − 2)π + (y + 1)(−2) + (z − 1)(−2) = 0
    ⇒  πx − 2y − 2z = 2π
    ⇒  (π/2)x − y − z = π.

Note that if we write z = g(x, y) and rearrange to get g(x, y) = (π/2)x − y − π, we see that ∂g/∂x and ∂g/∂y
agree with the values found above, as they must.
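If a computer algebra system is to hand, Example 32 can be verified symbolically. The following sketch
is an added illustration, not part of the original notes, and assumes sympy is installed:

    import sympy as sp

    x, y, z = sp.symbols('x y z')
    f = z * sp.sin(sp.pi * x) - sp.log(2*z - y**2)

    grad = [sp.diff(f, v) for v in (x, y, z)]
    point = {x: 2, y: -1, z: 1}
    print([g.subs(point) for g in grad])     # [pi, -2, -2], so grad f(2,-1,1) = (pi, -2, -2)

    # Partial derivatives of the implicit function z = g(x, y):
    dgdx = -(grad[0] / grad[2]).subs(point)  # pi/2
    dgdy = -(grad[1] / grad[2]).subs(point)  # -1
    print(dgdx, dgdy)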

6 Di↵erentiability of vector fields
6.1 Di↵erentiable maps Rn ! Rn
We will now generalise the idea of differentiability of scalar fields from section 5. Recall that for a scalar
field f(x) : U → ℝ, with U open in ℝⁿ, f is differentiable at a ∈ U if

    f(a + h) − f(a) = h·∇f(a) + R(h)                                                 (a)
    with  lim_{h→0} R(h)/|h| = 0,                                                    (b)

where you should note that the first term on the RHS of (a), h·∇f(a), is linear in h.
Definition 6.1. Consider a vector field F(x) : U → ℝⁿ, U open in ℝⁿ. Then F is defined to be
differentiable at a ∈ U if there is a linear function L : ℝⁿ → ℝⁿ such that

    F(a + h) − F(a) = L(h) + R(h)                                                    (A)
    with  lim_{h→0} R(h)/|h| = 0.                                                    (B)

Now linear functions ℝⁿ → ℝⁿ are matrices. To see which matrix, use the standard basis:

    F(x) = F1(x) e1 + F2(x) e2 + · · · + Fn(x) en
    L(h) = L1(h) e1 + L2(h) e2 + · · · + Ln(h) en
    R(h) = R1(h) e1 + R2(h) e2 + · · · + Rn(h) en,

so the jth components of (A) and (B) are

    Fj(a + h) − Fj(a) = Lj(h) + Rj(h)                                                (A)j
    with  lim_{h→0} Rj(h)/|h| = 0.                                                   (B)j

These are just conditions (a) and (b) for Fj(x) to be differentiable as a map U → ℝ, i.e. as a scalar field.
So we can use results from section 5 to see that Lj(h) = h·∇Fj(a), that is

    L1 = h·∇F1(a) = h1 ∂F1/∂x1 + h2 ∂F1/∂x2 + · · · + hn ∂F1/∂xn
    L2 = h·∇F2(a) = h1 ∂F2/∂x1 + h2 ∂F2/∂x2 + · · · + hn ∂F2/∂xn
      ⋮
    Ln = h·∇Fn(a) = h1 ∂Fn/∂x1 + h2 ∂Fn/∂x2 + · · · + hn ∂Fn/∂xn,

or, collecting the components into a single matrix equation,

    (L1, L2, . . . , Ln)ᵀ = M (h1, h2, . . . , hn)ᵀ,

where M is the n × n matrix whose (j, k) entry is ∂Fj/∂xk, evaluated at x = a. This matrix is called the
Jacobian matrix, or differential, of F(x) at x = a; it is written as DF(a), or dF(a) (or DF|a or dF|a or
even Jij).

Definition 6.2. The determinant of the differential,

    det(Dv) ≡ |Dv|,

is called the Jacobian, J(v).


Example 33. If

    v(x) = (x² − y², 2xy) = (v1, v2),

then Dv(x) is the 2 × 2 matrix with rows

    (∂v1/∂x, ∂v1/∂y) = (2x, −2y)   and   (∂v2/∂x, ∂v2/∂y) = (2y, 2x).

The Jacobian is then given by

    J(v) = (2x)(2x) − (−2y)(2y) = 4(x² + y²) = 4|x|².

Example 34. If x ∈ ℝⁿ and v(x) = x, then Dv(x) is the matrix with (i, j) entry ∂x_i/∂x_j, which equals 1
if i = j and 0 otherwise; that is,

    Dv(x) = In,

the n × n identity matrix, and

    J(v) = |In| = 1.
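Differentials and Jacobians such as those in Examples 33 and 34 are easy to reproduce with a computer
algebra system. A possible sketch, assuming sympy is available (an added illustration, not part of the
original notes):

    import sympy as sp

    x, y = sp.symbols('x y')
    v = sp.Matrix([x**2 - y**2, 2*x*y])

    Dv = v.jacobian([x, y])        # Matrix([[2*x, -2*y], [2*y, 2*x]])
    print(Dv)
    print(sp.simplify(Dv.det()))   # 4*x**2 + 4*y**2, i.e. 4|x|^2

    # The identity map has the identity matrix as its differential:
    w = sp.Matrix([x, y])
    print(w.jacobian([x, y]))      # Matrix([[1, 0], [0, 1]])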

6.2 Di↵eomorphisms and the inverse function theorem


We can think of a vector field v(x) as a mapping Rn ! Rn , or equivalently as a coordinate transformation
on Rn . If we think of the components of h as the coordinates of a point x = a + h relative to an “origin”
at a, then the components of v(a + h) v(a) are the transformed coordinates relative to the transformed
origin v(a). Then for di↵erentiable v (and small h)

v(a + h) v(a) ' Dv(a) . h


new coordinates ' matrix ⇥ old coordinates,

which is a linear transformation, invertible if the determinant of Dv(a) (i.e. the Jacobian) is non-zero.
The inverse function theorem says that this invertibility can be extended beyond the linear behaviour:
Theorem 6.3. Let v : U → ℝⁿ (with U open in ℝⁿ) be a differentiable vector field with continuous
partial derivatives, and let a ∈ U. Then if J(v(a)) ≠ 0, there exists an open set Ũ ⊆ U containing a such that
(i) v(Ũ) is open;
(ii) the mapping v from Ũ to v(Ũ) has a differentiable inverse, i.e. there exists a differentiable vector
field w : v(Ũ) → ℝⁿ such that w(v(x)) = x and v(w(y)) = y.
Definition 6.4.
• A mapping v : Ũ ! V ⇢ Rn satisfying (i) and (ii) above is called a di↵eomorphism of Ũ onto
Ṽ = v(Ũ ), and Ũ and Ṽ are said to be di↵eomorphic.
• More generally, a mapping v : U ! V is called a local di↵eomorphism if for every point a 2 U
there is an open set Ũ ⇢ U containing a such that v : Ũ ! v(Ũ ) is a di↵eomorphism.

Figure 29: A di↵eomorphism from Ũ to Ṽ

Remarks In general, suppose that


v : U ! V ⇢ Rn
w : V ! W ⇢ Rn
(with U and V both open in Rn ) are both continuously di↵erentiable vector fields (not necessarily
di↵eomorphisms).

Figure 30: Composition of maps v and w

Then w(v(x)) is a mapping U → W ⊂ ℝⁿ and its differential can be calculated using the chain rule (see
Q61, 62 on the Problems Sheets), giving

    D[w(v(x))] = Dw(v) Dv(x),   i.e. by matrix multiplication.

For the special case when v is a local diffeomorphism and w is its inverse map,

    w(v(x)) = x   ⇒   Dw Dv = D[w(v(x))] = Dx(x) = In,

using Example 34 above. Likewise

    v(w(y)) = y   ⇒   Dv Dw = D[v(w(y))] = Dy(y) = In.

So Dv is an invertible matrix, with inverse (Dv)⁻¹ = Dw. Taking determinants,

    J(w) = 1/J(v),

and in particular J(v) ≠ 0, which was the main condition of the inverse function theorem.
Definition 6.5. Such a v is called orientation preserving if J(v) > 0, and orientation reversing if J(v) <
0.

Example 35. Continuing Example 33, we had

    v(x) = (x² − y², 2xy)   ⇒   J(v) = 4(x² + y²).

So for (x, y) ≠ (0, 0), J(v) > 0.

Hence if U = ℝ² \ {0}, v : U → U is an orientation preserving local diffeomorphism. However it is
not a global diffeomorphism, since v(−x) = v(x) (so no inverse can exist globally). But v does map
{(x, y) : x > 0} onto ℝ² \ {(x, 0) : x ≤ 0} diffeomorphically.
Example 36. Consider the transformation from polar coordinates (r, θ) back to cartesians (x, y). We
have

    v(r, θ) = (x(r, θ), y(r, θ)) = (r cos θ, r sin θ).

Differential: Dv is the 2 × 2 matrix with rows

    (∂x/∂r, ∂x/∂θ) = (cos θ, −r sin θ)   and   (∂y/∂r, ∂y/∂θ) = (sin θ, r cos θ).    (6.1)

Jacobian:

    J(v(r, θ)) = (cos θ)(r cos θ) − (−r sin θ)(sin θ) = r cos²θ + r sin²θ = r.

For r > 0, J(v) > 0, and the transformation is therefore orientation preserving.
The inverse mapping is

    w(x, y) = (r(x, y), θ(x, y)) = (√(x² + y²), tan⁻¹(y/x)).

Exercise: check this!
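Here is one optional way to do that check, and to confirm the relation J(w) = 1/J(v), symbolically. This
sketch is an added illustration, not part of the original notes, and assumes sympy is available:

    import sympy as sp

    r, th, x, y = sp.symbols('r theta x y', positive=True)

    v = sp.Matrix([r * sp.cos(th), r * sp.sin(th)])
    Jv = sp.simplify(v.jacobian([r, th]).det())
    print(Jv)                                  # r

    w = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan2(y, x)])
    Jw = sp.simplify(w.jacobian([x, y]).det())
    print(Jw)                                  # should simplify to 1/sqrt(x**2 + y**2), i.e. 1/r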

7 Volume, line and surface integrals
7.1 Double integrals and Fubini’s theorem
We’ll start with the familiar case of one-dimensional integrals. Recall the single integral computes the
area under a curve, as illustrated in figure 31.

Figure 31: Graph showing a section of the (red) curve given by y = h(x), with the discretised area under the
curve given by the sum of the black rectangles of width xi .

As the widths of the rectangles tend to zero, so the sum of their areas tends to the integral of the curve
over the desired range, defining the integral via a Riemann sum:
    ∫ₐᵇ h(x) dx = lim_{n→∞} Σ_{i=0}^{n−1} h(x_i*) Δx_i,

where the interval [a, b] is partitioned as a = x0 < x1 < . . . < x_i < . . . < x_n = b, Δx_i = x_{i+1} − x_i, and
the choice of x_i* ∈ [x_i, x_{i+1}] is arbitrary as long as the limit exists (in the figure they were all taken at
the left-hand ends).
In general the Δx_i's can be different sizes, but if they are, the limit n → ∞ must include the extra
requirement that sup(Δx_i) → 0. From here on we will assume equally-spaced partitions, and mostly
assume that the functions we deal with are such that the relevant limits all exist.

Double integrals
R RR
As for the single integral, the double integral, written as R f (x, y) dA (or sometimes R f (x, y) dA),
can be used to calculate the volume under the surface defined by the equation z = f (x, y) where f (x, y)
is continuous over R ⇢ R2 , the region over which we wish to perform the integration. This is illustrated
in figure 32.
This double integral can be defined in a similar fashion to the single integral using a Riemann sum. We
start by splitting the region of integration R into N smaller areas Ak (see figure 33, where the areas
Ak are chosen to be rectangles).
As for single integrals, where we add the areas of the small rectangles, we can add up the smaller volumes
(prisms, with volumes (area of base = Ak ) ⇥ (height)) to give an approximation to the double integral.

Figure 32: The section of the surface z = f (x, y) lying above the region R = { (x, y) 2 R2 | x 2 [1, 4], y 2 [1, 4] } .

Figure 33: A sketch of the discretised surface z = f (x, y). The double integral can be approximated by the
sum of the volumes of all the small cuboids; if we take the limit of infinitely many cuboids, the sum tends to the
integral.

As we increase the number of prisms the approximation becomes more and more accurate, and the
integral can be defined as the limit of the sum:
Z N
X
f (x, y) dA = lim f (x⇤k , yk⇤ ) Ak .
R N !1
k=1

Here (x⇤k , yk⇤ ) is in the base of the k th prism. If we choose the small areas to be rectangles on a regular
grid, then Ak = xi yj with xi = xi+1 xi , yj = yj+1 yj and x and y are partitioned in a
similar way to that used before: x0 < x1 < ... < xi < ... < xn , y0 < y1 < ... < yj < ... < ym . We then
obtain
Z n
X1 mX1
f (x, y) dA = lim f (x⇤i , yj⇤ ) xi yj ,
R n,m!1
i=0 j=0

where x⇤i and yj⇤ are the coordinates of the points at which the function is evaluated and are in the
ranges x⇤i 2 [xi , xi+1 ], yj⇤ 2 [yj , yj+1 ]. These points are often taken to be the mid-points. If we take the
limit m ! 1 first, and only then take n ! 1, we get:
0 1
n
X1 m X1 X1 Z
n
lim f (x⇤i , yj⇤ ) yj xi = lim @ f (x⇤i , y)dy A xi
n,m!1 n!1
i=0 j=0 i=0 y
0 1
Z Z
= @ f (x, y) dy A dx .
x y

Let’s see an example of calculating a double integral:


Example 37. Integrate the function f(x, y) = 6xy² over R = [2, 4] × [1, 2].

    ∫₂⁴ ∫₁² 6xy² dy dx = ∫₂⁴ [2xy³]₁² dx = ∫₂⁴ (16x − 2x) dx = [7x²]₂⁴ = 84.

If we'd taken n → ∞ first instead, we would have ended up with the opposite order of integrations, but
the final result is unchanged:

    ∫₁² ∫₂⁴ 6xy² dx dy = ∫₁² [3x²y²]₂⁴ dy = ∫₁² 36y² dy = [12y³]₁² = 84.

As we can see, we obtain the same final result as before.
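For readers who like to experiment, both iterated integrals of Example 37 can be reproduced with a
computer algebra system. A minimal sketch, assuming sympy is available (an added illustration, not part
of the original notes):

    import sympy as sp

    x, y = sp.symbols('x y')
    f = 6 * x * y**2

    dy_first = sp.integrate(sp.integrate(f, (y, 1, 2)), (x, 2, 4))
    dx_first = sp.integrate(sp.integrate(f, (x, 2, 4)), (y, 1, 2))
    print(dy_first, dx_first)    # 84 84, whichever order is taken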


Things become a little more complicated if the region of integration, let’s call it A, is not a rectangle.
Suppose that it is defined to be the set of points in the x, y plane lying between two curves y0 (x) and
y1 (x) with a  x  b :
A = {(x, y) 2 R2 | a  x  b , y0 (x)  y  y1 (x)} .
(To visualise this, it might help to draw a picture!) Then taking m ! 1 first:
Z Z b Z y1 (x)
f (x, y) dx dy = f (x, y) dy dx .
A a y0 (x)

Instead we could take n ! 1 first, taking care to rearrange the limits (see example 38 below for this in
action). Again, the final answer will be the same.

Fubini’s theorem sums this all up.


Theorem 7.1. If the function f (x, y) is continuous over a bounded and closed (i.e. compact) region of
integration A, then the double integral over that region can be written as an iterated integral, with
the integrals in either order:
Z Z Z Z Z
f (x, y) dA = f (x, y) dx dy = f (x, y) dy dx .
A y x x y

In order to calculate an integral in this form we take the inner integration first while treating the outer
variable as a constant, and then do the outer integration. In this way the problem of two- (or more!)
dimensional integration has been reduced to doing a bunch of one-dimensional integrals, one after the
other.
Important note: If the region and/or the function is unbounded (the latter option arising, for example,
if A is open and f (x, y), while continuous on A, tends to infinity somewhere on its boundary), then
Fubini’s theorem still holds provided that the double integral is absolutely convergent, meaning
that the integral of |f (x, y)| over A must be finite. If this doesn’t hold then the result might not be true,
with the iterated integrals in the two orders giving di↵erent answers – see questions 61, 63 and 64 from
the problem sheets for some examples.
Example 38. Consider integrating f (x, y) = 4xy y 3 over the region drawn in figure 34 below, where
the region we wish to integrate over is the area between two curves.

Figure 34: Graph of the region of integration, lying between the two curves.

From the graph we can see that 0 ≤ x ≤ 1 and x³ ≤ y ≤ √x, so in the notation of the previous page
we should take a = 0, b = 1, y0(x) = x³, and y1(x) = √x. The integral of f(x, y) = 4xy − y³ over this
region can be calculated as:
p  p
Z 1Z x Z 1 x
3 2 y4
(4xy y ) dy dx = 2xy ) dx
0 x3 0 4 x3
Z 1
7 x12
= ( x2 2x7 + ) dx
0 4 4
 1
7x3 x8 x13 55
= + = .
12 4 52 0 156
If we want to change the order of integration, we must take care over the limits, as they are functions. To
integrate with respect to x first so we must calculate the limits of the inner integral in the form x = g(y);
p
on inspection of the graph, figure 34, we can see that the lower limit will be given by the line y = x
which must be re-written as x = y 2 , similarly the upper limit will be x = y 1/3 . The outer integral, with
respect to y, has lower and upper limits given by y = 0 and y = 1 respectively. Now we are ready to
integrate!
Z 1 Z y1/3 Z 1
3
⇥ 2 ⇤y1/3
(4xy y ) dx dy = 2x y y 3 x y2 dy
0 y2 0
Z 1
= (2y 5/3 y 10/3 y 5 )dy
0
 1
3y 8/3 3y 13/3 y6 55
= = .
4 13 6 0 156

In the example just treated, changing the order of integration didn’t make much di↵erence to the calcu-
lation, but sometimes it can be crucial:
x2
Example 39. Evaluate the integral of the function f (x, y) = e over the triangle A shown in figure
35.

Figure 35: Graph of the region of integration A, a triangle with base and height = 1.

Attempting to take the x integral first, we have


    I = ∫_A e^{−x²} dA = ∫₀¹ ( ∫_y¹ e^{−x²} dx ) dy,

and then we seem to be stuck: the integral w.r.t. x has no elementary solution, and can only be given in
terms of the error function. Thankfully all is not lost: using Fubini we can change the order of integration
and obtain the answer:
    I = ∫_A e^{−x²} dA
      = ∫₀¹ ∫₀ˣ e^{−x²} dy dx
      = ∫₀¹ x e^{−x²} dx
      = [−½ e^{−x²}]₀¹ = ½ (1 − e⁻¹).
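A quick numerical check of this example (an added illustration, not part of the original notes; it assumes
numpy and scipy are available): the one-dimensional integral left after doing the y-integration first is easy
to evaluate, and brute-force quadrature over the triangle gives the same value.

    import numpy as np
    from scipy import integrate

    # Result of integrating in y first: integrate x*exp(-x^2) from 0 to 1.
    val, _ = integrate.quad(lambda x: x * np.exp(-x**2), 0, 1)
    print(val, 0.5 * (1 - np.exp(-1)))            # both ~0.31606

    # Direct double quadrature over the triangle 0 <= y <= x <= 1:
    val2, _ = integrate.dblquad(lambda y, x: np.exp(-x**2),
                                0, 1, lambda x: 0.0, lambda x: x)
    print(val2)                                   # ~0.31606 again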

Note: to calculate an area in the plane, for example between two curves, simply set f (x, y) = 1:
    ∫_R 1 dA = ∫_R dA = Area of R,

an integral which can be evaluated in whichever order is easiest.

Sometimes it is better to use a non-rectangular grid of elementary areas Ak , but the idea is the same.
Example 40. Integrate f (x, y) = x2 + y 2 over the unit circle centred on the origin.
Using polar coordinates, the elementary areas are A= ri ⇥ (rj ✓j ), and so
Z Z Z 2⇡ Z 1
(x2 + y 2 ) dA = r2 r dr d✓
A 0 0
Z 2⇡ Z 1
= r3 dr d✓
0 0
Z 2⇡ ⇥1 ⇤
4 1
= 4r 0
d✓
0
Z 2⇡
1
= 4 d✓ = ⇡/2 .
0

There will be more on the systematics of this later in the term.

7.2 Volume integrals


Volume integrals are the obvious extension to all of this, and Fubini’s theorem applies to them too
(and also to four, five and so on dimensional integrals, though we’ll stop at three). To begin, let’s divide
up our volume and define the integral as the limit of a Riemann sum.
Let f (x) be a continuous scalar field defined
R on a volume V 2 R3 enclosed by the surface S. We define
the volume integral of f over V (I = V f (x) dV ) as the limit of a Riemann sum associated with a
partition of V into many, N , small volumes Vi , as in figure 36, as the number of these small volumes
tends to infinity and their sizes all tend to zero:
Z N
X
I= f (x) dV = lim f (xi ) Vi .
V N !1
i=1

Some notes:
• The limit should be independent of the partition taken, as we saw with double integrals.

Figure 36: Diagram showing a volume to be integrated over, V , with its bounding surface, S. Vi is a volume
element in the partition, situated at xi .

• Geometrically the volume (triple) integral is a 4-dimensional ‘hypervolume’ lying ‘below’ the graph
of f (x, y, z) in R4 . This is a natural extension of single integrals giving the area under the curve
and double integrals giving the (3-D) volume under the surface, but it is hard to visualise!
• Physically,
R if f (x) is the density of some quantity (e.g. number of flying ants per unit volume),
then I = V f (x) dV is the amount of ‘stu↵’ (total number of flying ants) inside all of V .
• As in the double integral case, we can calculate the volume inside a surface by setting f (x, y, z) = 1:
Z Z
1 dV = dV = Volume inside S.
V V

Now consider a simple shape such as the sphere shown in figure 37, which can be split into an upper
and lower surface z = gU (x, y) and z = gL (x, y) respectively. (Note that the sphere does not need to be
centred on the origin for this to be possible.)

Figure 37: Diagram showing the volume to be integrated over, V with its bounding surface, S. The circle on
the x, y plane is the projection of the volume onto the plane, and it is labelled A. Simple shapes like the sphere
can be split into an upper surface z = gU (x, y) and a lower surface z = gL (x, y).

If we consider in this case a simple partition which is parallel to the coordinate planes then

Vi = x r y s zt ,

where xr = xr+1 xr , ys = ys+1 ys , zt = zt+1 zt .

Taking the limits of the Riemann sum,
N
X
I = lim f (xi ) Vi
N !1
i=1
X
= lim f (x⇤r , ys⇤ , zt⇤ ) xr ys zt
N !1
r,s,t
!
X X
= lim lim f (x⇤r , ys⇤ , zt⇤ ) zt x r ys take z limit first
xr , ys !0 zt !0
r,s t
0 1
z=gUZ(x⇤ ⇤
r ,ys )
XB C
= lim @ f (x⇤r , ys⇤ , z)dz A x r ys and now x, y limits
xr , ys !0
r,s
z=gL (x⇤ ⇤
r ,ys )
0 1
Z z=gZU (x,y)
B C
= @ f (x, y, z)dz A dx dy ,
A z=gL (x,y)

where x⇤r 2 [xr , xr+1 ], ys⇤ 2 [ys , ys+1 ], zt⇤ 2 [zt , zt+1 ]. We may also disentangle the area integral to write
I as a three-times iterated integral:
Z Z Z
I = f (x) dz dy dx .
x y z

By Fubini’s theorem, we can change the order of integration as long as f (x) is continuous over the closed
volume, as in the following example.

Example 41. Find the mass M of air inside a hemispherical volume of radius r centred on the origin,
when the air density varies with height as ρ = cz + ρ0 (and c and ρ0 are constants).
Doing the integrals in the order x first, then y, then z, we have

    M = ∫₀ʳ ∫_{−r_z}^{r_z} ∫_{−√(r_z² − y²)}^{√(r_z² − y²)} ρ(z) dx dy dz,

where r_z = √(r² − z²). Noting that ρ(z) can be taken outside the x and y integrals,

    M = ∫₀ʳ ρ(z) ( ∫_{−r_z}^{r_z} ∫_{−√(r_z² − y²)}^{√(r_z² − y²)} dx dy ) dz.

Now the quantity in round brackets is just the area A(z) of the horizontal 'slice' of the hemisphere at
height z, so it's equal to πr_z², and

    M = π ∫₀ʳ (cz + ρ0) r_z² dz
      = π ∫₀ʳ (cz + ρ0)(r² − z²) dz
      = π ∫₀ʳ (cr²z + ρ0 r² − cz³ − ρ0 z²) dz
      = π [½ cr²z² + ρ0 r²z − ¼ cz⁴ − ⅓ ρ0 z³]₀ʳ
      = π (¼ cr⁴ + ⅔ ρ0 r³).
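Example 41 is also a quick exercise for a computer algebra system. A possible sketch, assuming sympy is
available (an added illustration, not part of the original notes):

    import sympy as sp

    z, r, c, rho0 = sp.symbols('z r c rho_0', positive=True)

    # Integrate the density over horizontal slices of area pi*(r^2 - z^2):
    M = sp.integrate((c*z + rho0) * sp.pi * (r**2 - z**2), (z, 0, r))
    print(sp.expand(M))    # pi*c*r**4/4 + 2*pi*r**3*rho_0/3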

Bonus example 1. Use the volume integral to calculate the volume of a sphere of radius a (which we
know to be V = 43 ⇡a3 ).
R
We can calculate this volume as V f dV , where V is the set of points {(x, y, z) 2 R3 : x2 + y 2 + z 2  a2 }
and f = 1. We have:
Z Z Z !
z=gU (x,y)
dV = dz dx dy
V A z=gL (x,y)
Z Z p !
z= a2 x 2 y 2
= p dz dx dy
A z= a2 x 2 y 2
Z p
= 2 a2 x2 y 2 dx dy
A
Z Z p 2 2
a a y p
= p 2 a2 x2 y 2 dx dy
a a2 y 2
p p
Use a substitution to complete the x-integration: x = a2 y 2 cos ✓, dx = a2 y 2 sin ✓ d✓
Z a ✓Z ⇡ ◆
= 2(a2 y 2 ) sin2 ✓ d✓ dy
a 0
Z a
2 2a3 4
= (a y 2 )⇡ dy = ⇡(2a3 ) = ⇡a3 , as expected.
a 3 3

Later, we’ll rederive this result slightly less painfully, via a change of variables.

7.3 Line integrals


So far the integrals have been over ‘flat’ regions in one, two or three dimensions, but for applications it
is important to generalise to curved lines and surfaces.
A regular arc C ⇢ Rn is a parametrised curve x(t) for which the Cartesian components xa (t), a = 1 . . . n
are continuous with continuous first derivatives, where t lies in some (maybe infinite) interval [↵, ]. A
regular curve consists of a finite number of regular arcs joined end to end.
If v(x) is a vector field in Rn , then its restriction to a regular arc, v(x(t)), is a vector function of t and its
scalar product with the tangent dx(t)/dt to the arc is a scalar function of t. We can therefore integrate
R arc to get a real number, called the line integral of v along the arc C: t 7! x(t), t = ↵ . . . ,
it along the
denoted C v · dx :
Z Z
dx(t)
v · dx = v(x(t)) · dt . (7.1)
C ↵ dt

An important fact is that, as suggested by the notation, the line integral does not depend on the choice
of parametrisation of C. This follows from the chain rule, but it is instructive to check it for yourself in
some examples.
If C is a regular curve made up of one or more regular arcs, Rthen the line integral of v along C is the
sum of the line integrals over these arcs; it is also denoted as C v · dx .
H
Finally, if the integral is performed over a closed regular curve then it is often written as C v · dx .
Some variants:
Interpretation Form
R Rb
Length of curve, C C
ds = a || dx(t)
dt || dt
R Rb
Mass, if f is a density function C
f ds = a f (x(t)) || dx(t)
dt || dt
R
Work done, if F is a force C
F · dx

Here is a quick list of useful parametrisations (with a positive orientation i.e. anti-clockwise) for some
common curves in two dimensions:

Curve Parametric Equations


x2 y2
Ellipse: + =1 a2 b2 x(t) = a cos(t), y(t) = b sin(t)
y = f (x) x(t) = t, y(t) = f (t)
x = g(y) x(t) = g(t), y(t) = t
Straight line segment from (x0 , y0 ) to (x1 , y1 ) x(t) = (1 t)x0 + tx1 , y(t) = (1 t)y0 + ty1

Example 42. Let u(x) be the vector field (x²z, xyz, x) and C be the circle parallel to the x, y plane
with centre a = (a1, a2, a3) and radius r, given by

    x(t) = a + r cos t e1 + r sin t e2,   0 ≤ t ≤ 2π.

Evaluate ∮_C u · dx.
Calculating, we have

    dx/dt = −r sin t e1 + r cos t e2 = (−r sin t, r cos t, 0)

and

    x(t) = (a1 + r cos t, a2 + r sin t, a3),

so

    u(x(t)) = ((a1 + r cos t)² a3, (a1 + r cos t)(a2 + r sin t) a3, a1 + r cos t)

and

    u(x(t)) · dx/dt = −r sin t (a1 + r cos t)² a3 + r cos t (a1 + r cos t)(a2 + r sin t) a3.

Hence

    ∮_C u · dx = ∫₀^{2π} [ −r sin t (a1 + r cos t)² a3 + r cos t (a1 + r cos t)(a2 + r sin t) a3 ] dt
               = . . .
               = πr² a2 a3.
Note: if we had instead parametrized C as
x(t) = a + r cos 2t e1 + r sin 2t e2 , 0t⇡
the steps of the calculation would have changed slightly, but the final answer is unchanged – it’s worth-
while to check this for yourself.
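That check can also be done numerically. The sketch below is an added illustration, not part of the
original notes; it assumes numpy and scipy are available, and the values a = (1, 2, 3), r = 2 are an
arbitrary choice. It evaluates the line integral of Example 42 with both parametrisations:

    import numpy as np
    from scipy import integrate

    a1, a2, a3, r = 1.0, 2.0, 3.0, 2.0

    def u(x, y, z):
        return np.array([x**2 * z, x*y*z, x])

    def integrand(t, speed):        # speed = 1 for x(t), speed = 2 for the x(2t) version
        s = speed * t
        x, y, z = a1 + r*np.cos(s), a2 + r*np.sin(s), a3
        dxdt = speed * np.array([-r*np.sin(s), r*np.cos(s), 0.0])
        return u(x, y, z) @ dxdt

    I1, _ = integrate.quad(lambda t: integrand(t, 1), 0, 2*np.pi)
    I2, _ = integrate.quad(lambda t: integrand(t, 2), 0, np.pi)
    print(I1, I2, np.pi * r**2 * a2 * a3)   # all three agree, ~75.398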
Bonus example 2. Consider a circle of radius 1, centred at the origin. Let C be the arc of this circle
which lies in the first quadrant, taken in an anticlockwise direction. Find the value of the line integral
of F (x, y) = ( y, xy) along C.
First we must parametrise the path, C: x(t) = (cos(t), sin(t)), 0  t  ⇡/2 (the restriction on the range
of t is because the arc lies only in the first quadrant). Therefore dx(t)dt = ( sin(t), cos(t)). Next note
that F (x(t)) = F (x(t), y(t)) = ( sin(t), cos(t) sin(t)), so
Z Z ⇡2
F · dx = ( sin(t), cos(t) sin(t)) · ( sin(t), cos(t)) dt
C 0
Z ⇡
2
= (sin2 (t) cos2 (t) sin(t)) dt
0
Z ⇡
2
= 1
2 (1 cos(2t)) cos2 (t) sin(t) dt
0
h i ⇡2
= 1
2 (t
1
2 sin(2t)) + 1
3 cos3 (t) = 14 ⇡ 1
3 .
0

Once you have understood Green’s theorem [next chapter!], try to use it to double-check this result.

To summarize: the steps for performing a line integral of a vector field v(x) along a regular arc C are as
follows:
• parametrise C somehow, as t 7! x(t), with t in some range;
dx(t)
• compute dt and v(x(t));
• compute their scalar product and re-write the integrand in terms of t;
• perform the integration between the limits identified in the first part.
For a regular curve made up of a number of regular arcs, just do the above steps for each arc, and then
add up the results.

7.4 Surface integrals I: defining a surface


As with a line integral where the integration of a vector field along a curve yields a real number, a three-
dimensional vector field can be integrated over a two-dimensional surface S sitting in R3 , to give a double
integral analogue of the line integral. Surface integrals are of particular importance in electromagnetism
and fluid mechanics.
The first task is to specify the surface. There are (at least) two standard ways to do this:
Method 1: Give the surface in parametric form as x(u, v) where the real parameters u and v lie in
some region U ⇢ R2 called the parameter domain.
Example 43. Points on a sphere of radius a centred on the origin can be parametrised in spherical
(polar) coordinates, with u and v usually written as ✓ and , as

x(✓, ) = (x(✓, ), y(✓, ), z(✓, ))

where
x(✓, ) = a sin ✓ cos , y(✓, ) = a sin ✓ sin , z(✓, ) = a cos ✓ ,
and 0  ✓  ⇡, 0   2⇡ .

Small aside: The parametrisation for the surface of a sphere comes from the spherical polar coordinate
system, a three-dimensional extension of two-dimensional polar coordinates. Figure 38 shows how we
get these coordinates: first calculate r0 = r sin(✓). 0 0
p Then x = r cos( ) = r sin(✓) cos( ), y = r sin( ) =
2 2 2
r sin(✓) sin( ), while z = r cos(✓). As ever, r = x + y + z is the distance from the origin, while ✓
and are known as the polar and azimuthal angles respectively, with ✓ 2 [0, ⇡] and 2 [0, 2⇡]. Beware
that some texts have these the other way around, so that ✓ is the azimuthal and the polar angle. Note
though that here we fix r to be equal to a, the radius of the sphere, as we want to stay on its surface.

The following table gives some more examples:


Surface Parametric Equations
General x(u, v) = x(u, v)e1 + y(u, v)e2 + z(u, v)e3
Sphere/part sphere, radius a x = a sin(u) cos(v), y = a sin(v) sin(u), z = a cos(u)
Cylinder, radius a, centred on z-axis x = a cos(u), y = a sin(u), z = v
z = f (x, y) x = u, y = v, z = f (u, v)
y = g(x, z) x = u, y = g(u, v), z = v
x = h(y, z) x = h(u, v), z = v, y = u

@x @x
Returning to the general case, @u and @v are two tangent vectors to S at x(u, v), and so their cross
product
@x(u, v) @x(u, v)

@u @v

Figure 38: Spherical polar coordinates (r, ✓, ).

is a normal vector to S there, and


⇣ @x(u, v) @x(u, v) ⌘ @x(u, v) @x(u, v)
b=
n ⇥ ⇥
@u @v @u @v

is a unit normal to S at x(u, v). If u and v are swapped over, then we also get a unit normal to S at
x(u, v), but one pointing in the opposite direction (which for a surface sitting in R3 is the only other
option).
It’s a good exercise to check for example 43 that this recipe gives the answer you’d expect, that is a unit
vector pointing in the radial direction. For this case the relevant partial derivatives are
@x
= (a cos ✓ cos , a cos ✓ sin , a sin ✓)
@✓
@x
= ( a sin ✓ sin , a cos sin ✓, 0)
@
and so
x✓ ⇥ x = (a2 sin2 ✓ cos , a2 sin2 ✓ sin , a2 sin ✓ cos ✓) = a sin ✓ x .
Finally constructing the unit normal vector gives

a sin ✓ x (x, y, z)
b=
n = ,
a sin ✓ |x| a

which is indeed the expected answer, if you think about it.

Method 2: Express the surface as (part of) a level surface (recall from term 1) of a scalar field f , i.e.
give the surface implicitly as f (x, y, z) = const. Then the gradient of f , rf , is a normal vector to S,
and
rf
b=
n , (7.2)
|rf |
is a unit normal.
Example 44. For the sphere discussed above, we can take f (x, y, z) := x2 + y 2 + z 2 a2 = 0 (or

f (x, y, z) := x2 + y 2 + z 2 = a2 would work just as well). Then

rf
b=
n
|rf |
(2x, 2y, 2z)
=p
4x2 + 4y 2 + 4z 2
(x, y, z) (x, y, z)
=p = .
2
x +y +z2 2 a

This agrees with the result from method 1.


Note: if we’d instead defined the surface of the sphere by h(x, y, z) = f (x, y, z) = a2 x2 y 2 z 2 = 0,
then the normal vector obtained by using f (which we’ll call n bf ) would have been replaced by

( x, y, z)
bh = p
n = bf
n
x2 + y 2 + z 2

This reflects the same ambiguity in the sign of the normal vector that was seen when swapping u and v
in method 1.

7.5 Surface integrals II: evaluating the integral


We will once again use a Riemann sum to define the surface integral of a continuous vector field F (x)
over a surface S lying in R3 . The parametrised position vector of a point on the surface is x = x(u, v)
with (u, v) 2 U , the parameter domain. We will also assume that the partial derivatives of x exist and
are continuous, and that the unit normal vector n b(u, v) is continuous – this means that the surface is
what we call orientable. (This last point may seem obvious but it is not always true e.g. the Möbius
strip is a non-orientable surface.) The general setup is illustrated in figure 39. Then the surface integral
is defined as Z X
F · dA = lim F (x⇤k ) · n
b k Ak . (7.3)
S Ak !0
k

Note the dot product between the normal to the surface and the vector field: this implies that we are
looking at the vector field contributions which are perpendicular to the surface.
To turn this rather-formal definition into something which can be used in practice, we use either method
1 or method 2 to specify the surface, and then convert everything into a ‘flat’ two-dimensional area
integral of the sort seen in section 7.1 above.
Method 1: We construct the area elements Ak by approximating them as parallelograms given by
the partitioning of the surface along lines of constant u and v. Remember u and v are in the parameter
domain U , split as shown in figure 39(b) and indexed i and j respectively, and that the modulus of the
cross product of two vectors in equal to the area of the parallelogram that they span. Thus

bk Ak ⇡ (x(ui + ui , vj ) x(ui , vj )) ⇥ (x(ui , vj +


n vj ) x(ui , vj ))
@x @x
⇡ ( ui ) ⇥ ( vj ). (7.4)
@u @v
Substituting equation (7.4) into equation (7.3) gives us
Z X ✓ ◆
⇤ @x(ui , vj ) @x(ui , vj )
F · dA = lim F (xij ) · ⇥ ui v j
S ui , vj !0
i,j
@u @v

and, taking the limit, we get the key formula, the ‘surface’ version of the line integral definition (7.1):

Z Z ✓ ◆
@x @x
F · dA = F (x(u, v)) · ⇥ du dv . (7.5)
S U @u @v

(a) (b)

Figure 39: Plot (a) shows a hemispherical surface, over which we may wish to integrate a vector field; plot (b)
shows an area element, bij
Aij , of the surface with unit normal vector n
.

This is a double integral over the parameter domain U , and can be written neatly as
Z Z
F · dA = F · N du dv ,
S U

where N = xu ⇥ xv is a normal vector to S (but not of unit length).

Example 45. Find the integral of F = e3 over the surface, S, given by the hemisphere of radius 1,
centred at the origin, with z > 0, as shown in figure 39(a). We have

    ∫_S F · dA = ∫_U F(x) · (∂x/∂u × ∂x/∂v) du dv,

where r = 1, and taking U = {(u, v) ∈ ℝ² : 0 ≤ u ≤ π/2, 0 ≤ v ≤ 2π} captures the part of the surface
of the unit sphere with z ≥ 0. We also calculated x_u × x_v in the discussion following example 43, and
substituting all our values and completing the calculation,

    ∫_S F · dA = ∫₀^{2π} ∫₀^{π/2} F(x) · (∂x/∂u × ∂x/∂v) du dv
               = ∫₀^{2π} ∫₀^{π/2} e3 · (sin²u cos v e1 + sin²u sin v e2 + sin u cos u e3) du dv
               = ∫₀^{2π} ∫₀^{π/2} sin u cos u du dv
               = ∫₀^{2π} [½ sin²u]₀^{π/2} dv = ∫₀^{2π} ½ dv = π.
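Example 45 can be reproduced symbolically. A minimal sketch, assuming sympy is available (an added
illustration, not part of the original notes):

    import sympy as sp

    u, v = sp.symbols('u v')
    # The unit sphere parametrised by (u, v) as in example 43 (with a = 1):
    X = sp.Matrix([sp.sin(u)*sp.cos(v), sp.sin(u)*sp.sin(v), sp.cos(u)])

    N = X.diff(u).cross(X.diff(v))     # normal vector x_u x x_v
    F = sp.Matrix([0, 0, 1])           # the field e3

    integrand = sp.simplify(F.dot(N))  # sin(u)*cos(u)
    I = sp.integrate(sp.integrate(integrand, (u, 0, sp.pi/2)), (v, 0, 2*sp.pi))
    print(I)                           # pi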

Method 2: Suppose that S is given as the level set (or part of the level set) of a function f (x, y, z), and
furthermore that @f /@z 6= 0 on S. Then, by the implicit function theorem for surfaces, the points of S
can be written as (x, y, g(x, y)) for some function g(x, y), where the (x, y) ranges over some region A of the
x,y plane, which is the projection of S onto that plane. (This is not possible for every surface; for example
the x,z plane cannot be parametrised this way, and neither can the surface of a whole sphere.) We can
then apply method 1 taking the parameters to be the x, y coordinates, so that x(x, y) = (x, y, g(x, y)) =
x e1 + y e2 + g(x, y) e3 , and (after a short calculation)
@x @x @g @g
⇥ = e1 e2 + e3 .
@x @y @x @y

The partial derivatives of g can be calculated as in the implicit function theorem for surfaces, noting
that the function of two variables, F (x, y), defined by F (x, y) = f (x, y, g(x, y)), is constant, so that
@F @f @f @g
0= = +
@x @x @z @x
@F @f @f @g
0= = +
@y @y @z @y
Using these equations,
@x @x @f @f @f @f
⇥ = e1 / + e2 / + e3 = (rf )/(e3 · rf ) ,
@x @y @x @z @y @z
enabling the surface integral of F over S to be written as an area integral over the region A in the
x, y-plane: Z Z
F · rf
F · dA = dx dy .
S A e3 · rf

Note: whenever we compute a surface integral, there is a choice as to the direction to take the normal
vectors to the surface – ‘in’ or ‘out’ for a closed surface, or ‘upwards’ or ‘downwards’ for a surface lying
above the x,y plane, for example. In the derivation of the formula just given, z component of @x @x
@x ⇥ @y
was equal to 1, so it corresponds to the ‘upwards’ choice of normals. If the downwards option was the
one you were after, the formula should be negated.

Figure 40: Plot of the tangent plane to a surface at a point xk , making an angle with the x,y plane.

We can give an alternative derivation of the method 2 formula starting from the definition of the surface
integral in terms of a Riemann sum, equation (7.3). We shall use equation (7.2) for the unit normal,
rf
bk = |rf
n | , where f (x, y, z) = const defines the surface. Instead of approximating the small area element
by a parallelogram we can use a geometrical argument. Figure 40 shows the tangent plane to S at a
point of interest, xk , on S, and an area Ak on that plane together with the ‘shadow’ A = x y that
it casts on the x,y plane. From the figure we can see that AAk = cos( ) , while cos( ) = e3 · n bk from
definition of dot product. Hence we can write
x y x y bk x y
n
Ak = = bk
and n Ak = .
cos( ) e3 · n
bk e3 · n
bk

rf
bk =
Returning to the Riemann sum, and substituting n |rf | bk
in the formula just given for n k , we have
Z X
F · dA = lim F (x⇤k ) · n
b k Ak
S Ak !0
k
X rf xi yj
= lim F (x⇤i,j ) ·
xi , yj !0 |rf | e3 · rf
i,j |rf |
X F (x⇤i,j ) · rf
= lim x i yj
xi , yj !0
i,j
e3 · rf
Z
F · rf
= dx dy ,
A e3 · rf

where A is the area of the surface projected onto the x,y plane. Let’s return to the previous example
and calculate the surface integral using method 2.

Example 46. Recompute the integral of F = e3 over the surface, S, given by the hemisphere of radius
1 and centred at the origin with z > 0, as shown in figure 39(a), using method 2. Here the surface
can be represented by f (x, y, z) = x2 + y 2 + z 2 = 1, A is the unit circle centred at the origin, and
rf = (2x, 2y, 2z). Hence
Z Z
F · rf
F · dA = dx dy
S A e3 · rf
Z
e3 · rf
= dx dy
A e3 · rf
Z
= dx dy = ⇡ ,
A

since the area of the unit disk is ⇡. This agrees with our previous calculation, as of course it had to.

To recap: there are two ways to evaluate a surface integral, and which one is the best to use depends on
the form of the surface you are integrating over:

Equation of surface Form of surface integral


Z Z
Parametric: x(u, v) as in example 45 F · dA = F · (xu ⇥ xv ) du dv
S U
Z Z
F · rf
Implicit/level surface: f (x, y, z) = const as in example 46 F · dA = dx dy
S A e3 · rf

These two forms of the surface integral are equivalent, but one may be easier than the other for any
given example so you must practise! The problem sheets contain plenty of examples, many of them in
the context of the three theorems that will be discussed next.

8 Green’s, Stokes’ and divergence theorems
8.1 The three big theorems
1. Green's theorem in the plane
Let P(x, y) and Q(x, y), (x, y) ∈ ℝ², be continuously differentiable scalar fields in 2 dimensions. Then

    ∮_C P(x, y) dx + Q(x, y) dy = ∫_A ( ∂Q/∂x − ∂P/∂y ) dx dy                        (8.1)

where C is the boundary of A, traversed in a positive (anti-clockwise) direction. A good way to remember
which way is positive is to imagine you are walking around the boundary with the area of integration to
your left; then you are walking in a positive direction!

Green's theorem in the plane can also be written in vector form, embedding the x, y plane into ℝ³ (with
z = 0) and setting F(x, y, z) = (P(x, y), Q(x, y), R) with R arbitrary:

    ∮_C F · dx = ∫_A (∇ × F) · e3 dA.

It’s a good exercise to check that you agree with this statement!

2. Stokes’ theorem
This generalises the vector form of Green’s theorem to arbitrary surfaces in R3 .
Take a continuously differentiable vector field F(x, y, z) in ℝ³, and a surface S also in ℝ³ with area
elements dA = n̂ dA and boundary curve C ≡ ∂S. Then

    ∮_C F · dx = ∫_S (∇ × F) · dA                                                    (8.2)

As with Green’s theorem, we need to make a comment about orientations. The surface S has two choices
of normal vector (say nb and n b), and the curve C = @S also has two possible choices of orientation. In
either integral, changing from one orientation to the other (changing from n b to nb, or vice versa, in the
case of the surface integral) changes the overall sign of the answer. So, given a surface with a choice of
normal, how do we know which orientation we should take for the boundary (or equivalently, given a
choice of boundary orientation, which normal should we take for the surface) in order to get the equality
of Stokes’ theorem?
The answer is given by the right hand rule:
Curl the fingers of your right hand, and extend your thumb. If you imagine placing your hand on the
surface, near the boundary, with your thumb pointing in the direction of the surface normal, then your
fingers curl in the direction of the orientation of the boundary.
Equivalently, if you were to stand on the boundary with your head pointing in the direction of the
normal, and walk around the boundary such that the surface is on your left, then you are walking in the
direction the boundary should be oriented.

3. The divergence theorem:
As the name suggests, this theorem involves the divergence operator. If F is a continuously differentiable
vector field defined over the volume V with bounding surface S, then

    ∫_S F · dA = ∫_V ∇ · F dV                                                        (8.3)

As in Stokes' theorem, dA = n̂ dA, while n̂ is the outward unit normal.

Proofs? See later!

These theorems can be considered as higher-dimensional analogues of the fundamental theorem of cal-
culus: in each case the integral of a di↵erentiated object over some region is equated to the integral of
the undi↵erentiated object over the boundary of that region, as illustrated in figure 41.

8.2 Examples
Example 47. (Green) Check Green's theorem when P(x, y) = y² − 7y, Q(x, y) = 2xy + 2x and the area
of integration is bounded by the unit circle x² + y² = 1. Calculating the RHS of equation (8.1) first:

    ∫_A ( ∂Q/∂x − ∂P/∂y ) dx dy = ∫_A [ (2y + 2) − (2y − 7) ] dx dy
                                = 9 ∫_A dx dy = 9π,

since ∫_A dx dy is the area of the unit circle (recall from section 1). Next we parametrise the circle as
before, so that dx = −sin t dt, dy = cos t dt, and calculate the LHS of equation (8.1):

    ∮_C ( P dx/dt + Q dy/dt ) dt = ∫₀^{2π} [ (sin²t − 7 sin t)(−sin t) + (2 cos t sin t + 2 cos t)(cos t) ] dt
                                 = ∫₀^{2π} ( −sin³t + 7 sin²t + 2 cos²t sin t + 2 cos²t ) dt = 9π.

This agrees with the previous calculation, but was more arduous. Usually (not always) the area integral
is the easier part, so if faced with a tough closed line integral, see if Green’s Theorem is applicable.
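Both sides of Example 47 can also be evaluated numerically, which is a useful sanity check when the
integrals are less friendly. A sketch, assuming numpy and scipy are available (an added illustration, not
part of the original notes):

    import numpy as np
    from scipy import integrate

    P = lambda x, y: y**2 - 7*y
    Q = lambda x, y: 2*x*y + 2*x

    # Line integral around the unit circle, x = cos t, y = sin t:
    def line_integrand(t):
        x, y = np.cos(t), np.sin(t)
        return P(x, y) * (-np.sin(t)) + Q(x, y) * np.cos(t)
    lhs, _ = integrate.quad(line_integrand, 0, 2*np.pi)

    # Area integral of dQ/dx - dP/dy = 9 over the unit disc:
    rhs, _ = integrate.dblquad(lambda y, x: 9.0, -1, 1,
                               lambda x: -np.sqrt(1 - x**2),
                               lambda x: np.sqrt(1 - x**2))
    print(lhs, rhs, 9*np.pi)    # all ~28.274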
H
Example 48. (Stokes) Evaluate I = C
F · dx where

F = x2 e5z e1 + x cos y e2 + 3y e3

and C is the circle defined by x = 0, y = 2 + 2 cos ✓, z = 2 + 2 sin ✓, 0  ✓  2⇡ .

We’ll do this in two ways. First, the direct route. We have

x(✓) = (2 + 2 cos ✓) e2 + (2 + 2 sin ✓) e3

so
dx
= 2 sin ✓ e2 + 2 cos ✓ e3
d✓
and
dx
F (x(✓)) · = 3(2 + 2 cos ✓)2 cos ✓ .
d✓
Hence Z 2⇡
I= 12(1 + cos ✓) cos ✓ d✓ = 12⇡ .
0

Figure 41: The integration regions for the divergence theorem, Stokes’ theorem, and the fundamental theorem
of calculus. Note the similarities between the formulae: the left hand side of each has one less integral than the
right hand side, and this is ‘compensated’ by the presence of a derivative on each right hand side. Also notice
that the domains of the right hand side integrations, V , S and I, are bounded by the domains of the left hand
sides, S, C and {a, b} respectively.

Alternatively, Stokes’ theorem can be used. Calculating,

r ⇥ F = 3 e1 + 5x2 e5z e2 + cos y e3 .

Take S to be the planar (flat) disk spanning C. Note that C is such that y initially decreases as ✓
increases from 0, and that z increases similarly. By the right hand rule, we should therefore take the
b = e1 everywhere on S, dA = e1 dA, and (r ⇥F )·dA = 3 dA.
normal on S in the positive x direction. So n
Hence ZZ
I= 3 dA = 3 ⇥ (area of S) = 12⇡ as before.
A
Clearly, if we had taken the normal as e1 instead, we would have got the opposite sign for the surface
integral, and the two answers wouldn’t have agreed.
Example 49. (Stokes again) Let S be the upper (z 0) half of the sphere x2 + y 2 + z 2 = 1 ; evaluate
the surface integral Z
I = (x3 ey e1 3x2 ey e2 ) · dA .
S

To do this directly is tricky; instead we’ll attempt to use Stokes’ theorem. Observe that if F = x3 ey e3 ,
then r ⇥ F = (x3 ey e1 3x2 ey e2 ), which is the thing we want to integrate over the hemisphere. Hence
by Stokes’ theorem I
I= F · dx
C
where C is the boundary of S, which can be parametrised via x(✓) = cos ✓ e1 + sin ✓ e2 , 0  ✓  2⇡.
Then dx
d✓ = sin ✓ e1 + cos ✓ e2 , which has zero scalar product with F for all ✓. Hence I = 0.
Example 50. (Divergence) Evaluate I = ∫_S (3x e1 + 2y e2) · dA, where S is the sphere x² + y² + z² = 9.
Here F = (3x, 2y, 0), so ∇ · F = 3 + 2 = 5 and so, by the divergence theorem,

    I = ∫_V 5 dV = 5 · (4/3)π·3³ = 180π,

where we used the fact that the volume of V, the interior of S, is (4/3)π·3³.
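As a final illustration (not part of the original notes; it assumes numpy and scipy are available), here is
a numerical check of Example 50, computing the flux directly as a surface integral and comparing it with
the volume integral of ∇ · F:

    import numpy as np
    from scipy import integrate

    R = 3.0
    vol_side = 5 * (4/3) * np.pi * R**3          # volume integral of div F = 5

    # Surface side, parametrising the sphere by (theta, phi):
    def integrand(phi, theta):
        n = np.array([np.sin(theta)*np.cos(phi),
                      np.sin(theta)*np.sin(phi),
                      np.cos(theta)])            # outward unit normal
        x, y, z = R * n
        F = np.array([3*x, 2*y, 0.0])
        return F @ n * R**2 * np.sin(theta)      # F . n dA
    surf_side, _ = integrate.dblquad(integrand, 0, np.pi,
                                     lambda t: 0.0, lambda t: 2*np.pi)
    print(surf_side, vol_side, 180*np.pi)        # all ~565.487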


Example 51. (Divergence again) Let R be the region of R3 bounded by 2 2
R the surface z = x + y 2(a
paraboloid) and the plane z = 1, and let S be its boundary. Evaluate I = S F · dA where F = (y, x, z ).
Using the divergence theorem,
Z
I= r · F dV
V
Z
= 2z dV
V
Z 1
p
= 2z ⇡ rz2 dz where rz = z, as in example 41
0
Z 1
2⇡
= 2 ⇡ z 2 dz = .
0 3

8.3 Conservation laws and the continuity equation (Non-Examinable)


This is an important application of Rthe divergence theorem; it might also help you to get an intuition for
the meaning of the surface integral S F · dA, which is usually interpreted as the flux of some quantity F
through the surface of interest, S. Flux is the rate of flow per unit area, which is measured perpendicular
to the face of the surface, so it is dA multiplied by the unit normal vector nb with which the dot product
with F is taken. Typical fluxes of interest are the magnetic flux (F = B), the electric flux (F = E), and
transport fluxes such as mass (F = ⇢v) (of fluid perhaps), heat or energy.
To give a concrete example, let us consider a swarm of bees and suppose that at time t and position
x 2 R3 their density (number of bees per unit volume) is ⇢(x, t), and that near that point they are flying

Figure 42: Cartoon of flying bees, travelling through the surface A at a velocity v.

with average velocity v(x, t). Then the flux of bees is j = ⇢v, meaning that the number of bees per unit
time passing through a small area A with unit normal n b is j · n
b A.
To see this, imagine you are one of the flying bees near to the area element A; in a small time interval
t this element will appear to move with a distance v t and so the element will have swept out a volume
Abn · v t, as shown in figure 42. The total number of bees in this volume is ⇢ Abn · v t, and they will
all cross the area element A in this time t; this implies that the rate at which the bees fly through
A is ⇢ Ab n · v. Now imagine we have a larger surface S made up of many small area elements n b dA.
The total flux of bees across the whole surface S can be found by subdividing S into its constiuent area
elements and adding up the fluxes across each of these. Taking the limit as the sizes of these small area
elements tend to zero, the total flux through S is
X Z
lim ⇢ An b·v = j · dA .
A!0 S
all A’s

We can now think about the implications of the “conservation of bees” to this situation. The total
number of bees in some fixed volume V at time t is
Z Z
B(t) = ⇢ dV = ⇢(x, y, z, t) dx dy dz
V V

and the rate of change of B(t) is


Z Z
dB d @⇢
= ⇢ dV = dV
dt dt V V @t
(where the derivative can be taken inside the integral since the volume V is fixed). At the same time,
the rate of flow of bees out of V is Z
R= j · dA
S
where S is the boundary of V . Assuming no bees are being born (hatched?) or killed, it must be true
that
dB
= R
dt

where the minus sign is because R is the rate of flow out of V , while it is flow into V which causes B to
increase. Substituting in the formulae obtained above for B and R,
Z Z Z
@⇢
dV = j · dA = r · j dV
V @t S V

using the divergence theorem for the second equality. It’s important to remember that it is the three-
dimensional divergence (involving x, y and z derivatives) that appears on the right-hand side, since it
arose from applying the divergence theorem to a three-dimensional surface integral in x, y and z. Hence,
Z ✓ ◆
@⇢
+ r · j dV = 0
V @t
must hold, and must do so for all volumes V . The only way for this to be true is for the thing being
integrated to be zero everywhere, or in other words
@⇢
= r·j.
@t
This is called the continuity equation and besides its relevance to beekeeping, it crops up in fluid me-
chanics, electromagnetism, and many other situations where some conservation law is in play.
Bees are perhaps not the best example, since they are not continuous and so the dot product in the
integral will consider only the part of the bee crossing the surface perpendicularly. So what happens
to the other part of the bee? (This is why we should really consider continuous quantities e.g. flow of
water, magnetic fluxes etc.)
To quote Monty Python, “Eric the half a bee”:
“Half a bee, philosophically, must ipso facto half not be. But half the bee has got to be, vis-à-vis its
entity - d’you see? But can a bee be said to be or not to be an entire bee when half the bee is not a bee,
due to some ancient injury?”

8.4 Path independence of line integrals


Having seen an application of the divergence theorem, we now discuss an important special case of Stokes’
theorem.
In general, line integrals will depend on the path taken between their end points. This is not always the
case, though, and vector fields for which the line integral is path independent are rather special. They
are called conservative vector fields. We begin with a short example.
R
Bonus example 3. Calculate the line integral C F · dx, for F = (y cos(x), sin(x)) between (0, 0) and
(1, 1) over the two paths shown in figure 43 below.
R
Let us first calculate C1 F · dx. The parametrisation (recall the table of parametrisations given in
section 7.3) of this path can be taken to be x(t) = (t, t) (0  t  1), giving dx = (1, 1) dt and F (x(t)) =
(t cos(t), sin(t)). We can now perform the integration:
Z Z 1 Z 1
d(t sin(t))
F · dx = (t cos(t) + sin(t)) dt = dt = sin(1) .
C1 0 0 dt

Okay, now let’s try along the second path. It can be split into two sub-paths C21 and C22 , parametrised
as x(t) = (t, 0) and x(t) = (1, t) respectively, with t running from 0 to 1 in each case. Using this setup
we obtain for C21 : dx = (1, 0) dt, F (x(t)) = (0, sin(t)), and so F · dx = 0. For C22 : dx = (0, 1) dt,
F (x(t)) = (t cos(1), sin(1)), and so F · dx = sin(1). Putting these together to perform the line integral
along C2 gives
Z Z Z Z Z 1
F · dx = F · dx + F · dx = F · dx = sin(1) dt = sin(1) .
C2 C21 C22 C22 0

Both paths give us the same answer!

Figure 43: The field plot of F with two distinct paths shown, C1 and C2 = C21 + C22 .

Now, we only showed path independence for two very simple paths between two particular points, and
to get the general result, in other words to prove that the field is conservative, we would need to check
all possible paths joining all pairs of points. This would be both tediously (infinitely!) time consuming,
and rather hard in the case of very wiggly paths. We will now show that there is a simple method to
show path independence for all possible paths, which uses Stokes’ theorem in a crucial way.
Suppose that F (x) is a continuously di↵erentiable vector field defined on an open subset D of R3 . Let
C1 and C2 be any two paths from point a to point b in D, as shown in figure 44.

Figure 44: Two paths between points a and b in a domain D.

Let us investigate the conditions required for


Z Z
I := F · dx F · dx
C1 C2

to be zero, i.e. path independence. Suppose we have parametrised C2 by a parameter t running from ta

to tb , and that C 2 is the path along C2 taken in the opposite direction. Then
Z Z ta Z tb Z
dx dx
F · dx = F (x(t)) · dt = F (x(t)) · dt = F · dx .
C2 tb dt ta dt C2

Combining these equations we have


Z Z I
I= F · dx + F · dx = F · dx ,
C1 C2 C

where C is the closed path consisting of C1 followed by C 2 . Now if C is the boundary of a surface S in
D then, by Stokes’ theorem, Z
I= r ⇥ F · dA .
S
and so I will be zero if ∇ × F = 0 throughout D, giving us path independence. However, for this to work we need not only that ∇ × F = 0 in D, but also that every closed path in D is the boundary of some surface in D. Fortunately, there is a standard condition which ensures that this is the case. A region D is called simply connected if any closed curve in D can be continuously shrunk to a point in D, which in particular implies the condition we need, namely that every closed curve in D is the boundary of a surface in D. Two examples: the two-dimensional surface of a sphere is simply connected, while the two-dimensional surface of a torus is not simply connected – see figure 45.


Figure 45: Examples of simply and non simply connected spaces: the surface of a sphere (a) and of a torus (b).
The closed curve on (a) can be shrunk to a point and is the boundary of a part of the surface, while the closed
curve on (b) can’t be shrunk to a point, and is not the boundary of a part of the surface.

Three-dimensional spaces are harder to visualise, but for example D = R³ is simply connected, while if we drill out a cylinder around the z axis to leave the set of points

    D′ = { (x, y, z) ∈ R³ : x² + y² > 1 }

then D′ is not simply connected. (Check that you can see why – can you identify a closed curve in D′ which can't be contracted to a point?)
To summarise: if a vector field F satisfies ∇ × F = 0 in some simply-connected region D, and if C1 and C2 are two paths in D joining points a and b, then

    ∫_{C1} F · dx = ∫_{C2} F · dx

and the line integral depends only on the end points – it is path independent and the vector field F is
conservative.

The scalar potential (Non-Examinable)
A further observation: in section 3, we saw that one way for ∇ × F to be zero in some region D is for F to be the gradient of a scalar field there. A natural question is to ask whether it is possible to go the other way, and deduce from ∇ × F = 0 that F = ∇φ for some φ. If the region D is simply connected, then the answer is yes. Suppose that ∇ × F = 0 in D, and define a scalar field φ by

    φ(x) = ∫_{x0}^{x} F · dx

where x0 is some (fixed) reference point in D. By path independence, this is well-defined. But what is
its gradient?
We can prove this in a few different ways. One method (which is covered in the lectures) is just to calculate the partial derivatives directly. Consider the partial derivative of φ with respect to x,

    ∂φ/∂x = lim_{h→0} [ φ(x + h e1) − φ(x) ] / h = lim_{h→0} [ ∫_{x0}^{x+h e1} F · dx − ∫_{x0}^{x} F · dx ] / h .

Since ∇ × F = 0 over the simply-connected region D, the line integral of F is path independent over D. For the first of the two integrals in the expression above, let us therefore choose to integrate along a path that goes first from x0 to x, and then from x to x + h e1 in a straight line, as shown in figure 46 (where h is taken to be h e1).

Figure 46: A path C from x0 to x, and C from x to x + h, where h is a small increment.

We can therefore compute the line integral of F from x0 to x + h e1 as the sum of the line integrals along the two parts of this path, and hence

    ∂φ/∂x = lim_{h→0} [ ∫_{x0}^{x} F · dx + ∫_{x}^{x+h e1} F · dx − ∫_{x0}^{x} F · dx ] / h = lim_{h→0} (1/h) ∫_{x}^{x+h e1} F · dx .

If we parameterise this path as x(t) = (x + th, y, z), 0 ≤ t ≤ 1, then we have F(x(t)) = F(x + th, y, z) and dx/dt = (h, 0, 0), so

    ∫_{x}^{x+h e1} F · dx = ∫_0^1 F(x + th, y, z) · h e1 dt = h ∫_0^1 F1(x + th, y, z) dt,

where F1 is the e1 component of F. Substituting this back in to our expression for ∂φ/∂x gives

    ∂φ/∂x = lim_{h→0} ∫_0^1 F1(x + th, y, z) dt.

A nice way to evaluate this expression is using the mean value theorem for integrals, which in general states that ∫_a^b G(x) dx = G(x*)(b − a) for some x* ∈ [a, b]. In this case, this says that there exists a t* ∈ [0, 1] such that ∫_0^1 F1(x + th, y, z) dt = F1(x + t*h, y, z), and then taking the limit gives ∂φ/∂x = F1(x, y, z) by continuity.
Alternatively, we can consider a Taylor expansion of F1(x + th, y, z), and then we see that

    ∂φ/∂x = lim_{h→0} ∫_0^1 ( F1(x, y, z) + t h ∂F1/∂x (x, y, z) + . . . ) dt
          = lim_{h→0} [ t F1(x, y, z) + (1/2) t² h ∂F1/∂x (x, y, z) + . . . ]_0^1
          = lim_{h→0} ( F1(x, y, z) + (1/2) h ∂F1/∂x (x, y, z) + . . . ) ,

where all the terms contained within the dots are of order at least h², and hence upon taking the limit we obtain

    ∂φ/∂x = F1(x, y, z).

The partial derivatives with respect to y and z can be calculated following exactly the same procedure, and so we find

    ∇φ(x) = F1(x, y, z) e1 + F2(x, y, z) e2 + F3(x, y, z) e3 = F(x),

as required.
Here's an alternative (though similar) approach, which doesn't require us to consider the different components separately, and has the bonus of showing that φ is fully differentiable in the sense seen in section 5. If we return to figure 46, then (similarly to before), for x + h ∈ D, we can write:

    φ(x + h) − φ(x) = ∫_C F · dx .

Choosing the small increment to the path to be a straight line, we can parametrise C as x(t) = x + h t, where 0 ≤ t ≤ 1. Now

    ∫_C F · dx = ∫_0^1 F(x + h t) · h dt = F(x + t* h) · h ,   for some t* ∈ [0, 1] ,

where again the last step comes from the integral form of the mean value theorem. So, we have

    φ(a + h) − φ(a) = F(a) · h + [ F(a + t* h) · h − F(a) · h ] ,

where the term in square brackets is denoted R(h).

Denoting the unit vector in the direction h by n̂_h,

    R(h)/|h| = ( F(a + t* h) − F(a) ) · n̂_h

which tends to zero as h → 0 if F is continuous. Hence φ is differentiable at a, and the piece linear in h allows us to identify

    ∇φ(a) = F(a) .

To conclude: if ∇ × F = 0 in D and D is simply connected, then

    ∃ φ  s.t.  F = ∇φ

in D. The function φ (or sometimes its negative) is called the scalar potential.

Path independence summary
One more question: starting with F = ∇φ, can we show path independence directly? The answer is again yes. This follows from the answer to exercise 76 on problem sheet 7, but for completeness the relevant calculation is repeated here. Let C be any curve running from a to b, and t ↦ x(t), ta ≤ t ≤ tb, be any parametrisation of it, so that x(ta) = a and x(tb) = b. Then

    ∫_C F · dx = ∫_{ta}^{tb} ∇φ(x(t)) · (dx/dt) dt
               = ∫_{ta}^{tb} (∂φ/∂xi)(dxi/dt) dt        (Einstein notation!)
               = ∫_{ta}^{tb} dφ(x(t))/dt dt             (Chain rule)
               = φ(x(tb)) − φ(x(ta))                    (Fundamental Theorem of Calculus)
               = φ(b) − φ(a) .

Since we have not specified C beyond giving its end points a and b, the integral is path independent.

Finally we can use our double-ended arrows (though path independence implying the existence of the scalar field is only in the non-examinable section above)! In a simply connected region D,

    ∇ × F = 0   ⇔   path independence of I   ⇔   ∃ φ s.t. F = ∇φ .

Example 52. Compute I = ∫_C F · dx for F = (y cos(xy), x cos(xy) − z sin(yz), −y sin(yz)), where C is specified by

    t ↦ x(t) = ( sin(t)/sin(1), log(1+t)/log(2), (1 − e^t)/(1 − e) ),    0 ≤ t ≤ 1.
Answer: first note that x(0) = (0, 0, 0) and x(1) = (1, 1, 1). Then compute ∇ × F = · · · = 0 on all of R³ (which is simply connected). Hence F = ∇φ for some φ. To find φ we need to solve

    (1) φ_x = F1 ;   (2) φ_y = F2 ;   (3) φ_z = F3 .

From (1), φ = sin(xy) + f(y, z) for some function f of y and z; then from (2), f(y, z) = cos(yz) + g(z). Finally from (3), g(z) = constant = A, say. Hence

    φ(x, y, z) = sin(xy) + cos(yz) + A

and

    I = φ(1, 1, 1) − φ(0, 0, 0) = (sin(1) + cos(1) + A) − (1 + A) = sin(1) + cos(1) − 1.
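As a cross-check of this example (an illustrative sketch only, assuming the sympy library is available; it is not part of the notes), a computer algebra system confirms that ∇ × F = 0, that φ = sin(xy) + cos(yz) is a potential, and that the answer is sin(1) + cos(1) − 1:

import sympy as sp

x, y, z = sp.symbols('x y z')
F = sp.Matrix([y*sp.cos(x*y), x*sp.cos(x*y) - z*sp.sin(y*z), -y*sp.sin(y*z)])

# curl of F, written out component by component
curl = sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                  sp.diff(F[0], z) - sp.diff(F[2], x),
                  sp.diff(F[1], x) - sp.diff(F[0], y)])
print(sp.simplify(curl))           # Matrix([[0], [0], [0]])

phi = sp.sin(x*y) + sp.cos(y*z)
grad_phi = sp.Matrix([sp.diff(phi, v) for v in (x, y, z)])
print(sp.simplify(grad_phi - F))   # zero vector, so F = grad(phi)
print(phi.subs({x: 1, y: 1, z: 1}) - phi.subs({x: 0, y: 0, z: 0}))   # sin(1) + cos(1) - 1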

Some vocabulary: if ∇ × F = 0, F is said to be closed; and if F = ∇φ, F is said to be exact. Even though it wasn't phrased in this way, we saw in subsection 3.3 that exact ⇒ closed, and we have just shown that closed ⇒ exact whenever the space D on which F is defined is simply connected. However, if D is not simply connected, this may fail, and the extent of this failure gives some information about the 'shape' (or topology) of D. This features in a scene in the film 'A Beautiful Mind'. . .
Example 53. A quick example to see what can go wrong on a non-simply connected space.

Let F(x) = ( −y/(x² + y²), x/(x² + y²), 0 ) be a vector field defined on D, where D is R³ with the z-axis (the points with x = y = 0, where F is undefined) removed. D is a non-simply connected region, as can be seen by considering any circle around the z-axis. Then F is closed, since we have

    ∂x F2 = ∂y F1 = (y² − x²)/(x² + y²)² ,

and hence ∇ × F = 0.

Now consider ∫_C F · dx, where C is the circle of radius 1 in the x, y-plane centred on the origin, which we can parameterise as x(t) = (cos(t), sin(t), 0) for 0 ≤ t ≤ 2π. Then using our standard method for computing line integrals, we have dx(t)/dt = (−sin(t), cos(t), 0) and F(x(t)) = (−sin(t), cos(t), 0), and hence

    ∫_C F · dx = ∫_0^{2π} F(x(t)) · (dx(t)/dt) dt = ∫_0^{2π} 1 dt = 2π.

If we now consider the left and right semicircles CL and CR as shown in Figure 47, then we must have ∫_{CL} F · dx ≠ ∫_{CR} F · dx, as otherwise ∮_C F · dx = ∫_{CR} F · dx − ∫_{CL} F · dx = 0, in contradiction to the direct calculation we did above.

Figure 47: The curve C, and the left and right semicircles CL and CR .

Line integrals of F on D are therefore not necessarily path independent, and we cannot find a scalar field φ such that F = ∇φ on D. F is therefore not exact.
The existence of a vector field which is closed, but not exact, is due to the topology of D.
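A quick numerical illustration of this (a sketch, not from the notes, assuming numpy and scipy are available): the loop integral around the unit circle really is 2π rather than zero, so no single-valued potential can exist on D.

import numpy as np
from scipy.integrate import quad

def integrand(t):
    x, y = np.cos(t), np.sin(t)
    Fx, Fy = -y / (x**2 + y**2), x / (x**2 + y**2)
    return Fx * (-np.sin(t)) + Fy * np.cos(t)    # F(x(t)) . dx/dt

print(quad(integrand, 0, 2*np.pi)[0], 2*np.pi)   # both 6.28318...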

Aside: Green ⇒ Cauchy


One final remark, which might be relevant if you are taking the complex variable course (otherwise, don't worry about it). In two dimensions Stokes becomes Green, stating that if P(x, y) and Q(x, y) are continuously differentiable on A ⊂ R² then

    ∮_{C=∂A} ( P(x, y) dx + Q(x, y) dy ) = ∫_A ( ∂Q/∂x − ∂P/∂y ) dA .

A special ('zero-curl') case is the fact that if P(x, y) and Q(x, y) are continuously differentiable on A ⊂ R² and satisfy ∂Q/∂x = ∂P/∂y there, then

    ∮_{C=∂A} ( P(x, y) dx + Q(x, y) dy ) = 0 .

This might remind you of Cauchy's integral theorem, which states that if f : ℂ → ℂ is holomorphic on some region A ⊂ ℂ, then

    ∮_{C=∂A} f(z) dz = 0 .

This isn't a coincidence. In fact Cauchy is a consequence of Green, at least if f is assumed to have continuous partial derivatives¹. To prove this, write

    f(z) = f(x + iy) = u(x, y) + i v(x, y) .

Given that f is holomorphic, the following (Cauchy-Riemann) equations hold:

    ∂u/∂x = ∂v/∂y   (CR1) ;      ∂u/∂y = −∂v/∂x   (CR2) .
Since dz = dx + i dy, we have

    ∮_C f(z) dz = ∮_C (u + iv)(dx + i dy) = ∮_C (u dx − v dy) + i ∮_C (v dx + u dy) .

Using Green with P = u and Q = −v, the first integral on the right-hand side can be rewritten as

    ∮_C (u dx − v dy) = ∫_A ( −∂v/∂x − ∂u/∂y ) dA

which vanishes since the integrand on the RHS is zero by (CR2). Likewise the second integral is

    ∮_C (v dx + u dy) = ∫_A ( ∂u/∂x − ∂v/∂y ) dA

and this is also zero, by (CR1). Hence ∮_C f(z) dz = 0, as stated by Cauchy.

1 For a more powerful result special to complex analysis, look up Goursat’s theorem.
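If you want to see Cauchy's theorem 'in action' numerically, here is an illustrative sketch (not part of the notes; it assumes numpy and scipy, and the choice f(z) = z² is just one example of a holomorphic function): the contour integral around the unit circle vanishes.

import numpy as np
from scipy.integrate import quad

f = lambda z: z**2                       # any holomorphic f will do
def dz_integrand(t):
    z = np.exp(1j*t)
    return f(z) * 1j * z                 # f(z) dz with dz = i e^{it} dt

re = quad(lambda t: dz_integrand(t).real, 0, 2*np.pi)[0]
im = quad(lambda t: dz_integrand(t).imag, 0, 2*np.pi)[0]
print(re, im)                            # both approximately 0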

9 Proofs of the three big theorems (Non-Examinable)
In this chapter we return to the ‘big three’ theorems, and indicate how they can be proved. We’ll begin
with Green’s theorem, and work up from there. Don’t worry about memorising these proofs, but instead
try to understand how they work. The chapter also includes a number of further examples which you
may find useful in your revision.

9.1 Green’s theorem in the plane


Recall that the 'coordinate' version of the theorem reads as follows. Suppose P(x, y) and Q(x, y), (x, y) ∈ R², are continuously differentiable scalar fields in 2 dimensions, and suppose that C is a simple closed curve in the x-y plane, traversed anticlockwise and surrounding an area A. Then

    ∮_C ( P(x, y) dx + Q(x, y) dy ) = ∫_A ( ∂Q/∂x − ∂P/∂y ) dx dy ,


Figure 48: Area of integration, A, with boundary C. Plot (a) shows the curve split into left and right sections
x = hL (y), x = hR (y), while plot (b) shows the curve split into upper and lower sections y = gU (x) and y = gL (x).

Proof of Green's theorem in the plane: We'll make the additional assumption that the area A is both horizontally and vertically simple, meaning that each point in A lies between exactly two boundary points (on C) to its left and right, and also between exactly two boundary points above and below, as illustrated in figure 48. We'll start with the right hand side of the theorem and look at the two terms, which we'll label as (1) and (2) separately, remembering that we can interchange the order of integration by Fubini's Theorem, and splitting up the integration region and its bounding curve in the two ways shown in the figure. We have

    RHS = ∫_A (∂Q/∂x) dx dy − ∫_A (∂P/∂y) dx dy
        = ∫_A (∂Q/∂x) dx dy − ∫_A (∂P/∂y) dy dx ,

with the first term on the second line labelled (1) and the second term (including its minus sign) labelled (2),

where Fubini’s theorem was used to swap the order of integration in the second term in going from the

first to the second line. Now evaluating the two terms in turn,

    (1) = ∫_{y−}^{y+} ∫_{hL(y)}^{hR(y)} (∂Q/∂x) dx dy = ∫_{y−}^{y+} [ Q(x, y) ]_{x=hL(y)}^{x=hR(y)} dy
        = ∫_{y−}^{y+} ( Q(hR, y) − Q(hL, y) ) dy                       (now split the integral)
        = ∫_{y−}^{y+} Q(hR, y) dy + ∫_{y+}^{y−} Q(hL, y) dy            (note change of limits in 2nd integral)
        = ∮_C Q dy ,

where y− and y+ denote the smallest and largest values of y in A.

We follow a similar argument for (2):

    (2) = −∫_{x−}^{x+} ∫_{gL(x)}^{gU(x)} (∂P/∂y) dy dx = −∫_{x−}^{x+} [ P(x, y) ]_{y=gL(x)}^{y=gU(x)} dx
        = ∫_{x−}^{x+} ( −P(x, gU) + P(x, gL) ) dx                      (now split the integral)
        = ∫_{x+}^{x−} P(x, gU) dx + ∫_{x−}^{x+} P(x, gL) dx            (note change of limits in 1st integral)
        = ∮_C P dx .

The last steps in both calculations ((1) and (2)) are made using the fact that the curve C can be split as in the diagram, i.e. C = hL(y) ∪ hR(y) = gU(x) ∪ gL(x).
Finally we can bring (1) and (2) back together to see that (1) + (2) = LHS, as required by the theorem. (To prove the theorem for more complicated regions, divide it up into subregions first, and then add the results, noting that line integrals on bits of boundaries shared by two subregions will be traversed in opposite directions, and hence cancel.)
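Before moving on, here is a small numerical sanity check of Green's theorem (an illustrative sketch, not part of the notes; P, Q and the unit square region are chosen purely for the example, and numpy/scipy are assumed):

import numpy as np
from scipy.integrate import quad, dblquad

P = lambda x, y: -y**2
Q = lambda x, y: x**2

# Line integral around the unit square, traversed anticlockwise
lhs = (quad(lambda t: P(t, 0), 0, 1)[0]             # bottom: (t, 0),   dx = dt
       + quad(lambda t: Q(1, t), 0, 1)[0]           # right:  (1, t),   dy = dt
       + quad(lambda t: -P(1 - t, 1), 0, 1)[0]      # top:    (1-t, 1), dx = -dt
       + quad(lambda t: -Q(0, 1 - t), 0, 1)[0])     # left:   (0, 1-t), dy = -dt

# Area integral of dQ/dx - dP/dy = 2x + 2y over the square
rhs = dblquad(lambda y, x: 2*x + 2*y, 0, 1, lambda x: 0, lambda x: 1)[0]
print(lhs, rhs)   # both equal 2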

9.2 Stokes’ theorem


Stokes' theorem generalises Green's theorem in the plane, relating an integral over a surface S, now in R³, to the line integral over the boundary of S, C. This is most clearly seen by writing Green's theorem in vector form as

    ∮_C F · dx = ∫_A (∇ × F) · e3 dx dy ,

where e3 is the unit vector in the z direction, F(x, y, z) = (P(x, y), Q(x, y), R), A is a region in the x-y plane, and C is the curve which bounds A, traversed anticlockwise when viewed from above.
Stokes' looks much the same:

    ∮_C F · dx = ∫_S (∇ × F) · dA

where F(x, y, z) is now any continuously differentiable vector field in R³ and S ⊂ R³ is a smooth oriented surface which is bounded by the closed curve C. As before, dA is shorthand for n̂ dA, and n̂ is a unit vector normal to the surface at the location of the area element dA, pointing up when viewed from the side of the surface from which C appears to be traversed anticlockwise.
Put simply, Stokes' theorem relates the microscopic circulation (curl) of some quantity F on a surface to the total circulation around the boundary of that surface. As a warm-up to the full proof, let's start by looking at this intuitively without specific vector fields or surfaces.
Figure 49 (adapted from http://www.youtube.com/watch?v=9iaYNaENVH4) shows two vector fields over a surface and its bounding curve. The line integral part of Stokes' theorem, ∮_C F · dx, has been split into four sections, one along each edge of the surface. The orientation of the line integral has been chosen to be positive and is shown with red arrows.

Figure 49: Sketches of two vector fields over a surface. The black arrows represent the vector field and the red arrows the direction of the line integral around the boundary of the surface. Sketch (a) has the line integral approximately zero while for sketch (b) it is positive.
Looking at the bottom of sketch (a), we see that the field lines and the path element dx are aligned, so the dot product will be positive; along the left and right sides the field lines and dx are perpendicular, so the line integral there contributes nothing; finally, along the top the field lines and dx are in opposite directions, giving a negative dot product. Adding these to get the overall line integral will give us zero (or close to it), as the top and bottom sections cancel each other. This is in agreement with what we would expect from the surface integral part of Stokes' theorem, ∫_S (∇ × F) · dA, since the homogeneous field lines will not induce any rotation, so ∇ × F = 0.
Conversely for the field shown in sketch (b), at all 4 sections of the line integral the field lines are parallel
to dx and so the dot product will produce a positive result, making the overall line integral positive.
Again this agrees with the surface integral part of Stokes’ theorem as the circular field lines will induce
a rotation and so a positive curl, which in turn will cause the surface integral to be positive.
This is a very rough argument, far from a proof, but with any luck you can see how it might be that
the surface integral of the curl of a vector field over S is related to the line integral of the vector field
around the bounding curve C. Now for a more rigorous proof.

Proof of Stokes' theorem: The basic setup is a differentiable vector field, F(x, y, z), and a surface S ⊂ R³ bounded by a closed curve C, as shown in figure 50.
The surface S can be described parametrically by a continuously differentiable map x(u, v) : U → R³ with x(u, v) = (x1(u, v), x2(u, v), x3(u, v)). If the boundary of the parameter domain U ⊂ R² is a closed curve C̃, given parametrically as u(t) = (u(t), v(t)), t1 ≤ t ≤ t2, then the points of the boundary C of S are x(u(t)), again with t1 ≤ t ≤ t2.
Now consider the right hand side of Stokes' theorem, writing everything as a two-dimensional area integral over the region U in the u-v plane. Let's write it out in full first – it will help us to appreciate the utility of index notation!

Figure 50: The surface S ⊂ R³ with its bounding curve C. Also shown is the parameter domain U ⊂ R² with its bounding curve C̃, in the parametrised coordinate system u, v. Under the mapping (u, v) ↦ x(u, v), C̃ → C.

    I = ∫_S (∇ × F) · dA = ∫_U (∇ × F) · ( ∂x/∂u × ∂x/∂v ) du dv
      = ∫_U [ (∂F3/∂x2 − ∂F2/∂x3)(∂x2/∂u ∂x3/∂v − ∂x2/∂v ∂x3/∂u)
            + (∂F1/∂x3 − ∂F3/∂x1)(∂x3/∂u ∂x1/∂v − ∂x3/∂v ∂x1/∂u)
            + (∂F2/∂x1 − ∂F1/∂x2)(∂x1/∂u ∂x2/∂v − ∂x1/∂v ∂x2/∂u) ] du dv

Now we'll rewrite the integrand using the epsilon and (Kronecker) delta symbols; the expression is much more concise, and can be manipulated as follows:

    (∇ × F) · (x_u × x_v) = ε_ijk ε_ilm (∂Fk/∂xj)(∂xl/∂u)(∂xm/∂v) = (δ_jl δ_km − δ_jm δ_kl)(∂Fk/∂xj)(∂xl/∂u)(∂xm/∂v)
        = (∂Fk/∂xj)(∂xj/∂u)(∂xk/∂v) − (∂Fk/∂xj)(∂xk/∂u)(∂xj/∂v)
        = (∂Fk/∂xj)(∂xj/∂u)(∂xk/∂v) − (∂Fk/∂xj)(∂xj/∂v)(∂xk/∂u)
        = (∂Fk/∂u)(∂xk/∂v) − (∂Fk/∂v)(∂xk/∂u)                            (using the chain rule in reverse)
        = ∂/∂u( Fk ∂xk/∂v ) − Fk ∂²xk/∂u∂v − ∂/∂v( Fk ∂xk/∂u ) + Fk ∂²xk/∂v∂u
                                                                          (using the product rule in reverse)
        = ∂/∂u( Fk ∂xk/∂v ) − ∂/∂v( Fk ∂xk/∂u )

Written in this way, the surface integral I has been expressed as an area integral, and as a bonus the integrand has turned out to be in just the right form to enable us to use Green's theorem in the plane. The 'plane' is now the u-v plane, with P(u, v) = Fk ∂xk/∂u and Q(u, v) = Fk ∂xk/∂v. Hence,

    I = ∫_U [ ∂/∂u( Fk ∂xk/∂v ) − ∂/∂v( Fk ∂xk/∂u ) ] du dv       (now use Green's theorem)
      = ∮_{C̃} ( Fk ∂xk/∂u du + Fk ∂xk/∂v dv )                     (note this is a line integral in the u,v plane)
      = ∫_{t1}^{t2} Fk ( ∂xk/∂u du/dt + ∂xk/∂v dv/dt ) dt          (now use chain rule in reverse on the term in brackets)
      = ∫_{t1}^{t2} Fk dxk/dt dt = ∮_C F · dx .

Hence Stokes’ theorem follows from Green’s theorem in the plane, which was proved earlier.

9.3 The divergence theorem
The divergence theorem gives us a relationship between volume integrals and surface integrals, just as Stokes' theorem related surface integrals to line integrals. Recall its content: if F is a continuously differentiable vector field defined over a volume V ⊂ R³ with bounding surface S, then

    ∫_S F · dA = ∫_V ∇ · F dV

As before, dA = n̂ dA, and now the unit normal n̂ should be chosen to point out of the volume V.
Aside: It may not look like it, but the divergence theorem is actually a higher dimensional version of Green's theorem in the plane. This can be seen by rewriting Green's theorem as follows.
Recall Green's theorem:

    I = ∮_C ( P dx + Q dy ) = ∫_A ( ∂Q/∂x − ∂P/∂y ) dx dy ,        (9.1)

where C is a closed curve which bounds an area A in the x-y plane. We can parametrise the curve with a parameter t as x(t) = (x(t), y(t)); then the tangent at any point on the curve is given by dx/dt.

Now we can write the line integral in terms of t:

    I = ∮_C ( P dx/dt + Q dy/dt ) dt ,        (9.2)

and (though this may seem a little odd right now) let us define the following 2-D vectors:

    F = (Q, −P) ,      N = ( dy/dt, −dx/dt ) ,
so that the integrand of equation (9.2) can be written as the following dot product:

    I = ∮_C F · N dt .        (9.3)

Now N is normal to the curve C, as can be checked by taking its dot product with the tangent dx/dt:

    N · dx/dt = (dy/dt)(dx/dt) − (dx/dt)(dy/dt) = 0 .
We may also write:

    N = n̂ |N| = n̂ √( (dy/dt)² + (dx/dt)² ) = n̂ ds/dt ,

where s is the arc length and n̂ is the unit normal. Now we can write equation (9.3) as I = ∮_C F · n̂ (ds/dt) dt = ∮_C F · n̂ ds.
Finally, with F = (Q, −P) we can rewrite the RHS of equation (9.1) as ∫_A ∇ · F dA, and so we get Green's theorem in the following form:

    ∮_C F · n̂ ds = ∫_A ∇ · F dA

which is clearly a lower dimensional form of the divergence theorem

    ∫_S F · n̂ dA = ∫_V ∇ · F dV.

This is another example of the analogies between the big theorems that were mentioned at the end of
section 10.1. With this aside out of the way, we return to the proof of the full theorem.

Proof of the divergence theorem: Suppose that, in components, the vector field in the statement of
the theorem is F = (F1 , F2 , F3 ). Then we can write
F = F1 + F2 + F3
where F 1 = F1 e1 , F 2 = F2 e2 and F 3 = F3 e3 . We’ll start by proving the theorem for F 3 . Suppose that
V projects onto a region A in the x-y plane, and assume further that V is vertically simple, meaning
that each point in the interior of V lies between exactly two points above and below it on the boundary
of V , with all three of these points projecting onto the same point on the x-y plane in the interior of A.
This means that S, the boundary of V, can be split into two halves as S = SL ∪ SU where SL ('lower')
contains all of the ‘below’ points on S, and SU (‘upper’) contains all of the ‘above’ points, as illustrated
in figure 51.

Figure 51: A vertically simple volume, and the corresponding split of its boundary S as S = SL ∪ SU.

Much as in method 2 for doing surface integrals, we’ll parametrise SL and SU using x and y, with the
z coordinates of points on the two surfaces being given by two functions g and h, say, so that on SL ,
z = g(x, y), while on SU , z = h(x, y).
Now consider the volume integral side of the divergence theorem, a triple integral in x, y and z. Opting to do the z integral first, which by Fubini's theorem is allowed, we have

    ∫_V ∇ · F_3 dV = ∫∫_A ( ∫_{g(x,y)}^{h(x,y)} ∂F3/∂z dz ) dx dy
                   = ∫∫_A [ F3(x, y, z) ]_{z=g(x,y)}^{z=h(x,y)} dx dy
                   = ∫∫_A ( F3(x, y, h(x, y)) − F3(x, y, g(x, y)) ) dx dy  =  (∗) .

On the other ('surface') side of the theorem, we can write

    ∫_S F_3 · dA = ∫_{SU} F_3 · dA − ∫_{SL} F_3 · dA

where both surface integrals on the RHS are to be taken with ‘upwards’ pointing (i.e. positive z com-
ponent) normals. Note that all normals in the integral on the LHS point out of V , which on SL means
downwards. The minus sign in front of the second term on the RHS converts the ‘upwards-normals’
integral to the downwards-normals version, so as to match the LHS.
Now treat SL as a parametrised surface with coordinates x(x, y) = (x, y, g(x, y)). By method 1, we have

    ∫_{SL} F_3 · dA = ∫∫_A F_3(x(x, y)) · ( ∂x/∂x × ∂x/∂y ) dx dy
                    = ∫∫_A F3(x, y, g(x, y)) e3 · ( (1, 0, gx) × (0, 1, gy) ) dx dy
                    = ∫∫_A F3(x, y, g(x, y)) e3 · ( −gx, −gy, 1 ) dx dy = ∫∫_A F3(x, y, g(x, y)) dx dy .

Likewise,

    ∫_{SU} F_3 · dA = ∫∫_A F3(x, y, h(x, y)) dx dy

and hence

    ∫_S F_3 · dA = ∫∫_A ( F3(x, y, h(x, y)) − F3(x, y, g(x, y)) ) dx dy = (∗)

and we've proved that

    ∫_V ∇ · F_3 dV = ∫_S F_3 · dA .
In the same way (but doing the x or y integrals first)

    ∫_V ∇ · F_1 dV = ∫_S F_1 · dA

and

    ∫_V ∇ · F_2 dV = ∫_S F_2 · dA .

Adding these up and using F = F_1 + F_2 + F_3,

    ∫_V ∇ · F dV = ∫_S F · dA ,

as required. Notice that this is very similar to the way that Green’s theorem in the plane was proved
earlier. Given the aside that began this subsection, this might not come as a huge surprise.

9.4 Further examples


Bonus example 4. Verify Stokes' theorem for the vector field F(x, y, z) = (y, z, x) and the parabolic surface z = 1 − (x² + y²), z ≥ 0, shown in figure 52.
We first need to find the bounding curve C, in order to calculate the line integral part of Stokes' theorem. It can be found, see figure, by locating the intersection of the surface and the z = 0 plane:
z = 0 = 1 − (x² + y²)  ⇒  x² + y² = 1, the unit circle.
Next we are going to parametrise the curve and calculate I1 ≡ ∮_C F · dx. As seen earlier, a parametrisation for a unit circle in R² is

    x(t) = cos t ,   y(t) = sin t ,

and to situate it in the z = 0 plane we simply set z(t) = 0. Using this we can write x(t), F(x(t)) and dx(t)/dt in terms of the parameter t:

    x(t) = (cos t, sin t, 0) ;
    F(x(t)) = (sin t, 0, cos t) ;
    dx(t)/dt = (−sin t, cos t, 0) .

Figure 52: Parabolic surface z = 1 − (x² + y²).

So now we are ready to do the line integral:

    I1 = ∮_C F · dx = ∫_0^{2π} (sin t, 0, cos t) · (−sin t, cos t, 0) dt
       = ∫_0^{2π} −sin² t dt
       = ∫_0^{2π} (1/2)(cos(2t) − 1) dt = −π.

Now let's tackle the surface integral: I2 ≡ ∫_S (∇ × F) · dA. We will start in Cartesian coordinates and calculate the integrand using 'method 2', the level surface method. First off,

    ∇ × F = det[ e1, e2, e3 ; ∂/∂x, ∂/∂y, ∂/∂z ; y, z, x ] = (−1, −1, −1) .
Next we recall from section 9 how to calculate the area element dA = n̂ dA. Since we can express our surface as a level set of a scalar field, i.e. f(x, y, z) = z + x² + y² = 1, and

    ∇f(x, y, z) = (∂f/∂x) e1 + (∂f/∂y) e2 + (∂f/∂z) e3 = (2x, 2y, 1) ,

our integrand is given by:

    (∇ × F) · dA = [ (−1, −1, −1) · (2x, 2y, 1) / ( e3 · (2x, 2y, 1) ) ] dx dy = (−2x − 2y − 1) dx dy ,
which leaves us ready to compute the double integral:

    I2 = ∫_{x²+y²≤1} (−2x − 2y − 1) dx dy .

Quickest at this stage is to spot that the integration region is symmetrical in both the x and y directions, so the integrals of −2x and −2y both vanish; this leaves us with

    I2 = ∫_{x²+y²≤1} (−1) dx dy = −(area of unit disk) = −π .

Alternatively, we change to polar coordinates (x = r cos θ, y = r sin θ), with the two parameters r and θ running between 0 and 1 and 0 and 2π respectively. (Projecting the surface integral onto the x,y plane so that our region for the double integral is the unit circle.) The change of variables replaces dA = dx dy by dA = r dr dθ; this comes from the Jacobian, as discussed in section 10.5. Hence
    I2 = ∫_S (∇ × F) · dA = ∫_0^{2π} ∫_0^1 (−2r cos θ − 2r sin θ − 1) r dr dθ
       = ∫_0^{2π} ( −(2/3) cos θ − (2/3) sin θ − 1/2 ) dθ
       = [ −(2/3) sin θ + (2/3) cos θ − θ/2 ]_0^{2π} = −π.
Either way, we have obtained the same result as before, and ∮_C F · dx = ∫_S (∇ × F) · dA, as required.
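For completeness, here is a numerical cross-check of this example (a sketch only, not from the notes; numpy and scipy assumed), confirming that both sides of Stokes' theorem give −π:

import numpy as np
from scipy.integrate import quad, dblquad

# Line integral: F(x(t)) . dx/dt = -sin^2(t) around the unit circle
line = quad(lambda t: -np.sin(t)**2, 0, 2*np.pi)[0]

# Surface integral of (curl F) . dA = (-2x - 2y - 1) dx dy over the unit disk
surf = dblquad(lambda y, x: -2*x - 2*y - 1, -1, 1,
               lambda x: -np.sqrt(1 - x**2), lambda x: np.sqrt(1 - x**2))[0]

print(line, surf, -np.pi)   # all approximately -3.14159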

Bonus example 5. Use the divergence theorem to compute

    ∫_S F · dA

where F = (y²z, y³, xz) and S is the surface of the cube |x| ≤ 1, |y| ≤ 1, 0 ≤ z ≤ 2.


Answer: ∇ · F = 3y² + x, so

    ∫_S F · dA = ∫_V (3y² + x) dV
               = ∫_{−1}^{1} ∫_{−1}^{1} ∫_0^2 (3y² + x) dz dx dy
               = 2 ∫_{−1}^{1} ∫_{−1}^{1} (3y² + x) dx dy
               = 2 ∫_{−1}^{1} [ 3y²x + (1/2)x² ]_{x=−1}^{x=1} dy
               = 2 ∫_{−1}^{1} 6y² dy = (12/3) [ y³ ]_{−1}^{1} = 8.
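The same answer can be confirmed symbolically (an illustrative sketch, assuming the sympy library is available; not part of the notes):

import sympy as sp

x, y, z = sp.symbols('x y z')
divF = 3*y**2 + x                        # divergence of F = (y^2 z, y^3, x z)
print(sp.integrate(divF, (z, 0, 2), (x, -1, 1), (y, -1, 1)))   # 8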

Bonus example 6. Verify the divergence theorem by calculating both left and right hand sides of

    ∫_S F · dA = ∫_V ∇ · F dV ,

when F = (7x, 0, −z), S is the sphere x² + y² + z² = 4, and V is the volume inside it.
Let us start with the surface integral. Since the surface is a sphere, which does not sit in a single-valued fashion above any single plane, we use the parametric form: ∫_S F · dA = ∫_U F(x(u, v)) · (x_u × x_v) du dv. As in section 9, the (radius 2) sphere can be parametrised as x(u, v) = (2 sin(u) cos(v), 2 sin(u) sin(v), 2 cos(u)), with 0 ≤ u ≤ π, 0 ≤ v ≤ 2π, and so (as we calculated in example 7 above) (x_u × x_v) = 2 sin(u) x. Now

we construct our integral in terms of the parameters u and v:

    ∫_U F(x(u, v)) · (x_u × x_v) du dv
      = ∫_0^π ∫_0^{2π} (14 sin(u) cos(v), 0, −2 cos(u)) · (2 sin(u) x) dv du
      = ∫_0^π ∫_0^{2π} ( 56 sin³(u) cos²(v) − 8 cos²(u) sin(u) ) dv du
      = ∫_0^π [ 56 sin³(u) ( v/2 + sin(2v)/4 ) − 8 cos²(u) sin(u) v ]_0^{2π} du
      = ∫_0^π ( 56π sin³(u) − 16π cos²(u) sin(u) ) du
      = ∫_0^π ( 56π sin(u)(1 − cos²(u)) − 16π cos²(u) sin(u) ) du
      = ∫_0^π ( −72π sin(u) cos²(u) + 56π sin(u) ) du
      = [ −72π ( −cos³(u)/3 ) + 56π ( −cos(u) ) ]_0^π = 64π .

Next, the volume integral. The first step is to calculate the divergence of F:

    ∇ · F = ( ∂/∂x, ∂/∂y, ∂/∂z ) · (7x, 0, −z) = 6 .

Hence the integral is

    ∫_V 6 dV = 6 ( (4/3)πr³ ) = 6 · (4/3)π · 2³ = 64π .

This was much simpler than the surface integral, and in fact we didn't have to integrate at all – we just used the formula for the volume of a sphere!
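A numerical cross-check of the surface-integral side (a sketch, not from the notes; numpy and scipy assumed), using the same parametrisation of the radius-2 sphere:

import numpy as np
from scipy.integrate import dblquad

def flux(v, u):
    x = np.array([2*np.sin(u)*np.cos(v), 2*np.sin(u)*np.sin(v), 2*np.cos(u)])
    F = np.array([7*x[0], 0.0, -x[2]])
    return F @ (2*np.sin(u)*x)           # F . (x_u x x_v)

lhs = dblquad(flux, 0, np.pi, lambda u: 0, lambda u: 2*np.pi)[0]
print(lhs, 64*np.pi)                     # both about 201.06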
Bonus example 7. Evaluate the line integral ∮_C F · dx with F = (y, z/2, 3y/2) around the curve C given by the intersection of the sphere x² + y² + z² = 6z, and the plane z = x + 3. The curve is a circle lying in the plane z = x + 3. If we use Stokes' theorem then we need to think about how to set up the surface integral for a surface with C as its boundary. If we do this by using x, y as parameters then we will need the projection of C onto the x, y-plane, which is given by: x² + y² + (x + 3)² = 6(x + 3)  ⇒  2x² + y² = 9, an ellipse.
The line integral can be done directly, or we can apply Stokes' theorem: ∮_C F · dx = ∫_S (∇ × F) · dA. The easiest surface to consider that has C as its boundary is the flat disk S in the z = x + 3 plane (we could have used either of the parts of the sphere that have C as boundary, but that would be more complicated).
Let's start by calculating the curl of F:

    ∇ × F = det[ e1, e2, e3 ; ∂/∂x, ∂/∂y, ∂/∂z ; y, z/2, 3y/2 ] = (1, 0, −1)

And now let's find dA using method 2, describing S as (part of) a level set and parametrising it using x and y:

    dA = ( ∇f / ( e3 · ∇f ) ) dx dy

Taking f(x, y, z) = z − x, so the plane is f(x, y, z) = 3, we get ∇f = (−1, 0, 1), e3 · ∇f = 1, and our integrand is (∇ × F) · dA = −2 dx dy. The region of integration A is the interior of the ellipse 2x² + y² = 9 in the x,y plane. Finally we can use a simple change of variables to convert the ellipse to a circle and make the area integration even simpler: if we set x̄ = √2 x, dx̄ = √2 dx, then our ellipse becomes the circle x̄² + y² = 9, and we can apply the usual coordinate transformations as follows (Ā (a circle) is the transformation of region A (an ellipse)):

    ∫_S (∇ × F) · dA = ∫_A (−2) dx dy = −√2 ∫_{Ā} dx̄ dy
                     = −√2 ∫_0^{2π} ∫_0^3 r dr dθ
                     = −√2 ∫_0^{2π} (9/2) dθ = −9√2 π .

Figure 53: Intersection of sphere x² + y² + z² − 6z = 0 and plane z = x + 3.

Alternatively, we can do the line integral directly around the ellipse 2x² + y² = 9. This can be written in the standard form as x²/(3/√2)² + y²/3² = 1 and, using the parametrisation from the table in section 9.1, x(t) = (3/√2) cos(t), y(t) = 3 sin(t), and so z(t) = x(t) + 3 = (3/√2) cos(t) + 3. We can then find the line element along the path to be dx = ( −(3/√2) sin(t), 3 cos(t), −(3/√2) sin(t) ) dt, and so the integral is

    ∮_C F · dx = ∫_0^{2π} ( 3 sin(t), (3/(2√2)) cos(t) + 3/2, (9/2) sin(t) ) · ( −(3/√2) sin(t), 3 cos(t), −(3/√2) sin(t) ) dt
               = ∫_0^{2π} ( −(9/√2) sin² t + (9/(2√2)) cos² t + (9/2) cos t − (27/(2√2)) sin² t ) dt
               = −(9/√2) π + (9/(2√2)) π − (27/(2√2)) π = −9√2 π ,

as before.
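Again, a quick numerical check of this answer (a sketch only, not from the notes; numpy and scipy assumed), using the direct parametrisation of the ellipse:

import numpy as np
from scipy.integrate import quad

def integrand(t):
    x, y = 3/np.sqrt(2)*np.cos(t), 3*np.sin(t)
    z = x + 3
    F = np.array([y, z/2, 3*y/2])
    dxdt = np.array([-3/np.sqrt(2)*np.sin(t), 3*np.cos(t), -3/np.sqrt(2)*np.sin(t)])
    return F @ dxdt

print(quad(integrand, 0, 2*np.pi)[0], -9*np.sqrt(2)*np.pi)   # both about -39.98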

Bonus example 8. The next example is a little more general: Suppose B(x) is defined everywhere in R³, and that S is a closed surface in R³. Show that ∫_S (∇ × B) · dA = 0, (i) using Stokes' theorem, and (ii) using the divergence theorem.
(i) Using Stokes' theorem, the proof takes a little work. First imagine the closed surface split into two surfaces S+ and S−, bounded by the curves C and C̄ respectively, as shown in figure 54.
Now:

    ∫_S (∇ × B) · dA = ∫_{S+} (∇ × B) · dA + ∫_{S−} (∇ × B) · dA .

Figure 54: The closed surface is split into two open surfaces; S+ is bounded (positively) by curve C and S− is bounded by C̄. The unit normal is oriented outwards as usual. Note that C̄ is just C traversed backwards.

Next, apply Stokes' theorem:

    ∫_{S+} (∇ × B) · dA + ∫_{S−} (∇ × B) · dA = ∮_C B · dx + ∮_{C̄} B · dx
                                              = ∮_C B · dx − ∮_C B · dx = 0 .

(ii) Using the divergence theorem, the result is a little easier to see:

    ∫_S (∇ × B) · dA = ∫_V ∇ · (∇ × B) dV = ∫_V 0 dV = 0 ,

since the divergence of a curl is zero. Try to prove it using index notation!

Bonus example 9. One final quick example: show that the volume enclosed by a closed surface S is (1/3) ∫_S x · dA.
First off, remember that x = (x, y, z). Now let's show that (1/3) ∫_S x · dA is indeed equal to the volume by applying the divergence theorem:

    (1/3) ∫_S x · dA = (1/3) ∫_V ∇ · x dV = ∫_V dV = V,   as required.
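As a final sanity check (an illustrative sketch, not part of the notes; numpy and scipy assumed), applying the formula to the unit sphere recovers the familiar volume 4π/3:

import numpy as np
from scipy.integrate import dblquad

def integrand(v, u):
    x = np.array([np.sin(u)*np.cos(v), np.sin(u)*np.sin(v), np.cos(u)])
    return x @ (np.sin(u)*x)             # x . (x_u x x_v) on the unit sphere

vol = dblquad(integrand, 0, np.pi, lambda u: 0, lambda u: 2*np.pi)[0] / 3
print(vol, 4*np.pi/3)                    # both about 4.18879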

One last request: if you spot any typos in these notes, or if you find that anything is particularly obscure,
please let me know! (At [email protected])

