
MATH20901 Multivariable Calculus

Richard Porter
University of Bristol
2018

Course Information
• Prerequisites: Calculus 1 (and Linear Algebra and Geometry, Analysis 1)
• The course develops multivariable calculus from Calculus 1. The main focus of the course is
on developing differential vector calculus, tools for changing coordinate systems and major
theorems of integral calculus for functions of more than one variable.
This unit is central to many branches of pure and applied mathematics. For example, in
applied mathematics vector calculus is an integral part of describing field theories that model
physical processes and dealing with the equations that arise.
It is used in 2nd year Applied Partial Differential Equations and in year 3 Fluid Dy-
namics, Quantum Mechanics, Mathematical Methods, and Modern Mathematical
Biology.
• Lecturer: Dr. Richard Porter, Room SM2.7
• Web:
https://people.maths.bris.ac.uk/~marp/mvcalc
Notes may contain extra sections for interest or additional information. Problem sheets, so-
lutions, homework feedback forms, problems class sheets, past exam papers, video tutorials.
• Email: [email protected]
• Books: Lots of books on multivariable/vector calculus. For example: Jerrold E. Marsden & Anthony J. Tromba, “Vector Calculus”, 5th ed., W. H. Freeman and Company, 2003.
• Maths Café: TBA
• Office Hours: Tuesday 9-10am.
• Homework set weekly from 5 problems sheets.
• Timetabled problems classes/exercise classes: unseen problems/some from the problem
sheets/and as many as possible from past exam papers.
• Exam: Jan 90 mins. 2 compulsory questions. No calculators.

1 A review of differential calculus for functions of more than one variable
Revision and extension of results from Calculus 1.

1.1 General maps from Rm to Rn

Let x = (x1 , x2 , . . . , xm ) ∈ Rm.¹
Often in 2D write x ≡ (x, y) or in 3D x ≡ (x, y, z).

Defn: A scalar map or scalar function, f say, is defined by f : Rm → R s.t. x → f (x). We write it as f (x).

Defn: A general map, or vector function, say F : Rm → Rn s.t. x → F(x) is defined as

F(x) = (F1 (x), F2 (x), . . . , Fn (x)).

and the components are scalar maps denoted by Fi : Rm → R (i = 1, . . . , n).

Defn: A map F : Rm → Rn is linear if ∀x, y ∈ Rm , and λ, µ ∈ R, F(λx + µy) = λF(x) + µF(y).

Proposition: A map F is linear iff ∃ a matrix A ∈ Rn×m s.t. F(x) = Ax.

E.g. 1.1: F : R3 → R2 s.t. (x1 , x2 , x3 ) → (x3 − x1 , x2 + x1 ). Then
\[ F(x) = \begin{pmatrix} -1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = Ax \]
is a linear map, since A(λx + µy) = λAx + µAy.


E.g. 1.2: F : R2 → R2 s.t. (x1 , x2 ) → (x2 x1 , ex2 ). (Not a linear map since, for example, F(2x) ≠ 2F(x).)

1.2 The derivative of a map

Defn: The derivative of the map F : Rm → Rn is the n × m matrix F′(x) such that the i, jth element is
\[ \{F'(x)\}_{ij} = \frac{\partial F_i}{\partial x_j}. \]
¹ Although written on the page as a row vector, in computations vectors are actually arranged as column vectors unless indicated by a T for transpose.
For scalar functions of a single variable, the derivative f′(x0) is defined to be precisely the number such that the line
\[ f(x_0) + (x - x_0) f'(x_0) \]
is tangent at x = x0 to the curve y = f (x).
For vector functions of multiple variables, F′ (x0 ) is defined to be precisely the matrix such that

F(x0 ) + F′ (x0 )(x − x0 ) (1)

defines the tangent plane at x = x0 to the hypersurface formed by F(x).


E.g. 1.3: If F = Ax (a linear map) then F′ (x) = A.
Proof: We have
\[ F_i = \sum_{k=1}^m A_{ik} x_k \]
where Aij is the i, jth element of A, and so
\[ \{F'(x)\}_{ij} = \frac{\partial F_i}{\partial x_j} = \sum_{k=1}^m A_{ik} \frac{\partial x_k}{\partial x_j} = A_{ij} \]
since ∂xk /∂xj = 1 if j = k and zero otherwise.


E.g. 1.4: With F(x) = (x2 x1 , ex2 ) we have
\[ F'(x) = \begin{pmatrix} x_2 & x_1 \\ 0 & e^{x_2} \end{pmatrix}. \]

Defn: The matrix F′ (x) with elements ∂Fi /∂xj is called the Jacobian matrix.

1.3 The gradient of a function

Defn: The gradient of a scalar function f : Rm → R (i.e. f (x)) is denoted

∇f ≡ (∂f /∂x1 , ∂f /∂x2 , . . . , ∂f /∂xm ).

Note: The rows of the Jacobian matrix are formed by the gradients of the components of F, viz
\[ F'(x) = \begin{pmatrix} (\nabla F_1)^T \\ (\nabla F_2)^T \\ \vdots \\ (\nabla F_n)^T \end{pmatrix}. \]

(More on this later.)
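Not in the original notes, but a useful companion to the definitions above: a forward-difference approximation to the Jacobian matrix, sketched in Python with NumPy. The helper name jacobian_fd and the step size h are my own choices.

```python
import numpy as np

def jacobian_fd(F, x, h=1e-6):
    """Forward-difference approximation to the n x m Jacobian F'(x)."""
    x = np.asarray(x, dtype=float)
    F0 = np.asarray(F(x), dtype=float)
    J = np.zeros((F0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        J[:, j] = (np.asarray(F(x + step), dtype=float) - F0) / h
    return J

# E.g. 1.2's map F(x1, x2) = (x2*x1, exp(x2)); rows of the result are (grad F_i)^T
F = lambda x: np.array([x[1] * x[0], np.exp(x[1])])
print(jacobian_fd(F, [1.0, 2.0]))   # ~ [[2, 1], [0, e^2]], matching E.g. 1.4
```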

1.4 The directional derivative

Defn: The directional derivative of F at x0 along v (such that |v| = 1) is a vector in Rn given by
\[ D_v F(x_0) = \left. \left( \frac{dF_1(x_0+tv)}{dt}, \ldots, \frac{dF_n(x_0+tv)}{dt} \right) \right|_{t=0} \equiv \left. \frac{dF(x_0+tv)}{dt} \right|_{t=0}. \]
It measures the rate of change of F in the direction of v and it is formulated in terms of ordinary
1D derivatives.
Note: Can be shown that
Dv F(x0 ) = F′ (x0 )v.
Proof: (informal)
\[ \left. \frac{dF(x_0+tv)}{dt} \right|_{t=0} = \lim_{t\to 0} \frac{F(x_0+tv) - F(x_0)}{t} = \lim_{t\to 0} \frac{F(x_0) + F'(x_0)(x_0+tv-x_0) - F(x_0)}{t} = F'(x_0)v, \]
which gives the result. In the above, we replaced F by the equation of the tangent plane (1), which coincides with F in the limit t → 0.
Note: If |v| ≠ 1 then redefine v by v/|v|, where |v| = √(v1² + v2² + · · · + vm²).

Note: If x ∈ Rm and v ∈ Rm with |v| = 1 and f : Rm → R is a scalar function then

Dv f = (∇f )T v ≡ v · ∇f. (2)
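As a quick numerical illustration of (2) (my own sketch, assuming NumPy is available): the ordinary 1D derivative of f along the line x0 + tv agrees with (∇f)T v.

```python
import numpy as np

# f(x, y) = x^2 + 2y^2 with its exact gradient (2x, 4y).
f = lambda x: x[0]**2 + 2 * x[1]**2
grad_f = lambda x: np.array([2 * x[0], 4 * x[1]])

x0 = np.array([1.0, 1.0])
v = np.array([3.0, 4.0])
v = v / np.linalg.norm(v)            # redefine v by v/|v|, as in the note above

t = 1e-6
lhs = (f(x0 + t * v) - f(x0)) / t    # 1D derivative along x0 + t v
rhs = grad_f(x0) @ v                 # (grad f)^T v
print(lhs, rhs)                      # both ~ 2*0.6 + 4*0.8 = 4.4
```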

1.5 Operations on maps and their derivatives


1. (Addition of maps) Let F, G be maps from Rm → Rn . Then if H : Rm → Rn is the new
function defined as
H(x) = F(x) + G(x)
it follows
H′ (x) = F′ (x) + G′ (x).
Proof: (simple)
\[ \frac{\partial H_i}{\partial x_j}(x) = \frac{\partial F_i}{\partial x_j}(x) + \frac{\partial G_i}{\partial x_j}(x). \]

2. (Product of maps) Let F : Rm → Rn and f : Rm → R, then if H : Rm → Rn is defined by

H(x) = f (x)F(x)

it follows that H′(x) is the matrix whose i, jth element is defined by
\[ \frac{\partial H_i}{\partial x_j}(x) = \frac{\partial f}{\partial x_j}(x)\, F_i(x) + f(x)\, \frac{\partial F_i}{\partial x_j}(x), \]
using the product rule for differentiation. There is no simple representation of the result using standard linear algebra.
3. (Composition of maps) If F : Rm → Rn and G : Rn → Rp , then if we define H : Rm → Rp
by
H(x) = (G ◦ F)(x) = G(F(x))
it follows that
H′ (x) = G′ (F(x)) F′ (x) (3)
where the right-hand side denotes the product of the p × n matrix G′ (F(x)) with the n × m
matrix F′ (x).
Proof: From the definition, Hi = Gi (F1 (x1 , . . . , xm ), F2 (x1 , . . . , xm ), . . . , Fn (x1 , . . . , xm )).
So
\[ \{H'(x)\}_{ij} = \frac{\partial H_i}{\partial x_j}(x) = \frac{\partial G_i}{\partial x_1}\frac{\partial F_1}{\partial x_j} + \frac{\partial G_i}{\partial x_2}\frac{\partial F_2}{\partial x_j} + \cdots + \frac{\partial G_i}{\partial x_n}\frac{\partial F_n}{\partial x_j} = \sum_{k=1}^n \frac{\partial G_i}{\partial x_k}(F(x))\, \frac{\partial F_k}{\partial x_j}(x) \]

using the chain rule. This summation can be interpreted as the ith row of G′ (F(x)) multi-
plied by the jth column of F′ (x) and this gives the result.
Note: See the Appendix for a revision of the chain rule and how it applies to multivariable
functions.
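The composition rule (3) is easy to test numerically. Below is a sketch of mine comparing G′(F(x))F′(x) against a finite-difference Jacobian of H = G ∘ F; the particular maps F and G are arbitrary test choices, not from the notes.

```python
import numpy as np

F  = lambda x: np.array([x[0]*x[1], np.exp(x[1])])                  # R^2 -> R^2
G  = lambda y: np.array([y[0] + y[1], y[0]*y[1], y[1]**2])          # R^2 -> R^3
dF = lambda x: np.array([[x[1], x[0]], [0.0, np.exp(x[1])]])        # 2 x 2 Jacobian
dG = lambda y: np.array([[1.0, 1.0], [y[1], y[0]], [0.0, 2*y[1]]])  # 3 x 2 Jacobian

x = np.array([1.0, 0.5])
chain = dG(F(x)) @ dF(x)          # the claimed 3 x 2 derivative of H = G o F

H = lambda x: G(F(x))
h = 1e-6
fd = np.column_stack([(H(x + h * np.eye(2)[j]) - H(x)) / h for j in range(2)])
print(np.max(np.abs(chain - fd)))  # ~ 1e-6: the matrices agree
```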

1.6 Inverse maps

Let F : Rn → Rn and G = F−1 be the inverse map such that

(F−1 ◦ F)(x) = x (4)

Differentiating, applying (3) to (4) and using the fact that x = Ix where I is the n × n identity
matrix we have
(F−1 )′ (F(x))F′ (x) = I.
Thus
(F−1 )′ (F(x)) = (F′ )−1 (x). (5)
In other words “the derivative of the inverse is equal to the inverse of the derivative”.
Note: For scalar maps, we recognise this statement as
\[ \frac{dx}{dy} = \left( \frac{dy}{dx} \right)^{-1} = \frac{1}{dy/dx} \]
and (5) generalises this to functions of more than one variable.

E.g. 1.5: (mapping 2D Cartesian to plane polar coordinates)


Let F : R2 → R2 s.t. (r, θ) → (r cos θ, r sin θ) ≡ (x, y).

This means that
\[ F'(r,\theta) = \begin{pmatrix} \partial x/\partial r & \partial x/\partial\theta \\ \partial y/\partial r & \partial y/\partial\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}. \]
Taking inverses,
\[ (F'(r,\theta))^{-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta/r & \cos\theta/r \end{pmatrix}. \]
Now consider the inverse map F−1 : R2 → R2 s.t. (x, y) → (√(x² + y²), tan⁻¹(y/x)) ≡ (r, θ). Then
\[ (F^{-1})'(x,y) = \begin{pmatrix} \partial r/\partial x & \partial r/\partial y \\ \partial\theta/\partial x & \partial\theta/\partial y \end{pmatrix} = \begin{pmatrix} x/\sqrt{x^2+y^2} & y/\sqrt{x^2+y^2} \\ -y/(x^2+y^2) & x/(x^2+y^2) \end{pmatrix}. \]
Finally,
\[ (F^{-1})'(F(r,\theta)) = \begin{pmatrix} r\cos\theta/r & r\sin\theta/r \\ -r\sin\theta/r^2 & r\cos\theta/r^2 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta/r & \cos\theta/r \end{pmatrix}, \]
which is the same as before.

1.7 Solving equations

Question: Given a function F : Rn → Rn, is there always an inverse function G ≡ F−1 which satisfies
(G ◦ F)(x) = x ?

The same question can be stated in terms of a solution to a nonlinear system of equations. Namely,
let
F(x) = y (6)
for x, y ∈ Rn . Or, in full,

F1 (x1 , . . . , xn ) = y1
⋮
Fn (x1 , . . . , xn ) = yn .

Then, given y, is there an x such that (6) is solved? If so, then x = F−1 (y).

1.7.1 Inverse function theorem

Let F : Rn → Rn , with x0 , y0 ∈ Rn such that

y0 = F(x0 ).

If the Jacobian matrix F′ (x0 ) is invertible, then (6) can be solved uniquely as

x = F−1 (y),

for y in the neighbourhood of y0 .

Note: A matrix is invertible if and only if its determinant is non-zero. The determinant of the Jacobian matrix F′ is often written as
\[ J_F(x_0) \equiv \left. \frac{\partial(F_1, \ldots, F_n)}{\partial(x_1, \ldots, x_n)} \right|_{x=x_0} \qquad (7) \]
and called the Jacobian determinant.

Proof: (informal, but instructive)


For x close to x0 , we can use (1) to locally approximate F(x) so that

y ≈ F(x0 ) + F′ (x0 )(x − x0 )

which means
x ≈ x0 + (F′ (x0 ))−1 (y − y0 )
since y0 = F(x0); this relies on the existence of the inverse of the Jacobian. I.e. given y, x can be determined.
Note: The theorem tells us nothing about what happens if the inverse does not exist.

E.g. 1.6: Consider the system of equations
\[ \frac{x^2 + y^2}{x} = u, \qquad \sin x + \cos y = v. \]
Q: Given (u, v), we want to solve for (x, y). Near which points is this guaranteed to define a unique function?
A: We define F : R2\{0} → R2 s.t.
\[ y \equiv F(x) = \left( \frac{x^2 + y^2}{x}, \; \sin x + \cos y \right) \]
(so that y ≡ (u, v) and x = (x, y)).
The Jacobian determinant is
\[ \frac{\partial(u,v)}{\partial(x,y)} = \begin{vmatrix} (x^2 - y^2)/x^2 & 2y/x \\ \cos x & -\sin y \end{vmatrix} = \frac{y^2 - x^2}{x^2}\sin y - \frac{2y}{x}\cos x. \]
E.g. (i) near x0 = (1, 1) (where y = (2, sin(1) + cos(1))) we can solve for x in a neighborhood of
x0 ; E.g. (ii) near x0 = (π/2, π/2) (where y = (π, 1)) where JF = 0 we cannot say anything about
existence or uniqueness of solutions.
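An aside, not in the notes: the practical content of the inverse function theorem is that Newton's method can invert F near x0 = (1, 1), precisely because the Jacobian is invertible there. A minimal sketch, with my own choice of a nearby target y:

```python
import numpy as np

F = lambda x: np.array([(x[0]**2 + x[1]**2) / x[0],
                        np.sin(x[0]) + np.cos(x[1])])
J = lambda x: np.array([[(x[0]**2 - x[1]**2) / x[0]**2, 2 * x[1] / x[0]],
                        [np.cos(x[0]), -np.sin(x[1])]])   # Jacobian from E.g. 1.6

target = F(np.array([1.0, 1.0])) + np.array([0.05, 0.05])  # y near y0 = F(x0)
x = np.array([1.0, 1.0])                                   # start at x0
for _ in range(20):
    x = x - np.linalg.solve(J(x), F(x) - target)           # Newton step
print(x, F(x) - target)   # residual ~ 0: a nearby solution x = F^{-1}(y) is found
```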

1.7.2 Implicit function theorem

Similar to above. Consider an equation for x ∈ Rm , y ∈ Rn in the form

F(x, y) = 0 (8)

where F : Rm+n → Rn .
Note: If F is linear in y then (8) can be written in the form y = G(x) for some G and the inverse
function theorem applies. We suppose that this is not the case.
Suppose that (8) is satisfied by the pair x0 , y0 (i.e. F(x0 , y0 ) = 0.) Then we can express solutions
of this as y = y(x) for y : Rm → Rn in the neighbourhood of y0 provided the Jacobian determinant

\[ \left. \frac{\partial(F_1, \ldots, F_n)}{\partial(y_1, \ldots, y_n)} \right|_{x=x_0,\, y=y_0} \qquad (9) \]

is non-zero.

Proof: (informal but instructive)


The ith component of (8) is
Fi (x1 , . . . , xm , y1 , . . . , yn ) = 0
and we suppose that

y1 = y1 (x1 , . . . , xm ), y2 = y2 (x1 , . . . , xm ), ... yn = yn (x1 , . . . , xm ).

Taking the partial derivative of Fi w.r.t. xj gives (by the chain rule)
\[ \frac{\partial F_i}{\partial x_j} + \frac{\partial F_i}{\partial y_1}\frac{\partial y_1}{\partial x_j} + \cdots + \frac{\partial F_i}{\partial y_n}\frac{\partial y_n}{\partial x_j} = 0 \]
and this can be interpreted as the matrix equation
\[ \begin{pmatrix} \partial F_1/\partial x_1 & \cdots & \partial F_1/\partial x_m \\ \vdots & \ddots & \vdots \\ \partial F_n/\partial x_1 & \cdots & \partial F_n/\partial x_m \end{pmatrix} + \begin{pmatrix} \partial F_1/\partial y_1 & \cdots & \partial F_1/\partial y_n \\ \vdots & \ddots & \vdots \\ \partial F_n/\partial y_1 & \cdots & \partial F_n/\partial y_n \end{pmatrix} \begin{pmatrix} \partial y_1/\partial x_1 & \cdots & \partial y_1/\partial x_m \\ \vdots & \ddots & \vdots \\ \partial y_n/\partial x_1 & \cdots & \partial y_n/\partial x_m \end{pmatrix} = 0 \]
and therefore
\[ \begin{pmatrix} \partial y_1/\partial x_1 & \cdots & \partial y_1/\partial x_m \\ \vdots & \ddots & \vdots \\ \partial y_n/\partial x_1 & \cdots & \partial y_n/\partial x_m \end{pmatrix} = - \begin{pmatrix} \partial F_1/\partial y_1 & \cdots & \partial F_1/\partial y_n \\ \vdots & \ddots & \vdots \\ \partial F_n/\partial y_1 & \cdots & \partial F_n/\partial y_n \end{pmatrix}^{-1} \begin{pmatrix} \partial F_1/\partial x_1 & \cdots & \partial F_1/\partial x_m \\ \vdots & \ddots & \vdots \\ \partial F_n/\partial x_1 & \cdots & \partial F_n/\partial x_m \end{pmatrix} \]

and the existence of y′(x0) requires that (9) holds.

If y′(x0) exists we can use (1) for points x close to x0 to write

y(x) ≈ y(x0 ) + y′ (x0 )(x − x0 )

showing the existence of solutions in the neighbourhood of (x0 , y0 ).

E.g. 1.7: Consider f (x, y) = 0 where f (x, y) = x2 + y 2 − 1. This is satisfied by points (x0 , y0 ) on
the unit circle. If we try to express it as y = y(x) we get into trouble since

y = ±√(1 − x²)

and there are two solutions. The implicit function theorem applied to this example requires the determinant of the 1 × 1 matrix
\[ \frac{\partial f}{\partial y} \]
evaluated at (x0 , y0 ) to be non-zero. This is 2y0, which is non-zero apart from at y0 = 0. So we can express the solution y = y(x) local to a point (x0 , y0 ) provided y0 ≠ 0, which is obvious in our case: if y0 > 0 we are on the upper solution branch where y = √(1 − x²), and vice versa.

1.8 Higher-order derivatives

Start with 2nd order.


Defn: For F : Rm → Rn,
\[ \frac{\partial^2 F_i}{\partial x_k \partial x_j}(x) = \frac{\partial}{\partial x_k}\left( \frac{\partial F_i}{\partial x_j} \right)(x) = \frac{\partial}{\partial x_j}\left( \frac{\partial F_i}{\partial x_k} \right)(x) = \frac{\partial^2 F_i}{\partial x_j \partial x_k}(x) \]
under normal circumstances. For example if f (x, y) = x³ − 3xy²,
\[ f_{xx} \equiv \frac{\partial^2 f}{\partial x \partial x} = 6x, \quad f_{xy} \equiv \frac{\partial^2 f}{\partial x \partial y} = -6y, \quad f_{yx} = -6y, \quad f_{yy} = -6x. \]
Note: Extended naturally to higher orders.

1.8.1 Taylor’s theorem

Higher-order derivatives are useful in Taylor’s theorem in dimension ≥ 2, allowing one to approx-
imate functions of several variables near a point.

Recall that for a scalar function of a single variable,


\[ f(x) = f(x_0) + (x - x_0) f'(x_0) + \frac{(x - x_0)^2}{2!} f''(x_0) + \text{higher order terms (h.o.t.)} \]

How do we generalise to higher dimensions? Well, it gets tricky. For a scalar function of more than one (e.g. 2) variable, f (x, y),
\[ f(x,y) = f(x_0,y_0) + (f_x(x_0,y_0),\, f_y(x_0,y_0)) \begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix} + \frac{1}{2}(x - x_0,\, y - y_0) \begin{pmatrix} f_{xx}(x_0,y_0) & f_{xy}(x_0,y_0) \\ f_{yx}(x_0,y_0) & f_{yy}(x_0,y_0) \end{pmatrix} \begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix} + \text{h.o.t.} \]
with an obvious generalisation to functions of more than 2 variables.

The higher order terms are complicated and require some complex notation.

For vector functions, F = (F1 , . . . , Fn ) we can use the scalar result above for each scalar function
component, Fi . I.e.

Fi (x) = Fi (x0 ) + (∇Fi (x0 ))T (x − x0 ) + h.o.t of size like |x − x0 |2 .

Stacking equations gives


F(x) = F(x0 ) + F′ (x0 )(x − x0 ) + h.o.t
involving matrix/vector multiplications and hence

F(x) ≈ F(x0 ) + F′ (x0 )(x − x0 )

for x close to x0 . This is equation (1) for the tangent plane at the point x0 on F. We’ve used this
approximation as the basis of earlier informal proofs.
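A small numerical sketch of mine illustrating that the error in the tangent-plane approximation (1) shrinks like |x − x0|², using the map from E.g. 1.2/1.4:

```python
import numpy as np

F  = lambda x: np.array([x[0]*x[1], np.exp(x[1])])
dF = lambda x: np.array([[x[1], x[0]], [0.0, np.exp(x[1])]])

x0 = np.array([1.0, 0.0])
d  = np.array([1.0, 1.0]) / np.sqrt(2)       # a fixed unit direction
for eps in [1e-1, 1e-2, 1e-3]:
    x = x0 + eps * d                         # so |x - x0| = eps
    err = np.linalg.norm(F(x) - (F(x0) + dF(x0) @ (x - x0)))
    print(eps, err)                          # err shrinks like eps**2
```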

2 Differential vector calculus

2.1 Linear algebra

Focus now on 3D, and adopt the convention that the position vector is r = (x, y, z) ≡ (x1 , x2 , x3 ) ∈ R3 when describing equations pertaining to physical applications.

Notation: The Cartesian (unit) basis vectors in R3 are x̂ = (1, 0, 0) ≡ e1 , ŷ = (0, 1, 0) ≡ e2 and
ẑ = (0, 0, 1) ≡ e3 such that r = xx̂ + yŷ + zẑ ≡ x1 e1 + x2 e2 + x3 e3 .
Also use r = |r| = √(x1² + x2² + x3²) ≡ √(x² + y² + z²) as the length of the position vector.

Defn: The dot product of two vectors u = (u1 , u2 , u3 ) and v = (v1 , v2 , v3 ) is defined by
\[ u \cdot v = u_1 v_1 + u_2 v_2 + u_3 v_3 \equiv \sum_{j=1}^3 u_j v_j \]

Notation: (Einstein summation convention) Drop the \( \sum_{j=1}^3 \) in the above on the understanding that repeated suffices imply summation. I.e.
u · v = uj vj
For e.g., r = |r| = √(r · r) = √(xi xi).

Defn: The Kronecker delta, δij, is defined by
\[ \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases} \]

E.g. 2.1: (Sampling property) Another way of defining δij is as the set of elements for which
xi = δij xj , for every i.
Note that \( \sum_{j=1}^3 \delta_{ij} x_j = \delta_{ij} x_j = x_i \) if and only if the defn above holds.

E.g. 2.2: δii = 3.


E.g. 2.3: (Orthonormality of basis vectors) ei · ej = δij .
E.g. 2.4: r = xj ej and taking dot product with ei gives xi = r · ei .
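An aside for readers who compute: the summation convention maps directly onto NumPy's einsum, where a repeated index letter is summed. A sketch (the index labels are my own choices):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
delta = np.eye(3)                       # the Kronecker delta as a 3 x 3 identity

print(np.einsum('j,j->', u, v))         # u . v = u_j v_j = 32
print(np.einsum('ij,j->i', delta, u))   # sampling property: delta_ij x_j = x_i
print(np.einsum('ii->', delta))         # delta_ii = 3 (E.g. 2.2)
```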

Defn: The cross product of two vectors u, v ∈ R3 is the vector given by
\[ u \times v = \begin{vmatrix} e_1 & e_2 & e_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} \equiv (u_2 v_3 - v_2 u_3)e_1 + (u_3 v_1 - v_3 u_1)e_2 + (u_1 v_2 - v_1 u_2)e_3. \qquad (10) \]

Note: The definition implies antisymmetry: u × v = −v × u.

Defn: The Levi-Civita tensor (or antisymmetric tensor) is defined by

1. ǫ123 = 1

2. ǫijk = 0 if any repeated suffices. E.g. ǫ113 = 0.

3. Interchanging suffices implies reversal of sign. E.g. ǫijk = −ǫjik .

Implies ǫijk are invariant under cyclic rotation of suffices. Thus ǫ123 = ǫ231 = ǫ312 = 1, ǫ213 =
ǫ132 = ǫ321 = −1, and all 21 others are zero.

Notation: We use [v]i = vi to denote the ith component of a vector.

Note: Another way of defining ǫijk is as the set of 27 elements such that
\[ [u \times v]_i = \epsilon_{ijk} u_j v_k. \qquad (11) \]
For e.g. \( [u \times v]_1 = \sum_{j=1}^3 \sum_{k=1}^3 \epsilon_{1jk} u_j v_k = \epsilon_{111}u_1v_1 + \epsilon_{112}u_1v_2 + \epsilon_{113}u_1v_3 + \epsilon_{121}u_2v_1 + \epsilon_{122}u_2v_2 + \epsilon_{123}u_2v_3 + \epsilon_{131}u_3v_1 + \epsilon_{132}u_3v_2 + \epsilon_{133}u_3v_3 \), and this equals u2 v3 − u3 v2 only if ǫ123 = 1, ǫ132 = −1 and all 7 other ǫ1jk are zero. Repeat for the 2nd and 3rd components.
Note: The definition of ǫijk guarantees the antisymmetry of the cross product.

Proposition: A (double product)

ǫijk ǫilm = δjl δkm − δjm δkl . (12)

Proof: Just have to consider all possible (non-trivial) combinations to see it is true.
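The double product (12) can indeed be checked by considering all combinations; a brute-force sketch of mine, building the 27 elements of ǫijk and comparing both sides with NumPy:

```python
import numpy as np
from itertools import permutations

eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    # +1 on cyclic rotations of (0,1,2), -1 on the other permutations, 0 elsewhere
    eps[p] = 1.0 if p in [(0, 1, 2), (1, 2, 0), (2, 0, 1)] else -1.0

delta = np.eye(3)
lhs = np.einsum('ijk,ilm->jklm', eps, eps)
rhs = (np.einsum('jl,km->jklm', delta, delta)
       - np.einsum('jm,kl->jklm', delta, delta))
print(np.array_equal(lhs, rhs))   # True: all 81 (j,k,l,m) combinations agree

# Bonus: (11) reproduces the cross product.
u, v = np.array([1., 2., 3.]), np.array([4., 5., 6.])
print(np.einsum('ijk,j,k->i', eps, u, v), np.cross(u, v))
```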

E.g. 2.6: (A vector triple product)

a × (b × c) = (a · c)b − (a · b)c.

Proof: Vector identity, so start by looking at the scalar quantity which is the ith component of
the LHS:

[a × (b × c)]i = ǫijk aj [b × c]k


= ǫijk aj ǫklm bl cm
= ǫkij ǫklm aj bl cm
= (δil δjm − δim δjl )aj bl cm
= aj cj bi − aj bj ci = (a · c)bi − (a · b)ci

True for i = 1, 2, 3, so result is proved.

2.2 Scalar and vector fields

Defn: Conventional language:

A scalar field on R3 is a function f : R3 → R.


A vector field on R3 is a map v : R3 → R3 . We write v(r) = (v1 (r), v2 (r), v3 (r)) where the vi (r), i = 1, 2, 3, are scalar fields.

Scalar and vector fields defined in R3 are of particular importance for physical applications. For
example:

• (Scalar fields) Temperature T (r); mass density ρ(r) for a fluid or gas; electric charge density
q(r).
• (Vector fields) Velocity v(r) of a fluid or gas; electric and magnetic fields E(r) and B(r),
displacement fields in elastic solid u(r).

In these physical applications, one often derives equations that govern vector and scalar fields
which involve derivatives in space (and time).
The following three first-order differential operations of vector calculus emerge from this:

2.3 Gradient (grad)

Defn: The gradient of a scalar field f , denoted ∇f , is the vector field given by
\[ \nabla f(r) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right), \quad \text{or, in component form,} \quad [\nabla f]_i = \frac{\partial f}{\partial x_i}, \; i = 1, 2, 3. \]
The gradient maps scalar fields to vector fields.
E.g. 2.7:
\[ \nabla \tan^{-1}\left( \frac{y}{x} \right) = \left( \frac{-y}{x^2+y^2}, \frac{x}{x^2+y^2}, 0 \right) \equiv \frac{1}{x^2+y^2}(-y\hat{x} + x\hat{y}). \]

E.g. 2.8: Recall r = √(x² + y² + z²). A direct calculation gives
\[ \nabla r = \left( \frac{x}{r}, \frac{y}{r}, \frac{z}{r} \right) = \frac{(x,y,z)}{r} = \frac{r}{r} \quad \text{or, in component form,} \quad [\nabla r]_i = x_i/r. \]
Note: r/r is the unit vector from the origin to the point r; we often denote this as r̂.
E.g. 2.9: If f (r) = g(r) (i.e. the function depends only on the distance from the origin) then
\[ [\nabla g(r)]_i = \frac{\partial g(r)}{\partial x_i} = \frac{\partial r}{\partial x_i}\frac{dg(r)}{dr} \equiv g'(r)[\nabla r]_i = g'(r)\left[ \frac{r}{r} \right]_i \]
since r is a function of x1 , x2 and x3, by the chain rule.
Thus ∇g(r) = g′(r)r̂ (c.f. potentials, central forces in Mech 1).
Recall from Calculus 1, two important interpretations of the gradient:

2.3.1 Interpretation of the gradient

Provided ∇f is nonzero, the gradient points in the direction in which f increases most rapidly.

Proof: let v be s.t. |v| = 1. Then rate of change of f in direction v is the directional derivative
(see (2)) Dv f (r) = v · ∇f = |∇f | cos θ, where θ is the angle between v and ∇f . Maximised when
θ = 0. I.e. when v in direction ∇f .

2.3.2 Another interpretation of the gradient

The gradient of f is perpendicular to the level surfaces of f .


(A level surface S is defined by values of r s.t. f (r) = C, a constant.)

Proof: Let c(t) lie in S. Then f (c(t)) = C, for all t. The chain rule yields
\[ 0 = \frac{d}{dt} f(c(t)) = \nabla f(c(t)) \cdot c'(t) \]
and since c′(t) is tangent to S at c(t), we have our result.

E.g. 2.10: Consider the temperature T in a room to be a function of 3D position (x, y, z):
\[ T(r) = \frac{e^x \sin(\pi y)}{1 + z^2} \]
Q: If you stand at the point (1, 1, 1) in which direction will the room get coolest fastest?
A:
\[ \nabla T = \left( \frac{e^x \sin(\pi y)}{1+z^2}, \; \frac{\pi e^x \cos(\pi y)}{1+z^2}, \; -\frac{2z e^x \sin(\pi y)}{(1+z^2)^2} \right) \]
and at (x, y, z) = (1, 1, 1), ∇T = (0, −πe/2, 0). So a vector pointing in the direction where temperature gets coolest (i.e. decreases most rapidly) is (0, 1, 0).

E.g. 2.11: Take f (x, y) = x² + 2y². Then ∇f = (2x, 4y).

For instance ∇f evaluated at (x, y) = (1, 1) is (2, 4) and so the steepest ascent of f at (1, 1) is in direction tan⁻¹(2) w.r.t. the x axis, and the gradient in that direction is |∇f | = 2√5.

2.4 Divergence (Div)

Defn: The divergence of a vector field v(r), denoted ∇ · v, is the scalar field given by
\[ \nabla \cdot v = \frac{\partial v_1}{\partial x_1} + \frac{\partial v_2}{\partial x_2} + \frac{\partial v_3}{\partial x_3} \equiv \frac{\partial}{\partial x_j} v_j(r) \equiv \partial_j v_j. \]

Note: The use of a ‘dot’ between the symbol used for the gradient and the vector field is purely notational. Do not get the divergence, which is a differential operation, confused with the dot product. For e.g. the dot product is a · b = b · a since multiplication is commutative, but
\[ v \cdot \nabla = v_1 \frac{\partial}{\partial x_1} + v_2 \frac{\partial}{\partial x_2} + v_3 \frac{\partial}{\partial x_3} \ne \nabla \cdot v. \]

2.4.1 Interpretation of divergence

Harder without physical setting, but broadly it measures the expansion (positive divergence) or
contraction of a field at a point.
For example, consider (i) v(r) = (x, y, 0) then ∇ · v = 2 and (ii) v(r) = (−y, x, 0), then ∇ · v = 0.
Note: The first case corresponds to a 2D radially spreading field and the second to a 2D circular
rotating field (just believe me) which is why the divergence is positive (expanding) and zero (nei-
ther expanding nor contracting) in the two cases.

E.g. 2.12: v(r) = (xyz, xyz, xyz), then ∇ · v = yz + zx + xy.

E.g. 2.13: ∇ · r = ∂xj/∂xj = δjj = 3.

E.g. 2.14: ∇ · (a × r) = ∂/∂xi (ǫijk aj xk) = ǫijk aj ∂xk/∂xi = ǫijk aj δik = ǫiji aj = 0 since ǫiji = 0.

2.5 Curl

Defn: The curl of a vector field v(r), denoted ∇ × v, is the vector field (i.e. in R3) given by
\[ \nabla \times v = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \partial_x & \partial_y & \partial_z \\ v_1 & v_2 & v_3 \end{vmatrix} \equiv \left( \frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}, \; \frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}, \; \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} \right). \]
Alternatively (and very conveniently)
\[ [\nabla \times v]_i = \epsilon_{ijk} \frac{\partial}{\partial x_j} v_k \qquad (13) \]
as for cross products.

2.5.1 Interpretation of Curl

Again harder without physical setting, but broadly it measures the rotation or circulation of a
vector field (because it needs direction) at a point.
Using same examples from ‘Div’ section: (i) v(r) = (x, y, 0) then ∇×v = 0 (the radially spreading
field has no rotation); and (ii) v(r) = (−y, x, 0), then ∇ × v = 2ẑ (the rotational field has !)

E.g. 2.16: Let v(r) = (y², x², y²). Then
\[ \nabla \times v = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \partial_x & \partial_y & \partial_z \\ y^2 & x^2 & y^2 \end{vmatrix} = (2y, \, 0, \, 2(x - y)). \]
E.g. 2.17: [∇ × r]i = ǫijk ∂j xk = ǫijk δjk = ǫijj = 0, so ∇ × r = 0.
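E.g. 2.16 and the divergence examples are easy to confirm symbolically; a SymPy sketch of mine using the Cartesian component formulas above:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
v = sp.Matrix([y**2, x**2, y**2])        # the field of E.g. 2.16
coords = (x, y, z)

curl = sp.Matrix([sp.diff(v[2], y) - sp.diff(v[1], z),
                  sp.diff(v[0], z) - sp.diff(v[2], x),
                  sp.diff(v[1], x) - sp.diff(v[0], y)])
div = sum(sp.diff(v[i], coords[i]) for i in range(3))
print(curl.T)    # Matrix([[2*y, 0, 2*x - 2*y]])
print(div)       # 0
```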

2.6 Second-order differential operations

Schematically, grad, div and curl act as follows:

grad: scalar fields → vector fields


div: vector fields → scalar fields
curl: vector fields → vector fields

The operations of grad, div and curl can be combined. Only the following combinations of operations make sense:

curl(grad): scalar fields → vector fields


div(grad): scalar fields → scalar fields
grad(div): vector fields → vector fields
div(curl): vector fields → scalar fields
curl(curl): vector fields → vector fields

2.6.1 Two Null Identities

1) For any scalar field f


∇ × (∇f ) = 0.

Proof: We have that

[∇ × (∇f )]i = ǫijk ∂j ∂k f = −ǫikj ∂j ∂k f = −ǫikj ∂k ∂j f = −[∇ × (∇f )]i .

Thus since the expression equals its own negative, it must vanish.

2) For any vector field, v,


∇ · (∇ × v) = 0

Proof: We have that


∇ · (∇ × v) = ∂i ǫijk ∂j vk = ǫijk ∂i ∂j vk ,
which must vanish for the same reason.
The remaining combinations of grad, div and curl are related to a second-order differential operator
called the Laplacian...

2.7 The Laplacian

Defn: The Laplacian of a scalar field f (r), denoted ∇²f (or △f ), is the scalar field defined by
\[ \triangle f = \nabla \cdot \nabla f(r) = \partial_i \partial_i f \equiv \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) f. \]

The definition can also be extended to consider the Laplacian of a vector field v(r) which is

△v = (△v1 , △v2 , △v3 ) .

E.g. 2.18: For a vector field v(r),

△v = −∇ × (∇ × v) + ∇(∇ · v).

Proof: We consider the ith component of the 1st RHS term:

[∇ × (∇ × v)]i = ǫijk ∂j [∇ × v]k


= ǫijk ∂j ǫklm ∂l vm
= ǫkij ǫklm ∂j ∂l vm
= (δil δjm − δim δjl )∂j ∂l vm
= ∂i ∂m vm − ∂j ∂j vi = [∇(∇ · v) − △v]i

which shows that all components agree with our claim.
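The identity in E.g. 2.18 can be confirmed symbolically for any particular smooth field; a SymPy sketch of mine with an arbitrary test field v:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)
v = sp.Matrix([x * y**2, sp.sin(z), x + y * z])   # arbitrary smooth test field

curl = lambda u: sp.Matrix([sp.diff(u[2], y) - sp.diff(u[1], z),
                            sp.diff(u[0], z) - sp.diff(u[2], x),
                            sp.diff(u[1], x) - sp.diff(u[0], y)])
div  = lambda u: sum(sp.diff(u[i], coords[i]) for i in range(3))
grad = lambda f: sp.Matrix([sp.diff(f, s) for s in coords])
lap  = lambda u: sp.Matrix([sum(sp.diff(u[i], s, 2) for s in coords)
                            for i in range(3)])

print(sp.simplify(lap(v) - (-curl(curl(v)) + grad(div(v)))))   # zero vector
```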

2.8 Curvilinear coordinate systems

All the differential operators defined above were expressed in Cartesian coordinates. For many practical problems it is more natural to express problems in coordinates aligned with principal features of the problem. E.g. polars are appropriate for circular domains.
Q: How do we recast the differential operators in a different coordinate system?

2.8.1 Coordinate transformations

Defn: Curvilinear coordinates are defined by a smooth function r : R3 → R3 which maps a


point q = (q1 , q2 , q3 ) in one coordinate system to a point in Cartesian space: r ≡ (x, y, z) = r(q) =
r(q1 , q2 , q3 ). I.e.
x = x(q1 , q2 , q3 ), y = y(q1, q2 , q3 ), z = z(q1 , q2 , q3 )
The inverse map, if it exists (see later) is

q1 = q1 (x, y, z), q2 = q2 (x, y, z), q3 = q3 (x, y, z).

For example: in 2D, if q1 = r and q2 = θ then x = x(r, θ) = r cos θ and y = y(r, θ) = r sin θ. The inverse map is r = r(x, y) = √(x² + y²), θ = θ(x, y) = tan⁻¹(y/x).

Defn: The surfaces qi = const are called coordinate surfaces. The space curves formed by their
intersection in pairs are called the coordinate curves. The coordinate axes are determined
by the tangents to the coordinate curves at the intersection of three surfaces. They are not, in
general, fixed directions in space.

The two points r(q1 , q2 , q3 ) and r(q1 + dq1 , q2 , q3 ) lie on a coordinate curve formed by q2 , q3 constant. Thus, the q1-coordinate axis is determined by letting dq1 → 0 in
\[ \frac{r(q_1+dq_1, q_2, q_3) - r(q_1, q_2, q_3)}{dq_1} = \frac{r(q_1,q_2,q_3) + dq_1 \frac{\partial r}{\partial q_1} - r(q_1,q_2,q_3) + \text{h.o.t. of order } (dq_1)^2}{dq_1} = \frac{\partial r}{\partial q_1} \]
(after Taylor expanding). Repeat with q2 and q3 . Thus, we can describe the point q = q1 q̂1 + q2 q̂2 + q3 q̂3 in terms of the local coordinate basis given by unit vectors directed along the local coordinate axes:
\[ \hat{q}_1 = \frac{1}{h_1}\frac{\partial r}{\partial q_1}, \quad \hat{q}_2 = \frac{1}{h_2}\frac{\partial r}{\partial q_2}, \quad \hat{q}_3 = \frac{1}{h_3}\frac{\partial r}{\partial q_3} \]
where, to ensure |q̂i | = 1, we have normalised by
\[ h_i = \left| \frac{\partial r}{\partial q_i} \right|. \]

The hi are called the metric coefficients or scale factors.

Note: the use of Greek indices in, for e.g.
\[ \hat{q}_\alpha = \frac{1}{h_\alpha}\frac{\partial r}{\partial q_\alpha} \]
for α = 1, 2, 3 indicates that the summation convention is not applied.

Remark: Is this always possible ? I.e. is there always a unique map from one system to another
? This is the same as asking if there is an inverse map. Thus (by the inverse function theorem)
the answer lies in the Jacobian matrix of the map, given here by r′ (q) which is the matrix with
hα q̂α as column vectors (for α = 1, 2, 3). Thus the Jacobian determinant

\[ J_r = \frac{\partial(x,y,z)}{\partial(q_1,q_2,q_3)} \equiv \begin{vmatrix} h_1[\hat{q}_1]_1 & h_2[\hat{q}_2]_1 & h_3[\hat{q}_3]_1 \\ h_1[\hat{q}_1]_2 & h_2[\hat{q}_2]_2 & h_3[\hat{q}_3]_2 \\ h_1[\hat{q}_1]_3 & h_2[\hat{q}_2]_3 & h_3[\hat{q}_3]_3 \end{vmatrix} \]

must be non-vanishing. The 2nd representation simply shows that no new calculations are needed
to populate the entries of the Jacobian determinant.

Defn: If local basis vectors of a curvilinear coordinate system are mutually orthogonal, we call
it an orthogonal curvilinear coordinate system. Convention dictates that the system be right-
handed, or q̂1 = q̂2 × q̂3 . (or, form axes from your thumb, index and middle fingers of your right
hand and order basis vectors 1, 2, 3 on each respective digit.)

In the following, we will deal exclusively with orthogonal systems.

E.g. 2.19: Consider the linear map

r = Rq, such that xi = Rij qj

and R is an orthogonal matrix (a matrix s.t. RT R = I which implies R−1 = RT ). Then


\[ \frac{\partial r}{\partial q_1} = (R_{11}, R_{21}, R_{31}), \quad \frac{\partial r}{\partial q_2} = (R_{12}, R_{22}, R_{32}), \quad \frac{\partial r}{\partial q_3} = (R_{13}, R_{23}, R_{33}). \]
The scale factors are
\[ h_1 = \left| \frac{\partial r}{\partial q_1} \right| = \sqrt{R_{11}^2 + R_{21}^2 + R_{31}^2}, \quad \text{and similarly for } h_2, h_3. \]
The matrix equation RT R = I can be expressed as
\[ \delta_{ij} = (R^T)_{ik} R_{kj} = R_{ki} R_{kj} \qquad (14) \]
Hence h1 = 1 (and similarly h2 = h3 = 1).
Thus the local basis vectors are

q̂j = (R1j , R2j , R3j ) , j = 1, 2, 3.

These are constant, i.e. they do not vary with position.

Note: From the definition of the basis vectors and using summation notation for the dot product we have q̂i · q̂j = Rki Rkj = δij (using (14)), and so the basis vectors are orthonormal.

In other words, the new coordinate axes are a general rotation of the original x̂, ŷ, ẑ axes.

Figure 1: A local basis in cylindrical polar coordinates.

E.g. 2.20: In 3D, cylindrical polar coordinates are defined by the mapping

(x, y, z) = r(r, θ, z) = (r cos θ, r sin θ, z)

see Fig. 1. It follows that


\[ \frac{\partial r}{\partial r} = (\cos\theta, \sin\theta, 0), \quad \frac{\partial r}{\partial \theta} = (-r\sin\theta, r\cos\theta, 0), \quad \frac{\partial r}{\partial z} = (0, 0, 1). \]

The scale factors are
\[ h_r = \left| \frac{\partial r}{\partial r} \right| = 1, \quad h_\theta = \left| \frac{\partial r}{\partial \theta} \right| = r, \quad h_z = \left| \frac{\partial r}{\partial z} \right| = 1. \qquad (15) \]

Thus the local basis vectors are (using standard notation):

r̂ = (cos θ, sin θ, 0), θ̂ = (− sin θ, cos θ, 0), ẑ = (0, 0, 1), (16)

Figure 2: A local basis in spherical polar coordinates. The vector r̂ points along a ray from the
center, φ̂ points along the meridians, and θ̂ along the parallels.

and these vary with position. Note that r̂ · θ̂ = r̂ · ẑ = θ̂ · ẑ = 0, and r̂ = θ̂ × ẑ, so cylindrical
coordinates are indeed orthogonal.

E.g. 2.21: Spherical polar coordinates are defined by the mapping

(x, y, z) = r(r, φ, θ) = (r sin φ cos θ, r sin φ sin θ, r cos φ),

see Fig. 2. Now the derivatives with respect to the coordinates are

\[ \frac{\partial r}{\partial r} = (\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi), \]
\[ \frac{\partial r}{\partial \phi} = (r\cos\phi\cos\theta, r\cos\phi\sin\theta, -r\sin\phi), \]
\[ \frac{\partial r}{\partial \theta} = (-r\sin\phi\sin\theta, r\sin\phi\cos\theta, 0), \]
and the scale factors become (check):
\[ h_r = \left| \frac{\partial r}{\partial r} \right| = 1, \quad h_\phi = \left| \frac{\partial r}{\partial \phi} \right| = r, \quad h_\theta = \left| \frac{\partial r}{\partial \theta} \right| = r\sin\phi. \qquad (17) \]

Thus the local basis vectors are
r̂ = (sin φ cos θ, sin φ sin θ, cos φ),
φ̂ = (cos φ cos θ, cos φ sin θ, − sin φ),
θ̂ = (− sin θ, cos θ, 0), (18)
and vary with position. Again, r̂ · φ̂ = r̂ · θ̂ = φ̂ · θ̂ = 0, and r̂ = φ̂ × θ̂, so spherical coordinates are orthogonal.

2.8.2 Transformation of the gradient

The differential operator ∇ is the Cartesian vector
\[ \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right). \]
We want this to be transformed into derivatives w.r.t. the local coordinates q1 , q2 , q3 .
Consider f (r) = f (r(q)). Then for fixed α = 1, 2, 3, the chain rule gives
\[ \frac{1}{h_\alpha}\frac{\partial f}{\partial q_\alpha} = \frac{1}{h_\alpha}\frac{\partial x_j}{\partial q_\alpha}\frac{\partial f}{\partial x_j} = \hat{q}_\alpha \cdot \nabla f \qquad (19) \]
(summation over j is implied, but not α).

Now if u = u1 q̂1 + u2 q̂2 + u3 q̂3 then the orthonormal property of the local basis vectors means uj = u · q̂j . If we let u = ∇f then with (19) we find
\[ \nabla f = \sum_{\alpha=1}^3 \frac{\hat{q}_\alpha}{h_\alpha}\frac{\partial f}{\partial q_\alpha}, \quad \text{and so} \quad \nabla = \sum_{\alpha=1}^3 \frac{\hat{q}_\alpha}{h_\alpha}\frac{\partial}{\partial q_\alpha}. \qquad (20) \]

E.g. 2.22: In cylindrical polar coordinates, according to (20) and (15), we have
\[ \nabla = \hat{r}\,\frac{\partial}{\partial r} + \hat{\theta}\,\frac{1}{r}\frac{\partial}{\partial \theta} + \hat{z}\,\frac{\partial}{\partial z}. \]
E.g. 2.23: In spherical coordinates, according to (20) and (17),
\[ \nabla = \hat{r}\,\frac{\partial}{\partial r} + \hat{\phi}\,\frac{1}{r}\frac{\partial}{\partial \phi} + \hat{\theta}\,\frac{1}{r\sin\phi}\frac{\partial}{\partial \theta}. \]

2.8.3 Transformation of the divergence

To find ∇ · u in curvilinear coordinates we first need to express the vector field u in the local
coordinate system. I.e.
u = u1 q̂1 + u2 q̂2 + u3 q̂3 .
The difficulty here is that both ui and q̂i depend on (q1 , q2 , q3 ). We come at the divergence in a
slightly roundabout way.

First, we note from (20) that
\[ \nabla q_\alpha = \frac{\hat{q}_\alpha}{h_\alpha}. \]
Now note that
\[ \nabla \times (q_2 \nabla q_3) = \underbrace{q_2(\nabla \times (\nabla q_3))}_{=0} + (\nabla q_2) \times (\nabla q_3) = \frac{\hat{q}_2}{h_2} \times \frac{\hat{q}_3}{h_3} = \frac{\hat{q}_1}{h_2 h_3}. \]
Then from §2.6.1 (Null identities: ∇ · (∇ × A) = 0, ∇ × (∇f ) = 0 for any A, f ),
\[ \nabla \times \left( \frac{\hat{q}_\alpha}{h_\alpha} \right) = 0, \qquad \nabla \cdot \left( \frac{\hat{q}_1}{h_2 h_3} \right) = 0. \]
The results are true for the 2 cyclic permutations (1 → 2, 2 → 3, 3 → 1):
\[ \nabla \cdot \left( \frac{\hat{q}_2}{h_1 h_3} \right) = \nabla \cdot \left( \frac{\hat{q}_3}{h_1 h_2} \right) = 0. \]
So now
\[ \nabla \cdot u = \nabla \cdot \left( \frac{\hat{q}_1}{h_2 h_3}(u_1 h_2 h_3) \right) + \text{2 cyclic perms} \]
\[ = \frac{\hat{q}_1}{h_2 h_3} \cdot \nabla(u_1 h_2 h_3) + (u_1 h_2 h_3)\, \nabla \cdot \left( \frac{\hat{q}_1}{h_2 h_3} \right) + \text{2 cyclic perms} \]
\[ = \frac{\hat{q}_1}{h_2 h_3} \cdot \left( \sum_{\alpha=1}^3 \frac{\hat{q}_\alpha}{h_\alpha}\frac{\partial(u_1 h_2 h_3)}{\partial q_\alpha} \right) + \text{2 cyclic perms} \]
\[ = \frac{1}{h_1 h_2 h_3} \left( \frac{\partial(u_1 h_2 h_3)}{\partial q_1} + \frac{\partial(u_2 h_1 h_3)}{\partial q_2} + \frac{\partial(u_3 h_1 h_2)}{\partial q_3} \right) \]
using the fact that q̂α · q̂β = δαβ .

E.g. 2.24: (Cylindrical polar coordinates.) First write
u = ur r̂ + uθ θ̂ + uz ẑ,
with hr = 1, hθ = r, hz = 1, so
\[ \nabla \cdot u = \frac{1}{r}\left( \frac{\partial(r u_r)}{\partial r} + \frac{\partial u_\theta}{\partial \theta} + \frac{\partial(r u_z)}{\partial z} \right) = \frac{\partial u_r}{\partial r} + \frac{u_r}{r} + \frac{1}{r}\frac{\partial u_\theta}{\partial \theta} + \frac{\partial u_z}{\partial z}. \]
For example, if u = f (r)θ̂ then ∇ · u = 0.
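A sketch of mine checking the cylindrical divergence formula against the Cartesian definition, for a test field with components ur(r, θ) and uθ(r, θ) of my own choosing, compared numerically at one point:

```python
import sympy as sp

r, th, x, y = sp.symbols('r theta x y', positive=True)
ur, uth = r**2 * sp.cos(th), r * sp.sin(th)      # arbitrary test components

# The curvilinear formula derived above (no z-dependence).
div_polar = sp.diff(ur, r) + ur / r + sp.diff(uth, th) / r

# The same field in Cartesians: u = ur*rhat + uth*thetahat.
rr, tt = sp.sqrt(x**2 + y**2), sp.atan2(y, x)
u1 = (ur * sp.cos(th) - uth * sp.sin(th)).subs({r: rr, th: tt})
u2 = (ur * sp.sin(th) + uth * sp.cos(th)).subs({r: rr, th: tt})
div_cart = sp.diff(u1, x) + sp.diff(u2, y)

pt = {x: 1.3, y: 0.7}
print((div_cart - div_polar.subs({r: rr, th: tt})).subs(pt).evalf())   # ~ 0
```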

2.8.4 Transformation of curl

Taking a similar approach to ‘Div’, we write
\[ \nabla \times u = \nabla \times \left( \frac{\hat{q}_1}{h_1}(h_1 u_1) \right) + \text{2 cyclic perms} \]
\[ = \nabla(h_1 u_1) \times \frac{\hat{q}_1}{h_1} + (h_1 u_1)\, \nabla \times \frac{\hat{q}_1}{h_1} + \text{2 cyclic perms} \]
\[ = \left( \sum_{\alpha=1}^3 \frac{\hat{q}_\alpha}{h_\alpha}\frac{\partial(h_1 u_1)}{\partial q_\alpha} \right) \times \frac{\hat{q}_1}{h_1} + \text{2 cyclic perms} \]
\[ = \frac{\hat{q}_2}{h_1 h_3}\frac{\partial(h_1 u_1)}{\partial q_3} - \frac{\hat{q}_3}{h_1 h_2}\frac{\partial(h_1 u_1)}{\partial q_2} + \text{2 cyclic perms} \]
\[ = \frac{\hat{q}_1}{h_2 h_3}\left( \frac{\partial(h_3 u_3)}{\partial q_2} - \frac{\partial(h_2 u_2)}{\partial q_3} \right) + \frac{\hat{q}_2}{h_1 h_3}\left( \frac{\partial(h_1 u_1)}{\partial q_3} - \frac{\partial(h_3 u_3)}{\partial q_1} \right) + \frac{\hat{q}_3}{h_1 h_2}\left( \frac{\partial(h_2 u_2)}{\partial q_1} - \frac{\partial(h_1 u_1)}{\partial q_2} \right). \]

Now we see
\[ \nabla \times u = \frac{1}{h_1 h_2 h_3} \begin{vmatrix} h_1\hat{q}_1 & h_2\hat{q}_2 & h_3\hat{q}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ h_1 u_1 & h_2 u_2 & h_3 u_3 \end{vmatrix}. \]

2.9 Examples

E.g. 2.25: The Laplacian of a scalar field φ is △φ = ∇ · ∇φ. Since
\[ \nabla \phi = \hat{r}\frac{\partial \phi}{\partial r} + \hat{\theta}\frac{1}{r}\frac{\partial \phi}{\partial \theta} + \hat{z}\frac{\partial \phi}{\partial z}, \]
we use the defn of div to give
\[ \triangle \phi = \frac{\partial}{\partial r}\left( \frac{\partial \phi}{\partial r} \right) + \frac{1}{r}\frac{\partial \phi}{\partial r} + \frac{1}{r}\frac{\partial}{\partial \theta}\left( \frac{1}{r}\frac{\partial \phi}{\partial \theta} \right) + \frac{\partial}{\partial z}\left( \frac{\partial \phi}{\partial z} \right) = \frac{\partial^2 \phi}{\partial r^2} + \frac{1}{r}\frac{\partial \phi}{\partial r} + \frac{1}{r^2}\frac{\partial^2 \phi}{\partial \theta^2} + \frac{\partial^2 \phi}{\partial z^2}. \]

E.g. 2.26: Now the curl in cylindrical polars:
\[ \nabla \times u = \left( \frac{1}{r}\frac{\partial u_z}{\partial \theta} - \frac{\partial u_\theta}{\partial z} \right)\hat{r} + \left( \frac{\partial u_r}{\partial z} - \frac{\partial u_z}{\partial r} \right)\hat{\theta} + \left( \frac{\partial u_\theta}{\partial r} + \frac{u_\theta}{r} - \frac{1}{r}\frac{\partial u_r}{\partial \theta} \right)\hat{z}. \]

Exercise: Do the same for spherical polars !!

Remark: If the curvilinear system is not orthogonal then we are in a real mess.

3 Integration theorems of vector calculus
Having done differential vector calculus, we turn to integral vector calculus. These are equally
important in applications as you will see in APDE2, Fluid Dynamics and beyond. We shall derive
three (quite stunning) main integral identities all of which may be considered as higher-dimensional
generalisations of the Fundamental Theorem of Calculus:
\[ \int_a^b f'(x)\, dx = f(b) - f(a). \]

The LHS is a one-dimensional integral (i.e. an integral over a line) which is equated to zero-
dimensional (i.e. pointwise) evaluations on the boundary of the integral (here at x = a, b).
Remark: The formula for integration by parts is found by letting f (x) = u(x)v(x) in the above !

3.1 The line integral of a scalar field

An ordinary 1D integral can be regarded as integration along a straight line. For example if F (x) is the force on a particle allowed to move along the x-axis,
\[ \int_{x_1}^{x_2} F(x)\, dx \]
is the “work done” moving it from x1 to x2 . We want to define integrals along general paths in R2 or R3.

Defn: A path is a bijective (i.e. one-to-one) map p : [t1 , t2 ] → R3 s.t. t 7→ p(t). It connects the
point p(t1 ) to p(t2 ) along a curve C, say. We say the curve C is parametrised by the path.

Defn: The line integral of a scalar field f : R3 → R along a curve C is denoted
\[ \int_C f(r)\, ds, \]
where ds = |dr| denotes the elemental arclength. Since r = p(t) on C, dr = p′(t) dt and so
\[ \int_C f(r)\, ds = \int_{t_1}^{t_2} f(p(t))\, |p'(t)|\, dt. \]

E.g. 3.1: Let p(t) = (t, t, t) for t ∈ [0, 1]; this connects the points (0, 0, 0) to (1, 1, 1) by a straight line of length √3. If f = xyz then
\[ \int_C f\, ds = \int_0^1 t^3 \sqrt{1+1+1}\, dt = \frac{\sqrt{3}}{4}. \]
E.g. 3.2: Let p(t) = (t², t², t²) for t ∈ [0, 1]; this parametrises the same curve as in E.g. 3.1. With the same f we have
\[ \int_C f\, ds = \int_0^1 t^6 \sqrt{(2t)^2 + (2t)^2 + (2t)^2}\, dt = \frac{\sqrt{3}}{4}. \]
Note: Parameterisation is not unique. This suggests the value of the line integral is independent of parametrisation.
Proof: Consider the bijective map t = g(u) for t1 < t < t2 such that t1 = g(u1 ), t2 = g(u2 ). Then
\[ \int_{t_1}^{t_2} f(p(t))\,|p'(t)|\, dt = \int_{u_1}^{u_2} f(p(g(u)))\,|p'(g(u))|\, g'(u)\, du = \int_{u_1}^{u_2} f(q(u))\,|q'(u)|\, du, \]
after letting q(u) = p(g(u)) and noting q′(u) = g′(u)p′(g(u)) by the chain rule.

Note: The value of the line integral does depend on the direction of the path along C:
\[ \int_C f\, ds = -\int_{-C} f\, ds \]
(−C is C in reverse). We are used to this notion in 1D, viz \( \int_{x_1}^{x_2} f(x)\,dx = -\int_{x_2}^{x_1} f(x)\,dx \).

3.2 The line integral of a vector field

Defn: Let F(r) : R3 → R3 be a vector field, and let p(t) be a path on the interval [t1 , t2 ]. The
line integral of F along p is defined by
\[ \int_C F \cdot dr = \int_{t_1}^{t_2} F(p(t)) \cdot p'(t)\, dt \]

as above.
Note: As above, the value of the line integral is not dependent on parametrisation of C but is
negated by a reversal of C.

E.g. 3.3: Integrate F = sin φ ẑ (φ is the polar angle in spherical polars) along a meridian of a sphere of radius R from the south to the north pole.
A: From the description of the path, C, it is convenient to use spherical coordinates (r, φ, θ). I.e.
p(φ) = Rr̂ = R(sin φ cos θ, sin φ sin θ, cos φ);
(see the earlier defn of r̂, θ̂, φ̂ in spherical polars (18)) then
\[ \frac{dp}{d\phi} = R(\cos\phi\cos\theta, \cos\phi\sin\theta, -\sin\phi) \equiv R\hat{\phi}. \]
Thus we can see that ẑ · φ̂ = − sin φ and so
\[ \int_C F \cdot dr = \int_\pi^0 F \cdot p'(\phi)\, d\phi = \int_\pi^0 R\sin\phi\, (\hat{z} \cdot \hat{\phi})\, d\phi = -R\int_\pi^0 \sin^2\phi\, d\phi = \frac{\pi R}{2}. \]

Proposition: Let f (r) be a scalar field and let C be a curve in R3 parametrised by the path p(t), t1 ≤ t ≤ t2 . Then
\[ \int_C \nabla f \cdot dr = f(p(t_2)) - f(p(t_1)). \]
This is the fundamental theorem of Calculus for line integrals.

Proof: We have
\[ \int_C \nabla f \cdot dr = \int_{t_1}^{t_2} \nabla f(p(t)) \cdot p'(t)\, dt. \]
But from the chain rule it follows that
\[ \frac{d}{dt} f(p(t)) = p'(t) \cdot \nabla f(p(t)). \]
Therefore,
\[ \int_C \nabla f \cdot dr = \int_{t_1}^{t_2} \frac{d}{dt} f(p(t))\, dt = f(p(t_2)) - f(p(t_1)), \]
from the Fundamental Theorem of Calculus.

Note: If C is closed, the line integral over a gradient field vanishes. As a result, line integrals of
gradient fields are independent of the path C.

Remark: The line integral of a vector field is often called the work integral, since if F represents
a force, the integral represents the work done moving a particle between two points. If F = ∇f
for some scalar field f (often called the potential) then the work done moving the particle is
independent of the path taken. Moreover the work done moving a particle which returns to the
same position is zero. Such a force is called conservative.
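A numerical sketch of mine of the last two remarks: the work integral of F = ∇f with f = xyz takes the same value along two different paths joining (0, 0, 0) to (1, 1, 1).

```python
import numpy as np

F = lambda p: np.array([p[1]*p[2], p[0]*p[2], p[0]*p[1]])   # grad(xyz)

def work(path, dpath, n=20001):
    t = np.linspace(0.0, 1.0, n)
    vals = np.array([F(path(s)) @ dpath(s) for s in t])
    return np.sum((vals[1:] + vals[:-1]) / 2) * (t[1] - t[0])   # trapezoid rule

p1  = lambda t: np.array([t, t, t]);       dp1 = lambda t: np.array([1.0, 1.0, 1.0])
p2  = lambda t: np.array([t, t**2, t**3]); dp2 = lambda t: np.array([1.0, 2*t, 3*t**2])
print(work(p1, dp1), work(p2, dp2))   # both ~ f(1,1,1) - f(0,0,0) = 1
```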

3.3 Surface integrals of scalar and vector fields.

We now generalise 1D integrals to 2D integrals. We start with parametrisations of surfaces.

Defn: A path p(t), for t ∈ [t1 , t2 ] is closed if p(t1 ) = p(t2 ). A closed path is simple if it does not intersect with itself apart from at the end points t1 , t2 .
Defn: Let D ⊂ R2 , let ∂D represent the boundary of D (it should be a simple closed path) and
let D̄ be D ∪ ∂D.

Now define a map s : D̄ → R3 s.t. (u, v) 7→ s(u, v) and ∂s/∂u, ∂s/∂v are linearly independent on D. A surface S ⊂ R3 is given in parametrised form by S = {s(u, v) | (u, v) ∈ D}.

E.g. 3.4: Let
D = {(u, v) | u² + v² < R²}.
Then ∂D is the circle {(u, v) | u² + v² = R²} of radius R. Let s(u, v) = (u, v, √(R² − u² − v²)); then S is a hemispherical surface since the map defines x = u, y = v and z = √(R² − u² − v²) = √(R² − x² − y²).
Note: this is not the only way to parametrise a hemisphere; could (and will) use spherical polars.

Defn: The integral of a scalar field f over a surface S is denoted by
\[ \int_S f(r)\, dS \equiv \int_S f(r)\, |dS| \]
where dS = n̂ dS and n̂ is a unit vector pointing out from S (a surface element is defined by its size dS and a direction, n̂, being the normal to the surface).
Note: \( \int_S dS \) is the physical area of the surface S in the same way that \( \int_C ds \) is the physical length of the curve C.

Now the two points s(u + du, v) and s(u, v) both lie on S and, assuming du vanishingly small, Taylor expanding gives
\[ s(u+du, v) - s(u,v) = s(u,v) + du\, \frac{\partial s}{\partial u}(u,v) - s(u,v) + \text{h.o.t. of order } (du)^2. \]
Do the same thing with v. It follows that the two vectors
\[ \frac{\partial s}{\partial u}\, du, \qquad \frac{\partial s}{\partial v}\, dv \]
lie in the tangent plane to the surface S, and the area dS of the elemental rhomboid formed by these two vectors, in the direction normal to the surface, is
\[ dS = \frac{\partial s}{\partial u}\, du \times \frac{\partial s}{\partial v}\, dv \equiv N(u,v)\, du\, dv, \qquad N(u,v) = \frac{\partial s}{\partial u} \times \frac{\partial s}{\partial v} \]
and so
\[ \int_S f(r)\, dS = \int_D f(s(u,v))\, |N|\, du\, dv. \]

Note: If S lies in the 2D plane, then s(u, v) = (x(u, v), y(u, v), 0) is a mapping from 2D to 2D and so
\[ |N| = \left| \frac{\partial(x,y)}{\partial(u,v)} \right| \]
is the Jacobian of the map (easy to confirm). We know this from Calculus 1 (e.g. dx dy 7→ r dr dθ).

Defn: Let v be a vector field on R3 . The integral of v over S is denoted
\[ \int_S v \cdot dS \equiv \int_S v \cdot \hat{n}\, dS = \int_D v(s(u,v)) \cdot N(u,v)\, du\, dv, \]
as above.

Important remark: By analogy with line integrals, can show that the surface integral of a vec-
tor field is independent of parametrisation up to a sign. The sign depends on the orientation of

the parametrisation, which is determined by the direction of the unit normal n̂ = N/|N|. Thus,
the direction of n̂, (or N) must be specified in order to fix the sign of the integral unambiguously.
E.g. 3.5: Calculate \( \int_S B \cdot dS \) where B(r) = r ẑ + r̂ (expressed in cylindrical polars), in which S = {(x, y, 0) | x² + y² ≤ 1} is directed in the positive z-direction.

A: Cylindrical polar coordinates are a sensible parametrisation of S, i.e.
s(r, θ) = (r cos θ, r sin θ, 0),
so that D = {(r, θ) | 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π}. Then
\[ N = \frac{\partial s}{\partial r} \times \frac{\partial s}{\partial \theta} = r\hat{z} \]
(the factor r is the Jacobian determinant in this 2D mapping), and N points in the direction ẑ normal to the 2D plane as required by the question. Then
\[ \int_S B \cdot dS = \int_0^{2\pi}\!\!\int_0^1 (r\hat{z} + \hat{r}) \cdot \hat{z}\; r\, dr\, d\theta = \int_0^{2\pi}\!\!\int_0^1 r^2\, dr\, d\theta = \frac{2\pi}{3} \]
and we have used r̂ · ẑ = 0.
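A quick midpoint-rule check of mine that the flux in E.g. 3.5 is indeed 2π/3:

```python
import numpy as np

# Midpoint rule in r; the theta integral is trivial since the integrand
# (r zhat + rhat) . zhat * r = r^2 does not depend on theta.
n = 1000
r = (np.arange(n) + 0.5) / n            # midpoints of r in [0, 1]
flux = 2 * np.pi * np.sum(r**2) / n     # ~ 2*pi * integral of r^2 dr
print(flux, 2 * np.pi / 3)
```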

3.4 Stokes’ theorem

Consider that the vector field v is expressed as the curl of another vector field, i.e. v = ∇ × A.
This frequently happens in applications.

Defn: The boundary of the surface S is denoted ∂S and, since it is mapped from the boundary
∂D, it inherits its properties, being a simple closed path. If c(t) ∈ R2 is the simple closed path
along ∂D then
p(t) = s(c(t))
is the simple closed path along ∂S.

Note: A closed path has no start and finish point and can be oriented in either the anti-clockwise
or clockwise directions.
Proposition: (Stokes’ theorem) Let S be a surface in R3 with boundary ∂S; let A be a vector
field. Then
\[ \int_S \nabla \times A \cdot dS = \int_{\partial S} A \cdot dr, \qquad (21) \]

where dS and ∂S must be consistently oriented according to the right-hand THUMB rule.

Defn: (Right-hand thumb rule) Point the thumb of your right hand along curve ∂S (either
clockwise or anti-clockwise). Wrap your fingers around the curve; your fingers will indicate the
direction of the normal N (or n̂) of the surface that must be chosen in accordance with your choice
of direction around the curve.
E.g. 3.6: Compute \( \int_S \nabla \times f \cdot dS \) where f(r) = ω × r and ω = (ω1 , ω2 , ω3 ) is a constant vector. S is the hemisphere in z > 0 of radius R, with the normal to the surface defined to point inwards towards the origin.

A: On problem sheet 2, we have shown that ∇ × f = 2ω ≡ (2ω1 , 2ω2 , 2ω3).

We will calculate the surface integral using three different approaches.

(i) Cartesian coordinates. Use the parametrisation defined in E.g. 3.4. I.e. the map
s(u, v) = (u, v, √(R² − u² − v²))
maps the domain D = {(u, v) | u² + v² < R²} onto S.
Now
\[ N(u,v) = \frac{\partial s}{\partial u} \times \frac{\partial s}{\partial v} = \left( 1, 0, -\frac{u}{w} \right) \times \left( 0, 1, -\frac{v}{w} \right) = \left( \frac{u}{w}, \frac{v}{w}, 1 \right), \]
abbreviating w = √(R² − u² − v²), which is the z-coordinate on the sphere. We can see N points in the same direction as s, which is the direction of the outward normal r̂ = (u, v, w)/R. This is not the direction we specified, so we must adjust the sign of N by inserting a minus sign in
\[ \int_S \nabla \times f \cdot dS = -\int_D \nabla \times f \cdot N(u,v)\, du\, dv = -\int_{u^2+v^2<R^2} \left( \frac{2\omega_1 u}{w} + \frac{2\omega_2 v}{w} + 2\omega_3 \right) du\, dv = -\int_{u^2+v^2<R^2} 2\omega_3\, du\, dv = -2\pi\omega_3 R^2 \]
(using the oddness of the functions w.r.t. u and v in the first two terms).

(ii) Use spherical polars: we define the map
s(φ, θ) = Rr̂ = (R sin φ cos θ, R sin φ sin θ, R cos φ),
which maps D = {(φ, θ) | 0 ≤ φ ≤ π/2, 0 ≤ θ ≤ 2π} onto the hemisphere.
So now
\[ N = \frac{\partial s}{\partial \phi} \times \frac{\partial s}{\partial \theta} = \cdots = R^2(\sin^2\phi\cos\theta, \sin^2\phi\sin\theta, \sin\phi\cos\phi) = R^2\sin\phi\, \hat{r}, \]
and, as before, N points outwards on S. Because we defined the integral in terms of a surface pointing inwards we have to reverse the sign of N and write
\[ \int_S \nabla \times f \cdot dS = -\int_0^{\pi/2}\!\!\int_0^{2\pi} 2\omega \cdot (R^2\sin\phi\, \hat{r})\, d\theta\, d\phi, \]
but
\[ \int_0^{2\pi} \omega \cdot \hat{r}\, d\theta = \int_0^{2\pi} (\omega_1\sin\phi\cos\theta + \omega_2\sin\phi\sin\theta + \omega_3\cos\phi)\, d\theta = 2\pi\omega_3\cos\phi. \]
Thus, taken together,
\[ \int_S \nabla \times f \cdot dS = -4\pi R^2 \omega_3 \int_0^{\pi/2} \sin\phi\cos\phi\, d\phi = -2\pi R^2 \omega_3, \]
the same as before.


(iii) Using Stokes’ theorem (21), the answer is
\[ \int_{\partial S} f \cdot dr \]
where ∂S is the edge of the hemisphere, radius R, where it meets z = 0. This is the circle of radius R in the (x, y)-plane. By the RH thumb rule, the integral needs to be directed in the clockwise direction (looking from above).
We define the circle by the path p(θ) = Rr̂ = (R cos θ, R sin θ, 0), 0 ≤ θ < 2π. Now
\[ \int_{\partial S} f \cdot dr = \int_{2\pi}^0 f(p(\theta)) \cdot p'(\theta)\, d\theta, \]
and we have
\[ p'(\theta) = R(-\sin\theta, \cos\theta, 0) = R\hat{\theta} \]
(in cylindrical polars) on ∂S. So
\[ \int_{\partial S} f \cdot dr = \int_{2\pi}^0 (\omega \times (R\hat{r})) \cdot R\hat{\theta}\, d\theta = R^2 \int_{2\pi}^0 (\hat{r} \times \hat{\theta}) \cdot \omega\, d\theta \]
using the vector result a · (b × c) = b · (c × a). Since r̂ × θ̂ = ẑ we end up with
\[ \int_{\partial S} f \cdot dr = R^2\omega_3 \int_{2\pi}^0 d\theta = -2\pi R^2 \omega_3, \]
the same as before (and confirms Stokes’ theorem, for this example).

3.4.1 Outline proof of Stokes’ theorem (non-exam)

We first prove it for a rectangle in (u, v)-space. The loose argument then proceeds that rectangles can be assembled as a checkerboard into larger domains, given that the limit of the rectangle size can be taken to go to zero, and since contributions from adjacent sides cancel, leaving just the circuit around the total domain. See fig 3.
Let D = {(u, v) | 0 < u < a, 0 < v < b}. The surface S is defined by the map s(u, v) : D → R3. The boundary ∂D = C1 ∪ C2 ∪ C3 ∪ C4 is mapped by s onto ∂S = ∂S1 ∪ ∂S2 ∪ ∂S3 ∪ ∂S4 .
For e.g. C2 is the path p(t) = s(a, t), 0 < t < b, and
\[ \int_{\partial S_2} A \cdot dr = \int_0^b A(p(t)) \cdot p'(t)\, dt = \int_0^b A(s(a,v)) \cdot \frac{\partial s}{\partial v}(a,v)\, dv. \]
Similarly,
\[ \int_{\partial S_4} A \cdot dr = -\int_0^b A(s(0,v)) \cdot \frac{\partial s}{\partial v}(0,v)\, dv \]
(the minus sign accounts for reversing the orientation of the segment C4 , viz: \( \int_b^0 = -\int_0^b \)).

Combining results gives
\[ \left( \int_{\partial S_2} + \int_{\partial S_4} \right) A \cdot dr = \int_0^b \left( A(s(a,v)) \cdot \frac{\partial s}{\partial v}(a,v) - A(s(0,v)) \cdot \frac{\partial s}{\partial v}(0,v) \right) dv = \int_0^b\!\!\int_0^a \frac{\partial}{\partial u}\left( A(s) \cdot \frac{\partial s}{\partial v} \right) du\, dv \]
by the FTC. We apply the same method to the sides ∂S1 and ∂S3 and find
\[ \left( \int_{\partial S_1} + \int_{\partial S_3} \right) A \cdot dr = -\int_0^b\!\!\int_0^a \frac{\partial}{\partial v}\left( A(s) \cdot \frac{\partial s}{\partial u} \right) du\, dv \]
and so
\[ \int_{\partial S} A \cdot dr = \int_D \left\{ \frac{\partial}{\partial u}\left( A(s) \cdot \frac{\partial s}{\partial v} \right) - \frac{\partial}{\partial v}\left( A(s) \cdot \frac{\partial s}{\partial u} \right) \right\} du\, dv. \]
Concentrate on the integrand of the RHS (the mixed second-derivative terms A · ∂²s/∂u∂v cancel between the two terms). So
\[ \frac{\partial}{\partial u}\left( A(s) \cdot \frac{\partial s}{\partial v} \right) - \frac{\partial}{\partial v}\left( A(s) \cdot \frac{\partial s}{\partial u} \right) = \frac{\partial A(s)}{\partial u} \cdot \frac{\partial s}{\partial v} - \frac{\partial A(s)}{\partial v} \cdot \frac{\partial s}{\partial u} \]
\[ = \frac{\partial A_k}{\partial x_j}\left( \frac{\partial x_j}{\partial u}\frac{\partial x_k}{\partial v} - \frac{\partial x_j}{\partial v}\frac{\partial x_k}{\partial u} \right) = \frac{\partial A_k}{\partial x_j}(\delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl})\frac{\partial x_l}{\partial u}\frac{\partial x_m}{\partial v} = \epsilon_{ijk}\epsilon_{ilm}\frac{\partial A_k}{\partial x_j}\frac{\partial x_l}{\partial u}\frac{\partial x_m}{\partial v} \]
\[ = \left( \epsilon_{ijk}\frac{\partial A_k}{\partial x_j} \right)\left( \epsilon_{ilm}\frac{\partial x_l}{\partial u}\frac{\partial x_m}{\partial v} \right) = (\nabla \times A) \cdot \left( \frac{\partial s}{\partial u} \times \frac{\partial s}{\partial v} \right) = (\nabla \times A) \cdot N(u,v). \]
Hence we have
\[ \int_{\partial S} A \cdot dr = \int_D (\nabla \times A) \cdot N(u,v)\, du\, dv = \int_S (\nabla \times A) \cdot dS \]
as required. As mentioned earlier, we now “glue together” small rectangles to create the actual
domain we wish to cover. This can be done formally.

Figure 3: A number of rectangles (left) can be put together to cover the domain (right).

3.4.2 Green’s theorem in the plane

Stokes’ theorem is applied in 2D. Let S be a surface on z = 0 and A = (A1 (x, y), A2 (x, y), 0) a vector field with no ẑ component and no dependence on z. Now dS = ẑ dS = ẑ dx dy and
\[ \nabla \times A = \left( \frac{\partial A_2}{\partial x} - \frac{\partial A_1}{\partial y} \right)\hat{z}. \]
Thus Stokes’ theorem reduces to
\[ \int_S \left( \frac{\partial A_2}{\partial x} - \frac{\partial A_1}{\partial y} \right) dx\, dy = \int_{\partial S} A \cdot dr \equiv \int_{\partial S} A_1\, dx + A_2\, dy \]

since dr = dxx̂ + dyŷ and ∂S is anti-clockwise by the RH Thumb rule.
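Green's theorem is easy to test numerically; a sketch of mine on the unit disc with the arbitrary test field A = (−y³, x³), for which ∂A2/∂x − ∂A1/∂y = 3(x² + y²):

```python
import numpy as np

trap = lambda vals, dx: np.sum((vals[1:] + vals[:-1]) / 2) * dx   # trapezoid rule

# Boundary term: A . dr around the unit circle, anti-clockwise.
n = 100001
t = np.linspace(0.0, 2 * np.pi, n)
x, y = np.cos(t), np.sin(t)
integrand = (-y**3) * (-np.sin(t)) + (x**3) * np.cos(t)   # A1 x'(t) + A2 y'(t)
boundary = trap(integrand, t[1] - t[0])

# Area term: 3r^2 integrated over the disc using rings of area 2*pi*r dr.
r = np.linspace(0.0, 1.0, n)
area = trap(3 * r**2 * 2 * np.pi * r, r[1] - r[0])
print(boundary, area, 3 * np.pi / 2)   # all ~ 4.712
```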

3.5 Volume integrals

3.5.1 Volume integrals of scalar fields

Defn: Let f : R3 → R s.t. r 7→ f (r) be a scalar field. The volume integral of f is given by
\[ \int_V f(r)\, dV \equiv \iiint f(x,y,z)\, dx\, dy\, dz \]
in Cartesians.
Note: Unlike curves and surfaces, volumes in R3 do not have directions.
Note: if f = 1, then \( \int_V 1\, dV \) gives the physical volume of V.

Proposition: If we move to a different coordinate system, q = (q1 , q2 , q3 ), from r = (x, y, z) under the mapping r : R3 → R3 s.t. q 7→ r(q), then
\[ \int_V f(r)\, dx\, dy\, dz = \int_{V_q} f(r(q)) \left| \frac{\partial(x,y,z)}{\partial(q_1,q_2,q_3)} \right| dq_1\, dq_2\, dq_3 \]
where Vq is mapped by r into V. The scale factor is the Jacobian determinant of the mapping.

Proof: The elemental volume dV = dx dy dz is (ẑ dz) · ((x̂ dx) × (ŷ dy)). Under the mapping, the mapped elemental volume dVq is defined by a parallelepiped with sides given by
\[ \frac{\partial r}{\partial q_1}\, dq_1, \quad \frac{\partial r}{\partial q_2}\, dq_2, \quad \text{and} \quad \frac{\partial r}{\partial q_3}\, dq_3 \]
(just as we did for surfaces). The volume of dVq is therefore
\[ |(\hat{q}_3 h_3\, dq_3) \cdot ((\hat{q}_1 h_1\, dq_1) \times (\hat{q}_2 h_2\, dq_2))| = |J_r|\, dq_1\, dq_2\, dq_3, \]
since q̂α = (1/hα) ∂r/∂qα, and using a result in §2.8.1. If the q̂j are orthonormal, then |Jr | = h1 h2 h3 .

3.6 Divergence theorem (Gauss’ theorem)

Defn: A simply connected domain, V , say, is one in which all paths within V can be continu-
ously transformed into all other paths within V without ever leaving V .
If V is finite and simply connected then ∂V forms a closed surface (a surface with no boundaries).
Proposition: For a vector field F : R3 → R3, let V ⊂ R3 be simply connected with boundary ∂V (a closed surface). Then
\[ \int_V \nabla \cdot F\, dV = \int_{\partial V} F \cdot dS \]
where dS = n̂dS is a surface element and n̂ points outwards from the volume V .

3.6.1 Outline proof of the divergence theorem (non-exam)

As in Stokes’ theorem, start with a proof for a cuboid

V = {r | 0 < x < a, 0 < y < b, 0 < z < c}.

The argument will again be that an arbitrary V can be divided into many small rectangular volumes, over each of which the divergence theorem applies.

We write F = F1 x̂ + F2 ŷ + F3 ẑ. Then it follows that


\[ \int_{\partial V} F \cdot dS = \int_0^a\!\!\int_0^b (F_3(x,y,c) - F_3(x,y,0))\, dy\, dx + \int_0^a\!\!\int_0^c (F_2(x,b,z) - F_2(x,0,z))\, dz\, dx + \int_0^b\!\!\int_0^c (F_1(a,y,z) - F_1(0,y,z))\, dz\, dy. \qquad (22) \]
(There are 6 sides, and the unit outward normal is one of ±x̂, ±ŷ, ±ẑ depending on the cuboid side.)
Next we consider the volume integral,
\[ \int_V \nabla \cdot F\, dV = \int_0^a\!\!\int_0^b\!\!\int_0^c \left( \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right) dz\, dy\, dx. \]

The 3 terms are considered separately but in the same manner. For example, consider the contribution from ∂F3/∂z. From the Fundamental Theorem of Calculus,
\[ \int_0^a\!\!\int_0^b\!\!\int_0^c \frac{\partial F_3}{\partial z}\, dz\, dy\, dx = \int_0^a\!\!\int_0^b (F_3(x,y,c) - F_3(x,y,0))\, dy\, dx. \]
The result is
\[ \int_V \nabla \cdot F\, dV = \int_0^a\!\!\int_0^b (F_3(x,y,c) - F_3(x,y,0))\, dy\, dx + \int_0^a\!\!\int_0^c (F_2(x,b,z) - F_2(x,0,z))\, dz\, dx + \int_0^b\!\!\int_0^c (F_1(a,y,z) - F_1(0,y,z))\, dz\, dy, \]
which coincides with (22), thus confirming the theorem.

Figure 4: Gauss’ theorem for the cuboid V . The top and bottom faces of the boundary, S1 and
S2 , are indicated.

E.g. 3.7: Compute \( \int_V \nabla \cdot v\, dV \) where V is a sphere of radius a about the origin, and
v(r) = r + f (r) ẑ × r
(and ẑ is the unit vector along the z-axis, and r = (x² + y² + z²)^{1/2}).
A (i): We have that
∇ · v = 3 + (∇f (r)) · (ẑ × r) + f (r) ∇ · (ẑ × r)
after using PS2, Q6(a). But ∇f = (f′(r)/r) r and r · (ẑ × r) = 0. Also, it can be shown that ∇ · (ẑ × r) = 0, so that
∇ · v = 3.
As the divergence of v is a constant, its integral over V is just its value times the volume of V:
\[ \int_V \nabla \cdot v\, dV = 3 \cdot \frac{4\pi}{3} a^3 = 4\pi a^3. \]

A (ii): Using the divergence theorem, the answer is \( \int_{\partial V} v(r) \cdot dS \) where ∂V is the sphere of radius a.
The surface is parametrised using spherical polars by
s(φ, θ) = a r̂ = a(sin φ cos θ, sin φ sin θ, cos φ),
over the domain D = {(φ, θ) | 0 ≤ φ ≤ π, 0 ≤ θ < 2π}. Then
\[ \int_{\partial V} v \cdot dS = \int_0^\pi\!\!\int_0^{2\pi} v(s(\phi,\theta)) \cdot N(\phi,\theta)\, d\theta\, d\phi. \]
We have (confirm yourselves)
\[ \frac{\partial r}{\partial \phi} = a\hat{\phi}, \qquad \frac{\partial r}{\partial \theta} = a\sin\phi\, \hat{\theta}, \]
so that
N(φ, θ) = a² sin φ r̂.
Therefore,
v(s(φ, θ)) · N(φ, θ) = (a r̂ + f (a) ẑ × (a r̂)) · a² sin φ r̂ = a³ sin φ.
The surface integral is given by
\[ \int_{\partial V} v \cdot dS = \int_0^\pi\!\!\int_0^{2\pi} a^3 \sin\phi\, d\theta\, d\phi = 4\pi a^3. \]
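A Monte Carlo sketch of mine confirming the volume side of E.g. 3.7: since ∇ · v = 3 everywhere, the integral is 3 × (volume of the ball) = 4πa³, whatever f is.

```python
import numpy as np

rng = np.random.default_rng(0)
a, N = 2.0, 400_000
pts = rng.uniform(-a, a, size=(N, 3))    # uniform samples in the bounding cube
inside = (pts**2).sum(axis=1) <= a**2
volume = (2 * a)**3 * inside.mean()      # ~ (4/3) * pi * a^3
print(3 * volume, 4 * np.pi * a**3)      # both ~ 100.5
```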

3.6.2 Green’s Identities

If F = ∇f (i.e. the vector field can be described by a scalar potential) then the divergence theorem reads
\[ \int_V \triangle f\, dV = \int_{\partial V} \hat{n} \cdot \nabla f\, dS. \]
If F = g∇f, with g, f scalar fields, then
\[ \int_V (\nabla g \cdot \nabla f + g \triangle f)\, dV = \int_{\partial V} g\, \hat{n} \cdot \nabla f\, dS; \]
subtracting the result of using F = f∇g we have
\[ \int_V (g \triangle f - f \triangle g)\, dV = \int_{\partial V} (g\, \hat{n} \cdot \nabla f - f\, \hat{n} \cdot \nabla g)\, dS. \]

These can be very useful in deriving equations underlying physical applications.

Appendix: revision of the chain rule
Taylor series: Recall for a scalar function f (x),
\[ f(x_0 + h) = f(x_0) + h f'(x_0) + \frac{h^2}{2!} f''(x_0) + \ldots \]
or, equivalently (x = x0 + h),
\[ f(x) = f(x_0) + (x - x_0) f'(x_0) + \frac{(x - x_0)^2}{2!} f''(x_0) + \ldots \]
E.g. A.1: If f (x) = cos x, with x0 = 0 we get
\[ \cos h = 1 - \frac{h^2}{2!} + \ldots \]

Proposition: Start with chain rule for a scalar function of a single variable (sometimes
referred to a differentiation of a function of a function). Consider a function f (x) such that
x = x(t). Then F (t) = f (x(t)) and

\[ \frac{dF}{dt} = x'(t) f'(x(t)) \]

E.g. A.2: If f (x) = cos x and x(t) = t², then x′(t) = 2t, f′(x) = −sin x, and so we get
\[ \frac{d}{dt}\cos(t^2) = -2t\sin(t^2) \]

Proof: Standard limit used to define a derivative, along with Taylor series expansions (notation
O(h2 ) means collect terms as small as and smaller than h2 )

\[ \frac{d}{dt} f(x(t)) = \lim_{h\to 0} \frac{f(x(t+h)) - f(x(t))}{h} \]
\[ = \lim_{h\to 0} \frac{f(x(t) + h x'(t) + O(h^2)) - f(x(t))}{h} \]
\[ = \lim_{h\to 0} \frac{f(x(t)) + h x'(t) f'(x(t)) + O(h^2) - f(x(t))}{h} \]
\[ = x'(t) f'(x(t)) \]

Now consider a scalar function of more than one variable, f (x, y), say, and let x = x(t) and
y = y(t).
That is, we can write F (t) = f (x(t), y(t)), say. From the chain rule, it follows that

\[ \frac{dF}{dt} = \frac{dx}{dt}\frac{\partial f}{\partial x}(x(t), y(t)) + \frac{dy}{dt}\frac{\partial f}{\partial y}(x(t), y(t)) \]
Proof: Similar to before and requires the extension to multiple variables of Taylor’s expansion:

f (x0 + h, y0 + k) = f (x0 , y0 ) + hfx (x0 , y0) + kfy (x0 , y0 ) + higher order terms

E.g. A.3: If f (x, y) = xy and x(t) = t² and y(t) = e⁻ᵗ then F (t) = t²e⁻ᵗ. We can see by a direct calculation that
F′(t) = 2te⁻ᵗ − t²e⁻ᵗ.
Using the chain rule, we get x′(t) = 2t, y′(t) = −e⁻ᵗ, fx = y, fy = x and so
F′(t) = (2t)e⁻ᵗ − e⁻ᵗt²

and we get the same answer.


Note: Clearly this extends to scalar functions of more than two variables, so that if we have the function f (x1 , . . . , xm ) and xi = xi (t), for i = 1, 2, . . . , m, then with F (t) = f (x1 (t), . . . , xm (t)),
\[ \frac{dF}{dt} = \sum_{i=1}^m \frac{dx_i}{dt}\frac{\partial f}{\partial x_i}(x_1(t), \ldots, x_m(t)). \]
Put another way, F (t) = f (x) where x = x(t), and the summation above can be interpreted either as row times column vectors or as a dot product:
F′(t) = (∇f)T x′(t) ≡ ∇f · x′(t),

noting that ∇f is evaluated at x(t).


Next, consider scalar functions of more than one variable, each of which is a function
of more than one variable. The simplest example is to consider f (x, y) where x = x(u, v) and
y = y(u, v). Then F (u, v) = f (x(u, v), y(u, v)) and the chain rule gives
\[ \frac{\partial F}{\partial u} = \frac{\partial x}{\partial u}\frac{\partial f}{\partial x} + \frac{\partial y}{\partial u}\frac{\partial f}{\partial y} \]
and
\[ \frac{\partial F}{\partial v} = \frac{\partial x}{\partial v}\frac{\partial f}{\partial x} + \frac{\partial y}{\partial v}\frac{\partial f}{\partial y}. \]
This can be arranged as a matrix/vector relation as
\[ (F_u, F_v) = (f_x, f_y) \begin{pmatrix} x_u & x_v \\ y_u & y_v \end{pmatrix} \]

Note: The 2 × 2 matrix is the Jacobian of the map (u, v) 7→ (x(u, v), y(u, v)), the 1 × 2 matrix (Fu , Fv ) is the Jacobian of the map (u, v) 7→ F (u, v), and the 1 × 2 matrix (fx , fy ) is the Jacobian of the map (x, y) 7→ f (x, y).
Note: Again, we can see a way of extending this to a more general case in which f = f (x1 , . . . , xm ) and xi = xi (u1 , . . . , up ). Then we can write
F (u1 , . . . , up ) = f (x1 (u1 , . . . , up ), . . . , xm (u1 , . . . , up ))
or
F (u) = f (x(u))
and application of the chain rule gives
\[ \frac{\partial F}{\partial u_j} = \sum_{k=1}^m \frac{\partial x_k}{\partial u_j}\frac{\partial f}{\partial x_k} \]
for j = 1, . . . , p. This can be interpreted as the matrix/vector relation
\[ (F_{u_1}, \ldots, F_{u_p}) = (f_{x_1}, \ldots, f_{x_m}) \begin{pmatrix} \partial x_1/\partial u_1 & \cdots & \partial x_1/\partial u_p \\ \vdots & \ddots & \vdots \\ \partial x_m/\partial u_1 & \cdots & \partial x_m/\partial u_p \end{pmatrix} \]

The final generalisation of this is to consider a vector function where f : Rm → Rn s.t. x 7→ f(x)
and another map x = x(u) where x : Rp → Rm . Then if we define F(u) = f(x(u)) and apply the
chain rule as above to each scalar component, Fi , i = 1, . . . , n of F we get
\[ \frac{\partial F_i}{\partial u_j} = \sum_{k=1}^m \frac{\partial x_k}{\partial u_j}\frac{\partial f_i}{\partial x_k} \]
or
\[ \begin{pmatrix} \partial F_1/\partial u_1 & \cdots & \partial F_1/\partial u_p \\ \vdots & \ddots & \vdots \\ \partial F_n/\partial u_1 & \cdots & \partial F_n/\partial u_p \end{pmatrix} = \begin{pmatrix} \partial f_1/\partial x_1 & \cdots & \partial f_1/\partial x_m \\ \vdots & \ddots & \vdots \\ \partial f_n/\partial x_1 & \cdots & \partial f_n/\partial x_m \end{pmatrix} \begin{pmatrix} \partial x_1/\partial u_1 & \cdots & \partial x_1/\partial u_p \\ \vdots & \ddots & \vdots \\ \partial x_m/\partial u_1 & \cdots & \partial x_m/\partial u_p \end{pmatrix} \]
At this high level of generality we still have the same underlying structure as in the first result. Thus the equation above reads F′(u) = f′(x(u))x′(u), which is the result quoted in the notes for the chain rule applied to the composition of maps, although the notation is shifted here from there.
