Mvcalc Notes 2018 v2
Richard Porter
University of Bristol
2018
Course Information
• Prerequisites: Calculus 1 (and Linear Algebra and Geometry, Analysis 1)
• The course develops multivariable calculus from Calculus 1. The main focus of the course is
on developing differential vector calculus, tools for changing coordinate systems and major
theorems of integral calculus for functions of more than one variable.
This unit is central to many branches of pure and applied mathematics. For example, in
applied mathematics vector calculus is an integral part of describing field theories that model
physical processes and dealing with the equations that arise.
It is used in 2nd year Applied Partial Differential Equations and in year 3 Fluid Dynamics, Quantum Mechanics, Mathematical Methods, and Modern Mathematical Biology.
• Lecturer: Dr. Richard Porter, Room SM2.7
• Web:
https://people.maths.bris.ac.uk/~marp/mvcalc
Notes may contain extra sections for interest or additional information. Problem sheets, solutions, homework feedback forms, problems class sheets, past exam papers, video tutorials.
• Email: [email protected]
• Books: There are lots of books on multivariable/vector calculus. For example: Jerrold E. Marsden & Anthony J. Tromba, "Vector Calculus", 5th ed., W. H. Freeman and Company, 2003.
• Maths Café: TBA
• Office Hours: Tuesday 9-10am.
• Homework set weekly from 5 problems sheets.
• Timetabled problems classes/exercise classes: unseen problems/some from the problem
sheets/and as many as possible from past exam papers.
• Exam: Jan 90 mins. 2 compulsory questions. No calculators.
1 A review of differential calculus for functions of more than one variable
Revision and extension of results from Calculus 1.
Let x = (x1 , x2 , . . . , xm ) ∈ Rm .¹
Often in 2D write x ≡ (x, y) or in 3D x ≡ (x, y, z).
Defn: The derivative of the map F : Rm → Rn is the n × m matrix F′(x) such that the i, jth element is
\[
\{\mathbf{F}'(\mathbf{x})\}_{ij} = \frac{\partial F_i}{\partial x_j}.
\]
¹ Although written on the page as a row vector, in computations vectors are arranged as column vectors unless indicated by a T for transpose.
For scalar functions of single variables, the derivative f ′ (x0 ) is defined to be precisely the function
such that the line formed by
f (x0 ) + (x − x0 )f ′ (x0 )
is tangent at x = x0 to the curve f (x).
For vector functions of multiple variables, F′(x0) is defined to be precisely the matrix such that
\[
\mathbf{F}(\mathbf{x}_0) + \mathbf{F}'(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) \tag{1}
\]
is tangent at x = x0 to F(x).
Defn: The matrix F′ (x) with elements ∂Fi /∂xj is called the Jacobian matrix.
Note: The rows of the Jacobian matrix are formed by gradients of the components of F, viz
\[
\mathbf{F}'(\mathbf{x}) = \begin{pmatrix} (\nabla F_1)^T \\ (\nabla F_2)^T \\ \vdots \\ (\nabla F_n)^T \end{pmatrix}.
\]
1.4 The directional derivative
Defn: The directional derivative of F at x0 along v (such that |v| = 1) is a vector in Rn given by
\[
D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0) = \left( \left.\frac{dF_1(\mathbf{x}_0 + t\mathbf{v})}{dt}\right|_{t=0}, \ldots, \left.\frac{dF_n(\mathbf{x}_0 + t\mathbf{v})}{dt}\right|_{t=0} \right) \equiv \left.\frac{d\mathbf{F}(\mathbf{x}_0 + t\mathbf{v})}{dt}\right|_{t=0}.
\]
It measures the rate of change of F in the direction of v and it is formulated in terms of ordinary
1D derivatives.
Note: Can be shown that
\[
D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0) = \mathbf{F}'(\mathbf{x}_0)\mathbf{v}. \tag{2}
\]
Proof: (informal)
\[
\left.\frac{d\mathbf{F}(\mathbf{x}_0 + t\mathbf{v})}{dt}\right|_{t=0} = \lim_{t\to 0}\frac{\mathbf{F}(\mathbf{x}_0 + t\mathbf{v}) - \mathbf{F}(\mathbf{x}_0)}{t} = \lim_{t\to 0}\frac{\mathbf{F}(\mathbf{x}_0) + \mathbf{F}'(\mathbf{x}_0)(\mathbf{x}_0 + t\mathbf{v} - \mathbf{x}_0) - \mathbf{F}(\mathbf{x}_0)}{t} = \mathbf{F}'(\mathbf{x}_0)\mathbf{v},
\]
which gives the result. In the above, we replaced F by the equation of the tangent plane (1), which coincides with F in the limit t → 0.
Note: If |v| ≠ 1 then redefine v by v/|v| where |v| = √(v1² + v2² + · · · + vm²).
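For interest (not part of the notes), a quick Python/sympy sketch checking Dv F(x0) = F′(x0)v; the map F, the direction v and the point x0 below are illustrative choices.

import sympy as sp

x, y, t = sp.symbols('x y t')
F = sp.Matrix([x**2 * y, sp.sin(x) + y])               # illustrative F : R^2 -> R^2
v = sp.Matrix([sp.Rational(3, 5), sp.Rational(4, 5)])  # unit vector, |v| = 1
x0 = {x: 1, y: 2}                                      # illustrative point

# Left-hand side: differentiate F(x0 + t v) at t = 0.
lhs = F.subs({x: 1 + t * v[0], y: 2 + t * v[1]}).diff(t).subs(t, 0)
# Right-hand side: Jacobian matrix times v.
rhs = (F.jacobian([x, y]) * v).subs(x0)
assert sp.simplify(lhs - rhs) == sp.zeros(2, 1)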
H(x) = f (x)F(x) using the product rule for differentiation. There is no simple representation for the result using standard linear algebra.
3. (Composition of maps) If F : Rm → Rn and G : Rn → Rp , then if we define H : Rm → Rp
by
H(x) = (G ◦ F)(x) = G(F(x))
it follows that
H′ (x) = G′ (F(x)) F′ (x) (3)
where the right-hand side denotes the product of the p × n matrix G′ (F(x)) with the n × m
matrix F′ (x).
Proof: From the definition, Hi = Gi(F1(x1, . . . , xm), F2(x1, . . . , xm), . . . , Fn(x1, . . . , xm)). So
\[
\{\mathbf{H}'(\mathbf{x})\}_{ij} = \frac{\partial H_i}{\partial x_j}(\mathbf{x}) = \frac{\partial G_i}{\partial x_1}\frac{\partial F_1}{\partial x_j} + \frac{\partial G_i}{\partial x_2}\frac{\partial F_2}{\partial x_j} + \cdots + \frac{\partial G_i}{\partial x_n}\frac{\partial F_n}{\partial x_j} = \sum_{k=1}^{n} \frac{\partial G_i}{\partial x_k}(\mathbf{F}(\mathbf{x}))\,\frac{\partial F_k}{\partial x_j}(\mathbf{x})
\]
using the chain rule. This summation can be interpreted as the ith row of G′(F(x)) multiplied by the jth column of F′(x), and this gives the result.
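Again for interest, a sympy sketch of (3); the maps F and G are illustrative choices.

import sympy as sp

x1, x2, u1, u2 = sp.symbols('x1 x2 u1 u2')
F = sp.Matrix([x1 * x2, x1 + x2])        # F : R^2 -> R^2
G = sp.Matrix([sp.sin(u1), u1 * u2])     # G : R^2 -> R^2
H = G.subs({u1: F[0], u2: F[1]})         # H = G o F

lhs = H.jacobian([x1, x2])               # H'(x)
rhs = G.jacobian([u1, u2]).subs({u1: F[0], u2: F[1]}) * F.jacobian([x1, x2])
assert sp.simplify(lhs - rhs) == sp.zeros(2, 2)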
Note: See the Appendix for a revision of the chain rule and how it applies to multivariable
functions.
The inverse map (when it exists) satisfies F−1(F(x)) = x (4). Differentiating, applying (3) to (4) and using the fact that x = Ix, where I is the n × n identity matrix, we have
(F−1 )′ (F(x))F′ (x) = I.
Thus
(F−1 )′ (F(x)) = (F′ )−1 (x). (5)
In other words “the derivative of the inverse is equal to the inverse of the derivative”.
Note: For scalar maps, we recognise this statement as
\[
\frac{dx}{dy} = \left(\frac{dy}{dx}\right)^{-1} = \frac{1}{dy/dx},
\]
and (5) generalises this to functions of more than one variable.
This means that
\[
\mathbf{F}'(r,\theta) = \begin{pmatrix} \partial x/\partial r & \partial x/\partial\theta \\ \partial y/\partial r & \partial y/\partial\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}.
\]
Taking inverses,
\[
(\mathbf{F}'(r,\theta))^{-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta/r & \cos\theta/r \end{pmatrix}.
\]
Now consider the inverse map F−1 : R2 → R2 s.t. (x, y) → (√(x² + y²), tan−1(y/x)) ≡ (r, θ). Then
\[
(\mathbf{F}^{-1})'(x,y) = \begin{pmatrix} \partial r/\partial x & \partial r/\partial y \\ \partial\theta/\partial x & \partial\theta/\partial y \end{pmatrix} = \begin{pmatrix} x/\sqrt{x^2+y^2} & y/\sqrt{x^2+y^2} \\ -y/(x^2+y^2) & x/(x^2+y^2) \end{pmatrix}.
\]
Finally, substituting x = r cos θ and y = r sin θ,
\[
(\mathbf{F}^{-1})'(\mathbf{F}(r,\theta)) = \begin{pmatrix} r\cos\theta/r & r\sin\theta/r \\ -r\sin\theta/r^2 & r\cos\theta/r^2 \end{pmatrix} = (\mathbf{F}'(r,\theta))^{-1},
\]
in agreement with (5).
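An optional sympy sketch of this example, checking (5) for the polar map (assumes r > 0).

import sympy as sp

r, th, x, y = sp.symbols('r theta x y', positive=True)
F = sp.Matrix([r * sp.cos(th), r * sp.sin(th)])           # (r, theta) -> (x, y)
Finv = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan2(y, x)])  # (x, y) -> (r, theta)

J = F.jacobian([r, th])                                   # F'(r, theta)
Jinv_at_F = Finv.jacobian([x, y]).subs({x: r * sp.cos(th), y: r * sp.sin(th)})
assert sp.simplify(Jinv_at_F - J.inv()) == sp.zeros(2, 2)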
The same question can be stated in terms of a solution to a nonlinear system of equations. Namely,
let
F(x) = y (6)
for x, y ∈ Rn . Or, in full,
F1 (x1 , . . . , xn ) = y1
  ⋮
Fn (x1 , . . . , xn ) = yn .
Then, given y, is there an x such that (6) is solved? If so, then x = F−1 (y).
Suppose that y0 = F(x0). If the Jacobian matrix F′(x0) is invertible, then, for y sufficiently close to y0, (6) can be solved uniquely as x = F−1(y).
Note: A matrix is invertible if and only if its determinant is non-zero. The determinant of the
Jacobian matrix F′ is often written as
\[
J_{\mathbf{F}}(\mathbf{x}_0) \equiv \left.\frac{\partial(F_1, \ldots, F_n)}{\partial(x_1, \ldots, x_n)}\right|_{\mathbf{x}=\mathbf{x}_0} \tag{7}
\]
Linearising (6) about x0 gives y ≈ y0 + F′(x0)(x − x0), which means
\[
\mathbf{x} \approx \mathbf{x}_0 + (\mathbf{F}'(\mathbf{x}_0))^{-1}(\mathbf{y} - \mathbf{y}_0)
\]
since y0 = F(x0), but relies on the existence of the inverse of the Jacobian. I.e. given y, x can be determined.
Note: The theorem tells us nothing about what happens if the inverse does not exist.
F(x, y) = 0 (8)
where F : Rm+n → Rn .
Note: If F is linear in y then (8) can be written in the form y = G(x) for some G and the inverse
function theorem applies. We suppose that this is not the case.
Suppose that (8) is satisfied by the pair x0, y0 (i.e. F(x0, y0) = 0). Then we can express solutions of this as y = y(x) for y : Rm → Rn in the neighbourhood of y0 provided the Jacobian determinant
\[
\left.\frac{\partial(F_1, \ldots, F_n)}{\partial(y_1, \ldots, y_n)}\right|_{\mathbf{x}=\mathbf{x}_0,\,\mathbf{y}=\mathbf{y}_0} \tag{9}
\]
is non-zero.
Taking the partial derivative of Fi w.r.t. xj gives (by the chain rule)
\[
\frac{\partial F_i}{\partial x_j} + \frac{\partial F_i}{\partial y_1}\frac{\partial y_1}{\partial x_j} + \cdots + \frac{\partial F_i}{\partial y_n}\frac{\partial y_n}{\partial x_j} = 0
\]
and this can be interpreted as the matrix equation
\[
\begin{pmatrix} \dfrac{\partial F_1}{\partial x_1} & \cdots & \dfrac{\partial F_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_n}{\partial x_1} & \cdots & \dfrac{\partial F_n}{\partial x_m} \end{pmatrix} + \begin{pmatrix} \dfrac{\partial F_1}{\partial y_1} & \cdots & \dfrac{\partial F_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_n}{\partial y_1} & \cdots & \dfrac{\partial F_n}{\partial y_n} \end{pmatrix} \begin{pmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_n}{\partial x_1} & \cdots & \dfrac{\partial y_n}{\partial x_m} \end{pmatrix} = \mathbf{0}
\]
and therefore
\[
\begin{pmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_n}{\partial x_1} & \cdots & \dfrac{\partial y_n}{\partial x_m} \end{pmatrix} = -\begin{pmatrix} \dfrac{\partial F_1}{\partial y_1} & \cdots & \dfrac{\partial F_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_n}{\partial y_1} & \cdots & \dfrac{\partial F_n}{\partial y_n} \end{pmatrix}^{-1} \begin{pmatrix} \dfrac{\partial F_1}{\partial x_1} & \cdots & \dfrac{\partial F_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_n}{\partial x_1} & \cdots & \dfrac{\partial F_n}{\partial x_m} \end{pmatrix}.
\]
E.g. 1.7: Consider f (x, y) = 0 where f (x, y) = x2 + y 2 − 1. This is satisfied by points (x0 , y0 ) on
the unit circle. If we try to express it as y = y(x) we get into trouble since
\[
y = \pm\sqrt{1 - x^2}
\]
and there are two solutions. The implicit function theorem applied to this example requires the
determinant of the 1 × 1 matrix
\[
\frac{\partial f}{\partial y}
\]
evaluated at (x0, y0) to be non-zero. This is 2y0, which is non-zero apart from at y0 = 0. So we can express the solution y = y(x) local to a point (x0, y0) provided y0 ≠ 0. This is obvious in our case: if y0 > 0 we are on the upper solution branch where y = √(1 − x²), and vice versa.
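An optional sympy check of this example: away from y0 = 0, the implicit derivative dy/dx = −fx/fy agrees with differentiating y = √(1 − x²) on the upper branch.

import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2 - 1
dydx = -sp.diff(f, x) / sp.diff(f, y)    # = -x/y from implicit differentiation
upper = sp.sqrt(1 - x**2)                # the branch with y0 > 0
assert sp.simplify(dydx.subs(y, upper) - sp.diff(upper, x)) == 0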
Higher-order derivatives are useful in Taylor’s theorem in dimension ≥ 2, allowing one to approx-
imate functions of several variables near a point.
How do we generalise to higher dimensions? Well, it gets tricky. For a scalar function of more than one variable (e.g. 2), f (x, y),
\[
f(x,y) = f(x_0,y_0) + \left(f_x(x_0,y_0),\ f_y(x_0,y_0)\right)\begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix} + \frac{1}{2}\left(x - x_0,\ y - y_0\right)\begin{pmatrix} f_{xx}(x_0,y_0) & f_{xy}(x_0,y_0) \\ f_{yx}(x_0,y_0) & f_{yy}(x_0,y_0) \end{pmatrix}\begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix} + \text{h.o.t.}
\]
with an obvious generalisation to functions of more than 2 variables.
The higher-order terms are complicated and require cumbersome notation.
For vector functions, F = (F1, . . . , Fn), we can use the scalar result above for each scalar function component, Fi. I.e.
\[
\mathbf{F}(\mathbf{x}) \approx \mathbf{F}(\mathbf{x}_0) + \mathbf{F}'(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)
\]
for x close to x0. This is equation (1) for the tangent plane at the point x0 on F. We have used this approximation as the basis of earlier informal proofs.
2 Differential vector calculus
Focus now on 3D, and adopt the convention that the position vector r = (x, y, z) ≡ (x1, x2, x3) ∈ R3 is used to describe equations pertaining to physical applications.
Notation: The Cartesian (unit) basis vectors in R3 are x̂ = (1, 0, 0) ≡ e1 , ŷ = (0, 1, 0) ≡ e2 and
ẑ = (0, 0, 1) ≡ e3 such that r = xx̂ + yŷ + zẑ ≡ x1 e1 + x2 e2 + x3 e3 .
Also use r = |r| = √(x1² + x2² + x3²) ≡ √(x² + y² + z²) as the length of the position vector.
Defn: The dot product of two vectors u = (u1, u2, u3) and v = (v1, v2, v3) is defined by
\[
\mathbf{u}\cdot\mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3 \equiv \sum_{j=1}^{3} u_j v_j .
\]
Notation: (Einstein summation convention) Drop the Σ in the above on the understanding that repeated suffices imply summation. I.e.
u · v = uj vj .
For e.g., r = |r| = √(r · r) = √(xi xi).
E.g. 2.1: (Sampling property) Another way of defining δij is as the set of elements for which
xi = δij xj , for every i.
Note that δij xj = Σ_{j=1}^{3} δij xj = xi if and only if the defn above holds.
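As an optional aside, numpy's einsum implements exactly this convention; the vectors below are illustrative.

import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
assert np.isclose(np.einsum('j,j->', u, v), u @ v)      # u.v = u_j v_j

delta = np.eye(3)                                       # Kronecker delta
x = np.array([7.0, 8.0, 9.0])
assert np.allclose(np.einsum('ij,j->i', delta, x), x)   # delta_ij x_j = x_i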
Defn: The cross product of two vectors u, v ∈ R3 is the vector given by
\[
\mathbf{u}\times\mathbf{v} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} \equiv (u_2 v_3 - v_2 u_3)\mathbf{e}_1 + (u_3 v_1 - v_3 u_1)\mathbf{e}_2 + (u_1 v_2 - v_1 u_2)\mathbf{e}_3 . \tag{10}
\]
1. ε123 = 1
2. εijk changes sign under interchange of any pair of suffices (so εijk = 0 whenever two suffices are equal).
These properties imply εijk are invariant under cyclic rotation of suffices. Thus ε123 = ε231 = ε312 = 1, ε213 = ε132 = ε321 = −1, and all 21 others are zero.
Note: Another way of defining εijk is as the set of 27 elements such that
[u × v]i = εijk uj vk .
For e.g. [u × v]1 = Σ_j Σ_k ε1jk uj vk = ε111 u1v1 + ε112 u1v2 + ε113 u1v3 + ε121 u2v1 + ε122 u2v2 + ε123 u2v3 + ε131 u3v1 + ε132 u3v2 + ε133 u3v3, and this equals u2v3 − u3v2 only if ε123 = 1, ε132 = −1 and all 7 other ε1jk are zero. Repeat for 2nd and 3rd components.
Note: The definition of εijk guarantees the antisymmetry of the cross product.
Proof: Just have to consider all possible (non-trivial) combinations to see it is true.
a × (b × c) = (a · c)b − (a · b)c.
Proof: Vector identity, so start by looking at the scalar quantity which is the ith component of the LHS:
[a × (b × c)]i = εijk aj [b × c]k = εijk εklm aj bl cm = (δil δjm − δim δjl) aj bl cm = (a · c)bi − (a · b)ci ,
which is the ith component of the RHS.
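An optional numeric spot-check of the identity with numpy (the random vectors are illustrative):

import numpy as np

rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 3))
lhs = np.cross(a, np.cross(b, c))
rhs = np.dot(a, c) * b - np.dot(a, b) * c
assert np.allclose(lhs, rhs)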
Scalar and vector fields defined in R3 are of particular importance for physical applications. For
example:
• (Scalar fields) Temperature T (r); mass density ρ(r) for a fluid or gas; electric charge density
q(r).
• (Vector fields) Velocity v(r) of a fluid or gas; electric and magnetic fields E(r) and B(r); displacement fields in an elastic solid, u(r).
In these physical applications, one often derives equations that govern vector and scalar fields
which involve derivatives in space (and time).
The following three first-order differential operations of vector calculus emerge from this:
Defn: The gradient of a scalar field f , denoted ∇f , is the vector field given by
\[
\nabla f(\mathbf{r}) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right), \quad \text{or, in component form,} \quad [\nabla f]_i = \frac{\partial f}{\partial x_i}, \quad i = 1, 2, 3.
\]
The gradient maps scalar to vector fields.
E.g. 2.7:
\[
\nabla\tan^{-1}\left(\frac{y}{x}\right) = \left( \frac{-y}{x^2+y^2}, \frac{x}{x^2+y^2}, 0 \right) \equiv \frac{1}{x^2+y^2}\left(-y\hat{\mathbf{x}} + x\hat{\mathbf{y}}\right).
\]
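An optional sympy check of E.g. 2.7:

import sympy as sp

x, y = sp.symbols('x y')
f = sp.atan(y / x)
gx, gy = sp.diff(f, x), sp.diff(f, y)
assert sp.simplify(gx + y / (x**2 + y**2)) == 0    # [grad f]_1 = -y/(x^2+y^2)
assert sp.simplify(gy - x / (x**2 + y**2)) == 0    # [grad f]_2 =  x/(x^2+y^2)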
E.g. 2.8: Recall r = √(x² + y² + z²). A direct calculation gives
\[
\nabla r = \left( \frac{x}{r}, \frac{y}{r}, \frac{z}{r} \right) = \frac{(x,y,z)}{r} = \frac{\mathbf{r}}{r}, \quad \text{or, in component form,} \quad [\nabla r]_i = x_i/r.
\]
Note: r/r is the unit vector from the origin to the point r; we often denote this as r̂.
E.g. 2.9: If f (r) = g(r) (i.e. a function depends only on the distance from the origin) then
\[
[\nabla g(r)]_i = \frac{\partial g(r)}{\partial x_i} = \frac{dg(r)}{dr}\frac{\partial r}{\partial x_i} \equiv g'(r)[\nabla r]_i = g'(r)\left[\frac{\mathbf{r}}{r}\right]_i
\]
since r is a function of x1, x2 and x3, and by using the chain rule.
Thus ∇g(r) = g′(r)r̂ (c.f. potentials, central forces in Mech 1).
Recall from Calculus 1, two important interpretations of the gradient:
Provided ∇f is nonzero, the gradient points in the direction in which f increases most rapidly.
Proof: let v be s.t. |v| = 1. Then the rate of change of f in direction v is the directional derivative (see (2)) Dv f (r) = v · ∇f = |∇f | cos θ, where θ is the angle between v and ∇f . This is maximised when θ = 0, i.e. when v is in the direction of ∇f .
Second, ∇f is normal to any level surface S = {r | f (r) = C} of f .
Proof: Let c(t) lie in S. Then f (c(t)) = C, for all t. The chain rule yields
\[
0 = \frac{d}{dt} f(\mathbf{c}(t)) = \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t)
\]
and since c′ (t) is parallel to S at c(t), we have our result.
E.g. 2.10: Consider the temperature T in a room to be a function of 3D position (x, y, z):
\[
T(\mathbf{r}) = \frac{e^x \sin(\pi y)}{1 + z^2}
\]
Q: If you stand at the point (1, 1, 1), in which direction will the room get coolest fastest?
A:
\[
\nabla T = \left( \frac{e^x \sin(\pi y)}{1+z^2},\ \frac{\pi e^x \cos(\pi y)}{1+z^2},\ -\frac{2z e^x \sin(\pi y)}{(1+z^2)^2} \right)
\]
and at (x, y, z) = (1, 1, 1), ∇T = (0, −πe/2, 0). So a vector pointing in the direction where the temperature gets coolest (i.e. decreases most rapidly) is (0, 1, 0).
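An optional sympy check of E.g. 2.10:

import sympy as sp

x, y, z = sp.symbols('x y z')
T = sp.exp(x) * sp.sin(sp.pi * y) / (1 + z**2)
gradT = sp.Matrix([T.diff(s) for s in (x, y, z)]).subs({x: 1, y: 1, z: 1})
print(gradT.T)   # (0, -pi*E/2, 0): so (0, 1, 0) is the direction of fastest cooling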
Defn: The divergence of a vector field v(r), denoted ∇ · v, is the scalar field given by
\[
\nabla\cdot\mathbf{v} = \frac{\partial v_1}{\partial x_1} + \frac{\partial v_2}{\partial x_2} + \frac{\partial v_3}{\partial x_3} \equiv \partial_i v_i .
\]
Note: The use of a ‘dot’ between the symbol used for the gradient and the vector field is purely
notational. Do not get the divergence, which is a differential operation, confused with the dot
product. For e.g. the dot product is a · b = b · a since multiplication is commutative, but
\[
\mathbf{v}\cdot\nabla = v_1\frac{\partial}{\partial x_1} + v_2\frac{\partial}{\partial x_2} + v_3\frac{\partial}{\partial x_3} \neq \nabla\cdot\mathbf{v}.
\]
Harder without physical setting, but broadly it measures the expansion (positive divergence) or
contraction of a field at a point.
For example, consider (i) v(r) = (x, y, 0) then ∇ · v = 2 and (ii) v(r) = (−y, x, 0), then ∇ · v = 0.
Note: The first case corresponds to a 2D radially spreading field and the second to a 2D circular
rotating field (just believe me) which is why the divergence is positive (expanding) and zero (nei-
ther expanding nor contracting) in the two cases.
2.5 Curl
Defn: The curl of a vector field v(r), denoted ∇ × v, is the vector field (i.e. in R3) given by
\[
\nabla\times\mathbf{v} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \partial_x & \partial_y & \partial_z \\ v_1 & v_2 & v_3 \end{vmatrix} \equiv \left( \frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z},\ \frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x},\ \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} \right).
\]
Again harder without physical setting, but broadly it measures the rotation or circulation of a
vector field (because it needs direction) at a point.
Using the same examples from the 'Div' section: (i) v(r) = (x, y, 0), then ∇ × v = 0 (the radially spreading field has no rotation); and (ii) v(r) = (−y, x, 0), then ∇ × v = 2ẑ (the rotational field has!)
E.g. 2.16: Let v(r) = (y², x², y²). Then
\[
\nabla\times\mathbf{v} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \partial_x & \partial_y & \partial_z \\ y^2 & x^2 & y^2 \end{vmatrix} = (2y,\ 0,\ 2(x-y)).
\]
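An optional check of E.g. 2.16 using sympy's vector module:

from sympy.vector import CoordSys3D, curl

C = CoordSys3D('C')
v = C.y**2 * C.i + C.x**2 * C.j + C.y**2 * C.k
print(curl(v))   # 2*C.y*C.i + (2*C.x - 2*C.y)*C.k, i.e. (2y, 0, 2(x - y))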
The operations of grad, div and curl can be combined. Thus, only the following combinations of operations make sense:
2.6.1 Two Null Identities
These are ∇ × (∇f ) = 0 for any scalar field f , and ∇ · (∇ × A) = 0 for any vector field A. In each case the proof pairs the antisymmetry of εijk with the symmetry of second derivatives: since the expression equals its own negative, it must vanish.
Defn: The Laplacian of a scalar field f (r), denoted ∇²f (or △f ), is the scalar field defined by
\[
\triangle f = \nabla\cdot\nabla f(\mathbf{r}) = \partial_i\partial_i f \equiv \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) f.
\]
The definition can also be extended to consider the Laplacian of a vector field v(r) which is
△v = −∇ × (∇ × v) + ∇(∇ · v).
2.8 Curvilinear coordinate systems
All differential operators defined above were expressed in Cartesian coordinates. For many practical problems it is more natural to express problems in coordinates aligned with the principal features of the problem. E.g. polars are appropriate for circular domains.
Q: How do we recast the differential operators in a different coordinate system?
For example: in 2D, if q1 = r and q2 = θ then x = x(r, θ) = r cos θ and y = y(r, θ) = r sin θ. The inverse map is r = r(x, y) = √(x² + y²), θ = θ(x, y) = tan−1(y/x).
Defn: The surfaces qi = const are called coordinate surfaces. The space curves formed by their
intersection in pairs are called the coordinate curves. The coordinate axes are determined
by the tangents to the coordinate curves at the intersection of three surfaces. They are not, in
general, fixed directions in space.
The two points r(q1, q2, q3) and r(q1 + dq1, q2, q3) lie on a coordinate curve formed by holding q2, q3 constant. Thus, the q1-coordinate axis is determined by letting dq1 → 0 in
\[
\frac{\mathbf{r}(q_1+dq_1,q_2,q_3) - \mathbf{r}(q_1,q_2,q_3)}{dq_1} = \frac{\mathbf{r}(q_1,q_2,q_3) + dq_1\,\dfrac{\partial\mathbf{r}}{\partial q_1} + \text{h.o.t. order }(dq_1)^2 - \mathbf{r}(q_1,q_2,q_3)}{dq_1} = \frac{\partial\mathbf{r}}{\partial q_1} + O(dq_1)
\]
(after Taylor expanding). Repeat with q2 and q3 . Thus, we can describe the point q = q1 q̂1 +
q2 q̂2 + q3 q̂3 in terms of the local coordinate basis given by unit vectors directed along the local
coordinate axes:
\[
\hat{\mathbf{q}}_1 = \frac{1}{h_1}\frac{\partial\mathbf{r}}{\partial q_1}, \qquad \hat{\mathbf{q}}_2 = \frac{1}{h_2}\frac{\partial\mathbf{r}}{\partial q_2}, \qquad \hat{\mathbf{q}}_3 = \frac{1}{h_3}\frac{\partial\mathbf{r}}{\partial q_3}
\]
where, to ensure |q̂i| = 1, we have normalised by
\[
h_i = \left| \frac{\partial\mathbf{r}}{\partial q_i} \right|.
\]
Note: the use of Greek indices in, for e.g.,
\[
\hat{\mathbf{q}}_\alpha = \frac{1}{h_\alpha}\frac{\partial\mathbf{r}}{\partial q_\alpha}
\]
for α = 1, 2, 3 indicates that the summation convention is not applied.
Remark: Is this always possible? I.e. is there always a unique map from one system to another? This is the same as asking if there is an inverse map. Thus (by the inverse function theorem) the answer lies in the Jacobian matrix of the map, given here by r′(q), which is the matrix with hα q̂α as column vectors (for α = 1, 2, 3). Thus the Jacobian determinant
\[
J_{\mathbf{r}} = \frac{\partial(x,y,z)}{\partial(q_1,q_2,q_3)} \equiv \begin{vmatrix} h_1[\hat{\mathbf{q}}_1]_1 & h_2[\hat{\mathbf{q}}_2]_1 & h_3[\hat{\mathbf{q}}_3]_1 \\ h_1[\hat{\mathbf{q}}_1]_2 & h_2[\hat{\mathbf{q}}_2]_2 & h_3[\hat{\mathbf{q}}_3]_2 \\ h_1[\hat{\mathbf{q}}_1]_3 & h_2[\hat{\mathbf{q}}_2]_3 & h_3[\hat{\mathbf{q}}_3]_3 \end{vmatrix}
\]
must be non-vanishing. The 2nd representation simply shows that no new calculations are needed to populate the entries of the Jacobian determinant.
Defn: If local basis vectors of a curvilinear coordinate system are mutually orthogonal, we call
it an orthogonal curvilinear coordinate system. Convention dictates that the system be right-
handed, or q̂1 = q̂2 × q̂3 . (or, form axes from your thumb, index and middle fingers of your right
hand and order basis vectors 1, 2, 3 on each respective digit.)
Hence h1 = 1 (and similarly h2 = h3 = 1).
Thus the local basis vectors are
Note: From the definition of the basis vectors and using summation notation for the dot product we have q̂i · q̂j = Rki Rkj = δij (using (14)), and so the basis vectors are orthonormal.
In other words, the new coordinate axes are a general rotation of the original x̂, ŷ, ẑ axes.
E.g. 2.20: In 3D, cylindrical polar coordinates (r, θ, z) are defined by the mapping r = (x, y, z) = (r cos θ, r sin θ, z).
Figure 2: A local basis in spherical polar coordinates. The vector r̂ points along a ray from the
center, φ̂ points along the meridians, and θ̂ along the parallels.
and these vary with position. Note that r̂ · θ̂ = r̂ · ẑ = θ̂ · ẑ = 0, and r̂ = θ̂ × ẑ, so cylindrical
coordinates are indeed orthogonal.
In spherical polar coordinates, r = (x, y, z) = (r sin φ cos θ, r sin φ sin θ, r cos φ); see Fig. 2. Now the derivatives with respect to the coordinates are
\[
\frac{\partial\mathbf{r}}{\partial r} = (\sin\phi\cos\theta,\ \sin\phi\sin\theta,\ \cos\phi), \qquad
\frac{\partial\mathbf{r}}{\partial\phi} = (r\cos\phi\cos\theta,\ r\cos\phi\sin\theta,\ -r\sin\phi), \qquad
\frac{\partial\mathbf{r}}{\partial\theta} = (-r\sin\phi\sin\theta,\ r\sin\phi\cos\theta,\ 0),
\]
and the scale factors become (check):
\[
h_r = \left|\frac{\partial\mathbf{r}}{\partial r}\right| = 1, \qquad h_\phi = \left|\frac{\partial\mathbf{r}}{\partial\phi}\right| = r, \qquad h_\theta = \left|\frac{\partial\mathbf{r}}{\partial\theta}\right| = r\sin\phi. \tag{17}
\]
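An optional sympy computation behind (17); printing the squares hi² sidesteps the sign of the square root.

import sympy as sp

r, phi, th = sp.symbols('r phi theta', positive=True)
pos = sp.Matrix([r * sp.sin(phi) * sp.cos(th),
                 r * sp.sin(phi) * sp.sin(th),
                 r * sp.cos(phi)])
for q in (r, phi, th):
    d = pos.diff(q)
    print(q, sp.simplify(d.dot(d)))   # 1, r**2, r**2*sin(phi)**2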
2.8.2 Transformation of the gradient
Now if u = u1q̂1 + u2q̂2 + u3q̂3 then the orthonormal property of the local basis vectors means uj = u · q̂j. If we let u = ∇f and with (19) we find
\[
\nabla f = \sum_{\alpha=1}^{3} \frac{\hat{\mathbf{q}}_\alpha}{h_\alpha}\frac{\partial f}{\partial q_\alpha}, \quad \text{and so} \quad \nabla = \sum_{\alpha=1}^{3} \frac{\hat{\mathbf{q}}_\alpha}{h_\alpha}\frac{\partial}{\partial q_\alpha}. \tag{20}
\]
E.g. 2.22: In cylindrical polar coordinates, according to (20) and (15), we have
\[
\nabla = \hat{\mathbf{r}}\frac{\partial}{\partial r} + \hat{\boldsymbol{\theta}}\frac{1}{r}\frac{\partial}{\partial\theta} + \hat{\mathbf{z}}\frac{\partial}{\partial z}.
\]
E.g. 2.23: In spherical coordinates, according to (20) and (18),
\[
\nabla = \hat{\mathbf{r}}\frac{\partial}{\partial r} + \hat{\boldsymbol{\phi}}\frac{1}{r}\frac{\partial}{\partial\phi} + \hat{\boldsymbol{\theta}}\frac{1}{r\sin\phi}\frac{\partial}{\partial\theta}.
\]
To find ∇ · u in curvilinear coordinates we first need to express the vector field u in the local
coordinate system. I.e.
u = u1 q̂1 + u2 q̂2 + u3 q̂3 .
The difficulty here is that both ui and q̂i depend on (q1 , q2 , q3 ). We come at the divergence in a
slightly roundabout way.
Then from §2.6.1 (null identities: ∇ · (∇ × A) = 0, ∇ × (∇f ) = 0 for any A, f )
\[
\nabla\times\left(\frac{\hat{\mathbf{q}}_\alpha}{h_\alpha}\right) = \mathbf{0}, \qquad \nabla\cdot\left(\frac{\hat{\mathbf{q}}_1}{h_2 h_3}\right) = 0.
\]
Results hold for the 2 cyclic permutations (1 → 2, 2 → 3, 3 → 1):
\[
\nabla\cdot\left(\frac{\hat{\mathbf{q}}_2}{h_1 h_3}\right) = \nabla\cdot\left(\frac{\hat{\mathbf{q}}_3}{h_1 h_2}\right) = 0.
\]
So now
\[
\begin{aligned}
\nabla\cdot\mathbf{u} &= \nabla\cdot\left(\frac{\hat{\mathbf{q}}_1}{h_2 h_3}\,(u_1 h_2 h_3)\right) + \text{2 cyclic perms} \\
&= \frac{\hat{\mathbf{q}}_1}{h_2 h_3}\cdot\nabla(u_1 h_2 h_3) + (u_1 h_2 h_3)\,\nabla\cdot\left(\frac{\hat{\mathbf{q}}_1}{h_2 h_3}\right) + \text{2 cyclic perms} \\
&= \frac{\hat{\mathbf{q}}_1}{h_2 h_3}\cdot\left(\sum_{\alpha=1}^{3}\frac{\hat{\mathbf{q}}_\alpha}{h_\alpha}\frac{\partial(u_1 h_2 h_3)}{\partial q_\alpha}\right) + \text{2 cyclic perms} \\
&= \frac{1}{h_1 h_2 h_3}\left(\frac{\partial(u_1 h_2 h_3)}{\partial q_1} + \frac{\partial(u_2 h_1 h_3)}{\partial q_2} + \frac{\partial(u_3 h_1 h_2)}{\partial q_3}\right)
\end{aligned}
\]
using the fact that q̂α · q̂β = δαβ.
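An optional sympy sanity check of this formula in spherical polars, (h1, h2, h3) = (1, r, r sin φ), for a purely radial field u = g(r)r̂; g(r) = r² is an illustrative choice, for which the formula gives ∇ · u = (1/r²) d(r²g)/dr = 4r.

import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
u = [r * x, r * y, r * z]          # g(r) r_hat = r^2 (x,y,z)/r in Cartesians
div_cart = sum(sp.diff(u[i], s) for i, s in enumerate((x, y, z)))
assert sp.simplify(div_cart - 4 * r) == 0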
Now we see
\[
\nabla\times\mathbf{u} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\hat{\mathbf{q}}_1 & h_2\hat{\mathbf{q}}_2 & h_3\hat{\mathbf{q}}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ h_1 u_1 & h_2 u_2 & h_3 u_3 \end{vmatrix}.
\]
2.9 Examples
3 Integration theorems of vector calculus
Having done differential vector calculus, we turn to integral vector calculus. These are equally
important in applications as you will see in APDE2, Fluid Dynamics and beyond. We shall derive
three (quite stunning) main integral identities all of which may be considered as higher-dimensional
generalisations of the Fundamental Theorem of Calculus:
\[
\int_a^b f'(x)\,dx = f(b) - f(a).
\]
The LHS is a one-dimensional integral (i.e. an integral over a line) which is equated to zero-dimensional (i.e. pointwise) evaluations on the boundary of the interval of integration (here at x = a, b).
Remark: The formula for integration by parts is found by letting f (x) = u(x)v(x) in the above!
An ordinary 1D integral can be regarded as integration along a straight line. For example, if F (x) is the force on a particle allowed to move along the x-axis,
\[
\int_{x_1}^{x_2} F(x)\,dx
\]
is the "work done" moving it from x1 to x2. We want to integrate along general paths in R2 or R3.
Defn: A path is a bijective (i.e. one-to-one) map p : [t1 , t2 ] → R3 s.t. t 7→ p(t). It connects the
point p(t1 ) to p(t2 ) along a curve C, say. We say the curve C is parametrised by the path.
and ds = |dr| denotes the elemental arclength. Since r = p(t) on C, dr = p′(t) dt and so
\[
\int_C f(\mathbf{r})\,ds = \int_{t_1}^{t_2} f(\mathbf{p}(t))\,|\mathbf{p}'(t)|\,dt.
\]
E.g. 3.1: Let p(t) = (t, t, t) for t ∈ [0, 1], which connects the points (0, 0, 0) to (1, 1, 1) by a straight line of length √3. If f = xyz then
\[
\int_C f\,ds = \int_0^1 t^3\sqrt{1+1+1}\,dt = \frac{\sqrt{3}}{4}.
\]
E.g. 3.2: Let p(t) = (t², t², t²) for t ∈ [0, 1], which parametrises the same curve as in E.g. 3.1. With the same f we have
\[
\int_C f\,ds = \int_0^1 t^6\sqrt{(2t)^2+(2t)^2+(2t)^2}\,dt = \frac{\sqrt{3}}{4}.
\]
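An optional sympy check that E.g. 3.1 and E.g. 3.2 give the same value:

import sympy as sp

t = sp.symbols('t', nonnegative=True)

def line_integral(p):              # integral of f = xyz ds along the path p(t)
    f = p[0] * p[1] * p[2]
    speed = sp.sqrt(sum(c.diff(t)**2 for c in p))
    return sp.integrate(f * speed, (t, 0, 1))

print(line_integral(sp.Matrix([t, t, t])))            # sqrt(3)/4
print(line_integral(sp.Matrix([t**2, t**2, t**2])))   # sqrt(3)/4 again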
Note: Parametrisation is not unique. This suggests the value of the line integral is independent of parametrisation.
Proof: Consider the bijective map t = g(u) for t1 < t < t2 such that t1 = g(u1), t2 = g(u2). Then
\[
\int_{t_1}^{t_2} f(\mathbf{p}(t))\,|\mathbf{p}'(t)|\,dt = \int_{u_1}^{u_2} f(\mathbf{p}(g(u)))\,|\mathbf{p}'(g(u))|\,g'(u)\,du = \int_{u_1}^{u_2} f(\mathbf{q}(u))\,|\mathbf{q}'(u)|\,du
\]
after letting q(u) = p(g(u)) and noting q′(u) = g′(u) p′(g(u)) by the chain rule.
Defn: Let F(r) : R3 → R3 be a vector field, and let p(t) be a path on the interval [t1 , t2 ]. The
line integral of F along p is defined by
\[
\int_C \mathbf{F}\cdot d\mathbf{r} = \int_{t_1}^{t_2} \mathbf{F}(\mathbf{p}(t))\cdot\mathbf{p}'(t)\,dt
\]
as above.
Note: As above, the value of the line integral is not dependent on parametrisation of C but is
negated by a reversal of C.
E.g. 3.3: Integrate F = sin φẑ (φ is polar angle in spherical polars) along a meridian of a sphere
of radius R from the south to the north pole.
A: From the description of the path, C, convenient to use spherical coordinates (r, φ, θ). I.e.
p(φ) = Rr̂ = R(sin φ cos θ, sin φ sin θ, cos φ);
(see earlier defn of r̂, θ̂, φ̂ in spherical polars (18)) then
\[
\frac{d\mathbf{p}}{d\phi} = R(\cos\phi\cos\theta,\ \cos\phi\sin\theta,\ -\sin\phi) \equiv R\hat{\boldsymbol{\phi}}.
\]
Thus we can see that ẑ · φ̂ = − sin φ and so
\[
\int_C \mathbf{F}\cdot d\mathbf{r} = \int_\pi^0 \mathbf{F}\cdot\mathbf{p}'(\phi)\,d\phi = \int_\pi^0 R\sin\phi\,(\hat{\mathbf{z}}\cdot\hat{\boldsymbol{\phi}})\,d\phi = -R\int_\pi^0 \sin^2\phi\,d\phi = \frac{R\pi}{2}.
\]
Proposition: Let f (r) be a scalar field and let C be a curve in R3 parametrised by the path p(t), t1 ≤ t ≤ t2. Then
\[
\int_C \nabla f\cdot d\mathbf{r} = f(\mathbf{p}(t_2)) - f(\mathbf{p}(t_1)).
\]
This is the fundamental theorem of Calculus for line integrals.
Proof: We have
\[
\int_C \nabla f\cdot d\mathbf{r} = \int_{t_1}^{t_2} \nabla f(\mathbf{p}(t))\cdot\mathbf{p}'(t)\,dt = \int_{t_1}^{t_2} \frac{d}{dt} f(\mathbf{p}(t))\,dt = f(\mathbf{p}(t_2)) - f(\mathbf{p}(t_1))
\]
by the chain rule and the Fundamental Theorem of Calculus.
Note: If C is closed, the line integral over a gradient field vanishes. As a result, line integrals of
gradient fields are independent of the path C.
Remark: The line integral of a vector field is often called the work integral, since if F represents
a force, the integral represents the work done moving a particle between two points. If F = ∇f
for some scalar field f (often called the potential) then the work done moving the particle is
independent of the path taken. Moreover the work done moving a particle which returns to the
same position is zero. Such a force is called conservative.
Defn: A path p(t), for t ∈ [t1, t2] is closed if p(t1) = p(t2). A closed path is simple if it does not intersect with itself apart from at the end points t1, t2.
Defn: Let D ⊂ R2 , let ∂D represent the boundary of D (it should be a simple closed path) and
let D̄ be D ∪ ∂D.
Now define a map s : D̄ → R3 s.t. (u, v) 7→ s(u, v) and ∂s/∂u, ∂s/∂v are linearly independent on D. A surface S ⊂ R3 is given in parametrised form by S = {s(u, v) | (u, v) ∈ D}.
E.g. 3.4: The hemisphere of radius R in z ≥ 0 can be parametrised by s(u, v) = (u, v, √(R² − u² − v²)) with D = {(u, v) | u² + v² < R²}.
Note: this is not the only way to parametrise a hemisphere; could (and will) use spherical polars.
where dS = n̂dS and n̂ is a unit vector pointing out from S (a surface element is defined by its
size dS and a direction, n̂, being the normal to the surface).
Note: ∫S dS is the physical area of the surface S in the same way that ∫C ds is the physical length of the curve C.
Now the two vectors s(u + du, v) and s(u, v) both lie on S and, assuming du is vanishingly small, Taylor expanding gives
\[
\mathbf{s}(u+du,v) - \mathbf{s}(u,v) = \mathbf{s}(u,v) + du\,\frac{\partial\mathbf{s}}{\partial u}(u,v) - \mathbf{s}(u,v) + \text{h.o.t. order }(du)^2.
\]
Do the same thing with v. It follows that the two vectors
\[
\frac{\partial\mathbf{s}}{\partial u}\,du, \qquad \frac{\partial\mathbf{s}}{\partial v}\,dv
\]
lie in the tangent plane to the surface S, and the vector surface element dS of the elemental rhomboid formed by these two vectors, directed normal to the surface, is
\[
d\mathbf{S} = \frac{\partial\mathbf{s}}{\partial u}\,du \times \frac{\partial\mathbf{s}}{\partial v}\,dv \equiv \mathbf{N}(u,v)\,du\,dv, \qquad \mathbf{N}(u,v) = \frac{\partial\mathbf{s}}{\partial u}\times\frac{\partial\mathbf{s}}{\partial v}
\]
and so
\[
\int_S f(\mathbf{r})\,dS = \int_D f(\mathbf{s}(u,v))\,|\mathbf{N}|\,du\,dv.
\]
Note: If S lies in the 2D plane, then s(u, v) = (x(u, v), y(u, v), 0) is a mapping from 2D to 2D
and so
\[
|\mathbf{N}| = \left|\frac{\partial(x,y)}{\partial(u,v)}\right|
\]
is the Jacobian of the map (easy to confirm). We know this from Calculus 1 (e.g. dxdy 7→ rdrdθ).
Similarly, for a vector field F,
\[
\int_S \mathbf{F}\cdot d\mathbf{S} = \int_D \mathbf{F}(\mathbf{s}(u,v))\cdot\mathbf{N}(u,v)\,du\,dv,
\]
as above.
Important remark: By analogy with line integrals, can show that the surface integral of a vec-
tor field is independent of parametrisation up to a sign. The sign depends on the orientation of
the parametrisation, which is determined by the direction of the unit normal n̂ = N/|N|. Thus,
the direction of n̂, (or N) must be specified in order to fix the sign of the integral unambiguously.
E.g. 3.5: Calculate ∫S B · dS where B(r) = rẑ + r̂ (expressed in cylindrical polars), in which S = {(x, y, 0) | x² + y² ≤ 1} is directed in the positive z-direction.
A: Parametrise S by s(r, θ) = (r cos θ, r sin θ, 0) with 0 ≤ r ≤ 1, 0 ≤ θ < 2π. Then
\[
\mathbf{N} = \frac{\partial\mathbf{s}}{\partial r}\times\frac{\partial\mathbf{s}}{\partial\theta} = r\hat{\mathbf{z}},
\]
(factor r is the Jacobian determinant in this 2D mapping) and points in direction ẑ normal to the
2D plane as required by the question. Then
\[
\int_S \mathbf{B}\cdot d\mathbf{S} = \int_0^{2\pi}\!\!\int_0^1 (r\hat{\mathbf{z}} + \hat{\mathbf{r}})\cdot\hat{\mathbf{z}}\,r\,dr\,d\theta = \int_0^{2\pi}\!\!\int_0^1 r^2\,dr\,d\theta = \frac{2\pi}{3}.
\]
Consider that the vector field v is expressed as the curl of another vector field, i.e. v = ∇ × A.
This frequently happens in applications.
Defn: The boundary of the surface S is denoted ∂S and, since it is mapped from the boundary
∂D, it inherits its properties, being a simple closed path. If c(t) ∈ R2 is the simple closed path
along ∂D then
p(t) = s(c(t))
is the simple closed path along ∂S.
Note: A closed path has no start and finish point and can be oriented in either the anti-clockwise
or clockwise directions.
Proposition: (Stokes' theorem) Let S be a surface in R3 with boundary ∂S; let A be a vector field. Then
\[
\int_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \int_{\partial S} \mathbf{A}\cdot d\mathbf{r}, \tag{21}
\]
where dS and ∂S must be consistently oriented according to the right-hand THUMB rule.
Defn: (Right-hand thumb rule) Point the thumb of your right hand along curve ∂S (either
clockwise or anti-clockwise). Wrap your fingers around the curve; your fingers will indicate the
direction of the normal N (or n̂) of the surface that must be chosen in accordance with your choice
of direction around the curve.
E.g. 3.6: Compute ∫S ∇ × f · dS where f(r) = ω × r and ω = (ω1, ω2, ω3) is a constant vector. S is the hemisphere in z > 0 of radius R with the normal to the surface defined to point inwards towards the origin.
(i) Cartesian coordinates. Use the parametrisation defined in E.g. 3.4, i.e. the map
s(u, v) = (u, v, √(R² − u² − v²))
maps the domain D = {(u, v) | u2 + v 2 < R2 } onto S.
Now
\[
\mathbf{N}(u,v) = \frac{\partial\mathbf{s}}{\partial u}\times\frac{\partial\mathbf{s}}{\partial v} = \left(1,\ 0,\ -\frac{u}{w}\right)\times\left(0,\ 1,\ -\frac{v}{w}\right) = \left(\frac{u}{w},\ \frac{v}{w},\ 1\right),
\]
abbreviating w = √(R² − u² − v²), which is the z-coordinate on the sphere. We can see N points in the same direction as s, which is the direction of the outward normal r̂ = (u, v, w)/R. This is
not the direction we specified so we must adjust the sign of N by inserting a minus sign in
\[
\int_S \nabla\times\mathbf{f}\cdot d\mathbf{S} = -\int_D \nabla\times\mathbf{f}\cdot\mathbf{N}(u,v)\,du\,dv = -\iint_{u^2+v^2<R^2}\left(\frac{2\omega_1 u}{w} + \frac{2\omega_2 v}{w} + 2\omega_3\right)du\,dv = -\iint_{u^2+v^2<R^2} 2\omega_3\,du\,dv = -2\pi\omega_3 R^2,
\]
using ∇ × f = ∇ × (ω × r) = 2ω (check)
(using the oddness of the functions w.r.t. u and v in the first two terms).
but
\[
\int_0^{2\pi} \boldsymbol{\omega}\cdot\hat{\mathbf{r}}\,d\theta = \int_0^{2\pi} (\omega_1\sin\phi\cos\theta + \omega_2\sin\phi\sin\theta + \omega_3\cos\phi)\,d\theta = 2\pi\omega_3\cos\phi.
\]
Finally, by Stokes' theorem we can instead compute ∫∂S f · dr, where ∂S is the edge of the hemisphere, radius R, where it meets z = 0. This is the circle of radius R in the (x, y)-plane. By the RH thumb rule, the integral needs to be directed in the clockwise direction (looking from above).
We define the circle by the path p(θ) = Rr̂ = (R cos θ, R sin θ, 0), 0 ≤ θ < 2π. Now
\[
\int_{\partial S} \mathbf{f}\cdot d\mathbf{r} = \int_{2\pi}^{0} \mathbf{f}(\mathbf{p}(\theta))\cdot\mathbf{p}'(\theta)\,d\theta,
\]
and we have
\[
\mathbf{p}'(\theta) = R(-\sin\theta,\ \cos\theta,\ 0) = R\hat{\boldsymbol{\theta}}
\]
(in cylindrical polars), and on ∂S
\[
\int_{\partial S} \mathbf{f}\cdot d\mathbf{r} = \int_{2\pi}^{0} (\boldsymbol{\omega}\times(R\hat{\mathbf{r}}))\cdot R\hat{\boldsymbol{\theta}}\,d\theta = R^2\int_{2\pi}^{0} (\hat{\mathbf{r}}\times\hat{\boldsymbol{\theta}})\cdot\boldsymbol{\omega}\,d\theta = R^2\int_{2\pi}^{0}\omega_3\,d\theta = -2\pi\omega_3 R^2
\]
(since r̂ × θ̂ = ẑ on z = 0),
the same as before (and confirms Stokes’ theorem, for this example).
We first prove for a rectangle in (u, v)-space. The loose argument then proceeds that rectangles can be assembled as a checkerboard into larger domains, given that the limit of rectangle size can be taken to go to zero, and since contributions from adjacent sides cancel, leaving just the circuit around the total domain. See Fig. 3.
Let D = {(u, v) | 0 < u < a, 0 < v < b}. The surface S is defined by the map s(u, v) : D → R3
The boundary ∂D = C1 ∪ C2 ∪ C3 ∪ C4 is mapped by s onto ∂S = ∂S1 ∪ ∂S2 ∪ ∂S3 ∪ ∂S4 .
For e.g. C2 is the path p(t) = s(a, t), 0 < t < b and
\[
\int_{\partial S_2} \mathbf{A}\cdot d\mathbf{r} = \int_0^b \mathbf{A}(\mathbf{p}(t))\cdot\mathbf{p}'(t)\,dt = \int_0^b \mathbf{A}(\mathbf{s}(a,v))\cdot\frac{\partial\mathbf{s}}{\partial v}(a,v)\,dv.
\]
Similarly,
\[
\int_{\partial S_4} \mathbf{A}\cdot d\mathbf{r} = -\int_0^b \mathbf{A}(\mathbf{s}(0,v))\cdot\frac{\partial\mathbf{s}}{\partial v}(0,v)\,dv
\]
(the minus sign accounts for reversing the orientation of the segment C4, viz: ∫_b^0 = −∫_0^b).
Figure 3: A number of rectangles (left) can be put together to cover the domain (right).
Stokes' theorem can be applied in 2D. Let S be a surface on z = 0 and A = (A1(x, y), A2(x, y), 0) a vector field with no ẑ component and no dependence on z. Now dS = ẑ dS = ẑ dxdy and
\[
\nabla\times\mathbf{A} = \left(\frac{\partial A_2}{\partial x} - \frac{\partial A_1}{\partial y}\right)\hat{\mathbf{z}}.
\]
Thus Stokes' theorem reduces to
\[
\int_S \left(\frac{\partial A_2}{\partial x} - \frac{\partial A_1}{\partial y}\right) dx\,dy = \int_{\partial S} \mathbf{A}\cdot d\mathbf{r} \equiv \int_{\partial S} (A_1\,dx + A_2\,dy),
\]
which is Green's theorem in the plane.
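An optional sympy check of this 2D version on the unit square; the field A = (−y², x², 0) is an illustrative choice.

import sympy as sp

x, y, t = sp.symbols('x y t')
A1, A2 = -y**2, x**2
area = sp.integrate(sp.diff(A2, x) - sp.diff(A1, y), (x, 0, 1), (y, 0, 1))

# Boundary, traversed anti-clockwise: bottom, right, top, left.
line = (sp.integrate(A1.subs({x: t, y: 0}), (t, 0, 1))      # +dx along the bottom
        + sp.integrate(A2.subs({x: 1, y: t}), (t, 0, 1))    # +dy up the right side
        + sp.integrate(-A1.subs({x: t, y: 1}), (t, 0, 1))   # -dx along the top
        + sp.integrate(-A2.subs({x: 0, y: t}), (t, 0, 1)))  # -dy down the left side
assert sp.simplify(area - line) == 0                        # both equal 2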
Defn: Let f : R3 → R s.t. r 7→ f (r) be a scalar field. The volume integral of f is given by
\[
\int_V f(\mathbf{r})\,dV \equiv \iiint f(x,y,z)\,dx\,dy\,dz
\]
in Cartesians.
Note: Unlike curves and surfaces, volumes in R3 do not have directions.
Note: if f = 1, then ∫V 1 dV gives the physical volume of V .
Proof: The elemental volume dV = dxdydz is (ẑdz) · ((x̂dx) × (ŷdy)). Under the mapping, the
mapped elemental volume dVq is defined by a parallelepiped with sides given by
\[
\frac{\partial\mathbf{r}}{\partial q_1}\,dq_1, \qquad \frac{\partial\mathbf{r}}{\partial q_2}\,dq_2, \qquad \text{and} \qquad \frac{\partial\mathbf{r}}{\partial q_3}\,dq_3
\]
(just as we did for surfaces). The volume of dVq is therefore
\[
|(\hat{\mathbf{q}}_3 h_3\,dq_3)\cdot((\hat{\mathbf{q}}_1 h_1\,dq_1)\times(\hat{\mathbf{q}}_2 h_2\,dq_2))| = |J_{\mathbf{r}}|\,dq_1\,dq_2\,dq_3,
\]
since q̂α = (1/hα) ∂r/∂qα and using a result in §2.8.1. If the q̂j are orthonormal, then |Jr| = h1h2h3.
Defn: A simply connected domain, V , say, is one in which all paths within V can be continu-
ously transformed into all other paths within V without ever leaving V .
If V is finite and simply connected then ∂V forms a closed surface (a surface with no boundaries).
Proposition: For a vector field F : R3 → R3, let V ⊂ R3 be simply connected with boundary ∂V (a closed surface). Then
\[
\int_V \nabla\cdot\mathbf{F}\,dV = \int_{\partial V} \mathbf{F}\cdot d\mathbf{S}
\]
where dS = n̂dS is a surface element and n̂ points outwards from the volume V .
The argument will be again that an arbitrary V can be divided into many small rectangular vol-
umes over each of which the divergence applies.
(there are 6 sides, and unit outward normal is one of ±x̂, ±ŷ, ±ẑ depending on the cuboid side).
Next we consider the volume integral,
\[
\int_V \nabla\cdot\mathbf{F}\,dV = \int_0^a\!\!\int_0^b\!\!\int_0^c \left(\frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}\right) dz\,dy\,dx.
\]
The 3 terms are considered separately but in the same manner. For example, consider the contri-
bution from ∂F3 /∂z. From the Fundamental Theorem of Calculus,
\[
\int_0^a\!\!\int_0^b\!\!\int_0^c \frac{\partial F_3}{\partial z}\,dz\,dy\,dx = \int_0^a\!\!\int_0^b \left(F_3(x,y,c) - F_3(x,y,0)\right) dy\,dx.
\]
The result is
\[
\int_V \nabla\cdot\mathbf{F}\,dV = \int_0^a\!\!\int_0^b \left(F_3(x,y,c) - F_3(x,y,0)\right) dy\,dx + \int_0^a\!\!\int_0^c \left(F_2(x,b,z) - F_2(x,0,z)\right) dz\,dx + \int_0^b\!\!\int_0^c \left(F_1(a,y,z) - F_1(0,y,z)\right) dz\,dy,
\]
and each bracket is recognised as the flux of F through a pair of opposite faces of the cuboid, i.e. the contribution to ∫∂V F · dS.
Figure 4: Gauss’ theorem for the cuboid V . The top and bottom faces of the boundary, S1 and
S2 , are indicated.
E.g. 3.7: Compute ∫V ∇ · v dV where V is a sphere of radius a about the origin, and
v(r) = r + f (r) ẑ × r
(and ẑ is the unit vector along the z-axis, and r = (x² + y² + z²)^{1/2}).
A (i): We have that
∇ · v = ∇ · r + ∇f · (ẑ × r) + f ∇ · (ẑ × r) = 3 + ∇f · (ẑ × r) + f ∇ · (ẑ × r)
after using PS2, Q6(a). But ∇f = (f ′(r)/r) r and r · (ẑ × r) = 0. Also, it can be shown that ∇ · (ẑ × r) = 0, so that
∇ · v = 3.
As the divergence of v is a constant, its integral over V is just its value times the volume of V ,
\[
\int_V \nabla\cdot\mathbf{v}\,dV = 3\cdot\frac{4\pi}{3}a^3 = 4\pi a^3.
\]
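An optional sympy check of part (i) in Cartesians; f (r) = r is an illustrative choice (any differentiable f gives ∇ · v = 3).

import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
v = sp.Matrix([x, y, z]) + r * sp.Matrix([-y, x, 0])   # r_vec + f(r) z_hat x r_vec
div = sum(v[i].diff(s) for i, s in enumerate((x, y, z)))
assert sp.simplify(div) == 3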
A (ii): Using the divergence theorem, the answer is ∫∂V v(r) · dS where ∂V is the sphere of radius a.
If F = ∇f (i.e. the vector field can be described by a scalar potential) then the divergence theorem reads
\[
\int_V \triangle f\,dV = \int_{\partial V} \hat{\mathbf{n}}\cdot\nabla f\,dS.
\]
Appendix: revision of the chain rule
Taylor series: Recall for a scalar function f (x),
\[
f(x_0 + h) = f(x_0) + h f'(x_0) + \frac{h^2}{2!} f''(x_0) + \ldots
\]
or, equivalently, (x = x0 + h)
\[
f(x) = f(x_0) + (x - x_0) f'(x_0) + \frac{(x-x_0)^2}{2!} f''(x_0) + \ldots
\]
E.g. A.1: If f (x) = cos x, with x0 = 0 we get
\[
\cos h = 1 - \frac{h^2}{2!} + \ldots
\]
Proposition: Start with the chain rule for a scalar function of a single variable (sometimes referred to as differentiation of a function of a function). Consider a function f (x) such that x = x(t). Then F (t) = f (x(t)) and
\[
\frac{dF}{dt} = x'(t)\,f'(x(t)).
\]
E.g. A.2: If f (x) = cos x and x(t) = t² then x′(t) = 2t, f ′(x) = − sin x and so we get
\[
\frac{d}{dt}\cos(t^2) = -2t\sin(t^2).
\]
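An optional sympy check (note the minus sign):

import sympy as sp

t = sp.symbols('t')
assert sp.simplify(sp.diff(sp.cos(t**2), t) + 2 * t * sp.sin(t**2)) == 0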
Proof: Standard limit used to define a derivative, along with Taylor series expansions (the notation O(h²) means collect terms as small as or smaller than h²).
Now consider a scalar function of more than one variable, f (x, y), say, and let x = x(t) and
y = y(t).
That is, we can write F (t) = f (x(t), y(t)), say. From the chain rule, it follows that
\[
\frac{dF}{dt} = \frac{dx}{dt}\frac{\partial f}{\partial x}(x(t), y(t)) + \frac{dy}{dt}\frac{\partial f}{\partial y}(x(t), y(t)).
\]
Proof: Similar to before and requires the extension to multiple variables of Taylor’s expansion:
f (x0 + h, y0 + k) = f (x0 , y0 ) + hfx (x0 , y0) + kfy (x0 , y0 ) + higher order terms
E.g. A.3: If f (x, y) = xy and x(t) = t² and y(t) = e−t then F (t) = t²e−t. Then we can see by a direct calculation that
F′(t) = 2te−t − t²e−t.
Using the chain rule, we get x′(t) = 2t, y′(t) = −e−t, fx = y, fy = x and so F′(t) = x′(t)fx + y′(t)fy = 2te−t − t²e−t, as before.
Put another way, F (t) = f (x) where x = x(t), and the summation above can be interpreted either as row times column vectors or as a dot product:
\[
F'(t) = f'(\mathbf{x}(t))\,\mathbf{x}'(t) \equiv \nabla f(\mathbf{x}(t))\cdot\mathbf{x}'(t).
\]
Note: The 2 × 2 matrix is the Jacobian of the map (u, v) 7→ (x(u, v), y(u, v)), the 1 × 2 matrix (Fu, Fv) is the Jacobian of the map (u, v) 7→ F (u, v), and the 1 × 2 matrix (fx, fy) is the Jacobian of the map (x, y) 7→ f (x, y).
Note: Again, we can extend this to a more general case in which f = f (x1, . . . , xm) and xi = xi(u1, . . . , up). Then we can write
or
F (u) = f (x(u))
and application of the chain rule gives
\[
\frac{\partial F}{\partial u_j} = \sum_{k=1}^{m} \frac{\partial x_k}{\partial u_j}\frac{\partial f}{\partial x_k}.
\]
The final generalisation of this is to consider a vector function where f : Rm → Rn s.t. x 7→ f(x)
and another map x = x(u) where x : Rp → Rm . Then if we define F(u) = f(x(u)) and apply the
chain rule as above to each scalar component, Fi , i = 1, . . . , n of F we get
\[
\frac{\partial F_i}{\partial u_j} = \sum_{k=1}^{m} \frac{\partial x_k}{\partial u_j}\frac{\partial f_i}{\partial x_k}
\]
or
\[
\begin{pmatrix} \partial F_1/\partial u_1 & \cdots & \partial F_1/\partial u_p \\ \vdots & \ddots & \vdots \\ \partial F_n/\partial u_1 & \cdots & \partial F_n/\partial u_p \end{pmatrix} = \begin{pmatrix} \partial f_1/\partial x_1 & \cdots & \partial f_1/\partial x_m \\ \vdots & \ddots & \vdots \\ \partial f_n/\partial x_1 & \cdots & \partial f_n/\partial x_m \end{pmatrix} \begin{pmatrix} \partial x_1/\partial u_1 & \cdots & \partial x_1/\partial u_p \\ \vdots & \ddots & \vdots \\ \partial x_m/\partial u_1 & \cdots & \partial x_m/\partial u_p \end{pmatrix}.
\]
At this high level of generality we still have the same underlying structure from the first result. Thus the equation above reads F′(u) = f′(x(u)) x′(u), which is the result quoted in the notes for the chain rule applied to the composition of maps, although the notation is shifted here from there.