Calc 3
1 Three-Dimensional Space
1.1 Rectangular Coordinates in R³
1.2 Dot Product
1.3 Cross Product
1.4 Lines and Planes
1.5 Parametric Curves
2 Partial Differentiations
2.1 Functions of Several Variables
2.2 Partial Derivatives
2.3 Chain Rule
2.4 Directional Derivatives
2.5 Tangent Planes
2.6 Local Extrema
2.7 Lagrange’s Multiplier
2.8 Optimizations
3 Multiple Integrations
3.1 Double Integrals in Rectangular Coordinates
3.2 Fubini’s Theorem for General Regions
3.3 Double Integrals in Polar Coordinates
3.4 Triple Integrals in Rectangular Coordinates
3.5 Triple Integrals in Cylindrical Coordinates
3.6 Triple Integrals in Spherical Coordinates
4 Vector Calculus
4.1 Vector Fields on R² and R³
4.2 Line Integrals of Vector Fields
4.3 Conservative Vector Fields
4.4 Green’s Theorem
4.5 Parametric Surfaces
4.6 Stokes’ Theorem
4.7 Divergence Theorem
4.8 Heat Diffusion (Optional)
[Figure: the point (a, b, c) in rectangular coordinates, together with its projection (a, b, 0) onto the xy-plane.]
Notation We will use the notation R³ to denote the entire three-dimensional space.
Any point on the x-axis has the form ( x, 0, 0), i.e. y = 0 and z = 0. Similarly, points on
the y-axis are of the form (0, y, 0), and points on the z-axis are of the form (0, 0, z). The three
coordinate axes meet at a point with coordinates (0, 0, 0) which is called the origin.
A vector in R3 is an arrow which is based at one point and is pointing at another point. If a
vector v is based at ( x0 , y0 , z0 ) and points toward ( x1 , y1 , z1 ), then the vector is written as:
v = ( x1 − x0 )i + (y1 − y0 )j + (z1 − z0 )k.
For example, the vector based at (3, 2, −1) pointing at (5, 2, 0) is expressed as
(5 − 3)i + (2 − 2)j + (0 − (−1))k = 2i + k.
8 Three-Dimensional Space
Remark. Consequently, any two vectors that point in the same direction and have the same
length are considered to be equal, even though they may have different base points. For
instance, a vector v based at (1, 2, 3) pointing at (4, 3, 2), i.e.
v = (4 − 1)i + (3 − 2)j + (2 − 3)k = 3i + j − k,
is considered to be equal to the vector w based at (0, 0, 0) pointing at (3, 1, −1). In other
words, we can write w = v.
An alternative notation for a vector is the angular bracket ⟨ a, b, c⟩. We will sometimes write
a vector this way to save the hassle of writing down i, j and k:
Notation ⟨ a, b, c⟩ = ai + bj + ck.
In this course, we make very little conceptual distinction between a point (x, y, z) and a
vector based at (0, 0, 0) pointing at the point (x, y, z). However, as a matter of notation, one
should use ⟨x, y, z⟩ or xi + yj + zk to denote a vector and (x, y, z) to denote a point so as to
avoid confusion.
Vector addition and scalar multiplication are defined as follows:
Definition 1.1 — Vector Additions and Scalar Multiplications. Let a = ⟨ a1 , a2 , a3 ⟩ and b =
⟨b1 , b2 , b3 ⟩ be two vectors in R3 , and c be a real scalar, then:
a + b = ⟨ a1 + b1 , a2 + b2 , a3 + b3 ⟩ (vector addition)
ca = ⟨ca1 , ca2 , ca3 ⟩ (scalar multiplication)
The negative of a vector is defined as: −a = (−1)a. The difference between vectors is
defined as a − b = a + (−b).
[Figure: diagrams of the vectors a and b together with their sum a + b and their difference a − b.]
Property Vector additions and scalar multiplications have the following algebraic properties
1. commutative rule: a + b = b + a
2. associative rule: (a + b) + c = a + (b + c)
3. distributive rules: (λ + µ)a = λa + µa and λ(a + b) = λa + λb
1.2 Dot Product
a · b = a1 b1 + a2 b2 + a3 b3 .
It is important to note that the dot product between a vector a = ⟨a1, a2, a3⟩ and itself is given
by:
\[ \mathbf{a} \cdot \mathbf{a} = a_1^2 + a_2^2 + a_3^2, \]
which is incidentally the square of the length of the vector a (by Pythagoras' Theorem in
R³).
Notation Let a = ⟨ a1 , a2 , a3 ⟩. We denote the length of a vector a by |a|, which is given by:
\[ |\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \]
The length of a vector is sometimes called the norm, or the magnitude, of the vector.
Property It can easily be verified that the dot product satisfies the following algebraic
properties:
1. a · b = b · a.
2. (a + b) · c = a · c + b · c.
3. (λa) · b = λ(a · b)
4. 0 · a = a · 0 = 0.
The following theorem gives the geometric meaning of the dot product:
Theorem 1.1 Let a = ⟨a1, a2, a3⟩ and b = ⟨b1, b2, b3⟩ be two vectors in R³, and θ be the angle
between these two vectors. Then we have:
\[ \mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\theta. \tag{1.1} \]
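Proof. The proof uses the Law of Cosines. Consider the triangle in the diagram below:
[Figure: triangle with sides a, b and a − b, where θ is the angle between a and b.]
The side opposite to the angle θ is represented by the vector a − b. Using the Law of Cosines:
\[ |\mathbf{a} - \mathbf{b}|^2 = |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2|\mathbf{a}|\,|\mathbf{b}| \cos\theta. \]
Expanding the left-hand side as (a − b) · (a − b) = |a|² − 2 a · b + |b|² and cancelling
|a|² + |b|² from both sides gives a · b = |a| |b| cos θ. ■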
One immediate consequence of Equation (1.1) is that it allows us to use the dot product to
find the angle between two vectors. Precisely, the angle θ between two vectors a and b is given
by:
\[ \theta = \cos^{-1}\left( \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} \right). \]
The most important case is when the two vectors a and b are perpendicular, also known
as orthogonal. The angle between the vectors is then π/2, and so we have the following
important fact:
Corollary 1.2 Two non-zero vectors a and b are orthogonal if and only if a · b = 0.
This corollary is particularly useful to determine whether two vectors are perpendicular.
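The angle formula and the orthogonality test can be tried out numerically. The following is a minimal Python sketch, not part of the original notes; the helper names `dot`, `norm`, `angle` and `is_orthogonal` and the tolerance `tol` are illustrative choices.

```python
import math

def dot(a, b):
    """Dot product a . b = a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length |a| = sqrt(a . a)."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle (in radians) between two non-zero vectors."""
    c = dot(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, c)))

def is_orthogonal(a, b, tol=1e-12):
    """True when a . b vanishes up to the (assumed) tolerance tol."""
    return abs(dot(a, b)) <= tol

a = (1.0, 2.0, 2.0)
b = (2.0, 1.0, -2.0)   # a . b = 2 + 2 - 4 = 0
print(norm(a), angle(a, b), is_orthogonal(a, b))
```

For floating-point input, comparing a · b with exactly 0 is fragile, which is why the sketch uses a small tolerance.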
■ Example 1.1 Show that any triangle which is inscribed in a circle and has one of its sides
coinciding with a diameter of the circle must be a right-angled triangle.
■ Solution Let O be the center of the circle. Define vectors a and b as in the diagram below.
[Figure: circle centered at O with the diameter running from −a to a.]
We would like to show the vectors in red and blue are orthogonal to each other. By basic
vector additions and subtractions:
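\[ \begin{aligned}
(-\mathbf{a} - \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) &= -\mathbf{a} \cdot \mathbf{a} + \mathbf{a} \cdot \mathbf{b} - \mathbf{b} \cdot \mathbf{a} + \mathbf{b} \cdot \mathbf{b} \\
&= -|\mathbf{a}|^2 + |\mathbf{b}|^2 \qquad (\text{the middle terms cancel; recall } \mathbf{v} \cdot \mathbf{v} = |\mathbf{v}|^2)
\end{aligned} \]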
Since both a and b represent the radii of the circle, they have the same magnitude. Therefore
|a| = |b| and we have (−a − b) · (a − b) = 0. This shows the red and blue vectors are
orthogonal.
1.3 Cross Product
From the right-hand grab rule, we can clearly see that a × b and b × a are vectors with the
same length but in opposite directions, i.e. a × b = −b × a. The magnitude |a × b|, which is
defined to be |a| |b| sin θ, is the area of the parallelogram formed by a and b:
[Figure: parallelogram spanned by a and b with angle θ between them and height |b| sin θ.]
The following are some useful algebraic properties of the cross products. Based on the
definition of cross products we presented above, the proofs are purely geometric and are
omitted here.
Property The cross product satisfies:
1. a × b = −b × a
2. (a + b) × c = a × c + b × c
3. a×0 = 0
4. a×a = 0
For simple vectors such as i, j and k, their cross products can be easily found from the
definition:
i × j = k, j × k = i, k × i = j.
For more complicated vectors, the cross product can be computed using the following
determinant formula:
Theorem 1.3 — Determinant Formula of Cross Product. Given two vectors a = a1 i + a2 j + a3 k
and b = b1 i + b2 j + b3 k, their cross product is given by:
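\[ \begin{aligned}
\mathbf{a} \times \mathbf{b} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \qquad (1.2) \\
&= (a_2 b_3 - a_3 b_2)\mathbf{i} - (a_1 b_3 - a_3 b_1)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}.
\end{aligned} \]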
Proof. The formula can be verified by expanding
a × b = (a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k)
using the algebraic properties of the cross product. It is left as an exercise for readers. ■
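The determinant formula translates directly into code. The sketch below is an illustration only (the function names `cross` and `dot` are not from the notes); it checks the formula against i × j = k, j × k = i, k × i = j and against the rule a × b = −b × a.

```python
def cross(a, b):
    """Cross product via the determinant formula (1.2)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            -(a1 * b3 - a3 * b1),
            a1 * b2 - a2 * b1)

def dot(a, b):
    """Dot product, used here to test orthogonality."""
    return sum(x * y for x, y in zip(a, b))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j), cross(j, k), cross(k, i))

a, b = (4, -2, 0), (7, -5, 1)
n = cross(a, b)
# n should be orthogonal to both a and b.
print(n, dot(n, a), dot(n, b))
```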
The cross product can be used to find a vector which is orthogonal to a plane.
■ Example 1.2 Let A(0, 2, −1), B(4, 0, −1) and C(7, −3, 0) be three points in R³.
Find a vector n which is orthogonal to the plane passing through A, B and C. Moreover, find
the area of the triangle △ABC.
■ Solution A vector n is orthogonal to the plane if and only if it is orthogonal to any two
(non-parallel) vectors on the plane. We will first find two vectors on the plane and then take
the cross product. The outcome will give a vector orthogonal to these two vectors (hence
orthogonal to the plane as well).
The following two vectors lie on the plane:
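\[ \begin{aligned}
\overrightarrow{AB} &= \langle 4, 0, -1 \rangle - \langle 0, 2, -1 \rangle = \langle 4, -2, 0 \rangle \\
\overrightarrow{AC} &= \langle 7, -3, 0 \rangle - \langle 0, 2, -1 \rangle = \langle 7, -5, 1 \rangle
\end{aligned} \]
Taking the cross product: $\overrightarrow{AB} \times \overrightarrow{AC} = \langle -2, -4, -6 \rangle$.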
Therefore, the required vector n can be taken to be any scalar multiple of ⟨−2, −4, −6⟩,
such as ⟨1, 2, 3⟩ or ⟨2, 4, 6⟩.
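The length of the cross product $\overrightarrow{AB} \times \overrightarrow{AC}$ is equal to the area of the parallelogram
formed by $\overrightarrow{AB}$ and $\overrightarrow{AC}$. The area of the triangle △ABC is 1/2 of the area of this
parallelogram. Therefore,
\[ \text{Area of } \triangle ABC = \frac{1}{2} \left| \overrightarrow{AB} \times \overrightarrow{AC} \right| = \frac{1}{2} \sqrt{(-2)^2 + (-4)^2 + (-6)^2} = \frac{1}{2}\sqrt{56} = \sqrt{14}. \]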
1.4 Lines and Planes
⟨ x, y, z⟩ − ⟨ x0 , y0 , z0 ⟩ = tv
⟨ x, y, z⟩ = ⟨ x0 , y0 , z0 ⟩ + t ⟨v1 , v2 , v3 ⟩
Therefore, we have:
x = x0 + tv1
y = y0 + tv2
z = z0 + tv3
which is called the parametric equation of the line L. It is called this way because the variable
t is called the parameter of the line.
Notation In this course, the vector r is “reserved” to denote the position vector ⟨ x, y, z⟩.
Using this notation, we can also write the parametric equation of the line L in vector form:
\[ \mathbf{r}(t) = \overrightarrow{OP_0} + t\mathbf{v}. \]
The t-variable in r(t) emphasizes the fact that the position vector r depends on t. It can be
omitted if it is clear that t is the parameter letter.
■ Example 1.3 Find a parametric equation of the line L passing through both A(3, −2, 0) and
B(1, 0, 1). Express your answer in both equation form and vector form.
■ Solution In order to write down the parametric equation of a straight line, we need two
“ingredients”:
1. a given point P0 on the line, and
2. the direction v of the line.
In this case, we can take P0 to be A(3, −2, 0). The direction of the line L is the vector $\overrightarrow{AB}$,
which is given by:
\[ \overrightarrow{AB} = \overrightarrow{OB} - \overrightarrow{OA} = \langle 1, 0, 1 \rangle - \langle 3, -2, 0 \rangle = \langle -2, 2, 1 \rangle. \]
With P0 (3, −2, 0) and v = ⟨−2, 2, 1⟩, the parametric equation of the line L is given by:
x = 3 − 2t
y = −2 + 2t
z = 0+t
or equivalently,
r(t) = ⟨3, −2, 0⟩ + t ⟨−2, 2, 1⟩ .
Remark. In the above example, one may also take P0 to be B(1, 0, 1) while keeping v to be
⟨−2, 2, 1⟩. Then the parametric equation of L is given by:
r(t) = ⟨1, 0, 1⟩ + t⟨−2, 2, 1⟩.
Although it gives a different r(t), this parametric equation represents the same straight
line L. Every straight line can be represented by many different parametric equations!
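As a quick check of Example 1.3, the sketch below evaluates r(t) = P0 + tv for both choices of base point. The function name `line_point` is an illustrative choice, not notation from the notes. Since B = A + v, the point at parameter t on the line based at B is the point at parameter t + 1 on the line based at A.

```python
def line_point(p0, v, t):
    """Point r(t) = p0 + t*v on the line through p0 with direction v."""
    return tuple(p + t * d for p, d in zip(p0, v))

A = (3, -2, 0)
B = (1, 0, 1)
v = tuple(b - a for a, b in zip(A, B))   # direction vector AB

print(v)
print(line_point(A, v, 0), line_point(A, v, 1))   # t = 0 gives A, t = 1 gives B
# Same geometric point reached from the two different base points:
print(line_point(B, v, 2), line_point(A, v, 3))
```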
In order to find n, one may use the cross product. Here is an example:
■ Example 1.4 Find an equation of the plane in R3 passing through the following three
points.
A(0, 2, −1), B(4, 0, −1), C (7, −3, 0).
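■ Solution The following two vectors lie on the plane:
\[ \begin{aligned}
\overrightarrow{AB} &= \langle 4, 0, -1 \rangle - \langle 0, 2, -1 \rangle = \langle 4, -2, 0 \rangle \\
\overrightarrow{AC} &= \langle 7, -3, 0 \rangle - \langle 0, 2, -1 \rangle = \langle 7, -5, 1 \rangle
\end{aligned} \]
Taking the cross product: $\overrightarrow{AB} \times \overrightarrow{AC} = \langle -2, -4, -6 \rangle$. Any non-zero vector parallel to this
cross product is a normal vector to the plane. For simplicity, we can take:
n = ⟨1, 2, 3⟩.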
Take A(0, 2, −1) to be the given point P0 , then the equation of the plane through A, B
and C is given by:
\[ \underbrace{1x + 2y + 3z = 1(0) + 2(2) + 3(-1)}_{(x_0, y_0, z_0) = (0, 2, -1) \text{ and } \mathbf{n} = \langle 1, 2, 3 \rangle} \]
After simplification: x + 2y + 3z = 1.
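The procedure of Example 1.4 (two vectors on the plane, cross product, then n · r = n · P0) can be sketched as follows. The function `plane_through` and its return convention (n, d), meaning n · ⟨x, y, z⟩ = d, are assumptions made for this illustration.

```python
def sub(p, q):
    """Vector from q to p."""
    return tuple(a - b for a, b in zip(p, q))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def plane_through(A, B, C):
    """Return (n, d) such that the plane is n . (x, y, z) = d."""
    n = cross(sub(B, A), sub(C, A))
    return n, dot(n, A)

A, B, C = (0, 2, -1), (4, 0, -1), (7, -3, 0)
n, d = plane_through(A, B, C)
print(n, d)   # dividing through by -2 should recover x + 2y + 3z = 1
```

The normal vector returned here is the raw cross product; any non-zero multiple of (n, d) describes the same plane.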
x = cos t
y = sin t.
The former is called the Cartesian equation and the second one is called the parametric
equation.
However, in three dimensions, a single Cartesian equation such as x² + y² + z² = 1
represents a surface instead. Therefore, we will only use parametric equations to represent
curves in three dimensions.
Definition 1.4 — Parametric Equation of a Curve. The parametric equation of a curve is of the
form:
x = f (t)
y = g(t)
z = h(t)
where f(t), g(t) and h(t) are differentiable functions of t. In vector notation, the parametric
equation of this curve is written as:
r(t) = f(t)i + g(t)j + h(t)k.
\[ \mathbf{r}_1(t) = (\cos t)\mathbf{i} + (\sin t)\mathbf{j} + \frac{t}{20}\mathbf{k}. \]
It is a curve that goes around the circle but the altitude is constantly increasing. See Figure
1.6a for the computer sketch. Here is another example of a parametric curve. See Figure 1.6b
for the sketch.
r2 (t) = (sin t)i + (cos t)j + (sin 2t)k.
[Figure 1.6: computer sketches of (a) the curve r1(t) and (b) the curve r2(t).]
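■ Example 1.5 Find the velocity, speed and acceleration of the particle whose path is:
\[ \mathbf{r}(t) = (\sin t)\mathbf{i} + (t^2 - \cos t)\mathbf{j} + e^t\mathbf{k}. \]
■ Solution
\[ \begin{aligned}
\text{velocity} = \mathbf{r}'(t) &= (\sin t)'\mathbf{i} + (t^2 - \cos t)'\mathbf{j} + (e^t)'\mathbf{k} \\
&= (\cos t)\mathbf{i} + (2t + \sin t)\mathbf{j} + e^t\mathbf{k} \\
\text{speed} = |\mathbf{r}'(t)| &= \sqrt{(\cos t)^2 + (2t + \sin t)^2 + (e^t)^2} \\
&= \sqrt{\cos^2 t + 4t^2 + 4t\sin t + \sin^2 t + e^{2t}} \\
&= \sqrt{1 + 4t^2 + 4t\sin t + e^{2t}} \\
\text{acceleration} = \mathbf{r}''(t) &= \frac{d}{dt}\mathbf{r}'(t) \\
&= (\cos t)'\mathbf{i} + (2t + \sin t)'\mathbf{j} + (e^t)'\mathbf{k} \\
&= (-\sin t)\mathbf{i} + (2 + \cos t)\mathbf{j} + e^t\mathbf{k}.
\end{aligned} \]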
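The formulas of Example 1.5 can be sanity-checked with finite differences. This is a rough numerical sketch, not part of the notes: `central_diff` and the step size `h` are illustrative, and a central difference only approximates the derivative.

```python
import math

def r(t):
    """Path of the particle in Example 1.5."""
    return (math.sin(t), t**2 - math.cos(t), math.exp(t))

def velocity(t):
    return (math.cos(t), 2 * t + math.sin(t), math.exp(t))

def acceleration(t):
    return (-math.sin(t), 2 + math.cos(t), math.exp(t))

def speed(t):
    return math.sqrt(1 + 4 * t**2 + 4 * t * math.sin(t) + math.exp(2 * t))

def central_diff(curve, t, h=1e-5):
    """Component-wise central difference approximation of curve'(t)."""
    fp, fm = curve(t + h), curve(t - h)
    return tuple((p - m) / (2 * h) for p, m in zip(fp, fm))

t = 0.7
print(central_diff(r, t), velocity(t))             # should nearly agree
print(central_diff(velocity, t), acceleration(t))  # should nearly agree
print(speed(t), math.sqrt(sum(c * c for c in velocity(t))))
```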
When L(t) is a non-zero constant vector (independent of t), we say that the angular momentum
is conserved. The conservation of angular momentum implies that the path of the particle is
contained in a plane. It can be explained as follows:
By the definition of cross product, the angular momentum L(t) is always orthogonal to r(t)
(and to r′ (t) too, but we do not need this). Therefore, at any time t, we have:
L(t) · r(t) = 0.
Let r(t) = x(t)i + y(t)j + z(t)k. If L(t) is a constant vector, it can be expressed as L =
Ai + Bj + Ck where A, B and C are fixed numbers. Then:
0 = L · r(t) = Ax(t) + By(t) + Cz(t).
Therefore, the point (x(t), y(t), z(t)) lies on the plane Ax + By + Cz = 0, which is a plane with
normal vector L passing through the origin. In other words, the path of the particle is confined
to this plane.
Property Given two curves u(t) and v(t), and a scalar function f(t), we have:
1. d/dt (f(t)u(t)) = f′(t)u(t) + f(t)u′(t)
2. d/dt (u(t) · v(t)) = u′(t) · v(t) + u(t) · v′(t)
3. d/dt (u(t) × v(t)) = u′(t) × v(t) + u(t) × v′(t).
Here is a good example on the use of one of the above product rules.
■ Example 1.6 Suppose r(t) represents a particle travelling at uniform speed C. Show that its
velocity r′(t) and acceleration r′′(t) are orthogonal at all times.
■ Solution The particle is travelling at uniform speed C. Therefore, |r′(t)| ≡ C. We want to
show that r′(t) · r′′(t) ≡ 0, so it is natural to differentiate |r′(t)| with respect to t so that the
RHS vanishes and the LHS may be related to r′′(t).
However, |r′(t)| has the form of a square root, so it is cumbersome to differentiate.
Instead, we differentiate |r′(t)|² = C² using the fact that |r′(t)|² = r′(t) · r′(t):
\[ \begin{aligned}
|\mathbf{r}'(t)|^2 &= C^2 \\
\mathbf{r}'(t) \cdot \mathbf{r}'(t) &= C^2 \\
\frac{d}{dt}\left( \mathbf{r}'(t) \cdot \mathbf{r}'(t) \right) &= 0 \\
\mathbf{r}''(t) \cdot \mathbf{r}'(t) + \mathbf{r}'(t) \cdot \mathbf{r}''(t) &= 0 \\
2\,\mathbf{r}'(t) \cdot \mathbf{r}''(t) &= 0.
\end{aligned} \]
from (0, 0, 0) to (2, 8/3, 2).
■ Solution It is simple to verify that the initial point corresponds to t = 0 since r (0) = ⟨0, 0, 0⟩,
If you plot them using Mathematica, you should find out that these two curves are the same,
although their speeds are different. The curve r2 (t) is obtained by replacing every t in r1 (t) by
2t. The initial and final times are adjusted so that the end-points of both r1 and r2 are (0, 0, 0)
and (0, 0, 2π ). We say r2 is a reparametrization of r1 .
If r(s) is a parametric curve such that |r′ (s)| = 1 for any s, we say the curve is parametrized
by arc-length. For such a parametrization, it is conventional to use s to denote the parameter.
Given a parametric curve r(t), in theory one can reparametrize the curve by arc-length, such
that with the new parameter s, the curve r(s) travels at unit speed. To find the arc-length
parametrization, you may follow the procedure:
1. Given a curve r(t) : [a, b] → R³, compute the following integral:
\[ s = \int_a^t |\mathbf{r}'(\tau)|\, d\tau. \]
2. Since the upper limit of the above integral is t, the function s should be a function of t.
Express t in terms of s whenever it is possible, so that t is a function of s, i.e. t = t(s).
3. Finally, replace all t’s by this function of s in the curve r(t).
The new parametrization r̃(s) will be arc-length parametrized. Let’s see some examples before
we learn why it works:
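Therefore,
\[ s(t) = \int_0^t |\mathbf{r}'(\tau)|\, d\tau = \int_0^t \sqrt{2}\, d\tau = \sqrt{2}\, t. \]
Expressing t in terms of s, we get t = s/√2. Replacing all t’s in r(t) by s/√2, we get an
arc-length parametrization:
\[ \tilde{\mathbf{r}}(s) = \cos\left(\frac{s}{\sqrt{2}}\right)\mathbf{i} + \sin\left(\frac{s}{\sqrt{2}}\right)\mathbf{j} + \frac{s}{\sqrt{2}}\mathbf{k}. \]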
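\[ |\mathbf{r}'(t)| = t + 1. \]
Consider:
\[ s = \int_0^t |\mathbf{r}'(\tau)|\, d\tau = \int_0^t (\tau + 1)\, d\tau = \frac{t^2}{2} + t. \]
To solve for t in terms of s, we use the quadratic formula (taking the non-negative root).
One should get:
\[ t = \frac{-2 + \sqrt{4 + 8s}}{2} = -1 + \sqrt{1 + 2s}. \]
Finally, replacing all t’s in r(t) by this function of s, we get an arc-length parametrization:
\[ \tilde{\mathbf{r}}(s) = \frac{1}{2}\left(-1 + \sqrt{1 + 2s}\right)^2 \mathbf{i} + \frac{2\sqrt{2}}{3}\left(-1 + \sqrt{1 + 2s}\right)^{3/2} \mathbf{j} + \left(-1 + \sqrt{1 + 2s}\right)\mathbf{k}. \]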
To see why this procedure gives an arc-length parametrization, we need to show |r̃′(s)| = 1.
Note that we relabel the parametrization r to r̃ just to avoid confusion. Rigorously, they are
related by r̃(s) = r(t(s)), regarding t as a function of s.
We first use the chain rule:
\[ \tilde{\mathbf{r}}'(s) = \frac{d\,\mathbf{r}(t(s))}{ds} = \frac{d\mathbf{r}}{dt}\frac{dt}{ds} = \mathbf{r}'(t)\frac{dt}{ds}. \]
Recall that s is defined to be $s = \int_a^t |\mathbf{r}'(\tau)|\, d\tau$. The Fundamental Theorem of Calculus tells us
that ds/dt = |r′(t)| and so,
\[ \frac{dt}{ds} = \frac{1}{\;\frac{ds}{dt}\;} = \frac{1}{|\mathbf{r}'(t)|}. \]
Therefore, we have:
\[ |\tilde{\mathbf{r}}'(s)| = |\mathbf{r}'(t)| \cdot \frac{1}{|\mathbf{r}'(t)|} = 1. \]
The parametrization r̃(s) has unit speed, and hence is an arc-length parametrization.
Remark. Although the above procedure of finding an arc-length parametrization works in the
two examples we have seen, in general it may be hard to find an arc-length parametrization,
since both steps (integration and solving for t in terms of s) can be difficult if the given
curve r(t) is not nice.
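When the two steps cannot be done by hand, they can be approximated numerically. The sketch below is one possible approach, not the notes' method: it integrates the speed with the composite Simpson's rule and inverts s(t) by bisection. It uses the speed |r′(t)| = t + 1 from the second example so that the result can be compared with the closed form t = −1 + √(1 + 2s). The names `arc_length` and `t_of_s`, the upper bound `t_max` and the tolerance are all assumptions.

```python
import math

def speed(t):
    """|r'(t)| = t + 1, the speed from the second example above."""
    return t + 1.0

def arc_length(t, n=200):
    """s(t) = integral of speed over [0, t], composite Simpson's rule (n even)."""
    h = t / n
    total = speed(0.0) + speed(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(i * h)
    return total * h / 3

def t_of_s(s, t_max=10.0, tol=1e-12):
    """Invert s(t) by bisection; valid because s(t) is increasing (speed > 0)."""
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if arc_length(mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

s = 4.0
print(t_of_s(s), -1 + math.sqrt(1 + 2 * s))   # numerical vs closed-form value of t
```

The arc-length parametrization is then r̃(s) = r(t_of_s(s)); bisection is slow but robust, and a Newton iteration would be a natural refinement.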
1.5.5 Curvature
Curvature is a quantity that measures the sharpness of a curve, and is closely related to the
acceleration. Imagine you are driving a car along a curved road. On a sharp turn, the force
exerted on your body is proportional to the acceleration according to Newton’s Second Law.
Therefore, given a parametric curve r(t), the magnitude of the acceleration |r′′(t)| somewhat
reflects the sharpness of the path: the sharper the turn, the larger the |r′′(t)|.
However, the magnitude |r′′(t)| is affected not only by the sharpness of the curve, but also
by how fast you drive. In order to give a fair and standardized measurement of sharpness, we
need to get an arc-length parametrization r(s) so that the “car” travels at unit speed.
Definition 1.5 — Curvature. Given a curve γ in R² or R³ which can be arc-length parametrized
by r(s), its curvature is a function of s defined as:
\[ \kappa(s) = |\mathbf{r}''(s)|. \]
■ Example 1.10 Find the curvature of the circle of radius R centered at the origin (0, 0) in R2 .
■ Solution The circle of radius R centered at the origin (0, 0) on the xy-plane can be
parametrized by:
r(t) = ( R cos t, R sin t).
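It can be easily verified that |r′(t)| = R and so r(t) is not an arc-length parametrization
(unless R = 1). To find an arc-length parametrization, we let:
\[ s(t) = \int_0^t |\mathbf{r}'(\tau)|\, d\tau = \int_0^t R\, d\tau = Rt. \]
Therefore, t(s) = s/R as a function of s and so an arc-length parametrization of the circle is:
\[ \tilde{\mathbf{r}}(s) = \left( R\cos\frac{s}{R},\; R\sin\frac{s}{R} \right). \]
To find its curvature, we compute:
\[ \begin{aligned}
\tilde{\mathbf{r}}'(s) &= \frac{d}{ds}\left( R\cos\frac{s}{R},\; R\sin\frac{s}{R} \right) = \left( -\sin\frac{s}{R},\; \cos\frac{s}{R} \right) \\
\tilde{\mathbf{r}}''(s) &= \left( -\frac{1}{R}\cos\frac{s}{R},\; -\frac{1}{R}\sin\frac{s}{R} \right) \\
\kappa(s) &= |\tilde{\mathbf{r}}''(s)| = \frac{1}{R}.
\end{aligned} \]
Thus the curvature of the circle is given by 1/R, i.e. the larger the circle, the smaller the
curvature.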
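The result κ = 1/R can be checked numerically by approximating r̃′′(s) with a central second difference. This is only a sketch under the assumption that the input curve is already unit-speed; `circle_arclength`, `curvature` and the step `h` are illustrative names and values.

```python
import math

def circle_arclength(R):
    """Arc-length parametrization of the circle of radius R centered at (0, 0)."""
    def r(s):
        return (R * math.cos(s / R), R * math.sin(s / R))
    return r

def curvature(r, s, h=1e-3):
    """Approximate |r''(s)| by a central second difference (r must be unit-speed)."""
    rp, r0, rm = r(s + h), r(s), r(s - h)
    second = [(p - 2 * c + m) / h**2 for p, c, m in zip(rp, r0, rm)]
    return math.sqrt(sum(c * c for c in second))

for R in (0.5, 1.0, 3.0):
    print(R, curvature(circle_arclength(R), 0.4), 1 / R)
```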
2 — Partial Differentiations
V(r, h) = πr²h.
D = {(x, y) : y ≥ x²}
The function h = 1/(xy) is undefined when xy = 0, or equivalently when at least one of x
and y is zero. We can write its domain as {(x, y) : x ≠ 0 and y ≠ 0}. Geometrically, it is R² with
both x- and y-axes removed.
Figure 2.1: value of f ( x0 , y0 ) is the height of the surface above the point ( x0 , y0 , 0)
f ( x1 , . . . , x n ) = c
where c is a constant.
Consider f(x, y) = x² + y², which is a function from R² to R. An example of a level set of f
is x² + y² = 1 (that is, f(x, y) = 1), which is the unit circle in R² centered at the origin.
By taking c to be different values, we get several level sets on the plane. They are circles
centered at the origin with varying radii depending on the value of c chosen:
The level set diagram of the two-variable function f ( x, y) consists of some representative
level sets of the function on R2 . See Figure 2.3.
[Figure 2.3: level set diagram on −3 ≤ x, y ≤ 3, with each level set labelled by its value of c.]
For three-variable functions f ( x, y, z), we do not attempt to visualize its graph, but we can
visualize its level set diagram. The former requires the fourth dimension while the latter can
be visualized in R3 . A generic level set of a three-variable function f ( x, y, z) is a surface in R3
(see Figures 2.4ab).
To summarize, the graph and level sets of a function of several variables are:

Function        Graph                             Level sets
f(x)            y = f(x) is a curve in R²         f(x) = c is generically a point on R
f(x, y)         z = f(x, y) is a surface in R³    f(x, y) = c is generically a curve in R²
f(x, y, z)      cannot visualize                  f(x, y, z) = c is generically a surface in R³
f(x, y, z, w)   cannot visualize                  cannot visualize
we have | f ( x, y) − f ( x0 , y0 )| < ε.
Don’t panic if you cannot understand this definition at this moment! We will not deal with this
definition in this course, but instead we learn continuity through some examples. The rigorous
approach of dealing with continuity will be covered systematically in MATH 2033 and MATH
3033. In this course, we will use the following facts about continuity without proof:
1. Any polynomial such as f(x, y) = x² + xy + y⁵ is continuous at every (x0, y0) in R² (we
can also say it is continuous on R²).
2. sin x, cos x, e^x, |x| are all continuous everywhere.
3. ln x, tan x, √x are continuous on their domains.
4. The sum, difference and product of two continuous functions are all continuous.
5. The quotient f(x, y)/g(x, y) of two continuous functions f(x, y) and g(x, y) is continuous at (x0, y0)
whenever g(x0, y0) ≠ 0. For instance, the function
\[ \frac{x^2 - y^2}{x^2 + y^2} \]
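\[ \begin{aligned}
\frac{\partial f}{\partial x}(x, y) &:= \text{the derivative of } f(x, y) \text{ with respect to } x \text{ regarding } y \text{ constant} \\
&= \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}; \\
\frac{\partial f}{\partial y}(x, y) &:= \text{the derivative of } f(x, y) \text{ with respect to } y \text{ regarding } x \text{ constant} \\
&= \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.
\end{aligned} \]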
Notation Alternatively, we sometimes denote ∂f/∂x by f_x, and ∂f/∂y by f_y. Note also that we do
not use f′(x, y) for multivariable functions, since it is ambiguous whether it means f_x or
f_y.
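■ Example 2.1 Find ∂f/∂x and ∂f/∂y for the function:
f(x, y) = x² sin(xy).
■ Solution To calculate ∂f/∂x, we regard y as a constant:
\[ \begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial}{\partial x}\, x^2 \sin(xy) && \text{(regarding } y \text{ constant)} \\
&= \frac{\partial x^2}{\partial x} \sin(xy) + x^2 \frac{\partial}{\partial x} \sin(xy) && \text{(product rule)} \\
&= 2x \sin(xy) + x^2 \cdot \cos(xy) \frac{\partial}{\partial x}(xy) \\
&= 2x \sin(xy) + x^2 y \cos(xy).
\end{aligned} \]
Similarly, to calculate ∂f/∂y, regard x as a constant:
\[ \begin{aligned}
\frac{\partial f}{\partial y} &= \frac{\partial}{\partial y}\, x^2 \sin(xy) \\
&= x^2 \frac{\partial}{\partial y} \sin(xy) && \text{(here } x^2 \text{ is regarded as a constant)} \\
&= x^2 \cos(xy) \cdot \frac{\partial}{\partial y}(xy) \\
&= x^2 \cos(xy) \cdot x \\
&= x^3 \cos(xy).
\end{aligned} \]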
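The limit definitions of the partial derivatives suggest a simple numerical check of Example 2.1. The sketch below uses central differences, which only approximate the limits; the function names and the step `h` are illustrative.

```python
import math

def f(x, y):
    return x**2 * math.sin(x * y)

def fx(x, y):
    """Partial derivative with respect to x, as computed in Example 2.1."""
    return 2 * x * math.sin(x * y) + x**2 * y * math.cos(x * y)

def fy(x, y):
    """Partial derivative with respect to y, as computed in Example 2.1."""
    return x**3 * math.cos(x * y)

def partial_x(g, x, y, h=1e-6):
    """Central difference in x with y held constant."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-6):
    """Central difference in y with x held constant."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.2, 0.7
print(partial_x(f, x0, y0), fx(x0, y0))   # should nearly agree
print(partial_y(f, x0, y0), fy(x0, y0))   # should nearly agree
```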
Figure 2.5: geometric interpretation of ∂f/∂x
■ Solution
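\[ \begin{aligned}
\frac{\partial f}{\partial z} &= \frac{\partial}{\partial z}\, e^{x^2 + y^3 + xyz} \\
&= e^{x^2 + y^3 + xyz} \cdot \frac{\partial (x^2 + y^3 + xyz)}{\partial z} && \text{(chain rule)} \\
&= e^{x^2 + y^3 + xyz} \cdot (0 + 0 + xy) \\
&= xy\, e^{x^2 + y^3 + xyz}.
\end{aligned} \]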
■ Example 2.3 Let f(x, y) = 3x⁴y − 2xy + 5xy³. Compute all first and second partial
derivatives.
\[ \begin{aligned}
\frac{\partial f}{\partial x} &= 12x^3 y - 2y + 5y^3 \\
\frac{\partial f}{\partial y} &= 3x^4 - 2x + 15xy^2
\end{aligned} \]
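Notation Since it is a bit clumsy to write $\frac{\partial}{\partial y}\frac{\partial f}{\partial x}$ every time, we can use the following
short-hand:
\[ \begin{aligned}
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\frac{\partial f}{\partial x} & \frac{\partial^2 f}{\partial x \partial y} &= \frac{\partial}{\partial x}\frac{\partial f}{\partial y} \\
\frac{\partial^2 f}{\partial y \partial x} &= \frac{\partial}{\partial y}\frac{\partial f}{\partial x} & \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\frac{\partial f}{\partial y}
\end{aligned} \]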
Similarly for the subscript notation, we write f_xx = (f_x)_x, and f_xy = (f_x)_y. The latter means
to differentiate by x first and then by y. Therefore, it is related to the fraction notation by:
\[ f_{xy} = (f_x)_y = \frac{\partial}{\partial y}\frac{\partial f}{\partial x} = \frac{\partial^2 f}{\partial y \partial x}. \]
The above remark seems to suggest that we should be very careful when converting ∂²f/∂y∂x
into the subscript notation f_xy. The order of x and y needs to be switched in the conversion.
However, thanks to the following important theorem, we don’t need to worry about this
too much, since in many cases, we have f_xy = f_yx.
Theorem 2.1 — Mixed Partials Theorem. Consider the function f ( x, y), if at least one of the
second partials f xy and f yx exists and is continuous, then we must have f xy = f yx .
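For the polynomial of Example 2.3, f(x, y) = 3x⁴y − 2xy + 5xy³, the equality f_xy = f_yx can be observed numerically. The sketch below differentiates the first partials from that example by central differences; the helper names and the step `h` are illustrative, and the agreement is only up to discretization and round-off error.

```python
def fx(x, y):
    """f_x for f = 3x^4 y - 2xy + 5xy^3 (Example 2.3)."""
    return 12 * x**3 * y - 2 * y + 5 * y**3

def fy(x, y):
    """f_y for the same function."""
    return 3 * x**4 - 2 * x + 15 * x * y**2

def d_dx(g, x, y, h=1e-5):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.5, -0.5
f_xy = d_dy(fx, x0, y0)   # differentiate by x first, then by y
f_yx = d_dx(fy, x0, y0)   # differentiate by y first, then by x
print(f_xy, f_yx)          # both approximate 12x^3 - 2 + 15y^2 at (x0, y0)
```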
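■ Solution Needless to say, it is very tedious and time-consuming to compute ∂f/∂x. However,
by the Mixed Partials Theorem, we can try to find $\frac{\partial}{\partial x}\frac{\partial f}{\partial y}$, and if it is continuous, then
\[ \frac{\partial}{\partial y}\frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\frac{\partial f}{\partial y}. \]
It is much easier to compute $\frac{\partial}{\partial x}\frac{\partial f}{\partial y}$ since the “monster” term is gone after differentiating
the function by y:
\[ \begin{aligned}
\frac{\partial f}{\partial y} &= 0 - \sin(xy) \cdot \frac{\partial}{\partial y}(xy) \\
&= -x \sin(xy) \\
\frac{\partial}{\partial x}\frac{\partial f}{\partial y} &= -\frac{\partial}{\partial x}\left( x \sin(xy) \right) \\
&= -\sin(xy) - xy\cos(xy),
\end{aligned} \]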
All examples of multivariable functions we have seen so far are “nice”, in a sense that partial
derivatives exist and are continuous at every point in their domains. In this course, we will use
the following terminology:
Definition 2.3 — C^k functions. A multivariable function f(x, y) is said to be C^0 on its domain
D if it is continuous at every point (x0, y0) in D. Moreover, a function f(x, y) is said to
be C^k on its domain D if all partial derivatives up to and including order k exist and are
continuous at every point (x0, y0) in D.
Remark. In this course, we will not discuss the difficult notion of differentiable functions,
which will be covered in MATH 3033. Meanwhile, please note that a function being C^1 is not
the same as saying it is differentiable!
2.3 Chain Rule
\[ \frac{df}{dt} = \frac{df}{dx}\frac{dx}{dt}. \]
One may represent this chain of relations by the schematic diagram:
[Figure 2.6: the schematic diagram for the chain rule formula df/dt = (df/dx)(dx/dt): f depends
on x (edge labelled df/dx), and x depends on t (edge labelled dx/dt).]
For multivariable functions, the relation between variables can be more complicated. For
example, let the function u( x, y, z) be the temperature at the point ( x, y, z) in the space. Suppose
a particle moves along the path
Then, the coordinates x, y and z all depend on t, and so u is ultimately a function of t. See
Figure 2.7 for the tree diagram of the variables.
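[Figure 2.7: tree diagram of the variables. u branches to x, y and z (edges labelled ∂u/∂x,
∂u/∂y, ∂u/∂z), and each of x, y, z leads to t (edges labelled dx/dt, dy/dt, dz/dt).]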
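The derivative du/dt is the rate of change of the temperature that the particle “feels” as it
travels. The multivariable chain rule can be read off from the tree diagram in Figure 2.7:
\[ \frac{du}{dt} = \frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} + \frac{\partial u}{\partial z}\frac{dz}{dt}. \]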
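\[ \begin{aligned}
\frac{du}{dt} &= \frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} + \frac{\partial u}{\partial z}\frac{dz}{dt} \\
&= \underbrace{2x}_{\partial u/\partial x} \cdot \underbrace{(-\sin t)}_{dx/dt} + \underbrace{2y}_{\partial u/\partial y} \cdot \underbrace{\cos t}_{dy/dt} + \underbrace{(-4z)}_{\partial u/\partial z} \cdot \underbrace{1}_{dz/dt} \qquad \text{(calculate each derivative)}
\end{aligned} \]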
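The chain rule computation above can be compared with a direct numerical derivative. The statement of this example is not shown in the text, so the sketch below assumes a temperature u = x² + y² − 2z² and a path (cos t, sin t, t); both are inferred only from the partial derivatives 2x, 2y, −4z and the velocities −sin t, cos t, 1 appearing in the computation, and other choices differing by constants would give the same derivatives.

```python
import math

def u(x, y, z):
    # Assumed temperature function, consistent with u_x = 2x, u_y = 2y, u_z = -4z.
    return x**2 + y**2 - 2 * z**2

def path(t):
    # Assumed path, consistent with dx/dt = -sin t, dy/dt = cos t, dz/dt = 1.
    return (math.cos(t), math.sin(t), t)

def du_dt_chain(t):
    """du/dt from the chain rule formula."""
    x, y, z = path(t)
    return 2 * x * (-math.sin(t)) + 2 * y * math.cos(t) + (-4 * z) * 1

def du_dt_numeric(t, h=1e-6):
    """du/dt by differentiating t -> u(path(t)) with a central difference."""
    return (u(*path(t + h)) - u(*path(t - h))) / (2 * h)

t = 0.9
print(du_dt_chain(t), du_dt_numeric(t))   # should nearly agree
```

Under these assumptions u along the path equals 1 − 2t², so both values should be close to −4t.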
As illustrated in the above example, once the chain rule formula is correctly written
according to the tree diagram, the remaining computations are straightforward. From now on,
we will investigate how to write down the chain rule under various configurations of variables.
The computations will usually be skipped and are left as exercises for readers.
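[Figure: tree diagram in which u depends on x, y, z and directly on t (edges labelled ∂u/∂x,
∂u/∂y, ∂u/∂z, ∂u/∂t), and each of x, y, z in turn depends on t (edges labelled dx/dt, dy/dt, dz/dt).]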
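There are four paths from u to t, so we expect the chain rule formula for du/dt to consist of
four terms:
\[ \frac{du}{dt} = \frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} + \frac{\partial u}{\partial z}\frac{dz}{dt} + \frac{\partial u}{\partial t}. \]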
Remark. Note that du/dt is different from ∂u/∂t. As the particle travels along its path, the
temperature the particle “feels” is ultimately a function of t, and so the rate of change of
temperature is represented by du/dt (using d instead of the partial ∂).
On the other hand, the partial ∂u/∂t is the time derivative of u regarding x, y and z constant!
Therefore, it is the rate of change of temperature when the position is fixed! It is not the
rate of change of temperature for the moving particle!
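\[ \frac{\partial w}{\partial s} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial s}. \]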
\[ w = x + 2y + z^2 \]
\[ x = \frac{r}{s}, \qquad y = r^2 + \ln s, \qquad z = 2r \]
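■ Solution According to the tree diagram in Figure 2.9, the chain rule for ∂w/∂s is given by:
\[ \begin{aligned}
\frac{\partial w}{\partial s} &= \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial s} \\
&= \underbrace{1}_{w_x} \cdot \underbrace{\left( -\frac{r}{s^2} \right)}_{x_s} + \underbrace{2}_{w_y} \cdot \underbrace{\frac{1}{s}}_{y_s} + \underbrace{2z}_{w_z} \cdot \underbrace{0}_{z_s} \\
&= -\frac{r}{s^2} + \frac{2}{s}.
\end{aligned} \]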
Now suppose w is a function of z only, z is a function of x and y, and both x and y are
functions of t, as illustrated in Figure 2.10. Ultimately, w is a function of t. There are two
paths from w to t in the tree diagram. Each path consists of three segments. The chain rule
for dw/dt is given by:
\[ \frac{dw}{dt} = \frac{dw}{dz}\frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{dw}{dz}\frac{\partial z}{\partial y}\frac{dy}{dt}. \]
x² + y³ + sin² y = 1,
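[Figure: tree diagram in which f depends on x and y (edges labelled ∂f/∂x and ∂f/∂y), and y
depends on x (edge labelled dy/dx).]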
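\[ \frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}. \]
Recall that f(x, y) = 1 is a constant, so df/dx = 0, which yields:
\[ 0 = f_x + f_y \frac{dy}{dx}, \quad \text{and so:} \quad \frac{dy}{dx} = -\frac{f_x}{f_y}. \]
\[ \frac{dy}{dx} = -\frac{2x}{3y^2 + 2\sin y \cos y}. \]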
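The formula dy/dx = −f_x/f_y can be checked numerically on the curve x² + y³ + sin² y = 1. The sketch below is one way to do so, under assumptions that are not in the notes: it picks the branch x > 0, solves the equation for x as a function of y, and uses dy/dx = 1/(dx/dy), which is valid where dx/dy ≠ 0. The sample point y0 = 0.5 is arbitrary.

```python
import math

def x_of_y(y):
    """Branch x > 0 of the curve x^2 + y^3 + sin^2(y) = 1, solved for x."""
    return math.sqrt(1 - y**3 - math.sin(y)**2)

def dy_dx_formula(x, y):
    """dy/dx = -f_x / f_y for f(x, y) = x^2 + y^3 + sin^2(y)."""
    return -2 * x / (3 * y**2 + 2 * math.sin(y) * math.cos(y))

y0 = 0.5
x0 = x_of_y(y0)
h = 1e-6
dx_dy = (x_of_y(y0 + h) - x_of_y(y0 - h)) / (2 * h)   # slope of x as a function of y
print(1 / dx_dy, dy_dx_formula(x0, y0))                # should nearly agree
```

Note that the formula breaks down where f_y = 0, for example at the point (1, 0) on this curve.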
x = r cos θ
y = r sin θ
Therefore, both x and y can be regarded as functions of r and θ, and so u can be regarded as a
function of r and θ as well. By the chain rule, we know:
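\[ \begin{aligned}
\frac{\partial u}{\partial r} &= \frac{\partial u}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial r} \\
&= u_x \cos\theta + u_y \sin\theta.
\end{aligned} \]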
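\[ \begin{aligned}
\frac{\partial^2 u}{\partial \theta \partial r} &= \frac{\partial}{\partial \theta}\left( u_x \cos\theta + u_y \sin\theta \right) \\
&= \frac{\partial u_x}{\partial \theta}\cos\theta - u_x \sin\theta + \frac{\partial u_y}{\partial \theta}\sin\theta + u_y \cos\theta.
\end{aligned} \]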
Next we would like to express ∂u_x/∂θ and ∂u_y/∂θ as partial derivatives with respect to x and y
only. The reason for doing so is that u_xx is much easier to compute than u_xθ. For instance, if
u(x, y) = x² + y, then u_x = 2x, and so u_xx = 2 and u_xy = 0. However, to find u_xθ one needs to
first express u_x as 2r cos θ.
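Since u_x and u_y are both functions of x and y, and (x, y) are functions of (r, θ), u_x and u_y
have the same tree diagram as the function u. The chain rule for them is given by:
\[ \begin{aligned}
\frac{\partial u_x}{\partial \theta} &= \frac{\partial u_x}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial u_x}{\partial y}\frac{\partial y}{\partial \theta}
= u_{xx}(-r\sin\theta) + u_{xy}(r\cos\theta) \\
\frac{\partial u_y}{\partial \theta} &= \frac{\partial u_y}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial u_y}{\partial y}\frac{\partial y}{\partial \theta}
= u_{yx}(-r\sin\theta) + u_{yy}(r\cos\theta)
\end{aligned} \]
Substituting these back and using u_xy = u_yx, we get:
\[ \begin{aligned}
\frac{\partial^2 u}{\partial \theta \partial r} &= \frac{\partial u_x}{\partial \theta}\cos\theta - u_x \sin\theta + \frac{\partial u_y}{\partial \theta}\sin\theta + u_y \cos\theta \\
&= \left( -u_{xx}\, r\sin\theta + u_{xy}\, r\cos\theta \right)\cos\theta - u_x \sin\theta \\
&\quad + \left( -u_{yx}\, r\sin\theta + u_{yy}\, r\cos\theta \right)\sin\theta + u_y \cos\theta \\
&= -u_{xx}\, r\sin\theta\cos\theta + u_{xy}\, r\left( \cos^2\theta - \sin^2\theta \right) + u_{yy}\, r\sin\theta\cos\theta \\
&\quad - u_x \sin\theta + u_y \cos\theta
\end{aligned} \]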
\[ D_{\mathbf{i}} f(x, y) = \left. \frac{d}{dt} \right|_{t=0} f(x + t, y) = \lim_{t \to 0} \frac{f(x + t, y) - f(x, y)}{t} = \frac{\partial f}{\partial x}(x, y). \]
In practice, we do not need to compute D_u f from the definition, since Theorem 2.2, to
be introduced shortly, will come in handy. In order to introduce this theorem, we first
define:
Definition 2.5 — Gradient Vector. Given a two-variable function f(x, y) which is C^1 on its
domain, the gradient vector of f at (x, y) is denoted and defined as:
\[ \nabla f(x, y) = \frac{\partial f}{\partial x}(x, y)\,\mathbf{i} + \frac{\partial f}{\partial y}(x, y)\,\mathbf{j}. \]
\[ \begin{aligned}
\frac{\partial f}{\partial x}(x, y) &= 2xy + 3x^2 \\
\frac{\partial f}{\partial y}(x, y) &= x^2
\end{aligned} \]
Therefore,
∇f(x, y) = (2xy + 3x²)i + x²j.
The vector ∇f(x, y) depends on (x, y). By taking different values of (x, y), a different gradient
vector is obtained. For instance:
∇ f (1, 1) = 5i + j,
∇ f (1, 0) = 3i + j.
Theorem 2.2 Given a two-variable function f ( x, y) which is C1 on its domain, the directional
derivative of f at ( x, y) in the unit direction u = u1 i + u2 j is given by:
Du f ( x, y) = ∇ f ( x, y) · u.
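Theorem 2.2 can be illustrated by comparing ∇f · u with a difference quotient along u. The function f is not shown in the text above, so the sketch assumes f(x, y) = x²y + x³, which is one function whose gradient is the (2xy + 3x²)i + x²j computed earlier. The unit vector u = ⟨3/5, 4/5⟩ and the step `h` are arbitrary illustrative choices.

```python
def f(x, y):
    # Assumed function, consistent with the gradient (2xy + 3x^2, x^2) above.
    return x**2 * y + x**3

def grad_f(x, y):
    return (2 * x * y + 3 * x**2, x**2)

def directional_derivative(x, y, u):
    """D_u f = grad f . u, for a unit vector u (Theorem 2.2)."""
    gx, gy = grad_f(x, y)
    return gx * u[0] + gy * u[1]

def directional_numeric(x, y, u, h=1e-6):
    """Central difference of t -> f((x, y) + t*u) at t = 0."""
    fp = f(x + h * u[0], y + h * u[1])
    fm = f(x - h * u[0], y - h * u[1])
    return (fp - fm) / (2 * h)

u = (0.6, 0.8)   # unit vector: 0.36 + 0.64 = 1
print(directional_derivative(1.0, 1.0, u), directional_numeric(1.0, 1.0, u))
```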
[Figure: the gradient vector ∇f(a, b) and an angle θ measured from it.]
Proof. Let r(t) = x (t)i + y(t)j be a parametrization of the level curve f ( x, y) = c. In other
words, we have f ( x (t), y(t)) = c for any t, and so
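\[ \frac{d}{dt} f(x(t), y(t)) = 0. \]
By the chain rule:
\[ \begin{aligned}
\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} &= 0 \\
\left( \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} \right) \cdot \left( \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} \right) &= 0 \\
\nabla f \cdot \mathbf{r}'(t) &= 0.
\end{aligned} \]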
Therefore, the gradient vector ∇ f is orthogonal to r′ (t) which is the tangent vector of the level
curve r(t). It completes the proof. ■
Figure 2.15: a plot of the field ∇(sin xy) and the level curves of sin xy.
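\[ \nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} \]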
Given a unit vector u, the directional derivative of f ( x, y, z) in the direction of u is given by:
Du f = ∇ f · u.
Since the level set of a three-variable function is typically a surface, the gradient vector ∇ f
at any given point is orthogonal to the level surface f ( x, y, z) = c at that point. See Figure 2.16.
In the next section, we will use this fact to find the equation of the tangent plane to a
surface.
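x² + y² = z² + 3
■ Solution First we need to write the equation of the surface in a level set form, i.e.
x² + y² − z² = 3
The gradient of the left-hand side x² + y² − z² is 2xi + 2yj − 2zk, which equals 4i + 2k at
(2, 0, −1).
Then, n := 4i + 2k is a normal vector of the surface at (2, 0, −1). The equation of the tangent
plane at (2, 0, −1) is given by:
4(x − 2) + 0(y − 0) + 2(z + 1) = 0, i.e. 2x + z = 3.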
z − f ( x, y) = 0.
Then, one can define g( x, y, z) = z − f ( x, y) so that the graph of the two-variable function
f ( x, y) becomes a level set of a three-variable function g( x, y, z). Let’s look at an example:
■ Example 2.8 Given the function f(x, y) = x cos y − y e^x, find the tangent plane at (0, 0, 0) to
the graph z = x cos y − y e^x.
■ Solution First rearrange the terms so that the graph equation becomes a level set:
z − x cos y + y e^x = 0.
Define g(x, y, z) = z − x cos y + y e^x, then the surface under consideration is the level set
g(x, y, z) = 0. Its gradient is:
∇ g(x, y, z) = (− cos y + y e^x) i + (x sin y + e^x) j + k.
The normal vector at (0, 0, 0) is given by:
n = ∇ g(0, 0, 0) = −i + j + k.
Therefore, the equation of the tangent plane at (0, 0, 0) is:
−x + y + z = 0.
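As a quick check of Example 2.8, the normal vector can be estimated numerically. The sketch below is an illustration only: the helper `numerical_gradient` is a hypothetical name introduced here, and it uses central differences rather than the symbolic differentiation done in the text.

```python
import math

def g(x, y, z):
    # Level-set function from Example 2.8: g = z - x cos y + y e^x
    return z - x * math.cos(y) + y * math.exp(x)

def numerical_gradient(func, p, h=1e-6):
    # Central-difference estimate of each partial derivative at the point p
    grad = []
    for i in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

normal = numerical_gradient(g, (0.0, 0.0, 0.0))
print(normal)  # should be close to the normal vector -i + j + k
```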
Generally, one can derive a formula for finding the tangent plane of any graph of a
two-variable function f ( x, y):
Theorem 2.4 Let f(x, y) be a function which is C1 on its domain. The equation of the tangent
plane for the graph z = f(x, y) at the point (x0, y0, f(x0, y0)) is given by:
\[ z = f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)\,(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)\,(y - y_0). \]
z − f ( x, y) = 0
and define g( x, y, z) = z − f ( x, y), then the graph can be regarded as a level set g( x, y, z) = 0
of the three-variable function g.
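\[ \nabla g(x, y, z) = -\frac{\partial f}{\partial x}\mathbf{i} - \frac{\partial f}{\partial y}\mathbf{j} + \mathbf{k}. \]
At the point (x0, y0, f(x0, y0)), the normal vector to the surface is therefore given by:
\[ \mathbf{n} = \nabla g(x_0, y_0, f(x_0, y_0)) = -\frac{\partial f}{\partial x}(x_0, y_0)\,\mathbf{i} - \frac{\partial f}{\partial y}(x_0, y_0)\,\mathbf{j} + \mathbf{k}. \]
Hence the tangent plane, which passes through (x0, y0, f(x0, y0)) with normal vector n, has the equation:
\[ -\frac{\partial f}{\partial x}(x_0, y_0)\,(x - x_0) - \frac{\partial f}{\partial y}(x_0, y_0)\,(y - y_0) + \bigl(z - f(x_0, y_0)\bigr) = 0. \]
By rearrangement, we get:
\[ z = f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)\,(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)\,(y - y_0), \]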
as desired. ■
Figure 2.17: the tangent plane at a critical point of a C1 function is horizontal. However, a
critical point is not always a local maximum or minimum. It can be a saddle like the origin
of z = x² − y², which is a local maximum in the y-direction but is a local minimum in the
x-direction.
■ Solution We compute ∂f/∂x and ∂f/∂y, and then set them to zero and solve for (x, y):
\[ 0 = \frac{\partial f}{\partial x} = y - 2x - 2 \]
\[ 0 = \frac{\partial f}{\partial y} = x - 2y - 2. \]
It is a system of equations with unknowns x and y. The first equation gives y = 2x + 2.
Substituting it into the second equation, we get:
x − 2(2x + 2) − 2 = 0 ⇒ x = −2.
When x = −2, we have y = −2, and so ( x, y) = (−2, −2) is a critical point of f ( x, y). See
Figure 2.18a for its graph.
■ Example 2.10 Find all critical point(s) of the function f(x, y) = sin x sin y.
■ Solution Consider:
∂f/∂x = 0 and ∂f/∂y = 0
cos x sin y = 0 and sin x cos y = 0
(cos x = 0 or sin y = 0) and (sin x = 0 or cos y = 0)
(x = π/2 + kπ or y = mπ) and (x = nπ or y = π/2 + pπ).
Here m, n, k, p are any integers. Since cos x and sin x cannot vanish at the same x (and likewise
for y), the only consistent combinations are:
x = π/2 + kπ and y = π/2 + pπ
or: y = mπ and x = nπ.
f_xx(0, 0) = 2
f_yy(0, 0) = 2
for every (x, y) on the R2 plane. Both are positive numbers. You may be tempted to conclude
that (0, 0) is a local minimum point. However, if one plots the graph of this function (see
Figure 2.19), one can easily see that (0, 0) is neither a local maximum nor a local minimum.
Around (0, 0), the graph is concave up in some directions but concave down in other
directions. We call this (0, 0) a saddle.
This example shows that the signs of f_xx and f_yy alone cannot determine the nature of the
critical point. In fact, the second derivative test for two-variable functions is slightly more
complicated than that in single-variable calculus:
Theorem 2.5 — Second Derivative Test for Two-Variable Functions. Let f(x, y) be a C2 function
and let (x0, y0) be a critical point of f, i.e. ∇ f(x0, y0) = 0. Then the nature of this critical point
(x0, y0) is determined by the following table:

(f_xx f_yy − f_xy²)(x0, y0)    f_xx(x0, y0)    (x0, y0) is a:
> 0                            > 0             local minimum
> 0                            < 0             local maximum
< 0                            anything        saddle

Any other cases are inconclusive.
For the function f(x, y) = x² + 4xy + y² in the above example, to determine the nature of (0, 0)
we also need f_xy(0, 0), which equals 4.
Therefore, we have:
(f_xx f_yy − f_xy²)(0, 0) = 2 × 2 − 4² < 0,
f_xx(0, 0) = 2 > 0.
From the table in Theorem 2.5, we conclude that (0, 0) is a saddle, as expected from the plot of
its graph. Let’s look at one more example before we learn the proof of the Second Derivative
Test.
■ Example 2.11 Let f(x, y) = 3y² − 2y³ − 3x² + 6xy. Find all critical points and determine
the nature of each of them.
■ Solution Set both first partial derivatives to zero:
∂f/∂x = −6x + 6y = 0,
∂f/∂y = 6y − 6y² + 6x = 0.
From the first equation, we get y = x. Substituting this into the second equation, we get:
6x − 6x² + 6x = 0, or equivalently 2x − x² = 0.
Therefore, x = 0 or x = 2.
By noting that y = x, we have two critical points: (0, 0) and (2, 2).
Next we compute the second derivatives of f:
f_xx = −6          f_xy = 6
f_yx = 6           f_yy = 6 − 12y

Critical point P   f_xx(P)   f_yy(P)   f_xy(P)   (f_xx f_yy − f_xy²)(P)   Nature of P
(0, 0)             −6        6         6         −72                      saddle
(2, 2)             −6        −18       6         72                       local maximum
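The table lookup in Theorem 2.5 is mechanical enough to be written as a small function. The sketch below is an illustration: the function names `classify` and `second_derivatives` are hypothetical, and the second derivatives are hard-coded from the computation for Example 2.11 rather than derived automatically.

```python
def classify(fxx, fyy, fxy):
    # Second derivative test: decide using D = fxx*fyy - fxy^2 and the sign of fxx
    D = fxx * fyy - fxy ** 2
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle"
    return "inconclusive"

def second_derivatives(x, y):
    # Second derivatives of f(x, y) = 3y^2 - 2y^3 - 3x^2 + 6xy, as (fxx, fyy, fxy)
    return (-6.0, 6.0 - 12.0 * y, 6.0)

critical_points = [(0, 0), (2, 2)]
results = {p: classify(*second_derivatives(*p)) for p in critical_points}
for p, nature in results.items():
    print(p, nature)
```

Running it reproduces the last column of the table: a saddle at (0, 0) and a local maximum at (2, 2).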
\[ f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \dots \]
If f(x) has a critical point at x = a, then f′(a) = 0. Also, when x is very close to a, the
higher-order terms (x − a)³, (x − a)⁴, etc. are significantly smaller than the quadratic term
(x − a)². Therefore, the function f(x) is approximately given by:
\[ f(x) \simeq f(a) + \frac{f''(a)}{2!}(x - a)^2 \quad \text{when } x \text{ is near } a. \]
The right-hand side f(a) + (f′′(a)/2!)(x − a)² is a quadratic function. If f′′(a) > 0, then the graph
y = f(a) + (f′′(a)/2!)(x − a)² is a concave up parabola and so f(a) + (f′′(a)/2!)(x − a)² ≥ f(a).
Therefore, f(x), which is approximately f(a) + (f′′(a)/2!)(x − a)², is also ≥ f(a) when x is near a.
This explains why f(x) has a local minimum at x = a.
On the other hand, if f′′(a) < 0, then the graph y = f(a) + (f′′(a)/2!)(x − a)² is a concave down
parabola. A similar argument as above shows that f(x) has a local maximum at x = a.
Figure 2.20: blue graph shows y = f(x) where f′(0) = 0; yellow graph shows
y = f(0) + (f′′(0)/2!) x² where f′′(0) > 0
Back to multivariable calculus, we now explain the second derivative test using the Taylor’s
series approach. Given a function f(x, y), the multivariable Taylor’s series about (x, y) =
(x0, y0) is given by:
\[ \begin{aligned}
f(x, y) = {} & f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
& + \frac{f_{xx}(x_0, y_0)}{2!}(x - x_0)^2 + \frac{2 f_{xy}(x_0, y_0)}{2!}(x - x_0)(y - y_0) + \frac{f_{yy}(x_0, y_0)}{2!}(y - y_0)^2 \\
& + \text{higher-order terms}
\end{aligned} \]
The proof is beyond the scope of the course. If (x0, y0) is a critical point of f(x, y), then
f_x(x0, y0) = f_y(x0, y0) = 0. For simplicity, denote P = (x0, y0), then when (x, y) is near P, we
have:
\[ f(x, y) \simeq f(x_0, y_0) + \frac{1}{2}\left[ f_{xx}(P)(x - x_0)^2 + 2 f_{xy}(P)(x - x_0)(y - y_0) + f_{yy}(P)(y - y_0)^2 \right]. \]
Therefore, to determine whether ( x0 , y0 ) is a local maximum/minimum or a saddle of f ( x, y),
one should determine whether the quadratic function:
Translating back to the previous notation, we can conclude:

(f_xx f_yy − f_xy²)(x0, y0)    f_xx(x0, y0)    (x0, y0) is a:
> 0                            > 0             local minimum
> 0                            < 0             local maximum
< 0                            anything        saddle

This explains the second derivative test for two-variable functions!
2.7 Lagrange’s Multiplier
Figure 2.21: At the maximum and minimum points of the function x² + 2y² when (x, y) is
restricted to the unit circle x² + y² = 1, the tangent plane may not be horizontal. Therefore,
solving ∇ f = 0 does not give the maximum or minimum points of the function.
∇ f ( x, y) = λ∇ g( x, y)
g( x, y) = c
We call this the Lagrange’s Multiplier method because the scalar λ is called the Lagrange’s
Multiplier.
■ Example 2.12 Let f(x, y) = x² + y² + 2x + 2y. Find the maximum and minimum values of
f when (x, y) is restricted to the constraint x² + y² = 1.
∇ f = ⟨2x + 2, 2y + 2⟩
∇ g = ⟨2x, 2y⟩.
2x + 2 = 2λx    (1)
2y + 2 = 2λy    (2)
x² + y² = 1     (3)
that the λ can be canceled. However, we may worry whether (2) is zero! Therefore, we
\[ \frac{2x + 2}{2y + 2} = \frac{2\lambda x}{2\lambda y}. \]
After cancellations, we get:
\[ \frac{x + 1}{y + 1} = \frac{x}{y}. \]
By cross multiplication:
y(x + 1) = x(y + 1) ⇒ xy + y = xy + x ⇒ y = x.
Substituting y = x into (3), we have 2x² = 1, and so x = 1/√2 or −1/√2. Since y = x, the solutions
for (x, y) in this case are:
(x, y) = (1/√2, 1/√2), (−1/√2, −1/√2).
Case 2: 2y + 2 = 0
In this case, y = −1. Substituting this into (3), we get x = 0. However, putting x = 0 into (1)
■ Example 2.13 Let f(x, y) = x² − 4x + y² + 9. Find the maximum and minimum points and
values of f(x, y) subject to the constraint 4x² + 9y² = 36.
■ Solution Define g(x, y) = 4x² + 9y², then the constraint is the level set g(x, y) = 36. Set up
the Lagrange’s Multiplier system:
∇ f(x, y) = λ∇ g(x, y)
g(x, y) = 36
2x − 4 = 8λx      (1)
2y = 18λy         (2)
4x² + 9y² = 36    (3)
Case 1: (2) ≠ 0
By (1) ÷ (2), we get:
(2x − 4)/(2y) = 8λx/(18λy)
(x − 2)/y = 4x/(9y)       (cancel λ)
9y(x − 2) = 4xy           (cross multiplication)
9(x − 2) = 4x             (cancel y ≠ 0)
x = 18/5
However, substituting x = 18/5 into (3), we get:
9y² = 36 − 4(18/5)²
which is a negative number, but 9y² must be positive (or zero)! Therefore, there is no solution
in this case.
Case 2: (2) = 0
In this case 2y = 0, so y = 0, and (3) gives 4x² = 36, i.e. x = 3 or x = −3.
f(3, 0) = 6
f(−3, 0) = 30
Therefore, the minimum point is (3, 0) with value 6; the maximum point is (−3, 0) with value 30.
See Figure 2.23.
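The answer to Example 2.13 can be cross-checked without Lagrange’s Multiplier by walking around the constraint directly. The sketch below is an independent numerical check, not the method of the text: it assumes the standard parametrization x = 3 cos t, y = 2 sin t of the ellipse 4x² + 9y² = 36 and simply samples f along it.

```python
import math

def f(x, y):
    # Objective function from Example 2.13
    return x * x - 4 * x + y * y + 9

# Sample the ellipse 4x^2 + 9y^2 = 36 via x = 3 cos t, y = 2 sin t (assumed parametrization)
n = 100000
samples = [
    (3 * math.cos(2 * math.pi * k / n), 2 * math.sin(2 * math.pi * k / n))
    for k in range(n)
]
values = [f(x, y) for x, y in samples]

f_min = min(values)
f_max = max(values)
arg_min = samples[values.index(f_min)]
arg_max = samples[values.index(f_max)]
print(f_min, arg_min)  # expected to be near 6 at (3, 0)
print(f_max, arg_max)  # expected to be near 30 at (-3, 0)
```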
Next we explain how the Lagrange’s Multiplier method works. Consider a function f(x, y)
subject to the constraint g(x, y) = c. At the point (a, b) on the constraint where the maximum
or minimum of f ( x, y) is achieved, the level set of f ( x, y) at ( a, b) is tangent to the constraint
g( x, y) = c. Consequently, the gradient vector ∇ f ( a, b), which is perpendicular to the level set
of f at ( a, b), must be parallel to the gradient vector ∇ g( a, b), which is perpendicular to the
constraint g = c. See Figure 2.24 for an illustration. Therefore, at such a point, we must have:
∇ f ( a, b) = λ∇ g( a, b)
where λ is a scalar.
The Lagrange’s Multiplier method also works for three-variable functions, yet the system
of equations may be more complicated. Let’s look at an example:
■ Example 2.14 Find the distance from (0, 0, 0) to the plane 2x + 3y + 4z = 29 using
Lagrange’s Multiplier.
■ Solution The distance from a point P to a plane is defined to be the shortest possible
distance between the given point P and any point Q on the plane. Let’s first formulate this
problem in a mathematical way. We want to:
minimize √(x² + y² + z²)
subject to constraint 2x + 3y + 4z = 29
However, to minimize √(x² + y² + z²) amounts to calculating ∇√(x² + y² + z²). As you
can imagine, it would be messy. It is useful to observe that minimizing √(x² + y² + z²) is
equivalent to minimizing x² + y² + z², i.e. the square of the distance from the origin. The
latter is much easier to handle. Let:
f(x, y, z) = x² + y² + z²
g(x, y, z) = 2x + 3y + 4z,
then the constraint is the level set g = 29. Set up the Lagrange’s Multiplier system ∇ f = λ∇ g
as in previous examples:
2x = 2λ
2y = 3λ
2z = 4λ
2x + 3y + 4z = 29
Then x = λ, y = 3λ/2 and z = 2λ. Substituting them into the constraint equation, we get:
2λ + 9λ/2 + 8λ = 29.
It is easy to see that λ = 2, and therefore (x, y, z) = (2, 3, 4). It gives the unique critical point.
It is intuitive that the minimum point must exist in this problem (and there is no maximum
point), so this unique critical point must give the minimum point. Since f(2, 3, 4) = 29, the
distance from (0, 0, 0) to the plane is √29.
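The same steps can be traced in a few lines of code. The sketch below follows the substitution used in Example 2.14 (each coordinate equals λ/2 times the corresponding coefficient of the plane) and then compares the result with the point-to-plane distance formula |c|/|n|, which is quoted here as a known fact rather than taken from this section. The variable names are illustrative assumptions.

```python
import math

normal = (2.0, 3.0, 4.0)  # coefficients of the plane 2x + 3y + 4z = 29
c = 29.0

# From grad f = lambda * grad g: 2*x_i = lambda * n_i, so x_i = lambda * n_i / 2.
# Substituting into n . x = c gives lambda * |n|^2 / 2 = c.
norm_squared = sum(n * n for n in normal)
lam = 2 * c / norm_squared
point = tuple(lam * n / 2 for n in normal)

distance = math.sqrt(sum(p * p for p in point))
formula = abs(c) / math.sqrt(norm_squared)  # standard point-to-plane distance from the origin
print(lam, point, distance, formula)
```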
2.8 Optimizations
In this section we will study some examples of optimization using the gradient and/or
Lagrange’s Multiplier methods.
■ Example 2.15 Many airlines require that the sum of the length, width and height of a checked
bag cannot exceed 62 inches. Find the dimensions of the rectangular bag that has
the greatest possible volume under this regulation.
■ Solution Denote by l, w, h the length, width and height respectively. We need to maximize
the volume of the baggage, which is given by:
V(l, w, h) = lwh.
The constraint is l + w + h ≤ 62 (inches), but it is intuitively clear that in order to maximize the
volume, the sum l + w + h should be as large as possible. Define g(l, w, h) = l + w + h,
then the constraint can be regarded as the level set g = 62. Set up the Lagrange’s Multiplier
system:
∇V = λ ∇ g
g(l, w, h) = 62
which is equivalent to
wh = λ
lh = λ
lw = λ
l + w + h = 62
Although it is not too difficult to solve them by hand, let’s type the following command in
Mathematica to solve them:
Solve[{w h == L, l h == L, l w == L, l + w + h == 62}, {l, w, h, L}]
Only the first one is physically relevant. Therefore, the rectangular baggage with the largest
volume under this restriction is a cube!
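For readers who prefer to solve the system by hand, here is a sketch of the computation. It assumes l, w, h > 0, which is a physical assumption rather than something stated in the system itself; the solutions with a zero dimension are discarded for this reason.

```latex
\begin{aligned}
wh = lh &\implies h(w - l) = 0 \implies w = l \quad (\text{since } h > 0),\\
lh = lw &\implies l(h - w) = 0 \implies h = w \quad (\text{since } l > 0),\\
l + w + h = 62 &\implies 3l = 62 \implies l = w = h = \tfrac{62}{3},\\
V &= \left(\tfrac{62}{3}\right)^{3} = \tfrac{238328}{27} \approx 8827 \text{ cubic inches}.
\end{aligned}
```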
■ Example 2.16 Three cities A, B and C are located at (5, 2), (−4, 4) and (−1, −3) respectively
on the (x, y)-plane. There is a railway track whose equation is y = x³ + 1, and a station is going
to be built on the track so that the sum of squares of the distances from each city to the
station is minimized. Find the coordinates of the station.
The constraint is that the station has to be on the track, i.e. y = x³ + 1. Define g(x, y) = y − x³,
then the constraint can be written as g(x, y) = 1. Set up the Lagrange’s Multiplier system
∇ f = λ∇ g and g(x, y) = 1:
2(x − 5) + 2(x + 4) + 2(x + 1) = −3λx²
2(y − 2) + 2(y − 4) + 2(y + 3) = λ
y − x³ = 1
The first equation simplifies to 6x = −3λx² and the second to λ = 6y − 6. Using y = x³ + 1, we
have λ = 6x³, so 6x = −18x⁵, i.e. x(1 + 3x⁴) = 0. Hence x = 0 and y = 1, so (x, y) = (0, 1).
Therefore, the station should be located at (0, 1) in order to minimize the sum of squares of
the distances.
(x_1, y_1), . . . , (x_N, y_N)
on the xy-plane. Find the straight line y = mx + c such that the sum of squares of distances
between each (x_i, y_i) and (x_i, m x_i + c) is minimized.
\[ f(m, c) = \sum_{i=1}^{N} (y_i - m x_i - c)^2. \]
Note that (x_i, y_i)’s are given so they should be regarded as constants. The variables are m
and c. Note that there is no constraint for m and c, so we can simply solve ∇ f(m, c) = 0 for
critical points.
\[ \frac{\partial f}{\partial m} = -2 \sum_{i=1}^{N} (y_i - m x_i - c)\, x_i = -2 \left( \sum_{i=1}^{N} x_i y_i - m \sum_{i=1}^{N} x_i^2 - c \sum_{i=1}^{N} x_i \right) \]
\[ \frac{\partial f}{\partial c} = -2 \sum_{i=1}^{N} (y_i - m x_i - c) = -2 \left( \sum_{i=1}^{N} y_i - m \sum_{i=1}^{N} x_i - cN \right) \]
Set ∂f/∂m = ∂f/∂c = 0, regarding all x_i’s and y_i’s to be constants, then:
Am + Bc = E
Bm + Nc = F
where A = ∑_{i=1}^{N} x_i², B = ∑_{i=1}^{N} x_i, E = ∑_{i=1}^{N} x_i y_i and F = ∑_{i=1}^{N} y_i. By solving the system
carefully, one should get:
\[ m = \frac{BF - EN}{B^2 - AN} = \frac{\left(\sum x_i\right)\left(\sum y_i\right) - N \left(\sum x_i y_i\right)}{\left(\sum x_i\right)^2 - N \sum x_i^2} \]
\[ c = \frac{BE - AF}{B^2 - AN} = \frac{\left(\sum x_i\right)\left(\sum x_i y_i\right) - \left(\sum x_i^2\right)\left(\sum y_i\right)}{\left(\sum x_i\right)^2 - N \sum x_i^2} \]
It is quite intuitive that this pair of m and c should minimize f since f ≥ 0 and so a minimum
must exist.
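The closed-form expressions for m and c translate directly into code. The sketch below is a minimal illustration: the function name `fit_line` and the four sample points are assumptions made for the example, and the points were chosen to lie exactly on y = 2x + 1 so the expected answer is obvious.

```python
def fit_line(points):
    # Least-squares line y = m x + c using the sums A, B, E, F defined in the text
    N = len(points)
    A = sum(x * x for x, _ in points)
    B = sum(x for x, _ in points)
    E = sum(x * y for x, y in points)
    F = sum(y for _, y in points)
    denom = B * B - A * N
    if denom == 0:
        # Happens when all x_i coincide: the linear system has no unique solution
        raise ValueError("all x_i are equal; the line is not determined")
    m = (B * F - E * N) / denom
    c = (B * E - A * F) / denom
    return m, c

# Hypothetical data lying exactly on the line y = 2x + 1
m, c = fit_line([(0, 1), (1, 3), (2, 5), (3, 7)])
print(m, c)
```

The guard on the denominator is a design choice added here: B² − AN vanishes exactly when every x_i is the same, in which case the best-fit line would be vertical and cannot be written as y = mx + c.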
■ Solution The Lagrange’s Multiplier method finds the boundary critical points on
4x² + 9y² = 36. For the interior 4x² + 9y² < 36, the critical points are simply solutions to
∇ f = 0. The general procedure for an optimization problem with a solid domain is:
1. Find all interior critical points by solving ∇ f = 0;
2. Find all boundary critical points using Lagrange’s Multiplier;
3. Evaluate f at each critical point found, and look for the points that give the greatest/lowest
values of f.
Interior: Set ∇ f = 0, we get:
2x − 4 = 0
2y = 0
Therefore, the only interior critical point is (2, 0), and one can easily check that it lies in the
given region.
Boundary: The boundary is the ellipse 4x² + 9y² = 36. We have already found in Example
2.13, using Lagrange’s Multiplier, that the boundary critical points are (3, 0) and (−3, 0).
Finally, evaluate f at each critical point found:
f (2, 0) = 5
f (3, 0) = 6
f (−3, 0) = 30
Therefore, the absolute minimum is 5 (attained at (2, 0)), and the absolute maximum is
30 (attained at (−3, 0)).
3 — Multiple Integrations
William Thurston
\[ \overbrace{\int_{y=1}^{y=2} \underbrace{\int_{x=0}^{x=1} (4 - x - y^2 x)\,dx}_{\text{inner}}\,dy}^{\text{outer}}. \]
When computing the inner integral (which is with respect to x in this example), we regard all
It is worthwhile to note that if we switch the inner and outer integrals, the final answer is the
same!
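\[ \begin{aligned}
\int_{x=0}^{x=1} \int_{y=1}^{y=2} (4 - x - y^2 x)\,dy\,dx
&= \int_{x=0}^{x=1} \left[ 4y - xy - \frac{y^3 x}{3} \right]_{y=1}^{y=2} dx \\
&= \int_{x=0}^{x=1} \left( 8 - 2x - \frac{8x}{3} \right) - \left( 4 - x - \frac{x}{3} \right) dx \\
&= \int_{x=0}^{x=1} \left( 4 - \frac{10x}{3} \right) dx = \frac{7}{3}.
\end{aligned} \]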
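The agreement between the two orders of integration can also be observed numerically. The sketch below is an approximation, not a proof: it uses a simple midpoint rule (a choice made here for brevity) and nests two one-dimensional integrations in each order.

```python
def integrate(func, a, b, n=200):
    # Midpoint-rule approximation of the integral of func over [a, b]
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def integrand(x, y):
    return 4 - x - y * y * x

# Inner integral in x (0 to 1), outer integral in y (1 to 2)
dx_first = integrate(lambda y: integrate(lambda x: integrand(x, y), 0, 1), 1, 2)

# Inner integral in y (1 to 2), outer integral in x (0 to 1)
dy_first = integrate(lambda x: integrate(lambda y: integrand(x, y), 1, 2), 0, 1)

print(dx_first, dy_first, 7 / 3)
```

Both printed approximations should be close to 7/3, up to the small discretization error of the midpoint rule.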
It is not a coincidence! Let’s explain why it is true by learning the geometric meaning of double
integrals. Consider the integral:
\[ \int_{x=a}^{x=b} \int_{y=c}^{y=d} f(x, y)\,dy\,dx. \]
is an integral with respect to y keeping x fixed. This quantity represents the area under
the curve obtained by moving along the y-direction from y = c to y = d on the surface
z = f ( x, y), while keeping x unchanged. See Figure 3.1.
Since A(x) dx can be thought of as the volume of a solid slice with width dx and cross-section
area A(x), integrating A(x) dx means adding up the volumes of these thin slices, and so
the double integral
\[ \int_{x=a}^{x=b} A(x)\,dx \]
is the volume under the graph z = f(x, y) over the base rectangle bounded by x = a, x = b,
y = c and y = d. It is important to understand the geometric meanings of the inner and outer
integrals in order to set up a double integral correctly.
As a double integral represents the volume of a solid, one should not expect there is any
difference if we slice the solid in a different way. For instance, to find the volume under the
graph z = 6 − 2x − y over the rectangular region 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2, one can set up the
double integral in either way:
\[ \int_{x=0}^{x=1} \underbrace{\int_{y=0}^{y=2} (6 - 2x - y)\,dy}_{A(x)}\,dx \qquad \text{see Figure 3.2a} \]
\[ \int_{y=0}^{y=2} \underbrace{\int_{x=0}^{x=1} (6 - 2x - y)\,dx}_{A(y)}\,dy \qquad \text{see Figure 3.2b} \]
Readers should verify that the above integrals indeed give the same value (the answer
is 8). In general, the following Fubini’s Theorem asserts that switching dx and dy (and the
corresponding integral signs) gives the same double integral. Although the statement of the
theorem is geometrically intuitive, the proof is not easy and is beyond the scope of this course.
Theorem 3.1 — Fubini’s Theorem for Rectangular Regions. Let f(x, y) be a continuous function
over a rectangular region a ≤ x ≤ b and c ≤ y ≤ d, then:
\[ \int_{y=c}^{y=d} \int_{x=a}^{x=b} f(x, y)\,dx\,dy = \int_{x=a}^{x=b} \int_{y=c}^{y=d} f(x, y)\,dy\,dx. \]
Since the order of integration (i.e. dxdy or dydx) determines the order of the integral signs,
the above two double integrals can simply be written as:
\[ \int_{c}^{d} \int_{a}^{b} f(x, y)\,dx\,dy = \int_{a}^{b} \int_{c}^{d} f(x, y)\,dy\,dx. \]
Even simpler, one may denote dA := dxdy or dydx and the rectangular region by R. Then, we
can write the integral as:
\[ \iint_{R} f(x, y)\,dA. \]
When setting up a double integral to find the volume of the solid under a graph z = f ( x, y),
it is worthwhile to observe that the lower and upper limits of the integral are not affected by
the function f ( x, y). Therefore, in order to interpret a double integral in a geometric way, one
may simply draw the base region (or in other words, the top-down view) instead of drawing
the solid in the three-dimensional space.
(a) top-down view of Figure 3.2a (b) top-down view of Figure 3.2b
Figure 3.3: the red arrows represent the cross-section slices in Figures 3.2a and 3.2b.
3.2 Fubini’s Theorem for General Regions
■ Example 3.2 Find the volume of the solid under the plane z = 3 − x − y over the triangular
region R bounded by the x-axis, x = 1 and y = x.
■ Solution First we choose an order of integration, say dydx. The inner integral should
calculate the area of slices with y varying and x fixed. Since the height 3 − x − y of the solid
does not affect how we set up the upper/lower limits, we consider the top-down view of the
solid (see Figure 3.4).
The red strip in Figure 3.4b represents a sample slice with fixed x. The strip enters at
y = 0 and leaves at y = x. Hence, the area of this slice is:
\[ \int_{y=0}^{y=x} \underbrace{(3 - x - y)}_{\text{height}}\,dy. \]
“Summing up” the area of these slices, we integrate by dx over the range of x: 0 ≤ x ≤ 1 as
shown in Figure 3.4b, i.e.
\[ \int_{x=0}^{x=1} \underbrace{\int_{y=0}^{y=x} (3 - x - y)\,dy}_{\text{inner integral}}\,dx. \]
It will give the volume of the solid as required in this problem. The rest of the task is to
compute the integral:
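\[ \begin{aligned}
\int_{x=0}^{x=1} \int_{y=0}^{y=x} (3 - x - y)\,dy\,dx
&= \int_{x=0}^{x=1} \left[ 3y - xy - \frac{y^2}{2} \right]_{y=0}^{y=x} dx \\
&= \int_{x=0}^{x=1} \left( 3x - x^2 - \frac{x^2}{2} \right) dx \\
&= \int_{0}^{1} \left( 3x - \frac{3x^2}{2} \right) dx \\
&= \left[ \frac{3x^2}{2} - \frac{x^3}{2} \right]_{0}^{1} \\
&= 1.
\end{aligned} \]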
Alternatively, we can also integrate first by dx then by dy. Then, the inner integral is
represented by the red strip in Figure 3.4c. It enters at x = y and leaves at x = 1. The double
integral is therefore:
\[ \int_{y=0}^{y=1} \int_{x=y}^{x=1} (3 - x - y)\,dx\,dy. \]
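As a check that this second set-up describes the same solid, the dxdy-order integral can be evaluated directly. The following worked computation is supplied here as a verification and should give the same volume of 1:

```latex
\begin{aligned}
\int_{y=0}^{y=1} \int_{x=y}^{x=1} (3 - x - y)\,dx\,dy
&= \int_{0}^{1} \left[ 3x - \frac{x^2}{2} - xy \right]_{x=y}^{x=1} dy \\
&= \int_{0}^{1} \left( \frac{5}{2} - y \right) - \left( 3y - \frac{3y^2}{2} \right) dy \\
&= \int_{0}^{1} \left( \frac{5}{2} - 4y + \frac{3y^2}{2} \right) dy \\
&= \left[ \frac{5y}{2} - 2y^2 + \frac{y^3}{2} \right]_{0}^{1} = 1.
\end{aligned}
```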
Figure 3.4: the graph and the top-down views of the function in Example 3.2
The fact that we have the freedom to choose our order of integration is guaranteed by the
Fubini’s Theorem, whose proof is again beyond the scope of this course.
Theorem 3.2 — Fubini’s Theorem for General Regions. Let R be a region on the xy-plane and
let f(x, y) be a continuous function on R, then
\[ \iint_{R} f(x, y)\,dx\,dy = \iint_{R} f(x, y)\,dy\,dx \]
where the lower/upper limits of each integral are set up according to the region R.
Notation As there is no difference between dxdy and dydx as long as the upper and lower
limits are set according to the same region, we may simply write:
dA = dxdy or dydx
Although choosing the dxdy-order will yield the same result as the dydx-order, it happens
often that one order is easier while the other one is harder. Let’s look at the following
example:
■ Example 3.3 Let R be the region in the first quadrant of the xy-plane bounded by the unit
circle x² + y² = 1 and the straight line x + y = 1. Evaluate the integral
\[ \iint_{R} \sqrt{1 - x^2}\,dA. \]
■ Solution First choose the order of integration. However, it seems like integrating √(1 − x²)
by dx involves trig substitutions that we want to avoid if possible. Let’s try integrating first
by dy then by dx (to see if there is any luck).
Set up the double integral according to the top-down view of the solid (Figure 3.5):
\[ \int_{x=0}^{x=1} \int_{y=1-x}^{y=\sqrt{1-x^2}} \sqrt{1 - x^2}\,dy\,dx. \]
As x is regarded as a constant when dealing with the inner integral, we can easily see that:
\[ \begin{aligned}
\int_{x=0}^{x=1} \int_{y=1-x}^{y=\sqrt{1-x^2}} \sqrt{1 - x^2}\,dy\,dx
&= \int_{x=0}^{x=1} \left[ y \sqrt{1 - x^2} \right]_{y=1-x}^{y=\sqrt{1-x^2}} dx \\
&= \int_{x=0}^{x=1} \left( (1 - x^2) - (1 - x)\sqrt{1 - x^2} \right) dx \\
&= \int_{0}^{1} \left( 1 - x^2 - \sqrt{1 - x^2} + x\sqrt{1 - x^2} \right) dx
\end{aligned} \]
The term √(1 − x²) can be evaluated by the substitution x = sin θ, while the term x√(1 − x²)
can be done by substituting u = 1 − x². Readers should complete the rest of computations as
an exercise. The final answer should be 1 − π/4.
We need a trig substitution anyway, but it is easier than doing a substitution for the inner
integral.
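Readers who complete the exercise in Example 3.3 may want a numerical target to compare against. The sketch below is only an approximation: it applies a midpoint rule (chosen here for simplicity) to the single-variable integrand obtained after the inner dy-integration, and compares the result with 1 − π/4.

```python
import math

def midpoint(func, a, b, n=20000):
    # Midpoint-rule approximation of the integral of func over [a, b]
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def after_inner_integral(x):
    # Result of integrating sqrt(1 - x^2) in y from y = 1 - x to y = sqrt(1 - x^2)
    root = math.sqrt(1 - x * x)
    return (root - (1 - x)) * root

value = midpoint(after_inner_integral, 0.0, 1.0)
expected = 1 - math.pi / 4
print(value, expected)
```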
In the previous example, we see that although the Fubini’s Theorem tells us that in theory we
can choose our favorite order of integration, in practice we sometimes have to make a smart
choice. In the next example, we demonstrate a case in which one order gives an integral
which is impossible to compute, while another is extremely easy.
■ Solution The integrand (sin x)/x does not have a simple antiderivative when integrating by dx!
Let’s switch the order of integration first.
The region corresponding to the double integral is formed by strips entering at x = y and
leaving at x = 1. The range of y is from 0 to 1. A sketch of the diagram can be found in
Figure 3.6.
Switching the order of integration, the strip for each x enters at y = 0 and leaves at y = x.
Therefore, Fubini’s Theorem says:
\[ \int_{0}^{1} \int_{y}^{1} \frac{\sin x}{x}\,dx\,dy = \int_{0}^{1} \int_{0}^{x} \frac{\sin x}{x}\,dy\,dx. \]
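To see why the switched order is easy, note that (sin x)/x is a constant with respect to y, so the inner integral simply contributes a factor of x. A short worked computation, supplied here as an illustration, is:

```latex
\begin{aligned}
\int_{0}^{1} \int_{0}^{x} \frac{\sin x}{x}\,dy\,dx
&= \int_{0}^{1} \frac{\sin x}{x} \Bigl[ y \Bigr]_{y=0}^{y=x} dx \\
&= \int_{0}^{1} \sin x\,dx \\
&= \Bigl[ -\cos x \Bigr]_{0}^{1} = 1 - \cos 1 \approx 0.4597.
\end{aligned}
```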
If the region of integration is the shaded triangle below, which order of integration is better?
\[ \iint_{R} f(x, y)\,dx\,dy \quad \text{or} \quad \iint_{R} f(x, y)\,dy\,dx, \]
[Figure: shaded triangle bounded by the lines y = x, y = 1 and y = 8 − x]
depending on the order of integration drdθ or dθdr. However, one should be very cautious that
while dA = dxdy in rectangular coordinates, it is instead:
dA = rdrdθ or rdθdr
However, things go much better if we switch to polar coordinates, since the region is in
the form a ≤ r ≤ b and α ≤ θ ≤ β. From Figure 3.8, the region is defined by:
0 ≤ r ≤ 1, 0 ≤ θ ≤ π.
Annular regions, to be discussed in the next example, are extremely clumsy using rectangular
coordinates but relatively easy using polar coordinates.
2 ≤ r ≤ 4, 0 ≤ θ ≤ 2π.
Therefore,
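\[ \begin{aligned}
\iint_{R} x\,dA &= \int_{0}^{2\pi} \int_{2}^{4} r\cos\theta \cdot r\,dr\,d\theta \\
&= \int_{0}^{2\pi} \int_{2}^{4} r^2 \cos\theta\,dr\,d\theta \\
&= \int_{0}^{2\pi} \left[ \frac{r^3}{3} \right]_{r=2}^{r=4} \cos\theta\,d\theta \\
&= \int_{0}^{2\pi} \frac{56}{3} \cos\theta\,d\theta \\
&= \left[ \frac{56}{3} \sin\theta \right]_{0}^{2\pi} \\
&= 0
\end{aligned} \]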
does not have an easy explicit expression. Nonetheless, this integral is extremely important
in probability, heat flow and many physics and engineering subjects. While this integral is
generally hard to compute, it can be magically done, with the help of a double integral, when
the upper/lower limits are:
\[ \int_{0}^{\infty} e^{-x^2}\,dx. \]
Observing that e^{−x²−y²} = e^{−x²} e^{−y²}, we can regard e^{−y²} as a constant in the inner dx-integral:
\[ \int_{0}^{\infty}\!\!\int_{0}^{\infty} e^{-x^2 - y^2}\,dx\,dy = \int_{0}^{\infty}\!\!\int_{0}^{\infty} e^{-x^2} e^{-y^2}\,dx\,dy = \int_{0}^{\infty} e^{-y^2} \left( \int_{0}^{\infty} e^{-x^2}\,dx \right) dy. \]
Now ∫_0^∞ e^{−x²} dx is a constant; in particular, it is independent of y. Therefore, with respect
to the outer dy-integral, one can factor it out to get:
\[ \int_{0}^{\infty} e^{-y^2} \left( \int_{0}^{\infty} e^{-x^2}\,dx \right) dy = \left( \int_{0}^{\infty} e^{-x^2}\,dx \right) \left( \int_{0}^{\infty} e^{-y^2}\,dy \right). \]
Now the double integral is split as a product of two single-variable integrals. Note that the
two single-variable integrals are the same as x and y are merely dummy variables! Therefore,
combining everything we have shown above, we have:
\[ \int_{0}^{\infty}\!\!\int_{0}^{\infty} e^{-x^2 - y^2}\,dx\,dy = \left( \int_{0}^{\infty} e^{-x^2}\,dx \right) \left( \int_{0}^{\infty} e^{-y^2}\,dy \right) = \left( \int_{0}^{\infty} e^{-x^2}\,dx \right)^{2}. \]
Consequently, in order to evaluate the single-variable integral ∫_0^∞ e^{−x²} dx, one can first evaluate
the double integral
\[ \int_{0}^{\infty}\!\!\int_{0}^{\infty} e^{-x^2 - y^2}\,dx\,dy \]
and then the square root of the double integral will give the value of ∫_0^∞ e^{−x²} dx.
In contrast to the single-variable integral, the double integral is relatively easy to find
using polar coordinates. The region of integration is the entire first quadrant which, in polar
coordinates, can be described as:
0 ≤ r < ∞, 0 ≤ θ ≤ π/2.
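Therefore,
\[ \begin{aligned}
\int_{0}^{\infty}\!\!\int_{0}^{\infty} e^{-x^2 - y^2}\,dx\,dy
&= \int_{0}^{\pi/2}\!\!\int_{0}^{\infty} e^{-r^2}\, r\,dr\,d\theta \\
&= \int_{0}^{\pi/2} \left[ -\frac{1}{2} e^{-r^2} \right]_{r=0}^{r=\infty} d\theta
\qquad \text{note that } \frac{d}{dr} e^{-r^2} = e^{-r^2} \cdot (-2r) \\
&= \int_{0}^{\pi/2} \left( -0 + \frac{1}{2} \right) d\theta \\
&= \frac{\pi}{4}.
\end{aligned} \]
Therefore,
\[ \int_{0}^{\infty} e^{-x^2}\,dx = \sqrt{\frac{\pi}{4}} = \frac{\sqrt{\pi}}{2}. \]
Furthermore, since e^{−x²} is an even function, we also have:
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \]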
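The value √π/2 can be compared against a direct numerical approximation. The sketch below is an illustration under two stated assumptions: the infinite range is truncated at x = 10 (where e^{−x²} is already far below machine precision), and a midpoint rule is used as the quadrature.

```python
import math

def midpoint(func, a, b, n=100000):
    # Midpoint-rule approximation of the integral of func over [a, b]
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Truncate the range [0, infinity) at 10; the neglected tail is smaller than e^(-100)
half_line = midpoint(lambda x: math.exp(-x * x), 0.0, 10.0)
exact = math.sqrt(math.pi) / 2
print(half_line, exact)

# By evenness of the integrand, the integral over the whole real line is twice as large
whole_line = 2 * half_line
print(whole_line, math.sqrt(math.pi))
```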
However, this trick does not work well for any similar definite integral such as:
\[ \int_{0}^{1} e^{-x^2}\,dx. \]
the double integral is not “polar-friendly” since the region of integration is a square!
Some triple integrals have important physical meanings. For instance, if f ( x, y, z) is the
density function, then the triple integral
\[ \int_{0}^{1} \int_{0}^{y} \int_{z}^{y} f(x, y, z)\,dx\,dz\,dy \]
represents the mass of the solid (whose shape is determined by the upper/lower limits of the
integral). It is because dxdzdy represents (infinitesimal) volume, and
In physics, some calculations such as the center of mass of an object and the moment of inertia
about an axis also involve evaluations of triple integrals.
Pillar-Base Approach
Before we see some practical applications of triple integrals, let’s learn how a triple integral is
set up. One helpful technique is the so-called pillar-base approach or pillar-shadow approach.
Let’s explain this through an example:
■ Example 3.7 Let D be the tetrahedral solid bounded by the plane z = y − x, the plane
y = 1, the xy-plane and the yz-plane (see Figure 3.11). Evaluate the triple integral:
\[ \iiint_{D} x^2\,dz\,dy\,dx. \]
■ Solution We demonstrate the pillar-base approach in the solution. The so-called pillar is the
orange ray labeled M in Figure 3.11, and it determines how the inner-most integral is set up:
\[ \int_{z=?}^{z=?} x^2\,dz. \]
Its lower and upper limits of z are determined by, respectively, where the pillar enters the
solid and where the pillar leaves the solid. According to the diagram, it enters at z = 0, i.e.
the xy-plane; and it leaves at z = y − x. Therefore, the inner-most integral should be:
\[ \int_{z=0}^{z=y-x} x^2\,dz. \]
Note that again the integrand x² does not affect how we set up this integral.
Next we call the other two variables x and y the base variables or shadow variables.
Suppose light is coming in along the direction parallel to the pillar (i.e. the z-direction); the
shadow of the solid then appears on the base xy-plane as a triangle. The way to set up the middle
and outer-most integrals is just the same as what we did for double integrals.
Since we picked the dydx-order, we draw a sample strip L on the base in the direction of
y. This strip enters at y = x and leaves at y = 1, and therefore the middle and outer-most
integrals should be set up as:
\[ \underbrace{\int_{x=0}^{x=1} \int_{y=x}^{y=1}}_{\text{base}} \; \underbrace{\int_{z=0}^{z=y-x}}_{\text{pillar}} x^2\,dz\,dy\,dx. \]
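Finally, the easiest step is to evaluate the integral. This is as straightforward as in double
integrals:
\[ \begin{aligned}
\int_{0}^{1} \int_{x}^{1} \int_{0}^{y-x} x^2\,dz\,dy\,dx
&= \int_{0}^{1} \int_{x}^{1} \left[ x^2 z \right]_{z=0}^{z=y-x} dy\,dx \\
&= \int_{0}^{1} \int_{x}^{1} x^2 (y - x)\,dy\,dx \\
&= \int_{0}^{1} \left[ \frac{x^2 y^2}{2} - x^3 y \right]_{y=x}^{y=1} dx \\
&= \int_{0}^{1} \left( \frac{x^2}{2} - x^3 - \frac{x^4}{2} + x^4 \right) dx \\
&= \int_{0}^{1} \left( \frac{x^2}{2} - x^3 + \frac{x^4}{2} \right) dx \\
&= \left[ \frac{x^3}{6} - \frac{x^4}{4} + \frac{x^5}{10} \right]_{0}^{1} \\
&= \frac{1}{6} - \frac{1}{4} + \frac{1}{10} = \frac{1}{60}.
\end{aligned} \]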
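The pillar-base set-up maps naturally onto nested one-dimensional integrations, with the pillar as the inner-most call. The sketch below is a numerical illustration only: it uses composite Simpson’s rule (a choice made here, not taken from the text), and the function names are hypothetical.

```python
def simpson(func, a, b, n=50):
    # Composite Simpson's rule on [a, b] with n subintervals (n must be even)
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def pillar(x, y):
    # Inner-most integral: z runs from 0 (xy-plane) up to the plane z = y - x
    return simpson(lambda z: x * x, 0.0, y - x)

def strip(x):
    # Middle integral: on the base, the strip enters at y = x and leaves at y = 1
    return simpson(lambda y: pillar(x, y), x, 1.0)

# Outer-most integral: x ranges over 0 <= x <= 1
triple_integral = simpson(strip, 0.0, 1.0)
print(triple_integral, 1 / 60)
```

Notice that the limits of each inner call depend only on the variables of the outer calls, exactly as in the hand computation.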
The Fubini’s Theorem also holds for triple integrals, i.e. if we evaluate the integral using a
different order of integration, say dxdydz or dydxdz (there are 6 possible orders), we should get
the same value provided that the upper and lower limits are adjusted to represent the same solid.
■ Example 3.8 Let D be the tetrahedral solid in Example 3.7. Evaluate the triple integral
\[ \iiint_{D} x^2\,dy\,dz\,dx \]
■ Solution Now the inner-most integral is with respect to dy, meaning the pillar is pointing
along the positive y-axis. It is the ray labeled as M in Figure 3.12. It enters the solid through
the plane z = y − x, or equivalently y = x + z, and it leaves at y = 1. Therefore, the
The middle and the outer-most variables are z and x, so the base is the shadow of the
solid on the xz-plane, which is the triangle labeled R in Figure 3.12.
Here the order of integration is dzdx, so we draw a sample strip (labeled L) along the
z-axis direction. It enters the region through z = 0 and leaves at the line x + z = 1, or
equivalently, z = 1 − x. Therefore, the whole triple integral should be set as:
\[ \int_{x=0}^{x=1} \int_{z=0}^{z=1-x} \int_{y=x+z}^{y=1} x^2\,dy\,dz\,dx. \]
It can be verified by straight-forward computations that the answer should be the same as
we got in Example 3.7:
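\[ \begin{aligned}
\int_{x=0}^{x=1} \int_{z=0}^{z=1-x} \int_{y=x+z}^{y=1} x^2\,dy\,dz\,dx
&= \int_{0}^{1} \int_{0}^{1-x} \left[ x^2 y \right]_{y=x+z}^{y=1} dz\,dx \\
&= \int_{0}^{1} \int_{0}^{1-x} \left( x^2 - x^2 (x + z) \right) dz\,dx \\
&= \int_{0}^{1} \left[ (x^2 - x^3) z - \frac{x^2 z^2}{2} \right]_{z=0}^{z=1-x} dx \\
&= \int_{0}^{1} \left( (x^2 - x^3)(1 - x) - \frac{x^2 (1 - x)^2}{2} \right) dx \\
&= \int_{0}^{1} \left( \frac{x^2}{2} - x^3 + \frac{x^4}{2} \right) dx = \frac{1}{60} \quad \text{(same!)}
\end{aligned} \]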
Figure 3.12: the pillar-base diagram for the triple integral in Example 3.8
The Fubini’s Theorem tells us that no matter what order of integration we choose, we
always get the same answer. Therefore, we can sometimes use the notation dV to denote the
volume element dxdydz (or any other order). A generic triple integral can be written as:
\[ \iiint_{D} f(x, y, z)\,dV. \]
Although Fubini’s Theorem allows us to switch the order of integration, we sometimes need to
make a smart choice of pillar and base variables to ease our computation. Let’s look at the
next example:
■ Example 3.9 Let D be the solid bounded by the paraboloids y = x² + z² and y = 16 − 3x² − z²
(see Figure 3.13). Find the volume of the solid, i.e. evaluate the integral:
\[ \iiint_{D} 1\,dV. \]
■ Solution After taking a careful look at the diagram, one should see that it would be a bad
choice if we chose either x or z to be the pillar variable. The solid can be decomposed into
two parts as shown in the diagram. If z were chosen to be the pillar direction, then the pillar
would enter the yellow part in a way different from how it does in the blue part. The yellow
part would have z = ±√(16 − 3x² − y) as the lower/upper limits, while the blue part would have
z = ±√(y − x²). Even worse, the part near the intersection of the blue and the yellow surfaces
is geometrically complicated – it is not easy to set up the inner integral for that part.
However, life is much easier if we choose y as the pillar variable, since the y-pillar
will then enter through the blue surface and leave through the yellow surface. The shadow is an
ellipse on the xz-plane.
To set up the inner integral, we note that the y-pillar enters at y = x² + z² and leaves at
y = 16 − 3x² − z²:
\[ \int_{y = x^2 + z^2}^{y = 16 - 3x^2 - z^2} dy. \]
The shadow (or base) is an ellipse, whose equation can be obtained by setting y = x² + z² =
16 − 3x² − z², which gives
2x² + z² = 8.
Figure 3.13: the pillar-base diagram for the triple integral in Example 3.9
x = r cos θ
y = r sin θ
z=z
where r ≥ 0, 0 ≤ θ ≤ 2π and z can be any real number. Just like polar coordinates, it is good
to keep in mind that x2 + y2 = r2 , which will sometimes simplify your calculations. Figure
3.14 explains the geometry of these conversion rules.
If one sets r = constant, then it describes an infinite cylinder in R3 with z-axis as the
central axis. Therefore, if the solid is cylindrical in shape, it is usually easier to set-up a triple
3.5 Triple Integrals in Cylindrical Coordinates 73
integral using cylindrical coordinates. Analogous to polar coordinates, the volume element dV
is given by the following theorem:
Theorem 3.3 Under cylindrical coordinates (r, θ, z), we have:
\[ dV = r\, dz\, dr\, d\theta. \]
■ Example 3.10 Find the volume of the solid bounded by the two surfaces $z = 4 - 4(x^2 + y^2)$ and
$z = (x^2 + y^2)^2 - 1$, as shown in Figure 3.15.
■ Solution When using cylindrical coordinates, one often (though not always) set z as the
\[ r^4 - 1 = z = 4 - 4r^2 \]
which gives r = 1. Therefore, the outer and middle integrals have upper and lower limits
given by:
\[ \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1}. \]
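Combining all of the above, the volume of the solid is given by:
\[
\begin{aligned}
\iiint_D dV &= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \int_{z=r^4-1}^{z=4-4r^2} 1 \cdot r\, dz\, dr\, d\theta \\
&= \int_0^{2\pi} \int_0^1 [z]_{z=r^4-1}^{z=4-4r^2}\, r\, dr\, d\theta \\
&= \int_0^{2\pi} \int_0^1 (4 - 4r^2 - r^4 + 1)\, r\, dr\, d\theta \\
&= \int_0^{2\pi} \int_0^1 (5r - 4r^3 - r^5)\, dr\, d\theta \\
&= \int_0^{2\pi} \left[ \frac{5r^2}{2} - r^4 - \frac{r^6}{6} \right]_{r=0}^{r=1} d\theta \\
&= \int_0^{2\pi} \left( \frac{5}{2} - 1 - \frac{1}{6} \right) d\theta = \frac{8\pi}{3}.
\end{aligned}
\]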
■ Example 3.11 Let D be a solid cylinder of radius a with the z-axis as its central axis, bounded
by the planes $z = z_0$ and $z = z_0 + h$, where $z_0$ and h are constants. Therefore, the cylinder has
height h. Suppose the solid has uniform density δ. Evaluate the following triple integral:
\[ I_z := \iiint_D \delta (x^2 + y^2)\, dV \]
■ Solution We choose the order of integration dzdrdθ. The z-pillar enters the solid at $z = z_0$
and leaves at $z = z_0 + h$. The shadow is the disk of radius a centered at the origin.
74 Multiple Integrations
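Therefore,
\[
\begin{aligned}
I_z &= \int_0^{2\pi} \int_0^a \int_{z_0}^{z_0+h} \delta \underbrace{(x^2 + y^2)}_{r^2} \cdot r\, dz\, dr\, d\theta \\
&= \int_0^{2\pi} \int_0^a \int_{z_0}^{z_0+h} \delta r^3\, dz\, dr\, d\theta \\
&= \int_0^{2\pi} \int_0^a \delta r^3 h\, dr\, d\theta \\
&= \int_0^{2\pi} \left[ \frac{\delta h r^4}{4} \right]_{r=0}^{r=a} d\theta \\
&= \int_0^{2\pi} \frac{\delta h a^4}{4}\, d\theta \\
&= \frac{\delta h \pi a^4}{2}.
\end{aligned}
\]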
Most physics and engineering textbooks express the moment of inertia in terms of the total
mass m rather than the density δ. To rewrite the above answer in terms of m, we note that:
\[ \delta = \frac{m}{V} \]
where V is the total volume of the solid, which is $\pi a^2 h$. Therefore, combining with the above
calculation, we have:
\[ I_z = \frac{m}{\pi a^2 h} \cdot \frac{h \pi a^4}{2} = \frac{m a^2}{2}, \]
which is exactly what you can find in physics or engineering textbooks.
3.6 Triple Integrals in Spherical Coordinates 75
ρ ≥ 0, 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π.
From standard trigonometry, one can figure out the following conversion rules:
x = ρ sin ϕ cos θ
y = ρ sin ϕ sin θ
z = ρ cos ϕ
dV = ρ2 sin ϕ dρdϕdθ.
Infinitesimally, ρ2 sin ϕ dρdϕdθ is the volume of the little cube in red in Figure 3.18. The
dimensions of the little cube can be found using trigonometry.
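There is a general way to derive a formula for dV in any coordinate system.
Suppose each coordinate of (x, y, z) is a function of (u, v, w). The Jacobian matrix is defined to
be:
\[
\frac{\partial(x, y, z)}{\partial(u, v, w)} =
\begin{bmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\
\frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w}
\end{bmatrix}.
\]
There is a general conversion formula (whose proof is beyond the scope of the course):
\[ dx\, dy\, dz = \left| \det \frac{\partial(x, y, z)}{\partial(u, v, w)} \right| du\, dv\, dw. \tag{3.1} \]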
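Let’s take cylindrical coordinates as an example. The coordinates (x, y, z) are related to (r, θ, z) by the
conversion rules:
\[ x = r\cos\theta, \quad y = r\sin\theta, \quad z = z. \]
Therefore, the Jacobian matrix is given by:
\[
\frac{\partial(x, y, z)}{\partial(r, \theta, z)} =
\begin{bmatrix}
\frac{\partial}{\partial r}(r\cos\theta) & \frac{\partial}{\partial \theta}(r\cos\theta) & \frac{\partial}{\partial z}(r\cos\theta) \\
\frac{\partial}{\partial r}(r\sin\theta) & \frac{\partial}{\partial \theta}(r\sin\theta) & \frac{\partial}{\partial z}(r\sin\theta) \\
\frac{\partial}{\partial r} z & \frac{\partial}{\partial \theta} z & \frac{\partial}{\partial z} z
\end{bmatrix}
=
\begin{bmatrix}
\cos\theta & -r\sin\theta & 0 \\
\sin\theta & r\cos\theta & 0 \\
0 & 0 & 1
\end{bmatrix}.
\]
It is straightforward to compute that $\det \frac{\partial(x, y, z)}{\partial(r, \theta, z)} = r\cos^2\theta - (-r\sin^2\theta) = r$. Therefore, (3.1)
shows:
\[ dx\, dy\, dz = r\, dr\, d\theta\, dz. \]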
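The Jacobian determinant can also be checked numerically. The sketch below is an illustration rather than part of the course material: it approximates the partial derivatives by central differences and compares the determinant with r for cylindrical coordinates and with ρ² sin ϕ for spherical coordinates. The helper name `jacobian_det`, the sample points, and the variable ordering (ρ, ϕ, θ) for the spherical map are all assumptions made for this example.

```python
import math

def jacobian_det(f, p, h=1e-6):
    """Determinant of the 3x3 Jacobian of f at the point p, via central differences."""
    cols = []
    for k in range(3):
        hi = list(p)
        lo = list(p)
        hi[k] += h
        lo[k] -= h
        fh, fl = f(*hi), f(*lo)
        cols.append([(fh[i] - fl[i]) / (2 * h) for i in range(3)])
    # cols[k][i] approximates d f_i / d p_k; build the rows of the Jacobian matrix
    a, b, c = ([cols[k][i] for k in range(3)] for i in range(3))
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def cylindrical(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)

def spherical(rho, phi, theta):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

det_cyl = jacobian_det(cylindrical, (2.0, 0.7, 1.5))  # should be close to r = 2
det_sph = jacobian_det(spherical, (2.0, 0.9, 0.4))    # should be close to rho^2 sin(phi)
print(det_cyl, det_sph, 2.0**2 * math.sin(0.9))
```

Listing the spherical variables in the order (ρ, θ, ϕ) instead flips the sign of the determinant, which is why the volume element uses its absolute value.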
■ Example 3.12 Consider a solid sphere S of radius r centered at the origin. Suppose it has
uniform density δ. Derive the moment of inertia about the z-axis, which is given by:
\[ I_z := \iiint_S \delta (x^2 + y^2)\, dV \]
0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π.
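Therefore,
\[
\begin{aligned}
I_z &= \int_0^{2\pi} \int_0^{\pi} \int_0^{r} \delta \cdot \rho^2 \sin^2\phi \cdot \rho^2 \sin\phi\, d\rho\, d\phi\, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi} \int_0^{r} \delta \rho^4 \sin^3\phi\, d\rho\, d\phi\, d\theta \\
&= \left( \int_0^{2\pi} d\theta \right) \left( \int_0^{\pi} \sin^3\phi\, d\phi \right) \left( \int_0^{r} \delta \rho^4\, d\rho \right) \\
&= 2\pi \cdot \int_0^{\pi} (1 - \cos^2\phi) \sin\phi\, d\phi \cdot \frac{\delta r^5}{5} \\
&= 2\pi \cdot \left[ -\cos\phi + \frac{\cos^3\phi}{3} \right]_0^{\pi} \cdot \frac{\delta r^5}{5} \\
&= 2\pi \cdot \left( 1 - \frac{1}{3} + 1 - \frac{1}{3} \right) \cdot \frac{\delta r^5}{5} \\
&= \frac{8\pi \delta r^5}{15}.
\end{aligned}
\]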
Although it is a perfectly acceptable answer, the moment of inertia in many physics and
engineering books is expressed in terms of the total mass m rather than of the density. Since
the density in this problem is uniform, one can easily derive the moment of inertia formula
in terms of m:
\[ I_z = \frac{8\pi r^5}{15} \cdot \frac{m}{\frac{4\pi r^3}{3}} = \frac{2mr^2}{5} \]
which is what one can find in physics or engineering books.
Conical objects are another type of solid compatible with spherical coordinates.
The inequality $0 \le \phi \le \frac{\pi}{3}$ represents an infinite solid cone above the xy-plane with cone angle
$\frac{\pi}{3}$ measured from the positive z-axis. If one further imposes the inequality 0 ≤ ρ ≤ 1, which
represents the solid sphere with radius 1 centered at the origin, then the combined inequalities
\[ 0 \le \rho \le 1, \quad 0 \le \phi \le \frac{\pi}{3} \]
represent the common part of the sphere and the cone, which has an ice-cream-cone shape.
■ Example 3.13 Find the volume of the “ice-cream cone” D cut from the solid sphere ρ ≤ 1
by the cone $\phi = \frac{\pi}{3}$, as shown in Figure 3.19.
■ Solution The volume is given by $\iiint_D \rho^2 \sin\phi\, d\rho\, d\phi\, d\theta$. The limits of the innermost integral
are determined by where the ρ-ray, labeled M in the diagram, enters the solid and where it
leaves the solid. Evidently, it enters from the origin ρ = 0. When it leaves, it always leaves
from the sphere part and never the cone part, so the upper limit should be ρ = 1.
The ϕ-angle runs from 0 to $\frac{\pi}{3}$, and the θ-ray, labeled L in the diagram, sweeps over the
shadow from 0 to 2π.
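Combining all these limits, the volume is given by the integral:
\[
\begin{aligned}
\int_{\theta=0}^{\theta=2\pi} \int_{\phi=0}^{\phi=\pi/3} \int_{\rho=0}^{\rho=1} \rho^2 \sin\phi\, d\rho\, d\phi\, d\theta
&= \left( \int_0^{2\pi} d\theta \right) \left( \int_0^{\pi/3} \sin\phi\, d\phi \right) \left( \int_0^1 \rho^2\, d\rho \right) \\
&= 2\pi \cdot [-\cos\phi]_0^{\pi/3} \cdot \frac{1}{3} \\
&= 2\pi \cdot \left( -\frac{1}{2} + 1 \right) \cdot \frac{1}{3} = \frac{\pi}{3}.
\end{aligned}
\]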
Stephen Hawking
F(1, 0, 0) = − GMm i,
With the help of computer software (such as Mathematica), one can visualize a vector field
easily. A plot of the above gravitational field can be found in Figure 4.1.
The general form of a vector field in R3 is given by:
Note: Do NOT confuse $F_x$ with the partial derivative $\frac{\partial F}{\partial x}$! Although we used the subscript
notation $F_x$ to denote a partial derivative before, we should avoid using it in this chapter.
Here $F_x$ means the x-component of the vector field F.
Many vector fields in physics are three-dimensional. To begin with, we will also study
two-dimensional vector fields. A two-dimensional vector field is a collection of vectors in R2
that are denoted by a vector-valued function F : R2 → R2 , which assigns to each point ( x, y) a
vector F( x, y). The general form of a two-dimensional vector field is given by:
F( x, y) = Fx ( x, y)i + Fy ( x, y)j.
Figure 4.2 shows the plot of two examples of two-dimensional vector fields:
[Figure 4.2: plots of the two-dimensional vector fields (1 − y²)i = ⟨1 − y², 0⟩ and −yi + xj = ⟨−y, x⟩]
\[ -GMm\, \frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{(x^2 + y^2 + z^2)^{3/2}} = -GMm\, \frac{\langle x, y, z \rangle}{(x^2 + y^2 + z^2)^{3/2}} \]
4.2 Line Integrals of Vector Fields 81
Work = F · ∆r.
However, if the force is not a constant, meaning that either the direction or the magnitude
is not uniform, the work done by the force is not as simple as stated above. Likewise, if the
path is not a straight-path but a curved one so that ∆r is changing over time, the work done
by the force is again a bit more complicated. Line integrals of vector fields are introduced to
handle these more complicated scenarios.
Definition 4.1 — Line Integrals of Vector Fields. Given a continuous vector field F( x, y, z) and
a path C which is parametrized by r(t), a ≤ t ≤ b, the line integral of F over C is defined to
be:
\[ \int_a^b \mathbf{F} \cdot \mathbf{r}'(t)\, dt. \]
Let’s first look at some computational examples before we explain the physical and geometric
meanings of line integrals.
■ Example 4.1 Let F(x, y) = −yi + xj and C be the counter-clockwise path along the circular
arc from (1, 0) to (0, 1). Find the line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$.
\[ \mathbf{r}(t) = (\cos t)\mathbf{i} + (\sin t)\mathbf{j}, \quad 0 \le t \le \frac{\pi}{2}. \]
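By the definition of line integrals,
\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_{t=0}^{t=\pi/2} \mathbf{F} \cdot \mathbf{r}'(t)\, dt \\
&= \int_0^{\pi/2} (-y\mathbf{i} + x\mathbf{j}) \cdot ((-\sin t)\mathbf{i} + (\cos t)\mathbf{j})\, dt \\
&= \int_0^{\pi/2} (y \sin t + x \cos t)\, dt.
\end{aligned}
\]
Along the path C, we have x = cos t and y = sin t according to the parametrization, so we
have:
\[
\int_C \mathbf{F} \cdot d\mathbf{r} = \int_0^{\pi/2} \left( \sin^2 t + \cos^2 t \right) dt = \int_0^{\pi/2} 1\, dt = \frac{\pi}{2}.
\]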
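The definition translates directly into a small numerical routine. The sketch below is illustrative and not taken from the notes: it replaces r′(t) dt by the small displacement r(t_{i+1}) − r(t_i) and evaluates F at the midpoint parameter value. The name `line_integral` and the number of steps are assumptions for this example.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Approximate the line integral of F along r(t), a <= t <= b, by summing
    F(r(midpoint)) . (r(t_{i+1}) - r(t_i)) over n small steps."""
    dt = (b - a) / n
    total = 0.0
    prev = r(a)
    for i in range(n):
        nxt = r(a + (i + 1) * dt)
        mid = r(a + (i + 0.5) * dt)
        total += sum(f * (q - p) for f, p, q in zip(F(*mid), prev, nxt))
        prev = nxt
    return total

F = lambda x, y: (-y, x)
quarter_circle = lambda t: (math.cos(t), math.sin(t))

value = line_integral(F, quarter_circle, 0.0, math.pi / 2)
print(value, math.pi / 2)  # the two numbers should agree closely
```

Because the routine only needs point evaluations of r(t), it works unchanged for paths in three dimensions.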
82 Vector Calculus
Line integrals in three dimensions can be computed in exactly the same way as in two
dimensions. Let’s see one example:
■ Solution We are given the parametrization in the problem so we can proceed to the
computation of the line integral:
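\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_0^1 \left\langle y - x^2,\ z - y^2,\ x - z^2 \right\rangle \cdot \mathbf{r}'(t)\, dt \\
&= \int_0^1 \left\langle y - x^2,\ z - y^2,\ x - z^2 \right\rangle \cdot \left\langle 1,\ 2t,\ 3t^2 \right\rangle dt \\
&= \int_0^1 \left( y - x^2 + 2t(z - y^2) + 3t^2 (x - z^2) \right) dt.
\end{aligned}
\]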
C = C1 + C2 + C3 + C4 + C5 .
[Figure 4.4: the directed path C = L1 + Γ1 + L2 + Γ2]
■ Example 4.3 Consider the directed path C = L1 + Γ1 + L2 + Γ2 starting from (0, 3), first
along the y-axis to the point (0, 1), then along the circular arc Γ1 to (1, 0), then along the
x-axis to (3, 0), and finally back to the point (0, 3) along the circular arc Γ2. See Figure 4.4.
Compute the line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$ where F is given by:
\[ \mathbf{F}(x, y) = -y\mathbf{i} + x\mathbf{j}. \]
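■ Solution Since
\[
\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{L_1} \mathbf{F} \cdot d\mathbf{r} + \int_{\Gamma_1} \mathbf{F} \cdot d\mathbf{r} + \int_{L_2} \mathbf{F} \cdot d\mathbf{r} + \int_{\Gamma_2} \mathbf{F} \cdot d\mathbf{r},
\]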
r ( t ) = a + t ( b − a ), 0 ≤ t ≤ 1.
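Hence,
\[
\begin{aligned}
\int_{L_1} \mathbf{F} \cdot d\mathbf{r} &= \int_0^1 \mathbf{F} \cdot \mathbf{r}_1'(t)\, dt \\
&= \int_0^1 \langle -y, x \rangle \cdot \langle 0, -2 \rangle\, dt \\
&= \int_0^1 \langle -(3 - 2t), 0 \rangle \cdot \langle 0, -2 \rangle\, dt \\
&= \int_0^1 0\, dt = 0.
\end{aligned}
\]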
Next we consider Γ1, which is a clockwise circular arc of unit radius. The reverse path,
commonly denoted as −Γ1, is a counter-clockwise circular arc of unit radius which can be
parametrized by:
\[ -\Gamma_1 : \mathbf{r}_2(t) = \langle \cos t, \sin t \rangle, \quad 0 \le t \le \frac{\pi}{2}. \]
We will compute $\int_{-\Gamma_1} \mathbf{F} \cdot d\mathbf{r}$; then the line integral over the original path Γ1 is given by:
\[ \int_{\Gamma_1} \mathbf{F} \cdot d\mathbf{r} = -\int_{-\Gamma_1} \mathbf{F} \cdot d\mathbf{r}. \]
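\[
\begin{aligned}
\int_{-\Gamma_1} \mathbf{F} \cdot d\mathbf{r} &= \int_0^{\pi/2} \mathbf{F} \cdot \mathbf{r}_2'(t)\, dt \\
&= \int_0^{\pi/2} \langle -y, x \rangle \cdot \langle -\sin t, \cos t \rangle\, dt \\
&= \int_0^{\pi/2} \langle -\sin t, \cos t \rangle \cdot \langle -\sin t, \cos t \rangle\, dt \\
&= \int_0^{\pi/2} (\sin^2 t + \cos^2 t)\, dt = \int_0^{\pi/2} 1\, dt = \frac{\pi}{2}.
\end{aligned}
\]
Therefore,
\[ \int_{\Gamma_1} \mathbf{F} \cdot d\mathbf{r} = -\int_{-\Gamma_1} \mathbf{F} \cdot d\mathbf{r} = -\frac{\pi}{2}. \]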
Similarly, we parametrize L2 :
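\[
\begin{aligned}
\int_{L_2} \mathbf{F} \cdot d\mathbf{r} &= \int_0^1 \langle -y, x \rangle \cdot \mathbf{r}_3'(t)\, dt \\
&= \int_0^1 \langle 0, 1 + 2t \rangle \cdot \langle 2, 0 \rangle\, dt \\
&= \int_0^1 0\, dt = 0.
\end{aligned}
\]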
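For the circular arc Γ2 with radius 3, the parametrization is given by:
\[ \Gamma_2 : \mathbf{r}_4(t) = \langle 3\cos t, 3\sin t \rangle, \quad 0 \le t \le \frac{\pi}{2}. \]
\[
\begin{aligned}
\int_{\Gamma_2} \mathbf{F} \cdot d\mathbf{r} &= \int_0^{\pi/2} \langle -y, x \rangle \cdot \langle -3\sin t, 3\cos t \rangle\, dt \\
&= \int_0^{\pi/2} \langle -3\sin t, 3\cos t \rangle \cdot \langle -3\sin t, 3\cos t \rangle\, dt \\
&= \int_0^{\pi/2} (9\sin^2 t + 9\cos^2 t)\, dt = \int_0^{\pi/2} 9\, dt = \frac{9\pi}{2}.
\end{aligned}
\]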
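Finally, we sum up these results to find the value of the desired line integral:
\[
\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{L_1} \mathbf{F} \cdot d\mathbf{r} + \int_{\Gamma_1} \mathbf{F} \cdot d\mathbf{r} + \int_{L_2} \mathbf{F} \cdot d\mathbf{r} + \int_{\Gamma_2} \mathbf{F} \cdot d\mathbf{r} = 0 - \frac{\pi}{2} + 0 + \frac{9\pi}{2} = 4\pi.
\]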
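The four pieces of Example 4.3 can be checked with a short script. This is a sketch under stated assumptions: the helper `line_integral` is a hypothetical midpoint-type approximation written for this illustration, and the clockwise arc Γ1 is parametrized directly as ⟨sin t, cos t⟩ rather than through the reverse path −Γ1.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Sum F(r(midpoint)) . (r(t_{i+1}) - r(t_i)) over n small steps of [a, b]."""
    dt = (b - a) / n
    total = 0.0
    prev = r(a)
    for i in range(n):
        nxt = r(a + (i + 1) * dt)
        mid = r(a + (i + 0.5) * dt)
        total += sum(f * (q - p) for f, p, q in zip(F(*mid), prev, nxt))
        prev = nxt
    return total

F = lambda x, y: (-y, x)
half_pi = math.pi / 2
segments = {
    "L1": (lambda t: (0.0, 3.0 - 2.0 * t), 0.0, 1.0),                        # (0,3) to (0,1)
    "Gamma1": (lambda t: (math.sin(t), math.cos(t)), 0.0, half_pi),          # clockwise, (0,1) to (1,0)
    "L2": (lambda t: (1.0 + 2.0 * t, 0.0), 0.0, 1.0),                        # (1,0) to (3,0)
    "Gamma2": (lambda t: (3 * math.cos(t), 3 * math.sin(t)), 0.0, half_pi),  # (3,0) to (0,3)
}
pieces = {name: line_integral(F, r, a, b) for name, (r, a, b) in segments.items()}
total = sum(pieces.values())
print(pieces)
print(total, 4 * math.pi)
```

The two straight segments contribute essentially nothing because F is perpendicular to the direction of travel along the axes.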
\[ \int_a^b \mathbf{F} \cdot \mathbf{r}'(t)\, dt \simeq \sum_i \mathbf{F} \cdot \mathbf{r}'(t_i)\, \Delta t_i \]
Assume each subdivision is so small that F and r′ are roughly constant in each subdivision.
By the definition of derivatives, we have:
\[ \mathbf{r}'(t_i) \simeq \frac{\Delta \mathbf{r}(t_i)}{\Delta t_i}. \]
Therefore,
\[ \int_a^b \mathbf{F} \cdot \mathbf{r}'(t)\, dt \simeq \sum_i \mathbf{F} \cdot \Delta \mathbf{r}(t_i) \]
Since F · ∆r(ti) is the work done by F over the displacement ∆r(ti) (as both are roughly
constant), summing them up gives the approximate total work done by the force over the
whole path C. As n → ∞, the subdivisions become infinitesimal and ∑i F · ∆r(ti) becomes
more accurate and approaches the total work done by the force.
The value of the integral is determined by many factors, including the length of C, the
magnitude of F and the velocity of r(t). However, the sign of F · r′(t) is solely determined by
the angle θ between F and the tangent vector r′(t) of the curve C, since:
\[ \mathbf{F} \cdot \mathbf{r}'(t) = |\mathbf{F}|\, |\mathbf{r}'(t)| \cos\theta. \]
It is positive when F and r′(t) make an acute angle, and is negative when they make an obtuse
angle. Therefore, the sign of the line integral can reveal whether the path C is overall along
or against the direction of the vector field F. The more positive the value, the more
the path travels along the vector field on average. Let’s illustrate this point through the
following example:
Consider the vector field shown in Figure 4.5 and the two paths C1 and C2 with directions
indicated in the diagram. Along the path C1, the velocity vector points against the vector
field at the beginning of the path. Then, it turns slightly along the vector field near the very
end of the path. Therefore, F · r′ is negative most of the time, and so we should expect that:
\[ \int_{C_1} \mathbf{F} \cdot d\mathbf{r} < 0. \]
On the other hand, the path C2 is along the vector field at all times. The integrand F · r′ is
positive throughout the path C2. Therefore, it is certain that:
\[ \int_{C_2} \mathbf{F} \cdot d\mathbf{r} > 0. \]
F · dr = Fx dx + Fy dy + Fz dz.
While the ds-notation does not have much practical use, there are some practical advantages
for using the differential form notations if the path C is parallel to one of the coordinate axes.
Let’s illustrate this through an example:
where C is the path from (0, 1) down to (0, 0) along the y-axis, then to (1, 0) along the x-axis.
■ Solution Note that the path C has two segments. Break C into two segments C1 and C2
where C1 is the path from (0, 1) to (0, 0) along the y-axis, and C2 is the path from (0, 0) to
(1, 0) along the x-axis.
[Figure: the path C, consisting of C1 from (0, 1) to (0, 0) and C2 from (0, 0) to (1, 0)]
Along C1, we have x = 0, and so dx = 0 too. From this we can immediately tell that
\[ \int_{C_1} -y\, dx + x\, dy = \int_{C_1} -y\, d(0) + 0\, dy = 0. \]
\[ \nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k}. \]
The most elementary method to determine whether a given vector field is conservative is
to solve for the scalar potential f, as illustrated by the following example:
Determine whether or not F is a conservative vector field. If so, find its potential function f
such that F = ∇f.
\[
\begin{aligned}
\text{①} \quad \frac{\partial f}{\partial x} &= 2x + y \\
\text{②} \quad \frac{\partial f}{\partial y} &= x + z^3 \\
\text{③} \quad \frac{\partial f}{\partial z} &= 3yz^2 + 1
\end{aligned}
\]
From ①, one can find f(x, y, z) by integrating 2x + y with respect to x, regarding y as a constant. In
single variable calculus, an integration constant is added after the integration. However,
we are now considering partial derivatives, and not only constants but also functions of y or z vanish
after differentiating with respect to x! In other words, the “integration constant” is no longer a constant
but instead a function of y and z. Precisely, we have:
\[ \text{④} \quad f(x, y, z) = x^2 + yx + g(y, z) \]
where g(y, z) is some function of y and z. We will figure out g(y, z) in the remaining steps.
By differentiating both sides of ④ with respect to y, we get:
\[ \frac{\partial f}{\partial y} = x + \frac{\partial g}{\partial y}. \]
Comparing this with ②, we must have:
\[ \frac{\partial g}{\partial y} = z^3. \]
Integrating with respect to y yields:
\[ g(y, z) = yz^3 + h(z). \]
Note that, by the same principle discussed above for f, the integration “constant” is no
longer just a constant but a function not depending on y, and hence a function of z only.
By differentiating both sides with respect to z, we get:
\[ \frac{\partial g}{\partial z} = 3yz^2 + h'(z). \]
Finally, by comparing this result with ③, one must have h′(z) = 1, and so h(z) = z + C,
where C is genuinely a constant this time!
Combining all results above, we have
\[ f(x, y, z) = x^2 + yx + yz^3 + z + C. \]
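A quick way to gain confidence in a potential function is to differentiate it numerically and compare with F. The sketch below is illustrative only: the helper `numerical_gradient`, the sample points, and the choice C = 0 are assumptions made for this example.

```python
def f(x, y, z):
    return x**2 + y * x + y * z**3 + z  # the potential found above, with C = 0

def F(x, y, z):
    return (2 * x + y, x + z**3, 3 * y * z**2 + 1)

def numerical_gradient(g, p, h=1e-5):
    """Central-difference approximation of the gradient of g at the point p."""
    grad = []
    for k in range(3):
        hi = list(p)
        lo = list(p)
        hi[k] += h
        lo[k] -= h
        grad.append((g(*hi) - g(*lo)) / (2 * h))
    return tuple(grad)

points = [(0.3, -1.2, 0.8), (2.0, 1.0, -0.5), (-1.5, 0.4, 1.1)]
max_error = max(abs(gn - fe)
                for p in points
                for gn, fe in zip(numerical_gradient(f, p), F(*p)))
print(max_error)  # should be very small if grad f = F at the sample points
```

Agreement at a few points is not a proof, but a mismatch would immediately expose an algebra slip in the integration steps.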
It is important to keep in mind that not all vector fields are conservative! Here is one for which
such an f does not exist:
■ Example 4.6 Let F( x, y) = −yi + xj. Determine whether or not F is a conservative vector
field. If so, find its potential function f such that F = ∇ f .
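\[
\begin{aligned}
\text{①} \quad \frac{\partial f}{\partial x} &= -y \\
\text{②} \quad \frac{\partial f}{\partial y} &= x
\end{aligned}
\]
Integrating ① with respect to x gives f(x, y) = −xy + g(y) for some function g of y alone, and so:
\[ \frac{\partial f}{\partial y} = -x + g'(y). \]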
One important feature of conservative vector fields is the path independence of line integrals,
meaning that the line integral depends only on the end-points of the path. Precisely, we have:
Theorem 4.1 Given a conservative vector field F = ∇f, where f is a potential function, the line
integral along any path C from point P0(x0, y0, z0) to point P1(x1, y1, z1) is given by:
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = f(x_1, y_1, z_1) - f(x_0, y_0, z_0). \]
4.3 Conservative Vector Fields 91
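Then,
\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_C \mathbf{F} \cdot \frac{d\mathbf{r}}{dt}\, dt \\
&= \int_C \nabla f \cdot \frac{d\mathbf{r}}{dt}\, dt \\
&= \int_a^b \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle \cdot \left\langle \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right\rangle dt \\
&= \int_a^b \left( \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} \right) dt \\
&= \int_a^b \frac{d}{dt} f(\mathbf{r}(t))\, dt \quad \text{(chain rule)} \\
&= f(\mathbf{r}(b)) - f(\mathbf{r}(a)) \quad \text{(Fundamental Theorem of Calculus)}
\end{aligned}
\]
■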
The significance of this theorem is that the RHS depends only on the initial and final points
of the curve C, but not the intermediate path. In other words, if C1 and C2 are two paths with
the same initial and final points, then we have $\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{C_2} \mathbf{F} \cdot d\mathbf{r}$. Moreover, if C is a closed
path whose initial and final positions are the same, then $\oint_C \mathbf{F} \cdot d\mathbf{r} = 0$.
Notation If C is a closed path, meaning that the two endpoints are the same, it is a
convention to denote the line integral as:
\[ \oint_C \mathbf{F} \cdot d\mathbf{r}. \]
To summarize, we have:
Corollary 4.2 For a conservative vector field F, if C1 and C2 are two paths with the same
initial and final positions, then
\[ \int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{C_2} \mathbf{F} \cdot d\mathbf{r}. \]
■ Example 4.7 Consider the vector field F(x, y, z) = (2x + y)i + (x + z³)j + (3yz² + 1)k which
appeared in Example 4.5, and the path C given by the parametric equation
\[ \mathbf{r}(t) = (e^t \sin^{2012} t)\mathbf{i} + (\cos^{2013} t)\mathbf{j} + \frac{t}{\pi}\mathbf{k}, \quad 0 \le t \le 2\pi. \]
Find the line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$.
■ Solution It is clear that direct computation of this line integral is extremely laborious (and
may be impossible). Fortunately, it was shown in Example 4.5 that F is a conservative vector
field with potential function
f ( x, y, z) = x2 + yx + yz3 + z + C.
The initial and final positions of the given path are respectively:
r(0) = ⟨0, 1, 0⟩
r(2π ) = ⟨0, 1, 2⟩.
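Alternatively, one can also find this line integral using the path independence property
of conservative vector fields. Let L be the straight path connecting from (0, 1, 0) to (0, 1, 2)
which are the initial and final positions respectively. Since F is conservative, we must have:
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_L \mathbf{F} \cdot d\mathbf{r}. \]
Therefore,
\[
\begin{aligned}
\int_L \mathbf{F} \cdot d\mathbf{r} &= \int_0^1 \left\langle 2x + y,\ x + z^3,\ 3yz^2 + 1 \right\rangle \cdot \mathbf{r}_L'(t)\, dt \\
&= \int_0^1 \left\langle 2(0) + 1,\ 0 + 8t^3,\ 3(4t^2) + 1 \right\rangle \cdot \langle 0, 0, 2 \rangle\, dt \\
&= \int_0^1 (24t^2 + 2)\, dt \\
&= \left[ 8t^3 + 2t \right]_0^1 = 10.
\end{aligned}
\]
By path independence, $\int_C \mathbf{F} \cdot d\mathbf{r} = \int_L \mathbf{F} \cdot d\mathbf{r} = 10$.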
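Path independence can also be observed numerically, even along the complicated curve C. The sketch below is an illustration under stated assumptions: the helper `line_integral` is a hypothetical midpoint-type approximation, the number of steps is chosen fairly large because the path has sharp features, and the potential is taken with C = 0.

```python
import math

def F(x, y, z):
    return (2 * x + y, x + z**3, 3 * y * z**2 + 1)

def f(x, y, z):
    return x**2 + y * x + y * z**3 + z  # potential from Example 4.5, with C = 0

def r(t):
    # the path C of Example 4.7
    return (math.exp(t) * math.sin(t)**2012, math.cos(t)**2013, t / math.pi)

def line_integral(F, r, a, b, n=100000):
    """Sum F(r(midpoint)) . (r(t_{i+1}) - r(t_i)) over n small steps of [a, b]."""
    dt = (b - a) / n
    total = 0.0
    prev = r(a)
    for i in range(n):
        nxt = r(a + (i + 1) * dt)
        mid = r(a + (i + 0.5) * dt)
        total += sum(c * (q - p) for c, p, q in zip(F(*mid), prev, nxt))
        prev = nxt
    return total

direct = line_integral(F, r, 0.0, 2 * math.pi)
by_potential = f(*r(2 * math.pi)) - f(*r(0.0))
print(direct, by_potential)  # both should be close to 10
```

The direct computation needs a hundred thousand function evaluations, while the potential needs two, which is the practical content of Theorem 4.1.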
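\[ \mathbf{F} = F_x \mathbf{i} + F_y \mathbf{j} + F_z \mathbf{k}, \]
the curl of the vector field F, denoted either by curl(F) or ∇ × F, is defined as:
\[
\begin{aligned}
\nabla \times \mathbf{F} &= \left( \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k} \right) \times \left( F_x \mathbf{i} + F_y \mathbf{j} + F_z \mathbf{k} \right) \\
&= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} \\
&= \left( \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} \right)\mathbf{i} + \left( \frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x} \right)\mathbf{j} + \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right)\mathbf{k}.
\end{aligned}
\]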
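Note: The symbol $\nabla := \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k}$ should be considered as an operator rather than a vector.
It carries no physical or geometric meaning. “Multiplying” $\frac{\partial}{\partial x}$ by a function P gives the
partial derivative $\frac{\partial P}{\partial x}$ as the product.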
F = Fx i + Fy j + 0k.
Note: ∇ × F is called the curl of F because it measures how circular the vector field F is, as a
consequence of Green’s and Stokes’ Theorems, which we will learn very soon.
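■ Solution
\[
\begin{aligned}
\nabla \times \mathbf{F} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ 2x + y & x + z^3 & 3yz^2 + 1 \end{vmatrix} \\
&= \left( \frac{\partial}{\partial y}(3yz^2 + 1) - \frac{\partial}{\partial z}(x + z^3) \right)\mathbf{i} + \left( \frac{\partial}{\partial z}(2x + y) - \frac{\partial}{\partial x}(3yz^2 + 1) \right)\mathbf{j} \\
&\quad + \left( \frac{\partial}{\partial x}(x + z^3) - \frac{\partial}{\partial y}(2x + y) \right)\mathbf{k} \\
&= (3z^2 - 3z^2)\mathbf{i} + (0 - 0)\mathbf{j} + (1 - 1)\mathbf{k} = \mathbf{0}.
\end{aligned}
\]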
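The component formula for the curl is easy to evaluate numerically. The following sketch is illustrative and not part of the notes: the helper `curl` uses central differences, and the sample point and the second test field G = −yi + xj + 0k are assumptions chosen for this example.

```python
def curl(F, p, h=1e-5):
    """Central-difference approximation of the curl of F at the point p."""
    def d(i, k):
        # partial derivative of the i-th component of F with respect to the k-th variable
        hi = list(p)
        lo = list(p)
        hi[k] += h
        lo[k] -= h
        return (F(*hi)[i] - F(*lo)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

F = lambda x, y, z: (2 * x + y, x + z**3, 3 * y * z**2 + 1)
G = lambda x, y, z: (-y, x, 0.0)

point = (0.3, -1.2, 0.8)
curl_F = curl(F, point)
curl_G = curl(G, point)
print(curl_F)  # every component should be close to 0
print(curl_G)  # should be close to (0, 0, 2)
```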
In order to introduce the curl test, we first need to introduce a topological concept about a
region, namely simply-connectedness. A connected region means the region is in one piece, and a
simply-connected region is defined as follows:
Definition 4.4 — Simply-Connected Regions. A region Ω is simply-connected if Ω is con-
nected and every closed loop in Ω can be contracted to a point continuously without leaving
the region Ω.
The set R2 with the origin removed is not simply-connected, as the loops that go around
the origin cannot be contracted to a point without “touching” the origin. However, the set R3
with the origin removed is simply-connected. Draw a picture to convince yourself of that!
The curl test to be introduced is a very straight-forward test to check whether a given vector
field F is conservative, without the need to solve for the potential function f . There is one
crucial condition on the domain of the vector field when applying the curl test.
Theorem 4.3 — Curl Test. Given a vector field F is defined and C1 on a region Ω, then:
1. If F = ∇ f for some scalar function f defined on Ω, then ∇ × F = 0 on Ω.
2. If ∇ × F = 0 and Ω is simply-connected, then F = ∇ f for some scalar function f
defined on Ω.
Proof. The theorem has two parts. Part (1) is easy, while the other part is very technical (and
hence the proof is omitted).
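Part (1) is a consequence of the Mixed Partial Theorem. Suppose F = ∇f, then
$\mathbf{F} = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k}$, and so
\[
\begin{aligned}
\nabla \times \mathbf{F} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \end{vmatrix} \\
&= \left( \frac{\partial^2 f}{\partial y \partial z} - \frac{\partial^2 f}{\partial z \partial y} \right)\mathbf{i} + \left( \frac{\partial^2 f}{\partial z \partial x} - \frac{\partial^2 f}{\partial x \partial z} \right)\mathbf{j} + \left( \frac{\partial^2 f}{\partial x \partial y} - \frac{\partial^2 f}{\partial y \partial x} \right)\mathbf{k} \\
&= 0\mathbf{i} + 0\mathbf{j} + 0\mathbf{k} = \mathbf{0}.
\end{aligned}
\]
■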
Therefore, to check whether or not F is conservative, assuming it is defined and smooth on a
simply-connected region, it is not necessary to solve for the potential function f. All that is needed
is to find ∇ × F, which only involves differentiation but not integration. However, the curl test
only tells you whether or not the vector field is conservative; it does not tell you what the
potential function is. Still, knowing that F is conservative without knowing the potential
f can be helpful, since one can then pick an easier path when computing a line integral.
Let’s consider the following example:
■ Solution Recall that the required line integral can be equivalently written as $\int_C \mathbf{F} \cdot d\mathbf{r}$ where
the vector field F is given by:
\[ \mathbf{F}(x, y) = \left( 2xe^{xy} + x^2 y e^{xy} \right)\mathbf{i} + \left( x^3 e^{xy} + 2y \right)\mathbf{j}. \]
If we compute the line integral directly from the definition, then one would have to first
compute F · r′(t), which is given by:
\[
\underbrace{\left( 2\cos^{24601} t \cdot e^{\cos^{24601} t \cdot \frac{2t}{\pi}} + \cos^{2 \times 24601} t \cdot \frac{2t}{\pi}\, e^{\cos^{24601} t \cdot \frac{2t}{\pi}} \right)}_{2xe^{xy} + x^2 y e^{xy}} \cdot \underbrace{24601 \cos^{24600} t \cdot (-\sin t)}_{(\cos^{24601} t)'} + \dots
\]
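\[
\begin{aligned}
\nabla \times \mathbf{F} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ 2xe^{xy} + x^2 y e^{xy} & x^3 e^{xy} + 2y & 0 \end{vmatrix} \\
&= 0\mathbf{i} - 0\mathbf{j} + \left( \frac{\partial (x^3 e^{xy} + 2y)}{\partial x} - \frac{\partial (2xe^{xy} + x^2 y e^{xy})}{\partial y} \right)\mathbf{k} \\
&= \left( 3x^2 e^{xy} + x^3 y e^{xy} - 2x^2 e^{xy} - x^2 e^{xy} - x^3 y e^{xy} \right)\mathbf{k} \\
&= 0\mathbf{k} = \mathbf{0}.
\end{aligned}
\]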
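Note that F is defined and $C^1$ everywhere in R2. The curl test applies and therefore it shows
F is conservative. Even though the curl test does not tell us what the potential function
is, the line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$ is now shown to be path-independent. One may simply join
the end-points of C by a straight-line path L, and compute the line integral $\int_L \mathbf{F} \cdot d\mathbf{r}$
instead.
The end-points of C are given by:
\[
\begin{aligned}
\mathbf{r}(-\pi/2) &= (\cos(-\pi/2))^{24601}\, \mathbf{i} + \frac{2}{\pi}\left( -\frac{\pi}{2} \right)\mathbf{j} = 0\mathbf{i} - \mathbf{j} \\
\mathbf{r}(\pi/2) &= (\cos(\pi/2))^{24601}\, \mathbf{i} + \frac{2}{\pi} \cdot \frac{\pi}{2}\, \mathbf{j} = 0\mathbf{i} + \mathbf{j}
\end{aligned}
\]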
The straight-line path L from (0, −1) to (0, 1) is parametrized by:
C is a closed path, so H cannot be conservative, even though the curl of H is zero! The curl
test fails here because the domain of H is not simply-connected.
The gravitational vector field
\[ \mathbf{F}(x, y, z) = -GMm\, \frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{(x^2 + y^2 + z^2)^{3/2}} \]
is not defined at (0, 0, 0) but is defined and $C^1$ everywhere else in R3. With some straightforward
(although lengthy) computations, one can verify that ∇ × F = 0 for any (x, y, z) ≠
(0, 0, 0). Since the domain R3 with the origin removed is simply-connected, the curl test applies to
this vector field! We conclude that the gravitational vector field F is conservative.
4.4 Green’s Theorem 97
Theorem 4.4 — Green’s Theorem. Let C be a simple closed curve in R2 which is counter-
clockwise oriented. Suppose the curve C encloses region R. Let F( x, y) be a vector field
which is defined and C1 at every point in R, then:
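\[
\underbrace{\oint_C \mathbf{F} \cdot d\mathbf{r}}_{\text{line integral}} = \underbrace{\iint_R (\nabla \times \mathbf{F}) \cdot \mathbf{k}\, dA}_{\text{double integral}}
\]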
The proof of the Green’s Theorem is quite technical if the curve C is complicated. The proof
of the theorem in some special cases can be found in some reference textbooks listed in the
syllabus. Let’s first look at some examples.
■ Example 4.10 Use the Green’s Theorem to evaluate the line integral:
\[ \oint_C -y\, dx + x\, dy \]
■Solution The vector field corresponding to this line integral is F = −yi + xj, which is
defined and is smooth everywhere on R2 . The given path C encloses the rectangle R with
vertices (0, 0), (1, 0), (1, 1) and (0, 1).
[Figure: the region R with vertices (0, 0), (1, 0), (1, 1) and (0, 1)]
∇ × F = 2k.
[Figure 4.6: the multi-segment path C = L1 + Γ1 + L2 + Γ2 from Example 4.3]
■ Example 4.11 Let F = −y²i + xyj and let C be the multi-segment path that appeared in Example
4.3. For easy reference, see Figure 4.6. Find the line integral using Green’s Theorem:
\[ \oint_C \mathbf{F} \cdot d\mathbf{r}. \]
■ Solution The closed path C consists of four segments. If we attempt to find the line integral
directly, we would have to break it down into four integrals, one for each segment. However,
this closed path C encloses a fan-shaped region (denoted by R). The double integral over R
is easy to set up if one converts it into polar coordinates. This suggests that Green’s
Theorem may come in handy here.
Since F is defined and is $C^1$ everywhere, we can apply Green’s Theorem without any
issues. First we compute its curl:
\[ \nabla \times \mathbf{F} = \left( \frac{\partial (xy)}{\partial x} - \frac{\partial (-y^2)}{\partial y} \right)\mathbf{k} = 3y\mathbf{k}. \]
By Green’s Theorem,
\[
\begin{aligned}
\oint_C \mathbf{F} \cdot d\mathbf{r} &= \iint_R (\nabla \times \mathbf{F}) \cdot \mathbf{k}\, dA \\
&= \iint_R 3y\mathbf{k} \cdot \mathbf{k}\, dA \\
&= \iint_R 3y\, dA.
\end{aligned}
\]
\[
\begin{aligned}
&= 26\, [-\cos\theta]_0^{\pi/2} \\
&= 26
\end{aligned}
\]
Therefore,
\[ \oint_C \mathbf{F} \cdot d\mathbf{r} = 26. \]
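Both sides of Green’s Theorem can be evaluated numerically for this example. The sketch below is an illustration under stated assumptions: the helper `line_integral` and the grid sizes are hypothetical, the four segments use the parametrizations from Example 4.3 (with the clockwise arc written as ⟨sin t, cos t⟩), and the double integral of 3y is taken in polar form over 1 ≤ r ≤ 3, 0 ≤ θ ≤ π/2.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Sum F(r(midpoint)) . (r(t_{i+1}) - r(t_i)) over n small steps of [a, b]."""
    dt = (b - a) / n
    total = 0.0
    prev = r(a)
    for i in range(n):
        nxt = r(a + (i + 1) * dt)
        mid = r(a + (i + 0.5) * dt)
        total += sum(c * (q - p) for c, p, q in zip(F(*mid), prev, nxt))
        prev = nxt
    return total

F = lambda x, y: (-y**2, x * y)
half_pi = math.pi / 2
path = [
    (lambda t: (0.0, 3.0 - 2.0 * t), 0.0, 1.0),                    # L1
    (lambda t: (math.sin(t), math.cos(t)), 0.0, half_pi),          # Gamma1 (clockwise)
    (lambda t: (1.0 + 2.0 * t, 0.0), 0.0, 1.0),                    # L2
    (lambda t: (3 * math.cos(t), 3 * math.sin(t)), 0.0, half_pi),  # Gamma2
]
circulation = sum(line_integral(F, r, a, b) for r, a, b in path)

# Double integral of (curl F) . k = 3y = 3 r sin(theta), with dA = r dr dtheta
n = 400
dr, dth = 2.0 / n, half_pi / n
double_integral = 0.0
for i in range(n):
    radius = 1.0 + (i + 0.5) * dr
    for j in range(n):
        theta = (j + 0.5) * dth
        double_integral += 3 * radius * math.sin(theta) * radius * dr * dth

print(circulation, double_integral)  # both should be close to 26
```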
In other words,
\[ (\nabla \times \mathbf{F}) \cdot \mathbf{k} \simeq \frac{1}{\text{area of } R} \oint_C \mathbf{F} \cdot d\mathbf{r}. \]
The line integral $\oint_C \mathbf{F} \cdot d\mathbf{r}$ is larger if F · r′ is large for most of the time. This happens when
F circulates around the curve C. In other words, the closed-loop line integral $\oint_C \mathbf{F} \cdot d\mathbf{r}$ indicates
how circular the vector field F is around the curve C. The above result tells us that the quantity
(∇ × F) · k at a point is roughly proportional to the circulation of F around that point. That’s
why we call ∇ × F the curl of F: it is an indicator of the curliness of a vector field!
As an example, you can verify that the curls of the vector fields F = −yi + xj and G =
xi + yj are respectively given by:
(∇ × F) · k = 2
(∇ × G) · k = 0
It suggests that F is more circular than G. By plotting them in Mathematica, we can see that
vectors in F are circling around the origin, while vectors in G are diverging from the origin.
However, one can check from direct computations that ∇ × F = 0 at every point except the
origin, and is undefined at the origin. Therefore, the double integral
\[ \iint_R \underbrace{(\nabla \times \mathbf{F}) \cdot \mathbf{k}}_{=0}\, dA = 0. \]
Therefore, in this case Green’s Theorem does not hold as we have:
\[
\underbrace{\oint_C \mathbf{F} \cdot d\mathbf{r}}_{= 2\pi} \ne \underbrace{\iint_R (\nabla \times \mathbf{F}) \cdot \mathbf{k}\, dA}_{= 0}.
\]
To summarize, one cannot directly apply Green’s Theorem if the given curve encloses a
point at which the vector field is not defined. In such cases, we should either compute the line
integral directly by parametrization, or use some other tools to compute it. We will compute
this integral by the so-called hole-drilling technique to be presented in the next subsection.
If the curve does not enclose the origin (at which F is undefined), then Green’s Theorem
applies to F and such a curve without any issues. For instance, if Γ is a circle centered at
(3, 0) with radius 2, then it does not enclose the origin (which is the only point at which
$\mathbf{F} = -\frac{y}{x^2 + y^2}\mathbf{i} + \frac{x}{x^2 + y^2}\mathbf{j}$ is undefined). Green’s Theorem then shows that
\[
\oint_\Gamma \mathbf{F} \cdot d\mathbf{r} = \iint_R \underbrace{(\nabla \times \mathbf{F}) \cdot \mathbf{k}}_{=0 \text{ at every point in } R}\, dA = \iint_R 0\, dA = 0.
\]
To handle this issue, we drill a hole near the origin, i.e. we remove a tiny disk with radius ε
centered at the origin from the region (see Figure 4.7b), and further split the punctured region
into two parts R1 and R2 by cutting it along line segments L1 and L2 (see Figure 4.7cd). We
label each segment of the boundaries by C1, C2, L1, L2, Γ1 and Γ2 with directions indicated in
the diagram. Then, according to the directions of C1, C2 and C, the line integral in question
can be expressed as:
\[
\oint_C \mathbf{F} \cdot d\mathbf{r} = \oint_{C_1 + C_2} \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} + \int_{C_2} \mathbf{F} \cdot d\mathbf{r}.
\]
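Similarly, R2 does not enclose the origin so Green’s Theorem can be applied to R2:
\[
\underbrace{\int_{C_2} \mathbf{F} \cdot d\mathbf{r} - \int_{L_1} \mathbf{F} \cdot d\mathbf{r} - \int_{\Gamma_2} \mathbf{F} \cdot d\mathbf{r} - \int_{L_2} \mathbf{F} \cdot d\mathbf{r}}_{\oint_{\text{boundary of } R_2} \mathbf{F} \cdot d\mathbf{r}} = \iint_{R_2} \underbrace{(\nabla \times \mathbf{F})}_{=0} \cdot\, \mathbf{k}\, dA = 0. \tag{4.2}
\]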
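By summing up (4.1) and (4.2) and canceling out the line integrals over L1 and L2, we get:
\[
\int_{C_1} \mathbf{F} \cdot d\mathbf{r} + \int_{C_2} \mathbf{F} \cdot d\mathbf{r} - \int_{\Gamma_1} \mathbf{F} \cdot d\mathbf{r} - \int_{\Gamma_2} \mathbf{F} \cdot d\mathbf{r} = 0,
\]
and hence:
\[ \oint_C \mathbf{F} \cdot d\mathbf{r} = \oint_{\Gamma_\varepsilon} \mathbf{F} \cdot d\mathbf{r} = 2\pi. \]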
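The dichotomy between curves that enclose the origin and curves that do not can be observed numerically for the field F = −y/(x² + y²) i + x/(x² + y²) j discussed above. The sketch below is illustrative only: the helper `line_integral` is a hypothetical midpoint-type approximation, and the ellipse around the origin is an arbitrary choice of simple closed curve made for this example.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Sum F(r(midpoint)) . (r(t_{i+1}) - r(t_i)) over n small steps of [a, b]."""
    dt = (b - a) / n
    total = 0.0
    prev = r(a)
    for i in range(n):
        nxt = r(a + (i + 1) * dt)
        mid = r(a + (i + 0.5) * dt)
        total += sum(c * (q - p) for c, p, q in zip(F(*mid), prev, nxt))
        prev = nxt
    return total

F = lambda x, y: (-y / (x * x + y * y), x / (x * x + y * y))

# Counter-clockwise ellipse enclosing the origin
around_origin = line_integral(
    F, lambda t: (2 * math.cos(t), math.sin(t)), 0.0, 2 * math.pi)
# Counter-clockwise circle centered at (3, 0) with radius 2, which misses the origin
away_from_origin = line_integral(
    F, lambda t: (3 + 2 * math.cos(t), 2 * math.sin(t)), 0.0, 2 * math.pi)

print(around_origin, 2 * math.pi)  # should be close to 2*pi
print(away_from_origin)            # should be close to 0
```

Dividing the first value by 2π gives the number of times the curve winds around the origin, which is the idea behind the winding number.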
Self-Intersecting Curves
Next, let’s use the above result to deal with some self-intersecting curves. Consider the curve C in
Figure 4.8.
Since we already know how to deal with any simple closed curve enclosing the origin, we
are going to split the curve C into simple closed curves C1, C2 and C3 according to Figure 4.8.
As both C1 and C2 are simple closed curves enclosing the origin, by our previous discussion
we know:
\[ \oint_{C_1} \mathbf{F} \cdot d\mathbf{r} = \oint_{C_2} \mathbf{F} \cdot d\mathbf{r} = 2\pi. \]
For C3, it is a simple closed curve not enclosing the origin. Therefore, Green’s Theorem
can be applied on C3 without any issues. Hence, we have
\[ \oint_{-C_3} \mathbf{F} \cdot d\mathbf{r} = \iint_R (\nabla \times \mathbf{F}) \cdot \mathbf{k}\, dA = 0 \]
where R is the region enclosed by C3. Here we have again used the fact that ∇ × F = 0 inside
the region R.
According to the directions indicated on the diagram, we have:
\[
\oint_C \mathbf{F} \cdot d\mathbf{r} = \oint_{C_1} \mathbf{F} \cdot d\mathbf{r} + \oint_{C_2} \mathbf{F} \cdot d\mathbf{r} + \oint_{C_3} \mathbf{F} \cdot d\mathbf{r} = 2\pi + 2\pi + 0 = 4\pi.
\]
Therefore, the winding number of this curve C is equal to 2. A similar technique can be
applied to more complicated curves which go around the origin many times.
Instead of regarding u and v as time variables, we regard them as the coordinates on a uv-plane,
and the vector r(u, v) is a function that associates each point (u, v) on the uv-plane to a point
(x(u, v), y(u, v), z(u, v)) in the xyz-space. Since there are two parameters, the image of the
function is a surface in the xyz-space. In other words, the function r(u, v) can be thought of as a
transformation that “wraps” the uv-paper onto the surface.
The function r(u, v) is called a parametrization of the surface, and “parametrizing a surface”
means giving a parametrization of the surface. As we will see later,
parametrizing a surface is often the first step of computing a surface integral.
Note: Although we use (u, v) to denote the parameters, you can use any pair of variables
for the parameters, provided that there is no confusion.
■ Example 4.12 Find a parametrization for the cylinder with radius r0 and with z-axis as the
central axis.
■Solution If, under a certain coordinate system, the surface has one of the coordinates being
constant, then giving a parametrization to that surface is fairly easy: simply take the other
two coordinates as parameters, and define r according to the conversion rules between this
coordinate system and the rectangular coordinate system.
The cylinder described in the problem can be presented by equation r = r0 under
cylindrical coordinates (r, θ, z). The conversion rule between cylindrical and rectangular
4.5 Parametric Surfaces 105
x = r cos θ
y = r sin θ
z=z
One can also specify the range of z so that the parametrization gives a finite cylinder. For
instance,
r(θ, z) = (r0 cos θ )i + (r0 sin θ )j + zk, 0 ≤ θ ≤ 2π, 0 ≤ z ≤ 1
gives the finite cylinder with unit height (from z = 0 to z = 1).
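Similarly, a cone making an angle of $\frac{\pi}{4}$ with the z-axis can be represented by z = r in cylindrical
coordinates, or $\phi = \frac{\pi}{4}$ in spherical coordinates. Therefore, it can be parametrized in two
different ways:
\[
\begin{aligned}
\mathbf{r}_1(r, \theta) &= (r\cos\theta)\mathbf{i} + (r\sin\theta)\mathbf{j} + r\mathbf{k}, \quad 0 \le r < \infty,\ 0 \le \theta \le 2\pi \\
\mathbf{r}_2(\rho, \theta) &= \left( \rho \sin\frac{\pi}{4} \cos\theta \right)\mathbf{i} + \left( \rho \sin\frac{\pi}{4} \sin\theta \right)\mathbf{j} + \left( \rho \cos\frac{\pi}{4} \right)\mathbf{k} \\
&= \frac{\rho\cos\theta}{\sqrt{2}}\mathbf{i} + \frac{\rho\sin\theta}{\sqrt{2}}\mathbf{j} + \frac{\rho}{\sqrt{2}}\mathbf{k}, \quad 0 \le \rho < \infty,\ 0 \le \theta \le 2\pi
\end{aligned}
\]
A sphere with radius 3 can be represented by ρ = 3 in spherical coordinates, and so it can be
parametrized by:
\[ \mathbf{r}_3(\theta, \phi) = (3\sin\phi\cos\theta)\mathbf{i} + (3\sin\phi\sin\theta)\mathbf{j} + (3\cos\phi)\mathbf{k}, \quad 0 \le \theta \le 2\pi,\ 0 \le \phi \le \pi. \]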
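A parametrization can be sanity-checked by substituting sample parameter values and testing the defining equation of the surface. The sketch below is an illustration and not part of the notes: it assumes the standard spherical conversion z = ρ cos ϕ, and the function names `sphere` and `cone` and the sample grid are hypothetical.

```python
import math

def sphere(theta, phi, radius=3.0):
    """Spherical-coordinate parametrization of the sphere rho = radius."""
    return (radius * math.sin(phi) * math.cos(theta),
            radius * math.sin(phi) * math.sin(theta),
            radius * math.cos(phi))

def cone(rho, theta):
    """Spherical-coordinate parametrization of the cone phi = pi/4."""
    s = math.sin(math.pi / 4)
    c = math.cos(math.pi / 4)
    return (rho * s * math.cos(theta), rho * s * math.sin(theta), rho * c)

samples = [(0.1 * i, 0.07 * j) for i in range(1, 20) for j in range(1, 20)]

# Every point of the sphere should satisfy x^2 + y^2 + z^2 = 9
sphere_error = max(abs(sum(w * w for w in sphere(th, ph)) - 9.0) for th, ph in samples)
# Every point of the cone should satisfy z = sqrt(x^2 + y^2), i.e. z = r
cone_error = max(abs(math.hypot(x, y) - z)
                 for x, y, z in (cone(1.0 + a, b) for a, b in samples))

print(sphere_error, cone_error)  # both should be at rounding-error level
```

If the z-component of the sphere were written with sin ϕ instead of cos ϕ, the first check would fail, which makes this a cheap guard against that kind of slip.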
Parametrization of Graphs
If a surface can be represented by a Cartesian equation (i.e. level-set form) such as $x^2 + y - z^3 = 1$
and one of the variables can be written as a function of the other two (such
as $y = z^3 - x^2 + 1$ in this example), then we can use the other two variables as parameters. For
instance, the surface $y = z^3 - x^2 + 1$ can be parametrized as:
\[ \mathbf{r}(x, z) = x\mathbf{i} + \underbrace{(z^3 - x^2 + 1)}_{y}\, \mathbf{j} + z\mathbf{k}. \]
However, x cannot be written as a function of y and z for the surface $y = z^3 - x^2 + 1$, since
$x = \pm\sqrt{z^3 - y + 1}$ and the ± prevents x from being a function of y and z. On the other hand, z can
be written as a function of x and y, since $z^3 = x^2 + y - 1$ and there is exactly one cube root of
$x^2 + y - 1$. However, the resulting parametrization
\[ \mathbf{r}(x, y) = x\mathbf{i} + y\mathbf{j} + \left( x^2 + y - 1 \right)^{1/3} \mathbf{k} \]
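The Möbius strip, a famous object in topology, has the following parametrization
\[ \mathbf{r}(u,v) = \left(1 + \frac{v}{2}\cos\frac{u}{2}\right)\cos u\,\mathbf{i} + \left(1 + \frac{v}{2}\cos\frac{u}{2}\right)\sin u\,\mathbf{j} + \frac{v}{2}\sin\frac{u}{2}\,\mathbf{k} \]
where 0 ≤ u ≤ 2π and −1 ≤ v ≤ 1.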
Since r denotes the position vector xi + yj + zk, we can also write down a parametrization
in an equation form (especially when the expression of the parametrization is too long to fit in
one line). For instance, the Möbius strip parametrization can be equivalently written as:
\[ \begin{aligned}
x &= \left(1 + \frac{v}{2}\cos\frac{u}{2}\right)\cos u \\
y &= \left(1 + \frac{v}{2}\cos\frac{u}{2}\right)\sin u \\
z &= \frac{v}{2}\sin\frac{u}{2}
\end{aligned} \]
where 0 ≤ u ≤ 2π and −1 ≤ v ≤ 1.
The following parametrization, which looks intimidating, describes a very beautiful surface
called the Klein bottle (see Figure 4.11):
\[ \begin{aligned}
x &= -\frac{2}{15}\cos u\,\bigl(3\cos v - 30\sin u + 90\cos^4 u\sin u - 60\cos^6 u\sin u + 5\cos u\sin u\cos v\bigr) \\
y &= -\frac{1}{15}\sin u\,\bigl(3\cos v - 3\cos^2 u\cos v - 48\cos^4 u\cos v + 48\cos^6 u\cos v \\
  &\qquad\qquad - 60\sin u + 5\cos u\sin u\cos v - 5\cos^3 u\cos v\sin u \\
  &\qquad\qquad - 80\cos^5 u\sin u\cos v + 80\cos^7 u\sin u\cos v\bigr) \\
z &= \frac{2}{15}\,(3 + 5\cos u\sin u)\sin v
\end{aligned} \]
for 0 ≤ u ≤ π and 0 ≤ v ≤ 2π.
is a special type of surface integral where the region of integration R is on the flat xy-plane. A
surface integral is one whose region of integration can be a curved surface. Many geometric
and physical quantities, such as surface area, surface flux, and moment of inertia for some
shell objects, can be computed using surface integrals. The Stokes’ Theorem to be introduced
in the next section also involves surface integrals.
We first state the definition of surface integrals, compute some examples and then explain
its geometric and physical meaning.
Definition 4.6 — Surface Integrals. Given a surface S parametrized by r(u, v) with a ≤ u ≤ b
and c ≤ v ≤ d, and a continuous, scalar-valued function f(x, y, z), the surface integral of f
over the surface S is denoted and defined to be:
\[ \iint_S f\,dS = \int_{v=c}^{v=d}\int_{u=a}^{u=b} f(\mathbf{r}(u,v))\,\left|\frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\right|\,du\,dv. \]
i The line integral of a vector field over a curve C is independent of how we parametrize C.
Similarly, the surface integral over a surface S is independent of the surface
parametrization r(u, v). The proof involves the change of variables technique in multivariable
calculus and is omitted here.
Examples of closed surfaces include spheres and tori, while a hemisphere (only the
spherical part; the flat part is not included) is not closed since it has a circle as its boundary.
■ Example 4.13 Let S be the sphere of radius a centered at the origin. Evaluate the surface
integral
\[ \oiint_S (x^2 + y^2)\,dS. \]
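■ Solution We first parametrize the surface. Using spherical coordinates, the sphere is
represented by ρ = a. Take (θ, ϕ) as parameters, then a parametrization of the sphere is given
by:
\[ \mathbf{r}(\theta,\phi) = (a\sin\phi\cos\theta)\,\mathbf{i} + (a\sin\phi\sin\theta)\,\mathbf{j} + (a\cos\phi)\,\mathbf{k} \]
where 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ π.
According to the definition of surface integrals, we need to compute $\left|\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi}\right|$. It is
straightforward, although quite tedious:
\[ \begin{aligned}
\frac{\partial\mathbf{r}}{\partial\theta} &= (-a\sin\phi\sin\theta)\,\mathbf{i} + (a\sin\phi\cos\theta)\,\mathbf{j} + 0\,\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial\phi} &= (a\cos\phi\cos\theta)\,\mathbf{i} + (a\cos\phi\sin\theta)\,\mathbf{j} + (-a\sin\phi)\,\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi} &=
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -a\sin\phi\sin\theta & a\sin\phi\cos\theta & 0 \\ a\cos\phi\cos\theta & a\cos\phi\sin\theta & -a\sin\phi \end{vmatrix} \\
&= (-a^2\sin^2\phi\cos\theta)\,\mathbf{i} + (-a^2\sin^2\phi\sin\theta)\,\mathbf{j} + (-a^2\sin\phi\cos\phi)\,\mathbf{k} \\
\left|\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi}\right| &= \sqrt{a^4\sin^4\phi\,(\cos^2\theta + \sin^2\theta) + a^4\sin^2\phi\cos^2\phi} \\
&= \sqrt{a^4\sin^2\phi\,(\sin^2\phi + \cos^2\phi)} \\
&= a^2\sin\phi.
\end{aligned} \]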
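On the sphere, $x^2 + y^2 = a^2\sin^2\phi$, and the surface element is $a^2\sin\phi\,d\theta\,d\phi$. Therefore,
\[ \begin{aligned}
\oiint_S (x^2 + y^2)\,dS &= \int_{\phi=0}^{\phi=\pi}\int_{\theta=0}^{\theta=2\pi} a^4\sin^3\phi\,d\theta\,d\phi \\
&= 2\pi a^4\int_{\phi=0}^{\phi=\pi}\sin^3\phi\,d\phi \\
&= 2\pi a^4\cdot\frac{4}{3} = \frac{8\pi a^4}{3}.
\end{aligned} \]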
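The definition of a surface integral translates directly into a numerical recipe: sample the parameter domain, estimate the two partial derivatives, and sum f times the length of their cross product. The following is a minimal sketch of that idea, not an optimized routine. The helper names, the midpoint rule, the finite-difference step h, and the sample radius a = 2 are all assumptions made for illustration; the result is compared against the value 8πa⁴/3 from Example 4.13.

```python
import math

def cross(p, q):
    """Cross product of two 3-vectors."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def surface_integral(f, r, u_range, v_range, n=100, h=1e-6):
    """Midpoint-rule estimate of the integral of f(r(u,v)) |r_u x r_v| du dv."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Central differences stand in for the partial derivatives r_u and r_v.
            r_u = [(p - q) / (2 * h) for p, q in zip(r(u + h, v), r(u - h, v))]
            r_v = [(p - q) / (2 * h) for p, q in zip(r(u, v + h), r(u, v - h))]
            nx, ny, nz = cross(r_u, r_v)
            total += f(*r(u, v)) * math.sqrt(nx * nx + ny * ny + nz * nz) * du * dv
    return total

a = 2.0  # sample radius, chosen arbitrarily

def sphere(theta, phi):
    """Spherical-coordinate parametrization of the sphere of radius a."""
    return (a * math.sin(phi) * math.cos(theta),
            a * math.sin(phi) * math.sin(theta),
            a * math.cos(phi))

approx = surface_integral(lambda x, y, z: x * x + y * y, sphere,
                          (0.0, 2 * math.pi), (0.0, math.pi))
exact = 8 * math.pi * a ** 4 / 3
print(approx, exact)  # the two values should agree closely
```

Because the routine only needs a function r(u, v), the same sketch can be reused for other parametrized surfaces by swapping in a different r and parameter ranges.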
■ Example 4.14 Let S be the plane 3x + 2y + z = 1 defined over the region 0 ≤ x ≤ 1 and
0 ≤ y ≤ 1. Compute the surface integral:
\[ \iint_S (x + y + z)\,dS. \]
■ Solution The equation of the given plane can be written as a graph z = 1 − 3x − 2y over
the given region 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1. We can take x and y as parameters and so a
parametrization of the plane is given by:
r( x, y) = xi + yj + (1 − 3x − 2y)k
where 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1.
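Next we compute all the ingredients of the surface integral:
\[ \begin{aligned}
\frac{\partial\mathbf{r}}{\partial x} &= \mathbf{i} - 3\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial y} &= \mathbf{j} - 2\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial x}\times\frac{\partial\mathbf{r}}{\partial y} &= 3\mathbf{i} + 2\mathbf{j} + \mathbf{k} \\
\left|\frac{\partial\mathbf{r}}{\partial x}\times\frac{\partial\mathbf{r}}{\partial y}\right| &= \sqrt{3^2 + 2^2 + 1^2} = \sqrt{14} \\
x + y + z &= x + y + (1 - 3x - 2y) = 1 - 2x - y
\end{aligned} \]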
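With these ingredients, the surface integral reduces to the double integral of (1 − 2x − y)·√14 over the unit square. Integrating 1 − 2x − y over 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 gives 1 − 1 − 1/2 = −1/2, so the value should be −√14/2. The sketch below checks this numerically; the helper name `midpoint_2d` and the grid size are illustrative assumptions.

```python
import math

def midpoint_2d(g, n=200):
    """Midpoint-rule estimate of the double integral of g over the unit square."""
    h = 1.0 / n
    return sum(g((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

def integrand(x, y):
    """(x + y + z) restricted to the plane, times |r_x x r_y| = sqrt(14)."""
    return (1 - 2 * x - y) * math.sqrt(14)

approx = midpoint_2d(integrand)
exact = -math.sqrt(14) / 2
print(approx, exact)
```

The midpoint rule is exact for linear integrands up to round-off, so the two numbers should match to many digits here.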
Surface Element
Now we explain the geometric and physical meaning of surface integrals. Given a parametric
surface S with parametrization r(u, v), a ≤ u ≤ b and c ≤ v ≤ d, from now on we
denote:
Notation
\[ dS = \left|\frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\right|\,du\,dv \]
It is called the surface element of the integral. If we subdivide the domain in the uv-plane
into small rectangular pieces with area ∆u ∆v, the parametrization r(u, v) transforms them into
small pieces ∆S on the surface (see Figure 4.12).
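If the number of subdivisions is very large so that each ∆S is very small, then ∆S is
approximately a parallelogram. The red side of the parallelogram has length approximately
equal to $\left|\frac{\partial\mathbf{r}}{\partial u}\right|\Delta u$ (to see this, regard u as time; then the distance traveled after ∆u units of time
is approximately speed × time). Similarly, the blue side of the parallelogram has length
approximately equal to $\left|\frac{\partial\mathbf{r}}{\partial v}\right|\Delta v$. Suppose the angle of the parallelogram is θ, then the area of
∆S is approximately:
\[ \left|\frac{\partial\mathbf{r}}{\partial u}\right|\Delta u\cdot\left|\frac{\partial\mathbf{r}}{\partial v}\right|\Delta v\cdot\sin\theta
= \left|\frac{\partial\mathbf{r}}{\partial u}\right|\left|\frac{\partial\mathbf{r}}{\partial v}\right|\sin\theta\cdot\Delta u\,\Delta v
= \underbrace{\left|\frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\right|}_{\left|\frac{\partial\mathbf{r}}{\partial u}\right|\left|\frac{\partial\mathbf{r}}{\partial v}\right|\sin\theta}\Delta u\,\Delta v. \]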
Infinitesimally, ∆S becomes the surface element dS, and ∆u ∆v becomes dudv. In other
words, the surface element dS represents the area of a very tiny piece of subdivision on the
surface. Summing up the area of all tiny subdivisions, we get the surface area of S, i.e.
\[ \iint_S dS = \text{surface area of } S. \]
Different geometric or physical meanings of the integrand f give different meanings to the
surface integral of f over S. For instance,

If f is                  each element f dS means      ∬_S f dS means
1                        area of dS                   total surface area
surface density          mass of dS                   total mass of the surface
density × (x² + y²)      moment of inertia of dS      moment of inertia of S
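Let S be the sphere of radius a centered at the origin. We computed in Example 4.13 that
\[ \oiint_S (x^2 + y^2)\,dS = \frac{8\pi a^4}{3}. \]
Suppose this spherical shell has a uniform surface density δ, then the integral $\oiint_S \delta\,(x^2 + y^2)\,dS$
is the moment of inertia $I_z$ about the z-axis. Since its formula in many physics textbooks is written
in terms of the total mass m of the sphere, let's rewrite it in terms of m. The sphere has surface
area $4\pi a^2$, so $\delta = \frac{m}{4\pi a^2}$ and:
\[ \begin{aligned}
I_z &= \oiint_S \delta\,(x^2 + y^2)\,dS \\
&= \delta\cdot\frac{8\pi a^4}{3} \\
&= \frac{m}{4\pi a^2}\cdot\frac{8\pi a^4}{3} \\
&= \frac{2}{3}ma^2.
\end{aligned} \]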
Figure 4.13: surface flux for uniform vector field through a plane
The force F can be decomposed into two components, one perpendicular to the plane,
another parallel to the plane. As the flux counts only the amount of vectors through the plane,
only those perpendicular to the plane should be counted. Suppose the force F makes an angle
θ with the unit normal vector n̂, then the perpendicular component of the force has length
F · n̂ = |F| cos θ.
Furthermore, there are vectors at every point on the plane. The larger the plane, the more
vectors passing through it. Therefore, the flux of F through the plane should be defined as:
Definition 4.7 — Surface Flux. Given a vector field F and a surface S, the surface flux of F
through S is defined to be the surface integral:
\[ \iint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS \]
i Note that there are often two choices of n̂, so the sign of the surface flux depends on which
direction of n̂ is chosen. For a closed surface such as a sphere, it is a convention to choose
the outward unit normal.
i There are some surfaces for which you cannot “choose” a unit normal convention. For
instance, if you pick a normal vector on the Möbius strip and let it vary continuously over
the strip, the normal vector may end up pointing in the opposite direction when it returns to
the original position! We call these non-orientable surfaces. We do not define surface flux
on non-orientable surfaces.
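Generally speaking, the surface flux can be computed by parametrizing the surface. Given
a parametrization r(u, v), with a ≤ u ≤ b and c ≤ v ≤ d, of a surface S, the unit normal vector
n̂ is given by:
\[ \hat{\mathbf{n}} = \pm\frac{\dfrac{\partial\mathbf{r}}{\partial u}\times\dfrac{\partial\mathbf{r}}{\partial v}}{\left|\dfrac{\partial\mathbf{r}}{\partial u}\times\dfrac{\partial\mathbf{r}}{\partial v}\right|} \]
Recall that the surface element is
\[ dS = \left|\frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\right|\,du\,dv. \]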
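Multiplying the expressions for n̂ and dS, the length of the cross product cancels. The following worked step is a sketch of how the flux integral can then be written purely in terms of the parameters; it appears to be the form used in the flux examples of this section, with the sign chosen to match the required direction of n̂:

```latex
\[
\mathbf{F}\cdot\hat{\mathbf{n}}\,dS
= \pm\,\mathbf{F}\cdot
\frac{\dfrac{\partial\mathbf{r}}{\partial u}\times\dfrac{\partial\mathbf{r}}{\partial v}}
     {\left|\dfrac{\partial\mathbf{r}}{\partial u}\times\dfrac{\partial\mathbf{r}}{\partial v}\right|}
\left|\frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\right| du\,dv
= \pm\,\mathbf{F}\cdot\left(\frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\right) du\,dv,
\]
\[
\iint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS
= \pm\int_{v=c}^{v=d}\int_{u=a}^{u=b}
\mathbf{F}(\mathbf{r}(u,v))\cdot\left(\frac{\partial\mathbf{r}}{\partial u}\times\frac{\partial\mathbf{r}}{\partial v}\right) du\,dv .
\]
```

In practice this means the unit normal never has to be normalized explicitly: only the cross product and its direction are needed.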
\[ \mathbf{F} = -GMm\,\frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{(x^2 + y^2 + z^2)^{3/2}}. \]
Let S be part of the horizontal plane z = 1 over the region $x^2 + y^2 \le 1$. Compute the surface
flux of F through S, i.e.
\[ \iint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS. \]
Here n̂ is chosen to be the upward normal.
■ Solution First we parametrize the surface S. Since the plane z = 1 is a graph over the
xy-plane, it seems the easiest way is to use x and y as parameters. However, the base region
x2 + y2 ≤ 1 is not a rectangle so it may be difficult to set up the ranges of x and y for the
parametrization. Since the base region is a solid circle, we can use cylindrical coordinates
too. Let
r(r, θ ) = (r cos θ )i + (r sin θ )j + k (since z = 1),
where 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π. Next we compute all the ingredients:
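\[ \begin{aligned}
\frac{\partial\mathbf{r}}{\partial r} &= (\cos\theta)\,\mathbf{i} + (\sin\theta)\,\mathbf{j} \\
\frac{\partial\mathbf{r}}{\partial\theta} &= (-r\sin\theta)\,\mathbf{i} + (r\cos\theta)\,\mathbf{j} \\
\frac{\partial\mathbf{r}}{\partial r}\times\frac{\partial\mathbf{r}}{\partial\theta} &=
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \cos\theta & \sin\theta & 0 \\ -r\sin\theta & r\cos\theta & 0 \end{vmatrix} \\
&= (r\cos^2\theta + r\sin^2\theta)\,\mathbf{k} \\
&= r\,\mathbf{k} \quad\text{(which is upward)} \\
\mathbf{F} &= -GMm\,\frac{(r\cos\theta)\,\mathbf{i} + (r\sin\theta)\,\mathbf{j} + z\,\mathbf{k}}{(r^2\cos^2\theta + r^2\sin^2\theta + z^2)^{3/2}} \\
&= -GMm\,\frac{(r\cos\theta)\,\mathbf{i} + (r\sin\theta)\,\mathbf{j} + \mathbf{k}}{(r^2 + 1)^{3/2}} \quad\text{(since } z = 1\text{)} \\
\mathbf{F}\cdot\left(\frac{\partial\mathbf{r}}{\partial r}\times\frac{\partial\mathbf{r}}{\partial\theta}\right) &= -\frac{GMm\,r}{(r^2 + 1)^{3/2}}.
\end{aligned} \]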
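Integrating the last expression over 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π gives the flux. Since an antiderivative of r/(r² + 1)^{3/2} is −(r² + 1)^{−1/2}, the value should be −2πGMm(1 − 1/√2). The sketch below checks this numerically by evaluating F on the disk and using r_r × r_θ = r k. Setting GMm = 1, the function names, and the grid size are assumptions made only for illustration.

```python
import math

GMm = 1.0  # assumed unit value of the constant G*M*m, for illustration only

def F(x, y, z):
    """Gravitational field -GMm (x, y, z) / (x^2 + y^2 + z^2)^(3/2)."""
    rho3 = (x * x + y * y + z * z) ** 1.5
    return (-GMm * x / rho3, -GMm * y / rho3, -GMm * z / rho3)

def upward_flux_through_disk(n=200):
    """Midpoint-rule estimate of the upward flux through z = 1, x^2 + y^2 <= 1."""
    dr, dtheta = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            Fx, Fy, Fz = F(r * math.cos(theta), r * math.sin(theta), 1.0)
            # r_r x r_theta = r k, so F . (r_r x r_theta) = r * Fz.
            total += r * Fz * dr * dtheta
    return total

flux = upward_flux_through_disk()
expected = -2 * math.pi * GMm * (1 - 1 / math.sqrt(2))
print(flux, expected)
```

The negative sign is consistent with the picture: the field points toward the origin, which lies below the plane z = 1, so it crosses the disk against the upward normal.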
■ Example 4.16 Let F = xi − yj. Find the upward flux of F over S, which is the upper part of
the sphere with radius √2 centered at the origin cut out by the plane z = 1.
■ Solution Since the surface is spherical, it is usually best to use spherical coordinates to
parametrize it. Under spherical coordinates, the sphere is represented by ρ = √2, so we use
θ and ϕ for the parameters. Let
\[ \mathbf{r}(\theta,\phi) = \left\langle \sqrt{2}\sin\phi\cos\theta,\ \sqrt{2}\sin\phi\sin\theta,\ \sqrt{2}\cos\phi \right\rangle. \]
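\[ \begin{aligned}
\frac{\partial\mathbf{r}}{\partial\theta} &= \left\langle -\sqrt{2}\sin\phi\sin\theta,\ \sqrt{2}\sin\phi\cos\theta,\ 0 \right\rangle \\
\frac{\partial\mathbf{r}}{\partial\phi} &= \left\langle \sqrt{2}\cos\phi\cos\theta,\ \sqrt{2}\cos\phi\sin\theta,\ -\sqrt{2}\sin\phi \right\rangle \\
\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi} &= \left\langle -2\sin^2\phi\cos\theta,\ -2\sin^2\phi\sin\theta,\ -2\cos\phi\sin\phi \right\rangle \\
\mathbf{F} = x\,\mathbf{i} - y\,\mathbf{j} &= \left\langle \sqrt{2}\sin\phi\cos\theta,\ -\sqrt{2}\sin\phi\sin\theta,\ 0 \right\rangle \\
\mathbf{F}\cdot\left(\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi}\right) &= -2\sqrt{2}\,(\cos^2\theta - \sin^2\theta)\sin^3\phi = -2\sqrt{2}\cos 2\theta\,\sin^3\phi.
\end{aligned} \]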
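Note that $\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi}$ obtained above is a downward normal since the k-component is negative.
The upward flux is given by:
\[ \begin{aligned}
\iint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS &= \int_0^{2\pi}\int_0^{\pi/4} \mathbf{F}\cdot\left(-\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi}\right) d\phi\,d\theta \\
&= \int_0^{2\pi}\int_0^{\pi/4} 2\sqrt{2}\cos 2\theta\,\sin^3\phi\,d\phi\,d\theta \\
&= 2\sqrt{2}\underbrace{\left(\int_0^{2\pi}\cos 2\theta\,d\theta\right)}_{=0}\left(\int_0^{\pi/4}\sin^3\phi\,d\phi\right) = 0.
\end{aligned} \]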
over this closed surface measures the net volume of fluid flowing in the direction of n̂ through
the surface, or in other words, the net volume of fluid flowing out from region D enclosed by
S.
If one denotes ϱ as the uniform density of the fluid, then
\[ \oiint_S \varrho\,\mathbf{u}\cdot\hat{\mathbf{n}}\,dS \]
measures the net mass of the fluid flowing out from the enclosed region D through its boundary
S per unit time. Assuming there is no sink or source inside D, by the conservation of mass, the
rate of change of the total mass of fluid enclosed by S is related to the surface flux by:
\[ \frac{\partial}{\partial t}\underbrace{\iiint_D \varrho\,dV}_{\text{total mass in } D} = -\oiint_S \varrho\,\mathbf{u}\cdot\hat{\mathbf{n}}\,dS. \]
Later we can apply the Divergence Theorem to the above relation to derive an important
equation of fluid flow.
Now suppose the vector field J represents the transfer of heat energy, measured in joules per
second. Note that while energy is a scalar, the transfer of heat at different points may have
different directions, and so it is a vector quantity. Again, take S to be a closed surface enclosing
a solid region D, then the surface flux of J through S:
\[ \oiint_S \mathbf{J}\cdot\hat{\mathbf{n}}\,dS \]
measures the amount of heat energy flowing out from the region D through S per unit time. This flux
integral is commonly called the heat flux through S by physicists.
If E is an electric field and, again, S is a closed surface, then the flux integral
\[ \oiint_S \mathbf{E}\cdot\hat{\mathbf{n}}\,dS \]
is commonly called the electric flux through S. A result by Gauss claims that this flux integral is
proportional to the total amount of charge enclosed by the surface. If B is a magnetic field
and, again, S is a closed surface, then the flux integral
\[ \oiint_S \mathbf{B}\cdot\hat{\mathbf{n}}\,dS \]
is commonly called the magnetic flux through S. Gauss’s Law for Magnetism asserts that it
must be zero.
where n̂ is the unit normal vector to S, with direction determined by the right-hand rule
(see Figure 4.17).
i By comparing the statements of the Green’s and Stokes’ Theorems, one can easily see that
the Green’s Theorem is a special case of the Stokes’ Theorem, in a sense that the former
applies to plane curve and the flat region enclosed by the curve on the xy-plane. The unit
normal vector n̂ for the plane region is obviously given by k if the plane curve is traveling
in the counter-clockwise orientation.
i For closed curves in the three-dimensional space, one cannot say whether they are counter-
clockwise or clockwise as it depends on the direction of observations. Therefore, the
counter-clockwise convention of the Green’s Theorem is generalized to the right-hand rule
condition in the statement of the Stokes’ Theorem.
i The Stokes’ Theorem applies only to orientable surfaces. That is, it may not hold for
surfaces such as the Möbius strip. Also, the condition that the vector field F needs to be
defined and C1 on the surface S is crucial. However, we will mostly deal with vector
fields that satisfy this condition.
i The condition that S has to be simply-connected is also crucial, but we will later learn how
to modify the Stokes’ Theorem so as to allow non-simply-connected surface S.
The proof of the Stokes’ Theorem is omitted here. Interested readers may consult a reference
textbook for a proof of one special case. Let’s look at some examples.
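Since the RHS is a surface integral, we need to parametrize S first in order to compute it.
Using spherical coordinates, the parametrization of S is given by:
\[ \mathbf{r}(\theta,\phi) = (2\sin\phi\cos\theta)\,\mathbf{i} + (2\sin\phi\sin\theta)\,\mathbf{j} + (2\cos\phi)\,\mathbf{k} \]
where 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ π/2. Then,
\[ \begin{aligned}
\frac{\partial\mathbf{r}}{\partial\theta} &= (-2\sin\phi\sin\theta)\,\mathbf{i} + (2\sin\phi\cos\theta)\,\mathbf{j} + 0\,\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial\phi} &= (2\cos\phi\cos\theta)\,\mathbf{i} + (2\cos\phi\sin\theta)\,\mathbf{j} + (-2\sin\phi)\,\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi} &= (-4\sin^2\phi\cos\theta)\,\mathbf{i} + (-4\sin^2\phi\sin\theta)\,\mathbf{j} + (-4\sin\phi\cos\phi)\,\mathbf{k} \\
\frac{\partial\mathbf{r}}{\partial\phi}\times\frac{\partial\mathbf{r}}{\partial\theta} &= (4\sin^2\phi\cos\theta)\,\mathbf{i} + (4\sin^2\phi\sin\theta)\,\mathbf{j} + (4\sin\phi\cos\phi)\,\mathbf{k}
\end{aligned} \]
Note that $\frac{\partial\mathbf{r}}{\partial\theta}\times\frac{\partial\mathbf{r}}{\partial\phi}$ is pointing downward, so we use $\frac{\partial\mathbf{r}}{\partial\phi}\times\frac{\partial\mathbf{r}}{\partial\theta}$ instead.
Next we need to compute ∇ × F:
\[ \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ z - y & x & -x \end{vmatrix} = 2\mathbf{j} + 2\mathbf{k}. \]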
i Although the Stokes’ Theorem was used in the previous example (as required by the
problem), it is actually easier to compute the line integral directly by parametrizing the
curve:
The purpose of the previous example is simply to illustrate how to use the Stokes’ Theorem,
although it is not necessary to use it. The line integral in the next example, however, would be
extremely difficult to compute without the Stokes’ Theorem.
■ Example 4.18 Let C be the curve of intersection of the plane Ax + By + Cz = 0 and the
sphere $x^2 + y^2 + z^2 = a^2$. See Figure 4.18. Show that
\[ \oint_C y\,dx + z\,dy + x\,dz = \pm\frac{\pi a^2 (A + B + C)}{\sqrt{A^2 + B^2 + C^2}} \]
where ± is determined by the orientation of C.
■ Solution The plane Ax + By + Cz = 0 passes through the origin. Therefore, the curve C is
a great circle on the sphere. However, this great circle is neither horizontal nor vertical, so it is
difficult to parametrize C to compute the line integral.
Let’s use the Stokes’ Theorem to see if there is any luck! When using the Stokes’ Theorem,
one needs to pick a surface S whose boundary curve is C. There are three choices:
1. the disk region enclosed by C on the plane
2. the hemisphere above the plane
3. the hemisphere below the plane
All of the above choices should give the same answer. However, let’s pick the region of the
plane enclosed by C to be the surface S, and we will explain why it is the smartest choice
among all three.
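Now the given line integral is associated to the vector field F = yi + zj + xk, i.e.
\[ \oint_C y\,dx + z\,dy + x\,dz = \oint_C \mathbf{F}\cdot d\mathbf{r}. \]
\[ \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ y & z & x \end{vmatrix} = -\mathbf{i} - \mathbf{j} - \mathbf{k}. \]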
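We also need the unit normal vector n̂. Since the region S lies on the plane whose equation is
Ax + By + Cz = 0, the unit normal is a constant vector given by:
\[ \hat{\mathbf{n}} = \pm\frac{A\,\mathbf{i} + B\,\mathbf{j} + C\,\mathbf{k}}{\sqrt{A^2 + B^2 + C^2}} \]
where ± is determined by the orientation of C.
Therefore,
\[ (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}} = \pm(-\mathbf{i} - \mathbf{j} - \mathbf{k})\cdot\frac{A\,\mathbf{i} + B\,\mathbf{j} + C\,\mathbf{k}}{\sqrt{A^2 + B^2 + C^2}} = \pm\frac{A + B + C}{\sqrt{A^2 + B^2 + C^2}}. \]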
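Next, we apply the Stokes’ Theorem to these C and S:
\[ \begin{aligned}
\oint_C \mathbf{F}\cdot d\mathbf{r} &= \iint_S (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dS \\
&= \pm\iint_S \frac{A + B + C}{\sqrt{A^2 + B^2 + C^2}}\,dS \\
&= \pm\frac{A + B + C}{\sqrt{A^2 + B^2 + C^2}}\underbrace{\iint_S dS}_{\text{surface area}} \\
&= \pm\frac{\pi a^2 (A + B + C)}{\sqrt{A^2 + B^2 + C^2}}.
\end{aligned} \]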
Note that the surface area of the region S on the plane is πa2 , since its boundary is a circle
with radius a.
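Although parametrizing the tilted great circle by hand is awkward, a computer can do it with an orthonormal basis of the plane, which gives a numerical check of the formula. The sketch below is one such check under stated assumptions: the basis construction, the function names, and the sample values A, B, C = 1, 2, 3 and a = 2 are illustrative choices, and the curve is oriented counter-clockwise about the normal (A, B, C), for which Stokes' Theorem with ∇ × F = −i − j − k predicts the minus sign.

```python
import math

def cross(p, q):
    """Cross product of two 3-vectors."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def normalize(p):
    """Return the unit vector in the direction of p."""
    length = math.sqrt(sum(c * c for c in p))
    return tuple(c / length for c in p)

def great_circle_integral(A, B, C, a, n=2000):
    """Riemann-sum estimate of the integral of y dx + z dy + x dz over the great circle."""
    nhat = normalize((A, B, C))
    # Any vector not parallel to nhat works for building a basis of the plane.
    helper = (1.0, 0.0, 0.0) if abs(nhat[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = normalize(cross(nhat, helper))
    e2 = cross(nhat, e1)  # e1 x e2 = nhat, so the curve is right-handed about nhat
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = i * dt
        x, y, z = (a * (math.cos(t) * e1[m] + math.sin(t) * e2[m]) for m in range(3))
        dx, dy, dz = (a * (-math.sin(t) * e1[m] + math.cos(t) * e2[m]) for m in range(3))
        total += (y * dx + z * dy + x * dz) * dt
    return total

A, B, C, a = 1.0, 2.0, 3.0, 2.0
approx = great_circle_integral(A, B, C, a)
predicted = -math.pi * a ** 2 * (A + B + C) / math.sqrt(A * A + B * B + C * C)
print(approx, predicted)
```

Reversing the orientation of the curve (for example by swapping the roles of e1 and e2) flips the sign, which is the ± in the statement of the example.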
There are two major reasons why the part of the plane enclosed by C is a smarter choice
for S than the hemispheres. For one thing, both ∇ × F and n̂ are constant vector field if
S is chosen to be a planar region, so that computing its surface integral is very easy – no
parametrization! For another, if any of the hemispheres were chosen to be S, then the surface
integral needs to be computed by parametrization – which can be tedious. It is also very
difficult to determine the range of values of ϕ and θ since the plane cutting the sphere is not
a horizontal one.
Occasionally, the Stokes’ Theorem can be applied to evaluate a surface integral over an
arbitrary or complicated surface which is not easy to parametrize. If the given vector field
F can be expressed in the form F = ∇ × G for another vector field G, then the (backward)
Stokes’ Theorem asserts that:
\[ \iint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS = \iint_S (\nabla\times\mathbf{G})\cdot\hat{\mathbf{n}}\,dS = \oint_C \mathbf{G}\cdot d\mathbf{r} \]
where C is the boundary curve of the surface S. Very often, the line integral is easier to compute
than the surface integral.
i Note that the above discussion holds only when the given vector field F is of the form
F = ∇ × G. If F is not of this form, there is no easy way to apply the Stokes’
Theorem backward.
■ Example 4.19 Let C be an arbitrary simple closed curve on the xy-plane in the xyz-space,
and S be an arbitrary surface above the xy-plane with boundary curve C. See Figure 4.19.
1. Verify that $\mathbf{i} = \nabla\times\left(-\frac{z}{2}\,\mathbf{j} + \frac{y}{2}\,\mathbf{k}\right)$.
2. Show that:
\[ \iint_S \mathbf{i}\cdot\hat{\mathbf{n}}\,dS = 0. \]
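\[ \begin{aligned}
\nabla\times\left(-\frac{z}{2}\,\mathbf{j} + \frac{y}{2}\,\mathbf{k}\right) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ 0 & -\frac{z}{2} & \frac{y}{2} \end{vmatrix} \\
&= \left(\frac{\partial}{\partial y}\left(\frac{y}{2}\right) - \frac{\partial}{\partial z}\left(-\frac{z}{2}\right)\right)\mathbf{i} - 0\,\mathbf{j} + 0\,\mathbf{k} \\
&= \mathbf{i}
\end{aligned} \]
For part (2), we denote $\mathbf{G} = -\frac{z}{2}\,\mathbf{j} + \frac{y}{2}\,\mathbf{k}$ for simplicity. Then i = ∇ × G. Applying the
Stokes’ Theorem backward, we get:
\[ \iint_S \mathbf{i}\cdot\hat{\mathbf{n}}\,dS = \iint_S (\nabla\times\mathbf{G})\cdot\hat{\mathbf{n}}\,dS = \oint_C \mathbf{G}\cdot d\mathbf{r}. \]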
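The remaining step is to evaluate the line integral of G over C. One way to finish, sketched here using only the assumption stated in the example that C lies on the xy-plane: along C we have z = 0, and hence dz = 0, so both terms of G · dr vanish.

```latex
\[
\oint_C \mathbf{G}\cdot d\mathbf{r}
= \oint_C \left(-\frac{z}{2}\,dy + \frac{y}{2}\,dz\right)
= \oint_C \left(-\frac{0}{2}\,dy + \frac{y}{2}\cdot 0\right)
= 0,
\qquad\text{so}\qquad
\iint_S \mathbf{i}\cdot\hat{\mathbf{n}}\,dS = 0 .
\]
```

Note that no parametrization of S or C was needed, which is the point of applying the theorem backward.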
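determined by the right-hand rule. Since the surface S is very small, one can regard the
quantity (∇ × F) · n̂ as nearly constant over the surface S. By the Stokes’ Theorem, we have:
\[ \oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dS \simeq (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\iint_S dS. \]
Therefore,
\[ (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}} \simeq \frac{\oint_C \mathbf{F}\cdot d\mathbf{r}}{\text{surface area of } S}. \]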
This quantity is large when the vector field F is circular about the normal vector n̂. In other
words, the quantity (∇ × F) · n̂ measures the circulation density around any given point. That’s
why ∇ × F is often called the curl of F.
which recovers the result we proved before in Theorem 4.1. Of course, the Stokes’ Theorem as-
serts more than Theorem 4.1 does because it applies to any vector field, not just the conservative
ones.
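Summing up all three equations and cancelling out the $L_i$ terms, we get:
\[ \left(\int_{C_1} + \int_{C_2} + \int_{C_3} + \int_{C_4} - \int_{\Gamma_1^-} - \int_{\Gamma_1^+} - \int_{\Gamma_2^-} - \int_{\Gamma_2^+}\right)\mathbf{F}\cdot d\mathbf{r}
= \left(\iint_{S_1} + \iint_{S_2} + \iint_{S_3}\right)(\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dS \]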
While the surface integral (RHS) is in the same form as in the usual Stokes’ Theorem, the LHS
is not summing up the boundary line integrals, but rather the outer boundary has a plus sign
in front and the inner boundaries each has a minus sign in front.
For more complicated surfaces (with many, but finitely many, holes), one can apply the
technique illustrated above to establish:
Theorem 4.7 — Stokes’ Theorem for Higher Genus^a Surfaces. Let S be an orientable surface
in R3 with n holes. Denote C to be its outer boundary, and Γ1, Γ2, . . . , Γn to be its inner
boundaries. Suppose F is a vector field which is defined and C1 on the surface S, then:
\[ \oint_C \mathbf{F}\cdot d\mathbf{r} - \sum_{i=1}^{n}\oint_{\Gamma_i}\mathbf{F}\cdot d\mathbf{r} = \iint_S (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dS. \]
Here n̂ is the unit normal vector to S with orientation determined by the right-hand rule
applied to the outer boundary C.
a The word genus is the mathematical term for the number of “holes” inside the surface.
4.7 Divergence Theorem 125
i We will see the geometric interpretation of ∇ · F after we state the Divergence Theorem.
Essentially, it measures how diverging the vector field is.
Theorem 4.8 — Divergence Theorem. Let S be a closed orientable surface enclosing a simply-
connected solid region D. Suppose F is a vector field which is defined and C1 in and near the
region D, then we have:
\[ \underbrace{\oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS}_{\text{surface integral}} = \underbrace{\iiint_D \nabla\cdot\mathbf{F}\,dV}_{\text{triple integral}}. \]
The Divergence Theorem is particularly useful for computing the flux over a closed surface,
as the theorem says we do not need to parametrize the surface and compute the normal vector.
Let’s look at some examples.
■ Example 4.20 Consider the vector field F = 3xi + 4yj − 5zk. Let S be the sphere with radius
a centered at the origin. Evaluate the flux integral $\oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS$ with outward unit normal n̂.
■ Solution You can imagine the computation would be quite tedious if we computed this
flux integral directly by parametrizing the sphere. However, since S is a closed surface, one
can try to use the Divergence Theorem:
\[ \nabla\cdot\mathbf{F} = \frac{\partial}{\partial x}(3x) + \frac{\partial}{\partial y}(4y) + \frac{\partial}{\partial z}(-5z) = 3 + 4 - 5 = 2. \]
Denote D to be the solid ball with radius a centered at the origin, i.e. the solid region
enclosed by S. By the Divergence Theorem, we have:
\[ \oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS = \iiint_D \nabla\cdot\mathbf{F}\,dV = \iiint_D 2\,dV = 2\cdot\frac{4}{3}\pi a^3 = \frac{8}{3}\pi a^3. \]
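The direct computation that the Divergence Theorem lets us avoid is tedious by hand but easy to approximate numerically, which gives an independent check of the answer 8πa³/3. The sketch below sums F · n̂ dS over the sphere using the outward normal n̂ = (x, y, z)/a and dS = a² sin ϕ dϕ dθ. The function name, the midpoint rule, the grid size, and the sample radius a = 1.5 are assumptions for illustration.

```python
import math

def outward_flux_through_sphere(F, a, n=200):
    """Midpoint-rule estimate of the outward flux of F through the sphere of radius a."""
    dtheta, dphi = 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        for j in range(n):
            phi = (j + 0.5) * dphi
            # Point on the sphere and its outward unit normal (x, y, z) / a.
            x = a * math.sin(phi) * math.cos(theta)
            y = a * math.sin(phi) * math.sin(theta)
            z = a * math.cos(phi)
            Fx, Fy, Fz = F(x, y, z)
            f_dot_n = (Fx * x + Fy * y + Fz * z) / a
            # Surface element on the sphere: dS = a^2 sin(phi) dphi dtheta.
            total += f_dot_n * a * a * math.sin(phi) * dtheta * dphi
    return total

a = 1.5  # sample radius, chosen arbitrarily
flux = outward_flux_through_sphere(lambda x, y, z: (3 * x, 4 * y, -5 * z), a)
expected = 8 * math.pi * a ** 3 / 3
print(flux, expected)
```

The agreement of the two numbers illustrates the theorem for this field; it is a check, not a proof.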
■ Solution The closed surface S has six faces! If one attempts to compute the flux integral
directly, one needs to split it into six integrals, corresponding to each of its six faces.
However, if one applies the Divergence Theorem, the difficult surface integral becomes a
triple integral over a rectangular region which is very easy to set-up.
We first compute:
\[ \nabla\cdot\mathbf{F} = \frac{\partial}{\partial x}(x^2) + \frac{\partial}{\partial y}(4xyz) + \frac{\partial}{\partial z}(ze^x) = 2x + 4xz + e^x. \]
We omit the computational detail of the triple integral above, which is very straightforward.
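One should note that the surface integral stated in the Divergence Theorem is that of F · n̂:
\[ \oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS \]
but not of (∇ × F) · n̂. In fact, using the Divergence Theorem, one can show that
\[ \oiint_S (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dS = 0. \]
It is because:
\[ \oiint_S \underbrace{(\nabla\times\mathbf{F})}_{=:\mathbf{G}}\cdot\hat{\mathbf{n}}\,dS = \iiint_D \nabla\cdot\mathbf{G}\,dV = \iiint_D \nabla\cdot(\nabla\times\mathbf{F})\,dV. \]
Using the Mixed Partials Theorem, all of the second derivatives in ∇ · (∇ × F) cancel out, and so
\[ \nabla\cdot(\nabla\times\mathbf{F}) = 0. \]
Therefore, we get:
\[ \oiint_S (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dS = 0. \]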
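For readers who want to see the cancellation explicitly, here is a sketch of the expansion. The component names P, Q, R for F = Pi + Qj + Rk are notation introduced only for this illustration, and F is assumed to have continuous second partial derivatives so that mixed partials commute.

```latex
\[
\nabla\times\mathbf{F}
= \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right)\mathbf{i}
+ \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right)\mathbf{j}
+ \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\mathbf{k},
\]
\[
\nabla\cdot(\nabla\times\mathbf{F})
= \frac{\partial^2 R}{\partial x\,\partial y} - \frac{\partial^2 Q}{\partial x\,\partial z}
+ \frac{\partial^2 P}{\partial y\,\partial z} - \frac{\partial^2 R}{\partial y\,\partial x}
+ \frac{\partial^2 Q}{\partial z\,\partial x} - \frac{\partial^2 P}{\partial z\,\partial y}
= 0 .
\]
```

Each second derivative appears twice with opposite signs, once in each order of differentiation, so the six terms cancel in pairs.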
When the region D is very small, one can regard ∇ · F as nearly constant, and so we have:
\[ \oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS \simeq (\nabla\cdot\mathbf{F})\times\text{volume of } D. \]
In other words, ∇ · F measures the flux density near a point. The more diverging F is
around a point, the higher the flux over a tiny closed surface around that point, resulting in
a greater value of ∇ · F. This justifies the use of the name divergence for ∇ · F.
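Under the spherical coordinates (ρ, θ, ϕ), we have $\rho^2 = x^2 + y^2 + z^2$. The vector field
$\frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}}$ is the unit radial vector field. For simplicity, we denote
\[ \mathbf{e}_\rho = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}}, \]
then the gravitational vector field can be expressed as:
\[ \mathbf{F} = -\frac{GMm}{\rho^2}\,\mathbf{e}_\rho. \]
If S is the sphere with radius a centered at the origin, then its outward unit normal is also
radial and so we have n̂ = e_ρ, and ρ = a on S. Therefore, we have:
\[ \begin{aligned}
\mathbf{F}\cdot\hat{\mathbf{n}} &= -\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\mathbf{e}_\rho \\
&= -\frac{GMm}{\rho^2}\,|\mathbf{e}_\rho|^2 \\
&= -\frac{GMm}{\rho^2}.
\end{aligned} \]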
The Divergence Theorem does not hold in this case. The reason is that the vector field F is not
defined at the origin and the surface S encloses the origin!
To conclude, one needs to be very careful when applying the Divergence Theorem if the
region D contains some points at which the vector field is not defined. In the next subsection,
we will learn how to apply the Divergence Theorem (in a modified way) when the surface
encloses some points at which the vector field is not defined.
We have shown that it is so when S is a sphere centered at the origin, and we are going to
use the Divergence Theorem to show that it is always true for any closed surface S enclosing
the origin. However, we need to be very careful when applying the Divergence Theorem since
the gravitational field is undefined at the origin.
We will adopt the “hole-drilling” technique which was previously used in computing the
winding number integral. Given a solid D containing the origin, we first construct a small
sphere B with radius a centered at the origin. Then, the solid D \ B (i.e. the solid D with B
removed) is a solid not enclosing the “bad” point origin.
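Next, we cut this solid into two parts by the horizontal plane z = 0. Label each side of the
resulting solids by $S_i$, Π and $\Sigma_i$ as shown in Figure 4.21. Note that Π is the common side.
Gluing $S_1$, Π and $\Sigma_1$ together gives a closed surface not enclosing the origin. Denote $D_1$ to
be the solid enclosed by this closed surface. Hence, one can apply the Divergence Theorem
without any issue:
\[ \left(\iint_{S_1} + \iint_{\Pi} + \iint_{\Sigma_1}\right)\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}\,dS = \iiint_{D_1}\underbrace{\nabla\cdot\left(\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\right)}_{=0}\,dV = 0 \]
where n̂ is the outward unit normal of the boundary surface of $D_1$. Denote $\hat{\mathbf{n}}_{\text{up}}$ and $\hat{\mathbf{n}}_{\text{down}}$ to
be the upward and downward normal vectors respectively. The above integrals can be expressed
as:
\[ \iint_{S_1}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{up}}\,dS + \iint_{\Pi}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{down}}\,dS + \iint_{\Sigma_1}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{down}}\,dS = 0. \tag{4.3} \]
Similarly, gluing $S_2$, Π and $\Sigma_2$ together gives a closed surface not enclosing the origin. By
the Divergence Theorem applied to this surface, we get:
\[ \iint_{S_2}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{down}}\,dS + \iint_{\Pi}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{up}}\,dS + \iint_{\Sigma_2}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{up}}\,dS = 0. \tag{4.4} \]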
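We then add up the above two equations. First note that $S_1$ and $S_2$ can glue together to
form the closed surface S. Both $\hat{\mathbf{n}}_{\text{up}}$ of $S_1$ and $\hat{\mathbf{n}}_{\text{down}}$ of $S_2$ become the outward normal of S.
Therefore,
\[ \iint_{S_1}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{up}}\,dS + \iint_{S_2}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{down}}\,dS = \oiint_S\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{outward}}\,dS. \]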
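For the planar surface Π, the downward normal $\hat{\mathbf{n}}_{\text{down}}$ is in the opposite direction of the
upward normal $\hat{\mathbf{n}}_{\text{up}}$, i.e. $\hat{\mathbf{n}}_{\text{down}} = -\hat{\mathbf{n}}_{\text{up}}$. Therefore,
\[ \iint_{\Pi}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{down}}\,dS + \iint_{\Pi}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{up}}\,dS = 0. \]
Finally, the surfaces $\Sigma_1$ and $\Sigma_2$ glue together to form the closed sphere Σ. The normal
vectors $\hat{\mathbf{n}}_{\text{down}}$ of $\Sigma_1$ and $\hat{\mathbf{n}}_{\text{up}}$ of $\Sigma_2$ are the inward unit normal of Σ. Therefore,
\[ \iint_{\Sigma_1}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{down}}\,dS + \iint_{\Sigma_2}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{up}}\,dS = \oiint_{\Sigma}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{inward}}\,dS. \]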
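Therefore,
\[ \begin{aligned}
\oiint_S\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{outward}}\,dS &= -\oiint_{\Sigma}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{inward}}\,dS \\
&= \oiint_{\Sigma}\frac{GMm}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}_{\text{outward}}\,dS \\
&= 4\pi GMm \quad\text{(computed before)}
\end{aligned} \]
This holds true for any closed surface S enclosing the origin. This proves Gauss’s Law for
Gravity (assuming the inverse-square law).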
However, if S does not enclose the origin, then one can apply the Divergence Theorem on
the gravitational vector field without any issue.
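To conclude, for any closed surface S not passing through the origin, we have:
\[ \oiint_S GMm\,\frac{1}{\rho^2}\,\mathbf{e}_\rho\cdot\hat{\mathbf{n}}\,dS = \begin{cases} 4\pi GMm & \text{if } S \text{ encloses the origin;} \\ 0 & \text{otherwise.} \end{cases} \]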
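The claim that the flux depends only on whether S encloses the origin, and not on the shape of S, can be tested numerically on a surface that is not a sphere. The sketch below computes the outward flux of e_ρ/ρ² (that is, GMm = 1) through an axis-aligned cube, once with the origin inside and once with the cube moved away. The cube, its centers, the function names, and the grid size are all illustrative assumptions.

```python
import math

def field(x, y, z):
    """Inverse-square radial field e_rho / rho^2, i.e. GMm = 1."""
    rho3 = (x * x + y * y + z * z) ** 1.5
    return (x / rho3, y / rho3, z / rho3)

def flux_through_cube(center, half=1.0, n=100):
    """Midpoint-rule estimate of the outward flux through an axis-aligned cube."""
    h = 2 * half / n
    total = 0.0
    for axis in range(3):            # faces perpendicular to the x, y and z axes
        for sign in (-1.0, 1.0):     # outward normal on this face is sign * e_axis
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = center[axis] + sign * half
                    p[(axis + 1) % 3] = center[(axis + 1) % 3] - half + (i + 0.5) * h
                    p[(axis + 2) % 3] = center[(axis + 2) % 3] - half + (j + 0.5) * h
                    total += sign * field(*p)[axis] * h * h
    return total

enclosing = flux_through_cube((0.2, -0.1, 0.3))    # origin lies inside this cube
not_enclosing = flux_through_cube((3.0, 0.0, 0.0))  # origin lies outside this cube
print(enclosing, 4 * math.pi, not_enclosing)
```

The first value should be close to 4π even though the cube is off-center, and the second should be close to 0, matching the two cases above.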
4.8 Heat Diffusion (Optional) 131
J = − a∇u
where J is a vector field (in energy per second) representing the flow of heat energy, and a is a
positive constant depending on the medium. In other words, heat energy diffuses from higher
temperature regions to lower ones, and the rate of diffusion is proportional to the magnitude
of ∇u.
Let D be an arbitrary solid region with boundary surface S. Denote ϱ to be the energy
density function (in energy per volume), which equals bu for some positive constant b whose
value depends on the medium. Then the triple integral:
\[ \iiint_D \varrho\,dV \]
measures the total heat energy contained in D, while the flux integral of J through S
measures the rate of heat loss through the closed surface S. By the conservation of heat
energy, any heat energy lost from D must escape through the surface S. In mathematical terms, it is stated as:
\[ \frac{\partial}{\partial t}\iiint_D \varrho\,dV = -\oiint_S \mathbf{J}\cdot\hat{\mathbf{n}}\,dS. \]
The negative sign appears on the RHS because of the outward convention of n̂.
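Applying the above physical laws, we get:
\[ \begin{aligned}
\iiint_D b\,\frac{\partial u}{\partial t}\,dV &= -\oiint_S (-a\nabla u)\cdot\hat{\mathbf{n}}\,dS \\
\iiint_D b\,\frac{\partial u}{\partial t}\,dV &= \oiint_S a\nabla u\cdot\hat{\mathbf{n}}\,dS \\
\iiint_D b\,\frac{\partial u}{\partial t}\,dV &= \iiint_D \nabla\cdot(a\nabla u)\,dV \quad\text{(Divergence Theorem)}
\end{aligned} \]
Since D is an arbitrary solid region, the two integrands must be equal:
\[ b\,\frac{\partial u}{\partial t} = \nabla\cdot(a\nabla u) = a\,\nabla\cdot\nabla u. \]
We leave it as an exercise for readers to verify that:
\[ \nabla\cdot\nabla u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}. \]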
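For readers who want a hint for this exercise, the following is a short sketch that uses only the definitions of the gradient and the divergence:

```latex
\[
\nabla u = \frac{\partial u}{\partial x}\,\mathbf{i} + \frac{\partial u}{\partial y}\,\mathbf{j} + \frac{\partial u}{\partial z}\,\mathbf{k}
\quad\Longrightarrow\quad
\nabla\cdot\nabla u
= \frac{\partial}{\partial x}\!\left(\frac{\partial u}{\partial x}\right)
+ \frac{\partial}{\partial y}\!\left(\frac{\partial u}{\partial y}\right)
+ \frac{\partial}{\partial z}\!\left(\frac{\partial u}{\partial z}\right)
= \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}.
\]
```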
\[ \Phi(x,y,z,t) = \frac{1}{(4\pi kt)^{3/2}}\exp\left(-\frac{x^2 + y^2 + z^2}{4kt}\right). \]
\[ \lim_{t\to 0}\Phi(0,0,0,t) = \lim_{t\to 0}\frac{1}{(4\pi kt)^{3/2}} = \infty. \]
In contrast, if (x, y, z) ≠ (0, 0, 0), both $\exp\left(-\frac{x^2 + y^2 + z^2}{4kt}\right)$ and $(4\pi kt)^{3/2}$ go to 0 as t → 0.
However, the exponential term goes to 0 faster than the $t^{3/2}$ term, so
\[ \lim_{t\to 0}\Phi(x,y,z,t) = \lim_{t\to 0}\frac{1}{(4\pi kt)^{3/2}}\exp\left(-\frac{x^2 + y^2 + z^2}{4kt}\right) = 0 \quad\text{when } (x,y,z)\neq(0,0,0). \]
Therefore, the function Φ(x, y, z, t) represents the heat diffusion starting from a highly
concentrated heat source at t = 0. As time goes on, the temperature distribution becomes more and more
uniform.
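One can also check numerically that Φ satisfies the heat equation ∂Φ/∂t = k∆Φ, where ∆ denotes the sum of the three second partial derivatives. The sketch below estimates both sides with central finite differences at a single sample point and reports the difference. The value k = 0.5, the sample point, the step size h, and the function names are assumptions chosen only for illustration.

```python
import math

k = 0.5  # assumed diffusion constant, chosen only for illustration

def Phi(x, y, z, t):
    """The function Phi(x, y, z, t) defined above."""
    return math.exp(-(x * x + y * y + z * z) / (4 * k * t)) / (4 * math.pi * k * t) ** 1.5

def heat_residual(x, y, z, t, h=1e-3):
    """Finite-difference estimate of dPhi/dt - k * Laplacian(Phi) at one point."""
    d_dt = (Phi(x, y, z, t + h) - Phi(x, y, z, t - h)) / (2 * h)
    center = Phi(x, y, z, t)
    laplacian = (Phi(x + h, y, z, t) + Phi(x - h, y, z, t)
                 + Phi(x, y + h, z, t) + Phi(x, y - h, z, t)
                 + Phi(x, y, z + h, t) + Phi(x, y, z - h, t)
                 - 6 * center) / (h * h)
    return d_dt - k * laplacian

residual = heat_residual(0.3, -0.2, 0.5, 1.0)
print(residual)  # should be tiny compared with the size of dPhi/dt itself
```

A small residual at one point is only a spot check; verifying the equation in general requires differentiating Φ symbolically.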
In general, if the initial temperature distribution is given by the function g(x, y, z), it can be
shown (proof beyond the scope of the course) that the following function
\[ u(x,y,z,t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\Phi(x - u,\, y - v,\, z - w,\, t)\,g(u,v,w)\,du\,dv\,dw \]
satisfies the heat equation $\frac{\partial u}{\partial t} = k\Delta u$ with initial condition g(x, y, z), meaning that
\[ \lim_{t\to 0} u(x,y,z,t) = g(x,y,z). \]
In other words, the function u(x, y, z, t) predicts how heat diffuses when given an initial
temperature profile g(x, y, z). However, the triple integral involved is generally difficult to
evaluate explicitly.
∆u = 0.
u( x, y, z) = C for any ( x, y, z) on S.
Next we consider the vector field (u − C)∇u. We leave it as an exercise for readers to verify
from the definition that:
\[ \nabla\cdot\bigl((u - C)\nabla u\bigr) = |\nabla u|^2 + (u - C)\,\Delta u. \]
At steady state, we have ∆u = 0 and so $\nabla\cdot((u - C)\nabla u) = |\nabla u|^2$. Next we integrate this result
over D:
\[ \iiint_D \nabla\cdot\bigl((u - C)\nabla u\bigr)\,dV = \iiint_D |\nabla u|^2\,dV. \]
Applying the Divergence Theorem to the LHS, we get:
\[ \oiint_S (u - C)\nabla u\cdot\hat{\mathbf{n}}\,dS = \iiint_D |\nabla u|^2\,dV. \]
From our assumption, we have u = C for any point on S. Therefore, the integrand (u − C)∇u · n̂
of the flux integral on the LHS is zero, and so:
\[ 0 = \iiint_D |\nabla u|^2\,dV. \]
Since the integrand $|\nabla u|^2$ is non-negative, the only scenario for the above to happen is that
$|\nabla u|^2 = 0$, i.e. ∇u = 0, everywhere in D.
Therefore, u must be a constant in the region D, and by continuity, this constant must be C.