
CP4

Multiple Integrals and Vector Calculus

Sam Vinko
Hilary 2024
Contents

1 A refresher: vectors, planes, fields
   1.1 Some basic definitions
   1.2 Vectors
   1.3 Functions of multiple variables
   1.4 Scalar and vector fields
   1.5 Lines and planes in space

2 Multiple integrals
   2.1 Non-rectangular domains
   2.2 Triple integrals
   2.3 The Jacobian matrix and determinant
   2.4 Curvilinear coordinate systems
      2.4.1 2D: Plane polar coordinates
      2.4.2 3D: Cylindrical coordinates
      2.4.3 3D: Spherical polar coordinates

3 Statistical Distributions
   3.1 Continuous random variables
      3.1.1 Expectation value
   3.2 Distribution functions
   3.3 Joint probability distributions
   3.4 Marginal and conditional distributions
   3.5 The Dirac Delta

4 Curves and vector functions
   4.1 Derivative of a vector function
   4.2 Integral of a vector function
   4.3 Arc length
   4.4 Frenet-Serret coordinate system
   4.5 Line integrals
      4.5.1 Line integrals of scalar and vector fields
   4.6 Conservative fields
   4.7 Green’s theorem

5 Surfaces
   5.1 The Gradient
      5.1.1 Directional derivatives
      5.1.2 The gradient in curvilinear coordinate systems
      5.1.3 Gradient expressions
   5.2 Surface integrals
      5.2.1 Vector area
      5.2.2 Normal vector to a surface
      5.2.3 Surface integrals of scalar fields
      5.2.4 Surface integrals of vector fields

6 Vector fields and operators
   6.1 The Divergence
   6.2 The Divergence Theorem
      6.2.1 Divergence theorem in 2D
   6.3 The Curl
   6.4 Stokes’ theorem
   6.5 Some differential identities

7 Revision
   7.1 Summary of some key points
   7.2 Worked examples
CP4 Multiple Integrals and Vector Calculus syllabus
• Double integrals and their evaluation by repeated integration in Cartesian, plane polar
and other specified coordinate systems.

• Jacobians.

• Probability theory and general probability distributions.

• Line, surface and volume integrals, evaluation by change of variables (Cartesian, plane polar, spherical polar coordinates and cylindrical coordinates only unless the transformation to be used is specified).

• Integrals around closed curves and exact differentials.

• Scalar and vector fields.

• The operations of grad, div and curl and understanding and use of identities involving
these.

• The statements of the theorems of Gauss and Stokes with simple applications.

• Conservative fields.

Recommended books

¹ What this looks like according to generative AI, graphically, for those interested, is depicted on the title page.

1 A refresher: vectors, planes, fields
1.1 Some basic definitions
In this course we will be interested in manipulating vector quantities of various kinds, so we
should start by recalling the definition of fields, vectors and vector spaces, and setting out the
preferred notation used in this course. First, a few definitions that will help clarify the language
we will be using.
• A set is a collection of mathematical objects.
• A set G with a binary operation that combines two elements in the set to yield another
element in the set is called a group. Mathematically this can be written as G × G → G.
This operation has to have certain properties (also called the axioms), and together with
the set these constitute the algebraic structure of the group. Specifically, for a group
these are closure, associativity, ∃ of an identity element, and ∃ of inverse elements. If the
operation also commutes (i.e., it does not depend on the order in which the elements are
written in the operation), then the group is said to be a commutative or an Abelian group.
• A set with two binary operations of addition and multiplication is called a field F . These
must satisfy six field axioms, namely associativity, commutativity, ∃ of additive and multi-
plicative identity, ∃ of additive and multiplicative inverses, and distributivity. We see the
complexity of the structure is increasing; in particular, we note that a field is an Abelian
group under both addition and multiplication, the two being connected via the requirement
of distributivity. Note that the set of rationals Q, real numbers R, and complex numbers
C are all fields, but the set of integers Z is not a field as the multiplicative inverse of
every integer is not necessarily another integer (e.g., the multiplicative inverse of 2 is 1/2,
not an integer). The integers are, instead, a ring. Rings generalize fields: they too have
two operations, but multiplication does not have to be commutative, and multiplicative
inverses need not exist.
• A vector space over a field F is a set V together with two binary operations: vector
addition (V × V → V ) and scalar multiplication (V × F → V ). Note the difference with
the definition of a field: the notion of multiplication between vectors is not part of the
fundamental algebraic structure of a vector space. In fact, as we know there are multiple
ways to multiply vectors, and the multiplicative inverse (i.e., vector division) is not defined.
The elements of the field are called scalars to distinguish them from the elements of the
vector space (vectors), and because the definition of multiplication acts to scale the vectors.
We will typically be interested in vector spaces defined over R: this is called a real vector
space. Eight axioms must be satisfied for the operations permitted on a vector space:
associativity and commutativity of addition, ∃ of identity elements for addition and scalar
multiplication, ∃ of additive inverse, distributivity for scalar multiplication with respect
to vector and field addition, and compatibility of scalar and field multiplication.

1.2 Vectors
Throughout this course we will be interested in n-dimensional real vector spaces and associated
real vectors, defined over Rn . We will write these vectors generally as
r = (x1 , x2 , · · · , xn ) r ∈ Rn ,
where xi are called the components of r. Often it will be convenient to use a slightly different
notation in the special cases of 2 or 3 dimensions, where instead we write
r = (x, y) r ∈ R2 .
r = (x, y, z) r ∈ R3 .

The modulus or norm of a vector is given by

|r| ≡ r = √( Σ_{i=1}^{n} xi² ).  (1)

This allows us to define a unit vector (a vector with norm 1) as

r̂ = r / |r|.  (2)

The standard or canonical basis of Rn is given by the unit vectors

ê1 = (1, 0, 0, · · · , 0)
ê2 = (0, 1, 0, · · · , 0)
..
.
ên = (0, 0, 0, · · · , 1).

In this basis we can write a generic vector r as

r = x1 ê1 + x2 ê2 + · · · + xn ên = Σ_{i=1}^{n} xi êi.  (3)

Note that the subscripts mean two different things: in one case they denote the component of
the vector (a scalar), and in the other a specific element of the basis set (a vector). That this
decomposition of a generic vector r can be done follows directly from the axioms of the vector
space: we explicitly use both the concepts of vector sums and scalar multiplications to identify
all possible elements of the vector space.
Once again, it will be convenient to use special notation for the 2 and 3 dimensional cases
when discussing the sets of canonical unit vectors:

(î, ĵ) on a surface in R2 ,


(î, ĵ, k̂) in a volume in R3 .

With this notation a generic vector in 3D can be written as

r = x î + y ĵ + z k̂. (4)
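As a quick numerical check of Eqs. (1), (2) and (4), the norm and unit vector can be computed directly from the components. The sketch below uses hypothetical helper names and plain Python; it simply mirrors the definitions:

```python
import math

def norm(r):
    """Eq. (1): the modulus |r| = sqrt of the sum of squared components."""
    return math.sqrt(sum(x * x for x in r))

def unit(r):
    """Eq. (2): r_hat = r / |r|, a vector with norm 1."""
    n = norm(r)
    return tuple(x / n for x in r)

r = (3.0, 4.0, 12.0)
print(norm(r))        # 13.0
print(unit(r))        # components scaled so the norm is 1
print(norm(unit(r)))  # 1.0 (up to floating-point rounding)
```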

In addition to the basic operations provided for by the axioms of a vector space, there are
a few others that will prove useful. The first is the scalar or dot product. This is an operation
that takes in two vectors elements of the vector space V and outputs a scalar, an element of
the field F over which the vector space is defined: V × V → F . For two vectors u, v ∈ V , with
components ui and vi , we write this operation as

u · v ≡ hu, vi = u1 v1 + u2 v2 + · · · + un vn . (5)

In 2 dimensions it is easy to see that this definition can also be expressed geometrically as

u · v = |u||v| cos ϑ, (6)

which allows us to define the concept of the angle between two vectors in n dimensions as

cos ϑ = (u · v) / (|u||v|).  (7)

We say that two vectors are orthogonal iff u · v = 0. This is easy to see geometrically in the
case of 2 or 3 dimensions from Eq. (6), but it holds more generally for vectors in any number of
dimensions.
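Eqs. (5)–(7) translate directly into code. A minimal sketch (hypothetical helper names) that computes the dot product and the angle between two vectors, and checks orthogonality:

```python
import math

def dot(u, v):
    """Eq. (5): u . v = sum over u_i * v_i."""
    return sum(a * b for a, b in zip(u, v))

def angle(u, v):
    """Eq. (7): theta = acos( (u . v) / (|u||v|) )."""
    nu = math.sqrt(dot(u, u))
    nv = math.sqrt(dot(v, v))
    return math.acos(dot(u, v) / (nu * nv))

print(angle((1.0, 1.0), (1.0, 0.0)))           # pi/4: the 2D diagonal vs the x-axis
print(dot((1.0, 0.0, 0.0), (0.0, 3.0, 0.0)))   # 0.0 -> the vectors are orthogonal
```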
In 3 dimensions we will see that it will be convenient to define another product operation
between vectors, but which, unlike the scalar product, outputs a vector: V × V → V . We call
this the vector or cross product. For two vectors u, v ∈ R3 , with components ui and vi , it is
defined as

u × v = det [ î   ĵ   k̂
              u1  u2  u3
              v1  v2  v3 ] .  (8)
We note that if u, v are linearly dependent, i.e. u = λv, λ ∈ R, then because of the permutation
rules of the determinant the above quantity is 0. If u, v are linearly independent, then they
define a plane, and the vector product will create a new vector normal to this plane (i.e., to the
entire subspace spanned by u, v). This provides insight into how the operation can be applied
to vectors defined in R2 rather than R3 : this is a special case where the basis has been aligned
with the plane spanned by u, v, and its normal. So for u, v ∈ R2 we can pad the missing third dimension of the vectors with zeros and write

u × v = det [ î   ĵ   k̂
              u1  u2  0
              v1  v2  0 ] = (u1 v2 − v1 u2) k̂.  (9)

The vector product can be generalized to more than 3 dimensions but this is seldom useful.
With these two operations we can write the triple product between three vectors u, v, w ∈ R3 as

u · (v × w) = det [ u1  u2  u3
                    v1  v2  v3
                    w1  w2  w3 ] .  (10)

This operation necessarily produces a scalar, and can be denoted as a map V × V × V → F . Note that the parentheses are not strictly needed, as there is only one way to get a meaningful result from the operation that is consistent with the definitions of the scalar and vector products. The triple product possesses a useful geometric property: its magnitude measures the volume of the parallelepiped formed by the three vectors u, v, w.
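The cofactor expansions of the determinants in Eqs. (8) and (10) can be sketched as follows (hypothetical helpers). The triple product of the canonical basis gives the unit-cube volume, and linearly dependent vectors give a zero cross product:

```python
def cross(u, v):
    """Eq. (8): cofactor expansion of the 3x3 determinant along the top row."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triple(u, v, w):
    """Eq. (10): u . (v x w); its magnitude is the parallelepiped volume."""
    c = cross(v, w)
    return sum(a * b for a, b in zip(u, c))

# The unit cube spanned by the canonical basis has volume 1:
print(triple((1, 0, 0), (0, 1, 0), (0, 0, 1)))  # 1
# Linearly dependent vectors (u = 2v) give a zero cross product:
print(cross((2, 4, 6), (1, 2, 3)))  # (0, 0, 0)
```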

1.3 Functions of multiple variables


With the armamentarium of vectors, fields and operations permitted on vector spaces, we are
ready to consider a wide range of mappings (or functions) connecting vectors and scalars. In
particular, we will be interested in the integro-differential calculus of such functions across
various multidimensional domains. Because the main application of these tools lies in physics,
we will focus our discussions on 2 and 3 dimensional spaces, but much of the underlying theory still holds in the same form for any number of dimensions.
In general, we think of a function as a map f : A → B between two sets A and B that assigns to each element in A exactly one element in B. The set A is called the domain of f , and
B is called the codomain. For a one-dimensional function defined over the reals on some domain
A this would be written as f : A ⊆ R → R. There are three main ways in which we can extend
these ideas to vector calculus in multiple dimensions:

1. Functions that map multi-dimensional spaces to scalars f : A ⊆ Rn → R. We will call this a multivariable function. An example of this would be the equation of a 2D surface defined over the (x, y) plane as z = f (x, y). For each point in the (x, y) plane the function returns the height of the surface above that point.

Figure 1: Temperature (ºC) and wind gust (mph) maps of the UK. Temperatures represent a scalar field, while the winds must be represented by a vector quantity representing the wind speed and direction. (Source: Met Office 2022.)

2. Functions that map a scalar to a vector f : A ⊆ R → Rn . We will call this a vector function in one variable. An example would be the equation describing the position of a particle in 3D space as a function of time. For each value of time (a scalar), the function outputs the position vector of the particle r = (x, y, z).

3. Functions that map multi-dimensional spaces to vectors f : A ⊆ Rn → Rm . This is a multivariable vector function. A simple example would be an equation that relates the position of a particle in 3D space and time to the velocity of the particle, i.e. (x, y, z, t) → (vx , vy , vz ). Note that this is the type of function that can associate a vector (here the velocity) to every point in space-time, which leads us to an important concept in physics – that of a vector field. We will expand on this further below.

1.4 Scalar and vector fields


We will often refer to the concepts of a scalar or a vector field, both of which are of considerable
importance to various physical applications. Unlike the fields mentioned above, these are not
sets but rather maps. Consider, for example, the temperatures across the UK. There will be a
measurable temperature in every part of the country, and so we can think of this temperature
distribution as a map where we assign a value of the temperature (a scalar) to each point
on the map of the UK identified via a latitude and longitude (a 2D surface). This mapping
can be written as (x, y) → T . We could extend this description so as to also consider how
the temperature varies with altitude, in which case the mapping would associate the scalar
temperature to a point in 3D space: (x, y, z) → T . We could now note that the temperature
changes throughout the day, and so an additional dimension is required to account for the actual

Figure 2: An object with mass travelling in the solar system will be subject to the gravitational
field: a vector field where each point in space has a gravitational force associated with it. This
will cause the mass to travel in some direction. There are 5 stationary points in the Earth-Sun
system, called Lagrange points, where small objects can steadily orbit the Sun along with the
Earth in a fixed position. One of these points, L2 (note, it’s a saddle point), is the home of the
recently launched James Webb telescope. (Source: NASA 2022.)

temperature distribution. A proper map would therefore look more like (x, y, z, t) → T . We can
generalize these ideas and define a scalar field to be a function S that assigns to each point in
some n-dimensional vector space a scalar value:

S(r) : Rn → R. (11)

If instead of the temperature we were interested in accounting for the wind speed across
the country, we could do something very similar, but the scalar temperature would need to
be replaced with a vector quantity accounting for the wind direction and velocity. Again, in
the simplest case we may want to associate the wind direction with the latitude and longitude,
and so the required mapping can be written as (x, y) → (vx , vy ). As before, we can add the
altitude or the time to our coordinate system, but now we may also include further complexity
to the description of the wind, for example by adding its vertical velocity component. Then the
mapping would be from a 4D to a 3D space: (x, y, z, t) → (vx , vy , vz ). We can generalize this by
defining a vector field as a function F that assigns to each point in some n-dimensional vector
space an m-dimensional vector:
F(r) : Rn → Rm . (12)

Figure 3: A line in 3D space can be fully specified by two vectors. This could either be two
position vectors describing two points on the line, or by a vector along the line (the tangent
vector) and a vector to any point on the line.

This is an obvious generalization of Eq. 11, since a 1D vector field reduces to a scalar field.
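As a toy illustration of Eqs. (11) and (12), a scalar field and a vector field are simply functions of the coordinates. The fields below are invented purely for illustration (not Met Office data):

```python
def temperature(x, y):
    """A toy scalar field (x, y) -> T in the spirit of Eq. (11):
    warmest at the origin, falling off quadratically."""
    return 20.0 - 0.1 * (x * x + y * y)

def wind(x, y):
    """A toy vector field (x, y) -> (vx, vy), as in Eq. (12):
    a rigid anticlockwise rotation about the origin."""
    return (-y, x)

print(temperature(0.0, 0.0))  # 20.0 at the origin
print(wind(1.0, 0.0))         # (-0.0, 1.0): on the +x axis the wind points along +y
```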

1.5 Lines and planes in space


Throughout this course we will be interested in shapes, primarily surfaces, in 3D space. It will
therefore be convenient to briefly summarize how we can denote lines and planes using vector
notation. In general, objects in some vector space (lines, surfaces, planes, etc.) are sets that will
be described by equations that specify the conditions necessary for a vector to belong to the set.
A line in 3D space can be fully determined by specifying two vectors, since two (non-identical)
points define a line. A set of vectors r denoting a line can then be specified as all vectors satisfying
the following equation
r = a + αt̂, α ∈ R, (13)
where a is any point on the line and t̂ is any tangent vector to the line. Unsurprisingly, the
dimensionality of this curve is 1 as the equation has one degree of freedom, α. This holds
regardless of the dimensionality of the space in which we operate, and the above equation
extends our concept of a line to any dimension.
The set of points r comprising a plane can be defined in a similar manner as

r = a + αt̂1 + β t̂2 , α, β ∈ R, (14)

where, again, a is any point in the plane, and t̂1 and t̂2 are any two non-colinear vectors lying
in the plane. The dimensionality of this set is now 2, as we have two degrees of freedom in the
equation. While not part of this course, it should be clear from these two examples how sets of
higher dimensionality can be generated for vector spaces of arbitrary dimensionality.
In addition to Eq. (14), a plane is also often given in terms of three points in the plane given
by the vectors a,b,c, noting that the difference of any two of these vectors yields a vector in the
plane:
r = a + α(b − a) + β(b − c), α, β ∈ R. (15)
Alternatively, a plane can also be specified using the unit vector normal to the plane n̂, and a
point in the plane a
r · n̂ = a · n̂ = d, (16)

Figure 4: Three ways to define a plane in space: via two vectors in the plane and a point; via
three points; and via a point and the unit normal vector to the plane.

where d is the length of the projection of any vector describing a point on the plane onto the
unit normal to the plane, i.e., the distance between the plane and the origin. A summary of
these ways to describe a plane is given in Fig. 4.
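The three-point description of Eq. (15) and the normal form of Eq. (16) connect via the cross product: two in-plane difference vectors yield the normal, and projecting any of the three points onto n̂ gives d. A minimal sketch with hypothetical helper names (note that d comes out signed here, its magnitude being the plane–origin distance):

```python
import math

def sub(p, q):
    """Componentwise difference p - q: a vector lying in the plane when
    p and q are both points of the plane."""
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    """3D cross product, used to build a vector normal to the plane."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_from_points(a, b, c):
    """Return (n_hat, d) such that r . n_hat = d, as in Eq. (16)."""
    n = cross(sub(b, a), sub(c, a))
    mag = math.sqrt(sum(x * x for x in n))
    n_hat = tuple(x / mag for x in n)
    d = sum(p * q for p, q in zip(a, n_hat))
    return n_hat, d

# The plane z = 2, specified through three of its points:
n_hat, d = plane_from_points((0, 0, 2), (1, 0, 2), (0, 1, 2))
print(n_hat, d)  # (0.0, 0.0, 1.0) 2.0
```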

2 Multiple integrals

Figure 5: Riemann integration of a function in 1 dimension.

We start by recalling the well-known definition of (Riemann) integration in one dimension.


Definition 1. Let f : [a, b] → R be a continuous function. Then, if we consider splitting the domain into n equal intervals of width (b − a)/n, and denote by tk any point within the k-th interval, the integral of f over this domain is given by the limit of Cauchy-Riemann sums

∫_a^b f (x) dx = lim_{n→∞} sn ≡ lim_{n→∞} (b − a)/n Σ_{k=1}^{n} f (tk).  (17)

If this limit exists and is finite, the function is said to be (Riemann) integrable. In practice,
of course, we tend to solve integrals by guessing them; we can compute and memorize the
derivatives of a wide range of commonly-encountered functions, and then use the fundamental
theorem of calculus (integration is the inverse operation of differentiation) to figure out the
integrals.
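The limit in Eq. (17) can be watched converging numerically. The sketch below evaluates sn for increasing n, choosing tk as the midpoint of each interval (one allowed choice of "any point within the k-th interval"):

```python
def riemann(f, a, b, n):
    """Finite-n version of Eq. (17): s_n = (b - a)/n * sum over f(t_k),
    with t_k taken as the midpoint of the k-th interval."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# The integral of x^2 over [0, 1] is 1/3; s_n approaches it as n grows:
for n in (10, 100, 1000):
    print(n, riemann(lambda x: x * x, 0.0, 1.0, n))
```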
Conceptually, this approach can easily be extended to integrals in more dimensions, simply
by considering domains that are no longer 1D intervals but higher dimensional areas, volumes, or
hyper-volumes. However, on the practical side our strategy will be to reduce a multidimensional
integral into a series of 1D integrals, and then solve those using standard approaches. Let us
illustrate this using a simple case in two dimensions.
Let the integration domain be a rectangle D = [a, b] × [c, d] ⊂ R2 in the (x, y) plane, and let
f (x, y) be a continuous function in two variables. Intuitively, the integral of f over this domain
will be some volume V contained by the rectangle between the surfaces z = 0 and z = f (x, y),
so the volume of an object in 3D space. Now let us cut this shape with a vertical plane parallel
to the x-axis, as illustrated in Fig. 6. The intersection between the volume and the plane will
be a flat region in the (x, z) plane, identifiable by a single value of y given by where the vertical
plane intersects the y-axis. We know how to calculate the area A = A(y) of this shape:
A(y) = ∫_a^b f (x, y) dx.  (18)

Within the integral the only variable is x since y is fixed. This is therefore a 1D integral which
we can attempt to solve using known techniques. The total volume V is clearly made by varying
y between the given bounds of [c, d], and summing over their contributions:
V = ∫_c^d A(y) dy.  (19)

Figure 6: Integration in two dimensions.

Substituting Eq. (18) into Eq. (19), and noting that performing the same operations, but choos-
ing instead a vertical plane parallel to y, would have to yield the same result, we can write
V = ∫_c^d ( ∫_a^b f (x, y) dx ) dy = ∫_a^b ( ∫_c^d f (x, y) dy ) dx.  (20)

Because the order in which we chose to integrate does not matter², we adopt the following
notation to indicate a general (definite) integral of f over some domain D:
V = ∬_D f (x, y) dx dy.  (21)
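The order-independence expressed by Eq. (20) can be checked numerically by iterating a 1D rule in both orders over a rectangle. Here we use the separable f (x, y) = x y² on [0, 1] × [0, 2], whose exact value is 1/2 · 8/3 = 4/3 (a sketch with a hypothetical midpoint-rule helper):

```python
def integrate_1d(f, a, b, n=400):
    """Midpoint-rule approximation of a single 1D integral."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def double_integral(f, a, b, c, d, x_first):
    """Eq. (20): iterate the 1D rule in either order over [a, b] x [c, d]."""
    if x_first:
        return integrate_1d(lambda y: integrate_1d(lambda x: f(x, y), a, b), c, d)
    return integrate_1d(lambda x: integrate_1d(lambda y: f(x, y), c, d), a, b)

f = lambda x, y: x * y * y  # separable: (int x dx) * (int y^2 dy) = 1/2 * 8/3
print(double_integral(f, 0.0, 1.0, 0.0, 2.0, True))   # ~1.3333
print(double_integral(f, 0.0, 1.0, 0.0, 2.0, False))  # same value, order swapped
```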

Note that if a function happens to be separable f (x, y) = h(x) · g(y), then the double integral
above, on a rectangular domain, simplifies to the product of two 1D integrals. The following
integrability theorem holds:
Theorem 1. Let D ⊂ R2 be a regular domain and f : D → R a continuous function. Then f
is integrable in D.
We used the concept of a regular domain here, which is worth dwelling on. A domain D is
said to be regular if it is open (i.e., every point in D is an interior point) and if its boundary
∂D is piecewise smooth (i.e., represented by a finite number of functions that admit continuous
derivatives). We will elaborate more on this in what follows.
We have seen an example of how a 2D integral can be used to calculate a volume in 3D.
However, a double integral can also define the measure (area) of a domain. A limited domain
D ⊂ R2 is said to be measurable if the function 1 is integrable in D. Then, the measure (or
area) of D is given by

|D| = ∬_D 1 dx dy.  (22)
² Although this holds for the integrals treated in this course, this statement is not strictly true. If you are interested to learn more, look up Fubini’s theorem!

2.1 Non-rectangular domains

Figure 7: x-simple and y-simple domains for double integration.

The extension of double integrals with rectangular domains, as discussed above, to more
complex domains, is relatively straightforward. It will be convenient to consider simple domains
first. We will call a domain x-simple if any horizontal line intersects the domain in a single line segment, and y-simple if any vertical line intersects the domain in a single line segment. The domains are illustrated in Fig. 7. The choice of horizontal and vertical
here is simply a convention based on how we normally draw a 2D coordinate system. In slightly
more mathematical terms,
Definition 2. A set D ⊂ R2 is y-simple if

D = {(x, y) ∈ R2 : x ∈ [a, b], g1 (x) ≤ y ≤ g2 (x)}, (23)

where g1 and g2 are continuous functions.


Slicing D with a line parallel to the y axis therefore always yields a line segment, and this
line segment varies continuously with a change in the position of the line. Similarly, we can also
write the definition for x-simple domains as
Definition 3. A set D ⊂ R2 is x-simple if

D = {(x, y) ∈ R2 : y ∈ [c, d], h1 (y) ≤ x ≤ h2 (y)}, (24)

where h1 and h2 are continuous functions.


We can now see how double integrals should be performed on such domains. Specifically,
if D is y-simple, then the double integral of some function f (x, y) over D can be computed by
integrating over the line segments parallel to the y axis first, and is therefore given by:
∬_D f (x, y) dx dy = ∫_a^b ( ∫_{g1(x)}^{g2(x)} f (x, y) dy ) dx.  (25)

Similarly, if D is x-simple, then we should integrate over the line segments along the x direction
first:

∬_D f (x, y) dx dy = ∫_c^d ( ∫_{h1(y)}^{h2(y)} f (x, y) dx ) dy.  (26)

Figure 8: Domain of the integral in Example 1. We see the domain is both x-simple and y-simple. The order in which we can choose to integrate is denoted by the red and blue lines.

Both approaches allow us to reduce the double integral to two single integrals, which we know
how to tackle. A rectangular domain is trivially both x-simple and y-simple, but there is a
further family of non-rectangular domains which are both x-simple and y-simple, such as the
disc. Such domains are called simple. Note that if the domain is simple, then we can choose
whether to partition it as x-simple or y-simple. A judicious choice here can dramatically change
the difficulty of the integral.
Most domains you will encounter may not be simple, but as long as they can be partitioned
into a finite number of simple domains we can still integrate across them, and the methods
discussed above apply. This provides another, perhaps more intuitive, way of defining a regular
domain: it is one that can be written as a union of a finite number of simple domains. So we see
that an annulus, for example, would not be a simple domain, but would be regular (and thus
allowable). We will not deal with domains that are not regular in this course.

Example 1. Evaluate the integral


I = ∬_D sin y³ dx dy,  (27)

where D is the area between the curves y = √x and y = 1, for x ∈ [0, 1].

Solution: The domain is both x-simple and y-simple, as shown in Fig. 8, so we can choose the
order of integration based on preference. Considering D to be y-simple, the integral becomes
I = ∫_0^1 ( ∫_{y=√x}^{y=1} sin y³ dy ) dx.  (28)

Now unfortunately the integral in the parenthesis has no elementary primitive (one can write
the solution using complex analysis in terms of gamma functions, a task not for the faint of
heart – but do have a go if you’re feeling intrepid!). A far simpler solution is to consider the

Figure 9: Separation of the domains of triple integrals based on domain type: the first domain can be split into layers of area Ω(z) for h1 ≤ z ≤ h2 . The second integrates along lines parallel to the z-axis connecting the two surfaces z = g1 (x, y) and z = g2 (x, y), for all (x, y) points in the domain D.

integral on an x-simple domain instead. Then, we can write


I = ∫_0^1 ( ∫_{x=0}^{x=y²} sin y³ dx ) dy  (29)
  = ∫_0^1 y² sin y³ dy  (30)
  = [ −(1/3) cos y³ ]_0^1 = (1/3)(1 − cos(1)).  (31)

The order of integration does not impact the integrability of the function, but it does have a
material impact on the ease with which one can find the solution in simple form.
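The result of Example 1 can be sanity-checked numerically. After the inner dx integral of Eq. (29) is done, the remaining 1D integral of Eq. (30) is straightforward to approximate (hypothetical midpoint-rule helper):

```python
import math

def midpoint(f, a, b, n=2000):
    """Midpoint-rule approximation of a 1D integral."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Eq. (30): the inner dx integral over [0, y^2] just yields a factor y^2.
numeric = midpoint(lambda y: y * y * math.sin(y ** 3), 0.0, 1.0)
exact = (1.0 - math.cos(1.0)) / 3.0
print(numeric, exact)  # both ~0.15323
```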

2.2 Triple integrals


We can extend the approach described above to three dimensions, allowing us to tackle triple
integration. To do this, our previous definition of rectangular domains in the plane [a, b] × [c, d]
must be replaced by a cuboid [a, b]×[c, d]×[e, f ] in 3D space, but otherwise the approach remains
the same. We can immediately see that the main challenge with triple integrals, analogously to
double integration, will be the treatment of more general, non-cuboid domains. Provided the
domain of integration can be given a relatively simple analytical representation, a continuous
function defined over this domain will still be integrable, and we can proceed to find the integral
via appropriate iterative 1D integration. Two typical kinds of triple integration that are most
commonly encountered are illustrated in Fig. 9: triple integration by layers and triple integration
by lines. You may recognize the first as appropriate for the integration of domains that are x-
and y-simple, while the second is used for z-simple domains. We thus can reduce the triple
integral into a single and a double integral in two different ways, as given by the following two
definitions:

Definition 4. Integration by layers: Let Ω be a domain in R3 given by

Ω = {(x, y, z) : h1 ≤ z ≤ h2 , (x, y) ∈ Ω(z)}  (32)

Figure 10: Tetrahedron given in Example 2.

where for every z ∈ [h1 , h2 ], Ω(z) is a plane regular domain. Then, if f : Ω → R is a continuous
function, f is integrable in Ω and the integral can be calculated as

∭_Ω f (x, y, z) dx dy dz = ∫_{h1}^{h2} ( ∬_{Ω(z)} f (x, y, z) dx dy ) dz.  (33)

Definition 5. Integration by lines: Let Ω be a domain in R3 given by

Ω = {(x, y, z) : g1 (x, y) ≤ z ≤ g2 (x, y), (x, y) ∈ D} (34)

where D is a regular domain in the (x, y) plane and g1 , g2 : D → R are continuous functions.
Then, if f : Ω → R is a continuous function, f is integrable in Ω and the integral can be
calculated as

∭_Ω f (x, y, z) dx dy dz = ∬_D ( ∫_{g1(x,y)}^{g2(x,y)} f (x, y, z) dz ) dx dy.  (35)

These expressions contain at most a double integral, and we can deploy the methods discussed
earlier to reduce them further into single integrals that we can solve explicitly. For the purposes
of this course we will not be interested in integrating in spaces having domains of dimensionality
higher than three. Nevertheless, it should be fairly obvious by now how this methodology can
be extended to tackle integrals in higher dimensions.

Example 2. Find the mass of the tetrahedron bounded by the three coordinate planes
and the plane x + 2y + 3z = 4, if its density is given by ρ(x, y, z) = ρ0 x.

Solution: The volume of interest is a tetrahedron that intersects the coordinate axes at the
points (4, 0, 0), (0, 2, 0), and (0, 0, 4/3), as shown in Fig. 10. In the (x, y) plane the domain is
the triangle bounded by the lines x = 0, y = 0, and y = 2 − x/2, found by setting z = 0 in
the equation of the plane. For each point (x, y) in this domain, the mass is accumulated by
integrating along the z-axis, from z = 0 to z = (4 − x − 2y)/3. We therefore have
\[
\begin{aligned}
m = \iiint_V \rho(x, y, z)\,dV
&= \int_{x=0}^{4}\int_{y=0}^{2-x/2}\int_{z=0}^{(4-x-2y)/3} \rho_0 x \,dz\,dy\,dx \\
&= \frac{\rho_0}{3}\int_{x=0}^{4}\int_{y=0}^{2-x/2} (4 - x - 2y)\,x \,dy\,dx \\
&= \frac{\rho_0}{3}\int_{x=0}^{4}\left(4x - 2x^2 + \frac{x^3}{4}\right)dx \\
&= \frac{\rho_0}{3}\left(32 - \frac{128}{3} + 16\right) = \frac{16}{9}\rho_0.
\end{aligned}
\]
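As a sanity check, the result can be reproduced numerically. The sketch below is illustrative only (the grid size n and the choice ρ0 = 1 are arbitrary): it applies a midpoint rule over x and y, using the fact that the innermost z-integral of the constant-in-z integrand is simply ρ0 x zmax.

```python
def tetrahedron_mass(n=200, rho0=1.0):
    """Midpoint-rule estimate of m over the tetrahedron bounded by the
    coordinate planes and the plane x + 2y + 3z = 4, with density rho0*x."""
    total = 0.0
    dx = 4.0 / n
    for i in range(n):
        x = (i + 0.5) * dx
        ymax = 2.0 - x / 2.0          # upper y boundary at this x
        dy = ymax / n
        for j in range(n):
            y = (j + 0.5) * dy
            zmax = (4.0 - x - 2.0 * y) / 3.0
            # the inner z-integral of the constant integrand rho0*x is rho0*x*zmax
            total += rho0 * x * zmax * dx * dy
    return total

print(tetrahedron_mass())  # close to 16/9 ≈ 1.778
```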

2.3 The Jacobian matrix and determinant


It is often very useful to be able to change the integration variables to simplify the form of
an integral. In fact, changing variables is one of the most common strategies used to solve 1D
integrals. For example, say we are tasked with solving the integral I = ∫ f(x) dx, which can be
simplified if we change the variable x → t. If x and t are related through some function g, then
we have that

\[ x = g(t), \qquad dx = \frac{dg(t)}{dt}\,dt, \qquad f(x) = f(g(t)), \]

which allows us to rewrite the integral as

\[ I = \int f(x)\,dx = \int f(g(t))\,\frac{dg(t)}{dt}\,dt. \]
In this section we shall concern ourselves with how this approach can be extended to multiple
dimensions. Importantly, changing variables will also change the parametrization of the domain
of integration, so this can be a useful strategy to move between complex domains (which may
not be simple, in the sense above) to simpler, and even rectangular, domains. We start with the
simplest new case: a double integral over a 2D domain. Say we want to calculate the value of
the following integral

\[ \iint_D f(x, y)\,dx\,dy, \tag{36} \]

while changing the original coordinates from (x, y) to some new set of coordinates (u, v) described
by the transformation T
(x, y) = T (u, v). (37)
The domain D of the integral will, necessarily, also change: T : D′ → D (note that, given the
definition of Eq. (37), T acts on (u, v)). We can also write explicitly

x = g(u, v),    y = h(u, v),

and the integrand becomes f(g(u, v), h(u, v)) := f̃(u, v).
In the 1D case, when changing variables, we were interested in finding an expression for
how the infinitesimal length element changes under the transformation. In 2D, we ask how the
infinitesimal area element dA changes with the change of variables. Consider the transformation
of Eq. (37) illustrated in Fig. 11: we are interested in calculating what the infinitesimal area

Figure 11: The infinitesimal area of a domain changes in a coordinate transformation, and must
be accounted for in integration using the Jacobian. Here, (x, y) represent the standard Cartesian
coordinate system, and the lines of constant u and v (the new coordinates) are drawn in red.

dx dy is when written in the new coordinates dv du, and vice versa. To find this, consider the area
of the shaded region in Fig. 11 between the vertices A, B, C and D. For a generic transformation
the sides of this shape will be curved, but in the limit of infinitesimal displacements, to the first
order linear approximation, the sides can be considered straight lines and the shape tends to a
parallelogram. The vertices of the parallelogram have the following (x, y) components:

A: (g(u, v), h(u, v))
B: (g(u + du, v), h(u + du, v))
C: (g(u + du, v + dv), h(u + du, v + dv))
D: (g(u, v + dv), h(u, v + dv))

From the vertices we can find the vector sides of the parallelogram

\[ \mathbf{a} = (g(u+du, v), h(u+du, v)) - (g(u, v), h(u, v)) = \left(\frac{\partial g}{\partial u}\,du,\ \frac{\partial h}{\partial u}\,du\right) \]

\[ \mathbf{b} = (g(u, v+dv), h(u, v+dv)) - (g(u, v), h(u, v)) = \left(\frac{\partial g}{\partial v}\,dv,\ \frac{\partial h}{\partial v}\,dv\right). \]

Simplifying the notation by writing ∂f/∂α = fα, the area dA of the parallelogram is

\[ dA \equiv dx\,dy = |\mathbf{a}\times\mathbf{b}| = \det\begin{pmatrix} g_u\,du & h_u\,du \\ g_v\,dv & h_v\,dv \end{pmatrix} = (g_u h_v - g_v h_u)\,du\,dv. \tag{38} \]

The relationship between the area written in the two different coordinate systems is therefore

dx dy = (gu hv − gv hu) du dv, (39)

or in matrix notation, which lends itself more naturally to generalization in more dimensions,

\[ dx\,dy = \det\begin{pmatrix} g_u & h_u \\ g_v & h_v \end{pmatrix} du\,dv. \tag{40} \]

This matrix containing all first-order partial derivatives is commonly encountered in vector
calculus, and merits a name: the Jacobian matrix. Both the matrix and its determinant are
often simply referred to as the 'Jacobian', but this is (perhaps surprisingly) rarely confusing.
The Jacobian matrix is sometimes defined as the transpose of the expression above, a form
originating from the definition of differentiability for multivariable vector functions. For the
use of the matrix this can be somewhat confusing at times, but as regards the determinant there
is no difference, since the determinant is invariant under transposition.

Definition 6. Let D ⊂ R2 be a regular domain, let f : D → R be a continuous function, and let
T : D′ → D, (x, y) = T(u, v) be a coordinate transformation with x = g(u, v) and y = h(u, v).
Then

\[ \iint_D f(x, y)\,dx\,dy = \iint_{D'} f(g(u, v), h(u, v))\,|J(u, v)|\,du\,dv, \tag{41} \]

where

\[ J(u, v) = \det\begin{pmatrix} g_u & h_u \\ g_v & h_v \end{pmatrix} \tag{42} \]

is the determinant of the Jacobian matrix.

Example 3. Evaluate

\[ I = \iint_D \frac{4}{(x - y)^2}\,dx\,dy, \tag{43} \]

where D is the area enclosed by the lines y = x − 2, y = x − 4, x = 0, and y = 0.

Solution: We start by defining two new variables u = x + y and v = x − y, or equivalently,
x = (u + v)/2 and y = (u − v)/2. The Jacobian of this coordinate transformation is given by

\[ J(u, v) = \det\begin{pmatrix} \partial x/\partial u & \partial y/\partial u \\ \partial x/\partial v & \partial y/\partial v \end{pmatrix} = \frac{1}{4}\det\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = -\frac{1}{2}. \tag{44} \]

We are only interested in the size of the Jacobian, not the sign (the sign will depend on the
orientation choice of the vectors taken for the cross product). Taking |J| we can then rewrite
the integral as

\[ I = \iint_{D'} \frac{4}{v^2}\cdot\frac{1}{2}\,du\,dv. \tag{45} \]

Now we must analyse the domain, drawn in Fig. 12. The lines y = x−2 and y = x−4 correspond
to v = 2 and v = 4, respectively. These will be the limits of the domain along v. To find the
limits on u we use the remaining two lines and parametrize them in terms of (u, v). Firstly,
when x = 0 we have that u = y and v = −y, so u = −v. When y = 0 we find instead that
u = x = v. Thus u will vary between −v and +v. Therefore
\[ I = \int_{v=2}^{4}\int_{u=-v}^{v} \frac{2}{v^2}\,du\,dv = \int_{v=2}^{4} \frac{2}{v^2}\cdot 2v\,dv = \ln(16). \tag{46} \]

Figure 12: Integration domains for Example 3 in the (x, y) and (u, v) coordinate systems. Note
the dotted triangles in both, each with an area of 2. There are 3 such triangles in the (x, y)
domain but 6 in the (u, v) domain. The transformation has stretched the domain, and this is
accounted for by the Jacobian of 1/2.
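Example 3 can be cross-checked by summing the original Cartesian integrand directly over D; the sketch below (the grid resolution n is an arbitrary choice) should reproduce ln 16 ≈ 2.77 without any change of variables.

```python
import math

def example3_direct(n=400):
    """Midpoint-rule sum of 4/(x - y)^2 over the Cartesian domain
    D = {(x, y) : x >= 0, y <= 0, 2 <= x - y <= 4}."""
    total = 0.0
    hx = 4.0 / n                      # x runs from 0 to 4
    for i in range(n):
        x = (i + 0.5) * hx
        y_hi = min(x - 2.0, 0.0)      # upper boundary: y = x - 2 or y = 0
        y_lo = x - 4.0                # lower boundary: y = x - 4
        hy = (y_hi - y_lo) / n
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += 4.0 / (x - y) ** 2 * hx * hy
    return total

print(example3_direct(), math.log(16))  # the two values agree closely
```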

2.4 Curvilinear coordinate systems


Having defined the Jacobian, we are now able to move between any two coordinate systems
when dealing with multiple integrals. In addition to the standard Cartesian coordinate system
there are three other systems that are widely used in physics because they represent common
symmetries. These are the plane polar coordinate system in 2D, and the spherical polar and
cylindrical systems in 3D. We will discuss them in some detail here.

2.4.1 2D: Plane polar coordinates


Plane polar coordinates are suitable to describe systems with axial symmetry in a plane. The
transformation (x, y) → (r, θ) is given by the following relations:

x = r cos θ
y = r sin θ

The basis vectors of the Cartesian system are unit vectors along the x- and y-axes, î and ĵ. The
basis vectors in the new system will be different: r̂ is the unit vector along the line through the
origin at some fixed angle θ to the x-axis, while θ̂ is the unit vector tangent to the circumference
at some fixed distance from the origin (i.e., fixed r):

r̂ = î cos θ + ĵ sin θ
θ̂ = −î sin θ + ĵ cos θ. (47)

We can calculate the Jacobian of the transformation, which will allow us to use it to simplify
the calculation of some double integrals:
\[ J(r, \theta) = \det\begin{pmatrix} \frac{\partial}{\partial r} r\cos\theta & \frac{\partial}{\partial r} r\sin\theta \\[2pt] \frac{\partial}{\partial \theta} r\cos\theta & \frac{\partial}{\partial \theta} r\sin\theta \end{pmatrix} = \det\begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix} = r(\sin^2\theta + \cos^2\theta) = r. \tag{48} \]

The infinitesimal area in this coordinate system is then given by

dA = dx dy = J(r, θ) dr dθ = r dr dθ. (49)

Figure 13: The plane polar coordinate system allows us to identify any point in the plane (except
the origin) via two new variables: r, the Euclidean distance of the point from the origin, and the
angle θ. In this system the infinitesimal surface element has an area of r dr dθ: it depends
explicitly on the distance of the element from the origin.

The sketch in Fig. 13 shows this result more intuitively: the blue-shaded infinitesimal area, in
the limit of dr, dθ → 0, is simply a rectangle with sides r dθ and dr, and therefore has an area of
r dr dθ. Interestingly, in plane polar coordinates dA is not constant in all space like dx dy,
but increases linearly with distance from the origin.

Example 4. Evaluate

\[ I = \iint_D \frac{x y^2}{x^2 + y^2}\,dx\,dy, \tag{50} \]

where D is the area enclosed by the circle of radius 2 for x ≥ 0 and y ≥ 0.

Solution: We note that the domain is particularly simple to express in plane polar coordinates:
on the (x, y) plane it represents a quarter of a disc, so we must have that r ∈ [0, 2] and θ ∈
[0, π/2]. We proceed to rewrite the integral in this coordinate system by making the substitutions
x = r cos θ, y = r sin θ, and making sure we remember to include the Jacobian:
\[
\begin{aligned}
I &= \iint_D \frac{r^3\cos\theta\sin^2\theta}{r^2}\,r\,dr\,d\theta
= \int_0^2\!\int_0^{\pi/2} r^2\cos\theta\sin^2\theta\,d\theta\,dr \\
&= \left(\int_0^2 r^2\,dr\right)\left(\int_0^{\pi/2}\sin^2\theta\cos\theta\,d\theta\right)
= \left[\frac{r^3}{3}\right]_0^2\cdot\left[\frac{\sin^3\theta}{3}\right]_0^{\pi/2} = \frac{8}{9}.
\end{aligned}
\]
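The factorization into two single integrals is easy to verify numerically. A brief sketch (the resolution n is an arbitrary choice) computes the two midpoint sums separately and multiplies them:

```python
import math

def example4_polar(n=2000):
    """Factorized midpoint sums for I = (int r^2 dr)(int sin^2(t) cos(t) dt)
    over r in [0, 2] and theta in [0, pi/2]."""
    dr = 2.0 / n
    dth = (math.pi / 2) / n
    r_part = sum(((i + 0.5) * dr) ** 2 for i in range(n)) * dr
    th_part = sum(math.sin((j + 0.5) * dth) ** 2 * math.cos((j + 0.5) * dth)
                  for j in range(n)) * dth
    return r_part * th_part

print(example4_polar())  # close to 8/9 ≈ 0.889
```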

Example 5. Evaluate the Gaussian integral

\[ G = \int_{-\infty}^{+\infty} e^{-x^2}\,dx. \tag{51} \]

Solution: The first thing to note is that e^{−x²} does not have a simple primitive, that is to
say, there is no elementary function the derivative of which is the squared exponential. However,
the 2D version of the same integral is more tractable, so, for a change, we shall use double
integration to help us solve an integration problem in one dimension. We start by defining the
following integral

\[ I(R) = \iint_D e^{-(x^2 + y^2)}\,dx\,dy, \tag{52} \]

where D is the area enclosed by the circle of radius R: x² + y² < R². This integral can be
where D is the area enclosed by the circle of radius R: x2 + y 2 < R. This integral can be
simplified by moving to plane polar coordinates:
\[ I(R) = \int_0^{2\pi}\!\left(\int_0^R e^{-r^2}\,r\,dr\right)d\theta = 2\pi\left[-\frac{1}{2}e^{-r^2}\right]_0^R = \pi\left(1 - e^{-R^2}\right). \]

Taking the limit of this expression for R → ∞ gives us the value of the integral over the entire
(x, y) plane:

\[ I = \lim_{R\to\infty} I(R) = \pi. \tag{53} \]

We now note that \( I = G_x \times G_y = \left(\int_{-\infty}^{+\infty} e^{-x^2}\,dx\right)\times\left(\int_{-\infty}^{+\infty} e^{-y^2}\,dy\right) \). The two terms in x and y
are identical and separable, so each must equal G. We thus conclude that

\[ \int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{I} = \sqrt{\pi}. \tag{54} \]
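The value √π is simple to confirm numerically; the sketch below (the truncation L and step count n are arbitrary choices) sums e^{−x²} with a midpoint rule over a finite window, the neglected tail beyond |x| = 6 being smaller than e^{−36}:

```python
import math

def gauss_integral(L=6.0, n=20000):
    """Midpoint-rule estimate of the integral of exp(-x^2) over [-L, L]."""
    h = 2.0 * L / n
    return sum(math.exp(-(-L + (i + 0.5) * h) ** 2) for i in range(n)) * h

print(gauss_integral(), math.sqrt(math.pi))  # the two values agree closely
```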

2.4.2 3D: Cylindrical coordinates


Cylindrical coordinates are perhaps the simplest extension of plane polar coordinates to three
dimensions. As the name suggests, they are ideally suited to representing physical and mathematical
objects that possess some form of axial symmetry, or are at least easiest to describe within
a coordinate system possessing such symmetry. The transformations from the 3D Cartesian
coordinate system (x, y, z) are given by

x = r cos φ
y = r sin φ
z = z.

As shown in Fig. 14, the unit vectors in the Cartesian basis î, ĵ are replaced by the plane polar
basis vectors r̂, φ̂ in the (x, y) plane, while the basis vector along the z direction, k̂, remains
unchanged. The relationship between basis vectors is therefore:

r̂ = î cos φ + ĵ sin φ
φ̂ = −î sin φ + ĵ cos φ
ẑ = k̂. (55)

Figure 14: The cylindrical coordinate system, basis vectors, and volume element.

As illustrated in Fig. 14, the infinitesimal volume in this coordinate system tends to a cuboid
with sides dr, dz, and r dφ. In this coordinate system we can therefore write the (square)
infinitesimal line element as

(ds)2 = (dr)2 + (r dφ)2 + (dz)2 . (56)

Similarly, the surfaces of the cuboid have sides with areas given by

dr dz; r dr dφ; r dφ dz; (57)

and the volume element is


dV = r dr dφ dz. (58)
The Jacobian of this transformation is therefore J(r, φ, z) = r, as can be verified easily by an
explicit calculation of the determinant of the Jacobian matrix.

2.4.3 3D: Spherical polar coordinates


For systems with spherical symmetry we will find the use of the spherical polar coordinate
system highly advantageous. The transformations here are given by the following relations:

x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ,

where θ ∈ [0, π] is the polar angle, measured between the z-axis and the radial vector connecting
the origin to some point, and φ ∈ [0, 2π] is the azimuthal angle, measured between the x-axis
and the projection of the radial vector onto the xy-plane. The new basis vectors in this system
are shown in Fig. 15, and are given by the relations

r̂ = sin θ(î cos φ + ĵ sin φ) + k̂ cos θ


θ̂ = cos θ(î cos φ + ĵ sin φ) − k̂ sin θ
φ̂ = −î sin φ + ĵ cos φ. (59)

Figure 15: The spherical polar coordinate system, basis vectors, and volume element.

As illustrated in Fig. 15, the infinitesimal volume in this coordinate system tends to a cuboid
with sides dr, r dθ, and r sin θ dφ. The line element is therefore given by

(ds)2 = (dr)2 + (r dθ)2 + (r sin θ dφ)2 . (60)

The surfaces of the cuboid have sides with areas

r dr dθ; r sin θ dφ dr; r2 sin θ dθ dφ; (61)

and the volume element is


dV = r2 sin θ dr dθ dφ. (62)
The Jacobian of this transformation is therefore J(r, θ, φ) = r2 sin θ, as can be verified easily by
an explicit calculation of the determinant of the Jacobian matrix.
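Both determinants can also be checked numerically. The helper below is an illustrative sketch (not part of the notes): it builds the Jacobian matrix of a map R³ → R³ by central finite differences and evaluates its determinant at an arbitrary test point, recovering r for the cylindrical map and r² sin θ for the spherical one.

```python
import math

def jacobian_det(T, u, eps=1e-6):
    """Determinant of the Jacobian matrix J[i][k] = dx_i/du_k of a map
    T: R^3 -> R^3 at the point u, via central finite differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        up, um = list(u), list(u)
        up[k] += eps
        um[k] -= eps
        fp, fm = T(up), T(um)
        for i in range(3):
            J[i][k] = (fp[i] - fm[i]) / (2.0 * eps)
    # cofactor expansion along the first row
    return (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
          - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
          + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))

def cylindrical(u):
    r, phi, z = u
    return [r * math.cos(phi), r * math.sin(phi), z]

def spherical(u):
    r, th, phi = u
    return [r * math.sin(th) * math.cos(phi),
            r * math.sin(th) * math.sin(phi),
            r * math.cos(th)]

print(jacobian_det(cylindrical, [2.0, 1.1, 0.3]))  # close to r = 2
print(jacobian_det(spherical, [2.0, 0.7, 1.1]))    # close to r^2 sin(theta)
```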

Example 6. Integrate the function f = x²z over the upper half (z ≥ 0) of the sphere of radius R
centred at the origin.

Solution: Our domain can be written as

\[ \Omega = \{(x, y, z) : 0 \le z \le \sqrt{R^2 - x^2 - y^2},\ x^2 + y^2 \le R^2\}, \]

which allows us to compute the integral by following the approach of Def. 5:

\[
\begin{aligned}
\iiint_{\Omega} f(x, y, z)\,dx\,dy\,dz
&= \iint_{x^2+y^2\le R^2}\left(\int_0^{\sqrt{R^2-x^2-y^2}} x^2 z\,dz\right)dx\,dy \\
&= \iint_{x^2+y^2\le R^2} \frac{1}{2}\,x^2\left[z^2\right]_0^{\sqrt{R^2-x^2-y^2}}dx\,dy \\
&= \frac{1}{2}\iint_{x^2+y^2\le R^2} x^2\,(R^2 - x^2 - y^2)\,dx\,dy \\
&= \int_0^{2\pi}\!\int_0^R \frac{1}{2}\,r^2\cos^2\theta\,(R^2 - r^2)\,r\,dr\,d\theta \\
&= \frac{1}{2}\left(\int_0^{2\pi}\cos^2\theta\,d\theta\right)\left(\int_0^R (r^3 R^2 - r^5)\,dr\right) \\
&= \frac{\pi}{2}\left[\frac{r^4}{4}R^2 - \frac{r^6}{6}\right]_0^R = \frac{\pi}{24}R^6.
\end{aligned}
\]

Note how we have moved to plane polar coordinates to simplify the double integral in (x, y) into
a product of two single integrals in r and θ, given the radial symmetry of the problem.
Alternatively, we could also consider the domain as a series of discs slicing the sphere:

Ω = {(x, y, z) : x2 + y 2 ≤ R2 − z 2 , 0 ≤ z ≤ R},

which lends itself to splitting the triple integral according to Def. 4:

\[
\begin{aligned}
\iiint_{\Omega} f(x, y, z)\,dx\,dy\,dz
&= \int_0^R\left(\iint_{x^2+y^2\le R^2-z^2} x^2 z\,dx\,dy\right)dz \\
&= \int_0^R\left(\int_0^{2\pi}\!\int_0^{\sqrt{R^2-z^2}} r^2\cos^2\theta\, z\, r\,dr\,d\theta\right)dz \\
&= \int_0^R\left(\int_0^{\sqrt{R^2-z^2}} r^3\,dr\right)\left(\int_0^{2\pi}\cos^2\theta\,d\theta\right)z\,dz \\
&= \int_0^R \pi z\left[\frac{r^4}{4}\right]_0^{\sqrt{R^2-z^2}}dz \\
&= \frac{\pi}{4}\int_0^R z\,(R^2 - z^2)^2\,dz = \frac{\pi}{4}\int_0^R (zR^4 + z^5 - 2R^2 z^3)\,dz \\
&= \frac{\pi}{4}\left[\frac{R^4 z^2}{2} + \frac{z^6}{6} - \frac{R^2 z^4}{2}\right]_0^R = \frac{\pi}{24}R^6.
\end{aligned}
\]

3 Statistical Distributions
Suppose we wish to perform an experiment, the outcome of which is well-defined (and ideally
measurable), but a priori unknown. There is value in being able to make quantitative predictions
on what the outcome of the experiment may be, without having to perform it. The mathematical
tools that can assist us in making such predictions in real-world experiments fall under the
general headings of probability and statistics. Of course this is a huge topic; in this course we
will explore a very small part of it, that of continuous random distributions.
First, we should define some basic terminology. The experiment mentioned earlier will be
called a trial, and the result of the experiment the outcome. The set of all possible outcomes of
the trial, i.e. our space of possibilities, will be called the sample space. The sample space can be
discrete, with a finite or infinite number of possible outcomes, or continuous, with a necessarily
infinite number of possible outcomes. The trial of flipping a coin has a discrete number of
outcomes if we are interested in which side ends facing upwards (heads, tails, or, possibly, it
landing on its side), but a continuous set of outcomes if we were interested in predicting, for
example, where on the floor it will land. If instead of considering an individual outcome we are
interested in a set of outcomes, itself a subset of the sample space, we will call such a subset an
event. These too can be continuous or discrete. In this course we will be interested in continuous
sample spaces, which will allow us to apply our vector calculus methods to the study of statistics.
How likely is a certain event, compared to all others? One way to think about this is to
imagine repeating the trial many times over, and to count the frequency of measuring a certain
event compared to all other outcomes. It turns out that this relative frequency tends to be
approximately the same every time a large set of trials is performed, which spurs us to call this
ratio the probability P of the event taking place. If we call the event A, then we can write
\[ P(A) = \frac{n_A}{n_{\mathrm{tot}}}, \tag{63} \]

where nA is the number of outcomes of event A and ntot is the total number of outcomes.

3.1 Continuous random variables

Figure 16: Probability density function (PDF) for a variable X that can take values between l1
and l2. The probability of finding X between a and b is given by the integral of the PDF over
that range, as given in Eq. (65).

A random variable in one dimension X is said to have a continuous distribution if X can


assume any real value over some domain of the reals: X ∈ [a, b] ⊂ R. The number of possible

outcomes is therefore equal to the number of values present on some finite interval of the reals.
As this is infinite, we cannot assign a finite probability to individual outcomes: Eq. (63) shows
that P → 0 if ntot → ∞ for any finite nA . Instead of considering individual outcomes, when
dealing with continuous variables we must therefore consider outcomes over a finite continuous
subset of the sample space.
We can use this insight to define a probability density function (PDF) f of a continuous
random variable X as
P (x < X ≤ x + dx) = f (x) dx. (64)
Here f(x) dx is the probability of finding X in the interval between x and x + dx. More generally:

\[ P(a < X \le b) = \int_a^b f(x)\,dx. \tag{65} \]

We know that some outcome of a trial will take place with certainty, and so if we take the
range to be the entire sample space S, the probability of finding X there must be 1 (i.e., 100%),
by definition. This leads to the requirement for f to be integrable over S, and normalized to 1:
\[ \int_S f(x)\,dx = 1. \]

Furthermore, a probability must be a real number between 0 and 1, and can never be negative.
For f (x) to be a probability density we therefore further require f to be a non-negative function
of x, ∀x ∈ S.
For practical reasons it will be useful to define the cumulative probability distribution function
(CDF), denoted by F(x), which is given by the following integral:

\[ F(x) = \int_{l_1}^{x} f(u)\,du. \tag{66} \]

Clearly, this is the probability that X is less than, or equal to, x, i.e., F (x) = P (X ≤ x). We
can turn back to Eq. (65), and rewrite it in terms of the CDF as
\[ P(a < X \le b) = \int_a^b f(x)\,dx = F(b) - F(a). \tag{67} \]

We can see why the CDF is useful – if known, it provides us with a way to quickly determine the
probability of the random variable lying in some range, without needing to evaluate an integral.
From the fundamental theorem of calculus, f(x) = dF(x)/dx.
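For instance, for the exponential distribution f(x) = λe^{−λx} (discussed in Sec. 3.2) the CDF has the closed form F(x) = 1 − e^{−λx}, so probabilities over any interval follow from Eq. (67) without further integration. A brief sketch (λ, a, b are arbitrary choices):

```python
import math

lam = 1.5
F = lambda x: 1.0 - math.exp(-lam * x)   # CDF of the exponential distribution

a, b = 0.5, 2.0
p_cdf = F(b) - F(a)                      # P(a < X <= b) via Eq. (67)

# cross-check by direct midpoint integration of the PDF f(x) = lam*exp(-lam*x)
n = 10000
h = (b - a) / n
p_num = sum(lam * math.exp(-lam * (a + (i + 0.5) * h)) for i in range(n)) * h

print(p_cdf, p_num)  # the two estimates agree
```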

Example 7. Consider the PDF provided in Fig. 17, describing the seasonal maximum
air temperature in SE England according to two climate modelling scenarios. Estimate
the likelihood that the maximum temperatures exceed 2◦ C, and 6◦ C, respectively, in the
two scenarios. What is the probability the maximum temperature increase will be in the
range 2◦ C-6◦ C?

Solution: The likelihood of the temperature exceeding 6◦ C is easy to glean directly from the
PDF. It approaches 0 under the RCP2.6 scenario (the entire distribution lies below the 6◦ mark),
but is around 50% in the scenario RCP8.5. The probability that the temperature will exceed
2◦ C is a little harder to extract from the PDF figure without having the numerical values, but
can instead be found from the CDF:

P (X > 2)RCP2.6 ≈ 1 − 35% = 65%,


P (X > 2)RCP8.5 ≈ 1 − 5% = 95%.


Figure 17: Seasonal average Maximum air temperature anomaly at 1.5m (in ◦ C) for June-July-
August in years 2070 up to and including 2098, in SE England using baseline 1961-1990. RCP
stands for Representative Concentration Pathway. RCP8.5 is a scenario where we make little
effort to curb emissions, while under RCP2.6 a concerted effort is made. Source: Met Office.

To find the probability of the temperatures ending up in the range 2◦ C-6◦ C we can simply
subtract the CDFs at the two values, as shown in Eq. (67). We find:

P (2 < X ≤ 6)RCP2.6 ≈ 100% − 35% = 65%,


P (2 < X ≤ 6)RCP8.5 ≈ 50% − 5% = 45%.

3.1.1 Expectation value


Definition 7. The expectation value of any function g(X) of the random variable X for a
continuous distribution f is defined as

\[ E[g(X)] = \int_S g(x) f(x)\,dx, \tag{68} \]

where S is the sample space, and assuming the integral exists. We note that it is a linear
functional that depends on the functions f and g, rather than being a function of the random
variable X.
The most common types of functions for which we will want to calculate the expectation value
will be powers of the variable itself. Because of this, these expectation values are given a special
name to distinguish them – they are called moments of the distribution, with
\[ E[X^k] = \int_S x^k f(x)\,dx \tag{69} \]

denoting the k-th moment of the distribution. Of these, the first moment – the expectation of
the variable itself – is most widely used, and is more commonly called the mean:

\[ E[X] = \int_S x f(x)\,dx. \tag{70} \]

This quantity is often denoted as ⟨x⟩, x̄, or by the Greek letter µ. In addition to this, distributions
are also commonly characterized by the mode and the median. The mode of a distribution is the
value of the random variable X at which the probability density function f(x) has its greatest
value. Note that it is possible to have multiple modes in a distribution. The median of a
distribution is the value of the random variable X at which the cumulative probability function
F(x) takes the value 1/2: it divides the PDF into two equally probable halves.
Another important expectation value is the variance:

Definition 8. The variance of a distribution, σ², is defined as

\[ \sigma^2 = E[(X - \langle x\rangle)^2] = \int_S (x - \langle x\rangle)^2 f(x)\,dx, \tag{71} \]

where S is the sample space, and assuming the integral exists.

The variance of a distribution is always positive. Its positive square root is known as the
standard deviation of the distribution and is often denoted by σ. Roughly speaking, σ measures
the spread of the values that X can assume around the mean. Often the variance is given in a
slightly different form:

\[ \sigma^2 = E[(X - \langle x\rangle)^2] = E[X^2] - 2\langle x\rangle E[X] + \langle x\rangle^2 = E[X^2] - \langle x\rangle^2 \equiv E[X^2] - E[X]^2. \]

It follows that the variance is the difference between the second moment and the square of the
first moment of a distribution.
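This identity is easy to confirm numerically. The sketch below (using the exponential distribution with λ = 2, an arbitrary choice, for which E[X] = 1/λ and σ² = 1/λ²) computes both moments by midpoint integration over a truncated domain:

```python
import math

lam = 2.0
L, n = 20.0, 20000                 # truncation: the tail beyond L is ~ e^{-40}
h = L / n
xs = [(i + 0.5) * h for i in range(n)]
w = [lam * math.exp(-lam * x) * h for x in xs]    # f(x) dx at the midpoints

mean = sum(x * wi for x, wi in zip(xs, w))        # first moment, E[X]
second = sum(x * x * wi for x, wi in zip(xs, w))  # second moment, E[X^2]
variance = second - mean ** 2                     # E[X^2] - E[X]^2

print(mean, variance)  # close to 1/2 and 1/4
```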

3.2 Distribution functions


There are some commonly encountered continuous distribution functions you should be familiar
with. We list some of their PDFs here, and show them in Fig. 18, alongside their CDFs.

• Uniform distribution
f (x) = constant
This is the simplest PDF, and we note that the domain must be limited to some finite
subset of R or the integral will not converge. Despite its simplicity, it is still widely used,
for example:

– For random number generation.


– In Monte Carlo simulations, as a model for inputs where random samples from a
uniform distribution are used to estimate results.
– In sampling, as a prior distribution in Bayesian analysis when little prior knowledge
about a parameter is available.
– For hypothesis testing to compare the uniformity of a sample against a known uniform
distribution.
– In computer graphics, to distribute random points on a surface, or to determine the
brightness of pixels in an image.

• Exponential distribution
f (x) = λe−λx

The exponential distribution models the time between events in a Poisson process. Some
common uses include:

– In reliability engineering, to model the lifetime of a device or system, and to calculate


its reliability and failure rate.
– In survival analysis, to model the time to an event, such as the time to death, failure
or diagnosis of a disease.
– In finance, to model the time between financial events, such as the time between stock
trades or the time between defaults in a portfolio.
– In statistical inference, as a conjugate prior distribution in Bayesian analysis.

[Figure 18 panels: PDF and CDF for the Uniform (min|max = 0|1, −1|1, 0.5|0.75), Exponential (λ = 0.5, 1, 1.5), Gaussian (µ|σ = 0|1, 0|0.447, 1|2.24), Cauchy/Lorentz (a|b = 0|0.5, 0|1, −2|1), and Pareto (k = 1; α = 1, 2, 3) distributions.]

Figure 18: Commonly encountered probability density functions (PDFs) and cumulative distri-
bution functions (CDFs). The equations are given in the main text.

• Gaussian distribution

\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(\frac{-(x - \mu)^2}{2\sigma^2}\right) \]

The Gaussian (or normal) distribution is the most widely used probability distribution,
and has many applications across various fields, including:

– In physics, to quantify uncertainties and to describe random processes.


– In statistics, to model the distribution of random variables, and as a basis for hy-
pothesis testing.
– In machine learning, in algorithms such as Gaussian Naive Bayes and Gaussian Mix-
ture Models.
– In signal processing, to model random noise and to perform signal smoothing.
– In image processing, in edge detection, image smoothing, and deblurring.
– In economics, to model stock prices and other financial data.
– In biology, to model the distribution of various physical and biological phenomena,
such as the sizes of organisms in a population.

• Cauchy or Lorentzian distribution

\[ f(x) = \frac{1}{\pi b}\left(\frac{(x - a)^2}{b^2} + 1\right)^{-1} \]

The Lorentzian distribution is similar in shape to a Gaussian distribution, but it has
a “heavier tail”: it decays to zero more slowly (as 1/x², rather than as a squared
exponential) and thus gives more weight to outliers than the Gaussian. Its uses include:

– In spectral analysis, to model the line shapes of spectral peaks, such as those ob-
served in emission spectroscopy, nuclear magnetic resonance (NMR), and Raman
spectroscopy.
– In robust statistics, as a model for outliers, as it is less sensitive to outliers than the
normal distribution.
– In signal processing, to model resonances in signals, such as those observed in vibra-
tion analysis.
– In fluid mechanics, to model the turbulence in fluid flow.

• Pareto distribution

\[ f(x) = \frac{\alpha k^{\alpha}}{x^{\alpha+1}}, \qquad x \ge k \]
The Pareto distribution is another heavy-tailed distribution, popular in the social sciences.
Its uses include:

– In economics, to model the distribution of wealth.


– In urban economics, to model the distribution of city sizes.
– In computer networking, to model the distribution of internet traffic.
– In software engineering, to model the distribution of software faults.
– In the modelling of certain types of stochastic processes, such as Lévy flights (a type
of random walk).
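One practical link between these PDFs and their CDFs is inverse-transform sampling: if U is uniform on (0, 1), then X = F⁻¹(U) is distributed according to f. The sketch below (all parameter values arbitrary) draws exponential samples this way and checks the sample mean against the analytic value 1/λ:

```python
import math
import random

def sample_exponential(lam, n, seed=0):
    """Inverse-transform sampling for the exponential distribution:
    F(x) = 1 - exp(-lam*x)  =>  F^{-1}(u) = -ln(1 - u)/lam."""
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

lam = 1.5
xs = sample_exponential(lam, 100000)
print(sum(xs) / len(xs))  # close to 1/lam ≈ 0.667
```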

3.3 Joint probability distributions
The discussions above can be extended to multiple random variables, which may or may not
be independent. Consider the case of two random variables X and Y, described by the joint
probability density function fXY(x, y). The two variables are said to be independent if we can
factorize their joint PDF: fXY(x, y) = gX(x) hY(y). Otherwise, the variables are interdependent,
or coupled. Similarly to Eq. (64), we can write the probability for the variables to fall within
some infinitesimal range as
dP (X, Y ) = fXY (x, y) dx dy, (72)
which, again, must be normalized (or at least normalizable) over the space of all possible
outcomes S:

\[ \iint_S f_{XY}(x, y)\,dx\,dy = 1. \tag{73} \]

The probability of an event is then simply a double integral over the 2D domain D of desired
outcomes of the random variables X and Y:

\[ P((X, Y) \in D) = \iint_D f_{XY}(x, y)\,dx\,dy. \tag{74} \]

Example 8. Let fXY(x, y) = x + cy² be a joint probability density function of two random
variables X and Y, defined over the support x ∈ [0, 1] and y ∈ [0, 1]. Find the appropriate
normalization constant c, and determine the probability that both X and Y are less than,
or equal to, 1/2.

Solution: For the normalization we write

\[ 1 = \int_0^1\!\int_0^1 (x + cy^2)\,dy\,dx = \int_0^1\left[xy + \frac{cy^3}{3}\right]_{y=0}^{1} dx = \int_0^1\left(x + \frac{c}{3}\right)dx = \left[\frac{x^2}{2} + \frac{cx}{3}\right]_0^1 = \frac{1}{2} + \frac{c}{3}. \]

Therefore, c = 3/2. To find the required probability we then need to integrate the normalized
PDF over the domain of required outcomes:

\[ P(0 < X \le 1/2,\ 0 < Y \le 1/2) = \int_0^{1/2}\!\int_0^{1/2}\left(x + \frac{3}{2}y^2\right)dx\,dy = \frac{3}{32}. \]
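Both numbers in Example 8 can be checked by a brute-force midpoint sum (a sketch; the resolution n = 400 is an arbitrary choice):

```python
def integrate2d(f, x0, x1, y0, y1, n=400):
    """Midpoint-rule double integral of f over the rectangle [x0,x1] x [y0,y1]."""
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    return sum(f(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
               for i in range(n) for j in range(n)) * hx * hy

f = lambda x, y: x + 1.5 * y * y          # the normalized joint PDF, c = 3/2

print(integrate2d(f, 0, 1, 0, 1))         # close to 1 (normalization)
print(integrate2d(f, 0, 0.5, 0, 0.5))     # close to 3/32 = 0.09375
```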

3.4 Marginal and conditional distributions


Given a joint PDF f (x, y), we may be interested only in the probability function for X irrespec-
tive of the value of Y (or vice versa). This is called the marginal distribution of X (or Y ), and
is obtained by integrating the joint PDF over all allowed values of Y (or X):
    fX(x) = ∫ f(x, y) dy;    fY(y) = ∫ f(x, y) dx.    (75)

Alternatively, we might be interested in the probability function of X given that Y takes


some specific value Y = y0 , P (X|Y = y0 ). Such a probability is given by the conditional
distribution C(x):
    C(x) = f(x, y0) / ∫ f(x, y0) dx = f(x, y0) / fY(y0).

An important point to highlight here is the need to renormalize this distribution. This is because
the overall joint PDF is normalized assuming all values of Y are accessible. As we can see, the
normalization needed is provided by the marginal distribution for Y = y0 .

Example 9. Find the marginal PDFs fX(x) and fY(y) for the joint PDF given in Example 8, fXY(x, y) = x + (3/2)y². Find the probability that Y > 1/2 if X = 1/3.

Solution: We use Eq. (75) to find:


    fX(x) = ∫_0^1 (x + (3/2)y²) dy = x + 1/2;

    fY(y) = ∫_0^1 (x + (3/2)y²) dx = 1/2 + (3/2)y².
For the second question we first need to find the conditional distribution
    C(y) = f(1/3, y) / fX(1/3) = (1/3 + (3/2)y²) / (5/6) = 2/5 + (9/5)y².
We note that the numerator is a single-variable function, but one that is no longer normalized,
hence the need for the denominator. The resulting conditional distribution is a proper PDF. To
find the required probability we now simply need to integrate this conditional distribution over
the required range for Y :
    P(1/2 < Y ≤ 1) = ∫_{1/2}^1 C(y) dy = ∫_{1/2}^1 (2/5 + (9/5)y²) dy = 29/40.
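The marginal and conditional results can be checked the same way; a sketch with a simple midpoint rule (quad is our own helper name, not from the notes):

```python
def quad(f, a, b, n=2000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x, y: x + 1.5 * y**2               # joint PDF from Example 8
fX = lambda x: quad(lambda y: f(x, y), 0, 1)  # marginal of X

fX_13 = fX(1/3)                               # should be close to 5/6
C = lambda y: f(1/3, y) / fX_13               # conditional PDF of Y given X = 1/3
p = quad(C, 0.5, 1.0)                         # P(1/2 < Y <= 1 | X = 1/3), ~29/40
```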

3.5 The Dirac Delta


So far we have considered real 1D functions f : R → R, which are essentially objects that assign
the numbers f (x) to some subset of real numbers x. It turns out this is too restrictive for
certain aspects of differential calculus, and a generalization in terms of a new class of object is
useful. The idea is to reinterpret f as some object that assigns the number ∫ f(x)φ(x) dx to
every suitably chosen φ belonging to some set of continuous functions. An f defined as such a
map is called a distribution or a generalized function.
We can intuitively see why this definition of f would lend itself to be called a distribution: in
order to get a number from f one must compute the integral over the entire function, not just use
a single point on the real axis. The distribution f can thus be thought of as a linear functional
on a space of functions φ. Clearly, all continuous functions must also be distributions, but a
distribution need not be a function since a point-wise definition of f is not required. Distributions
thus extend the space of objects we can play with within the realm of differential calculus. The
formal maths surrounding distributions is not trivial, nor a part of the syllabus for this course,
so we will not discuss it in any detail. However, one distribution that is not also a function plays
a vital role in physics (and in the differential calculus used by physicists): the Dirac delta δ.
With some care one can often manipulate it as if it were a function, but it is important to keep
in mind the differences to avoid confusion. We list some of the useful characteristics of the Dirac
delta function below. For a proper discussion of the more general analysis of distributions the
interested student may wish to consult chapter 6 of Rudin’s Functional Analysis for an excellent,
if quite advanced, exposition of the topic.
The Dirac delta function has several definitions, but the most useful one for our purposes is,
in one dimension:

    ∫_{−∞}^{+∞} f(x)δ(x) dx = f(0),    (76)

for all continuous functions f on R. We will not address the details of the meaning of the symbol
∫ in Eq. (76), but note that it cannot be the Riemann integral given by Eq. (17) (and sketched
in Fig. 5).
While it is tempting to consider δ(x) as a function, i.e., a map between reals, this would be
incorrect and can lead to contradictions. From the definition we see that for the particular case
of f(x) = 1 we must have

    ∫_{−∞}^{+∞} δ(x) dx = 1.    (77)

This fits with our concept of continuous probability distributions: the integrand must be eval-
uated over some finite range rather than at a point. In fact, by definition this integral of the
delta is always 1 for any interval containing the point 0, and 0 otherwise.
The δ can be shifted around the real axis generalizing Eq. (76) to a generic point x0 , where
for convenience we use the standard notation for functions:
    ∫_{−∞}^{+∞} f(x)δ(x − x0) dx = f(x0).    (78)

We can also define the derivative of a distribution, here of the Dirac delta, as
    ∫_{−∞}^{+∞} f(x)δ′(x) dx = −f′(0),    (79)

which can be shown to have all the required properties. The expression above can be understood
in terms of integration by parts, i.e.
    ∫_{−∞}^{+∞} f(x)δ′(x) dx = [δ(x)f(x)]_{−∞}^{+∞} − ∫_{−∞}^{+∞} f′(x)δ(x) dx = −f′(0).    (80)

This is, of course, a terrible abuse of notation, and the term with the delta outside of the
integral should give you indigestion. Nevertheless, it is a useful aide-mémoire, even if not a
rigorous proof. It is important to note that these integrals make sense whether δ is differentiable
or not (it is not, in the normal sense that we have used so far). All that is needed is that the
function f is differentiable. By setting f (x) = 1 we then find that
    ∫_{−∞}^{+∞} δ′(x) dx = 0.    (81)

One of the many uses in physics of the Dirac delta is that it allows us to move between
discrete and continuous variables and distribution functions with ease. Let X be a discrete
random variable which can take values x1 , x2 , ..., xn with related probabilities p1 , p2 , ..., pn . Then
X can be described as a continuous variable having a PDF given by:
    f(x) = Σ_{i=1}^n pi δ(x − xi).    (82)

Using the PDF to evaluate the probability of X lying in some interval [a, b] yields
    P(a < X ≤ b) = ∫_a^b f(x) dx = Σ_{i=1}^n pi ∫_a^b δ(x − xi) dx = Σ_{i | xi ∈ [a,b]} pi,    (83)

as expected. Note that this can considerably simplify dealing with discrete quantities, which are
almost always harder to treat than continuous functions.
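One intuitive (if informal) picture is the delta as the limit of ever-narrower normalized Gaussians; the sifting property of Eq. (78) can then be checked numerically. A sketch with our own illustrative choices f(x) = cos x and x0 = 0.5:

```python
import math

def gaussian(x, eps):
    """Normalized Gaussian of width eps; tends to δ(x) as eps -> 0."""
    return math.exp(-x**2 / (2 * eps**2)) / (eps * math.sqrt(2 * math.pi))

def sift(f, x0, eps, half_width=0.05, n=20001):
    """Midpoint-rule approximation of ∫ f(x) δ(x - x0) dx, with δ replaced by a
    Gaussian of width eps; the tails beyond half_width are negligible here."""
    a = x0 - half_width
    h = 2 * half_width / n
    return sum(f(a + (i + 0.5) * h) * gaussian(a + (i + 0.5) * h - x0, eps) * h
               for i in range(n))

approx = sift(math.cos, 0.5, eps=1e-3)   # tends to cos(0.5) as eps -> 0
```

The step size must be much smaller than eps for the narrow peak to be resolved, which is why n is large compared to the integration window.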

4 Curves and vector functions

Figure 19: Parametrization of a curve in 3D space by a vector r that is a function of a continuous variable t ∈ [t1, t2].

Vector functions of one variable are useful to represent curves in multidimensional space. To be


specific, we will focus here on the case of 2 or 3 dimensions, but the generalization to higher
dimensions is straightforward. Consider a point moving in 3D space along a certain trajectory.
Its positional coordinates (x, y, z) will be a function of time t, and thus specified by a function
of the form r : [t1 , t2 ] → R3 , with r = (x, y, z) and

x = x(t)

y = y(t) t ∈ [t1 , t2 ]

z = z(t).

We can also write the function explicitly in vector form as r(t) = x(t)î + y(t)ĵ + z(t)k̂.

Definition 9. Let I be an interval [a, b] in R, and let r : I → R3 be a continuous map. Then r


is called a curve in R3 . The image, r(I), is called the trace of r.
If r(a) = r(b), the curve is said to be closed. If, with the exception of its end points, the curve
never passes through the same point twice (i.e., never crosses itself), the curve is said to be simple.

The curve r written explicitly as a function of some parameter (here t) is called the parametrized
form of the curve. The same curve can also be given in a non-parametrized version, purely as a
function of the coordinate variables (x, y, z).

Example 10. Let the curve c be a circle in 2D, centred at the origin, of radius
r. Write the curve in parametric and non-parametric form.

Solution: In parametric form the curve is given by c = (x, y) with


    x = r cos(t)
    y = r sin(t),    t ∈ [0, 2π].

Figure 20: Examples of curves that are closed, open, simple, and not simple.

To find the non-parametric form, we must eliminate t from the equations above. The simplest
way to do this is to square x and y, and add them together, to find explicitly the relationship
between x and y: x² + y² = r².
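As a quick numerical sanity check, points generated from the parametric form do satisfy the non-parametric equation (r = 2 is an arbitrary choice of ours):

```python
import math

r = 2.0
# sample the parametric form at many values of t and check x^2 + y^2 = r^2
dev = max(abs((r * math.cos(t))**2 + (r * math.sin(t))**2 - r**2)
          for t in (2 * math.pi * k / 100 for k in range(101)))
```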

4.1 Derivative of a vector function


We now wish to generalize the concept of derivative for the vector function r defined above. We
can immediately see that the standard definition of differentiation via the limit of increments
remains valid:

Definition 10. Let r : I → Rn, and t0 ∈ I. Then the derivative of r at t0 is

    (d/dt) r(t) |_{t=t0} ≡ r′(t0) = lim_{h→0} [r(t0 + h) − r(t0)] / h,    (84)

provided the limit exists and is finite.

The limit of a vector quantity is straightforward to compute: it is simply the limit of each
component of the vector. This follows directly from the definition of a vector space. Hence
we see that the derivative of a vector function is the vector containing the derivatives of its
components:
    r′(t) = (r′1(t), r′2(t), · · · , r′n(t)).    (85)

A curve r is said to be regular if r′(t) ≠ 0 for all t ∈ I, and singular otherwise. From Eq. (84)
we also see how we can define the differential of a vector:
    dr = (dr/dt) dt.    (86)
So far we have only considered vectors written in the Cartesian basis, where the basis vectors
do not change from one point to another. This is not the case for polar coordinates, and in this

36
Figure 21: Graphical representation of the derivative of a vector r(t), taken in the limit of
infinitesimal increment h of the parameter t.

case we cannot assume that the basis vectors remain constant. To see this, recall the definition
of the basis vectors in plane polar coordinates from Eq. (47):
r̂ = cos θ î + sin θ ĵ
θ̂ = − sin θ î + cos θ ĵ.
By writing them in the Cartesian basis we can find their derivatives explicitly:
    dr̂/dt = −sin θ (dθ/dt) î + cos θ (dθ/dt) ĵ = (dθ/dt) θ̂

    dθ̂/dt = −cos θ (dθ/dt) î − sin θ (dθ/dt) ĵ = −(dθ/dt) r̂.
So some care is needed in general curvilinear coordinate systems, but the standard results and
the chain rule all continue to work as expected.

Example 11. Let (ds)² = dr · dr = (dx)² + (dy)² + (dz)² indicate an infinitesimal


displacement written in Cartesian coordinates. Find the expression for (ds)2 in spherical
polar coordinates.

Solution: We make use of the definition of a vector differential from Eq. (86), for the transfor-
mation (x, y, z) → (r, θ, φ):
x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ.
Because r = r(r, θ, φ) depends on three variables, we have:

    (ds)² = |dr|² = |∂r/∂r|² (dr)² + |∂r/∂θ|² (dθ)² + |∂r/∂φ|² (dφ)²

(the cross terms vanish because the three partial-derivative vectors are mutually orthogonal). With

    ∂r/∂r = (sin θ cos φ, sin θ sin φ, cos θ),
    ∂r/∂θ = (r cos θ cos φ, r cos θ sin φ, −r sin θ),
    ∂r/∂φ = (−r sin θ sin φ, r sin θ cos φ, 0),

we find

    (ds)² = 1 (dr)² + r² (dθ)² + r² sin²θ (dφ)².

We recover the result for spherical polar coordinates of Eq. (60), found using a geometric ap-
proach.
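The line element just derived can be verified numerically by comparing the Cartesian distance between two nearby points with the metric expression (the sample point and increments below are arbitrary choices of ours):

```python
import math

def cart(r, th, ph):
    """Cartesian coordinates of the spherical-polar point (r, θ, φ)."""
    return (r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th))

r, th, ph = 1.3, 0.7, 0.4           # base point
dr, dth, dph = 1e-6, 2e-6, -1e-6    # small increments

ds_cart = math.dist(cart(r, th, ph), cart(r + dr, th + dth, ph + dph))
ds_metric = math.sqrt(dr**2 + (r * dth)**2 + (r * math.sin(th) * dph)**2)
```

The two values agree to first order in the increments, which is exactly what the differential relation asserts.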

4.2 Integral of a vector function


Like the derivative, the definition of the integral of a vector function poses no particular diffi-
culties, and is a simple vector extension of known results in 1D.
Definition 11. Let r : [a, b] → Rn. The curve r is integrable on [a, b] if every component ri
is integrable on [a, b]. Then we can write
    ∫_a^b r(t) dt = (∫_a^b r1(t) dt, ∫_a^b r2(t) dt, · · · , ∫_a^b rn(t) dt).    (87)

With this definition we can recover the fundamental theorem of calculus for vector functions:
Theorem 2. Let r : [a, b] → Rn be a differentiable map with continuous first derivative r′.
Then,

    ∫_a^b r′(t) dt = r(b) − r(a).    (88)
Clearly, for regular closed curves this integral is always zero.

4.3 Arc length


Let r : [a, b] → Rm be a regular curve. How can we find the length of an arc between two
points a and b on the curve (a quantity sometimes also referred to as the metric distance)? To
fix ideas let’s start with the simplest non-trivial case of a curve in 2D (m = 2), as illustrated
in Fig. 22. Imagine we split the curve into a finite number of segments, and then connect the
points delimiting the segments with straight lines. The length of each such segment, ∆si, can
be found easily via Pythagoras' theorem as ∆si² = ∆xi² + ∆yi² (i.e., making use of the Euclidean
metric for the distance between points). Summing the lengths across all the segments provides
an approximation to the length of the curve between the two limiting points, s ≈ Σi ∆si. If we
then take the limit for ∆x → dx, the arc length s can be written as an integral in terms of the
infinitesimal line element ds:

    s = ∫_a^b ds.    (89)
This equation holds for curves embedded in any number of dimensions, the only thing that
changes is the number of terms contained in ds. The integral seems straightforward to evaluate,
and it is, provided the curve is parametrized in terms of the arc length s. Unfortunately, this is
almost never the case. Fortunately, it is not difficult to see how the above integral can be written
in a more general form, which better lends itself to be applied to curves of any parametrization.
To see this, let r(t) be some curve in 3D space, given in parametric form as r(t) = x(t)î +
y(t)ĵ + z(t)k̂, and consider an infinitesimal displacement dr = dx î + dy ĵ + dz k̂ along the curve.
The square of the infinitesimal distance moved is given by

    (ds)² = dr · dr = (dx)² + (dy)² + (dz)².    (90)

If instead of taking the differential dr we consider the derivative dr/dt, this relationship can be
rewritten as

    (ds/dt)² = (dr/dt) · (dr/dt) = (dx/dt)² + (dy/dt)² + (dz/dt)².    (91)
In particular, we can write
    ds/dt = |dr(t)/dt| = |r′(t)| = √[(dx/dt)² + (dy/dt)² + (dz/dt)²].    (92)

Figure 22: Calculation of the length of a curve via segments.

The arc length given by the integral in Eq. (89) can then be rewritten as
    s = ∫_a^b |r′(t)| dt = ∫_a^b √[(dx/dt)² + (dy/dt)² + (dz/dt)²] dt.    (93)

If the curve is provided in non-parametric form, then we need to find how ds varies with dx,
dy or dz:
    ds = √[(dx)² + (dy)² + (dz)²]
       = √[1 + (dy/dx)² + (dz/dx)²] dx
       = √[1 + (dx/dy)² + (dz/dy)²] dy
       = √[1 + (dx/dz)² + (dy/dz)²] dz.

Which one of the last three expressions we choose to substitute into Eq. (89) will largely depend
on the functional expressions of the curve, and on the interdependences between them, as some
derivatives may be easier to integrate than others.
These expressions yield a particularly simple, yet useful, case for curves r that can be written
as a 1D function in the (x, y) plane by an equation of the form y = f (x). The arc length can
then be written directly as a function of f, or of its inverse f⁻¹, if it exists, as:

    s = ∫_{ax}^{bx} √[1 + f′(x)²] dx = ∫_{ay}^{by} √[1 + ((f⁻¹)′(y))²] dy.    (94)

Figure 23: Unit vectors defining an orthonormal coordinate system on a curve in space. The
derivatives of these three vectors with respect to the arc length along the curve s are given by
the Frenet-Serret formulae.

Example 12. Find the arc length of a cylindrical helix given by the parametric equations

x = R cos t,
y = R sin t,
z = αt,

where R and α are real constants, and t ∈ [0, 2π].

Solution: We start by calculating |r′(t)|:

    r′(t) = (−R sin t, R cos t, α),

    |r′(t)| = √(R² sin²t + R² cos²t + α²) = √(R² + α²).

The arc length is then given by the integral

    l = ∫_0^{2π} |r′(t)| dt = ∫_0^{2π} √(R² + α²) dt = 2π √(R² + α²).
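The same arc length can be recovered by summing the lengths of many short chords along the helix, mirroring the construction of Fig. 22 (R and α below are arbitrary choices of ours):

```python
import math

R, alpha = 2.0, 0.5
N = 100_000                        # number of chords
pts = [(R * math.cos(t), R * math.sin(t), alpha * t)
       for t in (2 * math.pi * k / N for k in range(N + 1))]
length = sum(math.dist(pts[k], pts[k + 1]) for k in range(N))
exact = 2 * math.pi * math.sqrt(R**2 + alpha**2)
```

The chord sum converges to the exact value as N grows, which is precisely the limiting procedure behind Eq. (89).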

4.4 Frenet-Serret coordinate system


We have introduced the concept of a curve in space in vector form with some parametrization
as r(t), and of the tangent vector to the curve r0 (t). We want to extend this to describe the way
the curve changes direction more generally, and define a new orthogonal coordinate system that
is intrinsic to the curve and not fixed by an external reference frame.
First of all, it is easy to show that if |r(t)| is constant, that is to say the length of the vector
does not change along the curve, then r(t) ⊥ r′(t) ∀t:

    |r(t)|² = const.

    (d/dt) |r(t)|² = 0 = 2 r′(t) · r(t) ⇒ r′ ⊥ r.
From Eq. (92) we have that ds = |r′(t)| dt. The size of the tangent |r′(t)| assumes the role
of a speed: it tells us how quickly the distance s mapped out along the curve changes with t. Clearly, if we
choose our parametrization t to be the arc length s, then ds = |r′(s)| ds, and the tangent in this
parametrization must have a modulus of 1 for all s: r′(s) is a unit tangent vector to the curve,
which we shall call t̂.
The rate at which the unit tangent t̂ changes with respect to s is given by dt̂/ ds, and its
magnitude is defined as the curvature κ of the curve at a given point

    κ = |dt̂/ds| = |d²r/ds²|.    (95)

Note that for a straight line, κ = 0. The inverse of the curvature, the radius of curvature
ρ = 1/κ, is often the more useful quantity to quantify the “curvedness” of a curve, as shown in
Fig. 23. Because t̂ is a unit vector, its derivative (of size κ) is orthogonal to it. The unit vector
in this direction is called the principal normal n̂:

    dt̂/ds = κ n̂.    (96)

Finally, we can define a third unit vector, the binormal to the curve, as b̂ = t̂ × n̂. Since b̂ · t̂ = 0
we can write

    (d/ds)[b̂ · t̂] = (db̂/ds) · t̂ + b̂ · (dt̂/ds)
                  = (db̂/ds) · t̂ + b̂ · κn̂ = 0.

Since b̂ ⊥ n̂ the last term is zero, so the first term must be zero as well. The vector db̂/ds is thus
⊥ to both t̂ and to b̂ (because |b̂| = 1), and must therefore point in the direction of n̂. To state
this explicitly, we write
    db̂/ds = −τ n̂,    (97)
where τ is called the torsion of the curve, and the negative sign is a matter of convention.
To find dn̂/ds we note that this vector must be ⊥ to n̂. It can thus be written as some linear
combination of the vectors t̂ and b̂:

    dn̂/ds = α t̂ + β b̂.    (98)
Taking the scalar product of this expression with t̂ we can see that α = (dn̂/ds) · t̂. We then note
that since n̂ · t̂ = 0,

    (dn̂/ds) · t̂ = −n̂ · (dt̂/ds).
Using Eq. (96) we can conclude that α = −κ. Similarly, taking the scalar product of Eq. (98)
with b̂ we find β = (dn̂/ds) · b̂, and since n̂ · b̂ = 0, we again have that

    (dn̂/ds) · b̂ = −n̂ · (db̂/ds),
from which, using Eq. (97), we conclude that β = τ . Therefore, Eq. (98) can be rewritten as

    dn̂/ds = −κ t̂ + τ b̂.    (99)

The trio (t̂, n̂, b̂) form a right-handed orthonormal coordinate system at any given point on
the curve, but note that this system changes as one moves along the curve. They are sketched

together in Fig. 23. The equations relating the vectors of the Frenet frame and their derivatives
with respect to the arc length along the curve s are given by the famous Frenet-Serret formulæ:

    dt̂/ds = κ n̂,    dn̂/ds = τ b̂ − κ t̂,    db̂/ds = −τ n̂.    (100)
They describe by how much the intrinsic coordinate system changes orientation as one moves
along the space curve parametrized by s.
We can see that the shape of a curve is completely determined by κ and τ . In particular, the
necessary and sufficient condition for a curve to be planar is for it to have no torsion, i.e. τ = 0,
at arbitrary s. In contrast, a general helix is defined as a curve in which t̂ makes a constant
angle with a fixed direction, which happens precisely when the ratio τ/κ is independent of s. If both κ and τ
are constant, the helix is called a circular helix.
We can use these results to distinguish three different kinds of coordinate systems:

• Fixed coordinate system, e.g. Cartesian (î, ĵ, k̂).

• Varying coordinate system, e.g. spherical polar (r̂, θ̂, φ̂).

• Intrinsic coordinate system, e.g. Frenet-Serret (t̂, n̂, b̂).
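A small finite-difference sketch of the curvature definition of Eq. (95): a circle of radius R, parametrized by arc length as r(s) = (R cos(s/R), R sin(s/R)), should have κ = 1/R (R and the probe point s0 are our own illustrative choices):

```python
import math

R = 3.0
def r(s):
    # circle of radius R, parametrized by arc length
    return (R * math.cos(s / R), R * math.sin(s / R))

s0, h = 1.0, 1e-4
# central second difference of each component: d²r/ds² ≈ (r(s+h) - 2 r(s) + r(s-h)) / h²
d2 = [(p - 2 * q + m) / h**2
      for p, q, m in zip(r(s0 + h), r(s0), r(s0 - h))]
kappa = math.hypot(*d2)            # should be close to 1/R
```

This also illustrates why arc-length parametrization is essential here: the formula κ = |d²r/ds²| holds only when the parameter is s itself.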

4.5 Line integrals


The results shown above allow us to consider a new type of integral: a line or path integral.
The idea is that we want to find a general way to integrate some quantity (typically a vector
or a scalar function) along some specific pathway in space, and which does not necessarily align
with any of the coordinate axes. Such integrals are common in physics. Suppose we wish to
calculate the total mass of a non-homogeneous string of length L, for which we know the linear
density. If we can parametrize the string using the arc length r = r(s), s ∈ [0, L], and if the
linear density is given as ρ = ρ(s), then the total mass m will be given by the integral
    m = ∫_0^L ρ(s) ds.    (101)

In practice, parametrizing the curve as a function of the arc length can be quite challenging,
but fortunately we can write the same integral for a generic parametrization r = r(t), t ∈ [a, b]
using a change of variables transformation

    s = s(t),    ds = |r′(t)| dt,

    m = ∫_a^b ρ(s(t)) |r′(t)| dt = ∫_a^b ρ̃(t) |r′(t)| dt.    (102)

Definition 12. Let r : [a, b] → Rm be a regular curve with trace γ and let f be a real-valued
function defined on a subset of Rm which contains γ (f : A ⊂ Rm → R, γ ⊂ A). The line (or
curvilinear) integral of f along γ is the integral
    ∫_γ f ds ≡ ∫_a^b f[r(t)] |r′(t)| dt.    (103)

This is the natural generalization of the standard 1-dimensional integral to a generic curve in
space. The challenge is typically not so much in computing the integral, but rather in finding a
suitable parametrization for the curve γ.

Example 13. Evaluate the line integral
    ∫_γ xy⁴ ds,

where γ is the right half of the circle given by the equation x² + y² = 16.

Solution: We start by parametrizing the variables in the most convenient way:

    x = 4 cos θ,
    y = 4 sin θ,    θ ∈ [−π/2, π/2],

and therefore dx = −4 sin θ dθ and dy = 4 cos θ dθ. We can use Eq. (92) to find ds:

    ds = √[(dx/dθ)² + (dy/dθ)²] dθ
       = √(16 sin²θ + 16 cos²θ) dθ = 4 dθ.

The integral becomes


    ∫_γ xy⁴ ds = ∫_{−π/2}^{+π/2} (4 cos θ)(4 sin θ)⁴ 4 dθ = 4⁶ [sin⁵θ / 5]_{−π/2}^{+π/2} = 4⁶ (2/5).
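The closed-form value can be checked numerically, integrating in θ with a midpoint rule using the same parametrization:

```python
import math

n = 100_000
h = math.pi / n                    # θ runs over [-π/2, π/2]
total = 0.0
for i in range(n):
    th = -math.pi / 2 + (i + 0.5) * h
    x, y = 4 * math.cos(th), 4 * math.sin(th)
    total += x * y**4 * 4 * h      # integrand times ds, with ds = 4 dθ
expected = 4**6 * 2 / 5            # closed-form value from above
```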

4.5.1 Line integrals of scalar and vector fields


It is often convenient to consider line integrals where we integrate with respect to the infinitesimal
vector displacement dr, rather than with respect to the arc length parameter ds = | dr|. This
can be done using Eq. (86), since the two quantities are related: dr = (dr/ds) ds. In this case we will
be presented with integrals of the form:
    ∫_γ f(r) dr = ∫_γ f(x, y, z)(î dx + ĵ dy + k̂ dz),    (104)

where f(x, y, z) is a scalar function or field. For basis vectors that do not change with the
coordinates we can take them out of the integral and write the result as a vector with the integrals
∫_γ f(x, y, z) dxi as its components.
If instead of integrating a scalar function along some curve γ we integrate a vector function,
then we will find integrals of the following form:
    ∫_γ A(r) · dr = ∫_γ Ax(x, y, z) dx + ∫_γ Ay(x, y, z) dy + ∫_γ Az(x, y, z) dz,    (105)

where A(r) = (Ax , Ay , Az ) is the vector field, and the dot indicates the scalar product, so the
outcome of the integral must now be a scalar. Integrals of this form are common in physics, and
are used, for example, to find the total work done, W , by a force F moving along some path γ:
    W = ∫_γ F · dr.    (106)

Example 14. Evaluate the line integral
    I = ∫_γ A(r) · dr,

where γ is the line segment connecting points (0, 2) and (1, 4), and A = (yx², sin πy).

Solution: We start by parametrizing the curve γ. Taking (0, 2) as the point on the line, and
noting it has a slope of 2, we can write the line equation as

    r(t) = (0, 2) + t (1, 2),    t ∈ [0, 1].

From this we find the most convenient parametrization setting x = t, and y = 2t + 2, which
implies dx = dt, and dy = 2 dt. We can now rewrite the integral in a more tractable form:
    I = ∫_γ (yx² dx + sin πy dy)
      = ∫_0^1 (2t + 2)t² dt + ∫_0^1 sin[π(2t + 2)] 2 dt
      = [t⁴/2 + (2/3)t³]_0^1 − [cos[π(2t + 2)] / π]_0^1 = 7/6.
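The value 7/6 can be checked by evaluating the parametrized integral numerically (midpoint rule in t):

```python
import math

n = 100_000
h = 1.0 / n
I = 0.0
for i in range(n):
    t = (i + 0.5) * h
    x, y = t, 2 * t + 2            # the parametrization of the segment
    # dx = dt and dy = 2 dt, so the integrand in t is y x² + 2 sin(πy)
    I += (y * x**2 + math.sin(math.pi * y) * 2) * h
```

The sine contribution integrates to zero over the segment, so only the polynomial term survives, as in the closed-form calculation.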

4.6 Conservative fields


The line integral of a vector field (or vector function) A(r) given by

    ∫_γ A(r) · dr

in general depends on A at all points r traversed by the path γ: it is a path-dependent quantity.
However, if A can be expressed as the gradient of some scalar function, A(r) = ∇φ(r), the integral
becomes dependent only on the end points of the curve, and independent of the specific path
taken between these two end points. A vector field A for which this holds is called a conservative
field, and the scalar function φ is referred to as its scalar potential.
To see this explicitly, we can substitute the expression for the gradient of a scalar field into
the expression for the line integral:
    ∫_γ A(r) · dr = ∫_γ ∇φ(r) · dr =    (107)

    = ∫_γ (∂φ/∂x dx + ∂φ/∂y dy + ∂φ/∂z dz) =    (108)

    = ∫_γ dφ(r) = [φ(r)]_a^b = φ(b) − φ(a),    (109)

where a and b indicate the start and end points of the curve γ, as illustrated in Fig. 24. The
integral is path-independent, because we are integrating an exact (also called perfect) differential.
A consequence of this is that every closed line integral of such a vector field must necessarily be
zero, since closed integrals have the same starting and ending point:
    ∮_γ A(r) · dr = 0,    if A(r) = ∇φ(r).    (110)

Figure 24: Three different paths taken between points a and b. If the vector field is conservative,
then the line integral of the vector field will not depend on the path taken, but only on the points
a and b.

Importantly, the converse statement also holds. If A is such that its integral vanishes around
every closed curve in some region, then there exists a scalar function φ such that A = ∇φ. In
physics, the line integral of a vector field around a closed curve is often also called the circulation
of the field.
There is a small but important caveat to the statement above: it holds provided the region
R in which we select our curves γ is simply connected: we need to be able to continuously
transform every path between the two points into any other such path in R without leaving
R. An alternative way to say this is to require that every simple closed curve in R can be
continuously shrunk to a point. In practice this means the domain must not have holes: i.e.,
regions for which A(r) and its derivatives are ill-defined. If the region R contains a hole then
there exist simple curves that cannot be shrunk to a point without leaving R. Such a domain
has two boundaries (the outer boundary, and the boundary around the hole), and is thus called
doubly-connected.

Definition 13. A plane region R is said to be simply connected if every simple closed curve
within R can be continuously shrunk to a point without leaving the region. A region with n holes
is said to be (n + 1)-fold connected, or multiply connected.

In practice it is much simpler to identify exact differential forms than it is to prove general
path independence of some vector field. To see this, consider the integrand in Eq. (108), which
must be an exact differential, in terms of the components of A:
    (∂φ/∂x) dx + (∂φ/∂y) dy + (∂φ/∂z) dz = Ax dx + Ay dy + Az dz.    (111)
Now note that the second mixed partial derivatives of φ must be independent of the order in
which the derivatives are taken
    ∂²φ/∂x∂y = ∂²φ/∂y∂x,    ∂²φ/∂x∂z = ∂²φ/∂z∂x,    ∂²φ/∂y∂z = ∂²φ/∂z∂y.    (112)

Written in terms of A = (Ax , Ay , Az ), these expressions provide a useful test for whether a
generic vector field is conservative – the following relations must hold for its components:
    ∂Ax/∂y = ∂Ay/∂x,    ∂Ax/∂z = ∂Az/∂x,    ∂Ay/∂z = ∂Az/∂y.    (113)

Example 15. Determine if A represents a conservative vector field for the following cases:

(a) A = r f (r) (b) A = c f (r),

where r is the position vector, r its modulus, and c a constant vector.

Solution: We need to check that the relations in Eq. (113) hold for the vector fields provided.
Starting with case (a), we have
    ∂Ax/∂y = ∂(xf(r))/∂y = xf′(r) ∂r/∂y = f′(r) xy/r,

    ∂Ay/∂x = ∂(yf(r))/∂x = yf′(r) ∂r/∂x = f′(r) yx/r.
These are the same; proceeding similarly we can find the other two equalities hold, hence dφ =
f(r)(x dx + y dy + z dz) is an exact differential. The associated scalar potential φ can only be
given in integral form, since we have no information on the functional form of f. It is simply
φ = ∫ rf(r) dr.
For case (b) the calculation is very similar. Writing c = (cx , cy , cz ) we have
    ∂Ax/∂y = ∂(cx f(r))/∂y = cx f′(r) ∂r/∂y = cx f′(r) y/r,

    ∂Ay/∂x = ∂(cy f(r))/∂x = cy f′(r) ∂r/∂x = cy f′(r) x/r.
These clearly differ, and so we conclude that f (r)(cx dx + cy dy + cz dz) cannot be an exact
differential.

Example 16. The gravitational force acting on a particle of mass m near the Earth’s
surface can be represented by the vector F = (0, 0, −mg), where the z-axis points vertically
upwards, and g is some constant. Find a general expression for the work done by the force
as the particle is displaced from height h1 to h2 , and show that the work done is path
independent.

Solution: We start by checking if the force is conservative, i.e., if the vector field can be
represented as the gradient of some scalar field. The conditions of Eq. (113) are trivially verified,
since F is a constant vector and all the partial derivatives of its components are 0. The differential
is thus exact, and given by
dφ(x, y, z) = F · dr = −mg dz.
We recognize the scalar field φ as minus the potential energy. Integrating both sides between the initial
and final points, with corresponding heights h1 to h2 , we find the work done
    W = ∫_{r1}^{r2} F · dr = ∫_{r1}^{r2} dφ(r) = ∫_{h1}^{h2} (−mg) dz
      = φ(h2) − φ(h1) = −mg(h2 − h1).
This is obviously path-independent, since it is given by the difference in values of φ(z) calculated
at two different points.

4.7 Green’s theorem


Green’s theorem allows us to make a connection between two seemingly different mathematical
objects: a line integral around some closed curve in a plane, and the double integral over the
region bounded by that closed curve, as shown in Fig. 25. This is the first of several such
relationships that we will study in this course.

Figure 25: A domain R of the xy-plane bounded by a curve γ, used in Green’s theorem. The
domain is simple, and bounded by the box x ∈ [a, b] and y ∈ [c, d].

Theorem 3. Let P (x, y) and Q(x, y) be two continuous functions with continuous partial deriva-
tives, well-defined inside and on the boundary γ of a simply connected region R in the xy-plane.
Then the following relationship holds
    ∮_γ (P dx + Q dy) = ∬_R (∂Q/∂x − ∂P/∂y) dx dy.    (114)

The proof of Green’s theorem is relatively straightforward if we restrict ourselves to simple


domains. Note that this is a convenience, not a requirement. A non-simple domain can be
sliced into multiple smaller domains which can be made to be simple. The process is only more
laborious. Consider the expression on the right hand side of Eq. (114), and the domain R and
bounding curve γ shown in Fig. 25. For convenience we will write the curve connecting the
points ABC as y1 (x), and that connecting ADC y2 (x). The two allow us to consider the double
integral over R as a y-simple domain, and together parametrize the entire boundary curve γ.
Similarly, taking the domain as x-simple we can parametrize the curve γ as two functions, now of
y, and call the curve connecting BAD x1 (y), and that connecting BCD x2 (y). Our strategy will
be to break down the double integral using known methods. Starting with the term containing
P we have:
    ∬_R (∂P/∂y) dx dy = ∫_a^b dx ∫_{y1(x)}^{y2(x)} (∂P/∂y) dy = ∫_a^b dx [P(x, y)]_{y=y1(x)}^{y=y2(x)}
    = ∫_a^b [P(x, y2(x)) − P(x, y1(x))] dx = −∮_γ P dx.

Similarly, for the Q term:


    ∬_R (∂Q/∂x) dx dy = ∫_c^d dy ∫_{x1(y)}^{x2(y)} (∂Q/∂x) dx = ∫_c^d dy [Q(x, y)]_{x=x1(y)}^{x=x2(y)}
    = ∫_c^d [Q(x2(y), y) − Q(x1(y), y)] dy = ∮_γ Q dy.

Adding the two integrals yields Green’s theorem.

Example 17. Use Green’s theorem to calculate the area of an ellipse centred at the origin
with semi-axes a and b.

Solution: Let R be the region in the xy-plane bounded by the closed curve describing the
ellipse in question. The area A of the ellipse will be given by the double integral A = ∬_R dx dy.
We must therefore choose the integrand of the right hand side of Eq. (114) such that
∂Q/∂x − ∂P/∂y = 1,
which restricts our choice of P and Q. There are many ways to satisfy this requirement; for
example, if we set P = 0, then ∂Q/∂x = 1, and Q = x. Inserting these into Green's theorem:
    A = ∮_γ (P dx + Q dy) = ∮_γ x dy.

The ellipse in question can be written in parametric form as


x = a cos θ
y = b sin θ, θ ∈ [0, 2π].
Substituting this parametrization into the expression for the area gives
    A = ∫_0^{2π} a cos θ · b cos θ dθ = πab.
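Evaluating this boundary integral numerically reproduces πab (a and b below are arbitrary choices of ours):

```python
import math

a, b = 3.0, 1.5
n = 10_000
h = 2 * math.pi / n
area = 0.0
for k in range(n):
    th = (k + 0.5) * h
    area += a * math.cos(th) * b * math.cos(th) * h   # x dy = (a cos θ)(b cos θ) dθ
expected = math.pi * a * b
```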

Example 18. Evaluate the integral ∮_γ A · dr, where A = (2xy, x²), and γ is the triangle
connecting points (0,0), (1,0) and (0,2).

Solution: We start by solving this problem by performing the integral along the line. We have
    ∮_γ A · dr = ∮_γ (2xy dx + x² dy)
    = ∫_{(0,0)}^{(1,0)} (2xy dx + x² dy) + ∫_{(1,0)}^{(0,2)} (2xy dx + x² dy) + ∫_{(0,2)}^{(0,0)} (2xy dx + x² dy)
    = 0 + ∫_1^0 [2x(−2x + 2) + x²(−2)] dx + 0 = [−2x³ + 2x²]_1^0 = 0,

where for the middle segment connecting points (1,0) and (0,2) we used the parametrization
y = −2x + 2, dy = −2 dx.
Another way to proceed is to use Green’s theorem. We identify P = 2xy, and Q = x2 , both
well-behaved differentiable functions on the xy-plane. Their derivatives are
∂P ∂Q
= 2x, = 2x.
∂y ∂x
As these are the same, from Green’s theorem we can immediately see that the integral must
be 0, irrespective of the path taken. So not only is the integral along the triangle 0, all closed
integrals will integrate to 0. We learn more by doing less.
The observed equality of mixed derivatives above is of course nothing but the known result
for an exact differential, and so it should be possible to write the vector field provided as the
gradient of some scalar function φ, such that A = ∇φ. To find φ we note that
\[
P = \frac{\partial \phi}{\partial x} \;\Rightarrow\; \int d\phi = \int 2xy\,dx \;\Rightarrow\; \phi = x^2 y + C(y),
\]
\[
Q = \frac{\partial \phi}{\partial y} \;\Rightarrow\; \int d\phi = \int x^2\,dy \;\Rightarrow\; \phi = x^2 y + C(x).
\]
Setting C(x) = C(y) = 0 provides the scalar function sought. Its existence means that the
closed line integral must be zero, as found previously.
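A quick numerical cross-check (a Python sketch, not part of the original notes): integrating 2xy dx + x² dy along each side of the triangle should sum to zero, as the exactness argument above predicts.

```python
def segment_integral(p, q, n=20_000):
    """∫ (2xy dx + x² dy) along the straight segment p → q (midpoint rule)."""
    (x0, y0), (x1, y1) = p, q
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        total += 2 * x * y * dx + x * x * dy
    return total

corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
loop = sum(segment_integral(corners[i], corners[(i + 1) % 3]) for i in range(3))
print(loop)  # ≈ 0
```

Each segment is discretized the same way, so the closed-loop sum vanishes up to quadrature error.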

5 Surfaces

Figure 26: Three common ways to present and plot a 2D surface: a surface plot, a colour map, and a contour plot.

In this course we are interested in studying functions of multiple variables, alongside their
derivatives and integrals, and surfaces are among the most common and useful examples of such
functions. Recall that a general function of multiple variables is given by the map
f : D ⊂ Rn → R.
In 2D such a map, if continuous, represents a surface and is commonly written as f (x, y) = z.
The surface itself can be thought of as an object with two degrees of freedom embedded in 3D
space, and can then be described by a 3D vector with components
r = (x, y, z = f (x, y)).
This expression suggests a convenient general parametric way of describing surfaces along the
same lines used to define curves:
r(u, v) = (x(u, v), y(u, v), z(u, v)). (115)
Here r is a vector function of the parameters (u, v) which vary within a certain domain D in the
parametric uv-plane.

Example 19. Write the surface of the sphere x2 + y 2 + z 2 = R2 in parametric form.

Solution: The convenient parameter choice here is to use spherical polar coordinates, where
(u, v) = (θ, φ) are the polar and azimuthal angles. We know that
x = R sin θ cos φ
y = R sin θ sin φ
z = R cos θ,
and so the parametric form of the surface of a sphere is simply
r(θ, φ) = R(sin θ cos φ, sin θ sin φ, cos θ). (116)
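As a small sanity check on this parametrization (a Python sketch, not part of the original notes), every point r(θ, φ) should satisfy x² + y² + z² = R²:

```python
import math, random

def sphere_point(theta, phi, R=1.0):
    """r(θ, φ) = R (sinθ cosφ, sinθ sinφ, cosθ)."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

random.seed(0)
ok = all(
    abs(sum(c * c for c in sphere_point(random.uniform(0, math.pi),
                                        random.uniform(0, 2 * math.pi),
                                        2.5)) - 2.5 ** 2) < 1e-12
    for _ in range(1000)
)
print(ok)  # True
```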

5.1 The Gradient


Let us now extend the definition of a derivative to functions in more than one dimension. This
will allow us to apply it more broadly, and in particular to study surfaces and volumes in 3D
space. Recall the definition for the derivative of a function
\[
\frac{df}{dx} \equiv f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h}. \tag{117}
\]

How should we extend this to multiple dimensions? The first difficulty is in the definition of
the increment h: in multiple dimensions the increment will necessarily become a vector, as
it will have an orientation depending on what directions we want to move in to evaluate the
change in the function. However, simply replacing h by a vector h will not work as we cannot
divide by a vector in the definition of a limit in Eq. (117). The solution will be to consider,
instead, directional derivatives. In particular, for a function in n dimensions f (x), where x =
(x1 , x2 , ..., xi , ..., xn ), we fix n − 1 dimensions and vary only one, xi , at a time. Along this single
dimension the function is one-dimensional, and the known results for taking derivatives apply.
However, to keep in mind that the function is free to vary in multiple dimensions, we will call
the derivative of the function f along the single dimension xi the partial derivative of f with
respect to xi, denoted by
\[
\frac{\partial f(\mathbf{x})}{\partial x_i} \equiv \partial_i f(\mathbf{x}) \equiv f_{x_i}.
\]
Clearly, for a function in n dimensions we will have n partial derivatives. These form a vector
space, and it will be convenient to write them explicitly in vector form. We will call this the
gradient of the function f, calculated at some point x:
\[
\operatorname{grad} f(\mathbf{x}) := \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right) = \nabla f(\mathbf{x}). \tag{118}
\]

The last equality introduces a new vector operator sufficiently important to merit its own symbol, ∇, called nabla:
\[
\nabla := \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \ldots, \frac{\partial}{\partial x_n} \right). \tag{119}
\]
This is an n-dimensional vector operator that takes a (multidimensional) scalar function as an
input, and returns a vector containing all the partial derivatives of the scalar function. Of
particular importance to this course is the gradient in two and three dimensions:
\[
\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right), \qquad \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right). \tag{120}
\]

5.1.1 Directional derivatives


For a continuous and differentiable function f : Rⁿ → R the partial derivative ∂f/∂xᵢ provides a measure of how quickly the function varies along the direction of xᵢ. We can use this to find the rate of change of the function along any direction, making use of the gradient. Suppose we
have some surface as shown in Fig. 27, and that we are interested in evaluating how the function
changes along some direction parametrized by the line

x(t) = x0 + tv̂.

The change in the value of the function will be

∆f = f (x0 + tv̂) − f (x0 ),

which along the line is now a 1D problem. In the limit of t → 0 this is simply the directional
derivative of f along v̂:
\[
\frac{\partial f}{\partial x_v} \equiv \lim_{t\to 0} \frac{f(\mathbf{x}_0 + t\hat{\mathbf{v}}) - f(\mathbf{x}_0)}{t}. \tag{121}
\]
Clearly, the vector v̂ can be written as a linear combination of the basis vectors in the domain.
In the example shown in Fig. 27, this would correspond to some linear combination of the unit
vectors in the î and ĵ directions. It follows that the directional derivatives are linear combinations
of the partial derivatives in the standard basis, and we can thus relate the directional derivatives
to our gradient operator via the following theorem:

Figure 27: Variation of a function along some specific direction v̂.

Theorem 4. Gradient Theorem.


Let f : D ⊂ Rn → R, with f differentiable in x0 ∈ D. For every unit vector v̂ there exists a
directional derivative ∂_v f(x₀), and the following identity holds:
\[
\partial_v f(\mathbf{x}_0) = \nabla f(\mathbf{x}_0)\cdot\hat{\mathbf{v}} = \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}=\mathbf{x}_0} v_i. \tag{122}
\]

The directional derivative is the scalar product of the gradient of the function and the direction vector. This relatively simple result leads to two important consequences. Firstly, the expression ∇f · v̂ is a scalar product of two vectors that will be maximized when ∇f ∥ v̂. The gradient of a function thus always points in the direction where the change in the function is maximal. In contrast, contour lines are defined as lines along which there is no change in the value of the function, i.e., where ∂_v f = 0. This occurs when ∇f ⊥ v̂, and the gradient ∇f is thus always perpendicular to the contour lines. The gradient of the function shown in Fig. 26 is given in Fig. 28 (red arrows), alongside the contour lines of the function to aid the eye. We see that the gradient is indeed perpendicular to the contour lines, and that the gradient always points in the direction of maximum ascent. As intuitively expected, the gradient goes to zero at the extremum points.
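The gradient theorem is easy to test numerically. In the Python sketch below (not part of the original notes; the test function f is an arbitrary choice of ours), a central-difference estimate of the directional derivative is compared against ∇f · v̂:

```python
import math

def f(x, y):                       # arbitrary smooth test function (our choice)
    return x ** 2 * y + math.sin(y)

def grad_f(x, y):                  # its analytic gradient
    return (2 * x * y, x ** 2 + math.cos(y))

def directional_fd(x, y, vx, vy, t=1e-6):
    """Central-difference estimate of ∂_v f along the unit vector (vx, vy)."""
    return (f(x + t * vx, y + t * vy) - f(x - t * vx, y - t * vy)) / (2 * t)

x0, y0, v = 1.3, -0.7, (0.6, 0.8)  # |v| = 1
gx, gy = grad_f(x0, y0)
print(directional_fd(x0, y0, *v), gx * v[0] + gy * v[1])  # the two values agree
```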

Example 20. Find the unit normal vector to the surface z = x2 + y 2 in the point (1, 1, 2).

Solution: Recall that the gradient of a scalar field is always perpendicular to the isosurfaces of
that field. As we have seen, for a 2D field these will be (contour) lines, but in higher dimensions
they will be (hyper-)surfaces representing points where the field has some constant value. We
therefore rewrite the surface above as an isosurface of a 3D scalar field

U (x, y, z) = x2 + y 2 − z = 0,

Figure 28: Gradient of the function shown in Fig. 26. The gradient is perpendicular to the
contour lines, and always points in the direction of maximum ascent.

and compute its 3D gradient:
\[
\nabla U(x, y, z) = \left( \frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}, \frac{\partial U}{\partial z} \right) = (2x, 2y, -1).
\]

This is a vector in 3D space normal to the surface in question. At the point (1, 1, 2) on the surface we have n = (2, 2, −1), and n̂ = (1/3)(2, 2, −1). The gradient thus allows us to quickly and
conveniently compute the normal vectors to surfaces and hypersurfaces by considering them as
isosurfaces of objects in higher dimensional space.
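The normal of Example 20 can be checked with a few lines of Python (a sketch of ours, not part of the original notes):

```python
import math

def unit_normal(x, y):
    """Unit normal to z = x² + y², via ∇U with U = x² + y² − z."""
    n = (2.0 * x, 2.0 * y, -1.0)
    mod = math.sqrt(sum(c * c for c in n))
    return tuple(c / mod for c in n)

print(unit_normal(1.0, 1.0))  # (2/3, 2/3, -1/3)
```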

Example 21. The 2D gravitational potential of a point mass M placed in the origin of
the coordinate system is given by
\[
V(\mathbf{r}) = -\frac{GM}{r},
\]
where r = (x, y) is the position vector, r ≡ |r| = √(x² + y²) its modulus, and G and M are constants. This represents a 2D scalar field. Find the gradient of V.

Solution:
\[
\nabla V(\mathbf{r}) = -\nabla\left(\frac{GM}{r}\right)
= -GM \left( \frac{\partial}{\partial x}(x^2+y^2)^{-1/2},\; \frac{\partial}{\partial y}(x^2+y^2)^{-1/2} \right)
\]
\[
= -GM \left( \frac{-x}{(x^2+y^2)^{3/2}},\; \frac{-y}{(x^2+y^2)^{3/2}} \right)
= GM\,\frac{\mathbf{r}}{r^3} = GM\,\frac{\hat{\mathbf{r}}}{r^2}.
\]
Both the scalar potential V and its gradient (a radial vector field) are shown in Fig. 29. As
expected, the gradient is just the negative of the gravitational force.
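The result of Example 21 can be confirmed with finite differences. In the Python sketch below (not part of the original notes; we set GM = 1 purely for convenience), the numerical gradient of V is compared with GM r/r³:

```python
import math

GM = 1.0  # units chosen so GM = 1 (an assumption made for this check)

def V(x, y):
    return -GM / math.hypot(x, y)

def grad_V_fd(x, y, h=1e-6):
    """Central-difference gradient of the potential V."""
    return ((V(x + h, y) - V(x - h, y)) / (2 * h),
            (V(x, y + h) - V(x, y - h)) / (2 * h))

x0, y0 = 0.8, -1.1
r3 = math.hypot(x0, y0) ** 3
print(grad_V_fd(x0, y0), (GM * x0 / r3, GM * y0 / r3))  # the pairs agree
```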

5.1.2 The gradient in curvilinear coordinate systems


Consider a multi-dimensional scalar function f (x) (in physics we often refer to this as a scalar
field), and assume f is differentiable so that its gradient, the vector ∇f , exists. We are interested
in finding how to compute ∇f , or more precisely, its components, in a generic, non-Cartesian
coordinate system.
In Cartesian coordinates the components of ∇f are given by Eq. (118), and can be represented concisely in index notation as
\[
(\nabla f)_i = \partial_i f = \frac{\partial f}{\partial x_i}. \tag{123}
\]
The three forms are, of course, equivalent. In order to move to a new coordinate system x → u, we need to find how to express the derivatives of f with respect to the new variables uᵢ. Given that the new coordinates are a function of the old, u = u(x), we can use the chain rule to compute this:
\[
\frac{\partial f}{\partial u_i} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial u_i} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial u_i} + \ldots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial u_i} = \sum_j \frac{\partial x_j}{\partial u_i}\frac{\partial f}{\partial x_j}. \tag{124}
\]

We recognize this expression as a matrix equation of the form ∇u f = J ∇x f, where J is the matrix containing the first partial derivatives of the two coordinate systems. For a 3D system,
Figure 29: Relationship between the scalar gravitational potential field V(r) = −GM/r and the vector field given by its gradient, ∇V(r) = GM r/r³.

where x = (x, y, z) and u = (u, v, w), we can write the 3 × 3 matrix explicitly:
\[
J = \begin{pmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[2ex]
\dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \\[2ex]
\dfrac{\partial x}{\partial w} & \dfrac{\partial y}{\partial w} & \dfrac{\partial z}{\partial w}
\end{pmatrix}, \tag{125}
\]
which we recognize as the transpose of the Jacobian matrix. This should come as no surprise, given that we are essentially performing basis transformations on a differential operator.

Example 22. Calculate the gradient of the function f (x, y) = 3(x2 + y 2 ) in both the
Cartesian and the plane polar coordinate basis.

Solution: In Cartesian coordinates the gradient is simply given by
\[
\nabla f = \left( \frac{\partial f(x,y)}{\partial x}, \frac{\partial f(x,y)}{\partial y} \right) = (6x, 6y). \tag{126}
\]
For plane polar coordinates we have x = (x, y) and u = (r, θ), which are related by the expressions
\[
x = r\cos\theta, \qquad y = r\sin\theta.
\]
To find the gradient in the new coordinates we make use of Eq. (124), which here becomes:
\[
\begin{pmatrix} \dfrac{\partial}{\partial r} \\[1.5ex] \dfrac{\partial}{\partial \theta} \end{pmatrix}
= \begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial y}{\partial r} \\[1.5ex] \dfrac{\partial x}{\partial \theta} & \dfrac{\partial y}{\partial \theta} \end{pmatrix}
\begin{pmatrix} \dfrac{\partial}{\partial x} \\[1.5ex] \dfrac{\partial}{\partial y} \end{pmatrix}
= \begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix}
\begin{pmatrix} \dfrac{\partial}{\partial x} \\[1.5ex] \dfrac{\partial}{\partial y} \end{pmatrix}.
\]

We can invert this matrix to find an explicit expression for the gradient in Cartesian coordinates
\[
\begin{pmatrix} \dfrac{\partial}{\partial x} \\[1.5ex] \dfrac{\partial}{\partial y} \end{pmatrix}
= \begin{pmatrix} \cos\theta & -\dfrac{1}{r}\sin\theta \\[1.5ex] \sin\theta & \dfrac{1}{r}\cos\theta \end{pmatrix}
\begin{pmatrix} \dfrac{\partial}{\partial r} \\[1.5ex] \dfrac{\partial}{\partial \theta} \end{pmatrix},
\]

and can write
\[
\nabla = \begin{pmatrix} \dfrac{\partial}{\partial x} \\[1.5ex] \dfrac{\partial}{\partial y} \end{pmatrix}
= \begin{pmatrix} \cos\theta\,\dfrac{\partial}{\partial r} - \dfrac{1}{r}\sin\theta\,\dfrac{\partial}{\partial \theta} \\[1.5ex] \sin\theta\,\dfrac{\partial}{\partial r} + \dfrac{1}{r}\cos\theta\,\dfrac{\partial}{\partial \theta} \end{pmatrix}. \tag{127}
\]

In polar coordinates the function can be written as f = 3r2 , so that its partial derivatives with
respect to (r, θ) are ∂f /∂r = 6r and ∂f /∂θ = 0. Substituting these in the expression above
yields:
∇f = (6r cos θ, 6r sin θ),
which is equal to our first result, as given in Eq. (126). These expressions are all written in the
standard Cartesian basis, even if we have changed the variables. That is to say, calculating ∇f
using Eq. (127) is, explicitly in terms of the basis vectors:
\[
\nabla f = \left( \cos\theta\,\frac{\partial f}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial f}{\partial \theta} \right)\hat{\mathbf{i}}
+ \left( \sin\theta\,\frac{\partial f}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial f}{\partial \theta} \right)\hat{\mathbf{j}}
\]
\[
= \frac{\partial f}{\partial r}\,[\hat{\mathbf{i}}\cos\theta + \hat{\mathbf{j}}\sin\theta]
+ \frac{1}{r}\frac{\partial f}{\partial \theta}\,[-\hat{\mathbf{i}}\sin\theta + \hat{\mathbf{j}}\cos\theta].
\]
Recalling Eq. (47) for the basis vectors r̂ and θ̂, we can rewrite this expression in the basis (r, θ):
\[
\nabla f = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}} = 6r\,\hat{\mathbf{r}} = (6r, 0), \tag{128}
\]
which is the gradient written in the basis of the plane polar coordinate system.
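The agreement between the two bases in Example 22 can be verified numerically. The Python sketch below (not part of the original notes) assembles the gradient from the polar formula and compares it with the Cartesian result (6x, 6y):

```python
import math

def grad_cartesian(x, y):          # ∇f for f = 3(x² + y²)
    return (6 * x, 6 * y)

def grad_via_polar(x, y):
    """Assemble ∇f from ∂f/∂r r̂ + (1/r) ∂f/∂θ θ̂, with f = 3r² in polar coordinates."""
    r, th = math.hypot(x, y), math.atan2(y, x)
    df_dr, df_dth = 6 * r, 0.0
    # r̂ = (cosθ, sinθ), θ̂ = (−sinθ, cosθ)
    return (df_dr * math.cos(th) - df_dth * math.sin(th) / r,
            df_dr * math.sin(th) + df_dth * math.cos(th) / r)

print(grad_cartesian(0.3, -1.2), grad_via_polar(0.3, -1.2))  # identical components
```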

5.1.3 Gradient expressions


We summarize here the results for the gradient written in curvilinear systems. For plane polar
coordinates, as we have seen in the example above, we have:
\[
\nabla f = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}} \equiv \left( \partial_r, \frac{1}{r}\partial_\theta \right) f. \tag{129}
\]
Cylindrical coordinates in 3D are a simple extension of the plane polar system, and the form of
the gradient can be guessed – we just need to add the z component, but this axis is the same
as in the now familiar Cartesian system. Therefore:
\[
\nabla f = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}} + \frac{\partial f}{\partial z}\,\hat{\mathbf{z}} \equiv \left( \partial_r, \frac{1}{r}\partial_\theta, \partial_z \right) f. \tag{130}
\]
Spherical coordinates require a little more work. Following the same approach as above, and
using the relationships introduced in Eq. (59), we find:
\[
\nabla f = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial \phi}\,\hat{\boldsymbol{\phi}} \equiv \left( \partial_r, \frac{1}{r}\partial_\theta, \frac{1}{r\sin\theta}\partial_\phi \right) f, \tag{131}
\]
where θ is the polar angle and φ is the azimuthal angle.
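These curvilinear formulas can be spot-checked against finite differences. The Python sketch below (not part of the original notes) uses the radially symmetric function f = r², for which the spherical formula predicts ∇f = 2r r̂ with vanishing angular terms:

```python
import math

def f_cart(x, y, z):               # f = r², so the spherical formula gives ∇f = 2r r̂
    return x * x + y * y + z * z

def grad_fd(x, y, z, h=1e-6):
    """Central-difference Cartesian gradient."""
    return ((f_cart(x + h, y, z) - f_cart(x - h, y, z)) / (2 * h),
            (f_cart(x, y + h, z) - f_cart(x, y - h, z)) / (2 * h),
            (f_cart(x, y, z + h) - f_cart(x, y, z - h)) / (2 * h))

x0, y0, z0 = 0.4, -0.3, 0.9
r = math.sqrt(x0 * x0 + y0 * y0 + z0 * z0)
expected = tuple(2 * r * (c / r) for c in (x0, y0, z0))   # 2r r̂ in Cartesian components
print(grad_fd(x0, y0, z0), expected)
```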

5.2 Surface integrals


We now turn to the development of techniques necessary to perform surface integrals. These are
similar to the double integrals you have already encountered, but have the added complication
that the surface – which is the domain of integration – is not a flat xy-plane, but some generic
curved surface. Such integrals are common in physics, and we have two main ways to tackle
them. For surfaces which can be parametrized conveniently in some new set of coordinates (e.g.,
spherical polar or cylindrical systems) one can often perform the integral directly after a coordinate transformation. However, finding a simple and suitable transformation can

Figure 30: Bounded surface S and its projection A on the xy-plane.

prove challenging for more complex surfaces, and we often resort to the alternative: we project
the surface onto some convenient Cartesian coordinate plane and perform the integration there.
We proceed to show how these two approaches can be deployed in practice. Surface integrals
will typically come in four general forms. The two that produce scalar outcomes are
\[
\iint_S \phi(\mathbf{r})\,dS, \qquad \iint_S \mathbf{A}(\mathbf{r})\cdot\hat{\mathbf{n}}\,dS, \tag{132}
\]
while the integrals producing a vector output can be written as:
\[
\iint_S \phi(\mathbf{r})\,\hat{\mathbf{n}}\,dS, \qquad \iint_S \mathbf{A}(\mathbf{r})\times\hat{\mathbf{n}}\,dS. \tag{133}
\]

Here n̂ denotes the normal vector to the surface, φ denotes a scalar function, and A a vector
function (or field).
It will be convenient to consider surfaces in terms of their vector areas, defined as dS = n̂ dS.
This is particularly convenient in the context of surface integrals of scalar and vector fields,
which are commonly used to describe physical quantities and processes. We proceed by finding
convenient expressions for the terms n̂, dS, and dS.

5.2.1 Vector area


The vector area S is easiest to understand for a finite planar surface of scalar area S (i.e., the
normal area) and unit normal n̂:
S = Sn̂. (134)
It is simply a vector parallel to the plane normal, with modulus corresponding to the (scalar)
surface area. We can extend this notation to a more general surface composed of a set {Si } of
flat facet areas with corresponding normal vectors n̂ᵢ as:
\[
\mathbf{S} = \sum_i S_i\,\hat{\mathbf{n}}_i. \tag{135}
\]

Taking the limit of this expression for infinitely small facets allows us to define the concept of a
vector area of a bounded differentiable surface
\[
d\mathbf{S} = \hat{\mathbf{n}}\,dS \quad\Rightarrow\quad \mathbf{S} = \int_S d\mathbf{S} \equiv \iint_S \hat{\mathbf{n}}\,dS. \tag{136}
\]

It is worth noting from this expression that the vector and scalar areas are of the same size only
for a plane surface. For a curved surface, the modulus of the vector area will always be less than
the scalar area.
Consider a general surface S as shown in Fig. 30 that has some projection A onto the xy-
plane. An infinitesimal vector area in some point dS has a projection dA, which can be found
by taking the scalar product of the vector area and the unit normal to the xy-plane, k̂:
dA = k̂ · dS = k̂ · n̂ dS = | cos α| dS, (137)
where α is the angle between k̂ and n̂. Clearly the entire surface S can be tiled in this way, and the total vector area is then given by integrating over the entire surface, S = ∫_S dS.
the total vector area is then given by integrating over the entire surface S = S dS.

Example 23. Find the vector areas of a hemisphere and of a sphere of radius R.

Solution: Given the symmetry of the problem, we place the sphere in the centre of the coor-
dinate system and choose spherical polar coordinates to represent the surface of the sphere as
(see Example 19):
r(θ, φ) = R(sin θ cos φ, sin θ sin φ, cos θ).
We know this vector is normal to the surface of the sphere, so the unit normal is given simply
by the same vector, but normalized:
n̂(θ, φ) = (sin θ cos φ, sin θ sin φ, cos θ).
Taking the hemisphere as the surface for positive z, we can calculate the total vector surface as
\[
\mathbf{S} = \iint \hat{\mathbf{n}}\,dS
= \int_0^{\pi/2}\!\! d\theta \int_0^{2\pi}\!\! d\phi \begin{pmatrix} \sin\theta\cos\phi \\ \sin\theta\sin\phi \\ \cos\theta \end{pmatrix} R^2 \sin\theta
= \begin{pmatrix} 0 \\ 0 \\ \pi R^2 \end{pmatrix}.
\]
As expected, the vector only has a component along the z-axis, and the size of the vector area is the projected area – the circle – in the xy-plane. To find the vector area of the entire sphere we simply need to change the limits of the polar integral from θ ∈ [0, π/2] to θ ∈ [0, π], which leads to S = 0. This is a general result: the vector area of any closed surface is the 0 vector.
This result leads to an interesting (and useful) consequence, and that is that the vector area
of an open surface must depend only on its boundary curve. To see this, consider two surfaces,
S1 and S2 , which share a boundary curve. Taking the difference of the surfaces S1 − S2 must
yield a closed surface, for which we know the vector area is 0. The two surfaces must therefore
have the same vector area.
We can use this observation to obtain an expression of the vector area for an open surface
directly from the line integral around its boundary curve γ. Consider the open surface illustrated
in Fig. 31, created by the cone connecting the origin to some closed curve γ. The triangular vector area element dS is given by
\[
d\mathbf{S} = \frac{1}{2}\,\mathbf{r}\times d\mathbf{r},
\]
and adding up all of these triangles along the curve γ gives us the total vector area of the surface:
\[
\mathbf{S} = \frac{1}{2}\oint_\gamma \mathbf{r}\times d\mathbf{r}. \tag{138}
\]
The vector area of any surface bound by a closed curve γ can therefore be found by performing
a line integral of this kind.
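Eq. (138) is easy to verify for a planar boundary. The Python sketch below (not part of the original notes) evaluates ½∮ r × dr for a circle of radius R in the xy-plane, where only the z-component survives and should equal πR²:

```python
import math

def vector_area_circle(R=1.5, n=100_000):
    """S_z from S = ½ ∮ r × dr for the circle r(t) = (R cos t, R sin t, 0)."""
    h = 2 * math.pi / n
    sz = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = R * math.cos(t), R * math.sin(t)
        dx, dy = -R * math.sin(t) * h, R * math.cos(t) * h
        sz += 0.5 * (x * dy - y * dx)       # z-component of r × dr
    return sz

print(vector_area_circle())  # ≈ π R² ≈ 7.0686 for R = 1.5
```

Here x dy − y dx = R² dt, so the sum reproduces πR² essentially exactly.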

Figure 31: Vector area defined as a cone connecting the origin to some closed curve γ. The
vector area element dS can be found by taking the cross product of the vector r describing the
curve, and its differential dr.

5.2.2 Normal vector to a surface


A convenient way to describe the normal to a general surface is in terms of its gradient. Consider
the surface shown in Fig. 30, with some projection on the xy-plane A. For a 2D surface f (x, y)
we have already seen how this can be done by considering an isosurface in a higher dimension
in Example 20, i.e. g(x, y, z) = f (x, y) − z = 0. The unit normal at any point on the surface is
then given by the normalized gradient
\[
\hat{\mathbf{n}} = \frac{\nabla g}{|\nabla g|},
\]
evaluated at the point of interest. Explicitly,
\[
\nabla g = \nabla f(x,y) - \nabla z = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, -1 \right), \tag{139}
\]
\[
|\nabla g| = \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}. \tag{140}
\]

We see from Eq. (136) that in order to evaluate a surface integral we need to find expressions
for both n̂ and dS. To find a more convenient expression for the surface element we can use
Eq. (137) to write
\[
dS = \frac{dA}{|\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}|} = \frac{|\nabla g|\,dA}{|\nabla g\cdot\hat{\mathbf{k}}|} = |\nabla g|\,dA. \tag{141}
\]
We can use these expressions to relate any surface integral over S to a double integral over the
projected region in the xy-plane A:
\[
\int_S d\mathbf{S} = \iint_S \hat{\mathbf{n}}\,dS = \iint_A \frac{\nabla g}{|\nabla g|}\,|\nabla g|\,dA = \iint_A \nabla g\,dA. \tag{142}
\]

The convenient cancellation of the normalization of the gradient of g makes this expression relatively simple to manipulate. Note the loose similarities to the chain rule, considering that dA = dx dy.
An alternative approach is to calculate the vector normal to the surface directly, taking the cross product of two linearly independent vectors lying in the tangent plane at some point on the surface. This approach is convenient for surfaces given in parametric form r(u, v) = (x(u, v), y(u, v), z(u, v)). As we have seen when discussing vector derivatives and the gradient, the vectors ∂r(u, v)/∂u and ∂r(u, v)/∂v, evaluated at some point, are guaranteed to lie in the tangent plane to the surface at that point. The unit normal vector to the surface (and to the tangent
plane) can then be obtained from the cross product:
\[
\hat{\mathbf{n}} = \frac{\dfrac{\partial \mathbf{r}}{\partial u} \times \dfrac{\partial \mathbf{r}}{\partial v}}{\left| \dfrac{\partial \mathbf{r}}{\partial u} \times \dfrac{\partial \mathbf{r}}{\partial v} \right|}. \tag{143}
\]

Following the same steps taken to find the Jacobian for a coordinate transformation we can also
find the size of the area element for a parametric surface as

\[
dS = \left| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right| du\,dv. \tag{144}
\]

Again, the normalization term will cancel when inserting these into a surface integral:
\[
\mathbf{S} = \iint_S \hat{\mathbf{n}}\,dS = \iint_A \left( \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right) du\,dv. \tag{145}
\]

5.2.3 Surface integrals of scalar fields


We can use the results from the preceding section to evaluate integrals of some scalar field over a surface. There are two main approaches to follow here, depending on whether the surface is given in parametric form, or if it is provided non-parametrically. The easiest way to get to grips with this is through a couple of examples.

Example 24. Evaluate
\[
\iint_S 6xy\,dS
\]
over the plane x + y + z = 1 in the first octant (i.e., for positive (x, y, z)).

Solution: We need to find a convenient way to parametrize the surface element dS. Let’s use the gradient, via Eq. (141). We write:
\[
g = x + y + z - 1 = 0 \;\Rightarrow\; \nabla g = (1, 1, 1) \;\Rightarrow\; |\nabla g| = \sqrt{3}.
\]
Therefore,
\[
dS = \sqrt{3}\,dx\,dy,
\]
where we have projected the integral over the surface S onto the xy-plane. The domain of integration is now the triangle bounded by x = 0, y = 0, and y = 1 − x (all in the xy-plane),
and we can write the integral as
\[
\iint_S 6xy\,dS = \iint_A 6xy\,|\nabla g|\,dA
= \sqrt{3}\int_0^1 dx \int_0^{1-x} dy\; 6xy
= 6\sqrt{3}\int_0^1 dx\; x\,\frac{(1-x)^2}{2}
= \frac{\sqrt{3}}{4}.
\]
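The result of Example 24 can be checked with a direct double Riemann sum over the projected triangle (a Python sketch of ours, not part of the original notes):

```python
import math

def plane_integral(n=400):
    """Midpoint Riemann sum of 6xy·√3 over the triangle 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 − x."""
    total, hx = 0.0, 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / n            # adapt the y-grid to the slanted boundary
        for j in range(n):
            y = (j + 0.5) * hy
            total += 6 * x * y * math.sqrt(3) * hx * hy
    return total

print(plane_integral(), math.sqrt(3) / 4)  # ≈ 0.4330 in both cases
```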

Example 25. Evaluate
\[
\iint_S z\,dS
\]
over the surface x² + y² + z² = 4, for z ≥ 0.

Solution: The surface is the positive hemisphere, which can be conveniently parametrized using
spherical polar coordinates. Let’s take this approach in evaluating the integral. The vector to
the surface is given by (see Eq. (116))
r(θ, φ) = 2(sin θ cos φ, sin θ sin φ, cos θ), θ ∈ [0, π/2], φ ∈ [0, 2π]. (146)
To find dS, we use Eq. (144), with u = θ and v = φ. Let’s first find the two vectors in the tangent plane to the surface:
\[
\frac{\partial \mathbf{r}}{\partial \theta} = \begin{pmatrix} 2\cos\theta\cos\phi \\ 2\cos\theta\sin\phi \\ -2\sin\theta \end{pmatrix},
\qquad
\frac{\partial \mathbf{r}}{\partial \phi} = \begin{pmatrix} -2\sin\theta\sin\phi \\ 2\sin\theta\cos\phi \\ 0 \end{pmatrix}.
\]
The modulus of the cross product is
\[
\left| \frac{\partial \mathbf{r}}{\partial \theta} \times \frac{\partial \mathbf{r}}{\partial \phi} \right| = 4\sin\theta, \tag{147}
\]
which should come as little surprise: we did not need to do this calculation again, as we know
the Jacobian for spherical polar coordinates is r2 sin θ, i.e., the result found here. However, for a
more general parametrization for which the Jacobian is not known, the cross product will need
to be evaluated directly. We now have all the ingredients needed to evaluate the integral:
¨ ¨
∂r ∂r
z dS = z(θ) × dθ dφ
S S ∂u ∂v
ˆ 2π ˆ π/2
= dφ dθ 2 cos θ 4 sin θ = 8π.
0 0
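Example 25 is another integral that is simple to confirm numerically (a Python sketch of ours, not part of the original notes); the φ integral contributes a factor 2π, leaving a 1D quadrature in θ:

```python
import math

def hemisphere_z_integral(n=2000):
    """∬ z dS on the hemisphere of radius 2: ∫₀^{2π} dφ ∫₀^{π/2} dθ 2cosθ · 4sinθ."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * h             # midpoint rule in θ
        total += 2 * math.cos(th) * 4 * math.sin(th) * h
    return 2 * math.pi * total         # φ integral contributes 2π

print(hemisphere_z_integral(), 8 * math.pi)  # both ≈ 25.1327
```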

5.2.4 Surface integrals of vector fields
Surface integrals of vector fields are widely used in physics as we often want to evaluate vector quantities over some surface in three-dimensional space. They provide a way to calculate the
total flow of a vector field across a surface; if the surface is closed, then we will talk about the
flow into, or out of, some region of space. We have a choice in the direction of the normal vector
to a surface, which is important as it determines the direction of the vector flow. The normal
vector is conventionally chosen to point outwards of a closed surface. For an open surface, the
convention is to follow the right-hand rule tracing the curve that forms the surface boundary.

Example 26. Evaluate
\[
\iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}
\]
over the surface bounded by the paraboloid y = x² + z² for y ∈ [0, 1], and the disc x² + z² ≤ 1 at y = 1, for F = (0, y, −z).

Solution: We will use the gradient approach to solve this problem, and will want to parametrize
the surface to write it in the form given in Eq. (142), so that
\[
\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_A \mathbf{F}\cdot \nabla g\,dA.
\]

We will consider the paraboloid and disc surfaces separately. Because of the orientation of
the paraboloid it will be most convenient to choose A in the xz-plane. We can then write
y = f (x, z) = x2 + z 2 , so that g(x, y, z) = y − f (x, z) = y − x2 − z 2 . The gradient of g is given
by
∇g = (−2x, 1, −2z),
which represents a vector normal to the surface of the paraboloid. This vector points inward to
the surface, and so we should change its sign to adhere to the normal convention of outward-
pointing normals. The integral over the surface of the paraboloid then becomes
\[
\iint_P \mathbf{F}\cdot d\mathbf{S} = \iint_A \begin{pmatrix} 0 \\ y \\ -z \end{pmatrix} \cdot \begin{pmatrix} 2x \\ -1 \\ 2z \end{pmatrix} dx\,dz
= \iint_A (-y - 2z^2)\,dx\,dz
= \iint_A (-x^2 - 3z^2)\,dx\,dz.
\]

The domain A is the projection of the paraboloid on the xz-plane, and is simply the disc of radius 1 in that plane. Plane polar coordinates are therefore ideally suited to compute this
double integral. Setting x = r cos θ and z = r sin θ for θ ∈ [0, 2π] and r ∈ [0, 1], we have:
\[
\iint_A (-x^2 - 3z^2)\,dx\,dz = \int_0^{2\pi} d\theta \int_0^1 dr\; r(-r^2\cos^2\theta - 3r^2\sin^2\theta)
\]
\[
= -\int_0^{2\pi} d\theta\,(\cos^2\theta + 3\sin^2\theta) \int_0^1 r^3\,dr
= -\frac{1}{4}\int_0^{2\pi} d\theta \left[ \frac{1}{2}(1 + \cos 2\theta) + \frac{3}{2}(1 - \cos 2\theta) \right] = -\pi.
\]
We now need to integrate the vector field over the disc. This part is much simpler. The normal is obviously along the y-axis, and so the integral becomes:
\[
\iint_D \mathbf{F}\cdot d\mathbf{S} = \iint_D \begin{pmatrix} 0 \\ y \\ -z \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} dx\,dz = \iint_D y\,dx\,dz = \pi,
\]
where the result follows directly since y = 1 on the disc, and so the integral is simply the area of a disc of radius 1. Adding the two contributions we find that the total integral over the closed surface is 0.
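The paraboloid contribution of Example 26 can be reproduced with a polar Riemann sum, and adding the disc flux of π should give zero for the closed surface (a Python sketch of ours, not part of the original notes):

```python
import math

def paraboloid_flux(n=500):
    """Midpoint sum of ∬_A (−x² − 3z²) dx dz over the unit disc, in polar coordinates."""
    hr, ht = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            th = (j + 0.5) * ht
            x, z = r * math.cos(th), r * math.sin(th)
            total += (-x * x - 3 * z * z) * r * hr * ht   # r is the Jacobian
    return total

disc_flux = math.pi            # ∬_D y dx dz with y = 1 on the unit disc
print(paraboloid_flux() + disc_flux)  # ≈ 0
```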

Example 27. Evaluate
\[
\iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}
\]
over the upper half of the sphere x² + y² + z² = 9 and for F = (x, y, z⁴).

Solution: We will solve this using the parametric approach, given that the surface is a sphere.
We follow the same initial approach as in Example 25. The surface is given parametrically by
the vector equation
r(θ, φ) = 3(sin θ cos φ, sin θ sin φ, cos θ), θ ∈ [0, π/2], φ ∈ [0, 2π]. (148)
We can then find the normal vector from the cross product of two vectors in the tangent plane,
found by taking the derivatives of r with respect to the two parameters (θ, φ):
\[
\frac{\partial \mathbf{r}}{\partial \theta} = \begin{pmatrix} 3\cos\theta\cos\phi \\ 3\cos\theta\sin\phi \\ -3\sin\theta \end{pmatrix},
\qquad
\frac{\partial \mathbf{r}}{\partial \phi} = \begin{pmatrix} -3\sin\theta\sin\phi \\ 3\sin\theta\cos\phi \\ 0 \end{pmatrix},
\]
and the cross product is
\[
\mathbf{n} = \frac{\partial \mathbf{r}}{\partial \theta} \times \frac{\partial \mathbf{r}}{\partial \phi} = 9 \begin{pmatrix} \sin^2\theta\cos\phi \\ \sin^2\theta\sin\phi \\ \sin\theta\cos\theta \end{pmatrix}.
\]

The vector function F written in spherical polar coordinates is:
\[
\mathbf{F} = \begin{pmatrix} 3\sin\theta\cos\phi \\ 3\sin\theta\sin\phi \\ 3^4\cos^4\theta \end{pmatrix}.
\]

We can now evaluate the integral using the expression in Eq. (145):
\[
\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_A \mathbf{F}\cdot \left( \frac{\partial \mathbf{r}}{\partial \theta} \times \frac{\partial \mathbf{r}}{\partial \phi} \right) d\theta\,d\phi
= \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta\; \begin{pmatrix} 3\sin\theta\cos\phi \\ 3\sin\theta\sin\phi \\ 3^4\cos^4\theta \end{pmatrix} \cdot 9\begin{pmatrix} \sin^2\theta\cos\phi \\ \sin^2\theta\sin\phi \\ \sin\theta\cos\theta \end{pmatrix}
\]
\[
= \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta\; (27\sin^3\theta + 729\sin\theta\cos^5\theta) = 279\pi.
\]
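The final quadrature of Example 27 can be verified numerically (a Python sketch of ours, not part of the original notes); the φ integral again contributes a factor of 2π:

```python
import math

def sphere_flux(n=4000):
    """∫₀^{2π} dφ ∫₀^{π/2} dθ (27 sin³θ + 729 sinθ cos⁵θ), midpoint rule in θ."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * h
        total += (27 * math.sin(th) ** 3 + 729 * math.sin(th) * math.cos(th) ** 5) * h
    return 2 * math.pi * total

print(sphere_flux(), 279 * math.pi)  # both ≈ 876.50
```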

6 Vector fields and operators

In addition to the gradient that was introduced in the previous chapter, which we have seen is a differential operator on scalar fields, there are two more differential operators important for the study of vector fields: the divergence and the curl. As we shall see, both can be expressed in terms of the differential operator ∇.

6.1 The Divergence

The divergence of a vector field F(r), with r = (x, y, z), is defined as
\[
\operatorname{div}\mathbf{F}(\mathbf{r}) := \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} = \nabla\cdot\mathbf{F}(\mathbf{r}), \tag{149}
\]
where Fx, Fy, Fz are the components of F. A vector field F for which ∇ · F = 0 is called solenoidal or incompressible.

The divergence produces a scalar output (which is a function, or a field) that measures the rate of change of a vector field at a given point, or, perhaps easier to imagine, in some infinitesimal volume around a point. As such, the divergence of a vector field can be thought of as a measure of how much the vectors in the field are spreading out or converging. Because of this, it is often used to describe the flow of a fluid, heat or electricity, or other similar physical quantities.

Figure 32: Plot of the 2D vector field F = (xy², −yx²).

To see this in a simple 2D example, consider the vector field F = (xy 2 , −yx2 ), shown in
Fig. 32. The divergence of this field is simple to calculate: ∇ · F = y 2 − x2 , which is the
hyperboloid surface shown in Fig. 33. We can see that the divergence is increasingly large and
positive along the y-axis around x = 0, indicating a flow of the vector field away from this
region, and increasingly negative along the x-axis, indicating a flow toward this region. This is,
of course, fully consistent with the direction of the arrows in Fig. 32.
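This divergence is quick to check with a computer algebra system. A minimal sketch in Python using sympy (the library choice is ours; the notes' own figures were made in Mathematica):

```python
import sympy as sp

x, y = sp.symbols('x y')

# The 2D vector field F = (x*y**2, -y*x**2) from Fig. 32
Fx, Fy = x*y**2, -y*x**2

# Divergence in 2D: dFx/dx + dFy/dy
div_F = sp.diff(Fx, x) + sp.diff(Fy, y)
print(div_F)  # equals y**2 - x**2, the surface shown in Fig. 33
```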

Example 28. Evaluate the divergence of the vector field


 
F(x, y, z) = ( 3xy, 2x/(y − 1), −xy/z ).

Figure 33: Divergence of the vector field F = (xy 2 , −yx2 ) shown in Fig. 32.

Solution: We note that the field is well-behaved provided y ≠ 1 and z ≠ 0. Everywhere else,

∇ · F = ∂(3xy)/∂x + ∂/∂y [ 2x/(y − 1) ] + ∂/∂z [ −xy/z ]
= 3y − 2x/(y − 1)² + xy/z².

If a vector field F can be written as the gradient of some scalar potential φ, i.e., it is
conservative and F = ∇φ, then we can write the divergence directly in terms of the scalar
potential as
∇ · F = ∇ · ∇φ = ∇2 φ. (150)
This differential operator is again sufficiently common to merit its own symbol and name – it is
called the Laplacian:
Δ := ∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z². (151)
The use of Δ to denote the Laplacian, while common in mathematics, is less common in physics.

Example 29. Evaluate the Laplacian of the scalar field

φ = xy 2 z 3 .

Solution:

∇²φ = ∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z² = 2xz³ + 6xy²z.
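The same computation can be automated; a short sympy check (our own verification, not part of the notes):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
phi = x*y**2*z**3

# Laplacian: the sum of the three unmixed second partial derivatives
lap = sp.diff(phi, x, 2) + sp.diff(phi, y, 2) + sp.diff(phi, z, 2)
# The x-term vanishes, leaving 2*x*z**3 + 6*x*y**2*z as in the solution
```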

6.2 The Divergence Theorem


The divergence theorem, also known as Gauss’s theorem or the Gauss-Ostrogradsky theorem, is
an important mathematical statement of the physical fact that, in the absence of the creation
or destruction of matter, the density within a region of space can change only by having it flow
into or away from the region through its boundary.

Figure 34: A volume V enclosed by a surface S used in the divergence theorem. The surface
can be split into two surfaces which can be parametrized as z = ϕ(x, y).

Theorem 5. Let V ⊂ R3 be a simple region in space bounded by a regular closed surface S,


let n̂ be a vector normal to S pointing outwards, and let F be a differentiable vector field with
continuous first derivatives. Then,
∭_V ∇ · F dV = ∯_S F · n̂ dS. (152)

That is to say, the flux of a vector field leaving a closed surface is equal to the divergence of the
field integrated over the volume bounded by that surface.
To prove the divergence theorem, we proceed as follows. Let us first assume the volume V
is z-simple, which is to say it can be written as

V = {(x, y, z) : ϕ1 (x, y) < z < ϕ2 (x, y), (x, y) ∈ D ⊂ R2 },

with ϕ1 and ϕ2 differentiable functions with continuous first derivatives (also called C 1 functions
on D). This volume is illustrated in Fig. 34. Using this, and taking the vector field components
explicitly F = (F1, F2, F3), we can rewrite the F3 term of the triple integral on the left-hand side of Eq. (152) as

∭_V (∂F3/∂z) dx dy dz = ∬_D [ ∫_{ϕ1(x,y)}^{ϕ2(x,y)} (∂F3/∂z) dz ] dx dy
= ∬_D [ F3(x, y, ϕ2(x, y)) − F3(x, y, ϕ1(x, y)) ] dx dy
= ∬_{ϕ2} F3 k̂ · n̂ dS + ∬_{ϕ1} F3 k̂ · n̂ dS = ∯_S F3 k̂ · n̂ dS.

The last line follows from Eq. (137) as we move from an integral over the xy-domain back to an integral over the open surfaces ϕ1 and ϕ2. The closed surface S is then reconstructed from the two open surfaces ϕ1 and ϕ2.
Following the same reasoning, we see that if, instead, V is y-simple, then
∭_V (∂F2/∂y) dx dy dz = ∯_S F2 ĵ · n̂ dS,

or if V is x-simple, then
∭_V (∂F1/∂x) dx dy dz = ∯_S F1 î · n̂ dS.

A bounded volume can always be broken down into sections that are simple along some direction,
and since integration is linear, summing the three components gives the general result, which is
the divergence theorem.
A valuable way to see the physical meaning of the divergence operator is to take the divergence theorem in the limit of infinitesimally small volumes. For this purpose, let us consider the volume V to be that of a sphere of radius r centred at some point r0, and let us divide the divergence theorem expressions by the volume |V(r)| of this sphere:

(1/|V(r)|) ∭_V ∇ · F dV = (1/|V(r)|) ∯_S F · n̂ dS.
We now take the limit of these expressions for r → 0. The left-hand side of the equation simply tends to the divergence of F, evaluated at the point r0:

∇ · F(r0) = lim_{r→0} (1/|V(r)|) ∯_S F · n̂ dS. (153)
The right-hand side of the equation shows that the divergence of a vector field at some point gives the flux density of the vector field leaving that point, per unit volume. It is therefore intimately related to the concept of sources and sinks of a vector field. The expression given in Eq. (153) is also known as the integral form of the divergence operator. It has the advantage of
defining the divergence operator in terms of the geometry of the system, and is not dependent
on the coordinate system used (unlike Eq. (149)). It is, however, typically more challenging to
use for solving problems in practice.

Example 30. Evaluate the divergence of the gravitational field (also called the gravitational acceleration) generated by a point mass placed at the origin,

F(r) = −(GM/r²) r̂ = −(GM/r³) (x, y, z),

where r = (x, y, z) and r = √(x² + y² + z²) is the standard Euclidean distance.

Solution: We note that the field is not defined at the origin, but is well behaved elsewhere. For r ≠ 0 we have

∇ · F = −GM [ ∂/∂x (x/r³) + ∂/∂y (y/r³) + ∂/∂z (z/r³) ]
= −GM [ (r³ − 3x²r)/r⁶ + (r³ − 3y²r)/r⁶ + (r³ − 3z²r)/r⁶ ]
= −(GM/r⁶) [ 3r³ − 3(x² + y² + z²) r ] = 0.

We conclude that the gravitational field is solenoidal everywhere except at the origin.
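The vanishing divergence away from the origin can be confirmed symbolically; a minimal sympy sketch (our verification):

```python
import sympy as sp

x, y, z, G, M = sp.symbols('x y z G M', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)

# Gravitational field F = -(GM/r^3) (x, y, z), valid away from the origin
F = [-G*M*c/r**3 for c in (x, y, z)]

div_F = sp.simplify(sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z))
# div_F simplifies to 0: the field is solenoidal for r != 0
```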
This result may surprise you, as you may have expected the field to be a diverging one, given that it has a source at the origin. To dig into this result a little deeper, consider the same problem, but let us now use the divergence theorem given in Eq. (152), where we evaluate the surface integral over a sphere of radius r centred at the origin:

∯_S F · n̂ dS = −GM ∯_S (r̂ · n̂ / r²) dS
= −GM ∫₀^{2π} dφ ∫₀^{π} (|r̂|²/r²) r² sin θ dθ
= −GM ∫₀^{2π} dφ ∫₀^{π} sin θ dθ = −4πGM.

This expression does not depend on r, which means that if we take a volume between two
spherical shells some ∆r apart, then the amount of vector flux traversing each surface must be
the same. We can see why from the second line of the equation above: the scaling of the field with
distance as 1/r2 exactly cancels out the r2 increase of the surface with distance (which comes
from the Jacobian). We thus understand intuitively why the field must have zero divergence
outside of the origin, and can extend this result to any vector field that scales as 1/r2 , e.g. the
electric field.
Interestingly, the result above also implies that

∭_V ∇ · F dV = −4πGM, (154)

which cannot hold if ∇ · F = 0 everywhere. The problem is of course at the origin, where the field is not defined. Equation (153) provides further insight on this conundrum; in the limit of r → 0 the divergence can be written as

∇ · F(r → 0) = lim_{r→0} −4πGM / |V(r)|. (155)
This limit diverges for any M ≠ 0. It would thus seem that the divergence of our inverse-square-law field should be 0 everywhere except at the origin, where it diverges. We have already seen
an object that can describe such physics: the Dirac delta. We can then rewrite the divergence
of the gravitational field as
∇ · F(r) = −4πGM δ(r), (156)
where δ(r) := δ(x)δ(y)δ(z) is the 3D generalization of the 1D Dirac delta:

∭_V δ(r) dV = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} δ(x)δ(y)δ(z) dx dy dz = 1. (157)

With this expression the divergence in any region of space excluding the origin r = 0 is indeed
0, but the divergence returns the result of Eq. (154) whenever the origin is included in the
integral. By introducing the Dirac delta the divergence theorem can be made valid even for
non-differentiable point sources.
We note that the gravitational field can be expressed as the gradient of a scalar potential, F = −∇φ. Finding φ is straightforward. We write

( ∂φ/∂x, ∂φ/∂y, ∂φ/∂z ) = (GM/r³) (x, y, z),

which we can integrate directly to find

φ(r) = −GM/r + const.

By imposing that the scalar potential should tend to 0 for r → ∞ we can set the constant to 0.
It is often convenient to deal with mass densities ρ = ρ(x, y, z) rather than with point masses,
where the total mass M contained in some volume V of space is given by the integral
M = ∭_V ρ dV.

In terms of the mass density, the divergence of the gravitational field can be written as
∇ · F = −4πGρ. (158)
Writing this expression in terms of the scalar potential leads to the well-known Poisson equation,
which relates a distribution of matter of density ρ with the gravitational potential φ:
∇2 φ = 4πGρ. (159)

Figure 35: A region of the xy-plane bounded by the closed curve γ. For a point on the curve
described by the vector r we can identify a tangent vector dr and a normal vector n̂.

6.2.1 Divergence theorem in 2D


The ideas described above were developed in 3D space, but the divergence theorem can also be
useful in two dimensions. Consider a two-dimensional planar region R in the xy-plane bounded
by some closed curve γ, as shown in Fig. 35. As we have seen when discussing space curves (in
particular, recall Fig. 23 and related discussion), the tangent and normal vectors to any point
on the bounding curve γ are given by dr = (dx, dy) and n̂ ds = (dy, − dx), where we chose n̂ to point outward from the region R (note: to find n̂ simply look for a vector in the plane for which dr · n̂ = 0; clearly the above satisfies this requirement). If F = (Fx, Fy) is a continuous and differentiable vector field on R, then we can rewrite Eq. (152) as:

∬_R ∇ · F dx dy = ∮_γ F · n̂ ds, (160)

which, written explicitly in terms of the components of F, is


∬_R ( ∂Fx/∂x + ∂Fy/∂y ) dx dy = ∮_γ (Fx dy − Fy dx). (161)

By setting P = −Fy and Q = Fx we rediscover Green’s theorem in the plane, as defined in


Eq. (114), which we can now consider simply as a special case of the more general divergence
theorem.
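Equation (161) is easy to spot-check on a concrete case. The sketch below (the test field F = (x², y³) on the unit disk is our own arbitrary choice, not from the notes) computes both sides with sympy:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
r, th = sp.symbols('r theta', positive=True)

Fx, Fy = x**2, y**3  # an arbitrary smooth test field

# Left side of Eq. (161): divergence integrated over the unit disk, in polars
div_F = (sp.diff(Fx, x) + sp.diff(Fy, y)).subs({x: r*sp.cos(th), y: r*sp.sin(th)})
lhs = sp.integrate(div_F*r, (r, 0, 1), (th, 0, 2*sp.pi))

# Right side: boundary flux with x = cos t, y = sin t, so dx = -sin t dt, dy = cos t dt
flux = (Fx*sp.cos(t) - Fy*(-sp.sin(t))).subs({x: sp.cos(t), y: sp.sin(t)})
rhs = sp.integrate(flux, (t, 0, 2*sp.pi))
# Both sides come out equal (3π/4 for this particular field)
```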

6.3 The Curl


We now want to find an expression to help us evaluate the circulation of a vector field. We already encountered the concept of circulation – the line integral of a vector field around a closed curve – in the context of conservative fields, and will come back to this application shortly. Let's denote the circulation of some continuously differentiable vector field F as the curl of F, or curl F. This operation will necessarily produce another vector field, since the curl has both a magnitude (the amount of circulation) and a direction (the axis about which the circulation occurs). Furthermore, we are interested in the microscopic circulation, that is to say the value of the curl in some infinitesimal neighbourhood of a point in space. To define this, we start by picking some point P, and some direction given by a vector n̂ at the point P. Then the projection of the curl of F near P along n̂ will be given by

(curl F) · n̂ = lim_{A→0} (1/A) ∮_γ F · dr, (162)

Figure 36: The curl of a vector field F in some point P along the direction n̂ is given by the
limit of a closed line integral representing the circulation of the field.

where the closed line integral is the circulation of F along the curve γ that lies in the plane with
normal vector n̂ circling point P , and A is the area of the plane bounded by γ, as shown in
Fig. 36. In the limit of A → 0 we obtain the component of the curl along n̂ in the point P .
To find a more convenient expression for the curl, let’s pick a particular case where n̂ lies
along the z axis, i.e., n̂ = k̂. The curve γ will then lie in the xy-plane, and let us consider γ
to be a rectangle with one vertex at the point P(a, b) and with sides ∆x and ∆y, as illustrated in
Fig. 37. Since F = (Fx , Fy , Fz ), we can rewrite F · dr = Fx dx + Fy dy + Fz dz.

Figure 37: Rectangular curve with area ∆x∆y, split into straight segments C1 , C2 , C3 and C4 .
Given the orientation, note that the closed curve γ = C1 + C2 − C3 − C4 . We omit writing the
z component of F explicitly as it is constant in the plane.

Given that dz = 0 along any path in the xy-plane, Eq. (162) becomes:

(curl F) · k̂ = lim_{A→0} (1/A) ∮_γ (Fx dx + Fy dy)
= lim_{A→0} (1/A) [ ∫_{C1} Fx dx + ∫_{C2} Fy dy − ∫_{C3} Fx dx − ∫_{C4} Fy dy ]
= lim_{A→0} (1/A) [ Fx(a, b)∆x + Fy(a + ∆x, b)∆y − Fx(a, b + ∆y)∆x − Fy(a, b)∆y ].

Note that in the last two lines we changed the direction of two of the integrals to use the three vertices shown in Fig. 37, and have also omitted writing the z component of F explicitly

Figure 38: Velocity field of a solid body in rotation, and its curl.

to simplify notation. Noting that A = ∆x∆y, and rearranging, we have

(curl F) · k̂ = lim_{A→0} { [Fy(a + ∆x, b) − Fy(a, b)]∆y/(∆x∆y) − [Fx(a, b + ∆y) − Fx(a, b)]∆x/(∆x∆y) }
= lim_{∆x→0} [Fy(a + ∆x, b) − Fy(a, b)]/∆x − lim_{∆y→0} [Fx(a, b + ∆y) − Fx(a, b)]/∆y
= ∂Fy/∂x − ∂Fx/∂y.
The choice of axis is, of course, arbitrary, so repeating the same exercise with n̂ along the x and y axes gives

(curl F) · î = ∂Fz/∂y − ∂Fy/∂z,
(curl F) · ĵ = ∂Fx/∂z − ∂Fz/∂x.
These projections along the coordinate axes provide the curl vector in the Cartesian basis. We recognize them as the components of a vector product, as defined in Eq. (8). Using the differential operator ∇ we can thus write the curl in a more convenient form for calculations

curl F = ( ∂Fz/∂y − ∂Fy/∂z ) î + ( ∂Fx/∂z − ∂Fz/∂x ) ĵ + ( ∂Fy/∂x − ∂Fx/∂y ) k̂ ≡ ∇ × F. (163)

A vector field F for which ∇ × F = 0 is called irrotational.

Example 31. Find the curl of the vector field

F = (xy, x2 z, xyz).

Solution:

∇ × F = det [ x̂, ŷ, ẑ ; ∂/∂x, ∂/∂y, ∂/∂z ; xy, x²z, xyz ] = ( xz − x², −yz, 2xz − x ).
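Determinant curls like this one are easy to get sign errors in, so a quick symbolic check is worthwhile; a sympy sketch (our own verification):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
Fx, Fy, Fz = x*y, x**2*z, x*y*z

# Curl components written out as in Eq. (163)
curl_F = (sp.diff(Fz, y) - sp.diff(Fy, z),
          sp.diff(Fx, z) - sp.diff(Fz, x),
          sp.diff(Fy, x) - sp.diff(Fx, y))
# gives (x*z - x**2, -y*z, 2*x*z - x), as in the solution
```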

From the definition of the curl we can see that it measures the microscopic rather than some macroscopic circulation of a vector field. It represents the tendency of a fictional microscopic paddle wheel, placed at some point in the field, to rotate around the axis that points in the direction of the curl. Note that the micro and macroscopic circulations of a field are typically different. To see this, consider a solid body rotating around the z axis with angular velocity ω⃗ = ω k̂. In the xy-plane, a point at r = (x, y, 0) will then have a velocity

v = ω⃗ × r = det [ î, ĵ, k̂ ; 0, 0, ω ; x, y, 0 ] = ( −yω, xω, 0 ).

This velocity is a vector field that represents the macroscopic rotation of the solid body. Let's calculate the curl of this field to find the microscopic circulation, for comparison:

curl v = ∇ × v = det [ î, ĵ, k̂ ; ∂/∂x, ∂/∂y, ∂/∂z ; −yω, xω, 0 ] = 2ω k̂.

We see that the curl is twice the angular velocity vector of the solid body around its axis of
rotation. The velocity field and its curl are shown in Fig. 38.
Recall that we have already encountered the terms forming the components of ∇ × F when
discussing exact differentials in Section 4.6, in Eq. (113). There we saw that given some vector
field F, if all the components of ∇×F = 0, then the differential form F·dr = Fx dx+Fy dy+Fz dz
is an exact differential. We showed how line integrals of such a differential must be path-
independent, and that closed line integrals are always zero. A vector field for which this holds
is called a conservative field. Using the curl formalism defined here we now have a convenient
way to check whether a field is conservative or not. This should come as little surprise given the
definition of the curl in Eq. (162): if F is conservative then the closed line integral will be zero
for any choice of n̂.
It is important to note that these observations require the region bounded by the curve along
which we perform the line integral to be simply connected. From our derivation of the curl it
should be obvious why: in taking the limit for the area bounded by the curve, we explicitly
require that the vector field be well-defined in the point (and its neighbourhood) in which the
curl is being evaluated. We discussed this already in Section 4.6, and reiterate it here. If all
closed line integrals on some domain are always zero then the field is certainly irrotational, but
if we find an irrotational field on a multiply-connected domain we are not guaranteed that all
closed line integrals will be zero. Here is an example of such a case.

Example 32. Find the curl of the vector field

F = ( −y/(x² + y²), x/(x² + y²) ).

Is F conservative?

Solution: We proceed in the normal way by writing

∇ × F = det [ î, ĵ, k̂ ; ∂/∂x, ∂/∂y, ∂/∂z ; −y/(x² + y²), x/(x² + y²), 0 ]
= [ (y² − x²)/(x² + y²)² + (x² − y²)/(x² + y²)² ] k̂ = 0.

The field F is shown in Fig. 39. There is a clearly visible macroscopic circulation, but the curl
shows us that the microscopic circulation is zero. So if we imagine a paddle wheel moving around
in this field it would follow the field lines, but would not itself rotate. If we compare this with


Figure 39: Vector field given in Example 32. While there is clear macroscopic circulation, there
is no microscopic circulation: the curl of this field is 0.

the solid body case, we can see that the only difference between the two is how quickly the field
drops off as we move away from the origin: the inverse square rate here is such that the fields on
the inner and outer side of our imaginary paddle wheel cancel out exactly. It is clearly possible
to have very different behaviour in micro and macroscopic circulation of a vector field.
Importantly, this field is defined everywhere except at the origin (where it diverges), so the xy-plane is not simply connected. Let's calculate the circulation of F along a circle of radius 1 centred at the origin. Moving to plane polar coordinates, we have x² + y² = r² = 1, so F = (− sin θ, cos θ), and

∮_γ F · dr = ∫₀^{2π} (sin²θ + cos²θ) dθ = 2π ≠ 0.

We see that while ∇ × F = 0, not all closed line integrals are zero. They are zero only if the
path doesn’t enclose the origin. Otherwise, the integral will be 2π. So F is (locally) conservative
over any domain in the xy-plane that excludes the origin, but is globally non-conservative.
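The mismatch between zero curl and non-zero circulation can also be seen numerically; a small numpy sketch (our own check, not part of the notes):

```python
import numpy as np

# Circulation of F = (-y, x)/(x^2 + y^2) around the unit circle,
# parametrized as r(t) = (cos t, sin t) for t in [0, 2π].
t = np.linspace(0.0, 2*np.pi, 100001)
x, y = np.cos(t), np.sin(t)
Fx = -y/(x**2 + y**2)
Fy = x/(x**2 + y**2)

# F · dr = (Fx dx/dt + Fy dy/dt) dt, integrated with a left Riemann sum
integrand = Fx*(-np.sin(t)) + Fy*np.cos(t)
circulation = np.sum(integrand[:-1]*np.diff(t))
# circulation ≈ 2π, even though ∇ × F = 0 away from the origin
```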

6.4 Stokes’ theorem


Stokes’ theorem is a mathematical statement that allows us to relate the circulation of a vector
field along some curve γ, to its vorticity (a measure of the amount of rotation of the field in a
given region, given by the curl operator) within the surface bounded by γ.
Theorem 6. Let S be a regular and oriented surface with unit normal vector n̂, bounded by
a closed regular curve γ, and let F = (Fx , Fy , Fz ) be a continuously differentiable vector field
defined over S and γ. Then

∬_S (∇ × F) · n̂ dS = ∮_γ F · dr. (164)

That is to say, the flux of the curl of a vector field through some open surface S equals the
circulation of the field along the surface’s boundary. We note that the surface may be closed; if
so, the integrals must equal zero.
To see why Stokes' theorem holds, consider the surface S bounded by a closed curve γ as illustrated in Fig. 40. The circulation of some well-behaved vector field F along γ is given by

∮_γ F · dr.

Figure 40: Slicing of some surface parametrized by the variables (x, y) for Stokes’ theorem. The
line integrals on all internal grid lines cancel because they are traversed in opposite directions,
leaving only the line integral along the boundary curve γ. Each small circulation element in the
final panel (d) resembles the setup shown in Fig. 36: the circulation corresponds to the scalar
product of the curl of the field and the vector area.

We now divide the area enclosed by γ into two parts, bounded by two new curves γ1 and γ2. Clearly,

∮_γ F · dr = ∮_{γ1} F · dr + ∮_{γ2} F · dr,
since the internal grid line is traversed twice, in opposite directions, and thus cancels exactly –
see Fig. 40(b). We can continue this process to build an increasingly dense tiling, so that
˛ X n ˛
F · dr = F · dr
γ i=1 γi
Xn
≈ (∇ × F) · Si ,
i=1

where in the last line we have used the definition of the curl from Eq. (162), and where Si
denotes the vector area of the region bounded by γi . Since the vector area has not been taken in
the infinitesimal limit, the curl expression is only approximate. The relationship becomes exact
in the limit of Si → 0, i.e., infinite regions of infinitesimal size, and the sum over the regions is
replaced by a double integral over the surface S
∮_γ F · dr = ∬_S (∇ × F) · dS, (165)

which we recognize as Stokes’ theorem. Note that in two dimensions with F = (Fx , Fy ) we once
again recover Green’s theorem:
∮_γ F · dr = ∬_S ( ∂Fy/∂x − ∂Fx/∂y ) dS. (166)

Just as the divergence theorem can be used to relate volume and surface integrals for certain
types of integrands, Stokes’ theorem connects surface integrals with line integrals.

Example 33. Evaluate the line integral


∮_γ F · dr,

where F = (y + z, z + x, x + y), and the curve γ is given by the intersection of the sphere
x2 + y 2 + z 2 = R2 with the plane x + y + z = 0.

Solution: The vector field is well-defined in all of R³ and γ is a closed regular curve, so we can use Stokes' theorem to transform the line integral into a surface integral:

∮_γ F · dr = ∬_S (∇ × F) · n̂ dS,

where S is the portion of the plane bounded by the intersection with the sphere. The normal vector is thus simply that of the plane: n̂ = (1/√3)(1, 1, 1). Calculating the curl we have

∇ × F = det [ x̂, ŷ, ẑ ; ∂/∂x, ∂/∂y, ∂/∂z ; y + z, z + x, x + y ] = (0, 0, 0),

and so we conclude that F is conservative in R³, F · dr is an exact differential, and that therefore ∮_γ F · dr = 0.

Example 34. Show that Stokes’ theorem holds for the vector field F = (x, 0, y) over the
surface S given parametrically as

x =R sin θ cos φ
y =R sin θ sin φ
z =R cos θ

for θ ∈ [0, π/2] and φ ∈ [−π/2, +π/2] (a quarter sphere).

Solution: We proceed to verify

∮_γ F · dr = ∬_S (∇ × F) · n̂ dS

by calculating both integrals separately, and showing the results agree.


For the surface integral we have

∇ × F = (1, 0, 0).

The normal vector to the sphere is (see Eq. (116)) n̂ = (sin θ cos φ, sin θ sin φ, cos θ), and we know from the Jacobian that dS = R² sin θ dθ dφ. Therefore

∬_S (∇ × F) · n̂ dS = ∫_{−π/2}^{+π/2} dφ ∫₀^{π/2} sin θ cos φ R² sin θ dθ
= R² ∫_{−π/2}^{+π/2} cos φ dφ ∫₀^{π/2} sin²θ dθ
= R² · 2 · (π/4) = (π/2) R².

Figure 41: The quarter sphere surface used in Example 34, bounded by the two semi-
circumferences C1 and C2 .

To find the circulation, we must first parametrize the bounding curve γ. As can be seen from
Fig. 41, γ will be given by the sum of two half-circumferences on the sphere. The first can be
parametrized as:

x =R cos t
y =R sin t
z =0

for t ∈ [−π/2, +π/2]. The contribution to the total circulation from this segment is

C1 = ∫_{−π/2}^{+π/2} (Fx dx + Fy dy + Fz dz)
= ∫_{−π/2}^{+π/2} Fx dx
= ∫_{−π/2}^{+π/2} R cos t (−R sin t) dt = 0.

The second half-circumference can be parametrized as

x =0
y =R sin t
z =R cos t

for t ∈ [−π/2, +π/2]. The contribution to the total circulation from this segment is (note the change in direction to make this a closed curve)

C2 = ∫_{+π/2}^{−π/2} (Fx dx + Fy dy + Fz dz)
= ∫_{+π/2}^{−π/2} Fz dz
= ∫_{+π/2}^{−π/2} R sin t (−R sin t) dt = (π/2) R².

So the total circulation is

∮_γ F · dr = C1 + C2 = (π/2) R².
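Both sides of this example can be reproduced symbolically; a sympy sketch (our own verification of the two calculations above):

```python
import sympy as sp

R = sp.symbols('R', positive=True)
theta, phi, t = sp.symbols('theta phi t')

# Surface side: (curl F) · n_hat = sin(theta)*cos(phi), with dS = R^2 sin(theta)
surface = sp.integrate(sp.sin(theta)**2*sp.cos(phi)*R**2,
                       (theta, 0, sp.pi/2), (phi, -sp.pi/2, sp.pi/2))

# Line side: C1 vanishes; C2 runs from t = +π/2 to -π/2, with
# Fz dz = (R sin t)(-R sin t) dt
C2 = sp.integrate(R*sp.sin(t)*(-R*sp.sin(t)), (t, sp.pi/2, -sp.pi/2))
# Both evaluate to (π/2) R², confirming Stokes' theorem for this case
```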

6.5 Some differential identities
Differential operators are often most conveniently handled in index notation. Given a differen-
tiable scalar field φ and a differentiable vector field F, we can write:

(∇φ)i = ∂φ/∂xi = ∂i φ
∇ · F = ∂Fi/∂xi = ∂i Fi
(∇ × F)i = εijk ∂Fk/∂xj = εijk ∂j Fk
where we remember the convention that repeated indices are summed over. The number of free (non-repeated) indices indicates the rank of the resulting tensor: in the first case it is 1, so we have a vector; in the second 0 (i is repeated), so a scalar; and in the third again 1, so a vector (j and k are repeated). Here εijk is the Levi-Civita symbol, a rank-3 tensor.
Given the definitions of the gradient, divergence and curl, there are certain relationships
between them that will prove useful. Note that only some compositions of these operators are
meaningful:

1. div grad φ ≡ ∇ · (∇φ) = ∇2 φ. This is the Laplacian.

2. curl grad φ ≡ ∇ × (∇φ) = 0. Equivalent to computing the curl of a conservative field.

3. div curl F ≡ ∇ · (∇ × F) = 0. The curl is solenoidal by construction.

4. grad div F ≡ ∇(∇ · F)

5. curl curl F ≡ ∇ × (∇ × F)

Example 35. Prove the identity

∇ × (∇ × F) = ∇(∇ · F) − ∇2 F.

Solution: We proceed using index notation:

[∇ × (∇ × F)]i = εijk ∂j (∇ × F)k
= εijk ∂j εklm ∂l Fm
= εkij εklm ∂j ∂l Fm
= (δil δjm − δim δjl) ∂j ∂l Fm
= δil δjm ∂j ∂l Fm − δim δjl ∂j ∂l Fm
= ∂i ∂m Fm − ∂j ∂j Fi
= [∇(∇ · F) − ∇²F]i.
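The identity can also be spot-checked on a concrete field; in the sketch below the choice of F is our own arbitrary one, with the three operators coded directly from their Cartesian definitions:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)
F = sp.Matrix([x*y**2*z, sp.sin(x*z), x + y*z])  # arbitrary smooth test field

grad = lambda f: sp.Matrix([sp.diff(f, v) for v in coords])
div = lambda V: sum(sp.diff(V[i], v) for i, v in enumerate(coords))
curl = lambda V: sp.Matrix([sp.diff(V[2], y) - sp.diff(V[1], z),
                            sp.diff(V[0], z) - sp.diff(V[2], x),
                            sp.diff(V[1], x) - sp.diff(V[0], y)])
lap = lambda V: sp.Matrix([sum(sp.diff(V[i], v, 2) for v in coords)
                           for i in range(3)])

lhs = curl(curl(F))
rhs = grad(div(F)) - lap(F)
# lhs - rhs simplifies to the zero vector, as the identity requires
```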

7 Revision
7.1 Summary of some key points
The arc length s of a space curve parametrized by a vector r(t) = (x(t), y(t), z(t)) is given by

s = ∫_a^b |r′(t)| dt = ∫_a^b √[ (dx/dt)² + (dy/dt)² + (dz/dt)² ] dt.

We have three main kinds of line integrals:

1. Line integral of a scalar function f along curve γ, produces a scalar:


∫_γ f(r) ds = ∫_a^b f[r(t)] |r′(t)| dt.

2. Line integral of a scalar function f along curve γ with respect to an infinitesimal vector
displacement, produces a vector:
∫_γ f(r) dr = ∫_γ f(x, y, z)(î dx + ĵ dy + k̂ dz).

3. Line integral of a vector function F along curve γ with respect to an infinitesimal vector
displacement, produces a scalar:
∫_γ F(r) · dr = ∫_γ Fx(x, y, z) dx + ∫_γ Fy(x, y, z) dy + ∫_γ Fz(x, y, z) dz.

A common example is the work W done by a force F moving along a curvilinear path γ.

The three main differential operators are:

grad f(r) = ∇f = ( ∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn ).
div F(r) = ∇ · F = ∂Fx/∂x + ∂Fy/∂y + ∂Fz/∂z.
curl F(r) = ∇ × F = ( ∂Fz/∂y − ∂Fy/∂z ) î + ( ∂Fx/∂z − ∂Fz/∂x ) ĵ + ( ∂Fy/∂x − ∂Fx/∂y ) k̂.

We can use these to relate integrals along lines, surfaces and volumes of scalar and vector
functions.

1. Divergence Theorem:
∭_V ∇ · F dV = ∯_S F · dS = ∯_S F · n̂ dS.

2. Stokes’ Theorem:
∬_S (∇ × F) · dS = ∬_S (∇ × F) · n̂ dS = ∮_γ F · dr.

3. In 2D, the divergence and Stokes’ theorems reduce to Green’s Theorem:


∮_γ (P dx + Q dy) = ∬_R ( ∂Q/∂x − ∂P/∂y ) dx dy.

For surface integrals, where the surface is parametrized by the vector r(u, v), we have
 
dS = | ∂r/∂u × ∂r/∂v | du dv;    dS = ( ∂r/∂u × ∂r/∂v ) du dv. (167)

For surface integrals, where the surface is given by the equation z = f (x, y), we can write
g(x, y, z) = f (x, y) − z = 0, and use the gradient of g to find the normal vector to the surface:
dS = |∇g| dA = √(fx² + fy² + 1) dA;    dS = ∇g dA = (fx, fy, −1) dA, (168)

where the subscripts indicate first partial derivatives.

7.2 Worked examples

Example 36. Viviani’s window is the part of the surface of the sphere of radius R centred
in the origin, contained within the right cylinder with base given by a circumference of
radius R/2, centred in (x, y) = (R/2, 0), for z ≥ 0 (see Fig. 42). Find the area S of
Viviani’s window.

Figure 42: Viviani’s window. The surface S is bounded by the space curve γ, also known as
Viviani’s curve.

Solution: We are tasked with solving the following integral for the area S:
S = ∬_S dS.

Finding the normal vector to the surface of the sphere, and thus dS according to the standard
expressions, is simple enough – we have already solved this problem in Example 25. However,
the challenge is then to find the limits of the two variables (θ, φ) parametrizing the surface of the
sphere over which we should integrate. This is not trivial in this case, so we shall proceed with
a different parametrization of the surface that is more convenient, using cylindrical coordinates
for a cylinder not centred in the origin. Specifically, we write

x = r cos θ
y = r sin θ
z = z(r, θ).

Figure 43: Parametrization of an offset circle in plane-polar coordinates. A point P on the circle
is given by the parameters (r, θ), with r = R cos θ.

To find the range of (r, θ), consider the base of the cylinder in the xy-plane, as shown in Fig. 43.
Clearly, in order to describe the circle we see that −π/2 < θ < π/2, while the length of the
vector tracing the circle for some given value of θ must be r = R cos θ. The entire base of the
cylinder can then be described by setting 0 ≤ r ≤ R cos θ.
All points on the surface S must obey the equation of the sphere, and thus

R2 = x2 + y 2 + z 2 = (r cos θ)2 + (r sin θ)2 + z 2 = r2 + z 2 .



It follows that z = √(R² − r²). So the parametrization we are looking for is:

x = r cos θ
y = r sin θ
z = √(R² − r²),    for 0 ≤ r ≤ R cos θ, −π/2 ≤ θ ≤ π/2.
To find a convenient form for dS we recall Eq. (144), where our surface is traced out by the
vector x = (x, y, z), parametrized in terms of the two variables (r, θ). We therefore have
∬_S dS = ∬_S | ∂x/∂r × ∂x/∂θ | dr dθ.

 
∂x/∂r × ∂x/∂θ = det [ i, j, k ; ∂x/∂r, ∂y/∂r, ∂z/∂r ; ∂x/∂θ, ∂y/∂θ, ∂z/∂θ ]
= det [ i, j, k ; cos θ, sin θ, −r/√(R² − r²) ; −r sin θ, r cos θ, 0 ]
= ( r² cos θ/√(R² − r²), r² sin θ/√(R² − r²), r ).

The modulus of this vector is

| ∂x/∂r × ∂x/∂θ | = √[ (r² cos θ)²/(R² − r²) + (r² sin θ)²/(R² − r²) + r² ] = rR/√(R² − r²).

We now have all the ingredients needed to perform the integral:

S = ∬_S dS = ∫_{−π/2}^{+π/2} dθ ∫₀^{R cos θ} rR/√(R² − r²) dr
= ∫_{−π/2}^{+π/2} [ −R√(R² − r²) ]₀^{R cos θ} dθ
= ∫_{−π/2}^{+π/2} R²(1 − |sin θ|) dθ
= R² ( π − 2 ∫₀^{π/2} sin θ dθ )
= R²(π − 2).
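The closed-form R²(π − 2) can be checked numerically; a sketch using scipy's dblquad, with R = 1 (our own verification, not part of the notes):

```python
import numpy as np
from scipy.integrate import dblquad

R = 1.0

# Integrate rR/sqrt(R^2 - r^2) over 0 <= r <= R cos(theta),
# -pi/2 <= theta <= pi/2. dblquad(f, a, b, g, h) takes f(r, theta)
# with theta as the outer variable running from a to b.
area, _ = dblquad(lambda r, th: r*R/np.sqrt(R**2 - r**2),
                  -np.pi/2, np.pi/2,
                  lambda th: 0.0, lambda th: R*np.cos(th))
# area ≈ π - 2 ≈ 1.1416 for R = 1
```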

Example 37. Alice and Bob agree to meet for lunch at some time between noon and
1 pm. They are both willing to wait up to 10 min for the other to arrive, but will leave
otherwise. Find the probability that they meet for lunch, if

a) their arrival times are independent and uniformly distributed;

b) their arrival times are independent, but while Alice’s probability of arrival is uniform,
Bob is quadratically more likely to arrive later in the hour.

Does Bob’s tardiness help or hinder the likelihood of the two enjoying a meal together?

Solution: Let X indicate the arrival time for Alice, and Y the arrival time for Bob. We’re told
the two events are independent, so the joint PDF for the arrival of Alice and Bob is obtained
by multiplying their respective single-variable PDFs.
Part a) We're told Alice's and Bob's arrival times are uniformly distributed, so each PDF is a constant function, the integral of which over the domain of interest (a 1 hour window) should be 1:

∫_{1h} fX(x) dx = ∫_{1h} fY(y) dy = ∫_{1h} C dx = 1.

The constant is therefore simply the inverse of the time window. For convenience we can choose to work in units of minutes, in which case C = 1/60. The joint PDF is then

fXY(x, y) = (1/60)².

The domain over which the joint PDF is defined is shown in Fig. 44: a square of total area 3600 square minutes. We're interested in finding the probability that they meet. Since this requires $|Y - X| \leq 10$, this probability is given by
\[
P(|Y - X| \leq 10) = \iint_{|Y - X| \leq 10} f_{XY}(x, y) \, \mathrm{d}x \, \mathrm{d}y.
\]
In this case the integral simply needs to evaluate the red area shown in Fig. 44. We do not even need to do the integral to find the answer: the total area of the box is $60^2 = 3600$, while the combined area of the two blue triangles is $50^2 = 2500$, and so
\[
P(|Y - X| \leq 10) = \frac{3600 - 2500}{3600} = \frac{11}{36} \approx 31\%.
\]

Figure 44: Probability domain for Alice and Bob meeting.
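The $11/36$ result is easy to confirm with a quick simulation. A minimal numpy sketch, with arrival times measured in minutes after noon:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1_000_000

# Independent, uniform arrival times (minutes after noon).
x = rng.uniform(0, 60, N)  # Alice
y = rng.uniform(0, 60, N)  # Bob

# They meet if they arrive within 10 minutes of each other.
p_meet = (np.abs(y - x) <= 10).mean()
print(p_meet)  # close to 11/36 ~ 0.306
```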

Part b) Here the PDFs for Alice and Bob are not the same. We still have
\[
f_X(x) = \frac{1}{60}
\]
for Alice, but for Bob we're told that $f_Y(y) \propto y^2$. To find the normalization constant we must integrate this function over the domain of interest:
\[
\int_0^{60} C y^2 \, \mathrm{d}y = \left[ \frac{C y^3}{3} \right]_0^{60} = \frac{60^3}{3} C = 1 \quad \Rightarrow \quad C = \frac{3}{60^3}.
\]

The joint PDF is therefore
\[
f_{XY}(x, y) = \frac{3}{60^4} \, y^2,
\]
and the probability for the two to meet in a 10 minute window is
\[
P(|Y - X| \leq 10) = \iint_{|Y - X| \leq 10} f_{XY}(x, y) \, \mathrm{d}x \, \mathrm{d}y = \iint_{|Y - X| \leq 10} \frac{3}{60^4} \, y^2 \, \mathrm{d}x \, \mathrm{d}y,
\]

where the domain is the same as before, i.e., the red region in Fig. 44. Due to the shape of the domain, note that it is easier to integrate the blue regions instead (that is, the cases where the two do not meet), and so we write
\[
P(|Y - X| \leq 10) = 1 - P\big( (Y - X) > 10 \big) - P\big( (Y - X) < -10 \big),
\]
where the last two terms correspond to the upper and lower triangles, respectively. Note that these two are no longer the same. We have
\[
\begin{aligned}
P\big( (Y - X) > 10 \big) &= \int_0^{50} \mathrm{d}x \int_{x+10}^{60} \frac{3}{60^4} \, y^2 \, \mathrm{d}y \\
&= \int_0^{50} \mathrm{d}x \, \frac{3}{60^4} \left( \frac{60^3}{3} - \frac{(x+10)^3}{3} \right) \\
&= \left[ \frac{x}{60} \right]_0^{50} - \frac{1}{60^4} \left[ \frac{(x+10)^4}{4} \right]_0^{50} \\
&= \frac{7}{12} + \frac{1}{4} \left( \frac{1}{6} \right)^4.
\end{aligned}
\]
\[
\begin{aligned}
P\big( (Y - X) < -10 \big) &= \int_{10}^{60} \mathrm{d}x \int_0^{x-10} \frac{3}{60^4} \, y^2 \, \mathrm{d}y \\
&= \int_{10}^{60} \mathrm{d}x \, \frac{3}{60^4} \, \frac{(x-10)^3}{3} \\
&= \frac{1}{60^4} \left[ \frac{(x-10)^4}{4} \right]_{10}^{60} \\
&= \frac{1}{4} \left( \frac{5}{6} \right)^4.
\end{aligned}
\]
Therefore, we find
\[
P(|Y - X| \leq 10) = 1 - \frac{7}{12} - \frac{1}{4} \left( \frac{1}{6} \right)^4 - \frac{1}{4} \left( \frac{5}{6} \right)^4 = \frac{767}{2592} \approx 30\%.
\]
This is slightly lower than the result from part a), but only by around 1%. Bob's preference for showing up late doesn't seem to make much of a difference.
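This result can also be checked by simulation. Bob's quadratic PDF on $[0, 60]$ has CDF $(y/60)^3$, so inverse-transform sampling gives $Y = 60\,U^{1/3}$ for uniform $U$. A hypothetical numpy check:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1_000_000

x = rng.uniform(0, 60, N)  # Alice: uniform on [0, 60]

# Bob: f_Y(y) is proportional to y^2 on [0, 60]; the CDF is (y/60)^3,
# so inverting it gives y = 60 * u**(1/3) for u ~ U(0, 1).
y = 60 * rng.uniform(0, 1, N) ** (1 / 3)

p_meet = (np.abs(y - x) <= 10).mean()
print(p_meet)  # close to 767/2592 ~ 0.296
```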

Example 38. Evaluate the integral
\[
\oint_\gamma \mathrm{d}x + xy \, \mathrm{d}z,
\]
where $\gamma$ is the closed curve forming the boundary of the Pringles surface given by the equation $z = x^2 - y^2$, for $x^2 + y^2 \leq 1$, by

a) calculating the line integral directly;

b) using Stokes’ theorem.

Solution: The surface in question is shown in Fig. 45.


Part a) We can parametrize the curve in terms of the plane-polar angle $\theta$ as
\[
x = \cos\theta, \qquad y = \sin\theta, \qquad z = \cos^2\theta - \sin^2\theta; \qquad \theta \in [0, 2\pi].
\]
The integral then becomes
\[
\int_0^{2\pi} \left[ -\sin\theta + \sin\theta \cos\theta \left( -2\cos\theta\sin\theta - 2\sin\theta\cos\theta \right) \right] \mathrm{d}\theta = \int_0^{2\pi} -4\sin^2\theta \cos^2\theta \, \mathrm{d}\theta = -\pi,
\]
where the first term integrates to zero over a full period.
Part b) We note that the surface is well-defined everywhere, and that the given integral can be written in the form $\oint_\gamma \mathbf{F} \cdot \mathrm{d}\mathbf{r}$, with $\mathbf{F} = (1, 0, xy)$. To use Stokes' theorem we then write
\[
\oint_\gamma \mathrm{d}x + xy \, \mathrm{d}z = \oint_\gamma \mathbf{F} \cdot \mathrm{d}\mathbf{r} = \iint_S (\nabla \times \mathbf{F}) \cdot \mathrm{d}\mathbf{S},
\]

Figure 45: The Pringles crisp, a hyperbolic paraboloid.

and so we need to find the curl of $\mathbf{F}$ and an expression for the vector area $\mathrm{d}\mathbf{S}$. The curl is
\[
\nabla \times \mathbf{F} = (x, -y, 0).
\]
For the area we can write $g = z - f(x, y) = z - x^2 + y^2$, the gradient of which provides the normal vector to the surface,
\[
\nabla g(x, y, z) = (-2x, 2y, 1),
\]
so that
\[
\mathrm{d}\mathbf{S} = \nabla g \, \mathrm{d}x \, \mathrm{d}y = (-2x, 2y, 1) \, \mathrm{d}x \, \mathrm{d}y.
\]
This is oriented in the right direction. Inserting these expressions into the integral we have
\[
\begin{aligned}
\iint_S (\nabla \times \mathbf{F}) \cdot \mathrm{d}\mathbf{S} &= \iint_A (x, -y, 0) \cdot (-2x, 2y, 1) \, \mathrm{d}x \, \mathrm{d}y \\
&= -2 \iint_A (x^2 + y^2) \, \mathrm{d}x \, \mathrm{d}y \\
&= -2 \int_0^{2\pi} \mathrm{d}\theta \int_0^1 r \left( r^2\cos^2\theta + r^2\sin^2\theta \right) \mathrm{d}r \\
&= -4\pi \left[ \frac{r^4}{4} \right]_0^1 = -\pi.
\end{aligned}
\]
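After the angular integral contributes a factor of $2\pi$, the last step reduces to the radial integral $-4\pi \int_0^1 r^3 \, \mathrm{d}r$, which we can verify numerically. A minimal numpy sketch:

```python
import numpy as np

# Midpoint rule for the radial integral; the angular part contributes 2*pi.
Nr = 10_000
r = (np.arange(Nr) + 0.5) / Nr
dr = 1.0 / Nr

integral = -2 * (2 * np.pi) * np.sum(r**3) * dr
print(integral)  # close to -pi
```

This agrees with both the direct line integral of part a) and the Stokes' theorem calculation.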

