Lecture Notes
Sam Vinko
Hilary 2024
Contents
1 A refresher: vectors, planes, fields 4
1.1 Some basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Functions of multiple variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Scalar and vector fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Lines and planes in space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Multiple integrals 11
2.1 Non-rectangular domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Triple integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 The Jacobian matrix and determinant . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Curvilinear coordinate systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4.1 2D: Plane polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4.2 3D: Cylindrical coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.3 3D: Spherical polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . 23
3 Statistical Distributions 26
3.1 Continuous random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.1 Expectation value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 Distribution functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Joint probability distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 Marginal and conditional distributions . . . . . . . . . . . . . . . . . . . . . . . . 32
3.5 The Dirac Delta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5 Surfaces 49
5.1 The Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.1.1 Directional derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.1.2 The gradient in curvilinear coordinate systems . . . . . . . . . . . . . . . 53
5.1.3 Gradient expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 Surface integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2.1 Vector area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.2.2 Normal vector to a surface . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2.3 Surface integrals of scalar fields . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2.4 Surface integrals of vector fields . . . . . . . . . . . . . . . . . . . . . . . . 61
6.5 Some differential identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7 Revision 78
7.1 Summary of some key points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.2 Worked examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
CP4 Multiple Integrals and Vector Calculus syllabus
• Double integrals and their evaluation by repeated integration in Cartesian, plane polar
and other specified coordinate systems.
• Jacobians.
• Line, surface and volume integrals, evaluation by change of variables (Cartesian, plane po-
lar, spherical polar coordinates and cylindrical coordinates only unless the transformation
to be used is specified).
• The operations of grad, div and curl and understanding and use of identities involving
these.
• The statements of the theorems of Gauss and Stokes with simple applications.
• Conservative fields.
Recommended books
¹ What this looks like according to generative AI, graphically, for those interested, is depicted on the title page.
1 A refresher: vectors, planes, fields
1.1 Some basic definitions
In this course we will be interested in manipulating vector quantities of various kinds, so we
should start by recalling the definition of fields, vectors and vector spaces, and setting out the
preferred notation used in this course. First, a few definitions that will help clarify the language
we will be using.
• A set is a collection of mathematical objects.
• A set G with a binary operation that combines two elements in the set to yield another
element in the set is called a group. Mathematically this can be written as G × G → G.
This operation has to have certain properties (also called the axioms), and together with
the set these constitute the algebraic structure of the group. Specifically, for a group
these are closure, associativity, ∃ of an identity element, and ∃ of inverse elements. If the
operation also commutes (i.e., it does not depend on the order in which the elements are
written in the operation), then the group is said to be a commutative or an Abelian group.
• A set with two binary operations of addition and multiplication is called a field F . These
must satisfy six field axioms, namely associativity, commutativity, ∃ of additive and multi-
plicative identity, ∃ of additive and multiplicative inverses, and distributivity. We see the
complexity of the structure is increasing; in particular, we note that a field is an Abelian
group under both addition and multiplication, the two being connected via the requirement
of distributivity. Note that the set of rationals Q, real numbers R, and complex numbers
C are all fields, but the set of integers Z is not a field as the multiplicative inverse of
every integer is not necessarily another integer (e.g., the multiplicative inverse of 2 is 1/2,
not an integer). The integers are, instead, a ring. Rings generalize fields: they too have
two operations, but multiplication does not have to be commutative, and multiplicative
inverses need not exist.
• A vector space over a field F is a set V together with two binary operations: vector
addition (V × V → V ) and scalar multiplication (V × F → V ). Note the difference with
the definition of a field: the notion of multiplication between vectors is not part of the
fundamental algebraic structure of a vector space. In fact, as we know there are multiple
ways to multiply vectors, and the multiplicative inverse (i.e., vector division) is not defined.
The elements of the field are called scalars to distinguish them from the elements of the
vector space (vectors), and because the definition of multiplication acts to scale the vectors.
We will typically be interested in vector spaces defined over R: this is called a real vector
space. Eight axioms must be satisfied for the operations permitted on a vector space:
associativity and commutativity of addition, ∃ of identity elements for addition and scalar
multiplication, ∃ of additive inverse, distributivity for scalar multiplication with respect
to vector and field addition, and compatibility of scalar and field multiplication.
1.2 Vectors
Throughout this course we will be interested in n-dimensional real vector spaces and associated
real vectors, defined over Rn . We will write these vectors generally as
r = (x1 , x2 , · · · , xn ) r ∈ Rn ,
where xi are called the components of r. Often it will be convenient to use a slightly different
notation in the special cases of 2 or 3 dimensions, where instead we write
r = (x, y) r ∈ R2 .
r = (x, y, z) r ∈ R3 .
The modulus or norm of a vector is given by

|r| ≡ r = √( Σ_{i=1}^{n} x_i² ).    (1)

Any vector in Rⁿ can be decomposed onto the set of canonical unit (basis) vectors

ê1 = (1, 0, 0, · · · , 0)
ê2 = (0, 1, 0, · · · , 0)
⋮
ên = (0, 0, 0, · · · , 1),

so that a generic vector can be written as r = x1 ê1 + x2 ê2 + · · · + xn ên.
Note that the subscripts mean two different things: in one case they denote the component of
the vector (a scalar), and in the other a specific element of the basis set (a vector). That this
decomposition of a generic vector r can be done follows directly from the axioms of the vector
space: we explicitly use both the concepts of vector sums and scalar multiplications to identify
all possible elements of the vector space.
Once again, it will be convenient to use special notation for the 2 and 3 dimensional cases
when discussing the sets of canonical unit vectors:
r = x î + y ĵ + z k̂. (4)
In addition to the basic operations provided for by the axioms of a vector space, there are
a few others that will prove useful. The first is the scalar or dot product. This is an operation
that takes in two vectors elements of the vector space V and outputs a scalar, an element of
the field F over which the vector space is defined: V × V → F . For two vectors u, v ∈ V , with
components ui and vi , we write this operation as
u · v ≡ ⟨u, v⟩ = u1 v1 + u2 v2 + · · · + un vn.    (5)
In 2 dimensions it is easy to see that this definition can also be expressed geometrically as

u · v = |u||v| cos ϑ,    (6)

which allows us to define the concept of the angle between two vectors in n dimensions as

cos ϑ = (u · v) / (|u||v|).    (7)
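These definitions are easy to check numerically. The following short Python snippet (our own sketch, not part of the printed notes; all names are illustrative) implements the norm of Eq. (1), the scalar product of Eq. (5), and the angle of Eq. (7) directly:

```python
import math

def dot(u, v):
    """Scalar product of Eq. (5): sum of the products of the components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Modulus of Eq. (1): square root of the scalar product of u with itself."""
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle between two vectors in n dimensions, Eq. (7)."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

u, v = (1.0, 0.0, 1.0), (0.0, 2.0, 0.0)
print(dot(u, v))    # 0.0: the vectors are orthogonal
print(angle(u, v))  # pi/2, as expected for orthogonal vectors
```

Nothing here is specific to 3 dimensions: the same functions work for vectors of any length.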
We say that two vectors are orthogonal iff u · v = 0. This is easy to see geometrically in the
case of 2 or 3 dimensions from Eq. (6), but it holds more generally for vectors in any number of
dimensions.
In 3 dimensions we will see that it will be convenient to define another product operation
between vectors, but which, unlike the scalar product, outputs a vector: V × V → V . We call
this the vector or cross product. For two vectors u, v ∈ R3 , with components ui and vi , it is
defined as
u × v = det | î   ĵ   k̂  |
            | u1  u2  u3 |
            | v1  v2  v3 |.    (8)
We note that if u, v are linearly dependent, i.e. u = λv, λ ∈ R, then because of the permutation
rules of the determinant the above quantity is 0. If u, v are linearly independent, then they
define a plane, and the vector product will create a new vector normal to this plane (i.e., to the
entire subspace spanned by u, v). This provides insight into how the operation can be applied
to vectors defined in R2 rather than R3 : this is a special case where the basis has been aligned
with the plane spanned by u, v, and its normal. So for u, v ∈ R2 we can pad the missing third
dimension of the vectors with zeros and write
u × v = det | î   ĵ   k̂ |
            | u1  u2  0 | = (u1 v2 − v1 u2) k̂.    (9)
            | v1  v2  0 |
The vector product can be generalized to more than 3 dimensions but this is seldom useful.
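The cofactor expansion of the determinants in Eqs. (8) and (9) gives an explicit componentwise formula, sketched below in Python (our own helper functions, not part of the notes):

```python
def cross(u, v):
    """Vector product in R^3: cofactor expansion of the determinant in Eq. (8)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cross2d(u, v):
    """R^2 case of Eq. (9): pad with zeros; only the k-component survives."""
    return u[0] * v[1] - u[1] * v[0]

u, v = (1.0, 2.0, 0.0), (3.0, 4.0, 0.0)
w = cross(u, v)
print(sum(a * b for a, b in zip(w, u)))  # 0.0: w is normal to u (and to v)
print(w[2] == cross2d(u[:2], v[:2]))     # True: the padded 3D and 2D forms agree
```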
With these two operations we can write the triple product between three vectors u, v, w ∈ R3
as
u · (v × w) = det | u1  u2  u3 |
                  | v1  v2  v3 |
                  | w1  w2  w3 |.    (10)
This operation necessarily produces a scalar, and can be denoted as a map V × V × V → F. Note that the parentheses are not strictly needed, as there is only one way to get a meaningful result from the operation that is consistent with the definitions of the scalar and vector products. The
triple product possesses a useful geometric property: it measures the volume of the parallelepiped
formed by the three vectors u, v, w.
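The determinant of Eq. (10) likewise expands into components; the snippet below (ours, not from the notes) checks the volume interpretation on a simple cuboid and verifies that linearly dependent vectors give zero:

```python
def triple(u, v, w):
    """Scalar triple product u · (v × w): the determinant of Eq. (10)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
          - u[1] * (v[0] * w[2] - v[2] * w[0])
          + u[2] * (v[0] * w[1] - v[1] * w[0]))

print(triple((2, 0, 0), (0, 3, 0), (0, 0, 4)))  # 24: the volume of a 2x3x4 cuboid
print(triple((1, 1, 0), (2, 2, 0), (0, 0, 5)))  # 0: the first two vectors are parallel
```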
Figure 1: Temperature and wind gust maps of the UK. Temperatures represent a scalar field,
while the winds must be represented by a vector quantity representing the wind speed and
direction. (Source: Met Office 2022)
2. Functions that map a scalar to a vector f : A ⊆ R → Rn . We will call this a vector function
in one variable. An example would be the equation describing the position of a particle in
3D space as a function of time. For each value of time (a scalar), the function outputs the
position vector of the particle r = (x, y, z).
Figure 2: An object with mass travelling in the solar system will be subject to the gravitational
field: a vector field where each point in space has a gravitational force associated with it. This
will cause the mass to travel in some direction. There are 5 stationary points in the Earth-Sun
system, called Lagrange points, where small objects can steadily orbit the Sun along with the
Earth in a fixed position. One of these points, L2 (note, it’s a saddle point), is the home of the
recently launched James Webb telescope. (Source: NASA 2022.)
temperature distribution. A proper map would therefore look more like (x, y, z, t) → T . We can
generalize these ideas and define a scalar field to be a function S that assigns to each point in
some n-dimensional vector space a scalar value:
S(r) : Rn → R. (11)
If instead of the temperature we were interested in accounting for the wind speed across
the country, we could do something very similar, but the scalar temperature would need to
be replaced with a vector quantity accounting for the wind direction and velocity. Again, in
the simplest case we may want to associate the wind direction with the latitude and longitude,
and so the required mapping can be written as (x, y) → (vx , vy ). As before, we can add the
altitude or the time to our coordinate system, but now we may also include further complexity
to the description of the wind, for example by adding its vertical velocity component. Then the
mapping would be from a 4D to a 3D space: (x, y, z, t) → (vx, vy, vz). We can generalize this by
defining a vector field as a function F that assigns to each point in some n-dimensional vector
space an m-dimensional vector:
F(r) : Rn → Rm . (12)
Figure 3: A line in 3D space can be fully specified by two vectors. This could be either two position vectors describing two points on the line, or a vector along the line (the tangent vector) together with a vector to any point on the line.
This is an obvious generalization of Eq. (11): a vector field with m = 1 reduces to a scalar field.
where, again, a is any point in the plane, and t̂1 and t̂2 are any two non-colinear vectors lying
in the plane. The dimensionality of this set is now 2, as we have two degrees of freedom in the
equation. While not part of this course, it should be clear from these two examples how sets of
higher dimensionality can be generated for vector spaces of arbitrary dimensionality.
In addition to Eq. (14), a plane is also often given in terms of three points in the plane given
by the vectors a,b,c, noting that the difference of any two of these vectors yields a vector in the
plane:
r = a + α(b − a) + β(b − c), α, β ∈ R. (15)
Alternatively, a plane can also be specified using the unit vector normal to the plane n̂, and a
point in the plane a
r · n̂ = a · n̂ = d, (16)
Figure 4: Three ways to define a plane in space: via two vectors in the plane and a point; via
three points; and via a point and the unit normal vector to the plane.
where d is the length of the projection of any vector describing a point on the plane onto the
unit normal to the plane, i.e., the distance between the plane and the origin. A summary of
these ways to describe a plane is given in Fig. 4.
2 Multiple integrals
If this limit exists and is finite, the function is said to be (Riemann) integrable. In practice,
of course, we tend to solve integrals by guessing them; we can compute and memorize the
derivatives of a wide range of commonly-encountered functions, and then use the fundamental
theorem of calculus (integration is the inverse operation of differentiation) to figure out the
integrals.
Conceptually, this approach can easily be extended to integrals in more dimensions, simply
by considering domains that are no longer 1D intervals but higher dimensional areas, volumes, or
hyper-volumes. However, on the practical side our strategy will be to reduce a multidimensional
integral into a series of 1D integrals, and then solve those using standard approaches. Let us
illustrate this using a simple case in two dimensions.
Let the integration domain be a rectangle D = [a, b] × [c, d] ⊂ R2 in the (x, y) plane, and let
f (x, y) be a continuous function in two variables. Intuitively, the integral of f over this domain
will be some volume V contained by the rectangle between the surfaces z = 0 and z = f (x, y),
that is, the volume of an object in 3D space. Now let us cut this shape with a vertical plane parallel
to the x-axis, as illustrated in Fig. 6. The intersection between the volume and the plane will
be a flat region in the (x, z) plane, identifiable by a single value of y given by where the vertical
plane intersects the y-axis. We know how to calculate the area A = A(y) of this shape:
A(y) = ∫_a^b f(x, y) dx.    (18)
Within the integral the only variable is x since y is fixed. This is therefore a 1D integral which
we can attempt to solve using known techniques. The total volume V is clearly made by varying
y between the given bounds of [c, d], and summing over their contributions:
V = ∫_c^d A(y) dy.    (19)
Figure 6: Integration in two dimensions.
Substituting Eq. (18) into Eq. (19), and noting that performing the same operations, but choos-
ing instead a vertical plane parallel to y, would have to yield the same result, we can write
V = ∫_c^d ( ∫_a^b f(x, y) dx ) dy = ∫_a^b ( ∫_c^d f(x, y) dy ) dx.    (20)
Because the order in which we chose to integrate does not matter², we adopt the following notation to indicate a general (definite) integral of f over some domain D:
V = ∬_D f(x, y) dx dy.    (21)
Note that if a function happens to be separable, f(x, y) = h(x) · g(y), then the double integral above, on a rectangular domain, simplifies to the product of two 1D integrals. The following integrability theorem holds:
Theorem 1. Let D ⊂ R2 be a regular domain and f : D → R a continuous function. Then f
is integrable in D.
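Eq. (20) and the separability remark can be sanity-checked numerically by approximating each 1D integral with a midpoint sum. A minimal Python sketch (our own code, with an integrand chosen purely for illustration):

```python
import math

def integrate_1d(f, a, b, n=2000):
    """Midpoint-rule approximation of a 1D definite integral."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def integrate_2d(f, a, b, c, d, n=400):
    """Repeated 1D integration over the rectangle [a, b] x [c, d], as in Eq. (20)."""
    return integrate_1d(lambda y: integrate_1d(lambda x: f(x, y), a, b, n), c, d, n)

# A separable integrand f(x, y) = h(x) g(y) on a rectangular domain:
lhs = integrate_2d(lambda x, y: math.exp(x) * math.cos(y), 0, 1, 0, math.pi / 2)
rhs = integrate_1d(math.exp, 0, 1) * integrate_1d(math.cos, 0, math.pi / 2)
print(abs(lhs - rhs) < 1e-5)  # True: double integral = product of two 1D integrals
```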
We used the concept of a regular domain here, which is worth dwelling on. A domain D is
said to be regular if it is open (i.e., every point in D is an interior point) and if its boundary
∂D is piecewise smooth (i.e., represented by a finite number of functions that admit continuous
derivatives). We will elaborate more on this in what follows.
We have seen an example of how a 2D integral can be used to calculate a volume in 3D.
However, a double integral can also define the measure (area) of a domain. A bounded domain D ⊂ R² is said to be measurable if the function 1 is integrable in D. Then, the measure (or area) of D is given by

|D| = ∬_D 1 dx dy.    (22)
² Although this holds for the integrals treated in this course, this statement is not strictly true. If you are interested to learn more, look up Fubini's theorem!
2.1 Non-rectangular domains
The extension of double integrals with rectangular domains, as discussed above, to more
complex domains, is relatively straightforward. It will be convenient to consider simple domains
first. We will call a domain x-simple if every horizontal line intersects the domain in (at most) a single line segment, and y-simple if every vertical line intersects it in (at most) a single line segment. The domains are illustrated in Fig. 7. The choice of horizontal and vertical
here is simply a convention based on how we normally draw a 2D coordinate system. In slightly
more mathematical terms,
Definition 2. A set D ⊂ R2 is y-simple if
Similarly, if D is x-simple, then we should integrate over the line segments along the x direction first:

∬_D f(x, y) dx dy = ∫_c^d ( ∫_{h1(y)}^{h2(y)} f(x, y) dx ) dy.    (26)
Figure 8: Domain of the integral in Example 1. We see the domain is both x-simple and y-simple. The order in which we can choose to integrate is denoted by the red and blue lines.
Both approaches allow us to reduce the double integral to two single integrals, which we know
how to tackle. A rectangular domain is trivially both x-simple and y-simple, but there is a
further family of non-rectangular domains which are both x-simple and y-simple, such as the
disc. Such domains are called simple. Note that if the domain is simple, then we can choose
whether to partition it as x-simple or y-simple. A judicious choice here can dramatically change
the difficulty of the integral.
Many of the domains you will encounter will not be simple, but as long as they can be partitioned
into a finite number of simple domains we can still integrate across them, and the methods
discussed above apply. This provides another, perhaps more intuitive, way of defining a regular
domain: it is one that can be written as a union of a finite number of simple domains. So we see
that an annulus, for example, would not be a simple domain, but would be regular (and thus
allowable). We will not deal with domains that are not regular in this course.
Solution: The domain is both x-simple and y-simple, as shown in Fig. 8, so we can choose the
order of integration based on preference. Considering D to be y-simple, the integral becomes
I = ∫_{x=0}^{1} ( ∫_{y=√x}^{1} sin(y³) dy ) dx.    (28)
Now unfortunately the integral in the parenthesis has no elementary primitive (one can write
the solution using complex analysis in terms of gamma functions, a task not for the faint of
heart – but do have a go if you’re feeling intrepid!). A far simpler solution is to consider the
Figure 9: Separation of the domains of triple integrals based on domain type: the first domain can be split into layers of area Ω(z) for h1 ≤ z ≤ h2. The second integrates along the z-axis
lines connecting two functions z = g1 (x, y) and z = g2 (x, y), for all (x, y) points in the domain
D.
The order of integration does not impact the integrability of the function, but it does have a
material impact on the ease with which one can find the solution in simple form.
Figure 10: Tetrahedron given in Example 2.
where for every z ∈ [h1 , h2 ], Ω(z) is a plane regular domain. Then, if f : Ω → R is a continuous
function, f is integrable in Ω and the integral can be calculated as
∭_Ω f(x, y, z) dx dy dz = ∫_{h1}^{h2} ( ∬_{Ω(z)} f(x, y, z) dx dy ) dz.    (33)

For the second type of domain in Fig. 9, the region is instead of the form

Ω = { (x, y, z) : (x, y) ∈ D, g1(x, y) ≤ z ≤ g2(x, y) },    (34)
where D is a regular domain in the (x, y) plane and g1 , g2 : D → R are continuous functions.
Then, if f : Ω → R is a continuous function, f is integrable in Ω and the integral can be
calculated as

∭_Ω f(x, y, z) dx dy dz = ∬_D ( ∫_{g1(x,y)}^{g2(x,y)} f(x, y, z) dz ) dx dy.    (35)
These expressions contain at most a double integral, and we can deploy the methods discussed
earlier to reduce them further into single integrals that we can solve explicitly. For the purposes
of this course we will not be interested in integrating in spaces having domains of dimensionality
higher than three. Nevertheless, it should be fairly obvious by now how this methodology can
be extended to tackle integrals in higher dimensions.
Example 2. Find the mass of the tetrahedron bounded by the three coordinate planes
and the plane x + 2y + 3z = 4, if its density is given by ρ(x, y, z) = ρ0 x.
Solution: The volume of interest is a tetrahedron that intersects the coordinate axes at the points (4, 0, 0), (0, 2, 0), and (0, 0, 4/3), as shown in Fig. 10. In the (x, y) plane the domain is the triangle bounded by the lines x = 0, y = 0, and y = 2 − x/2, found by setting z = 0 in
the equation of the plane. For each point (x, y) in this domain, the volume can be found by
integrating along the z-axis, from z = 0 to z = (4 − x − 2y)/3. We therefore have
m = ∭_V ρ(x, y, z) dV
  = ∫_{x=0}^{4} ∫_{y=0}^{2−x/2} ∫_{z=0}^{(4−x−2y)/3} ρ0 x dz dy dx
  = (ρ0/3) ∫_{x=0}^{4} ∫_{y=0}^{2−x/2} (4 − x − 2y) x dy dx
  = (ρ0/3) ∫_{x=0}^{4} ( 4x − 2x² + x³/4 ) dx
  = (ρ0/3) ( 32 − 128/3 + 16 ) = (16/9) ρ0.
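The result of Example 2 can be verified numerically with a midpoint sum over the same limits (our own sketch; the inner z-integral of the z-independent integrand is done exactly):

```python
def tetra_mass(rho0=1.0, n=120):
    """Midpoint-rule estimate of the triple integral of rho0 * x over the
    tetrahedron bounded by the coordinate planes and x + 2y + 3z = 4."""
    total = 0.0
    hx = 4.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (2.0 - x / 2.0) / n
        for j in range(n):
            y = (j + 0.5) * hy
            zmax = (4.0 - x - 2.0 * y) / 3.0
            total += rho0 * x * zmax * hx * hy  # exact z-integral: rho0 * x * zmax
    return total

print(abs(tetra_mass() - 16.0 / 9.0) < 1e-3)  # True: matches the analytic 16 rho0 / 9
```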
while changing the original coordinates from (x, y) to some new set of coordinates (u, v) described
by the transformation T
(x, y) = T (u, v). (37)
The domain D of the integral will, necessarily, also change, T : D′ → D (note that given the definition of Eq. (37), T acts on (u, v)). We can also write explicitly
x = g(u, v)
y = h(u, v),
and the integrand becomes f̃(u, v) := f(g(u, v), h(u, v)).
In the 1D case, when changing variables, we were interested in finding an expression for
how the infinitesimal length element changes under the transformation. In 2D, we ask how the
infinitesimal area element dA changes with the change of variables. Consider the transformation
of Eq. (37) illustrated in Fig. 11: we are interested in calculating what the infinitesimal area
Figure 11: The infinitesimal area of a domain changes in a coordinate transformation, and must
be accounted for in integration using the Jacobian. Here, (x, y) represent the standard Cartesian
coordinate system, and the lines of constant u and v (the new coordinates) are drawn in red.
dx dy is when written in the new coordinates dv du, and vice versa. To find this, consider the area
of the shaded region in Fig. 11 between the vertices A, B, C and D. For a generic transformation
the sides of this shape will be curved, but in the limit of infinitesimal displacements, to the first
order linear approximation, the sides can be considered straight lines and the shape tends to a
parallelogram. The vertices of the parallelogram have the following (x, y) components:

A: ( g(u, v),            h(u, v) )
B: ( g(u + du, v),       h(u + du, v) )
C: ( g(u + du, v + dv),  h(u + du, v + dv) )
D: ( g(u, v + dv),       h(u, v + dv) )
From the vertices we can find the vector sides of the parallelogram

a = (g(u + du, v), h(u + du, v)) − (g(u, v), h(u, v)) = ( (∂g/∂u) du, (∂h/∂u) du )
b = (g(u, v + dv), h(u, v + dv)) − (g(u, v), h(u, v)) = ( (∂g/∂v) dv, (∂h/∂v) dv ).
or in matrix notation, which lends itself more naturally to generalization in more dimensions,
dx dy = det | g_u  h_u |
            | g_v  h_v | du dv.    (40)
This matrix containing all first-order partial derivatives is commonly encountered in vector calculus, and merits a name: the Jacobian matrix. Both the matrix and its determinant are often simply referred to as the 'Jacobian', but this is (perhaps surprisingly) rarely confusing. The Jacobian matrix is sometimes defined as the transpose of the expression above, a form originating from the requirement to define continuity for multivariable vector functions. For the use of the matrix this can be somewhat confusing at times, but as regards the determinant there is no difference, since the determinant is invariant under transposition.
where

J(u, v) = det | g_u  h_u |
              | g_v  h_v |    (42)

is the determinant of the Jacobian matrix.
Example 3. Evaluate

I = ∬_D 4/(x − y)² dx dy,    (43)
We are only interested in the size of the Jacobian, not the sign (the sign will depend on the
orientation choice of the vectors taken for the cross product). Taking |J| we can then rewrite
the integral as

I = ∬_{D′} (4/v²) · (1/2) du dv.    (45)
Now we must analyse the domain, drawn in Fig. 12. The lines y = x−2 and y = x−4 correspond
to v = 2 and v = 4, respectively. These will be the limits of the domain along v. To find the
limits on u we use the remaining two lines and parametrize them in terms of (u, v). Firstly,
when x = 0 we have that u = y and v = −y, so u = −v. When y = 0 we find instead that
u = x = v. Thus u will vary between −v and +v. Therefore
I = ∫_{v=2}^{4} ∫_{u=−v}^{v} (2/v²) du dv = ∫_{v=2}^{4} (2/v²) · 2v dv = ln(16).    (46)
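The value ln(16) in Eq. (46) can be confirmed with a short midpoint sum over v (our own sketch; the inner u-integral of a u-independent integrand is just the segment length 2v):

```python
import math

def example3_check(n=800):
    """Midpoint estimate of I = int_2^4 (2 / v^2) * 2v dv, from Eq. (46)."""
    hv = 2.0 / n
    total = 0.0
    for i in range(n):
        v = 2.0 + (i + 0.5) * hv
        total += (2.0 / v ** 2) * 2.0 * v * hv
    return total

print(abs(example3_check() - math.log(16)) < 1e-5)  # True: I = ln(16)
```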
Figure 12: Integration domains for Example 3 in the (x, y) and (u, v) coordinate systems. Note
the dotted triangles in both, each with an area of 2. There are 3 such triangles in the (x, y)
domain but 6 in the (u, v) domain. The transformation has stretched the domain, and this is
accounted for by the Jacobian of 1/2.
x = r cos θ
y = r sin θ
The basis vectors of the Cartesian system are unit vectors along the x- and y-axes, î and ĵ. The
basis vectors in the new system will be different: r̂ is the unit vector along the line through the
origin at some fixed angle θ to the x-axis, while θ̂ is the unit vector tangent to the circumference
at some fixed distance from the origin (i.e., fixed r):
r̂ = î cos θ + ĵ sin θ
θ̂ = −î sin θ + ĵ cos θ. (47)
We can calculate the Jacobian of the transformation, which will allow us to use it to simplify
the calculation of some double integrals:
J(r, θ) = det | ∂x/∂r  ∂y/∂r |
              | ∂x/∂θ  ∂y/∂θ | = det | cos θ      sin θ  |
                                     | −r sin θ  r cos θ | = r(sin²θ + cos²θ) = r.    (48)
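A numerical check of Eq. (48): approximating the four partial derivatives with central differences and taking the determinant recovers J = r at any chosen point. The helper below is our own sketch, not part of the notes:

```python
import math

def jacobian_det(g, h, u, v, eps=1e-6):
    """Central-difference estimate of J(u, v) = g_u h_v - g_v h_u
    for a transformation (u, v) -> (g(u, v), h(u, v)), cf. Eq. (42)."""
    gu = (g(u + eps, v) - g(u - eps, v)) / (2 * eps)
    gv = (g(u, v + eps) - g(u, v - eps)) / (2 * eps)
    hu = (h(u + eps, v) - h(u - eps, v)) / (2 * eps)
    hv = (h(u, v + eps) - h(u, v - eps)) / (2 * eps)
    return gu * hv - gv * hu

# Plane polar coordinates: x = r cos(theta), y = r sin(theta)
x = lambda r, t: r * math.cos(t)
y = lambda r, t: r * math.sin(t)
print(abs(jacobian_det(x, y, 2.0, 0.7) - 2.0) < 1e-6)  # True: J = r = 2 at r = 2
```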
Figure 13: The plane polar coordinate system allows us to identify any point in the plane (except the origin) via two new variables: r, the Euclidean distance of the point to the origin, and the angle θ. In this system the infinitesimal surface element has an area of r dr dθ: it depends explicitly on the distance of the element from the origin.
The sketch in Fig. 13 shows this result more intuitively: the blue-shaded infinitesimal area, in
the limit of dr, dθ → 0, is simply a rectangle with sides r dθ and dr, and therefore has an area of
r dr dθ. Interestingly, in plane polar coordinates dA is not constant in all space like dx dy, but increases linearly with distance from the origin.
Example 4. Evaluate

I = ∬_D x y² / (x² + y²) dx dy,    (50)
Solution: We note that the domain is particularly simple to express in plane polar coordinates:
on the (x, y) plane it represents a quarter of a disc, so we must have that r ∈ [0, 2] and θ ∈
[0, π/2]. We proceed to rewrite the integral in this coordinate system by making the substitutions
x = r cos θ, y = r sin θ, and making sure we remember to include the Jacobian:
I = ∬_{D′} ( r³ cos θ sin²θ / r² ) r dr dθ
  = ∫_0^2 ∫_0^{π/2} r² cos θ sin²θ dθ dr
  = ( ∫_0^2 r² dr ) ( ∫_0^{π/2} sin²θ cos θ dθ )
  = [ r³/3 ]_0^2 · [ sin³θ/3 ]_0^{π/2} = (8/3) · (1/3) = 8/9.
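The value 8/9 can again be checked with a midpoint sum over the polar rectangle r ∈ [0, 2], θ ∈ [0, π/2] (our own sketch, Jacobian included in the integrand):

```python
import math

def example4_check(n=400):
    """Midpoint estimate of I = int_0^2 int_0^{pi/2} r^2 cos(t) sin(t)^2 dt dr."""
    hr, ht = 2.0 / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            total += r ** 2 * math.cos(t) * math.sin(t) ** 2 * hr * ht
    return total

print(abs(example4_check() - 8.0 / 9.0) < 1e-4)  # True: I = 8/9
```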
Example 5. Evaluate the Gaussian integral

G = ∫_{−∞}^{+∞} e^{−x²} dx.    (51)
Solution: The first thing to note is that the indefinite integral of e^{−x²} does not have a simple primitive, that is to say, there is no elementary function the derivative of which is the squared
exponential. However, the 2D version of the same integral is more tractable, so, for a change,
we shall use double integration to help us solve an integration problem in one dimension. We
start by defining the following integral
I(R) = ∬_D e^{−(x² + y²)} dx dy,    (52)
where D is the area enclosed by the circle of radius R: x² + y² < R². This integral can be
simplified by moving to plane polar coordinates:
I(R) = ∫_0^{2π} ∫_0^R e^{−r²} r dr dθ
     = 2π [ −(1/2) e^{−r²} ]_0^R = π ( 1 − e^{−R²} ).
Taking the limit of this expression for R → ∞ gives us the value of the integral over the entire
(x, y) plane
I = lim_{R→∞} I(R) = π.    (53)
We now note that I = G_x × G_y = ( ∫_{−∞}^{+∞} e^{−x²} dx ) × ( ∫_{−∞}^{+∞} e^{−y²} dy ). The two integrals in x and y are identical, so I = G². We thus conclude that
∫_{−∞}^{+∞} e^{−x²} dx = √I = √π.    (54)
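The polar-coordinates trick above is also easy to verify numerically: a midpoint sum for I(R) of Eq. (52) approaches π as R grows, so the Gaussian integral approaches √π (our own sketch):

```python
import math

def gauss_polar(R, n=4000):
    """Midpoint estimate of I(R) = int_0^{2pi} int_0^R exp(-r^2) r dr dtheta."""
    h = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += math.exp(-r * r) * r * h
    return 2.0 * math.pi * total  # the theta integral contributes a factor 2 pi

print(abs(gauss_polar(10.0) - math.pi) < 1e-4)                        # True: I -> pi
print(abs(math.sqrt(gauss_polar(10.0)) - math.sqrt(math.pi)) < 1e-4)  # True: G = sqrt(pi)
```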
x = r cos φ
y = r sin φ
z = z.
As shown in Fig. 14, the unit vectors in the Cartesian basis î, ĵ are replaced by the plane polar
basis vectors r̂, φ̂ in the (x, y) plane, while the basis vector along the z direction, k̂, remains
unchanged. The relationship between basis vectors is therefore:
r̂ = î cos φ + ĵ sin φ
φ̂ = −î sin φ + ĵ cos φ
ẑ = k̂. (55)
Figure 14: The cylindrical coordinate system, basis vectors, and volume element.
As illustrated in Fig. 14, the infinitesimal volume in this coordinate system tends to a cuboid
with sides dr, dz, and r dφ. In this coordinate system we can therefore write the (square)
infinitesimal line element as
(ds)² = (dr)² + r²(dφ)² + (dz)².
Similarly, the surfaces of the cuboid have sides with areas r dφ dz (at constant r), dr dz (at
constant φ), and r dr dφ (at constant z), and its volume is dV = r dr dφ dz.
2.4.3 3D: Spherical polar coordinates
Spherical polar coordinates (r, θ, φ) are related to the Cartesian coordinates by
x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ,
where θ ∈ [0, π] is the polar angle, measured between the z-axis and the radial vector connecting
the origin to some point, and φ ∈ [0, 2π] is the azimuthal angle, measured between the x-axis
and the projection of the radial vector onto the xy-plane. The new basis vectors in this system
are shown in Fig. 15, and are given by the relations
Figure 15: The spherical polar coordinate system, basis vectors, and volume element.
As illustrated in Fig. 15, the infinitesimal volume in this coordinate system tends to a cuboid
with sides dr, r dθ, and r sin θ dφ. The line element is therefore given by
Example 6. Integrate the function f = x²z over the positive half (z ≥ 0) of the sphere of
radius R centred at the origin.
Note how we have moved to plane polar coordinates to simplify the double integral in (x, y) into
a product of two single integrals in r and θ, given the radial symmetry of the problem.
Alternatively, we could also consider the domain as a series of discs slicing the sphere:
Ω = {(x, y, z) : x² + y² ≤ R² − z², 0 ≤ z ≤ R},
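As a quick check, the integral of Example 6 can also be evaluated symbolically in spherical polar coordinates. This is a sketch (not part of the original notes) using sympy, taking the "positive half" to mean z ≥ 0:

```python
import sympy as sp

r, th, ph, R = sp.symbols('r theta phi R', positive=True)

# Integrand x^2 z expressed in spherical polars, times the Jacobian r^2 sin(theta):
# x = r sin(theta) cos(phi), z = r cos(theta)
integrand = (r*sp.sin(th)*sp.cos(ph))**2 * (r*sp.cos(th)) * r**2 * sp.sin(th)

# Upper half-ball: r in [0, R], theta in [0, pi/2], phi in [0, 2*pi]
I = sp.integrate(integrand, (r, 0, R), (th, 0, sp.pi/2), (ph, 0, 2*sp.pi))
print(sp.simplify(I))  # -> pi*R**6/24
```

Note how the integral factorizes into three single integrals in r, θ and φ, exactly as described above.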
3 Statistical Distributions
Suppose we wish to perform an experiment, the outcome of which is well-defined (and ideally
measurable), but a priori unknown. There is value in being able to make quantitative predictions
on what the outcome of the experiment may be, without having to perform it. The mathematical
tools that can assist us in making such predictions in real-world experiments fall under the
general headings of probability and statistics. Of course this is a huge topic; in this course we
will explore a very small part of it, that of continuous random distributions.
First, we should define some basic terminology. The experiment mentioned earlier will be
called a trial, and the result of the experiment the outcome. The set of all possible outcomes of
the trial, i.e. our space of possibilities, will be called the sample space. The sample space can be
discrete, with a finite or infinite number of possible outcomes, or continuous, with a necessarily
infinite number of possible outcomes. The trial of flipping a coin has a discrete number of
outcomes if we are interested in which side ends facing upwards (heads, tails, or, possibly, it
landing on its side), but a continuous set of outcomes if we were interested in predicting, for
example, where on the floor it will land. If instead of considering an individual outcome we are
interested in a set of outcomes, itself a subset of the sample space, we will call such a subset an
event. These too can be continuous or discrete. In this course we will be interested in continuous
sample spaces, which will allow us to apply our vector calculus methods to the study of statistics.
How likely is a certain event, compared to all others? One way to think about this is to
imagine repeating the trial many times over, and to count the frequency of measuring a certain
event compared to all other outcomes. It turns out that this relative frequency tends to be
approximately the same every time a large set of trials is performed, which spurs us to call this
ratio the probability P of the event taking place. If we call the event A, then we can write
P(A) = nA/ntot, (63)
where nA is the number of outcomes of event A and ntot is the number of total outcomes.
Figure 16: Probability density function (PDF) for a variable X that can take values between l1
and l2. The probability of finding X between a and b is given by the integral of the PDF over
that range, as given in Eq. (65).
In a continuous sample space, the number of possible outcomes is therefore equal to the number of values present on some finite interval of the reals.
As this is infinite, we cannot assign a finite probability to individual outcomes: Eq. (63) shows
that P → 0 if ntot → ∞ for any finite nA . Instead of considering individual outcomes, when
dealing with continuous variables we must therefore consider outcomes over a finite continuous
subset of the sample space.
We can use this insight to define a probability density function (PDF) f of a continuous
random variable X as
P (x < X ≤ x + dx) = f (x) dx. (64)
Here f (x) dx is the probability of finding X in the interval between x and x + dx. More generally:
P(a < X ≤ b) = ∫_a^b f(x) dx. (65)
We know that some outcome of a trial will take place with certainty, and so if we take the
range to be the entire sample space S, the probability of finding X there must be 1 (i.e., 100%),
by definition. This leads to the requirement for f to be integrable over S, and normalized to 1:
∫_S f(x) dx = 1.
Furthermore, a probability must be a real number between 0 and 1, and can never be negative.
For f (x) to be a probability density we therefore further require f to be a non-negative function
of x, ∀x ∈ S.
For practical reasons it will be useful to define the cumulative probability distribution function
(CDF), denoted by F (x), which is given by the following integral:
F(x) = ∫_{l1}^{x} f(u) du. (66)
Clearly, this is the probability that X is less than, or equal to, x, i.e., F (x) = P (X ≤ x). We
can turn back to Eq. (65), and rewrite it in terms of the CDF as
P(a < X ≤ b) = ∫_a^b f(x) dx = F(b) − F(a). (67)
We can see why the CDF is useful – if known, it provides us with a way to quickly determine the
probability of the random variable lying in some range, without needing to evaluate an integral.
From the fundamental theorem of calculus, f(x) = dF(x)/dx.
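The PDF-to-CDF relationship can also be illustrated numerically. A minimal sketch (using a toy PDF f(x) = 2x on [0, 1], which is not from the notes) that builds F by cumulative integration and then applies Eq. (67):

```python
import numpy as np

# Toy PDF on [0, 1]: f(x) = 2x, whose CDF is F(x) = x^2 (an illustrative
# choice, not from the notes)
x = np.linspace(0.0, 1.0, 100_001)
f = 2.0 * x

# Cumulative trapezoidal integration approximates F(x) = integral of f from 0 to x
F = np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) / 2 * np.diff(x))))

a, b = 0.25, 0.75
P = np.interp(b, x, F) - np.interp(a, x, F)  # Eq. (67): P(a < X <= b) = F(b) - F(a)
print(P)  # 0.5 up to floating-point rounding (= 0.75^2 - 0.25^2)
```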
Example 7. Consider the PDF provided in Fig. 17, describing the seasonal maximum
air temperature in SE England according to two climate modelling scenarios. Estimate
the likelihood that the maximum temperatures exceed 2◦ C, and 6◦ C, respectively, in the
two scenarios. What is the probability the maximum temperature increase will be in the
range 2◦ C-6◦ C?
Solution: The likelihood of the temperature exceeding 6◦ C is easy to glean directly from the
PDF. It approaches 0 under the RCP2.6 scenario (the entire distribution lies below the 6◦ mark),
but is around 50% in the scenario RCP8.5. The probability that the temperature will exceed
2◦ C is a little harder to extract from the PDF figure without having the numerical values, but
can instead be found from the CDF:
Figure 17: Seasonal average Maximum air temperature anomaly at 1.5m (in ◦ C) for June-July-
August in years 2070 up to and including 2098, in SE England using baseline 1961-1990. RCP
stands for Representative Concentration Pathway. RCP8.5 is a scenario where we make little
effort to curb emissions, while under RCP2.6 a concerted effort is made. Source: Met Office.
To find the probability of the temperatures ending up in the range 2◦C-6◦C we can simply
subtract the CDFs at the two values, as shown in Eq. (67).
3.1.1 Expectation value
Given a continuous random variable X with PDF f, the expectation value of a function g of X is defined as
E[g(X)] = ∫_S g(x)f(x) dx, (68)
where S is the sample space, and assuming the integral exists. We note that it is a linear
functional that depends on the functions f and g, rather than being a function of the random
variable X.
The most common types of functions for which we will want to calculate the expectation value
will be powers of the variable itself. Because of this, these expectation values are given a special
name to distinguish them – they are called moments of the distribution, with
E[X^k] = ∫_S x^k f(x) dx (69)
denoting the k-th moment of the distribution. Of these, the first moment – the expectation of
the variable itself – is most widely used, and is more commonly called the mean:
E[X] = ∫_S x f(x) dx. (70)
This quantity is often denoted as ⟨x⟩, x̄, or by the Greek letter µ. In addition to this, distributions
are also commonly characterized by the mode and the median. The mode of a distribution is the
value of the random variable X at which the probability density function f (x) has its greatest
value. Note that it is possible to have multiple modes in a distribution. The median of a
distribution is the value of the random variable X at which the cumulative probability function
F(x) takes the value 1/2: it divides the PDF into two equally probable halves.
Another important expectation value is the variance:
Definition 8. The variance of a distribution, σ², is defined as
σ² = E[(X − ⟨x⟩)²] = ∫_S (x − ⟨x⟩)² f(x) dx. (71)
The variance of a distribution is always positive. Its positive square root is known as the
standard deviation of the distribution and is often denoted by σ. Roughly speaking, σ measures
the spread of the values that X can assume around the mean. Often the variance is given in a
slightly different form:
σ² = E[(X − ⟨x⟩)²] =
= E[X²] − 2⟨x⟩E[X] + ⟨x⟩² =
= E[X²] − ⟨x⟩² ≡ E[X²] − E[X]².
It follows that the variance is the difference between the second moment and the square of the
first moment of a distribution.
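The identity σ² = E[X²] − E[X]² is easy to verify with a Monte Carlo sketch (the exponential test distribution here is an arbitrary choice, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)  # arbitrary test distribution, variance 4

# Two algebraically equivalent computations of the (sample) variance:
var_direct = np.mean((x - x.mean())**2)      # E[(X - <x>)^2]
var_moments = np.mean(x**2) - x.mean()**2    # E[X^2] - E[X]^2
print(var_direct, var_moments)  # identical up to rounding, both close to 4
```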
• Uniform distribution
f (x) = constant
This is the simplest PDF, and we note that the domain must be limited to some finite
subset of R or the integral will not converge. Despite its simplicity, it is still widely used.
• Exponential distribution
f(x) = λe^(−λx), x ≥ 0
The exponential distribution models the time between events in a Poisson process.
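As an aside, the exponential CDF F(x) = 1 − e^(−λx) makes inverse-transform sampling straightforward; a sketch (not part of the notes):

```python
import numpy as np

lam = 1.5
rng = np.random.default_rng(1)

# Inverse-transform sampling: if U ~ Uniform(0, 1), then X = -ln(1 - U)/lam
# has PDF f(x) = lam*exp(-lam*x) and CDF F(x) = 1 - exp(-lam*x)
u = rng.random(1_000_000)
x = -np.log1p(-u) / lam

print(x.mean())           # close to the mean 1/lam
print(np.mean(x <= 1.0))  # close to F(1) = 1 - exp(-lam)
```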
[Figure 18 panels: PDFs (left) and CDFs (right) of the uniform, exponential, Cauchy (Lorentz), and Pareto distributions, each shown for several parameter values.]
Figure 18: Commonly encountered probability density functions (PDFs) and cumulative distri-
bution functions (CDFs). The equations are given in the main text.
• Gaussian distribution
f(x) = (1/(σ√(2π))) exp[−(x − µ)²/(2σ²)]
The Gaussian (or normal) distribution is the most widely used probability distribution,
and has many applications across various fields.
• Cauchy (Lorentz) distribution
f(x) = 1/{πb[1 + ((x − a)/b)²]}
The Cauchy (or Lorentz) distribution is a heavy-tailed distribution with location parameter
a and scale parameter b. Its uses include:
– In spectral analysis, to model the line shapes of spectral peaks, such as those ob-
served in emission spectroscopy, nuclear magnetic resonance (NMR), and Raman
spectroscopy.
– In robust statistics, as a model for outliers, as it is less sensitive to outliers than the
normal distribution.
– In signal processing, to model resonances in signals, such as those observed in vibra-
tion analysis.
– In fluid mechanics, to model the turbulence in fluid flow.
• Pareto distribution
f(x) = αk^α / x^(α+1), x ≥ k
The Pareto distribution is another heavy-tailed distribution, popular in the social sciences.
Its uses include:
3.3 Joint probability distributions
The discussions above can be extended to multiple random variables, which may or may not
be independent. Consider the case of two random variables X and Y , described by the joint
probability density function fXY (x, y). The two variables are said to be independent if we can
factorize their joint PDF: fXY (x, y) = gX (x)hY (y). Otherwise, the variables are interdependent,
or coupled. Similarly to Eq. (64), we can write the probability for the variables to fall within
some infinitesimal range as
dP (X, Y ) = fXY (x, y) dx dy, (72)
which, again, must be normalized (or at least normalizable) over the space of all possible out-
comes S:
∬_S fXY (x, y) dx dy = 1. (73)
The probability of an event is then simply a double integral over the 2D domain D of desired
outcomes of the random variables X and Y :
P((X, Y) ∈ D) = ∬_D fXY (x, y) dx dy. (74)
Example 8. Let fXY (x, y) = x + cy² be a joint probability density function of two random
variables X and Y , defined over the support x ∈ [0, 1] and y ∈ [0, 1]. Find the appropriate
normalization constant c, and determine the probability that both X and Y are less than,
or equal to, 1/2.
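Example 8 can be checked symbolically; this sympy sketch normalizes the joint PDF via Eq. (73) and then evaluates the requested probability via Eq. (74):

```python
import sympy as sp

x, y, c = sp.symbols('x y c')
f = x + c*y**2

# Normalization over the support [0,1] x [0,1], Eq. (73), fixes c
norm = sp.integrate(f, (x, 0, 1), (y, 0, 1))
c_val = sp.solve(sp.Eq(norm, 1), c)[0]
print(c_val)  # -> 3/2

# P(X <= 1/2, Y <= 1/2) as a double integral over the corner square, Eq. (74)
P = sp.integrate(f.subs(c, c_val),
                 (x, 0, sp.Rational(1, 2)), (y, 0, sp.Rational(1, 2)))
print(P)  # -> 3/32
```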
An important point to highlight here is the need to renormalize this distribution. This is because
the overall joint PDF is normalized assuming all values of Y are accessible. As we can see, the
normalization needed is provided by the marginal distribution for Y = y0 .
Example 9. Find the marginal PDFs fX (x) and fY (y) for the joint PDF given in Example 8,
fXY (x, y) = x + (3/2)y². Find the probability that Y > 1/2 if X = 1/3.
3.5 The Dirac Delta
The Dirac delta δ is not an ordinary function but a distribution, defined by its action under the integral sign:
∫_{−∞}^{+∞} f(x)δ(x) dx = f(0) (76)
for all continuous functions f on R. We will not address the details of the meaning of the symbol
∫ in Eq. (76), but note that it cannot be the Riemann integral given by Eq. (17) (and sketched
in Fig. 5).
While it is tempting to consider δ(x) as a function, i.e., a map between reals, this would be
incorrect and can lead to contradictions. From the definition we see that for the particular case
of f(x) = 1 we must have
∫_{−∞}^{+∞} δ(x) dx = 1. (77)
This fits with our concept of continuous probability distributions: the integrand must be eval-
uated over some finite range rather than in a point. In fact, by definition this integral of the
delta is always 1 for any interval containing the point 0, and 0 otherwise.
The δ can be shifted around the real axis generalizing Eq. (76) to a generic point x0 , where
for convenience we use the standard notation for functions:
∫_{−∞}^{+∞} f(x)δ(x − x0) dx = f(x0). (78)
We can also define the derivative of a distribution, here of the Dirac delta, as
∫_{−∞}^{+∞} f(x)δ′(x) dx = −f′(0), (79)
which can be shown to have all the required properties. The expression above can be understood
in terms of integration by parts, i.e.
∫_{−∞}^{+∞} f(x)δ′(x) dx = [δ(x)f(x)]_{−∞}^{+∞} − ∫_{−∞}^{+∞} f′(x)δ(x) dx = −f′(0). (80)
This is, of course, a terrible abuse of notation, and the term with the delta outside of the
integral should give you indigestion. Nevertheless, it is a useful aide-mémoire, even if not a
rigorous proof. It is important to note that these integrals make sense whether δ is differentiable
or not (it is not, in the normal sense that we have used so far). All that is needed is that the
function f is differentiable. By setting f (x) = 1 we then find that
∫_{−∞}^{+∞} δ′(x) dx = 0. (81)
One of the many uses in physics of the Dirac delta is that it allows us to move between
discrete and continuous variables and distribution functions with ease. Let X be a discrete
random variable which can take values x1 , x2 , ..., xn with related probabilities p1 , p2 , ..., pn . Then
X can be described as a continuous variable having a PDF given by:
f(x) = Σ_{i=1}^{n} p_i δ(x − x_i). (82)
Using the PDF to evaluate the probability of X lying in some interval [a, b] yields
P(a < X ≤ b) = ∫_a^b f(x) dx = Σ_{i=1}^{n} p_i ∫_a^b δ(x − x_i) dx = Σ_{i | x_i ∈ [a,b]} p_i, (83)
as expected. Note that this can considerably simplify dealing with discrete quantities, which are
almost always harder to treat than continuous functions.
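A common numerical picture of the delta is a narrow normalized Gaussian; this sketch (an illustration, not a rigorous definition) checks the sifting property of Eq. (78):

```python
import numpy as np

def delta_approx(x, x0, eps=1e-3):
    # Narrow normalized Gaussian standing in for delta(x - x0)
    return np.exp(-(x - x0)**2 / (2 * eps**2)) / np.sqrt(2 * np.pi * eps**2)

x = np.linspace(-5.0, 5.0, 2_000_001)
x0 = 1.0
g = np.cos(x) * delta_approx(x, x0)

# Sifting property, Eq. (78): the integral approaches f(x0) = cos(1) as eps -> 0
val = np.sum((g[1:] + g[:-1]) / 2 * np.diff(x))  # trapezoidal rule
print(val, np.cos(x0))  # the two agree closely
```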
4 Curves and vector functions
We can also write the function explicitly in vector form as r(t) = x(t)î + y(t)ĵ + z(t)k̂.
The curve r written explicitly as a function of some parameter (here t) is called the parametrized
form of the curve. The same curve can also be given in non-parametrized version, purely as a
function of the coordinate variables (x, y, z).
Example 10. Let the curve c be a circle in 2D, centred at the origin, of radius r. Write
the curve in parametric and non-parametric form.
Figure 20: Examples of curves that are closed, open, simple, and not simple.
To find the non-parametric form, we must eliminate t from the equations above. The simplest
way to do this is to square x and y, and add them together, to find explicitly the relationship
between x and y: x2 + y 2 = r2 .
(d/dt) r(t)|_{t=t0} ≡ r′(t0) = lim_{h→0} [r(t0 + h) − r(t0)]/h. (84)
The limit of a vector quantity is straightforward to compute: it is simply the limit of each
component of the vector. This follows directly from the definition of a vector space. Hence
we see that the derivative of a vector function is the vector containing the derivatives of its
components:
r′(t) = (r1′(t), r2′(t), · · · , rn′(t)). (85)
A curve r is said to be regular if r′(t) ≠ 0 for all t ∈ I, and singular otherwise. From Eq. (84)
we also see how we can define the differential of a vector:
dr = (dr/dt) dt. (86)
So far we have only considered vectors written in the Cartesian basis, where the basis vectors
do not change from one point to another. This is not the case for polar coordinates, and in this
Figure 21: Graphical representation of the derivative of a vector r(t), taken in the limit of
infinitesimal increment h of the parameter t.
case we cannot assume that the basis vectors remain constant. To see this, recall the definition
of the basis vectors in plane polar coordinates from Eq. (47):
r̂ = cos θ î + sin θ ĵ
θ̂ = − sin θ î + cos θ ĵ.
By writing them in the Cartesian basis we can find their derivatives explicitly:
dr̂/dt = −sin θ (dθ/dt) î + cos θ (dθ/dt) ĵ = (dθ/dt) θ̂
dθ̂/dt = −cos θ (dθ/dt) î − sin θ (dθ/dt) ĵ = −(dθ/dt) r̂.
So some care is needed in general curvilinear coordinate systems, but the standard results and
the chain rule all continue to work as expected.
Example 11. Use the vector differential of Eq. (86) to derive the line element (ds)² in
spherical polar coordinates.
Solution: We make use of the definition of a vector differential from Eq. (86), for the transfor-
mation (x, y, z) → (r, θ, φ):
x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ.
Because r = r(r, θ, φ) depends on three variables, we have:
(ds)² = |dr|² = |∂r/∂r|² (dr)² + |∂r/∂θ|² (dθ)² + |∂r/∂φ|² (dφ)² =
= |(sin θ cos φ, sin θ sin φ, cos θ)|² (dr)² + |(r cos θ cos φ, r cos θ sin φ, −r sin θ)|² (dθ)² + |(−r sin θ sin φ, r sin θ cos φ, 0)|² (dφ)² =
= (dr)² + r²(dθ)² + r² sin²θ (dφ)².
We recover the result for spherical polar coordinates of Eq. (60), found using a geometric ap-
proach.
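The same computation can be automated: for orthogonal coordinates, each metric coefficient is |∂r/∂qᵢ|². A sympy sketch:

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
pos = sp.Matrix([r*sp.sin(th)*sp.cos(ph),
                 r*sp.sin(th)*sp.sin(ph),
                 r*sp.cos(th)])

# For orthogonal coordinates the cross terms in dr . dr cancel, leaving
# (ds)^2 = sum_i |d(pos)/dq_i|^2 (dq_i)^2
for q in (r, th, ph):
    print(q, sp.simplify(pos.diff(q).dot(pos.diff(q))))
# prints 1, r**2, and r**2*sin(theta)**2, reproducing Eq. (60)
```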
With this definition we can recover the fundamental theorem of calculus for vector functions:
Theorem 2. Let r : [a, b] → Rn be a differentiable map with continuous first derivatives r0 .
Then,
∫_a^b r′(t) dt = r(b) − r(a). (88)
Clearly, for regular closed curves this integral is always zero.
If instead of taking the differential dr we consider the derivative dr/dt, this relationship can be
rewritten as
(ds/dt)² = (dr/dt) · (dr/dt) = (dx/dt)² + (dy/dt)² + (dz/dt)². (91)
In particular, we can write
ds/dt = |dr(t)/dt| = |r′(t)| = √[(dx/dt)² + (dy/dt)² + (dz/dt)²]. (92)
Figure 22: Calculation of the length of a curve via segments.
The arc length given by the integral in Eq. (89) can then be rewritten as
s = ∫_a^b |r′(t)| dt = ∫_a^b √[(dx/dt)² + (dy/dt)² + (dz/dt)²] dt. (93)
If the curve is provided in non-parametric form, then we need to find how ds varies with dx,
dy or dz:
ds = √[(dx)² + (dy)² + (dz)²] =
= √[1 + (dy/dx)² + (dz/dx)²] dx
= √[1 + (dx/dy)² + (dz/dy)²] dy
= √[1 + (dx/dz)² + (dy/dz)²] dz.
Which one of the last three expressions we choose to substitute into Eq. (89) will largely depend
on the functional expressions of the curve, and on the interdependences between them, as some
derivatives may be easier to integrate than others.
These expressions yield a particularly simple, yet useful, case for curves r that can be written
as a 1D function in the (x, y) plane by an equation of the form y = f (x). The arc length can
then be written directly as a function of f , or of its inverse f⁻¹, if it exists, as:
s = ∫_{ax}^{bx} √[1 + f′(x)²] dx = ∫_{ay}^{by} √[1 + ((f⁻¹)′(y))²] dy. (94)
Figure 23: Unit vectors defining a orthonormal coordinate system on a curve in space. The
derivatives of these three vectors with respect to the arc length along the curve s are given by
the Frenet-Serret formulae.
Example 12. Find the arc length of a cylindrical helix given by the parametric equations
x = R cos t,
y = R sin t,
z = αt,
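The example statement does not fix a parameter range, so assuming one full turn t ∈ [0, 2π] and using Eq. (93), |r′(t)| = √(R² + α²) is constant and the arc length is 2π√(R² + α²). A numerical sketch with arbitrary values of R and α:

```python
import numpy as np

R, alpha = 2.0, 0.5  # arbitrary helix parameters

# One full turn, t in [0, 2*pi]; |r'(t)| = sqrt(R^2 + alpha^2) is constant
t = np.linspace(0.0, 2*np.pi, 200_001)
dx, dy, dz = -R*np.sin(t), R*np.cos(t), np.full_like(t, alpha)
speed = np.sqrt(dx**2 + dy**2 + dz**2)

# Arc length from Eq. (93), evaluated by the trapezoidal rule
s = np.sum((speed[1:] + speed[:-1]) / 2 * np.diff(t))
print(s, 2*np.pi*np.sqrt(R**2 + alpha**2))  # the two agree
```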
For a curve with |r(t)|² = const we have
(d/dt)|r(t)|² = 0 = 2 r′(t) · r(t) ⇒ r′ ⊥ r.
From Eq. (92) we have that ds = |r0 (t)| dt. The size of the tangent |r0 (t)| assumes the role
of a speed: it tells us how quickly distance mapped out by s changes with t. Clearly, if we
chose our parametrization t to be the arc length s, then ds = |r0 (s)| ds, and the tangent in this
parametrization must have a modulus of 1 for all s: r0 (s) is a unit tangent vector to the curve,
which we shall call t̂ .
The rate at which the unit tangent t̂ changes with respect to s is given by dt̂/ ds, and its
magnitude is defined as the curvature κ of the curve at a given point
κ = |dt̂/ds| = |d²r/ds²|. (95)
Note that for a straight line, κ = 0. The inverse of the curvature, the radius of curvature
ρ = 1/κ, is often the more useful quantity to quantify the “curvedness” of a curve, as shown in
Fig. 23. Because t̂ is a unit vector, its derivative (of size κ) is orthogonal to it. The unit vector
in this direction is called the principal normal n̂:
dt̂/ds = κ n̂. (96)
Finally, we can define a third unit vector, the binormal to the curve, as b̂ = t̂ × n̂. Since b̂ · t̂ = 0
we can write
(d/ds)[b̂ · t̂] = (db̂/ds) · t̂ + b̂ · (dt̂/ds) =
= (db̂/ds) · t̂ + b̂ · κn̂ = 0.
Since b̂ ⊥ n̂, the last term is zero, so the first term must be zero as well. The vector db̂/ds is thus
⊥ to both t̂ and b̂ (because |b̂| = 1), and must therefore point in the direction of n̂. To state
this explicitly, we write
db̂/ds = −τ n̂, (97)
where τ is called the torsion of the curve, and the negative sign is a matter of convention.
To find dn̂/ds we note that this vector must be ⊥ to n̂. It can thus be written as some linear
combination of the vectors t̂ and b̂:
dn̂/ds = α t̂ + β b̂. (98)
Taking the scalar product of this expression with t̂ we can see that α = (dn̂/ds) · t̂. We then note
that since n̂ · t̂ = 0,
(dn̂/ds) · t̂ = −n̂ · (dt̂/ds).
Using Eq. (96) we can conclude that α = −κ. Similarly, taking the scalar product of Eq. (98)
with b̂ we find β = (dn̂/ds) · b̂, and since n̂ · b̂ = 0, we again have that
(dn̂/ds) · b̂ = −n̂ · (db̂/ds),
from which, using Eq. (97), we conclude that β = τ . Therefore, Eq. (98) can be rewritten as
dn̂/ds = −κ t̂ + τ b̂. (99)
The trio (t̂, n̂, b̂) form a right-handed orthonormal coordinate system at any given point on
the curve, but note that this system changes as one moves along the curve. They are sketched
together in Fig. 23. The equations relating the vectors of the Frenet frame and their derivatives
with respect to the arc length along the curve s are given by the famous Frenet-Serret formulæ:
dt̂/ds = κ n̂,  dn̂/ds = −κ t̂ + τ b̂,  db̂/ds = −τ n̂.
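For the helix of Example 12, the curvature and torsion can be computed symbolically using the standard formulas for a generic parametrization, κ = |r′ × r″|/|r′|³ and τ = (r′ × r″) · r‴/|r′ × r″|² (standard results, not derived in these notes):

```python
import sympy as sp

t, R, a = sp.symbols('t R alpha', positive=True)
r = sp.Matrix([R*sp.cos(t), R*sp.sin(t), a*t])  # the helix of Example 12

r1, r2, r3 = r.diff(t), r.diff(t, 2), r.diff(t, 3)
c = r1.cross(r2)

# kappa = |r' x r''| / |r'|^3,  tau = (r' x r'') . r''' / |r' x r''|^2
kappa = sp.simplify(sp.sqrt(c.dot(c)) / sp.sqrt(r1.dot(r1))**3)
tau = sp.simplify(c.dot(r3) / c.dot(c))
print(kappa, tau)  # constants: R/(R**2 + alpha**2) and alpha/(R**2 + alpha**2)
```

Both are constant along the curve, as expected for a helix.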
In practice, parametrizing the curve as a function of the arc length can be quite challenging,
but fortunately we can write the same integral for a generic parametrization r = r(t), t ∈ [a, b]
using a change of variables transformation
s = s(t), ds = |r′(t)| dt,
so that
m = ∫_a^b ρ(s(t)) |r′(t)| dt = ∫_a^b ρ̃(t) |r′(t)| dt. (102)
a a
Definition 12. Let r : [a, b] → Rm be a regular curve with trace γ and let f be a real-valued
function defined on a subset of Rm which contains γ (f : A ⊂ Rm → R, γ ⊂ A). The line (or
curvilinear) integral of f along γ is the integral
∫_γ f ds ≡ ∫_a^b f[r(t)] |r′(t)| dt. (103)
This is the natural generalization of the standard 1-dimensional integral to a generic curve in
space. The challenge is typically not so much in computing the integral, but rather in finding a
suitable parametrization for the curve γ.
Example 13. Evaluate the line integral
∫_γ x y⁴ ds,
where γ is the right half of the circle x² + y² = 16.
Solution: The curve can be parametrized as
x = 4 cos θ, y = 4 sin θ, θ ∈ [−π/2, π/2],
and therefore dx = −4 sin θ dθ and dy = 4 cos θ dθ. We can use Eq. (92) to find ds:
ds = √[(dx/dθ)² + (dy/dθ)²] dθ = √[16 sin²θ + 16 cos²θ] dθ = 4 dθ.
The integral then becomes
∫_γ x y⁴ ds = ∫_{−π/2}^{π/2} 4 cos θ (4 sin θ)⁴ · 4 dθ = 4096 [sin⁵θ/5]_{−π/2}^{π/2} = 8192/5.
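With ds = 4 dθ the integral reduces to ∫ 4 cos θ (4 sin θ)⁴ · 4 dθ over θ ∈ [−π/2, π/2], which can be checked numerically:

```python
import numpy as np

# Parametrized integrand: x = 4 cos(theta), y = 4 sin(theta), ds = 4 dtheta
t = np.linspace(-np.pi/2, np.pi/2, 200_001)
g = 4*np.cos(t) * (4*np.sin(t))**4 * 4

I = np.sum((g[1:] + g[:-1]) / 2 * np.diff(t))  # trapezoidal rule
print(I)  # close to 8192/5 = 1638.4
```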
where f (x, y, z) is a scalar function or field. For basis vectors that do not change with the
coordinates we can take them out of the integral and write the result as a vector with the
integrals ∫_γ f(x, y, z) dxi as its components.
If instead of integrating a scalar function along some curve γ we integrate a vector function,
then we will find integrals of the following form:
∫_γ A(r) · dr = ∫_γ Ax(x, y, z) dx + ∫_γ Ay(x, y, z) dy + ∫_γ Az(x, y, z) dz, (105)
where A(r) = (Ax , Ay , Az ) is the vector field, and the dot indicates the scalar product, so the
outcome of the integral must now be a scalar. Integrals of this form are common in physics, and
are used, for example, to find the total work done, W , by a force F moving along some path γ:
W = ∫_γ F · dr. (106)
Example 14. Evaluate the line integral
ˆ
I = A(r) · dr,
γ
where γ is the line segment connecting points (0, 2) and (1, 4), and A = (yx2 , sin πy).
Solution: We start by parametrizing the curve γ. Taking (0, 2) as the point on the line, and
noting it has a slope of 2, we can write the line equation as
r(t) = (0, 2) + t (1, 2), t ∈ [0, 1].
From this we find the most convenient parametrization setting x = t, and y = 2t + 2, which
implies dx = dt, and dy = 2 dt. We can now rewrite the integral in a more tractable form:
I = ∫_γ (yx² dx + sin πy dy) =
= ∫_0^1 (2t + 2)t² dt + ∫_0^1 sin[π(2t + 2)] 2 dt =
= [t⁴/2 + (2/3)t³]_0^1 − [2 cos[π(2t + 2)]/(2π)]_0^1 = 7/6.
where a and b indicate the start and end points of the curve γ, as illustrated in Fig. 24. The
integral is path-independent, because we are integrating an exact (also called perfect) differential.
A consequence of this is that every closed line integral of such a vector field must necessarily be
zero, since closed integrals have the same starting and ending point:
∮_γ A(r) · dr = 0, if A(r) = ∇φ(r). (110)
Figure 24: Three different paths taken between points a and b. If the vector field is conservative,
then the line integral of the vector field will not depend on the path taken, but only on the points
a and b.
Importantly, the converse statement also holds. If A is such that its integral vanishes around
every closed curve in some region, then there exists a scalar function φ such that A = ∇φ. In
physics, the line integral of a vector field around a closed curve is often also called the circulation
of the field.
There is a small but important caveat to the statement above: it holds provided the region
R in which we select our curves γ is simply connected: we need to be able to continuously
transform every path between the two points into any other such path in R without leaving
R. An alternative way to say this is to require that every simple closed curve in R can be
continuously shrunk to a point. In practice this means the domain must not have holes: i.e.,
regions for which A(r) and its derivatives are ill-defined. If the region R contains a hole then
there exist simple curves that cannot be shrunk to a point without leaving R. Such a domain
has two boundaries (the outer boundary, and the boundary around the hole), and is thus called
doubly-connected.
Definition 13. A plane region R is said to be simply connected if every simple closed curve
within R can be continuously shrunk to a point without leaving the region. A region with n holes
is said to be (n + 1)-fold connected, or multiply connected.
In practice it is much simpler to identify exact differential forms than it is to prove general
path independence of some vector field. To see this, consider the integrand in Eq. (108), which
must be an exact differential, in terms of the components of A:
(∂φ/∂x) dx + (∂φ/∂y) dy + (∂φ/∂z) dz = Ax dx + Ay dy + Az dz. (111)
Now note that the second mixed partial derivatives of φ must be independent of the order in
which the derivatives are taken
∂²φ/∂x∂y = ∂²φ/∂y∂x, ∂²φ/∂x∂z = ∂²φ/∂z∂x, ∂²φ/∂y∂z = ∂²φ/∂z∂y. (112)
Written in terms of A = (Ax , Ay , Az ), these expressions provide a useful test for whether a
generic vector field is conservative – the following relations must hold for its components:
∂Ax/∂y = ∂Ay/∂x, ∂Ax/∂z = ∂Az/∂x, ∂Ay/∂z = ∂Az/∂y. (113)
Example 15. Determine if A represents a conservative vector field for the following cases: (a) A = f(r) r, and (b) A = f(r) c, where c is a constant vector and r = |r|.
Solution: We need to check that the relations in Eq. (113) hold for the vector fields provided.
Starting with case (a), we have
∂Ax/∂y = ∂(xf(r))/∂y = x f′(r) ∂r/∂y = f′(r) xy/r
∂Ay/∂x = ∂(yf(r))/∂x = y f′(r) ∂r/∂x = f′(r) yx/r.
These are the same; proceeding similarly we can find the other two equalities hold, hence dφ =
f(r)(x dx + y dy + z dz) is an exact differential. The associated scalar potential φ can only be
given in integral form, since we have no information on the functional form of f. It is simply
φ = ∫ r f(r) dr.
For case (b) the calculation is very similar. Writing c = (cx , cy , cz ) we have
∂Ax/∂y = ∂(cx f(r))/∂y = cx f′(r) ∂r/∂y = cx f′(r) y/r
∂Ay/∂x = ∂(cy f(r))/∂x = cy f′(r) ∂r/∂x = cy f′(r) x/r.
These clearly differ, and so we conclude that f (r)(cx dx + cy dy + cz dz) cannot be an exact
differential.
Example 16. The gravitational force acting on a particle of mass m near the Earth’s
surface can be represented by the vector F = (0, 0, −mg), where the z-axis points vertically
upwards, and g is some constant. Find a general expression for the work done by the force
as the particle is displaced from height h1 to h2 , and show that the work done is path
independent.
Solution: We start by checking if the force is conservative, i.e., if the vector field can be
represented as the gradient of some scalar field. The conditions of Eq. (113) are trivially verified,
since F is a constant vector and all the partial derivatives of its components are 0. The differential
is thus exact, and given by
dφ(x, y, z) = F · dr = −mg dz.
We recognize the scalar field φ as minus the potential energy. Integrating both sides between the initial
and final points, with corresponding heights h1 to h2 , we find the work done
W = ∫_{r1}^{r2} F · dr = ∫_{r1}^{r2} dφ(r) = −mg ∫_{h1}^{h2} dz =
= φ(h2) − φ(h1) = −mg(h2 − h1).
This is obviously path-independent, since it is given by the difference in values of φ(z) calculated
in two different points.
Figure 25: A domain R of the xy-plane bounded by a curve γ, used in Green’s theorem. The
domain is simple, and bounded by the box x ∈ [a, b] and y ∈ [c, d].
Theorem 3. Let P (x, y) and Q(x, y) be two continuous functions with continuous partial deriva-
tives, well-defined inside and on the boundary γ of a simply connected region R in the xy-plane.
Then the following relationship holds
∮_γ (P dx + Q dy) = ∬_R (∂Q/∂x − ∂P/∂y) dx dy. (114)
Example 17. Use Green’s theorem to calculate the area of an ellipse centred at the origin
with semi-axes a and b.
Solution: Let R be the region in the xy-plane bounded by the closed curve describing the
ellipse in question. The area A of the ellipse will be given by the double integral A = ∬_R dx dy.
We must therefore choose the integrand of the right-hand side of Eq. (114) such that ∂Q/∂x − ∂P/∂y = 1,
which restricts our choice of P and Q. There are many ways to satisfy this requirement; for
example, if we set P = 0, then ∂Q/∂x = 1, and Q = x. Inserting these into Green's theorem:
A = ∮_γ (P dx + Q dy) = ∮_γ x dy.
Parametrizing the ellipse as x = a cos t, y = b sin t, t ∈ [0, 2π], so that dy = b cos t dt, this
gives A = ab ∫_0^{2π} cos²t dt = πab.
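The contour integral ∮ x dy can be evaluated numerically with the standard parametrization x = a cos t, y = b sin t (a sketch; the values of a and b are arbitrary):

```python
import numpy as np

a, b = 3.0, 2.0  # arbitrary semi-axes

# Parametrize the ellipse as x = a cos(t), y = b sin(t), t in [0, 2*pi],
# so the contour integral of x dy becomes a*b times the integral of cos^2(t)
t = np.linspace(0.0, 2*np.pi, 200_001)
g = a*np.cos(t) * b*np.cos(t)

A = np.sum((g[1:] + g[:-1]) / 2 * np.diff(t))  # trapezoidal rule
print(A, np.pi*a*b)  # the two agree
```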
Example 18. Evaluate the integral ∮_γ A · dr, where A = (2xy, x²), and γ is the triangle
connecting the points (0,0), (1,0) and (0,2).
Solution: We start by solving this problem by performing the integral along the line. We have
∮_γ A · dr = ∮_γ (2xy dx + x² dy)
= ∫_{(0,0)}^{(1,0)} (2xy dx + x² dy) + ∫_{(1,0)}^{(0,2)} (2xy dx + x² dy) + ∫_{(0,2)}^{(0,0)} (2xy dx + x² dy)
= 0 + ∫_1^0 [2x(−2x + 2) + x²(−2)] dx + 0 = [−2x³ + 2x²]_1^0 = 0,
where for the middle segment connecting points (1,0) and (0,2) we used the parametrization
y = −2x + 2, dy = −2 dx.
Another way to proceed is to use Green’s theorem. We identify P = 2xy, and Q = x2 , both
well-behaved differentiable functions on the xy-plane. Their derivatives are
∂P/∂y = 2x, ∂Q/∂x = 2x.
As these are the same, from Green’s theorem we can immediately see that the integral must
be 0, irrespective of the path taken. So not only is the integral along the triangle 0, all closed
integrals will integrate to 0. We learn more by doing less.
The observed equality of mixed derivatives above is of course nothing but the known result
for an exact differential, and so it should be possible to write the vector field provided as the
gradient of some scalar function φ, such that A = ∇φ. To find φ we note that
$$P = \frac{\partial \phi}{\partial x} \;\Rightarrow\; \phi = \int 2xy\,dx = x^2 y + C(y),$$
$$Q = \frac{\partial \phi}{\partial y} \;\Rightarrow\; \phi = \int x^2\,dy = x^2 y + C(x).$$
Setting C(x) = C(y) = 0 provides the scalar function sought. Its existence means that the
closed line integral must be zero, as found previously.
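This prediction is straightforward to confirm numerically. The sketch below (illustrative names and a simple midpoint-rule discretization) integrates A · dr along the three sides of the triangle and finds a total of zero:

```python
# Numerical check that the closed line integral of A = (2xy, x^2) around
# the triangle (0,0) -> (1,0) -> (0,2) -> (0,0) vanishes.
def line_integral(segments, n=20_000):
    total = 0.0
    for (x0, y0), (x1, y1) in segments:
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for i in range(n):
            s = (i + 0.5) / n                       # midpoint of sub-segment
            x, y = x0 + s * (x1 - x0), y0 + s * (y1 - y0)
            total += 2 * x * y * dx + x * x * dy    # A . dr
    return total

triangle = [((0, 0), (1, 0)), ((1, 0), (0, 2)), ((0, 2), (0, 0))]
closed_integral = line_integral(triangle)           # expected: 0
```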
5 Surfaces
Figure 26: An example surface z = f(x, y), shown in 3D alongside its contour plot in the xy-plane.
In this course we are interested in studying functions of multiple variables, alongside their derivatives and integrals, and surfaces are among the most common and useful examples of such functions. Recall that a general function of multiple variables is given by the map
f : D ⊂ Rn → R.
In 2D such a map, if continuous, represents a surface and is commonly written as f (x, y) = z.
The surface itself can be thought of as an object with two degrees of freedom embedded in 3D
space, and can then be described by a 3D vector with components
r = (x, y, z = f (x, y)).
This expression suggests a convenient general parametric way of describing surfaces along the
same lines used to define curves:
r(u, v) = (x(u, v), y(u, v), z(u, v)). (115)
Here r is a vector function of the parameters (u, v) which vary within a certain domain D in the
parametric uv-plane.
Example 19. Find a parametric representation for the surface of a sphere of radius R centred at the origin.
Solution: The convenient parameter choice here is to use spherical polar coordinates, where
(u, v) = (θ, φ) are the polar and azimuthal angles. We know that
x = R sin θ cos φ
y = R sin θ sin φ
z = R cos θ,
and so the parametric form of the surface of a sphere is simply
r(θ, φ) = R(sin θ cos φ, sin θ sin φ, cos θ). (116)
How should we extend this to multiple dimensions? The first difficulty is in the definition of
the increment h: in multiple dimensions the increment will necessarily become a vector, as
it will have an orientation depending on what directions we want to move in to evaluate the
change in the function. However, simply replacing h by a vector h will not work as we cannot
divide by a vector in the definition of a limit in Eq. (117). The solution will be to consider,
instead, directional derivatives. In particular, for a function in n dimensions f (x), where x =
(x1 , x2 , ..., xi , ..., xn ), we fix n − 1 dimensions and vary only one, xi , at a time. Along this single
dimension the function is one-dimensional, and the known results for taking derivatives apply.
However, to keep in mind that the function is free to vary in multiple dimensions, we will call
the derivative of the function f along the single dimension xi the partial derivative of f with
respect to xi , denoted by
$$\frac{\partial f(\mathbf{x})}{\partial x_i} \equiv \partial_i f(\mathbf{x}) \equiv f_{x_i}.$$
Clearly, for a function in n dimensions we will have n partial derivatives. These form a vector
space, and it will be convenient to write them explicitly in vector form. We will call this the
gradient of the function f , calculated in some point x:
$$\operatorname{grad} f(\mathbf{x}) := \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right) = \nabla f(\mathbf{x}). \quad (118)$$
The last equality introduces a new vector operator sufficiently important to merit its own new
symbol, ∇, called nabla:
$$\nabla := \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \ldots, \frac{\partial}{\partial x_n} \right). \quad (119)$$
This is an n-dimensional vector operator that takes a (multidimensional) scalar function as an
input, and returns a vector containing all the partial derivatives of the scalar function. Of
particular importance to this course is the gradient in two and three dimensions:
$$\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right), \qquad \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right). \quad (120)$$
Consider the variation of f along the straight line
$$\mathbf{x}(t) = \mathbf{x}_0 + t\hat{\mathbf{v}},$$
which along the line is now a 1D problem. Taking the limit t → 0 of the corresponding difference quotient defines the directional derivative of f along v̂:
$$\frac{\partial f}{\partial x_v} \equiv \lim_{t\to 0} \frac{f(\mathbf{x}_0 + t\hat{\mathbf{v}}) - f(\mathbf{x}_0)}{t}. \quad (121)$$
Clearly, the vector v̂ can be written as a linear combination of the basis vectors in the domain.
In the example shown in Fig. 27, this would correspond to some linear combination of the unit
vectors in the î and ĵ directions. It follows that the directional derivatives are linear combinations
of the partial derivatives in the standard basis, and we can thus relate the directional derivatives
to our gradient operator via the following theorem:
Figure 27: Variation of a function along some specific direction v̂.
Theorem. For a differentiable function f and a unit vector v̂,
$$\frac{\partial f}{\partial x_v} = \nabla f \cdot \hat{\mathbf{v}}.$$
The directional derivative is the scalar product of the gradient of the function and the direction vector. This relatively simple result leads to two important consequences. Firstly, the expression ∇f · v̂ is a scalar product of two vectors that will be maximized when ∇f ∥ v̂. The gradient of a function thus always points in the direction where the change in the function is maximal. In contrast, contour lines are defined as lines along which there is no change in the value of the function, i.e., where ∂_v f = 0. This occurs when ∇f ⊥ v̂, and the gradient ∇f is thus always perpendicular to the contour lines. The gradient of the function shown in Fig. 26 is given in Fig. 28 (red arrows), alongside the contour lines of the function to aid the eye. We see that the gradient is indeed perpendicular to the contour lines, and that the gradient always points in the direction of maximum ascent. As intuitively expected, the gradient goes to zero at the extremum points.
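The relation ∂_v f = ∇f · v̂ is easy to verify numerically for a concrete function. The sketch below (the choice f = x²y, the evaluation point, and the step size are illustrative) compares the limit definition of the directional derivative with the scalar product:

```python
import math

# Compare the limit definition of the directional derivative of
# f(x, y) = x^2 * y with the scalar product grad(f) . v_hat.
def f(x, y):
    return x * x * y

x0, y0 = 1.0, 2.0
theta = 0.7                                    # arbitrary direction
v = (math.cos(theta), math.sin(theta))         # unit vector v_hat

t = 1e-6
num = (f(x0 + t * v[0], y0 + t * v[1]) - f(x0, y0)) / t   # limit definition
grad = (2 * x0 * y0, x0 * x0)                  # analytic gradient (2xy, x^2)
ana = grad[0] * v[0] + grad[1] * v[1]          # grad f . v_hat
```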
Example 20. Find the unit normal vector to the surface z = x² + y² at the point (1, 1, 2).
Solution: Recall that the gradient of a scalar field is always perpendicular to the isosurfaces of
that field. As we have seen, for a 2D field these will be (contour) lines, but in higher dimensions
they will be (hyper-)surfaces representing points where the field has some constant value. We
therefore rewrite the surface above as an isosurface of a 3D scalar field
U (x, y, z) = x2 + y 2 − z = 0,
Figure 28: Gradient of the function shown in Fig. 26. The gradient is perpendicular to the
contour lines, and always points in the direction of maximum ascent.
and compute its 3D gradient:
$$\nabla U(x, y, z) = \left( \frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}, \frac{\partial U}{\partial z} \right) = (2x, 2y, -1).$$
This is a vector in 3D space normal to the surface in question. At the point (1, 1, 2) on the surface we have n = (2, 2, −1), and n̂ = (1/3)(2, 2, −1). The gradient thus allows us to quickly and
conveniently compute the normal vectors to surfaces and hypersurfaces by considering them as
isosurfaces of objects in higher dimensional space.
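The same normal can be obtained numerically from the gradient of U by finite differences. The sketch below (central differences with an illustrative step size) recovers n = (2, 2, −1) and n̂ = (2/3, 2/3, −1/3):

```python
import math

# Normal to the surface z = x^2 + y^2 at (1, 1, 2), computed as the
# gradient of U(x, y, z) = x^2 + y^2 - z by central differences.
def U(x, y, z):
    return x * x + y * y - z

def grad_U(x, y, z, h=1e-5):
    return (
        (U(x + h, y, z) - U(x - h, y, z)) / (2 * h),
        (U(x, y + h, z) - U(x, y - h, z)) / (2 * h),
        (U(x, y, z + h) - U(x, y, z - h)) / (2 * h),
    )

n = grad_U(1.0, 1.0, 2.0)                  # expect (2, 2, -1)
norm = math.sqrt(sum(c * c for c in n))    # expect 3
n_hat = tuple(c / norm for c in n)         # expect (2/3, 2/3, -1/3)
```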
Example 21. The 2D gravitational potential of a point mass M placed at the origin of the coordinate system is given by
$$V(\mathbf{r}) = -\frac{GM}{r},$$
where r = (x, y) is the position vector, r ≡ |r| = √(x² + y²) its modulus, and G and M are constants. Calculate the gradient of this potential.
Solution:
$$\nabla V(\mathbf{r}) = -\nabla \frac{GM}{r} = -GM \left( \frac{\partial}{\partial x} (x^2+y^2)^{-1/2},\; \frac{\partial}{\partial y} (x^2+y^2)^{-1/2} \right)$$
$$= -GM \left( \frac{-x}{(x^2+y^2)^{3/2}},\; \frac{-y}{(x^2+y^2)^{3/2}} \right) = GM \frac{\mathbf{r}}{r^3} = GM \frac{\hat{\mathbf{r}}}{r^2}.$$
Both the scalar potential V and its gradient (a radial vector field) are shown in Fig. 29. As
expected, the gradient is just the negative of the gravitational force.
Figure 29: Relationship between the scalar gravitational potential field and the vector field given by its gradient.
where x = (x, y, z) and u = (u, v, w), we can write the 3 × 3 matrix explicitly:
$$J = \begin{pmatrix} \partial x/\partial u & \partial y/\partial u & \partial z/\partial u \\ \partial x/\partial v & \partial y/\partial v & \partial z/\partial v \\ \partial x/\partial w & \partial y/\partial w & \partial z/\partial w \end{pmatrix}, \quad (125)$$
which we recognize as the transpose of the Jacobian matrix. This should come as no surprise, given that we are essentially performing basis transformations on a differential operator.
Example 22. Calculate the gradient of the function f (x, y) = 3(x2 + y 2 ) in both the
Cartesian and the plane polar coordinate basis.
Solution: In the Cartesian basis the gradient follows immediately:
$$\nabla f = (6x, 6y). \quad (126)$$
To repeat the calculation in plane polar coordinates we need ∇ expressed in (r, θ). The chain rule gives
$$\begin{pmatrix} \partial/\partial r \\ \partial/\partial \theta \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix}\begin{pmatrix} \partial/\partial x \\ \partial/\partial y \end{pmatrix}.$$
We can invert this matrix to find an explicit expression for the Cartesian derivative operators in terms of the polar ones:
$$\begin{pmatrix} \partial/\partial x \\ \partial/\partial y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\frac{1}{r}\sin\theta \\ \sin\theta & \frac{1}{r}\cos\theta \end{pmatrix}\begin{pmatrix} \partial/\partial r \\ \partial/\partial \theta \end{pmatrix},$$
and can write
$$\nabla = \begin{pmatrix} \partial/\partial x \\ \partial/\partial y \end{pmatrix} = \begin{pmatrix} \cos\theta\,\dfrac{\partial}{\partial r} - \dfrac{1}{r}\sin\theta\,\dfrac{\partial}{\partial \theta} \\[2ex] \sin\theta\,\dfrac{\partial}{\partial r} + \dfrac{1}{r}\cos\theta\,\dfrac{\partial}{\partial \theta} \end{pmatrix}. \quad (127)$$
In polar coordinates the function can be written as f = 3r2 , so that its partial derivatives with
respect to (r, θ) are ∂f /∂r = 6r and ∂f /∂θ = 0. Substituting these in the expression above
yields:
∇f = (6r cos θ, 6r sin θ),
which is equal to our first result, as given in Eq. (126). These expressions are all written in the
standard Cartesian basis, even if we have changed the variables. That is to say, calculating ∇f
using Eq. (127) is, explicitly in terms of the basis vectors:
$$\nabla f = \left( \cos\theta\,\frac{\partial f}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial f}{\partial \theta} \right)\hat{\mathbf{i}} + \left( \sin\theta\,\frac{\partial f}{\partial r} + \frac{1}{r}\cos\theta\,\frac{\partial f}{\partial \theta} \right)\hat{\mathbf{j}}$$
$$= \frac{\partial f}{\partial r}\left[ \hat{\mathbf{i}}\cos\theta + \hat{\mathbf{j}}\sin\theta \right] + \frac{1}{r}\frac{\partial f}{\partial \theta}\left[ -\hat{\mathbf{i}}\sin\theta + \hat{\mathbf{j}}\cos\theta \right].$$
Recalling Eq. (47) for the basis vectors r̂ and θ̂, we can rewrite this expression in the basis (r̂, θ̂):
$$\nabla f = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}} = 6r\,\hat{\mathbf{r}} = (6r, 0), \quad (128)$$
which is the gradient written in the basis of the plane polar coordinate system.
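The agreement between the two bases can be checked numerically at an arbitrary point. The sketch below (illustrative point values) compares the Cartesian gradient (6x, 6y) with the polar expression ∇f = r̂ ∂f/∂r + θ̂ (1/r) ∂f/∂θ for f = 3r²:

```python
import math

# Check that the polar-basis gradient of f = 3 r^2 agrees with the
# Cartesian gradient (6x, 6y) at an arbitrary point.
x, y = 0.8, -0.3
r, theta = math.hypot(x, y), math.atan2(y, x)

cart = (6 * x, 6 * y)                        # Cartesian result, Eq. (126)

df_dr, df_dtheta = 6 * r, 0.0                # f = 3 r^2 in polar coordinates
r_hat = (math.cos(theta), math.sin(theta))
theta_hat = (-math.sin(theta), math.cos(theta))
polar = (
    df_dr * r_hat[0] + (df_dtheta / r) * theta_hat[0],
    df_dr * r_hat[1] + (df_dtheta / r) * theta_hat[1],
)
```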
Figure 30: Bounded surface S and its projection A on the xy-plane.
prove challenging for more complex surfaces, and we often resort to the alternative: we project
the surface onto some convenient Cartesian coordinate plane and perform the integration there.
We proceed to show how these two approaches can be deployed in practice. Surface integrals
will typically come in four general forms. The two that produce scalar outcomes are
$$\iint_S \phi(\mathbf{r})\,dS, \qquad \iint_S \mathbf{A}(\mathbf{r})\cdot\hat{\mathbf{n}}\,dS. \quad (132)$$
Here n̂ denotes the normal vector to the surface, φ denotes a scalar function, and A a vector
function (or field).
It will be convenient to consider surfaces in terms of their vector areas, defined as dS = n̂ dS.
This is particularly convenient in the context of surface integrals of scalar and vector fields,
which are commonly used to describe physical quantities and processes. We proceed by finding
convenient expressions for the terms n̂, dS, and dS.
$$\mathbf{S} = \sum_i S_i\,\hat{\mathbf{n}}_i. \quad (135)$$
Taking the limit of this expression for infinitely small facets allows us to define the concept of a
vector area of a bounded differentiable surface
$$d\mathbf{S} = \hat{\mathbf{n}}\,dS \;\Rightarrow\; \mathbf{S} = \iint_S d\mathbf{S} \equiv \iint_S \hat{\mathbf{n}}\,dS. \quad (136)$$
It is worth noting from this expression that the vector and scalar areas are of the same size only
for a plane surface. For a curved surface, the modulus of the vector area will always be less than
the scalar area.
Consider a general surface S as shown in Fig. 30 that has some projection A onto the xy-
plane. An infinitesimal vector area in some point dS has a projection dA, which can be found
by taking the scalar product of the vector area and the unit normal to the xy-plane, k̂:
$$dA = \hat{\mathbf{k}}\cdot d\mathbf{S} = \hat{\mathbf{k}}\cdot\hat{\mathbf{n}}\,dS = |\cos\alpha|\,dS, \quad (137)$$
where α is the angle between k̂ and n̂. Clearly the entire surface S can be tiled in this way, and the total vector area is then given by integrating over the entire surface, S = ∬_S dS.
Example 23. Find the vector areas of a hemisphere and of a sphere of radius R.
Solution: Given the symmetry of the problem, we place the sphere in the centre of the coor-
dinate system and choose spherical polar coordinates to represent the surface of the sphere as
(see Example 19):
r(θ, φ) = R(sin θ cos φ, sin θ sin φ, cos θ).
We know this vector is normal to the surface of the sphere, so the unit normal is given simply
by the same vector, but normalized:
n̂(θ, φ) = (sin θ cos φ, sin θ sin φ, cos θ).
Taking the hemisphere to be the surface for positive z, we can calculate the total vector area as
$$\mathbf{S} = \iint_S \hat{\mathbf{n}}\,dS = \int_0^{\pi/2}\!\!\int_0^{2\pi} \begin{pmatrix} \sin\theta\cos\phi \\ \sin\theta\sin\phi \\ \cos\theta \end{pmatrix} R^2\sin\theta\,d\phi\,d\theta = \begin{pmatrix} 0 \\ 0 \\ \pi R^2 \end{pmatrix}.$$
As expected, the vector only has a component along the z-axis, and the size of the vector area
is the projected area – the circle – in the xy-plane. To find the vector area of the entire sphere
we simply need to change the limits of the polar integral from ∫₀^{π/2} to ∫₀^{π}, which leads to
S = 0. This is a general result: the vector area of any closed surface is the 0 vector.
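Both results can be confirmed with a direct numerical sum over the spherical surface element. The sketch below (illustrative grid resolution) recovers (0, 0, πR²) for the hemisphere and the zero vector for the full sphere:

```python
import math

# Vector area S = ∬ n_hat dS for a sphere of radius R: integrating the
# polar angle over [0, pi/2] (hemisphere) gives (0, 0, pi R^2); over
# [0, pi] (full sphere) it gives the zero vector.
def vector_area(R, theta_max, n=400):
    Sx = Sy = Sz = 0.0
    dth, dph = theta_max / n, 2 * math.pi / n
    for i in range(n):
        th = (i + 0.5) * dth
        w = R * R * math.sin(th) * dth * dph    # dS = R^2 sin(th) dth dph
        for j in range(n):
            ph = (j + 0.5) * dph
            Sx += math.sin(th) * math.cos(ph) * w
            Sy += math.sin(th) * math.sin(ph) * w
            Sz += math.cos(th) * w
    return (Sx, Sy, Sz)

hemisphere = vector_area(1.0, math.pi / 2)      # expect ~ (0, 0, pi)
sphere = vector_area(1.0, math.pi)              # expect ~ (0, 0, 0)
```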
This result leads to an interesting (and useful) consequence, and that is that the vector area
of an open surface must depend only on its boundary curve. To see this, consider two surfaces,
S1 and S2 , which share a boundary curve. Taking the difference of the surfaces S1 − S2 must
yield a closed surface, for which we know the vector area is 0. The two surfaces must therefore
have the same vector area.
We can use this observation to obtain an expression of the vector area for an open surface
directly from the line integral around its boundary curve γ. Consider the open surface illustrated
in Fig. 31, created by the cone connecting the origin to some closed curve γ. The triangular
vector area element dS is given by
$$d\mathbf{S} = \tfrac{1}{2}\,\mathbf{r}\times d\mathbf{r},$$
and adding up all of these triangles along the curve γ gives us the total vector area of the surface:
$$\mathbf{S} = \frac{1}{2}\oint_\gamma \mathbf{r}\times d\mathbf{r}. \quad (138)$$
The vector area of any surface bound by a closed curve γ can therefore be found by performing
a line integral of this kind.
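As a numerical illustration of Eq. (138), the sketch below (illustrative radius and height values) computes ½∮ r × dr for a circle of radius R lifted to the plane z = h; the result (0, 0, πR²) is independent of h, confirming that the vector area depends only on the boundary curve:

```python
import math

# Vector area from the boundary curve: S = (1/2) ∮ r x dr for a circle
# of radius R in the plane z = h.
def vector_area_from_boundary(R, h, n=100_000):
    Sx = Sy = Sz = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        rx, ry, rz = R * math.cos(t), R * math.sin(t), h
        drx, dry, drz = -R * math.sin(t) * dt, R * math.cos(t) * dt, 0.0
        Sx += 0.5 * (ry * drz - rz * dry)       # (1/2) r x dr, component-wise
        Sy += 0.5 * (rz * drx - rx * drz)
        Sz += 0.5 * (rx * dry - ry * drx)
    return (Sx, Sy, Sz)

S = vector_area_from_boundary(2.0, 5.0)         # expect ~ (0, 0, 4*pi)
```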
Figure 31: Vector area defined as a cone connecting the origin to some closed curve γ. The
vector area element dS can be found by taking the cross product of the vector r describing the
curve, and its differential dr.
We see from Eq. (136) that in order to evaluate a surface integral we need to find expressions
for both n̂ and dS. To find a more convenient expression for the surface element we can use
Eq. (137) to write
$$dS = \frac{dA}{|\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}|} = \frac{|\nabla g|\,dA}{|\nabla g\cdot\hat{\mathbf{k}}|} = |\nabla g|\,dA. \quad (141)$$
We can use these expressions to relate any surface integral over S to a double integral over the
projected region in the xy-plane A:
$$\int_S d\mathbf{S} = \iint_S \hat{\mathbf{n}}\,dS = \iint_A \frac{\nabla g}{|\nabla g|}\,|\nabla g|\,dA = \iint_A \nabla g\,dA. \quad (142)$$
The convenient cancellation of the normalization of the gradient of g makes this expression relatively simple to manipulate. Note the loose similarity to the chain rule, considering that dA = dx dy.
An alternative approach is to calculate the vector normal to the surface directly, taking
the cross product of two linearly independent vectors lying in the tangent plane to some point
on the surface. This approach is convenient for surfaces given in parametric form r(u, v) =
(x(u, v), y(u, v), z(u, v)). As we have seen when discussing vector derivatives and the gradient,
the vectors ∂r(u, v)/∂u and ∂r(u, v)/∂v, evaluated at some point, are guaranteed to lie in the tangent plane to the surface at that point. The unit normal vector to the surface (and to the tangent plane) can then be obtained from the cross product:
$$\hat{\mathbf{n}} = \frac{\dfrac{\partial \mathbf{r}}{\partial u}\times\dfrac{\partial \mathbf{r}}{\partial v}}{\left|\dfrac{\partial \mathbf{r}}{\partial u}\times\dfrac{\partial \mathbf{r}}{\partial v}\right|}. \quad (143)$$
Following the same steps taken to find the Jacobian for a coordinate transformation we can also
find the size of the area element for a parametric surface as
$$dS = \left|\frac{\partial \mathbf{r}}{\partial u}\times\frac{\partial \mathbf{r}}{\partial v}\right| du\,dv. \quad (144)$$
Again, the normalization term will cancel when inserting these into a surface integral
$$\int_S d\mathbf{S} = \iint_S \hat{\mathbf{n}}\,dS = \iint_A \frac{\partial \mathbf{r}}{\partial u}\times\frac{\partial \mathbf{r}}{\partial v}\,du\,dv. \quad (145)$$
Example 24. Evaluate the integral ∬_S 6xy dS, where S is the portion of the plane x + y + z = 1 lying in the first octant (x, y, z ≥ 0).
Solution: We need to find a convenient way to parametrize the surface element dS. Let’s use
the gradient, via Eq. (141). We write:
$$g = x + y + z - 1 = 0 \;\Rightarrow\; \nabla g = (1, 1, 1) \;\Rightarrow\; |\nabla g| = \sqrt{3}.$$
Therefore,
$$dS = \sqrt{3}\,dx\,dy,$$
where we have projected the integral over the surface S onto the xy-plane. The domain of
integration is now the triangle bounded by x = 0, y = 0, and y = 1 − x (all in the xy-plane),
and we can write the integral as
$$\iint_S 6xy\,dS = \iint_A 6xy\,|\nabla g|\,dA = \sqrt{3}\int_0^1 dx \int_0^{1-x} dy\,6xy = 6\sqrt{3}\int_0^1 dx\,x\,\frac{(1-x)^2}{2} = \frac{\sqrt{3}}{4}.$$
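This value can be checked with a direct double sum over the projected triangular domain. The sketch below (illustrative grid resolution, midpoint sampling) recovers √3/4:

```python
import math

# Numerical check of ∬_S 6xy dS = sqrt(3)/4 over the part of the plane
# x + y + z = 1 in the first octant, projected onto the xy-plane
# (where dS = sqrt(3) dx dy and the domain is the triangle x + y <= 1).
def surface_integral(n=1000):
    total, h = 0.0, 1.0 / n
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x + y <= 1.0:                    # triangular domain
                total += 6 * x * y * math.sqrt(3) * h * h
    return total

value = surface_integral()                      # expect ~ sqrt(3)/4
```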
Example 25. Evaluate the integral ∬_S z dS, where S is the surface of the upper half (z ≥ 0) of a sphere of radius 2 centred at the origin.
Solution: The surface is the positive hemisphere, which can be conveniently parametrized using spherical polar coordinates. Let's take this approach in evaluating the integral. The vector to the surface is given by (see Eq. (116))
$$\mathbf{r}(\theta, \phi) = 2(\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta), \qquad \theta \in [0, \pi/2],\ \phi \in [0, 2\pi]. \quad (146)$$
To find dS, we use Eq. (144), with u = θ and v = φ. Let’s first find the two vectors in the
tangent plane to the surface:
$$\frac{\partial \mathbf{r}}{\partial \theta} = \begin{pmatrix} 2\cos\theta\cos\phi \\ 2\cos\theta\sin\phi \\ -2\sin\theta \end{pmatrix}, \qquad \frac{\partial \mathbf{r}}{\partial \phi} = \begin{pmatrix} -2\sin\theta\sin\phi \\ 2\sin\theta\cos\phi \\ 0 \end{pmatrix}.$$
The modulus of the cross product is
$$\left|\frac{\partial \mathbf{r}}{\partial \theta}\times\frac{\partial \mathbf{r}}{\partial \phi}\right| = 4\sin\theta, \quad (147)$$
which should come as little surprise: we did not need to do this calculation again, as we know
the Jacobian for spherical polar coordinates is r2 sin θ, i.e., the result found here. However, for a
more general parametrization for which the Jacobian is not known, the cross product will need
to be evaluated directly. We now have all the ingredients needed to evaluate the integral:
$$\iint_S z\,dS = \iint_A z(\theta)\left|\frac{\partial \mathbf{r}}{\partial \theta}\times\frac{\partial \mathbf{r}}{\partial \phi}\right| d\theta\,d\phi = \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta\,2\cos\theta\cdot 4\sin\theta = 8\pi.$$
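A quick numerical sum over the parametrization (illustrative resolution) confirms the value 8π:

```python
import math

# Numerical check of ∬_S z dS = 8*pi over the upper hemisphere of radius 2,
# using the parametric area element |dr/dθ x dr/dφ| dθ dφ = 4 sin(θ) dθ dφ.
def integral(n=10_000):
    total = 0.0
    dth = (math.pi / 2) / n
    for i in range(n):
        th = (i + 0.5) * dth
        z = 2 * math.cos(th)
        total += z * 4 * math.sin(th) * dth
    return total * 2 * math.pi                  # φ integral contributes 2π

value = integral()                              # expect ~ 8*pi
```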
5.2.4 Surface integrals of vector fields
Surface integrals of vector fields are widely used in physics as we often want to evaluate vector
quantities over some surface in three-dimensional space. They provide a way to calculate the
total flow of a vector field across a surface; if the surface is closed, then we will talk about the
flow into, or out of, some region of space. We have a choice in the direction of the normal vector
to a surface, which is important as it determines the direction of the vector flow. The normal
vector is conventionally chosen to point outwards of a closed surface. For an open surface, the
convention is to follow the right-hand rule tracing the curve that forms the surface boundary.
Example 26. Evaluate the integral ∬_S F · dS over the surface bounded by the paraboloid y = x² + z² for y ∈ [0, 1], and the disc x² + z² ≤ 1 at y = 1, for F = (0, y, −z).
Solution: We will use the gradient approach to solve this problem, and will want to parametrize
the surface to write it in the form given in Eq. (142), so that
¨ ¨
F · dS = F · ∇g dA.
S A
We will consider the paraboloid and disc surfaces separately. Because of the orientation of
the paraboloid it will be most convenient to choose A in the xz-plane. We can then write
y = f (x, z) = x2 + z 2 , so that g(x, y, z) = y − f (x, z) = y − x2 − z 2 . The gradient of g is given
by
∇g = (−2x, 1, −2z),
which represents a vector normal to the surface of the paraboloid. This vector points inward to
the surface, and so we should change its sign to adhere to the normal convention of outward-
pointing normals. The integral over the surface of the paraboloid then becomes
$$\iint_P \mathbf{F}\cdot d\mathbf{S} = \iint_A \begin{pmatrix} 0 \\ y \\ -z \end{pmatrix}\cdot\begin{pmatrix} 2x \\ -1 \\ 2z \end{pmatrix} dx\,dz = \iint_A (-y - 2z^2)\,dx\,dz = \iint_A (-x^2 - 3z^2)\,dx\,dz.$$
The domain A is the projection of the paraboloid on the xz-plane, and is simply the disc of
radius 1 in that plane. Plane polar coordinates are therefore ideally suited to compute this
double integral. Setting x = r cos θ and z = r sin θ for θ ∈ [0, 2π] and r ∈ [0, 1], we have:
$$\iint_A (-x^2 - 3z^2)\,dx\,dz = \int_0^{2\pi} d\theta \int_0^1 dr\,r\,(-r^2\cos^2\theta - 3r^2\sin^2\theta)$$
$$= -\int_0^{2\pi} d\theta\,(\cos^2\theta + 3\sin^2\theta)\int_0^1 r^3\,dr = -\frac{1}{4}\int_0^{2\pi} d\theta \left[ \frac{1}{2}(1 + \cos 2\theta) + \frac{3}{2}(1 - \cos 2\theta) \right] = -\pi.$$
We now need to integrate the vector field over the disc. This part is much simpler. The normal
is obviously along the y-axis, and so the integral becomes:
$$\iint_D \mathbf{F}\cdot d\mathbf{S} = \iint_D \begin{pmatrix} 0 \\ y \\ -z \end{pmatrix}\cdot\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} dx\,dz = \iint_D y\,dx\,dz = \pi,$$
where the result follows directly since y = 1 on the disc, and so the integral is simply the area of a disc of radius 1. Adding the two contributions we find that the total integral over the closed surface is 0.
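The two contributions can be checked numerically in the same polar parametrization used above. The sketch below (illustrative grid resolution) recovers −π for the paraboloid, +π for the disc, and a vanishing total:

```python
import math

# Flux of F = (0, y, -z) through the closed surface formed by the
# paraboloid y = x^2 + z^2 (outward normal ~ (2x, -1, 2z)) and the unit
# disc at y = 1 (outward normal (0, 1, 0)), in polar coordinates on the
# xz-plane projection.
def fluxes(n=800):
    parab = disc = 0.0
    dr, dth = 1.0 / n, 2 * math.pi / n
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            x, z = r * math.cos(th), r * math.sin(th)
            parab += (-(x * x + z * z) - 2 * z * z) * r * dr * dth  # F.(2x,-1,2z)
            disc += 1.0 * r * dr * dth                              # F.(0,1,0), y = 1
    return parab, disc

P, D = fluxes()                                 # expect ~ -pi and ~ +pi
```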
Example 27. Evaluate the integral ∬_S F · dS for the vector field F = (x, y, z⁴), where S is the surface of the upper half (z ≥ 0) of a sphere of radius 3 centred at the origin.
Solution: We will solve this using the parametric approach, given that the surface is a sphere. We follow the same initial approach as in Example 25. The surface is given parametrically by the vector equation
$$\mathbf{r}(\theta, \phi) = 3(\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta), \qquad \theta \in [0, \pi/2],\ \phi \in [0, 2\pi]. \quad (148)$$
We can then find the normal vector from the cross product of two vectors in the tangent plane,
found by taking the derivatives of r with respect to the two parameters (θ, φ):
$$\frac{\partial \mathbf{r}}{\partial \theta} = \begin{pmatrix} 3\cos\theta\cos\phi \\ 3\cos\theta\sin\phi \\ -3\sin\theta \end{pmatrix}, \qquad \frac{\partial \mathbf{r}}{\partial \phi} = \begin{pmatrix} -3\sin\theta\sin\phi \\ 3\sin\theta\cos\phi \\ 0 \end{pmatrix},$$
and the cross product is
$$\mathbf{n} = \frac{\partial \mathbf{r}}{\partial \theta}\times\frac{\partial \mathbf{r}}{\partial \phi} = 9\begin{pmatrix} \sin^2\theta\cos\phi \\ \sin^2\theta\sin\phi \\ \sin\theta\cos\theta \end{pmatrix}.$$
The vector function F written in spherical polar coordinates is:
$$\mathbf{F} = \begin{pmatrix} 3\sin\theta\cos\phi \\ 3\sin\theta\sin\phi \\ 3^4\cos^4\theta \end{pmatrix}.$$
We can now evaluate the integral using the expression in Eq. (145):
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_A \mathbf{F}\cdot\left(\frac{\partial \mathbf{r}}{\partial \theta}\times\frac{\partial \mathbf{r}}{\partial \phi}\right) d\theta\,d\phi$$
$$= \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta \begin{pmatrix} 3\sin\theta\cos\phi \\ 3\sin\theta\sin\phi \\ 3^4\cos^4\theta \end{pmatrix}\cdot 9\begin{pmatrix} \sin^2\theta\cos\phi \\ \sin^2\theta\sin\phi \\ \sin\theta\cos\theta \end{pmatrix} = \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta\,\left(27\sin^3\theta + 729\sin\theta\cos^5\theta\right) = 279\pi.$$
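The final integral is easy to verify numerically. The sketch below integrates the scalar integrand found in the solution, 27 sin³θ + 729 sin θ cos⁵θ, over θ ∈ [0, π/2] (with the φ integral contributing a factor 2π), and recovers 279π:

```python
import math

# Numerical check: ∫∫ (27 sin^3(θ) + 729 sin(θ) cos^5(θ)) dθ dφ = 279*pi
# over θ in [0, π/2], φ in [0, 2π].
def flux(n=200_000):
    total = 0.0
    dth = (math.pi / 2) / n
    for i in range(n):
        th = (i + 0.5) * dth
        total += (27 * math.sin(th) ** 3
                  + 729 * math.sin(th) * math.cos(th) ** 5) * dth
    return total * 2 * math.pi                  # φ integral contributes 2π

value = flux()                                  # expect ~ 279*pi
```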
6 Vector fields and operators
In addition to the gradient that was introduced in the previous chapter, which we have seen is a differential operator on scalar fields, there are two more differential operators important for the study of vector fields: the divergence and the curl. As we shall see, both can be expressed in terms of the differential operator ∇.
Figure 32: The vector field F = (xy², −yx²) plotted in the xy-plane.
To see this in a simple 2D example, consider the vector field F = (xy 2 , −yx2 ), shown in
Fig. 32. The divergence of this field is simple to calculate: ∇ · F = y 2 − x2 , which is the
saddle-shaped surface shown in Fig. 33. We can see that the divergence is increasingly large and
positive along the y-axis around x = 0, indicating a flow of the vector field away from this
region, and increasingly negative along the x-axis, indicating a flow toward this region. This is,
of course, fully consistent with the direction of the arrows in Fig. 32.
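The divergence quoted here can be checked by finite differences. The sketch below (central differences at an illustrative point) compares the numerical value with y² − x²:

```python
# Divergence of F = (x y^2, -y x^2) by central differences, compared with
# the analytic result y^2 - x^2.
def F(x, y):
    return (x * y * y, -y * x * x)

def divergence(x, y, h=1e-5):
    dFx_dx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dFy_dy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return dFx_dx + dFy_dy

x0, y0 = 0.3, 0.9
num = divergence(x0, y0)
ana = y0 * y0 - x0 * x0
```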
Figure 33: Divergence of the vector field F = (xy 2 , −yx2 ) shown in Fig. 32.
Example 28. Evaluate the divergence of the vector field F = (3xy, 2x/(y − 1), −xy/z).
Solution: We note that the field is well-behaved everywhere except at y = 1 and z = 0. Everywhere else,
$$\nabla\cdot\mathbf{F} = \frac{\partial}{\partial x}(3xy) + \frac{\partial}{\partial y}\left(\frac{2x}{y-1}\right) + \frac{\partial}{\partial z}\left(-\frac{xy}{z}\right) = 3y - \frac{2x}{(y-1)^2} + \frac{xy}{z^2}.$$
If a vector field F can be written as the gradient of some scalar potential φ, i.e., it is
conservative and F = ∇φ, then we can write the divergence directly in terms of the scalar
potential as
∇ · F = ∇ · ∇φ = ∇2 φ. (150)
This differential operator is again sufficiently common to merit its own symbol and name – it is
called the Laplacian:
$$\Delta := \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \quad (151)$$
The use of the symbol Δ to denote the Laplacian, while common in mathematics, is not common in physics.
Example 29. Calculate the Laplacian of the scalar function φ = xy²z³.
Solution:
$$\nabla^2\phi = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2} = 2xz^3 + 6xy^2z.$$
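The result can be confirmed numerically with second central differences. The sketch below (illustrative evaluation point and step size) compares the numerical Laplacian with 2xz³ + 6xy²z:

```python
# Laplacian of phi = x y^2 z^3 by central second differences, compared
# with the analytic result 2 x z^3 + 6 x y^2 z.
def phi(x, y, z):
    return x * y * y * z ** 3

def laplacian(x, y, z, h=1e-3):
    total = 0.0
    for axis in range(3):
        p_plus, p_minus = [x, y, z], [x, y, z]
        p_plus[axis] += h
        p_minus[axis] -= h
        total += (phi(*p_plus) - 2 * phi(x, y, z) + phi(*p_minus)) / (h * h)
    return total

x0, y0, z0 = 1.2, -0.7, 0.5
num = laplacian(x0, y0, z0)
ana = 2 * x0 * z0 ** 3 + 6 * x0 * y0 ** 2 * z0
```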
Figure 34: A volume V enclosed by a surface S used in the divergence theorem. The surface
can be split into two surfaces which can be parametrized as z = ϕ(x, y).
That is to say, the flux of a vector field leaving a closed surface is equal to the divergence of the
field integrated over the volume bounded by that surface.
To prove the divergence theorem, we proceed as follows. Let us first assume the volume V
is z-simple, which is to say it can be written as
$$V = \{(x, y, z) : (x, y) \in D,\ \varphi_1(x, y) \le z \le \varphi_2(x, y)\},$$
with ϕ1 and ϕ2 differentiable functions with continuous first derivatives (also called C 1 functions
on D). This volume is illustrated in Fig. 34. Using this, and taking the vector field components
explicitly F = (F₁, F₂, F₃), we can rewrite the last term of the volume integral in the divergence theorem as
$$\iiint_V \frac{\partial F_3}{\partial z}\,dx\,dy\,dz = \iint_D dx\,dy \int_{\varphi_1(x,y)}^{\varphi_2(x,y)} \frac{\partial F_3}{\partial z}\,dz$$
$$= \iint_D \left[ F_3(x, y, \varphi_2(x, y)) - F_3(x, y, \varphi_1(x, y)) \right] dx\,dy$$
$$= \iint_{\varphi_2} F_3\,\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}\,dS + \iint_{\varphi_1} F_3\,\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}\,dS = \oiint_S F_3\,\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}\,dS.$$
The last line follows from Eq. (137) as we move from an integral over the xy-domain back to an integral over the open surfaces ϕ₁ and ϕ₂. The closed surface S is then reconstructed from the two open surfaces ϕ₁ and ϕ₂.
Following the same reasoning, we see that if, instead, V is y-simple, then
$$\iiint_V \frac{\partial F_2}{\partial y}\,dx\,dy\,dz = \oiint_S F_2\,\hat{\mathbf{j}}\cdot\hat{\mathbf{n}}\,dS,$$
or if V is x-simple, then
$$\iiint_V \frac{\partial F_1}{\partial x}\,dx\,dy\,dz = \oiint_S F_1\,\hat{\mathbf{i}}\cdot\hat{\mathbf{n}}\,dS.$$
A bounded volume can always be broken down into sections that are simple along some direction,
and since integration is linear, summing the three components gives the general result, which is
the divergence theorem.
A valuable way to see the physical meaning of the divergence operator is to take the divergence theorem in the limit of infinitesimally small volumes. For this purpose, let us consider the volume V to be that of a sphere of radius r centred at some point r₀, and let us divide the divergence theorem expressions by the volume |V(r)| of this sphere:
$$\frac{1}{|V(r)|}\iiint_V \nabla\cdot\mathbf{F}\,dV = \frac{1}{|V(r)|}\oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS.$$
We now take the limit of these expressions for r → 0. The left-hand side of the equation simply tends to the divergence of F, evaluated at the point r₀:
$$\nabla\cdot\mathbf{F}(\mathbf{r}_0) = \lim_{r\to 0}\frac{1}{|V(r)|}\oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS. \quad (153)$$
The right-hand side of the equation shows that the divergence of a vector field in some point
gives the flux density of the vector field leaving that point, per unit volume. It is therefore
intimately related to the concept of sources and sinks of a vector field. The expression given in
Eq.(153) is also known as the integral form of the divergence operator. It has the advantage of
defining the divergence operator in terms of the geometry of the system, and is not dependent
on the coordinate system used (unlike Eq. (149)). It is, however, typically more challenging to
use for solving problems in practice.
Example 30. Evaluate the divergence of the gravitational field (also called the gravitational acceleration) generated by a point mass placed in the origin,
$$\mathbf{F}(\mathbf{r}) = -\frac{GM}{r^2}\hat{\mathbf{r}} = -\frac{GM}{r^3}\begin{pmatrix} x \\ y \\ z \end{pmatrix},$$
Solution: We note that the field is not defined in the origin, but is well behaved elsewhere. For r ≠ 0 we have
$$\nabla\cdot\mathbf{F} = -GM\left[ \frac{\partial}{\partial x}\left(\frac{x}{r^3}\right) + \frac{\partial}{\partial y}\left(\frac{y}{r^3}\right) + \frac{\partial}{\partial z}\left(\frac{z}{r^3}\right) \right]$$
$$= -GM\left[ \frac{r^3 - 3x^2 r}{r^6} + \frac{r^3 - 3y^2 r}{r^6} + \frac{r^3 - 3z^2 r}{r^6} \right] = -\frac{GM}{r^6}\left[ 3r^3 - 3(x^2+y^2+z^2)r \right] = 0.$$
We conclude that the gravitational field is solenoidal everywhere except in the origin.
This result may surprise you, as you may have expected the field to be a diverging one,
given that it has a source in the origin. To dig into this result a little deeper, consider the same
problem, but let us now use the divergence theorem, where we evaluate the surface integral considering a sphere of radius r centred at the origin:
$$\oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS = -GM\oiint_S \frac{\hat{\mathbf{r}}}{r^2}\cdot\hat{\mathbf{r}}\,dS = -GM\int_0^{2\pi} d\phi \int_0^\pi \frac{|\hat{\mathbf{r}}|^2}{r^2}\,r^2\sin\theta\,d\theta = -GM\int_0^{2\pi} d\phi \int_0^\pi \sin\theta\,d\theta = -4\pi GM.$$
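The radius independence of this flux is easy to see numerically. The sketch below (G = M = 1 for illustration) evaluates the flux through spheres of two different radii and finds the same value, −4π:

```python
import math

# Flux of F = -(GM/r^2) r_hat through a sphere of radius r: the value
# -4*pi*G*M is independent of the radius.
def flux(radius, n=2000, GM=1.0):
    total = 0.0
    dth = math.pi / n
    for i in range(n):
        th = (i + 0.5) * dth
        # F . n_hat = -GM/r^2 on the sphere; dS = r^2 sin(th) dth dph
        total += (-GM / radius**2) * radius**2 * math.sin(th) * dth
    return total * 2 * math.pi                  # φ integral contributes 2π

f1, f2 = flux(1.0), flux(5.0)                   # both expect ~ -4*pi
```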
This expression does not depend on r, which means that if we take a volume between two
spherical shells some ∆r apart, then the amount of vector flux traversing each surface must be
the same. We can see why from the second line of the equation above: the scaling of the field with
distance as 1/r2 exactly cancels out the r2 increase of the surface with distance (which comes
from the Jacobian). We thus understand intuitively why the field must have zero divergence
outside of the origin, and can extend this result to any vector field that scales as 1/r2 , e.g. the
electric field.
Interestingly, the result above also implies that
$$\iiint_V \nabla\cdot\mathbf{F}\,dV = -4\pi GM, \quad (154)$$
which cannot hold if ∇ · F = 0 everywhere. The problem is of course in the origin, where the
field is not defined. Equation (153) provides further insight on this conundrum; in the limit of
r → 0 the divergence can be written as
$$\nabla\cdot\mathbf{F}(\mathbf{r}\to 0) = \lim_{r\to 0}\frac{-4\pi GM}{|V(r)|}. \quad (155)$$
This limit diverges for any M 6= 0. It would thus seem that the divergence of our inverse-square-
law field should be 0 everywhere except in the origin, where it diverges. We have already seen
an object that can describe such physics: the Dirac delta. We can then rewrite the divergence
of the gravitational field as
∇ · F(r) = −4πGM δ(r), (156)
where δ(r) := δ(x)δ(y)δ(z) is the 3D generalization of the 1D Dirac delta:
$$\iiint_V \delta(\mathbf{r})\,dV = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \delta(x)\delta(y)\delta(z)\,dx\,dy\,dz = 1. \quad (157)$$
With this expression the divergence in any region of space excluding the origin r = 0 is indeed
0, but the divergence returns the result of Eq. (154) whenever the origin is included in the
integral. By introducing the Dirac delta the divergence theorem can be made valid even for
non-differentiable point sources.
We note that the gravitational field can be expressed as the gradient of a scalar potential,
F = −∇φ. Finding φ is straightforward. We write
$$\begin{pmatrix} \partial\phi/\partial x \\ \partial\phi/\partial y \\ \partial\phi/\partial z \end{pmatrix} = \frac{GM}{r^3}\begin{pmatrix} x \\ y \\ z \end{pmatrix},$$
and integrating each component yields φ = −GM/r.
In terms of the mass density, the divergence of the gravitational field can be written as
∇ · F = −4πGρ. (158)
Writing this expression in terms of the scalar potential leads to the well-known Poisson equation,
which relates a distribution of matter of density ρ with the gravitational potential φ:
∇2 φ = 4πGρ. (159)
Figure 35: A region of the xy-plane bounded by the closed curve γ. For a point on the curve
described by the vector r we can identify a tangent vector dr and a normal vector n̂.
Figure 36: The curl of a vector field F in some point P along the direction n̂ is given by the
limit of a closed line integral representing the circulation of the field.
The component of the curl of a vector field F along a direction n̂ in a point P is defined through the limit
$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{n}} := \lim_{A\to 0}\frac{1}{A}\oint_\gamma \mathbf{F}\cdot d\mathbf{r}, \quad (162)$$
where the closed line integral is the circulation of F along the curve γ that lies in the plane with
normal vector n̂ circling point P , and A is the area of the plane bounded by γ, as shown in
Fig. 36. In the limit of A → 0 we obtain the component of the curl along n̂ in the point P .
To find a more convenient expression for the curl, let’s pick a particular case where n̂ lies
along the z axis, i.e., n̂ = k̂. The curve γ will then lie in the xy-plane, and let us consider γ
to be a rectangle with one edge in point P (a, b) and with sides ∆x and ∆y, as illustrated in
Fig. 37. Since F = (Fx , Fy , Fz ), we can rewrite F · dr = Fx dx + Fy dy + Fz dz.
Figure 37: Rectangular curve with area ∆x∆y, split into straight segments C1 , C2 , C3 and C4 .
Given the orientation, note that the closed curve γ = C1 + C2 − C3 − C4 . We omit writing the
z component of F explicitly as it is constant in the plane.
Given that dz = 0 along any path in the xy-plane, Eq. (162) becomes:
$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{k}} = \lim_{A\to 0}\frac{1}{A}\oint_\gamma (F_x\,dx + F_y\,dy)$$
$$= \lim_{A\to 0}\frac{1}{A}\left[ \int_{C_1} F_x\,dx + \int_{C_2} F_y\,dy - \int_{C_3} F_x\,dx - \int_{C_4} F_y\,dy \right]$$
$$= \lim_{A\to 0}\frac{1}{A}\left[ F_x(a,b)\Delta x + F_y(a+\Delta x, b)\Delta y - F_x(a, b+\Delta y)\Delta x - F_y(a,b)\Delta y \right].$$
Note that in the last two lines we changed the direction of two of the integrals to use the three vertices shown in Fig. 37, and have also omitted writing the z component of F explicitly. Expanding F_y(a + Δx, b) and F_x(a, b + Δy) to first order and dividing by A = ΔxΔy then gives (curl F) · k̂ = ∂F_y/∂x − ∂F_x/∂y. Repeating the construction for rectangles in the yz- and zx-planes yields the remaining components:
70
In[1]:= needs = "VectorFieldPlots`";
VectorPlot3D {0, 0, 2}, x, - 2, 2 , y, - 2, 2 , z, - 0.1, 0.1
r ⇥ v = (0, 0, 2!)
<latexit sha1_base64="qQ24MBZOu1sbpk83EnH08f7oQ8U=">AAACC3icbZDLSsNAFIYn9VbrLerSzdAiVNCSiLeNUHTjsoK9QBPKZDpph04mYWZSDKF7N76KGxeKuPUF3Pk2Ttsg2vrDwMd/zuHM+b2IUaks68vILSwuLa/kVwtr6xubW+b2TkOGscCkjkMWipaHJGGUk7qiipFWJAgKPEaa3uB6XG8OiZA05HcqiYgboB6nPsVIaatjFlPH8+FwBC9h+SiBThiQHjqE9z9kHXTMklWxJoLzYGdQAplqHfPT6YY4DghXmCEp27YVKTdFQlHMyKjgxJJECA9Qj7Q1chQQ6aaTW0ZwXztd6IdCP67gxP09kaJAyiTwdGeAVF/O1sbmf7V2rPwLN6U8ihXheLrIjxlUIRwHA7tUEKxYogFhQfVfIe4jgbDS8RV0CPbsyfPQOK7YZ5XT25NS9SqLIw/2QBGUgQ3OQRXcgBqoAwwewBN4Aa/Go/FsvBnv09ackc3sgj8yPr4B+UiYeQ==</latexit> <latexit sha1_base64="YLrWJ6OcJ7wxLCm7oKNeZVyy5GI=">AAACEHicbVDLSgMxFM3UV62vUZdugkWsIGWm+NoIRTcuK9gHdEq5k2ba0ExmSDKFMvQT3Pgrblwo4talO//G9LHQ1kMCh3Pu5d57/JgzpR3n28osLa+srmXXcxubW9s79u5eTUWJJLRKIh7Jhg+KciZoVTPNaSOWFEKf07rfvx379QGVikXiQQ9j2gqhK1jACGgjte1jT4DPAXuahVTh1PMDPBjha1xwTrF5JS8KaRdO2nbeKToT4EXizkgezVBp219eJyJJSIUmHJRquk6sWylIzQino5yXKBoD6UOXNg0VYMa30slBI3xklA4OImm+0Hii/u5IIVRqGPqmMgTdU/PeWPzPayY6uGqlTMSJpoJMBwUJxzrC43Rwh0lKNB8aAkQysysmPZBAtMkwZ0Jw509eJLVS0b0ont+f5cs3sziy6AAdogJy0SUqoztUQVVE0CN6Rq/ozXqyXqx362NamrFmPfvoD6zPH5JDmmI=</latexit>
v = ( y!, x!, 0)
0
y
Out[2]=
-1
-2
-2 -1 0 1 2
x
Figure 38: Velocity field of a solid body in rotation, and its curl.
(curl F) · î = ∂Fz /∂y − ∂Fy /∂z,
(curl F) · ĵ = ∂Fx /∂z − ∂Fz /∂x.
These projections along the coordinate axes provide the curl vector in the Cartesian basis.
We recognize them as the components of a vector product, as defined in Eq. (8). Using the
differential operator ∇ we can thus write the curl in a more convenient form for calculations
curl F = (∂Fz /∂y − ∂Fy /∂z) î + (∂Fx /∂z − ∂Fz /∂x) ĵ + (∂Fy /∂x − ∂Fx /∂y) k̂ ≡ ∇ × F.    (163)
Find the curl of the vector field F = (xy, x²z, xyz).
Solution:
∇ × F = det | x̂      ŷ      ẑ    |
            | ∂/∂x   ∂/∂y   ∂/∂z | = (xz − x², −yz, 2xz − x).
            | xy     x²z    xyz  |
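This kind of curl calculation is easy to cross-check symbolically. The following sketch (not part of the original notes; it assumes Python with sympy is available) recomputes the curl of F = (xy, x²z, xyz) from the determinant rule:

```python
# Illustrative check of the example's curl, using sympy.
import sympy as sp

x, y, z = sp.symbols('x y z')
F = sp.Matrix([x*y, x**2*z, x*y*z])

def curl(G, coords):
    """Curl of a 3-component field, component by component."""
    cx, cy, cz = coords
    return sp.Matrix([
        sp.diff(G[2], cy) - sp.diff(G[1], cz),
        sp.diff(G[0], cz) - sp.diff(G[2], cx),
        sp.diff(G[1], cx) - sp.diff(G[0], cy),
    ])

print(curl(F, (x, y, z)).T)  # components xz - x^2, -yz, 2xz - x
```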
From the definition of the curl we can see that it measures the microscopic rather than some
macroscopic circulation of a vector field. It represents the tendency of a fictional microscopic
paddle wheel, placed in some point in the field, to rotate around the axis that points in the
direction of the curl. Note that the micro and macroscopic circulations of a field are typically
different. To see this, consider a solid body rotating around the z axis with angular velocity
ω⃗ = ω k̂. In the xy-plane, some point at r = (x, y) will then have a velocity
v = ω⃗ × r = det | î   ĵ   k̂ |
                | 0   0   ω | = (−yω, xω, 0).
                | x   y   0 |
This velocity is a vector field that represents the macroscopic rotation of the solid body. Let’s
calculate the curl of this field to find the microscopic circulation, for comparison:
curl v = ∇ × v = det | î      ĵ      k̂   |
                     | ∂/∂x   ∂/∂y   ∂/∂z | = 2ω k̂.
                     | −yω    xω     0   |
We see that the curl is twice the angular velocity vector of the solid body around its axis of
rotation. The velocity field and its curl are shown in Fig. 38.
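The result 2ω k̂ can also be recovered numerically, which is a useful sanity check when a field is only known at sample points. The sketch below (an illustration, not part of the original notes; the point and ω are arbitrary) estimates the curl of v = (−yω, xω, 0) by central finite differences:

```python
# Finite-difference estimate of curl v for the rigid-rotation field.
import numpy as np

w = 1.7          # angular velocity (arbitrary choice)
h = 1e-6         # finite-difference step

def v(p):
    x, y, z = p
    return np.array([-y*w, x*w, 0.0])

def curl_fd(f, p):
    """Central-difference estimate of (nabla x f) at point p."""
    def d(i, j):  # numerical d f_i / d x_j
        e = np.zeros(3); e[j] = h
        return (f(p + e)[i] - f(p - e)[i]) / (2*h)
    return np.array([d(2,1) - d(1,2), d(0,2) - d(2,0), d(1,0) - d(0,1)])

print(curl_fd(v, np.array([0.3, -1.2, 0.5])))  # ~ (0, 0, 2w), independent of the point
```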
Recall that we have already encountered the terms forming the components of ∇ × F when
discussing exact differentials in Section 4.6, in Eq. (113). There we saw that given some vector
field F, if all the components of ∇×F = 0, then the differential form F·dr = Fx dx+Fy dy+Fz dz
is an exact differential. We showed how line integrals of such a differential must be path-
independent, and that closed line integrals are always zero. A vector field for which this holds
is called a conservative field. Using the curl formalism defined here we now have a convenient
way to check whether a field is conservative or not. This should come as little surprise given the
definition of the curl in Eq. (162): if F is conservative then the closed line integral will be zero
for any choice of n̂.
It is important to note that these observations require the region bounded by the curve along
which we perform the line integral to be simply connected. From our derivation of the curl it
should be obvious why: in taking the limit for the area bounded by the curve, we explicitly
require that the vector field be well-defined at the point (and its neighbourhood) at which the
curl is being evaluated. We discussed this already in Section 4.6, and reiterate it here. If all
closed line integrals on some domain are always zero then the field is certainly irrotational, but
if we find an irrotational field on a multiply-connected domain we are not guaranteed that all
closed line integrals will be zero. Here is an example of such a case.
Example 32. Consider the field F = (−y, x)/(x² + y²). Is F conservative?
The field F is shown in Fig. 39. There is a clearly visible macroscopic circulation, but the curl
shows us that the microscopic circulation is zero. So if we imagine a paddle wheel moving around
in this field it would follow the field lines, but would not itself rotate. If we compare this with
Figure 39: Vector field given in Example 32. While there is clear macroscopic circulation, there
is no microscopic circulation: the curl of this field is 0.
the solid body case, we can see that the only difference between the two is how quickly the field
drops off as we move away from the origin: the inverse square rate here is such that the fields on
the inner and outer side of our imaginary paddle wheel cancel out exactly. It is clearly possible
to have very different behaviour in micro and macroscopic circulation of a vector field.
Importantly, this field is defined everywhere except for the origin (where it diverges), so the
xy-plane is not simply connected. Let’s calculate the circulation of F along a circumference
centred in the origin of radius 1. Moving to plane polar coordinates, we have x² + y² = r² = 1,
so F = (− sin θ, cos θ), and
∮_γ F · dr = ∫_0^{2π} (sin²θ + cos²θ) dθ = 2π ≠ 0.
We see that while ∇ × F = 0, not all closed line integrals are zero. They are zero only if the
path doesn’t enclose the origin. Otherwise, the integral will be 2π. So F is (locally) conservative
over any domain in the xy-plane that excludes the origin, but is globally non-conservative.
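Both claims are easy to test numerically. The sketch below (an illustration, not part of the original notes; it assumes the field F = (−y, x)/(x² + y²) of this example) computes the circulation along a circle that encloses the origin and along one that does not:

```python
# Circulation of F = (-y, x)/(x^2 + y^2) along circles of radius rho
# centred at (cx, cy): 2*pi if the origin is enclosed, 0 otherwise.
import numpy as np
from scipy.integrate import quad

def circulation(cx, cy, rho):
    def integrand(t):
        x, y = cx + rho*np.cos(t), cy + rho*np.sin(t)
        Fx, Fy = -y/(x**2 + y**2), x/(x**2 + y**2)
        # dr = (-rho sin t, rho cos t) dt
        return Fx*(-rho*np.sin(t)) + Fy*(rho*np.cos(t))
    return quad(integrand, 0, 2*np.pi)[0]

print(circulation(0, 0, 1))   # encloses the origin: ~ 2*pi
print(circulation(3, 0, 1))   # does not enclose it: ~ 0
```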
That is to say, the flux of the curl of a vector field through some open surface S equals the
circulation of the field along the surface’s boundary. We note that the surface may be closed; if
so, the integrals must equal zero.
To see why Stokes’ theorem holds, consider the surface S bounded by a closed curve γ as
illustrated in Fig. 40. The circulation of some well-behaved vector field F along γ is given by
∮_γ F · dr.
Figure 40: Slicing of some surface parametrized by the variables (x, y) for Stokes’ theorem. The
line integrals on all internal grid lines cancel because they are traversed in opposite directions,
leaving only the line integral along the boundary curve γ. Each small circulation element in the
final panel (d) resembles the setup shown in Fig. 36: the circulation corresponds to the scalar
product of the curl of the field and the vector area.
We now divide the area enclosed by γ in two parts, bounded by two new curves γ1 and γ2 .
Clearly,
∮_γ F · dr = ∮_{γ1} F · dr + ∮_{γ2} F · dr,
since the internal grid line is traversed twice, in opposite directions, and thus cancels exactly –
see Fig. 40(b). We can continue this process to build an increasingly dense tiling, so that
∮_γ F · dr = Σ_{i=1}^{n} ∮_{γi} F · dr
≈ Σ_{i=1}^{n} (∇ × F) · Si ,
where in the last line we have used the definition of the curl from Eq. (162), and where Si
denotes the vector area of the region bounded by γi . Since the vector area has not been taken in
the infinitesimal limit, the curl expression is only approximate. The relationship becomes exact
in the limit of Si → 0, i.e., infinitely many regions of infinitesimal size, and the sum over the regions is
replaced by a double integral over the surface S
∮_γ F · dr = ∬_S (∇ × F) · dS,    (165)
which we recognize as Stokes’ theorem. Note that in two dimensions with F = (Fx , Fy ) we once
again recover Green’s theorem:
∮_γ F · dr = ∬_S (∂Fy /∂x − ∂Fx /∂y) dS.    (166)
Just as the divergence theorem can be used to relate volume and surface integrals for certain
types of integrands, Stokes’ theorem connects surface integrals with line integrals.
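Green's theorem, Eq. (166), lends itself to a direct numerical check. The sketch below (an illustration, not part of the original notes; the test field F = (−y³, x³) and the unit disk are arbitrary choices) evaluates both sides, which should agree at 3π/2:

```python
# Check Green's theorem for F = (-y^3, x^3) on the unit disk.
import numpy as np
from scipy.integrate import quad, dblquad

# Left side: closed line integral along the unit circle.
def integrand(t):
    x, y = np.cos(t), np.sin(t)
    return (-y**3)*(-np.sin(t)) + (x**3)*(np.cos(t))
line = quad(integrand, 0, 2*np.pi)[0]

# Right side: dFy/dx - dFx/dy = 3(x^2 + y^2) over the disk.
area = dblquad(lambda y, x: 3*(x**2 + y**2),
               -1, 1,
               lambda x: -np.sqrt(1 - x**2),
               lambda x:  np.sqrt(1 - x**2))[0]

print(line, area)  # both ~ 3*pi/2
```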
Example 33. Evaluate the closed line integral ∮_γ F · dr, where F = (y + z, z + x, x + y), and
the curve γ is given by the intersection of the sphere x² + y² + z² = R² with the plane x + y + z = 0.
Solution: The vector field is well-defined in all of ℝ³ and γ is a closed regular curve so we can use
Stokes’ theorem to transform the line integral into a surface integral:
∮_γ F · dr = ∬_S (∇ × F) · n̂ dS,
where S is the portion of the plane bounded by the intersection with the sphere. The normal
vector is thus simply that of the plane: n̂ = (1/√3)(1, 1, 1). Calculating the curl we have
∇ × F = det | x̂       ŷ       ẑ    |
            | ∂/∂x    ∂/∂y    ∂/∂z | = (0, 0, 0),
            | y + z   z + x   x + y|
and so we conclude that F is conservative in ℝ³, that F · dr is an exact differential, and that
therefore ∮_γ F · dr = 0.
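The conclusion can be confirmed without Stokes' theorem by integrating along γ directly. The sketch below (an illustration, not part of the original notes; the orthonormal basis vectors of the plane are a convenient choice of mine) parametrizes the intersection circle and evaluates the circulation:

```python
# Direct numerical circulation of F = (y+z, z+x, x+y) along the circle of
# radius R in the plane x + y + z = 0, parametrized via an orthonormal
# basis (u, w) of the plane.
import numpy as np
from scipy.integrate import quad

R = 2.0
u = np.array([1, -1, 0]) / np.sqrt(2)
w = np.array([1, 1, -2]) / np.sqrt(6)

def integrand(t):
    r = R*(np.cos(t)*u + np.sin(t)*w)
    dr = R*(-np.sin(t)*u + np.cos(t)*w)
    x, y, z = r
    F = np.array([y + z, z + x, x + y])
    return F @ dr

print(quad(integrand, 0, 2*np.pi)[0])  # ~ 0
```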
Example 34. Show that Stokes’ theorem holds for the vector field F = (x, 0, y) over the
surface S given parametrically as
x = R sin θ cos φ
y = R sin θ sin φ
z = R cos θ
Figure 41: The quarter sphere surface used in Example 34, bounded by the two semi-
circumferences C1 and C2 .
To find the circulation, we must first parametrize the bounding curve γ. As can be seen from
Fig. 41, γ will be given by the sum of two half-circumferences on the sphere. The first can be
parametrized as:
x = R cos t
y = R sin t
z = 0
for t ∈ [−π/2, +π/2]. The contribution to the total circulation from this segment is
C1 = ∫_{−π/2}^{+π/2} (Fx dx + Fy dy + Fz dz)
= ∫_{−π/2}^{+π/2} Fx dx
= ∫_{−π/2}^{+π/2} R cos t (−R sin t) dt = 0.
The second can be parametrized as:
x = 0
y = R sin t
z = R cos t
for t ∈ [−π/2, +π/2]. The contribution to the total circulation from this segment is (note the
change in direction to make this a closed curve)
C2 = ∫_{+π/2}^{−π/2} (Fx dx + Fy dy + Fz dz)
= ∫_{+π/2}^{−π/2} Fz dz
= ∫_{+π/2}^{−π/2} R sin t (−R sin t) dt = (π/2) R².
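The surface side of the theorem can be checked numerically as well. Since ∇ × F = (1, 0, 0) for F = (x, 0, y), the flux through the quarter sphere (the octant pair with x ≥ 0, z ≥ 0) should equal the total circulation C1 + C2 = πR²/2. A sketch (an illustration, not part of the original notes; R = 1 for simplicity):

```python
# Stokes check for Example 34: circulation C1 + C2 vs flux of curl F.
import numpy as np
from scipy.integrate import quad, dblquad

R = 1.0

# Circulation: C1 = 0, and along C2 only F_z dz = y dz contributes.
C2 = quad(lambda t: R*np.sin(t) * (-R*np.sin(t)), np.pi/2, -np.pi/2)[0]

# Flux of curl F = (1,0,0) through the quarter sphere, in spherical
# coordinates: integrand (x/R) dS = R^2 sin^2(th) cos(ph).
flux = dblquad(lambda th, ph: R**2 * np.sin(th)**2 * np.cos(ph),
               -np.pi/2, np.pi/2,   # phi range
               0, np.pi/2)[0]       # theta range

print(C2, flux, np.pi*R**2/2)  # all three agree
```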
6.5 Some differential identities
Differential operators are often most conveniently handled in index notation. Given a differen-
tiable scalar field φ and a differentiable vector field F, we can write:
∇φ → (∇φ)i = ∂φ/∂xi = ∂i φ
∇ · F = ∂Fi /∂xi = ∂i Fi
∇ × F → (∇ × F)i = εijk ∂Fk /∂xj = εijk ∂j Fk
where we remember the convention that repeated indices are summed over. The number of free
indices (those not repeated) indicates the rank of the resulting tensor: in the first case it is 1, so we
have a vector; in the second 0 (i is repeated), so a scalar; and in the third again 1, so a vector (j
and k are repeated). Here εijk is the Levi-Civita symbol, a rank 3 tensor.
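Index gymnastics of this kind can be checked mechanically. The sketch below (an illustration, not part of the original notes) builds the Levi-Civita symbol as a numpy array and verifies the contraction identity εkij εklm = δil δjm − δim δjl used later in this section:

```python
# Verify the epsilon-delta contraction identity numerically.
import numpy as np

eps = np.zeros((3, 3, 3))
for i, j, k in [(0,1,2), (1,2,0), (2,0,1)]:   # even permutations: +1
    eps[i, j, k] = 1.0
    eps[i, k, j] = -1.0                        # odd permutations: -1

d = np.eye(3)
lhs = np.einsum('kij,klm->ijlm', eps, eps)
rhs = np.einsum('il,jm->ijlm', d, d) - np.einsum('im,jl->ijlm', d, d)
print(np.allclose(lhs, rhs))  # True
```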
Given the definitions of the gradient, divergence and curl, there are certain relationships
between them that will prove useful. Note that only some compositions of these operators are
meaningful:
5. curl curl F ≡ ∇ × (∇ × F)
∇ × (∇ × F) = ∇(∇ · F) − ∇2 F.
[∇ × (∇ × F)]i = εijk ∂j (∇ × F)k
= εijk ∂j εklm ∂l Fm
= εkij εklm ∂j ∂l Fm
= (δil δjm − δim δjl ) ∂j ∂l Fm
= δil δjm ∂j ∂l Fm − δim δjl ∂j ∂l Fm
= ∂i ∂m Fm − ∂j ∂j Fi
= [∇(∇ · F) − ∇²F]i .
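The identity can also be confirmed symbolically on a concrete field. A sketch (an illustration, not part of the original notes; the test field is an arbitrary smooth choice of mine):

```python
# Check curl(curl F) = grad(div F) - laplacian(F) with sympy.
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)
F = sp.Matrix([x*y*z, sp.sin(x)*z**2, x**2 + y**3])  # arbitrary test field

def curl(G):
    return sp.Matrix([
        sp.diff(G[2], y) - sp.diff(G[1], z),
        sp.diff(G[0], z) - sp.diff(G[2], x),
        sp.diff(G[1], x) - sp.diff(G[0], y),
    ])

div = sum(sp.diff(F[i], coords[i]) for i in range(3))
grad_div = sp.Matrix([sp.diff(div, c) for c in coords])
lap = sp.Matrix([sum(sp.diff(F[i], c, 2) for c in coords) for i in range(3)])

print(sp.simplify(curl(curl(F)) - (grad_div - lap)))  # zero vector
```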
7 Revision
7.1 Summary of some key points
1. The arc length s of a space curve parametrized by a vector r(t) = (x(t), y(t), z(t)) is given by
s = ∫_a^b |r′(t)| dt = ∫_a^b √[(dx/dt)² + (dy/dt)² + (dz/dt)²] dt.
2. Line integral of a scalar function f along a curve γ with respect to an infinitesimal vector
displacement produces a vector:
∫_γ f (r) dr = ∫_γ f (x, y, z)(î dx + ĵ dy + k̂ dz).
3. Line integral of a vector function F along a curve γ with respect to an infinitesimal vector
displacement produces a scalar:
∫_γ F(r) · dr = ∫_γ Fx (x, y, z) dx + ∫_γ Fy (x, y, z) dy + ∫_γ Fz (x, y, z) dz.
A common example is the work W done by a force F moving along a curvilinear path γ.
We can use these to relate integrals along lines, surfaces and volumes of scalar and vector
functions.
1. Divergence Theorem:
∭_V ∇ · F dV = ∯_S F · dS = ∯_S F · n̂ dS.
2. Stokes’ Theorem:
∬_S (∇ × F) · dS = ∬_S (∇ × F) · n̂ dS = ∮_γ F · dr.
For surface integrals, where the surface is parametrized by the vector r(u, v), we have
dS = |∂r/∂u × ∂r/∂v| du dv;    dS = (∂r/∂u × ∂r/∂v) du dv.    (167)
For surface integrals, where the surface is given by the equation z = f (x, y), we can write
g(x, y, z) = f (x, y) − z = 0, and use the gradient of g to find the normal vector to the surface:
dS = |∇g| dA = √(fx² + fy² + 1) dA;    dS = ∇g dA = (fx , fy , −1) dA.    (168)
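Equations (167) and (168) are two routes to the same surface element, and their consistency can be checked symbolically. A sketch (an illustration, not part of the original notes; the surface f is an arbitrary test function): for z = f (x, y) parametrized as r = (u, v, f (u, v)), the tangent cross product reproduces the gradient-based normal up to an overall sign.

```python
# Consistency of Eqs. (167) and (168) for a graph surface z = f(x, y).
import sympy as sp

u, v = sp.symbols('u v', real=True)
f = u**2 * sp.sin(v) + v**3          # arbitrary test surface

r = sp.Matrix([u, v, f])
n = r.diff(u).cross(r.diff(v))       # Eq. (167) normal

# Components match (-f_x, -f_y, 1), i.e. the normal of Eq. (168) up to sign:
print(sp.simplify(n - sp.Matrix([-f.diff(u), -f.diff(v), 1])))
# And |n|^2 = f_x^2 + f_y^2 + 1, matching the scalar dS of Eq. (168):
print(sp.simplify(n.dot(n) - (f.diff(u)**2 + f.diff(v)**2 + 1)))
```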
Example 36. Viviani’s window is the part of the surface of the sphere of radius R centred
in the origin, contained within the right cylinder with base given by a circumference of
radius R/2, centred in (x, y) = (R/2, 0), for z ≥ 0 (see Fig. 42). Find the area S of
Viviani’s window.
Figure 42: Viviani’s window. The surface S is bounded by the space curve γ, also known as
Viviani’s curve.
Solution: We are tasked with solving the following integral for the area S:
S = ∬_S dS.
Finding the normal vector to the surface of the sphere, and thus dS according to the standard
expressions, is simple enough – we have already solved this problem in Example 25. However,
the challenge is then to find the limits of the two variables (θ, φ) parametrizing the surface of the
sphere over which we should integrate. This is not trivial in this case, so we shall proceed with
a different parametrization of the surface that is more convenient, using cylindrical coordinates
for a cylinder not centred in the origin. Specifically, we write
x = r cos θ
y = r sin θ
z = z(r, θ).
Figure 43: Parametrization of an offset circle in plane-polar coordinates. A point P on the circle
is given by the parameters (r, θ), with r = R cos θ.
To find the range of (r, θ), consider the base of the cylinder in the xy-plane, as shown in Fig. 43.
Clearly, in order to describe the circle we see that −π/2 < θ < π/2, while the length of the
vector tracing the circle for some given value of θ must be r = R cos θ. The entire base of the
cylinder can then be described by setting 0 ≤ r ≤ R cos θ.
All points on the surface S must obey the equation of the sphere, and thus
x = r cos θ
y = r sin θ
z = √(R² − r²),    for r ≤ R cos θ,  −π/2 ≤ θ ≤ π/2.
To find a convenient form for dS we recall Eq. (144), where our surface is traced out by the
vector x = (x, y, z), parametrized in terms of the two variables (r, θ). We therefore have
∬_S dS = ∬_S |∂x/∂r × ∂x/∂θ| dr dθ.
∂x/∂r × ∂x/∂θ = det | i        j        k       |
                    | ∂x/∂r    ∂y/∂r    ∂z/∂r   |
                    | ∂x/∂θ    ∂y/∂θ    ∂z/∂θ   |

= det | i          j          k              |
      | cos θ      sin θ      −r/√(R² − r²) |  =  ( r² cos θ/√(R² − r²),  r² sin θ/√(R² − r²),  r ).
      | −r sin θ   r cos θ    0              |
We now have all the ingredients needed to perform the integral:
S = ∬_S dS = ∫_{−π/2}^{+π/2} dθ ∫_0^{R cos θ} rR/√(R² − r²) dr
= ∫_{−π/2}^{+π/2} [−R√(R² − r²)]_0^{R cos θ} dθ
= ∫_{−π/2}^{+π/2} R²(1 − |sin θ|) dθ
= R² (π − 2 ∫_0^{π/2} sin θ dθ)
= R² (π − 2).
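A quick numerical check of this closed form (an illustration, not part of the original notes; R is arbitrary) integrates the surface element found above over the same (r, θ) domain:

```python
# Numerical area of Viviani's window, compared with R^2 (pi - 2).
import numpy as np
from scipy.integrate import dblquad

R = 1.3
area = dblquad(lambda r, th: r*R/np.sqrt(R**2 - r**2),
               -np.pi/2, np.pi/2,             # theta range (outer)
               0, lambda th: R*np.cos(th))[0] # r range (inner)
print(area, R**2 * (np.pi - 2))  # agree
```

The integrand has an integrable singularity at r = R (reached only at θ = 0), which adaptive quadrature handles without difficulty.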
Example 37. Alice and Bob agree to meet for lunch at some time between noon and
1 pm. They are both willing to wait up to 10 min for the other to arrive, but will leave
otherwise. Find the probability that they meet for lunch, if
a) their arrival times are independent and uniformly distributed over the hour;
b) their arrival times are independent, but while Alice's probability of arrival is uniform,
Bob is quadratically more likely to arrive later in the hour.
Does Bob’s tardiness help or hinder the likelihood of the two enjoying a meal together?
Solution: Let X indicate the arrival time for Alice, and Y the arrival time for Bob. We’re told
the two events are independent, so the joint PDF for the arrival of Alice and Bob is obtained
by multiplying their respective single-variable PDFs.
Part a) We're told Alice's and Bob's arrival times are uniformly distributed, so each PDF is a
constant function, the integral of which over the domain of interest (a 1 hour window) must
be 1:
∫_{1h} fX (x) dx = ∫_{1h} fY (y) dy = ∫_{1h} C dx = 1.
The constant is therefore simply the inverse of the time window. For convenience we can choose
to work in the units of minutes, in which case C = 1/60. The joint PDF is then
fXY (x, y) = (1/60)².
The domain over which the joint PDF is defined is shown in Fig. 44: a square of total area 3600
square minutes. We're interested in finding the probability that they meet. Since this requires
|Y − X| ≤ 10, this probability is given by
P (|Y − X| ≤ 10) = ∬_{|y−x|≤10} fXY (x, y) dx dy.
In this case the integral simply needs to evaluate the red area shown in Fig. 44. We do not even
need to do the integral to find the answer: the total area of the box is 602 = 3600, while the
area of the two blue triangles is 502 = 2500, and so
P (|Y − X| ≤ 10) = (3600 − 2500)/3600 = 11/36 ≈ 31%.
3600 36
Figure 44: Probability domain for Alice and Bob meeting.
Part b) Here the PDFs for Alice and Bob are not the same. We still have
1
fX (x) =
60
for Alice, but for Bob we’re told that fY (y) ∝ y 2 . To find the normalization constant we must
integrate this function over the domain of interest:
∫_0^{60} Cy² dy = [Cy³/3]_0^{60} = C · 60³/3 = 1    ⇒    C = 3/60³.
The joint PDF is then fXY (x, y) = (1/60)(3y²/60³) = 3y²/60⁴, and the probability of meeting
is again the integral of fXY over the same domain as before, i.e., the red region in Fig. 44. Due
to the shape of the domain, note that it is easier to integrate over the blue regions instead (that
is, the cases where the two do not meet), and so we write
P (|Y − X| ≤ 10) = 1 − P ((Y − X) > 10) − P ((Y − X) < −10),
where the last two terms correspond to the upper and lower triangles, respectively. Note that
these two are no longer the same. We have
P ((Y − X) > 10) = ∫_0^{50} dx ∫_{x+10}^{60} (3/60⁴) y² dy
= ∫_0^{50} (1/60⁴) [60³ − (x + 10)³] dx
= [x/60 − (1/60⁴)(x + 10)⁴/4]_0^{50}
= 7/12 + (1/4)(1/6)⁴.
P ((Y − X) < −10) = ∫_{10}^{60} dx ∫_0^{x−10} (3/60⁴) y² dy
= ∫_{10}^{60} (1/60⁴)(x − 10)³ dx
= (1/60⁴) [(x − 10)⁴/4]_{10}^{60}
= (1/4)(5/6)⁴.
Therefore, we find
P (|Y − X| ≤ 10) = 1 − 7/12 − (1/4)(1/6)⁴ − (1/4)(5/6)⁴ = 767/2592 ≈ 30%.
This is slightly lower than the result from part a), but only by around 1%. Bob’s preference for
showing up late doesn’t seem to make much of a difference.
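Both answers can be verified numerically by reducing the double integral to one dimension: P(meet) = ∫ fX(x) [FY(x + 10) − FY(x − 10)] dx, with FY the CDF of Bob's arrival time and the argument clipped to [0, 60]. A sketch (an illustration, not part of the original notes):

```python
# Numerical check of both parts of the Alice-and-Bob example.
import numpy as np
from scipy.integrate import quad

def p_meet(FY):
    """P(|Y - X| <= 10) with X uniform on [0, 60] min and CDF FY for Y."""
    def integrand(x):
        hi = FY(min(x + 10.0, 60.0))
        lo = FY(max(x - 10.0, 0.0))
        return (1.0/60.0) * (hi - lo)
    # the integrand has kinks at x = 10 and x = 50
    return quad(integrand, 0, 60, points=[10, 50])[0]

uniform_cdf   = lambda y: y / 60.0          # part a)
quadratic_cdf = lambda y: (y / 60.0)**3     # part b): fY proportional to y^2

print(p_meet(uniform_cdf),   11/36)
print(p_meet(quadratic_cdf), 767/2592)
```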
where γ is the closed curve forming the boundary of the Pringles surface given by the
equation z = x² − y², for x² + y² ≤ 1, by applying Stokes' theorem.
Figure 45: The Pringles crisp, a hyperbolic paraboloid.
and so we need to find the curl of F and an expression for the vector area dS. The curl is ∇ × F = (x, −y, 0).
For the area we can write g = z − f (x, y) = z − x² + y², the gradient of which will provide the
normal vector to the surface
∇g(x, y, z) = (−2x, 2y, 1),
so that
dS = ∇g dx dy = (−2x, 2y, 1) dx dy.
This is oriented in the right direction. Inserting these expressions into the integral we have
∬_S (∇ × F) · dS = ∬_A (x, −y, 0) · (−2x, 2y, 1) dx dy
= −2 ∬_A (x² + y²) dx dy
= −2 ∫_0^{2π} dθ ∫_0^1 r (r² cos²θ + r² sin²θ) dr
= −4π [r⁴/4]_0^1 = −π.
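As a final cross-check (an illustration, not part of the original notes), the same flux can be computed numerically in Cartesian coordinates over the unit disk:

```python
# Numerical flux of curl F = (x, -y, 0) through the Pringles surface:
# integrate -2(x^2 + y^2) over the unit disk; expect -pi.
import numpy as np
from scipy.integrate import dblquad

val = dblquad(lambda y, x: -2*(x**2 + y**2),
              -1, 1,
              lambda x: -np.sqrt(1 - x**2),
              lambda x:  np.sqrt(1 - x**2))[0]
print(val, -np.pi)  # agree
```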