
Multivariable Knill 2022

The document is a comprehensive overview of Multivariable Calculus, covering topics such as geometry, vectors, curves, surfaces, and optimization techniques. It includes definitions, theorems, and applications related to distance, dot and cross products, parametric equations, and partial derivatives. Key concepts like linearization, gradients, and Lagrange multipliers are also discussed to aid in finding maxima and minima under constraints.


Multivariable Calculus Math S21A, 2022

Oliver Knill Harvard University

Chapter 1. Geometry and Space

Section 1.1: Space and Distance

Points P in space are described by coordinates like P = (3, 4, 5)

As promoted by René Descartes in the 17th century, geometry can be treated algebraically using coordinate systems. The distance between $P = (x, y, z)$ and $Q = (a, b, c)$ is defined as $d(P, Q) = \sqrt{(x - a)^2 + (y - b)^2 + (z - c)^2}$. This is motivated by the theorem of Pythagoras, which we will prove. We explore geometric objects in the plane and in space. We focus on cylinders, planes and spheres and learn how to find the center and radius of a sphere by completing the square.
René Descartes

Section 1.2: Vectors and Dot product

Two points $P, Q$ define a vector $\overrightarrow{PQ} = -\overrightarrow{QP}$. Vectors describe velocities, forces, color or data. The components of the vector $\overrightarrow{PQ}$ connecting $P = (a, b, c)$ with $Q = (x, y, z)$ are the entries of $[x - a, y - b, z - c]$. The zero vector is $\vec{0} = [0, 0, 0]$. The standard basis vectors are $\vec{i} = [1, 0, 0]$, $\vec{j} = [0, 1, 0]$, $\vec{k} = [0, 0, 1]$. Addition, subtraction and scalar multiplication work geometrically and algebraically. The dot product $\vec{v} \cdot \vec{w}$ is a scalar giving the length $|\vec{v}| = \sqrt{\vec{v} \cdot \vec{v}}$ and the direction $\vec{v}/|\vec{v}|$ for $|\vec{v}| \neq 0$. The angle $\alpha$ is defined by $\vec{v} \cdot \vec{w} = |\vec{v}| |\vec{w}| \cos\alpha$, as justified by the Cauchy-Schwarz inequality $|\vec{u} \cdot \vec{v}| \leq |\vec{u}| |\vec{v}|$. The cos-formula follows. If $\vec{v} \cdot \vec{w} = 0$, we say $\vec{v}, \vec{w}$ are perpendicular, which gives the Pythagoras formula $|\vec{v} + \vec{w}|^2 = |\vec{v}|^2 + |\vec{w}|^2$.

Section 1.3: Cross and Triple Product

The cross product $\vec{v} \times \vec{w}$ of $\vec{v} = [a, b, c]$ and $\vec{w} = [p, q, r]$ is defined as $[br - cq, cp - ar, aq - bp]$. It is perpendicular to $\vec{v}$ and $\vec{w}$. In two dimensions, the cross product is a scalar $[a, b] \times [p, q] = aq - bp$. The product is useful to compute areas of parallelograms, to find the distance between a point and a line, to construct a plane through three points or to intersect two planes. We prove the formula $|\vec{v} \times \vec{w}| = |\vec{v}| |\vec{w}| \sin(\alpha)$, which allows us to define the area of the parallelogram spanned by $\vec{v}$ and $\vec{w}$. The triple scalar product $(\vec{u} \times \vec{v}) \cdot \vec{w}$ is a scalar and defines the signed volume of the parallelepiped spanned by $\vec{u}$, $\vec{v}$ and $\vec{w}$. Its sign gives the orientation of the coordinate system defined by the three vectors. The triple scalar product is 0 if and only if the three vectors are in a common plane.
Rowan Hamilton
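As a small illustration of the cross product and triple scalar product formulas of Section 1.3, here is a minimal Python sketch. The helper names `cross`, `dot` and `triple` are chosen for this example only and are not part of the course material.

```python
def cross(v, w):
    """Cross product of two 3D vectors, using [br - cq, cp - ar, aq - bp]."""
    a, b, c = v
    p, q, r = w
    return [b * r - c * q, c * p - a * r, a * q - b * p]

def dot(v, w):
    """Dot product: sum of the products of the components."""
    return sum(x * y for x, y in zip(v, w))

def triple(u, v, w):
    """Triple scalar product (u x v) . w, the signed volume."""
    return dot(cross(u, v), w)

v, w = [1, 2, 3], [4, 5, 6]
n = cross(v, w)
print(n)                      # [-3, 6, -3]
print(dot(n, v), dot(n, w))   # 0 0, so n is perpendicular to v and w

# The standard basis spans a cube of signed volume 1.
print(triple([1, 0, 0], [0, 1, 0], [0, 0, 1]))  # 1
# Three vectors in the xy-plane give volume 0.
print(triple([1, 0, 0], [0, 1, 0], [1, 1, 0]))  # 0
```

Integer inputs keep the arithmetic exact, so the perpendicularity check gives exactly zero.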

Section 1.4: Lines and Planes

Because $[a, b, c] = \vec{n} = \vec{u} \times \vec{v}$ is perpendicular to $\vec{x} - \vec{w}$ if $\vec{x}, \vec{w}$ are in the plane spanned by $\vec{u}$ and $\vec{v}$, points on a plane satisfy the equation $ax + by + cz = d$. We often know the normal vector $\vec{n} = [a, b, c]$ to a plane and can determine the constant $d$ by plugging a known point $(x, y, z)$ into the equation $ax + by + cz = d$. The parametrization $\vec{x}(t, s) = \vec{w} + t\vec{u} + s\vec{v}$ is another way to represent such surfaces. We introduce lines by the parametrization $\vec{r}(t) = \overrightarrow{OP} + t\vec{v}$, where $P$ is a point on the line and $\vec{v} = [a, b, c]$ is a vector giving the direction of the line. If $P = (o, p, q)$ and $a, b, c$ are all non-zero, then $(x - o)/a = (y - p)/b = (z - q)/c$ is called the symmetric equation of a line. It can be interpreted as the intersection of two planes. As an application of the dot and cross products, we look at various distance formulas.
Arthur Cayley

Chapter 2. Curves and Surfaces

Section 2.1: Level Curves and Surfaces

The graph of a function $f(x, y)$ of two variables is defined as the set of points $(x, y, z)$ for which $g(x, y, z) = z - f(x, y) = 0$. We look at examples and match some graphs with functions $f(x, y)$. Generalized traces like $f(x, y) = c$ are called level curves of $f$ and help to visualize surfaces. The set of all level curves forms a contour map. After a short review of conic sections like ellipses, parabolas and hyperbolas in two dimensions, we look at more general surfaces of the form $g(x, y, z) = 0$. We start with the sphere and the plane. If $g(x, y, z)$ is a function which only involves linear and quadratic terms, the level surface is called a quadric. Important quadrics are spheres, ellipsoids, cones, paraboloids, cylinders as well as hyperboloids.
Claudius Ptolemy

Section 2.2: Parametric Surfaces

Surfaces are described implicitly or parametrically. An example of an implicit description $g(x, y, z) = 0$ is the unit sphere $x^2 + y^2 + z^2 - 1 = 0$. Examples of parametrizations $\vec{r}(u, v) = [x(u, v), y(u, v), z(u, v)]$ include the sphere $\vec{r}(\theta, \phi) = [\rho \cos(\theta) \sin(\phi), \rho \sin(\theta) \sin(\phi), \rho \cos(\phi)]$, where $\rho$ is fixed and $\phi, \theta$ are the Euler angles. Using computers, one can also visualize complicated surfaces. Parametrization of surfaces is important in geodesy, where parametrizations appear as maps, and in computer generated imaging, where the parametrization $\vec{r}(u, v)$ is called the “uv-map”. Parametrizations of surfaces make use of cylindrical coordinates $(r, \theta, z)$, where $r \geq 0$ is the distance to the z-axis and $0 \leq \theta < 2\pi$ is an angle. Spherical coordinates $(\rho, \theta, \phi)$ use $\rho$, the distance to $(0, 0, 0)$, and $\theta, \phi$, the Euler angles.
Leonhard Euler

Section 2.3: Parametric Curves

The parametrization $\vec{r}(t) = [x(t), y(t), z(t)]$ of a curve using a parameter $t$ in the interval $I = [a, b]$ contains more information than the curve itself. It also tells how the curve is traced if $t$ is interpreted as time. Differentiation of a parametrization $\vec{r}(t)$ leads to the velocity $\vec{r}'(t)$, a vector which is tangent to the curve at $\vec{r}(t)$. A second differentiation with respect to $t$ gives the acceleration vector $\vec{r}''(t)$. The speed $|\vec{r}'(t)|$ is a scalar. We also learn how to get the position $\vec{r}(t)$ from $\vec{r}''(t)$, $\vec{r}'(0)$ and $\vec{r}(0)$ by integration. A special case is the free fall, where the acceleration vector is constant.
Johannes Kepler

Section 2.4: Arc length and Curvature

The arc length of a curve is defined as a limiting length of polygons and leads to the arc length integral $\int_a^b |\vec{r}'(t)| \, dt$. A re-parametrization of a curve does not change the arc length. The curvature $\kappa(t)$ of a curve measures how much a curve is bent. Acceleration and curvature involve second derivatives. Curvature is a quantity which does not depend on the parametrization. One “feels” acceleration and “sees” curvature $\kappa(t) = |\vec{T}'(t)|/|\vec{r}'(t)| = |\vec{r}'(t) \times \vec{r}''(t)|/|\vec{r}'(t)|^3$, where $\vec{T}(t) = \vec{r}'(t)/|\vec{r}'(t)|$ is the unit tangent vector. Together with the normal vector $\vec{N}$ and the bi-normal vector $\vec{B}$, the three vectors form an orthonormal frame.
Isaac Newton

Chapter 3. Linearization and Gradient

Section 3.1: Partial Derivatives

Continuity questions in several variables can be more interesting than in one dimension. It can happen for example that $t \to f(t\vec{v})$ is continuous for every $\vec{v}$ but that $f$ is still not continuous. Discontinuities naturally appear with catastrophes, changes of the minimum of a critical point. Partial derivatives $f_x = \partial_x f = \frac{\partial f}{\partial x}$ satisfy Clairaut's theorem $f_{xy} = f_{yx}$ for smooth functions (functions one can differentiate arbitrarily often). We then look at some partial differential equations (PDE). Examples are the transport equation $f_x(t, x) = f_t(t, x)$, the wave equation $f_{tt}(t, x) = f_{xx}(t, x)$ and the heat equation $f_t(t, x) = f_{xx}(t, x)$.
Alexis Clairaut

Section 3.2 Linear Approximation

Linearization is an important concept in science because many physical laws are linearizations of more complicated laws. Linearization is also useful to estimate quantities. After a review of linearization of functions of one variable, we introduce the linearization of a function $f(x, y)$ of two variables at a point $(p, q)$. It is defined as the function $L(x, y) = f(p, q) + f_x(p, q)(x - p) + f_y(p, q)(y - q)$. The tangent line $ax + by = d$ at a point $(p, q)$ is a level curve of $L$, with $a = f_x(p, q)$, $b = f_y(p, q)$. Linearization works similarly in three dimensions, where it allows us to compute the tangent plane $ax + by + cz = d$. The key is the gradient $f' = \nabla f = [f_x, f_y, f_z]$. We do not cover higher order approximations, but they could be done. For nice functions of several variables there is the Taylor theorem $f(\vec{x}) = f(\vec{p}) + f'(\vec{p}) \cdot (\vec{x} - \vec{p}) + f''(\vec{p})(\vec{x} - \vec{p}) \cdot (\vec{x} - \vec{p})/2 + \cdots$ as in one dimension. The second derivative $f''(\vec{p})$ is, for a function of three variables, a $3 \times 3$ matrix whose entries are all the second partial derivatives of $f$ at $\vec{p}$, including the mixed ones.
Brook Taylor

Section 3.3: Implicit Differentiation

The chain rule $\frac{d}{dt} f(g(t)) = f'(g(t)) g'(t)$ in one dimension can be generalized to higher dimensions. It becomes $\frac{d}{dt} f(\vec{r}(t)) = \nabla f(\vec{r}(t)) \cdot \vec{r}'(t)$, where $\nabla f = [f_x, f_y, f_z]$ is the gradient. Written out, this formula is $\frac{d}{dt} f(x(t), y(t), z(t)) = f_x(x(t), y(t), z(t)) x'(t) + f_y(x(t), y(t), z(t)) y'(t) + f_z(x(t), y(t), z(t)) z'(t)$. All other versions of the chain rule, for example for functions of several variables or for vector-valued functions, can be derived from this one. A nice application of the chain rule is implicit differentiation: if $f(x, y, z) = 0$ defines a surface which looks locally like $z = g(x, y)$, then differentiating $f(x, y, g(x, y)) = 0$ gives $f_x + f_z g_x = 0$ and $f_y + f_z g_y = 0$, so that we can compute the partial derivatives $g_x = -f_x/f_z$ and $g_y = -f_y/f_z$ of $g$ without knowing $g$.
Gottfried Wilhelm Leibniz

Section 3.4: Steepest Ascent

The gradient helps to understand the geometry of surfaces $f(x, y, z) = c$ because it is perpendicular to the level surface $f(x, y, z) = c$. One can see this by linearization or by using the chain rule for a curve $\vec{r}(t)$ on the surface, for which $f(\vec{r}(t)) = c$. A special case is the plane $g(x, y, z) = ax + by + cz = d$, where $\nabla g = [a, b, c]$. The gradient helps to find tangent planes and tangent lines. We introduce the directional derivative $D_{\vec{v}} f = \nabla f \cdot \vec{v}$ for unit vectors $\vec{v}$. Partial derivatives are special directional derivatives. The directional derivative in the direction of the normal vector $\nabla f$ is non-negative. Moving in the direction of the normal vector increases $f$ because $D_{\nabla f/|\nabla f|} f = |\nabla f|$. In other words, the gradient vector points in the direction of steepest ascent.
Pierre-Simon Laplace

Chapter 4. Extrema and Double integrals

Section 4.1: Maxima and Minima

To maximize $f(x, y)$, first identify critical points, points where the gradient vanishes: $\nabla f(x, y) = [0, 0]$. The nature of critical points can be established using the second derivative test. Let $(p, q)$ be a critical point and let $D = f_{xx} f_{yy} - f_{xy}^2$ denote the discriminant of $f$ at this critical point. There are three fundamentally different cases: local maxima, local minima as well as saddle points. If $D < 0$, then $(p, q)$ is a saddle point; if $D > 0$ and $f_{xx} < 0$, then we have a local maximum; if $D > 0$ and $f_{xx} > 0$, then we have a local minimum. If $D = 0$, we cannot determine the nature of the critical point from the second derivatives alone. Global maxima are points $(x_0, y_0)$ where $f(x_0, y_0) \geq f(x, y)$ for all $(x, y)$.
Pierre de Fermat
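The second derivative test of Section 4.1 can be written as a short decision procedure. The following Python sketch assumes that the second partial derivatives at the critical point have already been computed by hand; the function name `classify_critical_point` is a hypothetical name chosen for this illustration.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Second derivative test at a critical point.

    fxx, fyy, fxy are the values of the second partial derivatives
    at the critical point. D = fxx * fyy - fxy^2 is the discriminant.
    """
    D = fxx * fyy - fxy ** 2
    if D < 0:
        return "saddle point"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D > 0 and fxx > 0:
        return "local minimum"
    return "inconclusive"  # D = 0: second derivatives alone do not decide

# Critical point (0, 0) of four sample functions:
print(classify_critical_point(2, 2, 0))    # f = x^2 + y^2
print(classify_critical_point(-2, -2, 0))  # f = -x^2 - y^2
print(classify_critical_point(2, -2, 0))   # f = x^2 - y^2
print(classify_critical_point(0, 0, 0))    # f = x^4 + y^4
```

The last example shows why the case D = 0 is left open: x^4 + y^4 has a minimum at the origin, but the test cannot detect it.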

Section 4.2: Lagrange Multipliers

We can maximize $f(x, y)$ in the presence of a constraint $g(x, y) = 0$. A necessary condition for a maximum is that $\nabla f$ and $\nabla g$ are parallel. The corresponding system of equations is called the Lagrange equations. They form a system of nonlinear equations $\nabla f = \lambda \nabla g$, $g = 0$. Extrema either solve these equations or satisfy $\nabla g = 0$. When we maximize or minimize functions on a domain bounded by a curve $g(x, y) = 0$, we have to solve two problems: find the extrema in the interior and the extrema on the boundary. The second problem is a Lagrange problem. With the same method we can also maximize or minimize functions $f(x, y, z)$ of three variables under the constraints $g(x, y, z) = 0$, $h(x, y, z) = 0$. In two or three dimensions, extrema could also be obtained without Lagrange multipliers by looking at the equation $\nabla f \times \nabla g = \vec{0}$. Still, the Lagrange framework is very general and works in any dimension.
Joseph Louis Lagrange

Section 4.3: Double integrals

Integration in two dimensions is first done on rectangles, then on regions $G$ bounded by graphs of functions. Depending on whether curves $y = c(x)$, $y = d(x)$ or curves $x = a(y)$, $x = b(y)$ are the boundaries, we call the region a bottom-to-top region or a left-to-right region. As in one dimension, there is an Archimedean sum or Riemann sum approximation of the integral. This allows us to derive results like Fubini's theorem on a rectangular region or the change of the order of integration, which often enables the integration. The double integral $\iint_G f(x, y) \, dxdy$ is the signed volume under the graph of the function of two variables. Double integrals define area if $f(x, y) = 1$. By changing the order of integration in regions which are of both types, we can sometimes compute integrals which would otherwise be impossible.
Guido Fubini

Section 4.4: Polar Integration

Some regions can be described better in polar coordinates $(r, \theta)$, where $r \geq 0$ is the distance to the origin and $\theta$ is the polar angle with the positive x-axis. Examples of regions which can be treated like that are polar regions $0 \leq r \leq g(\theta)$, which trace flower-like shapes in the plane. Another application of double integrals is surface area. We derive the formula $\iint_R |\vec{r}_u \times \vec{r}_v| \, dudv$ and give examples like graphs, surfaces of revolution and especially the sphere. Similarly as for arc length, it is easy to give examples where the surface area can be computed in closed form, like triangles, parts of the sphere, cylinder or paraboloid. Polar integration also helps to find one-dimensional integrals which otherwise would be difficult to obtain.
Bonaventura Cavalieri

Chapter 5. Line integrals

Section 5.1: Triple Integrals

Triple integrals can measure volume, moment of inertia or the center of mass of a solid. They are first introduced for cuboids, then for more general regions like solids sandwiched between the graphs of two functions $g(x, y)$ and $h(x, y)$. Applications are computations of mass $\iiint_E \delta(x, y, z) \, dxdydz$, moment of inertia $\iiint_E (x^2 + y^2 + z^2) \, dxdydz$, center of mass $\iiint_E [x, y, z] \, dV / \iiint_E dV$, and the expectation $E[X] = \iiint_\Omega X(x, y, z) \, dV / \iiint_\Omega dV$ of a random variable $X(x, y, z)$ on a region $\Omega$.
Archimedes of Syracuse

Section 5.2: Spherical Integration

Some objects can be described better in cylindrical coordinates $(r, \theta, z)$, which are just polar coordinates for the $x, y$ variables in space, with an additional $z$ coordinate. Examples of such regions are parts of cylinders or solids of revolution. The important factor to include when changing to cylindrical coordinates is $r$. Other regions are integrated over better in spherical coordinates $(\rho, \phi, \theta)$ with $\rho \geq 0$, the distance to the origin, and the angles $\phi \in [0, \pi]$ and $\theta \in [0, 2\pi)$. Examples of such regions are parts of cones or spheres. The important factor to include when changing to spherical coordinates is $\rho^2 \sin(\phi)$. As an application, we can compute moments of inertia of some bodies.
Bernhard Riemann

Section 5.3: Vector Fields

Vector fields occur as force fields or velocity fields, in phase portraits of mechanics or in population dynamics. An important class are gradient fields. We look at examples in two or three dimensions. We learn how to match vector fields with formulas and introduce flow lines, parametrized curves $\vec{r}(t)$ for which the vector $\vec{F}(\vec{r}(t))$ is parallel to $\vec{r}'(t)$ at all times. Given a parametrized curve $\vec{r}(t)$ and a vector field $\vec{F}$, we can define the line integral $\int_C \vec{F}(\vec{r}(t)) \cdot \vec{r}'(t) \, dt$ along the curve in the presence of the vector field. An important example is the case where $\vec{F}$ is a force field. The line integral is then work.
James Maxwell

Section 5.4: Line Integrals

For conservative vector fields one can evaluate a line integral using the fundamental theorem of line integrals. The property of being conservative is also called path independence and is the same as being a gradient field $\vec{F} = \nabla f$. It is equivalent to being irrotational, $\mathrm{curl}(\vec{F}) = Q_x - P_y = 0$, if the topological condition of being simply connected is satisfied: any closed curve can be contracted continuously to a point within the region. The region $\{(x, y) \mid x^2 + y^2 > 1\}$ for example is not simply connected because the path $[2\cos(t), 2\sin(t)]$ cannot be pulled together to a point. In two dimensions, the curl of a field $\mathrm{curl}([P, Q]) = Q_x - P_y$ measures the vorticity of the field and if this is zero, the line integral along any closed curve in a simply connected region is zero.
André-Marie Ampère
Chapter 6. Integral Theorems

Section 6.1: Green’s Theorem

Green's theorem equates the line integral along a boundary curve $C$ with a double integral of the curl inside the region $G$: $\iint_G \mathrm{curl}(\vec{F})(x, y) \, dxdy = \int_C \vec{F}(\vec{r}(t)) \cdot \vec{r}'(t) \, dt$. The theorem is useful to compute areas: take a field $\vec{F} = [0, x]$ which has constant curl 1. It also allows us to compute complicated line integrals. Green's theorem implies that if $\mathrm{curl}(\vec{F}) = 0$ everywhere in the plane, then the field has the closed loop property and is therefore conservative. The curl of a vector field $\vec{F} = [P, Q, R]$ in three dimensions is a new vector field which can be computed as $\nabla \times \vec{F}$. The three components of $\mathrm{curl}(\vec{F})$ are the vorticity of the vector field in the x, y and z direction.
Mikhail Ostrogradsky

Section 6.2: Flux Integrals

Given a surface $S$ and a fluid moving with velocity field $\vec{F}(x, y, z)$ at $(x, y, z)$, the amount of fluid which passes through the membrane $S$ in unit time is the flux. This integral is $\iint_S \vec{F} \cdot (\vec{r}_u \times \vec{r}_v) \, dudv$. The angle between $\vec{F}$ and the normal vector $\vec{n} = \vec{r}_u \times \vec{r}_v$ determines the sign of $\vec{F} \cdot d\vec{S} = \vec{F} \cdot \vec{n} \, dudv$. Many concepts are used in this definition: the parametrization of surfaces, the dot and cross product, as well as double integrals. We discuss how the derivatives div, grad and curl fit together. In one dimension, there is only one derivative; in two dimensions, there are two derivatives, grad and curl; and in three dimensions, there are three derivatives, grad, curl and div.
Siméon Denis Poisson

Section 6.3: Stokes Theorem

Stokes' theorem equates the line integral along the boundary $C$ of the surface $S$ with the flux of the curl of the field through the surface: $\int_C \vec{F} \cdot d\vec{r} = \iint_S \mathrm{curl}(\vec{F}) \cdot d\vec{S}$. The correct orientation of the surface is important. The theorem allows us to illustrate the Maxwell equations in electromagnetism and explains why the line integral of an irrotational field along a closed curve in space is zero if the region where $\vec{F}$ is defined is simply connected. It is the flux of the curl of $\vec{F}$ through the surface $S$ bounded by the curve $C$. At this moment, the Mathematica project is due. The project is creative and illustrates the strong connections of mathematics with art.
George Gabriel Stokes

Section 6.4 Divergence Theorem

The total divergence of a vector field $\vec{F} = [P, Q, R]$ inside a solid $E$ is the flux of $\vec{F}$ through the boundary $S$. This divergence theorem equates the “local expansion rate” of a vector field $\vec{F}$ integrated over the solid, $\iiint_E \mathrm{div}(\vec{F}) \, dV$, with the flux $\iint_S \vec{F} \cdot d\vec{S}$ through the boundary surface $S$ of $E$. Overview: In one dimension, there is one integral theorem, the fundamental theorem of calculus. In two dimensions, we have the fundamental theorem of line integrals in the plane as well as Green's theorem. In three dimensions we have the fundamental theorem of line integrals in space, Stokes' theorem and the divergence theorem. These integral theorems are all of the form $\int_{\delta G} F = \int_G dF$, where $\delta G$ is the boundary of $G$ and $dF$ is the derivative of $F$.
Carl Friedrich Gauss
MULTIVARIABLE CALCULUS

MATH S-21A

Unit 1: Geometry and Distance

Lecture
1.1. A point on the real line $\mathbb{R}$ is determined by a single real coordinate $x$. Zero divides the positive axis from the negative axis. A point $P = (x, y)$ in the plane $\mathbb{R}^2$ is fixed by two coordinates $x$ and $y$. In space $\mathbb{R}^3$, locating a point needs three coordinates $P = (x, y, z)$. The third coordinate $z$ is usually interpreted as height, the distance from the xy-plane. The signs define four quadrants in $\mathbb{R}^2$ or eight octants in $\mathbb{R}^3$. These regions all intersect at the origin $O = (0, 0)$ or $O = (0, 0, 0)$ and are bounded by coordinate axes $\{y = 0\}$ and $\{x = 0\}$ or coordinate planes $\{x = 0\}$, $\{y = 0\}$, $\{z = 0\}$.
1.2. In $\mathbb{R}^2$, we usually orient the x-axis to point “east” and the y-axis to point “north”. In $\mathbb{R}^3$, a common view is to see the xy-plane as the “ground” and to imagine the z-coordinate axis pointing “up”. A photographic coordinate system appears in computer graphics or photography, where the xy-plane is the retina or film plate and the z-coordinate measures the distance towards the viewer.
1.3. The Euclidean distance between two points $P = (x, y, z)$ and $Q = (a, b, c)$ in space is defined as

Definition: $d(P, Q) = \sqrt{(x - a)^2 + (y - b)^2 + (z - c)^2}$.

1.4. The points A = (1, 1, 0), B = (1, 0, 1), C = (0, 1, 1) define an equilateral trian-
gle in space because all distances d(A, B), d(B, C) and d(A, C) are equal. The points
A = (275, 0, 0), B = (0, 252, 0), C = (0, 0, 240) define a triangle, in which all sides have
integer length. You will encounter this triangle in a homework.
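The two triangles of 1.4 can be checked numerically. The following Python sketch is only an illustration; the helper name `distance` is chosen here and is not part of the notes.

```python
import math

def distance(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# The equilateral triangle of 1.4: every side has length sqrt(2).
A, B, C = (1, 1, 0), (1, 0, 1), (0, 1, 1)
equilateral_sides = [distance(A, B), distance(B, C), distance(A, C)]
print(equilateral_sides)

# The triangle of 1.4 in which all sides have integer length.
A2, B2, C2 = (275, 0, 0), (0, 252, 0), (0, 0, 240)
integer_sides = [distance(A2, B2), distance(B2, C2), distance(A2, C2)]
print(integer_sides)  # [373.0, 348.0, 365.0]
```

The same function works in any dimension, because it only loops over the coordinates of the two points.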
1.5. The distance formula is a definition, not a result. It is motivated by the theorem of Pythagoras. We will prove the Pythagorean theorem later. This distance is defined in any dimension. In the plane for example, the distance of the point $(x, y)$ to $(a, b)$ is $\sqrt{(x - a)^2 + (y - b)^2}$. When working in $\mathbb{R}^2$, we do not think of it as part of $\mathbb{R}^3$. Coordinates work in arbitrary dimensions. A collection of $n$ data points defines a vector in $\mathbb{R}^n$. The Euclidean space $\mathbb{R}^n$ appears in data science, as $n$ real data points can be seen as a vector. The Euclidean distance between two data points $x = (x_1, \dots, x_n)$ and $a = (a_1, \dots, a_n)$ is then $d(x, a) = \sqrt{\sum_{k=1}^n (x_k - a_k)^2}$. The sum of the squares appears in statistics, like in least square problems.

1.6. Points, curves, surfaces and solids are geometric objects which can be de-
scribed with functions of several variables. An example of a curve is a circle, an
example of a surface is a sphere, an example of a solid is the ball, the region enclosed
by a sphere.

Definition: A circle of radius $r \geq 0$ centered at $P = (a, b)$ is the collection of points in $\mathbb{R}^2$ which have distance $r$ from $P$. A sphere of radius $\rho \geq 0$ centered at $P = (a, b, c)$ is the collection of points in $\mathbb{R}^3$ which have distance $\rho$ from $P$. The equation of a sphere is $(x - a)^2 + (y - b)^2 + (z - c)^2 = \rho^2$.

1.7. Completing the square for an equation $x^2 + bx + c = 0$ means adding $(b/2)^2 - c$ on both sides to get $(x + b/2)^2 = (b/2)^2 - c$. Solving for $x$ gives $x = -b/2 \pm \sqrt{(b/2)^2 - c}$. This is the quadratic formula. It is good to know this formula because you do not want to waste your creative power having to re-derive it. If you should still forget it, the completion idea allows you to re-derive it.

Examples
1.8. P = (−1, −3) is in the third quadrant of the plane and P = (2, 4, 3) is in the
positive octant of space. The point (0, 0, −8) is located on the negative z axis. The
point P = (4, 5, −3) is below the xy-plane. Can you spot the point Q on the xy-plane
which is closest to P ?

1.9. Problem: Find the midpoint $M$ on the line segment connecting $P = (1, 2, 5)$ and $Q = (-3, 4, 7)$ and verify that $d(P, M) + d(Q, M) = d(P, Q)$. Answer: The point we are looking for is the average $M = (P + Q)/2 = (-1, 3, 6)$. The distances are $d(P, Q) = \sqrt{4^2 + 2^2 + 2^2} = \sqrt{24}$, $d(P, M) = \sqrt{2^2 + 1^2 + 1^2} = \sqrt{6}$ and $d(Q, M) = \sqrt{2^2 + 1^2 + 1^2} = \sqrt{6}$. Indeed $d(P, M) + d(M, Q) = 2\sqrt{6} = \sqrt{24} = d(P, Q)$.

1.10. The equation $x^2 + 5x + y^2 - 2y + z^2 = -1$ becomes, after completing the square, $(x + 5/2)^2 - 25/4 + (y - 1)^2 - 1 + z^2 = -1$ or $(x + 5/2)^2 + (y - 1)^2 + z^2 = (5/2)^2$. We see a sphere with center $(-5/2, 1, 0)$ and radius $5/2$.
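The completion of the square in 1.10 follows a fixed recipe, which the following Python sketch turns into a small function. The function name `sphere_from_equation` and the way the coefficients are passed are assumptions made for this illustration.

```python
import math

def sphere_from_equation(d, e, f, g):
    """Center and radius of the sphere x^2 + d*x + y^2 + e*y + z^2 + f*z = g.

    Completing the square in each variable gives
    (x + d/2)^2 + (y + e/2)^2 + (z + f/2)^2 = g + (d^2 + e^2 + f^2)/4.
    """
    center = (-d / 2, -e / 2, -f / 2)
    radius_squared = g + (d ** 2 + e ** 2 + f ** 2) / 4
    if radius_squared < 0:
        raise ValueError("the equation has no real solutions")
    return center, math.sqrt(radius_squared)

# The example of 1.10: x^2 + 5x + y^2 - 2y + z^2 = -1
center, radius = sphere_from_equation(5, -2, 0, -1)
print(center, radius)  # (-2.5, 1.0, 0.0) 2.5
```

If the right hand side after completion is negative, no point satisfies the equation, which is why the sketch raises an error in that case.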

1.11. Another distance $d(P, Q) = |x - a| + |y - b|$ in the plane $\mathbb{R}^2$ is called the taxi metric or Manhattan distance. Problem: draw a circle of radius 2 in this metric. More challenging: draw an ellipse in this plane: the set of points whose sum of the distances from $(-2, 0)$ and $(2, 0)$ is equal to 6.

1.12. Draw the unit circle of the quartic distance $d(P, Q) = ((x - a)^4 + (y - b)^4)^{1/4}$. More generally, for any $p \geq 1$, we get a distance $d(P, Q) = (|x - a|^p + |y - b|^p)^{1/p}$. For $p = 1$, it is the taxi metric, for $p = 2$ it is the Euclidean metric, and for $p \to \infty$ it converges to the distance $\max(|x - a|, |y - b|)$, which is the $l^\infty$ metric. Problem: is $d(P, Q) = (\sqrt{|x - a|} + \sqrt{|y - b|})^2$, the case $p = 1/2$, a distance? Answer: no. While it satisfies $d(P, Q) = d(Q, P)$ and is zero if and only if $P = Q$, it does not satisfy the triangle inequality $d(A, B) + d(B, C) \geq d(A, C)$: for $A = (0, 0)$, $B = (1, 0)$, $C = (1, 1)$ we get $d(A, B) + d(B, C) = 2$ but $d(A, C) = 4$. We call a space $(X, d)$ for which $d$ is a distance function satisfying $d(P, Q) = d(Q, P)$, $d(P, Q) = 0 \leftrightarrow P = Q$ and $d(A, B) + d(B, C) \geq d(A, C)$ a metric space.
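The triangle inequality can be tested numerically on a single triple of points. The Python sketch below assumes the family of distances has the form $(|x - a|^p + |y - b|^p)^{1/p}$ and treats $p = 1/2$ as the non-example; the function name `p_distance` and the chosen test points are illustrative assumptions.

```python
def p_distance(P, Q, p):
    """(|x - a|^p + |y - b|^p)^(1/p) for P = (x, y) and Q = (a, b)."""
    return (abs(P[0] - Q[0]) ** p + abs(P[1] - Q[1]) ** p) ** (1 / p)

A, B, C = (0, 0), (1, 0), (1, 1)

# For each exponent p, record whether d(A,B) + d(B,C) >= d(A,C) holds.
holds = {}
for p in (0.5, 1, 2, 4):
    left = p_distance(A, B, p) + p_distance(B, C, p)
    right = p_distance(A, C, p)
    holds[p] = left >= right
    print(p, left, right, holds[p])
```

For p = 0.5 the direct route from A to C has length 4 while the detour over B has length 2, so the inequality fails. One passing triple does not prove the triangle inequality for the other exponents; it only fails to find a counterexample.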
1.13. Problem: Find an algebraic expression for the set of all points for which the sum of the distances to $A = (1, 0)$ and $B = (-1, 0)$ is equal to 3. Answer: Square the equation $\sqrt{(x - 1)^2 + y^2} + \sqrt{(x + 1)^2 + y^2} = 3$, separate the remaining single square root on one side and square again. Simplification gives $20x^2 + 36y^2 = 45$, which is equivalent to $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$, where $a, b$ can be computed as follows: because $P = (a, 0)$ satisfies this equation, $d(P, A) + d(P, B) = (a - 1) + (a + 1) = 3$ so that $a = 3/2$. Similarly, the point $Q = (0, b)$ satisfying it gives $d(Q, A) + d(Q, B) = 2\sqrt{b^2 + 1} = 3$ or $b = \sqrt{5}/2$.
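The result of 1.13 can be double-checked numerically by sampling points on the ellipse with semi-axes $a = 3/2$ and $b = \sqrt{5}/2$. This Python sketch is an illustration only; the parametrization by $\cos$ and $\sin$ and the variable names are choices made here.

```python
import math

a, b = 3 / 2, math.sqrt(5) / 2   # semi-axes found in 1.13
A, B = (1.0, 0.0), (-1.0, 0.0)   # the two given points

def distance(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

max_error = 0.0
for k in range(360):
    t = math.radians(k)
    X = (a * math.cos(t), b * math.sin(t))
    # X should lie on the curve 20x^2 + 36y^2 = 45 ...
    residual = abs(20 * X[0] ** 2 + 36 * X[1] ** 2 - 45)
    # ... and the distances from X to A and B should add up to 3.
    error = abs(distance(X, A) + distance(X, B) - 3)
    max_error = max(max_error, residual, error)

print(max_error)  # tiny, only floating point rounding remains
```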

1.14. In “La Géométrie”, an appendix to his “Discours de la méthode” which appeared in 1637, René Descartes promoted the idea to use algebra to solve geometric problems. Even though Descartes mostly dealt with ruler-and-compass constructions, the rectangular coordinate system is now called the Cartesian coordinate system. His ideas profoundly changed mathematics. But ideas do not grow in a vacuum; Davis and Hersh write that in its current form, Cartesian geometry is due as much to Descartes' own contemporaries and successors as to himself. One of the first to explore higher dimensional Euclidean space was Ludwig Schläfli. 1

1.15. The method of completion of squares is due to Al-Khwarizmi, who lived from 780 to 850 and used it as a method to solve quadratic equations. Even though Al-Khwarizmi worked with numerical examples, it is one of the first important steps of algebra. His work “Compendium on Calculation by Completion and Reduction” was dedicated to the Caliph al Ma'mun, who had established a research center called “House of Wisdom” in Baghdad. 2

1.16. The Euclidean geometry described is only one of many geometries. One can work with more general metric spaces. An important class of metric spaces is studied in Riemannian geometry, where the distance between two points can become dependent on where we are. Space becomes curved. This is the framework of general relativity. Formally, this can happen by changing the coefficients $E, G$ of the metric $d(P, Q)^2 = E(x - a)^2 + G(y - b)^2$. On a sphere, where $x = \theta \in [0, 2\pi]$ is longitude and $y = \phi \in [0, \pi]$ is latitude, one would take $E = \sin^2(y)$, $G = 1$. Two points on the arctic circle with fixed longitudes have shorter distance than two points on the equator with the same fixed longitudes. It is important to think now of the surface of the sphere as a space itself, without its embedding in the ambient space. This space is curved. Our four dimensional space-time universe is curved depending on the matter distribution.

1 An entertaining read is “Descartes' Secret Notebook” by Amir Aczel, which deals with another discovery of Descartes.
2 The book “The Mathematics of Egypt, Mesopotamia, China, India, and Islam”, edited by Victor Katz, page 542, contains translations of some of this work.

Homework
This homework is due on Tuesday, 6/28/2022.

Problem 1.1: a) Verify that the three points P = (2, 4, 0), Q = (4, 0, 2), R = (6, 2, −2) define the corners of an equilateral triangle.
b) Find the smallest sphere which passes through all three points.

Problem 1.2: For each of the following objects in $\mathbb{R}^3$, draw it and describe it with a word: a) $A = \{|x| + |y| + |z| \leq 1\}$, b) $B = \{x^2 + y^2 < 1\}$, c) $C = A \cap B$ (A intersected with B), d) $D = A \setminus B$ (A without B).
e) With $d((x, y), (a, b)) = |x - a| + |y - b|$, $A = (-1, 0)$, $B = (0, 1)$, draw $E = \{X = (x, y) \mid d(X, A) + d(X, B) \leq 8\}$.

Problem 1.3: Verify that the radius of the inscribed circle in a 3 : 4 : 5 triangle is 1.

Problem 1.4: The figure shows two rectangles of area 64 and 65 made
up of matching pieces. What is going on?

[Figure: a 3-4-5 triangle with vertices A, B and inscribed circle, and two arrangements of the same matching pieces, one with corners (0,0), (8,0), (8,8), (0,8) and one with corners (0,0), (13,0), (13,5), (0,5).]

Figure 1. The 3-4-5 triangle and the Curry missing square paradox.

Problem 1.5: You play billiards on the table $\{(x, y) \mid 0 \leq x \leq 4, 0 \leq y \leq 8\}$. a) Hit the ball at (3, 2) to reach the hole (4, 8), bouncing three times off the left wall and three times off the right wall and off no other walls. Find the length of the shot.
b) Hit from (3, 2) to reach the hole (4, 0) after hitting the left wall twice and the right wall twice as well as the top wall y = 8 once. What is the length of the trajectory?

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022


Unit 2: Vectors and dot product

Lecture
2.1. Two points $P = (a, b, c)$ and $Q = (x, y, z)$ in space $\mathbb{R}^3$ define a vector $\vec{v} = \begin{bmatrix} x - a \\ y - b \\ z - c \end{bmatrix}$. We write this column vector also as a row vector $[x - a, y - b, z - c]$ in order to save space. As the vector goes from $P$ to $Q$, we write $\vec{v} = \overrightarrow{PQ}$. The real numbers $p, q, r$ in $\vec{v} = [p, q, r]$ are called the components of $\vec{v}$.

2.2. Vectors can be attached to any point in space. Two vectors with the same components are considered equal as they can be translated into each other. If a vector $\vec{v}$ starts at the origin $O = (0, 0, 0)$, then $\vec{v} = [p, q, r]$ heads from $O$ to the point $P = (p, q, r)$. One can therefore identify points $P = (a, b, c)$ with vectors $\vec{v} = [a, b, c]$ attached to the origin. For clarity, we often draw an arrow on top of a vector variable, and if $\vec{v} = \overrightarrow{PQ}$, then $P$ is the “tail” and $Q$ is the “head” of the vector. To distinguish vectors from points, it is customary to write [2, 3, 4] for vectors and (2, 3, 4) for points. 1

2.3.
Definition: The sum of two vectors is $\vec{u} + \vec{v} = [u_1, u_2] + [v_1, v_2] = [u_1 + v_1, u_2 + v_2]$. The scalar multiple is $\lambda\vec{u} = \lambda[u_1, u_2] = [\lambda u_1, \lambda u_2]$. The difference $\vec{u} - \vec{v}$ can best be seen as the addition of $\vec{u}$ and $(-1) \cdot \vec{v}$.

Commutativity, associativity, or distributivity rules for vectors are inherited from the corresponding rules for numbers. Please review these rules yourself if necessary.
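The componentwise definitions of 2.3 translate directly into code. The following Python sketch represents vectors as lists in bracket notation; the helper names `add`, `scale` and `subtract` are chosen for this illustration.

```python
def add(u, v):
    """Sum of two vectors, component by component."""
    return [ui + vi for ui, vi in zip(u, v)]

def scale(lam, u):
    """Scalar multiple lam * u."""
    return [lam * ui for ui in u]

def subtract(u, v):
    """Difference u - v, seen as the addition of u and (-1) * v."""
    return add(u, scale(-1, v))

u, v = [2, 3], [5, -1]
print(add(u, v))       # [7, 2]
print(scale(3, u))     # [6, 9]
print(subtract(u, v))  # [-3, 4]

# Commutativity is inherited from the rule for numbers.
print(add(u, v) == add(v, u))  # True
```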

2.4. The vectors $\vec{i} = [1, 0, 0]$, $\vec{j} = [0, 1, 0]$, $\vec{k} = [0, 0, 1]$ are called standard basis vectors. The notation has historical roots: the dot and cross product grew out of quaternions, which are points $(t, x, y, z)$ in $\mathbb{R}^4$, usually written as $q = t + ix + jy + kz$.

1 Please avoid the Stewart notation ⟨p, q, r⟩. It is not used in advanced mathematics. Programming
languages like Python use [p, q, r]. Mathematica, Perl or C use the notation {p, q, r}.

2.5.
Definition: The length $|\vec{v}|$ of a vector $\vec{v} = \overrightarrow{PQ}$ is defined as the distance $d(P, Q)$ from $P$ to $Q$. A vector of length 1 is called a unit vector. If $\vec{v} \neq \vec{0}$, then $\vec{v}/|\vec{v}|$ is called the direction of $\vec{v}$. The only vector of length 0 is the zero vector $\vec{0} = [0, 0, 0]$.

2.6.
Definition: The dot product of two vectors $\vec{v} = [a, b, c]$ and $\vec{w} = [p, q, r]$ is defined as $\vec{v} \cdot \vec{w} = ap + bq + cr$.

2.7. Different notations for the dot product are used in different mathematical fields. While mathematicians write $\vec{v} \cdot \vec{w}$, $\langle \vec{v}, \vec{w} \rangle$ or $(\vec{v}, \vec{w})$, the Dirac notation $\langle \vec{v} | \vec{w} \rangle$ is used in quantum mechanics. The Einstein notation $v_i w^i$ or more generally $g_{ij} v^i w^j$ is used in general relativity. In statistics, it is called the covariance $\mathrm{Cov}[v, w]$ of centered data. The dot product is also called scalar product or inner product. It can be generalized: any product $g(v, w)$ which is bilinear (linear both in $v$ and $w$) and satisfies the symmetry $g(v, w) = g(w, v)$ as well as $g(v, v) \geq 0$, with $g(v, v) = 0$ if and only if $v = 0$, can be used as a dot product. An example is $g(v, w) = 2v_1 w_1 + 3v_2 w_2 + 5v_3 w_3$.

2.8. The dot product determines distances and distances determine the dot product.
Proof: Write $v = \vec v$. Using the dot product, one can express the length of v as
$|v| = \sqrt{v \cdot v}$. On the other hand, the identity $(v + w) \cdot (v + w) = v \cdot v + w \cdot w + 2 (v \cdot w)$ can
be solved for $v \cdot w$:
$$ v \cdot w = (|v + w|^2 - |v|^2 - |w|^2)/2 . $$
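
The statement in 2.8 can be checked numerically. The following is a minimal sketch in Python; the helper names `dot`, `length` and `dot_from_lengths` and the sample vectors are illustrative choices, not part of the notes.

```python
import math

def dot(v, w):
    """Dot product of two vectors given as equal-length lists."""
    return sum(a * b for a, b in zip(v, w))

def length(v):
    """Length |v| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def dot_from_lengths(v, w):
    """Recover v . w from lengths only: (|v+w|^2 - |v|^2 - |w|^2) / 2."""
    s = [a + b for a, b in zip(v, w)]
    return (length(s) ** 2 - length(v) ** 2 - length(w) ** 2) / 2

v = [1, 2, 3]
w = [4, 5, 1]
print(dot(v, w))               # 4 + 10 + 3 = 17
print(dot_from_lengths(v, w))  # agrees with 17 up to rounding
```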

2.9. The Cauchy-Schwarz inequality is

Theorem: $|\vec v \cdot \vec w| \le |\vec v| |\vec w|$.

Proof. If |w| = 0, the statement holds because both sides are zero. Otherwise, assume
|w| = 1 by dividing the equation by |w|. Now plug in a = v · w into the equation
0 ≤ (v −aw)·(v −aw) to get 0 ≤ (v −(v ·w)w)·(v −(v ·w)w) = |v|2 +(v ·w)2 −2(v ·w)2 =
|v|2 − (v · w)2 which means (v · w)2 ≤ |v|2 . □

2.10. Having established this, it is possible to give a definition of what an angle is,
without the need to refer to any geometric pictures:

Definition: The angle between two nonzero vectors $\vec v, \vec w$ is defined as the
unique $\alpha \in [0, \pi]$ which satisfies $\vec v \cdot \vec w = |\vec v| \cdot |\vec w| \cos(\alpha)$. Since cos maps $[0, \pi]$ in
a 1:1 manner to $[-1, 1]$, this is well defined.
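
As a small illustration of this definition, here is a sketch in Python. The function name `angle` is an illustrative choice. By the Cauchy-Schwarz inequality the quotient passed to `acos` lies in [−1, 1]; the clamping only guards against rounding errors.

```python
import math

def dot(v, w):
    """Dot product of two vectors given as equal-length lists."""
    return sum(a * b for a, b in zip(v, w))

def angle(v, w):
    """Angle in [0, pi] between two nonzero vectors."""
    c = dot(v, w) / (math.sqrt(dot(v, v)) * math.sqrt(dot(w, w)))
    c = max(-1.0, min(1.0, c))  # guard against rounding slightly outside [-1, 1]
    return math.acos(c)

alpha = angle([1, 1, 0], [1, 1, 1])  # arccos(2 / sqrt(6))
right = angle([2, 3], [-3, 2])       # dot product is 0, so the angle is pi/2
print(alpha, right)
```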

2.11. The Al Kashi theorem gives the third side length c = d(B, C) of a triangle ABC in terms
of the sides a = d(A, B), b = d(A, C) and α, the angle located at the vertex A.

Theorem: $c^2 = a^2 + b^2 - 2ab \cos(\alpha)$.

Proof. Define $\vec v = \vec{AB}$, $\vec w = \vec{AC}$. Then $c^2 = |\vec v - \vec w|^2 = (\vec v - \vec w) \cdot (\vec v - \vec w) = |\vec v|^2 + |\vec w|^2 - 2 \vec v \cdot \vec w$.
We know $\vec v \cdot \vec w = |\vec v| \cdot |\vec w| \cos(\alpha)$, so that $c^2 = |\vec v|^2 + |\vec w|^2 - 2 |\vec v| \cdot |\vec w| \cos(\alpha) = a^2 + b^2 - 2ab \cos(\alpha)$. □

2.12. The triangle inequality states

Theorem: |⃗u + ⃗v | ≤ |⃗u| + |⃗v |

Proof. |⃗u +⃗v |2 = (⃗u +⃗v )·(⃗u +⃗v ) = ⃗u2 +⃗v 2 +2⃗u ·⃗v ≤ ⃗u2 +⃗v 2 +2|⃗u ·⃗v | ≤ ⃗u2 +⃗v 2 +2|⃗u|·|⃗v | =
(|⃗u| + |⃗v |)2 . □

Definition: Two vectors are called orthogonal or perpendicular if $\vec v \cdot \vec w = 0$.
The zero vector $\vec 0$ is orthogonal to any vector. For example, $\vec v = [2, 3]$ is
orthogonal to $\vec w = [-3, 2]$.

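2.13. We can now prove the Pythagorean theorem:

Theorem: If $\vec v$ and $\vec w$ are orthogonal, then $|\vec v - \vec w|^2 = |\vec v|^2 + |\vec w|^2$.

Proof. $(\vec v - \vec w) \cdot (\vec v - \vec w) = \vec v \cdot \vec v + \vec w \cdot \vec w - 2 \vec v \cdot \vec w = \vec v \cdot \vec v + \vec w \cdot \vec w$, because $\vec v \cdot \vec w = 0$. We usually write this
as $a^2 + b^2 = c^2$. □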

2.14.
Definition: The vector $P(\vec v) = \frac{\vec v \cdot \vec w}{|\vec w|^2} \vec w$ is called the projection of $\vec v$ onto $\vec w$. The
scalar projection $\frac{\vec v \cdot \vec w}{|\vec w|}$ is a signed length of the vector projection. Its absolute
value is the length of the projection of $\vec v$ onto $\vec w$. The vector $\vec b = \vec v - P(\vec v)$ is a
vector orthogonal to the $\vec w$-direction.

2.15. The projection allows us to visualize the dot product. If $\vec w$ is a unit vector, the absolute value of the
dot product is the length of the projection of $\vec v$ onto $\vec w$. The dot product is positive if $\vec v$ points
more towards $\vec w$; it is negative if $\vec v$ points away from it. In the next class, we use
the projection to compute distances between various objects.

Examples
2.16. For example, with $\vec v = [0, -1, 1]$, $\vec w = [1, -1, 0]$, we get $P(\vec v) = [1/2, -1/2, 0]$. Its length
is $1/\sqrt{2}$.
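
The example can be reproduced with a few lines of Python. This is only a sketch; the function names `projection` and `scalar_projection` are illustrative.

```python
import math

def dot(v, w):
    """Dot product of two vectors given as equal-length lists."""
    return sum(a * b for a, b in zip(v, w))

def projection(v, w):
    """Vector projection of v onto a nonzero vector w: (v.w / |w|^2) w."""
    factor = dot(v, w) / dot(w, w)
    return [factor * b for b in w]

def scalar_projection(v, w):
    """Signed length of the projection of v onto w: v.w / |w|."""
    return dot(v, w) / math.sqrt(dot(w, w))

v = [0, -1, 1]
w = [1, -1, 0]
proj = projection(v, w)                    # [0.5, -0.5, 0.0]
rest = [a - p for a, p in zip(v, proj)]    # the part of v orthogonal to w
print(proj, scalar_projection(v, w), dot(rest, w))
```

The last printed number is the dot product of $\vec v - P(\vec v)$ with $\vec w$, which is zero as claimed in the definition.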

2.17. The RGB color space consists of triples ⃗v = [r, g, b] describing the amount of
red, green and blue of a color. Another coordinate system is the CMY color space
consisting of triples ⃗v = [c, m, y] = [1 − r, 1 − g, 1 − b], where c is cyan, m is magenta
and y is yellow.

2.18. In physics, forces and fields F⃗ are described by vectors. The velocity of a curve
r(t) = [x(t), y(t), z(t)] is a vector attached to the point r(t). We will come to this.

2.19. In probability theory, data are described by vectors. One calls them also random
variables. It is in statistics where higher dimensional spaces appear. In statistics,
$\vec v \cdot \vec w$ is the covariance and $\cos(\alpha) = (\vec v \cdot \vec w)/(|\vec v| |\vec w|)$ is the correlation.
Homework
This homework is due on Tuesday, 6/28/2022.

Problem 2.1: a) Find a unit vector parallel to $\vec x = \vec u + \vec v + \vec w$ if
$\vec u = [1, 2, 3]$ and $\vec v = [2, 4, 1]$ and $\vec w = [0, 0, 2]$.
b) Now find two non-parallel unit vectors perpendicular to $\vec x$.

Problem 2.2: An Euler brick is a cuboid with integer side lengths a, b, c such
that all face diagonals are integers.
a) Verify that $\vec v = [a, b, c] = [44, 117, 240]$ is a vector which leads to an
Euler brick. Halcke found this in 1719.
b) (*) Verify that $[a, b, c] = [u(4v^2 - w^2), v(4u^2 - w^2), 4uvw]$ leads to an
Euler brick if $u^2 + v^2 = w^2$ (Saunderson 1740).
If also the space diagonal $\sqrt{a^2 + b^2 + c^2}$ is an integer, the Euler brick is
called perfect. Nobody has found one, nor proven that it can not exist.

Problem 2.3: Colors are encoded by vectors $\vec v$ = [red, green, blue].
The red, green and blue components of $\vec v$ are all real numbers in the
interval [0, 1].
a) Determine the angle between the colors yellow and magenta.
b) What is the vector projection of the magenta-orange mixture $\vec x = (\vec v + \vec w)/2$ onto green $\vec y$?

Problem 2.4: A rope is wound exactly 8 times around a stick of circumference
1 and length 15. How long is the rope?

Problem 2.5: a) Find the angle between the main diagonal of the unit
cube and one of the face diagonals. Assume that both diagonals pass
through a common vertex.
b) Find the vector projection of the main diagonal $\vec v = [1, 1, 1]$ onto the
side diagonal $\vec w = [1, 1, 0]$.
c) Find the maximal distance between the 16 points (±1, ±1, ±1, ±1) of
a tesseract.

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022

MULTIVARIABLE CALCULUS

MATH S-21A

Unit 3: Cross product

Lecture
3.1. The cross product of two vectors $\vec v = [v_1, v_2]$ and $\vec w = [w_1, w_2]$ in the plane $\mathbb{R}^2$
is the scalar $\vec v \times \vec w = v_1 w_2 - v_2 w_1$. One can remember this as the determinant of a
2 × 2 matrix $A = \begin{bmatrix} v_1 & v_2 \\ w_1 & w_2 \end{bmatrix}$, the product of the diagonal entries minus the product
of the side diagonal entries.

3.2.
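Definition: The cross product of two vectors $\vec v = [v_1, v_2, v_3]$ and $\vec w = [w_1, w_2, w_3]$
in space is defined as the vector
$$ \vec v \times \vec w = [v_2 w_3 - v_3 w_2, v_3 w_1 - v_1 w_3, v_1 w_2 - v_2 w_1] . $$

We can write the product also as a ”determinant”:

$$ \begin{vmatrix} \vec i & \vec j & \vec k \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \vec i \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} - \vec j \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} + \vec k \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} $$
which is $\vec i (v_2 w_3 - v_3 w_2) - \vec j (v_1 w_3 - v_3 w_1) + \vec k (v_1 w_2 - v_2 w_1)$ using the notation $\vec i = [1, 0, 0]$,
$\vec j = [0, 1, 0]$ and $\vec k = [0, 0, 1]$.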
3.3. Examples: The cross product of [1, 2] and [4, 5] is the scalar 5 − 8 = −3. The cross product
of [1, 2, 3] and [4, 5, 1] is [−13, 11, −3].
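
Both examples can be verified with a short Python sketch. The function names `cross2` and `cross3` are illustrative choices.

```python
def cross2(v, w):
    """Cross product in the plane: the scalar v1*w2 - v2*w1."""
    return v[0] * w[1] - v[1] * w[0]

def cross3(v, w):
    """Cross product in space, computed componentwise from the definition."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

print(cross2([1, 2], [4, 5]))        # -3
print(cross3([1, 2, 3], [4, 5, 1]))  # [-13, 11, -3]
print(cross3([4, 5, 1], [1, 2, 3]))  # [13, -11, 3], swapping the arguments flips the sign
```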
3.4. Unlike the dot product which is commutative, the cross product is anti-commutative:
$\vec v \times \vec w = -\vec w \times \vec v$.
Theorem: In $\mathbb{R}^3$, the vector $\vec v \times \vec w$ is orthogonal to both $\vec v$ and $\vec w$ and
has length $|\vec v \times \vec w| = |\vec v| |\vec w| \sin(\alpha)$.

Proof. Check orthogonality using the dot product $\vec v \cdot (\vec v \times \vec w) = 0$. The length formula
follows from the Lagrange identity $|\vec v \times \vec w|^2 = |\vec v|^2 |\vec w|^2 - (\vec v \cdot \vec w)^2$ which is also called
Cauchy-Binet formula. This is done by direct computation. To finish up, use $|\vec v \cdot \vec w| = |\vec v| |\vec w| |\cos(\alpha)|$ and simplify. □
3.5. For example, when choosing a coordinate system where $\vec v = [a, 0, 0]$ and $\vec w = [b \cos(\alpha), b \sin(\alpha), 0]$,
we have $\vec v \times \vec w = [0, 0, ab \sin(\alpha)]$ which has length $|ab \sin(\alpha)|$.

Figure 1. The cross product produces vectors perpendicular to both

arguments. The length of the vector is the area of the parallelogram.

3.6. The absolute value respectively length $|\vec v \times \vec w|$ defines the area of the parallelogram
spanned by $\vec v$ and $\vec w$. We can state this as a definition of area. The definition
fits with our common intuition about area because $|\vec w| \sin(\alpha)$ is the height of
the parallelogram with base length $|\vec v|$.

3.7. The trigonometric sin-formula relates the side lengths a, b, c and angles α, β, γ
of a general triangle:
Theorem: $\frac{a}{\sin(\alpha)} = \frac{b}{\sin(\beta)} = \frac{c}{\sin(\gamma)}$.

Proof. We can express the doubled area of the triangle in three different ways:
$$ ab \sin(\gamma) = bc \sin(\alpha) = ac \sin(\beta) . $$
Divide the first equation by $b \sin(\gamma) \sin(\alpha)$ to get one identity. Divide the second equation
by $c \sin(\alpha) \sin(\beta)$ to get the second identity. □

3.8. It follows from the length formula $|\vec v \times \vec w| = |\vec v| |\vec w| \sin(\alpha)$ and the fact that sin(α) = 0 if α = 0 or α = π
that $\vec v \times \vec w$ is zero if and only if $\vec v$ and $\vec w$ are parallel, that is if either $\vec v = \vec 0$, $\vec w = \vec 0$
or $\vec v = \lambda \vec w$ for some real λ. The cross product is a quick check to see whether two vectors
are parallel or not. Note that v and −v are considered parallel even though sometimes the
notion anti-parallel is used.
3.9.
Definition: The scalar $[\vec u, \vec v, \vec w] = \vec u \cdot (\vec v \times \vec w)$ is called the triple scalar
product of $\vec u, \vec v, \vec w$. The absolute value of $[\vec u, \vec v, \vec w]$ defines the volume of the
parallelepiped spanned by $\vec u, \vec v, \vec w$. The orientation of three vectors is defined
as the sign of $[\vec u, \vec v, \vec w]$. It is positive if the three vectors define a right-handed
coordinate system. It is zero if the vectors are in one plane.

3.10. We have defined volume and orientation using dot and cross product. Why does
this fit with our intuition? The value $h = |\vec u \cdot \vec n| / |\vec n|$ is the height of the parallelepiped
if $\vec n = \vec v \times \vec w$ is a normal vector to the ground parallelogram of area $A = |\vec n| = |\vec v \times \vec w|$.
The volume of the parallelepiped is $hA = (|\vec u \cdot \vec n| / |\vec n|) |\vec v \times \vec w|$ which simplifies to $|\vec u \cdot \vec n| = |\vec u \cdot (\vec v \times \vec w)|$,
the absolute value of the triple scalar product. The vectors $\vec v$, $\vec w$
and $\vec v \times \vec w$ form a right handed coordinate system. If the first vector $\vec v$ is your
thumb and the second vector $\vec w$ is the index finger, then $\vec v \times \vec w$ is the middle finger of
the right hand. For example, the vectors $\vec i$, $\vec j$, $\vec i \times \vec j = \vec k$ form a right handed coordinate
system. Since the triple scalar product is linear with respect to each vector, we also see
that volume is additive. Adding two equal parallelepipeds together for example gives
a parallelepiped with twice the volume.

Examples
3.11. Problem: Find the volume of the parallelepiped which has the vertices O =
(1, 1, 0), P = (2, 3, 1), Q = (4, 3, 1), R = (1, 4, 1). Answer: the solid is spanned by the
vectors $\vec u = [1, 2, 1]$, $\vec v = [3, 2, 1]$, and $\vec w = [0, 3, 1]$. We get $\vec v \times \vec w = [-1, -3, 9]$ and
$\vec u \cdot (\vec v \times \vec w) = 2$. The volume is 2.

3.12. Problem: Two apples have the same shape and mass density, but one has a
3 times larger diameter. What is their weight ratio? Answer. For a cuboid spanned
by [a, 0, 0] [0, b, 0] and [0, 0, c], the volume is the triple scalar product abc. If a, b, c
are all tripled, the volume gets multiplied by a factor 27. Now cut each apple into
parallelepipeds, the larger one with slices 3 times as large too. Since each of the larger
pieces has 27 times the volume of the smaller, also the apple is 27 times heavier!
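
The volume computation of 3.11 and the scaling argument of 3.12 can be sketched in Python. The function name `triple` is an illustrative choice; the cuboid side lengths 2, 3, 5 are arbitrary sample values.

```python
def dot(v, w):
    """Dot product of two vectors given as equal-length lists."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product of two vectors in space."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def triple(u, v, w):
    """Triple scalar product u . (v x w); its absolute value is a volume."""
    return dot(u, cross(v, w))

u, v, w = [1, 2, 1], [3, 2, 1], [0, 3, 1]
volume = abs(triple(u, v, w))                          # 2, as in 3.11

# Tripling all three spanning vectors multiplies the volume by 27.
scaled = abs(triple(*[[3 * c for c in x] for x in (u, v, w)]))

cuboid = triple([2, 0, 0], [0, 3, 0], [0, 0, 5])       # a*b*c = 30
print(volume, scaled, cuboid)
```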

3.13. Problem. A 3D scanner is used to build a 3D model of a face. It detects a
triangle which has its vertices at P = (0, 1, 1), Q = (1, 1, 0) and R = (1, 2, 3). Find the
area of the triangle. Solution. We have to find the length of the cross product of $\vec{PQ}$
and $\vec{PR}$ which is [1, −3, 1]. The length is $\sqrt{11}$. The triangle has half the area $\sqrt{11}/2$.

3.14. Problem. The scanner now detects another point A = (1, 1, 1). On which
side of the triangle is it located if the cross product of $\vec{PQ}$ and $\vec{PR}$ is considered the
up-direction? Solution. The cross product is $\vec n = [1, -3, 1]$. We have to see whether
the vector $\vec{PA} = [1, 0, 0]$ points into the direction of $\vec n$ or not. To see that, we have to
form the dot product. It is 1 so that indeed, A is ”above” the triangle. Note that a
triangle in space a priori does not have an orientation. We have to tell, what direction
is ”up”. That is the reason that file formats for 3D printing like contain the data for
three points in space as well as a vector, telling the direction.
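
The two scanner problems can be reproduced with the following Python sketch. The helper names are illustrative and not tied to any particular 3D file format.

```python
import math

def sub(A, B):
    """Vector from point B to point A."""
    return [a - b for a, b in zip(A, B)]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

P, Q, R, A = [0, 1, 1], [1, 1, 0], [1, 2, 3], [1, 1, 1]

n = cross(sub(Q, P), sub(R, P))   # normal vector PQ x PR = [1, -3, 1]
area = math.sqrt(dot(n, n)) / 2   # half the parallelogram area: sqrt(11)/2
side = dot(sub(A, P), n)          # positive means A is on the side n points to
print(n, area, side)
```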

Homework
This homework is due on Tuesday, 6/28/2022.
Problem 3.1: A three dimensional analogue of a right angle triangle is
a tetrahedral shape P, Q, R, O, where all angles at O are right angles. Let
us assume that O = (0, 0, 0) and P = (3, 0, 0), Q = (0, 5, 0), R = (0, 0, 7).
The 3D Pythagoras theorem states that the sum of the squares of the areas of the
three triangles meeting at O is the square of the area of the triangle PQR. That is
$|OPQ|^2 + |OQR|^2 + |ORP|^2 = |PQR|^2$. Verify it in the example, then
verify it in general for P = (a, 0, 0), Q = (0, b, 0), R = (0, 0, c).

Problem 3.2: a) Find a unit vector perpendicular to the space diagonal

[1, 1, 1] and the face diagonal [1, 0, 1] of the cube.
b) Find the area of the triangle containing the points A = (4, 0, 1), B =
(1, 1, 0), C = (1, 1, 1).
c) Find the volume of the tetrahedral shape with corners
O = (0, 0, 0), P = (4, 0, 1), Q = (1, 1, 0), R = (1, 2, 1).

Problem 3.3: Find the equation ax + by + cz = d for the plane which

contains the point P = (1, 2, 3) as well as the line which passes through
Q = (3, 4, 4) and R = (1, 1, 2).

Problem 3.4: Verify the ”BAC minus CAB” formula (due to Lagrange)
⃗a × (⃗b × ⃗c) = ⃗b(⃗a · ⃗c) − ⃗c(⃗a · ⃗b) for general vectors ⃗a, ⃗b, ⃗c in space.

Problem 3.5: Investigate which of the following formulas are always
true for all vectors u, v, w, x, y. If it is true, either explain, cite a source
(i.e. on the web), or give a by-hand or computer algebra verification. If it is
not true, find a counter example.
a) u · (v × w) = v · (w × u)
b) u × (v × w) = (u × v) × w
c) u × (v + w) = u × v + u × w
d) u × (v × w) = (u · w)v − (u · v)w.
e) (u × v) · (x × y) = (u · x)(v · y) − (u · y)(v · x).


MULTIVARIABLE CALCULUS

MATH S-21A

Unit 4: Lines and Planes

Lecture
3.1. A point P = (p, q, r) and a vector $\vec v = [a, b, c]$ define the line
$$ L = \left\{ \begin{bmatrix} p \\ q \\ r \end{bmatrix} + t \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \; t \in \mathbb{R} \right\} . $$
The line consists of all points obtained by adding a multiple of the vector $\vec v = [a, b, c]$ to
the vector $\vec{OP} = [p, q, r]$. It contains the point P as well as a copy of $\vec v = \vec{PQ}$ attached
to P. Every vector contained in the line is necessarily parallel to $\vec v$. We think about
the parameter t as “time”. At t = 0, we are at the end point P of $\vec{OP}$ and at t = 1,
we are at the end point Q of $\vec{OQ} = \vec{OP} + \vec v$.

3.2. If t is restricted to values in a parameter interval $[t_1, t_2]$, then $L = \{ [p, q, r] + t [a, b, c], \; t_1 \le t \le t_2 \}$
is a line segment which connects $\vec r(t_1)$ with $\vec r(t_2)$. For
example, to get the line through P = (1, 1, 2) and Q = (2, 4, 6), form the vector
$\vec v = \vec{PQ} = [1, 3, 4]$ and get $L = \{ [x, y, z] = [1, 1, 2] + t [1, 3, 4] \}$. This can be written
also as $\vec r(t) = [1 + t, 1 + 3t, 2 + 4t]$. If we write $[x, y, z] = [1, 1, 2] + t [1, 3, 4]$ as a collection
of equations x = 1 + t, y = 1 + 3t, z = 2 + 4t and solve each equation for t, we get
$$ L = \{ (x, y, z) \mid (x - 1)/1 = (y - 1)/3 = (z - 2)/4 \} . $$

3.3. The line $\vec r = \vec{OP} + t \vec v$ defined by P = (p, q, r) and vector $\vec v = [a, b, c]$ with nonzero
a, b, c satisfies the symmetric equations
$$ \frac{x - p}{a} = \frac{y - q}{b} = \frac{z - r}{c} . $$
The reason is that each of these expressions is equal to t. These symmetric equations
have to be modified a bit if one or two of the numbers a, b, c are zero. If a = 0, replace
the first equation with x = p, if b = 0 replace the second equation with y = q and if
c = 0 replace the third equation with z = r. The interpretation is that the line is written
as an intersection of two planes.
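
As a sketch of the parametric and symmetric descriptions, the following Python lines use the line through P = (1, 1, 2) with direction [1, 3, 4]. The function name `point_on_line` and the sample time t = 0.25 are illustrative choices.

```python
def point_on_line(P, v, t):
    """Point OP + t*v on the line through P with direction v."""
    return [p + t * a for p, a in zip(P, v)]

P = [1, 1, 2]
v = [1, 3, 4]

Q = point_on_line(P, v, 1)   # at time t = 1 we reach Q = [2, 4, 6]

t = 0.25
X = point_on_line(P, v, t)
# Each symmetric expression (x-p)/a, (y-q)/b, (z-r)/c recovers the same t.
ratios = [(x - p) / a for x, p, a in zip(X, P, v)]
print(Q, X, ratios)
```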

3.4. A point P and two vectors $\vec v, \vec w$ define a plane $\Sigma = \{ \vec{OP} + t \vec v + s \vec w$, where t, s are
real numbers $\}$.
An example is $\Sigma = \{ [x, y, z] = [1, 1, 2] + t [2, 4, 6] + s [1, 0, -1] \}$. This is called the
parametric description of a plane.

3.5. If a plane contains the two vectors $\vec v$ and $\vec w$, then the vector $\vec n = \vec v \times \vec w$ is orthogonal
to both $\vec v$ and $\vec w$. Because for any two points P, Q in the plane also the vector $\vec{PQ} = \vec{OQ} - \vec{OP}$ is perpendicular to $\vec n$, we
have $(Q - P) \cdot \vec n = 0$. With $Q = (x_0, y_0, z_0)$, P = (x, y, z), and $\vec n = [a, b, c]$, this means
$ax + by + cz = a x_0 + b y_0 + c z_0 = d$. The plane is therefore described by a single equation
ax + by + cz = d. We have shown:

Theorem: The equation for a plane containing $\vec v$ and $\vec w$ and a point P
is ax + by + cz = d, where $[a, b, c] = \vec v \times \vec w$ and where d is obtained by
plugging in P.

3.6. Problem: Find the equation of a plane which contains the three points P =
(−1, −1, 1), Q = (0, 1, 1), R = (1, 1, 3).
Answer: The plane contains the two vectors $\vec v = \vec{PQ} = [1, 2, 0]$ and $\vec w = \vec{PR} = [2, 2, 2]$.
The normal vector $\vec n = \vec v \times \vec w = [4, -2, -2]$ leads to the equation 4x − 2y − 2z = d.
The constant d is obtained by plugging in the coordinates of one of the points. In our
case, it is 4x − 2y − 2z = −4.
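
The recipe for the plane through three points can be sketched in Python as follows. The function name `plane_through` is an illustrative choice.

```python
def sub(A, B):
    """Vector from point B to point A."""
    return [a - b for a, b in zip(A, B)]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def plane_through(P, Q, R):
    """Return (n, d) such that n . x = d is the plane through P, Q, R."""
    n = cross(sub(Q, P), sub(R, P))   # normal vector PQ x PR
    d = dot(n, P)                     # plug in one of the points
    return n, d

P, Q, R = [-1, -1, 1], [0, 1, 1], [1, 1, 3]
n, d = plane_through(P, Q, R)
print(n, d)   # [4, -2, -2] and -4, that is 4x - 2y - 2z = -4
```

Plugging Q or R into $\vec n \cdot \vec x$ gives the same constant d, which is a quick consistency check.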

3.7. Problem: Find the angle between the planes x + y = −1 and x + y + z = 2. The
angle between the two planes ax + by + cz = d and ex + fy + gz = h is defined as
the angle between the two normal vectors $\vec n = [a, b, c]$ and $\vec m = [e, f, g]$.
Answer: find the angle between $\vec n = [1, 1, 0]$ and $\vec m = [1, 1, 1]$. It is $\arccos(2/\sqrt{6})$.

Examples
3.8. To practice the concepts, we look at distance formulas.

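1) If P is a point and $\Sigma : \vec n \cdot \vec x = d$ is a plane containing a point Q, then
$$ d(P, \Sigma) = \frac{|\vec{PQ} \cdot \vec n|}{|\vec n|} $$
is the distance between P and the plane. Proof: by the angle formula, the numerator is
$|\vec{PQ}| |\vec n| |\cos(\alpha)|$, so the quotient is the length $|\vec{PQ}| |\cos(\alpha)|$ of the projection of $\vec{PQ}$ onto $\vec n$.
For example, to find the distance from P = (7, 1, 4) to Σ : 2x + 4y + 5z = 9, we
first find a point Q = (0, 1, 1) on the plane. Then compute
$$ d(P, \Sigma) = \frac{|[-7, 0, -3] \cdot [2, 4, 5]|}{|[2, 4, 5]|} = \frac{29}{\sqrt{45}} . $$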
2) If P is a point in space and L is the line $\vec r(t) = Q + t \vec u$, then
$$ d(P, L) = \frac{|\vec{PQ} \times \vec u|}{|\vec u|} $$
is the distance between P and the line L. Proof: the area divided by base length is
the height of the parallelogram. For example, to compute the distance from P = (2, 3, 1) to
the line $\vec r(t) = (1, 1, 2) + t (5, 0, 1)$, compute
$$ d(P, L) = \frac{|[-1, -2, 1] \times [5, 0, 1]|}{|[5, 0, 1]|} = \frac{|[-2, 6, 10]|}{\sqrt{26}} = \frac{\sqrt{140}}{\sqrt{26}} . $$
3) If L is the line $\vec r(t) = Q + t \vec u$ and M is the line $\vec s(t) = P + t \vec v$, then
$$ d(L, M) = \frac{|\vec{PQ} \cdot (\vec u \times \vec v)|}{|\vec u \times \vec v|} $$
is the distance between the two lines L and M. Proof: the distance is the length of
the vector projection of $\vec{PQ}$ onto $\vec u \times \vec v$ which is normal to both lines. For example,
to compute the distance between the line L given by $\vec r(t) = (2, 1, 4) + t (-1, 1, 0)$ and the line M given by
$\vec s(t) = (-1, 0, 2) + t (5, 1, 2)$, form the cross product of [−1, 1, 0] and [5, 1, 2], which is [2, 2, −6]. The
distance between these two lines is
$$ d(L, M) = \frac{|[3, 1, 2] \cdot [2, 2, -6]|}{|[2, 2, -6]|} = \frac{4}{\sqrt{44}} . $$
4) The distance between two parallel planes $\vec n \cdot \vec x = d$ and $\vec n \cdot \vec x = e$ is
$$ d(\Sigma, \Pi) = \frac{|e - d|}{|\vec n|} . $$
Non-parallel planes have distance 0. Proof: use the distance formula between point
and plane. For example, 5x + 4y + 3z = 8 and 10x + 8y + 6z = 2 (which is the same plane as 5x + 4y + 3z = 1) have the distance
$$ \frac{|8 - 1|}{|[5, 4, 3]|} = \frac{7}{\sqrt{50}} . $$
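
The four distance formulas can be collected in a small Python sketch and checked against the worked examples. The function names are illustrative; the line-line formula assumes the two direction vectors are not parallel, and the plane-plane formula assumes both planes are written with the same normal vector.

```python
import math

def sub(A, B):
    return [a - b for a, b in zip(A, B)]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def norm(v):
    return math.sqrt(dot(v, v))

def dist_point_plane(P, Q, n):
    """Distance from P to the plane through Q with normal vector n."""
    return abs(dot(sub(Q, P), n)) / norm(n)

def dist_point_line(P, Q, u):
    """Distance from P to the line Q + t*u."""
    return norm(cross(sub(Q, P), u)) / norm(u)

def dist_line_line(Q, u, P, v):
    """Distance between the lines Q + t*u and P + t*v (u, v not parallel)."""
    n = cross(u, v)
    return abs(dot(sub(Q, P), n)) / norm(n)

def dist_parallel_planes(n, d, e):
    """Distance between the planes n.x = d and n.x = e."""
    return abs(e - d) / norm(n)

d1 = dist_point_plane([7, 1, 4], [0, 1, 1], [2, 4, 5])             # 29/sqrt(45)
d2 = dist_point_line([2, 3, 1], [1, 1, 2], [5, 0, 1])              # sqrt(140)/sqrt(26)
d3 = dist_line_line([2, 1, 4], [-1, 1, 0], [-1, 0, 2], [5, 1, 2])  # 4/sqrt(44)
d4 = dist_parallel_planes([5, 4, 3], 8, 1)                         # 7/sqrt(50)
print(d1, d2, d3, d4)
```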

Figure 1. The global positioning system GPS uses the fact that a
receiver can get the difference of distances to two satellites.

Homework
This homework is due on Tuesday, 6/28/2022.

Problem 4.1: Given the three points P = (10, 4, 5) and Q = (1, 3, 9)

and R = (4, 2, 10), find the parametric equation for the line perpendicular
to the triangle P QR passing through its center of mass (P + Q + R)/3 =
(5, 3, 8).

Problem 4.2: A regular tetrahedron has corners at $P_1 = (0, 0, 6)$, $P_2 = (0, \sqrt{32}, -2)$,
$P_3 = (-\sqrt{24}, -\sqrt{8}, -2)$ and $P_4 = (\sqrt{24}, -\sqrt{8}, -2)$. What is
the distance between two non-intersecting edges?

Problem 4.3: Find a parametric equation for the line through the point
P = (3, 1, 2) that is perpendicular to the line L : x = 1+4t, y = 1−4t, z =
8t and intersects this line in a point Q.

Problem 4.4: Given three spheres of radius 9 centered at A = (1, 2, 0),
B = (4, 5, 0), C = (1, 3, 2). Find a plane ax + by + cz = d which
touches all three spheres from the same side.

Problem 4.5: a) Find the distance between the point P = (3, 3, 4) and
the line 2x = 2y = 2z.
b) Parametrize the line ⃗r(t) = [x(t), y(t), z(t)] in a) and find the minimum
of the function f (t) = d(P, ⃗r(t))2 . Verify this agrees with a).

Figure 2. The sphere problem and the tetrahedron.


MULTIVARIABLE CALCULUS

MATH S-21A

Unit 5: Functions

Lecture
5.1. Functions are a way to define geometric objects or to assign properties to a point
of a geometric object. The shape of a mountain for example can be described by a
function which assigns to every point (x, y) the height.

Definition: A function of two variables f (x, y) is a rule which assigns to

two numbers x, y a third number f (x, y).

The function f (x, y) = x3 y − 2y 2 for example assigns to (2, 3) the number 24 − 18 = 6.

5.2. In general, a function is assumed to be defined for all points (x, y) in $\mathbb{R}^2$. An
example is $f(x, y) = |x|^7 + e^{\sin(xy)}$. Sometimes however it is required to restrict the
function to a domain D in the plane. For example, if $f(x, y) = \log|y| + \sqrt{x}$, then
f(x, y) is only defined for x ≥ 0 and for y ≠ 0. The range of a function f is the set of
values which the function f reaches. The function $f(x, y) = 3 + x^2/(1 + x^2)$ for example
takes all values 3 ≤ z < 4. While z = 3 is reached for x = y = 0, and all values 3 ≤ z < 4
can be reached, the value z = 4 is not attained.

5.3.
Definition: The set {(x, y, f (x, y)) | (x, y) ∈ D } ⊂ R3 is the graph of f .

Graphs are surfaces which allow us to visualize the function. We should not mix up the
graph of a function with the function itself. The function is a rule which assigns to
(x, y) a third number, the graph is a geometric object in three dimensional space.

The modern notion of function only started to appear at the beginning of the 19th cen-
tury. That the function f and the graph of f are different things matters in computer
science for example. They are implemented as different data structures. In some cases,
we know f but we do not understand the graph. An example is the zeta function
f (x, y) = |ζ(x + iy)| for which we know the function very well, can evaluate it at every
point but where we do not know the graph, especially, where its zeros are.
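
The difference between a function and its graph can be illustrated in Python. This is only a sketch: the function is the example $f(x, y) = x^3 y - 2y^2$ used earlier in this unit, and the sampling grid is an arbitrary choice.

```python
def f(x, y):
    """The function is a rule: it assigns a number to a pair (x, y)."""
    return x ** 3 * y - 2 * y ** 2

# A finite sample of the graph is a collection of points (x, y, f(x, y)) in space.
graph_sample = [(x, y, f(x, y)) for x in range(-1, 3) for y in range(0, 4)]

print(f(2, 3))            # 24 - 18 = 6
print(len(graph_sample))  # 16 sampled points of the graph
```

The rule `f` can be evaluated anywhere, while `graph_sample` is a data set which only knows about the points that were sampled.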

5.4. Here are some examples:

Example f(x, y)                 | domain D of f                   | range f(D) of f
$\sqrt{9 - x^2 - y^2}$          | closed disc $x^2 + y^2 \le 9$   | [0, 3]
$-\log(1 - x^2 - y^2)$          | open unit disc $x^2 + y^2 < 1$  | [0, ∞)
$x^2 + y^3 - xy + \cos(xy)$     | plane $\mathbb{R}^2$            | the real line
$\sqrt{4 - x^2 - 2y^2}$         | $x^2 + 2y^2 \le 4$              | [0, 2]
$1/(x^2 + y^2 - 1)$             | all except unit circle          | $\mathbb{R} \setminus (-1, 0]$
$1/(x^2 + y^2)^2$               | all except origin               | positive real axis

5.5.
Definition: The set {(x, y) | f (x, y) = c = const } is called a contour curve
or level curve of f . A collection of contour curves is a contour map.

For example, for f (x, y) = 4x2 + 3y 2 , the level curves f = c are ellipses if c > 0.
Drawing several contour curves {f (x, y) = c } simultaneously produces a contour
map of the function f .

5.6. Level curves allow us to visualize and analyze functions f(x, y) without leaving the
two dimensional space. The picture below for example shows the level curves of the
function $\sin(xy) - \sin(x^2 + y)$. Contour curves are everywhere: they appear as isobars
= curves of constant pressure, isoclines = curves of constant (wind) field direction,
isotherms = curves of constant temperature or isoheights = curves of constant
height.

Examples
5.7. For $f(x, y) = x^2 - y^2$, the set $x^2 - y^2 = 0$ is the union of the lines x = y and
x = −y. The set $x^2 - y^2 = 1$ consists of two hyperbola branches with their ”noses” at the
points (−1, 0) and (1, 0). The set $x^2 - y^2 = -1$ consists of two hyperbola branches with their
noses at (0, 1) and (0, −1).

5.8. The function f (x, y) = 1 − 2x2 − y 2 has contour curves f (x, y) = 1 − 2x2 − y 2 = c
which are ellipses 2x2 + y 2 = 1 − c for c < 1.

5.9. For the function $f(x, y) = (x^2 - y^2) e^{-x^2 - y^2}$, we can not find explicit expressions
for the contour curves $(x^2 - y^2) e^{-x^2 - y^2} = c$. We can draw the trace curves (x, 0, f(x, 0))
or (0, y, f(0, y)) or then use a computer.

5.10. The surface $z = f(x, y) = \sin(\sqrt{x^2 + y^2})$ has concentric circles as contour curves.

5.11. In applications, discontinuous functions can occur. The temperature of water in
relation to pressure and volume is an example. One experiences then phase transitions,
places where the function value can jump. Mathematicians study singularities
in a mathematical field called ”catastrophe theory”.

5.12.
Definition: A function f (x, y) is called continuous at (a, b) if there is a
finite value f (a, b) with lim(x,y)→(a,b) f (x, y) = f (a, b). This means that for any
sequence (xn , yn ) converging to (a, b), also f (xn , yn ) → f (a, b). A function is
continuous in G ⊂ R2 if it is continuous at every point (a, b) of G.

5.13. Continuity means that if (x, y) is close to (a, b), then f (x, y) must be close to
f (a, b). Continuity for functions of more than two variables is defined in the same way.
The bad news is that continuity is not always easy to check. The good news is that in
general we do not have to worry about continuity. Let's look at some examples:
5.14. Example: For $f(x, y) = (xy)/(x^2 + y^2)$, we have
$$ \lim_{x \to 0} f(x, x) = \lim_{x \to 0} \frac{x^2}{2x^2} = \frac{1}{2} $$
and $\lim_{x \to 0} f(0, x) = \lim_{x \to 0} 0 = 0$. Since the two limits differ, the function is not continuous at (0, 0).
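
The two limits can be seen numerically by approaching the origin along the diagonal and along the y-axis. The following Python sketch uses an arbitrary list of small step sizes; it illustrates the argument but does not replace it.

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x ** 2 + y ** 2)

steps = [0.1, 0.01, 0.001]
along_diagonal = [f(t, t) for t in steps]   # values stay near 1/2
along_y_axis = [f(0, t) for t in steps]     # values are 0
print(along_diagonal, along_y_axis)
```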
5.15. Example: The function $f(x, y) = (x^2 y)/(x^2 + y^2)$ is better described using polar
coordinates: $f(r, \theta) = r^3 \cos^2(\theta) \sin(\theta)/r^2 = r \cos^2(\theta) \sin(\theta)$. We see that f(r, θ) → 0
uniformly in θ if r → 0. The function is continuous because we can extend it to the origin
by setting f(0, 0) = 0. It is customary in mathematics to consider the above function
to be continuous. The reason is that there is a unique way to give a function value
at the undefined point.

5.16. A simpler example: the function f (x, y) = (x2 − y 2 )/(x + y) is continuous

everywhere. Yes, the function is not defined a priori at x + y = 0 but as it is outside
this line equal to f (x, y) = x − y, there is a unique continuation to the entire plane
and this continuation is x − y.

5.17. A function of three variables g(x, y, z) assigns to three variables x, y, z a real

number g(x, y, z). The function f (x, y, z) = x2 + y − z for example satisfies f (3, 2, 1) =
10. We can visualize a function by contour surfaces g(x, y, z) = c, where c is constant.
It is an implicit description of the surface. The contour surface of g(x, y, z) =
x2 + y 2 + z 2 = c is a sphere if c > 0. To understand a contour surface, it is helpful
to look at the traces, the intersections of the surfaces with the coordinate planes
x = 0, y = 0 or z = 0.

5.18. The function g(x, y, z) = 2 + sin(xyz) could define a temperature distribution

in space. We can no more draw the graph of g because that would be an object in 4
dimensions. We can however draw level surfaces like g(x, y, z) = 0 or g(x, y, z) = 1.

5.19. The level surfaces of $g(x, y, z) = x^2 + y^2 + z^2$ are spheres. The level surfaces of
$g(x, y, z) = 2x^2 + y^2 + 3z^2$ are ellipsoids. The equation ax + by + cz = d is a plane.
With $\vec n = [a, b, c]$ and $\vec x = [x, y, z]$, we can rewrite the equation as $\vec n \cdot \vec x = d$. If a point $\vec x_0$
is on the plane, then $\vec n \cdot \vec x_0 = d$, so that $\vec n \cdot (\vec x - \vec x_0) = 0$. This means that every vector
$\vec x - \vec x_0$ in the plane is orthogonal to $\vec n$. For $f(x, y, z) = ax^2 + by^2 + cz^2 + dxy + exz + fyz + gx + hy + kz + m$
the surface f(x, y, z) = 0 is called a quadric.
Sphere: $(x - a)^2 + (y - b)^2 + (z - c)^2 = r^2$
Paraboloid: $(x - a)^2 + (y - b)^2 - c = z$
Plane: $ax + by + cz = d$
One sheeted hyperboloid: $(x - a)^2 + (y - b)^2 - (z - c)^2 = r^2$
Cylinder: $(x - a)^2 + (y - b)^2 = r^2$
Two sheeted hyperboloid: $(x - a)^2 + (y - b)^2 - (z - c)^2 = -r^2$
Ellipsoid: $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$
Hyperbolic paraboloid: $x^2 - y^2 + z = 1$
Elliptic hyperboloid: $x^2/a^2 + y^2/b^2 - z^2/c^2 = 1$

Higher order polynomial surfaces can be intriguingly beautiful and are
sometimes difficult to describe. If f is a polynomial in several variables
and f(x, x, x) is a polynomial of degree d, then the surface f = 0 is called a degree d
polynomial surface. Degree 2 surfaces are quadrics, degree 3 surfaces
cubics, degree 4 surfaces quartics, degree 5 surfaces quintics, degree 10
surfaces decics and so on.

Homework
This homework is due on Tuesday, 7/5/2022.

Problem 5.1: a) Plot the contour plot and graph of the function
f (x, y) = sin(x + y 2 )/(x2 + y 2 ).
2

b) Plot both the graph and the contour plot of the function f (x, y) =
sin(x2 ) + sin(y 2 ) on the region −π ≤ x ≤ π, −π ≤ y ≤ π. Try a) without a
computer, for b), you might want to use Wolfram alpha or Mathematica or a graphing calculator to
get your picture.

Problem 5.2: a) Determine the domain and range of the logarithmic
mean
$$ f(x, y) = \frac{y - x}{\log(y) - \log(x)} $$
where log is the natural logarithm.
b) The function is not defined at x = y but one can define f(x, y) on the
diagonal x = y. Using l'Hôpital's rule, show that the limit $\lim_{x \to 2} f(x, 2)$ exists.
c) The function is also not defined at first if x = 0 or y = 0. Show that
the limit $\lim_{x \to 0} f(x, 2)$ exists.

Problem 5.3: a) Use the computer to draw the level surface $x^2 - y^2 + z^2 - x^4 y^4 z^4 - x^2 y^2 z^2 = 0$
with x, y, z all in [−2, 2].
b) Do the same for the contour $((x^2 + y^2)^2 - x^2 - y^2)^2 + z^2 = 0.02$
with x, y, z all in [−1.1, 1.1].

Problem 5.4: a) Draw the Taxi-metric hyperboloid |x| + |y| − |z| = 1.

b) Draw the Taxi-metric hyperboloid |x| + |y| − |z| = −1.
c) Draw the Taxi-metric ellipsoid |x| + |y| + 2|z| = 5.
d) Draw the Taxi-metric elliptic paraboloid z = |x| + |y|.
e) Draw the Taxi-metric hyperbolic paraboloid z = |x| − |y|.

Problem 5.5: a) Verify that the line ⃗r(t) = [1 + t, 1 − t, t] is part of the

one sheeted hyperboloid x2 + y 2 − 2z 2 = 2.
b) Verify that the line ⃗r(t) = [1, 3, 2] + t[1, 2, 1] is part of the hyperbolic
paraboloid z 2 − x2 − y = 0.
c) As also the line ⃗r(s) = [1 − s, 1 + s, s] is part of the same hyperboloid,
what is the intersection of the hyperboloid with the plane ⃗r(t, s) = [1 +
t − s, 1 − t + s, t + s]?


MULTIVARIABLE CALCULUS

MATH S-21A

Unit 6: Parametrized Surfaces

Lecture
6.1. There are two fundamentally different ways to describe surfaces: there are level
surfaces g(x, y, z) = c and parametrizations. What we have seen already for planes
can be done more generally for other surfaces. Let’s first look at the general setup:

Definition: A parametrization of a surface is a vector-valued function

⃗r(u, v) = [x(u, v), y(u, v), z(u, v)] ,
where x(u, v), y(u, v), z(u, v) are three functions of two variables. The param-
eters u, v serve as coordinates on the surface. If we plug in concrete values
like u = 3, v = 2 for example in a function ⃗r(u, v) = [u − 2, v 2 , u3 − v], we get a
concrete point ⃗r(3, 2) = [1, 4, 25] in R3 .

6.2. Because two parameters u and v are involved, the map ⃗r is also called uv-map.
And like uv-light, it looks cool. If you like a fancy description, a parametrization is
a map from R2 to R3 . A parametrized surface is the image of the uv-map. The
domain R of the uv-map is called the parameter domain. The parametrization is
what you are doing, the surface itself is something you see. There are many different
parametrizations of the same surface.

Definition: If the first parameter u is kept constant, then $v \mapsto \vec r(u, v)$ is a
curve on the surface. Similarly, if v is constant, then $u \mapsto \vec r(u, v)$ traces a curve
on the surface. These curves are called grid curves.
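
A parametrization and one of its grid curves can be sketched in Python. The map used here is the example $\vec r(u, v) = [u - 2, v^2, u^3 - v]$ from 6.1; the sample values of v are arbitrary.

```python
def r(u, v):
    """Example parametrization r(u, v) = [u - 2, v^2, u^3 - v]."""
    return [u - 2, v ** 2, u ** 3 - v]

print(r(3, 2))   # the point [1, 4, 25]

# Grid curve: keep u = 3 fixed and let v vary.
grid_curve = [r(3, v) for v in (0, 1, 2)]
print(grid_curve)
```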

Parametric surfaces can become complex. In that case, it is better to explore them
with the help of a computer. The following four examples are important building blocks
for more general surfaces.

Definition: A point (x, y) ≠ (0, 0) in the plane has the polar coordinates
$r = \sqrt{x^2 + y^2}$, θ, where θ is the angle from the positive x-axis to the point
in counter clockwise direction. For x > 0, y > 0 it is arctan(y/x). In general
(x, y) = (r cos(θ), r sin(θ)). A common choice is to take θ ∈ [0, 2π). The point
(0, −1) has then the polar coordinates (r, θ) = (1, 3π/2).

6.3. Note that the formula θ = arctan(y/x) defines the angle θ only up to the addition of an integer multiple of π. The points (1, 2) and (−1, −2) for example have the same arctan(y/x) value. In order to get the correct θ value, one can take arctan(y/x) in (−π/2, π/2], where π/2 is the limit when x → 0+, then add π if x < 0 or if x = 0 and y < 0.
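The case distinction above is what the two-argument arctangent does internally. The following is a minimal sketch, assuming Python's standard `math.atan2`; the helper name `polar` is chosen here for illustration and is not part of the notes.

```python
import math

def polar(x, y):
    """Return (r, theta) with theta in [0, 2*pi) for a point (x, y) != (0, 0)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)      # atan2 returns an angle in (-pi, pi]
    if theta < 0:
        theta += 2 * math.pi      # shift into [0, 2*pi)
    return r, theta

print(polar(0, -1))                # r = 1 and theta = 3*pi/2
print(polar(1, 2), polar(-1, -2))  # same arctan(y/x), but the angles differ by pi
```

The points (1, 2) and (−1, −2) share the value arctan(y/x), yet `polar` returns angles that differ by π, as they should.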

Definition: The coordinate system obtained by representing points in space

as
(x, y, z) = (r cos(θ), r sin(θ), z)
is called the cylindrical coordinate system.

Definition: Spherical coordinates use the distance ρ to the origin as well as two angles θ and ϕ called Euler angles. The first angle θ is the angle we have used in polar coordinates. The second angle, ϕ, is the angle between the vector $\vec{OP}$ and the z-axis. A point has the spherical coordinates
(x, y, z) = (ρ cos(θ) sin(ϕ), ρ sin(θ) sin(ϕ), ρ cos(ϕ)) .
We always use 0 ≤ θ < 2π, 0 ≤ ϕ ≤ π, ρ ≥ 0.

6.4. The following figures allow you to derive the formulas. The distance to the z-axis, r = ρ sin(ϕ), and the height z = ρ cos(ϕ) can be read off from the left picture; the coordinates x = r cos(θ), y = r sin(θ) can be seen in the right picture.

[Figure: left, the rz-plane showing ρ, ϕ and r; right, the xy-plane showing r and θ.]

x = ρ cos(θ) sin(ϕ),
y = ρ sin(θ) sin(ϕ),
z = ρ cos(ϕ)
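The conversion formulas can be tried out numerically. This is a small sketch with hypothetical helper names (`spherical_to_cartesian`, `cartesian_to_spherical`), using the conventions 0 ≤ θ < 2π and 0 ≤ ϕ ≤ π from the definition; the inverse map assumes ρ > 0.

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    r = rho * math.sin(phi)                       # distance to the z-axis
    return (r * math.cos(theta), r * math.sin(theta), rho * math.cos(phi))

def cartesian_to_spherical(x, y, z):
    rho = math.sqrt(x * x + y * y + z * z)        # assumed to be positive
    theta = math.atan2(y, x) % (2 * math.pi)      # polar angle in [0, 2*pi)
    phi = math.acos(z / rho)                      # angle to the z-axis, in [0, pi]
    return rho, theta, phi

p = spherical_to_cartesian(2.0, 1.0, 0.5)
print(p, cartesian_to_spherical(*p))              # recovers (2.0, 1.0, 0.5) up to rounding
```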

Examples
6.5. A plane has the parametrization $\vec{r}(s, t) = \vec{OP} + s\vec{v} + t\vec{w}$ and the implicit equation ax + by + cz = d. To get from parametric to implicit, find the normal vector $\vec{n} = \vec{v} \times \vec{w}$. To get from implicit to parametric, find two vectors $\vec{v}, \vec{w}$ normal to the vector $\vec{n}$. For example, find three points P, Q, R on the surface and form $\vec{v} = \vec{PQ}$, $\vec{w} = \vec{PR}$.
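Both directions of 6.5 can be sketched in a few lines. The helper `plane_through` and the three sample points are illustrative choices, not taken from the notes.

```python
def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def cross(v, w):
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def plane_through(P, Q, R):
    """Return the normal n, the constant d of n . x = d, and the map r(s, t)."""
    v, w = sub(Q, P), sub(R, P)
    n = cross(v, w)
    d = dot(n, P)
    def r(s, t):
        return tuple(p + s * a + t * b for p, a, b in zip(P, v, w))
    return n, d, r

n, d, r = plane_through((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(n, d)                 # (1, 1, 1) 1, that is the plane x + y + z = 1
print(dot(n, r(0.3, -2)))   # every point r(s, t) satisfies the implicit equation
```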

6.6. The sphere $\vec{r}(u, v) = [a, b, c] + [\rho \cos(u) \sin(v), \rho \sin(u) \sin(v), \rho \cos(v)]$ can be brought into the implicit form by finding the center and radius: $(x - a)^2 + (y - b)^2 + (z - c)^2 = \rho^2$.

6.7. The parametrization of a graph is ⃗r(u, v) = [u, v, f (u, v)]. It can be written in
implicit form as z − f (x, y) = 0.

6.8. The surface of revolution is given in parametric form as $\vec{r}(u, v) = [g(v) \cos(u), g(v) \sin(u), v]$. It has the implicit description $\sqrt{x^2 + y^2} = r = g(z)$, which can be rewritten as $x^2 + y^2 = g(z)^2$.

6.9. Here are some level surfaces in cylindrical coordinates:

r = 1 is a cylinder, r = |z| is a double cone, $r^2 = z$ is an elliptic paraboloid, θ = 0 is a half plane, r = θ is a rolled sheet of paper.

r = 2 + sin(z) is an example of a surface of revolution.

6.10. Here are some level surfaces described in spherical coordinates:

ρ = 1 is a sphere, the surface ϕ = π/4 is a single cone, ρ = ϕ is an

apple shaped surface and ρ = 2 + cos(3θ) sin(ϕ) is an example of a
bumpy sphere.

Homework
This homework is due on Tuesday, 7/5/2022.

Problem 6.1: Find a parametrization ⃗r(t, s) for the plane which contains
the point P = (−9, 7, 1) and the line through Q = (2, 2, 1) and R =
(1, 3, 5).

Problem 6.2: a) Plot the surface with the parametrization

⃗r(u, v) = [(3 + v cos(u/2)) cos(u), (3 + v cos(u/2)) sin(u), −v sin(u/2)]
with −2 ≤ v ≤ 2 and 0 ≤ u ≤ 2π. You can use technology if you like. Do
you recognize the shape of the surface?
b) Now make a picture of the same surface where the domain R is −5 ≤
v ≤ 5 and 0 ≤ u ≤ 2π. Something dramatic happens at −6 ≤ v ≤ 6.
Describe it.

Problem 6.3: a) Find a parametrization of the lower half of the ellipsoid $x^2/36 + y^2/16 + (z - 4)^2 = 1$ by using that the surface is a graph z = f(x, y) on a suitable domain.
b) Find a second parametrization but use angles ϕ, θ similarly as for the
sphere.

Problem 6.4: Find a parametrization of the torus candy (www.math-candy.com) given as the set of points which have distance 6 + 2 cos(7θ)
from the circle
[10 cos(θ), 10 sin(θ), 0] ,
where θ is the angle occurring in cylindrical and spherical coordinates. We
can assure you that the candy melts wonderfully on your tongue.

Hint: Use r, the distance of a point (x, y, z) to the z-axis. This distance is r = 10 + (6 + 2 cos(7θ)) cos(ψ) if ψ is the angle for the circle winding around the candy. You can also use that z = (6 + 2 cos(7θ)) sin(ψ). To finish the parametrization problem, translate back to Cartesian coordinates.

Problem 6.5: a) What is the equation for the surface $x^2 + y^2 = z^2 + y/x$ in cylindrical coordinates?
b) Describe in words or draw a sketch of the surface whose equation is
ρ = | sin(4ϕ)| in spherical coordinates (ρ, θ, ϕ).

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022


Unit 7: Parametrized curves

Lecture

Definition: A parametrization of a planar curve is a map ⃗r(t) =

[x(t), y(t)] from a parameter interval R = [a, b] to the plane R2 . The
functions x(t) and y(t) are called coordinate functions. The image of the
parametrization is called a parametrized curve in the plane. Similarly, the
parametrization of a space curve is ⃗r(t) = [x(t), y(t), z(t)]. The image of ⃗r is
a parametrized curve in space.


7.1. We think of the parameter t as time and the parametrization as a drawing process. The curve is the result you see. For a fixed time t, we have a vector [x(t), y(t), z(t)] in space. As t varies, the end point of this vector moves along the curve. The parametrization contains more information about the curve than the curve itself. It tells for example how fast the curve was traced.

7.2. Curves can describe the paths of particles, celestial bodies, or other quantities which change in time. Examples are the motion of a star moving in a galaxy, or economic data changing in time. Here are some more places where curves appear:

Knots: closed curves in space.
Molecules: DNA, RNA or proteins.
Graphics: grid curves produce a mesh of curves.
Typography: fonts represented by Bézier curves.
Relativity: a curve in space-time describes the motion of an object.
Topology: space filling curves, boundaries of surfaces or knots.

Definition: Any vector parallel to the velocity ⃗r ′ (t) is called tangent to the
curve at ⃗r(t).

7.3. You know from single variable calculus the addition rule (f + g)′ = f′ + g′, the scalar multiplication rule (cf)′ = cf′ and the Leibniz rule (fg)′ = f′g + fg′ as well as the chain rule (f(g))′ = f′(g)g′. They generalize to vector-valued functions:

$(\vec{v} + \vec{w})' = \vec{v}' + \vec{w}'$, $(c\vec{v})' = c\vec{v}'$, $(\vec{v} \cdot \vec{w})' = \vec{v}' \cdot \vec{w} + \vec{v} \cdot \vec{w}'$, $(\vec{v} \times \vec{w})' = \vec{v}' \times \vec{w} + \vec{v} \times \vec{w}'$,
$(\vec{v}(f(t)))' = \vec{v}'(f(t)) f'(t)$.

7.4. The differentiation of curves can be reversed using the fundamental theorem of calculus. If $\vec{r}'(t)$ and $\vec{r}(0)$ are known, we can figure out $\vec{r}(t)$ by integration: $\vec{r}(t) = \vec{r}(0) + \int_0^t \vec{r}'(s)\, ds$.

Assume we know the acceleration $\vec{a}(t) = \vec{r}''(t)$ at all times as well as the initial velocity $\vec{r}'(0)$ and position $\vec{r}(0)$. Then $\vec{r}(t) = \vec{r}(0) + t\vec{r}'(0) + \vec{R}(t)$, where $\vec{R}(t) = \int_0^t \vec{v}(s)\, ds$ and $\vec{v}(t) = \int_0^t \vec{a}(s)\, ds$.

The free fall is the case when acceleration is a constant vector. The direction of the constant force defines what is “down”. If $\vec{r}''(t) = [0, 0, -10]$, $\vec{r}'(0) = [0, 1000, 2]$, $\vec{r}(0) = [0, 0, h]$, then $\vec{r}(t) = [0, 1000t, h + 2t - 10t^2/2]$.

If $\vec{r}''(t) = \vec{F}$ is constant, then $\vec{r}(t) = \vec{r}(0) + t\vec{r}'(0) + \vec{F} t^2/2$.
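As a plausibility check of the constant acceleration case, one can integrate r″ = F step by step and compare with r(0) + t r′(0) + F t²/2. The sketch below uses the free fall data from above with the assumed starting height h = 100; the function names and the simple Euler scheme are illustrative choices.

```python
def closed_form(r0, v0, a, t):
    """r(t) = r(0) + t r'(0) + a t^2 / 2 for a constant acceleration vector a."""
    return [p + t * v + 0.5 * c * t * t for p, v, c in zip(r0, v0, a)]

def euler(r0, v0, a, t, steps=100000):
    """Integrate r'' = a numerically with small time steps."""
    dt = t / steps
    r, v = list(r0), list(v0)
    for _ in range(steps):
        r = [p + dt * q for p, q in zip(r, v)]   # position uses the old velocity
        v = [q + dt * c for q, c in zip(v, a)]
    return r

r0, v0, a = [0, 0, 100], [0, 1000, 2], [0, 0, -10]   # h = 100 is an assumption
print(closed_form(r0, v0, a, 3.0))   # [0.0, 3000.0, 61.0]
print(euler(r0, v0, a, 3.0))         # close to the closed form
```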

Examples
7.5. Examples:
1) The parametrization $\vec{r}(t) = [1 + 2\cos(t), 3 + 5\sin(t)]$ is the ellipse $(x - 1)^2/4 + (y - 3)^2/25 = 1$. The parametrization $\vec{r}(t) = [\cos(3t), \sin(5t)]$ is an example of a Lissajous curve.
2) If x(t) = t, y(t) = f(t), the curve $\vec{r}(t) = [t, f(t)]$ traces the graph of the function f(x). For example, for $f(x) = x^2 + 1$, the graph is a parabola.
3) With x(t) = t cos(t), y(t) = t sin(t), z(t) = t we get the parametrization of a space curve $\vec{r}(t) = [t\cos(t), t\sin(t), t]$ which traces a spiral on the cone $x^2 + y^2 = z^2$.
4) The parametrization x(t) = 2t cos(2t), y(t) = 2t sin(2t), z(t) = 2t traces the same curve but twice as fast.
5) If P = (a, b, c) and Q = (u, v, w) are points in space, then $\vec{r}(t) = [a + t(u - a), b + t(v - b), c + t(w - c)]$ with t ∈ [0, 1] is a line segment from P to Q. Example: $\vec{r}(t) = [1 + t, 1 - t, 2 + 3t]$ connects P = (1, 1, 2) with Q = (2, 0, 5).
6) For $\vec{r}(t) = [\cos(t), \sin(2t), 0]$ we get a figure 8 curve.

The computation is done coordinate wise:

Position $\vec{r}(t) = [\cos(3t), \sin(2t), 2\sin(t)]$
Velocity $\vec{r}'(t) = [-3\sin(3t), 2\cos(2t), 2\cos(t)]$
Acceleration $\vec{r}''(t) = [-9\cos(3t), -4\sin(2t), -2\sin(t)]$
Jerk $\vec{r}'''(t) = [27\sin(3t), -8\cos(2t), -2\cos(t)]$

7.6. Let's look at some examples of velocities and accelerations:

Example                 Velocity             Example                  Acceleration
Hair growth             0.000000005 m/s      Train                    0.1-0.3 m/s^2
Garden snail            0.013 m/s            Sprinter (100 m dash)    3 m/s^2
Signals in nerves       50 m/s               Car                      3-8 m/s^2
Sound in air            340 m/s              Free fall                1G = 9.81 m/s^2
Speed of bullet         1’200 m/s            Space X Starship         4G
Earth in solar system   30’000 m/s           Combat plane F35A        9G
Sun in galaxy           200’000 m/s          Ejection from F35A       14G
Light in vacuum         299’792’458 m/s      Electron in vacuum       10^15 m/s^2

Homework
This homework is due on Tuesday, 7/5/2022.

Problem 7.1: Sketch the plane curve

⃗r(t) = [x(t), y(t)] = [cos(t) + sin(5t), sin(t) + cos(5t)] ,
for t ∈ [0, 2π] by plotting the points for different values of t. Calculate its
velocity ⃗r ′ (t) as well as the acceleration ⃗r ′′ (t) at t = 0. When plotting
the curve you see a flower shaped curve. How many petals are there?

Problem 7.2: Oliver’s cellphone app measures the acceleration

⃗r ′′ (t) = [− cos(t), −81 sin(9t), 81 cos(9t) − sin(t)]
on a Phoenix coaster in an amusement park. Assume $\vec{r}(0) = [1, 0, -1]$ and $\vec{r}'(0) = [0, 9, 1]$. What is its position $\vec{r}(\pi/2)$?

Problem 7.3: a) Two particles travel along space curves. The first is
$\vec{r}_1(t) = [t, t^2, t^3]$ .
The second is
$\vec{r}_2(t) = [1 + 2t, 1 + 6t, 1 + 14t]$ .
Do the particles collide? Do the particle paths intersect?
b) If $\vec{r}(t) = [\cos(t), 2\sin(t), 4t]$, find $\vec{r}'(0)$ and $\vec{r}''(0)$. Then compute $|\vec{r}'(0) \times \vec{r}''(0)| / |\vec{r}'(0)|^3$. We will later call this the curvature.

Problem 7.4: Find the parametrization $\vec{r}(t) = [x(t), y(t), z(t)]$ of the curve obtained by intersecting the elliptical cylinder $x^2/16 + y^2/25 = 1$ with the surface $z = x^2 y$. Find the velocity vector $\vec{r}'(t)$ at the time t = π/2.

Problem 7.5: Consider the curve
$\vec{r}(t) = [x(t), y(t), z(t)] = [t^2, 1 + t, 1 + t^3]$ .
Check that it passes through the point (1, 0, 0) and find the velocity vector $\vec{r}'(t)$, the acceleration vector $\vec{r}''(t)$ as well as the jerk vector $\vec{r}'''(t)$ at this point.


1 Oliver got really sick on that ride: https://www.youtube.com/watch?v=Hr8h-WmgX0Q


Unit 8: Arc length and Curvature

Lecture

Definition: If $t \in [a, b] \mapsto \vec{r}(t)$ parametrizes a curve with velocity $\vec{r}'(t)$ and speed $|\vec{r}'(t)|$, then $L = \int_a^b |\vec{r}'(t)|\, dt$ is called the arc length of the curve.

8.1. If $\vec{r}$ is differentiable, then a polygon approximation justifies this formula. Written out, the formula is $L = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\, dt$.
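The polygon approximation can be watched converging. The sketch below measures polygons inscribed in one turn of the helix [cos(t), sin(t), t], whose speed is constantly √2, so the arc length is 2π√2. The function name `polygon_length` is an illustrative choice; `math.dist` requires Python 3.8 or later.

```python
import math

def polygon_length(r, a, b, n):
    """Length of the polygon through r(t_0), ..., r(t_n) with equally spaced t_k."""
    pts = [r(a + (b - a) * k / n) for k in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

helix = lambda t: (math.cos(t), math.sin(t), t)

for n in (4, 16, 64, 256):
    print(n, polygon_length(helix, 0, 2 * math.pi, n))
print("integral of the speed:", 2 * math.pi * math.sqrt(2))
```

The polygon lengths increase with n and approach the value of the integral from below.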

8.2. Because a parameter change t = t(s) corresponds to a substitution in the inte-

gration which does not change the integral, we immediately see “path independence of
arc length”:

The arc length is independent of the parameterization of the curve.

Definition: Define the unit tangent vector T⃗ (t) = ⃗r ′ (t)/|⃗r ′ (t)|.

Definition: The curvature of a curve at the point $\vec{r}(t)$ is defined as $\kappa(t) = |\vec{T}'(t)| / |\vec{r}'(t)|$.

8.3. The curvature is the length of the acceleration vector if ⃗r(t) parametrizes the
curve with constant speed 1. A large curvature at a point means that the curve is
strongly bent. Unlike the acceleration or the velocity, the curvature does not depend
on the parameterization of the curve. You “see” the curvature, while you “feel” the
acceleration. We can measure curvature at a point only if ⃗r ′ (t) is not zero.

The curvature does not depend on the parametrization.

Proof. Let s(t) be another parametrization. Then by the chain rule, $\frac{d}{dt} \vec{T}(s(t)) = \vec{T}'(s(t)) s'(t)$ and $\frac{d}{dt} \vec{r}(s(t)) = \vec{r}'(s(t)) s'(t)$. We see that the factor $s'(t)$ cancels in $|\vec{T}'| / |\vec{r}'|$.

In particular, if the curve is parametrized by arc length, meaning that the velocity vector $\vec{r}'(t)$ has length 1, then $\kappa(t) = |\vec{T}'(t)|$. It measures the rate of change of the unit tangent vector.

Definition: If $\vec{r}(t)$ is a curve which has nonzero speed at t, then we can define $\vec{T}(t) = \vec{r}'(t) / |\vec{r}'(t)|$, the unit tangent vector, $\vec{N}(t) = \vec{T}'(t) / |\vec{T}'(t)|$, the normal vector and $\vec{B}(t) = \vec{T}(t) \times \vec{N}(t)$, the bi-normal vector. The plane spanned by $\vec{N}$ and $\vec{B}$ is called the normal plane. It is perpendicular to the curve. The plane spanned by $\vec{T}$ and $\vec{N}$ is called the osculating plane.

8.4. If we differentiate $\vec{T}(t) \cdot \vec{T}(t) = 1$, we get $\vec{T}'(t) \cdot \vec{T}(t) = 0$ and see that $\vec{N}(t)$ is perpendicular to $\vec{T}(t)$. Because $\vec{B}$ is automatically normal to $\vec{T}$ and $\vec{N}$, we have shown:

The three vectors $(\vec{T}(t), \vec{N}(t), \vec{B}(t))$ are unit vectors orthogonal to each other.

8.5. Here is an application of curvature: if a curve ⃗r(t) represents a wave front and
⃗n(t) is a unit vector normal to the curve at ⃗r(t), then ⃗s(t) = ⃗r(t) + ⃗n(t)/κ(t) defines
a new curve called the caustic of the curve. Geometers call it the evolute of the
original curve. To the left a caustic. The picture of John Harvard was obtained by
following level curves.

A useful formula for curvature is

$\kappa(t) = \frac{|\vec{r}'(t) \times \vec{r}''(t)|}{|\vec{r}'(t)|^3}$
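The cross product formula is easy to evaluate once velocity and acceleration are known. Here is a small sketch for the helix [cos(t), sin(t), t]; the helper names and the sample time t = 0.7 are illustrative assumptions.

```python
import math

def cross(v, w):
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def curvature(velocity, acceleration):
    """kappa = |r' x r''| / |r'|^3."""
    return norm(cross(velocity, acceleration)) / norm(velocity) ** 3

t = 0.7
velocity = (-math.sin(t), math.cos(t), 1.0)          # r'(t) of the helix
acceleration = (-math.cos(t), -math.sin(t), 0.0)     # r''(t) of the helix
print(curvature(velocity, acceleration))             # about 0.5, for every t
```

For a circle of radius R traversed as [R cos(t), R sin(t), 0], the same function returns 1/R.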

8.6. We prove this in class. Finally, let us mention that curvature is also important in computer vision. If the gray level value of a picture is modeled as a function f(x, y) of two variables, places where the level curves of f have maximal curvature correspond to corners in the picture. This is useful when tracking or identifying objects.

Examples

The arc length of the circle $\vec{r}(t) = [R\cos(t), R\sin(t)]$ is 2πR. The speed $|\vec{r}'(t)|$ is constant and equal to R.

The helix $\vec{r}(t) = [\cos(t), \sin(t), t]$ has velocity $\vec{r}'(t) = [-\sin(t), \cos(t), 1]$ and constant speed $|\vec{r}'(t)| = |[-\sin(t), \cos(t), 1]| = \sqrt{2}$.

What is the arc length of the curve
$\vec{r}(t) = [\sqrt{2}\, t, \log(t), t^2/2]$
for 1 ≤ t ≤ 2? Answer: Because $\vec{r}'(t) = [\sqrt{2}, 1/t, t]$, we have $|\vec{r}'(t)| = \sqrt{2 + \frac{1}{t^2} + t^2} = |\frac{1}{t} + t|$ and $L = \int_1^2 (\frac{1}{t} + t)\, dt = [\log(t) + \frac{t^2}{2}]_1^2 = \log(2) + 2 - 1/2$.

This curve does not have a name. But because it is constructed in such a way that the arc length can be computed, we can call it "opportunity".

Find the arc length of the curve ⃗r(t) = [3t2 , 6t, t3 ] from t = 1 to t = 3.

What is the arc length of the curve $\vec{r}(t) = [\cos^3(t), \sin^3(t)]$? Answer: We have $|\vec{r}'(t)| = 3\sqrt{\sin^2(t)\cos^4(t) + \cos^2(t)\sin^4(t)} = (3/2)|\sin(2t)|$. Therefore, $L = \int_0^{2\pi} (3/2)|\sin(2t)|\, dt = 6$.

Find the arc length of $\vec{r}(t) = [t^2/2, t^3/3]$ for −1 ≤ t ≤ 1. This cubic curve satisfies $y^2 = 8x^3/9$ and is an example of an elliptic curve. Because $\int x\sqrt{1 + x^2}\, dx = (1 + x^2)^{3/2}/3$, the integral can be evaluated as $\int_{-1}^{1} |x|\sqrt{1 + x^2}\, dx = 2\int_0^1 x\sqrt{1 + x^2}\, dx = 2(1 + x^2)^{3/2}/3\,|_0^1 = 2(2\sqrt{2} - 1)/3$.

Find the arc length of the epicycle $\vec{r}(t) = [t + \sin(t), \cos(t)]$ parametrized by 0 ≤ t ≤ 2π. We have $|\vec{r}'(t)| = \sqrt{2 + 2\cos(t)}$, so that $L = \int_0^{2\pi} \sqrt{2 + 2\cos(t)}\, dt$. A substitution t = 2u gives $L = \int_0^{\pi} \sqrt{2 + 2\cos(2u)}\, 2du = \int_0^{\pi} \sqrt{2 + 2\cos^2(u) - 2\sin^2(u)}\, 2du = \int_0^{\pi} \sqrt{4\cos^2(u)}\, 2du = 4\int_0^{\pi} |\cos(u)|\, du = 8$.

Find the arc length of the catenary $\vec{r}(t) = [t, \cosh(t)]$, where $\cosh(t) = (e^t + e^{-t})/2$ is the hyperbolic cosine and t ∈ [−1, 1]. We have
$\cosh^2(t) - \sinh^2(t) = 1$ ,
where $\sinh(t) = (e^t - e^{-t})/2$ is the hyperbolic sine. Solution: We have $|\vec{r}'(t)| = \sqrt{1 + \sinh^2(t)} = \cosh(t)$ and $\int_{-1}^{1} \cosh(t)\, dt = 2\sinh(1)$.

Often, there is no closed formula for the arc length of a curve. For example, the Lissajous figure $\vec{r}(t) = [\cos(3t), \sin(5t)]$ leads to the arc length integral $\int_0^{2\pi} \sqrt{9\sin^2(3t) + 25\cos^2(5t)}\, dt$ which can only be evaluated numerically.
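A numerical evaluation of such an arc length integral might look as follows. This is a sketch using the composite Simpson rule, one of several possible quadrature methods; the function name `simpson` and the number of subintervals are illustrative choices. The catenary from the previous example serves as a check because its arc length 2 sinh(1) is known.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

# speed of the Lissajous figure [cos(3t), sin(5t)]
speed = lambda t: math.sqrt(9 * math.sin(3 * t) ** 2 + 25 * math.cos(5 * t) ** 2)

print("catenary:", simpson(math.cosh, -1, 1), "exact:", 2 * math.sinh(1))
print("Lissajous:", simpson(speed, 0, 2 * math.pi))
```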

The curve $\vec{r}(t) = [t, f(t)]$, which is the graph of a function f, has the velocity $\vec{r}'(t) = (1, f'(t))$ and the unit tangent vector $\vec{T}(t) = (1, f'(t)) / \sqrt{1 + f'(t)^2}$. After some simplification we get
$\kappa(t) = |\vec{T}'(t)| / |\vec{r}'(t)| = |f''(t)| / (1 + f'(t)^2)^{3/2}$ .
For example, for f(t) = sin(t), we get $\kappa(t) = |\sin(t)| / (1 + \cos^2(t))^{3/2}$.

Homework
This homework is due on Tuesday, 7/5/2022.

Problem 8.1: a) Find the arc length of the curve $\vec{r}(t) = [t^2, 2t^3/3, 2]$ from t = −2 to t = 2.
b) Find the arc length of $\vec{r}(t) = [4t, 4\sin(3t), 4\cos(3t), 2]$ with 0 ≤ t ≤ π.

Problem 8.2: Find the curvature of $\vec{r}(t) = [e^t \cos(t), e^t \sin(t), t]$ at the point (1, 0, 0).

Problem 8.3: Find the vectors $\vec{T}(t)$, $\vec{N}(t)$ and $\vec{B}(t)$ for the curve $\vec{r}(t) = [t^2, t^3, 0]$ for t = 2. Do the vectors $\vec{T}(t), \vec{N}(t), \vec{B}(t)$ depend continuously on t for all t?

Problem 8.4: Let $\vec{r}(t) = [t, t^2]$. Find the equation for the caustic $\vec{s}(t) = \vec{r}(t) + \vec{N}(t)/\kappa(t)$. It is known also as the evolute of the curve.

Problem 8.5: If ⃗r(t) = [− sin(t), cos(t)] is the boundary of a coffee cup

and light enters in the direction [−1, 0], then light focuses inside the cup
on a curve which is called the coffee cup caustic. The light ray travels
after the reflection for length sin(θ)/(2κ) until it reaches the caustic. Find
a parameterization of the caustic.


Unit 9: Partial derivatives

Lecture
9.1. Functions of several variables can be differentiated with respect to any of the variables:

Definition: If f(x, y) is a function of the two variables x and y, then the partial derivative $\frac{\partial}{\partial x} f(x, y)$ is defined as the derivative of the function g(x) = f(x, y) with respect to x, where y is kept constant. The partial derivative with respect to y is the derivative with respect to y, where x is fixed.

9.2. The shorthand notation $f_x(x, y) = \frac{\partial}{\partial x} f(x, y)$ is convenient. When iterating derivatives, the notation is similar: we write for example $f_{xy} = \frac{\partial}{\partial x} \frac{\partial}{\partial y} f$. The number $f_x(x_0, y_0)$ gives the slope of the graph sliced at $(x_0, y_0)$ in the x-direction. The second derivative $f_{xx}$ is a measure of concavity in the x-direction. The meaning of $f_{xy}$ is the rate of change of the x-slope if the horizontal cut is moved along the y-axis.

9.3. The notation ∂x f, ∂y f was introduced by Carl Gustav Jacobi. Before that, Josef
Lagrange used the term “partial differences”. For functions of three or more variables,
the partial derivatives are defined in the same way. We write for example fx (x, y, z) or
fxxz (x, y, z).

Theorem: Clairaut’s theorem: If fxy and fyx are both continuous,

then fxy = fyx .

9.4. Proof. Following Euler, we first look at difference quotients: for a fixed positive “Planck constant” h, define $f_x(x, y) = [f(x + h, y) - f(x, y)]/h$ and similarly $f_y(x, y) = [f(x, y + h) - f(x, y)]/h$. In the limit h → 0, these become the usual partial derivatives. Comparing the two sides of the equation $f_{xy} = f_{yx}$ for fixed h > 0 shows:

$h f_x(x, y) = f(x + h, y) - f(x, y)$, $h f_y(x, y) = f(x, y + h) - f(x, y)$,

$h^2 f_{xy}(x, y) = f(x + h, y + h) - f(x, y + h) - (f(x + h, y) - f(x, y))$,
$h^2 f_{yx}(x, y) = f(x + h, y + h) - f(x + h, y) - (f(x, y + h) - f(x, y))$.

Both right hand sides consist of the same four terms.

9.5. Without having taken any limits, we established an identity which holds for all h > 0: the discrete derivatives $f_x, f_y$ satisfy the relation $f_{xy} = f_{yx}$ for any h > 0. We could call it the “quantum Clairaut” formula. If the classical derivatives $f_{xy}, f_{yx}$ are both continuous, it is possible to take the limit h → 0. The classical Clairaut’s theorem can be seen as a “classical limit”. The quantum Clairaut formula holds however for all functions f(x, y) of two variables. Not even continuity is needed. 1
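That the discrete identity needs no continuity can be checked with exact rational arithmetic. The sketch below uses Python's `fractions` module so that no rounding occurs; the discontinuous test function and the step h = 1/7 are arbitrary illustrative choices.

```python
from fractions import Fraction
import math

def dx(f, h):
    """Discrete partial derivative in x with step h."""
    return lambda x, y: (f(x + h, y) - f(x, y)) / h

def dy(f, h):
    """Discrete partial derivative in y with step h."""
    return lambda x, y: (f(x, y + h) - f(x, y)) / h

# a deliberately discontinuous function: the floor term jumps
f = lambda x, y: Fraction(math.floor(3 * x * y)) + x * x * y

h = Fraction(1, 7)
f_xy = dy(dx(f, h), h)
f_yx = dx(dy(f, h), h)

p = (Fraction(1, 3), Fraction(2, 5))
print(f_xy(*p), f_yx(*p))      # the two values agree exactly
```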

9.6. An equation for an unknown function f (x, y) which involves partial derivatives
with respect to at least two different variables is called a partial differential equa-
tion. We abbreviate PDE. If only the derivative with respect to one variable appears,
it is an ordinary differential equation, abbreviated ODE.

Examples
9.7. For $f(x, y) = x^4 - 6x^2 y^2 + y^4$, we have $f_x(x, y) = 4x^3 - 12xy^2$, $f_{xx} = 12x^2 - 12y^2$, $f_y(x, y) = -12x^2 y + 4y^3$, $f_{yy} = -12x^2 + 12y^2$ and see that $\Delta f = f_{xx} + f_{yy} = 0$. A function which satisfies $\Delta f = 0$ is also called harmonic. The equation $f_{xx} + f_{yy} = 0$ is a PDE:
Definition: A partial differential equation (PDE) is an equation for an unknown function f(x, y) which involves partial derivatives with respect to more than one variable.

9.8.
The wave equation ftt (t, x) = fxx (t, x) governs the motion of light or
sound. The function f (t, x) = sin(x − t) + sin(x + t) satisfies the wave
equation.

The heat equation $f_t(t, x) = f_{xx}(t, x)$ describes diffusion of heat or spread of an epidemic. The function $f(t, x) = \frac{1}{\sqrt{t}} e^{-x^2/(4t)}$ satisfies the heat equation.

The Laplace equation fxx + fyy = 0 determines the shape of a mem-

brane. The function f (x, y) = x3 − 3xy 2 is an example satisfying the
Laplace equation.

The advection equation $f_t = f_x$ is used to model transport in a wire. The function $f(t, x) = e^{-(x+t)^2}$ satisfies the advection equation.

1 For a full proof of Clairaut’s theorem, see www.math.harvard.edu/~knill/teaching/math22a2018/handouts/lecture14.pdf .

The Burgers equation $f_t + f f_x = f_{xx}$ describes waves at the beach which break. The function $f(t, x) = \frac{x}{t} \cdot \frac{\sqrt{1/t}\, e^{-x^2/(4t)}}{1 + \sqrt{1/t}\, e^{-x^2/(4t)}}$ satisfies the Burgers equation.

The eiconal equation $f_x^2 + f_y^2 = 1$ is used to see the evolution of wave fronts in optics. The function $f(x, y) = \cos(a) x + \sin(a) y$ satisfies the eiconal equation for any constant a.

The KdV equation $f_t + 6 f f_x + f_{xxx} = 0$ models water waves in a narrow channel. The function $f(t, x) = \frac{a^2}{2} \cosh^{-2}(\frac{a}{2}(x - a^2 t))$ satisfies the KdV equation.

The Schrödinger equation $f_t = \frac{i\hbar}{2m} f_{xx}$ is used to describe a quantum particle of mass m. The function $f(t, x) = e^{i(kx - \frac{\hbar}{2m} k^2 t)}$ solves the Schrödinger equation. [Here $i^2 = -1$ is the imaginary i and ℏ is the Planck constant $\hbar \sim 10^{-34}$ Js.]

Can you match the graphs of the functions f(t, x) with the equations they satisfy?

9.9. In all examples, we just see one possible solution to the partial differential equa-
tion. There are in general many solutions and additional initial or boundary conditions
then determine the solution uniquely. If we know f (0, x) for the Burgers equation, then
the solution f (t, x) is determined.
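Whether a given function solves a PDE can be tested numerically before doing the computation by hand. The following sketch approximates $f_t - f_{xx}$ with central differences for the heat kernel $f(t, x) = e^{-x^2/(4t)}/\sqrt{t}$; the step size h and the sample point are arbitrary assumptions, and a small residual is only evidence, not a proof.

```python
import math

def heat_kernel(t, x):
    return math.exp(-x * x / (4 * t)) / math.sqrt(t)

def heat_residual(f, t, x, h=1e-3):
    """Central-difference estimate of f_t - f_xx at the point (t, x)."""
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    f_xx = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / (h * h)
    return f_t - f_xx

print(heat_residual(heat_kernel, 1.0, 0.5))                       # close to 0
print(heat_residual(lambda t, x: t * math.sin(x), 1.0, 0.5))      # not close to 0
```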

Homework
This homework is due on Tuesday, 7/12/2022.

Problem 9.1: Verify that $f(t, x) = \exp((t + x)^2) + \sin(\sin(t + x))$ is a solution of the transport equation $f_t(t, x) = f_x(t, x)$.

Problem 9.2: a) Verify that f(x, y) = sin(2x)(cos(21y) + sin(21y)) satisfies the Klein Gordon equation $u_{xx} - u_{yy} = 437u$. This PDE is useful in quantum mechanics.
b) Verify that $4 \arctan(e^{(x - t/2)\, 2/\sqrt{3}})$ satisfies the sine-Gordon equation $u_{tt} - u_{xx} = -\sin(u)$. You might want to use technology.

Problem 9.3: Verify that for any real constant b, the function $f(x, t) = e^{-bt} \sin(x + t)$ satisfies the driven transport equation $f_t(x, t) = f_x(x, t) - b f(x, t)$. This PDE is sometimes called the advection equation with damping b.

Problem 9.4: The differential equation
$f_t = f - x f_x - x^2 f_{xx}$
is a version of the infamous Black-Scholes equation. Here f(x, t) is the price of a call option, x is the stock price and t is time. Find a function f(x, t) solving it which depends both on x and t. Do not just produce an example like f(x, t) = x or $f(x, t) = e^t$ which only depends on one variable. But you certainly can get inspired by them.

Problem 9.5: The partial differential equation $f_t + f f_x = f_{xx}$ is called Burgers equation and describes waves at the beach. In higher dimensions, it leads to the Navier-Stokes equations which are used to describe the weather. Verify that
$f(t, x) = \frac{(1/t)^{3/2}\, x\, e^{-x^2/(4t)}}{\sqrt{1/t}\, e^{-x^2/(4t)} + 1}$
solves the Burgers equation. Here too, you might want to get help from technology.


Unit 10: Linearization

Lecture
10.1. In single variable calculus we have seen how to approximate functions by linear
functions:
Definition: The linear approximation of f (x) at a is the affine function
L(x) = f (a) + f ′ (a)(x − a) .

10.2. If you remember Taylor series, this is the part of the series $f(x) = \sum_{k=0}^{\infty} f^{(k)}(a)(x - a)^k / k!$, where only the k = 0 and k = 1 terms are considered. We think about the linear approximation L as a function and not as a graph because we also will look at linear approximations for functions of three variables, where we can not draw graphs.

[Figure: the graph y = f(x) and its linear approximation y = L(x).]

10.3. The graph of the function L is close to the graph of f at a. What about higher
dimensions?
Definition: The linear approximation of f (x, y) at (a, b) is the affine
function
L(x, y) = f (a, b) + fx (a, b)(x − a) + fy (a, b)(y − b) .
The linear approximation of f (x, y, z) at (a, b, c) is
L(x, y, z) = f (a, b, c) + fx (a, b, c)(x − a) + fy (a, b, c)(y − b) + fz (a, b, c)(z − c) .

10.4. Using the gradient

∇f (x, y) = [fx , fy ], ∇f (x, y, z) = [fx , fy , fz ] ,
the linearization can be written more compactly as
$L(\vec{x}) = f(\vec{a}) + \nabla f(\vec{a}) \cdot (\vec{x} - \vec{a})$ .

10.5. How do we justify the linearization? If the second variable y = b is fixed, we have a one-dimensional situation, where the only variable is x. Now $f(x, b) \approx f(a, b) + f_x(a, b)(x - a)$ is the linear approximation. Similarly, if x = a is fixed and y is the single variable, then $f(a, y) \approx f(a, b) + f_y(a, b)(y - b)$. Knowing the linear approximations in both the x and y variables, we get the general linear approximation $f(x, y) \approx f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)$.

Examples
10.6. What is the linear approximation of the function $f(x, y) = \sin(\pi x y^2)$ at the point (1, 1)? Answer: We have $[f_x(x, y), f_y(x, y)] = [\pi y^2 \cos(\pi x y^2), 2\pi x y \cos(\pi x y^2)]$, which at the point (1, 1) is equal to $\nabla f(1, 1) = [\pi \cos(\pi), 2\pi \cos(\pi)] = [-\pi, -2\pi]$. The function is $L(x, y) = 0 + (-\pi)(x - 1) - 2\pi(y - 1)$.

10.7. Linearization can be used to estimate functions near a point. In the previous example,
$f(1 + 0.01, 1 + 0.01) \approx -0.095$ ,
$L(1 + 0.01, 1 + 0.01) = -\pi \cdot 0.01 - 2\pi \cdot 0.01 = -3\pi/100 \approx -0.0942$ .
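The comparison between f and its linearization at (1, 1) can be reproduced in a few lines. This sketch hard-codes the gradient [−π, −2π] computed in 10.6; the names `f` and `L` follow the text.

```python
import math

def f(x, y):
    return math.sin(math.pi * x * y ** 2)

def L(x, y):
    # linearization at (1, 1): f(1, 1) = 0 and grad f(1, 1) = [-pi, -2 pi]
    return -math.pi * (x - 1) - 2 * math.pi * (y - 1)

print(f(1.01, 1.01), L(1.01, 1.01))   # two nearby values
print(f(1.2, 1.2), L(1.2, 1.2))       # further away the estimate degrades
```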

10.8. Here is an example in three dimensions: find the linear approximation to f(x, y, z) = xy + yz + zx at the point (1, 1, 1). Since f(1, 1, 1) = 3 and ∇f(x, y, z) = [y + z, x + z, y + x], so that ∇f(1, 1, 1) = [2, 2, 2], we have L(x, y, z) = f(1, 1, 1) + [2, 2, 2] · [x − 1, y − 1, z − 1] = 3 + 2(x − 1) + 2(y − 1) + 2(z − 1) = 2x + 2y + 2z − 3.
10.9. Estimate f(0.01, 24.8, 1.02) for $f(x, y, z) = e^x \sqrt{y}\, z$.
Solution: take $(x_0, y_0, z_0) = (0, 25, 1)$, where $f(x_0, y_0, z_0) = 5$. The gradient is $\nabla f(x, y, z) = (e^x \sqrt{y}\, z, e^x z/(2\sqrt{y}), e^x \sqrt{y})$. At the point $(x_0, y_0, z_0) = (0, 25, 1)$ the gradient is the vector (5, 1/10, 5). The linear approximation is $L(x, y, z) = f(x_0, y_0, z_0) + \nabla f(x_0, y_0, z_0) \cdot (x - x_0, y - y_0, z - z_0) = 5 + (5, 1/10, 5) \cdot (x - 0, y - 25, z - 1) = 5x + y/10 + 5z - 2.5$. We can approximate f(0.01, 24.8, 1.02) by 5 + (5, 1/10, 5) · (0.01, −0.2, 0.02) = 5 + 0.05 − 0.02 + 0.10 = 5.13. The actual value is f(0.01, 24.8, 1.02) = 5.1306, very close to the estimate.

10.10. Find the tangent line to the graph of the function $g(x) = x^2$ at the point (2, 4).
Solution: the level curve $f(x, y) = y - x^2 = 0$ is the graph of the function $g(x) = x^2$, and the tangent at the point (2, g(2)) = (2, 4) is obtained by computing the gradient $[a, b] = \nabla f(2, 4) = [-g'(2), 1] = [-4, 1]$ and forming −4x + y = d, where d = −4 · 2 + 1 · 4 = −4. The answer is −4x + y = −4, which is the line y = 4x − 4 of slope 4.
10.11. The Barth surface is defined as the level surface f = 0 of
$f(x, y, z) = (3 + 5t)(-1 + x^2 + y^2 + z^2)^2 (-2 + t + x^2 + y^2 + z^2)^2 + 8(x^2 - t^4 y^2)(-t^4 x^2 + z^2)(y^2 - t^4 z^2)(x^4 - 2x^2 y^2 + y^4 - 2x^2 z^2 - 2y^2 z^2 + z^4)$ ,
where $t = (\sqrt{5} + 1)/2$ is a constant called the golden ratio. If we replace t with $1/t = (\sqrt{5} - 1)/2$, we see the surface in the middle. For t = 1, we see to the right the surface f(x, y, z) = 8. Find the tangent plane of the latter surface at the point (1, 1, 0).
Answer: We have ∇f(1, 1, 0) = [64, 64, 0]. The tangent plane is x + y = d for some constant d. By plugging in (1, 1, 0) we see that x + y = 2.

The quartic surface
$f(x, y, z) = x^4 - x^3 + y^2 + z^2 = 0$
is called the piriform, a pear shaped surface. What is the equation for the tangent plane at the point P = (2, 2, 2) to the level surface of f through P, which is f(x, y, z) = 16? We get ∇f(2, 2, 2) = [a, b, c] = [20, 4, 4] and so the equation of the plane 20x + 4y + 4z = 56, where we have obtained the constant to the right by plugging in the point (x, y, z) = (2, 2, 2).
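The recipe "gradient gives the normal, the point gives the constant" can be sketched as follows for $f(x, y, z) = x^4 - x^3 + y^2 + z^2$ and P = (2, 2, 2). The helper `tangent_plane` is a hypothetical name; the plane it returns is tangent to the level surface of f that passes through P.

```python
def grad_f(x, y, z):
    # gradient of f(x, y, z) = x^4 - x^3 + y^2 + z^2
    return (4 * x ** 3 - 3 * x ** 2, 2 * y, 2 * z)

def tangent_plane(grad, P):
    """Return (a, b, c, d) for the plane a x + b y + c z = d through P with normal grad(P)."""
    n = grad(*P)
    d = sum(ni * pi for ni, pi in zip(n, P))
    return (*n, d)

print(tangent_plane(grad_f, (2, 2, 2)))   # (20, 4, 4, 56)
```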

10.12. Linearization is just the first step for more accurate approximations. One could do quadratic approximations for example. In one dimension, one has $Q(x) = f(a) + f'(a)(x - a) + f''(a) \frac{(x - a)^2}{2!}$. In two dimensions, this becomes $Q(x, y) = L(x, y) + H(a, b)[x - a, y - b] \cdot [x - a, y - b]/2$, where H is the Hessian matrix
$H(a, b) = \begin{bmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{bmatrix}$ .
We will see this matrix next week, when we maximize or minimize functions.

Homework
This homework is due on Tuesday, 7/12/2022.

Problem 10.1: Estimate $400'000'000^{1/9}$ using linear approximation of $f(x) = x^{1/9}$ near $x_0 = 9^9$.

Problem 10.2: Given $f(x, y) = \frac{6yx}{\pi} - \cos(x)$. Estimate f(π + 0.01, π − 0.03) using linearization.

Problem 10.3: Estimate f(0.003, 0.9999) for f(x, y) = cos(πy) + sin(x + πy) using linearization.

Problem 10.4: Find the linear approximation L(x, y) of the function
$f(x, y) = \sqrt{10 - x^2 - 5y^2}$
at (2, 1) and use it to estimate f(1.95, 1.04).

Problem 10.5: Estimate $99^3 \cdot 101^2$ by linearizing the function $f(x, y) = x^3 y^2$ at (100, 100). What is the difference between L(100, 100) and f(100, 100)?


Unit 11: Chain rule

Lecture
11.1. If f and g are functions of t, the single variable chain rule tells us that $\frac{d}{dt} f(g(t)) = f'(g(t)) g'(t)$. For example, $\frac{d}{dt} \sin(\log(t)) = \cos(\log(t))/t$. The rule can be proven by linearization of the functions f and g and verifying the chain rule in the linear case. The chain rule is also useful:

11.2. To find arccos′(x) for example, we differentiate x = cos(arccos(x)) to get $1 = \frac{d}{dx} \cos(\arccos(x)) = -\sin(\arccos(x)) \arccos'(x) = -\sqrt{1 - \cos^2(\arccos(x))}\, \arccos'(x) = -\sqrt{1 - x^2}\, \arccos'(x)$ so that $\arccos'(x) = -1/\sqrt{1 - x^2}$.

Definition: Define the gradient $\nabla f(x, y) = [f_x(x, y), f_y(x, y)]$ or $\nabla f(x, y, z) = [f_x(x, y, z), f_y(x, y, z), f_z(x, y, z)]$. It is the analog of the derivative in higher dimensions.

11.3. If $\vec{r}(t)$ is a curve and f is a function of several variables, we get a function $t \mapsto f(\vec{r}(t))$ of one variable. In particular, if $\vec{r}(t)$ is a parametrization of a planar curve and f is a function of two variables, then $t \mapsto f(\vec{r}(t))$ is a function of one variable.

Theorem: $\frac{d}{dt} f(\vec{r}(t)) = \nabla f(\vec{r}(t)) \cdot \vec{r}'(t)$.

Proof. When written out in two dimensions, it is

$\frac{d}{dt} f(x(t), y(t)) = f_x(x(t), y(t)) x'(t) + f_y(x(t), y(t)) y'(t)$ .

The identity

$\frac{f(x(t+h), y(t+h)) - f(x(t), y(t))}{h} = \frac{f(x(t+h), y(t+h)) - f(x(t), y(t+h))}{h} + \frac{f(x(t), y(t+h)) - f(x(t), y(t))}{h}$

holds for every h > 0. The left hand side converges to $\frac{d}{dt} f(x(t), y(t))$ in the limit h → 0 and the right hand side to $f_x(x(t), y(t)) x'(t) + f_y(x(t), y(t)) y'(t)$ using the single variable chain rule twice. Here is the proof of the latter, when we differentiate f with respect to t and y is treated as a constant:

$\frac{f(x(t+h)) - f(x(t))}{h} = \frac{f(x(t) + (x(t+h) - x(t))) - f(x(t))}{x(t+h) - x(t)} \cdot \frac{x(t+h) - x(t)}{h}$ .

Write H(t) = x(t+h) − x(t) in the first part on the right hand side:

$\frac{f(x(t+h)) - f(x(t))}{h} = \frac{f(x(t) + H) - f(x(t))}{H} \cdot \frac{x(t+h) - x(t)}{h}$ .

As h → 0, we also have H → 0 and the first part goes to f′(x(t)) and the second factor to x′(t).
11.4. The chain rule is powerful because it implies the other differentiation rules: the addition, product and quotient rule. In each case take $x = u(t)$, $y = v(t)$:
$f(x,y) = x+y$ gives $\frac{d}{dt}(x+y) = f_x u' + f_y v' = u' + v'$.
$f(x,y) = xy$ gives $\frac{d}{dt}(xy) = f_x u' + f_y v' = v u' + u v'$.
$f(x,y) = x/y$ gives $\frac{d}{dt}(x/y) = f_x u' + f_y v' = u'/v - u v'/v^2$.
11.5. We mentioned that the chain rule can be derived from linearization. Let us look at this case: if f is a linear function $f(x,y) = ax + by - c$ and if $\vec{r}(t) = [x_0 + tu, y_0 + tv]$ is a line, then $\frac{d}{dt} f(\vec{r}(t)) = \frac{d}{dt}\big(a(x_0+tu) + b(y_0+tv) - c\big) = au + bv$ and this is the dot product of $\nabla f = [a,b]$ with $\vec{r}\,'(t) = [u,v]$. Because the chain rule only refers to the derivatives of the functions, and the linearization does too, the chain rule is also true for general functions.

Examples
11.6. A ladybug moves on a circle $\vec{r}(t) = [\cos(t), \sin(t)]$ on a table with temperature distribution $f(x,y) = x^2 - y^3$. Find the rate of change of the temperature. Solution: $\nabla f(x,y) = [2x, -3y^2]$ and $\vec{r}\,'(t) = [-\sin(t), \cos(t)]$, so that $\frac{d}{dt} f(\vec{r}(t)) = \nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t) = [2\cos(t), -3\sin^2(t)] \cdot [-\sin(t), \cos(t)] = -2\cos(t)\sin(t) - 3\sin^2(t)\cos(t)$.
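
The ladybug computation can be checked numerically. The following is a minimal sketch, not part of the original notes: it compares the chain rule expression with a central difference quotient of $t \mapsto f(\vec{r}(t))$. The sample time `t0` and the step size `h` are arbitrary choices made for illustration.

```python
import math

def f(x, y):
    # temperature distribution from the ladybug example
    return x**2 - y**3

def grad_f(x, y):
    # gradient [f_x, f_y] computed by hand
    return (2 * x, -3 * y**2)

def r(t):
    # position of the ladybug on the unit circle
    return (math.cos(t), math.sin(t))

def r_prime(t):
    # velocity of the ladybug
    return (-math.sin(t), math.cos(t))

def chain_rule_rate(t):
    # dot product of the gradient at r(t) with the velocity r'(t)
    gx, gy = grad_f(*r(t))
    vx, vy = r_prime(t)
    return gx * vx + gy * vy

def numerical_rate(t, h=1e-6):
    # central difference quotient of t -> f(r(t))
    return (f(*r(t + h)) - f(*r(t - h))) / (2 * h)

t0 = 0.7  # arbitrary sample time, chosen only for this demonstration
print(chain_rule_rate(t0), numerical_rate(t0))
```

Both printed numbers should agree to many digits, which is what the chain rule predicts.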

11.7. From $f(x,y) = 0$, one can express y as a function of x, at least near a point where $f_y$ is not zero. From $\frac{d}{dx} f(x, y(x)) = \nabla f \cdot [1, y'(x)] = f_x + f_y y' = 0$, we obtain $y' = -f_x/f_y$. Even though we do not know $y(x)$, we can compute its derivative! Implicit differentiation works also in three variables. The equation $f(x,y,z) = c$ defines a surface. Near a point where $f_z$ is not zero, the surface can be described as a graph $z = z(x,y)$. We can compute the derivative $z_x$ without actually knowing the function $z(x,y)$. To do so, we consider y a fixed parameter and compute, using the chain rule,
$$f_x(x,y,z(x,y)) \cdot 1 + f_y(x,y,z(x,y)) \cdot 0 + f_z(x,y,z(x,y)) \cdot z_x(x,y) = 0$$
so that $z_x(x,y) = -f_x(x,y,z)/f_z(x,y,z)$. This works at points where $f_z$ is not zero.
11.8. The surface $f(x,y,z) = x^2 + y^2/4 + z^2/9 = 6$ is an ellipsoid. Compute $z_x(x,y)$ at the point $(x,y,z) = (2,1,1)$.
Solution: $z_x(x,y) = -f_x(2,1,1)/f_z(2,1,1) = -4/(2/9) = -18$.
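
The formula $z_x = -f_x/f_z$ can be compared with a direct computation. This is a sketch under a stated assumption: the level value `c` is taken to be $f(2,1,1)$, so that the level surface used here passes through the point $(2,1,1)$, and the upper branch $z > 0$ is solved explicitly. The helper names are hypothetical.

```python
import math

def f(x, y, z):
    return x**2 + y**2 / 4 + z**2 / 9

def z_on_surface(x, y, c):
    # upper branch z > 0 of the level surface f(x, y, z) = c
    return 3 * math.sqrt(c - x**2 - y**2 / 4)

def zx_formula(x, y, z):
    # implicit differentiation: z_x = -f_x / f_z
    fx = 2 * x
    fz = 2 * z / 9
    return -fx / fz

x0, y0, z0 = 2.0, 1.0, 1.0
c = f(x0, y0, z0)  # assumption: use the level through (2, 1, 1)
h = 1e-6
# central difference of the explicit graph z(x, y) in the x direction
zx_numeric = (z_on_surface(x0 + h, y0, c) - z_on_surface(x0 - h, y0, c)) / (2 * h)
print(zx_formula(x0, y0, z0), zx_numeric)
```

The point of implicit differentiation is that the first number needs no explicit formula for $z(x,y)$, while the second one does.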
Homework
This homework is due on Tuesday, 7/12/2022.

Problem 11.1: You know that $\frac{d}{dt} f(\vec{r}(t)) = 69$ at $t = 7$ if $\vec{r}(t) = [t, t]$ and $\frac{d}{dt} f(\vec{r}(t)) = 21$ at $t = 7$ if $\vec{r}(t) = [t, 14 - t]$. Find the gradient of f at (7, 7).

Problem 11.2: The pressure in the space at the position (x, y, z) is $p(x,y,z) = x^2 + y^2 - z^3$ and the trajectory of an observer is the curve $\vec{r}(t) = [t, t, 1/t]$. Using the chain rule, compute the rate of change of the pressure the observer measures at time t = 2.

Problem 11.3: The chain rule is closely related to linearization. Let's get back to linearization a bit: A farm costs f(x, y), where x is the number of cows and y is the number of ducks. There are 10 cows and 20 ducks and f(10, 20) = 1000000. We know that $f_x(x,y) = 2x$ and $f_y(x,y) = y^2$ for all x, y. Estimate f(12, 19).
Here is a song out of this:
”Old MacDonald had a million dollar farm, E-I-E-I-O,
and on that farm he had x = 10 cows, E-I-E-I-O,
and on that farm he had y = 20 ducks, E-I-E-I-O,
with $f_x = 2x$ here and $f_y = y^2$ there,
and here two cows more, and there a duck less,
how much does the farm cost now, E-I-E-I-O?”
Problem 11.4: Find, using implicit differentiation, the derivative $\frac{d}{dx}\operatorname{arctanh}(x)$, where
$$\tanh(x) = \sinh(x)/\cosh(x).$$
The hyperbolic sine and hyperbolic cosine are defined as $\sinh(x) = (e^x - e^{-x})/2$ and $\cosh(x) = (e^x + e^{-x})/2$. We have $\sinh' = \cosh$ and $\cosh' = \sinh$ and $\cosh^2(x) - \sinh^2(x) = 1$.

Problem 11.5: The equation $f(x,y,z) = e^{xyz} + z = 1 + e$ implicitly defines z as a function $z = g(x,y)$ of x and y. Find formulas (in terms of x, y and z) for $g_x(x,y)$ and $g_y(x,y)$. Estimate g(1.01, 0.999) using linear approximation.

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022

MULTIVARIABLE CALCULUS

MATH S-21A

Unit 12: Tangent spaces

Lecture
12.1. The gradient ∇f (x, y, z) = [a, b, c] is the derivative of a scalar function f of
many variables. It produces a vector at every point (x, y, z). This vector [a, b, c] is
useful for example to compute tangent lines or tangent planes.

Definition: The gradient of a function f (x, y) is defined as

∇f (x, y) = [fx (x, y), fy (x, y)] .
For functions of three variables, define
∇f (x, y, z) = [fx (x, y, z), fy (x, y, z), fz (x, y, z)] .

12.2. The symbol ∇ is spelled “Nabla” and named after an Egyptian or Assyrian harp. Early on, the name “Atled” was suggested. But the 1901 textbook of Gibbs, which used Nabla, was too persuasive. The following important fact holds in any dimension.

Theorem: Gradient Theorem: ∇f (x0 , y0 ) is perpendicular to the

level curve {(x, y) | f (x, y) = c} containing (x0 , y0 ). ∇f (x0 , y0 , z0 ) is
perpendicular to the level surface {(x, y, z) | f (x, y, z) = c} containing
(x0 , y0 , z0 ).

Proof: Every curve $\vec{r}(t)$ on the level curve or level surface satisfies $\frac{d}{dt} f(\vec{r}(t)) = 0$. By the chain rule, $\nabla f(\vec{r}(t))$ is perpendicular to the tangent vector $\vec{r}\,'(t)$. QED.

12.3. Because ⃗n = ∇f (p, q) = [a, b] is perpendicular to the level curve f (x, y) = c

through (p, q), the equation for the tangent line is ax+by = d, a = fx (p, q), b = fy (p, q),
d = ap + bq. Compactly written, this is
∇f (⃗x0 ) · (⃗x − ⃗x0 ) = 0
and means that the gradient of f is perpendicular to any vector (⃗x − ⃗x0 ) in the plane.
It is one of the most important statements in multivariable calculus as it gives a crucial
link between calculus and geometry. The just mentioned gradient theorem is also useful.
We can immediately compute tangent planes and tangent lines, without linearization!

Definition: If f is a function of several variables and $\vec{v}$ is a unit vector then $D_{\vec{v}} f = \nabla f \cdot \vec{v}$ is called the directional derivative of f in the direction $\vec{v}$.
The name “directional derivative” is related to the fact that every unit vector gives a direction. If $\vec{v}$ is a unit vector, then the chain rule tells us $D_{\vec{v}} f = \frac{d}{dt} f(\vec{x} + t\vec{v})$ at $t = 0$.

The directional derivative tells us how the function changes when we move in a given
direction. Assume for example that T (x, y, z) is the temperature at position (x, y, z). If
we move with velocity ⃗v through space, then D⃗v T tells us at which rate the temperature
changes for us. If we move with velocity ⃗v on a hilly surface of height h(x, y), then
D⃗v h(x, y) gives us the slope we drive on.
12.4. If $\vec{r}(t)$ is a curve with velocity $\vec{r}\,'(t)$ and the speed is 1, then $D_{\vec{r}\,'(t)} f = \nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t)$ is the temperature change one measures at $\vec{r}(t)$. The chain rule told us that this is $\frac{d}{dt} f(\vec{r}(t))$.
12.5. For $\vec{v} = [1, 0, 0]$, we have $D_{\vec{v}} f = \nabla f \cdot \vec{v} = f_x$. The directional derivative is therefore a generalization of the partial derivatives. It measures the rate of change of f if we walk with unit speed into that direction. But as with partial derivatives, it is a scalar.
12.6. The directional derivative satisfies $|D_{\vec{v}} f| \le |\nabla f||\vec{v}|$ because
$$|\nabla f \cdot \vec{v}| = |\nabla f||\vec{v}||\cos(\phi)| \le |\nabla f||\vec{v}|.$$

Definition: The direction ⃗v = ∇f /|∇f | is the direction, where f increases

most. It is the direction of steepest ascent.

12.7. If ⃗v = ∇f /|∇f |, then the directional derivative is ∇f · ∇f /|∇f | = |∇f |. This

means f increases, if we move into the direction of the gradient. The slope in that
direction is |∇f |.
12.10. The directional derivative has the same properties as any derivative: $D_v(\lambda f) = \lambda D_v(f)$, $D_v(f+g) = D_v(f) + D_v(g)$ and $D_v(fg) = D_v(f)g + f D_v(g)$.
We will see later that points with ∇f = ⃗0 are candidates for local maxima or min-
ima of f . Points (x, y), where ∇f (x, y) = (0, 0) are called critical points and help
to understand the function f .

Examples
12.11. Compute the tangent plane to the surface $3x^2 y + z^2 - 4 = 0$ at the point (1, 1, 1). Solution: $\nabla f(x,y,z) = [6xy, 3x^2, 2z]$ and $\nabla f(1,1,1) = [6, 3, 2]$. The plane is $6x + 3y + 2z = d$ where d is a constant. We can find the constant d by plugging in the point and get $6x + 3y + 2z = 11$.
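
The recipe of this example (gradient as normal vector, then plug in the point) can be written as a short sketch. The function name `tangent_plane` is a hypothetical helper introduced only for illustration, and the gradient is entered by hand.

```python
def grad_f(x, y, z):
    # gradient of f(x, y, z) = 3x^2 y + z^2 - 4, computed by hand
    return (6 * x * y, 3 * x**2, 2 * z)

def tangent_plane(point):
    """Return (a, b, c, d) such that the tangent plane is ax + by + cz = d."""
    a, b, c = grad_f(*point)
    # the constant d is obtained by plugging the point into ax + by + cz
    d = a * point[0] + b * point[1] + c * point[2]
    return a, b, c, d

print(tangent_plane((1, 1, 1)))
```

For the point (1, 1, 1) this reproduces the coefficients 6, 3, 2 and the constant 11 found above.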
12.12. Problem: reflect the ray $\vec{r}(t) = [1-t, -t, 1]$ at the surface
$$x^4 + y^2 + z^6 = 6.$$
Solution: $\vec{r}(t)$ hits the surface at the time t = 2 in the point (−1, −2, 1). The velocity vector of that ray is $\vec{v} = [-1, -1, 0]$. The normal vector at this point is $\nabla f(-1,-2,1) = [-4, -4, 6] = \vec{n}$. The reflected vector is $R(\vec{v}) = 2\,\mathrm{Proj}_{\vec{n}}(\vec{v}) - \vec{v}$. We have $\mathrm{Proj}_{\vec{n}}(\vec{v}) = (8/68)[-4, -4, 6]$. Therefore, the reflected ray is $\vec{w} = (4/17)[-4, -4, 6] - [-1, -1, 0]$.

12.13. You are on a trip in an airship over Cambridge at (1, 2) and you want to avoid a thunderstorm, a region of low pressure. The pressure is given by a function $p(x,y) = x^2 + 2y^2$. In which direction do you have to fly so that the pressure change is largest? Solution: The gradient $\nabla p(x,y) = [2x, 4y]$ at the point (1, 2) is [2, 8]. Normalize to get the direction $[1, 4]/\sqrt{17}$.
12.14. The “Dom” is a mountain in Switzerland with an altitude of 4’545 meters. In suitable units on the ground, the height f(x, y) is approximated by the quadratic function $f(x,y) = 4000 - x^2 - y^2$. At height f(−10, 10) = 3800, at the point (−10, 10, 3800), you rest. The climbing route continues into the south-east direction $\vec{v} = [1, -1]/\sqrt{2}$. Calculate the rate of change in that direction. We have $\nabla f(x,y) = [-2x, -2y]$, so that $\nabla f(-10,10) \cdot \vec{v} = [20, -20] \cdot [1, -1]/\sqrt{2} = 40/\sqrt{2}$. This is a place, with a ladder, where you climb $40/\sqrt{2}$ meters up when advancing 1m forward. The rate of change in all directions is zero if and only if $\nabla f(x,y) = [0, 0]$: if $\nabla f \neq \vec{0}$, we can choose $\vec{v} = \nabla f/|\nabla f|$ and get $D_{\vec{v}} f = |\nabla f| > 0$.

Dom as seen from the Alp Salmenfee in Switzerland. Oliver was finally back there
this summer 2022.

12.15. Assume we know $D_v f(1,1) = 3/\sqrt{5}$ and $D_w f(1,1) = 5/\sqrt{5}$, where $v = [1, 2]/\sqrt{5}$ and $w = [2, 1]/\sqrt{5}$. Find the gradient of f. Note that we do not know anything else about the function f. Solution: Let $\nabla f(1,1) = [a, b]$. We know $a + 2b = 3$ and $2a + b = 5$. This allows us to get $a = 7/3$, $b = 1/3$.
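
Recovering a gradient from two directional derivatives amounts to solving a 2×2 linear system. Here is a minimal sketch using Cramer's rule with exact fractions; `solve_2x2` is a hypothetical helper name, and the common factor $1/\sqrt{5}$ has already been cleared from both equations, as in the text.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        # the two directions are parallel and do not determine the gradient
        raise ValueError("directions are parallel")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y

# a + 2b = 3 and 2a + b = 5 from the example
a, b = solve_2x2(1, 2, 2, 1, 3, 5)
print(a, b)
```

The same helper works for any two non-parallel directions, which is why two directional derivatives suffice to determine the gradient in the plane.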
Homework
This homework is due on Tuesday, 7/12/2022.

Problem 12.1: Find the directional derivative $D_{\vec{v}} f(3,1) = \nabla f(3,1) \cdot \vec{v}$ into the direction $\vec{v} = [3, -4]/5$ for the function $f(x,y) = 2 + x^4 y + y^2 + y$.

Problem 12.2: A surface $x^2 + y^2 - z = 1$ radiates light away. It can be parametrized as $\vec{r}(x,y) = [x, y, x^2 + y^2 - 1]$. Find the parametrization of the wave front $\vec{r}(x,y) + \vec{n}(x,y)$, which is distance 1 from the surface. Here $\vec{n}$ is a unit vector normal to the surface.

Problem 12.3: Assume $f(x,y) = 1 - x^2 + y^2$. Compute the directional derivative $D_{\vec{v}} f(x,y)$ at (0, 0), where $\vec{v} = [\cos(t), \sin(t)]$ is a unit vector. Now compute
$$D_v D_v f(x,y)$$
at (0, 0), for any unit vector. For which values t is this second directional derivative positive?

Problem 12.4: The Kitchen-Rosenberg formula gives the curvature

of a level curve $f(x,y) = c$ as
$$\kappa = \frac{f_{xx} f_y^2 - 2 f_{xy} f_x f_y + f_{yy} f_x^2}{(f_x^2 + f_y^2)^{3/2}}.$$
Use this formula to find the curvature of the ellipse $f(x,y) = x^2 + 2y^2 = 1$ at the point (1, 0).
This formula is useful in computer vision. If you want to derive the formula, you can check that the angle
$$g(x,y) = \arctan(f_y/f_x)$$
of the gradient vector has κ as the directional derivative in the direction $\vec{v} = [-f_y, f_x]/\sqrt{f_x^2 + f_y^2}$ tangent to the curve.

Problem 12.5: Using gradient methods is one of the important paradigms in machine learning. One can find the maximum of a function numerically by moving in the direction of the gradient. This is called the steepest ascent method. You start at a point $(x_0, y_0)$, then move in the direction of the gradient for some time c to be at $(x_1, y_1) = (x_0, y_0) + c\nabla f(x_0, y_0)$. Repeat to get $(x_2, y_2) = (x_1, y_1) + c\nabla f(x_1, y_1)$ etc. It can be a bit difficult if the function has a flat ridge like in the Rosenbrock function
$$f(x,y) = 1 - (1-x)^2 - 100(y - x^2)^2.$$
Plot the contour map of this function on $-0.6 \le x \le 1$, $-0.1 \le y \le 1.1$, then find the directional derivative at (1/5, 0) in the direction $(1, 1)/\sqrt{2}$.
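
The iteration $(x_{k+1}, y_{k+1}) = (x_k, y_k) + c\nabla f(x_k, y_k)$ described in this problem can be sketched as follows. To keep the illustration separate from the homework, the sketch runs on the simpler function $f(x,y) = 1 - x^2 - y^2$, whose maximum is at (0, 0); the start point, the step size `c = 0.1` and the number of steps are arbitrary assumptions. For a function with a flat ridge like the Rosenbrock function, a much smaller step size would typically be needed.

```python
def steepest_ascent(grad, start, c, steps):
    """Repeat (x, y) -> (x, y) + c * grad(x, y) a fixed number of times."""
    x, y = start
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x + c * gx, y + c * gy
    return x, y

def grad_paraboloid(x, y):
    # gradient of f(x, y) = 1 - x^2 - y^2, which has its maximum at (0, 0)
    return (-2 * x, -2 * y)

x_end, y_end = steepest_ascent(grad_paraboloid, (1.0, 2.0), c=0.1, steps=100)
print(x_end, y_end)
```

With this step size each coordinate shrinks by the factor 0.8 per step, so the iterates approach the maximum at the origin.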

Unit 13: Extrema

Lecture
13.1. In applications we are often led to the task of maximizing or minimizing a function f. As in single variable calculus, the strategy is to look for points where the derivative is zero. In the interior of a Euclidean domain this is necessary for a maximum by the Fermat principle. In one dimension, like for $f(x) = 3x^5 - 5x^3$, we can then use the second derivative test to classify the extrema, like the local max at −1 and the local min at 1. ¹

Definition: A point (a, b) in the plane is called a critical point of a function

f (x, y) if ∇f (a, b) = [0, 0].

13.2. The Fermat principle in two dimensions tells:

If ∇f (x, y) is not zero, then (x, y) is not a maximum or minimum.

13.3. Proof. Take the directional derivative in the direction ⃗v = ∇f /|∇f | at ⃗x =

(x, y). Then D⃗v f = ∇f · ⃗v = |∇f | > 0. This means that f (⃗x + ϵ⃗v ) > f (⃗x) and
f (⃗x − ϵ⃗v ) < f (⃗x) for small ϵ and ⃗x is neither a maximum nor a minimum. QED

13.4. Note that in the definition, we do not include points where f or its derivative is not defined. For $f(x,y) = |x| + |y|$ we would have to exclude the points on the x and the y axis and study the function there separately. Unless stated otherwise, we always assume that a function f can be differentiated arbitrarily often. Points where the function has no derivatives are just not considered to be part of the domain and need to be studied separately. For the function $f(x,y) = 1/\log(|xy|)$ for example, we would have to look at the points on the coordinate axes as well as the points on the hyperbolas $|xy| = 1$ separately.

¹ It is customary to abbreviate maximum as max and minimum as min.

13.5. In one dimension, we used the condition f ′ (x) = 0, f ′′ (x) > 0 to get a local
minimum and f ′ (x) = 0, f ′′ (x) < 0 to assure a local max. If f ′ (x) = 0, f ′′ (x) = 0, the
nature of the critical point is undetermined and could be a max like for f (x) = −x4 ,
or a minimum like for f (x) = x4 or a flat inflection point like for f (x) = x3 , where
we have neither a max nor a min.
Definition: If f(x, y) is a function of two variables with a critical point (a, b), the number $D = f_{xx} f_{yy} - f_{xy}^2$ is called the discriminant of the critical point.

13.6. The discriminant can be remembered better if seen as the determinant of the Hessian matrix
$$H = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}$$
combining all the second partial derivatives in one entity, a matrix. ² By default, we always assume that functions are twice continuously differentiable. Here is the second derivative test:
Theorem: Assume (a, b) is a critical point for f (x, y).
If D > 0 and fxx (a, b) > 0 then (a, b) is a local min.
If D > 0 and fxx (a, b) < 0 then (a, b) is a local max.
If D < 0 then (a, b) is a saddle point.

13.7. If $D \neq 0$ at all critical points, the function f is called Morse. The Morse condition is nice because for D = 0, we need higher derivatives or ad-hoc methods to determine the nature of the critical point.
13.8. To find the max or min of f(x, y) on a domain, determine all critical points in the interior of the domain, and compare their values with maxima or minima at the boundary. We will see in the next unit how to get extrema on the boundary.
13.9. Sometimes, we want to find the overall maximum and not only the local ones.
Definition: A point (a, b) in the plane is called a global maximum of f(x, y) if $f(x,y) \le f(a,b)$ for all (x, y). For example, the point (0, 0) is a global maximum of the function $f(x,y) = 1 - x^2 - y^2$. Similarly, we call (a, b) a global minimum, if $f(x,y) \ge f(a,b)$ for all (x, y). ³

Examples
13.10. Find the critical points of $f(x,y) = x^4 + y^4 - 4xy + 2$. The gradient is $\nabla f(x,y) = [4(x^3 - y), 4(y^3 - x)]$. This is [0, 0] at the points (0, 0), (1, 1), (−1, −1). These are the critical points.
13.11. $f(x,y) = \sin(x^2 + y) + y$. The gradient is $\nabla f(x,y) = [2x\cos(x^2 + y), \cos(x^2 + y) + 1]$. For a critical point, we must have x = 0 and $\cos(y) + 1 = 0$ which means $y = \pi + 2k\pi$. The critical points are at . . . (0, −π), (0, π), (0, 3π), . . . . There are infinitely many critical points here.
² Matrix comes from Mater = womb of a mother as matrices are the mother of determinants!
³ We avoid “absolute maximum” as this would suggest to look for the maximum of |f|. Compare for example when looking at absolute convergence of series.
13.12. The graph of $f(x,y) = (x^2 + y^2)e^{-x^2-y^2}$ looks like a volcano. The gradient $\nabla f = [2x - 2x(x^2 + y^2), 2y - 2y(x^2 + y^2)]e^{-x^2-y^2}$ vanishes at (0, 0) and on the circle $x^2 + y^2 = 1$. This function has a continuum of critical points.

13.13. The function $f(x,y) = y^2/2 - g\cos(x)$ is the energy of the pendulum. The variable g is a constant and related to the gravitational strength. We have $\nabla f = [y, g\sin(x)] = [0, 0]$ for
(x, y) = . . . , (−π, 0), (0, 0), (π, 0), (2π, 0), . . . .
These points are equilibrium points, the angles for which the pendulum is at rest.

13.14. The function $f(x,y) = a\log(y) - by + c\log(x) - dx$ is a function which is invariant under the flow of the Volterra-Lotka differential equation $\dot{x} = ax - bxy$, $\dot{y} = -cy + dxy$. The point (c/d, a/b) is a critical point of f. In the context of differential equations, we say that this is an equilibrium point of the system.

13.15. The function $f(x,y) = |x| + |y|$ is smooth on the first quadrant {x > 0, y > 0}. It does not have critical points there. The function has a minimum at (0, 0) but it is not in the domain where f and ∇f are defined. We have to look at the points on the coordinate axes separately. For y = 0, we see that x = 0 is a minimum of |x|. For x = 0 we see that y = 0 is a minimum of |y|. Now (0, 0) is a minimum of f. This minimum was not detected using derivatives.

13.16. The function f (x, y) = x3 /3 − x − (y 3 /3 − y) has a graph which looks like

a “napkin”. It has the gradient $\nabla f(x,y) = [x^2 - 1, -y^2 + 1]$. There are 4 critical points (1, 1), (−1, 1), (1, −1) and (−1, −1). The Hessian matrix which includes all second partial derivatives is
$$H = \begin{bmatrix} 2x & 0 \\ 0 & -2y \end{bmatrix}.$$
For (1, 1) we have D = −4 and so a saddle point,
For (−1, 1) we have D = 4, fxx = −2 and so a local maximum,
For (1, −1) we have D = 4, fxx = 2 and so a local minimum.
For (−1, −1) we have D = −4 and so a saddle point. The function has a local maxi-
mum, a local minimum as well as 2 saddle points.
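
The second derivative test applied in this example is mechanical enough to sketch in code. The function names `classify` and `hessian_entries` are hypothetical, and the Hessian entries of the “napkin” function are entered by hand.

```python
def classify(fxx, fyy, fxy):
    """Second derivative test at a critical point."""
    D = fxx * fyy - fxy**2  # discriminant
    if D > 0 and fxx > 0:
        return "local min"
    if D > 0 and fxx < 0:
        return "local max"
    if D < 0:
        return "saddle"
    return "inconclusive"  # D = 0: the test gives no information

def hessian_entries(x, y):
    # f(x, y) = x^3/3 - x - (y^3/3 - y) has f_xx = 2x, f_yy = -2y, f_xy = 0
    return (2 * x, -2 * y, 0)

for p in [(1, 1), (-1, 1), (1, -1), (-1, -1)]:
    print(p, classify(*hessian_entries(*p)))
```

The case D = 0 is returned as inconclusive, in line with the remark that higher derivatives or ad-hoc methods are then needed.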

13.17. Find the maximum of $f(x,y) = 2x^2 - x^3 - y^2$ on $y \ge -1$. With $\nabla f(x,y) = [4x - 3x^2, -2y]$, the critical points are (4/3, 0) and (0, 0). The Hessian is
$$H(x,y) = \begin{bmatrix} 4 - 6x & 0 \\ 0 & -2 \end{bmatrix}.$$
At (0, 0), the discriminant is −8 so that this is a saddle point. At (4/3, 0), the discriminant is 8 and $H_{11} = -4$, so that (4/3, 0) is a local maximum. We have now also to look at the boundary y = −1 where the function is $g(x) = f(x,-1) = 2x^2 - x^3 - 1$. We have $g'(x) = 0$ at x = 0 and x = 4/3, where 0 is a local minimum and 4/3 is a local maximum on the line y = −1. Comparing f(4/3, 0) = 32/27 with f(4/3, −1) = 5/27 shows that (4/3, 0) gives the largest value among these candidates. Note that $f(x,0) = 2x^2 - x^3$ is unbounded as $x \to -\infty$, so (4/3, 0) is a global maximum only if x is restricted, for example to $x \ge 0$.
13.18. Find the global maxima and minima of $f(x,y) = x^4 + y^4 - 2x^2 - 2y^2$. Solution: the function has no global maximum. This can be seen by restricting the function to the x-axis, where $f(x,0) = x^4 - 2x^2$ is a function without maximum. The function has four global minima however. They are located at the 4 points (±1, ±1). The best way to see this is to note that $f(x,y) = (x^2 - 1)^2 + (y^2 - 1)^2 - 2$ which is minimal when $x^2 = 1$, $y^2 = 1$.

Homework
This homework is due on Tuesday, 7/19/2022.
Problem 13.1: Find all the extrema of the function
$f(x,y) = xy + x^2 y - xy^2$
and determine whether they are maxima, minima or saddle points.

Problem 13.2: Where on the parametrized surface $\vec{r}(u,v) = [1 + u^3, v^2, uv]$ is the temperature $T(x,y,z) = 7 + x + 12y - 24z$ minimal? To find the minimum, minimize the function $f(u,v) = T(\vec{r}(u,v))$. Find all local maxima, local minima or saddle points of f.

Problem 13.3: Do this problem as it is an old classic and because it will appear again in HW 14. Find and classify all the extrema of the function $f(x,y) = e^{-x^2-y^2}(x^2 + 2y^2)$.

Problem 13.4: Find all extrema of the function $f(x,y) = x^3 + y^3 - 3x - 12y + 13e$ and characterize them. Do you find a global maximum or global minimum among them?

Problem 13.5: Graph theorists are fond of the Tutte polynomial f(x, y) of a network. We work with the Tutte polynomial
$f(x,y) = x + 2x^2 + x^3 + y + 2xy + y^2$
of the Kite network. Classify its critical points using the second derivative test.

Remark. The polynomial is useful: xf (1 − x, 0) tells in how

many ways one can color the nodes of the network with x
colors and f (1, 1) tells how many spanning trees there are.
This picture illustrates that the number of spanning trees of
the kite graph is f (1, 1) = 8 as you see the 8 possible trees.


Unit 14: Lagrange

Lecture
14.1. When looking for maxima and minima of a function f(x, y) in the presence of a constraint g(x, y) = 0, a necessary condition is that the gradients of f and g are parallel. The reason is that otherwise, we can move along the constraint curve and increase the value of f. Indeed, the directional derivative of f in the direction tangent to the level curve of g is zero if and only if the tangent vector to that level curve is perpendicular to the gradient of f. This can also include the case ∇g = [0, 0].

Definition: The system of equations ∇f(x, y) = λ∇g(x, y), g(x, y) = 0 for the three unknowns x, y, λ is called the Lagrange equations. The variable λ is a Lagrange multiplier.

Theorem: A maximum or minimum of f(x, y) on the curve g(x, y) = c is either a solution of the Lagrange equations or a critical point of g.

Proof. The condition that ∇f is parallel to ∇g either means ∇f = λ∇g or ∇f = 0

or ∇g = 0. The case ∇f = 0 can be included in the Lagrange equation case with
λ = 0. The case ∇g = 0 however needs to be added as a special case because then
only ∇g = λ∇f works with λ = 0. QED.

14.2. In higher dimensions, the statement is exactly the same: extrema of f (⃗x) under
the constraint g(⃗x) = c are either solutions of the Lagrange equations ∇f = λ∇g, g = c
or points, where ∇g = ⃗0. But we also can have more than one constraint:

Theorem: Extrema of f (x, y, z) under the constraint g(x, y, z) =

c, h(x, y, z) = d are either solutions of the Lagrange equations ∇f =
λ∇g + µ∇h, g = c, h = d or solutions to ∇g = 0, ∇f (x, y, z) = µ∇h, h = d
or solutions to ∇h = 0, ∇f = λ∇g, g = c or solutions to ∇g = ∇h = 0.

14.3. Remarks.
1) The conditions in the Lagrange theorem are equivalent to ∇f ×∇g = ⃗0 in dimensions
2 or 3.
2) With g(x, y) = 0, the Lagrange equations can also be written as ∇F (x, y, λ) = ⃗0,
where F (x, y, λ) = f (x, y) − λg(x, y).
3) The two conditions in the theorem are equivalent to ”∇g = λ∇f or f has a critical
point”.
4) Constrained optimization problems work also in higher dimensions.
5) Can we avoid Lagrange? Sometimes. It is often done in single variable calculus. In order to maximize xy under the constraint 2x + 2y = 4 for example, we solve for y in the second equation and extremize the single variable problem f(x, y(x)). This needs to be done carefully and the boundaries must be considered. To extremize $f(x,y) = y$ on $x^2 + y^2 = 1$ for example we need to maximize $\sqrt{1 - x^2}$. We can differentiate to get the critical points but also have to look at the cases x = 1 and x = −1, where the actual minima and maxima occur. In general also, we can not do the substitution. To extremize $f(x,y) = x^2 + y^2$ with constraint $g(x,y) = x^4 + 3y^2 - 1 = 0$ for example, we solve $y^2 = (1 - x^4)/3$ and minimize $h(x) = f(x, y(x)) = x^2 + (1 - x^4)/3$. $h'(x) = 0$ gives x = 0. To find the maximum (±1, 0), we had to maximize h(x) on [−1, 1], which occurs at ±1.
To extremize $f(x,y) = x^2 + y^2$ under the constraint $g(x,y) = p(x) + p(y) = 1$, where p is a complicated function in x which satisfies $p(0) = 0$, $p'(1) = 2$, the Lagrange equations $2x = \lambda p'(x)$, $2y = \lambda p'(y)$, $p(x) + p(y) = 1$ can be solved with x = 0, y = 1, λ = 1. We can not solve g(x, y) = 1 however for y in an explicit way.
6) How do we determine whether a solution of the Lagrange equations is a maximum
or minimum? Instead of introducing a second derivative test, we just make a list of
critical points and pick the maximum and minimum. A second derivative test can be
designed using second directional derivative in the direction of the tangent.
7) The Lagrange method also works with more constraints. The constraints g = c, h =
d define a curve in space. The gradient of f must now be in the plane spanned by the
gradients of g and h because otherwise, we could move along the curve and increase f.

Examples
14.4. Minimize $f(x,y) = x^2 + 2y^2$ under the constraint $g(x,y) = x + y^2 = 1$. Solution: The Lagrange equations are $2x = \lambda$, $4y = 2\lambda y$. If y = 0 then x = 1. If $y \neq 0$ we can divide the second equation by y and get $2x = \lambda$, $4 = 2\lambda$, again showing x = 1. The point x = 1, y = 0 is the only solution.
14.5. Find the shortest distance from the origin to the curve $x^6 + 3y^2 = 1$. Solution: Minimize the function $f(x,y) = x^2 + y^2$ under the constraint $g(x,y) = x^6 + 3y^2 = 1$. The gradients are $\nabla f = [2x, 2y]$, $\nabla g = [6x^5, 6y]$. The Lagrange equations ∇f = λ∇g lead to the system $2x = 6\lambda x^5$, $2y = 6\lambda y$, $x^6 + 3y^2 - 1 = 0$. We get $\lambda = 1/3$, $x = x^5$, so that either x = 0 or 1 or −1. From the constraint equation g = 1, we obtain $y = \pm\sqrt{(1 - x^6)/3}$. So, we have the solutions $(0, \pm\sqrt{1/3})$ and (1, 0), (−1, 0). To see which is the minimum, just evaluate f on each of the points. $(0, \pm\sqrt{1/3})$ are the minima.
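
The last step of this example, evaluating f on the list of candidates, can be sketched as follows. The candidate points are the ones found above; the check that ∇f and ∇g are parallel uses the expression $f_x g_y - f_y g_x$, which is an illustrative choice and not taken from the notes.

```python
import math

def f(x, y):
    # squared distance to the origin
    return x**2 + y**2

def g(x, y):
    # constraint function, the curve is g = 1
    return x**6 + 3 * y**2

def parallel_defect(x, y):
    # f_x * g_y - f_y * g_x vanishes exactly when grad f and grad g are parallel
    fx, fy = 2 * x, 2 * y
    gx, gy = 6 * x**5, 6 * y
    return fx * gy - fy * gx

s = math.sqrt(1 / 3)
candidates = [(0.0, s), (0.0, -s), (1.0, 0.0), (-1.0, 0.0)]

# pick the candidate with the smallest value of f
best = min(candidates, key=lambda p: f(*p))
shortest_distance = math.sqrt(f(*best))
print(best, shortest_distance)
```

Since f is the squared distance, the shortest distance itself is the square root of the minimal value of f.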

14.6. Which cylindrical soda can of height h and radius r has minimal surface for fixed volume? Solution: The volume is $V(r,h) = h\pi r^2 = 1$. The surface area is $A(r,h) = 2\pi r h + 2\pi r^2$. With $x = h\pi$, $y = r$, you need to optimize $f(x,y) = 2xy + 2\pi y^2$ under the constraint $g(x,y) = xy^2 = 1$. Calculate $\nabla f(x,y) = [2y, 2x + 4\pi y]$, $\nabla g(x,y) = [y^2, 2xy]$. The task is to solve $2y = \lambda y^2$, $2x + 4\pi y = 2\lambda xy$, $xy^2 = 1$. The first equation gives $y\lambda = 2$. Putting that in the second one gives $2x + 4\pi y = 4x$ or $2\pi y = x$. The third equation finally reveals $2\pi y^3 = 1$ or $y = 1/(2\pi)^{1/3}$, $x = 2\pi/(2\pi)^{1/3}$. This means r = 0.54..., h = 2r = 1.08.... Remark: Other factors can influence the shape. For example, the can has to withstand a pressure up to 100 psi. A typical can of ”Coca-Cola classic” with 3.7 volumes of CO2 dissolved has at 75F an internal pressure of 55 psi, where psi stands for pounds per square inch.
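
A quick numerical sanity check of the soda can problem, as a sketch: it takes the radius from the equation $2\pi y^3 = 1$ with $y = r$, recovers the height from $x = 2\pi y$ and $x = h\pi$, and compares the surface area with that of nearby cans of the same volume. The helper `area_for_radius` and the 10 percent perturbation are assumptions made for illustration.

```python
import math

r = (2 * math.pi) ** (-1 / 3)  # radius, from 2*pi*y^3 = 1 with y = r
h = 2 * r                      # height, from x = 2*pi*y and x = h*pi

volume = math.pi * r**2 * h
area = 2 * math.pi * r * h + 2 * math.pi * r**2

def area_for_radius(rr):
    # surface area when the height is chosen so that the volume stays 1
    hh = 1 / (math.pi * rr**2)
    return 2 * math.pi * rr * hh + 2 * math.pi * rr**2

print(r, h, volume, area)
print(area_for_radius(0.9 * r), area_for_radius(1.1 * r))
```

The volume comes out as 1, and both perturbed cans have a larger surface area, consistent with the Lagrange solution being the minimum.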

14.7. On the curve $g(x,y) = x^3 - y^2 = 0$ the function f(x, y) = x obviously has a minimum at (0, 0). The Lagrange equations ∇f = λ∇g have no solutions. This is a case where the minimum is a solution to ∇g(x, y) = 0.

14.8. Find the extrema of f(x, y, z) = z on the sphere $g(x,y,z) = x^2 + y^2 + z^2 = 1$. Solution: compute the gradients $\nabla f(x,y,z) = [0, 0, 1]$, $\nabla g(x,y,z) = [2x, 2y, 2z]$ and solve $[0, 0, 1] = \nabla f = \lambda\nabla g = [2\lambda x, 2\lambda y, 2\lambda z]$, $x^2 + y^2 + z^2 = 1$. The case λ = 0 is excluded by the third equation $1 = 2\lambda z$ so that the first two equations $2\lambda x = 0$, $2\lambda y = 0$ give x = 0, y = 0. The fourth equation gives z = 1 or z = −1. The minimum is the south pole (0, 0, −1), the maximum the north pole (0, 0, 1).

14.9. A die shows k eyes with probability $p_k$ with k in Ω = {1, 2, 3, 4, 5, 6}. A probability distribution is a non-negative function p on Ω which sums up to 1. It can be written as a vector $(p_1, p_2, p_3, p_4, p_5, p_6)$ with $p_1 + p_2 + p_3 + p_4 + p_5 + p_6 = 1$. The entropy of the probability vector $\vec{p}$ is defined as $f(\vec{p}) = -\sum_{i=1}^{6} p_i\log(p_i) = -p_1\log(p_1) - p_2\log(p_2) - \dots - p_6\log(p_6)$. Find the distribution p which maximizes entropy under the constraint $g(\vec{p}) = p_1 + p_2 + p_3 + p_4 + p_5 + p_6 = 1$. Solution: $\nabla f = [-1 - \log(p_1), \dots, -1 - \log(p_6)]$, $\nabla g = [1, \dots, 1]$. The Lagrange equations are $-1 - \log(p_i) = \lambda$, $p_1 + \dots + p_6 = 1$, from which we get $p_i = e^{-(\lambda+1)}$. The last equation $1 = \sum_i \exp(-(\lambda+1)) = 6\exp(-(\lambda+1))$ fixes $\lambda = -\log(1/6) - 1$ so that $p_i = 1/6$. The distribution where each event has the same probability is the distribution of maximal entropy. Maximal entropy means least information content. An unfair die allows a cheating gambler or casino to gain profit. Cheating through asymmetric weight distributions can be avoided by making the dice transparent.
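
The conclusion that the uniform distribution maximizes entropy can be illustrated numerically. This sketch compares the entropy of the fair die with that of one hypothetical loaded die; the loaded probabilities are made up for the comparison and are not from the notes.

```python
import math

def entropy(p):
    # -sum p_i log(p_i), using the natural logarithm and skipping zero entries
    return -sum(q * math.log(q) for q in p if q > 0)

uniform = [1 / 6] * 6
loaded = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]  # hypothetical unfair die

print(entropy(uniform), math.log(6))
print(entropy(loaded))
```

The entropy of the fair die equals log(6), and the loaded die has a smaller value, as the Lagrange computation predicts for any other distribution.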

14.10. The probability that a chemical compound is in the state k is $p_k$. The energy of the state k is $E_k$. Nature tries to minimize the free energy $f(p_1, \dots, p_n) = -\sum_i [p_i\log(p_i) + E_i p_i]$ if the energies $E_i$ are fixed. The probability distribution $p_i$ minimizing this is called the Gibbs distribution. The constraint of course is $\sum_k p_k = 1$. We have $\nabla f = [-1 - \log(p_1) - E_1, \dots, -1 - \log(p_n) - E_n]$, $\nabla g = [1, \dots, 1]$. The Lagrange equations are $\log(p_i) = -1 - \lambda - E_i$, or $p_i = \exp(-E_i)C$, where $C = \exp(-1 - \lambda)$. The constraint $g(p) = p_1 + \dots + p_n = 1$ gives $C(\sum_i \exp(-E_i)) = 1$ so that $C = 1/(\sum_i e^{-E_i})$. The Gibbs solution is $p_k = \exp(-E_k)/\sum_i \exp(-E_i)$.
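
The Gibbs solution $p_k = \exp(-E_k)/\sum_i \exp(-E_i)$ is easy to evaluate. The sketch below uses three made-up energy levels as an assumption and checks the Lagrange condition that $\log(p_i) + E_i$ is the same constant for all states.

```python
import math

def gibbs(energies):
    """Return the Gibbs distribution p_k = exp(-E_k) / sum_i exp(-E_i)."""
    weights = [math.exp(-E) for E in energies]
    total = sum(weights)  # the normalization 1/C from the text
    return [w / total for w in weights]

energies = [0.0, 1.0, 2.0]  # hypothetical energy levels
p = gibbs(energies)
print(p)
# log(p_i) + E_i should be the same constant -1 - lambda for every state
print([math.log(pi) + E for pi, E in zip(p, energies)])
```

States with lower energy get higher probability, and the printed constants agree up to rounding.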
Homework
This homework is due on Tuesday, 7/19/2022.
Problem 14.1: A solid bullet made of a half sphere and a cylinder has the volume $V = 2\pi r^3/3 + \pi r^2 h$ and surface area $A = 2\pi r^2 + 2\pi r h + \pi r^2$. Doctor Manhattan designs a bullet with fixed volume and minimal area. With $g = 3V/\pi = 1$ and $f = A/\pi$ he therefore minimizes $f(h,r) = 3r^2 + 2rh$ under the constraint $g(h,r) = 2r^3 + 3r^2 h = 1$. Use the Lagrange method to find a local minimum of f under the constraint g = 1.

Problem 14.2: Find the cylindrical basket which is open on the top and has the largest volume for fixed area 3π. If x is the radius and y is the height, we have to extremize $f(x,y) = \pi x^2 y$ under the constraint $g(x,y) = 2\pi xy + \pi x^2 = 3\pi$. Use the method of Lagrange multipliers.

Problem 14.3: Find the extrema of the same function
$$f(x,y) = e^{-x^2-y^2}(x^2 + 2y^2)$$
you have seen in HW 13, but now on the entire disc $\{x^2 + y^2 \le 4\}$ of radius 2. Besides the already found extrema inside the disc, now find also the extrema on the boundary.

Problem 14.4: Motivated by the Disney movie “Tangled”, we want

to build a hot air balloon with a cuboid mesh of dimension x, y, z which
together with the top and bottom fortifications uses wires of total length
g(x, y, z) = 6x + 6y + 4z = 32. Find the balloon with maximal volume
f (x, y, z) = xyz.

Problem 14.5: Which pyramid of height h over a square [−a, a] × [−a, a] with surface area $4a\sqrt{h^2 + a^2} + 4a^2 = 4$ has maximal volume $V(h,a) = 4ha^2/3$? By using new variables (x, y) and multiplying V with a constant, we get to the equivalent problem to maximize $f(x,y) = yx^2$ over the constraint $g(x,y) = x\sqrt{y^2 + x^2} + x^2 = 1$. Use the latter variables.


Unit 15: Double Integrals

Lecture
15.1. If f(x) is a continuous function of one variable, then the Riemann integral $\int_a^b f(x)\,dx$ is defined as the limit of the Riemann sums $S_n f = \frac{1}{n}\sum_{k/n \in [a,b]} f(k/n)$ for $n \to \infty$. The derivative of f is the limit of difference quotients $D_n f(x) = n[f(x + 1/n) - f(x)]$ as $n \to \infty$. The integral $\int_a^b f(x)\,dx$ is the signed area under the graph of f and above the x-axis, where “signed” indicates that area below the x-axis has negative sign. The function $F(x) = \int_0^x f(y)\,dy$ is called an anti-derivative of f. It is determined up to a constant. The fundamental theorem of calculus states
$$F'(x) = f(x), \qquad \int_0^x f(y)\,dy = F(x) - F(0).$$
It allows us to compute integrals by inverting differentiation. Differentiation rules thus become integration rules: the product rule leads to integration by parts; the chain rule becomes substitution.
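
The Riemann sum $S_n f$ from this paragraph can be computed directly. A minimal sketch, assuming the test function $f(x) = x^2$ on [0, 1] and $n = 1000$ (both chosen only for illustration), compares the sum with the value 1/3 obtained from the anti-derivative $x^3/3$.

```python
import math

def riemann_sum(f, a, b, n):
    """S_n f = (1/n) * sum of f(k/n) over all integers k with k/n in [a, b]."""
    k_min = math.ceil(a * n)
    k_max = math.floor(b * n)
    return sum(f(k / n) for k in range(k_min, k_max + 1)) / n

approx = riemann_sum(lambda x: x**2, 0.0, 1.0, 1000)
print(approx)  # close to 1/3, the value given by the fundamental theorem
```

Increasing n brings the sum closer to the integral, which is the limit in the definition.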

Definition: If f(x, y) is continuous on a region R, the integral $\iint_R f(x,y)\,dxdy$ is defined as the limit of Riemann sums
$$\frac{1}{n^2}\sum_{(\frac{i}{n},\frac{j}{n}) \in R} f\left(\frac{i}{n},\frac{j}{n}\right)$$
when $n \to \infty$. We write also $\iint_R f(x,y)\,dA$, where dA = dxdy is a formal notation meaning “an area element”.

15.2. Fubini’s theorem allows us to switch the order of integration over a rectangle
if the function f is continuous:

Theorem: $\int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy$.

Proof. For every n, there is the “quantum Fubini identity”
$$ \sum_{\frac{i}{n} \in [a,b]} \sum_{\frac{j}{n} \in [c,d]} f\left(\frac{i}{n}, \frac{j}{n}\right) = \sum_{\frac{j}{n} \in [c,d]} \sum_{\frac{i}{n} \in [a,b]} f\left(\frac{i}{n}, \frac{j}{n}\right) $$
which holds for all functions. Now divide both sides by n² and take the limit n → ∞.
This is possible for continuous functions. Fubini’s theorem only holds for rectangles.
We extend now the collection of possible regions to integrate over:

Definition: A bottom to top region is of the form
$$ R = \{(x, y) \mid a \le x \le b,\ c(x) \le y \le d(x)\} . $$
An integral over a bottom to top region is called a bottom to top integral
$$ \iint_R f\,dA = \int_a^b \int_{c(x)}^{d(x)} f(x, y)\,dy\,dx . $$
A left to right region is of the form
$$ R = \{(x, y) \mid c \le y \le d,\ a(y) \le x \le b(y)\} . $$
An integral over such a region is called a left to right integral
$$ \iint_R f\,dA = \int_c^d \int_{a(y)}^{b(y)} f(x, y)\,dx\,dy . $$

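[Figure: a bottom to top region between the graphs c(x) and d(x), and a left to right region between the graphs a(y) and b(y).]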

15.3. Just as an integral in one dimension can be seen as a signed area, one
can interpret $\iint_R f(x, y)\,dydx$ as the signed volume of the solid below the
graph of f and above R in the xy-plane. As in 1D integration, the volume of the
solid below the xy-plane is counted negatively.

Examples
15.4. If we integrate f (x, y) = xy over the unit square we can sum up the Riemann
sum for fixed y = j/n and get y/2. Now perform the integral over y to get 1/4. This
example shows how to reduce double integrals to single variable integrals.

15.5. If f (x, y) = 1, then the integral is the area of the region R. The integral is the
limit L(n)/n2 , where L(n) is the number of lattice points (i/n, j/n) contained in R.
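The Riemann sum definition can be tried out numerically. The following Python sketch is only an illustration: the helper name `riemann_sum_2d`, its bounding box argument and the choice n = 200 are assumptions made here, not part of the notes. It adds up f(i/n, j/n)/n² over the lattice points in R, once for f(x, y) = xy on the unit square and once for f = 1 on the unit disc, where the sum is the lattice point count L(n)/n².

```python
import math

def riemann_sum_2d(f, contains, n, box):
    """Sum f(i/n, j/n) / n^2 over all lattice points (i/n, j/n) inside the region.

    contains(x, y) decides membership in R; box = (x_min, x_max, y_min, y_max)
    only limits which lattice points are tested.
    """
    x_min, x_max, y_min, y_max = box
    total = 0.0
    for i in range(math.floor(x_min * n), math.ceil(x_max * n) + 1):
        for j in range(math.floor(y_min * n), math.ceil(y_max * n) + 1):
            x, y = i / n, j / n
            if contains(x, y):
                total += f(x, y)
    return total / n**2

def in_unit_square(x, y):
    return 0 <= x <= 1 and 0 <= y <= 1

def in_unit_disc(x, y):
    return x * x + y * y <= 1

# f(x, y) = xy over the unit square: the sums approach 1/4.
square_value = riemann_sum_2d(lambda x, y: x * y, in_unit_square, 200, (0, 1, 0, 1))

# f = 1 over the unit disc: L(n)/n^2 approaches the area pi.
disc_area = riemann_sum_2d(lambda x, y: 1.0, in_unit_disc, 200, (-1, 1, -1, 1))

print(square_value, disc_area)
```

Increasing n brings both numbers closer to 1/4 and π, at the cost of n² function evaluations.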
15.6. The value $\iint_R f(x, y)\,dA \,/ \iint_R 1\,dA$ is the average value of f.

15.7. Integrate $f(x, y) = x^2$ over the region bounded above by $\sin(x^3)$ and bounded
below by the graph of $-\sin(x^3)$ for $0 \le x \le \pi^{1/3}$. The value of this integral has a
physical meaning. It is called moment of inertia.
$$ \int_0^{\pi^{1/3}} \int_{-\sin(x^3)}^{\sin(x^3)} x^2\,dy\,dx = 2\int_0^{\pi^{1/3}} \sin(x^3)\,x^2\,dx $$
We have now an integral which we can solve by substitution:
$-\frac{2}{3}\cos(x^3)\Big|_0^{\pi^{1/3}} = \frac{4}{3}$.
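As a cross-check of the value 4/3, a bottom to top integral can be approximated with a midpoint rule. This is a minimal sketch; the function name `bottom_to_top` and the resolution n = 400 are choices made for this illustration only.

```python
import math

def bottom_to_top(f, a, b, c, d, n=400):
    """Midpoint-rule approximation of int_a^b int_{c(x)}^{d(x)} f(x, y) dy dx."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        y_low, y_high = c(x), d(x)
        dy = (y_high - y_low) / n
        # inner integral over y for this fixed x
        inner = sum(f(x, y_low + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total

moment = bottom_to_top(
    lambda x, y: x**2,
    0.0, math.pi ** (1 / 3),
    lambda x: -math.sin(x**3),   # lower boundary c(x)
    lambda x: math.sin(x**3),    # upper boundary d(x)
)
print(moment)  # close to 4/3
```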

15.8. Integrate $f(x, y) = y^2$ over the region bounded by the x-axis, the lines y = x + 1
and y = 1 − x. The problem is best solved with a “left to right” integral. As you can
see from the picture, we would have to compute two different integrals as a “bottom
to top” integral. To do so, we have to write the bounds as functions of y: they are
x = y − 1 and x = 1 − y
$$ \int_0^1 \int_{y-1}^{1-y} y^2\,dx\,dy = 1/6 . $$

15.9. Let R be the triangle 1 ≥ x ≥ 0, 0 ≤ y ≤ x. What is
$$ \iint_R e^{-x^2}\,dxdy \ ? $$
The left to right integral $\int_0^1 [\int_y^1 e^{-x^2}\,dx]\,dy$ can not be solved because
$e^{-x^2}$ has no anti-derivative in terms of elementary functions. The bottom to top
integral $\int_0^1 [\int_0^x e^{-x^2}\,dy]\,dx$ however can be solved:
$$ = \int_0^1 x e^{-x^2}\,dx = -\frac{e^{-x^2}}{2}\Big|_0^1 = \frac{1 - e^{-1}}{2} = 0.316... . $$
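Both orders of integration describe the same number, even if only one of them can be done by hand. The sketch below, with an assumed helper `midpoint` and an assumed resolution, evaluates the two iterated integrals numerically and compares them with (1 − e⁻¹)/2.

```python
import math

def midpoint(g, a, b, n=300):
    """Midpoint-rule approximation of the single integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def f(x, y):
    return math.exp(-x * x)

# Bottom to top: x runs from 0 to 1 and, for each x, y runs from 0 to x.
bottom_to_top = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, x), 0.0, 1.0)

# Left to right: y runs from 0 to 1 and, for each y, x runs from y to 1.
left_to_right = midpoint(lambda y: midpoint(lambda x: f(x, y), y, 1.0), 0.0, 1.0)

exact = (1 - math.exp(-1)) / 2
print(bottom_to_top, left_to_right, exact)
```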

15.10. The area of a disc of radius R is
$\int_{-R}^{R} \int_{-\sqrt{R^2-x^2}}^{\sqrt{R^2-x^2}} 1\,dy\,dx = \int_{-R}^{R} 2\sqrt{R^2 - x^2}\,dx$.
Substitute x = R sin(u), dx = R cos(u) du, to get
$\int_{-\pi/2}^{\pi/2} 2\sqrt{R^2 - R^2\sin^2(u)}\,R\cos(u)\,du = \int_{-\pi/2}^{\pi/2} 2R^2\cos^2(u)\,du = R^2\pi$.
In the last identity, we have used the double angle identity $2\cos^2(x) = 1 + \cos(2x)$.

Homework
This homework is due on Tuesday, 7/19/2022.
Problem 15.1: Find the area of the region
$$ R = \{(x, y) \mid 0 \le x \le 4\pi,\ \sin(x) - 1 \le y \le \cos(x) + 2\} $$
and use it to compute the average value $\iint_R f(x, y)\,dxdy / \mathrm{area}(R)$ of
f(x, y) = y over that region.

Problem 15.2: a) (4 points) Find the iterated integral
$$ \int_0^1 \int_0^2 \frac{6xy}{\sqrt{x^2 + (y^2/2)}}\,dy\,dx . $$
b) (4 points) Now compute
$$ \int_0^1 \int_0^2 \frac{6xy}{\sqrt{x^2 + y^2/2}}\,dx\,dy . $$
c) (2 points) Wouldn’t Fubini assure that a) and b) are the same? What
change would be needed in b) to make the results agree?

Problem 15.3: Find the volume of the solid lying under the paraboloid
z = 3x2 + 3y 2 and above the rectangle R = [−2, 2] × [−2, 4] = {(x, y) | −
3 ≤ x ≤ 3, −2 ≤ y ≤ 4 }.

Problem 15.4: a) First evaluate the iterated integral
$\int_0^1 \int_x^{2-x} 6(x^2 - y)\,dy\,dx$.
Make sure to sketch the corresponding bottom to top region.
b) Rewrite the integral as a left to right integral and compute the integral
again.

Problem 15.5: There is a great way to identify zombies: throw two
difficult integrals at them and see whether they can solve them. Prove
that you are not a zombie!
a) (5 points) Integrate
$$ \int_0^1 \int_0^{\sqrt{1-y^2}} 44(x^2 + y^2)^{10}\,dx\,dy . $$
You might want to “time travel” one lecture forward, where polar coordinates
are known to solve this problem.
b) (5 points) Find the integral
$$ \int_0^1 \int_{\sqrt{y}}^{y^2} \frac{3x^7}{\sqrt{x - x^2}}\,dx\,dy . $$

Unit 16: Surface Integration

Lecture
16.1. For certain regions, it is better to use a different coordinate system. A
re-parametrization (x, y) = ⃗r(u, v) often helps. This then also works in higher dimensions,
where surfaces are parametrized as [x, y, z] = ⃗r(u, v). We first remain in R², where polar
coordinates (x, y) = (r cos(θ), r sin(θ)) are an important example.

Definition: A polar region is a planar region bound by a simple closed

curve. It is defined in polar coordinates by a curve (t, r(t)) where t = θ is
the angle. In Cartesian coordinates, the parametrization of the boundary of a
polar region is ⃗r(t) = [r(t) cos(t), r(t) sin(t)], a polar graph like the spiral with
r(t) = t.

Theorem: To integrate in polar coordinates, we evaluate the integral
$$ \iint_R f(x, y)\,dxdy = \iint_R f(r\cos(\theta), r\sin(\theta))\,r\,drd\theta . $$

16.2. Why do we have to include the factor r, when we transition to polar coordinates?
The reason is that a small rectangle R with area dA = dθdr in the (r, θ) plane is mapped
by T : (r, θ) ↦ (r cos(θ), r sin(θ)) to a sector segment S in the (x, y) plane. It has the
area r dθdr. We will also see that the parametrization ⃗r(θ, r) = [r cos(θ), r sin(θ), 0]
gives $|\vec{r}_\theta \times \vec{r}_r| = r$.

16.3. We can now integrate over basic regions in the (θ, r)-plane. Examples are flow-
ers: {(θ, r) |0 ≤ r ≤ f (θ)}, where f (θ) is a non-negative periodic function of θ.

[Figure: left, a polar region shown in polar coordinates; right, the same region in the xy coordinate system.]

Theorem: A surface ⃗r(u, v) parametrized on a parameter domain R has
the surface area
$$ \iint_R |\vec{r}_u(u, v) \times \vec{r}_v(u, v)|\,dudv . $$

16.4. Proof. The vector ⃗ru is tangent to the grid curve u ↦ ⃗r(u, v) and ⃗rv is tangent
to v ↦ ⃗r(u, v). The two vectors span a parallelogram with area |⃗ru × ⃗rv|. A small
rectangle [u, u + du] × [v, v + dv] is mapped by ⃗r to a parallelogram spanned by ⃗ru du
and ⃗rv dv which has the area |⃗ru(u, v) × ⃗rv(u, v)| dudv.
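The parallelogram argument translates directly into a numerical recipe: add up |⃗ru × ⃗rv| du dv over a grid in the parameter domain. The sketch below is an illustration under assumptions made here (the names `surface_area` and `sphere`, central differences for the partial derivatives, a radius L = 2); it is applied to the standard parametrization of a sphere, whose area should come out as 4πL².

```python
import math

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def surface_area(r, u_range, v_range, n=200, h=1e-5):
    """Midpoint-rule approximation of the integral of |r_u x r_v| du dv.

    The partial derivatives r_u and r_v are estimated by central differences.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            r_u = [(p - q) / (2 * h) for p, q in zip(r(u + h, v), r(u - h, v))]
            r_v = [(p - q) / (2 * h) for p, q in zip(r(u, v + h), r(u, v - h))]
            normal = cross(r_u, r_v)
            total += math.sqrt(sum(c * c for c in normal)) * du * dv
    return total

L = 2.0  # radius chosen for this illustration

def sphere(u, v):
    return (L * math.cos(u) * math.sin(v),
            L * math.sin(u) * math.sin(v),
            L * math.cos(v))

area = surface_area(sphere, (0.0, 2 * math.pi), (0.0, math.pi))
print(area, 4 * math.pi * L**2)
```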

Examples
16.5. The polar graph defined by r(θ) = | cos(3θ)| belongs to the class of roses r(t) =
| cos(nt)|. Regions enclosed by this graph are also called rhodonea. Note that in
the literature you often see also situations where r(θ) can become negative. To avoid
confusion, we never allow that: r ≥ 0 is a radius and a radius is non-negative.

16.6. The polar curve r(θ) = 1 + sin(θ) is called a cardioid. It looks like a heart and
belongs to the class of limacon curves r(θ) = 1 + b sin(θ).

16.7. The polar curve $r(\theta) = \sqrt{|\cos(2\theta)|}$ is called a lemniscate.

[Figure: plots of polar curves.]

16.8. Integrate
$$ f(x, y) = x^2 + y^2 + xy $$
over the unit disc. We have $f(x, y) = f(r\cos(\theta), r\sin(\theta)) = r^2 + r^2\cos(\theta)\sin(\theta)$ so that
$\iint_R f(x, y)\,dxdy = \int_0^1 \int_0^{2\pi} (r^2 + r^2\cos(\theta)\sin(\theta))\,r\,d\theta dr = 2\pi/4$.

16.9. We have earlier computed the area of the disc $\{x^2 + y^2 \le 1\}$ using substitution. It
is more elegant to do this in polar coordinates:
$$ \int_0^{2\pi} \int_0^1 r\,drd\theta = 2\pi r^2/2\,\Big|_0^1 = \pi . $$

16.10. Integrate the function f(x, y) = 1 over the region {(θ, r) | r ≤ | cos(3θ)| }.
$$ \iint_R 1\,dxdy = \int_0^{2\pi} \int_0^{|\cos(3\theta)|} r\,dr\,d\theta = \int_0^{2\pi} \frac{\cos(3\theta)^2}{2}\,d\theta = \pi/2 . $$
16.11. Integrate $f(x, y) = y\sqrt{x^2 + y^2}$ over the region $R = \{(x, y) \mid 1 < x^2 + y^2 < 4,\ y > 0\}$.
$$ \int_1^2 \int_0^\pi r\sin(\theta)\,r\,r\,d\theta dr = \int_1^2 \int_0^\pi r^3\sin(\theta)\,d\theta dr = \frac{(2^4 - 1^4)}{4} \int_0^\pi \sin(\theta)\,d\theta = 15/2 $$
For integration problems where the region is part of an annular region, or if you see a
function with terms $x^2 + y^2$, always first try to use polar coordinates x = r cos(θ), y =
r sin(θ).
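The examples 16.8, 16.10 and 16.11 can be checked with one small routine that integrates in polar coordinates and includes the factor r. The routine name `polar_integral`, its argument order and the grid size are assumptions of this sketch, not notation from the notes.

```python
import math

def polar_integral(f, r_inner, r_outer, theta_0, theta_1, n=300):
    """Midpoint rule for the integral of f(x, y) over the polar region
    theta_0 <= theta <= theta_1, r_inner(theta) <= r <= r_outer(theta)."""
    d_theta = (theta_1 - theta_0) / n
    total = 0.0
    for i in range(n):
        theta = theta_0 + (i + 0.5) * d_theta
        r_0, r_1 = r_inner(theta), r_outer(theta)
        d_r = (r_1 - r_0) / n
        for j in range(n):
            r = r_0 + (j + 0.5) * d_r
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += f(x, y) * r * d_r * d_theta  # the factor r is the area distortion
    return total

def zero(theta):
    return 0.0

# 16.8: x^2 + y^2 + xy over the unit disc
disc = polar_integral(lambda x, y: x * x + y * y + x * y,
                      zero, lambda t: 1.0, 0.0, 2 * math.pi)

# 16.10: area of the rose r <= |cos(3 theta)|
rose = polar_integral(lambda x, y: 1.0,
                      zero, lambda t: abs(math.cos(3 * t)), 0.0, 2 * math.pi)

# 16.11: y * sqrt(x^2 + y^2) over the upper half of the annulus 1 < r < 2
half_annulus = polar_integral(lambda x, y: y * math.hypot(x, y),
                              lambda t: 1.0, lambda t: 2.0, 0.0, math.pi)

print(disc, rose, half_annulus)
```

The three printed numbers should be close to π/2, π/2 and 15/2.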

16.12. The Belgian biologist Johan Gielis defined in 1997 the family of curves given
in polar coordinates as
$$ r(\phi) = \left( \frac{|\cos(\frac{m\phi}{4})|^{n_1}}{a} + \frac{|\sin(\frac{m\phi}{4})|^{n_2}}{b} \right)^{-1/n_3} $$
This super-curve can produce a variety of shapes like circles, squares, triangles or
stars. It can also be used to produce “super-shapes”. The super-curve generalizes
the super-ellipse which had been discussed in 1818 by Lamé and helps to describe
forms in biology. 1

1 Gielis, J. “A generic geometric transformation that unifies a wide range of natural and abstract
shapes”. American Journal of Botany, 90, 333-338, (2003).

16.13. The parametrized surface ⃗r(u, v) = [2u, 3v, 0] is part of the xy-plane. The
parameter region R just gets stretched by a factor 2 in the x coordinate and by a
factor 3 in the y coordinate. ⃗ru × ⃗rv = [0, 0, 6] and we see that the area of S = ⃗r(R) is
6 times the area of R.
16.14. The map ⃗r(u, v) = [L cos(u) sin(v), L sin(u) sin(v), L cos(v)] maps the rectangle
G = {0 ≤ u ≤ 2π, 0 ≤ v ≤ π} onto a sphere of radius L. We compute
$\vec{r}_u \times \vec{r}_v = -L\sin(v)\,\vec{r}(u, v)$. So, $|\vec{r}_u \times \vec{r}_v| = L^2|\sin(v)|$ and
$\iint_R 1\,dS = \int_0^{2\pi} \int_0^\pi L^2\sin(v)\,dvdu = 4\pi L^2$.
This is a formula which Archimedes had already derived by seeing it as the
surface area of an open cylinder of height 2L and radius L.
16.15. For graphs (u, v) ↦ [u, v, f(u, v)], we have $\vec{r}_u = [1, 0, f_u(u, v)]$ and
$\vec{r}_v = [0, 1, f_v(u, v)]$. The cross product $\vec{r}_u \times \vec{r}_v = [-f_u, -f_v, 1]$ has the
length $\sqrt{1 + f_u^2 + f_v^2}$. The area of the surface above a region R is
$\iint_R \sqrt{1 + f_u^2 + f_v^2}\,dudv$.
16.16. Let’s take a surface of revolution ⃗r(u, v) = [v, f(v) cos(u), f(v) sin(u)] on R =
[0, 2π]×[a, b]. We have $\vec{r}_u = (0, -f(v)\sin(u), f(v)\cos(u))$, $\vec{r}_v = (1, f'(v)\cos(u), f'(v)\sin(u))$
and $\vec{r}_u \times \vec{r}_v = (-f(v)f'(v), f(v)\cos(u), f(v)\sin(u)) = f(v)(-f'(v), \cos(u), \sin(u))$. The
surface area is $\iint |\vec{r}_u \times \vec{r}_v|\,dudv = 2\pi \int_a^b |f(v)| \sqrt{1 + f'(v)^2}\,dv$.
Homework
This homework is due on Tuesday, 7/19/2022.
Problem 16.1: Find $\iint_R (x^2 + y^2)^{150}\,dA$, where R is the part of the
unit disc $\{x^2 + y^2 \le 1\}$ for which y > x.

Problem 16.2: The cardioid was first described in 1741. Its
ultimate fame came because the main body of the Mandelbrot set is
a cardioid. Find its area. Its boundary is
$$ \vec{r}(t) = \left[ \frac{\cos(t)}{2} - \frac{\cos(2t)}{4},\ \frac{\sin(t)}{2} - \frac{\sin(2t)}{4} \right], \quad 0 \le t \le 2\pi . $$

Problem 16.3: Find the area of the region bounded by three curves:
first by the polar curve r(θ) = 2θ with θ ∈ [0, 2π], second by the polar
curve r(θ) = 3θ with θ ∈ [0, 2π] and third by the positive x-axis?

Problem 16.4: Find the average $\iint_R f\,dxdy \,/ \iint_R 1\,dxdy$ for $f(x, y) = 34(x^2 + y^2)$ on
the annular region R : 1 ≤ |(x, y)| ≤ 2.

Problem 16.5: Find the surface area of the part of the paraboloid
x = y 2 + z 2 which is inside the cylinder y 2 + z 2 ≤ 16.

Unit 17: Triple integrals

Lecture
17.1. Integrating over 3-dimensional solids is done in the same way as in two
dimensions. Three dimensional regions are referred to as solids.

Definition: If f(x, y, z) is continuous and E is a bounded solid in R³, then
$\iiint_E f(x, y, z)\,dxdydz$ is defined as the n → ∞ limit of the Riemann sum
$$ \frac{1}{n^3} \sum_{(\frac{i}{n}, \frac{j}{n}, \frac{k}{n}) \in E} f\left(\frac{i}{n}, \frac{j}{n}, \frac{k}{n}\right) . $$

Triple integrals can be evaluated by iterated single integrals. Here is an example:

17.2. If E is the box {x ∈ [0, 1], y ∈ [0, 1], z ∈ [0, 1]} and $f(x, y, z) = 24x^2y^3z$, the
triple integral is
$$ \int_0^1 \int_0^1 \int_0^1 24x^2y^3z\,dz\,dy\,dx . $$
To evaluate the integral, start from the inside: $\int_0^1 24x^2y^3z\,dz = 12x^2y^3$. Then
integrate the middle layer, $\int_0^1 12x^2y^3\,dy = 3x^2$, and finally handle the
outermost layer: $\int_0^1 3x^2\,dx = 1$.
For the inner integral, x = x0 and y = y0 are fixed. The middle integral now computes
the contribution over a slice x = x0 intersected with E. The outer integral sums up all
these slice contributions.
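The layer-by-layer evaluation can be mirrored in code. In the following sketch the names `midpoint`, `inner` and `middle` are chosen here for illustration; each layer is a single-variable midpoint rule with the remaining variables held fixed.

```python
def midpoint(g, a, b, n=40):
    """Midpoint-rule approximation of the single integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def f(x, y, z):
    return 24 * x**2 * y**3 * z

def inner(x, y):
    # x and y are fixed; integrate over z (approximates 12 x^2 y^3)
    return midpoint(lambda z: f(x, y, z), 0.0, 1.0)

def middle(x):
    # x is fixed; integrate the inner result over y (approximates 3 x^2)
    return midpoint(lambda y: inner(x, y), 0.0, 1.0)

# Outer layer: integrate over x.
value = midpoint(middle, 0.0, 1.0)
print(inner(0.5, 0.5), value)  # value is close to 1
```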

17.3. There are two reductions possible to compute triple integrals:

The burger method slices the solid along a line and computes
$\int_a^b \iint_{R(z)} f(x, y, z)\,dA\,dz$, where the inner double integral
$g(z) = \iint_{R(z)} f(x, y, z)\,dA$ gives the value obtained when integrating over a slice of
cheese, meat or tomato. The fries method eats up fries going from g(x, y)
to h(x, y) over a region R. We have
$\iint_R [\int_{g(x,y)}^{h(x,y)} f(x, y, z)\,dz]\,dA$.

17.4. A special case is the signed volume
$$ \iint_R \int_0^{f(x,y)} 1\,dz\,dxdy $$
below the graph of a function f(x, y) and above a region R, considered part of the
xy-plane. It is the integral $\iint_R f(x, y)\,dA$. The triple integral above also has more
flexibility: we can replace 1 with a function f(x, y, z). If interpreted as a mass density,
then the integral is the mass of the solid.

17.5. The problem of computing volumes was already worked on by Archimedes
(287-212 BC). His method of exhaustion was a precursor of Riemann sums and
allowed him to find areas, volumes and surface areas in many cases without calculus.
One idea is comparison. Already the Archimedes principle relating volume to
the amount of displaced water is such an idea. The displacement method is a
comparison technique: the area of a sphere is the area of the cylinder enclosing it.
The volume of a sphere is the volume of the complement of a cone in that cylinder.
Cavalieri (1598-1647) would build on Archimedes’ ideas and determine area and
volume using tricks now called the Cavalieri principle. An example already due to
Archimedes is the computation of the volume of the half sphere of radius R: cut away a
cone of height and radius R from a cylinder of height R and radius R. At height z,
this body has a cross section with area R²π − r²π, where r = z is the radius of the
cone at that height. If we cut the half sphere at height z, we obtain a disc of area
(R² − z²)π. Because these areas are the same, the volume of the half-sphere is the same
as the cylinder minus the cone: πR³ − πR³/3 = 2πR³/3, and the volume of the sphere
is 4πR³/3. Newton (1643-1727) and Leibniz (1646-1716) developed calculus
independently. It provided a new tool which made it possible
to compute integrals through “anti-derivation”. Suddenly, it became possible to find
integrals using analytic tools. We can do this also in higher dimensions.

Examples
17.6. Find the volume of the unit sphere. Solution: The sphere is sandwiched
between the graphs of two functions obtained by solving for z. Let R be the unit disc in
the xy-plane. If we use the sandwich method, we get
$$ V = \iint_R \left[ \int_{-\sqrt{1-x^2-y^2}}^{\sqrt{1-x^2-y^2}} 1\,dz \right] dA , $$
which gives a double integral $\iint_R 2\sqrt{1 - x^2 - y^2}\,dA$ which is of course best solved in
polar coordinates. We have $\int_0^{2\pi} \int_0^1 2\sqrt{1 - r^2}\,r\,drd\theta = 4\pi/3$.
With the washer method, which is in this case also called disc method, we slice
along the z-axis and get a disc of radius $\sqrt{1 - z^2}$ with area $\pi(1 - z^2)$. This is a method
suitable for single variable calculus because we get directly $\int_{-1}^1 \pi(1 - z^2)\,dz = 4\pi/3$.
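Both reductions can be compared numerically. The sketch below assumes a simple midpoint helper (the name `midpoint` and the resolution are choices made here); the sandwich integrand does not depend on θ, so the θ integral only contributes the factor 2π.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the single integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

# Sandwich method in polar coordinates: height 2 sqrt(1 - r^2), area factor r.
sandwich = 2 * math.pi * midpoint(lambda r: 2 * math.sqrt(1 - r * r) * r, 0.0, 1.0)

# Disc method: slices perpendicular to the z-axis with area pi (1 - z^2).
discs = midpoint(lambda z: math.pi * (1 - z * z), -1.0, 1.0)

print(sandwich, discs, 4 * math.pi / 3)
```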
17.7. The mass of a body with mass density ρ(x, y, z) is defined as $\iiint_R \rho(x, y, z)\,dV$.
For bodies with constant density ρ, the mass is ρV, where V is the volume. Compute
the mass of a body which is bounded by the parabolic cylinder $z = 4 - x^2$, and the
planes x = 0, y = 0, y = 6, z = 0 if the density of the body is z. Solution:
$$ \int_0^2 \int_0^6 \int_0^{4-x^2} z\,dz\,dy\,dx = \int_0^2 \int_0^6 (4 - x^2)^2/2\,dydx $$
$$ = 6 \int_0^2 (4 - x^2)^2/2\,dx = 3\left( \frac{x^5}{5} - \frac{8x^3}{3} + 16x \right)\Big|_0^2 = \frac{256}{5} $$

17.8. The solid region bounded by $x^2 + y^2 = 1$, x = z and z = 0 is called the hoof
of Archimedes. It is historically significant because it is one of the first examples
on which Archimedes probed a Riemann sum integration technique. It appears
in every calculus text book. Find the volume of the hoof. Solution. Look at
the situation from above and picture it in the xy-plane. You see a half disc R. It
is the floor of the solid. The roof is the function z = x. We have to integrate
$\iint_R x\,dxdy$. We got a double integral problem which is best done in polar
coordinates: $\int_{-\pi/2}^{\pi/2} \int_0^1 r^2\cos(\theta)\,drd\theta = 2/3$.

17.9. Finding the volume of the solid region bounded by the three cylinders $x^2 + y^2 = 1$,
$x^2 + z^2 = 1$ and $y^2 + z^2 = 1$ is one of the most famous volume integration problems
going back to Archimedes.
Solution: look at 1/16’th of the body given in cylindrical coordinates 0 ≤ θ ≤ π/4, r ≤
1, z > 0. The roof is $z = \sqrt{1 - x^2}$ because above the “one eighth disc” R only the
cylinder $x^2 + z^2 = 1$ matters. The polar integration problem
$$ 16 \int_0^{\pi/4} \int_0^1 \sqrt{1 - r^2\cos^2(\theta)}\,r\,drd\theta $$
has an inner r-integral of $(16/3)(1 - \sin(\theta)^3)/\cos^2(\theta)$. Integrating this over θ can be
done by integrating $(1 - \sin(x)^3)\sec^2(x)$ using $\tan'(x) = \sec^2(x)$, leading to
the anti-derivative $\tan(x) - \sec(x) - \cos(x)$. The result is $16 - 8\sqrt{2}$.

Homework
This homework is due on Tuesday, 7/26/2022.

Problem 17.1: Evaluate the triple integral
$$ \int_0^4 \int_0^z \int_0^{4y} 2ze^{-2y^2}\,dx\,dy\,dz . $$

Problem 17.2: What is $\int_0^1 \int_0^1 \int_y^1 xe^{-z^2}\,dz\,dy\,dx$?

Problem 17.3: Find the moment of inertia $\iiint_E (x^2 + y^2)\,dV$ of a
cone
$$ E = \{x^2 + y^2 \le z^2,\ 0 \le z \le 15\} , $$
which has the z-axis as its center of symmetry.

Problem 17.4: Integrate f (x, y, z) = x2 + y 2 − z over the tetrahedron

with vertices
(0, 0, 0), (4, 4, 0), (0, 4, 0), (0, 0, 12).

Problem 17.5: This is a classic problem of Archimedes: what is the

volume of the body obtained by intersecting the solid cylinders x2 +z 2 ≤ 9
and y 2 + z 2 ≤ 9?

Unit 18: Spherical integrals

Lecture
18.1. Cylindrical and spherical coordinate systems help to integrate in many situa-
tions.

Definition: Cylindrical coordinates are coordinates in R³, where polar
coordinates are used in the xy-plane while the z-coordinate is not changed.
The coordinate transformation T(r, θ, z) = (r cos(θ), r sin(θ), z) produces the
integration factor r. It is the same factor as in polar coordinates.
$$ \iiint_{T(R)} f(x, y, z)\,dxdydz = \iiint_R g(r, \theta, z)\,r\,drd\theta dz $$

Definition: Spherical coordinates use ρ ≥ 0, the distance to the origin, as
well as two Euler angles: 0 ≤ θ < 2π, the polar angle, and 0 ≤ ϕ ≤ π, the
angle between the vector and the positive z-axis. The coordinate change is
$$ T : (x, y, z) = (\rho\cos(\theta)\sin(\phi),\ \rho\sin(\theta)\sin(\phi),\ \rho\cos(\phi)) . $$
The integration factor measures the volume of a spherical wedge which is
$d\rho \cdot \rho\sin(\phi)\,d\theta \cdot \rho\,d\phi = \rho^2\sin(\phi)\,d\theta d\phi d\rho$.
$$ \iiint_{T(R)} f(x, y, z)\,dxdydz = \iiint_R g(\rho, \theta, \phi)\,\rho^2\sin(\phi)\,d\rho d\theta d\phi $$

A ball of radius R has the volume
$$ \int_0^R \int_0^{2\pi} \int_0^\pi \rho^2\sin(\phi)\,d\phi d\theta d\rho . $$
The innermost integral is $\int_0^\pi \rho^2\sin(\phi)\,d\phi = -\rho^2\cos(\phi)\big|_0^\pi = 2\rho^2$. The next
layer is, because θ does not appear: $\int_0^{2\pi} 2\rho^2\,d\theta = 4\pi\rho^2$. The final integral
is $\int_0^R 4\pi\rho^2\,d\rho = 4\pi R^3/3$.

Definition: The moment of inertia of a body G with respect to an axis
L is defined as the triple integral $\iiint_G r(x, y, z)^2\,dzdydx$, where r(x, y, z) is the
distance from the axis L. For the z-axis, r(x, y, z) = ρ sin(ϕ).

Examples
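18.2. For a ball of radius R we obtain with respect to the z-axis:
$$ I = \int_0^R \int_0^{2\pi} \int_0^\pi \rho^2\sin^2(\phi)\,\rho^2\sin(\phi)\,d\phi d\theta d\rho $$
$$ = \left( \int_0^\pi \sin^3(\phi)\,d\phi \right) \left( \int_0^R \rho^4\,d\rho \right) \left( \int_0^{2\pi} d\theta \right) $$
$$ = \left( \int_0^\pi \sin(\phi)(1 - \cos^2(\phi))\,d\phi \right) \left( \int_0^R \rho^4\,d\rho \right) \left( \int_0^{2\pi} d\theta \right) $$
$$ = \left( -\cos(\phi) + \cos(\phi)^3/3 \right)\Big|_0^\pi\,(R^5/5)(2\pi) = \frac{4}{3} \cdot \frac{R^5}{5} \cdot 2\pi = \frac{8\pi R^5}{15} . $$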
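The volume of the ball and the moment of inertia can both be checked with a single routine that integrates in spherical coordinates and includes the factor ρ² sin(ϕ). The function name `spherical_integral`, the grid size and the choice R = 1 are assumptions of this sketch.

```python
import math

def spherical_integral(g, rho_max, n=50):
    """Midpoint rule for the triple integral of g(rho, theta, phi) rho^2 sin(phi)
    over 0 <= rho <= rho_max, 0 <= theta <= 2 pi, 0 <= phi <= pi."""
    d_rho, d_theta, d_phi = rho_max / n, 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            theta = (j + 0.5) * d_theta
            for k in range(n):
                phi = (k + 0.5) * d_phi
                total += g(rho, theta, phi) * rho**2 * math.sin(phi)
    return total * d_rho * d_theta * d_phi

R = 1.0  # radius chosen for this illustration

# Volume: integrate the constant function 1.
volume = spherical_integral(lambda rho, theta, phi: 1.0, R)

# Moment of inertia about the z-axis: squared distance (rho sin(phi))^2.
inertia = spherical_integral(lambda rho, theta, phi: (rho * math.sin(phi)) ** 2, R)

print(volume, 4 * math.pi * R**3 / 3)
print(inertia, 8 * math.pi * R**5 / 15)
```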

18.3. If the sphere rotates with angular velocity ω, then Iω²/2 is the kinetic
energy of that sphere. The moment of inertia of the earth for example is 8 · 10³⁷ kg m².
The angular velocity is ω = 2π/day = 2π/(86400 s). The rotational energy is
Iω²/2 = 8 · 10³⁷ kg m² · (2π)²/(2 · 7464960000 s²) ∼ 2 · 10²⁹ J ∼ 5 · 10²⁵ kcal.
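18.4. Find the volume and the center of mass of a diamond, the intersection of the
unit sphere with the cone given in cylindrical coordinates as $z = \sqrt{3}\,r$.
Solution: we use spherical coordinates to find the center of mass
$$ \bar{x} = \int_0^1 \int_0^{2\pi} \int_0^{\pi/6} \rho^3\sin^2(\phi)\cos(\theta)\,d\phi d\theta d\rho\,\frac{1}{V} = 0 $$
$$ \bar{y} = \int_0^1 \int_0^{2\pi} \int_0^{\pi/6} \rho^3\sin^2(\phi)\sin(\theta)\,d\phi d\theta d\rho\,\frac{1}{V} = 0 $$
$$ \bar{z} = \int_0^1 \int_0^{2\pi} \int_0^{\pi/6} \rho^3\cos(\phi)\sin(\phi)\,d\phi d\theta d\rho\,\frac{1}{V} = \frac{2\pi}{32V} $$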
18.5. Find $\iiint_R z^2\,dV$ for the solid obtained by intersecting $\{1 \le x^2 + y^2 + z^2 \le 4\}$
with the double cone $\{z^2 \ge x^2 + y^2\}$.
Solution: since the result for the double cone is twice the result for the single cone, we
work with the diamond shaped region R in {z > 0} and multiply the result at the end
with 2. In spherical coordinates, the solid R is given by 1 ≤ ρ ≤ 2 and 0 ≤ ϕ ≤ π/4.
With z = ρ cos(ϕ), we have
$$ \int_1^2 \int_0^{2\pi} \int_0^{\pi/4} \rho^4\cos^2(\phi)\sin(\phi)\,d\phi d\theta d\rho = \left( \frac{2^5}{5} - \frac{1^5}{5} \right) 2\pi \left( \frac{-\cos^3(\phi)}{3} \right)\Big|_0^{\pi/4} = 2\pi\,\frac{31}{5}\,\frac{1 - 2^{-3/2}}{3} . $$
The result for the double cone is $4\pi(31/5)(1 - 1/\sqrt{2}^{\,3})/3$.

Homework
This homework is due on Tuesday, 7/26/2022.

Problem 18.1: A hot air balloon E has the shape $x^2 + y^2 + z^2 \le 1$, $z \ge -1/\sqrt{2}$.
The density of the gas is f(x, y, z) = 1 + z. Olli Rocky Docky computes
the amount of gas $\iiint_E f(x, y, z)\,dV$ using cylindrical coordinates and
gets $\int_0^{2\pi} \int_{-1/\sqrt{2}}^1 \int_0^{\sqrt{1-z^2}} (1 + z)\,r\,dr\,dz\,d\theta$. Then he computes the same volume
using spherical coordinates $\int_0^{2\pi} \int_0^{3\pi/4} \int_0^1 \rho^2\sin(\phi)(1 + \rho\cos(\phi))\,d\rho\,d\phi\,d\theta$.
Compute both integrals. You will see that they do not agree. Which
of the two integrals correctly computes the volume? What went wrong
with the other?

Problem 18.2: Assume the mass density of a solid $E = \{x^2 + y^2 - z^2 < 1,\ -1 < z < 1\}$
is given by the 8th power of the distance to the z-axis:
$\sigma(x, y, z) = r^8 = (x^2 + y^2)^4$. Find its mass
$$ M = \iiint_E (x^2 + y^2)^4\,dxdydz . $$

Problem 18.3: A solid is described in spherical coordinates by the

inequality ρ ≤ 2 sin(ϕ). Find its volume.

Problem 18.4: Integrate the function
$$ f(x, y, z) = e^{(x^2 + y^2 + z^2)^{3/2}} $$
over the solid which lies between the spheres $x^2 + y^2 + z^2 = 1$ and $x^2 + y^2 + z^2 = 4$,
which is in the first octant and which is above the cone $x^2 + y^2 = z^2$.

Problem 18.5: Find the volume of the solid x2 + y 2 ≤ z 4 , z 2 ≤ 4.

Unit 19: Vector fields

Lecture
19.1. A vector-valued function F is called a vector field. A real valued function f is
called a scalar field.

Definition: A planar vector field is a vector-valued map F⃗ which assigns

to a point (x, y) ∈ R2 a vector F⃗ (x, y) = [P (x, y), Q(x, y)]. A vector field in
space is a map, which assigns to each point (x, y, z) ∈ R3 a vector F⃗ (x, y, z) =
[P (x, y, z), Q(x, y, z), R(x, y, z)].

19.2. Here are examples of vector fields in two and three dimensions:
$$ \vec{F}(x, y) = \begin{bmatrix} y - \sin(x) \\ x^3 + \cos(2y) \end{bmatrix}, \qquad \vec{F}(x, y, z) = \begin{bmatrix} -y \\ x \\ \sin(z) \end{bmatrix} . $$

Definition: If f (x, y) is a function of two variables, then F⃗ (x, y) = ∇f (x, y)

is called a gradient field. Gradient fields in space are of the form F⃗ (x, y, z) =
∇f (x, y, z). They are important!

19.3. When is a vector field a gradient field? F⃗ (x, y) = [P (x, y), Q(x, y)] = ∇f (x, y)
implies Qx (x, y) = Py (x, y). If this does not hold at some point, F⃗ is no gradient field.

Clairaut test: If Qx (x, y) − Py (x, y) is not zero at some point, then F⃗ (x, y) =
[P (x, y), Q(x, y)] is not a gradient field.

19.4. We will see next week that curl(F⃗) = Qx − Py = 0 is also sufficient for F⃗ to be a
gradient field if F⃗ is defined everywhere. How do we get the function f with F⃗ = ∇f?
We will look at examples in class.

Examples
19.5. Is the vector field F⃗ (x, y) = [P, Q] = [3x2 y + y + 2, x3 + x − 1] a gradient
field? Solution: the Clairaut test shows Qx − Py = 0. We integrate the equation
fx = P = 3x2 y + y + 2 and get f (x, y) = 2x + xy + x3 y + c(y). Now take the derivative
of this with respect to y to get x + x3 + c′ (y) and compare with x3 + x − 1. We see
c′ (y) = −1 and so c(y) = −y + c. We see the solution x3 y + xy − y + 2x .
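A potential found by hand can be verified numerically by comparing its gradient with the field. In the sketch below, the helper names `gradient` and `curl`, the step size h and the sample points are assumptions made for this illustration; derivatives are approximated by central differences.

```python
def F(x, y):
    """The field [P, Q] of example 19.5."""
    return (3 * x**2 * y + y + 2, x**3 + x - 1)

def f(x, y):
    """The potential found in example 19.5."""
    return x**3 * y + x * y - y + 2 * x

def G(x, y):
    """The field [xy, 2xy^2] of example 19.6."""
    return (x * y, 2 * x * y**2)

def gradient(func, x, y, h=1e-5):
    """Central-difference approximation of (f_x, f_y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return fx, fy

def curl(field, x, y, h=1e-5):
    """Central-difference approximation of Q_x - P_y."""
    q_x = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    p_y = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return q_x - p_y

sample_points = [(0.0, 0.0), (1.0, 2.0), (-1.5, 0.5)]

# Largest deviation between grad f and F over the sample points.
max_error = max(
    abs(g - c)
    for (x, y) in sample_points
    for g, c in zip(gradient(f, x, y), F(x, y))
)
print(max_error)        # tiny: grad f matches F
print(curl(G, 1.0, 2.0))  # approximately 2*2^2 - 1 = 7, so G fails the Clairaut test
```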

19.6. Is the vector field F⃗ (x, y) = [xy, 2xy 2 ] a gradient field? Solution: No: Qx −Py =
2y 2 − x is not zero.
Vector fields appear naturally when studying differential equations. Here is an example
in population dynamics:

19.7. Let x(t) be the population of a “prey species” like shrimp and y(t) the population
size of a “predator” like sharks. We have x′(t) = ax(t) − bx(t)y(t) with positive a, b.
The second term is negative because both more predators and more prey lead to more
prey consumption. The rate of change of y(t) is y′(t) = −cy(t) + dx(t)y(t), where c, d are
positive. We have a negative sign in the first term because predators would die out
without food. The second term is explained because both more predators as well as
more prey lead to a growth of predators through reproduction. This can
be written using a vector field as ⃗r′(t) = F⃗(⃗r(t)). A concrete example is the
Lotka-Volterra system

ẋ = 0.4x − 0.4xy
ẏ = −0.1y + 0.2xy ,

where F⃗(x, y) = [0.4x − 0.4xy, −0.1y + 0.2xy]. Volterra explained with such systems
the oscillation of fish populations in the Mediterranean sea. Through any specific point
[x, y], there is a curve ⃗r(t) = [x(t), y(t)] for which
the tangent ⃗r′(t) = [x′(t), y′(t)] is the vector of the vector field at that point.
19.8. In mechanics, Hamiltonian fields play an important role: if H(x, y) is a function
of two variables called energy, then [Hy(x, y), −Hx(x, y)] is called a Hamiltonian
vector field. An example is the harmonic oscillator H(x, y) = (x² + y²)/2. Its vector
field is F⃗(x, y) = [Hy(x, y), −Hx(x, y)] = [y, −x]. The flow lines of a Hamiltonian
vector field are located on the level curves of H.
19.9. Here is a famous example. It is the Lorenz vector field
$$ \vec{F}(x, y, z) = \begin{bmatrix} 10y - 10x \\ -xz + 28x - y \\ xy - \frac{8}{3}z \end{bmatrix} . $$
It features what one calls a strange attractor, an icon in chaos theory.

Homework
This homework is due on Tuesday, 7/26/2022.
Problem 19.1:
a) Draw the gradient vector field of $f(x, y) = \sqrt{(x + 1)^2 + (y - 2)^2}$.
b) Draw the gradient vector field of $f(x, y) = \sin(x^2 - y^2)$.
In both cases, draw a contour map of f and use gradients to draw the
vector field F⃗(x, y) = ∇f.

Problem 19.2: The vector field
$$ \vec{F}(x, y) = \begin{bmatrix} \frac{x}{(x^2 + y^2)^{3/2}} \\ \frac{y}{(x^2 + y^2)^{3/2}} \end{bmatrix} $$
appears in electrostatics. Find a function f(x, y) such that F⃗ = ∇f.

Problem 19.3:
a) Is the vector field $\vec{F}(x, y) = \begin{bmatrix} xy \\ x^2 \end{bmatrix}$ a gradient field?
b) Is the vector field $\vec{F}(x, y) = \begin{bmatrix} \sin(x) + y \\ \cos(y) + x \end{bmatrix}$ a gradient field?
In both cases, find f(x, y) satisfying ∇f(x, y) = F⃗(x, y) or give a reason
why it does not exist.

Problem 19.4: Find conditions such that a vector field in three

dimensions F⃗ (x, y, z) is a gradient field. Then check it in the following
cases. If the field is a gradient field, find a potential f such that F⃗ = ∇f .
a) F⃗ (x, y, z) = [x11 , y 9 , z].
b) F⃗ (x, y, z) = [y, x, z 3 ].
c) F⃗ (x, y, z) = [10y + 10x, 10x + 10y, x].
d) F⃗ (x, y, z) = [y, z, x].

Problem 19.5: Find the potential function f (x, y, z) to

F⃗ (x, y, z) = [5e5x + 5x4 y + z 4 + y cos(xy), x5 + x cos(xy), 4xz 3 + 7e7z ] .

Unit 20: Line integral theorem

Lecture
20.1. When a vector field is integrated along a curve, we get a line integral. In the
special case, where F⃗ is a gradient field, we can compute the integral using the funda-
mental theorem of calculus. The corresponding formula will be the first generalization
of the fundamental theorem to higher dimensions.

Definition: If F⃗ is a vector field in R² or R³ and C : t ↦ ⃗r(t) is a curve, then
$$ \int_a^b \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt $$
is called the line integral of F⃗ along the curve C.

20.2. We also use the short-hand notation $\int_C \vec{F} \cdot \vec{dr}$. In physics, if F⃗(x, y, z) is a force
field, then $\vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)$ is called power and the line integral
$\int_a^b \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt$ is work. In electrodynamics, if F⃗(x, y, z) is an electric
field, then the line integral $\int_a^b \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt$ is the electrostatic potential.

20.3. Let C : t ↦ ⃗r(t) = [cos(t), sin(t)] be a circle parameterized by t ∈ [0, 2π] and let
F⃗(x, y) = [−y, x]. Calculate the line integral $I = \int_C \vec{F}(\vec{r}) \cdot \vec{dr}$.
Solution: We have
$I = \int_0^{2\pi} \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt = \int_0^{2\pi} (-\sin(t), \cos(t)) \cdot (-\sin(t), \cos(t))\,dt = \int_0^{2\pi} \sin^2(t) + \cos^2(t)\,dt = 2\pi$.
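The definition of the line integral can be turned into a small numerical routine. This is a sketch under assumptions made here: the name `line_integral`, a midpoint rule in t and a central difference for the velocity ⃗r′(t). Applied to the circle and the field F⃗(x, y) = [−y, x] it should reproduce 2π.

```python
import math

def line_integral(F, r, a, b, n=2000, h=1e-6):
    """Midpoint rule for int_a^b F(r(t)) . r'(t) dt.

    The velocity r'(t) is estimated with a central difference of step h.
    """
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        point = r(t)
        velocity = [(p - q) / (2 * h) for p, q in zip(r(t + h), r(t - h))]
        power = sum(Fi * vi for Fi, vi in zip(F(*point), velocity))
        total += power * dt
    return total

def circle(t):
    return (math.cos(t), math.sin(t))

def rotation_field(x, y):
    return (-y, x)

I = line_integral(rotation_field, circle, 0.0, 2 * math.pi)
print(I, 2 * math.pi)
```

The same routine works for curves in space, as long as F and r return tuples of the same length.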

20.4. Let ⃗r(t) be a curve given in polar coordinates as [r(t), ϕ(t)] = [cos(t), t]
defined on the interval 0 ≤ t ≤ π. Let F⃗ be the vector field F⃗(x, y) = [−xy, 0]. Calculate
the line integral $\int_C \vec{F} \cdot \vec{dr}$. Solution: In Cartesian coordinates, the curve is
$\vec{r}(t) = [\cos^2(t), \cos(t)\sin(t)]$. The velocity vector is then
$\vec{r}\,'(t) = [-2\sin(t)\cos(t), -\sin^2(t) + \cos^2(t)] = [x'(t), y'(t)]$. The line integral is
$$ \int_0^\pi \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt = \int_0^\pi [-\cos^3(t)\sin(t), 0] \cdot [-2\sin(t)\cos(t), -\sin^2(t) + \cos^2(t)]\,dt $$
$$ = \int_0^\pi 2\sin^2(t)\cos^4(t)\,dt = 2\left( \frac{t}{16} + \frac{\sin(2t)}{64} - \frac{\sin(4t)}{64} - \frac{\sin(6t)}{192} \right)\Big|_0^\pi = \pi/8 . $$

20.5. The first generalization of the fundamental theorem of calculus to higher
dimensions is the fundamental theorem of line integrals.

Theorem: Fundamental theorem of line integrals: If F⃗ = ∇f, then
$$ \int_a^b \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt = f(\vec{r}(b)) - f(\vec{r}(a)) . $$

20.6. In other words, the line integral is the potential difference between the end points
⃗r(b) and ⃗r(a), if F⃗ is a gradient field.

Examples
20.7. Let f(x, y, z) be the temperature distribution in a room and let ⃗r(t) be the path
of a fly in the room. Then f(⃗r(t)) is the temperature the fly experiences at the point
⃗r(t) at time t. The change of temperature for the fly is $\frac{d}{dt} f(\vec{r}(t))$. The line integral of
the temperature gradient ∇f along the path of the fly coincides with the temperature
difference between the end point and initial point.
20.8. Here are some special cases: If ⃗r(t) is parallel to the level curve of f, then
$\frac{d}{dt} f(\vec{r}(t)) = 0$ because ⃗r′(t) is orthogonal to ∇f(⃗r(t)). If ⃗r(t) is orthogonal to the
level curve, then $|\frac{d}{dt} f(\vec{r}(t))| = |\nabla f||\vec{r}\,'(t)|$ because ⃗r′(t) is parallel to ∇f(⃗r(t)).
20.9. The proof of the fundamental theorem uses the chain rule in the second equality
and the fundamental theorem of calculus in the third equality of the following identities:
$$ \int_a^b \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt = \int_a^b \nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t)\,dt = \int_a^b \frac{d}{dt} f(\vec{r}(t))\,dt = f(\vec{r}(b)) - f(\vec{r}(a)) . $$
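The theorem can be tested on a concrete case. In the sketch below the temperature f(x, y, z) = x²y + sin(z) and the helix-shaped path of the fly are hypothetical choices made only for this illustration; the line integral of ∇f along the path is compared with the difference of f at the end points.

```python
import math

def f(x, y, z):
    """Hypothetical temperature distribution."""
    return x * x * y + math.sin(z)

def grad_f(x, y, z):
    """Gradient of f, computed by hand."""
    return (2 * x * y, x * x, math.cos(z))

def r(t):
    """Hypothetical path of the fly: a helix."""
    return (math.cos(t), math.sin(t), t)

def r_prime(t):
    """Velocity of the path."""
    return (-math.sin(t), math.cos(t), 1.0)

def line_integral(F, r, r_prime, a, b, n=2000):
    """Midpoint rule for int_a^b F(r(t)) . r'(t) dt."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        total += sum(Fi * vi for Fi, vi in zip(F(*r(t)), r_prime(t))) * dt
    return total

a, b = 0.0, 3.0
work = line_integral(grad_f, r, r_prime, a, b)
difference = f(*r(b)) - f(*r(a))
print(work, difference)  # the two numbers agree up to discretization error
```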

Theorem: For a gradient field, the line-integral along any closed curve
is zero.

20.10. When is a vector field a gradient field? F⃗(x, y) = ∇f(x, y) implies Py(x, y) =
Qx(x, y). If this does not hold at some point, F⃗ = [P, Q] is not a gradient field. This is
called the Clairaut test. We will see later that the condition curl(F⃗) = Qx − Py = 0
and F⃗ being defined everywhere implies that the field is a gradient field.
20.11. Let $\vec{F}(x, y) = [2xy^2 + 3x^2, 2yx^2]$. Find a potential f of F⃗ = [P, Q].
Solution: The potential function f(x, y) satisfies $f_x(x, y) = 2xy^2 + 3x^2$ and $f_y(x, y) = 2yx^2$.
Integrating the second equation gives $f(x, y) = x^2y^2 + h(x)$. Partial differentiation
with respect to x gives $f_x(x, y) = 2xy^2 + h'(x)$ which should be $2xy^2 + 3x^2$ so that
we can take $h(x) = x^3$. The potential function is $f(x, y) = x^2y^2 + x^3$. In general, one
can start from $f(x, y) = \int_0^x P(x, y)\,dx + h(y)$ and determine h from $f_y(x, y) = Q(x, y)$.

20.12. Let F⃗ (x, y) = [P, Q] = [ x2−y , x ]. It appears to be a gradient field because

+y 2 x2 +y 2
f (x, y) = arctan(y/x) has the property that fx = (−y/x2 )/(1 + y 2 /x2 ) = P, fy =
(1/x)/(1 + y 2 /x2 ) = Q. However, the line integral γ F⃗ dr, ⃗ where γ is the unit circle is
R
Z 2π
− sin(t) cos(t)
[ 2 , ] · [− sin(t), cos(t)] dt
0 cos (t) + sin (t) cos (t) + sin2
2 2
R 2π
which is 0 1 dt = 2π. What is wrong?
Solution: note that the potential f as well as the vector-field F⃗ are not differentiable
everywhere. The curl of F⃗ is zero except at (0, 0), where it is not defined.
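The role of the origin can be seen numerically. This sketch, under the assumption that a midpoint rule is accurate enough here, integrates the field around two circles: one enclosing $(0, 0)$ and one not. The function names and the choice of circles are made up for the illustration.

```python
import math

def F(x, y):
    """Vortex field [-y, x] / (x^2 + y^2), undefined at the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circle_integral(cx, cy, radius, n=20000):
    """Midpoint-rule line integral of F along a counterclockwise circle."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t)   # x'(t)
        dy = radius * math.cos(t)    # y'(t)
        P, Q = F(x, y)
        total += (P * dx + Q * dy) * dt
    return total

around_origin = circle_integral(0.0, 0.0, 1.0)     # loop encloses (0, 0)
away_from_origin = circle_integral(3.0, 0.0, 1.0)  # loop does not enclose (0, 0)
print(around_origin, away_from_origin)
```

The first value is close to $2\pi$, the second close to $0$: the closed loop property only fails for loops that wind around the point where the field is not defined.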

20.13. A device which implements a non-gradient force field is called a perpetual motion machine. It realizes a force field for which the energy gain is positive along some closed loop. The first law of thermodynamics forbids the existence of such a machine. It is informative to contemplate some of the ideas people have come up with and to analyze why they don’t work. Here is an example: consider an O-shaped pipe which is filled only on the right side with water. A wooden ball falls on the right hand side in the air and moves up in the water. You find plenty of other futile attempts on YouTube.

Homework
This homework is due on Tuesday, 7/26/2022.
Problem 20.1: What is the work done by moving in the force field $\vec{F}(x, y) = [6x^2 + 2, 16y^7]$ along the parabola $y = x^2$ from $(-1, 1)$ to $(1, 1)$? In part a) compute it directly. Then, in part b), use the theorem.

Problem 20.2: Let $C$ be the space curve $\vec{r}(t) = [\cos(t), \sin(\sin(t)), 5t]$ for $t \in [0, \pi]$ and let $\vec{F}(x, y, z) = [y, x, 15 + \cos(21z)]$. Find the value of the line integral $\int_C \vec{F} \cdot d\vec{r}$.

Problem 20.3: Let F⃗ be the vector field F⃗ (x, y) = [−y, x]/2. Compute
the line integral of F⃗ along an ellipse ⃗r(t) = [a cos(t), b sin(t)] with width
2a and height 2b. The result should depend on a and b.

Problem 20.4: It is hot and you refresh yourself in a little pool in your garden. Its rim has the shape $x^{40} + y^{40} = 1$ oriented counter clockwise. There is a hose filling in fresh water to the tub so that there is a velocity field $\vec{F}(x, y) = [2x + 5y, 10y^4 + 5x]$ inside. Calculate the line integral $\int_C \vec{F} \cdot d\vec{r}$, the energy you gain from the fluid force when dislocating from $(1, 0)$ to $(0, 1)$ along the rim. Remember you are in a pool and do not want to work hard. There is an easy way to get the answer.

Problem 20.5: Find a closed curve $C : \vec{r}(t)$ for which the vector field $\vec{F}(x, y) = [P(x, y), Q(x, y)] = [xy, x^2]$ satisfies $\int_C \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\, dt \neq 0$.

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022

MULTIVARIABLE CALCULUS

MATH S-21A

Unit 21: Green’s theorem

Lecture
21.1. Multi-variable calculus in two dimensions has two derivatives, ∇ and curl, and two integral theorems: the fundamental theorem of line integrals as well as Green’s theorem. You might be used to thinking about two dimensions as the xy-plane in three space, but we insist on remaining two dimensional. 1

21.2. The curl of a vector field $\vec{F}(x, y) = [P(x, y), Q(x, y)]$ is the scalar field $\operatorname{curl}(\vec{F})(x, y) = \nabla \times \vec{F} = Q_x(x, y) - P_y(x, y)$. It measures the vorticity of the vector field at $(x, y)$. For example, for $\vec{F}(x, y) = [x^3 + y^2, y^3 + x^2 y]$, we have $\operatorname{curl}(\vec{F})(x, y) = 2xy - 2y$.

Theorem: Green’s theorem tells that if $\vec{F}(x, y) = [P(x, y), Q(x, y)]$ is a vector field and $G$ is a region for which the boundary $C$ is a curve, parametrized so that $G$ is “to the left”, then
$$\int_C \vec{F} \cdot d\vec{r} = \iint_G \operatorname{curl}(\vec{F})\, dxdy .$$

21.3. Take a square $G = [x, x + h] \times [y, y + h]$ with small $h > 0$. The line integral of $\vec{F} = [P, Q]$ along the boundary is
$$\int_0^h P(x + t, y)\, dt + \int_0^h Q(x + h, y + t)\, dt - \int_0^h P(x + t, y + h)\, dt - \int_0^h Q(x, y + t)\, dt .$$
It measures the “circulation” at the position $(x, y)$. Because $Q(x + h, y) - Q(x, y) \sim Q_x(x, y) h$ and $P(x, y + h) - P(x, y) \sim P_y(x, y) h$, the line integral is $(Q_x - P_y) h^2$, which agrees with $\int_0^h \int_0^h \operatorname{curl}(\vec{F})\, dxdy$ up to an error of the order $h^3$. Now take a region $G$ with area $|G|$ and chop it into small squares of size $h$. We need about $|G|/h^2$ such squares. The sum of the line integrals around the boundaries of all the squares is the line integral along the boundary of $G$, because the contributions of the interior edges cancel. On the boundary, it is a Riemann sum of the line integral along the boundary. The sum of the curls of the squares is a Riemann sum approximation of the double integral $\iint_G \operatorname{curl}(\vec{F})\, dxdy$. Taking the limit $h \to 0$ gives Green’s theorem.

1 Think about two dimensions as if you were a flatlander unaware of the third dimension. If we speak about “the plane”, it is our universe and we are ignorant about 3-space. Edwin Abbott’s Flatland is an 1884 romance that plays in two dimensions.

21.4. George Green lived from 1793 to 1841. Unfortunately, we don’t have a single
picture of him. He was a physicist, a self-taught mathematician as well as a miller. His
work greatly contributed to modern physics.

21.5. A special case is if $\vec{F}$ is a gradient field $\vec{F} = \nabla f$. Then, both sides of Green’s theorem are zero: $\int_C \vec{F} \cdot d\vec{r}$ is zero by the fundamental theorem for line integrals. And $\iint_G \operatorname{curl}(\vec{F})\, dA$ is zero because $\operatorname{curl}(\vec{F}) = \operatorname{curl}(\operatorname{grad}(f)) = 0$.

21.6. If $\vec{F}(x, y) = \nabla f$ is a gradient field then the curl is zero: with $P(x, y) = f_x(x, y)$ and $Q(x, y) = f_y(x, y)$ we get $\operatorname{curl}(\vec{F}) = Q_x - P_y = f_{yx} - f_{xy} = 0$ by Clairaut. The field $\vec{F}(x, y) = [x + y, yx]$ for example is not a gradient field because $\operatorname{curl}(\vec{F}) = y - 1$.

21.7. We have already established the Clairaut identity

$\operatorname{curl}(\operatorname{grad}(f)) = 0$

21.8. This can also be remembered by writing $\operatorname{curl}(\vec{F}) = \nabla \times \vec{F}$ and $\operatorname{curl}(\nabla f) = \nabla \times \nabla f$. Use now that the cross product of two identical vectors is 0. Working with ∇ as a vector is called nabla calculus, which can serve as a mnemonic.

21.9. It had been a consequence of the fundamental theorem of line integrals that:

If F⃗ is a gradient field then curl(F⃗ ) = 0 everywhere.

21.10. Is the converse true? Here is the answer:

Definition: A region R is called simply connected if every closed loop in
R can be pulled together continuously within R to a point inside R.

21.11. R = {x2 + y 2 ≤ 1} is simply connected, O = {3 ≤ x2 + y 2 ≤ 4} is not.

If curl(F⃗ ) = 0 in a simply connected region G, then F⃗ is a gradient field.

Proof. Given a closed curve $C$ in $G$ enclosing a region $R$. Green’s theorem assures that for any field $\vec{F}$ with $\operatorname{curl}(\vec{F}) = 0$ we have $\int_C \vec{F} \cdot d\vec{r} = \iint_R \operatorname{curl}(\vec{F})\, dxdy = 0$. So $\vec{F}$ has the closed loop property in $G$. This is equivalent to the fact that line integrals are path independent. In that case $\vec{F}$ is therefore a gradient field: one can get $f(x, y)$ by taking the line integral from an arbitrary point $O$ to $(x, y)$. In the homework, you look at an example of a not simply connected region where $\operatorname{curl}(\vec{F}) = 0$ does not imply that $\vec{F}$ is a gradient field.

Examples
21.12. Problem: Find the line integral of $\vec{F}(x, y) = [x^2 - y^2, 2xy] = [P, Q]$ along the boundary of the rectangle $[0, 2] \times [0, 1]$. Solution: $\operatorname{curl}(\vec{F}) = Q_x - P_y = 2y + 2y = 4y$ so that $\int_C \vec{F} \cdot d\vec{r} = \int_0^2 \int_0^1 4y\, dydx = 2y^2 |_0^1 \, x|_0^2 = 4$.
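Both sides of Green’s theorem in 21.12 can also be approximated numerically. The sketch below uses a midpoint rule with made-up helper names; it assumes the boundary is traversed counterclockwise through the corners $(0,0), (2,0), (2,1), (0,1)$.

```python
def F(x, y):
    """Field of 21.12: [x^2 - y^2, 2xy]."""
    return (x * x - y * y, 2 * x * y)

def segment_integral(a, b, n=2000):
    """Midpoint-rule line integral of F along the straight segment from a to b."""
    (ax, ay), (bx, by) = a, b
    dx, dy = bx - ax, by - ay
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        P, Q = F(ax + s * dx, ay + s * dy)
        total += (P * dx + Q * dy) / n
    return total

# Corners of the rectangle [0, 2] x [0, 1], counterclockwise.
corners = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
line_integral = sum(
    segment_integral(corners[i], corners[(i + 1) % 4]) for i in range(4)
)

def curl_double_integral(n=200):
    """Midpoint-rule double integral of curl(F) = 4y over the rectangle."""
    hx, hy = 2.0 / n, 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            y = (j + 0.5) * hy
            total += 4 * y * hx * hy
    return total

double_integral = curl_double_integral()
print(line_integral, double_integral)
```

Both numbers should be close to 4, the value obtained in the solution.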

21.13. Problem: Find the area of the region enclosed by
$$\vec{r}(t) = \left[\frac{\sin^2(\pi t)}{t}, t^2 - 1\right]$$
for $-1 \le t \le 1$. To do so, use Green’s theorem with the vector field $\vec{F} = [0, x]$.
21.14. Green’s theorem allows us to express the coordinates of the centroid = center of mass
$$\left( \iint_G x\, dA / A, \; \iint_G y\, dA / A \right)$$
using line integrals. With $\vec{F} = [0, x^2/2]$ we have $\iint_G x\, dA = \int_C \vec{F} \cdot d\vec{r}$.

21.15. An important application of Green is area computation: Take a vector field like $\vec{F}(x, y) = [P, Q] = [0, x]$ which has constant vorticity $\operatorname{curl}(\vec{F})(x, y) = 1$. For $\vec{F}(x, y) = [0, x]$, the right hand side in Green’s theorem is the area $\operatorname{Area}(G) = \int_C \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\, dt$.
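As a sanity check of the area formula, here is a small sketch that evaluates $\int_C [0, x] \cdot \vec{r}\,'(t)\, dt = \int x(t)\, y'(t)\, dt$ for a circle of radius 2, where the area $4\pi$ is known. The circle and the function name are assumptions chosen for this illustration only.

```python
import math

def area_by_line_integral(x, dy_dt, t0, t1, n=10000):
    """Midpoint-rule value of the line integral of F = [0, x] along a closed
    curve, given its x-coordinate x(t) and the derivative y'(t)."""
    dt = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * dt
        total += x(t) * dy_dt(t) * dt
    return total

radius = 2.0
# Counterclockwise circle r(t) = [radius cos(t), radius sin(t)].
area = area_by_line_integral(
    lambda t: radius * math.cos(t),   # x(t)
    lambda t: radius * math.cos(t),   # y'(t) = radius cos(t)
    0.0,
    2 * math.pi,
)
print(area, math.pi * radius**2)
```

The same function works for any closed curve that is traversed so that the region is to the left.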

21.16. Let $G$ be the region below the graph of a function $f(x)$ on $[a, b]$ and take $\vec{F}(x, y) = [-y, 0]$, which also has curl 1. The line integral around the boundary of $G$ is 0 from $(a, 0)$ to $(b, 0)$ because $\vec{F}(x, y) = [0, 0]$ there. The line integral is also zero from $(b, 0)$ to $(b, f(b))$ and from $(a, f(a))$ to $(a, 0)$ because $Q = 0$ and the velocity is vertical there. The line integral along the curve $(t, f(t))$, traversed from right to left, is $-\int_a^b [-y(t), 0] \cdot [1, f'(t)]\, dt = \int_a^b f(t)\, dt$. Green’s theorem confirms that this is the area of the region below the graph.
21.17. An engineering application is the planimeter, a mechanical device for measuring areas. We demonstrate it in class. Historically it had been used in medicine to measure the size of the cross-sections of tumors, in biology to measure the area of leaves or wing sizes of insects, in agriculture to measure the area of forests, in engineering to measure the size of profiles. There is a vector field $\vec{F}$ associated to the device which is obtained by placing a unit vector perpendicular to the arm. One can prove that $\vec{F}$ has vorticity 1. The planimeter calculates the line integral of $\vec{F}$ along a given curve. Green’s theorem assures this is the area.

Homework

This homework is due on Tuesday, 8/2/2022.

Problem 21.1: Given f (x, y) = 99999 ∗ x5 + 77777 ∗ xy 4 , compute the

line integral of F⃗ (x, y) = [25y + 6y 2 , 12xy + 10y 445555 ] + ∇f along the
boundary of the Monster region given in the picture. There are four
boundary curves, oriented as shown in the picture: a large ellipse of area
16, two circles of area 1 and 2 as well as a small ellipse (the mouth) of
area 3.

Problem 21.2: Find the area of the region bounded by the hypocycloid
⃗r(t) = [cos3 (t), sin3 (t)], 0 ≤ t ≤ 2π.

Problem 21.3: Let $G$ be the region $x^{10} + y^{10} \le 1$. Mathematica allows us to get the area as Area[ImplicitRegion[x^10 + y^10 <= 1, {x, y}]] and tells that it is $A = 3.94293$ (which is using the Gamma function $4\Gamma(11/10)^2/\Gamma(6/5)$). What is the line integral of $\vec{F}(x, y) = [x^{800} + \sin(x) - 55y, y^{12} + \cos(y) + 4x]$ counter clockwise along the boundary of $G$ in terms of $A$?

Problem 21.4: Let C be the boundary curve of the white Yang part of
the Ying-Yang symbol in the disc of radius 6. You can see in the image
that the curve C has three parts, and that the orientation of each part is
given. Find the line integral of the vector field F⃗ (x, y) = [−y + sin(ex ), 5x]
around C.

Problem 21.5: Use Green’s Theorem to evaluate $\int_C [\sin(\sqrt{1 + x^7}) + 21y, 121x] \cdot d\vec{r}$, where $C$ is the boundary of the region $K(4)$. You see in the picture $K(0), K(1), K(2), K(3), K(4)$. The first $K(0)$ is an equilateral triangle of side length 1. The second $K(1)$ is $K(0)$ with 3 equilateral triangles of side length 1/3 added. $K(2)$ is $K(1)$ with $3 \cdot 4^1$ equilateral triangles of side length 1/9 added. Remark. We could now find the line integral in the limit $K = K(\infty)$, a fractal called the Koch snowflake. It has dimension $\log(4)/\log(3) = 1.26\ldots$ which is between 1 and 2.

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022

MULTIVARIABLE CALCULUS

MATH S-21A

Unit 22: Curl and Flux

Lecture
22.1. In two dimensions, the curl of $\vec{F}$ was the scalar field $\operatorname{curl}(\vec{F}) = Q_x - P_y$. By Green’s theorem, the curl evaluated at $(x, y)$ is $\lim_{r \to 0} \int_{C_r} \vec{F} \cdot d\vec{r} / (\pi r^2)$, where $C_r$ is a small circle of radius $r$ oriented counter clockwise and centered at $(x, y)$. Green’s theorem explains that curl measures how the field “curls” or rotates. While rotations in two dimensions are determined by a single angle, in three dimensions, three parameters in the form of a vector are needed. The direction of this vector tells the axis of rotation and the magnitude tells the amount of rotation.

Definition: The curl of F⃗ = [P, Q, R] is the vector field

curl([P, Q, R]) = [Ry − Qz , Pz − Rx , Qx − Py ] .

22.2. In Nabla calculus, this is written as $\operatorname{curl}(\vec{F}) = \nabla \times \vec{F}$. Note that the third component $Q_x - P_y$ of the curl is for fixed $z$ just the curl of the two-dimensional vector field $\vec{F} = [P, Q]$. While the curl in two dimensions is a scalar field, it is a vector field in 3 dimensions. In $n$ dimensions, it would have $n(n - 1)/2$ components, the number of 2-dimensional coordinate planes. The curl measures the “vorticity” of the field. Each of the components is the rotation projected onto one of the three coordinate planes.

Definition: If F⃗ has zero curl everywhere, the field is called irrotational.

22.3. The curl is frequently visualized using a “paddle wheel”. If the rotation axis points in the direction of a unit vector $\vec{v}$, the signed rotation speed is $\operatorname{curl}(\vec{F}) \cdot \vec{v}$. If the vector $\vec{v}$ is chosen in the direction in which the wheel turns fastest, this is the direction of $\operatorname{curl}(\vec{F})$. The angular velocity of the wheel is the magnitude of the curl.

22.4. In two dimensions, we had two derivatives, the gradient and curl. In three
dimensions, there are now three fundamental derivatives: the gradient, the curl and
the divergence.

Definition: The divergence of F⃗ = [P, Q, R] is the scalar field

div([P, Q, R]) = ∇ · F⃗ = Px + Qy + Rz .

22.5. The divergence can also be defined in two dimensions, but it is not as fundamental because it is not an “exterior derivative”. We want in d dimensions to have
d fundamental derivatives and d fundamental integrals and d fundamental theorems.
Distinguishing dimensions helps to organize the integral theorems. While Green looks
like Stokes, we urge you to look at it as a different theorem taking place in “flatland”.
It is a small matter but it is much clearer to have in every dimension d a separate
calculus. This prevents mixing up the theorems and makes things easier.

Definition: In two dimensions, the divergence of F⃗ = [P, Q] is defined as

div([P, Q]) = ∇ · F⃗ = Px + Qy .

22.6. In two dimensions, divergence and curl are related by a rotation: if $\vec{G} = [Q, -P]$ is the field $\vec{F} = [P, Q]$ rotated by $-90$ degrees, then $\operatorname{div}(\vec{G}) = Q_x - P_y = \operatorname{curl}(\vec{F})$. The divergence measures the “expansion” of a field. If a field has zero divergence everywhere, the field is called incompressible.
22.7. With the “vector” $\nabla = [\partial_x, \partial_y, \partial_z]$, we can write $\operatorname{curl}(\vec{F}) = \nabla \times \vec{F}$ and $\operatorname{div}(\vec{F}) = \nabla \cdot \vec{F}$. Rewriting formulas using the “Nabla vector” and using rules from geometry is called Nabla calculus. This works both in 2 and 3 dimensions even though the ∇ vector is not an actual vector but an operator. The following combination of divergence and gradient often appears in physics:
Definition: $\Delta f = \operatorname{div}(\operatorname{grad}(f)) = f_{xx} + f_{yy} + f_{zz}$ is called the Laplacian of $f$. One can write $\Delta f = \nabla^2 f$.

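22.8. We can extend the Laplacian also to vector fields by defining:

Definition: $\Delta \vec{F} = [\Delta P, \Delta Q, \Delta R]$ and write $\nabla^2 \vec{F}$.

Mathematicians know it as a “form Laplacian”. Here are some identities:
$\operatorname{div}(\operatorname{curl}(\vec{F})) = 0$.
$\operatorname{curl}(\operatorname{grad}(f)) = \vec{0}$.
$\operatorname{curl}(\operatorname{curl}(\vec{F})) = \operatorname{grad}(\operatorname{div}(\vec{F})) - \Delta(\vec{F})$.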
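The first two identities can be tested numerically with finite differences. The following is a sketch only: the test field `F`, the test function `f`, the evaluation point and the step size `H` are arbitrary choices for the illustration, and the derivatives are central-difference approximations rather than exact derivatives.

```python
import math

H = 1e-3  # finite-difference step, an arbitrary choice

def partial(func, i, p, h=H):
    """Central difference of the scalar function func in coordinate i at p."""
    plus, minus = list(p), list(p)
    plus[i] += h
    minus[i] -= h
    return (func(*plus) - func(*minus)) / (2 * h)

def component(F, j):
    """The j-th component of a vector field as a scalar function."""
    return lambda x, y, z: F(x, y, z)[j]

def grad(f):
    return lambda x, y, z: tuple(partial(f, i, (x, y, z)) for i in range(3))

def curl(F):
    P, Q, R = (component(F, j) for j in range(3))
    def result(x, y, z):
        p = (x, y, z)
        return (partial(R, 1, p) - partial(Q, 2, p),   # R_y - Q_z
                partial(P, 2, p) - partial(R, 0, p),   # P_z - R_x
                partial(Q, 0, p) - partial(P, 1, p))   # Q_x - P_y
    return result

def div(F):
    P, Q, R = (component(F, j) for j in range(3))
    return lambda x, y, z: (partial(P, 0, (x, y, z))
                            + partial(Q, 1, (x, y, z))
                            + partial(R, 2, (x, y, z)))

def F(x, y, z):
    return (x * y * z, math.sin(x) * z, x * x + y * z)

def f(x, y, z):
    return x * x * y + math.exp(z) * y

point = (0.3, -0.7, 1.1)
div_curl = div(curl(F))(*point)      # expected to be close to 0
curl_grad = curl(grad(f))(*point)    # expected to be close to [0, 0, 0]
print(div_curl, curl_grad)
```

Both results are zero up to rounding errors, in agreement with $\operatorname{div}(\operatorname{curl}(\vec{F})) = 0$ and $\operatorname{curl}(\operatorname{grad}(f)) = \vec{0}$.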

Examples
22.9. Question: Is there a vector field $\vec{G}$ such that $\vec{F} = [x + y, z, y^2] = \operatorname{curl}(\vec{G})$?
Answer: No, because $\operatorname{div}(\vec{F}) = 1$ is incompatible with $\operatorname{div}(\operatorname{curl}(\vec{G})) = 0$.

22.10. Show that in a simply connected region, every irrotational and incompressible field can be written as a vector field $\vec{F} = \operatorname{grad}(f)$ with $\Delta f = 0$. Proof. Since $\vec{F}$ is irrotational, there exists a function $f$ satisfying $\vec{F} = \operatorname{grad}(f)$. From $\operatorname{div}(\vec{F}) = 0$ one has $\operatorname{div}(\operatorname{grad}(f)) = \Delta f = 0$.

22.11. Here is a remark: If we rotate the vector field $\vec{F} = [P, Q]$ by $-90$ degrees $= -\pi/2$, we get a new vector field $\vec{G} = [Q, -P]$. The integral $\int_C \vec{F} \cdot d\vec{r}$ becomes a flux $\int_C \vec{G} \cdot d\vec{n}$ of $\vec{G}$ through the boundary of $R$, where $d\vec{n}$ is the outward normal vector to $d\vec{r}$ with length $|d\vec{n}| = |d\vec{r}| = |\vec{r}\,'| dt$. With $\operatorname{div}(\vec{G}) = Q_x - P_y$, we see that
$$\operatorname{curl}(\vec{F}) = \operatorname{div}(\vec{G}) .$$
Green’s theorem now becomes
$$\iint_R \operatorname{div}(\vec{G})\, dxdy = \int_C \vec{G} \cdot d\vec{n} ,$$
where $d\vec{n}(x, y)$ is a normal vector at $(x, y)$ orthogonal to the velocity vector $\vec{r}\,'$ at $(x, y)$. In three dimensions this theorem will become the Gauss theorem or divergence theorem. Don’t treat this however as a different theorem in two dimensions. It is just Green’s theorem in disguise.

In two dimensions, the divergence at a point (x, y) is the average flux of the field
through a small circle of radius r around the point in the limit when the radius of
the circle goes to zero.

We have now all the derivatives we need. In dimension d, there are d fundamental
derivatives.

1
1 −grad→ 1
1 −grad→ 2 −curl→ 1
1 −grad→ 3 −curl→ 3 −div→ 1

Homework
This homework is due on Tuesday, 8/2/2022.

Problem 22.1: Construct your own nonzero vector field F⃗ (x, y) =

[P (x, y), Q(x, y)] in each of the following cases:
a) F⃗ is irrotational but not incompressible.
b) F⃗ is incompressible but not irrotational.
c) F⃗ is irrotational and incompressible.
d) F⃗ is not irrotational and not incompressible.

Problem 22.2: The vector field $\vec{F}(x, y, z) = [x, y, -2z]$ satisfies $\operatorname{div}(\vec{F}) = 0$. Can you find a vector field $\vec{G}(x, y, z)$ such that $\operatorname{curl}(\vec{G}) = \vec{F}$? Such a field $\vec{G}$ is called a vector potential.
Hint. Write $\vec{F}$ as a sum $[x, 0, -z] + [0, y, -z]$ and find vector potentials for each of the parts using a vector field you have seen on the blackboard in class.

Problem 22.3: Evaluate the flux integral $\iint_S [0, 0, yz] \cdot d\vec{S}$, where $S$ is the surface with parametric equation $x = uv$, $y = u + v$, $z = u - v$ on $R : u^2 + v^2 \le 4$ and $u > 0$.

Problem 22.4: Evaluate the flux integral $\iint_S \operatorname{curl}(\vec{F}) \cdot d\vec{S}$ for
$\vec{F}(x, y, z) = [3xy, 3yz, 3zx]$,
where $S$ is the part of the paraboloid $z = 4 - x^2 - y^2$ that lies above the square $[0, 2] \times [0, 2]$ and has an upward orientation.

Problem 22.5: a) What is the relation between the flux of the vector field $\vec{F} = \nabla g / |\nabla g|$ through the surface $S : \{g = 1\}$ with $g(x, y, z) = x^6 + y^4 + 2z^8$ and the surface area of $S$?
b) Find the flux of the vector field $\vec{G} = \nabla g \times [0, 0, 2]$ through the surface $S$. 1

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022

1 Both a) and b) do not need any computation. You can answer each question with one sentence. In part a) compare $\vec{F} \cdot d\vec{S}$ with $dS$ in that case.
MULTIVARIABLE CALCULUS

MATH S-21A

Unit 23: Stokes Theorem

Lecture
23.1. If a surface $S$ is parametrized as $\vec{r}(u, v) = [x(u, v), y(u, v), z(u, v)]$ over a domain $R$ in the uv-plane, the flux integral of $\vec{F}$ through $S$ is defined as the double integral
$$\iint_R \vec{F}(\vec{r}(u, v)) \cdot (\vec{r}_u \times \vec{r}_v)\, dudv .$$

Definition: The boundary of a surface S consists of all points P for which

we do not have an entire disc around P which is contained in S. For the upper
hemisphere for example, the boundary is the equator, a circle. The boundary
of a sphere is empty.

23.2. The boundary of $S$ is a collection of closed curves. Each is oriented so that the surface is to the “left” if the normal vector to the surface is pointing “up”. In other words, the velocity vector $\vec{v}$, a vector $\vec{w}$ pointing towards the surface and the normal vector $\vec{n}$ to the surface are right-handed.

Theorem: Stokes theorem: if $S$ is a surface bounded by a curve $C$ and $\vec{F}$ is a vector field, then
$$\iint_S \operatorname{curl}(\vec{F}) \cdot d\vec{S} = \int_C \vec{F} \cdot d\vec{r} .$$

23.3. Stokes theorem can be verified in the same way as Green’s theorem. Chop up $S$ into a collection of small triangles. As before, the sum of the fluxes through all these triangles adds up to the flux through the surface and the sum of the line integrals along the boundaries adds up to the line integral of the boundary of $S$. Stokes theorem for a small triangle can be reduced to Green’s theorem because, with a coordinate system such that the triangle lies in the xy-plane, the flux of the curl through the triangle is a double integral of $Q_x - P_y$.

Examples
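23.4. Let $\vec{F}(x, y, z) = [-y, x, 0]$ and let $S$ be the upper hemisphere oriented upwards. We have $\operatorname{curl}(\vec{F})(x, y, z) = [0, 0, 2]$. The surface is parameterized by
$$\vec{r}(\phi, \theta) = [\cos(\theta) \sin(\phi), \sin(\theta) \sin(\phi), \cos(\phi)]$$
on $R = [0, \pi/2] \times [0, 2\pi]$ and $\vec{r}_\phi \times \vec{r}_\theta = \sin(\phi)\, \vec{r}(\phi, \theta)$ so that $\operatorname{curl}(\vec{F})(x, y, z) \cdot (\vec{r}_\phi \times \vec{r}_\theta) = 2 \cos(\phi) \sin(\phi) = \sin(2\phi)$. The flux integral is $\int_0^{\pi/2} \int_0^{2\pi} \sin(2\phi)\, d\theta d\phi = 2\pi$.
The boundary $C$ of $S$ is parameterized by $\vec{r}(t) = [\cos(t), \sin(t), 0]$ so that $d\vec{r} = \vec{r}\,'(t) dt = [-\sin(t), \cos(t), 0]\, dt$ and $\int_C \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\, dt = \int_0^{2\pi} \sin^2(t) + \cos^2(t)\, dt = 2\pi$.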
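The two sides of Stokes theorem in 23.4 can be compared numerically. This is a sketch with a plain midpoint rule; the function names and grid sizes are choices for the illustration, and the normal vector $\sin(\phi)\, \vec{r}(\phi, \theta)$ is taken from the computation above.

```python
import math

def flux_of_curl(n_phi=400, n_theta=50):
    """Midpoint-rule flux of curl(F) = [0, 0, 2] through the upper unit hemisphere."""
    curl = (0.0, 0.0, 2.0)
    d_phi = (math.pi / 2) / n_phi
    d_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            r = (math.cos(theta) * math.sin(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(phi))
            # r_phi x r_theta = sin(phi) * r for this parametrization
            normal = tuple(math.sin(phi) * c for c in r)
            total += sum(a * b for a, b in zip(curl, normal)) * d_phi * d_theta
    return total

def boundary_line_integral(n=1000):
    """Midpoint-rule line integral of F = [-y, x, 0] along the equator."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        field = (-y, x, 0.0)
        velocity = (-math.sin(t), math.cos(t), 0.0)
        total += sum(a * b for a, b in zip(field, velocity)) * dt
    return total

flux = flux_of_curl()
circulation = boundary_line_integral()
print(flux, circulation, 2 * math.pi)
```

Both values approximate $2\pi$, the flux with a small discretization error and the line integral essentially exactly, since its integrand is constant.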

23.5. If $S$ is a surface in the xy-plane and $\vec{F} = [P, Q, 0]$ has a zero z-component, then $\operatorname{curl}(\vec{F}) = [0, 0, Q_x - P_y]$ and $\operatorname{curl}(\vec{F}) \cdot d\vec{S} = (Q_x - P_y)\, dxdy$. We see that for a surface which is flat, Stokes theorem is very close to Green’s theorem. If we put the coordinate axes so that the surface is in the xy-plane, then the vector field $\vec{F}$ induces a vector field on the surface such that its 2D-curl is the normal component of $\operatorname{curl}(\vec{F})$. The third component $Q_x - P_y$ of $\operatorname{curl}(\vec{F}) = [R_y - Q_z, P_z - R_x, Q_x - P_y]$ is the two-dimensional curl: $\operatorname{curl}(\vec{F})(\vec{r}(u, v)) \cdot [0, 0, 1] = Q_x - P_y$. If $C$ is the boundary of the surface, then $\iint_S \operatorname{curl}(\vec{F})(\vec{r}(u, v)) \cdot [0, 0, 1]\, dudv = \int_C \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\, dt$.

23.6. For every surface bounded by a curve C, the flux of curl(F⃗ ) through the surface
is the same. Proof: the flux of the curl of a vector field through a surface S depends
only on the boundary of S. Compare this with the earlier statement that for every
curve between two points A, B the line integral of grad(f ) along C is the same. The
line integral of the gradient of a function of a curve C depends only on the end points
of C.

23.7. Electric and magnetic fields are linked by Maxwell equations like $\operatorname{curl}(\vec{E}) = -\frac{1}{c} \dot{\vec{B}}$, an example of a partial differential equation. If a closed wire $C$ bounds a surface $S$, then $\iint_S \vec{B} \cdot d\vec{S}$ is the flux of the magnetic field through $S$. Its change can be related with a voltage using Stokes theorem: $\frac{d}{dt} \iint_S \vec{B} \cdot d\vec{S} = \iint_S \dot{\vec{B}} \cdot d\vec{S} = \iint_S -c\, \operatorname{curl}(\vec{E}) \cdot d\vec{S} = -c \int_C \vec{E} \cdot d\vec{r} = U$, where $U$ is the voltage. If we change the flux of the magnetic field through the wire, then this induces a voltage. The flux can be changed by changing the strength of the magnetic field but also by changing the direction. If we turn a magnet around the wire or the wire inside the magnet, we get an electric voltage. This happens in a power generator, like the alternator in a car. Stokes theorem explains why we can generate electricity from motion.
23.8. The history of Stokes theorem is a bit hazy. 1 A version of Stokes theorem was known by André Ampère in 1825. William Thomson (Lord Kelvin) mentioned the theorem to Stokes in 1850. George Gabriel Stokes (1819-1903), who found parts of the identity earlier, in 1840, formulated it in a prize exam from 1854 in which giving a proof was one of the exam problems. The first published proof of Stokes theorem was provided by Hermann Hankel in 1861.

George Gabriel Stokes William Thomson (Kelvin) André Marie Ampere

1See V. Katz, the History of Stokes theorem, Mathematics Magazine 52, 1979, p 146-156

Homework
This homework is due on Tuesday, 8/2/2022.

Problem 23.1: Find $\int_C \vec{F} \cdot d\vec{r}$, where $\vec{F}(x, y, z) = [3x^2 y, x^3, 3xy]$ and $C$ is the curve obtained by intersecting the hyperbolic paraboloid $z = y^2 - x^2$ with the cylinder $x^2 + y^2 = 1$, oriented counterclockwise as viewed from above.

Problem 23.2: Assume $S$ is the surface $x^{1000} + y^{8000} + z^{1000} = 1000000$ and $\vec{F} = [9 + e^{e^{xyz}}, x^8 yz, 3x - 5y - 6\cos(zx)]$. Explain why $\iint_S \operatorname{curl}(\vec{F}) \cdot d\vec{S} = 0$.

Problem 23.3: Evaluate the flux integral $\iint_S \operatorname{curl}(\vec{F}) \cdot d\vec{S}$, where
2 2 2 2
F⃗ (x, y, z) = [xey z 3 + 2xyzex +z , x + z 2 ex +z , yex +z + zex ]
and where S is the part of the ellipsoid x2 + y 2 /4 + (z + 1)2 = 2, z > 0
oriented so that the normal vector points upwards.

Problem 23.4: Find the line integral $\int_C \vec{F} \cdot d\vec{r}$, where $C$ is the circle of radius 5 in the xz-plane oriented counter clockwise when looking from the point $(0, 1, 0)$ onto the plane and where $\vec{F}$ is the vector field
$\vec{F}(x, y, z) = [x^2 z + x^5, \cos(e^y), -xz^2 + \sin(\sin(z))]$.
Use a convenient surface $S$ which has $C$ as a boundary.

Problem 23.5: Find the flux integral $\iint_S \operatorname{curl}(\vec{F}) \cdot d\vec{S}$, where $\vec{F}(x, y, z) =$

[−y + 2 cos(πy)e2x + z 2 , x2 cos(zπ/2) − π sin(πy)e2x , 2xz]

and S is the surface parametrized by
⃗r(s, t) = [(1 − s1/3 ) cos(t) − 4s2 , (1 − s1/3 ) sin(t), 5s]
with 0 ≤ t ≤ 2π, 0 ≤ s ≤ 1 and oriented so that the normal vectors point
to the outside of the thorn.

Oliver Knill, [email protected], Math S-21a, Harvard Summer School, 2022

MULTIVARIABLE CALCULUS

MATH S-21A

Unit 24: Divergence Theorem

Lecture
24.1. We have already seen the fundamental theorem of line integrals and Stokes
theorem. The divergence theorem completes the list of integral theorems in three
dimensions:

Theorem: Divergence Theorem. Let $E$ be a solid bounded by a surface $S$. The surface $S$ is oriented so that the normal vector points outside. If $\vec{F}$ is a vector field, then
$$\iiint_E \operatorname{div}(\vec{F})\, dV = \iint_S \vec{F} \cdot d\vec{S} .$$

24.2. To see why this is true, take a small box $[x, x + dx] \times [y, y + dy] \times [z, z + dz]$. The flux of $\vec{F} = [P, Q, R]$ through the faces perpendicular to the x-axis is $[\vec{F}(x + dx, y, z) \cdot [1, 0, 0] + \vec{F}(x, y, z) \cdot [-1, 0, 0]]\, dydz = (P(x + dx, y, z) - P(x, y, z))\, dydz \sim P_x\, dxdydz$. Similarly, the flux through the two y-boundaries is $Q_y\, dydxdz$ and the flux through the two z-boundaries is $R_z\, dzdxdy$. The total flux through the faces of the box is $(P_x + Q_y + R_z)\, dxdydz = \operatorname{div}(\vec{F})\, dxdydz$. A general solid can be approximated as a union of small boxes. The sum of the fluxes through all the boxes reduces to the flux through the faces which have no neighboring face: the fluxes through adjacent sides cancel and only the flux through the boundary survives. The sum of all the $\operatorname{div}(\vec{F})\, dxdydz$ is a Riemann sum approximation for the integral $\iiint_E \operatorname{div}(\vec{F})\, dxdydz$. In the limit, when $dx, dy, dz$ all go to zero, we obtain the divergence theorem.

24.3. The theorem explains what divergence means. If we integrate the divergence over a small cube, it is equal to the flux of the field through the boundary of the cube. If this is positive, then more field exits the cube than enters the cube. There is field “generated” inside. The divergence measures the “expansion” of the field.

Examples
24.4. Let $\vec{F}(x, y, z) = [x, y, z]$ and let $S$ be the unit sphere, bounding the unit ball $G$. The divergence of $\vec{F}$ is the constant function $\operatorname{div}(\vec{F}) = 3$ and $\iiint_G \operatorname{div}(\vec{F})\, dV = 3 \cdot 4\pi/3 = 4\pi$. The flux through the boundary is $\iint_S \vec{r} \cdot (\vec{r}_u \times \vec{r}_v)\, dudv = \iint_S |\vec{r}(u, v)|^2 \sin(v)\, dudv = \int_0^\pi \int_0^{2\pi} \sin(v)\, dudv = 4\pi$ also. We see that the divergence theorem allows us to compute the area of the sphere from the volume of the enclosed ball or compute the volume from the surface area.
24.5. What is the flux of the vector field $\vec{F}(x, y, z) = [2x, 3z^2 + y, \sin(x)]$ through the boundary of the solid $G = [0, 3] \times [0, 3] \times [0, 3] \setminus ([0, 3] \times [1, 2] \times [1, 2] \cup [1, 2] \times [0, 3] \times [1, 2] \cup [1, 2] \times [1, 2] \times [0, 3])$, which is a cube where three perpendicular cubic holes have been removed? Solution: Use the divergence theorem: $\operatorname{div}(\vec{F}) = 2 + 1 + 0 = 3$ and so $\iiint_G \operatorname{div}(\vec{F})\, dV = 3 \iiint_G dV = 3 \operatorname{Vol}(G) = 3(27 - 7) = 60$. Note that the flux integral here would be over a complicated surface consisting of dozens of rectangular planar regions.
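The volume count $27 - 7 = 20$ can be checked by enumerating unit cubes. This is a small sketch which assumes, as the description of three perpendicular holes suggests, that each hole is a tunnel of unit cross section through the center of the cube along one coordinate axis; the function name is made up.

```python
from itertools import product

def in_hole(i, j, k):
    """True if the unit cube [i, i+1] x [j, j+1] x [k, k+1] lies in one of
    the three tunnels through the center (along the x-, y- or z-axis)."""
    return (j == 1 and k == 1) or (i == 1 and k == 1) or (i == 1 and j == 1)

# Unit cubes of [0, 3]^3 that remain after drilling the three tunnels.
remaining = [c for c in product(range(3), repeat=3) if not in_hole(*c)]
volume = len(remaining)
print(volume)
```

The three tunnels contain 3 unit cubes each but share the central cube, which is why only 7 cubes are removed. By the divergence theorem, the flux is the (constant) divergence of the field times this volume.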
24.6. Find the flux of curl(F⃗ ) through a torus if F⃗ = [yz 2 , z + sin(x) + y, cos(x)] and
the torus has the parametrization
⃗r(θ, ϕ) = [(2 + cos(ϕ)) cos(θ), (2 + cos(ϕ)) sin(θ), sin(ϕ)] .
Solution: The answer is 0 because the divergence of curl(F⃗ ) is zero. By the divergence
theorem, the flux is zero.
24.7. Just as Green’s theorem allowed us to calculate the area of a region by passing along the boundary, the volume of a region can be computed as a flux integral: Take for example the vector field $\vec{F}(x, y, z) = [x, 0, 0]$ which has divergence 1. The flux of this vector field through the boundary of a solid region is equal to the volume of the solid: $\iint_{\delta G} [x, 0, 0] \cdot d\vec{S} = \operatorname{Vol}(G)$.

24.8. How heavy are we at distance $r$ from the center of the earth?
Solution: The law of gravity can be formulated as $\operatorname{div}(\vec{F}) = 4\pi\rho$, where $\rho$ is the mass density. We assume that the earth is a ball of radius $R$. By rotational symmetry, the gravitational force is normal to the surface: $\vec{F}(\vec{x}) = F(r)\, \vec{x}/\|\vec{x}\|$ with $r = \|\vec{x}\|$. The flux of $\vec{F}$ through a sphere $S_r$ of radius $r$ is $\iint_{S_r} \vec{F}(\vec{x}) \cdot d\vec{S} = 4\pi r^2 F(r)$. By the divergence theorem, this is $4\pi M_r = 4\pi \iiint_{B_r} \rho(\vec{x})\, dV$, where $M_r$ is the mass of the material inside $S_r$. We have $(4\pi)^2 \rho r^3/3 = 4\pi r^2 F(r)$ for $r < R$ and $(4\pi)^2 \rho R^3/3 = 4\pi r^2 F(r)$ for $r \ge R$. Inside the earth, the gravitational force is $F(r) = 4\pi\rho r/3$. Outside the earth, it satisfies $F(r) = M/r^2$ with $M = 4\pi R^3 \rho/3$.
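The two formulas can be combined into one function of $r$. The sketch below uses units in which $\rho = 1$ and $R = 1$, which is an assumption made only for the illustration; the function name is made up as well.

```python
import math

def gravity(r, rho=1.0, R=1.0):
    """Magnitude F(r) of the gravitational force for a ball of constant
    density rho and radius R, following the two cases of 24.8."""
    if r < R:
        # Inside: only the mass within radius r contributes.
        return 4 * math.pi * rho * r / 3
    # Outside: the ball acts like a point mass M at the center.
    M = 4 * math.pi * R**3 * rho / 3
    return M / r**2

for r in (0.25, 0.5, 1.0, 2.0):
    print(r, gravity(r))
```

The force grows linearly up to the surface $r = R$, where both formulas agree, and decays like $1/r^2$ outside.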

24.9. We conclude with an overview of the integral theorems and give another typical example in each case.

The fundamental theorem for line integrals, Green’s theorem, Stokes theorem and divergence theorem are all part of one single theorem $\int_A dF = \int_{\delta A} F$, where $dF$ is an exterior derivative of $F$ and where $\delta A$ is the boundary of $A$. It generalizes the fundamental theorem of calculus.

Fundamental theorem of line integrals: If $C$ is a curve with boundary $\{A, B\}$ and $f$ is a function, then
$$\int_C \nabla f \cdot d\vec{r} = f(B) - f(A) .$$

24.10. Remarks.
1) For closed curves, the line integral $\int_C \nabla f \cdot d\vec{r}$ is zero.
2) Gradient fields are path independent: if $\vec{F} = \nabla f$, then the line integral between two points $P$ and $Q$ does not depend on the path connecting the two points.
3) The line integral theorem holds in any dimension. In one dimension, it reduces to the fundamental theorem of calculus $\int_a^b f'(x)\, dx = f(b) - f(a)$.
4) The theorem justifies the name conservative for gradient vector fields.
5) The term “potential” was coined by George Green who lived from 1793-1841.
24.11. Example. Let $f(x, y, z) = x^2 + y^4 + z$. Find the line integral of the vector field $\vec{F}(x, y, z) = \nabla f(x, y, z)$ along the path $\vec{r}(t) = [\cos(5t), \sin(2t), t^2]$ from $t = 0$ to $t = 2\pi$.

Solution. $\vec{r}(0) = [1, 0, 0]$ and $\vec{r}(2\pi) = [1, 0, 4\pi^2]$ and $f(\vec{r}(0)) = 1$ and $f(\vec{r}(2\pi)) = 1 + 4\pi^2$. The fundamental theorem of line integrals gives $\int_C \nabla f \cdot d\vec{r} = f(\vec{r}(2\pi)) - f(\vec{r}(0)) = 4\pi^2$.
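To see the theorem at work, one can also integrate $\nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t)$ directly. The following sketch does this with a midpoint rule; the function names and the number of steps are choices for the illustration.

```python
import math

def f(x, y, z):
    return x * x + y**4 + z

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = x^2 + y^4 + z."""
    return (2 * x, 4 * y**3, 1.0)

def r(t):
    return (math.cos(5 * t), math.sin(2 * t), t * t)

def r_prime(t):
    return (-5 * math.sin(5 * t), 2 * math.cos(2 * t), 2 * t)

def line_integral(t0, t1, n=20000):
    """Midpoint-rule approximation of the line integral of grad f along r."""
    dt = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * dt
        total += sum(g * v for g, v in zip(grad_f(*r(t)), r_prime(t))) * dt
    return total

numeric = line_integral(0.0, 2 * math.pi)
exact = f(*r(2 * math.pi)) - f(*r(0.0))
print(numeric, exact, 4 * math.pi**2)
```

The direct computation agrees with the difference of the potential values at the end points, which is much cheaper to evaluate.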

Green’s theorem. If $R$ is a region with boundary $C$ and $\vec{F}$ is a vector field, then
$$\iint_R \operatorname{curl}(\vec{F})\, dxdy = \int_C \vec{F} \cdot d\vec{r} .$$

24.12. Remarks.
1) Green’s theorem allows us to switch from double integrals to one-dimensional integrals.
2) The curve is oriented in such a way that the region is to the left.
3) The boundary curve can consist of piecewise smooth pieces.
4) If $C : t \mapsto \vec{r}(t) = [x(t), y(t)]$, the line integral is $\int_a^b [P(x(t), y(t)), Q(x(t), y(t))] \cdot [x'(t), y'(t)]\, dt$.
5) Green’s theorem was found by George Green (1793-1841) in 1827 and by Mikhail Ostrogradski (1801-1862).
6) If $\operatorname{curl}(\vec{F}) = 0$ in a simply connected region, then the line integral along a closed curve is zero. If two curves connect two points then the line integrals along those curves agree.
7) Taking $\vec{F}(x, y) = [-y, 0]$ or $\vec{F}(x, y) = [0, x]$ gives area formulas.

24.13. Example. Find the line integral of the vector field $\vec{F}(x, y) = [x^4 + \sin(x) + y, x + y^3]$ along the path $\vec{r}(t) = [\cos(t), 5\sin(t) + \log(1 + \sin(t))]$, where $t$ runs from $t = 0$ to $t = \pi$.

Solution. $\operatorname{curl}(\vec{F}) = 0$ implies that the line integral depends only on the end points $(1, 0), (-1, 0)$ of the path. Take the simpler path $\vec{r}(t) = [-t, 0]$, $-1 \le t \le 1$, which has velocity $\vec{r}\,'(t) = [-1, 0]$. The line integral is $\int_{-1}^{1} [t^4 - \sin(t), -t] \cdot [-1, 0]\, dt = -t^5/5 |_{-1}^{1} = -2/5$.

Remark. We could also find a potential $f(x, y) = x^5/5 - \cos(x) + xy + y^4/4$. It has the property that $\operatorname{grad}(f) = \vec{F}$. Again, we get $f(-1, 0) - f(1, 0) = -1/5 - 1/5 = -2/5$.

Stokes theorem. If $S$ is a surface with boundary $C$ and $\vec{F}$ is a vector field, then
$$\iint_S \operatorname{curl}(\vec{F}) \cdot d\vec{S} = \int_C \vec{F} \cdot d\vec{r} .$$

24.14. Remarks.
1) Stokes theorem allows us to derive Green’s theorem: if $\vec{F}$ is z-independent and the surface $S$ is contained in the xy-plane, one obtains the result of Green.
2) The orientation of $C$ is such that if you walk along $C$ and have your head in the direction of the normal vector $\vec{r}_u \times \vec{r}_v$, then the surface is to your left.
3) Stokes theorem was found by André Ampère (1775-1836) in 1825 and rediscovered by George Stokes (1819-1903).
4) The flux of the curl of a vector field does not depend on the surface $S$, only on the boundary of $S$.
5) The flux of the curl through a closed surface like the sphere is zero: the boundary of such a surface is empty.

24.15. Example. Compute the line integral of $\vec{F}(x, y, z) = [x^3 + xy, y, z]$ along the polygonal path $C$ connecting the points $(0, 0, 0), (2, 0, 0), (2, 1, 0), (0, 1, 0)$ in that order.
Solution. The path $C$ bounds a surface $S : \vec{r}(u, v) = [u, v, 0]$ parameterized by $R = [0, 2] \times [0, 1]$. By Stokes theorem, the line integral is equal to the flux of $\operatorname{curl}(\vec{F})(x, y, z) = [0, 0, -x]$ through $S$. The normal vector of $S$ is $\vec{r}_u \times \vec{r}_v = [1, 0, 0] \times [0, 1, 0] = [0, 0, 1]$ so that $\iint_S \operatorname{curl}(\vec{F}) \cdot d\vec{S} = \int_0^1 \int_0^2 [0, 0, -u] \cdot [0, 0, 1]\, dudv = \int_0^1 \int_0^2 -u\, dudv = -2$.
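The value can be confirmed by computing the line integral directly, edge by edge. This sketch assumes that the polygonal path is closed, returning from $(0, 1, 0)$ to $(0, 0, 0)$; the helper name and step count are made up for the illustration.

```python
def F(x, y, z):
    """Field of 24.15: [x^3 + xy, y, z]."""
    return (x**3 + x * y, y, z)

def segment_integral(a, b, n=2000):
    """Midpoint-rule line integral of F along the straight segment from a to b."""
    d = tuple(q - p for p, q in zip(a, b))   # direction vector b - a
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        point = tuple(p + s * di for p, di in zip(a, d))
        total += sum(c * di for c, di in zip(F(*point), d)) / n
    return total

# Vertices of the closed polygon (assumed to return to the starting point).
vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
line_integral = sum(
    segment_integral(vertices[i], vertices[(i + 1) % 4]) for i in range(4)
)
print(line_integral)
```

The four edges contribute $4$, $1/2$, $-6$ and $-1/2$, which add up to $-2$ as predicted by Stokes theorem.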

Divergence theorem: If $S$ is the boundary of a region $E$ in space and $\vec{F}$ is a vector field, then
$$\iiint_E \operatorname{div}(\vec{F})\, dV = \iint_S \vec{F} \cdot d\vec{S} .$$

24.16. Remarks.
1) The divergence theorem is also called Gauss theorem.
2) It is useful to determine the flux of vector fields through surfaces.
3) It can be used to compute volume.
4) It was discovered in 1764 by Joseph Louis Lagrange (1736-1813); it was later
rediscovered by Carl Friedrich Gauss (1777-1855) and by George Green.
5) For divergence-free vector fields $\vec{F}$, the flux through a closed surface is zero. Such
fields $\vec{F}$ are also called incompressible or source free.
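To illustrate remark 3, here is a minimal Python sketch that recovers the volume of the unit ball as a flux. The choice of the field $\vec{F} = [x, y, z]/3$ (which has divergence 1) and of the spherical parameterization are assumptions made for this illustration, and `sphere_volume_by_flux` is a hypothetical helper name.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sphere_volume_by_flux(radius=1.0, n=200):
    """Midpoint-rule flux of F = [x, y, z]/3 through the sphere of given radius."""
    d_phi, d_theta = math.pi / n, 2 * math.pi / n
    flux = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        sp, cp = math.sin(phi), math.cos(phi)
        for j in range(n):
            theta = (j + 0.5) * d_theta
            st, ct = math.sin(theta), math.cos(theta)
            r = (radius * sp * ct, radius * sp * st, radius * cp)
            r_phi = (radius * cp * ct, radius * cp * st, -radius * sp)
            r_theta = (-radius * sp * st, radius * sp * ct, 0.0)
            normal = cross(r_phi, r_theta)  # points outward for this parameterization
            F = (r[0] / 3, r[1] / 3, r[2] / 3)  # div(F) = 1
            flux += sum(Fi * ni for Fi, ni in zip(F, normal)) * d_phi * d_theta
    return flux

volume = sphere_volume_by_flux()
print(volume, 4 * math.pi / 3)
```

Since div(F) = 1, the divergence theorem says the flux equals the volume of the enclosed region, so the printed numbers should agree up to the discretization error of the midpoint rule.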

24.17. Example. Compute the flux of the vector field $\vec{F}(x, y, z) = [-x, y, z^2]$ through
the boundary S of the rectangular box [0, 3] × [−1, 2] × [1, 2].
Solution. By Gauss theorem, the flux is equal to the triple integral of div(F) = 2z
over the box: $\int_0^3 \int_{-1}^2 \int_1^2 2z \, dz\,dy\,dx = (3 - 0)(2 - (-1))(4 - 1) = 27$.
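The result can be double-checked by adding up the flux through the six faces of the box directly. The following Python sketch does this with a simple midpoint rule; the helper names `midpoints`, `volume_integral` and `outward_flux` are illustrative assumptions.

```python
def field(x, y, z):
    return (-x, y, z * z)

def divergence(x, y, z):
    # d(-x)/dx + d(y)/dy + d(z^2)/dz = -1 + 1 + 2z
    return 2 * z

BOX = ((0.0, 3.0), (-1.0, 2.0), (1.0, 2.0))

def midpoints(lo, hi, n):
    """Midpoints of n equal subintervals of [lo, hi], together with the step size."""
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h

def volume_integral(g, box, n=20):
    (xs, hx), (ys, hy), (zs, hz) = (midpoints(lo, hi, n) for lo, hi in box)
    return sum(g(x, y, z) for x in xs for y in ys for z in zs) * hx * hy * hz

def outward_flux(F, box, n=20):
    """Sum of the flux of F through the six faces, each with outward normal."""
    total = 0.0
    for axis in range(3):
        a, b = [k for k in range(3) if k != axis]
        us, hu = midpoints(box[a][0], box[a][1], n)
        vs, hv = midpoints(box[b][0], box[b][1], n)
        for side, sign in ((box[axis][0], -1.0), (box[axis][1], 1.0)):
            for u in us:
                for v in vs:
                    p = [0.0, 0.0, 0.0]
                    p[axis], p[a], p[b] = side, u, v
                    total += sign * F(*p)[axis] * hu * hv
    return total

triple_integral = volume_integral(divergence, BOX)
surface_flux = outward_flux(field, BOX)
print(triple_integral, surface_flux)
```

The face contributions are 0 and -9 for x = 0 and x = 3, then 3 and 6 for y = -1 and y = 2, then -9 and 36 for z = 1 and z = 2, which again add up to 27.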

24.18. How do these theorems fit together? In n dimensions, there are n theorems.
We have seen here the situation in dimensions n = 2 and n = 3, but one could continue.
The fundamental theorem of line integrals generalizes directly to higher dimensions.
The divergence theorem also generalizes directly to n dimensions. The generalization
of curl and flux would need more explanation, as already for n = 4 the curl of a vector
field is a 6-dimensional object. It is an n(n − 1)/2-dimensional object in general.

                          1
                   1 --FTC--> 1
            1 --FTL--> 2 --Green--> 1
    1 --FTL--> 3 --Stokes--> 3 --Gauss--> 1

24.19. In one dimension, there is one derivative: f(x) → f′(x). It maps scalar functions
to scalar functions. It corresponds to the entry 1 − 1 in the Pascal triangle. The
next entry 1 − 2 − 1 corresponds to differentiation in two dimensions, where we have
the gradient f → ∇f mapping a scalar function to a vector field with 2 components as
well as the curl F → curl(F), which corresponds to the transition 2 − 1 going back to a
scalar function. The situation in three dimensions is captured by the entry 1 − 3 − 3 − 1
in the Pascal triangle. The first derivative 1 − 3 is the gradient. The second derivative
3 − 3 is the curl and the third derivative 3 − 1 is the divergence, which again gives a
scalar. In n = 4 dimensions, we would have to look at 1 − 4 − 6 − 4 − 1. The first
derivative 1 − 4 is still the gradient. Then we have a first curl, which maps a vector
field with 4 components into an object with 6 components. Then there is a second
curl, which maps an object with 6 components back to a vector field with 4 components.
The last derivative 4 − 1 is the divergence, which again gives a scalar.
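The component counts in this discussion are just binomial coefficients, which a few lines of Python make explicit. This is only an illustrative sketch; the function names `component_counts` and `curl_components` are made up for this example.

```python
from math import comb

def component_counts(n):
    """Row n of the Pascal triangle: number of components at each stage in dimension n."""
    return [comb(n, k) for k in range(n + 1)]

def curl_components(n):
    """Number of components of the curl of a vector field in dimension n."""
    return n * (n - 1) // 2

for n in range(1, 5):
    print(n, component_counts(n), curl_components(n))
```

For n = 3 the row is 1, 3, 3, 1 and the curl has 3 components, while for n = 4 the row is 1, 4, 6, 4, 1 and the curl has 6 components, in agreement with the count n(n − 1)/2.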
24.20. When setting up calculus in dimension n, one talks about differential forms
instead of scalar fields or vector fields. Functions are 0-forms or n-forms. Vector fields
can be described by 1-forms or (n − 1)-forms. The general formalism defines a derivative d
called exterior derivative on differential forms. It maps k-forms to (k+1)-forms. There
is also an integration of k-forms on k-dimensional objects. The boundary operation
δ maps a k-dimensional object into a (k − 1)-dimensional object. This boundary
operation is dual to differentiation. Both satisfy the same relation: dd(F) = 0 and
δδG = 0. Differentiation and integration are linked by the general Stokes theorem:

$$\int_{\delta G} F = \int_G dF$$

24.21. One can see this as a single theorem, the fundamental theorem of multivariable
calculus. The theorem is simpler in quantum calculus, where geometric
objects and fields are on the same footing. There are various ways to generalize
this. One way is to write it as ⟨δG, F⟩ = ⟨G, dF⟩, which in linear algebra
would be written as $[A^T v, w] = [v, Aw]$, where $A^T$ is the transpose of a matrix A and [v, w]
is the dot product. Since in traditional calculus we deal with "smooth" functions and
fields, we have to pay a price and consider in turn "singular" objects like points or
curves and surfaces. These are idealized objects which have zero diameter, radius or
thickness.
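The linear algebra formulation can be tried out on a tiny discrete geometry. The sketch below is an assumption-laden illustration, not the formalism of the course: it takes a single oriented triangle with vertices 0, 1, 2, writes the derivatives as small integer matrices, uses their transposes as boundary operations, and checks both dd = 0 and the pairing ⟨δG, F⟩ = ⟨G, dF⟩. All names (`D0`, `D1`, `matvec`, and so on) are hypothetical.

```python
# Edges of the triangle; each edge (a, b) is oriented from a to b.
EDGES = [(0, 1), (1, 2), (0, 2)]

# D0 maps functions on vertices to functions on edges: (D0 f)(a, b) = f(b) - f(a).
D0 = [[-1, 1, 0],
      [0, -1, 1],
      [-1, 0, 1]]

# D1 maps functions on edges to functions on the triangle,
# whose boundary is edge01 + edge12 - edge02.
D1 = [[1, 1, -1]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

# dd = 0: the "curl" of a "gradient" vanishes.
dd = matmul(D1, D0)

# G is the chain edge01 + edge12, a path from vertex 0 to vertex 2.
G = [1, 1, 0]
boundary_of_G = matvec(transpose(D0), G)   # end point minus start point

# F is a function on the vertices, dF its discrete gradient on the edges.
F = [2, 5, 11]
dF = matvec(D0, F)

lhs = dot(boundary_of_G, F)   # F summed over the boundary of G
rhs = dot(G, dF)              # dF summed over G
print(dd, boundary_of_G, lhs, rhs)
```

Here the boundary is the transpose of the derivative, so the identity is exactly $[A^T v, w] = [v, Aw]$ with A = `D0`. For the path G it reduces to a discrete version of the fundamental theorem of line integrals: F(2) − F(0) = 11 − 2 = 9.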
24.22. So, it is all about geometries and fields. Geometries are curves, surfaces
or solids. Fields are scalar functions or vector fields. Geometries G can be
"differentiated" by taking the boundary δG. Fields F can be differentiated by applying
differential operators dF like grad, curl or div. And then there is integration, which
pairs up geometries G and fields F. The fundamental theorem $\int_{\delta G} F = \int_G dF$ tells
us that taking the boundary on the object is dual to taking the derivative of the field.
24.23. Nature likes simplicity and elegance¹ and therefore found a quantum mathematics
to be more fundamental. But a geometry in which geometries and fields are
indistinguishable manifests only in the very small.

¹ Leibniz: 1646-1716

Homework
This homework is due on Tuesday, 8/2/2022.

Problem 24.1: What is the flux of the vector field $\vec{F}(x, y, z) = [9y, 2xy, 4yz + 234xy]$
through the unit cube [0, 1] × [0, 1] × [0, 1]?

Problem 24.2: Find the flux of the vector field $\vec{F}(x, y, z) = [xy, yz, zx]$
through the cylinder $x^2 + y^2 \le 1$, $0 \le z \le 2$ without the bottom disk
z = 0. Hint: close the surface first and then also find the flux through the
bottom.

Problem 24.3: Use the divergence theorem to calculate the flux of
$\vec{F}(x, y, z) = [x^3, y^3, z^3]$ through the sphere $S: x^2 + y^2 + z^2 = 1$, where the
sphere is oriented so that the normal vector points outwards.

Problem 24.4: Assume the vector field
$\vec{F}(x, y, z) = [15x^3 + 42xy^2, y^3 + e^y \sin(z), e^y \cos(z) + 15z^3]$
is the magnetic field of the sun whose surface is a sphere of radius 3
oriented with the outward orientation. Compute the magnetic flux $\iint_S \vec{F} \cdot d\vec{S}$.

Problem 24.5: Find $\iint_S \vec{F} \cdot d\vec{S}$, where $\vec{F}(x, y, z) = [-150x, 25y, 25z]$
and S is the boundary of the solid built with the 18 cubes shown in the
picture. Each cube has length 1.

Oliver Knill, Math S-21a, Harvard Summer School, 2022
