Partials
A. HAVENS
Contents
0 Functions of Several Variables
  0.1 Functions of Two or More Variables
  0.2 Graphs of Multivariate Functions
  0.3 Contours and Level Sets
  0.4 Real-Valued Functions of Vector Inputs
  0.5 Limits
1 Partial Derivatives
  1.1 Partial Derivatives of Bivariate Functions
  1.2 Partial Derivatives for Functions of Three or More Variables
  1.3 Higher Derivatives
  1.4 Partial Differential Equations
  1.5 The Chain Rule
  1.6 Implicit Differentiation
5 Further Problems
2/21/20 Multivariate Calculus: Multivariable Functions Havens
along a line segment connecting the two bodies. Thus, to properly describe the gravitational
force, we'd need to construct a vector field. This idea will be described later in the course.
What are the level sets, F⁻¹({k}), of the gravitational force? Since objects each of mass m at equal distances should experience the same attractive force towards the central mass, we should expect radially symmetric surfaces as our level sets, i.e., we should expect spheres! Indeed,
$$k = F(\mathbf{r}) = \frac{GMm}{\|\mathbf{r}\|^2} \implies \|\mathbf{r}\|^2 = \frac{GMm}{k}\,,$$
whence the level set for a force of magnitude k is a sphere of radius √(GMm/k).
Exercise 0.3. Write out appropriate set-theoretic definitions of image and pre-image for an n-variable function f(x₁, …, xₙ).

Exercise 0.4. Describe the natural domain of the function f(x, y, z) = 1/(x² + y² − z² − 1) as a subset of ℝ³. What sort of subset is the pre-image f⁻¹({1})?
Figure 2. (A) – The level curves for f(x, y) = √(x² + y²). (B) – The level curves for g(x, y) = √(9 − x² − y²). Warmer colors indicate higher k value in both figures.
Of course, now we can attempt to understand the graphs themselves. The graph of f(x, y) is just a cone: the level curves are just curves of constant distance from (0, 0), and so the z-traces are these concentric circles each lifted to a height equal to its radius. The graph of g(x, y) is the upper hemisphere of a radius-3 sphere centered at (0, 0, 0) ∈ ℝ³: observe that z = √(9 − x² − y²) ⟹ x² + y² + z² = 9, z ≥ 0.
We can also define a notion similar to level curves for an n-variable function f : D → R:
Definition. The set given by the pre-image of a value k ∈ f (D) is called the level set with level k,
and is written
$$f^{-1}(\{k\}) := \{(x_1, \ldots, x_n) \in D \mid f(x_1, \ldots, x_n) = k\}\,.$$
For a “sufficiently nice” three variable function f (x, y, z), the level sets are surfaces with implicit
equations k = f (x, y, z), except at extrema, where one may have collections of points and curves.
Exercise 0.5. Let a ≥ b > 0 be real constants. Give Cartesian or polar equations for the level curves
of the following surfaces in terms of a, b, and z = k. Where relevant, determine any qualitative
differences between the regimes a > b, a = b and a < b. Sketch a sufficient family of level curves to
capture the major features of each of the surfaces, and attempt to sketch the surfaces using a view
which captures the essential features. You may use a graphing calculator or computer as an aid,
but you must show the relevant algebra in obtaining the equations of the contours.
(a) z = √(x² + y² + a²)
(c) z = sin(xy)
Suppose 0 < |α| < 1. What are the level curves? What about for α = 0, α = 1 and α > 1? Sketch
level curves and a surface for each scenario. (Hint: try writing things in polar coordinates; see also
the discussion in section 5.4 of the notes on Curvature, Natural Frames and Acceleration for Space
Curves and problem 23 of those notes.)
§ 0.4. Real-Valued Functions of Vector Inputs
It is often convenient to regard a multivariate function as a map from a set of vectors to the
real numbers. In this sense, we can view multivariable functions as scalar fields over some domain
whose elements are position vectors. E.g., the distance function from the origin for the plane can
be written as the scalar field
$$f(\mathbf{r}) = \|\mathbf{r}\| = \sqrt{\mathbf{r} \cdot \mathbf{r}}\,.$$
Sometimes a multivariable function becomes easier to understand geometrically by writing it in
terms of vector operations such as the dot product and computing magnitudes.
Example. Consider f(x, y) = ax + by for nonzero constants a and b. The graph is a plane, but how do a and b control the plane? If we rewrite f as f(x, y) = a · r, where a = a ı̂ + b ȷ̂ and r = x ı̂ + y ȷ̂, then it is clear that the height z = f(x, y) above the xy-plane in ℝ³ increases most rapidly in the direction of a, and decreases most rapidly in the direction of −a. The contours at height k are necessarily the lines ax + by = k, which are precisely the lines perpendicular to a (observe that such a line may be parameterized as r(t) = t(b ı̂ − a ȷ̂) + (k/b) ȷ̂, which has velocity orthogonal to a). Of course, if we allow either a = 0 or b = 0, we have the case of planes whose level sets are horizontal or vertical lines respectively.
It will often be convenient to write definitions for functions in 3 or more variables using vector notation. For ℝ³ we use the ordered, right-handed basis (ı̂, ȷ̂, k̂), so a point (x, y, z) ∈ ℝ³ corresponds to a position vector x ı̂ + y ȷ̂ + z k̂ = ⟨x, y, z⟩. For ℝⁿ with n ≥ 4, we use (ê₁, ê₂, …, êₙ) as the basis. Occasionally, we'll write a vector r = x₁ê₁ + … + xₙêₙ and view it as a vector both in ℝⁿ and in ℝⁿ⁺¹, where the additional basis element êₙ₊₁ spans the axis perpendicular to our choice of embedded ℝⁿ. This is convenient, e.g., when considering the graph of an n-variable function f(r), the definition of which can now be written
$$G_f = \{\mathbf{x} \in \mathbb{R}^{n+1} \mid \mathbf{x} = \mathbf{r} + f(\mathbf{r})\,\hat{e}_{n+1},\ \mathbf{r} \in \mathrm{Dom}(f)\}\,.$$
§ 0.5. Limits
We review here the definitions of limits and continuity. For examples, see the lecture slides on
Limits and Continuity for Multivariate Functions from February 13, 2020.
Definition. Given a function of two variables f : D → R, D ⊆ R2 such that D contains points
arbitrarily close to a point (a, b), we say that the limit of f (x, y) as (x, y) approaches (a, b) exists
and has value L if and only if for every real number ε > 0 there exists a real number δ > 0 such
that
$$|f(x, y) - L| < \varepsilon \quad \text{whenever} \quad 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta\,.$$
We then write
$$\lim_{(x,y)\to(a,b)} f(x, y) = L\,.$$
Thus, to say that L is the limit of f (x, y) as (x, y) approaches (a, b) we require that for any given
positive “error” ε > 0, we can find a bound δ > 0 on the distance of an input (x, y) from (a, b)
which ensures that the output falls within the error tolerance around L (that is, f (x, y) is no more
than ε away from L).
Another way to understand this is that for any given ε > 0 defining an open metric neighborhood
(L − ε, L + ε) of L on the number line R, we can ensure there is a well defined δ(ε) such that the
image of any (possibly punctured ) open disk of radius r < δ centered at (a, b) is contained in the
ε-neighborhood.
Recall, for functions of a single variable, one has notions of left and right one-sided limits:
$$\lim_{x\to a^-} f(x) \quad \text{and} \quad \lim_{x\to a^+} f(x)\,.$$
But in R2 there’s not merely left and right to worry about; one can approach the point (a, b)
along myriad different paths! The whole limit lim(x,y)→(a,b) f (x, y) = L if and only if the limits
along all paths agree and equal L. To write a limit along a path, we can parameterize the path as
some vector valued function r(t) with r(1) = ⟨a, b⟩, and then we can write
$$\lim_{t\to 1^-} f(\mathbf{r}(t)) = L$$
if for any ε > 0, there is a δ > 0 such that |f (r(t)) − L| < ε whenever 1 − δ < t < 1. Similarly we
may define a “right” limit along r(t), limt→1+ f (r(t)) if r(t) exists and describes a continuous path
for t > 1. The two sided limit along the path is then defined in the natural way:
$$\lim_{t\to 1} f(\mathbf{r}(t)) = L \iff \forall \varepsilon > 0\ \exists \delta > 0 : |f(\mathbf{r}(t)) - L| < \varepsilon \ \text{ whenever }\ 0 < |1 - t| < \delta\,.$$
Using paths gives a way to prove non-existence of a limit: if the limits along different paths
approaching a point (a, b) do not agree, then lim(x,y)→(a,b) f (x, y) does not exist.
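A numerical sketch can make this vivid. The function f(x, y) = xy/(x² + y²) is a standard example of this phenomenon (chosen here for illustration; it is not discussed above): its limits along the lines y = mx through the origin depend on m, so the whole limit at (0, 0) does not exist.

```python
import math

# Classic example: f(x, y) = x*y / (x^2 + y^2) has no limit at (0, 0),
# since its limits along the paths y = m*x disagree.
def f(x, y):
    return x * y / (x**2 + y**2)

def limit_along_line(m, t=1e-8):
    # Approach (0, 0) along the parameterized path r(t) = (t, m*t);
    # algebraically f(t, m*t) = m / (1 + m^2) for every t != 0.
    return f(t, m * t)

print(limit_along_line(1.0))   # ~0.5  (along y = x)
print(limit_along_line(-1.0))  # ~-0.5 (along y = -x)
```

Since the two path limits differ, the two-variable limit fails to exist, exactly as described above.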
Polynomials in two variables are continuous on all of R2 . Recall a polynomial in two variables is
a function of the form
$$p(x, y) = \sum_{i=0}^{m}\sum_{j=0}^{n} a_{ij}\,x^i y^j = a_{00} + a_{10}x + a_{01}y + a_{11}xy + a_{21}x^2y + \ldots + a_{mn}x^m y^n\,.$$
Rational functions are also continuous on their domains. Rational functions of two variables are just quotients of two-variable polynomials R(x, y) = p(x, y)/q(x, y). Observe that Dom(p(x, y)/q(x, y)) = {(x, y) ∈ ℝ² : q(x, y) ≠ 0}.
1. Partial Derivatives
§ 1.1. Partial Derivatives of Bivariate Functions
Consider a bivariate function f : D → ℝ, and assume f is continuous. We will use the geometry of the graph to study how the function changes with respect to changes in the input variables. Let z = f(x, y) be the height of the surface of the graph of f. Consider the planes x = x₀ and y = y₀, which intersect the graph surface in a pair of curves: C₂ in the plane x = x₀ and C₁ in the plane y = y₀.
Definition. The partial derivative of the two variable function f (x, y) at a point (x0 , y0 ) with
respect to x, denoted variously
$$\left.\frac{\partial f}{\partial x}\right|_{(x_0, y_0)} = \partial_x f(x_0, y_0) = f_x(x_0, y_0) = D_x f(x_0, y_0)$$
is the value of the slope of the tangent line to the curve C₁ in the vertical plane y = y₀ at the point P(x₀, y₀, f(x₀, y₀)), which is given by
$$\partial_x f(x_0, y_0) := \lim_{h\to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}\,.$$
Similarly one defines
$$\left.\frac{\partial f}{\partial y}\right|_{(x_0, y_0)} = \partial_y f(x_0, y_0) = f_y(x_0, y_0) = D_y f(x_0, y_0) = \lim_{h\to 0} \frac{f(x_0, y_0 + h) - f(x_0, y_0)}{h}\,,$$
which is the slope of the tangent line to the curve C₂ in the vertical plane x = x₀ at the point P(x₀, y₀, f(x₀, y₀)).
Definition. The first order partial derivative functions, or simply, first partial derivatives, of f (x, y)
are the functions
$$f_x(x, y) = \lim_{h\to 0} \frac{f(x + h, y) - f(x, y)}{h}\,, \qquad f_y(x, y) = \lim_{h\to 0} \frac{f(x, y + h) - f(x, y)}{h}\,.$$
It follows straightforwardly from the definitions that to compute the partial derivative functions,
one only has to differentiate the function f (x, y) with respect to the chosen variable, while treating
the other variable as a constant. Partial derivatives obey the usual derivative rules, such as the
power rule, product rule, quotient rule, and chain rule. We’ll discuss the chain rule in detail soon.
Now, we’ll examine how some of the rules interact for partial derivatives, through examples.
Example. Compute the first order partial derivatives fx (x, y) and fy (x, y) for the function f (x, y) =
x3 y 2 + x cos(xy).
Solution: When we consider the first term x³y², though it is a product of variables, the partial derivative operator ∂/∂x sees only a constant times a power of x, so
$$\frac{\partial}{\partial x}\left(x^3y^2\right) = 3x^2y^2\,.$$
For the term x cos(xy), though the y is treated as a constant, we still employ a power rule and chain rule for x to get
$$\frac{\partial}{\partial x}\big(x\cos(xy)\big) = \cos(xy) - xy\sin(xy)\,.$$
Since the derivative of a sum of functions is still the sum of the derivatives of each function, we obtain
$$f_x(x, y) = \frac{\partial}{\partial x}\left(x^3y^2 + x\cos(xy)\right) = \frac{\partial}{\partial x}\left(x^3y^2\right) + \frac{\partial}{\partial x}\big(x\cos(xy)\big) = 3x^2y^2 + \cos(xy) - xy\sin(xy)\,.$$
For the y-partial one obtains, by similar reasoning,
$$f_y(x, y) = 2x^3y - x^2\sin(xy)\,.$$
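As a sanity check, the computed partials can be compared against central-difference approximations; the following Python sketch (illustrative only, with an arbitrarily chosen test point) does exactly that:

```python
import math

def f(x, y):
    return x**3 * y**2 + x * math.cos(x * y)

# The closed forms computed in the example above.
def fx(x, y):
    return 3 * x**2 * y**2 + math.cos(x * y) - x * y * math.sin(x * y)

def fy(x, y):
    return 2 * x**3 * y - x**2 * math.sin(x * y)

def partial(g, x, y, var, h=1e-6):
    # Central-difference approximation of a first partial derivative.
    if var == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.3, -0.7  # arbitrary test point
print(abs(partial(f, x0, y0, "x") - fx(x0, y0)) < 1e-5)  # True
print(abs(partial(f, x0, y0, "y") - fy(x0, y0)) < 1e-5)  # True
```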
Exercise 1.1. Find the first order partial derivatives fx and fy for f(x, y) = √(x² + y²).
Exercise 1.2. Verify the following derivative rules from the limit definitions, assuming the existence
of derivatives as necessary:
(i.) For f(x, y) and g(x, y),
$$\partial_x\big(f(x, y) + g(x, y)\big) = \partial_x f(x, y) + \partial_x g(x, y)\,;$$
(ii.) For f(x, y) and k(y),
$$\partial_x\big(k(y)f(x, y)\big) = k(y)\,\partial_x f(x, y)\,;$$
(iii.) For f(x, y) and g(x, y),
$$\partial_x\big(f(x, y)g(x, y)\big) = \big(\partial_x f(x, y)\big)g(x, y) + f(x, y)\big(\partial_x g(x, y)\big)\,.$$
Solution:
$$\frac{\partial f}{\partial x} = \frac{-x}{\sqrt{1 - x^2 - y^2 + z^2}}\,, \qquad \frac{\partial f}{\partial y} = \frac{-y}{\sqrt{1 - x^2 - y^2 + z^2}}\,, \qquad \frac{\partial f}{\partial z} = \frac{+z}{\sqrt{1 - x^2 - y^2 + z^2}}\,.$$
Exercise 1.3. Find the first order partial derivatives fx , fy , and fz for f (x, y, z) = yz cos(x − z) −
xz sin(y − z) + xy sin(z).
Exercise 1.4. Find the x and y partial derivatives of z = arcsin(y/√(x² + y²)) by writing sin z = y/√(x² + y²) and differentiating implicitly. Express the final answers as functions of x and y only.
Solution: Observe that the function is undefined along the line x = 0. Its graph is the portion of the helicoid² surface shown in figure 4.
The first partial derivatives are:
$$f_x(x, y) = \frac{-y}{x^2}\cdot\frac{1}{1 + y^2/x^2} = \frac{-y}{x^2 + y^2}\,, \quad x \neq 0\,,$$
$$f_y(x, y) = \frac{1}{x}\cdot\frac{1}{1 + y^2/x^2} = \frac{x}{x^2 + y^2}\,, \quad x \neq 0\,.$$
To compute the second partial derivatives, we merely differentiate the above functions with respect to either x or y:
$$f_{xx}(x, y) = \partial_x\left(\frac{-y}{x^2 + y^2}\right) = \frac{2xy}{(x^2 + y^2)^2}\,, \quad x \neq 0\,,$$
$$f_{xy}(x, y) = \partial_y\left(\frac{-y}{x^2 + y^2}\right) = \frac{-x^2 - y^2 + 2y^2}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}\,, \quad x \neq 0\,,$$
$$f_{yy}(x, y) = \partial_y\left(\frac{x}{x^2 + y^2}\right) = \frac{-2xy}{(x^2 + y^2)^2}\,, \quad x \neq 0\,,$$
$$f_{yx}(x, y) = \partial_x\left(\frac{x}{x^2 + y^2}\right) = \frac{x^2 + y^2 - 2x^2}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}\,, \quad x \neq 0\,.$$
Observe that f_xy(x, y) = f_yx(x, y).
The graphs of the first and second partial derivative functions are shown below in figures 5, 6
and 7.
Figure 5. (A) – The graph of fx (x, y) = −y/r2 for f (x, y) = arctan(y/x). (B) –
The graph of fy (x, y) = x/r2 for f (x, y) = arctan(y/x).
²A helicoid is a surface swept out by revolving a line around an axis as you slide it along the axis. Stacking the graphs of the functions z_k = arctan(y/x) + kπ for k ∈ ℤ, and filling in the z-axis and the lines x = 0, z = kπ, gives an entire helicoid. It can also be parameterized as the surface σ(u, v) = ⟨u cos v, u sin v, v⟩, for u ∈ ℝ and v ∈ ℝ.
Figure 6. (A) – The graph of fxx (x, y) for f (x, y) = arctan(y/x). (B) – The graph
of fyy (x, y) for f (x, y) = arctan(y/x).
Figure 7. The graph of fxy (x, y) = fyx (x, y) for f (x, y) = arctan(y/x).
Exercise 1.5. Rewrite the first and second partial derivatives of f (x, y) = arctan(y/x) in polar
coordinates, and use the polar expressions to explain the symmetries visible in the above graphs of
the partial derivative surfaces.
The equality of the mixed partial derivatives in the preceding example was not pure serendipity:
the functions fxy and fyx are rational, and thus continuous on their domains. The following theorem,
which has a long history of faulty proofs, states that we can expect such equality under suitable
continuity conditions on the mixed partial derivatives:
Theorem (Clairaut-Schwarz Theorem). If the mixed partial derivative functions fxy and fyx are
continuous on a disk D containing the point (x0 , y0 ) in its interior, then fxy (x0 , y0 ) = fyx (x0 , y0 ).
Here is an interpretation of the Clairaut-Schwarz theorem: recall that the partial derivative
function fx (x, y) can be interpreted as the result of measuring slopes of tangent lines along curves
parallel to the x-axis, cut in the graph by planes of constant y value. Then fxy (x, y) measures how
those slopes change as we slide the cutting plane in the ±̂ direction (i.e., parallel to the y-axis).
The other mixed partial fyx (x, y) measures how the slopes of tangent lines along curves parallel to
the y-axis, cut in the graph by planes of constant y value change as we slide the cutting planes in
the ±ı̂ direction. Clairaut-Schwarz then says these must be equal at a point (x, y) if, at and around
that point, both rates of change are well defined and continuous. One way to prove it is to consider
a tiny square with sides parallel to the coordinate axes, and look at how the function changes along
the edges of the square. One can form difference quotients whose limits as the square shrinks give
second partial derivatives. Apply the mean value theorem and carefully examine the limits as the
square shrinks!
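One can also probe the theorem numerically. The Python sketch below (illustrative only; the test point is arbitrary) approximates f_xy and f_yx for f(x, y) = arctan(y/x) by nested central differences and compares both with the closed form (y² − x²)/(x² + y²)² computed earlier:

```python
import math

# Numerical check of Clairaut-Schwarz for f(x, y) = arctan(y/x), x > 0.
def f(x, y):
    return math.atan(y / x)

def fx_num(x, y, h=1e-4):
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy_num(x, y, h=1e-4):
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def fxy_num(x, y, h=1e-4):
    # y-derivative of the numerical f_x
    return (fx_num(x, y + h) - fx_num(x, y - h)) / (2 * h)

def fyx_num(x, y, h=1e-4):
    # x-derivative of the numerical f_y
    return (fy_num(x + h, y) - fy_num(x - h, y)) / (2 * h)

x0, y0 = 0.8, 0.5  # arbitrary test point with x > 0
exact = (y0**2 - x0**2) / (x0**2 + y0**2)**2
print(abs(fxy_num(x0, y0) - fyx_num(x0, y0)) < 1e-5)  # True
print(abs(fxy_num(x0, y0) - exact) < 1e-4)            # True
```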
Exercise 1.6. Compute the partial derivatives ∂f/∂x, ∂f/∂y, ∂²f/∂x², ∂²f/∂y∂x, ∂²f/∂y², and ∂²f/∂x∂y for the following functions:
(a) f(x, y) = ln √(x² + xy + y²)
(b) f(x, y) = e^{x cos y} sin(xy)
(c) f(x, y) = x^{2y²} − y^{2x²}
(d) f(x, y) = ∫_{x²+y²}^{xy} e^{−t²} dt
In each case, verify the equality of the mixed partials for the domains where the mixed partials are
continuous.
Exercise 1.7. Let f(x, y) = ln((1 − xy)/(1 + xy)).
(a) Describe the natural domain of f(x, y) as a subset of ℝ², and sketch it.
(b) Describe level curves of f (x, y) algebraically, and include a sketch of them.
(c) Compute fx and fy . Hint: use properties of logarithms to simplify before differentiating.
(d) Show that fxy = fyx throughout the domain of f .
Exercise 1.9. Compute the second order partial derivatives fxx, fxy, fxz, fyy, fyx, fyz, fzz, fzx, and fzy for f(x, y, z) = xyz/(x² + y² + z²). Make note of which pairs of mixed partials are equal.
The expression ∇²u(x, y, z) = u_xx(x, y, z) + u_yy(x, y, z) + u_zz(x, y, z) is called the Laplacian of u; the Laplacian operator
$$\nabla^2 := \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$
appears in many partial differential equations. We may think of harmonic functions as those in the kernel of the Laplacian, i.e., as the functions on which the Laplacian operator vanishes.
Example. Show that u(x, y) = e−x cos y is harmonic.
Solution: We merely compute the partial derivatives and check that u satisfies Laplace's equation:
$$u_x(x, y) = -e^{-x}\cos y\,, \qquad u_{xx}(x, y) = e^{-x}\cos y\,,$$
$$u_y(x, y) = -e^{-x}\sin y\,, \qquad u_{yy}(x, y) = -e^{-x}\cos y\,,$$
$$u_{xx} + u_{yy} = e^{-x}\cos y - e^{-x}\cos y = 0\,.$$
Thus u(x, y) = e^{−x} cos y is harmonic.
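The computation can be corroborated numerically with a five-point-stencil approximation of the Laplacian; the following Python sketch (illustrative only) checks that ∇²u vanishes, up to discretization error, at a few sample points:

```python
import math

def u(x, y):
    return math.exp(-x) * math.cos(y)

def laplacian(g, x, y, h=1e-4):
    # Five-point-stencil approximation of u_xx + u_yy.
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)

# The discrete Laplacian is ~0 at arbitrary sample points.
for (x0, y0) in [(0.0, 0.0), (1.0, 2.0), (-0.5, 0.7)]:
    print(abs(laplacian(u, x0, y0)) < 1e-6)  # True
```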
Exercise 1.10. Let v(x, y) = (x2 − y 2 )e−y cos x − 2xye−y sin(x). Is v harmonic?
The wave equation furnishes another example of an important partial differential equation in the
physical sciences. A function u(x, t) satisfies the wave equation in one spatial dimension and one
time variable t if
utt = a2 uxx ,
where a is a positive constant representing the speed of propagation of the wave. This is called the
1+1 dimensional wave equation. The 3+1 dimensional wave equation (for a scalar wave propagating
in R3 as time advances) can be expressed using the Laplacian:
$$\frac{\partial^2 u}{\partial t^2}(\mathbf{r}, t) = a^2 \nabla^2 u(\mathbf{r}, t)\,,$$
where r = x ı̂ + y ȷ̂ + z k̂ is the spatial position and t is the time variable. One can also define a
vector valued version of the wave equation, as is needed to study electromagnetic waves. To do so,
one needs a vector Laplacian operator ; we leave this digression for our future study of the calculus
of vector fields.
Exercise 1.11. Let a be a positive constant. Show that u(x, t) = cos(x − at) satisfies the (1 + 1)D
wave equation utt = a2 uxx .
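A quick numerical probe of this claim (a Python sketch, with an arbitrarily chosen speed a and test point, not a substitute for the analytic verification the exercise asks for):

```python
import math

a = 2.0  # propagation speed; any positive constant works

def u(x, t):
    return math.cos(x - a * t)

def u_tt(x, t, h=1e-4):
    # Second-order central difference in t.
    return (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / (h * h)

def u_xx(x, t, h=1e-4):
    # Second-order central difference in x.
    return (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)

x0, t0 = 0.3, 1.1  # arbitrary test point
print(abs(u_tt(x0, t0) - a**2 * u_xx(x0, t0)) < 1e-5)  # True
```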
³A Riemannian manifold is a space that looks locally like Euclidean space, together with a notion of something
like a dot product. The spaces may be globally quite complex, requiring many patches that glue together, with nice
conditions on how they overlap. Riemannian geometry gives a natural context in which to study intrinsic geometry,
such as distances, curvature, and variational problems, on spaces that may be globally topologically unlike Rn , except
that they locally have the right structure to perform calculus.
Another famous partial differential equation is the heat equation, also called the diffusion equa-
tion. In one spatial variable x and one time variable t, the equation reads
ut = αuxx ,
where α > 0 is the thermal diffusivity or simply the diffusivity. A solution function u is either a
temperature function, or represents a concentration as a function of space and time, subject to
a diffusion process. There are more elaborate heat and diffusion equations, and as with Laplace's
equation, one can generalize them to higher dimensions and other spaces. For example, we may use
the 3D Laplacian operator to write a heat/diffusion equation in three space variables and one time
variable:
$$\frac{\partial u}{\partial t}(x, y, z, t) = \alpha \nabla^2 u = \alpha\big(u_{xx}(x, y, z) + u_{yy}(x, y, z) + u_{zz}(x, y, z)\big)\,.$$
Exercise 1.12. Let
$$u(x, t) = e^{-t/2}\sin x + \frac{e^{-x^2/2t}}{\sqrt{2t}}\,.$$
Does the function u(x, t) satisfy the heat equation for some constant α > 0?
This is of course something one encounters even when computing partial derivatives of simple
examples, such as for a function like f (x, y) = sin(xy).
Exercise 1.13. Realize sin(xy) in the form g(h(x, y)) for some g and h, and compute the first and second partial derivatives fx, fy, fxx, fxy = fyx, and fyy, writing your solutions so as to make the chain rule explicitly clear.
One can of course write down such a chain rule in any number of variables:
Proposition. Let f : E → ℝ be a real differentiable function of one variable on a domain E ⊂ ℝ, and suppose g(x₁, …, xₙ) = g(r) is a function of n ≥ 2 variables such that the first partials g_{xᵢ} exist and are continuous on D ⊆ ℝⁿ, with image g(D) ⊆ E. Then the partials ∂/∂xᵢ f(g(r)) exist for all r ∈ D and are given by
$$\frac{\partial}{\partial x_i} f\big(g(\mathbf{r})\big) = \left.\frac{df}{dg}\right|_{g(\mathbf{r})} \left.\frac{\partial g}{\partial x_i}\right|_{\mathbf{r}} = f'\big(g(\mathbf{r})\big)\, g_{x_i}(\mathbf{r})\,.$$
In the next simplest scenario, a set of variables x1 , . . . , xn are determined as functions of a single
parameter t, and then input into a function of multiple variables. This corresponds geometrically
to asking about the change in the value of the multivariate function along a parameterized curve.
We describe first the bivariate case:
Proposition. Let x(t) and y(t) be differentiable functions of t ∈ E ⊂ R, such that the images
(x(t), y(t)) are contained in the domain D ⊆ R2 of a bivariate function f : D → R. Suppose further
that the partials fx and fy exist and are continuous along the image curve. Then
$$\frac{d}{dt} f\big(x(t), y(t)\big) = f_x\big(x(t), y(t)\big)\,\dot{x}(t) + f_y\big(x(t), y(t)\big)\,\dot{y}(t)\,,$$
where a dot above x or y indicates the usual derivative with respect to t. Thinking of z = f(x, y) as the height of the graph of the function above the xy-plane in ℝ³, one can write
$$\dot{z}(t) = \frac{d}{dt} f\big(x(t), y(t)\big) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}\,,$$
where it is understood that ∂_x f and ∂_y f are evaluated at (x(t), y(t)), and dx/dt = ẋ and dy/dt = ẏ are likewise evaluated at t.
More generally:
Proposition. If r(t) = ⟨x₁(t), …, xₙ(t)⟩ is a differentiable curve with image contained in the domain D of a function f(x₁, …, xₙ), and all the first partial derivatives of f exist and are continuous along r(t), then
$$\frac{d}{dt} f\big(\mathbf{r}(t)\big) = \sum_{k=1}^{n} \left.\frac{\partial f}{\partial x_k}\right|_{\mathbf{r}(t)} \frac{dx_k}{dt} = \dot{x}_1\,\partial_{x_1} f\big(\mathbf{r}(t)\big) + \ldots + \dot{x}_n\,\partial_{x_n} f\big(\mathbf{r}(t)\big)\,.$$
Example. Let f(x, y) = x² + 4y², and let r(t) = cos(t) ı̂ + sin(t) ȷ̂ parameterize the unit circle in the xy-plane. Find the derivative ż(t) for z = f(x(t), y(t)), and interpret this derivative geometrically.
Solution: There are two routes of solution. One is to substitute the parametric equations of the
curve (namely, the component functions x(t) = cos t and y(t) = sin t) into f (x, y), thus reducing
the problem to a straightforward derivative from a first course in differential calculus. The other
option is to employ the chain rule. We’ll show both methods, beginning with the chain rule.
According to the proposition above, the derivative ż(t) is given by
$$\dot{z}(t) = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = 2x(t)\dot{x}(t) + 8y(t)\dot{y}(t) = -2\cos t\sin t + 8\sin t\cos t = 6\sin t\cos t = 3\sin 2t\,.$$
Alternatively, we compute z(t) = f(cos t, sin t) = cos²t + 4 sin²t, and
$$\dot{z}(t) = -2\cos t\sin t + 8\sin t\cos t = 3\sin 2t\,,$$
as before.
Geometrically, the image in R3 of f (r(t)) on the graph is a loop on the elliptic paraboloid
z = x2 + 4y 2 , and ż is the rate of change of the height along this loop as the parameter t advances,
see figure 8. Thinking of t as describing a particle, we can think of ż as the vertical component of
its velocity. The chain rule then tells us that this is computable as a dot product:
$$\dot{z}(t) = \big\langle f_x(\mathbf{r}(t)),\ f_y(\mathbf{r}(t)) \big\rangle \cdot \dot{\mathbf{r}}(t)\,.$$
The vector ⟨fx(r(t)), fy(r(t))⟩ is an example of what we will call a gradient vector (in this case, it's evaluated along the curve). We'll discuss gradients in greater depth in §3.2 of these notes.
Next we consider the case when we replace the variables of a bivariate function with bivariate
functions, an application of which will be the study of derivatives of bivariate functions after a
coordinate transformation.
Let u and v be variables, and in the uv-plane, let E be some domain such that we can define functions g(u, v) and h(u, v) whose first partials all exist and are all continuous throughout E. Then there is a multivariate transformation from E to the region R = (x(E), y(E)) = {(x, y) = (g(u, v), h(u, v)) | (u, v) ∈ E} ⊂ ℝ², defined by setting x(u, v) = g(u, v) and y(u, v) = h(u, v). Let D be the domain of a function f(x, y) such that R ⊆ D, and suppose the first partials of f exist and are continuous throughout R. Then we have chain rules specifying the u and v partial derivatives of f(x(u, v), y(u, v)):
$$\frac{\partial f}{\partial u} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u}\,,$$
$$\frac{\partial f}{\partial v} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v}\,.$$
A simple application is computing partial derivatives of a Cartesian bivariate function with
respect to polar coordinate variables. Recall, the Cartesian variables (x, y) can be expressed as
functions of the polar variables r and θ, via elementary trigonometry:
x = r cos θ ,
y = r sin θ .
Example. Let f (x, y) = 3x2 − 2y 2 . Compute ∂r f and ∂θ f , and express the resulting functions in
both in terms of polar variables and in terms of Cartesian variables.
Solution: At each step, we will express things using both coordinate systems, so that we can
express the final answers in either coordinate system. First, observe that we have the four first
partial derivatives of x and y with respect to r and θ:
$$\frac{\partial x}{\partial r} = \cos\theta = \frac{x}{r} = \frac{x}{\sqrt{x^2 + y^2}}\,, \qquad \frac{\partial x}{\partial \theta} = -r\sin\theta = -y\,,$$
$$\frac{\partial y}{\partial r} = \sin\theta = \frac{y}{r} = \frac{y}{\sqrt{x^2 + y^2}}\,, \qquad \frac{\partial y}{\partial \theta} = r\cos\theta = x\,.$$
Next, we compute the x and y partials of f (x, y):
$$\frac{\partial f}{\partial x} = 6x = 6r\cos\theta\,, \qquad \frac{\partial f}{\partial y} = -4y = -4r\sin\theta\,.$$
Assembling these via the chain rule:
$$\frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = \frac{6x^2 - 4y^2}{\sqrt{x^2 + y^2}} = 6r\cos^2\theta - 4r\sin^2\theta\,,$$
$$\frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} = (6x)(-y) + (-4y)(x) = -10xy = -10r^2\cos\theta\sin\theta = -5r^2\sin(2\theta)\,.$$
Observe that if we rewrite f in terms of polar coordinates, we have
f (r, θ) = 3r2 cos2 θ − 2r2 sin2 θ ,
from which we can directly compute ∂r f and ∂θ f without the chain rule.
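The polar partials can be verified numerically as well; in the following Python sketch (illustrative only, arbitrary test point), ∂f/∂r and ∂f/∂θ are approximated by central differences of f(r cos θ, r sin θ) and compared with the formulas above:

```python
import math

def f(x, y):
    return 3 * x**2 - 2 * y**2

def f_polar(r, theta):
    # f composed with the polar substitution.
    return f(r * math.cos(theta), r * math.sin(theta))

r0, th0 = 1.7, 0.6  # arbitrary test point
h = 1e-6

fr_numeric = (f_polar(r0 + h, th0) - f_polar(r0 - h, th0)) / (2 * h)
ft_numeric = (f_polar(r0, th0 + h) - f_polar(r0, th0 - h)) / (2 * h)

fr_exact = 6 * r0 * math.cos(th0)**2 - 4 * r0 * math.sin(th0)**2
ft_exact = -5 * r0**2 * math.sin(2 * th0)

print(abs(fr_numeric - fr_exact) < 1e-8)  # True
print(abs(ft_numeric - ft_exact) < 1e-7)  # True
```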
Remark. The computations of ∂u f and ∂v f from ∂x f , ∂y f , ∂u x, ∂u y, ∂v x and ∂v y may be neatly
encoded by a matrix vector product. Observe that in general
$$\begin{bmatrix} \partial_u f \\ \partial_v f \end{bmatrix} = \begin{bmatrix} \partial_u x & \partial_u y \\ \partial_v x & \partial_v y \end{bmatrix}\begin{bmatrix} \partial_x f \\ \partial_y f \end{bmatrix}\,.$$
Note that the column vector that gets transformed by the matrix vector product is none other than
the column vector form of the gradient of f introduced above. The transpose of the square matrix
above,
$$\frac{\partial(x, y)}{\partial(u, v)} = \begin{bmatrix} \partial_u x & \partial_v x \\ \partial_u y & \partial_v y \end{bmatrix}\,,$$
is called the Jacobian matrix of the transformation. The Jacobian evaluated at a point gives a matrix
representing the linear map best approximating the coordinate transformation in a neighborhood
of that point. The Jacobian determinant of a transformation is important in the theory of change
of variables for multiple integrals. In a sense, the gradient, as a row vector, is also a Jacobian. We
will refer to the notion of the derivative object giving the best linear approximation of a map as
the Jacobian derivative of the map.
The appeal of the matrix expression of the chain rule is deeper than the mere convenience of the notation. When our multivariable chain rule arises from a change of coordinates, we can express things neatly in the language of Jacobians. Let $D_{x,y}f = \begin{bmatrix}\partial_x f & \partial_y f\end{bmatrix}$ be the Jacobian derivative of f with respect to (x, y)-coordinates, and let $D_{u,v}f = \begin{bmatrix}\partial_u f & \partial_v f\end{bmatrix}$ be the Jacobian derivative of f with respect to (u, v)-coordinates. Denote by G the transformation (u, v) ↦ (x, y), and $D_{u,v}G = \frac{\partial(x, y)}{\partial(u, v)}$. Then, using the transposes, the chain rule becomes
$$D_{u,v}(f \circ G)(u, v) = D_{x,y}f\big(G(u, v)\big) \circ D_{u,v}G(u, v)\,,$$
where “◦” on the right hand side can be interpreted as composition of linear maps, which is
just matrix multiplication (in this case, the row vector given by the (x, y)-gradient of f , evalu-
ated at G(u, v) = hx(u, v), y(u, v)i, acts on the Jacobian matrix of the transformation Du,v G =
∂(x, y)/∂(u, v), again evaluated for the ordered pair (u, v)). This allows us to rephrase the chain
rule: the Jacobian derivative of a composition of differentiable functions is the composition of their
Jacobian derivative maps.
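The matrix form can be made concrete with the polar example from earlier: the Python sketch below (illustrative; the helper names are ad hoc) multiplies the row vector D_{x,y}f by the Jacobian matrix ∂(x, y)/∂(r, θ) and recovers ∂f/∂r and ∂f/∂θ for f(x, y) = 3x² − 2y².

```python
import math

def gradient_xy(x, y):
    # Row vector D_{x,y} f for f(x, y) = 3x^2 - 2y^2.
    return [6 * x, -4 * y]

def jacobian_G(r, theta):
    # Matrix d(x, y)/d(r, theta); first row is the x-row, second the y-row.
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def row_times_matrix(row, M):
    # (1x2 row vector) * (2x2 matrix) -> 1x2 row vector.
    return [sum(row[i] * M[i][j] for i in range(2)) for j in range(2)]

r0, th0 = 1.7, 0.6  # arbitrary test point
x0, y0 = r0 * math.cos(th0), r0 * math.sin(th0)
fr, ftheta = row_times_matrix(gradient_xy(x0, y0), jacobian_G(r0, th0))

print(abs(fr - (6 * r0 * math.cos(th0)**2 - 4 * r0 * math.sin(th0)**2)) < 1e-12)  # True
print(abs(ftheta - (-5 * r0**2 * math.sin(2 * th0))) < 1e-12)  # True
```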
We will now state a general chain rule for functions of many variables.
Proposition. Suppose f(x₁, …, xₙ) has continuous partial derivatives ∂_{xᵢ}f on a domain D ⊆ ℝⁿ, and the variables x₁, …, xₙ are given as multivariate functions of variables u₁, …, uₘ. If the partial derivatives ∂xᵢ/∂uⱼ exist and are continuous, then
$$\frac{\partial f}{\partial u_j} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\frac{\partial x_i}{\partial u_j} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial u_j} + \ldots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial u_j}\,.$$
Exercise 1.14. Appropriately rephrase the above general chain rule in terms of Jacobian deriva-
tives, in keeping with the philosophy that the chain rule should be expressible as “the Jacobian
derivative of a composition of differentiable functions is the composition of their Jacobian derivative
maps.” In particular you should define Jacobians for maps involved, and write out what the matrix
products look like in the general case. Be sure to see that their dimensions are compatible!
Example. Let w = f(x, y, z, t) and suppose fx, fy, fz and ft all exist and are continuous on a set E ⊂ Dom(f) ⊆ ℝ⁴. Suppose further that x, y, z, and t are each functions of variables u and v defined on a set U ⊂ ℝ² such that the image of U is in E, and all necessary first partials exist and are continuous. Then
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial u} + \frac{\partial w}{\partial t}\frac{\partial t}{\partial u} = w_x x_u + w_y y_u + w_z z_u + w_t t_u\,,$$
$$\frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial v} + \frac{\partial w}{\partial t}\frac{\partial t}{\partial v} = w_x x_v + w_y y_v + w_z z_v + w_t t_v\,.$$
Example. It can be helpful to form a tree diagram to understand the nesting of variables. A tree
in graph theory is a collection of vertices and edges connecting them, with no closed loops. For our
variable trees, the vertices are labeled by the variables, and edges are labeled by partial derivatives.
E.g., for the example above, one has the tree shown in figure 9: the root w branches to x, y, z, and t along edges labeled ∂x w, ∂y w, ∂z w, and ∂t w, and each of those four vertices branches to leaves u and v along edges labeled by the corresponding partials (∂u x and ∂v x below x, and so on).
Figure 9
The terms of the chain rule sums are then found by taking products of the edge labels along
paths from the root w to the ends of branches with leaves labeled by the appropriate variable.
Thus, for ∂u w, one follows all paths originating from w and ending in u, to collect the product terms which sum to give us the chain rule expression ∂x w ∂u x + ∂y w ∂u y + ∂z w ∂u z + ∂t w ∂u t.
If one has nested several levels of multivariable functions, then the tree may have more levels. For example, see the tree diagram for f(x(u(r, s, t), v(r, s, t)), y(u(r, s, t), v(r, s, t))) shown in figure 10: the root f branches to x and y, each of x and y branches to u and v, and each copy of u and v branches to leaves r, s, and t, with every edge labeled by the corresponding partial derivative.
Figure 10
Exercise 1.15. Using the tree in figure 10, write out the chain rule expression for ∂s f .
Example. Let f (x, y) = xy − x2 − y 2 , and x(u, v, w) = uev cos w, y(u, v, w) = euv sin w. Find fw
when u = 1, v = −2, and w = π. We can solve this problem via a tree as follows. The initial variable
tree is shown in figure 11. In red are the branches we need to follow to form the appropriate chain
rule.
Figure 11. The initial variable tree: f branches to x and y (edges ∂x f, ∂y f), and each of x and y branches to u, v, and w (edges labeled by the corresponding partials).
We thus need to compute ∂x f , ∂y f , ∂w x, ∂w y, x(1, −2, π), and y(1, −2, π) in order to compute
∂w f = ∂x f ∂w x + ∂y f ∂w y. We will redraw the tree, filling in information. It is helpful to alter the
left-to-right order in which the leaves appear to make space to write out ∂w x.
Rewriting our tree in terms of the functions and computed partials, we have:
Figure 12. Filling in the tree with the necessary partial derivatives: the root f = xy − x² − y² has edges ∂x f = y − 2x and ∂y f = x − 2y, and the leaves carry the values u = 1, v = −2, and w = π.
∂f/∂w (1, −2, π) = [∂x f ∂w x + ∂y f ∂w y]|(u,v,w)=(1,−2,π)
= [(y − 2x)(−u e^v sin w) + (x − 2y)(e^(uv) cos w)]|(u,v,w)=(1,−2,π)
= (0 + 2e^(−2))(0) + (−e^(−2) − 2(0))(−e^(−2))
= 1/e⁴ .
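This computation is easy to check numerically: the sketch below (my own, not part of the notes) compares the chain-rule value of fw at (u, v, w) = (1, −2, π) against a central finite difference of the composite function.

```python
import math

# Numerical check of the worked example: f(x, y) = x*y - x^2 - y^2 with
# x = u e^v cos(w), y = e^{uv} sin(w); verify the chain-rule value of f_w
# at (u, v, w) = (1, -2, pi) against a central finite difference.

def x(u, v, w): return u * math.exp(v) * math.cos(w)
def y(u, v, w): return math.exp(u * v) * math.sin(w)
def f(a, b): return a * b - a**2 - b**2

def F(u, v, w):  # the composite f(x(u,v,w), y(u,v,w))
    return f(x(u, v, w), y(u, v, w))

u0, v0, w0 = 1.0, -2.0, math.pi
x0, y0 = x(u0, v0, w0), y(u0, v0, w0)

# Chain rule: f_w = f_x * x_w + f_y * y_w, evaluated at the point.
fx, fy = y0 - 2 * x0, x0 - 2 * y0
xw = -u0 * math.exp(v0) * math.sin(w0)
yw = math.exp(u0 * v0) * math.cos(w0)
chain_value = fx * xw + fy * yw          # should equal e^{-4}

# Central finite difference in w.
h = 1e-6
fd_value = (F(u0, v0, w0 + h) - F(u0, v0, w0 - h)) / (2 * h)

print(chain_value, math.exp(-4), fd_value)
```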
Consider now a bivariate function f whose variables x and y are given as bivariate functions of
u and v, and assume all first and second partials exist and are continuous on appropriate domains.
Then we can use the chain rule and product rules to compute expressions for the second partials
fuu , fuv , fvu and fvv . For example, to compute fuu , one has
∂²f/∂u² = ∂/∂u (∂f/∂u) = ∂/∂u ((∂f/∂x)(∂x/∂u) + (∂f/∂y)(∂y/∂u))
= (∂/∂u (∂f/∂x)) ∂x/∂u + (∂f/∂x)(∂²x/∂u²) + (∂/∂u (∂f/∂y)) ∂y/∂u + (∂f/∂y)(∂²y/∂u²)
= (∂²f/∂x²)(∂x/∂u)² + (∂f/∂x)(∂²x/∂u²) + 2 (∂²f/∂y∂x)(∂x/∂u)(∂y/∂u) + (∂²f/∂y²)(∂y/∂u)² + (∂f/∂y)(∂²y/∂u²) ,
where we’ve applied the product and chain rules to expand it, and the Clairaut-Schwarz theorem
to combine the mixed partial terms. All together, we have the following
Proposition. If f(x, y) has continuous first and second partial derivatives with respect to x and y, and x and y are given as functions of u and v with continuous first and second partial derivatives with respect to u and v, then f(x(u, v), y(u, v)) has continuous first and second partial derivatives with respect to u and v, and

∂²f/∂u² = (∂²f/∂x²)(∂x/∂u)² + (∂f/∂x)(∂²x/∂u²) + 2 (∂²f/∂y∂x)(∂x/∂u)(∂y/∂u) + (∂²f/∂y²)(∂y/∂u)² + (∂f/∂y)(∂²y/∂u²) ,

∂²f/∂u∂v = ∂²f/∂v∂u = (∂²f/∂x²)(∂x/∂v)(∂x/∂u) + (∂f/∂x)(∂²x/∂v∂u) + (∂²f/∂y∂x)((∂y/∂v)(∂x/∂u) + (∂x/∂v)(∂y/∂u)) + (∂²f/∂y²)(∂y/∂v)(∂y/∂u) + (∂f/∂y)(∂²y/∂v∂u) ,

∂²f/∂v² = (∂²f/∂x²)(∂x/∂v)² + (∂f/∂x)(∂²x/∂v²) + 2 (∂²f/∂y∂x)(∂x/∂v)(∂y/∂v) + (∂²f/∂y²)(∂y/∂v)² + (∂f/∂y)(∂²y/∂v²) .
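The first of these formulae can be checked numerically; the following sketch uses test functions of my own choosing (not from the notes) and compares the formula's value of fuu with a second central difference of the composite.

```python
import math

# Check of the f_uu formula with f(x, y) = sin(x*y), x(u, v) = u^2 + v,
# y(u, v) = u*v (arbitrary smooth choices made for illustration).

def comp(u, v):
    x, y = u * u + v, u * v
    return math.sin(x * y)

u0, v0 = 0.7, 0.3
x0, y0 = u0 * u0 + v0, u0 * v0

# First and second partials of f(x, y) = sin(x*y) at (x0, y0).
fx = y0 * math.cos(x0 * y0)
fy = x0 * math.cos(x0 * y0)
fxx = -y0 * y0 * math.sin(x0 * y0)
fyy = -x0 * x0 * math.sin(x0 * y0)
fxy = math.cos(x0 * y0) - x0 * y0 * math.sin(x0 * y0)

# Partials of x and y with respect to u.
xu, xuu = 2 * u0, 2.0
yu, yuu = v0, 0.0

# The proposition's formula for f_uu.
f_uu = fxx * xu**2 + fx * xuu + 2 * fxy * xu * yu + fyy * yu**2 + fy * yuu

# Second central difference of the composite in u.
h = 1e-4
fd = (comp(u0 + h, v0) - 2 * comp(u0, v0) + comp(u0 - h, v0)) / h**2

print(f_uu, fd)
```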
Exercise 1.16. Apply the chain rule and other applicable principles to get the remaining two
formulae in the above proposition for fuv = fvu and fvv . Compute fuv and fvu separately and
deduce their equality.
Exercise 1.17. Use the above proposition to re-express Laplace’s equation fxx + fyy = 0 in polar
coordinates. In particular, show that
fxx + fyy = frr + (1/r) fr + (1/r²) fθθ ,
so that the polar form of Laplace’s equation may be written as
∂²f/∂r² + (1/r) ∂f/∂r + (1/r²) ∂²f/∂θ² = 0 .
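Before attempting the derivation, it can be reassuring to see the identity hold numerically. The sketch below (my own test function, not from the notes) compares the Cartesian Laplacian with the polar expression by finite differences.

```python
import math

# Numerical check of f_xx + f_yy = f_rr + (1/r) f_r + (1/r^2) f_theta_theta,
# using the test function f(x, y) = x^3*y - y^2 (an arbitrary smooth choice).

def f_xy(x, y): return x**3 * y - y**2
def f_polar(r, t): return f_xy(r * math.cos(t), r * math.sin(t))

def second_diff(g, a, h=1e-4):
    return (g(a + h) - 2 * g(a) + g(a - h)) / h**2

def first_diff(g, a, h=1e-6):
    return (g(a + h) - g(a - h)) / (2 * h)

x0, y0 = 1.2, 0.8
r0, t0 = math.hypot(x0, y0), math.atan2(y0, x0)

# Cartesian Laplacian by finite differences.
lap_cart = (second_diff(lambda x: f_xy(x, y0), x0)
            + second_diff(lambda y: f_xy(x0, y), y0))

# Polar expression: f_rr + (1/r) f_r + (1/r^2) f_theta_theta.
lap_polar = (second_diff(lambda r: f_polar(r, t0), r0)
             + first_diff(lambda r: f_polar(r, t0), r0) / r0
             + second_diff(lambda t: f_polar(r0, t), t0) / r0**2)

print(lap_cart, lap_polar)  # both approximate 6*x0*y0 - 2
```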
Exercise 1.18. Can you guess a formula for a spherically symmetric scalar wave propagating from
a point with velocity a? That is, find any solution of the (3 + 1)D wave equation
∂²u/∂t² (r, t) = a² ∇²u(r, t)

modeling a spherically symmetric wave (so in particular, u depends on ρ = √(x² + y² + z²) rather than on x, y, and z independently). Hint: building on the previous exercises, find an expression for the Laplacian in spherical coordinates. Look also at exercise 1.11 above.
Implicit differentiation tackles the problem of computing the slope of a tangent line to such an implicit curve, by using the chain rule to compute dy/dx. One differentiates both sides of the equation f(x, y) = k under the assumption that locally, y is a function of x:

d/dx f(x, y(x)) = (∂f/∂x)(dx/dx) + (∂f/∂y)(dy/dx) = 0
⟹ dy/dx = −(∂f/∂x)/(∂f/∂y) .
E.g., for the unit circle, using z = x² + y², one has y′(x) = −∂x z/∂y z = −2x/2y = −x/y, which is
of course geometrically sensible, as the slope of a tangent to a circle must be the negative reciprocal
of the slope of the radial line, since the tangent line is perpendicular to the radius.
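A quick numerical companion (not part of the notes): on the upper branch of the unit circle we can compare the implicit-differentiation slope −x/y with the derivative of the explicit parametrization y(x) = √(1 − x²).

```python
import math

# Check that dy/dx = -x/y on the unit circle, by comparing the
# implicit-differentiation slope with a finite difference of the
# explicit upper-branch formula y(x) = sqrt(1 - x^2).

def y_explicit(x): return math.sqrt(1.0 - x * x)

x0 = 0.6
y0 = y_explicit(x0)            # 0.8, so (x0, y0) lies on x^2 + y^2 = 1

slope_implicit = -x0 / y0      # -F_x/F_y with F = x^2 + y^2

h = 1e-7
slope_numeric = (y_explicit(x0 + h) - y_explicit(x0 - h)) / (2 * h)

print(slope_implicit, slope_numeric)  # both near -0.75
```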
Similarly, for an implicit surface given by an equation F (x, y, z) = k, we can compute partial
derivatives under the assumption that one of the variables, say, z, locally depends upon the other
two, with the other two being independent there:
∂/∂x F(x, y, z(x, y)) = (∂F/∂x)(∂x/∂x) + (∂F/∂z)(∂z/∂x) = 0 ⟹ ∂z/∂x = −(∂F/∂x)/(∂F/∂z) ,
and similarly
∂z/∂y = −(∂F/∂y)/(∂F/∂z) .
Example. Find ∂x z and ∂y z for the implicit surface xy − xz + yz = 1.
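A sketch of one way to work this example (my own computation, not taken verbatim from the notes): here Fx = y − z, Fy = x + z, and Fz = y − x, so implicit differentiation gives ∂z/∂x = −(y − z)/(y − x) and ∂z/∂y = −(x + z)/(y − x). Away from y = x one can also solve explicitly, z = (1 − xy)/(y − x), and compare numerically:

```python
# Verify the implicit partials of xy - xz + yz = 1 at the point (2, 1, 1)
# against finite differences of the explicit solution z = (1 - x*y)/(y - x),
# which is valid wherever y != x.

def z_explicit(x, y):
    return (1.0 - x * y) / (y - x)

x0, y0 = 2.0, 1.0
z0 = z_explicit(x0, y0)               # z0 = 1, so (2, 1, 1) is on the surface

# Implicit differentiation: z_x = -F_x/F_z, z_y = -F_y/F_z.
zx_implicit = -(y0 - z0) / (y0 - x0)
zy_implicit = -(x0 + z0) / (y0 - x0)

h = 1e-7
zx_numeric = (z_explicit(x0 + h, y0) - z_explicit(x0 - h, y0)) / (2 * h)
zy_numeric = (z_explicit(x0, y0 + h) - z_explicit(x0, y0 - h)) / (2 * h)

print(zx_implicit, zx_numeric)  # both near 0.0 at this point
print(zy_implicit, zy_numeric)  # both near 3.0
```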
Exercise 1.19. For the surface given implicitly by r4 − (1 + 2xz)r2 + (xz)2 = 0, where r2 = x2 + y 2 ,
use implicit differentiation to compute ∂z/∂x, ∂z/∂y, and ∂r/∂z.
An important consideration is absent from our discussion above. We assumed that locally, z was
a function of x and y, and so we could compute partial derivatives via the chain rule. But when is
it okay to assume that an equation F (x, y, z) = 0 implicitly defines a surface in such a way that z
is locally a function of x and y? How do we find the points where this assumption is untenable?
We thus consider the implicit function theorem:
Theorem (Implicit Function Theorem for Trivariate Functions). Let F (x, y, z) be a function such
that F (x0 , y0 , z0 ) = 0, and suppose that the partials Fx , Fy and Fz are all continuous on a ball
containing (x0, y0, z0), and moreover Fz(x0, y0, z0) ≠ 0. Then there exists a neighborhood U of
(x0 , y0 , z0 ) and a function f : D → R, for some domain D ⊆ R2 , such that
{(x, y, z) ∈ U | F (x, y, z) = 0} = {(x, y, z) | z = f (x, y), (x, y) ∈ D} ⊂ U ,
i.e., the equation F (x, y, z) = 0 in U implicitly defines a surface which is the graph of a bivariate
function.
Certainly, nothing is special about z: one can ask instead that Fx 6= 0, and seek to express x as
a function of y and z locally. One can of course make much more general statements, though we
will leave such considerations for an advanced calculus course.
Exercise 1.20. Reconsider the function F (x, y, z) = xy − xz + yz and the surface F (x, y, z) = 1.
(a) Check that F satisfies the conditions of the implicit function theorem at the point (2, 1, 1),
and verify that this point is on the surface defined by F (x, y, z) = 1.
(b) What is the local expression for z there? What is the domain D for which this local function
is well defined?
(c) What happens to the surface at points where Fz = 0? Can you give an implicit description
around such points using x or y as the dependent variable?
Equivalently, f (x, y) is differentiable at (x0 , y0 ) if the partial derivatives fx (x0 , y0 ) and fy (x0 , y0 )
both exist and there exist remainder functions ε1 and ε2 such that
f (x, y) = Lf,r0 (x, y) + ε1 (x, y) (x − x0 ) + ε2 (x, y) (y − y0 ) ,
and as (x, y) → (x0 , y0 ), (ε1 , ε2 ) → (0, 0).
Writing z = f (x, y), ∆z = z − z0 , ∆x = x − x0 and ∆y = y − y0 , one can rephrase the condition
of differentiability as follows: f is differentiable at (x0 , y0 ) if and only if
∆z = fx (x0 , y0 )∆x + fy (x0 , y0 )∆y + ε1 ∆x + ε2 ∆y
for some pair of functions ε1 and ε2 both of which vanish in the limit as (x, y) → (x0 , y0 ).
Proposition. If a bivariate function is differentiable at r0 = ⟨x0, y0⟩, then the function
Lf,r0 (x, y) = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 )
is the unique function such that the linear function A(r) := L(r) − f (r0 ) satisfies
lim_{r→r0} [f(r) − f(r0) − A(r − r0)] / ‖r − r0‖ = lim_{h→0} (1/‖h‖)(f(r0 + h) − f(r0) − A(h)) = 0 .
More generally, for a multivariate function f (r) with domain D ⊆ Rn , one defines differentiability
again in terms of the existence and effectiveness of a linear approximation:
Definition. The function f (r) = f (x1 , . . . , xn ) is said to be differentiable at r0 if there exists a
linear function A(r) such that
lim_{r→r0} [f(r) − f(r0) − A(r − r0)] / ‖r − r0‖ = lim_{h→0} (1/‖h‖)(f(r0 + h) − f(r0) − A(h)) = 0 .
Exercise 2.5. Prove that a multivariate function f (r) with domain D ⊆ Rn is differentiable at
r0 = ⟨a1, . . . , an⟩ ∈ D if and only if
(i.) f is continuous at r0 ,
(ii.) all of the partial derivatives ∂xi f exist at r0 , and
(iii.) there exists a remainder vector ε = ⟨ε1, . . . , εn⟩ such that
f (r) = Lf,r0 (r) + ε · (r − r0 ) ,
and as r → r0 , ε → 0.
Deduce that if all of the first partials of f exist and are continuous at r0 , then f is differentiable
at r0 . Can you find an example of a function g which is differentiable at a point r0 , but not
continuously differentiable there?
§ 2.4. The Total Differential
Definition. The total differential of a bivariate differentiable function f (x, y) is
df = (∂f/∂x) dx + (∂f/∂y) dy .
The total differential can be thought of as a formal analogue to the increment formula
∆z = fx ∆x + fy ∆y ,
giving an “infinitesimal version” of the linearization. It is an example of a differential one-form.
We’ll discuss one-forms again when we discuss line integrals. For now, the first utility of the total
differential is as a means to estimate errors.
Example 2.1. Recall that the volume V of a right circular cone with base of radius r and height h is V = (π/3) r²h. Suppose you measure the radius and height of a cone to be r = 10 cm and h = 20 cm respectively. Suppose that the maximum error in each of your measurements is 0.4 cm = 4 mm.
Estimate the maximum error in the volume using differentials.
Solution: The volume differential is

dV = Vr dr + Vh dh = (2π/3) rh dr + (π/3) r² dh .

Using dr = 0.4 cm = dh, r = 10 cm, and h = 20 cm gives an error of

(2π/3)(10 cm)(20 cm)(0.4 cm) + (π/3)(10 cm)²(0.4 cm) = (200π/3) cm³ ≈ 209 cm³ .
3 3 3
Observe thus that a small error in length measurements leads to a potentially large error in volume
measurement. Let us compare this error estimate to the real maximum error. The volume calculated
from the measurements is
V0 = (2000π/3) cm³ .
3
The error is maximized in this case when the measurements given are smaller than the real lengths.
Computing the real volume if the lengths are given by r = 10.4 cm and h = 20.4 cm, one obtains
V = (2206.464π/3) cm³ .

The difference is then the real error: V − V0 = (206.464π/3) cm³ ≈ 216 cm³.
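The arithmetic of this comparison is easy to reproduce; a minimal script (mine, not from the notes):

```python
import math

# Companion to the cone example: compare the differential estimate
# dV = V_r dr + V_h dh with the worst-case actual volume error.

def V(r, h):
    return math.pi / 3.0 * r**2 * h

r0, h0, err = 10.0, 20.0, 0.4   # measurements in cm, max error 0.4 cm

# Differential estimate with dr = dh = 0.4 cm.
dV = (2 * math.pi / 3.0) * r0 * h0 * err + (math.pi / 3.0) * r0**2 * err

# Worst case: both true lengths exceed the measurements by the full error.
actual = V(r0 + err, h0 + err) - V(r0, h0)

print(dV)      # 200*pi/3     ~ 209.4 cm^3
print(actual)  # 206.464*pi/3 ~ 216.2 cm^3
```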
We can derive an easy expression to compute the directional derivative in terms of û from the
chain rule.
Proposition. The directional derivative of f at r0 in the direction of û is given by
Dû f(r0) = ⟨fx(r0), fy(r0)⟩ · û .
Proof. Let g(h) = f(r0 + hû), and write û = cos θ ı̂ + sin θ ̂. Then

g′(0) = lim_{h→0} [f(r0 + hû) − f(r0)]/h = Dû f(r0) .

On the other hand, by the chain rule,

g′(0) = d/dh f(r0 + hû)|_{h=0}
= d/dh f(x0 + h cos θ, y0 + h sin θ)|_{h=0}
= [(∂f/∂x)(dx/dh) + (∂f/∂y)(dy/dh)]|_{h=0}
= fx(x0, y0) cos θ + fy(x0, y0) sin θ = ⟨fx(r0), fy(r0)⟩ · û .
Example. Find the directional derivative of f (x, y) = 6−3x2 −2y 2 at (x, y) = (1, 1) in the direction
of û where û makes an angle of π/3 with the x-axis.
Solution: The direction vector we want is û = cos(π/3) ı̂ + sin(π/3) ̂ = (1/2) ı̂ + (√3/2) ̂. The partial derivatives at (1, 1) are

fx(1, 1) = −6 , and fy(1, 1) = −4 ,

whence the directional derivative is

Dû f(1, 1) = (−6 ı̂ − 4 ̂) · ((1/2) ı̂ + (√3/2) ̂) = −3 − 2√3 .
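As a sanity check (not part of the notes), the following sketch confirms both the dot-product value and the limit definition for this example:

```python
import math

# Check of the example: f(x, y) = 6 - 3x^2 - 2y^2 at (1, 1) in the direction
# making angle pi/3 with the x-axis; the value should be -3 - 2*sqrt(3).

def f(x, y): return 6 - 3 * x**2 - 2 * y**2

ux, uy = math.cos(math.pi / 3), math.sin(math.pi / 3)   # the unit vector u-hat

# Dot product form: <f_x(1,1), f_y(1,1)> . u-hat = <-6, -4> . u-hat.
value = (-6) * ux + (-4) * uy

# Symmetric-difference version of the limit definition.
h = 1e-6
numeric = (f(1 + h * ux, 1 + h * uy) - f(1 - h * ux, 1 - h * uy)) / (2 * h)

print(value, -3 - 2 * math.sqrt(3), numeric)
```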
Example. Find the directional derivatives of f(x, y) = sin(xy) at (√π/2, √π/3) in the directions of the vectors v = ı̂ + ̂ and w = 3 ı̂ − 4 ̂.
Solution: Note that we must normalize the vectors v and w to obtain unit vectors:

v̂ = v/‖v‖ = (ı̂ + ̂)/√2 = (√2/2) ı̂ + (√2/2) ̂ ,  ŵ = w/‖w‖ = (3 ı̂ − 4 ̂)/√25 = (3/5) ı̂ − (4/5) ̂ .

We then compute the partial derivatives at (√π/2, √π/3):

fx(x, y) = y cos(xy) ⟹ fx(√π/2, √π/3) = (√π/3) cos(π/6) = √(3π)/6 ,
fy(x, y) = x cos(xy) ⟹ fy(√π/2, √π/3) = (√π/2) cos(π/6) = √(3π)/4 .

The desired directional derivatives are thus

Dv̂ f(√π/2, √π/3) = ⟨fx(√π/2, √π/3), fy(√π/2, √π/3)⟩ · v̂ = (√(3π)/6)(√2/2) + (√(3π)/4)(√2/2) = 5√(6π)/24 ,

Dŵ f(√π/2, √π/3) = ⟨fx(√π/2, √π/3), fy(√π/2, √π/3)⟩ · ŵ = (√(3π)/6)(3/5) − (√(3π)/4)(4/5) = −√(3π)/10 .
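These closed forms can be double-checked numerically; a short script of my own:

```python
import math

# Check of the sin(xy) example at (sqrt(pi)/2, sqrt(pi)/3): the directional
# derivatives along v-hat and w-hat should equal 5*sqrt(6*pi)/24 and
# -sqrt(3*pi)/10 respectively.

x0, y0 = math.sqrt(math.pi) / 2, math.sqrt(math.pi) / 3
fx = y0 * math.cos(x0 * y0)     # here x0*y0 = pi/6
fy = x0 * math.cos(x0 * y0)

# Normalized directions for v = <1, 1> and w = <3, -4>.
v = (1 / math.sqrt(2), 1 / math.sqrt(2))
w = (3 / 5, -4 / 5)

Dv = fx * v[0] + fy * v[1]
Dw = fx * w[0] + fy * w[1]

print(Dv, 5 * math.sqrt(6 * math.pi) / 24)
print(Dw, -math.sqrt(3 * math.pi) / 10)
```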
Exercise 3.1. Let f(x, y) = √(4 − x² − y²).
(a) Find the directional derivative of f at (x, y) in the direction of a vector making angle θ
with ı̂.
(b) At the point (1, 1), for what angle θ between û and ı̂ is the directional derivative largest?
For an arbitrary but fixed point (x0 , y0 ) ∈ Dom(f ), determine the angle which maximizes
the directional derivative in terms of x0 and y0 .
(c) At (1, 1), in what directions ±û is D±û f (1, 1) = 0? Give explicit unit vectors.
(d) For what (x, y) is the directional derivative 0 regardless of the direction û? What does this
reveal about the geometry of the graph?
Exercise 3.2. Let f (x, y) be a two-variable continuously differentiable function, Gf its graph, and
Πû,r0 the vertical plane containing p = r0 +f (r0 )k̂ and determined by a direction û ∈ S1 ⊂ {z = 0}.
Find a parametrization c(t) of the curve which is the locus of the intersection Gf ∩ Πû,r0, so that at t = 0 the position is p = r0 + f(r0) k̂, and compute the curvature of this curve at p using the chain rule for f(c(t)).
Definition (The gradient as the vector of steepest ascent). For f (x1 , . . . , xn ) a multivariate func-
tion differentiable at the point P , the gradient of f at P is the unique vector ∇f (P ) such that
Dû f (P ) is maximized by choosing û = ∇f (P )/k∇f (P )k, and
Dû f (P ) = k∇f (P )k
gives the maximum rate of change of f at P . Observe that the minimum value of Dû f (P ) occurs
for û = −∇f (P )/k∇f (P )k, and the minimum rate of change is −k∇f (P )k, and that ∇f (P ) is
orthogonal to the level set of f containing P .
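This maximizing property is easy to see empirically. The sketch below (my own example, not from the notes) sweeps unit directions for a sample function and checks that the best direction found agrees with ∇f/‖∇f‖ and that the best rate equals ‖∇f‖.

```python
import math

# For f(x, y) = x^2 + 3y at P = (1, 2), grad f = <2x, 3> = <2, 3>.
# Sweep directions u-hat = <cos t, sin t> and locate the one maximizing
# D_u f = grad f . u-hat; compare with grad f / |grad f|.

gx, gy = 2.0, 3.0
norm = math.hypot(gx, gy)

N = 20000
best_theta, best_val = 0.0, -float("inf")
for k in range(N):
    t = 2 * math.pi * k / N
    d = gx * math.cos(t) + gy * math.sin(t)   # directional derivative
    if d > best_val:
        best_theta, best_val = t, d

# The maximum rate of change should be |grad f|, attained along grad f.
print(best_val, norm)
print(math.cos(best_theta), gx / norm)
print(math.sin(best_theta), gy / norm)
```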
Exercise 3.3. While exploring an exoplanet (alone and un-armed–what were you thinking‽) you’ve
slid part way down a strangely smooth, deep hole. The alien terrain you are on is modeled locally
(in a neighborhood around you spanning several dozen square kilometers) by the height function
z = f(x, y) = ln √(16x² + 9y²) ,
where the height z is given in kilometers. Let ı̂ point eastward and ̂ point northward. Your current position is one eighth of a kilometer east and one sixth of a kilometer south, relative to the origin of the
(x, y) coordinate system given. You want to climb out of this strange crater to get away from the
rumbling in the darkness below you.
(a) Find your current height relative to the z = 0 plane.
(b) Show that the level curves z = k for constants k are ellipses, and explicitly determine the
semi-major and semi-minor axis lengths in terms of the level constant k.
(c) In what direction(s) should you initially travel if you wish to stay at the current altitude?
(d) What happens if you travel in the direction of the vector −(1/8)ı̂ + (1/6)̂? Should you try
this?
(e) In what direction should you travel if you wish to climb up (and hopefully out) as quickly
as possible? Justify your choice mathematically.
(f) For each of the directions described in parts (c), (d), and (e), explicitly calculate the rate
of change of your altitude along those directions.
§ 3.3. Tangent Spaces and Normal Vectors
Observe that we can rewrite the linear approximation using the gradient of f :
Lf,r0 (r) = f (r0 ) + ∇f (r0 ) · (r − r0 ) ,
and this formula works to define a linear approximation for n-variable f so long as ∇f (r0 ) is defined.
In particular, we can use gradients to describe the tangent spaces to a (hyper)surface given by a
graph. But what of implicit surfaces?
An implicit surface can be viewed as specifying the surface as the level set of some function.
That is, if the equation of the surface is given by F (x, y, z) = k for some k, then the surface is
precisely the level surface
F −1 ({k}) = {(x, y, z) | F (x, y, z) = k} .
But then, the gradient ∇F is always orthogonal to level sets of F , whence we can use the gradient
vector at a point P ∈ F −1 ({k}) as a normal vector to the surface at P . We can use this normal
vector as the normal to the tangent plane at P , and thus, obtain an equation for the tangent plane
to a point P of the implicit surface with equation F (x, y, z) = k:
Proposition. The equation of the tangent plane at P (x0 , y0 , z0 ) to the surface F (x, y, z) = k,
assuming F is differentiable at P , is
∇F(P) · ⟨x − x0, y − y0, z − z0⟩ = 0 .
If we write r0 = ⟨x0, y0, z0⟩ and r = ⟨x, y, z⟩, then this equation has the pleasing and easy to
remember form
∇F (r0 ) · (r − r0 ) = 0 .
Example. For the function F (x, y, z) = xy − xz + yz, consider the implicit surface given by
F (x, y, z) = 1 discussed in the section on implicit differentiation above. At the point (2, 3, −5) on
the surface, the gradient of F is
∇F (2, 3, −5) = hFx , Fy , Fz i = hy − z, x + z, y − xi = h8, −3, 1i ,
(2,3,−5) (2,3,−5)
and so the tangent plane to the surface at (2, 3, −5) has equation
∇f (2, 3, −5) · hx − 2, y − 3, z + 5i = 8(x − 2) − 3(y − 3) + (z + 5) = 0 ,
which can be rewritten as
8x − 3y + z = 2 .
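A quick sanity check of this example (mine, not part of the notes): the point should satisfy both the surface equation and the plane equation, and the plane's normal should be the gradient.

```python
# Verify the tangent plane example for F(x, y, z) = xy - xz + yz at
# P = (2, 3, -5): P lies on the level surface F = 1, the gradient there
# is <8, -3, 1>, and P satisfies the plane 8x - 3y + z = 2.

def F(x, y, z): return x * y - x * z + y * z

P = (2.0, 3.0, -5.0)
grad = (P[1] - P[2], P[0] + P[2], P[1] - P[0])   # <y - z, x + z, y - x> at P

print(F(*P))                       # 1.0, so P is on the surface F = 1
print(grad)                        # (8.0, -3.0, 1.0)
print(8 * P[0] - 3 * P[1] + P[2])  # 2.0, so P is on the plane
```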
Example. Any graph given by z = f (x, y) can be rewritten as an implicit surface F (x, y, z) =
z − f (x, y) = 0. If we apply the gradient to this F , treating z as an independent variable, we get
∇F (x, y, z) = h−fx (x, y), −fy (x, y), 1i ,
which gives a tangent plane equation at (x0, y0, z0), where z0 = f(x0, y0), of
−fx (x0 , y0 )(x−x0 )−fy (x0 , y0 )(y−y0 )+(z−z0 ) = 0 =⇒ z−z0 = fx (x0 , y0 )(x−x0 )+fy (x0 , y0 )(y−y0 ) ,
which recovers the original tangent plane formula.
Exercise 3.4. Find the equation of the tangent plane to the hyperboloid x² + y² − z² = 1 at the point (√2, √3, 2).
Exercise 3.5. Use gradients to demonstrate that the tangent plane to a sphere at a point is always
perpendicular to the radius, and give a general formula for the tangent plane at P (x0 , y0 , z0 ) to an
origin centered sphere containing the point P .
Exercise 3.6. Consider the surface implicitly defined by
(x² + y² + z² − 5/4)² = 1 − 4x² .
(a) Find the equations of the traces for constant x, y, and z, and plot these families (you may
use a computer, especially for the z traces). What is this surface?
(b) Find the heights for which the tangent planes are horizontal.
(c) There are horizontal tangent planes which intersect the surface along curves rather than in
a single point. Sketch these curves.
(d) What do the self-intersections of the level curves in (c) tell us about the surface?
(e) Use techniques from single variable calculus to find the volume enclosed by this surface.
Figure 16. A graph surface revealing a function with a number of local extrema,
as well as absolute extrema, over a disk domain D.
Consider the surface depicted in figure 16, given as the graph z = f (x, y) for some bivariate
function f defined over a domain D. This surface resembles a mountainous terrain, with a few
mountain passes, and some depressions. Some of these features correspond to values of f (x, y)
which are local extrema. For example, a peak of the surface corresponds to some “critical” pair of
an input and output for the function f (x, y), for which the output value is larger than the values
of the function for “nearby” inputs. We’ll call such a value a “local maximum”. There is a peak in the picture which corresponds to a local maximum value that is also a “global” or “absolute”
maximum, in that its z value is larger than all other z values visible. Let us formally define various
types of extrema.
Definition. Let f (r) be a multivariate function defined on a domain D, and let r0 ∈ D be a
particular point. Then
• We say that the value f (r0 ) is a local maximum value, or simply a local maximum if there
is some neighborhood U around r0 such that f (r0 ) ≥ f (r) for all r ∈ U .
• We say that the value f (r0 ) is a global maximum value, an absolute maximum value or
simply an absolute maximum if f (r0 ) ≥ f (r) for all r ∈ D.
• We say that the value f (r0 ) is a local minimum value, or simply a local minimum if there
is some neighborhood U around r0 such that f (r0 ) ≤ f (r) for all r ∈ U .
• We say that the value f (r0 ) is a global minimum value, an absolute minimum value or
simply an absolute minimum if f (r0 ) ≤ f (r) for all r ∈ D.
We say that a point is an extremum of f if it is a local or global maximum or minimum.
Some remarks are in order:
• For bivariate functions, the neighborhoods U can be taken to be small disks around the
input r0 = hx0 , y0 i. More generally, the neighborhoods can be taken to be small balls:
U = {r : ‖r − r0‖ < δ} for some sufficiently small real number δ > 0.
• Note that absolute extrema in the interior of the domain are also local extrema, and that a
given absolute extremum may not be the unique absolute extremum, if the absolute extreme
value occurs at multiple points. Extreme values may occur on the boundary as well; we’ll
call these boundary extreme values, or boundary extrema. These will be discussed in section
4.3 below.
• For f differentiable at r0, if f(r0) is a local extremum, one should expect the tangent plane to be horizontal there! That is, we expect the partial derivatives at r0 to be 0, for otherwise the gradient vector (respectively, its negative) tells us a direction to travel in to obtain a locally larger (respectively, smaller) value.
• We can imagine an extremum at a non-smooth point: just think about a cone like z = √(x² + y²), which clearly has an absolute minimum value of 0 at r0 = 0, but is not differentiable there. There is also no well defined tangent plane at this point, nor a well defined gradient vector.
In light of the above remarks, we consider investigating the types of inputs which can produce
local extrema.
Definition. A critical point r0 ∈ Dom(f ) of a function f (r) is a point at which the gradient is
zero or fails to exist. The value f (r0 ) is then called a critical value. We can also define the set of
all critical points
crit (f ) := {r0 ∈ Dom(f ) : ∇f (r0 ) = 0 or ∇f (r0 ) does not exist} .
Sometimes we will also use the term critical point to describe the location on the graph corre-
sponding to the pairing of a critical input with the critical value it produces. It should be clear
from context (e.g., if we refer to the graph itself) whether we mean the critical point as an input,
or the location on the graph itself.
Theorem 4.1 (Fermat’s Theorem on Critical Points). If a point r0 ∈ Dom(f ) produces a local
extremum of f , then r0 ∈ crit (f ).
Note that the converse is not true! For example, consider “mountain passes”, like z = x2 − y 2 .
The tangent plane at (0, 0, 0) is horizontal, and the gradient is 0 there, but there are points r
arbitrarily close to 0 in R2 for which f (r) is either positive or negative (consider values along the x
and y axes). Thus the point (0, 0, 0) is neither a local maximum nor a local minimum. See example
4.3.
Example 4.1. Find all critical points of the function f (x, y) = x4 − 4xy + y 4 , and determine the
corresponding critical values.
To begin, we compute the partial derivatives fx and fy :
fx (x, y) = 4x3 − 4y , fy (x, y) = 4y 3 − 4x .
Thus, the critical points are points (x, y) such that x3 = y and y 3 = x simultaneously. Substituting
y 3 for x in the first equation we obtain y 9 = y, which has real solutions precisely when y = 0 or
y = ±1. Returning to the first equation, we see that x = 0 works when y = 0, and x = y = ±1
gives the remaining possible solutions. Thus
crit (f ) = {(−1, −1), (0, 0), (1, 1)} .
The corresponding values are
f (0, 0) = 0, f (±1, ±1) = −2 .
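The critical points found above can be verified directly; a minimal check (my own, not from the notes):

```python
# Verify example 4.1: the gradient of f(x, y) = x^4 - 4xy + y^4 vanishes
# at (-1, -1), (0, 0), and (1, 1), and f takes the stated critical values.

def f(x, y): return x**4 - 4 * x * y + y**4
def grad(x, y): return (4 * x**3 - 4 * y, 4 * y**3 - 4 * x)

crit = [(-1, -1), (0, 0), (1, 1)]
for p in crit:
    print(p, grad(*p), f(*p))   # gradient (0, 0) at each; values -2, 0, -2
```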
How do we determine in general if a critical point corresponds to a local extremum? Intuitively,
if we understand the concavity or curvature of the graph surface around a critical point, we can
determine if there is an extremum or not. Thus, we will find a generalization the second derivative
test to functions of 2-variables. Before we do, we build some geometric intuition by exploring some
model cases.
Example 4.2. Let f (x, y) = x2 + y 2 and g(x, y) = 1 − x2 − y 2 . Each has a single critical point at
(x0 , y0 ) = (0, 0). We can deduce that (0, 0) gives an absolute minimum value of 0 for f , while for g
it produces an absolute maximum value of 1. Indeed, since x2 + y 2 is a sum of squares, it is always
nonnegative, and 0 is its minimum value. We can rewrite g(x, y) in terms of f as g(x, y) = 1−f (x, y),
which confirms that g has a maximum value of 1.
An alternative approach is to examine the second derivatives. Observe that fxx = 2 = fyy , and
fxy = fyx = 0. We can interpret fxx (x, y0 ) as being the second derivative of the graph of the
function c1 (x) = f (x, y0 ) in the plane y = y0 ; that fxx = 2 means that every such curve is concave
up (i.e. convex in the positive z direction). We deduce similarly that every curve c2 (y) = f (x0 , y) is
concave up. Thus the surface z = f (x, y) bends away from its horizontal tangent plane at (0, 0, 0),
and so this point is a global minimum point.
Similarly for g(x, y), we have gxx = −2 = gyy , and gxy = gyx = 0, and we deduce that every
trace curve in a vertical plane is concave down, and (0, 0, 1) must be a global maximum point for
the graph of z = g(x, y) = 1 − f (x, y).
Example 4.3. Consider h(x, y) = x2 − y 2 and k(x, y) = 2xy. There is again a single critical
point at the origin of R2 for each surface, and the graphs z = h(x, y) and z = k(x, y) share a
horizontal tangent plane of z = 0 at (0, 0, 0). For each of h and k, there are inputs arbitrarily
close to (0, 0) for which outputs can be either positive or negative–that is, the graphs of each rise
above and below the common tangent plane z = 0. Indeed, the tangent plane z = 0 intersects
each graph in a pair of lines: setting h(x, y) = 0 gives x2 = y 2 ⇐⇒ y = ±x, and similarly
k(x, y) = 0 ⇐⇒ xy = 0 ⇐⇒ x = 0 or y = 0. These line pairings act as asymptotes for the
hyperbolic level curves, and we can use this information to help construct the graphs. We see that
z = h(x, y) is a saddle with level curves given as hyperbolae x2 − y 2 = k, while z = k(x, y) is the
same saddle rotated by 45◦ .
As before, let us see if we can use second partial derivatives to analyze the concavity of traces
and recover our conclusions from above. For h(x, y) this will be mostly straightforward. Note that
hxx (x, y) = 2 but hyy = −2. This tells us that traces in planes of constant y are concave up, while
traces in planes of constant x are concave down. Note that hxy = 0 identically.
The other saddle, z = k(x, y) is just a rotation of the saddle z = h(x, y). The interesting thing
is, kxx = 0 = kyy identically. Indeed, since kx (x, y) = 2y and ky (x, y) = 2x, the only nonzero second
derivatives are the mixed partials, both of which are identically 2. What this says is that the traces
along planes of constant x and y are lines, but the lines’ slopes change as follows: as we sweep the
plane y = y0 through increasing values of y0 , the trace lines’ slopes increase, since kxy = 2 > 0.
Similarly, as we sweep planes x = x0 through increasing values of x0 , the trace lines’ slopes increase.
The saddle, a hyperbolic paraboloid, can be swept out by lines in two ways! It is thus called doubly
ruled.
Observe that though kxy was positive, the quantities hxx hyy − (hxy )2 and kxx kyy − (kxy )2 are
both equal to −4. An interpretation of this will be illuminated in the next section.
Example 4.4. Let l(x, y) = (y −x)2 and let m(x, y) = x3 −3xy 2 . We will see in these cases that the
second derivatives don’t tell the complete story. For l(x, y), the critical locus is the whole line y = x,
which corresponds to a global minimum value of 0. The graph is a parabolic cylinder, appearing
like a trough with a whole line of minima! In particular, there is a direction in which the surface does not bend away from the tangent plane at a minimum, namely, along the vector (√2/2)(ı̂ + ̂), even
though lxx = 2 = lyy . But notice that lxy = lyx = −2, and so the quantity lxx lyy − (lxy )2 is 0. In
this case, the non-isolation of the critical points is related to the failure of the second derivatives
to completely explain the behavior of the surface around its minima.
For m(x, y), there is a unique critical point at the origin. However, mxx(0, 0) = 0 = myy(0, 0), and mxy(0, 0) = myx(0, 0) = 0, so the quantity mxx(0, 0) myy(0, 0) − (mxy(0, 0))² = 0. In this case, one
can check that the tangent plane intersects the graph in a collection of lines (how many?) and the
surface has neither a local maximum nor a local minimum at (0, 0, 0). The graph of z = m(x, y) is
often called a monkey saddle. Can you explain why?
Exercise 4.1. Suppose we are given the following contour plots (in figure 21) for the graphs of
the first partial derivatives fx and fy of some function f . What information can be determined
about crit (f ) from these plots? Can we determine if critical points correspond to certain types of
extrema?
(a)
(b)
Figure 21. (a) Contours for k = fx (x, y), with the bold black contour correspond-
ing to the zero level. (b) Contours for k = fy (x, y), with the bold black contour
corresponding to the zero level.
(ii) if |H| > 0 and fxx (x0 , y0 ) < 0 then f (x0 , y0 ) is a local maximum value,
(iii) if |H| < 0 then f (x0 , y0 ) is a saddle point, and thus neither a maximum nor a minimum,
(iv) if |H| = 0 then the test is inconclusive, and the point f (x0 , y0 ) can exhibit any of the above
behaviors.
Example 4.5. For each of the model cases above in examples 4.2, 4.3, and 4.4 we can readily
confirm the results of the test:
• For example 4.2: we have a local (and global) minimum value of 0 at (0, 0) for f(x, y) = x² + y² with |H| = 4 and fxx(0, 0) = 2, and for g(x, y) = 1 − x² − y² we have a local (and global) maximum value of 1 at (0, 0) with |H| = 4 and gxx(0, 0) = −2. Thus f yields an
example for criterion (i) in the above proposition, and g yields an example for criterion (ii).
• Both saddles h(x, y) = x2 − y 2 and k(x, y) = 2xy of example 4.3 have Hessian discriminant
|H| < 0 for the critical point (0, 0), and thus yield examples of criterion (iii).
• The functions l(x, y) = (y − x)2 and m(x, y) = x3 − 3xy 2 of example 4.4 yield cases under
criterion (iv), since their Hessian discriminants are both 0. These are called degenerate
critical points.
Observe that after perturbing the function m(x, y) by adding a small linear term ε1 x+ε2 y,
the function has new critical points, and the second derivative test will work for these new
critical points.
Exercise 4.2. Let ε1 and ε2 be any small positive constants, and define η(x, y) = ε1 x + ε2 y. Show
that m(x, y) + η(x, y) has two critical points, and classify them using the second derivative test.
Example 4.6. Reconsider the function f(x, y) = x⁴ − 4xy + y⁴ from example 4.1. Recall that crit (f) = {(−1, −1), (0, 0), (1, 1)}. The Hessian is

H(x, y) = [ 12x²  −4 ]
          [ −4   12y² ] ,

and the corresponding discriminant is |H(x, y)| = 144x²y² − 16. Thus

|H(±1, ±1)| = 128 > 0 and fxx(±1, ±1) = 12 > 0 ⟹ f(±1, ±1) = −2 is a local minimum value,
|H(0, 0)| = −16 < 0 ⟹ (0, 0, 0) is a saddle point .
This is consistent with what we see in figure 17.
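The classification can be automated for this example; a small sketch (my own, not from the notes) applies the test's criteria to each critical point:

```python
# Apply the second derivative test to f(x, y) = x^4 - 4xy + y^4:
# f_xx = 12x^2, f_yy = 12y^2, f_xy = -4, so |H| = 144 x^2 y^2 - 16.

def hessian_det(x, y):
    return (12 * x**2) * (12 * y**2) - (-4) ** 2

def classify(x, y):
    d = hessian_det(x, y)
    if d < 0:
        return "saddle"
    if d > 0:
        # Sign of f_xx = 12x^2 decides min vs max (here always positive
        # when d > 0, since d > 0 forces x != 0).
        return "local min" if 12 * x**2 > 0 else "local max"
    return "inconclusive"

print(hessian_det(1, 1), classify(1, 1))      # 128, local min
print(hessian_det(-1, -1), classify(-1, -1))  # 128, local min
print(hessian_det(0, 0), classify(0, 0))      # -16, saddle
```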
Exercise 4.3. Find and classify the critical points for the following functions:
(a) f (x, y) = x3 − 3xy 2 − 9x2 − 6y 2 + 27,
Exercise 4.4. Let f (x, y) = xy + cos(xy). Analyze the critical point locus of f (x, y), and explain
why one might appropriately say that the graph of z = xy + cos(xy) possesses infinitely many
“saddle ridges”. What can you say about extrema of f ?
(c) For a curve as in part (a) (not necessarily unit speed) compute
d²/dt² f (r(t))
in terms of partials of f and the derivatives ẋ, ẏ, ẍ, ÿ using parts (a) and (b).
Exercise 4.6. Building off of the previous exercise (or assuming its results as needed), prove the
second derivative test. Hints: consider directional derivatives for arbitrary u ∈ S1 , and show that if,
e.g., the assumptions of (i) hold, then the curve of intersection of the graph and the plane through
r0 and parallel to u is concave up; similarly use directional derivatives and the assumptions of (ii)
and (iii) to assess the claims of the test. Finally, for (iv) produce and analyze examples with Hessian
discriminant 0 exhibiting each type of behavior.
Exercise 4.7. This problem deals with the multivariable Taylor series. Consider a function f of two
variables defined on a domain D ⊆ R2. Assume that f has continuous partial derivatives of all
orders (a condition called smoothness; one often writes f ∈ C∞(D, R) to indicate that f is in the
class of smooth functions from D to R). The Taylor series of f centered at a point (x0 , y0 ) ∈ D is
∑_{n=0}^∞ ∑_{m=0}^∞ (1/(n! m!)) ∂^{n+m}f/∂xⁿ∂yᵐ |_{(x0 ,y0 )} (x − x0 )ⁿ (y − y0 )ᵐ .
(a) Let ∇f (r0 ) denote the gradient vector of f evaluated at (x0 , y0 ) and let H(r0 ) denote
the Hessian matrix of f evaluated at r0 = hx0 , y0 i. Show that the second order Taylor
polynomial of f is
Tf,2 (r) = f (r0 ) + ∇f (r0 ) · (r − r0 ) + ½ (r − r0 ) · (H(r0 )(r − r0 )) .
(b) Compute the second order Taylor polynomials of the following functions at the given points:
(i) f (x, y) = e^{−x²−y²}, (x0 , y0 ) = (0, 0),
(d) For g(x, y) = sin(xy), (0, 0) is also a critical point. Compute the second order Taylor
polynomial around (0, 0) and determine the type of critical point. For the critical points
(π, 1/2), and (3π, 1/2), compute 3rd order Taylor polynomials, and analyze the behavior of
g(x, y) around these points.
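To see the formula of part (a) in action without giving the exercise away, here is a numerical sketch for an unrelated sample function, cos(x) cos(y) at (0, 0), with the gradient and Hessian entered by hand; near the center the quadratic Taylor polynomial should track the function closely.

```python
import math

# Illustrate T_{f,2}(r) = f(r0) + grad f(r0).(r - r0)
#                        + (1/2)(r - r0).(H(r0)(r - r0))
# for the sample function f(x, y) = cos(x)cos(y) at r0 = (0, 0).
def f(x, y):
    return math.cos(x) * math.cos(y)

r0 = (0.0, 0.0)
grad = (0.0, 0.0)                  # grad f = (-sin x cos y, -cos x sin y) = 0 at (0,0)
H = ((-1.0, 0.0), (0.0, -1.0))     # Hessian of f at (0,0)

def T2(x, y):
    dx, dy = x - r0[0], y - r0[1]
    quad = dx * (H[0][0]*dx + H[0][1]*dy) + dy * (H[1][0]*dx + H[1][1]*dy)
    return f(*r0) + grad[0]*dx + grad[1]*dy + 0.5 * quad

# The error f - T2 shrinks rapidly as we approach r0
# (for this even function, like the fourth power of the step).
for t in (0.1, 0.01):
    print(t, abs(f(t, t) - T2(t, t)))
```

Here T2(x, y) = 1 − (x² + y²)/2, which is the second order Taylor polynomial of cos(x)cos(y) at the origin.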
For exercises 4.8 and 4.9, the notations Hx,y (f ), Hu,v (f ) are used to emphasize which
variables we use to compute partial derivatives, and to match with the Jacobian notation
Dx,y (f ), Du,v (f ) that was introduced in section 1.5.
Exercise 4.8. Use the notion of Jacobian derivatives Dx,y f = [ ∂x f  ∂y f ] to show that the
Hessian matrix may be defined as Hx,y (f ) = Dx,y ((Dx,y f )ᵗ) = Dx,y ∇x,y f , where (Dx,y f )ᵗ is the
transpose of the row vector, so (Dx,y f )ᵗ = ∇x,y f .
In light of this alternate definition, sometimes the Hessian is denoted as D2 f or ∇∇f by geome-
ters (it should not be confused with ∇2 f , the Laplacian operator, which is a scalar valued function
rather than a matrix).
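This characterization of the Hessian as the Jacobian derivative of the gradient can be mimicked numerically: approximate the gradient by central differences, then differentiate each component again. A Python sketch (the sample function x²y and the step sizes are arbitrary choices):

```python
# Sketch: the Hessian as the Jacobian derivative of the gradient,
# H_{x,y}(f) = D_{x,y}(grad f), approximated with central differences.
def grad(f, x, y, h=1e-5):
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (gx, gy)

def hessian(f, x, y, h=1e-4):
    # Differentiate each gradient component in x and in y.
    rows = []
    for i in (0, 1):
        dgdx = (grad(f, x + h, y)[i] - grad(f, x - h, y)[i]) / (2 * h)
        dgdy = (grad(f, x, y + h)[i] - grad(f, x, y - h)[i]) / (2 * h)
        rows.append((dgdx, dgdy))
    return rows

# Sample: f(x, y) = x^2 y has fxx = 2y, fxy = fyx = 2x, fyy = 0,
# so at (1, 2) the Hessian is approximately [[4, 2], [2, 0]].
f = lambda x, y: x**2 * y
print(hessian(f, 1.0, 2.0))
```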
Exercise 4.9. Let G be the coordinate transformation (u, v) 7→ (x, y) and Du,v G = ∂(x, y)/∂(u, v)
its Jacobian derivative matrix as defined in section 1.5 on the chain rule. Consider a function
f (x(u, v), y(u, v)) which is at least twice differentiable with respect to x and y, and assume x and
y are each at least twice differentiable with respect to u and v.
Let Dx,y f = [ ∂x f  ∂y f ] be the Jacobian derivative of f with respect to (x, y)-coordinates, and
let Du,v f = [ ∂u f  ∂v f ] be the Jacobian derivative of f with respect to (u, v)-coordinates. Let
Hx,y f denote the Hessian of f with respect to x and y coordinates, Hu,v f denote the Hessian of f
with respect to u and v coordinates, and Hu,v G denote the Hessian of the coordinate transformation
G, which can be regarded as a vector/block matrix whose 2 “entries” are the matrices Hu,v x and
Hu,v y.
∇V (x, y) = ⟨Vx , Vy ⟩ = ⟨4yz + 4xy ∂z/∂x , 4xz + 4xy ∂z/∂y⟩ = ⟨4yz + 4xy(−x/z) , 4xz + 4xy(−y/z)⟩ ,
where we used implicit differentiation of x² + y² + z² = 3 to obtain ∂x z = −x/z and ∂y z = −y/z.
Setting ∇V (x, y) = 0 gives the equations
0 = 4yz − 4x²y/z =⇒ yz² = x²y =⇒ z² = x² ,
0 = 4xz − 4xy²/z =⇒ xz² = xy² =⇒ z² = y² .
Since the point (x, y, z) denotes the corner where x, y, z > 0, we deduce x = y = z, and since
this point is on the sphere of radius √3, we have 3x² = 3, so x = y = z = 1. Thus the box is a half
cube [−1, 1] × [−1, 1] × [0, 1], so the dimensions are 2 by 2 by 1, and the volume is V = 4.
Observe that effectively, what we have done is compute an optimum of the function V (x, y, z) =
4xyz on the surface z = √(3 − x² − y²). For the space region between the plane and the hemisphere,
we found a maximum value; a local minimum in this region happens at the only critical point of
the (three variable) function V (x, y, z), namely the origin, and also occurs whenever any of the
coordinates takes a value of 0. The global minimum value of −4 occurs at the boundary points
(−1, 1, 1) and (1, −1, 1). We’ll see shortly that global extrema for such a compact domain, like a
solid ball or solid hemisphere, can either happen at interior critical points or at points along the
boundary.
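One can double-check the box computation above numerically: restricted to the hemisphere, V (x, y) = 4xy√(3 − x² − y²) should have value 4 and vanishing partials at (x, y) = (1, 1). A short Python check (the step size is an arbitrary choice):

```python
import math

# Check the box example: V(x, y) = 4xy*sqrt(3 - x^2 - y^2), the volume
# as a function of the corner's (x, y), has a critical point at (1, 1)
# with V = 4.
def V(x, y):
    return 4 * x * y * math.sqrt(3 - x**2 - y**2)

h = 1e-6
Vx = (V(1 + h, 1) - V(1 - h, 1)) / (2 * h)   # central-difference partials
Vy = (V(1, 1 + h) - V(1, 1 - h)) / (2 * h)
print(V(1, 1), Vx, Vy)   # volume 4 and two near-zero partials
```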
Exercise 4.10. Use calculus methods to prove that the minimum distance from a point P (x1 , y1 , z1 )
to a plane ax + by + cz + d = 0 is given by
D = |ax1 + by1 + cz1 + d| / √(a² + b² + c²) ,
and give the coordinates of the closest point. You should check your work using vector algebra
methods.
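For the vector algebra cross-check suggested in this exercise, one can also compare the formula numerically against a projection onto the unit normal; the specific plane and point below are arbitrary choices for illustration.

```python
import math

# Compare the point-to-plane distance formula
#   D = |a x1 + b y1 + c z1 + d| / sqrt(a^2 + b^2 + c^2)
# against a direct projection of P - Q onto the unit normal.
a, b, c, d = 1.0, 2.0, -2.0, 3.0       # plane x + 2y - 2z + 3 = 0 (arbitrary)
P = (2.0, -1.0, 4.0)                   # arbitrary point

D_formula = abs(a*P[0] + b*P[1] + c*P[2] + d) / math.sqrt(a*a + b*b + c*c)

# Vector method: pick any point Q on the plane (here we use a != 0)
# and project P - Q onto the unit normal n/|n| with n = (a, b, c).
Q = (-d / a, 0.0, 0.0)                 # satisfies a*Qx + d = 0
n = (a, b, c)
norm = math.sqrt(a*a + b*b + c*c)
D_proj = abs(sum((p - q) * ni for p, q, ni in zip(P, Q, n))) / norm

print(D_formula, D_proj)               # the two methods agree
```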
We are interested in the following more general questions at the mathematical root of simple
optimization problems:
• When is a function f (x, y) or F (x, y, z) guaranteed to have global extrema?
• How does one procedurally find global extrema, assuming they exist?
To answer these questions, we need a few topological preliminaries.
Definition 4.1. Let E ⊆ R2. A point r0 ∈ R2 is called a boundary point of E if every open disk
Bε (r0 ) = {r ∈ R2 : kr − r0 k < ε} centered at r0 contains points both in E and in the complement
of E. The boundary of E, denoted by ∂E, is the set of all boundary points:
∂E := {r0 ∈ R2 : for all ε > 0 , Bε (r0 ) ∩ E ≠ ∅ and Bε (r0 ) ∩ (R2 − E) ≠ ∅} .
Definition 4.2. The interior of E ⊆ R2 is the set of points of E which are not boundary points:
int E := E − ∂E.
Definition 4.3. A set E ⊆ R2 is called closed if it contains all of its boundary points: E closed
⇐⇒ ∂E ⊆ E.
Definition 4.4. A set E ⊂ R2 is called bounded if there exists a disk D such that E ⊆ D.
Remark 4.1. If we replace open disks with open balls, the above definitions generalize to subsets
of R3 or even Rn .
Intuitively, boundary points are at the “edge” of the set; if the set is a contiguous region in R2 ,
then the boundary is the collection of curves delineating the transition from “within” to “outside”.
Sets which are bounded intuitively don’t “run off to infinity”. Sets which are closed and bounded
are often called compact; in the plane these are regions of finite area with boundaries that are
(possibly several) closed curves.
Exercise 4.11. Draw pictures and indicate boundaries and interior for each of the following sets,
and argue the corresponding claims:
(a) The “closed unit disk” D = {r ∈ R2 : krk ≤ 1}; D is closed and bounded by the above
definitions.
(b) The region E = {(x, y) : xy ≤ 1}; E is closed but unbounded by the above definitions.
(c) The “punctured disk” D∗ := D − {0}; D∗ is bounded but is neither closed nor open (see
the next exercise if you forgot the definition of open sets in R2 ).
Exercise 4.12. Recall, a set E ⊆ R2 is called open if around every point r0 ∈ E, there is an open
disk Bε (r0 ) for some sufficiently small ε > 0 such that Bε (r0 ) ⊂ E. Prove the following using this
definition for open sets and the above definitions for boundary points, interior points, and closed
sets.
(a) The boundary ∂E is the complement of the interior in the closure E ∪ ∂E: ∂E = (E ∪ ∂E) − int E,
(b) The interior is the set of points which are everywhere surrounded by other interior points:
r0 ∈ int E if and only if there exists a disk Bε (r0 ) for some ε > 0 such that Bε (r0 ) ⊂ E,
(c) A set is closed in R2 if and only if it is the complement in R2 of an open set: E closed if
and only if there is an open set U ⊂ R2 such that E = R2 − U ,
(d) A set U ⊆ R2 is open if and only if it equals its interior: U open if and only if U = int U ,
i.e., if and only if U ∩ ∂U = ∅,
The reason for introducing these topological ideas is that the question of the existence of absolute
extrema depends upon topological properties of the domain and the function. Namely, we have the
following version of the extreme value theorem:
Theorem 4.2 (Extreme Value Theorem for bivariate functions). A function f (x, y) continuous on
a closed and bounded (i.e., compact) domain D ⊂ R2 attains an absolute maximum value f (r1 ) for
some point r1 ∈ D and an absolute minimum f (r2 ) for some point r2 ∈ D.
We won’t prove this version of the extreme value theorem as it involves rigorously demonstrating
the claims below about sequences in compact sets. We remark that its generalization holds: for
appropriate definitions of compact and continuous, it is always true that a continuous R-valued
function defined on a compact domain K attains an absolute maximum value and an absolute
minimum value for some inputs in K. We will use the abbreviation EVT to refer to any such result;
context should make clear whether we are dealing with bivariate functions, trivariate functions, or
some other case.
Though we won’t prove the result, we make a few remarks about why topology comes up. Conti-
nuity is essentially a topological condition relating the domain and the function4, and compactness
is a topological condition on the domain itself. The intuition is that D being closed and bounded
for continuous f prevents the function’s values from “running away” indefinitely:
(i) because f is continuous, sequences of points in D that converge to a position in D produce
convergent limits of values of f ,
(ii) boundedness of D means no sequence of inputs can run off to infinity, with values of f
becoming arbitrarily large or small,
(iii) because D is closed, sequences in D that converge must converge to points within D, where
f is defined, so in particular sequences converging to boundary points yield definite limits
of values of f ;
(iv) compactness of D implies the boundary is itself compact, so our reasoning here and above
extends to show that sequences of values of the function produced from convergent sequences
within the boundary are also well behaved, and so in particular by reapplying EVT, there
is a well defined boundary extrema problem whose solutions exist (though it may be difficult
to find them),
(v) putting all these ideas together, there is no way for the value of f to increase or decrease
indefinitely along any path or sequence in D, and so there must be some value which is
largest, and some value which is smallest, and these may happen at interior critical points
or somewhere along the boundary.
The main application of EVT is that, together with Fermat’s theorem on critical points and
local extrema, it suggests and guarantees the legitimacy of the following procedure to find absolute
extrema.
⁴Recall that on page 6 of these notes continuity throughout a domain is rephrased in the context of open sets
and pre-images. Point-set topology concerns itself with the minimum structures on sets necessary to define, analyze,
and infer continuity properties of functions; the first step is to create a coherent notion of open sets which defines
“a topology” on the set of interest. Then concepts of connectedness, compactness, boundaries, and interiors are all
definable as topological notions, determined as properties intrinsic to a set endowed with a given topology, that is, a
set given a coherent notion of which subsets are to be regarded as open subsets.
Example 4.9. Find the absolute maximum and minimum values of f (x, y, z) = 2x + y − 2z on the
closed unit ball B = {r : krk ≤ 1}.
Observe that f is linear, and hence its 3D gradient is never 0. Thus, the extreme values must
occur on the boundary sphere x² + y² + z² = 1. We can implicitly differentiate f restricted to the
boundary to obtain the (x, y)-gradient
∇f (x, y, z(x, y)) = ⟨2 − 2∂x z, 1 − 2∂y z⟩ = ⟨2 + 2x/z, 1 + 2y/z⟩ .
Figure 22. The function f (x, y) = 2x2 y 2 −x2 −y 2 +1 graphed over the unit square
[0, 1]2 has one saddle over the interior of the square, attains a boundary maximum
value of 1 at opposite corners (0, 0) and (1, 1), and attains a boundary minimum
value of 0 at the remaining opposite corners (1, 0) and (0, 1). The boundary extrema
give the absolute extrema over the square in this case.
Then ∇f (x, y, z(x, y)) = 0 if and only if, for z ≠ 0, x = −z = 2y. Substituting into the equation
of the sphere and solving, one has x² + x²/4 + x² = 1 =⇒ 9x² = 4, so x = ±2/3, y = ±1/3 and
z = ∓2/3. The maximum is thus f (2/3, 1/3, −2/3) = 3 and the minimum is f (−2/3, −1/3, 2/3) =
−3.
Observe that these points are precisely points where the planes f (x, y, z) = 2x + y − 2z = ±3 are
tangent to the sphere. In this case, had we realized that these optima occur where level surfaces
of f are tangent to the constraint surface, we could have used elementary geometry of spheres
and planes to locate these points: indeed they are given as the positive and negative unit vectors
parallel to the gradient of f . In the next section we will exploit the relationship between tangencies
of level sets and constrained extrema to give another method to solve such constrained optimization
problems.
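A quick numerical check of this example: sampling the unit sphere should produce values of f between −3 and 3, with the bounds attained at ±⟨2/3, 1/3, −2/3⟩. (The sampling scheme below is an arbitrary illustrative choice.)

```python
import math
import random

# Sample f(x, y, z) = 2x + y - 2z on the unit sphere and compare to
# the extreme values found in example 4.9.
def f(x, y, z):
    return 2*x + y - 2*z

random.seed(0)
vals = []
for _ in range(20000):
    # Uniform direction on the sphere via a normalized Gaussian sample.
    v = [random.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c*c for c in v))
    x, y, z = (c / n for c in v)
    vals.append(f(x, y, z))

print(max(vals), min(vals))                     # close to 3 and -3
print(f(2/3, 1/3, -2/3), f(-2/3, -1/3, 2/3))    # essentially 3 and -3
```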
Exercise 4.14. Find the points on the surface S ⊂ R3 with equation xy+xz+yz = 1 that are closest
to the origin (0, 0, 0), and explain why these give maxima of the function f (x, y, z) = xy + xz + yz
inside the closed ball whose radius is the minimum distance D from (0, 0, 0) to the surface.
Exercise 4.15. If you are traveling so that higher terrain is to your left, then are you ascending
or descending if the contour crosses the trail from right to left? Explain why, using the language of
gradients and directional derivatives.
As we follow along, eventually and perhaps often, the trail goes from ascending to descending or
descending to ascending, and correspondingly, on the map the trail goes from crossing contours left
to right to crossing them right to left, or vice versa. We can argue by Rolle’s theorem or the mean
value theorem that there must be a critical point for the height function along the curve. But how
does this relate to the geometry of the trail curve and contours on the map?
Figure 23. For continuously differentiable bivariate functions f and g, the critical
points of the restriction of the height function z = f (x, y) to the curve g(x, y) = k
occur at points where the curve g(x, y) = k is tangent to some level curve of f (x, y).
There are two possibilities depending on how the trail behaves and how the map is drawn. It is
possible that there is a point where the trail is tangent to a contour. If this is the case, and for
some stretch prior to that point, the trail is ascending, and thereafter it is descending, then clearly
there is a local maximum altitude for the path attained at the point of tangency. However, the trail
could be briefly tangent and then continue ascending (or descending if it initially was descending),
or the map may simply not draw the level curve to which the trail is eventually tangent. However,
provided the trail and contours are sufficiently smooth curves (at least differentiable), it should be
clear that when the derivative of altitude changes sign there is some kind of tangency between the
trail and a level curve.
Then to find the highest altitude, one simply finds the points where the trail is tangent to level
curves, and then looks for the highest level curve where this happens. Note that if the curves aren't
smooth, we also have to check any places where the trail has corners or cusps, and any place where
a contour with a corner or cusp meets the trail (even if the trail is smooth there).
This procedure can be turned into a mathematically rigorous way to find extrema of a two
variable function f (x, y) constrained along a curve given implicitly by g(x, y) = k for some constant
k. Think of g(x, y) = k as describing the trail, and z = f (x, y) being the altitude function. Since
gradients are perpendicular to level curves, and the constraint curve g(x, y) = k is just the level
curve g −1 ({k}), we deduce that tangency points between the constraint curve and a level curve of
f happen at a point r0 = hx0 , y0 i if and only if the gradients ∇f (r0 ) and ∇g(r0 ) are parallel. Thus,
for some constant λ, such a point r0 satisfies the (nonlinear) system of equations
∇f (r0 ) = λ∇g(r0 ) ,
g(r0 ) = k .
The constant λ is called the Lagrange multiplier associated to r0 . Note that crit f is precisely the
points satisfying the gradient condition when λ = 0, but it is possible that crit f is disjoint from
the curve g(r) = k. Different constrained critical points may correspond to different λ values.
Example 4.11. We will find the maximum and minimum values of z = f (x, y) = x2 + 4y 2 along
the unit circle x2 + y 2 = 1. Note that we could do this via a parameterization; instead we will use
g(x, y) = x2 + y 2 = 1 as a constraint curve and apply the method of Lagrange multipliers. The
equations we need to solve are
∇f (x, y) = h2x, 8yi = λh2x, 2yi = λ∇g(x, y) ,
x2 + y 2 = 1 .
From this we get
2x = 2λx
2y = 8λy
x2 + y 2 = 1 .
Note that λ = 0 allows the first two equations to be solved by (0, 0), but this point is not on
the circle x2 + y 2 = 1. In fact, this corresponds to the unique solution to ∇f (x, y) = 0; in general
solutions to the Lagrangian equations in the case λ = 0 recover the critical points of f which also
satisfy the constraint.
If we then assume λ 6= 0, it is clear that either y = 0, in which case λ = 1 and x = ±1, or x = 0
in which case λ = 1/4 and y = ±1.
The extrema of f along the unit circle thus occur where the circle meets the axes: the maximum is
f (0, ±1) = 4, and the minimum is f (±1, 0) = 1. See figure 8 in the chain rule section (page 17) for
a visualization of the function f (x, y) evaluated along x² + y² = 1.
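Since this example notes that a parameterization would also work, here is that check in Python: evaluating f (cos t, sin t) on a fine grid should reproduce the extreme values 1 and 4.

```python
import math

# Check example 4.11 by parameterizing the unit circle:
# f(cos t, sin t) = cos^2 t + 4 sin^2 t ranges over [1, 4].
def f(x, y):
    return x**2 + 4 * y**2

vals = [f(math.cos(t), math.sin(t))
        for t in (2 * math.pi * k / 100000 for k in range(100000))]
print(min(vals), max(vals))   # approximately 1 and 4
```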
Exercise 4.16. Rework example 4.7 using the method of Lagrange multipliers. Show that these
dimensions also minimize the surface area of the open box for the fixed volume V = 4, again via
Lagrange multipliers.
Exercise 4.17. Consider an ellipse defined by the equation (x/a)² + (y/b)² = 1, where a > b > 0
are the lengths of the semi-major and semi-minor axes, respectively. Find the maximum area, in
terms of a and b, of a rectangle inscribed in the ellipse, and give the coordinates of its corners.
Similarly find the maximum perimeter of an inscribed rectangle, and give the coordinates of its
corners.
Exercise 4.18. Suppose you want to make an (open) cone out of paper. If you want the cone
to have a volume of 4π/3, then what would be the optimum radius and height to minimize the
surface area of the cone? Recall that the area of an open cone with radius r and height h is
A(r, h) = πr√(r² + h²).
Suppose we wanted to study optimization with two constraints. For example, perhaps we want
to optimize a function f (x, y, z) subject to constraints g(x, y, z) = k and h(x, y, z) = l for constants
k and l. Geometrically, this corresponds to optimizing f along a curve again, this time realized as
a curve of intersection of the two implicit surfaces provided by the constraints g(x, y, z) = k and
h(x, y, z) = l. At an optimum (either maximum or minimum) along the curve, the curve will be
tangent to a level surface of f . But then the gradient of f must be perpendicular to the curve. But
then it follows that the gradient of f is a linear combination of the gradients of g and h, which are
also perpendicular to this curve of intersection. Thus, the two constraint Lagrangian equations are
∇f (x, y, z) = λ∇g(x, y, z) + µ∇h(x, y, z) ,
g(x, y, z) = k ,
h(x, y, z) = l ,
where λ and µ are both Lagrange multipliers.
Example 4.12. Consider the curve of intersection of the cylinder x2 + y 2 = 5 and the plane
6x − 3y + 2z = 5. Find the maximum straight line distance from (0, 0, 0) to the curve, and give the
points along the curve where this distance occurs.
We can let f (r) = r · r be the square distance, for if we maximize distance we also maximize its
square. Our constraints are the cylinder equation g(x, y, z) = x2 + y 2 = 5 and the plane equation
h(x, y, z) = 6x − 3y + 2z = 5. The Lagrangian equations are then
2x = 2λx + 6µ
2y = 2λy − 3µ ,
2z = 2µ ,
5 = x2 + y 2
5 = 6x − 3y + 2z .
The third equation tells us that z = µ, whence from the first two equations we have (1 − λ)x =
3z = 2(λ − 1)y. Then either λ = 1 or x = −2y. Note that λ = 1 then requires z = 0, which gives
x = (5 + 3y)/6 from the equation of the plane. The two points we get from plugging this into the
cylinder’s equation correspond to the minimum square distance of 5 (we leave it as an exercise to
find these two points and show this). For the maximum square distance, we then look at the case
where x = −2y. Substituting into the cylinder equation first gives 4y 2 + y 2 = 5 =⇒ y = ±1,
whence x = ∓2. Substituting these into the plane equation gives
h(∓2, ±1, z) = ∓15 + 2z = 5 =⇒ z = 10 or − 5 .
The corresponding square distances are
f (−2, 1, 10) = 105 , and f (2, −1, −5) = 30 .
Thus the maximum distance from (0, 0, 0) to the curve is √105, which occurs at the point (−2, 1, 10).
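This two-constraint computation is easy to cross-check numerically: parameterize the cylinder and solve the plane equation for z. A Python sketch (the grid resolution is an arbitrary choice):

```python
import math

# Check example 4.12: on the curve where x^2 + y^2 = 5 meets the plane
# 6x - 3y + 2z = 5, the squared distance to the origin peaks at 105.
def sq_dist(t):
    x = math.sqrt(5) * math.cos(t)
    y = math.sqrt(5) * math.sin(t)
    z = (5 - 6*x + 3*y) / 2            # solve the plane equation for z
    return x*x + y*y + z*z

vals = [sq_dist(2 * math.pi * k / 200000) for k in range(200000)]
print(max(vals), min(vals))            # near 105 and 5
```

The minimum squared distance of 5 mentioned in the example (the λ = 1 case) also shows up here, since along the curve the squared distance is 5 + z².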
Exercise 4.19. Find the minimum value of f (x, y, z) = x2 + 4y 2 + 9z 2 on the intersection of the
hyperboloids 4x2 + y 2 − 9z 2 = 1 and 9x2 − 4y 2 − z 2 = 1. Explain why there is no maximum value
of f along this intersection locus.
Exercise 4.20. For the cylinders x2 + y 2 = 1 and y 2 + z 2 = 4/9, find the minimum positive value
of the x coordinate along the intersection curve of the cylinders, and locate all the points where
this value occurs. Set up and solve the problem using the method of Lagrange multipliers with two
constraints.
Note that without loss of generality, we may assume constraints have the form g(r) = 0, as we
may always arrange the equations of a constraint set with all terms on one side. The next theorem
rephrases the idea of Lagrange multipliers for multiple constraints in the language of optimizing a
single function, called a Lagrangian.
Theorem 4.3. Let f (r) be a differentiable multivariate function defined on a domain D ⊆ Rn, and
suppose g1 , . . . , gk are differentiable functions on D determining a set of k < n constraint equations
{gi (r) = 0}. Let Λ : D × Rk → R be the Lagrangian function given by
Λ(r, λ) = f (r) − λ · G(r) ,
where G(r) = ⟨g1 (r), . . . , gk (r)⟩.
Then the absolute maximum and minimum values of f (r) subject to the constraints {gi (r) = 0},
assuming they exist, occur at points r corresponding to points (r, λ) ∈ crit (Λ) such that the extreme
values of Λ(r, λ) give the extreme values of f (r). For a critical point (r0 , λ0 ) ∈ crit (Λ), the vector
λ0 gives the k Lagrange multipliers λi,0 , i = 1, . . . , k, such that ∇f (r0 ) = ∑_{i=1}^{k} λi,0 ∇gi (r0 ) holds.
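To connect the theorem back to example 4.11: taking Λ(x, y, λ) = x² + 4y² − λ(x² + y² − 1), the gradient of Λ should vanish at the constrained extrema found there, with multipliers λ = 1 and λ = 4. A quick check:

```python
# The Lagrangian for example 4.11,
#   L(x, y, lam) = x^2 + 4y^2 - lam*(x^2 + y^2 - 1),
# has critical points at (+-1, 0) with lam = 1 and (0, +-1) with lam = 4.
def grad_L(x, y, lam):
    dLdx = 2*x - 2*lam*x
    dLdy = 8*y - 2*lam*y
    dLdlam = -(x*x + y*y - 1)          # recovers the constraint equation
    return (dLdx, dLdy, dLdlam)

for point in [(1, 0, 1), (-1, 0, 1), (0, 1, 4), (0, -1, 4)]:
    print(point, grad_L(*point))       # each gradient is (0, 0, 0)
```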
Exercise 4.21. Prove the above theorem. Hint: first argue that the gradient of f must be a
linear combination of the gradients of the gi , as we did above in the case of three variables and two
constraints. Then show that the critical points of the Lagrangian correspond to the Lagrange
multiplier equations and constraint equations, and that the corresponding values correspond to the
constrained local extrema of f . In particular, you should be able to argue that there is a one-to-one
correspondence between critical points of Λ and the collection of r such that r either solves the
Lagrange multipliers system and constraint equations, or r ∈ crit f .
Exercise 4.22. Under what conditions on f , D, and the constraints {gi (r) = 0} can we infer the
existence of maximum and minimum solutions to the constrained optimization problem? (Hint:
consider the theorem above and answer the corresponding question about existence of absolute
extrema for the Lagrangian function.)
5. Further Problems
Exercises 5.1, 5.2, and 5.3 are cross-posted from the notes on Curvature, Natural
Frames, and Acceleration for Plane and Space Curves, and rely on definitions in those
notes.
Exercise 5.1. Let f (r) be a differentiable function defined on a domain D ⊆ R2. By f (r, θ) we mean
f evaluated at the point with position r = r ûr (θ) = r cos(θ)ı̂ + r sin(θ)̂. Express the gradient of f
in polar coordinates, meaning, describe the operator ∇ in terms of ûr and ûθ by giving functions
u(r, θ) and v(r, θ) such that
∇ = u(r, θ) ûr ∂/∂r + v(r, θ) ûθ ∂/∂θ
and so that ∇f (r(x, y), θ(x, y)) = (∂f/∂x) ı̂ + (∂f/∂y) ̂ for all points (r(x, y), θ(x, y))P = (x, y)C in D.
Exercise 5.2. Express the gradient operator in spherical coordinates (see the previous problem
for the two dimensional, polar version of this problem.)
Exercise 5.3. Compute the gradients of the coordinate functions for spherical coordinates, i.e.
compute ∇%, ∇θ and ∇ϕ. Express the answers in both the spherical frame and the rectangular
frame.
Exercise 5.4. Recall the spherical coordinate system described in the notes Curvature, Natural
Frames, and Acceleration for Plane and Space Curves (see pages 11-15). The transformation from
the rectangular coordinates (x, y, z)R on R3 to these spherical coordinates (%, θ, ϕ)S was given as
x = % cos θ cos ϕ , y = % sin θ cos ϕ , z = % sin ϕ ,
where % ∈ [0, ∞), θ ∈ (−π, π], and ϕ ∈ [−π/2, π/2].
(a) Compute the Jacobian matrices
D%,θ,ϕ G = ∂(x, y, z)/∂(%, θ, ϕ) ,  Dx,y,z X = ∂(%, θ, ϕ)/∂(x, y, z) ,
and verify that these matrices are inverses. (Note that they are 3 × 3 matrices.)
(b) Express the chain rule for a scalar function f (x(%, θ, ϕ), y(%, θ, ϕ), z(%, θ, ϕ)) with respect
to the spherical variables, using the Jacobians computed above.
(c) Use the chain rule from part (b) to compute the partials f% , fθ and fϕ , where
f (x, y, z) = 1/(x² + y² + z²) .
(d) For f the function in part (c), compute df/dt along the curve
r(t) = (2 + cos(3t)) ûr (2t) + sin(3t) k̂ ,
where
ûr (2t) = cos(2t) ı̂ + sin(2t) ̂ = (([x(t)]² − [y(t)]²) ı̂ + 2x(t)y(t) ̂) / ([x(t)]² + [y(t)]²) .
Exercise 5.5. This problem explores partial derivatives and directional derivatives of multivariable
vector-valued functions. Let v : D → Rn be a multivariable vector function over a domain D in Rm .
E.g. for a 2-dimensional vector-valued function from D ⊂ R2 one has
v(x, y) = v1 (x, y)ı̂ + v2 (x, y)̂ ,
where vi (x, y), i = 1, 2 are two-variable functions from D to R. In this 2-dimensional case we define
∂v/∂x := (∂v1/∂x) ı̂ + (∂v2/∂x) ̂ ,
and analogously
∂v/∂y := (∂v1/∂y) ı̂ + (∂v2/∂y) ̂ .
One can also define a notion of directional derivative of a vector-valued function along a unit
vector: given v : D → Rn and a unit vector û ∈ Rm , define
Dû v(x) = lim_{h→0} (1/h) [v(x + hû) − v(x)] .
One can show that this can be calculated as
Dû v(x) = ∑_{i=1}^{n} (û · ∇vi (x)) êi = (û · ∇) v(x) .
Here, êi are the coordinate basis vectors in Rn , which are analogous to ı̂, ̂, and k̂ (namely, êi has
entries equal to 0 for all coordinates other than the ith coordinate, which equals 1.)
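This formula can be spot-checked numerically against the limit definition; in the Python sketch below the sample field v(x, y) = (x²y, x + y²), the direction û, and the base point are all arbitrary illustrative choices.

```python
import math

# Spot-check D_u v(x) = sum_i (u . grad v_i) e_i for a sample planar
# field v(x, y) = (x^2 y, x + y^2) and unit vector u = (1, 1)/sqrt(2).
def v(x, y):
    return (x**2 * y, x + y**2)

u = (1 / math.sqrt(2), 1 / math.sqrt(2))
x0, y0 = 1.0, 2.0

# Limit definition, approximated with a small step h.
h = 1e-6
lim = tuple((a - b) / h for a, b in
            zip(v(x0 + h*u[0], y0 + h*u[1]), v(x0, y0)))

# Component formula with hand-computed gradients:
# grad v1 = (2xy, x^2), grad v2 = (1, 2y).
g1 = (2*x0*y0, x0**2)
g2 = (1.0, 2*y0)
formula = (u[0]*g1[0] + u[1]*g1[1], u[0]*g2[0] + u[1]*g2[1])

print(lim, formula)   # the two results agree to several digits
```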
Parts (a)–(c) focus on polar coordinates and two dimensions. Part (d) works in three dimensions,
with spherical coordinates.
(a) Let ûr (x, y) = (1/r)(x ı̂ + y ̂) and ûθ (x, y) = (1/r)(−y ı̂ + x ̂), where r² = x² + y², as in the
treatment of the polar frame in the notes on Curvature, Natural Frames, and Acceleration
for Plane and Space Curves. Compute all first and second partials with respect to x and y
of ûr and ûθ .
(b) Justify the two dimensional case of the above formula for the directional derivative of a
vector-valued function along a unit vector, i.e. use the limit definition of the directional
derivative above to show that
Dû v(x, y) = (û · ∇v1 (x, y)) ı̂ + (û · ∇v2 (x, y)) ̂ .
(c) Compute Dı̂ ûr , D̂ ûr , Dı̂ ûθ , D̂ ûθ , Dûr ûθ and Dûθ ûr .
(d) Let (û% , ûθ , ûϕ ) be the spherical frame (as in the notes Curvature, Natural Frames, and
Acceleration for Plane and Space Curves). Give û% , ûθ , and ûϕ as vector-valued functions
of x, y and z (the rectangular coordinates on R3), and compute Dı̂ û% , D̂ û% , Dk̂ û% , Dı̂ ûθ ,
D̂ ûθ , Dk̂ ûθ , Dı̂ ûϕ , D̂ ûϕ , Dk̂ ûϕ , Dû% ûθ , Dû% ûϕ , Dûθ û% , Dûθ ûϕ , Dûϕ û% , and Dûϕ ûϕ .
(Part (d) is only recommended for a certain sort of student, who really enjoys/needs to
take lots of partial derivatives, and finds it soothing to do so.)