Partial Derivatives
Tom Hebdige
Semester 2 – 2024/25
Contents
1 Functions of multiple variables
2 Geometric representations
2.1 Graphs
2.2 Level sets
3 Partial derivatives
5 Chain rule
6 The gradient
6.1 Directional derivatives
6.2 Geometric interpretation of gradient
7 Vector fields
7.1 Vector fields from scalar fields
8 Implicit differentiation
8.1 Implicitly defined functions — one variable
8.2 Implicitly defined functions — two variables
1 Functions of multiple variables
So far in calculus, you have focused on functions of a single variable, specifically real-valued functions
of the form
\[ f : D \to \mathbb{R}, \qquad x \mapsto f(x), \]
where the domain D ⊆ R. Although there are many instances where this applies, physical quantities
will often depend on multiple variables.
Example:
The height above sea level at a position with coordinates (x, y) can be represented by a function
h(x, y). This function takes coordinates (x, y) as input, and returns the height h(x, y) at those
coordinates.
Multivariable function
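A real-valued function of n variables with domain $D \subseteq \mathbb{R}^n$ is
\[ f : D \to \mathbb{R}, \qquad (x_1, \dots, x_n) \mapsto f(x_1, \dots, x_n). \]
• $\mathbb{R}^n = \underbrace{\mathbb{R} \times \cdots \times \mathbb{R}}_{n \text{ copies}}$, i.e. the Cartesian product of n copies of $\mathbb{R}$.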
• This naturally generalises to complex-valued functions by replacing the range by C. In this module,
we focus on real-valued functions.
• Unless there are more pertinent variable names for a given application, we often write f (x, y) for
a function of two variables, and f (x, y, z) for a function of three variables.
In this part of the module, we will focus on functions of two variables, but the ideas naturally generalise
to more variables.
Explore:
2 Geometric representations
Just as drawing the graph of a function of one variable provides useful insight into the behaviour of
that function, it is often helpful to represent functions of multiple variables geometrically.
2.1 Graphs
Graph of a function
Example:
If $f(x, y) = \sqrt{1 - x^2 - y^2}$ is defined on the domain $D = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$, then the
graph of f is a hemisphere.
Figure 1: Surface defined by $f(x, y) = \sqrt{1 - x^2 - y^2}$. You will learn how to plot these yourself in the
computer practical.
Explore:
Level set
If f is a function of two variables, then a level set for f is the set
\[ \{(x, y) \in \mathbb{R}^2 \mid f(x, y) = c\} \]
for some constant c ∈ R. Provided f is a nice enough function, its level set will be a smooth
curve and thus called a level curve.
• These should be familiar as contour lines on maps. If x is longitude and y is latitude, and f (x, y)
is the height of the point above sea level, then contour lines on a map are the level curves of f .
• The level set for a function of two variables lives in two dimensions, while the graph lives in three
dimensions.
Example:
Figure 2: Contour plot for $f(x, y) = x^2 + y^2$ with contours representing values f(x, y) =
0, 10, 20, 30, 40, 50. You will learn how to create such plots in the computer practical.
Level sets generalise naturally to functions of three variables: a level set of F(x, y, z) is the set
\[ \{(x, y, z) \in \mathbb{R}^3 \mid F(x, y, z) = c\} \]
for some constant c ∈ R. If this is a smooth surface, then it is usually called a level surface.
Explore:
• In the above example with $f(x, y) = x^2 + y^2$, what is the level set for c < 0?
• How would you extend the level set definition to a function of four variables?
• Level sets are the basis for numerical analysis of surfaces, called the ‘level set method’.
• Can you think of any other ways to geometrically represent functions of more than one
variable?
3 Partial derivatives
We know how to find the rate of change (‘slope’) of a function of one variable; we find its derivative:
\[ \frac{df}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
The situation is more complicated with more variables because the function can have different slopes
in different directions. Consider a function f (x, y) of two variables. To learn about the slope of the
function in the x-direction at a point (x0 , y0 ):
1. Take a slice through the domain by setting y = y0. The function along this slice is f(x, y0), which
is now a function of x only.
2. Differentiate this function of x in the usual way and evaluate the result at x = x0.
Similarly, the slope in the y-direction is found by fixing x and differentiating with respect to y.
Partial derivative
For a function f (x, y):
• the partial derivative of f with respect to x is
\[ \frac{\partial f}{\partial x} := \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \]
• the partial derivative of f with respect to y is
\[ \frac{\partial f}{\partial y} := \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h} \]
assuming the limits exist.
• The symbol “ ∂ ” is a curly “d” pronounced “partial”.
• $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are both functions of x and y.
• To compute $\frac{\partial f}{\partial x}$, apply the usual rules of differentiation to x while treating y as a constant.
Example:
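The partial derivatives of the function $f(x, y) = x^2 + xy + y$ are
\[ \frac{\partial f}{\partial x} = 2x + y \quad\text{and}\quad \frac{\partial f}{\partial y} = x + 1. \]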
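The limit definition can be checked numerically. The following is a minimal sketch, assuming the function $f(x, y) = x^2 + xy + y$ that the notes use for the second-order example; the helper names `partial_x` and `partial_y`, the step size `h` and the test point are illustrative choices, and a central difference is used in place of the one-sided quotient in the definition because it is more accurate for a fixed step.

```python
def f(x, y):
    # Example function: f(x, y) = x^2 + x*y + y
    return x**2 + x * y + y


def partial_x(func, x, y, h=1e-6):
    # Central difference in x, with y held constant
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def partial_y(func, x, y, h=1e-6):
    # Central difference in y, with x held constant
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


x0, y0 = 1.5, -2.0  # arbitrary test point
print("df/dx:", partial_x(f, x0, y0), "formula 2x + y:", 2 * x0 + y0)
print("df/dy:", partial_y(f, x0, y0), "formula x + 1:", x0 + 1)
```

The numerical values should agree with the formulas 2x + y and x + 1 up to rounding error.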
Alternative notations
Unfortunately partial derivatives have a variety of notations:
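\[ \frac{\partial f}{\partial x} \equiv \partial_x f(x, y) \equiv f_x(x, y) \equiv f_1(x, y) \]
\[ \frac{\partial f}{\partial y} \equiv \partial_y f(x, y) \equiv f_y(x, y) \equiv f_2(x, y). \]
I will largely stick to the $\frac{\partial f}{\partial x}$ notation in these notes.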
Explore:
• The definition of partial derivatives naturally extends to functions of more variables. See if
you can write down the definitions for functions of three variables.
• Read Thomas 13.3
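\[ \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} \quad\text{and}\quad \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y}. \]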
Example:
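The second order partial derivatives of the function $f(x, y) = x^2 + xy + y$ from the previous
example are
\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}(2x + y) = 2, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}(2x + y) = 1, \]
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}(x + 1) = 1, \qquad \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}(x + 1) = 0. \]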
Alternative notations
Again, there are multiple notations for second-order partial derivatives:
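\[ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = f_{xx} = f_{11}(x, y) \]
\[ \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = f_{yy} = f_{22}(x, y) \]
\[ \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = f_{xy} = (f_x)_y = f_{12}(x, y) \]
\[ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = f_{yx} = (f_y)_x = f_{21}(x, y) \]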
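In the second-order example above, the two mixed partial derivatives are equal:
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}. \]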
According to the following theorem, this property holds for “sufficiently nice” functions.
Theorem (Clairaut)
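If a function f(x, y) and its partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, $\frac{\partial^2 f}{\partial x\,\partial y}$, $\frac{\partial^2 f}{\partial y\,\partial x}$ are defined throughout an
open region containing the point (a, b) and they are all continuous at (a, b), then
\[ \frac{\partial^2 f}{\partial x\,\partial y}(a, b) = \frac{\partial^2 f}{\partial y\,\partial x}(a, b). \]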
Clairaut’s theorem (or the mixed derivative theorem) is a useful property since changing the order of
differentiation can sometimes make the calculation simpler.
Example:
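To find $\frac{\partial^2 f}{\partial x\,\partial y}$ for
\[ f(x, y) = xy + \frac{e^y}{y^2 + 1}, \]
we can either take the derivative with respect to y first, so that
\[ \frac{\partial f}{\partial y} = x + \frac{e^y}{y^2 + 1} - \frac{2y e^y}{(y^2 + 1)^2} \quad\text{then}\quad \frac{\partial^2 f}{\partial x\,\partial y} = 1, \]
whereas if we do the differentiation the other way around, first x then y, we have
\[ \frac{\partial f}{\partial x} = y \quad\text{so}\quad \frac{\partial^2 f}{\partial y\,\partial x} = 1. \]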
This second approach is clearly much simpler.
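As a numerical sanity check of this example, the sketch below differentiates $f(x, y) = xy + e^y/(y^2 + 1)$ in both orders using nested central differences. The helper names `d_dx` and `d_dy`, the step size and the test point are assumptions made for illustration; finite differences only approximate the derivatives, so this supports the result rather than proving it.

```python
import math


def f(x, y):
    # f(x, y) = x*y + e^y / (y^2 + 1), as in the example
    return x * y + math.exp(y) / (y**2 + 1)


def d_dx(func, h=1e-4):
    # Returns a function approximating the partial derivative in x
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)


def d_dy(func, h=1e-4):
    # Returns a function approximating the partial derivative in y
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)


f_xy = d_dy(d_dx(f))  # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))  # differentiate in y first, then in x

x0, y0 = 0.7, 0.3  # arbitrary test point
print("x then y:", f_xy(x0, y0))
print("y then x:", f_yx(x0, y0))
```

Both printed values should be close to 1, whichever order of differentiation is used.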
Clairaut’s theorem holds for functions of three (or even more) variables. It becomes ever more useful
if we need to calculate all partial derivatives of some higher order, since it reduces the number of
computations we need to carry out.
Explore:
5 Chain rule
Intuitively, a function f (x) of one variable is differentiable at a point x0 if it can be well-approximated
by a linear function in the region close to x0 , such that
\[ f(x_0 + \Delta x) \approx f(x_0) + \frac{df}{dx}(x_0)\,\Delta x \]
for small values of ∆x. This generalises to functions of two variables, but now rather than locally
approximating the function by a straight line, a function f (x, y) is differentiable at (x0 , y0 ) if it is
well-approximated by a plane, such that
\[ f(x_0 + \Delta x, y_0 + \Delta y) \approx f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0, y_0)\,\Delta y \]
for small values of ∆x and ∆y.
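To see what "well-approximated by a plane" means in practice, the following sketch compares a function with its plane approximation at shrinking offsets. It assumes the example function $f(x, y) = x^2 + xy + y$ with partial derivatives 2x + y and x + 1; the base point (1, 2) and the offsets are arbitrary illustrative choices.

```python
def f(x, y):
    # Example function: f(x, y) = x^2 + x*y + y
    return x**2 + x * y + y


def tangent_plane(x0, y0, dx, dy):
    # f(x0, y0) + f_x(x0, y0) * dx + f_y(x0, y0) * dy
    fx = 2 * x0 + y0
    fy = x0 + 1
    return f(x0, y0) + fx * dx + fy * dy


def approximation_error(x0, y0, dx, dy):
    # Absolute gap between the function and its plane approximation
    return abs(f(x0 + dx, y0 + dy) - tangent_plane(x0, y0, dx, dy))


for scale in (1.0, 0.1, 0.01):
    dx, dy = 0.1 * scale, 0.05 * scale
    print(scale, approximation_error(1.0, 2.0, dx, dy))
```

For this function the error is exactly |Δx² + Δx Δy|, so shrinking the offsets by a factor of 10 shrinks the error by a factor of about 100.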
From now on, we will only consider continuously differentiable functions unless stated otherwise.
Explore:
• For more about continuity and differentiability of multi-variable functions, see Thomas
sections 13.2 and 13.3.
• Do all continuously differentiable functions satisfy the conditions for Clairaut’s theorem?
In single variable calculus, given two differentiable functions f (x) and x(t), the derivative of their
composition F (t) = f [x(t)] is given by
\[ \frac{dF}{dt} = \frac{df}{dx}\,\frac{dx}{dt}. \]
This generalises to the multivariable case.
Let f (x, y) be a continuously differentiable function, and x(t) and y(t) be a pair of differentiable
functions. The derivative of the composite function F (t) = f [x(t), y(t)] with respect to the
variable t is given by
\[ \frac{dF}{dt}(t) = \frac{dx}{dt}(t)\,\frac{\partial f}{\partial x}[x(t), y(t)] + \frac{dy}{dt}(t)\,\frac{\partial f}{\partial y}[x(t), y(t)]. \]
Example:
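Let $f(x, y) = e^{xy}$, and also $x(t) = t^2$ and $y(t) = t$. The partial derivatives are given by
\[ \frac{\partial f}{\partial x}(x, y) = y e^{xy} \quad\text{and}\quad \frac{\partial f}{\partial y}(x, y) = x e^{xy}, \]
respectively, and upon substituting x(t) and y(t) they evaluate to
\[ \frac{\partial f}{\partial x}[x(t), y(t)] = t e^{t^3} \quad\text{and}\quad \frac{\partial f}{\partial y}[x(t), y(t)] = t^2 e^{t^3}. \]
Using
\[ \frac{dx}{dt}(t) = 2t \quad\text{and}\quad \frac{dy}{dt}(t) = 1, \]
the chain rule states that
\[ \begin{aligned} \frac{dF}{dt}(t) &= \frac{\partial f}{\partial x}[x(t), y(t)]\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}[x(t), y(t)]\,\frac{dy}{dt}(t) \\ &= t e^{t^3} \cdot 2t + t^2 e^{t^3} \cdot 1 = 3t^2 e^{t^3}. \end{aligned} \]
Directly differentiating $F(t) = e^{t^3}$ gives $F'(t) = 3t^2 e^{t^3}$ when using the chain rule for a function
of a single variable, thus confirming the result found from the multi-variable chain rule.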
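The same confirmation can be done numerically. This is a small sketch, assuming the example's choices $f(x, y) = e^{xy}$, $x(t) = t^2$ and $y(t) = t$; the function names and the test value of t are illustrative, and the central difference is only an approximation to the true derivative.

```python
import math


def x(t):
    return t**2


def y(t):
    return t


def F(t):
    # Composite function F(t) = f[x(t), y(t)] with f(x, y) = e^(x*y)
    return math.exp(x(t) * y(t))


def chain_rule_derivative(t):
    # dx/dt * df/dx + dy/dt * df/dy, evaluated along the curve
    fx = y(t) * math.exp(x(t) * y(t))  # df/dx = y e^(xy)
    fy = x(t) * math.exp(x(t) * y(t))  # df/dy = x e^(xy)
    return 2 * t * fx + 1 * fy


def numeric_derivative(func, t, h=1e-6):
    # Central difference approximation to dF/dt
    return (func(t + h) - func(t - h)) / (2 * h)


t0 = 0.8  # arbitrary test value
print("chain rule:       ", chain_rule_derivative(t0))
print("closed form:      ", 3 * t0**2 * math.exp(t0**3))
print("finite difference:", numeric_derivative(F, t0))
```

All three numbers should agree to several decimal places.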
For functions defined on surfaces, there is another generalisation of the chain rule.
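Let f(x, y, z) be a continuously differentiable function, and x(r, s), y(r, s) and z(r, s) all be
differentiable functions. Then the partial derivatives of f with respect to r and s are
\[ \frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r} \]
\[ \frac{\partial f}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial s}. \]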
Figure 4: Dependency diagram for variables in chain rule for functions defined on a surface.
Explore:
6 The gradient
The form of the chain rule from the previous section suggests considering the partial derivatives as
components of a vector.
Gradient
The gradient of a function of two variables f (x, y) is given by
\[ \nabla f := \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} \]
• The symbol ∇ is called the “nabla operator” (or just “nabla”), and we will also call it grad, so
∇f is “grad f ”.
Example:
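\[ \nabla f(x, y) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} = 4x(x^2 + y)\,\mathbf{i} + 2(x^2 + y)\,\mathbf{j}. \]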
The gradient allows us to write the chain rule in an elegant way using vector notation. Writing r(t) =
x(t) i + y(t) j, then the composite function F (t) = f [r(t)] has derivative
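\[ \begin{aligned} \frac{dF}{dt}(t) &= \frac{dx}{dt}(t)\,\frac{\partial f}{\partial x}[\mathbf{r}(t)] + \frac{dy}{dt}(t)\,\frac{\partial f}{\partial y}[\mathbf{r}(t)] \\ &= \left(\frac{dx}{dt}(t)\,\mathbf{i} + \frac{dy}{dt}(t)\,\mathbf{j}\right) \cdot \left(\frac{\partial f}{\partial x}[\mathbf{r}(t)]\,\mathbf{i} + \frac{\partial f}{\partial y}[\mathbf{r}(t)]\,\mathbf{j}\right) \\ &= \frac{d\mathbf{r}}{dt}(t) \cdot \nabla f[\mathbf{r}(t)], \end{aligned} \]
where
\[ \frac{d\mathbf{r}}{dt}(t) = \frac{dx}{dt}(t)\,\mathbf{i} + \frac{dy}{dt}(t)\,\mathbf{j}. \]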
This naturally generalises to more variables. For a function F (t) = f [r(t)] = f [x(t), y(t), z(t)] of three
variables, the gradient is
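\[ \nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}, \]
and the chain rule has the same form,
\[ \frac{dF}{dt}(t) = \frac{d\mathbf{r}}{dt}(t) \cdot \nabla f[\mathbf{r}(t)] = \frac{dx}{dt}(t)\,\frac{\partial f}{\partial x}[\mathbf{r}(t)] + \frac{dy}{dt}(t)\,\frac{\partial f}{\partial y}[\mathbf{r}(t)] + \frac{dz}{dt}(t)\,\frac{\partial f}{\partial z}[\mathbf{r}(t)]. \]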
Explore:
Directional derivative
If f is a function, r0 is a point, and u is a unit vector, then the directional derivative of f at r0
in the direction of u is
\[ D_{\mathbf{u}} f(\mathbf{r}_0) := \lim_{h \to 0} \frac{f(\mathbf{r}_0 + h\,\mathbf{u}) - f(\mathbf{r}_0)}{h}. \]
Explore:
• Recall how to find the norm of a vector. If you can’t remember, look back at your notes
from Foundations & Calculus, or Thomas 11.2.
• What is the definition of the directional derivative for a function of three variables?
• Read Thomas 13.5.
Directional derivatives can be computed from the gradient. The straight line through point r0 in the
direction u can be parameterised as r(h) = r0 + h u. Therefore the directional derivative can be written
as
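\[ D_{\mathbf{u}} f(\mathbf{r}_0) = \lim_{h \to 0} \frac{f[\mathbf{r}(h)] - f[\mathbf{r}(0)]}{h} = \left. \frac{d}{dh} f[\mathbf{r}(h)] \right|_{h=0}. \]
Applying the chain rule,
\[ D_{\mathbf{u}} f(\mathbf{r}_0) = \left. \frac{d\mathbf{r}}{dh} \cdot \nabla f[\mathbf{r}(h)] \right|_{h=0}. \]
Furthermore,
\[ \frac{d\mathbf{r}}{dh} = \frac{d}{dh}\left(\mathbf{r}_0 + h\,\mathbf{u}\right) = \mathbf{u}, \]
and $\mathbf{r}(0) = \mathbf{r}_0$, so
\[ D_{\mathbf{u}} f(\mathbf{r}_0) = \mathbf{u} \cdot \nabla f(\mathbf{r}_0). \]
Figure 5: Geometric interpretation of the directional derivative as the gradient of the tangent line to
the surface in the direction u. [Taken from Thomas]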
• The standard partial derivatives are the directional derivatives in the directions of the coordinate
axes, so
\[ \frac{\partial f}{\partial x}(a, b) = D_{\mathbf{i}} f(a, b) \quad\text{and}\quad \frac{\partial f}{\partial y}(a, b) = D_{\mathbf{j}} f(a, b), \]
with i = (1, 0) and j = (0, 1).
Example:
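Let $f(x, y) = \tan^{-1}(y/x)$ for x > 0 and y > 0. What is the directional derivative of this function
at (2, 1) in the direction of $\mathbf{u} = \left(\tfrac{3}{5}, \tfrac{4}{5}\right)$?
Recall that
\[ \frac{d}{dx}\left(\tan^{-1} x\right) = \frac{1}{1 + x^2}, \]
so
\[ \frac{\partial f}{\partial x} = \frac{1}{1 + y^2/x^2}\left(-\frac{y}{x^2}\right) = -\frac{y}{x^2 + y^2}, \qquad \frac{\partial f}{\partial x}(2, 1) = -\tfrac{1}{5}, \]
\[ \frac{\partial f}{\partial y} = \frac{1}{1 + y^2/x^2} \cdot \frac{1}{x} = \frac{x}{x^2 + y^2}, \qquad \frac{\partial f}{\partial y}(2, 1) = \tfrac{2}{5}, \]
\[ \nabla f(2, 1) = \left(-\tfrac{1}{5}, \tfrac{2}{5}\right), \]
and hence
\[ D_{\mathbf{u}} f(2, 1) = \mathbf{u} \cdot \nabla f(2, 1) = \left(\tfrac{3}{5}, \tfrac{4}{5}\right) \cdot \left(-\tfrac{1}{5}, \tfrac{2}{5}\right) = \tfrac{1}{5}. \]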
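This worked example can be checked against the limit definition of the directional derivative. The sketch below assumes $f(x, y) = \tan^{-1}(y/x)$, the point (2, 1) and the unit vector u = (3/5, 4/5) from the example; the function names and the step size are illustrative, and a central difference along u stands in for the limit.

```python
import math


def f(x, y):
    # f(x, y) = arctan(y / x), valid here because x > 0
    return math.atan(y / x)


def grad_f(x, y):
    # Gradient components derived in the example
    r2 = x**2 + y**2
    return (-y / r2, x / r2)


def directional_derivative(u, point):
    # D_u f = u . grad f
    gx, gy = grad_f(*point)
    return u[0] * gx + u[1] * gy


def directional_difference(u, point, h=1e-6):
    # Central difference of f along the line r0 + h u
    x0, y0 = point
    forward = f(x0 + h * u[0], y0 + h * u[1])
    backward = f(x0 - h * u[0], y0 - h * u[1])
    return (forward - backward) / (2 * h)


u = (3 / 5, 4 / 5)
point = (2.0, 1.0)
print("u . grad f:       ", directional_derivative(u, point))
print("finite difference:", directional_difference(u, point))
```

Both values should be close to 1/5, matching the hand calculation.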
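The scalar product of two vectors a and b satisfies
\[ \mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|\,\|\mathbf{b}\| \cos\alpha \]
where α is the angle between the two vectors. Given that the directional derivative can be expressed as
a scalar product,
\[ D_{\mathbf{u}} f(\mathbf{r}_0) = \mathbf{u} \cdot \nabla f(\mathbf{r}_0) = \|\mathbf{u}\|\,\|\nabla f(\mathbf{r}_0)\| \cos\alpha, \]
where α is the angle between the gradient ∇f and u. Furthermore, u is a unit vector, so $\|\mathbf{u}\| = 1$,
therefore
\[ D_{\mathbf{u}} f(\mathbf{r}_0) = \|\nabla f(\mathbf{r}_0)\| \cos\alpha. \]
• The directional derivative is maximised when α = 0, i.e. when u is parallel to ∇f. The maximum
rate of change is given by $\|\nabla f\|$.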
Theorem (Gradient)
The gradient ∇f (r0 ) is a vector that points in the direction of the steepest increase of the
function f at the point r0 , and its magnitude is equal to the rate of increase.
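A brute-force check of this statement: scan unit vectors u = (cos θ, sin θ) around the circle and record where u · ∇f is largest. The sketch assumes the function $f(x, y) = x^2 + xy + y$, whose gradient is (2x + y, x + 1); the point (1, 2), the number of samples and the function names are illustrative assumptions.

```python
import math


def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + x*y + y
    return (2 * x + y, x + 1)


def steepest_direction_by_scan(point, samples=3600):
    # Try evenly spaced unit vectors and keep the largest u . grad f
    gx, gy = grad_f(*point)
    best_angle, best_value = 0.0, -math.inf
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        value = math.cos(angle) * gx + math.sin(angle) * gy
        if value > best_value:
            best_angle, best_value = angle, value
    return best_angle, best_value


point = (1.0, 2.0)
gx, gy = grad_f(*point)
angle, value = steepest_direction_by_scan(point)
print("best angle:", angle, "angle of gradient:", math.atan2(gy, gx))
print("best rate: ", value, "norm of gradient: ", math.hypot(gx, gy))
```

Up to the resolution of the scan, the best direction lines up with the gradient and the best rate equals its norm.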
Explore:
Figure 6: Example showing the geometric interpretation of the gradient. [From Thomas]
7 Vector fields
Functions map elements of one space to elements of another space, and they are given specific names
depending on the dimension of their range.
Scalar field
A function $f : \mathbb{R}^n \to \mathbb{R}$ that maps points of the n-dimensional space $\mathbb{R}^n$ to real numbers $\mathbb{R}$ is
called a scalar field.
Example:
The function $F(x, y, z) = x^2 + y^2 + z^2$ defines a scalar field since it maps triples $(x, y, z) \in \mathbb{R}^3$
to real numbers $\mathbb{R}$.
Vector field
A function $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^n$ that maps points of the n-dimensional space $\mathbb{R}^n$ to other elements of
$\mathbb{R}^n$ is called a vector field.
Example:
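A vector field on $\mathbb{R}^2$ can be written as
\[ \mathbf{f}(x, y) = u(x, y)\,\mathbf{i} + v(x, y)\,\mathbf{j} \]
for some functions u and v. Both the input of f and its output are elements of the space $\mathbb{R}^2$,
i.e. they can be thought of as vectors,
\[ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 \quad\text{and}\quad \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} \in \mathbb{R}^2. \]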
Explore:
Example:
From the scalar field $F(x, y) = x^2 + y^2$, we can construct the vector field
\[ \mathbf{f}(x, y) = \nabla F(x, y) = 2x\,\mathbf{i} + 2y\,\mathbf{j}. \]
Importantly, the converse is false: not every vector field is the gradient of a scalar field. If a
vector field satisfies f = ∇F for some scalar field F, we say that the vector field f is a gradient, and that F is a
potential for f. The following result gives a necessary condition for a vector field in two dimensions to
be a gradient: if $\mathbf{f}(x, y) = u(x, y)\,\mathbf{i} + v(x, y)\,\mathbf{j}$ is a gradient, then
\[ \frac{\partial v}{\partial x} = \frac{\partial u}{\partial y}. \]
Example:
\[ \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = 3x^2 y - 3x^2 \neq 0. \]
Example:
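The vector field $\mathbf{f}(x, y) = (2xy + \cos x)\,\mathbf{i} + (x^2 - \sin y)\,\mathbf{j}$ could be a gradient because
\[ \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = 2x - 2x = 0. \]
To confirm, we try to find the potential. Assuming the potential ϕ exists,
\[ \frac{\partial \varphi}{\partial x} = u = 2xy + \cos x \tag{1} \]
\[ \frac{\partial \varphi}{\partial y} = v = x^2 - \sin y. \tag{2} \]
Integrating equation (1) with respect to x gives
\[ \varphi = x^2 y + \sin x + g(y). \]
Note that g(y) can be any function of y because it will vanish when differentiated with respect
to x. Substituting this into equation (2) gives
\[ \frac{\partial \varphi}{\partial y} = x^2 + g'(y) = x^2 - \sin y. \]
Hence $g'(y) = -\sin y$, so $g(y) = \cos y + C$, where C is a constant. Thus, the scalar field
\[ \varphi = x^2 y + \sin x + \cos y + C \]
is a potential for f, which confirms that f is a gradient.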
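One way to double-check a candidate potential is to differentiate it numerically and compare with the vector field. The sketch below assumes the potential $x^2 y + \sin x + \cos y + C$ and the field $(2xy + \cos x,\ x^2 - \sin y)$ from this example; the function names, the test point and the choice C = 0 are illustrative assumptions.

```python
import math


def phi(x, y, C=0.0):
    # Candidate potential: x^2 y + sin x + cos y + C
    return x**2 * y + math.sin(x) + math.cos(y) + C


def field(x, y):
    # Vector field components (u, v) from the example
    return (2 * x * y + math.cos(x), x**2 - math.sin(y))


def numeric_gradient(func, x, y, h=1e-6):
    # Central differences in x and y
    gx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    gy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (gx, gy)


x0, y0 = 0.4, -1.3  # arbitrary test point
print("numerical grad phi:", numeric_gradient(phi, x0, y0))
print("vector field:      ", field(x0, y0))
```

The two pairs of numbers should agree to several decimal places, and changing C has no effect on the gradient.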
Explore:
• The study of scalar potentials has widespread application in physics, especially in electro-
magnetism and fluid dynamics.
• These ideas will be taken much further in the second year module on Vector Calculus.
8 Implicit differentiation
8.1 Implicitly defined functions — one variable
Two ways to specify the formula of a function:
• explicitly, by a formula such as y = y(x);
• implicitly, by a relation of the form f(x, y) = c.
Geometric interpretation of implicit definition: f (x, y) = c is a level curve of a two variable function f .
The implicit definition simply declares this curve (or part of it) to be the graph of a new function of
one variable.
It is possible to determine the derivative of an implicitly defined function from the relation f (x, y) = c,
even without being able to write it as an explicit formula such as y = y(x).
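Differentiating both sides of f[x, y(x)] = c with respect to x gives
\[ 0 = \frac{d}{dx} f[x, y(x)] \]
because c is constant. Treating f[x, y(x)] as a composite function f[x(t), y(t)] with x(t) = t and
y(t) = y(x), the chain rule implies that
\[ \frac{d}{dt} f[x(t), y(t)] = \frac{dx}{dt}(t)\,\frac{\partial f}{\partial x}[x(t), y(t)] + \frac{dy}{dt}(t)\,\frac{\partial f}{\partial y}[x(t), y(t)]. \]
However, x(t) = t implies that $\frac{dx}{dt} = 1$, and hence
\[ 0 = \frac{d}{dx} f[x, y(x)] = \frac{\partial f}{\partial x}[x, y(x)] + \frac{dy}{dx}\,\frac{\partial f}{\partial y}[x, y(x)]. \]
Solving for $\frac{dy}{dx}$ gives
\[ \frac{dy}{dx} = -\frac{\dfrac{\partial f}{\partial x}(x, y)}{\dfrac{\partial f}{\partial y}(x, y)} \]
when $\frac{\partial f}{\partial y}(x, y) \neq 0$.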
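As an illustration of this formula, consider a case where the explicit form is also available so the two can be compared. The unit circle $x^2 + y^2 = 1$ used below is a hypothetical example chosen for this sketch, not one taken from these notes; on its upper half, $y(x) = \sqrt{1 - x^2}$.

```python
import math


def f_x(x, y):
    # Partial derivative of f(x, y) = x^2 + y^2 with respect to x
    return 2 * x


def f_y(x, y):
    # Partial derivative of f(x, y) = x^2 + y^2 with respect to y
    return 2 * y


def implicit_slope(x, y):
    # dy/dx = -(df/dx) / (df/dy), valid only where df/dy != 0
    if f_y(x, y) == 0:
        raise ZeroDivisionError("df/dy vanishes, so the formula does not apply")
    return -f_x(x, y) / f_y(x, y)


def y_explicit(x):
    # Upper half of the unit circle
    return math.sqrt(1 - x**2)


def numeric_slope(x, h=1e-6):
    # Central difference of the explicit formula
    return (y_explicit(x + h) - y_explicit(x - h)) / (2 * h)


x0 = 0.6
y0 = y_explicit(x0)  # y must be computed before the formula can be used
print("implicit formula: ", implicit_slope(x0, y0))
print("finite difference:", numeric_slope(x0))
```

Both values should be close to -0.75. Note that the implicit formula needs the value of y as well as x, and that it fails at (±1, 0), where ∂f/∂y = 0 and the tangent to the circle is vertical.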
Example:
Example:
Although $\frac{dy}{dx}$ is a function of x, this method does not (usually) give an explicit expression for the
derivative. To compute $\frac{dy}{dx}$ for a given value of x, we must first compute y (perhaps only approximately)
and then insert this value into the expression.
Explore:
• You may have seen implicit differentiation before. Does the above example match with
what you have seen before?
• Read Thomas 13.4
If we differentiate F (x, y, z(x, y)) = c with respect to x (while treating y as a constant), then the chain
rule gives
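\[ \begin{aligned} 0 = \frac{\partial}{\partial x} F(x, y, z(x, y)) &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} \\ &= \frac{\partial F}{\partial x} \cdot 1 + \frac{\partial F}{\partial y} \cdot 0 + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x}, \end{aligned} \]
therefore
\[ \frac{\partial z}{\partial x} = -\frac{\partial F / \partial x}{\partial F / \partial z}. \]
Similarly, differentiating F(x, y, z(x, y)) = c with respect to y (while treating x as a constant) gives
\[ 0 = \frac{\partial F}{\partial y} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial y} \quad\Longrightarrow\quad \frac{\partial z}{\partial y} = -\frac{\partial F / \partial y}{\partial F / \partial z}. \]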
Theorem (Implicit differentiation 2)
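If F(x, y, z) = c defines z implicitly as a function of (x, y), and z(x, y) is differentiable, then
\[ \frac{\partial z}{\partial x} = -\frac{\partial F / \partial x}{\partial F / \partial z} \quad\text{and}\quad \frac{\partial z}{\partial y} = -\frac{\partial F / \partial y}{\partial F / \partial z} \]
when $\frac{\partial F}{\partial z}(x, y, z) \neq 0$.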
Example:
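Suppose a function z(x, y) is defined implicitly by
\[ 0 = z^5 + x z^3 + y z^2 + y^2 z - 1, \tag{3} \]
and equals 1 when (x, y) = (−2, −2). To find the partial derivatives with respect to x and y, we
let
\[ F(x, y, z) = z^5 + x z^3 + y z^2 + y^2 z - 1, \]
so that F(x, y, z) = 0. The partial derivatives of this function are
\[ \frac{\partial F}{\partial x}(x, y, z) = z^3 \]
\[ \frac{\partial F}{\partial y}(x, y, z) = z^2 + 2yz \]
\[ \frac{\partial F}{\partial z}(x, y, z) = 5z^4 + 3xz^2 + 2yz + y^2, \]
and at the point (−2, −2, 1) they take the values
\[ \frac{\partial F}{\partial x}(-2, -2, 1) = 1, \qquad \frac{\partial F}{\partial y}(-2, -2, 1) = -3, \qquad \frac{\partial F}{\partial z}(-2, -2, 1) = -1. \]
The partial derivatives at the specified point are therefore
\[ \frac{\partial z}{\partial x}(-2, -2) = -\frac{\frac{\partial F}{\partial x}(-2, -2, 1)}{\frac{\partial F}{\partial z}(-2, -2, 1)} = -\frac{1}{-1} = 1 \]
and
\[ \frac{\partial z}{\partial y}(-2, -2) = -\frac{\frac{\partial F}{\partial y}(-2, -2, 1)}{\frac{\partial F}{\partial z}(-2, -2, 1)} = -\frac{-3}{-1} = -3. \]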
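These values can be cross-checked by solving the defining equation numerically near the point and differencing the result. The sketch below assumes the relation $z^5 + xz^3 + yz^2 + y^2 z - 1 = 0$ with z = 1 at (x, y) = (−2, −2); the Newton solver `solve_z`, its starting guess and the step sizes are illustrative choices rather than part of the notes' method.

```python
def F(x, y, z):
    return z**5 + x * z**3 + y * z**2 + y**2 * z - 1


def F_z(x, y, z):
    # Partial derivative of F with respect to z
    return 5 * z**4 + 3 * x * z**2 + 2 * y * z + y**2


def solve_z(x, y, z_start=1.0, steps=50):
    # Newton's method in z for fixed (x, y), started near the known value z = 1
    z = z_start
    for _ in range(steps):
        z -= F(x, y, z) / F_z(x, y, z)
    return z


def dz_dx(x, y):
    # Implicit differentiation formula: -(dF/dx) / (dF/dz)
    z = solve_z(x, y)
    return -(z**3) / F_z(x, y, z)


def dz_dy(x, y):
    # Implicit differentiation formula: -(dF/dy) / (dF/dz)
    z = solve_z(x, y)
    return -(z**2 + 2 * y * z) / F_z(x, y, z)


h = 1e-5
fd_x = (solve_z(-2 + h, -2) - solve_z(-2 - h, -2)) / (2 * h)
fd_y = (solve_z(-2, -2 + h) - solve_z(-2, -2 - h)) / (2 * h)
print("dz/dx formula:", dz_dx(-2, -2), "finite difference:", fd_x)
print("dz/dy formula:", dz_dy(-2, -2), "finite difference:", fd_y)
```

The formula values are 1 and −3, and the finite differences of the numerically solved z(x, y) should agree with them closely. This also illustrates the earlier remark that z must be computed, perhaps only approximately, before the derivative can be evaluated.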
Explore:
• How would you extend this approach to implicit differentiation to more variables?