
F X F (X + H, Y) F (X, Y) H, F y F (X, y + H) F (X, Y) H

This document defines and provides examples of partial derivatives and the chain rule for functions with two or more variables. 1) Partial derivatives measure how a function changes with respect to one variable, treating the other variables as constants. The chain rule expresses derivatives of composite functions in terms of partial derivatives. 2) Examples show how to compute first and second order partial derivatives for functions of two and three variables. The chain rule is also generalized for functions of multiple variables. 3) Applications of the chain rule include computing derivatives of functions where the variables are themselves functions of other variables. The chain rule allows these derivatives to be written in terms of partial derivatives.

Uploaded by

Saurabh Tomar
Copyright
© © All Rights Reserved

2.3.

Definition of partial derivative for a function of two variables

Definition: The first partial derivatives of the function $f(x, y)$ with respect to
the variables $x$ and $y$ are given by
$$\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \tag{2.42}$$
$$\frac{\partial f}{\partial y} = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}. \tag{2.43}$$

Note: Notice that $\partial f/\partial x$ is nothing but the standard first derivative of $f(x, y)$
when considered as a function of $x$ only, regarding $y$ as a constant parameter.
Similarly $\partial f/\partial y$ is the standard first derivative of $f(x, y)$ when considered as a
function of $y$ only, regarding $x$ as a constant parameter.
Example: Let
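$$f(x, y) = x^2 \cos y + 2xy, \tag{2.44}$$
then, according to the definition above, the partial derivatives of $f(x, y)$ with
respect to $x$ and $y$ are
$$\frac{\partial f}{\partial x} = \cos y \, \frac{d x^2}{dx} + 2y \, \frac{dx}{dx} = 2x \cos y + 2y, \tag{2.45}$$
$$\frac{\partial f}{\partial y} = x^2 \, \frac{d \cos y}{dy} + 2x \, \frac{dy}{dy} = -x^2 \sin y + 2x. \tag{2.46}$$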
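As a quick numerical sanity check of (2.45) and (2.46), the following is a minimal sketch that compares the closed-form partial derivatives with difference quotients. It uses a central (symmetric) difference rather than the one-sided quotient of (2.42)-(2.43); the helper names, the step size and the test point are illustrative assumptions, not part of the notes.

```python
import math

def f(x, y):
    # The example function (2.44)
    return x**2 * math.cos(y) + 2 * x * y

def fx_exact(x, y):
    # Closed form (2.45)
    return 2 * x * math.cos(y) + 2 * y

def fy_exact(x, y):
    # Closed form (2.46)
    return -x**2 * math.sin(y) + 2 * x

def partial_x(func, x, y, h=1e-6):
    # Central difference in x with y held fixed (assumed step size h)
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    # Central difference in y with x held fixed
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 1.5, 0.7  # arbitrary test point
print(partial_x(f, x0, y0), fx_exact(x0, y0))
print(partial_y(f, x0, y0), fy_exact(x0, y0))
```

The two numbers printed on each line should agree to several decimal places, which is what treating the other variable as a constant parameter predicts.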

Geometric interpretation: Partial derivatives of functions of two variables admit a geometrical interpretation similar to that for functions of one variable. For a
function of one variable $f(x)$, the first derivative with respect to $x$ is defined as
$$\frac{df}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \tag{2.47}$$

and geometrically it measures the slope of the curve f (x) at the point x. This is
illustrated in figure 5.

Figure 5: Geometrical picture of $df/dx$ at the point $x = x_0$. Applying the definition
(2.47), the derivative is nothing but the slope of the line $y = a + bx$ which is tangent to
$f(x)$ at the point $x_0$, namely $b = \left.\frac{df}{dx}\right|_{x_0}$.

Recalling the definition we gave in the note above, we can now interpret geometrically the first-order partial derivatives of a function of two variables in a
completely analogous fashion:
Geometrical definition of $f_x$ and $f_y$: The partial derivative $\partial f/\partial x$ at a certain point $(x_0, y_0)$ is nothing but the slope of the curve of intersection of the surface $z = f(x, y)$ and the vertical plane $y = y_0$ at $x = x_0$. Likewise, the partial derivative
$\partial f/\partial y$ at a certain point $(x_0, y_0)$ is nothing but the slope of the curve of intersection
of the surface $z = f(x, y)$ and the vertical plane $x = x_0$ at $y = y_0$. Graphically:

Figure 6: Geometrical picture of $\partial f/\partial x$ (1) and $\partial f/\partial y$ (2) at the point $(x_0, y_0)$. In (1)
$\partial f/\partial x$ is the slope of the red line which is tangent to the green curve resulting from the
intersection of $f(x, y)$ and the plane $y = y_0$. In (2) $\partial f/\partial y$ can be identified in a similar
fashion.

Notation: From now on we will employ the following shorter notation for the
partial derivatives of $f(x, y)$
$$\frac{\partial f}{\partial x} = f_x, \qquad \frac{\partial f}{\partial y} = f_y. \tag{2.48}$$
We will denote by $f_x(x_0, y_0)$ and $f_y(x_0, y_0)$ the partial derivatives at the point
$(x_0, y_0)$.

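Definition: Given a function $f(x, y)$, the function is said to be differentiable
if $f_x$ and $f_y$ exist. If the function is differentiable its first-order derivatives can be
differentiated again and we can define the second-order partial derivatives of
$f(x, y)$ as follows:
$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = f_{xx}, \tag{2.49}$$
$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \partial y} = f_{xy}, \tag{2.50}$$
$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = f_{yy}, \tag{2.51}$$
$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \partial x} = f_{yx}. \tag{2.52}$$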
Note: Notice that $f_{xy}$ and $f_{yx}$ are in principle different functions. Using the
definitions (2.42)-(2.43) twice we can see that the second derivatives $f_{xy}$ and $f_{yx}$
are given by the double limits
$$f_{xy} = \lim_{p \to 0} \frac{f_y(x + p, y) - f_y(x, y)}{p} = \lim_{p \to 0} \lim_{h \to 0} \frac{f(x + p, y + h) - f(x + p, y) - f(x, y + h) + f(x, y)}{hp}, \tag{2.53}$$
$$f_{yx} = \lim_{h \to 0} \frac{f_x(x, y + h) - f_x(x, y)}{h} = \lim_{h \to 0} \lim_{p \to 0} \frac{f(x + p, y + h) - f(x + p, y) - f(x, y + h) + f(x, y)}{hp}. \tag{2.54}$$
Therefore, the only difference between $f_{xy}$ and $f_{yx}$ is the order in which the limits
are taken. It is not guaranteed that the limits commute.
Example 1: Let us compute the first- and second-order partial derivatives of the
function
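$$f(x, y) = e^{xy} + \ln(x^2 + y). \tag{2.55}$$
We start with the 1st derivatives,
$$f_x = y e^{xy} + \frac{2x}{x^2 + y}, \tag{2.56}$$
$$f_y = x e^{xy} + \frac{1}{x^2 + y}. \tag{2.57}$$
Now we can compute the 2nd derivatives,
$$f_{xy} = \frac{\partial f_y}{\partial x} = e^{xy} + xy e^{xy} - \frac{2x}{(x^2 + y)^2}, \tag{2.58}$$
$$f_{yx} = \frac{\partial f_x}{\partial y} = e^{xy} + xy e^{xy} - \frac{2x}{(x^2 + y)^2}. \tag{2.59}$$
So, in this case $f_{xy} = f_{yx}$.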
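The mixed derivative of this example can also be checked numerically. The sketch below evaluates the difference quotient that appears inside the double limits (2.53)-(2.54) at small but finite steps and compares it with the closed form (2.58). The function names, the test point and the step size are assumptions chosen for illustration; at finite steps the quotient is a single expression, so this only checks the value, not the order of the limits.

```python
import math

def f(x, y):
    # The example function (2.55); requires x**2 + y > 0
    return math.exp(x * y) + math.log(x**2 + y)

def mixed_exact(x, y):
    # Closed form (2.58) = (2.59)
    e = math.exp(x * y)
    return e + x * y * e - 2 * x / (x**2 + y) ** 2

def mixed_quotient(func, x, y, p, h):
    # The quotient inside the double limits (2.53) and (2.54)
    return (func(x + p, y + h) - func(x + p, y)
            - func(x, y + h) + func(x, y)) / (h * p)

x0, y0 = 0.5, 1.0   # arbitrary test point
step = 1e-4         # assumed step for both p and h
approx = mixed_quotient(f, x0, y0, step, step)
print(approx, mixed_exact(x0, y0))
```

The two printed values should agree to a few decimal places; shrinking `step` much further eventually makes rounding errors dominate.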


Example 2: As we said at the beginning of this section, all definitions for functions
of two variables extend easily to functions of 3 or more variables. In this
example let us consider the function of three variables
$$g(x, y, z) = e^{x - 2y + 3z}, \tag{2.60}$$
and compute its 1st and 2nd order partial derivatives. In this case we have 3
1st order derivatives
$$g_x = e^{x - 2y + 3z}, \qquad g_y = -2e^{x - 2y + 3z}, \qquad g_z = 3e^{x - 2y + 3z}. \tag{2.61}$$
Now we have in total 9 possible different 2nd order derivatives,
$$g_{xx} = e^{x - 2y + 3z}, \qquad g_{xy} = -2e^{x - 2y + 3z}, \qquad g_{xz} = 3e^{x - 2y + 3z}, \tag{2.62}$$
$$g_{yx} = -2e^{x - 2y + 3z}, \qquad g_{yy} = 4e^{x - 2y + 3z}, \qquad g_{yz} = -6e^{x - 2y + 3z}, \tag{2.63}$$
$$g_{zx} = 3e^{x - 2y + 3z}, \qquad g_{zy} = -6e^{x - 2y + 3z}, \qquad g_{zz} = 9e^{x - 2y + 3z}. \tag{2.64}$$

Theorem: Consider a function of two variables $f(x, y)$ and let $f$, $f_x$, $f_y$, $f_{xx}$, $f_{yy}$,
$f_{xy}$ and $f_{yx}$ exist and be continuous in a neighbourhood of a point $(x_0, y_0)$. Then
$$f_{xy}(x_0, y_0) = f_{yx}(x_0, y_0). \tag{2.65}$$
A nice proof of this theorem is given in chapter 13 of the book Calculus by R.
Adams. The key idea is that the continuity of all 1st and 2nd order derivatives of
$f$ allows us to prove that the order of the limits in (2.53) and (2.54) is irrelevant for
the final result.
2.3.2

Chain rules

Definition: The chain rule for functions of one variable is a formula that gives
the derivative of the composition of two functions f and g (you have used this last
year):
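$$\frac{df(g(x))}{dx} = \lim_{h \to 0} \frac{f(g(x + h)) - f(g(x))}{h} = \frac{df}{dg} \frac{dg}{dx} = f'(g(x)) \, g'(x).$$
Example: Let $f(x) = 1/t + t^2$ and $x = t^2 + t + 1$, then
$$\frac{df}{dx} = \frac{df}{dt} \frac{dt}{dx} = \left(-\frac{1}{t^2} + 2t\right) (2t + 1)^{-1}. \tag{2.66}$$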
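The result (2.66) can be checked numerically by comparing small increments of $f$ and $x$ produced by the same small change in $t$. This is only a sketch: the helper names, the value of $t$ and the step `dt` are illustrative assumptions.

```python
def f_of_t(t):
    # f expressed through the parameter t
    return 1.0 / t + t**2

def x_of_t(t):
    # x expressed through the same parameter t
    return t**2 + t + 1.0

def dfdx_formula(t):
    # Right-hand side of (2.66): (df/dt) * (dx/dt)^(-1)
    return (-1.0 / t**2 + 2.0 * t) / (2.0 * t + 1.0)

def dfdx_numeric(t, dt=1e-6):
    # Ratio of small increments of f and x for the same change in t
    df = f_of_t(t + dt) - f_of_t(t - dt)
    dx = x_of_t(t + dt) - x_of_t(t - dt)
    return df / dx

t0 = 1.0  # arbitrary value of the parameter
print(dfdx_formula(t0), dfdx_numeric(t0))
```

At `t0 = 1.0` the formula gives $(−1 + 2)/3 = 1/3$, and the ratio of increments should match it closely.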

Definition: The chain rule for functions of two variables becomes considerably
more complicated than for functions of one variable. Suppose we have a function
$$z = f(u(t), v(t)) = g(t), \tag{2.67}$$
and we want to know how the function $f$ changes with respect to the variable $t$. The
way to evaluate that change is to compute the derivative $dg/dt$, which according to
the definition is
$$\begin{aligned}
\frac{dg}{dt} &= \lim_{h \to 0} \frac{g(t + h) - g(t)}{h} = \lim_{h \to 0} \frac{f(u(t + h), v(t + h)) - f(u(t), v(t))}{h} \\
&= \lim_{h \to 0} \underbrace{\frac{f(u(t + h), v(t + h)) - f(u(t), v(t + h))}{h}}_{\text{increment with } v(t + h) \text{ fixed}} \\
&\quad + \lim_{h \to 0} \underbrace{\frac{f(u(t), v(t + h)) - f(u(t), v(t))}{h}}_{\text{increment with } u(t) \text{ fixed}} \\
&= \frac{\partial f}{\partial u} \frac{du}{dt} + \frac{\partial f}{\partial v} \frac{dv}{dt} \equiv f_u u_t + f_v v_t.
\end{aligned} \tag{2.68}$$
In the 2nd and 3rd lines of (2.68) we have managed to separate the 1st line into the
sum of two quotients by adding zero to the 1st line in a smart way. The first of
these quotients (2nd line) involves changes only in the first variable $u(t)$ whereas
in the second quotient (3rd line) we have changes only in the second variable $v(t)$.
Now we can use the chain rule for functions of one variable (2.66) to obtain the
final expression in the 4th line.
Generalizations of the chain rule: The formula above can be also generalized
to the case of functions
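$$z = f(x(s, t), y(s, t)) = g(s, t), \tag{2.69}$$
that is, when the variables $x$ and $y$ are functions of two other variables $s$ and $t$. In
this case the chain rule tells us that the partial derivatives $g_s$ and $g_t$ can be obtained
as
$$g_s = \frac{\partial f}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial s} \equiv f_x x_s + f_y y_s, \tag{2.70}$$
$$g_t = \frac{\partial f}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial t} \equiv f_x x_t + f_y y_t, \tag{2.71}$$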

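provided that $f_x$ and $f_y$ are continuous functions. Notice that these equations
can be written in matrix form
$$\left(\frac{\partial f}{\partial s}, \frac{\partial f}{\partial t}\right) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \cdot \begin{pmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\[2mm] \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \end{pmatrix}. \tag{2.72}$$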
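This matrix form is very convenient for generalizations to functions of more than 2
variables. For example, given a function of $n$ variables $f(x_1, \ldots, x_n)$ such that
$$x_i = x_i(y_1, \ldots, y_m), \qquad i = 1, \ldots, n, \tag{2.73}$$
the chain rule can be written as
$$\left(\frac{\partial f}{\partial y_1}, \ldots, \frac{\partial f}{\partial y_m}\right) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right) \cdot \begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \cdots & \dfrac{\partial x_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_n}{\partial y_1} & \cdots & \dfrac{\partial x_n}{\partial y_m} \end{pmatrix}, \tag{2.74}$$
or equivalently
$$\frac{\partial f}{\partial y_j} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{\partial x_i}{\partial y_j}, \qquad j = 1, \ldots, m, \tag{2.75}$$
provided that the 1st order partial derivatives $\partial f/\partial x_i$ are continuous. The matrix
in (2.74) is called the Jacobian matrix of the variable transformation.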
Let us consider now several examples:
Example 1: Consider the function
$$f(x, y) = \sin(x + y) \quad \text{with} \quad x = st^2 \quad \text{and} \quad y = s^2 + 1/t. \tag{2.76}$$

Compute $f_s$ and $f_t$ in two possible ways.
Two possible ways in which we can compute these partial derivatives are
by using the chain rule,
or by replacing $x$ and $y$ in $f(x, y)$ by their expressions in terms of $s$ and
$t$ and then computing $f_s$ and $f_t$ directly.
If we use the chain rule we will need the following partial derivatives:
$$f_x = \cos(x + y), \qquad f_y = \cos(x + y), \tag{2.77}$$
$$x_s = t^2, \qquad x_t = 2ts, \tag{2.78}$$
$$y_s = 2s, \qquad y_t = -1/t^2. \tag{2.79}$$
Then $f_s$ and $f_t$ are simply given by
$$f_s = f_x x_s + f_y y_s = \cos(x + y)(2s + t^2) = \cos\left(st^2 + s^2 + \frac{1}{t}\right)(2s + t^2), \tag{2.80}$$
$$f_t = f_x x_t + f_y y_t = \cos(x + y)\left(2ts - \frac{1}{t^2}\right) = \cos\left(st^2 + s^2 + \frac{1}{t}\right)\left(2ts - \frac{1}{t^2}\right). \tag{2.81}$$

The second way to compute these derivatives is to substitute $x$ and $y$ in terms
of $s$ and $t$ in $f(x, y)$. By doing that we obtain
$$f(x(s, t), y(s, t)) = \sin(st^2 + s^2 + 1/t). \tag{2.82}$$
Now we can obtain $f_s$ and $f_t$ directly as
$$f_s = (t^2 + 2s) \cos(st^2 + s^2 + 1/t), \tag{2.83}$$
$$f_t = (2st - 1/t^2) \cos(st^2 + s^2 + 1/t). \tag{2.84}$$

Notice that here we have called f (x, y) and f (x(s, t), y(s, t)) both f (before,
we have been using different names). We will keep doing this in the future,
as it makes things simpler.
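The agreement between the two routes can also be seen numerically. The following minimal sketch evaluates the chain rule in the matrix form (2.72), a row vector of partial derivatives times the Jacobian, and compares it with the direct expressions (2.83)-(2.84). The function names and the test point are illustrative assumptions.

```python
import math

def grad_f(x, y):
    # (f_x, f_y) from (2.77)
    c = math.cos(x + y)
    return [c, c]

def jacobian(s, t):
    # Rows: (x_s, x_t) and (y_s, y_t), from (2.78)-(2.79)
    return [[t**2, 2 * t * s],
            [2 * s, -1.0 / t**2]]

def chain_rule(s, t):
    # Row vector times Jacobian, as in the matrix form (2.72)
    x = s * t**2
    y = s**2 + 1.0 / t
    g = grad_f(x, y)
    J = jacobian(s, t)
    return [sum(g[i] * J[i][j] for i in range(2)) for j in range(2)]

def direct(s, t):
    # Direct differentiation, (2.83)-(2.84)
    c = math.cos(s * t**2 + s**2 + 1.0 / t)
    return [(t**2 + 2 * s) * c, (2 * s * t - 1.0 / t**2) * c]

s0, t0 = 0.3, 1.2  # arbitrary test point with t != 0
print(chain_rule(s0, t0))
print(direct(s0, t0))
```

Both lines should print the same pair $(f_s, f_t)$ up to rounding. Writing the Jacobian as a nested list keeps the sketch dependency-free; the same sum over `i` is what (2.75) expresses for general $n$ and $m$.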
Example 2: Consider now a function of three variables f (x, y, z) with x = g(z)
and y = h(z). How can we compute the derivative df /dz?
We can again apply the chain rule now for a function of three variables which
in this case are all functions of the same variable z. We need only to use our
general formula (2.75) with n = 3 and m = 1, that is
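$$\frac{df}{dz} = \frac{\partial f}{\partial x} \frac{dx}{dz} + \frac{\partial f}{\partial y} \frac{dy}{dz} + \frac{\partial f}{\partial z}. \tag{2.85}$$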
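To see (2.85) at work on a concrete case, here is a small sketch with hypothetical choices that are not taken from the notes: $f(x, y, z) = xy + z$, $g(z) = z^2$ and $h(z) = \sin z$. It compares the three-term formula with a difference quotient of the composed function.

```python
import math

# Hypothetical choices for illustration only (not from the notes)
def f(x, y, z):
    return x * y + z

def g(z):          # x = g(z)
    return z**2

def h(z):          # y = h(z)
    return math.sin(z)

def dfdz_formula(z):
    # (2.85): f_x * dx/dz + f_y * dy/dz + f_z
    x, y = g(z), h(z)
    f_x, f_y, f_z = y, x, 1.0
    return f_x * (2 * z) + f_y * math.cos(z) + f_z

def dfdz_numeric(z, dz=1e-6):
    # Central difference of the composed function z -> f(g(z), h(z), z)
    return (f(g(z + dz), h(z + dz), z + dz)
            - f(g(z - dz), h(z - dz), z - dz)) / (2 * dz)

z0 = 0.8  # arbitrary test point
print(dfdz_formula(z0), dfdz_numeric(z0))
```

The last term $\partial f/\partial z$ accounts for the explicit dependence of $f$ on $z$; dropping it would make the two printed numbers differ by 1 for this choice of $f$.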

Example 3: Suppose that the temperature $T$ of a certain liquid varies with the
depth of the liquid $z$ and the time $t$ as $T(z, t) = e^{-t} z$. What is the rate of
change of the temperature with respect to the time at a point that is moving
through the liquid in such a way that at time $t$ its depth is $z = f(t)$? What
is this rate if $f(t) = e^t$? What is happening in this case?
Here we have an example of a function of two variables $T(z, t)$ and we are asked
to compute $dT/dt$ for a point such that $z = f(t)$. This is a clear case when
we can use the chain rule
$$\frac{dT}{dt} = \frac{\partial T}{\partial z} \frac{dz}{dt} + \frac{\partial T}{\partial t} = e^{-t} f'(t) - z e^{-t} = e^{-t} f'(t) - f(t) e^{-t}. \tag{2.86}$$
If in particular $f(t) = e^t$, then the previous formula gives
$$\frac{dT}{dt} = e^{-t} f'(t) - f(t) e^{-t} = 1 - 1 = 0. \tag{2.87}$$
In this case the increase in temperature due to the increase of depth and the
decrease in temperature due to the increase of time are perfectly balanced in
such a way that the temperature does not change with time.
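The balance described here is easy to confirm numerically. The sketch below evaluates the rate (2.86) along the path $z = e^t$ and also checks that the temperature seen by the moving point stays constant. The function names and sample times are illustrative assumptions.

```python
import math

def T(z, t):
    # Temperature field of the example: T(z, t) = exp(-t) * z
    return math.exp(-t) * z

def rate(f, fprime, t):
    # (2.86): dT/dt = exp(-t) f'(t) - f(t) exp(-t) along the path z = f(t)
    return math.exp(-t) * fprime(t) - f(t) * math.exp(-t)

# Path z = exp(t): its derivative is again exp(t)
rate_exp = rate(math.exp, math.exp, 2.0)
print(rate_exp)

# Temperature measured by the moving point at a few sample times
samples = [T(math.exp(t), t) for t in (0.0, 1.0, 2.0)]
print(samples)
```

The rate comes out as zero and every sampled temperature is 1 up to rounding, consistent with (2.87).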
2.3.3

Definition of differential

As in previous sections, it is useful to start this section by recalling the definition
of differential for functions of one variable:
Definition: Given a function $f(x)$ and assuming that its total derivative $df/dx$
exists at a certain point $x$, the total differential $df$ of the function is given by
$$df = \left(\frac{df}{dx}\right) dx = f'(x) \, dx. \tag{2.88}$$
The quantity $df$ can be interpreted as the infinitesimal change in the value of the
function $f(x)$ when $x$ changes by the infinitesimal amount $dx$. A mathematical way
of expressing this is to define
$$\Delta f = f(x + \Delta x) - f(x), \tag{2.89}$$
where $\Delta f$ and $\Delta x$ are finite increments of $f$ and $x$. Then one needs to prove that
$$\Delta f \to df \quad \text{as} \quad \Delta x \to dx.$$
To prove this we can use the mean value theorem, which allows us to rewrite (2.89)
as
$$\Delta f = f'(x + \theta \Delta x) \, \Delta x \quad \text{with} \quad \theta \in (0, 1). \tag{2.93}$$

Note (mean value theorem): Given a function $f(x)$ which is continuous and has continuous 1st total derivative $f'(x)$, the
mean value theorem tells us that if $a$, $b$ are points at which the function takes values $f(a)$ and $f(b)$
and $a < b$, then a third point $c$ exists, $c \in (a, b)$, such that
$$\frac{f(b) - f(a)}{b - a} = f'(c). \tag{2.90}$$
If we define
$$\theta = \frac{c - a}{b - a}, \tag{2.91}$$
we have $\theta \in (0, 1)$ and we can rewrite (2.90) as
$$f(b) - f(a) = f'(a + \theta(b - a))(b - a). \tag{2.92}$$

Now we can take the limit when $\Delta x \to dx$, which implies $\Delta f \to df$, with $dx$, $df$ being
infinitesimal increments. Since $dx$ is very small we can use $dx \ll x$, to obtain
$$df = f'(x) \, dx, \tag{2.94}$$
which is nothing but (2.88).

Definition: A similar quantity can be defined for functions of more than one
variable. For example, let us consider now a function of two variables $f(x, y)$ with
continuous first order partial derivatives $f_x$ and $f_y$. We define the total differential
of $f$,
$$df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy = f_x \, dx + f_y \, dy, \tag{2.95}$$
as the small variation experienced by $f$ when the variables $x$ and $y$ are changed by
infinitesimal amounts $dx$ and $dy$ respectively. As before we can define
$$\begin{aligned}
\Delta f &= f(x + \Delta x, y + \Delta y) - f(x, y) \\
&= \underbrace{f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y)}_{\text{increment with } y + \Delta y \text{ fixed}} + \underbrace{f(x, y + \Delta y) - f(x, y)}_{\text{increment with } x \text{ fixed}},
\end{aligned} \tag{2.96}$$
where we have managed to split $\Delta f$ into two pieces, each of which involves a variation
only in $x$ and only in $y$, respectively. These two terms are now analogous to the
definition (2.89) for a function of one variable and this allows us to write
$$\Delta f = f_x(x + \theta_1 \Delta x, y + \Delta y) \, \Delta x + f_y(x, y + \theta_2 \Delta y) \, \Delta y, \quad \text{with} \quad \theta_1, \theta_2 \in (0, 1). \tag{2.97}$$
Now, as for functions of one variable, in the limit $\Delta x \to dx$ and $\Delta y \to dy$ with
$dx$, $dy$ being infinitesimal increments ($dx \ll x$ and $dy \ll y$), we have $\Delta f \to df$ and we
recover the result (2.95).
The differential is a very useful concept when we want to obtain approximate
values of a function near a point at which the value of the function and its partial
derivatives are known. A good example of this is example 2 below. Example 1 is a
practical example of how to compute the differential of a function of two variables:
Example 1: The fundamental equation which characterizes an ideal gas is
$$RT = PV, \tag{2.98}$$
where $R$ is a universal constant and $P$, $V$, $T$ are the three state variables (pressure, volume and temperature of the gas). Obtain the change in the pressure
of the gas due to a small change of its volume and temperature.
This is a typical exercise in which we are asked to compute the differential of
the pressure $dP$ as a function of the differentials of volume $dV$ and temperature
$dT$. We only need to use the general formula (2.95) and the relation
$$P(V, T) = R \frac{T}{V}, \tag{2.99}$$
which follows from (2.98), and we obtain
$$dP = \left(\frac{\partial P}{\partial V}\right)_T dV + \left(\frac{\partial P}{\partial T}\right)_V dT = -R \frac{T}{V^2} \, dV + \frac{R}{V} \, dT. \tag{2.100}$$

The sub-indices $V$ and $T$ in the partial derivatives above only indicate which
variable remains constant. For example $(\partial P/\partial V)_T$ is the partial derivative of
the pressure with respect to the volume at constant temperature.
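As an illustration of how (2.100) is used in practice, the sketch below compares the differential $dP$ with the exact pressure change for a small change of state. The numerical values are assumptions made only for this example: $R = 8.314$ (the gas constant in J/(mol K), taking one mole of gas), an initial state of $V = 0.025$ m³ and $T = 300$ K, and small increments `dV` and `dT`.

```python
R = 8.314  # assumed value: gas constant in J/(mol K), for one mole of gas

def pressure(V, T):
    # (2.99): P(V, T) = R T / V
    return R * T / V

def dP(V, T, dV, dT):
    # (2.100): dP = -(R T / V^2) dV + (R / V) dT
    return -R * T / V**2 * dV + R / V * dT

V0, T0 = 0.025, 300.0   # assumed initial state (m^3, K)
dV, dT = 1e-5, 0.1      # assumed small changes

estimate = dP(V0, T0, dV, dT)
exact = pressure(V0 + dV, T0 + dT) - pressure(V0, T0)
print(estimate, exact)
```

The two numbers should differ only by terms of second order in `dV` and `dT`, which is the sense in which the differential approximates the true change.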

Example 2: Use differentials to estimate the value of $\sqrt{27} \, \sqrt[3]{1021}$.
Let us start by defining the function
$$f(x, y) = \sqrt{x} \, \sqrt[3]{y}. \tag{2.101}$$
We can easily obtain the value of this function at the point $(x, y) = (25, 1000)$,
$$f(25, 1000) = 5 \cdot 10 = 50. \tag{2.102}$$
We can now exploit the fact that the point at which we want to compute the
value of $f(x, y)$ is very close to $(25, 1000)$. In other words we can define
$$(27, 1021) = (x + \Delta x, y + \Delta y) = (25 + 2, 1000 + 21), \tag{2.103}$$
hence identifying $\Delta x \simeq dx = 2$ and $\Delta y \simeq dy = 21$. Employing now our
formula (2.95) we have that
$$df = f_x \, dx + f_y \, dy = \frac{1}{2} \frac{\sqrt[3]{y}}{\sqrt{x}} \, dx + \frac{1}{3} \frac{\sqrt{x}}{y^{2/3}} \, dy, \tag{2.104}$$
and evaluating this formula with $x = 25$, $y = 1000$, $dx = 2$ and $dy = 21$ we
obtain
$$df = \frac{1}{2} \cdot \frac{10}{5} \cdot 2 + \frac{1}{3} \cdot \frac{5}{100} \cdot 21 = 2 + \frac{7}{20} = 2.35. \tag{2.105}$$
Therefore, the approximate value of $f(27, 1021)$ is given by
$$f(27, 1021) \simeq f(25, 1000) + df = 52.35. \tag{2.106}$$
Now we can check if this approximation is good by calculating exactly the
value
$$f(27, 1021) = \sqrt{27} \, \sqrt[3]{1021} = 52.3227\ldots \simeq 52.32, \tag{2.107}$$
and so the approximation is actually very good!
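The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch; the function names are illustrative, and the cube root is computed with a fractional power, which is fine here because $y > 0$.

```python
import math

def f(x, y):
    # (2.101): f(x, y) = sqrt(x) * cbrt(y), for x, y > 0
    return math.sqrt(x) * y ** (1.0 / 3.0)

def df(x, y, dx, dy):
    # (2.104): df = f_x dx + f_y dy
    fx = 0.5 * y ** (1.0 / 3.0) / math.sqrt(x)
    fy = (1.0 / 3.0) * math.sqrt(x) / y ** (2.0 / 3.0)
    return fx * dx + fy * dy

estimate = f(25.0, 1000.0) + df(25.0, 1000.0, 2.0, 21.0)  # (2.106)
exact = f(27.0, 1021.0)                                   # (2.107)
print(estimate, exact, estimate - exact)
```

The estimate reproduces 52.35 and the exact value 52.3227..., so the error of the first-order approximation is below 0.03 here.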
It is now easy to generalize the concept of differential to functions of n variables:
Definition: The total differential $df$ of a function of $n$ variables $f(x_1, \ldots, x_n)$
with continuous 1st order partial derivatives $f_{x_1}, \ldots, f_{x_n}$ is given by
$$df = f_{x_1} dx_1 + f_{x_2} dx_2 + \ldots + f_{x_n} dx_n. \tag{2.108}$$

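The object $df$ is frequently called the 1st-order differential of the function $f$.
The reason is that we can define higher order differentials in the following way:
Definition: Consider a function $f(x, y)$ of two real variables with continuous 1st- and 2nd-order partial derivatives. We define the 2nd-order differential of $f$, written
$d^2 f$, as
$$\begin{aligned}
d^2 f &= d\left(\frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy\right) = d(f_x \, dx + f_y \, dy) \\
&= (df_x) \, dx + (df_y) \, dy = (f_{xx} \, dx + f_{yx} \, dy) \, dx + (f_{xy} \, dx + f_{yy} \, dy) \, dy \\
&= f_{xx} \, dx^2 + 2 f_{xy} \, dx \, dy + f_{yy} \, dy^2,
\end{aligned} \tag{2.109}$$
where we assumed $f_{xy} = f_{yx}$. An equivalent way to write this result is
$$d^2 f = \left(dx \frac{\partial}{\partial x} + dy \frac{\partial}{\partial y}\right)^2 f, \tag{2.110}$$
where we identify
$$\left(\frac{\partial}{\partial x}\right)^2 = \frac{\partial^2}{\partial x^2} \quad \text{and} \quad \left(\frac{\partial}{\partial y}\right)^2 = \frac{\partial^2}{\partial y^2}. \tag{2.111}$$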

Now we can generalize the previous formula to the $n$-th differential of a function
$f(x, y)$ as
Definition: The $n$-th differential of a function of two variables $f(x, y)$ with continuous $n$-th order partial derivatives is given by
$$d^n f = \left(dx \frac{\partial}{\partial x} + dy \frac{\partial}{\partial y}\right)^n f. \tag{2.112}$$
This formula can be proven by induction (which means that assuming it works for
$d^{n-1} f$ we can prove that it works for $d^n f$). For that proof it is important to notice
that the operator above admits the binomial expansion
$$\left(dx \frac{\partial}{\partial x} + dy \frac{\partial}{\partial y}\right)^n = \sum_{k=0}^{n} \binom{n}{k} \frac{\partial^k}{\partial x^k} \frac{\partial^{n-k}}{\partial y^{n-k}} \, dy^{n-k} \, dx^k, \tag{2.113}$$
with
$$\binom{n}{k} = \frac{n!}{k! \, (n - k)!}. \tag{2.114}$$

2.3.4

Functions of several variables defined implicitly

As usual let us start by looking at the easiest example, namely functions of one
variable.
Definition: We say that the function $y = f(x)$ is defined implicitly if $y$ and $x$ are
related by an equation of the type
$$F(x, y) = 0, \tag{2.115}$$
and there is no possibility of obtaining $y = f(x)$ explicitly from the constraint
(2.115).
Note: Notice however that the equation (2.115) allows us to obtain the value of
$y$ for a given value of $x$ (at least numerically), even though it does not allow us to
know $f(x)$ for arbitrary $x$.
It is easier to understand this definition with some examples:
Example 1: Let
$$F(x, y = f(x)) = \log(x + y) \, \sin(x + y) = 0. \tag{2.116}$$
The constraint $F(x, y) = 0$ gives us a relation between $x$ and $y$, however we
are not able to obtain $y$ as a function of $x$ from this equation. Therefore the
function $y = f(x)$ is defined implicitly through the constraint $F(x, y) = 0$.
