Section 5: The Jacobian Matrix and Applications
S1: Motivation
S2: Jacobian matrix + differentiability
S3: The chain rule
S4: Inverse functions
Images from “Thomas’ Calculus” by Thomas, Weir, Hass & Giordano, 2008, Pearson Education, Inc.
S1: Motivation. Our main aim in this section is to consider “general”
functions, to define a general derivative, and to look at its properties.
There are still more general functions than those two or three types
above. If we combine the elements of each, then we can form
“vector-valued functions of many variables”.
Example 1
$$f\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x + y + z \\ xyz \end{pmatrix}$$
defines a function from $\mathbb{R}^3$ to $\mathbb{R}^2$.
When it comes to these vector-valued functions, we should write vectors
as column vectors (essentially because matrices act on column vectors).
However, we will use both vertical columns and horizontal m-tuple
notation. Thus, for example, for the vector $x \in \mathbb{R}^3$ we will write both
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} \quad\text{or}\quad (x, y, z) \quad (\text{and } x\mathbf{i} + y\mathbf{j} + z\mathbf{k})$$
and so we could write $f : \mathbb{R}^3 \to \mathbb{R}^2$ as
$$f\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} f_1(x, y, z) \\ f_2(x, y, z) \end{pmatrix}$$
and
$$f(x, y, z) = (f_1(x, y, z), f_2(x, y, z)) = f_1(x, y, z)\mathbf{i} + f_2(x, y, z)\mathbf{j}$$
or combinations of columns and m-tuples.
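In Example 1, the real-valued functions
$$f_1\begin{pmatrix} x \\ y \\ z \end{pmatrix} = x + y + z \quad\text{and}\quad f_2\begin{pmatrix} x \\ y \\ z \end{pmatrix} = xyz$$
are called the co-ordinate or component functions of $f$, and we
may write
$$f = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix}.$$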
$$f(x) = f(x, y) = (f_1(x, y), f_2(x, y)) = f_1(x, y)\mathbf{i} + f_2(x, y)\mathbf{j}.$$
$$f(x) = f(x, y, z) = (f_1(x, y, z), f_2(x, y, z), f_3(x, y, z)) = f_1(x, y, z)\mathbf{i} + f_2(x, y, z)\mathbf{j} + f_3(x, y, z)\mathbf{k}.$$
One way of visualizing $f$, say $f : \mathbb{R}^2 \to \mathbb{R}^2$, is to think of $f$ as a
transformation between co-ordinate planes.
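More generally if $f : \mathbb{R}^m \to \mathbb{R}$ then we take the derivative at $p$ to be
the row vector
$$\left( \frac{\partial f}{\partial x_1}(p), \frac{\partial f}{\partial x_2}(p), \dots, \frac{\partial f}{\partial x_m}(p) \right) = \nabla_p f.$$
Now take $f : \mathbb{R}^m \to \mathbb{R}^n$ where $f$ is as in equation (1), then the natural
candidate for the derivative of $f$ at $p$ is
$$J_p f = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_m} \\
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_m} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_m}
\end{pmatrix}$$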
where the partial derivatives are evaluated at $p$. This $n \times m$ matrix is
called the Jacobian matrix of $f$. Writing the function $f$ as a column
helps us to get the rows and columns of the Jacobian matrix the right
way round. Note that “the Jacobian” usually means the determinant of this
matrix when the matrix is square, i.e., when $m = n$.
Example 2 Find the Jacobian matrix of f from Example 1 and evaluate
it at (1, 2, 3).
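One way to check an answer to Example 2 is sketched below. The partial derivatives are worked out by hand from f(x, y, z) = (x + y + z, xyz) and compared with a central-difference approximation. The helper names `jacobian_exact` and `jacobian_numeric` and the step size `h` are illustrative choices, not part of the notes.

```python
def f(x, y, z):
    """Example 1: f maps R^3 to R^2."""
    return [x + y + z, x * y * z]

def jacobian_exact(x, y, z):
    """Hand-computed Jacobian: rows are f1, f2; columns are d/dx, d/dy, d/dz."""
    return [[1.0, 1.0, 1.0],
            [y * z, x * z, x * y]]

def jacobian_numeric(func, point, h=1e-6):
    """Central-difference approximation of the n x m Jacobian matrix."""
    n = len(func(*point))
    m = len(point)
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(*plus), func(*minus)
        for i in range(n):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J

p = (1.0, 2.0, 3.0)
J_exact = jacobian_exact(*p)        # [[1, 1, 1], [6, 3, 2]]
J_approx = jacobian_numeric(f, p)   # agrees up to rounding error
```

The matrix has 2 rows and 3 columns, matching the n × m shape for a map from R3 to R2.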
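Most of the cases we will be looking at have $m = n$, equal to either 2 or 3.
Suppose $u = u(x, y)$ and $v = v(x, y)$. If we define $f : \mathbb{R}^2 \to \mathbb{R}^2$
by
$$f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} \equiv \begin{pmatrix} f_1 \\ f_2 \end{pmatrix}$$
then the Jacobian matrix is
$$Jf = \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix}$$
and the Jacobian (determinant) is
$$\det(Jf) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial v}{\partial x}\frac{\partial u}{\partial y}.$$
We often denote $\det(Jf)$ by $\dfrac{\partial(u, v)}{\partial(x, y)}$.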
Example 3 Consider the transformation from polar to Cartesian
co-ordinates, where
We have already noted that if $f : \mathbb{R}^m \to \mathbb{R}^n$ then the Jacobian matrix
at each point $a \in \mathbb{R}^m$ is an $n \times m$ matrix. Such a matrix $J_a f$ gives
us a linear map $D_a f : \mathbb{R}^m \to \mathbb{R}^n$ defined by
You should compare this to the one variable case: a function $f : \mathbb{R} \to \mathbb{R}$
is differentiable at $a$ if
$$\lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$
exists, and we call this limit $f'(a)$. But we could equally well say that $f : \mathbb{R} \to \mathbb{R}$ is
differentiable at $a$ if there is a number, written $f'(a)$, for which
$$\lim_{h \to 0} \frac{|f(a + h) - f(a) - f'(a) \cdot h|}{|h|} = 0,$$
because a linear map $L : \mathbb{R} \to \mathbb{R}$ can only operate by multiplication
with a number.
Example 4 Write the derivative of the function in Example 1 at (1, 2, 3)
as a linear map.
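A minimal sketch of Example 4 follows, assuming that the linear map acts by matrix multiplication, h ↦ J h, with J the hand-computed Jacobian of Example 1 at (1, 2, 3). It also evaluates the multivariable analogue of the one-variable limit above, |f(p + h) − f(p) − J h| / |h|, for shrinking h. The names `D`, `remainder_ratio` and the chosen direction are hypothetical.

```python
import math

# Hand-computed Jacobian of Example 1 at p = (1, 2, 3).
J = [[1.0, 1.0, 1.0],
     [6.0, 3.0, 2.0]]

def f(v):
    x, y, z = v
    return [x + y + z, x * y * z]

def D(h):
    """The linear map h -> J h from R^3 to R^2."""
    return [sum(J[i][j] * h[j] for j in range(3)) for i in range(2)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def remainder_ratio(p, h):
    """|f(p + h) - f(p) - D(h)| / |h|; it should tend to 0 as h -> 0."""
    fp = f(p)
    fph = f([pi + hi for pi, hi in zip(p, h)])
    lin = D(h)
    err = [fph[i] - fp[i] - lin[i] for i in range(2)]
    return norm(err) / norm(h)

p = [1.0, 2.0, 3.0]
direction = [1.0, -2.0, 0.5]
# Shrink h along a fixed direction and record the ratio each time.
ratios = [remainder_ratio(p, [t * d for d in direction])
          for t in (1e-1, 1e-2, 1e-3)]
```

The ratios shrink roughly in proportion to |h|, which is the behaviour expected when the linear map is a good first-order approximation of f near p.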
Let $b = g(p) \in \mathbb{R}^s$. If $f$ and $g$ in (2) above are differentiable then
the maps $J_p g : \mathbb{R}^m \to \mathbb{R}^s$ and $J_b f : \mathbb{R}^s \to \mathbb{R}^n$ are defined, and we
have the following general result.
This is again just like the one variable case, except now we are multiplying
matrices (see below).
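Example 6 Consider Example 5:
$$g\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ xy \end{pmatrix} \quad\text{and}\quad f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \sin x \\ x - y \\ xy \end{pmatrix}.$$
Find $J_p(f \circ g)$ where $p = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$. We have
$$J_p g = \begin{pmatrix} 1 & 1 \\ y & x \end{pmatrix}_p = \begin{pmatrix} 1 & 1 \\ a_2 & a_1 \end{pmatrix}.$$
Also
$$J_{g(p)} f = \begin{pmatrix} \cos x & 0 \\ 1 & -1 \\ y & x \end{pmatrix}_{x = a_1 + a_2,\; y = a_1 a_2} = \begin{pmatrix} \cos(a_1 + a_2) & 0 \\ 1 & -1 \\ a_1 a_2 & a_1 + a_2 \end{pmatrix}$$
and
$$J_p(f \circ g) = \begin{pmatrix} \cos(x + y) & \cos(x + y) \\ 1 - y & 1 - x \\ 2xy + y^2 & x^2 + 2xy \end{pmatrix}_p.$$
We observe that
$$\begin{pmatrix} \cos(a_1 + a_2) & \cos(a_1 + a_2) \\ 1 - a_2 & 1 - a_1 \\ 2a_1 a_2 + a_2^2 & a_1^2 + 2a_1 a_2 \end{pmatrix} = \begin{pmatrix} \cos(a_1 + a_2) & 0 \\ 1 & -1 \\ a_1 a_2 & a_1 + a_2 \end{pmatrix} \cdot \begin{pmatrix} 1 & 1 \\ a_2 & a_1 \end{pmatrix}.$$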
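The matrix identity in Example 6 can be spot-checked numerically. The sketch below multiplies the two Jacobians at an arbitrarily chosen point and compares the product with the directly computed Jacobian of the composite. The test point and the helper `matmul` are illustrative assumptions.

```python
import math

def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def J_g(x, y):
    # g(x, y) = (x + y, x*y)
    return [[1.0, 1.0],
            [y, x]]

def J_f(x, y):
    # f(x, y) = (sin x, x - y, x*y)
    return [[math.cos(x), 0.0],
            [1.0, -1.0],
            [y, x]]

def J_composite(x, y):
    # (f o g)(x, y) = (sin(x + y), x + y - x*y, (x + y)*x*y)
    return [[math.cos(x + y), math.cos(x + y)],
            [1.0 - y, 1.0 - x],
            [2 * x * y + y ** 2, x ** 2 + 2 * x * y]]

a1, a2 = 0.7, -1.3            # an arbitrary test point p = (a1, a2)
b = (a1 + a2, a1 * a2)        # b = g(p)
product = matmul(J_f(*b), J_g(a1, a2))   # J_{g(p)} f times J_p g
direct = J_composite(a1, a2)             # J_p (f o g)
max_diff = max(abs(product[i][j] - direct[i][j])
               for i in range(3) for j in range(2))
```

Note the order of the factors: the 3 × 2 matrix of f is on the left and the 2 × 2 matrix of g on the right, so the shapes are compatible.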
The one variable chain rule is a special case of the chain rule that we’ve
just met. The same can be said for the chain rules we saw in earlier
sections.
so that
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt},$$
which is just what we saw in earlier sections.
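As a small illustration of this formula, the sketch below takes a hypothetical example, f(x, y) = x²y along the curve x = cos t, y = sin t (neither is from the notes), and compares the chain-rule value of df/dt with a central-difference derivative of t ↦ f(x(t), y(t)).

```python
import math

def f(x, y):
    return x * x * y              # illustrative choice of f

def x_of(t):
    return math.cos(t)            # illustrative curve x(t)

def y_of(t):
    return math.sin(t)            # illustrative curve y(t)

def chain_rule_derivative(t):
    """df/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt)."""
    x, y = x_of(t), y_of(t)
    df_dx, df_dy = 2 * x * y, x * x
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    return df_dx * dx_dt + df_dy * dy_dt

def numeric_derivative(t, h=1e-6):
    """Central difference of the composite t -> f(x(t), y(t))."""
    return (f(x_of(t + h), y_of(t + h)) - f(x_of(t - h), y_of(t - h))) / (2 * h)

t0 = 0.4
difference = abs(chain_rule_derivative(t0) - numeric_derivative(t0))
```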
S4: Inverse functions.
In first year (or earlier) you will have met the inverse function
theorem, which says essentially that if $f'(a)$ is not zero, then there is a
differentiable inverse function $f^{-1}$ defined near $f(a)$ with
$$\left. \frac{d}{dt}\left(f^{-1}\right) \right|_{f(a)} = \frac{1}{f'(a)}.$$
Let us consider a case where we can write down the inverse. For polar
coordinates we have
$$x = r\cos\theta, \qquad y = r\sin\theta$$
and
$$r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x}.$$
Now differentiating we obtain
$$\frac{\partial r}{\partial x} = \frac{x}{\sqrt{x^2 + y^2}} = \frac{r\cos\theta}{r} = \cos\theta \quad\text{and}\quad \frac{\partial x}{\partial r} = \cos\theta,$$
i.e.,
$$\frac{\partial r}{\partial x} \neq \frac{1}{\;\dfrac{\partial x}{\partial r}\;}.$$
We see that the one variable inverse function theorem does not apply to
partial derivatives. However, there is a simple generalisation if we use
the multivariable derivative, that is, the Jacobian matrix.
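To continue with the polar coordinate example, define
$$f\begin{pmatrix} r \\ \theta \end{pmatrix} = \begin{pmatrix} x(r, \theta) \\ y(r, \theta) \end{pmatrix} = \begin{pmatrix} r\cos\theta \\ r\sin\theta \end{pmatrix} \qquad (3)$$
and
$$g\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} r(x, y) \\ \theta(x, y) \end{pmatrix} = \begin{pmatrix} \sqrt{x^2 + y^2} \\ \arctan\dfrac{y}{x} \end{pmatrix}. \qquad (4)$$
Consider
$$(f \circ g)\begin{pmatrix} x \\ y \end{pmatrix} = f\left( g\begin{pmatrix} x \\ y \end{pmatrix} \right) = f\begin{pmatrix} r \\ \theta \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} = \mathrm{Id}\begin{pmatrix} x \\ y \end{pmatrix}.$$
Therefore $f \circ g = \mathrm{Id}$, the identity operator on $\mathbb{R}^2$. Similarly $g \circ f = \mathrm{Id}$.
Recall
$$\mathrm{Id}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} \quad\text{so that}\quad J(\mathrm{Id}) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \equiv 2 \times 2 \text{ identity matrix}.$$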
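Thus by the chain rule
$$Jf \cdot Jg = J(\mathrm{Id}) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = Jg \cdot Jf$$
so that $(Jf)^{-1} = Jg$. Note for simplicity the points of evaluation
have been left out. Therefore
$$\begin{pmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\ \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{pmatrix}^{-1}.$$
We can check this directly by substituting $\dfrac{\partial r}{\partial x} = \dfrac{x}{\sqrt{x^2 + y^2}} = \cos\theta$
etc.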
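The direct check can also be done numerically. The sketch below builds both Jacobians from the hand-computed partial derivatives of the polar map and its inverse and multiplies them at one sample point; the point (r, θ) = (2, 0.6) is an arbitrary choice with x > 0, where θ = arctan(y/x) holds.

```python
import math

def J_f(r, theta):
    # f(r, theta) = (r cos(theta), r sin(theta))
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def J_g(x, y):
    # g(x, y) = (sqrt(x^2 + y^2), arctan(y / x)), taken on x > 0
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def matmul2(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, theta = 2.0, 0.6
x, y = r * math.cos(theta), r * math.sin(theta)

product = matmul2(J_g(x, y), J_f(r, theta))   # should be the identity
identity = [[1.0, 0.0], [0.0, 1.0]]
max_dev = max(abs(product[i][j] - identity[i][j])
              for i in range(2) for j in range(2))
```

Here J g is evaluated at (x, y) = f(r, θ) and J f at (r, θ), which are the points of evaluation that the formula above leaves out.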
Theorem 2 (The Inverse Function Theorem) Let $f : \mathbb{R}^n \to \mathbb{R}^n$
be differentiable at $p$. If $J_p f$ is an invertible matrix then there is an
inverse function $f^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ defined in some neighbourhood of
$b = f(p)$ and
$$J_b f^{-1} = (J_p f)^{-1}.$$
Note that the inverse function may only exist in a small region around
$b = f(p)$.
Example 7 We earlier saw that for polar coordinates, with the notation
of equation (3),
$$Jf = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix},$$
with determinant $r$. So it follows from the inverse function theorem that
the inverse function $g$ is differentiable if $r \neq 0$.
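Example 8 The function $f : \mathbb{R}^2 \to \mathbb{R}^2$ is given by
$$f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x^2 - y^2 \\ x^2 + y^2 \end{pmatrix}.$$
Where is $f$ invertible? Find the Jacobian matrix of $f^{-1}$ where $f$ is
invertible.
SOLN:
$$Jf = \begin{pmatrix} 2x & -2y \\ 2x & 2y \end{pmatrix} \quad\text{and}\quad \det Jf = 8xy,$$
so $f$ is locally invertible wherever $xy \neq 0$, that is, everywhere except on the axes. There
$$Jf^{-1} = \begin{pmatrix} 2x & -2y \\ 2x & 2y \end{pmatrix}^{-1} = \frac{1}{8xy}\begin{pmatrix} 2y & 2y \\ -2x & 2x \end{pmatrix} = \frac{1}{4}\begin{pmatrix} x^{-1} & x^{-1} \\ -y^{-1} & y^{-1} \end{pmatrix}.$$
Translate to $(u, v)$ coordinates, using $x^2 = (u + v)/2$ and $y^2 = (v - u)/2$ and taking $x, y > 0$, and this is
$$Jf^{-1} = \frac{\sqrt{2}}{4}\begin{pmatrix} (u + v)^{-1/2} & (u + v)^{-1/2} \\ -(v - u)^{-1/2} & (v - u)^{-1/2} \end{pmatrix}.$$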
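The Jacobian of the inverse in Example 8 can be cross-checked against an explicit inverse. The sketch below assumes the quadrant x > 0, y > 0, where solving u = x² − y², v = x² + y² gives x = √((u + v)/2) and y = √((v − u)/2). Differentiating these by hand gives the entries used in `J_f_inv`; in particular ∂y/∂v = 1/(4y) is positive. The helper names and the sample point are illustrative.

```python
import math

def f_inv(u, v):
    # Explicit inverse of f(x, y) = (x^2 - y^2, x^2 + y^2) on x > 0, y > 0.
    return (math.sqrt((u + v) / 2), math.sqrt((v - u) / 2))

def J_f_inv(u, v):
    # Hand-derived partial derivatives of (x, y) with respect to (u, v).
    c = math.sqrt(2) / 4
    s, d = (u + v) ** -0.5, (v - u) ** -0.5
    return [[c * s, c * s],
            [-c * d, c * d]]

def J_numeric(func, u, v, h=1e-6):
    # Central differences: column 0 is d/du, column 1 is d/dv.
    du = [(a - b) / (2 * h) for a, b in zip(func(u + h, v), func(u - h, v))]
    dv = [(a - b) / (2 * h) for a, b in zip(func(u, v + h), func(u, v - h))]
    return [[du[0], dv[0]],
            [du[1], dv[1]]]

x, y = 1.5, 0.5
u, v = x * x - y * y, x * x + y * y     # (u, v) = f(x, y) = (2.0, 2.5)
exact = J_f_inv(u, v)
approx = J_numeric(f_inv, u, v)
max_diff = max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))
```

In the other quadrants the signs of the square roots change, so the formula in (u, v) coordinates needs the corresponding sign adjustments.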
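Finally let us apply the inverse function theorem to the Jacobian
determinants. We recall that
$$\frac{\partial(r, \theta)}{\partial(x, y)} = \det Jg = \begin{vmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\ \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{vmatrix}$$
and
$$\frac{\partial(x, y)}{\partial(r, \theta)} = \det Jf = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix}.$$
Since $Jg$ and $Jf$ are inverse matrices, their determinants are reciprocals:
$$\frac{\partial(r, \theta)}{\partial(x, y)} = \frac{1}{\;\dfrac{\partial(x, y)}{\partial(r, \theta)}\;}.$$
This sort of result is true for any change of variable, in any number
of dimensions, and will prove very useful in integration.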
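For the polar example the two determinants can be evaluated at a sample point, as in the short sketch below. It uses the hand-computed partial derivatives, for which det J f = r and det J g = 1/r; the helper `det2` and the sample point are assumptions made for illustration.

```python
import math

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

r, theta = 3.0, 1.1
x, y = r * math.cos(theta), r * math.sin(theta)
r2 = x * x + y * y

# Jacobian of (r, theta) -> (x, y).
J_f = [[math.cos(theta), -r * math.sin(theta)],
       [math.sin(theta),  r * math.cos(theta)]]

# Jacobian of (x, y) -> (r, theta), from r = sqrt(x^2 + y^2), theta = arctan(y/x).
J_g = [[x / math.sqrt(r2), y / math.sqrt(r2)],
       [-y / r2, x / r2]]

det_f = det2(J_f)   # equals r
det_g = det2(J_g)   # equals 1 / r
```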