Section 5: The Jacobian Matrix and Applications

The Jacobian matrix provides a way to generalize the derivative to vector-valued functions between Euclidean spaces. The Jacobian of a function f: Rm → Rn at a point p is an n x m matrix containing the partial derivatives of the component functions of f evaluated at p. The Jacobian allows us to represent the derivative of f at p as a linear transformation, acting on vectors in the domain to give vectors in the range. Its determinant provides important information about how f stretches or compresses volumes.


Section 5: The Jacobian matrix and applications.

S1: Motivation
S2: Jacobian matrix + differentiability
S3: The chain rule
S4: Inverse functions

Images from “Thomas’ Calculus” by Thomas, Weir, Hass & Giordano, 2008, Pearson Education, Inc.

S1: Motivation. The main aim of this section is to consider “general”
functions, to define a general derivative, and to look at its properties.

In fact, we have slowly been doing this. We first considered vector–valued
functions of one variable f : R → Rn,

      f (t) = (f1(t), . . . , fn(t)),

and defined the derivative as

      f ′(t) = (f1′(t), . . . , fn′(t)).

We then considered real–valued functions of two and three variables
f : R2 → R, f : R3 → R and (as we will see later) we may think of
the derivatives of these functions, respectively, as

      ∇f = (∂f/∂x, ∂f/∂y)
      ∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z).

There are still more general functions than those two or three types
above. If we combine the elements of each, then we can form “vector–
valued functions of many variables”.

A function f : Rm → Rn (n > 1) is a vector–valued function of m
variables.

Example 1

        [ x ]     [ x + y + z ]
      f [ y ]  =  [    xyz    ]
        [ z ]

defines a function from R3 to R2.
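
As a quick aside, Example 1 is easy to play with on a computer. Here is a
minimal sketch in Python (assuming NumPy is available):

    import numpy as np

    # Example 1: f maps a point of R^3 to a point of R^2.
    def f(v):
        x, y, z = v
        return np.array([x + y + z, x * y * z])

    print(f(np.array([1.0, 2.0, 3.0])))   # [6. 6.]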

When it comes to these vector–valued functions, we should write vectors
as column vectors (essentially because matrices act on column vectors),
however, we will use both vertical columns and horizontal m–tuple
notation. Thus, for example, for the vector x ∈ R3 we will write both

      [ x ]
      [ y ]     or     (x, y, z)     (and x i + y j + z k)
      [ z ]

and so we could write f : R3 → R2 as

        [ x ]     [ f1(x, y, z) ]
      f [ y ]  =  [ f2(x, y, z) ]
        [ z ]

and

      f (x, y, z) = (f1(x, y, z), f2(x, y, z))
                  = f1(x, y, z) i + f2(x, y, z) j

or combinations of columns and m-tuples.

In Example 1, the real–valued functions

      f1(x, y, z) = x + y + z     and     f2(x, y, z) = xyz

are called the co–ordinate or component functions of f , and we
may write

            [ f1 ]
      f  =  [ f2 ] .

Generally, any f : Rm → Rn is determined by n co–ordinate functions
f1, . . . , fn and we write

            [ f1(x1, . . . , xm) ]
      f  =  [ f2(x1, . . . , xm) ]                (1)
            [        ...         ]
            [ fn(x1, . . . , xm) ]
We shall be most interested in the cases where f : R2 → R2 or
f : R3 → R3, because this is where most applications occur and
because it will prove extremely useful in our topic on multiple
integration.

For these special cases we can use the following notation

f (x) = f (x, y)
= (f1(x, y), f2(x, y))
= f1(x, y)i + f2(x, y)j.

f (x) = f (x, y, z)
= (f1(x, y, z), f2(x, y, z), f3(x, y, z))
= f1(x, y, z)i + f2(x, y, z)j + f3(x, y, z)k.

One way of visualizing f , say, f : R2 → R2, is to think of f as a
transformation between co–ordinate planes, so that f may stretch,
compress, rotate etc. sets in its domain.

The above will be particularly useful when dealing with multiple
integration and change of variables.
S2: Jacobian matrix + differentiability.

Our first problem is how we define the derivative of a vector–valued
function of many variables. Recall that if f : R2 → R then we can
form the directional derivative, i.e.,

      Du f = u1 ∂f/∂x + u2 ∂f/∂y = ∇f · u

where u = (u1, u2) is a unit vector. Thus, knowledge of the gradient
of f gives information about all directional derivatives. Therefore it is
reasonable to assume

      ∇p f = ( ∂f/∂x (p), ∂f/∂y (p) )

is the derivative of f at p. (The story is more complicated than this but
when we say f is “differentiable” we mean ∇f represents the derivative,
to be discussed a little later.)

More generally if f : Rm → R then we take the derivative at p to be
the row vector

      ( ∂f/∂x1 (p), ∂f/∂x2 (p), . . . , ∂f/∂xm (p) ) = ∇p f .

Now take f : Rm → Rn where f is as in equation (1); then the natural
candidate for the derivative of f at p is

              [ ∂f1/∂x1   ∂f1/∂x2   . . .   ∂f1/∂xm ]
              [ ∂f2/∂x1   ∂f2/∂x2   . . .   ∂f2/∂xm ]
      Jp f =  [    ..        ..     . . .      ..   ]
              [ ∂fn/∂x1   ∂fn/∂x2   . . .   ∂fn/∂xm ]

where the partial derivatives are evaluated at p. This n × m matrix is
called the Jacobian matrix of f . Writing the function f as a column
helps us to get the rows and columns of the Jacobian matrix the right
way round. Note the “Jacobian” is usually the determinant of this matrix
when the matrix is square, i.e., when m = n.
Example 2 Find the Jacobian matrix of f from Example 1 and evaluate
it at (1, 2, 3).
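
One way to check your answer is to compute the Jacobian symbolically.
The following sketch assumes SymPy is available:

    import sympy as sp

    x, y, z = sp.symbols('x y z')
    f = sp.Matrix([x + y + z, x * y * z])      # f from Example 1
    J = f.jacobian([x, y, z])                  # the 2 x 3 Jacobian matrix

    print(J)                                   # Matrix([[1, 1, 1], [y*z, x*z, x*y]])
    print(J.subs({x: 1, y: 2, z: 3}))          # Matrix([[1, 1, 1], [6, 3, 2]])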

Most of the cases we will be looking at have m = n = either 2 or 3.
Suppose u = u(x, y) and v = v(x, y). If we define f : R2 → R2 by

        [ x ]     [ u(x, y) ]     [ f1 ]
      f [ y ]  =  [ v(x, y) ]  ≡  [ f2 ]

then the Jacobian matrix is

             [ ∂u/∂x   ∂u/∂y ]
      J f =  [ ∂v/∂x   ∂v/∂y ]

and the Jacobian (determinant) is

      det(J f ) = (∂u/∂x)(∂v/∂y) − (∂v/∂x)(∂u/∂y).

We often denote det(J f ) by ∂(u, v)/∂(x, y).
Example 3 Consider the transformation from polar to Cartesian co–
ordinates, where

x = r cos θ and y = r sin θ.


We have

      ∂(x, y)/∂(r, θ)  =  | ∂x/∂r   ∂x/∂θ |  =  | cos θ   −r sin θ |  =  r.
                          | ∂y/∂r   ∂y/∂θ |     | sin θ    r cos θ |
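
The same calculation can be checked symbolically; a short sketch assuming
SymPy is available:

    import sympy as sp

    r, theta = sp.symbols('r theta', positive=True)
    f = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])   # (r, theta) -> (x, y)
    J = f.jacobian([r, theta])

    print(sp.simplify(J.det()))   # expect r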

We have already noted that if f : Rm → Rn then the Jacobian matrix
at each point a ∈ Rm is an n × m matrix. Such a matrix Ja f gives
us a linear map Da f : Rm → Rn defined by

      (Da f ) (x) := Ja f · x     for all x ∈ Rm.

Note that x is a column vector.

When we say f : Rm → Rn is differentiable at q we mean that
the affine function A(x) := f (q) + Jq f · (x − q) is a “good”
approximation to f (x) near x = q, in the sense that

      lim    ‖f (x) − f (q) − (Jq f ) · (x − q)‖ / ‖x − q‖  =  0
      x→q

where

      ‖x − q‖ = √( (x1 − q1)² + . . . + (xm − qm)² ).
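
The limit above can also be explored numerically. The sketch below
(assuming NumPy, with a map f chosen purely for illustration and its
Jacobian written out by hand) watches the error ratio shrink as x
approaches q:

    import numpy as np

    # A smooth map f : R^2 -> R^2, chosen for illustration, and its Jacobian.
    def f(v):
        x, y = v
        return np.array([x * y, x + np.sin(y)])

    def J(v):
        x, y = v
        return np.array([[y,   x],
                         [1.0, np.cos(y)]])

    q = np.array([1.0, 2.0])
    for t in [1e-1, 1e-2, 1e-3, 1e-4]:
        x = q + t * np.array([0.6, -0.8])   # approach q along a fixed direction
        ratio = np.linalg.norm(f(x) - f(q) - J(q) @ (x - q)) / np.linalg.norm(x - q)
        print(ratio)                        # the ratios shrink towards 0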

You should compare this to the one variable case: a function f : R → R
is differentiable at a if

      lim    ( f (a + h) − f (a) ) / h
      h→0

exists, and we call this limit f ′(a). But we could equally well say this
as: f : R → R is differentiable at a if there is a number, written f ′(a),
for which

      lim    |f (a + h) − f (a) − f ′(a) · h| / |h|  =  0,
      h→0

because a linear map L : R → R can only operate by multiplication
with a number.

How do we easily recognize a differentiable function? If all of the
component functions (entries) of the Jacobian matrix of f are continuous,
then f is differentiable.

Example 4 Write the derivative of the function in Example 1 at (1, 2, 3)
as a linear map.

Suppose f and g are two differentiable functions from Rm to Rn. It is
easy to see that the derivative of f + g is the sum of the derivatives of f
and g. We can take the dot product of f and g and get a function from
Rm to R, and then differentiate that. The result is a sort of product
rule, but I’ll leave you to work out what happens. Since we cannot divide
vectors, there cannot be a quotient rule, so of the standard differentiation
rules, that leaves the chain rule.
S3: The chain rule. Now suppose that g : Rm → Rs and f :
Rs → Rn. We can now form the composition f ◦ g by mapping with
g first and then following with f :

x → g(x) → f (g(x)) (2)

(f ◦ g) (x) := f (g(x)) for all x ∈ Rm.

Example 5 Let g : R2 → R2 and f : R2 → R3 be defined, respectively, by

      g(x, y) := (x + y, xy)     and     f (x, y) := (sin x, x − y, xy).

Then f ◦ g is defined by

      (f ◦ g)(x, y) = f (g(x, y)) = f (x + y, xy)
                    = ( sin(x + y), x + y − xy, (x + y)(xy) ).

Let b = g(p) ∈ Rs. If f and g in (2) above are differentiable then
the maps Jpg : Rm → Rs and Jbf : Rs → Rn are defined, and we
have the following general result.

Theorem 1 (The Chain Rule) Suppose that g : Rm → Rs and
f : Rs → Rn are differentiable. Then

Jp(f ◦ g) = Jg(p)f · Jpg.

This is again just like the one variable case, except now we are multiplying
matrices (see below).

Example 6 Consider Example 5:

      g(x, y) = (x + y, xy)     and     f (x, y) = (sin x, x − y, xy).

Find Jp(f ◦ g) where p = (a1, a2). We have

              [ 1   1 ]        [ 1    1  ]
      Jp g =  [ y   x ]    =   [ a2   a1 ] .
                         p

Also

                   [ cos x    0 ]
      Jg(p) f  =   [   1     −1 ]
                   [   y      x ]  x = a1 + a2 , y = a1 a2

                   [ cos(a1 + a2)      0      ]
               =   [      1           −1      ] .
                   [    a1 a2       a1 + a2   ]

(Ex cont.) and

                       [  cos(x + y)    cos(x + y) ]
      Jp (f ◦ g )  =   [    1 − y         1 − x    ]
                       [  2xy + y²      x² + 2xy   ]  p

We observe that

      [ cos(a1 + a2)    cos(a1 + a2) ]     [ cos(a1 + a2)      0     ]
      [   1 − a2          1 − a1     ]  =  [      1           −1     ]  ·  [ 1    1  ]
      [ 2a1a2 + a2²    a1² + 2a1a2   ]     [    a1 a2      a1 + a2   ]     [ a2   a1 ]
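
This identity for Example 6 can be verified symbolically. Here is a sketch
assuming SymPy is available:

    import sympy as sp

    x, y, u, v = sp.symbols('x y u v')

    # g maps (x, y) to (x + y, xy); f maps (u, v) to (sin u, u - v, uv).
    g = sp.Matrix([x + y, x * y])
    f = sp.Matrix([sp.sin(u), u - v, u * v])

    Jg = g.jacobian([x, y])                                    # 2 x 2
    Jf_at_g = f.jacobian([u, v]).subs({u: x + y, v: x * y})    # 3 x 2, evaluated at g(x, y)

    comp = f.subs({u: x + y, v: x * y})                        # f o g
    Jcomp = comp.jacobian([x, y])

    print((Jcomp - Jf_at_g * Jg).applyfunc(sp.simplify))       # expect the zero matrix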

The one variable chain rule is a special case of the chain rule that we’ve
just met — the same can be said for the chain rules we saw in earlier
sections.

Let x : R → R be a differentiable function of t and let u : R → R be
a differentiable function of x. Then (u ◦ x) : R → R is given by
(u ◦ x)(t) = u(x(t)). In the notation of this chapter

      Jt(u ◦ x) = Jx(t)u · Jt x

i.e.

      ( d(u ◦ x)/dt )(t) = ( du/dx )(x(t)) · ( dx/dt )(t).

We usually write this as

      du/dt = (du/dx) (dx/dt)

keeping in mind that when we write du/dt we are thinking of u as a
function of t, i.e., u(x(t)), and when we write du/dx we are thinking
of u as a function of x.
Now suppose we have x = x(t), y = y(t) and z = f (x, y). Then

      Jt(f ◦ x) = Jx(t)f · Jt x.

Therefore

      d/dt ( f (x(t), y(t)) )  =  ( ∂f/∂x   ∂f/∂y )  ·  [ dx/dt ]
                                                        [ dy/dt ]

so that

      df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt),

which is just what we saw in earlier sections.
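
As a concrete check of this formula, the sketch below (assuming SymPy,
with f , x(t) and y(t) chosen purely for illustration) compares direct
differentiation with the chain rule:

    import sympy as sp

    t, x, y = sp.symbols('t x y')

    f = x**2 * y          # z = f(x, y)
    xt = sp.cos(t)        # x = x(t)
    yt = t**2             # y = y(t)

    direct = sp.diff(f.subs({x: xt, y: yt}), t)                       # d/dt f(x(t), y(t))
    chain = (sp.diff(f, x).subs({x: xt, y: yt}) * sp.diff(xt, t)
             + sp.diff(f, y).subs({x: xt, y: yt}) * sp.diff(yt, t))   # (df/dx)(dx/dt) + (df/dy)(dy/dt)

    print(sp.simplify(direct - chain))   # 0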

S4: Inverse functions.

In first year (or earlier) you will have met the inverse function theorem,
which says essentially that if f ′(a) is not zero, then there is a
differentiable inverse function f −1 defined near f (a) with

      ( d(f −1)/dt )( f (a) )  =  1 / f ′(a).

What happens in the multi–variable case?

Let us consider a case where we can write down the inverse. For polar
coordinates we have

      x = r cos θ,   y = r sin θ
      r = √( x² + y² ),   θ = arctan( y / x ).

Now differentiating we obtain

      ∂r/∂x = x / √( x² + y² ) = (r cos θ) / r = cos θ     and     ∂x/∂r = cos θ

i.e.,

      ∂r/∂x ≠ 1 / (∂x/∂r).
We see that the one variable inverse function theorem does not apply to
partial derivatives. However, there is a simple generalisation if we use
the multivariable derivative, that is, the Jacobian matrix.

To continue with the polar coordinate example, define

      f (r, θ) = ( x(r, θ), y(r, θ) ) = ( r cos θ, r sin θ )                  (3)

and

      g(x, y) = ( r(x, y), θ(x, y) ) = ( √( x² + y² ), arctan( y / x ) ).     (4)

Consider

      (f ◦ g)(x, y) = f ( g(x, y) ) = f (r, θ) = (x, y) = Id(x, y).

Therefore f ◦ g = Id, the identity operator on R2. Similarly g ◦ f = Id.

Recall

      Id(x, y) = (x, y)   so that   J(Id) = [ 1   0 ]  ≡  2 × 2 identity matrix.
                                            [ 0   1 ]

Thus by the chain rule

      J f · J g = J(Id) = [ 1   0 ]  =  J g · J f
                          [ 0   1 ]

so that (J f )−1 = J g. Note for simplicity the points of evaluation
have been left out. Therefore

      [ ∂r/∂x   ∂r/∂y ]     [ ∂x/∂r   ∂x/∂θ ] −1
      [ ∂θ/∂x   ∂θ/∂y ]  =  [ ∂y/∂r   ∂y/∂θ ]    .

We can check this directly by substituting ∂r/∂x = x / √( x² + y² ) = cos θ,
etc.
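
The check can also be carried out symbolically; a sketch assuming SymPy
is available:

    import sympy as sp

    r, theta, x, y = sp.symbols('r theta x y', positive=True)

    f = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])   # (r, theta) -> (x, y)
    g = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan(y / x)])   # (x, y) -> (r, theta)

    Jf = f.jacobian([r, theta])
    Jg = g.jacobian([x, y]).subs({x: r * sp.cos(theta), y: r * sp.sin(theta)})

    print((Jf * Jg).applyfunc(sp.simplify))   # expect the 2 x 2 identity matrix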

The same idea works in general:

Theorem 2 (The Inverse Function Theorem) Let f : Rn → Rn
be differentiable at p. If Jp f is an invertible matrix then there is an
inverse function f −1 : Rn → Rn defined in some neighbourhood of
b = f (p) and
(Jb f −1) = (Jp f )−1.

Note that the inverse function may only exist in a small region around
b = f (p).

Example 7 We earlier saw that for polar coordinates, with the notation
of equation (3),

      J f = [ cos θ   −r sin θ ]
            [ sin θ    r cos θ ] ,

with determinant r. So it follows from the inverse function theorem that
the inverse function g is differentiable if r ≠ 0.

Example 8 The function f : R2 → R2 is given by

      f (x, y) = (u, v) = ( x² − y², x² + y² ).

Where is f invertible? Find the Jacobian matrix of f −1 where f is
invertible.

SOLN:

      J f = [ 2x   −2y ]      and     det J f = 8xy,
            [ 2x    2y ]

so f is invertible everywhere except the axes. Where it is invertible,

      J f −1 = (J f )−1 = (1 / 8xy) [  2y    2y ]  =  (1/4) [  1/x    1/x ]
                                    [ −2x    2x ]          [ −1/y    1/y ] .

Translated to (u, v) coordinates this is

      J f −1 = (√2 / 4) [  (u + v)^(−1/2)   (u + v)^(−1/2) ]
                        [ −(v − u)^(−1/2)   (v − u)^(−1/2) ] .
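
A quick symbolic check of this solution (a sketch assuming SymPy;
declaring the symbols positive keeps us off the axes, where f is not
invertible):

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)   # stay off the axes

    f = sp.Matrix([x**2 - y**2, x**2 + y**2])
    Jf = f.jacobian([x, y])

    print(Jf.det())                           # 8*x*y
    print(Jf.inv().applyfunc(sp.simplify))    # expect [[1/(4*x), 1/(4*x)], [-1/(4*y), 1/(4*y)]]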

Finally let us apply the inverse function theorem to the Jacobian
determinants. We recall that

      ∂(r, θ)/∂(x, y) = det J g = | ∂r/∂x   ∂r/∂y |
                                  | ∂θ/∂x   ∂θ/∂y |

and

      ∂(x, y)/∂(r, θ) = det J f = | ∂x/∂r   ∂x/∂θ |
                                  | ∂y/∂r   ∂y/∂θ | .

Since J g and J f are inverse matrices, their determinants are inverses:

      ∂(r, θ)/∂(x, y) = 1 / ( ∂(x, y)/∂(r, θ) ).
This sort of result is true for any change of variable — in any number
of dimensions — and will prove very useful in integration.
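
For the polar coordinate example this reciprocal relationship can be
confirmed directly; a short sketch assuming SymPy:

    import sympy as sp

    r, theta, x, y = sp.symbols('r theta x y', positive=True)

    det_f = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)]).jacobian([r, theta]).det()
    det_g = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan(y / x)]).jacobian([x, y]).det()
    det_g = det_g.subs({x: r * sp.cos(theta), y: r * sp.sin(theta)})

    print(sp.simplify(det_f))           # expect r
    print(sp.simplify(det_g))           # expect 1/r
    print(sp.simplify(det_f * det_g))   # expect 1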
